[GitHub] [hudi] kimberlyamandalu commented on issue #1977: Error running hudi on aws glue

GitBox Fri, 09 Oct 2020 12:01:36 -0700


kimberlyamandalu commented on issue #1977:
URL: https://github.com/apache/hudi/issues/1977#issuecomment-706352107



   Does anyone have a working example I can follow to get Hudi working with 
AWS Glue? I have uploaded the Hudi jars to an S3 path and pass that path 
through the Glue job's extra jars parameter. However, I still get a 
ClassNotFoundException when running a simple job that writes a Hudi dataset.
   
   Py4JJavaError: An error occurred while calling o298.save.
   : java.lang.ClassNotFoundException: Failed to find data source: hoodie. Please find packages at http://spark.apache.org/third-party-projects.html
       at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:657)
       at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:245)
       at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
       at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
       at py4j.Gateway.invoke(Gateway.java:282)
       at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
       at py4j.commands.CallCommand.execute(CallCommand.java:79)
       at py4j.GatewayConnection.run(GatewayConnection.java:238)
       at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.ClassNotFoundException: hoodie.DefaultSource
       at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
       at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:634)
       at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:634)
       at scala.util.Try$.apply(Try.scala:192)
       at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:634)
       at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:634)
       at scala.util.Try.orElse(Try.scala:84)
       at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:634)
       ... 13 more
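
   For reference, the `Caused by: ... hoodie.DefaultSource` frame suggests the 
job asked Spark for a data source literally named `hoodie`. Hudi's Spark 
datasource is normally requested as `org.apache.hudi` (or `hudi`), so one thing 
worth ruling out, apart from the jars not reaching the classpath, is the format 
string itself. The sketch below is not a confirmed fix for this issue. The 
table name, field names, and helper functions are hypothetical, and `df` is 
assumed to be an existing Spark DataFrame in a job where the Hudi bundle jar is 
already loaded.

```python
# Minimal sketch under stated assumptions; not taken from the original job.
# Helper names and the example values are hypothetical.

HUDI_FORMAT = "org.apache.hudi"  # format name used instead of "hoodie"


def hudi_write_options(table_name, record_key, partition_field,
                       precombine_field, operation="upsert"):
    """Build a small set of Hudi datasource write options."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": operation,
    }


def write_hudi(df, base_path, options, mode="append"):
    """Write a Spark DataFrame as a Hudi dataset at base_path.

    df is assumed to be a pyspark.sql.DataFrame; this raises the same
    ClassNotFoundException if the Hudi jar is not on the classpath.
    """
    (df.write
       .format(HUDI_FORMAT)
       .options(**options)
       .mode(mode)
       .save(base_path))


# Example option set for a hypothetical table.
opts = hudi_write_options(
    table_name="demo_table",
    record_key="id",
    partition_field="region",
    precombine_field="updated_at",
)
```

   In a Glue job this would be called as `write_hudi(df, base_path, opts)`, 
where `base_path` is the S3 location of the dataset. If the error persists with 
this format name, the jars passed through the extra jars parameter are probably 
not being picked up.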


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
