arun990 opened a new issue #2997: URL: https://github.com/apache/hudi/issues/2997
Hi,

I am getting the error below when trying to save a Hudi table to an HDFS location and to a Google Cloud Storage location. The Hadoop environment is Dataproc, with Spark 2.4.7 and Hive 2.x. The Hudi jars used are hudi-spark-bundle_2.12-0.8.0.jar and spark-avro_2.12-2.4.7.jar. I would appreciate your help.

py4j.protocol.Py4JJavaError: An error occurred while calling o119.save.
: java.lang.ClassNotFoundException: Failed to find data source: org.apache.hudi. Please find packages at http://spark.apache.org/third-party-projects.html
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:678)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:265)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:249)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.hudi.DefaultSource
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:652)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:652)
    at scala.util.Failure.orElse(Try.scala:224)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:652)
    ... 13 more
INFO org.spark_project.jetty.server.AbstractConnector: Stopped Spark@74183a4c{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
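
The "Caused by" line shows that org.apache.hudi.DefaultSource could not be loaded, which generally means the Hudi bundle jar was not on the Spark classpath when the job ran. As a rough sketch only, the snippet below assembles a spark-submit command that passes the two jars named above through the standard --jars option, after checking that their Scala suffixes agree. The jar directory, the script name hudi_write.py, and the helper functions are hypothetical and not taken from the report, and this has not been confirmed as the fix for this issue.

```python
import re
import shlex


def scala_suffix(jar_name):
    """Return the Scala binary version embedded in a jar file name,
    e.g. '2.12' for 'hudi-spark-bundle_2.12-0.8.0.jar', or None."""
    match = re.search(r"_(\d+\.\d+)-", jar_name)
    return match.group(1) if match else None


def build_submit_command(script, jars):
    """Build a spark-submit argument list that ships `jars` with the job.

    Hypothetical helper: it refuses jars whose Scala suffixes differ or
    are missing, because mixing Scala builds is a common source of
    class-loading problems.
    """
    suffixes = {scala_suffix(jar.rsplit("/", 1)[-1]) for jar in jars}
    if len(suffixes) != 1 or None in suffixes:
        raise ValueError(f"inconsistent or missing Scala suffixes: {suffixes}")
    # --jars takes a single comma-separated list of jar paths.
    return ["spark-submit", "--jars", ",".join(jars), script]


# Hypothetical location; replace with wherever the jars actually live.
JAR_DIR = "/path/to/jars"
jars = [
    f"{JAR_DIR}/hudi-spark-bundle_2.12-0.8.0.jar",
    f"{JAR_DIR}/spark-avro_2.12-2.4.7.jar",
]

command = build_submit_command("hudi_write.py", jars)
print(" ".join(shlex.quote(part) for part in command))
```

Whether 2.12 is the right suffix depends on the Scala version the cluster's Spark build was compiled against, which should be checked on the cluster itself.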
