zengyangjie opened a new issue, #9974:
URL: https://github.com/apache/hudi/issues/9974
I use spark-shell to run Hudi, launched with the following configuration arguments:

```
spark-shell --packages org.apache.hudi:hudi-spark3.3-bundle_2.12:0.12.0 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
  --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
  --conf 'spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar'
```
In spark-shell, I import the following and run a small test case:

```scala
import org.apache.spark.sql.SaveMode._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._

spark.range(1).write.format("org.apache.hudi").
  option(TABLE_NAME, "hudi_tab01").
  option(PRECOMBINE_FIELD_OPT_KEY, "id").
  option(RECORDKEY_FIELD_OPT_KEY, "id").
  mode(Overwrite).
  save("/tmp/hudi_tab01")
```
Then the following error occurs:

```
warning: one deprecation; for details, enable :setting -deprecation or :replay -deprecation
23/11/01 20:25:03 WARN HoodieSparkSqlWriter$: hoodie table at /tmp/hudi_tab01 already exists. Deleting existing data & overwriting with new data.
23/11/01 20:25:03 WARN HoodieBackedTableMetadata: Metadata table was not found at path /tmp/hudi_tab01/.hoodie/metadata
23/11/01 20:25:04 ERROR TorrentBroadcast: Store broadcast broadcast_1 fail, remove all pieces of the broadcast
org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: org.apache.spark.SparkException: Failed to register classes with Kryo
org.apache.spark.SparkException: Failed to register classes with Kryo
	at org.apache.spark.serializer.KryoSerializer.$anonfun$newKryo$5(KryoSerializer.scala:183)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:233)
	at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:171)
```
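For what it's worth, Spark's KryoSerializer throws "Failed to register classes with Kryo" when it cannot load the class named in `spark.kryo.registrator`. A quick sanity check (my own diagnostic sketch, not from the stack trace) is to verify from the same JVM that the registrator class is actually visible on the classpath; the class may simply not be present in the bundle version being used:

```scala
import scala.util.Try

// Diagnostic sketch: KryoSerializer fails with "Failed to register classes
// with Kryo" when it cannot load the class configured as the registrator.
// This helper only checks whether that class resolves on the classpath.
def registratorOnClasspath(className: String): Boolean =
  Try(Class.forName(className)).isSuccess
```

In the spark-shell session from the report, one would call `registratorOnClasspath("org.apache.spark.HoodieSparkKryoRegistrar")`; a `false` result would point to a classpath/version mismatch rather than a configuration-file problem.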
Note that I did not compile the Hudi source code with Maven; I only pulled the Hudi bundle as a dependency package in spark-shell.
Is the error related to a jar dependency package? Is it related to HUDI_CONF_DIR or the hudi-defaults.conf file? If so, how should I configure it?
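(For context on that last question: my understanding is that Hudi resolves `hudi-defaults.conf` from the directory in the `HUDI_CONF_DIR` environment variable, falling back to `/etc/hudi/conf` when it is unset. The sketch below models that lookup order; treat the fallback path as an assumption rather than something confirmed by this error.)

```scala
import java.nio.file.Paths

// Sketch of the assumed lookup order for hudi-defaults.conf:
// $HUDI_CONF_DIR if set, otherwise the /etc/hudi/conf fallback.
def hudiDefaultsPath(env: Map[String, String]): String = {
  val confDir = env.getOrElse("HUDI_CONF_DIR", "/etc/hudi/conf")
  Paths.get(confDir, "hudi-defaults.conf").toString
}
```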
Has anyone encountered this error and found a solution? I would greatly appreciate any help resolving it.
Thanks!