jasondavindev commented on issue #4122:
URL: https://github.com/apache/hudi/issues/4122#issuecomment-981610020
@xushiyan Thanks! I built the image, but when I try to write a dataframe, I receive the following error:
```bash
>>> df.write.format('hudi').options(**hudi_options).save('/tmp/data/sample')
37491 [Thread-3] WARN  org.apache.hudi.common.config.DFSPropertiesConfiguration - Cannot find HUDI_CONF_DIR, please set it as the dir of hudi-defaults.conf
37500 [Thread-3] ERROR org.apache.hudi.common.config.DFSPropertiesConfiguration - Error reading in properties from dfs
37500 [Thread-3] WARN  org.apache.hudi.common.config.DFSPropertiesConfiguration - Didn't find config file under default conf file dir: file:/etc/hudi/conf
38382 [Thread-3] WARN  org.apache.hudi.metadata.HoodieBackedTableMetadata - Metadata table was not found at path /tmp/data/sample/.hoodie/metadata
38400 [Thread-3] WARN  org.apache.hudi.metadata.HoodieBackedTableMetadata - Metadata table was not found at path /tmp/data/sample/.hoodie/metadata
41212 [Thread-3] WARN  org.apache.hudi.metadata.HoodieBackedTableMetadata - Metadata table was not found at path /tmp/data/sample/.hoodie/metadata
41217 [Thread-3] WARN  org.apache.hudi.metadata.HoodieBackedTableMetadata - Metadata table was not found at path /tmp/data/sample/.hoodie/metadata
41972 [Executor task launch worker for task 0.0 in stage 49.0 (TID 44)] ERROR org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor - Error upserting bucketType UPDATE for partition :0
java.lang.ExceptionInInitializerError
	at org.apache.hadoop.hbase.io.hfile.LruBlockCache.<clinit>(LruBlockCache.java:935)
	at org.apache.hadoop.hbase.io.hfile.CacheConfig.getL1(CacheConfig.java:553)
	at org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:660)
	at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:246)
	at org.apache.hudi.common.table.log.block.HoodieHFileDataBlock.serializeRecords(HoodieHFileDataBlock.java:100)
	at org.apache.hudi.common.table.log.block.HoodieDataBlock.getContentBytes(HoodieDataBlock.java:120)
	at org.apache.hudi.common.table.log.HoodieLogFormatWriter.appendBlocks(HoodieLogFormatWriter.java:164)
	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:375)
	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:353)
	at org.apache.hudi.table.action.deltacommit.AbstractSparkDeltaCommitActionExecutor.handleUpdate(AbstractSparkDeltaCommitActionExecutor.java:84)
	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:313)
	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$execute$ecf5068c$1(BaseSparkCommitActionExecutor.java:172)
	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:915)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:915)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:386)
	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1440)
	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1350)
	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1414)
	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1237)
	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: Unexpected version format: 11.0.13
	at org.apache.hadoop.hbase.util.ClassSize.<clinit>(ClassSize.java:119)
	... 39 more
```
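If I read the `Caused by` frame right, the failure is in HBase's `ClassSize` static initializer, which validates `java.version` against the pre-Java-9 `1.x.y_z` naming scheme, so a Java 11 string like `11.0.13` is rejected before Hudi's HFile writer can even load the class. A minimal sketch of why the check fails (the regex here is my assumption, approximating the legacy single-digit-major check in older HBase, not the exact source):

```python
import re

# Assumed approximation of the legacy HBase ClassSize version check:
# it expects a single-digit major like "1.8.0_312" (pre-Java-9 scheme).
LEGACY_PATTERN = re.compile(r"\d\.\d\..*")

for version in ("1.8.0_312", "11.0.13"):
    if LEGACY_PATTERN.fullmatch(version):
        print(version, "-> accepted")
    else:
        # This branch corresponds to the RuntimeException in the trace.
        print(version, "-> Unexpected version format")
```

So any JDK whose `java.version` starts with a two-digit major (9+ under the new scheme, e.g. my `11.0.13`) trips this check, which is why the same job works on Java 8 images.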
I found an issue related to this error, but it was a compatibility issue (in the 0.4.x version).
You can see my application here:
https://github.com/jasondavindev/delta-lake-dms-cdc/blob/main/apps/hudi_update.py
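For what it's worth, every failing frame above goes through `HoodieHFileDataBlock`, i.e. the metadata table's HFile write path, so as a guess, turning the metadata table off via `hoodie.metadata.enable` might sidestep the HBase init until the job can run on Java 8. A sketch of the options I would try (the table/field names are placeholders, not from my app):

```python
# Assumption: disabling the metadata table avoids the HFile code path
# that initializes HBase's ClassSize. Field names below are placeholders.
hudi_options = {
    "hoodie.table.name": "sample",                      # placeholder
    "hoodie.datasource.write.recordkey.field": "id",    # placeholder
    "hoodie.datasource.write.precombine.field": "ts",   # placeholder
    "hoodie.metadata.enable": "false",                  # skip metadata table
}

# then, inside the Spark session:
# df.write.format("hudi").options(**hudi_options).save("/tmp/data/sample")
```

This is only a workaround sketch; the underlying Java 11 incompatibility would still need fixing.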