miomiocat created HUDI-3696:
-------------------------------

             Summary: ORC dependency conflicts between hudi and spark
                 Key: HUDI-3696
                 URL: https://issues.apache.org/jira/browse/HUDI-3696
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: miomiocat


Hudi uses orc-core-xxx-nohive.jar to initialize ORC storage API classes such as 
VectorizedRowBatch, but Spark ships orc-core-xxx.jar, so the two package 
dependencies conflict at runtime.
 
{code:java}
// Run in spark-shell; df and basePath are defined beforehand.
import org.apache.spark.sql.SaveMode.Overwrite

df.write.format("hudi").
  option("hoodie.table.name", "spark_shell_created_cow_nopartition_type").
  option("hoodie.datasource.write.precombine.field", "id").
  option("hoodie.datasource.write.recordkey.field", "id").
  option("hoodie.datasource.write.partitionpath.field", "id").
  option("hoodie.datasource.write.hive_style_partitioning", "true").
  option("hoodie.datasource.write.table.type", "COPY_ON_WRITE").
  option("hoodie.table.base.file.format", "ORC").
  mode(Overwrite).
  save(basePath)
{code}
 
For example, when I try to write an ORC-based Hudi table with the commands 
above, the following exception is thrown:
 
{code:java}
Caused by: java.lang.NoSuchMethodError: org.apache.orc.TypeDescription.createRowBatch()Lorg/apache/orc/storage/ql/exec/vector/VectorizedRowBatch;
    at org.apache.hudi.io.storage.HoodieOrcWriter.<init>(HoodieOrcWriter.java:84)
    at org.apache.hudi.io.storage.HoodieFileWriterFactory.newOrcFileWriter(HoodieFileWriterFactory.java:102)
    at org.apache.hudi.io.storage.HoodieFileWriterFactory.getFileWriter(HoodieFileWriterFactory.java:59)
    at org.apache.hudi.io.HoodieCreateHandle.<init>(HoodieCreateHandle.java:100)
    at org.apache.hudi.io.HoodieCreateHandle.<init>(HoodieCreateHandle.java:73)
    at org.apache.hudi.io.CreateHandleFactory.create(CreateHandleFactory.java:46)
    at org.apache.hudi.execution.CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteInsertHandler.java:83)
    at org.apache.hudi.execution.CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteInsertHandler.java:40)
    at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:37)
    at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:134)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 more
{code}
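
One way to confirm which ORC jar wins on the classpath is to inspect 
TypeDescription by reflection. The sketch below is a diagnostic only, not part 
of Hudi or Spark; the class name OrcClasspathCheck and its helper methods are 
hypothetical. It prints the jar a class was loaded from and the return type of 
its no-arg createRowBatch() method. The descriptor in the trace above shows 
that Hudi was compiled against a return type under org.apache.orc.storage, so 
any other return type printed here would be consistent with the 
NoSuchMethodError.

```java
import java.security.CodeSource;

public class OrcClasspathCheck {

    /** Location the class was loaded from, or a placeholder for JDK classes. */
    public static String jarOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return (src == null || src.getLocation() == null)
                ? "(bootstrap or unknown)"
                : src.getLocation().toString();
    }

    /** Return type of a public no-arg method, or a "(not found ...)" message. */
    public static String returnTypeOf(String className, String methodName) {
        try {
            return Class.forName(className)
                    .getMethod(methodName)
                    .getReturnType()
                    .getName();
        } catch (ReflectiveOperationException | LinkageError e) {
            return "(not found: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Defaults target the class and method named in the stack trace.
        String cls = args.length > 0 ? args[0] : "org.apache.orc.TypeDescription";
        String method = args.length > 1 ? args[1] : "createRowBatch";

        System.out.println(cls + "." + method + "() returns "
                + returnTypeOf(cls, method));
        try {
            System.out.println("loaded from " + jarOf(Class.forName(cls)));
        } catch (ClassNotFoundException | LinkageError e) {
            System.out.println(cls + " is not on the classpath");
        }
    }
}
```

Run it with the same classpath as the Spark driver and executors; the same 
reflection calls can also be typed directly into spark-shell, since Scala can 
call the Java reflection API.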


 



