Re: hudi hadoop 3 support

Vinoth Chandar Thu, 14 Mar 2019 17:56:32 -0700

Hi Rahul,

We have not tested Hudi with Hadoop 3 yet. From the NoSuchMethodError, it
looks like the single-argument FSDataOutputStream(OutputStream) constructor
that Hudi calls is no longer available in 3.x.
Could you paste the entire stack trace here, or preferably in a gist?
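
One way to confirm this on your cluster is a small reflection probe run
against the same classpath as the Spark job. This is only a sketch:
ConstructorProbe and hasConstructor are hypothetical names, not part of Hudi
or Hadoop, and the two-argument (OutputStream, FileSystem.Statistics)
signature is my assumption about the constructor that remains in 3.x.

```java
public class ConstructorProbe {

    /**
     * Returns true if className can be loaded and declares a public
     * constructor taking exactly the named parameter types.
     * Class names are passed as strings so this compiles without Hadoop.
     */
    public static boolean hasConstructor(String className, String... paramTypeNames) {
        ClassLoader loader = ConstructorProbe.class.getClassLoader();
        try {
            // 'false' avoids running static initializers of the probed classes.
            Class<?> target = Class.forName(className, false, loader);
            Class<?>[] params = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < params.length; i++) {
                params[i] = Class.forName(paramTypeNames[i], false, loader);
            }
            target.getConstructor(params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, constructor missing, or class not linkable.
            return false;
        }
    }

    public static void main(String[] args) {
        String stream = "org.apache.hadoop.fs.FSDataOutputStream";

        // The signature from the error: <init>(Ljava/io/OutputStream;)V
        System.out.println("FSDataOutputStream(OutputStream): "
                + hasConstructor(stream, "java.io.OutputStream"));

        // Assumed replacement signature; nested classes use '$' in binary names.
        System.out.println("FSDataOutputStream(OutputStream, Statistics): "
                + hasConstructor(stream, "java.io.OutputStream",
                        "org.apache.hadoop.fs.FileSystem$Statistics"));
    }
}
```

If the first line prints false and the second prints true with the Hadoop
3.0.1 jars on the classpath, the fix is likely to switch Hudi to the
two-argument constructor.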


Also, would you be able to drive this if we provide help?

Thanks
Vinoth

On Thu, Mar 14, 2019 at 1:19 AM [email protected] <
[email protected]> wrote:

> Dear All
>
> I am unable to run Hudi with Hadoop 3.0.1. The Spark ingestion job is
> failing with a "method not found" error.
> 19/03/14 13:30:12 INFO IteratorBasedQueueProducer: starting to buffer
> records
>
> error
> ------
> 19/03/14 13:30:12 INFO BoundedInMemoryExecutor: starting consumer thread
> 19/03/14 13:30:12 INFO FSUtils: Hadoop Configuration: fs.defaultFS:
> [hdfs://x.x.x.x:8020], Config:[Configuration: ], FileSystem:
> [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1685428217_1, ugi=hudi
> (auth:SIMPLE)]]]
> 19/03/14 13:30:12 INFO HoodieIOHandle: Deleting 0 files generated by
> previous failed attempts.
> 19/03/14 13:30:12 INFO FSUtils: Hadoop Configuration: fs.defaultFS:
> [hdfs://x.x.x.x:8020], Config:[Configuration: ], FileSystem:
> [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1685428217_1, ugi=hudi
> (auth:SIMPLE)]]]
> 19/03/14 13:30:12 INFO FSUtils: Hadoop Configuration: fs.defaultFS:
> [hdfs://x.x.x.:8020], Config:[Configuration: ], FileSystem:
> [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1685428217_1, ugi=hudi
> (auth:SIMPLE)]]]
> 19/03/14 13:30:12 INFO IteratorBasedQueueProducer: finished buffering
> records
> 19/03/14 13:30:12 INFO FSUtils: Hadoop Configuration: fs.defaultFS:
> [hdfs://10.0.0.28.local:8020], Config:[Configuration: ], FileSystem:
> [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1685428217_1, ugi=hudi
> (auth:SIMPLE)]]]
> 19/03/14 13:30:12 WARN BlockManager: Putting block rdd_16_0 failed due to
> exception java.lang.RuntimeException:
> com.uber.hoodie.exception.HoodieException:
> com.uber.hoodie.exception.HoodieException:
> java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError:
> org.apache.hadoop.fs.FSDataOutputStream: method
> <init>(Ljava/io/OutputStream;)V not found.
> 19/03/14 13:30:12 WARN BlockManager: Block rdd_16_0 could not be removed
> as it was not found on disk or in memory
> 19/03/14 13:30:12 ERROR Executor: Exception in task 0.0 in stage 7.0 (TID
> 6)
> java.lang.RuntimeException: com.uber.hoodie.exception.HoodieException:
> com.uber.hoodie.exception.HoodieException:
> java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError:
> org.apache.hadoop.fs.FSDataOutputStream: method
> <init>(Ljava/io/OutputStream;)V not found
>         at
> com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121)
>         at
> scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43)
>         at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>         at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>         at
> org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:378)
>         at
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1109)
>         at
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1083)
>         at
> org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1018)
>         at
> org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1083)
>         at
> org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:809)
>         at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
>         at
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>         at
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>         at org.apache.spark.scheduler.Task.run(Task.scala:109)
>         at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: com.uber.hoodie.exception.HoodieException:
> com.uber.hoodie.exception.HoodieException:
> java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError:
> org.apache.hadoop.fs.FSDataOutputStream: method
> <init>(Ljava/io/OutputStream;)V not found
>
>
> Is there any hope of Hudi working with Hadoop 3? Please assist with this.
>
> Thanks & Regards
> Rahul
>
>
>
>
