Re: Failed to bulk insert
Hi Umesh,

Can you add the following verbose configs to capture class-loading information when you run the Spark application? You can set the following in the Spark config (for the defaults file, $SPARK_HOME/conf/spark-defaults.conf):

spark.executor.extraJavaOptions=-verbose:class
spark.driver.extraJavaOptions=-verbose:class

This should print each class getting loaded along with the jar path. Look out for the jars used for loading com.uber.hoodie.avro.HoodieAvroWriteSupport and other hoodie classes in the executor/driver where the exception happens. We should then look inside that jar for the constructor signature.

Thanks,
Balaji.V

On Friday, March 29, 2019, 9:04:33 AM PDT, Umesh Kacha wrote:

Hi Balaji, I tried it and it still gives the same error. I don't have any other hoodie library except the Spark bundle. I am using Databricks Spark cloud. Do you think the Databricks cloud has some other hoodie dependencies?

Regards,
Umesh

On Thu, Mar 28, 2019 at 9:43 AM Umesh Kacha wrote:

Hi Balaji, thanks. No, I am still getting the same error and will debug more. I am sure I don't have any other hoodie jar except the bundle one; not sure what's wrong. I will ask for help in case I am stuck.

On Thu, Mar 28, 2019, 5:00 AM [email protected] wrote:

Hi Umesh,

Were you able to bulk insert successfully?

Balaji.V

On Monday, March 25, 2019, 9:42:44 AM PDT, [email protected] wrote:

Hi Umesh,

I don't see any attachments here. Anyway, I did the following test:

1. Created a HelloWorld IntelliJ Java project.
2. Added the Spark bundle as a library dependency (added the local jar directly): in IntelliJ -> File -> Project Structure -> Libraries -> +.
3. Opened HoodieAvroWriteSupport.class.

You can see that the constructor signature is actually present:

com.uber.hoodie.avro.HoodieAvroWriteSupport.<init>(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V

You can do a similar test to see if the spark-bundle jar you are using has the correct constructor.
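That constructor check can also be done programmatically (a sketch, not part of the original thread): load the class by name and ask for the constructor by its parameter types, which is exactly the lookup that surfaces at runtime as NoSuchMethodError when it fails. Since the hoodie jars are not assumed to be on the classpath here, the demo calls use a JDK class; on the actual driver/executor classpath you would pass the shaded hoodie class and parameter types instead.

```java
public class ConstructorCheck {
    // Returns true if the named class is loadable AND declares a constructor
    // with exactly these parameter types. A false here for a class you expect
    // to have the constructor is the same condition that throws
    // NoSuchMethodError at runtime.
    static boolean hasConstructor(String className, Class<?>... paramTypes) {
        try {
            Class.forName(className).getDeclaredConstructor(paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // StringBuilder(int capacity) exists; StringBuilder(float) does not.
        System.out.println(hasConstructor("java.lang.StringBuilder", int.class));
        System.out.println(hasConstructor("java.lang.StringBuilder", float.class));
    }
}
```

Printing `Class.forName(className).getProtectionDomain().getCodeSource()` alongside a successful lookup also tells you which jar the class was actually loaded from, which complements the -verbose:class output.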
If it is available, then the only reason I can think of is that there is another jar somewhere in the classpath which brings in an unshaded version of com.uber.hoodie.avro.HoodieAvroWriteSupport. The classpath section under the Environment tab in the Spark UI is also a place to look. Hope this clarifies.

Balaji.V

On Sunday, March 24, 2019, 5:47:46 AM PDT, Umesh Kacha wrote:

Hi Balaji, thanks. I am using only the hoodie-spark-bundle jar, as you told me to do last time. Please find all the Maven jars in my project in the attached snapshot; I am sure they don't clash with each other.

On Sun, Mar 24, 2019 at 9:48 AM Balaji Varadarajan wrote:

Hi Umesh,

I suspect you are including both the hoodie-common and hoodie-spark-bundle jars in your runtime package dependencies. There is a version of the HoodieAvroWriteSupport constructor with a proper shaded signature in the Hoodie Spark bundle, but it may not be picked up if the hoodie-common jar is also included. It should be sufficient to include only the bundle packages (e.g. hoodie-spark-bundle) at runtime. If you need to use other hoodie packages (for your implementation), you can mark them with the "provided" dependency scope.

Let us know if this is the case. If not, please provide more context about the dependencies getting added (pom.xml / Spark logs showing packages getting added).

Thanks,
Balaji.V

On Saturday, March 23, 2019, 1:27:52 AM PDT, Umesh Kacha wrote:

Hi, I filtered out nulls in the dataframe for the review_date field and it went ahead, but failed with the following exception. It looks like some runtime libs are missing. I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is an uber jar that has all the transitive dependencies it needs. No?
org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 22.0 failed 1 times, most recent failure: Lost task 7.0 in stage 22.0 (TID 3216, localhost, executor driver): java.lang.RuntimeException: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.<init>(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V
    at com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43)
    at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
    at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
    at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1187)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1161)
    at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1096)
    at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1161)
    at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:883)
    at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:351)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:302)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
    at org.apache.spark.scheduler.Task.run(Task.scala:112)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
Caused by: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.<init>(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V
    at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:106)
    at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:45)
    at com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:119)
    ... 24 more
Caused by: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.<init>(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V
    at com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
    at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:102)
    ... 26 more
Caused by: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.<init>(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:144)
    ... 27 more
Caused by: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.<init>(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V
    at com.uber.hoodie.io.storage.HoodieS
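An aside on reading that error (not from the original thread): the long `(L...;L...;L...)V` string is a JVM method descriptor, where each `L<path>;` names one parameter class and the trailing `V` means void (a constructor). Decoding it shows why shading matters here: the first parameter is `com.uber.hoodie.org.apache.parquet.schema.MessageType`, i.e. parquet relocated under the `com.uber.hoodie` prefix by the bundle, so a classpath that supplies the unshaded class can never match this constructor. A minimal decoder sketch (handles only object-type parameters, which is all this trace contains):

```java
import java.util.ArrayList;
import java.util.List;

public class DescriptorDecoder {
    // Extracts parameter class names from a JVM method descriptor such as
    // "(Lcom/uber/hoodie/.../MessageType;Lorg/apache/avro/Schema;...)V".
    // Primitive codes (single letters) are skipped; only L<path>; entries
    // carry a class name.
    static List<String> paramTypes(String descriptor) {
        List<String> out = new ArrayList<>();
        int i = descriptor.indexOf('(') + 1;
        while (descriptor.charAt(i) != ')') {
            if (descriptor.charAt(i) == 'L') {
                int end = descriptor.indexOf(';', i);
                out.add(descriptor.substring(i + 1, end).replace('/', '.'));
                i = end + 1;
            } else {
                i++; // primitive or array marker: no class name to report
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String d = "(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;"
                 + "Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V";
        for (String p : paramTypes(d)) System.out.println(p);
    }
}
```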
Re: Failed to bulk insert
Hi Balaji, I tried it still gives same error I dont have any other hoodie library except spark bundle. I am using Databricks Spark cloud. Do you think Databricks cloud has some other hoodie dependencies? Regards, Umesh On Thu, Mar 28, 2019 at 9:43 AM Umesh Kacha wrote: > Hi Balaji thanks no I am still getting same error will debug more I am > sure I don't have any other Hoodie jar except bundle one not sure what's > wrong. I will ask for help in case I am stuck. > > On Thu, Mar 28, 2019, 5:00 AM [email protected] > wrote: > >> >> Hi Umesh, >> >> Were you able to bulk insert successfully ? >> >> Balaji.V >> On Monday, March 25, 2019, 9:42:44 AM PDT, [email protected] < >> [email protected]> wrote: >> >> >> >> Hi Umesh, >> >> I don't see any attachments here. Anyways, I did the following test >> >> 1. Create a HelloWorld IntelliJ java project >> 2. I added the Spark bundle as a library dependency (added the local jar >> directly). ( In INtelliJ -> File -> Project Structure -> Libraries -> + >> 3. I opened HoodieAvroSupport.class >> >> You can notice that the constructor signature is actually present. >> >> >> com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V >> >> You can do a similar to test to see if the spark-bundle has the correct >> constructor in the spark-bundle jar you are using. If it is available, then >> the only reason I could think of is there another jar somewhere in the >> classpath which brings in unshaded version of >> com.uber.hoodie.avro.HoodieAvroWriteSupport >> >> The classPath environment section under Environment Tab in Spark UI can >> also be a place to look at. Hope this clarifies. >> >> Balaji.V >> On Sunday, March 24, 2019, 5:47:46 AM PDT, Umesh Kacha < >> [email protected]> wrote: >> >> >> Hi Balaji thanks I am using only hoodie-spark-bundle jar as you told me >> to do so last time. 
Please find all the maven jars in my project in an >> attached snapshot and I am sure they dont clash with each other. >> >> On Sun, Mar 24, 2019 at 9:48 AM Balaji Varadarajan >> wrote: >> >> Hi Umesh, >> I suspect you are including both hoodie-common and hoodie-spark-bundle >> jars in your runtime package dependencies. There will be a version >> of HoodieAvroWriteSupport constructor with a proper shaded signature in >> Hoodie Spark bundle but this may not be picked if hoodie-common jar is also >> included. >> It should be sufficient to include only bundle packages (e:g >> hoodie-spark-bundle) at runtime. If you need to use other hoodie packages >> (for your implementation), you can mark them with "provided dependency >> scope" >> Let us know if this is the case ? If not, please provide more context >> about the dependencies getting added (pom.xml/spark logs showing packages >> getting added). >> >> Thanks,Balaji.V >> >> >> On Saturday, March 23, 2019, 1:27:52 AM PDT, Umesh Kacha < >> [email protected]> wrote: >> >> Hi I filtered out nulls in dataframe for review_date field and it went >> ahead but failed with the following exception. It looks like some run time >> libs are missing I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is >> uber jar it has all the transitive dependencies it need. No? 
>> >> org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 >> in stage 22.0 failed 1 times, most recent failure: Lost task 7.0 in stage >> 22.0 (TID 3216, localhost, executor driver): java.lang.RuntimeException: >> com.uber.hoodie.exception.HoodieException: >> com.uber.hoodie.exception.HoodieException: >> java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: >> >> com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V >> at >> >> com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121) >> at >> scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43) >> at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434) at >> scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440) at >> >> org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221) >> at >> >> org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349) >> at >> >> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1187) >> at >> >> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1161) >> at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1096) at >> >> org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1161) >> at >> >> org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:883) >> at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:351) at >> org.apache.spark.rdd.RDD.iterator(RDD.scala:302) at >> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPart
Re: Failed to bulk insert
Hi Balaji thanks no I am still getting same error will debug more I am sure I don't have any other Hoodie jar except bundle one not sure what's wrong. I will ask for help in case I am stuck. On Thu, Mar 28, 2019, 5:00 AM [email protected] wrote: > > Hi Umesh, > > Were you able to bulk insert successfully ? > > Balaji.V > On Monday, March 25, 2019, 9:42:44 AM PDT, [email protected] < > [email protected]> wrote: > > > > Hi Umesh, > > I don't see any attachments here. Anyways, I did the following test > > 1. Create a HelloWorld IntelliJ java project > 2. I added the Spark bundle as a library dependency (added the local jar > directly). ( In INtelliJ -> File -> Project Structure -> Libraries -> + > 3. I opened HoodieAvroSupport.class > > You can notice that the constructor signature is actually present. > > > com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V > > You can do a similar to test to see if the spark-bundle has the correct > constructor in the spark-bundle jar you are using. If it is available, then > the only reason I could think of is there another jar somewhere in the > classpath which brings in unshaded version of > com.uber.hoodie.avro.HoodieAvroWriteSupport > > The classPath environment section under Environment Tab in Spark UI can > also be a place to look at. Hope this clarifies. > > Balaji.V > On Sunday, March 24, 2019, 5:47:46 AM PDT, Umesh Kacha < > [email protected]> wrote: > > > Hi Balaji thanks I am using only hoodie-spark-bundle jar as you told me to > do so last time. Please find all the maven jars in my project in an > attached snapshot and I am sure they dont clash with each other. > > On Sun, Mar 24, 2019 at 9:48 AM Balaji Varadarajan > wrote: > > Hi Umesh, > I suspect you are including both hoodie-common and hoodie-spark-bundle > jars in your runtime package dependencies. 
There will be a version > of HoodieAvroWriteSupport constructor with a proper shaded signature in > Hoodie Spark bundle but this may not be picked if hoodie-common jar is also > included. > It should be sufficient to include only bundle packages (e:g > hoodie-spark-bundle) at runtime. If you need to use other hoodie packages > (for your implementation), you can mark them with "provided dependency > scope" > Let us know if this is the case ? If not, please provide more context > about the dependencies getting added (pom.xml/spark logs showing packages > getting added). > > Thanks,Balaji.V > > > On Saturday, March 23, 2019, 1:27:52 AM PDT, Umesh Kacha < > [email protected]> wrote: > > Hi I filtered out nulls in dataframe for review_date field and it went > ahead but failed with the following exception. It looks like some run time > libs are missing I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is > uber jar it has all the transitive dependencies it need. No? > > org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 > in stage 22.0 failed 1 times, most recent failure: Lost task 7.0 in stage > 22.0 (TID 3216, localhost, executor driver): java.lang.RuntimeException: > com.uber.hoodie.exception.HoodieException: > com.uber.hoodie.exception.HoodieException: > java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: > > com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V > at > > com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121) > at > scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43) > at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434) at > scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440) at > > org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221) > at > > 
org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349) > at > > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1187) > at > > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1161) > at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1096) at > > org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1161) > at > > org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:883) > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:351) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:302) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:304) at > org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at > org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at > org.apache.spark.scheduler.Task.run(Task.scala:112) at > > org.apache.spark.executor.Executor$TaskRunner$$anon
Re: Failed to bulk insert
Hi Umesh, Were you able to bulk insert successfully ? Balaji.VOn Monday, March 25, 2019, 9:42:44 AM PDT, [email protected] wrote: Hi Umesh, I don't see any attachments here. Anyways, I did the following test 1. Create a HelloWorld IntelliJ java project2. I added the Spark bundle as a library dependency (added the local jar directly). ( In INtelliJ -> File -> Project Structure -> Libraries -> + 3. I opened HoodieAvroSupport.class You can notice that the constructor signature is actually present. com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V You can do a similar to test to see if the spark-bundle has the correct constructor in the spark-bundle jar you are using. If it is available, then the only reason I could think of is there another jar somewhere in the classpath which brings in unshaded version of com.uber.hoodie.avro.HoodieAvroWriteSupport The classPath environment section under Environment Tab in Spark UI can also be a place to look at. Hope this clarifies. Balaji.VOn Sunday, March 24, 2019, 5:47:46 AM PDT, Umesh Kacha wrote: Hi Balaji thanks I am using only hoodie-spark-bundle jar as you told me to do so last time. Please find all the maven jars in my project in an attached snapshot and I am sure they dont clash with each other. On Sun, Mar 24, 2019 at 9:48 AM Balaji Varadarajan wrote: Hi Umesh, I suspect you are including both hoodie-common and hoodie-spark-bundle jars in your runtime package dependencies. There will be a version of HoodieAvroWriteSupport constructor with a proper shaded signature in Hoodie Spark bundle but this may not be picked if hoodie-common jar is also included. It should be sufficient to include only bundle packages (e:g hoodie-spark-bundle) at runtime. If you need to use other hoodie packages (for your implementation), you can mark them with "provided dependency scope" Let us know if this is the case ? 
If not, please provide more context about the dependencies getting added (pom.xml/spark logs showing packages getting added). Thanks,Balaji.V On Saturday, March 23, 2019, 1:27:52 AM PDT, Umesh Kacha wrote: Hi I filtered out nulls in dataframe for review_date field and it went ahead but failed with the following exception. It looks like some run time libs are missing I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is uber jar it has all the transitive dependencies it need. No? org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 22.0 failed 1 times, most recent failure: Lost task 7.0 in stage 22.0 (TID 3216, localhost, executor driver): java.lang.RuntimeException: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121) at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43) at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434) at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440) at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221) at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1187) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1161) at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1096) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1161) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:883) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:351) at 
org.apache.spark.rdd.RDD.iterator(RDD.scala:302) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340) at org.apache.spark.rdd.RDD.iterator(RDD.scala:304) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at org.apache.spark.scheduler.Task.run(Task.scala:112) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432) Caused by: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNe
Re: Failed to bulk insert
Hi Umesh, I don't see any attachments here. Anyways, I did the following test 1. Create a HelloWorld IntelliJ java project2. I added the Spark bundle as a library dependency (added the local jar directly). ( In INtelliJ -> File -> Project Structure -> Libraries -> + 3. I opened HoodieAvroSupport.class You can notice that the constructor signature is actually present. com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V You can do a similar to test to see if the spark-bundle has the correct constructor in the spark-bundle jar you are using. If it is available, then the only reason I could think of is there another jar somewhere in the classpath which brings in unshaded version of com.uber.hoodie.avro.HoodieAvroWriteSupport The classPath environment section under Environment Tab in Spark UI can also be a place to look at. Hope this clarifies. Balaji.VOn Sunday, March 24, 2019, 5:47:46 AM PDT, Umesh Kacha wrote: Hi Balaji thanks I am using only hoodie-spark-bundle jar as you told me to do so last time. Please find all the maven jars in my project in an attached snapshot and I am sure they dont clash with each other. On Sun, Mar 24, 2019 at 9:48 AM Balaji Varadarajan wrote: Hi Umesh, I suspect you are including both hoodie-common and hoodie-spark-bundle jars in your runtime package dependencies. There will be a version of HoodieAvroWriteSupport constructor with a proper shaded signature in Hoodie Spark bundle but this may not be picked if hoodie-common jar is also included. It should be sufficient to include only bundle packages (e:g hoodie-spark-bundle) at runtime. If you need to use other hoodie packages (for your implementation), you can mark them with "provided dependency scope" Let us know if this is the case ? If not, please provide more context about the dependencies getting added (pom.xml/spark logs showing packages getting added). 
Thanks,Balaji.V On Saturday, March 23, 2019, 1:27:52 AM PDT, Umesh Kacha wrote: Hi I filtered out nulls in dataframe for review_date field and it went ahead but failed with the following exception. It looks like some run time libs are missing I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is uber jar it has all the transitive dependencies it need. No? org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 22.0 failed 1 times, most recent failure: Lost task 7.0 in stage 22.0 (TID 3216, localhost, executor driver): java.lang.RuntimeException: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121) at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43) at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434) at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440) at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221) at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1187) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1161) at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1096) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1161) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:883) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:351) at org.apache.spark.rdd.RDD.iterator(RDD.scala:302) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at 
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340) at org.apache.spark.rdd.RDD.iterator(RDD.scala:304) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at org.apache.spark.scheduler.Task.run(Task.scala:112) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432) Caused by: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:106) at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:
Re: Failed to bulk insert
Hi Balaji thanks I am using only hoodie-spark-bundle jar as you told me to do so last time. Please find all the maven jars in my project in an attached snapshot and I am sure they dont clash with each other. On Sun, Mar 24, 2019 at 9:48 AM Balaji Varadarajan wrote: > Hi Umesh, > I suspect you are including both hoodie-common and hoodie-spark-bundle > jars in your runtime package dependencies. There will be a version > of HoodieAvroWriteSupport constructor with a proper shaded signature in > Hoodie Spark bundle but this may not be picked if hoodie-common jar is also > included. > It should be sufficient to include only bundle packages (e:g > hoodie-spark-bundle) at runtime. If you need to use other hoodie packages > (for your implementation), you can mark them with "provided dependency > scope" > Let us know if this is the case ? If not, please provide more context > about the dependencies getting added (pom.xml/spark logs showing packages > getting added). > > Thanks,Balaji.V > > > On Saturday, March 23, 2019, 1:27:52 AM PDT, Umesh Kacha < > [email protected]> wrote: > > Hi I filtered out nulls in dataframe for review_date field and it went > ahead but failed with the following exception. It looks like some run time > libs are missing I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is > uber jar it has all the transitive dependencies it need. No? 
> > org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 > in stage 22.0 failed 1 times, most recent failure: Lost task 7.0 in stage > 22.0 (TID 3216, localhost, executor driver): java.lang.RuntimeException: > com.uber.hoodie.exception.HoodieException: > com.uber.hoodie.exception.HoodieException: > java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: > > com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V > at > > com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121) > at > scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43) > at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434) at > scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440) at > > org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221) > at > > org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349) > at > > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1187) > at > > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1161) > at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1096) at > > org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1161) > at > > org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:883) > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:351) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:302) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:304) at > org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at > org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at > org.apache.spark.scheduler.Task.run(Task.scala:112) 
at > > org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432) > Caused by: com.uber.hoodie.exception.HoodieException: > com.uber.hoodie.exception.HoodieException: > java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: > > com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V > at > > com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:106) > at > > com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:45) > at > > com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:119) > ... 24 more Caused by: com.uber.hoodie.exception.HoodieException: > java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: > > com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V > at > > com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146) > at > > com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:102) > ... 26 more Caused by: java.util.concurrent.ExecutionException: > java.lang.NoSuchMethodError: > > com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V > at java
Re: Failed to bulk insert
Hi Umesh, I suspect you are including both hoodie-common and hoodie-spark-bundle jars in your runtime package dependencies. There will be a version of HoodieAvroWriteSupport constructor with a proper shaded signature in Hoodie Spark bundle but this may not be picked if hoodie-common jar is also included. It should be sufficient to include only bundle packages (e:g hoodie-spark-bundle) at runtime. If you need to use other hoodie packages (for your implementation), you can mark them with "provided dependency scope" Let us know if this is the case ? If not, please provide more context about the dependencies getting added (pom.xml/spark logs showing packages getting added). Thanks,Balaji.V On Saturday, March 23, 2019, 1:27:52 AM PDT, Umesh Kacha wrote: Hi I filtered out nulls in dataframe for review_date field and it went ahead but failed with the following exception. It looks like some run time libs are missing I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is uber jar it has all the transitive dependencies it need. No? 
org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 22.0 failed 1 times, most recent failure: Lost task 7.0 in stage 22.0 (TID 3216, localhost, executor driver): java.lang.RuntimeException: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121) at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43) at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434) at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440) at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221) at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1187) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1161) at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1096) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1161) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:883) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:351) at org.apache.spark.rdd.RDD.iterator(RDD.scala:302) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340) at org.apache.spark.rdd.RDD.iterator(RDD.scala:304) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at org.apache.spark.scheduler.Task.run(Task.scala:112) at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432) Caused by: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:106) at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:45) at com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:119) ... 24 more Caused by: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146) at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:102) ... 26 more Caused by: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:144) ... 
27 more Caused by: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.io.storage.HoodieS
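One way to verify which constructor signature is actually on the classpath is to list a class's declared constructors via reflection. A minimal sketch, with java.lang.String used only as a stand-in so the snippet runs anywhere; with the spark-bundle jar on the classpath you would pass the shaded class name (com.uber.hoodie.avro.HoodieAvroWriteSupport) instead:

```java
import java.lang.reflect.Constructor;

// Prints every declared constructor of the named class, which makes it easy
// to see whether the shaded (com.uber.hoodie.org.apache.parquet...) or the
// unshaded signature is the one actually being loaded.
public class ConstructorCheck {
    public static void main(String[] args) throws ClassNotFoundException {
        // Hypothetical default; pass the Hudi class name as args[0] instead.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> cls = Class.forName(className);
        for (Constructor<?> ctor : cls.getDeclaredConstructors()) {
            System.out.println(ctor);
        }
    }
}
```

If the printed signatures do not match what the stack trace expects, the wrong jar is first on the classpath.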
Re: Failed to bulk insert
Hi I filtered out nulls in dataframe for review_date field and it went ahead but failed with the following exception. It looks like some run time libs are missing I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is uber jar it has all the transitive dependencies it need. No? org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 22.0 failed 1 times, most recent failure: Lost task 7.0 in stage 22.0 (TID 3216, localhost, executor driver): java.lang.RuntimeException: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121) at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43) at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434) at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440) at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221) at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1187) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1161) at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1096) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1161) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:883) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:351) at org.apache.spark.rdd.RDD.iterator(RDD.scala:302) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340) at 
org.apache.spark.rdd.RDD.iterator(RDD.scala:304) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at org.apache.spark.scheduler.Task.run(Task.scala:112) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432) Caused by: com.uber.hoodie.exception.HoodieException: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:106) at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:45) at com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:119) ... 24 more Caused by: com.uber.hoodie.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146) at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:102) ... 
26 more Caused by: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:144) ... 27 more Caused by: java.lang.NoSuchMethodError: com.uber.hoodie.avro.HoodieAvroWriteSupport.(Lcom/uber/hoodie/org/apache/parquet/schema/MessageType;Lorg/apache/avro/Schema;Lcom/uber/hoodie/common/BloomFilter;)V at com.uber.hoodie.io.storage.HoodieStorageWriterFactory.newParquetStorageWriter(HoodieStorageWriterFactory.java:47) at com.uber.hoodie.io.storage.HoodieStorageWriterFactory.getStorageWriter(HoodieStorageWriterFactory.java:38) at com.uber.hoodie.io.HoodieCreateHandle.(HoodieCreateHandle.java:71) at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable$CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteLazyInsertIterable.java:149) at com.uber.hoodie.func.CopyOnWriteLazyInsertIterable$CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteLazyInsertIterable.java:127) at com.uber.hoodie.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:38) at com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:124) at java.util.concurrent.Futu
Re: Failed to bulk insert
+1, yes, if it's actually null.
Good catch, Frank! :)
On Sun, Mar 10, 2019 at 7:23 PM kaka chen wrote:
> A possible root cause is that the field of the record is null.
>
> public static String getNestedFieldValAsString(GenericRecord record, String fieldName) {
>   String[] parts = fieldName.split("\\.");
>   GenericRecord valueNode = record;
>   int i = 0;
>   for (; i < parts.length; i++) {
>     String part = parts[i];
>     Object val = valueNode.get(part);
>     if (val == null) {
>       break;
>     }
>
>     // return, if last part of name
>     if (i == parts.length - 1) {
>       return val.toString();
>     } else {
>       // VC: Need a test here
>       if (!(val instanceof GenericRecord)) {
>         throw new HoodieException("Cannot find a record at part value :" + part);
>       }
>       valueNode = (GenericRecord) val;
>     }
>   }
>   throw new HoodieException(fieldName + "(Part -" + parts[i] + ") field not found in record. "
>       + "Acceptable fields were :" + valueNode.getSchema().getFields()
>       .stream().map(Field::name).collect(Collectors.toList()));
> }
>
>
> On Sun, Mar 10, 2019 at 2:11 PM Vinoth Chandar wrote:
>
> > Hmmm. That's interesting. I can see that the parsing works, since the
> > exception said "Part - review_date". There are definitely users who have
> > done this before.
> > So not sure what's going on.
> >
> > Can you paste the generated Avro schema? following is the corresponding
> > code line
> > log.info(s"Registered avro schema : ${schema.toString(true)}")
> >
> > Maybe create a gist (gist.github.com), for easier sharing of
> > code/stacktrace?
> > Thanks
> > Vinoth
> >
> > On Sat, Mar 9, 2019 at 1:33 PM Umesh Kacha
> wrote:
> >
> > > Hi Vinoth, thanks. I have already done and checked that; please see the
> > > red column highlighted below.
> > >
> > > root |-- marketplace: string (nullable = true) |-- customer_id: string
> > > (nullable = true) |-- review_id: string (nullable = true) |--
> product_id:
> > > string (nullable = true) |-- product_parent: string (nullable = true)
> |--
> > > product_title: string (nullable = true) |-- product_category: string
> > > (nullable = true) |-- star_rating: string (nullable = true) |--
> > > helpful_votes: string (nullable = true) |-- total_votes: string
> > (nullable =
> > > true) |-- vine: string (nullable = true) |-- verified_purchase: string
> > > (nullable = true) |-- review_headline: string (nullable = true) |--
> > > review_body: string (nullable = true) |-- review_date: string
> (nullable =
> > > true) |-- year: integer (nullable = true)
> > >
> > > On Sun, Mar 10, 2019 at 2:27 AM Vinoth Chandar
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > >>review_date(Part
> > > > -review_date) field not found in record
> > > >
> > > > Seems like the precombine field is not in the input DF? Can you try
> > doing
> > > > df1.printSchema and check that once?
> > > >
> > > > On Sat, Mar 9, 2019 at 11:52 AM Umesh Kacha
> > > wrote:
> > > >
> > > > > Hi I have the following code using which I am trying to bulk insert
> > > huge
> > > > > csv file loaded into Spark DataFrame but it fails saying column
> > > > review_date
> > > > > not found but that column is definitely there in dataframe. Please
> > > guide.
> > > > >
> > > > > df1.write
> > > > > .format("com.uber.hoodie")
> > > > > .option(DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY,
> > > > > HoodieTableType.COPY_ON_WRITE.name())
> > > > > .option(DataSourceWriteOptions.OPERATION_OPT_KEY,
> > > > > DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL) // insert
> > > > > .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY,
> > > > > "customer_id")
> > > > > .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY,
> > > "year")
> > > > > .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY,
> > > > > "review_date")
> > > > > .option(HoodieWriteConfig.TABLE_NAME, "hoodie_test_table")
> > > > > .mode(SaveMode.Overwrite)
> > > > > .save("/tmp/hoodie/test_hoodie")
> > > > >
> > > > >
> > > > > Caused by: com.uber.hoodie.exception.HoodieException:
> > review_date(Part
> > > > > -review_date) field not found in record. Acceptable fields were
> > > > > :[marketplace, customer_id, review_id, product_id, product_parent,
> > > > > product_title, product_category, star_rating, helpful_votes,
> > > total_votes,
> > > > > vine, verified_purchase, review_headline, review_body, review_date,
> > > year]
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> com.uber.hoodie.DataSourceUtils.getNestedFieldValAsString(DataSourceUtils.java:79)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:93)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:92)
> > > > > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
> > > > > scala.collection.Iterator$$anon$11.next(Iterator.scala:4
Re: Failed to bulk insert
A possible root cause is that the field of the record is null.
public static String getNestedFieldValAsString(GenericRecord record, String fieldName) {
  String[] parts = fieldName.split("\\.");
  GenericRecord valueNode = record;
  int i = 0;
  for (; i < parts.length; i++) {
    String part = parts[i];
    Object val = valueNode.get(part);
    if (val == null) {
      break;
    }
    // return, if last part of name
    if (i == parts.length - 1) {
      return val.toString();
    } else {
      // VC: Need a test here
      if (!(val instanceof GenericRecord)) {
        throw new HoodieException("Cannot find a record at part value :" + part);
      }
      valueNode = (GenericRecord) val;
    }
  }
  throw new HoodieException(fieldName + "(Part -" + parts[i] + ") field not found in record. "
      + "Acceptable fields were :" + valueNode.getSchema().getFields()
      .stream().map(Field::name).collect(Collectors.toList()));
}
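To illustrate why a present-but-null field ends up at the final throw with a misleading "field not found" message, here is a minimal self-contained sketch; a plain Map stands in for Avro's GenericRecord, and the class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the lookup above: when the value for an existing field is null,
// the loop breaks and control falls through to the "field not found" throw,
// even though the field itself is present in the record.
public class NestedFieldDemo {
    @SuppressWarnings("unchecked")
    static String getNestedFieldValAsString(Map<String, Object> record, String fieldName) {
        String[] parts = fieldName.split("\\.");
        Map<String, Object> valueNode = record;
        int i = 0;
        for (; i < parts.length; i++) {
            Object val = valueNode.get(parts[i]);
            if (val == null) {
                break; // a null value exits here and reaches the throw below
            }
            if (i == parts.length - 1) {
                return val.toString();
            }
            valueNode = (Map<String, Object>) val;
        }
        throw new RuntimeException(fieldName + " (Part -" + parts[i]
                + ") field not found in record. Acceptable fields were: " + valueNode.keySet());
    }

    public static void main(String[] args) {
        Map<String, Object> rec = new HashMap<>();
        rec.put("review_date", null); // field exists, but its value is null
        try {
            getNestedFieldValAsString(rec, "review_date");
        } catch (RuntimeException e) {
            // Reports "field not found" even though review_date is in the record.
            System.out.println(e.getMessage());
        }
    }
}
```

This matches the behavior in the thread: a null review_date produces the "field not found in record" exception even though the schema clearly lists the column.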
On Sun, Mar 10, 2019 at 2:11 PM Vinoth Chandar wrote:
> Hmmm. That's interesting. I can see that the parsing works, since the
> exception said "Part - review_date". There are definitely users who have
> done this before.
> So not sure what's going on.
>
> Can you paste the generated Avro schema? following is the corresponding
> code line
> log.info(s"Registered avro schema : ${schema.toString(true)}")
>
> Maybe create a gist (gist.github.com), for easier sharing of
> code/stacktrace?
> Thanks
> Vinoth
>
> On Sat, Mar 9, 2019 at 1:33 PM Umesh Kacha wrote:
>
> > Hi Vinoth, thanks. I have already done and checked that; please see the
> > red column highlighted below.
> >
> > root |-- marketplace: string (nullable = true) |-- customer_id: string
> > (nullable = true) |-- review_id: string (nullable = true) |-- product_id:
> > string (nullable = true) |-- product_parent: string (nullable = true) |--
> > product_title: string (nullable = true) |-- product_category: string
> > (nullable = true) |-- star_rating: string (nullable = true) |--
> > helpful_votes: string (nullable = true) |-- total_votes: string
> (nullable =
> > true) |-- vine: string (nullable = true) |-- verified_purchase: string
> > (nullable = true) |-- review_headline: string (nullable = true) |--
> > review_body: string (nullable = true) |-- review_date: string (nullable =
> > true) |-- year: integer (nullable = true)
> >
> > On Sun, Mar 10, 2019 at 2:27 AM Vinoth Chandar
> wrote:
> >
> > > Hi,
> > >
> > > >>review_date(Part
> > > -review_date) field not found in record
> > >
> > > Seems like the precombine field is not in the input DF? Can you try
> doing
> > > df1.printSchema and check that once?
> > >
> > > On Sat, Mar 9, 2019 at 11:52 AM Umesh Kacha
> > wrote:
> > >
> > > > Hi I have the following code using which I am trying to bulk insert
> > huge
> > > > csv file loaded into Spark DataFrame but it fails saying column
> > > review_date
> > > > not found but that column is definitely there in dataframe. Please
> > guide.
> > > >
> > > > df1.write
> > > > .format("com.uber.hoodie")
> > > > .option(DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY,
> > > > HoodieTableType.COPY_ON_WRITE.name())
> > > > .option(DataSourceWriteOptions.OPERATION_OPT_KEY,
> > > > DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL) // insert
> > > > .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY,
> > > > "customer_id")
> > > > .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY,
> > "year")
> > > > .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY,
> > > > "review_date")
> > > > .option(HoodieWriteConfig.TABLE_NAME, "hoodie_test_table")
> > > > .mode(SaveMode.Overwrite)
> > > > .save("/tmp/hoodie/test_hoodie")
> > > >
> > > >
> > > > Caused by: com.uber.hoodie.exception.HoodieException:
> review_date(Part
> > > > -review_date) field not found in record. Acceptable fields were
> > > > :[marketplace, customer_id, review_id, product_id, product_parent,
> > > > product_title, product_category, star_rating, helpful_votes,
> > total_votes,
> > > > vine, verified_purchase, review_headline, review_body, review_date,
> > year]
> > > > at
> > > >
> > > >
> > >
> >
> com.uber.hoodie.DataSourceUtils.getNestedFieldValAsString(DataSourceUtils.java:79)
> > > > at
> > > >
> > > >
> > >
> >
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:93)
> > > > at
> > > >
> > > >
> > >
> >
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:92)
> > > > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
> > > > scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
> > > >
> > > >
> > >
> >
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:193)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
> > > > at
> > > >
> > >
> >
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
> > > > at
> > > >
> > >
> >
> org.a
Re: Failed to bulk insert
Hmmm. That's interesting. I can see that the parsing works, since the
exception said "Part - review_date". There are definitely users who have
done this before.
So not sure what's going on.
Can you paste the generated Avro schema? following is the corresponding
code line
log.info(s"Registered avro schema : ${schema.toString(true)}")
Maybe create a gist (gist.github.com), for easier sharing of
code/stacktrace?
Thanks
Vinoth
On Sat, Mar 9, 2019 at 1:33 PM Umesh Kacha wrote:
> Hi Vinoth, thanks. I have already done and checked that; please see the red
> column highlighted below.
>
> root |-- marketplace: string (nullable = true) |-- customer_id: string
> (nullable = true) |-- review_id: string (nullable = true) |-- product_id:
> string (nullable = true) |-- product_parent: string (nullable = true) |--
> product_title: string (nullable = true) |-- product_category: string
> (nullable = true) |-- star_rating: string (nullable = true) |--
> helpful_votes: string (nullable = true) |-- total_votes: string (nullable =
> true) |-- vine: string (nullable = true) |-- verified_purchase: string
> (nullable = true) |-- review_headline: string (nullable = true) |--
> review_body: string (nullable = true) |-- review_date: string (nullable =
> true) |-- year: integer (nullable = true)
>
> On Sun, Mar 10, 2019 at 2:27 AM Vinoth Chandar wrote:
>
> > Hi,
> >
> > >>review_date(Part
> > -review_date) field not found in record
> >
> > Seems like the precombine field is not in the input DF? Can you try doing
> > df1.printSchema and check that once?
> >
> > On Sat, Mar 9, 2019 at 11:52 AM Umesh Kacha
> wrote:
> >
> > > Hi I have the following code using which I am trying to bulk insert
> huge
> > > csv file loaded into Spark DataFrame but it fails saying column
> > review_date
> > > not found but that column is definitely there in dataframe. Please
> guide.
> > >
> > > df1.write
> > > .format("com.uber.hoodie")
> > > .option(DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY,
> > > HoodieTableType.COPY_ON_WRITE.name())
> > > .option(DataSourceWriteOptions.OPERATION_OPT_KEY,
> > > DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL) // insert
> > > .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY,
> > > "customer_id")
> > > .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY,
> "year")
> > > .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY,
> > > "review_date")
> > > .option(HoodieWriteConfig.TABLE_NAME, "hoodie_test_table")
> > > .mode(SaveMode.Overwrite)
> > > .save("/tmp/hoodie/test_hoodie")
> > >
> > >
> > > Caused by: com.uber.hoodie.exception.HoodieException: review_date(Part
> > > -review_date) field not found in record. Acceptable fields were
> > > :[marketplace, customer_id, review_id, product_id, product_parent,
> > > product_title, product_category, star_rating, helpful_votes,
> total_votes,
> > > vine, verified_purchase, review_headline, review_body, review_date,
> year]
> > > at
> > >
> > >
> >
> com.uber.hoodie.DataSourceUtils.getNestedFieldValAsString(DataSourceUtils.java:79)
> > > at
> > >
> > >
> >
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:93)
> > > at
> > >
> > >
> >
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:92)
> > > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
> > > scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
> > >
> > >
> >
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:193)
> > > at
> > >
> > >
> >
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
> > > at
> > >
> >
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
> > > at
> > >
> >
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
> > > at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at
> > > org.apache.spark.scheduler.Task.run(Task.scala:112) at
> > >
> > >
> >
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
> > > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432) at
> > > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
> at
> > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > > at
> > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > > at java.lang.Thread.run(Thread.java:748)
> > > Command took 4.45 seconds -- by [email protected] at 3/10/2019,
> > > 1:17:42
> > > AM on Spark_Hudi
> > >
> >
>
Re: Failed to bulk insert
Hi Vinoth, thanks. I have already done and checked that; please see the red
column highlighted below.
root |-- marketplace: string (nullable = true) |-- customer_id: string
(nullable = true) |-- review_id: string (nullable = true) |-- product_id:
string (nullable = true) |-- product_parent: string (nullable = true) |--
product_title: string (nullable = true) |-- product_category: string
(nullable = true) |-- star_rating: string (nullable = true) |--
helpful_votes: string (nullable = true) |-- total_votes: string (nullable =
true) |-- vine: string (nullable = true) |-- verified_purchase: string
(nullable = true) |-- review_headline: string (nullable = true) |--
review_body: string (nullable = true) |-- review_date: string (nullable =
true) |-- year: integer (nullable = true)
On Sun, Mar 10, 2019 at 2:27 AM Vinoth Chandar wrote:
> Hi,
>
> >>review_date(Part
> -review_date) field not found in record
>
> Seems like the precombine field is not in the input DF? Can you try doing
> df1.printSchema and check that once?
>
> On Sat, Mar 9, 2019 at 11:52 AM Umesh Kacha wrote:
>
> > Hi I have the following code using which I am trying to bulk insert huge
> > csv file loaded into Spark DataFrame but it fails saying column
> review_date
> > not found but that column is definitely there in dataframe. Please guide.
> >
> > df1.write
> > .format("com.uber.hoodie")
> > .option(DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY,
> > HoodieTableType.COPY_ON_WRITE.name())
> > .option(DataSourceWriteOptions.OPERATION_OPT_KEY,
> > DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL) // insert
> > .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY,
> > "customer_id")
> > .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "year")
> > .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY,
> > "review_date")
> > .option(HoodieWriteConfig.TABLE_NAME, "hoodie_test_table")
> > .mode(SaveMode.Overwrite)
> > .save("/tmp/hoodie/test_hoodie")
> >
> >
> > Caused by: com.uber.hoodie.exception.HoodieException: review_date(Part
> > -review_date) field not found in record. Acceptable fields were
> > :[marketplace, customer_id, review_id, product_id, product_parent,
> > product_title, product_category, star_rating, helpful_votes, total_votes,
> > vine, verified_purchase, review_headline, review_body, review_date, year]
> > at
> >
> >
> com.uber.hoodie.DataSourceUtils.getNestedFieldValAsString(DataSourceUtils.java:79)
> > at
> >
> >
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:93)
> > at
> >
> >
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:92)
> > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
> > scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
> >
> >
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:193)
> > at
> >
> >
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
> > at
> >
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
> > at
> >
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
> > at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at
> > org.apache.spark.scheduler.Task.run(Task.scala:112) at
> >
> >
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
> > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432) at
> > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503) at
> >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > at
> >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > at java.lang.Thread.run(Thread.java:748)
> > Command took 4.45 seconds -- by [email protected] at 3/10/2019,
> > 1:17:42
> > AM on Spark_Hudi
> >
>
Re: Failed to bulk insert
Hi,
>>review_date(Part
-review_date) field not found in record
Seems like the precombine field is not in the input DF? Can you try doing
df1.printSchema and check that once?
On Sat, Mar 9, 2019 at 11:52 AM Umesh Kacha wrote:
> Hi, I have the following code with which I am trying to bulk insert a huge
> csv file loaded into a Spark DataFrame, but it fails saying column review_date
> is not found, even though that column is definitely there in the dataframe. Please guide.
>
> df1.write
> .format("com.uber.hoodie")
> .option(DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY,
> HoodieTableType.COPY_ON_WRITE.name())
> .option(DataSourceWriteOptions.OPERATION_OPT_KEY,
> DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL) // insert
> .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY,
> "customer_id")
> .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "year")
> .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY,
> "review_date")
> .option(HoodieWriteConfig.TABLE_NAME, "hoodie_test_table")
> .mode(SaveMode.Overwrite)
> .save("/tmp/hoodie/test_hoodie")
>
>
> Caused by: com.uber.hoodie.exception.HoodieException: review_date(Part
> -review_date) field not found in record. Acceptable fields were
> :[marketplace, customer_id, review_id, product_id, product_parent,
> product_title, product_category, star_rating, helpful_votes, total_votes,
> vine, verified_purchase, review_headline, review_body, review_date, year]
> at
>
> com.uber.hoodie.DataSourceUtils.getNestedFieldValAsString(DataSourceUtils.java:79)
> at
>
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:93)
> at
>
> com.uber.hoodie.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:92)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
> scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at
>
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:193)
> at
>
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
> at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
> at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
> at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139) at
> org.apache.spark.scheduler.Task.run(Task.scala:112) at
>
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432) at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503) at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Command took 4.45 seconds -- by [email protected] at 3/10/2019,
> 1:17:42
> AM on Spark_Hudi
>
