Rong Ma created HUDI-2031:
-----------------------------

             Summary: JVM occasionally crashes during compaction when Spark speculative execution is enabled
                 Key: HUDI-2031
                 URL: https://issues.apache.org/jira/browse/HUDI-2031
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Rong Ma


This can happen when speculative execution is triggered during compaction. The duplicate (speculative) tasks are expected to be killed and terminate cleanly, but sometimes they do not, and the JVM crashes.
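
For context, speculation is enabled through Spark configuration. Below is a minimal sketch of a job setup under which the issue can surface; the application name and the two threshold values are illustrative assumptions, and only spark.speculation=true is essential:

{code:java}
import org.apache.spark.SparkConf;

public class SpeculationRepro {
    public static void main(String[] args) {
        // Speculation launches duplicate copies of slow tasks; the losing copy
        // should be killed cleanly, but per this report it sometimes is not,
        // and its half-closed Parquet writer brings the whole JVM down.
        SparkConf conf = new SparkConf()
            .setAppName("hudi-compaction-repro")          // hypothetical name
            .set("spark.speculation", "true")             // the trigger for this bug
            .set("spark.speculation.multiplier", "1.5")   // illustrative value
            .set("spark.speculation.quantile", "0.75");   // illustrative value
        // ... run the Hudi MOR compaction job with this conf ...
    }
}
{code}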

 
From the executor logs:

ERROR [Executor task launch worker for task 6828] HoodieMergeHandle: Error writing record HoodieRecord{key=HoodieKey { recordKey=45246275517 partitionPath=2021-06-13}, currentLocation='null', newLocation='null'}
java.lang.IllegalArgumentException: You cannot call toBytes() more than once without calling reset()
  at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:53)
  at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridEncoder.toBytes(RunLengthBitPackingHybridEncoder.java:254)
  at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridValuesWriter.getBytes(RunLengthBitPackingHybridValuesWriter.java:65)
  at org.apache.parquet.column.impl.ColumnWriterV1.writePage(ColumnWriterV1.java:148)
  at org.apache.parquet.column.impl.ColumnWriterV1.accountForValueWritten(ColumnWriterV1.java:106)
  at org.apache.parquet.column.impl.ColumnWriterV1.write(ColumnWriterV1.java:200)
  at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.addBinary(MessageColumnIO.java:469)
  at org.apache.parquet.avro.AvroWriteSupport.writeValueWithoutConversion(AvroWriteSupport.java:346)
  at org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:278)
  at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:191)
  at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:165)
  at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
  at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
  at org.apache.hudi.io.storage.HoodieParquetWriter.writeAvroWithMetadata(HoodieParquetWriter.java:83)
  at org.apache.hudi.io.HoodieMergeHandle.writeRecord(HoodieMergeHandle.java:252)
  at org.apache.hudi.io.HoodieMergeHandle.close(HoodieMergeHandle.java:336)
  at org.apache.hudi.table.action.commit.SparkMergeHelper.runMerge(SparkMergeHelper.java:107)
  at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.handleUpdateInternal(HoodieSparkCopyOnWriteTable.java:199)
  at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.handleUpdate(HoodieSparkCopyOnWriteTable.java:190)
  at org.apache.hudi.table.action.compact.HoodieSparkMergeOnReadTableCompactor.compact(HoodieSparkMergeOnReadTableCompactor.java:154)
  at org.apache.hudi.table.action.compact.HoodieSparkMergeOnReadTableCompactor.lambda$compact$9ec9d4c7$1(HoodieSparkMergeOnReadTableCompactor.java:105)
  at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1041)
  at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
  at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
  at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
  at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
  at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:349)
  at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1388)
  at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1298)
  at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1362)
  at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1186)
  at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:360)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:311)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:127)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$runWithUgi$3(Executor.scala:462)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
  at org.apache.spark.executor.Executor$TaskRunner.runWithUgi(Executor.scala:465)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:394)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f2b0b37042a, pid=10120, tid=0x00007f2b0b16c700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_201-b09) (build 1.8.0_201-b09)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.201-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libz.so.1+0x342a]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/vipshop/ssd_disk/0/yarn/local14/usercache/hdfs/appcache/application_1620320166879_33384625/container_e104_1620320166879_33384625_01_000008/hs_err_pid10120.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
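
The IllegalArgumentException above comes from a state guard in Parquet's RLE encoder: toBytes() may only be called once per page before reset(). A minimal sketch of the pattern behind the message (this is NOT the actual Parquet source; the class and field names here are simplified stand-ins):

{code:java}
// Sketch of the guard in
// org.apache.parquet.column.values.rle.RunLengthBitPackingHybridEncoder.
class RleEncoderSketch {
    private boolean toBytesCalled = false;

    byte[] toBytes() {
        // Parquet's Preconditions.checkArgument throws IllegalArgumentException
        // with exactly the message seen in the log when toBytes() is entered a
        // second time without an intervening reset().
        if (toBytesCalled) {
            throw new IllegalArgumentException(
                "You cannot call toBytes() more than once without calling reset()");
        }
        toBytesCalled = true;
        return new byte[0]; // the encoded page bytes (elided here)
    }

    void reset() {
        toBytesCalled = false;
    }
}
{code}

Hitting this guard suggests the killed speculative task's Parquet writer was flushed twice, e.g. via the write path and again via HoodieMergeHandle.close seen in the stack trace. The subsequent SIGSEGV in libz.so.1 would be consistent with the same writer's native zlib stream being used during or after teardown, though the crash report alone does not prove that.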



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
