BalaMahesh commented on issue #9758:
URL: https://github.com/apache/hudi/issues/9758#issuecomment-1738402516
@codope - is there any fix for this behaviour? We have started seeing OOM errors on the
executor pods because of this large number of metadata log files.
```
id='HoodieFileGroupId{partitionPath='files', fileId='files-0000'}', metrics={TOTAL_LOG_FILES=671.0, TOTAL_IO_READ_MB=7.0, TOTAL_LOG_FILES_SIZE=7456183.0, TOTAL_IO_WRITE_MB=0.0, TOTAL_IO_MB=7.0}, bootstrapFilePath=Optional.empty}] files
23/09/28 03:55:29 INFO SparkContext: Starting job: collect at HoodieJavaRDD.java:163
23/09/28 03:55:29 INFO DAGScheduler: Got job 30 (collect at HoodieJavaRDD.java:163) with 1 output partitions
23/09/28 03:55:29 INFO DAGScheduler: Final stage: ResultStage 54 (collect at HoodieJavaRDD.java:163)
23/09/28 03:55:29 INFO DAGScheduler: Parents of final stage: List()
23/09/28 03:55:29 INFO DAGScheduler: Missing parents: List()
23/09/28 03:55:29 INFO DAGScheduler: Submitting ResultStage 54 (MapPartitionsRDD[108] at map at HoodieJavaRDD.java:111), which has no missing parents
23/09/28 03:55:29 INFO MemoryStore: Block broadcast_40 stored as values in memory (estimated size 384.6 KiB, free 1048.4 MiB)
23/09/28 03:55:29 INFO MemoryStore: Block broadcast_40_piece0 stored as bytes in memory (estimated size 130.1 KiB, free 1048.3 MiB)
23/09/28 03:55:29 INFO BlockManagerInfo: Added broadcast_40_piece0 in memory on glance-grap-studio-bubble-384cd48ad9e9f6d8-driver-svc.spark-jobs.svc:7079 (size: 130.1 KiB, free: 1048.7 MiB)
23/09/28 03:55:29 INFO SparkContext: Created broadcast 40 from broadcast at DAGScheduler.scala:1478
23/09/28 03:55:29 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 54 (MapPartitionsRDD[108] at map at HoodieJavaRDD.java:111) (first 15 tasks are for partitions Vector(0))
23/09/28 03:55:29 INFO TaskSchedulerImpl: Adding task set 54.0 with 1 tasks resource profile 0
23/09/28 03:55:29 INFO TaskSetManager: Starting task 0.0 in stage 54.0 (TID 226) (10.207.40.187, executor 1, partition 0, PROCESS_LOCAL, 42753 bytes) taskResourceAssignments Map()
23/09/28 03:55:29 INFO BlockManagerInfo: Added broadcast_40_piece0 in memory on 10.207.40.187:34929 (size: 130.1 KiB, free: 7.6 GiB)
23/09/28 03:55:34 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:55:44 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:55:54 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:04 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:14 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:24 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:34 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:44 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:54 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:57:04 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:57:04 WARN TaskSetManager: Lost task 0.0 in stage 54.0 (TID 226) (10.207.40.187 executor 1): java.lang.OutOfMemoryError: Java heap space
        at java.base/java.io.BufferedInputStream.<init>(Unknown Source)
        at org.apache.hadoop.fs.BufferedFSInputStream.<init>(BufferedFSInputStream.java:56)
        at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStreamForGCS(HoodieLogFileReader.java:541)
        at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:502)
        at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:118)
        at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternalV1(AbstractHoodieLogRecordReader.java:247)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:198)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:114)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
        at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:204)
        at org.apache.hudi.table.action.compact.HoodieCompactor.lambda$compact$9cd4b1be$1(HoodieCompactor.java:129)
        at org.apache.hudi.table.action.compact.HoodieCompactor$$Lambda$2356/0x0000000841269c40.apply(Unknown Source)
        at org.apache.hudi.data.HoodieJavaRDD$$Lambda$1876/0x0000000840ee1040.call(Unknown Source)
        at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
        at org.apache.spark.api.java.JavaPairRDD$$$Lambda$655/0x000000084060e840.apply(Unknown Source)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
        at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
        at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:223)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:352)
        at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1498)
        at org.apache.spark.storage.BlockManager$$Lambda$633/0x00000008405ad840.apply(Unknown Source)
        at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1408)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1472)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1295)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
23/09/28 03:57:04 INFO TaskSetManager: Starting task 0.1 in stage 54.0 (TID 227) (10.207.40.187, executor 1, partition 0
```
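For context, the trace shows the metadata-table compactor's merged log record scanner opening 671 delta log files for the `files-0000` file group, which is where the heap is exhausted. The settings below are the Hudi write configs commonly tuned in this situation; this is a hedged sketch, not a confirmed fix for this issue, and the values shown are illustrative and should be checked against your Hudi version's defaults:

```properties
# Compact the metadata table after fewer delta commits so log files
# do not accumulate in a single file group (671 in the log above);
# the default is 10.
hoodie.metadata.compact.max.delta.commits=5

# Cap the heap the merged log record scanner may use before it spills
# merge state to an external map on disk (value is in bytes).
hoodie.memory.merge.max.size=1073741824

# Back the spillable map with RocksDB instead of the default BITCASK
# implementation when merge state is spilled to disk.
hoodie.common.spillable.diskmap.type=ROCKS_DB
```

Raising `spark.executor.memory` for the writer that runs metadata compaction is the other usual lever, independent of these options.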