BalaMahesh commented on issue #9758:
URL: https://github.com/apache/hudi/issues/9758#issuecomment-1738402516

   @codope - is there any fix for this behaviour? We started seeing OOM errors on the executor pods because of the large number of metadata log files.
   
   ```
   id='HoodieFileGroupId{partitionPath='files', fileId='files-0000'}', metrics={TOTAL_LOG_FILES=671.0, TOTAL_IO_READ_MB=7.0, TOTAL_LOG_FILES_SIZE=7456183.0, TOTAL_IO_WRITE_MB=0.0, TOTAL_IO_MB=7.0}, bootstrapFilePath=Optional.empty}] files
   23/09/28 03:55:29 INFO SparkContext: Starting job: collect at HoodieJavaRDD.java:163
   23/09/28 03:55:29 INFO DAGScheduler: Got job 30 (collect at HoodieJavaRDD.java:163) with 1 output partitions
   23/09/28 03:55:29 INFO DAGScheduler: Final stage: ResultStage 54 (collect at HoodieJavaRDD.java:163)
   23/09/28 03:55:29 INFO DAGScheduler: Parents of final stage: List()
   23/09/28 03:55:29 INFO DAGScheduler: Missing parents: List()
   23/09/28 03:55:29 INFO DAGScheduler: Submitting ResultStage 54 (MapPartitionsRDD[108] at map at HoodieJavaRDD.java:111), which has no missing parents
   23/09/28 03:55:29 INFO MemoryStore: Block broadcast_40 stored as values in memory (estimated size 384.6 KiB, free 1048.4 MiB)
   23/09/28 03:55:29 INFO MemoryStore: Block broadcast_40_piece0 stored as bytes in memory (estimated size 130.1 KiB, free 1048.3 MiB)
   23/09/28 03:55:29 INFO BlockManagerInfo: Added broadcast_40_piece0 in memory on glance-grap-studio-bubble-384cd48ad9e9f6d8-driver-svc.spark-jobs.svc:7079 (size: 130.1 KiB, free: 1048.7 MiB)
   23/09/28 03:55:29 INFO SparkContext: Created broadcast 40 from broadcast at DAGScheduler.scala:1478
   23/09/28 03:55:29 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 54 (MapPartitionsRDD[108] at map at HoodieJavaRDD.java:111) (first 15 tasks are for partitions Vector(0))
   23/09/28 03:55:29 INFO TaskSchedulerImpl: Adding task set 54.0 with 1 tasks resource profile 0
   23/09/28 03:55:29 INFO TaskSetManager: Starting task 0.0 in stage 54.0 (TID 226) (10.207.40.187, executor 1, partition 0, PROCESS_LOCAL, 42753 bytes) taskResourceAssignments Map()
   23/09/28 03:55:29 INFO BlockManagerInfo: Added broadcast_40_piece0 in memory on 10.207.40.187:34929 (size: 130.1 KiB, free: 7.6 GiB)
   23/09/28 03:55:34 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:55:44 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:55:54 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:56:04 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:56:14 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:56:24 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:56:34 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:56:44 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:56:54 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:57:04 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
   23/09/28 03:57:04 WARN TaskSetManager: Lost task 0.0 in stage 54.0 (TID 226) (10.207.40.187 executor 1): java.lang.OutOfMemoryError: Java heap space
        at java.base/java.io.BufferedInputStream.<init>(Unknown Source)
        at org.apache.hadoop.fs.BufferedFSInputStream.<init>(BufferedFSInputStream.java:56)
        at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStreamForGCS(HoodieLogFileReader.java:541)
        at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:502)
        at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:118)
        at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternalV1(AbstractHoodieLogRecordReader.java:247)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:198)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:114)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
        at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:204)
        at org.apache.hudi.table.action.compact.HoodieCompactor.lambda$compact$9cd4b1be$1(HoodieCompactor.java:129)
        at org.apache.hudi.table.action.compact.HoodieCompactor$$Lambda$2356/0x0000000841269c40.apply(Unknown Source)
        at org.apache.hudi.data.HoodieJavaRDD$$Lambda$1876/0x0000000840ee1040.call(Unknown Source)
        at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
        at org.apache.spark.api.java.JavaPairRDD$$$Lambda$655/0x000000084060e840.apply(Unknown Source)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
        at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
        at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:223)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:352)
        at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1498)
        at org.apache.spark.storage.BlockManager$$Lambda$633/0x00000008405ad840.apply(Unknown Source)
        at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1408)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1472)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1295)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)

   23/09/28 03:57:04 INFO TaskSetManager: Starting task 0.1 in stage 54.0 (TID 227) (10.207.40.187, executor 1, partition 0
   ```
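   In case it helps others while we wait for a fix: a mitigation sketch we are considering. The config keys below are assumed from Hudi's `HoodieMetadataConfig` and `HoodieMemoryConfig`, and the values are illustrative only - please verify the exact names and defaults against your Hudi version. The idea is to compact the metadata table more frequently so its `files` partition does not accumulate hundreds of log files per file group, and to bound the heap used by the merged log record scanner during compaction:

   ```properties
   # Compact the metadata table after fewer delta commits (default is 10),
   # so fewer log files pile up per metadata file group between compactions.
   hoodie.metadata.compact.max.delta.commits=5

   # Cap the memory the compaction's merged log record scanner may use;
   # records beyond this budget spill to the external spillable map
   # instead of staying on the executor heap. Value here is 1 GiB.
   hoodie.memory.compaction.max.size=1073741824

   # Spillable map backend for the spilled records (BITCASK or ROCKS_DB).
   hoodie.common.spillable.diskmap.type=BITCASK
   ```

   These would be passed as writer options (e.g. via `.option(...)` on the DataFrame writer or `--hoodie-conf` for DeltaStreamer), alongside enough executor memory for the remaining in-heap portion.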

