TengHuo commented on PR #7626: URL: https://github.com/apache/hudi/pull/7626#issuecomment-1384806549
Hi @trushev

Nice feature! We are hitting a similar memory problem in our Flink Hudi MOR pipeline: a heap OOM exception and abnormal GC activity in the task managers.

(Screenshot: task manager GC metrics panel)

After checking, we noticed that the total size of `CompactionOperation` objects in memory is unusually large. This is most likely caused by `HoodieTableFileSystemView`, because each instance of `HoodieTableFileSystemView` loads all pending compaction plans from the timeline into memory.

This is the part of the task manager heap histogram that shows the abnormal memory usage caused by `CompactionOperation`.

```log
   9:       2091712       83668480  org.apache.hudi.common.model.CompactionOperation
 479:            27           4752  org.apache.hudi.io.FlinkAppendHandle
 686:            28           2016  org.apache.hudi.common.table.view.HoodieTableFileSystemView
 800:            28           1344  org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView
 806:            27           1296  org.apache.hudi.table.HoodieFlinkMergeOnReadTable
 954:            27            864  org.apache.hudi.common.table.view.FileSystemViewManager
1064:            28            672  org.apache.hudi.common.table.view.PriorityBasedFileSystemView
```

The timeline of our pipeline contained only 1 unfinished compaction plan, with 74704 operations. `74704 * 28 = 2091712`, which is one copy of every operation for each of the 28 `HoodieTableFileSystemView` instances in the histogram.

May I ask if we can lazily load `HoodieTableFileSystemView` in `PriorityBasedFileSystemView` when creating `FlinkAppendHandle`? That would also reduce memory usage for active partitions.
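To make the lazy-loading request above concrete, here is a minimal sketch of the idea in plain Java. It does not use the Hudi API: `Lazy`, `View`, `PriorityView`, and `pendingCompactionOperations()` are hypothetical names invented for illustration, and the real `PriorityBasedFileSystemView` may be structured differently. The point is only that the expensive secondary view is built on first use rather than at construction time.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not a Hudi class): creates the wrapped value on first access only. */
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> factory;
    private volatile T value;

    Lazy(Supplier<T> factory) {
        this.factory = factory;
    }

    @Override
    public T get() {
        T local = value;
        if (local == null) {
            synchronized (this) {
                local = value;
                if (local == null) {
                    // The expensive load happens here, once for a non-null result.
                    local = factory.get();
                    value = local;
                }
            }
        }
        return local;
    }

    boolean isInitialized() {
        return value != null;
    }
}

/** Simplified stand-in for a file system view; the real interface is far larger. */
interface View {
    int pendingCompactionOperations();
}

/** Hypothetical priority-based view: asks the primary first, builds the secondary only on failure. */
final class PriorityView implements View {
    private final View primary;
    private final Lazy<View> secondary;

    PriorityView(View primary, Supplier<View> secondaryFactory) {
        this.primary = primary;
        this.secondary = new Lazy<>(secondaryFactory);
    }

    @Override
    public int pendingCompactionOperations() {
        try {
            return primary.pendingCompactionOperations();
        } catch (RuntimeException e) {
            // Only now is the in-memory secondary view constructed.
            return secondary.get().pendingCompactionOperations();
        }
    }

    boolean secondaryLoaded() {
        return secondary.isInitialized();
    }
}

final class LazyViewDemo {
    public static void main(String[] args) {
        View remote = () -> 74704; // stand-in for a healthy remote view
        PriorityView view = new PriorityView(remote, () -> () -> 74704);
        System.out.println(view.pendingCompactionOperations()); // prints 74704
        System.out.println(view.secondaryLoaded());             // prints false
    }
}
```

In this sketch, as long as the primary view answers, the secondary view and its copy of the pending compaction operations are never allocated, so the per-handle cost would only be paid by handles that actually fall back.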
