TengHuo commented on PR #7626: URL: https://github.com/apache/hudi/pull/7626#issuecomment-1384918691
> > May I ask if we can lazy load `HoodieTableFileSystemView` in `PriorityBasedFileSystemView` when creating `FlinkAppendHandle`? It can also reduce memory usage for active partitions.
>
> @TengHuo thank you for the report. I'll try to reproduce the scenario and consider it here

Thanks @trushev for your reply. Sorry, my initial idea may not be the right way to solve this issue.

What I want to point out is that caching write handles can take a lot of memory. Each handle holds its own instance of `HoodieTable`, and every `HoodieTable` has a `viewManager`, which loads all pending compaction plans from the Hudi timeline when `FileSystemViewManager#getFileSystemView` is called.

So I'm wondering whether it is feasible to share one instance of `HoodieTable` when creating a new handle in `HoodieFlinkWriteClient#upsert` and other similar methods. Since you have implemented `BucketHandles` to cache all active handles, we could create a single table instance and drop it once all handles have been removed from the map.

To reproduce the memory issue I hit, you need to use a MOR table and set up thousands of partitions. As soon as one compaction plan is generated, you will find that each `FlinkAppendHandle` takes a lot of memory because of `CompactionOperation`.
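
To make the sharing idea concrete, here is a minimal sketch under stated assumptions. It is not Hudi code: `SharedTableHandles`, `tableFactory`, and `handleFactory` are hypothetical names, the type parameter `T` stands in for `HoodieTable`, and `H` stands in for a write handle such as `FlinkAppendHandle`. The cache creates the table lazily for the first handle, passes the same instance to every later handle, and drops the reference once the handle map is empty.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Supplier;

/**
 * Hypothetical sketch, not a Hudi API: a handle cache that shares one
 * table instance (T) across all cached handles (H).
 */
class SharedTableHandles<T, H> {
  // Assumption: creating the table is the expensive step (it loads the views).
  private final Supplier<T> tableFactory;
  // Assumption: a handle can be built from a file id plus an existing table.
  private final BiFunction<String, T, H> handleFactory;
  private final Map<String, H> handles = new HashMap<>();
  private T sharedTable;     // null while no handle is cached
  private int tablesCreated; // counter kept only to observe the behavior

  SharedTableHandles(Supplier<T> tableFactory, BiFunction<String, T, H> handleFactory) {
    this.tableFactory = tableFactory;
    this.handleFactory = handleFactory;
  }

  /** Returns the cached handle for fileId, creating it with the shared table if needed. */
  synchronized H getOrCreate(String fileId) {
    H existing = handles.get(fileId);
    if (existing != null) {
      return existing;
    }
    if (sharedTable == null) {
      // First active handle: build the table once.
      sharedTable = tableFactory.get();
      tablesCreated++;
    }
    H handle = handleFactory.apply(fileId, sharedTable);
    handles.put(fileId, handle);
    return handle;
  }

  /** Removes a handle and drops the shared table once no handles remain. */
  synchronized void remove(String fileId) {
    handles.remove(fileId);
    if (handles.isEmpty()) {
      sharedTable = null; // let the table and its cached views be garbage collected
    }
  }

  synchronized int tablesCreated() {
    return tablesCreated;
  }

  synchronized boolean hasTable() {
    return sharedTable != null;
  }
}
```

With this shape, memory held by the table would scale with the number of table instances (one) rather than with the number of cached handles. Whether the real handles can safely share a `HoodieTable` across checkpoints and instants is a separate question that the sketch does not answer.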
