stevenzwu commented on issue #2900:
URL: https://github.com/apache/iceberg/issues/2900#issuecomment-894383649


   @ayush-san let's assume your TaskManager has 1 CPU and 4 GB of memory and 
you run 10 disjoint pipelines (each with a parallelism of 1) in the same 
process, so you have 10 IcebergFileWriter tasks running in the same 
TaskManager. Each writer can use 128 MB for the Parquet row group size. I am 
sure the actual overhead is a few times the 128 MB row group size, so the 
memory usage can add up.
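   
   A rough back-of-the-envelope sketch of that arithmetic follows. The 
function name and the overhead factor of 3 are assumptions for illustration 
only (a stand-in for "a few times"), not measured values or anything taken 
from Iceberg or Flink:

```python
MB = 1024 * 1024


def estimate_writer_memory(num_writers, row_group_bytes=128 * MB, overhead_factor=3.0):
    """Rough estimate of memory held by Parquet writers sharing one process.

    overhead_factor is a hypothetical multiplier standing in for the
    unmeasured overhead on top of the configured row group size.
    """
    return int(num_writers * row_group_bytes * overhead_factor)


# Scenario from the comment: 10 writers in a TaskManager with 4 GB of memory.
taskmanager_memory = 4 * 1024 * MB
estimate = estimate_writer_memory(num_writers=10)
fraction = estimate / taskmanager_memory

print(f"estimated writer memory: {estimate // MB} MB "
      f"of {taskmanager_memory // MB} MB ({fraction:.0%})")
```

   With these assumed numbers the writers alone could account for most of 
the 4 GB, before counting anything else running in the process.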
   
   If you have the heap dump file, try opening it with Eclipse MAT. The 
dominator tree is quite useful for drilling down to the classes holding on to 
the memory. Use the heap dump file generated by 
`-XX:+HeapDumpOnOutOfMemoryError`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


