kinolaev commented on PR #15712:
URL: https://github.com/apache/iceberg/pull/15712#issuecomment-4127660087

   @danielcweeks, I've double-checked the production logs and I was wrong: in 
production the connection pool was never exhausted by data file connections; 
the timeouts were always caused by ManifestFilterManager 
(https://github.com/apache/iceberg/pull/15713). I only encountered the problem 
this PR addresses locally, while trying to reproduce the ManifestFilterManager 
issue by reducing max-connections.
   https://github.com/apache/iceberg/pull/15713 was also caused by an invalid 
configuration: thread count * 2 > connection pool size. I run Spark on 
Kubernetes, and although I set spark.executor.cores, it isn't used for the 
executor's resources.limits.cpu. That is why I had too many threads for the 
ManifestFilterManager.filterManifests call: the worker pool size was based on 
the node's CPU count instead of the container's 
(https://github.com/apache/iceberg/blob/apache-iceberg-1.10.1/core/src/main/java/org/apache/iceberg/SystemConfigs.java#L33-L54).
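   A minimal sketch of why this goes wrong, assuming the default Iceberg pool 
sizing from the linked SystemConfigs source (worker pool ≈ max(2, cores), 
delete worker pool ≈ max(2, 4 * cores)) and the iceberg.worker.num-threads / 
iceberg.worker.delete-num-threads system properties as the override mechanism 
(both are my reading of that file, not something this PR changes):

```java
// Sketch: the JVM sizes Iceberg's default thread pools from
// Runtime.availableProcessors(). That value reflects the container's CPU
// limit only when resources.limits.cpu is set; without it, the JVM can see
// the node's full core count and the pools become much larger than intended.
public class PoolSizing {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // Assumed defaults, per my reading of SystemConfigs.java:
        int workerThreads = Math.max(2, cores);
        int deleteWorkerThreads = Math.max(2, 4 * cores);

        System.out.println("availableProcessors   = " + cores);
        System.out.println("worker pool (default) = " + workerThreads);
        System.out.println("delete pool (default) = " + deleteWorkerThreads);

        // Pinning the pools explicitly avoids depending on what the JVM sees,
        // e.g. via spark.executor.extraJavaOptions (property names assumed):
        //   -Diceberg.worker.num-threads=4
        //   -Diceberg.worker.delete-num-threads=16
    }
}
```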
   Without any patches, 50 connections should be enough for 4 CPU cores: 4 
tasks in parallel, 4 threads in the worker pool, and 16 threads in the delete 
worker pool. In the worst case each thread opens 2 simultaneous connections, 
for 48 connections in total.
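   The budget above can be written out explicitly; this is just the arithmetic 
from the previous paragraph, with the per-thread connection count and pool 
sizes taken as stated there rather than measured:

```java
// Worst-case connection budget for a 4-core executor, using the thread
// counts from the discussion above: 4 parallel tasks, 4 worker-pool
// threads, 16 delete-worker-pool threads, and at most 2 simultaneous
// connections per thread.
public class ConnectionBudget {
    public static void main(String[] args) {
        int sparkTasks = 4;              // one task per core
        int workerThreads = 4;           // worker pool = cores
        int deleteWorkerThreads = 16;    // delete worker pool = 4 * cores
        int connectionsPerThread = 2;    // worst case

        int total = (sparkTasks + workerThreads + deleteWorkerThreads)
                * connectionsPerThread;

        // 48 connections, which fits inside a 50-connection pool.
        System.out.println("worst-case connections = " + total);
    }
}
```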
   I still think that loading delete files before a data file is the right 
thing to do. But, yes, it doesn't fix anything or significantly reduce 
resource usage in a properly configured setup.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

