mkvenkatesh commented on issue #14951: URL: https://github.com/apache/iceberg/issues/14951#issuecomment-3946934993
> so if it is during delete, you could change the size of each batch delete up from its default of 250 to a value up to 1000.
>
> ```
> s3.delete.batch-size 500
> ```
>
> that means half as many http connections are needed. But as each file deleted still counts against your 3500 write/second allocation, large deletes can have adverse consequences, see [apache/hadoop@56dee66](https://github.com/apache/hadoop/commit/56dee667707926f3796c7757be1a133a362f05c9) for the details.

Thank you @steveloughran. The errors we see come from the compaction job, not from the remove-unreferenced-files action. We are seeing two scenarios where the BindExceptions happen for us:

1. When the compaction job tries to materialize all the equality deletes within a file group in a Spark job.
2. When we end up with one file group (one Spark job) in which a single Spark task processes > 1K data files (some of these may be equality/positional delete files as well). We have these tiny 14 KB data files strewn across multiple S3 prefixes, all belonging to one partition, so even with a very low `max-file-group-size-bytes` we still end up with one Spark job/task processing > 1K files, and we see BindExceptions around that time.

I don't know which settings helped which case, but our problems disappeared. Increasing `delete-num-threads` to 32 caused problems again, so we reduced it to 16 and now see no connection issues.
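For reference, the tuning described above can be combined in one place: the S3A batch-delete size goes into the Hadoop/Spark configuration, while the file-group sizing is passed as an option to Iceberg's `rewrite_data_files` procedure. A hedged sketch, assuming Spark SQL with the Iceberg extensions enabled (the catalog name `my_catalog` and the concrete values are illustrative, not recommendations):

```sql
-- Hadoop S3A setting, set via Spark conf rather than SQL, e.g.:
--   spark.hadoop.fs.s3a.bulk.delete.page.size  (exact key depends on your Hadoop version;
--   the comment above refers to it as s3.delete.batch-size)

-- Compaction with a capped file-group size, so each Spark job/task
-- handles fewer files at once:
CALL my_catalog.system.rewrite_data_files(
  table => 'db.events',
  options => map(
    'max-file-group-size-bytes', '1073741824'  -- 1 GiB per file group (illustrative)
  )
);
```

Note that, as described above, capping `max-file-group-size-bytes` did not help when all the tiny files landed in a single partition, since Iceberg still grouped them into one file group; the setting bounds group size by bytes, not by file count.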
