mkvenkatesh commented on issue #14951: URL: https://github.com/apache/iceberg/issues/14951#issuecomment-3946934993
> so if it is during delete, you could change the size of each batch delete up from its default of 250 to a value up to 1000.
>
> ```
> s3.delete.batch-size 500
> ```
>
> that means half as many http connections are needed. But as each file deleted still counts against your 3500 write/second allocation, large deletes can have adverse consequences, see [apache/hadoop@56dee66](https://github.com/apache/hadoop/commit/56dee667707926f3796c7757be1a133a362f05c9) for the details.

Thank you @steveloughran. The errors we see come from the compaction job, not from the remove-unreferenced-files action. We are seeing two scenarios where the BindExceptions happen for us:

1. When the compaction job tries to materialize all the equality deletes within a file group in a Spark job.
2. When we end up with one file group (one Spark job) in which a single Spark task processes > 1K data files (some of these may be equality/positional delete files as well). We have these tiny 14 KB data files strewn across multiple S3 prefixes, all belonging to one partition, so even with a very low `max-file-group-size-bytes` we still end up with one Spark job/task processing > 1K files, and we see BindExceptions around that time.

I don't know which settings helped which case, but our problems disappeared. Increasing `delete-num-threads` to 32 caused problems again, so we reduced it to 16 and now see no connection issues.
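For reference, the tuning described above can be combined in one place: the S3A batch-delete size goes into the Hadoop/Spark configuration, while the file-group sizing is passed as an option to Iceberg's `rewrite_data_files` procedure. A hedged sketch, assuming Spark SQL with the Iceberg extensions enabled (the catalog name `my_catalog` and the concrete values are illustrative, not recommendations):

```sql
-- Hadoop S3A setting, set via Spark conf rather than SQL, e.g.:
--   spark.hadoop.fs.s3a.bulk.delete.page.size  (exact key depends on your Hadoop version;
--   the comment above refers to it as s3.delete.batch-size)

-- Compaction with a capped file-group size, so each Spark job/task
-- handles fewer files at once:
CALL my_catalog.system.rewrite_data_files(
  table => 'db.events',
  options => map(
    'max-file-group-size-bytes', '1073741824'  -- 1 GiB per file group (illustrative)
  )
);
```

Note that, as described above, capping `max-file-group-size-bytes` did not help when all the tiny files landed in a single partition, since Iceberg still grouped them into one file group; the setting bounds group size by bytes, not by file count.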
