kbendick edited a comment on issue #2033:
URL: https://github.com/apache/iceberg/issues/2033#issuecomment-758545399


   Sorry for the large wall of text, but I've personally dealt with some weird 
issues on versioned S3 buckets (particularly when using Flink), and I thought 
I'd share what information I have. I try not to use them if possible due to 
performance issues, but with strong bucket policies in place they're manageable.
   
   @openinx As you've assigned this issue to yourself, and since I've already
written a small essay here, feel free to reach out on the ASF Slack; I'd be
happy to help in any way that I can (though I imagine you know more about S3
buckets than I do, but perhaps you're more accustomed to traditional HDFS).
Without error logs, I'm not sure what investigation can be done beyond testing
writes to an empty versioned bucket. Since @elkhand mentioned that the issue
occurred after enabling versioning on an existing bucket, it's quite possible
that there are a large number of writers and/or many small files, and that
503 Slow Down errors accumulated before the transaction could complete. That
is especially likely if other jobs are writing to the same bucket (for
example, with a shorter checkpoint interval or a shorter interval between
Iceberg commits). As I mentioned, I personally ran into this situation when I
first started using Flink on S3, as the number of jobs writing to the same
bucket, each with a different checkpoint interval, increased.




