openinx commented on issue #1383:
URL: https://github.com/apache/iceberg/issues/1383#issuecomment-706978779


   > That might blow up Flink checkpoint state if the enumerated list of 
FileScanTask is too big.
   
   @stevenzwu, what is the maximum size of a table in your production 
environment? I'm wondering whether it's worth implementing the two-phase 
enumerators in the first version. 
   
   If we have 1PB of data and each file is 128MB, then the table will have 
8388608 files (1PB / 128MB). If every `FileScanTask` consumes 1KB, then the 
enumerated state is ~8GB. That should be acceptable for the Flink state backend.
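   
   For reference, the estimate above can be reproduced with a quick 
back-of-envelope script. This is only a sketch: the 1KB per `FileScanTask` is 
a guess rather than a measured number, binary units (1PB = 1024^5 bytes) are 
assumed, and `estimate_state_bytes` is a hypothetical helper, not an Iceberg 
or Flink API.

```python
# Back-of-envelope estimate of enumerator state size.
# All inputs are assumptions taken from the comment, not measurements.
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3
PIB = 1024 ** 5


def estimate_state_bytes(table_bytes, file_bytes, task_bytes):
    """Return (file_count, state_bytes), assuming one FileScanTask per data file."""
    # Ceiling division, so a trailing partial file still counts as one file.
    file_count = -(-table_bytes // file_bytes)
    return file_count, file_count * task_bytes


file_count, state_bytes = estimate_state_bytes(1 * PIB, 128 * MIB, 1 * KIB)
print(file_count)          # 8388608
print(state_bytes / GIB)   # 8.0
```

   Changing the per-task size or the average file size scales the result 
linearly, so a smaller average file size (say 32MB) would push the same table 
to roughly 32GB of state.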
   


