[GitHub] [iceberg] HeartSaVioR commented on issue #1286: Slow parallel operations fail to commit

GitBox Mon, 03 Aug 2020 06:45:02 -0700


HeartSaVioR commented on issue #1286:
URL: https://github.com/apache/iceberg/issues/1286#issuecomment-668030858



   Btw, there's RewriteDataFilesAction among the Spark actions, which does data 
file compaction, though there are some points to improve 
(https://github.com/apache/iceberg/issues/1159), such as prioritizing small 
files and only picking N GB of files, as you do. 
   
   Restricting the size is probably a valid strategy to bound the time to 
compact - if we can keep this to a fairly reasonable time, like a couple of 
minutes, it could be enabled as part of streaming writes, as auto compaction.
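
   To make the idea concrete, here is a rough sketch of "prioritize small 
files, and only pick N GB" as a selection step. This is not the 
RewriteDataFilesAction API; `DataFile`, `pick_files_to_compact`, the threshold 
and the budget are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DataFile:
    # Hypothetical stand-in for a data file entry; not Iceberg's DataFile.
    path: str
    size_bytes: int


def pick_files_to_compact(
    files: Iterable[DataFile],
    small_file_threshold_bytes: int,
    budget_bytes: int,
) -> List[DataFile]:
    """Pick small files, smallest first, until the size budget is used up."""
    # Only files below the threshold are worth compacting.
    candidates = sorted(
        (f for f in files if f.size_bytes < small_file_threshold_bytes),
        key=lambda f: f.size_bytes,
    )
    picked: List[DataFile] = []
    total = 0
    for f in candidates:
        # Candidates are sorted ascending, so once one file does not fit,
        # no later (larger or equal) file will fit either.
        if total + f.size_bytes > budget_bytes:
            break
        picked.append(f)
        total += f.size_bytes
    return picked
```

   Capping the total bytes per run is what would keep a single compaction 
bounded in time; the right budget depends on the cluster and would need to be 
measured.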
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



