jerryshao opened a new issue #1061:
URL: https://github.com/apache/incubator-iceberg/issues/1061


   Iceberg can generate a large number of small data files, 
especially in streaming mode (Structured Streaming writes or the Flink sink). 
We should therefore have a mechanism to compact small data files into larger ones. 
   
   Iceberg already has `RewriteManifestsAction` and 
`RemoveOrphanFilesAction`. Maybe we could add a new action that compacts data 
files.
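
   To make the idea concrete, here is a rough sketch of how such an action might plan its work: pick the files below a target size and bin-pack them into groups that would each be rewritten as one larger file. This is only an illustration, not an existing Iceberg API. `DataFile`, `plan_compaction`, and `target_size_bytes` are hypothetical names, and the sketch ignores partitions, delete handling, and the commit itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file tracked by the table."""
    path: str
    size_bytes: int


def plan_compaction(files: List[DataFile], target_size_bytes: int) -> List[List[DataFile]]:
    """Group small files into bins of at most target_size_bytes.

    Uses first-fit decreasing bin packing. Files already at or above the
    target size are left alone, and a group with a single file is dropped
    because rewriting it would not reduce the file count.
    """
    small = sorted(
        (f for f in files if f.size_bytes < target_size_bytes),
        key=lambda f: f.size_bytes,
        reverse=True,
    )

    bins = []  # each bin tracks its remaining capacity and its files
    for f in small:
        for b in bins:
            if b["remaining"] >= f.size_bytes:
                b["files"].append(f)
                b["remaining"] -= f.size_bytes
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append({"remaining": target_size_bytes - f.size_bytes, "files": [f]})

    return [b["files"] for b in bins if len(b["files"]) > 1]
```

   Each returned group would then be read and written back as a single file, with the old files replaced in one table commit. A real action would presumably also group by partition first, since files from different partitions cannot be merged.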
   
   Any thoughts on it? @rdblue @aokolnychyi 




