hameizi commented on pull request #2867: URL: https://github.com/apache/iceberg/pull/2867#issuecomment-886502378
> > This adds one feature: Flink writes to Iceberg auto-compact small files.
>
> Possibly I'm missing something, but I don't see any accounting for files that might already be near, or close to, the optimal size. It's late and my eyes may deceive me, but this appears to compact all files toward the target file size in bytes, regardless of their existing size etc. In some cases, the cost of opening and rewriting provides less value than leaving the data as is. Can we account for this like we do in some other places? Or am I just missing the fact that that functionality is hidden elsewhere?
>
> This would be a good topic to consider discussing in the mentioned GitHub issue :)

It compacts the files returned by the partition filter, so it will compact all files when the table has no partitions. But in my work scenario this is not a problem, because we compact on every transaction, and each transaction only generates a few small files. So compacting on every transaction is quick and cheap, and the compaction time will not exceed the transaction time either.
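For illustration only, here is a minimal sketch of the kind of size accounting the review comment asks about: filter the candidate files by size before planning the rewrite, so files already near the target are never opened. All names and the 75% threshold here are hypothetical, not this PR's code or Iceberg's API:

```java
import java.util.List;
import java.util.stream.Collectors;

public class CompactionPlanner {
    // Hypothetical cutoff: only rewrite files smaller than 75% of the target size.
    static final double MIN_SIZE_RATIO = 0.75;

    // Minimal stand-in for a data file's metadata (hypothetical, not Iceberg's DataFile).
    static class FileInfo {
        final String path;
        final long sizeInBytes;
        FileInfo(String path, long sizeInBytes) {
            this.path = path;
            this.sizeInBytes = sizeInBytes;
        }
    }

    // Keep only files well below the target size; files already near the
    // optimal size are skipped, avoiding the open-and-rewrite cost.
    static List<FileInfo> selectFilesToCompact(List<FileInfo> candidates,
                                               long targetFileSizeBytes) {
        long threshold = (long) (targetFileSizeBytes * MIN_SIZE_RATIO);
        return candidates.stream()
                .filter(f -> f.sizeInBytes < threshold)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FileInfo> files = List.of(
                new FileInfo("part-0.parquet", 4L * 1024 * 1024),     // 4 MB: compact
                new FileInfo("part-1.parquet", 120L * 1024 * 1024));  // 120 MB: leave as is
        selectFilesToCompact(files, 128L * 1024 * 1024)
                .forEach(f -> System.out.println("compact: " + f.path));
    }
}
```

In this sketch, with a 128 MB target only the 4 MB file is selected, while the 120 MB file is left as-is, which is the behavior the reviewer suggests accounting for.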
