RussellSpitzer commented on PR #16330:
URL: https://github.com/apache/iceberg/pull/16330#issuecomment-4452121571

   Long term I'd like to do something smarter than this, but it is probably a 
good first step. For example, just because a file was "written" with a sort 
order doesn't mean it shouldn't be re-sorted.
   
   In the original doc, for example, I proposed looking at overlaps and only 
selecting files for rewriting where the overlap depth reached a certain level.
   
   For example, if I have these files:
   
   [1 - 100] - SortId 1
   [1 - 100] - SortId 1
   [1 - 100] - SortId 1
   [1 - 100] - No SortId
   
   Rewriting just the last file doesn't make sense, and ignoring the first 
three is probably a mistake, since all four files cover the same range.
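
   To make the overlap-depth idea concrete, here is a rough sketch in Python. It is only an illustration under assumptions: `FileRange`, `overlap_depth`, `select_for_rewrite`, and the `min_depth` threshold are hypothetical names, not Iceberg APIs, and the bounds are plain integers rather than real sort-key statistics. Selection looks only at how many files overlap a given file's range and ignores the sort order id entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FileRange:
    """Hypothetical stand-in for a data file's sort-key bounds (not an Iceberg class)."""
    lower: int
    upper: int
    sort_id: Optional[int]  # None means the file was written without a sort order


def overlap_depth(target: FileRange, files: List[FileRange]) -> int:
    """Largest number of files (target included, if it is in `files`)
    that cover any single point inside target's range."""
    events = []
    for f in files:
        # Clip each file to the target's range; skip files that do not intersect it.
        lo = max(f.lower, target.lower)
        hi = min(f.upper, target.upper)
        if lo <= hi:
            events.append((lo, 0))  # open; sorts before a close at the same point,
            events.append((hi, 1))  # so closed ranges that merely touch still overlap
    depth = best = 0
    for _, kind in sorted(events):
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best


def select_for_rewrite(files: List[FileRange], min_depth: int) -> List[FileRange]:
    """Pick every file whose overlap depth reaches the (assumed) threshold,
    regardless of whether it already carries a sort order id."""
    return [f for f in files if overlap_depth(f, files) >= min_depth]


# The four files from the example above, plus one disjoint file for contrast.
files = [
    FileRange(1, 100, sort_id=1),
    FileRange(1, 100, sort_id=1),
    FileRange(1, 100, sort_id=1),
    FileRange(1, 100, sort_id=None),
    FileRange(200, 300, sort_id=None),
]
selected = select_for_rewrite(files, min_depth=2)
```

   With this sketch all four `[1 - 100]` files are selected together, and the disjoint `[200 - 300]` file is left alone even though it has no sort id. The quadratic scan is just for readability; a real implementation would presumably sweep once over all file bounds.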


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

