RussellSpitzer commented on issue #2195:
URL: https://github.com/apache/iceberg/issues/2195#issuecomment-771885958


   I tried a slightly different fix where we just check in the filter step
whether the task has more than one file, or has a single file that is only
partially scanned (scan length < file size in bytes). That seems to work as
well, and it avoids building up the fileName set, which may be an issue with
very large numbers of files.
   
   ```java
   private boolean isPartialFileScan(CombinedScanTask task) {
     if (task.files().size() != 1) {
       return false;
     }
     // A single-file task is partial when it reads fewer bytes than the whole file.
     FileScanTask fileScanTask = task.files().iterator().next();
     return fileScanTask.length() != fileScanTask.file().fileSizeInBytes();
   }
   ```
   
   ```
   // Either we are combining multiple files or we have broken a file into smaller pieces
   .filter(task -> task.files().size() > 1 || isPartialFileScan(task))
   ```
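
To make the behavior of that predicate concrete, here is a minimal standalone sketch that runs without Iceberg on the classpath. It is only an illustration: `ScanPiece`, `keepTask`, and `selectTasks` are hypothetical names, with `ScanPiece` standing in for a `FileScanTask` (file size plus scan length) and a plain `List<ScanPiece>` standing in for a `CombinedScanTask`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class PartialScanFilterSketch {

  // Hypothetical stand-in for a FileScanTask: the size of the underlying
  // data file and the number of bytes this task actually reads from it.
  static final class ScanPiece {
    final long fileSizeInBytes;
    final long length;

    ScanPiece(long fileSizeInBytes, long length) {
      this.fileSizeInBytes = fileSizeInBytes;
      this.length = length;
    }
  }

  // True only when the task holds exactly one piece and that piece does not
  // cover the whole file.
  static boolean isPartialFileScan(List<ScanPiece> task) {
    if (task.size() != 1) {
      return false;
    }
    ScanPiece piece = task.get(0);
    return piece.length != piece.fileSizeInBytes;
  }

  // Same condition as the filter above: several files combined, or one file
  // that was split into smaller pieces.
  static boolean keepTask(List<ScanPiece> task) {
    return task.size() > 1 || isPartialFileScan(task);
  }

  static List<List<ScanPiece>> selectTasks(List<List<ScanPiece>> tasks) {
    return tasks.stream()
        .filter(PartialScanFilterSketch::keepTask)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<ScanPiece> wholeFile = Collections.singletonList(new ScanPiece(100L, 100L));
    List<ScanPiece> splitFile = Collections.singletonList(new ScanPiece(100L, 40L));
    List<ScanPiece> combined = Arrays.asList(new ScanPiece(10L, 10L), new ScanPiece(20L, 20L));

    System.out.println(keepTask(wholeFile)); // false: one file, read in full
    System.out.println(keepTask(splitFile)); // true: one file, partial scan
    System.out.println(keepTask(combined));  // true: multiple files combined
    System.out.println(selectTasks(Arrays.asList(wholeFile, splitFile, combined)).size()); // 2
  }
}
```

The check only looks at each task on its own, so nothing proportional to the total number of files has to be held in memory.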

