RussellSpitzer commented on issue #2195:
URL: https://github.com/apache/iceberg/issues/2195#issuecomment-771885958
I tried a slightly different fix where the filter just checks whether the
task contains more than one file, OR contains a single file that is only
partially scanned (scan length < file sizeInBytes). That seems to work as
well, and it avoids building up the fileName set, which may be an issue with
very large numbers of files.
```java
// True when the task holds exactly one file and reads only part of it.
private boolean isPartialFileScan(CombinedScanTask task) {
  if (task.files().size() == 1) {
    FileScanTask fileScanTask = task.files().iterator().next();
    // A scan length that differs from the file size means the file was split.
    return fileScanTask.file().fileSizeInBytes() != fileScanTask.length();
  } else {
    return false;
  }
}
```
```java
// Either we are combining multiple files or we have broken a file into
// smaller pieces
.filter(task -> task.files().size() > 1 || isPartialFileScan(task))
```
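
To make the filter's behavior easier to check in isolation, here is a minimal, self-contained sketch of the same predicate. It does not use the Iceberg classes: `ScanPiece` and `Combined` are hypothetical stand-ins for `FileScanTask` and `CombinedScanTask`, and `shouldKeep` and `select` are names made up for this illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PartialScanFilterSketch {

  // Hypothetical stand-in for FileScanTask: the full file size and the
  // number of bytes this particular task reads from it.
  static final class ScanPiece {
    final long fileSizeInBytes;
    final long length;

    ScanPiece(long fileSizeInBytes, long length) {
      this.fileSizeInBytes = fileSizeInBytes;
      this.length = length;
    }
  }

  // Hypothetical stand-in for CombinedScanTask: a group of scan pieces.
  static final class Combined {
    final List<ScanPiece> files;

    Combined(ScanPiece... pieces) {
      this.files = Arrays.asList(pieces);
    }
  }

  // True when the task holds exactly one file and reads only part of it.
  static boolean isPartialFileScan(Combined task) {
    if (task.files.size() != 1) {
      return false;
    }
    ScanPiece only = task.files.get(0);
    return only.fileSizeInBytes != only.length;
  }

  // Same condition as the filter lambda: several files combined,
  // or one file broken into smaller pieces.
  static boolean shouldKeep(Combined task) {
    return task.files.size() > 1 || isPartialFileScan(task);
  }

  // Applies the predicate to a list of tasks.
  static List<Combined> select(List<Combined> tasks) {
    return tasks.stream()
        .filter(PartialScanFilterSketch::shouldKeep)
        .collect(Collectors.toList());
  }
}
```

In this sketch, a task that reads one whole file is dropped, while a task that reads part of one file or combines several files is kept, with no per-file name set held in memory.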