rdsr commented on a change in pull request #931: Add residual evaluation for MR
reader
URL: https://github.com/apache/incubator-iceberg/pull/931#discussion_r409747002
##########
File path:
mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java
##########
@@ -394,8 +378,14 @@ public void close() throws IOException {
String.format("Cannot read %s file: %s", file.format().name(),
file.path()));
}
currentCloseable = iterable;
- //TODO: Apply residual filtering before returning the iterator
- return iterable.iterator();
+ boolean applyResidualFiltering =
!context.getConfiguration().getBoolean(SKIP_RESIDUAL_FILTERING, false);
Review comment:
Seems like we should also take into account that ICEBERG_GENERICS is the
data model before applying residuals. Should this be done inside each of the
methods, e.g. `newAvroIterable`, `newOrcIterable`, etc., where we dispatch on the
data model? I think the actual application of the residual could be a function
for ICEBERG_GENERICS that is called from each of the
`newAvro|Orc|ParquetIterable` methods.
Thoughts?
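
A minimal sketch of the shared-helper idea, written against plain Java types rather than the real Iceberg classes. Everything here is an assumption for illustration: the `DataModel` enum, the `applyResidual` helper, the simplified `newAvroIterable` signature, and the use of a `Predicate` as a stand-in for an evaluated residual expression. It only shows the shape: one function decides whether to filter, and each format-specific method calls it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.StreamSupport;

public class ResidualSketch {
  // Hypothetical stand-in for the data model the reader dispatches on.
  enum DataModel { ICEBERG_GENERICS, PIG, HIVE }

  // Hypothetical shared helper: wraps the rows in a lazy filter only when
  // residual filtering is enabled and the data model is ICEBERG_GENERICS;
  // otherwise the rows are returned untouched.
  static <T> Iterable<T> applyResidual(
      Iterable<T> rows, Predicate<T> residual, DataModel model, boolean skipResidualFiltering) {
    if (skipResidualFiltering || model != DataModel.ICEBERG_GENERICS) {
      return rows;
    }
    return () -> StreamSupport.stream(rows.spliterator(), false).filter(residual).iterator();
  }

  // Hypothetical format-specific method: the list argument stands in for rows
  // read from a file. An Orc or Parquet variant would end with the same call.
  static <T> Iterable<T> newAvroIterable(
      List<T> rowsFromFile, Predicate<T> residual, DataModel model, boolean skipResidualFiltering) {
    return applyResidual(rowsFromFile, residual, model, skipResidualFiltering);
  }

  // Small utility for the demo: drain an Iterable into a List.
  static <T> List<T> toList(Iterable<T> rows) {
    List<T> out = new ArrayList<>();
    rows.forEach(out::add);
    return out;
  }

  public static void main(String[] args) {
    List<Integer> rows = Arrays.asList(1, 2, 3, 4);
    Predicate<Integer> residual = x -> x % 2 == 0;

    // Filtered: generics data model, filtering not skipped.
    System.out.println(toList(newAvroIterable(rows, residual, DataModel.ICEBERG_GENERICS, false)));
    // Unfiltered: a different data model.
    System.out.println(toList(newAvroIterable(rows, residual, DataModel.PIG, false)));
    // Unfiltered: filtering explicitly skipped.
    System.out.println(toList(newAvroIterable(rows, residual, DataModel.ICEBERG_GENERICS, true)));
  }
}
```

Keeping the data-model check inside the helper means the skip flag and the model test live in one place, so the per-format methods cannot drift apart.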