guilload commented on a change in pull request #1326:
URL: https://github.com/apache/iceberg/pull/1326#discussion_r473435656
########## File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergInputFormat.java ##########
@@ -51,6 +58,17 @@
     forwardConfigSettings(job);

+    //Convert Hive filter to Iceberg filter
+    String hiveFilter = job.get(TableScanDesc.FILTER_EXPR_CONF_STR);

Review comment:

I came to the same conclusion when trying to implement projection pushdown in the storage handler. Unfortunately, as @cmathiesen stated, the job config is not yet populated with the projected columns and the filter expression when storage handler "hooks" such as `configureJobConf` are called. The right entry points for implementing PPD are therefore `getSplits` and `getRecordReader`.

~However, there's another catch. The `JobConf` objects passed to `getSplits` and `getRecordReader` are actually not the same, so the filter expression set in `getSplits` (L#53 in `HiveIcebergInputFormat`) is no longer available when `getRecordReader` is subsequently called.~

~Since the storage handler does not decompose the filter expression, Hive applies the whole thing anyway, so this can't be caught in the test suite; still, we need to set the filter expression in both `getSplits` and `getRecordReader` to arrive at a complete PPD implementation.~

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
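The pattern described in the comment can be sketched as follows. This is a minimal, illustrative stand-in only: `JobConf` is modeled as a plain `Map`, `FILTER_EXPR_CONF_STR` mimics `TableScanDesc.FILTER_EXPR_CONF_STR`, and the Hive-to-Iceberg filter conversion is a placeholder rather than the real Iceberg `Expression` translation.

```java
import java.util.Map;

// Illustrative sketch: getSplits and getRecordReader each receive their own
// conf object, so the Hive filter must be converted in both entry points
// rather than once in a configure-time hook such as configureJobConf.
class PushdownSketch {
  // Stand-in for TableScanDesc.FILTER_EXPR_CONF_STR.
  static final String FILTER_EXPR_CONF_STR = "hive.io.filter.expr.serialized";

  // Placeholder for deserializing the Hive filter and converting it to an
  // Iceberg expression; returns null when no filter was pushed down.
  static String convertHiveFilter(Map<String, String> conf) {
    String hiveFilter = conf.get(FILTER_EXPR_CONF_STR);
    return hiveFilter == null ? null : "iceberg:" + hiveFilter;
  }

  // Both entry points perform the conversion on the conf they were handed,
  // because the JobConf seen here is not shared between the two calls.
  static String getSplits(Map<String, String> jobConf) {
    return convertHiveFilter(jobConf);
  }

  static String getRecordReader(Map<String, String> jobConf) {
    return convertHiveFilter(jobConf);
  }
}
```

The key point the sketch captures is that neither entry point may assume the other has already populated the filter: each must read it from its own conf.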