[
https://issues.apache.org/jira/browse/HIVE-26137?focusedWorklogId=756609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-756609
]
ASF GitHub Bot logged work on HIVE-26137:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 13/Apr/22 19:03
Start Date: 13/Apr/22 19:03
Worklog Time Spent: 10m
Work Description: szlta commented on code in PR #3203:
URL: https://github.com/apache/hive/pull/3203#discussion_r849804783
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/orc/VectorizedReadUtils.java:
##########
@@ -160,8 +161,9 @@ public static void handleIcebergProjection(FileScanTask task, JobConf job, TypeD
job.set(ColumnProjectionUtils.ORC_SCHEMA_STRING, readOrcSchema.toString());
// Predicate pushdowns needs to be adjusted too in case of column renames, we let Iceberg generate this into job
- if (task.residual() != null) {
- Expression boundFilter = Binder.bind(currentSchema.asStruct(), task.residual(), false);
+ Expression residual = HiveIcebergInputFormat.residualForTask(task, job);
+ if (residual != null) {
Review Comment:
fixed
Issue Time Tracking
-------------------
Worklog Id: (was: 756609)
Time Spent: 0.5h (was: 20m)
> Optimized transfer of Iceberg residual expressions from AM to execution
> -----------------------------------------------------------------------
>
> Key: HIVE-26137
> URL: https://issues.apache.org/jira/browse/HIVE-26137
> Project: Hive
> Issue Type: Improvement
> Reporter: Ádám Szita
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> HIVE-25967 introduced a hack to prevent Iceberg filter expressions from being
> serialized into splits. This temporary fix avoided OOM problems on the Tez AM
> side, but it also prevented predicate pushdown from working on the execution
> side.
> This ticket intends to incorporate the long-term solution. It turns out that
> the file scan tasks created by Iceberg don't actually contain a "residual"
> expression, but rather the complete, original one. It becomes residual only
> when it is evaluated against the task's partition value, which only happens
> on the execution side. This means that the original filter is the same
> expression for all splits in the Tez AM, so we can transfer it via the job conf
> instead.
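
The idea in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from the PR: the class ResidualSketch, the key "sketch.iceberg.filter", and the string-based filter format are all hypothetical, and a plain Map stands in for the job conf. It shows the original filter being stored once on the planning side and being reduced to a per-task residual only on the execution side, where the partition values are known.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are Hive or Iceberg APIs.
class ResidualSketch {
    // Hypothetical conf key standing in for wherever the filter is stored.
    static final String FILTER_KEY = "sketch.iceberg.filter";

    // Planning side: store the original filter once instead of once per split.
    static void publishFilter(Map<String, String> jobConf, String filter) {
        jobConf.put(FILTER_KEY, filter);
    }

    // Execution side: reduce the shared filter against one task's partition values.
    // The filter is assumed to be a conjunction of "column=value" terms joined by " AND ".
    static String residualFor(Map<String, String> jobConf, Map<String, String> partition) {
        String filter = jobConf.get(FILTER_KEY);
        if (filter == null) {
            return "true"; // no filter: every row passes
        }
        List<String> remaining = new ArrayList<>();
        for (String term : filter.split(" AND ")) {
            String[] kv = term.split("=", 2);
            String partitionValue = partition.get(kv[0]);
            if (partitionValue == null) {
                remaining.add(term); // not a partition column: keep for row-level evaluation
            } else if (!partitionValue.equals(kv[1])) {
                return "false"; // this partition can never match the filter
            }
            // otherwise the term always holds for this partition and is dropped
        }
        return remaining.isEmpty() ? "true" : String.join(" AND ", remaining);
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        publishFilter(jobConf, "day=2022-04-13 AND id=7");
        System.out.println(residualFor(jobConf, Map.of("day", "2022-04-13"))); // prints "id=7"
        System.out.println(residualFor(jobConf, Map.of("day", "2022-04-12"))); // prints "false"
    }
}
```

In this model the same filter string serves every task, so nothing filter-related has to travel inside the individual splits; each task derives its own residual locally.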
--
This message was sent by Atlassian Jira
(v8.20.1#820001)