[ 
https://issues.apache.org/jira/browse/IMPALA-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18083937#comment-18083937
 ] 

ASF subversion and git services commented on IMPALA-14999:
----------------------------------------------------------

Commit 4b79ca768436231991075c7475c653fa24167fc7 in impala's branch 
refs/heads/master from Steve Carlin
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4b79ca768 ]

IMPALA-14999: Calcite planner: support Iceberg tables (part 1)

This commit is part 1 of the changes needed to support querying Iceberg
tables through the Calcite planner.

The change in this commit is a refactor to support count star
optimization. The Calcite planner populates the HdfsScanNode.countStarSlot_
differently from the original planner.

The original planner does analysis through the physical AggregationNode
and HdfsScanNode at "init" time based on the columns and conjuncts present
in these nodes.

The Calcite planner does this analysis before the physical nodes are
created. The TupleDescriptor is already created before the PlanNode.init()
method is called. The Calcite planner determines the SlotDescriptor for
the countStarSlot_ before the PlanNodes are instantiated, so it simply
populates the variable, avoiding the PlanNode.init code.

Before this commit, this variable lived in the extended
ImpalaHdfsScanNode class. However, this does not work for the Iceberg
planner, since it uses its own IcebergScanNode class, which extends
HdfsScanNode. Creating an ImpalaIcebergScanNode class seemed too hacky,
so this commit refactors the code to allow the Calcite
planner to populate the countStarSlot_.

The ScanNodeHelper interface has been created to handle the different
ways the countStarSlot_ gets populated. The implementation for the original
planner is used by KuduScanNode, HdfsScanNode, and IcebergScanNode, each of
which calls the ScanNodeHelperImpl.getCountStarOptimizationDescriptor method.
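
As a rough illustration of the helper abstraction described above, here is a
minimal sketch in Java. It is not the code from this commit: the real
signature of getCountStarOptimizationDescriptor is not given in this message,
and SlotDesc, ScanNodeHelperSketch, InitTimeHelper, PrecomputedHelper and
ScanNodeSketch are hypothetical names. The eligibility rule (count(*) only,
no conjuncts) is a simplifying assumption.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a slot descriptor; only a label is modeled.
final class SlotDesc {
  final String label;
  SlotDesc(String label) { this.label = label; }
}

// Assumed shape of the helper: one method that yields the count(*) slot, if any.
interface ScanNodeHelperSketch {
  Optional<SlotDesc> getCountStarOptimizationDescriptor(
      List<String> conjuncts, boolean countStarOnly);
}

// Original-planner style: decide at init time from what the node sees.
// The rule used here is an assumption made for illustration only.
final class InitTimeHelper implements ScanNodeHelperSketch {
  @Override
  public Optional<SlotDesc> getCountStarOptimizationDescriptor(
      List<String> conjuncts, boolean countStarOnly) {
    if (countStarOnly && conjuncts.isEmpty()) {
      return Optional.of(new SlotDesc("count_star"));
    }
    return Optional.empty();
  }
}

// Calcite-planner style: the slot was chosen before the node existed,
// so the helper simply hands it back and ignores the arguments.
final class PrecomputedHelper implements ScanNodeHelperSketch {
  private final SlotDesc slot; // may be null when no optimization applies
  PrecomputedHelper(SlotDesc slot) { this.slot = slot; }
  @Override
  public Optional<SlotDesc> getCountStarOptimizationDescriptor(
      List<String> conjuncts, boolean countStarOnly) {
    return Optional.ofNullable(slot);
  }
}

// A scan node that delegates the decision to whichever helper it was given.
final class ScanNodeSketch {
  private final ScanNodeHelperSketch helper;
  private final List<String> conjuncts;
  private SlotDesc countStarSlot; // plays the role of countStarSlot_

  ScanNodeSketch(ScanNodeHelperSketch helper, List<String> conjuncts) {
    this.helper = helper;
    this.conjuncts = conjuncts;
  }

  void init(boolean countStarOnly) {
    countStarSlot = helper
        .getCountStarOptimizationDescriptor(conjuncts, countStarOnly)
        .orElse(null);
  }

  SlotDesc getCountStarSlot() { return countStarSlot; }
}
```

The point of the sketch is that the scan node does not need a planner-specific
subclass; only the helper differs between the two planners.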

The whole block of code containing "canApply..." and "apply..." has been
moved. Kudu uses a slightly different conjuncts variable so it is passed
into the common method.

For simplification, a small change was made to HdfsScanNode. Previously, fileFormats
was passed into the "canApply..." method. Instead of passing it in, the
"canApply..." now matches the parent "ScanNode.canApply..." method, uses

One of the ScanNode.apply... methods was unused and removed.

Other planners that extend the ScanNode also pass in the default original
ScanNodeHelperImpl class, but it is not used for these classes.

The one important change for the IcebergScanNode creation is that the
ScanNodeHelper class is instantiated from SingleNodePlanner and passed into
the IcebergScanPlanner. When the Calcite planner code gets implemented,
it will also call the IcebergScanPlanner, but with its own implementation
of ScanNodeHelper.
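
The wiring described in the paragraph above could look roughly like the
following Java sketch. All names here (SlotSource, IcebergScanPlannerSketch,
PlannerWiringSketch) are hypothetical, and the string output only stands in
for a real scan node; the actual constructor arguments of IcebergScanPlanner
are not shown in this message.

```java
// Hypothetical, simplified view of the helper: it only reports a slot label.
interface SlotSource {
  String countStarSlot();
}

// Stand-in for the scan planner: it receives the helper instead of creating it.
final class IcebergScanPlannerSketch {
  private final SlotSource helper;

  IcebergScanPlannerSketch(SlotSource helper) { this.helper = helper; }

  // Returns a description of the node it would build.
  String planScan() {
    return "IcebergScanNode[countStarSlot=" + helper.countStarSlot() + "]";
  }
}

final class PlannerWiringSketch {
  // Original planner path: supplies the default helper.
  static String planWithOriginalPlanner() {
    return new IcebergScanPlannerSketch(() -> "derived-at-init").planScan();
  }

  // Calcite planner path: supplies a helper that wraps a precomputed slot.
  static String planWithCalcitePlanner(String precomputedSlot) {
    return new IcebergScanPlannerSketch(() -> precomputedSlot).planScan();
  }

  public static void main(String[] args) {
    System.out.println(planWithOriginalPlanner());
    System.out.println(planWithCalcitePlanner("slot#7"));
  }
}
```

Both callers share the same scan planner; only the helper passed in changes.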

Testing: This is just a refactor, so no new tests are needed.

Change-Id: Iba7260194e14d8aeb2df9fe3cb875a0ee8450fb1
Reviewed-on: http://gerrit.cloudera.org:8080/24309
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
Reviewed-by: Peter Rozsa <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Steve Carlin <[email protected]>


> Calcite planner: support iceberg tables
> ---------------------------------------
>
>                 Key: IMPALA-14999
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14999
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Steve Carlin
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
