[
https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056837#comment-16056837
]
liyunzhang_intel edited comment on HIVE-11297 at 6/21/17 2:09 AM:
------------------------------------------------------------------
[~csun]: I applied HIVE-11297.6.patch on the latest master branch (8c5f55e) and
ran the query I posted above. I added code to print the operator tree in
SplitOpTreeForDPP#process:
{code}
.....
// debug only: print the operator tree rooted at the table scan
ArrayList<TableScanOperator> tableScanList = new ArrayList<>();
tableScanList.add((TableScanOperator) stack.get(0));
LOG.debug("operator tree:" + Operator.toString(tableScanList));
// end of debug print
Operator<?> filterOp = pruningSinkOp;
// walk up from the pruning sink until an operator with more than one child is reached
while (filterOp != null) {
  if (filterOp.getNumChild() > 1) {
    break;
  } else {
    filterOp = filterOp.getParentOperators().get(0);
  }
}
....
{code}
The operator tree is:
{code}
TS[1]-FIL[17]-RS[4]-JOIN[5]-GBY[8]-RS[9]-GBY[10]-FS[12]
TS[1]-FIL[17]-SEL[18]-GBY[19]-SPARKPRUNINGSINK[20]
TS[1]-FIL[17]-SEL[21]-GBY[22]-SPARKPRUNINGSINK[23]
{code}
Could you retest it in your environment? If the operator tree looks like the
one you described, I think all the operator trees in
spark_dynamic_partition_pruning.q.out will differ from the ones I generated in
my environment.
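To make the walk in the snippet above easier to check, here is a minimal standalone sketch of the same idea run against the tree I printed. It does not use the real Hive classes: `Op`, `findBranchPoint` and `buildExampleTree` are hypothetical names made up for illustration, and the main branch is cut off after JOIN[5]. Unlike the loop in SplitOpTreeForDPP#process, the sketch also stops when an operator has no parent, which is my own assumption about how to handle that case.

```java
import java.util.ArrayList;
import java.util.List;

class BranchPointSketch {
    /** Toy stand-in for Hive's Operator; hypothetical, not the real class. */
    static class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();

        Op(String name) { this.name = name; }

        /** Links child under this operator and returns the child so calls can be chained. */
        Op addChild(Op child) {
            children.add(child);
            child.parents.add(this);
            return child;
        }
    }

    /** Walks up through first parents until an operator with more than one child is found; null if none. */
    static Op findBranchPoint(Op start) {
        Op current = start;
        while (current != null) {
            if (current.children.size() > 1) {
                return current;
            }
            // Assumption: stop instead of failing when there is no parent left.
            current = current.parents.isEmpty() ? null : current.parents.get(0);
        }
        return null;
    }

    /** Builds the three-branch tree from the debug output and returns SPARKPRUNINGSINK[20]. */
    static Op buildExampleTree() {
        Op fil = new Op("TS[1]").addChild(new Op("FIL[17]"));
        fil.addChild(new Op("RS[4]")).addChild(new Op("JOIN[5]")); // rest of the main branch omitted
        Op sink20 = fil.addChild(new Op("SEL[18]"))
                       .addChild(new Op("GBY[19]"))
                       .addChild(new Op("SPARKPRUNINGSINK[20]"));
        fil.addChild(new Op("SEL[21]"))
           .addChild(new Op("GBY[22]"))
           .addChild(new Op("SPARKPRUNINGSINK[23]"));
        return sink20;
    }

    public static void main(String[] args) {
        Op branch = findBranchPoint(buildExampleTree());
        System.out.println(branch.name); // prints FIL[17]
    }
}
```

With the tree above, the walk from SPARKPRUNINGSINK[20] passes GBY[19] and SEL[18] (one child each) and stops at FIL[17], which has three children.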
> Combine op trees for partition info generating tasks [Spark branch]
> -------------------------------------------------------------------
>
> Key: HIVE-11297
> URL: https://issues.apache.org/jira/browse/HIVE-11297
> Project: Hive
> Issue Type: Bug
> Affects Versions: spark-branch
> Reporter: Chao Sun
> Assignee: liyunzhang_intel
> Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch,
> HIVE-11297.3.patch, HIVE-11297.4.patch, HIVE-11297.5.patch, HIVE-11297.6.patch
>
>
> Currently, for dynamic partition pruning in Spark, if a small table generates
> partition info for more than one partition column, multiple operator trees
> are created, which all start from the same table scan op but have different
> Spark partition pruning sinks.
> As an optimization, we can combine these op trees so that the table scan
> does not have to run multiple times.