Sahil Takiar created HIVE-16507:
-----------------------------------
Summary: Hive Explain User-Level may print out "Vertex dependency
in root stage"
Key: HIVE-16507
URL: https://issues.apache.org/jira/browse/HIVE-16507
Project: Hive
Issue Type: Bug
Reporter: Sahil Takiar
Assignee: Sahil Takiar
User-level explain plans have a section titled {{Vertex dependency in root
stage}} - which (according to the name) prints out the dependencies between all
vertices that are in the root stage.
This logic is controlled by {{DagJsonParser#print}} and it may print out
{{Vertex dependency in root stage}} twice.
The logic in this method first extracts all stages and plans. It then iterates
over all the stages, and if the stage contains any edges, it prints them out.
If we want to be consistent with the statement {{Vertex dependency in root
stage}} then we should add a check to see if the stage we are processing during
the iteration is the root stage or not.
Alternatively, we could print out the edges for each stage and change the line
from {{Vertex dependency in root stage}} to {{Vertex dependency in [stage-id]}}.
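To make the two options concrete, here is a minimal, self-contained sketch. It is not the actual {{DagJsonParser#print}} code: {{StageSketch}}, {{printRootOnly}}, {{printPerStage}} and {{demoStages}} are hypothetical names invented for illustration, and which stage is flagged as root in the demo data is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class VertexDependencySketch {

    /** Hypothetical stand-in for a parsed stage: id, root flag, and its edge lines. */
    static final class StageSketch {
        final String id;
        final boolean isRoot;
        final List<String> edges;

        StageSketch(String id, boolean isRoot, List<String> edges) {
            this.id = id;
            this.isRoot = isRoot;
            this.edges = edges;
        }
    }

    /** Option 1: keep the existing header, but only print edges of the root stage. */
    static List<String> printRootOnly(List<StageSketch> stages) {
        List<String> out = new ArrayList<>();
        for (StageSketch stage : stages) {
            if (stage.isRoot && !stage.edges.isEmpty()) {
                out.add("Vertex dependency in root stage");
                out.addAll(stage.edges);
            }
        }
        return out;
    }

    /** Option 2: print edges for every stage, with the stage id in the header. */
    static List<String> printPerStage(List<StageSketch> stages) {
        List<String> out = new ArrayList<>();
        for (StageSketch stage : stages) {
            if (!stage.edges.isEmpty()) {
                out.add("Vertex dependency in " + stage.id);
                out.addAll(stage.edges);
            }
        }
        return out;
    }

    /** Demo data loosely modeled on a multi-stage plan; the root flag is assumed. */
    static List<StageSketch> demoStages() {
        return Arrays.asList(
            new StageSketch("Stage-2", true,
                Arrays.asList("Reducer 10 <- Map 9 (GROUP)")),
            new StageSketch("Stage-1", false,
                Arrays.asList("Reducer 3 <- Reducer 2 (GROUP)")),
            new StageSketch("Stage-0", false, Collections.<String>emptyList()));
    }

    public static void main(String[] args) {
        List<StageSketch> stages = demoStages();
        System.out.println("Option 1:");
        for (String line : printRootOnly(stages)) {
            System.out.println(line);
        }
        System.out.println("Option 2:");
        for (String line : printPerStage(stages)) {
            System.out.println(line);
        }
    }
}
```

With either variant the header can no longer appear twice with identical text: option 1 emits it for at most one stage, and option 2 makes each header unique by stage id.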
I'm not sure if it's possible for Hive-on-Tez to create a plan with a non-root
stage that contains edges, but it is possible for Hive-on-Spark (support for
HoS was added in HIVE-11133).
Example for HoS:
{code}
set hive.optimize.ppd=true;
set hive.ppd.remove.duplicatefilters=true;
set hive.spark.dynamic.partition.pruning=true;
set hive.optimize.metadataonly=false;
set hive.optimize.index.filter=true;
set hive.strict.checks.cartesian.product=false;
set hive.spark.explain.user=true;
set hive.spark.dynamic.partition.pruning=true;
EXPLAIN select count(*) from srcpart where srcpart.ds in (select
max(srcpart.ds) from srcpart union all select min(srcpart.ds) from srcpart);
{code}
Prints
{code}
Plan optimized by CBO.
Vertex dependency in root stage
Reducer 10 <- Map 9 (GROUP)
Reducer 11 <- Reducer 10 (GROUP), Reducer 13 (GROUP)
Reducer 13 <- Map 12 (GROUP)
Vertex dependency in root stage
Reducer 2 <- Map 1 (PARTITION-LEVEL SORT), Reducer 6 (PARTITION-LEVEL SORT)
Reducer 3 <- Reducer 2 (GROUP)
Reducer 5 <- Map 4 (GROUP)
Reducer 6 <- Reducer 5 (GROUP), Reducer 8 (GROUP)
Reducer 8 <- Map 7 (GROUP)
Stage-0
Fetch Operator
limit:-1
Stage-1
Reducer 3
File Output Operator [FS_34]
Group By Operator [GBY_32] (rows=1 width=8)
Output:["_col0"],aggregations:["count(VALUE._col0)"]
<-Reducer 2 [GROUP]
GROUP [RS_31]
Group By Operator [GBY_30] (rows=1 width=8)
Output:["_col0"],aggregations:["count()"]
Join Operator [JOIN_28] (rows=2200 width=10)
condition map:[{"":"{\"type\":\"Inner\",\"left\":0,\"right\":1}"}],keys:{"0":"_col0","1":"_col0"}
<-Map 1 [PARTITION-LEVEL SORT]
PARTITION-LEVEL SORT [RS_26]
PartitionCols:_col0
Select Operator [SEL_2] (rows=2000 width=10)
Output:["_col0"]
TableScan [TS_0] (rows=2000 width=10)
default@srcpart,srcpart,Tbl:COMPLETE,Col:NONE
<-Reducer 6 [PARTITION-LEVEL SORT]
PARTITION-LEVEL SORT [RS_27]
PartitionCols:_col0
Group By Operator [GBY_24] (rows=1 width=184)
Output:["_col0"],keys:KEY._col0
<-Reducer 5 [GROUP]
GROUP [RS_23]
PartitionCols:_col0
Group By Operator [GBY_22] (rows=2 width=184)
Output:["_col0"],keys:_col0
Filter Operator [FIL_9] (rows=1 width=184)
predicate:_col0 is not null
Group By Operator [GBY_7] (rows=1 width=184)
Output:["_col0"],aggregations:["max(VALUE._col0)"]
<-Map 4 [GROUP]
GROUP [RS_6]
Group By Operator [GBY_5] (rows=1 width=184)
Output:["_col0"],aggregations:["max(ds)"]
Select Operator [SEL_4] (rows=2000 width=10)
Output:["ds"]
TableScan [TS_3] (rows=2000 width=10)
default@srcpart,srcpart,Tbl:COMPLETE,Col:NONE
<-Reducer 8 [GROUP]
GROUP [RS_23]
PartitionCols:_col0
Group By Operator [GBY_22] (rows=2 width=184)
Output:["_col0"],keys:_col0
Filter Operator [FIL_17] (rows=1 width=184)
predicate:_col0 is not null
Group By Operator [GBY_15] (rows=1 width=184)
Output:["_col0"],aggregations:["min(VALUE._col0)"]
<-Map 7 [GROUP]
GROUP [RS_14]
Group By Operator [GBY_13] (rows=1 width=184)
Output:["_col0"],aggregations:["min(ds)"]
Select Operator [SEL_12] (rows=2000 width=10)
Output:["ds"]
TableScan [TS_11] (rows=2000 width=10)
default@srcpart,srcpart,Tbl:COMPLETE,Col:NONE
Stage-2
Reducer 11
{code}
So there are two sections that say {{Vertex dependency in root stage}}.