[ https://issues.apache.org/jira/browse/HIVE-28729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Krisztian Kasa reassigned HIVE-28729:
-------------------------------------
Assignee: Krisztian Kasa
> Apply nulls order setting in Reduce Sink operator of join branches
> ------------------------------------------------------------------
>
> Key: HIVE-28729
> URL: https://issues.apache.org/jira/browse/HIVE-28729
> Project: Hive
> Issue Type: Sub-task
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
>
> {code:java}
> set hive.default.nulls.last=false;
> create table t1(key int, value string);
> EXPLAIN SELECT sum(hash(a.key,a.value,b.key,b.value)) FROM t1 a INNER JOIN t1 b on a.key = b.key;
> {code}
> {code:java}
> STAGE DEPENDENCIES:
> Stage-1 is a root stage
> Stage-0 depends on stages: Stage-1
> STAGE PLANS:
> Stage: Stage-1
> Tez
> #### A masked pattern was here ####
> Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
> #### A masked pattern was here ####
> Vertices:
> Map 1
> Map Operator Tree:
> TableScan
> alias: a
> filterExpr: key is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
> Filter Operator
> predicate: key is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
> Select Operator
> expressions: key (type: int), value (type: string)
> outputColumnNames: key, value
> Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
> Reduce Output Operator
> key expressions: key (type: int)
> null sort order: z
> sort order: +
> Map-reduce partition columns: key (type: int)
> Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
> value expressions: value (type: string)
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map 4
> Map Operator Tree:
> TableScan
> alias: b
> filterExpr: key is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
> Filter Operator
> predicate: key is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
> Select Operator
> expressions: key (type: int), value (type: string)
> outputColumnNames: key, value
> Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
> Reduce Output Operator
> key expressions: key (type: int)
> null sort order: z
> sort order: +
> Map-reduce partition columns: key (type: int)
> Statistics: Num rows: 1 Data size: 188 Basic stats: COMPLETE Column stats: NONE
> value expressions: value (type: string)
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Reducer 2
> Execution mode: llap
> Reduce Operator Tree:
> Merge Join Operator
> condition map:
> Inner Join 0 to 1
> keys:
> 0 key (type: int)
> 1 key (type: int)
> outputColumnNames: key, value, key0, value0
> Statistics: Num rows: 1 Data size: 206 Basic stats: COMPLETE Column stats: NONE
> Select Operator
> expressions: hash(key,value,key0,value0) (type: int)
> outputColumnNames: $f0
> Statistics: Num rows: 1 Data size: 206 Basic stats: COMPLETE Column stats: NONE
> Group By Operator
> aggregations: sum($f0)
> minReductionHashAggr: 0.99
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
> Reduce Output Operator
> null sort order:
> sort order:
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
> value expressions: _col0 (type: bigint)
> Reducer 3
> Execution mode: vectorized, llap
> Reduce Operator Tree:
> Group By Operator
> aggregations: sum(VALUE._col0)
> mode: mergepartial
> outputColumnNames: $f0
> Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
> File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Stage: Stage-0
> Fetch Operator
> limit: -1
> Processor Tree:
> ListSink
> {code}
> The null sort order in the Reduce Sink (RS) operators of both join branches is
> NULLS LAST ({{z}}), but it should be NULLS FIRST because the config
> {{hive.default.nulls.last=false}} is set.
> {code}
> Map 1
> Map Operator Tree:
> ...
> Reduce Output Operator
> key expressions: key (type: int)
> null sort order: z
> ...
> {code}
> {code}
> Map 4
> Map Operator Tree:
> ...
> Reduce Output Operator
> key expressions: key (type: int)
> null sort order: z
> ...
> {code}
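
The expected behaviour can be illustrated with a minimal sketch. This is not Hive's actual planner code: the class `NullOrderSketch` and the method `nullSortOrder` are hypothetical names introduced here for illustration only. The plan above shows `z` for NULLS LAST; the sketch assumes `a` is the corresponding code for NULLS FIRST and that one code is emitted per join key column.

```java
public class NullOrderSketch {
    // 'z' is the code the plan above prints for NULLS LAST.
    static final char NULLS_LAST = 'z';
    // Assumption: 'a' is the matching code for NULLS FIRST.
    static final char NULLS_FIRST = 'a';

    /**
     * Builds the "null sort order" string for a Reduce Sink with keyCount
     * key columns, driven by the value of hive.default.nulls.last.
     * Hypothetical helper, not part of Hive.
     */
    static String nullSortOrder(boolean defaultNullsLast, int keyCount) {
        char code = defaultNullsLast ? NULLS_LAST : NULLS_FIRST;
        StringBuilder order = new StringBuilder();
        for (int i = 0; i < keyCount; i++) {
            order.append(code); // one code per key column
        }
        return order.toString();
    }

    public static void main(String[] args) {
        // hive.default.nulls.last=false with the single join key "key"
        System.out.println("null sort order: " + nullSortOrder(false, 1));
    }
}
```

Under these assumptions, the Reduce Output Operators in Map 1 and Map 4 would report a NULLS FIRST code for the join key instead of `z` once the setting is applied.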