Hi everyone,
Let's say I have a table partitioned by a string column, period.
How can I select the max period efficiently?
If I run
select max(period) from invoice;
Hive 0.13.1 launches a full MR-style job (Tez, according to the plan below), which is slow. The EXPLAIN output is:
OK
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-1
Tez
Edges:
Reducer 2 <- Map 1 (SIMPLE_EDGE)
DagName: lcapp_20150225120606_0242d9fa-9ad5-40d4-b63c-87a1a8330482:1
Vertices:
Map 1
Map Operator Tree:
TableScan
alias: invoice
Statistics: Num rows: 1261837038 Data size: 628971470848 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: period (type: string)
outputColumnNames: period
Statistics: Num rows: 1261837038 Data size: 628971470848 Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: max(period)
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
sort order:
Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col0 (type: string)
Reducer 2
Reduce Operator Tree:
Group By Operator
aggregations: max(VALUE._col0)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: _col0 (type: string)
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
File Output Operator
compressed: false
Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
limit: -1
Time taken: 1.492 seconds, Fetched: 52 row(s)
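
For comparison, one possible way to avoid scanning the data is to read the partition names from the metastore with SHOW PARTITIONS and take the maximum outside Hive. The sketch below is only an illustration under some assumptions: the helper name max_period is hypothetical, period is assumed to be the first (or only) partition column, and the values are assumed to be plain ASCII strings so that a byte-order sort matches max() on a string column.

```shell
#!/bin/sh
# Hypothetical helper: reads SHOW PARTITIONS output on stdin, one partition
# per line in the form "period=<value>" (optionally followed by "/other=..."),
# and prints the largest period value.
max_period() {
    # Keep only lines starting with "period=" and extract the value up to
    # the first "/" (assumes period is the first partition column).
    sed -n 's/^period=\([^/]*\).*/\1/p' |
        # Byte-order sort, assumed to match string comparison in max().
        LC_ALL=C sort |
        # The last line after sorting is the maximum.
        tail -n 1
}

# Usage (requires the hive CLI on PATH; -S suppresses log output):
#   hive -S -e 'SHOW PARTITIONS invoice;' | max_period
```

Since SHOW PARTITIONS is answered from the metastore, it should not need to launch a job over the 1261837038 rows shown in the plan, though this returns partitions that exist in the metastore even if they hold no data.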