Hi everyone,

Let's say I have a table partitioned by period (a string column).

How do I select the max period?

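If I run

select max(period) from invoice;

Hive 0.13.1 launches a full MR-style job (on Tez in the plan below) that scans the whole table, which is slow. This is the EXPLAIN output for that query: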

OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
      DagName: lcapp_20150225120606_0242d9fa-9ad5-40d4-b63c-87a1a8330482:1
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: invoice
                  Statistics: Num rows: 1261837038 Data size: 628971470848 Basic stats: COMPLETE Column stats: COMPLETE
                  Select Operator
                    expressions: period (type: string)
                    outputColumnNames: period
                    Statistics: Num rows: 1261837038 Data size: 628971470848 Basic stats: COMPLETE Column stats: COMPLETE
                    Group By Operator
                      aggregations: max(period)
                      mode: hash
                      outputColumnNames: _col0
                      Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        sort order:
                        Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
                        value expressions: _col0 (type: string)
        Reducer 2
            Reduce Operator Tree:
              Group By Operator
                aggregations: max(VALUE._col0)
                mode: mergepartial
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: _col0 (type: string)
                  outputColumnNames: _col0
                  Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: COMPLETE
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1

Time taken: 1.492 seconds, Fetched: 52 row(s)
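
One workaround that might apply here, as a rough sketch rather than a real answer: `SHOW PARTITIONS invoice;` reads partition names from the metastore without scanning data, and the max can then be picked on the client. The helper below is hypothetical (the name `max_partition_value` and the sample period values are made up for illustration). It assumes the listing prints one `period=<value>` entry per line and that plain string ordering is what is wanted.

```python
def max_partition_value(show_partitions_lines, column="period"):
    """Return the largest value of `column` found in SHOW PARTITIONS output lines.

    Hypothetical helper: expects lines such as "period=2015-01", or
    "region=eu/period=2015-01" for multi-level partitions.
    """
    prefix = column + "="
    values = []
    for line in show_partitions_lines:
        # Multi-level partition specs are separated by "/".
        for part in line.strip().split("/"):
            if part.startswith(prefix):
                values.append(part[len(prefix):])
    # Plain string comparison; assumed to match what max() on a string
    # column would give for ASCII values.
    return max(values) if values else None


if __name__ == "__main__":
    # Sample values are invented; real input would be the lines printed by
    # SHOW PARTITIONS invoice;
    sample = ["period=2014-12", "period=2015-02", "period=2015-01"]
    print(max_partition_value(sample))
```

This only helps for the partition column itself, and it returns a partition that exists in the metastore even if it holds no rows.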
