Sorted Group By
---------------
Key: HIVE-931
URL: https://issues.apache.org/jira/browse/HIVE-931
Project: Hadoop Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain
Assignee: He Yongqiang
Fix For: 0.5.0
If a table is sorted by a given key, we do not currently take advantage of that
ordering for group by. Doing so can be very useful.
For example, if T is sorted by column c1, then for
  select c1, aggr() from T group by c1
we can always use a single map-reduce job. No hash table is needed on the mapper,
since the data is already sorted by c1.
This reduces memory pressure on the mapper and also removes the overhead of
maintaining the hash table.
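
To illustrate why sorted input removes the need for a hash table, here is a
minimal sketch in Python of a streaming group by. It is not Hive's
implementation. The function name sorted_group_by and its init/step parameters
are hypothetical, chosen only for this example. It assumes rows arrive as
(key, value) pairs already sorted by key, so only one running aggregate is held
in memory at a time.

```python
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")
A = TypeVar("A")


def sorted_group_by(
    rows: Iterable[Tuple[K, V]],
    init: Callable[[], A],
    step: Callable[[A, V], A],
) -> Iterator[Tuple[K, A]]:
    """Yield (key, aggregate) pairs from rows that are already sorted by key.

    Hypothetical helper for illustration. Because equal keys are adjacent,
    a group is complete as soon as the key changes, so no hash table of
    per-key state is needed.
    """
    it = iter(rows)
    try:
        current_key, value = next(it)
    except StopIteration:
        return  # empty input produces no groups
    acc = step(init(), value)
    for key, value in it:
        if key != current_key:
            # Key changed: the previous group is finished, so emit it.
            yield current_key, acc
            current_key = key
            acc = step(init(), value)
        else:
            acc = step(acc, value)
    # Emit the final group.
    yield current_key, acc


# Example: sum(value) grouped by key, over input sorted by key.
rows = [("a", 1), ("a", 2), ("b", 5), ("c", 1), ("c", 1)]
result = list(sorted_group_by(rows, lambda: 0, lambda acc, v: acc + v))
# result == [("a", 3), ("b", 5), ("c", 2)]
```

A hash-based aggregation would instead keep one entry per distinct key until
the input is exhausted, which is the memory pressure described above.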