[
https://issues.apache.org/jira/browse/HIVE-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Thusoo updated HIVE-222:
-------------------------------
Attachment: patch-222.txt
Fix for the bug.
There was a bug in the way the aggregation list was being generated for map
side aggregation. As a result, the ordering of the aggregations in the map side
groupby operator and the reduce side groupby operator could differ, leading to
this problem. Ideally, we should be using the row schema information to
generate the order, but that needs a much larger refactor of how we generate
plans in the group by case. For now, this patch should fix the problem.
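To make the failure mode concrete, here is a minimal sketch of what an ordering mismatch between the two sides looks like. It is not Hive code: the class AggregationOrderSketch and its serialize method are hypothetical stand-ins, and the assumption that sum(y) is carried as a Double and count(DISTINCT y) as a Long is inferred only from the exception in the issue description.

```java
import java.util.Arrays;
import java.util.List;

public class AggregationOrderSketch {
    // Hypothetical stand-in for a serializer that checks each field against
    // a fixed list of expected types (the role of the reduce side schema).
    public static void serialize(List<Object> row, List<Class<?>> expectedTypes) {
        for (int i = 0; i < row.size(); i++) {
            // Class.cast throws ClassCastException when the value at
            // position i is not an instance of the expected type.
            expectedTypes.get(i).cast(row.get(i));
        }
    }

    public static void main(String[] args) {
        // Assumed reduce side order: sum(y) as Double, then count(DISTINCT y) as Long.
        List<Class<?>> reduceSchema = Arrays.asList(Double.class, Long.class);

        // Map side emits the aggregations in the same order: no error.
        List<Object> sameOrder = Arrays.asList(42.0, 3L);
        serialize(sameOrder, reduceSchema);

        // Map side emits them in the opposite order: the Long lands in the
        // Double slot and the cast fails.
        List<Object> swapped = Arrays.asList(3L, 42.0);
        try {
            serialize(swapped, reduceSchema);
        } catch (ClassCastException e) {
            System.out.println("ordering mismatch: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that both sides must agree on one ordering; how the patch enforces that is in patch-222.txt.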
There are preexisting tests that cover this (groupby2_map.q and groupby3_map.q).
The test case, however, relies on an internal hashmap giving the keys in a
certain order. The bug was easily reproducible with the patch in HIVE-179
applied. I have tested this fix with that patch.
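As a generic illustration of why a test that depends on hashmap key order is fragile, the sketch below contrasts java.util.HashMap with java.util.LinkedHashMap. The class name and the map keys are hypothetical, and this is not a claim about which data structure the patch or the Hive plan generator uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IterationOrderSketch {
    // Inserts two hypothetical aggregation entries in a fixed order.
    public static void fill(Map<String, String> aggregations) {
        aggregations.put("count(DISTINCT y)", "Long");
        aggregations.put("sum(y)", "Double");
    }

    // Returns the keys in whatever order the given map iterates them.
    public static List<String> keyOrder(Map<String, String> aggregations) {
        return new ArrayList<>(aggregations.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> hashed = new HashMap<>();
        Map<String, String> linked = new LinkedHashMap<>();
        fill(hashed);
        fill(linked);

        // HashMap makes no guarantee about iteration order; it depends on
        // hash codes and table size, so it can change between versions.
        System.out.println("HashMap order:       " + keyOrder(hashed));

        // LinkedHashMap iterates in insertion order by default.
        System.out.println("LinkedHashMap order: " + keyOrder(linked));
    }
}
```

A test that passes only when the HashMap happens to return keys in one particular order can hide or expose an ordering bug depending on the environment, which is consistent with the bug becoming easy to reproduce once HIVE-179 was applied.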
> Group by on a combination of distinct and non-distinct aggregates can return
> serialization errors with map side aggregations.
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-222
> URL: https://issues.apache.org/jira/browse/HIVE-222
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.2.0
> Reporter: Ashish Thusoo
> Assignee: Ashish Thusoo
> Priority: Blocker
> Fix For: 0.2.0
>
> Attachments: patch-222.txt
>
>
> For queries of the form (groupby2_map.q in the source)
> SELECT x, count(DISTINCT y), SUM(y) FROM t GROUP BY x
> when map side aggregation is on
> hive.map.aggr=true (This is off by default)
> The following exception can occur:
> [junit] Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Double
> [junit] at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeTypeDouble.serialize(DynamicSerDeTypeDouble.java:60)
> [junit] at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeFieldList.serialize(DynamicSerDeFieldList.java:235)
> [junit] at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeStructBase.serialize(DynamicSerDeStructBase.java:81)
> [junit] at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.serialize(DynamicSerDe.java:174)