Hive multi group by single reducer optimization causes invalid column reference error
-------------------------------------------------------------------------------------

                 Key: HIVE-2750
                 URL: https://issues.apache.org/jira/browse/HIVE-2750
             Project: Hive
          Issue Type: Bug
            Reporter: Kevin Wilfong
            Assignee: Kevin Wilfong


When the optimization is applied, if two query blocks have the same distinct 
clause and the same group by keys, but the first query block does not 
reference all the columns the second query block does, an invalid column 
reference error is raised for the columns not referenced in the first query 
block.

E.g.

FROM src
INSERT OVERWRITE TABLE dest_g2
  SELECT substr(src.key,1,1), count(DISTINCT src.key)
  WHERE substr(src.key,1,1) >= 5
  GROUP BY substr(src.key,1,1)
INSERT OVERWRITE TABLE dest_g3
  SELECT substr(src.key,1,1), count(DISTINCT src.key), count(src.value)
  WHERE substr(src.key,1,1) < 5
  GROUP BY substr(src.key,1,1);

This results in an invalid column reference error on src.value, which is 
referenced only in the second query block.
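
To run the example, the destination tables have to exist and the optimization 
has to be enabled. The following is only a setup sketch under assumptions: the 
report does not give the table definitions, so the column names and types below 
are guesses that merely match the shape of the two SELECT lists, and it assumes 
the optimization in question is the one toggled by the 
hive.multigroupby.singlereducer property.

```sql
-- Assumption: this property controls the multi group by single reducer
-- optimization described in this issue.
SET hive.multigroupby.singlereducer=true;

-- Assumption: hypothetical column names and types; the report does not
-- state the actual definitions of dest_g2 and dest_g3.
CREATE TABLE dest_g2 (key STRING, c1 INT) STORED AS TEXTFILE;
CREATE TABLE dest_g3 (key STRING, c1 INT, c2 INT) STORED AS TEXTFILE;
```

With these in place, the multi-insert query above can be run as written against 
a src table that has key and value columns.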
