[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Jackson updated HIVE-18375: -------------------------------- Description: Give these tables: {code:SQL} CREATE TABLE employees ( emp_no INT, first_name VARCHAR(14), last_name VARCHAR(16) ); insert into employees values (1, 'Gottlob', 'Frege'), (2, 'Bertrand', 'Russell'), (3, 'Ludwig', 'Wittgenstein'); CREATE TABLE salaries ( emp_no INT, salary INT, from_date DATE, to_date DATE ); insert into salaries values (1, 10, '1900-01-01', '1900-01-31'), (1, 18, '1900-09-01', '1900-09-30'), (2, 15, '1940-03-01', '1950-01-01'), (3, 20, '1920-01-01', '1950-01-01'); {code} This query returns the names of the employees ordered by their peak salary: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} However, this should still work even if the max_salary is not part of the projection: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} However, that fails with this error: {code} Error while compiling statement: FAILED: SemanticException [Error 10004]: line 9:9 Invalid table alias or column reference 't1': (possible column names are: last_name, first_name) {code} FWIW, this also fails: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` AS `max_sal` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} But this succeeds: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` AS `max_sal` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `max_sal` DESC; {code} was: Give these tables: {code:SQL} CREATE TABLE employees ( emp_no INT, first_name VARCHAR(14), last_name VARCHAR(16) ); insert into employees values (1, 'Gottlob', 'Frege'), (2, 'Bertrand', 'Russell'), (3, 'Ludwig', 'Wittgenstein'); CREATE TABLE salaries ( emp_no INT, salary INT, from_date DATE, to_date DATE ); insert into salaries values (1, 10, '1900-01-01', '1900-01-31'), (1, 18, '1900-09-01', '1900-09-30'), (2, 15, '1940-03-01', '1950-01-01'), (3, 20, '1920-01-01', '1950-01-01'); {code} This query returns the names of the employees ordered by their peak salary: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} However, this should still work even if the max_salary is not part of the projection: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} However, that fails with this error: {code} Error while compiling statement: FAILED: SemanticException [Error 10004]: line 9:9 Invalid table alias or column reference 't1': (possible column names are: last_name, first_name) {code} > Cannot ORDER by subquery fields unless they are selected > -------------------------------------------------------- > > Key: HIVE-18375 > URL: https://issues.apache.org/jira/browse/HIVE-18375 > Project: Hive > Issue Type: Bug > Components: Query Processor > Affects Versions: 2.3.2 > Environment: Amazon AWS > Release label:emr-5.11.0 > Hadoop distribution:Amazon 2.7.3 > Applications:Hive 2.3.2, Pig 0.17.0, Hue 4.0.1 > classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict] > Reporter: Paul Jackson > Priority: Minor > > Give these tables: > {code:SQL} > CREATE TABLE employees ( > emp_no INT, > first_name VARCHAR(14), > last_name VARCHAR(16) > ); > insert into employees values > (1, 'Gottlob', 'Frege'), > (2, 'Bertrand', 'Russell'), > (3, 'Ludwig', 'Wittgenstein'); > CREATE TABLE salaries ( > emp_no INT, > salary INT, > from_date DATE, > to_date DATE > ); > insert into salaries values > (1, 10, '1900-01-01', '1900-01-31'), > (1, 18, '1900-09-01', '1900-09-30'), > (2, 15, '1940-03-01', '1950-01-01'), > (3, 20, '1920-01-01', '1950-01-01'); > {code} > This query returns the names of the employees ordered by their peak salary: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, this should still work even if the max_salary is not part of the > projection: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, that fails with this error: > {code} > Error while compiling statement: FAILED: SemanticException [Error 10004]: > line 9:9 Invalid table alias or column reference 't1': (possible column names > are: last_name, first_name) > {code} > FWIW, this also fails: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > AS `max_sal` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > But this succeeds: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > AS `max_sal` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `max_sal` DESC; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)