[ https://issues.apache.org/jira/browse/SPARK-32324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
philipse updated SPARK-32324: ----------------------------- Description: Currently when we use `lateral view` and `pivot` together in from clause, if `lateral view` is before `pivot`, the error message is "LATERAL cannot be used together with PIVOT in FROM clause".if if `lateral view` is after `pivot`,the query will be normal ,So the error messages "LATERAL cannot be used together with PIVOT in FROM clause" is not accurate, we may improve it. Steps to reproduce: {code:java} CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING); INSERT INTO person VALUES (100, 'John', 30, 1, 'Street 1'), (200, 'Mary', NULL, 1, 'Street 2'), (300, 'Mike', 80, 3, 'Street 3'), (400, 'Dan', 50, 4, 'Street 4'); {code} Query1: {code:java} SELECT * FROM person lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age PIVOT ( count(distinct age) as a for name in ('Mary','John') ) {code} Result 1: {code:java} Error: org.apache.spark.sql.catalyst.parser.ParseException: LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9) == SQL == SELECT * FROM person ---------^^^ lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age PIVOT ( count(distinct age) as a for name in ('Mary','John') ) (state=,code=0) {code} Query2: {code:java} SELECT * FROM person PIVOT ( count(distinct age) as a for name in ('Mary','John') ) lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age {code} Reuslt2: +-------+------++---------------++-------- |id|Mary|John|c_age|d_age| +-------+------++---------------++-------- |300|NULL|NULL|30|40| |300|NULL|NULL|30|80| |300|NULL|NULL|60|40| |300|NULL|NULL|60|80| |100|0|NULL|30|40| |100|0|NULL|30|80| |100|0|NULL|60|40| |100|0|NULL|60|80| |400|NULL|NULL|30|40| |400|NULL|NULL|30|80| |400|NULL|NULL|60|40| |400|NULL|NULL|60|80| |200|NULL|1|30|40| |200|NULL|1|30|80| |200|NULL|1|60|40| |200|NULL|1|60|80| +-------+------++---------------++-------- ``` was: Currently when we use `lateral view` and `pivot` together in from clause, if `lateral view` is before `pivot`, the error message is "LATERAL cannot be used together with PIVOT in FROM clause".if if `lateral view` is after `pivot`,the query will be normal ,So the error messages "LATERAL cannot be used together with PIVOT in FROM clause" is not accurate, we may improve it. Steps to reproduce: ``` CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING); INSERT INTO person VALUES (100, 'John', 30, 1, 'Street 1'), (200, 'Mary', NULL, 1, 'Street 2'), (300, 'Mike', 80, 3, 'Street 3'), (400, 'Dan', 50, 4, 'Street 4'); ``` Query1: ``` SELECT * FROM person lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age PIVOT ( count(distinct age) as a for name in ('Mary','John') ) ``` Result 1: ``` Error: org.apache.spark.sql.catalyst.parser.ParseException: LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9) == SQL == SELECT * FROM person ---------^^^ lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age PIVOT ( count(distinct age) as a for name in ('Mary','John') ) (state=,code=0) ``` Query2: ``` SELECT * FROM person PIVOT ( count(distinct age) as a for name in ('Mary','John') ) lateral view outer explode(array(30,60)) tabelName as c_age lateral view explode(array(40,80)) as d_age ``` Reuslt2: ``` +------+-------+-------+--------+--------+ | id | Mary | John | c_age | d_age | +------+-------+-------+--------+--------+ | 300 | NULL | NULL | 30 | 40 | | 300 | NULL | NULL | 30 | 80 | | 300 | NULL | NULL | 60 | 40 | | 300 | NULL | NULL | 60 | 80 | | 100 | 0 | NULL | 30 | 40 | | 100 | 0 | NULL | 30 | 80 | | 100 | 0 | NULL | 60 | 40 | | 100 | 0 | NULL | 60 | 80 | | 400 | NULL | NULL | 30 | 40 | | 400 | NULL | NULL | 30 | 80 | | 400 | NULL | NULL | 60 | 40 | | 400 | NULL | NULL | 60 | 80 | | 200 | NULL | 1 | 30 | 40 | | 200 | NULL | 1 | 30 | 80 | | 200 | NULL | 1 | 60 | 40 | | 200 | NULL | 1 | 60 | 80 | +------+-------+-------+--------+--------+ ``` > Fix error messages during using PIVOT and lateral view > ------------------------------------------------------ > > Key: SPARK-32324 > URL: https://issues.apache.org/jira/browse/SPARK-32324 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: philipse > Priority: Minor > > Currently when we use `lateral view` and `pivot` together in from clause, if > `lateral view` is before `pivot`, the error message is "LATERAL cannot be > used together with PIVOT in FROM clause".if if `lateral view` is after > `pivot`,the query will be normal ,So the error messages "LATERAL cannot be > used together with PIVOT in FROM clause" is not accurate, we may improve it. > > Steps to reproduce: > {code:java} > CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING); > INSERT INTO person VALUES > (100, 'John', 30, 1, 'Street 1'), > (200, 'Mary', NULL, 1, 'Street 2'), > (300, 'Mike', 80, 3, 'Street 3'), > (400, 'Dan', 50, 4, 'Street 4'); > {code} > > Query1: > > {code:java} > SELECT * FROM person > lateral view outer explode(array(30,60)) tabelName as c_age > lateral view explode(array(40,80)) as d_age > PIVOT ( > count(distinct age) as a > for name in ('Mary','John') > ) > {code} > Result 1: > > {code:java} > Error: org.apache.spark.sql.catalyst.parser.ParseException: > LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9) > == SQL == > SELECT * FROM person > ---------^^^ > lateral view outer explode(array(30,60)) tabelName as c_age > lateral view explode(array(40,80)) as d_age > PIVOT ( > count(distinct age) as a > for name in ('Mary','John') > ) (state=,code=0) > {code} > > > Query2: > > {code:java} > SELECT * FROM person > PIVOT ( > count(distinct age) as a > for name in ('Mary','John') > ) > lateral view outer explode(array(30,60)) tabelName as c_age > lateral view explode(array(40,80)) as d_age > {code} > > Reuslt2: > +-------+------++---------------++-------- > |id|Mary|John|c_age|d_age| > +-------+------++---------------++-------- > |300|NULL|NULL|30|40| > |300|NULL|NULL|30|80| > |300|NULL|NULL|60|40| > |300|NULL|NULL|60|80| > |100|0|NULL|30|40| > |100|0|NULL|30|80| > |100|0|NULL|60|40| > |100|0|NULL|60|80| > |400|NULL|NULL|30|40| > |400|NULL|NULL|30|80| > |400|NULL|NULL|60|40| > |400|NULL|NULL|60|80| > |200|NULL|1|30|40| > |200|NULL|1|30|80| > |200|NULL|1|60|40| > |200|NULL|1|60|80| > +-------+------++---------------++-------- > ``` > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org