[jira] [Assigned] (SPARK-42416) Dataset.show() should not resolve the analyzed logical plan again

2023-02-12 Thread Apache Spark (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42416:


Assignee: Gengliang Wang  (was: Apache Spark)

> Dataset.show() should not resolve the analyzed logical plan again
> -
>
> Key: SPARK-42416
> URL: https://issues.apache.org/jira/browse/SPARK-42416
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> For the following query
>  
> {code:scala}
>   sql(
> """
>   |CREATE TABLE app_open (
>   |  uid STRING,
>   |  st TIMESTAMP,
>   |  ds INT
>   |) USING parquet PARTITIONED BY (ds);
>   |""".stripMargin)
>   sql(
> """
>   |create or replace temporary view group_by_error as WITH new_app_open AS (
>   |  SELECT
>   |ao.*
>   |  FROM
>   |app_open ao
>   |)
>   |SELECT
>   |uid,
>   |20230208 AS ds
>   |  FROM
>   |new_app_open
>   |  GROUP BY
>   |1,
>   |2
>   |""".stripMargin)
>   sql(
> """
>   |select
>   |  `uid`
>   |from
>   |  group_by_error
>   |""".stripMargin).show(){code}
> Spark will throw the following error
>  
>  
> {code:java}
> [GROUP_BY_POS_OUT_OF_RANGE] GROUP BY position 20230208 is not in select list 
> (valid range is [1, 2]).; line 9 pos 4 {code}
>  
>  
> This is because the analyzed logical plan is not marked as analyzed, so calling 
> show() triggers analysis a second time. The analyzer rules that resolve aggregation 
> and sort ordinals are not idempotent: on the second pass, the already-substituted 
> literal 20230208 is treated as a GROUP BY position, which falls outside the valid 
> range [1, 2].
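
To illustrate the non-idempotence in isolation, here is a minimal sketch in plain Scala. It does not use Spark's actual analyzer APIs; the names (Expr, Column, IntLiteral, resolveOrdinals) are illustrative only. A toy ordinal-substitution rule resolves GROUP BY 1, 2 correctly on the first pass, but misreads the substituted literal 20230208 as a position when applied a second time:

{code:scala}
// Minimal sketch, assuming a simplified expression model -- not Spark internals.
object OrdinalRuleSketch {
  sealed trait Expr
  case class Column(name: String) extends Expr
  case class IntLiteral(value: Int) extends Expr

  // Replace an integer literal in GROUP BY with the select-list item at that 1-based position.
  def resolveOrdinals(groupBy: Seq[Expr], selectList: Seq[Expr]): Seq[Expr] =
    groupBy.map {
      case IntLiteral(pos) if pos >= 1 && pos <= selectList.size => selectList(pos - 1)
      case IntLiteral(pos) => throw new IllegalArgumentException(
        s"GROUP BY position $pos is not in select list (valid range is [1, ${selectList.size}])")
      case other => other
    }

  def main(args: Array[String]): Unit = {
    val selectList: Seq[Expr] = Seq(Column("uid"), IntLiteral(20230208)) // SELECT uid, 20230208 AS ds
    val groupBy: Seq[Expr]    = Seq(IntLiteral(1), IntLiteral(2))        // GROUP BY 1, 2

    // First pass: ordinals resolve correctly to Seq(Column(uid), IntLiteral(20230208)).
    val resolvedOnce = resolveOrdinals(groupBy, selectList)
    println(resolvedOnce)

    // Second pass over the already-resolved plan: 20230208 is misread as an ordinal
    // and the rule fails, mirroring the GROUP_BY_POS_OUT_OF_RANGE error above.
    resolveOrdinals(resolvedOnce, selectList)
  }
}
{code}

Marking the plan as analyzed after the first pass, so that show() reuses it instead of re-running analysis, avoids this second application of the rule.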






[jira] [Assigned] (SPARK-42416) Dataset.show() should not resolve the analyzed logical plan again

2023-02-12 Thread Apache Spark (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42416:


Assignee: Apache Spark  (was: Gengliang Wang)







[jira] [Assigned] (SPARK-42416) Dataset.show() should not resolve the analyzed logical plan again

2023-02-12 Thread Gengliang Wang (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang reassigned SPARK-42416:
--

Assignee: Gengliang Wang




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org