[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-09-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192111#comment-17192111
 ] 

Apache Spark commented on SPARK-32638:
--

User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/29680

> WidenSetOperationTypes in subquery  attribute  missing
> --
>
>                 Key: SPARK-32638
>                 URL: https://issues.apache.org/jira/browse/SPARK-32638
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.4, 2.4.5, 3.0.0
>            Reporter: Guojian Li
>            Assignee: Takeshi Yamamuro
>            Priority: Major
>             Fix For: 3.1.0
>
>
> I am migrating SQL from MySQL to Spark SQL and ran into a very strange case.
> The code below reproduces the exception:
>  
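> {code:java}
> import java.util
> import org.apache.spark.sql.{Row, SparkSession}
> import org.apache.spark.sql.types.{DataTypes, StructField, StructType}
>
> val spark = SparkSession.builder()
>  .master("local")
>  .appName("Word Count")
>  .getOrCreate()
> // TRACE level so that the analyzer's rule-by-rule plan changes show up in the log.
> spark.sparkContext.setLogLevel("TRACE")
> // Column a is decimal(20,2); a+a is then typed decimal(21,2), as the plans below show.
> val decimalType = DataTypes.createDecimalType(20, 2)
> val schema = StructType(List(
>  StructField("a", decimalType, true)
> ))
> // An empty table is enough: the failure happens during analysis, not execution.
> val dataList = new util.ArrayList[Row]()
> val df = spark.createDataFrame(dataList, schema)
> df.printSchema()
> df.createTempView("test")
> val sql =
>  """
>  |SELECT t.kpi_04 FROM
>  |(
>  | SELECT a as `kpi_04` FROM test
>  | UNION ALL
>  | SELECT a+a as `kpi_04` FROM test
>  |) t
>  |
>  """.stripMargin
> spark.sql(sql)
> {code}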
>  
> Exception Message:
>  
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: Resolved 
> attribute(s) kpi_04#2 missing from kpi_04#4 in operator !Project [kpi_04#2]. 
> Attribute(s) with the same name appear in the operation: kpi_04. Please check 
> if the right attribute(s) are used.;;
> !Project [kpi_04#2]
> +- SubqueryAlias t
>    +- Union
>       :- Project [cast(kpi_04#2 as decimal(21,2)) AS kpi_04#4]
>       :  +- Project [a#0 AS kpi_04#2]
>       :     +- SubqueryAlias test
>       :        +- LocalRelation <empty>, [a#0]
>       +- Project [kpi_04#3]
>          +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
>             +- SubqueryAlias test
>                +- LocalRelation <empty>, [a#0]{code}
>  
>  
> Based on the trace log, it seems that WidenSetOperationTypes adds a new outer
> Project layer, which causes the parent query to lose its reference to the
> subquery output.
>  
>  
> {code:java}
> === Applying Rule org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes ===
> Before:
> 'Project [kpi_04#2]
> +- 'SubqueryAlias t
>    +- 'Union
>       :- Project [a#0 AS kpi_04#2]
>       :  +- SubqueryAlias test
>       :     +- LocalRelation <empty>, [a#0]
>       +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
>          +- SubqueryAlias test
>             +- LocalRelation <empty>, [a#0]
>
> After:
> !Project [kpi_04#2]
> +- SubqueryAlias t
>    +- Union
>       :- Project [cast(kpi_04#2 as decimal(21,2)) AS kpi_04#4]
>       :  +- Project [a#0 AS kpi_04#2]
>       :     +- SubqueryAlias test
>       :        +- LocalRelation <empty>, [a#0]
>       +- Project [kpi_04#3]
>          +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
>             +- SubqueryAlias test
>                +- LocalRelation <empty>, [a#0]
> {code}
>  
> In the source code of WidenSetOperationTypes, this is intended behavior, but
> it possibly misses this edge case.
> I hope someone can help me fix it.
>  
>  
> {code:java}
> if (targetTypes.nonEmpty) {
>  // Add an extra Project if the targetTypes are different from the original types.
>  children.map(widenTypes(_, targetTypes))
> } else {
>  // Unable to find a target type to widen, then just return the original set.
>  children
> }{code}
>  
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-09-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192110#comment-17192110
 ] 

Apache Spark commented on SPARK-32638:
--

User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/29680




[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-09-03 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190341#comment-17190341
 ] 

Apache Spark commented on SPARK-32638:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/29643




[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-09-03 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190340#comment-17190340
 ] 

Apache Spark commented on SPARK-32638:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/29643




[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-08-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180985#comment-17180985
 ] 

Apache Spark commented on SPARK-32638:
--

User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/29485




[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-08-19 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180415#comment-17180415
 ] 

Takeshi Yamamuro commented on SPARK-32638:
--

Looks like a minor bug. I'll make a PR later.




[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-08-19 Thread Guojian Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180373#comment-17180373
 ] 

Guojian Li commented on SPARK-32638:


I reproduced it on 2.4.5, and found that it was reported before in 
https://issues.apache.org/jira/browse/SPARK-18622

But the developer fixed it by changing the output type rather than applying 
the suggested solution. When WidenSetOperationTypes adds an extra Project, it 
needs to make sure that the existing references are still valid.

The issue is too hard for me to fix; I really hope someone can help me out. 
:)
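
One way to picture "keeping the existing references valid" is the sketch below. It is a toy model in plain Scala, not Spark's Catalyst API, and it is not claimed to be what the pull requests linked in this thread do. The names Attr, Plan, Project, missing and replaceChild are all made up for illustration, and the Union node is left out for brevity. The idea is that whoever swaps in a rewritten child also remaps, by position, the attributes in the parent that pointed at the old child's output.

```scala
// Toy model only; not Spark's Catalyst API and not the code of the linked pull requests.
object RewriteRefsDemo {
  // An attribute is identified by (name, exprId).
  final case class Attr(name: String, exprId: Int)

  sealed trait Plan { def output: Seq[Attr] }
  final case class Relation(output: Seq[Attr]) extends Plan
  // refs = attributes read from the child, output = attributes this node produces.
  final case class Project(refs: Seq[Attr], output: Seq[Attr], child: Plan) extends Plan

  // References of a Project that its child does not produce.
  def missing(p: Project): Seq[Attr] =
    p.refs.filterNot(r => p.child.output.contains(r))

  // Hypothetical helper: swap in a rewritten child and remap, by position,
  // every attribute of the parent that pointed at the old child's output.
  def replaceChild(parent: Project, newChild: Plan): Project = {
    val remap: Map[Attr, Attr] = parent.child.output.zip(newChild.output).toMap
    parent.copy(
      refs = parent.refs.map(r => remap.getOrElse(r, r)),
      output = parent.output.map(o => remap.getOrElse(o, o)),
      child = newChild)
  }

  val test: Plan = Relation(Seq(Attr("a", 0)))
  val branch: Plan = Project(test.output, Seq(Attr("kpi_04", 2)), test)
  val top: Project = Project(branch.output, branch.output, branch)

  // The widening step adds a cast Project that renames kpi_04#2 to kpi_04#4.
  val widened: Plan = Project(branch.output, Seq(Attr("kpi_04", 4)), branch)

  // Swapping the child without remapping leaves a dangling reference.
  val staleTop: Project = top.copy(child = widened)
  // Swapping the child through the helper keeps the parent consistent.
  val fixedTop: Project = replaceChild(top, widened)

  def main(args: Array[String]): Unit = {
    println(missing(staleTop)) // List(Attr(kpi_04,2))
    println(missing(fixedTop)) // List()
  }
}
```

In this toy the parent's own output changes as well, so in a deeper plan the same remapping would have to be repeated for every ancestor up to the root.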

 

 

 




[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-08-19 Thread Lantao Jin (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180365#comment-17180365
 ] 

Lantao Jin commented on SPARK-32638:


Yes. This problem exists in 3.0 and master.

The problem occurs in the following steps.
In ResolveReferences.apply(), by the time the Project is resolved, its 
children are already all resolved:
'Project ['t.kpi_04]
+- SubqueryAlias t
   +- Union
      :- Project [a#44 AS kpi_04#46]
      :  +- SubqueryAlias test
      :     +- LocalRelation <empty>, [a#44]
      +- Project [(a#44 + a#44) AS kpi_04#47]
         +- SubqueryAlias test
            +- LocalRelation <empty>, [a#44]

-> 
Project [kpi_04#46]
+- SubqueryAlias t
   +- Union
      :- Project [a#44 AS kpi_04#46]
      :  +- SubqueryAlias test
      :     +- LocalRelation <empty>, [a#44]
      +- Project [(a#44 + a#44) AS kpi_04#47]
         +- SubqueryAlias test
            +- LocalRelation <empty>, [a#44]

After the Project is resolved, WidenSetOperationTypes changes the children of 
its child Union.

In the next iteration, the Project is not resolved again, so it still refers 
to kpi_04#46 while the Union now outputs kpi_04#48:
!Project [kpi_04#46]
+- SubqueryAlias t
   +- Union
      :- Project [cast(kpi_04#46 as decimal(22,1)) AS kpi_04#48]
      :  +- Project [a#44 AS kpi_04#46]
      :     +- SubqueryAlias test
      :        +- LocalRelation <empty>, [a#44]
      +- Project [kpi_04#47]
         +- Project [CheckOverflow((promote_precision(cast(a#44 as decimal(22,1))) + promote_precision(cast(a#44 as decimal(22,1)))), DecimalType(22,1), true) AS kpi_04#47]
            +- SubqueryAlias test
               +- LocalRelation <empty>, [a#44]
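
The steps above can be replayed on a toy model, which may make the stale reference easier to see. This is a simplified sketch in plain Scala and not Spark's Catalyst code: Attr, Plan, Relation, Project, Union, missing and widen are hypothetical stand-ins, SubqueryAlias is omitted, and the extra pass-through Project on the second branch is skipped. The exprIds 44, 46, 47 and 48 are taken from the plans above.

```scala
// Toy model only: these classes mimic, but are not, Spark's Catalyst plan nodes.
object ExprIdDemo {
  // An attribute is identified by (name, exprId); kpi_04#46 and kpi_04#48 differ.
  final case class Attr(name: String, exprId: Int)

  sealed trait Plan { def output: Seq[Attr] }
  final case class Relation(output: Seq[Attr]) extends Plan
  // refs = attributes read from the child, output = attributes this node produces.
  final case class Project(refs: Seq[Attr], output: Seq[Attr], child: Plan) extends Plan
  final case class Union(children: Seq[Plan]) extends Plan {
    // As in the plans above, the Union exposes the attributes of its first child.
    def output: Seq[Attr] = children.head.output
  }

  // References of a Project that its child does not produce ("missing" attributes).
  def missing(p: Project): Seq[Attr] =
    p.refs.filterNot(r => p.child.output.contains(r))

  // Hypothetical widening step: wrap the child in a cast Project whose aliases
  // receive fresh exprIds, starting at firstFreshId.
  def widen(child: Plan, firstFreshId: Int): Plan = {
    val aliased = child.output.zipWithIndex.map { case (a, i) => Attr(a.name, firstFreshId + i) }
    Project(child.output, aliased, child)
  }

  val a: Attr = Attr("a", 44)
  val test: Plan = Relation(Seq(a))
  val left: Plan = Project(Seq(a), Seq(Attr("kpi_04", 46)), test)
  val right: Plan = Project(Seq(a), Seq(Attr("kpi_04", 47)), test)

  // Step 1: the outer Project is resolved against the Union output (kpi_04#46).
  val union: Union = Union(Seq(left, right))
  val top: Project = Project(union.output, union.output, union)

  // Step 2: only the first branch needs a cast, so only it gets a new alias (kpi_04#48).
  val widened: Union = Union(Seq(widen(left, 48), right))
  // Step 3: the outer Project is not resolved again and keeps its old reference.
  val staleTop: Project = top.copy(child = widened)

  def main(args: Array[String]): Unit = {
    println(missing(top))      // List()
    println(missing(staleTop)) // List(Attr(kpi_04,46))
  }
}
```

The second println corresponds to the "Resolved attribute(s) ... missing" error: the outer node still asks for kpi_04#46, which no child produces any more.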

> WidenSetOperationTypes in subquery  attribute  missing
> --
>
> Key: SPARK-32638
> URL: https://issues.apache.org/jira/browse/SPARK-32638
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.4, 3.0.0
>Reporter: Guojian Li
>Priority: Major
>
> I am migrating sql from mysql to spark sql, meet a very strange case. Below 
> is code to reproduce the exception:
>  
> {code:java}
> val spark = SparkSession.builder()
>  .master("local")
>  .appName("Word Count")
>  .getOrCreate()
> spark.sparkContext.setLogLevel("TRACE")
> val DecimalType = DataTypes.createDecimalType(20, 2)
> val schema = StructType(List(
>  StructField("a", DecimalType, true)
> ))
> val dataList = new util.ArrayList[Row]()
> val df=spark.createDataFrame(dataList,schema)
> df.printSchema()
> df.createTempView("test")
> val sql=
>  """
>  |SELECT t.kpi_04 FROM
>  |(
>  | SELECT a as `kpi_04` FROM test
>  | UNION ALL
>  | SELECT a+a as `kpi_04` FROM test
>  |) t
>  |
>  """.stripMargin
> spark.sql(sql)
> {code}
>  
> Exception Message:
>  
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: Resolved 
> attribute(s) kpi_04#2 missing from kpi_04#4 in operator !Project [kpi_04#2]. 
> Attribute(s) with the same name appear in the operation: kpi_04. Please check 
> if the right attribute(s) are used.;;
> !Project [kpi_04#2]
> +- SubqueryAlias t
>  +- Union
>  :- Project [cast(kpi_04#2 as decimal(21,2)) AS kpi_04#4]
>  : +- Project [a#0 AS kpi_04#2]
>  : +- SubqueryAlias test
>  : +- LocalRelation , [a#0]
>  +- Project [kpi_04#3]
>  +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + 
> promote_precision(cast(a#0 as decimal(21,2, DecimalType(21,2)) AS 
> kpi_04#3]
>  +- SubqueryAlias test
>  +- LocalRelation , [a#0]{code}
>  
>  
> Base the trace log ,seemly the WidenSetOperationTypes add new outer project 
> layer. It caused the parent query lose the reference to subquery. 
>  
>  
> {code:java}
>  
> === Applying Rule org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes ===
> Plan before the rule (left column of the side-by-side log):
> 'Project [kpi_04#2]
> +- 'SubqueryAlias t
>    +- 'Union
>       :- Project [a#0 AS kpi_04#2]
>       :  +- SubqueryAlias test
>       :     +- LocalRelation <empty>, [a#0]
>       +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
>          +- SubqueryAlias test
>             +- LocalRelation <empty>, [a#0]
>
> Plan after the rule (right column of the side-by-side log):
> !Project [kpi_04#2]
> +- SubqueryAlias t
>    +- Union
>       :- Project [cast(kpi_04#2 as decimal(21,2)) AS kpi_04#4]
>       :  +- Project [a#0 AS kpi_04#2]
>       :     +- SubqueryAlias test
>       :        +- LocalRelation <empty>, [a#0]
>       +- Project [kpi_04#3]
>          +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
>             +- SubqueryAlias test
>                +- LocalRelation <empty>, [a#0]
> {code}
>  
> In the source code of WidenSetOperationTypes, this is intended behavior, 
> but it possibly misses this edge case. 
> I hope someone can help me fix it. 
>  
>  
> {code:java}
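> if (targetTypes.nonEmpty) {
>  // Add an extra Project if the targetTypes are different from the original types.
>  children.map(widenTypes(_, targetTypes))
> } else {
>  // Unable to find a target type to widen, then just return the original set.
>  children
> }{code}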

[jira] [Commented] (SPARK-32638) WidenSetOperationTypes in subquery attribute missing

2020-08-18 Thread Wenchen Fan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179536#comment-17179536
 ] 

Wenchen Fan commented on SPARK-32638:
-

2.3 is not maintained anymore, can you check with 2.4/3.0/master?

> WidenSetOperationTypes in subquery  attribute  missing
> --
>
> Key: SPARK-32638
> URL: https://issues.apache.org/jira/browse/SPARK-32638
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.4
> Reporter: Guojian Li
> Priority: Major
>
> I am migrating SQL from MySQL to Spark SQL and have run into a very strange case. Below 
> is code to reproduce the exception:
>  
> {code:java}
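> import java.util
> import org.apache.spark.sql.{Row, SparkSession}
> import org.apache.spark.sql.types.{DataTypes, StructField, StructType}
> val spark = SparkSession.builder()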
>  .master("local")
>  .appName("Word Count")
>  .getOrCreate()
> spark.sparkContext.setLogLevel("TRACE")
> val DecimalType = DataTypes.createDecimalType(20, 2)
> val schema = StructType(List(
>  StructField("a", DecimalType, true)
> ))
> val dataList = new util.ArrayList[Row]()
> val df=spark.createDataFrame(dataList,schema)
> df.printSchema()
> df.createTempView("test")
> val sql=
>  """
>  |SELECT t.kpi_04 FROM
>  |(
>  | SELECT a as `kpi_04` FROM test
>  | UNION ALL
>  | SELECT a+a as `kpi_04` FROM test
>  |) t
>  |
>  """.stripMargin
> spark.sql(sql)
> {code}
>  
> Exception Message:
>  
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: Resolved 
> attribute(s) kpi_04#2 missing from kpi_04#4 in operator !Project [kpi_04#2]. 
> Attribute(s) with the same name appear in the operation: kpi_04. Please check 
> if the right attribute(s) are used.;;
> !Project [kpi_04#2]
> +- SubqueryAlias t
>    +- Union
>       :- Project [cast(kpi_04#2 as decimal(21,2)) AS kpi_04#4]
>       :  +- Project [a#0 AS kpi_04#2]
>       :     +- SubqueryAlias test
>       :        +- LocalRelation <empty>, [a#0]
>       +- Project [kpi_04#3]
>          +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
>             +- SubqueryAlias test
>                +- LocalRelation <empty>, [a#0]{code}
>  
>  
> Based on the trace log, it seems that WidenSetOperationTypes adds a new outer Project 
> layer, which causes the parent query to lose its reference to the subquery's attribute. 
>  
>  
> {code:java}
>  
> === Applying Rule org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes ===
> Plan before the rule (left column of the side-by-side log):
> 'Project [kpi_04#2]
> +- 'SubqueryAlias t
>    +- 'Union
>       :- Project [a#0 AS kpi_04#2]
>       :  +- SubqueryAlias test
>       :     +- LocalRelation <empty>, [a#0]
>       +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
>          +- SubqueryAlias test
>             +- LocalRelation <empty>, [a#0]
>
> Plan after the rule (right column of the side-by-side log):
> !Project [kpi_04#2]
> +- SubqueryAlias t
>    +- Union
>       :- Project [cast(kpi_04#2 as decimal(21,2)) AS kpi_04#4]
>       :  +- Project [a#0 AS kpi_04#2]
>       :     +- SubqueryAlias test
>       :        +- LocalRelation <empty>, [a#0]
>       +- Project [kpi_04#3]
>          +- Project [CheckOverflow((promote_precision(cast(a#0 as decimal(21,2))) + promote_precision(cast(a#0 as decimal(21,2)))), DecimalType(21,2)) AS kpi_04#3]
>             +- SubqueryAlias test
>                +- LocalRelation <empty>, [a#0]
> {code}
>  
> In the source code of WidenSetOperationTypes, this is intended behavior, 
> but it possibly misses this edge case. 
> I hope someone can help me fix it. 
>  
>  
> {code:java}
> if (targetTypes.nonEmpty) {
>  // Add an extra Project if the targetTypes are different from the original types.
>  children.map(widenTypes(_, targetTypes))
> } else {
>  // Unable to find a target type to widen, then just return the original set.
>  children
> }{code}
>  
>  
>  
>  
>  
>  
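
The failure described in the quoted report can be illustrated with a small toy model. This is only a sketch: `Attr`, `Plan`, `Relation`, `Project`, `Union` and `missingInput` below are hypothetical stand-ins written for this note, not Spark's Catalyst classes, and the `SubqueryAlias` nodes are left out for brevity. It shows why wrapping one Union branch in a new Project with a fresh attribute id (#4) leaves the parent's reference to the old id (#2) dangling, and how rewriting the parent's references with an old-to-new mapping would be one conceivable remedy.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst API.
object WidenToyModel {
  // An attribute is identified by its numeric id, not only by its name.
  final case class Attr(name: String, id: Int)

  sealed trait Plan { def output: Seq[Attr] }
  final case class Relation(output: Seq[Attr]) extends Plan
  final case class Project(references: Seq[Attr], output: Seq[Attr], child: Plan) extends Plan {
    // Referenced attributes that the child does not produce.
    def missingInput: Seq[Attr] = references.filterNot(child.output.contains)
  }
  final case class Union(children: Seq[Plan]) extends Plan {
    // In this model a union exposes the output of its first child.
    def output: Seq[Attr] = children.head.output
  }

  val a0 = Attr("a", 0)
  val kpi2 = Attr("kpi_04", 2)
  val kpi3 = Attr("kpi_04", 3)
  val kpi4 = Attr("kpi_04", 4)
  val test = Relation(Seq(a0))

  // The two branches of the UNION ALL: a AS kpi_04#2 and (a + a) AS kpi_04#3.
  val left = Project(Seq(a0), Seq(kpi2), test)
  val right = Project(Seq(a0), Seq(kpi3), test)

  // Before widening, the outer reference to kpi_04#2 is satisfied.
  val before = Project(Seq(kpi2), Seq(kpi2), Union(Seq(left, right)))

  // Widening wraps the left branch in a cast that is aliased under a fresh id (#4),
  // while the outer Project still refers to #2.
  val widenedLeft = Project(Seq(kpi2), Seq(kpi4), left)
  val after = Project(Seq(kpi2), Seq(kpi2), Union(Seq(widenedLeft, right)))

  // One conceivable remedy: rewrite the parent's attributes with an old -> new mapping.
  val rewrite = Map(kpi2 -> kpi4)
  val repaired = after.copy(
    references = after.references.map(a => rewrite.getOrElse(a, a)),
    output = after.output.map(a => rewrite.getOrElse(a, a)))

  def main(args: Array[String]): Unit = {
    println(s"before:   missing = ${before.missingInput}")
    println(s"after:    missing = ${after.missingInput}")
    println(s"repaired: missing = ${repaired.missingInput}")
  }
}
```

In this model only the `after` plan reports a missing attribute (kpi_04#2), which mirrors the "Resolved attribute(s) kpi_04#2 missing from kpi_04#4" message in the report. Whether the actual fix in Spark takes the same approach is not something this sketch claims.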

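For checking the report against 2.4/3.0/master as requested above, a minimal sketch along the following lines could be run once per Spark build. It reuses the reporter's schema and query; the object name `Spark32638Check` and the app name are made up for this note, Spark must be on the classpath, and no particular outcome is claimed for any version.

```scala
import java.util.ArrayList
import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
import org.apache.spark.sql.types.{DataTypes, StructField, StructType}

// Hypothetical helper for this note: prints the Spark version and whether the query analyzes.
object Spark32638Check {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local")
      .appName("SPARK-32638 check")
      .getOrCreate()

    // Same shape as the report: an empty table with a single decimal(20,2) column.
    val schema = StructType(List(
      StructField("a", DataTypes.createDecimalType(20, 2), true)))
    spark.createDataFrame(new ArrayList[Row](), schema).createOrReplaceTempView("test")

    val sql =
      """
        |SELECT t.kpi_04 FROM
        |(
        | SELECT a as `kpi_04` FROM test
        | UNION ALL
        | SELECT a+a as `kpi_04` FROM test
        |) t
      """.stripMargin

    println(s"Spark version: ${spark.version}")
    // spark.sql analyzes the query eagerly, so an AnalysisException would surface here.
    Try(spark.sql(sql)) match {
      case Success(df) => df.printSchema()
      case Failure(e: AnalysisException) => println(s"Analysis failed: ${e.getMessage}")
      case Failure(e) => throw e
    }
    spark.stop()
  }
}
```
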

