[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator
[ https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987712#comment-16987712 ] Marco Gaido commented on SPARK-29667: - I can agree more with you [~hyukjin.kwon]. I think that having different coercion rules for the two types of IN is very confusing. It'd be great for such things to be consistent among all the framework in order to avoid "surprises" for users IMHO. > implicitly convert mismatched datatypes on right side of "IN" operator > -- > > Key: SPARK-29667 > URL: https://issues.apache.org/jira/browse/SPARK-29667 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.3 >Reporter: Jessie Lin >Priority: Minor > > Ran into error on this sql > Mismatched columns: > {code} > [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] > {code} > the sql and clause > {code} > AND a.id in (select id from db1.table1 where col1 = 1 group by id) > {code} > Once I cast {{decimal(18,0)}} to {{decimal(28,0)}} explicitly above, the sql > ran just fine. Can the sql engine cast implicitly in this case? > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator
[ https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987440#comment-16987440 ] Hyukjin Kwon commented on SPARK-29667: -- I think this issue is a duplicate of SPARK-29860. [The PR against SPARK-29860|https://github.com/apache/spark/pull/26485] fixes this issue. There are also related JIRAs and PRs such as [SPARK-25056|https://github.com/apache/spark/pull/22038] and [SPARK-22413|https://github.com/apache/spark/pull/19635]. Seems like we should better actively review than just leaving them indefinitely. CC [~yumwang], [~mgaido], [~smilegator], [~cloud_fan]. > implicitly convert mismatched datatypes on right side of "IN" operator > -- > > Key: SPARK-29667 > URL: https://issues.apache.org/jira/browse/SPARK-29667 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.3 >Reporter: Jessie Lin >Priority: Minor > > Ran into error on this sql > Mismatched columns: > {code} > [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] > {code} > the sql and clause > {code} > AND a.id in (select id from db1.table1 where col1 = 1 group by id) > {code} > Once I cast {{decimal(18,0)}} to {{decimal(28,0)}} explicitly above, the sql > ran just fine. Can the sql engine cast implicitly in this case? > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator
[ https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987120#comment-16987120 ] Aman Omer commented on SPARK-29667: --- Actually there is also JIRA for decimal type mismatching due to precision https://issues.apache.org/jira/browse/SPARK-29600 > implicitly convert mismatched datatypes on right side of "IN" operator > -- > > Key: SPARK-29667 > URL: https://issues.apache.org/jira/browse/SPARK-29667 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.3 >Reporter: Jessie Lin >Priority: Minor > > Ran into error on this sql > Mismatched columns: > [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] > the sql and clause > AND a.id in (select id from db1.table1 where col1 = 1 group by id) > Once I cast decimal(18,0) to decimal(28,0) explicitly above, the sql ran just > fine. Can the sql engine cast implicitly in this case? > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator
[ https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987119#comment-16987119 ] Aman Omer commented on SPARK-29667: --- Hi [~jessielin], did you observed same issue with any other data type? {code:java} SELECT * FROM (VALUES (1.0), (2.0) AS t1(col1)) WHERE col1 IN (SELECT * FROM (VALUES (1.0), (2.0), (3.00) AS t2(col2))); {code} Above query will fail with following error {noformat} Mismatched columns: [(__auto_generated_subquery_name.`col1`:decimal(2,1), __auto_generated_subquery_name.`col2`:decimal(3,2))] Left side: [decimal(2,1)]. Right side: [decimal(3,2)]. {noformat} Since, `3.00` in subquery will upcast all element in column to decimal(3,2). > implicitly convert mismatched datatypes on right side of "IN" operator > -- > > Key: SPARK-29667 > URL: https://issues.apache.org/jira/browse/SPARK-29667 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.3 >Reporter: Jessie Lin >Priority: Minor > > Ran into error on this sql > Mismatched columns: > [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] > the sql and clause > AND a.id in (select id from db1.table1 where col1 = 1 group by id) > Once I cast decimal(18,0) to decimal(28,0) explicitly above, the sql ran just > fine. Can the sql engine cast implicitly in this case? > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator
[ https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963305#comment-16963305 ] Cheng Lian commented on SPARK-29667: Reproduced this with the following snippet: {code} spark.range(10).select($"id" cast DecimalType(18, 0)).createOrReplaceTempView("t1") spark.range(10).select($"id" cast DecimalType(28, 0)).createOrReplaceTempView("t2") sql("SELECT * FROM t1 WHERE t1.id IN (SELECT id FROM t2)").explain(true) {code} Exception: {noformat} The data type of one or more elements in the left hand side of an IN subquery is not compatible with the data type of the output of the subquery Mismatched columns: [(t1.`id`:decimal(18,0), t2.`id`:decimal(28,0))] Left side: [decimal(18,0)]. Right side: [decimal(28,0)].; line 1 pos 29; 'Project [*] +- 'Filter id#16 IN (list#22 []) : +- Project [id#20] : +- SubqueryAlias `t2` :+- Project [cast(id#18L as decimal(28,0)) AS id#20] : +- Range (0, 10, step=1, splits=Some(8)) +- SubqueryAlias `t1` +- Project [cast(id#14L as decimal(18,0)) AS id#16] +- Range (0, 10, step=1, splits=Some(8)) at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$3.applyOrElse(CheckAnalysis.scala:123) ... {noformat} It seems that Postgres does support this kind of implicit casting: {noformat} postgres=# SELECT CAST(1 AS BIGINT) IN (CAST(1 AS INT)); ?column? -- t (1 row) {noformat} I believe the problem in Spark is that {{o.a.s.s.c.expressions.In#checkInputDataTypes()}} is too strict. > implicitly convert mismatched datatypes on right side of "IN" operator > -- > > Key: SPARK-29667 > URL: https://issues.apache.org/jira/browse/SPARK-29667 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.3 > Environment: spark-2.4.3-bin-dbr-5.5-snapshot-9833d0f >Reporter: Jessie Lin >Priority: Minor > > Ran into error on this sql > Mismatched columns: > [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] > the sql and clause > AND a.id in (select id from db1.table1 where col1 = 1 group by id) > Once I cast decimal(18,0) to decimal(28,0) explicitly above, the sql ran just > fine. Can the sql engine cast implicitly in this case? > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org