subject:"\[jira\] \[Commented\] \(SPARK\-29667\) implicitly convert mismatched datatypes on right side of \"IN\" operator"

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

2019-12-04 Thread Marco Gaido (Jira)



[ 
https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987712#comment-16987712
 ] 

Marco Gaido commented on SPARK-29667:
-

I can agree more with you [~hyukjin.kwon]. I think that having different 
coercion rules for the two types of IN is very confusing. It'd be great for 
such things to be consistent among all the framework in order to avoid 
"surprises" for users IMHO.

> implicitly convert mismatched datatypes on right side of "IN" operator
> --
>
> Key: SPARK-29667
> URL: https://issues.apache.org/jira/browse/SPARK-29667
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: Jessie Lin
>Priority: Minor
>
> Ran into error on this sql
> Mismatched columns:
> {code}
> [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] 
> {code}
> the sql and clause
> {code}
>   AND   a.id in (select id from db1.table1 where col1 = 1 group by id)
> {code}
> Once I cast {{decimal(18,0)}} to {{decimal(28,0)}} explicitly above, the sql 
> ran just fine. Can the sql engine cast implicitly in this case?
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

2019-12-03 Thread Hyukjin Kwon (Jira)



[ 
https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987440#comment-16987440
 ] 

Hyukjin Kwon commented on SPARK-29667:
--

I think this issue is a duplicate of SPARK-29860. [The PR against 
SPARK-29860|https://github.com/apache/spark/pull/26485] fixes this issue.
There are also related JIRAs and PRs such as 
[SPARK-25056|https://github.com/apache/spark/pull/22038] and 
[SPARK-22413|https://github.com/apache/spark/pull/19635].

Seems like we should better actively review than just leaving them 
indefinitely. CC [~yumwang], [~mgaido], [~smilegator], [~cloud_fan].

> implicitly convert mismatched datatypes on right side of "IN" operator
> --
>
> Key: SPARK-29667
> URL: https://issues.apache.org/jira/browse/SPARK-29667
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: Jessie Lin
>Priority: Minor
>
> Ran into error on this sql
> Mismatched columns:
> {code}
> [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] 
> {code}
> the sql and clause
> {code}
>   AND   a.id in (select id from db1.table1 where col1 = 1 group by id)
> {code}
> Once I cast {{decimal(18,0)}} to {{decimal(28,0)}} explicitly above, the sql 
> ran just fine. Can the sql engine cast implicitly in this case?
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

2019-12-03 Thread Aman Omer (Jira)



[ 
https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987120#comment-16987120
 ] 

Aman Omer commented on SPARK-29667:
---

Actually there is also JIRA for decimal type mismatching due to precision
https://issues.apache.org/jira/browse/SPARK-29600

> implicitly convert mismatched datatypes on right side of "IN" operator
> --
>
> Key: SPARK-29667
> URL: https://issues.apache.org/jira/browse/SPARK-29667
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: Jessie Lin
>Priority: Minor
>
> Ran into error on this sql
> Mismatched columns:
> [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] 
> the sql and clause
>   AND   a.id in (select id from db1.table1 where col1 = 1 group by id)
> Once I cast decimal(18,0) to decimal(28,0) explicitly above, the sql ran just 
> fine. Can the sql engine cast implicitly in this case?
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

2019-12-03 Thread Aman Omer (Jira)



[ 
https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987119#comment-16987119
 ] 

Aman Omer commented on SPARK-29667:
---

Hi [~jessielin], did you observed same issue with any other data type?

{code:java}
SELECT * FROM (VALUES (1.0), (2.0) AS t1(col1))
WHERE col1 IN
(SELECT * FROM (VALUES (1.0), (2.0), (3.00) AS t2(col2)));
{code}
Above query will fail with following error

{noformat}
Mismatched columns:
[(__auto_generated_subquery_name.`col1`:decimal(2,1), 
__auto_generated_subquery_name.`col2`:decimal(3,2))]
Left side:
[decimal(2,1)].
Right side:
[decimal(3,2)].
{noformat}

Since, `3.00` in subquery will upcast all element in column to decimal(3,2).

> implicitly convert mismatched datatypes on right side of "IN" operator
> --
>
> Key: SPARK-29667
> URL: https://issues.apache.org/jira/browse/SPARK-29667
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: Jessie Lin
>Priority: Minor
>
> Ran into error on this sql
> Mismatched columns:
> [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] 
> the sql and clause
>   AND   a.id in (select id from db1.table1 where col1 = 1 group by id)
> Once I cast decimal(18,0) to decimal(28,0) explicitly above, the sql ran just 
> fine. Can the sql engine cast implicitly in this case?
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

2019-10-30 Thread Cheng Lian (Jira)



[ 
https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963305#comment-16963305
 ] 

Cheng Lian commented on SPARK-29667:


Reproduced this with the following snippet:
{code}
spark.range(10).select($"id" cast DecimalType(18, 
0)).createOrReplaceTempView("t1")
spark.range(10).select($"id" cast DecimalType(28, 
0)).createOrReplaceTempView("t2")
sql("SELECT * FROM t1 WHERE t1.id IN (SELECT id FROM t2)").explain(true)
{code}
Exception:
{noformat}
The data type of one or more elements in the left hand side of an IN subquery
is not compatible with the data type of the output of the subquery
Mismatched columns:
[(t1.`id`:decimal(18,0), t2.`id`:decimal(28,0))]
Left side:
[decimal(18,0)].
Right side:
[decimal(28,0)].; line 1 pos 29;
'Project [*]
+- 'Filter id#16 IN (list#22 [])
   :  +- Project [id#20]
   : +- SubqueryAlias `t2`
   :+- Project [cast(id#18L as decimal(28,0)) AS id#20]
   :   +- Range (0, 10, step=1, splits=Some(8))
   +- SubqueryAlias `t1`
  +- Project [cast(id#14L as decimal(18,0)) AS id#16]
 +- Range (0, 10, step=1, splits=Some(8))
at 
org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$3.applyOrElse(CheckAnalysis.scala:123)
...
{noformat}
It seems that Postgres does support this kind of implicit casting:
{noformat}
postgres=# SELECT CAST(1 AS BIGINT) IN (CAST(1 AS INT));

 ?column?
--
 t
(1 row)
{noformat}
I believe the problem in Spark is that 
{{o.a.s.s.c.expressions.In#checkInputDataTypes()}} is too strict.

> implicitly convert mismatched datatypes on right side of "IN" operator
> --
>
> Key: SPARK-29667
> URL: https://issues.apache.org/jira/browse/SPARK-29667
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.3
> Environment: spark-2.4.3-bin-dbr-5.5-snapshot-9833d0f
>Reporter: Jessie Lin
>Priority: Minor
>
> Ran into error on this sql
> Mismatched columns:
> [(a.`id`:decimal(28,0), db1.table1.`id`:decimal(18,0))] 
> the sql and clause
>   AND   a.id in (select id from db1.table1 where col1 = 1 group by id)
> Once I cast decimal(18,0) to decimal(28,0) explicitly above, the sql ran just 
> fine. Can the sql engine cast implicitly in this case?
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

5 matches

Site Navigation

Mail list logo

Footer information