[GitHub] [spark] Kwafoor commented on a change in pull request #34862: [SPARK-30471][SQL]Fix issue when comparing String and IntegerType


Kwafoor commented on a change in pull request #34862:
URL: https://github.com/apache/spark/pull/34862#discussion_r767382412




##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##########
@@ -652,7 +652,13 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit
       buildCast[UTF8String](_, UTF8StringUtils.toIntExact)
     case StringType =>
       val result = new IntWrapper()
-      buildCast[UTF8String](_, s => if (s.toInt(result)) result.value else null)
+      buildCast[UTF8String](_, s => {
+        if (s.toInt(result)) {
+          result.value}

Review comment:
   In Spark SQL:
   
![image](https://user-images.githubusercontent.com/25627922/145743237-83347617-a47a-4e84-8650-56dd50ad232a.png)
   and 
   
![image](https://user-images.githubusercontent.com/25627922/145743302-0f4544a8-4b61-4130-be1a-1a8e64dbaff6.png)
   
   In Hive:
   
![image](https://user-images.githubusercontent.com/25627922/145743335-9614f19e-743f-49bd-a795-543dbee117c4.png)
   
   When a user compares a String with an IntegerType in Spark SQL with
spark.sql.ansi.enabled=false (the default value), they get no warning, only a
wrong result.
   When the user runs a large, complex SQL query, it is very hard to find where
the result went wrong, so the query is hard to fix.
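
To make the concern concrete, here is a minimal sketch in plain Scala, with no Spark dependency. All names (`LenientCastSketch`, `lenientToInt`, `strictToInt`, `lenientEquals`) are hypothetical, and the parsing uses `String.toInt` rather than Spark's `UTF8String.toInt`, so it only models the general idea: a lenient cast turns an unparsable string into NULL, and a filter on the comparison then drops the row without any message.

```scala
import scala.util.Try

object LenientCastSketch {
  // Hypothetical lenient cast: None stands in for SQL NULL.
  // Assumption: String.toInt is a simplification of the real cast logic.
  def lenientToInt(s: String): Option[Int] = Try(s.trim.toInt).toOption

  // Hypothetical strict cast: reports the failure instead of returning NULL.
  def strictToInt(s: String): Either[String, Int] =
    lenientToInt(s).toRight(s"cannot cast '$s' to INT")

  // Models `stringCol = intValue` when the string side is cast to int.
  // None means the comparison evaluated to NULL.
  def lenientEquals(s: String, i: Int): Option[Boolean] =
    lenientToInt(s).map(_ == i)

  def main(args: Array[String]): Unit = {
    val rows = Seq("1", "abc", "2147483648")
    // A WHERE clause keeps only rows whose predicate is true,
    // so rows whose cast failed disappear silently.
    val kept = rows.filter(s => lenientEquals(s, 1).contains(true))
    println(s"kept: $kept")
    // The strict variant surfaces the rows that could not be cast.
    rows.foreach(s => println(s"$s -> ${strictToInt(s)}"))
  }
}
```

In this model only the row "1" survives the filter, while "abc" and the out-of-range "2147483648" are dropped without any signal; the strict variant reports both of them.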
   
   Actually, I think the problem may be that the String is cast to int without
any check when a user compares a String with an AtomicType.
   
![image](https://user-images.githubusercontent.com/25627922/145743988-13d7a594-88c0-41a8-8a8b-058931cb9bbd.png)
   
   And with spark.sql.ansi.enabled=true, Spark SQL disallows comparing String
and DecimalType, and users migrating from Hive to Spark SQL are unwilling to
change their code.
   
![image](https://user-images.githubusercontent.com/25627922/145748708-7b0c1d79-95d9-42a5-8084-cc3bbddcc81c.png)
   
![image](https://user-images.githubusercontent.com/25627922/145748799-4983923e-c532-4212-a1ce-16686f441e82.png)
   
![image](https://user-images.githubusercontent.com/25627922/145744493-9e5ccd5f-18d1-47d5-9b4f-39b3dee61660.png)
   
![image](https://user-images.githubusercontent.com/25627922/145744704-a9e41fe8-e03a-48b0-bfc3-d61baaaf7db8.png)
   
   So I think Spark SQL should tell the user where their SQL is not supported,
even without ANSI mode, because this case is an exceptional one and the user
should be warned.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


