yaooqinn opened a new pull request #26776: [SPARK-30147][SQL] Trim the string 
when cast string type to booleans
URL: https://github.com/apache/spark/pull/26776
 
 
   ### What changes were proposed in this pull request?
   
   Currently, Spark trims the string when casting a string value to any of the
   `canCast` target types, e.g. int, double, decimal, interval, date, and
   timestamp, except for boolean.
   This makes type cast and coercion behavior inconsistent in Spark.
   It also does not conform to the ANSI SQL standard, which specifies:
   ```
   If TD is boolean, then
   Case:
   a) If SD is character string, then SV is replaced by
       TRIM ( BOTH ' ' FROM VE )
       Case:
       i) If the rules for literal in Subclause 5.3, “literal”, can be applied 
to SV to determine a valid
   value of the data type TD, then let TV be that value.
      ii) Otherwise, an exception condition is raised: data exception — invalid 
character value for cast.
   b) If SD is boolean, then TV is SV
   ```
   In this pull request, we trim all whitespace from both ends of the string
   before converting it to a boolean value. This matches the behavior of the
   other casts, but differs slightly from the SQL standard, which trims only
   spaces.
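
The difference between trimming all whitespace and trimming only spaces can be illustrated with a minimal Python sketch. This is not Spark's implementation: the accepted literal sets and all function names below are assumptions made for illustration, Python's `str.strip()` stands in for the whitespace trim (its exact character set may differ from Spark's), and `None` stands in for SQL `null`.

```python
from typing import Optional

# Assumed, case-insensitive boolean literals (hypothetical, for illustration).
TRUE_STRINGS = {"t", "true", "y", "yes", "1"}
FALSE_STRINGS = {"f", "false", "n", "no", "0"}


def parse_boolean(s: str) -> Optional[bool]:
    """Map a string to a boolean, or None (standing in for null) if invalid."""
    lowered = s.lower()
    if lowered in TRUE_STRINGS:
        return True
    if lowered in FALSE_STRINGS:
        return False
    return None


def cast_no_trim(s: str) -> Optional[bool]:
    """Behavior described as 'before this PR': no trimming at all."""
    return parse_boolean(s)


def cast_trim_whitespace(s: str) -> Optional[bool]:
    """Behavior proposed here: trim all whitespace from both ends."""
    return parse_boolean(s.strip())


def cast_trim_spaces_only(s: str) -> Optional[bool]:
    """Behavior in the quoted standard: TRIM(BOTH ' ' FROM ...), spaces only."""
    return parse_boolean(s.strip(" "))


if __name__ == "__main__":
    for value in ["\t true", "  false ", "tr ue"]:
        print(
            repr(value),
            cast_no_trim(value),
            cast_trim_whitespace(value),
            cast_trim_spaces_only(value),
        )
```

Under these assumptions, `'\t true'` is accepted only by the whitespace-trimming variant, because the leading tab survives a spaces-only trim; this is the case where the proposed behavior and the standard diverge.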
   
   ### Why are the changes needed?
   
   Type cast/coercion consistency
   
   
   ### Does this PR introduce any user-facing change?
   
   Yes. Strings with leading or trailing whitespace are now trimmed before being
   converted to booleans.
   
   e.g. `select cast('\t true' as boolean)` now returns `true`; before this PR
   it returned `null`.
   
   ### How was this patch tested?
   
   Added unit tests.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
