GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/8698

    [SPARK-10442][SQL][WIP] fix string to boolean cast

    When we cast a string to boolean in Hive, it returns `true` if the length of
the string is > 0, and Spark SQL follows this behavior.
    
    However, this behavior is very different from other SQL systems:
    
    1. [Presto](https://github.com/facebook/presto/blob/master/presto-main/src/main/java/com/facebook/presto/type/VarcharOperators.java#L89-L118)
       returns `true` for 't', 'true', '1'; `false` for 'f', 'false', '0'; and
       throws an exception for other values.
    2. [Redshift](http://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html)
       returns `true` for 't', 'true', 'y', 'yes', '1'; `false` for 'f', 'false',
       'n', 'no', '0'; and null for other values.
    3. [PostgreSQL](http://www.postgresql.org/docs/devel/static/datatype-boolean.html)
       returns `true` for 't', 'true', 'y', 'yes', 'on', '1'; `false` for 'f',
       'false', 'n', 'no', 'off', '0'; and throws an exception for other values.
    4. [Vertica](https://my.vertica.com/docs/5.0/HTML/Master/2983.htm)
       returns `true` for 't', 'true', 'y', 'yes', '1'; `false` for 'f', 'false',
       'n', 'no', '0'; and null for other values.
    5. [Impala](http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_boolean.html)
       throws an exception when trying to cast a string to boolean.
    6. MySQL, Oracle and SQL Server don't have a boolean type.
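
    To make the differences listed above concrete, here is a rough Python
sketch of the three families of behavior: length-based (Hive and current Spark
SQL), strict (Presto, PostgreSQL) and lenient (Redshift, Vertica). All function
and constant names are hypothetical and are not taken from any of these
systems. The sketch compares literal strings only; how each system treats
letter case or surrounding whitespace is not stated here and is not modeled.

```python
from typing import Optional, Set

# Accepted literals, as listed in the comparison above.
TRUE_PRESTO: Set[str] = {"t", "true", "1"}
FALSE_PRESTO: Set[str] = {"f", "false", "0"}
TRUE_REDSHIFT: Set[str] = TRUE_PRESTO | {"y", "yes"}    # also used for Vertica
FALSE_REDSHIFT: Set[str] = FALSE_PRESTO | {"n", "no"}   # also used for Vertica
TRUE_POSTGRES: Set[str] = TRUE_REDSHIFT | {"on"}
FALSE_POSTGRES: Set[str] = FALSE_REDSHIFT | {"off"}


def cast_hive(s: str) -> bool:
    """Hive / current Spark SQL rule: any non-empty string is true."""
    return len(s) > 0


def _lookup(s: str, true_set: Set[str], false_set: Set[str]) -> Optional[bool]:
    """Return True/False for a recognized literal, None otherwise."""
    if s in true_set:
        return True
    if s in false_set:
        return False
    return None


def cast_strict(s: str, true_set: Set[str], false_set: Set[str]) -> bool:
    """Presto / PostgreSQL style: unrecognized input is an error."""
    result = _lookup(s, true_set, false_set)
    if result is None:
        raise ValueError(f"cannot cast {s!r} to boolean")
    return result


def cast_lenient(s: str, true_set: Set[str], false_set: Set[str]) -> Optional[bool]:
    """Redshift / Vertica style: unrecognized input becomes null (None)."""
    return _lookup(s, true_set, false_set)
```

    Under these assumptions, `cast_hive("false")` is `True`, while
`cast_strict("false", TRUE_PRESTO, FALSE_PRESTO)` and
`cast_lenient("false", TRUE_REDSHIFT, FALSE_REDSHIFT)` are both `False`, which
is the incompatibility this PR is probing.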
    
    Whether we should change the cast behavior to match other SQL systems has
not been decided yet. This PR is a test to see how many compatibility tests
would fail if we made the change.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark string2boolean

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8698.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8698
    
----
commit e7b50f63f4edc938636c7cfaf9ba0ba6461c1812
Author: Wenchen Fan <[email protected]>
Date:   2015-09-10T11:58:35Z

    fix string to boolean cast

----

