Yin Huai created SPARK-8573:
-------------------------------

             Summary: For PySpark's DataFrame API, we need to throw exceptions 
when users try to use and/or/not
                 Key: SPARK-8573
                 URL: https://issues.apache.org/jira/browse/SPARK-8573
             Project: Spark
          Issue Type: Bug
          Components: PySpark, SQL
    Affects Versions: 1.3.0
            Reporter: Yin Huai
            Assignee: Davies Liu
            Priority: Critical


In PySpark's DataFrame API, we have
{code}
# `and`, `or`, `not` cannot be overloaded in Python,
# so use bitwise operators as boolean operators
__and__ = _bin_op('and')
__or__ = _bin_op('or')
__invert__ = _func_op('not')
__rand__ = _bin_op("and")
__ror__ = _bin_op("or")
{code}
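
As a minimal sketch of the overloading pattern quoted above, the following toy class builds expression strings from the bitwise operators. The class name {{Expr}} and its {{sql}} attribute are hypothetical and only illustrate the idea; this is not PySpark's actual Column implementation.

```python
# Hypothetical stand-in for a column expression; NOT PySpark's Column class.
class Expr:
    def __init__(self, sql):
        self.sql = sql

    # Comparisons return a new expression instead of a Python bool.
    def __gt__(self, other):
        return Expr("(%s > %s)" % (self.sql, other))

    def __lt__(self, other):
        return Expr("(%s < %s)" % (self.sql, other))

    # `&`, `|` and `~` can be overloaded; `and`, `or` and `not` cannot.
    def __and__(self, other):
        return Expr("(%s AND %s)" % (self.sql, other.sql))

    def __or__(self, other):
        return Expr("(%s OR %s)" % (self.sql, other.sql))

    def __invert__(self):
        return Expr("(NOT %s)" % self.sql)


col = Expr("id")
# Parentheses are needed because `&` and `|` bind tighter than `>` and `<`.
both = (col > 5) & (col < 10)
either = (col > 5) | (col < 10)
negated = ~(col > 5)
```

Here {{both.sql}} holds the combined condition, because {{&}} dispatches to {{__and__}} and both operands end up in the result.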

Right now, users can still use the keywords {{and}}, {{or}}, and {{not}}, which 
silently produce confusing results. We need to throw an error when users try to 
use them and tell them to use the bitwise operators instead.

For example, 
{code}
df = sqlContext.range(1, 10)
df.id > 5 or df.id < 10
Out[30]: Column<(id > 5)>
df.id > 5 and df.id < 10
Out[31]: Column<(id < 10)>
{code}
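
The results above follow from how Python evaluates {{and}}/{{or}}: it asks the left operand for its truth value, and an object that defines neither {{__bool__}} nor {{__len__}} is always truthy, so {{or}} returns the left operand and {{and}} returns the right one. One possible way to raise an error, sketched here under the assumption that the check lives in the truth-value hook, is shown below. {{PlainExpr}}, {{GuardedExpr}}, and the error message are hypothetical and are not taken from the Spark code base.

```python
# Hypothetical classes for illustration only; NOT PySpark's Column class.
class PlainExpr:
    """No truth-value hook, so instances are always truthy."""
    def __init__(self, sql):
        self.sql = sql


class GuardedExpr(PlainExpr):
    """Refuses conversion to bool, so `and`/`or`/`not` fail loudly."""
    def __bool__(self):
        raise ValueError(
            "Cannot convert an expression into bool: use '&' for 'and', "
            "'|' for 'or', '~' for 'not' when building boolean expressions."
        )

    __nonzero__ = __bool__  # Python 2 looks up this name instead


a = PlainExpr("(id > 5)")
b = PlainExpr("(id < 10)")
silent_or = a or b    # left operand is truthy, so `or` returns it
silent_and = a and b  # left operand is truthy, so `and` returns the right one


def raises_on_bool(expr):
    """Return True if taking the truth value of expr raises ValueError."""
    try:
        bool(expr)
    except ValueError:
        return True
    return False
```

With the guard in place, an expression such as {{GuardedExpr("x") and GuardedExpr("y")}} raises {{ValueError}} instead of silently dropping one of the conditions.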



