Re: Documentation of boolean column operators missing?

2018-10-23 Thread Nicholas Chammas
On Tue, 23 Oct 2018 at 21:32, Sean Owen wrote:
>
>> The comments say that it is not possible to overload 'and' and 'or',
>> which would have been more natural.
>>
> Yes, unfortunately, Python does not allow you to override and, or, or not. They are not implemented as “dunder” methods (e.g.
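The point about dunder methods can be illustrated without Spark. What follows is only a minimal sketch in plain Python: the class `Expr` and its string-building behaviour are hypothetical stand-ins for illustration, not PySpark's actual `Column` implementation. It shows that `&`, `|`, and `~` dispatch to `__and__`, `__or__`, and `__invert__`, while `and`, `or`, and `not` have no hook of their own and only ask for the operand's truth value.

```python
class Expr:
    """Hypothetical stand-in for a column expression; not PySpark's Column."""

    def __init__(self, text):
        self.text = text

    # `&`, `|`, and `~` dispatch to these dunder methods, so they can build
    # a new expression object instead of computing a value.
    def __and__(self, other):
        return Expr(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Expr(f"({self.text} OR {other.text})")

    def __invert__(self):
        return Expr(f"(NOT {self.text})")

    # `and`, `or`, and `not` cannot be overloaded; they only request the
    # operand's truth value through __bool__. Raising here is an assumption
    # made for this sketch, to make the misuse visible.
    def __bool__(self):
        raise ValueError("use '&', '|', '~' instead of 'and', 'or', 'not'")


a, b = Expr("is_exiled"), Expr("age > 60")

# Unary `~` binds tighter than `&`, so this is (~a) & b.
combined = ~a & b
print(combined.text)

try:
    a and b  # evaluates bool(a) first, which raises in this sketch
except ValueError as err:
    print("keyword operator failed:", err)
```

Because the keyword forms never reach a method that could return a new expression, a library that wants to build expressions lazily has to borrow the bitwise operators instead.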

Re: Documentation of boolean column operators missing?

2018-10-23 Thread Maciej Szymkiewicz
Even if these were documented, Sphinx doesn't include dunder methods by default (with the exception of __init__). There is a :special-members: option which could be passed to, for example, autoclass.

On Tue, 23 Oct 2018 at 21:32, Sean Owen wrote:
> (& and | are both logical and bitwise operators in

Re: Documentation of boolean column operators missing?

2018-10-23 Thread Sean Owen
(& and | are both logical and bitwise operators in Java and Scala, FWIW)

I don't see them in the Python docs; they are defined in column.py but they don't turn up in the docs. Then again, they're not documented:

    ...
    __and__ = _bin_op('and')
    __or__ = _bin_op('or')
    __invert__ = _func_op('not')

Re: Documentation of boolean column operators missing?

2018-10-23 Thread Nicholas Chammas
Also, to clarify something for folks who don't work with PySpark: The boolean column operators in PySpark are completely different from those in Scala, and non-obvious to boot (since they overload Python's _bitwise_ operators). So their apparent absence from the docs is surprising.

On Tue, Oct

Re: Documentation of boolean column operators missing?

2018-10-23 Thread Nicholas Chammas
So it appears then that the equivalent operators for PySpark are completely missing from the docs, right? That’s surprising. And if there are column function equivalents for |, &, and ~, then I can’t find those either for PySpark. Indeed, I don’t think such a thing is possible in PySpark. (e.g.

Re: Documentation of boolean column operators missing?

2018-10-23 Thread Sean Owen
Those should all be Column functions, really, and I see them at http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Column

On Tue, Oct 23, 2018, 12:27 PM Nicholas Chammas wrote:
> I can’t seem to find any documentation of the &, |, and ~ operators for
> PySpark

Re: Documentation of boolean column operators missing?

2018-10-23 Thread Nicholas Chammas
Nope, that’s different. I’m talking about the operators on DataFrame columns in PySpark, not SQL functions. For example:

    (df
        .where(~col('is_exiled') & (col('age') > 60))
        .show()
    )

On Tue, Oct 23, 2018 at 1:48 PM Xiao Li wrote:
> They are documented at the link below
>
>
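A note on the example in this message: the parentheses around col('age') > 60 matter, because Python gives `&` a higher precedence than comparison operators such as `>`. The sketch below uses only the standard `ast` module to show how Python parses the two forms; the names `flag` and `age` and the helper `top_level_op` are placeholders invented for illustration, and nothing here touches Spark.

```python
import ast


def top_level_op(source):
    """Return the name of the outermost AST node of a Python expression."""
    return type(ast.parse(source, mode="eval").body).__name__


# Without parentheses, `&` is applied first and the comparison comes last,
# so the expression means (flag & age) > 60.
unparenthesized = top_level_op("flag & age > 60")

# With parentheses, the comparison is evaluated first and `&` combines
# the two boolean-like results.
parenthesized = top_level_op("flag & (age > 60)")

print(unparenthesized, parenthesized)
```

This is one reason the bitwise operators are a non-obvious choice for boolean logic: each comparison has to be wrapped in its own parentheses before it is combined with `&` or `|`.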

Re: Documentation of boolean column operators missing?

2018-10-23 Thread Xiao Li
They are documented at the link below https://spark.apache.org/docs/2.3.0/api/sql/index.html

On Tue, Oct 23, 2018 at 10:27 AM Nicholas Chammas < nicholas.cham...@gmail.com> wrote:
> I can’t seem to find any documentation of the &, |, and ~ operators for
> PySpark DataFrame columns. I assume

Documentation of boolean column operators missing?

2018-10-23 Thread Nicholas Chammas
I can’t seem to find any documentation of the &, |, and ~ operators for PySpark DataFrame columns. I assume that should be in our docs somewhere. Was it always missing? Am I just missing something obvious? Nick