jorisvandenbossche commented on code in PR #13155:
URL: https://github.com/apache/arrow/pull/13155#discussion_r877024037
##########
python/pyarrow/table.pxi:
##########
@@ -2882,24 +2882,27 @@ cdef class Table(_PandasConvertible):
return pyarrow_wrap_table(result)
- def filter(self, mask, object null_selection_behavior="drop"):
+ def filter(self, mask_or_expr, object null_selection_behavior="drop"):
"""
Select rows from the table.
- See :func:`pyarrow.compute.filter` for full usage.
+ The Table can be filtered based on a mask, which will be passed to
+ :func:`pyarrow.compute.filter` to perform the filtering, or it can
+ be filtered through a boolean :class:`.Expression`
Parameters
----------
- mask : Array or array-like
- The boolean mask to filter the table with.
+ mask_or_expr : Array or array-like or .Expression
+ The boolean mask or the :class:`.Expression` to filter the table
with.
null_selection_behavior
- How nulls in the mask should be handled.
+ How nulls in the mask should be handled, does nothing if
+ an :class:`.Expression` is used.
Review Comment:
> I think that if you care about special handling nulls, you probably want
> to build an expression that evaluates as you wish for nulls
I don't think it is possible to get the "emit null" behaviour by changing the
expression (for dropping/keeping, you can explicitly fill the null with
False/True, but preserving the row as null is only possible through
this option). I suppose that is a good reason why this is an option of the
filter kernel and not of, e.g., the comparison kernels.
Anyway, this is not that important given that the "drop" behaviour is the
default for both (and is the typical behaviour you want, I think), but this
might be something to open a JIRA for to add `FilterOptions` to the
`FilterNodeOptions` (cc @westonpace would that make sense?)