[I] [Pyarrow] Support non-scalar filtering [arrow]

via GitHub Fri, 16 Feb 2024 00:20:54 -0800


JerAguilon opened a new issue, #40099:
URL: https://github.com/apache/arrow/issues/40099


   ### Describe the enhancement requested
   
   It would be fantastic to be able to filter a dataset with a non-scalar 
expression, i.e. one that contains an aggregation. As a dummy example, this 
would return all the rows in which column "foo" is greater than or equal to 
the last value:
   
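   ```
   import pyarrow as pa
   import pyarrow.compute as pc
   
   table = pa.Table.from_arrays([pa.array([1, 5, 3, 4])], names=["foo"])
   # Compare every row against an aggregate: the last value of the column.
   expr = pc.field('foo') >= pc.last(pc.field('foo'))
   # Applying the expression as a filter is the step that fails today.
   result = table.filter(expr)
   
   # expected:  pa.Table.from_arrays([pa.array([5, 4])], names=["foo"])
   ```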
   
   Today, you'd get something like:
   
   ```
   ArrowInvalid: ExecuteScalarExpression cannot Execute non-scalar expression (foo == last(foo))
   ```
   
   Is there a reason we can only execute scalar expressions? Is there a way 
today to interweave aggregations in one filter query?
   
   ### Component(s)
   
   Python


