[I] [C++] Support scalar aggregate expressions on ExecuteScalarExpression [arrow]

via GitHub Fri, 16 Feb 2024 11:03:02 -0800


JerAguilon opened a new issue, #40102:
URL: https://github.com/apache/arrow/issues/40102


   ### Describe the enhancement requested
   
   It would be fantastic to be able to run non-scalar expressions, such as 
scalar aggregates, on a dataset. As a dummy example, this would return all the 
rows in which column "foo" is greater than or equal to its last value:
   
   ```
   import pyarrow.compute as pc
   import pyarrow as pa
   
   table = pa.Table.from_arrays([pa.array([1, 5, 3, 4])], names=["foo"])
   expr = pc.field('foo') >= pc.last(pc.field('foo'))
   
   # expected result of filtering `table` by `expr`:
   # pa.Table.from_arrays([pa.array([5, 4])], names=["foo"])
   ```
   
   The example is in Python, but the limitation is in the C++ function 
`ExecuteScalarExpression`, which currently fails with:
   
   `ArrowInvalid: ExecuteScalarExpression cannot Execute non-scalar expression 
(foo == last(foo))`
   
   Given that `SCALAR_AGGREGATE` functions reduce their input to a single 
scalar, I think these should be executable in `ExecuteScalarExpression` too.
   
   ### Component(s)
   
   C++


