jorisvandenbossche commented on PR #13687:
URL: https://github.com/apache/arrow/pull/13687#issuecomment-1240399112

   > Plus, we should probably address the `ds.field('')._call` issue before we 
worry too much about extensive documentation.
   
   The reason `_call` is currently private (hence the leading underscore) is 
that for the built-in compute functions, you can use the compute function 
itself and pass it a field expression instead of an actual array:
   
   ```
   >>> import pyarrow.compute as pc
   
   # you can do
   >>> pc.add(pc.field("a"), pc.field("b"))
   <pyarrow.compute.Expression add(a, b)>
   # instead of
   >>> pc.Expression._call("add", [pc.field("a"), pc.field("b")])
   <pyarrow.compute.Expression add(a, b)>
   ```
   
   which was sufficient for the initial examples of dataset projections. 
   This approach does have limitations, though. It currently seems to accept 
only expressions as arguments, so you can't mix an expression with a plain 
scalar right now (as the current example would do):
   
   ```
   >>> pc.add(pc.field('a'), 1)
   ...
   TypeError: only other expressions allowed as arguments
   ```
   
   That might be something we can fix (I haven't looked into it again yet; I 
suppose I added this limitation in the initial PR for simplicity).
   
   For UDFs, there is the additional limitation that they aren't available as 
a `pc.` function. For this use case, maybe we should allow 
`pc.call_function` to accept expressions as well, 
so that you can do `pc.call_function("my_udf", [pc.field("a")])` instead of 
`pc.Expression._call("my_udf", [pc.field("a")])`?

