icexelloss commented on code in PR #35514:
URL: https://github.com/apache/arrow/pull/35514#discussion_r1196685513


##########
python/pyarrow/tests/test_substrait.py:
##########
@@ -605,3 +605,151 @@ def table_provider(names, schema):
     expected = pa.Table.from_pydict({"out": [1, 2, 3]})
 
     assert res_tb == expected
+
+
+def test_aggregate_udf_basic(varargs_agg_func_fixture):

Review Comment:
   This test case matches how we would use it internally. The end user would 
define something like
   ```
   def foo(v: pd.Series):
       return np.nanmean(v)
   
   summarize(table, agg=foo, columns=['v'], by='time')
   ```
   
   We would then wrap `foo` in a function with the signature Acero expects (a 
varargs UDF)
   ```
   def get_acero_func(func):
       # Acero calls the UDF with a context plus one Arrow array per input column.
       def acero_func(ctx, *args):
           return pa.scalar(func(*[arg.to_pandas() for arg in args]))

       return acero_func
   ```
   We would also register the wrapped function with Acero on the fly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
