vibhatha commented on PR #13687:
URL: https://github.com/apache/arrow/pull/13687#issuecomment-1225176371

   > I think part of the challenge with this documentation is that implementing
   > `affine` in pure-python is not a very compelling use case. I think the more
   > interesting case for UDFs is when we want to use some other library that does
   > efficient compute and is capable of working with Arrow data. For example,
   > numpy. Here is an example that exposes numpy's `gcd` function (greatest common
   > divisor) as an Arrow function:
   > 
   > ```
   > import numpy as np
   > 
   > import pyarrow as pa
   > import pyarrow.compute as pc
   > 
   > function_name = "numpy_gcd"
   > function_docs = {
   >        "summary": "Calculates the greatest common divisor",
   >        "description":
   >            "Given 'x' and 'y' find the greatest number that divides\n"
   >            "evenly into both x and y."
   > }
   > 
   > input_types = {
   >    "x" : pa.int64(),
   >    "y" : pa.int64()
   > }
   > 
   > output_type = pa.int64()
   > 
   > def to_np(val):
   >     if isinstance(val, pa.Scalar):
   >         return val.as_py()
   >     else:
   >         return np.array(val)
   > 
   > def gcd_numpy(ctx, x, y):
   >     np_x = to_np(x)
   >     np_y = to_np(y)
   >     return pa.array(np.gcd(np_x, np_y))
   > 
   > pc.register_scalar_function(gcd_numpy,
   >                             function_name,
   >                             function_docs,
   >                             input_types,
   >                             output_type)
   > 
   > print('gcd(27, 63) should be 9')
   > print(f'Answer={pc.call_function(function_name, [pa.scalar(27), pa.scalar(63)])}')
   > print()
   > print('gcd([27, 18], [54, 63]) should be [27, 9]')
   > print(f'Answer={pc.call_function(function_name, [pa.array([27, 18]), pa.array([54, 63])])}')
   > print()
   > print('gcd(27, [54, 18]) should be [27, 9]')
   > print(f'Answer={pc.call_function(function_name, [pa.scalar(27), pa.array([54, 18])])}')
   > ```
   > 
   > Notice the helper function `to_np`: it converts inputs of different
   > shapes (scalars and arrays) into something that numpy can work with.
   
   I see your point. I will update the example to use this.
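
For reference, here is a minimal sketch of why the `to_np` conversion is enough for all three call shapes in the example. It assumes only NumPy; the plain Python ints and `np.array` values are stand-ins for what `to_np` would return for a `pa.Scalar` and a `pa.Array` respectively, and the variable names are illustrative rather than part of the proposed documentation.

```python
import numpy as np

# Stand-ins for the values `to_np` hands to NumPy (assumption for this sketch):
# a pa.Scalar becomes a plain Python int, a pa.Array becomes an ndarray.
scalar_x, scalar_y = 27, 63
array_x = np.array([27, 18], dtype=np.int64)
array_y = np.array([54, 63], dtype=np.int64)

# scalar, scalar: NumPy returns a single integer value.
scalar_result = np.gcd(scalar_x, scalar_y)

# array, array: the gcd is computed element-wise.
array_result = np.gcd(array_x, array_y)

# scalar, array: the scalar is broadcast against every element.
mixed_result = np.gcd(scalar_x, np.array([54, 18], dtype=np.int64))

print(scalar_result)          # 9
print(array_result.tolist())  # [27, 9]
print(mixed_result.tolist())  # [27, 9]
```

Because `np.gcd` is a ufunc, broadcasting covers the mixed scalar and array case, so the UDF body does not need a separate branch for each combination of input shapes.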

