jorisvandenbossche commented on PR #12590:
URL: https://github.com/apache/arrow/pull/12590#issuecomment-1089397801

   > I agree that the notion of "scalar function" is likely to be foreign to our users and we should make sure to define it very clearly in our documentation.
   > A scalar function is a function that generates one output value for every input row.
   
   I _think_ that I am familiar with our usage of the term "scalar function" in 
our compute kernels, but AFAIK that's not really how it translates here. 
   I expect a scalar kernel to be one that is indeed performed independently, 
element-wise, on the values (and thus has the parallelization characteristics 
that you describe), but it is still a function you can call on a full 
(chunked) array, creating a new (chunked) array of the same size. With the 
current `InputType.scalar`, however, you can only call the registered UDF on a 
scalar, not on a (chunked) array. That is where the usage of this term in the 
new API seems to conflict with its general usage in the compute kernels: if I 
want to actually register a "scalar kernel" UDF, I need 
to use 

