[ 
https://issues.apache.org/jira/browse/ARROW-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428407#comment-17428407
 ] 

Weston Pace commented on ARROW-14290:
-------------------------------------

If a user has a custom comparison function that they need to run on a column 
of strings, then they could just create a new kernel function.
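
As a rough illustration of that first option, here is a minimal sketch in 
plain C++. It does not use Arrow's actual kernel registration API. The names 
BetweenWith and CaseInsensitiveLess are hypothetical, and the ASCII 
case-insensitive ordering is only an assumed example of a "custom comparison":

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical comparator: ASCII case-insensitive "less than".
bool CaseInsensitiveLess(const std::string& a, const std::string& b) {
  return std::lexicographical_compare(
      a.begin(), a.end(), b.begin(), b.end(),
      [](unsigned char x, unsigned char y) {
        return std::tolower(x) < std::tolower(y);
      });
}

// Hypothetical stand-in for the body of a "between" kernel: for each value,
// report whether lo <= value <= hi under the supplied ordering.
template <typename Less>
std::vector<bool> BetweenWith(const std::vector<std::string>& values,
                              const std::string& lo, const std::string& hi,
                              Less less) {
  assert(!less(hi, lo));  // the bounds themselves must be ordered
  std::vector<bool> out;
  out.reserve(values.size());
  for (const auto& v : values) {
    // lo <= v is "not (v < lo)"; v <= hi is "not (hi < v)".
    out.push_back(!less(v, lo) && !less(hi, v));
  }
  return out;
}
```

With this comparator, "Banana" falls between "b" and "d", which it would not 
under a plain byte-wise comparison because 'B' sorts before 'b'.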

If a user has a column of data that always needs to be compared in some 
consistent and unique way (e.g. they have a column of alphanumeric data and the 
less/greater operations should apply a "natural sort" when comparing strings 
that contain numbers), then I believe the correct answer is an extension type.
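
To make "natural sort" concrete, here is a minimal sketch of such an ordering 
in plain C++. It shows only the comparison itself, not how an extension type 
would be defined or wired into Arrow. NaturalLess is a hypothetical name, and 
the handling of leading zeros (ignored) is an assumption of this sketch:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical "natural sort" ordering: runs of ASCII digits compare by
// numeric value, everything else compares byte by byte.
bool NaturalLess(const std::string& a, const std::string& b) {
  auto is_digit = [](char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
  };
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size()) {
    if (is_digit(a[i]) && is_digit(b[j])) {
      // Skip leading zeros, then find the end of each digit run.
      while (i < a.size() && a[i] == '0') ++i;
      while (j < b.size() && b[j] == '0') ++j;
      std::size_t ie = i, je = j;
      while (ie < a.size() && is_digit(a[ie])) ++ie;
      while (je < b.size() && is_digit(b[je])) ++je;
      // A longer run of significant digits is a larger number.
      if (ie - i != je - j) return ie - i < je - j;
      // Same length: the digit strings order the same way as the numbers.
      const int c = a.compare(i, ie - i, b, j, je - j);
      if (c != 0) return c < 0;
      i = ie;
      j = je;
    } else {
      if (a[i] != b[j]) {
        return static_cast<unsigned char>(a[i]) <
               static_cast<unsigned char>(b[j]);
      }
      ++i;
      ++j;
    }
  }
  // Everything compared so far was equal; the string with input left over
  // sorts after the one that ran out.
  return (a.size() - i) < (b.size() - j);
}
```

Under this ordering "item2" sorts before "item10", whereas a byte-wise 
comparison puts "item10" first.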

If a user needs to apply some kind of non-scalar comparison function (e.g. 
consuming an entire column at once, managing some kind of independent cache, 
converting data from column-major to row-major, etc.) then they can create a 
custom execution node.

So we have several extension points already.  I'm not sure what value there is 
in creating another one.  It is not clear to me how this would be different or 
more user friendly.  I agree that a single, concrete example is probably a good 
starting point.

> [C++] String comparison in between ternary kernel
> -------------------------------------------------
>
>                 Key: ARROW-14290
>                 URL: https://issues.apache.org/jira/browse/ARROW-14290
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Benson Muite
>            Assignee: Benson Muite
>            Priority: Minor
>
> String comparisons in C++ order strings by Unicode code point. This may not 
> be suitable for many language applications, for example when the text uses 
> characters beyond ASCII. Sorting algorithms often allow the use of custom 
> comparison functions. It would be helpful to allow this for the between 
> kernel as well. Initial work on the between kernel is being tracked in 
> https://issues.apache.org/jira/browse/ARROW-9843



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
