[
https://issues.apache.org/jira/browse/ARROW-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325496#comment-17325496
]
Eduardo Ponce commented on ARROW-11508:
---------------------------------------
A possible solution is to establish rules for converting each generic datum to
a "base/operational" form and to expose them through a method
_to_base/operational_form()_. When an operation's inputs have different
non-primitive types and provide the conversion method, convert them to their
common "base/operational" form and then apply the operation. The following
aspects need to be considered:
* Each generic datum should implement _to_base/operational_form()_
* What type should the result be? (Only applicable to operations that return a
result of a similar type).
* Transforming to the "base/operational" form may have performance
implications, so the invoking code should be aware of this.
* How to implement the rules of conversion?
* For cases where the operation is applied to independent data, there is no
direct method of inferring which datum to promote or demote.
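
To make the idea concrete, here is a minimal Python sketch of the conversion approach. It is not Arrow code: the classes _PlainArray_ and _DictionaryArray_, the method name _to_base_form()_ and the helper _apply_binary()_ are all hypothetical names chosen for illustration. It also assumes one particular answer to the result-type question, namely that the result is always returned in base form.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PlainArray:
    """Hypothetical stand-in for a datum that is already in base form."""
    values: List[str]

    def to_base_form(self) -> "PlainArray":
        # Already in base form, so no conversion (and no copy) is needed.
        return self


@dataclass
class DictionaryArray:
    """Hypothetical stand-in for a dictionary-encoded datum."""
    indices: List[int]
    dictionary: List[str]

    def to_base_form(self) -> PlainArray:
        # Decoding materializes every value; this is the performance
        # cost the invoking code should be aware of.
        return PlainArray([self.dictionary[i] for i in self.indices])


def apply_binary(op: Callable[[str, str], str], left, right) -> PlainArray:
    """Convert both inputs to base form, then apply op element-wise."""
    lhs, rhs = left.to_base_form(), right.to_base_form()
    if len(lhs.values) != len(rhs.values):
        raise ValueError("inputs must have equal length")
    # Assumption: the result is always produced in base form.
    return PlainArray([op(a, b) for a, b in zip(lhs.values, rhs.values)])


plain = PlainArray(["a", "b", "a"])
encoded = DictionaryArray(indices=[0, 1, 0], dictionary=["x", "y"])
result = apply_binary(lambda a, b: a + b, plain, encoded)
print(result.values)
```

In this sketch every input is converted unconditionally, which sidesteps the question of which datum to promote or demote, at the cost of always paying for the decode.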
> [C++][Compute] Add support for generic conversions to Function::DispatchBest
> ----------------------------------------------------------------------------
>
> Key: ARROW-11508
> URL: https://issues.apache.org/jira/browse/ARROW-11508
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Affects Versions: 3.0.0
> Reporter: Ben Kietzman
> Assignee: Ben Kietzman
> Priority: Major
>
> ARROW-8919 adds support for execution with implicit casts to any function
> which overrides DispatchBest, allowing functions to specify conversions which
> make sense in that function's context. For example "add" can promote its
> arguments if their types disagree. By contrast, some conversions are more
> generic and could be applicable to any function's arguments. For example if
> any datum is dictionary encoded, a kernel which accepts the decoded type
> should be usable with an implicit decoding cast:
> {code:python}
> import pyarrow as pa
> import pyarrow.compute as pc
> arr = pa.array(['hello '] * 10)
> enc = arr.dictionary_encode()
> # result should not depend on encoding:
> assert pc.ascii_is_alnum(arr) == pc.ascii_is_alnum(enc)
> # currently raises:
> # ArrowNotImplementedError: Function ascii_is_alnum has no kernel matching
> # input types (array[dictionary<values=string, indices=int32, ordered=0>])
> {code}
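
The generic fallback described in the issue can be sketched in plain Python as follows. This is only an illustration of the dispatch idea and not how Function::DispatchBest is implemented: the registry _KERNELS_, the type-name strings and the helpers _dispatch_best()_, _decode()_ and _call()_ are hypothetical. The sketch first looks for a kernel matching the input type exactly and, if none exists and the input is dictionary encoded, retries with the decoded value type.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical kernel registry: (function name, input type name) -> kernel.
KERNELS: Dict[Tuple[str, str], Callable[[List[str]], List[bool]]] = {
    ("ascii_is_alnum", "string"):
        lambda values: [v.isascii() and v.isalnum() for v in values],
}

DICT_PREFIX = "dictionary<"


def decode(indices: List[int], dictionary: List[str]) -> List[str]:
    """Expand dictionary-encoded data into plain values."""
    return [dictionary[i] for i in indices]


def dispatch_best(name: str, type_name: str):
    """Return (kernel, needs_decode) for the given function and input type."""
    if (name, type_name) in KERNELS:
        return KERNELS[(name, type_name)], False
    # Generic rule: a dictionary<T> input may use a kernel that accepts T.
    if type_name.startswith(DICT_PREFIX) and type_name.endswith(">"):
        value_type = type_name[len(DICT_PREFIX):-1]
        if (name, value_type) in KERNELS:
            return KERNELS[(name, value_type)], True
    raise NotImplementedError(
        f"Function {name} has no kernel matching input type {type_name}")


def call(name: str, type_name: str, data):
    kernel, needs_decode = dispatch_best(name, type_name)
    values = decode(*data) if needs_decode else data
    return kernel(values)


plain_result = call("ascii_is_alnum", "string", ["hello ", "abc1", "hello "])
encoded_result = call("ascii_is_alnum", "dictionary<string>",
                      ([0, 1, 0], ["hello ", "abc1"]))
# The result does not depend on the encoding.
assert plain_result == encoded_result
```

Because the fallback lives in the dispatcher rather than in each function, any kernel registered for the decoded type becomes usable with dictionary-encoded input without per-function changes.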