[
https://issues.apache.org/jira/browse/ARROW-13064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363669#comment-17363669
]
Joris Van den Bossche commented on ARROW-13064:
-----------------------------------------------
{quote}Maybe instead we should implement this to operate on listarrays, like
NumPy's np.where or np.select.
{quote}
Looking back at the discussion in ARROW-10640, there is also a mention of
{{np.choose}}. Basically, {{select}} uses (multiple) boolean conditions to
choose between the arrays (so more like the SQL CASE), while {{choose}} uses a
single array of indices into the choice arrays, so you get something like
{{choose(array[int], a0, a1, ... an)}}, where the first argument holds the
indices referring to the arrays aX. Our current "if_else" kernel is then
basically a special case of this if you cast the boolean to an int.
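To make the difference concrete, here is a small NumPy sketch of the two behaviours and of the "if_else as a special case" remark. The array names and values are made up for illustration; only {{np.select}}, {{np.choose}} and {{np.where}} are real NumPy functions.

```python
import numpy as np

# Illustrative choice arrays (hypothetical data).
a0 = np.array([10, 20, 30, 40])
a1 = np.array([11, 21, 31, 41])
a2 = np.array([12, 22, 32, 42])

# select: several boolean conditions, the first true one wins (like SQL CASE).
x = np.array([1, 5, 9, 3])
conditions = [x < 2, x < 6]
selected = np.select(conditions, [a0, a1], default=-1)

# choose: one integer array that indexes into the choice arrays.
indices = np.array([0, 1, 2, 1])
chosen = np.choose(indices, [a0, a1, a2])

# if_else as a special case of choose: cast the boolean to an int.
# False -> 0 and True -> 1, so the "else" array has to come first.
cond = np.array([True, False, True, False])
if_else = np.where(cond, a0, a1)
via_choose = np.choose(cond.astype(int), [a1, a0])

print(selected, chosen, if_else, via_choose)
```

Note that with {{choose}} the "else" branch is choice 0, which is the reverse of the usual if_else argument order.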
Also, I don't think {{np.select}} works on "list arrays". It works on a
"list of arrays", which is something different from our ListArray (the
separate arrays in the list are still separate memory-contiguous arrays). So in
C++ terms it's more like {{select(vector<Array[bool]> conditions,
vector<Array[type]> values) -> Array[type]}}.
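As a rough illustration of that signature, the sketch below models the semantics in plain Python, with each array represented as an ordinary list. The function is hypothetical and is not an existing Arrow kernel. It assumes that the first true condition wins (as in {{np.select}}) and that rows where no condition matches become null ({{None}} here), which is only one possible choice when the {{else}} is omitted.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select(
    conditions: Sequence[Sequence[bool]],
    values: Sequence[Sequence[T]],
) -> List[Optional[T]]:
    """Hypothetical reference semantics: the first true condition wins."""
    if len(conditions) != len(values):
        raise ValueError("need one values array per conditions array")
    if not conditions:
        return []
    length = len(conditions[0])
    if any(len(arr) != length for arr in (*conditions, *values)):
        raise ValueError("all arrays must have the same length")

    result: List[Optional[T]] = []
    for row in range(length):
        picked: Optional[T] = None  # assumption: null when nothing matches
        for cond, vals in zip(conditions, values):
            if cond[row]:
                picked = vals[row]
                break
        result.append(picked)
    return result


# Two separate condition arrays and two separate value arrays,
# i.e. a "list of arrays", not a single ListArray.
conditions = [[True, False, False, False], [True, True, False, True]]
values = [[10, 20, 30, 40], [11, 21, 31, 41]]
out = select(conditions, values)
print(out)
```

Each entry of {{conditions}} and {{values}} stays its own array here, which is the point of the distinction from ListArray.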
> [C++] Add a general "if, ifelse, ..., else" kernel
> --------------------------------------------------
>
> Key: ARROW-13064
> URL: https://issues.apache.org/jira/browse/ARROW-13064
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Ian Cook
> Priority: Major
>
> ARROW-10640 added a ternary {{if_else}} kernel. Add another kernel that
> extends this concept to an arbitrary number of conditions and associated
> results, like a vectorized {{if-ifelse-...-else}} with an arbitrary number of
> {{ifelse}} and with the {{else}} optional. This is like a SQL {{CASE}}
> statement.
> How best to achieve this is not obvious. To enable SQL-style uses, it would
> be most efficient to implement this as a variadic kernel where the
> even-number arguments (0, 2, ...) are the arrays of boolean conditions, the
> odd-number arguments (1, 3, ...) are the corresponding arrays of results, and
> the final argument is the {{else}} result. But I'm not sure if this is
> practical. Maybe instead we should implement this to operate on listarrays,
> like NumPy's
> {{[np.where|https://numpy.org/doc/stable/reference/generated/numpy.where.html]}}
> or
> {{[np.select|https://numpy.org/doc/stable/reference/generated/numpy.select.html]}}.