[ https://issues.apache.org/jira/browse/ARROW-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17234609#comment-17234609 ]
Joris Van den Bossche commented on ARROW-10641:
-----------------------------------------------
In the R world, the closest equivalents are probably {{recode}} from dplyr
(https://dplyr.tidyverse.org/reference/recode.html, although dplyr marks the
function as "questioned") or the older {{mapvalues}} from plyr
(https://www.rdocumentation.org/packages/plyr/versions/1.8.6/topics/mapvalues,
although plyr is a retired package).
> [C++] A "replace" or "map" kernel to replace values in array based on mapping
> -----------------------------------------------------------------------------
>
> Key: ARROW-10641
> URL: https://issues.apache.org/jira/browse/ARROW-10641
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Joris Van den Bossche
> Priority: Major
>
> A "replace" or "map" kernel to replace values in an array based on a mapping.
> This would be similar to the pandas {{Series.replace}} (or {{Series.map}})
> method. A small illustration of what is meant:
> {code}
> In [41]: s = pd.Series(["Yes", "Y", "No", "N"])
> In [42]: s
> Out[42]:
> 0    Yes
> 1      Y
> 2     No
> 3      N
> dtype: object
> In [43]: s.replace({"Y": "Yes", "N": "No"})
> Out[43]:
> 0    Yes
> 1    Yes
> 2     No
> 3     No
> dtype: object
> {code}
> Note: in pandas, the difference between "replace" and "map" is that replace
> only replaces a value if it is present in the mapping and leaves all other
> values unchanged, while map replaces every value in the input array with the
> corresponding value in the mapping and returns null if the value is not
> present in the mapping. This difference in behaviour could be controlled
> with a keyword.
> Note: this is different from ARROW-10306, which is about string replacement
> _within_ array elements (replacing a substring in each string element of the
> array), while this issue is about replacing full elements of the array.
> cc [~maartenbreddels]
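To make the "replace" versus "map" distinction from the description concrete, here is a minimal pure-Python sketch of the two behaviours behind a single keyword. The function name {{replace_values}} and the keyword {{null_if_missing}} are hypothetical and only used for illustration; they are not an existing or proposed Arrow API. The sketch also assumes that nulls in the input stay null in both modes.

```python
from typing import Hashable, Mapping, Optional, Sequence


def replace_values(
    values: Sequence[Optional[Hashable]],
    mapping: Mapping[Hashable, Hashable],
    null_if_missing: bool = False,
) -> list:
    """Replace whole elements of `values` according to `mapping`.

    null_if_missing=False gives "replace" semantics: values that are not
    keys of the mapping pass through unchanged.
    null_if_missing=True gives "map" semantics: values that are not keys
    of the mapping become None (null).
    """
    result = []
    for value in values:
        if value is None:
            # Assumption: a null input stays null in both modes.
            result.append(None)
        elif value in mapping:
            result.append(mapping[value])
        elif null_if_missing:
            result.append(None)
        else:
            result.append(value)
    return result


values = ["Yes", "Y", "No", "N"]
mapping = {"Y": "Yes", "N": "No"}

replaced = replace_values(values, mapping)
mapped = replace_values(values, mapping, null_if_missing=True)
print(replaced)  # ['Yes', 'Yes', 'No', 'No']
print(mapped)    # [None, 'Yes', None, 'No']
```

With the default, the result matches the {{Series.replace}} output shown in the description. With {{null_if_missing=True}}, "Yes" and "No" are not keys of the mapping and therefore come back as null, which mirrors what the description says about {{Series.map}}.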
--
This message was sent by Atlassian Jira
(v8.3.4#803005)