Eduardo Ponce created ARROW-13570:
-------------------------------------
Summary: [C++][Compute] Additional scalar ASCII kernels can reuse
original offsets buffer
Key: ARROW-13570
URL: https://issues.apache.org/jira/browse/ARROW-13570
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Eduardo Ponce
Fix For: 6.0.0
Some ASCII scalar string kernels can reuse the input's offsets buffer, so the
offsets do not need to be preallocated in the output (these kernels use
*MemAllocation::NO_PREALLOCATE* during registration). Currently, only kernels
that apply a transformation to each character independently via
[StringDataTransform|https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_string.cc#L590-L631]
support the no-preallocation policy. However, there are additional string
kernels that change neither the length nor the offsets of the input strings,
but apply different transforms to different characters of each string.
This issue should extend *StringDataTransform*, or create a similar helper, to
take multiple input transforms so that additional scalar ASCII kernels (e.g.,
_ascii_capitalize_, _ascii_reverse_, _ascii_title_) can also use the
*MemAllocation::NO_PREALLOCATE* policy.
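
As a rough illustration of the idea, the following standalone sketch shows why
length-preserving transforms can share the input offsets. It does not use the
Arrow API: StringColumn, MakeColumn, ValueAt, ApplyLengthPreserving,
ReverseAscii and CapitalizeAscii are hypothetical names made up for this
example, and the capitalize semantics (first character upper-cased, the rest
lower-cased) are an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a string array: a shared offsets buffer
// (one entry per string, plus one) and a contiguous character buffer.
struct StringColumn {
  std::shared_ptr<const std::vector<int32_t>> offsets;
  std::string data;
};

// Builds a column from individual strings (helper for this sketch only).
StringColumn MakeColumn(const std::vector<std::string>& values) {
  auto offsets = std::make_shared<std::vector<int32_t>>();
  offsets->push_back(0);
  StringColumn col;
  for (const auto& v : values) {
    col.data += v;
    offsets->push_back(static_cast<int32_t>(col.data.size()));
  }
  col.offsets = offsets;
  return col;
}

// Returns a copy of the i-th string of the column.
std::string ValueAt(const StringColumn& col, std::size_t i) {
  const auto& offs = *col.offsets;
  return col.data.substr(offs[i], offs[i + 1] - offs[i]);
}

// Applies transform(input, length, output) to each string. The transform must
// write exactly `length` bytes, so the output can share the input offsets
// instead of allocating and filling a new offsets buffer.
template <typename Transform>
StringColumn ApplyLengthPreserving(const StringColumn& in, Transform transform) {
  assert(in.offsets && !in.offsets->empty());
  StringColumn out;
  out.offsets = in.offsets;  // reuse: no new offsets buffer
  out.data.resize(in.data.size());
  const auto& offs = *in.offsets;
  for (std::size_t i = 0; i + 1 < offs.size(); ++i) {
    const int32_t begin = offs[i];
    const int32_t length = offs[i + 1] - begin;
    transform(in.data.data() + begin, length, &out.data[0] + begin);
  }
  return out;
}

inline char AsciiUpper(char c) {
  return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

inline char AsciiLower(char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Position-dependent transform: byte i of the output is byte (length-1-i)
// of the input.
void ReverseAscii(const char* in, int32_t length, char* out) {
  for (int32_t i = 0; i < length; ++i) out[i] = in[length - 1 - i];
}

// Two different per-character transforms in one pass: upper-case the first
// character, lower-case the rest (assumed semantics).
void CapitalizeAscii(const char* in, int32_t length, char* out) {
  for (int32_t i = 0; i < length; ++i) {
    out[i] = (i == 0) ? AsciiUpper(in[i]) : AsciiLower(in[i]);
  }
}
```

With an input column holding "hello", "" and "aRROW",
ApplyLengthPreserving(in, ReverseAscii) and
ApplyLengthPreserving(in, CapitalizeAscii) both return columns whose offsets
pointer is the same object as the input's; only the character buffer is newly
allocated.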