[ 
https://issues.apache.org/jira/browse/ARROW-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduardo Ponce updated ARROW-13570:
----------------------------------
    Description: 
Some ASCII scalar string kernels can reuse the input's offsets buffer, so the 
offsets are not preallocated in the output (these kernels use 
*MemAllocation::NO_PREALLOCATE* during registration). Currently, only kernels 
that transform each character independently via 
[StringDataTransform|https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_string.cc#L590-L631]
 support the no-preallocation policy. However, other string kernels also 
preserve the length (and therefore the offsets) of the input string, even 
though their scalar transforms depend on neighboring characters.

This issue should extend *StringDataTransform* (or create a variant of it) to 
take multiple input transforms, so that additional scalar ASCII kernels (e.g., 
_ascii_title_) can support the *MemAllocation::NO_PREALLOCATE* policy.
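
To illustrate the idea, here is a minimal standalone sketch of a transform 
that depends on the neighboring character yet preserves the length of every 
string, so the offsets buffer can be shared with the input. It does not use 
Arrow's classes: StringColumn, AsciiTitleSketch and the helper functions are 
hypothetical names, and the title-casing rule (uppercase a character when the 
previous one is not an ASCII letter, lowercase it otherwise) is a simplifying 
assumption that may differ from what _ascii_title_ actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a string array: a shared offsets buffer plus one
// contiguous data buffer. This is not Arrow's ArrayData.
struct StringColumn {
  std::shared_ptr<const std::vector<int32_t>> offsets;
  std::string data;
};

inline bool IsAsciiAlpha(char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
inline char AsciiToUpper(char c) {
  return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}
inline char AsciiToLower(char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Title-cases every string in the column. Each output byte depends on the
// previous input byte, but the byte count never changes, so the offsets
// buffer is shared with the input instead of being allocated again.
StringColumn AsciiTitleSketch(const StringColumn& in) {
  assert(in.offsets != nullptr);
  StringColumn out;
  out.offsets = in.offsets;         // reuse: no new offsets buffer
  out.data.resize(in.data.size());  // only the data buffer is allocated
  const std::vector<int32_t>& off = *in.offsets;
  for (std::size_t i = 0; i + 1 < off.size(); ++i) {
    bool prev_alpha = false;  // state resets at each string boundary
    for (int32_t j = off[i]; j < off[i + 1]; ++j) {
      const char c = in.data[static_cast<std::size_t>(j)];
      out.data[static_cast<std::size_t>(j)] =
          prev_alpha ? AsciiToLower(c) : AsciiToUpper(c);
      prev_alpha = IsAsciiAlpha(c);
    }
  }
  return out;
}
```

The state carried between characters (prev_alpha here) is the part a purely 
per-character transform cannot express. It has to be reset at each offset so 
that one string never influences the next.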

  was:
Some ASCII scalar string kernels are able to reuse the original offsets buffer, 
so they are not preallocated in the output (use *MemAllocation::NO_PREALLOCATE* 
during registration). Currently, only kernels that apply a transformation to 
each character independently via 
[StringDataTransform|https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_string.cc#L590-L631]
 support the no preallocation policy. But there are additional string kernels 
that do not modify the length (nor offsets) of the input string but apply 
different transforms throughout the characters.

This issue should extend/create *StringDataTransform* to take multiple input 
transforms in order to support *MemAllocation::NO_PREALLOCATE* policy for 
additional scalar ASCII kernels (e.g., _ascii_title_).


> [C++][Compute] Additional scalar ASCII kernels can reuse original offsets 
> buffer
> --------------------------------------------------------------------------------
>
>                 Key: ARROW-13570
>                 URL: https://issues.apache.org/jira/browse/ARROW-13570
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Eduardo Ponce
>            Priority: Major
>             Fix For: 6.0.0
>



