[GitHub] [arrow] edponce edited a comment on pull request #11023: ARROW-12712: [C++] String repeat kernel

GitBox Tue, 19 Oct 2021 19:48:35 -0700


edponce edited a comment on pull request #11023:
URL: https://github.com/apache/arrow/pull/11023#issuecomment-947272294



   [Validation of the repeat count occurs 
here](https://github.com/apache/arrow/pull/11023/files#diff-eb8300bc4dea7d1c46b2576b7dbd8e42b927ab7d42c031f4aecae892a72ee244R2903-R2908),
 but only when it is a Scalar value. If it is an Array, no validation 
occurs and any error is deferred to [output 
allocation](https://github.com/apache/arrow/pull/11023/files#diff-eb8300bc4dea7d1c46b2576b7dbd8e42b927ab7d42c031f4aecae892a72ee244R776).
   
   I understand we should be consistent, so should we validate repeat count for 
Arrays as well and accept the performance hit, or should we not validate at all 
and let the output allocation error out?
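
   To make the first option concrete, here is a minimal sketch of an up-front 
validation pass over per-row repeat counts. It assumes that "validation" means 
rejecting negative counts, and `FirstNegativeRepeat` is a hypothetical helper 
written for illustration, not code from this PR or from Arrow. The extra linear 
scan before any output is produced is the performance hit in question.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not part of Arrow): scans the per-row repeat counts
// once, before any output is allocated. Returns the index of the first
// negative count, or std::nullopt if every count is valid.
// Assumption: "valid" means non-negative.
std::optional<std::size_t> FirstNegativeRepeat(
    const std::vector<int64_t>& repeats) {
  for (std::size_t i = 0; i < repeats.size(); ++i) {
    if (repeats[i] < 0) {
      return i;
    }
  }
  return std::nullopt;
}
```

   A caller would run this once per input Array and report an error if it 
returns an index, before invoking the transform.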
   
   It is difficult to error out from inside the transform because it [does not 
return a `Status`, only the number of transformed/encoded 
bytes](https://github.com/apache/arrow/pull/11023/files#diff-eb8300bc4dea7d1c46b2576b7dbd8e42b927ab7d42c031f4aecae892a72ee244R793),
 in contrast to the arithmetic kernels, which have a `Status` parameter.
   
   Based on this reasoning, I am leaning towards either performing no 
validation at all or adding a `Status` parameter to the string transform and 
performing validation during processing.
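
   As a rough sketch of the second option, the transform could report the byte 
count through an output parameter so that its return value is free to carry an 
error. Everything below is hypothetical and only illustrates the shape of the 
change: the `Status` class is a minimal stand-in (not `arrow::Status`), and 
`RepeatTransform` with its signature is an assumption, not the transform in this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Minimal stand-in for a status type (hypothetical, not arrow::Status).
class Status {
 public:
  static Status OK() { return Status(true, ""); }
  static Status Invalid(std::string msg) {
    return Status(false, std::move(msg));
  }
  bool ok() const { return ok_; }
  const std::string& message() const { return message_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), message_(std::move(msg)) {}
  bool ok_;
  std::string message_;
};

// Hypothetical transform: validates the repeat count per value, writes
// `input` repeated `num_repeats` times into `*output`, and reports the
// number of bytes written through `*bytes_written`.
// Assumption: a negative repeat count is the only invalid input.
Status RepeatTransform(const std::string& input, int64_t num_repeats,
                       std::string* output, int64_t* bytes_written) {
  if (num_repeats < 0) {
    return Status::Invalid("Repeat count must be non-negative");
  }
  output->clear();
  output->reserve(input.size() * static_cast<std::size_t>(num_repeats));
  for (int64_t i = 0; i < num_repeats; ++i) {
    output->append(input);
  }
  *bytes_written = static_cast<int64_t>(output->size());
  return Status::OK();
}
```

   With this shape, Scalar and Array repeat counts go through the same check, 
at the cost of one branch per value.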


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
