edponce edited a comment on pull request #11023:
URL: https://github.com/apache/arrow/pull/11023#issuecomment-947272294


   [Validation of the repeat count occurs 
here](https://github.com/apache/arrow/pull/11023/files#diff-eb8300bc4dea7d1c46b2576b7dbd8e42b927ab7d42c031f4aecae892a72ee244R2903-R2908),
 but only when it is a Scalar value. If it is an Array, no validation 
occurs and the error is delegated to [output 
allocation](https://github.com/apache/arrow/pull/11023/files#diff-eb8300bc4dea7d1c46b2576b7dbd8e42b927ab7d42c031f4aecae892a72ee244R776).
   
   I understand we should be consistent, so should we validate repeat count for 
Arrays as well and accept the performance hit, or should we not validate at all 
and let the output allocation error out?
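   
   To make the first option concrete, here is a minimal sketch of what 
validating an Array of repeat counts up front could look like. It is not the 
code in this PR: `SketchStatus` is a hypothetical stand-in for `arrow::Status`, 
`ValidateRepeatCounts` is a made-up name, and a `std::vector<int64_t>` stands in 
for the Arrow array. The extra pass over every element is the performance hit 
in question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for arrow::Status; NOT the real Arrow class.
struct SketchStatus {
  bool ok;
  std::string message;
  static SketchStatus OK() { return {true, ""}; }
  static SketchStatus Invalid(std::string msg) { return {false, std::move(msg)}; }
};

// Hypothetical helper: scans all repeat counts before any output is
// allocated and reports the first negative value it finds.
SketchStatus ValidateRepeatCounts(const std::vector<int64_t>& num_repeats) {
  for (std::size_t i = 0; i < num_repeats.size(); ++i) {
    if (num_repeats[i] < 0) {
      return SketchStatus::Invalid("negative repeat count " +
                                   std::to_string(num_repeats[i]) +
                                   " at index " + std::to_string(i));
    }
  }
  return SketchStatus::OK();
}
```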
   
   It is difficult to error out from inside the transform because it [does not 
return a `Status`, only the number of transformed/encoded 
bytes](https://github.com/apache/arrow/pull/11023/files#diff-eb8300bc4dea7d1c46b2576b7dbd8e42b927ab7d42c031f4aecae892a72ee244R793),
 in contrast to the arithmetic kernels, which have a `Status` parameter.
   
   Based on this, I am leaning towards either performing no validation at 
all or adding a `Status` parameter to the string transform and performing 
validation during processing.
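   
   As a rough illustration of the second option, the transform could keep 
returning the number of written bytes and report errors through an extra 
`Status` out-parameter, similar in spirit to the arithmetic kernels. This is 
only a sketch under assumptions: `SketchStatus` and `RepeatWithStatus` are 
hypothetical names, not the actual transform interface in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for arrow::Status; NOT the real Arrow class.
struct SketchStatus {
  bool ok = true;
  std::string message;
};

// Hypothetical transform: still returns the number of bytes written, but a
// negative repeat count is reported through `st` during processing instead
// of surfacing later as an output allocation failure.
int64_t RepeatWithStatus(const uint8_t* input, int64_t input_len,
                         int64_t num_repeats, uint8_t* output,
                         SketchStatus* st) {
  assert(st != nullptr);
  if (num_repeats < 0) {
    st->ok = false;
    st->message = "negative repeat count " + std::to_string(num_repeats);
    return 0;  // nothing written
  }
  uint8_t* out = output;
  for (int64_t i = 0; i < num_repeats; ++i) {
    std::memcpy(out, input, static_cast<std::size_t>(input_len));
    out += input_len;
  }
  return out - output;
}
```

   The caller would check `st` after each element, so the Scalar and Array 
cases share one validation path at the cost of a wider transform signature.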


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

