zeroshade commented on issue #38458:
URL: https://github.com/apache/arrow/issues/38458#issuecomment-1785476185

   So, at minimum this should be fixed to be consistent between the `Binary` and `String` types; the inconsistency was not intended.
   
   That all said: I believe the intent of the `ValueOffset` method is to provide the starting offset of the value at index `i`, which means the `Binary` type is correct in having the `>=` there: there is no value at an index equal to the length of the slice/array to get the *starting* offset for. To get the end of the buffer you'd use `ValueLen`, or `ValueOffsets` and grab the last offset, or something like that.
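   To make those semantics concrete, here's a minimal self-contained sketch of the variable-length binary layout (illustrative Go, not the `arrow/array` package's actual types or signatures): `n` values share one data buffer plus `n+1` offsets, value `i` spans `offsets[i]` to `offsets[i+1]`, so index `n` has no *starting* offset of its own and `i >= Len()` is the right bounds check.

```go
package main

import "fmt"

// binaryColumn mirrors Arrow's variable-length binary layout for
// illustration only: one data buffer plus len+1 offsets.
type binaryColumn struct {
	data    []byte
	offsets []int32 // len(offsets) == number of values + 1
}

func (b *binaryColumn) Len() int { return len(b.offsets) - 1 }

// ValueOffset returns the *starting* offset of value i. Index Len() has
// no value to start at, hence the i >= Len() check.
func (b *binaryColumn) ValueOffset(i int) int32 {
	if i < 0 || i >= b.Len() {
		panic("index out of range")
	}
	return b.offsets[i]
}

// ValueLen avoids any offset arithmetic at the call site.
func (b *binaryColumn) ValueLen(i int) int {
	return int(b.offsets[i+1] - b.offsets[i])
}

// Value returns the bytes of value i without copying.
func (b *binaryColumn) Value(i int) []byte {
	return b.data[b.offsets[i]:b.offsets[i+1]]
}

func main() {
	col := &binaryColumn{
		data:    []byte("foobarbaz"),
		offsets: []int32{0, 3, 6, 9},
	}
	fmt.Println(col.Len())            // 3
	fmt.Println(col.ValueOffset(2))   // 6
	fmt.Println(col.ValueLen(2))      // 3
	fmt.Println(string(col.Value(2))) // baz
	// The end of the last value comes from the final offset, i.e. the
	// last entry of the offsets slice, not from ValueOffset(Len()).
	fmt.Println(col.offsets[col.Len()]) // 9
}
```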
   
   If you're trying to get the element size, there's already a `ValueLen` 
method which does that for you. If you're trying to get the actual value, the 
`Value` method returns a slice with the value without copying. 
   
   So, I believe the optimal solution here would be to expose the `ValueLen` method in the `BinaryLike` interface, so that you do not need to calculate these byte widths manually and can instead just call that method.
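   A rough sketch of what that could look like from a caller's perspective (hypothetical interface and type names, not the library's actual `BinaryLike` definition): once `ValueLen` is on the interface, generic code never touches raw offsets.

```go
package main

import "fmt"

// BinaryLike sketches the proposed interface shape: element sizes come
// from ValueLen, not from offset arithmetic at the call site.
type BinaryLike interface {
	Len() int
	ValueLen(i int) int
	Value(i int) []byte
}

// stringColumn is a toy implementation backing the interface with a
// 64-bit offsets slice (as a LargeString-style array would).
type stringColumn struct {
	data    []byte
	offsets []int64
}

func (s *stringColumn) Len() int           { return len(s.offsets) - 1 }
func (s *stringColumn) ValueLen(i int) int { return int(s.offsets[i+1] - s.offsets[i]) }
func (s *stringColumn) Value(i int) []byte { return s.data[s.offsets[i]:s.offsets[i+1]] }

// totalBytes works against any BinaryLike: no byte-width math needed.
func totalBytes(col BinaryLike) int {
	n := 0
	for i := 0; i < col.Len(); i++ {
		n += col.ValueLen(i)
	}
	return n
}

func main() {
	col := &stringColumn{data: []byte("hihello"), offsets: []int64{0, 2, 7}}
	fmt.Println(totalBytes(col))        // 7
	fmt.Println(string(col.Value(1)))   // hello
}
```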
   
   What do you both think?
   
   > I also find it unsettling that ValueOffset/ValueOffset64 take in int row 
index parameters whereas slicing methods take in int64
   
   This was done for consistency: most of the existing row-index params in the array types use `int` rather than an explicit `int64` (likely for convenience). In theory this would only affect someone running on a 32-bit machine with enough rows that the count would not fit in an int32.
