chenbaggio commented on issue #13701:
URL: https://github.com/apache/arrow/issues/13701#issuecomment-1196422212

   Hi all,

   Here is an example of why a function that slices binary data in byte
   units is needed.

   In a distributed database, data is placed according to a distribution
   policy, with a corresponding hash algorithm for each data type. Considering
   only string-like and binary types here: the hash algorithm needs to split a
   string-like or binary value into bytes for its calculation. For example, it
   takes bytes 1-4, casts them to an integer and shifts left by 16 bits, then
   takes bytes 5-6, casts them to an integer and combines that with the result
   of the previous step, and so on.

   The 'utf8_slice_codeunits' function partly meets this requirement when all
   the data is ASCII. But if the string contains Chinese characters, one
   character may occupy three bytes, so slicing from start 1 to end 3 yields
   three UTF-8 characters, which may take nine bytes. That does not match the
   hash algorithm, which needs only 3 bytes. So I suggest providing a function
   (rather than a cast) that takes the same arguments as
   'utf8_slice_codeunits'; it could be called 'binary_slice_byteunit'.
                       
   At 2022-07-27 11:23:12, "Eduardo Ponce Mojica" ***@***.***> wrote:
   
   Hi @chenbaggio, could you expand on what you refer to as a "byte unit". If 
you refer to char (signed integral), you should be able to use it (via casting) 
with the current int64_t type for start/end/step. Probably, I am 
misunderstanding your request, so could you give an example.
   
   Also, are there other language implementations that have a similar operation?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
