[GitHub] [arrow] jorisvandenbossche commented on pull request #7308: ARROW-6978: [R] Add bindings for sum and mean compute kernels

GitBox Tue, 02 Jun 2020 03:04:07 -0700


jorisvandenbossche commented on pull request #7308:
URL: https://github.com/apache/arrow/pull/7308#issuecomment-637434728



   A null-skipping option like this would also be useful for pandas (the 
default there aligns with what arrow already does, but pandas also exposes it 
as an option).
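
   For reference, a minimal sketch of the pandas behaviour I mean, using the 
`skipna` keyword of `Series.sum` (the example values are made up for 
illustration):

```python
import math

import pandas as pd

# Illustrative data: one missing value in a float Series.
s = pd.Series([1.0, float("nan"), 3.0])

# Default (skipna=True): missing values are ignored in the reduction.
skipped = s.sum()

# skipna=False: a single missing value makes the result NaN.
propagated = s.sum(skipna=False)

print(skipped, propagated)
```

   So the default skips missing values, and `skipna=False` is the opt-in for 
propagating them.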
   
   > Summing integers seems to promote to return int64 if given int32 (I didn't 
try with smaller ints), even when overflow is not a danger (I was adding 
numbers 1 to 5). It would be nice if it returned the same type it got unless it 
has to go bigger to avoid overflow.
   
   This would mean that the output type starts to depend on the *values* of the 
input, and not just on its type (which is something to avoid, I think?). 
   But does the kernel machinery currently allow specifying the output type, 
or is it always implicitly inferred from the input types? This is how numpy 
solves this: by default it also uses int64 for the result, but it has an 
option to specify the output dtype (so you could do `np.array([1, 2, 3], 
dtype="int8").sum(dtype="int8")` to preserve the int8 type for the sum).
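
   A small sketch of that numpy behaviour, in case it helps the discussion. 
The default result type is the platform integer (int64 on most 64-bit 
setups, but that is platform-dependent), and the second array is only there 
to show that a narrow output dtype leaves overflow handling to the caller:

```python
import numpy as np

arr = np.array([1, 2, 3], dtype="int8")

# Default: small signed ints are accumulated in the platform integer
# (np.int_), independent of the actual values.
default_sum = arr.sum()

# Explicit dtype: the result keeps the requested type.
narrow_sum = arr.sum(dtype="int8")

# The cost of forcing a narrow type: 100 + 100 does not fit in int8,
# so the result wraps around instead of being widened.
wrapped = np.array([100, 100], dtype="int8").sum(dtype="int8")

print(default_sum.dtype, narrow_sum.dtype, wrapped)
```

   In both cases the output type is determined by the input type plus an 
explicit option, never by the values.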


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
