[ 
https://issues.apache.org/jira/browse/ARROW-13764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405274#comment-17405274
 ] 

David Li commented on ARROW-13764:
----------------------------------

And to be clear, this is about null keys (the groups), not null values themselves? I.e.,
{noformat}
keys   = [0,   0,    1,   1,   null]
values = ["a", null, "b", "b", "c"]{noformat}
should give
{noformat}
counts = {0: 2, 1: 1} {noformat}
instead of
{noformat}
counts = {0: 2, 1: 1, null: 1} {noformat}
?
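
For illustration only, the two candidate behaviours can be sketched in plain Python. This is not the Arrow kernel or its API: the function name {{grouped_count_distinct}} and the {{drop_null_keys}} flag are hypothetical, and {{None}} stands in for null. In both variants a null value still counts as one distinct value within its group, so the two outputs differ only in whether the null key gets its own entry.

```python
from collections import defaultdict


def grouped_count_distinct(keys, values, drop_null_keys):
    """Count distinct values per key (hypothetical helper, not an Arrow API)."""
    seen = defaultdict(set)
    for key, value in zip(keys, values):
        if key is None and drop_null_keys:
            continue  # skip rows whose group key is null
        seen[key].add(value)  # a null value still counts as a distinct value
    return {key: len(distinct) for key, distinct in seen.items()}


keys = [0, 0, 1, 1, None]
values = ["a", None, "b", "b", "c"]

# Null key dropped: only groups 0 and 1 remain.
without_null_group = grouped_count_distinct(keys, values, drop_null_keys=True)

# Null key kept as its own group.
with_null_group = grouped_count_distinct(keys, values, drop_null_keys=False)
```

The first call corresponds to the first {{counts}} result above and the second call to the result that includes the {{null}} group.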

> [C++] Implement ScalarAggregateOptions for count_distinct (grouped) 
> --------------------------------------------------------------------
>
>                 Key: ARROW-13764
>                 URL: https://issues.apache.org/jira/browse/ARROW-13764
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Nic Crane
>            Assignee: David Li
>            Priority: Major
>              Labels: kernel
>             Fix For: 6.0.0
>
>
> I'm writing the R bindings for the grouped {{count_distinct}} kernel, but the 
> current implementation counts nulls as their own group. To match the R 
> behaviour, I need to be able to specify whether or not to remove NA/NULL 
> values.
> Please could we have ScalarAggregateOptions implemented for 
> {{count_distinct}}?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
