[
https://issues.apache.org/jira/browse/ARROW-14050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Li resolved ARROW-14050.
------------------------------
Resolution: Fixed
Issue resolved by pull request 11199
[https://github.com/apache/arrow/pull/11199]
> [C++] tdigest, quantile return empty arrays when nulls not skipped
> ------------------------------------------------------------------
>
> Key: ARROW-14050
> URL: https://issues.apache.org/jira/browse/ARROW-14050
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Affects Versions: 6.0.0
> Reporter: Ian Cook
> Assignee: David Li
> Priority: Critical
> Labels: kernel, pull-request-available
> Fix For: 6.0.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> This is a C++ kernel issue, but the examples below use the R bindings to call
> the C++ kernels.
> The aggregate functions {{tdigest}} and {{quantile}} return arrays with the
> same length as the option value {{q}}:
> {code:r}
> call_function("tdigest", Array$create(c(1, 2, 3, NA)),
>   options = list(q = c(0.1, 0.9), skip_nulls = TRUE))
> ## Array
> ## <double>
> ## [
> ## 1,
> ## 3
> ## ]{code}
> But when the data includes {{null}} values and the option {{skip_nulls}} is
> set to {{false}}, these kernels instead return zero-length arrays:
> {code:r}
> call_function("tdigest", Array$create(c(1, 2, 3, NA)),
>   options = list(q = c(0.1, 0.9), skip_nulls = FALSE))
> ## Array
> ## <double>
> ## []{code}
> This is difficult to handle in bindings because it requires special-case code
> for an array that comes back empty. It would be much better if the returned
> array in this situation had the same length as {{q}}, with {{null}} in every
> position:
> {code:r}
> ## Array
> ## <double>
> ## [
> ## null,
> ## null
> ## ] {code}
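
The special-case code that bindings would otherwise need can be sketched roughly as follows. This is a minimal illustration in plain Python, not Arrow code: `pad_empty_quantiles` is a hypothetical helper, and plain lists stand in for Arrow arrays (with `None` standing in for {{null}}).

```python
from typing import List, Optional, Sequence


def pad_empty_quantiles(
    result: Sequence[Optional[float]], q: Sequence[float]
) -> List[Optional[float]]:
    """Return one entry per requested quantile.

    Hypothetical helper: `result` stands in for the values a kernel returned
    and `q` for the requested quantiles. Neither name is an Arrow API.
    """
    if len(result) == 0 and len(q) > 0:
        # Behavior described in this issue: nulls present and
        # skip_nulls = false produced a zero-length result.
        # Replace it with one null per requested quantile.
        return [None] * len(q)
    # Otherwise the result already has one entry per quantile.
    return list(result)


# Zero-length result for q = c(0.1, 0.9) becomes two nulls.
padded = pad_empty_quantiles([], [0.1, 0.9])
# A normal result passes through unchanged.
unchanged = pad_empty_quantiles([1.0, 3.0], [0.1, 0.9])
```

Having the kernel return the null-filled array itself, as requested above, would make a wrapper like this unnecessary in every binding.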