Thanks Andrew. The link gives a cool method to calculate variance
incrementally. I think the problem is that it's computationally too expensive:
it cannot leverage vectorization, and it needs three divisions for a single
data point (see the sketch below).
The equation you provided is great in practice for calculating stdev of one
array, but it doesn't address the issue of combining stdev from multiple arrays.
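
For concreteness, here is a minimal sketch of one common incremental
formulation (Welford-style; the method in the link may differ in detail).
Each element's update reads the mean written by the previous iteration, which
is the loop-carried dependency that blocks vectorization:

#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of an incremental (Welford-style) variance update. Note the
// loop-carried dependency: each iteration reads the mean written by the
// previous one, and there is a division per data point.
double incremental_stddev(const std::vector<double>& values) {
  double mean = 0.0;
  double m2 = 0.0;  // sum of squared deviations from the running mean
  std::size_t n = 0;
  for (double x : values) {
    ++n;
    double delta = x - mean;
    mean += delta / n;  // depends on the previous iteration's mean
    m2 += delta * (x - mean);
  }
  return n > 1 ? std::sqrt(m2 / (n - 1)) : 0.0;
}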

On 9/16/20 6:25 PM, Andrew Lamb wrote:
Perhaps you can rewrite the functions in terms of other kernels that can be
merged -- for example, something like the following:

stddev(x) = sqrt((sum(x*x) - sum(x)*sum(x)/count(x)) / (count(x) - 1))

(loosely translated from
https://math.stackexchange.com/questions/102978/incremental-computation-of-standard-deviation
)
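
To make that concrete, a minimal sketch of the idea (MomentState and the
function names are hypothetical, not the actual Arrow API): sum, sum of
squares, and count each merge by plain addition, and only the final stddev
needs the combined totals.

#include <cmath>
#include <vector>

// Hypothetical per-chunk state: all three fields merge by addition.
struct MomentState {
  double sum = 0.0;
  double sum_sq = 0.0;
  long long count = 0;
};

MomentState consume(const std::vector<double>& chunk) {
  MomentState s;
  for (double x : chunk) {
    s.sum += x;
    s.sum_sq += x * x;
  }
  s.count = static_cast<long long>(chunk.size());
  return s;
}

MomentState merge(MomentState a, const MomentState& b) {
  a.sum += b.sum;
  a.sum_sq += b.sum_sq;
  a.count += b.count;
  return a;
}

// stddev(x) = sqrt((sum(x*x) - sum(x)*sum(x)/count(x)) / (count(x) - 1))
double finalize(const MomentState& s) {
  double n = static_cast<double>(s.count);
  return std::sqrt((s.sum_sq - s.sum * s.sum / n) / (n - 1));
}

One caveat worth noting: this textbook sum-of-squares formula can suffer
catastrophic cancellation when the mean is large relative to the variance,
which is the numerical problem incremental methods are designed to avoid.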

On Wed, Sep 16, 2020 at 6:12 AM Yibo Cai <yibo....@arm.com> wrote:

Hi,

I have a question about aggregate kernel implementation. Any help is
appreciated.

Aggregate kernels implement "consume" and "merge" interfaces. For a
chunked array, "consume" is called on each array to get a temporary
aggregated result, which is then "merged" with the previously consumed
result. For associative operations like min/max/sum, this pattern is
convenient. We can easily "merge" the min/max/sum of two arrays, e.g.,
sum([array_a, array_b]) = sum(array_a) + sum(array_b).
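
For illustration, a minimal sketch of that pattern for sum (function names
hypothetical, not the actual Arrow interfaces):

#include <vector>

// Sketch of consume/merge for an associative aggregate:
// merge_sum(consume_sum(array_a), consume_sum(array_b)) gives the same
// result as summing the concatenation of the two arrays.
double consume_sum(const std::vector<double>& chunk) {
  double total = 0.0;
  for (double x : chunk) total += x;
  return total;
}

double merge_sum(double a, double b) { return a + b; }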

But I wonder what the best approach is for operations like
stdev/percentile. The results of these operations cannot be easily "merged";
we have to walk through all the chunks to get the result. For these
operations, it looks like "consume" must copy the input array and do all the
calculation at "finalize" time. Or maybe we shouldn't expect chunked arrays
to be supported for them.
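
To illustrate the buffering approach just described, a minimal sketch for
percentile (names hypothetical, not actual Arrow code): "consume" and
"merge" only collect values, and the whole computation is deferred to
"finalize".

#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch of the "copy at consume, compute at finalize" approach:
// merge is just concatenation, and the full calculation happens once
// all chunks have been seen.
struct PercentileState {
  std::vector<double> values;

  void consume(const std::vector<double>& chunk) {
    values.insert(values.end(), chunk.begin(), chunk.end());
  }

  void merge(const PercentileState& other) {
    values.insert(values.end(), other.values.begin(), other.values.end());
  }

  // q in [0, 1]; assumes at least one value was consumed.
  double finalize(double q) {
    std::size_t k = static_cast<std::size_t>(q * (values.size() - 1));
    std::nth_element(values.begin(), values.begin() + k, values.end());
    return values[k];
  }
};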

Yibo
