Hi,
Did you try exploding the arrays, then doing the aggregation/count, and
applying a UDF at the end to add the 0 values?
In my experience, working directly on arrays is usually a bad idea.
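To make the suggestion concrete, here is a rough plain-Python sketch of the three steps (explode, count, fill in zeros). It is not Spark code; in Spark the first two steps would roughly correspond to `explode` followed by a `groupBy(...).count()`. The record shape (a timestamp plus an array of ids), the `known_ids` list and both helper names are assumptions made for illustration, since the actual schema is not shown in this thread.

```python
from collections import Counter

# Hypothetical records: (timestamp, array of ids). The real schema is not
# shown in the thread, so this shape is an assumption.
records = [
    (1, [10, 20]),
    (2, [10]),
    (3, [30, 10]),
]


def explode(rows):
    """Turn each (timestamp, [ids]) row into one (timestamp, id) row per id."""
    for ts, ids in rows:
        for id_ in ids:
            yield ts, id_


def count_ids(rows, known_ids):
    """Count ids over the exploded rows, then add 0 for ids never seen."""
    counts = Counter(id_ for _, id_ in explode(rows))
    # The final "add the 0 values" step: ids absent from the data get a 0.
    return {id_: counts.get(id_, 0) for id_ in known_ids}


result = count_ids(records, known_ids=[10, 20, 30, 40])
print(result)
```

The sliding time window is left out here on purpose; the sketch only shows why counting on exploded rows is simpler than counting inside the arrays.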
sakag writes:
> Hi all,
>
> We have a rather interesting use case, and are struggling to come up with an
>
An interesting puzzle indeed.
What is your measure of "that scales"? Does not fail, does not spill,
does not need a huge amount of memory / disk, is O(N), processes X
records per second per core?
Enrico
On 11.03.20 at 16:59, sakag wrote:
Hi all,
We have a rather interesting use case, and are struggling to come up with an
approach that scales. Reaching out to seek your expert opinion/feedback and
tips.
What we are trying to do is to find the count of numerical ids over a
sliding time window where each of our data records has