Hi Nirmalaya,

my reply was based on me misreading your original post, thinking you had a
batch of data, not a stream. I see that the apply method can also take a
reducer the pre-aggregates your data before passing it to the window
function. I suspect that pre-aggregation runs locally just like a combiner
would, but I'm really not sure about it. We should have more feedback on
this regard.

On Tue, Feb 16, 2016 at 2:19 AM, Nirmalya Sengupta <
sengupta.nirma...@gmail.com> wrote:

> Hello Stefano <stefano.bagh...@radicalbit.io>
>
> Sorry for the late reply. Many thanks for taking effort to write and share
> an example code snippet.
>
> I have been playing with the countWindow behaviour for some weeks now and
> I am generally aware of the functionality of countWindowAll(). For my
> useCase, where I _have to observe_ the entire stream as it founts in, using
> countWindowAll() is probably the most obvious solution. This is what you
> recommend too. However, because this is going to use 1 thread only (or 1
> node only in a cluster), I was thinking about ways to make use of the
> 'distributedness' of the framework. Hence, my question.
>
> Your reply leads to me read and think a bit more. If I have to use
> parallelism to achieve what I want to achieve, I think managing a
> ValueState of my own is possibly the solution. If you have any other
> thoughts, please share.
>
> From your  earlier response: '... you can still enjoy a high level of
> parallelism up until the last operator by using a combiner, which is
> basically a reducer that operates locally ...'. Could you elaborate this a
> bit, whenever you have time?
>
> -- Nirmalya
>
> --
> Software Technologist
> http://www.linkedin.com/in/nirmalyasengupta
> "If you have built castles in the air, your work need not be lost. That is
> where they should be.
> Now put the foundation under them."
>



-- 
BR,
Stefano Baghino

Software Engineer @ Radicalbit

Reply via email to