Sorry for not getting back to you earlier; I was too busy and only read
your mail now.
Adding the context is not a simple optimization of the case you described.
In your case, you will get 30-second windows, with elements assigned to
those windows based on their timestamps. If you instead have one big daily
window and do 30-second speculative (early) firing of that window based on
processing time, each firing may see every element that has arrived for
that day so far, i.e. you progressively output a more refined result for
that 1-day window.
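
To make the difference concrete, here is a minimal sketch in plain Java. It does not use the Flink API; the class and method names are hypothetical, the aggregate is a simple sum, and for simplicity it assumes processing time equals event time and that the events arrive sorted by timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Flink.
public class WindowFiringSketch {
    static final long FIRE_INTERVAL_MS = 30_000L;

    // Each event is {timestampMs, value}.
    // Case 1: independent 30-second tumbling windows, keyed by window start.
    static Map<Long, Long> tumblingSums(long[][] events) {
        Map<Long, Long> sums = new TreeMap<>();
        for (long[] e : events) {
            long windowStart = e[0] - (e[0] % FIRE_INTERVAL_MS);
            sums.merge(windowStart, e[1], Long::sum);
        }
        return sums;
    }

    // Case 2: one big window with an early firing every 30 seconds.
    // Each firing emits the sum of everything seen so far in that window.
    // Assumes events are sorted by timestamp.
    static List<Long> earlyFirings(long[][] events, long untilMs) {
        List<Long> firings = new ArrayList<>();
        long running = 0;
        int next = 0;
        for (long fireAt = FIRE_INTERVAL_MS; fireAt <= untilMs; fireAt += FIRE_INTERVAL_MS) {
            while (next < events.length && events[next][0] < fireAt) {
                running += events[next][1];
                next++;
            }
            firings.add(running);
        }
        return firings;
    }

    public static void main(String[] args) {
        long[][] events = { {5_000, 1}, {20_000, 2}, {40_000, 3}, {70_000, 4} };
        System.out.println(tumblingSums(events));         // {0=3, 30000=3, 60000=4}
        System.out.println(earlyFirings(events, 90_000)); // [3, 6, 10]
    }
}
```

The tumbling windows each report only their own 30 seconds, while every early firing of the big window reports a refined result over everything seen so far.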
Does that make sense to you?
On Wed, 14 Sep 2016 at 18:21 AJ Heller <a...@drfloob.com> wrote:
> Thank you, Aljoscha! I look forward to reading the papers you mentioned.
> Regarding FLIP-2, are there any new use cases that a Window Function
> Context enables? If not, my understanding is that adding this context
> would be an optimization over what is currently possible but maybe
> inefficient. As an example of how I think this would work, instead of a
> "firing reason" context to let you differentiate between (e.g.)
> every-30-second early firings and a daily primary firing, I imagine you
> could split the stream, where one exclusively emits 30 second aggregates
> and the other exclusively emits daily, and deal with them separately.
> If that is the case, and it amounts to an optimization: have you
> considered whether the added complexity is worth the potential efficiency
> gain? Otherwise, if it amounts to more than a small optimization, I'd be
> very interested to understand what this change would enable; I currently
> don't see it. I am under time pressure to choose a viable project (the idea
> was to be solidified yesterday, actually), and I would very much like to
> work on this now if I can justify it. If not, I would still very much like
> to work on this, but the timing will have to be different.
> Again, thank you Aljoscha, and I apologize for the rushed nature of my
> -aj heller
> On Wed, Sep 14, 2016 at 1:19 AM, Aljoscha Krettek <aljos...@apache.org>
> > Hi AJ,
> > the idea for evictors initially came from IBM Infosphere Streams, if I'm
> > not mistaken:
> > http://www.ibm.com/support/knowledgecenter/SSCRJU_4.0.0/com.ibm.streams.dev.doc/doc/windowhandling.html
> > The first version of the windowing system used a combination of
> > triggers/evictors to do the windowing; this is described in Jonas Traub's
> > thesis: http://www.diva-portal.se/smash/get/diva2:861798/FULLTEXT01.pdf.
> > I'm quite skeptical about having support for Evictors in the first place.
> > They make computation inefficient because you always have to keep a list of
> > all elements and cannot incrementally aggregate using a reduce function.
> > Also, it is quite tricky to figure out how to do eviction based on
> > ProcessingTime with a good interface. If you have some ideas how this could
> > be improved I'm open to anything.
> > For now, I would suggest to focus on FLIP-2, since quite a number of
> > would be interested in having that. I would also not put any energy in
> > trying to figure out how the context can be shared between evictors and
> > other parts of the system. If we keep evictors I would like to keep the
> > and implementation completely separate from anything else that's going on
> > in the system.
> > On implementation, the context would probably be created by the
> > or by the InternalWindowFunction.
> > Cheers,
> > Aljoscha
> > On Mon, 12 Sep 2016 at 08:27 AJ Heller <a...@drfloob.com> wrote:
> > > Could you point me towards the inspiration for Evictors? Are there any
> > > papers, perhaps, that lay the groundwork for mutable windows like this?
> > >
> > > After much research this weekend, I found that Evictors are unique to
> > > Flink. Conceptually, it looks to me like Dataflow windows are
> > > Looking into other Dataflow implementations: I didn't find anything in
> > > either the Apache Beam SDK docs or the Google Cloud Dataflow API docs that
> > > mentions allowing you to remove elements from a window. I'm hesitant to
> > > tread new ground in mutability.
> > >
> > > What do you think about reimplementing Evictors as a kind of cyclic filter
> > > operation? Would it be possible? I believe this would fit into the Dataflow
> > > model better, but I'm still in the early stages of becoming familiar with
> > > Flink, and I haven't read the ABS paper [1] yet to know if there are
> > > snapshot implications. I also don't (yet) see why you couldn't optimize
> > > such a cyclic operation with mutable operations under the hood.
> > >
> > > [1]: http://arxiv.org/abs/1506.08603
> > >
> > >
> > > On Fri, Sep 9, 2016 at 11:46 AM, AJ Heller <a...@drfloob.com> wrote:
> > >
> > >> Thank you for offering your support, I'm excited to dig in!
> > >>
> > >> I have some work to do getting up to speed on the windowing internals.
> > >> And I still need to get my bearings on the Evictor changes; I plan to read
> > >> through the list archive and documents today. Vishnu, are your changes
> > >> already publicly viewable?
> > >>
> > >> Regarding the window modifications in FLIP-2, I see Vishnu that you've
> > >> suggested an interface for the EvictorContext object, and Aljoscha,
> > >> suggested an abstract Context class. Does it make sense for them to agree?
> > >> The other big difference I've seen in the signatures is whether the Window
> > >> is contained in the context or not.
> > >>
> > >> Have you considered modifying the signature of the methods to accept
> > >> extends Context>` or `<EC extends EvictorContext>`? At least in terms
> > >> FLIP-2, this would allow each process window function to define and
> > >> with its own context (without downcasting, anyway), and similarly in
> > >> future, there'd be less work in changing Context subclasses when new
> > >> abstract methods are added to Context.
> > >>
> > >> But I may be getting ahead of myself. Could you point me towards where
> > >> contexts are/would be created? I'm not clear on the ownership and lifecycle
> > >> of these objects yet.
> > >>
> > >
> > >