Can you say more about how you treat the different sorts of inputs?

On Wed, Jan 24, 2018 at 11:26 AM, Tim Ross <[email protected]>
wrote:

> Yes, that is what the pipeline I came up with looks like. However, the next
> step in the pipeline is the new/expired logic. I have tried a variety of
> things, but none have gotten me close to what I want. Hence my question to
> this mailing list.
>
>
>
>
>
> Thanks,
>
> Tim
>
> *From: *Kenneth Knowles <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Wednesday, January 24, 2018 at 2:21 PM
>
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Windowing question
>
>
>
> A first approach would be to just not translate any of the new/expired
> logic. Beam does have the concept of expiring a window, though it is true
> that only particular transformations actually drop expired data. Have you
> tried something along the lines of this?
>
>
>
>     pipeline.begin()
>         .apply(FooIO.read(...))
>         .apply(Window.into(SlidingWindows.of(Duration.standardMinutes(5))
>             .every(Duration.standardSeconds(30))))
>         .apply(ParDo.of(new DoFn<>() { ... get into proper format ... }))
>         ... the rest of the logic ...
>
>
>
> Just a very vague sketch of what, actually, most pipelines look like.
>
>
>
> Kenn
>
>
>
> On Wed, Jan 24, 2018 at 11:07 AM, Tim Ross <[email protected]>
> wrote:
>
> I am trying to convert an existing Apache Storm bolt into an Apache Beam
> pipeline. The Storm bolt used sliding windows with a duration of 5 minutes
> and a period of 30 seconds. After doing some initial transforms to get data
> in the proper format, it would process all elements which were new or
> expired.
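
To pin down what "new" and "expired" mean here, the following is a minimal sketch in plain Java. It assumes "new" means present in the current window but not in the previous one, and "expired" means the reverse. It is not Beam API and not Storm's implementation; the class and method names (WindowDelta, newElements, expiredElements) are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: compares the contents of two consecutive windows.
final class WindowDelta {

    // Elements in the current window that were not in the previous one.
    public static <T> Set<T> newElements(Set<T> previous, Set<T> current) {
        Set<T> result = new LinkedHashSet<>(current);
        result.removeAll(previous);
        return result;
    }

    // Elements in the previous window that are no longer in the current one.
    public static <T> Set<T> expiredElements(Set<T> previous, Set<T> current) {
        Set<T> result = new LinkedHashSet<>(previous);
        result.removeAll(current);
        return result;
    }

    public static void main(String[] args) {
        Set<String> previous = new LinkedHashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> current = new LinkedHashSet<>(Arrays.asList("b", "c", "d"));
        System.out.println("new: " + newElements(previous, current));
        System.out.println("expired: " + expiredElements(previous, current));
    }
}
```

The sketch holds both windows in memory, which would not scale to a very large stream; it only illustrates the semantics being asked about.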
>
>
>
> Since Beam doesn’t have the concept of new and expired data in a window
> I’m trying to figure out how one would accomplish this.
>
>
>
>
>
> Thanks,
>
> Tim
>
> *From: *Kenneth Knowles <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Wednesday, January 24, 2018 at 1:45 PM
>
>
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Windowing question
>
>
>
> Generally, Beam will discard expired data for you (including state). Can
> you describe more? What is your windowing strategy? What are the edge
> triggers?
>
>
>
> On Wed, Jan 24, 2018 at 10:37 AM, Tim Ross <[email protected]>
> wrote:
>
> I am just trying to do certain processing on edge triggers, i.e. new or
> expired data, to reduce the overall processing of a very large stream.
>
>
>
> How would I go about doing that with state? As I understand it, state is
> tied to key and window.
>
>
>
> Thanks,
>
> Tim
>
> *From: *Kenneth Knowles <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Wednesday, January 24, 2018 at 1:25 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: Windowing question
>
>
>
> A little clarification: in Beam an element exists in a single window,
> mathematically speaking. So when you use SlidingWindows, for example, to
> assign multiple windows this "copies" the value for each window, and that
> is how you should think of it, from a calculation point of view. Under the
> hood, a compressed representation is often used, but not in all situations.
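
To make the "copies" point concrete, here is a minimal sketch in plain Java of how a single timestamp maps to several sliding windows. It assumes windows are half-open intervals [start, start + size) whose starts are aligned to multiples of the period, with no offset. The class and method names (SlidingWindowAssignment, windowStarts) are hypothetical; this is not Beam's API, only the arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the sliding windows that contain a timestamp.
final class SlidingWindowAssignment {

    // Returns the start (in ms) of every window [start, start + size)
    // that contains the timestamp, latest window first.
    public static List<Long> windowStarts(long timestampMillis,
                                          long sizeMillis,
                                          long periodMillis) {
        List<Long> starts = new ArrayList<>();
        // Latest period-aligned start at or before the timestamp.
        long lastStart = timestampMillis - Math.floorMod(timestampMillis, periodMillis);
        // Step back one period at a time while the window still covers the timestamp.
        for (long start = lastStart;
             start > timestampMillis - sizeMillis;
             start -= periodMillis) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long size = 5 * 60 * 1000L;   // 5 minute windows
        long period = 30 * 1000L;     // a new window every 30 seconds
        List<Long> starts = windowStarts(1_000_000L, size, period);
        System.out.println(starts.size() + " windows contain the element");
    }
}
```

With a 5 minute size and a 30 second period, each element lands in 10 windows, so any per-window calculation sees it 10 times.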
>
>
>
> Kenn
>
>
>
> On Wed, Jan 24, 2018 at 9:45 AM, Robert Bradshaw <[email protected]>
> wrote:
>
> No, Apache Beam doesn't offer this explicitly. You could accomplish it
> using State, but perhaps if you clarified what you were trying to
> accomplish by using these mechanisms there'd be another way to do the
> same thing.
>
>
> On Wed, Jan 24, 2018 at 7:03 AM, Tim Ross <[email protected]>
> wrote:
> > Is there anything comparable to Apache Storm’s Window.getNew and
> > Window.getExpired in Apache Beam?  How would I determine if an element is
> > new or expired in consecutive windows?
> >
> >
> >
> > Thanks,
> >
> > Tim
> >
> > This e-mail message and any attachments to it are intended only for the
> > named recipients and may contain legally privileged and/or confidential
> > information. If you are not one of the intended recipients, do not
> > duplicate or forward this e-mail message.
>
