Amir,

I am fairly new to this, so I might be wrong, but in general, if your bolts
are stateless, they translate directly into ParDos, one to one. The spouts
are more complex: you would want to find an existing unbounded source, or
write one yourself.
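
As a rough illustration of the bolt-to-ParDo point, here is a small sketch in
plain Python. It uses neither Storm nor Beam, and every name in it
(split_sentence, par_do) is made up for the example. It only models the
contract the two share: the logic sees one element at a time, keeps nothing
between calls, and emits zero or more outputs.

```python
from typing import Callable, Iterable, Iterator, List


def split_sentence(sentence: str) -> Iterator[str]:
    """Hypothetical stateless bolt logic: one sentence in, zero or more words out."""
    for word in sentence.split():
        yield word.lower()


def par_do(fn: Callable[[str], Iterable[str]], elements: Iterable[str]) -> List[str]:
    """Toy stand-in for a ParDo: apply fn to each element independently, then flatten."""
    out: List[str] = []
    for element in elements:
        out.extend(fn(element))
    return out


# The empty sentence produces no output at all, which a ParDo also allows.
words = par_do(split_sentence, ["The quick fox", "", "the lazy dog"])
print(words)
```

Because split_sentence holds no state, it should not matter which worker
handles which element, and that is what makes the one-to-one translation work.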

I don't know whether Beam has any custom groupings; there don't appear to
be any. In general, though, a shuffle grouping becomes a no-op and a fields
grouping becomes a GroupByKey transform.
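
To show what the fields grouping case amounts to, here is a toy model in plain
Python. It is not Beam code, and group_by_key is a hypothetical helper written
for this example. A fields grouping routes all tuples with the same field
value to the same task; grouping by key gives the downstream step the same
guarantee, namely that it sees every value for a given key together.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_key(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Toy stand-in for GroupByKey: collect every value that shares a key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# Hypothetical (word, 1) tuples, as a fields grouping on "word" would route them.
pairs = [("fox", 1), ("dog", 1), ("fox", 1)]
grouped = group_by_key(pairs)

# A downstream step can now reduce each key's values in one place.
counts = {word: sum(ones) for word, ones in grouped.items()}
print(counts)
```

One caveat with this model: it sees the whole input at once. On an unbounded
input, Beam's GroupByKey needs a windowing or triggering strategy to know
when a group is complete.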

If your bolts are stateful, you need to find a way to produce an equivalent
result using windowing and aggregation.
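
For example, a bolt that keeps a running count per key in a local map could
be expressed as fixed windows plus a per-key count. The sketch below models
that idea in plain Python. It is not Beam code; the 60-second window size
and the windowed_counts helper are assumptions made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

WINDOW_SECONDS = 60  # assumed window size, purely illustrative


def windowed_counts(
    events: Iterable[Tuple[float, str]],
) -> Dict[Tuple[int, str], int]:
    """Count events per (window start, key) using fixed, non-overlapping windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for timestamp, key in events:
        # Assign the event to the window that contains its event time.
        window_start = int(timestamp // WINDOW_SECONDS) * WINDOW_SECONDS
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical (event time in seconds, key) records.
events = [(5.0, "fox"), (42.0, "fox"), (61.0, "fox"), (70.0, "dog")]
result = windowed_counts(events)
print(result)
```

The state that the bolt used to hold for the lifetime of the topology is
replaced here by one result per window and key, so whether this counts as
"equivalent" depends on whether per-window results are acceptable for your
use case.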

If you are using Trident, it is similar: all functions become ParDos. The
catch is that for the state you need to find an equivalent output ParDo, or
write one yourself.

- Bobby


On Wed, May 11, 2016 at 3:37 PM amir bahmanyari <[email protected]>
wrote:

> Is there a guideline on how to port existing Storm code to Beam? Thanks
>
>       From: Bobby Evans <[email protected]>
>  To: [email protected]
>  Sent: Wednesday, May 11, 2016 1:25 PM
>  Subject: Re: Beam and Storm
>
> Thanks for the quick response.  I'll keep digging.
>
> On Wed, May 11, 2016 at 3:14 PM Dan Halperin <[email protected]>
> wrote:
>
> > Hi Bobby,
> >
> > You are correct that unbounded sinks are currently implemented as
> > PTransforms that interact with an external service in a ParDo. Even
> > bounded sinks are implemented this way -- look at the internals of the
> > Write transform -- the sink class itself is just a convenience wrapper
> > for one common pattern.
> >
> > Thanks,
> > Dan
> >
> > On Wed, May 11, 2016 at 1:00 PM, Bobby Evans <[email protected]> wrote:
> >
> > > I have been trying to get my head wrapped around some of the internals
> > > of beam so I can come up with an architecture/plan for STORM-1757
> > > <https://issues.apache.org/jira/browse/STORM-1757> / BEAM-9
> > > <https://issues.apache.org/jira/browse/BEAM-9>.
> > >
> > > I see that there are Sources and Sinks.  Sources can be unbounded, but
> > > there appears to be no equivalent to an unbounded Sink.  What I do find
> > > are things like WriteToBigQuery which despite some internal complexity
> > > ends up being an idempotent transform producing a PDone.  Is this the
> > > intended way for data to be output from a streaming DAG?
> > >
> > > I will likely have more questions as I dig more into state
> > > checkpointing, etc.
> > >
> > > Thanks,
> > >
> > > Bobby Evans
> > >
> >
>
>
>
