Re: beam + scala + streamline

Georg Heiler Wed, 12 Apr 2017 22:40:05 -0700

Hi JB,
Indeed, I would like to get access to these lower-level features, as some are
still missing from Beam.

So in a DoFn I would not be able to, say, access the Spark session and
convert to a native Spark DataFrame.

How are such things currently handled in Beam, and how should they be?

Regards
Georg
Jean-Baptiste Onofré <[email protected]> wrote on Wed, 12 Apr 2017 at 21:45:

> Hi Georg,
>
> Personally, I haven't had any feedback about a performance difference
> between native Flink and the Beam Flink runner.
>
> For low-level runner access, do you mean you want to access Spark ML
> functions, etc. from a Beam pipeline? Even though you can always use a DoFn,
> you won't have access to all internal objects/APIs. Is that what you mean?
>
> Regards
> JB
>
> On 04/12/2017 08:33 PM, Georg Heiler wrote:
> > Hi JB,
> >
> > The overall look and feel of Streamline (Kafka, Storm, HBase and the
> > simplicity of the UI) was really compelling. As we are currently starting
> > to build a streaming platform, we are looking forward to such a tight
> > integration, but would rather use Flink or Spark (maybe via Beam) than
> > Storm in such a platform.
> >
> > Regarding Streamline performance:
> > someone told me that e.g. the Flink runner for Beam seems to be slower than
> > a native Flink job. Is this true? Have you observed such characteristics
> > for several runners?
> >
> > Regarding Streamline low-level runner access:
> > in case I want to use some low-level functionality (specific to a runner)
> > like ML, graph processing or the sql-tables API, is it possible to just drop
> > from the Beam API one level deeper to the actual runner and sort of mesh
> > Beam with runner-native code to integrate these features?
> >
> > Regards,
> > Georg
> >
> > Jean-Baptiste Onofré <[email protected]> wrote on Wed, 12 Apr 2017 at 20:04:
> >
> >     Hi Georg,
> >
> >     You can use the Java API from Scala, or you can use the Scio Scala DSL
> >     (this DSL uses the Beam Java SDK).
> >
> >     For Streamline, can you explain a bit? Streamline contains different
> >     parts: HBase, Kafka, the web frontend, ...
> >
> >     Using the provided IO, it should be possible to store the data and use
> >     Streamline on top of this data for analytics.
> >
> >     Regards
> >     JB
> >
> >     On 04/12/2017 07:57 PM, Georg Heiler wrote:
> >     > Hi,
> >     > I'm new to Beam and wonder which API is recommended for using Beam
> >     > from Scala. Would you recommend simply using the Java API from Scala or
> >     > https://github.com/spotify/scio?
> >     >
> >     > Are there any plans to support Beam in
> >     > https://github.com/hortonworks/streamline?
> >     >
> >     > regards,
> >     > Georg
> >
> >     --
> >     Jean-Baptiste Onofré
> >     [email protected]
> >     http://blog.nanthrax.net
> >     Talend - http://www.talend.com
> >
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
