+Neville and Rafal for their take ;-)

Excited to see this out. Multiple community-driven SDKs are right in line
with our goals for Beam.


On Thu, Mar 24, 2016 at 3:04 PM, Ismaël Mejía <[email protected]> wrote:

> Addendum: actually the semantic model support is not as far off as I said
> before (I hadn't finished reading and I thought they didn't support
> sessions), and looking at the git history the project is not so young
> either, and it is quite active.
>
> On Thu, Mar 24, 2016 at 10:52 PM, Ismaël Mejía <[email protected]> wrote:
>
> > Hello,
> >
> > I just checked the code a bit and what they have done is interesting. The
> > SCollection wrapper is worth a look, as well as the examples to get an
> > idea of their intentions. The fact that the code looks so Spark-like
> > (like distributed collections) is also quite interesting:
> >
> >     val (sc, args) = ContextAndArgs(cmdlineArgs)
> >     sc.textFile(args.getOrElse("input", ExampleData.KING_LEAR))
> >       .flatMap(_.split("[^a-zA-Z']+").filter(_.nonEmpty))
> >       .countByValue()
> >       .map(t => t._1 + ": " + t._2)
> >       .saveAsTextFile(args("output"))
> >     sc.close()
> >
> > They have a REPL, and since the project is a bit young they don't support
> > all the advanced semantics of Beam. They also have a Hadoop File
> > Sink/Source. I think it would be nice to work with them, but if that is
> > not possible, I think it is at least worth coordinating some sharing,
> > e.g. in the Sink/Source area and other extensions.
> >
> > Additionally, their code is also under the Apache license.
> >
> >
> > On Thu, Mar 24, 2016 at 9:20 PM, Jean-Baptiste Onofré <[email protected]>
> > wrote:
> >
> >> Hi Raghu,
> >>
> >> I agree: we should provide SDKs in different languages, and DSLs for
> >> specific use cases.
> >>
> >> You got why I sent my proposal  ;)
> >>
> >> Regards
> >> JB
> >>
> >>
> >> On 03/24/2016 07:14 PM, Raghu Angadi wrote:
> >>
> >>> I would love to see a Scala API properly supported. I didn't know about
> >>> scio. Scala is such a natural fit for the Dataflow API.
> >>>
> >>> I am not sure of the policy w.r.t. where such packages would live in
> >>> the Beam repo, but I personally would write my Dataflow applications in
> >>> Scala. It is probably already the case, but my request would be: it
> >>> should be as thin as reasonably possible (that might make it a bit less
> >>> like the scalding/spark API in some cases, which I think is a good
> >>> compromise).
> >>>
> >>> On Thu, Mar 24, 2016 at 6:01 AM, Jean-Baptiste Onofré <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi beamers,
> >>>>
> >>>> Right now, Beam provides a Java SDK.
> >>>>
> >>>> AFAIK, very soon, you should have the Python SDK ;)
> >>>>
> >>>> Spotify created a Scala API on top of the Google Dataflow SDK:
> >>>>
> >>>> https://github.com/spotify/scio
> >>>>
> >>>> What do you think of asking if they want to donate this as the Beam
> >>>> Scala SDK? I planned to work on a Scala SDK, but as it seems there's
> >>>> already something, it makes sense to leverage it.
> >>>>
> >>>> Thoughts ?
> >>>>
> >>>> Regards
> >>>> JB
> >>>> --
> >>>> Jean-Baptiste Onofré
> >>>> [email protected]
> >>>> http://blog.nanthrax.net
> >>>> Talend - http://www.talend.com
> >>>>
> >>>>
> >>>
> >> --
> >> Jean-Baptiste Onofré
> >> [email protected]
> >> http://blog.nanthrax.net
> >> Talend - http://www.talend.com
> >>
> >
> >
>
