Addendum: the semantic model support is actually not as far behind as I said before (I hadn't finished reading, and I thought they didn't support sessions). Also, judging by the git history, the project is not so young either, and it is quite active.

On Thu, Mar 24, 2016 at 10:52 PM, Ismaël Mejía <[email protected]> wrote:

> Hello,
>
> I just checked the code a bit and what they have done is interesting. The
> SCollection wrapper is worth a look, as well as the examples, to get an idea
> of their intentions. The fact that the code looks so Spark-like
> (distributed-collections style) is quite interesting too:
>
> val (sc, args) = ContextAndArgs(cmdlineArgs)
> sc.textFile(args.getOrElse("input", ExampleData.KING_LEAR))
>   .flatMap(_.split("[^a-zA-Z']+").filter(_.nonEmpty))
>   .countByValue()
>   .map(t => t._1 + ": " + t._2)
>   .saveAsTextFile(args("output"))
> sc.close()
>
> They have a REPL, and since the project is a bit young they don't support
> all the advanced semantics of Beam. They also have a Hadoop File
> Sink/Source. I think it would be nice to work with them, but if that is not
> possible, at least I think it is worth coordinating some sharing, e.g. in
> the Sink/Source area + other extensions.
>
> Additionally, their code is also under the Apache license.
>
>
> On Thu, Mar 24, 2016 at 9:20 PM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
>> Hi Raghu,
>>
>> I agree: we should provide SDK in different languages, and DSLs for
>> specific use cases.
>>
>> You got why I sent my proposal ;)
>>
>> Regards
>> JB
>>
>>
>> On 03/24/2016 07:14 PM, Raghu Angadi wrote:
>>
>>> I would love to see Scala API properly supported. I didn't know about
>>> scio. Scala is such a natural fit for Dataflow API.
>>>
>>> I am not sure of the policy w.r.t. where such packages would live in the
>>> Beam repo, but I personally would write my Dataflow applications in Scala.
>>> It is probably already the case, but my request would be: it should be as
>>> thin as reasonably possible (that might make it a bit less like the
>>> scalding/spark API in some cases, which I think is a good compromise).
>>>
>>> On Thu, Mar 24, 2016 at 6:01 AM, Jean-Baptiste Onofré <[email protected]>
>>> wrote:
>>>
>>>> Hi beamers,
>>>>
>>>> right now, Beam provides Java SDK.
>>>>
>>>> AFAIK, very soon, you should have the Python SDK ;)
>>>>
>>>> Spotify created a Scala API on top of Google Dataflow SDK:
>>>>
>>>> https://github.com/spotify/scio
>>>>
>>>> What do you think of asking if they want to donate this as Beam Scala
>>>> SDK ?
>>>> I planned to work on a Scala SDK, but as it seems there's already
>>>> something, it makes sense to leverage it.
>>>>
>>>> Thoughts ?
>>>>
>>>> Regards
>>>> JB
>>>> --
>>>> Jean-Baptiste Onofré
>>>> [email protected]
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>>>
>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
