+Neville and Rafal for their take ;-)

Excited to see this out. Multiple community-driven SDKs are right in line with our goals for Beam.
On Thu, Mar 24, 2016 at 3:04 PM, Ismaël Mejía <[email protected]> wrote:

> Addendum: actually the semantic model support is not as far away as I said
> before (I hadn't finished reading and I thought they didn't support
> sessions), and looking at the git history the project is not so young
> either, and it is quite active.
>
> On Thu, Mar 24, 2016 at 10:52 PM, Ismaël Mejía <[email protected]> wrote:
>
> > Hello,
> >
> > I just looked at the code a bit and what they have done is interesting.
> > The SCollection wrapper is worth a look, as are the examples, to get an
> > idea of their intentions. The fact that the code looks so Spark-ish
> > (distributed-collections-like) is quite interesting too:
> >
> > val (sc, args) = ContextAndArgs(cmdlineArgs)
> > sc.textFile(args.getOrElse("input", ExampleData.KING_LEAR))
> >   .flatMap(_.split("[^a-zA-Z']+").filter(_.nonEmpty))
> >   .countByValue()
> >   .map(t => t._1 + ": " + t._2)
> >   .saveAsTextFile(args("output"))
> > sc.close()
> >
> > They have a REPL, and since the project is a bit young they don't support
> > all the advanced semantics of Beam. They also have a Hadoop File
> > Sink/Source. I think it would be nice to work with them, but if that is
> > not possible, I think it is at least worth coordinating some sharing,
> > e.g. in the Sink/Source area + other extensions.
> >
> > Additionally, their code is also under the Apache license.
> >
> >
> > On Thu, Mar 24, 2016 at 9:20 PM, Jean-Baptiste Onofré <[email protected]>
> > wrote:
> >
> >> Hi Raghu,
> >>
> >> I agree: we should provide SDKs in different languages, and DSLs for
> >> specific use cases.
> >>
> >> You got why I sent my proposal ;)
> >>
> >> Regards
> >> JB
> >>
> >>
> >> On 03/24/2016 07:14 PM, Raghu Angadi wrote:
> >>
> >>> I would love to see a Scala API properly supported. I didn't know about
> >>> scio. Scala is such a natural fit for the Dataflow API.
> >>>
> >>> I am not sure of the policy w.r.t. where such packages would live in
> >>> the Beam repo, but I personally would write my Dataflow applications in
> >>> Scala. It is probably already the case, but my request would be: it
> >>> should be as thin as reasonably possible (that might make it a bit less
> >>> like the scalding/spark API in some cases, which I think is a good
> >>> compromise).
> >>>
> >>> On Thu, Mar 24, 2016 at 6:01 AM, Jean-Baptiste Onofré <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi beamers,
> >>>>
> >>>> right now, Beam provides a Java SDK.
> >>>>
> >>>> AFAIK, very soon, you should have the Python SDK ;)
> >>>>
> >>>> Spotify created a Scala API on top of the Google Dataflow SDK:
> >>>>
> >>>> https://github.com/spotify/scio
> >>>>
> >>>> What do you think of asking if they want to donate this as the Beam
> >>>> Scala SDK?
> >>>> I planned to work on a Scala SDK, but as it seems there's already
> >>>> something, it makes sense to leverage it.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> Regards
> >>>> JB
> >>>> --
> >>>> Jean-Baptiste Onofré
> >>>> [email protected]
> >>>> http://blog.nanthrax.net
> >>>> Talend - http://www.talend.com
