Hello,
I just took a quick look at the code, and what they have done is
interesting. The SCollection wrapper is worth a look, as are the examples,
which give an idea of their intentions. The fact that the code looks so
Spark-like (in the style of distributed collections) is quite interesting too:

    val (sc, args) = ContextAndArgs(cmdlineArgs)
    sc.textFile(args.getOrElse("input", ExampleData.KING_LEAR))
      .flatMap(_.split("[^a-zA-Z']+").filter(_.nonEmpty))
      .countByValue()
      .map(t => t._1 + ": " + t._2)
      .saveAsTextFile(args("output"))
    sc.close()

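
To make the comparison with ordinary collections concrete, here is a minimal sketch of the same word count written against an in-memory Seq, using only the Scala standard library (no Scio or Beam dependency). The names LocalWordCount and wordCount and the sample lines are made up for illustration, and groupBy plus a size count stands in for countByValue. It only shows how close the API shape is, not how Scio actually executes the pipeline.

```scala
// Illustrative only: a local, in-memory analogue of the pipeline above.
// LocalWordCount and wordCount are hypothetical names, not part of Scio.
object LocalWordCount {

  // Split each line into words, drop empty tokens, and count occurrences.
  def wordCount(lines: Seq[String]): Map[String, Long] =
    lines
      .flatMap(_.split("[^a-zA-Z']+").filter(_.nonEmpty))
      .groupBy(identity) // stand-in for countByValue()
      .map { case (word, occurrences) => word -> occurrences.size.toLong }

  def main(args: Array[String]): Unit = {
    // Made-up sample input instead of a text file.
    val lines = Seq("to be or not to be", "that is the question")
    wordCount(lines)
      .map(t => t._1 + ": " + t._2)
      .foreach(println) // Map iteration order is not guaranteed
  }
}
```

The flatMap and map steps are the same as in the Scio snippet. Only the source (a Seq instead of textFile), the counting step, and the sink (println instead of saveAsTextFile) differ.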
They have a REPL, and since the project is still young they don't yet
support all the advanced semantics of Beam. They also have a Hadoop file
Sink/Source. I think it would be nice to work with them, but if that is not
possible, I think it is at least worth coordinating some sharing, e.g. in
the Sink/Source area and other extensions.
Additionally, their code is under the Apache license.
On Thu, Mar 24, 2016 at 9:20 PM, Jean-Baptiste Onofré <[email protected]>
wrote:
> Hi Raghu,
>
> I agree: we should provide SDK in different languages, and DSLs for
> specific use cases.
>
> You got why I sent my proposal ;)
>
> Regards
> JB
>
>
> On 03/24/2016 07:14 PM, Raghu Angadi wrote:
>
>> I would love to see Scala API properly supported. I didn't know about
>> scio. Scala is such a natural fit for Dataflow API.
>>
>> I am not sure of the policy w.r.t where such packages would live in Beam
>> repo, but I personally would write my Dataflow applications in Scala. It is
>> probably already the case but my request would be: it should be as thin as
>> reasonably possible (that might make it a bit less like scalding/spark API
>> in some cases, which I think is a good compromise).
>>
>> On Thu, Mar 24, 2016 at 6:01 AM, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>>> Hi beamers,
>>>
>>> right now, Beam provides Java SDK.
>>>
>>> AFAIK, very soon, you should have the Python SDK ;)
>>>
>>> Spotify created a Scala API on top of Google Dataflow SDK:
>>>
>>> https://github.com/spotify/scio
>>>
>>> What do you think of asking if they want to donate this as Beam Scala
>>> SDK ?
>>> I planned to work on a Scala SDK, but as it seems there's already
>>> something, it makes sense to leverage it.
>>>
>>> Thoughts ?
>>>
>>> Regards
>>> JB
>>> --
>>> Jean-Baptiste Onofré
>>> [email protected]
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>>
>>