Hi Neville,

I don't know how up to date this roadmap is, but see "Apache Beam:
Technical Vision":
https://docs.google.com/presentation/d/1E9seGPB_VXtY_KZP4HngDPTbsu5RVZFFaTlwEYa88Zw/edit#slide=id.g108d3a202f_0_287

And for more details:
https://docs.google.com/document/d/1UyAeugHxZmVlQ5cEWo_eOPgXNQA1oD-rGooWOSwAqh8/edit#heading=h.ywcvt1a9xcx1

On 26 March 2016 at 06:53, Jean-Baptiste Onofré <[email protected]> wrote:

> Hi Neville,
>
> that's great news, and the timeline is perfect !
>
> We are working on some refactoring & polishing on our side (Runner API,
> etc). So, one or two months is not a big deal !
>
> Let me know if I can help in any way.
>
> Thanks,
> Regards
> JB
>
>
> On 03/25/2016 08:03 PM, Neville Li wrote:
>
>> Thanks guys. Yes we'd love to donate the project but would also like to
>> polish the API a bit first, like in the next month or two. What's the
>> timeline like for BEAM and related projects?
>>
>> Will also read the technical docs and follow up later.
>>
>> On Fri, Mar 25, 2016, 12:55 AM Ismaël Mejía <[email protected]> wrote:
>>
>>> Hello Neville,
>>>
>>> First congratulations guys, excellent job / API, the scalding touches are
>>> pretty neat (as well as the Tap abstraction). I am also new to Beam, so
>>> believe me, you guys already know more than me.
>>>
>>> In my comment I mentioned sessions, referring to session windows, but that
>>> was my mistake since I just took a quick look at your code and initially
>>> didn't see them. Anyway, if you are interested in the model there is a good
>>> description of the current capabilities of the runners on the website:
>>>
>>> https://beam.incubator.apache.org/capability-matrix/
>>>
>>> And the new additions to the model are openly discussed on the mailing
>>> list and in the technical docs (e.g. lateness):
>>>
>>> https://goo.gl/ps8twC
>>>
>>> -Ismaël
>>>
>>> On Fri, Mar 25, 2016 at 8:36 AM, Neville Li <[email protected]>
>>> wrote:
>>>
>>>> Thanks guys for the interest. I'm really excited about all the feedback
>>>> from the community.
>>>>
>>>> A little background: we developed Scio to bring Google Cloud Dataflow
>>>> closer to the Scalding/Spark ecosystem that our developers are familiar
>>>> with while bringing some missing pieces to the table (type safe
>>>> BigQuery, HDFS, REPL to name a few).
>>>>
>>>> I have to admit that I'm pretty new to BEAM development but would love
>>>> to get feedback and advice on how to bring Scio closer to the BEAM feature
>>>> set and semantics. Scio doesn't have to live with the BEAM code base just
>>>> yet (we're still under heavy development) but I'd like to see it as a de
>>>> facto Scala API endorsed by the BEAM community.
>>>>
>>>> @Ismaël: I'm curious what's this session thing you're referring to?
>>>>
>>>> On Thu, Mar 24, 2016 at 3:40 PM Frances Perry <[email protected]>
>>>> wrote:
>>>>
>>>>> +Neville and Rafal for their take ;-)
>>>>>
>>>>> Excited to see this out. Multiple community driven SDKs are right in
>>>>> line with our goals for Beam.
>>>>>
>>>>>
>>>>> On Thu, Mar 24, 2016 at 3:04 PM, Ismaël Mejía <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Addendum: actually the semantic model support is not as far away as I
>>>>>> said before (I hadn't finished reading and I thought they didn't
>>>>>> support sessions), and looking at the git history the project is not so
>>>>>> young either and it is quite active.
>>>>>>
>>>>>> On Thu, Mar 24, 2016 at 10:52 PM, Ismaël Mejía <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I just checked the code a bit and what they have done is interesting.
>>>>>>> The SCollection wrapper is worth a look, as well as the examples to get
>>>>>>> an idea of their intentions. The fact that the code looks so Spark-ish
>>>>>>> (distributed collections like) is something that is quite interesting
>>>>>>> too:
>>>>>>>
>>>>>>>      val (sc, args) = ContextAndArgs(cmdlineArgs)
>>>>>>>      sc.textFile(args.getOrElse("input", ExampleData.KING_LEAR))
>>>>>>>        .flatMap(_.split("[^a-zA-Z']+").filter(_.nonEmpty))
>>>>>>>        .countByValue()
>>>>>>>        .map(t => t._1 + ": " + t._2)
>>>>>>>        .saveAsTextFile(args("output"))
>>>>>>>      sc.close()
>>>>>>>
>>>>>>> They have a REPL, and since the project is a bit young they don't
>>>>>>> support all the advanced semantics of Beam. They also have a Hadoop File
>>>>>>> Sink/Source. I think it would be nice to work with them, but if that is
>>>>>>> not possible, at least I think it is worth coordinating some sharing,
>>>>>>> e.g. in the Sink/Source area + other extensions.
>>>>>>>
>>>>>>> Additionally, their code is also under the Apache license.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Mar 24, 2016 at 9:20 PM, Jean-Baptiste Onofré <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Raghu,
>>>>>>>>
>>>>>>>> I agree: we should provide SDKs in different languages, and DSLs for
>>>>>>>> specific use cases.
>>>>>>>>
>>>>>>>> You got why I sent my proposal  ;)
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> JB
>>>>>>>>
>>>>>>>>
>>>>>>>> On 03/24/2016 07:14 PM, Raghu Angadi wrote:
>>>>>>>>
>>>>>>>>> I would love to see Scala API properly supported. I didn't know about
>>>>>>>>> scio. Scala is such a natural fit for Dataflow API.
>>>>>>>>>
>>>>>>>>> I am not sure of the policy w.r.t where such packages would live in
>>>>>>>>> Beam repo, but I personally would write my Dataflow applications in
>>>>>>>>> Scala. It is probably already the case but my request would be: it
>>>>>>>>> should be as thin as reasonably possible (that might make it a bit less
>>>>>>>>> like scalding/spark API in some cases, which I think is a good
>>>>>>>>> compromise).
>>>>>>>>>
>>>>>>>>> On Thu, Mar 24, 2016 at 6:01 AM, Jean-Baptiste Onofré <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi beamers,
>>>>>>>>>>
>>>>>>>>>> right now, Beam provides Java SDK.
>>>>>>>>>>
>>>>>>>>>> AFAIK, very soon, you should have the Python SDK ;)
>>>>>>>>>>
>>>>>>>>>> Spotify created a Scala API on top of Google Dataflow SDK:
>>>>>>>>>>
>>>>>>>>>> https://github.com/spotify/scio
>>>>>>>>>>
>>>>>>>>>> What do you think of asking if they want to donate this as Beam Scala
>>>>>>>>>> SDK ?
>>>>>>>>>> I planned to work on a Scala SDK, but as it seems there's already
>>>>>>>>>> something, it makes sense to leverage it.
>>>>>>>>>>
>>>>>>>>>> Thoughts ?
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> JB
>>>>>>>>>> --
>>>>>>>>>> Jean-Baptiste Onofré
>>>>>>>>>> [email protected]
>>>>>>>>>> http://blog.nanthrax.net
>>>>>>>>>> Talend - http://www.talend.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>> Jean-Baptiste Onofré
>>>>>>>> [email protected]
>>>>>>>> http://blog.nanthrax.net
>>>>>>>> Talend - http://www.talend.com
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
