Re: Two example pipelines built by Yahoo intern

Jesse Anderson Thu, 10 Aug 2017 08:28:57 -0700

Claire,

Here are a few examples of lambdas and built-in functions in Beam:


http://www.jesse-anderson.com/2016/12/beams-pico-wordcount/
https://github.com/apache/beam/tree/master/examples/java8/src/main/java/org/apache/beam/examples/complete
https://github.com/eljefe6a/beamexample/tree/master/BeamTutorial/src/main/java/org/apache/beam/examples/tutorial/game

Thanks,

Jesse

On Wed, Aug 9, 2017 at 4:29 PM Claire Yuan <[email protected]> wrote:

> Hi,
>   Thank you so much for your comments! They were really helpful in
> improving our work :) To answer Jesse's question: we based our code on the
> examples from Beam and did not notice any lambda expressions there. It was
> surprising for us to see Java written in such a functional, generic style
> with the Beam API, but once we got used to it, its convenience amazed us.
>
> Claire
>
>
> On Tuesday, August 8, 2017 4:53 PM, Eugene Kirpichov <[email protected]>
> wrote:
>
>
> +Aljoscha Krettek <[email protected]> for comments on Flink
> runner
> +Thomas Weise <[email protected]> likewise for Apex runner
>
> On Tue, Aug 8, 2017 at 4:52 PM Eugene Kirpichov <[email protected]>
> wrote:
>
> Hi Claire,
>
> Thank you - happy to see a paper with such a detailed description of your
> experience with both usability of Beam per se and the execution on the
> Flink runner!
> The paper looks well-written, and, from a quick look at the code, it seems
> to be using the Beam API properly without obvious opportunities for large
> improvement. Great work!
>
> A couple of suggestions:
> - I think it would be useful to mention explicitly in the paper abstract /
> introduction that you are testing Flink and Apex runners, and mention which
> other runners are currently available, and mention why you're testing
> specifically Flink and Apex. This would be useful to people reading the
> paper without much background in Beam, who might not realize that Beam has
> many different runners with potentially very different performance or level
> of support for features.
> - As a member of the Dataflow team, I'm curious :) Have you considered
> also benchmarking these pipelines on the Dataflow runner? (especially
> streaming)
> - For the issues you found that are clearly not "intended behavior" (e.g.
> unacceptably low performance in streaming mode; pipelines not working at
> all with Apex runner, etc.), would it be possible to add JIRA IDs to the
> paper, so that people who read the paper later can look at the JIRA and see
> if it was already resolved?
>
> Thanks.
>
> On Tue, Aug 8, 2017 at 3:46 PM Jesse Anderson <[email protected]>
> wrote:
>
> Claire,
>
> Interesting work.
>
> In section 5, you talk about the Java language being difficult. Was there
> a reason you didn't use Java lambdas for your work?
>
> Thanks,
>
> Jesse
>
> On Tue, Aug 8, 2017 at 3:40 PM Claire Yuan <[email protected]>
> wrote:
>
> Hi folks,
>   We are a two-member team interning at Yahoo! Inc., currently evaluating
> the performance and functionality of the Beam API. We built two pipelines
> with the Beam API, using the default examples as a reference: one for
> sentiment analysis and one for flight performance analysis. Attached are
> the code for the two pipelines and a README with instructions on how to
> run them in our framework. We would like to share them with you. We also
> wrote a paper about our evaluation results and our experience using Beam
> over the last two months of our internship. It would be a great help if
> you could have a look at it and share any comments with us.
> Thanks!
>
> --
> Thanks,
>
> Jesse
>
>
>
> --
Thanks,

Jesse
