[ 
https://issues.apache.org/jira/browse/BEAM-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8258:
----------------------------------

This Jira ticket has a pull request attached to it, but is still open. Did the 
pull request resolve the issue? If so, could you please mark it resolved? This 
will help the project have a clear view of its open issues.

> Implement Nexmark (benchmark suite) in Python and integrate it with Spark and 
> Flink runners
> -------------------------------------------------------------------------------------------
>
>                 Key: BEAM-8258
>                 URL: https://issues.apache.org/jira/browse/BEAM-8258
>             Project: Beam
>          Issue Type: Bug
>          Components: testing-nexmark
>            Reporter: Ismaël Mejía
>            Priority: P3
>              Labels: gsoc, gsoc2020, mentor
>          Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Apache Beam [1] is a unified and portable programming model for data 
> processing jobs (pipelines). The Beam model [2, 3, 4] has rich mechanisms to 
> process endless streams of events.
> Nexmark [5] is a benchmark for streaming jobs: a set of jobs (queries) that 
> test different use cases of the execution system. Beam implemented Nexmark 
> for Java [6, 7], and it has been successfully used to improve the features 
> of multiple Beam runners and to discover performance regressions.
> Thanks to the work on portability [8] we can now run Beam pipelines on top of 
> open source systems like Apache Spark [9] and Apache Flink [10]. The goal of 
> this issue/project is to implement the Nexmark queries in Python and 
> configure them to run in our CI on top of those systems. This should help 
> the project track and improve the evolution of the portable open source 
> runners and of our Python implementation, as we already do for Java.
> Because of the time constraints of GSoC we will adjust the goals (sub-tasks) 
> depending on progress.
> [1] https://beam.apache.org/
> [2] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
> [3] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102
> [4] https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf
> [5] https://web.archive.org/web/20100620010601/http://datalab.cs.pdx.edu/niagaraST/NEXMark/
> [6] https://beam.apache.org/documentation/sdks/java/testing/nexmark/
> [7] https://github.com/apache/beam/tree/master/sdks/java/testing/nexmark
> [8] https://beam.apache.org/roadmap/portability/
> [9] https://spark.apache.org/
> [10] https://flink.apache.org/



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
