[ https://issues.apache.org/jira/browse/BEAM-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kenneth Knowles updated BEAM-8258:
----------------------------------
This Jira ticket has a pull request attached to it, but is still open. Did the
pull request resolve the issue? If so, could you please mark it resolved? This
will help the project have a clear view of its open issues.
> Implement Nexmark (benchmark suite) in Python and integrate it with Spark and
> Flink runners
> -------------------------------------------------------------------------------------------
>
> Key: BEAM-8258
> URL: https://issues.apache.org/jira/browse/BEAM-8258
> Project: Beam
> Issue Type: Bug
> Components: testing-nexmark
> Reporter: Ismaël Mejía
> Priority: P3
> Labels: gsoc, gsoc2020, mentor
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> Apache Beam [1] is a unified and portable programming model for data
> processing jobs (pipelines). The Beam model [2, 3, 4] has rich mechanisms to
> process endless streams of events.
> Nexmark [5] is a benchmark for streaming jobs: a set of jobs
> (queries) that test different use cases of the execution system. Beam
> implemented Nexmark for Java [6, 7], and it has been successfully used to
> improve the features of multiple Beam runners and discover performance
> regressions.
> Thanks to the work on portability [8] we can now run Beam pipelines on top of
> open source systems like Apache Spark [9] and Apache Flink [10]. The goal of
> this issue/project is to implement the Nexmark queries in Python and
> configure them to run on our CI on top of open source systems like Apache
> Spark and Apache Flink. This should help the project track and
> improve the evolution of portable open source runners and our Python
> implementation, as we do for Java.
> Because of the time constraints of GSoC we will adjust the goals (sub-tasks)
> depending on progress.
> [1] https://beam.apache.org/
> [2] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
> [3] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102
> [4] https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf
> [5] https://web.archive.org/web/20100620010601/http://datalab.cs.pdx.edu/niagaraST/NEXMark/
> [6] https://beam.apache.org/documentation/sdks/java/testing/nexmark/
> [7] https://github.com/apache/beam/tree/master/sdks/java/testing/nexmark
> [8] https://beam.apache.org/roadmap/portability/
> [9] https://spark.apache.org/
> [10] https://flink.apache.org/