Just bumping this so people see it now that 2.26.0 is out :)
On Wed, Nov 25, 2020 at 11:09:52AM +0100, Teodor Spæren wrote:
Hey!
My name is Teodor Spæren and I'm writing a master's thesis investigating
the performance overhead of using Beam instead of using the underlying
systems directly. My focus has been on Flink and I've made a discovery
about some unnecessary copying between operators in the Flink
runner[1][2]. I wrote a fix for this and it got accepted and merged,
and will be in the upcoming 2.26.0 release[3].
I'm writing this email to ask if anyone on these mailing lists would
be willing to send me some results of applying this option once the new
version of Beam is released. Anything would be very much appreciated:
stories, screenshots of performance monitoring before and after, hard
numbers, anything! If you include the cluster size and the workload,
that would be awesome too! My master's thesis is set to be completed
this coming summer, so there is no real hurry :)
The thesis will be freely accessible[4] and I hope that these findings
will be of help to the Beam community. If anyone wishes to submit
stories but remain anonymous, that is also OK :)
The best way to contact me is to reply to this email here, or to write
to me at [email protected].
Any help is appreciated, thanks for your attention!
Best regards,
Teodor Spæren
[1]: https://lists.apache.org/thread.html/r24129dba98782e1cf4d18ec738ab9714dceb05ac23f13adfac5baad1%40%3Cdev.beam.apache.org%3E
[2]: https://issues.apache.org/jira/browse/BEAM-11146
[3]: https://github.com/apache/beam/pull/13240
[4]: https://www.duo.uio.no/