Hey!

My name is Teodor Spæren and I'm writing a master's thesis investigating the performance overhead of using Beam instead of using the underlying systems directly. My focus has been on Flink, and I've discovered some unnecessary copying between operators in the Flink runner[1][2]. I wrote a fix for this, which was accepted and merged and will be in the upcoming 2.26.0 release[3].

I'm writing this email to ask if anyone on these mailing lists would be willing to send me some results of applying this option once the new version of Beam is released. Anything will be very much appreciated: stories, screenshots of performance monitoring before and after, hard numbers, anything! If you include the cluster size and the workload, that would be awesome too! My master's thesis is set to be completed this coming summer, so there is no real hurry :)

The thesis will be freely accessible[4] and I hope that these findings will be of help to the Beam community. If anyone wishes to submit stories but remain anonymous, that is also OK :)

The best way to contact me is to reply to this email, or to write to me at teod...@mail.uio.no.

Any help is appreciated, thanks for your attention!

Best regards,
Teodor Spæren


[1]: https://lists.apache.org/thread.html/r24129dba98782e1cf4d18ec738ab9714dceb05ac23f13adfac5baad1%40%3Cdev.beam.apache.org%3E
[2]: https://issues.apache.org/jira/browse/BEAM-11146
[3]: https://github.com/apache/beam/pull/13240
[4]: https://www.duo.uio.no/
