I agree. Borrowing the mutation detection from the direct runner as an
intermediate point sounds like a good idea.

On Mon, Dec 21, 2020 at 8:57 AM Kenneth Knowles <[email protected]> wrote:

> I really think we should make a plan to make this the default. If you test
> with the DirectRunner it will do mutation checking and catch pipelines that
> depend on the runner cloning every element. (also the DirectRunner doesn't
> clone). Since the cloning is similar in cost to the mutation detection,
> could we actually add some mutation detection to FlinkRunner pipelines and
> also directly warn if a pipeline is depending on it?
>
> Kenn
>
> On Mon, Dec 21, 2020 at 5:04 AM Teodor Spæren <[email protected]>
> wrote:
>
>> Hey! My option is not the default as of now, since it can break pipelines
>> which rely on the faulty Flink implementation. I'm creating my own
>> benchmarks locally and will run against those, but the idea of adding it
>> to the official benchmark runs sounds interesting, thanks for bringing
>> it up!
>>
>> Teodor
>>
>> On Tue, Dec 15, 2020 at 06:51:38PM -0800, Ahmet Altay wrote:
>> >Hi Teodor,
>> >
>> >Thank you for working on this. If I remember correctly, there were some
>> >opportunities to improve in the previous paper (e.g. not focusing on
>> >deprecated runners, long running benchmarks, varying data sizes). And I am
>> >excited that you are keeping the community as part of your research process
>> >and we will be happy to help you where we can.
>> >
>> >Related to your question. Was the new option used by default? If that
>> >is the case you will probably see its impact on the metrics dashboard [1].
>> >And if it is not on by default, you can add your variant as a new benchmark
>> >and compare the difference across many runs in a controlled benchmarking
>> >environment. Would that help?
>> >
>> >Ahmet
>> >
>> >[1] http://metrics.beam.apache.org/d/1/getting-started?orgId=1
>> >
>> >
>> >On Tue, Dec 15, 2020 at 5:48 AM Teodor Spæren <[email protected]>
>> >wrote:
>> >
>> >> Hey!
>> >>
>> >> Yeah, that paper was what prompted my master's thesis! I will definitely
>> >> post here once I get more data :)
>> >>
>> >> Teodor
>> >>
>> >> On Mon, Dec 14, 2020 at 06:56:30AM -0600, Rion Williams wrote:
>> >> >Hi Teodor,
>> >> >
>> >> >Although I’m sure you’ve come across it, this might have some valuable
>> >> >resources or methodologies to consider as you explore this a bit more:
>> >> >
>> >> >https://arxiv.org/pdf/1907.08302.pdf
>> >> >
>> >> >I’m looking forward to reading about your findings, especially using a
>> >> >more recent iteration of Beam!
>> >> >
>> >> >Rion
>> >> >
>> >> >> On Dec 14, 2020, at 6:37 AM, Teodor Spæren <[email protected]>
>> >> >> wrote:
>> >> >>
>> >> >> Just bumping this so people see it now that 2.26.0 is out :)
>> >> >>
>> >> >>> On Wed, Nov 25, 2020 at 11:09:52AM +0100, Teodor Spæren wrote:
>> >> >>> Hey!
>> >> >>>
>> >> >>> My name is Teodor Spæren and I'm writing a master's thesis investigating
>> >> >>> the performance overhead of using Beam instead of using the underlying
>> >> >>> systems directly. My focus has been on Flink and I've made a discovery
>> >> >>> about some unnecessary copying between operators in the Flink runner[1][2].
>> >> >>> I wrote a fix for this and it got accepted and merged,
>> >> >>> and will be in the upcoming 2.26.0 release[3].
>> >> >>>
>> >> >>> I'm writing this email to ask if anyone on these mailing lists would
>> >> >>> be willing to send me some results of applying this option when the new
>> >> >>> version of Beam is released. Anything will be very much appreciated: stories,
>> >> >>> screenshots of performance monitoring before and after, hard numbers,
>> >> >>> anything! If you include the cluster size and the workload, that would be
>> >> >>> awesome too! My master's thesis is set to be complete the coming summer, so
>> >> >>> there is no real hurry :)
>> >> >>>
>> >> >>> The thesis will be freely accessible[4] and I hope that these findings
>> >> >>> will be of help to the Beam community. If anyone wishes to submit stories
>> >> >>> but remain anonymous, that is also ok :)
>> >> >>>
>> >> >>> The best way to contact me would be to send an email my way here, or
>> >> >>> on [email protected].
>> >> >>>
>> >> >>> Any help is appreciated, thanks for your attention!
>> >> >>>
>> >> >>> Best regards,
>> >> >>> Teodor Spæren
>> >> >>>
>> >> >>>
>> >> >>> [1]: https://lists.apache.org/thread.html/r24129dba98782e1cf4d18ec738ab9714dceb05ac23f13adfac5baad1%40%3Cdev.beam.apache.org%3E
>> >> >>> [2]: https://issues.apache.org/jira/browse/BEAM-11146
>> >> >>> [3]: https://github.com/apache/beam/pull/13240
>> >> >>> [4]: https://www.duo.uio.no/
>> >>
>>
>
