I agree that we can benefit from having two types of performance tests (low-
and high-level) that could complement each other.
Can we detect a regression (if any) automatically and send a report about that? 
Sorry if we already do that for Nexmark.
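For illustration only, a minimal automated check could compare the latest run's metric against a baseline of recent runs and flag outliers. The helper below is a hypothetical sketch (not an existing Beam or Jenkins facility), assuming we have runtimes collected from previous Jenkins runs:

```python
from statistics import mean

def is_regression(baseline_runtimes, latest_runtime, tolerance=1.2):
    """Flag a regression when the latest runtime exceeds the mean of
    recent baseline runs by more than a tolerance factor.

    baseline_runtimes: runtimes (in seconds) from previous runs.
    """
    if not baseline_runtimes:
        return False  # nothing to compare against yet
    return latest_runtime > mean(baseline_runtimes) * tolerance

# Example: previous runs hovered around 100 s, so 130 s is flagged
# (above the 120 s threshold) while 105 s is not.
history = [98.0, 102.0, 100.0]
print(is_regression(history, 130.0))  # True
print(is_regression(history, 105.0))  # False
```

A real report step would then post the flagged metrics somewhere visible (e.g. a dashboard or the mailing list), but the comparison itself can stay this simple.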

> On 11 Sep 2018, at 11:29, Etienne Chauchot <[email protected]> wrote:
> 
> Hi Lukasz,
> 
> Well, having low level byte[] based pure performance tests makes sense. And 
> having high level realistic model (Nexmark auction system) makes sense also 
> to avoid testing unrealistic pipelines as you describe.
> 
> Having common code between the two seems difficult, as both the architecture
> and the model are different.
> 
> I'm more concerned about having two CI mechanisms to detect
> functional/performance regressions.
> Best
> Etienne
> 
> On Monday, 10 September 2018 at 18:33 +0200, Łukasz Gajowy wrote:
>> In my opinion, and as far as I understand Nexmark, there are benefits to 
>> having both types of tests. The load tests we propose can be very 
>> straightforward and clearly show what is being tested, thanks to the fact 
>> that there is no fixed model, only very "low level" KV<byte[], byte[]> 
>> collections. They are also more flexible in the shapes of pipelines they can 
>> express (e.g. fanout_64), without having to think about specific use cases. 
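To make the fanout idea concrete, here is a toy pure-Python model (not SDK code, and a fanout of 4 rather than 64 for brevity) of how a fanout step multiplies the elements flowing into a GroupByKey-style operation:

```python
from collections import defaultdict

def fanout(elements, n):
    """Replicate each (key, value) pair n times, as a fanout_n step would."""
    for k, v in elements:
        for _ in range(n):
            yield (k, v)

def group_by_key(pairs):
    """A toy GroupByKey: collect all values per key."""
    grouped = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)
    return dict(grouped)

# Two low-level byte-pair inputs become eight grouped values downstream.
data = [(b"k1", b"v1"), (b"k2", b"v2")]
result = group_by_key(fanout(data, 4))
print(len(result[b"k1"]))  # 4
```

The point of such a shape in a load test is exactly this multiplication: a small synthetic input can stress the grouping stage without modeling any realistic use case.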
>> 
>> Having both types would allow developers to decide whether they want to 
>> create a new Nexmark query for their specific case or develop a new load 
>> test (whichever is easier and better fits their case). However, there is a 
>> risk: with KV<byte[], byte[]>, a developer can overemphasize cases that can 
>> never happen in practice, so we need to be careful about the exact 
>> configurations we run. 
>> 
>> Still, I can imagine that some code will surely be common to both types of 
>> tests, and we should seek ways to avoid duplicating it.
>> 
>> WDYT? 
>> 
>> Regards, 
>> Łukasz
>> 
>> 
>> 
>> On Mon, 10 Sep 2018 at 16:36, Etienne Chauchot <[email protected]> wrote:
>>> Hi,
>>> It seems that there is a notable overlap with what Nexmark already does:
>>> Nexmark measures performance and detects regressions by exercising the 
>>> whole Beam model in both batch and streaming modes with several runners. It 
>>> also computes on synthetic data, and it is already included as PostCommit 
>>> jobs in the CI, with dashboards.
>>> 
>>> Shall we merge the two?
>>> 
>>> Best
>>> 
>>> Etienne
>>> 
>>> On Monday, 10 September 2018 at 12:56 +0200, Łukasz Gajowy wrote:
>>>> Hello everyone, 
>>>> 
>>>> thank you for all your comments to the proposal. To sum up: 
>>>> 
>>>> A set of performance tests exercising core Beam transforms (ParDo, 
>>>> GroupByKey, CoGroupByKey, Combine) will be implemented for the Java and 
>>>> Python SDKs. Those tests will make it possible to: 
>>>> - measure the performance of the transforms on various runners
>>>> - exercise the transforms by creating stressful conditions and big loads 
>>>>   using the Synthetic Source and Synthetic Step API (delays, keeping the 
>>>>   CPU busy or asleep, processing large keys and values, performing fanout 
>>>>   or reiteration of inputs)
>>>> - run in both batch and streaming contexts
>>>> - gather various metrics
>>>> - notice regressions by comparing data from consecutive Jenkins runs
>>>> Metrics (runtime, consumed bytes, memory usage, split/bundle count) can be 
>>>> gathered during test invocations. We will start with runtime and leverage 
>>>> the Metrics API to collect the other metrics in later phases of 
>>>> development. The tests will be fully configurable through pipeline 
>>>> options, and it will be possible to run any custom scenario manually. 
>>>> However, a representative set of testing scenarios will be run 
>>>> periodically on Jenkins.
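As a sketch of what such pipeline-option-driven scenarios might look like, the configs below mirror the knobs described above (key/value sizes, delays, fanout, batch vs. streaming). The option names and values are hypothetical illustrations, not the actual Synthetic Source API:

```python
# Hypothetical scenario configs; these dicts stand in for Beam pipeline
# options in this sketch.
SCENARIOS = {
    "gbk_fanout_64": {
        "transform": "GroupByKey",
        "key_size_bytes": 10,
        "value_size_bytes": 90,
        "num_records": 1_000_000,
        "fanout": 64,
        "streaming": False,
    },
    "pardo_cpu_burn": {
        "transform": "ParDo",
        "key_size_bytes": 10,
        "value_size_bytes": 1_000,
        "num_records": 100_000,
        "per_element_delay_ms": 5,  # keep the CPU busy per element
        "streaming": True,
    },
}

def estimated_input_bytes(cfg):
    """Rough synthetic-input size: records * (key + value) bytes."""
    return cfg["num_records"] * (cfg["key_size_bytes"] + cfg["value_size_bytes"])

# 1,000,000 records of 100 bytes each -> 100 MB of synthetic input.
print(estimated_input_bytes(SCENARIOS["gbk_fanout_64"]))  # 100000000
```

Keeping scenarios as declarative configs like this would make it easy for Jenkins to run a fixed representative set while developers run custom ones manually.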
>>>> 
>>>> Regards, 
>>>> Łukasz 
>>>> 
>>>> On Wed, 5 Sep 2018 at 20:31, Rafael Fernandez <[email protected]> wrote:
>>>>> neat! left a comment or two
>>>>> 
>>>>> On Mon, Sep 3, 2018 at 3:53 AM Łukasz Gajowy <[email protected]> wrote:
>>>>>> Hi all! 
>>>>>> 
>>>>>> I'm bumping this (in case you missed it). Any feedback and questions are 
>>>>>> welcome!
>>>>>> 
>>>>>> Best regards, 
>>>>>> Łukasz
>>>>>> 
>>>>>> On Mon, 13 Aug 2018 at 13:51, Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>> Hi Lukasz,
>>>>>>> 
>>>>>>> Thanks for the update, and the abstract looks promising.
>>>>>>> 
>>>>>>> Let me take a look at the doc.
>>>>>>> 
>>>>>>> Regards
>>>>>>> JB
>>>>>>> 
>>>>>>> On 13/08/2018 13:24, Łukasz Gajowy wrote:
>>>>>>> > Hi all, 
>>>>>>> > 
>>>>>>> > since the Synthetic Sources API has been introduced in the Java and
>>>>>>> > Python SDKs, it can be used to test some basic Apache Beam operations
>>>>>>> > (i.e. GroupByKey, CoGroupByKey, Combine, ParDo, and ParDo with
>>>>>>> > SideInput) in terms of performance. This, in brief, is why we'd like
>>>>>>> > to share the below proposal:
>>>>>>> > 
>>>>>>> > https://docs.google.com/document/d/1PuIQv4v06eosKKwT76u7S6IP88AnXhTf870Rcj1AHt4/edit?usp=sharing
>>>>>>> > 
>>>>>>> > Let us know what you think in the document's comments. Thank you in
>>>>>>> > advance for all the feedback!
>>>>>>> > 
>>>>>>> > Łukasz
>>>>>>> 
