Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-11-30 Thread Etienne Chauchot
No problem, always glad to help

Etienne

On Thursday, November 29, 2018 at 09:20 -0800, Alex Amato wrote:
> Thanks Etienne, appreciate the info. This will help me a lot :)

Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-11-29 Thread Alex Amato
Thanks Etienne, appreciate the info. This will help me a lot :)

On Wed, Nov 28, 2018 at 1:02 AM Etienne Chauchot wrote:

> Hi Alex,
> Exporting results to the dashboards is as easy as writing to a BigQuery
> table and then configuring the dashboard's SQL query to display it. Here is
> an example:
> - exporting:
> https://github.com/apache/beam/blob/ad150c1d654aac5720975727d8c6981c5382b449/sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java#L163
> - displaying:
>
> SELECT
> DATE(timestamp) as date,
> runtimeSec
> FROM
> [apache-beam-testing:nexmark.nexmark_0_DirectRunner_batch]
> WHERE
> timestamp >= TIMESTAMP_TO_SEC(DATE_ADD(CURRENT_TIMESTAMP(), -2, "WEEK"))
> ORDER BY
> date;
>
> Best
> Etienne

Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-11-28 Thread Etienne Chauchot
Hi Alex,
Exporting results to the dashboards is as easy as writing to a BigQuery
table and then configuring the dashboard's SQL query to display it. Here is
an example:
- exporting:
https://github.com/apache/beam/blob/ad150c1d654aac5720975727d8c6981c5382b449/sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java#L163
- displaying:

SELECT
DATE(timestamp) as date,
runtimeSec
FROM
[apache-beam-testing:nexmark.nexmark_0_DirectRunner_batch]
WHERE
timestamp >= TIMESTAMP_TO_SEC(DATE_ADD(CURRENT_TIMESTAMP(), -2, "WEEK"))
ORDER BY
date;

Best
Etienne
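
To make the intent of the dashboard query above concrete, here is a minimal Python sketch of the same filter-and-sort logic applied to in-memory rows. It is only an illustration: the function name `recent_runtimes`, the dict-based row layout, and the assumption that `timestamp` holds epoch seconds are all hypothetical, and this is not the Nexmark or perfkit code.

```python
from datetime import datetime, timedelta, timezone


def recent_runtimes(rows, now, weeks=2):
    """Return (date, runtimeSec) pairs for rows from the last `weeks` weeks.

    `rows` is an iterable of dicts with the keys "timestamp" (epoch seconds,
    an assumption) and "runtimeSec". The result is sorted by date, like the
    ORDER BY clause of the dashboard query.
    """
    # Cutoff in epoch seconds: `weeks` weeks before `now`.
    cutoff = (now - timedelta(weeks=weeks)).timestamp()
    selected = [
        (
            datetime.fromtimestamp(row["timestamp"], tz=timezone.utc).date().isoformat(),
            row["runtimeSec"],
        )
        for row in rows
        if row["timestamp"] >= cutoff
    ]
    # ISO dates sort chronologically as plain strings.
    return sorted(selected, key=lambda pair: pair[0])
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the sketch deterministic and easy to check against known rows.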
On Tuesday, November 27, 2018 at 17:34 -0800, Alex Amato wrote:
> It would be great to add some lower-level benchmark tests for the Java SDK. I
> was thinking of using OpenCensus for collecting benchmarks, which looks easy
> to use and should be license compatible. I'm just not sure how to export the
> results so that we can display them on the perfkit dashboard for everyone to
> see.
> 
> Is there an example PR for this part? Can we write to this data store for 
> this perfkit dashboard easily?
> 
> https://github.com/census-instrumentation/opencensus-java
> https://github.com/census-instrumentation/opencensus-java/tree/master/exporters/trace/zipkin#quickstart

Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-11-27 Thread Alex Amato
It would be great to add some lower-level benchmark tests for the Java SDK.
I was thinking of using OpenCensus for collecting benchmarks, which looks
easy to use and should be license compatible. I'm just not sure how to
export the results so that we can display them on the perfkit dashboard for
everyone to see.

Is there an example PR for this part? Can we write to this data store for
this perfkit dashboard easily?

https://github.com/census-instrumentation/opencensus-java
https://github.com/census-instrumentation/opencensus-java/tree/master/exporters/trace/zipkin#quickstart
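
As a rough illustration of what one low-level benchmark result could look like before it is exported, here is a small Python sketch that times a single call and returns a flat row. It deliberately uses only the standard library rather than OpenCensus, and it stops short of writing to any data store. The function name `run_benchmark`, the `benchmark` column, and the overall row layout are assumptions made for the example; only `runtimeSec` and `numResults` echo names used elsewhere in this thread.

```python
import time


def run_benchmark(name, fn, now=time.time, clock=time.perf_counter):
    """Time one call to `fn` and return a flat result row.

    `fn` must return a sized collection. The row layout is hypothetical;
    a real exporter would map it onto whatever schema the dashboard uses.
    """
    start = clock()
    results = fn()
    elapsed = clock() - start
    return {
        "benchmark": name,        # hypothetical column identifying the test
        "timestamp": int(now()),  # epoch seconds at the end of the run
        "runtimeSec": elapsed,    # wall-clock duration of the call
        "numResults": len(results),
    }
```

For example, `run_benchmark("sort", lambda: sorted(range(1000, 0, -1)))` yields a row with `numResults` equal to 1000 and a small, machine-dependent `runtimeSec`.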




On Thu, Jul 19, 2018 at 1:28 PM Andrew Pilloud  wrote:

> The doc changes look good to me, I'll add Dataflow once it is ready.
> Thanks for opening the issue on the DirectRunner. I'll try to get some
> progress on a dedicated perf node while you are gone, we can talk about
> increasing the size of the nexmark input collection for the runs once we
> know what the utilization on that looks like.
>
> Enjoy your time off!
>
>
> Andrew

Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-19 Thread Andrew Pilloud
The doc changes look good to me, I'll add Dataflow once it is ready. Thanks
for opening the issue on the DirectRunner. I'll try to get some progress on
a dedicated perf node while you are gone, we can talk about increasing the size
of the nexmark input collection for the runs once we know what the
utilization on that looks like.

Enjoy your time off!

Andrew

On Thu, Jul 19, 2018 at 9:00 AM Etienne Chauchot wrote:

> Hi guys,
> As suggested by Anton below, I opened a PR on the website to reference
> the Nexmark dashboards.
> As I did not want users to take them for proper, neutral benchmarks of the
> runners / engines, but rather for a piece of CI tooling, I added a
> disclaimer.
>
> Please:
> - tell me if you agree on the publication of such performance results
> - comment on the PR for the disclaimer.
>
> PR: https://github.com/apache/beam-site/pull/500
>
> Thanks
>
> Etienne
>
>
> On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
>
> Hi Anton,
>
> Yes, good idea, I'll update the Nexmark website page
>
> Etienne
>
> On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
>
> These dashboards look great!
>
> Can we publish the links to the dashboards somewhere, for better visibility?
> E.g. in the Jenkins website / emails, or the wiki.
>
> Regards,
> Anton
>
> On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud wrote:
>
> Hi Etienne,
>
> I've been asking around and it sounds like we should be able to get a
> dedicated Jenkins node for performance tests. Another thing that might help
> is making the runs a few times longer. They are currently running around 2
> seconds each, so the total time of the build probably exceeds testing.
> Internally at Google we are running them with 2000x as many events on
> Dataflow, but a job of that size won't even complete on the Direct Runner.
>
> I didn't see the query 3 issues, but now that you point it out it looks
> like a bug to me too.
>
> Andrew
>
> On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot wrote:
>
> Hi Andrew,
>
> Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I see no
> other way.
>
> Also, did you see the query 3 output size on the direct runner? It should be
> a straight line and it is not. I'm wondering if there is a problem with the
> state and timers implementation in the direct runner.
>
> Etienne
>
> On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
>
> I'm noticing the graphs are really noisy. It looks like we are running
> these on shared Jenkins executors, so our perf tests are fighting with
> other builds for CPU. I've opened an issue
> https://issues.apache.org/jira/browse/BEAM-4804 and am wondering if
> anyone knows an easy fix to isolate these jobs.
>
> Andrew
>
> On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  wrote:
>
> @Etienne: Nice to see the graphs! :)
>
> @Ismael: Good idea, there's no document yet. I think we could create a
> small google doc with instructions on how to do this.
>
> On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot wrote:
>
> Hi,
>
> @Andrew, this is because I did not find a way to set 2 scales on the Y
> axis on the perfkit graphs. Indeed, numResults varies from 1 to 100 000 and
> runtimeSec is usually below 10s.
>
> Etienne
>
> On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
>
> This is great, should make performance work much easier! I'm going to get
> the Beam SQL Nexmark jobs publishing as well. (Opened
> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take
> on the Dataflow runner as well if no one else volunteers.
>
> I am curious as to why you have two separate graphs for runtime and count
> rather than graphing runtime/count to get the throughput rate for each run?
> Or should that be a third graph? Looks like it would just be a small tweak
> to the query in perfkit.
>
>
>
> Andrew
>
> On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  wrote:
>
> This is really cool Etienne : ) thanks for working on this.
> Out of curiosity, do you know how often the tests run on each runner?
>
> Best
> -P.
>
> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau wrote:
>
> Awesome Etienne, this is really important for the (user) community to have
> that visibility since it is one of the most important aspects of Beam's
> quality, kudos!
>
>
> Romain Manni-Bucau
> @rmannibucau
>
>
> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
>
> It's really great to have these dashboards and integration in Jenkins!
>
> Thanks Etienne for driving this!
>
> Regards
> JB
>
> On 11/07/2018 15:13, Etienne Chauchot wrote:
> >
> > Hi guys,
> >
> > I'm glad to announce that the CI of Beam has much improved! Indeed
> > Nexmark is now included in the perfkit 

[FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-19 Thread Etienne Chauchot
Hi guys,
As suggested by Anton below, I opened a PR on the website to reference the
Nexmark dashboards. As I did not want users to take them for proper, neutral
benchmarks of the runners / engines, but rather for a piece of CI tooling, I
added a disclaimer.

Please:
- tell me if you agree on the publication of such performance results
- comment on the PR for the disclaimer.

PR: https://github.com/apache/beam-site/pull/500

Thanks
Etienne

On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
> Hi Anton,
> Yes, good idea, I'll update the Nexmark website page
> Etienne
> On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
> > These dashboards look great!
> > 
> > Can we publish the links to the dashboards somewhere, for better visibility?
> > E.g. in the Jenkins website / emails, or the wiki.
> >
> > Regards,
> > Anton