Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-11-30 Thread Etienne Chauchot
No problem, always glad to help

Etienne

On Thursday, November 29, 2018 at 09:20 -0800, Alex Amato wrote:
> Thanks Etienne, appreciate the info. This will help me a lot :)
> On Wed, Nov 28, 2018 at 1:02 AM Etienne Chauchot  wrote:
> > Hi Alex,
> > Exporting results to the dashboards is as easy as writing to a BigQuery
> > table and then configuring the dashboard SQL
> > query to display it. Here is an example:
> > - exporting: 
> > https://github.com/apache/beam/blob/ad150c1d654aac5720975727d8c6981c5382b449/sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java#L163
> > - displaying:
> > 
> > SELECT
> > DATE(timestamp) as date,
> > runtimeSec
> > FROM
> > [apache-beam-testing:nexmark.nexmark_0_DirectRunner_batch]
> > WHERE
> > timestamp >= TIMESTAMP_TO_SEC(DATE_ADD(CURRENT_TIMESTAMP(), -2, "WEEK"))
> > ORDER BY
> > date;
> > 
> > Best
> > Etienne
> > 
> > On Tuesday, November 27, 2018 at 17:34 -0800, Alex Amato wrote:
> > > It would be great to add some lower-level benchmark tests for the Java
> > > SDK. I was thinking of using OpenCensus
> > > for collecting benchmarks, which looks easy to use and should be license
> > > compatible. I'm just not sure how to
> > > export the results so that we can display them on the perfkit dashboard
> > > for everyone to see.
> > > 
> > > Is there an example PR for this part? Can we write to this data store for 
> > > this perfkit dashboard easily?
> > > 
> > > https://github.com/census-instrumentation/opencensus-java
> > > https://github.com/census-instrumentation/opencensus-java/tree/master/exporters/trace/zipkin#quickstart
> > > 
> > > 
> > > 
> > > 
> > > On Thu, Jul 19, 2018 at 1:28 PM Andrew Pilloud  
> > > wrote:
> > > > The doc changes look good to me, I'll add Dataflow once it is ready. 
> > > > Thanks for opening the issue on the
> > > > DirectRunner. I'll try to get some progress on a dedicated perf node 
> > > > while you are gone, we can talk about
> > > > increasing the size of the nexmark input collection for the runs once 
> > > > we know what the utilization on that looks
> > > > like.
> > > > Enjoy your time off!
> > > > 
> > > > Andrew
> > > > On Thu, Jul 19, 2018 at 9:00 AM Etienne Chauchot  
> > > > wrote:
> > > > > Hi guys,
> > > > > As suggested by Anton below, I opened a PR on the website to
> > > > > reference the Nexmark dashboards. As I
> > > > > did not want users to take them for proper neutral benchmarks of the
> > > > > runners / engines, but rather for a CI
> > > > > piece of software, I added a disclaimer.
> > > > > Please:
> > > > > - tell me if you agree with the publication of such performance results
> > > > > - comment on the PR for the disclaimer.
> > > > > PR: https://github.com/apache/beam-site/pull/500
> > > > > 
> > > > > Thanks
> > > > > Etienne
> > > > > 
> > > > > On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
> > > > > > Hi Anton,
> > > > > > Yes, good idea, I'll update the Nexmark website page.
> > > > > > Etienne
> > > > > > On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
> > > > > > > These dashboards look great!
> > > > > > > 
> > > > > > > Can we publish the links to the dashboards somewhere, for better
> > > > > > > visibility? E.g. in the Jenkins website /
> > > > > > > emails, or the wiki.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Anton
> > > > > > > On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud 
> > > > > > >  wrote:
> > > > > > > > Hi Etienne,
> > > > > > > > 
> > > > > > > > I've been asking around and it sounds like we should be able to 
> > > > > > > > get a dedicated Jenkins node for
> > > > > > > > performance tests. Another thing that might help is making the 
> > > > > > > > runs a few times longer. They are
> > > > > > > > currently running around 2 seconds each, so the total time of 
> > > > > > > > the build probably exceeds testing.
> > > > > > > > Internally at Google we are running them with 2000x as many 
> > > > > > > > events on Dataflow, but a job of that size
> > > > > > > > won't even complete on the Direct Runner.
> > > > > > > > I didn't see the query 3 issues, but now that you point it out 
> > > > > > > > it looks like a bug to me too.
> > > > > > > > 
> > > > > > > > Andrew
> > > > > > > > On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot 
> > > > > > > >  wrote:
> > > > > > > > > Hi Andrew,
> > > > > > > > > Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I
> > > > > > > > > see no other way.
> > > > > > > > > Also, did you see the query 3 output size on the direct runner?
> > > > > > > > > It should be a straight line and it is not. I'm
> > > > > > > > > wondering if there is a problem with the state and timers
> > > > > > > > > implementation in the direct runner.
> > > > > > > > > Etienne
> > > > > > > > > On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud
> > > > > > > > > wrote:
> > > > > > > > > > I'm noticing the graphs are really noisy. It looks like we 
> > > > > > > > > > are running these on shared Jenkins
> > > > > > > > 

Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-11-29 Thread Alex Amato
Thanks Etienne, appreciate the info. This will help me a lot :)

On Wed, Nov 28, 2018 at 1:02 AM Etienne Chauchot 
wrote:

> Hi Alex,
> Exporting results to the dashboards is as easy as writing to a BigQuery
> table and then configuring the dashboard SQL query to display it. Here is
> an example:
> - exporting:
> https://github.com/apache/beam/blob/ad150c1d654aac5720975727d8c6981c5382b449/sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java#L163
> - displaying:
>
> SELECT
> DATE(timestamp) as date,
> runtimeSec
> FROM
> [apache-beam-testing:nexmark.nexmark_0_DirectRunner_batch]
> WHERE
> timestamp >= TIMESTAMP_TO_SEC(DATE_ADD(CURRENT_TIMESTAMP(), -2, "WEEK"))
> ORDER BY
> date;
>
> Best
> Etienne
>
> On Tuesday, November 27, 2018 at 17:34 -0800, Alex Amato wrote:
>
> It would be great to add some lower-level benchmark tests for the Java
> SDK. I was thinking of using OpenCensus for collecting benchmarks, which
> looks easy to use and should be license compatible. I'm just not sure how
> to export the results so that we can display them on the perfkit dashboard
> for everyone to see.
>
> Is there an example PR for this part? Can we write to this data store for
> this perfkit dashboard easily?
>
> https://github.com/census-instrumentation/opencensus-java
>
> https://github.com/census-instrumentation/opencensus-java/tree/master/exporters/trace/zipkin#quickstart
>
>
>
>
> On Thu, Jul 19, 2018 at 1:28 PM Andrew Pilloud 
> wrote:
>
> The doc changes look good to me, I'll add Dataflow once it is ready.
> Thanks for opening the issue on the DirectRunner. I'll try to get some
> progress on a dedicated perf node while you are gone, we can talk about
> increasing the size of the nexmark input collection for the runs once we
> know what the utilization on that looks like.
>
> Enjoy your time off!
>
>
> Andrew
>
> On Thu, Jul 19, 2018 at 9:00 AM Etienne Chauchot 
> wrote:
>
> Hi guys,
> As suggested by Anton below, I opened a PR on the website to reference
> the Nexmark dashboards.
> As I did not want users to take them for proper neutral benchmarks of the
> runners / engines, but rather for a CI piece of software, I added a
> disclaimer.
>
> Please:
> - tell me if you agree with the publication of such performance results
> - comment on the PR for the disclaimer.
>
> PR: https://github.com/apache/beam-site/pull/500
>
> Thanks
>
> Etienne
>
>
> On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
>
> Hi Anton,
>
> Yes, good idea, I'll update the Nexmark website page.
>
> Etienne
>
> On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
>
> These dashboards look great!
>
> Can we publish the links to the dashboards somewhere, for better visibility?
> E.g. in the Jenkins website / emails, or the wiki.
>
> Regards,
> Anton
>
> On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud 
> wrote:
>
> Hi Etienne,
>
> I've been asking around and it sounds like we should be able to get a
> dedicated Jenkins node for performance tests. Another thing that might help
> is making the runs a few times longer. They are currently running around 2
> seconds each, so the total time of the build probably exceeds testing.
> Internally at Google we are running them with 2000x as many events on
> Dataflow, but a job of that size won't even complete on the Direct Runner.
>
> I didn't see the query 3 issues, but now that you point it out it looks
> like a bug to me too.
>
> Andrew
>
> On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot 
> wrote:
>
> Hi Andrew,
>
> Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I see no
> other way.
>
> Also, did you see the query 3 output size on the direct runner? It should be a
> straight line and it is not. I'm wondering if there is a problem with the state
> and timers implementation in the direct runner.
>
> Etienne
>
> On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
>
> I'm noticing the graphs are really noisy. It looks like we are running
> these on shared Jenkins executors, so our perf tests are fighting with
> other builds for CPU. I've opened an issue
> https://issues.apache.org/jira/browse/BEAM-4804 and am wondering if
> anyone knows an easy fix to isolate these jobs.
>
> Andrew
>
> On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  wrote:
>
> @Etienne: Nice to see the graphs! :)
>
> @Ismael: Good idea, there's no document yet. I think we could create a
> small google doc with instructions on how to do this.
>
> On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot
> wrote:
>
> Hi,
>
> @Andrew, this is because I did not find a way to set 2 scales on the Y
> axis on the perfkit graphs. Indeed numResults varies from 1 to 100 000 and
> runtimeSec is usually below 10s.
>
> Etienne
>
> On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
>
> This is great, should make performance work much easier! I'm going to get
> the Beam SQL Nexmark jobs publishing as well. (Opened
> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take
> on the Dataflow runner as 

Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-11-28 Thread Etienne Chauchot
Hi Alex,
Exporting results to the dashboards is as easy as writing to a BigQuery
table and then configuring the dashboard SQL query to display it. Here is an example:
- exporting:
https://github.com/apache/beam/blob/ad150c1d654aac5720975727d8c6981c5382b449/sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java#L163
- displaying:

SELECT
  DATE(timestamp) as date,
  runtimeSec
FROM
  [apache-beam-testing:nexmark.nexmark_0_DirectRunner_batch]
WHERE
  timestamp >= TIMESTAMP_TO_SEC(DATE_ADD(CURRENT_TIMESTAMP(), -2, "WEEK"))
ORDER BY
  date;

Best
Etienne
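
To make the export step in this message more concrete, here is a minimal Python sketch. It is not the Nexmark Java code linked in the message. It assumes that table names follow the pattern seen in nexmark_0_DirectRunner_batch, that the timestamp column holds seconds since the epoch, and that each row carries the runtimeSec and numResults values plotted on the dashboards. The helpers table_name, make_row and to_ndjson are hypothetical, and the actual upload to BigQuery is left out.

```python
import json
from datetime import datetime, timezone


def table_name(query: int, runner: str, mode: str) -> str:
    """Build a table name shaped like nexmark_0_DirectRunner_batch (assumed pattern)."""
    return f"nexmark_{query}_{runner}_{mode}"


def make_row(run_start: datetime, runtime_sec: float, num_results: int) -> dict:
    """One result row; the column names mirror those used by the dashboard query."""
    return {
        "timestamp": int(run_start.timestamp()),  # assumed: seconds since the epoch
        "runtimeSec": runtime_sec,
        "numResults": num_results,
    }


def to_ndjson(rows: list) -> str:
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


if __name__ == "__main__":
    start = datetime(2018, 11, 28, 9, 0, 0, tzinfo=timezone.utc)
    print(table_name(0, "DirectRunner", "batch"))
    print(to_ndjson([make_row(start, 2.1, 100000)]))
```

Newline-delimited JSON is used here only because it is a common load format; any writer that fills the same columns would let the dashboard query pick up the rows.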
On Tuesday, November 27, 2018 at 17:34 -0800, Alex Amato wrote:
> It would be great to add some lower-level benchmark tests for the Java SDK. I
> was thinking of using OpenCensus for
> collecting benchmarks, which looks easy to use and should be license compatible.
> I'm just not sure how to export the
> results so that we can display them on the perfkit dashboard for everyone to
> see.
> 
> Is there an example PR for this part? Can we write to this data store for 
> this perfkit dashboard easily?
> 
> https://github.com/census-instrumentation/opencensus-java
> https://github.com/census-instrumentation/opencensus-java/tree/master/exporters/trace/zipkin#quickstart
> 
> 
> 
> 
> On Thu, Jul 19, 2018 at 1:28 PM Andrew Pilloud  wrote:
> > The doc changes look good to me, I'll add Dataflow once it is ready. Thanks 
> > for opening the issue on the
> > DirectRunner. I'll try to get some progress on a dedicated perf node while 
> > you are gone, we can talk about
> > increasing the size of the nexmark input collection for the runs once we 
> > know what the utilization on that looks
> > like.
> > Enjoy your time off!
> > 
> > Andrew
> > On Thu, Jul 19, 2018 at 9:00 AM Etienne Chauchot  
> > wrote:
> > > Hi guys,
> > > As suggested by Anton below, I opened a PR on the website to
> > > reference the Nexmark dashboards. As I did
> > > not want users to take them for proper neutral benchmarks of the runners
> > > / engines, but rather for a CI piece of
> > > software, I added a disclaimer.
> > > Please:
> > > - tell me if you agree with the publication of such performance results
> > > - comment on the PR for the disclaimer.
> > > PR: https://github.com/apache/beam-site/pull/500
> > > 
> > > Thanks
> > > Etienne
> > > 
> > > On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
> > > > Hi Anton,
> > > > Yes, good idea, I'll update the Nexmark website page.
> > > > Etienne
> > > > On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
> > > > > These dashboards look great!
> > > > > 
> > > > > Can we publish the links to the dashboards somewhere, for better
> > > > > visibility? E.g. in the Jenkins website /
> > > > > emails, or the wiki.
> > > > >
> > > > > Regards,
> > > > > Anton
> > > > > On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud  
> > > > > wrote:
> > > > > > Hi Etienne,
> > > > > > 
> > > > > > I've been asking around and it sounds like we should be able to get 
> > > > > > a dedicated Jenkins node for performance
> > > > > > tests. Another thing that might help is making the runs a few times 
> > > > > > longer. They are currently running
> > > > > > around 2 seconds each, so the total time of the build probably 
> > > > > > exceeds testing. Internally at Google we are
> > > > > > running them with 2000x as many events on Dataflow, but a job of 
> > > > > > that size won't even complete on the Direct
> > > > > > Runner.
> > > > > > I didn't see the query 3 issues, but now that you point it out it 
> > > > > > looks like a bug to me too.
> > > > > > 
> > > > > > Andrew
> > > > > > On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot 
> > > > > >  wrote:
> > > > > > > Hi Andrew,
> > > > > > > Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I see
> > > > > > > no other way.
> > > > > > > Also, did you see the query 3 output size on the direct runner? It should be
> > > > > > > a straight line and it is not. I'm
> > > > > > > wondering if there is a problem with the state and timers implementation in
> > > > > > > the direct runner.
> > > > > > > Etienne
> > > > > > > On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
> > > > > > > > I'm noticing the graphs are really noisy. It looks like we are 
> > > > > > > > running these on shared Jenkins
> > > > > > > > executors, so our perf tests are fighting with other builds for 
> > > > > > > > CPU. I've opened an issue 
> > > > > > > > https://issues.apache.org/jira/browse/BEAM-4804 and am 
> > > > > > > > wondering if anyone knows an easy fix to isolate
> > > > > > > > these jobs.
> > > > > > > > Andrew
> > > > > > > > On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy 
> > > > > > > >  wrote:
> > > > > > > > > @Etienne: Nice to see the graphs! :)
> > > > > > > > > 
> > > > > > > > > @Ismael: Good idea, there's no document yet. I think we could 
> > > > > > > > > create a small google doc with
> > > > > > > > > instructions on how to do this.
> > > > > > > > > 
> > > > > > > > > pt., 13 lip 2018 o 10:46 

Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-11-27 Thread Alex Amato
It would be great to add some lower-level benchmark tests for the Java SDK.
I was thinking of using OpenCensus for collecting benchmarks, which looks
easy to use and should be license compatible. I'm just not sure how to
export the results so that we can display them on the perfkit dashboard for
everyone to see.

Is there an example PR for this part? Can we write to this data store for
this perfkit dashboard easily?

https://github.com/census-instrumentation/opencensus-java
https://github.com/census-instrumentation/opencensus-java/tree/master/exporters/trace/zipkin#quickstart
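
For readers wondering what a lower-level benchmark result could look like before it is exported, here is a minimal Python sketch. It does not use OpenCensus and is not Beam code; it only uses the standard-library timer as a stand-in. time_benchmark and sum_of_squares are hypothetical names, and the runtimeSec / numResults keys are borrowed from the Nexmark dashboards purely for illustration.

```python
import time
from typing import Callable, Dict


def time_benchmark(fn: Callable[[], int], repeats: int = 5) -> Dict[str, float]:
    """Run fn several times and keep the fastest wall-clock time.

    fn must return the number of elements it processed.
    """
    best = float("inf")
    count = 0
    for _ in range(repeats):
        started = time.perf_counter()
        count = fn()
        best = min(best, time.perf_counter() - started)
    return {"runtimeSec": best, "numResults": float(count)}


def sum_of_squares() -> int:
    """Hypothetical workload standing in for a low-level SDK operation."""
    n = 100_000
    sum(i * i for i in range(n))
    return n


if __name__ == "__main__":
    print(time_benchmark(sum_of_squares))
```

Keeping the fastest of several runs is one simple way to reduce noise from other processes on a shared machine; a real harness might report a median or a full distribution instead.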




On Thu, Jul 19, 2018 at 1:28 PM Andrew Pilloud  wrote:

> The doc changes look good to me, I'll add Dataflow once it is ready.
> Thanks for opening the issue on the DirectRunner. I'll try to get some
> progress on a dedicated perf node while you are gone, we can talk about
> increasing the size of the nexmark input collection for the runs once we
> know what the utilization on that looks like.
>
> Enjoy your time off!
>
>
> Andrew
>
> On Thu, Jul 19, 2018 at 9:00 AM Etienne Chauchot 
> wrote:
>
>> Hi guys,
>> As suggested by Anton below, I opened a PR on the website to reference
>> the Nexmark dashboards.
>> As I did not want users to take them for proper neutral benchmarks of the
>> runners / engines, but rather for a CI piece of software, I added a
>> disclaimer.
>>
>> Please:
>> - tell me if you agree with the publication of such performance results
>> - comment on the PR for the disclaimer.
>>
>> PR: https://github.com/apache/beam-site/pull/500
>>
>> Thanks
>>
>> Etienne
>>
>>
>> On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
>>
>> Hi Anton,
>>
>> Yes, good idea, I'll update the Nexmark website page.
>>
>> Etienne
>>
>> On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
>>
>> These dashboards look great!
>>
>> Can we publish the links to the dashboards somewhere, for better visibility?
>> E.g. in the Jenkins website / emails, or the wiki.
>>
>> Regards,
>> Anton
>>
>> On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud 
>> wrote:
>>
>> Hi Etienne,
>>
>> I've been asking around and it sounds like we should be able to get a
>> dedicated Jenkins node for performance tests. Another thing that might help
>> is making the runs a few times longer. They are currently running around 2
>> seconds each, so the total time of the build probably exceeds testing.
>> Internally at Google we are running them with 2000x as many events on
>> Dataflow, but a job of that size won't even complete on the Direct Runner.
>>
>> I didn't see the query 3 issues, but now that you point it out it looks
>> like a bug to me too.
>>
>> Andrew
>>
>> On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot 
>> wrote:
>>
>> Hi Andrew,
>>
>> Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I see no
>> other way.
>>
>> Also, did you see the query 3 output size on the direct runner? It should be a
>> straight line and it is not. I'm wondering if there is a problem with the state
>> and timers implementation in the direct runner.
>>
>> Etienne
>>
>> On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
>>
>> I'm noticing the graphs are really noisy. It looks like we are running
>> these on shared Jenkins executors, so our perf tests are fighting with
>> other builds for CPU. I've opened an issue
>> https://issues.apache.org/jira/browse/BEAM-4804 and am wondering if
>> anyone knows an easy fix to isolate these jobs.
>>
>> Andrew
>>
>> On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  wrote:
>>
>> @Etienne: Nice to see the graphs! :)
>>
>> @Ismael: Good idea, there's no document yet. I think we could create a
>> small google doc with instructions on how to do this.
>>
>> On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot
>> wrote:
>>
>> Hi,
>>
>> @Andrew, this is because I did not find a way to set 2 scales on the Y
>> axis on the perfkit graphs. Indeed numResults varies from 1 to 100 000 and
>> runtimeSec is usually below 10s.
>>
>> Etienne
>>
>> On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
>>
>> This is great, should make performance work much easier! I'm going to get
>> the Beam SQL Nexmark jobs publishing as well. (Opened
>> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take
>> on the Dataflow runner as well if no one else volunteers.
>>
>> I am curious as to why you have two separate graphs for runtime and count
>> rather than graphing runtime/count to get the throughput rate for each run?
>> Or should that be a third graph? Looks like it would just be a small tweak
>> to the query in perfkit.
>>
>>
>>
>> Andrew
>>
>> On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada 
>> wrote:
>>
>> This is really cool Etienne : ) thanks for working on this.
>> Out of curiosity, do you know how often the tests run on each runner?
>>
>> Best
>> -P.
>>
>> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
>> wrote:
>>
>> Awesome Etienne, this is really important for the (user) community to
>> have that visibility since it is one of the most 

Re: [FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-19 Thread Andrew Pilloud
The doc changes look good to me, I'll add Dataflow once it is ready. Thanks
for opening the issue on the DirectRunner. I'll try to get some progress on
a dedicated perf node while you are gone, we can talk about increasing the size
of the nexmark input collection for the runs once we know what the
utilization on that looks like.

Enjoy your time off!

Andrew

On Thu, Jul 19, 2018 at 9:00 AM Etienne Chauchot 
wrote:

> Hi guys,
> As suggested by Anton below, I opened a PR on the website to reference
> the Nexmark dashboards.
> As I did not want users to take them for proper neutral benchmarks of the
> runners / engines, but rather for a CI piece of software, I added a
> disclaimer.
>
> Please:
> - tell me if you agree with the publication of such performance results
> - comment on the PR for the disclaimer.
>
> PR: https://github.com/apache/beam-site/pull/500
>
> Thanks
>
> Etienne
>
>
> On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
>
> Hi Anton,
>
> Yes, good idea, I'll update the Nexmark website page.
>
> Etienne
>
> On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
>
> These dashboards look great!
>
> Can we publish the links to the dashboards somewhere, for better visibility?
> E.g. in the Jenkins website / emails, or the wiki.
>
> Regards,
> Anton
>
> On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud 
> wrote:
>
> Hi Etienne,
>
> I've been asking around and it sounds like we should be able to get a
> dedicated Jenkins node for performance tests. Another thing that might help
> is making the runs a few times longer. They are currently running around 2
> seconds each, so the total time of the build probably exceeds testing.
> Internally at Google we are running them with 2000x as many events on
> Dataflow, but a job of that size won't even complete on the Direct Runner.
>
> I didn't see the query 3 issues, but now that you point it out it looks
> like a bug to me too.
>
> Andrew
>
> On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot 
> wrote:
>
> Hi Andrew,
>
> Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I see no
> other way.
>
> Also, did you see the query 3 output size on the direct runner? It should be a
> straight line and it is not. I'm wondering if there is a problem with the state
> and timers implementation in the direct runner.
>
> Etienne
>
> On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
>
> I'm noticing the graphs are really noisy. It looks like we are running
> these on shared Jenkins executors, so our perf tests are fighting with
> other builds for CPU. I've opened an issue
> https://issues.apache.org/jira/browse/BEAM-4804 and am wondering if
> anyone knows an easy fix to isolate these jobs.
>
> Andrew
>
> On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  wrote:
>
> @Etienne: Nice to see the graphs! :)
>
> @Ismael: Good idea, there's no document yet. I think we could create a
> small google doc with instructions on how to do this.
>
> On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot
> wrote:
>
> Hi,
>
> @Andrew, this is because I did not find a way to set 2 scales on the Y
> axis on the perfkit graphs. Indeed numResults varies from 1 to 100 000 and
> runtimeSec is usually below 10s.
>
> Etienne
>
> On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
>
> This is great, should make performance work much easier! I'm going to get
> the Beam SQL Nexmark jobs publishing as well. (Opened
> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take
> on the Dataflow runner as well if no one else volunteers.
>
> I am curious as to why you have two separate graphs for runtime and count
> rather than graphing runtime/count to get the throughput rate for each run?
> Or should that be a third graph? Looks like it would just be a small tweak
> to the query in perfkit.
>
>
>
> Andrew
>
> On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  wrote:
>
> This is really cool Etienne : ) thanks for working on this.
> Out of curiosity, do you know how often the tests run on each runner?
>
> Best
> -P.
>
> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> wrote:
>
> Awesome Etienne, this is really important for the (user) community to have
> that visibility since it is one of the most important aspects of Beam's
> quality, kudos!
>
>
> Romain Manni-Bucau
> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
> 
>
>
> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
>
> It's really great to have these dashboards and integration in Jenkins !
>
> Thanks Etienne for driving this !
>
> Regards
> JB
>
> On 11/07/2018 15:13, Etienne Chauchot wrote:
> >
> > Hi guys,
> >
> > I'm glad to announce that the CI of Beam has much improved ! Indeed
> > Nexmark is now included in the perfkit 

[FEEDBACK REQUEST] Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-19 Thread Etienne Chauchot
Hi guys,
As suggested by Anton below, I opened a PR on the website to reference
the Nexmark dashboards. As I did not
want users to take them for proper neutral benchmarks of the runners / engines,
but rather for a CI piece of software, I
added a disclaimer.
Please:
- tell me if you agree with the publication of such performance results
- comment on the PR for the disclaimer.
PR: https://github.com/apache/beam-site/pull/500

Thanks
Etienne

On Thursday, July 19, 2018 at 12:30 +0200, Etienne Chauchot wrote:
> Hi Anton,
> Yes, good idea, I'll update the Nexmark website page.
> Etienne
> On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
> > These dashboards look great!
> > 
> > Can we publish the links to the dashboards somewhere, for better visibility?
> > E.g. in the Jenkins website / emails, or
> > the wiki.
> >
> > Regards,
> > Anton
> > On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud  wrote:
> > > Hi Etienne,
> > > 
> > > I've been asking around and it sounds like we should be able to get a 
> > > dedicated Jenkins node for performance
> > > tests. Another thing that might help is making the runs a few times 
> > > longer. They are currently running around 2
> > > seconds each, so the total time of the build probably exceeds testing. 
> > > Internally at Google we are running them
> > > with 2000x as many events on Dataflow, but a job of that size won't even 
> > > complete on the Direct Runner.
> > > I didn't see the query 3 issues, but now that you point it out it looks 
> > > like a bug to me too.
> > > 
> > > Andrew
> > > On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot  
> > > wrote:
> > > > Hi Andrew,
> > > > Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I see no
> > > > other way.
> > > > Also, did you see the query 3 output size on the direct runner? It should be a
> > > > straight line and it is not. I'm wondering
> > > > if there is a problem with the state and timers implementation in the direct runner.
> > > > Etienne
> > > > On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
> > > > > I'm noticing the graphs are really noisy. It looks like we are 
> > > > > running these on shared Jenkins executors, so
> > > > > our perf tests are fighting with other builds for CPU. I've opened an 
> > > > > issue https://issues.apache.org/jira/browse/BEAM-4804
> > > > > and am wondering if anyone knows an easy fix to isolate
> > > > > these jobs.
> > > > > Andrew
> > > > > On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  
> > > > > wrote:
> > > > > > @Etienne: Nice to see the graphs! :)
> > > > > > 
> > > > > > @Ismael: Good idea, there's no document yet. I think we could 
> > > > > > create a small google doc with instructions on
> > > > > > how to do this.
> > > > > > 
> > > > > > On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot
> > > > > > wrote:
> > > > > > > Hi,
> > > > > > > @Andrew, this is because I did not find a way to set 2 scales on
> > > > > > > the Y axis on the perfkit graphs. Indeed
> > > > > > > numResults varies from 1 to 100 000 and runtimeSec is usually
> > > > > > > below 10s.
> > > > > > > Etienne
> > > > > > > On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
> > > > > > > > This is great, should make performance work much easier! I'm 
> > > > > > > > going to get the Beam SQL Nexmark jobs
> > > > > > > > publishing as well. (Opened 
> > > > > > > > https://issues.apache.org/jira/browse/BEAM-4774 to track.) I 
> > > > > > > > might take on
> > > > > > > > the Dataflow runner as well if no one else volunteers.
> > > > > > > > 
> > > > > > > > I am curious as to why you have two separate graphs for runtime 
> > > > > > > > and count rather than graphing
> > > > > > > > runtime/count to get the throughput rate for each run? Or 
> > > > > > > > should that be a third graph? Looks like it
> > > > > > > > would just be a small tweak to the query in perfkit.
> > > > > > > > Andrew
> > > > > > > > On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada 
> > > > > > > >  wrote:
> > > > > > > > > This is really cool Etienne : ) thanks for working on
> > > > > > > > > this. Out of curiosity, do you know how often the
> > > > > > > > > tests run on each runner?
> > > > > > > > > 
> > > > > > > > > Best
> > > > > > > > > -P.
> > > > > > > > > 
> > > > > > > > > On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> > > > > > > > >  wrote:
> > > > > > > > > > Awesome Etienne, this is really important for the (user)
> > > > > > > > > > community to have that visibility since it
> > > > > > > > > > is one of the most important aspects of Beam's quality,
> > > > > > > > > > kudos!
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Romain Manni-Bucau
> > > > > > > > > > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> > > > > > > > > > 
> > > > > > > > > > On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré
> > > > > > > > > > wrote:
> > > > > > > > > > > It's really great to have these dashboards and 
> > > > > > > > > > > integration in Jenkins !
> > > > > > 

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-19 Thread Etienne Chauchot
Andrew,
here is the ticket about query 3: https://issues.apache.org/jira/browse/BEAM-4825
I added details with pseudocode of query 3 in the ticket and a link to the
dashboards to see how flaky this query is on the Direct Runner.
Etienne

On Thursday, July 19, 2018 at 12:22 +0200, Etienne Chauchot wrote:
> Hi Andrew,
> On Wednesday, July 18, 2018 at 10:08 -0700, Andrew Pilloud wrote:
> > Hi Etienne,
> > I've been asking around and it sounds like we should be able to get a 
> > dedicated Jenkins node for performance tests. 
> 
> Cool !
> > Another thing that might help is making the runs a few times longer. They 
> > are currently running around 2 seconds
> > each, so the total time of the build probably exceeds testing.
> 
> You mean increasing the size of the nexmark input collection? Currently it is 
> set to 100 000 events and is narrowed
> down to 10 000 IIRC for some queries (4 and 10 IIRC)
> >  Internally at Google we are running them with 2000x as many events on 
> > Dataflow, but a job of that size won't even
> > complete on the Direct Runner.
> 
> Yes probably not
> > I didn't see the query 3 issues, but now that you point it out it looks 
> > like a bug to me too.
> 
> It seems so to me too; I'll open a ticket.
> Etienne
> > Andrew
> > On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot  
> > wrote:
> > > Hi Andrew,
> > > Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I see no
> > > other way.
> > > Also, did you see the query 3 output size on the direct runner? It should be a
> > > straight line and it is not. I'm wondering if
> > > there is a problem with the state and timers implementation in the direct runner.
> > > Etienne
> > > On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
> > > > I'm noticing the graphs are really noisy. It looks like we are running 
> > > > these on shared Jenkins executors, so our
> > > > perf tests are fighting with other builds for CPU. I've opened an issue 
> > > > https://issues.apache.org/jira/browse/BEAM-4804
> > > > and am wondering if anyone knows an easy fix to isolate these jobs.
> > > > Andrew
> > > > On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  
> > > > wrote:
> > > > > @Etienne: Nice to see the graphs! :)
> > > > > 
> > > > > @Ismael: Good idea, there's no document yet. I think we could create 
> > > > > a small google doc with instructions on
> > > > > how to do this.
> > > > > 
> > > > > On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot wrote:
> > > > > > Hi, 
> > > > > > @Andrew, this is because I did not find a way to set 2 scales on 
> > > > > > the Y axis of the perfkit graphs. Indeed, numResults varies from 
> > > > > > 1 to 100 000 while runtimeSec is usually below 10 s.
> > > > > > Etienne
> > > > > > On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
> > > > > > > This is great, should make performance work much easier! I'm 
> > > > > > > going to get the Beam SQL Nexmark jobs
> > > > > > > publishing as well. (Opened 
> > > > > > > https://issues.apache.org/jira/browse/BEAM-4774 to track.) I 
> > > > > > > might take on the
> > > > > > > Dataflow runner as well if no one else volunteers.
> > > > > > > 
> > > > > > > I am curious as to why you have two separate graphs for runtime 
> > > > > > > and count rather than graphing
> > > > > > > runtime/count to get the throughput rate for each run? Or should 
> > > > > > > that be a third graph? Looks like it
> > > > > > > would just be a small tweak to the query in perfkit.
> > > > > > > Andrew
> > > > > > > On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada 
> > > > > > >  wrote:
> > > > > > > > This is really cool Etienne : ) thanks for working on this. Out 
> > > > > > > > of curiosity, do you know how often the
> > > > > > > > tests run on each runner?
> > > > > > > > 
> > > > > > > > Best
> > > > > > > > -P.
> > > > > > > > 
> > > > > > > > On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> > > > > > > >  wrote:
> > > > > > > > > Awesome Etienne, this is really important for the (user) 
> > > > > > > > > community to have that visibility since it is
> > > > > > > > > one of the most important aspects of Beam's quality, kudos!
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Romain Manni-Bucau
> > > > > > > > > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> > > > > > > > > 
> > > > > > > > > On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
> > > > > > > > > > It's really great to have these dashboards and integration 
> > > > > > > > > > in Jenkins!
> > > > > > > > > > 
> > > > > > > > > > Thanks Etienne for driving this!
> > > > > > > > > > 
> > > > > > > > > > Regards
> > > > > > > > > > JB
> > > > > > > > > > 
> > > > > > > > > > On 11/07/2018 15:13, Etienne Chauchot wrote:
> > > > > > > > > > > Hi guys,

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-19 Thread Etienne Chauchot
Hi Anton, 
Yes, good idea, I'll update the Nexmark page on the website.
Etienne
On Wednesday, July 18, 2018 at 10:17 -0700, Anton Kedin wrote:
> These dashboards look great!
> 
> Can we publish the links to the dashboards somewhere, for better visibility? 
> E.g. in the Jenkins website / emails, or the wiki.
> 
> Regards,
> Anton
> On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud  wrote:
> > Hi Etienne,
> > 
> > I've been asking around and it sounds like we should be able to get a 
> > dedicated Jenkins node for performance tests.
> > Another thing that might help is making the runs a few times longer. They 
> > are currently running around 2 seconds
> > each, so the total time of the build probably exceeds testing. Internally 
> > at Google we are running them with 2000x
> > as many events on Dataflow, but a job of that size won't even complete on 
> > the Direct Runner.
> > I didn't see the query 3 issues, but now that you point it out it looks 
> > like a bug to me too.
> > 
> > Andrew
> > On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot  
> > wrote:
> > > Hi Andrew,
> > > Yes I saw that, except dedicating jenkins nodes to nexmark, I see no 
> > > other way.
> > > Also, did you see the query 3 output size on the direct runner? It should 
> > > be a straight line and it is not. I'm wondering if there is a problem 
> > > with the state and timers implementation in the direct runner.
> > > Etienne
> > > On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
> > > > I'm noticing the graphs are really noisy. It looks like we are running 
> > > > these on shared Jenkins executors, so our
> > > > perf tests are fighting with other builds for CPU. I've opened an issue 
> > > > https://issues.apache.org/jira/browse/BEAM-4804
> > > > and am wondering if anyone knows an easy fix to isolate these jobs.
> > > > Andrew
> > > > On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  
> > > > wrote:
> > > > > @Etienne: Nice to see the graphs! :)
> > > > > 
> > > > > @Ismael: Good idea, there's no document yet. I think we could create 
> > > > > a small google doc with instructions on
> > > > > how to do this.
> > > > > 
> > > > > On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot wrote:
> > > > > > Hi, 
> > > > > > @Andrew, this is because I did not find a way to set 2 scales on 
> > > > > > the Y axis of the perfkit graphs. Indeed, numResults varies from 
> > > > > > 1 to 100 000 while runtimeSec is usually below 10 s.
> > > > > > Etienne
> > > > > > On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
> > > > > > > This is great, should make performance work much easier! I'm 
> > > > > > > going to get the Beam SQL Nexmark jobs
> > > > > > > publishing as well. (Opened 
> > > > > > > https://issues.apache.org/jira/browse/BEAM-4774 to track.) I 
> > > > > > > might take on the
> > > > > > > Dataflow runner as well if no one else volunteers.
> > > > > > > 
> > > > > > > I am curious as to why you have two separate graphs for runtime 
> > > > > > > and count rather than graphing
> > > > > > > runtime/count to get the throughput rate for each run? Or should 
> > > > > > > that be a third graph? Looks like it
> > > > > > > would just be a small tweak to the query in perfkit.
> > > > > > > Andrew
> > > > > > > On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada 
> > > > > > >  wrote:
> > > > > > > > This is really cool Etienne : ) thanks for working on this. Out 
> > > > > > > > of curiosity, do you know how often the
> > > > > > > > tests run on each runner?
> > > > > > > > 
> > > > > > > > Best
> > > > > > > > -P.
> > > > > > > > 
> > > > > > > > On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> > > > > > > >  wrote:
> > > > > > > > > Awesome Etienne, this is really important for the (user) 
> > > > > > > > > community to have that visibility since it is
> > > > > > > > > one of the most important aspects of Beam's quality, kudos!
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Romain Manni-Bucau
> > > > > > > > > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> > > > > > > > > 
> > > > > > > > > On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
> > > > > > > > > > It's really great to have these dashboards and integration 
> > > > > > > > > > in Jenkins!
> > > > > > > > > > 
> > > > > > > > > > Thanks Etienne for driving this!
> > > > > > > > > > 
> > > > > > > > > > Regards
> > > > > > > > > > JB
> > > > > > > > > > 
> > > > > > > > > > On 11/07/2018 15:13, Etienne Chauchot wrote:
> > > > > > > > > > > Hi guys,
> > > > > > > > > > > 
> > > > > > > > > > > I'm glad to announce that the CI of Beam has much 
> > > > > > > > > > > improved! Indeed
> > > > > > > > > > > Nexmark is now included in the 

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-19 Thread Etienne Chauchot
Hi Andrew,
On Wednesday, July 18, 2018 at 10:08 -0700, Andrew Pilloud wrote:
> Hi Etienne,
> I've been asking around and it sounds like we should be able to get a 
> dedicated Jenkins node for performance tests. 

Cool !
> Another thing that might help is making the runs a few times longer. They are 
> currently running around 2 seconds each,
> so the total time of the build probably exceeds testing.

You mean increasing the size of the Nexmark input collection? It is currently 
set to 100 000 events and, IIRC, is narrowed down to 10 000 for some queries 
(4 and 10).
>  Internally at Google we are running them with 2000x as many events on 
> Dataflow, but a job of that size won't even
> complete on the Direct Runner.

Yes probably not
> I didn't see the query 3 issues, but now that you point it out it looks like 
> a bug to me too.

It looks like one to me too, I'll open a ticket.
Etienne
> Andrew
> On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot  wrote:
> > Hi Andrew,
> > Yes I saw that, except dedicating jenkins nodes to nexmark, I see no other 
> > way.
> > Also, did you see the query 3 output size on the direct runner? It should 
> > be a straight line and it is not. I'm wondering if there is a problem 
> > with the state and timers implementation in the direct runner.
> > Etienne
> > On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
> > > I'm noticing the graphs are really noisy. It looks like we are running 
> > > these on shared Jenkins executors, so our
> > > perf tests are fighting with other builds for CPU. I've opened an issue 
> > > https://issues.apache.org/jira/browse/BEAM-4804
> > > and am wondering if anyone knows an easy fix to isolate these jobs.
> > > Andrew
> > > On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  wrote:
> > > > @Etienne: Nice to see the graphs! :)
> > > > 
> > > > @Ismael: Good idea, there's no document yet. I think we could create a 
> > > > small google doc with instructions on how
> > > > to do this.
> > > > 
> > > > On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot wrote:
> > > > > Hi, 
> > > > > @Andrew, this is because I did not find a way to set 2 scales on the 
> > > > > Y axis of the perfkit graphs. Indeed, numResults varies from 1 to 
> > > > > 100 000 while runtimeSec is usually below 10 s.
> > > > > Etienne
> > > > > On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
> > > > > > This is great, should make performance work much easier! I'm going 
> > > > > > to get the Beam SQL Nexmark jobs
> > > > > > publishing as well. (Opened 
> > > > > > https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might 
> > > > > > take on the
> > > > > > Dataflow runner as well if no one else volunteers.
> > > > > > 
> > > > > > I am curious as to why you have two separate graphs for runtime and 
> > > > > > count rather than graphing runtime/count
> > > > > > to get the throughput rate for each run? Or should that be a third 
> > > > > > graph? Looks like it would just be a
> > > > > > small tweak to the query in perfkit.
> > > > > > Andrew
> > > > > > On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  
> > > > > > wrote:
> > > > > > > This is really cool Etienne : ) thanks for working on this. Out of 
> > > > > > > curiosity, do you know how often the
> > > > > > > tests run on each runner?
> > > > > > > 
> > > > > > > Best
> > > > > > > -P.
> > > > > > > 
> > > > > > > On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> > > > > > >  wrote:
> > > > > > > > Awesome Etienne, this is really important for the (user) 
> > > > > > > > community to have that visibility since it is
> > > > > > > > one of the most important aspects of Beam's quality, kudos!
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Romain Manni-Bucau
> > > > > > > > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> > > > > > > > 
> > > > > > > > On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
> > > > > > > > > It's really great to have these dashboards and integration in 
> > > > > > > > > Jenkins!
> > > > > > > > > 
> > > > > > > > > Thanks Etienne for driving this!
> > > > > > > > > 
> > > > > > > > > Regards
> > > > > > > > > JB
> > > > > > > > > 
> > > > > > > > > On 11/07/2018 15:13, Etienne Chauchot wrote:
> > > > > > > > > > Hi guys,
> > > > > > > > > > 
> > > > > > > > > > I'm glad to announce that the CI of Beam has much improved! 
> > > > > > > > > > Indeed
> > > > > > > > > > Nexmark is now included in the perfkit dashboards.
> > > > > > > > > > 
> > > > > > > > > > At each commit on master, nexmark suites are run and plots 
> > > > > > > > > > are created
> > > > > > > > > > on the graphs.
> > > > > > > > > > 
> > > > > 

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-18 Thread Anton Kedin
These dashboards look great!

Can we publish the links to the dashboards somewhere, for better visibility?
E.g. in the Jenkins website / emails, or the wiki.

Regards,
Anton

On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud  wrote:

> Hi Etienne,
>
> I've been asking around and it sounds like we should be able to get a
> dedicated Jenkins node for performance tests. Another thing that might help
> is making the runs a few times longer. They are currently running around 2
> seconds each, so the total time of the build probably exceeds testing.
> Internally at Google we are running them with 2000x as many events on
> Dataflow, but a job of that size won't even complete on the Direct Runner.
>
> I didn't see the query 3 issues, but now that you point it out it looks
> like a bug to me too.
>
> Andrew
>
> On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot 
> wrote:
>
>> Hi Andrew,
>>
>> Yes I saw that, except dedicating jenkins nodes to nexmark, I see no
>> other way.
>>
>> Also, did you see the query 3 output size on the direct runner? It should
>> be a straight line and it is not. I'm wondering if there is a problem with
>> the state and timers implementation in the direct runner.
>>
>> Etienne
>>
>> On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
>>
>> I'm noticing the graphs are really noisy. It looks like we are running
>> these on shared Jenkins executors, so our perf tests are fighting with
>> other builds for CPU. I've opened an issue
>> https://issues.apache.org/jira/browse/BEAM-4804 and am wondering if
>> anyone knows an easy fix to isolate these jobs.
>>
>> Andrew
>>
>> On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  wrote:
>>
>> @Etienne: Nice to see the graphs! :)
>>
>> @Ismael: Good idea, there's no document yet. I think we could create a
>> small google doc with instructions on how to do this.
>>
>> On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot wrote:
>>
>> Hi,
>>
>> @Andrew, this is because I did not find a way to set 2 scales on the Y
>> axis of the perfkit graphs. Indeed, numResults varies from 1 to 100 000
>> while runtimeSec is usually below 10 s.
>>
>> Etienne
>>
>> On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
>>
>> This is great, should make performance work much easier! I'm going to get
>> the Beam SQL Nexmark jobs publishing as well. (Opened
>> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take
>> on the Dataflow runner as well if no one else volunteers.
>>
>> I am curious as to why you have two separate graphs for runtime and count
>> rather than graphing runtime/count to get the throughput rate for each run?
>> Or should that be a third graph? Looks like it would just be a small tweak
>> to the query in perfkit.
>>
>>
>>
>> Andrew
>>
>> On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada 
>> wrote:
>>
>> This is really cool Etienne : ) thanks for working on this.
>> Out of curiosity, do you know how often the tests run on each runner?
>>
>> Best
>> -P.
>>
>> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
>> wrote:
>>
>> Awesome Etienne, this is really important for the (user) community to
>> have that visibility since it is one of the most important aspects of
>> Beam's quality, kudos!
>>
>>
>> Romain Manni-Bucau
>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
>>
>> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
>>
>> It's really great to have these dashboards and integration in Jenkins !
>>
>> Thanks Etienne for driving this !
>>
>> Regards
>> JB
>>
>> On 11/07/2018 15:13, Etienne Chauchot wrote:
>> >
>> > Hi guys,
>> >
>> > I'm glad to announce that the CI of Beam has much improved ! Indeed
>> > Nexmark is now included in the perfkit dashboards.
>> >
>> > At each commit on master, nexmark suites are run and plots are created
>> > on the graphs.
>> >
>> > I've created 2 kinds of dashboards:
>> > - one for performance (run times of the queries)
>> > - one for the size of the output PCollection (which should be constant)
>> >
>> > There are dashboards for these runners:
>> > - spark
>> > - flink
>> > - direct runner
>> >
>> > Each dashboard contains:
>> > - graphs in batch mode
>> > - graphs in streaming mode
>> > - graphs for the 13 queries.
>> >
>> > That gives more than a hundred graphs (my right finger hurts after so
>> > many mouse clicks :) ). It is that detailed so that anyone
>> > can focus on the area they are interested in.
>> > Feel free to also create new dashboards with more aggregated data.
>> >
>> > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
>> > perfkit dashboards.
>> >
>> > Dashboards are there:
>> >
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>> >

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-18 Thread Andrew Pilloud
Hi Etienne,

I've been asking around and it sounds like we should be able to get a
dedicated Jenkins node for performance tests. Another thing that might help
is making the runs a few times longer. They currently take around 2 seconds
each, so the rest of the build probably takes longer than the tests themselves.
Internally at Google we are running them with 2000x as many events on
Dataflow, but a job of that size won't even complete on the Direct Runner.

I didn't see the query 3 issues, but now that you point it out it looks
like a bug to me too.

Andrew
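
To make the point about short runs concrete, here is a minimal sketch of the arithmetic. The 13 queries and the roughly 2 seconds per query come from this thread; the fixed build overhead of 300 seconds and the function name `measured_share` are assumptions made up for illustration, not measured values.

```python
def measured_share(query_sec: float, queries: int, overhead_sec: float) -> float:
    """Return the fraction of build wall-clock time spent inside the queries."""
    measured = query_sec * queries
    return measured / (measured + overhead_sec)


QUERIES = 13          # number of Nexmark queries mentioned in the announcement
OVERHEAD_SEC = 300.0  # assumed fixed build overhead (hypothetical value)

short_runs = measured_share(2.0, QUERIES, OVERHEAD_SEC)    # about 2 s per query
longer_runs = measured_share(10.0, QUERIES, OVERHEAD_SEC)  # runs made 5x longer

print(f"short runs:  {short_runs:.1%} of the build is measured work")
print(f"longer runs: {longer_runs:.1%} of the build is measured work")
```

Under these assumed numbers, less than a tenth of the build is spent in the queries, and making the runs five times longer raises that share to roughly 30%. The real overhead would have to be read from the Jenkins job itself.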

On Wed, Jul 18, 2018 at 1:13 AM Etienne Chauchot 
wrote:

> Hi Andrew,
>
> Yes I saw that, except dedicating jenkins nodes to nexmark, I see no other
> way.
>
> Also, did you see the query 3 output size on the direct runner? It should
> be a straight line and it is not. I'm wondering if there is a problem with
> the state and timers implementation in the direct runner.
>
> Etienne
>
> On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
>
> I'm noticing the graphs are really noisy. It looks like we are running
> these on shared Jenkins executors, so our perf tests are fighting with
> other builds for CPU. I've opened an issue
> https://issues.apache.org/jira/browse/BEAM-4804 and am wondering if
> anyone knows an easy fix to isolate these jobs.
>
> Andrew
>
> On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  wrote:
>
> @Etienne: Nice to see the graphs! :)
>
> @Ismael: Good idea, there's no document yet. I think we could create a
> small google doc with instructions on how to do this.
>
> On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot wrote:
>
> Hi,
>
> @Andrew, this is because I did not find a way to set 2 scales on the Y
> axis of the perfkit graphs. Indeed, numResults varies from 1 to 100 000
> while runtimeSec is usually below 10 s.
>
> Etienne
>
> On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
>
> This is great, should make performance work much easier! I'm going to get
> the Beam SQL Nexmark jobs publishing as well. (Opened
> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take
> on the Dataflow runner as well if no one else volunteers.
>
> I am curious as to why you have two separate graphs for runtime and count
> rather than graphing runtime/count to get the throughput rate for each run?
> Or should that be a third graph? Looks like it would just be a small tweak
> to the query in perfkit.
>
>
>
> Andrew
>
> On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  wrote:
>
> This is really cool Etienne : ) thanks for working on this.
> Out of curiosity, do you know how often the tests run on each runner?
>
> Best
> -P.
>
> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> wrote:
>
> Awesome Etienne, this is really important for the (user) community to have
> that visibility since it is one of the most important aspects of Beam's
> quality, kudos!
>
>
> Romain Manni-Bucau
> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
>
> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
>
> It's really great to have these dashboards and integration in Jenkins !
>
> Thanks Etienne for driving this !
>
> Regards
> JB
>
> On 11/07/2018 15:13, Etienne Chauchot wrote:
> >
> > Hi guys,
> >
> > I'm glad to announce that the CI of Beam has much improved ! Indeed
> > Nexmark is now included in the perfkit dashboards.
> >
> > At each commit on master, nexmark suites are run and plots are created
> > on the graphs.
> >
> > I've created 2 kinds of dashboards:
> > - one for performance (run times of the queries)
> > - one for the size of the output PCollection (which should be constant)
> >
> > There are dashboards for these runners:
> > - spark
> > - flink
> > - direct runner
> >
> > Each dashboard contains:
> > - graphs in batch mode
> > - graphs in streaming mode
> > - graphs for the 13 queries.
> >
> > That gives more than a hundred graphs (my right finger hurts after so
> > many mouse clicks :) ). It is that detailed so that anyone
> > can focus on the area they are interested in.
> > Feel free to also create new dashboards with more aggregated data.
> >
> > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
> > perfkit dashboards.
> >
> > Dashboards are there:
> >
> > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >
> > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> >

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-18 Thread Etienne Chauchot
Hi Andrew,
Yes, I saw that. Apart from dedicating Jenkins nodes to Nexmark, I see no other 
way.
Also, did you see the query 3 output size on the direct runner? It should be a 
straight line and it is not. I'm wondering if there is a problem with the state 
and timers implementation in the direct runner.
Etienne
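
For what it's worth, the "straight line" expectation can be checked mechanically. The sketch below assumes the per-run output sizes of one query are already available as a list of integers; the function names and the sample numbers are hypothetical and are not taken from the dashboards.

```python
from collections import Counter
from typing import List, Sequence


def is_flat(output_sizes: Sequence[int]) -> bool:
    """True when every run produced the same number of results."""
    return len(set(output_sizes)) <= 1


def deviating_runs(output_sizes: Sequence[int]) -> List[int]:
    """Indices of the runs whose size differs from the most common size."""
    if not output_sizes:
        return []
    expected, _ = Counter(output_sizes).most_common(1)[0]
    return [i for i, size in enumerate(output_sizes) if size != expected]


# Made-up series: one stable query and one flaky query.
stable = [580, 580, 580, 580]
flaky = [580, 580, 575, 580, 591]

print(is_flat(stable), deviating_runs(stable))
print(is_flat(flaky), deviating_runs(flaky))
```

Treating the most common size as the expected one is a design choice of this sketch; if the correct output size of a query is known in advance, comparing against that value directly would be stricter.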
On Tuesday, July 17, 2018 at 11:38 -0700, Andrew Pilloud wrote:
> I'm noticing the graphs are really noisy. It looks like we are running these 
> on shared Jenkins executors, so our perf
> tests are fighting with other builds for CPU. I've opened an issue 
> https://issues.apache.org/jira/browse/BEAM-4804 and
> am wondering if anyone knows an easy fix to isolate these jobs.
> Andrew
> On Fri, Jul 13, 2018 at 2:39 AM Łukasz Gajowy  wrote:
> > @Etienne: Nice to see the graphs! :)
> > 
> > @Ismael: Good idea, there's no document yet. I think we could create a 
> > small google doc with instructions on how to
> > do this.
> > 
> > On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot wrote:
> > > Hi, 
> > > @Andrew, this is because I did not find a way to set 2 scales on the Y 
> > > axis of the perfkit graphs. Indeed, numResults varies from 1 to 100 000 
> > > while runtimeSec is usually below 10 s.
> > > Etienne
> > > On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
> > > > This is great, should make performance work much easier! I'm going to 
> > > > get the Beam SQL Nexmark jobs publishing
> > > > as well. (Opened https://issues.apache.org/jira/browse/BEAM-4774 to 
> > > > track.) I might take on the Dataflow runner
> > > > as well if no one else volunteers.
> > > > 
> > > > I am curious as to why you have two separate graphs for runtime and 
> > > > count rather than graphing runtime/count to
> > > > get the throughput rate for each run? Or should that be a third graph? 
> > > > Looks like it would just be a small tweak
> > > > to the query in perfkit.
> > > > Andrew
> > > > On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  
> > > > wrote:
> > > > > This is really cool Etienne : ) thanks for working on this. Out of 
> > > > > curiosity, do you know how often the tests
> > > > > run on each runner?
> > > > > 
> > > > > Best
> > > > > -P.
> > > > > 
> > > > > On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> > > > >  wrote:
> > > > > > Awesome Etienne, this is really important for the (user) community 
> > > > > > to have that visibility since it is one
> > > > > > of the most important aspects of Beam's quality, kudos!
> > > > > > 
> > > > > > 
> > > > > > Romain Manni-Bucau
> > > > > > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> > > > > > 
> > > > > > On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
> > > > > > > It's really great to have these dashboards and integration in 
> > > > > > > Jenkins!
> > > > > > > 
> > > > > > > Thanks Etienne for driving this!
> > > > > > > 
> > > > > > > Regards
> > > > > > > JB
> > > > > > > 
> > > > > > > On 11/07/2018 15:13, Etienne Chauchot wrote:
> > > > > > > > Hi guys,
> > > > > > > > 
> > > > > > > > I'm glad to announce that the CI of Beam has much improved! 
> > > > > > > > Indeed
> > > > > > > > Nexmark is now included in the perfkit dashboards.
> > > > > > > > 
> > > > > > > > At each commit on master, nexmark suites are run and plots are 
> > > > > > > > created
> > > > > > > > on the graphs.
> > > > > > > > 
> > > > > > > > I've created 2 kinds of dashboards:
> > > > > > > > - one for performance (run times of the queries)
> > > > > > > > - one for the size of the output PCollection (which should be 
> > > > > > > >   constant)
> > > > > > > > 
> > > > > > > > There are dashboards for these runners:
> > > > > > > > - spark
> > > > > > > > - flink
> > > > > > > > - direct runner
> > > > > > > > 
> > > > > > > > Each dashboard contains:
> > > > > > > > - graphs in batch mode
> > > > > > > > - graphs in streaming mode
> > > > > > > > - graphs for the 13 queries.
> > > > > > > > 
> > > > > > > > That gives more than a hundred graphs (my right finger hurts 
> > > > > > > > after so
> > > > > > > > many mouse clicks :) ). It is that detailed so that 
> > > > > > > > anyone
> > > > > > > > can focus on the area they are interested in.
> > > > > > > > Feel free to also create new dashboards with more aggregated 
> > > > > > > > data.
> > > > > > > > 
> > > > > > > > Thanks to Lukasz and Cham for reviewing my PRs and showing how 
> > > > > > > > to use

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-13 Thread Łukasz Gajowy
@Etienne: Nice to see the graphs! :)

@Ismael: Good idea, there's no document yet. I think we could create a
small google doc with instructions on how to do this.

On Fri, Jul 13, 2018 at 10:46, Etienne Chauchot wrote:

> Hi,
>
> @Andrew, this is because I did not find a way to set 2 scales on the Y
> axis of the perfkit graphs. Indeed, numResults varies from 1 to 100 000
> while runtimeSec is usually below 10 s.
>
> Etienne
>
> On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
>
> This is great, should make performance work much easier! I'm going to get
> the Beam SQL Nexmark jobs publishing as well. (Opened
> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take
> on the Dataflow runner as well if no one else volunteers.
>
> I am curious as to why you have two separate graphs for runtime and count
> rather than graphing runtime/count to get the throughput rate for each run?
> Or should that be a third graph? Looks like it would just be a small tweak
> to the query in perfkit.
>
>
>
> Andrew
>
> On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  wrote:
>
> This is really cool Etienne : ) thanks for working on this.
> Out of curiosity, do you know how often the tests run on each runner?
>
> Best
> -P.
>
> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> wrote:
>
> Awesome Etienne, this is really important for the (user) community to have
> that visibility since it is one of the most important aspects of Beam's
> quality, kudos!
>
>
> Romain Manni-Bucau
> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
>
> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
>
> It's really great to have these dashboards and integration in Jenkins !
>
> Thanks Etienne for driving this !
>
> Regards
> JB
>
> On 11/07/2018 15:13, Etienne Chauchot wrote:
> >
> > Hi guys,
> >
> > I'm glad to announce that the CI of Beam has much improved ! Indeed
> > Nexmark is now included in the perfkit dashboards.
> >
> > At each commit on master, nexmark suites are run and plots are created
> > on the graphs.
> >
> > I've created 2 kinds of dashboards:
> > - one for performance (run times of the queries)
> > - one for the size of the output PCollection (which should be constant)
> >
> > There are dashboards for these runners:
> > - spark
> > - flink
> > - direct runner
> >
> > Each dashboard contains:
> > - graphs in batch mode
> > - graphs in streaming mode
> > - graphs for the 13 queries.
> >
> > That gives more than a hundred graphs (my right finger hurts after so
> > many mouse clicks :) ). It is that detailed so that anyone
> > can focus on the area they are interested in.
> > Feel free to also create new dashboards with more aggregated data.
> >
> > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
> > perfkit dashboards.
> >
> > Dashboards are there:
> >
> > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >
> > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> >
> >
> > Enjoy,
> >
> > Etienne
> >
> >
>
>


Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-13 Thread Etienne Chauchot
Hi, 
@Andrew, this is because I did not find a way to set 2 scales on the Y axis of 
the perfkit graphs. Indeed, numResults varies from 1 to 100 000 while 
runtimeSec is usually below 10 s.
Etienne
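
To make the scale mismatch above concrete, here is a small sketch in Python. The function name and constants are assumptions made for illustration; only the rough magnitudes (numResults up to 100 000, runtimeSec below 10 s) come from the message.

```python
def axis_fraction(series_max: float, axis_max: float) -> float:
    """Fraction of a shared linear Y axis that a series can occupy."""
    if axis_max <= 0:
        raise ValueError("axis_max must be positive")
    return series_max / axis_max

# Magnitudes taken from the message above.
NUM_RESULTS_MAX = 100_000  # largest output size
RUNTIME_SEC_MAX = 10       # runtimes are usually below this

# On a single axis scaled to fit numResults, the runtime curve is nearly flat.
runtime_share = axis_fraction(RUNTIME_SEC_MAX, NUM_RESULTS_MAX)
print(f"runtimeSec would use at most {runtime_share:.4%} of the axis height")
```

With one linear axis, the runtime series would be squeezed into a sliver at the bottom of the plot, which is why the two metrics are shown on separate graphs.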
On Thursday, July 12, 2018 at 12:04 -0700, Andrew Pilloud wrote:
> This is great, should make performance work much easier! I'm going to get the 
> Beam SQL Nexmark jobs publishing as
> well. (Opened https://issues.apache.org/jira/browse/BEAM-4774 to track.) I 
> might take on the Dataflow runner as well
> if no one else volunteers.
> 
> I am curious as to why you have two separate graphs for runtime and count 
> rather then graphing runtime/count to get
> the throughput rate for each run? Or should that be a third graph? Looks like 
> it would just be a small tweak to the
> query in perfkit.
> Andrew
> On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  wrote:
> > This is really cool Etienne : ) thanks for working on this. Out of 
> > curiosity, do you know how often the tests run on
> > each runner?
> > 
> > Best
> > -P.
> > 
> > On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau  
> > wrote:
> > > Awesome Etienne, this is really important for the (user) community to 
> > > have that visibility since it is one of the
> > > most important aspect of the Beam's quality, kudo!
> > > 
> > > 
> > > Romain Manni-Bucau
> > > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> > > 
> > > On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
> > > > It's really great to have these dashboards and integration in Jenkins !
> > > > 
> > > > 
> > > > 
> > > > Thanks Etienne for driving this !
> > > > 
> > > > 
> > > > 
> > > > Regards
> > > > 
> > > > JB
> > > > 
> > > > 
> > > > 
> > > > On 11/07/2018 15:13, Etienne Chauchot wrote:
> > > > 
> > > > > 
> > > > 
> > > > > Hi guys,
> > > > 
> > > > > 
> > > > 
> > > > > I'm glad to announce that the CI of Beam has much improved ! Indeed
> > > > 
> > > > > Nexmark is now included in the perfkit dashboards.
> > > > 
> > > > > 
> > > > 
> > > > > At each commit on master, nexmark suites are run and plots are created
> > > > 
> > > > > on the graphs.
> > > > 
> > > > > 
> > > > 
> > > > > I've created 2 kind of dashboards:
> > > > 
> > > > > - one for performances (run times of the queries)
> > > > 
> > > > > - one for the size of the output PCollection (which should be 
> > > > > constant)
> > > > 
> > > > > 
> > > > 
> > > > > There are dashboards for these runners:
> > > > 
> > > > > - spark
> > > > 
> > > > > - flink
> > > > 
> > > > > - direct runner
> > > > 
> > > > > 
> > > > 
> > > > > Each dashboard contains:
> > > > 
> > > > > - graphs in batch mode 
> > > > 
> > > > > - graphs in streaming mode
> > > > 
> > > > > - graphs for the 13 queries.
> > > > 
> > > > > 
> > > > 
> > > > > That gives more than a hundred of graphs (my right finger hurts after 
> > > > > so
> > > > 
> > > > > many clics on the mouse :) ). It is detailed that much so that anyone
> > > > 
> > > > > can focus on the area they have interest in.
> > > > 
> > > > > Feel free to also create new dashboards with more aggregated data.
> > > > 
> > > > > 
> > > > 
> > > > > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
> > > > 
> > > > > perfkit dashboards.
> > > > 
> > > > > 
> > > > 
> > > > > Dashboards are there:
> > > > 
> > > > > 
> > > > 
> > > > > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> > > > 
> > > > > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> > > > 
> > > > > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> > > > 
> > > > > 
> > > > 
> > > > > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> > > > 
> > > > > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> > > > 
> > > > > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> > > > 
> > > > > 
> > > > 
> > > > > 
> > > > 
> > > > > Enjoy, 
> > > > 
> > > > > 
> > > > 
> > > > > Etienne
> > > > 
> > > > > 
> > > > 
> > > > > 
> > > > 
> > > > 
> > > > 

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-13 Thread Etienne Chauchot
Hi Pablo,
Yes, they run at each commit on master (post-commit Jenkins script).
Etienne
On Thursday, July 12, 2018 at 11:40 -0700, Pablo Estrada wrote:
> This is really cool Etienne : ) thanks for working on this. Out of curiosity, 
> do you know how often the tests run on
> each runner?
> 
> Best
> -P.
> 
> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau  
> wrote:
> > Awesome Etienne, this is really important for the (user) community to have 
> > that visibility since it is one of the
> > most important aspect of the Beam's quality, kudo!
> > 
> > 
> > Romain Manni-Bucau
> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> > 
> > On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
> > > It's really great to have these dashboards and integration in Jenkins !
> > > 
> > > 
> > > 
> > > Thanks Etienne for driving this !
> > > 
> > > 
> > > 
> > > Regards
> > > 
> > > JB
> > > 
> > > 
> > > 
> > > On 11/07/2018 15:13, Etienne Chauchot wrote:
> > > 
> > > > 
> > > 
> > > > Hi guys,
> > > 
> > > > 
> > > 
> > > > I'm glad to announce that the CI of Beam has much improved ! Indeed
> > > 
> > > > Nexmark is now included in the perfkit dashboards.
> > > 
> > > > 
> > > 
> > > > At each commit on master, nexmark suites are run and plots are created
> > > 
> > > > on the graphs.
> > > 
> > > > 
> > > 
> > > > I've created 2 kind of dashboards:
> > > 
> > > > - one for performances (run times of the queries)
> > > 
> > > > - one for the size of the output PCollection (which should be constant)
> > > 
> > > > 
> > > 
> > > > There are dashboards for these runners:
> > > 
> > > > - spark
> > > 
> > > > - flink
> > > 
> > > > - direct runner
> > > 
> > > > 
> > > 
> > > > Each dashboard contains:
> > > 
> > > > - graphs in batch mode 
> > > 
> > > > - graphs in streaming mode
> > > 
> > > > - graphs for the 13 queries.
> > > 
> > > > 
> > > 
> > > > That gives more than a hundred of graphs (my right finger hurts after so
> > > 
> > > > many clics on the mouse :) ). It is detailed that much so that anyone
> > > 
> > > > can focus on the area they have interest in.
> > > 
> > > > Feel free to also create new dashboards with more aggregated data.
> > > 
> > > > 
> > > 
> > > > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
> > > 
> > > > perfkit dashboards.
> > > 
> > > > 
> > > 
> > > > Dashboards are there:
> > > 
> > > > 
> > > 
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> > > 
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> > > 
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> > > 
> > > > 
> > > 
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> > > 
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> > > 
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> > > 
> > > > 
> > > 
> > > > 
> > > 
> > > > Enjoy, 
> > > 
> > > > 
> > > 
> > > > Etienne
> > > 
> > > > 
> > > 
> > > > 
> > > 
> > > 
> > > 

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-12 Thread Ahmet Altay
Thank you Etienne! This looks great.

I hope we can get other languages to have benchmarks at this level soon
enough.

Ahmet

On Thu, Jul 12, 2018 at 1:45 PM, Ismaël Mejía  wrote:

> That’s great to see in action, great work Etienne!
>
> Is there any document on how to integrate ‘stuff’ into the dashboards?
> I think this is worth having for people willing to do so like Kai or
> Andrew. Are there any docs on this? or maybe Lukasz Gajowy know ?
> On Thu, Jul 12, 2018 at 9:04 PM Andrew Pilloud 
> wrote:
> >
> > This is great, should make performance work much easier! I'm going to
> get the Beam SQL Nexmark jobs publishing as well. (Opened
> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take
> on the Dataflow runner as well if no one else volunteers.
> >
> > I am curious as to why you have two separate graphs for runtime and
> count rather then graphing runtime/count to get the throughput rate for
> each run? Or should that be a third graph? Looks like it would just be a
> small tweak to the query in perfkit.
> >
> > Andrew
> >
> > On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada 
> wrote:
> >>
> >> This is really cool Etienne : ) thanks for working on this.
> >> Our of curiosity, do you know how often the tests run on each runner?
> >>
> >> Best
> >> -P.
> >>
> >> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau <
> rmannibu...@gmail.com> wrote:
> >>>
> >>> Awesome Etienne, this is really important for the (user) community to
> have that visibility since it is one of the most important aspect of the
> Beam's quality, kudo!
> >>>
> >>>
> >>> Romain Manni-Bucau
> >>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>>
> >>>
> >>> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
> 
>  It's really great to have these dashboards and integration in Jenkins
> !
> 
>  Thanks Etienne for driving this !
> 
>  Regards
>  JB
> 
>  On 11/07/2018 15:13, Etienne Chauchot wrote:
>  >
>  > Hi guys,
>  >
>  > I'm glad to announce that the CI of Beam has much improved ! Indeed
>  > Nexmark is now included in the perfkit dashboards.
>  >
>  > At each commit on master, nexmark suites are run and plots are
> created
>  > on the graphs.
>  >
>  > I've created 2 kind of dashboards:
>  > - one for performances (run times of the queries)
>  > - one for the size of the output PCollection (which should be
> constant)
>  >
>  > There are dashboards for these runners:
>  > - spark
>  > - flink
>  > - direct runner
>  >
>  > Each dashboard contains:
>  > - graphs in batch mode
>  > - graphs in streaming mode
>  > - graphs for the 13 queries.
>  >
>  > That gives more than a hundred of graphs (my right finger hurts
> after so
>  > many clics on the mouse :) ). It is detailed that much so that
> anyone
>  > can focus on the area they have interest in.
>  > Feel free to also create new dashboards with more aggregated data.
>  >
>  > Thanks to Lukasz and Cham for reviewing my PRs and showing how to
> use
>  > perfkit dashboards.
>  >
>  > Dashboards are there:
>  >
>  > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>  > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>  > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>  >
>  > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
>  > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
>  > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
>  >
>  >
>  > Enjoy,
>  >
>  > Etienne
>  >
>  >
> 
>  --
>  Jean-Baptiste Onofré
>  jbono...@apache.org
>  http://blog.nanthrax.net
>  Talend - http://www.talend.com
> >>
> >> --
> >> Got feedback? go/pabloem-feedback
>


Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-12 Thread Ismaël Mejía
That’s great to see in action, great work Etienne!

Is there any documentation on how to integrate ‘stuff’ into the dashboards?
I think this is worth having for people willing to do so, like Kai or
Andrew. Or maybe Lukasz Gajowy knows?
On Thu, Jul 12, 2018 at 9:04 PM Andrew Pilloud  wrote:
>
> This is great, should make performance work much easier! I'm going to get the 
> Beam SQL Nexmark jobs publishing as well. (Opened 
> https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take on 
> the Dataflow runner as well if no one else volunteers.
>
> I am curious as to why you have two separate graphs for runtime and count 
> rather then graphing runtime/count to get the throughput rate for each run? 
> Or should that be a third graph? Looks like it would just be a small tweak to 
> the query in perfkit.
>
> Andrew
>
> On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  wrote:
>>
>> This is really cool Etienne : ) thanks for working on this.
>> Our of curiosity, do you know how often the tests run on each runner?
>>
>> Best
>> -P.
>>
>> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau  
>> wrote:
>>>
>>> Awesome Etienne, this is really important for the (user) community to have 
>>> that visibility since it is one of the most important aspect of the Beam's 
>>> quality, kudo!
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>
>>>
>>> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:

 It's really great to have these dashboards and integration in Jenkins !

 Thanks Etienne for driving this !

 Regards
 JB

 On 11/07/2018 15:13, Etienne Chauchot wrote:
 >
 > Hi guys,
 >
 > I'm glad to announce that the CI of Beam has much improved ! Indeed
 > Nexmark is now included in the perfkit dashboards.
 >
 > At each commit on master, nexmark suites are run and plots are created
 > on the graphs.
 >
 > I've created 2 kind of dashboards:
 > - one for performances (run times of the queries)
 > - one for the size of the output PCollection (which should be constant)
 >
 > There are dashboards for these runners:
 > - spark
 > - flink
 > - direct runner
 >
 > Each dashboard contains:
 > - graphs in batch mode
 > - graphs in streaming mode
 > - graphs for the 13 queries.
 >
 > That gives more than a hundred of graphs (my right finger hurts after so
 > many clics on the mouse :) ). It is detailed that much so that anyone
 > can focus on the area they have interest in.
 > Feel free to also create new dashboards with more aggregated data.
 >
 > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
 > perfkit dashboards.
 >
 > Dashboards are there:
 >
 > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
 > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
 > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
 >
 > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
 > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
 > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
 >
 >
 > Enjoy,
 >
 > Etienne
 >
 >

 --
 Jean-Baptiste Onofré
 jbono...@apache.org
 http://blog.nanthrax.net
 Talend - http://www.talend.com
>>
>> --
>> Got feedback? go/pabloem-feedback


Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-12 Thread Andrew Pilloud
This is great, should make performance work much easier! I'm going to get
the Beam SQL Nexmark jobs publishing as well. (Opened
https://issues.apache.org/jira/browse/BEAM-4774 to track.) I might take on
the Dataflow runner as well if no one else volunteers.

I am curious as to why you have two separate graphs for runtime and count
rather than graphing runtime/count to get the throughput rate for each run?
Or should that be a third graph? Looks like it would just be a small tweak
to the query in perfkit.

Andrew
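
As a rough illustration of the combined metric suggested above, here is a minimal Python sketch. It assumes that "throughput" means results per second, that is, the output count divided by the runtime; the field names numResults and runtimeSec are the ones used in this thread, while the function name and sample rows are hypothetical and not real dashboard data.

```python
from typing import Optional

def throughput(num_results: int, runtime_sec: float) -> Optional[float]:
    """Results per second for one run, or None when the runtime is not positive."""
    if runtime_sec <= 0:
        return None
    return num_results / runtime_sec

# Hypothetical runs as (numResults, runtimeSec) pairs. Not real dashboard data.
runs = [(100_000, 8.0), (92_000, 4.0), (1, 0.5)]
rates = [throughput(n, t) for n, t in runs]
print(rates)
```

A single derived series like this would fit on one Y axis, at the cost of hiding whether a change came from the runtime or from the output size.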

On Thu, Jul 12, 2018 at 11:40 AM Pablo Estrada  wrote:

> This is really cool Etienne : ) thanks for working on this.
> Our of curiosity, do you know how often the tests run on each runner?
>
> Best
> -P.
>
> On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
> wrote:
>
>> Awesome Etienne, this is really important for the (user) community to
>> have that visibility since it is one of the most important aspect of the
>> Beam's quality, kudo!
>>
>>
>> Romain Manni-Bucau
>> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
>>
>>
>> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
>>
>>> It's really great to have these dashboards and integration in Jenkins !
>>>
>>> Thanks Etienne for driving this !
>>>
>>> Regards
>>> JB
>>>
>>> On 11/07/2018 15:13, Etienne Chauchot wrote:
>>> >
>>> > Hi guys,
>>> >
>>> > I'm glad to announce that the CI of Beam has much improved ! Indeed
>>> > Nexmark is now included in the perfkit dashboards.
>>> >
>>> > At each commit on master, nexmark suites are run and plots are created
>>> > on the graphs.
>>> >
>>> > I've created 2 kind of dashboards:
>>> > - one for performances (run times of the queries)
>>> > - one for the size of the output PCollection (which should be constant)
>>> >
>>> > There are dashboards for these runners:
>>> > - spark
>>> > - flink
>>> > - direct runner
>>> >
>>> > Each dashboard contains:
>>> > - graphs in batch mode
>>> > - graphs in streaming mode
>>> > - graphs for the 13 queries.
>>> >
>>> > That gives more than a hundred of graphs (my right finger hurts after
>>> so
>>> > many clics on the mouse :) ). It is detailed that much so that anyone
>>> > can focus on the area they have interest in.
>>> > Feel free to also create new dashboards with more aggregated data.
>>> >
>>> > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
>>> > perfkit dashboards.
>>> >
>>> > Dashboards are there:
>>> >
>>> > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>>> > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>>> > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>>> >
>>> > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
>>> > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
>>> > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
>>> >
>>> > Enjoy,
>>> >
>>> > Etienne
>>> >
>>> >
>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>> --
> Got feedback? go/pabloem-feedback
> 
>


Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-12 Thread Pablo Estrada
This is really cool Etienne : ) thanks for working on this.
Out of curiosity, do you know how often the tests run on each runner?

Best
-P.

On Thu, Jul 12, 2018 at 2:15 AM Romain Manni-Bucau 
wrote:

> Awesome Etienne, this is really important for the (user) community to have
> that visibility since it is one of the most important aspect of the Beam's
> quality, kudo!
>
>
> Romain Manni-Bucau
> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
>
>
> On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:
>
>> It's really great to have these dashboards and integration in Jenkins !
>>
>> Thanks Etienne for driving this !
>>
>> Regards
>> JB
>>
>> On 11/07/2018 15:13, Etienne Chauchot wrote:
>> >
>> > Hi guys,
>> >
>> > I'm glad to announce that the CI of Beam has much improved ! Indeed
>> > Nexmark is now included in the perfkit dashboards.
>> >
>> > At each commit on master, nexmark suites are run and plots are created
>> > on the graphs.
>> >
>> > I've created 2 kind of dashboards:
>> > - one for performances (run times of the queries)
>> > - one for the size of the output PCollection (which should be constant)
>> >
>> > There are dashboards for these runners:
>> > - spark
>> > - flink
>> > - direct runner
>> >
>> > Each dashboard contains:
>> > - graphs in batch mode
>> > - graphs in streaming mode
>> > - graphs for the 13 queries.
>> >
>> > That gives more than a hundred of graphs (my right finger hurts after so
>> > many clics on the mouse :) ). It is detailed that much so that anyone
>> > can focus on the area they have interest in.
>> > Feel free to also create new dashboards with more aggregated data.
>> >
>> > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
>> > perfkit dashboards.
>> >
>> > Dashboards are there:
>> >
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>> >
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
>> >
>> > Enjoy,
>> >
>> > Etienne
>> >
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
> --
Got feedback? go/pabloem-feedback


Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-12 Thread Romain Manni-Bucau
Awesome Etienne, this is really important for the (user) community to have
that visibility, since it is one of the most important aspects of Beam's
quality. Kudos!

Romain Manni-Bucau
@rmannibucau | Blog | Old Blog | Github | LinkedIn | Book



On Thu, Jul 12, 2018 at 10:59, Jean-Baptiste Onofré wrote:

> It's really great to have these dashboards and integration in Jenkins !
>
> Thanks Etienne for driving this !
>
> Regards
> JB
>
> On 11/07/2018 15:13, Etienne Chauchot wrote:
> >
> > Hi guys,
> >
> > I'm glad to announce that the CI of Beam has much improved ! Indeed
> > Nexmark is now included in the perfkit dashboards.
> >
> > At each commit on master, nexmark suites are run and plots are created
> > on the graphs.
> >
> > I've created 2 kind of dashboards:
> > - one for performances (run times of the queries)
> > - one for the size of the output PCollection (which should be constant)
> >
> > There are dashboards for these runners:
> > - spark
> > - flink
> > - direct runner
> >
> > Each dashboard contains:
> > - graphs in batch mode
> > - graphs in streaming mode
> > - graphs for the 13 queries.
> >
> > That gives more than a hundred of graphs (my right finger hurts after so
> > many clics on the mouse :) ). It is detailed that much so that anyone
> > can focus on the area they have interest in.
> > Feel free to also create new dashboards with more aggregated data.
> >
> > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
> > perfkit dashboards.
> >
> > Dashboards are there:
> >
> > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >
> > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> >
> > Enjoy,
> >
> > Etienne
> >
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-12 Thread Jean-Baptiste Onofré
It's really great to have these dashboards and integration in Jenkins !

Thanks Etienne for driving this !

Regards
JB

On 11/07/2018 15:13, Etienne Chauchot wrote:
> 
> Hi guys,
> 
> I'm glad to announce that the CI of Beam has much improved ! Indeed
> Nexmark is now included in the perfkit dashboards.
> 
> At each commit on master, nexmark suites are run and plots are created
> on the graphs.
> 
> I've created 2 kind of dashboards:
> - one for performances (run times of the queries)
> - one for the size of the output PCollection (which should be constant)
> 
> There are dashboards for these runners:
> - spark
> - flink
> - direct runner
> 
> Each dashboard contains:
> - graphs in batch mode 
> - graphs in streaming mode
> - graphs for the 13 queries.
> 
> That gives more than a hundred of graphs (my right finger hurts after so
> many clics on the mouse :) ). It is detailed that much so that anyone
> can focus on the area they have interest in.
> Feel free to also create new dashboards with more aggregated data.
> 
> Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
> perfkit dashboards.
> 
> Dashboards are there:
> 
> https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> 
> https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> 
> 
> Enjoy, 
> 
> Etienne
> 
> 

-- 
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-12 Thread Etienne Chauchot
Hi Kai,
Cool for TPC-H, it will be complementary to Nexmark.
Regarding Dataflow: we can run Nexmark on Dataflow. Note that it was
the original target of the Nexmark port by Mark. We have not done it because we
have no Dataflow environment available.
Is someone from Google willing to run Nexmark on Dataflow and add the
postCommit script and the perfkit dashboards?
Thanks
Etienne
On Wednesday, July 11, 2018 at 20:11 -0700, Kai Jiang wrote:
> Hi Etienne,
> It's awesome for working on these useful dashboards. I am getting TPC-H 
> benchmark running on Flink and Dataflow
> Runner. I could work on similar dashboards for TPC benchmark after code 
> merged.
> Also, it's great to have a dashboards for Dataflow.
> 
> Best,
> Kai
> On Wed, Jul 11, 2018 at 6:35 AM Etienne Chauchot  wrote:
> > First catch of the nexmark-CI:
> > It seems that there was a change in the direct runner.
> > Query3 (exercise state and timers)
> > - output size should be constant but has increased today => Was there a change in
> > state and timer related code?
> > - the output size of this query is different between batch and streaming modes on
> > direct runner.
> > Etienne
> > On Wednesday, July 11, 2018 at 15:25 +0200, Etienne Chauchot wrote:
> > > Is someone interested in creating the scripts and dashboards for the 
> > > other runners? They can be created by copying
> > > the existing scripts and dashboards and changing one gradle parameter in 
> > > the scripts and the table name in the
> > > dashboards. 
> > > I have created the tickets:
> > > https://issues.apache.org/jira/browse/BEAM-4763
> > > https://issues.apache.org/jira/browse/BEAM-4762
> > > https://issues.apache.org/jira/browse/BEAM-4761
> > > https://issues.apache.org/jira/browse/BEAM-4760
> > > Etienne
> > > On Wednesday, July 11, 2018 at 15:13 +0200, Etienne Chauchot wrote:
> > > > Hi guys, 
> > > > 
> > > > I'm glad to announce that the CI of Beam has much improved !  Indeed 
> > > > Nexmark is now included in the perfkit
> > > > dashboards.
> > > > 
> > > > At each commit on master, nexmark suites are run and plots are created 
> > > > on the graphs.
> > > > 
> > > > I've created 2 kind of dashboards:
> > > > - one for performances (run times of the queries)
> > > > - one for the size of the output PCollection (which  should be constant)
> > > > 
> > > > There are dashboards for these runners:
> > > > - spark
> > > > - flink
> > > > - direct runner
> > > > 
> > > > Each dashboard contains:
> > > > - graphs in batch mode 
> > > > - graphs in streaming mode
> > > > - graphs for the 13 queries.
> > > > 
> > > > That gives more than a hundred of graphs (my right finger hurts after 
> > > > so many clics on the mouse :) ). It is
> > > > detailed that much so that anyone can focus on the area they have 
> > > > interest in.
> > > > Feel free to also create new dashboards with more aggregated data.  
> > > > 
> > > > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use 
> > > > perfkit dashboards.
> > > > 
> > > > Dashboards are there: 
> > > > 
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> > > > 
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> > > > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> > > > 
> > > > 
> > > > Enjoy, 
> > > > 
> > > > Etienne
> > > > 
> > > > 

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-11 Thread Kai Jiang
Hi Etienne,

It's awesome that you are working on these useful dashboards. I am getting the TPC-H
benchmark running on the Flink and Dataflow runners. I could work on similar
dashboards for the TPC benchmark after the code is merged.
Also, it would be great to have dashboards for Dataflow.

Best,
Kai

On Wed, Jul 11, 2018 at 6:35 AM Etienne Chauchot 
wrote:

> First catch of the nexmark-CI:
> It seems that there was a change in the direct runner.
>
> Query3 (exercise state and timers)
> - output size should be constant but has increased today => Was there a
> change in state and timer related code?
> - the output size of this query is different between batch and streaming
> modes on direct runner.
>
> Etienne
>
> On Wednesday, July 11, 2018 at 15:25 +0200, Etienne Chauchot wrote:
>
> Is someone interested in creating the scripts and dashboards for the other
> runners? They can be created by copying the existing scripts and dashboards
> and changing one gradle parameter in the scripts and the table name in the
> dashboards.
>
> I have created the tickets:
> https://issues.apache.org/jira/browse/BEAM-4763
> https://issues.apache.org/jira/browse/BEAM-4762
> https://issues.apache.org/jira/browse/BEAM-4761
> https://issues.apache.org/jira/browse/BEAM-4760
>
> Etienne
> On Wednesday, July 11, 2018 at 15:13 +0200, Etienne Chauchot wrote:
>
>
> Hi guys,
>
> I'm glad to announce that the CI of Beam has much improved ! Indeed
> Nexmark is now included in the perfkit dashboards.
>
> At each commit on master, nexmark suites are run and plots are created on
> the graphs.
>
> I've created 2 kind of dashboards:
> - one for performances (run times of the queries)
> - one for the size of the output PCollection (which should be constant)
>
> There are dashboards for these runners:
> - spark
> - flink
> - direct runner
>
> Each dashboard contains:
> - graphs in batch mode
> - graphs in streaming mode
> - graphs for the 13 queries.
>
> That gives more than a hundred of graphs (my right finger hurts after so
> many clics on the mouse :) ). It is detailed that much so that anyone can
> focus on the area they have interest in.
> Feel free to also create new dashboards with more aggregated data.
>
> Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
> perfkit dashboards.
>
> Dashboards are there:
>
> https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>
> https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
>
>
> Enjoy,
>
> Etienne
>
>
>


Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-11 Thread Etienne Chauchot
First catch of the nexmark-CI:
It seems that there was a change in the direct runner.

Query3 (exercise state and timers)
- output size should be constant but has increased today => Was there a change in
state and timer related code?
- the output size of this query is different between batch and streaming modes on
direct runner.
Etienne
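
The kind of check described above (an output size that should stay constant from run to run) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and the sample history are hypothetical, and this is not how the perfkit dashboards or the Nexmark suite detect the change.

```python
from typing import List, Tuple

def size_changes(sizes: List[int]) -> List[Tuple[int, int, int]]:
    """Return (run_index, previous_size, new_size) wherever the size changed."""
    changes = []
    for i in range(1, len(sizes)):
        if sizes[i] != sizes[i - 1]:
            changes.append((i, sizes[i - 1], sizes[i]))
    return changes

# Hypothetical output sizes for one query over successive runs.
history = [580, 580, 580, 612, 612]
print(size_changes(history))
```

An empty result means the output size stayed constant; any entry points at the run where it started to differ, which narrows down the commits to inspect.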
On Wednesday, July 11, 2018 at 15:25 +0200, Etienne Chauchot wrote:
> Is someone interested in creating the scripts and dashboards for the other 
> runners? They can be created by copying the
> existing scripts and dashboards and changing one gradle parameter in the 
> scripts and the table name in the
> dashboards. 
> I have created the tickets:
> https://issues.apache.org/jira/browse/BEAM-4763
> https://issues.apache.org/jira/browse/BEAM-4762
> https://issues.apache.org/jira/browse/BEAM-4761
> https://issues.apache.org/jira/browse/BEAM-4760
> Etienne
> On Wednesday, July 11, 2018 at 15:13 +0200, Etienne Chauchot wrote:
> > Hi guys, 
> > 
> > I'm glad to announce that the Beam CI has much improved! Nexmark is now
> > included in the perfkit dashboards.
> > 
> > On each commit to master, the nexmark suites are run and the results are
> > plotted on the graphs.
> > 
> > I've created 2 kinds of dashboards:
> > - one for performance (run times of the queries)
> > - one for the size of the output PCollection (which should be constant)
> > 
> > There are dashboards for these runners:
> > - spark
> > - flink
> > - direct runner
> > 
> > Each dashboard contains:
> > - graphs in batch mode 
> > - graphs in streaming mode
> > - graphs for the 13 queries.
> > 
> > That gives more than a hundred graphs (my right finger hurts after so
> > many mouse clicks :) ). They are that detailed so that anyone can focus
> > on the area they are interested in.
> > Feel free to also create new dashboards with more aggregated data.  
> > 
> > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use 
> > perfkit dashboards.
> > 
> > The dashboards are here:
> > 
> > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> > 
> > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> > 
> > 
> > Enjoy, 
> > 
> > Etienne
> > 
> > 

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-11 Thread Etienne Chauchot
Is someone interested in creating the scripts and dashboards for the other 
runners? They can be created by copying the
existing scripts and dashboards and changing one gradle parameter in the 
scripts and the table name in the dashboards. 
I have created the tickets:
https://issues.apache.org/jira/browse/BEAM-4763
https://issues.apache.org/jira/browse/BEAM-4762
https://issues.apache.org/jira/browse/BEAM-4761
https://issues.apache.org/jira/browse/BEAM-4760
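
As a rough illustration of the "change the table name" step, here is a minimal
Python sketch that enumerates one BigQuery table name per query, runner and
mode. It assumes the tables follow the nexmark_0_DirectRunner_batch pattern
that appears elsewhere in this thread; the SparkRunner and FlinkRunner names
and the 0..12 query numbering are assumptions for illustration, not something
confirmed here.

```python
from itertools import product

# Assumed runner labels; only "DirectRunner" is seen in a table name in this thread.
RUNNERS = ["DirectRunner", "SparkRunner", "FlinkRunner"]
MODES = ["batch", "streaming"]
# Assumed numbering for the 13 queries: 0 through 12.
QUERIES = range(13)


def table_name(query, runner, mode):
    """Build a table name following the assumed nexmark_<query>_<runner>_<mode> pattern."""
    return f"nexmark_{query}_{runner}_{mode}"


def all_tables(runners=RUNNERS, modes=MODES, queries=QUERIES):
    """List one table name per (runner, mode, query) combination."""
    return [table_name(q, r, m) for r, m, q in product(runners, modes, queries)]


if __name__ == "__main__":
    tables = all_tables()
    # 3 runners x 2 modes x 13 queries
    print(len(tables))  # 78
    print(tables[0])    # nexmark_0_DirectRunner_batch
```

With two kinds of dashboards (run time and output size), 78 combinations give
156 graphs, which matches the "more than a hundred" figure in the announcement.
Supporting another runner would then amount to adding one entry to the
hypothetical RUNNERS list.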

Etienne

On Wednesday, July 11, 2018 at 15:13 +0200, Etienne Chauchot wrote:
> Hi guys, 
> 
> I'm glad to announce that the Beam CI has much improved! Nexmark is now
> included in the perfkit dashboards.
> 
> On each commit to master, the nexmark suites are run and the results are
> plotted on the graphs.
> 
> I've created 2 kinds of dashboards:
> - one for performance (run times of the queries)
> - one for the size of the output PCollection (which should be constant)
> 
> There are dashboards for these runners:
> - spark
> - flink
> - direct runner
> 
> Each dashboard contains:
> - graphs in batch mode 
> - graphs in streaming mode
> - graphs for the 13 queries.
> 
> That gives more than a hundred graphs (my right finger hurts after so many
> mouse clicks :) ). They are that detailed so that anyone can focus on the
> area they are interested in.
> Feel free to also create new dashboards with more aggregated data.  
> 
> Thanks to Lukasz and Cham for reviewing my PRs and showing how to use perfkit 
> dashboards.
> 
> The dashboards are here:
> 
> https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> 
> https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> 
> 
> Enjoy, 
> 
> Etienne
> 
>