[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-21 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=114344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-114344 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 21/Jun/18 14:10
Start Date: 21/Jun/18 14:10
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5713: [BEAM-4283] Fix 
naming of the BigQuery fields
URL: https://github.com/apache/beam/pull/5713#issuecomment-399117062
 
 
   thx @iemejia !


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 114344)
Time Spent: 8h 40m  (was: 8.5h)

> Export nexmark execution times to bigQuery
> ------------------------------------------
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
> Issue Type: Sub-task
> Components: examples-nexmark
> Reporter: Etienne Chauchot
> Assignee: Etienne Chauchot
> Priority: Major
> Fix For: 2.6.0
>
> Time Spent: 8h 40m
> Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to BigQuery and prints the
> execution times to the console. To supervise Nexmark execution times, we need
> to store them as well, per runner/query/mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-21 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=114335&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-114335 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 21/Jun/18 13:23
Start Date: 21/Jun/18 13:23
Worklog Time Spent: 10m 
  Work Description: iemejia closed pull request #5713: [BEAM-4283] Fix 
naming of the BigQuery fields
URL: https://github.com/apache/beam/pull/5713
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java b/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
index ceed04774c1..10317ea755a 100644
--- a/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
+++ b/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
@@ -145,10 +145,10 @@ public NexmarkPerf decode(InputStream inStream)
 new TableSchema()
 .setFields(
 ImmutableList.of(
-new TableFieldSchema().setName("Runtime(sec)").setType("FLOAT"),
-new TableFieldSchema().setName("Events(/sec)").setType("FLOAT"),
+new TableFieldSchema().setName("runtimeSec").setType("FLOAT"),
+new TableFieldSchema().setName("eventsPerSec").setType("FLOAT"),
 new TableFieldSchema()
-.setName("Size of the result collection")
+.setName("numResults")
 .setType("INTEGER")));
 
 String tableSpec =
@@ -163,9 +163,9 @@ public NexmarkPerf decode(InputStream inStream)
 input -> {
   NexmarkPerf nexmarkPerf = input.getValue();
   TableRow row = new TableRow()
-  .set("Runtime(sec)", nexmarkPerf.runtimeSec)
-  .set("Events(/sec)", nexmarkPerf.eventsPerSec)
-  .set("Size of the result collection", nexmarkPerf.numResults);
+  .set("runtimeSec", nexmarkPerf.runtimeSec)
+  .set("eventsPerSec", nexmarkPerf.eventsPerSec)
+  .set("numResults", nexmarkPerf.numResults);
   return row;
 };
 BigQueryIO.Write io =
diff --git a/sdks/java/nexmark/src/test/java/org/apache/beam/sdk/nexmark/PerfsToBigQueryTest.java b/sdks/java/nexmark/src/test/java/org/apache/beam/sdk/nexmark/PerfsToBigQueryTest.java
index 6fc5dabb50e..0be5b3d2d8a 100644
--- a/sdks/java/nexmark/src/test/java/org/apache/beam/sdk/nexmark/PerfsToBigQueryTest.java
+++ b/sdks/java/nexmark/src/test/java/org/apache/beam/sdk/nexmark/PerfsToBigQueryTest.java
@@ -104,16 +104,16 @@ public void testSavePerfsToBigQuery() throws IOException, InterruptedException {
 assertEquals("Wrong number of rows inserted", 2, actualRows.size());
 List expectedRows = new ArrayList<>();
 TableRow row1 = new TableRow()
-.set("Runtime(sec)", nexmarkPerf1.runtimeSec).set("Events(/sec)", nexmarkPerf1.eventsPerSec)
+.set("runtimeSec", nexmarkPerf1.runtimeSec).set("eventsPerSec", nexmarkPerf1.eventsPerSec)
 // when read using TableRowJsonCoder the row field is boxed into an Integer, cast it to int
 // to for bowing into Integer in the expectedRows.
-.set("Size of the result collection", (int) nexmarkPerf1.numResults);
+.set("numResults", (int) nexmarkPerf1.numResults);
 expectedRows.add(row1);
 TableRow row2 = new TableRow()
-.set("Runtime(sec)", nexmarkPerf2.runtimeSec).set("Events(/sec)", nexmarkPerf2.eventsPerSec)
+.set("runtimeSec", nexmarkPerf2.runtimeSec).set("eventsPerSec", nexmarkPerf2.eventsPerSec)
 // when read using TableRowJsonCoder the row field is boxed into an Integer, cast it to int
 // to for bowing into Integer in the expectedRows.
-.set("Size of the result collection", (int) nexmarkPerf2.numResults);
+.set("numResults", (int) nexmarkPerf2.numResults);
 expectedRows.add(row2);
 assertThat(actualRows, containsInAnyOrder(Iterables.toArray(expectedRows, TableRow.class)));
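
The rename in the diff above is consistent with the classic BigQuery column naming rule: a field name may contain only letters, digits, and underscores, and must start with a letter or an underscore. Assuming that rule is why the original names were a problem, the check can be sketched in plain Java as follows. `FieldNameCheck` and `isValidFieldName` are hypothetical names used for illustration only; they are not part of Beam or of the BigQuery client, and length limits are not checked.

```java
import java.util.regex.Pattern;

class FieldNameCheck {
  // Assumption: classic BigQuery column naming rule (letters, digits, underscores;
  // first character is a letter or an underscore). Length limits are ignored here.
  private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  // Returns true when the whole name matches the assumed naming rule.
  static boolean isValidFieldName(String name) {
    return name != null && VALID_NAME.matcher(name).matches();
  }

  public static void main(String[] args) {
    // The three old names from the diff, followed by their replacements.
    String[] names = {
      "Runtime(sec)", "Events(/sec)", "Size of the result collection",
      "runtimeSec", "eventsPerSec", "numResults"
    };
    for (String name : names) {
      System.out.println(name + " -> " + isValidFieldName(name));
    }
  }
}
```

Under this assumed rule the three old names are rejected (parentheses, slash, spaces) and the three new names are accepted.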
 


 




Issue Time Tracking
---

Worklog Id: (was: 114335)
Time Spent: 8.5h  (was: 8h 20m)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-21 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=114281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-114281 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 21/Jun/18 09:58
Start Date: 21/Jun/18 09:58
Worklog Time Spent: 10m 
  Work Description: echauchot opened a new pull request #5713: [BEAM-4283] 
Fix naming of the BigQuery fields
URL: https://github.com/apache/beam/pull/5713
 
 
   Fix naming of the fields of the added BQ table. The naming error was not seen in the UT 
(see https://issues.apache.org/jira/browse/BEAM-4607)
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   




Issue Time Tracking
---

Worklog Id: (was: 114281)
Time Spent: 8h 20m  (was: 8h 10m)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-15 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=112374&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112374 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 15/Jun/18 16:06
Start Date: 15/Jun/18 16:06
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #5464: [BEAM-4283] 
Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
index 29fb8924f22..19aef00094d 100644
--- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
+++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
@@ -1452,7 +1452,10 @@ static String getExtractDestinationUri(String extractDestinationDir) {
 }
 
 @VisibleForTesting
-Write withTestServices(BigQueryServices testServices) {
+/**
+ * This method is for test usage only
+ */
+public Write withTestServices(BigQueryServices testServices) {
   checkArgument(testServices != null, "testServices can not be null");
   return toBuilder().setBigQueryServices(testServices).build();
 }
diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServices.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServices.java
index 1295cc0fe2c..c4e5462306f 100644
--- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServices.java
+++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServices.java
@@ -32,10 +32,12 @@
 import java.io.Serializable;
 import java.util.List;
 import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.values.ValueInSingleWindow;
 
 /** An interface for real, mock, or fake implementations of Cloud BigQuery services. */
-interface BigQueryServices extends Serializable {
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public interface BigQueryServices extends Serializable {
 
   /**
* Returns a real, mock, or fake {@link JobService}.
diff --git a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeBigQueryServices.java b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeBigQueryServices.java
index 0a384e74e17..d581a925f6c 100644
--- a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeBigQueryServices.java
+++ b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeBigQueryServices.java
@@ -23,6 +23,7 @@
 import java.io.ByteArrayOutputStream;
 import java.io.IOException;
 import java.util.List;
+import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.coders.Coder.Context;
 import org.apache.beam.sdk.coders.ListCoder;
 
@@ -30,16 +31,17 @@
 /**
  * A fake implementation of BigQuery's query service..
  */
-class FakeBigQueryServices implements BigQueryServices {
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class FakeBigQueryServices implements BigQueryServices {
   private JobService jobService;
   private FakeDatasetService datasetService;
 
-  FakeBigQueryServices withJobService(JobService jobService) {
+  public FakeBigQueryServices withJobService(JobService jobService) {
 this.jobService = jobService;
 return this;
   }
 
-  FakeBigQueryServices withDatasetService(FakeDatasetService datasetService) {
+  public FakeBigQueryServices withDatasetService(FakeDatasetService datasetService) {
 this.datasetService = datasetService;
 return this;
   }
diff --git a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeDatasetService.java b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeDatasetService.java
index 3526ed5ddd0..50d4b7af06d 100644
--- a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeDatasetService.java
+++ b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeDatasetService.java
@@ -39,6 +39,7 @@
 import java.util.concurrent.ThreadLocalRandom;
 import java.util.regex.Pattern;
 import javax.annotation.Nullable;

[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-15 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=112345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112345 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 15/Jun/18 14:52
Start Date: 15/Jun/18 14:52
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-397646791
 
 
   Hi Cham, is it ready for merging? If so, can you merge it?




Issue Time Tracking
---

Worklog Id: (was: 112345)
Time Spent: 8h  (was: 7h 50m)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-14 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=111972&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-111972 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 14/Jun/18 18:27
Start Date: 14/Jun/18 18:27
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-397393514
 
 
   Retest this please




Issue Time Tracking
---

Worklog Id: (was: 111972)
Time Spent: 7h 50m  (was: 7h 40m)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-14 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=111807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-111807 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 14/Jun/18 07:56
Start Date: 14/Jun/18 07:56
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-397206054
 
 
   Hi Cham, thanks, done




Issue Time Tracking
---

Worklog Id: (was: 111807)
Time Spent: 7h 40m  (was: 7.5h)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-13 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=111433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-111433 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 13/Jun/18 07:52
Start Date: 13/Jun/18 07:52
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-396527944
 
 
   Hi @chamikaramj, I fixed the serialization issue and some other things. It 
should be OK now. PTAL




Issue Time Tracking
---

Worklog Id: (was: 111433)
Time Spent: 7.5h  (was: 7h 20m)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=111018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-111018 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 12/Jun/18 10:24
Start Date: 12/Jun/18 10:24
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-396527944
 
 
   Hi Cham, I fixed the serialization issue and some other things. It should be 
OK now. PTAL




Issue Time Tracking
---

Worklog Id: (was: 111018)
Time Spent: 7h 20m  (was: 7h 10m)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=111011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-111011 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 12/Jun/18 09:32
Start Date: 12/Jun/18 09:32
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-396527944
 
 
   Hi Cham, I fixed the serialization issue and some other things. It should be 
OK now. PTAL




Issue Time Tracking
---

Worklog Id: (was: 111011)
Time Spent: 7h 10m  (was: 7h)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=110892&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110892 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 11/Jun/18 23:29
Start Date: 11/Jun/18 23:29
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-396418368
 
 
   @echauchot please let me know if this is ready for another look.




Issue Time Tracking
---

Worklog Id: (was: 110892)
Time Spent: 7h  (was: 6h 50m)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-07 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109814 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 07/Jun/18 17:46
Start Date: 07/Jun/18 17:46
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-395507028
 
 
   Yeah, probably just pass the values that you need from the options object to 
'NexmarkUtils.tableSpec' instead of the full options object?
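
A minimal sketch of that suggestion, assuming the helper only needs a project, a dataset, and a table name, and that it produces a table spec in the `project:dataset.table` form that BigQueryIO accepts. The class `TableSpecSketch`, its parameter list, and the placeholder values are hypothetical; the real signature of `NexmarkUtils.tableSpec` is not shown in this thread.

```java
class TableSpecSketch {
  // Hypothetical stand-in for a helper such as NexmarkUtils.tableSpec: it takes
  // plain Strings extracted up front rather than a whole PipelineOptions object,
  // so nothing non-serializable can leak into code that calls it later.
  static String tableSpec(String project, String dataset, String table) {
    return String.format("%s:%s.%s", project, dataset, table);
  }

  public static void main(String[] args) {
    // Placeholder values; in a real pipeline they would be read from the options
    // object once, at pipeline construction time.
    System.out.println(tableSpec("my-project", "nexmark", "perf_results"));
  }
}
```

Because the helper only sees Strings, any function that calls it captures Strings as well, which are serializable.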




Issue Time Tracking
---

Worklog Id: (was: 109814)
Time Spent: 6h 50m  (was: 6h 40m)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-07 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109735 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 07/Jun/18 15:16
Start Date: 07/Jun/18 15:16
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-395458493
 
 
   @chamikaramj I missed part of the stacktrace. I know where the serialization 
issue comes from: I use PipelineOptions in a SerializableFunction used to 
configure the IO.
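
The failure mode described here can be reproduced without Beam. The sketch below is plain Java under stated assumptions: `Options` is a hypothetical stand-in for `PipelineOptions` (it is simply not `Serializable`, whereas Beam rejects serialization explicitly), and `SerFn` is a hypothetical stand-in for Beam's `SerializableFunction`. It shows that a serializable lambda capturing the whole options object cannot be serialized, while one capturing a pre-extracted String can.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class CaptureSketch {
  // Hypothetical stand-in for PipelineOptions: deliberately not Serializable.
  static class Options {
    final String tableName;

    Options(String tableName) {
      this.tableName = tableName;
    }
  }

  // Hypothetical stand-in for Beam's SerializableFunction.
  interface SerFn<A, B> extends Function<A, B>, Serializable {}

  // Tries Java serialization and reports whether it succeeded.
  static boolean serializes(Object o) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return true;
    } catch (NotSerializableException e) {
      return false;
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  static SerFn<String, String> capturingOptions(Options options) {
    // Captures the whole options object, so serializing the lambda fails.
    return row -> options.tableName + ":" + row;
  }

  static SerFn<String, String> capturingValue(Options options) {
    // Extracts the String at construction time; only the String is captured.
    String tableName = options.tableName;
    return row -> tableName + ":" + row;
  }

  public static void main(String[] args) {
    Options options = new Options("perf");
    System.out.println(serializes(capturingOptions(options))); // prints false
    System.out.println(serializes(capturingValue(options)));   // prints true
  }
}
```

The second variant is the "pre-extract necessary fields at pipeline construction time" option that the exception message itself recommends.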




Issue Time Tracking
---

Worklog Id: (was: 109735)
Time Spent: 6h 40m  (was: 6.5h)



[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-06 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109418 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 15:04
Start Date: 06/Jun/18 15:04
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-395100488
 
 
   @chamikaramj I applied your comments, thanks for the feedback.
   The stacktrace for the serialization issue is in the gradle build scan. I copy 
it below
   
   ```
java.lang.IllegalArgumentException: unable to serialize DoFnAndMainOutput{doFn=org.apache.beam.sdk.io.gcp.bigquery.PrepareWrite$1@3819e3a6, mainOutputTag=Tag}
   Caused by: java.io.NotSerializableException: PipelineOptions objects are not serializable and should not be embedded into transforms (did you capture a PipelineOptions object in a field or in an anonymous class?). Instead, if you're using a DoFn, access PipelineOptions at runtime via ProcessContext/StartBundleContext/FinishBundleContext.getPipelineOptions(), or pre-extract necessary fields from PipelineOptions at pipeline construction time.
   at org.apache.beam.sdk.options.ProxyInvocationHandler.writeObject(ProxyInvocationHandler.java:174)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1128)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
   at org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:53)
   at org.apache.beam.repackaged.beam_runners_direct_java.runners.core.construction.ParDoTranslation.translateDoFn(ParDoTranslation.java:462)
   at org.apache.beam.repackaged.beam_runners_direct_java.runners.core.construction.ParDoTranslation$1.translateDoFn(ParDoTranslation.java:160)
   at 

[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109414=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109414
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 15:00
Start Date: 06/Jun/18 15:00
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-395100488
 
 
   @chamikaramj I applied your comments, thanks for the feedback.
   The stacktrace for the serialization issue is in the gradle build scan. I copy 
it below:
   
   > java.lang.IllegalArgumentException: unable to serialize 
DoFnAndMainOutput{doFn=org.apache.beam.sdk.io.gcp.bigquery.PrepareWrite$1@3819e3a6,
 mainOutputTag=Tag}
   Caused by: java.io.NotSerializableException: PipelineOptions objects are not 
serializable and should not be embedded into transforms (did you capture a 
PipelineOptions object in a field or in an anonymous class?). Instead, if 
you're using a DoFn, access PipelineOptions at runtime via 
ProcessContext/StartBundleContext/FinishBundleContext.getPipelineOptions(), or 
pre-extract necessary fields from PipelineOptions at pipeline construction 
time.
   at org.apache.beam.sdk.options.ProxyInvocationHandler.writeObject(ProxyInvocationHandler.java:174)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1128)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
   at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
   at org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:53)
   at org.apache.beam.repackaged.beam_runners_direct_java.runners.core.construction.ParDoTranslation.translateDoFn(ParDoTranslation.java:462)
   at org.apache.beam.repackaged.beam_runners_direct_java.runners.core.construction.ParDoTranslation$1.translateDoFn(ParDoTranslation.java:160)
   at 
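
The exception message above names two remedies: read the options at runtime, or pre-extract the needed fields at pipeline construction time. The following is a minimal, Beam-free sketch of the second remedy using only java.io. `CaptureDemo`, `FakeOptions`, and `RowFn` are hypothetical names invented for illustration; they are not Beam classes and not the code of this PR.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class CaptureDemo {
  /** Problematic: the anonymous class captures the whole options object. */
  static RowFn capturing(FakeOptions options) {
    return new RowFn() {
      @Override
      public String apply(String input) {
        return options.tableName + ":" + input;
      }
    };
  }

  /** Fixed: only the needed String is extracted up front and captured. */
  static RowFn preExtracted(FakeOptions options) {
    final String tableName = options.tableName;
    return new RowFn() {
      @Override
      public String apply(String input) {
        return tableName + ":" + input;
      }
    };
  }

  /** Returns true if plain Java serialization of the object succeeds. */
  static boolean isSerializable(Object o) {
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(o);
      return true;
    } catch (NotSerializableException e) {
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    FakeOptions options = new FakeOptions("perfs");
    System.out.println("capturing: " + isSerializable(capturing(options)));
    System.out.println("preExtracted: " + isSerializable(preExtracted(options)));
  }
}

/** Hypothetical stand-in for an options object that is deliberately not Serializable. */
class FakeOptions {
  final String tableName;

  FakeOptions(String tableName) {
    this.tableName = tableName;
  }
}

/** Hypothetical stand-in for a function object that must be serialized, like a DoFn. */
interface RowFn extends Serializable {
  String apply(String input);
}
```

In this sketch the capturing variant fails to serialize because the anonymous class holds a reference to the non-serializable options object, while the pre-extracted variant holds only a String. Whether the same change applies to the transform in the stack trace depends on where the options object is actually captured, which the trace alone does not show.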

[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109325=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109325
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 10:15
Start Date: 06/Jun/18 10:15
Worklog Time Spent: 10m 
  Work Description: asfgit commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-395018816
 
 
   FAILURE

   --none--




Issue Time Tracking
---

Worklog Id: (was: 109325)
Time Spent: 6h 10m  (was: 6h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109307=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109307
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 08:38
Start Date: 06/Jun/18 08:38
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193334164
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -74,22 +96,89 @@ void runAll(OptionT options, NexmarkLauncher 
nexmarkLauncher) throws IOException
   appendPerf(options.getPerfFilename(), configuration, perf);
   actual.put(configuration, perf);
   // Summarize what we've run so far.
-  saveSummary(null, configurations, actual, baseline, start);
+  saveSummary(null, configurations, actual, baseline, start, options);
 }
   }
+  if (options.getExportSummaryToBigQuery()){
+savePerfsToBigQuery(options, actual, null);
+  }
 } finally {
   if (options.getMonitorJobs()) {
 // Report overall performance.
-saveSummary(options.getSummaryFilename(), configurations, actual, 
baseline, start);
+saveSummary(options.getSummaryFilename(), configurations, actual, 
baseline, start, options);
 saveJavascript(options.getJavascriptFilename(), configurations, 
actual, baseline, start);
   }
 }
-
 if (!successful) {
   throw new RuntimeException("Execution was not successful");
 }
   }
 
+  @VisibleForTesting
+  static void savePerfsToBigQuery(
+  NexmarkOptions options,
+  Map perfs,
+  @Nullable FakeBigQueryServices fakeBigQueryServices) {
+Pipeline pipeline = Pipeline.create(options);
 
 Review comment:
   Another point: the volume of data to insert into BQ is at most 
12 (queries) x 3 fields per nexmark run, so I don't think there is any 
performance/speed concern here.




Issue Time Tracking
---

Worklog Id: (was: 109307)
Time Spent: 5h 50m  (was: 5h 40m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109308=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109308
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 08:38
Start Date: 06/Jun/18 08:38
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193334164
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -74,22 +96,89 @@ void runAll(OptionT options, NexmarkLauncher 
nexmarkLauncher) throws IOException
   appendPerf(options.getPerfFilename(), configuration, perf);
   actual.put(configuration, perf);
   // Summarize what we've run so far.
-  saveSummary(null, configurations, actual, baseline, start);
+  saveSummary(null, configurations, actual, baseline, start, options);
 }
   }
+  if (options.getExportSummaryToBigQuery()){
+savePerfsToBigQuery(options, actual, null);
+  }
 } finally {
   if (options.getMonitorJobs()) {
 // Report overall performance.
-saveSummary(options.getSummaryFilename(), configurations, actual, 
baseline, start);
+saveSummary(options.getSummaryFilename(), configurations, actual, 
baseline, start, options);
 saveJavascript(options.getJavascriptFilename(), configurations, 
actual, baseline, start);
   }
 }
-
 if (!successful) {
   throw new RuntimeException("Execution was not successful");
 }
   }
 
+  @VisibleForTesting
+  static void savePerfsToBigQuery(
+  NexmarkOptions options,
+  Map perfs,
+  @Nullable FakeBigQueryServices fakeBigQueryServices) {
+Pipeline pipeline = Pipeline.create(options);
 
 Review comment:
   Another point: the data volume to insert into BQ is at most 
12 (queries) x 3 fields per nexmark run, so I don't think there is any 
performance/speed concern here.




Issue Time Tracking
---

Worklog Id: (was: 109308)
Time Spent: 6h  (was: 5h 50m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109306=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109306
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 08:35
Start Date: 06/Jun/18 08:35
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r19303
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -1452,7 +1452,10 @@ static String getExtractDestinationUri(String 
extractDestinationDir) {
 }
 
 @VisibleForTesting
-Write withTestServices(BigQueryServices testServices) {
+/**
+ * This method is for test usage only
+ */
+public Write withTestServices(BigQueryServices testServices) {
 
 Review comment:
   Yes, the injection is for testing only. I agree the impl should not be public, 
and a public FakeBigQueryServices, BigQueryServices interface and 
withTestServices will be useful for others who have the same needs as this PR.




Issue Time Tracking
---

Worklog Id: (was: 109306)
Time Spent: 5h 40m  (was: 5.5h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109304=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109304
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 08:31
Start Date: 06/Jun/18 08:31
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193332252
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/NexmarkConfiguration.java
 ##
 @@ -47,6 +47,10 @@
   @JsonProperty
   public NexmarkUtils.SinkType sinkType = NexmarkUtils.SinkType.DEVNULL;
 
+  /** Shall we export the summary to BigQuery. */
 
 Review comment:
   +1




Issue Time Tracking
---

Worklog Id: (was: 109304)
Time Spent: 5.5h  (was: 5h 20m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109270=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109270
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 01:21
Start Date: 06/Jun/18 01:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193266362
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeJobService.java
 ##
 @@ -78,7 +78,7 @@
 /**
  * A fake implementation of BigQuery's job service.
  */
-class FakeJobService implements JobService, Serializable {
+public class FakeJobService implements JobService, Serializable {
 
 Review comment:
   @Experimental(Experimental.Kind.SOURCE_SINK)




Issue Time Tracking
---

Worklog Id: (was: 109270)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109269=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109269
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 01:21
Start Date: 06/Jun/18 01:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193266581
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/NexmarkConfiguration.java
 ##
 @@ -47,6 +47,10 @@
   @JsonProperty
   public NexmarkUtils.SinkType sinkType = NexmarkUtils.SinkType.DEVNULL;
 
+  /** Shall we export the summary to BigQuery. */
 
 Review comment:
   Please rewrite this comment (sounds like a TODO currently). What you wrote 
above is fine.




Issue Time Tracking
---

Worklog Id: (was: 109269)
Time Spent: 5h 10m  (was: 5h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109268=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109268
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 01:21
Start Date: 06/Jun/18 01:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193265905
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -1452,7 +1452,10 @@ static String getExtractDestinationUri(String 
extractDestinationDir) {
 }
 
 @VisibleForTesting
-Write withTestServices(BigQueryServices testServices) {
+/**
+ * This method is for test usage only
+ */
+public Write withTestServices(BigQueryServices testServices) {
 
 Review comment:
   Yes. I assume "inject BigQueryServices to savePerfsToBigQuery()" is for 
testing. BigQueryServicesImpl should not be public. I don't think there's a 
point in making 'withTestServices' available for testing without making 
'BigQueryServices' interface available. I think 'withTestServices' in 
combination with FakeBigQueryServices will be useful for other Beam components 
that need to test pipelines that write to BigQuery as well.
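
The injection pattern discussed in this comment can be sketched without Beam as follows. This is only an illustration of constructor injection with an in-memory fake; `InjectionDemo`, `RowSink`, `FakeRowSink`, and `PerfExporter` are hypothetical names and do not reflect the actual signatures of `BigQueryServices`, `FakeBigQueryServices`, or `withTestServices`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InjectionDemo {
  public static void main(String[] args) {
    // In a test, the fake is injected in place of the real remote service.
    FakeRowSink fake = new FakeRowSink();
    new PerfExporter(fake).export("perfs", Arrays.asList("q0,1.5", "q1,2.0"));
    System.out.println(fake.rows);
  }
}

/** Hypothetical service interface that production and test code both depend on. */
interface RowSink {
  void insert(String table, String row);
}

/** In-memory fake: records rows instead of calling a remote service. */
class FakeRowSink implements RowSink {
  final List<String> rows = new ArrayList<>();

  @Override
  public void insert(String table, String row) {
    rows.add(table + "/" + row);
  }
}

/** Code under test: it only sees the interface, so any implementation can be injected. */
class PerfExporter {
  private final RowSink sink;

  PerfExporter(RowSink sink) {
    this.sink = sink;
  }

  void export(String table, List<String> perfRows) {
    for (String row : perfRows) {
      sink.insert(table, row);
    }
  }
}
```

The point of the sketch is that a test-only injection hook is usable from another module only if both the interface and the fake are visible there, which matches the visibility argument made above.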




Issue Time Tracking
---

Worklog Id: (was: 109268)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109271=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109271
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 01:21
Start Date: 06/Jun/18 01:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193266493
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -300,4 +390,6 @@ public static void main(String[] args) throws IOException {
 NexmarkLauncher nexmarkLauncher = new 
NexmarkLauncher<>(options);
 new Main<>().runAll(options, nexmarkLauncher);
   }
+
+
 
 Review comment:
   Remove extra newlines.




Issue Time Tracking
---

Worklog Id: (was: 109271)
Time Spent: 5h 20m  (was: 5h 10m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109267=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109267
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 06/Jun/18 01:21
Start Date: 06/Jun/18 01:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193266011
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeDatasetService.java
 ##
 @@ -46,7 +46,7 @@
 import org.apache.beam.sdk.values.ValueInSingleWindow;
 
 /** A fake dataset service that can be serialized, for use in 
testReadFromTable. */
-class FakeDatasetService implements DatasetService, Serializable {
+public class FakeDatasetService implements DatasetService, Serializable {
 
 Review comment:
   @Experimental(Experimental.Kind.SOURCE_SINK)




Issue Time Tracking
---

Worklog Id: (was: 109267)
Time Spent: 5h  (was: 4h 50m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109015=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109015
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 10:04
Start Date: 05/Jun/18 10:04
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-394654301
 
 
   @chamikaramj thanks for the review. Can you please answer my questions 
above? In the last commit I assumed that you meant what I wrote above. PTAL
   Also can you please take a look at the serialization issue and give me your 
opinion?
   Thanks




Issue Time Tracking
---

Worklog Id: (was: 109015)
Time Spent: 4h 50m  (was: 4h 40m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109010=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109010
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 09:58
Start Date: 05/Jun/18 09:58
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193001386
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -1452,7 +1452,10 @@ static String getExtractDestinationUri(String 
extractDestinationDir) {
 }
 
 @VisibleForTesting
-Write withTestServices(BigQueryServices testServices) {
+/**
+ * This method is for test usage only
+ */
+public Write withTestServices(BigQueryServices testServices) {
 
 Review comment:
   You mean that I should make `BigQueryServices` public in addition to 
`FakeBigQueryServices`, annotate it as experimental, inject `BigQueryServices` 
into `savePerfsToBigQuery()`, and add the BigQuery test artifacts only to the 
deps of the nexmark test artifact, to avoid the deps problem that I raised 
above. Is that right?




Issue Time Tracking
---

Worklog Id: (was: 109010)
Time Spent: 4h 40m  (was: 4.5h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=109008=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-109008
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 09:52
Start Date: 05/Jun/18 09:52
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193012811
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -74,22 +96,89 @@ void runAll(OptionT options, NexmarkLauncher nexmarkLauncher) throws IOException
   appendPerf(options.getPerfFilename(), configuration, perf);
   actual.put(configuration, perf);
   // Summarize what we've run so far.
-  saveSummary(null, configurations, actual, baseline, start);
+  saveSummary(null, configurations, actual, baseline, start, options);
 }
   }
+  if (options.getExportSummaryToBigQuery()){
+savePerfsToBigQuery(options, actual, null);
+  }
 } finally {
   if (options.getMonitorJobs()) {
 // Report overall performance.
-saveSummary(options.getSummaryFilename(), configurations, actual, baseline, start);
+saveSummary(options.getSummaryFilename(), configurations, actual, baseline, start, options);
 saveJavascript(options.getJavascriptFilename(), configurations, actual, baseline, start);
   }
 }
-
 if (!successful) {
   throw new RuntimeException("Execution was not successful");
 }
   }
 
+  @VisibleForTesting
+  static void savePerfsToBigQuery(
+  NexmarkOptions options,
+  Map perfs,
+  @Nullable FakeBigQueryServices fakeBigQueryServices) {
+Pipeline pipeline = Pipeline.create(options);
 
 Review comment:
   Yes, it is technically feasible to create a new PipelineOptions with runner 
== DirectRunner and use it in the second pipeline. What I'm not convinced about 
is that it would require shipping the direct runner libs in nexmark even when 
we are running the queries on another runner. It might be problematic to have 
them in the fat jar deployed on a Spark cluster, for example. 
   Currently, we only ship the direct runner libs in the classpath when we run 
JVM local tests on the direct runner (profile).
   Please note that the other BigQuery additions to nexmark (sinking the output 
PCollection to BigQuery) currently run with the same runner as the queries, 
in the same pipeline.




Issue Time Tracking
---

Worklog Id: (was: 109008)
Time Spent: 4.5h  (was: 4h 20m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108999=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108999
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 09:28
Start Date: 05/Jun/18 09:28
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193001386
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -1452,7 +1452,10 @@ static String getExtractDestinationUri(String extractDestinationDir) {
 }
 
 @VisibleForTesting
-Write withTestServices(BigQueryServices testServices) {
+/**
+ * This method is for test usage only
+ */
+public Write withTestServices(BigQueryServices testServices) {
 
 Review comment:
   You mean that I should make BigQueryServices public in addition to 
FakeBigQueryServices, annotate it as experimental, and add the BigQuery test 
artifacts only to the deps of the nexmark test artifact, to avoid the 
dependency problem that I raised above. Is that right?




Issue Time Tracking
---

Worklog Id: (was: 108999)
Time Spent: 4h 20m  (was: 4h 10m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108996=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108996
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 09:23
Start Date: 05/Jun/18 09:23
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193001386
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -1452,7 +1452,10 @@ static String getExtractDestinationUri(String extractDestinationDir) {
 }
 
 @VisibleForTesting
-Write withTestServices(BigQueryServices testServices) {
+/**
+ * This method is for test usage only
+ */
+public Write withTestServices(BigQueryServices testServices) {
 
 Review comment:
   You mean that I should make BigQueryServices public in addition to 
FakeBigQueryServices, annotate it as experimental, and add the BigQuery test 
artifacts only to the deps of the nexmark test artifact. Is that right?




Issue Time Tracking
---

Worklog Id: (was: 108996)
Time Spent: 4h 10m  (was: 4h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108985=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108985
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 09:14
Start Date: 05/Jun/18 09:14
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r193001386
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -1452,7 +1452,10 @@ static String getExtractDestinationUri(String extractDestinationDir) {
 }
 
 @VisibleForTesting
-Write withTestServices(BigQueryServices testServices) {
+/**
+ * This method is for test usage only
+ */
+public Write withTestServices(BigQueryServices testServices) {
 
 Review comment:
   You mean that I should make BigQueryServices public in addition to 
FakeBigQueryServices, annotate it as experimental, and add the BigQuery test 
artifacts only to the deps of the nexmark test artifact?




Issue Time Tracking
---

Worklog Id: (was: 108985)
Time Spent: 4h  (was: 3h 50m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108980=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108980
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 09:05
Start Date: 05/Jun/18 09:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192998650
 
 

 ##
 File path: sdks/java/nexmark/build.gradle
 ##
 @@ -42,6 +42,7 @@ configurations {
 }
 
 dependencies {
+  compile 'com.google.cloud:google-cloud-bigquery:1.28.0'
 
 Review comment:
   +1, duplicate comment, see above.




Issue Time Tracking
---

Worklog Id: (was: 108980)
Time Spent: 3h 50m  (was: 3h 40m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108979=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108979
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 09:03
Start Date: 05/Jun/18 09:03
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192997887
 
 

 ##
 File path: sdks/java/nexmark/build.gradle
 ##
 @@ -42,6 +42,7 @@ configurations {
 }
 
 dependencies {
+  compile 'com.google.cloud:google-cloud-bigquery:1.28.0'
 
 Review comment:
   Indeed, I thought I had removed it, but I may have missed it in the rebase 
process. Thanks for pointing it out.




Issue Time Tracking
---

Worklog Id: (was: 108979)
Time Spent: 3h 40m  (was: 3.5h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108901=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108901
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 00:58
Start Date: 05/Jun/18 00:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192918564
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeBigQueryServices.java
 ##
 @@ -30,16 +30,16 @@
 /**
  * A fake implementation of BigQuery's query service..
  */
-class FakeBigQueryServices implements BigQueryServices {
+public class FakeBigQueryServices implements BigQueryServices {
 
 Review comment:
   Please add "@Experimental(Experimental.Kind.SOURCE_SINK)".




Issue Time Tracking
---

Worklog Id: (was: 108901)
Time Spent: 3h 20m  (was: 3h 10m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108902=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108902
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 00:58
Start Date: 05/Jun/18 00:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192918714
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -1452,7 +1452,10 @@ static String getExtractDestinationUri(String extractDestinationDir) {
 }
 
 @VisibleForTesting
-Write withTestServices(BigQueryServices testServices) {
+/**
+ * This method is for test usage only
+ */
+public Write withTestServices(BigQueryServices testServices) {
 
 Review comment:
   I think it's fine to make BigQueryServices public but add 
"@Experimental(Experimental.Kind.SOURCE_SINK)" so that we can remove it in the 
future if needed.
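
The "public but experimental" approach suggested here can be illustrated with a small self-contained sketch. It deliberately does not use Beam's own `@Experimental` annotation: `ExperimentalApi`, `DemoServices` and `ExperimentalCheck` are hypothetical names, and the runtime retention is an assumption made so that the marker can be inspected by reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker, loosely modelled on an "experimental API" annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ExperimentalApi {
  String kind();
}

// Visible to callers, but flagged as subject to change or removal.
@ExperimentalApi(kind = "SOURCE_SINK")
interface DemoServices {
  String name();
}

class ExperimentalCheck {
  // Returns the declared kind, or null when the type is not marked experimental.
  static String experimentalKind(Class<?> type) {
    ExperimentalApi marker = type.getAnnotation(ExperimentalApi.class);
    return marker == null ? null : marker.kind();
  }
}
```

The marker changes nothing at runtime for callers; it only documents that the type carries no stability guarantee, which is what allows it to be removed later.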




Issue Time Tracking
---

Worklog Id: (was: 108902)
Time Spent: 3h 20m  (was: 3h 10m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108900=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108900
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 00:58
Start Date: 05/Jun/18 00:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192920253
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -74,22 +96,89 @@ void runAll(OptionT options, NexmarkLauncher nexmarkLauncher) throws IOException
   appendPerf(options.getPerfFilename(), configuration, perf);
   actual.put(configuration, perf);
   // Summarize what we've run so far.
-  saveSummary(null, configurations, actual, baseline, start);
+  saveSummary(null, configurations, actual, baseline, start, options);
 }
   }
+  if (options.getExportSummaryToBigQuery()){
+savePerfsToBigQuery(options, actual, null);
+  }
 } finally {
   if (options.getMonitorJobs()) {
 // Report overall performance.
-saveSummary(options.getSummaryFilename(), configurations, actual, baseline, start);
+saveSummary(options.getSummaryFilename(), configurations, actual, baseline, start, options);
 saveJavascript(options.getJavascriptFilename(), configurations, actual, baseline, start);
   }
 }
-
 if (!successful) {
   throw new RuntimeException("Execution was not successful");
 }
   }
 
+  @VisibleForTesting
+  static void savePerfsToBigQuery(
+  NexmarkOptions options,
+  Map perfs,
+  @Nullable FakeBigQueryServices fakeBigQueryServices) {
+Pipeline pipeline = Pipeline.create(options);
 
 Review comment:
   I'm still not sure why we can't create a new PipelineOptions object here 
(PipelineOptions newOptions = PipelineOptionsFactory.create()) and use 
DirectRunner. That will be much faster (and cleaner) than using the original 
options with original runner.




Issue Time Tracking
---

Worklog Id: (was: 108900)
Time Spent: 3h 20m  (was: 3h 10m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108904=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108904
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 00:58
Start Date: 05/Jun/18 00:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192915615
 
 

 ##
 File path: sdks/java/nexmark/build.gradle
 ##
 @@ -42,6 +42,7 @@ configurations {
 }
 
 dependencies {
+  compile 'com.google.cloud:google-cloud-bigquery:1.28.0'
 
 Review comment:
   We don't need this anymore, right? (Also, this dependency will result in a 
conflict until we upgrade gRPC, protobuf, and Spanner. See 
https://issues.apache.org/jira/browse/BEAM-4229.)




Issue Time Tracking
---

Worklog Id: (was: 108904)
Time Spent: 3.5h  (was: 3h 20m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108903=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108903
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 05/Jun/18 00:58
Start Date: 05/Jun/18 00:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192916538
 
 

 ##
 File path: sdks/java/nexmark/build.gradle
 ##
 @@ -42,6 +42,7 @@ configurations {
 }
 
 dependencies {
+  compile 'com.google.cloud:google-cloud-bigquery:1.28.0'
 
 Review comment:
   We don't need this anymore, right? Also, this can result in dependency 
conflicts; see https://issues.apache.org/jira/browse/BEAM-4229.




Issue Time Tracking
---

Worklog Id: (was: 108903)
Time Spent: 3.5h  (was: 3h 20m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108610
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 04/Jun/18 14:35
Start Date: 04/Jun/18 14:35
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-394377009
 
 
   @chamikaramj I implemented the solution you suggested, with a second 
pipeline that uses BigQueryIO. PTAL.
   Some comments: 
   1. I had to make FakeBigQuery* public. What troubles me is that, since you 
did not want to make BigQueryServices public, I had to inject 
FakeBigQueryServices to use with BQIO.withTestServices(). That makes a test 
artifact a dependency of nexmark production code.
   2. I still have a serialization issue. I guess it is because of NexmarkPerf 
serialization (see the custom coder in Main). I would prefer that we discuss 
the global design (point 1 and related) before I try to fix this second point.
   Thanks again
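
One way to check the guess in point 2 in isolation is a Java serialization round trip on the perf object. The sketch below assumes that plain Java serialization is what fails, which the thread does not confirm. `DemoPerf` and `SerializationCheck` are hypothetical names, and the two fields are illustrative rather than the real contents of `NexmarkPerf`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical perf record; the real NexmarkPerf has its own set of fields.
class DemoPerf implements Serializable {
  private static final long serialVersionUID = 1L;
  final double runtimeSec;
  final long numResults;

  DemoPerf(double runtimeSec, long numResults) {
    this.runtimeSec = runtimeSec;
    this.numResults = numResults;
  }
}

class SerializationCheck {
  // Serializes the value to bytes and reads it back; fails loudly if either step breaks.
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        T copy = (T) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Value is not serializable: " + value, e);
    }
  }
}
```

If the round trip throws, the exception's cause usually names the non-serializable field or class, which narrows down where a custom coder or a `transient` field would be needed.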




Issue Time Tracking
---

Worklog Id: (was: 108610)
Time Spent: 3h 10m  (was: 3h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108595=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108595
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 04/Jun/18 14:26
Start Date: 04/Jun/18 14:26
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192759136
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -146,13 +164,77 @@ private void appendPerf(
   private static final String LINE =
   
"==";
 
-  /**
-   * Print summary  of {@code actual} vs (if non-null) {@code baseline}.
-   */
+  /** Send {@code nexmarkPerf} to BigQuery. */
+  @VisibleForTesting
+  static void writeQueryPerftoBigQuery(
 
 Review comment:
   In this PR we output the perf of a nexmark run on a set of queries using a 
particular runner. We will end up having BQ tables per query x runner 
(pipelineOptions.getRunner()). So the pipeline that writes must be run with 
the same runner as the pipeline that runs the queries; otherwise we will 
output perf data as if it were gathered using DirectRunner each time.
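
The concern above can be made concrete with a small sketch of how a per query x runner table name might be derived. The naming scheme, the `PerfTableNames` class and its parameters are assumptions for illustration, not the scheme used in the PR. If `runnerName` were read from the options of a writing pipeline that always runs on the direct runner, every table would carry the direct runner's name.

```java
class PerfTableNames {
  // Builds a table name from the query, the runner that executed it, and the mode.
  // The naming scheme is an assumption made for this sketch.
  static String tableName(
      String baseName, String queryName, String runnerName, boolean streaming) {
    String mode = streaming ? "streaming" : "batch";
    String raw = String.join("_", baseName, queryName, runnerName, mode);
    // Replace anything other than ASCII letters, digits and underscores.
    return raw.replaceAll("[^A-Za-z0-9_]", "_");
  }
}
```

With these assumptions, `tableName("nexmark", "query3", "SparkRunner", false)` yields `nexmark_query3_SparkRunner_batch`.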




Issue Time Tracking
---

Worklog Id: (was: 108595)
Time Spent: 3h  (was: 2h 50m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108161=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108161
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 01/Jun/18 18:31
Start Date: 01/Jun/18 18:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192479918
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -146,13 +164,77 @@ private void appendPerf(
   private static final String LINE =
   
"==";
 
-  /**
-   * Print summary  of {@code actual} vs (if non-null) {@code baseline}.
-   */
+  /** Send {@code nexmarkPerf} to BigQuery. */
+  @VisibleForTesting
+  static void writeQueryPerftoBigQuery(
 
 Review comment:
   Using any runner (for the second pipeline) is fine. But can you clarify why 
you cannot use DirectRunner here? Using DirectRunner will be the easiest 
option if you want to push a small amount of data to BigQuery. You'll have to 
create a new PipelineOptions object for this (or update "runner" property of 
current PipelineOptions object).




Issue Time Tracking
---

Worklog Id: (was: 108161)
Time Spent: 2h 50m  (was: 2h 40m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108159=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108159
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 01/Jun/18 18:30
Start Date: 01/Jun/18 18:30
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192479918
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -146,13 +164,77 @@ private void appendPerf(
   private static final String LINE =
   
"==";
 
-  /**
-   * Print summary  of {@code actual} vs (if non-null) {@code baseline}.
-   */
+  /** Send {@code nexmarkPerf} to BigQuery. */
+  @VisibleForTesting
+  static void writeQueryPerftoBigQuery(
 
 Review comment:
   Using any runner (for the second pipeline) is fine. But can you clarify why 
you cannot use DirectRunner here? Using DirectRunner will be the easiest 
option if you want to push a small amount of data to BigQuery. You'll have to 
create a new PipelineOptions object for this.




Issue Time Tracking
---

Worklog Id: (was: 108159)
Time Spent: 2h 40m  (was: 2.5h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108087=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108087
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 01/Jun/18 14:14
Start Date: 01/Jun/18 14:14
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192408722
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -146,13 +164,77 @@ private void appendPerf(
   private static final String LINE =
   
"==";
 
-  /**
-   * Print summary  of {@code actual} vs (if non-null) {@code baseline}.
-   */
+  /** Send {@code nexmarkPerf} to BigQuery. */
+  @VisibleForTesting
+  static void writeQueryPerftoBigQuery(
 
 Review comment:
   True, I'm using the BigQueryService class. If you don't want it exposed, I can 
indeed use BigQueryIO in a second pipeline. I can create this pipeline once the 
run pipeline is finished, run it over a PCollection> and write this PCollection 
to BQ using BigQueryIO. The only thing is that the second pipeline will not run 
using DirectRunner, but using the runner configured in the pipelineOptions (the 
runner that we are currently testing with Nexmark).




Issue Time Tracking
---

Worklog Id: (was: 108087)
Time Spent: 2.5h  (was: 2h 20m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=108086=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108086
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 01/Jun/18 14:12
Start Date: 01/Jun/18 14:12
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192408722
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -146,13 +164,77 @@ private void appendPerf(
   private static final String LINE =
   
"==";
 
-  /**
-   * Print summary  of {@code actual} vs (if non-null) {@code baseline}.
-   */
+  /** Send {@code nexmarkPerf} to BigQuery. */
+  @VisibleForTesting
+  static void writeQueryPerftoBigQuery(
 
 Review comment:
   True, I'm using the BigQueryService class. If you don't want it exposed, I can 
indeed use BigQueryIO in a second pipeline. I can create this pipeline once the 
run pipeline is finished, run it over a PCollection> and write this PCollection 
to BQ using BigQueryIO.




Issue Time Tracking
---

Worklog Id: (was: 108086)
Time Spent: 2h 20m  (was: 2h 10m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=107695=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107695
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 31/May/18 15:10
Start Date: 31/May/18 15:10
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192133881
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -146,13 +164,77 @@ private void appendPerf(
   private static final String LINE =
   
"==";
 
-  /**
-   * Print summary  of {@code actual} vs (if non-null) {@code baseline}.
-   */
+  /** Send {@code nexmarkPerf} to BigQuery. */
+  @VisibleForTesting
+  static void writeQueryPerftoBigQuery(
 
 Review comment:
   Using FakeBigQueryService for testing is fine, but it seems that here you are 
using the BigQueryService class (which is supposed to be an implementation 
detail of the Beam BigQuery connector) to publish results to the real BigQuery, 
no? (Apologies if I misunderstood.) I was suggesting using a second Beam pipeline 
that uses DirectRunner (not the original pipeline) to publish results to 
BigQuery instead. I don't think we should expand the public interface of the 
BigQuery connector for this task.




Issue Time Tracking
---

Worklog Id: (was: 107695)
Time Spent: 2h 10m  (was: 2h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=107627=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107627
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 31/May/18 09:51
Start Date: 31/May/18 09:51
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-393478182
 
 
   @chamikaramj Thanks for your review, I answered all your comments. PTAL




Issue Time Tracking
---

Worklog Id: (was: 107627)
Time Spent: 2h  (was: 1h 50m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=107626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107626
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 31/May/18 09:50
Start Date: 31/May/18 09:50
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192045764
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/NexmarkConfiguration.java
 ##
 @@ -420,6 +428,9 @@ public String toShortString() {
 if (sinkType != DEFAULT.sinkType) {
   sb.append(String.format("; sinkType:%s", sinkType));
 }
+if (exportSummaryToBigQuery != DEFAULT.exportSummaryToBigQuery) {
 
 Review comment:
   As with all the other configuration items, when we print the configuration we 
print a value only if it was set by the user (i.e. it differs from the 
default value).
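
   The convention described above can be sketched as follows. This is only an 
illustration: `SketchConfig` and the `"config"` prefix are hypothetical and are 
not the real `NexmarkConfiguration`; only the two field names are taken from the 
diff.

```java
// Hypothetical stand-in for a configuration class (assumption, not Beam code).
final class SketchConfig {
  // Reference instance holding the default value of every option.
  static final SketchConfig DEFAULT = new SketchConfig();

  String sinkType = "DEVNULL";
  boolean exportSummaryToBigQuery = false;

  /** Returns a short description listing only the options that differ from DEFAULT. */
  String toShortString() {
    StringBuilder sb = new StringBuilder("config");
    if (!sinkType.equals(DEFAULT.sinkType)) {
      sb.append(String.format("; sinkType:%s", sinkType));
    }
    if (exportSummaryToBigQuery != DEFAULT.exportSummaryToBigQuery) {
      sb.append(String.format("; exportSummaryToBigQuery:%s", exportSummaryToBigQuery));
    }
    return sb.toString();
  }
}
```

   With this shape, an untouched configuration prints nothing extra, so the 
short string only shows what the user actually changed.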




Issue Time Tracking
---

Worklog Id: (was: 107626)
Time Spent: 1h 50m  (was: 1h 40m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=107625=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107625
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 31/May/18 09:46
Start Date: 31/May/18 09:46
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192044761
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/NexmarkConfiguration.java
 ##
 @@ -47,6 +47,10 @@
   @JsonProperty
   public NexmarkUtils.SinkType sinkType = NexmarkUtils.SinkType.DEVNULL;
 
+  /** Shall we export the summary to BigQuery. */
 
 Review comment:
   If false, the summary is only output to the console.
   If true, the summary is output to the console and its content is written to 
BigQuery tables per query x runner x mode.
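
   A rough sketch of that behaviour, under stated assumptions: the class name, 
the `nexmark_<query>_<runner>_<mode>` table naming and the `destinations` 
helper are all hypothetical and only illustrate "console always, one BigQuery 
table per query x runner x mode when the flag is set".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of the flag semantics (assumption, not Beam code).
final class SummaryExportSketch {
  /** Builds one table name per query x runner x mode combination (naming is assumed). */
  static String tableName(String query, String runner, String mode) {
    return String.format("nexmark_%s_%s_%s", query, runner, mode).toLowerCase(Locale.ROOT);
  }

  /** Lists where the summary goes: always the console, plus BigQuery if the flag is true. */
  static List<String> destinations(
      boolean exportSummaryToBigQuery, String query, String runner, String mode) {
    List<String> out = new ArrayList<>();
    out.add("console");
    if (exportSummaryToBigQuery) {
      out.add("bigquery:" + tableName(query, runner, mode));
    }
    return out;
  }
}
```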




Issue Time Tracking
---

Worklog Id: (was: 107625)
Time Spent: 1h 40m  (was: 1.5h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=107622=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107622
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 31/May/18 09:42
Start Date: 31/May/18 09:42
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #5464: 
[BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192043662
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -146,13 +164,77 @@ private void appendPerf(
   private static final String LINE =
   
"==";
 
-  /**
-   * Print summary  of {@code actual} vs (if non-null) {@code baseline}.
-   */
+  /** Send {@code nexmarkPerf} to BigQuery. */
+  @VisibleForTesting
+  static void writeQueryPerftoBigQuery(
 
 Review comment:
   For your 2 questions:
   1. No, here we output to BigQuery only the response time of the pipeline, so 
it needs to be finished. 
   2. What you suggest (BigQuery client) is what I did at first. I asked you 
how I could unit test it, and you answered that I should use FakeBigQueryServices. 
But to use FakeBigQueryServices I needed to refactor everything to use 
FakeBigQueryServices's super class (BigQueryServices).
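
   The refactoring mentioned in point 2 follows the usual "program against the 
super type, inject a fake in tests" pattern. A minimal sketch, assuming 
hypothetical names throughout (`RowSink`, `RecordingSink`, `PerfExporter` and 
the `runtimeSec` field are illustrative and are not Beam's `BigQueryServices` 
API):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of injecting a fake service (assumption, not Beam code).
final class FakeSinkSketch {
  /** The only operation the exporter needs from a BigQuery-like backend. */
  interface RowSink {
    void insert(String table, Map<String, Object> row);
  }

  /** In-memory fake that records inserts so a unit test can inspect them. */
  static final class RecordingSink implements RowSink {
    final List<String> inserted = new ArrayList<>();

    @Override
    public void insert(String table, Map<String, Object> row) {
      inserted.add(table + ":" + row.get("runtimeSec"));
    }
  }

  /** Production code depends on the interface only, so a test can pass the fake. */
  static final class PerfExporter {
    private final RowSink sink;

    PerfExporter(RowSink sink) {
      this.sink = sink;
    }

    void export(String table, double runtimeSec) {
      Map<String, Object> row = new HashMap<>();
      row.put("runtimeSec", runtimeSec);
      sink.insert(table, row);
    }
  }
}
```

   The cost noted in the comment is visible here: every caller has to accept the 
interface rather than a concrete client, which is why the change touched more 
code than the insert itself.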




Issue Time Tracking
---

Worklog Id: (was: 107622)
Time Spent: 1.5h  (was: 1h 20m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=107609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107609
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 31/May/18 08:40
Start Date: 31/May/18 08:40
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192022213
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java
 ##
 @@ -146,13 +164,77 @@ private void appendPerf(
   private static final String LINE =
   
"==";
 
-  /**
-   * Print summary  of {@code actual} vs (if non-null) {@code baseline}.
-   */
+  /** Send {@code nexmarkPerf} to BigQuery. */
+  @VisibleForTesting
+  static void writeQueryPerftoBigQuery(
 
 Review comment:
   Can we perform this write using a Beam pipeline that uses DirectRunner and 
the BigQuery sink instead of using the BigQueryService class? I think you should 
either do that or use a BigQuery client library directly, instead of using the 
intermediate "BigQueryService" class in Beam, which is not intended to be a 
public utility.




Issue Time Tracking
---

Worklog Id: (was: 107609)
Time Spent: 1h  (was: 50m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=107611=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107611
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 31/May/18 08:40
Start Date: 31/May/18 08:40
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192025438
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/NexmarkConfiguration.java
 ##
 @@ -420,6 +428,9 @@ public String toShortString() {
 if (sinkType != DEFAULT.sinkType) {
   sb.append(String.format("; sinkType:%s", sinkType));
 }
+if (exportSummaryToBigQuery != DEFAULT.exportSummaryToBigQuery) {
 
 Review comment:
   Why are we comparing with default values?




Issue Time Tracking
---

Worklog Id: (was: 107611)
Time Spent: 1h 20m  (was: 1h 10m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=107610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107610
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 31/May/18 08:40
Start Date: 31/May/18 08:40
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5464: [BEAM-4283] Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#discussion_r192024638
 
 

 ##
 File path: 
sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/NexmarkConfiguration.java
 ##
 @@ -47,6 +47,10 @@
   @JsonProperty
   public NexmarkUtils.SinkType sinkType = NexmarkUtils.SinkType.DEVNULL;
 
+  /** Shall we export the summary to BigQuery. */
 
 Review comment:
   How about "If true, exports the summary to BigQuery."?




Issue Time Tracking
---

Worklog Id: (was: 107610)
Time Spent: 1h 10m  (was: 1h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=106361=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106361
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 28/May/18 15:24
Start Date: 28/May/18 15:24
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-392554240
 
 
   @chamikaramj Finally, in the production code, I used the `BigQueryServices` 
included in `BigQueryIO` with fake windows/pane/timestamp instead of the 
regular BigQuery client. Thus, I could use `FakeBigQueryServices` in the test.
   PTAL




Issue Time Tracking
---

Worklog Id: (was: 106361)
Time Spent: 50m  (was: 40m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-25 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=105854=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-105854
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 25/May/18 09:15
Start Date: 25/May/18 09:15
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-391985730
 
 
   @chamikaramj Yes, as I wrote in the code comment, I took a look at 
FakeBigQueryServices and BigQueryServicesImpl. They are PCollection oriented; 
among other things, they deal with windowing. My use case is a simple insert of 
values outside a PCollection to BQ. I was searching for a simpler way of 
testing. If it is the only way to test, I could create fake windows/panes, 
but this seems overkill to test a simple insert done using the regular BigQuery 
client.




Issue Time Tracking
---

Worklog Id: (was: 105854)
Time Spent: 40m  (was: 0.5h)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-25 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=105852=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-105852
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 25/May/18 08:55
Start Date: 25/May/18 08:55
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-391985730
 
 
   @chamikaramj Yes, as I wrote in the code comment, I took a look at 
FakeBigQueryServices and BigQueryServicesImpl. They are PCollection oriented; 
among other things, they deal with windowing. My use case is a simple insert of 
values outside a PCollection to BQ. I was searching for a simpler way of 
testing. I inserted using the regular BigQuery client.




Issue Time Tracking
---

Worklog Id: (was: 105852)
Time Spent: 0.5h  (was: 20m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-25 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=105850=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-105850
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 25/May/18 08:47
Start Date: 25/May/18 08:47
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-391985730
 
 
   @chamikaramj Yes, as I wrote in the code comment, I took a look at 
FakeBigQueryServices and it is PCollection oriented. My use case is a simple 
insert of values outside a PCollection to BQ. I was searching for a simpler 
way of testing.




Issue Time Tracking
---

Worklog Id: (was: 105850)
Time Spent: 20m  (was: 10m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-24 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=105551=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-105551
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 24/May/18 12:05
Start Date: 24/May/18 12:05
Worklog Time Spent: 10m 
  Work Description: echauchot opened a new pull request #5464: [BEAM-4283] 
Write Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464
 
 
   This is to write Nexmark execution times to BigQuery to be able to integrate 
Nexmark output in perfkit dashboards.
   @jkff I did not write a proper test. Can you please advise me on a way to 
properly unit test a simple insert (no PCollection) into BigQuery? I left a 
TODO and a comment in the test.
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   




Issue Time Tracking
---

Worklog Id: (was: 105551)
Time Spent: 10m
Remaining Estimate: 0h

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to bigQuery and prints in the 
> console the execution times. To supervise Nexmark execution times, we need to 
> store them as well per runner/query/mode


