[jira] [Commented] (BEAM-4751) Finish io futurize stage 2: fix the missing pylint3 check in tox.ini

2018-08-01 Thread Matthias Feys (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566387#comment-16566387
 ] 

Matthias Feys commented on BEAM-4751:
-

Yes, fixed now

> Finish io futurize stage 2: fix the missing pylint3 check in tox.ini
> 
>
> Key: BEAM-4751
> URL: https://issues.apache.org/jira/browse/BEAM-4751
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Matthias Feys
>Assignee: Matthias Feys
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Finish io futurize stage 2: fix the missing pylint3 check in tox.ini



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4751) Finish io futurize stage 2: fix the missing pylint3 check in tox.ini

2018-08-01 Thread Matthias Feys (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Feys resolved BEAM-4751.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> Finish io futurize stage 2: fix the missing pylint3 check in tox.ini
> 
>
> Key: BEAM-4751
> URL: https://issues.apache.org/jira/browse/BEAM-4751
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Matthias Feys
>Assignee: Matthias Feys
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Finish io futurize stage 2: fix the missing pylint3 check in tox.ini



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4000) Futurize and fix python 2 compatibility for io subpackage

2018-08-01 Thread Matthias Feys (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Feys resolved BEAM-4000.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> Futurize and fix python 2 compatibility for io subpackage
> -
>
> Key: BEAM-4000
> URL: https://issues.apache.org/jira/browse/BEAM-4000
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4778) Less wasteful ArtifactStagingService

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4778?focusedWorklogId=130098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130098
 ]

ASF GitHub Bot logged work on BEAM-4778:


Author: ASF GitHub Bot
Created on: 02/Aug/18 04:26
Start Date: 02/Aug/18 04:26
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on a change in pull request 
#5958: [BEAM-4778] add option to flink job server to clean staged artifacts 
per-job
URL: https://github.com/apache/beam/pull/5958#discussion_r207098322
 
 

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -138,56 +140,57 @@ public static String generateStagingSessionToken(String sessionId, String basePath)
     StagingSessionToken stagingSessionToken = new StagingSessionToken();
     stagingSessionToken.setSessionId(sessionId);
     stagingSessionToken.setBasePath(basePath);
-    return encodeStagingSessionToken(stagingSessionToken);
+    return stagingSessionToken.encode();
   }
 
   private String encodedFileName(ArtifactMetadata artifactMetadata) {
     return "artifact_"
         + Hashing.sha256().hashString(artifactMetadata.getName(), CHARSET).toString();
   }
 
-  private static StagingSessionToken decodeStagingSessionToken(String stagingSessionToken)
-      throws Exception {
-    try {
-      return MAPPER.readValue(stagingSessionToken, StagingSessionToken.class);
-    } catch (JsonProcessingException e) {
-      String message =
-          String.format(
-              "Unable to deserialize staging token %s. Expected format: %s. Error: %s",
-              stagingSessionToken,
-              "{\"sessionId\": \"sessionId\", \"basePath\": \"basePath\"}",
-              e.getMessage());
-      throw new StatusRuntimeException(Status.INVALID_ARGUMENT.withDescription(message));
-    }
-  }
+  public void removeArtifacts(String stagingSessionToken) throws Exception {
+    StagingSessionToken parsedToken = StagingSessionToken.decode(stagingSessionToken);
+    ResourceId dir = getJobDirResourceId(parsedToken);
+    ResourceId manifestResourceId = dir.resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
 
-  private static String encodeStagingSessionToken(StagingSessionToken stagingSessionToken)
-      throws Exception {
-    try {
-      return MAPPER.writeValueAsString(stagingSessionToken);
-    } catch (JsonProcessingException e) {
-      LOG.error("Error {} occurred while serializing {}.", e.getMessage(), stagingSessionToken);
-      throw e;
+    LOG.info("Removing dir {}", dir);
 
 Review comment:
   good point; I made just the last one `info` and the rest `debug`, lmk if you 
want them all as debug


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130098)
Time Spent: 5h 40m  (was: 5.5h)

> Less wasteful ArtifactStagingService
> 
>
> Key: BEAM-4778
> URL: https://issues.apache.org/jira/browse/BEAM-4778
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java]
>  is the main implementation of ArtifactStagingService.
> It stages artifacts into a directory; and in practice the passed staging 
> session token is such that the directory is different for every job. This 
> leads to 2 issues:
>  * It doesn't get cleaned up when the job finishes or even when the 
> JobService shuts down, so we have disk space leaks if running a lot of jobs 
> (e.g. a suite of ValidatesRunner tests)
>  * We repeatedly re-stage the same artifacts. Instead, ideally, we should 
> identify that some artifacts don't need to be staged - based on knowing their 
> md5. The artifact staging protocol has rudimentary support for this but may 
> need to be modified.
> CC: [~angoenka]
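
The md5-based de-duplication idea in the second bullet could be sketched as below. This is a minimal illustration, not Beam's actual staging API: `artifact_md5` and `artifacts_to_stage` are hypothetical helper names, and a real staging service would persist the digest index alongside the manifest rather than pass it in as a dict.

```python
import hashlib


def artifact_md5(path):
    """Compute the hex md5 digest of a local artifact file, streaming in chunks."""
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    return h.hexdigest()


def artifacts_to_stage(local_paths, already_staged):
    """Return (path, digest) pairs for artifacts not yet present remotely.

    `already_staged` maps md5 hex digest -> previously staged location, so
    artifacts whose content is unchanged between jobs are skipped instead of
    being re-uploaded.
    """
    pending = []
    for path in local_paths:
        digest = artifact_md5(path)
        if digest not in already_staged:
            pending.append((path, digest))
    return pending
```

The artifact staging protocol's existing md5 field could carry these digests, though, as the description notes, the protocol may need modification to let the service answer "already have it" before upload.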



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4778) Less wasteful ArtifactStagingService

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4778?focusedWorklogId=130099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130099
 ]

ASF GitHub Bot logged work on BEAM-4778:


Author: ASF GitHub Bot
Created on: 02/Aug/18 04:26
Start Date: 02/Aug/18 04:26
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on a change in pull request 
#5958: [BEAM-4778] add option to flink job server to clean staged artifacts 
per-job
URL: https://github.com/apache/beam/pull/5958#discussion_r207098356
 
 

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -329,6 +335,29 @@ private void setBasePath(String basePath) {
       this.basePath = basePath;
     }
 
+    public String encode() throws Exception {
+      try {
+        return MAPPER.writeValueAsString(this);
+      } catch (JsonProcessingException e) {
+        LOG.error("Error {} occurred while serializing {}.", e.getMessage(), this);
 
 Review comment:
   good catch, done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130099)
Time Spent: 5h 50m  (was: 5h 40m)

> Less wasteful ArtifactStagingService
> 
>
> Key: BEAM-4778
> URL: https://issues.apache.org/jira/browse/BEAM-4778
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java]
>  is the main implementation of ArtifactStagingService.
> It stages artifacts into a directory; and in practice the passed staging 
> session token is such that the directory is different for every job. This 
> leads to 2 issues:
>  * It doesn't get cleaned up when the job finishes or even when the 
> JobService shuts down, so we have disk space leaks if running a lot of jobs 
> (e.g. a suite of ValidatesRunner tests)
>  * We repeatedly re-stage the same artifacts. Instead, ideally, we should 
> identify that some artifacts don't need to be staged - based on knowing their 
> md5. The artifact staging protocol has rudimentary support for this but may 
> need to be modified.
> CC: [~angoenka]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #761

2018-08-01 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-4778) Less wasteful ArtifactStagingService

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4778?focusedWorklogId=130085&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130085
 ]

ASF GitHub Bot logged work on BEAM-4778:


Author: ASF GitHub Bot
Created on: 02/Aug/18 03:51
Start Date: 02/Aug/18 03:51
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #5958: 
[BEAM-4778] add option to flink job server to clean staged artifacts per-job
URL: https://github.com/apache/beam/pull/5958#discussion_r207094725
 
 

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -138,56 +140,57 @@ public static String generateStagingSessionToken(String sessionId, String basePath)
     StagingSessionToken stagingSessionToken = new StagingSessionToken();
     stagingSessionToken.setSessionId(sessionId);
     stagingSessionToken.setBasePath(basePath);
-    return encodeStagingSessionToken(stagingSessionToken);
+    return stagingSessionToken.encode();
   }
 
   private String encodedFileName(ArtifactMetadata artifactMetadata) {
     return "artifact_"
         + Hashing.sha256().hashString(artifactMetadata.getName(), CHARSET).toString();
   }
 
-  private static StagingSessionToken decodeStagingSessionToken(String stagingSessionToken)
-      throws Exception {
-    try {
-      return MAPPER.readValue(stagingSessionToken, StagingSessionToken.class);
-    } catch (JsonProcessingException e) {
-      String message =
-          String.format(
-              "Unable to deserialize staging token %s. Expected format: %s. Error: %s",
-              stagingSessionToken,
-              "{\"sessionId\": \"sessionId\", \"basePath\": \"basePath\"}",
-              e.getMessage());
-      throw new StatusRuntimeException(Status.INVALID_ARGUMENT.withDescription(message));
-    }
-  }
+  public void removeArtifacts(String stagingSessionToken) throws Exception {
+    StagingSessionToken parsedToken = StagingSessionToken.decode(stagingSessionToken);
+    ResourceId dir = getJobDirResourceId(parsedToken);
+    ResourceId manifestResourceId = dir.resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
 
-  private static String encodeStagingSessionToken(StagingSessionToken stagingSessionToken)
-      throws Exception {
-    try {
-      return MAPPER.writeValueAsString(stagingSessionToken);
-    } catch (JsonProcessingException e) {
-      LOG.error("Error {} occurred while serializing {}.", e.getMessage(), stagingSessionToken);
-      throw e;
+    LOG.info("Removing dir {}", dir);
 
 Review comment:
   Do we really need this and the following logging at info level? Why not debug?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130085)
Time Spent: 5.5h  (was: 5h 20m)

> Less wasteful ArtifactStagingService
> 
>
> Key: BEAM-4778
> URL: https://issues.apache.org/jira/browse/BEAM-4778
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java]
>  is the main implementation of ArtifactStagingService.
> It stages artifacts into a directory; and in practice the passed staging 
> session token is such that the directory is different for every job. This 
> leads to 2 issues:
>  * It doesn't get cleaned up when the job finishes or even when the 
> JobService shuts down, so we have disk space leaks if running a lot of jobs 
> (e.g. a suite of ValidatesRunner tests)
>  * We repeatedly re-stage the same artifacts. Instead, ideally, we should 
> identify that some artifacts don't need to be staged - based on knowing their 
> md5. The artifact staging protocol has rudimentary support for this but may 
> need to be modified.
> CC: [~angoenka]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4778) Less wasteful ArtifactStagingService

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4778?focusedWorklogId=130084&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130084
 ]

ASF GitHub Bot logged work on BEAM-4778:


Author: ASF GitHub Bot
Created on: 02/Aug/18 03:50
Start Date: 02/Aug/18 03:50
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #5958: 
[BEAM-4778] add option to flink job server to clean staged artifacts per-job
URL: https://github.com/apache/beam/pull/5958#discussion_r207094725
 
 

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -138,56 +140,57 @@ public static String generateStagingSessionToken(String sessionId, String basePath)
     StagingSessionToken stagingSessionToken = new StagingSessionToken();
     stagingSessionToken.setSessionId(sessionId);
     stagingSessionToken.setBasePath(basePath);
-    return encodeStagingSessionToken(stagingSessionToken);
+    return stagingSessionToken.encode();
   }
 
   private String encodedFileName(ArtifactMetadata artifactMetadata) {
     return "artifact_"
         + Hashing.sha256().hashString(artifactMetadata.getName(), CHARSET).toString();
   }
 
-  private static StagingSessionToken decodeStagingSessionToken(String stagingSessionToken)
-      throws Exception {
-    try {
-      return MAPPER.readValue(stagingSessionToken, StagingSessionToken.class);
-    } catch (JsonProcessingException e) {
-      String message =
-          String.format(
-              "Unable to deserialize staging token %s. Expected format: %s. Error: %s",
-              stagingSessionToken,
-              "{\"sessionId\": \"sessionId\", \"basePath\": \"basePath\"}",
-              e.getMessage());
-      throw new StatusRuntimeException(Status.INVALID_ARGUMENT.withDescription(message));
-    }
-  }
+  public void removeArtifacts(String stagingSessionToken) throws Exception {
+    StagingSessionToken parsedToken = StagingSessionToken.decode(stagingSessionToken);
+    ResourceId dir = getJobDirResourceId(parsedToken);
+    ResourceId manifestResourceId = dir.resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
 
-  private static String encodeStagingSessionToken(StagingSessionToken stagingSessionToken)
-      throws Exception {
-    try {
-      return MAPPER.writeValueAsString(stagingSessionToken);
-    } catch (JsonProcessingException e) {
-      LOG.error("Error {} occurred while serializing {}.", e.getMessage(), stagingSessionToken);
-      throw e;
+    LOG.info("Removing dir {}", dir);
 
 Review comment:
   Do we really want all this logging at info level? Why not debug?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130084)
Time Spent: 5h 20m  (was: 5h 10m)

> Less wasteful ArtifactStagingService
> 
>
> Key: BEAM-4778
> URL: https://issues.apache.org/jira/browse/BEAM-4778
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java]
>  is the main implementation of ArtifactStagingService.
> It stages artifacts into a directory; and in practice the passed staging 
> session token is such that the directory is different for every job. This 
> leads to 2 issues:
>  * It doesn't get cleaned up when the job finishes or even when the 
> JobService shuts down, so we have disk space leaks if running a lot of jobs 
> (e.g. a suite of ValidatesRunner tests)
>  * We repeatedly re-stage the same artifacts. Instead, ideally, we should 
> identify that some artifacts don't need to be staged - based on knowing their 
> md5. The artifact staging protocol has rudimentary support for this but may 
> need to be modified.
> CC: [~angoenka]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4778) Less wasteful ArtifactStagingService

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4778?focusedWorklogId=130083&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130083
 ]

ASF GitHub Bot logged work on BEAM-4778:


Author: ASF GitHub Bot
Created on: 02/Aug/18 03:49
Start Date: 02/Aug/18 03:49
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #5958: 
[BEAM-4778] add option to flink job server to clean staged artifacts per-job
URL: https://github.com/apache/beam/pull/5958#discussion_r207094635
 
 

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -329,6 +335,29 @@ private void setBasePath(String basePath) {
       this.basePath = basePath;
     }
 
+    public String encode() throws Exception {
+      try {
+        return MAPPER.writeValueAsString(this);
+      } catch (JsonProcessingException e) {
+        LOG.error("Error {} occurred while serializing {}.", e.getMessage(), this);
 
 Review comment:
   Why logging vs. using `StatusRuntimeException` as shown below? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130083)
Time Spent: 5h 10m  (was: 5h)

> Less wasteful ArtifactStagingService
> 
>
> Key: BEAM-4778
> URL: https://issues.apache.org/jira/browse/BEAM-4778
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java]
>  is the main implementation of ArtifactStagingService.
> It stages artifacts into a directory; and in practice the passed staging 
> session token is such that the directory is different for every job. This 
> leads to 2 issues:
>  * It doesn't get cleaned up when the job finishes or even when the 
> JobService shuts down, so we have disk space leaks if running a lot of jobs 
> (e.g. a suite of ValidatesRunner tests)
>  * We repeatedly re-stage the same artifacts. Instead, ideally, we should 
> identify that some artifacts don't need to be staged - based on knowing their 
> md5. The artifact staging protocol has rudimentary support for this but may 
> need to be modified.
> CC: [~angoenka]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1148

2018-08-01 Thread Apache Jenkins Server
See 


Changes:

[ankurgoenka] Tests for running Python on Flink.

[ankurgoenka] Fixing lint issues

[ankurgoenka] Adding gradle task for python streaming validates flink runner

--
[...truncated 19.60 MB...]
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:09.958Z: Expanding CoGroupByKey operations into 
optimizable parts.
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.208Z: Expanding GroupByKey operations into 
optimizable parts.
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.255Z: Lifting ValueCombiningMappingFns into 
MergeBucketsMappingFns
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.511Z: Fusing adjacent ParDo, Read, Write, and 
Flatten operations
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.551Z: Elided trivial flatten 
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.595Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map into SpannerIO.Write/Write 
mutations to Cloud Spanner/Create seed/Read(CreateSource)
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.635Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Read information schema into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.680Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.761Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/ParDo(IsmRecordForSingularValuePerWindow) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.829Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Read information schema
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.896Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Read
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.947Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Values/Values/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Extract
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T03:19:10.982Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Values/Values/Map
Aug 02, 2018 3:19:14 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 

[jira] [Work logged] (BEAM-4778) Less wasteful ArtifactStagingService

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4778?focusedWorklogId=130074&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130074
 ]

ASF GitHub Bot logged work on BEAM-4778:


Author: ASF GitHub Bot
Created on: 02/Aug/18 03:07
Start Date: 02/Aug/18 03:07
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5958: [BEAM-4778] add 
option to flink job server to clean staged artifacts per-job
URL: https://github.com/apache/beam/pull/5958#issuecomment-409791203
 
 
   I removed the terminationListener per your suggestion @angoenka; it should 
be ready to go!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130074)
Time Spent: 5h  (was: 4h 50m)

> Less wasteful ArtifactStagingService
> 
>
> Key: BEAM-4778
> URL: https://issues.apache.org/jira/browse/BEAM-4778
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java]
>  is the main implementation of ArtifactStagingService.
> It stages artifacts into a directory; and in practice the passed staging 
> session token is such that the directory is different for every job. This 
> leads to 2 issues:
>  * It doesn't get cleaned up when the job finishes or even when the 
> JobService shuts down, so we have disk space leaks if running a lot of jobs 
> (e.g. a suite of ValidatesRunner tests)
>  * We repeatedly re-stage the same artifacts. Instead, ideally, we should 
> identify that some artifacts don't need to be staged - based on knowing their 
> md5. The artifact staging protocol has rudimentary support for this but may 
> need to be modified.
> CC: [~angoenka]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Go_GradleBuild #568

2018-08-01 Thread Apache Jenkins Server
See 


Changes:

[ankurgoenka] Tests for running Python on Flink.

[ankurgoenka] Fixing lint issues

[ankurgoenka] Adding gradle task for python streaming validates flink runner

--
[...truncated 541.41 KB...]
  }
],
"parallel_input": {
  "@type": "OutputReference",
  "step_name": "e8",
  "output_name": "i0"
},
"serialized_fn": "e9"
  }
},
{
  "kind": "ParallelDo",
  "name": "e10",
  "properties": {
"user_name": "passert.Hash(out)/beam.addFixedKeyFn",
"output_info": [
  {
"user_name": "i0",
"output_name": "i0",
"encoding": {
  "@type": "kind:windowed_value",
  "component_encodings": [
{
  "@type": "kind:pair",
  "component_encodings": [
{
  "@type": "kind:length_prefix",
  "component_encodings": [
{
  "@type": 
"Cgd2YXJpbnR6EgIIAhqFAQpxZ2l0aHViLmNvbS9hcGFjaGUvYmVhbS9zZGtzL2dvL3Rlc3QvdmVuZG9yL2dpdGh1Yi5jb20vYXBhY2hlL2JlYW0vc2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvY29kZXJ4LmVuY1ZhckludFoSEAgWIgQIGUAPKgYIFBICCAgikQEKcWdpdGh1Yi5jb20vYXBhY2hlL2JlYW0vc2Rrcy9nby90ZXN0L3ZlbmRvci9naXRodWIuY29tL2FwYWNoZS9iZWFtL3Nka3MvZ28vcGtnL2JlYW0vY29yZS9ydW50aW1lL2NvZGVyeC5kZWNWYXJJbnRaEhwIFiIECBlAAyIGCBQSAggIKgQIGUAPKgQIGUAB"
}
  ]
},
{
  "@type": "kind:bytes"
}
  ],
  "is_pair_like": true
},
{
  "@type": "kind:global_window"
}
  ],
  "is_wrapper": true
}
  }
],
"parallel_input": {
  "@type": "OutputReference",
  "step_name": "e9",
  "output_name": "i0"
},
"serialized_fn": "e10"
  }
},
{
  "kind": "GroupByKey",
  "name": "e11",
  "properties": {
"user_name": "passert.Hash(out)/CoGBK",
"disallow_combiner_lifting": true,
"output_info": [
  {
"user_name": "i0",
"output_name": "i0",
"encoding": {
  "@type": "kind:windowed_value",
  "component_encodings": [
{
  "@type": "kind:pair",
  "component_encodings": [
{
  "@type": "kind:length_prefix",
  "component_encodings": [
{
  "@type": 
"Cgd2YXJpbnR6EgIIAhqFAQpxZ2l0aHViLmNvbS9hcGFjaGUvYmVhbS9zZGtzL2dvL3Rlc3QvdmVuZG9yL2dpdGh1Yi5jb20vYXBhY2hlL2JlYW0vc2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvY29kZXJ4LmVuY1ZhckludFoSEAgWIgQIGUAPKgYIFBICCAgikQEKcWdpdGh1Yi5jb20vYXBhY2hlL2JlYW0vc2Rrcy9nby90ZXN0L3ZlbmRvci9naXRodWIuY29tL2FwYWNoZS9iZWFtL3Nka3MvZ28vcGtnL2JlYW0vY29yZS9ydW50aW1lL2NvZGVyeC5kZWNWYXJJbnRaEhwIFiIECBlAAyIGCBQSAggIKgQIGUAPKgQIGUAB"
}
  ]
},
{
  "@type": "kind:stream",
  "component_encodings": [
{
  "@type": "kind:bytes"
}
  ],
  "is_stream_like": true
}
  ],
  "is_pair_like": true
},
{
  "@type": "kind:global_window"
}
  ],
  "is_wrapper": true
}
  }
],
"parallel_input": {
  "@type": "OutputReference",
  "step_name": "e10",
  "output_name": "i0"
},
"serialized_fn": 
"%0A%29%22%27%0A%02c4%12%21%0A%1F%0A%1D%0A%1Bbeam:coder:global_window:v1j9%0A%25%0A%23%0A%21beam:windowfn:global_windows:v0.1%10%01%1A%02c4%22%02:%00%28%010%018%02H%01"
  }
},
{
  "kind": "ParallelDo",
  "name": "e12",
  "properties": {
"user_name": "passert.Hash(out)/passert.hashFn",
"output_info": [
  {
"user_name": "bogus",
"output_name": "bogus",
"encoding": {
  "@type": "kind:windowed_value",
  "component_encodings": [
{
  "@type": "kind:bytes"
},
{
  "@type": "kind:global_window"
}
  ],
  "is_wrapper": true
}
  }
],
"parallel_input": {
  "@type": "OutputReference",
  "step_name": "e11",
  "output_name": "i0"
},
"serialized_fn": "e12"
  }

[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=130053&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130053
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 02/Aug/18 01:42
Start Date: 02/Aug/18 01:42
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6110: [BEAM-4176] Tests for 
running Python on Flink.
URL: https://github.com/apache/beam/pull/6110
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/sdks/python/apache_beam/runners/portability/flink_runner_test.py 
b/sdks/python/apache_beam/runners/portability/flink_runner_test.py
new file mode 100644
index 000..51f01c9a66d
--- /dev/null
+++ b/sdks/python/apache_beam/runners/portability/flink_runner_test.py
@@ -0,0 +1,86 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+from __future__ import absolute_import
+from __future__ import print_function
+
+import logging
+import shutil
+import sys
+import tempfile
+import unittest
+
+import apache_beam as beam
+from apache_beam.options.pipeline_options import DebugOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from apache_beam.options.pipeline_options import StandardOptions
+from apache_beam.runners.portability import portable_runner
+from apache_beam.runners.portability import portable_runner_test
+from apache_beam.testing.util import assert_that
+
+if __name__ == '__main__':
+  # Run as
+  #
+  # python -m apache_beam.runners.portability.flink_runner_test \
+  #     /path/to/job_server.jar \
+  #     [FlinkRunnerTest.test_method, ...]
+  flinkJobServerJar = sys.argv.pop(1)
+  streaming = sys.argv.pop(1).lower() == 'streaming'
+
+  # This is defined here to only be run when we invoke this file explicitly.
+  class FlinkRunnerTest(portable_runner_test.PortableRunnerTest):
+    _use_grpc = True
+    _use_subprocesses = True
+
+    @classmethod
+    def _subprocess_command(cls, port):
+      tmp_dir = tempfile.mkdtemp(prefix='flinktest')
+      try:
+        return [
+            'java',
+            '-jar', flinkJobServerJar,
+            '--artifacts-dir', tmp_dir,
+            '--job-host', 'localhost:%s' % port,
+        ]
+      finally:
+        shutil.rmtree(tmp_dir)
+
+    @classmethod
+    def get_runner(cls):
+      return portable_runner.PortableRunner()
+
+    def create_options(self):
+      options = super(FlinkRunnerTest, self).create_options()
+      options.view_as(DebugOptions).experiments = ['beam_fn_api']
+      options.view_as(SetupOptions).sdk_location = 'container'
+      if streaming:
+        options.view_as(StandardOptions).streaming = True
+      return options
+
+    # Can't read host files from within docker, read a "local" file there.
+    def test_read(self):
+      with self.create_pipeline() as p:
+        lines = p | beam.io.ReadFromText('/etc/profile')
+        assert_that(lines, lambda lines: len(lines) > 0)
+
+    def test_no_subtransform_composite(self):
+      raise unittest.SkipTest("BEAM-4781")
+
+    # Inherits all other tests.
+
+  # Run the tests.
+  logging.getLogger().setLevel(logging.INFO)
+  unittest.main()
diff --git a/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py 
b/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py
index 6544c9d477a..277a817dd2a 100644
--- a/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py
+++ b/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py
@@ -199,7 +199,9 @@ def test_gbk_side_input(self):
   def test_multimap_side_input(self):
     with self.create_pipeline() as p:
       main = p | 'main' >> beam.Create(['a', 'b'])
-      side = p | 'side' >> beam.Create([('a', 1), ('b', 2), ('a', 3)])
+      side = (p | 'side' >> beam.Create([('a', 1), ('b', 2), ('a', 3)])
+              # TODO(BEAM-4782): Obviate the need for this map.
+              | beam.Map(lambda kv: (kv[0], kv[1])))
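For readers of the hunk above: a multimap side input presents each key together with all of its values. The grouping semantics can be sketched without Beam at all (plain Python, hypothetical helper name):

```python
from collections import defaultdict

def as_multimap(pairs):
    """Group (key, value) pairs the way a multimap side input presents them."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

side = [('a', 1), ('b', 2), ('a', 3)]  # same data as the test above
print(as_multimap(side))  # {'a': [1, 3], 'b': [2]}
```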
  

[beam] 01/01: Merge pull request #6110: [BEAM-4176] Tests for running Python on Flink.

2018-08-01 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 5201fa91cbf40d1730e1b2fb62bcdb4bce5ca0eb
Merge: 25a0070 5f50552
Author: Thomas Weise 
AuthorDate: Wed Aug 1 18:42:27 2018 -0700

Merge pull request #6110: [BEAM-4176] Tests for running Python on Flink.

 .../runners/portability/flink_runner_test.py   | 86 ++
 .../runners/portability/fn_api_runner_test.py  |  4 +-
 .../runners/portability/portable_runner_test.py| 36 +
 sdks/python/build.gradle   | 19 +
 4 files changed, 131 insertions(+), 14 deletions(-)



[beam] branch master updated (25a0070 -> 5201fa9)

2018-08-01 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 25a0070  Merge pull request #6120: [BEAM-5057] Fixes a Javadoc error
 add 812cbec  Tests for running Python on Flink.
 add 99379e0  Fixing lint issues
 add 5f50552  Adding gradle task for python streaming validates flink runner
 new 5201fa9  Merge pull request #6110: [BEAM-4176] Tests for running 
Python on Flink.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../runners/portability/flink_runner_test.py   | 86 ++
 .../runners/portability/fn_api_runner_test.py  |  4 +-
 .../runners/portability/portable_runner_test.py| 36 +
 sdks/python/build.gradle   | 19 +
 4 files changed, 131 insertions(+), 14 deletions(-)
 create mode 100644 
sdks/python/apache_beam/runners/portability/flink_runner_test.py



[jira] [Commented] (BEAM-4826) Flink runner sends bad flatten to SDK

2018-08-01 Thread Henning Rohde (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566235#comment-16566235
 ] 

Henning Rohde commented on BEAM-4826:
-

Dataflow drops the flatten altogether.

> Flink runner sends bad flatten to SDK
> -
>
> Key: BEAM-4826
> URL: https://issues.apache.org/jira/browse/BEAM-4826
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Henning Rohde
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: portability
>
> For a Go flatten test w/ 3 inputs, the Flink runner splits this into 3 bundle 
> descriptors, but it sends the original 3-input flatten w/ only 1 actual input 
> present in each bundle descriptor. This is inconsistent, and the SDK shouldn't 
> expect dangling PCollections. In contrast, Dataflow removes the flatten when 
> it does the same split.
> Snippet:
> register: <
>   process_bundle_descriptor: <
> id: "3"
> transforms: <
>   key: "e4"
>   value: <
> unique_name: "github.com/apache/beam/sdks/go/pkg/beam.createFn'1"
> spec: <
>   urn: "urn:beam:transform:pardo:v1"
>   payload: [...]
> >
> inputs: <
>   key: "i0"
>   value: "n3"
> >
> outputs: <
>   key: "i0"
>   value: "n4"
> >
>   >
> >
> transforms: <
>   key: "e7"
>   value: <
> unique_name: "Flatten"
> spec: <
>   urn: "beam:transform:flatten:v1"
> >
> inputs: <
>   key: "i0"
>   value: "n2"
> >
> inputs: <
>   key: "i1"
>   value: "n4" . // <--- only one present.
> >
> inputs: <
>   key: "i2"
>   value: "n6"
> >
> outputs: <
>   key: "i0"
>   value: "n7"
> >
>   >
> >
> [...]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4826) Flink runner sends bad flatten to SDK

2018-08-01 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566224#comment-16566224
 ] 

Ankur Goenka commented on BEAM-4826:


[~herohde] Quick question.

What do you mean by "Dataflow removes the flatten when it does the same split"?

Does Dataflow drop the flatten altogether, or does it drop the redundant 
inputs from the flatten transform when sending the process bundle descriptor 
to the SDKHarness?

 

The challenge with Flink, and with the portability implementation in general, is 
that it can potentially create a different ProcessBundleDescriptor for a flatten 
for each input, depending on how those inputs are created.

There are 2 potential fixes for this:
 # Remove the redundant inputs at execution time in the runner.
 # Create a separate flatten transform for each stage created after fusion.

I think fix 1 is better because it leaves the runner with more information about 
how things were fused; with fix 2, this information is hard to attain.
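Fix 1 can be sketched as follows — a simplified model that treats a transform as a plain dict rather than the actual Beam portability protos (the function and field names here are illustrative assumptions, not runner code):

```python
def prune_flatten_inputs(transform, available_pcollections):
    """Drop flatten inputs that reference PCollections not materialized in
    this bundle descriptor (plain-dict model, not the real Beam proto API)."""
    if transform['urn'] != 'beam:transform:flatten:v1':
        return transform
    kept = {tag: pc for tag, pc in transform['inputs'].items()
            if pc in available_pcollections}
    return dict(transform, inputs=kept)

flatten = {
    'urn': 'beam:transform:flatten:v1',
    'inputs': {'i0': 'n2', 'i1': 'n4', 'i2': 'n6'},
}
pruned = prune_flatten_inputs(flatten, {'n4'})  # only n4 exists in this bundle
print(pruned['inputs'])  # {'i1': 'n4'}
```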

> Flink runner sends bad flatten to SDK
> -
>
> Key: BEAM-4826
> URL: https://issues.apache.org/jira/browse/BEAM-4826
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Henning Rohde
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: portability





[jira] [Commented] (BEAM-4826) Flink runner sends bad flatten to SDK

2018-08-01 Thread Henning Rohde (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566219#comment-16566219
 ] 

Henning Rohde commented on BEAM-4826:
-

Yeah - unused outputs from a ParDo are fine. As you say, they may occur 
naturally from user code.

Flatten is the only transform I can think of where the runner would split it 
by input. We should just remove it in that case, like Dataflow does (because a 
1-input flatten is a no-op) -- although for Go at least, it would work fine. 
Other transforms generally have an expected input arity, and the SDK can't 
execute the bundle if the inputs are not all wired up. Mucking with such inputs 
probably wouldn't work well.
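Removing a flatten once it is down to a single input can be sketched the same way — plain dicts standing in for the Beam protos, with hypothetical helper names; since a 1-input flatten is a no-op, consumers of its output can simply read its input:

```python
def elide_trivial_flatten(transforms):
    """Remove single-input flattens and rewire consumers of their output to
    read the input directly (plain-dict model, not actual runner code)."""
    renames = {}
    kept = {}
    for name, t in transforms.items():
        if t.get('urn') == 'beam:transform:flatten:v1' and len(t['inputs']) == 1:
            (only_input,) = t['inputs'].values()
            (output,) = t['outputs'].values()
            renames[output] = only_input  # consumers of the output now read the input
        else:
            kept[name] = t
    for t in kept.values():
        t['inputs'] = {tag: renames.get(pc, pc) for tag, pc in t['inputs'].items()}
    return kept

transforms = {
    'e7': {'urn': 'beam:transform:flatten:v1',
           'inputs': {'i1': 'n4'}, 'outputs': {'i0': 'n7'}},
    'e8': {'urn': 'urn:beam:transform:pardo:v1',
           'inputs': {'i0': 'n7'}, 'outputs': {'i0': 'n8'}},
}
print(elide_trivial_flatten(transforms)['e8']['inputs'])  # {'i0': 'n4'}
```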

> Flink runner sends bad flatten to SDK
> -
>
> Key: BEAM-4826
> URL: https://issues.apache.org/jira/browse/BEAM-4826
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Henning Rohde
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: portability





[jira] [Commented] (BEAM-4000) Futurize and fix python 2 compatibility for io subpackage

2018-08-01 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566210#comment-16566210
 ] 

Valentyn Tymofieiev commented on BEAM-4000:
---

With the linked PR having been merged, should we close this issue?

> Futurize and fix python 2 compatibility for io subpackage
> -
>
> Key: BEAM-4000
> URL: https://issues.apache.org/jira/browse/BEAM-4000
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-4751) Finish io futurize stage 2: fix the missing pylint3 check in tox.ini

2018-08-01 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566209#comment-16566209
 ] 

Valentyn Tymofieiev commented on BEAM-4751:
---

With the linked PR having been merged, should we close this one now?

> Finish io futurize stage 2: fix the missing pylint3 check in tox.ini
> 
>
> Key: BEAM-4751
> URL: https://issues.apache.org/jira/browse/BEAM-4751
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Matthias Feys
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Finish io futurize stage 2: fix the missing pylint3 check in tox.ini





[jira] [Commented] (BEAM-4730) Replace try/except imports related to Py2/3 compatibility with from past.builtins imports

2018-08-01 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566208#comment-16566208
 ] 

Valentyn Tymofieiev commented on BEAM-4730:
---

I think this was addressed by https://github.com/apache/beam/pull/5869 - should 
we close this issue?
 
I see two places in the codebase where we still have try/excepts, which I think 
are written as intended: 
- 
https://github.com/apache/beam/blob/25a0070e48ab74e8d8e78f5f480d4824cffe270e/sdks/python/apache_beam/typehints/trivial_inference.py#L38
- 
https://github.com/apache/beam/blob/25a0070e48ab74e8d8e78f5f480d4824cffe270e/sdks/python/apache_beam/coders/coders.py#L317
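For context, the two styles look like this — a runnable sketch of the try/except guard that BEAM-4730 targets, with the `past.builtins` replacement shown as a comment (executing that line would require the `future` package):

```python
# Py2/3 compatibility via try/except (the style being replaced):
try:
    text_type = unicode  # Python 2: the built-in exists
except NameError:
    text_type = str      # Python 3: fall back to str

# Equivalent declarative import (needs the `future` package installed):
#   from past.builtins import unicode as text_type

print(text_type('hello'))  # prints: hello
```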

> Replace try/except imports related to Py2/3 compatibility with from 
> past.builtins imports
> -
>
> Key: BEAM-4730
> URL: https://issues.apache.org/jira/browse/BEAM-4730
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>






Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1147

2018-08-01 Thread Apache Jenkins Server
See 


--
[...truncated 20.47 MB...]
INFO: 2018-08-02T00:41:06.723Z: Autoscaling was automatically enabled for 
job 2018-08-01_17_41_06-9008827477136494736.
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:10.271Z: Checking required Cloud APIs are enabled.
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:10.808Z: Checking permissions granted to controller 
Service Account.
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:14.933Z: Worker configuration: n1-standard-1 in 
us-central1-b.
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:15.446Z: Expanding CoGroupByKey operations into 
optimizable parts.
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:15.764Z: Expanding GroupByKey operations into 
optimizable parts.
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:15.829Z: Lifting ValueCombiningMappingFns into 
MergeBucketsMappingFns
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.216Z: Fusing adjacent ParDo, Read, Write, and 
Flatten operations
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.273Z: Elided trivial flatten 
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.317Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map into SpannerIO.Write/Write 
mutations to Cloud Spanner/Create seed/Read(CreateSource)
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.372Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Read information schema into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.425Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.485Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/ParDo(IsmRecordForSingularValuePerWindow) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.530Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Read information schema
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.578Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Read
Aug 02, 2018 12:41:22 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-02T00:41:16.624Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Values/Values/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Extract
Aug 02, 2018 12:41:22 AM 

Jenkins build is back to normal : beam_PostCommit_Python_Verify #5637

2018-08-01 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-5058) Python precommits should run E2E tests

2018-08-01 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-5058:

Summary: Python precommits should run E2E tests  (was: Python precommits 
don't run any E2E tests)

> Python precommits should run E2E tests
> --
>
> Key: BEAM-5058
> URL: https://issues.apache.org/jira/browse/BEAM-5058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Priority: Major
>
> According to https://beam.apache.org/contribute/testing/ (which I'm working 
> on), end-to-end tests should be run in precommit on each combination of 
> {batch, streaming}x{SDK language}x{supported runner}.
> At least 2 tests need to be added to Python's precommit: wordcount and 
> wordcount_streaming on Dataflow, and possibly on other supported runners 
> (direct runner and new runners plz).
> These tests should be configured to run from a Gradle sub-project, so that 
> they're run in parallel to the unit tests.
> Example that parallelizes Java precommit integration tests: 
> https://github.com/apache/beam/pull/5731





[jira] [Updated] (BEAM-5058) Python precommits should run E2E tests

2018-08-01 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-5058:

Description: 
According to [https://beam.apache.org/contribute/testing/] (which I'm working 
on), end-to-end tests should be run in precommit on each combination of 
\{batch, streaming}x\{SDK language}x\{supported runner}.

At least 2 tests need to be added to Python's precommit: wordcount and 
wordcount_streaming on Dataflow, and possibly on other supported runners 
(direct runner and new runners plz).
 These tests should be configured to run from a Gradle sub-project, so that 
they're run in parallel to the unit tests.

Example that parallelizes Java precommit integration tests: 
[https://github.com/apache/beam/pull/5731]

  was:
According to https://beam.apache.org/contribute/testing/ (which I'm working 
on), end-to-end tests should be run in precommit on each combination of {batch, 
streaming}x{SDK language}x{supported runner}.

At least 2 tests need to be added to Python's precommit: wordcount and 
wordcount_streaming on Dataflow, and possibly on other supported runners 
(direct runner and new runners plz).
These tests should be configured to run from a Gradle sub-project, so that 
they're run in parallel to the unit tests.

Example that parallelizes Java precommit integration tests: 
https://github.com/apache/beam/pull/5731


> Python precommits should run E2E tests
> --
>
> Key: BEAM-5058
> URL: https://issues.apache.org/jira/browse/BEAM-5058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Priority: Major
>
> According to [https://beam.apache.org/contribute/testing/] (which I'm working 
> on), end-to-end tests should be run in precommit on each combination of 
> \{batch, streaming}x\{SDK language}x\{supported runner}.
> At least 2 tests need to be added to Python's precommit: wordcount and 
> wordcount_streaming on Dataflow, and possibly on other supported runners 
> (direct runner and new runners plz).
>  These tests should be configured to run from a Gradle sub-project, so that 
> they're run in parallel to the unit tests.
> Example that parallelizes Java precommit integration tests: 
> [https://github.com/apache/beam/pull/5731]





[jira] [Created] (BEAM-5058) Python precommits don't run any E2E tests

2018-08-01 Thread Udi Meiri (JIRA)
Udi Meiri created BEAM-5058:
---

 Summary: Python precommits don't run any E2E tests
 Key: BEAM-5058
 URL: https://issues.apache.org/jira/browse/BEAM-5058
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core, testing
Reporter: Udi Meiri


According to https://beam.apache.org/contribute/testing/ (which I'm working 
on), end-to-end tests should be run in precommit on each combination of {batch, 
streaming}x{SDK language}x{supported runner}.

At least 2 tests need to be added to Python's precommit: wordcount and 
wordcount_streaming on Dataflow, and possibly on other supported runners 
(direct runner and new runners plz).
These tests should be configured to run from a Gradle sub-project, so that 
they're run in parallel to the unit tests.

Example that parallelizes Java precommit integration tests: 
https://github.com/apache/beam/pull/5731
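The {batch, streaming}x{SDK language}x{supported runner} matrix described above expands mechanically; a small sketch with a hypothetical runner list:

```python
from itertools import product

modes = ['batch', 'streaming']
languages = ['Python']            # this issue concerns the Python SDK
runners = ['Dataflow', 'Direct']  # hypothetical precommit runner set

# One suite name per combination in the test matrix.
suites = ['%s_%s_%s' % combo for combo in product(languages, runners, modes)]
print(suites)
# ['Python_Dataflow_batch', 'Python_Dataflow_streaming',
#  'Python_Direct_batch', 'Python_Direct_streaming']
```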





Jenkins build is back to normal : beam_PostCommit_Py_VR_Dataflow #688

2018-08-01 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-4828) Add Amazon SqsIO

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=130045=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130045
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:29
Start Date: 02/Aug/18 00:29
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on issue #6101: [BEAM-4828] 
Add Amazon SqsIO
URL: https://github.com/apache/beam/pull/6101#issuecomment-409767049
 
 
  I did some testing with a fifo queue, and the Write transform needs some 
work before we merge this. The easiest change would be to accept a 
PCollection with a different element type.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130045)
Time Spent: 7h 10m  (was: 7h)

> Add Amazon SqsIO
> 
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]





Build failed in Jenkins: beam_PreCommit_Java_Cron #175

2018-08-01 Thread Apache Jenkins Server
See 


Changes:

[yifanzou] [BEAM-5018] Upgrade org.tukaani:xz to 1.8

[yifanzou] [BEAM-3906] Automate Validation Aganist Python Wheel

[chamikara] Fixes a Javadoc error

--
[...truncated 15.83 MB...]
org.apache.beam.sdk.nexmark.queries.sql.SqlQuery5Test > testBids STANDARD_ERROR
Aug 02, 2018 12:22:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `AuctionBids`.`auction`, `AuctionBids`.`num`
FROM (SELECT `B1`.`auction`, COUNT(*) AS `num`, HOP_START(`B1`.`dateTime`, 
INTERVAL '5' SECOND, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B1`
GROUP BY `B1`.`auction`, HOP(`B1`.`dateTime`, INTERVAL '5' SECOND, INTERVAL 
'10' SECOND)) AS `AuctionBids`
INNER JOIN (SELECT MAX(`CountBids`.`num`) AS `maxnum`, 
`CountBids`.`starttime`
FROM (SELECT COUNT(*) AS `num`, HOP_START(`B2`.`dateTime`, INTERVAL '5' 
SECOND, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B2`
GROUP BY `B2`.`auction`, HOP(`B2`.`dateTime`, INTERVAL '5' SECOND, INTERVAL 
'10' SECOND)) AS `CountBids`
GROUP BY `CountBids`.`starttime`) AS `MaxBids` ON `AuctionBids`.`starttime` 
= `MaxBids`.`starttime` AND `AuctionBids`.`num` >= `MaxBids`.`maxnum`
Aug 02, 2018 12:22:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(auction=[$0], num=[$1])
  LogicalJoin(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
LogicalProject(auction=[$0], num=[$2], starttime=[$1])
  LogicalAggregate(group=[{0, 1}], num=[COUNT()])
LogicalProject(auction=[$0], $f1=[HOP($3, 5000, 1)])
  BeamIOSourceRel(table=[[beam, Bid]])
LogicalProject(maxnum=[$1], starttime=[$0])
  LogicalAggregate(group=[{0}], maxnum=[MAX($1)])
LogicalProject(starttime=[$1], num=[$0])
  LogicalProject(num=[$2], starttime=[$1])
LogicalAggregate(group=[{0, 1}], num=[COUNT()])
  LogicalProject(auction=[$0], $f1=[HOP($3, 5000, 1)])
BeamIOSourceRel(table=[[beam, Bid]])

Aug 02, 2018 12:22:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..4=[{inputs}], proj#0..1=[{exprs}])
  BeamJoinRel(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
BeamCalcRel(expr#0..2=[{inputs}], auction=[$t0], num=[$t2], 
starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], expr#6=[1], 
expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])
BeamCalcRel(expr#0..1=[{inputs}], maxnum=[$t1], starttime=[$t0])
  BeamAggregationRel(group=[{1}], maxnum=[MAX($0)])
BeamCalcRel(expr#0..2=[{inputs}], num=[$t2], starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], 
expr#6=[1], expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])


org.apache.beam.sdk.nexmark.queries.sql.SqlQuery3Test > 
testJoinsPeopleWithAuctions STANDARD_ERROR
Aug 02, 2018 12:22:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `P`.`name`, `P`.`city`, `P`.`state`, `A`.`id`
FROM `beam`.`Auction` AS `A`
INNER JOIN `beam`.`Person` AS `P` ON `A`.`seller` = `P`.`id`
WHERE `A`.`category` = 10 AND (`P`.`state` = 'OR' OR `P`.`state` = 'ID' OR 
`P`.`state` = 'CA')
Aug 02, 2018 12:22:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(name=[$11], city=[$14], state=[$15], id=[$0])
  LogicalFilter(condition=[AND(=($8, 10), OR(=($15, 'OR'), =($15, 'ID'), 
=($15, 'CA')))])
LogicalJoin(condition=[=($7, $10)], joinType=[inner])
  BeamIOSourceRel(table=[[beam, Auction]])
  BeamIOSourceRel(table=[[beam, Person]])

Aug 02, 2018 12:22:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..17=[{inputs}], name=[$t11], city=[$t14], state=[$t15], 
id=[$t0])
  BeamJoinRel(condition=[=($7, $10)], joinType=[inner])
BeamCalcRel(expr#0..9=[{inputs}], expr#10=[10], expr#11=[=($t8, $t10)], 
proj#0..9=[{exprs}], $condition=[$t11])
  BeamIOSourceRel(table=[[beam, Auction]])
BeamCalcRel(expr#0..7=[{inputs}], expr#8=['OR'], expr#9=[=($t5, $t8)], 
expr#10=['ID'], expr#11=[=($t5, $t10)], expr#12=['CA'], expr#13=[=($t5, $t12)], 
expr#14=[OR($t9, $t11, $t13)], proj#0..7=[{exprs}], $condition=[$t14])
  BeamIOSourceRel(table=[[beam, Person]])



[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130030=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130030
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065425
 
 

 ##
 File path: sdks/python/apache_beam/transforms/trigger.py
 ##
 @@ -484,6 +494,9 @@ def __repr__(self):
   def __eq__(self, other):
 return type(self) == type(other) and self.underlying == other.underlying
 
+  def __hash__(self):
+return hash((type(self), self.underlying))
 
 Review comment:
   Let's remove `type(self)` from the tuple.
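The rule behind this comment is the `__eq__`/`__hash__` contract: objects that compare equal must hash equal. Since `__eq__` already requires `type(self) == type(other)`, hashing the underlying value alone still satisfies the contract — a minimal sketch with a hypothetical class, not the Beam trigger code:

```python
class Wrapper(object):
    def __init__(self, underlying):
        self.underlying = underlying

    def __eq__(self, other):
        # Equality already checks the type, so the hash need not include it.
        return type(self) == type(other) and self.underlying == other.underlying

    def __hash__(self):
        return hash(self.underlying)  # simpler than hash((type(self), ...))

a, b = Wrapper(('afterCount', 3)), Wrapper(('afterCount', 3))
print(a == b, hash(a) == hash(b))  # True True
```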

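The rationale can be sketched with a hedged example (the class below is an illustrative stand-in, not Beam's actual `trigger.py` code): since `__eq__` already guards on `type(self) == type(other)`, dropping `type(self)` from the hash tuple preserves the `__eq__`/`__hash__` contract (equal objects hash equal) while avoiding an extra hash of the type object on every call.

```python
class AfterAll:
    """Illustrative stand-in mirroring the trigger wrapper under review."""

    def __init__(self, underlying):
        self.underlying = underlying

    def __eq__(self, other):
        # The type check lives here, so __hash__ need not repeat it.
        return type(self) == type(other) and self.underlying == other.underlying

    def __hash__(self):
        # Equal objects (same type, equal underlying) still hash equal.
        return hash(self.underlying)


a, b = AfterAll(1), AfterAll(1)
assert a == b and hash(a) == hash(b)
```

Unequal hashes across distinct types are not required by the contract anyway; dict and set lookups fall back to `__eq__` on hash collisions.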

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130030)
Time Spent: 15h 10m  (was: 15h)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 15h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130025&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130025
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065473
 
 

 ##
 File path: sdks/python/apache_beam/transforms/trigger.py
 ##
 @@ -526,6 +537,9 @@ def __repr__(self):
   def __eq__(self, other):
 return type(self) == type(other) and self.triggers == other.triggers
 
+  def __hash__(self):
+return hash((type(self), self.triggers))
 
 Review comment:
   Let's remove `type(self)` from the tuple.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130025)
Time Spent: 14h 50m  (was: 14h 40m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130033
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207068499
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -246,10 +263,33 @@ def __init__(self, value, timestamp):
 self.value = value
 self.timestamp = Timestamp.of(timestamp)
 
-  def __cmp__(self, other):
-if type(self) is not type(other):
-  return cmp(type(self), type(other))
-return cmp((self.value, self.timestamp), (other.value, other.timestamp))
+  def __eq__(self, other):
 
 Review comment:
   I checked performance of `windowed_value`, `interval_window`, 
`timestamped_value`, `bounded_window` in dictionaries and ordered lists, with 
and without this PR. For the most part, performance is unchanged or 
improved, and `@total_ordering` does not significantly affect it. The only 
concern is that including `hash(type(self))` when computing object hashes may 
be unnecessary in most cases and slightly decreases performance here: 
https://github.com/apache/beam/pull/5729/files#diff-d7dfd884622fb59806ba9276cf3bd8fbR242.
 So I left some more comments to simplify the hash functions. The change above 
also triggered the test flakiness, although ultimately the test was at fault. 
   
   Without PR:
   ```
   wv_with_one_window: dict, 1 element(s)  : per element median time 
cost: 4.71699e-06 sec, relative std: 5.93%  
   
   wv_with_multiple_windows: dict, 1 element(s): per element median time 
cost: 4.02698e-05 sec, relative std: 0.60%  
   
   interval_window: dict, 1 element(s) : per element median time 
cost: 1.5276e-06 sec, relative std: 1.78%   
   
   timestamped_value: dict, 1 element(s)   : per element median time 
cost: 1.39499e-07 sec, relative std: 7.44%  
   
   interval_window: sorting., 1 element(s) : per element median time 
cost: 4.04392e-05 sec, relative std: 0.63%  
   
   timestamped_value: sorting., 1 element(s)   : per element median time 
cost: 1.80363e-05 sec, relative std: 1.35%  
   
   bounded_window: sorting., 1 element(s)  : per element median time 
cost: 4.06633e-05 sec, relative std: 1.26%  
   
   ```
   With PR (including the change suggested in the last iteration):
   
   ```
   wv_with_one_window: dict, 1 element(s)  : per element median time 
cost: 5.047e-06 sec, relative std: 2.16%
   wv_with_multiple_windows: dict, 1 element(s): per element median time 
cost: 4.0575e-05 sec, relative std: 2.20%
   interval_window: dict, 1 element(s) : per element median time 
cost: 1.53821e-06 sec, relative std: 2.43%
   timestamped_value: dict, 1 element(s)   : per element median time 
cost: 1.27995e-06 sec, relative std: 6.11%
   interval_window: sorting., 1 element(s) : per element median time 
cost: 1.83087e-05 sec, relative std: 1.28%
   timestamped_value: sorting., 1 element(s)   : per element median time 
cost: 8.4375e-06 sec, relative std: 2.62%
   bounded_window: sorting., 1 element(s)  : per element median time 
cost: 1.80462e-05 sec, relative std: 3.56%
   ```
   
   
   
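The `__cmp__`-to-rich-comparison migration benchmarked above follows the standard Python 3 pattern: define `__eq__` plus one ordering method and let `functools.total_ordering` derive the rest. A minimal sketch under that assumption (the class is an illustrative stand-in, not Beam's actual `window.py` code):

```python
from functools import total_ordering


@total_ordering
class TimestampedValue:
    """Illustrative stand-in for the reviewed class; __cmp__ is gone in Py3."""

    def __init__(self, value, timestamp):
        self.value = value
        self.timestamp = timestamp

    def __eq__(self, other):
        return (type(self) == type(other)
                and self.value == other.value
                and self.timestamp == other.timestamp)

    def __lt__(self, other):
        # total_ordering fills in <=, >, and >= from __eq__ and __lt__.
        return (self.value, self.timestamp) < (other.value, other.timestamp)


assert TimestampedValue(1, 2) < TimestampedValue(1, 3)
```

The benchmarks above suggest the decorator's extra indirection is not a practical concern for these classes.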


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130033)
Time Spent: 15.5h  (was: 15h 20m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 15.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130028
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065504
 
 

 ##
 File path: sdks/python/apache_beam/transforms/trigger.py
 ##
 @@ -620,6 +634,9 @@ def __repr__(self):
   def __eq__(self, other):
 return type(self) == type(other) and self.triggers == other.triggers
 
+  def __hash__(self):
+return hash((type(self), self.triggers))
 
 Review comment:
   Let's remove `type(self)` from the tuple.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130028)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 15h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130024&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130024
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065689
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -246,10 +273,23 @@ def __init__(self, value, timestamp):
 self.value = value
 self.timestamp = Timestamp.of(timestamp)
 
-  def __cmp__(self, other):
-if type(self) is not type(other):
-  return cmp(type(self), type(other))
-return cmp((self.value, self.timestamp), (other.value, other.timestamp))
+  def __eq__(self, other):
+return (type(self) == type(other)
+and self.value == other.value
+and self.timestamp == other.timestamp)
+
+  def __hash__(self):
+return hash((type(self), self.value, self.timestamp))
 
 Review comment:
   Let's remove `type(self)` from the tuple.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130024)
Time Spent: 14h 40m  (was: 14.5h)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130032&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130032
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065995
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -474,6 +526,12 @@ def __eq__(self, other):
 if type(self) == type(other) == Sessions:
   return self.gap_size == other.gap_size
 
+  def __ne__(self, other):
+return not self == other
+
+  def __hash__(self):
+return hash((type(self), self.gap_size))
 
 Review comment:
   Let's remove `type(self)` from the tuple.

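For context on the explicit `__ne__` in the diff above: Python 2 does not derive `__ne__` from `__eq__`, so futurized code must define it; on Python 3 alone it would be automatic. A hedged sketch of the pattern (the class name is reused for illustration only, not Beam's actual implementation):

```python
class Sessions:
    """Hypothetical windowing fn showing the Py2-compatible __ne__ pattern."""

    def __init__(self, gap_size):
        self.gap_size = gap_size

    def __eq__(self, other):
        return type(self) == type(other) and self.gap_size == other.gap_size

    def __ne__(self, other):
        # Required on Python 2; Python 3 derives __ne__ from __eq__ itself.
        return not self == other


assert Sessions(10) != Sessions(20)
assert not (Sessions(10) != Sessions(10))
```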

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130032)
Time Spent: 15h 20m  (was: 15h 10m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 15h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130029&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130029
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065825
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -348,6 +391,9 @@ def __eq__(self, other):
 if type(self) == type(other) == FixedWindows:
   return self.size == other.size and self.offset == other.offset
 
+  def __hash__(self):
+return hash((type(self), self.size, self.offset))
 
 Review comment:
   Let's remove `type(self)` from the tuple.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130029)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 15h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130022&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130022
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207064664
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -291,6 +293,9 @@ def __eq__(self, other):
   return self.param_id == other.param_id
 return False
 
+  def __hash__(self):
+return hash((type(self), self.param_id))
 
 Review comment:
   let's simplify this to `hash(self.param_id)`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130022)
Time Spent: 14h 20m  (was: 14h 10m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 14h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130023
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207064203
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -218,10 +239,15 @@ def __init__(self, start, end):
 self.start = Timestamp.of(start)
 
   def __hash__(self):
-return hash((self.start, self.end))
+return hash((self.start, self.end, type(self)))
 
 Review comment:
   Let's remove `type(self)` from the tuple.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130023)
Time Spent: 14.5h  (was: 14h 20m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130031
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065911
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -407,6 +453,12 @@ def __eq__(self, other):
   and self.offset == other.offset
   and self.period == other.period)
 
+  def __ne__(self, other):
+return not self == other
+
+  def __hash__(self):
+return hash((type(self), self.offset, self.period))
 
 Review comment:
   Let's remove `type(self)` from the tuple.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130031)
Time Spent: 15h 20m  (was: 15h 10m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 15h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130027
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065328
 
 

 ##
 File path: sdks/python/apache_beam/transforms/trigger.py
 ##
 @@ -446,6 +453,9 @@ def __repr__(self):
   def __eq__(self, other):
 return type(self) == type(other) and self.count == other.count
 
+  def __hash__(self):
+return hash((type(self), self.count))
 
 Review comment:
   Let's simplify this to `return hash(self.count)`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130027)
Time Spent: 15h  (was: 14h 50m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 15h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=130026&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130026
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 02/Aug/18 00:08
Start Date: 02/Aug/18 00:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207065158
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -1712,6 +1718,10 @@ def __eq__(self, other):
   and self.timestamp_combiner == other.timestamp_combiner)
 return False
 
+  def __hash__(self):
+return hash((type(self), self.windowfn, self.accumulation_mode,
 
 Review comment:
   Since this was not defined, most likely this will be dead code, and current 
implementation may break the contract with `__eq__` since it's not taking 
`self._default` into account, let's make it `__hash__ = None`.

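The `__hash__ = None` suggestion makes instances explicitly unhashable, which matches Python 3's default behavior when a class defines `__eq__` without `__hash__`, and avoids shipping a hash that disagrees with `__eq__`. A minimal illustration (the class is a hypothetical stand-in, not Beam's actual `WindowInto`):

```python
class WindowInto:
    """Hypothetical: __eq__ without a hash that honors the full contract."""

    __hash__ = None  # instances cannot be used as dict keys or set members

    def __init__(self, windowfn):
        self.windowfn = windowfn

    def __eq__(self, other):
        return type(self) == type(other) and self.windowfn == other.windowfn


assert WindowInto("fixed") == WindowInto("fixed")
try:
    hash(WindowInto("fixed"))
    raised = False
except TypeError:  # unhashable type, as intended
    raised = True
assert raised
```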

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130026)
Time Spent: 15h  (was: 14h 50m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 15h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4828) Add Amazon SqsIO

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=130020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130020
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 01/Aug/18 23:56
Start Date: 01/Aug/18 23:56
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on issue #6101: [BEAM-4828] 
Add Amazon SqsIO
URL: https://github.com/apache/beam/pull/6101#issuecomment-409762073
 
 
   I cannot reproduce the build failures locally, and they do not appear to be 
related to the changes in this PR


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130020)
Time Spent: 7h  (was: 6h 50m)

> Add Amazon SqsIO
> 
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4828) Add Amazon SqsIO

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=130010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130010
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 01/Aug/18 23:19
Start Date: 01/Aug/18 23:19
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on issue #6101: [BEAM-4828] 
Add Amazon SqsIO
URL: https://github.com/apache/beam/pull/6101#issuecomment-409756219
 
 
   @iemejia Please take another look


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130010)
Time Spent: 6h 40m  (was: 6.5h)

> Add Amazon SqsIO
> 
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4828) Add Amazon SqsIO

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=130011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-130011
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 01/Aug/18 23:19
Start Date: 01/Aug/18 23:19
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on issue #6101: [BEAM-4828] 
Add Amazon SqsIO
URL: https://github.com/apache/beam/pull/6101#issuecomment-409756263
 
 
   @NathanHowell 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 130011)
Time Spent: 6h 50m  (was: 6h 40m)

> Add Amazon SqsIO
> 
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4147) Abstractions for artifact delivery via arbitrary storage backends

2018-08-01 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-4147.

   Resolution: Resolved
Fix Version/s: 2.6.0

We have a working ArtifactStaging and ArtifactRetrieval service based on the 
Beam file system, which covers a majority of distributed file systems.

> Abstractions for artifact delivery via arbitrary storage backends
> -
>
> Key: BEAM-4147
> URL: https://issues.apache.org/jira/browse/BEAM-4147
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Ben Sidhom
>Assignee: Axel Magnuson
>Priority: Major
> Fix For: 2.6.0
>
>
> We need a way to wire in arbitrary runner artifact storage backends into the 
> job server and through to artifact staging on workers. This requires some new 
> abstractions in front of the job service.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3230) beam_PerformanceTests_Python is failing

2018-08-01 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566089#comment-16566089
 ] 

Valentyn Tymofieiev commented on BEAM-3230:
---

I think [~markflyhigh] was looking into this recently and may know better what 
the current status is.

> beam_PerformanceTests_Python is failing
> ---
>
> Key: BEAM-3230
> URL: https://issues.apache.org/jira/browse/BEAM-3230
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Critical
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Jenkins beam_PerformanceTests_Python stage is failing for Python builds.
> Here is the link to a failure console output 
> https://builds.apache.org/job/beam_PerformanceTests_Python/582/console



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3230) beam_PerformanceTests_Python is failing

2018-08-01 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566065#comment-16566065
 ] 

Ankur Goenka commented on BEAM-3230:


[~tvalentyn] The latest failure is 
[https://builds.apache.org/job/beam_PerformanceTests_Python/1263/console], 
where the version.py file is missing.

Is this test still valid after changes in the past few months?

> beam_PerformanceTests_Python is failing
> ---
>
> Key: BEAM-3230
> URL: https://issues.apache.org/jira/browse/BEAM-3230
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Critical
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Jenkins beam_PerformanceTests_Python stage is failing for Python builds.
> Here is the link to a failure console output 
> https://builds.apache.org/job/beam_PerformanceTests_Python/582/console



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5022) Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to BeamModulePlugin

2018-08-01 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566055#comment-16566055
 ] 

Ankur Goenka commented on BEAM-5022:


Addressed as part of [https://github.com/apache/beam/pull/6073]

> Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to 
> BeamModulePlugin
> --
>
> Key: BEAM-5022
> URL: https://issues.apache.org/jira/browse/BEAM-5022
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, runner-flink
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>
> Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to 
> BeamModulePlugin So that it can be used by other portable runners tests.
>  
> Also create an interface TestJobserverDriver and make the drivers extend it 
> instead of using reflection to start the Jobserver.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4067) Java: FlinkPortableTestRunner: runs portably via self-started local Flink

2018-08-01 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-4067.

   Resolution: Fixed
Fix Version/s: 2.7.0

> Java: FlinkPortableTestRunner: runs portably via self-started local Flink
> -
>
> Key: BEAM-4067
> URL: https://issues.apache.org/jira/browse/BEAM-4067
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Minor
> Fix For: 2.7.0
>
>
> The portable Flink runner cannot be tested through the normal mechanisms used 
> for ValidatesRunner tests because it requires a job server to be constructed 
> out of band and for pipelines to be run through it. We should implement a 
> shim that acts as a standard Java SDK Runner that spins up the necessary 
> server (possibly in-process) and runs against it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4067) Java: FlinkPortableTestRunner: runs portably via self-started local Flink

2018-08-01 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566051#comment-16566051
 ] 

Ankur Goenka commented on BEAM-4067:


This feature is implemented in https://github.com/apache/beam/pull/5935

> Java: FlinkPortableTestRunner: runs portably via self-started local Flink
> -
>
> Key: BEAM-4067
> URL: https://issues.apache.org/jira/browse/BEAM-4067
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Minor
>
> The portable Flink runner cannot be tested through the normal mechanisms used 
> for ValidatesRunner tests because it requires a job server to be constructed 
> out of band and for pipelines to be run through it. We should implement a 
> shim that acts as a standard Java SDK Runner that spins up the necessary 
> server (possibly in-process) and runs against it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #759

2018-08-01 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Fixes a Javadoc error

--
[...truncated 18.70 MB...]
INFO: Adding 
View.AsSingleton/Combine.GloballyAsSingletonView/ParDo(IsmRecordForSingularValuePerWindow)
 as step s8
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
View.AsSingleton/Combine.GloballyAsSingletonView/CreateDataflowView as step s9
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding Create123/Read(CreateSource) as step s10
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding OutputSideInputs as step s11
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Window.Into()/Window.Assign as step 
s12
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/GatherAllOutputs/Reify.Window/ParDo(Anonymous) as step 
s13
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map 
as step s14
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign as step 
s15
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/GroupByKey as step 
s16
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/Values/Values/Map as 
step s17
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/RewindowActuals/Window.Assign as step 
s18
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/KeyForDummy/AddKeys/Map as step s19
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveActualsTriggering/Flatten.PCollections as step 
s20
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Create.Values/Read(CreateSource) as 
step s21
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/WindowIntoDummy/Window.Assign as step 
s22
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveDummyTriggering/Flatten.PCollections as step s23
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/FlattenDummyAndContents as step s24
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/NeverTrigger/Flatten.PCollections as 
step s25
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GroupDummyAndContents as step s26
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Values/Values/Map as step s27
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/ParDo(Concat) as step s28
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GetPane/Map as step s29
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/RunChecks as step s30
Aug 01, 2018 10:00:08 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/VerifyAssertions/ParDo(DefaultConclude) as step s31
Aug 01, 2018 10:00:08 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0801220004-932518de/output/results/staging/
Aug 01, 2018 

[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129990&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129990
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 01/Aug/18 21:55
Start Date: 01/Aug/18 21:55
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6110: 
[BEAM-4176] Tests for running Python on Flink.
URL: https://github.com/apache/beam/pull/6110#discussion_r207045099
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/flink_runner_test.py
 ##
 @@ -0,0 +1,76 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+from __future__ import absolute_import
+from __future__ import print_function
+
+import logging
+import sys
+import unittest
+
+import apache_beam as beam
+from apache_beam.options.pipeline_options import DebugOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from apache_beam.runners.portability import portable_runner
+from apache_beam.runners.portability import portable_runner_test
+from apache_beam.testing.util import assert_that
+
+if __name__ == '__main__':
+  # Run as
+  #
+  # python -m apache_beam.runners.portability.flink_runner_test \
+  # /path/to/job_server.jar \
+  # [FlinkRunnerTest.test_method, ...]
+  flinkJobServerJar = sys.argv.pop(1)
+
+  # This is defined here to only be run when we invoke this file explicitly.
+  class FlinkRunnerTest(portable_runner_test.PortableRunnerTest):
+    _use_grpc = True
+    _use_subprocesses = True
+
+    @classmethod
+    def _subprocess_command(cls, port):
+      return [
+          'java',
+          '-jar', flinkJobServerJar,
+          '--artifacts-dir', '/tmp/flink',
+          '--job-host', 'localhost:%s' % port,
+      ]
+
+    @classmethod
+    def get_runner(cls):
+      return portable_runner.PortableRunner()
+
+    def create_options(self):
+      options = super(FlinkRunnerTest, self).create_options()
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129990)
Time Spent: 9h 10m  (was: 9h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
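The `_subprocess_command` hook in the snippet above builds the command line that launches the Flink job server jar as a child process. A rough Java equivalent of that plumbing might look like the following sketch, where the jar path and port are placeholder assumptions, not values from the Beam build:

```java
// Rough Java equivalent of the Python _subprocess_command above; the jar
// path and port passed in main are placeholders for illustration only.
public class JobServerLauncherSketch {

  /** Builds the child-process command that would launch the job server jar. */
  static ProcessBuilder command(String jobServerJar, int port) {
    return new ProcessBuilder(
        "java",
        "-jar", jobServerJar,
        "--artifacts-dir", "/tmp/flink",
        "--job-host", "localhost:" + port);
  }

  public static void main(String[] args) {
    ProcessBuilder pb = command("/path/to/job_server.jar", 8099);
    // pb.start() would actually spawn the server; just print the command here.
    System.out.println(String.join(" ", pb.command()));
  }
}
```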


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129991
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 01/Aug/18 21:55
Start Date: 01/Aug/18 21:55
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6110: 
[BEAM-4176] Tests for running Python on Flink.
URL: https://github.com/apache/beam/pull/6110#discussion_r207034228
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/flink_runner_test.py
 ##
 @@ -0,0 +1,76 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+from __future__ import absolute_import
+from __future__ import print_function
+
+import logging
+import sys
+import unittest
+
+import apache_beam as beam
+from apache_beam.options.pipeline_options import DebugOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from apache_beam.runners.portability import portable_runner
+from apache_beam.runners.portability import portable_runner_test
+from apache_beam.testing.util import assert_that
+
+if __name__ == '__main__':
+  # Run as
+  #
+  # python -m apache_beam.runners.portability.flink_runner_test \
+  # /path/to/job_server.jar \
+  # [FlinkRunnerTest.test_method, ...]
+  flinkJobServerJar = sys.argv.pop(1)
+
+  # This is defined here to only be run when we invoke this file explicitly.
+  class FlinkRunnerTest(portable_runner_test.PortableRunnerTest):
+    _use_grpc = True
+    _use_subprocesses = True
+
+    @classmethod
+    def _subprocess_command(cls, port):
+      return [
+          'java',
+          '-jar', flinkJobServerJar,
+          '--artifacts-dir', '/tmp/flink',
 Review comment:
   Makes sense.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129991)
Time Spent: 9h 20m  (was: 9h 10m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4828) Add Amazon SqsIO

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=129976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129976
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 01/Aug/18 21:24
Start Date: 01/Aug/18 21:24
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on a change in pull request 
#6101: [BEAM-4828] Add Amazon SqsIO
URL: https://github.com/apache/beam/pull/6101#discussion_r207036773
 
 

 ##
 File path: sdks/java/io/amazon-web-services/build.gradle
 ##
 @@ -42,4 +45,12 @@ dependencies {
   shadowTest library.java.mockito_core
   shadowTest library.java.junit
   shadowTest library.java.slf4j_jdk14
+  shadowTest group: 'org.elasticmq', name: 'elasticmq-rest-sqs_2.12', version: '0.14.1'
+}
+
+test {
 
 Review comment:
   If you would like me to add some pure unit tests, lmk, I'm willing to do so.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129976)
Time Spent: 6.5h  (was: 6h 20m)

> Add Amazon SqsIO
> 
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4828) Add Amazon SqsIO

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=129974&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129974
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 01/Aug/18 21:23
Start Date: 01/Aug/18 21:23
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on a change in pull request 
#6101: [BEAM-4828] Add Amazon SqsIO
URL: https://github.com/apache/beam/pull/6101#discussion_r207036603
 
 

 ##
 File path: sdks/java/io/amazon-web-services/build.gradle
 ##
 @@ -42,4 +45,12 @@ dependencies {
   shadowTest library.java.mockito_core
   shadowTest library.java.junit
   shadowTest library.java.slf4j_jdk14
+  shadowTest group: 'org.elasticmq', name: 'elasticmq-rest-sqs_2.12', version: '0.14.1'
+}
+
+test {
 
 Review comment:
   Further, I was modeling my tests after what AmqpIOTest is doing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129974)
Time Spent: 6h 20m  (was: 6h 10m)

> Add Amazon SqsIO
> 
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4828) Add Amazon SqsIO

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=129964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129964
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 01/Aug/18 21:20
Start Date: 01/Aug/18 21:20
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on a change in pull request 
#6101: [BEAM-4828] Add Amazon SqsIO
URL: https://github.com/apache/beam/pull/6101#discussion_r207035209
 
 

 ##
 File path: 
sdks/java/io/amazon-web-services/src/main/java/org/apache/beam/sdk/io/aws/sqs/SqsUnboundedReader.java
 ##
 @@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.aws.sqs;
+
+import com.amazonaws.services.sqs.AmazonSQS;
+import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
+import com.amazonaws.services.sqs.model.Message;
+import com.amazonaws.services.sqs.model.ReceiveMessageRequest;
+import com.amazonaws.services.sqs.model.ReceiveMessageResult;
+import com.google.common.collect.Lists;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayDeque;
+import java.util.HashSet;
+import java.util.List;
+import java.util.NoSuchElementException;
+import java.util.Queue;
+import java.util.Set;
+import org.apache.beam.sdk.io.UnboundedSource;
+import org.apache.beam.sdk.io.UnboundedSource.CheckpointMark;
+import org.joda.time.Instant;
+
+class SqsUnboundedReader extends UnboundedSource.UnboundedReader<Message> {
+
+  public static final int MAX_NUMBER_OF_MESSAGES = 10;
+  private final SqsUnboundedSource source;
+  private final AmazonSQS sqs;
+
+  private Message current;
+  private final Queue<Message> messagesNotYetRead;
+  private Set<String> receiptHandlesToDelete;
+
+  public SqsUnboundedReader(SqsUnboundedSource source, SqsCheckpointMark sqsCheckpointMark) {
+    this.source = source;
+    this.current = null;
+
+    this.messagesNotYetRead = new ArrayDeque<>();
+    receiptHandlesToDelete = new HashSet<>();
+
+    final SqsConfiguration sqsConfiguration = source.getSqsConfiguration();
+    sqs =
+        AmazonSQSClientBuilder.standard()
+            .withClientConfiguration(sqsConfiguration.getClientConfiguration())
+            .withCredentials(sqsConfiguration.getAwsCredentialsProvider())
+            .withRegion(sqsConfiguration.getAwsRegion())
+            .build();
+
+    if (sqsCheckpointMark != null) {
+      if (sqsCheckpointMark.getReceiptHandlesToDelete() != null) {
+        receiptHandlesToDelete.addAll(sqsCheckpointMark.getReceiptHandlesToDelete());
+      }
+    }
+  }
+
+  @Override
+  public Instant getWatermark() {
+    return Instant.now();
 
 Review comment:
   I did not initially spot the strange way that SQS handles message 
timestamps, given the poor documentation. I am refactoring this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129964)
Time Spent: 6h 10m  (was: 6h)

> Add Amazon SqsIO
> 
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
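On the timestamp point raised in the review above: SQS reports a message's send time through the `SentTimestamp` attribute as epoch milliseconds in string form, so parsing it as ISO-8601 text would fail. A minimal sketch of the correct parsing, using `java.time` rather than the Joda-Time `Instant` in the reader (`sentTime` is a hypothetical helper, not a Beam method):

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: SQS's "SentTimestamp" attribute is a decimal
// string of epoch milliseconds, not ISO-8601 text, so Instant.parse(...)
// on it would throw; parse it as a long instead.
public class SqsTimestampSketch {

  /** Extracts the send time from an SQS message attribute map. */
  static Instant sentTime(Map<String, String> attributes) {
    return Instant.ofEpochMilli(Long.parseLong(attributes.get("SentTimestamp")));
  }

  public static void main(String[] args) {
    Map<String, String> attrs = new HashMap<>();
    attrs.put("SentTimestamp", "1533160808000"); // 2018-08-01T22:00:08Z
    System.out.println(sentTime(attrs));
  }
}
```

A watermark derived from the oldest unacknowledged message's send time would then follow from the same attribute, rather than from `Instant.now()`.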


[jira] [Work logged] (BEAM-4828) Add Amazon SqsIO

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=129963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129963
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 01/Aug/18 21:18
Start Date: 01/Aug/18 21:18
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on a change in pull request 
#6101: [BEAM-4828] Add Amazon SqsIO
URL: https://github.com/apache/beam/pull/6101#discussion_r207034913
 
 

 ##
 File path: 
sdks/java/io/amazon-web-services/src/main/java/org/apache/beam/sdk/io/aws/sqs/SqsUnboundedReader.java
 ##
 @@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.aws.sqs;
+
+import com.amazonaws.services.sqs.AmazonSQS;
+import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
+import com.amazonaws.services.sqs.model.Message;
+import com.amazonaws.services.sqs.model.ReceiveMessageRequest;
+import com.amazonaws.services.sqs.model.ReceiveMessageResult;
+import com.google.common.collect.Lists;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayDeque;
+import java.util.HashSet;
+import java.util.List;
+import java.util.NoSuchElementException;
+import java.util.Queue;
+import java.util.Set;
+import org.apache.beam.sdk.io.UnboundedSource;
+import org.apache.beam.sdk.io.UnboundedSource.CheckpointMark;
+import org.joda.time.Instant;
+
+class SqsUnboundedReader extends UnboundedSource.UnboundedReader<Message> {
+
+  public static final int MAX_NUMBER_OF_MESSAGES = 10;
+  private final SqsUnboundedSource source;
+  private final AmazonSQS sqs;
+
+  private Message current;
+  private final Queue<Message> messagesNotYetRead;
+  private Set<String> receiptHandlesToDelete;
+
+  public SqsUnboundedReader(SqsUnboundedSource source, SqsCheckpointMark sqsCheckpointMark) {
+    this.source = source;
+    this.current = null;
+
+    this.messagesNotYetRead = new ArrayDeque<>();
+    receiptHandlesToDelete = new HashSet<>();
+
+    final SqsConfiguration sqsConfiguration = source.getSqsConfiguration();
+    sqs =
+        AmazonSQSClientBuilder.standard()
+            .withClientConfiguration(sqsConfiguration.getClientConfiguration())
+            .withCredentials(sqsConfiguration.getAwsCredentialsProvider())
+            .withRegion(sqsConfiguration.getAwsRegion())
+            .build();
+
+    if (sqsCheckpointMark != null) {
+      if (sqsCheckpointMark.getReceiptHandlesToDelete() != null) {
+        receiptHandlesToDelete.addAll(sqsCheckpointMark.getReceiptHandlesToDelete());
+      }
+    }
+  }
+
+  @Override
+  public Instant getWatermark() {
+    return Instant.now();
+  }
+
+  @Override
+  public Instant getCurrentTimestamp() throws NoSuchElementException {
+    if (current == null) {
+      throw new NoSuchElementException();
+    }
+    return Instant.parse(current.getAttributes().get("Timestamp"));
+  }
+
+  @Override
+  public Message getCurrent() throws NoSuchElementException {
+    if (current == null) {
+      throw new NoSuchElementException();
+    }
+    return current;
+  }
+
+  @Override
+  public CheckpointMark getCheckpointMark() {
+    List<String> snapshotReceiptHandlesToDelete = Lists.newArrayList(receiptHandlesToDelete);
+    return new SqsCheckpointMark(this, snapshotReceiptHandlesToDelete);
+  }
+
+  @Override
+  public SqsUnboundedSource getCurrentSource() {
+    return source;
+  }
+
+  @Override
+  public boolean start() {
+    return advance();
+  }
+
+  @Override
+  public boolean advance() {
+    if (messagesNotYetRead.isEmpty()) {
+      pull();
+    }
+
+    current = messagesNotYetRead.poll();
+    if (current == null) {
+      return false;
+    }
+
+    receiptHandlesToDelete.add(current.getReceiptHandle());
+    return true;
+  }
+
+  @Override
+  public void close() {}
+
+  void delete(String receiptHandle) {
+    sqs.deleteMessage(source.getRead().queueUrl(), receiptHandle);
 
 Review comment:
   AWS limits this to 10, I vote to defer.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

[beam-site] 01/02: Fixed Spelling.

2018-08-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 766d24ea80f1ec5fb058a950c5204e3869aaed0d
Author: Jimmy Casey 
AuthorDate: Sun Jul 29 21:36:38 2018 +

Fixed Spelling.
---
 content/js/language-switch.js | 2 +-
 src/js/language-switch.js | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/js/language-switch.js b/content/js/language-switch.js
index 6c30e88..6157b71 100644
--- a/content/js/language-switch.js
+++ b/content/js/language-switch.js
@@ -69,7 +69,7 @@ $(document).ready(function() {
 /**
  * @desc Search next sibling and if it's also a code block, then 
store
 it's type and move onto the next element. It will keep
-looking untill their is no direct code block decendent 
left.
+looking until their is no direct code block decendent left.
  * @param object $el - jQuery object, from where to start 
searching.
  * @param array $lang - list to hold types, found while searching.
  * @return array - list of types found.
diff --git a/src/js/language-switch.js b/src/js/language-switch.js
index 6c30e88..6157b71 100644
--- a/src/js/language-switch.js
+++ b/src/js/language-switch.js
@@ -69,7 +69,7 @@ $(document).ready(function() {
 /**
  * @desc Search next sibling and if it's also a code block, then 
store
 it's type and move onto the next element. It will keep
-looking untill their is no direct code block decendent 
left.
+looking until their is no direct code block decendent left.
  * @param object $el - jQuery object, from where to start 
searching.
  * @param array $lang - list to hold types, found while searching.
  * @return array - list of types found.



[beam-site] 02/02: This closes #515

2018-08-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 9db1b3c46de8d7db43489992fc43d9f5d7da37a1
Merge: bcb7061 766d24e
Author: Mergebot 
AuthorDate: Wed Aug 1 21:10:02 2018 +

This closes #515

 content/js/language-switch.js | 2 +-
 src/js/language-switch.js | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)



[beam-site] branch mergebot updated (fc61efe -> 9db1b3c)

2018-08-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from fc61efe  This closes #516
 add bcb7061  Prepare repository for deployment.
 new 766d24e  Fixed Spelling.
 new 9db1b3c  This closes #515

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/contribute/testing/index.html | 273 +-
 content/js/language-switch.js |   2 +-
 src/js/language-switch.js |   2 +-
 3 files changed, 7 insertions(+), 270 deletions(-)



[beam-site] 01/01: Prepare repository for deployment.

2018-08-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit bcb7061eb21de36e7ea353d4337fef6537e6db48
Author: Mergebot 
AuthorDate: Wed Aug 1 21:09:18 2018 +

Prepare repository for deployment.
---
 content/contribute/testing/index.html | 273 +-
 1 file changed, 5 insertions(+), 268 deletions(-)

diff --git a/content/contribute/testing/index.html 
b/content/contribute/testing/index.html
index 85a50d8..333a387 100644
--- a/content/contribute/testing/index.html
+++ b/content/contribute/testing/index.html
@@ -28,7 +28,7 @@
   
   
   
-  Beam Testing Guide
+  Beam Testing
   
   https://fonts.googleapis.com/css?family=Roboto:100,300,400; 
rel="stylesheet">
@@ -195,13 +195,6 @@
 
 
 
-  Overview
-  Testing Matrix
-
-  Java SDK
-  Python SDK
-
-  
   Testing Scenarios
 
   Precommit
@@ -251,267 +244,11 @@ See the License for the specific language governing 
permissions and
 limitations under the License.
 -->
 
-Beam Testing Documentation
-
-
-  Overview
-  Testing 
Matrix
-  Java SDK
-  Python 
SDK
-
-  
-  Testing 
Scenarios
-  Precommit
-  Postcommit
-
-  
-  Testing 
Types
-  Unit
-  How to run Python unit 
tests
-  How to run Java NeedsRunner 
tests
-
-  
-  ValidatesRunner
-  E2E
-
-  
-  Testing 
Systems
-  E2E Testing Framework
-  ValidatesRunner Tests
-  Effective use of 
the TestPipeline JUnit rule
-  API Surface testing
-
-  
-  Best 
practices for writing tests
-  Aim for one failure path
-  Avoid non-deterministic 
code
-  Use descriptive test names
-  Use a pre-commit test if 
possible
-
-  
-
+Beam Testing
 
-Overview
-
-Apache Beam is a rapidly-maturing software project with a strong
-commitment to testing. Consequently, it has many testing-related needs. It
-requires precommit tests to ensure code going into the repository meets a
-certain quality bar and it requires ongoing postcommit tests to make sure that
-more subtle changes which escape precommit are nonetheless caught. This 
document
-outlines how to write tests, which tests are appropriate where, and when tests
-are run, with some additional information about the testing systems at the
-bottom.
-
-If you’re writing tests, take a look at the testing matrix first, find what 
you
-want to test, then look into the “Scenarios” and “Types” sections below for 
more
-details on those testing types.
-
-Testing Matrix
-
-Java SDK
-
-
-  
-   Component to Test
-   
-   Test Scenario
-   
-   Tool to Use
-   
-   Link to Example
-   
-   Type
-   
-   Runs In
-   
-  
-   BoundedSource
-   
-   Correctly Reads Input
-   
-   https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java#L128;>SourceTestUtils.readFromSource
-   
-   https://github.com/apache/beam/blob/84a0dd1714028370befa80dea16f720edce05252/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java#L972;>TextIOTest
-   
-   Unit
-   
-   Precommit, Postcommit
-   
-  
-  
-   
-   
-   Correct Initial Splitting
-   
-   https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java#L201;>SourceTestUtils.assertSourcesEqualReferenceSource
-   
-   https://github.com/apache/beam/blob/8b1e64a668489297e11926124c4eee6c8f69a3a7/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIOTest.java#L339;>BigtableTest
-   
-   Unit
-   
-   Precommit, Postcommit
-   
-  
-  
-   
-   
-   Correct Dynamic Splitting
-   
-   https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java#L541;>SourceTestUtils.
 assertSplitAtFractionExhaustive
-   
-   https://github.com/apache/beam/blob/84a0dd1714028370befa80dea16f720edce05252/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java#L1021;>TextIOTest
-   
-   Unit
-   
-   Precommit, Postcommit
-   
-  
-  
-   Transform
-   
-   Correctness
-   
-   @NeedsRunner Test
-   
-   https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java#L1199;>ParDoTest
-   
-   @NeedsRunner
-   
-   
-   
-  
-  
-   Example Pipeline
-   
-   Verify Behavior on Each Runner
-   
-   E2E Test
-   
-   https://github.com/apache/beam/blob/master/examples/java/src/test/java/org/apache/beam/examples/WordCountIT.java#L76;>WordCountIT
-   
-   E2E
-   
-   Postcommit (Except WordCountIT)
-   
-  
-  
-   Source/Sink with external resource
-   
-   External Resource Faked
-   
-   Unit / @NeedsRunner Test
-   
-   

[beam-site] branch asf-site updated (6d7fd1a -> bcb7061)

2018-08-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 6d7fd1a  Prepare repository for deployment.
 add 92a9016  Remove testing matrix and simplify intro.
 add fc61efe  This closes #516
 new bcb7061  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/contribute/testing/index.html | 273 +-
 src/contribute/testing.md | 233 +
 2 files changed, 11 insertions(+), 495 deletions(-)



[beam-site] 01/02: Remove testing matrix and simplify intro.

2018-08-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 92a9016de82c61537a0ae43c506dd3ac7f65723c
Author: Udi Meiri 
AuthorDate: Mon Jul 30 16:02:30 2018 -0700

Remove testing matrix and simplify intro.
---
 src/contribute/testing.md | 233 ++
 1 file changed, 6 insertions(+), 227 deletions(-)

diff --git a/src/contribute/testing.md b/src/contribute/testing.md
index 5fdd9ac..ef0814b 100644
--- a/src/contribute/testing.md
+++ b/src/contribute/testing.md
@@ -1,6 +1,6 @@
 ---
 layout: section
-title: 'Beam Testing Guide'
+title: 'Beam Testing'
 section_menu: section-menu/contribute.html
 permalink: /contribute/testing/
 ---
@@ -18,232 +18,11 @@ See the License for the specific language governing 
permissions and
 limitations under the License.
 -->
 
-# Beam Testing Documentation
-
-* TOC
-{:toc}
-
-## Overview
-
-Apache Beam is a rapidly-maturing software project with a strong
-commitment to testing. Consequently, it has many testing-related needs. It
-requires precommit tests to ensure code going into the repository meets a
-certain quality bar and it requires ongoing postcommit tests to make sure that
-more subtle changes which escape precommit are nonetheless caught. This 
document
-outlines how to write tests, which tests are appropriate where, and when tests
-are run, with some additional information about the testing systems at the
-bottom.
-
-If you’re writing tests, take a look at the testing matrix first, find what you
-want to test, then look into the “Scenarios” and “Types” sections below for 
more
-details on those testing types.
-
-## Testing Matrix
-
-### Java SDK
-
-
-  
-   Component to Test
-   
-   Test Scenario
-   
-   Tool to Use
-   
-   Link to Example
-   
-   Type
-   
-   Runs In
-   
-  
-   BoundedSource
-   
-   Correctly Reads Input
-   
-   https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java#L128;>SourceTestUtils.readFromSource
-   
-   https://github.com/apache/beam/blob/84a0dd1714028370befa80dea16f720edce05252/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java#L972;>TextIOTest
-   
-   Unit
-   
-   Precommit, Postcommit
-   
-  
-  
-   
-   
-   Correct Initial Splitting
-   
-   https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java#L201;>SourceTestUtils.assertSourcesEqualReferenceSource
-   
-   https://github.com/apache/beam/blob/8b1e64a668489297e11926124c4eee6c8f69a3a7/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIOTest.java#L339;>BigtableTest
-   
-   Unit
-   
-   Precommit, Postcommit
-   
-  
-  
-   
-   
-   Correct Dynamic Splitting
-   
-   https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java#L541;>SourceTestUtils.
 assertSplitAtFractionExhaustive
-   
-   https://github.com/apache/beam/blob/84a0dd1714028370befa80dea16f720edce05252/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java#L1021;>TextIOTest
-   
-   Unit
-   
-   Precommit, Postcommit
-   
-  
-  
-   Transform
-   
-   Correctness
-   
-   @NeedsRunner Test
-   
-   https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java#L1199;>ParDoTest
-   
-   @NeedsRunner
-   
-   
-   
-  
-  
-   Example Pipeline
-   
-   Verify Behavior on Each Runner
-   
-   E2E Test
-   
-   https://github.com/apache/beam/blob/master/examples/java/src/test/java/org/apache/beam/examples/WordCountIT.java#L76;>WordCountIT
-   
-   E2E
-   
-   Postcommit (Except WordCountIT)
-   
-  
-  
-   Source/Sink with external resource
-   
-   External Resource Faked
-   
-   Unit / @NeedsRunner Test
-   
-   https://github.com/apache/beam/blob/84a0dd1714028370befa80dea16f720edce05252/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIOTest.java#L646;>FakeBigtableService
 in BigtableTest
-   
-   Unit / @NeedsRunner
-   
-   Precommit / Postcommit
-   
-  
-  
-   
-   
-   Real Interactions With External Resource
-   
-   E2E Test
-   
-   https://github.com/apache/beam/blob/84a0dd1714028370befa80dea16f720edce05252/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableReadIT.java#L40;>BigtableReadIT
-   
-   E2E
-   
-   Postcommit
-   
-  
-  
-   Runner
-   
-   Correctness
-   
-   E2E Test, @ValidatesRunner
-   
-   https://github.com/apache/beam/blob/master/examples/java/src/test/java/org/apache/beam/examples/WordCountIT.java#L78;>WordCountIT,
 https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java;>ParDoTest
-   
-   E2E, @ValidatesRunner
-   
-   Postcommit
-   
-  
-  

[beam-site] branch mergebot updated (67a2c16 -> fc61efe)

2018-08-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 67a2c16  This closes #511
 add 6d7fd1a  Prepare repository for deployment.
 new 92a9016  Remove testing matrix and simplify intro.
 new fc61efe  This closes #516

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/contribute/become-a-committer/index.html   |   6 +
 content/contribute/committer-guide/index.html  |   6 +
 content/contribute/dependencies/index.html |   6 +
 content/contribute/design-documents/index.html |   6 +
 content/contribute/docker-images/index.html|   6 +
 content/contribute/eclipse/index.html  |   6 +
 content/contribute/index.html  |   6 +
 content/contribute/intellij/index.html |   6 +
 content/contribute/portability/index.html  |   6 +
 .../index.html | 151 +++--
 .../index.html | 122 +--
 .../{eclipse => postcommits-policies}/index.html   | 151 ++---
 .../contribute/ptransform-style-guide/index.html   |   6 +
 content/contribute/release-guide/index.html|   6 +
 content/contribute/runner-guide/index.html |   6 +
 content/contribute/testing/index.html  | 105 ++
 .../contribute/website-contributions/index.html|   6 +
 src/contribute/testing.md  | 233 +
 18 files changed, 395 insertions(+), 445 deletions(-)
 copy content/contribute/{website-contributions => 
postcommits-guides}/index.html (79%)
 copy content/contribute/{website-contributions => 
postcommits-policies-details}/index.html (76%)
 copy content/contribute/{eclipse => postcommits-policies}/index.html (76%)



[beam-site] 02/02: This closes #516

2018-08-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit fc61efe78c5b61d7864e9e3be2d3e7069942cfe8
Merge: 6d7fd1a 92a9016
Author: Mergebot 
AuthorDate: Wed Aug 1 21:04:47 2018 +

This closes #516

 src/contribute/testing.md | 233 ++
 1 file changed, 6 insertions(+), 227 deletions(-)



[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=129945=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129945
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 01/Aug/18 20:59
Start Date: 01/Aug/18 20:59
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r207028792
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -246,10 +273,23 @@ def __init__(self, value, timestamp):
 self.value = value
 self.timestamp = Timestamp.of(timestamp)
 
-  def __cmp__(self, other):
-if type(self) is not type(other):
-  return cmp(type(self), type(other))
-return cmp((self.value, self.timestamp), (other.value, other.timestamp))
+  def __eq__(self, other):
+return (type(self) == type(other)
+and self.value == other.value
+and self.timestamp == other.timestamp)
+
+  def __hash__(self):
+return hash((type(self), self.value, self.timestamp))
+
+  def __ne__(self, other):
+return not self == other
+
+  def __lt__(self, other):
 
 Review comment:
   Since types are not comparable in Python 3, how about we change the 
implementation to:
   ```
def __lt__(self, other):
  if type(self) != type(other):
    return type(self).__name__ < type(other).__name__
  if self.value != other.value:
    return self.value < other.value
  return self.timestamp < other.timestamp
   ```
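Put together, the `__cmp__`-to-rich-comparisons migration discussed in this review can be sketched as a standalone class. This is a minimal illustration, not the actual `window.py` code; `functools.total_ordering` is used here only to derive the remaining operators from `__eq__` and `__lt__`:

```python
import functools


@functools.total_ordering  # derives __le__, __gt__, __ge__ from __eq__/__lt__
class TimestampedValue(object):
  """Sketch of the rich-comparison pattern suggested above."""

  def __init__(self, value, timestamp):
    self.value = value
    self.timestamp = timestamp

  def __eq__(self, other):
    return (type(self) == type(other)
            and self.value == other.value
            and self.timestamp == other.timestamp)

  def __hash__(self):
    # Objects that compare equal must hash equal.
    return hash((type(self), self.value, self.timestamp))

  def __lt__(self, other):
    # Python 3 refuses to order mismatched types, so fall back to type names.
    if type(self) != type(other):
      return type(self).__name__ < type(other).__name__
    if self.value != other.value:
      return self.value < other.value
    return self.timestamp < other.timestamp
```

Unlike the Python 2 `__cmp__` version, this works identically on both interpreters.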


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129945)
Time Spent: 14h 10m  (was: 14h)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 14h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4453) Provide automatic schema registration for POJOs

2018-08-01 Thread Reuven Lax (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax resolved BEAM-4453.
--
   Resolution: Fixed
Fix Version/s: 2.6.0

> Provide automatic schema registration for POJOs
> ---
>
> Key: BEAM-4453
> URL: https://issues.apache.org/jira/browse/BEAM-4453
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
> Fix For: 2.6.0
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=129944=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129944
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 01/Aug/18 20:59
Start Date: 01/Aug/18 20:59
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r206991261
 
 

 ##
 File path: sdks/python/apache_beam/transforms/cy_combiners.py
 ##
 @@ -162,7 +167,7 @@ def extract_output(self):
   self.sum %= 2**64
   if self.sum >= INT64_MAX:
 self.sum -= 2**64
-return self.sum / self.count if self.count else _NAN
+return self.sum // self.count if self.count else _NAN
 
 Review comment:
   Please also make the change in line 266.
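The `/` to `//` change requested above is the standard futurize fix for integer division: under Python 3 (or `from __future__ import division`), `/` always returns a float, so code relying on Python 2's truncating integer division must switch to `//`. A minimal sketch, with a hypothetical `mean_int` helper rather than the actual `cy_combiners` accumulator:

```python
from __future__ import division  # makes / behave as true division on Python 2 too


def mean_int(total, count):
  # Python 2's int / int truncated toward negative infinity; futurized code
  # must use // to keep the old integer result, since / now yields a float.
  return total // count if count else float('nan')


assert 7 / 2 == 3.5   # true division under Python 3 / the future import
assert 7 // 2 == 3    # floor division matches the old Python 2 behavior
```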




Issue Time Tracking
---

Worklog Id: (was: 129944)
Time Spent: 14h  (was: 13h 50m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 14h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-4452) Create a lazy row on top of a generic Getter interface

2018-08-01 Thread Reuven Lax (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax resolved BEAM-4452.
--
   Resolution: Fixed
Fix Version/s: 2.7.0

> Create a lazy row on top of a generic Getter interface
> --
>
> Key: BEAM-4452
> URL: https://issues.apache.org/jira/browse/BEAM-4452
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
> Fix For: 2.7.0
>
>
> This allows us to have a Row object that uses the underlying user object 
> (POJO, proto, Avro, etc.) as its intermediate storage. This will save a lot 
> of expensive conversions back and forth to Row on each ParDo.





Jenkins build is back to normal : beam_PostCommit_Java_GradleBuild #1146

2018-08-01 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #5636

2018-08-01 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Fixes a Javadoc error

--
[...truncated 1.25 MB...]
test_repr (apache_beam.typehints.typehints_test.GeneratorHintTestCase) ... ok
test_compatibility (apache_beam.typehints.typehints_test.IterableHintTestCase) 
... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_tuple_compatibility 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_must_be_iterable 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_enforce_kv_type_constraint 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_be_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_have_length_2 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_proxy_to_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_enforce_list_type_constraint_invalid_composite_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_invalid_simple_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_valid_composite_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_valid_simple_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_list_constraint_compatibility 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_list_repr (apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_getitem_proxy_to_union 
(apache_beam.typehints.typehints_test.OptionalHintTestCase) ... ok
test_getitem_sequence_not_allowed 
(apache_beam.typehints.typehints_test.OptionalHintTestCase) ... ok
test_any_return_type_hint 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_must_be_primitive_type_or_type_constraint 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_must_be_single_return_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_no_kwargs_accepted 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_composite_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_simple_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_violation 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_compatibility (apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_invalid_elem_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_must_be_set 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_valid_elem_composite_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_valid_elem_simple_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_any_argument_type_hint 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_basic_type_assertion 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_composite_type_assertion 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_invalid_only_positional_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_must_be_primitive_type_or_constraint 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_mix_positional_and_keyword_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_only_positional_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_simple_type_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_functions_as_regular_generator 
(apache_beam.typehints.typehints_test.TestGeneratorWrapper) ... ok
test_compatibility (apache_beam.typehints.typehints_test.TupleHintTestCase) ... 
ok
test_compatibility_arbitrary_length 

[jira] [Work logged] (BEAM-4778) Less wasteful ArtifactStagingService

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4778?focusedWorklogId=129936=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129936
 ]

ASF GitHub Bot logged work on BEAM-4778:


Author: ASF GitHub Bot
Created on: 01/Aug/18 20:44
Start Date: 01/Aug/18 20:44
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5958: [BEAM-4778] add 
option to flink job server to clean staged artifacts per-job
URL: https://github.com/apache/beam/pull/5958#issuecomment-409716055
 
 
   @ryan-williams Is the PR good for merge?
   cc: @tweise 




Issue Time Tracking
---

Worklog Id: (was: 129936)
Time Spent: 4h 50m  (was: 4h 40m)

> Less wasteful ArtifactStagingService
> 
>
> Key: BEAM-4778
> URL: https://issues.apache.org/jira/browse/BEAM-4778
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java]
>  is the main implementation of ArtifactStagingService.
> It stages artifacts into a directory; and in practice the passed staging 
> session token is such that the directory is different for every job. This 
> leads to 2 issues:
>  * It doesn't get cleaned up when the job finishes or even when the 
> JobService shuts down, so we have disk space leaks if running a lot of jobs 
> (e.g. a suite of ValidatesRunner tests)
>  * We repeatedly re-stage the same artifacts. Instead, ideally, we should 
> identify that some artifacts don't need to be staged - based on knowing their 
> md5. The artifact staging protocol has rudimentary support for this but may 
> need to be modified.
> CC: [~angoenka]
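The md5-based skip the issue proposes could look roughly like a content-addressed staging directory: artifacts are keyed by digest, so a file already staged by an earlier job is reused instead of re-copied. This is purely illustrative; `stage_artifact` and the directory layout are assumptions, not Beam's artifact staging protocol:

```python
import hashlib
import os
import shutil


def stage_artifact(path, staging_dir):
  """Hypothetical content-addressed staging sketch (not Beam's API)."""
  # Hash the artifact so identical files map to the same staged name.
  with open(path, 'rb') as f:
    digest = hashlib.md5(f.read()).hexdigest()
  target = os.path.join(staging_dir, digest)
  if not os.path.exists(target):
    # Only stage artifacts we have not seen before; repeated jobs with
    # identical dependencies hit this cache instead of re-uploading.
    shutil.copyfile(path, target)
  return target
```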





[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129935=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129935
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 01/Aug/18 20:42
Start Date: 01/Aug/18 20:42
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6073: [BEAM-4176] Validate 
Runner Tests generalization and enable for local reference runner
URL: https://github.com/apache/beam/pull/6073#issuecomment-409715426
 
 
   cc: @tweise 




Issue Time Tracking
---

Worklog Id: (was: 129935)
Time Spent: 9h  (was: 8h 50m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.





[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129934=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129934
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 01/Aug/18 20:41
Start Date: 01/Aug/18 20:41
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6073: [BEAM-4176] Validate 
Runner Tests generalization and enable for local reference runner
URL: https://github.com/apache/beam/pull/6073#issuecomment-409715187
 
 
   Ping for review :)




Issue Time Tracking
---

Worklog Id: (was: 129934)
Time Spent: 8h 50m  (was: 8h 40m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.





[jira] [Work logged] (BEAM-5041) Java Fn SDK Harness skips unprocessed pCollections

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5041?focusedWorklogId=129933=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129933
 ]

ASF GitHub Bot logged work on BEAM-5041:


Author: ASF GitHub Bot
Created on: 01/Aug/18 20:36
Start Date: 01/Aug/18 20:36
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6093: [BEAM-5041] Java Fn 
SDK Harness use pTransform to track processed graph
URL: https://github.com/apache/beam/pull/6093#issuecomment-409713790
 
 
   R: @bsidhom @ryan-williams @tweise 




Issue Time Tracking
---

Worklog Id: (was: 129933)
Time Spent: 0.5h  (was: 20m)

> Java Fn SDK Harness skips unprocessed pCollections
> --
>
> Key: BEAM-5041
> URL: https://issues.apache.org/jira/browse/BEAM-5041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Java Sdk Harness used pCollections to keep track of computed consumers 
> [here|https://github.com/apache/beam/blob/ff95a82e461bd8319d9733be60e75992ba90cd7c/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java#L158].
>  This is incorrect as consumers are based on pTransforms so pTransforms 
> should be used to keep track of computed consumers.
> In case of Flatten, this creates an issue where pTransforms having same input 
> as that to flatten are not executed. This causes 
> [https://github.com/apache/beam/blob/ff95a82e461bd8319d9733be60e75992ba90cd7c/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/FlattenTest.java#L316]
>  to fail.
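The bug described can be reproduced with a toy scheduler that dedupes on input pCollections instead of pTransforms. When two transforms consume the same input, as with a transform sharing Flatten's input, keying the "already processed" set by pCollection drops the second consumer (illustrative names only; this is not the harness code):

```python
# Two transforms consuming the same pCollection, as in the Flatten case above.
transforms = [("flatten", "pc1"), ("pardo", "pc1")]


def run_tracking_pcollections(transforms):
  """Buggy variant: marks the *input pCollection* processed, so any later
  consumer of the same input is silently skipped."""
  processed, ran = set(), []
  for name, input_pc in transforms:
    if input_pc in processed:
      continue  # bug: second consumer of pc1 never runs
    processed.add(input_pc)
    ran.append(name)
  return ran


def run_tracking_ptransforms(transforms):
  """Fixed variant: dedupes on the pTransform itself, so every consumer of a
  shared input still executes."""
  processed, ran = set(), []
  for name, input_pc in transforms:
    if name in processed:
      continue
    processed.add(name)
    ran.append(name)
  return ran
```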





Build failed in Jenkins: beam_PostCommit_Py_VR_Dataflow #687

2018-08-01 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Fixes a Javadoc error

--
[...truncated 268.66 KB...]
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
Collecting pyhamcrest (from -r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
  

Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1145

2018-08-01 Thread Apache Jenkins Server
See 


Changes:

[yifanzou] [BEAM-3906] Automate Validation Aganist Python Wheel

--
[...truncated 20.56 MB...]
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.508Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Batch mutations together into SpannerIO.Write/Write 
mutations to Cloud Spanner/Group by partition/GroupByWindow
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.558Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike) into SpannerIO.Write/Write mutations to 
Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.594Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow)
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.630Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Extract
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.676Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample keys/GroupByKey/Reify
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.724Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Extract into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.762Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Reify into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey+SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Partial
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.795Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey+SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Partial into 
SpannerIO.Write/Write mutations to Cloud Spanner/Extract keys
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.832Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.869Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForSize/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.900Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/View.AsList/ParDo(ToIsmRecordForGlobalWindow) into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T20:07:55.935Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForKeys/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 01, 2018 8:07:58 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 

Jenkins build is back to normal : beam_PostCommit_Go_GradleBuild #566

2018-08-01 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-5057) beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5057?focusedWorklogId=129925&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129925
 ]

ASF GitHub Bot logged work on BEAM-5057:


Author: ASF GitHub Bot
Created on: 01/Aug/18 20:05
Start Date: 01/Aug/18 20:05
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #6120: [BEAM-5057] 
Fixes a Javadoc error
URL: https://github.com/apache/beam/pull/6120
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
index 6afca8fc401..bd2eec1371b 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
@@ -296,7 +296,7 @@ public String getName() {
   /**
* Sets a schema on this PCollection.
*
-   * Can only be called on a {@link PCollection}.
+   * Can only be called on a {@link PCollection}.
*/
   @Experimental(Kind.SCHEMAS)
   public PCollection setRowSchema(Schema schema) {


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129925)
Time Spent: 1h 20m  (was: 1h 10m)

> beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error
> --
>
> Key: BEAM-5057
> URL: https://issues.apache.org/jira/browse/BEAM-5057
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/127/console]
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/125/console]
>  
> * What went wrong:
> Execution failed for task ':beam-sdks-java-core:javadoc'.
> > Javadoc generation failed. Generated Javadoc options file (useful for 
> > troubleshooting): 
> > '/home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/sdks/java/core/build/tmp/javadoc/javadoc.options'
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5057) beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5057?focusedWorklogId=129924&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129924
 ]

ASF GitHub Bot logged work on BEAM-5057:


Author: ASF GitHub Bot
Created on: 01/Aug/18 20:04
Start Date: 01/Aug/18 20:04
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #6120: [BEAM-5057] Fixes 
a Javadoc error
URL: https://github.com/apache/beam/pull/6120#issuecomment-409703980
 
 
   Test failures are unrelated and "./gradlew :beam-sdks-java-core:javadoc" 
passes locally.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129924)
Time Spent: 1h 10m  (was: 1h)

> beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error
> --
>
> Key: BEAM-5057
> URL: https://issues.apache.org/jira/browse/BEAM-5057
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/127/console]
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/125/console]
>  
> * What went wrong:
> Execution failed for task ':beam-sdks-java-core:javadoc'.
> > Javadoc generation failed. Generated Javadoc options file (useful for 
> > troubleshooting): 
> > '/home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/sdks/java/core/build/tmp/javadoc/javadoc.options'
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (de38a09 -> 25a0070)

2018-08-01 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from de38a09  Merge pull request #4943 from 
yifanzou/BEAM-3906/Automate_Validation_Against_Python_Wheel
 add 1d1a4f9  Fixes a Javadoc error
 new 25a0070  Merge pull request #6120: [BEAM-5057] Fixes a Javadoc error

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[beam] 01/01: Merge pull request #6120: [BEAM-5057] Fixes a Javadoc error

2018-08-01 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 25a0070e48ab74e8d8e78f5f480d4824cffe270e
Merge: de38a09 1d1a4f9
Author: Chamikara Jayalath 
AuthorDate: Wed Aug 1 13:04:58 2018 -0700

Merge pull request #6120: [BEAM-5057] Fixes a Javadoc error

 .../java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=129921&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129921
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 01/Aug/18 19:47
Start Date: 01/Aug/18 19:47
Worklog Time Spent: 10m 
  Work Description: Fematich commented on issue #5729: [BEAM-4006] Futurize 
transforms subpackage
URL: https://github.com/apache/beam/pull/5729#issuecomment-409698800
 
 
   Run Python Dataflow ValidatesRunner


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129921)
Time Spent: 13h 50m  (was: 13h 40m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3906) Get Python Wheel Validation Automated

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3906?focusedWorklogId=129919&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129919
 ]

ASF GitHub Bot logged work on BEAM-3906:


Author: ASF GitHub Bot
Created on: 01/Aug/18 19:38
Start Date: 01/Aug/18 19:38
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #4943: 
[BEAM-3906] Automate Validation Aganist Python Wheel
URL: https://github.com/apache/beam/pull/4943#discussion_r207006676
 
 

 ##
 File path: 
release/src/main/python-release/run_release_candidate_python_mobile_gaming.sh
 ##
 @@ -0,0 +1,187 @@
+#!/bin/bash
+#
+#Licensed to the Apache Software Foundation (ASF) under one or more
+#contributor license agreements.  See the NOTICE file distributed with
+#this work for additional information regarding copyright ownership.
+#The ASF licenses this file to You under the Apache License, Version 2.0
+#(the "License"); you may not use this file except in compliance with
+#the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+#
+
+#  This file will verify Apache/Beam release candidate python by following 
steps:
+#
+#  1. Create a new virtualenv and install the SDK
+#  2. Run UserScore examples with DirectRunner
+#  3. Run UserScore examples with DataflowRunner
+#  4. Run HourlyTeamScore on DirectRunner
+#  5. Run HourlyTeamScore on DataflowRunner
+#
+
+set -e
+set -v
+
+source release/src/main/python-release/python_release_automation_utils.sh
+
+# Assign default values
+BEAM_PYTHON_SDK=$BEAM_PYTHON_SDK_ZIP
+
+
+###
+# Remove temp directory when complete.
+# Globals:
+#   TMPDIR
+# Arguments:
+#   None
+###
+function complete() {
+  print_separator "Validation $1"
+  rm -rf $TMPDIR
+}
+
+
+###
+# Download files from RC staging location, install python sdk
+# Globals:
+#   BEAM_PYTHON_SDK
+# Arguments:
+#   None
+###
+function install_sdk() {
+  print_separator "Creating new virtualenv and installing the SDK"
+  virtualenv temp_virtualenv
+  . temp_virtualenv/bin/activate
+  gcloud_version=$(gcloud --version | head -1 | awk '{print $4}')
+  if [[ "$gcloud_version" < "189" ]]; then
+update_gcloud
+  fi
+  pip install google-compute-engine
+  pip install $BEAM_PYTHON_SDK[gcp]
+}
+
+
+###
+# Run UserScore with DirectRunner
+# Globals:
+#   USERSCORE_OUTPUT_PREFIX, DATASET, BUCKET_NAME
+# Arguments:
+#   None
+###
+function verify_userscore_direct() {
+  print_separator "Running userscore example with DirectRunner"
+  output_file_name="$USERSCORE_OUTPUT_PREFIX-direct-runner.txt"
+  python -m apache_beam.examples.complete.game.user_score \
+--output=$output_file_name \
+--project=$PROJECT_ID \
+--dataset=$DATASET \
+--input=gs://$BUCKET_NAME/5000_gaming_data.csv
+
+  verify_user_score "direct"
 
 Review comment:
   Thanks. I'll pay attention to it.
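One more thing worth flagging in the quoted `install_sdk` function: the bash test `[[ "$gcloud_version" < "189" ]]` compares strings lexicographically, so a version such as "90.0.1" sorts after "189" and would silently skip the update. A minimal sketch of a numeric, component-wise comparison (the helper name and the default minimum are illustrative, not part of the release scripts):

```python
def needs_gcloud_update(current, minimum="189.0.0"):
    """Compare dotted version strings numerically, component by component."""
    def as_tuple(version):
        # Keep only purely numeric components so suffixes don't crash int().
        return tuple(int(part) for part in version.split(".") if part.isdigit())
    return as_tuple(current) < as_tuple(minimum)

# Lexicographically, "90.0.1" > "189", so the bash check would skip the
# update; the numeric comparison flags it as below the minimum.
```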


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129919)
Time Spent: 35h 50m  (was: 35h 40m)

> Get Python Wheel Validation Automated
> -
>
> Key: BEAM-3906
> URL: https://issues.apache.org/jira/browse/BEAM-3906
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-python, testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 35h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Go_GradleBuild #565

2018-08-01 Thread Apache Jenkins Server
   }
  ]
}
  ],
  "is_stream_like": true
}
  ],
  "is_pair_like": true
},
{
  "@type": "kind:global_window"
}
  ],
  "is_wrapper": true
}
  }
],
"parallel_input": {
  "@type": "OutputReference",
  "step_name": "e8",
  "output_name": "i0"
},
"serialized_fn": 
"%0A%29%22%27%0A%02c1%12%21%0A%1F%0A%1D%0A%1Bbeam:coder:global_window:v1j9%0A%25%0A%23%0A%21beam:windowfn:global_windows:v0.1%10%01%1A%02c1%22%02:%00%28%010%018%02H%01"
  }
},
{
  "kind": "ParallelDo",
  "name": "e10",
  "properties": {
"user_name": "passert.Sum(a)/passert.sumFn",
"output_info": [
  {
"user_name": "bogus",
"output_name": "bogus",
"encoding": {
  "@type": "kind:windowed_value",
  "component_encodings": [
{
  "@type": "kind:bytes"
},
{
  "@type": "kind:global_window"
}
  ],
  "is_wrapper": true
}
  }
],
"parallel_input": {
  "@type": "OutputReference",
  "step_name": "e9",
  "output_name": "i0"
},
"serialized_fn": "e10"
  }
}
  ],
  "type": "JOB_TYPE_BATCH"
}
2018/08/01 19:26:07 Submitted job: 2018-08-01_12_26_06-10794725367443553748
2018/08/01 19:26:07 Console: 
https://console.cloud.google.com/dataflow/job/2018-08-01_12_26_06-10794725367443553748?project=apache-beam-testing
2018/08/01 19:26:07 Logs: 
https://console.cloud.google.com/logs/viewer?project=apache-beam-testing&resource=dataflow_step%2Fjob_id%2F2018-08-01_12_26_06-10794725367443553748
2018/08/01 19:26:07 Submitted job: 2018-08-01_12_26_06-7766298945856587835
2018/08/01 19:26:07 Console: 
https://console.cloud.google.com/dataflow/job/2018-08-01_12_26_06-7766298945856587835?project=apache-beam-testing
2018/08/01 19:26:07 Logs: 
https://console.cloud.google.com/logs/viewer?project=apache-beam-testing&resource=dataflow_step%2Fjob_id%2F2018-08-01_12_26_06-7766298945856587835
2018/08/01 19:26:07 Job state: JOB_STATE_PENDING ...
2018/08/01 19:26:07 Job state: JOB_STATE_PENDING ...
2018/08/01 19:26:07 Submitted job: 2018-08-01_12_26_06-16575379234168570234
2018/08/01 19:26:07 Console: 
https://console.cloud.google.com/dataflow/job/2018-08-01_12_26_06-16575379234168570234?project=apache-beam-testing
2018/08/01 19:26:07 Logs: 
https://console.cloud.google.com/logs/viewer?project=apache-beam-testing&resource=dataflow_step%2Fjob_id%2F2018-08-01_12_26_06-16575379234168570234
2018/08/01 19:26:08 Submitted job: 2018-08-01_12_26_06-8065761360251173168
2018/08/01 19:26:08 Console: 
https://console.cloud.google.com/dataflow/job/2018-08-01_12_26_06-8065761360251173168?project=apache-beam-testing
2018/08/01 19:26:08 Logs: 
https://console.cloud.google.com/logs/viewer?project=apache-beam-testing&resource=dataflow_step%2Fjob_id%2F2018-08-01_12_26_06-8065761360251173168
2018/08/01 19:26:08 Job state: JOB_STATE_PENDING ...
2018/08/01 19:26:08 Job state: JOB_STATE_PENDING ...
2018/08/01 19:26:08 Submitted job: 2018-08-01_12_26_07-6351961819148544955
2018/08/01 19:26:08 Console: 
https://console.cloud.google.com/dataflow/job/2018-08-01_12_26_07-6351961819148544955?project=apache-beam-testing
2018/08/01 19:26:08 Logs: 
https://console.cloud.google.com/logs/viewer?project=apache-beam-testing&resource=dataflow_step%2Fjob_id%2F2018-08-01_12_26_07-6351961819148544955
2018/08/01 19:26:08 Job state: JOB_STATE_PENDING ...
2018/08/01 19:26:37 Job still running ...
2018/08/01 19:26:37 Job still running ...
2018/08/01 19:26:38 Job still running ...
2018/08/01 19:26:38 Job still running ...
2018/08/01 19:26:38 Job still running ...
2018/08/01 19:27:07 Job still running ...
2018/08/01 19:27:07 Job still running ...
2018/08/01 19:27:08 Job still running ...
2018/08/01 19:27:08 Job still running ...
2018/08/01 19:27:08 Job still running ...
2018/08/01 19:27:37 Job still running ...
2018/08/01 19:27:37 Job still running ...
2018/08/01 19:27:38 Job still running ...
2018/08/01 19:27:38 Job still running ...
2018/08/01 19:27:39 Job still running ...
2018/08/01 19:28:07 Job still running ...
2018/08/01 19:28:07 Job still running ...
2018/08/01 19:28:08 J

[jira] [Work logged] (BEAM-3906) Get Python Wheel Validation Automated

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3906?focusedWorklogId=129915&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129915
 ]

ASF GitHub Bot logged work on BEAM-3906:


Author: ASF GitHub Bot
Created on: 01/Aug/18 19:22
Start Date: 01/Aug/18 19:22
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #4943: [BEAM-3906] Automate 
Validation Aganist Python Wheel
URL: https://github.com/apache/beam/pull/4943
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/jenkins/job_ReleaseCandidate_Python.groovy 
b/.test-infra/jenkins/job_ReleaseCandidate_Python.groovy
index 0102e0b7fb1..4df59b146e6 100644
--- a/.test-infra/jenkins/job_ReleaseCandidate_Python.groovy
+++ b/.test-infra/jenkins/job_ReleaseCandidate_Python.groovy
@@ -21,9 +21,6 @@ import CommonJobProperties as commonJobProperties
 job('beam_PostRelease_Python_Candidate') {
 description('Runs verification of the Python release candidate.')
 
-// Execute concurrent builds if necessary.
-concurrentBuild()
-
 // Set common parameters.
 commonJobProperties.setTopLevelMainJobProperties(delegate)
 
@@ -35,8 +32,7 @@ job('beam_PostRelease_Python_Candidate') {
 
 // Execute shell command to test Python SDK.
 steps {
-shell('cd ' + commonJobProperties.checkoutDir +
-' && bash 
release/src/main/groovy/run_release_candidate_python_quickstart.sh' +
-' && bash 
release/src/main/groovy/run_release_candidate_python_mobile_gaming.sh')
+  shell('cd ' + commonJobProperties.checkoutDir +
+' && bash 
release/src/main/python-release/python_release_automation.sh')
 }
 }
diff --git a/release/src/main/groovy/python_release_automation_utils.sh 
b/release/src/main/groovy/python_release_automation_utils.sh
deleted file mode 100644
index 554f3aa5693..000
--- a/release/src/main/groovy/python_release_automation_utils.sh
+++ /dev/null
@@ -1,135 +0,0 @@
-#!/bin/bash
-#
-#Licensed to the Apache Software Foundation (ASF) under one or more
-#contributor license agreements.  See the NOTICE file distributed with
-#this work for additional information regarding copyright ownership.
-#The ASF licenses this file to You under the Apache License, Version 2.0
-#(the "License"); you may not use this file except in compliance with
-#the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-#Unless required by applicable law or agreed to in writing, software
-#distributed under the License is distributed on an "AS IS" BASIS,
-#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-#See the License for the specific language governing permissions and
-#limitations under the License.
-#
-
-
-set -e
-set -v
-
-###
-# Print Separators.
-# Arguments:
-#   Info to be printed.
-# Outputs:
-#   Writes info to stdout.
-###
-function print_separator() {
-echo 
""
-echo $1
-echo 
""
-}
-
-###
-# Update gcloud version.
-# Arguments:
-#   None
-###
-function update_gcloud() {
-curl 
https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-189.0.0-linux-x86_64.tar.gz
 \
---output gcloud.tar.gz
-tar xf gcloud.tar.gz
-./google-cloud-sdk/install.sh --quiet
-. ./google-cloud-sdk/path.bash.inc
-gcloud components update --quiet || echo 'gcloud components update failed'
-gcloud -v
-}
-
-###
-# Get Python SDK version from sdk/python/apache_beam/version.py.
-# Arguments:
-#   None
-# Outputs:
-#   Writes version to stdout.
-###
-function get_version() {
-version=$(awk '/__version__/{print $3}' sdks/python/apache_beam/version.py)
-if [[ $version = *".dev"* ]]; then
-echo $version | cut -c 2- | rev | cut -d'.' -f2- | rev
-else
-echo $version
-fi
-}
-
-###
-# Publish data to Pubsub topic for streaming wordcount examples.
-# Arguments:
-#   None
-###
-function run_pubsub_publish(){
-words=("hello world!", "I like cats!", "Python", "hello Python", "hello 
Python")
-for word in ${words[@]}; do
-gcloud pubsub topics publish $PUBSUB_TOPIC1 --message "$word"
-done
-sleep 10
-}
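The `get_version` helper quoted above extracts `__version__` with awk and strips a trailing `.dev` component via the `rev | cut -d'.' -f2- | rev` trick. The same parsing can be sketched in Python (illustrative only; the function name is hypothetical and this is not part of the Beam scripts):

```python
import re

def release_base_version(version_py_text):
    """Extract __version__ from version.py text and drop a .dev suffix."""
    match = re.search(r"__version__\s*=\s*['\"]([^'\"]+)['\"]", version_py_text)
    version = match.group(1)
    if ".dev" in version:
        # '2.6.0.dev' -> '2.6.0', mirroring the shell's last-component cut
        version = version.rsplit(".", 1)[0]
    return version
```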
-

[beam] 01/01: Merge pull request #4943 from yifanzou/BEAM-3906/Automate_Validation_Against_Python_Wheel

2018-08-01 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit de38a09929e3876633c7cfbb15104ccf751bc588
Merge: 1086391 9e52923
Author: Ahmet Altay 
AuthorDate: Wed Aug 1 12:22:30 2018 -0700

Merge pull request #4943 from 
yifanzou/BEAM-3906/Automate_Validation_Against_Python_Wheel

[BEAM-3906] Automate Validation Aganist Python Wheel

 .../jenkins/job_ReleaseCandidate_Python.groovy |   8 +-
 .../main/groovy/python_release_automation_utils.sh | 135 --
 .../run_release_candidate_python_mobile_gaming.sh  | 188 -
 .../run_release_candidate_python_quickstart.sh | 231 
 .../python-release/python_release_automation.sh|  25 ++
 .../python_release_automation_utils.sh | 297 +
 .../run_release_candidate_python_mobile_gaming.sh  | 187 +
 .../run_release_candidate_python_quickstart.sh | 259 ++
 8 files changed, 770 insertions(+), 560 deletions(-)



[beam] branch master updated (1086391 -> de38a09)

2018-08-01 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 1086391  Merge pull request #6086: [BEAM-5018] Upgrade org.tukaani:xz 
to 1.8
 add 9e52923  [BEAM-3906] Automate Validation Aganist Python Wheel
 new de38a09  Merge pull request #4943 from 
yifanzou/BEAM-3906/Automate_Validation_Against_Python_Wheel

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../jenkins/job_ReleaseCandidate_Python.groovy |   8 +-
 .../main/groovy/python_release_automation_utils.sh | 135 --
 .../run_release_candidate_python_mobile_gaming.sh  | 188 -
 .../run_release_candidate_python_quickstart.sh | 231 
 .../python-release/python_release_automation.sh|  13 +-
 .../python_release_automation_utils.sh | 297 +
 .../run_release_candidate_python_mobile_gaming.sh  | 187 +
 .../run_release_candidate_python_quickstart.sh | 259 ++
 8 files changed, 752 insertions(+), 566 deletions(-)
 delete mode 100644 release/src/main/groovy/python_release_automation_utils.sh
 delete mode 100755 
release/src/main/groovy/run_release_candidate_python_mobile_gaming.sh
 delete mode 100755 
release/src/main/groovy/run_release_candidate_python_quickstart.sh
 copy .test-infra/kubernetes/cassandra/LargeITCluster/teardown.sh => 
release/src/main/python-release/python_release_automation.sh (69%)
 create mode 100644 
release/src/main/python-release/python_release_automation_utils.sh
 create mode 100755 
release/src/main/python-release/run_release_candidate_python_mobile_gaming.sh
 create mode 100755 
release/src/main/python-release/run_release_candidate_python_quickstart.sh



[jira] [Work logged] (BEAM-3906) Get Python Wheel Validation Automated

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3906?focusedWorklogId=129914&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129914
 ]

ASF GitHub Bot logged work on BEAM-3906:


Author: ASF GitHub Bot
Created on: 01/Aug/18 19:21
Start Date: 01/Aug/18 19:21
Worklog Time Spent: 10m 
  Work Description: aaltay commented on a change in pull request #4943: 
[BEAM-3906] Automate Validation Aganist Python Wheel
URL: https://github.com/apache/beam/pull/4943#discussion_r207002128
 
 

 ##
 File path: 
release/src/main/python-release/run_release_candidate_python_mobile_gaming.sh
 ##
 @@ -0,0 +1,187 @@
+#!/bin/bash
+#
+#Licensed to the Apache Software Foundation (ASF) under one or more
+#contributor license agreements.  See the NOTICE file distributed with
+#this work for additional information regarding copyright ownership.
+#The ASF licenses this file to You under the Apache License, Version 2.0
+#(the "License"); you may not use this file except in compliance with
+#the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+#
+
+#  This file will verify Apache/Beam release candidate python by following 
steps:
+#
+#  1. Create a new virtualenv and install the SDK
+#  2. Run UserScore examples with DirectRunner
+#  3. Run UserScore examples with DataflowRunner
+#  4. Run HourlyTeamScore on DirectRunner
+#  5. Run HourlyTeamScore on DataflowRunner
+#
+
+set -e
+set -v
+
+source release/src/main/python-release/python_release_automation_utils.sh
+
+# Assign default values
+BEAM_PYTHON_SDK=$BEAM_PYTHON_SDK_ZIP
+
+
+###
+# Remove temp directory when complete.
+# Globals:
+#   TMPDIR
+# Arguments:
+#   None
+###
+function complete() {
+  print_separator "Validation $1"
+  rm -rf $TMPDIR
+}
+
+
+###
+# Download files from RC staging location, install python sdk
+# Globals:
+#   BEAM_PYTHON_SDK
+# Arguments:
+#   None
+###
+function install_sdk() {
+  print_separator "Creating new virtualenv and installing the SDK"
+  virtualenv temp_virtualenv
+  . temp_virtualenv/bin/activate
+  gcloud_version=$(gcloud --version | head -1 | awk '{print $4}')
+  if [[ "$gcloud_version" < "189" ]]; then
+update_gcloud
+  fi
+  pip install google-compute-engine
+  pip install $BEAM_PYTHON_SDK[gcp]
+}
+
+
+###
+# Run UserScore with DirectRunner
+# Globals:
+#   USERSCORE_OUTPUT_PREFIX, DATASET, BUCKET_NAME
+# Arguments:
+#   None
+###
+function verify_userscore_direct() {
+  print_separator "Running userscore example with DirectRunner"
+  output_file_name="$USERSCORE_OUTPUT_PREFIX-direct-runner.txt"
+  python -m apache_beam.examples.complete.game.user_score \
+--output=$output_file_name \
+--project=$PROJECT_ID \
+--dataset=$DATASET \
+--input=gs://$BUCKET_NAME/5000_gaming_data.csv
+
+  verify_user_score "direct"
 
 Review comment:
   Got it. For future PRs, try to be consistent in naming (e.g. 
verify_userscore_direct vs. verify_user_score: use either userscore or 
user_score consistently).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129914)
Time Spent: 35.5h  (was: 35h 20m)

> Get Python Wheel Validation Automated
> -
>
> Key: BEAM-3906
> URL: https://issues.apache.org/jira/browse/BEAM-3906
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-python, testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 35.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4826) Flink runner sends bad flatten to SDK

2018-08-01 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565838#comment-16565838
 ] 

Ankur Goenka commented on BEAM-4826:


I see.

The split of the flatten is OK. I will see how we can structure the code to 
remove unused PCollections from the inputs of flatten and other transforms.

Unused outputs will still be present, though, as the ParDo will generate them 
in the user code.

 

> Flink runner sends bad flatten to SDK
> -
>
> Key: BEAM-4826
> URL: https://issues.apache.org/jira/browse/BEAM-4826
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Henning Rohde
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: portability
>
> For a Go flatten test with 3 inputs, the Flink runner splits this into 3 
> bundle descriptors, but it sends the original 3-input flatten with only 1 
> actual input present in each bundle descriptor. This is inconsistent, and the 
> SDK shouldn't have to expect dangling PCollections. In contrast, Dataflow 
> removes the flatten when it does the same split.
> Snippet:
> register: <
>   process_bundle_descriptor: <
> id: "3"
> transforms: <
>   key: "e4"
>   value: <
> unique_name: "github.com/apache/beam/sdks/go/pkg/beam.createFn'1"
> spec: <
>   urn: "urn:beam:transform:pardo:v1"
>   payload: [...]
> >
> inputs: <
>   key: "i0"
>   value: "n3"
> >
> outputs: <
>   key: "i0"
>   value: "n4"
> >
>   >
> >
> transforms: <
>   key: "e7"
>   value: <
> unique_name: "Flatten"
> spec: <
>   urn: "beam:transform:flatten:v1"
> >
> inputs: <
>   key: "i0"
>   value: "n2"
> >
> inputs: <
>   key: "i1"
>   value: "n4"  // <--- only one present.
> >
> inputs: <
>   key: "i2"
>   value: "n6"
> >
> outputs: <
>   key: "i0"
>   value: "n7"
> >
>   >
> >
> [...]
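One way to resolve the dangling-input issue discussed above is to prune a transform's declared inputs down to the PCollections actually materialized in the bundle descriptor. A hedged sketch, using plain dicts to stand in for the portability protos (the `prune_inputs` helper and the field layout are illustrative, not the runner's actual code):

```python
def prune_inputs(transform, present_pcollections):
    """Keep only the inputs whose PCollections exist in this bundle."""
    kept = {tag: pc for tag, pc in transform["inputs"].items()
            if pc in present_pcollections}
    # Return a copy with the pruned input map; the original is untouched.
    return dict(transform, inputs=kept)

flatten = {
    "unique_name": "Flatten",
    "inputs": {"i0": "n2", "i1": "n4", "i2": "n6"},
    "outputs": {"i0": "n7"},
}
# In the bundle from the snippet above, only n4 is actually present.
pruned = prune_inputs(flatten, {"n3", "n4"})
print(pruned["inputs"])  # {'i1': 'n4'}
```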



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5057) beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5057?focusedWorklogId=129910=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129910
 ]

ASF GitHub Bot logged work on BEAM-5057:


Author: ASF GitHub Bot
Created on: 01/Aug/18 18:54
Start Date: 01/Aug/18 18:54
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #6120: [BEAM-5057] Fixes 
a Javadoc error
URL: https://github.com/apache/beam/pull/6120#issuecomment-409683754
 
 
   Retest this please




Issue Time Tracking
---

Worklog Id: (was: 129910)
Time Spent: 1h  (was: 50m)

> beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error
> --
>
> Key: BEAM-5057
> URL: https://issues.apache.org/jira/browse/BEAM-5057
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/127/console]
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/125/console]
>  
> * What went wrong:
> Execution failed for task ':beam-sdks-java-core:javadoc'.
> > Javadoc generation failed. Generated Javadoc options file (useful for 
> > troubleshooting): 
> > '/home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/sdks/java/core/build/tmp/javadoc/javadoc.options'
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_GradleBuild #1144

2018-08-01 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1143

2018-08-01 Thread Apache Jenkins Server
See 


--
[...truncated 20.53 MB...]
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.281Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike) into SpannerIO.Write/Write mutations to 
Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.316Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow)
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.357Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Extract
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.387Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample keys/GroupByKey/Reify
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.439Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Extract into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.486Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Reify into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey+SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Partial
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.567Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey+SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Partial into 
SpannerIO.Write/Write mutations to Cloud Spanner/Extract keys
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.608Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.656Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForSize/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.689Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/View.AsList/ParDo(ToIsmRecordForGlobalWindow) into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.729Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForKeys/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-01T18:44:40.762Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map
Aug 01, 2018 6:44:44 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process

Jenkins build is back to normal : beam_PerformanceTests_AvroIOIT #821

2018-08-01 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_Compressed_TextIOIT #808

2018-08-01 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-5056) [SQL] Nullability of aggregation expressions isn't inferred properly

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5056?focusedWorklogId=129905=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129905
 ]

ASF GitHub Bot logged work on BEAM-5056:


Author: ASF GitHub Bot
Created on: 01/Aug/18 18:42
Start Date: 01/Aug/18 18:42
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #6118: 
[BEAM-5056] [SQL] Fix nullability in output schema
URL: https://github.com/apache/beam/pull/6118#discussion_r206989895
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamJoinRel.java
 ##
 @@ -257,25 +257,52 @@ private boolean triggersOncePerWindow(WindowingStrategy windowingStrategy) {
   private PCollection<Row> standardJoin(
       PCollection<KV<Row, Row>> extractedLeftRows,
       PCollection<KV<Row, Row>> extractedRightRows,
-      Row leftNullRow,
-      Row rightNullRow) {
+      Schema leftSchema,
+      Schema rightSchema) {
     PCollection<KV<Row, KV<Row, Row>>> joinedRows = null;
+
     switch (joinType) {
       case LEFT:
-        joinedRows =
-            org.apache.beam.sdk.extensions.joinlibrary.Join.leftOuterJoin(
-                extractedLeftRows, extractedRightRows, rightNullRow);
-        break;
+        {
+          Schema rigthNullSchema = buildNullSchema(rightSchema);
+          Row rightNullRow = Row.nullRow(rigthNullSchema);
+
+          setValueCoder(extractedRightRows, SchemaCoder.of(rigthNullSchema));
+
+          joinedRows =
+              org.apache.beam.sdk.extensions.joinlibrary.Join.leftOuterJoin(
+                  extractedLeftRows, extractedRightRows, rightNullRow);
+
+          break;
+        }
       case RIGHT:
-        joinedRows =
-            org.apache.beam.sdk.extensions.joinlibrary.Join.rightOuterJoin(
-                extractedLeftRows, extractedRightRows, leftNullRow);
-        break;
+        {
+          Schema leftNullSchema = buildNullSchema(leftSchema);
+          Row leftNullRow = Row.nullRow(leftNullSchema);
+
+          setValueCoder(extractedLeftRows, SchemaCoder.of(leftNullSchema));
 
 Review comment:
   Don't worry too much about it though, what you have looks fine to me
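The pattern under review, padding the unmatched side of an outer join with a schema-shaped null row, can be sketched independently of Beam. Everything below (dict-based rows keyed by join key, the `left_outer_join` helper) is illustrative, not the join library's actual API:

```python
def left_outer_join(left, right, right_null):
    """Pair each left row with its right-side match, or with the null row
    when no match exists (the outer-join padding)."""
    return {key: (lv, right.get(key, right_null)) for key, lv in left.items()}

left = {1: ("alice", 3), 2: ("bob", 5)}
right = {1: ("seattle",)}
null_row = (None,)  # one None per field of the right-side schema

joined = left_outer_join(left, right, null_row)
print(joined[2])  # (('bob', 5), (None,)) -- unmatched left row, null-padded
```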




Issue Time Tracking
---

Worklog Id: (was: 129905)
Time Spent: 1h 40m  (was: 1.5h)

> [SQL] Nullability of aggregation expressions isn't inferred properly
> 
>
> Key: BEAM-5056
> URL: https://issues.apache.org/jira/browse/BEAM-5056
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Gleb Kanterov
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Given schema and rows:
> {code:java}
> Schema schema =
>     Schema.builder()
>         .addNullableField("f_int1", Schema.FieldType.INT32)
>         .addNullableField("f_int2", Schema.FieldType.INT32)
>         .build();
> List<Row> rows =
>     TestUtils.RowsBuilder.of(schema)
>         .addRows(null, null)
>         .getRows();
> {code}
> Following query fails:
> {code:sql}
> SELECT AVG(f_int1) FROM PCOLLECTION GROUP BY f_int2
> {code}
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Field EXPR$0 is not 
> nullable{code}
>  
>  
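The failure above comes down to a nullability-inference rule: an aggregate computed over a nullable field must itself be declared nullable in the output schema. A toy sketch of that rule (the tuple-based field model and `infer_agg_nullable` are illustrative, not Beam's schema API):

```python
def infer_agg_nullable(agg_fn, input_nullable):
    """An aggregate over a nullable input must produce a nullable output.
    COUNT is the usual exception, since it never yields NULL."""
    if agg_fn == "COUNT":
        return False
    return input_nullable

# AVG(f_int1) with f_int1 nullable -> the output field EXPR$0 must be
# nullable, otherwise building the result Row fails as in the stack trace.
print(infer_agg_nullable("AVG", True))    # True
print(infer_agg_nullable("COUNT", True))  # False
```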



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PerformanceTests_ParquetIOIT #314

2018-08-01 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_ParquetIOIT_HDFS #208

2018-08-01 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-5056) [SQL] Nullability of aggregation expressions isn't inferred properly

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5056?focusedWorklogId=129903=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129903
 ]

ASF GitHub Bot logged work on BEAM-5056:


Author: ASF GitHub Bot
Created on: 01/Aug/18 18:38
Start Date: 01/Aug/18 18:38
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #6118: 
[BEAM-5056] [SQL] Fix nullability in output schema
URL: https://github.com/apache/beam/pull/6118#discussion_r206988393
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamJoinRel.java
 ##
 @@ -257,25 +257,52 @@ private boolean triggersOncePerWindow(WindowingStrategy windowingStrategy) {
   private PCollection<Row> standardJoin(
       PCollection<KV<Row, Row>> extractedLeftRows,
       PCollection<KV<Row, Row>> extractedRightRows,
-      Row leftNullRow,
-      Row rightNullRow) {
+      Schema leftSchema,
+      Schema rightSchema) {
     PCollection<KV<Row, KV<Row, Row>>> joinedRows = null;
+
     switch (joinType) {
       case LEFT:
-        joinedRows =
-            org.apache.beam.sdk.extensions.joinlibrary.Join.leftOuterJoin(
-                extractedLeftRows, extractedRightRows, rightNullRow);
-        break;
+        {
+          Schema rigthNullSchema = buildNullSchema(rightSchema);
+          Row rightNullRow = Row.nullRow(rigthNullSchema);
+
+          setValueCoder(extractedRightRows, SchemaCoder.of(rigthNullSchema));
+
+          joinedRows =
+              org.apache.beam.sdk.extensions.joinlibrary.Join.leftOuterJoin(
+                  extractedLeftRows, extractedRightRows, rightNullRow);
+
+          break;
+        }
       case RIGHT:
-        joinedRows =
-            org.apache.beam.sdk.extensions.joinlibrary.Join.rightOuterJoin(
-                extractedLeftRows, extractedRightRows, leftNullRow);
-        break;
+        {
+          Schema leftNullSchema = buildNullSchema(leftSchema);
+          Row leftNullRow = Row.nullRow(leftNullSchema);
+
+          setValueCoder(extractedLeftRows, SchemaCoder.of(leftNullSchema));
 
 Review comment:
   I am thinking about something like this:
   
   ```java
   Row buildNullRow(Schema schema, PCollection<KV<Row, Row>> extractedRows) {
     Schema nullSchema = buildNullSchema(schema);
     setValueCoder(extractedRows, SchemaCoder.of(nullSchema));
     return Row.nullRow(nullSchema);
   }
   ```




Issue Time Tracking
---

Worklog Id: (was: 129903)
Time Spent: 1.5h  (was: 1h 20m)

> [SQL] Nullability of aggregation expressions isn't inferred properly
> 
>
> Key: BEAM-5056
> URL: https://issues.apache.org/jira/browse/BEAM-5056
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Gleb Kanterov
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Given schema and rows:
> {code:java}
> Schema schema =
>     Schema.builder()
>         .addNullableField("f_int1", Schema.FieldType.INT32)
>         .addNullableField("f_int2", Schema.FieldType.INT32)
>         .build();
> List<Row> rows =
>     TestUtils.RowsBuilder.of(schema)
>         .addRows(null, null)
>         .getRows();
> {code}
> Following query fails:
> {code:sql}
> SELECT AVG(f_int1) FROM PCOLLECTION GROUP BY f_int2
> {code}
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Field EXPR$0 is not 
> nullable{code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4936) Beam Dependency Update Request: org.codehaus.groovy

2018-08-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4936?focusedWorklogId=129902=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129902
 ]

ASF GitHub Bot logged work on BEAM-4936:


Author: ASF GitHub Bot
Created on: 01/Aug/18 18:37
Start Date: 01/Aug/18 18:37
Worklog Time Spent: 10m 
  Work Description: yifanzou removed a comment on issue #6115: DO NOT 
MERGE, [BEAM-4936] update org.apache.httpcomponents
URL: https://github.com/apache/beam/pull/6115#issuecomment-409646394
 
 
   Run Java PreCommit




Issue Time Tracking
---

Worklog Id: (was: 129902)
Time Spent: 1h  (was: 50m)

> Beam Dependency Update Request: org.codehaus.groovy
> ---
>
> Key: BEAM-4936
> URL: https://issues.apache.org/jira/browse/BEAM-4936
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> 2018-07-25 20:26:44.528984
> Please review and upgrade the org.codehaus.groovy to the latest 
> version None 
>  
> cc: 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

