[GitHub] [beam] allenpradeep commented on pull request #11570: [BEAM-9822] Merge the stages 'Gather and Sort' and 'Create Batches'

2020-05-06 Thread GitBox


allenpradeep commented on pull request #11570:
URL: https://github.com/apache/beam/pull/11570#issuecomment-625003968


   Hi Niel,
   I see a bunch of unit tests failing on this commit. 
   I noticed this while working on a patch on top of it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] darshanj commented on a change in pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-06 Thread GitBox


darshanj commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r421185866



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   Thanks, I understand the reasoning. My thinking was along the lines of a 
Join implementation. As an end-user abstraction, wouldn't it be more natural 
and straightforward to think of "left diff right", which hides the fact that 
there is CoGrouping inside? To my mind this is a natural use case. 
   
   Or would it be a good idea to offer both variants of the API, one taking a 
`PCollectionList` and one taking a `PCollection`?
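
To make the "left diff right" idea concrete, here is a minimal plain-Python sketch (not Beam code) of a set difference computed by co-grouping the two inputs on the element itself. The names `co_group` and `except_distinct` are hypothetical and only illustrate the shape of the abstraction; an actual transform would presumably rely on `CoGroupByKey` over keyed PCollections.

```python
from collections import defaultdict


def co_group(left, right):
    """Group both inputs by element, counting occurrences on each side."""
    groups = defaultdict(lambda: [0, 0])
    for item in left:
        groups[item][0] += 1
    for item in right:
        groups[item][1] += 1
    return groups


def except_distinct(left, right):
    """Return the distinct elements found in `left` but not in `right`."""
    grouped = co_group(left, right)
    return sorted(
        item for item, (in_left, in_right) in grouped.items()
        if in_left > 0 and in_right == 0)


left_input = ['a', 'b', 'b', 'c']
right_input = ['b', 'd']
difference = except_distinct(left_input, right_input)
```

The caller only sees `except_distinct(left, right)`; the grouping step stays an implementation detail, which is the point being made about the end-user API.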
   









[GitHub] [beam] chadrik commented on pull request #11038: [BEAM-7746] More typing fixes

2020-05-06 Thread GitBox


chadrik commented on pull request #11038:
URL: https://github.com/apache/beam/pull/11038#issuecomment-624974342


   Rebased on top of #11620, fixed some more issues, and removed some gating 
from mypy.ini, so mypy checks are all green. 
   
   







[GitHub] [beam] chamikaramj commented on pull request #11557: [BEAM-9845] Stage artifacts over expansion service.

2020-05-06 Thread GitBox


chamikaramj commented on pull request #11557:
URL: https://github.com/apache/beam/pull/11557#issuecomment-624967725


   Created https://issues.apache.org/jira/browse/BEAM-9913







[GitHub] [beam] robertwb commented on pull request #11557: [BEAM-9845] Stage artifacts over expansion service.

2020-05-06 Thread GitBox


robertwb commented on pull request #11557:
URL: https://github.com/apache/beam/pull/11557#issuecomment-624967141


   Things are passing locally (e.g. https://scans.gradle.com/s/m4g4v2flilkm6 ) 
but I would like to make sure Jenkins is happy so I'll hold off a bit more 
hoping we can get this resolved. 







[GitHub] [beam] chamikaramj commented on pull request #11548: [BEAM-9136] Skip pulling licenses by default.

2020-05-06 Thread GitBox


chamikaramj commented on pull request #11548:
URL: https://github.com/apache/beam/pull/11548#issuecomment-624966273


   Seems like this broke cross-language tests.
   
https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_XVR_Flink/
   
https://scans.gradle.com/s/x24iwckafknka/console-log?task=:sdks:java:container:pullLicenses







[GitHub] [beam] chamikaramj commented on pull request #11557: [BEAM-9845] Stage artifacts over expansion service.

2020-05-06 Thread GitBox


chamikaramj commented on pull request #11557:
URL: https://github.com/apache/beam/pull/11557#issuecomment-624966377


   Seems like failure is due to https://github.com/apache/beam/pull/11548







[GitHub] [beam] robertwb commented on pull request #11557: [BEAM-9845] Stage artifacts over expansion service.

2020-05-06 Thread GitBox


robertwb commented on pull request #11557:
URL: https://github.com/apache/beam/pull/11557#issuecomment-624963926


   @Hannah-Jiang Is there a bug tracking these failures?







[GitHub] [beam] rose-rong-liu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton

2020-05-06 Thread GitBox


rose-rong-liu commented on a change in pull request #11075:
URL: https://github.com/apache/beam/pull/11075#discussion_r421170144



##
File path: website/src/documentation/patterns/ai-platform.md
##
@@ -0,0 +1,90 @@
+---
+layout: section
+title: "AI Platform integration patterns"
+section_menu: section-menu/documentation.html
+permalink: /documentation/patterns/ai-platform/
+---
+
+
+# AI Platform integration patterns
+
+This page describes common patterns in pipelines with Google Cloud AI Platform 
transforms.
+
+
+  Adapt for:
+  
+Java SDK
+Python SDK
+  
+
+
+## Getting predictions
+
+This section shows how to use [Google Cloud AI Platform 
Prediction](https://cloud.google.com/ai-platform/prediction/docs/overview) to 
make predictions about new data from a cloud-hosted machine learning model.
+ 
+[tfx_bsl](https://github.com/tensorflow/tfx-bsl) is a library with a Beam 
PTransform called `RunInference`. `RunInference` is able to perform an 
inference that can use an external service endpoint for receiving data. When 
using a service endpoint, the transform takes a PCollection of type 
`tf.train.Example` and, for every batch of elements, sends a request to AI 
Platform Prediction. The size of a batch may vary. For more details on how Beam 
finds the best batch size, refer to a docstring for 
[BatchElements](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html?highlight=batchelements#apache_beam.transforms.util.BatchElements).
+ 
+ The transform produces a PCollection of type `PredictLog`, which contains 
predictions. 
+
+Before getting started, deploy a TensorFlow model to AI Platform Prediction. 
The cloud service manages the infrastructure needed to handle prediction 
requests in both efficient and scalable way. Do note that only TensorFlow 
models are supported by the transform. For more information, see [Exporting a 
SavedModel for 
prediction](https://cloud.google.com/ai-platform/prediction/docs/exporting-savedmodel-for-prediction).
+
+Once a machine learning model is deployed, prepare a list of instances to get 
predictions for. To send binary data, make sure that the name of an input ends 
in `_bytes`. This will base64-encode data before sending a request.
+
+### Example
+Here is an example of a pipeline that reads input instances from the file, 
converts JSON objects to `tf.train.Example` objects and sends data to AI 
Platform Prediction. The content of a file can look like this:
+
+```
+{"input": "the quick brown"}
+{"input": "la bruja le"}
+``` 
+
+The example creates `tf.train.BytesList` instances, thus it expects byte-like 
strings as input. However, other data types, like `tf.train.FloatList` and 
`tf.train.Int64List`, are also supported by the transform.
+
+Here is the code:
+
+{:.language-java}
+```java
+// Getting predictions is not yet available for Java. [BEAM-9501]
+```
+
+{:.language-py}
+```py
+import json
+
+import apache_beam as beam
+
+import tensorflow as tf
+from tfx_bsl.beam.run_inference import RunInference
+from tfx_bsl.proto import model_spec_pb2
+
+def convert_json_to_tf_example(json_obj):
+  dict_ = json.loads(json_obj)
+  for name, text in dict_.items():
+  value = tf.train.Feature(bytes_list=tf.train.BytesList(
+value=[text.encode('utf-8')]))
+  feature = {name: value}
+  return tf.train.Example(features=tf.train.Features(feature=feature))
+
+with beam.Pipeline() as p:
+ _ = (p
+ | beam.io.ReadFromText('gs://my-bucket/samples.json')
+ | beam.Map(convert_json_to_tf_example)
+ | RunInference(
+ model_spec_pb2.InferenceEndpoint(
+ model_endpoint_spec=model_spec_pb2.ModelEndpointSpec(

Review comment:
   `ModelEndpointSpec` will be renamed to `AIPlatformPredictionModelSpec` in 
the next release of tfx_bsl.









[GitHub] [beam] rose-rong-liu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton

2020-05-06 Thread GitBox


rose-rong-liu commented on a change in pull request #11075:
URL: https://github.com/apache/beam/pull/11075#discussion_r421169742



##
File path: website/src/documentation/patterns/ai-platform.md
##
@@ -0,0 +1,90 @@
+---
+layout: section
+title: "AI Platform integration patterns"
+section_menu: section-menu/documentation.html
+permalink: /documentation/patterns/ai-platform/
+---
+
+
+# AI Platform integration patterns
+
+This page describes common patterns in pipelines with Google Cloud AI Platform 
transforms.
+
+
+  Adapt for:
+  
+Java SDK
+Python SDK
+  
+
+
+## Getting predictions
+
+This section shows how to use [Google Cloud AI Platform 
Prediction](https://cloud.google.com/ai-platform/prediction/docs/overview) to 
make predictions about new data from a cloud-hosted machine learning model.
+ 
+[tfx_bsl](https://github.com/tensorflow/tfx-bsl) is a library with a Beam 
PTransform called `RunInference`. `RunInference` is able to perform an 
inference that can use an external service endpoint for receiving data. When 
using a service endpoint, the transform takes a PCollection of type 
`tf.train.Example` and, for every batch of elements, sends a request to AI 
Platform Prediction. The size of a batch may vary. For more details on how Beam 
finds the best batch size, refer to a docstring for 
[BatchElements](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html?highlight=batchelements#apache_beam.transforms.util.BatchElements).
+ 
+ The transform produces a PCollection of type `PredictLog`, which contains 
predictions. 
+
+Before getting started, deploy a TensorFlow model to AI Platform Prediction. 
The cloud service manages the infrastructure needed to handle prediction 
requests in both efficient and scalable way. Do note that only TensorFlow 
models are supported by the transform. For more information, see [Exporting a 
SavedModel for 
prediction](https://cloud.google.com/ai-platform/prediction/docs/exporting-savedmodel-for-prediction).
+
+Once a machine learning model is deployed, prepare a list of instances to get 
predictions for. To send binary data, make sure that the name of an input ends 
in `_bytes`. This will base64-encode data before sending a request.
+
+### Example
+Here is an example of a pipeline that reads input instances from the file, 
converts JSON objects to `tf.train.Example` objects and sends data to AI 
Platform Prediction. The content of a file can look like this:
+
+```
+{"input": "the quick brown"}
+{"input": "la bruja le"}
+``` 
+
+The example creates `tf.train.BytesList` instances, thus it expects byte-like 
strings as input. However, other data types, like `tf.train.FloatList` and 
`tf.train.Int64List`, are also supported by the transform.
+
+Here is the code:
+
+{:.language-java}
+```java
+// Getting predictions is not yet available for Java. [BEAM-9501]
+```
+
+{:.language-py}
+```py
+import json
+
+import apache_beam as beam
+
+import tensorflow as tf
+from tfx_bsl.beam.run_inference import RunInference
+from tfx_bsl.proto import model_spec_pb2
+
+def convert_json_to_tf_example(json_obj):
+  dict_ = json.loads(json_obj)
+  for name, text in dict_.items():
+  value = tf.train.Feature(bytes_list=tf.train.BytesList(
+value=[text.encode('utf-8')]))
+  feature = {name: value}
+  return tf.train.Example(features=tf.train.Features(feature=feature))

Review comment:
   Is there an extra space here?









[GitHub] [beam] rose-rong-liu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton

2020-05-06 Thread GitBox


rose-rong-liu commented on a change in pull request #11075:
URL: https://github.com/apache/beam/pull/11075#discussion_r421169300



##
File path: website/src/documentation/patterns/ai-platform.md
##
@@ -0,0 +1,90 @@
+---
+layout: section
+title: "AI Platform integration patterns"
+section_menu: section-menu/documentation.html
+permalink: /documentation/patterns/ai-platform/
+---
+
+
+# AI Platform integration patterns
+
+This page describes common patterns in pipelines with Google Cloud AI Platform 
transforms.
+
+
+  Adapt for:
+  
+Java SDK
+Python SDK
+  
+
+
+## Getting predictions
+
+This section shows how to use [Google Cloud AI Platform 
Prediction](https://cloud.google.com/ai-platform/prediction/docs/overview) to 
make predictions about new data from a cloud-hosted machine learning model.
+ 
+[tfx_bsl](https://github.com/tensorflow/tfx-bsl) is a library with a Beam 
PTransform called `RunInference`. `RunInference` is able to perform an 
inference that can use an external service endpoint for receiving data. When 
using a service endpoint, the transform takes a PCollection of type 
`tf.train.Example` and, for every batch of elements, sends a request to AI 
Platform Prediction. The size of a batch may vary. For more details on how Beam 
finds the best batch size, refer to a docstring for 
[BatchElements](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html?highlight=batchelements#apache_beam.transforms.util.BatchElements).
+ 
+ The transform produces a PCollection of type `PredictLog`, which contains 
predictions. 

Review comment:
   s/PredictLog/PredictionLog









[GitHub] [beam] chamikaramj commented on pull request #11607: [BEAM-9430] Fixes the bounds of initial watermark set to estimators instead of raising an error

2020-05-06 Thread GitBox


chamikaramj commented on pull request #11607:
URL: https://github.com/apache/beam/pull/11607#issuecomment-624959542


   Run Java PreCommit







[GitHub] [beam] chamikaramj commented on pull request #11607: [BEAM-9430] Fixes the bounds of initial watermark set to estimators instead of raising an error

2020-05-06 Thread GitBox


chamikaramj commented on pull request #11607:
URL: https://github.com/apache/beam/pull/11607#issuecomment-624959475


   Replaced the exception with a bound adjustment. PTAL.







[GitHub] [beam] robertwb commented on pull request #11626: Cleanup ToString transforms.

2020-05-06 Thread GitBox


robertwb commented on pull request #11626:
URL: https://github.com/apache/beam/pull/11626#issuecomment-624958207


   Run Python PreCommit







[GitHub] [beam] pabloem commented on pull request #11624: [BEAM-9767] Make streaming_wordcount use a test timeout and increase from 5s to 30s

2020-05-06 Thread GitBox


pabloem commented on pull request #11624:
URL: https://github.com/apache/beam/pull/11624#issuecomment-624954133


   Run Python PreCommit







[GitHub] [beam] robertwb commented on pull request #11626: Cleanup ToString transforms.

2020-05-06 Thread GitBox


robertwb commented on pull request #11626:
URL: https://github.com/apache/beam/pull/11626#issuecomment-624953526


   Run Python2_PVR_Flink PreCommit







[GitHub] [beam] boyuanzz commented on a change in pull request #11627: Fix thread local to be initialized on every thread.

2020-05-06 Thread GitBox


boyuanzz commented on a change in pull request #11627:
URL: https://github.com/apache/beam/pull/11627#discussion_r421161112



##
File path: sdks/python/apache_beam/utils/subprocess_server.py
##
@@ -161,8 +161,9 @@ class JavaJarServer(SubprocessServer):
   BEAM_GROUP_ID = 'org.apache.beam'
   JAR_CACHE = os.path.expanduser("~/.apache_beam/cache/jars")
 
-  _BEAM_SERVICES = threading.local()
-  _BEAM_SERVICES.replacements = {}
+  _BEAM_SERVICES = _BEAM_SERVICES = type(

Review comment:
   Thanks! I'll merge it after all tests pass.









[GitHub] [beam] robertwb commented on a change in pull request #11627: Fix thread local to be initialized on every thread.

2020-05-06 Thread GitBox


robertwb commented on a change in pull request #11627:
URL: https://github.com/apache/beam/pull/11627#discussion_r421160293



##
File path: sdks/python/apache_beam/utils/subprocess_server.py
##
@@ -161,8 +161,9 @@ class JavaJarServer(SubprocessServer):
   BEAM_GROUP_ID = 'org.apache.beam'
   JAR_CACHE = os.path.expanduser("~/.apache_beam/cache/jars")
 
-  _BEAM_SERVICES = threading.local()
-  _BEAM_SERVICES.replacements = {}
+  _BEAM_SERVICES = _BEAM_SERVICES = type(

Review comment:
   Hmm... not sure how the duplicate got in there. 
   
   `type(...)` is actually creating a subclass here. 
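
For readers following along, a minimal standalone sketch of the behaviour under discussion. An attribute assigned on a plain `threading.local()` instance exists only in the thread that assigned it, whereas a subclass of `threading.local` has its `__init__` re-run in each thread that touches the object. The class name `LocalWithInit` and the exact arguments passed to `type(...)` are assumptions made for illustration; the quoted diff is truncated, so this is not necessarily what the PR contains.

```python
import threading

# Plain thread-local: the attribute is set only in the thread running this line.
plain = threading.local()
plain.replacements = {}

# Subclass created dynamically with type(); __init__ runs again in every
# thread on first access, so each thread gets its own fresh dict.
LocalWithInit = type(
    'LocalWithInit', (threading.local,),
    {'__init__': lambda self: setattr(self, 'replacements', {})})
per_thread = LocalWithInit()


def probe(results):
    """Record what a worker thread can see on both objects."""
    results['plain_has_attr'] = hasattr(plain, 'replacements')
    results['subclass_has_attr'] = hasattr(per_thread, 'replacements')
    per_thread.replacements['seen'] = True
    results['worker_dict'] = dict(per_thread.replacements)


results = {}
worker = threading.Thread(target=probe, args=(results,))
worker.start()
worker.join()
```

The three-argument `type(name, bases, namespace)` call is equivalent to writing a `class` statement that subclasses `threading.local`, which is why both forms work.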









[GitHub] [beam] allenpradeep commented on pull request #11628: [BEAM-9911]Replace SpannerIO.write latency counter to distribution

2020-05-06 Thread GitBox


allenpradeep commented on pull request #11628:
URL: https://github.com/apache/beam/pull/11628#issuecomment-624951382


   Run Java PostCommit







[GitHub] [beam] allenpradeep opened a new pull request #11628: [BEAM-9911]Replace SpannerIO.write latency counter to distribution

2020-05-06 Thread GitBox


allenpradeep opened a new pull request #11628:
URL: https://github.com/apache/beam/pull/11628


   As part of the improvements to Spanner write, `spanner_write_total_latency_ms` 
was added for more visibility. This counter tracks the total latency, in 
milliseconds, incurred by all write calls to Spanner, and it is not actionable.
   Replacing it with a Distribution makes it more actionable, because a 
Distribution provides 4 counters (MIN, MAX, MEAN, COUNT).
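
As a rough illustration of why a distribution is easier to act on than a running total, here is a small plain-Python sketch. `LatencyDistribution` is a hypothetical stand-in written for this example, not Beam's metrics API, and the latency values are made up.

```python
class LatencyDistribution:
    """Tracks count, sum, min and max of the reported latencies."""

    def __init__(self):
        self.count = 0
        self.total_ms = 0
        self.min_ms = None
        self.max_ms = None

    def update(self, latency_ms):
        """Fold one observed latency into the running statistics."""
        self.count += 1
        self.total_ms += latency_ms
        if self.min_ms is None or latency_ms < self.min_ms:
            self.min_ms = latency_ms
        if self.max_ms is None or latency_ms > self.max_ms:
            self.max_ms = latency_ms

    @property
    def mean_ms(self):
        """Average latency, or None if nothing has been reported yet."""
        return self.total_ms / self.count if self.count else None


# Illustrative latencies: two fast writes and one slow outlier.
latencies_ms = [12, 15, 900]

# A plain counter only exposes the sum, which hides the outlier.
total_counter = sum(latencies_ms)

dist = LatencyDistribution()
for latency in latencies_ms:
    dist.update(latency)
```

The total alone cannot distinguish many fast writes from a few slow ones, while the min, max and mean make a slow outlier visible.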
   
   @nielm  @chamikaramj 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   

[GitHub] [beam] pabloem commented on pull request #11624: [BEAM-9767] Make streaming_wordcount use a test timeout and increase from 5s to 30s

2020-05-06 Thread GitBox


pabloem commented on pull request #11624:
URL: https://github.com/apache/beam/pull/11624#issuecomment-624948172


   Run Python PreCommit







[GitHub] [beam] boyuanzz commented on a change in pull request #11627: Fix thread local to be initialized on every thread.

2020-05-06 Thread GitBox


boyuanzz commented on a change in pull request #11627:
URL: https://github.com/apache/beam/pull/11627#discussion_r421155093



##
File path: sdks/python/apache_beam/utils/subprocess_server.py
##
@@ -161,8 +161,9 @@ class JavaJarServer(SubprocessServer):
   BEAM_GROUP_ID = 'org.apache.beam'
   JAR_CACHE = os.path.expanduser("~/.apache_beam/cache/jars")
 
-  _BEAM_SERVICES = threading.local()
-  _BEAM_SERVICES.replacements = {}
+  _BEAM_SERVICES = _BEAM_SERVICES = type(

Review comment:
   Is the additional `_BEAM_SERVICES` intentional?
   
   It seems like creating a subclass of `threading.local` would work in this 
case. Why does `type` also work?









[GitHub] [beam] rahul8383 edited a comment on pull request #11581: [BEAM-8307] NPE in Calcite dialect when input PCollection has logical…

2020-05-06 Thread GitBox


rahul8383 edited a comment on pull request #11581:
URL: https://github.com/apache/beam/pull/11581#issuecomment-622049393


   R: @reuvenlax @amaliujia







[GitHub] [beam] robertwb opened a new pull request #11627: Fix thread local to be initialized on every thread.

2020-05-06 Thread GitBox


robertwb opened a new pull request #11627:
URL: https://github.com/apache/beam/pull/11627


   
   
   
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 

[GitHub] [beam] robertwb commented on pull request #11627: Fix thread local to be initialized on every thread.

2020-05-06 Thread GitBox


robertwb commented on pull request #11627:
URL: https://github.com/apache/beam/pull/11627#issuecomment-624938579







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Exclude Spark runner from UsesKeyInParDo tests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624934968


   Merged manually, closing now.







[GitHub] [beam] robertwb opened a new pull request #11626: Cleanup ToString transforms.

2020-05-06 Thread GitBox


robertwb opened a new pull request #11626:
URL: https://github.com/apache/beam/pull/11626


   Just saw this when running some tests.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   

[GitHub] [beam] pabloem commented on pull request #11624: [BEAM-9767] Make streaming_wordcount an integration test

2020-05-06 Thread GitBox


pabloem commented on pull request #11624:
URL: https://github.com/apache/beam/pull/11624#issuecomment-624933298


   retest this please







[GitHub] [beam] pabloem commented on pull request #11624: [BEAM-9767] Make streaming_wordcount an integration test

2020-05-06 Thread GitBox


pabloem commented on pull request #11624:
URL: https://github.com/apache/beam/pull/11624#issuecomment-624929844


   retest this please







[GitHub] [beam] amaliujia edited a comment on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-06 Thread GitBox


amaliujia edited a comment on pull request #11610:
URL: https://github.com/apache/beam/pull/11610#issuecomment-624927719


   You can run `./gradlew ${module}:check` to run all checks, including unit 
tests and style checks.







[GitHub] [beam] amaliujia commented on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-06 Thread GitBox


amaliujia commented on pull request #11610:
URL: https://github.com/apache/beam/pull/11610#issuecomment-624927719


   You can run `./gradlew $module:check` to run all checks, including unit 
tests and style checks.







[GitHub] [beam] iemejia removed a comment on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia removed a comment on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624916990











[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624917455


   Run Java Spark PortableValidatesRunner Batch







[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624917269











[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624917065


   Run Spark ValidatesRunner







[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624917133


   Run Spark StructuredStreaming ValidatesRunner







[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624917207


   Run Spark StructuredStreaming ValidatesRunner







[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624916990


   Run Spark ValidatesRunner







[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624916303


   retest this please







[GitHub] [beam] udim commented on a change in pull request #11620: [BEAM-7746] Enable mypy type checking for Beam Python code.

2020-05-06 Thread GitBox


udim commented on a change in pull request #11620:
URL: https://github.com/apache/beam/pull/11620#discussion_r421119702



##
File path: sdks/python/mypy.ini
##
@@ -58,3 +58,65 @@ ignore_errors = true
 [mypy-apache_beam.typehints.typehints_test_py3]
 # error: Signature of "process" incompatible with supertype "DoFn"  [override]
 ignore_errors = true
+
+
+# TODO: Remove the lines below.

Review comment:
   ```suggestion
   # TODO(BEAM-7746): Remove the lines below.
   ```









[GitHub] [beam] pabloem commented on pull request #11624: [BEAM-9767] Make streaming_wordcount an integration test

2020-05-06 Thread GitBox


pabloem commented on pull request #11624:
URL: https://github.com/apache/beam/pull/11624#issuecomment-624916278


   retest this please







[GitHub] [beam] iemejia removed a comment on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia removed a comment on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624916303


   retest this please







[GitHub] [beam] iemejia commented on pull request #11559: [BEAM-9836] Excluding spark runner for KeyTests

2020-05-06 Thread GitBox


iemejia commented on pull request #11559:
URL: https://github.com/apache/beam/pull/11559#issuecomment-624915813


   retest this please







[GitHub] [beam] iemejia commented on a change in pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-06 Thread GitBox


iemejia commented on a change in pull request #11619:
URL: https://github.com/apache/beam/pull/11619#discussion_r421116028



##
File path: sdks/java/testing/test-utils/src/test/java/org/apache/beam/sdk/testutils/jvmverification/JvmVerification.java
##
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.testutils.jvmverification;
+
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v10;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v11;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v12;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v13;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v14;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v1_1;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v1_2;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v1_3;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v1_4;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v1_5;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v1_6;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v1_7;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v1_8;
+import static org.apache.beam.sdk.testutils.jvmverification.JvmVerification.Java.v9;
+import static org.junit.Assert.assertEquals;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.beam.repackaged.core.org.apache.commons.compress.utils.IOUtils;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.commons.codec.binary.Hex;
+import org.junit.Test;
+
+public class JvmVerification {
+
+  private static final Map<String, Java> versionMapping = new HashMap<>();
+
+  static {
+    versionMapping.put("002D", v1_1);
+    versionMapping.put("002E", v1_2);
+    versionMapping.put("002F", v1_3);
+    versionMapping.put("0030", v1_4);
+    versionMapping.put("0031", v1_5);

Review comment:
   Maybe we can get rid of all the intermediate releases here and keep only 
the ones we care about (i.e. the LTS ones), so v1_8 and v11 for the moment, no?
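
   A minimal sketch of the trimmed mapping this comment suggests, assuming only 
Java 8 and Java 11 need to be recognised. The class `LtsClassVersions` and its 
method are hypothetical and not part of the PR. The only facts relied on are the 
class-file layout (a 4-byte magic number, a 2-byte minor version, then a 2-byte 
major version) and the major versions 52 (Java 8) and 55 (Java 11).

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not taken from the PR under review.
final class LtsClassVersions {

  // Class-file major version -> Java release, LTS releases only.
  private static final Map<Integer, String> LTS_VERSIONS = new HashMap<>();

  static {
    LTS_VERSIONS.put(52, "1.8"); // 0x0034
    LTS_VERSIONS.put(55, "11"); // 0x0037
  }

  /** Returns the Java release a class file targets, or throws if it is not an LTS release. */
  static String javaVersionOf(byte[] classBytes) {
    if (classBytes == null || classBytes.length < 8) {
      throw new IllegalArgumentException("Too short to be a class file");
    }
    ByteBuffer header = ByteBuffer.wrap(classBytes); // big-endian by default
    if (header.getInt() != 0xCAFEBABE) {
      throw new IllegalArgumentException("Missing class-file magic number");
    }
    header.getShort(); // skip the minor version
    int major = header.getShort() & 0xFFFF;
    String version = LTS_VERSIONS.get(major);
    if (version == null) {
      throw new IllegalArgumentException("Unsupported class-file major version: " + major);
    }
    return version;
  }
}
```

   With only two entries, any class compiled for another release is reported as 
unsupported instead of being mapped to a version nobody checks for.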

##
File path: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
##
@@ -698,6 +700,21 @@ class BeamModulePlugin implements Plugin<Project> {
 + (defaultLintSuppressions + configuration.disableLintWarnings).collect { "-Xlint:-${it}" })
   }
 
+  if (project.hasProperty("compileAndRunTestsWithJava11")) {
+    def java11Home = project.findProperty("java11Home")
+    project.tasks.compileTestJava {
+      options.fork = true
+      options.forkOptions.javaHome = java11Home as File
+      sourceCompatibility = JavaVersion.VERSION_1_8
+      targetCompatibility = JavaVersion.VERSION_11
+      options.compilerArgs += ['-Xlint:-path']

Review comment:
   What about putting the equivalent of `--release=11` here (to ensure we 
use the bootclasspath of the Java 11 JVM)?
   
https://stackoverflow.com/questions/43102787/what-is-the-release-flag-in-the-java-9-compiler
   I suppose the source compatibility with Java 8 is already validated in the 
default build.

##
File path: sdks/java/testing/test-utils/src/test/java/org/apache/beam/sdk/testutils/jvmverification/JvmVerification.java
##
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in 

[GitHub] [beam] ibzib commented on pull request #11403: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-06 Thread GitBox


ibzib commented on pull request #11403:
URL: https://github.com/apache/beam/pull/11403#issuecomment-624911213


   Run Python 2 PostCommit







[GitHub] [beam] TheNeuralBit commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-06 Thread GitBox


TheNeuralBit commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r421113369



##
File path: website/build.gradle
##
@@ -91,30 +92,38 @@ task startDockerContainer(type: Exec) {
 "${->createDockerContainer.containerId()}" // Lazily evaluate containerId.
 }
 
+task initGitSubmodules(type: Exec) {
+    commandLine 'docker', 'exec', '-u', 'root',
+        "${->startDockerContainer.containerId()}", 'git',  'submodule', 'update', '--init',  '--recursive'
+}
+
+task installDependencies(type: Exec) {
+    commandLine 'docker', 'exec', '-u', 'root', '--workdir', "$dockerSourceDir",
+        "${->startDockerContainer.containerId()}", 'yarn', 'install'
+}
+
+task buildGithubSamples(type: Exec) {
+  commandLine 'docker', 'exec', '-u', 'root', '--workdir', "$dockerSourceDir",
+      "${->startDockerContainer.containerId()}", 'yarn', 'build_github_samples'
+}

Review comment:
   Confirmed that removing these lines seems to have fixed the issue with 
creating files owned by root. I ran 
`find /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Website_Stage_GCS_Commit/ -group root` 
and 
`find /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Website_Commit/ -group root` 
on the workers it used, and neither command found any files. 
   
   I also ran find with `-exec rm` on every worker to remove some root-owned 
files that I seem to have missed before. We _should_ be good now.









[GitHub] [beam] chadrik commented on pull request #11620: [BEAM-7746] Enable mypy type checking for Beam Python code.

2020-05-06 Thread GitBox


chadrik commented on pull request #11620:
URL: https://github.com/apache/beam/pull/11620#issuecomment-624907702


   Great idea.  LGTM.  







[GitHub] [beam] pabloem commented on pull request #11625: Remove -u root

2020-05-06 Thread GitBox


pabloem commented on pull request #11625:
URL: https://github.com/apache/beam/pull/11625#issuecomment-624906925


   LGTM. Thanks Brian!







[GitHub] [beam] rahul8383 commented on pull request #11609: [BEAM-9887] Throw IllegalArgumentException when building Row with logical types with Invalid input

2020-05-06 Thread GitBox


rahul8383 commented on pull request #11609:
URL: https://github.com/apache/beam/pull/11609#issuecomment-624906438


   > This seems to be a holdover. Previously Row stored logical type values as 
their base type, so we probably called `toBaseType(toInputType(x))`.
   
   Even before the code which changed `Row` to store logical type values 
instead of base values in memory, while building the row, the input value is 
expected to be of correct length.
   
https://github.com/apache/beam/blob/9f0cb649d39ee6236ea27f111acb4b66591a80ec/sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java#L658-L660
   I think this issue has been present since the introduction of `FixedBytes`.
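
   To illustrate the behaviour discussed here (rejecting a value of the wrong 
length at build time with an `IllegalArgumentException`), here is a minimal, 
self-contained sketch. `FixedLengthBytes` and `validate` are hypothetical names 
chosen for this example. It does not reproduce Beam's `FixedBytes` or `Row` 
code, and whether Beam pads, copies, or rejects short input is not assumed.

```java
import java.util.Arrays;

// Hypothetical stand-in for a fixed-length byte-array logical type.
final class FixedLengthBytes {

  private final int length;

  FixedLengthBytes(int length) {
    if (length < 0) {
      throw new IllegalArgumentException("Length must be non-negative, got " + length);
    }
    this.length = length;
  }

  /** Returns a defensive copy of the input if it has exactly the declared length. */
  byte[] validate(byte[] input) {
    if (input == null || input.length != length) {
      String actual = (input == null) ? "null" : input.length + " bytes";
      throw new IllegalArgumentException("Expected " + length + " bytes but got " + actual);
    }
    return Arrays.copyOf(input, input.length);
  }
}
```

   Failing at the point where the value is supplied keeps the error close to the 
caller that passed the bad input, rather than surfacing later during encoding.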







[GitHub] [beam] pabloem commented on pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-06 Thread GitBox


pabloem commented on pull request #11554:
URL: https://github.com/apache/beam/pull/11554#issuecomment-624905871


   okay, @bntnam we should not fail Website Precommit tests when some links are 
broken. This is consistent with previous policy, where we didn't fail on broken 
tests. 







[GitHub] [beam] Akshay-Iyangar commented on pull request #11396: [BEAM-9742] Add Configurable FluentBackoff to JdbcIO Write

2020-05-06 Thread GitBox


Akshay-Iyangar commented on pull request #11396:
URL: https://github.com/apache/beam/pull/11396#issuecomment-624902109


   @lukecwik @aromanenko-dev @jfarr @timrobertson100 Let me know what you guys 
think needs to be done. 
   Seems like people are ok with making FluentBackOff public? 
   







[GitHub] [beam] TheNeuralBit commented on pull request #11625: Remove -u root

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11625:
URL: https://github.com/apache/beam/pull/11625#issuecomment-624901486


   retest this please







[GitHub] [beam] TheNeuralBit commented on pull request #11625: [WIP] Remove -u root

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11625:
URL: https://github.com/apache/beam/pull/11625#issuecomment-624901099


   retest this please







[GitHub] [beam] TheNeuralBit commented on pull request #11625: Remove -u root

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11625:
URL: https://github.com/apache/beam/pull/11625#issuecomment-624901320


   Run Website_Stage_GCS PreCommit







[GitHub] [beam] TheNeuralBit commented on pull request #11625: [WIP] Remove -u root

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11625:
URL: https://github.com/apache/beam/pull/11625#issuecomment-624900732


   Run Website_Stage_GCS PreCommit







[GitHub] [beam] amaliujia commented on a change in pull request #11623: [BEAM-9908] Fix Python build failures in release script.

2020-05-06 Thread GitBox


amaliujia commented on a change in pull request #11623:
URL: https://github.com/apache/beam/pull/11623#discussion_r421097127



##
File path: release/src/main/scripts/build_release_candidate.sh
##
@@ -275,7 +276,7 @@ if [[ $confirmation = "y" ]]; then
   git clone ${GIT_REPO_URL}
   cd ${BEAM_ROOT_DIR}
   git checkout ${RELEASE_BRANCH}
-  cd sdks/python && tox -e docs
+  cd sdks/python && pip install -r build-requirements.txt && tox -e py37-docs

Review comment:
   Thanks for clarifying this.









[GitHub] [beam] robinyqiu commented on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType

2020-05-06 Thread GitBox


robinyqiu commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-624894351


   The failing test `SparkPortableExecutionTest.testExecution` should be 
unrelated to this change.







[GitHub] [beam] pabloem commented on pull request #11624: [BEAM-9767] Make streaming_wordcount an integration test

2020-05-06 Thread GitBox


pabloem commented on pull request #11624:
URL: https://github.com/apache/beam/pull/11624#issuecomment-624894129


   retest this please







[GitHub] [beam] rohdesamuel commented on pull request #11624: [BEAM-9767] Make streaming_wordcount an integration test

2020-05-06 Thread GitBox


rohdesamuel commented on pull request #11624:
URL: https://github.com/apache/beam/pull/11624#issuecomment-624893653


   R: @pabloem 







[GitHub] [beam] kmjung commented on pull request #11580: [BEAM-9861] Reject fractional values outside of (0.0, 1.0)

2020-05-06 Thread GitBox


kmjung commented on pull request #11580:
URL: https://github.com/apache/beam/pull/11580#issuecomment-624891188


   R: @chamikaramj 







[GitHub] [beam] ibzib commented on a change in pull request #11623: [BEAM-9908] Fix Python build failures in release script.

2020-05-06 Thread GitBox


ibzib commented on a change in pull request #11623:
URL: https://github.com/apache/beam/pull/11623#discussion_r421092661



##
File path: release/src/main/scripts/build_release_candidate.sh
##
@@ -275,7 +276,7 @@ if [[ $confirmation = "y" ]]; then
   git clone ${GIT_REPO_URL}
   cd ${BEAM_ROOT_DIR}
   git checkout ${RELEASE_BRANCH}
-  cd sdks/python && tox -e docs
+  cd sdks/python && pip install -r build-requirements.txt && tox -e py37-docs

Review comment:
   Done









[GitHub] [beam] amaliujia commented on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-06 Thread GitBox


amaliujia commented on pull request #11610:
URL: https://github.com/apache/beam/pull/11610#issuecomment-624889206


   > > Thanks!
   > > In terms of naming, I found it surprising that union (for example) does 
deduplication. Maybe name them distinctUnion and multisetUnion or something 
like that?
   > 
   > I don't have a strong view here, but I feel that because it is inside `SetFns` and 
will be used like `SetFns.union`, it follows SET DISTINCT semantics. It is the 
equivalent of UNION in SQL, like `A U B`. Also, all other functions in `SetFns` 
follow SET DISTINCT semantics. What is your view? How do we make it better?
   
   I am wondering if you could use a boolean flag to indicate whether it has "ALL" or 
"DISTINCT" semantics. That way there is no naming problem.
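
The boolean-flag idea can be sketched roughly as follows. This is only an illustration under assumptions: plain Java lists stand in for PCollections, and the names `SetUnionSketch`, `union`, and the `distinct` parameter are hypothetical and not taken from the PR's `SetFns` class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration: plain lists stand in for PCollections, and
// SetUnionSketch.union is a made-up name, not part of Beam's SetFns.
final class SetUnionSketch {

  // distinct == true  -> SQL "UNION" (DISTINCT): each element appears once.
  // distinct == false -> SQL "UNION ALL": duplicates from both inputs are kept.
  static <T> List<T> union(List<T> left, List<T> right, boolean distinct) {
    List<T> all = new ArrayList<>(left);
    all.addAll(right);
    if (!distinct) {
      return all;
    }
    // LinkedHashSet drops duplicates while keeping first-seen order.
    return new ArrayList<>(new LinkedHashSet<>(all));
  }

  public static void main(String[] args) {
    List<String> a = Arrays.asList("a", "b", "b");
    List<String> b = Arrays.asList("b", "c");
    System.out.println(union(a, b, true));  // prints [a, b, c]
    System.out.println(union(a, b, false)); // prints [a, b, b, b, c]
  }
}
```

With a single entry point, callers choose the semantics at the call site, so the method name no longer has to signal whether deduplication happens.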







[GitHub] [beam] ibzib commented on a change in pull request #11623: [BEAM-9908] Fix Python build failures in release script.

2020-05-06 Thread GitBox


ibzib commented on a change in pull request #11623:
URL: https://github.com/apache/beam/pull/11623#discussion_r421089353



##
File path: release/src/main/scripts/build_release_candidate.sh
##
@@ -275,7 +276,7 @@ if [[ $confirmation = "y" ]]; then
   git clone ${GIT_REPO_URL}
   cd ${BEAM_ROOT_DIR}
   git checkout ${RELEASE_BRANCH}
-  cd sdks/python && tox -e docs
+  cd sdks/python && pip install -r build-requirements.txt && tox -e py37-docs

Review comment:
   I can do that, but I hope we get rid of the "manual" steps in the 
future. They add no utility over the script and make the release guide hard to 
read.









[GitHub] [beam] amaliujia commented on a change in pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-06 Thread GitBox


amaliujia commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r421087268



##
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   It depends on whether we treat the SET transforms as binary operations or as set 
operations. If binary, this PR's implementation makes sense (plus it is 
chainable). If we consider them set operations, then it should be 
`PTransform, PCollection>` by the nature of a set.
   
   
   I personally lean toward having the SET transforms here work on a set, like CoGroup. 
This PR is also already built on top of CoGroup (which is 
`PTransform, PCollection`).
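
The two shapes being compared can be sketched roughly as follows. This is a minimal illustration under assumptions: plain Java sets stand in for PCollections, and `SetShapeSketch`, `intersect`, and `intersectAll` are hypothetical names that do not come from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: java.util.Set stands in for a PCollection;
// none of these names come from Beam or from the PR under review.
final class SetShapeSketch {

  // Binary shape: one input plus one "other" input, so calls can be chained.
  static <T> Set<T> intersect(Set<T> left, Set<T> right) {
    Set<T> out = new LinkedHashSet<>(left);
    out.retainAll(right);
    return out;
  }

  // N-ary shape: the whole list of inputs is handed over at once.
  static <T> Set<T> intersectAll(List<Set<T>> inputs) {
    if (inputs.isEmpty()) {
      return new LinkedHashSet<>();
    }
    Set<T> out = new LinkedHashSet<>(inputs.get(0));
    for (Set<T> next : inputs.subList(1, inputs.size())) {
      out.retainAll(next);
    }
    return out;
  }

  public static void main(String[] args) {
    Set<Integer> x = new LinkedHashSet<>(Arrays.asList(1, 2, 3));
    Set<Integer> y = new LinkedHashSet<>(Arrays.asList(2, 3, 4));
    Set<Integer> z = new LinkedHashSet<>(Arrays.asList(3, 4, 5));

    // Chained binary calls.
    System.out.println(intersect(intersect(x, y), z)); // prints [3]

    // One n-ary call over the list of inputs.
    List<Set<Integer>> inputs = new ArrayList<>();
    inputs.add(x);
    inputs.add(y);
    inputs.add(z);
    System.out.println(intersectAll(inputs)); // prints [3]
  }
}
```

Both forms produce the same result here; the difference is only in how callers assemble the inputs.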









[GitHub] [beam] udim commented on a change in pull request #11623: [BEAM-9908] Fix Python build failures in release script.

2020-05-06 Thread GitBox


udim commented on a change in pull request #11623:
URL: https://github.com/apache/beam/pull/11623#discussion_r421088091



##
File path: release/src/main/scripts/build_release_candidate.sh
##
@@ -275,7 +276,7 @@ if [[ $confirmation = "y" ]]; then
   git clone ${GIT_REPO_URL}
   cd ${BEAM_ROOT_DIR}
   git checkout ${RELEASE_BRANCH}
-  cd sdks/python && tox -e docs
+  cd sdks/python && pip install -r build-requirements.txt && tox -e py37-docs

Review comment:
   Please also update the release guide:
   
https://github.com/apache/beam/blame/master/website/src/contribute/release-guide.md#L783









[GitHub] [beam] amaliujia commented on a change in pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-06 Thread GitBox


amaliujia commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r421087268



##
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   It depends on whether we treat SET operations as binary operations or as "set" 
operations. If binary, this PR's implementation makes sense (plus it is 
chainable). If we consider them "set" operations, then it should be 
`PTransform, PCollection>` by the nature of a set.
   
   
   I personally lean toward having the SET operations here work on a "set", like CoGroup. 









[GitHub] [beam] pabloem commented on pull request #11625: [WIP] Remove -u root

2020-05-06 Thread GitBox


pabloem commented on pull request #11625:
URL: https://github.com/apache/beam/pull/11625#issuecomment-624887659


   Thanks! : D LGTM as long as the tests pass







[GitHub] [beam] TheNeuralBit opened a new pull request #11625: [WIP] Remove -u root

2020-05-06 Thread GitBox


TheNeuralBit opened a new pull request #11625:
URL: https://github.com/apache/beam/pull/11625



[GitHub] [beam] amaliujia commented on a change in pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-06 Thread GitBox


amaliujia commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r421087268



##
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   It depends on whether we treat SET as a binary operation or as a "set" operation. If 
it is binary, this PR's implementation makes sense (plus it is chainable). If 
we consider it a "set" operation, then it should be 
`PTransform, PCollection>` by the nature of a set.









[GitHub] [beam] udim commented on a change in pull request #11623: [BEAM-9908] Fix Python build failures in release script.

2020-05-06 Thread GitBox


udim commented on a change in pull request #11623:
URL: https://github.com/apache/beam/pull/11623#discussion_r421086807



##
File path: release/src/main/scripts/build_release_candidate.sh
##
@@ -275,7 +276,7 @@ if [[ $confirmation = "y" ]]; then
   git clone ${GIT_REPO_URL}
   cd ${BEAM_ROOT_DIR}
   git checkout ${RELEASE_BRANCH}
-  cd sdks/python && tox -e docs
+  cd sdks/python && pip install -r build-requirements.txt && tox -e py37-docs

Review comment:
   There are 2 virtualenvs here. The one in line 272 will use the default 
python version, and the one created by tox will use python3.7.
   Tox runs "python setup.py sdist" in the first virtualenv and installs the 
tarball in the second.









[GitHub] [beam] TheNeuralBit commented on pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11554:
URL: https://github.com/apache/beam/pull/11554#issuecomment-624884305


   retest this please







[GitHub] [beam] TheNeuralBit commented on pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11554:
URL: https://github.com/apache/beam/pull/11554#issuecomment-624883802


   retest this please







[GitHub] [beam] TheNeuralBit edited a comment on pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-06 Thread GitBox


TheNeuralBit edited a comment on pull request #11554:
URL: https://github.com/apache/beam/pull/11554#issuecomment-624883802


   retest this please







[GitHub] [beam] rohdesamuel opened a new pull request #11624: make streaming_wordcount an integration test

2020-05-06 Thread GitBox


rohdesamuel opened a new pull request #11624:
URL: https://github.com/apache/beam/pull/11624


   Change-Id: I083dccc63d8c44274ec175e2bd1520c540adf9b3
   

[GitHub] [beam] TheNeuralBit commented on pull request #11569: [BEAM-9840] Support for Parameterized Types when converting from HCat…

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11569:
URL: https://github.com/apache/beam/pull/11569#issuecomment-624879626


   Filed [BEAM-9909](https://issues.apache.org/jira/browse/BEAM-9909) to track 
the logical type change







[GitHub] [beam] pabloem commented on pull request #11086: [BEAM-8910] Make custom BQ source read from Avro

2020-05-06 Thread GitBox


pabloem commented on pull request #11086:
URL: https://github.com/apache/beam/pull/11086#issuecomment-624879527


   Run Python 2 PostCommit







[GitHub] [beam] TheNeuralBit commented on pull request #11569: [BEAM-9840] Support for Parameterized Types when converting from HCat…

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11569:
URL: https://github.com/apache/beam/pull/11569#issuecomment-624878454


   Yep! Sorry about that, I meant to do this yesterday.







[GitHub] [beam] TheNeuralBit commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-06 Thread GitBox


TheNeuralBit commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r421075832



##
File path: website/build.gradle
##
@@ -91,30 +92,38 @@ task startDockerContainer(type: Exec) {
 "${->createDockerContainer.containerId()}" // Lazily evaluate containerId.
 }
 
+task initGitSubmodules(type: Exec) {
+commandLine 'docker', 'exec', '-u', 'root',
+"${->startDockerContainer.containerId()}", 'git',  'submodule', 'update', '--init',  '--recursive'
+}
+
+task installDependencies(type: Exec) {
+commandLine 'docker', 'exec', '-u', 'root', '--workdir', "$dockerSourceDir",
+"${->startDockerContainer.containerId()}", 'yarn', 'install'
+}
+
+task buildGithubSamples(type: Exec) {
+  commandLine 'docker', 'exec', '-u', 'root', '--workdir', "$dockerSourceDir",
+  "${->startDockerContainer.containerId()}", 'yarn', 'build_github_samples'
+}

Review comment:
   I wonder if we should remove the `-u root` from the other docker exec 
command below as well? I don't know why it's there, and I had to remove all 4 
in order to be able to run `:websitePrecommit` locally.
   
   Also once I did get it running I ran into the following error:
   ```
   Error: Error building site: TOCSS: failed to transform "scss/main.scss" 
(text/x-scss): SCSS processing failed: file 
"/opt/website/www/site/assets/scss/_table-wrapper.sass", line 12, col 35: 
Invalid CSS after "...the License. */": expected 1 selector or at-rule, was "*/ 
{}" 
   ```
   
   This seems to be a separate issue. We need to change the comment syntax in 
main.scss.









[GitHub] [beam] rahul8383 commented on pull request #11569: [BEAM-9840] Support for Parameterized Types when converting from HCat…

2020-05-06 Thread GitBox


rahul8383 commented on pull request #11569:
URL: https://github.com/apache/beam/pull/11569#issuecomment-624876337


   @TheNeuralBit if it is okay with you that I handle converting the 
parameterized types to the necessary logical types in a separate PR, and if there 
are no other comments, can we merge this PR?







[GitHub] [beam] rohdesamuel commented on pull request #11503: [BEAM-9692] Make GroupByKey into a primitive

2020-05-06 Thread GitBox


rohdesamuel commented on pull request #11503:
URL: https://github.com/apache/beam/pull/11503#issuecomment-624873948


   R: @robertwb 







[GitHub] [beam] pabloem commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-06 Thread GitBox


pabloem commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r421070663



##
File path: website/build.gradle
##
@@ -91,30 +92,38 @@ task startDockerContainer(type: Exec) {
 "${->createDockerContainer.containerId()}" // Lazily evaluate containerId.
 }
 
+task initGitSubmodules(type: Exec) {
+commandLine 'docker', 'exec', '-u', 'root',
+"${->startDockerContainer.containerId()}", 'git', 'submodule', 'update', '--init', '--recursive'
+}
+
+task installDependencies(type: Exec) {
+commandLine 'docker', 'exec', '-u', 'root', '--workdir', "$dockerSourceDir",
+"${->startDockerContainer.containerId()}", 'yarn', 'install'
+}
+
+task buildGithubSamples(type: Exec) {
+  commandLine 'docker', 'exec', '-u', 'root', '--workdir', "$dockerSourceDir",
+  "${->startDockerContainer.containerId()}", 'yarn', 'build_github_samples'
+}

Review comment:
   Can you try running these as non-root? Running docker as root seems to be 
causing trouble for other tests.









[GitHub] [beam] aaltay commented on pull request #11555: [BEAM-8134] Grafana dashboards for Load Tests and IO IT Performance Tests

2020-05-06 Thread GitBox


aaltay commented on pull request #11555:
URL: https://github.com/apache/beam/pull/11555#issuecomment-624863732


   This LGTM. I believe the only open comment is about adding a landing page, 
but otherwise I do not have additional comments.







[GitHub] [beam] aaltay commented on pull request #11555: [BEAM-8134] Grafana dashboards for Load Tests and IO IT Performance Tests

2020-05-06 Thread GitBox


aaltay commented on pull request #11555:
URL: https://github.com/apache/beam/pull/11555#issuecomment-624863286


   > > Some different colors (Example: 
http://metrics.beam.apache.org/d/bnlHKP3Wz/java-io-it-tests-dataflow?orgId=1 -- 
TextIOIT | 1 GB | GCS | "Many files" | GCS Copies is in blue color)
   > 
   > It was a purposeful change. This is the only test within Java IO IT 
dashboard that reports a different kind of metric (not _read_time_ or 
_write_time_, but _copies_per_sec_). I can modify the color if you think all 
colors should be the same.
   
   No, different colors make sense for different metrics.







[GitHub] [beam] aaltay commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-06 Thread GitBox


aaltay commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-624862871


   /cc @tysonjh 







[GitHub] [beam] amaliujia commented on a change in pull request #11623: [BEAM-9908] Fix Python build failures in release script.

2020-05-06 Thread GitBox


amaliujia commented on a change in pull request #11623:
URL: https://github.com/apache/beam/pull/11623#discussion_r421050465



##
File path: release/src/main/scripts/build_release_candidate.sh
##
@@ -275,7 +276,7 @@ if [[ $confirmation = "y" ]]; then
   git clone ${GIT_REPO_URL}
   cd ${BEAM_ROOT_DIR}
   git checkout ${RELEASE_BRANCH}
-  cd sdks/python && tox -e docs
+  cd sdks/python && pip install -r build-requirements.txt && tox -e py37-docs

Review comment:
   This is interesting, because I thought this script uses python2, and yet in 
this PR `tox -e py37-docs` appears to work with python2.









[GitHub] [beam] ibzib commented on pull request #11403: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-06 Thread GitBox


ibzib commented on pull request #11403:
URL: https://github.com/apache/beam/pull/11403#issuecomment-624855116


   Run Python 2 PostCommit







[GitHub] [beam] TheNeuralBit commented on pull request #11622: [BEAM-3288] Add suggested fix to error message

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11622:
URL: https://github.com/apache/beam/pull/11622#issuecomment-624854433


   Run Java PreCommit







[GitHub] [beam] mxm commented on pull request #11590: [BEAM-8944] Improve UnboundedThreadPoolExecutor performance

2020-05-06 Thread GitBox


mxm commented on pull request #11590:
URL: https://github.com/apache/beam/pull/11590#issuecomment-624854053


   Seems like the streaming test is timing out when it wasn't before. Not sure 
if that's related to the changes here. I'll check.







[GitHub] [beam] amaliujia commented on a change in pull request #11623: [BEAM-9908] Fix Python build failures in release script.

2020-05-06 Thread GitBox


amaliujia commented on a change in pull request #11623:
URL: https://github.com/apache/beam/pull/11623#discussion_r421050465



##
File path: release/src/main/scripts/build_release_candidate.sh
##
@@ -275,7 +276,7 @@ if [[ $confirmation = "y" ]]; then
   git clone ${GIT_REPO_URL}
   cd ${BEAM_ROOT_DIR}
   git checkout ${RELEASE_BRANCH}
-  cd sdks/python && tox -e docs
+  cd sdks/python && pip install -r build-requirements.txt && tox -e py37-docs

Review comment:
   This is interesting, because I thought this script uses python2, but 
`tox -e py37-docs` appears to work with python2.









[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

2020-05-06 Thread GitBox


mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624846518


   Run Python Load Tests ParDo Flink Streaming







[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

2020-05-06 Thread GitBox


mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624853271


   Run Python Load Tests ParDo Flink Streaming







[GitHub] [beam] ibzib opened a new pull request #11623: [BEAM-9908] Fix Python build failures in release script.

2020-05-06 Thread GitBox


ibzib opened a new pull request #11623:
URL: https://github.com/apache/beam/pull/11623


   R: @udim 
   CC: @amaliujia 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 

[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

2020-05-06 Thread GitBox


mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624846518


   Run Python Load Tests ParDo Flink Streaming







[GitHub] [beam] TheNeuralBit opened a new pull request #11622: [BEAM-3288] Add suggested fix to error message

2020-05-06 Thread GitBox


TheNeuralBit opened a new pull request #11622:
URL: https://github.com/apache/beam/pull/11622



[GitHub] [beam] tvalentyn commented on pull request #11621: Remove a bunch of spurious warnings in tests.

2020-05-06 Thread GitBox


tvalentyn commented on pull request #11621:
URL: https://github.com/apache/beam/pull/11621#issuecomment-624827505


   PTAL at failures.







[GitHub] [beam] aaltay commented on pull request #11616: Use csv reader instead of split to read csv data.

2020-05-06 Thread GitBox


aaltay commented on pull request #11616:
URL: https://github.com/apache/beam/pull/11616#issuecomment-624821797


   Run Python PreCommit







[GitHub] [beam] TheNeuralBit commented on pull request #11609: [BEAM-9887] Throw IllegalArgumentException when building Row with logical types with Invalid input

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11609:
URL: https://github.com/apache/beam/pull/11609#issuecomment-624812424


   I'd be +1 for just dropping the padding logic. I don't think it should be 
the responsibility of the LogicalType to coerce values like this. What do you 
think @reuvenlax?







[GitHub] [beam] TheNeuralBit commented on pull request #11609: [BEAM-9887] Throw IllegalArgumentException when building Row with logical types with Invalid input

2020-05-06 Thread GitBox


TheNeuralBit commented on pull request #11609:
URL: https://github.com/apache/beam/pull/11609#issuecomment-624810068


   Ahh ok. I'm sorry for being so dense, I see what you're saying now. In 
`toBaseType` we validate that the array's length == the fixed length: 
   
https://github.com/apache/beam/blob/5e1571760b61b8ce247d5375b71c8df4d69d6409/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/logicaltypes/FixedBytes.java#L67-L70
   
   This is where the exception checked in your new tests is thrown.
   
   So we never actually get into `toInputType` where we could accept a shorter 
byte array and zero-pad. And there doesn't seem to be any other way to set a 
value by base type with a shorter byte array and pad it.
   
   This seems to be a holdover. Previously Row stored logical type values as 
their base type, so we probably called `toBaseType(toInputType(x))`.
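
To make the ordering problem above concrete, here is a minimal sketch in Python. It is not the Beam `FixedBytes` code: the class name `FixedBytesSketch` and both method names are hypothetical stand-ins, and the padding behaviour is assumed only from the description in this comment. It shows why a strict length check in the `toBaseType` direction means the zero-padding in the `toInputType` direction is never reached for a short array.

```python
class FixedBytesSketch:
    """Hypothetical stand-in for a fixed-length bytes logical type."""

    def __init__(self, length: int) -> None:
        self.length = length

    def to_base_type(self, value: bytes) -> bytes:
        # Strict direction: anything that is not exactly `length` bytes is rejected.
        if len(value) != self.length:
            raise ValueError(
                f"expected {self.length} bytes, got {len(value)}"
            )
        return value

    def to_input_type(self, base: bytes) -> bytes:
        # Lenient direction: a shorter array is zero-padded up to `length`.
        if len(base) > self.length:
            raise ValueError(
                f"expected at most {self.length} bytes, got {len(base)}"
            )
        return base.ljust(self.length, b"\x00")


if __name__ == "__main__":
    fixed = FixedBytesSketch(4)
    # Padding works when the lenient method is called directly.
    print(fixed.to_input_type(b"ab"))
    # The strict method rejects the same input, so a caller that goes
    # through it first never gets as far as the padding.
    try:
        fixed.to_base_type(b"ab")
    except ValueError as err:
        print("rejected:", err)
```

Under these assumptions, the padding would only matter if values were routed through the lenient method before the strict one, which matches the `toBaseType(toInputType(x))` call order guessed at above.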







[GitHub] [beam] lostluck commented on pull request #11615: [BEAM-9731] passert.Equals: sort output strings for easier reading

2020-05-06 Thread GitBox


lostluck commented on pull request #11615:
URL: https://github.com/apache/beam/pull/11615#issuecomment-624805830


   Retest this please







[GitHub] [beam] lostluck commented on pull request #11615: [BEAM-9731] passert.Equals: sort output strings for easier reading

2020-05-06 Thread GitBox


lostluck commented on pull request #11615:
URL: https://github.com/apache/beam/pull/11615#issuecomment-624805946


   LGTM, I'll merge once the tests pass.
   
   Good catch!






