[GitHub] [beam] ihji opened a new pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts

2020-05-21 Thread GitBox


ihji opened a new pull request #11771:
URL: https://github.com/apache/beam/pull/11771


   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/last

[GitHub] [beam] ihji commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts

2020-05-21 Thread GitBox


ihji commented on pull request #11771:
URL: https://github.com/apache/beam/pull/11771#issuecomment-631950394


   R: @chamikaramj 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kamilwu commented on pull request #11760: [BEAM-10043] Fix grammar / spelling in language-switch.js

2020-05-21 Thread GitBox


kamilwu commented on pull request #11760:
URL: https://github.com/apache/beam/pull/11760#issuecomment-631967204


   Retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kamilwu commented on pull request #11760: [BEAM-10043] Fix grammar / spelling in language-switch.js

2020-05-21 Thread GitBox


kamilwu commented on pull request #11760:
URL: https://github.com/apache/beam/pull/11760#issuecomment-631976285


   Thanks @epicfaace!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kamilwu merged pull request #11760: [BEAM-10043] Fix grammar / spelling in language-switch.js

2020-05-21 Thread GitBox


kamilwu merged pull request #11760:
URL: https://github.com/apache/beam/pull/11760


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kkucharc commented on pull request #11360: [BEAM-9722] added SnowflakeIO with Read operation

2020-05-21 Thread GitBox


kkucharc commented on pull request #11360:
URL: https://github.com/apache/beam/pull/11360#issuecomment-631990753


   Run Python2_PVR_Flink PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mxm opened a new pull request #11772: [BEAM-9930] Update blog post for new Beam Summit Digital dates

2020-05-21 Thread GitBox


mxm opened a new pull request #11772:
URL: https://github.com/apache/beam/pull/11772


   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Py_

[GitHub] [beam] DariuszAniszewski commented on pull request #11360: [BEAM-9722] added SnowflakeIO with Read operation

2020-05-21 Thread GitBox


DariuszAniszewski commented on pull request #11360:
URL: https://github.com/apache/beam/pull/11360#issuecomment-632019864


   Just a small comment about the force-push from above - it was mistakenly 
done, then reverted. HEAD of this branch is still on **3ba192a** and comment is 
a leftover.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-21 Thread GitBox


mwalenia commented on a change in pull request #11566:
URL: https://github.com/apache/beam/pull/11566#discussion_r428605560



##
File path: 
sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/BatchRequestForDLP.java
##
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.ml;
+
+import com.google.privacy.dlp.v2.Table;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.metrics.Counter;
+import org.apache.beam.sdk.metrics.Metrics;
+import org.apache.beam.sdk.state.BagState;
+import org.apache.beam.sdk.state.StateSpec;
+import org.apache.beam.sdk.state.StateSpecs;
+import org.apache.beam.sdk.state.TimeDomain;
+import org.apache.beam.sdk.state.Timer;
+import org.apache.beam.sdk.state.TimerSpec;
+import org.apache.beam.sdk.state.TimerSpecs;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.values.KV;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * DoFn batching the input PCollection into bigger requests in order to better 
utilize the Cloud DLP
+ * service.
+ */
+@Experimental
+class BatchRequestForDLP extends DoFn, KV>> {
+  public static final Logger LOG = 
LoggerFactory.getLogger(BatchRequestForDLP.class);
+
+  private final Counter numberOfRowsBagged =
+  Metrics.counter(BatchRequestForDLP.class, "numberOfRowsBagged");
+  private final Integer batchSize;
+
+  @StateId("elementsBag")
+  private final StateSpec>> elementsBag = 
StateSpecs.bag();
+
+  @TimerId("eventTimer")
+  private final TimerSpec eventTimer = TimerSpecs.timer(TimeDomain.EVENT_TIME);
+
+  public BatchRequestForDLP(Integer batchSize) {
+this.batchSize = batchSize;
+  }
+
+  @ProcessElement
+  public void process(
+  @Element KV element,
+  @StateId("elementsBag") BagState> elementsBag,
+  @TimerId("eventTimer") Timer eventTimer,
+  BoundedWindow w) {
+elementsBag.add(element);
+eventTimer.set(w.maxTimestamp());
+  }
+
+  @OnTimer("eventTimer")
+  public void onTimer(
+  @StateId("elementsBag") BagState> elementsBag,
+  OutputReceiver>> output) {
+String key = elementsBag.read().iterator().next().getKey();
+AtomicInteger bufferSize = new AtomicInteger();
+List rows = new ArrayList<>();
+elementsBag
+.read()
+.forEach(
+element -> {
+  int elementSize = element.getValue().getSerializedSize();
+  boolean clearBuffer = bufferSize.intValue() + elementSize > 
batchSize;
+  if (clearBuffer) {
+numberOfRowsBagged.inc(rows.size());
+LOG.debug("Clear Buffer {} , Key {}", bufferSize.intValue(), 
element.getKey());

Review comment:
   Sure, I'll move it and add units. Thanks!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-21 Thread GitBox


mwalenia commented on a change in pull request #11566:
URL: https://github.com/apache/beam/pull/11566#discussion_r428606587



##
File path: 
sdks/java/extensions/ml/src/test/java/org/apache/beam/sdk/extensions/ml/DLPTextOperationsIT.java
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.ml;
+
+import static org.junit.Assert.assertTrue;
+
+import com.google.privacy.dlp.v2.CharacterMaskConfig;
+import com.google.privacy.dlp.v2.DeidentifyConfig;
+import com.google.privacy.dlp.v2.DeidentifyContentResponse;
+import com.google.privacy.dlp.v2.Finding;
+import com.google.privacy.dlp.v2.InfoType;
+import com.google.privacy.dlp.v2.InfoTypeTransformations;
+import com.google.privacy.dlp.v2.InspectConfig;
+import com.google.privacy.dlp.v2.InspectContentResponse;
+import com.google.privacy.dlp.v2.Likelihood;
+import com.google.privacy.dlp.v2.PrimitiveTransformation;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+@RunWith(JUnit4.class)
+public class DLPTextOperationsIT {
+  @Rule public TestPipeline testPipeline = TestPipeline.create();
+
+  private static final String IDENTIFYING_TEXT = "mary@example.com";
+  private static InfoType emailAddress = 
InfoType.newBuilder().setName("EMAIL_ADDRESS").build();
+  private static final InspectConfig inspectConfig =
+  InspectConfig.newBuilder()
+  .addInfoTypes(emailAddress)
+  .setMinLikelihood(Likelihood.LIKELY)
+  .build();
+
+  @Test
+  public void inspectsText() {
+String projectId = 
testPipeline.getOptions().as(GcpOptions.class).getProject();
+PCollection> inspectionResult =
+testPipeline
+.apply(Create.of(KV.of("", IDENTIFYING_TEXT)))
+.apply(
+DLPInspectText.newBuilder()
+.setBatchSize(52400)

Review comment:
   @santhh Is it 52400 or 524000? It should be the latter, since we're 
setting the limit to 524 kb, I think.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] harriskistpay opened a new pull request #11773: [BEAM-1589] Added @OnWindowExpiration annotation.

2020-05-21 Thread GitBox


harriskistpay opened a new pull request #11773:
URL: https://github.com/apache/beam/pull/11773


   Updated CHANGES.md file
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.o

[GitHub] [beam] harriskistpay closed pull request #11773: [BEAM-1589] Added @OnWindowExpiration annotation.

2020-05-21 Thread GitBox


harriskistpay closed pull request #11773:
URL: https://github.com/apache/beam/pull/11773


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] rehmanmuradali opened a new pull request #11774: [BEAM-1589] Added @OnWindowExpiration annotation.

2020-05-21 Thread GitBox


rehmanmuradali opened a new pull request #11774:
URL: https://github.com/apache/beam/pull/11774


   Updated CHANGES.md
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/j

[GitHub] [beam] rehmanmuradali commented on pull request #11774: [BEAM-1589] Added @OnWindowExpiration annotation.

2020-05-21 Thread GitBox


rehmanmuradali commented on pull request #11774:
URL: https://github.com/apache/beam/pull/11774#issuecomment-632067025


   R: @reuvenlax 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mwalenia opened a new pull request #11775: [BEAM-10050] Change labels checked in VideoIntelligenceIT

2020-05-21 Thread GitBox


mwalenia opened a new pull request #11775:
URL: https://github.com/apache/beam/pull/11775


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-21 Thread GitBox


mwalenia commented on a change in pull request #11566:
URL: https://github.com/apache/beam/pull/11566#discussion_r428661780



##
File path: 
sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/DLPDeidentifyText.java
##
@@ -0,0 +1,215 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.ml;
+
+import com.google.auto.value.AutoValue;
+import com.google.cloud.dlp.v2.DlpServiceClient;
+import com.google.privacy.dlp.v2.ContentItem;
+import com.google.privacy.dlp.v2.DeidentifyConfig;
+import com.google.privacy.dlp.v2.DeidentifyContentRequest;
+import com.google.privacy.dlp.v2.DeidentifyContentResponse;
+import com.google.privacy.dlp.v2.FieldId;
+import com.google.privacy.dlp.v2.InspectConfig;
+import com.google.privacy.dlp.v2.ProjectName;
+import com.google.privacy.dlp.v2.Table;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+
+/**
+ * A {@link PTransform} connecting to Cloud DLP and deidentifying text 
according to provided
+ * settings. The transform supports both CSV formatted input data and 
unstructured input.
+ *
+ * If the csvHeader property is set, csvDelimiter also should be, else the 
results will be
+ * incorrect. If csvHeader is not set, input is assumed to be unstructured.
+ *
+ * Either inspectTemplateName (String) or inspectConfig {@link 
InspectConfig} need to be set. The
+ * situation is the same with deidentifyTemplateName and deidentifyConfig 
({@link DeidentifyConfig}.
+ *
+ * Batch size defines how big are batches sent to DLP at once in bytes.
+ *
+ * The transform outputs {@link KV} of {@link String} (eg. filename) and 
{@link
+ * DeidentifyContentResponse}, which will contain {@link Table} of results for 
the user to consume.
+ */
+@Experimental
+@AutoValue
+public abstract class DLPDeidentifyText
+extends PTransform<
+PCollection>, PCollection>> {
+
+  @Nullable
+  public abstract String inspectTemplateName();
+
+  @Nullable
+  public abstract String deidentifyTemplateName();
+
+  @Nullable
+  public abstract InspectConfig inspectConfig();
+
+  @Nullable
+  public abstract DeidentifyConfig deidentifyConfig();
+
+  @Nullable
+  public abstract PCollectionView> csvHeader();
+
+  @Nullable
+  public abstract String csvDelimiter();
+
+  public abstract Integer batchSize();
+
+  public abstract String projectId();
+
+  @AutoValue.Builder
+  public abstract static class Builder {
+public abstract Builder setInspectTemplateName(String inspectTemplateName);
+
+public abstract Builder setCsvHeader(PCollectionView> 
csvHeader);
+
+public abstract Builder setCsvDelimiter(String delimiter);
+
+public abstract Builder setBatchSize(Integer batchSize);
+
+public abstract Builder setProjectId(String projectId);
+
+public abstract Builder setDeidentifyTemplateName(String 
deidentifyTemplateName);
+
+public abstract Builder setInspectConfig(InspectConfig inspectConfig);
+
+public abstract Builder setDeidentifyConfig(DeidentifyConfig 
deidentifyConfig);
+
+public abstract DLPDeidentifyText build();
+  }
+
+  public static DLPDeidentifyText.Builder newBuilder() {
+return new AutoValue_DLPDeidentifyText.Builder();
+  }
+
+  /**
+   * The transform batches the contents of input PCollection and then calls 
Cloud DLP service to
+   * perform the deidentification.
+   *
+   * @param input input PCollection
+   * @return PCollection after transformations
+   */
+  @Override
+  public PCollection> expand(
+  PCollection> input) {
+return input
+.apply(ParDo.of(new MapStringToDlpRow(csvDelimiter(
+.apply("Batch Contents", ParDo.of(new BatchRequestForDLP(batchSize(
+.apply(
+"DLPDeidentify",
+ParDo.of(
+new DeidentifyT

[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-21 Thread GitBox


mwalenia commented on a change in pull request #11566:
URL: https://github.com/apache/beam/pull/11566#discussion_r428666183



##
File path: 
sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/MapStringToDlpRow.java
##
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.ml;
+
+import com.google.privacy.dlp.v2.Table;
+import com.google.privacy.dlp.v2.Value;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.values.KV;
+
+class MapStringToDlpRow extends DoFn, KV> {
+  private final String delimiter;
+
+  public MapStringToDlpRow(String delimiter) {
+this.delimiter = delimiter;
+  }
+
+  @ProcessElement
+  public void processElement(ProcessContext context) {

Review comment:
   Ok, will do!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] henryken commented on pull request #11734: [BEAM-9679] Add Core Transforms section / GroupByKey lesson to the Go SDK katas

2020-05-21 Thread GitBox


henryken commented on pull request #11734:
URL: https://github.com/apache/beam/pull/11734#issuecomment-632104571


   Thanks @damondouglas!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kkucharc commented on pull request #11360: [BEAM-9722] added SnowflakeIO with Read operation

2020-05-21 Thread GitBox


kkucharc commented on pull request #11360:
URL: https://github.com/apache/beam/pull/11360#issuecomment-632106649


   I retested failing test - probably the previous one was timeouting. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-21 Thread GitBox


mwalenia commented on a change in pull request #11566:
URL: https://github.com/apache/beam/pull/11566#discussion_r428676595



##
File path: 
sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/DLPReidentifyText.java
##
@@ -0,0 +1,206 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.ml;
+
+import com.google.auto.value.AutoValue;
+import com.google.cloud.dlp.v2.DlpServiceClient;
+import com.google.privacy.dlp.v2.ContentItem;
+import com.google.privacy.dlp.v2.DeidentifyConfig;
+import com.google.privacy.dlp.v2.FieldId;
+import com.google.privacy.dlp.v2.InspectConfig;
+import com.google.privacy.dlp.v2.ProjectName;
+import com.google.privacy.dlp.v2.ReidentifyContentRequest;
+import com.google.privacy.dlp.v2.ReidentifyContentResponse;
+import com.google.privacy.dlp.v2.Table;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A {@link PTransform} connecting to Cloud DLP and inspecting text for 
identifying data according
+ * to provided settings.
+ *
+ * Either inspectTemplateName (String) or inspectConfig {@link 
InspectConfig} need to be set, the
+ * same goes for reidentifyTemplateName or reidentifyConfig.
+ *
+ * Batch size defines how big are batches sent to DLP at once in bytes.
+ */
+@Experimental
+@AutoValue
+public abstract class DLPReidentifyText
+extends PTransform<
+PCollection>, PCollection>> {
+
+  public static final Logger LOG = 
LoggerFactory.getLogger(DLPInspectText.class);
+
+  public static final Integer DLP_PAYLOAD_LIMIT = 52400;
+
+  @Nullable
+  public abstract String inspectTemplateName();
+
+  @Nullable
+  public abstract String reidentifyTemplateName();
+
+  @Nullable
+  public abstract InspectConfig inspectConfig();
+
+  @Nullable
+  public abstract DeidentifyConfig reidentifyConfig();
+
+  @Nullable
+  public abstract String csvDelimiter();
+
+  @Nullable
+  public abstract PCollectionView> csvHeaders();
+
+  public abstract Integer batchSize();
+
+  public abstract String projectId();
+
+  @AutoValue.Builder
+  public abstract static class Builder {
+public abstract Builder setInspectTemplateName(String inspectTemplateName);
+
+public abstract Builder setInspectConfig(InspectConfig inspectConfig);
+
+public abstract Builder setReidentifyConfig(DeidentifyConfig 
deidentifyConfig);
+
+public abstract Builder setReidentifyTemplateName(String 
deidentifyTemplateName);
+
+public abstract Builder setBatchSize(Integer batchSize);
+
+public abstract Builder setCsvHeaders(PCollectionView> 
csvHeaders);
+
+public abstract Builder setCsvDelimiter(String delimiter);
+
+public abstract Builder setProjectId(String projectId);
+
+public abstract DLPReidentifyText build();
+  }
+
+  public static DLPReidentifyText.Builder newBuilder() {
+return new AutoValue_DLPReidentifyText.Builder();
+  }
+
+  @Override
+  public PCollection> expand(
+  PCollection> input) {
+return input
+.apply(ParDo.of(new MapStringToDlpRow(csvDelimiter(
+.apply("Batch Contents", ParDo.of(new BatchRequestForDLP(batchSize(
+.apply(
+"DLPDeidentify",
+ParDo.of(
+new ReidentifyText(
+projectId(),
+inspectTemplateName(),
+reidentifyTemplateName(),
+inspectConfig(),
+reidentifyConfig(),
+csvHeaders(;
+  }
+
+  public static class ReidentifyText
+  extends DoFn>, KV> {
+private final String projectId;
+private final String inspectTemplateName;
+private final String reidentifyTemplateName;
+private final InspectConfig inspectC

[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-21 Thread GitBox


mwalenia commented on a change in pull request #11566:
URL: https://github.com/apache/beam/pull/11566#discussion_r428679955



##
File path: 
sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/DLPDeidentifyText.java
##
@@ -0,0 +1,215 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.ml;
+
+import com.google.auto.value.AutoValue;
+import com.google.cloud.dlp.v2.DlpServiceClient;
+import com.google.privacy.dlp.v2.ContentItem;
+import com.google.privacy.dlp.v2.DeidentifyConfig;
+import com.google.privacy.dlp.v2.DeidentifyContentRequest;
+import com.google.privacy.dlp.v2.DeidentifyContentResponse;
+import com.google.privacy.dlp.v2.FieldId;
+import com.google.privacy.dlp.v2.InspectConfig;
+import com.google.privacy.dlp.v2.ProjectName;
+import com.google.privacy.dlp.v2.Table;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+
+/**
+ * A {@link PTransform} connecting to Cloud DLP and deidentifying text 
according to provided
+ * settings. The transform supports both CSV formatted input data and 
unstructured input.
+ *
+ * If the csvHeader property is set, csvDelimiter also should be, else the 
results will be
+ * incorrect. If csvHeader is not set, input is assumed to be unstructured.
+ *
+ * Either inspectTemplateName (String) or inspectConfig {@link 
InspectConfig} need to be set. The
+ * situation is the same with deidentifyTemplateName and deidentifyConfig 
({@link DeidentifyConfig}.
+ *
+ * Batch size defines how big are batches sent to DLP at once in bytes.
+ *
+ * The transform outputs {@link KV} of {@link String} (eg. filename) and 
{@link
+ * DeidentifyContentResponse}, which will contain {@link Table} of results for 
the user to consume.
+ */
+@Experimental
+@AutoValue
+public abstract class DLPDeidentifyText
+extends PTransform<
+PCollection>, PCollection>> {
+
+  @Nullable
+  public abstract String inspectTemplateName();
+
+  @Nullable
+  public abstract String deidentifyTemplateName();
+
+  @Nullable
+  public abstract InspectConfig inspectConfig();
+
+  @Nullable
+  public abstract DeidentifyConfig deidentifyConfig();
+
+  @Nullable
+  public abstract PCollectionView> csvHeader();
+
+  @Nullable
+  public abstract String csvDelimiter();
+
+  public abstract Integer batchSize();
+
+  public abstract String projectId();
+
+  @AutoValue.Builder
+  public abstract static class Builder {
+public abstract Builder setInspectTemplateName(String inspectTemplateName);
+
+public abstract Builder setCsvHeader(PCollectionView> 
csvHeader);
+
+public abstract Builder setCsvDelimiter(String delimiter);
+
+public abstract Builder setBatchSize(Integer batchSize);
+
+public abstract Builder setProjectId(String projectId);
+
+public abstract Builder setDeidentifyTemplateName(String 
deidentifyTemplateName);
+
+public abstract Builder setInspectConfig(InspectConfig inspectConfig);
+
+public abstract Builder setDeidentifyConfig(DeidentifyConfig 
deidentifyConfig);
+
+public abstract DLPDeidentifyText build();
+  }
+

Review comment:
   Thanks!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mwalenia commented on pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-21 Thread GitBox


mwalenia commented on pull request #11566:
URL: https://github.com/apache/beam/pull/11566#issuecomment-632113183


   @tysonjh , @santhh Thanks for the review, I added the first batch of fixes. 
I'll take care of the Javadocs tomorrow.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kamilwu opened a new pull request #11776: [BEAM-9421] Better documentation of output results from AnnotateText transform

2020-05-21 Thread GitBox


kamilwu opened a new pull request #11776:
URL: https://github.com/apache/beam/pull/11776


   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/l

[GitHub] [beam] reuvenlax merged pull request #11774: [BEAM-1589] Added @OnWindowExpiration annotation.

2020-05-21 Thread GitBox


reuvenlax merged pull request #11774:
URL: https://github.com/apache/beam/pull/11774


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] reuvenlax commented on pull request #11774: [BEAM-1589] Added @OnWindowExpiration annotation.

2020-05-21 Thread GitBox


reuvenlax commented on pull request #11774:
URL: https://github.com/apache/beam/pull/11774#issuecomment-632168553


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] y1chi commented on a change in pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner

2020-05-21 Thread GitBox


y1chi commented on a change in pull request #11756:
URL: https://github.com/apache/beam/pull/11756#discussion_r428754395



##
File path: 
sdks/java/harness/src/test/java/org/apache/beam/fn/harness/FnApiDoFnRunnerTest.java
##
@@ -947,49 +910,213 @@ public void testTimers() throws Exception {
 timerInGlobalWindow("A", new Instant(1600L), new Instant(10012L)),
 timerInGlobalWindow("X", new Instant(1700L), new Instant(10022L)),
 timerInGlobalWindow("C", new Instant(1800L), new Instant(10022L)),
-timerInGlobalWindow("B", new Instant(1900L), new 
Instant(10022L;
+timerInGlobalWindow("B", new Instant(1900L), new Instant(10022L)),
+timerInGlobalWindow("B", new Instant(2000L), new Instant(10032L)),
+timerInGlobalWindow("Y", new Instant(2100L), new 
Instant(10042L;
+assertThat(
+fakeTimerClient.getTimers(eventFamilyTimer),
+contains(
+timerInGlobalWindow("X", "event-timer1", new Instant(1000L), new 
Instant(1003L)),
+timerInGlobalWindow("Y", "event-timer1", new Instant(1100L), new 
Instant(1103L)),
+timerInGlobalWindow("X", "event-timer1", new Instant(1200L), new 
Instant(1203L)),
+timerInGlobalWindow("Y", "event-timer1", new Instant(1300L), new 
Instant(1303L)),
+timerInGlobalWindow("A", "event-timer1", new Instant(1400L), new 
Instant(2413L)),
+timerInGlobalWindow("B", "event-timer1", new Instant(1500L), new 
Instant(2513L)),
+timerInGlobalWindow("A", "event-timer1", new Instant(1600L), new 
Instant(2613L)),
+timerInGlobalWindow("X", "event-timer1", new Instant(1700L), new 
Instant(1723L)),
+timerInGlobalWindow("C", "event-timer1", new Instant(1800L), new 
Instant(1823L)),
+timerInGlobalWindow("B", "event-timer1", new Instant(1900L), new 
Instant(1923L)),
+timerInGlobalWindow("B", "event-timer1", new Instant(2000L), new 
Instant(2033L)),
+timerInGlobalWindow("Y", "event-timer1", new Instant(2100L), new 
Instant(2143L;
+assertThat(
+fakeTimerClient.getTimers(processingFamilyTimer),
+contains(
+timerInGlobalWindow("X", "processing-timer1", new Instant(1000L), 
new Instant(10004L)),
+timerInGlobalWindow("Y", "processing-timer1", new Instant(1100L), 
new Instant(10004L)),
+timerInGlobalWindow("X", "processing-timer1", new Instant(1200L), 
new Instant(10004L)),
+timerInGlobalWindow("Y", "processing-timer1", new Instant(1300L), 
new Instant(10004L)),
+timerInGlobalWindow("A", "processing-timer1", new Instant(1400L), 
new Instant(10014L)),
+timerInGlobalWindow("B", "processing-timer1", new Instant(1500L), 
new Instant(10014L)),
+timerInGlobalWindow("A", "processing-timer1", new Instant(1600L), 
new Instant(10014L)),
+timerInGlobalWindow("X", "processing-timer1", new Instant(1700L), 
new Instant(10024L)),
+timerInGlobalWindow("C", "processing-timer1", new Instant(1800L), 
new Instant(10024L)),
+timerInGlobalWindow("B", "processing-timer1", new Instant(1900L), 
new Instant(10024L)),
+timerInGlobalWindow("B", "processing-timer1", new Instant(2000L), 
new Instant(10034L)),
+timerInGlobalWindow(
+"Y", "processing-timer1", new Instant(2100L), new 
Instant(10044L;
 mainOutputValues.clear();
 
 assertFalse(fakeTimerClient.isOutboundClosed(eventTimer));
 assertFalse(fakeTimerClient.isOutboundClosed(processingTimer));
+assertFalse(fakeTimerClient.isOutboundClosed(eventFamilyTimer));
+assertFalse(fakeTimerClient.isOutboundClosed(processingFamilyTimer));
 fakeTimerClient.closeInbound(eventTimer);
 fakeTimerClient.closeInbound(processingTimer);
+fakeTimerClient.closeInbound(eventFamilyTimer);
+fakeTimerClient.closeInbound(processingFamilyTimer);
 
 Iterables.getOnlyElement(finishFunctionRegistry.getFunctions()).run();
 assertThat(mainOutputValues, empty());
 
 assertTrue(fakeTimerClient.isOutboundClosed(eventTimer));
 assertTrue(fakeTimerClient.isOutboundClosed(processingTimer));
+assertTrue(fakeTimerClient.isOutboundClosed(eventFamilyTimer));
+assertTrue(fakeTimerClient.isOutboundClosed(processingFamilyTimer));
 
 Iterables.getOnlyElement(teardownFunctions).run();
 assertThat(mainOutputValues, empty());
 
 assertEquals(
 ImmutableMap.builder()
 .put(bagUserStateKey("bag", "X"), encode("X0", "X1", "X2", 
"processing"))
-.put(bagUserStateKey("bag", "Y"), encode("Y1", "Y2"))
+.put(bagUserStateKey("bag", "Y"), encode("Y1", "Y2", 
"processing-family"))
 .put(bagUserStateKey("bag", "A"), encode("A0", "event", "event"))
-.put(bagUserStateKey("bag", "B"), encode("event", "processing"))
+.put(bagUserStateKey("bag", "B"), encode("event", "

[GitHub] [beam] mxm opened a new pull request #11777: [BEAM-10054] Fix watermark hold for on_time_pane in DirectRunner

2020-05-21 Thread GitBox


mxm opened a new pull request #11777:
URL: https://github.com/apache/beam/pull/11777


   We have a test pipeline which runs with the DirectRunner. When upgrading 
from 2.18.0 to 2.21.0 the test failed with the following exception:
   
   ```
   tp = Exception('Monitor task detected a pipeline stall.',), value = None, tb 
= None
   
   def raise_(tp, value=None, tb=None):
   """
   A function that matches the Python 2.x ``raise`` statement. This
   allows re-raising exceptions with the cls value and traceback on
   Python 2 and 3.
   """
   if value is not None and isinstance(tp, Exception):
   raise TypeError("instance exception may not have a separate 
value")
   if value is not None:
   exc = tp(value)
   else:
   exc = tp
   if exc.__traceback__ is not tb:
   raise exc.with_traceback(tb)
   >   raise exc
   E   Exception: Monitor task detected a pipeline stall.
   ```
   
   I was able to bisect the error. This commit introduced the failure: 
ea9b1f350b88c2996cafb4d24351869e82857731
   
   The fix lets to the pipeline running correctly.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastComplete

[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632179933


   Run Go PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mxm commented on pull request #11777: [BEAM-10054] Fix watermark hold for on_time_pane in DirectRunner

2020-05-21 Thread GitBox


mxm commented on pull request #11777:
URL: https://github.com/apache/beam/pull/11777#issuecomment-632181118


   R: @rohdesamuel



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mxm commented on pull request #11777: [BEAM-10054] Fix watermark hold for on_time_pane in DirectRunner

2020-05-21 Thread GitBox


mxm commented on pull request #11777:
URL: https://github.com/apache/beam/pull/11777#issuecomment-632181899


   Please have a look @rohdesamuel if you consider the fix valid. I'm not very 
familiar with the Python SDK triggering code.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner

2020-05-21 Thread GitBox


boyuanzz commented on pull request #11756:
URL: https://github.com/apache/beam/pull/11756#issuecomment-632182849


   retest all please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner

2020-05-21 Thread GitBox


boyuanzz commented on pull request #11756:
URL: https://github.com/apache/beam/pull/11756#issuecomment-632185565


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632185540


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632186469


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632186575


   Run Python PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632187939


   Run Python 3.6 PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632187854


   Run Python 2 PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632188019


   Run Python 3.7 PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632188678


   Run Java Spark PortableValidatesRunner Batch



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632190367


   Run Release Gradle Build



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632191832


   Run SQL_Java11 PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ibzib commented on a change in pull request #11764: [BEAM-10048] Clean up release guide.

2020-05-21 Thread GitBox


ibzib commented on a change in pull request #11764:
URL: https://github.com/apache/beam/pull/11764#discussion_r428769910



##
File path: website/www/site/content/en/contribute/release-guide.md
##
@@ -583,189 +582,44 @@ For this step, we recommend you using automation script 
to create a RC, but you
   1. Stage source release into dist.apache.org dev 
[repo](https://dist.apache.org/repos/dist/dev/beam/).
   1. Stage,sign and hash python binaries into dist.apache.ord dev repo python 
dir
   1. Stage SDK docker images to [docker hub Apache 
organization](https://hub.docker.com/search?q=apache%2Fbeam&type=image).
-  1. Create a PR to update beam and beam-site, changes includes:
+  1. Create a PR to update beam-site, changes includes:
  * Copy python doc into beam-site
  * Copy java doc into beam-site
- * Update release version into 
[_config.yml](https://github.com/apache/beam/blob/master/website/_config.yml).
  
  Tasks you need to do manually
-  1. Add new release into `website/src/get-started/downloads.md`.
-  1. Update last release download links in 
`website/src/get-started/downloads.md`.
-  1. Update `website/src/.htaccess` to redirect to the new version.
-  1. Build and stage python wheels.
+  1. Verify the script worked.
+  1. Verify that the source and Python binaries are present in 
[dist.apache.org](https://dist.apache.org/repos/dist/dev/beam).
+  1. Verify Docker images are published. How to find images:
+  1. Visit 
[https://hub.docker.com/u/apache](https://hub.docker.com/search?q=apache%2Fbeam&type=image)
+  2. Visit each repository and navigate to *tags* tab.
+  3. Verify images are pushed with tags: ${RELEASE}_rc{RC_NUM}
+  1. Verify that third party licenses are included in Docker containers by 
logging in to the images.
+  - For Python SDK images, there should be around 80 ~ 100 
dependencies.
+  Please note that dependencies for the SDKs with different Python 
versions vary.
+  Need to verify all Python images by replacing `${ver}` with each 
supported Python version `X.Y`.
+  ```
+  docker run -it --entrypoint=/bin/bash 
apache/beam_python${ver}_sdk:${RELEASE}_rc{RC_NUM}
+  ls -al /opt/apache/beam/third_party_licenses/ | wc -l
+  ```
+  - For Java SDK images, there should be around 1400 dependencies.
+  ```
+  docker run -it --entrypoint=/bin/bash 
apache/beam_java_sdk:${RELEASE}_rc{RC_NUM}
+  ls -al /opt/apache/beam/third_party_licenses/ | wc -l
+  ```
   1. Publish staging artifacts
-  1. Go to the staging repo to close the staging repository on [Apache 
Nexus](https://repository.apache.org/#stagingRepositories). 
+  1. Go to the staging repo to close the staging repository on [Apache 
Nexus](https://repository.apache.org/#stagingRepositories).
   1. When prompted for a description, enter “Apache Beam, version X, 
release candidate Y”.
-
-
-### (Alternative) Run all steps manually

Review comment:
   Because it's mostly an exact copy of build_release_candidate.sh. Any 
part of this block that wasn't part of build_release_candidate.sh was moved 
into the "manual steps" section.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ibzib commented on a change in pull request #11764: [BEAM-10048] Clean up release guide.

2020-05-21 Thread GitBox


ibzib commented on a change in pull request #11764:
URL: https://github.com/apache/beam/pull/11764#discussion_r428770502



##
File path: release/src/main/scripts/build_release_candidate.sh
##
@@ -113,7 +113,7 @@ if [[ $confirmation = "y" ]]; then
   echo "-Staging Java Artifacts into Maven---"

Review comment:
   I wasn't sure the best way to do that. Maybe we can add it in a later PR.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ibzib commented on a change in pull request #11764: [BEAM-10048] Clean up release guide.

2020-05-21 Thread GitBox


ibzib commented on a change in pull request #11764:
URL: https://github.com/apache/beam/pull/11764#discussion_r428774392



##
File path: website/www/site/content/en/contribute/release-guide.md
##
@@ -583,189 +582,44 @@ For this step, we recommend you using automation script 
to create a RC, but you
   1. Stage source release into dist.apache.org dev 
[repo](https://dist.apache.org/repos/dist/dev/beam/).
   1. Stage,sign and hash python binaries into dist.apache.ord dev repo python 
dir
   1. Stage SDK docker images to [docker hub Apache 
organization](https://hub.docker.com/search?q=apache%2Fbeam&type=image).
-  1. Create a PR to update beam and beam-site, changes includes:
+  1. Create a PR to update beam-site, changes includes:
  * Copy python doc into beam-site
  * Copy java doc into beam-site
- * Update release version into 
[_config.yml](https://github.com/apache/beam/blob/master/website/_config.yml).
  
  Tasks you need to do manually
-  1. Add new release into `website/src/get-started/downloads.md`.
-  1. Update last release download links in 
`website/src/get-started/downloads.md`.
-  1. Update `website/src/.htaccess` to redirect to the new version.
-  1. Build and stage python wheels.
+  1. Verify the script worked.
+  1. Verify that the source and Python binaries are present in 
[dist.apache.org](https://dist.apache.org/repos/dist/dev/beam).
+  1. Verify Docker images are published. How to find images:
+  1. Visit 
[https://hub.docker.com/u/apache](https://hub.docker.com/search?q=apache%2Fbeam&type=image)
+  2. Visit each repository and navigate to *tags* tab.
+  3. Verify images are pushed with tags: ${RELEASE}_rc{RC_NUM}
+  1. Verify that third party licenses are included in Docker containers by 
logging in to the images.
+  - For Python SDK images, there should be around 80 ~ 100 
dependencies.
+  Please note that dependencies for the SDKs with different Python 
versions vary.
+  Need to verify all Python images by replacing `${ver}` with each 
supported Python version `X.Y`.
+  ```
+  docker run -it --entrypoint=/bin/bash 
apache/beam_python${ver}_sdk:${RELEASE}_rc{RC_NUM}
+  ls -al /opt/apache/beam/third_party_licenses/ | wc -l
+  ```
+  - For Java SDK images, there should be around 1400 dependencies.
+  ```
+  docker run -it --entrypoint=/bin/bash 
apache/beam_java_sdk:${RELEASE}_rc{RC_NUM}
+  ls -al /opt/apache/beam/third_party_licenses/ | wc -l
+  ```
   1. Publish staging artifacts
-  1. Go to the staging repo to close the staging repository on [Apache 
Nexus](https://repository.apache.org/#stagingRepositories). 
+  1. Go to the staging repo to close the staging repository on [Apache 
Nexus](https://repository.apache.org/#stagingRepositories).

Review comment:
   I added more detailed instructions.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] apilloud commented on a change in pull request #11682: [BEAM-9946] | added new api in Partition Transform

2020-05-21 Thread GitBox


apilloud commented on a change in pull request #11682:
URL: https://github.com/apache/beam/pull/11682#discussion_r428778641



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Partition.java
##
@@ -85,7 +141,14 @@
* @throws IllegalArgumentException if {@code numPartitions <= 0}
*/
   public static  Partition of(int numPartitions, PartitionFn 
partitionFn) {
-return new Partition<>(new PartitionDoFn(numPartitions, partitionFn));
+
+Contextful ctfFn =
+Contextful.fn(
+(T element, Contextful.Fn.Context c) ->
+partitionFn.partitionFor(element, numPartitions),
+Requirements.empty());
+Object aClass = partitionFn;

Review comment:
   The statement `Object aClass = partitionFn;` has no effect. You can just 
pass `partitionFn` directly into the function.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mxm commented on a change in pull request #11777: [BEAM-10054] Fix watermark hold for on_time_pane

2020-05-21 Thread GitBox


mxm commented on a change in pull request #11777:
URL: https://github.com/apache/beam/pull/11777#discussion_r428778698



##
File path: sdks/python/apache_beam/transforms/trigger.py
##
@@ -1368,7 +1368,7 @@ def _output(
 if timestamp is None:
   # If no watermark hold was set, output at end of window.
   timestamp = window.max_timestamp()
-elif input_watermark < window.end and self.trigger_fn.has_ontime_pane():
+elif output_watermark < window.end and self.trigger_fn.has_ontime_pane():

Review comment:
   Note that this is the fix in question. Please check @rohdesamuel if that 
makes sense.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] apilloud commented on pull request #11682: [BEAM-9946] | added new api in Partition Transform

2020-05-21 Thread GitBox


apilloud commented on pull request #11682:
URL: https://github.com/apache/beam/pull/11682#issuecomment-632218345


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts

2020-05-21 Thread GitBox


chamikaramj commented on pull request #11771:
URL: https://github.com/apache/beam/pull/11771#issuecomment-632219614


   WDYT about https://github.com/ihji/beam/pull/1 ?
   (only change is to Environments.java other changes should go away if you 
rebase)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj merged pull request #11757: [BEAM-8019] Clarifies Dataflow execution environment model

2020-05-21 Thread GitBox


chamikaramj merged pull request #11757:
URL: https://github.com/apache/beam/pull/11757


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb merged pull request #11503: [BEAM-9692] Make GroupByKey into a primitive

2020-05-21 Thread GitBox


robertwb merged pull request #11503:
URL: https://github.com/apache/beam/pull/11503


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11521: [BEAM-9577] Update Java Runners to handle dependency-based artifact staging.

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11521:
URL: https://github.com/apache/beam/pull/11521#issuecomment-632237798


   @robertwb I think this broke Java PVR Spark Batch. First failure is here: 
https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/2887/ Not 
sure if there is a jira already



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11521: [BEAM-9577] Update Java Runners to handle dependency-based artifact staging.

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11521:
URL: https://github.com/apache/beam/pull/11521#issuecomment-632241122


   [BEAM-9971](https://issues.apache.org/jira/browse/BEAM-9971) was filed to 
track



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] amaliujia commented on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-21 Thread GitBox


amaliujia commented on pull request #11610:
URL: https://github.com/apache/beam/pull/11610#issuecomment-632248973


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] amaliujia commented on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-21 Thread GitBox


amaliujia commented on pull request #11610:
URL: https://github.com/apache/beam/pull/11610#issuecomment-632249476


   run Java Precommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] amaliujia commented on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms

2020-05-21 Thread GitBox


amaliujia commented on pull request #11610:
URL: https://github.com/apache/beam/pull/11610#issuecomment-632249853


   Tests retriggered. 
   
   And it becomes a really big PR :-) Nice work!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner

2020-05-21 Thread GitBox


boyuanzz commented on pull request #11756:
URL: https://github.com/apache/beam/pull/11756#issuecomment-632250883


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner

2020-05-21 Thread GitBox


boyuanzz commented on pull request #11756:
URL: https://github.com/apache/beam/pull/11756#issuecomment-632251140


   Run Java Flink PortableValidatesRunner Streaming



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner

2020-05-21 Thread GitBox


boyuanzz commented on pull request #11756:
URL: https://github.com/apache/beam/pull/11756#issuecomment-632250772


   retest all please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner

2020-05-21 Thread GitBox


boyuanzz commented on pull request #11756:
URL: https://github.com/apache/beam/pull/11756#issuecomment-632251027


   Run Java Flink PortableValidatesRunner Batch



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ihji commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts

2020-05-21 Thread GitBox


ihji commented on pull request #11771:
URL: https://github.com/apache/beam/pull/11771#issuecomment-632251342


   > WDYT about [ihji#1](https://github.com/ihji/beam/pull/1) ?
   > (only change is to Environments.java other changes should go away if you 
rebase)
   
   It will add 36 characters + the length of file name. Is there any concern 
about the maximum length of GCS object name? I think it's okay if GCS allows 
enough space in object names.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] veblush opened a new pull request #11778: [BEAM-8889][release-2.22.0] Upgrades gcsio to 2.1.3

2020-05-21 Thread GitBox


veblush opened a new pull request #11778:
URL: https://github.com/apache/beam/pull/11778


   Backport of #11651
   
   CC: @theneuralbit
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://b

[GitHub] [beam] TheNeuralBit commented on pull request #11778: [BEAM-8889][release-2.22.0] Upgrades gcsio to 2.1.3

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11778:
URL: https://github.com/apache/beam/pull/11778#issuecomment-632260642







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] tedromer opened a new pull request #11779: [BEAM-10055] Add --region to python examples where it was missing

2020-05-21 Thread GitBox


tedromer opened a new pull request #11779:
URL: https://github.com/apache/beam/pull/11779


   Add --region to python examples where it was missing.
   
   R: ibzib



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #8457: [BEAM-3342] Create a Cloud Bigtable IO connector for Python

2020-05-21 Thread GitBox


aaltay commented on pull request #8457:
URL: https://github.com/apache/beam/pull/8457#issuecomment-632270901


   There are still failing tests on https://github.com/apache/beam/pull/11295. 
@mf2199 - What is the next step for this PR?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #11580: [BEAM-9861] Reject fractional values outside of (0.0, 1.0)

2020-05-21 Thread GitBox


aaltay commented on pull request #11580:
URL: https://github.com/apache/beam/pull/11580#issuecomment-632271494


   @kmjung - What is the next step for this PR?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts

2020-05-21 Thread GitBox


chamikaramj commented on pull request #11771:
URL: https://github.com/apache/beam/pull/11771#issuecomment-632271987







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts

2020-05-21 Thread GitBox


chamikaramj commented on pull request #11771:
URL: https://github.com/apache/beam/pull/11771#issuecomment-632272181


   Retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kmjung commented on pull request #11580: [BEAM-9861] Reject fractional values outside of (0.0, 1.0)

2020-05-21 Thread GitBox


kmjung commented on pull request #11580:
URL: https://github.com/apache/beam/pull/11580#issuecomment-632272486


   The next step is to merge this PR, I believe. I don't believe there's 
anything blocking other than a review from @chamikaramj.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] udim commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …

2020-05-21 Thread GitBox


udim commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r428838484



##
File path: sdks/python/test-suites/dataflow/common.gradle
##
@@ -109,4 +109,21 @@ task validatesRunnerStreamingTests {
   args '-c', ". ${envdir}/bin/activate && 
${runScriptsDir}/run_integration_test.sh $cmdArgs"
 }
   }
-}
\ No newline at end of file
+}
+
+task runPerformanceTest {
+dependsOn 'installGcpTest'
+dependsOn ':sdks:python:sdist'
+
+def test = project.findProperty('test')
+def testOpts = project.findProperty('test-pipeline-options')
+testOpts += " 
--sdk_location=${files(configurations.distTarBall.files).singleFile}"
+
+  doLast {
+exec {
+  workingDir "${project.rootDir}/sdks/python"
+  executable 'sh'
+  args '-c', ". ${envdir}/bin/activate && ${envdir}/bin/python setup.py 
nosetests --tests=${test}  --test-pipeline-options=\"${testOpts}\" 
--ignore-files \'.*py3\\d?\\.py\$\'"

Review comment:
   I don't believe we have `--test-pipeline-options` support yet in pytest, 
so nose is the solution for now.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #11086: [BEAM-8910] Make custom BQ source read from Avro

2020-05-21 Thread GitBox


pabloem commented on pull request #11086:
URL: https://github.com/apache/beam/pull/11086#issuecomment-632277003


   Run Python2_PVR_Flink PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts

2020-05-21 Thread GitBox


chamikaramj commented on pull request #11771:
URL: https://github.com/apache/beam/pull/11771#issuecomment-632278144


   Retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts

2020-05-21 Thread GitBox


chamikaramj commented on pull request #11771:
URL: https://github.com/apache/beam/pull/11771#issuecomment-632279237


   Retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aijamalnk opened a new pull request #11780: [WIP] Uploading mascot to the website

2020-05-21 Thread GitBox


aijamalnk opened a new pull request #11780:
URL: https://github.com/apache/beam/pull/11780


   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build

[GitHub] [beam] ibzib merged pull request #11764: [BEAM-10048] Clean up release guide.

2020-05-21 Thread GitBox


ibzib merged pull request #11764:
URL: https://github.com/apache/beam/pull/11764


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632296307


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #11780: [WIP] Uploading mascot to the website

2020-05-21 Thread GitBox


pabloem commented on pull request #11780:
URL: https://github.com/apache/beam/pull/11780#issuecomment-632296533


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] rohdesamuel commented on pull request #11765: [BEAM-9322] Remove passthrough_pcollection_output_ids and force_generated_pcollection_output_ids flags

2020-05-21 Thread GitBox


rohdesamuel commented on pull request #11765:
URL: https://github.com/apache/beam/pull/11765#issuecomment-632299883


   R: @robertwb 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] rohdesamuel commented on pull request #11745: Add to/from_runner_api_parameters to WriteToBigQuery

2020-05-21 Thread GitBox


rohdesamuel commented on pull request #11745:
URL: https://github.com/apache/beam/pull/11745#issuecomment-632301124


   R: @pabloem thanks Pablo! 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik opened a new pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


lukecwik opened a new pull request #11781:
URL: https://github.com/apache/beam/pull/11781


   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/

[GitHub] [beam] lukecwik commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


lukecwik commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632303098


   R: @ihji @chamikaramj 
   CC: @TheNeuralBit (cherry pick into 2.22)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632304060


   Is there any test coverage of this?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lostluck opened a new pull request #11782: [BEAM-10056] Fix validation for struct CoGBKs

2020-05-21 Thread GitBox


lostluck opened a new pull request #11782:
URL: https://github.com/apache/beam/pull/11782


   We were a bit too strict for CoGBKs with multiple value streams with 
Structural DoFns. This PR expands the validation to support precise validation 
at construction time, and relaxes the execution time validation.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.a

[GitHub] [beam] lostluck commented on pull request #11782: [BEAM-10056] Fix validation for struct CoGBKs

2020-05-21 Thread GitBox


lostluck commented on pull request #11782:
URL: https://github.com/apache/beam/pull/11782#issuecomment-632304959


   R: @youngoli 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11778: [BEAM-8889][release-2.22.0] Upgrades gcsio to 2.1.3

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11778:
URL: https://github.com/apache/beam/pull/11778#issuecomment-632309156


   Run Python2_PVR_Flink PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on a change in pull request #11086: [BEAM-8910] Make custom BQ source read from Avro

2020-05-21 Thread GitBox


pabloem commented on a change in pull request #11086:
URL: https://github.com/apache/beam/pull/11086#discussion_r428877884



##
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##
@@ -610,7 +611,8 @@ def __init__(
   coder=None,
   use_standard_sql=False,
   flatten_results=True,
-  kms_key=None):
+  kms_key=None,
+  use_json_exports=False):

Review comment:
   Restarting this discussion after some rebasing and preparing...
   
   I think I'd prefer to allow users to choose the export file format. This is 
what we do for writing to BQ in Java and Python. Allowing users to choose 
output type for specific column types would add overhead of (if avro and 
json_format_bytes: formatbytes; if avro and json_datetime_format: 
formatdatetime; if not avro and not json_format_bytes: formatbytes; ...).
   
   We can discourage users from using JSON (AVRO is already the default) - and 
eventually stop supporting it if that makes sense. Thoughts?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #11682: [BEAM-9946] | added new api in Partition Transform

2020-05-21 Thread GitBox


aaltay commented on pull request #11682:
URL: https://github.com/apache/beam/pull/11682#issuecomment-632313052


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #11682: [BEAM-9946] | added new api in Partition Transform

2020-05-21 Thread GitBox


aaltay commented on pull request #11682:
URL: https://github.com/apache/beam/pull/11682#issuecomment-632313262


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


chamikaramj commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632314521


   I suspect test coverage will be internal to Dataflow (hence during import).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


chamikaramj commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632314627


   I'm doing a manual validation for now.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632318843


   Got it, thanks



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-632327533


   Something just occurred to me - are there any tests that use the DATE Type 
in an aggregation (e.g. MAX)?
   
   I'd think that would run into the same issue I have in #11456 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit edited a comment on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType

2020-05-21 Thread GitBox


TheNeuralBit edited a comment on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-632327533


   Something just occurred to me - are there any tests that use the DATE Type 
in an aggregation (e.g. MAX)?
   
   I'd think that would run into the same issue I have in #11456 (processing 
logical types using their representation)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] apilloud commented on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType

2020-05-21 Thread GitBox


apilloud commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-632330703


   Interesting question. You should probably add a test for JOIN as well, which 
will have a similar class of problems.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robinyqiu commented on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType

2020-05-21 Thread GitBox


robinyqiu commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-632331823


   > are there any tests that use the DATE Type in an aggregation (e.g. MAX)?
   
   No. Thanks for bringing this up. I think it is likely to run into the 
problem.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] reuvenlax commented on a change in pull request #11456: [BEAM-7554] Add MillisInstant logical type to replace DATETIME

2020-05-21 Thread GitBox


reuvenlax commented on a change in pull request #11456:
URL: https://github.com/apache/beam/pull/11456#discussion_r428901750



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/logicaltypes/MillisInstant.java
##
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.schemas.logicaltypes;
+
+import org.joda.time.Instant;
+import org.joda.time.ReadableInstant;
+
+/** A timestamp represented as milliseconds since the epoch. */
+public class MillisInstant extends MillisType {
+  public static final String IDENTIFIER = 
"beam:logical_type:millis_instant:v1";

Review comment:
   A timestamp type seems like it's by definition time-zone agnostic. If we 
want a datetime type, that should be a different type.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632335607


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11778: [BEAM-8889][release-2.22.0] Upgrades gcsio to 2.1.3

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11778:
URL: https://github.com/apache/beam/pull/11778#issuecomment-632335448


   Run Python PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch

2020-05-21 Thread GitBox


TheNeuralBit commented on pull request #11770:
URL: https://github.com/apache/beam/pull/11770#issuecomment-632335832


   Run Python 3.7 PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


lukecwik commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632342320


   Test coverage is by existing IOs that enable these features which we don't 
have enough of in Beam (since it requires portable runners to implement SDF to 
a greater extent then they do) so supplement it with internal testing in Google.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.

2020-05-21 Thread GitBox


chamikaramj commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632347848


   LGTM. Thanks.
   
   Ran the Kafka test few times and it seems to be working.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




  1   2   >