[GitHub] [beam] ihji opened a new pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts
ihji opened a new pull request #11771: URL: https://github.com/apache/beam/pull/11771 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/last
[GitHub] [beam] ihji commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts
ihji commented on pull request #11771: URL: https://github.com/apache/beam/pull/11771#issuecomment-631950394 R: @chamikaramj This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kamilwu commented on pull request #11760: [BEAM-10043] Fix grammar / spelling in language-switch.js
kamilwu commented on pull request #11760: URL: https://github.com/apache/beam/pull/11760#issuecomment-631967204 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kamilwu commented on pull request #11760: [BEAM-10043] Fix grammar / spelling in language-switch.js
kamilwu commented on pull request #11760: URL: https://github.com/apache/beam/pull/11760#issuecomment-631976285 Thanks @epicfaace! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kamilwu merged pull request #11760: [BEAM-10043] Fix grammar / spelling in language-switch.js
kamilwu merged pull request #11760: URL: https://github.com/apache/beam/pull/11760 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kkucharc commented on pull request #11360: [BEAM-9722] added SnowflakeIO with Read operation
kkucharc commented on pull request #11360: URL: https://github.com/apache/beam/pull/11360#issuecomment-631990753 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mxm opened a new pull request #11772: [BEAM-9930] Update blog post for new Beam Summit Digital dates
mxm opened a new pull request #11772: URL: https://github.com/apache/beam/pull/11772 Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_
[GitHub] [beam] DariuszAniszewski commented on pull request #11360: [BEAM-9722] added SnowflakeIO with Read operation
DariuszAniszewski commented on pull request #11360: URL: https://github.com/apache/beam/pull/11360#issuecomment-632019864 Just a small comment about the force-push from above - it was mistakenly done, then reverted. HEAD of this branch is still on **3ba192a** and comment is a leftover. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms
mwalenia commented on a change in pull request #11566: URL: https://github.com/apache/beam/pull/11566#discussion_r428605560 ## File path: sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/BatchRequestForDLP.java ## @@ -0,0 +1,101 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.extensions.ml; + +import com.google.privacy.dlp.v2.Table; +import java.util.ArrayList; +import java.util.List; +import java.util.concurrent.atomic.AtomicInteger; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.metrics.Counter; +import org.apache.beam.sdk.metrics.Metrics; +import org.apache.beam.sdk.state.BagState; +import org.apache.beam.sdk.state.StateSpec; +import org.apache.beam.sdk.state.StateSpecs; +import org.apache.beam.sdk.state.TimeDomain; +import org.apache.beam.sdk.state.Timer; +import org.apache.beam.sdk.state.TimerSpec; +import org.apache.beam.sdk.state.TimerSpecs; +import org.apache.beam.sdk.transforms.DoFn; +import org.apache.beam.sdk.transforms.windowing.BoundedWindow; +import org.apache.beam.sdk.values.KV; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * DoFn batching the input PCollection into bigger requests in order to better utilize the Cloud DLP + * service. + */ +@Experimental +class BatchRequestForDLP extends DoFn, KV>> { + public static final Logger LOG = LoggerFactory.getLogger(BatchRequestForDLP.class); + + private final Counter numberOfRowsBagged = + Metrics.counter(BatchRequestForDLP.class, "numberOfRowsBagged"); + private final Integer batchSize; + + @StateId("elementsBag") + private final StateSpec>> elementsBag = StateSpecs.bag(); + + @TimerId("eventTimer") + private final TimerSpec eventTimer = TimerSpecs.timer(TimeDomain.EVENT_TIME); + + public BatchRequestForDLP(Integer batchSize) { +this.batchSize = batchSize; + } + + @ProcessElement + public void process( + @Element KV element, + @StateId("elementsBag") BagState> elementsBag, + @TimerId("eventTimer") Timer eventTimer, + BoundedWindow w) { +elementsBag.add(element); +eventTimer.set(w.maxTimestamp()); + } + + @OnTimer("eventTimer") + public void onTimer( + @StateId("elementsBag") BagState> elementsBag, + OutputReceiver>> output) { +String key = elementsBag.read().iterator().next().getKey(); +AtomicInteger bufferSize = new AtomicInteger(); +List rows = new ArrayList<>(); +elementsBag +.read() +.forEach( +element -> { + int elementSize = element.getValue().getSerializedSize(); + boolean clearBuffer = bufferSize.intValue() + elementSize > batchSize; + if (clearBuffer) { +numberOfRowsBagged.inc(rows.size()); +LOG.debug("Clear Buffer {} , Key {}", bufferSize.intValue(), element.getKey()); Review comment: Sure, I'll move it and add units. Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms
mwalenia commented on a change in pull request #11566: URL: https://github.com/apache/beam/pull/11566#discussion_r428606587 ## File path: sdks/java/extensions/ml/src/test/java/org/apache/beam/sdk/extensions/ml/DLPTextOperationsIT.java ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.extensions.ml; + +import static org.junit.Assert.assertTrue; + +import com.google.privacy.dlp.v2.CharacterMaskConfig; +import com.google.privacy.dlp.v2.DeidentifyConfig; +import com.google.privacy.dlp.v2.DeidentifyContentResponse; +import com.google.privacy.dlp.v2.Finding; +import com.google.privacy.dlp.v2.InfoType; +import com.google.privacy.dlp.v2.InfoTypeTransformations; +import com.google.privacy.dlp.v2.InspectConfig; +import com.google.privacy.dlp.v2.InspectContentResponse; +import com.google.privacy.dlp.v2.Likelihood; +import com.google.privacy.dlp.v2.PrimitiveTransformation; +import java.util.ArrayList; +import java.util.List; +import org.apache.beam.sdk.extensions.gcp.options.GcpOptions; +import org.apache.beam.sdk.testing.PAssert; +import org.apache.beam.sdk.testing.TestPipeline; +import org.apache.beam.sdk.transforms.Create; +import org.apache.beam.sdk.transforms.SerializableFunction; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.junit.Rule; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +@RunWith(JUnit4.class) +public class DLPTextOperationsIT { + @Rule public TestPipeline testPipeline = TestPipeline.create(); + + private static final String IDENTIFYING_TEXT = "mary@example.com"; + private static InfoType emailAddress = InfoType.newBuilder().setName("EMAIL_ADDRESS").build(); + private static final InspectConfig inspectConfig = + InspectConfig.newBuilder() + .addInfoTypes(emailAddress) + .setMinLikelihood(Likelihood.LIKELY) + .build(); + + @Test + public void inspectsText() { +String projectId = testPipeline.getOptions().as(GcpOptions.class).getProject(); +PCollection> inspectionResult = +testPipeline +.apply(Create.of(KV.of("", IDENTIFYING_TEXT))) +.apply( +DLPInspectText.newBuilder() +.setBatchSize(52400) Review comment: @santhh Is it 52400 or 524000? It should be the latter, since we're setting the limit to 524 kb, I think. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] harriskistpay opened a new pull request #11773: [BEAM-1589] Added @OnWindowExpiration annotation.
harriskistpay opened a new pull request #11773: URL: https://github.com/apache/beam/pull/11773 Updated CHANGES.md file Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.o
[GitHub] [beam] harriskistpay closed pull request #11773: [BEAM-1589] Added @OnWindowExpiration annotation.
harriskistpay closed pull request #11773: URL: https://github.com/apache/beam/pull/11773 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rehmanmuradali opened a new pull request #11774: [BEAM-1589] Added @OnWindowExpiration annotation.
rehmanmuradali opened a new pull request #11774: URL: https://github.com/apache/beam/pull/11774 Updated CHANGES.md Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/j
[GitHub] [beam] rehmanmuradali commented on pull request #11774: [BEAM-1589] Added @OnWindowExpiration annotation.
rehmanmuradali commented on pull request #11774: URL: https://github.com/apache/beam/pull/11774#issuecomment-632067025 R: @reuvenlax This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mwalenia opened a new pull request #11775: [BEAM-10050] Change labels checked in VideoIntelligenceIT
mwalenia opened a new pull request #11775: URL: https://github.com/apache/beam/pull/11775 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms
mwalenia commented on a change in pull request #11566: URL: https://github.com/apache/beam/pull/11566#discussion_r428661780 ## File path: sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/DLPDeidentifyText.java ## @@ -0,0 +1,215 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.extensions.ml; + +import com.google.auto.value.AutoValue; +import com.google.cloud.dlp.v2.DlpServiceClient; +import com.google.privacy.dlp.v2.ContentItem; +import com.google.privacy.dlp.v2.DeidentifyConfig; +import com.google.privacy.dlp.v2.DeidentifyContentRequest; +import com.google.privacy.dlp.v2.DeidentifyContentResponse; +import com.google.privacy.dlp.v2.FieldId; +import com.google.privacy.dlp.v2.InspectConfig; +import com.google.privacy.dlp.v2.ProjectName; +import com.google.privacy.dlp.v2.Table; +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; +import java.util.stream.Collectors; +import javax.annotation.Nullable; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.transforms.DoFn; +import org.apache.beam.sdk.transforms.PTransform; +import org.apache.beam.sdk.transforms.ParDo; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionView; + +/** + * A {@link PTransform} connecting to Cloud DLP and deidentifying text according to provided + * settings. The transform supports both CSV formatted input data and unstructured input. + * + * If the csvHeader property is set, csvDelimiter also should be, else the results will be + * incorrect. If csvHeader is not set, input is assumed to be unstructured. + * + * Either inspectTemplateName (String) or inspectConfig {@link InspectConfig} need to be set. The + * situation is the same with deidentifyTemplateName and deidentifyConfig ({@link DeidentifyConfig}. + * + * Batch size defines how big are batches sent to DLP at once in bytes. + * + * The transform outputs {@link KV} of {@link String} (eg. filename) and {@link + * DeidentifyContentResponse}, which will contain {@link Table} of results for the user to consume. + */ +@Experimental +@AutoValue +public abstract class DLPDeidentifyText +extends PTransform< +PCollection>, PCollection>> { + + @Nullable + public abstract String inspectTemplateName(); + + @Nullable + public abstract String deidentifyTemplateName(); + + @Nullable + public abstract InspectConfig inspectConfig(); + + @Nullable + public abstract DeidentifyConfig deidentifyConfig(); + + @Nullable + public abstract PCollectionView> csvHeader(); + + @Nullable + public abstract String csvDelimiter(); + + public abstract Integer batchSize(); + + public abstract String projectId(); + + @AutoValue.Builder + public abstract static class Builder { +public abstract Builder setInspectTemplateName(String inspectTemplateName); + +public abstract Builder setCsvHeader(PCollectionView> csvHeader); + +public abstract Builder setCsvDelimiter(String delimiter); + +public abstract Builder setBatchSize(Integer batchSize); + +public abstract Builder setProjectId(String projectId); + +public abstract Builder setDeidentifyTemplateName(String deidentifyTemplateName); + +public abstract Builder setInspectConfig(InspectConfig inspectConfig); + +public abstract Builder setDeidentifyConfig(DeidentifyConfig deidentifyConfig); + +public abstract DLPDeidentifyText build(); + } + + public static DLPDeidentifyText.Builder newBuilder() { +return new AutoValue_DLPDeidentifyText.Builder(); + } + + /** + * The transform batches the contents of input PCollection and then calls Cloud DLP service to + * perform the deidentification. + * + * @param input input PCollection + * @return PCollection after transformations + */ + @Override + public PCollection> expand( + PCollection> input) { +return input +.apply(ParDo.of(new MapStringToDlpRow(csvDelimiter( +.apply("Batch Contents", ParDo.of(new BatchRequestForDLP(batchSize( +.apply( +"DLPDeidentify", +ParDo.of( +new DeidentifyT
[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms
mwalenia commented on a change in pull request #11566: URL: https://github.com/apache/beam/pull/11566#discussion_r428666183 ## File path: sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/MapStringToDlpRow.java ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.extensions.ml; + +import com.google.privacy.dlp.v2.Table; +import com.google.privacy.dlp.v2.Value; +import java.util.Arrays; +import java.util.List; +import java.util.Objects; +import org.apache.beam.sdk.transforms.DoFn; +import org.apache.beam.sdk.values.KV; + +class MapStringToDlpRow extends DoFn, KV> { + private final String delimiter; + + public MapStringToDlpRow(String delimiter) { +this.delimiter = delimiter; + } + + @ProcessElement + public void processElement(ProcessContext context) { Review comment: Ok, will do! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] henryken commented on pull request #11734: [BEAM-9679] Add Core Transforms section / GroupByKey lesson to the Go SDK katas
henryken commented on pull request #11734: URL: https://github.com/apache/beam/pull/11734#issuecomment-632104571 Thanks @damondouglas! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kkucharc commented on pull request #11360: [BEAM-9722] added SnowflakeIO with Read operation
kkucharc commented on pull request #11360: URL: https://github.com/apache/beam/pull/11360#issuecomment-632106649 I retested failing test - probably the previous one was timeouting. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms
mwalenia commented on a change in pull request #11566: URL: https://github.com/apache/beam/pull/11566#discussion_r428676595 ## File path: sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/DLPReidentifyText.java ## @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.extensions.ml; + +import com.google.auto.value.AutoValue; +import com.google.cloud.dlp.v2.DlpServiceClient; +import com.google.privacy.dlp.v2.ContentItem; +import com.google.privacy.dlp.v2.DeidentifyConfig; +import com.google.privacy.dlp.v2.FieldId; +import com.google.privacy.dlp.v2.InspectConfig; +import com.google.privacy.dlp.v2.ProjectName; +import com.google.privacy.dlp.v2.ReidentifyContentRequest; +import com.google.privacy.dlp.v2.ReidentifyContentResponse; +import com.google.privacy.dlp.v2.Table; +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; +import java.util.stream.Collectors; +import javax.annotation.Nullable; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.transforms.DoFn; +import org.apache.beam.sdk.transforms.PTransform; +import org.apache.beam.sdk.transforms.ParDo; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionView; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * A {@link PTransform} connecting to Cloud DLP and inspecting text for identifying data according + * to provided settings. + * + * Either inspectTemplateName (String) or inspectConfig {@link InspectConfig} need to be set, the + * same goes for reidentifyTemplateName or reidentifyConfig. + * + * Batch size defines how big are batches sent to DLP at once in bytes. + */ +@Experimental +@AutoValue +public abstract class DLPReidentifyText +extends PTransform< +PCollection>, PCollection>> { + + public static final Logger LOG = LoggerFactory.getLogger(DLPInspectText.class); + + public static final Integer DLP_PAYLOAD_LIMIT = 52400; + + @Nullable + public abstract String inspectTemplateName(); + + @Nullable + public abstract String reidentifyTemplateName(); + + @Nullable + public abstract InspectConfig inspectConfig(); + + @Nullable + public abstract DeidentifyConfig reidentifyConfig(); + + @Nullable + public abstract String csvDelimiter(); + + @Nullable + public abstract PCollectionView> csvHeaders(); + + public abstract Integer batchSize(); + + public abstract String projectId(); + + @AutoValue.Builder + public abstract static class Builder { +public abstract Builder setInspectTemplateName(String inspectTemplateName); + +public abstract Builder setInspectConfig(InspectConfig inspectConfig); + +public abstract Builder setReidentifyConfig(DeidentifyConfig deidentifyConfig); + +public abstract Builder setReidentifyTemplateName(String deidentifyTemplateName); + +public abstract Builder setBatchSize(Integer batchSize); + +public abstract Builder setCsvHeaders(PCollectionView> csvHeaders); + +public abstract Builder setCsvDelimiter(String delimiter); + +public abstract Builder setProjectId(String projectId); + +public abstract DLPReidentifyText build(); + } + + public static DLPReidentifyText.Builder newBuilder() { +return new AutoValue_DLPReidentifyText.Builder(); + } + + @Override + public PCollection> expand( + PCollection> input) { +return input +.apply(ParDo.of(new MapStringToDlpRow(csvDelimiter( +.apply("Batch Contents", ParDo.of(new BatchRequestForDLP(batchSize( +.apply( +"DLPDeidentify", +ParDo.of( +new ReidentifyText( +projectId(), +inspectTemplateName(), +reidentifyTemplateName(), +inspectConfig(), +reidentifyConfig(), +csvHeaders(; + } + + public static class ReidentifyText + extends DoFn>, KV> { +private final String projectId; +private final String inspectTemplateName; +private final String reidentifyTemplateName; +private final InspectConfig inspectC
[GitHub] [beam] mwalenia commented on a change in pull request #11566: [BEAM-9723] Add DLP integration transforms
mwalenia commented on a change in pull request #11566: URL: https://github.com/apache/beam/pull/11566#discussion_r428679955 ## File path: sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/DLPDeidentifyText.java ## @@ -0,0 +1,215 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.extensions.ml; + +import com.google.auto.value.AutoValue; +import com.google.cloud.dlp.v2.DlpServiceClient; +import com.google.privacy.dlp.v2.ContentItem; +import com.google.privacy.dlp.v2.DeidentifyConfig; +import com.google.privacy.dlp.v2.DeidentifyContentRequest; +import com.google.privacy.dlp.v2.DeidentifyContentResponse; +import com.google.privacy.dlp.v2.FieldId; +import com.google.privacy.dlp.v2.InspectConfig; +import com.google.privacy.dlp.v2.ProjectName; +import com.google.privacy.dlp.v2.Table; +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; +import java.util.stream.Collectors; +import javax.annotation.Nullable; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.transforms.DoFn; +import org.apache.beam.sdk.transforms.PTransform; +import org.apache.beam.sdk.transforms.ParDo; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionView; + +/** + * A {@link PTransform} connecting to Cloud DLP and deidentifying text according to provided + * settings. The transform supports both CSV formatted input data and unstructured input. + * + * If the csvHeader property is set, csvDelimiter also should be, else the results will be + * incorrect. If csvHeader is not set, input is assumed to be unstructured. + * + * Either inspectTemplateName (String) or inspectConfig {@link InspectConfig} need to be set. The + * situation is the same with deidentifyTemplateName and deidentifyConfig ({@link DeidentifyConfig}. + * + * Batch size defines how big are batches sent to DLP at once in bytes. + * + * The transform outputs {@link KV} of {@link String} (eg. filename) and {@link + * DeidentifyContentResponse}, which will contain {@link Table} of results for the user to consume. + */ +@Experimental +@AutoValue +public abstract class DLPDeidentifyText +extends PTransform< +PCollection>, PCollection>> { + + @Nullable + public abstract String inspectTemplateName(); + + @Nullable + public abstract String deidentifyTemplateName(); + + @Nullable + public abstract InspectConfig inspectConfig(); + + @Nullable + public abstract DeidentifyConfig deidentifyConfig(); + + @Nullable + public abstract PCollectionView> csvHeader(); + + @Nullable + public abstract String csvDelimiter(); + + public abstract Integer batchSize(); + + public abstract String projectId(); + + @AutoValue.Builder + public abstract static class Builder { +public abstract Builder setInspectTemplateName(String inspectTemplateName); + +public abstract Builder setCsvHeader(PCollectionView> csvHeader); + +public abstract Builder setCsvDelimiter(String delimiter); + +public abstract Builder setBatchSize(Integer batchSize); + +public abstract Builder setProjectId(String projectId); + +public abstract Builder setDeidentifyTemplateName(String deidentifyTemplateName); + +public abstract Builder setInspectConfig(InspectConfig inspectConfig); + +public abstract Builder setDeidentifyConfig(DeidentifyConfig deidentifyConfig); + +public abstract DLPDeidentifyText build(); + } + Review comment: Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mwalenia commented on pull request #11566: [BEAM-9723] Add DLP integration transforms
mwalenia commented on pull request #11566: URL: https://github.com/apache/beam/pull/11566#issuecomment-632113183 @tysonjh , @santhh Thanks for the review, I added the first batch of fixes. I'll take care of the Javadocs tomorrow. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kamilwu opened a new pull request #11776: [BEAM-9421] Better documentation of output results from AnnotateText transform
kamilwu opened a new pull request #11776: URL: https://github.com/apache/beam/pull/11776 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/l
[GitHub] [beam] reuvenlax merged pull request #11774: [BEAM-1589] Added @OnWindowExpiration annotation.
reuvenlax merged pull request #11774: URL: https://github.com/apache/beam/pull/11774 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] reuvenlax commented on pull request #11774: [BEAM-1589] Added @OnWindowExpiration annotation.
reuvenlax commented on pull request #11774: URL: https://github.com/apache/beam/pull/11774#issuecomment-632168553 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] y1chi commented on a change in pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner
y1chi commented on a change in pull request #11756: URL: https://github.com/apache/beam/pull/11756#discussion_r428754395 ## File path: sdks/java/harness/src/test/java/org/apache/beam/fn/harness/FnApiDoFnRunnerTest.java ## @@ -947,49 +910,213 @@ public void testTimers() throws Exception { timerInGlobalWindow("A", new Instant(1600L), new Instant(10012L)), timerInGlobalWindow("X", new Instant(1700L), new Instant(10022L)), timerInGlobalWindow("C", new Instant(1800L), new Instant(10022L)), -timerInGlobalWindow("B", new Instant(1900L), new Instant(10022L; +timerInGlobalWindow("B", new Instant(1900L), new Instant(10022L)), +timerInGlobalWindow("B", new Instant(2000L), new Instant(10032L)), +timerInGlobalWindow("Y", new Instant(2100L), new Instant(10042L; +assertThat( +fakeTimerClient.getTimers(eventFamilyTimer), +contains( +timerInGlobalWindow("X", "event-timer1", new Instant(1000L), new Instant(1003L)), +timerInGlobalWindow("Y", "event-timer1", new Instant(1100L), new Instant(1103L)), +timerInGlobalWindow("X", "event-timer1", new Instant(1200L), new Instant(1203L)), +timerInGlobalWindow("Y", "event-timer1", new Instant(1300L), new Instant(1303L)), +timerInGlobalWindow("A", "event-timer1", new Instant(1400L), new Instant(2413L)), +timerInGlobalWindow("B", "event-timer1", new Instant(1500L), new Instant(2513L)), +timerInGlobalWindow("A", "event-timer1", new Instant(1600L), new Instant(2613L)), +timerInGlobalWindow("X", "event-timer1", new Instant(1700L), new Instant(1723L)), +timerInGlobalWindow("C", "event-timer1", new Instant(1800L), new Instant(1823L)), +timerInGlobalWindow("B", "event-timer1", new Instant(1900L), new Instant(1923L)), +timerInGlobalWindow("B", "event-timer1", new Instant(2000L), new Instant(2033L)), +timerInGlobalWindow("Y", "event-timer1", new Instant(2100L), new Instant(2143L; +assertThat( +fakeTimerClient.getTimers(processingFamilyTimer), +contains( +timerInGlobalWindow("X", "processing-timer1", new Instant(1000L), new Instant(10004L)), +timerInGlobalWindow("Y", "processing-timer1", new Instant(1100L), new Instant(10004L)), +timerInGlobalWindow("X", "processing-timer1", new Instant(1200L), new Instant(10004L)), +timerInGlobalWindow("Y", "processing-timer1", new Instant(1300L), new Instant(10004L)), +timerInGlobalWindow("A", "processing-timer1", new Instant(1400L), new Instant(10014L)), +timerInGlobalWindow("B", "processing-timer1", new Instant(1500L), new Instant(10014L)), +timerInGlobalWindow("A", "processing-timer1", new Instant(1600L), new Instant(10014L)), +timerInGlobalWindow("X", "processing-timer1", new Instant(1700L), new Instant(10024L)), +timerInGlobalWindow("C", "processing-timer1", new Instant(1800L), new Instant(10024L)), +timerInGlobalWindow("B", "processing-timer1", new Instant(1900L), new Instant(10024L)), +timerInGlobalWindow("B", "processing-timer1", new Instant(2000L), new Instant(10034L)), +timerInGlobalWindow( +"Y", "processing-timer1", new Instant(2100L), new Instant(10044L; mainOutputValues.clear(); assertFalse(fakeTimerClient.isOutboundClosed(eventTimer)); assertFalse(fakeTimerClient.isOutboundClosed(processingTimer)); +assertFalse(fakeTimerClient.isOutboundClosed(eventFamilyTimer)); +assertFalse(fakeTimerClient.isOutboundClosed(processingFamilyTimer)); fakeTimerClient.closeInbound(eventTimer); fakeTimerClient.closeInbound(processingTimer); +fakeTimerClient.closeInbound(eventFamilyTimer); +fakeTimerClient.closeInbound(processingFamilyTimer); Iterables.getOnlyElement(finishFunctionRegistry.getFunctions()).run(); assertThat(mainOutputValues, empty()); assertTrue(fakeTimerClient.isOutboundClosed(eventTimer)); assertTrue(fakeTimerClient.isOutboundClosed(processingTimer)); +assertTrue(fakeTimerClient.isOutboundClosed(eventFamilyTimer)); +assertTrue(fakeTimerClient.isOutboundClosed(processingFamilyTimer)); Iterables.getOnlyElement(teardownFunctions).run(); assertThat(mainOutputValues, empty()); assertEquals( ImmutableMap.builder() .put(bagUserStateKey("bag", "X"), encode("X0", "X1", "X2", "processing")) -.put(bagUserStateKey("bag", "Y"), encode("Y1", "Y2")) +.put(bagUserStateKey("bag", "Y"), encode("Y1", "Y2", "processing-family")) .put(bagUserStateKey("bag", "A"), encode("A0", "event", "event")) -.put(bagUserStateKey("bag", "B"), encode("event", "processing")) +.put(bagUserStateKey("bag", "B"), encode("event", "
[GitHub] [beam] mxm opened a new pull request #11777: [BEAM-10054] Fix watermark hold for on_time_pane in DirectRunner
mxm opened a new pull request #11777: URL: https://github.com/apache/beam/pull/11777 We have a test pipeline which runs with the DirectRunner. When upgrading from 2.18.0 to 2.21.0 the test failed with the following exception: ``` tp = Exception('Monitor task detected a pipeline stall.',), value = None, tb = None def raise_(tp, value=None, tb=None): """ A function that matches the Python 2.x ``raise`` statement. This allows re-raising exceptions with the cls value and traceback on Python 2 and 3. """ if value is not None and isinstance(tp, Exception): raise TypeError("instance exception may not have a separate value") if value is not None: exc = tp(value) else: exc = tp if exc.__traceback__ is not tb: raise exc.with_traceback(tb) > raise exc E Exception: Monitor task detected a pipeline stall. ``` I was able to bisect the error. This commit introduced the failure: ea9b1f350b88c2996cafb4d24351869e82857731 The fix lets to the pipeline running correctly. Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastComplete
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632179933 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mxm commented on pull request #11777: [BEAM-10054] Fix watermark hold for on_time_pane in DirectRunner
mxm commented on pull request #11777: URL: https://github.com/apache/beam/pull/11777#issuecomment-632181118 R: @rohdesamuel This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mxm commented on pull request #11777: [BEAM-10054] Fix watermark hold for on_time_pane in DirectRunner
mxm commented on pull request #11777: URL: https://github.com/apache/beam/pull/11777#issuecomment-632181899 Please have a look @rohdesamuel if you consider the fix valid. I'm not very familiar with the Python SDK triggering code. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner
boyuanzz commented on pull request #11756: URL: https://github.com/apache/beam/pull/11756#issuecomment-632182849 retest all please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner
boyuanzz commented on pull request #11756: URL: https://github.com/apache/beam/pull/11756#issuecomment-632185565 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632185540 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632186469 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632186575 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632187939 Run Python 3.6 PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632187854 Run Python 2 PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632188019 Run Python 3.7 PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632188678 Run Java Spark PortableValidatesRunner Batch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632190367 Run Release Gradle Build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632191832 Run SQL_Java11 PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ibzib commented on a change in pull request #11764: [BEAM-10048] Clean up release guide.
ibzib commented on a change in pull request #11764: URL: https://github.com/apache/beam/pull/11764#discussion_r428769910 ## File path: website/www/site/content/en/contribute/release-guide.md ## @@ -583,189 +582,44 @@ For this step, we recommend you using automation script to create a RC, but you 1. Stage source release into dist.apache.org dev [repo](https://dist.apache.org/repos/dist/dev/beam/). 1. Stage,sign and hash python binaries into dist.apache.ord dev repo python dir 1. Stage SDK docker images to [docker hub Apache organization](https://hub.docker.com/search?q=apache%2Fbeam&type=image). - 1. Create a PR to update beam and beam-site, changes includes: + 1. Create a PR to update beam-site, changes includes: * Copy python doc into beam-site * Copy java doc into beam-site - * Update release version into [_config.yml](https://github.com/apache/beam/blob/master/website/_config.yml). Tasks you need to do manually - 1. Add new release into `website/src/get-started/downloads.md`. - 1. Update last release download links in `website/src/get-started/downloads.md`. - 1. Update `website/src/.htaccess` to redirect to the new version. - 1. Build and stage python wheels. + 1. Verify the script worked. + 1. Verify that the source and Python binaries are present in [dist.apache.org](https://dist.apache.org/repos/dist/dev/beam). + 1. Verify Docker images are published. How to find images: + 1. Visit [https://hub.docker.com/u/apache](https://hub.docker.com/search?q=apache%2Fbeam&type=image) + 2. Visit each repository and navigate to *tags* tab. + 3. Verify images are pushed with tags: ${RELEASE}_rc{RC_NUM} + 1. Verify that third party licenses are included in Docker containers by logging in to the images. + - For Python SDK images, there should be around 80 ~ 100 dependencies. + Please note that dependencies for the SDKs with different Python versions vary. + Need to verify all Python images by replacing `${ver}` with each supported Python version `X.Y`. + ``` + docker run -it --entrypoint=/bin/bash apache/beam_python${ver}_sdk:${RELEASE}_rc{RC_NUM} + ls -al /opt/apache/beam/third_party_licenses/ | wc -l + ``` + - For Java SDK images, there should be around 1400 dependencies. + ``` + docker run -it --entrypoint=/bin/bash apache/beam_java_sdk:${RELEASE}_rc{RC_NUM} + ls -al /opt/apache/beam/third_party_licenses/ | wc -l + ``` 1. Publish staging artifacts - 1. Go to the staging repo to close the staging repository on [Apache Nexus](https://repository.apache.org/#stagingRepositories). + 1. Go to the staging repo to close the staging repository on [Apache Nexus](https://repository.apache.org/#stagingRepositories). 1. When prompted for a description, enter “Apache Beam, version X, release candidate Y”. - - -### (Alternative) Run all steps manually Review comment: Because it's mostly an exact copy of build_release_candidate.sh. Any part of this block that wasn't part of build_release_candidate.sh was moved into the "manual steps" section. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ibzib commented on a change in pull request #11764: [BEAM-10048] Clean up release guide.
ibzib commented on a change in pull request #11764: URL: https://github.com/apache/beam/pull/11764#discussion_r428770502 ## File path: release/src/main/scripts/build_release_candidate.sh ## @@ -113,7 +113,7 @@ if [[ $confirmation = "y" ]]; then echo "-Staging Java Artifacts into Maven---" Review comment: I wasn't sure the best way to do that. Maybe we can add it in a later PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ibzib commented on a change in pull request #11764: [BEAM-10048] Clean up release guide.
ibzib commented on a change in pull request #11764: URL: https://github.com/apache/beam/pull/11764#discussion_r428774392 ## File path: website/www/site/content/en/contribute/release-guide.md ## @@ -583,189 +582,44 @@ For this step, we recommend you using automation script to create a RC, but you 1. Stage source release into dist.apache.org dev [repo](https://dist.apache.org/repos/dist/dev/beam/). 1. Stage,sign and hash python binaries into dist.apache.ord dev repo python dir 1. Stage SDK docker images to [docker hub Apache organization](https://hub.docker.com/search?q=apache%2Fbeam&type=image). - 1. Create a PR to update beam and beam-site, changes includes: + 1. Create a PR to update beam-site, changes includes: * Copy python doc into beam-site * Copy java doc into beam-site - * Update release version into [_config.yml](https://github.com/apache/beam/blob/master/website/_config.yml). Tasks you need to do manually - 1. Add new release into `website/src/get-started/downloads.md`. - 1. Update last release download links in `website/src/get-started/downloads.md`. - 1. Update `website/src/.htaccess` to redirect to the new version. - 1. Build and stage python wheels. + 1. Verify the script worked. + 1. Verify that the source and Python binaries are present in [dist.apache.org](https://dist.apache.org/repos/dist/dev/beam). + 1. Verify Docker images are published. How to find images: + 1. Visit [https://hub.docker.com/u/apache](https://hub.docker.com/search?q=apache%2Fbeam&type=image) + 2. Visit each repository and navigate to *tags* tab. + 3. Verify images are pushed with tags: ${RELEASE}_rc{RC_NUM} + 1. Verify that third party licenses are included in Docker containers by logging in to the images. + - For Python SDK images, there should be around 80 ~ 100 dependencies. + Please note that dependencies for the SDKs with different Python versions vary. + Need to verify all Python images by replacing `${ver}` with each supported Python version `X.Y`. + ``` + docker run -it --entrypoint=/bin/bash apache/beam_python${ver}_sdk:${RELEASE}_rc{RC_NUM} + ls -al /opt/apache/beam/third_party_licenses/ | wc -l + ``` + - For Java SDK images, there should be around 1400 dependencies. + ``` + docker run -it --entrypoint=/bin/bash apache/beam_java_sdk:${RELEASE}_rc{RC_NUM} + ls -al /opt/apache/beam/third_party_licenses/ | wc -l + ``` 1. Publish staging artifacts - 1. Go to the staging repo to close the staging repository on [Apache Nexus](https://repository.apache.org/#stagingRepositories). + 1. Go to the staging repo to close the staging repository on [Apache Nexus](https://repository.apache.org/#stagingRepositories). Review comment: I added more detailed instructions. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] apilloud commented on a change in pull request #11682: [BEAM-9946] | added new api in Partition Transform
apilloud commented on a change in pull request #11682: URL: https://github.com/apache/beam/pull/11682#discussion_r428778641 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Partition.java ## @@ -85,7 +141,14 @@ * @throws IllegalArgumentException if {@code numPartitions <= 0} */ public static Partition of(int numPartitions, PartitionFn partitionFn) { -return new Partition<>(new PartitionDoFn(numPartitions, partitionFn)); + +Contextful ctfFn = +Contextful.fn( +(T element, Contextful.Fn.Context c) -> +partitionFn.partitionFor(element, numPartitions), +Requirements.empty()); +Object aClass = partitionFn; Review comment: The statement `Object aClass = partitionFn;` has no effect. You can just pass `partitionFn` directly into the function. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mxm commented on a change in pull request #11777: [BEAM-10054] Fix watermark hold for on_time_pane
mxm commented on a change in pull request #11777: URL: https://github.com/apache/beam/pull/11777#discussion_r428778698 ## File path: sdks/python/apache_beam/transforms/trigger.py ## @@ -1368,7 +1368,7 @@ def _output( if timestamp is None: # If no watermark hold was set, output at end of window. timestamp = window.max_timestamp() -elif input_watermark < window.end and self.trigger_fn.has_ontime_pane(): +elif output_watermark < window.end and self.trigger_fn.has_ontime_pane(): Review comment: Note that this is the fix in question. Please check @rohdesamuel if that makes sense. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] apilloud commented on pull request #11682: [BEAM-9946] | added new api in Partition Transform
apilloud commented on pull request #11682: URL: https://github.com/apache/beam/pull/11682#issuecomment-632218345 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts
chamikaramj commented on pull request #11771: URL: https://github.com/apache/beam/pull/11771#issuecomment-632219614 WDYT about https://github.com/ihji/beam/pull/1 ? (only change is to Environments.java other changes should go away if you rebase) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj merged pull request #11757: [BEAM-8019] Clarifies Dataflow execution environment model
chamikaramj merged pull request #11757: URL: https://github.com/apache/beam/pull/11757 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] robertwb merged pull request #11503: [BEAM-9692] Make GroupByKey into a primitive
robertwb merged pull request #11503: URL: https://github.com/apache/beam/pull/11503 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11521: [BEAM-9577] Update Java Runners to handle dependency-based artifact staging.
TheNeuralBit commented on pull request #11521: URL: https://github.com/apache/beam/pull/11521#issuecomment-632237798 @robertwb I think this broke Java PVR Spark Batch. First failure is here: https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/2887/ Not sure if there is a jira already This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11521: [BEAM-9577] Update Java Runners to handle dependency-based artifact staging.
TheNeuralBit commented on pull request #11521: URL: https://github.com/apache/beam/pull/11521#issuecomment-632241122 [BEAM-9971](https://issues.apache.org/jira/browse/BEAM-9971) was filed to track This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] amaliujia commented on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms
amaliujia commented on pull request #11610: URL: https://github.com/apache/beam/pull/11610#issuecomment-632248973 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] amaliujia commented on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms
amaliujia commented on pull request #11610: URL: https://github.com/apache/beam/pull/11610#issuecomment-632249476 run Java Precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] amaliujia commented on pull request #11610: [BEAM-9825] | Implement Intersect,Union,Except transforms
amaliujia commented on pull request #11610: URL: https://github.com/apache/beam/pull/11610#issuecomment-632249853 Tests retriggered. And it becomes a really big PR :-) Nice work! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner
boyuanzz commented on pull request #11756: URL: https://github.com/apache/beam/pull/11756#issuecomment-632250883 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner
boyuanzz commented on pull request #11756: URL: https://github.com/apache/beam/pull/11756#issuecomment-632251140 Run Java Flink PortableValidatesRunner Streaming This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner
boyuanzz commented on pull request #11756: URL: https://github.com/apache/beam/pull/11756#issuecomment-632250772 retest all please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on pull request #11756: [BEAM-9603] Add timer family support to FnApiDoFnRunner
boyuanzz commented on pull request #11756: URL: https://github.com/apache/beam/pull/11756#issuecomment-632251027 Run Java Flink PortableValidatesRunner Batch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts
ihji commented on pull request #11771: URL: https://github.com/apache/beam/pull/11771#issuecomment-632251342 > WDYT about [ihji#1](https://github.com/ihji/beam/pull/1) ? > (only change is to Environments.java other changes should go away if you rebase) It will add 36 characters + the length of file name. Is there any concern about the maximum length of GCS object name? I think it's okay if GCS allows enough space in object names. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] veblush opened a new pull request #11778: [BEAM-8889][release-2.22.0] Upgrades gcsio to 2.1.3
veblush opened a new pull request #11778: URL: https://github.com/apache/beam/pull/11778 Backport of #11651 CC: @theneuralbit Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://b
[GitHub] [beam] TheNeuralBit commented on pull request #11778: [BEAM-8889][release-2.22.0] Upgrades gcsio to 2.1.3
TheNeuralBit commented on pull request #11778: URL: https://github.com/apache/beam/pull/11778#issuecomment-632260642 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] tedromer opened a new pull request #11779: [BEAM-10055] Add --region to python examples where it was missing
tedromer opened a new pull request #11779: URL: https://github.com/apache/beam/pull/11779 Add --region to python examples where it was missing. R: ibzib This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] aaltay commented on pull request #8457: [BEAM-3342] Create a Cloud Bigtable IO connector for Python
aaltay commented on pull request #8457: URL: https://github.com/apache/beam/pull/8457#issuecomment-632270901 There are still failing tests on https://github.com/apache/beam/pull/11295. @mf2199 - What is the next step for this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] aaltay commented on pull request #11580: [BEAM-9861] Reject fractional values outside of (0.0, 1.0)
aaltay commented on pull request #11580: URL: https://github.com/apache/beam/pull/11580#issuecomment-632271494 @kmjung - What is the next step for this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts
chamikaramj commented on pull request #11771: URL: https://github.com/apache/beam/pull/11771#issuecomment-632271987 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts
chamikaramj commented on pull request #11771: URL: https://github.com/apache/beam/pull/11771#issuecomment-632272181 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kmjung commented on pull request #11580: [BEAM-9861] Reject fractional values outside of (0.0, 1.0)
kmjung commented on pull request #11580: URL: https://github.com/apache/beam/pull/11580#issuecomment-632272486 The next step is to merge this PR, I believe. I don't believe there's anything blocking other than a review from @chamikaramj. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] udim commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
udim commented on a change in pull request #11661: URL: https://github.com/apache/beam/pull/11661#discussion_r428838484 ## File path: sdks/python/test-suites/dataflow/common.gradle ## @@ -109,4 +109,21 @@ task validatesRunnerStreamingTests { args '-c', ". ${envdir}/bin/activate && ${runScriptsDir}/run_integration_test.sh $cmdArgs" } } -} \ No newline at end of file +} + +task runPerformanceTest { +dependsOn 'installGcpTest' +dependsOn ':sdks:python:sdist' + +def test = project.findProperty('test') +def testOpts = project.findProperty('test-pipeline-options') +testOpts += " --sdk_location=${files(configurations.distTarBall.files).singleFile}" + + doLast { +exec { + workingDir "${project.rootDir}/sdks/python" + executable 'sh' + args '-c', ". ${envdir}/bin/activate && ${envdir}/bin/python setup.py nosetests --tests=${test} --test-pipeline-options=\"${testOpts}\" --ignore-files \'.*py3\\d?\\.py\$\'" Review comment: I don't believe we have `--test-pipeline-options` support yet in pytest, so nose is the solution for now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] pabloem commented on pull request #11086: [BEAM-8910] Make custom BQ source read from Avro
pabloem commented on pull request #11086: URL: https://github.com/apache/beam/pull/11086#issuecomment-632277003 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts
chamikaramj commented on pull request #11771: URL: https://github.com/apache/beam/pull/11771#issuecomment-632278144 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #11771: [BEAM-10052] check hash and avoid duplicated artifacts
chamikaramj commented on pull request #11771: URL: https://github.com/apache/beam/pull/11771#issuecomment-632279237 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] aijamalnk opened a new pull request #11780: [WIP] Uploading mascot to the website
aijamalnk opened a new pull request #11780: URL: https://github.com/apache/beam/pull/11780 **Please** add a meaningful description for your change here Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
[GitHub] [beam] ibzib merged pull request #11764: [BEAM-10048] Clean up release guide.
ibzib merged pull request #11764: URL: https://github.com/apache/beam/pull/11764 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632296307 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] pabloem commented on pull request #11780: [WIP] Uploading mascot to the website
pabloem commented on pull request #11780: URL: https://github.com/apache/beam/pull/11780#issuecomment-632296533 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rohdesamuel commented on pull request #11765: [BEAM-9322] Remove passthrough_pcollection_output_ids and force_generated_pcollection_output_ids flags
rohdesamuel commented on pull request #11765: URL: https://github.com/apache/beam/pull/11765#issuecomment-632299883 R: @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rohdesamuel commented on pull request #11745: Add to/from_runner_api_parameters to WriteToBigQuery
rohdesamuel commented on pull request #11745: URL: https://github.com/apache/beam/pull/11745#issuecomment-632301124 R: @pabloem thanks Pablo! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lukecwik opened a new pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
lukecwik opened a new pull request #11781: URL: https://github.com/apache/beam/pull/11781 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/
[GitHub] [beam] lukecwik commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
lukecwik commented on pull request #11781: URL: https://github.com/apache/beam/pull/11781#issuecomment-632303098 R: @ihji @chamikaramj CC: @TheNeuralBit (cherry pick into 2.22) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
TheNeuralBit commented on pull request #11781: URL: https://github.com/apache/beam/pull/11781#issuecomment-632304060 Is there any test coverage of this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck opened a new pull request #11782: [BEAM-10056] Fix validation for struct CoGBKs
lostluck opened a new pull request #11782: URL: https://github.com/apache/beam/pull/11782 We were a bit too strict for CoGBKs with multiple value streams with Structural DoFns. This PR expands the validation to support precise validation at construction time, and relaxes the execution time validation. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.a
[GitHub] [beam] lostluck commented on pull request #11782: [BEAM-10056] Fix validation for struct CoGBKs
lostluck commented on pull request #11782: URL: https://github.com/apache/beam/pull/11782#issuecomment-632304959 R: @youngoli This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11778: [BEAM-8889][release-2.22.0] Upgrades gcsio to 2.1.3
TheNeuralBit commented on pull request #11778: URL: https://github.com/apache/beam/pull/11778#issuecomment-632309156 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] pabloem commented on a change in pull request #11086: [BEAM-8910] Make custom BQ source read from Avro
pabloem commented on a change in pull request #11086: URL: https://github.com/apache/beam/pull/11086#discussion_r428877884 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -610,7 +611,8 @@ def __init__( coder=None, use_standard_sql=False, flatten_results=True, - kms_key=None): + kms_key=None, + use_json_exports=False): Review comment: Restarting this discussion after some rebasing and preparing... I think I'd prefer to allow users to choose the export file format. This is what we do for writing to BQ in Java and Python. Allowing users to choose output type for specific column types would add overhead of (if avro and json_format_bytes: formatbytes; if avro and json_datetime_format: formatdatetime; if not avro and not json_format_bytes: formatbytes; ...). We can discourage users from using JSON (AVRO is already the default) - and eventually stop supporting it if that makes sense. Thoughts? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] aaltay commented on pull request #11682: [BEAM-9946] | added new api in Partition Transform
aaltay commented on pull request #11682: URL: https://github.com/apache/beam/pull/11682#issuecomment-632313052 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] aaltay commented on pull request #11682: [BEAM-9946] | added new api in Partition Transform
aaltay commented on pull request #11682: URL: https://github.com/apache/beam/pull/11682#issuecomment-632313262 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
chamikaramj commented on pull request #11781: URL: https://github.com/apache/beam/pull/11781#issuecomment-632314521 I suspect test coverage will be internal to Dataflow (hence during import). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
chamikaramj commented on pull request #11781: URL: https://github.com/apache/beam/pull/11781#issuecomment-632314627 I'm doing a manual validation for now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
TheNeuralBit commented on pull request #11781: URL: https://github.com/apache/beam/pull/11781#issuecomment-632318843 Got it, thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType
TheNeuralBit commented on pull request #11272: URL: https://github.com/apache/beam/pull/11272#issuecomment-632327533 Something just occurred to me - are there any tests that use the DATE Type in an aggregation (e.g. MAX)? I'd think that would run into the same issue I have in #11456 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit edited a comment on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType
TheNeuralBit edited a comment on pull request #11272: URL: https://github.com/apache/beam/pull/11272#issuecomment-632327533 Something just occurred to me - are there any tests that use the DATE Type in an aggregation (e.g. MAX)? I'd think that would run into the same issue I have in #11456 (processing logical types using their representation) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] apilloud commented on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType
apilloud commented on pull request #11272: URL: https://github.com/apache/beam/pull/11272#issuecomment-632330703 Interesting question. You should probably add a test for JOIN as well, which will have a similar class of problems. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] robinyqiu commented on pull request #11272: [BEAM-9641] Support ZetaSQL DATE type as a Beam LogicalType
robinyqiu commented on pull request #11272: URL: https://github.com/apache/beam/pull/11272#issuecomment-632331823 > are there any tests that use the DATE Type in an aggregation (e.g. MAX)? No. Thanks for bringing this up. I think it is likely to run into the problem. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] reuvenlax commented on a change in pull request #11456: [BEAM-7554] Add MillisInstant logical type to replace DATETIME
reuvenlax commented on a change in pull request #11456: URL: https://github.com/apache/beam/pull/11456#discussion_r428901750 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/logicaltypes/MillisInstant.java ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.schemas.logicaltypes; + +import org.joda.time.Instant; +import org.joda.time.ReadableInstant; + +/** A timestamp represented as milliseconds since the epoch. */ +public class MillisInstant extends MillisType { + public static final String IDENTIFIER = "beam:logical_type:millis_instant:v1"; Review comment: A timestamp type seems like it's by definition time-zone agnostic. If we want a datetime type, that should be a different type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
TheNeuralBit commented on pull request #11781: URL: https://github.com/apache/beam/pull/11781#issuecomment-632335607 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11778: [BEAM-8889][release-2.22.0] Upgrades gcsio to 2.1.3
TheNeuralBit commented on pull request #11778: URL: https://github.com/apache/beam/pull/11778#issuecomment-632335448 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #11770: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
TheNeuralBit commented on pull request #11770: URL: https://github.com/apache/beam/pull/11770#issuecomment-632335832 Run Python 3.7 PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lukecwik commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
lukecwik commented on pull request #11781: URL: https://github.com/apache/beam/pull/11781#issuecomment-632342320 Test coverage is by existing IOs that enable these features which we don't have enough of in Beam (since it requires portable runners to implement SDF to a greater extent then they do) so supplement it with internal testing in Google. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #11781: [BEAM-2939, BEAM-10057] Ensure that we can process an EmptyUnboundedSource and also prevent splitting on it.
chamikaramj commented on pull request #11781: URL: https://github.com/apache/beam/pull/11781#issuecomment-632347848 LGTM. Thanks. Ran the Kafka test few times and it seems to be working. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org