Re: [PR] add a way for channels to be closed manually [beam]

2024-03-06 Thread via GitHub


m-trieu commented on code in PR #30425:
URL: https://github.com/apache/beam/pull/30425#discussion_r1515691626


##
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/grpc/GrpcDispatcherClient.java:
##
@@ -97,6 +97,14 @@ CloudWindmillServiceV1Alpha1Stub getWindmillServiceStub() {
 : randomlySelectNextStub(windmillServiceStubs));
   }
 
+  WindmillServiceAddress getWindmillServiceAddress() {
+ImmutableList endpoints =
+ImmutableList.copyOf(dispatcherStubs.get().dispatcherEndpoints());

Review Comment:
   realized we don't need this since the cache handles all of it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] add a way for channels to be closed manually [beam]

2024-03-06 Thread via GitHub


m-trieu commented on code in PR #30425:
URL: https://github.com/apache/beam/pull/30425#discussion_r1515690054


##
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/grpc/stubs/ChannelCache.java:
##
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.worker.windmill.client.grpc.stubs;
+
+import java.io.PrintWriter;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Function;
+import javax.annotation.concurrent.ThreadSafe;
+import org.apache.beam.runners.dataflow.worker.status.StatusDataProvider;
+import org.apache.beam.runners.dataflow.worker.windmill.WindmillServiceAddress;
+import org.apache.beam.vendor.grpc.v1p60p1.io.grpc.ManagedChannel;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.cache.CacheBuilder;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.cache.CacheLoader;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.cache.LoadingCache;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Cache of gRPC channels for {@link WindmillServiceAddress}. Follows https://grpc.io/docs/guides/performance/#java>gRPC recommendations 
for re-using channels
+ * when possible.
+ *
+ * @implNote Backed by {@link LoadingCache} which is thread-safe.
+ */
+@ThreadSafe
+public final class ChannelCache implements StatusDataProvider {
+  private static final Logger LOG = 
LoggerFactory.getLogger(ChannelCache.class);
+  private final LoadingCache 
channelCache;
+
+  public ChannelCache(
+  boolean useIsolatedChannels,
+  Function channelFactory) {
+this.channelCache =
+CacheBuilder.newBuilder()
+.build(
+new CacheLoader() {
+  @Override
+  public ManagedChannel load(WindmillServiceAddress 
serviceAddress) {
+// IsolationChannel will create and manage separate RPC 
channels to the same
+// serviceAddress via calling the channelFactory, else 
just directly return the
+// RPC channel.
+return useIsolatedChannels
+? IsolationChannel.create(() -> 
channelFactory.apply(serviceAddress))
+: channelFactory.apply(serviceAddress);
+  }
+});
+  }
+
+  private static void shutdownChannel(ManagedChannel channel) {
+channel.shutdown();
+try {
+  channel.awaitTermination(10, TimeUnit.SECONDS);
+} catch (InterruptedException e) {
+  LOG.error("Couldn't close gRPC channel={}", channel, e);
+}
+channel.shutdownNow();
+  }
+
+  public ManagedChannel get(WindmillServiceAddress windmillServiceAddress) {
+return channelCache.getUnchecked(windmillServiceAddress);
+  }
+
+  public void removeAndClose(WindmillServiceAddress windmillServiceAddress) {
+Optional.ofNullable(channelCache.getIfPresent(windmillServiceAddress))
+.ifPresent(ChannelCache::shutdownChannel);

Review Comment:
   added the caffeine cache to worker build file



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bump github.com/testcontainers/testcontainers-go from 0.26.0 to 0.29.1 in /sdks [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30557:
URL: https://github.com/apache/beam/pull/30557#issuecomment-1982403037

   Checks are failing. Will not request review until checks are succeeding. If 
you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Bump github.com/testcontainers/testcontainers-go from 0.26.0 to 0.29.1 in /sdks [beam]

2024-03-06 Thread via GitHub


dependabot[bot] opened a new pull request, #30557:
URL: https://github.com/apache/beam/pull/30557

   Bumps 
[github.com/testcontainers/testcontainers-go](https://github.com/testcontainers/testcontainers-go)
 from 0.26.0 to 0.29.1.
   
   Release notes
   Sourced from https://github.com/testcontainers/testcontainers-go/releases;>github.com/testcontainers/testcontainers-go's
 releases.
   
   v0.29.1
   What's Changed
    Features
   
   Add k3s WithManifest option (https://redirect.github.com/testcontainers/testcontainers-go/issues/1920;>#1920)
 https://github.com/pablochacin;>@​pablochacin
   feat: add ollama module (https://redirect.github.com/testcontainers/testcontainers-go/issues/2265;>#2265)
 https://github.com/mdelapenya;>@​mdelapenya
   Adding surrealDB module (https://redirect.github.com/testcontainers/testcontainers-go/issues/2192;>#2192)
 https://github.com/jespino;>@​jespino
   feat: WithLogger ContainerCustomizer support (https://redirect.github.com/testcontainers/testcontainers-go/issues/2259;>#2259)
 https://github.com/stevenh;>@​stevenh
   feat: WithEnv customize request option (https://redirect.github.com/testcontainers/testcontainers-go/issues/2260;>#2260)
 https://github.com/stevenh;>@​stevenh
   feat: add vector database modules (Qdrant, Weaviate, Chroma, pgvector, 
OpenSearch, Milvus) (https://redirect.github.com/testcontainers/testcontainers-go/issues/2245;>#2245)
 https://github.com/mdelapenya;>@​mdelapenya
   
    Bug Fixes
   
   Fix Dockerfile not located when added to dockerignore (https://redirect.github.com/testcontainers/testcontainers-go/issues/2272;>#2272)
 https://github.com/danvergara;>@​danvergara
   bug: allow start container with reuse in different test package (https://redirect.github.com/testcontainers/testcontainers-go/issues/2247;>#2247)
 https://github.com/Alviner;>@​Alviner
   
    Documentation
   
   docs: fix comment corruption (https://redirect.github.com/testcontainers/testcontainers-go/issues/2262;>#2262)
 https://github.com/stevenh;>@​stevenh
   docs: improve module creation section (https://redirect.github.com/testcontainers/testcontainers-go/issues/2239;>#2239)
 https://github.com/mdelapenya;>@​mdelapenya
   
   粒 Housekeeping
   
   generic.go: GenericContainer(): clearer error message (https://redirect.github.com/testcontainers/testcontainers-go/issues/2327;>#2327)
 https://github.com/JordanP;>@​JordanP
   chore: confirm support for new mongo images (https://redirect.github.com/testcontainers/testcontainers-go/issues/2326;>#2326)
 https://github.com/mdelapenya;>@​mdelapenya
   chore: bump Go version to 1.21 (https://redirect.github.com/testcontainers/testcontainers-go/issues/2292;>#2292)
 https://github.com/mdelapenya;>@​mdelapenya
   Move the file and mounts tests into a test package (https://redirect.github.com/testcontainers/testcontainers-go/issues/2270;>#2270)
 https://github.com/Minivera;>@​Minivera
   chore(milvus): embed etcd should use default ports (https://redirect.github.com/testcontainers/testcontainers-go/issues/2258;>#2258)
 https://github.com/mdelapenya;>@​mdelapenya
   chore: use logger.PrintXX instead of fmt.PrintXX (https://redirect.github.com/testcontainers/testcontainers-go/issues/2257;>#2257)
 https://github.com/stevenh;>@​stevenh
   Fix modulege template to succeed on make lint command (https://redirect.github.com/testcontainers/testcontainers-go/issues/2243;>#2243)
 https://github.com/jespino;>@​jespino
   chore: enforce test package in modules (https://redirect.github.com/testcontainers/testcontainers-go/issues/2241;>#2241)
 https://github.com/mdelapenya;>@​mdelapenya
   
    Dependency updates
   
   chore(deps): bump google.golang.org/grpc from 1.61.1 to 1.62.0 in 
/modules/qdrant (https://redirect.github.com/testcontainers/testcontainers-go/issues/2281;>#2281)
 https://github.com/dependabot;>@​dependabot
   chore(deps): bump github.com/ClickHouse/clickhouse-go/v2 from 2.18.0 to 
2.20.0 in /modules/clickhouse (https://redirect.github.com/testcontainers/testcontainers-go/issues/2290;>#2290)
 https://github.com/dependabot;>@​dependabot
   chore(deps): bump github.com/Shopify/toxiproxy/v2 from 2.7.0 to 2.8.0 in 
/examples/toxiproxy (https://redirect.github.com/testcontainers/testcontainers-go/issues/2282;>#2282)
 https://github.com/dependabot;>@​dependabot
   chore(deps): bump github.com/neo4j/neo4j-go-driver/v5 from 5.16.0 to 
5.18.0 in /modules/neo4j (https://redirect.github.com/testcontainers/testcontainers-go/issues/2278;>#2278)
 https://github.com/dependabot;>@​dependabot
   chore(deps): bump github.com/minio/minio-go/v7 from 7.0.66 to 7.0.68 in 
/modules/minio (https://redirect.github.com/testcontainers/testcontainers-go/issues/2304;>#2304)
 https://github.com/dependabot;>@​dependabot
   chore(deps): bump github.com/tmc/langchaingo from 0.1.4 to 0.1.5 in 
/modules/ollama (https://redirect.github.com/testcontainers/testcontainers-go/issues/2318;>#2318)
 https://github.com/dependabot;>@​dependabot
   chore(deps): bump 

Re: [PR] add a way for channels to be closed manually [beam]

2024-03-06 Thread via GitHub


m-trieu commented on code in PR #30425:
URL: https://github.com/apache/beam/pull/30425#discussion_r1515523277


##
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/grpc/stubs/ChannelCache.java:
##
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.worker.windmill.client.grpc.stubs;
+
+import java.io.PrintWriter;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Function;
+import javax.annotation.concurrent.ThreadSafe;
+import org.apache.beam.runners.dataflow.worker.status.StatusDataProvider;
+import org.apache.beam.runners.dataflow.worker.windmill.WindmillServiceAddress;
+import org.apache.beam.vendor.grpc.v1p60p1.io.grpc.ManagedChannel;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.cache.CacheBuilder;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.cache.CacheLoader;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.cache.LoadingCache;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Cache of gRPC channels for {@link WindmillServiceAddress}. Follows https://grpc.io/docs/guides/performance/#java>gRPC recommendations 
for re-using channels
+ * when possible.
+ *
+ * @implNote Backed by {@link LoadingCache} which is thread-safe.
+ */
+@ThreadSafe
+public final class ChannelCache implements StatusDataProvider {
+  private static final Logger LOG = 
LoggerFactory.getLogger(ChannelCache.class);
+  private final LoadingCache 
channelCache;
+
+  public ChannelCache(
+  boolean useIsolatedChannels,
+  Function channelFactory) {
+this.channelCache =
+CacheBuilder.newBuilder()
+.build(
+new CacheLoader() {
+  @Override
+  public ManagedChannel load(WindmillServiceAddress 
serviceAddress) {
+// IsolationChannel will create and manage separate RPC 
channels to the same
+// serviceAddress via calling the channelFactory, else 
just directly return the
+// RPC channel.
+return useIsolatedChannels
+? IsolationChannel.create(() -> 
channelFactory.apply(serviceAddress))
+: channelFactory.apply(serviceAddress);
+  }
+});
+  }
+
+  private static void shutdownChannel(ManagedChannel channel) {
+channel.shutdown();
+try {
+  channel.awaitTermination(10, TimeUnit.SECONDS);
+} catch (InterruptedException e) {
+  LOG.error("Couldn't close gRPC channel={}", channel, e);
+}
+channel.shutdownNow();
+  }
+
+  public ManagedChannel get(WindmillServiceAddress windmillServiceAddress) {
+return channelCache.getUnchecked(windmillServiceAddress);
+  }
+
+  public void removeAndClose(WindmillServiceAddress windmillServiceAddress) {
+Optional.ofNullable(channelCache.getIfPresent(windmillServiceAddress))
+.ifPresent(ChannelCache::shutdownChannel);

Review Comment:
   sgtm added a removal listener to the current implementation.
   
   If we sub guava cache with caffeine cache, do we have to go through 
vendoring? or any kind of dependency review?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Jimmytobin2425 patch 1 [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30550:
URL: https://github.com/apache/beam/pull/30550#issuecomment-1982330769

   Assigning reviewers. If you would like to opt out of this review, comment 
`assign to next reviewer`:
   
   R: @liferoad for label python.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any 
comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review 
comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] try new protobuf [beam]

2024-03-06 Thread via GitHub


riteshghorse opened a new pull request, #30556:
URL: https://github.com/apache/beam/pull/30556

   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO] Begin adding RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30555:
URL: https://github.com/apache/beam/pull/30555#issuecomment-1982288013

   Assigning reviewers. If you would like to opt out of this review, comment 
`assign to next reviewer`:
   
   R: @damccorm for label website.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any 
comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review 
comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO] Stage WebApis examples module [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30554:
URL: https://github.com/apache/beam/pull/30554#issuecomment-1982252454

   Checks are failing. Will not request review until checks are succeeding. If 
you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add vertex AI dependency [beam]

2024-03-06 Thread via GitHub


damondouglas merged PR #30553:
URL: https://github.com/apache/beam/pull/30553


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [RRIO] Begin adding RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damondouglas opened a new pull request, #30555:
URL: https://github.com/apache/beam/pull/30555

   This PR addresses #30379 by adding the beginnings of RequestResponseIO 
examples and documentation on website for the Java SDK.
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add vertex AI dependency [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30553:
URL: https://github.com/apache/beam/pull/30553#issuecomment-1982228545

   Checks are failing. Will not request review until checks are succeeding. If 
you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [RRIO] Stage WebApis examples module [beam]

2024-03-06 Thread via GitHub


damondouglas opened a new pull request, #30554:
URL: https://github.com/apache/beam/pull/30554

   This PR addresses #30379 staging for examples that make WebApis requests 
pulling images using RequestResponseIO.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [BEAM-30531] Automatically execute unbounded pipelines in streaming mode. [beam]

2024-03-06 Thread via GitHub


tvalentyn commented on PR #30533:
URL: https://github.com/apache/beam/pull/30533#issuecomment-198873

   waiting on author


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Do not pre-install beam in tensorrt container [beam]

2024-03-06 Thread via GitHub


tvalentyn commented on PR #30552:
URL: https://github.com/apache/beam/pull/30552#issuecomment-1982216770

   Run Python PreCommit 3.8


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [TESTING] Update protobuf timestamp limit for protobuf 4.26 [beam]

2024-03-06 Thread via GitHub


riteshghorse commented on PR #29873:
URL: https://github.com/apache/beam/pull/29873#issuecomment-1982216462

   Closing this in favor of adding our own wrapper for the timestamp method 
without checks in short-term .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [TESTING] Update protobuf timestamp limit for protobuf 4.26 [beam]

2024-03-06 Thread via GitHub


riteshghorse closed pull request #29873: [TESTING] Update protobuf timestamp 
limit for protobuf 4.26
URL: https://github.com/apache/beam/pull/29873


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add vertex AI dependency [beam]

2024-03-06 Thread via GitHub


tvalentyn commented on PR #30553:
URL: https://github.com/apache/beam/pull/30553#issuecomment-1982206061

   cc: @kennknowles 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Do not pre-install beam in tensorrt container [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30552:
URL: https://github.com/apache/beam/pull/30552#issuecomment-1982205371

   Checks are failing. Will not request review until checks are succeeding. If 
you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add vertex AI dependency [beam]

2024-03-06 Thread via GitHub


tvalentyn commented on PR #30553:
URL: https://github.com/apache/beam/pull/30553#issuecomment-1982205023

   Actually, do we need to add Vertex AI as a Beam dependency if we don't 
directly depend on it? can the dependency be only added for the project that 
defines the relevant example? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Refactor commit logic out of StreamingDataflowWorker [beam]

2024-03-06 Thread via GitHub


m-trieu commented on code in PR #30312:
URL: https://github.com/apache/beam/pull/30312#discussion_r1515381512


##
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/commits/StreamingEngineWorkCommitter.java:
##
@@ -0,0 +1,210 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.worker.windmill.client.commits;
+
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+import java.util.function.Consumer;
+import java.util.function.Supplier;
+import javax.annotation.concurrent.ThreadSafe;
+import org.apache.beam.runners.dataflow.worker.streaming.WeightedBoundedQueue;
+import org.apache.beam.runners.dataflow.worker.streaming.Work;
+import org.apache.beam.runners.dataflow.worker.windmill.client.CloseableStream;
+import 
org.apache.beam.runners.dataflow.worker.windmill.client.WindmillStream.CommitWorkStream;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.util.concurrent.ThreadFactoryBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Streaming engine implementation of {@link WorkCommitter}. Commits work back 
to Streaming Engine
+ * backend.
+ */
+@ThreadSafe
+final class StreamingEngineWorkCommitter implements WorkCommitter {
+  private static final Logger LOG = 
LoggerFactory.getLogger(StreamingEngineWorkCommitter.class);
+  private static final int COMMIT_BATCH_SIZE = 5;

Review Comment:
   ah gotcha, fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Add vertex AI dependency [beam]

2024-03-06 Thread via GitHub


damondouglas opened a new pull request, #30553:
URL: https://github.com/apache/beam/pull/30553

   This PR addresses #30379 by adding the Vertex AI dependency. This is needed 
for future example.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Refactor commit logic out of StreamingDataflowWorker [beam]

2024-03-06 Thread via GitHub


m-trieu commented on code in PR #30312:
URL: https://github.com/apache/beam/pull/30312#discussion_r1515369962


##
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/commits/StreamingEngineWorkCommitter.java:
##
@@ -0,0 +1,210 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.worker.windmill.client.commits;
+
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+import java.util.function.Consumer;
+import java.util.function.Supplier;
+import javax.annotation.Nullable;
+import javax.annotation.concurrent.ThreadSafe;
+import org.apache.beam.runners.dataflow.worker.streaming.WeightedBoundedQueue;
+import org.apache.beam.runners.dataflow.worker.streaming.Work;
+import org.apache.beam.runners.dataflow.worker.windmill.client.CloseableStream;
+import 
org.apache.beam.runners.dataflow.worker.windmill.client.WindmillStream.CommitWorkStream;
+import org.apache.beam.sdk.annotations.Internal;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.util.concurrent.ThreadFactoryBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Streaming engine implementation of {@link WorkCommitter}. Commits work back 
to Streaming Engine
+ * backend.
+ */
+@Internal
+@ThreadSafe
+public final class StreamingEngineWorkCommitter implements WorkCommitter {
+  private static final Logger LOG = 
LoggerFactory.getLogger(StreamingEngineWorkCommitter.class);
+  private static final int COMMIT_BATCH_SIZE = 5;
+  private static final int TARGET_COMMIT_BATCH_SIZE = 500 << 20; // 500MB
+
+  private final Supplier> 
commitWorkStreamFactory;
+  private final WeightedBoundedQueue commitQueue;
+  private final ExecutorService commitSenders;
+  private final AtomicLong activeCommitBytes;
+  private final Consumer onCommitComplete;
+  private final int numCommitSenders;
+
+  private StreamingEngineWorkCommitter(
+  Supplier> commitWorkStreamFactory,
+  int numCommitSenders,
+  Consumer onCommitComplete) {
+this.commitWorkStreamFactory = commitWorkStreamFactory;
+this.commitQueue =
+WeightedBoundedQueue.create(
+TARGET_COMMIT_BATCH_SIZE,
+commit -> Math.min(TARGET_COMMIT_BATCH_SIZE, commit.getSize()));
+this.commitSenders =
+Executors.newFixedThreadPool(
+numCommitSenders,
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setPriority(Thread.MAX_PRIORITY)
+.setNameFormat("CommitThread-%d")
+.build());
+this.activeCommitBytes = new AtomicLong();
+this.onCommitComplete = onCommitComplete;
+this.numCommitSenders = numCommitSenders;
+  }
+
+  public static StreamingEngineWorkCommitter create(
+  Supplier> commitWorkStreamFactory,
+  int numCommitSenders,
+  Consumer onCommitComplete) {
+return new StreamingEngineWorkCommitter(
+commitWorkStreamFactory, numCommitSenders, onCommitComplete);
+  }
+
+  @Override
+  @SuppressWarnings("FutureReturnValueIgnored")
+  public void start() {
+if (!commitSenders.isShutdown() || !commitSenders.isTerminated()) {

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Refactor commit logic out of StreamingDataflowWorker [beam]

2024-03-06 Thread via GitHub


m-trieu commented on code in PR #30312:
URL: https://github.com/apache/beam/pull/30312#discussion_r1515369507


##
runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/windmill/client/commits/StreamingEngineWorkCommitterTest.java:
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.worker.windmill.client.commits;
+
+import static com.google.common.truth.Truth.assertThat;
+import static org.junit.Assert.assertNotNull;
+
+import com.google.api.services.dataflow.model.MapTask;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.function.Consumer;
+import java.util.function.Supplier;
+import org.apache.beam.runners.dataflow.worker.FakeWindmillServer;
+import org.apache.beam.runners.dataflow.worker.streaming.ComputationState;
+import org.apache.beam.runners.dataflow.worker.streaming.Work;
+import org.apache.beam.runners.dataflow.worker.util.BoundedQueueExecutor;
+import org.apache.beam.runners.dataflow.worker.windmill.Windmill;
+import 
org.apache.beam.runners.dataflow.worker.windmill.Windmill.WorkItemCommitRequest;
+import org.apache.beam.runners.dataflow.worker.windmill.client.CloseableStream;
+import 
org.apache.beam.runners.dataflow.worker.windmill.client.WindmillStream.CommitWorkStream;
+import 
org.apache.beam.runners.dataflow.worker.windmill.client.WindmillStreamPool;
+import org.apache.beam.vendor.grpc.v1p60p1.com.google.protobuf.ByteString;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.ImmutableMap;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ErrorCollector;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+import org.mockito.Mockito;
+
+@RunWith(JUnit4.class)
+public class StreamingEngineWorkCommitterTest {
+
+  @Rule public ErrorCollector errorCollector = new ErrorCollector();
+  private StreamingEngineWorkCommitter workCommitter;
+  private FakeWindmillServer fakeWindmillServer;
+  private Supplier> commitWorkStreamFactory;
+
+  private static Work createMockWork(long workToken, Consumer 
processWorkFn) {
+return Work.create(
+Windmill.WorkItem.newBuilder()
+.setKey(ByteString.EMPTY)
+.setWorkToken(workToken)
+.setShardingKey(workToken)
+.setCacheToken(workToken)
+.build(),
+Instant::now,
+Collections.emptyList(),
+processWorkFn);
+  }
+
+  private static ComputationState createComputationState(String computationId) 
{
+return new ComputationState(
+computationId,
+new MapTask().setSystemName("system").setStageName("stage"),
+Mockito.mock(BoundedQueueExecutor.class),
+ImmutableMap.of(),
+null);
+  }
+
+  private static CompleteCommit asCompleteCommit(Commit commit) {
+if (commit.work().isFailed()) {
+  return CompleteCommit.forFailedWork(commit);
+}
+
+return CompleteCommit.create(commit, Windmill.CommitStatus.OK);
+  }
+
+  @Before
+  public void setUp() {
+fakeWindmillServer =
+new FakeWindmillServer(
+errorCollector, ignored -> 
Optional.of(Mockito.mock(ComputationState.class)));
+commitWorkStreamFactory =
+WindmillStreamPool.create(
+1, Duration.standardMinutes(1), 
fakeWindmillServer::commitWorkStream)
+::getCloseableStream;
+  }
+
+  @After
+  public void cleanUp() {
+workCommitter.stop();
+  }
+
+  private StreamingEngineWorkCommitter createWorkCommitter(
+  Consumer onCommitComplete) {
+return StreamingEngineWorkCommitter.create(commitWorkStreamFactory, 1, 
onCommitComplete);
+  }
+
+  @Test
+  public void testCommit_sendsCommitsToStreamingEngine() {
+workCommitter = createWorkCommitter(ignored -> {});
+List commits = new ArrayList<>();
+for (int i = 1; i <= 5; i++) {
+  Work work = createMockWork(i, ignored -> {});
+  

Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damondouglas closed pull request #30430: [RRIO]: Add RequestResponseIO examples 
and documentation on website for the Java SDK
URL: https://github.com/apache/beam/pull/30430


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damondouglas commented on PR #30430:
URL: https://github.com/apache/beam/pull/30430#issuecomment-1982182212

   I'll break up the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bump google auth lower bound [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30548:
URL: https://github.com/apache/beam/pull/30548#issuecomment-1982175603

   alternative fix at #30552


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Do not pre-install beam in tensorrt container [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30552:
URL: https://github.com/apache/beam/pull/30552#issuecomment-1982174944

   The change already applied to gcr. WIll wait to see if Python PostCommit 
passes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Do not pre-install beam in tensorrt container [beam]

2024-03-06 Thread via GitHub


Abacn opened a new pull request, #30552:
URL: https://github.com/apache/beam/pull/30552

   Fix Python PostCommit tensorrttest
   
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Fixes an upgrade imcompatiblity of BQ read/write transforms [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30551:
URL: https://github.com/apache/beam/pull/30551#issuecomment-1982163394

   Checks are failing. Will not request review until checks are succeeding. If 
you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Duet AI prompts: containers, hints, external calls (no links) [beam]

2024-03-06 Thread via GitHub


olehborysevych commented on code in PR #30435:
URL: https://github.com/apache/beam/pull/30435#discussion_r1515349817


##
learning/prompts/documentation-lookup-nolinks/47_batching_for_external_calls.md:
##
@@ -0,0 +1,70 @@
+Prompt:
+How to reduce payload when calling external services from my Apache Beam 
pipeline?
+
+Response:
+To reduce payload when calling external services from your Apache Beam 
pipeline, you can employ batching techniques by using the `GroupIntoBatches` 
transform. Batching involves aggregating multiple elements into a single 
payload, reducing the number of requests sent to the external service and 
minimizing overhead.
+
+Under the hood, the `GroupIntoBatches` transform utilizes state and timers to 
grant users precise control over batch size and buffering duration parameters, 
while abstracting away the implementation details. Key parameters include:
+* `maxBufferDuration`: controls the maximum waiting time for a batch to be 
emitted.
+* `batchSize`: determines the maximum number of elements in each batch. 
Elements are buffered until the specified number is reached, then emitted as a 
batch.
+* `batchSizeBytes` (Java only): limits the byte size of a single batch, 
determined by the input coder.
+* `elementByteSize` (Java only): sets the byte size of a single batch using a 
user-defined function.
+* `withShardedKey()`: enhances parallelism by distributing a single key across 
multiple threads.

Review Comment:
   withSharedKey() should be separated from other parameters since it's not a 
parameter but a function @dariabezkorovaina 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Reduce nullness checks in flink adapters. [beam]

2024-03-06 Thread via GitHub


robertwb merged PR #30488:
URL: https://github.com/apache/beam/pull/30488


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Reduce nullness checks in flink adapters. [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30488:
URL: https://github.com/apache/beam/pull/30488#issuecomment-1982086997

   Assigning reviewers. If you would like to opt out of this review, comment 
`assign to next reviewer`:
   
   R: @lostluck added as fallback since no labels match configuration
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any 
comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review 
comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bump google auth lower bound [beam]

2024-03-06 Thread via GitHub


tvalentyn commented on PR #30548:
URL: https://github.com/apache/beam/pull/30548#issuecomment-1982075343

   rebuilding the image sounds like a more appropriate fix here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bump google auth lower bound [beam]

2024-03-06 Thread via GitHub


tvalentyn commented on PR #30548:
URL: https://github.com/apache/beam/pull/30548#issuecomment-1982074688

   I'd rather keep it as is unless Beam requires some functionality from a 
newer version.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Bug]: Upgrade compatibility broken when upgrading BQ read/write transforms to 2.55.0.dev [beam]

2024-03-06 Thread via GitHub


chamikaramj commented on issue #30534:
URL: https://github.com/apache/beam/issues/30534#issuecomment-1982068658

   Should be fixed by https://github.com/apache/beam/issues/30534.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Fixes an upgrade imcompatiblity of BQ read/write transforms [beam]

2024-03-06 Thread via GitHub


chamikaramj opened a new pull request, #30551:
URL: https://github.com/apache/beam/pull/30551

   This fixes https://github.com/apache/beam/issues/30534.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Reduce nullness checks in flink adapters. [beam]

2024-03-06 Thread via GitHub


robertwb commented on code in PR #30488:
URL: https://github.com/apache/beam/pull/30488#discussion_r1515270765


##
runners/flink/src/main/java/org/apache/beam/runners/flink/adapter/BeamFlinkDataSetAdapter.java:
##
@@ -213,7 +216,8 @@ private  
FlinkBatchPortablePipelineTranslator.PTransformTranslator flink
   // When we run into a FlinkInput operator, it "produces" the 
corresponding input as its
   // "computed result."
   String inputId = t.getTransform().getSpec().getPayload().toStringUtf8();
-  DataSet flinkInput = Preconditions.checkNotNull( 
(DataSet) inputMap.get(inputId));
+  DataSet flinkInput =
+  Preconditions.checkNotNull((DataSet) inputMap.get(inputId));

Review Comment:
   OK, I'll buy the NPE == segfault argument (though this code will probably 
never ever get hit.) Updated to use our version. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bump google auth lower bound [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30548:
URL: https://github.com/apache/beam/pull/30548#issuecomment-1982001500

   Stopping reviewer notifications for this pull request: review requested by 
someone other than the bot, ceding control


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bump google auth lower bound [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30548:
URL: https://github.com/apache/beam/pull/30548#issuecomment-198284

   R: @tvalentyn 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Reduce nullness checks in flink adapters. [beam]

2024-03-06 Thread via GitHub


robertwb commented on code in PR #30488:
URL: https://github.com/apache/beam/pull/30488#discussion_r1515262267


##
runners/flink/src/main/java/org/apache/beam/runners/flink/adapter/FlinkInput.java:
##
@@ -71,7 +71,7 @@ public String getUrn() {
 }
 
 @Override
-@SuppressWarnings("nullness")  // 
TODO(https://github.com/apache/beam/issues/20497)
+@SuppressWarnings("nullness") // 
TODO(https://github.com/apache/beam/issues/20497)

Review Comment:
   That's not possible because the problem is the type checking of the 
overridden method signature. (This will require a change to the subclass and 
all its implementations, which may be good to do but out of scope.) 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bump google auth lower bound [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30548:
URL: https://github.com/apache/beam/pull/30548#issuecomment-1981998857

   tensorrt test passed: 
https://github.com/apache/beam/actions/runs/8179296900/job/22365079715?pr=30548
   
   Revert test only change and mark as ready for review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Jimmytobin2425 patch 1 [beam]

2024-03-06 Thread via GitHub


jimmytobin2425 opened a new pull request, #30550:
URL: https://github.com/apache/beam/pull/30550

   The issue this warning message points to 
(https://github.com/apache/beam/issues/22969) has been resolved showing it was 
an issue with Python 2 and not with beam. I propose to remove this warning 
message to reduce log spam.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Remove warning for mixing yield and return in a DoFn [beam]

2024-03-06 Thread via GitHub


jimmytobin2425 closed pull request #30549: Remove warning for mixing yield and 
return in a DoFn
URL: https://github.com/apache/beam/pull/30549


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [BEAM-30531] Automatically execute unbounded pipelines in streaming mode. [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30533:
URL: https://github.com/apache/beam/pull/30533#issuecomment-1981972184

   Assigning reviewers. If you would like to opt out of this review, comment 
`assign to next reviewer`:
   
   R: @riteshghorse for label python.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any 
comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review 
comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Fix internal test failure from PR-30455 [beam]

2024-03-06 Thread via GitHub


Abacn commented on code in PR #30546:
URL: https://github.com/apache/beam/pull/30546#discussion_r1515232397


##
sdks/python/apache_beam/io/gcp/bigquery_tools_test.py:
##
@@ -224,9 +224,15 @@ def test_delete_dataset_retries_for_timeouts(self, 
patched_time_sleep):
 self.assertTrue(client.datasets.Delete.called)
 
   @mock.patch('time.sleep', return_value=None)
+  @mock.patch(
+  'apitools.base.py.base_api._SkipGetCredentials', return_value=True)
   @mock.patch('google.cloud._http.JSONConnection.http')
-  def test_user_agent_insert_all(self, http_mock, patched_sleep):
-wrapper = beam.io.gcp.bigquery_tools.BigQueryWrapper()
+  def test_user_agent_insert_all(
+  self, http_mock, patched_skip_get_credentials, patched_sleep):
+try:

Review Comment:
   This does not sounds a valid workaround. It's like error happens in a line 
and then we skip the test. If it is "it needs network access to get 
credential.", then we should skip the test when there isn't network access in 
the `@skipIf` annotation, and not inside the test



##
sdks/python/apache_beam/io/gcp/bigquery_tools_test.py:
##
@@ -224,9 +224,15 @@ def test_delete_dataset_retries_for_timeouts(self, 
patched_time_sleep):
 self.assertTrue(client.datasets.Delete.called)
 
   @mock.patch('time.sleep', return_value=None)
+  @mock.patch(
+  'apitools.base.py.base_api._SkipGetCredentials', return_value=True)
   @mock.patch('google.cloud._http.JSONConnection.http')
-  def test_user_agent_insert_all(self, http_mock, patched_sleep):
-wrapper = beam.io.gcp.bigquery_tools.BigQueryWrapper()
+  def test_user_agent_insert_all(
+  self, http_mock, patched_skip_get_credentials, patched_sleep):
+try:

Review Comment:
   This does not sound a valid workaround. It's like error happens in a line 
and then we skip the test. If it is "it needs network access to get 
credential.", then we should skip the test when there isn't network access in 
the `@skipIf` annotation, and not inside the test



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Fix internal test failure from PR-30455 [beam]

2024-03-06 Thread via GitHub


Abacn commented on code in PR #30546:
URL: https://github.com/apache/beam/pull/30546#discussion_r1515232397


##
sdks/python/apache_beam/io/gcp/bigquery_tools_test.py:
##
@@ -224,9 +224,15 @@ def test_delete_dataset_retries_for_timeouts(self, 
patched_time_sleep):
 self.assertTrue(client.datasets.Delete.called)
 
   @mock.patch('time.sleep', return_value=None)
+  @mock.patch(
+  'apitools.base.py.base_api._SkipGetCredentials', return_value=True)
   @mock.patch('google.cloud._http.JSONConnection.http')
-  def test_user_agent_insert_all(self, http_mock, patched_sleep):
-wrapper = beam.io.gcp.bigquery_tools.BigQueryWrapper()
+  def test_user_agent_insert_all(
+  self, http_mock, patched_skip_get_credentials, patched_sleep):
+try:

Review Comment:
   This does not sounds a valid workaround. It's like error happens in a line 
and then we skip the test. If it is "it needs network access to get 
credential.", then we should skip the test when there isn't network access in 
the decorator, and not inside the test



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Bug]: Upgrade compatibility broken when upgrading BQ read/write transforms to 2.55.0.dev [beam]

2024-03-06 Thread via GitHub


chamikaramj commented on issue #30534:
URL: https://github.com/apache/beam/issues/30534#issuecomment-1981934491

   I'm working on a fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] ActiveWorkRefresh [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30390:
URL: https://github.com/apache/beam/pull/30390#issuecomment-1981922464

   There is a flaky test added:
   
   testInvalidateStuckCommits: https://github.com/apache/beam/runs/22276370706
   
   ```
   Wanted but not invoked:
   forComputation.invalidate(
   ,
   0L
   );
   -> at 
org.apache.beam.runners.dataflow.worker.windmill.work.refresh.DispatchedActiveWorkRefresherTest.testInvalidateStuckCommits(DispatchedActiveWorkRefresherTest.java:231)
   Actually, there were zero interactions with this mock.
   
at 
org.apache.beam.runners.dataflow.worker.windmill.work.refresh.DispatchedActiveWorkRefresherTest.testInvalidateStuckCommits(DispatchedActiveWorkRefresherTest.java:231)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Failing Test]: dataflow runner worker project test stuck causing Java PreCommit time out [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #28957:
URL: https://github.com/apache/beam/issues/28957#issuecomment-1981917789

   Other flaky test:
   
   testLatencyAttributionToQueuedState: 
https://github.com/apache/beam/runs/22270690743
   
   ```
   java.lang.AssertionError: expected: but was:
at org.junit.Assert.assertEquals(Assert.java:146)
at 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorkerTest.testLatencyAttributionToQueuedState(StreamingDataflowWorkerTest.java:3444)
   ```
   
   testInvalidateStuckCommits: https://github.com/apache/beam/runs/22276370706
   ```
   Wanted but not invoked:
   forComputation.invalidate(
   ,
   0L
   );
   -> at 
org.apache.beam.runners.dataflow.worker.windmill.work.refresh.DispatchedActiveWorkRefresherTest.testInvalidateStuckCommits(DispatchedActiveWorkRefresherTest.java:231)
   Actually, there were zero interactions with this mock.
   
at 
org.apache.beam.runners.dataflow.worker.windmill.work.refresh.DispatchedActiveWorkRefresherTest.testInvalidateStuckCommits(DispatchedActiveWorkRefresherTest.java:231)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add KafkaIO Stress test [beam]

2024-03-06 Thread via GitHub


Abacn merged PR #30467:
URL: https://github.com/apache/beam/pull/30467


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Remove warning for mixing yield and return in a DoFn [beam]

2024-03-06 Thread via GitHub


jimmytobin2425 opened a new pull request, #30549:
URL: https://github.com/apache/beam/pull/30549

   The issue this warning message points to 
(https://github.com/apache/beam/issues/22969) has been resolved showing it was 
an issue with Python 2 and not with beam. I propose to remove this warning 
message to reduce log spam.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damccorm commented on code in PR #30430:
URL: https://github.com/apache/beam/pull/30430#discussion_r1515217941


##
website/www/site/content/en/documentation/io/built-in/webapis.md:
##
@@ -0,0 +1,441 @@
+---
+title: "Web Apis I/O connector"
+---
+
+
+[Built-in I/O Transforms](/documentation/io/built-in/)
+
+# Web APIs I/O connector
+
+{{< language-switcher java py go >}}
+
+The Beam SDKs include a built-in transform, called RequestResponseIO that can 
read from and write to Web APIs such as
+REST or gRPC.
+
+## Before you start
+
+{{< paragraph class="language-java" >}}
+To use RequestResponseIO, add the Maven artifact dependency to your `pom.xml` 
file. See
+[Maven 
Central](https://central.sonatype.com/artifact/org.apache.beam/beam-sdks-java-io-rrio)
 for available versions.
+{{< /paragraph >}}
+
+{{< highlight java >}}
+
+org.apache.beam
+beam-sdks-java-io-rrio
+{{< param release_latest >}}
+
+{{< /highlight >}}
+
+{{< paragraph class="language-py" >}}
+To use RequestResponseIO, install the Beam SDK by running `pip install 
apache-beam`
+{{< /paragraph >}}

Review Comment:
   I'd still recommend having a callout that a Python implementation does exist 
(plus the example). Its cheap and better than not having it documented at all.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damccorm commented on code in PR #30430:
URL: https://github.com/apache/beam/pull/30430#discussion_r1515216455


##
website/www/site/content/en/documentation/io/built-in/webapis.md:
##
@@ -0,0 +1,441 @@
+---
+title: "Web Apis I/O connector"
+---
+
+
+[Built-in I/O Transforms](/documentation/io/built-in/)
+
+# Web APIs I/O connector
+
+{{< language-switcher java py go >}}
+
+The Beam SDKs include a built-in transform, called RequestResponseIO that can 
read from and write to Web APIs such as
+REST or gRPC.

Review Comment:
   I'm just suggesting adding a sentence here, not building out full examples. 
I think this sentence as written doesn't convey the full value of the feature.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damccorm commented on code in PR #30430:
URL: https://github.com/apache/beam/pull/30430#discussion_r1515215407


##
examples/java/webapis/src/main/java/org/apache/beam/examples/webapis/UsingHttpClientExample.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.examples.webapis;

Review Comment:
   Yep, in this case we'd be splitting up the pr not to deliver incremental 
value, but because smaller PRs are easier to review and generally safer (safety 
is a little less important for examples) see go/small-cls or 
https://blog.codacy.com/small-pull-requests



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Bug]: Upgrade compatibility broken when upgrading BQ read/write transforms to 2.55.0.dev [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #30534:
URL: https://github.com/apache/beam/issues/30534#issuecomment-1981879397

   What is the plan as for 2.55.0 release ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Bump google auth lower bound [beam]

2024-03-06 Thread via GitHub


Abacn opened a new pull request, #30548:
URL: https://github.com/apache/beam/pull/30548

   Fix tensorRT test in Python PostCommit. Caused by 
https://github.com/googleapis/google-cloud-python/issues/12254 but only 
surfaced to Beam until March 4th, likely transient dependency upgrade
   
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Failing Test]: Python PostCommit failing hdfsIntegrationTest in generate_external_transform_wrappers [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #30459:
URL: https://github.com/apache/beam/issues/30459#issuecomment-1981848636

   There is another new failure:
   
   ```
   :sdks:python:test-suites:dataflow:py38:tensorRTtests
   ```
   
   ```
ERROR:apache_beam.runners.dataflow.dataflow_runner:: JOB_MESSAGE_ERROR: 
Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1435, in 
apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 636, in 
apache_beam.runners.common.SimpleInvoker.invoke_process

  File "apache_beam/runners/common.py", line 1611, in 
apache_beam.runners.common._OutputHandler.handle_process_outputs
  File 
"/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.8/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 1555, in process
initial_restriction = self.restriction_provider.initial_restriction(
  File 
"/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.8/site-packages/apache_beam/io/iobase.py",
 line 1631, in initial_restriction
range_tracker = element_source.get_range_tracker(None, None)
  File 
"/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py",
 line 206, in get_range_tracker
return self._get_concat_source().get_range_tracker(
  File 
"/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.8/site-packages/apache_beam/options/value_provider.py",
 line 193, in _f
> Task :sdks:python:test-suites:dataflow:py38:tensorRTtests
return fnc(self, *args, **kwargs)
  File 
"/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.8/site-packages/apache_beam/io/filebasedsource.py",
 line 144, in _get_concat_source
match_result = FileSystems.match([pattern])[0]
  File 
"/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.8/site-packages/apache_beam/io/filesystems.py",
 line 204, in match
return filesystem.match(patterns, limits)
  File 
"/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.8/site-packages/apache_beam/io/filesystem.py",
 line 804, in match
raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with 
exceptions {'gs://apache-beam-ml/testing/inputs/tensorrt_image_file_names.txt': 
AttributeError("'Credentials' object has no attribute 'universe_domain'")}
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damondouglas commented on code in PR #30430:
URL: https://github.com/apache/beam/pull/30430#discussion_r1515187165


##
website/www/site/content/en/documentation/io/built-in/webapis.md:
##
@@ -0,0 +1,441 @@
+---
+title: "Web Apis I/O connector"
+---
+
+
+[Built-in I/O Transforms](/documentation/io/built-in/)
+
+# Web APIs I/O connector
+
+{{< language-switcher java py go >}}
+
+The Beam SDKs include a built-in transform, called RequestResponseIO that can 
read from and write to Web APIs such as
+REST or gRPC.
+
+## Before you start
+
+{{< paragraph class="language-java" >}}
+To use RequestResponseIO, add the Maven artifact dependency to your `pom.xml` 
file. See
+[Maven 
Central](https://central.sonatype.com/artifact/org.apache.beam/beam-sdks-java-io-rrio)
 for available versions.
+{{< /paragraph >}}
+
+{{< highlight java >}}
+
+org.apache.beam
+beam-sdks-java-io-rrio
+{{< param release_latest >}}
+
+{{< /highlight >}}
+
+{{< paragraph class="language-py" >}}
+To use RequestResponseIO, install the Beam SDK by running `pip install 
apache-beam`
+{{< /paragraph >}}

Review Comment:
   I removed the distracting go/python placeholders.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [BEAM-30531] Automatically execute unbounded pipelines in streaming mode. [beam]

2024-03-06 Thread via GitHub


robertwb commented on PR #30533:
URL: https://github.com/apache/beam/pull/30533#issuecomment-1981829484

   Yes, happy to let this change bake.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Fix internal test failure from PR-30455 [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30546:
URL: https://github.com/apache/beam/pull/30546#issuecomment-1981828119

   Assigning reviewers. If you would like to opt out of this review, comment 
`assign to next reviewer`:
   
   R: @AnandInguva for label python.
   R: @ahmedabu98 for label io.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any 
comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review 
comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damondouglas commented on code in PR #30430:
URL: https://github.com/apache/beam/pull/30430#discussion_r1515170769


##
website/www/site/content/en/documentation/io/built-in/webapis.md:
##
@@ -0,0 +1,441 @@
+---
+title: "Web Apis I/O connector"
+---
+
+
+[Built-in I/O Transforms](/documentation/io/built-in/)
+
+# Web APIs I/O connector
+
+{{< language-switcher java py go >}}
+
+The Beam SDKs include a built-in transform, called RequestResponseIO that can 
read from and write to Web APIs such as
+REST or gRPC.

Review Comment:
   @damccorm Perhaps in future PRs to make it easier to review. This website 
should just have the basics.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damondouglas commented on code in PR #30430:
URL: https://github.com/apache/beam/pull/30430#discussion_r1515178887


##
examples/java/webapis/src/main/java/org/apache/beam/examples/webapis/UsingHttpClientExample.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.examples.webapis;

Review Comment:
   The code example is necessary for the text in the website. The second 
example builds on the first. The first is not really valuable. The second is.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damondouglas commented on code in PR #30430:
URL: https://github.com/apache/beam/pull/30430#discussion_r1515170769


##
website/www/site/content/en/documentation/io/built-in/webapis.md:
##
@@ -0,0 +1,441 @@
+---
+title: "Web Apis I/O connector"
+---
+
+
+[Built-in I/O Transforms](/documentation/io/built-in/)
+
+# Web APIs I/O connector
+
+{{< language-switcher java py go >}}
+
+The Beam SDKs include a built-in transform, called RequestResponseIO that can 
read from and write to Web APIs such as
+REST or gRPC.

Review Comment:
   @damccorm I want such features in future PRs to make it easier to review. 
This PR should just have the basics.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bump github.com/cloudevents/sdk-go/v2 from 2.6.1 to 2.15.2 in /playground/backend [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30547:
URL: https://github.com/apache/beam/pull/30547#issuecomment-1981785135

   Checks are failing. Will not request review until checks are succeeding. If 
you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Failing Test]: beam_PostCommit_XVR_Direct perma-red. [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #28972:
URL: https://github.com/apache/beam/issues/28972#issuecomment-1981769577

   For some reason the window in WindowedValue decoded here
   
   
https://github.com/apache/beam/blob/1a05f39883fca49f8b8068a68a358dfe973055c0/sdks/python/apache_beam/runners/portability/fn_api_runner/execution.py#L238
   
   is not a tuple of window objects, but a type of byte e.g. 
`(b"\x80\x00\x00\x00\x00\x00'\x10\x90N",)`, `(b'\x80\x00\x00\x00\x00\x00N 
\x90N',)`, etc


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [RRIO]: Add RequestResponseIO examples and documentation on website for the Java SDK [beam]

2024-03-06 Thread via GitHub


damccorm commented on code in PR #30430:
URL: https://github.com/apache/beam/pull/30430#discussion_r1515134717


##
examples/java/webapis/src/main/java/org/apache/beam/examples/webapis/UsingHttpClientExample.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.examples.webapis;

Review Comment:
   Minor nit: in the future, it would be helpful to split PRs like this in 2; 
it makes it easier to review (and likely would drive down the time to review 
completion since it requires less time carved out)



##
website/www/site/content/en/documentation/io/built-in/webapis.md:
##
@@ -0,0 +1,441 @@
+---
+title: "Web Apis I/O connector"
+---
+
+
+[Built-in I/O Transforms](/documentation/io/built-in/)
+
+# Web APIs I/O connector
+
+{{< language-switcher java py go >}}
+
+The Beam SDKs include a built-in transform, called RequestResponseIO that can 
read from and write to Web APIs such as
+REST or gRPC.
+
+## Before you start
+
+{{< paragraph class="language-java" >}}
+To use RequestResponseIO, add the Maven artifact dependency to your `pom.xml` 
file. See
+[Maven 
Central](https://central.sonatype.com/artifact/org.apache.beam/beam-sdks-java-io-rrio)
 for available versions.
+{{< /paragraph >}}
+
+{{< highlight java >}}
+
+org.apache.beam
+beam-sdks-java-io-rrio
+{{< param release_latest >}}
+
+{{< /highlight >}}
+
+{{< paragraph class="language-py" >}}
+To use RequestResponseIO, install the Beam SDK by running `pip install 
apache-beam`
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" wrap="span" >}}
+At this time the Go SDK implementation of RequestResponseIO is not available. 
See tracker issue:
+https://github.com/apache/beam/issues/30423.
+{{< /paragraph >}}
+
+## Additional resources
+
+{{< paragraph class="language-java" wrap="span" >}}
+* [RequestResponseIO source 
code](https://github.com/apache/beam/tree/master/sdks/java/io/rrio)
+* [RequestResponseIO 
Javadoc](https://beam.apache.org/releases/javadoc/current/org/apache/beam/io/requestresponse/RequestResponseIO.html)
+{{< /paragraph >}}
+
+{{< paragraph class="language-py" wrap="span" >}}
+* [RequestResponseIO source 
code](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/requestresponse.py)
+* [RequestResponseIO 
PyDoc](https://beam.apache.org/releases/pydoc/current/apache_beam.io.requestresponse.html)
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" wrap="span" >}}
+TODO: see https://github.com/apache/beam/issues/30423.
+{{< /paragraph >}}
+
+## RequestResponseIO basics
+
+### Minimal code
+
+The minimal code needed to read from or write to Web APIs is:
+
+{{< paragraph class="language-java" wrap="span" >}}
+1. 
[Caller](https://beam.apache.org/releases/javadoc/current/org/apache/beam/io/requestresponse/Caller.html)
 implementation.
+2. Instantiate 
[RequestResponseIO](https://beam.apache.org/releases/javadoc/current/org/apache/beam/io/requestresponse/RequestResponseIO.html).
+{{< /paragraph >}}
+
+{{< paragraph class="language-py" wrap="span" >}}
+1. 
[Caller](https://beam.apache.org/releases/pydoc/current/apache_beam.io.requestresponse.html#apache_beam.io.requestresponse.Caller)
 implementation.
+2. Instantiate 
[RequestResponseIO](https://beam.apache.org/releases/pydoc/current/apache_beam.io.requestresponse.html#apache_beam.io.requestresponse.RequestResponseIO).
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" wrap="span" >}}
+TODO: see https://github.com/apache/beam/issues/30423.
+{{< /paragraph >}}
+
+ Implementing the Caller
+
+{{< paragraph class="language-java" >}}
+[Caller](https://beam.apache.org/releases/javadoc/current/org/apache/beam/io/requestresponse/Caller.html)
 requires
+only one method override.

Review Comment:
   It would be helpful to describe the purpose of the `call` function here



##
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy:
##
@@ -760,6 +760,7 @@ class BeamModulePlugin implements Plugin {
 google_cloud_platform_libraries_bom : 
"com.google.cloud:libraries-bom:26.32.0",
 google_cloud_spanner: 
"com.google.cloud:google-cloud-spanner", // 

Re: [PR] Implement ordered list state for FnApi. [beam]

2024-03-06 Thread via GitHub


shunping commented on PR #30317:
URL: https://github.com/apache/beam/pull/30317#issuecomment-1981758955

   Run Java PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] fix playground backend container builds [beam]

2024-03-06 Thread via GitHub


damccorm merged PR #30497:
URL: https://github.com/apache/beam/pull/30497


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] fix playground backend container builds [beam]

2024-03-06 Thread via GitHub


damccorm commented on PR #30497:
URL: https://github.com/apache/beam/pull/30497#issuecomment-1981719597

   Agreed this is unrelated to precommit failure (looks like a go lint issue)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Bump github.com/cloudevents/sdk-go/v2 from 2.6.1 to 2.15.2 in /playground/backend [beam]

2024-03-06 Thread via GitHub


dependabot[bot] opened a new pull request, #30547:
URL: https://github.com/apache/beam/pull/30547

   Bumps 
[github.com/cloudevents/sdk-go/v2](https://github.com/cloudevents/sdk-go) from 
2.6.1 to 2.15.2.
   
   Release notes
   Sourced from https://github.com/cloudevents/sdk-go/releases;>github.com/cloudevents/sdk-go/v2's
 releases.
   
   Release v2.15.2
   What's Changed
   
   Patch for a potential security issue. See https://github.com/cloudevents/sdk-go/blob/HEAD/TBD;>CVE-2024-28110.
   Note: this could be a breaking change for people if they purposely 
change golang's HTTP DefaultClient, or change the CloudEvents 
Client returned from NewClient, and expect those 
changes to be visible on other HTTP flows using those Clients. E.g. auth
   
   Full Changelog: https://github.com/cloudevents/sdk-go/compare/v2.15.1...v2.15.2;>https://github.com/cloudevents/sdk-go/compare/v2.15.1...v2.15.2
   Release v2.15.1
   What's Changed
   
   Bump andstor/file-existence-action from 2 to 3 by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/1009;>cloudevents/sdk-go#1009
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /test/conformance by 
https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/993;>cloudevents/sdk-go#993
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /test/benchmark by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/994;>cloudevents/sdk-go#994
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /samples/kafka by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/995;>cloudevents/sdk-go#995
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /test/integration by 
https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/996;>cloudevents/sdk-go#996
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in 
/protocol/kafka_sarama/v2 by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/997;>cloudevents/sdk-go#997
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /samples/http by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/998;>cloudevents/sdk-go#998
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /samples/nats by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/999;>cloudevents/sdk-go#999
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /samples/stan by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/1004;>cloudevents/sdk-go#1004
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in 
/samples/nats_jetstream by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/1003;>cloudevents/sdk-go#1003
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /protocol/nats/v2 by 
https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/1002;>cloudevents/sdk-go#1002
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in 
/protocol/nats_jetstream/v2 by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/1001;>cloudevents/sdk-go#1001
   Bump golang.org/x/crypto from 0.14.0 to 0.17.0 in /protocol/stan/v2 by 
https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/1000;>cloudevents/sdk-go#1000
   Propose the confluent-kafka-go binding for Kafka by https://github.com/yanmxa;>@​yanmxa in https://redirect.github.com/cloudevents/sdk-go/pull/1008;>cloudevents/sdk-go#1008
   Sync CESQL tck tests by https://github.com/Cali0707;>@​Cali0707 in https://redirect.github.com/cloudevents/sdk-go/pull/1010;>cloudevents/sdk-go#1010
   Fix docstring typos in nats and jetstream protocol by https://github.com/jafossum;>@​jafossum in https://redirect.github.com/cloudevents/sdk-go/pull/1013;>cloudevents/sdk-go#1013
   Bump golangci/golangci-lint-action from 3 to 4 by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/1016;>cloudevents/sdk-go#1016
   Bump the bundler group across 1 directories with 1 update by https://github.com/dependabot;>@​dependabot in https://redirect.github.com/cloudevents/sdk-go/pull/1011;>cloudevents/sdk-go#1011
   Remove vi swp file by https://github.com/duglin;>@​duglin in https://redirect.github.com/cloudevents/sdk-go/pull/1020;>cloudevents/sdk-go#1020
   
   New Contributors
   
   https://github.com/Cali0707;>@​Cali0707 made 
their first contribution in https://redirect.github.com/cloudevents/sdk-go/pull/1010;>cloudevents/sdk-go#1010
   https://github.com/jafossum;>@​jafossum made 
their first contribution in https://redirect.github.com/cloudevents/sdk-go/pull/1013;>cloudevents/sdk-go#1013
   
   Full Changelog: 

Re: [I] [Failing Test]: beam_PostCommit_XVR_Direct perma-red. [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #28972:
URL: https://github.com/apache/beam/issues/28972#issuecomment-1981692067

   CC: @robertwb 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Allow local runners to execute arbitrary cross language pipelines without Docker. [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #29283:
URL: https://github.com/apache/beam/pull/29283#issuecomment-1981680333

   It appears Python XVR Direct is failing after this change. Specifically 
https://github.com/apache/beam/issues/28972#issuecomment-1930538626 any ideas?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Fix pr 30455 [beam]

2024-03-06 Thread via GitHub


shunping opened a new pull request, #30546:
URL: https://github.com/apache/beam/pull/30546

   PR #30455 has caused some internal test failure. We are going to skip the 
test if BigQueryWrapper cannot be imported.
   
   (Internal bug id: 302004313)
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit 
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   

   [![Build python source distribution and 
wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python 
tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java 
tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go 
tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more 
information about GitHub Actions CI or the [workflows 
README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) 
to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Failing Test]: beam_PostCommit_XVR_Direct perma-red. [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #28972:
URL: https://github.com/apache/beam/issues/28972#issuecomment-1981654497

   This is a regression in Beam 2.53.0. Unfortunately GHA logs expires in 3 
months. From now one only knows the regression happens between Nov 8, 2023 
(last successful run and 
https://github.com/apache/beam/actions/runs/7018120946) - Dec 8, 2023 #2111 
(first run see this issue and still has log)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] fix playground backend container builds [beam]

2024-03-06 Thread via GitHub


volatilemolotov commented on PR #30497:
URL: https://github.com/apache/beam/pull/30497#issuecomment-1981623256

   No, seems like a test failure, that part of code was not touched


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Document requirements cache behavior differences. [beam]

2024-03-06 Thread via GitHub


tvalentyn commented on PR #30493:
URL: https://github.com/apache/beam/pull/30493#issuecomment-1981593752

   Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Document requirements cache behavior differences. [beam]

2024-03-06 Thread via GitHub


rszper commented on code in PR #30493:
URL: https://github.com/apache/beam/pull/30493#discussion_r1514988361


##
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
 
 The runner will use the `requirements.txt` file to install your additional 
dependencies onto the remote workers.
 
-> **NOTE**: An alternative to `pip freeze` is to use a library like 
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the 
dependencies required for the pipeline from a `--requirements_file`, where only 
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like 
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the 
dependencies required for the pipeline from a `requirements.in` file, where 
only the top-level dependencies are mentioned.

Review Comment:
   ```suggestion
   > **NOTE**: As an alternative to `pip freeze`, use a library like 
[pip-tools](https://github.com/jazzband/pip-tools) to compile all of the 
dependencies required for the pipeline from a `requirements.in` file. In the 
`requirements.in` file, only the top-level dependencies are mentioned.
   ```



##
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
 
 The runner will use the `requirements.txt` file to install your additional 
dependencies onto the remote workers.
 
-> **NOTE**: An alternative to `pip freeze` is to use a library like 
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the 
dependencies required for the pipeline from a `--requirements_file`, where only 
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like 
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the 
dependencies required for the pipeline from a `requirements.in` file, where 
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
+submission, and stages the requirements cache directory to the runner.
+At pipeline runtime, Beam prefers to install packages from requirements cache
+if available. This mechanism allows staging dependency packages to the runner
+at submission, and at runtime the runner workers might be able to install the
+packages from cache, without a connection to PyPI. To disable staging the
+requirements, supply the `--requirements_cache=skip` pipeline option.

Review Comment:
   ```suggestion
   requirements, use the `--requirements_cache=skip` pipeline option.
   ```



##
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##
@@ -118,7 +128,10 @@ Often, your pipeline code spans multiple files. To run 
your project remotely, yo
 
 --setup_file /path/to/setup.py
 
-**Note:** If you [created a requirements.txt file](#pypi-dependencies) and 
your project spans multiple files, you can get rid of the `requirements.txt` 
file and instead, add all packages contained in `requirements.txt` to the 
`install_requires` field of the setup call (in step 1).
+**Note:** It is not necessary to supply the `--requirements_file` 
[option](#pypi-dependencies) if the dependenices of your package are defined in 
the `install_requires` field of the `setup.py` file (see step 1).
+However unlike the `--requirements_file` option, when  using the 
`--setup_file` option, Beam does not stage the dependent packages to the Runner,
+only the pipeline package is staged and its dependencies are installed from 
PyPI

Review Comment:
   ```suggestion
   Only the pipeline package is staged. If they aren't already provided in the 
runtime environment,
   ```



##
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
 
 The runner will use the `requirements.txt` file to install your additional 
dependencies onto the remote workers.
 
-> **NOTE**: An alternative to `pip freeze` is to use a library like 
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the 
dependencies required for the pipeline from a `--requirements_file`, where only 
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like 
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the 
dependencies required for the pipeline from a `requirements.in` file, where 
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline

Review Comment:
   ```suggestion
   the specified packages locally into a requirements cache directory,
   ```




Re: [I] [Task]: Update the minor version of protobuf library in the upper bound prior to Beam release. [beam]

2024-03-06 Thread via GitHub


riteshghorse commented on issue #25590:
URL: https://github.com/apache/beam/issues/25590#issuecomment-1981553174

   +1. I'm working with @tvalentyn on workaround for that


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add Redistribute transform to Java SDK and Dataflow translation [beam]

2024-03-06 Thread via GitHub


kennknowles commented on PR #30545:
URL: https://github.com/apache/beam/pull/30545#issuecomment-1981548748

   One question which wasn't in the design docs was how to implement: wrap 
Reshuffle (aka build a composite that just invokes Reshuffle and relies on 
everything built around it) or fork. This PR chose to fork. Pro/con:
   
   Arguments for wrapping:
- less code
- runners that implement Reshuffle specially already will implement 
Redistribute the same way
- if there is something I missed in how Reshuffle is treated, it will get 
picked up because we are still using it
   
   Arguments for forking:
- decouple whatever state a runner may store, and just generally decouple 
their evolution
- people won't unpack their "Redistribute" and see a Reshuffle inside and 
get the wrong idea
- (minor) can remove update compatibility path
- if there is something I missed in how Redistribute and Reshuffle are 
different, we are free for them to diverge
   
   So I chose forking but could be convinced otherwise. My way creates more 
code and more work, because we need to make runners treat it specially - for 
Dataflow we can do way better than the existing Reshuffle translation, for the 
other runners it'll be a re-use of existing Reshuffle translation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [Python] Vertex AI Feature Store enrichment handler [beam]

2024-03-06 Thread via GitHub


riteshghorse merged PR #30388:
URL: https://github.com/apache/beam/pull/30388


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [Python] Vertex AI Feature Store enrichment handler [beam]

2024-03-06 Thread via GitHub


riteshghorse commented on PR #30388:
URL: https://github.com/apache/beam/pull/30388#issuecomment-1981522022

   Unrelated failure/flake. Merging this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Fix hdfs integration test [beam]

2024-03-06 Thread via GitHub


Abacn merged PR #30458:
URL: https://github.com/apache/beam/pull/30458


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Failing Test]: Python PostCommit failing hdfsIntegrationTest in generate_external_transform_wrappers [beam]

2024-03-06 Thread via GitHub


Abacn closed issue #30459: [Failing Test]: Python PostCommit failing 
hdfsIntegrationTest in generate_external_transform_wrappers
URL: https://github.com/apache/beam/issues/30459


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Fix hdfs integration test [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30458:
URL: https://github.com/apache/beam/pull/30458#issuecomment-1981509771

   Python Test need rebase to pass


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Add KafkaIO Stress test [beam]

2024-03-06 Thread via GitHub


Abacn commented on code in PR #30467:
URL: https://github.com/apache/beam/pull/30467#discussion_r1514948281


##
it/kafka/src/test/java/org/apache/beam/it/kafka/KafkaIOST.java:
##
@@ -0,0 +1,480 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.it.kafka;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.cloud.Timestamp;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.time.Duration;
+import java.time.ZoneId;
+import java.time.format.DateTimeFormatter;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+import org.apache.beam.it.common.PipelineLauncher;
+import org.apache.beam.it.common.PipelineOperator;
+import org.apache.beam.it.common.TestProperties;
+import org.apache.beam.it.gcp.IOLoadTestBase;
+import 
org.apache.beam.runners.dataflow.options.DataflowPipelineWorkerPoolOptions;
+import org.apache.beam.sdk.io.kafka.KafkaIO;
+import org.apache.beam.sdk.io.synthetic.SyntheticSourceOptions;
+import org.apache.beam.sdk.options.StreamingOptions;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.TestPipelineOptions;
+import org.apache.beam.sdk.testutils.NamedTestResult;
+import org.apache.beam.sdk.testutils.metrics.IOITMetrics;
+import org.apache.beam.sdk.testutils.publishing.InfluxDBSettings;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.MapElements;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.PeriodicImpulse;
+import org.apache.beam.sdk.transforms.Reshuffle;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Strings;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.ImmutableMap;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.primitives.Longs;
+import org.apache.kafka.clients.admin.AdminClient;
+import org.apache.kafka.clients.admin.NewTopic;
+import org.apache.kafka.clients.producer.ProducerConfig;
+import org.apache.kafka.common.serialization.ByteArraySerializer;
+import org.joda.time.Instant;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+
+/**
+ * KafkaIO stress tests. The test is designed to assess the performance of 
KafkaIO under various
+ * conditions. To run the test, a live remote Kafka broker is required. You 
can deploy Kafka within
+ * a Kubernetes cluster following the example described here: {@link
+ * .github/workflows/beam_PerformanceTests_Kafka_IO.yml} If you choose to use 
Kubernetes, it's
+ * important to remember that each pod should have a minimum of 10GB memory 
allocated. Additionally,
+ * when running the test, it's necessary to pass the addresses of Kafka 
bootstrap servers as an
+ * argument.
+ *
+ * Usage: 
+ * - To run medium-scale stress tests: {@code gradle 
:it:kafka:KafkaStressTestMedium
+ * -DbootstrapServers="0.0.0.0:32400,1.1.1.1:32400"} 
+ * - To run large-scale stress tests: {@code gradle 
:it:kafka:KafkaStressTestLarge
+ * -DbootstrapServers="0.0.0.0:32400,1.1.1.1:32400"}
+ */
+public final class KafkaIOST extends IOLoadTestBase {
+  /**
+   * The load will initiate at 1x, progressively increase to 2x and 4x, then 
decrease to 2x and
+   * eventually return to 1x.
+   */
+  private static final int[] DEFAULT_LOAD_INCREASE_ARRAY = {1, 2, 2, 4, 2, 1};
+
+  private static InfluxDBSettings influxDBSettings;
+  private static final String WRITE_ELEMENT_METRIC_NAME = "write_count";
+  private static final String READ_ELEMENT_METRIC_NAME = "read_count";
+  private static final int DEFAULT_ROWS_PER_SECOND = 1000;
+  private Configuration configuration;
+  private AdminClient adminClient;
+  private String testConfigName;
+  private String tempLocation;
+  private String kafkaTopic;
+
+  @Rule public TestPipeline writePipeline = TestPipeline.create();
+  @Rule public TestPipeline 

Re: [PR] Implement ordered list state for FnApi. [beam]

2024-03-06 Thread via GitHub


github-actions[bot] commented on PR #30317:
URL: https://github.com/apache/beam/pull/30317#issuecomment-1981431117

   Assigning reviewers. If you would like to opt out of this review, comment 
`assign to next reviewer`:
   
   R: @robertwb for label java.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any 
comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review 
comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Bug]: Go SDK Dataflow jobs fail on DataSampling disabled [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #29760:
URL: https://github.com/apache/beam/issues/29760#issuecomment-1981419452

   It seems this issue was resolved? The linked workflow run 
https://github.com/apache/beam/issues/29760#issuecomment-1854793020 was 
successful


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Refactor commit logic out of StreamingDataflowWorker [beam]

2024-03-06 Thread via GitHub


scwhittle commented on PR #30312:
URL: https://github.com/apache/beam/pull/30312#issuecomment-1981420874

   There are still some open comments.  This is part of work to support direct 
path which won't be ready for the cut. So I don't think this should hold up the 
cut.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Bug]: Go SDK Dataflow jobs fail on DataSampling disabled [beam]

2024-03-06 Thread via GitHub


Abacn closed issue #29760: [Bug]: Go SDK Dataflow jobs fail on DataSampling 
disabled
URL: https://github.com/apache/beam/issues/29760


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Task]: Update the minor version of protobuf library in the upper bound prior to Beam release. [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #25590:
URL: https://github.com/apache/beam/issues/25590#issuecomment-1981406944

   4.25.3 is the latest 4.x release as of 2.55.0 cut day. and it is `<4.26.0`. 
Next release needs to aware that the upcoming release bumped the major version, 
4.25 -> 5.26, indicating breaking change 
   
   note: protobuf currently encode its version in minor version number for all 
language impls; and the major version is reserved for language specific 
breaking change


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Task]: Update the minor version of cloudpickle library prior to Beam release. [beam]

2024-03-06 Thread via GitHub


Abacn commented on issue #23119:
URL: https://github.com/apache/beam/issues/23119#issuecomment-1981400099

   2.2.1 is still the latest of 2.x as of 2.55.0 cut


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Refactor commit logic out of StreamingDataflowWorker [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30312:
URL: https://github.com/apache/beam/pull/30312#issuecomment-1981396920

   fyi today is release cut. What is the status of this PR? I see the comments 
are all replied? CC: @scwhittle @m-trieu 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] fix playground backend container builds [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30497:
URL: https://github.com/apache/beam/pull/30497#issuecomment-1981390739

   Is Playground PreCommit failure related to this change?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Bigtable: use InstanceBuilder to dynamic load override class [beam]

2024-03-06 Thread via GitHub


Abacn merged PR #30542:
URL: https://github.com/apache/beam/pull/30542


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Revert "Document yaml pipeline options" [beam]

2024-03-06 Thread via GitHub


Abacn closed pull request #30536: Revert "Document yaml pipeline options"
URL: https://github.com/apache/beam/pull/30536


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Fix hdfs integration test [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30458:
URL: https://github.com/apache/beam/pull/30458#issuecomment-1981375814

   R: @riteshghorse 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Fix hdfs integration test [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30458:
URL: https://github.com/apache/beam/pull/30458#issuecomment-1981375569

   both hdfsIntegration test and azureIntegrationTest passed, though there are 
new failures due to #30417 and reverted in #30535


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Revert "Add test code to overwrite SQL in Beam Python JDBC (#30417)" [beam]

2024-03-06 Thread via GitHub


Abacn commented on PR #30535:
URL: https://github.com/apache/beam/pull/30535#issuecomment-1981370763

   PreCommit Python failed unrelated change and fixed by #30540, merging for now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Revert "Add test code to overwrite SQL in Beam Python JDBC (#30417)" [beam]

2024-03-06 Thread via GitHub


Abacn merged PR #30535:
URL: https://github.com/apache/beam/pull/30535


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



  1   2   >