(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/pr-bot-state by this push:
     new 9dae3bdcb0a Updating config from bot
9dae3bdcb0a is described below

commit 9dae3bdcb0a8ada4fa305f144e96d153389b4365
Author: github-actions
AuthorDate: Fri Apr 19 05:05:49 2024 +0000

    Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31049.json | 8
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31049.json b/scripts/ci/pr-bot/state/pr-state/pr-31049.json
new file mode 100644
index 00000000000..242a48d7d3b
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31049.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": true,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": false,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file
(beam) branch dependabot/go_modules/sdks/github.com/docker/docker-26.0.2incompatible created (now fbb9402d5ae)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/go_modules/sdks/github.com/docker/docker-26.0.2incompatible
in repository https://gitbox.apache.org/repos/asf/beam.git

      at fbb9402d5ae Bump github.com/docker/docker in /sdks

No new revisions were added by this update.
(beam) branch dependabot/go_modules/sdks/github.com/docker/docker-26.0.1incompatible deleted (was f85b499a5cf)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/go_modules/sdks/github.com/docker/docker-26.0.1incompatible
in repository https://gitbox.apache.org/repos/asf/beam.git

     was f85b499a5cf Bump github.com/docker/docker in /sdks

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.
(beam) branch nightly-refs/heads/master updated (4f964bf05d5 -> d05196dea8a)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch nightly-refs/heads/master
in repository https://gitbox.apache.org/repos/asf/beam.git

    from 4f964bf05d5 Fix workflow param value for Grafana link (#31011)
     add bcb40cf4e4a Change caching of global window inputs to be guarded by experiment (#31013)
     add 2eb1a756258 [Python] Clean doc related to write data in bigquery.py (#30887)
     add 04ff4bdd7fc Support BQ clustering with value provider (#30460)
     add 61153bbda6a Update documentation of @SchemaFieldNumber (#30273) (#30277)
     add bb0b63cb940 Bump TPCDS test Flink version (#31041)
     add 70e067e1fde fix url for content security (#31043)
     add bb310e7e907 Change type for UnboundedReaderMaxReadTimeSec (#31037)
     add b69e8c615af Updates Python Dev container used by Dataflow (#31029)
     add 76c77cd28ae Fix typo in count_unique_words() (#31023)
     add 3e52e3554a0 Add code change guide contributor-doc (#30879)
     add d05196dea8a Upgrade the version of GRPC to pick up a fix for #30867 (#31044)

No new revisions were added by this update.

Summary of changes:
 .../workflows/beam_PostCommit_Java_Tpcds_Flink.yml | 2 +-
 contributor-docs/code-change-guide.md | 525 +
 .../options/DataflowPipelineDebugOptions.java | 9 +-
 .../dataflow/worker/WorkerCustomSources.java | 4 +-
 .../dataflow/worker/WorkerCustomSourcesTest.java | 7 +-
 .../sdk/schemas/annotations/SchemaFieldNumber.java | 7 +-
 .../beam/sdk/io/gcp/bigquery/BigQueryHelpers.java | 22 +
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java | 31 +-
 .../sdk/io/gcp/bigquery/BigQueryIOTranslation.java | 18 +-
 .../gcp/bigquery/DynamicDestinationsHelpers.java | 8 +
 .../sdk/io/gcp/bigquery/BigQueryClusteringIT.java | 5 +-
 .../sdk/io/gcp/bigquery/BigQueryHelpersTest.java | 11 +
 .../io/gcp/bigquery/BigQueryIOTranslationTest.java | 10 +-
 .../BigQueryTimePartitioningClusteringIT.java | 1 +
 sdks/python/apache_beam/io/gcp/bigquery.py | 6 +-
 sdks/python/apache_beam/ml/transforms/tft.py | 4 +-
 sdks/python/apache_beam/runners/common.pxd | 4 +-
 sdks/python/apache_beam/runners/common.py | 75 ++-
 .../apache_beam/runners/dataflow/internal/names.py | 2 +-
 .../container/py310/base_image_requirements.txt | 8 +-
 .../container/py311/base_image_requirements.txt | 8 +-
 .../container/py38/base_image_requirements.txt | 8 +-
 .../container/py39/base_image_requirements.txt | 8 +-
 sdks/python/setup.py | 6 +-
 website/www/site/static/.htaccess | 2 +-
 25 files changed, 707 insertions(+), 84 deletions(-)
 create mode 100644 contributor-docs/code-change-guide.md
(beam) branch asf-site updated: Publishing website 2024/04/18 23:41:01 at commit d05196d
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new f82564c7a92 Publishing website 2024/04/18 23:41:01 at commit d05196d
f82564c7a92 is described below

commit f82564c7a927c321ecec8aa1c094533bb84edf4c
Author: runner
AuthorDate: Thu Apr 18 23:41:02 2024 +0000

    Publishing website 2024/04/18 23:41:01 at commit d05196d
---
 website/generated-content/.htaccess | 2 +-
 website/generated-content/sitemap.xml | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/website/generated-content/.htaccess b/website/generated-content/.htaccess
index a93e707fdb3..51e28c9a274 100644
--- a/website/generated-content/.htaccess
+++ b/website/generated-content/.htaccess
@@ -27,4 +27,4 @@ RedirectMatch "/contribute/release-guide" "https://github.com/apache/beam/blob/m
 RedirectMatch "/contribute/committer-guide" "https://github.com/apache/beam/blob/master/contributor-docs/committer-guide.md"

-Header set Content-Security-Policy "frame-src 'self' https://play.beam.apache.org/ https://youtube.com/ ;"
+Header set Content-Security-Policy "frame-src 'self' https://play.beam.apache.org/ https://www.youtube.com/ ;"
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index d6a95ad8b4c..a6d90e353aa 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-http://www.sitemaps.org/schemas/sitemap/0.9; xmlns:xhtml="http://www.w3.org/1999/xhtml;>/categories/blog/2024-04-18T13:26:40-04:00/blog/2024-04-18T13:26:40-04:00/categories/2024-04-18T13:26:40-04:00/blog/beam-yaml-release/2024-04-18T13:26:40-04:00/ [...]
\ No newline at end of file
+http://www.sitemaps.org/schemas/sitemap/0.9; xmlns:xhtml="http://www.w3.org/1999/xhtml;>/categories/blog/2024-04-18T15:31:01-07:00/blog/2024-04-18T15:31:01-07:00/categories/2024-04-18T15:31:01-07:00/blog/beam-yaml-release/2024-04-18T15:31:01-07:00/ [...]
\ No newline at end of file
(beam) branch release-2.56.0 updated: [release-2.56.0] Exclude broken versions of GRPCIO and upgrade the base image requirements (#31045)
This is an automated email from the ASF dual-hosted git repository.

tvalentyn pushed a commit to branch release-2.56.0
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/release-2.56.0 by this push:
     new 99688da2ba7 [release-2.56.0] Exclude broken versions of GRPCIO and upgrade the base image requirements (#31045)
99688da2ba7 is described below

commit 99688da2ba75e239603f0cd031e2d83d0ab761d4
Author: tvalentyn
AuthorDate: Thu Apr 18 15:31:06 2024 -0700

    [release-2.56.0] Exclude broken versions of GRPCIO and upgrade the base image requirements (#31045)

    * Exclude broken versions of GRPCIO

    * Upgrade requirements.
---
 sdks/python/container/py310/base_image_requirements.txt | 8
 sdks/python/container/py311/base_image_requirements.txt | 8
 sdks/python/container/py38/base_image_requirements.txt | 8
 sdks/python/container/py39/base_image_requirements.txt | 8
 sdks/python/setup.py | 6 ++
 5 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/sdks/python/container/py310/base_image_requirements.txt b/sdks/python/container/py310/base_image_requirements.txt
index 32bdfa95bc8..980dd99d5b7 100644
--- a/sdks/python/container/py310/base_image_requirements.txt
+++ b/sdks/python/container/py310/base_image_requirements.txt
@@ -50,7 +50,7 @@ fastavro==1.9.4
 fasteners==0.19
 freezegun==1.4.0
 future==1.0.0
-google-api-core==2.16.2
+google-api-core==2.18.0
 google-api-python-client==2.126.0
 google-apitools==0.5.31
 google-auth==2.29.0
@@ -78,8 +78,8 @@ googleapis-common-protos==1.63.0
 greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
-grpcio==1.62.1
-grpcio-status==1.62.1
+grpcio==1.62.2
+grpcio-status==1.62.2
 guppy3==3.1.4.post1
 hdfs==2.7.3
 httplib2==0.22.0
@@ -137,7 +137,7 @@ rpds-py==0.18.0
 rsa==4.9
 scikit-learn==1.4.2
 scipy==1.13.0
-shapely==2.0.3
+shapely==2.0.4
 six==1.16.0
 sortedcontainers==2.4.0
 soupsieve==2.5
diff --git a/sdks/python/container/py311/base_image_requirements.txt b/sdks/python/container/py311/base_image_requirements.txt
index 6db63f234ef..f0615b45b29 100644
--- a/sdks/python/container/py311/base_image_requirements.txt
+++ b/sdks/python/container/py311/base_image_requirements.txt
@@ -48,7 +48,7 @@ fastavro==1.9.4
 fasteners==0.19
 freezegun==1.4.0
 future==1.0.0
-google-api-core==2.16.2
+google-api-core==2.18.0
 google-api-python-client==2.126.0
 google-apitools==0.5.31
 google-auth==2.29.0
@@ -76,8 +76,8 @@ googleapis-common-protos==1.63.0
 greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
-grpcio==1.62.1
-grpcio-status==1.62.1
+grpcio==1.62.2
+grpcio-status==1.62.2
 guppy3==3.1.4.post1
 hdfs==2.7.3
 httplib2==0.22.0
@@ -135,7 +135,7 @@ rpds-py==0.18.0
 rsa==4.9
 scikit-learn==1.4.2
 scipy==1.13.0
-shapely==2.0.3
+shapely==2.0.4
 six==1.16.0
 sortedcontainers==2.4.0
 soupsieve==2.5
diff --git a/sdks/python/container/py38/base_image_requirements.txt b/sdks/python/container/py38/base_image_requirements.txt
index f59c4004078..c87b4fac4b2 100644
--- a/sdks/python/container/py38/base_image_requirements.txt
+++ b/sdks/python/container/py38/base_image_requirements.txt
@@ -51,7 +51,7 @@ fastavro==1.9.4
 fasteners==0.19
 freezegun==1.4.0
 future==1.0.0
-google-api-core==2.16.2
+google-api-core==2.18.0
 google-api-python-client==2.126.0
 google-apitools==0.5.31
 google-auth==2.29.0
@@ -79,8 +79,8 @@ googleapis-common-protos==1.63.0
 greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
-grpcio==1.62.1
-grpcio-status==1.62.1
+grpcio==1.62.2
+grpcio-status==1.62.2
 guppy3==3.1.4.post1
 hdfs==2.7.3
 httplib2==0.22.0
@@ -141,7 +141,7 @@ rpds-py==0.18.0
 rsa==4.9
 scikit-learn==1.3.2
 scipy==1.10.1
-shapely==2.0.3
+shapely==2.0.4
 six==1.16.0
 sortedcontainers==2.4.0
 soupsieve==2.5
diff --git a/sdks/python/container/py39/base_image_requirements.txt b/sdks/python/container/py39/base_image_requirements.txt
index 0b4a933e788..bd63ce55de0 100644
--- a/sdks/python/container/py39/base_image_requirements.txt
+++ b/sdks/python/container/py39/base_image_requirements.txt
@@ -50,7 +50,7 @@ fastavro==1.9.4
 fasteners==0.19
 freezegun==1.4.0
 future==1.0.0
-google-api-core==2.16.2
+google-api-core==2.18.0
 google-api-python-client==2.126.0
 google-apitools==0.5.31
 google-auth==2.29.0
@@ -78,8 +78,8 @@ googleapis-common-protos==1.63.0
 greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
-grpcio==1.62.1
-grpcio-status==1.62.1
+grpcio==1.62.2
+grpcio-status==1.62.2
 guppy3==3.1.4.post1
 hdfs==2.7.3
 httplib2==0.22.0
@@ -138,7 +138,7 @@ rpds-py==0.18.0
 rsa==4.9
 scikit-learn==1.4.2
 scipy==1.13.0
-shapely==2.0.3
+shapely==2.0.4
 six==1.16.0
 sortedcontainers==2.4.0
 soupsieve==2.5
diff --git a/sdks/python/setup.py b/sdks/python/setup.py
index ad8923fcc8d..13799dca942 100644
--- a/sdks/python/setup.py
+++ b/sdks/python/setup.py
@@ -365,7 +365,7 @@ if __name__ ==
(beam) branch master updated: Upgrade the version of GRPC to pick up a fix for #30867 (#31044)
This is an automated email from the ASF dual-hosted git repository.

tvalentyn pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new d05196dea8a Upgrade the version of GRPC to pick up a fix for #30867 (#31044)
d05196dea8a is described below

commit d05196dea8a393ef2afec8e95c49a309375a42d9
Author: tvalentyn
AuthorDate: Thu Apr 18 15:31:01 2024 -0700

    Upgrade the version of GRPC to pick up a fix for #30867 (#31044)

    * Exclude broken versions of GRPCIO

    * Upgrade requirements.
---
 sdks/python/container/py310/base_image_requirements.txt | 8
 sdks/python/container/py311/base_image_requirements.txt | 8
 sdks/python/container/py38/base_image_requirements.txt | 8
 sdks/python/container/py39/base_image_requirements.txt | 8
 sdks/python/setup.py | 6 ++
 5 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/sdks/python/container/py310/base_image_requirements.txt b/sdks/python/container/py310/base_image_requirements.txt
index 32bdfa95bc8..980dd99d5b7 100644
--- a/sdks/python/container/py310/base_image_requirements.txt
+++ b/sdks/python/container/py310/base_image_requirements.txt
@@ -50,7 +50,7 @@ fastavro==1.9.4
 fasteners==0.19
 freezegun==1.4.0
 future==1.0.0
-google-api-core==2.16.2
+google-api-core==2.18.0
 google-api-python-client==2.126.0
 google-apitools==0.5.31
 google-auth==2.29.0
@@ -78,8 +78,8 @@ googleapis-common-protos==1.63.0
 greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
-grpcio==1.62.1
-grpcio-status==1.62.1
+grpcio==1.62.2
+grpcio-status==1.62.2
 guppy3==3.1.4.post1
 hdfs==2.7.3
 httplib2==0.22.0
@@ -137,7 +137,7 @@ rpds-py==0.18.0
 rsa==4.9
 scikit-learn==1.4.2
 scipy==1.13.0
-shapely==2.0.3
+shapely==2.0.4
 six==1.16.0
 sortedcontainers==2.4.0
 soupsieve==2.5
diff --git a/sdks/python/container/py311/base_image_requirements.txt b/sdks/python/container/py311/base_image_requirements.txt
index 6db63f234ef..f0615b45b29 100644
--- a/sdks/python/container/py311/base_image_requirements.txt
+++ b/sdks/python/container/py311/base_image_requirements.txt
@@ -48,7 +48,7 @@ fastavro==1.9.4
 fasteners==0.19
 freezegun==1.4.0
 future==1.0.0
-google-api-core==2.16.2
+google-api-core==2.18.0
 google-api-python-client==2.126.0
 google-apitools==0.5.31
 google-auth==2.29.0
@@ -76,8 +76,8 @@ googleapis-common-protos==1.63.0
 greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
-grpcio==1.62.1
-grpcio-status==1.62.1
+grpcio==1.62.2
+grpcio-status==1.62.2
 guppy3==3.1.4.post1
 hdfs==2.7.3
 httplib2==0.22.0
@@ -135,7 +135,7 @@ rpds-py==0.18.0
 rsa==4.9
 scikit-learn==1.4.2
 scipy==1.13.0
-shapely==2.0.3
+shapely==2.0.4
 six==1.16.0
 sortedcontainers==2.4.0
 soupsieve==2.5
diff --git a/sdks/python/container/py38/base_image_requirements.txt b/sdks/python/container/py38/base_image_requirements.txt
index f59c4004078..c87b4fac4b2 100644
--- a/sdks/python/container/py38/base_image_requirements.txt
+++ b/sdks/python/container/py38/base_image_requirements.txt
@@ -51,7 +51,7 @@ fastavro==1.9.4
 fasteners==0.19
 freezegun==1.4.0
 future==1.0.0
-google-api-core==2.16.2
+google-api-core==2.18.0
 google-api-python-client==2.126.0
 google-apitools==0.5.31
 google-auth==2.29.0
@@ -79,8 +79,8 @@ googleapis-common-protos==1.63.0
 greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
-grpcio==1.62.1
-grpcio-status==1.62.1
+grpcio==1.62.2
+grpcio-status==1.62.2
 guppy3==3.1.4.post1
 hdfs==2.7.3
 httplib2==0.22.0
@@ -141,7 +141,7 @@ rpds-py==0.18.0
 rsa==4.9
 scikit-learn==1.3.2
 scipy==1.10.1
-shapely==2.0.3
+shapely==2.0.4
 six==1.16.0
 sortedcontainers==2.4.0
 soupsieve==2.5
diff --git a/sdks/python/container/py39/base_image_requirements.txt b/sdks/python/container/py39/base_image_requirements.txt
index 0b4a933e788..bd63ce55de0 100644
--- a/sdks/python/container/py39/base_image_requirements.txt
+++ b/sdks/python/container/py39/base_image_requirements.txt
@@ -50,7 +50,7 @@ fastavro==1.9.4
 fasteners==0.19
 freezegun==1.4.0
 future==1.0.0
-google-api-core==2.16.2
+google-api-core==2.18.0
 google-api-python-client==2.126.0
 google-apitools==0.5.31
 google-auth==2.29.0
@@ -78,8 +78,8 @@ googleapis-common-protos==1.63.0
 greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
-grpcio==1.62.1
-grpcio-status==1.62.1
+grpcio==1.62.2
+grpcio-status==1.62.2
 guppy3==3.1.4.post1
 hdfs==2.7.3
 httplib2==0.22.0
@@ -138,7 +138,7 @@ rpds-py==0.18.0
 rsa==4.9
 scikit-learn==1.4.2
 scipy==1.13.0
-shapely==2.0.3
+shapely==2.0.4
 six==1.16.0
 sortedcontainers==2.4.0
 soupsieve==2.5
diff --git a/sdks/python/setup.py b/sdks/python/setup.py
index ad8923fcc8d..13799dca942 100644
--- a/sdks/python/setup.py
+++ b/sdks/python/setup.py
@@ -365,7 +365,7 @@ if __name__ == '__main__':
 'cloudpickle~=2.2.1',
 'fastavro>=0.23.6,<2',
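The requirement hunks above pair each removed pin with an added one. As a reading aid, here is a minimal Python sketch that extracts such version bumps from unified-diff lines. The helper name `version_bumps` is hypothetical and not part of Beam or any tool; it assumes plain `name==version` pins, as in the `base_image_requirements.txt` files shown.

```python
def version_bumps(diff_lines):
    """Pair removed and added `name==version` pins from unified-diff lines."""
    old, new = {}, {}
    for line in diff_lines:
        # Skip file headers such as '--- a/...' and '+++ b/...'.
        if line.startswith(("---", "+++")):
            continue
        if line.startswith(("-", "+")) and "==" in line:
            name, version = line[1:].strip().split("==", 1)
            (new if line[0] == "+" else old)[name] = version
    # Keep only packages whose pinned version actually changed.
    return {n: (old[n], new[n]) for n in new if n in old and old[n] != new[n]}


# Lines copied from the py310 hunks in the email above.
hunk = [
    "-google-api-core==2.16.2",
    "+google-api-core==2.18.0",
    " grpc-interceptor==0.15.4",
    "-grpcio==1.62.1",
    "-grpcio-status==1.62.1",
    "+grpcio==1.62.2",
    "+grpcio-status==1.62.2",
    " guppy3==3.1.4.post1",
    "-shapely==2.0.3",
    "+shapely==2.0.4",
]
bumps = version_bumps(hunk)
for name, (before, after) in bumps.items():
    print(f"{name}: {before} -> {after}")
```

Context lines (those starting with a space) are ignored, so the same four bumps are reported for each of the four container files.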
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/pr-bot-state by this push:
     new 6e08086e899 Updating config from bot
6e08086e899 is described below

commit 6e08086e899e0108deb96c6a6ef4a04774738f27
Author: github-actions
AuthorDate: Thu Apr 18 22:06:06 2024 +0000

    Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31046.json | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31046.json b/scripts/ci/pr-bot/state/pr-state/pr-31046.json
new file mode 100644
index 00000000000..aa3638a74ec
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31046.json
@@ -0,0 +1,10 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {
+    "go": "riteshghorse"
+  },
+  "nextAction": "Reviewers",
+  "stopReviewerNotifications": false,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file
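The PR state file added above is plain JSON, so it can be inspected with any JSON parser. The following Python sketch loads the `pr-31046.json` content from the diff and prints a one-line summary. The `summarize` helper is hypothetical, and it makes no claim about how the pr-bot itself interprets these fields; it only reads the keys that appear in the diff.

```python
import json

# Content of pr-31046.json as added by the commit above.
STATE_JSON = """
{
  "commentedAboutFailingChecks": false,
  "reviewersAssignedForLabels": {
    "go": "riteshghorse"
  },
  "nextAction": "Reviewers",
  "stopReviewerNotifications": false,
  "remindAfterTestsPass": [],
  "committerAssigned": false
}
"""


def summarize(state):
    """Return a short, human-readable summary of a PR state dict."""
    reviewers = state["reviewersAssignedForLabels"]
    assigned = ", ".join(
        f"{label}: {user}" for label, user in sorted(reviewers.items())
    ) or "none"
    return f"next action: {state['nextAction']}; reviewers: {assigned}"


state = json.loads(STATE_JSON)
summary = summarize(state)
print(summary)
```

For state files whose `reviewersAssignedForLabels` is an empty object, such as the ones for PRs 31044 and 31045, the reviewers part of the summary reads `none`.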
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/pr-bot-state by this push:
     new 45d2ff1f46e Updating config from bot
45d2ff1f46e is described below

commit 45d2ff1f46e20595e6cfac0462045603580eade5
Author: github-actions
AuthorDate: Thu Apr 18 22:06:07 2024 +0000

    Updating config from bot
---
 scripts/ci/pr-bot/state/reviewers-for-label-go.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-go.json b/scripts/ci/pr-bot/state/reviewers-for-label-go.json
index 4caef3f3c4d..7b5c60ca435 100644
--- a/scripts/ci/pr-bot/state/reviewers-for-label-go.json
+++ b/scripts/ci/pr-bot/state/reviewers-for-label-go.json
@@ -5,6 +5,6 @@
     "jrmccluskey": 1713395156069,
     "youngoli": 1657688896155,
     "damccorm": 1680501930289,
-    "riteshghorse": 1713269616781
+    "riteshghorse": 1713477964544
   }
 }
\ No newline at end of file
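The numbers in `reviewers-for-label-go.json` look like Unix timestamps in milliseconds. That reading is an assumption, although the new value for `riteshghorse` decodes to a moment three seconds before this commit's `AuthorDate`, which supports it. A short Python sketch of the conversion, using a hypothetical helper `from_epoch_millis`:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_millis(ms):
    """Convert integer milliseconds since the Unix epoch to an aware UTC datetime."""
    # timedelta arithmetic on integers avoids float rounding of the millisecond part.
    return EPOCH + timedelta(milliseconds=ms)


# New value written for "riteshghorse" in the diff above.
assigned_at = from_epoch_millis(1713477964544)
print(assigned_at.isoformat())  # 2024-04-18T22:06:04.544000+00:00
```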
(beam) branch master updated: Add code change guide contributor-doc (#30879)
This is an automated email from the ASF dual-hosted git repository.

yhu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 3e52e3554a0 Add code change guide contributor-doc (#30879)
3e52e3554a0 is described below

commit 3e52e3554a07b20cf318239c25269ffbe6797433
Author: Yi Hu
AuthorDate: Thu Apr 18 17:01:41 2024 -0400

    Add code change guide contributor-doc (#30879)

    * Add code change guide contributor-doc

    ---------

    Co-authored-by: Rebecca Szper <98840847+rsz...@users.noreply.github.com>
---
 contributor-docs/code-change-guide.md | 525 ++
 1 file changed, 525 insertions(+)

diff --git a/contributor-docs/code-change-guide.md b/contributor-docs/code-change-guide.md
new file mode 100644
index 00000000000..2d04e8bb8d6
--- /dev/null
+++ b/contributor-docs/code-change-guide.md
@@ -0,0 +1,525 @@
+
+
+Last Updated: Apr 18, 2024
+
+This guide is for Beam users and developers changing and testing Beam code.
+Specifically, this guide provides information about:
+
+1. Testing code changes locally
+
+2. Building Beam artifacts with modified Beam code and using the modified code for pipelines
+
+# Repository structure
+
+The Apache Beam GitHub repository (Beam repo) is, for the most part, a "mono repo":
+it contains everything in the Beam project, including the SDK, test
+infrastructure, dashboards, the [Beam website](https://beam.apache.org),
+the [Beam Playground](https://play.beam.apache.org), and so on.
+
+## Gradle quick start
+
+The Beam repo is a single Gradle project that contains all components, including Python,
+Go, the website, etc. It is useful to familiarize yourself with the Gradle project structure:
+https://docs.gradle.org/current/userguide/multi_project_builds.html
+
+### Gradle key concepts
+
+Grade uses the following key concepts:
+
+* **project**: a folder that contains the `build.gradle` file
+* **task**: an action defined in the `build.gradle` file
+* **plugin**: runs in the project's `build.gradle` and contains predefined tasks and hierarchies
+
+For example, common tasks for a Java project or subproject include:
+
+- `compileJava`
+- `compileTestJava`
+- `test`
+- `integrationTest`
+
+To run a Gradle task, the command is `./gradlew -p ` or `./gradlew :project:path:task_name`. For example:
+
+```
+./gradlew -p sdks/java/core compileJava
+
+./gradlew :sdks:java:harness:test
+```
+
+### Gradle project configuration: Beam specific
+
+* A **huge** plugin `buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin` manages everything.
+
+In each java project or subproject, the `build.gradle` file starts with:
+
+```groovy
+
+apply plugin: 'org.apache.beam.module'
+
+applyJavaNature( ... )
+```
+
+Relevant usage of `BeamModulePlugin` includes:
+* Manage Java dependencies
+* Configure projects (Java, Python, Go, Proto, Docker, Grpc, Avro, an so on)
+  * Java -> `applyJavaNature`; Python -> `applyPythonNature`, and so on
+  * Define common custom tasks for each type of project
+    * `test`: run Java unit tests
+    * `spotlessApply`: format java code
+
+## Code paths
+
+The following are example code paths relevant for SDK development:
+
+Java code paths are mainly found in two directories:
+
+* `sdks/java` Java SDK
+  * `sdks/java/core` Java core
+  * `sdks/java/harness` SDK harness (entrypoint of SDK container)
+
+* `runners` Java runner supports. For example,
+  * `runners/direct-java` Java direct runner
+  * `runners/flink-java` Java Flink runner
+  * `runners/google-cloud-dataflow-java` Dataflow runner (job submission, translation, etc)
+    * `runners/google-cloud-dataflow-java/worker` Worker on Dataflow legacy runner
+
+For SDKS in other language, all relevant files are in `sdks/LANG`, for example,
+
+* `sdks/python` contains the setup file and scripts to trigger test-suites
+  * `sdks/python/apache_beam` actual beam package
+    * `sdks/python/apache_beam/runners/worker` SDK worker harness entrypoint, state sampler
+    * `sdks/python/apache_beam/io` I/O connectors
+    * `sdks/python/apache_beam/transforms` most "core" components
+    * `sdks/python/apache_beam/ml` Beam ML
+    * `sdks/python/apache_beam/runners` runner implementations and wrappers
+    * ...
+
+* `sdks/go` Go SDK
+
+* `.github/workflow` GitHub action workflows (for example, tests run under PR). Most
+  workflows run a single Gradle command. Check which command is running for
+  a test so that you can run the same command locally during development.
+
+## Environment setup
+
+To set up local development environments, first see the [Contributing guide](../CONTRIBUTING.md) .
+If you plan to use Dataflow, see the [Google Cloud documentation](https://cloud.google.com/dataflow/docs/quickstarts/create-pipeline-java) to setup `gcloud` credentials.
+
+To check if your environment is set up, follow these steps:
+
+Depending on the
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/pr-bot-state by this push:
     new 5d4137f7d0b Updating config from bot
5d4137f7d0b is described below

commit 5d4137f7d0b49ead39679c74fd959a93f48d98ad
Author: github-actions
AuthorDate: Thu Apr 18 20:25:05 2024 +0000

    Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31045.json | 8
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31045.json b/scripts/ci/pr-bot/state/pr-state/pr-31045.json
new file mode 100644
index 00000000000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31045.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/pr-bot-state by this push:
     new f5d65a72a7f Updating config from bot
f5d65a72a7f is described below

commit f5d65a72a7fc4768d7f9639ae4feedcba3c2495b
Author: github-actions
AuthorDate: Thu Apr 18 20:21:14 2024 +0000

    Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31044.json | 8
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31044.json b/scripts/ci/pr-bot/state/pr-state/pr-31044.json
new file mode 100644
index 00000000000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31044.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/pr-bot-state by this push:
     new a68902e04fe Updating config from bot
a68902e04fe is described below

commit a68902e04fe9b510df4dfdb982914d9f5bdea063
Author: github-actions
AuthorDate: Thu Apr 18 20:06:00 2024 +0000

    Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31026.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31026.json b/scripts/ci/pr-bot/state/pr-state/pr-31026.json
index 321b31c0f05..7d1dfe55f2f 100644
--- a/scripts/ci/pr-bot/state/pr-state/pr-31026.json
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31026.json
@@ -6,5 +6,5 @@
   "nextAction": "Reviewers",
   "stopReviewerNotifications": false,
   "remindAfterTestsPass": [],
-  "committerAssigned": false
+  "committerAssigned": true
 }
\ No newline at end of file
(beam) branch master updated (b69e8c615af -> 76c77cd28ae)
This is an automated email from the ASF dual-hosted git repository.

jrmccluskey pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

    from b69e8c615af Updates Python Dev container used by Dataflow (#31029)
     add 76c77cd28ae Fix typo in count_unique_words() (#31023)

No new revisions were added by this update.

Summary of changes:
 sdks/python/apache_beam/ml/transforms/tft.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
(beam) branch master updated (bb310e7e907 -> b69e8c615af)
This is an automated email from the ASF dual-hosted git repository.

tvalentyn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

    from bb310e7e907 Change type for UnboundedReaderMaxReadTimeSec (#31037)
     add b69e8c615af Updates Python Dev container used by Dataflow (#31029)

No new revisions were added by this update.

Summary of changes:
 sdks/python/apache_beam/runners/dataflow/internal/names.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
(beam) branch master updated: Change type for UnboundedReaderMaxReadTimeSec (#31037)
This is an automated email from the ASF dual-hosted git repository.

jrmccluskey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new bb310e7e907 Change type for UnboundedReaderMaxReadTimeSec (#31037)
bb310e7e907 is described below

commit bb310e7e90720b620f1089574f1656ca84a3656d
Author: Radosław Stankiewicz
AuthorDate: Thu Apr 18 21:50:05 2024 +0200

    Change type for UnboundedReaderMaxReadTimeSec (#31037)

    * add ms part for UnboundedReader checkpointing
    * typo
    * spotless
    * spotless
    * spotless
    * [IntLongMath] Expression of type int may overflow before being assigned to a long
    * readerMaxReadTime sec as double
    * readerMaxReadTime sec as double
    * readerMaxReadTime sec as double
    * readerMaxReadTime sec as double
    * spotless
---
 .../runners/dataflow/options/DataflowPipelineDebugOptions.java | 9 +
 .../apache/beam/runners/dataflow/worker/WorkerCustomSources.java | 4 +++-
 .../beam/runners/dataflow/worker/WorkerCustomSourcesTest.java | 7 ---
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java
index 30496dec296..3f6c47ece68 100644
--- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java
+++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java
@@ -218,11 +218,12 @@ public interface DataflowPipelineDebugOptions

   /** The max amount of time an UnboundedReader is consumed before checkpointing. */
   @Description(
-      "The max amount of time before an UnboundedReader is consumed before checkpointing, in seconds.")
-  @Default.Integer(10)
-  Integer getUnboundedReaderMaxReadTimeSec();
+      "The max amount of time before an UnboundedReader is consumed before checkpointing, "
+          + "in seconds. Duration can be set to fractions of seconds. ")
+  @Default.Double(10.0)
+  double getUnboundedReaderMaxReadTimeSec();

-  void setUnboundedReaderMaxReadTimeSec(Integer value);
+  void setUnboundedReaderMaxReadTimeSec(double value);

   /** The max elements read from an UnboundedReader before checkpointing. */
   @Description("The max elements read from an UnboundedReader before checkpointing. ")
diff --git a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java
index 8c086016ee9..a8e358f19e0 100644
--- a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java
+++ b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java
@@ -798,7 +798,9 @@ public class WorkerCustomSources {
       DataflowPipelineDebugOptions debugOptions = options.as(DataflowPipelineDebugOptions.class);
       this.endTime =
           Instant.now()
-              .plus(Duration.standardSeconds(debugOptions.getUnboundedReaderMaxReadTimeSec()));
+              .plus(
+                  Duration.millis(
+                      (long) (debugOptions.getUnboundedReaderMaxReadTimeSec() * 1000L)));
       this.maxElems = debugOptions.getUnboundedReaderMaxElements();
       this.backoffFactory =
           FluentBackoff.DEFAULT
diff --git a/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSourcesTest.java b/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSourcesTest.java
index d451ec093f7..261567930fe 100644
--- a/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSourcesTest.java
+++ b/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSourcesTest.java
@@ -598,6 +598,7 @@ public class WorkerCustomSourcesTest {
     int maxElements = 10;
     DataflowPipelineDebugOptions debugOptions = options.as(DataflowPipelineDebugOptions.class);
     debugOptions.setUnboundedReaderMaxElements(maxElements);
+    debugOptions.setUnboundedReaderMaxReadTimeSec(10);

     ByteString state = ByteString.EMPTY;
     for (int i = 0; i < 10 * maxElements;
@@ -645,10 +646,10 @@ public class WorkerCustomSourcesTest {
         numReadOnThisIteration++;
       }
       Instant afterReading = Instant.now();
-      long maxReadSec =
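The `WorkerCustomSources` hunk above replaces a whole-second duration with a millisecond duration computed as `(long) (seconds * 1000L)`. To illustrate why this lets fractional values take effect, here is a small Python sketch of the same arithmetic. It is an analogue for illustration only, not Beam code; both function names are hypothetical.

```python
def max_read_time_millis(max_read_time_sec: float) -> int:
    """Analogue of the new Java expression: scale to milliseconds, then truncate."""
    # int() truncates toward zero, like the Java (long) cast on a double.
    return int(max_read_time_sec * 1000)


def whole_second_millis(max_read_time_sec: float) -> int:
    """What a whole-seconds option can represent: the fraction is dropped first."""
    return int(max_read_time_sec) * 1000


for seconds in (10.0, 2.25, 0.5):
    print(seconds, max_read_time_millis(seconds), whole_second_millis(seconds))
```

With the default of `10.0` both functions agree on 10000 ms, so the default behaviour is unchanged; a value such as `0.5` yields 500 ms only when the multiplication happens before the truncation.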
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 2540b2385ea Updating config from bot 2540b2385ea is described below commit 2540b2385eaad0f59c6030f88cfa745e52ae134b Author: github-actions AuthorDate: Thu Apr 18 19:44:38 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31023.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31023.json b/scripts/ci/pr-bot/state/pr-state/pr-31023.json index 28c0f36dc93..a9b5a3dc2c4 100644 --- a/scripts/ci/pr-bot/state/pr-state/pr-31023.json +++ b/scripts/ci/pr-bot/state/pr-state/pr-31023.json @@ -4,7 +4,7 @@ "python": "shunping" }, "nextAction": "Reviewers", - "stopReviewerNotifications": false, + "stopReviewerNotifications": true, "remindAfterTestsPass": [], "committerAssigned": false } \ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new b1de035fc22 Updating config from bot b1de035fc22 is described below commit b1de035fc22c74c3af47e237b5c63630191a406c Author: github-actions AuthorDate: Thu Apr 18 19:34:33 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31042.json | 8 1 file changed, 8 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31042.json b/scripts/ci/pr-bot/state/pr-state/pr-31042.json new file mode 100644 index 000..242a48d7d3b --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31042.json @@ -0,0 +1,8 @@ +{ + "commentedAboutFailingChecks": true, + "reviewersAssignedForLabels": {}, + "nextAction": "Author", + "stopReviewerNotifications": false, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file
(beam) branch master updated: fix url for content security (#31043)
This is an automated email from the ASF dual-hosted git repository. tvalentyn pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 70e067e1fde fix url for content security (#31043) 70e067e1fde is described below commit 70e067e1fdec5c9a3d8914dd7501a784026961f8 Author: Svetak Sundhar AuthorDate: Thu Apr 18 14:57:02 2024 -0400 fix url for content security (#31043) --- website/www/site/static/.htaccess | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/www/site/static/.htaccess b/website/www/site/static/.htaccess index a93e707fdb3..51e28c9a274 100644 --- a/website/www/site/static/.htaccess +++ b/website/www/site/static/.htaccess @@ -27,4 +27,4 @@ RedirectMatch "/contribute/release-guide" "https://github.com/apache/beam/blob/m RedirectMatch "/contribute/committer-guide" "https://github.com/apache/beam/blob/master/contributor-docs/committer-guide.md" -Header set Content-Security-Policy "frame-src 'self' https://play.beam.apache.org/ https://youtube.com/ ;" +Header set Content-Security-Policy "frame-src 'self' https://play.beam.apache.org/ https://www.youtube.com/ ;"
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 452b6506de3 Updating config from bot 452b6506de3 is described below commit 452b6506de3e1b046c0a4687ea26bd714c710ce1 Author: github-actions AuthorDate: Thu Apr 18 18:32:26 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31043.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31043.json b/scripts/ci/pr-bot/state/pr-state/pr-31043.json index 40494bf71be..47cf66fd58c 100644 --- a/scripts/ci/pr-bot/state/pr-state/pr-31043.json +++ b/scripts/ci/pr-bot/state/pr-state/pr-31043.json @@ -4,7 +4,7 @@ "no-matching-label": "riteshghorse" }, "nextAction": "Reviewers", - "stopReviewerNotifications": false, + "stopReviewerNotifications": true, "remindAfterTestsPass": [], "committerAssigned": false } \ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new ca4431941e0 Updating config from bot ca4431941e0 is described below commit ca4431941e0dd93fed89c8dc09e627f484fec55f Author: github-actions AuthorDate: Thu Apr 18 18:06:26 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31043.json | 10 ++ 1 file changed, 10 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31043.json b/scripts/ci/pr-bot/state/pr-state/pr-31043.json new file mode 100644 index 000..40494bf71be --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31043.json @@ -0,0 +1,10 @@ +{ + "commentedAboutFailingChecks": false, + "reviewersAssignedForLabels": { +"no-matching-label": "riteshghorse" + }, + "nextAction": "Reviewers", + "stopReviewerNotifications": false, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 2919e960fa1 Updating config from bot 2919e960fa1 is described below commit 2919e960fa19b86c4a692bbe7406b466008345fb Author: github-actions AuthorDate: Thu Apr 18 18:06:27 2024 + Updating config from bot --- scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json b/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json index 823023772db..c74b74a688c 100644 --- a/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json +++ b/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json @@ -9,7 +9,7 @@ "jrmccluskey": 1713450856660, "kennknowles": 1713436557130, "lostluck": 1712853284277, -"riteshghorse": 1712784868733, +"riteshghorse": 1713463584793, "robertwb": 1712934380808, "tvalentyn": 1712945147592, "damondouglas": 1712932473916,
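Note on the reviewer state files: the values stored per reviewer look like Unix epoch timestamps in milliseconds (the sibling file calls the map "dateOfLastReviewAssignment"). The sketch below decodes one under that assumption. The helper name is hypothetical, and nothing here describes how the bot actually picks reviewers.

```python
from datetime import datetime, timedelta, timezone


def decode_assignment_time(epoch_millis: int) -> datetime:
    """Convert an epoch-milliseconds value to an aware UTC datetime.

    Integer arithmetic is used so the millisecond part is not affected
    by floating-point rounding.
    """
    seconds, millis = divmod(epoch_millis, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(
        milliseconds=millis
    )


# New value written for "riteshghorse" in the diff above.
assigned_at = decode_assignment_time(1713463584793)
print(assigned_at.isoformat())
```

The decoded value falls on Thu Apr 18 2024 at about 18:06 UTC, which lines up with the AuthorDate of the commit that wrote it.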
(beam) branch asf-site updated: Publishing website 2024/04/18 17:37:34 at commit bb0b63c
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/asf-site by this push: new 16a81216cb7 Publishing website 2024/04/18 17:37:34 at commit bb0b63c 16a81216cb7 is described below commit 16a81216cb733270b8ab7848b4cc693b6a26 Author: runner AuthorDate: Thu Apr 18 17:37:34 2024 + Publishing website 2024/04/18 17:37:34 at commit bb0b63c --- website/generated-content/sitemap.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml index 19bb8892c9b..d6a95ad8b4c 100644 --- a/website/generated-content/sitemap.xml +++ b/website/generated-content/sitemap.xml @@ -1 +1 @@ -http://www.sitemaps.org/schemas/sitemap/0.9; xmlns:xhtml="http://www.w3.org/1999/xhtml;>/categories/blog/2024-04-18T10:37:07+02:00/blog/2024-04-18T10:37:07+02:00/categories/2024-04-18T10:37:07+02:00/blog/beam-yaml-release/2024-04-18T10:37:07+02:00/ [...] \ No newline at end of file +http://www.sitemaps.org/schemas/sitemap/0.9; xmlns:xhtml="http://www.w3.org/1999/xhtml;>/categories/blog/2024-04-18T13:26:40-04:00/blog/2024-04-18T13:26:40-04:00/categories/2024-04-18T13:26:40-04:00/blog/beam-yaml-release/2024-04-18T13:26:40-04:00/ [...] \ No newline at end of file
(beam) branch fixtpcdsflink deleted (was aaa08fc8aaa)
This is an automated email from the ASF dual-hosted git repository. damccorm pushed a change to branch fixtpcdsflink in repository https://gitbox.apache.org/repos/asf/beam.git was aaa08fc8aaa Revert "Add trigger file" The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new e0d46dd9984 Updating config from bot e0d46dd9984 is described below commit e0d46dd9984f22910653032218115d81e485da25 Author: github-actions AuthorDate: Thu Apr 18 17:34:27 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31038.json | 8 1 file changed, 8 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31038.json b/scripts/ci/pr-bot/state/pr-state/pr-31038.json new file mode 100644 index 000..242a48d7d3b --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31038.json @@ -0,0 +1,8 @@ +{ + "commentedAboutFailingChecks": true, + "reviewersAssignedForLabels": {}, + "nextAction": "Author", + "stopReviewerNotifications": false, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file
(beam) branch master updated: Bump TPCDS test Flink version (#31041)
This is an automated email from the ASF dual-hosted git repository. damccorm pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new bb0b63cb940 Bump TPCDS test Flink version (#31041) bb0b63cb940 is described below commit bb0b63cb9401527bd54b83530fd8239b86cd00db Author: Yi Hu AuthorDate: Thu Apr 18 13:26:40 2024 -0400 Bump TPCDS test Flink version (#31041) * Bump TPCDS test Flink version * Add trigger file * Revert "Add trigger file" This reverts commit 950dbe55a301ae0bc0311204834503285180505f. --- .github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml b/.github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml index 820a4c9792c..b6d27fd3377 100644 --- a/.github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml +++ b/.github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml @@ -101,5 +101,5 @@ jobs: with: gradle-command: :sdks:java:testing:tpcds:run arguments: | --Ptpcds.runner=:runners:flink:1.13 \ +-Ptpcds.runner=:runners:flink:1.17 \ "-Ptpcds.args=${{env.tpcdsBigQueryArgs}} ${{env.tpcdsInfluxDBArgs}} ${{ env.GRADLE_COMMAND_ARGUMENTS }} --queries=${{env.tpcdsQueriesArg}}" \
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new abaf741c7f8 Updating config from bot abaf741c7f8 is described below commit abaf741c7f85e49941dcccd4622ed8d3f295f382 Author: github-actions AuthorDate: Thu Apr 18 16:30:07 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31041.json | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31041.json b/scripts/ci/pr-bot/state/pr-state/pr-31041.json index 242a48d7d3b..774e36bd490 100644 --- a/scripts/ci/pr-bot/state/pr-state/pr-31041.json +++ b/scripts/ci/pr-bot/state/pr-state/pr-31041.json @@ -1,7 +1,9 @@ { "commentedAboutFailingChecks": true, - "reviewersAssignedForLabels": {}, - "nextAction": "Author", + "reviewersAssignedForLabels": { +"build": "damccorm" + }, + "nextAction": "Reviewers", "stopReviewerNotifications": false, "remindAfterTestsPass": [], "committerAssigned": false
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new d19b576af7d Updating config from bot d19b576af7d is described below commit d19b576af7d64ab8d00760cd52acd6f27628c276 Author: github-actions AuthorDate: Thu Apr 18 16:30:09 2024 + Updating config from bot --- scripts/ci/pr-bot/state/reviewers-for-label-build.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-build.json b/scripts/ci/pr-bot/state/reviewers-for-label-build.json index 2ed7ba71a10..755b175fd56 100644 --- a/scripts/ci/pr-bot/state/reviewers-for-label-build.json +++ b/scripts/ci/pr-bot/state/reviewers-for-label-build.json @@ -1,7 +1,7 @@ { "label": "build", "dateOfLastReviewAssignment": { -"damccorm": 1713420410429, +"damccorm": 1713457806684, "Abacn": 1713371791885 } } \ No newline at end of file
(beam) branch fixtpcdsflink updated (950dbe55a30 -> aaa08fc8aaa)
This is an automated email from the ASF dual-hosted git repository. yhu pushed a change to branch fixtpcdsflink in repository https://gitbox.apache.org/repos/asf/beam.git from 950dbe55a30 Add trigger file add aaa08fc8aaa Revert "Add trigger file" No new revisions were added by this update. Summary of changes: .github/trigger_files/beam_PostCommit_Java_Tpcds_Flink.json | 3 --- 1 file changed, 3 deletions(-) delete mode 100644 .github/trigger_files/beam_PostCommit_Java_Tpcds_Flink.json
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 9facfda12d6 Updating config from bot 9facfda12d6 is described below commit 9facfda12d658b620f7b0317f4f841fb80e0564c Author: github-actions AuthorDate: Thu Apr 18 16:06:59 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-30772.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-30772.json b/scripts/ci/pr-bot/state/pr-state/pr-30772.json index 700fd9146fb..d9045c102c6 100644 --- a/scripts/ci/pr-bot/state/pr-state/pr-30772.json +++ b/scripts/ci/pr-bot/state/pr-state/pr-30772.json @@ -4,7 +4,7 @@ "java": "bvolpato" }, "nextAction": "Reviewers", - "stopReviewerNotifications": false, + "stopReviewerNotifications": true, "remindAfterTestsPass": [], "committerAssigned": false } \ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new e466d05bd10 Updating config from bot e466d05bd10 is described below commit e466d05bd106f9e25e7b62f1f86f72a71d4ef6b1 Author: github-actions AuthorDate: Thu Apr 18 16:05:56 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31041.json | 8 1 file changed, 8 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31041.json b/scripts/ci/pr-bot/state/pr-state/pr-31041.json new file mode 100644 index 000..242a48d7d3b --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31041.json @@ -0,0 +1,8 @@ +{ + "commentedAboutFailingChecks": true, + "reviewersAssignedForLabels": {}, + "nextAction": "Author", + "stopReviewerNotifications": false, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file
(beam) branch release-2.56.0 updated: Cherry picking (#30460) BQ clustering valueprovider (#31039)
This is an automated email from the ASF dual-hosted git repository. ahmedabualsaud pushed a commit to branch release-2.56.0 in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/release-2.56.0 by this push: new 3c7a01360df Cherry picking (#30460) BQ clustering valueprovider (#31039) 3c7a01360df is described below commit 3c7a01360df3635ae56b7073dfa36e75e7492f61 Author: Ahmed Abualsaud <65791736+ahmedab...@users.noreply.github.com> AuthorDate: Thu Apr 18 11:47:41 2024 -0400 Cherry picking (#30460) BQ clustering valueprovider (#31039) * Support clustering with value provider * remove * add some documentation * fix * address comments * update test * spotless * use serializable json clustering; fix translation * fall back on super's clustering and time partitioning when needed * fork based on version 2.56.0 * fix test --- .../beam/sdk/io/gcp/bigquery/BigQueryHelpers.java | 22 +++ .../beam/sdk/io/gcp/bigquery/BigQueryIO.java | 31 ++ .../sdk/io/gcp/bigquery/BigQueryIOTranslation.java | 18 - .../gcp/bigquery/DynamicDestinationsHelpers.java | 8 ++ .../sdk/io/gcp/bigquery/BigQueryClusteringIT.java | 5 +++- .../sdk/io/gcp/bigquery/BigQueryHelpersTest.java | 11 .../io/gcp/bigquery/BigQueryIOTranslationTest.java | 10 +-- .../BigQueryTimePartitioningClusteringIT.java | 1 + 8 files changed, 85 insertions(+), 21 deletions(-) diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java index c4ad09ce6ea..8c600cf780a 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java @@ -17,11 +17,13 @@ */ package org.apache.beam.sdk.io.gcp.bigquery; +import static 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkArgument; import static org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState; import com.google.api.client.util.BackOff; import com.google.api.client.util.BackOffUtils; import com.google.api.client.util.Sleeper; +import com.google.api.services.bigquery.model.Clustering; import com.google.api.services.bigquery.model.Dataset; import com.google.api.services.bigquery.model.Job; import com.google.api.services.bigquery.model.JobReference; @@ -31,6 +33,8 @@ import com.google.api.services.bigquery.model.TableReference; import com.google.api.services.bigquery.model.TableSchema; import com.google.api.services.bigquery.model.TimePartitioning; import com.google.cloud.hadoop.util.ApiErrorExtractor; +import com.google.gson.JsonElement; +import com.google.gson.JsonParser; import java.io.IOException; import java.io.Serializable; import java.math.BigInteger; @@ -40,6 +44,7 @@ import java.util.List; import java.util.Map; import java.util.UUID; import java.util.regex.Matcher; +import java.util.stream.Collectors; import org.apache.beam.sdk.extensions.gcp.util.BackOffAdapter; import org.apache.beam.sdk.io.FileSystems; import org.apache.beam.sdk.io.fs.ResolveOptions; @@ -704,6 +709,23 @@ public class BigQueryHelpers { } } + static Clustering clusteringFromJsonFields(String jsonStringClustering) { +JsonElement jsonClustering = JsonParser.parseString(jsonStringClustering); + +checkArgument( +jsonClustering.isJsonArray(), +"Received an invalid Clustering json string: %s." 
++ "Please provide a serialized json array like so: [\"column1\", \"column2\"]", +jsonStringClustering); + +List fields = +jsonClustering.getAsJsonArray().asList().stream() +.map(JsonElement::getAsString) +.collect(Collectors.toList()); + +return new Clustering().setFields(fields); + } + static String resolveTempLocation( String tempLocationDir, String bigQueryOperationName, String stepUuid) { return FileSystems.matchNewResource(tempLocationDir, true) diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java index a646e1e6247..fce8f1c5d40 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java @@ -2350,7 +2350,7 @@ public class BigQueryIO { abstract @Nullable ValueProvider getJsonTimePartitioning(); -abstract @Nullable Clustering getClustering();
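Note on clusteringFromJsonFields: the helper added above parses a serialized JSON array of column names and rejects anything that is not an array. The Python sketch below shows the same validation idea. It is an illustration under that reading of the diff, not the Beam implementation, and the function name is hypothetical.

```python
import json
from typing import List


def clustering_fields_from_json(json_string: str) -> List[str]:
    """Parse a JSON array such as '["column1", "column2"]' into field names.

    Raises ValueError when the input is valid JSON but not an array,
    in the spirit of the checkArgument call in the Java helper.
    """
    parsed = json.loads(json_string)
    if not isinstance(parsed, list):
        raise ValueError(
            "Received an invalid Clustering json string: %s. "
            'Please provide a serialized json array like so: ["column1", "column2"]'
            % json_string
        )
    # Each element is coerced to a string, roughly like JsonElement::getAsString.
    return [str(field) for field in parsed]


fields = clustering_fields_from_json('["column1", "column2"]')
```

Passing the fields as a JSON string is what allows the clustering to arrive through a value provider at runtime rather than as a constructed object at graph-construction time.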
(beam) branch fixtpcdsflink updated (d67b8d846f6 -> 950dbe55a30)
This is an automated email from the ASF dual-hosted git repository. yhu pushed a change to branch fixtpcdsflink in repository https://gitbox.apache.org/repos/asf/beam.git from d67b8d846f6 Bump TPCDS test Flink version add 950dbe55a30 Add trigger file No new revisions were added by this update. Summary of changes: ...va_PVR_Spark3_Streaming.json => beam_PostCommit_Java_Tpcds_Flink.json} | 0 1 file changed, 0 insertions(+), 0 deletions(-) copy .github/trigger_files/{beam_PostCommit_Java_PVR_Spark3_Streaming.json => beam_PostCommit_Java_Tpcds_Flink.json} (100%)
(beam) 01/01: Bump TPCDS test Flink version
This is an automated email from the ASF dual-hosted git repository. yhu pushed a commit to branch fixtpcdsflink in repository https://gitbox.apache.org/repos/asf/beam.git commit d67b8d846f623bbc26ad61fe693802dbe984db7a Author: Yi Hu AuthorDate: Thu Apr 18 11:39:35 2024 -0400 Bump TPCDS test Flink version --- .github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml b/.github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml index 820a4c9792c..b6d27fd3377 100644 --- a/.github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml +++ b/.github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml @@ -101,5 +101,5 @@ jobs: with: gradle-command: :sdks:java:testing:tpcds:run arguments: | --Ptpcds.runner=:runners:flink:1.13 \ +-Ptpcds.runner=:runners:flink:1.17 \ "-Ptpcds.args=${{env.tpcdsBigQueryArgs}} ${{env.tpcdsInfluxDBArgs}} ${{ env.GRADLE_COMMAND_ARGUMENTS }} --queries=${{env.tpcdsQueriesArg}}" \
(beam) branch fixtpcdsflink created (now d67b8d846f6)
This is an automated email from the ASF dual-hosted git repository. yhu pushed a change to branch fixtpcdsflink in repository https://gitbox.apache.org/repos/asf/beam.git at d67b8d846f6 Bump TPCDS test Flink version This branch includes the following new commits: new d67b8d846f6 Bump TPCDS test Flink version The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
(beam) branch master updated: Update documentation of @SchemaFieldNumber (#30273) (#30277)
This is an automated email from the ASF dual-hosted git repository. robertwb pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 61153bbda6a Update documentation of @SchemaFieldNumber (#30273) (#30277) 61153bbda6a is described below commit 61153bbda6abd5e2d6798544d5f5a81f08d15ee4 Author: bzablocki AuthorDate: Thu Apr 18 17:01:50 2024 +0200 Update documentation of @SchemaFieldNumber (#30273) (#30277) --- .../org/apache/beam/sdk/schemas/annotations/SchemaFieldNumber.java | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/annotations/SchemaFieldNumber.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/annotations/SchemaFieldNumber.java index 32110395f60..1bfcda7270b 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/annotations/SchemaFieldNumber.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/annotations/SchemaFieldNumber.java @@ -31,16 +31,19 @@ import javax.annotation.Nonnull; * specified index. There cannot be "gaps" in field numbers, or schema inference will fail. If used, * all fields (or getters in the case of a bean) must be annotated. * + * The annotation takes a String as an argument, but this has to be an Integer-parsable String. + * Otherwise the pipeline will throw a RuntimeException. + * * For example, say we have a Java POJO with a field that we want in our schema but under a * different name: * * * {@literal @}DefaultSchema(JavaFieldSchema.class) * class MyClass { - * {@literal @}SchemaFieldNumber(1) + * {@literal @}SchemaFieldNumber("1") * public String user; * - *{@literal @}SchemaFieldNumber(0) + *{@literal @}SchemaFieldNumber("0") * public int ageInYears; * } *
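Note on the @SchemaFieldNumber Javadoc change: the annotation value is a String that must parse as an integer, and field numbers may not have gaps. The Python sketch below models those two rules only. It assumes numbering starts at 0, as in the Javadoc example, and every name in it is hypothetical rather than taken from Beam.

```python
from typing import Dict, List


def field_order(annotated: Dict[str, str]) -> List[str]:
    """Order field names by their annotation value.

    `annotated` maps a field name to the string given in the annotation.
    int() raises ValueError for a non-integer string, and a gap or a
    duplicate in the numbering is rejected explicitly.
    """
    by_index = {}
    for name, number in annotated.items():
        index = int(number)
        if index in by_index:
            raise ValueError("duplicate field number: %d" % index)
        by_index[index] = name
    if sorted(by_index) != list(range(len(by_index))):
        raise ValueError("field numbers must be contiguous starting at 0")
    return [by_index[i] for i in range(len(by_index))]


# Same numbering as the MyClass example in the Javadoc.
order = field_order({"user": "1", "ageInYears": "0"})
```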
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new ebab5a4036c Updating config from bot ebab5a4036c is described below commit ebab5a4036c0619ea1ba5c61f9b601c1f54db083 Author: github-actions AuthorDate: Thu Apr 18 14:54:24 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31039.json | 8 1 file changed, 8 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31039.json b/scripts/ci/pr-bot/state/pr-state/pr-31039.json new file mode 100644 index 000..9c2aa5aa212 --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31039.json @@ -0,0 +1,8 @@ +{ + "commentedAboutFailingChecks": false, + "reviewersAssignedForLabels": {}, + "nextAction": "Author", + "stopReviewerNotifications": true, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file
(beam) branch master updated (2eb1a756258 -> 04ff4bdd7fc)
This is an automated email from the ASF dual-hosted git repository. ahmedabualsaud pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from 2eb1a756258 [Python] Clean doc related to write data in bigquery.py (#30887) add 04ff4bdd7fc Support BQ clustering with value provider (#30460) No new revisions were added by this update. Summary of changes: .../beam/sdk/io/gcp/bigquery/BigQueryHelpers.java | 22 +++ .../beam/sdk/io/gcp/bigquery/BigQueryIO.java | 31 ++ .../sdk/io/gcp/bigquery/BigQueryIOTranslation.java | 18 - .../gcp/bigquery/DynamicDestinationsHelpers.java | 8 ++ .../sdk/io/gcp/bigquery/BigQueryClusteringIT.java | 5 +++- .../sdk/io/gcp/bigquery/BigQueryHelpersTest.java | 11 .../io/gcp/bigquery/BigQueryIOTranslationTest.java | 10 +-- .../BigQueryTimePartitioningClusteringIT.java | 1 + 8 files changed, 85 insertions(+), 21 deletions(-)
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 4e3102d8061 Updating config from bot 4e3102d8061 is described below commit 4e3102d806173197345e51841aafd259e9bfbff5 Author: github-actions AuthorDate: Thu Apr 18 14:34:19 2024 + Updating config from bot --- scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json b/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json index 83b030e8c81..823023772db 100644 --- a/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json +++ b/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json @@ -6,7 +6,7 @@ "chamikaramj": 1713058448533, "damccorm": 1713357664828, "johnjcasey": 1713399580129, -"jrmccluskey": 1712595975431, +"jrmccluskey": 1713450856660, "kennknowles": 1713436557130, "lostluck": 1712853284277, "riteshghorse": 1712784868733,
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 226e17d37e8 Updating config from bot 226e17d37e8 is described below commit 226e17d37e89989b0275e0f64a6762e6817ec883 Author: github-actions AuthorDate: Thu Apr 18 14:34:17 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31037.json | 10 ++ 1 file changed, 10 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31037.json b/scripts/ci/pr-bot/state/pr-state/pr-31037.json new file mode 100644 index 000..cd352f44205 --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31037.json @@ -0,0 +1,10 @@ +{ + "commentedAboutFailingChecks": false, + "reviewersAssignedForLabels": { +"no-matching-label": "jrmccluskey" + }, + "nextAction": "Reviewers", + "stopReviewerNotifications": false, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file
(beam) branch master updated: [Python] Clean doc related to write data in bigquery.py (#30887)
This is an automated email from the ASF dual-hosted git repository. riteshghorse pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 2eb1a756258 [Python] Clean doc related to write data in bigquery.py (#30887) 2eb1a756258 is described below commit 2eb1a756258b24b4c3b92c14e0959d64b0008bbf Author: Kevin ZHOU AuthorDate: Thu Apr 18 15:57:47 2024 +0200 [Python] Clean doc related to write data in bigquery.py (#30887) * add missing closing parenthesis * add unique names to PTransform operations --- sdks/python/apache_beam/io/gcp/bigquery.py | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sdks/python/apache_beam/io/gcp/bigquery.py b/sdks/python/apache_beam/io/gcp/bigquery.py index 08698b273b1..43bd1702218 100644 --- a/sdks/python/apache_beam/io/gcp/bigquery.py +++ b/sdks/python/apache_beam/io/gcp/bigquery.py @@ -140,15 +140,15 @@ events of different types to different tables, and the table names are computed at pipeline runtime, one may do something like the following:: with Pipeline() as p: - elements = (p | beam.Create([ + elements = (p | 'Create elements' >> beam.Create([ {'type': 'error', 'timestamp': '12:34:56', 'message': 'bad'}, {'type': 'user_log', 'timestamp': '12:34:59', 'query': 'flu symptom'}, ])) - table_names = (p | beam.Create([ + table_names = (p | 'Create table_names' >> beam.Create([ ('error', 'my_project:dataset1.error_table_for_today'), ('user_log', 'my_project:dataset1.query_table_for_today'), - ]) + ])) table_names_dict = beam.pvalue.AsDict(table_names)
(beam) branch dependabot/gradle/commons-cli-commons-cli-1.7.0 deleted (was 5ef29b15910)
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a change to branch dependabot/gradle/commons-cli-commons-cli-1.7.0 in repository https://gitbox.apache.org/repos/asf/beam.git was 5ef29b15910 Bump commons-cli:commons-cli from 1.6.0 to 1.7.0 The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
(beam) branch release-2.56.0 updated: Change caching of global window inputs to be guarded by experiment (#31013) (#31035)
This is an automated email from the ASF dual-hosted git repository.

damccorm pushed a commit to branch release-2.56.0
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/release-2.56.0 by this push:
     new 674ad975f6f Change caching of global window inputs to be guarded by experiment (#31013) (#31035)
674ad975f6f is described below

commit 674ad975f6f6ba2b6fad744aa7c1b1fbd509c275
Author: Sam Whittle
AuthorDate: Thu Apr 18 14:55:16 2024 +0200

    Change caching of global window inputs to be guarded by experiment (#31013) (#31035)

    * Change caching of global window inputs to be guarded by experiment disable_global_windowed_args_caching
---
 sdks/python/apache_beam/runners/common.pxd |  4 +-
 sdks/python/apache_beam/runners/common.py  | 75 --
 2 files changed, 54 insertions(+), 25 deletions(-)

diff --git a/sdks/python/apache_beam/runners/common.pxd b/sdks/python/apache_beam/runners/common.pxd
index 9fb44af6377..683bf8fcac1 100644
--- a/sdks/python/apache_beam/runners/common.pxd
+++ b/sdks/python/apache_beam/runners/common.pxd
@@ -100,7 +100,9 @@ cdef class PerWindowInvoker(DoFnInvoker):
   cdef dict kwargs_for_process_batch
   cdef list placeholders_for_process_batch
   cdef bint has_windowed_inputs
-  cdef bint cache_globally_windowed_args
+  cdef bint recalculate_window_args
+  cdef bint has_cached_window_args
+  cdef bint has_cached_window_batch_args
   cdef object process_method
   cdef object process_batch_method
   cdef bint is_splittable
diff --git a/sdks/python/apache_beam/runners/common.py b/sdks/python/apache_beam/runners/common.py
index 82ff939dbae..7a1cef4005e 100644
--- a/sdks/python/apache_beam/runners/common.py
+++ b/sdks/python/apache_beam/runners/common.py
@@ -761,6 +761,17 @@ class PerWindowInvoker(DoFnInvoker):
     self.current_window_index = None
     self.stop_window_index = None
 
+    # TODO(https://github.com/apache/beam/issues/28776): Remove caching after
+    # fully rolling out.
+    # If true, always recalculate window args. If false, has_cached_window_args
+    # and has_cached_window_batch_args will be set to true if the corresponding
+    # self.args_for_process,have been updated and should be reused directly.
+    self.recalculate_window_args = (
+        self.has_windowed_inputs or 'disable_global_windowed_args_caching' in
+        RuntimeValueProvider.experiments)
+    self.has_cached_window_args = False
+    self.has_cached_window_batch_args = False
+
     # Try to prepare all the arguments that can just be filled in
     # without any additional work. in the process function.
     # Also cache all the placeholders needed in the process function.
@@ -921,16 +932,23 @@ class PerWindowInvoker(DoFnInvoker):
       additional_kwargs,
   ):
     # type: (...) -> Optional[SplitResultResidual]
-    if self.has_windowed_inputs:
-      assert len(windowed_value.windows) <= 1
-      window, = windowed_value.windows
+    if self.has_cached_window_args:
+      args_for_process, kwargs_for_process = (
+          self.args_for_process, self.kwargs_for_process)
     else:
-      window = GlobalWindow()
-    side_inputs = [si[window] for si in self.side_inputs]
-    side_inputs.extend(additional_args)
-    args_for_process, kwargs_for_process = util.insert_values_in_args(
-        self.args_for_process, self.kwargs_for_process,
-        side_inputs)
+      if self.has_windowed_inputs:
+        assert len(windowed_value.windows) <= 1
+        window, = windowed_value.windows
+      else:
+        window = GlobalWindow()
+      side_inputs = [si[window] for si in self.side_inputs]
+      side_inputs.extend(additional_args)
+      args_for_process, kwargs_for_process = util.insert_values_in_args(
+          self.args_for_process, self.kwargs_for_process, side_inputs)
+      if not self.recalculate_window_args:
+        self.args_for_process, self.kwargs_for_process = (
+            args_for_process, kwargs_for_process)
+        self.has_cached_window_args = True
 
     # Extract key in the case of a stateful DoFn. Note that in the case of a
     # stateful DoFn, we set during __init__ self.has_windowed_inputs to be
@@ -1012,20 +1030,29 @@ class PerWindowInvoker(DoFnInvoker):
   ):
     # type: (...) -> Optional[SplitResultResidual]
 
-    if self.has_windowed_inputs:
-      assert isinstance(windowed_batch, HomogeneousWindowedBatch)
-      assert len(windowed_batch.windows) <= 1
-      window, = windowed_batch.windows
+    if self.has_cached_window_batch_args:
+      args_for_process_batch, kwargs_for_process_batch = (
+          self.args_for_process_batch, self.kwargs_for_process_batch)
     else:
-      window = GlobalWindow()
-    side_inputs = [si[window] for si in self.side_inputs]
-    side_inputs.extend(additional_args)
-    (args_for_process_batch, kwargs_for_process_batch) = (
-        util.insert_values_in_args(
-
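
For readers skimming the archive, the caching pattern in the diff above can be sketched in plain Python. This is only an illustration under stated assumptions: `ToyInvoker`, `args_for`, `cached_args`, and `resolutions` are hypothetical names, side inputs are modelled as plain dicts keyed by window, and none of this is Beam's actual `PerWindowInvoker` API. Only the flag names `recalculate_window_args` and `has_cached_window_args` and the experiment string are taken from the commit.

```python
class ToyInvoker:
  """Hypothetical stand-in for the invoker in the diff; not Beam code."""

  def __init__(self, side_inputs, has_windowed_inputs, experiments=()):
    # Each side input is modelled as a dict mapping a window to a value.
    self.side_inputs = side_inputs
    self.has_windowed_inputs = has_windowed_inputs
    # Mirrors the commit: recalculate when inputs are windowed or when the
    # experiment disables caching.
    self.recalculate_window_args = (
        has_windowed_inputs or
        'disable_global_windowed_args_caching' in experiments)
    self.has_cached_window_args = False
    self.cached_args = None
    self.resolutions = 0  # counts how often side inputs were resolved

  def args_for(self, window='global'):
    if self.has_cached_window_args:
      return self.cached_args
    if not self.has_windowed_inputs:
      window = 'global'
    self.resolutions += 1
    args = [si[window] for si in self.side_inputs]
    if not self.recalculate_window_args:
      # Safe to reuse: every element sees the same global window.
      self.cached_args = args
      self.has_cached_window_args = True
    return args


cached = ToyInvoker([{'global': 1}], has_windowed_inputs=False)
cached.args_for()
cached.args_for()

uncached = ToyInvoker(
    [{'global': 1}],
    has_windowed_inputs=False,
    experiments=('disable_global_windowed_args_caching', ))
uncached.args_for()
uncached.args_for()
```

In this sketch the first invoker resolves its side inputs once and reuses the result, while the second, with the experiment set, resolves them on every call.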
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new eb2040994aa Updating config from bot eb2040994aa is described below commit eb2040994aa165a5fc4e5a4db5c35abf426d0fef Author: github-actions AuthorDate: Thu Apr 18 12:13:43 2024 + Updating config from bot --- scripts/ci/pr-bot/state/reviewers-for-label-python.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-python.json b/scripts/ci/pr-bot/state/reviewers-for-label-python.json index 08c5a68710c..dfb3e6e166e 100644 --- a/scripts/ci/pr-bot/state/reviewers-for-label-python.json +++ b/scripts/ci/pr-bot/state/reviewers-for-label-python.json @@ -10,7 +10,7 @@ "y1chi": 1667002607045, "damccorm": 1713318705495, "jrmccluskey": 1713401349432, -"riteshghorse": 1713274474831, +"riteshghorse": 1713442420035, "liferoad": 1713297957925, "shunping": 1713384363515 }
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new a4d5eae5cc1 Updating config from bot a4d5eae5cc1 is described below commit a4d5eae5cc1ee983ab7ad60aedef1580d7626002 Author: github-actions AuthorDate: Thu Apr 18 12:13:44 2024 + Updating config from bot --- scripts/ci/pr-bot/state/reviewers-for-label-io.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-io.json b/scripts/ci/pr-bot/state/reviewers-for-label-io.json index 798b6551f7a..3a91199efff 100644 --- a/scripts/ci/pr-bot/state/reviewers-for-label-io.json +++ b/scripts/ci/pr-bot/state/reviewers-for-label-io.json @@ -2,7 +2,7 @@ "label": "io", "dateOfLastReviewAssignment": { "chamikaramj": 1713297957925, -"johnjcasey": 1712730964541, +"johnjcasey": 1713442420837, "pabloem": 1691787951165, "Abacn": 1713318705496, "ahmedabu98": 1713356028189,
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 214a4bde2e7 Updating config from bot 214a4bde2e7 is described below commit 214a4bde2e7831c6a6dcb80e75d5f44be6a042bb Author: github-actions AuthorDate: Thu Apr 18 12:13:42 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-30887.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-30887.json b/scripts/ci/pr-bot/state/pr-state/pr-30887.json index 0772dd28d82..dc9e9848d27 100644 --- a/scripts/ci/pr-bot/state/pr-state/pr-30887.json +++ b/scripts/ci/pr-bot/state/pr-state/pr-30887.json @@ -1,8 +1,8 @@ { "commentedAboutFailingChecks": false, "reviewersAssignedForLabels": { -"python": "liferoad", -"io": "chamikaramj" +"python": "riteshghorse", +"io": "johnjcasey" }, "nextAction": "Reviewers", "stopReviewerNotifications": false,
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new f4385283161 Updating config from bot f4385283161 is described below commit f438528316197ebf47c67c11e7099e90e773624e Author: github-actions AuthorDate: Thu Apr 18 11:00:14 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-30279.json | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-30279.json b/scripts/ci/pr-bot/state/pr-state/pr-30279.json index 242a48d7d3b..faef0cb8db0 100644 --- a/scripts/ci/pr-bot/state/pr-state/pr-30279.json +++ b/scripts/ci/pr-bot/state/pr-state/pr-30279.json @@ -1,7 +1,9 @@ { "commentedAboutFailingChecks": true, - "reviewersAssignedForLabels": {}, - "nextAction": "Author", + "reviewersAssignedForLabels": { +"java": "m-trieu" + }, + "nextAction": "Reviewers", "stopReviewerNotifications": false, "remindAfterTestsPass": [], "committerAssigned": false
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 7e7553cf985 Updating config from bot 7e7553cf985 is described below commit 7e7553cf985db3f095368c43757da7e399e1accf Author: github-actions AuthorDate: Thu Apr 18 11:00:15 2024 + Updating config from bot --- scripts/ci/pr-bot/state/reviewers-for-label-java.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-java.json b/scripts/ci/pr-bot/state/reviewers-for-label-java.json index e84c17c3e07..d466e71aedd 100644 --- a/scripts/ci/pr-bot/state/reviewers-for-label-java.json +++ b/scripts/ci/pr-bot/state/reviewers-for-label-java.json @@ -8,7 +8,7 @@ "apilloud": 1678822446183, "Abacn": 1713387985633, "bvolpato": 1712595969392, -"m-trieu": 1713290776308, +"m-trieu": 1713438013062, "damondouglas": 1713356027654 } } \ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 69a1a333ee4 Updating config from bot 69a1a333ee4 is described below commit 69a1a333ee43c05921993d9677ac58a1ed01b4f0 Author: github-actions AuthorDate: Thu Apr 18 10:35:59 2024 + Updating config from bot --- scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json b/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json index 910930edcf0..83b030e8c81 100644 --- a/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json +++ b/scripts/ci/pr-bot/state/reviewers-for-label-no-matching-label.json @@ -7,7 +7,7 @@ "damccorm": 1713357664828, "johnjcasey": 1713399580129, "jrmccluskey": 1712595975431, -"kennknowles": 1712453421772, +"kennknowles": 1713436557130, "lostluck": 1712853284277, "riteshghorse": 1712784868733, "robertwb": 1712934380808,
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 598a0edaa5e Updating config from bot 598a0edaa5e is described below commit 598a0edaa5ed607dcaf0c8288cca9a28d7629f86 Author: github-actions AuthorDate: Thu Apr 18 10:35:58 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31036.json | 10 ++ 1 file changed, 10 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31036.json b/scripts/ci/pr-bot/state/pr-state/pr-31036.json new file mode 100644 index 000..8ad90f2eea6 --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31036.json @@ -0,0 +1,10 @@ +{ + "commentedAboutFailingChecks": false, + "reviewersAssignedForLabels": { +"no-matching-label": "kennknowles" + }, + "nextAction": "Reviewers", + "stopReviewerNotifications": false, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new af7f9759aa6 Updating config from bot af7f9759aa6 is described below commit af7f9759aa62288f62cb40c79849fce199e97cb2 Author: github-actions AuthorDate: Thu Apr 18 08:51:41 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31035.json | 8 1 file changed, 8 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31035.json b/scripts/ci/pr-bot/state/pr-state/pr-31035.json new file mode 100644 index 000..9c2aa5aa212 --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31035.json @@ -0,0 +1,8 @@ +{ + "commentedAboutFailingChecks": false, + "reviewersAssignedForLabels": {}, + "nextAction": "Author", + "stopReviewerNotifications": true, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file
(beam) branch master updated: Change caching of global window inputs to be guarded by experiment (#31013)
This is an automated email from the ASF dual-hosted git repository.

scwhittle pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new bcb40cf4e4a Change caching of global window inputs to be guarded by experiment (#31013)
bcb40cf4e4a is described below

commit bcb40cf4e4a9b9045b51162edab09cf245456038
Author: Sam Whittle
AuthorDate: Thu Apr 18 10:37:07 2024 +0200

    Change caching of global window inputs to be guarded by experiment (#31013)

    * Change caching of global window inputs to be guarded by experiment disable_global_windowed_args_caching
---
 sdks/python/apache_beam/runners/common.pxd |  4 +-
 sdks/python/apache_beam/runners/common.py  | 75 --
 2 files changed, 54 insertions(+), 25 deletions(-)

diff --git a/sdks/python/apache_beam/runners/common.pxd b/sdks/python/apache_beam/runners/common.pxd
index 9fb44af6377..683bf8fcac1 100644
--- a/sdks/python/apache_beam/runners/common.pxd
+++ b/sdks/python/apache_beam/runners/common.pxd
@@ -100,7 +100,9 @@ cdef class PerWindowInvoker(DoFnInvoker):
   cdef dict kwargs_for_process_batch
   cdef list placeholders_for_process_batch
   cdef bint has_windowed_inputs
-  cdef bint cache_globally_windowed_args
+  cdef bint recalculate_window_args
+  cdef bint has_cached_window_args
+  cdef bint has_cached_window_batch_args
   cdef object process_method
   cdef object process_batch_method
   cdef bint is_splittable
diff --git a/sdks/python/apache_beam/runners/common.py b/sdks/python/apache_beam/runners/common.py
index 82ff939dbae..7a1cef4005e 100644
--- a/sdks/python/apache_beam/runners/common.py
+++ b/sdks/python/apache_beam/runners/common.py
@@ -761,6 +761,17 @@ class PerWindowInvoker(DoFnInvoker):
     self.current_window_index = None
     self.stop_window_index = None
 
+    # TODO(https://github.com/apache/beam/issues/28776): Remove caching after
+    # fully rolling out.
+    # If true, always recalculate window args. If false, has_cached_window_args
+    # and has_cached_window_batch_args will be set to true if the corresponding
+    # self.args_for_process,have been updated and should be reused directly.
+    self.recalculate_window_args = (
+        self.has_windowed_inputs or 'disable_global_windowed_args_caching' in
+        RuntimeValueProvider.experiments)
+    self.has_cached_window_args = False
+    self.has_cached_window_batch_args = False
+
     # Try to prepare all the arguments that can just be filled in
     # without any additional work. in the process function.
     # Also cache all the placeholders needed in the process function.
@@ -921,16 +932,23 @@ class PerWindowInvoker(DoFnInvoker):
       additional_kwargs,
   ):
     # type: (...) -> Optional[SplitResultResidual]
-    if self.has_windowed_inputs:
-      assert len(windowed_value.windows) <= 1
-      window, = windowed_value.windows
+    if self.has_cached_window_args:
+      args_for_process, kwargs_for_process = (
+          self.args_for_process, self.kwargs_for_process)
     else:
-      window = GlobalWindow()
-    side_inputs = [si[window] for si in self.side_inputs]
-    side_inputs.extend(additional_args)
-    args_for_process, kwargs_for_process = util.insert_values_in_args(
-        self.args_for_process, self.kwargs_for_process,
-        side_inputs)
+      if self.has_windowed_inputs:
+        assert len(windowed_value.windows) <= 1
+        window, = windowed_value.windows
+      else:
+        window = GlobalWindow()
+      side_inputs = [si[window] for si in self.side_inputs]
+      side_inputs.extend(additional_args)
+      args_for_process, kwargs_for_process = util.insert_values_in_args(
+          self.args_for_process, self.kwargs_for_process, side_inputs)
+      if not self.recalculate_window_args:
+        self.args_for_process, self.kwargs_for_process = (
+            args_for_process, kwargs_for_process)
+        self.has_cached_window_args = True
 
     # Extract key in the case of a stateful DoFn. Note that in the case of a
     # stateful DoFn, we set during __init__ self.has_windowed_inputs to be
@@ -1012,20 +1030,29 @@ class PerWindowInvoker(DoFnInvoker):
   ):
     # type: (...) -> Optional[SplitResultResidual]
 
-    if self.has_windowed_inputs:
-      assert isinstance(windowed_batch, HomogeneousWindowedBatch)
-      assert len(windowed_batch.windows) <= 1
-      window, = windowed_batch.windows
+    if self.has_cached_window_batch_args:
+      args_for_process_batch, kwargs_for_process_batch = (
+          self.args_for_process_batch, self.kwargs_for_process_batch)
     else:
-      window = GlobalWindow()
-    side_inputs = [si[window] for si in self.side_inputs]
-    side_inputs.extend(additional_args)
-    (args_for_process_batch, kwargs_for_process_batch) = (
-        util.insert_values_in_args(
-
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 3eac8acb7c9 Updating config from bot 3eac8acb7c9 is described below commit 3eac8acb7c9e078b414a9f3d04649445b069c915 Author: github-actions AuthorDate: Thu Apr 18 06:06:53 2024 + Updating config from bot --- scripts/ci/pr-bot/state/reviewers-for-label-build.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-build.json b/scripts/ci/pr-bot/state/reviewers-for-label-build.json index 10322b972ae..2ed7ba71a10 100644 --- a/scripts/ci/pr-bot/state/reviewers-for-label-build.json +++ b/scripts/ci/pr-bot/state/reviewers-for-label-build.json @@ -1,7 +1,7 @@ { "label": "build", "dateOfLastReviewAssignment": { -"damccorm": 1713357661356, +"damccorm": 1713420410429, "Abacn": 1713371791885 } } \ No newline at end of file
(beam) branch pr-bot-state updated: Updating config from bot
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch pr-bot-state in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/pr-bot-state by this push: new 0a832139e2a Updating config from bot 0a832139e2a is described below commit 0a832139e2aba4531687179eb66813dc7e82b800 Author: github-actions AuthorDate: Thu Apr 18 06:06:51 2024 + Updating config from bot --- scripts/ci/pr-bot/state/pr-state/pr-31034.json | 10 ++ 1 file changed, 10 insertions(+) diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31034.json b/scripts/ci/pr-bot/state/pr-state/pr-31034.json new file mode 100644 index 000..8a92900947e --- /dev/null +++ b/scripts/ci/pr-bot/state/pr-state/pr-31034.json @@ -0,0 +1,10 @@ +{ + "commentedAboutFailingChecks": false, + "reviewersAssignedForLabels": { +"build": "damccorm" + }, + "nextAction": "Reviewers", + "stopReviewerNotifications": false, + "remindAfterTestsPass": [], + "committerAssigned": false +} \ No newline at end of file