(beam) branch users/damccorm/issue-templates deleted (was edf5e547695)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch users/damccorm/issue-templates
in repository https://gitbox.apache.org/repos/asf/beam.git

     was edf5e547695 Add infra option for remaining templates + autolabel

The revisions that were on this branch are still contained in other references;
therefore, this change does not discard any commits from the repository.
(beam) branch master updated (04dd443bda7 -> 229477976fe)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

    from 04dd443bda7 Update bug.yml (#31947)
     add edf5e547695 Add infra option for remaining templates + autolabel
     add 229477976fe Merge pull request #31952 from apache/users/damccorm/issue-templates

No new revisions were added by this update.

Summary of changes:
 .github/ISSUE_TEMPLATE/failing_test.yml | 2 ++
 .github/ISSUE_TEMPLATE/feature.yml      | 2 ++
 .github/ISSUE_TEMPLATE/task.yml         | 2 ++
 .github/issue-rules.yml                 | 2 ++
 4 files changed, 8 insertions(+)
(beam) 01/01: Update bug.yml
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

commit ee587e3382892cd220ee9431201c762123067e43
Author: Ahmet Altay
AuthorDate: Mon Jul 22 15:52:21 2024 -0700

    Update bug.yml

    - Add an "Infrastructure" option for issues related to CICD, github etc.
    - Set the default for priorities to priority 2 as implied by the text already. (I could not find the docs for this but found an example here: https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository)
---
 .github/ISSUE_TEMPLATE/bug.yml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.github/ISSUE_TEMPLATE/bug.yml b/.github/ISSUE_TEMPLATE/bug.yml
index 67f8b21445d..a2fbeae1319 100644
--- a/.github/ISSUE_TEMPLATE/bug.yml
+++ b/.github/ISSUE_TEMPLATE/bug.yml
@@ -50,6 +50,7 @@ body:
         - "Priority: 2 (default / most bugs should be filed as P2)"
         - "Priority: 1 (data loss / total loss of function)"
         - "Priority: 0 (outage / urgent vulnerability)"
+      default: 1
     validations:
       required: true
   - type: checkboxes
@@ -68,6 +69,7 @@ body:
         - label: "Component: Beam playground"
         - label: "Component: Beam katas"
         - label: "Component: Website"
+        - label: "Component: Infrastructure"
         - label: "Component: Spark Runner"
         - label: "Component: Flink Runner"
         - label: "Component: Samza Runner"
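Note on `default: 1`: as far as I understand the GitHub issue-forms syntax, a dropdown's `default` is a zero-based index into its `options` list rather than a priority number. The Python sketch below only illustrates that lookup. The first list entry is an assumption (the diff context starts at the "Priority: 2" line, so the option before it is not visible in this email), and `preselected_option` is a hypothetical helper, not part of any GitHub tooling.

```python
# Hypothetical mirror of the dropdown options in bug.yml.
PRIORITY_OPTIONS = [
    "Priority: 3 (minor)",  # ASSUMED: not shown in the diff context above
    "Priority: 2 (default / most bugs should be filed as P2)",
    "Priority: 1 (data loss / total loss of function)",
    "Priority: 0 (outage / urgent vulnerability)",
]


def preselected_option(options, default_index):
    """Return the option that a zero-based `default` index would preselect."""
    if not 0 <= default_index < len(options):
        raise ValueError(f"default index {default_index} is out of range")
    return options[default_index]


if __name__ == "__main__":
    # With `default: 1`, the second entry is preselected.
    print(preselected_option(PRIORITY_OPTIONS, 1))
```

Under that assumption, index 1 lands on the "Priority: 2" entry, which matches the intent stated in the commit message.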
(beam) branch aaltay-patch-1 created (now ee587e33828)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

      at ee587e33828 Update bug.yml

This branch includes the following new commits:

     new ee587e33828 Update bug.yml

The 1 revisions listed above as "new" are entirely new to this repository
and will be described in separate emails. The revisions listed as "add"
were already present in the repository and have only been added to this reference.
(beam) branch aaltay-patch-1 deleted (was 0445ae998ac)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

     was 0445ae998ac Update beamquest.md

The revisions that were on this branch are still contained in other references;
therefore, this change does not discard any commits from the repository.
(beam) 01/01: Update beamquest.md
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 0445ae998acad1dcae34d5aee9281010371d4c6d
Author: Ahmet Altay
AuthorDate: Thu Jun 20 18:03:07 2024 -0700

    Update beamquest.md

    Fix beam quest external link.
---
 website/www/site/content/en/blog/beamquest.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/www/site/content/en/blog/beamquest.md b/website/www/site/content/en/blog/beamquest.md
index 8840921fb98..3e04629aae3 100644
--- a/website/www/site/content/en/blog/beamquest.md
+++ b/website/www/site/content/en/blog/beamquest.md
@@ -34,6 +34,6 @@ Individuals aren’t the only ones who can benefit from completing this quest -

 Data Processing is a key part of AI/ML workflows. Given the recent advancements in artificial intelligence, now’s the time to jump into the world of data processing! Get started on your journey [here](https://www.cloudskillsboost.google/quests/310).

-We are currently offering this quest **FREE OF CHARGE**. To obtain your badge for **FREE**, use the [Access Code](https://www.cloudskillsboost.google/catalog?qlcampaign=1h-swiss-19), create an account, and search ["Getting Started with Apache Beam"](https://www.cloudskillsboost.google/quests/310). If the code does not work, please email [d...@beam.apache.org](d...@beam.apache.org) to obtain a free code.
+We are currently offering this quest **FREE OF CHARGE**. To obtain your badge for **FREE**, use the [Access Code](https://www.cloudskillsboost.google/catalog?qlcampaign=1h-swiss-19), create an account, and search ["Getting Started with Apache Beam"](https://www.cloudskillsboost.google/course_templates/724). If the code does not work, please email [d...@beam.apache.org](d...@beam.apache.org) to obtain a free code.

 PS: Once you earn your badge, please [share it on social media](https://support.google.com/qwiklabs/answer/9222527?hl=en&sjid=14905615709060962899-NA)!
(beam) branch aaltay-patch-1 created (now 0445ae998ac)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

      at 0445ae998ac Update beamquest.md

This branch includes the following new commits:

     new 0445ae998ac Update beamquest.md

The 1 revisions listed above as "new" are entirely new to this repository
and will be described in separate emails. The revisions listed as "add"
were already present in the repository and have only been added to this reference.
(beam) branch aaltay-patch-1 created (now 1bb3931eb6b)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

      at 1bb3931eb6b Update game_stats.py

This branch includes the following new commits:

     new 1bb3931eb6b Update game_stats.py

The 1 revisions listed above as "new" are entirely new to this repository
and will be described in separate emails. The revisions listed as "add"
were already present in the repository and have only been added to this reference.
(beam) 01/01: Update game_stats.py
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 1bb3931eb6b147bc7c3f2dabf32e8492e7de617c
Author: Ahmet Altay
AuthorDate: Tue Apr 30 13:13:26 2024 -0700

    Update game_stats.py

    Fixing a typo in examples & docs.
---
 sdks/python/apache_beam/examples/complete/game/game_stats.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sdks/python/apache_beam/examples/complete/game/game_stats.py b/sdks/python/apache_beam/examples/complete/game/game_stats.py
index d6f5aab3e7b..233d22b7542 100644
--- a/sdks/python/apache_beam/examples/complete/game/game_stats.py
+++ b/sdks/python/apache_beam/examples/complete/game/game_stats.py
@@ -196,7 +196,7 @@ class WriteToBigQuery(beam.PTransform):
 # [START abuse_detect]
 class CalculateSpammyUsers(beam.PTransform):
   """Filter out all but those users with a high clickrate, which we will
-  consider as 'spammy' uesrs.
+  consider as 'spammy' users.

   We do this by finding the mean total score per user, then using that
   information as a side input to filter out all but those user scores that are
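Note on the docstring above: it describes computing a mean total score per user and then using that mean to filter users. The plain-Python sketch below illustrates that idea without Beam. The docstring is cut off in this email, so the exact filtering rule is an assumption: here a user counts as "spammy" when their total score exceeds the global mean times a multiplier. `SCORE_WEIGHT`, its value, and `find_spammy_users` are hypothetical names chosen for illustration. In a Beam pipeline the mean would be passed as a side input; here it is an ordinary local variable.

```python
from statistics import mean

# ASSUMED multiplier, chosen for illustration only.
SCORE_WEIGHT = 2.5


def find_spammy_users(user_scores, weight=SCORE_WEIGHT):
    """Keep only the users whose total score exceeds weight * global mean.

    user_scores maps a user name to that user's total score.
    """
    if not user_scores:
        return {}
    # Step 1: mean total score across all users (the "side input" value).
    global_mean = mean(user_scores.values())
    # Step 2: filter out every user at or below the weighted mean.
    threshold = global_mean * weight
    return {
        user: score for user, score in user_scores.items() if score > threshold
    }


if __name__ == "__main__":
    scores = {"alice": 10, "bob": 12, "carol": 11, "mallory": 200}
    print(find_spammy_users(scores))
```

With these sample scores the mean is 58.25 and the threshold is 145.625, so only "mallory" is kept.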
(beam) branch aaltay-patch-1 deleted (was 381709f8a7f)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

     was 381709f8a7f Update contributor-spotlight-johanna-ojeling.md

The revisions that were on this branch are still contained in other references;
therefore, this change does not discard any commits from the repository.
(beam) branch master updated: Create contributor-spotlight-johanna-ojeling.md (#29408)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new b6b1902620a Create contributor-spotlight-johanna-ojeling.md (#29408)
b6b1902620a is described below

commit b6b1902620a22bea23b56e9b873f5025f79e5edb
Author: Ahmet Altay
AuthorDate: Mon Nov 13 12:42:39 2023 -0800

    Create contributor-spotlight-johanna-ojeling.md (#29408)

    * Create contributor-spotlight-johanna-ojeling.md

    Add Contributor Spotlight: Johanna Öjeling blog post.

    * Update contributor-spotlight-johanna-ojeling.md

    remove trailing space
---
 .../blog/contributor-spotlight-johanna-ojeling.md | 63 ++
 1 file changed, 63 insertions(+)

diff --git a/website/www/site/content/en/blog/contributor-spotlight-johanna-ojeling.md b/website/www/site/content/en/blog/contributor-spotlight-johanna-ojeling.md
new file mode 100644
index 000..717f591eca9
--- /dev/null
+++ b/website/www/site/content/en/blog/contributor-spotlight-johanna-ojeling.md
@@ -0,0 +1,63 @@
+---
+title: "Contributor Spotlight: Johanna Öjeling"
+date: 2023-11-11 15:00:00 -0800
+categories:
+  - blog
+authors:
+  - altay
+---
+
+
+Johanna Öjeling is a Senior Software Engineer at [Normative](https://normative.io/). She started using Apache Beam in 2020 at her previous company [Datatonic](http://datatonic.com) and began contributing in 2022 at a personal capacity. We interviewed Johanna to learn more about her interests and we hope that this will inspire new, future, diverse set of contributors to participate in OSS projects.
+
+**What areas of interest are you passionate about in your career?**
+
+My core interest lies in distributed and data-intensive systems, and I enjoy working on challenges related to performance, scalability and maintainability. I also feel strongly about developer experience, and like to build tools and frameworks that make developers happier and more productive. Aside from that, I take pleasure in mentoring and coaching other software engineers to grow their skills and pursue a fulfilling career.
+
+**What motivated you to make your first contribution?**
+
+I was already a user of the Apache Beam Java and Python SDKs and Google Cloud Dataflow in my previous job, and had started to play around with the Go SDK to learn Go. When I noticed that a feature I wanted was missing, it seemed like a great opportunity to implement it. I had been curious about developing open source software for some time, but did not have a good idea until then of what to contribute with.
+
+**In which way have you contributed to Apache Beam?**
+
+I have primarily worked on the Go SDK with implementation of new features, bug fixes, tests, documentation and code reviews. Some examples include a MongoDB I/O connector with dynamically scalable reads and writes, a file I/O connector supporting continuous file discovery, and an Amazon S3 file system implementation.
+
+**How has your open source engagement impacted your personal or professional growth?**
+
+Contributing to open source is one of the best decisions I have taken professionally. The Beam community has been incredibly welcoming and appreciative, and it has been rewarding to collaborate with talented people around the world to create software that is free for anyone to benefit from. Open source has opened up new opportunities to challenge myself, dive deeper into technologies I like, and learn from highly skilled professionals. To me, it has served as an outlet for creativity, pr [...]
+
+**How have you noticed contributing to open source is different from contributing to closed source/proprietary software?**
+
+My observation has been that there are higher requirements for software quality in open source, and it is more important to get things right the first time. My closed source software experience is from startups/scale-ups where speed is prioritized. When not working on public facing APIs or libraries, one can also more easily change things, whereas we need to be mindful about breaking changes in Beam. I care for software quality and value the high standards the Beam committers hold.
+
+**What do you like to do with your spare time when you're not contributing to Beam?**
+
+Coding is a passion of mine so I tend to spend a lot of my free time on hobby projects, reading books and articles, listening to talks and attending events. When I was younger I loved learning foreign languages and studied English, French, German and Spanish. Later I discovered an interest in computer science and switched focus to programming languages. I decided to change careers to software engineering and have tried to learn as much as possible ever since. I love that it never ends.
+
+**What future features/improvements are you most excited abou
(beam) branch aaltay-patch-1 updated (140877bef6b -> 381709f8a7f)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

    from 140877bef6b Create contributor-spotlight-johanna-ojeling.md
     add 381709f8a7f Update contributor-spotlight-johanna-ojeling.md

No new revisions were added by this update.

Summary of changes:
 .../www/site/content/en/blog/contributor-spotlight-johanna-ojeling.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
(beam) branch aaltay-patch-1 created (now 140877bef6b)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

      at 140877bef6b Create contributor-spotlight-johanna-ojeling.md

This branch includes the following new commits:

     new 140877bef6b Create contributor-spotlight-johanna-ojeling.md

The 1 revisions listed above as "new" are entirely new to this repository
and will be described in separate emails. The revisions listed as "add"
were already present in the repository and have only been added to this reference.
(beam) 01/01: Create contributor-spotlight-johanna-ojeling.md
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 140877bef6bef7f3af0a915f208e0b76709d544d
Author: Ahmet Altay
AuthorDate: Sat Nov 11 14:20:26 2023 -0800

    Create contributor-spotlight-johanna-ojeling.md

    Add Contributor Spotlight: Johanna Öjeling blog post.
---
 .../blog/contributor-spotlight-johanna-ojeling.md | 63 ++
 1 file changed, 63 insertions(+)

diff --git a/website/www/site/content/en/blog/contributor-spotlight-johanna-ojeling.md b/website/www/site/content/en/blog/contributor-spotlight-johanna-ojeling.md
new file mode 100644
index 000..f4c2c564b69
--- /dev/null
+++ b/website/www/site/content/en/blog/contributor-spotlight-johanna-ojeling.md
@@ -0,0 +1,63 @@
+---
+title: "Contributor Spotlight: Johanna Öjeling"
+date: 2023-11-11 15:00:00 -0800
+categories:
+  - blog
+authors:
+  - altay
+---
+
+
+Johanna Öjeling is a Senior Software Engineer at [Normative](https://normative.io/). She started using Apache Beam in 2020 at her previous company [Datatonic](http://datatonic.com) and began contributing in 2022 at a personal capacity. We interviewed Johanna to learn more about her interests and we hope that this will inspire new, future, diverse set of contributors to participate in OSS projects.
+
+**What areas of interest are you passionate about in your career?**
+
+My core interest lies in distributed and data-intensive systems, and I enjoy working on challenges related to performance, scalability and maintainability. I also feel strongly about developer experience, and like to build tools and frameworks that make developers happier and more productive. Aside from that, I take pleasure in mentoring and coaching other software engineers to grow their skills and pursue a fulfilling career.
+
+**What motivated you to make your first contribution?**
+
+I was already a user of the Apache Beam Java and Python SDKs and Google Cloud Dataflow in my previous job, and had started to play around with the Go SDK to learn Go. When I noticed that a feature I wanted was missing, it seemed like a great opportunity to implement it. I had been curious about developing open source software for some time, but did not have a good idea until then of what to contribute with.
+
+**In which way have you contributed to Apache Beam?**
+
+I have primarily worked on the Go SDK with implementation of new features, bug fixes, tests, documentation and code reviews. Some examples include a MongoDB I/O connector with dynamically scalable reads and writes, a file I/O connector supporting continuous file discovery, and an Amazon S3 file system implementation.
+
+**How has your open source engagement impacted your personal or professional growth?**
+
+Contributing to open source is one of the best decisions I have taken professionally. The Beam community has been incredibly welcoming and appreciative, and it has been rewarding to collaborate with talented people around the world to create software that is free for anyone to benefit from. Open source has opened up new opportunities to challenge myself, dive deeper into technologies I like, and learn from highly skilled professionals. To me, it has served as an outlet for creativity, pr [...]
+
+**How have you noticed contributing to open source is different from contributing to closed source/proprietary software?**
+
+My observation has been that there are higher requirements for software quality in open source, and it is more important to get things right the first time. My closed source software experience is from startups/scale-ups where speed is prioritized. When not working on public facing APIs or libraries, one can also more easily change things, whereas we need to be mindful about breaking changes in Beam. I care for software quality and value the high standards the Beam committers hold.
+
+**What do you like to do with your spare time when you're not contributing to Beam?**
+
+Coding is a passion of mine so I tend to spend a lot of my free time on hobby projects, reading books and articles, listening to talks and attending events. When I was younger I loved learning foreign languages and studied English, French, German and Spanish. Later I discovered an interest in computer science and switched focus to programming languages. I decided to change careers to software engineering and have tried to learn as much as possible ever since. I love that it never ends.
+
+**What future features/improvements are you most excited about, or would you like to see on Beam?**
+
+The multi-language pipeline support is an impressive feature of Beam, and I like that new SDKs such as TypeScript and Swift are emerging, which enables developers to write pipelines in their preferred language. Naturally, I am also excited to see where the Go SDK is headed and how we
[beam] 01/01: [Website] add linkedIn case-study (#28988)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 36574ce310999a74b87424e600fcb7394ced5caf
Merge: 0b50a173608 f8ca45ebb7b
Author: Ahmet Altay
AuthorDate: Tue Oct 17 12:04:37 2023 -0700

    [Website] add linkedIn case-study (#28988)

 .../www/site/content/en/case-studies/linkedin.md   | 284 -
 website/www/site/data/en/quotes.yaml               |   5 +
 .../images/case-study/linkedin/bingfeng-xia.jpg    | Bin 0 -> 99417 bytes
 .../static/images/case-study/linkedin/scheme-1.png | Bin 0 -> 72662 bytes
 .../static/images/case-study/linkedin/scheme-2.png | Bin 0 -> 91660 bytes
 .../static/images/case-study/linkedin/scheme-3.png | Bin 0 -> 207951 bytes
 .../static/images/case-study/linkedin/scheme-4.png | Bin 0 -> 98569 bytes
 .../static/images/case-study/linkedin/scheme-5.png | Bin 0 -> 116720 bytes
 .../static/images/case-study/linkedin/scheme-6.png | Bin 0 -> 26758 bytes
 .../images/case-study/linkedin/xinyu-liu.jpg       | Bin 0 -> 100350 bytes
 10 files changed, 285 insertions(+), 4 deletions(-)
[beam] branch master updated (0b50a173608 -> 36574ce3109)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

    from 0b50a173608 Fix num workers (#29006)
     add f8ca45ebb7b [Website] add linkedin case-study
     new 36574ce3109 [Website] add linkedIn case-study (#28988)

The 1 revisions listed above as "new" are entirely new to this repository
and will be described in separate emails. The revisions listed as "add"
were already present in the repository and have only been added to this reference.

Summary of changes:
 .../www/site/content/en/case-studies/linkedin.md   | 284 -
 website/www/site/data/en/quotes.yaml               |   5 +
 .../images/case-study/linkedin/bingfeng-xia.jpg    | Bin 0 -> 99417 bytes
 .../static/images/case-study/linkedin/scheme-1.png | Bin 0 -> 72662 bytes
 .../static/images/case-study/linkedin/scheme-2.png | Bin 0 -> 91660 bytes
 .../static/images/case-study/linkedin/scheme-3.png | Bin 0 -> 207951 bytes
 .../static/images/case-study/linkedin/scheme-4.png | Bin 0 -> 98569 bytes
 .../static/images/case-study/linkedin/scheme-5.png | Bin 0 -> 116720 bytes
 .../static/images/case-study/linkedin/scheme-6.png | Bin 0 -> 26758 bytes
 .../images/case-study/linkedin/xinyu-liu.jpg       | Bin 0 -> 100350 bytes
 10 files changed, 285 insertions(+), 4 deletions(-)
 create mode 100644 website/www/site/static/images/case-study/linkedin/bingfeng-xia.jpg
 create mode 100644 website/www/site/static/images/case-study/linkedin/scheme-1.png
 create mode 100644 website/www/site/static/images/case-study/linkedin/scheme-2.png
 create mode 100644 website/www/site/static/images/case-study/linkedin/scheme-3.png
 create mode 100644 website/www/site/static/images/case-study/linkedin/scheme-4.png
 create mode 100644 website/www/site/static/images/case-study/linkedin/scheme-5.png
 create mode 100644 website/www/site/static/images/case-study/linkedin/scheme-6.png
 create mode 100644 website/www/site/static/images/case-study/linkedin/xinyu-liu.jpg
[beam] branch master updated: [Blog] Quest updated dates (#28824)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new f30f6c5e220 [Blog] Quest updated dates (#28824)
f30f6c5e220 is described below

commit f30f6c5e22046e2bd603ec47a3b9e38c80510fae
Author: Svetak Sundhar
AuthorDate: Wed Oct 4 17:42:26 2023 +

    [Blog] Quest updated dates (#28824)
---
 website/www/site/content/en/blog/beamquest.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/www/site/content/en/blog/beamquest.md b/website/www/site/content/en/blog/beamquest.md
index eea893bf822..dde6376b407 100644
--- a/website/www/site/content/en/blog/beamquest.md
+++ b/website/www/site/content/en/blog/beamquest.md
@@ -34,6 +34,6 @@ Individuals aren’t the only ones who can benefit from completing this quest -

 Data Processing is a key part of AI/ML workflows. Given the recent advancements in artificial intelligence, now’s the time to jump into the world of data processing! Get started on your journey [here](https://www.cloudskillsboost.google/quests/310).

-We are currently offering this quest **FREE OF CHARGE** until **July 8, 2023** for the **first 2,000** people. To obtain your badge for **FREE**, use the [Access Code](https://www.cloudskillsboost.google/catalog?qlcampaign=1h-swiss-19), create an account, and search ["Getting Started with Apache Beam"](https://www.cloudskillsboost.google/quests/310).
+We are currently offering this quest **FREE OF CHARGE**. To obtain your badge for **FREE**, use the [Access Code](https://www.cloudskillsboost.google/catalog?qlcampaign=1h-swiss-19), create an account, and search ["Getting Started with Apache Beam"](https://www.cloudskillsboost.google/quests/310). If the code does not work, please email [d...@beam.apache.org](d...@beam.apache.org) to obtain a free code.

 PS: Once you earn your badge, please [share it on social media](https://support.google.com/qwiklabs/answer/9222527?hl=en&sjid=14905615709060962899-NA)!
[beam] branch aaltay-patch-1 deleted (was 3655b1ea3fb)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

     was 3655b1ea3fb Update dyi-content-discovery-platform-genai-beam.md

The revisions that were on this branch are still contained in other references;
therefore, this change does not discard any commits from the repository.
[beam] branch master updated (2e0521162f6 -> f3df03d8fa9)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

    from 2e0521162f6 [Java BQ] Storage API streaming load test (#28264)
     add 3655b1ea3fb Update dyi-content-discovery-platform-genai-beam.md
     add f3df03d8fa9 Merge pull request #28788 from apache/aaltay-patch-1

No new revisions were added by this update.

Summary of changes:
 .../site/content/en/blog/dyi-content-discovery-platform-genai-beam.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] 01/01: Update dyi-content-discovery-platform-genai-beam.md
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 3655b1ea3fb33c20b758b6a70a76c699ed767f89
Author: Ahmet Altay
AuthorDate: Mon Oct 2 21:02:20 2023 -0700

    Update dyi-content-discovery-platform-genai-beam.md

    Fixing the publication date.
---
 .../site/content/en/blog/dyi-content-discovery-platform-genai-beam.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/www/site/content/en/blog/dyi-content-discovery-platform-genai-beam.md b/website/www/site/content/en/blog/dyi-content-discovery-platform-genai-beam.md
index 8057374591d..fd967e318a0 100644
--- a/website/www/site/content/en/blog/dyi-content-discovery-platform-genai-beam.md
+++ b/website/www/site/content/en/blog/dyi-content-discovery-platform-genai-beam.md
@@ -1,7 +1,7 @@
 ---
 layout: post
 title: "DIY GenAI Content Discovery Platform with Apache Beam"
-date: 2023-09-27 00:00:01 -0800
+date: 2023-10-02 00:00:01 -0800
 categories:
   - blog
 authors:
[beam] branch aaltay-patch-1 created (now 3655b1ea3fb)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-1
in repository https://gitbox.apache.org/repos/asf/beam.git

      at 3655b1ea3fb Update dyi-content-discovery-platform-genai-beam.md

This branch includes the following new commits:

     new 3655b1ea3fb Update dyi-content-discovery-platform-genai-beam.md

The 1 revisions listed above as "new" are entirely new to this repository
and will be described in separate emails. The revisions listed as "add"
were already present in the repository and have only been added to this reference.
[beam] branch master updated: [Blog Post] Apache Beam for a content discovery platform (#28734)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 71c8459633e [Blog Post] Apache Beam for a content discovery platform (#28734)
71c8459633e is described below

commit 71c8459633ec86e576eca080a26be9f42474ecb2
Author: pablo rodriguez defino
AuthorDate: Mon Oct 2 17:07:58 2023 -0700

    [Blog Post] Apache Beam for a content discovery platform (#28734)

    Co-authored-by: Rebecca Szper <98840847+rsz...@users.noreply.github.com>
    Co-authored-by: nams113 <39890215+nams...@users.noreply.github.com>
---
 .../dyi-content-discovery-platform-genai-beam.md   | 338 +
 website/www/site/data/authors.yml                  |   6 +
 .../images/blog/dyi-cdp-genai-beam/cdp-arch.png    | Bin 0 -> 271543 bytes
 .../blog/dyi-cdp-genai-beam/cdp-highlevel.png      | Bin 0 -> 31242 bytes
 .../images/blog/dyi-cdp-genai-beam/pipeline-1.png  | Bin 0 -> 146525 bytes
 .../pipeline-2-extractcontent.png                  | Bin 0 -> 130427 bytes
 .../pipeline-3-errorhandling.png                   | Bin 0 -> 112800 bytes
 .../pipeline-4-processembeddings1.png              | Bin 0 -> 49246 bytes
 .../pipeline-4-processembeddings2.png              | Bin 0 -> 58035 bytes
 .../dyi-cdp-genai-beam/pipeline-5-storecontent.png | Bin 0 -> 74751 bytes
 .../dyi-cdp-genai-beam/pipeline-6-refresh1.png     | Bin 0 -> 74889 bytes
 .../dyi-cdp-genai-beam/pipeline-6-refresh2.png     | Bin 0 -> 72757 bytes
 .../dyi-cdp-genai-beam/pipeline-6-refresh3.png     | Bin 0 -> 53972 bytes
 13 files changed, 344 insertions(+)

diff --git a/website/www/site/content/en/blog/dyi-content-discovery-platform-genai-beam.md b/website/www/site/content/en/blog/dyi-content-discovery-platform-genai-beam.md
new file mode 100644
index 000..8057374591d
--- /dev/null
+++ b/website/www/site/content/en/blog/dyi-content-discovery-platform-genai-beam.md
@@ -0,0 +1,338 @@
+---
+layout: post
+title: "DIY GenAI Content Discovery Platform with Apache Beam"
+date: 2023-09-27 00:00:01 -0800
+categories:
+  - blog
+authors:
+  - pabs
+  - namitasharma
+---
+
+
+# DIY GenAI Content Discovery Platform with Apache Beam
+
+Your digital assets, such as documents, PDFs, spreadsheets, and presentations, contain a wealth of valuable information, but sometimes it's hard to find what you're looking for. This blog post explains how to build a DIY starter architecture, based on near real-time ingestion processing and large language models (LLMs), to extract meaningful information from your assets. The model makes the information available and discoverable through a simple natural language query.
+
+Building a near real-time processing pipeline for content ingestion might seem like a complex task, and it can be. To make pipeline building easier, the Apache Beam framework exposes a set of powerful constructs. These constructs remove the following complexities: interacting with multiple types of content sources and destinations, error handling, and modularity. They also maintain resiliency and scalability with minimal effort. You can use an Apache Beam streaming pipeline to complete t [...]
+
+- Connect to the many components of a solution.
+- Quickly process content ingestion requests of documents.
+- Make the information in the documents available a few seconds after ingestion.
+
+LLMs are often used to extract content and summarize information stored in many different places. Organizations can use LLMs to quickly find relevant information disseminated in multiple documents written across the years. The information might be in different formats, or the documents might be too long and complex to read and understand quickly. Use LLMs to process this content to make it easier for people to find the information that they need.
+
+Follow the steps in this guide to create a custom scalable solution for data extraction, content ingestion, and storage. Learn how to kickstart the development of a LLM-based solution using Google Cloud products and generative AI offerings. Google Cloud is designed to be simple to use, scalable, and flexible, so you can use it as a starting point for further expansion or experimentation.
+
+### High-level Flow
+
+In this workflow, content uptake and query interactions are completely separated. An external content owner can send documents stored in Google Docs or in a binary text format and receive a tracking ID for the ingestion request. The ingestion process gets the content of the document and creates chunks that are configurable in size. Each document chunk is used to generate embeddings. These embeddings represent the content semantics, in the form of a vector of 768 dimensions. Given the doc [...]
+
+
+
+The query resolution process doesn't
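Note on the "High-level Flow" paragraph above: it says the ingestion process splits each document into chunks of configurable size before embeddings are generated. The sketch below illustrates only that chunking step in plain Python. It is a minimal illustration under stated assumptions, not the blog post's implementation: `chunk_text` is a hypothetical helper, chunk size is measured in characters (the post may split on some other unit), and no embedding model is called.

```python
def chunk_text(text, chunk_size):
    """Split text into consecutive chunks of at most chunk_size characters.

    ASSUMPTION: size is counted in characters; the original pipeline may
    chunk by tokens, sentences, or paragraphs instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


if __name__ == "__main__":
    document = "Apache Beam pipelines can ingest documents in near real time."
    for index, chunk in enumerate(chunk_text(document, 20)):
        # Each chunk would be sent to an embedding model in a later step.
        print(index, repr(chunk))
```

Joining the chunks back together reproduces the original text, so nothing is lost at chunk boundaries; a real pipeline might prefer boundaries that respect sentences.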
[beam] branch master updated (50b087cb81e -> 5ceed8eaf09)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from 50b087cb81e Bump github.com/google/uuid from 1.3.0 to 1.3.1 in /sdks (#28087) add 5ceed8eaf09 use google form for feedback (#28013) No new revisions were added by this update. Summary of changes: website/www/site/layouts/partials/feedback.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated: Re-word line in Octo case study
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 8837c93d642 Re-word line in Octo case study new 83f09e70682 Merge pull request #27992 from jrmccluskey/caseStudyCleanup 8837c93d642 is described below commit 8837c93d6421040e6cdd87d9db90cef64493a14e Author: Jack McCluskey AuthorDate: Mon Aug 14 14:09:36 2023 -0400 Re-word line in Octo case study --- website/www/site/content/en/case-studies/octo.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/www/site/content/en/case-studies/octo.md b/website/www/site/content/en/case-studies/octo.md index b7fcf824ee6..9ab6fd4b00c 100644 --- a/website/www/site/content/en/case-studies/octo.md +++ b/website/www/site/content/en/case-studies/octo.md @@ -64,7 +64,7 @@ In this spotlight, OCTO’s Data Architect, Godefroy Clair, and Data Engineers, OCTO’s Client, a prominent grocery and convenience store retailer with tens of thousands of stores across several countries, relies on an internal web app to empower store managers with informed purchasing decisions and effective store management. The web app provides access to crucial product details, stock quantities, pricing, promotions, and more, sourced from various internal data stores, platforms, and systems. -Before 2022, the Client utilized [Cloud Composer](https://cloud.google.com/composer) for orchestrating batch pipelines that consolidated and processed data from Cloud Storage files and Pub/Sub messages and wrote the output to BigQuery. However, with most source data uploaded at night, batch processing posed challenges in meeting SLAs and providing the most recent information to store managers before store opening. Moreover, incorrect or missing data uploads required cumbersome database s [...] 
+Before 2022, the Client utilized an orchestration engine for orchestrating batch pipelines that consolidated and processed data from Cloud Storage files and Pub/Sub messages and wrote the output to BigQuery. However, with most source data uploaded at night, batch processing posed challenges in meeting SLAs and providing the most recent information to store managers before store opening. Moreover, incorrect or missing data uploads required cumbersome database state reverts, involving a su [...] To address these issues, the Client sought OCTO's expertise to transform their data ecosystem and migrate their core use case from batch to streaming. The objectives included faster data processing, ensuring the freshest data in the web app, simplifying pipeline and database maintenance, ensuring scalability and resilience, and efficiently handling spikes in data volumes.
[beam] branch aaltay-patch-1 deleted (was fb4c8587386)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git was fb4c8587386 Add anchor link for the logo wall The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
[beam] branch aaltay-patch-2 deleted (was c4d9faa430a)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-2 in repository https://gitbox.apache.org/repos/asf/beam.git was c4d9faa430a update index The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
[beam] branch master updated: Update learning-resources.md with links to video material (#27079)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new e351272a711 Update learning-resources.md with links to video material (#27079) e351272a711 is described below commit e351272a71166d5daafb5d4313e78d5808b3ccbd Author: Ahmet Altay AuthorDate: Fri Jun 9 13:35:39 2023 -0700 Update learning-resources.md with links to video material (#27079) --- .../content/en/get-started/resources/learning-resources.md | 10 ++ .../www/site/layouts/partials/section-menu/en/get-started.html | 3 +++ 2 files changed, 13 insertions(+) diff --git a/website/www/site/content/en/get-started/resources/learning-resources.md b/website/www/site/content/en/get-started/resources/learning-resources.md index 2deac6c7067..abd4566f4d3 100644 --- a/website/www/site/content/en/get-started/resources/learning-resources.md +++ b/website/www/site/content/en/get-started/resources/learning-resources.md @@ -80,6 +80,16 @@ If you have additional material that you would like to see here, please let us k * **[Timely and Stateful Processing](/blog/2017/08/28/timely-processing.html)** - An example on how to do batched RPC calls. The call requests are stored in a mutable state as they are received. Once there are either enough requests or a certain time has passed, the batch of requests is triggered to be sent. * **[Running External Libraries](https://cloud.google.com/blog/products/gcp/running-external-libraries-with-cloud-dataflow-for-grid-computing-workloads)** - Call an external library written in a language that does not have a native SDK in Apache Beam such as C++. +## Videos {#videos} + +* **[Getting Started with Apache Beam](https://www.youtube.com/playlist?list=PLIivdWyY5sqIEiHGunZXg_yoS7unlHNJt)** - Five part video series for understanding basic to advanced concepts. 
+* See more [Videos and Podcasts](/get-started/resources/videos-and-podcasts/) + +## Courses {#courses} + +* **[Beam College](https://beamcollege.dev/)** -- Free live and recorded lessons for learning Beam and data processing. +* **[Serverless Data Processing](https://www.coursera.org/specializations/serverless-data-processing-with-dataflow)** - Course specialized for Dataflow runner. + ## Books {#books} ### Building Big Data Pipelines with Apache Beam diff --git a/website/www/site/layouts/partials/section-menu/en/get-started.html b/website/www/site/layouts/partials/section-menu/en/get-started.html index 843f0ae5a15..80771f5039e 100644 --- a/website/www/site/layouts/partials/section-menu/en/get-started.html +++ b/website/www/site/layouts/partials/section-menu/en/get-started.html @@ -44,7 +44,10 @@ Getting Started Articles +Videos +Courses Books +Certifications Interactive Labs Beam Katas Code Examples
[beam] branch aaltay-patch-2 updated (3100354a510 -> c4d9faa430a)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-2 in repository https://gitbox.apache.org/repos/asf/beam.git from 3100354a510 Update learning-resources.md add c4d9faa430a update index No new revisions were added by this update. Summary of changes: website/www/site/layouts/partials/section-menu/en/get-started.html | 3 +++ 1 file changed, 3 insertions(+)
[beam] branch aaltay-patch-2 updated (48a11bfcf3b -> 3100354a510)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-2 in repository https://gitbox.apache.org/repos/asf/beam.git from 48a11bfcf3b Update learning-resources.md add 3100354a510 Update learning-resources.md No new revisions were added by this update. Summary of changes: website/www/site/content/en/get-started/resources/learning-resources.md | 1 + 1 file changed, 1 insertion(+)
[beam] 01/01: Update learning-resources.md
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch aaltay-patch-2 in repository https://gitbox.apache.org/repos/asf/beam.git commit 48a11bfcf3b5184115495b9dbf364bcc4cfd50bc Author: Ahmet Altay AuthorDate: Thu Jun 8 20:54:09 2023 -0700 Update learning-resources.md Adding new learning resources --- .../site/content/en/get-started/resources/learning-resources.md | 9 + 1 file changed, 9 insertions(+) diff --git a/website/www/site/content/en/get-started/resources/learning-resources.md b/website/www/site/content/en/get-started/resources/learning-resources.md index 2deac6c7067..746bd0c8c1f 100644 --- a/website/www/site/content/en/get-started/resources/learning-resources.md +++ b/website/www/site/content/en/get-started/resources/learning-resources.md @@ -80,6 +80,15 @@ If you have additional material that you would like to see here, please let us k * **[Timely and Stateful Processing](/blog/2017/08/28/timely-processing.html)** - An example on how to do batched RPC calls. The call requests are stored in a mutable state as they are received. Once there are either enough requests or a certain time has passed, the batch of requests is triggered to be sent. * **[Running External Libraries](https://cloud.google.com/blog/products/gcp/running-external-libraries-with-cloud-dataflow-for-grid-computing-workloads)** - Call an external library written in a language that does not have a native SDK in Apache Beam such as C++. +## Videos {#videos} + +* **[Getting Started with Apache Beam](https://www.youtube.com/playlist?list=PLIivdWyY5sqIEiHGunZXg_yoS7unlHNJt)** - Five part video series for understanding basic to advanced concepts. + +## Courses {#courses} + +* **[Beam College](https://beamcollege.dev/)** -- Free live and recorded lessons for learning Beam and data processing. +* **[Serverless Data Processing](https://www.coursera.org/specializations/serverless-data-processing-with-dataflow)** - Course specialized for Dataflow runner. 
+ ## Books {#books} ### Building Big Data Pipelines with Apache Beam
[beam] branch aaltay-patch-2 created (now 48a11bfcf3b)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-2 in repository https://gitbox.apache.org/repos/asf/beam.git at 48a11bfcf3b Update learning-resources.md This branch includes the following new commits: new 48a11bfcf3b Update learning-resources.md The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[beam] branch master updated (8e5bf7c577d -> 3213394640e)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from 8e5bf7c577d Add note for updated release tag for 2.48 (#27073) add 3213394640e Quote Addition + Sharing instructions [Beam Quest Blog] (#27066) No new revisions were added by this update. Summary of changes: website/www/site/content/en/blog/beamquest.md | 5 - 1 file changed, 4 insertions(+), 1 deletion(-)
[beam] branch master updated: Add Certification + Beam Quest to Documentation (#26997)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 2e4e68bc8f0 Add Certification + Beam Quest to Documentation (#26997) 2e4e68bc8f0 is described below commit 2e4e68bc8f03c2ce4e5976593280f42d2ddce0bf Author: Svetak Sundhar AuthorDate: Thu Jun 8 16:12:58 2023 -0400 Add Certification + Beam Quest to Documentation (#26997) Co-authored-by: Rebecca Szper <98840847+rsz...@users.noreply.github.com> --- .../site/content/en/get-started/resources/learning-resources.md | 8 1 file changed, 8 insertions(+) diff --git a/website/www/site/content/en/get-started/resources/learning-resources.md b/website/www/site/content/en/get-started/resources/learning-resources.md index e435a07b287..2deac6c7067 100644 --- a/website/www/site/content/en/get-started/resources/learning-resources.md +++ b/website/www/site/content/en/get-started/resources/learning-resources.md @@ -90,6 +90,14 @@ If you have additional material that you would like to see here, please let us k **[Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing](https://learning.oreilly.com/library/view/streaming-systems/9781491983867/)** by Tyler Akidau, Slava Chernyak, Reuven Lax. (August 2018). Expanded from Tyler Akidau’s popular blog posts "Streaming 101" and "Streaming 102", this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. + +## Certifications {#certifications} + +### Getting Started with Apache Beam Quest + +**[Get Started with Apache Beam](https://www.cloudskillsboost.google/quests/310)** This quest includes four labs that teach you how to write and test Apache Beam pipelines. Three of the labs use Java and one uses Python. Each lab takes about 1.5 hours to complete. 
When you complete the quest, you're granted a badge that you can use to show your Beam expertise. + + ## Interactive Labs {#interactive-labs} ### Java
[beam] branch master updated: Beam quest Blog Addon (#27051)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 718a449bf89 Beam quest Blog Addon (#27051) 718a449bf89 is described below commit 718a449bf89b5fba6f7cf4ebd34020277ee4b187 Author: Svetak Sundhar AuthorDate: Wed Jun 7 16:28:20 2023 -0400 Beam quest Blog Addon (#27051) Co-authored-by: Rebecca Szper <98840847+rsz...@users.noreply.github.com> Co-authored-by: Ahmet Altay --- website/www/site/content/en/blog/beamquest.md | 4 +++- .../site/static/images/blog/apch-beam-w_bdg_c_en.png| Bin 81879 -> 0 bytes .../site/static/images/blog/beam-badge-image-scaled.png | Bin 0 -> 53129 bytes 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/website/www/site/content/en/blog/beamquest.md b/website/www/site/content/en/blog/beamquest.md index 61b30d41b6b..56ad8401536 100644 --- a/website/www/site/content/en/blog/beamquest.md +++ b/website/www/site/content/en/blog/beamquest.md @@ -21,7 +21,7 @@ limitations under the License. --> @@ -32,3 +32,5 @@ Beam is one of the largest big data open source projects actively in development Individuals aren’t the only ones who can benefit from completing this quest - organizations can too! Because earning this badge represents deep knowledge of an industry leading big data library, having the badge validates your organization’s understanding of Beam. In addition, you can run the Beam library on a wide variety of runners, including Google Cloud Dataflow, Flink, Spark, and more, making knowledge about this library highly transferable. Finally, your organization can use this [...] Data Processing is a key part of AI/ML workflows. Given the recent advancements in artificial intelligence, now’s the time to jump into the world of data processing! Get started on your journey [here](https://www.cloudskillsboost.google/quests/310). 
+ +We are currently offering this quest **FREE OF CHARGE** until **July 8, 2023** for the **first 2,000** people. To obtain your badge for **FREE**, use the [Access Code](https://www.cloudskillsboost.google/catalog?qlcampaign=1h-swiss-19), create an account, and search ["Getting Started with Apache Beam"](https://www.cloudskillsboost.google/quests/310). diff --git a/website/www/site/static/images/blog/apch-beam-w_bdg_c_en.png b/website/www/site/static/images/blog/apch-beam-w_bdg_c_en.png deleted file mode 100644 index e39d5e75917..000 Binary files a/website/www/site/static/images/blog/apch-beam-w_bdg_c_en.png and /dev/null differ diff --git a/website/www/site/static/images/blog/beam-badge-image-scaled.png b/website/www/site/static/images/blog/beam-badge-image-scaled.png new file mode 100644 index 000..c71761b0cb0 Binary files /dev/null and b/website/www/site/static/images/blog/beam-badge-image-scaled.png differ
[beam] branch master updated: Beam Quest Blogpost (#27004)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new bfa7a63f47f Beam Quest Blogpost (#27004) bfa7a63f47f is described below commit bfa7a63f47f00e03f1b6d6a019218dd1cea83c51 Author: Svetak Sundhar AuthorDate: Tue Jun 6 19:55:03 2023 -0400 Beam Quest Blogpost (#27004) Co-authored-by: Rebecca Szper <98840847+rsz...@users.noreply.github.com> --- website/www/site/content/en/blog/beamquest.md | 34 + website/www/site/data/authors.yml | 4 +++ .../static/images/blog/apch-beam-w_bdg_c_en.png| Bin 0 -> 81879 bytes 3 files changed, 38 insertions(+) diff --git a/website/www/site/content/en/blog/beamquest.md b/website/www/site/content/en/blog/beamquest.md new file mode 100644 index 000..61b30d41b6b --- /dev/null +++ b/website/www/site/content/en/blog/beamquest.md @@ -0,0 +1,34 @@ +--- +title: "Getting started with Apache Beam: An open source proficiency credential sponsored by Google Cloud" +date: 2023-06-06 00:00:01 -0800 +categories: + - blog +aliases: + - /blog/2023/06/06/beam-quest.html +authors: + - svetakvsundhar +--- + + + + + +We’re excited to announce the release of the [“Getting Started with Apache Beam” quest](https://www.cloudskillsboost.google/quests/310), a series of four online labs that venture into different Apache Beam concepts. When you complete all four labs, you’ll earn a Google Cloud badge that you can share on platforms like LinkedIn. Earning this badge should take less than seven hours total, and signing up for the quest costs $20 (there are often free specials for people who attend Beam events [...] + +Beam is one of the largest big data open source projects actively in development. Over the past six years, the Apache Beam community has seen tremendous growth in the number of contributors, committers, and users. 
If you’re a long time Beam user, you can now earn a badge to show your skills to potential employers. If you’re new to Beam, you can begin your learning journey with this quest. To attempt this quest, you don’t need any prior knowledge of data processing or distributed systems [...] + +Individuals aren’t the only ones who can benefit from completing this quest - organizations can too! Because earning this badge represents deep knowledge of an industry leading big data library, having the badge validates your organization’s understanding of Beam. In addition, you can run the Beam library on a wide variety of runners, including Google Cloud Dataflow, Flink, Spark, and more, making knowledge about this library highly transferable. Finally, your organization can use this [...] + +Data Processing is a key part of AI/ML workflows. Given the recent advancements in artificial intelligence, now’s the time to jump into the world of data processing! Get started on your journey [here](https://www.cloudskillsboost.google/quests/310). diff --git a/website/www/site/data/authors.yml b/website/www/site/data/authors.yml index 18498392e23..36931d3ee87 100644 --- a/website/www/site/data/authors.yml +++ b/website/www/site/data/authors.yml @@ -245,6 +245,10 @@ alexkosolapov: hermannb: name: Brittany Hermann email: herma...@google.com +svetakvsundhar: + name: Svetak Sundhar + email: svetaksund...@google.com + twitter: svetaksundhar iht: name: Israel Herraiz email: i...@google.com diff --git a/website/www/site/static/images/blog/apch-beam-w_bdg_c_en.png b/website/www/site/static/images/blog/apch-beam-w_bdg_c_en.png new file mode 100644 index 000..e39d5e75917 Binary files /dev/null and b/website/www/site/static/images/blog/apch-beam-w_bdg_c_en.png differ
[beam] branch master updated: Update the 2.47 release notes with the autoUpdateSchema issue
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 9e9ea9acd21 Update the 2.47 release notes with the autoUpdateSchema issue
     new d3bb6efec65 Merge pull request #26807 from liferoad/autoschema-known-issues
9e9ea9acd21 is described below

commit 9e9ea9acd21a016fe3297fe4016a3778e0b5318f
Author: xqhu
AuthorDate: Sat May 20 19:36:03 2023 -0400

    Update the 2.47 release notes with the autoUpdateSchema issue
---
 website/www/site/content/en/blog/beam-2.47.0.md | 4
 1 file changed, 4 insertions(+)

diff --git a/website/www/site/content/en/blog/beam-2.47.0.md b/website/www/site/content/en/blog/beam-2.47.0.md
index b51681e78e7..d0375a1e2ad 100644
--- a/website/www/site/content/en/blog/beam-2.47.0.md
+++ b/website/www/site/content/en/blog/beam-2.47.0.md
@@ -63,6 +63,10 @@ For more information on changes in 2.47.0, check out the [detailed release notes
 * BigQuery sink in STORAGE_WRITE_API mode in batch pipelines might result in data consistency issues during the handling of other unrelated transient errors for Beam SDKs 2.35.0 - 2.46.0 (inclusive). For more details see: https://github.com/apache/beam/issues/26521
+### Known Issues
+
+* BigQueryIO Storage API write with autoUpdateSchema may cause data corruption for Beam SDKs 2.45.0 - 2.47.0 (inclusive) ([#26789](https://github.com/apache/beam/issues/26789))
+
 ## List of Contributors
 According to git shortlog, the following people contributed to the 2.47.0 release. Thank you to all contributors!
[beam] branch master updated: Wesbite fix update booking (#26364)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 2c6a215b9d8 Wesbite fix update booking (#26364) 2c6a215b9d8 is described below commit 2c6a215b9d87e2b94809d54f75b2dda12eb92c90 Author: bullet03 AuthorDate: Fri Apr 21 07:27:29 2023 +0600 Wesbite fix update booking (#26364) Co-authored-by: Alex Kosolapov --- .../www/site/content/en/case-studies/booking.md| 4 ++-- .../case-study/booking/stateful-capabilities.png | Bin 109921 -> 110284 bytes .../case-study/booking/streaming-processing.png| Bin 178911 -> 115271 bytes 3 files changed, 2 insertions(+), 2 deletions(-) diff --git a/website/www/site/content/en/case-studies/booking.md b/website/www/site/content/en/case-studies/booking.md index b0d9af14a47..487861a6e30 100644 --- a/website/www/site/content/en/case-studies/booking.md +++ b/website/www/site/content/en/case-studies/booking.md @@ -78,7 +78,7 @@ The mass bidding infrastructure had undergone several rewrites before Apache Bea Igor Dralyuk - Senior Software Engineer @ Booking.com + Principal Engineer @ Booking.com @@ -141,7 +141,7 @@ The quality of documentation, as well as the vibrant Apache Beam open-source com Igor Dralyuk - Senior Software Engineer @ Booking.com + Principal Engineer @ Booking.com diff --git a/website/www/site/static/images/case-study/booking/stateful-capabilities.png b/website/www/site/static/images/case-study/booking/stateful-capabilities.png index 84899dbfc72..ad709a7dbd2 100644 Binary files a/website/www/site/static/images/case-study/booking/stateful-capabilities.png and b/website/www/site/static/images/case-study/booking/stateful-capabilities.png differ diff --git a/website/www/site/static/images/case-study/booking/streaming-processing.png b/website/www/site/static/images/case-study/booking/streaming-processing.png index a32d94ab8ca..1871cfffac6 100644 
Binary files a/website/www/site/static/images/case-study/booking/streaming-processing.png and b/website/www/site/static/images/case-study/booking/streaming-processing.png differ
[beam] branch master updated (bd8950176db -> c3fd2e09492)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from bd8950176db Use external config schema to construct Python SchemaTransform payload (#26100) add c3fd2e09492 [Website] add booking case-study (#26294) No new revisions were added by this update. Summary of changes: website/www/site/assets/scss/_case_study.scss | 4 + .../www/site/content/en/case-studies/booking.md| 258 + website/www/site/data/en/quotes.yaml | 5 + .../static/images/case-study/booking/booking.ico | Bin 0 -> 1582 bytes .../images/case-study/booking/igor_dralyuk.jpg | Bin 0 -> 11870 bytes .../images/case-study/booking/prasanjit_barua.jpg | Bin 0 -> 42661 bytes .../images/case-study/booking/sergey_dovgal.jpg| Bin 0 -> 39705 bytes .../case-study/booking/stateful-capabilities.png | Bin 0 -> 109921 bytes .../case-study/booking/streaming-processing.png| Bin 0 -> 178911 bytes .../static/images/case-study/booking/warren_qi.jpg | Bin 0 -> 26681 bytes .../static/images/logos/powered-by/booking.png | Bin 0 -> 9264 bytes 11 files changed, 267 insertions(+) create mode 100644 website/www/site/content/en/case-studies/booking.md create mode 100644 website/www/site/static/images/case-study/booking/booking.ico create mode 100644 website/www/site/static/images/case-study/booking/igor_dralyuk.jpg create mode 100644 website/www/site/static/images/case-study/booking/prasanjit_barua.jpg create mode 100644 website/www/site/static/images/case-study/booking/sergey_dovgal.jpg create mode 100644 website/www/site/static/images/case-study/booking/stateful-capabilities.png create mode 100644 website/www/site/static/images/case-study/booking/streaming-processing.png create mode 100644 website/www/site/static/images/case-study/booking/warren_qi.jpg create mode 100644 website/www/site/static/images/logos/powered-by/booking.png
[beam] branch master updated: Fix example code on from-spark.md
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new d5810397e60 Fix example code on from-spark.md
     new 170c4184ce1 Merge pull request #25633 from andreykot/patch-1
d5810397e60 is described below

commit d5810397e60564e063fa0f393daab93119ba1bc3
Author: Andrey Kot
AuthorDate: Sun Feb 26 02:09:53 2023 +0100

    Fix example code on from-spark.md
---
 website/www/site/content/en/get-started/from-spark.md | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/website/www/site/content/en/get-started/from-spark.md b/website/www/site/content/en/get-started/from-spark.md
index 26a615304b3..36d1fb2a045 100644
--- a/website/www/site/content/en/get-started/from-spark.md
+++ b/website/www/site/content/en/get-started/from-spark.md
@@ -312,11 +312,12 @@ with beam.Pipeline() as pipeline:
     min_value = values | beam.CombineGlobally(min)
     max_value = values | beam.CombineGlobally(max)

-    # To access `total`, we need to pass it as a side input.
+    # To access `min_value` and `max_value`, we need to pass them as a side input.
     scaled_values = values | beam.Map(
-        lambda x, min_value, max_value: x / lambda x: (x - min_value) / (max_value - min_value),
-        min_value =beam.pvalue.AsSingleton(min_value),
-        max_value =beam.pvalue.AsSingleton(max_value))
+        lambda x, minimum, maximum: (x - minimum) / (maximum - minimum),
+        minimum=beam.pvalue.AsSingleton(min_value),
+        maximum=beam.pvalue.AsSingleton(max_value),
+    )

     scaled_values | beam.Map(print)
 {{< /highlight >}}
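For readers skimming the archive, the arithmetic in the corrected lambda can be checked without a Beam installation. The following is only a plain-Python sketch of that formula: the helper name `scale` and the sample values are assumptions made for illustration and are not part of the commit. In the actual pipeline, the minimum and maximum are computed with `beam.CombineGlobally` and passed to `beam.Map` as side inputs through `beam.pvalue.AsSingleton`, as the diff shows.

```python
# Plain-Python sketch of the min-max scaling performed by the corrected lambda.
# `scale` and the sample `values` are hypothetical names, not taken from the commit.

def scale(x, minimum, maximum):
    """Scale x into the range [0, 1], mirroring the corrected lambda's signature."""
    # Like the lambda in the diff, this raises ZeroDivisionError if maximum == minimum.
    return (x - minimum) / (maximum - minimum)


values = [0, 5, 10]  # hypothetical input elements
minimum, maximum = min(values), max(values)  # in Beam: CombineGlobally(min) / CombineGlobally(max)
scaled_values = [scale(x, minimum, maximum) for x in values]
print(scaled_values)
```

Giving the lambda parameters names that differ from the outer `min_value` and `max_value` PCollections, as the commit does, avoids shadowing and makes it clear which values arrive as side inputs.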
[beam] 01/01: Add anchor link for the logo wall
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git commit fb4c8587386c65a11e257582f348fa3892f9986a Author: Ahmet Altay AuthorDate: Wed Feb 22 10:19:23 2023 -0800 Add anchor link for the logo wall --- website/www/site/layouts/case-studies/list.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/www/site/layouts/case-studies/list.html b/website/www/site/layouts/case-studies/list.html index c609d56f468..1021cf13912 100644 --- a/website/www/site/layouts/case-studies/list.html +++ b/website/www/site/layouts/case-studies/list.html @@ -70,7 +70,7 @@ limitations under the License. See accompanying LICENSE file. Share your story -Also used by +Also used by {{ range where $pages "Params.category" "ne" "study" }} {{ if .Params.hasLink }}
[beam] branch aaltay-patch-1 created (now fb4c8587386)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git at fb4c8587386 Add anchor link for the logo wall This branch includes the following new commits: new fb4c8587386 Add anchor link for the logo wall The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[beam] branch aaltay-patch-1 created (now bf2380c5ba8)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git at bf2380c5ba8 Update the title of the wordcount quickstart This branch includes the following new commits: new bf2380c5ba8 Update the title of the wordcount quickstart The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[beam] 01/01: Update the title of the wordcount quickstart
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git

commit bf2380c5ba8058c824df555bf7ed43bffbfcd3e5
Author: Ahmet Altay
AuthorDate: Tue Feb 14 12:26:35 2023 -0800

    Update the title of the wordcount quickstart

    Update the title to correctly reflect the new content. (follow up to : https://github.com/apache/beam/pull/24804)
---
 website/www/site/content/en/get-started/quickstart-py.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/www/site/content/en/get-started/quickstart-py.md b/website/www/site/content/en/get-started/quickstart-py.md
index aa905998d57..bd6fd3eaa0d 100644
--- a/website/www/site/content/en/get-started/quickstart-py.md
+++ b/website/www/site/content/en/get-started/quickstart-py.md
@@ -1,5 +1,5 @@
 ---
-title: "Beam Quickstart for Python"
+title: "WordCount Quickstart for Python"
 ---
[beam] branch master updated (b2d500f8494 -> f77366a8115)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from b2d500f8494 Fix pulling licenses (#25234) add f77366a8115 Ignore flags for beam_sql magic (#25210) No new revisions were added by this update. Summary of changes: sdks/python/apache_beam/runners/interactive/sql/utils.py | 5 - 1 file changed, 4 insertions(+), 1 deletion(-)
[beam] branch master updated: Adding the registered trademark symbol to the Apache Beam title on the browser tab
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 7ab334d807e Adding the registered trademark symbol to the Apache Beam title on the browser tab
     new a5e6d90081b Merge pull request #25008 from rszper/rszper-trademark
7ab334d807e is described below

commit 7ab334d807ec779926eaca6fb981900827accde3
Author: Rebecca Szper
AuthorDate: Fri Jan 13 21:43:19 2023 +

    Adding the registered trademark symbol to the Apache Beam title on the browser tab
---
 website/www/site/content/en/_index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/www/site/content/en/_index.md b/website/www/site/content/en/_index.md
index 719e5e12431..283fdd43c0c 100644
--- a/website/www/site/content/en/_index.md
+++ b/website/www/site/content/en/_index.md
@@ -1,5 +1,5 @@
 ---
-title: "Apache Beam"
+title: "Apache Beam®"
 ---
[beam] branch master updated: Fix link to videos-and-podcasts page. (#24733)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 256433fbc57 Fix link to videos-and-podcasts page. (#24733) 256433fbc57 is described below commit 256433fbc57d0cabab93819ad2c2a35690588547 Author: Ahmet Altay AuthorDate: Tue Dec 20 13:06:48 2022 -0800 Fix link to videos-and-podcasts page. (#24733) --- website/www/site/content/en/blog/gsoc-19.md | 2 +- website/www/site/content/en/get-started/from-spark.md | 2 +- website/www/site/content/en/get-started/mobile-gaming-example.md| 2 +- website/www/site/content/en/get-started/quickstart-go.md| 2 +- website/www/site/content/en/get-started/quickstart-java.md | 2 +- website/www/site/content/en/get-started/quickstart-py.md| 2 +- website/www/site/content/en/get-started/quickstart/java.md | 2 +- .../www/site/content/en/get-started/resources/videos-and-podcasts.md| 1 + website/www/site/content/en/get-started/try-apache-beam.md | 2 +- website/www/site/content/en/get-started/wordcount-example.md| 2 +- 10 files changed, 10 insertions(+), 9 deletions(-) diff --git a/website/www/site/content/en/blog/gsoc-19.md b/website/www/site/content/en/blog/gsoc-19.md index b889193c24f..4c6c4a431f2 100644 --- a/website/www/site/content/en/blog/gsoc-19.md +++ b/website/www/site/content/en/blog/gsoc-19.md @@ -49,7 +49,7 @@ I wanted to explore Data Engineering, so for GSoC, I wanted to work on a project I had already read the [Streaming Systems book](http://streamingsystems.net/). So, I had an idea of the concepts that Beam is built on, but had never actually used Beam. Before actually submitting a proposal, I went through a bunch of resources to make sure I had a concrete understanding of Beam. 
I read the [Streaming 101](https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101) and [Streaming 102](https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102) blogs by Tyler Akidau. They are the perfect introduction to Beam’s unified model for Batch and Streaming. -In addition, I watched all Beam talks on YouTube. You can find them on the [Beam Website](https://beam.apache.org/documentation/resources/videos-and-podcasts/). +In addition, I watched all Beam talks on YouTube. You can find them on the [Beam Website](https://beam.apache.org/get-started/resources/videos-and-podcasts/). Beam has really good documentation. The [Programming Guide](https://beam.apache.org/documentation/programming-guide/) lays out all of Beam’s concepts really well. [Beam’s execution model](https://beam.apache.org/documentation/runtime/model) is also documented well and is a must-read to understand how Beam processes data. [waitingforcode.com](https://www.waitingforcode.com/apache-beam) also has good blog posts about Beam concepts. To get a better sense of the Beam codebase, I played around with it and worked on some PRs to understand Beam better and got familiar with the test suite and workflows. diff --git a/website/www/site/content/en/get-started/from-spark.md b/website/www/site/content/en/get-started/from-spark.md index 54546f6de4b..b1659b02cfc 100644 --- a/website/www/site/content/en/get-started/from-spark.md +++ b/website/www/site/content/en/get-started/from-spark.md @@ -332,7 +332,7 @@ with beam.Pipeline() as pipeline: * Learn how to read from and write to files in the [_Pipeline I/O_ section of the _Programming guide_](/documentation/programming-guide/#pipeline-io) * Walk through additional WordCount examples in the [WordCount Example Walkthrough](/get-started/wordcount-example). * Take a self-paced tour through our [Learning Resources](/documentation/resources/learning-resources). 
-* Dive in to some of our favorite [Videos and Podcasts](/documentation/resources/videos-and-podcasts). +* Dive in to some of our favorite [Videos and Podcasts](/get-started/resources/videos-and-podcasts). * Join the Beam [users@](/community/contact-us) mailing list. * If you're interested in contributing to the Apache Beam codebase, see the [Contribution Guide](/contribute). diff --git a/website/www/site/content/en/get-started/mobile-gaming-example.md b/website/www/site/content/en/get-started/mobile-gaming-example.md index d972f4597ea..63be47688be 100644 --- a/website/www/site/content/en/get-started/mobile-gaming-example.md +++ b/website/www/site/content/en/get-started/mobile-gaming-example.md @@ -412,7 +412,7 @@ We can use the resulting information to find, for example, what times of day our ## Next Steps * Take a self-paced tour through our [Learning Resources](/documentation/resources/learning-resources). -* Dive in to some of our favorite [Videos and Podcasts](/documentation/resources/videos-and-pod
[beam-starter-kotlin] branch main updated: Removing apilloud
This is an automated email from the ASF dual-hosted git repository.
altay pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/beam-starter-kotlin.git
The following commit(s) were added to refs/heads/main by this push:
    new d882a05 Removing apilloud
    new a369f40 Merge pull request #13 from davidcavazos/patch-1
d882a05 is described below
commit d882a05d037a1326778edbcc80775cb7fad5aa7a
Author: David Cavazos
AuthorDate: Tue Dec 13 10:07:59 2022 -0800
    Removing apilloud
---
 .github/dependabot.yml | 2 --
 1 file changed, 2 deletions(-)
diff --git a/.github/dependabot.yml b/.github/dependabot.yml
index a40bfc2..c63ba92 100644
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@@ -18,7 +18,6 @@ updates:
 - kennknowles
 - robertwb
 - kileys
- - apilloud
 - package-ecosystem: "github-actions"
 directory: "/"
@@ -30,4 +29,3 @@ updates:
 - kennknowles
 - robertwb
 - kileys
- - apilloud
[beam] branch master updated: refs: issue-24196, fix broken hyperlink
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new ead245539d0 refs: issue-24196, fix broken hyperlink new fef8acdbc0e Merge pull request #24199 from Laksh47/issue#24196 ead245539d0 is described below commit ead245539d01dec0f3e08699c1e1cc6777a5ef0e Author: Laksh AuthorDate: Wed Nov 16 09:32:46 2022 -0500 refs: issue-24196, fix broken hyperlink --- website/www/site/content/en/blog/splitAtFraction-method.md| 2 +- website/www/site/content/en/blog/splittable-do-fn.md | 4 ++-- website/www/site/content/en/documentation/runners/dataflow.md | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/website/www/site/content/en/blog/splitAtFraction-method.md b/website/www/site/content/en/blog/splitAtFraction-method.md index 0ae5b734693..f2e280aaabd 100644 --- a/website/www/site/content/en/blog/splitAtFraction-method.md +++ b/website/www/site/content/en/blog/splitAtFraction-method.md @@ -22,7 +22,7 @@ See the License for the specific language governing permissions and limitations under the License. --> -This morning, Eugene and Malo from the Google Cloud Dataflow team posted [*No shard left behind: dynamic work rebalancing in Google Cloud Dataflow*](https://cloud.google.com/blog/big-data/2016/05/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow). This article discusses Cloud Dataflow’s solution to the well-known straggler problem. +This morning, Eugene and Malo from the Google Cloud Dataflow team posted [*No shard left behind: dynamic work rebalancing in Google Cloud Dataflow*](https://cloud.google.com/blog/products/gcp/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow). This article discusses Cloud Dataflow’s solution to the well-known straggler problem. 
diff --git a/website/www/site/content/en/blog/splittable-do-fn.md b/website/www/site/content/en/blog/splittable-do-fn.md index d7c1abfafe8..f38a5d6d488 100644 --- a/website/www/site/content/en/blog/splittable-do-fn.md +++ b/website/www/site/content/en/blog/splittable-do-fn.md @@ -187,7 +187,7 @@ runner with information such as its estimated size (or its generalization, uses this information to tune the execution and control the breakdown of the `Source` into bundles. For example, a slowly progressing large bundle of a file may be [dynamically -split](https://cloud.google.com/blog/big-data/2016/05/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow) +split](https://cloud.google.com/blog/products/gcp/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow) by a batch-focused runner before it becomes a straggler, and a latency-focused streaming runner may control how many elements it reads from a source in each bundle to optimize for latency vs. per-bundle overhead. @@ -251,7 +251,7 @@ a `@ProcessElement` call is going to take too long and become a straggler, it can split the restriction in some proportion so that the primary is short enough to not be a straggler, and can schedule the residual in parallel on another worker. For details, see [No Shard Left -Behind](https://cloud.google.com/blog/big-data/2016/05/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow). +Behind](https://cloud.google.com/blog/products/gcp/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow). 
Logically, the execution of an SDF on an element works according to the following diagram, where "magic" stands for the runner-specific ability to split diff --git a/website/www/site/content/en/documentation/runners/dataflow.md b/website/www/site/content/en/documentation/runners/dataflow.md index eb5398d3c25..7b5d3e60f56 100644 --- a/website/www/site/content/en/documentation/runners/dataflow.md +++ b/website/www/site/content/en/documentation/runners/dataflow.md @@ -26,7 +26,7 @@ The Cloud Dataflow Runner and service are suitable for large scale, continuous j * a fully managed service * [autoscaling](https://cloud.google.com/dataflow/service/dataflow-service-desc#autoscaling) of the number of workers throughout the lifetime of the job -* [dynamic work rebalancing](https://cloud.google.com/blog/big-data/2016/05/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow) +* [dynamic work rebalancing](https://cloud.google.com/blog/products/gcp/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow) The [Beam Capability Matrix](/documentation/runners/capability-matrix/) documents the supported capabilities of the Cloud Dataflow Runner.
[beam-starter-java] branch main updated: update to java 17
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/beam-starter-java.git The following commit(s) were added to refs/heads/main by this push: new 51870d2 update to java 17 new 4c4d429 Merge pull request #32 from davidcavazos/update-java17 51870d2 is described below commit 51870d2f556a3c5d989893adf01983cf6889f10b Author: David Cavazos AuthorDate: Tue Nov 15 14:30:48 2022 -0800 update to java 17 --- .github/workflows/test.yaml | 6 +++--- README.md | 12 build.gradle| 5 ++--- build.sbt | 2 +- pom.xml | 13 + 5 files changed, 15 insertions(+), 23 deletions(-) diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index b1b9817..37d64b4 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -18,7 +18,7 @@ jobs: - uses: actions/setup-java@v3 with: distribution: 'temurin' -java-version: '11' +java-version: '17' cache: 'gradle' - run: gradle assemble --info - run: gradle test --info @@ -31,7 +31,7 @@ jobs: - uses: actions/setup-java@v3 with: distribution: 'temurin' -java-version: '11' +java-version: '17' - run: sbt -v 'Test / compile' assembly - run: sbt -v test - run: java -jar build/pipeline.jar --inputText="🎉" @@ -43,7 +43,7 @@ jobs: - uses: actions/setup-java@v3 with: distribution: 'temurin' -java-version: '11' +java-version: '17' cache: 'maven' - run: mvn -DskipTests test-compile package - run: mvn test diff --git a/README.md b/README.md index 083abf4..fafeef3 100644 --- a/README.md +++ b/README.md @@ -8,16 +8,12 @@ you can choose the license you prefer and feel free to delete anything related t Make sure you have a [Java](https://en.wikipedia.org/wiki/Java_%28programming_language%29) development environment ready. If you don't, an easy way to install it is with [`sdkman`](https://sdkman.io). -> ℹ️ Apache Beam currently only supports Java 8 (LTS) and Java 11 (LTS). 
-> -> ⚠️ **[Java 8 ends its active support in March 2022](https://endoflife.date/java)**, so we recommend using Java 11 (LTS) until Java 17 (LTS) is supported. - ```sh # Install sdkman. curl -s "https://get.sdkman.io" | bash -# Make sure you have Java 11 installed. -sdk install java 11.0.12-tem +# Make sure you have Java 17 installed. +sdk install java 17.0.5-tem ``` ## Source file structure @@ -31,7 +27,7 @@ There are only two source files: > ℹ️ Most build tools expect all the Java source files to be under > `src/main/java/` and tests to be under `src/test/java/` by default. -### Option A: Gradle +### Option A: Gradle _(recommended)_ [Gradle](https://gradle.org) is a build tool focused on flexibility and performance. @@ -98,7 +94,7 @@ sbt assembly java -jar build/pipeline.jar --inputText="🎉" ``` -### Option C: Apache Maven _(not recommended)_ +### Option C: Apache Maven [Apache Maven](http://maven.apache.org) is a project management and comprehension tool based on the concept of a project object model (POM). diff --git a/build.gradle b/build.gradle index 2fbf7dd..378181f 100644 --- a/build.gradle +++ b/build.gradle @@ -23,11 +23,10 @@ test { useJUnit() } -def beamVersion = '2.42.0' dependencies { // App dependencies. -implementation "org.apache.beam:beam-sdks-java-core:${beamVersion}" -implementation "org.apache.beam:beam-runners-direct-java:${beamVersion}" +implementation "org.apache.beam:beam-sdks-java-core:2.42.0" +implementation "org.apache.beam:beam-runners-direct-java:2.42.0" implementation "org.slf4j:slf4j-jdk14:1.7.32" // Tests dependencies. diff --git a/build.sbt b/build.sbt index 357158b..ff79e18 100644 --- a/build.sbt +++ b/build.sbt @@ -8,7 +8,7 @@ mainClass := Some("com.example.App") -val beamVersion = "2.39.0" +val beamVersion = "2.42.0" libraryDependencies ++= Seq( // App dependencies. 
"org.apache.beam" % "beam-sdks-java-core" % beamVersion, diff --git a/pom.xml b/pom.xml index cecb53c..669d6ce 100644 --- a/pom.xml +++ b/pom.xml @@ -17,12 +17,9 @@ 1 -11 -11 +17 +17 UTF-8 - -2.40.0 -4.13.2 @@ -83,13 +80,13 @@ org.apache.beam beam-sdks-java-core - ${beam.version} + 2.42.0 org.apache.beam beam-runners-direct-java - ${beam.version} + 2.42.0 runtime @@ -103,7 +100,7 @@ junit junit - ${junit.version} + 4.13.2 test
[beam] branch master updated: Retroactively announce Batched DoFn support in 2.42.0 Blog (#24011)
This is an automated email from the ASF dual-hosted git repository.
altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
    new 708e05f0dc6 Retroactively announce Batched DoFn support in 2.42.0 Blog (#24011)
708e05f0dc6 is described below
commit 708e05f0dc6c0a2e62674cd0ecb8a6744ae57d3a
Author: Brian Hulette
AuthorDate: Tue Nov 8 12:18:23 2022 -0800
    Retroactively announce Batched DoFn support in 2.42.0 Blog (#24011)
    * Retroactively announce Batched DoFn support in 2.42.0
    * Add to blog as well
---
 CHANGES.md | 3 +++
 website/www/site/content/en/blog/beam-2.42.0.md | 3 +++
 2 files changed, 6 insertions(+)
diff --git a/CHANGES.md b/CHANGES.md
index f81f6978aa7..a46cdee0de8 100644
--- CHANGES.md
+++ CHANGES.md
@@ -132,6 +132,9 @@
 ## Highlights
 * Added support for stateful DoFns to the Go SDK.
+* Added support for [Batched
+ DoFns](https://beam.apache.org/documentation/programming-guide/#batched-dofns)
+ to the Python SDK.
 ## New Features / Improvements
diff --git a/website/www/site/content/en/blog/beam-2.42.0.md b/website/www/site/content/en/blog/beam-2.42.0.md
index edb144f25e2..08b74962117 100644
--- website/www/site/content/en/blog/beam-2.42.0.md
+++ website/www/site/content/en/blog/beam-2.42.0.md
@@ -31,6 +31,9 @@ For more information on changes in 2.42.0, check out the [detailed release notes
 ## Highlights
 * Added support for stateful DoFns to the Go SDK.
+* Added support for [Batched
+ DoFns](https://beam.apache.org/documentation/programming-guide/#batched-dofns)
+ to the Python SDK.
 ## New Features / Improvements
[beam] branch master updated (d10b4a28fff -> 1a643d16112)
This is an automated email from the ASF dual-hosted git repository.
altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git
    from d10b4a28fff removed trailing whitespace (#23987)
     add 1a643d16112 Beam starter projects blog post (#23964)
No new revisions were added by this update.
Summary of changes:
 .../site/content/en/blog/beam-starter-projects.md | 77 ++
 website/www/site/data/authors.yml | 3 +
 2 files changed, 80 insertions(+)
 create mode 100644 website/www/site/content/en/blog/beam-starter-projects.md
[beam] branch master updated: Blog post for Hop web in Google Cloud (#23652)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 948d9e2d558 Blog post for Hop web in Google Cloud (#23652) 948d9e2d558 is described below commit 948d9e2d558f13a71cf5dc17ea463d1d45fb5e5c Author: Israel Herraiz AuthorDate: Sun Oct 16 22:15:44 2022 +0200 Blog post for Hop web in Google Cloud (#23652) --- website/www/site/content/en/blog/hop-web-cloud.md | 304 + .../blog/hop-web-cloud/hop-web-cloud-image1.png| Bin 0 -> 3617 bytes .../blog/hop-web-cloud/hop-web-cloud-image2.png| Bin 0 -> 35106 bytes .../blog/hop-web-cloud/hop-web-cloud-image3.png| Bin 0 -> 35049 bytes .../blog/hop-web-cloud/hop-web-cloud-image4.png| Bin 0 -> 17595 bytes .../blog/hop-web-cloud/hop-web-cloud-image5.png| Bin 0 -> 27628 bytes .../blog/hop-web-cloud/hop-web-cloud-image6.png| Bin 0 -> 32022 bytes 7 files changed, 304 insertions(+) diff --git a/website/www/site/content/en/blog/hop-web-cloud.md b/website/www/site/content/en/blog/hop-web-cloud.md new file mode 100644 index 000..34e1aabff78 --- /dev/null +++ b/website/www/site/content/en/blog/hop-web-cloud.md @@ -0,0 +1,304 @@ +--- +title: "Apache Hop web version with Cloud Dataflow" +date: 2022-10-15 00:00:01 -0800 +categories: + - blog +aliases: + - /blog/2022/10/15/hop-web-cloud.html +authors: + - iht +--- + + +Hop is a codeless visual development environment for Apache Beam pipelines that +can run jobs in any Beam runner, such as Dataflow, Flink or Spark. [In a +previous post](https://beam.apache.org/blog/apache-hop-with-dataflow/), we +introduced the desktop version of Apache Hop. Hop also has a web environment, +Hop Web, that you can run from a container, so you don't have to install +anything on your computer to use it. 
+ +In this detailed tutorial, you access Hop through the internet using a web +browser and point to a container running in a virtual machine on Google +Cloud. That container will launch jobs in Dataflow and report back the results +of those jobs. Because we don't want just anyone to access your Hop instance, +we’re going to secure it so that only you can access that virtual machine. The +following diagram illustrates the setup: + +![Architecture deployed with this tutorial](/images/blog/hop-web-cloud/hop-web-cloud-image2.png) + +We will show how to do the deployment described previously, creating a web and +visual development environment that builds Beam pipelines using just a web +browser. When complete, you will have a secure web environment that you can use +to create pipelines with your web browser and launch them using Google Cloud +Dataflow. + +## What do you need to run this example? + +We are using Google Cloud, so the first thing you need is a Google Cloud +project. If needed, you can sign up for the free trial of Google Cloud at +[https://cloud.google.com/free](https://cloud.google.com/free). + +When you have a project, you can use [Cloud +Shell](https://cloud.google.com/shell) in your web browser with no additional +setup. In Cloud Shell, the Google Cloud SDK is automatically configured for your +project and credentials. That's the option we use here. Alternatively, you can +configure the Google Cloud SDK in your local computer. For instructions, see +[https://cloud.google.com/sdk/docs/install](https://cloud.google.com/sdk/docs/install). + +To open Cloud Shell, go to the [Google Cloud console] +(http://console.cloud.google.com), make sure your project is selected, and click +the Cloud Shell button ![Cloud Shell +button](/images/blog/hop-web-cloud/hop-web-cloud-image1.png). Cloud Shell opens, +and you can use it to run the commands shown in this post. 
+ +The commands that we are going to use in the next steps are [available in a Gist +in Github](https://gist.github.com/iht/6219b227424ada477462c7b9d9d93c57), just +in case you prefer to run that script instead of copying the commands from this +tutorial. + +## Permissions and accounts + +When we run a Dataflow pipeline, we can use our personal Google Cloud +credentials to run the job. But Hop web will be running in a virtual machine, +and in Google Cloud, virtual machines run using service accounts as +credentials. So we need to make sure that we have a service account that has +permission to run Dataflow jobs. + +By default, virtual machines use the service account called _Compute Engine +default service account_. For the sake of simplicity, we will use this +account. Still, we need to add some permissions to run Dataflow jobs with that +service account. + +First, let's make sure that you have enabled all the required Google Cloud +APIs. [Click this link to enable Dataflow, BigQuery and +Pub/Sub](https://console.cloud.google.com/flows/enableapi?apiid=dataflow,compute_component,loggi
svn commit: r57297 - /release/beam/extensions/jupyterlab-sidepanel/v3.0.0/
Author: altay Date: Tue Oct 11 00:20:31 2022 New Revision: 57297 Log: add jupyterlab sidepanel v3.0.0 Added: release/beam/extensions/jupyterlab-sidepanel/v3.0.0/ release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip.asc release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip.sha512 Added: release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip == (empty) Added: release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip.asc == --- release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip.asc (added) +++ release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip.asc Tue Oct 11 00:20:31 2022 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIzBAABCAAdFiEE0wZJ4JwAISBINnlHlHKpxPia5PEFAmM8o48ACgkQlHKpxPia +5PFleQ//ZhKiPR3DFcHRB+aXPMCplJd3/bh4F9Y8tKvaUrf8UW1KJQdfwT6yumgt +fB3vNwpJxmaGYpcZhunorHy2wJkmubzRPntQXbbf7AJzmheV4ZgK19oxtNtVdQZh +hxF+tdXq2/jU0Z4Rij2gtu6EEs2/qEuEkd3fFT66BUNI1jdVLkuSTWsqQETS1rR4 +UDdlFywiUHDqZKneysAvK+0v2+FvagNWL912DBAhkDe1wufY12FD7JN/4J0inI4a +H8vHFV8f+A4x7vVCX9PeMafTI18eEHDC4jc2WCogWZer1v7b6s19sXZrM7ECJ8BA +Mebzdb31cHDjsSwp6j5e5uOKYjWM9BMJnXrqorooJ0kOcruaPOa17vIElxiCBQBj +ZJR2aGae66RdQ59UJ4xDtDIiSVWIpL8z1NPVVeF4kcU053kCpl8m6d7tTnKHhbSp +iw1UqdigxXsCiIYs5ameW2rXgG/Vsi9N878KO7/s0ohkkqb/ubdSdvb6LF36g7K+ +/iCwrqXi1eOw8034fcwozzMHHIX77AZf1cAu7jhhMmOl5L1fioJmn3WCCSl+dRU5 +mMZc0cEL2SqySF7zSZuyQhBzlQufo0r7GUJEZmevBFksHYN6yneFIOUK2FbY8yQX +/PFhXpke3FLm/bxzqOii3l/13GHSV4Jl6EkBkoRbMBOJyvrQ7Us= +=PhGX +-END PGP SIGNATURE- Added: 
release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip.sha512 == --- release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip.sha512 (added) +++ release/beam/extensions/jupyterlab-sidepanel/v3.0.0/apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip.sha512 Tue Oct 11 00:20:31 2022 @@ -0,0 +1 @@ +cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e apache-beam-jupyterlab-sidepanel-v3.0.0-source-release.zip
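The `.sha512` file added in this revision records a hex digest followed by the archive name. As a minimal sketch of how such a digest could be recomputed and compared, the following uses only the JDK's `java.security.MessageDigest`. The class `Sha512Check` and its helper `sha512Hex` are hypothetical names chosen for illustration and are not part of any Beam release tooling. Worth noting when reading the commit: the recorded digest appears to coincide with the SHA-512 of zero-length input, which would be consistent with the archive being listed as `(empty)`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of Beam's release scripts.
final class Sha512Check {

  /** Returns the lowercase hex SHA-512 digest of the given bytes. */
  static String sha512Hex(byte[] data) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] digest = md.digest(data);
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        // %02x renders each byte as two lowercase hex digits.
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // Treat a JDK without SHA-512 as a fatal configuration problem.
      throw new IllegalStateException("SHA-512 is not available", e);
    }
  }

  public static void main(String[] args) throws IOException {
    // With one argument, hash that file; with none, hash zero-length input.
    byte[] data = args.length == 1 ? Files.readAllBytes(Paths.get(args[0])) : new byte[0];
    System.out.println(sha512Hex(data));
  }
}
```

Running the class against a downloaded archive and comparing the printed value with the first field of the `.sha512` file is one way to check the artifact; the `.asc` signature would still need to be verified separately with the keys from `release/beam/KEYS`.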
[beam] branch master updated: fix: only report backlog bytes on data records (#23493)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 0448f2ea3dc fix: only report backlog bytes on data records (#23493) 0448f2ea3dc is described below commit 0448f2ea3dc67e25dfb63bc7c0032ac2fe0f57a8 Author: Thiago Nunes AuthorDate: Thu Oct 6 10:36:41 2022 +1100 fix: only report backlog bytes on data records (#23493) --- .../action/DataChangeRecordAction.java | 12 +- .../action/QueryChangeStreamAction.java| 10 + .../action/DataChangeRecordActionTest.java | 12 +- .../action/QueryChangeStreamActionTest.java| 45 ++ .../dofn/ReadChangeStreamPartitionDoFnTest.java| 2 +- 5 files changed, 61 insertions(+), 20 deletions(-) diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/action/DataChangeRecordAction.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/action/DataChangeRecordAction.java index 2446a6c9316..f806c7fcb74 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/action/DataChangeRecordAction.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/action/DataChangeRecordAction.java @@ -22,12 +22,14 @@ import java.util.Optional; import org.apache.beam.sdk.io.gcp.spanner.changestreams.model.ChildPartitionsRecord; import org.apache.beam.sdk.io.gcp.spanner.changestreams.model.DataChangeRecord; import org.apache.beam.sdk.io.gcp.spanner.changestreams.model.PartitionMetadata; +import org.apache.beam.sdk.io.gcp.spanner.changestreams.restriction.ThroughputEstimator; import org.apache.beam.sdk.io.gcp.spanner.changestreams.restriction.TimestampRange; import org.apache.beam.sdk.transforms.DoFn.OutputReceiver; import org.apache.beam.sdk.transforms.DoFn.ProcessContinuation; 
import org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator; import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker; import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Utf8; import org.joda.time.Instant; import org.slf4j.Logger; import org.slf4j.LoggerFactory; @@ -66,6 +68,8 @@ public class DataChangeRecordAction { * org.apache.beam.sdk.io.gcp.spanner.changestreams.dofn.ReadChangeStreamPartitionDoFn} SDF * @param watermarkEstimator the watermark estimator of the {@link * org.apache.beam.sdk.io.gcp.spanner.changestreams.dofn.ReadChangeStreamPartitionDoFn} SDF + * @param throughputEstimator an estimator to calculate local throughput of the {@link + * org.apache.beam.sdk.io.gcp.spanner.changestreams.dofn.ReadChangeStreamPartitionDoFn}. * @return {@link Optional#empty()} if the caller can continue processing more records. A non * empty {@link Optional} with {@link ProcessContinuation#stop()} if this function was unable * to claim the {@link ChildPartitionsRecord} timestamp @@ -76,7 +80,8 @@ public class DataChangeRecordAction { DataChangeRecord record, RestrictionTracker tracker, OutputReceiver outputReceiver, - ManualWatermarkEstimator watermarkEstimator) { + ManualWatermarkEstimator watermarkEstimator, + ThroughputEstimator throughputEstimator) { final String token = partition.getPartitionToken(); LOG.debug("[" + token + "] Processing data record " + record.getCommitTimestamp()); @@ -91,6 +96,11 @@ public class DataChangeRecordAction { outputReceiver.outputWithTimestamp(record, commitInstant); watermarkEstimator.setWatermark(commitInstant); +// The size of a record is represented by the number of bytes needed for the +// string representation of the record. Here, we only try to achieve an estimate +// instead of an accurate throughput. 
+throughputEstimator.update(Timestamp.now(), Utf8.encodedLength(record.toString())); + LOG.debug("[" + token + "] Data record action completed successfully"); return Optional.empty(); } diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/action/QueryChangeStreamAction.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/action/QueryChangeStreamAction.java index 6afb57c6e94..4265d1356ab 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/action/QueryChangeStreamAction.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/action/QueryChangeStr
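The change above sizes each data record by the UTF-8 length of its string form and feeds that number, with a timestamp, into a throughput estimator. The estimator's internals are not shown in this email, so the following is only a sketch under stated assumptions: `SlidingWindowThroughputSketch`, its fixed time window, and the plain `long` millisecond timestamps are all hypothetical, and `String.getBytes(UTF_8).length` stands in for Guava's `Utf8.encodedLength`, which the commit actually calls.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: NOT Beam's ThroughputEstimator. Names and windowing are assumptions.
final class SlidingWindowThroughputSketch {

  /** One observation: when it happened and how many bytes it accounted for. */
  private static final class Sample {
    final long timeMillis;
    final long bytes;

    Sample(long timeMillis, long bytes) {
      this.timeMillis = timeMillis;
      this.bytes = bytes;
    }
  }

  private final long windowMillis;
  private final Deque<Sample> samples = new ArrayDeque<>();
  private long bytesInWindow = 0;

  SlidingWindowThroughputSketch(long windowMillis) {
    if (windowMillis <= 0) {
      throw new IllegalArgumentException("windowMillis must be positive");
    }
    this.windowMillis = windowMillis;
  }

  /** Approximates a record's size as the UTF-8 byte length of its string form. */
  static long estimateSizeBytes(Object record) {
    return String.valueOf(record).getBytes(StandardCharsets.UTF_8).length;
  }

  /** Records that {@code bytes} were processed at {@code nowMillis}. */
  void update(long nowMillis, long bytes) {
    samples.addLast(new Sample(nowMillis, bytes));
    bytesInWindow += bytes;
    evictExpired(nowMillis);
  }

  /** Returns the average bytes per second over the window ending at {@code nowMillis}. */
  double bytesPerSecond(long nowMillis) {
    evictExpired(nowMillis);
    return bytesInWindow * 1000.0 / windowMillis;
  }

  /** Drops samples that are at least one full window older than {@code nowMillis}. */
  private void evictExpired(long nowMillis) {
    while (!samples.isEmpty() && samples.peekFirst().timeMillis <= nowMillis - windowMillis) {
      bytesInWindow -= samples.removeFirst().bytes;
    }
  }

  public static void main(String[] args) {
    SlidingWindowThroughputSketch estimator = new SlidingWindowThroughputSketch(10_000);
    long now = System.currentTimeMillis();
    estimator.update(now, estimateSizeBytes("example record"));
    System.out.println(estimator.bytesPerSecond(now));
  }
}
```

As the commit's own comment says, the goal is an estimate rather than an exact figure: the string form of a record is a proxy for its serialized size, and calling `update` only from the data-record path keeps heartbeat and child-partition records out of the reported backlog bytes.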
svn commit: r56136 - /release/beam/KEYS
Author: altay Date: Fri Aug 5 20:07:54 2022 New Revision: 56136 Log: Add key for kiley...@apache.org Modified: release/beam/KEYS Modified: release/beam/KEYS == --- release/beam/KEYS (original) +++ release/beam/KEYS Fri Aug 5 20:07:54 2022 @@ -2724,3 +2724,158 @@ IrtKOCebhVFW1R+GGXCLM7W5jii5I3VcnfImDgQz POn3aXrCY8RBzzQ= =mgKW -END PGP PUBLIC KEY BLOCK- +pub rsa4096 2021-01-12 [SC] + E498A4DEB65779F2E2635D60F7AF878F6FCA5BB5 +uid [ultimate] Kiley +sig 3F7AF878F6FCA5BB5 2021-01-12 Kiley +sub rsa4096 2021-01-12 [E] +sig F7AF878F6FCA5BB5 2021-01-12 Kiley + +pub rsa3072 2022-01-18 [SC] + 225103D6FCAB1CF2E8EF600BF1090C9DAA8BB357 +uid [ultimate] kiley sok +sig 3F1090C9DAA8BB357 2022-01-18 kiley sok +sub rsa3072 2022-01-18 [E] +sig F1090C9DAA8BB357 2022-01-18 kiley sok + +pub rsa4096 2022-08-05 [SC] + 4D5731CC0AA38097D091EB091E7B28884452AE5D +uid [ultimate] Kiley Sok +sig 31E7B28884452AE5D 2022-08-05 Kiley Sok +sub rsa4096 2022-08-05 [E] +sig 1E7B28884452AE5D 2022-08-05 Kiley Sok + +-BEGIN PGP PUBLIC KEY BLOCK- + +mQINBF/+KJYBEADaEEB/n6SRKmglehzbIy5miJq6Nq+mHSBMXp3JZfn6lKLUrULx +czZHRfKRZZD5a3v4mpK5JZg6ylFykWu7MAn051ELEB09VddbBA96DZu7yGbn1yxi +bN4Dx+zNl5KeK3kkhTlbAMQPEZU2FRIv1maVccv/3DTNxDXKO09PbdGvk7PJzPz2 +w4QQ31iG1atQtAarCbZVRn0aiSiaAos0Mhi9SQK+wqitR9uRBh+tU631kWhjbqe8 +Mb1SNQTpkzdXNtaINRZzJDhGRFlMhz9GDtzuctZS2hAeHmMHia56mZ++LjBIsoUP +3HExRL6S03nnalPv8BXxZdkUNfoqauPO++rAq0+q7kVznkM6b4D2olD12KkA4ahn +6g+Lga7DAfZCeVc3a5HezZ0wsijwnPcfpIWcEhVMrLD8EVXU3mjTGZCt9l8SFVvt +mIku8DiqKu7eIz/WFVuaHN0WhaVOSpJsfq/aomMD461X5ti5+Vx68U8LW86AWfgR +8TEmGAZlDDzZULbsqzHnHC/h7EEkcPvpfP//souqpic9gjBZjae5U2LhG+fy/7tN +JWxfd/SbnjOAvmUxYGYXBF1TgLfkIdW8vkPsI6n3Q2ir66qHyy8EK4YV1O2a3s5i +VQgwKQYX+lv24GFNV0McfnDEuTz9P2NpJXJJwmGbR+Gv6mzRcCReXN6w0wARAQAB +tBtLaWxleSA8a2lsZXlzb2tAZ29vZ2xlLmNvbT6JAk4EEwEKADgWIQTkmKTetld5 +8uJjXWD3r4ePb8pbtQUCX/4olgIbAwULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAK +CRD3r4ePb8pbtfvED/9ogOx5DJYRlL987rXEoxGeDdfC642WfQZjmlF5MLH5neu3 
+Vw7IvHblLMDlplPrp0je6o+kTkvYDvsfW8bnUMepAQ/MFutpb8P+7E6qCKImcn0k +f4ggpgRqSIEJLTw1+J1wLumvpf6Zohd5EfUZ7C6F2UN+tEKtLB2o6CliW5lJn/lM +dBm/8fu65NlQTw916+udFpDbZTH7a1rzZ2WfFPxeUI6lYb0XN4Jz5dToi1y431fr +VI/9x1TY1h0oL+rzsma7A52YW37ysSjRBbnQ+eCAMb/qL/znsyDZz3aASPkpnRMn +G9jfY16l6EjYgRkd78U7NkoHZtiybWXcE3Tdg325EfSWIeAas8zvesVsuWaJ8y+e +MIm4IdvWcvaUtH5wj0JeWOEQJd6DRgD0LWn6H6qih07BnaICpE0eKNSiIyoAmuv+ +ZbnHiCI0PloCK+c5YHqw2hZreQed5aF/5SMbbZDlvDIAZIQVI/gCuHqg7oGJhyIH +bZmveOREv1Wne9G9eXF5Lg1BEjip1kDDN/NNDTre6Mq3eWx67GcNFWygUPLkd5kU +AAaweGuWIIpJYrpXLumtsP7ANim5Hi/87OQ3rweCWTXL66eMoVNCG/08cky8Y1XZ +w3/zBz3811UP0Oss+NBxFKWz1Y56mCBgHnXogaJ2iEqqIh9fR0qnMrGO+X9WiLkC +DQRf/iiWARAAvEkgp0de4+vu7Wmoiw+UY767P2RF/qchB4KkKa6eb3/mcx2/VIvl +ADrzTi6segr9FUYx1mOhKZGe+A9Bu7fhLsXslPUlpKqISHg+V/vo995PZih934jg +fanhaRy23Ygkdpx+aA5YkQcs16zb2HdoCpO9cmozJJU7q3P4EXL3LRxn3ChJE5m8 +Rva7tGrMhPI7v7vut/YIjFAiW+7y3N6hqJDpmDOCqJ0Bx2iLGKfr/YxSVplHe+Fw +gkZe5B05mPA39MOE0qovKeTrZ4Zxo1jxCqfbJGloYJS+wSOAjI5uupFTZSAxJqwZ +2pig2dBF6t16xIs9JCKEWb47fNtBxwf4o5VyE+d4K1WkNRSnH/vfWn1GQRYKupB5 +ql/374yP3EENxJzLE8t7X4909eZYfTMXtqmWKp2G2yvU+2AVwTmNwpxEde0uyHl/ +B/ueYtG0wMwzzd+yxICd8nENiuZwfG9UbyeBFWEeT6v8uyuCtyWvxhRxXPpqIuKM +rdQMnM2y+gMOLWi7WBdnhSTP3Ass5XuFaHeAl3HN40OYRPDFEBhr6B8DzE+kxfzT +RfgAM/0NCdWtCD5a18bO+H+q21WMr1PQxBeNt+VFxTLjTBe40xxrdHzppXbB0ByA ++mmhS0qOv9jhJaWSayh9aeTDkHBIkbzj1rDgWJf6CzYqZOOo91ApIgsAEQEAAYkC +NgQYAQoAIBYhBOSYpN62V3ny4mNdYPevh49vylu1BQJf/iiWAhsMAAoJEPevh49v +ylu1SiMQAKFfMPsnnXtJjjZfYP4UZSUy0g/rkEY6w6gZHspOADblZfN74TQjrhzW +QgWzFGpBSK3B/CkKelYsujmXgJvcVCXzxaqO04zQAxKpqNv0ciztx6wTpN0vx9xA +fFANGuiYYlMnRbg6PBHXJ49HFtmlZqYdf5z6CXLJPK/w/gF97dHkEP2sS7UoB2BQ +UHWUAMpvikvx1RcqWvE+lOtxu1DcCqqU7fRqLqo10SmMXIfwTiZMeFAfMIr/dP3D +r8kbwuTxz2sV2QeaJGAQPP1Tn+YqKgrEKICw/q+vWKJNkR5bWFXEoG1DG6ryVpJx +l+IBQNttDDqAMPbmGO07BUVJF4FZfAvlGECmodlMPr5Htj7QUB2JaYgROmd6bCCj +gcoQe9EbJpIfb4ZkCsvemQwk4VpssCpaBd+c80nYC2eiEwkohAIULhAJK2FCDGVQ +xzp6skuYpO0BLKjEUKMYWl522tG/EbUiKW/VUC41HXbyiH51BqzWRwzpLKRUcyvR 
+YXfXWDsYVG4klWl7sGYGIvzL5tnVV56tPcqgc17eQ1oDNkV81bHIKQHgbO0nxGDj +7JKIQTmDef1SAGjD5R9RqWyAUGNcKK22YWf7f1nT49GQs5ZkIDRqgEMB4Xh/qC7f +eUzXtCt/h0+QzBiRlEcK1JxKp6yBvy0msU8v2Hk41XGgf62UJWmwmQGNBGHnAH8B +DADjc6LrsF1LcXMiWe/2syWEVXQ+MWLsmli3j2l8zq7+YgPHSwEmUDR3lklnt7uE +bbdUIN+497drQ63amfgPR0fhcgLq0swXIDKYZ625qpHrgyWLylzEA8R1Ed8f3kBW +G1k/KToIaVUaa/2Bg05/ypGk+lOdEy310TtT+XIkUgpCQ6FHg3cyKLetjji9XGfz +z0p197o6ptr8uduinn+6nLInco9bvBeptR3B+YenX7YEJkcFO++ts4bHz4F0uFgf +a99/8aiLKTlZj7q7nMy+Z/vHokY7UnTZKlLgwglv3kM+K5HuXbtUrJlrFMf4QKSx +HQ/QEjDKqR9Gnp1NrtmQ3wXUrCNaw0mrd1Nvu821qf3koB5pBd/IbixOip29LBu3 +8ngCcigDHZHO0L+zxFuqIb6GuCTL8nM0ldtDgr+vTbCZrQ7tYregRN6Y1bC/LTzP +EX12TPe/9qYEBg48DkI8KCShZhVI15mIFYZR2jNZA3CBcyX1ltXS4z4JvzM8oBCH +3JMAEQEAAbQea2lsZXkgc29rIDxraWxleXNva0BnbWFpbC5jb20+iQHOBBMBCgA4
[beam] branch master updated: convert windmill min timestamp to beam min timestamp (#21915)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 3ad20d2acdc convert windmill min timestamp to beam min timestamp (#21915) 3ad20d2acdc is described below commit 3ad20d2acdc43582bc307157f29bc015292590d7 Author: Naireen Hussain AuthorDate: Tue Jul 26 16:07:00 2022 -0700 convert windmill min timestamp to beam min timestamp (#21915) * convert windmill min timestamp to beam min timestamp * convert windmill min timestamp to beam min timestamp Co-authored-by: Naireen Hussain --- .../runners/dataflow/worker/WindmillTimeUtils.java | 3 +++ .../dataflow/worker/WindmillTimeUtilsTest.java | 10 +++ .../worker/WindmillTimerInternalsTest.java | 31 +++--- 3 files changed, 40 insertions(+), 4 deletions(-) diff --git a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java index 8552c27d596..9732826bdd6 100644 --- a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java +++ b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java @@ -45,6 +45,9 @@ public class WindmillTimeUtils { // Windmill should never send us an unknown timestamp. Preconditions.checkArgument(timestampUs != Long.MIN_VALUE); Instant result = new Instant(divideAndRoundDown(timestampUs, 1000)); +if (result.isBefore(BoundedWindow.TIMESTAMP_MIN_VALUE)) { + return BoundedWindow.TIMESTAMP_MIN_VALUE; +} if (result.isAfter(BoundedWindow.TIMESTAMP_MAX_VALUE)) { // End of time. 
return BoundedWindow.TIMESTAMP_MAX_VALUE; diff --git a/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtilsTest.java b/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtilsTest.java index 5f910c3acb5..84e76d2f8bd 100644 --- a/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtilsTest.java +++ b/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtilsTest.java @@ -23,6 +23,7 @@ import static org.apache.beam.runners.dataflow.worker.WindmillTimeUtils.windmill import static org.junit.Assert.assertEquals; import org.apache.beam.sdk.transforms.windowing.BoundedWindow; +import org.joda.time.Duration; import org.joda.time.Instant; import org.junit.Test; import org.junit.runner.RunWith; @@ -56,6 +57,15 @@ public class WindmillTimeUtilsTest { assertEquals(new Instant(-17), windmillToHarnessTimestamp(-16987)); assertEquals(new Instant(-17), windmillToHarnessTimestamp(-17000)); assertEquals(new Instant(-18), windmillToHarnessTimestamp(-17001)); +assertEquals(BoundedWindow.TIMESTAMP_MIN_VALUE, windmillToHarnessTimestamp(Long.MIN_VALUE + 1)); +assertEquals(BoundedWindow.TIMESTAMP_MIN_VALUE, windmillToHarnessTimestamp(Long.MIN_VALUE + 2)); +// Long.MIN_VALUE = -9223372036854775808, need to add 1808 microseconds to get to next +// millisecond returned by Beam. 
+assertEquals( +BoundedWindow.TIMESTAMP_MIN_VALUE.plus(Duration.millis(1)), +windmillToHarnessTimestamp(Long.MIN_VALUE + 1808)); +assertEquals( +BoundedWindow.TIMESTAMP_MIN_VALUE, windmillToHarnessTimestamp(Long.MIN_VALUE + 1807)); } @Test diff --git a/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimerInternalsTest.java b/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimerInternalsTest.java index 2d222b534c7..8632034a9b2 100644 --- a/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimerInternalsTest.java +++ b/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimerInternalsTest.java @@ -88,12 +88,22 @@ public class WindmillTimerInternalsTest { TimerData.of( namespace, timestamp, timestamp.minus(Duration.millis(1)), timeDomain)); for (TimerData timer : anonymousTimers) { -assertThat( +Instant expectedTimestamp = + timer.getOutputTimestamp().isBefore(BoundedWindow.TIMESTAMP_MIN_VALUE) +? BoundedWindow.TIMESTAMP_MIN_VALUE +: timer.getOutputTimestamp(); +TimerData computed
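The clamping behaviour exercised by the tests in this commit can be sketched outside Java. The following is a minimal illustration, not the worker code itself: the function and constant names are hypothetical, and it assumes that Beam's minimum and maximum timestamps are the `Long` microsecond bounds truncated toward zero to milliseconds, which is what the 1807/1808 test values imply. Python's `//` rounds toward negative infinity, so it stands in for `divideAndRoundDown`.

```python
# Hypothetical stand-ins for Java's Long bounds (microseconds).
LONG_MIN = -(2**63)
LONG_MAX = 2**63 - 1

# Assumed Beam bounds in milliseconds: the Long bounds truncated toward zero.
TIMESTAMP_MIN_MS = -(2**63 // 1000)      # -9223372036854775
TIMESTAMP_MAX_MS = (2**63 - 1) // 1000   #  9223372036854775


def windmill_to_harness_timestamp(timestamp_us: int) -> int:
    """Convert microseconds to milliseconds, clamped to the assumed Beam range."""
    if timestamp_us == LONG_MIN:
        # Mirrors the precondition in the diff: LONG_MIN means "unknown".
        raise ValueError("Windmill should never send an unknown timestamp")
    # Floor division rounds down, so -17001 us becomes -18 ms, not -17 ms.
    result_ms = timestamp_us // 1000
    # Clamp on both sides; the lower clamp is what this commit adds.
    return max(TIMESTAMP_MIN_MS, min(result_ms, TIMESTAMP_MAX_MS))
```

Because rounding down can produce a millisecond value one below the minimum, inputs from `LONG_MIN + 1` through `LONG_MIN + 1807` all land on the minimum after clamping, and `LONG_MIN + 1808` is the first input that yields the next millisecond.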
[beam-starter-java] branch main updated: Fix gradle command
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/beam-starter-java.git

The following commit(s) were added to refs/heads/main by this push:
     new f582596 Fix gradle command
     new 7830be0 Merge pull request #16 from davidcavazos/patch-1
f582596 is described below

commit f582596c708143de4c2f6e313f4f786f6c55fa67
Author: David Cavazos
AuthorDate: Mon Jul 25 14:13:36 2022 -0700

    Fix gradle command
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2aed355..083abf4 100644
--- a/README.md
+++ b/README.md
@@ -48,7 +48,7 @@ A basic Gradle setup consists of a [`build.gradle`](build.gradle) file written i
 gradle run

 # To run passing command line arguments.
-gradle run -Pargs=--inputText="🎉"
+gradle run --args=--inputText="🎉"

 # To run the tests.
 gradle test --info
[beam] branch master updated (deb72620e02 -> 0fbecde63b3)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

    from deb72620e02 Merge pull request #22442 from chamikaramj/runinference_kv_support
     add 0fbecde63b3 Enable configuration to avoid successfully written Table Row propagation as part of WriteResult for StreamingInserts (#21813)

No new revisions were added by this update.

Summary of changes:
 .../sdk/io/gcp/bigquery/BatchedStreamingWrite.java | 40 +-
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java       | 19 +-
 .../beam/sdk/io/gcp/bigquery/StreamingInserts.java | 34 ++
 .../sdk/io/gcp/bigquery/StreamingWriteTables.java  | 37 ++--
 .../beam/sdk/io/gcp/bigquery/WriteResult.java      | 6 ++--
 .../sdk/io/gcp/bigquery/BigQueryIOWriteTest.java   | 14
 6 files changed, 137 insertions(+), 13 deletions(-)
[beam-starter-python] branch main updated: enable dependabot
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/beam-starter-python.git

The following commit(s) were added to refs/heads/main by this push:
     new 09b3af8 enable dependabot
     new 3a26289 Merge pull request #3 from davidcavazos/dependabot
09b3af8 is described below

commit 09b3af8a9012f0ef88f17f1ce9d37da14bf9f9e0
Author: David Cavazos
AuthorDate: Wed Jul 13 13:08:06 2022 -0700

    enable dependabot
---
 .github/dependabot.yml | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/.github/dependabot.yml b/.github/dependabot.yml
new file mode 100644
index 000..6285813
--- /dev/null
+++ b/.github/dependabot.yml
@@ -0,0 +1,19 @@
+# Copyright 2022 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
+# https://www.apache.org/licenses/LICENSE-2.0> or the MIT license
+# <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your
+# option. This file may not be copied, modified, or distributed
+# except according to those terms.
+
+version: 2
+updates:
+  - package-ecosystem: "pip"
+    directory: "/"
+    schedule:
+      interval: "daily"
+
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "daily"
[beam-starter-java] branch main updated: enable dependabot
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/beam-starter-java.git

The following commit(s) were added to refs/heads/main by this push:
     new 5007c98 enable dependabot
     new 0241905 Merge pull request #5 from davidcavazos/dependabot
5007c98 is described below

commit 5007c98221af0e749d5e0b25b777278a4b91fcb9
Author: David Cavazos
AuthorDate: Wed Jul 13 13:06:06 2022 -0700

    enable dependabot
---
 .github/dependabot.yml | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/.github/dependabot.yml b/.github/dependabot.yml
new file mode 100644
index 000..077482b
--- /dev/null
+++ b/.github/dependabot.yml
@@ -0,0 +1,26 @@
+# Copyright 2022 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
+# https://www.apache.org/licenses/LICENSE-2.0> or the MIT license
+# <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your
+# option. This file may not be copied, modified, or distributed
+# except according to those terms.
+
+version: 2
+updates:
+  - package-ecosystem: "gradle"
+    directory: "/"
+    schedule:
+      interval: "daily"
+
+  - package-ecosystem: "maven"
+    directory: "/"
+    schedule:
+      interval: "daily"
+
+  # SBT is not yet supported: https://github.com/dependabot/dependabot-core/issues/352
+
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "daily"
[beam] branch master updated: Correcting the regex for the Dataflow job name.
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new ed5002db70a Correcting the regex for the Dataflow job name.
     new 78ec29eb903 Merge pull request #21932 from rszper/rszper-jobNameRegex
ed5002db70a is described below

commit ed5002db70a83f810fdcbab500dde14707925ed5
Author: Rebecca Szper
AuthorDate: Fri Jun 17 12:19:15 2022 -0700

    Correcting the regex for the Dataflow job name.
---
 .../core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java | 2 +-
 .../dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py        | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java
index 8a5e07b18ff..0cefaba81e1 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java
@@ -284,7 +284,7 @@ public interface PipelineOptions extends HasDisplayData {

   @Description(
       "Name of the pipeline execution."
-          + "It must match the regular expression '[a-z]([-a-z0-9]{0,38}[a-z0-9])?'."
+          + "It must match the regular expression '[a-z]([-a-z0-9]{0,1022}[a-z0-9])?'."
           + "It defaults to ApplicationName-UserName-Date-RandomInteger")
   @Default.InstanceFactory(JobNameFactory.class)
   String getJobName();
diff --git a/sdks/python/apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py b/sdks/python/apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
index e2e8ae879a3..21c3018596e 100644
--- a/sdks/python/apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
+++ b/sdks/python/apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
@@ -2748,7 +2748,7 @@ class Job(_messages.Message):
       given name may exist in a project at any given time. If a caller
       attempts to create a Job with the same name as an already-existing Job,
       the attempt returns the existing Job. The name must match the regular
-      expression `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
+      expression `[a-z]([-a-z0-9]{0,1022}[a-z0-9])?`
     pipelineDescription: Preliminary field: The format of this data may change
       at any time. A description of the user pipeline and stages through which
       it is executed. Created by Cloud Dataflow service. Only retrieved with
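As a rough illustration of what the corrected pattern accepts, the sketch below checks candidate names against the regular expression quoted in the diff. The helper name `is_valid_job_name` is hypothetical and is not part of Beam or the Dataflow client; only the pattern string comes from the commit.

```python
import re

# Pattern copied from the updated documentation string in the diff.
JOB_NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,1022}[a-z0-9])?")


def is_valid_job_name(name: str) -> bool:
    """Return True if the whole name matches the documented pattern."""
    # fullmatch requires the entire string to match, not just a prefix.
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```

Read literally, the pattern allows one leading lowercase letter, up to 1022 middle characters, and one trailing letter or digit, so names of up to 1024 characters pass; the previous `{0,38}` bound capped that at 40. Whether the service enforces exactly this limit is not stated in the commit, which only changes documentation strings.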
[beam] branch master updated: fix: Add a retry code to insertall retry policy
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new c7485a0a92d fix: Add a retry code to insertall retry policy
     new 2574f9106c7 Merge pull request #17772 from yirutang/insertall
c7485a0a92d is described below

commit c7485a0a92deb45a858a14037d3aad1dc1d9e471
Author: yirutang
AuthorDate: Mon May 23 15:56:19 2022 -0700

    fix: Add a retry code to insertall retry policy
---
 .../java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.java     | 2 +-
 .../java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicyTest.java | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.java
index 24b48d7dc24..33f846870fc 100644
--- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.java
+++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.java
@@ -46,7 +46,7 @@ public abstract class InsertRetryPolicy implements Serializable {

   // A list of known persistent errors for which retrying never helps.
   static final Set<String> PERSISTENT_ERRORS =
-      ImmutableSet.of("invalid", "invalidQuery", "notImplemented", "row-too-large");
+      ImmutableSet.of("invalid", "invalidQuery", "notImplemented", "row-too-large", "parseError");

   /** Return true if this failure should be retried. */
   public abstract boolean shouldRetry(Context context);
diff --git a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicyTest.java b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicyTest.java
index d2a1cd25405..164d7d6fd82 100644
--- a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicyTest.java
+++ b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicyTest.java
@@ -57,6 +57,8 @@ public class InsertRetryPolicyTest {
         policy.shouldRetry(new Context(generateErrorAmongMany(5, "timeout", "invalidQuery"))));
     assertFalse(
         policy.shouldRetry(new Context(generateErrorAmongMany(5, "timeout", "notImplemented"))));
+    assertFalse(
+        policy.shouldRetry(new Context(generateErrorAmongMany(5, "timeout", "parseError"))));
   }

   static class RetryAllExceptInvalidQuery extends InsertRetryPolicy {
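The effect of adding `parseError` to the persistent-error set can be sketched as follows. This is an illustrative model only: `should_retry` is a hypothetical function, and it assumes a policy that refuses to retry as soon as any reported error reason is in the persistent set, which is the behaviour the updated test asserts (one `parseError` among several `timeout` reasons yields no retry).

```python
# Error reasons taken from the PERSISTENT_ERRORS set in the diff.
PERSISTENT_ERRORS = frozenset(
    {"invalid", "invalidQuery", "notImplemented", "row-too-large", "parseError"}
)


def should_retry(error_reasons):
    """Retry only if none of the reported reasons is known to be persistent."""
    # Assumed policy: a single persistent error makes the whole failed
    # insert non-retryable, even if the other reasons are transient.
    return not any(reason in PERSISTENT_ERRORS for reason in error_reasons)
```

For example, `should_retry(["timeout"] * 4 + ["parseError"])` returns `False`, while a list containing only `"timeout"` reasons returns `True`.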
[beam] branch master updated: Modified KafkaIO.Read SDF->Legacy forced override to fail if configured functionality would be lost (#16888)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 09129e122ae Modified KafkaIO.Read SDF->Legacy forced override to fail if configured functionality would be lost (#16888) 09129e122ae is described below commit 09129e122ae6b426f79dbb576ef5f8f9a4c0db79 Author: Balázs Németh AuthorDate: Tue Jun 21 01:08:38 2022 +0200 Modified KafkaIO.Read SDF->Legacy forced override to fail if configured functionality would be lost (#16888) --- .../java/org/apache/beam/sdk/io/kafka/KafkaIO.java | 21 - .../KafkaIOReadImplementationCompatibility.java| 27 ++ 2 files changed, 38 insertions(+), 10 deletions(-) diff --git a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java index 4f67eaa4be3..4263b918817 100644 --- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java +++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java @@ -57,6 +57,7 @@ import org.apache.beam.sdk.io.Read.Unbounded; import org.apache.beam.sdk.io.UnboundedSource; import org.apache.beam.sdk.io.UnboundedSource.CheckpointMark; import org.apache.beam.sdk.io.kafka.KafkaIOReadImplementationCompatibility.KafkaIOReadImplementation; +import org.apache.beam.sdk.io.kafka.KafkaIOReadImplementationCompatibility.KafkaIOReadImplementationCompatibilityException; import org.apache.beam.sdk.io.kafka.KafkaIOReadImplementationCompatibility.KafkaIOReadImplementationCompatibilityResult; import org.apache.beam.sdk.options.ExperimentalOptions; import org.apache.beam.sdk.options.PipelineOptions; @@ -1383,12 +1384,20 @@ public class KafkaIO { public PTransformReplacement>> getReplacementTransform( AppliedPTransform>, ReadFromKafkaViaSDF> transform) { -return PTransformReplacement.of( 
-transform.getPipeline().begin(), -new ReadFromKafkaViaUnbounded<>( -transform.getTransform().kafkaRead, -transform.getTransform().keyCoder, -transform.getTransform().valueCoder)); +try { + return PTransformReplacement.of( + transform.getPipeline().begin(), + new ReadFromKafkaViaUnbounded<>( + transform.getTransform().kafkaRead, + transform.getTransform().keyCoder, + transform.getTransform().valueCoder)); +} catch (KafkaIOReadImplementationCompatibilityException e) { + throw new IllegalStateException( + "The current runner does not support SDF-based Kafka read properly " + + "and the replacement runner lacks the support for the following properties: " + + e.getConflictingProperties() + + ". For example if you are using Dataflow then consider using Dataflow Runner v2."); +} } @Override diff --git a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIOReadImplementationCompatibility.java b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIOReadImplementationCompatibility.java index 9599398c0bd..1db28d2a3e6 100644 --- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIOReadImplementationCompatibility.java +++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIOReadImplementationCompatibility.java @@ -19,10 +19,10 @@ package org.apache.beam.sdk.io.kafka; import static org.apache.beam.sdk.io.kafka.KafkaIOReadImplementationCompatibility.KafkaIOReadImplementation.LEGACY; import static org.apache.beam.sdk.io.kafka.KafkaIOReadImplementationCompatibility.KafkaIOReadImplementation.SDF; -import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState; import java.lang.reflect.Method; import java.util.Arrays; +import java.util.Collection; import java.util.EnumSet; import java.util.Objects; import javax.annotation.Nonnull; @@ -32,6 +32,7 @@ import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.Visi import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.CaseFormat; import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.HashMultimap; import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableSet; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableSortedSet; import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Multimap; import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Sets; import org.checkerframework.checker.initialization.qual.UnderInitialization; @@ -23
[beam-starter-java] branch main updated: actually test the pipeline (#4)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/beam-starter-java.git The following commit(s) were added to refs/heads/main by this push: new bd89083 actually test the pipeline (#4) bd89083 is described below commit bd8908339e4cb9a5632aa6b44364b52e4ef762f8 Author: David Cavazos AuthorDate: Fri Jun 17 09:45:25 2022 -0700 actually test the pipeline (#4) --- .github/workflows/test.yaml| 2 +- build.gradle | 9 + build.sbt | 7 --- pom.xml| 24 src/main/java/com/example/App.java | 24 +--- src/test/java/com/example/AppTest.java | 20 6 files changed, 55 insertions(+), 31 deletions(-) diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml index b93db98..98e851c 100644 --- a/.github/workflows/test.yaml +++ b/.github/workflows/test.yaml @@ -8,7 +8,7 @@ name: Test -on: push +on: [push, pull_request] jobs: Gradle: diff --git a/build.gradle b/build.gradle index a8bd176..2f368ec 100644 --- a/build.gradle +++ b/build.gradle @@ -19,11 +19,11 @@ application { } test { -// JUnit 5 testing integration. -useJUnitPlatform() +// JUnit 4. +useJUnit() } -def beamVersion = '2.33.0' +def beamVersion = '2.39.0' dependencies { // App dependencies. implementation "org.apache.beam:beam-sdks-java-core:${beamVersion}" @@ -31,7 +31,8 @@ dependencies { implementation "org.slf4j:slf4j-jdk14:1.7.32" // Tests dependencies. -testImplementation "org.junit.jupiter:junit-jupiter:5.8.1" +testImplementation "junit:junit:4.13.2" +testImplementation 'org.hamcrest:hamcrest:2.2' } // Package a self-contained jar file. diff --git a/build.sbt b/build.sbt index 5e94af5..357158b 100644 --- a/build.sbt +++ b/build.sbt @@ -8,7 +8,7 @@ mainClass := Some("com.example.App") -val beamVersion = "2.33.0" +val beamVersion = "2.39.0" libraryDependencies ++= Seq( // App dependencies. 
"org.apache.beam" % "beam-sdks-java-core" % beamVersion, @@ -16,8 +16,9 @@ libraryDependencies ++= Seq( "org.slf4j" % "slf4j-jdk14" % "1.7.32", // Test dependencies. - "net.aichler" % "jupiter-interface" % JupiterKeys.jupiterVersion.value % Test, - "org.junit.jupiter" % "junit-jupiter" % "5.8.1" % Test + "junit" % "junit" % "4.13.2" % Test, + "com.novocode" % "junit-interface" % "0.11" % Test, + "org.hamcrest" % "hamcrest" % "2.2" % Test ) // Package self-contained jar file. diff --git a/pom.xml b/pom.xml index c0e15fe..31a1b3f 100644 --- a/pom.xml +++ b/pom.xml @@ -21,8 +21,8 @@ 11 UTF-8 -2.33.0 -5.8.1 +2.39.0 +4.13.2 @@ -44,18 +44,11 @@ - + org.apache.maven.plugins maven-surefire-plugin 3.0.0-M5 - - -org.junit.jupiter -junit-jupiter-engine -${junit.version} - - @@ -108,10 +101,17 @@ - org.junit.jupiter - junit-jupiter + junit + junit ${junit.version} test + + + org.hamcrest + hamcrest + 2.2 + test + \ No newline at end of file diff --git a/src/main/java/com/example/App.java b/src/main/java/com/example/App.java index 20e7afe..f5f8a14 100644 --- a/src/main/java/com/example/App.java +++ b/src/main/java/com/example/App.java @@ -8,13 +8,18 @@ package com.example; +import java.util.Arrays; +import java.util.List; + import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.coders.StringUtf8Coder; import org.apache.beam.sdk.options.Default; import org.apache.beam.sdk.options.Description; import org.apache.beam.sdk.options.PipelineOptionsFactory; import org.apache.beam.sdk.options.StreamingOptions; import org.apache.beam.sdk.transforms.Create; import org.apache.beam.sdk.transforms.MapElements; +import org.apache.beam.sdk.values.PCollection; import org.apache.beam.sdk.values.TypeDescriptors; public class App { @@ -26,15 +31,20 @@ public class App { void setInputText(String value); } + public static PCollection buildPipeline(Pipeline pipeline, String inputText) { + return pipeline + .apply("Create elements", Create.of(Arrays.asList("Hello", "World!", inputText))) + 
.apply("Print elements", + Map
[beam-starter-python] branch main updated: test on pull request
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/beam-starter-python.git

The following commit(s) were added to refs/heads/main by this push:
     new 543117d test on pull request
     new 8ba6cb3 Merge pull request #2 from davidcavazos/test-on-pr
543117d is described below

commit 543117d6ab0b1050e714773380c89564001dfdcd
Author: David Cavazos
AuthorDate: Thu Jun 16 14:00:13 2022 -0700

    test on pull request
---
 .github/workflows/test.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index ca7484a..be3d2a7 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -8,7 +8,7 @@

 name: Test

-on: [push]
+on: [push, pull_request]

 jobs:
   tests:
[beam] branch master updated: convert windmill min timestamp to beam min timestamp
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 955177b8c38 convert windmill min timestamp to beam min timestamp
     new 21049601776 Merge pull request #21740 from Naireen/weird_timestamps
955177b8c38 is described below

commit 955177b8c38868e3dffad9ab4d7ee31dda06cc92
Author: Naireen Hussain
AuthorDate: Wed Jun 8 00:01:14 2022 +

    convert windmill min timestamp to beam min timestamp
---
 .../apache/beam/runners/dataflow/worker/WindmillTimeUtils.java | 3 +++
 .../beam/runners/dataflow/worker/WindmillTimeUtilsTest.java    | 10 ++
 2 files changed, 13 insertions(+)

diff --git a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java
index 8552c27d596..9732826bdd6 100644
--- a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java
+++ b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java
@@ -45,6 +45,9 @@ public class WindmillTimeUtils {
     // Windmill should never send us an unknown timestamp.
     Preconditions.checkArgument(timestampUs != Long.MIN_VALUE);
     Instant result = new Instant(divideAndRoundDown(timestampUs, 1000));
+    if (result.isBefore(BoundedWindow.TIMESTAMP_MIN_VALUE)) {
+      return BoundedWindow.TIMESTAMP_MIN_VALUE;
+    }
     if (result.isAfter(BoundedWindow.TIMESTAMP_MAX_VALUE)) {
       // End of time.
       return BoundedWindow.TIMESTAMP_MAX_VALUE;
diff --git a/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtilsTest.java b/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtilsTest.java
index 5f910c3acb5..84e76d2f8bd 100644
--- a/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtilsTest.java
+++ b/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtilsTest.java
@@ -23,6 +23,7 @@ import static org.apache.beam.runners.dataflow.worker.WindmillTimeUtils.windmill
 import static org.junit.Assert.assertEquals;

 import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.joda.time.Duration;
 import org.joda.time.Instant;
 import org.junit.Test;
 import org.junit.runner.RunWith;
@@ -56,6 +57,15 @@ public class WindmillTimeUtilsTest {
     assertEquals(new Instant(-17), windmillToHarnessTimestamp(-16987));
     assertEquals(new Instant(-17), windmillToHarnessTimestamp(-17000));
     assertEquals(new Instant(-18), windmillToHarnessTimestamp(-17001));
+    assertEquals(BoundedWindow.TIMESTAMP_MIN_VALUE, windmillToHarnessTimestamp(Long.MIN_VALUE + 1));
+    assertEquals(BoundedWindow.TIMESTAMP_MIN_VALUE, windmillToHarnessTimestamp(Long.MIN_VALUE + 2));
+    // Long.MIN_VALUE = -9223372036854775808, need to add 1808 microseconds to get to next
+    // millisecond returned by Beam.
+    assertEquals(
+        BoundedWindow.TIMESTAMP_MIN_VALUE.plus(Duration.millis(1)),
+        windmillToHarnessTimestamp(Long.MIN_VALUE + 1808));
+    assertEquals(
+        BoundedWindow.TIMESTAMP_MIN_VALUE, windmillToHarnessTimestamp(Long.MIN_VALUE + 1807));
   }

   @Test
[beam] branch master updated (986fb40250c -> edf9b7906cd)
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

    from 986fb40250c Provide a diagnostic error message when a filesystem scheme is not found. (#21816)
     add b890da3519c Bump cloud.google.com/go/pubsub from 1.21.1 to 1.22.2 in /sdks
     add edf9b7906cd Merge pull request #21716 from apache/dependabot/go_modules/sdks/cloud.google.com/go/pubsub-1.22.2

No new revisions were added by this update.

Summary of changes:
 sdks/go.mod | 4 ++--
 sdks/go.sum | 16 +++-
 2 files changed, 13 insertions(+), 7 deletions(-)
[beam-starter-java] branch main updated: move pull request template file
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/beam-starter-java.git

The following commit(s) were added to refs/heads/main by this push:
     new 7b4753a move pull request template file
     new 0a0d0ee Merge pull request #2 from davidcavazos/move-file
7b4753a is described below

commit 7b4753a812df3c78ec67ed425039437dd822623b
Author: David Cavazos
AuthorDate: Thu Jun 9 11:13:06 2022 -0700

    move pull request template file
---
 .github/{workflows => }/PULL_REQUEST_TEMPLATE.md | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/.github/workflows/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
similarity index 100%
rename from .github/workflows/PULL_REQUEST_TEMPLATE.md
rename to .github/PULL_REQUEST_TEMPLATE.md
[beam] branch master updated: [BEAM-12554] Create new instances of FileSink in sink_fn (#17708)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 77b813a6e6e [BEAM-12554] Create new instances of FileSink in sink_fn (#17708) 77b813a6e6e is described below commit 77b813a6e6e37d5557d57fb1d5e71353bfce6ddc Author: Yi Hu AuthorDate: Tue Jun 7 12:10:39 2022 -0400 [BEAM-12554] Create new instances of FileSink in sink_fn (#17708) * [BEAM-12554] Create new instances of FileSink in sink_fn * add unit test for WriteToFiles dynamic destination * add test to both type signature and type instance as sink param --- sdks/python/apache_beam/io/fileio.py | 12 ++- sdks/python/apache_beam/io/fileio_test.py | 36 +++ 2 files changed, 43 insertions(+), 5 deletions(-) diff --git a/sdks/python/apache_beam/io/fileio.py b/sdks/python/apache_beam/io/fileio.py index 6b88600a97b..d1839e9de0f 100644 --- a/sdks/python/apache_beam/io/fileio.py +++ b/sdks/python/apache_beam/io/fileio.py @@ -504,7 +504,8 @@ class WriteToFiles(beam.PTransform): given their final names. By default, the temporary directory will be within the temp_location of your pipeline. sink (callable, FileSink): The sink to use to write into a file. It should -implement the methods of a ``FileSink``. If none is provided, a +implement the methods of a ``FileSink``. Pass a class signature or an +instance of FileSink to this parameter. If none is provided, a ``TextSink`` is used. shards (int): The number of shards per destination and trigger firing. max_writers_per_bundle (int): The number of writers that can be open @@ -525,8 +526,11 @@ class WriteToFiles(beam.PTransform): @staticmethod def _get_sink_fn(input_sink): # type: (...) 
-> Callable[[Any], FileSink] -if isinstance(input_sink, FileSink): - return lambda x: input_sink +if isinstance(input_sink, type) and issubclass(input_sink, FileSink): + return lambda x: input_sink() +elif isinstance(input_sink, FileSink): + kls = input_sink.__class__ + return lambda x: kls() elif callable(input_sink): return input_sink else: @@ -791,7 +795,6 @@ class _WriteUnshardedRecordsFn(beam.DoFn): def _get_or_create_writer_and_sink(self, destination, window): """Returns a tuple of writer, sink.""" writer_key = (destination, window) - if writer_key in self._writers_and_sinks: return self._writers_and_sinks.get(writer_key) elif len(self._writers_and_sinks) >= self.max_num_writers_per_bundle: @@ -807,7 +810,6 @@ class _WriteUnshardedRecordsFn(beam.DoFn): create_metadata_fn=sink.create_metadata) sink.open(writer) - self._writers_and_sinks[writer_key] = (writer, sink) self._file_names[writer_key] = full_file_name return self._writers_and_sinks[writer_key] diff --git a/sdks/python/apache_beam/io/fileio_test.py b/sdks/python/apache_beam/io/fileio_test.py index f21fb8d1796..ab4dba2366c 100644 --- a/sdks/python/apache_beam/io/fileio_test.py +++ b/sdks/python/apache_beam/io/fileio_test.py @@ -459,6 +459,42 @@ class WriteFilesTest(_TestCaseWithTempDirCleanUp): assert_that(result, equal_to([row for row in self.SIMPLE_COLLECTION])) + def test_write_to_dynamic_destination(self): + +sink_params = [ +fileio.TextSink, # pass a type signature +fileio.TextSink() # pass a FileSink object +] + +for sink in sink_params: + dir = self._new_tempdir() + + with TestPipeline() as p: +_ = ( +p +| "Create" >> beam.Create(range(100)) +| beam.Map(lambda x: str(x)) +| fileio.WriteToFiles( +path=dir, +destination=lambda n: "odd" if int(n) % 2 else "even", +sink=sink, +file_naming=fileio.destination_prefix_naming("test"))) + + with TestPipeline() as p: +result = ( +p +| fileio.MatchFiles(FileSystems.join(dir, '*')) +| fileio.ReadMatches() +| beam.Map( +lambda f: ( 
+os.path.basename(f.metadata.path).split('-')[0], +sorted(map(int, f.read_utf8().strip().split('\n')) + +assert_that( +result, +equal_to([('odd', list(range(1, 100, 2))), + ('even', list(range(0, 100, 2)))])) + def test_write_to_different_file_types_some_spilling(self): dir = self._new_tempdir()
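Editorial note (an illustrative sketch, not part of the commit): the change to `_get_sink_fn` above makes each destination get its own sink object instead of sharing the single instance passed in. The snippet below mimics that dispatch with simplified stand-in classes. `FileSink`, `TextSink`, and `get_sink_fn` here are hypothetical substitutes written for this note, not the real `apache_beam.io.fileio` classes, and the final fallback branch is an assumption based on the docstring's statement that a `TextSink` is used when no sink is provided.

```python
class FileSink:
    """Simplified stand-in for fileio.FileSink (hypothetical)."""

    def __init__(self):
        self.records = []

    def write(self, record):
        self.records.append(record)


class TextSink(FileSink):
    """Simplified stand-in for fileio.TextSink (hypothetical)."""


def get_sink_fn(input_sink):
    """Return a callable mapping a destination to a fresh sink object."""
    if isinstance(input_sink, type) and issubclass(input_sink, FileSink):
        # A class was passed: instantiate it once per call.
        return lambda destination: input_sink()
    if isinstance(input_sink, FileSink):
        # An instance was passed: build new objects from its class
        # rather than handing the same instance to every destination.
        kls = input_sink.__class__
        return lambda destination: kls()
    if callable(input_sink):
        # An arbitrary factory was passed: use it unchanged.
        return input_sink
    # Assumption: default to a text sink when nothing usable is given.
    return lambda destination: TextSink()


sink_fn = get_sink_fn(TextSink())
odd_sink = sink_fn("odd")
even_sink = sink_fn("even")
odd_sink.write("1")  # only the "odd" sink sees this record
```

One consequence visible in the diff is that an instance is rebuilt by calling its class with no arguments, so a sink that needs constructor arguments would presumably have to be supplied as a callable instead.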
[beam] branch master updated (4d22202fde9 -> 31114e893ce)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from 4d22202fde9 [BEAM-14255] Drop clock abstraction (#17671) add 31114e893ce Adds __repr__ to NullableCoder (#17757) No new revisions were added by this update. Summary of changes: sdks/python/apache_beam/coders/coders.py | 3 +++ 1 file changed, 3 insertions(+)
[beam] branch master updated: [BEAM-14170] - Create a test that runs sickbayed tests (#17471)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 9a6f7699b5d [BEAM-14170] - Create a test that runs sickbayed tests (#17471) 9a6f7699b5d is described below commit 9a6f7699b5d8daf846221d522d3702c5a4c7b562 Author: Fernando Morales <80284146+fernando-wizel...@users.noreply.github.com> AuthorDate: Sun May 29 14:06:52 2022 -0600 [BEAM-14170] - Create a test that runs sickbayed tests (#17471) --- .../jenkins/job_PostCommit_Java_Sickbay.groovy | 43 build.gradle.kts | 8 +++ runners/direct-java/build.gradle | 38 ++- runners/flink/flink_runner.gradle | 35 -- runners/portability/java/build.gradle | 29 +++- runners/samza/build.gradle | 78 ++ runners/spark/job-server/spark_job_server.gradle | 37 -- .../sdk/io/BoundedReadFromUnboundedSourceTest.java | 2 - .../org/apache/beam/sdk/transforms/WatchTest.java | 2 - 9 files changed, 225 insertions(+), 47 deletions(-) diff --git a/.test-infra/jenkins/job_PostCommit_Java_Sickbay.groovy b/.test-infra/jenkins/job_PostCommit_Java_Sickbay.groovy new file mode 100644 index 000..6d2a97fb4f9 --- /dev/null +++ b/.test-infra/jenkins/job_PostCommit_Java_Sickbay.groovy @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import CommonJobProperties as commonJobProperties +import PostcommitJobBuilder + +// This job runs the Java sickbay tests. +PostcommitJobBuilder.postCommitJob('beam_PostCommit_Java_Sickbay', +'Run Java Sickbay', 'Java Sickbay Tests', this) { + + description('Run Java Sickbay Tests') + + // Set common parameters. + commonJobProperties.setTopLevelMainJobProperties(delegate, 'master', 120) + + publishers { +archiveJunit('**/build/test-results/**/*.xml') + } + + // Execute shell command to run sickbay tests. + steps { +gradle { + rootBuildScriptDir(commonJobProperties.checkoutDir) + tasks(':javaPostCommitSickbay') + commonJobProperties.setGradleSwitches(delegate) +} + } +} diff --git a/build.gradle.kts b/build.gradle.kts index ef0dc431bc8..fd8d8714177 100644 --- a/build.gradle.kts +++ b/build.gradle.kts @@ -207,6 +207,14 @@ tasks.register("javaPostCommit") { dependsOn(":sdks:java:io:neo4j:integrationTest") } +tasks.register("javaPostCommitSickbay") { + dependsOn(":runners:samza:validatesRunnerSickbay") + dependsOn(":runners:flink:validatesRunnerSickbay") + dependsOn(":runners:spark:validatesRunnerSickbay") + dependsOn(":runners:direct-java:validatesRunnerSickbay") + dependsOn(":runners:portability:java:validatesRunnerSickbay") +} + tasks.register("javaHadoopVersionsTest") { dependsOn(":sdks:java:io:hadoop-common:hadoopVersionsTest") dependsOn(":sdks:java:io:hadoop-file-system:hadoopVersionsTest") diff --git a/runners/direct-java/build.gradle b/runners/direct-java/build.gradle index 3e6c97945a5..995d1f2fd1f 100644 --- 
a/runners/direct-java/build.gradle +++ b/runners/direct-java/build.gradle @@ -115,6 +115,15 @@ static def pipelineOptionsStringCrossPlatformHandling(String[] options) { } } +def sickbayTests = [ +// https://issues.apache.org/jira/browse/BEAM-2791 +'org.apache.beam.sdk.testing.UsesLoopingTimer', +// https://issues.apache.org/jira/browse/BEAM-8035 + 'org.apache.beam.sdk.transforms.WatchTest.testMultiplePollsWithManyResults', +// https://issues.apache.org/jira/browse/BEAM-6354 + 'org.apache.beam.sdk.io.BoundedReadFromUnboundedSourceTest.testTimeBound', +] + task needsRunnerTests(type: Test) { group = "Verification" description = "Runs tests that require a runner to valid
[beam] branch master updated: cleaned up TypeScript in coders.ts (#17689)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 0fb68863779 cleaned up TypeScript in coders.ts (#17689) 0fb68863779 is described below commit 0fb68863779bb6cf082cd91331159e5743bb17d6 Author: David Huntsperger <5672572+pc...@users.noreply.github.com> AuthorDate: Fri May 27 19:22:26 2022 -0700 cleaned up TypeScript in coders.ts (#17689) * cleaned up TypeScript in coders.ts * run prettier again * exporting HackedWriter interface * fixed type and linting issues left after resolving merge conflict --- sdks/typescript/src/apache_beam/coders/coders.ts | 43 1 file changed, 29 insertions(+), 14 deletions(-) diff --git a/sdks/typescript/src/apache_beam/coders/coders.ts b/sdks/typescript/src/apache_beam/coders/coders.ts index 3a9547fff34..d00bf9fb6b9 100644 --- a/sdks/typescript/src/apache_beam/coders/coders.ts +++ b/sdks/typescript/src/apache_beam/coders/coders.ts @@ -24,7 +24,7 @@ export interface ProtoContext { } interface Class { - new (...args: any[]): T; + new (...args: unknown[]): T; } /** @@ -37,7 +37,8 @@ interface Class { * for the key and the coder for the value as parameters). */ class CoderRegistry { - internal_registry = {}; + internal_registry: Record Coder> = +{}; getCoder( urn: string, @@ -46,6 +47,7 @@ class CoderRegistry { ) { const constructor: (...args) => Coder = this.internal_registry[urn]; + if (constructor === undefined) { throw new Error("Could not find coder for URN " + urn); } @@ -58,15 +60,18 @@ class CoderRegistry { // TODO: Figure out how to branch on constructors (called with new) and // ordinary functions. 
- register(urn: string, coderClass: Class>) { + register(urn: string, coderClass: Class>) { this.registerClass(urn, coderClass); } - registerClass(urn: string, coderClass: Class>) { + registerClass(urn: string, coderClass: Class>) { this.registerConstructor(urn, (...args) => new coderClass(...args)); } - registerConstructor(urn: string, constructor: (...args) => Coder) { + registerConstructor( +urn: string, +constructor: (...args: unknown[]) => Coder + ) { this.internal_registry[urn] = constructor; } } @@ -134,20 +139,31 @@ export interface Coder { toProto(pipelineContext: ProtoContext): runnerApi.Coder; } -function writeByteCallback(val, buf, pos) { +function writeByteCallback( + val: number, + buf: { [x: string]: number }, + pos: number +) { buf[pos] = val & 0xff; } +export interface HackedWriter extends Writer { + _push?(...args: unknown[]); +} + /** * Write a single byte, as an unsigned integer, directly to the writer. */ -export function writeRawByte(b, writer: Writer) { - var hackedWriter = writer; - hackedWriter._push(writeByteCallback, 1, b); +export function writeRawByte(b: unknown, writer: HackedWriter) { + writer._push?.(writeByteCallback, 1, b); } -function writeBytesCallback(val, buf, pos) { - for (var i = 0; i < val.length; ++i) { +function writeBytesCallback( + val: number[], + buf: { [x: string]: number }, + pos: number +) { + for (let i = 0; i < val.length; ++i) { buf[pos + i] = val[i]; } } @@ -156,7 +172,6 @@ function writeBytesCallback(val, buf, pos) { * Writes a sequence of bytes, as unsigned integers, directly to the writer, * without a prefixing with the length of the bytes that writer.bytes() does. */ -export function writeRawBytes(value: Uint8Array, writer: Writer) { - var hackedWriter = writer; - hackedWriter._push(writeBytesCallback, value.length, value); +export function writeRawBytes(value: Uint8Array, writer: HackedWriter) { + writer._push?.(writeBytesCallback, value.length, value); }
[beam] branch master updated: minor: don't capture stderr in kata tests
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 54b6b461462 minor: don't capture stderr in kata tests
     new db4d2cb31bb Merge pull request #17639 from iasoon/kata-no-stderr
54b6b461462 is described below

commit 54b6b461462e07f423e68223e3014c0f3ce3db37
Author: Ilion Beyst
AuthorDate: Thu May 12 17:45:36 2022 +0200

    minor: don't capture stderr in kata tests
---
 learning/katas/python/test_helper.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/learning/katas/python/test_helper.py b/learning/katas/python/test_helper.py
index fe547429541..3864a2eb454 100644
--- a/learning/katas/python/test_helper.py
+++ b/learning/katas/python/test_helper.py
@@ -34,8 +34,7 @@ def get_file_output(encoding="utf-8", path=sys.argv[-1], arg_string=""):
     """
     import subprocess
 
-    proc = subprocess.Popen([sys.executable, path], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
-                            stderr=subprocess.STDOUT)
+    proc = subprocess.Popen([sys.executable, path], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
     if arg_string:
         for arg in arg_string.split("\n"):
             proc.stdin.write(bytearray(str(arg) + "\n", encoding))
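Editorial note (an illustrative sketch, not part of the commit): dropping `stderr=subprocess.STDOUT` means diagnostic messages written to stderr no longer end up in the captured output that the kata tests inspect. The snippet below shows the difference using a small child script. `CHILD_CODE` and `get_output_lines` are names invented for this note, and the child's messages are made up.

```python
import subprocess
import sys

# Hypothetical child program: one line on stderr, one on stdout.
CHILD_CODE = (
    "import sys\n"
    "print('WARNING: noisy runtime message', file=sys.stderr)\n"
    "print('Hello Beam')\n"
)


def get_output_lines(merge_stderr):
    """Run the child and return the lines read from its stdout pipe."""
    # None lets the child inherit the parent's stderr (the new behaviour);
    # subprocess.STDOUT redirects stderr into the stdout pipe (the old one).
    stderr_target = subprocess.STDOUT if merge_stderr else None
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        stderr=stderr_target)
    out, _ = proc.communicate()
    return out.decode("utf-8").splitlines()


merged = get_output_lines(merge_stderr=True)     # contains both lines
separate = get_output_lines(merge_stderr=False)  # contains only stdout
```

With the streams merged, the relative order of the two lines depends on buffering, so a check that expects exactly one output line is fragile. With stderr left alone, only the program's real output is captured.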
[beam] branch master updated: [BEAM-14006] Update Python katas to 2.38 and fix issue with one test
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 3761d1f6b56 [BEAM-14006] Update Python katas to 2.38 and fix issue with one test new 57f37052067 Merge pull request #17634 from iht/update_python_katas 3761d1f6b56 is described below commit 3761d1f6b56f3e72e1d5b0b38722be04990f14ef Author: Israel Herraiz AuthorDate: Thu May 12 15:03:56 2022 +0200 [BEAM-14006] Update Python katas to 2.38 and fix issue with one test --- learning/katas/python/Examples/section-remote-info.yaml | 2 +- learning/katas/python/Introduction/Hello Beam/Hello Beam/tests.py | 3 +++ .../Triggers/Early Triggers/Early Triggers/task-remote-info.yaml| 2 ++ .../katas/python/Triggers/Early Triggers/lesson-remote-info.yaml| 3 +++ .../Event Time Triggers/Event Time Triggers/task-remote-info.yaml | 2 ++ .../python/Triggers/Event Time Triggers/lesson-remote-info.yaml | 3 +++ .../Window Accumulation Mode/task-remote-info.yaml | 2 ++ .../Triggers/Window Accumulation Mode/lesson-remote-info.yaml | 3 +++ learning/katas/python/Triggers/section-remote-info.yaml | 2 ++ learning/katas/python/course-info.yaml | 2 +- learning/katas/python/requirements.txt | 6 +++--- 11 files changed, 25 insertions(+), 5 deletions(-) diff --git a/learning/katas/python/Examples/section-remote-info.yaml b/learning/katas/python/Examples/section-remote-info.yaml index 908fdcde408..121445ed683 100644 --- a/learning/katas/python/Examples/section-remote-info.yaml +++ b/learning/katas/python/Examples/section-remote-info.yaml @@ -1,2 +1,2 @@ id: 85647 -update_date: Mon, 09 Mar 2020 14:34:14 UTC +update_date: Thu, 12 May 2022 13:06:24 UTC diff --git a/learning/katas/python/Introduction/Hello Beam/Hello Beam/tests.py b/learning/katas/python/Introduction/Hello Beam/Hello Beam/tests.py index d0e90986785..5888f4b475d 100644 --- 
a/learning/katas/python/Introduction/Hello Beam/Hello Beam/tests.py +++ b/learning/katas/python/Introduction/Hello Beam/Hello Beam/tests.py @@ -20,6 +20,9 @@ from test_helper import failed, passed, get_file_output, test_is_not_empty def test_output(): output = get_file_output() +# Remove warning line about docker and Python versions +output = [x for x in output if not x.startswith("WARNING")] + if len(output) == 1 and 'Hello Beam' in output: passed() else: diff --git a/learning/katas/python/Triggers/Early Triggers/Early Triggers/task-remote-info.yaml b/learning/katas/python/Triggers/Early Triggers/Early Triggers/task-remote-info.yaml new file mode 100644 index 000..9eb0d3c6e7a --- /dev/null +++ b/learning/katas/python/Triggers/Early Triggers/Early Triggers/task-remote-info.yaml @@ -0,0 +1,2 @@ +id: 1315717712 +update_date: Thu, 01 Jan 1970 00:00:00 UTC diff --git a/learning/katas/python/Triggers/Early Triggers/lesson-remote-info.yaml b/learning/katas/python/Triggers/Early Triggers/lesson-remote-info.yaml new file mode 100644 index 000..44f72ca4f2e --- /dev/null +++ b/learning/katas/python/Triggers/Early Triggers/lesson-remote-info.yaml @@ -0,0 +1,3 @@ +id: 415852098 +update_date: Thu, 01 Jan 1970 00:00:00 UTC +unit: 0 diff --git a/learning/katas/python/Triggers/Event Time Triggers/Event Time Triggers/task-remote-info.yaml b/learning/katas/python/Triggers/Event Time Triggers/Event Time Triggers/task-remote-info.yaml new file mode 100644 index 000..5f00e885248 --- /dev/null +++ b/learning/katas/python/Triggers/Event Time Triggers/Event Time Triggers/task-remote-info.yaml @@ -0,0 +1,2 @@ +id: 825593025 +update_date: Thu, 01 Jan 1970 00:00:00 UTC diff --git a/learning/katas/python/Triggers/Event Time Triggers/lesson-remote-info.yaml b/learning/katas/python/Triggers/Event Time Triggers/lesson-remote-info.yaml new file mode 100644 index 000..0cd79c49350 --- /dev/null +++ b/learning/katas/python/Triggers/Event Time Triggers/lesson-remote-info.yaml @@ -0,0 +1,3 @@ +id: 
1858884960 +update_date: Thu, 01 Jan 1970 00:00:00 UTC +unit: 0 diff --git a/learning/katas/python/Triggers/Window Accumulation Mode/Window Accumulation Mode/task-remote-info.yaml b/learning/katas/python/Triggers/Window Accumulation Mode/Window Accumulation Mode/task-remote-info.yaml new file mode 100644 index 000..53ffc371231 --- /dev/null +++ b/learning/katas/python/Triggers/Window Accumulation Mode/Window Accumulation Mode/task-remote-info.yaml @@ -0,0 +1,2 @@ +id: 84386334 +update_date: Thu, 01 Jan 1970 00:00:00 UTC diff --git a/learning/katas/python/Triggers/Window Accumulation Mode/lesson-remote-info.yaml b/learning/katas/python/Triggers/Window Accumulation Mode/lesson-remote-info.yaml new file mode 100644 index
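Editorial note (an illustrative sketch, not part of the commit): the `Hello Beam` test change in this email filters out lines starting with `WARNING` before checking that exactly one line of output remains. The helper below restates that check in isolation. `is_expected_output` is a name invented for this note, and the sample warning text is made up.

```python
def is_expected_output(lines):
    """Return True if, ignoring WARNING lines, the only output is 'Hello Beam'."""
    # Drop warning lines such as those about Docker and Python versions.
    filtered = [x for x in lines if not x.startswith("WARNING")]
    return len(filtered) == 1 and "Hello Beam" in filtered


with_warning = ["WARNING: some runtime notice", "Hello Beam"]
without_warning = ["Hello Beam"]
wrong_output = ["Hello World"]
```

Without the filter, the first list would fail the length check even though the kata's real output is correct.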
[beam] branch master updated: BEAM-14419: Remove invalid mod type
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new bacacb08d24 BEAM-14419: Remove invalid mod type
     new b753f4422aa Merge pull request #17556 from thiagotnunes/remove-invalid-mod-type
bacacb08d24 is described below

commit bacacb08d249fded8bd03f02269f896f114c031c
Author: Thiago Nunes
AuthorDate: Thu May 5 15:15:04 2022 +1000

    BEAM-14419: Remove invalid mod type

    Removes invalid mod type from the DataChangeRecord. This is not returned
    from the Change Streams API and can be removed safely.
---
 .../apache/beam/sdk/io/gcp/spanner/changestreams/model/ModType.java | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/model/ModType.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/model/ModType.java
index 6d5fcf542aa..4929831fba0 100644
--- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/model/ModType.java
+++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/model/ModType.java
@@ -22,12 +22,11 @@ import org.apache.beam.sdk.coders.DefaultCoder;
 
 /**
  * Represents the type of modification applied in the {@link DataChangeRecord}. It can be one of the
- * following: INSERT, UPDATE, INSERT_OR_UPDATE or DELETE.
+ * following: INSERT, UPDATE or DELETE.
  */
 @DefaultCoder(AvroCoder.class)
 public enum ModType {
   INSERT,
   UPDATE,
-  INSERT_OR_UPDATE,
   DELETE
 }
[beam] branch master updated: [BEAM-14478] Fix missing 'projectId' attribute error
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new e028a737e80 [BEAM-14478] Fix missing 'projectId' attribute error
     new 647648fc964 Merge pull request #17688 from ihji/BEAM-14478
e028a737e80 is described below

commit e028a737e802812fc051ed1da1e33993366fafeb
Author: Heejong Lee
AuthorDate: Mon May 16 17:03:17 2022 -0700

    [BEAM-14478] Fix missing 'projectId' attribute error

    Fix 'RuntimeValueProvider' object has no attribute 'projectId' error in _CustomBigQuerySource.split
---
 sdks/python/apache_beam/io/gcp/bigquery.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/sdks/python/apache_beam/io/gcp/bigquery.py b/sdks/python/apache_beam/io/gcp/bigquery.py
index 4d7df85f905..d9809005734 100644
--- a/sdks/python/apache_beam/io/gcp/bigquery.py
+++ b/sdks/python/apache_beam/io/gcp/bigquery.py
@@ -834,7 +834,10 @@ class _CustomBigQuerySource(BoundedSource):
         self._setup_temporary_dataset(bq)
         self.table_reference = self._execute_query(bq)
 
-      if not self.table_reference.projectId:
+      if isinstance(self.table_reference, vp.ValueProvider):
+        self.table_reference = bigquery_tools.parse_table_reference(
+            self.table_reference.get(), project=self._get_project())
+      elif not self.table_reference.projectId:
         self.table_reference.projectId = self._get_project()
 
       schema, metadata_list = self._export_files(bq)
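Editorial note (an illustrative sketch, not part of the commit): the fix above handles the case where `table_reference` is still a deferred value provider rather than a parsed table reference, which is why reading `.projectId` on it failed. The snippet below reproduces the branching with simplified stand-ins. `ValueProvider`, `StaticValueProvider`, `TableReference`, `parse_table_reference`, and `resolve_table_reference` are all hypothetical substitutes written for this note, and the `project:dataset.table` spec format handled by the toy parser is an assumption.

```python
class ValueProvider:
    """Simplified stand-in for a deferred pipeline option value (hypothetical)."""

    def get(self):
        raise NotImplementedError


class StaticValueProvider(ValueProvider):
    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


class TableReference:
    """Simplified stand-in for a BigQuery table reference (hypothetical)."""

    def __init__(self, projectId=None, datasetId=None, tableId=None):
        self.projectId = projectId
        self.datasetId = datasetId
        self.tableId = tableId


def parse_table_reference(spec, project=None):
    """Toy parser for 'project:dataset.table' or 'dataset.table' (assumed format)."""
    if ":" in spec:
        project, spec = spec.split(":", 1)
    dataset, table = spec.split(".", 1)
    return TableReference(project, dataset, table)


def resolve_table_reference(table_reference, default_project):
    """Mirror the branching added to _CustomBigQuerySource.split."""
    if isinstance(table_reference, ValueProvider):
        # Deferred value: read it now and parse it into a real reference.
        return parse_table_reference(
            table_reference.get(), project=default_project)
    if not table_reference.projectId:
        # Already a reference, but without a project: fill in the default.
        table_reference.projectId = default_project
    return table_reference


deferred = resolve_table_reference(
    StaticValueProvider("my_dataset.my_table"), "default-project")
explicit = resolve_table_reference(
    StaticValueProvider("other-project:my_dataset.my_table"), "default-project")
plain = resolve_table_reference(
    TableReference(None, "my_dataset", "my_table"), "default-project")
```

The ordering matters: the provider check has to come first, because the provider object has no `projectId` attribute to test.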
[beam] branch master updated (5a3f40535f2 -> a37d324791b)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from 5a3f40535f2 [BEAM-14441] Automatically assign issue labels based on responses to template (#17661) add a37d324791b README update for the Docker Error 255 during Website launch on Apple Silicon (#17456) No new revisions were added by this update. Summary of changes: website/README.md | 6 ++ 1 file changed, 6 insertions(+)
[beam] branch master updated: Updates CHANGES.md to include some recently discovered known issues
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 6c18bcf5ed4 Updates CHANGES.md to include some recently discovered known issues new 4ac68204ea6 Merge pull request #17631 from chamikaramj/update_release_md 6c18bcf5ed4 is described below commit 6c18bcf5ed446a9aafb1f45fc45dbaa9c04e79e1 Author: Chamikara Jayalath AuthorDate: Wed May 11 13:33:08 2022 -0700 Updates CHANGES.md to include some recently discovered known issues --- CHANGES.md | 23 +++ 1 file changed, 23 insertions(+) diff --git a/CHANGES.md b/CHANGES.md index ae674ead6e5..63b46e74a14 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -163,6 +163,10 @@ * This caused unnecessarily long pre-processing times before job submission for large complex pipelines. * Fix `pyarrow` version parsing (Python)([BEAM-14235](https://issues.apache.org/jira/browse/BEAM-14235)) +## Known Issues + +* Some pipelines that use Java SpannerIO may raise a NPE when the project ID is not specified ([BEAM-14405](https://issues.apache.org/jira/browse/BEAM-14405)) + # [2.37.0] - 2022-03-04 ## Highlights @@ -195,6 +199,9 @@ ## Known Issues +* On rare occations, Python Datastore source may swallow some exceptions. Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https://issues.apache.org/jira/browse/BEAM-14282)) +* On rare occations, Python GCS source may swallow some exceptions. Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https://issues.apache.org/jira/browse/BEAM-14282)) + # [2.36.0] - 2022-02-07 ## I/Os @@ -224,6 +231,9 @@ * Users may encounter an unexpected java.lang.ArithmeticException when outputting a timestamp for an element further than allowedSkew from an allowed DoFN skew set to a value more than Integer.MAX_VALUE. +* On rare occations, Python Datastore source may swallow some exceptions. 
Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https://issues.apache.org/jira/browse/BEAM-14282)) +* On rare occations, Python GCS source may swallow some exceptions. Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https://issues.apache.org/jira/browse/BEAM-14282)) +* On rare occations, Java SpannerIO source may swallow some exceptions. Users are adviced to upgrade to Beam 2.37.0 or later ([BEAM-14005](https://issues.apache.org/jira/browse/BEAM-14005)) # [2.35.0] - 2021-12-29 @@ -277,6 +287,9 @@ ## Known Issues * Users of beam-sdks-java-io-hcatalog (and beam-sdks-java-extensions-sql-hcatalog) must take care to override the transitive log4j dependency when they add a hive dependency ([BEAM-13499](https://issues.apache.org/jira/browse/BEAM-13499)). +* On rare occations, Python Datastore source may swallow some exceptions. Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https://issues.apache.org/jira/browse/BEAM-14282)) +* On rare occations, Python GCS source may swallow some exceptions. Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https://issues.apache.org/jira/browse/BEAM-14282)) +* On rare occations, Java SpannerIO source may swallow some exceptions. Users are adviced to upgrade to Beam 2.37.0 or later ([BEAM-14005](https://issues.apache.org/jira/browse/BEAM-14005)) # [2.34.0] - 2021-11-11 @@ -313,6 +326,12 @@ * Fixed error when importing the DataFrame API with pandas 1.0.x installed ([BEAM-12945](https://issues.apache.org/jira/browse/BEAM-12945)). * Fixed top.SmallestPerKey implementation in the Go SDK ([BEAM-12946](https://issues.apache.org/jira/browse/BEAM-12946)). +## Known Issues + +* On rare occations, Python Datastore source may swallow some exceptions. Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https://issues.apache.org/jira/browse/BEAM-14282)) +* On rare occations, Python GCS source may swallow some exceptions. 
Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https://issues.apache.org/jira/browse/BEAM-14282)) +* On rare occations, Java SpannerIO source may swallow some exceptions. Users are adviced to upgrade to Beam 2.37.0 or later ([BEAM-14005](https://issues.apache.org/jira/browse/BEAM-14005)) + # [2.33.0] - 2021-10-07 ## Highlights @@ -361,6 +380,7 @@ * Spark 2.x users will need to update Spark's Jackson runtime dependencies (`spark.jackson.version`) to at least version 2.9.2, due to Beam updating its dependencies. * Go SDK jobs may produce "Failed to deduce Step from MonitoringInfo" messages following successful job execution. The messages are benign and don't indicate job failure. These are due to not yet handling PCollection metrics. +* On rare occations, Python GCS source may swallow some exceptions. Users are adviced to upgrade to Beam 2.38.0 or later ([BEAM-14282](https
[beam] branch master updated: Quote pip install package name
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 6e772d67e75 Quote pip install package name new f432136d32d Merge pull request #17450 from kynx/patch-1 6e772d67e75 is described below commit 6e772d67e75f421d368b5c689bc4dd1ec3bd5eea Author: kynx AuthorDate: Fri Apr 22 20:44:13 2022 +0100 Quote pip install package name macOS 12.3.1 Python 3.8.12 pip 22.0.4 Without the quotes zsh barfs with `zsh: no matches found: apache-beam[test]` --- website/www/site/content/en/get-started/quickstart-py.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/website/www/site/content/en/get-started/quickstart-py.md b/website/www/site/content/en/get-started/quickstart-py.md index 006a3f5d176..ee68fa7be5f 100644 --- a/website/www/site/content/en/get-started/quickstart-py.md +++ b/website/www/site/content/en/get-started/quickstart-py.md @@ -101,23 +101,23 @@ PS> python -m pip install apache-beam Extra requirements -The above installation will not install all the extra dependencies for using features like the Google Cloud Dataflow runner. Information on what extra packages are required for different features are highlighted below. It is possible to install multiple extra requirements using something like `pip install apache-beam[feature1,feature2]`. +The above installation will not install all the extra dependencies for using features like the Google Cloud Dataflow runner. Information on what extra packages are required for different features are highlighted below. It is possible to install multiple extra requirements using something like `pip install 'apache-beam[feature1,feature2]'`. 
- **Google Cloud Platform** - - Installation Command: `pip install apache-beam[gcp]` + - Installation Command: `pip install 'apache-beam[gcp]'` - Required for: - Google Cloud Dataflow Runner - GCS IO - Datastore IO - BigQuery IO - **Amazon Web Services** - - Installation Command: `pip install apache-beam[aws]` + - Installation Command: `pip install 'apache-beam[aws]'` - Required for I/O connectors interfacing with AWS - **Tests** - - Installation Command: `pip install apache-beam[test]` + - Installation Command: `pip install 'apache-beam[test]'` - Required for developing on beam and running unittests - **Docs** - - Installation Command: `pip install apache-beam[docs]` + - Installation Command: `pip install 'apache-beam[docs]'` - Generating API documentation using Sphinx ## Execute a pipeline
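Editorial note (an illustrative sketch, not part of the commit): the quoting matters because shells such as zsh treat `[test]` as a glob character class, so the unquoted requirement is interpreted as a filename pattern rather than passed to pip literally. The snippet below uses only the Python standard library to show both halves of that: `fnmatch` to demonstrate that the requirement string behaves as a pattern, and `shlex.quote` to produce a shell-safe argument. The variable names are invented for this note.

```python
import fnmatch
import shlex

requirement = "apache-beam[gcp]"

# Read as a glob pattern, "[gcp]" matches exactly one of the characters
# g, c or p, so the pattern matches "apache-beamg", not the literal text.
matches_glob_target = fnmatch.fnmatchcase("apache-beamg", requirement)
matches_itself = fnmatch.fnmatchcase(requirement, requirement)

# shlex.quote wraps the argument in single quotes because it contains
# characters that are special to the shell.
command = "pip install " + shlex.quote(requirement)
```

Here `command` ends up as `pip install 'apache-beam[gcp]'`, which is the form the updated quickstart recommends.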
[beam] branch master updated: Move master readme.md to 2.40.0
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 78ae512859b Move master readme.md to 2.40.0 new 017f846ca34 Merge pull request #17552 from y1chi/update_md 78ae512859b is described below commit 78ae512859b69130d9f52dff18444378bf6a1233 Author: Yichi Zhang AuthorDate: Wed May 4 22:43:09 2022 + Move master readme.md to 2.40.0 --- CHANGES.md | 33 + 1 file changed, 33 insertions(+) diff --git a/CHANGES.md b/CHANGES.md index 8071b24be4a..5f7aea3f86b 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -50,6 +50,39 @@ * ([BEAM-X](https://issues.apache.org/jira/browse/BEAM-X)). --> +# [2.40.0] - Unreleased + +## Highlights + +* New highly anticipated feature X added to Python SDK ([BEAM-X](https://issues.apache.org/jira/browse/BEAM-X)). +* New highly anticipated feature Y added to Java SDK ([BEAM-Y](https://issues.apache.org/jira/browse/BEAM-Y)). + +## I/Os + +* Support for X source added (Java/Python) ([BEAM-X](https://issues.apache.org/jira/browse/BEAM-X)). + +## New Features / Improvements + +* X feature added (Java/Python) ([BEAM-X](https://issues.apache.org/jira/browse/BEAM-X)). + +## Breaking Changes + +* X behavior was changed ([BEAM-X](https://issues.apache.org/jira/browse/BEAM-X)). + +## Deprecations + +* X behavior is deprecated and will be removed in X versions ([BEAM-X](https://issues.apache.org/jira/browse/BEAM-X)). + +## Bugfixes + +* Fixed X (Java/Python) ([BEAM-X](https://issues.apache.org/jira/browse/BEAM-X)). +* Fixed Java expansion service to allow specific files to stage ([BEAM-14160](https://issues.apache.org/jira/browse/BEAM-14160)). + +## Known Issues + +* ([BEAM-X](https://issues.apache.org/jira/browse/BEAM-X)). + + # [2.39.0] - Unreleased ## Highlights
[beam] branch master updated: [BEAM-14369] Fix "target/options: no such file or directory" error while building Java container
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 3a9e7c7d7f3 [BEAM-14369] Fix "target/options: no such file or directory" error while building Java container
     new f687a2d71f6 Merge pull request #17474 from ihji/BEAM-14369
3a9e7c7d7f3 is described below

commit 3a9e7c7d7f3132e617f4330dfbb5758e2ae4911a
Author: Heejong Lee
AuthorDate: Tue Apr 26 15:45:37 2022 -0700

    [BEAM-14369] Fix "target/options: no such file or directory" error while building Java container
---
 sdks/java/container/common.gradle | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sdks/java/container/common.gradle b/sdks/java/container/common.gradle
index e58f19e99f4..72c643a0445 100644
--- a/sdks/java/container/common.gradle
+++ b/sdks/java/container/common.gradle
@@ -83,7 +83,7 @@ task copyJdkOptions(type: Copy) {
 
 task skipPullLicenses(type: Exec) {
     executable "sh"
-    args "-c", "mkdir -p build/target/go-licenses build/target/third_party_licenses && touch build/target/third_party_licenses/skip"
+    args "-c", "mkdir -p build/target/go-licenses build/target/options build/target/third_party_licenses && touch build/target/third_party_licenses/skip"
 }
 
 docker {
@@ -115,4 +115,4 @@ if (project.rootProject.hasProperty(["docker-pull-licenses"]) ||
 dockerPrepare.dependsOn copySdkHarnessLauncher
 dockerPrepare.dependsOn copyDockerfileDependencies
 dockerPrepare.dependsOn ":sdks:java:container:downloadCloudProfilerAgent"
-dockerPrepare.dependsOn copyJdkOptions
\ No newline at end of file
+dockerPrepare.dependsOn copyJdkOptions
[beam] branch master updated: Update Java katas to Beam 2.38
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new bb407bfcf93 Update Java katas to Beam 2.38 new 9b19687ea22 Merge pull request #17427 from iht/update_java_kata bb407bfcf93 is described below commit bb407bfcf938a3a798179f74da84d2e5b8cac896 Author: Israel Herraiz AuthorDate: Thu Apr 21 16:35:08 2022 +0200 Update Java katas to Beam 2.38 --- .../java/Common Transforms/Aggregation/Count/task-remote-info.yaml | 2 +- .../katas/java/Common Transforms/Aggregation/Max/task-remote-info.yaml | 2 +- .../katas/java/Common Transforms/Aggregation/Mean/task-remote-info.yaml | 2 +- .../katas/java/Common Transforms/Aggregation/Min/task-remote-info.yaml | 2 +- .../katas/java/Common Transforms/Aggregation/Sum/task-remote-info.yaml | 2 +- .../katas/java/Common Transforms/Filter/Filter/task-remote-info.yaml| 2 +- .../katas/java/Common Transforms/Filter/ParDo/task-remote-info.yaml | 2 +- .../java/Common Transforms/WithKeys/WithKeys/task-remote-info.yaml | 2 +- .../java/Core Transforms/Branching/Branching/task-remote-info.yaml | 2 +- .../Core Transforms/CoGroupByKey/CoGroupByKey/task-remote-info.yaml | 2 +- .../Combine/BinaryCombineFn Lambda/task-remote-info.yaml| 2 +- .../java/Core Transforms/Combine/BinaryCombineFn/task-remote-info.yaml | 2 +- .../java/Core Transforms/Combine/Combine PerKey/task-remote-info.yaml | 2 +- .../katas/java/Core Transforms/Combine/CombineFn/task-remote-info.yaml | 2 +- .../java/Core Transforms/Combine/Simple Function/task-remote-info.yaml | 2 +- .../Composite Transform/Composite Transform/task-remote-info.yaml | 2 +- .../DoFn Additional Parameters/task-remote-info.yaml| 2 +- .../katas/java/Core Transforms/Flatten/Flatten/task-remote-info.yaml| 2 +- .../java/Core Transforms/GroupByKey/GroupByKey/task-remote-info.yaml| 2 +- .../java/Core 
Transforms/Map/FlatMapElements/task-remote-info.yaml | 2 +- .../katas/java/Core Transforms/Map/MapElements/task-remote-info.yaml| 2 +- .../java/Core Transforms/Map/ParDo OneToMany/task-remote-info.yaml | 2 +- learning/katas/java/Core Transforms/Map/ParDo/task-remote-info.yaml | 2 +- .../java/Core Transforms/Partition/Partition/task-remote-info.yaml | 2 +- .../java/Core Transforms/Side Input/Side Input/task-remote-info.yaml| 2 +- .../java/Core Transforms/Side Output/Side Output/task-remote-info.yaml | 2 +- .../katas/java/Examples/Word Count/Word Count/task-remote-info.yaml | 2 +- learning/katas/java/IO/Built-in IOs/Built-in IOs/task-remote-info.yaml | 2 +- learning/katas/java/IO/TextIO/TextIO Read/task-remote-info.yaml | 2 +- .../katas/java/Introduction/Hello Beam/Hello Beam/task-remote-info.yaml | 2 +- .../java/Triggers/Early Triggers/Early Triggers/task-remote-info.yaml | 2 +- .../Event Time Triggers/Event Time Triggers/task-remote-info.yaml | 2 +- .../Window Accumulation Mode/task-remote-info.yaml | 2 +- .../katas/java/Windowing/Adding Timestamp/ParDo/task-remote-info.yaml | 2 +- .../Windowing/Adding Timestamp/WithTimestamps/task-remote-info.yaml | 2 +- .../Windowing/Fixed Time Window/Fixed Time Window/task-remote-info.yaml | 2 +- learning/katas/java/build.gradle| 2 +- learning/katas/java/course-remote-info.yaml | 2 +- 38 files changed, 38 insertions(+), 38 deletions(-) diff --git a/learning/katas/java/Common Transforms/Aggregation/Count/task-remote-info.yaml b/learning/katas/java/Common Transforms/Aggregation/Count/task-remote-info.yaml index 65c4399fcba..1483d3399bf 100644 --- a/learning/katas/java/Common Transforms/Aggregation/Count/task-remote-info.yaml +++ b/learning/katas/java/Common Transforms/Aggregation/Count/task-remote-info.yaml @@ -1,2 +1,2 @@ id: 1076163 -update_date: Fri, 12 Jun 2020 08:08:04 UTC +update_date: Thu, 21 Apr 2022 14:23:22 UTC diff --git a/learning/katas/java/Common Transforms/Aggregation/Max/task-remote-info.yaml 
b/learning/katas/java/Common Transforms/Aggregation/Max/task-remote-info.yaml index 4c826f7fb62..cc7ef5a98b3 100644 --- a/learning/katas/java/Common Transforms/Aggregation/Max/task-remote-info.yaml +++ b/learning/katas/java/Common Transforms/Aggregation/Max/task-remote-info.yaml @@ -1,2 +1,2 @@ id: 1076167 -update_date: Fri, 12 Jun 2020 08:08:18 UTC +update_date: Thu, 21 Apr 2022 14:23:27 UTC diff --git a/learning/katas/java/Common Transforms/Aggregation/Mean/task-remote-info.yaml b/learning/katas/java/Common Transforms/Aggregation/Mean/task-remote-info.yaml index 979f0aac7f3..d7f60487d73 100644 --- a/learning/katas/java/Common Transforms/Aggregation/Mean/task-remote-info.yaml +++ b/learning/katas/java/Common
[beam] branch master updated (8f4456636ab -> 76903772b2e)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from 8f4456636ab [BEAM-14346] Fix incorrect error case index in ret2() (#17425) new 3baf14c8e29 Revert "[BEAM-14300] Fix Java precommit failure" new 11c465351a3 Revert "Merge pull request #17223 from [BEAM-14215] Improve argument validation in SnowflakeIO" new 76903772b2e Merge pull request #17426 from Snowflake-Labs/revert-snowflake-io-fix The 35403 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../apache/beam/sdk/io/snowflake/SnowflakeIO.java | 49 +++ .../test/unit/DataSourceConfigurationTest.java | 141 +++-- .../test/unit/read/SnowflakeIOReadTest.java| 52 3 files changed, 39 insertions(+), 203 deletions(-)
svn commit: r53983 - /dev/beam/2.38.0/
Author: altay Date: Wed Apr 20 22:51:10 2022 New Revision: 53983 Log: Remove 2.38.0, now that it is released Removed: dev/beam/2.38.0/
svn commit: r53982 - in /release/beam: ./ 2.37.0/ 2.38.0/ 2.38.0/python/
Author: altay Date: Wed Apr 20 22:51:06 2022 New Revision: 53982 Log: Add 2.38.0 release and remove 2.37.0 release Added: release/beam/2.38.0/ release/beam/2.38.0/apache-beam-2.38.0-source-release.zip (with props) release/beam/2.38.0/apache-beam-2.38.0-source-release.zip.asc release/beam/2.38.0/apache-beam-2.38.0-source-release.zip.sha512 release/beam/2.38.0/python/ release/beam/2.38.0/python/apache-beam-2.38.0.zip (with props) release/beam/2.38.0/python/apache-beam-2.38.0.zip.asc release/beam/2.38.0/python/apache-beam-2.38.0.zip.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-macosx_10_9_x86_64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-macosx_10_9_x86_64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-macosx_10_9_x86_64.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux1_i686.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux1_i686.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux1_i686.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux1_x86_64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux1_x86_64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux1_x86_64.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2010_i686.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2010_i686.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2010_i686.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2010_x86_64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2010_x86_64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2010_x86_64.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2014_aarch64.whl (with props) 
release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2014_aarch64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-manylinux2014_aarch64.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-win32.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-win32.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-win32.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-win_amd64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-win_amd64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp36-cp36m-win_amd64.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-macosx_10_9_x86_64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-macosx_10_9_x86_64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-macosx_10_9_x86_64.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux1_i686.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux1_i686.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux1_i686.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux1_x86_64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux1_x86_64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux1_x86_64.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2010_i686.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2010_i686.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2010_i686.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2010_x86_64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2010_x86_64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2010_x86_64.whl.sha512 
release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2014_aarch64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2014_aarch64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-manylinux2014_aarch64.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-win32.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-win32.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-win32.whl.sha512 release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-win_amd64.whl (with props) release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-win_amd64.whl.asc release/beam/2.38.0/python/apache_beam-2.38.0-cp37-cp37m-win_amd64.whl.sha512 release/beam/2.38.0/python
[beam] branch master updated (c694d22cd42 -> 140706f9f55)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from c694d22cd42 Merge pull request #17318 from akvelon/BEAM-14247-add-graph-model-design add 140706f9f55 Update .asf.yaml (#17409) No new revisions were added by this update. Summary of changes: .asf.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated: Re-raise exceptions swallowed in several Python I/O connectors
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 867a5859c4b Re-raise exceptions swallowed in several Python I/O connectors new f5a9712ef94 Merge pull request #17329 from chamikaramj/re_raise_swallowed_exceptions 867a5859c4b is described below commit 867a5859c4b2c9cd27bd398a12ce2f1d062f4f2b Author: Chamikara Jayalath AuthorDate: Sat Apr 9 00:26:10 2022 -0700 Re-raise exceptions swallowed in several Python I/O connectors --- .../io/gcp/datastore/v1new/datastoreio.py | 2 + .../io/gcp/datastore/v1new/datastoreio_test.py | 18 + sdks/python/apache_beam/io/gcp/gcsio.py| 1 + sdks/python/apache_beam/io/gcp/gcsio_test.py | 46 -- 4 files changed, 57 insertions(+), 10 deletions(-) diff --git a/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py b/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py index 4ac2803619d..dea51a0ce89 100644 --- a/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py +++ b/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py @@ -301,8 +301,10 @@ class ReadFromDatastore(PTransform): except (ClientError, GoogleAPICallError) as e: # e.code.value contains the numeric http status code. 
service_call_metric.call(e.code.value) +raise except HttpError as e: service_call_metric.call(e) +raise class _Mutate(PTransform): diff --git a/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio_test.py b/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio_test.py index 603bcd018c3..8a7977e475c 100644 --- a/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio_test.py +++ b/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio_test.py @@ -328,13 +328,17 @@ class DatastoreioTest(unittest.TestCase): client_query.fetch.side_effect = [ exceptions.DeadlineExceeded("Deadline exceed") ] - list(_query_fn.process(self._mock_query)) - self.verify_read_call_metric( - self._PROJECT, self._NAMESPACE, "deadline_exceeded", 1) - # Test success - client_query.fetch.side_effect = [[]] - list(_query_fn.process(self._mock_query)) - self.verify_read_call_metric(self._PROJECT, self._NAMESPACE, "ok", 1) + try: +list(_query_fn.process(self._mock_query)) + except Exception: +self.verify_read_call_metric( +self._PROJECT, self._NAMESPACE, "deadline_exceeded", 1) +# Test success +client_query.fetch.side_effect = [[]] +list(_query_fn.process(self._mock_query)) +self.verify_read_call_metric(self._PROJECT, self._NAMESPACE, "ok", 1) + else: +raise Exception('Excepted _query_fn.process call to raise an error') def verify_read_call_metric(self, project_id, namespace, status, count): """Check if a metric was recorded for the Datastore IO read API call.""" diff --git a/sdks/python/apache_beam/io/gcp/gcsio.py b/sdks/python/apache_beam/io/gcp/gcsio.py index cc740c397e9..0caf4415247 100644 --- a/sdks/python/apache_beam/io/gcp/gcsio.py +++ b/sdks/python/apache_beam/io/gcp/gcsio.py @@ -642,6 +642,7 @@ class GcsDownloader(Downloader): service_call_metric.call('ok') except HttpError as e: service_call_metric.call(e) + raise @retry.with_exponential_backoff( retry_filter=retry.retry_on_server_errors_and_timeout_filter) diff --git a/sdks/python/apache_beam/io/gcp/gcsio_test.py 
b/sdks/python/apache_beam/io/gcp/gcsio_test.py index 1d2836a5614..2f867ffc528 100644 --- a/sdks/python/apache_beam/io/gcp/gcsio_test.py +++ b/sdks/python/apache_beam/io/gcp/gcsio_test.py @@ -103,10 +103,17 @@ class FakeGcsObjects(object): # has to persist even past the deletion of the object. self.last_generation = {} self.list_page_tokens = {} +self._fail_when_getting_metadata = [] +self._fail_when_reading = [] - def add_file(self, f): + def add_file( + self, f, fail_when_getting_metadata=False, fail_when_reading=False): self.files[(f.bucket, f.object)] = f self.last_generation[(f.bucket, f.object)] = f.generation +if fail_when_getting_metadata: + self._fail_when_getting_metadata.append(f) +if fail_when_reading: + self._fail_when_reading.append(f) def get_file(self, bucket, obj): return self.files.get((bucket, obj), None) @@ -123,8 +130,12 @@ class FakeGcsObjects(object): # Failing with an HTTP 404 if file does not exist. raise HttpError({'status': 404}, None, None) if download is None: + if f in self._fail_when_getting_metadata: +raise HttpError({'status': 429}, None, None)
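The commit above adds a bare `raise` after the metric call in each `except` block, so the failure is still counted but no longer swallowed. A minimal sketch of that pattern follows; it is not the Beam code. `FakeServiceError`, `CallMetric` and `fetch` are hypothetical stand-ins for `HttpError`, the service call metric and the connector method. The test uses `assertRaises`, which is one possible alternative to the try/except/else shape used in the diff.

```python
import unittest


class FakeServiceError(Exception):
    """Hypothetical stand-in for an API error such as HttpError."""

    def __init__(self, status):
        super().__init__(f"status {status}")
        self.status = status


class CallMetric:
    """Hypothetical metric recorder; collects one status per call."""

    def __init__(self):
        self.statuses = []

    def call(self, status):
        self.statuses.append(status)


def fetch(client, metric):
    """Record the outcome of client() and let failures propagate."""
    try:
        result = client()
        metric.call("ok")
        return result
    except FakeServiceError as e:
        metric.call(e.status)
        # Without this bare `raise` the error would be swallowed and
        # the caller would receive None as if the call had succeeded.
        raise


class FetchTest(unittest.TestCase):
    def test_error_is_recorded_and_reraised(self):
        metric = CallMetric()

        def failing_client():
            raise FakeServiceError(429)

        # assertRaises fails the test if no exception is raised, so no
        # hand-written else branch is needed.
        with self.assertRaises(FakeServiceError):
            fetch(failing_client, metric)
        self.assertEqual(metric.statuses, [429])

    def test_success_is_recorded(self):
        metric = CallMetric()
        self.assertEqual(fetch(lambda: [], metric), [])
        self.assertEqual(metric.statuses, ["ok"])
```

A bare `raise` re-raises the active exception with its original traceback, which is why it is usually preferred over `raise e` in this situation.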
[beam] branch master updated (141fb79 -> 0c93be1)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 141fb79 Merge pull request #17145 from ibzib/no-flink11 add 0c93be1 [BEAM-14172] Update tox.ini for pydocs (#17176) No new revisions were added by this update. Summary of changes: sdks/python/tox.ini | 1 + 1 file changed, 1 insertion(+)
[beam] branch aaltay-patch-1 updated (0102c1a -> 2d35d86)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git. from 0102c1a Update tox.ini add 2d35d86 Update sdks/python/tox.ini No new revisions were added by this update. Summary of changes: sdks/python/tox.ini | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] 01/01: Update tox.ini
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git commit 0102c1a562af37c9cee7063413bdf04b5fa5b3be Author: Ahmet Altay AuthorDate: Thu Mar 24 14:50:21 2022 -0700 Update tox.ini --- sdks/python/tox.ini | 1 + 1 file changed, 1 insertion(+) diff --git a/sdks/python/tox.ini b/sdks/python/tox.ini index e7eb3ab..33c3fb0 100644 --- a/sdks/python/tox.ini +++ b/sdks/python/tox.ini @@ -143,6 +143,7 @@ deps = Sphinx==1.8.5 sphinx_rtd_theme==0.4.3 docutils<0.18 + Jinja2==3.0.3 # TODO(BEAM-14712): Sphinx version is too old. commands = time {toxinidir}/scripts/generate_pydoc.sh
[beam] branch aaltay-patch-1 created (now 0102c1a)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git. at 0102c1a Update tox.ini This branch includes the following new commits: new 0102c1a Update tox.ini The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[beam] branch master updated (5148f38 -> 73fbb41)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 5148f38 Merge pull request #17023 from [BEAM-12164]: Remove child partition query workaround add a392cf6 Update Changes.md w/Go pipeline pre-process fix. add 73fbb41 Merge pull request #17131 from apache/lostluck-changes No new revisions were added by this update. Summary of changes: CHANGES.md | 2 ++ 1 file changed, 2 insertions(+)
[beam] branch aaltay-patch-1 created (now 162936f)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch aaltay-patch-1 in repository https://gitbox.apache.org/repos/asf/beam.git. at 162936f [DO NOT MERGE] Testing pip-licenses upgrade No new revisions were added by this update.
[beam] branch master updated: [BEAM-10212] Clean-up comments, remove rawtypes usage.
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 48ed687 [BEAM-10212] Clean-up comments, remove rawtypes usage. new 4c979cc Merge pull request #16954 from lukecwik/beam10212 48ed687 is described below commit 48ed6878d6f886bcfba745fb5474088ef76141f3 Author: Luke Cwik AuthorDate: Fri Feb 25 12:19:31 2022 -0800 [BEAM-10212] Clean-up comments, remove rawtypes usage. --- .../apache/beam/fn/harness/FnApiDoFnRunner.java| 26 +++--- 1 file changed, 8 insertions(+), 18 deletions(-) diff --git a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java index 89c05b8..cf84027 100644 --- a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java +++ b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java @@ -721,7 +721,6 @@ public class FnApiDoFnRunner primaryInFullyProcessedWindowsRoot, +WindowedValue primarySplitRoot, +WindowedValue residualSplitRoot, +WindowedValue residualInUnprocessedWindowsRoot) { return new AutoValue_FnApiDoFnRunner_WindowedSplitResult( primaryInFullyProcessedWindowsRoot, primarySplitRoot, @@ -1008,18 +1000,17 @@ public class FnApiDoFnRunner getPrimaryInFullyProcessedWindowsRoot(); -public abstract @Nullable WindowedValue getPrimarySplitRoot(); +public abstract @Nullable WindowedValue getPrimarySplitRoot(); -public abstract @Nullable WindowedValue getResidualSplitRoot(); +public abstract @Nullable WindowedValue getResidualSplitRoot(); -public abstract @Nullable WindowedValue getResidualInUnprocessedWindowsRoot(); +public abstract @Nullable WindowedValue getResidualInUnprocessedWindowsRoot(); } @AutoValue @AutoValue.CopyAnnotations - @SuppressWarnings({"rawtypes"}) abstract static class SplitResultsWithStopIndex { 
public static SplitResultsWithStopIndex of( WindowedSplitResult windowSplit, @@ -1750,7 +1741,6 @@ public class FnApiDoFnRunner
[beam] branch master updated: [BEAM-13923] Fix the answers placeholders locations in the Java katas
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new eb25730 [BEAM-13923] Fix the answers placeholders locations in the Java katas new 3a7a213 Merge pull request #16827 from iht/groom_java_katas eb25730 is described below commit eb2573076c4afbcb8ed7c5330c318898ca023de1 Author: Israel Herraiz AuthorDate: Fri Feb 11 17:46:49 2022 +0100 [BEAM-13923] Fix the answers placeholders locations in the Java katas --- .../Common Transforms/Aggregation/Count/task-info.yaml | 4 ++-- .../java/Common Transforms/Aggregation/Max/task-info.yaml | 4 ++-- .../java/Common Transforms/Aggregation/Mean/task-info.yaml | 4 ++-- .../java/Common Transforms/Aggregation/Min/task-info.yaml | 4 ++-- .../java/Common Transforms/Aggregation/Sum/task-info.yaml | 4 ++-- .../java/Common Transforms/Filter/Filter/task-info.yaml| 4 ++-- .../java/Common Transforms/Filter/ParDo/task-info.yaml | 4 ++-- .../Common Transforms/WithKeys/WithKeys/task-info.yaml | 4 ++-- .../Core Transforms/Branching/Branching/task-info.yaml | 8 .../learning/katas/coretransforms/cogroupbykey/Task.java | 9 + .../CoGroupByKey/CoGroupByKey/task-info.yaml | 2 +- .../Combine/BinaryCombineFn Lambda/task-info.yaml | 4 ++-- .../Core Transforms/Combine/BinaryCombineFn/task-info.yaml | 2 +- .../Core Transforms/Combine/Combine PerKey/task-info.yaml | 6 +++--- .../java/Core Transforms/Combine/CombineFn/task-info.yaml | 2 +- .../Core Transforms/Combine/Simple Function/task-info.yaml | 2 +- .../Composite Transform/Composite Transform/task-info.yaml | 4 ++-- .../DoFn Additional Parameters/task-info.yaml | 2 +- .../DoFn Additional Parameters/task.md | 2 ++ .../java/Core Transforms/Flatten/Flatten/task-info.yaml| 4 ++-- .../Core Transforms/GroupByKey/GroupByKey/task-info.yaml | 4 ++-- .../katas/coretransforms/map/flatmapelements/Task.java | 1 - .../Core 
Transforms/Map/FlatMapElements/task-info.yaml | 4 ++-- .../java/Core Transforms/Map/MapElements/task-info.yaml| 4 ++-- .../Core Transforms/Map/ParDo OneToMany/task-info.yaml | 4 ++-- .../katas/java/Core Transforms/Map/ParDo/task-info.yaml| 4 ++-- .../Core Transforms/Partition/Partition/task-info.yaml | 4 ++-- .../beam/learning/katas/coretransforms/sideinput/Task.java | 8 .../Core Transforms/Side Input/Side Input/task-info.yaml | 4 ++-- .../Core Transforms/Side Output/Side Output/task-info.yaml | 4 ++-- .../java/Examples/Word Count/Word Count/task-info.yaml | 4 ++-- .../java/Introduction/Hello Beam/Hello Beam/task-info.yaml | 4 ++-- .../beam/learning/katas/triggers/earlytriggers/Task.java | 8 .../Triggers/Early Triggers/Early Triggers/task-info.yaml | 2 +- .../learning/katas/triggers/eventtimetriggers/Task.java| 8 .../Event Time Triggers/Event Time Triggers/task-info.yaml | 2 +- .../beam/learning/katas/triggers/windowaccummode/Task.java | 8 .../Window Accumulation Mode/task-info.yaml| 2 +- .../katas/windowing/addingtimestamp/pardo/Task.java| 8 .../java/Windowing/Adding Timestamp/ParDo/task-info.yaml | 2 +- .../windowing/addingtimestamp/withtimestamps/Task.java | 8 .../Adding Timestamp/WithTimestamps/task-info.yaml | 2 +- .../Fixed Time Window/Fixed Time Window/task-info.yaml | 4 ++-- learning/katas/java/build.gradle | 14 +++--- 44 files changed, 127 insertions(+), 69 deletions(-) diff --git a/learning/katas/java/Common Transforms/Aggregation/Count/task-info.yaml b/learning/katas/java/Common Transforms/Aggregation/Count/task-info.yaml index 3240233..29b6c27 100644 --- a/learning/katas/java/Common Transforms/Aggregation/Count/task-info.yaml +++ b/learning/katas/java/Common Transforms/Aggregation/Count/task-info.yaml @@ -22,8 +22,8 @@ files: - name: src/org/apache/beam/learning/katas/commontransforms/aggregation/count/Task.java visible: true placeholders: - - offset: 1707 -length: 29 + - offset: 1896 +length: 36 placeholder_text: TODO() - name: 
test/org/apache/beam/learning/katas/commontransforms/aggregation/count/TaskTest.java visible: false diff --git a/learning/katas/java/Common Transforms/Aggregation/Max/task-info.yaml b/learning/katas/java/Common Transforms/Aggregation/Max/task-info.yaml index cd10c1f..86a8087 100644 --- a/learning/katas/java/Common Transforms/Aggregation/Max/task-info.yaml +++ b/learning/katas/java/Common Transforms/Aggregation/Max/task-info.yaml @@ -22,8 +22,8 @@ files: - name: src/org/apache/beam/learning/katas/commontransforms/aggregation/max/Task.java visible: true placeholders: - - offset: 1709
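The task-info.yaml changes above move each answer placeholder to a new `offset` and `length`. The sketch below shows how such a pair could be derived and applied, assuming `offset` is a zero-based character index into the task file and `length` is the number of characters in the answer span. That reading of the format is an assumption, not something the email states. `locate_placeholder`, `apply_placeholder` and the sample source line are hypothetical.

```python
def locate_placeholder(source: str, answer: str) -> dict:
    """Return offset and length of the first occurrence of answer in source."""
    offset = source.find(answer)
    if offset < 0:
        raise ValueError("answer text not found in source")
    return {"offset": offset, "length": len(answer)}


def apply_placeholder(source: str, offset: int, length: int,
                      placeholder_text: str = "TODO()") -> str:
    """Replace the span [offset, offset + length) with the placeholder text."""
    return source[:offset] + placeholder_text + source[offset + length:]


# Hypothetical task file content; the real Task.java files are not shown here.
task_source = "return input.apply(Count.globally());\n"
span = locate_placeholder(task_source, "input.apply(Count.globally())")
student_view = apply_placeholder(task_source, span["offset"], span["length"])
```

Under this reading, any edit that adds or removes text before the answer span shifts its offset, which would explain why a grooming commit has to touch the offsets in so many task-info.yaml files at once.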
svn commit: r52939 - /dev/beam/2.37.0/
Author: altay Date: Wed Mar 9 01:48:46 2022 New Revision: 52939 Log: Remove 2.37.0. It is now released Removed: dev/beam/2.37.0/
svn commit: r52938 - in /release/beam: ./ 2.35.0/ 2.36.0/ 2.37.0/ 2.37.0/python/
Author: altay Date: Wed Mar 9 01:48:08 2022 New Revision: 52938 Log: Add 2.37.0 release, remove 2.36.0 and 2.35.0 Added: release/beam/2.37.0/ release/beam/2.37.0/apache-beam-2.37.0-source-release.zip (with props) release/beam/2.37.0/apache-beam-2.37.0-source-release.zip.asc release/beam/2.37.0/apache-beam-2.37.0-source-release.zip.sha512 release/beam/2.37.0/python/ release/beam/2.37.0/python/apache-beam-2.37.0.zip (with props) release/beam/2.37.0/python/apache-beam-2.37.0.zip.asc release/beam/2.37.0/python/apache-beam-2.37.0.zip.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-macosx_10_9_x86_64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-macosx_10_9_x86_64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-macosx_10_9_x86_64.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux1_i686.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux1_i686.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux1_i686.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux1_x86_64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux1_x86_64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux1_x86_64.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2010_i686.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2010_i686.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2010_i686.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2010_x86_64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2010_x86_64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2010_x86_64.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2014_aarch64.whl (with props) 
release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2014_aarch64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-manylinux2014_aarch64.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-win32.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-win32.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-win32.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-win_amd64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-win_amd64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp36-cp36m-win_amd64.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-macosx_10_9_x86_64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-macosx_10_9_x86_64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-macosx_10_9_x86_64.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux1_i686.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux1_i686.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux1_i686.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux1_x86_64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux1_x86_64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux1_x86_64.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2010_i686.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2010_i686.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2010_i686.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2010_x86_64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2010_x86_64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2010_x86_64.whl.sha512 
release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2014_aarch64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2014_aarch64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-manylinux2014_aarch64.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-win32.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-win32.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-win32.whl.sha512 release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-win_amd64.whl (with props) release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-win_amd64.whl.asc release/beam/2.37.0/python/apache_beam-2.37.0-cp37-cp37m-win_amd64.whl.sha512 release/beam/2.37.0/python
[beam] branch master updated (9afde56 -> 9833b7b)
This is an automated email from the ASF dual-hosted git repository. altay pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 9afde56 Adding a logical type for Schemas using proto serialization. (#16940) add 9833b7b BEAM-13765 missing PAssert methods (#16668) No new revisions were added by this update. Summary of changes: .../java/org/apache/beam/sdk/testing/PAssert.java | 47 ++ 1 file changed, 47 insertions(+)
[beam] branch master updated: Remove resolved issue in notebook
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 87e01f5 Remove resolved issue in notebook new 0c21b99 Merge pull request #17021 from davidcavazos/patch-1 87e01f5 is described below commit 87e01f50bc1bdb43e1a37f4a6acd20726db5278b Author: David Cavazos AuthorDate: Fri Mar 4 13:58:42 2022 -0800 Remove resolved issue in notebook --- .../documentation/transforms/python/elementwise/pardo-py.ipynb | 2 -- 1 file changed, 2 deletions(-) diff --git a/examples/notebooks/documentation/transforms/python/elementwise/pardo-py.ipynb b/examples/notebooks/documentation/transforms/python/elementwise/pardo-py.ipynb index c0125c9..b74887a 100644 --- a/examples/notebooks/documentation/transforms/python/elementwise/pardo-py.ipynb +++ b/examples/notebooks/documentation/transforms/python/elementwise/pardo-py.ipynb @@ -365,8 +365,6 @@ "\n", "> *Known issues:*\n", ">\n", -"> * [[BEAM-7885]](https://issues.apache.org/jira/browse/BEAM-7885)\n", -"> `DoFn.setup()` doesn't run for streaming jobs running in the `DirectRunner`.\n", "> * [[BEAM-7340]](https://issues.apache.org/jira/browse/BEAM-7340)\n", "> `DoFn.teardown()` metrics are lost." ]
[beam] branch master updated: Remove resolved issue in docs + update class path on sample (#17018)
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 3ed277b Remove resolved issue in docs + update class path on sample (#17018) 3ed277b is described below commit 3ed277b809d36afb3e11178e8658e4695ff3abcb Author: David Cavazos AuthorDate: Fri Mar 4 12:16:27 2022 -0800 Remove resolved issue in docs + update class path on sample (#17018) --- .../apache_beam/examples/snippets/transforms/elementwise/pardo.py | 2 +- .../content/en/documentation/transforms/python/elementwise/pardo.md | 2 -- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git a/sdks/python/apache_beam/examples/snippets/transforms/elementwise/pardo.py b/sdks/python/apache_beam/examples/snippets/transforms/elementwise/pardo.py index c54d05e..fd8dca1 100644 --- a/sdks/python/apache_beam/examples/snippets/transforms/elementwise/pardo.py +++ b/sdks/python/apache_beam/examples/snippets/transforms/elementwise/pardo.py @@ -96,7 +96,7 @@ def pardo_dofn_methods(test=None): class DoFnMethods(beam.DoFn): def __init__(self): print('__init__') - self.window = beam.window.GlobalWindow() + self.window = beam.transforms.window.GlobalWindow() def setup(self): print('setup') diff --git a/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md b/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md index efa64ab..dc12acd 100644 --- a/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md +++ b/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md @@ -141,8 +141,6 @@ Output: > *Known issues:* > -> * [[BEAM-7885]](https://issues.apache.org/jira/browse/BEAM-7885) -> `DoFn.setup()` doesn't run for streaming jobs running in the `DirectRunner`. 
> * [[BEAM-7340]](https://issues.apache.org/jira/browse/BEAM-7340) > `DoFn.teardown()` metrics are lost.