Re: release process: typescript SDK?

2024-04-29 Thread Austin Bennett
@Robert Bradshaw  -- this seems sensible.  I don't
have the relevant NPM credentials, so am unable to address myself.

Having manual steps in the release process, and esp. not keeping all SDKs
up-to-date seems worth addressing.

On Wed, Apr 17, 2024 at 8:29 AM Danny McCormick 
wrote:

> Probably the easiest way for this to happen is for @Robert Bradshaw
>  to get the token set up as a secret (should be
> quick) and then Austin to take the workflow forward.
>
> In the past to get secrets added, Infra has asked that I (a) email
> r...@apache.org with the secret name and secret contents, and (b) opened
> a JIRA to externally track progress -
> https://issues.apache.org/jira/browse/INFRA-25009
>
> On Wed, Apr 17, 2024 at 11:24 AM Austin Bennett  wrote:
>
>> I don't mind doing, esp. if nobody is eager to handle/prioritize the push
>> artifact in near-term.  If I'm to do, let's connect off-list for
>> token/creds.
>>
>> Furthermore, I agree that getting RCs as part of the overall
>> release/validation process would be a nice addition.
>>
>> On Tue, Apr 16, 2024 at 2:43 PM Robert Bradshaw via dev <
>> dev@beam.apache.org> wrote:
>>
>>> Correct, I've just been pushing these manually, and lately there haven't
>>> been many changes to push. I'm all for getting these set up as part of the
>>> standard release process.
>>>
>>> On Tue, Apr 16, 2024 at 1:22 PM Danny McCormick <
>>> dannymccorm...@google.com> wrote:
>>>
>>>> I've never published npm artifacts before, but I imagine the hardest
>>>> part is getting the credentials set up, then it is probably very easy to
>>>> set up a GitHub Actions workflow to publish
>>>> <https://docs.github.com/en/actions/publishing-packages/publishing-nodejs-packages#publishing-packages-to-the-npm-registry>.
>>>> Who has done these releases in the past/has credentials for the npm
>>>> package? Maybe @Robert Bradshaw ? We will need a
>>>> token set up as a secret to automate this.
>>>>
>>>> I'll also note that we don't do any typescript validation today, and it
>>>> would be nice to publish RCs as part of this
>>>>
>>>> On Tue, Apr 16, 2024 at 4:11 PM Austin Bennett 
>>>> wrote:
>>>>
>>>>> Hi Beam Devs,
>>>>>
>>>>> Calling out it looks like our release process for apache-beam for
>>>>> typescript/npm is broken, seemingly the last published release was 2.49.0
>>>>> about 9 months ago.  The other languages look like they are publishing to
>>>>> expected locations.
>>>>>
>>>>> https://www.npmjs.com/package/apache-beam
>>>>>
>>>>> I noticed this since I was digging into security concerns raised by
>>>>> GitHub's dependabot across our repos [ ex:
>>>>> https://github.com/apache/beam-starter-typescript/security/dependabot ], and
>>>>> towards getting our repos tidied.
>>>>>
>>>>> This leads me to believe we may want two distinct things:
>>>>> * update our release docs/process/scripts to ensure that we
>>>>> generate/publish all artifacts to relevant repositories.
>>>>> * Arrive at a process to more straightforwardly attend to security
>>>>> updates [ maybe we want these sent to dev list, or another distribution? ]
>>>>>
>>>>> From a very quick search, it did not look like we have scripts to push
>>>>> to npm.  That should be verified more thoroughly -- i haven't done a
>>>>> release before, so relevant scripts could be hiding elsewhere.
>>>>>
>>>>> Cheers,
>>>>> Austin
>>>>>
>>>>>
>>>>> NOTE:  everything with our main Beam repo specifically looks OK.  Some
>>>>> things discovered were on the other/supplementary repos, though I believe
>>>>> those are still worthwhile to attend to and support.
>>>>>
>>>>


Survey/Pulse: Beam on Flink/Spark/Samza/etc

2024-04-29 Thread Austin Bennett
Curious who all is using Beam on runners other than Dataflow.

Please respond either on-list or to me directly ...  Mostly just curious
about the extent to which Beam is fulfilling its promise of runner
agnosticism.  Getting good data on that is hard, so anecdotes would be very
welcome!


Re: release process: typescript SDK?

2024-04-17 Thread Austin Bennett
I don't mind doing, esp. if nobody is eager to handle/prioritize the push
artifact in near-term.  If I'm to do, let's connect off-list for
token/creds.

Furthermore, I agree that getting RCs as part of the overall
release/validation process would be a nice addition.

On Tue, Apr 16, 2024 at 2:43 PM Robert Bradshaw via dev 
wrote:

> Correct, I've just been pushing these manually, and lately there haven't
> been many changes to push. I'm all for getting these set up as part of the
> standard release process.
>
> On Tue, Apr 16, 2024 at 1:22 PM Danny McCormick 
> wrote:
>
>> I've never published npm artifacts before, but I imagine the hardest part
>> is getting the credentials set up, then it is probably very easy to set up
>> a GitHub Actions workflow to publish
>> <https://docs.github.com/en/actions/publishing-packages/publishing-nodejs-packages#publishing-packages-to-the-npm-registry>.
>> Who has done these releases in the past/has credentials for the npm
>> package? Maybe @Robert Bradshaw ? We will need a
>> token set up as a secret to automate this.
>>
>> I'll also note that we don't do any typescript validation today, and it
>> would be nice to publish RCs as part of this
>>
>> On Tue, Apr 16, 2024 at 4:11 PM Austin Bennett  wrote:
>>
>>> Hi Beam Devs,
>>>
>>> Calling out it looks like our release process for apache-beam for
>>> typescript/npm is broken, seemingly the last published release was 2.49.0
>>> about 9 months ago.  The other languages look like they are publishing to
>>> expected locations.
>>>
>>> https://www.npmjs.com/package/apache-beam
>>>
>>> I noticed this since I was digging into security concerns raised by
>>> GitHub's dependabot across our repos [ ex:
>>> https://github.com/apache/beam-starter-typescript/security/dependabot ], and
>>> towards getting our repos tidied.
>>>
>>> This leads me to believe we may want two distinct things:
>>> * update our release docs/process/scripts to ensure that we
>>> generate/publish all artifacts to relevant repositories.
>>> * Arrive at a process to more straightforwardly attend to security
>>> updates [ maybe we want these sent to dev list, or another distribution? ]
>>>
>>> From a very quick search, it did not look like we have scripts to push
>>> to npm.  That should be verified more thoroughly -- i haven't done a
>>> release before, so relevant scripts could be hiding elsewhere.
>>>
>>> Cheers,
>>> Austin
>>>
>>>
>>> NOTE:  everything with our main Beam repo specifically looks OK.  Some
>>> things discovered were on the other/supplementary repos, though I believe
>>> those are still worthwhile to attend to and support.
>>>
>>


release process: typescript SDK?

2024-04-16 Thread Austin Bennett
Hi Beam Devs,

Calling out it looks like our release process for apache-beam for
typescript/npm is broken, seemingly the last published release was 2.49.0
about 9 months ago.  The other languages look like they are publishing to
expected locations.

https://www.npmjs.com/package/apache-beam

I noticed this since I was digging into security concerns raised by
GitHub's dependabot across our repos [ ex:
https://github.com/apache/beam-starter-typescript/security/dependabot ], and
towards getting our repos tidied.

This leads me to believe we may want two distinct things:
* update our release docs/process/scripts to ensure that we
generate/publish all artifacts to relevant repositories.
* Arrive at a process to more straightforwardly attend to security updates
[ maybe we want these sent to dev list, or another distribution? ]

From a very quick search, it did not look like we have scripts to push to
npm.  That should be verified more thoroughly -- i haven't done a release
before, so relevant scripts could be hiding elsewhere.

Cheers,
Austin


NOTE:  everything with our main Beam repo specifically looks OK.  Some
things discovered were on the other/supplementary repos, though I believe
those are still worthwhile to attend to and support.
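
As a rough illustration of the staleness check described above, here is a
minimal Python sketch. It assumes the public npm registry metadata endpoint
(https://registry.npmjs.org/apache-beam) and its "dist-tags" field; the helper
names and the expected version "2.55.0" are hypothetical placeholders, not
part of any Beam release script.

```python
import json
import urllib.request

# Public npm registry metadata endpoint for the package discussed in this thread.
REGISTRY_URL = "https://registry.npmjs.org/apache-beam"


def latest_published(metadata: dict) -> str:
    """Return the version behind the 'latest' dist-tag in npm registry metadata."""
    return metadata["dist-tags"]["latest"]


def parse_version(version: str) -> tuple:
    """Turn '2.49.0' into (2, 49, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_behind(published: str, expected: str) -> bool:
    """True when the published npm version is older than the expected release."""
    return parse_version(published) < parse_version(expected)


def fetch_metadata(url: str = REGISTRY_URL) -> dict:
    """Download package metadata; needs network access, so it is not called below."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


# Offline example: 2.49.0 is the version mentioned in this thread,
# 2.55.0 is a hypothetical "current" Beam release used only for illustration.
sample = {"dist-tags": {"latest": "2.49.0"}}
print(is_behind(latest_published(sample), "2.55.0"))
```

A check like this could run as a post-release step, with fetch_metadata()
supplying the live data, so that a skipped npm publish is noticed early.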


Re: tox issues in dev container

2024-04-10 Thread Austin Bennett
based on 'devcontainer', I assume running in/via:  https://containers.dev/
?

On Fri, Apr 5, 2024 at 3:46 PM XQ Hu via dev  wrote:

> always pin the versions as well.
>
> On Fri, Apr 5, 2024 at 5:24 PM Valentyn Tymofieiev via dev <
> dev@beam.apache.org> wrote:
>
>> Could you please provide more info about how you create your environment?
>> Also what OS do you use?
>>
>> On Fri, Apr 5, 2024 at 2:08 PM Joey Tran 
>> wrote:
>>
>>> Yeah that was the tox command I was running
>>>
>>> On Fri, Apr 5, 2024, 4:37 PM XQ Hu via dev  wrote:
>>>

>>>> https://cwiki.apache.org/confluence/display/BEAM/Python+Tips#PythonTips-LintandFormattingChecks
>>>>
>>>> This generally works well. Have you checked this?
>>>>
>>>> On Fri, Apr 5, 2024 at 4:07 PM Joey Tran
>>>> wrote:
>>>>
>>>>> I think I might be doing something silly with my environment.
>>>>>
>>>>> I'm trying to lint using tox in a dev container, but running tox ends
>>>>> with this error:
>>>>> ```
>>>>> (env)  jtran@[Beam Build Env.]:~/beam {flatmapdefault} ]
>>>>> $ tox
>>>>>   File "/usr/lib/python3/dist-packages/tox/reporter.py", line 32, in
>>>>> __init__
>>>>> self._reset(**kwargs)
>>>>>   File "/usr/lib/python3/dist-packages/tox/reporter.py", line 38, in
>>>>> _reset
>>>>> self.tw = py.io.TerminalWriter()
>>>>> AttributeError: module 'py' has no attribute 'io'
>>>>> ```
>>>>>
>>>>> This is preventing me from linting (sorry to everyone on my PRs who
>>>>> keep seeing linting errors...)
>>>>>
>>>>> Any help here would be welcome. I've been struggling generally to get
>>>>> a stable dev environment working.
>>>>>
>>>>> Cheers,
>>>>> Joey
>>>>>
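
One way to narrow down the error quoted above is to check which `py` module
the interpreter actually imports. The sketch below is only a diagnostic under
an assumption: that the failure comes from a minimal `py` module (for example
the shim bundled with newer pytest releases) being found instead of the full
`py` library that tox 3 relies on. The helper names are hypothetical.

```python
import importlib
import types


def missing_py_attrs(module: types.ModuleType, needed=("io",)) -> list:
    """Return the names from `needed` that the given module does not provide."""
    return [name for name in needed if not hasattr(module, name)]


def describe(module_name: str = "py") -> str:
    """Report where a module is loaded from and whether it has what tox 3 expects."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return f"{module_name}: not installed"
    missing = missing_py_attrs(module)
    location = getattr(module, "__file__", "<unknown>")
    status = "ok" if not missing else "missing " + ", ".join(missing)
    return f"{module_name}: loaded from {location}: {status}"


if __name__ == "__main__":
    # Run this with the same interpreter that tox uses.
    print(describe())
```

If the report shows `missing io`, installing the full `py` package into that
environment or moving to a newer tox are the usual things to try; which one
applies depends on how the dev container was built.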



Re: Adding Dead Letter Queues to Beam IOs

2023-11-12 Thread Austin Bennett
This will eventually be a great addition to aid usability -- thanks for
starting to think through and address, John!



On Fri, Nov 10, 2023, 10:54 AM Danny McCormick via dev 
wrote:

> Thanks - the general ideas seem solid, I added some questions/comments as
> well.
>
> On Fri, Nov 10, 2023 at 1:32 PM Robert Bradshaw via dev <
> dev@beam.apache.org> wrote:
>
>> Thanks. I added some comments to the doc and open PR.
>>
>> On Wed, Nov 8, 2023 at 12:44 PM John Casey via dev 
>> wrote:
>> >
>> > Hi All,
>> >
>> > I've written up a design for adding DLQs to existing Beam IOs. It's
>> been through a round of reviews with some Dataflow folks at Google, but I'd
>> appreciate any comments the rest of Beam have around how to refine the
>> design.
>> >
>> > TL;DR: Make it easy for a user to configure IOs to route bad data to an
>> alternate sink instead of crashing the pipeline or having the record be
>> retried indefinitely.
>> >
>> >
>> https://docs.google.com/document/d/1NGeCk6tOqF-TiGEAV7ixd_vhIiWz9sHPlCa1P_77Ajs/edit?usp=sharing
>> >
>> > Thanks!
>> >
>> > John
>>
>
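
To make the TL;DR above concrete, here is a minimal sketch of dead-letter
routing in plain Python. It does not use the Beam API or the design in the
linked doc; the function name and the (record, reason) shape of the
dead-letter entries are assumptions chosen for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def process_with_dead_letters(
    records: Iterable[str],
    parse: Callable[[str], int],
) -> Tuple[List[int], List[Tuple[str, str]]]:
    """Apply `parse` to each record, routing failures to a dead-letter list."""
    good: List[int] = []
    dead_letters: List[Tuple[str, str]] = []
    for record in records:
        try:
            good.append(parse(record))
        except Exception as error:  # broad on purpose: any bad record is rerouted
            # Keep the original record and the reason so it can be inspected or replayed.
            dead_letters.append((record, repr(error)))
    return good, dead_letters


good, dead = process_with_dead_letters(["1", "2", "oops", "4"], int)
print(good)
print(dead)
```

The point of the pattern is that one bad record neither stops the run nor gets
retried forever; it is set aside with enough context to handle later.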


Re: [VOTE] Release 2.52.0, release candidate #4

2023-11-12 Thread Austin Bennett
Danny:

The rc # differs between the subject [ #4 ] and the first sentence of your
email [ #3 ].

I think we can assume this is a VOTE to match the subject and is for RC #4,
but wanted to call that out.

Cheers,
Austin



On Sat, Nov 11, 2023, 3:44 PM Danny McCormick via dev 
wrote:

> Hi everyone,
> Please review and vote on the release candidate #3 for the version 2.52.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> Reviewers are encouraged to test their own use cases with the release
> candidate, and vote +1 if no issues are found. Only PMC member votes will
> count towards the final vote, but votes from all community members are
> encouraged and helpful for finding regressions; you can either test your
> own use cases or use cases from the validation sheet [10].
>
> The complete staging area is available for your review, which includes:
>
>- GitHub Release notes [1]
>- the official Apache source release to be deployed to dist.apache.org [2],
>which is signed with the key with fingerprint D20316F712213422 [3]
>- all artifacts to be deployed to the Maven Central Repository [4]
>- source code tag "v2.52.0-RC4" [5]
>- website pull request listing the release [6], the blog post [6], and
>publishing the API reference manual [7]
>- Python artifacts are deployed along with the source release to the
>dist.apache.org [2] and PyPI[8].
>- Go artifacts and documentation are available at pkg.go.dev [9]
>- Validation sheet with a tab for 2.52.0 release to help with
>validation [10]
>- Docker images published to Docker Hub [11]
>- PR to run tests against release branch [12]
>
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> For guidelines on how to try the release in your projects, check out our
> blog post at https://beam.apache.org/blog/validate-beam-release/.
>
> Thanks,
> Danny
>
> [1] https://github.com/apache/beam/milestone/16
> [2] https://dist.apache.org/repos/dist/dev/beam/2.52.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1362/
> [5] https://github.com/apache/beam/tree/v2.52.0-RC4
> [6] https://github.com/apache/beam/pull/29331
> [7] https://github.com/apache/beam-site/pull/654
> [8] https://pypi.org/project/apache-beam/2.52.0rc4/
> [9]
> https://pkg.go.dev/github.com/apache/beam/sdks/v2@v2.52.0-RC4/go/pkg/beam
> [10]
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1387982510
> [11] https://hub.docker.com/search?q=apache%2Fbeam=image
> [12] https://github.com/apache/beam/pull/29404
>


Lakehouse Formats with IO/Integration --> Hudi? Iceberg?

2023-11-06 Thread Austin Bennett
Beam Devs,

I was looking through GH Issues and online more generally and hadn't seen
much...  Has anyone written any Beam IO or other integration for writing to
[ or reading from ] either Hudi or Iceberg?

Any experience that can be shared [ on list, else feel free to message me
off list and I'll also share back what is OK to be shared ]?

Thanks!
Austin


Re: [Discuss] Idea to increase RC voting participation

2023-10-17 Thread Austin Bennett
Great effort.  I'm also interested in streamlining releases -- so if there
are a lot of manual tests that could be automated, it would be great
to discover and then look to address them.

On Tue, Oct 17, 2023 at 8:47 AM Robert Bradshaw via dev 
wrote:

> +1
>
> I would also strongly suggest that people try out the release against
> their own codebases. This has the benefit of ensuring the release won't
> break your own code when they go out, and stress-tests the new code against
> real-world pipelines. (Ideally our own tests are all passing, and this
> validation is automated as much as possible (though ensuring it matches our
> documentation and works in a clean environment still has value), but
> there's a lot of code and uses out there that we don't have access to
> during normal Beam development.)
>
> On Tue, Oct 17, 2023 at 8:21 AM Svetak Sundhar via dev <
> dev@beam.apache.org> wrote:
>
>> Hi all,
>>
>> I’ve participated in RC testing for a few releases and have observed a
>> bit of a knowledge gap in how releases can be tested. Given that Beam
>> encourages contributors to vote on RC’s regardless of tenure, and that
>> voting on an RC is a relatively low-effort, high leverage way to influence
>> the release of the library, I propose the following:
>>
>> During the vote for the next release, voters can document the process
>> they followed on a separate document, and add the link on column G here
>> .
>> One step further, could be a screencast of running the test, and attaching
>> a link of that.
>>
>> We can keep repeating this through releases until we have documentation
>> for many of the different tests. We can then add these docs into the repo.
>>
>> I’m proposing this because I’ve gathered the following feedback from
>> colleagues that are tangentially involved with Beam: They are interested in
>> participating in release validation, but don’t know how to get started.
>> Happy to hear other suggestions too, if there are any to address the
>> above.
>>
>> Thanks,
>>
>>
>> Svetak Sundhar
>>
>>   Data Engineer
>> s vetaksund...@google.com
>>
>>


Re: [ANNOUNCE] New Committer: Sam Whittle

2023-10-16 Thread Austin Bennett
Thanks, Sam!

On Mon, Oct 16, 2023 at 12:39 PM XQ Hu via dev  wrote:

> Congratulations!
>
> On Mon, Oct 16, 2023 at 1:58 PM Ahmet Altay via dev 
> wrote:
>
>> Congratulations Sam!
>>
>> On Mon, Oct 16, 2023 at 10:42 AM Byron Ellis via dev 
>> wrote:
>>
>>> Congrats Sam!
>>>
>>> On Mon, Oct 16, 2023 at 10:32 AM Chamikara Jayalath via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Congrats Sam!
>>>>
>>>> On Mon, Oct 16, 2023 at 9:32 AM Kenneth Knowles
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>> committer: Sam Whittle (scwhit...@apache.org).
>>>>>
>>>>> Sam has been contributing to Beam since 2016! In particular, he
>>>>> specializes in streaming and the Dataflow Java worker but his contributions
>>>>> expand naturally from there to the Java SDK, IOs, and even a bit of Python
>>>>> :-). Sam has contributed a ton of code over the years and is generous in
>>>>> code review and sharing his expertise.
>>>>>
>>>>> Considering his contributions to the project over this timeframe, the
>>>>> Beam PMC trusts Sam with the responsibilities of a Beam committer. [1]
>>>>>
>>>>> Thank you Sam! And we are looking to see more of your contributions!
>>>>>
>>>>> Kenn, on behalf of the Apache Beam PMC
>>>>>
>>>>> [1]
>>>>>
>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>



Re: [ANNOUNCE] New Committer: Byron Ellis

2023-10-16 Thread Austin Bennett
thanks, Byron!

On Mon, Oct 16, 2023 at 12:38 PM XQ Hu via dev  wrote:

> Congratulations!
>
> On Mon, Oct 16, 2023 at 1:58 PM Ahmet Altay via dev 
> wrote:
>
>> Congratulations Byron!
>>
>> On Mon, Oct 16, 2023 at 10:35 AM Tomo Suzuki via dev 
>> wrote:
>>
>>> Congratulations!
>>>
>>>
>>> On Mon, Oct 16, 2023 at 1:33 PM Chamikara Jayalath via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Congrats Byron!
>>>>
>>>> On Mon, Oct 16, 2023 at 9:32 AM Kenneth Knowles
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>> committer: Byron Ellis (b...@apache.org).
>>>>>
>>>>> Byron has been with Beam for over a year now. You may all know him as
>>>>> the guy who just decided to write a Swift SDK :-). In addition to that big
>>>>> contribution Byron has also fixed plenty of bugs, prototyped DBT-style
>>>>> pipeline authoring, and participated in our collective decision-making
>>>>> process.
>>>>>
>>>>> Considering his contributions to the project over this timeframe, the
>>>>> Beam PMC trusts Byron with the responsibilities of a Beam committer.
>>>>> [1]
>>>>>
>>>>> Thank you Byron! And we are looking to see more of your contributions!
>>>>>
>>>>> Kenn, on behalf of the Apache Beam PMC
>>>>>
>>>>> [1]
>>>>>
>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>

>>>
>>> --
>>> Regards,
>>> Tomo
>>>
>>


Re: [ANNOUNCE] New PMC Member: Valentyn Tymofieiev

2023-10-03 Thread Austin Bennett
Thanks for everything @Valentyn Tymofieiev  !

On Tue, Oct 3, 2023 at 12:53 PM Ahmed Abualsaud 
wrote:

> Congrats Valentyn!
>
> On 2023/10/03 18:39:49 Kenneth Knowles wrote:
> > Hi all,
> >
> > Please join me and the rest of the Beam PMC in welcoming Valentyn
> > Tymofieiev  as our newest PMC member.
> >
> > Valentyn has been contributing to Beam since 2017. Notable highlights
> > include his work on the Python SDK and also in our container management.
> > Valentyn also is involved in many discussions around Beam's infrastructure
> > and community processes. If you look through Valentyn's history, you will
> > see an abundance of the most critical maintenance work that is the beating
> > heart of any project.
> >
> > Congratulations Valentyn and thanks for being a part of Apache Beam!
> >
> > Kenn, on behalf of the Beam PMC (which now includes Valentyn)
> >
>


Re: [ANNOUNCE] New PMC Member: Robert Burke

2023-10-03 Thread Austin Bennett
Thanks for all you do @Robert Burke  !

On Tue, Oct 3, 2023 at 12:53 PM Ahmed Abualsaud 
wrote:

> Congrats Rebo!
>
> On 2023/10/03 18:39:47 Kenneth Knowles wrote:
> > Hi all,
> >
> > Please join me and the rest of the Beam PMC in welcoming Robert Burke <
> > lostl...@apache.org> as our newest PMC member.
> >
> > Robert has been a part of the Beam community since 2017. He is our resident
> > Gopher, producing the Go SDK and most recently the local, portable, Prism
> > runner. Robert has presented on Beam many times, having written not just
> > core Beam code but quite interesting pipelines too :-)
> >
> > Congratulations Robert and thanks for being a part of Apache Beam!
> >
> > Kenn, on behalf of the Beam PMC (which now includes Robert)
> >
>


Re: [ANNOUNCE] New PMC Member: Alex Van Boxel

2023-10-03 Thread Austin Bennett
Thanks for all you do, @Alex Van Boxel  !

On Tue, Oct 3, 2023 at 12:50 PM Ahmed Abualsaud via dev 
wrote:

> Congratulations!
>
> On Tue, Oct 3, 2023 at 3:48 PM Byron Ellis via dev 
> wrote:
>
>> Congrats!
>>
>> On Tue, Oct 3, 2023 at 12:40 PM Danielle Syse via dev <
>> dev@beam.apache.org> wrote:
>>
>>> Congratulations Alex!! Definitely well deserved!
>>>
>>> On Tue, Oct 3, 2023 at 2:57 PM Ahmet Altay via dev 
>>> wrote:
>>>
>>>> Congratulations Alex! Well deserved!
>>>>
>>>> On Tue, Oct 3, 2023 at 11:54 AM Ritesh Ghorse via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> Congratulations Alex!
>>>>>
>>>>> On Tue, Oct 3, 2023 at 2:54 PM Danny McCormick via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>> Congrats Alex, this is well deserved!
>>>>>>
>>>>>> On Tue, Oct 3, 2023 at 2:50 PM Jack McCluskey via dev <
>>>>>> dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> Congrats, Alex!
>>>>>>>
>>>>>>> On Tue, Oct 3, 2023 at 2:49 PM XQ Hu via dev
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Congratulations, Alex!
>>>>>>>>
>>>>>>>> On Tue, Oct 3, 2023 at 2:40 PM Kenneth Knowles
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please join me and the rest of the Beam PMC in welcoming Alex Van
>>>>>>>>> Boxel  as our newest PMC member.
>>>>>>>>>
>>>>>>>>> Alex has been with Beam since 2016, very early in the life of the
>>>>>>>>> project. Alex has contributed code, design ideas, and perhaps most
>>>>>>>>> importantly been a huge part of organizing Beam Summits, and of course
>>>>>>>>> presenting at them as well. Alex really brings the ASF community spirit to
>>>>>>>>> Beam.
>>>>>>>>>
>>>>>>>>> Congratulations Alex and thanks for being a part of Apache Beam!
>>>>>>>>>
>>>>>>>>> Kenn, on behalf of the Beam PMC (which now includes Alex)
>>>>>>>>>



Re: Contribution of Asgarde: Error Handling for Beam?

2023-09-20 Thread Austin Bennett
It sounds like there is sufficient support for adding error-handling/DLQ
into Beam; that will be a great boost for users.

I'll start to take a look at formalizing some sort of BIP / Design doc, and
once that takes shape, will surface on list [ am out of my normal routine,
traveling for work, so that might take some time ... so anyone sufficiently
interested, do not feel blocked by me ].

I'm wondering about even intermixing (a) and (b).  Ex: short-run we could
look to define an API surface in the Beam Core API that we are happy with,
which *could* rely on asgarde [ in github.com/apache/beam-asgarde ] in the
short run.



On Wed, Sep 13, 2023 at 5:48 PM Alexey Romanenko 
wrote:

> I agree with Cham on these two options.
>
> In the end, it would be great to have such functionality (error handling /
> DLQ) integrated into Beam core API, but it will require, for sure, some
> technical discussions and reviews before - so it will take more time.
>
> Though, to make it available for users soon as a part of Beam
> distribution, adding this as an extension looks very feasible for me.
>
> —
> Alexey
>
> On 12 Sep 2023, at 19:44, Chamikara Jayalath via dev 
> wrote:
>
> Thanks Mazlum, this sounds great. I think there are two ways we can
> proceed if we decide to integrate the Asgarde library into Beam.
>
> (1) Directly import the code into Beam without significant modifications
> and/or a review (though we may add tests).
>
> (2) Go through a design/code review to determine whether this is the best
> approach for implementing error handling / DLQ in Beam transforms or
> whether there are other alternatives/modifications to Asgarde we want to
> consider.
>
> If we do (1) I prefer adding Asgarde as a separate Gradle module in Beam.
> We can later integrate it into the core module after a design/code review.
>
> Thanks,
> Cham
>
>
>
> On Tue, Sep 12, 2023 at 10:26 AM Mazlum TOSUN 
> wrote:
>
>> Hello Austin and everyone,
>>
>> I am open for discussion.
>>
>> My first intention with Asgarde was to help the Beam community, because
>> Dead Letter Queue is so important in Beam and all the data pipeline
>> frameworks.
>> When I worked with Beam on production with my customers, we needed to
>> catch errors with side outputs and dead letter queue.
>>
>> This library really helped us to keep a less verbose code while applying
>> all the error handling logic, that is error prone and verbose if it is
>> repeated.
>>
>> As Kenneth said, my intention was to stay as close as possible to Beam,
>> with a Wrapper and a Failure Monad on top of a PCollection, to handle all
>> the code and complexity for try catch blocks and side output.
>>
>> For the governance, even if I am the creator of this library, the most
>> important isn't me but the community and to help the community.
>> If the best solution to help the community is including the library
>> directly on Beam, we can go in this direction, with of course your reviews
>> and recommendations.
>>
>> Then the library will belong to the community and we will continue to
>> improve it.
>>
>> For the decision about the best place, I will comply with the majority.
>>
>> Best regards,
>>
>> Mazlum
>>
>> On Mon, Sep 11, 2023 at 11:15 PM Austin Bennett 
>> wrote:
>>
>>> @Mazlum TOSUN  --  you and I have spoken a few
>>> times about this.  it'd be good for you to comment here on list, on any of
>>> your concerns with governance, and/or other thoughts.  Ex: if you think
>>> contributing asgarde directly is the thing [ or perhaps expressing any
>>> interest helping write/contribute the relevant functionality into beam ...
>>> it is possible that by adding the actual functionality into beam - like
>>> Kenn's mentioned 'other place' we could make asgarde as an separate add-on
>>> obsolete ].
>>>
>>>
>>>
>>> On Fri, Sep 8, 2023 at 8:55 AM Kenneth Knowles  wrote:
>>>
>>>> For anyone who hasn't clicked over to Asgarde, my TL;DR description of
>>>> it is that it adds the "failure monad" aka "andThen" style error/result
>>>> handling on top of chaining of PCollections. So it is at a similar level of
>>>> abstraction of our basic transforms and generally useful for chaining
>>>> dead-letter side outputs. It is no more or less appropriate for the core
>>>> SDK than, say, the Project/Filter/Join transforms, or Watch, etc. If we
>>>> actually aspired to have a thin core with the accessories like that in
>>>> another place, then it should go to that other

Re: Contribution of Asgarde: Error Handling for Beam?

2023-09-11 Thread Austin Bennett
 a great library. I'm on the fence of whether it
>>>>>>> makes sense to include with Beam proper vs. be a library that builds on top
>>>>>>> of Beam. (Would there be benefits of tighter integration? There is the
>>>>>>> maintenance/loss of governance issue.) I am definitely not on the side that
>>>>>>> the entire Beam ecosystem needs to be distributed/maintained by Beam
>>>>>>> itself.
>>>>>>>
>>>>>>> Regardless of the direction we go, I think it could make a lot of
>>>>>>> sense to put pointers to it in our documentation.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 5, 2023 at 7:21 AM Danny McCormick via dev <
>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>
>>>>>>>> I think my only concerns here are around the toil we'll be taking
>>>>>>>> on, and will we be leaving the asgarde project in a better or worse place.
>>>>>>>>
>>>>>>>> From a release standpoint, we would need to release it with the
>>>>>>>> same cadence as Beam. Adding asgarde into our standard release process
>>>>>>>> seems fairly straightforward, though, so I'm not too worried about it -
>>>>>>>> looks like it's basically (1) add a commit like this
>>>>>>>> <https://github.com/tosun-si/asgarde/commit/432de527d67dc71f06507328319b466b6d0fb56a>,
>>>>>>>> (2) run this workflow
>>>>>>>> <https://github.com/tosun-si/asgarde/blob/main/.github/workflows/publish-project.yml>,
>>>>>>>> and (3) tag/mark the release as released on GitHub.
>>>>>>>>
>>>>>>>> In terms of bug fixes and improvements, though, I'm a little
>>>>>>>> worried that we might be leaving things in a worse state since Mazlum has
>>>>>>>> been the only contributor thus far, and he would lose some governance (and
>>>>>>>> possibly the ability to commit code on his own). An extra motivated
>>>>>>>> community member or two could change the math a bit, but I'm not sure if
>>>>>>>> there are actually clear advantages to including it in Apache other than
>>>>>>>> visibility. Would adding links to our docs calling Asgarde out as an option
>>>>>>>> accomplish the same purpose?
>>>>>>>>
>>>>>>>> > Let's be careful about whether these tests are included in our
>>>>>>>> presubmits. Contrib code with flaky tests has been a major pain point in
>>>>>>>> the past.
>>>>>>>>
>>>>>>>> +1 - I think if we do this I'd vote that it be in a separate repo (
>>>>>>>> github.com/apache/beam-asgarde made sense to me).
>>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> Overall, I'm probably a slight -1 to adding this to the Apache
>>>>>>>> workspace, but +1 to at least adding links from the Beam docs to Asgarde.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Danny
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Sep 5, 2023 at 12:03 AM Reuven Lax via dev <
>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>>> Let's be careful about whether these tests are included in our
>>>>>>>>> presubmits. Contrib code with flaky tests has been a major pain point in
>>>>>>>>> the past.
>>>>>>>>>
>>>>>>>>> On Sat, Sep 2, 2023 at 12:02 PM Austin Bennett 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Wanting us to not miss this. @Mazlum TOSUN
>>>>>>>>>>  is happy to donate Asgarde to
>>>>>>>>>> our project.
>>>>>>>>>>
>>>>>>>>>> It looks like he'd need a SGA and CCLA [ 1 ] on file; anything
>>>>>>>>>> else?
>>>>>>>>>>
>>>>>>>>>> I recalled the donation of Euphoria [ 2 ] , so I looked at those
>>>>>>>>>> threads [ 3 ]  for insights into the process.  It didn't look like there
>>>>>>>>>> was a needed VOTE, so mostly a matter of ensuring necessary signatures, and
>>>>>>>>>> ideally some sort of consensus [ or non-opposition ] to the donation.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [ 1 ] https://www.apache.org/licenses/contributor-agreements.html
>>>>>>>>>> [ 2 ] https://beam.apache.org/documentation/sdks/java/euphoria/
>>>>>>>>>> [ 3 ]
>>>>>>>>>> https://lists.apache.org/thread/xzlx4rm2tvc36mmwvhyvtdvsw7bnjscp
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 15, 2023 at 7:05 AM Kerry Donny-Clark via dev <
>>>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> This looks like an excellent contribution. I can easily
>>>>>>>>>>> understand the motivation, and I think Beam would benefit from a higher
>>>>>>>>>>> level abstraction for error handling.
>>>>>>>>>>> Kerry
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 14, 2023, 6:31 PM Austin Bennett 
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Beam Devs,
>>>>>>>>>>>>
>>>>>>>>>>>> @Mazlum <https://www.linkedin.com/in/mazlum-tosun-900b1812/>
>>>>>>>>>>>> was suggested to consider donating Asgarde
>>>>>>>>>>>> <https://github.com/tosun-si/asgarde> to Beam for Java/Kotlin
>>>>>>>>>>>> error handling to Beam [ see:
>>>>>>>>>>>> https://2022.beamsummit.org/sessions/error-handling-asgarde/
>>>>>>>>>>>> for last year's Beam Summit talk ], he is also the author of
>>>>>>>>>>>> Pasgarde <https://github.com/tosun-si/pasgarde> [ for Python ]
>>>>>>>>>>>> and Milgard [ for a simplified Kotlin API ].
>>>>>>>>>>>>
>>>>>>>>>>>> Would Asgarde be a good contribution, something the Beam
>>>>>>>>>>>> community would be willing to accept?  I imagine we might want it to live
>>>>>>>>>>>> at github.com/apache/beam-asgarde ?  Or perhaps there is a
>>>>>>>>>>>> good place in github.com/apache/beam ??
>>>>>>>>>>>>
>>>>>>>>>>>> Especially once/if officially part of Beam, I imagine we'd add
>>>>>>>>>>>> follow-up items like getting onto the website/docs, and related.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Austin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> P.S.  This might warrant separate/additional conversations for
>>>>>>>>>>>> his other libraries, but let's focus any discussion on Asgarde for
>>>>>>>>>>>> now?
>>>>>>>>>>>>
>>>>>>>>>>>


Contribution of Asgarde: Error Handling for Beam?

2023-09-02 Thread Austin Bennett
Wanting us to not miss this. @Mazlum TOSUN is happy to donate Asgarde to
our project.

It looks like he'd need an SGA and CCLA [ 1 ] on file; anything else?

I recalled the donation of Euphoria [ 2 ], so I looked at those threads
[ 3 ] for insights into the process.  It didn't look like a VOTE was
needed, so it is mostly a matter of ensuring the necessary signatures, and
ideally some sort of consensus [ or non-opposition ] to the donation.


[ 1 ] https://www.apache.org/licenses/contributor-agreements.html
[ 2 ] https://beam.apache.org/documentation/sdks/java/euphoria/
[ 3 ] https://lists.apache.org/thread/xzlx4rm2tvc36mmwvhyvtdvsw7bnjscp



On Thu, Jun 15, 2023 at 7:05 AM Kerry Donny-Clark via dev <
dev@beam.apache.org> wrote:

> This looks like an excellent contribution. I can easily understand the
> motivation, and I think Beam would benefit from a higher level abstraction
> for error handling.
> Kerry
>
> On Wed, Jun 14, 2023, 6:31 PM Austin Bennett  wrote:
>
>> Hi Beam Devs,
>>
>> @Mazlum <https://www.linkedin.com/in/mazlum-tosun-900b1812/> was
>> suggested to consider donating Asgarde
>> <https://github.com/tosun-si/asgarde> to Beam for Java/Kotlin error
>> handling to Beam [ see:
>> https://2022.beamsummit.org/sessions/error-handling-asgarde/ for last
>> year's Beam Summit talk ], he is also the author of Pasgarde
>> <https://github.com/tosun-si/pasgarde> [ for Python ] and Milgard [ for
>> a simplified Kotlin API ].
>>
>> Would Asgarde be a good contribution, something the Beam community would
>> be willing to accept?  I imagine we might want it to live at
>> github.com/apache/beam-asgarde ?  Or perhaps there is a good place in
>> github.com/apache/beam ??
>>
>> Especially once/if officially part of Beam, I imagine we'd add follow-up
>> items like getting onto the website/docs, and related.
>>
>> Cheers,
>> Austin
>>
>>
>> P.S.  This might warrant separate/additional conversations for his other
>> libraries, but let's focus any discussion on Asgarde for now?
>>
>


Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-08-28 Thread Austin Bennett
I've thought about this a ton, but haven't been in a position to undertake
the work.  Thanks for bringing this up, @Anand Inguva!

I'd point us to https://python-poetry.org/ ... [ which is where I'd look to
take us, but I'm also not able to do all the work, so my
suggestion/preference doesn't matter that much ]

https://python-poetry.org/docs/pyproject#the-pyprojecttoml-file <- for info
on pyproject.toml file.

Note that the use of a 'lock' file is very valuable, ex:
https://python-poetry.org/docs/basic-usage/#committing-your-poetrylock-file-to-version-control

I haven't come across `build`; that might be great too.  I'd highlight that
Poetry is pretty common across industry these days: rock-solid, with an
ecosystem of interoperability, users, etc.  If not familiar, PLEASE have a
look at it.




On Mon, Aug 28, 2023 at 8:04 AM Kerry Donny-Clark via dev <
dev@beam.apache.org> wrote:

> +1
> Hi Anand,
> I appreciate this effort. Managing python dependencies has been a major
> pain point for me, and I think this approach would help.
> Kerry
>
> On Mon, Aug 28, 2023 at 10:14 AM Anand Inguva via dev 
> wrote:
>
>> Hello Beam Dev Team,
>>
>> I've compiled a design document [1] proposing the integration of
>> pyproject.toml into Apache Beam's Python build process. Your insights and
>> feedback would be invaluable.
>>
>> What is pyproject.toml?
>> pyproject.toml is a configuration file that specifies a project's build
>> dependencies and other project-related metadata in a standardized
>> format. Before pyproject.toml, Python projects often had multiple
>> configuration files (like setup.py, setup.cfg, and requirements.txt).
>> pyproject.toml aims to centralize these configurations into one place,
>> making project setups more organized and straightforward. One of the
>> significant features enabled by pyproject.toml is the ability to perform
>> isolated builds. This ensures that build dependencies are separated from
>> the project's runtime dependencies, leading to more consistent and
>> reproducible builds.
>>
>> [1]
>> https://docs.google.com/document/d/17-y48WW25-VGBWZNyTdoN0WUN03k9ZhJjLp9wtyG1Wc/edit#heading=h.wskna8eurvjv
>>
>> Thanks,
>> Anand
>>
>


Asgarde: Error Handling for Beam?

2023-06-14 Thread Austin Bennett
Hi Beam Devs,

@Mazlum <https://www.linkedin.com/in/mazlum-tosun-900b1812/> was encouraged
to consider donating Asgarde <https://github.com/tosun-si/asgarde> to Beam
for Java/Kotlin error handling [ see:
https://2022.beamsummit.org/sessions/error-handling-asgarde/ for last
year's Beam Summit talk ].  He is also the author of Pasgarde
<https://github.com/tosun-si/pasgarde> [ for Python ] and Milgard [ for a
simplified Kotlin API ].

Would Asgarde be a good contribution, something the Beam community would be
willing to accept?  I imagine we might want it to live at
github.com/apache/beam-asgarde ?  Or perhaps there is a good place in
github.com/apache/beam ??

Especially once/if officially part of Beam, I imagine we'd add follow-up
items like getting onto the website/docs, and related.

Cheers,
Austin


P.S.  This might warrant separate/additional conversations for his other
libraries, but let's focus any discussion on Asgarde for now?


Re: [beam-starter-typescript]: Missing place to create issue

2023-06-14 Thread Austin Bennett
A few additional thoughts:

*  @Anyone --> Should each starter repo allow issues?  Or, better to file
issues in https://github.com/apache/beam/issues ?

* @david-kh...@hotmail.com -- I'd say in general PRs are likely welcome.

* Seems like Contributing.md should get updated.  The text linking to
issues actually takes people to PRs.



On Wed, Jun 14, 2023 at 10:26 AM Kerry Donny-Clark via dev <
dev@beam.apache.org> wrote:

> Jack may also be able to help you create an issue.
> Kerry
>
> On Wed, Jun 14, 2023, 1:09 PM XQ Hu via dev  wrote:
>
>> I believe Robert is the owner for that project.
>>
>> On Mon, Jun 12, 2023 at 11:30 PM david-kh...@hotmail.com <
>> david-kh...@hotmail.com> wrote:
>>
>>> Hi Beam community,
>>>
>>>
>>>
>>> I am David and new to the community. After trying to tweak some code from
>>> beam-starter-ts, I found some issues that I want to raise. But there is
>>> no way for me to create a GitHub issue in the same project,
>>> apache/beam-starter-typescript (github.com).
>>>
>>> I also double-checked Contribute.md and still have no idea.
>>>
>>> Would you mind guiding me to the right path?
>>>
>>>
>>>
>>> Regards,
>>>
>>> David L.
>>>
>>


Re: [VOTE] Release 2.47.0, release candidate #3

2023-05-06 Thread Austin Bennett
+1 ( non-binding )

On Fri, May 5, 2023 at 10:49 PM Jean-Baptiste Onofré 
wrote:

> +1 (binding)
>
> Regards
> JB
>
> On Fri, May 5, 2023 at 4:52 AM Jack McCluskey via dev 
> wrote:
>
>> Hi everyone,
>>
>> Please review and vote on the release candidate #3 for the version
>> 2.47.0, as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>>
>> Reviewers are encouraged to test their own use cases with the release
>> candidate, and vote +1 if no issues are found. *Non-PMC members are
>> allowed and encouraged to vote. Please help validate the release for your
>> use case!*
>>
>> The complete staging area is available for your review, which includes:
>> * GitHub Release notes [1],
>> * the official Apache source release to be deployed to dist.apache.org [2],
>> which is signed with the key with fingerprint DF3CBA4F3F4199F4 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "v2.47.0-RC3" [5],
>> * website pull request listing the release [6], the blog post [6], and
>> publishing the API reference manual [7].
>> * Java artifacts were built with Gradle 7.5.1 and OpenJDK/Oracle JDK
>> 8.0.322.
>> * Python artifacts are deployed along with the source release to the
>> dist.apache.org [2] and PyPI[8].
>> * Go artifacts and documentation are available at pkg.go.dev [9]
>> * Validation sheet with a tab for 2.47.0 release to help with validation
>> [10].
>> * Docker images published to Docker Hub [11].
>> * PR to run tests against release branch [12].
>>
>> The vote will be open for at least 72 hours. It is adopted by majority
>> approval, with at least 3 PMC affirmative votes.
>>
>> The GCR copies of the FnAPI containers are rolling out now, they should
>> be out within the next 8 hours or so.
>>
>> For guidelines on how to try the release in your projects, check out our
>> blog post at /blog/validate-beam-release/.
>>
>> Thanks,
>>
>> Jack McCluskey
>>
>> [1] https://github.com/apache/beam/milestone/10
>> [2] https://dist.apache.org/repos/dist/dev/beam/2.47.0/
>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>> [4]
>> https://repository.apache.org/content/repositories/orgapachebeam-1322/
>> [5] https://github.com/apache/beam/tree/v2.47.0-RC3
>> [6] https://github.com/apache/beam/pull/26439
>> [7] https://github.com/apache/beam-site/pull/644
>> [8] https://pypi.org/project/apache-beam/2.47.0rc3/
>> [9]
>> https://pkg.go.dev/github.com/apache/beam/sdks/v2@v2.47.0-RC3/go/pkg/beam
>> [10]
>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=.
>> ..
>> [11] https://hub.docker.com/search?q=apache%2Fbeam&type=image
>> [12] https://github.com/apache/beam/pull/26152
>>
>> --
>>
>>
>> Jack McCluskey
>> SWE - DataPLS PLAT/ Dataflow ML
>> RDU
>> jrmcclus...@google.com
>>
>>
>>


Re: [LAZY CONSENSUS] Drop @Experimental annotations and concept from Beam

2023-05-01 Thread Austin Bennett
Great, thanks, @Kenneth Knowles!

On Mon, May 1, 2023 at 10:12 AM Kenneth Knowles  wrote:

> We are well past the lazy consensus. I will remove the @Experimental
> annotations and concept from Beam.
>
> Kenn
>
> On Tue, Apr 25, 2023 at 3:40 PM Kenneth Knowles  wrote:
>
>> Hello!
>>
>> I propose to drop @Experimental annotations and concept from Beam.
>> Discussion occurred at
>> https://lists.apache.org/thread/tvvdckdom8jtv2xr9mzg0ltjjpbmydrv.
>>
>> Once approved, I will make the code changes to eliminate the annotation.
>>
>> If no one has an objection or further discussion needed in 72 hours, it
>> can be considered approved. See
>> https://community.apache.org/committers/lazyConsensus.html
>>
>> Kenn
>>
>


Re: [ANNOUNCE] New committer: Anand Inguva

2023-04-25 Thread Austin Bennett
Thanks, and Congratulations, Anand!

On Tue, Apr 25, 2023 at 8:12 AM Reza Rokni via dev 
wrote:

> Congratulations!
>
> On Tue, Apr 25, 2023 at 3:48 AM Alexey Romanenko 
> wrote:
>
>> Congratulations, Anand! Well deserved!
>>
>> On 25 Apr 2023, at 06:02, Byron Ellis via dev 
>> wrote:
>>
>> Congrats Anand!
>>
>> On Mon, Apr 24, 2023 at 9:54 AM Ahmet Altay via dev 
>> wrote:
>>
>>> Congratulations Anand!
>>>
>>> On Mon, Apr 24, 2023 at 8:05 AM Kerry Donny-Clark via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Great work Anand, this is well deserved.
>>>>
>>>>
>>>> On Mon, Apr 24, 2023 at 10:35 AM Yi Hu via dev wrote:
>>>>
>>>>> Congrats Anand!
>>>>>
>>>>> On Fri, Apr 21, 2023 at 3:54 PM Danielle Syse via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>> Congratulations!
>>>>>>
>>>>>> On Fri, Apr 21, 2023 at 3:53 PM Damon Douglas via dev <
>>>>>> dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> Congratulations Anand!
>>>>>>>
>>>>>>> On Fri, Apr 21, 2023 at 12:28 PM Ritesh Ghorse via dev <
>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>
>>>>>>>> Congratulations Anand!
>>>>>>>>
>>>>>>>> On Fri, Apr 21, 2023 at 3:24 PM Ahmed Abualsaud via dev <
>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>>> Congrats Anand!
>>>>>>>>>
>>>>>>>>> On Fri, Apr 21, 2023 at 3:18 PM Anand Inguva via dev <
>>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks everyone. Really excited to be a part of Beam Committers.
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 21, 2023 at 3:07 PM XQ Hu via dev <
>>>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Congratulations, Anand!!!
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 21, 2023 at 2:31 PM Jack McCluskey via dev <
>>>>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Congratulations, Anand!
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Apr 21, 2023 at 2:28 PM Valentyn Tymofieiev via dev <
>>>>>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Congratulations!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Apr 21, 2023 at 8:19 PM Jan Lukavský
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Congrats Anand!
>>>>>>>>>>>>>> On 4/21/23 20:05, Robert Burke wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Congratulations Anand!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Apr 21, 2023, 10:55 AM Danny McCormick via dev <
>>>>>>>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Woohoo, congrats Anand! This is very well deserved!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Apr 21, 2023 at 1:54 PM Chamikara Jayalath <
>>>>>>>>>>>>>>> chamik...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please join me and the rest of the Beam PMC in welcoming a
>>>>>>>>>>>>>>>> new committer: Anand Inguva (ananding...@apache.org)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Anand has been contributing to Apache Beam for more than a
>>>>>>>>>>>>>>>> year and authored and reviewed more than 100 PRs. Anand has
>>>>>>>>>>>>>>>> been a core contributor to Beam Python SDK and drove the
>>>>>>>>>>>>>>>> efforts to support Python 3.10 and Python 3.11.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Considering their contributions to the project over this
>>>>>>>>>>>>>>>> timeframe, the Beam PMC trusts Anand with the responsibilities
>>>>>>>>>>>>>>>> of a Beam committer. [1]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you Anand! And we are looking to see more of your
>>>>>>>>>>>>>>>> contributions!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cham, on behalf of the Apache Beam PMC
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>>>>>>>>>>>>
>>>
>>


Re: [ANNOUNCE] New committer: Damon Douglas

2023-04-24 Thread Austin Bennett
Thanks for all you do, @Damon Douglas!

On Mon, Apr 24, 2023 at 1:00 PM Robert Burke  wrote:

> Congratulations Damon!!!
>
> On Mon, Apr 24, 2023, 12:52 PM Kenneth Knowles  wrote:
>
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>> Damon Douglas (damondoug...@apache.org)
>>
>> Damon has contributed widely: Beam Katas, playground, infrastructure, and
>> many IO connectors. Damon does lots of code review in addition to code.
>> (yes, you can review code as a non-committer!)
>>
>> Considering their contributions to the project over this timeframe, the
>> Beam PMC trusts Damon with the responsibilities of a Beam committer. [1]
>>
>> Thank you Damon! And we are looking to see more of your contributions!
>>
>> Kenn, on behalf of the Apache Beam PMC
>>
>> [1]
>>
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: new contributor messaging: behaviorbot/welcome

2023-03-06 Thread Austin Bennett
Nudge on https://github.com/apache/beam/pull/25586 ...

Can a PMC member install the bot [ or work with infra to make that happen,
ex: via https://github.com/apps/welcome/installations/new ]?  I'd be happy
to, but do not believe I have those permissions - do advise if I should
message/create-tickets and copy any individual from PMC specifically.  Once
that's done, we can merge the code for the bot to be configured - imagining
that is a better second step, so we do not have code in the codebase that
doesn't do anything.


On Tue, Feb 21, 2023 at 8:42 PM Austin Bennett  wrote:

> A PR: https://github.com/apache/beam/pull/25586
>
> text could likely be improved ( open to suggestions/changes ), but this
> captures at least the intent.
>
> For this to work, we need to install the bot as also mentioned in the PR.
>
>
>
> On Tue, Feb 21, 2023 at 6:02 PM Robert Burke  wrote:
>
>> I agree that the bot is better than nothing at all.
>>
>> +1 to getting a PR with messaging out for review.
>>
>> On Tue, Feb 21, 2023, 5:29 PM Robert Bradshaw via dev <
>> dev@beam.apache.org> wrote:
>>
>>> FWIW, I'm generally in favor of such a bot. I think it really boils
>>> down to a concrete proposal of what the content (and triggers) would
>>> be.
>>>
>>> On Tue, Feb 21, 2023 at 1:36 PM Austin Bennett
>>>  wrote:
>>> >
>>> > It is fantastic if generally able to address welcoming newcomers
>>> manually [ @Robert Burke ! ] .  Community communication, human connection [
>>> ex: community > code ] ideal!!  In this particular case, I imagine
>>> automation does not contradict - nor detract from - the manual/human touch.
>>> >
>>> > As shared, the very specific use case I had in mind was to support -->
>>> https://news.apache.org/foundation/entry/the-asf-launches-firstasfcontribution-campaign
>>> ...  I wanted to send a message thanking for someone's first PR merge, and
>>> encourage them to fill out the form ( while that campaign is active.  In
>>> that case, I did imagine a static [ meaning hardcoded, non-changing ]
>>> message that prompts them at the moment that they make their real first
>>> code contribution [ as it gets merged ], since that would be most relevant
>>> and immediate feedback.
>>> >
>>> > If we think overkill, no problem either.  If an issue with choosing to
>>> use a bot, vs a GH action - I can also spend time to create a custom GH
>>> Action that accommodates that.  But, that might not be worthwhile if the
>>> discussed use case isn't functionality we even want as part of the project.
>>> >
>>> > On Tue, Feb 21, 2023 at 12:28 PM Robert Bradshaw 
>>> wrote:
>>> >>
>>> >> On Tue, Feb 21, 2023 at 10:59 AM Kenneth Knowles 
>>> wrote:
>>> >> >
>>> >> > Agree with Robert here. The human connection is important. Can we
>>> have a behaviorbot that reminds the reviewer to be extra welcoming up
>>> front, and then thankful afterwards, instead? :-)
>>> >>
>>> >> +1
>>> >>
>>> >> > That said, a bot comment would at least state our intention of
>>> being welcoming and grateful, even if we then do not live up to it
>>> perfectly. It isn't very different than having it in the PR template or
>>> https://beam.apache.org/contribute/ or CONTRIBUTING.md which GitHub
>>> presents to first time contributors. I tend to favor static text that can
>>> be referred to over dynamic text posted by code in special circumstances.
>>> But I think hitting this from all angles, for different sorts of people in
>>> the world, is fine, if the maintenance burden is very low (which it appears
>>> to be)
>>> >>
>>> >> I think the primary value in such a bot is to set expectations/inform
>>> >> the contributor of something they might not know but is relevant to
>>> >> their action. Otherwise, I am more in favor of static text somewhere
>>> >> they're sure to encounter it (and there are benefits to doing it
>>> >> before they create a PR, e.g. as part of a template, rather than
>>> >> after).
>>> >>
>>> >>
>>> >> > On Tue, Feb 21, 2023 at 10:01 AM Robert Burke 
>>> wrote:
>>> >> >>
>>> >> >> I can't speak for all committers but I'm always aware when it's
>>> someone's first time contributing to beam (the First Time Contributor badge
>>> is instrumental h

Re: new contributor messaging: behaviorbot/welcome

2023-02-21 Thread Austin Bennett
A PR: https://github.com/apache/beam/pull/25586

The text could likely be improved ( open to suggestions/changes ), but this
captures at least the intent.

For this to work, we need to install the bot as also mentioned in the PR.



On Tue, Feb 21, 2023 at 6:02 PM Robert Burke  wrote:

> I agree that the bot is better than nothing at all.
>
> +1 to getting a PR with messaging out for review.
>
> On Tue, Feb 21, 2023, 5:29 PM Robert Bradshaw via dev 
> wrote:
>
>> FWIW, I'm generally in favor of such a bot. I think it really boils
>> down to a concrete proposal of what the content (and triggers) would
>> be.
>>
>> On Tue, Feb 21, 2023 at 1:36 PM Austin Bennett
>>  wrote:
>> >
>> > It is fantastic if generally able to address welcoming newcomers
>> manually [ @Robert Burke ! ] .  Community communication, human connection [
>> ex: community > code ] ideal!!  In this particular case, I imagine
>> automation does not contradict - nor detract from - the manual/human touch.
>> >
>> > As shared, the very specific use case I had in mind was to support -->
>> https://news.apache.org/foundation/entry/the-asf-launches-firstasfcontribution-campaign
>> ...  I wanted to send a message thanking for someone's first PR merge, and
>> encourage them to fill out the form ( while that campaign is active.  In
>> that case, I did imagine a static [ meaning hardcoded, non-changing ]
>> message that prompts them at the moment that they make their real first
>> code contribution [ as it gets merged ], since that would be most relevant
>> and immediate feedback.
>> >
>> > If we think overkill, no problem either.  If an issue with choosing to
>> use a bot, vs a GH action - I can also spend time to create a custom GH
>> Action that accommodates that.  But, that might not be worthwhile if the
>> discussed use case isn't functionality we even want as part of the project.
>> >
>> > On Tue, Feb 21, 2023 at 12:28 PM Robert Bradshaw 
>> wrote:
>> >>
>> >> On Tue, Feb 21, 2023 at 10:59 AM Kenneth Knowles 
>> wrote:
>> >> >
>> >> > Agree with Robert here. The human connection is important. Can we
>> have a behaviorbot that reminds the reviewer to be extra welcoming up
>> front, and then thankful afterwards, instead? :-)
>> >>
>> >> +1
>> >>
>> >> > That said, a bot comment would at least state our intention of being
>> welcoming and grateful, even if we then do not live up to it perfectly. It
>> isn't very different than having it in the PR template or
>> https://beam.apache.org/contribute/ or CONTRIBUTING.md which GitHub
>> presents to first time contributors. I tend to favor static text that can
>> be referred to over dynamic text posted by code in special circumstances.
>> But I think hitting this from all angles, for different sorts of people in
>> the world, is fine, if the maintenance burden is very low (which it appears
>> to be)
>> >>
>> >> I think the primary value in such a bot is to set expectations/inform
>> >> the contributor of something they might not know but is relevant to
>> >> their action. Otherwise, I am more in favor of static text somewhere
>> >> they're sure to encounter it (and there are benefits to doing it
>> >> before they create a PR, e.g. as part of a template, rather than
>> >> after).
>> >>
>> >>
>> >> > On Tue, Feb 21, 2023 at 10:01 AM Robert Burke 
>> wrote:
>> >> >>
>> >> >> I can't speak for all committers but I'm always aware when it's
>> someone's first time contributing to beam (the First Time Contributor badge
>> is instrumental here), and manually thank them and welcome them to Beam.
>> >> >>
>> >> >> Seems more meaningful for the merging committer to do it rather than
>> an automated process.
>> >> >>
>> >> >> Maybe I just have bad experiences with automated phone trees
>> >> >>
>> >> >> On Tue, Feb 21, 2023, 9:02 AM Danny McCormick via dev <
>> dev@beam.apache.org> wrote:
>> >> >>>
>> >> >>> If the merge message is a key part of this then I'm fine using
>> behaviorbot (though I think a PMC member would need to install it, I don't
>> have the right permission set).
>> >> >>>
>> >> >>> > I'd also be happy to leverage first-interaction for everything
>> it can do, and only use welcome-bot for the things that aren't met
&g

Re: [ANNOUNCE] New PMC Member: Jan Lukavský

2023-02-21 Thread Austin Bennett
Thanks, Jan, for all the ways you've contributed to the community!

On Tue, Feb 21, 2023 at 8:06 PM Ahmet Altay via dev 
wrote:

> Congratulations Jan!
>
> On Fri, Feb 17, 2023 at 4:52 AM Jan Lukavský  wrote:
>
>> Thanks everyone!
>>
>> This is great honor, I'm grateful for the support of the Apache Beam
>> community.
>>
>> Best,
>>
>>  Jan
>> On 2/17/23 11:15, Shivam Singhal wrote:
>>
>> Congratulations Jan!
>>
>> On Fri, 17 Feb 2023 at 14:26, Moritz Mack  wrote:
>>
>>> Congrats, Jan!
>>>
>>>
>>>
>>> On 16.02.23, 23:28, "Luke Cwik via dev"  wrote:
>>>
>>>
>>>
>>> Congrats, well deserved.
>>>
>>>
>>>
>>> On Thu, Feb 16, 2023 at 10:32 AM Anand Inguva via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>> Congratulations!!
>>>
>>>
>>>
>>> On Thu, Feb 16, 2023 at 12:42 PM Chamikara Jayalath via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>> Congrats Jan!
>>>
>>>
>>>
>>> On Thu, Feb 16, 2023 at 8:35 AM John Casey via dev 
>>> wrote:
>>>
>>> Thanks Jan!
>>>
>>>
>>>
>>> On Thu, Feb 16, 2023 at 11:11 AM Danny McCormick via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>> Congratulations!
>>>
>>>
>>>
>>> On Thu, Feb 16, 2023 at 11:09 AM Reza Rokni via dev 
>>> wrote:
>>>
>>> Congratulations!
>>>
>>>
>>>
>>> On Thu, Feb 16, 2023 at 7:47 AM Robert Burke  wrote:
>>>
>>> Congratulations!
>>>
>>>
>>>
>>> On Thu, Feb 16, 2023, 7:44 AM Danielle Syse via dev 
>>> wrote:
>>>
>>> Congrats, Jan! That's awesome news. Thank you for your continued
>>> contributions!
>>>
>>>
>>>
>>> On Thu, Feb 16, 2023 at 10:42 AM Alexey Romanenko <
>>> aromanenko@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming Jan Lukavský <
>>> j...@apache.org> as our newest PMC member.
>>>
>>> Jan has been a part of Beam community and a long time contributor since
>>> 2018 in many significant ways, including code contributions in different
>>> areas, participating in technical discussions, advocating for users, giving
>>> a talk at Beam Summit and even writing one of the few Beam books!
>>>
>>> Congratulations Jan and thanks for being a part of Apache Beam!
>>>
>>> ---
>>> Alexey
>>>
>>>
>>


Re: new contributor messaging: behaviorbot/welcome

2023-02-21 Thread Austin Bennett
It is fantastic if we are generally able to welcome newcomers manually
[ @Robert Burke ! ].  Community communication and human connection
[ ex: community > code
<https://theapacheway.com/community-over-code/#:~:text=Community%20Over%20Code%20is%20a,continue%20to%20maintain%20the%20code.>
] are ideal!!  In this particular case, I imagine automation neither
contradicts nor detracts from the manual/human touch.

As shared, the very specific use case I had in mind was to support -->
https://news.apache.org/foundation/entry/the-asf-launches-firstasfcontribution-campaign
...
I wanted to send a message thanking someone for their first PR merge, and
encourage them to fill out the form <https://forms.gle/FDwR9wLZCkwhirTM9>
( while that campaign is active ).  In that case, I did imagine a static
[ meaning hardcoded, non-changing ] message that prompts them at the moment
that they make their first real code contribution [ as it gets merged ],
since that would be the most relevant and immediate feedback.

If we think this is overkill, no problem either.  If the issue is with
choosing to use a bot vs. a GH action, I can also spend time creating a
custom GH Action that accommodates that.  But that might not be worthwhile
if the discussed use case isn't functionality we even want as part of the
project.
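
To make the "static message on first PR merge" idea concrete, here is a
minimal sketch of the decision logic only.  It assumes an event dict shaped
like GitHub's pull_request webhook payload (action, pull_request.merged,
pull_request.user.login); the function name, the message text, and the
prior_merged_prs argument are hypothetical, and how the prior-merge count
gets looked up is deliberately left out.  This is not what behaviorbot or
first-interaction actually run.

```python
from typing import Optional

# Hypothetical message text; the real wording was still open for review.
FIRST_MERGE_MESSAGE = "Thanks for your first merged contribution, @{login}!"


def first_merge_comment(event: dict, prior_merged_prs: int) -> Optional[str]:
    """Return a welcome comment for a contributor's first merged PR, else None.

    `event` is assumed to follow the shape of GitHub's `pull_request` webhook
    payload.  `prior_merged_prs` is the number of PRs by the same author that
    were merged before this one (lookup not shown in this sketch).
    """
    pr = event.get("pull_request", {})
    # A PR closed without merging also produces a "closed" event, so check both.
    if event.get("action") != "closed" or not pr.get("merged"):
        return None
    if prior_merged_prs > 0:
        return None
    return FIRST_MERGE_MESSAGE.format(login=pr["user"]["login"])


merged = {"action": "closed",
          "pull_request": {"merged": True, "user": {"login": "new-dev"}}}
print(first_merge_comment(merged, prior_merged_prs=0))
```

Because the message is a fixed template, the only moving parts are the two
checks: the PR was actually merged, and the author had no earlier merges.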

On Tue, Feb 21, 2023 at 12:28 PM Robert Bradshaw 
wrote:

> On Tue, Feb 21, 2023 at 10:59 AM Kenneth Knowles  wrote:
> >
> > Agree with Robert here. The human connection is important. Can we have a
> behaviorbot that reminds the reviewer to be extra welcoming up front, and
> then thankful afterwards, instead? :-)
>
> +1
>
> > That said, a bot comment would at least state our intention of being
> welcoming and grateful, even if we then do not live up to it perfectly. It
> isn't very different than having it in the PR template or
> https://beam.apache.org/contribute/ or CONTRIBUTING.md which GitHub
> presents to first time contributors. I tend to favor static text that can
> be referred to over dynamic text posted by code in special circumstances.
> But I think hitting this from all angles, for different sorts of people in
> the world, is fine, if the maintenance burden is very low (which it appears
> to be)
>
> I think the primary value in such a bot is to set expectations/inform
> the contributor of something they might not know but is relevant to
> their action. Otherwise, I am more in favor of static text somewhere
> they're sure to encounter it (and there are benefits to doing it
> before they create a PR, e.g. as part of a template, rather than
> after).
>
>
> > On Tue, Feb 21, 2023 at 10:01 AM Robert Burke 
> wrote:
> >>
> >> I can't speak for all committers but I'm always aware when it's
> someone's first time contributing to beam (the First Time Contributor badge
> is instrumental here), and manually thank them and welcome them to Beam.
> >>
> >> Seems more meaningful for the merging committer to do it rather than an
> automated process.
> >>
> >> Maybe I just have bad experiences with automated phone trees
> >>
> >> On Tue, Feb 21, 2023, 9:02 AM Danny McCormick via dev <
> dev@beam.apache.org> wrote:
> >>>
> >>> If the merge message is a key part of this then I'm fine using
> behaviorbot (though I think a PMC member would need to install it, I don't
> have the right permission set).
> >>>
> >>> > I'd also be happy to leverage first-interaction for everything it
> can do, and only use welcome-bot for the things that aren't met elsewhere [
> also happy to eventually remove welcome-bot, ex: after that ASF campaign or
> once a suitable off-the-shelf replacement comes along ]
> >>>
> >>> I don't think we should do this, there's not really a benefit gained
> if we're still using welcome-bot.
> >>>
> >>> > @Danny McCormick - any idea whether there is another tool that can
> help with messaging on first-pr-merge that we'd be more happy with [ I can
> search around some if that's the path ]?
> >>>
> >>> My best alternative would be actions/first-interaction for first
> issues/prs opened and a custom workflow using an if/else and
> actions/comment-pull-request for the pr merge comment, that is probably
> more trouble than it is worth though (>10 lines of code for something that
> can just be config).
> >>>
> >>> > And/or since I imagine you might know GH Action internals [ IIRC you
> had worked with/for that organization ] better than me at the moment, do
> you think that's functionality that could straightforwardly be added to
> first-interaction [ if they would accept a PR ]
> >>>
> >>> This wouldn't be too ha

Re: new contributor messaging: behaviorbot/welcome

2023-02-21 Thread Austin Bennett
There are lots of great places for messages/encouragement to developers as
they work their way into our community.  That said, PR merge messages would
potentially be quite valuable [ for ex:
https://news.apache.org/foundation/entry/the-asf-launches-firstasfcontribution-campaign
... specifically, I wanted to send a message thanking someone for their
first PR merge, and encourage them to fill out the form
<https://forms.gle/FDwR9wLZCkwhirTM9> ( while that campaign is active ), so
that they then write up something for ASF to publish, which in turn
increases the visibility of Beam :-) and Beam as a great example of a
healthy ASF project ].

No disagreement that if something off-the-shelf and actions-based exists,
that is a perfectly fine way to proceed.  For the shared use-case, the
PR merge is the ideal place to message.

Alternatives:
* I'd also be happy to leverage first-interaction for everything it can do,
and only use welcome-bot for the things that aren't met elsewhere [ also
happy to eventually remove welcome-bot, ex: after that ASF campaign or once
a suitable off-the-shelf replacement comes along ]
or
* @Danny McCormick  - any idea whether there is
another tool that can help with messaging on first-pr-merge that we'd be
more happy with [ I can search around some if that's the path ]?  And/or
since I imagine you might know GH Action internals [ IIRC you had worked
with/for that organization ] better than me at the moment, do you think
that's functionality that could straightforwardly be added to
first-interaction <https://github.com/actions/first-interaction> [ if they
would accept a PR ].  Else, if we think the APIs support a
decent/straightforward design, I can always create a custom GH action.  I
can dig in there if that's the route needed to accomplish, but thought you
might recall the GH APIs better than my current knowledge.  Thoughts?


On Mon, Feb 20, 2023 at 6:47 PM Danny McCormick via dev 
wrote:

> Hey Austin, I'm +1 for adding a welcome bot, I would vote we use
> https://github.com/actions/first-interaction instead though.
>
> The pros I see are:
> - (minor) we don't need to install the bot (which would require infra
> approval I believe)
> - GitHub has generally lowered (if not completely deprecated) probot apps
> in favor of actions
> - it matches our other automations which are all actions based
>
> The only con I see:
> - actions/first-interaction doesn't support PR merge messages (
> https://github.com/behaviorbot/welcome#first-pr-merge)
>
> If you put up a PR for `first-interaction`, I'm happy to review/merge
> (barring further disagreement on this thread).
>
> Thanks,
> Danny
>
> On Mon, Feb 20, 2023 at 4:33 PM Austin Bennett  wrote:
>
>> Hi Devs,
>>
>> I'd like us to consider adding behaviorbot
>> <https://github.com/behaviorbot>, and specifically behaviorbot/welcome
>> <https://github.com/behaviorbot/welcome> to beam's repo.  This will
>> allow us to easily have a bit of messaging to new contributors.  Ex: on
>> first issue creation and/or first PR.  Such messaging gets defined in
>> `.github/config.yml` ...
>>
>> I imagine this is not particularly contentious.  If we do believe fine,
>> can someone install: https://github.com/apps/welcome to our repo?  Once
>> in the repo, I can configure [ and get a review for ] the messaging for the
>> various conditions [ to live in `.github/config.yml`  ]
>>
>> Thanks,
>> Austin
>>
>


new contributor messaging: behaviorbot/welcome

2023-02-20 Thread Austin Bennett
Hi Devs,

I'd like us to consider adding behaviorbot <https://github.com/behaviorbot>,
and specifically behaviorbot/welcome
<https://github.com/behaviorbot/welcome> to beam's repo.  This will allow
us to easily have a bit of messaging to new contributors.  Ex: on first
issue creation and/or first PR.  Such messaging gets defined in
`.github/config.yml` ...

I imagine this is not particularly contentious.  If we do believe fine, can
someone install: https://github.com/apps/welcome to our repo?  Once in the
repo, I can configure [ and get a review for ] the messaging for the
various conditions [ to live in `.github/config.yml`  ]

Thanks,
Austin


Fwd: [NOTICE] Upcoming global changes to default GitHub Actions behavior for outside collaborators

2023-02-13 Thread Austin Bennett
FYI -

I am not sure this is overly concerning, but wanted to ensure people had
seen

---------- Forwarded message ---------
From: Daniel Gruno 
Date: Mon, Feb 13, 2023, 11:49 AM
Subject: [NOTICE] Upcoming global changes to default GitHub Actions
behavior for outside collaborators
To: 


To Project PMCs:

GitHub for Apache projects is currently set to allow a non-committer
contributor to use GitHub Actions if a previous pull request by that
person has been approved.

This has raised some security concerns, and could cause issues with
overall use and availability of GitHub Actions.

The Infrastructure Team proposes to change the default to “always
require approval for external contributors”. We intend to make this
change on Sunday the 19th of March, 2023.

This change will apply to all GitHub repositories that do not already
have a specific GitHub Actions policy set.

Projects that have a strong desire to use the “only need approval first
time” option should communicate that, explaining their reasons, in a
Jira ticket for Infra. Please be as specific as you can in which
repositories you wish to have this option set for, should you choose to.

With regards,
Daniel, on behalf of the ASF Infrastructure Team.


Re: [Go SDK] Direct Runner Replacement: Prism

2023-02-09 Thread Austin Bennett
Thanks for the work on this; a very welcome feature/contribution!

On Thu, Feb 9, 2023 at 7:36 AM Jack McCluskey via dev 
wrote:

> Congratulations on getting the runner to a state you're happy contributing
> to the main repo! I'm happy to help review PRs and get sub-packages in.
> Anything that helps developers and users test Beam pipelines more
> effectively is a welcome inclusion.
>
> Thanks,
>
> Jack McCluskey
>
> P.S. I'm glad the Prism name stuck, that's definitely one of my finer
> branding efforts
>
> On Wed, Feb 8, 2023 at 6:23 PM Robert Burke  wrote:
>
>> Hello Beam!
>>
>> == tl;dr; ==
>>
>> I wrote a local, portable Beam runner in Go to replace the Go direct
>> runner.  I'd like to contribute it to the Beam Repo. The Big PR with
>> everything is here: https://github.com/apache/beam/pull/25391
>>
>> I'll be sending smaller PRs out for review to get it into the repo. Take
>> a look at the big one, don't mind the mess, but do ask questions, or offer
>> constructive suggestions to make it clearer. There are ample TODOs that
>> could be added. This thread will be kept up to date with the progress.
>>
>> Highlights:
>> Avoids false positive issues the Go Direct runner has, especially around
>> serialization issues.
>> Single transform at a time execution.
>> Watermark propagation through Graph for GBKs and Side Input windowing.
>> Will be capable of testing the whole Go SDK, in time.
>> Will be capable of being a stand alone single binary runner, in time.
>> ++Many opportunities for contribution after getting into the repo!++
>>
>> Lowlights:
>> Only for Go SDK, for now.
>> ~~Many unimplemented features~~
>>
>> Where to start reading?
>>
>> Vision README:
>> https://github.com/apache/beam/blob/9044f2d4ae151f4222a2f3e0a3264c1198040181/sdks/go/pkg/beam/runners/prism/README.md
>>
>>
>> Code Structure README:
>> https://github.com/apache/beam/blob/9044f2d4ae151f4222a2f3e0a3264c1198040181/sdks/go/pkg/beam/runners/prism/internal/README.md
>>
>>
>> executePipeline entrypoint:
>> https://github.com/apache/beam/blob/9044f2d4ae151f4222a2f3e0a3264c1198040181/sdks/go/pkg/beam/runners/prism/internal/execute.go#L41
>>
>>
>>
>> == The long version ==
>>
>> Since last year, I was puttering away at making a Portable Beam Runner
>> authored in Go. Partly because I wanted to learn the "runner" half of beam,
>> and partly because the Go Direct Runner (and most other direct runners),
>> are not good at testing.
>>
>> I managed to get it roughly ready for basic batch execution by end of
>> February 2022, and then 2022 got away from me. And I couldn't pick it up
>> until the end of the year.
>>
>> I gave a talk about this at Beam Summit 2022
>> https://2022.beamsummit.org/sessions/portable-go-beam-runner/ that
>> covers my motivation for it. Loosely, Beam has a Testing Problem. There are
>> large parts of Beam execution that matter for real world performance and
>> correctness, but the facilities to test these don't exist.  For example,
>> take Combiner Lifting, if a combiner is unlifted, but implements
>> AddInput... then Merge is never called, leaving it untested. And the user
>> has no control over this, or may not even be aware of it. How a DoFn is
>> executed matters for coverage, and user confidence.  In particular for
>> Streaming jobs, users will tend to try things out on their Prod runner, but
>> that doesn't help if one is testing on local Flink, but executing on Google
>> Cloud Dataflow, which behave very differently.
>>
>> Regardless of whether you agree with that thesis...  I wanted to fill
>> that gap. I wanted a runner that could be configured to test those
>> situations, and in particular, make it easier to develop SDKs and all the
>> features of Beam that don't get their own blog posts.
>>
>> Especially for the Go SDK. Java, being the oldest, has arguably the only
>> "correct" beam runner, in the form of the Java Direct Runner. But one can't
>> execute Go pipelines on that. Python has a portable execution of its
>> runner, but the current state of python is Parallelism hostile at best. It
>> supports a great many things, like Cross Language, but can't support
>> streaming execution (ProcessContinuations etc.) at present. Also, being a
>> large Python program, it's harder to follow.  The Java Direct runner, while
>> being slightly easier to follow, doesn't have a clear execution flow.
>> Neither of them are particularly easy for Non Language Experts to stand up
>> and use, especially outside of the Beam repo.
>>
>> The Go SDK's Direct Runner has many flaws, most of which are due to
>> Direct execution, rather than Portable Execution.  Implementing features
>> largely meant hacking certain things in, so they would be able to be
>> executed. This also made supporting and testing Cross Language Transforms,
>> State and Timers in Go pipelines a non-starter for users. And that's just
>> the tip.
>>
>> So I wanted something better. I mentioned it a few times to others, but I
>> kept hearing the same refrain: "I 
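
The Combiner Lifting gap described in the message above (an unlifted
combiner that implements AddInput never has Merge called) can be
illustrated with a small sketch. This is plain Python with no Beam import;
the class `MeanFn` and the helpers `run_unlifted` and `run_lifted` are
hypothetical names that only mirror the create/add/merge/extract shape of a
combiner, and they are not how Prism or any other runner is implemented.

```python
# Plain-Python sketch (no Beam import). All names are hypothetical.
from collections import Counter


class MeanFn:
    """Mean combiner that counts how often each method is called."""

    def __init__(self):
        self.calls = Counter()

    def create_accumulator(self):
        self.calls["create_accumulator"] += 1
        return (0.0, 0)  # (running sum, element count)

    def add_input(self, acc, value):
        self.calls["add_input"] += 1
        total, count = acc
        return (total + value, count + 1)

    def merge_accumulators(self, accs):
        self.calls["merge_accumulators"] += 1
        totals, counts = zip(*accs)
        return (sum(totals), sum(counts))

    def extract_output(self, acc):
        self.calls["extract_output"] += 1
        total, count = acc
        return total / count if count else float("nan")


def run_unlifted(fn, values):
    """All values reach a single accumulator, so merge is never needed."""
    acc = fn.create_accumulator()
    for value in values:
        acc = fn.add_input(acc, value)
    return fn.extract_output(acc)


def run_lifted(fn, bundles):
    """Each bundle is pre-combined, then the partial results are merged."""
    partials = []
    for bundle in bundles:
        acc = fn.create_accumulator()
        for value in bundle:
            acc = fn.add_input(acc, value)
        partials.append(acc)
    return fn.extract_output(fn.merge_accumulators(partials))


unlifted, lifted = MeanFn(), MeanFn()
print(run_unlifted(unlifted, [1, 2, 3, 4]), dict(unlifted.calls))
print(run_lifted(lifted, [[1, 2], [3, 4]]), dict(lifted.calls))
```

Both paths produce the same mean, but only the lifted path exercises
`merge_accumulators`, so a bug in that method would go unnoticed by a test
that only ever runs unlifted.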

Re: [fyi] Updating ip address for Playground staging

2023-02-01 Thread Austin Bennett
[ Happy New Year ] Pablo,

I imagine there shouldn't be an issue with changing regions, esp. in a
testing environment ( verifying this is actually testing, which is distinct
from production as served from an account with a 'testing' name :-) ).
Though, it seems likely that you'll also want to update the associated
compute [ in beam playground's case, looks to be a GKE cluster ] to be in
the 'same' region as the regional IP address.

There is potentially reason to consider using a Global IP to increase
future flexibility.  Also, global IPs can be associated with K8s Ingress.
Do advise if it would be helpful to talk about networking/resources in GCP,
I spent much time focused on such networking.  I'd also be happy to get
hands-on to help us out, esp. in the event that there will be an
easier-to-maintain solution [ can document, and/or get PRs together, as
needed ].  Do advise -

Cheers,
Austin

On Wed, Feb 1, 2023 at 4:37 PM Pablo Estrada via dev 
wrote:

> Hi all,
> this email is just to inform contributors that due to resource constraints
> in the apache-beam-testing project, we want to update the IP address for
> the staging environment of the Beam playground from a us-west one to a
> us-east one.
>
> If you have any concerns about this, please reach out to me.
> Best
> -P.
>


Re: How to write an IO guide draft

2023-01-10 Thread Austin Bennett
This is great, thanks for putting this together!

A related question:  are we as a community targeting java to be the
canonical/target IO language if an IO does not currently exist?  If that is
not the case, then I would imagine we are hoping that we might eventually
also wind up with good examples for implementing IOs in other languages as
well [ not suggesting that you/John address that, but that we add GH Issues
as that might be worthwhile to hope others take on ]?



On Mon, Jan 9, 2023 at 8:58 AM John Casey via dev 
wrote:

> Hi All,
>
> I spent the last few weeks of December drafting a "How to write an IO
> guide":
> https://docs.google.com/document/d/1-WxZTNu9RrLhh5O7Dl5PbnKqz3e5gm1x3gDBBhszVF8/edit#
>
> and an associated code sample: https://github.com/apache/beam/pull/24799
>
> My goal is to make it easier for a new IO developer to create a new IO
> from scratch. This is intended to complement the various standards
> documents that have been floating around. Where those are intended to
> prescribe structure of an IO, this is more focused on the mechanics of
> internal design.
>
> Please take a look and let me know what you think,
>
> John
>


Re: A Declarative API for Apache Beam

2022-12-16 Thread Austin Bennett
Seems a worthwhile addition which can expand the community by making Beam
increasingly accessible to additional users and for more use-cases.

A bit of a tangent, since commenting on @Byron Ellis's
part, but ...  Ensuring some have also seen Dataform [ ex:
https://cloud.google.com/dataform/docs/overview ... and - formerly -
https://dataform.co/ ] , since now part of the same company as you, there
are potentially additional maybe-straightforward
conversations/lessons-learned/etc to discuss [ in addition to collabs with
the dbt community ].  At times, I think of these two [ dbt, dataform] as
addressing similar things.



On Thu, Dec 15, 2022 at 4:17 PM Ahmet Altay via dev 
wrote:

> +1 to both of these proposals. In the past 12 months I have heard of at
> least 3 YAML implementations built on top of Beam in large production
> systems. Unfortunately, none of those were open sourced. Having these out
> of the box would be great, and it will clearly have user demand. Thank
> you all!
>
> On Thu, Dec 15, 2022 at 10:59 AM Robert Bradshaw via dev <
> dev@beam.apache.org> wrote:
>
>> On Thu, Dec 15, 2022 at 3:37 AM Steven van Rossum
>>  wrote:
>> >
>> > This is great! I developed a similar template a year or two ago as a
>> reference for a customer to speed up their development process and
>> unsurprisingly it did speed up their development.
>> > Here's an example of the config layout I came up with at the time:
>> >
>> > options:
>> >   runner: DirectRunner
>> >
>> > pipeline:
>> > # - 
>> > #   label: PubSub XML source
>> > #   transform:
>> > #     !PTransform:apache_beam.io.ReadFromPubSub
>> > #     subscription: projects/PROJECT/subscriptions/SUBSCRIPTION
>> > - &message_source_1
>> >   label: XML source 1
>> >   transform:
>> >     !PTransform:apache_beam.Create
>> >     values:
>> >     - /path/to/file.xml
>> > - &message_source_2
>> >   label: XML source 2
>> >   transform:
>> >     !PTransform:apache_beam.Create
>> >     values:
>> >     - /path/to/another/file.xml
>> > - &message_xml
>> >   label: XMLs
>> >   inputs:
>> >   - step: *message_source_1
>> >   - step: *message_source_2
>> >   transform:
>> >     !PTransform:utils.transforms.ParseXmlDocument {}
>> > - &validated_messages
>> >   label: Validate XMLs
>> >   inputs:
>> >   - step: *message_xml
>> >     tag: success
>> >   transform:
>> >     !PTransform:utils.transforms.ValidateXmlDocumentWithXmlSchema
>> >     schema: /path/to/file.xsd
>> > - &converted_messages
>> >   label: Convert XMLs
>> >   inputs:
>> >   - step: *validated_messages
>> >   transform:
>> >     !PTransform:utils.transforms.ConvertXmlDocumentToDictionary
>> >     schema: /path/to/file.xsd
>> > - label: Print XMLs
>> >   inputs:
>> >   - step: *converted_messages
>> >   transform:
>> >     !PTransform:utils.transforms.Print {}
>> >
>> > Highlights:
>> > Pipeline options are supplied under an options property.
>>
>> Yep, I was thinking exactly the same:
>>
>> https://github.com/apache/beam/blob/c5518014d47a42651df94419e3ccbc79eaf96cb3/sdks/python/apache_beam/yaml/main.py#L51
>>
>> > A pipeline is a flat set of all transforms in the pipeline.
>>
>> One can certainly enumerate the transforms as a flat set, but I do
>> think being able to define a composite structure is nice. In addition,
>> the "chain" composite allows one to automatically infer the
>> input-output relation rather than having to spell it out (much as one
>> can chain multiple transforms in the various SDKs rather than have to
>> assign each result to an intermediate).
>>
>> > Transforms are defined using a YAML tag and named properties and can be
>> used by constructing a YAML reference.
>>
>> That's an interesting idea. Can it be done inline as well?
>>
>> > DAG construction is done using a simple topological sort of transforms
>> and their dependencies.
>>
>> Same.
>>
>> > Named side outputs can be referenced using a tag field.
>>
>> I didn't put this in any of the examples, but I do the same. If a
>> transform Foo produces multiple outputs, one can (in fact must)
>> reference the various outputs by Foo.output1, Foo.output2, etc.
>>
>> > Multiple inputs are merged with a Flatten transform.
>>
>> PTransforms can have named inputs as well (they're not always
>> symmetric), so I let inputs be a map if they care to distinguish them.
>>
>> > Not sure if there's any inspiration left to take from this, but I
>> figured I'd throw it up here to share.
>>
>> Thanks. It's neat to see others coming up with the same idea, with
>> very similar conventions, and validates that it'd be both natural and
>> useful.
>>
>>
>> > On Thu, Dec 15, 2022 at 12:48 AM Chamikara Jayalath via dev <
>> dev@beam.apache.org> wrote:
>> >>
>> >> +1 for these proposals and agree that these will simplify and
>> demystify Beam for many new users. I think when combined with the
>> x-lang/Schema-Aware transform binding, these might end up being adequate
>> solutions for many production use-cases as well (unless users need to
>> define custom composites, I/O connectors, etc.).
>> >>
>> >> Also, 
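
The "simple topological sort of transforms and their dependencies"
mentioned in the thread above can be sketched with Python's standard-library
`graphlib` (Python 3.9+). The step names, the dict layout, and the helper
`construction_order` are hypothetical and loosely follow the flat config
quoted earlier; this is not claimed to be how Steven's template or the Beam
YAML prototype builds its DAG.

```python
from graphlib import TopologicalSorter

# Hypothetical flat pipeline spec: each step names the steps it reads from.
# The steps are deliberately listed out of order.
steps = [
    {"name": "print", "inputs": ["converted"]},
    {"name": "converted", "inputs": ["validated"]},
    {"name": "validated", "inputs": ["xml"]},
    {"name": "xml", "inputs": ["source_1", "source_2"]},
    {"name": "source_1", "inputs": []},
    {"name": "source_2", "inputs": []},
]


def construction_order(steps):
    """Return step names so that every step comes after all of its inputs."""
    graph = {step["name"]: set(step["inputs"]) for step in steps}
    referenced = {name for deps in graph.values() for name in deps}
    unknown = referenced - set(graph)
    if unknown:
        raise ValueError(f"inputs refer to undefined steps: {sorted(unknown)}")
    # static_order() raises graphlib.CycleError if the spec is not a DAG.
    return list(TopologicalSorter(graph).static_order())


order = construction_order(steps)
print(order)
```

Applying the transforms in this order means each step's inputs already
exist when the step itself is constructed, regardless of the order in which
the steps appear in the config file.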

Re: [DISCUSSION][JAVA] Current state of Java 17 support

2022-11-29 Thread Austin Bennett
-1 for ongoing Java8 support [ or, said another way, +1 for dropping
support of Java8 ]

+1 for having tests that run for ANY JDK that we say we support.  Is there
any reason the resources to support are too costly [ or outweigh the
benefits of additional confidence in ensuring we support what we say we do
]?  I am not certain on whether this would only be critical for releases,
or should be done as part of regular CI.

On Tue, Nov 29, 2022 at 8:51 AM Alexey Romanenko 
wrote:

> Hello,
>
> I’m sorry if it’s already discussed somewhere but I find myself a little
> bit lost in the subject.
> So, I’d like to clarify this - what is a current official state of Java 17
> support at Beam?
>
> I recall that a great job was done to make Beam compatible with Java 17
> [1] and Beam already provides “beam_java17_sdk” Docker image [2] but, iiuc,
> Java 8 is still the default JVM to run all Java tests on Jenkins ("Java
> PreCommit" in the first order) and there are only a limited number of tests
> that are running with JDK 11 and 17 on Jenkins by dedicated jobs.
>
> So, my question would sound like if Beam officially supports Java 17 (and
> 11), do we need to run all Beam Java SDK related tests (VR and IT test
> including) against all supported Java SDKs?
>
> Do we still need to support Java 8 SDK?
>
> At the same time, as we are heading to move everything from Jenkins to
> GitHub actions, what would be the default JDK there or we will run all
> Java-related actions against all supported JDKs?
>
> —
> Alexey
>
> [1] https://issues.apache.org/jira/browse/BEAM-12240
> [2] https://hub.docker.com/r/apache/beam_java17_sdk
>
>
>
>


Re: bhulette stepping back (for now)

2022-11-11 Thread Austin Bennett
Thanks for everything you've done, @bhule...@apache.org!

On Fri, Nov 11, 2022 at 11:01 AM Pablo Estrada via dev 
wrote:

> I promised I wouldn't cry so I won't. Cya!
>
> On Fri, Nov 11, 2022 at 10:46 AM Robin Qiu via dev 
> wrote:
>
>> Thanks for your contribution Brian! Hope you enjoy your new team!
>>
>> Best,
>> Robin
>>
>> On Fri, Nov 11, 2022 at 10:27 AM Kenneth Knowles  wrote:
>>
>>> Your contributions have been huge. You will be missed! But have a
>>> fabulous time with BigQuery. And thank you so much for letting us know [1]
>>>
>>> Kenn
>>>
>>> [1] See "stepping down considerately" from
>>> https://www.apache.org/foundation/policies/conduct.html
>>>
>>> On Thu, Nov 10, 2022 at 4:00 PM Brian Hulette 
>>> wrote:
>>>
>>>> Hi dev@beam,
>>>>
>>>> I just wanted to let the community know that I will be stepping back
>>>> from Beam development for now. I'm switching to a different team within
>>>> Google next week - I will be working on BigQuery.
>>>>
>>>> I'm removing myself from automated code review assignments [1], and
>>>> won't actively monitor the beam lists anymore. That being said, I'm happy
>>>> to contribute to discussions or code reviews when it would be particularly
>>>> helpful, e.g. for anything relating to DataFrames/Schemas/SQL. I can always
>>>> be reached at bhule...@apache.org, and @TheNeuralBit [2] on GitHub.
>>>>
>>>> Brian
>>>>
>>>> [1] https://github.com/apache/beam/pull/24108
>>>> [2] https://github.com/TheNeuralBit
>>>>
>>>


Re: [DISCUSS] Avro dependency update, design doc

2022-11-11 Thread Austin Bennett
@Moritz: I *think* it should be fine, and I don't have anything specific to
offer for what might go wrong throughout the process.  :-) :shrug:



On Fri, Nov 11, 2022 at 2:07 AM Moritz Mack  wrote:

> Thanks a lot for the feedback so far! I can only second Alexey. It was
> painful to come to realize that the only feasible option seems to be
> copying a lot of code during the transition phase.
>
> For that reason, it will be critical to be disciplined about the removal
> of the to-be deprecated code in core and, ahead of time, agree on when to
> remove it again. Any thought on how long the transition phase should be?
>
>
>
>  *I am concerned of what could go wrong for users in the
> in-between/transition state while more slowly transitioning avro to
> extension.*
>
>
>
> @Austin Do you have any specific concern in mind here?
>
> To minimize this risk, we propose that all APIs should be kept as is to
> make the migration as easy as possible and kick off with the Avro version
> used in core. The only thing that changes will be package names.
>
>
>
> / Moritz
>
>
>
> On 10.11.22, 22:46, "Kenneth Knowles"  wrote:
>
>
>
> Thank you for writing this document. It really helps to understand the
> options. I agree that option 2 (make a new extension and deprecate from
> core) seems best. I think +Reuven Lax  might have the
> most context on any technical issue we will encounter around schema codegen.
>
>
>
> Kenn
>
>
>
> On Thu, Nov 10, 2022 at 7:24 AM Alexey Romanenko 
> wrote:
>
> Personally, I think that keeping two mostly identical versions of
> Avro-related code in two different places ("core" and "extension") is a rather
> bad practice, especially if we need to fix some issues there -
> though, it’s a very low risk there since this code is quite mature and it’s
> not touched often. On the other hand, it should give time for users
> (several Beam releases) to update their code and use Avro from extension
> artifact instead of core.
>
>
>
> Though, if we accept that this breaking change at compile time is
> allowable, then this process of transition should be much faster and can be
> performed within only one Beam release. Our main concern here is runtime
> breaking changes that we can miss but *must* be avoided by all means.
>
>
>
> —
>
> Alexey
>
>
>
> On 9 Nov 2022, at 18:47, Austin Bennett  wrote:
>
>
>
> Being tied to a specific version of a dependency, and esp. one that is
> not-[actually-long-term]critical, sounds like a problem.  It doesn't seem
> like Avro needs to be in core.  I am in favor of about any path someone
> wants to address towards removing that from core [ *#2 in the design doc
> seems reasonable* ].
>
>
>
> Naturally, having ways to more easily change versions [esp. to remediate
> CVEs, but for any specific reason ], seems very valuable.
>
>
>
> It reads as a significant problem; I wouldn't take issue with a breaking [
> compile time ] change, if that got things addressed and somewhat
> straightforwardly - *I am concerned of what could go wrong for users in
> the in-between/transition state while more slowly transitioning avro to
> extension.*
>
>
>
> On Wed, Nov 9, 2022 at 5:43 AM Alexey Romanenko 
> wrote:
>
> Any thoughts on this? For now, we'd need to decide which path finally to
> take to move forward.
>
>
>
> Thanks in advance!
>
>
>
> —
>
> Alexey
>
>
>
> On 4 Nov 2022, at 16:44, Alexey Romanenko 
> wrote:
>
>
>
> Hi all,
>
>
>
> Following-up an Avro dependency update discussion [1] that showed a lot of
> uncertainties to move forward, Moritz and I decided to create a design
> document [2] with potential options, that we believe, can be considered and
> used further. Unfortunately, all solutions lead to breaking changes in some
> way, though, for some of them the negative effect can be reduced by
> preparing users for this in advance and make this transition smoother.
>
>
>
> Please take a look at this doc and leave your comments and opinions -
> your feedback is very welcome!
>
>
>
> [1] https://lists.apache.org/thread/mz8hvz8dwhd0tzmv2lyobhlz7gtg4gq7
>
> [2]
> https://docs.google.com/docume

Re: [Python][Bikeshed] typehint vs. type-hint vs. "type hint"

2022-11-09 Thread Austin Bennett
+1

And, well articulated, @Robert Burke!

On Wed, Nov 9, 2022 at 10:43 AM Robert Burke  wrote:

> +1 to standardizing on "type hint"
>
> I learned that there's a rule in English that defines when one would have
> a space in a compound word or not. If removing one of the components would
> change the meaning of the others, the space should be removed.
>
> Eg. By removing "type", it's still a hint, so "type hint" is appropriate.
>
> This rule is why the Apple products are "MacBooks" or "AirPods" rather
> than "Mac Book" and "Air Pods". Without the Mac, the laptop isn't a Book.
> Without the Air, it's not a "pod".
>
>
>
> On Wed, Nov 9, 2022, 10:35 AM Kenneth Knowles  wrote:
>
>> +1 to "type hint" for referring to hinting that something has a
>> particular type, and "type" for referring to a... type.
>>
>> On Mon, Nov 7, 2022 at 7:27 AM Jack McCluskey via dev <
>> dev@beam.apache.org> wrote:
>>
>>> I'm in agreement that how we refer to type hints in documentation should
>>> be standardized across the board. It's a good practice for both style and
>>> clarity. Seems like it wouldn't be too hard to update our docstrings
>>> either, based on a quick search of the repo.
>>>
>>> On Mon, Nov 7, 2022 at 9:00 AM Brian Hulette via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> In a recent code review we noticed that we are not consistent when
>>>> describing python type hints in documentation. Depending on who wrote the
>>>> patch, we switch between typehint, type-hint, and "type hint" [1].
>>>>
>>>> I think we should standardize on "type hint" as this is what Guido used
>>>> in PEP 484 [2]. Please comment on the issue in the next few days if you
>>>> disagree with this approach.
>>>>
>>>> Note this is orthogonal to how we refer to type hints in _code_, in our
>>>> public APIs. In general we use "type" in that context (e.g.
>>>> `with_input_types`), and there doesn't seem to be a consistency issue.
>>>>
>>>> [1] https://github.com/apache/beam/issues/23950
>>>> [2] https://peps.python.org/pep-0484/
>>>>
>>>


Re: [ANNOUNCE] New committer: Yi Hu

2022-11-09 Thread Austin Bennett
Congrats, and Thanks, Yi!

On Wed, Nov 9, 2022 at 11:24 AM Valentyn Tymofieiev via dev <
dev@beam.apache.org> wrote:

> I am with the Beam PMC on this, congratulations and very well deserved, Yi!
>
> On Wed, Nov 9, 2022 at 11:08 AM Byron Ellis via dev 
> wrote:
>
>> Congratulations!
>>
>> On Wed, Nov 9, 2022 at 11:00 AM Pablo Estrada via dev <
>> dev@beam.apache.org> wrote:
>>
>>> +1 thanks Yi : D
>>>
>>> On Wed, Nov 9, 2022 at 10:47 AM Danny McCormick via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Congrats Yi! I've really appreciated the ways you've consistently taken
>>>> responsibility for improving our team's infra and working through sharp
>>>> edges in the codebase that others have ignored. This is definitely well
>>>> deserved!
>>>>
>>>> Thanks,
>>>> Danny
>>>>
>>>> On Wed, Nov 9, 2022 at 1:37 PM Anand Inguva via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> Congratulations Yi!
>>>>>
>>>>> On Wed, Nov 9, 2022 at 1:35 PM Ritesh Ghorse via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>> Congratulations Yi!
>>>>>>
>>>>>> On Wed, Nov 9, 2022 at 1:34 PM Ahmed Abualsaud via dev <
>>>>>> dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> Congrats Yi!
>>>>>>>
>>>>>>> On Wed, Nov 9, 2022 at 1:33 PM Sachin Agarwal via dev <
>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>
>>>>>>>> Congratulations Yi!
>>>>>>>>
>>>>>>>> On Wed, Nov 9, 2022 at 10:32 AM Kenneth Knowles
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>>>>>> committer: Yi Hu (y...@apache.org)
>>>>>>>>>
>>>>>>>>> Yi started contributing to Beam in early 2022. Yi's contributions
>>>>>>>>> are very diverse! I/Os, performance tests, Jenkins, support for Schema
>>>>>>>>> logical types. Not only code but a very large amount of code review.
>>>>>>>>> Yi is also noted for picking up smaller issues that normally would be
>>>>>>>>> left on the backburner and filing issues that he finds rather than
>>>>>>>>> ignoring them.
>>>>>>>>>
>>>>>>>>> Considering their contributions to the project over this
>>>>>>>>> timeframe, the Beam PMC trusts Yi with the responsibilities of a Beam
>>>>>>>>> committer. [1]
>>>>>>>>>
>>>>>>>>> Thank you Yi! And we are looking to see more of your contributions!
>>>>>>>>>
>>>>>>>>> Kenn, on behalf of the Apache Beam PMC
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>>
>>>>>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>>>>>



Re: [DISCUSS] Avro dependency update, design doc

2022-11-09 Thread Austin Bennett
Being tied to a specific version of a dependency, and esp. one that is
not-[actually-long-term]critical, sounds like a problem.  It doesn't seem
like Avro needs to be in core.  I am in favor of about any path someone
wants to address towards removing that from core [ *#2 in the design doc
seems reasonable* ].

Naturally, having ways to more easily change versions [esp. to remediate
CVEs, but for any specific reason ], seems very valuable.

It reads as a significant problem; I wouldn't take issue with a breaking [
compile time ] change, if that got things addressed and somewhat
straightforwardly - *I am concerned of what could go wrong for users in the
in-between/transition state while more slowly transitioning avro to
extension.*

On Wed, Nov 9, 2022 at 5:43 AM Alexey Romanenko 
wrote:

> Any thoughts on this? For now, we'd need to decide which path finally to
> take to move forward.
>
> Thanks in advance!
>
> —
> Alexey
>
> On 4 Nov 2022, at 16:44, Alexey Romanenko 
> wrote:
>
> Hi all,
>
> Following-up an Avro dependency update discussion [1] that showed a lot of
> uncertainties to move forward, Moritz and I decided to create a design
> document [2] with potential options, that we believe, can be considered and
> used further. Unfortunately, all solutions lead to breaking changes in some
> way, though, for some of them the negative effect can be reduced by
> preparing users for this in advance and make this transition smoother.
>
> Please take a look at this doc and leave your comments and opinions -
> your feedback is very welcome!
>
> [1] https://lists.apache.org/thread/mz8hvz8dwhd0tzmv2lyobhlz7gtg4gq7
> [2]
> https://docs.google.com/document/d/1tKIyTk_-HhkmVuJsxvWP5eTELESpCBe_Vmb1nJ3Ia34/edit?usp=sharing
>
> —
> Alexey
>
>
>


Re: [DISCUSS] Jenkins -> GitHub Actions ?

2022-11-07 Thread Austin Bennett
+1

Also would help address a good amount of what concerns me that was [sorta]
raised by https://lists.apache.org/thread/7jr99nc5xsb3ft1d75kb0ml32bzw89rv


Once we think this is something we want to do, but might be
blocked/concerned because of lack of definitively comparable features, I'd
be happy to take a look at what exists in the wider ecosystem or could be
built.

Cheers -



On Fri, Oct 21, 2022 at 11:10 AM Ismaël Mejía  wrote:

> +1 Github Actions are more intuitive and easy to modify and test for
> everyone.
> Also Beam wins because that makes one less system to maintain.
>
> Regards,
> Ismaël
>
> On Wed, Oct 19, 2022 at 5:50 PM Danny McCormick via dev
>  wrote:
> >
> > Thanks for kicking this conversation off. I'm +1 on migrating, but only
> once we've found a specific replacement for easy observability (which
> workflows have been failing lately, and how often) and trigger phrases (for
> retries and workflows that aren't automatically kicked off but should be
> run for extra validation, e.g. postcommits). Until we have viable
> replacements, I don't think we should make the move. Publishing nightly
> snapshots is eventually also a must to fully migrate, but probably doesn't
> need to block us from making progress here.
> >
> > With those caveats, the reason that I'm +1 on moving is that our Jenkins
> reliability has been rough. Since I joined the project in January, I can
> think of 3 different incidents that significantly harmed our ability to do
> work.
> >
> > 1. Jenkins triggers cause multi-day outage - this led to a multi-day
> code freeze, and we lost our trigger functionality for days afterwards.
> Investigating/restoring our state ate up a pretty full week for me.
> > 2. Jenkins plugin cause multi-day outage - this led to multiple days of
> Jenkins downtime before eventually being resolved by Infra.
> > 3. Cert issues cause many workers to go down - I don't have a thread for
> this because I handled most of the investigation the day of, but many of
> our workers went down for around a day and nobody noticed until queue time
> reached 6+ hours for each workflow.
> >
> > There may be others that I'm overlooking.
> >
> > GitHub Actions isn't a magic bullet to fix these problems, but it
> minimizes the amount of infra that we're maintaining ourselves, increases
> the isolation between workflows (catastrophic failure is less likely), has
> uptime guarantees, and is more likely to receive investment going forward
> (we're likely to get increasing benefits over time for free). We've also
> done a lot of exploration in this area already, so we're not starting from
> scratch.
> >
> > Thanks,
> > Danny
> >
> > On Wed, Oct 19, 2022 at 11:32 AM Kenneth Knowles 
> wrote:
> >>
> >> Hi all,
> >>
> >> As you probably noticed, there's a lot of work going on around adding
> more GitHub Actions workflows.
> >>
> >> Can we fully migrate to GitHub Actions? Similar to our GitHub Issues
> migration (but less user-facing) it would bring us on to "default"
> infrastructure that more people understand and is maintained by GitHub.
> >>
> >> So far we have hit some serious roadblocks. It isn't just a simple
> migration. We have to weigh doing the work to get there.
> >>
> >> I started a document with a table of the things we get from Jenkins
> that we need to be sure to have for GitHub Actions before we could think
> about migrating:
> >>
> >> https://s.apache.org/beam-jenkins-to-gha
> >>
> >> Can you please help me by adding things that we get from Jenkins, and
> if you know how to get them from GitHub Actions add that too.
> >>
> >> Thanks!
> >>
> >> Kenn
>


Re: [ANNOUNCE] New committer: Ritesh Ghorse

2022-11-03 Thread Austin Bennett
Congratulations, and Thanks @riteshgho...@apache.org!

On Thu, Nov 3, 2022 at 4:17 PM Sachin Agarwal via dev 
wrote:

> Congrats Ritesh!
>
> On Thu, Nov 3, 2022 at 4:16 PM Kenneth Knowles  wrote:
>
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>> Ritesh Ghorse (riteshgho...@apache.org)
>>
>> Ritesh started contributing to Beam in mid-2021 and has contributed
>> immensely to bringing the Go SDK to fruition, in addition to contributions
>> to Java and Python and release validation.
>>
>> Considering their contributions to the project over this timeframe, the
>> Beam PMC trusts Ritesh with the responsibilities of a Beam committer. [1]
>>
>> Thank you Ritesh! And we are looking to see more of your contributions!
>>
>> Kenn, on behalf of the Apache Beam PMC
>>
>> [1]
>>
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


[DISCUSS] Separating Cloud Environment(s) - Separate Dev/Test from Production 'Testing' environment(s)

2022-10-06 Thread Austin Bennett
Howdy!

I'm increasingly concerned by our Cloud 'Testing' environment ( resources
associated with the 'apache-beam-testing' GCP project ).

The following might sound a little strange, given repetition of words [ and
many meanings of 'testing']:

* There is a 'production' need for ongoing testing infrastructure, tests,
pipelines, builds.
and
* There are also 'development' and/or 'test' needs for extending the
functionality, tests, and exploring new things, before definitely adopting
and making production [ ex: getting onto master ].
and
* There is a very real possibility that someone 'testing' or 'developing'
new features/functionality, could accidentally delete/change/modify
critical resources in the solo environment -AND- We do not have super
quick/easy restore paths in the event of serious trouble - ex: if things
deleted.  [ I assume, as I don't see terraform/other code that seems
definitely easy to re-create resources ...  BUT, I could be missing and not
understanding this point ].


There could be benefits [ also costs/burdens, or pros/cons to consider ]
from separating functions/environments to minimize when/how people are
manually using resources/roles with permissions to modify the critical
infrastructure.

I have some ideas for steps to address [ and where I could step in to help
implement ], but before starting to formalize ideas and strategies [ longer
write-ups, etc ], I wanted to open this for discussion to check whether this
thinking makes sense and is something the community believes would be valuable.
If done well, longer-term I imagine most of the to-be-solution would have
little effect on actual users of environments [ including
developers/committers/etc ] and mostly would be around better separation
and more attention paid to controls/branches/roles/permissions.

*Can people share thoughts around potentially having at least one more
environment to add additional safety around our critical code testing
infrastructure, to keep that further distinct from workflows/process/code
that is still being developed?*

Do advise,
Cheers,
Austin


Re: What to do about issues that track flaky tests?

2022-09-14 Thread Austin Bennett
+1 to being realistic -- proper labels are worthwhile.  Though, some flaky
tests probably should be P1, and just because one isn't addressed in a timely
manner doesn't mean it isn't a P1 - though, it does mean it wasn't
addressed.
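
For concreteness, the two options in Kenn's message quoted below could be
sketched roughly as follows.  This is only an illustration, not how the
alert email is actually generated: the Issue class, the "flake" label
name, and the impacting_signal flag are all assumptions made up for the
example.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Issue:
    """Hypothetical, minimal stand-in for a tracked issue."""
    number: int
    title: str
    priority: str                       # e.g. "P1", "P2"
    labels: set = field(default_factory=set)
    impacting_signal: bool = False      # assumed flag: is the test signal currently affected?


def exclude_flakes(issues):
    """Option 1: flake issues stay P1, but are left out of the alert."""
    return [i for i in issues
            if i.priority == "P1" and "flake" not in i.labels]


def demote_quiet_flakes(issues):
    """Option 2: flake P1s that are not hurting the test signal become P2."""
    result = []
    for i in issues:
        if i.priority == "P1" and "flake" in i.labels and not i.impacting_signal:
            result.append(replace(i, priority="P2"))
        else:
            result.append(i)
    return result
```

Either way, a flake that really is hurting the test signal keeps its P1
under option 2, which is the case I would not want to lose.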



On Wed, Sep 14, 2022 at 1:19 PM Kenneth Knowles  wrote:

> I would like to make this alert email actionable.
>
> I went through most of these issues. About half are P1 "flake" issues. I
> don't think magically expecting them to be deflaked is helpful. So I have a
> couple ideas:
>
> 1. Exclude "flake" P1s from this email. This is what we used to do. But
> then... are they really P1s?
> 2. Make "flake" bugs P2 if they are not currently impacting our test
> signal. But then... we may have a gap in test coverage that could cause
> severe problems. But anyhow something that is P1 for a long time is not
> *really* P1, so it is just being realistic.
>
> What do you all think?
>
> Kenn
>
> On Wed, Sep 14, 2022 at 3:03 AM  wrote:
>
>> This is your daily summary of Beam's current high priority issues that
>> may need attention.
>>
>> See https://beam.apache.org/contribute/issue-priorities for the
>> meaning and expectations around issue priorities.
>>
>> Unassigned P1 Issues:
>>
>> https://github.com/apache/beam/issues/23227 [Bug]: Python SDK
>> installation cannot generate proto with protobuf 3.20.2
>> https://github.com/apache/beam/issues/23179 [Bug]: Parquet size exploded
>> for no apparent reason
>> https://github.com/apache/beam/issues/22913 [Bug]:
>> beam_PostCommit_Java_ValidatesRunner_Flink is flakey
>> https://github.com/apache/beam/issues/22303 [Task]: Add tests to Kafka
>> SDF and fix known and discovered issues
>> https://github.com/apache/beam/issues/22299 [Bug]: JDBCIO Write freeze
>> at getConnection() in WriteFn
>> https://github.com/apache/beam/issues/21794 Dataflow runner creates a
>> new timer whenever the output timestamp is change
>> https://github.com/apache/beam/issues/21713 404s in BigQueryIO don't get
>> output to Failed Inserts PCollection
>> https://github.com/apache/beam/issues/21704
>> beam_PostCommit_Java_DataflowV2 failures parent bug
>> https://github.com/apache/beam/issues/21701
>> beam_PostCommit_Java_DataflowV1 failing with a variety of flakes and errors
>> https://github.com/apache/beam/issues/21700
>> --dataflowServiceOptions=use_runner_v2 is broken
>> https://github.com/apache/beam/issues/21696 Flink Tests failure :
>> java.lang.NoClassDefFoundError: Could not initialize class
>> org.apache.beam.runners.core.construction.SerializablePipelineOptions
>> https://github.com/apache/beam/issues/21695 DataflowPipelineResult does
>> not raise exception for unsuccessful states.
>> https://github.com/apache/beam/issues/21694 BigQuery Storage API insert
>> with writeResult retry and write to error table
>> https://github.com/apache/beam/issues/21480 flake:
>> FlinkRunnerTest.testEnsureStdoutStdErrIsRestored
>> https://github.com/apache/beam/issues/21472 Dataflow streaming tests
>> failing new AfterSynchronizedProcessingTime test
>> https://github.com/apache/beam/issues/21471 Flakes: Failed to load cache
>> entry
>> https://github.com/apache/beam/issues/21470 Test flake:
>> test_split_half_sdf
>> https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink
>> flaky: Connection refused
>> https://github.com/apache/beam/issues/21468
>> beam_PostCommit_Python_Examples_Dataflow failing
>> https://github.com/apache/beam/issues/21467 GBK and CoGBK streaming Java
>> load tests failing
>> https://github.com/apache/beam/issues/21465 Kafka commit offset drop
>> data on failure for runners that have non-checkpointing shuffle
>> https://github.com/apache/beam/issues/21463 NPE in Flink Portable
>> ValidatesRunner streaming suite
>> https://github.com/apache/beam/issues/21462 Flake in
>> org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in
>> use
>> https://github.com/apache/beam/issues/21271 pubsublite.ReadWriteIT flaky
>> in beam_PostCommit_Java_DataflowV2
>> https://github.com/apache/beam/issues/21270
>> org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView
>> flaky on Dataflow Runner V2
>> https://github.com/apache/beam/issues/21267 WriteToBigQuery submits a
>> duplicate BQ load job if a 503 error code is returned from googleapi
>> https://github.com/apache/beam/issues/21266
>> org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
>> is flaky in Java ValidatesRunner Flink suite.
>> https://github.com/apache/beam/issues/21262 Python AfterAny, AfterAll do
>> not follow spec
>> https://github.com/apache/beam/issues/21261
>> org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
>> is flaky
>> https://github.com/apache/beam/issues/21260 Python DirectRunner does not
>> emit data at GC time
>> https://github.com/apache/beam/issues/21257 Either Create or
>> DirectRunner 

Re: Beam Website Feedback

2022-09-09 Thread Austin Bennett
Any chance you've seen a GH Issue for that?  If not, please feel free to
file one :-)

https://github.com/apache/beam/issues

On Fri, Sep 9, 2022 at 12:09 PM Peter McArthur  wrote:

> I wish there were a python example for "Slowly updating global window side
> inputs” on this page
> https://beam.apache.org/documentation/patterns/side-inputs/#slowly-updating-global-window-side-inputs
>
> thank you
>
>


Re: [idea] A new IO connector named DataLakeIO, which support to connect Beam and data lake, such as Delta Lake, Apache Hudi, Apache iceberg.

2022-08-30 Thread Austin Bennett
Is there enough commonality across Delta, Hudi, Iceberg for this generic
solution?  I imagined we'd potentially have individual IOs for each.  A
generic one seems possible, but certainly would like to learn more.

Also, are others in the community working on connectors for ANY of those
Delta Lake, Hudi, or Iceberg IOs?  Would hope for some form of coordination
and/or at least awareness between people addressing
complementary/overlapping areas.

On Mon, Aug 29, 2022 at 4:15 PM Neil Kolban via dev 
wrote:

> Howdy,
> I have a client who would be interested to use this.  Is there a link to a
> GitHub repo or other place I can read more?
>
> Neil  (kol...@google.com)
>
> On 2022/08/05 07:23:31 张涛 wrote:
> >
> > Hi, we developed a new IO connector named DataLakeIO, to connect Beam
> and data lake, such as Delta Lake, Apache Hudi, Apache iceberg. Beam can
> use DataLakeIO to read data from data lake, and write data to data lake. We
> did not find data lake IO on
> https://beam.apache.org/documentation/io/built-in/, we want to contribute
> this new IO connector to Beam, what should we do next? Thank you very much!
>


Re: Present an Apache Beam Learning Session

2022-08-01 Thread Austin Bennett
Hi Cacie,

The list could probably use some more details about what you are seeking.

Ex:
Where is CUNA Mutual group physically located?
Or are you imagining this occurring online?
Do people need to get hands-on ( you mention overview so maybe not )?
Or, you are seeking just something about what/why Apache Beam [ and
potentially how others use ]?
and
Are you hoping that something gets tailored to better entice your
organization [ ex: learning more about, and suggesting ways that this might
work ]?  There are likely a bunch of ways to make a presentation more
successful, depending on the desired outcomes.
What is generally the size of the group that attends?  etc.

Cheers,
Austin


On Mon, Aug 1, 2022 at 9:31 AM Robichaux, Cacie <
cacie.robich...@cunamutual.com> wrote:

> I am a scrum master at CUNA Mutual Group. I put together learning sessions
> every other Thursday at 12 CST. Would there be someone who could come
> for 45 minutes to 1 hr to discuss and give an overview of Apache Beam? We
> are flexible on the Thursday.
>
>
>
> Cacie Robichaux
>


Re: [ANNOUNCE] New committer: John Casey

2022-08-01 Thread Austin Bennett
Thanks, John!

On Mon, Aug 1, 2022 at 12:01 PM John Casey via dev 
wrote:

> Thanks! I'm looking forward to continuing to improve all of our connectors.
>
> On Mon, Aug 1, 2022 at 1:52 PM Yichi Zhang via dev 
> wrote:
>
>> Congratulations John!
>>
>> On Sat, Jul 30, 2022 at 4:23 AM Robert Burke  wrote:
>>
>>> Woohoo! Congrats John and welcome to committership!
>>>
>>> On Fri, Jul 29, 2022, 10:07 PM Kenneth Knowles  wrote:
>>>
>>>> Hi all,
>>>>
>>>> Please join me and the rest of the Beam PMC in welcoming
>>>> a new committer: John Casey (johnca...@apache.org)
>>>>
>>>> John started contributing to Beam in late 2021. John has quickly become
>>>> our resident expert on KafkaIO - identifying bugs, making enhancements,
>>>> helping users - in addition to a variety of other contributions.
>>>>
>>>> Considering his contributions to the project over this timeframe, the
>>>> Beam PMC trusts John with the responsibilities of a Beam committer. [1]
>>>>
>>>> Thank you John! And we are looking to see more of your contributions!
>>>>
>>>> Kenn, on behalf of the Apache Beam PMC
>>>>
>>>> [1]
>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>


Re: [RFC] State & Timers API Design for Go SDK

2022-07-28 Thread Austin Bennett
Looks great!

On Thu, Jul 28, 2022 at 10:54 AM Jack McCluskey via dev 
wrote:

> Great write-up on state and timers! The solution you chose feels very
> in-line with how the Go SDK works. Make sure the design doc makes it onto
> the wiki once you've addressed any feedback!
>
> On Thu, Jul 28, 2022 at 1:49 PM Kerry Donny-Clark via dev <
> dev@beam.apache.org> wrote:
>
>> I think this a perfect example of a clear design doc. Great, deeply
>> detailed alternatives considered and why they were rejected. This makes
>> review easy, and lets us follow your thought process.
>> I think this is a good implementation, and I support the chosen approach.
>> Kerry
>>
>> On Thu, Jul 28, 2022 at 1:41 PM Kenneth Knowles  wrote:
>>
>>> Really thorough. Love it!
>>>
>>> On Thu, Jul 28, 2022 at 9:02 AM Ritesh Ghorse via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Hey everyone,
>>>>
>>>> Danny  and I have been working on
>>>> designing the state and timers for Go SDK. We wrote a design doc with
>>>> user-facing API, execution details, and different alternatives considered.
>>>> It would be really helpful if we could get your
>>>> suggestions/feedback/comments on the design.
>>>>
>>>> Design Doc:
>>>> https://docs.google.com/document/d/1rcKa1Z6orDDFr1l8t6NA1eLl6zanQbYAEiAqk39NQUU/edit?usp=sharing
>>>>
>>>> Thanks!
>>>> Ritesh Ghorse
>>>>
>>>


Re: [ANNOUNCE] New committer: Steven Niemitz

2022-07-20 Thread Austin Bennett
Great!

On Wed, Jul 20, 2022 at 10:11 AM Aizhamal Nurmamat kyzy 
wrote:

> Congrats, Steve!
>
> On Wed, Jul 20, 2022 at 3:10 AM Jan Lukavský  wrote:
>
>> Congrats Steve!
>> On 7/20/22 06:20, Reuven Lax via dev wrote:
>>
>> Welcome Steve!
>>
>> On Tue, Jul 19, 2022 at 1:05 PM Connell O'Callaghan via dev <
>> dev@beam.apache.org> wrote:
>>
>>>
>>> +++1 Woohoo! Congratulations Steven (and to the BEAM community) on this
>>> announcement!!!
>>>
>>> Thank you Luke for this update
>>>
>>>
>>> On Tue, Jul 19, 2022 at 12:34 PM Robert Burke 
>>> wrote:
>>>
>>>> Woohoo! Welcome and congratulations Steven!
>>>>
>>>> On Tue, Jul 19, 2022, 12:40 PM Luke Cwik via dev
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>> committer: Steven Niemitz (sniemitz@)
>>>>>
>>>>> Steven started contributing to Beam in 2017 fixing bugs and improving
>>>>> logging and usability. Steven's most recent focus has been on performance
>>>>> optimizations within the Java SDK.
>>>>>
>>>>> Considering the time span and number of contributions, the Beam PMC
>>>>> trusts Steven with the responsibilities of a Beam committer. [1]
>>>>>
>>>>> Thank you Steven! And we are looking to see more of your contributions!
>>>>>
>>>>> Luke, on behalf of the Apache Beam PMC
>>>>>
>>>>> [1]
>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>



Re: Join a meeting to help coordinate implementing a Dask Runner for Beam

2022-06-21 Thread Austin Bennett
Looks/Sounds great!

On Tue, Jun 21, 2022 at 11:06 AM Alex Merose  wrote:

> We had a great meeting last week on this topic! Here is a proposal /
> meeting notes doc:
>
> https://docs.google.com/document/d/1Awj_eNmH-WRSte3bKcCcUlQDiZ5mMKmCO_xV-mHWAak/edit#heading=h.y0pwg4polebc
>
> Tomorrow, another engineer (https://github.com/cisaacstern) and I are
> meeting to create an initial prototype of the Dask runner in the main Beam
> repo. Let us know if you'd like to help out in any way. We'll post updates
> in this mailing list + the above doc.
>
> Best,
> Alex Merose
>
> On 2022/06/08 14:22:41 Ryan Abernathey wrote:
> > Dear Beamer,
> >
> > Thank you for all of your work on this amazing project. I am new to Beam
> > and am quite excited about its potential to help with some data
> processing
> > challenges in my field of climate science.
> >
> > Our community is interested in running Beam on Dask Distributed clusters,
> > which we already know how to deploy. This has been discussed at
> > https://issues.apache.org/jira/browse/BEAM-5336 and
> > https://github.com/apache/beam/issues/18962. It seems technically
> feasible.
> >
> > We are trying to organize a meeting next week to kickstart and coordinate
> > this effort. It would be great if we could entrain some Beam maintainers
> > into this meeting. If you have interest in this topic and are available
> > next week, please share your availability here -
> > https://www.when2meet.com/?15861604-jLnA4
> >
> > Alternatively, if you have any guidance or suggestions you wish to
> provide
> > by email or GitHub discussion, we welcome your input.
> >
> > Thanks again for your open source work.
> >
> > Best,
> > Ryan Abernathey
> >
>


Re: [ANNOUNCE] New committer: Ke Wu

2022-05-27 Thread Austin Bennett
Congrats, and Thanks, Ke!

On Fri, May 27, 2022 at 3:49 PM Ahmet Altay  wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new committer:
> Ke Wu (kw2542@)
>
> Ke has been contributing to Beam since 2020. Ke's contributions are mostly
> focused on the SamzaRunner, as a result of Ke's efforts Beam has a fully
> featured, portable, supported SamzaRunner with happy users!
>
> Considering these contributions, the Beam PMC trusts Ke with the
> responsibilities of a Beam committer.[1]
>
> Thank you Ke!
>
> Ahmet
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>


Re: Javascript SDK

2022-05-04 Thread Austin Bennett
+1 -- had started playing with this a couple weeks ago, is really shaping
up!


Some questions about docs, and making [ developing any language ] more
approachable -->

I wonder whether we have learned enough from this for a guide of sorts for
future language development.  Perhaps, since fresh, would be a good time to
ensure things are noted.

I'm thinking some helpful docs to include somewhere ( if they don't already
exist ), specifically aimed at making it more approachable for someone to
consider starting to work on another language ( ex: I'm thinking of dart,
since I've been writing a lot of that lately :-), though there are plenty of
candidate languages )
* What were the tricky points ( language specific? or model specific? )?
* What would be needed to be considered a real MVP?  Ex: could be
considered suggestions - rather than requirements.  Ex: suggested to start
with XXX, and YYY can be more tricky [ at least in ZZZ context ] so
potentially save that for later.
* I'd also like to get a sense of the minimum level of features needed for
something to be accepted into main?  Ex: this one is easier since it was
led by a bunch of well-known-and-core-community members.  But, somewhat
outlining a process would potentially be helpful for people to see there is
a route to making something happen.
* etc...



Also, at what point do we think things marked @Experimental should get on
the website?  I'm thinking about getting on the sdks/language page --
https://beam.apache.org/documentation/sdks/python/  Naturally, is a
function of when someone is willing to do the work, but I also don't know
whether we'd overly want to highlight something that is still
rather-early/experimental on the general website.










On Wed, May 4, 2022 at 3:43 PM Robert Burke  wrote:

> +1 (non-PMC)
>
> On Wed, May 4, 2022, 3:37 PM Ahmet Altay  wrote:
>
>> Thank you!
>>
>> On Wed, May 4, 2022 at 4:22 PM Sachin Agarwal 
>> wrote:
>>
>>> Wow - great work y'all!
>>>
>>> On Wed, May 4, 2022 at 3:21 PM Robert Bradshaw 
>>> wrote:
>>>
>>>> The entire SDK has now been reviewed and all outstanding issues
>>>> addressed. https://github.com/apache/beam/pull/17341 (Big shout out to
>>>> Danny McCormick for his tireless work here!) This does not mean the
>>>> SDK is done, but it's marked as experimental (and isolated) and IMHO
>>>> to a point where we can continue to iterate on the main branch similar
>>>> to how we do our other development.
>>>>
>>>> Any objections or other thoughts on merging?

>>>
>> +1 to merging to the main branch.
>>
>>
>>>
 On Mon, Feb 7, 2022 at 9:21 AM Robert Bradshaw 
 wrote:
 >
 > +1 to separating things out if bundling them together becomes too
 > burdensome, though I agree we're not at that point yet (and there is a
 > non-trivial amount of overhead in just doing a release--speaking of
 > which I encourage everyone to look at and vote on the pending RC).
 >
 > That being said, the portability API, and the ability to evolve it in
 > a backwards compatible way with capabilities and requirements, makes
 > it easy to evolve each SDK and Runner independently and not have to
 > worry about which subset of the cross product is actually supported.
 >
 > On Mon, Feb 7, 2022 at 1:44 AM Jan Lukavský  wrote:
 > >
 > > I'll add one note from a different perspective. I think that
 long-term we should consider having separate release cycles for core, SDKs,
 DSLs and runners. It feels releasing all parts as a single "monolith" will
 gradually cause the core parts (e.g. model, runners-core, ...) to be more
 and more expensive to modify, because each modification to these core
 parts, might affect more and more other components. Enabling all SDKs and
 runners to "choose" the supported SDK-core or runner-core (while
 encouraging them to support the most recent!) is more maintainable for the
 future.
 > >
 > > I'm not saying we need to do something right now before merging the
 JS SDK, but on the other hand adding like 10 more SDKs would start to be an
 issue. We probably could talk about if (and how) we could make some sort of
 separation.
 > >
 > >  Jan
 > >
 > > On 2/4/22 18:42, Robert Burke wrote:
 > >
 > > I imagine by the nature of the Apache 2.0 license, the quality of
 the code in a given release is not a given without some other statement by
 the maintainers. We should clear and present warning signs. Erm.
 Experimental labeling.
 > >
 > > On Fri, Feb 4, 2022 at 8:27 AM Kenneth Knowles 
 wrote:
 > >>
 > >>
 > >>
 > >> On Thu, Feb 3, 2022 at 3:43 PM Robert Burke 
 wrote:
 > >>>
 > >>> Personally, if it gets added to the repo at all I'd rather we rip
 off the band-aid and at least have all the tests regularly run, and various
 GitHub actions. Even if we aren't doing the container release activities,
 because it's 

Re: Beam Website Feedback

2022-03-29 Thread Austin Bennett
Hi Charles,

I am not working with any of the mentioned DBs, so can't speak to the state
of IOs ( though I think you can use a jdbc connection with each of these,
so perhaps see:
https://beam.apache.org/releases/pydoc/2.24.0/apache_beam.io.jdbc.html ) .
But, I would suggest that contributions to the project are welcome,
especially in the event of a lacking IO!
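
As a rough sketch of what I mean by the jdbc route ( not something I have
run against these databases ): the helper below only assembles arguments
for apache_beam.io.jdbc.ReadFromJdbc.  The hosts, ports, database and
table names are placeholders, the driver class names and URL shapes are
the commonly documented ones for each vendor and should be checked against
your driver version, and ReadFromJdbc is a cross-language transform, so it
also needs a Java environment and a runner that supports it, with the
driver jar available to the expansion service.

```python
# Commonly used JDBC driver classes and URL templates (assumptions to verify
# against the driver version you actually deploy).
JDBC_SETTINGS = {
    "mysql": ("com.mysql.cj.jdbc.Driver",
              "jdbc:mysql://{host}:{port}/{db}"),
    "oracle": ("oracle.jdbc.OracleDriver",
               "jdbc:oracle:thin:@//{host}:{port}/{db}"),
    "sqlserver": ("com.microsoft.sqlserver.jdbc.SQLServerDriver",
                  "jdbc:sqlserver://{host}:{port};databaseName={db}"),
}


def jdbc_read_kwargs(kind, host, port, db, table, username, password):
    """Build keyword arguments for ReadFromJdbc for one of the databases above."""
    driver, url_template = JDBC_SETTINGS[kind]
    return {
        "table_name": table,
        "driver_class_name": driver,
        "jdbc_url": url_template.format(host=host, port=port, db=db),
        "username": username,
        "password": password,
    }


def run(kwargs):
    """Read the table and print each row; requires apache_beam plus Java."""
    # Imported lazily so the helper above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.io.jdbc import ReadFromJdbc

    with beam.Pipeline() as pipeline:
        (pipeline
         | "ReadTable" >> ReadFromJdbc(**kwargs)
         | "PrintRows" >> beam.Map(print))
```

Something like run(jdbc_read_kwargs("mysql", "localhost", 3306, "shop",
"orders", "user", "secret")) would then be the entry point, with all of
those values being made-up examples.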

Cheers,
Austin

On Tue, Mar 29, 2022 at 10:09 AM Charles Kangai 
wrote:

> Why is there no python odbc i/o transform either available or in
> development? Suppose my source data is in Oracle, MySQL or SQL Server –
> what am I to do?
>
> Thanks,
> Charles Kangai
> Email: char...@charleskangai.co.uk
>
>


Re: Create a Dataset in GCP Testing Project?

2022-01-24 Thread Austin Bennett
Bumping this from last week...  Can someone with permissions please verify
my permissions in the GCP Testing Project ?  It doesn't seem like I still
have access.

Thanks!

On Tue, Jan 18, 2022 at 3:10 PM Austin Bennett 
wrote:

> Following up here, it seems I have lost my access to the (GCP-based)
> Testing project -- had not addressed/finished this ticket for some time, as
> had been working on other things for a bit after being distracted.
>
> Can someone else please re-add me?  [ apologies if I can't figure out
> access and one of my emails already has access - I checked several ].
> Thanks!
>
> On Tue, Sep 28, 2021 at 2:47 PM Austin Bennett <
> whatwouldausti...@gmail.com> wrote:
>
>> Thanks, and Yes, some in wiki would be helpful -- I looked there first (
>> still am not certain on our conventions/why, so not sure that I’m the
>> person to document anything beyond, “send a message to the list if you’re a
>> committer and need access :-) ”).  Perhaps that is an issue which should be
>> filed (at least to add to backlog) in Jira.
>>
>> This kicks off some other questions relative to our infra and
>> conventions/practices, starting a new thread for that.
>>
>>
>> On Mon, Sep 27, 2021 at 11:10 AM Robert Burke  wrote:
>>
>>> We should probably add something to the wiki for that.
>>>
>>> On Mon, Sep 27, 2021, 10:42 AM Brian Hulette 
>>> wrote:
>>>
>>>> I don't think there's any policy in place for controlling access to the
>>>> apache-beam-testing project. I think in general PMC members are owners and
>>>> committers are editors, but it looks like there are a lot of exceptions to
>>>> this rule. For example, I am an owner - so I was able to grant you editor
>>>> access. I think you should be able to create a new dataset now.
>>>>
>>>> Brian
>>>>
>>>> On Fri, Sep 24, 2021 at 12:07 PM Austin Bennett <
>>>> whatwouldausti...@gmail.com> wrote:
>>>>
>>>>> Hi Devs,
>>>>>
>>>>> I am working on https://issues.apache.org/jira/browse/BEAM-10652 and
>>>>> specifically sorting out Integration Tests for this.  I believe that I need
>>>>> to create a dataset for this to work given errors ( a special purpose
>>>>> dataset seems cleaner than reusing an existing dataset, and in line with
>>>>> the conventions I see ).
>>>>>
>>>>> I would like to work through this, rather than someone just handling
>>>>> it.
>>>>>
>>>>> How to get access to our GCP Projects used for testing?  I might also
>>>>> have questions for how we generally like things done within ( ex: I haven't
>>>>> seen terraform repos for how we manage that infrastructure ;-) ).
>>>>>
>>>>> Thanks,
>>>>> Austin
>>>>>
>>>>>


Re: Create a Dataset in GCP Testing Project?

2022-01-18 Thread Austin Bennett
Following up here, it seems I have lost my access to the (GCP-based)
Testing project -- had not addressed/finished this ticket for some time, as
had been working on other things for a bit after being distracted.

Can someone else please re-add me?  [ apologies if I can't figure out
access and one of my emails already has access - I checked several ].
Thanks!

On Tue, Sep 28, 2021 at 2:47 PM Austin Bennett 
wrote:

> Thanks, and Yes, some in wiki would be helpful -- I looked there first (
> still am not certain on our conventions/why, so not sure that I’m the
> person to document anything beyond, “send a message to the list if you’re a
> committer and need access :-) ”).  Perhaps that is an issue which should be
> filed (at least to add to backlog) in Jira.
>
> This kicks off some other questions relative to our infra and
> conventions/practices, starting a new thread for that.
>
>
> On Mon, Sep 27, 2021 at 11:10 AM Robert Burke  wrote:
>
>> We should probably add something to the wiki for that.
>>
>> On Mon, Sep 27, 2021, 10:42 AM Brian Hulette  wrote:
>>
>>> I don't think there's any policy in place for controlling access to the
>>> apache-beam-testing project. I think in general PMC members are owners and
>>> committers are editors, but it looks like there are a lot of exceptions to
>>> this rule. For example, I am an owner - so I was able to grant you editor
>>> access. I think you should be able to create a new dataset now.
>>>
>>> Brian
>>>
>>> On Fri, Sep 24, 2021 at 12:07 PM Austin Bennett <
>>> whatwouldausti...@gmail.com> wrote:
>>>
>>>> Hi Devs,
>>>>
>>>> I am working on https://issues.apache.org/jira/browse/BEAM-10652 and
>>>> specifically sorting out Integration Tests for this.  I believe that I need
>>>> to create a dataset for this to work given errors ( a special purpose
>>>> dataset seems cleaner than reusing an existing dataset, and in line with
>>>> the conventions I see ).
>>>>
>>>> I would like to work through this, rather than someone just handling
>>>> it.
>>>>
>>>> How to get access to our GCP Projects used for testing?  I might also
>>>> have questions for how we generally like things done within ( ex: I haven't
>>>> seen terraform repos for how we manage that infrastructure ;-) ).
>>>>
>>>> Thanks,
>>>> Austin
>>>>
>>>>


Re: Beam Website Feedback

2021-11-23 Thread Austin Bennett
PRs welcome :-)



On Mon, Nov 22, 2021, 4:46 PM Michael Lehenbauer 
wrote:

> The python version of
> https://beam.apache.org/documentation/programming-guide/#using-schemas
> has a bunch of missing examples...
>
> [image: image.png]
>


Re: Create a Dataset in GCP Testing Project?

2021-09-28 Thread Austin Bennett
Thanks, and Yes, some in wiki would be helpful -- I looked there first (
still am not certain on our conventions/why, so not sure that I’m the
person to document anything beyond, “send a message to the list if you’re a
committer and need access :-) ”).  Perhaps that is an issue which should be
filed (at least to add to backlog) in Jira.

This kicks off some other questions relative to our infra and
conventions/practices, starting a new thread for that.


On Mon, Sep 27, 2021 at 11:10 AM Robert Burke  wrote:

> We should probably add something to the wiki for that.
>
> On Mon, Sep 27, 2021, 10:42 AM Brian Hulette  wrote:
>
>> I don't think there's any policy in place for controlling access to the
>> apache-beam-testing project. I think in general PMC members are owners and
>> committers are editors, but it looks like there are a lot of exceptions to
>> this rule. For example, I am an owner - so I was able to grant you editor
>> access. I think you should be able to create a new dataset now.
>>
>> Brian
>>
>> On Fri, Sep 24, 2021 at 12:07 PM Austin Bennett <
>> whatwouldausti...@gmail.com> wrote:
>>
>>> Hi Devs,
>>>
>>> I am working on https://issues.apache.org/jira/browse/BEAM-10652 and
>>> specifically sorting out Integration Tests for this.  I believe that I need
>>> to create a dataset for this to work given errors ( a special purpose
>>> dataset seems cleaner than reusing an existing dataset, and in line with
>>> the conventions I see ).
>>>
>>> I would like to work through this, rather than someone just handling
>>> it.
>>>
>>> How to get access to our GCP Projects used for testing?  I might also
>>> have questions for how we generally like things done within ( ex: I haven't
>>> seen terraform repos for how we manage that infrastructure ;-) ).
>>>
>>> Thanks,
>>> Austin
>>>
>>>


Create a Dataset in GCP Testing Project?

2021-09-24 Thread Austin Bennett
Hi Devs,

I am working on https://issues.apache.org/jira/browse/BEAM-10652 and
specifically sorting out Integration Tests for this.  I believe that I need
to create a dataset for this to work given errors ( a special purpose
dataset seems cleaner than reusing an existing dataset, and in line with
the conventions I see ).

I would like to work through this, rather than someone just handling it.

How to get access to our GCP Projects used for testing?  I might also have
questions for how we generally like things done within ( ex: I haven't seen
terraform repos for how we manage that infrastructure ;-) ).

Thanks,
Austin


Re: Avro String decoding changes in Beam 2.30.0

2021-08-05 Thread Austin Bennett
Claire and I discussed this a bit earlier, and I asked her to submit a PR.
It can be found at: https://github.com/apache/beam/pull/15292



On Wed, Aug 4, 2021 at 2:21 PM Kirill Panarin 
wrote:

> +1 to what Claire is proposing. Would be nice to have this merged to avoid
> fixing things on our side!
>
> On 2021/08/02 14:58:11, Claire McGinty 
> wrote:
> > hi, I wanted to ping this thread again! It would be really helpful to
> know
> > whether this is something that can be eventually fixed in Beam, or
> whether
> > we'll have to make the changes on our end.
> >
> > - Claire
> >
> > On Tue, Jul 27, 2021 at 8:06 AM Claire McGinty <
> claire.d.mcgi...@gmail.com>
> > wrote:
> >
> > > Right; since ReflectData has just the one instance per classloader
> using
> > > it in the API is purely stylistic to match the other signatures taking
> in
> > > Class/Schema :) open to any changes, the Boolean flag option is
> probably
> > > clearer I can do whichever option is most in line with Beam style!
> > >
> > > -Claire
> > >
> > > On Tue, Jul 27, 2021 at 5:47 AM Ryan Skraba  wrote:
> > >
> > >> Hello!  I took a quick look -- I think there's some potential
> > >> confusion on this line[1] and the reflectData argument being passed
> > >> into the new constructor.
> > >>
> > >> If I'm reading correctly, the argument passed in is never actually
> > >> used in the eventual ReflectDatumReader/Writer, and it's a different
> > >> type than the "this.reflectData" member in the instance.
> > >>
> > >> To restore the original behaviour, I'd probably recommend just passing
> > >> in a boolean argument instead, something very explicit along the lines
> > >> of "useReflectionOnSpecificData", or "alwaysUseAvroReflect".  That's
> > >> also the reason to consider a very simple AvroReflectCoder.of(...)
> > >> instead of an AvroCoder.of(x, y, true) factory method for readability,
> > >> like what was done with AvroGenericCoder.
> > >>
> > >> It would be easier to comment on a PR, don't hesitate!
> > >>
> > >> All my best, Ryan
> > >>
> > >> [1]
> > >>
> https://github.com/apache/beam/compare/master...clairemcginty:avro_reflect_coder_option?expand=1#diff-e875a9933286d97dd3d3d21a61e6f11c0e35624e97411c1b98f1ac672c21045dR311
> > >>
> > >>
> > >> On Mon, Jul 26, 2021 at 6:42 PM Claire McGinty
> > >>  wrote:
> > >> >
> > >> > Thanks! I put up a branch with a possible solution for adding the
> > >> Reflect option to AvroCoder with as minimal a code change as possible
> [1] -
> > >> would love to get anyone's thoughts on this.
> > >> >
> > >> > - Claire
> > >> >
> > >> > On Wed, Jul 21, 2021 at 7:00 PM Ahmet Altay 
> wrote:
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Wed, Jul 21, 2021 at 9:37 AM Claire McGinty <
> > >> claire.d.mcgi...@gmail.com> wrote:
> > >> >>>
> > >> >>> Hi Ahmet! Yes, I think it should be documented in the release
> notes.
> > >> >>
> > >> >>
> > >> >> Great. +Vitaly, do you want to add the breaking change to the
> release
> > >> notes, since this was related to your change.
> > >> >>
> > >> >>>
> > >> >>> What do you think of Ryan’s suggestion to add a ReflectAvroCoder
> or a
> > >> configuration option to the existing AvroCoder?
> > >> >>
> > >> >>
> > >> >> I am not sure I am the best person to answer this. Second option,
> of
> > >> adding a configuration to the existing AvroCoder, rather than
> creating a
> > >> new coder makes more sense to me.
> > >> >>
> > >> >> That said, people who might have an opinion: /cc @Ismaël Mejía
> > >> @Kenneth Knowles @Lukasz Cwik +Vitaly
> > >> >>
> > >> >>>
> > >> >>>
> > >> >>> Thanks,
> > >> >>> Claire
> > >> >>>
> > >> >>> On Tue, Jul 20, 2021 at 4:15 PM Ahmet Altay 
> wrote:
> > >> 
> > >>  Is this something we need to add to the 2.30.0 release notes (
> > >> https://beam.apache.org/blog/beam-2.30.0/) as a breaking change?
> > >> 
> > >>  On Fri, Jul 16, 2021 at 7:11 AM Ryan Skraba 
> wrote:
> > >> >
> > >> > Hello!  Good catch, I'm taking a look, but it looks like you're
> > >> > entirely correct and there isn't any obvious workaround.  I
> guess
> > >> you
> > >> > could regenerate every SpecificRecord class in order to add the
> > >> > "java-class" or "avro.java.string" annotation, but that
> shouldn't be
> > >> > necessary.
> > >> >
> > >> > From the Avro perspective, we should always have been using
> > >> > SpecificDatumReader/Writer for all generated
> SpecificRecords...  We
> > >> > would still have the same Utf8 and .toString problems, but at
> least
> > >> > there would be no change in behaviour during migration :/
> > >> >
> > >> > As a side note, the Apache Avro project should probably
> reconsider
> > >> > whether the Utf8 class still adds any value with modern JVMs!
> If I
> > >> > understand correctly, it was originally in place because Hadoop
> had
> > >> a
> > >> > performance boost when it could reuse mutable data containers.
> > >> >
> > >> > Moving forward, I think your suggestion is the 

Re: Spark Structured Streaming runner migrated to Spark 3

2021-08-05 Thread Austin Bennett
Hooray!  Thanks, Etienne!

On Thu, Aug 5, 2021 at 3:11 AM Etienne Chauchot 
wrote:

> Hi all,
>
> Just to let you know that Spark Structured Streaming runner was migrated
> to Spark 3.
>
> Enjoy !
>
> Etienne
>
>


Re: One Pager - Test Command Line Discoverability in Beam

2021-05-25 Thread Austin Bennett
Cool; will be good to have and make things clearer!
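
As a rough illustration of the command shape Kyle describes below, here is a
small sketch that assembles a "./gradlew $MODULE:integrationTest" invocation.
The helper name, the module path ":examples:java", and the test filter are
assumptions chosen for the BigQueryTornadoesIT example; "--tests" is Gradle's
standard test filter option. A real integration test will likely need extra
pipeline options that are not shown here.

```python
import shlex


def gradle_test_command(module, task="integrationTest", test_filter=None):
    """Build a './gradlew <module>:<task>' invocation as an argument list.

    `module` is a Gradle project path such as ':examples:java' (assumed here,
    not verified against the Beam build). `test_filter` is passed to Gradle's
    standard '--tests' option when given.
    """
    args = ["./gradlew", f"{module}:{task}"]
    if test_filter:
        args += ["--tests", test_filter]
    return args


# Hypothetical usage for the test class mentioned in the one pager.
cmd = gradle_test_command(":examples:java", test_filter="*BigQueryTornadoesIT")
print(shlex.join(cmd))  # shell-quoted form, suitable for pasting in a terminal
```

If every test class's javadoc carried a line of this shape, the lookup
problem in the one pager would mostly go away.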

On Tue, May 25, 2021 at 2:39 PM Kyle Weaver  wrote:

> I left some comments. In summary, I think this is mostly a documentation
> problem. If running a test isn't as easy as "./gradlew
> $MODULE:integrationTest", there should be instructions in the test class's
> javadoc.
>
> On Tue, May 25, 2021 at 2:05 PM Udi Meiri  wrote:
>
>> My first place to go would be here:
>> https://cwiki.apache.org/confluence/display/BEAM/Java+Tips (although it
>> doesn't document your use-case)
>>
>> You are right that finding the correct gradle task or jenkins job is not
>> straightforward.
>>
>>
>> On Tue, May 25, 2021 at 12:48 PM Alex Amato  wrote:
>>
>>> Friendly ping. I'll wait for more suggestions by the end of the week.
>>> Then close it out.
>>>
>>> -- Forwarded message -
>>> From: Alex Amato 
>>> Date: Fri, May 21, 2021 at 2:54 PM
>>> Subject: One Pager - Test Command Line Discoverability in Beam
>>> To: dev 
>>>
>>>
>>> Hi, I have had some issues determining how to run Beam tests. I have
>>> written a one pager for review and would like your feedback, to solve the
>>> problem:
>>>
>>> "A Beam developer is looking at a test file, such as
>>> “BigQueryTornadoesIT.java” and wants to run this test. But they do not know
>>> the command line they need to type to run this test."
>>>
>>> I would like your feedback, to get toward a more concrete proposal. A
>>> few solutions are possible for this, mentioned in the proposal. But any
>>> solution that makes it very easy to understand how to run the test is a
>>> viable option as well.
>>>
>>> Cheers,
>>> Alex
>>>
>>


Re: [ANNOUNCE] New committer: Ning Kang

2021-03-24 Thread Austin Bennett
Thanks, Ning!

On Wed, Mar 24, 2021 at 5:51 AM Reza Rokni  wrote:

> Congratulations!
>
> On Wed, Mar 24, 2021 at 12:53 PM Kenneth Knowles  wrote:
>
>> Congratulations and thanks for all your contributions, Ning!
>>
>> Kenn
>>
>> On Tue, Mar 23, 2021 at 4:10 PM Valentyn Tymofieiev 
>> wrote:
>>
>>> Congratulations, Ning, very well deserved!
>>>
>>> On Tue, Mar 23, 2021 at 2:01 PM Chamikara Jayalath 
>>> wrote:
>>>
 Congrats Ning!

 On Tue, Mar 23, 2021 at 1:23 PM Rui Wang  wrote:

> Congrats!
>
>
>
> -Rui
>
> On Tue, Mar 23, 2021 at 1:05 PM Yichi Zhang  wrote:
>
>> Congratulations Ning!
>>
>> On Tue, Mar 23, 2021 at 1:00 PM Robin Qiu  wrote:
>>
>>> Congratulations Ning!
>>>
>>> On Tue, Mar 23, 2021 at 12:56 PM Ahmet Altay 
>>> wrote:
>>>
 Congratulations Ning!

 On Tue, Mar 23, 2021 at 12:38 PM Alexey Romanenko <
 aromanenko@gmail.com> wrote:

> Congrats, Ning Kang! Well deserved!
> Thank you for your contributions and users support!
>
> Alexey
>
> On 23 Mar 2021, at 20:35, Pablo Estrada 
> wrote:
>
> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Ning Kang.
>
> Ning has been working in Beam for a while. He has contributed to
> the interactive experience of the Python SDK, and developed a sidebar
> component, along with a release process for it. Ning has also helped 
> users
> on StackOverflow and user@, especially when it comes to
> Interactive Beam.
>
> Considering these contributions, the Beam PMC trusts Ning with the
> responsibilities of a Beam committer.[1]
>
> Thanks Ning!
> -P.
>
> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>
>
>


Re: UX Research Findings Readout for Apache Beam Community

2021-01-30 Thread Austin Bennett
Is it possible to write up/share results for those not able to attend and/or
to digest ahead of attending?



On Thu, Jan 28, 2021, 10:46 AM Carlos Camacho Frausto <
carlos.cama...@wizeline.com> wrote:

> Hello,
> Some weeks ago, our firm conducted a User Experience Research Study for
> Google Apache Beam to identify users’ needs and pain points when learning
> and using Apache Beam.
>
> *Today, we are glad to invite you to a Readout session where we will
> present the key findings and a list of recommendations in order to improve
> the learning experience for Apache Beam users. This session will consider a
> Q&A where you’ll be able to interact with the community. *
>
> We are considering a session of 60 minutes on any of these possible dates:
>
>- Thursday, February 11th at 11:00 AM CST / 6:00 PM CEST
>- Thursday, February 11th at 2:00 PM CST / 9:00 PM CEST
>- Friday, February 12th at 11:00 AM CST / 6:00 PM CEST
>- Friday, February 12th at 2:00 PM CST / 9:00 PM CEST
>
>
> If you would like to attend the session, *please let us know which of
> the date/time options works best for you by filling out this form*:
> https://forms.gle/LHjB3uYiJ35BFcbM6
>
> --
>
> Carlos Camacho | WIZELINE
>
> UX Designer
>
> carlos.cama...@wizeline.com
>
> Amado Nervo 2200, Esfera P6, Col. Jardines del Sol, 45050 Zapopan, Jal.
>
> Follow us @WizelineGlobal | Facebook | LinkedIn


Re: Standarizing the "Runner" concept across website content

2021-01-07 Thread Austin Bennett
To those unfamiliar with these concepts, I generally conflate everything to
a "Runner" to keep things simple.  Though, also mention "execution engine"
at times.  Glad there appears to be concrete consensus on how we want to
talk about this.  It will also help guide me in being consistent :-)



On Wed, Jan 6, 2021 at 3:05 PM Griselda Cuevas  wrote:

> Thank you all for this productive conversation!
>
> Interestingly enough, a usability study we ran for Apache Beam (more
> details coming soon) pointed out that our documentation and website assume
> that the readers will be already familiar with Data Processing basic
> concepts such as engines, pipelines, etc. So introducing a glossary and
> even rethinking how we add these concepts into our new documentation is a
> good practice to have in mind.
>
> In the meantime, I will adopt the suggestion of differentiating between
> engine and runner. The first application I made of this is in the copy for
> the home page, which you can find as an attached file in this Jira ticket
> [1] in case you want to add comments/suggestions.
>
> The home page is the most important page in the website, as it's the one
> that explains Beam to the world and markets its features, so I'd appreciate
> feedback there too.
>
> Thanks everyone!
>
> [1]
> https://issues.apache.org/jira/browse/BEAM-11346?jql=project%20%3D%20beam%20AND%20assignee%20%3D%20gris%20ORDER%20BY%20priority%20DESC
>
> On Wed, 6 Jan 2021 at 13:33, Kenneth Knowles  wrote:
>
>>
>>
>> On Wed, Jan 6, 2021 at 12:28 PM Robert Burke  wrote:
>>
>>> +1 on consolidating and being consistent with our terms.
>>>
>>> I've always considered them (Runner/Engine) synonymous. From a user
>>> perspective, an engine without a runner isn't any good for their beam
>>> pipeline. That there's an adapter is an implementation detail in some
>>> instances. I do appreciate not using Adapter as a term, avoiding confusing
>>> descriptions.
>>>
>>> However, if we make the change and there's a clear glossary of terms
>>> somewhere then
>>>
>>> That puts the lifecycle of a pipeline to be (loosely) something like...
>>>
>>> A Beam User authors Pipelines by writing DoFns, adding them as
>>> PTransforms connected by PCollections into a Pipeline using a Beam SDK. An
>>> SDK converts the pipeline into a portable representation, and submit it to
>>> the Job Management Service of a Beam Runner. A Beam Runner translates the
>>> portable pipeline representation into terms an underlying Engine
>>> understands for Execution. The Beam Runner also reverses this translation
>>> when the Engine delegates tasks to workers, so that the Beam SDKs can
>>> execute the user's DoFns in keeping with the Beam Semantics.
>>>
>>
>> An explicit glossary is a great idea to combine with standardizing
>> terminology across the site. I think the important context is that most of
>> the engines already existed before Beam and many of them are more
>> well-known. In fact, a pretty good way for a user to understand the essence
>> of what Beam is about is by taking a look at all the engines for which
>> there are Beam runners :-)
>>
>> Engine: a system/product for doing [big] data processing
>> Pipeline: user authors this logic that says what they want to compute (I
>> think the fact that it is a DAG of PTransforms is relevant but we can get
>> away with omitting it for the high-level view and to avoid introducing the
>> term PTransform too early)
>> Runner: executes a Beam pipeline on an engine (agree that "adapter" is
>> too generic)
>>
>> I'd say below that level of granularity is getting into things that you
>> need to know only after you have started writing pipelines. Possibly you
>> need to introduce SDK harness to make clear that Beam pipelines are
>> inherently multi-language/multi-runtime, even if the engine isn't (my
>> personal opinion is that "UDF server" is the best understood terminology
>> for this, and so much better that it is never too late to abandon the
>> cryptic term "SDK harness").
>>
>> Kenn
>>
>>
>>> (Not covered, bundles etc, but you get the idea...)
>>>
>>> On Wed, Jan 6, 2021, 11:16 AM Robert Bradshaw 
>>> wrote:
>>>
 +1 to keeping the distinction between Runner and Engine as Kenn
 described, and cleaning up the site with these in mind (I don't think the
 term engine is widely used yet).

 On Wed, Jan 6, 2021 at 11:15 AM Yichi Zhang  wrote:

> I agree with what kenn said, in most cases I would refer to the term
> runner as the adapter for translating user's pipeline code into a job
> representation and submitting it to the execution engine. Though in some
> cases they may still be used interchangeably such as direct runner?
>
> On Wed, Jan 6, 2021 at 11:02 AM Kenneth Knowles 
> wrote:
>
>> I personally try to always distinguish two concepts: the thing doing
>> the computing (like Spark or Flink), and the adapter for running a Beam
>> pipeline (like SparkRunner or FlinkRunner). I use the term 

Re: Looking for a PMC member to help with website development

2020-11-04 Thread Austin Bennett
And, @Griselda Cuevas  -- not meaning to change focus of
thread.  It seemed you might have the ability to cast a wider net.  But, I
also might be off on the differences in roles/rights/responsibilities.

On Wed, Nov 4, 2020 at 9:26 AM Austin Bennett 
wrote:

> To understand differences in PMC vs committer --> would like to understand
> why a committer doesn't suffice for the listed requests (and @Griselda
> Cuevas, you are a committer, but it seems you'd
> potentially just want another committer to also review).
>
> It seems the website is less about shifting the direction of the project
> or exposing APIs that we may feel compelled to support long term, which is
> why I naively assume PMC not especially needed (and it does look like this
> is being done with full visibility to PMC at a high level).
>
>
> On Wed, Nov 4, 2020 at 8:35 AM Gris Cuevas  wrote:
>
>> Hi folks,
>>
>> We're going to move into development phase for the new website and we
>> need a point of contact in the PMC who could help us with the following:
>> - Review of contribution
>> - Input on implementation questions such as how to divide contributions
>> to make them easier to review/edit
>> - Code architecture questions
>>
>> And other questions that come up from the development, would anyone
>> volunteer to help us with this?
>>
>> Gris
>>
>


Re: Looking for a PMC member to help with website development

2020-11-04 Thread Austin Bennett
To understand differences in PMC vs committer --> would like to understand
why a committer doesn't suffice for the listed requests (and @Griselda
Cuevas, you are a committer, but it seems you'd
potentially just want another committer to also review).

It seems the website is less about shifting the direction of the project or
exposing APIs that we may feel compelled to support long term, which is why
I naively assume PMC not especially needed (and it does look like this is
being done with full visibility to PMC at a high level).


On Wed, Nov 4, 2020 at 8:35 AM Gris Cuevas  wrote:

> Hi folks,
>
> We're going to move into development phase for the new website and we need
> a point of contact in the PMC who could help us with the following:
> - Review of contribution
> - Input on implementation questions such as how to divide contributions to
> make them easier to review/edit
> - Code architecture questions
>
> And other questions that come up from the development, would anyone
> volunteer to help us with this?
>
> Gris
>


BeamSQL and Beam equivalent -- examples?

2020-11-01 Thread Austin Bennett
Hi All,

For something I am currently writing -- I am seeking any examples of
BeamSQL and Beam that take the same input and produce the same output.  I
can't recall, offhand, any examples/slides/writeups.  Do any exist?

I would like to show:

(a) that BeamSQL is a real thing :-)
(b) that Beam can express the same as BeamSQL
(c) that Beam can be more expressive than just SQL concepts.

Imagining such examples can help with points a and b.
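
To make the shape of what I am after concrete, here is a rough sketch that
deliberately does not use Beam: sqlite3 stands in for the SQL side, and a
plain key/group/combine loop stands in for the transform side. The table
name, column names, and sample rows are made up for illustration. The point
is only that both take the same input and produce the same output.

```python
import sqlite3
from collections import defaultdict

# Hypothetical input rows: (buyer, amount).
ROWS = [("ann", 3), ("bob", 5), ("ann", 4), ("bob", 1), ("cy", 7)]


def totals_sql(rows):
    """Declarative version: a GROUP BY query (sqlite3 is a stand-in for BeamSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (buyer TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT buyer, SUM(amount) FROM purchases GROUP BY buyer ORDER BY buyer"
    ).fetchall()
    conn.close()
    return result


def totals_transforms(rows):
    """Explicit version: group values by key, then combine each group with sum."""
    grouped = defaultdict(list)
    for buyer, amount in rows:  # key each element and group by that key
        grouped[buyer].append(amount)
    # Combine per key, sorted so the output order matches the SQL ORDER BY.
    return sorted((buyer, sum(amounts)) for buyer, amounts in grouped.items())


# Same input, same output, two ways of expressing the computation.
assert totals_sql(ROWS) == totals_transforms(ROWS)
```

A real side-by-side would swap in BeamSQL for the query and Beam transforms
for the loop; that is the part I am hoping someone already has.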

Thanks,
Austin


Re: Beam summit update on blog?

2020-09-22 Thread Austin Bennett
Imagining we should also share specific talks over a longer time-frame?
Like not just a one off announcement, but drive activity/attention over
next months, until concretely planning.  Something like every other week
tweet of specific talk.

Which would be separate from any new (small) events we assemble

On Tue, Sep 22, 2020, 2:16 PM Ahmet Altay  wrote:

>
>
> On Tue, Sep 22, 2020 at 2:09 PM Matthias Baetens <
> baetensmatth...@gmail.com> wrote:
>
>> Sure! I'm out on holidays at the moment, but will pick it up sometime
>> next week - we can brainstorm a bit and see how we can get something done
>> together if you'd like! :)
>>
>
> Sounds great! Enjoy your holiday :)
>
>
>>
>> On Tue, 22 Sep 2020 at 20:31, Ahmet Altay  wrote:
>>
>>>
>>>
>>> On Mon, Sep 21, 2020 at 10:05 PM Matthias Baetens <
>>> baetensmatth...@gmail.com> wrote:
>>>
 +1 totally. I was thinking a retweet of the Beam Summit recording on
 the Apache Beam channel would be helpful as well - but I'm happy to look
 into doing a blogpost as well!

>>>
>>> Great, thank you Matthias. I did not help much with this summit so I am
>>> probably lacking context. But I am happy to help with a post as much as I
>>> can.
>>>
>>>

 On Tue, Sep 22, 2020, 04:25 Ahmet Altay  wrote:

> Would it make sense to publish a blog post and a tweet about the Beam
> Summit 2020 with some highlights and links to videos? There are many hours
> of video content. This is great and a social media shareable post / tweet
> could help with promoting this content.
>
> Ahmet
>



Re: [ANNOUNCE] New committer: Reza Ardeshir Rokni

2020-09-10 Thread Austin Bennett
Thanks and congrats, Reza!

On Thu, Sep 10, 2020 at 5:48 PM Heejong Lee  wrote:

> Congratulations!
>
> On Thu, Sep 10, 2020 at 4:42 PM Robert Bradshaw 
> wrote:
>
>> Thank you and welcome, Reza!
>>
>> On Thu, Sep 10, 2020 at 4:00 PM Ahmet Altay  wrote:
>>
>>> Congratulations Reza! And thank you for your contributions!
>>>
>>> On Thu, Sep 10, 2020 at 3:59 PM Chamikara Jayalath 
>>> wrote:
>>>
 Congrats Reza!

 On Thu, Sep 10, 2020 at 10:35 AM Kenneth Knowles 
 wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Reza Ardeshir Rokni.
>
> Reza has been part of the Beam community since 2017! Reza has
> spearheaded advanced Beam examples [1], blogged and presented at multiple
> Beam Summits. Reza helps out users on the mailing lists [2] and
> StackOverflow [3]. When Reza's work uncovers a missing feature in Beam, he
> adds it [4]. Considering these contributions, the Beam PMC trusts Reza 
> with
> the responsibilities of a Beam committer [5].
>
> Thank you, Reza, for your contributions.
>
> Kenn
>
> [1] https://github.com/apache/beam/pull/3961
> [2]
> https://lists.apache.org/list.html?u...@beam.apache.org:gte=0d:reza%20rokni
> [3] https://stackoverflow.com/tags/apache-beam/topusers
> [4] https://github.com/apache/beam/pull/11929
> [5]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>



Re: Ability to link to "latest" of python docs

2020-09-08 Thread Austin Bennett
+dev 

Lynn,

Seems totally doable.  If others don't speak up with a good way to do this
(or in opposition), I'm sure we can sort something out to accomplish this
(will dig into intersphinx mapping tomorrow).
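
For reference, a sketch of what this use case looks like in the Sphinx
conf.py of a downstream project. The ".../pydoc/latest/" URL is the alias
requested in this thread and is assumed to exist; the "beam" key and the
BEAM_PYDOC name are arbitrary choices for illustration.

```python
# conf.py of a hypothetical project that documents code built on Beam.
extensions = ["sphinx.ext.intersphinx"]

# Assumption: a stable "latest" alias is published, as requested in this thread.
BEAM_PYDOC = "https://beam.apache.org/releases/pydoc/latest/"

intersphinx_mapping = {
    # name: (base URL of the target docs, inventory location).
    # None tells Sphinx to fetch "objects.inv" from the base URL.
    "beam": (BEAM_PYDOC, None),
}
```

Without such an alias, the base URL has to name a specific SDK version and
be bumped by hand on every release.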

Cheers,
Austin




On Tue, Sep 8, 2020, 5:19 PM Lynn Root  wrote:

> Hey folks -
>
> I'm wondering if there's a way to link to the latest SDK version of the
> Python documentation. I see that if I go here, it lists all the available
> documented SDK versions. But it'd be really nice to go to a link like
> "https://beam.apache.org/releases/pydoc/latest" and be automatically
> pointed to the latest one. This is particularly handy for documenting
> libraries that use beam via intersphinx mapping.
>
> Thanks!
>
> --
> Lynn Root
> Staff Engineer, Spotify
>


Intro to Beam and Contributing Workshops

2020-07-19 Thread Austin Bennett
Hi All,

I'm a huge fan of HOPE.

In the virtual edition this year, I am giving 2 talks.

* a 2hr introduction to Beam.
* a 1hr introduction to contributing to open source (with specific examples
from Beam).

These will occur on 30/31 July; the schedule can be found at:
https://scheduler.hope.net/hope2020/schedule/

I can see whether there are additional passes available for me as a speaker
to share with the community (not sure on this point).

Cheers,
Austin


Re: [VOTE] Extension name of Interactive Beam Side Panel in JupyterLab

2020-07-16 Thread Austin Bennett
if specific to jupyterlab, then [3] makes sense.  Am wondering whether that
gets confusing, if it will also be used in datalab, and/or other
(similar/same underlying tech but going by different names)?

On Thu, Jul 16, 2020 at 12:35 PM Pablo Estrada  wrote:

> +1 for 3. Thanks Ning.
>
> On Thu, Jul 16, 2020 at 10:54 AM Kenneth Knowles  wrote:
>
>> +1 for [3]
>>
>> On Wed, Jul 15, 2020 at 5:47 PM Robert Bradshaw 
>> wrote:
>>
>>> +1 for [3] as well.
>>>
>>> On Wed, Jul 15, 2020 at 5:40 PM Ahmet Altay  wrote:
>>> >
>>> > I agree with Kyle. [3] sounds more accurate.
>>> >
>>> > On Wed, Jul 15, 2020 at 3:00 PM Kyle Weaver 
>>> wrote:
>>> >>
>>> >> I prefer [3].
>>> >>
>>> >> On Tue, Jul 14, 2020 at 10:53 AM Ning Kang  wrote:
>>> >>>
>>> >>> Hi everyone,
>>> >>>
>>> >>> Last week, I sent a design doc and proposals in this email thread
>>> about creating a JupyterLab extension for Interactive Beam. If you haven't
>>> had a chance to look at it and you're interested in Interactive Beam,
>>> please feel free to leave comments.
>>> >>>
>>> >>> Let's start a vote for the name of this extension to be used when
>>> published to NPM.
>>> >>> Here are some of the candidate names:
>>> >>> [1] apache-beam-sidepanel
>>> >>> [2] apache-beam-interactive-sidepanel
>>> >>> [3] apache-beam-jupyterlab-sidepanel
>>> >>> [4] 
>>> >>>
>>> >>>
>>> >>> The vote will be open for at least 72 hours. It is adopted by
>>> majority approval, with at least 3 PMC affirmative votes.
>>> >>>
>>> >>> Thanks!
>>> >>>
>>> >>> Ning.
>>>
>>


Re: [ANNOUNCE] New committer: Aizhamal Nurmamat kyzy

2020-06-29 Thread Austin Bennett
Congratulations, @Aizhamal Nurmamat kyzy!

On Mon, Jun 29, 2020 at 2:32 PM Valentyn Tymofieiev 
wrote:

> Congratulations and big thank you for all the hard work on Beam, Aizhamal!
>
> On Mon, Jun 29, 2020 at 9:56 AM Kenneth Knowles  wrote:
>
>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>> Aizhamal Nurmamat kyzy
>>
>> Over the last 15 months or so, Aizhamal has driven many efforts in the
>> Beam community and contributed to others. Aizhamal started by helping with
>> the Beam newsletter [1] then continued by contributing to meetup planning
>> [2] [3] and Beam Summit planning [4]. Aizhamal created Beam's system for
>> managing social media [5] and contributed many tweets, coordinated the vote
>> and design of Beam's mascot [6] [7], drove migration of Beam's site to a
>> more i18n-friendly infrastructure [8], kept on top of Beam's enrollment in
>> Season of Docs [9], and even organized remote Beam Webinars during the
>> pandemic [10].
>>
>> In consideration of Aizhamal's contributions, the Beam PMC trusts her with
>> the responsibilities of a Beam committer [11].
>>
>> Thank you, Aizhamal, for your contributions and looking forward to many
>> more!
>>
>> Kenn, on behalf of the Apache Beam PMC
>>
>> [1]
>> https://lists.apache.org/thread.html/447ae9fdf580ad88522aabc8a0f3703c51acd8885578bb422389a4b0%40%3Cdev.beam.apache.org%3E
>> [2]
>>
>> https://lists.apache.org/thread.html/ebeeae53a64dca8bb491e26b8254d247226e6d770e33dbc9428202df%40%3Cdev.beam.apache.org%3E
>> [3]
>>
>> https://lists.apache.org/thread.html/rc31d3d57b39e6cf12ea3b6da0e884f198f8cbef9a73f6a50199e0e13%40%3Cdev.beam.apache.org%3E
>> [4]
>>
>> https://lists.apache.org/thread.html/99815d5cd047e302b0ef4b918f2f6db091b8edcf430fb62e4eeb1060%40%3Cdev.beam.apache.org%3E
>> [5]
>> https://lists.apache.org/thread.html/babceeb52624fd4dd129c259db8ee9017cb68cba069b68fca7480c41%40%3Cdev.beam.apache.org%3E
>> [6]
>>
>> https://lists.apache.org/thread.html/60aa4b149136e6aa4643749731f4b5a041ae4952e7b7e57654888bed%40%3Cdev.beam.apache.org%3E
>> [7]
>>
>> https://lists.apache.org/thread.html/r872ba2860319cbb5ca20de953c43ed7d750155ca805cfce3b70085b0%40%3Cdev.beam.apache.org%3E
>> [8]
>>
>> https://lists.apache.org/thread.html/rfab4cc1411318c3f4667bee051df68f37be11846ada877f3576c41a9%40%3Cdev.beam.apache.org%3E
>> [9]
>>
>> https://lists.apache.org/thread.html/r4df2e596751e263a83300818776fbb57cb1e84171c474a9fd016ec10%40%3Cdev.beam.apache.org%3E
>> [10]
>>
>> https://lists.apache.org/thread.html/r81b93d700fedf3012b9f02f56b5d693ac4c1aac1568edf9e0767b15f%40%3Cuser.beam.apache.org%3E
>> [11]
>>
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: [ANNOUNCE] New PMC Member: Alexey Romanenko

2020-06-16 Thread Austin Bennett
Congrats!

On Tue, Jun 16, 2020 at 1:27 PM Valentyn Tymofieiev 
wrote:

> Congratulations!
>
> On Tue, Jun 16, 2020 at 11:41 AM Ahmet Altay  wrote:
>
>> Congratulations!
>>
>> On Tue, Jun 16, 2020 at 10:05 AM Pablo Estrada 
>> wrote:
>>
>>> Yooohooo! Thanks for all your contributions and hard work Alexey!:)
>>>
>>> On Tue, Jun 16, 2020, 8:57 AM Ismaël Mejía  wrote:
>>>
 Please join me and the rest of Beam PMC in welcoming Alexey Romanenko
 as our
 newest PMC member.

 Alexey has significantly contributed to the project in different ways:
 new
 features and improvements in the Spark runner(s) as well as maintenance
 of
 multiple IO connectors including some of our most used ones (Kafka and
 Kinesis/Aws). Alexey is also quite active helping new contributors and
 our user
 community in the mailing lists / slack and Stack overflow.

 Congratulations Alexey!  And thanks for being a part of Beam!

 Ismaël

>>>


Re: Beam Summit Status Report - 6/10

2020-06-15 Thread Austin Bennett
Great, if worried about publicly archiving things -- it seems that PMC/etc
is starting to view this effort even more as a community effort of the
project!?  That's my hope.  As opposed to an external, yet somewhat
condoned, effort.

That said, have we worked out the parts where such efforts need to be
distinct, vs. actually a part of Apache Beam?  If trying to fit more within
the general model of open source project management, are there areas of
concern to be aware of?


On Sun, Jun 14, 2020 at 8:45 PM Kenneth Knowles  wrote:

> Thanks!
>
> Can we please have the status in the text of the email also? This way it
> is easy to read quickly while processing email, and also it is archived for
> the project & public. (Google Docs change over time so the link isn't a
> good archive)
>
> Kenn
>
> On Wed, Jun 10, 2020 at 6:34 PM Ahmet Altay  wrote:
>
>> Thank you Brittany and all others working on this. Progress looks good. :)
>>
>> On Wed, Jun 10, 2020 at 4:56 PM Brittany Hermann 
>> wrote:
>>
>>> Hi folks,
>>>
>>> I wanted to provide you with the Beam Summit Status report from today's
>>> meeting. If you would like to join the next public meeting on Wednesday,
>>> June 24th at 11:30 AM PST please let me know and I will send a calendar
>>> invite over to you!
>>>
>>> Also don't forget to submit your CFP by June 15th and register for the
>>> Summit!
>>>
>>> 
>>>
>>> https://docs.google.com/document/d/11PXOBUbeldgPqz6OlTswCal6SxyX76Bb_ZVKBdwsd7o/edit?usp=sharing
>>>
>>> Have a great day!
>>>
>>> --
>>>
>>> Brittany Hermann
>>>
>>> Open Source Program Manager (Provided by Adecco Staffing)
>>>
>>> 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
>>> 
>>>
>>>
>>>


Re: Try Beam Katas Today

2020-06-07 Thread Austin Bennett
I have no problem accessing Katas when I'm in IntelliJ IDEA, or in
PyCharm.  But, I would expect - with edutools plugin - to be able to
use the Katas from GoLand.  Currently not able to do that.  Has anyone had
successes accessing Go Katas from GoLand?  Perhaps I am doing something
strange?

On Mon, May 25, 2020 at 9:03 PM Austin Bennett 
wrote:

> Cool; thanks Henry!
>
> On Mon, May 25, 2020 at 7:49 PM Henry Suryawirawan <
> hsuryawira...@google.com> wrote:
>
>> Hi Austin,
>>
>> The remote-info yaml files would get updated upon updating the course on
>> Stepik, not entirely recreated as if from scratch.
>> The important IDs metadata that track the course ID, section ID, lesson
>> ID, and task ID are preserved.
>> Having these files in the repo allows multiple people to be able to
>> update the same courses on Stepik without resulting in new courses
>> every time it gets updated.
>>
>> As an illustration, let's take the Go Katas.
>> The course-remote-info.yaml tracks the course ID as `id: 70387`.
>> This corresponds to the equivalent Stepik course page:
>> https://stepik.org/course/70387/promo
>> This course ID has also been endorsed by JetBrains to feature visibly on
>> the course list.
>> If we don't keep track of the IDs, we'll always recreate the course with
>> new IDs every time it gets uploaded.
>> I hope this helps to clarify the needs for the remote-info files to be in
>> the repo.
>>
>> I've explored a way to automatically update the course that can be
>> triggered by the CI.
>> What I found is that at the moment it is better to update the course from
>> the IDE menu, otherwise we would have to find out what actually happens
>> behind the menu and reverse engineer the steps.
>> It may be possible to do, but I am also concerned if JetBrains updates
>> the plugin and changes the way it handles the course upload, and that we
>> have to keep up with the changes at the same time.
>>
>> On the stats, we have some statistics that are publicly available and
>> privately available for the course instructors.
>> The promo page shows the publicly available stat of the number of
>> learners who have tried the course, any star and review, e.g. for Java
>> Katas (https://stepik.org/course/54530/promo).
>> The private stats include number of learners per day, task pass rate, the
>> learners, etc.
>>
>>
>> Henry Suryawirawan
>>
>> Strategic Cloud Engineer
>>
>> hsuryawira...@google.com
>>
>>
>>
>>
>>
>>
>> On Tue, May 26, 2020 at 9:27 AM Austin Bennett <
>> whatwouldausti...@gmail.com> wrote:
>>
>>> Hi Henry,
>>>
>>> Cool.  Most makes sense.
>>>
>>> What I am missing is the need for the '*-remote-info.yaml' in the beam
>>> repo (I do get what purpose it serves for Stepik).  There is probably a
>>> good reason.
>>>
>>> To get nitpicky (am genuinely curious) --> It seems that this sort of
>>> metadata would get (re-)created upon (re-)uploading a course.  What does
>>> persisting it to the repo get us?  This is also in line with nascent
>>> thoughts of auto-deploying katas to stepik on accepted merge into
>>> learning/katas/ -- rather than this being done manually.  Perhaps this is
>>> just due to my lack of having spent much time with Stepik.
>>>
>>> Also, do we have download statistics from Stepik for the Katas?  What
>>> other bits of information can we gather from Stepik (as you share, if
>>> Stepik is using *-remote-info.yaml for tracking, what information are we
>>> able to gather from there)?
>>>
>>> Thanks,
>>> Austin
>>>
>>> On Mon, May 25, 2020 at 1:10 AM Henry Suryawirawan <
>>> hsuryawira...@google.com> wrote:
>>>
>>>> Hi Austin,
>>>>
>>>> Thanks for your help in adding a new lesson.
>>>> I will have a look and help to review the pull request.
>>>>
>>>> On your questions:
>>>> 1. Apart from the *-remote-info.yaml, the other yaml files should
>>>> contain Apache license header. We explicitly turn off license header check
>>>> for the remote-info files as can be referred in the build.gradle
>>>> <https://github.com/apache/beam/blob/master/build.gradle>. The reason
>>>> is because the remote-info files are auto generated, and they will always
>>>> get replaced whenever we update the course on Stepik.
>>>> All the YAML files are important and hav

Fwd: [DISCUSSION] Use github actions for python wheels ?

2020-05-30 Thread Austin Bennett
There seems to be support here for this idea.  After digging through
things, I *think* I understand the moving pieces and can address (reason I
was digging through the code, including beam-wheels repo [1]).  Though, not
100% that I have not overlooked bits.

Would be happy to pickup
https://issues.apache.org/jira/browse/BEAM-9388 (imagine
would get to it over the course of a month or few -- so also wouldn't want
to take it on if others would be looking to do it sooner) -- seems
worthwhile to reduce manual steps for the release process, as well as the
additional implications (daily builds) if can easily manage.  I have some
experience with GitHub Actions, but not with our (BEAM) deployment/release
process, nor much with building wheels.  I assume there are sufficient docs
available to go through for the mechanics of the latter.  If picking up
this issue, I suspect may need to bother some of those that have done
recent releases -- that would best occur on the jira ticket, once at that
point?

@Ismail -- given your experience with GitHub actions,
any reason to think GitHub Actions not appropriate/ready for this specific
task?

@Ahmet Altay  happy to understand the extent of what you
had in mind, maybe the extensions are not as important to plan out, as
they're straightforwardly bolted on (ex: daily builds).  More tactically
would be valuable to ensure I understand what all needs to occur.  Any
other source of info to consume other than
https://github.com/apache/beam-wheels and
https://beam.apache.org/contribute/release-guide/.

Also, open to the thought that this might be taking on too much, without
more experience with the release process and such.  Do advise...?


[1]
https://lists.apache.org/thread.html/r93884eb080297647207f7d2b8a393e224029fc2c3509017886e84051%40%3Cdev.beam.apache.org%3E



---------- Forwarded message ---------
From: Ismaël Mejía 
Date: Wed, Feb 26, 2020 at 12:09 PM
Subject: Re: [DISCUSSION] Use github actions for python wheels ?
To: dev 


+1 I have been migrating multiple projects into github actions recently and
even if for some tasks it is not as mature and polished as travis it has
proven to be way more reliable.


On Wed, Feb 26, 2020 at 7:07 PM Ahmet Altay  wrote:

> I created https://issues.apache.org/jira/browse/BEAM-9388 to explore
> this. To be explicit and not to do cookie licking, I would not be able to
> work on this at the moment. If anyone is interested please take it.
> Otherwise I will try to come back and explore this when I can.
>
> On Tue, Feb 25, 2020 at 2:57 PM Robert Bradshaw 
> wrote:
>
>> I'd be in favor of this, assuming it actually simplifies things.
>
>
> This is also my concern. I do think that it will simplify things, but I am
> not certain as I am not very familiar with the github actions.
>
>
>> (Note
>> that the wheels are for several variants of linux, presumably we could
>> do cross-compiles. Also, manylinux is a "minimal" linux specifically
>> built as to produce shared object libraries compatible with a wide
>> variety of distributions--we can't just assume that a shared object
>> library built on one modern linux will just work on another. (But
>> maybe it's sufficient to do this within a docker environment?)
>>
>
> There will be no change in this area. Both Travis and github
> actions offer a comparable set of options.
>
>
>>
>> On Tue, Feb 25, 2020 at 2:23 PM Kenneth Knowles  wrote:
>> >
>> > +1 to exploring this.
>> >
>> > On bui...@apache.org there is lots of discussion and general approval
>> for trying it. It is enabled and used by some projects. Calcite uses it to
>> build their website, for example.
>>
>
> Great.
>
>
>> >
>> > Kenn
>> >
>> >
>> > On Tue, Feb 25, 2020 at 2:08 PM Ahmet Altay  wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I recently had a chance to look at the documentation for github
>> actions. I think we could use github actions instead of travis for
>> building python wheels during releases. This will have the following
>> advantages:
>> >>
>> >> - We will eliminate one repo. (If you don't know, we have
>> https://github.com/apache/beam-wheels for the sole purpose of building
>> wheel files.)
>> >> - Workflow will be stored in the same repo. This will prevent bit rot
>> that is only discovered at release times. (happened a few times, although
>> usually easy to fix.)
>> >> - github actions supports ubuntu, mac, windows environments. We could
>> try to build wheels for windows as well. (Travis also supports the same
>> environments but we only use linux and mac environments. Maybe there are
>> other blockers for building wheels for Windows.)
>> >> - We could do more, like daily python builds.
>> >>
>> >> Downsides would be:
>> >> - I do not know if github actions will require some special set of
>> permissions that require an approval from infra.
>> >> - Travis works fine most of the time. This might be unnecessary work.
>> >>
>> >> What do you think? Is this feasible, would this be useful?
>> >>
>> >> Ahmet
>>
>
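
As a rough illustration of the platform question raised in the thread above
(manylinux builds versus other targets), the sketch below splits a wheel
filename into the fields defined by PEP 427 and checks whether its platform
tag is a manylinux one. The function names and the example filenames are
hypothetical; they are not taken from the beam-wheels repo or from the
release scripts.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components.

    Layout: {name}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of fields in: " + filename)
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_manylinux(filename):
    """Return True if any platform tag in the filename is a manylinux tag."""
    platform = parse_wheel_filename(filename)["platform"]
    # A platform field may hold several tags joined by "." (compressed tag set).
    return any(tag.startswith("manylinux") for tag in platform.split("."))


# Illustrative filenames only (assumed, not real release artifacts).
for wheel in (
    "apache_beam-2.22.0-cp37-cp37m-manylinux1_x86_64.whl",
    "apache_beam-2.22.0-cp37-cp37m-macosx_10_9_x86_64.whl",
):
    print(wheel, is_manylinux(wheel))
```

A check like this could be one small sanity step in an automated build, for
example to confirm that every Linux artifact carries a manylinux tag before
it is uploaded.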


Re: Try Beam Katas Today

2020-05-25 Thread Austin Bennett
Cool; thanks Henry!

On Mon, May 25, 2020 at 7:49 PM Henry Suryawirawan 
wrote:

> Hi Austin,
>
> The remote-info yaml files would get updated upon updating the course on
> Stepik, not entirely recreated as if from scratch.
> The important IDs metadata that track the course ID, section ID, lesson
> ID, and task ID are preserved.
> Having these files in the repo allows multiple people to be able to update
> the same courses on Stepik without resulting in new courses every time it
> gets updated.
>
> As an illustration, let's take the Go Katas.
> The course-remote-info.yaml tracks the course ID as `id: 70387`.
> This corresponds to the equivalent Stepik course page:
> https://stepik.org/course/70387/promo
> This course ID has also been endorsed by JetBrains to feature visibly on
> the course list.
> If we don't keep track of the IDs, we'll always recreate the course with
> new IDs every time it gets uploaded.
> I hope this helps to clarify the needs for the remote-info files to be in
> the repo.
>
> I've explored a way to automatically update the course that can be
> triggered by the CI.
> What I found is that at the moment it is better to update the course from
> the IDE menu, otherwise we would have to find out what actually happens
> behind the menu and reverse engineer the steps.
> It may be possible to do, but I am also concerned if JetBrains updates the
> plugin and changes the way it handles the course upload, and that we have
> to keep up with the changes at the same time.
>
> On the stats, we have some statistics that are publicly available and
> privately available for the course instructors.
> The promo page shows the publicly available stat of the number of learners
> who have tried the course, any star and review, e.g. for Java Katas (
> https://stepik.org/course/54530/promo).
> The private stats include number of learners per day, task pass rate, the
> learners, etc.
>
>
> Henry Suryawirawan
>
> Strategic Cloud Engineer
>
> hsuryawira...@google.com
>
>
>
>
>
>
> On Tue, May 26, 2020 at 9:27 AM Austin Bennett <
> whatwouldausti...@gmail.com> wrote:
>
>> Hi Henry,
>>
>> Cool.  Most makes sense.
>>
>> What I am missing is the need for the '*-remote-info.yaml' in the beam
>> repo (I do get what purpose it serves for Stepik).  There is probably a
>> good reason.
>>
>> To get nitpicky (am genuinely curious) --> It seems that this sort of
>> metadata would get (re-)created upon (re-)uploading a course.  What does
>> persisting it to the repo get us?  This is also in line with nascent
>> thoughts of auto-deploying katas to stepik on accepted merge into
>> learning/katas/ -- rather than this being done manually.  Perhaps this is
>> just due to my lack of having spent much time with Stepik.
>>
>> Also, do we have download statistics from Stepik for the Katas?  What
>> other bits of information can we gather from Stepik (as you share, if
>> Stepik is using *-remote-info.yaml for tracking, what information are we
>> able to gather from there)?
>>
>> Thanks,
>> Austin
>>
>> On Mon, May 25, 2020 at 1:10 AM Henry Suryawirawan <
>> hsuryawira...@google.com> wrote:
>>
>>> Hi Austin,
>>>
>>> Thanks for your help in adding a new lesson.
>>> I will have a look and help to review the pull request.
>>>
>>> On your questions:
>>> 1. Apart from the *-remote-info.yaml, the other yaml files should
>>> contain Apache license header. We explicitly turn off license header check
>>> for the remote-info files as can be referred in the build.gradle
>>> <https://github.com/apache/beam/blob/master/build.gradle>. The reason
>>> is because the remote-info files are auto generated, and they will always
>>> get replaced whenever we update the course on Stepik.
>>> All the YAML files are important and have to be included as part of the
>>> repository. The {task, lesson,section}-info.yaml files are metadata
>>> files used by the JetBrains EduTools plugin for Educational projects. The
>>> *-remote-info.yaml files contain metadata information (e.g. the IDs)
>>> important for Stepik to track our courses.
>>>
>>> 2. We do not leave out the beam.Create. Apart from the Introduction
>>> lesson, you can find it in the cmd/main.go file. We explicitly create the
>>> main.go file in order for the learner to be able to also run the pipeline
>>> independently and observe the output, just like when they write the
>>> pipeline normally.
>>>
>>> Hope my answers help to clarify.
>>>
>
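
To make the ID tracking described in the thread above concrete, here is a
minimal sketch that reads the course ID from the text of a
course-remote-info.yaml file and builds the matching Stepik promo URL. Only
the `id: 70387` line and the https://stepik.org/course/<id>/promo URL shape
come from the thread; the helper names are hypothetical, and the rest of the
file's layout is not assumed.

```python
import re

# Matches a top-level "id: <number>" line; indented (nested) ids are ignored.
_ID_LINE = re.compile(r"^id:\s*(\d+)\s*$", re.MULTILINE)


def course_id_from_remote_info(text):
    """Return the top-level course ID found in a remote-info YAML body."""
    match = _ID_LINE.search(text)
    if match is None:
        raise ValueError("no top-level 'id:' line found")
    return int(match.group(1))


def stepik_promo_url(course_id):
    """Build the public Stepik promo page URL for a course ID."""
    return f"https://stepik.org/course/{course_id}/promo"


# The Go Katas example from the thread.
sample = "id: 70387\n"
print(stepik_promo_url(course_id_from_remote_info(sample)))
```

Because the ID lives in the committed file, anyone updating the course would
resolve to the same Stepik page instead of creating a new course.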

Re: Try Beam Katas Today

2020-05-25 Thread Austin Bennett
Hi Henry,

Cool.  Most makes sense.

What I am missing is the need for the '*-remote-info.yaml' in the beam repo
(I do get what purpose it serves for Stepik).  There is probably a good
reason.

To get nitpicky (am genuinely curious) --> It seems that this sort of
metadata would get (re-)created upon (re-)uploading a course.  What does
persisting it to the repo get us?  This is also in line with nascent
thoughts of auto-deploying katas to stepik on accepted merge into
learning/katas/ -- rather than this being done manually.  Perhaps this is
just due to my lack of having spent much time with Stepik.

Also, do we have download statistics from Stepik for the Katas?  What other
bits of information can we gather from Stepik (as you share, if Stepik is
using *-remote-info.yaml for tracking, what information are we able to
gather from there)?

Thanks,
Austin

On Mon, May 25, 2020 at 1:10 AM Henry Suryawirawan 
wrote:

> Hi Austin,
>
> Thanks for your help in adding a new lesson.
> I will have a look and help to review the pull request.
>
> On your questions:
> 1. Apart from the *-remote-info.yaml, the other yaml files should contain
> Apache license header. We explicitly turn off license header check for the
> remote-info files as can be referred in the build.gradle
> <https://github.com/apache/beam/blob/master/build.gradle>. The reason is
> because the remote-info files are auto generated, and they will always get
> replaced whenever we update the course on Stepik.
> All the YAML files are important and have to be included as part of the
> repository. The {task, lesson,section}-info.yaml files are metadata files
> used by the JetBrains EduTools plugin for Educational projects. The
> *-remote-info.yaml files contain metadata information (e.g. the IDs)
> important for Stepik to track our courses.
>
> 2. We do not leave out the beam.Create. Apart from the Introduction
> lesson, you can find it in the cmd/main.go file. We explicitly create the
> main.go file in order for the learner to be able to also run the pipeline
> independently and observe the output, just like when they write the
> pipeline normally.
>
> Hope my answers help to clarify.
>
>
> Henry Suryawirawan
>
> Strategic Cloud Engineer
>
> hsuryawira...@google.com
>
>
>
>
>
> On Mon, May 25, 2020 at 7:13 AM Austin Bennett <
> whatwouldausti...@gmail.com> wrote:
>
>> @Rion, @Henry Suryawirawan , @Damon,  I added
>> a Flatten Kata for Go.  Please have a look:
>> https://github.com/apache/beam/pull/11806  -- tagged all of you as other
>> authors of Katas.
>>
>> A few questions:
>>
>> 1)  Across all the katas, we have files '{task,
>> lesson,section}-remote-info.yaml'.  These files do not contain the apache
>> license, and I imagine they are generated by Stepik/other (also, to get
>> working locally those files were not needed).  Should these files be
>> ignored (via .gitignore) and kept out of the Beam repository?  Wondering
>> why we would want those in the repo, and if yes, should they have the
>> Apache License on them?
>>
>> 2) On Go Katas generally.  I wrote this one following convention of other
>> Go Katas found in the repository.  For the Java and Python versions, the
>> code that people work with includes seeing the Beam.Create.  This is left
>> out of the GoLang katas, and kept behind the scenes.  Is there reasoning
>> for breaking from the convention of the other Katas?
>> https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam#Create
>>
>> Thanks,
>> Austin
>>
>>
>> On Thu, May 21, 2020 at 8:00 PM Rion Williams 
>> wrote:
>>
>>> Hi Henry,
>>>
>>> I submitted a pull request related to the Beam Katas that can be found
>>> here (https://github.com/apache/beam/pull/11761) and included you as a
>>> reviewer. I updated all of the related metadata, generated the course, and
>>> tested through it to ensure it worked as expected (and the placeholders all
>>> worked as expected as well).
>>>
>>> The generated course can be found here on Stepik (
>>> https://stepik.org/course/72488 <https://stepik.org/course/72488/promo>)
>>> and I’ve reached out to a few folks to put it through its paces in the
>>> wild.
>>>
>>> Let me know if there’s anything else I can do or changes that need to be
>>> made in the PR or elsewhere.
>>>
>>> Thanks again,
>>>
>>> Rion
>>>
>>> On May 20, 2020, at 2:12 AM, Henry Suryawirawan <
>>> hsuryawira...@google.com> wrote:
>>>
>>> 
>>> Yeah there was a recent pull request merged for the md fil

Re: deploy_travis.sh ?

2020-05-24 Thread Austin Bennett
Thanks Ashwin,

Exactly.  The readme doesn't match the current state of the code.


On Sun, May 24, 2020, 2:34 PM Ashwin Ramaswami 
wrote:

> Not sure, but deploy_travis.sh seems to have been deleted in this commit:
> https://github.com/apache/beam-wheels/commit/40c0bd1a36d70736b84e925889cdad417e429fd4
>
> On 2020/05/22 02:13:35, Austin Bennett 
> wrote:
> > Hi All,
> >
> > Was digging into https://github.com/apache/beam-wheels after being
> reminded
> > of https://issues.apache.org/jira/browse/BEAM-9388 (I recall the
> original
> > conversation -->
> >
> https://lists.apache.org/thread.html/r4a7d34e64a34e9fe589d06aec74d9b464d252c516fe96c35b2d6c9ae%40%3Cdev.beam.apache.org%3E
> > Not
> > sure whether taking on that issue is too ambitious, given would likely be
> > lacking much context.
> >
> > Eliminating manual steps is the thing, so esp. happy to help with such
> > things to make our lives easier!
> >
> > Trying to track down what all is involved here.  Seems I am missing
> > something,  deploy_travis.sh   is mentioned in the README
> > <https://github.com/apache/beam-wheels/blob/master/README.md>, but I
> don't
> > see it in this helper repo, nor in the main beam repo.  It actually looks
> > like step now relies upon:
> >
> https://github.com/apache/beam/blob/master/release/src/main/scripts/sign_hash_python_wheels.sh
> > to take from GCS to SVN?
> >
> > If my understanding is correct, I will at least update the
> > documentation/README in beam-wheels.
> >
> > Cheers,
> > Austin
> >
>


Re: Try Beam Katas Today

2020-05-24 Thread Austin Bennett
@Rion, @Henry Suryawirawan , @Damon,  I added a
Flatten Kata for Go.  Please have a look:
https://github.com/apache/beam/pull/11806  -- tagged all of you as other
authors of Katas.

A few questions:

1)  Across all the katas, we have files '{task,
lesson,section}-remote-info.yaml'.  These files do not contain the apache
license, and I imagine they are generated by Stepik/other (also, to get
working locally those files were not needed).  Should these files be
ignored (via .gitignore) and kept out of the Beam repository?  Wondering
why we would want those in the repo, and if yes, should they have the
Apache License on them?

2) On Go Katas generally.  I wrote this one following convention of other
Go Katas found in the repository.  For the Java and Python versions, the
code that people work with includes seeing the Beam.Create.  This is left
out of the GoLang katas, and kept behind the scenes.  Is there reasoning
for breaking from the convention of the other Katas?
https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam#Create

Thanks,
Austin


On Thu, May 21, 2020 at 8:00 PM Rion Williams  wrote:

> Hi Henry,
>
> I submitted a pull request related to the Beam Katas that can be found
> here (https://github.com/apache/beam/pull/11761) and included you as a
> reviewer. I updated all of the related metadata, generated the course, and
> tested through it to ensure it worked as expected (and the placeholders all
> worked as expected as well).
>
> The generated course can be found here on Stepik (
> https://stepik.org/course/72488 )
> and I’ve reached out to a few folks to put it through its paces in the
> wild.
>
> Let me know if there’s anything else I can do or changes that need to be
> made in the PR or elsewhere.
>
> Thanks again,
>
> Rion
>
> On May 20, 2020, at 2:12 AM, Henry Suryawirawan 
> wrote:
>
> 
> Yeah there was a recent pull request merged for the md file format change.
> I checked your repo and it still contains the task.html, so need your help
> to merge with the latest master.
>
> For the answer placeholder, you may refer to this doc first
> to understand how it works.
> It will auto update the placeholder position in the task-info.yaml.
>
> If you encounter any issue, just let me know.
> Thanks Rion.
>
>
> Regards,
> Henry
>
>
>
> On Wed, May 20, 2020 at 12:43 PM Rion Williams 
> wrote:
>
>> Hi Henry,
>>
>> Thanks for the quick response, I appreciate it. I believe that I pulled
>> the latest from master a day or so ago, so I’ll make sure to pull the most
>> recent changes in.
>>
>> As far as the placeholders, they aren’t currently present (as I don’t
>> believe they were present in the Java ones within the learning/katas
>> directory), however I can easily add those in to align with the content of
>> the existing course. I wasn’t entirely sure based on the existing
>> directories if the files should contain the placeholders or the actual
>> implementations, either way, it’s a pretty trivial series of changes.
>>
>> I’ll try to put these together tomorrow and push up a PR. I’ll make sure
>> to include you as a reviewer.
>>
>> Thanks for the initial feedback,
>>
>> Rion
>>
>> On May 19, 2020, at 11:15 PM, Henry Suryawirawan <
>> hsuryawira...@google.com> wrote:
>>
>> 
>> Thanks Rion for adding the Kotlin version.
>> This is great to show other people that Beam can be done in Kotlin too!
>>
>> I can help to review your work.
>> Please help to incorporate the Java Katas latest changes from master.
>> There are recent changes to the task description file format from html
>> to md.
>> Please also help to remove all the *-remote-info.yaml files.
>> I assume that you've adjusted the answer placeholders in all tasks as
>> well.
>> Afterwards, you can create a pull request and assign me as reviewer.
>>
>> Please reach out to me if you have any questions.
>>
>>
>> Regards,
>> Henry
>>
>>
>>
>>
>> On Wed, May 20, 2020 at 3:33 AM Rion Williams 
>> wrote:
>>
>>> Sure! I ran through all of the tests locally on my branch (as tests) and
>>> then performed a check against all of the known tasks (via Course Creator >
>>> Check All Tasks) and 35/36 tasks passed successfully with the only one that
>>> didn't being a Built-in IO one that doesn't currently have any
>>> implementation. Although, I'd love for someone else to try the same thing
>>> since as far as I can tell it "works on my machine".
>>>
>>> Thanks!
>>>
>>> Rion
>>>
>>> On 2020/05/19 19:12:57, Pablo Estrada  wrote:
>>> > This is really cool Rion!
>>> >
>>> > I believe it's possible to start trying out the katas from your
>>> branch? If
>>> > so, I can give them a try, and use that as a review...
>>> > Henry, any other ideas?
>>> >
>>> > On Tue, May 19, 2020 at 12:04 PM Rion Williams 
>>> > wrote:
>>> >
>>> > > Hi all,
>>> > >
>>> > > I was recently added as a contributor and created a JIRA ticket
>>> related to
>>> > > the existing Katas 

deploy_travis.sh ?

2020-05-21 Thread Austin Bennett
Hi All,

Was digging into https://github.com/apache/beam-wheels after being reminded
of https://issues.apache.org/jira/browse/BEAM-9388 (I recall the original
conversation -->
https://lists.apache.org/thread.html/r4a7d34e64a34e9fe589d06aec74d9b464d252c516fe96c35b2d6c9ae%40%3Cdev.beam.apache.org%3E).
Not sure whether taking on that issue is too ambitious, given would likely be
lacking much context.

Eliminating manual steps is the thing, so esp. happy to help with such
things to make our lives easier!

Trying to track down what all is involved here.  Seems I am missing
something,  deploy_travis.sh   is mentioned in the README, but I don't
see it in this helper repo, nor in the main beam repo.  It actually looks
like that step now relies upon:
https://github.com/apache/beam/blob/master/release/src/main/scripts/sign_hash_python_wheels.sh
to take from GCS to SVN?

If my understanding is correct, I will at least update the
documentation/README in beam-wheels.

Cheers,
Austin


Re: Transparency to Beam Digital Summit Planning

2020-05-20 Thread Austin Bennett
Should the link/meeting notes be publicly available?  Not just available to
individuals plus all of @google?



On Wed, May 20, 2020 at 2:06 PM Brittany Hermann 
wrote:

> Hi folks,
>
> I wanted to provide a few different ways of transparency to you during the
> planning of the Beam Digital Summit.
>
> 1) *Beam Summit Status Reports:* I will be sending out weekly Beam Summit
> Status Reports which will include the goals, attendees, topics discussed,
> and decisions made every Wednesday.
>
> 2) *Community Guests on Committee Planning Calls:* We would like to
> invite you to join as a guest to these planning calls. This would allow
> for observation of the planning process and to see if there are ways for
> future collaboration on promotions, etc. for the event. If you are
> interested in joining the first bi-weekly meeting starting next week,
> please reach out to me and I will send the invite with call-in information
> directly to you.
>
> In the meantime, I have attached this week's Beam Summit Status report
> below.
>
>
> https://docs.google.com/document/d/1_jLhKvW5MTtkHOZDJyzCTSLUDiD4RjlJmU35rXV-3n0/edit?usp=sharing
>
> Have a great rest of your week!
>
> --
>
> Brittany Hermann
>
> Open Source Program Manager (Provided by Adecco Staffing)
>
> 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
> 
>
>
>


Re: [ANNOUNCE] New committer: Robin Qiu

2020-05-20 Thread Austin Bennett
Congrats!

On Tue, May 19, 2020, 8:32 PM Chamikara Jayalath 
wrote:

> Congrats Robin!
>
> On Tue, May 19, 2020 at 2:39 PM Rui Wang  wrote:
>
>> Nice! Congrats!
>>
>>
>>
>> -Rui
>>
>> On Tue, May 19, 2020 at 11:13 AM Pablo Estrada 
>> wrote:
>>
>>> yoohoo : )
>>>
>>> On Tue, May 19, 2020 at 11:03 AM Yifan Zou  wrote:
>>>
 Congratulations, Robin!

 On Tue, May 19, 2020 at 10:53 AM Udi Meiri  wrote:

> Congratulations Robin!
>
> On Tue, May 19, 2020, 10:15 Valentyn Tymofieiev 
> wrote:
>
>> Congratulations, Robin!
>>
>> On Tue, May 19, 2020 at 9:10 AM Yichi Zhang 
>> wrote:
>>
>>> Congrats Robin!
>>>
>>> On Tue, May 19, 2020 at 8:56 AM Kamil Wasilewski <
>>> kamil.wasilew...@polidea.com> wrote:
>>>
 Congrats!

 On Tue, May 19, 2020 at 5:33 PM Jan Lukavský 
 wrote:

> Congrats Robin!
> On 5/19/20 5:01 PM, Tyson Hamilton wrote:
>
> Congratulations!
>
> On Tue, May 19, 2020 at 6:10 AM Omar Ismail 
> wrote:
>
>> Congrats!
>>
>> On Tue, May 19, 2020 at 5:00 AM Gleb Kanterov 
>> wrote:
>>
>>> Congratulations!
>>>
>>> On Tue, May 19, 2020 at 7:31 AM Aizhamal Nurmamat kyzy <
>>> aizha...@apache.org> wrote:
>>>
 Congratulations, Robin! Thank you for your contributions!

 On Mon, May 18, 2020, 7:18 PM Boyuan Zhang 
 wrote:

> Congrats~~
>
> On Mon, May 18, 2020 at 7:17 PM Reza Rokni 
> wrote:
>
>> Congratulations!
>>
>> On Tue, May 19, 2020 at 10:06 AM Ahmet Altay <
>> al...@google.com> wrote:
>>
>>> Hi everyone,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming a
>>> new committer: Robin Qiu .
>>>
>>> Robin has been active in the community for close to 2 years,
>>> worked on HyperLogLog++ [1], SQL [2], improved documentation, 
>>> and helped
>>> with releases(*).
>>>
>>> In consideration of his contributions, the Beam PMC trusts
>>> him with the responsibilities of a Beam committer [3].
>>>
>>> Thank you for your contributions Robin!
>>>
>>> -Ahmet, on behalf of the Apache Beam PMC
>>>
>>> [1]
>>> https://www.meetup.com/Zurich-Apache-Beam-Meetup/events/265529665/
>>> [2]
>>> https://www.meetup.com/Belgium-Apache-Beam-Meetup/events/264933301/
>>> [3] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>> (*) And maybe he will be a release manager soon :)
>>>
>>> --
>>
>> Omar Ismail |  Technical Solutions Engineer |
>> omarism...@google.com |
>>
>


Event Calendar?

2020-05-19 Thread Austin Bennett
Hi All,

As we have events more often that are more accessible (digital), wondering
whether others see a value of adding a calendar to the website?

Perhaps related, is it worth updating
https://beam.apache.org/community/in-person/ <- to something that isn't
'in-person' since doing things in-person is perhaps (hopefully not
completely) a vestige of the past.

Cheers,
Austin


Re: Try Beam Katas Today

2020-05-14 Thread Austin Bennett
It looks like there are instructions online for writing exercises/Katas:
https://www.jetbrains.com/help/education/educator-start-guide.html

Do we have a guide for contributing and for how publication/releases occur
(publishing to Stepik)?  Although the code lives in the main repo
(therefore subject to those contrib guidelines), I think the
release/publication schedule is distinct?

This hopefully will help illustrate that we are able to contribute to Katas
(PRs welcome?), and not just consume them!



On Thu, May 14, 2020 at 1:41 AM Henry Suryawirawan 
wrote:

> Yeah certainly we can expand it further.
> There are more lessons that definitely can be added further.
>
> >Eg more the write side windowing interactions?
> Are you referring to Write IOs?
>
>
>
> On Wed, May 13, 2020 at 11:56 PM Nathan Fisher 
> wrote:
>
>> I went through them earlier this week! Definitely helpful.
>>
>> Is it possible to expand the katas available in the IO section? E.g. more
>> on the write-side windowing interactions?
>>
>> On Wed, May 13, 2020 at 11:36, Luke Cwik  wrote:
>>
>>> These are an excellent learning tool.
>>>
>>> On Tue, May 12, 2020 at 11:02 PM Pablo Estrada 
>>> wrote:
>>>
 Sharing Damon's email with the user@ list as well. Thanks Damon!

 On Tue, May 12, 2020 at 9:02 PM Damon Douglas 
 wrote:

> Hello Everyone,
>
> If you don't already know, there are helpful instructional tools for
> learning the Apache Beam SDKs called Beam Katas hosted on
> https://stepik.org.  Similar to traditional Kata, they are meant to be
> repeated as practice.  Before practicing the katas myself, I found myself
> copy/pasting code (Please accept my confession).  Now I find myself
> actually composing pipelines.  Just like kata forms, you find them
> becoming part of you.  If you are interested, the currently available
> katas are listed below:
>
> 1.  Java - https://stepik.org/course/54530
>
> 2.  Python -  https://stepik.org/course/54532
>
> 3.  Go (in development) - https://stepik.org/course/70387
>
> If you are absolutely brand new to Beam and it scares you like it
> scared me, come talk to me.
>
> Best,
>
> Damon
>
 --
>> Nathan Fisher
>>  w: http://junctionbox.ca/
>>
>


Re: Companies using Beam?

2020-04-30 Thread Austin Bennett
A first pass, something like:

https://druid.apache.org/druid-powered
https://spark.apache.org/powered-by.html

or even as simple as:
https://github.com/apache/airflow#who-uses-apache-airflow

Would go a long way in the sorts of very high-level conversations I'm
having around technology adoption/standardization.

Getting into more specifics/testimonials/case-studies is also great, but I
wouldn't expect those to get looked at by most, until passing the first bar
of seeming to have significant adoption.

@Aizhamal Nurmamat kyzy   - happy to contribute as I
can.

On Tue, Apr 28, 2020 at 10:13 PM Jean-Baptiste Onofre 
wrote:

> Hi,
>
> We already have some testimonials on Beam home page (I did the one about
> Beam use at Talend).
>
> It makes sense to have a dedicated section as it gives ideas about use
> case and production system running with Beam.
>
> Regards
> JB
>
> > On Apr 28, 2020, at 23:42, Austin Bennett wrote:
> >
> > Hi All,
> >
> > Have we considered getting onto our website or our GitHub repo the
> ability for individuals to share that their company is using Beam?  Seeing
> - what I believe to be a reasonable list of - companies productively using
> Beam would be helpful to point others to.  For instance, a common question
> I get is whether anyone is using it, and who?  I'm not sure that's the best
> metric or datapoint in many cases for adoption, but it is a heuristic that
> some rely upon.
> >
> > Naturally, we could ask for a roll-call, esp. via user list, but
> imagining  a persistent web-list would be of interest.
> >
> > Cheers,
> > Austin
> >
> >
> > P.S.  If putting such a list into our repo, that would also get some
> people to submit PRs (so more contributors!) :-)
> >
> >
>
>


Companies using Beam?

2020-04-28 Thread Austin Bennett
Hi All,

Have we considered getting onto our website or our GitHub repo the
ability for individuals to share that their company is using Beam?  Seeing
- what I believe to be a reasonable list of - companies productively using
Beam would be helpful to point others to.  For instance, a common question
I get is whether anyone is using it, and who?  I'm not sure that's the best
metric or datapoint in many cases for adoption, but it is a heuristic that
some rely upon.

Naturally, we could ask for a roll-call, esp. via user list, but imagining
 a persistent web-list would be of interest.

Cheers,
Austin


P.S.  If putting such a list into our repo, that would also get some people
to submit PRs (so more contributors!) :-)


Beam Digital Summit 2020 -- JUNE 2020!

2020-04-22 Thread Austin Bennett
Hi All,

We are excited to announce the Beam Digital Summit 2020!

This will occur for partial days during the week of 15-19 June.

The CfP is open and can be found at: https://sessionize.com/beam-digital-summit-2020/

CfP closes on 20 May 2020.  Do not hesitate to reach out to the organizers
with any questions.

See you there (online)!
Austin, on behalf of the Beam Summit Steering Committee


Re: New Contributor Intro

2020-04-17 Thread Austin Bennett
Nice, welcome to the project, Cameron -- many of us met back at Beam Summit
EU/Berlin!

I don't have the permissions to assign you, but imagine someone will take
care of that soon.

On Fri, Apr 17, 2020 at 11:15 AM Cameron Morgan 
wrote:

> Hello everyone,
>
> I would like the permission to contribute to Beam.
>
> Jira: cameron-p-m
> GitHub: cameron-p-m
>
> The issue I fixed locally: https://issues.apache.org/jira/browse/BEAM-9502
> I have spoken to Jira:yaronneuman(akaneuman.ya...@gmail.com) about owning
> the issue, and I have a local fix.
>
> Thank you,
> Cameron
>


Re: Usage metrics for Beam

2020-04-13 Thread Austin Bennett
@Pablo,
https://blog.sonatype.com/2010/12/now-available-central-download-statistics-for-oss-projects/
suggests for Apache projects, if you have "deployer" permissions then you
can access -- I gather that you have sufficient permissions, and that Kenn
was able to follow the steps within.

Though, I bet there are other sources of info, too -- and these seem like
quite interesting datasets to figure out how to source.  It seems like there
is at least some curiosity, so I'll continue to see what might be available.
For some things requiring permissions, I likely do not currently have
access, but will share (as I would anyway) whatever turns up.
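
As a rough illustration of the point raised in this thread, that trends and
ratios (e.g. Py2 vs. Py3) say more than raw totals, here is a minimal Python
sketch.  The monthly counts are made up for illustration, and the function
names are hypothetical helpers, not part of pypistats or Maven Central.

```python
from typing import Dict, List


def relative_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw download counts per category into fractions of the total."""
    total = sum(counts.values())
    if total == 0:
        # No downloads at all: report a zero share for every category.
        return {name: 0.0 for name in counts}
    return {name: value / total for name, value in counts.items()}


def share_trend(periods: List[Dict[str, int]], category: str) -> List[float]:
    """Return the share of one category for each period, in order."""
    return [relative_shares(period).get(category, 0.0) for period in periods]


# Hypothetical monthly download counts (NOT real Beam numbers).
monthly_downloads = [
    {"py2": 600, "py3": 400},
    {"py2": 500, "py3": 500},
    {"py2": 300, "py3": 700},
]

# Share of Py3 downloads per month; the direction of change is the signal,
# not the absolute totals.
py3_trend = share_trend(monthly_downloads, "py3")
```

Even if the raw totals are inflated by CI and other tooling, a share that
moves steadily in one direction is easier to interpret than any single
absolute number.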



On Mon, Apr 13, 2020 at 1:29 PM Pablo Estrada  wrote:

> I'm also curious about this, but I don't know where to look for the maven
> numbers. Where exactly are they? Do I need to get access to the repository?
> Thanks
> -P.
>
> On Thu, Apr 9, 2020, 9:27 PM Robert Bradshaw  wrote:
>
>> Yes, it's hard to know what can conclusively be drawn from the raw
>> totals. I do think trends and ratios (e.g. Py2 vs. Py3) will, however,
>> roughly reflect underlying usage (which itself is ambiguously defined).
>>
>> On Thu, Apr 9, 2020 at 7:30 PM Kenneth Knowles  wrote:
>>
>>> Yea, interpreting the raw absolute number is tricky. You can probably
>>> manage to see certain kinds of trends if you just look at relative numbers.
>>>
>>> Kenn
>>>
>>> On Thu, Apr 9, 2020 at 6:42 PM Austin Bennett <
>>> whatwouldausti...@gmail.com> wrote:
>>>
>>>> @Robert Bradshaw, you sent that pypi link [1]
>>>> the other day in response to something else, which is what prompted me to
>>>> ask Gris about Maven (based on that link [2], @Kenneth Knowles).
>>>> I recall talking to someone about Maven
>>>> download statistics at ApacheCon.
>>>>
>>>> Perhaps these are not the only sources; happy to explore any/all that
>>>> others may have in mind.  Generally (personally) interested in
>>>> understanding our ecosystem, trends, etc.
>>>>
>>>> Naturally, there is a ton open to interpretation, even if we had
>>>> absolute raw data (down to IP/account/etc, which would also have many IPs to
>>>> one user, as well as many users to one IP, in the case of uses on corporate
>>>> networks).  Getting distinct users, for instance, would be incredibly
>>>> challenging; not sure how/whether other projects even do such things.
>>>>
>>>> [1] https://pypistats.org/packages/apache-beam
>>>> [2]
>>>> https://blog.sonatype.com/2010/12/now-available-central-download-statistics-for-oss-projects/
>>>>
>>>> On Thu, Apr 9, 2020 at 4:02 PM Kenneth Knowles  wrote:
>>>>
>>>>> I found some info from 2010 [1] that it was available to anyone with
>>>>> deploy permission. The instructions still work.
>>>>>
>>>>> Kenn
>>>>>
>>>>> [1]
>>>>> https://blog.sonatype.com/2010/12/now-available-central-download-statistics-for-oss-projects/
>>>>>
>>>>> On Thu, Apr 9, 2020 at 3:41 PM Robert Bradshaw 
>>>>> wrote:
>>>>>
>>>>>> For Python, there's https://pypistats.org/packages/apache-beam .
>>>>>> It's unclear how accurate these are, and how many of these downloads
>>>>>> represent users vs. tools (e.g. setting up environments for continuous
>>>>>> testing).
>>>>>>
>>>>>> On Thu, Apr 9, 2020 at 3:29 PM Griselda Cuevas 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi folks - I'm interested in knowing more about Beam's adoption
>>>>>>> through user downloads.
>>>>>>>
>>>>>>> Do you know what's the protocol to access Maven and check on Java
>>>>>>> downloads?
>>>>>>>
>>>>>>> Also - do you have any other recos on how to measure the project's
>>>>>>> adoption evolution?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> G
>>>>>>>
>>>>>>


Re: Usage metrics for Beam

2020-04-09 Thread Austin Bennett
@Robert Bradshaw, you sent that pypi link [1] the
other day in response to something else, which is what prompted me to ask
Gris about Maven (based on that link [2], @Kenneth Knowles).
I recall talking to someone about Maven download
statistics at ApacheCon.

Perhaps these are not the only sources; happy to explore any/all that
others may have in mind.  Generally (personally) interested in
understanding our ecosystem, trends, etc.

Naturally, there is a ton open to interpretation, even if we had absolute
raw data (down to IP/account/etc, which would also have many IPs to one
user, as well as many users to one IP, in the case of uses on corporate
networks).  Getting distinct users, for instance, would be incredibly
challenging; not sure how/whether other projects even do such things.

[1] https://pypistats.org/packages/apache-beam
[2]
https://blog.sonatype.com/2010/12/now-available-central-download-statistics-for-oss-projects/

On Thu, Apr 9, 2020 at 4:02 PM Kenneth Knowles  wrote:

> I found some info from 2010 [1] that it was available to anyone with
> deploy permission. The instructions still work.
>
> Kenn
>
> [1]
> https://blog.sonatype.com/2010/12/now-available-central-download-statistics-for-oss-projects/
>
> On Thu, Apr 9, 2020 at 3:41 PM Robert Bradshaw 
> wrote:
>
>> For Python, there's https://pypistats.org/packages/apache-beam . It's
>> unclear how accurate these are, and how many of these downloads represent
>> users vs. tools (e.g. setting up environments for continuous testing).
>>
>> On Thu, Apr 9, 2020 at 3:29 PM Griselda Cuevas  wrote:
>>
>>> Hi folks - I'm interested in knowing more about Beam's adoption through
>>> user downloads.
>>>
>>> Do you know what's the protocol to access Maven and check on Java
>>> downloads?
>>>
>>> Also - do you have any other recos on how to measure the project's
>>> adoption evolution?
>>>
>>> Thanks!
>>> G
>>>
>>


Re: [VOTE] Accept the Firefly design donation as Beam Mascot - Deadline Mon April 6

2020-04-02 Thread Austin Bennett
+1 (nonbinding)

On Thu, Apr 2, 2020 at 12:10 PM Luke Cwik  wrote:

> +1 (binding)
>
> On Thu, Apr 2, 2020 at 11:54 AM Pablo Estrada  wrote:
>
>> +1! (binding)
>>
>> On Thu, Apr 2, 2020 at 11:19 AM Alex Van Boxel  wrote:
>>
>>> Thanks for clearing this up Aizhamal.
>>>
>>> +1 (non binding)
>>>
>>> _/
>>> _/ Alex Van Boxel
>>>
>>>
>>> On Thu, Apr 2, 2020 at 8:14 PM Aizhamal Nurmamat kyzy <
>>> aizha...@apache.org> wrote:
>>>
 Good point, Alex. Actually Julian and I have talked about producing
 this kind of guide. It will be delivered as an additional contribution in
 the follow up. We think this will be a derivative of the original design,
 and be done after the original is officially accepted.

 With this vote, we want to accept the Firefly donation as designed [1],
 and let Julian produce other artifacts using the official Beam mascot later
 on.

 [1]
 https://docs.google.com/document/d/1zK8Cm8lwZ3ALVFpD1aY7TLCVNwlyTS3PXxTV2qQCAbk/edit?usp=sharing


 On Thu, Apr 2, 2020 at 10:37 AM Alex Van Boxel 
 wrote:

> I don't want to be a spoiler... but this vote feels like a final
> deliverable... but without a style guide as Kenn originally suggested most
> of us will not be able to adapt the design. This would include:
>
>- frontal view
>- side view
>- back view
>
> actually different poses so we can mix and match. Without this it
> will never reach the potential of the Go gopher or gRPC Pancakes.
>
> Note this is *not* a negative vote but I'm afraid that the use
> without a guide will be fairly limited as most of us are not designers.
> Just a concern.
>
>  _/
> _/ Alex Van Boxel
>
>
> On Thu, Apr 2, 2020 at 7:27 PM Andrew Pilloud 
> wrote:
>
>> +1, Accept the donation of the Firefly design as Beam Mascot
>>
>> On Thu, Apr 2, 2020 at 10:19 AM Julian Bruno 
>> wrote:
>>
>>> Hello Apache Beam Community,
>>>
>>> Please vote on the acceptance of the final design of the Firefly as
>>> Beam's mascot [1]. Please share your input no later than Monday, April 6,
>>> at noon Pacific Time.
>>>
>>> [ ] +1, Accept the donation of the Firefly design as Beam Mascot
>>>
>>> [ ] -1, Decline the donation of the Firefly design as Beam Mascot
>>>
>>> Vote is adopted by at least 3 PMC +1 approval votes, with no PMC -1
>>> disapproval votes. Non-PMC votes are still encouraged.
>>>
>>> PMC voters, please help by indicating your vote as "(binding)"
>>>
>>> The vote and input phase will be open until Monday, April 6, at 12
>>> pm Pacific Time.
>>>
>>> Thank you very much for your feedback and ideas,
>>>
>>> Julian
>>>
>>> [1]
>>> https://docs.google.com/document/d/1zK8Cm8lwZ3ALVFpD1aY7TLCVNwlyTS3PXxTV2qQCAbk/edit?usp=sharing
>>>
>>>
>>> --
>>> Julian Bruno // Visual Artist & Graphic Designer
>>>  (510) 367-0551 / SF Bay Area, CA
>>> www.instagram.com/julbro.art
>>>
>>>


Re: Next LTS?

2020-03-25 Thread Austin Bennett
I'll submit PR, and tag you, Ismael.

Merge once there's a sufficient consensus.


On Wed, Mar 25, 2020 at 3:57 AM Ismaël Mejía  wrote:

> +1 to removing any mention of LTS-related information, so as not to
> misinform users.
> Would you be up to do that, Austin? Or someone else?
>
> On Wed, Mar 25, 2020 at 1:02 AM Robert Bradshaw 
> wrote:
> >
> > I would want to avoid maintaining a Python 2 LTS, even if just for the
> > fact that the infrastructure might not be there.
> >
> > On Tue, Mar 24, 2020 at 3:58 PM Valentyn Tymofieiev 
> wrote:
> > >
> > > Yes, we had a suggestion to pick a stable Python 2 release as an LTS.
> The suggestion assumed that LTS will continue to exist. Now, if Python 2 is
> the only reason to have an LTS, we can consider it as long as:
> > > - we scope the LTS portion to Python SDK only.
> > > - we have an ownership story for Python 2 LTS, for example volunteers
> in dev or user community who will be willing to maintain that release.
> > >
> > > We can bring this up when we drop Python 2 support. We decided to
> revisit that conversation in a couple of months IIRC.
> > >
> > > On Tue, Mar 24, 2020 at 3:44 PM Ahmet Altay  wrote:
> > >>
> > >> Removing it makes sense. We did not have a good way of measuring the
> demand for LTS releases.
> > >>
> > >> There was a suggestion to mark the last release with python 2 support
> to be an LTS release, was there a conclusion on that? ( +Valentyn
> Tymofieiev )
> > >>
> > >> Ahmet
> > >>
> > >> On Tue, Mar 24, 2020 at 2:34 PM Robert Bradshaw 
> wrote:
> > >>>
> > >>> There seems to have been lack of demand. I agree we should remove
> > >>> these statements from our site until we find a reason to re-visit
> > >>> doing LTS release.
> > >>>
> > >>> On Tue, Mar 24, 2020 at 2:23 PM Austin Bennett
> > >>>  wrote:
> > >>> >
> > >>> > What's our LTS policy these days?  It seems we should remove the
> following from our site (and encourage GCP to do the same, below), if we're
> not going to maintain these.  I'll update the policy page via PR, if I get
> the go-ahead that it is our desire.  Seems we can't suggest policies in a
> policy doc that we don't follow...?
> > >>> >
> > >>> > I am not trying to suggest demand for LTS.  If others haven't
> spoken up, that also indicates lack of demand.  Point of my message is to
> say, we should update our Policies doc, if those aren't what we are
> practicing (and can re-add later if wanting to revive LTS).
> > >>> >
> > >>> > https://beam.apache.org/community/policies/
> > >>> >
> > >>> > Apache Beam aims to make 8 releases in a 12 month period. To
> accommodate users with longer upgrade cycles, some of these releases will
> be tagged as long term support (LTS) releases. LTS releases receive patches
> to fix major issues for 12 months, starting from the release’s initial
> release date. There will be at least one new LTS release in a 12 month
> period, and LTS releases are considered deprecated after 12 months. The
> community will mark a release as a LTS release based on various factors,
> such as the number of LTS releases currently in flight and whether the
> accumulated feature set since the last LTS provides significant upgrade
> value. Non-LTS releases do not receive patches and are considered
> deprecated immediately after the next following minor release. We encourage
> you to update early and often; do not wait until the deprecation date of
> the version you are using.
> > >>> >
> > >>> >
> > >>> >
> > >>> >
> > >>> > Seems a Google Specific Concern, but related to the community:
> https://cloud.google.com/dataflow/docs/support/sdk-version-support-status#apache-beam-sdks-2x
> > >>> >
> > >>> > Apache Beam is an open source, community-led project. Google is
> part of the community, but we do not own the project or control the release
> process. We might open bugs or submit patches to the Apache Beam codebase
> on behalf of Dataflow customers, but we cannot create hotfixes or official
> releases of Apache Beam on demand.
> > >>> >
> > >>> > However, the Apache Beam community designates specific releases as
> long term support (LTS) releases. LTS releases receive patches to fix major
> issues for a designated period of time. See the Apache Beam policies page
> for more det

Re: Next LTS?

2020-03-24 Thread Austin Bennett
What's our LTS policy these days?  It seems we should remove the following
from our site (and encourage GCP to do the same, below), if we're not going
to maintain these.  I'll update the policy page via PR, if I get the go-ahead
that it is our desire.  Seems we can't suggest policies in a policy doc
that we don't follow...?

I am not trying to suggest demand for LTS.  If others haven't spoken up,
that also indicates lack of demand.  Point of my message is to say, we
should update our Policies doc, if those aren't what we are practicing (and
can re-add later if wanting to revive LTS).

https://beam.apache.org/community/policies/

Apache Beam aims to make 8 releases in a 12 month period. To accommodate
users with longer upgrade cycles, some of these releases will be tagged as
long term support (LTS) releases. LTS releases receive patches to fix major
issues for 12 months, starting from the release’s initial release date.
There will be at least one new LTS release in a 12 month period, and LTS
releases are considered deprecated after 12 months. The community will mark
a release as a LTS release based on various factors, such as the number of
LTS releases currently in flight and whether the accumulated feature set
since the last LTS provides significant upgrade value. Non-LTS releases do
not receive patches and are considered deprecated immediately after the
next following minor release. We encourage you to update early and often;
do not wait until the deprecation date of the version you are using.



Seems a Google Specific Concern, but related to the community:
https://cloud.google.com/dataflow/docs/support/sdk-version-support-status#apache-beam-sdks-2x

Apache Beam <http://beam.apache.org/> is an open source, community-led
project. Google is part of the community, but we do not own the project or
control the release process. We might open bugs or submit patches to the
Apache Beam codebase on behalf of Dataflow customers, but we cannot create
hotfixes or official releases of Apache Beam on demand.

However, the Apache Beam community designates specific releases as *long
term support (LTS)* releases. LTS releases receive patches to fix major
issues for a designated period of time. See the Apache Beam policies
<https://beam.apache.org/community/policies/> page for more details about
release policies.









On Thu, Sep 19, 2019 at 5:01 PM Ahmet Altay  wrote:

> I agree with retiring 2.7 as the LTS family. Based on my experience with
> users 2.7 does not have a particularly high adoption and as pointed out has
> known critical issues. Declaring another LTS pending demand sounds
> reasonable but how are we going to gauge this demand?
>
> +Yifan Zou  +Alan Myrvold  on
> the tooling question as well. Unless we address the tooling problem it
> seems difficult to feasibly maintain LTS versions over time.
>
> On Thu, Sep 19, 2019 at 3:45 PM Austin Bennett <
> whatwouldausti...@gmail.com> wrote:
>
>> To be clear, I was picking on - or reminding us of - the promise: I don't
>> have a strong personal need/desire (at least currently) for LTS to exist.
>> Though, worth ensuring we live up to what we keep on the website.  And,
>> without an active LTS, probably something we should take off the site?
>>
>> On Thu, Sep 19, 2019 at 1:33 PM Pablo Estrada  wrote:
>>
>>> +Łukasz Gajowy  had at some point thought of
>>> setting up jenkins jobs without coupling them to the state of the repo
>>> during the last Seed Job. It may be that that improvement can help test
>>> older LTS-type releases?
>>>
>>> On Thu, Sep 19, 2019 at 1:11 PM Robert Bradshaw 
>>> wrote:
>>>
>>>> In many ways the 2.7 LTS was trying to flesh out the process. I think
>>>> we learned some valuable lessons. It would have been good to push out
>>>> something (even if it didn't have everything we wanted) but that is
>>>> unlikely to be worth pursuing now (and 2.7 should probably be retired
>>>> as LTS and no longer recommended).
>>>>
>>>> I agree that it does not seem there is strong demand for an LTS at
>>>> this point. I would propose that we keep 2.16, etc. as potential
>>>> candidates, but only declare one as LTS pending demand. The question
>>>> of how to keep our tooling stable (or backwards/forwards compatible)
>>>> is a good one, especially as we move to drop Python 2.7 in 2020 (which
>>>> could itself be a driver for an LTS).
>>>>
>>>> On Thu, Sep 19, 2019 at 12:27 PM Kenneth Knowles 
>>>> wrote:
>>>> >
>>>> > Yes, I pretty much dropped 2.7.1 release process due to lack of
>>>> interest.
>>>> >
>>>> > There are known problems so that I cannot r

Meetups

2020-03-23 Thread Austin Bennett
Seems we won't be convening in-person in just about any city anytime soon.

Seems like a chance to come together virtually.

WHO CAN SHARE?

Seeking:
* Use Cases
* Developing Beam/Components
* Other

If anything particular, also, what would you like to hear -- can see if we
can track such speakers down.


Re: Issue in testing doing DirectRunner Apache Beam Local Machine

2020-03-20 Thread Austin Bennett
Hi Yogesh,

Many of the examples use the direct runner:
https://github.com/apache/beam/tree/master/examples/java/src/main/java/org/apache/beam/examples

As does the word count example:
https://beam.apache.org/get-started/quickstart-java/#get-the-wordcount-code

I'm not well versed enough with our Java SDK (and dependencies for it), to
recognize the error you shared.

Good luck!
Austin


On Fri, Mar 20, 2020 at 9:25 AM Kumbhare, Yogesh 
wrote:

> Hi Team,
>
>
>
> Please find below the command used to build.
>
>
>
> mvn compile exec:java
> -Dexec.mainClass=com.intuit.dedupe.beam.poc.StreamingDedupe -Pdirect-runner
>
>
>
> If you have any DirectRunner example, please do share it with us.
>
>
>
>
>
> Thanks ,
>
> Yogesh K
>
> 8050397434
>
>
>
>
>
>
>
>
>
>
>
> *From: *"Kumbhare, Yogesh" 
> *Date: *Friday, 20 March 2020 at 3:03 PM
> *To: *"dev@beam.apache.org" 
> *Subject: *Issue in testing doing DirectRunner Apache Beam Local Machine
>
>
>
> Hi Team ,
>
>
>
> I am facing an issue running a simple project on the Apache Beam Direct
> Runner, to test the example locally on our machine.
>
>
>
> I added in pom.xml dependency ,
>
>
>
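> <dependency>
>
>    <groupId>org.apache.beam</groupId>
>
>    <artifactId>beam-runners-direct-java</artifactId>
>
>    <version>2.19.0</version>
>
>    <scope>runtime</scope>
>
> </dependency>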
>
>
>
>
>
>
>
> Please find the below error , let me know any update .
>
>
>
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Class
> org.apache.beam.model.pipeline.v1.RunnerApi$StandardPTransforms$Primitives
> does not implement the requested interface
> org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ProtocolMessageEnum
>
> at
> org.apache.beam.repackaged.direct_java.runners.core.construction.BeamUrns.getUrn(
> BeamUrns.java:27)
>
> at
> org.apache.beam.repackaged.direct_java.runners.core.construction.PTransformTranslation.(
> PTransformTranslation.java:129)
>
> at
> org.apache.beam.repackaged.direct_java.runners.core.construction.PTransformMatchers.lambda$writeWithRunnerDeterminedSharding$1(
> PTransformMatchers.java:483)
>
> at org.apache.beam.sdk.Pipeline$2.enterCompositeTransform(
> Pipeline.java:268)
>
> at
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(
> TransformHierarchy.java:645)
>
> at
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(
> TransformHierarchy.java:649)
>
> at
> org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(
> TransformHierarchy.java:311)
>
> at org.apache.beam.sdk.runners.TransformHierarchy.visit(
> TransformHierarchy.java:245)
>
> at org.apache.beam.sdk.Pipeline.traverseTopologically(
> Pipeline.java:458)
>
> at org.apache.beam.sdk.Pipeline.replace(Pipeline.java:258)
>
> at org.apache.beam.sdk.Pipeline.replaceAll(
> Pipeline.java:208)
>
> at org.apache.beam.runners.direct.DirectRunner.run(
> DirectRunner.java:170)
>
> at org.apache.beam.runners.direct.DirectRunner.run(
> DirectRunner.java:67)
>
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
>
> at com.intuit.dedupe.beam.poc.StreamingDedupe.main(
> StreamingDedupe.java:144)
>
>
>
>
>
>
>


Re: [ANNOUNCE] New committer: Jincheng Sun

2020-02-24 Thread Austin Bennett
Congrats!

On Mon, Feb 24, 2020, 11:22 PM Alex Van Boxel  wrote:

> Congrats!
>
>  _/
> _/ Alex Van Boxel
>
>
> On Mon, Feb 24, 2020 at 8:13 PM Kyle Weaver  wrote:
>
>> Thanks Jincheng for all your work on Beam and Flink integration.
>>
>> On Mon, Feb 24, 2020 at 11:02 AM Yichi Zhang  wrote:
>>
>>> Congrats, Jincheng!
>>>
>>> On Mon, Feb 24, 2020 at 9:45 AM Ahmet Altay  wrote:
>>>
 Congratulations!

 On Mon, Feb 24, 2020 at 6:48 AM Thomas Weise  wrote:

> Congratulations!
>
>
> On Mon, Feb 24, 2020 at 6:45 AM Ismaël Mejía 
> wrote:
>
>> Congrats Jincheng!
>>
>> On Mon, Feb 24, 2020 at 1:39 PM Gleb Kanterov 
>> wrote:
>>
>>> Congratulations!
>>>
>>> On Mon, Feb 24, 2020 at 1:18 PM Hequn Cheng 
>>> wrote:
>>>
 Congratulations Jincheng, well deserved!

 Best,
 Hequn

 On Mon, Feb 24, 2020 at 7:21 PM Reza Rokni  wrote:

> Congrats!
>
> On Mon, Feb 24, 2020 at 7:15 PM Jan Lukavský 
> wrote:
>
>> Congrats Jincheng!
>>
>>   Jan
>>
>> On 2/24/20 11:55 AM, Maximilian Michels wrote:
>> > Hi everyone,
>> >
>> > Please join me and the rest of the Beam PMC in welcoming a new
>> > committer: Jincheng Sun 
>> >
>> > Jincheng has worked on generalizing parts of Beam for Flink's
>> Python
>> > API. He has also picked up other issues, like fixing
>> documentation,
>> > implementing missing features, or cleaning up code [1].
>> >
>> > In consideration of his contributions, the Beam PMC trusts him
>> with
>> > the responsibilities of a Beam committer [2].
>> >
>> > Thank you for your contributions Jincheng!
>> >
>> > -Max, on behalf of the Apache Beam PMC
>> >
>> > [1]
>> >
>> https://jira.apache.org/jira/browse/BEAM-9299?jql=project%20%3D%20BEAM%20AND%20assignee%20in%20(sunjincheng121)
>> > [2]
>> >
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>

