Re: Collecting feedback for Beam usage

2019-09-20 Thread Kyle Weaver
There are some logistics that would need to be worked out. For example, where
would the data go? Who would own it?

Also, I'm not convinced we need yet another place to discuss Beam when we
have already discussed the challenge of simultaneously monitoring mailing
lists, Stack Overflow, Slack, etc. While "how do you use Beam" is certainly
an interesting question, and I'd be curious to know that >= X many people
use a certain runner, I'm not sure answers to these questions are as useful
for guiding the future of Beam as discussions on the dev/users lists, etc.,
since the latter likely yield deeper, more specific feedback.

However, I do think it could be useful in general to include links directly
in the console output. For example, maybe something along the lines of "Oh
no, your Flink pipeline crashed! Check Jira/file a bug/ask the mailing
list."
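
To make the idea concrete, here is a minimal sketch of wrapping a pipeline
run so that help links are printed when it fails. The function names and the
choice of links are assumptions for illustration only; nothing like this
exists in Beam as far as this thread is concerned.

```python
import sys

# Illustrative help destinations; the real set would need to be agreed on.
HELP_LINKS = {
    "Search or file a Jira issue": "https://issues.apache.org/jira/projects/BEAM",
    "Ask the user mailing list": "user@beam.apache.org",
}


def format_failure_help(runner_name, error):
    """Build the console text shown after a failed run (hypothetical wording)."""
    lines = ["Oh no, your {} pipeline crashed: {}".format(runner_name, error)]
    for label, target in HELP_LINKS.items():
        lines.append("  {}: {}".format(label, target))
    return "\n".join(lines)


def run_with_help(run_pipeline, runner_name, stream=sys.stderr):
    """Call run_pipeline(); on failure print help links, then re-raise."""
    try:
        return run_pipeline()
    except Exception as error:
        print(format_failure_help(runner_name, error), file=stream)
        raise
```

The original exception is re-raised, so existing error handling and exit
codes are unchanged; only the console output gains the links.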

Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com


On Fri, Sep 20, 2019 at 4:14 PM Ankur Goenka wrote:

> Hi,
>
> At the moment we don't really have a good way to collect usage
> statistics for Apache Beam, such as which runner is used, because many
> users don't have a way to report their use case.
> How about we create a feedback page where users can add their pipeline
> details and use case?
> Also, we could start printing the link to this page when a user launches
> the pipeline from the command line.
> Example:
> $ python my_pipeline.py --runner DirectRunner --input /tmp/abc
>
> Starting pipeline
> Please use
> http://feedback.beam.org?args=runner=DirectRunner,input=/tmp/abc
> Pipeline started
> ..
>
> Using a link and not publishing the data automatically will give users
> control over what they publish and what they don't. We can enhance the text
> and usage further, but the basic idea is to ask for user feedback at each
> run of the pipeline.
> Let me know what you think.
>
>
> Thanks,
> Ankur
>


Collecting feedback for Beam usage

2019-09-20 Thread Ankur Goenka
Hi,

At the moment we don't really have a good way to collect usage
statistics for Apache Beam, such as which runner is used, because many
users don't have a way to report their use case.
How about we create a feedback page where users can add their pipeline
details and use case?
Also, we could start printing the link to this page when a user launches
the pipeline from the command line.
Example:
$ python my_pipeline.py --runner DirectRunner --input /tmp/abc

Starting pipeline
Please use http://feedback.beam.org?args=runner=DirectRunner,input=/tmp/abc
Pipeline started
..

Using a link and not publishing the data automatically will give users
control over what they publish and what they don't. We can enhance the text
and usage further, but the basic idea is to ask for user feedback at each
run of the pipeline.
Let me know what you think.
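
A minimal sketch of how such a link could be assembled from the pipeline
options. The host is the placeholder from the example above, and the
function name and the `shareable` allow-list are assumptions for
illustration. The allow-list is one way to keep values such as input paths
out of the URL unless they are explicitly opted in.

```python
from urllib.parse import quote

# Placeholder host taken from the example above; no such page exists yet.
FEEDBACK_BASE = "http://feedback.beam.org"


def feedback_url(options, shareable):
    """Build the feedback link from the options named in `shareable`.

    `options` maps option names to values; options that are not listed in
    `shareable` are left out of the link entirely.
    """
    pairs = [
        "{}={}".format(name, options[name])
        for name in shareable
        if name in options
    ]
    # Keep '=', ',' and '/' readable; percent-encode everything else unsafe.
    return "{}?args={}".format(FEEDBACK_BASE, quote(",".join(pairs), safe="=,/"))


if __name__ == "__main__":
    opts = {"runner": "DirectRunner", "input": "/tmp/abc"}
    print("Starting pipeline")
    print("Please use " + feedback_url(opts, shareable=("runner", "input")))
    print("Pipeline started")
```

Nothing is sent anywhere by this code; it only prints a link, which matches
the intent that users decide for themselves whether to publish anything.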


Thanks,
Ankur


Re: contributor permission for Beam Jira tickets

2019-09-20 Thread Robert Burke
You should now be set up to self-assign issues. (Thanks Pablo)

Welcome!

On Fri, Sep 20, 2019, 12:21 PM dev wearebold wrote:

> Hey my Jira username is johnpatoch69
>
> Thank you
>
> Regards,
>
> J
>
> On 20 Sep 2019, at 20:57, Robert Burke wrote:
>
> Absolutely! What's your Jira username? You can create an account if you
> don't already have one, following the instructions here:
> https://beam.apache.org/contribute/#prerequisites
>
> I'm here to help with most things Go SDK too, and provide timely reviews
> and merges of Go PRs.  Just be sure to mention me "R: @lostluck"
>
> On Fri, Sep 20, 2019, 10:16 AM dev wearebold wrote:
>
>>
>> Hi,
>> I’m John, I’d like to work on the Go SDK for Beam, mostly on the util
>> stuff,
>> and I’ll need to create/assign tickets for my work.
>> Can someone add me as a contributor for Beam’s Jira issue?
>>
>> Have a nice day
>>
>> J.
>>
>
>


Re: contributor permission for Beam Jira tickets

2019-09-20 Thread dev wearebold
Hey my Jira username is johnpatoch69

Thank you

Regards,

J

> On 20 Sep 2019, at 20:57, Robert Burke wrote:
> 
> Absolutely! What's your Jira username? You can create an account if you don't 
> already have one, following the instructions here: 
> https://beam.apache.org/contribute/#prerequisites 
> 
> 
> I'm here to help with most things Go SDK too, and provide timely reviews and 
> merges of Go PRs.  Just be sure to mention me "R: @lostluck"
> 
> On Fri, Sep 20, 2019, 10:16 AM dev wearebold wrote:
> 
> Hi,
> I’m John, I’d like to work on the Go SDK for Beam, mostly on the util stuff, 
> and I’ll need to create/assign tickets for my work.
> Can someone add me as a contributor for Beam’s Jira issue?
> 
> Have a nice day
> 
> J. 



Re: contributor permission for Beam Jira tickets

2019-09-20 Thread Robert Burke
Absolutely! What's your Jira username? You can create an account if you
don't already have one, following the instructions here:
https://beam.apache.org/contribute/#prerequisites

I'm here to help with most things Go SDK too, and provide timely reviews
and merges of Go PRs.  Just be sure to mention me "R: @lostluck"

On Fri, Sep 20, 2019, 10:16 AM dev wearebold wrote:

>
> Hi,
> I’m John, I’d like to work on the Go SDK for Beam, mostly on the util
> stuff,
> and I’ll need to create/assign tickets for my work.
> Can someone add me as a contributor for Beam’s Jira issue?
>
> Have a nice day
>
> J.
>


Re: Plan for dropping python 2 support

2019-09-20 Thread Valentyn Tymofieiev
Thank you, Chad, for refreshing this conversation and adding the
perspective of Python 2 users of Beam who have not (yet) completed the
migration. My thoughts below.

- It is in the best interest of everyone to ensure a smooth migration for
Beam users. However, a migration needs to happen, since the Python ecosystem
is moving off Python 2.
- Beam has a couple of dozen dependencies, and we cannot have an
expectation that Python 2 versions of these dependencies will be maintained
in 2020.
- BEAM-1251 should be closed, since it may communicate a signal that Beam
does not support Python 3, while it does. Beam first announced support
for Python 3 in Beam 2.11.0, admittedly later than many mainstream libraries
in the Python ecosystem.
- I think Python 2 LTS releases (if we continue them) may receive critical
bug fixes, but not new features, so we won't be backporting new features.
- Beam portability allows users to customize the user-code runtime
environment, and it should be possible for users to supply a Python 2 SDK
harness container, should they have no other option. This would require a
backported, user-supplied version of the Beam SDK that works on Python 2,
although such an SDK may become difficult or impractical for most users to
maintain.
- There are several open issues related to Python 3, but they are
improvements in nature, and we are steadily closing them off. I am not
aware of any adoption blockers for Beam Python 3 that are specific to Beam.
- I have not heard reports from users who attempted but were unable to use
Beam on Python 3.
- This does not mean that our offering is perfect; there may be errors and
omissions that are yet to be discovered. However, it would be in the best
interest of the Beam community to discover these issues earlier. A message
that Beam will discontinue Python 2 support will encourage users to
migrate; therefore I also support Beam signing https://python3statement.org.
- Having more usage statistics and feedback closer to 2020 can help us be
more confident in deciding when to stop Python 2 support.
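
One way such a message could reach users is a warning emitted when the SDK
is loaded on a Python 2 interpreter. The following is a minimal sketch only:
the function names and the wording are hypothetical, this is not Beam's
actual code, and the 2020 timing is still under discussion in this thread.

```python
import sys
import warnings


def python2_support_warning(version_info):
    """Return a migration notice for Python 2 interpreters, else None."""
    if version_info[0] >= 3:
        return None
    # Hypothetical wording; the actual date and text would follow the vote.
    return (
        "You are running Apache Beam on Python {}.{}. Beam plans to "
        "discontinue Python 2 support in 2020; please migrate your "
        "pipelines to Python 3."
    ).format(version_info[0], version_info[1])


def warn_if_python2(version_info):
    """Emit the notice through the warnings module and return it."""
    message = python2_support_warning(version_info)
    if message is not None:
        warnings.warn(message)
    return message


if __name__ == "__main__":
    warn_if_python2(sys.version_info)
```

Using the standard warnings module leaves users able to filter or silence
the notice with the usual Python mechanisms.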

On Thu, Sep 19, 2019 at 6:05 PM Ahmet Altay wrote:

> Thanks a lot for sharing your thoughts. I completely agree that we need to
> minimize the burden on our users as much as possible, especially in this
> case, when we are only now offering a robust Python 3 solution. However, I
> do share the same concerns related to dependencies and toolchains. It will
> be increasingly difficult for us to keep our code base compatible with
> Python 2 and Python 3 over time. (To be very explicit, one of those
> dependencies is Dataflow's Python pre-portability workers.)
>
> On Thu, Sep 19, 2019 at 5:17 PM Maximilian Michels wrote:
>
>> Given that we have only just finalized Python 3 support, we should
>> allow time for it to mature and for users to make the switch.
>>
>> > Oh, and one more thing, I think it'd make sense for Apache Beam to
>> > sign https://python3statement.org/. The promise is that we'd
>> > discontinue Python 2 support *in* 2020, which is not committing us to
>> > January if we're not ready. Worth a vote?
>>
>> +1
>>
>
> +1
>
>
>>
>> On 19.09.19 15:59, Robert Bradshaw wrote:
>> > Oh, and one more thing, I think it'd make sense for Apache Beam to
>> > sign https://python3statement.org/. The promise is that we'd
>> > discontinue Python 2 support *in* 2020, which is not committing us to
>> > January if we're not ready. Worth a vote?
>> >
>> >
>> > On Thu, Sep 19, 2019 at 3:58 PM Robert Bradshaw wrote:
>> >>
>> >> Exactly how long we support Python 2 depends on our users. Other than
>> >> those that speak up (such as yourself, thanks!), it's hard to get a
>> >> handle on how many need Python 2 and for how long. (Should we send out
>> >> a survey? Maybe after some experience with 2.16?)
>>
>
> +1, we had some success with collecting information from users using
> Twitter surveys.
>
>
>> >>
>> >> On the one hand, the whole ecosystem is finally moving on, and even if
>> >> Beam continues to support Python 2 our dependencies, or other projects
>> >> that are being used in conjunction with Beam, will also be going
>> >> Python 3 only. On the other hand, Beam is, admittedly, quite late to
>> >> the party and could be the one holding people back, and looking at how
>> >> long it took us, if we just barely make it by the end of the year it's
>> >> unreasonable to say at that point "oh, and we're dropping 2.7 at the
>> >> same time."
>> >>
>> >> The good news is that 2.16 is shaping up to be a release I would
>> >> recommend everyone migrate to Python 3 on. The remaining issues are
>> >> things like some issues with main sessions (which already have issues
>> >> in Python 2) and not supporting keyword-only arguments (a new feature,
>> >> not a regression). I would guess that even 2.15 is already good enough
>> >> for most people, at least to kick the tires and run tests to start
>> >> the effort.
>>
>
> I share the same sentiment. Beam 2.16 will offer a strong python 3
> offering. Yes, there are known issues but

contributor permission for Beam Jira tickets

2019-09-20 Thread dev wearebold

Hi,
I’m John, I’d like to work on the Go SDK for Beam, mostly on the util stuff, 
and I’ll need to create/assign tickets for my work.
Can someone add me as a contributor for Beam’s Jira issue?

Have a nice day

J. 

Re: Next LTS?

2019-09-20 Thread Łukasz Gajowy
> And, yes, the fact that Jenkins jobs are separately evolving but pretty
> tightly coupled to the repo contents is a serious problem that I wish we
> had fixed. So verification of each PR was manual.

Could you say a little more about what exactly the problems were back
then? Was that due to incompatible Docker images used in portability tests,
maybe? Any other issues?

My thoughts about Jenkins revolved around decoupling from Jenkins plugins,
Groovy DSL, etc. and replacing as much as possible with more universal tools
(bash, Gradle). The main drivers were to: (1) be able to run the same thing
that runs on Jenkins using bash/Gradle (same scripts), (2) potentially be
able to replace Jenkins more easily with some more modern/better CI/CD tool
in the future (GitHub Actions/GitLab or simply a newer Jenkins with
Jenkinsfiles). I don't yet understand what the problem cited above was (I
didn't work on the LTS back then), so I'm not sure it would help with
releasing LTS versions with backports.

Łukasz

On Fri, 20 Sep 2019 at 02:01, Ahmet Altay wrote:

> I agree with retiring 2.7 as the LTS family. Based on my experience with
> users, 2.7 does not have particularly high adoption and, as pointed out, has
> known critical issues. Declaring another LTS pending demand sounds
> reasonable, but how are we going to gauge this demand?
>
> +Yifan Zou +Alan Myrvold on the tooling question as well. Unless we
> address the tooling problem, it seems difficult to feasibly maintain LTS
> versions over time.
>
> On Thu, Sep 19, 2019 at 3:45 PM Austin Bennett <
> whatwouldausti...@gmail.com> wrote:
>
>> To be clear, I was picking on - or reminding us of - the promise: I don't
>> have a strong personal need/desire (at least currently) for LTS to exist.
>> Though, worth ensuring we live up to what we keep on the website.  And,
>> without an active LTS, probably something we should take off the site?
>>
>> On Thu, Sep 19, 2019 at 1:33 PM Pablo Estrada wrote:
>>
>>> +Łukasz Gajowy had at some point thought of
>>> setting up Jenkins jobs without coupling them to the state of the repo
>>> during the last Seed Job. It may be that that improvement can help test
>>> older LTS-type releases?
>>>
>>> On Thu, Sep 19, 2019 at 1:11 PM Robert Bradshaw wrote:
>>>
>>>> In many ways the 2.7 LTS was trying to flesh out the process. I think
>>>> we learned some valuable lessons. It would have been good to push out
>>>> something (even if it didn't have everything we wanted) but that is
>>>> unlikely to be worth pursuing now (and 2.7 should probably be retired
>>>> as LTS and no longer recommended).
>>>>
>>>> I agree that it does not seem there is strong demand for an LTS at
>>>> this point. I would propose that we keep 2.16, etc. as potential
>>>> candidates, but only declare one as LTS pending demand. The question
>>>> of how to keep our tooling stable (or backwards/forwards compatible)
>>>> is a good one, especially as we move to drop Python 2.7 in 2020 (which
>>>> could itself be a driver for an LTS).
>>>>
>>>> On Thu, Sep 19, 2019 at 12:27 PM Kenneth Knowles wrote:
>>>> >
>>>> > Yes, I pretty much dropped the 2.7.1 release process due to lack of
>>>> > interest.
>>>> >
>>>> > There are known problems, so I cannot recommend that anyone use
>>>> > 2.7.0, yet 2.7 is the current LTS family. So my work on 2.7.1 was
>>>> > philosophical. I did not like the fact that we had a designated LTS
>>>> > family with no usable releases.
>>>> >
>>>> > But many backports were proposed to block 2.7.1, and it took a very
>>>> > long time to get contributors to implement the backports. I ended up
>>>> > doing many of them just to move it along. This indicates a lack of
>>>> > interest to me. The problem is that we cannot really use a strict
>>>> > cut-off date as a way to ensure people do the important things and
>>>> > skip the unimportant things, because we do know that the issues are
>>>> > critical.
>>>> >
>>>> > And, yes, the fact that Jenkins jobs are separately evolving but
>>>> > pretty tightly coupled to the repo contents is a serious problem that
>>>> > I wish we had fixed. So verification of each PR was manual.
>>>> >
>>>> > Altogether, I still think LTS is valuable to have as a promise to
>>>> > users that we will backport critical fixes. I would like to keep that
>>>> > promise and continue to try. Things that are rapidly changing (which
>>>> > something always will be) just won't have fixes backported, and that
>>>> > seems OK.
>>>> >
>>>> > Kenn
>>>> >
>>>> > On Thu, Sep 19, 2019 at 10:59 AM Maximilian Michels wrote:
>>>> >>
>>>> >> An LTS only makes sense if we end up patching the LTS, which so far we
>>>> >> have never done. There has been work done in backporting fixes, see
>>>> >> https://github.com/apache/beam/commits/release-2.7.1 but the effort was
>>>> >> never completed. The main reason, I believe, was complications with
>>>> >> running the evolved release scripts against old Beam versions.
>>>> >>
>>>> >> Now that the portability layer keeps maturing, it makes