Re: Collecting feedback for Beam usage

2019-09-26 Thread Kenneth Knowles
Ah, I didn't realize PyPI was already collecting py2 vs py3. That saves having to split artifacts. Kenn On Thu, Sep 26, 2019 at 5:03 PM Robert Bradshaw wrote: > PyPI download statistics are freely available at > https://pypistats.org/packages/apache-beam . (To answer the original > question,

Re: Collecting feedback for Beam usage

2019-09-26 Thread Robert Bradshaw
PyPI download statistics are freely available at https://pypistats.org/packages/apache-beam . (To answer the original question, nearly all Python 2 at this point, but starting to show a drop.) I think the goal is to get more/orthogonal coverage than a Twitter poll or waiting for users to speak up

Re: Collecting feedback for Beam usage

2019-09-24 Thread Kenneth Knowles
Agreeing with many things here and adding my own flavor to the points:
1. Users' privacy is more important than anything else
2. The goal should be to make things better for users
3. Trading users' opt-in for functionality (like Gradle scans) is not acceptable
4. It should be effectively invisible to

Re: Collecting feedback for Beam usage

2019-09-24 Thread Eugene Kirpichov
Creating a central place for collecting Beam usage sounds compelling, but we'd have to be careful about several aspects:
- It goes without saying that this can never be on-by-default, even for a tiny fraction of pipelines.
- For further privacy protection, including the user's PipelineOptions is

Re: Collecting feedback for Beam usage

2019-09-24 Thread Lukasz Cwik
One of the options could be to just display the URL and not to phone home. I would like it so that users can integrate this into their deployment solution so we get regular stats instead of only when a user decides to run a pipeline manually. On Tue, Sep 24, 2019 at 11:13 AM Robert Bradshaw

Re: Collecting feedback for Beam usage

2019-09-24 Thread Mikhail Gryzykhin
I'm with Luke on this. We can add a set of flags to send home stats and crash dumps if the user agrees. If we keep the code isolated, it will be easy enough for the user to check what is being sent. One more heavy-weight option is to also allow the user to configure and persist what information they are OK with

Re: Collecting feedback for Beam usage

2019-09-24 Thread Lukasz Cwik
Why not add a flag to the SDK that would do the phone home when specified?
From a support perspective it would be useful to know:
* SDK version
* Runner
* SDK-provided PTransforms that are used
* Features like user state/timers/side inputs/splittable dofns/...
* Graph complexity (# nodes, #
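The opt-in report suggested in the message above could look roughly like the following. This is a minimal sketch, not anything that exists in Beam: the function `build_usage_report`, its parameters, the field names, and the sample values are all hypothetical, and nothing is sent over the network. The report is only built when the user opts in, and it is printed so the user can inspect exactly what would be shared.

```python
import json
from typing import List, Optional


def build_usage_report(
    opt_in: bool,
    sdk_version: str,
    runner: str,
    transforms: List[str],
    features: List[str],
    num_nodes: int,
) -> Optional[dict]:
    """Return a small usage report, or None when the user has not opted in.

    All names here are hypothetical; this is not a Beam API.
    """
    if not opt_in:
        # Never collect anything unless the user explicitly asked for it.
        return None
    return {
        "sdk_version": sdk_version,
        "runner": runner,
        # Deduplicate and sort so the report is stable and easy to read.
        "sdk_transforms": sorted(set(transforms)),
        "features": sorted(set(features)),
        "graph_nodes": num_nodes,
    }


# Placeholder values for illustration only.
report = build_usage_report(
    opt_in=True,
    sdk_version="2.15.0",
    runner="FlinkRunner",
    transforms=["ParDo", "GroupByKey", "ParDo"],
    features=["side_inputs"],
    num_nodes=12,
)

# Show the user exactly what would be shared before anything leaves the machine.
print(json.dumps(report, indent=2, sort_keys=True))
```

Keeping the report construction in one isolated function makes it easy to audit what is collected, and deliberately leaves out anything derived from the user's own options or data.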

Re: Collecting feedback for Beam usage

2019-09-23 Thread Robert Bradshaw
On Mon, Sep 23, 2019 at 3:08 PM Brian Hulette wrote: > > Would people actually click on that link though? I think Kyle has a point > that in practice users would only find and click on that link when they're > having some kind of issue, especially if the link has "feedback" in it. I think the

Re: Collecting feedback for Beam usage

2019-09-23 Thread Chad Dombrova
A survey would be a good place to start. This came up in the python2-sunsetting thread as well: we don't know what versions of Python people are using with Beam, which makes it difficult to answer the question of support. -chad On Mon, Sep 23, 2019 at 2:57 PM Ankur Goenka wrote: > I agree,

Re: Collecting feedback for Beam usage

2019-09-23 Thread Brian Hulette
Would people actually click on that link though? I think Kyle has a point that in practice users would only find and click on that link when they're having some kind of issue, especially if the link has "feedback" in it. I agree usage data would be really valuable, but I'm not sure that this

Re: Collecting feedback for Beam usage

2019-09-23 Thread Ankur Goenka
I agree, these are the questions that need to be answered. The data can be anonymized and stored as public data in BigQuery or some other place. The intent is to get usage statistics so that we can learn what people are using (Flink, Spark, etc.); it is not intended for discussion or a help

Re: Collecting feedback for Beam usage

2019-09-20 Thread Kyle Weaver
There are some logistics that would need to be worked out. For example, where would the data go? Who would own it? Also, I'm not convinced we need yet another place to discuss Beam when we have already discussed the challenge of simultaneously monitoring mailing lists, Stack Overflow, Slack, etc. While

Collecting feedback for Beam usage

2019-09-20 Thread Ankur Goenka
Hi, At the moment we don't really have a good way to collect usage statistics for Apache Beam, such as the runner used, and many users don't really have a way to report their use case. How about we create a feedback page where users can add their pipeline details and use case? Also, we