Examples of poolArgs in WorkerPool

2019-10-01 Thread Chris Roat
There is a field called poolArgs in the definition of a WorkerPool. How was
this intended to be used? Are there known allowed keys for the various
runners? In particular, I would like to know if the Dataflow API supports
attaching GPUs to workers.

Cheers,
C


Re: NOTICE: New Python PreCommit jobs

2019-10-01 Thread Chad Dombrova
I haven’t used nose’s parallel execution plugin, but I have used pytest
with xdist with success. If your tests are designed to run in any order and
are properly sandboxed to prevent crosstalk between concurrent runs, which
they *should* be, then in my experience it works very well.
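
To make "properly sandboxed" concrete, here is a minimal sketch using only the
standard library's unittest and tempfile modules. The class and test names are
made up for illustration and are not taken from Beam's test suite. Each test
gets its own temporary directory, so the tests pass in any order and do not
collide when run concurrently.

```python
import os
import tempfile
import unittest


class SandboxedWriteTest(unittest.TestCase):
    """Hypothetical tests: each one works in its own temporary directory."""

    def setUp(self):
        # A fresh directory per test, so concurrent runs never share paths.
        self._tmp = tempfile.TemporaryDirectory()
        self.workdir = self._tmp.name

    def tearDown(self):
        self._tmp.cleanup()

    def _write(self, name, text):
        path = os.path.join(self.workdir, name)
        with open(path, "w") as f:
            f.write(text)
        return path

    def test_write_a(self):
        path = self._write("out.txt", "a")
        with open(path) as f:
            self.assertEqual(f.read(), "a")

    def test_write_b(self):
        # Same file name as test_write_a, but a different directory.
        path = self._write("out.txt", "b")
        with open(path) as f:
            self.assertEqual(f.read(), "b")


def run_in_order(names):
    """Run the named tests in the given order and report overall success."""
    suite = unittest.TestSuite(SandboxedWriteTest(n) for n in names)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

With pytest-xdist installed, something like `pytest -n auto` should be able to
spread tests written this way across worker processes.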


On Fri, Sep 27, 2019 at 6:51 PM Kenneth Knowles  wrote:

> Do things go wrong when nose is configured to use parallel execution?
>
> On Fri, Sep 27, 2019 at 5:09 PM Chad Dombrova  wrote:
>
>> By the way, the outcome on this was that splitting the Python precommit
>> job into one job per Python version increased the total test completion
>> time by 66%, which is obviously not good.  This is because we are using
>> Gradle to run the Python test tasks in parallel (the Jenkins VMs have 16
>> cores each, utilized across 2 slots, IIRC), but after the split there were
>> only 1-2 Gradle tasks per test job.  Since the Python test runner, nose,
>> is currently not using parallel execution, there were not enough
>> concurrent tasks to make proper use of the VM's CPUs.
>>
>> tl;dr  I'm going to create a followup PR to split out just the Lint job
>> (same as we have Spotless for Java).   This is our best ROI for now.
>>
>> -chad
>>
>>
>> On Fri, Sep 27, 2019 at 3:27 PM Kyle Weaver  wrote:
>>
>>> > Do we have good pypi caching?
>>>
>>> Building Python SDK harness containers takes 2 mins each (times 4, the
>>> number of versions) on my machine, even if nothing has changed. But we're
>>> already paying that cost, so I don't think splitting the jobs should make
>>> it any worse. (https://issues.apache.org/jira/browse/BEAM-8277 if
>>> anyone has any ideas)
>>>
>>> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>>>
>>>
>>> On Wed, Sep 25, 2019 at 11:21 AM Pablo Estrada 
>>> wrote:
>>>
>>>> Thanks Chad, and thank you for notifying on the dev list.
>>>>
>>>> On Wed, Sep 25, 2019 at 10:59 AM Kenneth Knowles wrote:
>>>>
>>>>> Nice.
>>>>>
>>>>> Do we have good pypi caching? If not this could add a lot of overhead
>>>>> to our already-backed-up CI queue. (btw I still think your change is good,
>>>>> and just makes proper caching more important)
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova wrote:
>>>>>
>>>>>> Hi all,
>>>>>> I'm working to make the CI experience with python a bit better, and
>>>>>> my current initiative is splitting up the giant Python PreCommit job
>>>>>> into 5 separate jobs for Lint, Py2, Py3.5, Py3.6, and Py3.7.
>>>>>>
>>>>>> Around 11am Pacific time tomorrow I'm going to initiate the seed
>>>>>> jobs, at which point all PRs will start to run the new precommit jobs.
>>>>>> It's a bit of a chicken-and-egg scenario with testing this, so there
>>>>>> could be issues that pop up after the seed jobs are created, but I'll
>>>>>> be working to resolve those issues as quickly as possible.
>>>>>>
>>>>>> If you run into problems because of this change, please let me know
>>>>>> on the github PR.
>>>>>>
>>>>>> Here's the PR: https://github.com/apache/beam/pull/9642
>>>>>> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>>>>>>
>>>>>> The upshot is that after this is done you'll get better feedback on
>>>>>> python test failures!
>>>>>>
>>>>>> Let me know if you have any concerns.
>>>>>>
>>>>>> thanks,
>>>>>> chad
>>>>>>
>>>>>>


Re: Dockerhub push denied for py3.6 and py3.7 image

2019-10-01 Thread Pablo Estrada
When she set up the repo, Hannah requested PMC members to ask for
privileges, so I did.
The set of admins currently is just Hannah and myself, and I don't think
this is available on a public page.

We could either have a PMC-managed account, or allow more PMC members to
have admin privileges - for redundancy.
Best
-P.

On Tue, Oct 1, 2019 at 6:44 PM Ahmet Altay  wrote:

> Who are the admins on dockerhub currently? Is there a page that shows a
> list? The next person doing the release will probably run into similar
> issues. For example, the PyPI page for Beam [1] shows a list of maintainers.
>
> [1] https://pypi.org/project/apache-beam/
>
> Thank you,
> Ahmet
>
> On Tue, Oct 1, 2019 at 11:32 AM Mark Liu  wrote:
>
>> I can push them now. Thank you Pablo!
>>
>> On Tue, Oct 1, 2019 at 11:05 AM Pablo Estrada  wrote:
>>
>>> You were right that the push permissions for repository maintainers were
>>> missing. I've just added the permissions, and you should be able to push to
>>> them now.
>>> Thanks Mark!
>>>
>>> On Tue, Oct 1, 2019 at 11:02 AM Pablo Estrada 
>>> wrote:
>>>
>>>> I'll check for you. One second.
>>>>
>>>> On Tue, Oct 1, 2019 at 10:32 AM Mark Liu wrote:
>>>>
>>>>> Hello Dockerhub Admins,
>>>>>
>>>>> I was able to push Java, Go, py2.7 and py3.5 images to
>>>>> hub.docker.com/u/apachebeam for 2.16 release, but failed for py3.6
>>>>> and py3.7 due to "denied: requested access to the resource is denied".
>>>>> Wondering if I missed some permissions. Can any Dockerhub admins help me
>>>>> check it?
>>>>>
>>>>> Thanks,
>>>>> Mark, Release Manager
>>>>>



Re: Multiple iterations after GroupByKey with SparkRunner

2019-10-01 Thread Reuven Lax
On Mon, Sep 30, 2019 at 2:02 AM Jan Lukavský  wrote:

> > The fact that the annotation on the ParDo "changes" the GroupByKey
> implementation is very specific to the Spark runner implementation.
>
> I don't quite agree. It is not very specific to Spark, it is specific to
> generally all runners, that produce grouped elements in a way that is not
> reiterable. That is the key property. The example you gave with HDFS does
> not satisfy this condition (files on HDFS are certainly reiterable), and
> that's why no change to the GBK is needed (it actually already has the
> required property). A quick look at what FlinkRunner (at least the
> non-portable one) does shows that it implements GBK by reducing elements
> into a List. That is going to crash on a big PCollection, which is even
> nicely documented:
>
>    * For internal use to translate {@link GroupByKey}. For a large {@link PCollection} this is
>    * expected to crash!
>
>
> If this is fixed, then it is likely to start behaving the same as Spark. So
> actually I think the opposite is true - Dataflow is a special case, because
> of how its internal shuffle service works.
>
>

I think you misunderstood - I was not trying to dish on the Spark runner.
Rather my point is that whether the GroupByKey implementation is affected
or not is runner dependent. In some runners it is and in others it isn't.
However in all cases the *semantics* of the ParDo is affected. Since Beam
tries as much as possible to be runner agnostic, we should default to
making the change where there is an obvious semantic difference.

> In general I sympathize with the worry about non-local effects. Beam is
> already full of them (e.g. a Window.into statement affects downstream
> GroupByKeys). In each case where they were added there was extensive debate
> and discussion (Windowing semantics were debated for many months), exactly
> because there was concern over adding these non-local effects. In every
> case, no other good solution could be found. For the case of windowing for
> example, it was often easy to propose simple local APIs (e.g. just pass the
> window fn as a parameter to GroupByKey), however all of these local
> solutions ended up not working for important use cases when we analyzed
> them more deeply.
>
> That is very interesting. Could you elaborate more about some examples of
> the use cases which didn't work? I'd like to try to match it against how
> Euphoria is structured; it should be more resistant to these non-local
> effects, because it very often bundles multiple Beam primitives together
> into a single transform. ReduceByKey is one example of this: it is actually
> a mix of Window.into() + GBK + ParDo. It might look as though a transform
> that can be broken down into something else is not primitive (Euphoria has
> no native equivalent of GBK itself), but this has several other nice
> implications, one being that Combine now becomes a special case of RBK.
> It now becomes only a question of where and how you can "run" the reduce
> function. The logic is absolutely equal. This can be worked out in more
> detail to show that even Combine and RBK can be described by a more
> general stateful operation (ReduceStateByKey), and so finally Euphoria
> actually has only two really "primitive" operations - these are FlatMap
> (basically stateless ParDo) and RSBK. As I already mentioned on some other
> thread, when stateful ParDo would support merging windows, it can be shown
> that both Combine and GBK become special cases of this.
>
> > As you mentioned below, I do think it's perfectly reasonable for a DSL
> to impose its own semantics. Scio already does this - the raw Beam API is
> used by a DSL as a substrate, but the DSL does not need to blindly mirror
> the semantics of the raw Beam API - at least in my opinion!
>
> Sure, but currently, there is no way for a DSL to "hook" into a runner, so
> it has to use the raw Beam SDK, and so this will fail in cases like this -
> where Beam actually has stronger guarantees than are required by the DSL.
> It would be cool if we could find a way to do that - this pretty much aligns
> with another question raised on the ML, about the possibility to override a
> default implementation of a PTransform for a specific pipeline.
>
> Jan
>
>
> On 9/29/19 7:46 PM, Reuven Lax wrote:
>
> Jan,
>
> The fact that the annotation on the ParDo "changes" the GroupByKey
> implementation is very specific to the Spark runner implementation. You can
> imagine another runner that simply writes out files in HDFS to implement a
> GroupByKey - this GroupByKey implementation is agnostic whether the result
> will be reiterated or not; in this case it is very much the ParDo
> implementation that changes to implement a reiterable. I think you don't
> like the fact that an annotation on the ParDo will have a non-local effect
> on the implementation of the GroupByKey upstream. However arguably the
> non-local effect is just a quirk of how the Spark runner is implemented -
> other 

Re: Dockerhub push denied for py3.6 and py3.7 image

2019-10-01 Thread Ahmet Altay
Who are the admins on dockerhub currently? Is there a page that shows a
list? The next person doing the release will probably run into similar
issues. For example, the PyPI page for Beam [1] shows a list of maintainers.

[1] https://pypi.org/project/apache-beam/

Thank you,
Ahmet

On Tue, Oct 1, 2019 at 11:32 AM Mark Liu  wrote:

> I can push them now. Thank you Pablo!
>
> On Tue, Oct 1, 2019 at 11:05 AM Pablo Estrada  wrote:
>
>> You were right that the push permissions for repository maintainers were
>> missing. I've just added the permissions, and you should be able to push to
>> them now.
>> Thanks Mark!
>>
>> On Tue, Oct 1, 2019 at 11:02 AM Pablo Estrada  wrote:
>>
>>> I'll check for you. One second.
>>>
>>> On Tue, Oct 1, 2019 at 10:32 AM Mark Liu  wrote:
>>>
>>>> Hello Dockerhub Admins,
>>>>
>>>> I was able to push Java, Go, py2.7 and py3.5 images to
>>>> hub.docker.com/u/apachebeam for 2.16 release, but failed for py3.6 and
>>>> py3.7 due to "denied: requested access to the resource is denied".
>>>> Wondering if I missed some permissions. Can any Dockerhub admins help me
>>>> check it?
>>>>
>>>> Thanks,
>>>> Mark, Release Manager
>>>>
>>>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Ankur Goenka
+1

On Tue, Oct 1, 2019 at 4:27 PM Ruoyun Huang  wrote:

> +1
>
> On Tue, Oct 1, 2019 at 3:52 PM Rui Wang  wrote:
>
>> +1
>>
>> I needed to use https://python3statement.org to access the website BTW
>> (https, not http).
>>
>>
>> -Rui
>>
>> On Tue, Oct 1, 2019 at 3:29 PM Cam Mach  wrote:
>>
>>> +1
>>>
>>>
>>>
>>> On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri  wrote:
>>>
>>>> +1
>>>>
>>>> On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Tue, 1 Oct 2019 at 11:29, Maximilian Michels wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On 30.09.19 23:03, Reza Rokni wrote:
>>>>>> > +1
>>>>>> >
>>>>>> > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli <ttanay...@gmail.com> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi <smar...@apache.org> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>> >
>>>>>> > Hi everyone,
>>>>>> >
>>>>>> > Please vote whether to sign a pledge on behalf of Apache Beam to
>>>>>> > sunset Beam Python 2 offering (in new releases) in 2020 on
>>>>>> > http://python3stament.org as follows:
>>>>>> >
>>>>>> > [ ] +1: Sign a pledge to discontinue support of Python 2 in Beam in 2020.
>>>>>> > [ ] -1: Do not sign a pledge to discontinue support of Python 2 in Beam in 2020.
>>>>>> >
>>>>>> > The motivation and details for this vote were discussed in [1, 2].
>>>>>> > Please follow up in [2] if you have any questions.
>>>>>> >
>>>>>> > This is a procedural vote [3] that will follow the majority approval
>>>>>> > rules and will be open for at least 72 hours.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Valentyn
>>>>>> >
>>>>>> > [1] https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
>>>>>> > [2] https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
>>>>>> > [3] https://www.apache.org/foundation/voting.html
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > This email may be confidential and privileged. If you received this
>>>>>> > communication by mistake, please don't forward it to anyone else, please
>>>>>> > erase all copies and attachments, and please let me know that it has
>>>>>> > gone to the wrong person.
>>>>>> >
>>>>>> > The above terms reflect a potential business arrangement, are provided
>>>>>> > solely as a basis for further discussion, and are not intended to be and
>>>>>> > do not constitute a legally binding obligation. No legally binding
>>>>>> > obligations will be created, implied, or inferred until an agreement in
>>>>>> > final form is executed in writing by all parties involved.
>>>>>> >
>>>>>>
>
>
> --
> 
> Ruoyun  Huang
>
>


Re: Multiple iterations after GroupByKey with SparkRunner

2019-10-01 Thread Robert Bradshaw
For this specific use case, I would suggest this be done via PTransform URNs.
E.g. one could have a GroupByKeyOneShot whose implementation is

input
    .apply(GroupByKey.of())
    .apply(kv -> KV.of(kv.key(), kv.iterator()))

A runner would be free to recognize and optimize this in the graph (based
on its urn) and swap out a more efficient implementation. Of course a
Coder for Iterators would have to be introduced, and the semantics of a
PCollection of Iterators are a bit odd due to the inherently mutable nature
of Iterators. (Possibly a ReducePerKey transform would be a better
abstraction.)
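
To illustrate why one-shot semantics are odd, here is a minimal sketch in
plain Python, not Beam code. The function names are hypothetical, and
itertools.groupby merely stands in for a runner that hands out grouped values
as iterators. A consumer that iterates the values twice silently sees nothing
on the second pass unless the values are buffered first.

```python
from itertools import groupby
from operator import itemgetter


def group_one_shot(pairs):
    """Yield (key, iterator) pairs; each iterator can be consumed only once."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, (value for _, value in group)


def group_reiterable(pairs):
    """Yield (key, list) pairs; buffering in memory makes values reiterable."""
    for key, values in group_one_shot(pairs):
        yield key, list(values)


def sum_twice(grouped):
    """Iterate each group's values twice, as a reiterating consumer would."""
    return {key: (sum(values), sum(values)) for key, values in grouped}


pairs = [("a", 1), ("b", 10), ("a", 2)]
one_shot = sum_twice(group_one_shot(pairs))    # second pass sees no values
buffered = sum_twice(group_reiterable(pairs))  # both passes see all values
```

Here the second sum over a one-shot group is 0, while the buffered variant
gives the same result on both passes at the cost of holding each group in
memory.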


On Tue, Oct 1, 2019 at 2:16 AM Jan Lukavský  wrote:

> The car analogy was meant to say that in the real world you have to make a
> decision before you take any action. There is no retroactivity possible.
>
> Reuven pointed out, that it is possible (although it seems a little weird
> to me, but that is the only thing I can tell against it :-)), that the way
> a grouped PCollection is produced might be out of control of a consuming
> operator. One example of this might be, that the grouping is produced in a
> submodule (some library), but still, the consumer wants to be able to
> specify if he wants or doesn't want reiterations. There still is a
> "classical" solution to this - the library might expose an interface to
> specify a factory for the grouped PCollection, so that the user of the
> library will be able to specify what he wants. But we can say, that we
> don't want to force users (or authors of libraries) to do that. That's okay
> for me.
>
> If we move on, our next option might be to specify the annotation on the
> consumer (as suggested), but that has all the "not really nice" properties
> of being counter-intuitive, ignoring strong types, etc., etc., for which
> reason I think that this should be ruled out as well.
>
> This leaves us with a single option (at least I have not figured out any
> other) - which is we can bundle GBK and associated ParDo into atomic
> PTransform, which can then be overridden by runners that need special
> handling of this situation - these are all runners that need to buffer data in
> memory in order to support reiterations (Spark and Flink; note that this
> problem arises only for batch case, because in streaming case, one can
> reasonably assume that the data resides in a state that supports
> reiterations). But - we already have this PTransform in Euphoria, it is
> called ReduceByKey, and has all the required properties (technically, it is
> not a PTransform now, but that is a minor detail and can be changed
> trivially).
>
> So, the direction I was trying to take this discussion was - what could be
> the best way for a runner to natively support a PTransform from a DSL? I
> can imagine several options:
>
>  a) support it directly and let runners depend on the DSL (compileOnly
> dependency might suffice, because users will include the DSL into their
> code to be able to use it)
>
>  b) create an interface in runners for user-code to be able to provide
> translation for user-specified operators (this could be absolutely generic,
> DSLs might just use this feature the same way any user could), after all
> runners already use a concept of Translator, but that is pretty much
> copy-pasted, not abstracted into a general purpose one
>
>  c) move the operators that need to be translated into core
>
> The option (c) then leaves open questions related to - if we would want to
> move other operators to core, would this be the right time to ask questions
> if our current set of "core" operators is the ideal one? Or could this be
> optimized?
>
> Jan
> On 10/1/19 12:32 AM, Kenneth Knowles wrote:
>
> In the car analogy, you have something like this:
>
> Iterable: car
> Iterator: taxi ride
>
> They are related, but not as variations of a common concept.
>
> In the discussion of Combine vs RSBK, if the reducer is required to be an
> associative and commutative operator, then it is the same thing under a
> different name. If the reducer can be non-associative or non-commutative,
> then it admits fewer transformations/optimizations.
>
> If you introduce a GroupIteratorsByKey and implement GroupByKey as a
> transform that combines the iterator by concatenation, I think you do get
> an internally consistent system. To execute efficiently, you need to always
> identify and replace the GroupByKey operation with a primitive one. It does
> make some sense to expose the weakest primitives for the sake of DSLs. But
> they are very poorly suited for end-users, and for GBK on most runners you
> get the more powerful one for free.
>
> Kenn
>
> On Mon, Sep 30, 2019 at 2:02 AM Jan Lukavský  wrote:
>
>> > The fact that the annotation on the ParDo "changes" the GroupByKey
>> implementation is very specific to the Spark runner implementation.
>>
>> I don't quite agree. It is not very specific to Spark, it is specific to
>> generally all runners, that produce grouped elements in a way that is not
>> reiterable. That is the key property. The 

Re: Reading from BigQuery on portable runners in Python SDK

2019-10-01 Thread Chamikara Jayalath
Yes, this is something we have wanted to do for some time but could not
prioritize due to other high-priority work. The JIRA is
https://issues.apache.org/jira/browse/BEAM-1440.

Note that BigQuery sources have many moving parts and the Java BigQuery source
[1] is one of the most complicated sources we have. So I suggest
following the Java implementation closely when implementing the Python
version.
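
For readers following the quoted proposal, the export-then-split approach can
be sketched roughly as below. This is plain Python rather than Beam code: the
class names are hypothetical stand-ins (not iobase.BoundedSource or
TextSource), and writing local JSON-lines files stands in for the BigQuery
table export. It only shows the shape of the idea, in which split() returns
one sub-source per exported file.

```python
import json
import os
import tempfile


class JsonLinesFileSource:
    """Hypothetical per-file source; stands in for a text-file source."""

    def __init__(self, path):
        self.path = path

    def read(self):
        # Each line of an exported file holds one JSON-encoded row.
        with open(self.path) as f:
            for line in f:
                yield json.loads(line)


class ExportThenSplitSource:
    """Hypothetical bounded source that exports first, then splits per file."""

    def __init__(self, rows, rows_per_file, export_dir):
        self.rows = rows
        self.rows_per_file = rows_per_file
        self.export_dir = export_dir

    def _export(self):
        # Stand-in for the table export job: write the rows as JSON lines,
        # rows_per_file rows to each file, and return the file paths.
        paths = []
        for start in range(0, len(self.rows), self.rows_per_file):
            path = os.path.join(self.export_dir, "part-%05d.json" % len(paths))
            with open(path, "w") as f:
                for row in self.rows[start:start + self.rows_per_file]:
                    f.write(json.dumps(row) + "\n")
            paths.append(path)
        return paths

    def split(self):
        # One sub-source per exported file, so workers could read in parallel.
        return [JsonLinesFileSource(p) for p in self._export()]


def read_all(rows, rows_per_file):
    """Export, split, and read everything back in one process."""
    with tempfile.TemporaryDirectory() as export_dir:
        source = ExportThenSplitSource(rows, rows_per_file, export_dir)
        parts = source.split()
        return len(parts), [row for part in parts for row in part.read()]
```

A real implementation would also have to handle export job failures, cleanup
of the exported files, and size estimation, which this sketch leaves out.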

Another option would be to wait until we have Splittable DoFn for Python
bounded sources, which is expected to be available soon. But waiting is not
strictly necessary, since we'll be providing converters from BoundedSources
to SDF (though pure SDF versions will probably be better in some regards).

Thanks,
Cham


[1]
https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L546

On Tue, Oct 1, 2019 at 8:48 AM Ahmet Altay  wrote:

> +Chamikara Jayalath  and +Pablo Estrada
>  might have ideas related to this.
>
> On Tue, Oct 1, 2019 at 2:39 AM Kamil Wasilewski <
> kamil.wasilew...@polidea.com> wrote:
>
>> If anyone is interested, here is a link to my code:
>> https://github.com/kamilwu/beam/tree/bounded-source-for-bq
>>
>> On Tue, Oct 1, 2019 at 11:17 AM Kamil Wasilewski <
>> kamil.wasilew...@polidea.com> wrote:
>>
>>> Hi all,
>>>
>>> At the moment, we have a BigQuery native source for Python SDK, which
>>> can be used only by Dataflow runner. Consequently, it doesn't work on
>>> portable runners, such as Flink.
>>>
>>> Recently I have written a prototypical source which implements
>>> iobase.BoundedSource, so that other runners can read from BigQuery as well.
>>> It works the same way as in Java SDK [1], which means that it exports
>>> BigQuery table to JSON and returns TextSource objects in the split() call.
>>> However, it has the following problems:
>>> - it doesn't work on Direct runner,
>>>
>>
> I believe DirectRunner already has an implementation for reading from BQ.
>
>
>> - its API is highly experimental.
>>>
>>
> Which API is highly experimental?
>
>
>>
>>> This is where my question begins. What should we do in order to provide
>>> support for reading from BigQuery on other runners than Dataflow? Do you
>>> think it's fine to continue working on the source I described? Or maybe it
>>> should be done in an entirely different way (not by exporting tables to
>>> JSON)?
>>>
>>> Thanks,
>>> Kamil
>>>
>>> [1]
>>> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
>>>
>>


Re: New JIRA Component Request

2019-10-01 Thread Ning Kang
Thanks, Pablo!

On Tue, Oct 1, 2019 at 3:21 PM Pablo Estrada  wrote:

> I've created a runner-py-interactive component:
> https://jira.apache.org/jira/issues/?jql=project+%3D+BEAM+AND+component+%3D+runner-py-interactive
>
> Hope that helps!
> -P.
>
> On Tue, Oct 1, 2019 at 3:16 PM Ning Kang  wrote:
>
>> +1
>> FYI, I'm temporarily using examples-python component.
>>
>> On Tue, Oct 1, 2019 at 3:04 PM Sam Rohde  wrote:
>>
>>> Hi All,
>>>
>>> I am working on improvements to the InteractiveRunner alongside +David
>>> Yan, +Ning Kang, and Alexey Strokach. I am requesting on behalf of this
>>> working group to add a new Jira component "runner-interactive", as the
>>> current list of components is insufficient.
>>>
>>> Regards,
>>> Sam
>>>
>>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Cam Mach
+1



On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri  wrote:

> +1
>
> On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy 
> wrote:
>
>> +1
>>
>> On Tue, 1 Oct 2019 at 11:29, Maximilian Michels wrote:
>>
>>> +1
>>>
>>> On 30.09.19 23:03, Reza Rokni wrote:
>>> > +1
>>> >
>>> > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli wrote:
>>> >
>>> > +1
>>> >
>>> > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi wrote:
>>> >
>>> > +1
>>> >
>>> > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>> >
>>> > +1
>>> >
>>> > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
>>> >
>>> > +1
>>> >
>>> > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>> >
>>> > Hi everyone,
>>> >
>>> > Please vote whether to sign a pledge on behalf of Apache Beam to
>>> > sunset Beam Python 2 offering (in new releases) in 2020 on
>>> > http://python3stament.org as follows:
>>> >
>>> > [ ] +1: Sign a pledge to discontinue support of Python 2 in Beam in 2020.
>>> > [ ] -1: Do not sign a pledge to discontinue support of Python 2 in Beam in 2020.
>>> >
>>> > The motivation and details for this vote were discussed in [1, 2].
>>> > Please follow up in [2] if you have any questions.
>>> >
>>> > This is a procedural vote [3] that will follow the majority approval
>>> > rules and will be open for at least 72 hours.
>>> >
>>> > Thanks,
>>> > Valentyn
>>> >
>>> > [1] https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
>>> > [2] https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
>>> > [3] https://www.apache.org/foundation/voting.html
>>> >
>>> >
>>>
>>


Re: New JIRA Component Request

2019-10-01 Thread Sam Rohde
Thanks for the quick response Pablo!

On Tue, Oct 1, 2019 at 3:21 PM Pablo Estrada  wrote:

> I've created a runner-py-interactive component:
> https://jira.apache.org/jira/issues/?jql=project+%3D+BEAM+AND+component+%3D+runner-py-interactive
>
> Hope that helps!
> -P.
>
> On Tue, Oct 1, 2019 at 3:16 PM Ning Kang  wrote:
>
>> +1
>> FYI, I'm temporarily using examples-python component.
>>
>> On Tue, Oct 1, 2019 at 3:04 PM Sam Rohde  wrote:
>>
>>> Hi All,
>>>
>>> I am working on improvements to the InteractiveRunner alongside +David
>>> Yan, +Ning Kang, and Alexey Strokach. I am requesting on behalf of this
>>> working group to add a new Jira component "runner-interactive", as the
>>> current list of components is insufficient.
>>>
>>> Regards,
>>> Sam
>>>
>>


Re: New JIRA Component Request

2019-10-01 Thread Pablo Estrada
I've created a runner-py-interactive component:
https://jira.apache.org/jira/issues/?jql=project+%3D+BEAM+AND+component+%3D+runner-py-interactive

Hope that helps!
-P.

On Tue, Oct 1, 2019 at 3:16 PM Ning Kang  wrote:

> +1
> FYI, I'm temporarily using examples-python component.
>
> On Tue, Oct 1, 2019 at 3:04 PM Sam Rohde  wrote:
>
>> Hi All,
>>
>> I am working on improvements to the InteractiveRunner alongside +David
>> Yan, +Ning Kang, and Alexey Strokach. I am requesting on behalf of this
>> working group to add a new Jira component "runner-interactive", as the
>> current list of components is insufficient.
>>
>> Regards,
>> Sam
>>
>


Re: New JIRA Component Request

2019-10-01 Thread Ning Kang
+1
FYI, I'm temporarily using examples-python component.

On Tue, Oct 1, 2019 at 3:04 PM Sam Rohde  wrote:

> Hi All,
>
> I am working on improvements to the InteractiveRunner alongside +David
> Yan, +Ning Kang, and Alexey Strokach. I am requesting on behalf of this
> working group to add a new Jira component "runner-interactive", as the
> current list of components is insufficient.
>
> Regards,
> Sam
>


New JIRA Component Request

2019-10-01 Thread Sam Rohde
Hi All,

I am working on improvements to the InteractiveRunner alongside +David
Yan, +Ning Kang, and Alexey Strokach. I am requesting on behalf of this
working group to add a new Jira component "runner-interactive", as the
current list of components is insufficient.

Regards,
Sam


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Ismaël Mejía
+1

On Tue, Oct 1, 2019, 10:40 PM Lukasz Cwik  wrote:

> +1
>
> On Tue, Oct 1, 2019 at 10:39 AM Ning Kang  wrote:
>
>> +1
>>
>> On Tue, Oct 1, 2019 at 10:17 AM Pablo Estrada  wrote:
>>
>>> +1
>>>
>>> I guess it was http://python3statement.org : )
>>>
>>> On Tue, Oct 1, 2019 at 10:14 AM Mark Liu  wrote:
>>>
>>>> +1
>>>>
>>>> btw, the link (http://python3stament.org) you provided is broken.
>>>>
>>>> On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On Tue, 1 Oct 2019 at 11:29, Maximilian Michels wrote:
>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>> On 30.09.19 23:03, Reza Rokni wrote:
>>>>>>> > +1
>>>>>>> >
>>>>>>> > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli <ttanay...@gmail.com> wrote:
>>>>>>> >
>>>>>>> > +1
>>>>>>> >
>>>>>>> > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi <smar...@apache.org> wrote:
>>>>>>> >
>>>>>>> > +1
>>>>>>> >
>>>>>>> > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>> >
>>>>>>> > +1
>>>>>>> >
>>>>>>> > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
>>>>>>> >
>>>>>>> > +1
>>>>>>> >
>>>>>>> > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>>> >
>>>>>>> > Hi everyone,
>>>>>>> >
>>>>>>> > Please vote whether to sign a pledge on behalf of Apache Beam to
>>>>>>> > sunset Beam Python 2 offering (in new releases) in 2020 on
>>>>>>> > http://python3stament.org as follows:
>>>>>>> >
>>>>>>> > [ ] +1: Sign a pledge to discontinue support of Python 2 in Beam in 2020.
>>>>>>> > [ ] -1: Do not sign a pledge to discontinue support of Python 2 in Beam in 2020.
>>>>>>> >
>>>>>>> > The motivation and details for this vote were discussed in [1, 2].
>>>>>>> > Please follow up in [2] if you have any questions.
>>>>>>> >
>>>>>>> > This is a procedural vote [3] that will follow the majority approval
>>>>>>> > rules and will be open for at least 72 hours.
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Valentyn
>>>>>>> >
>>>>>>> > [1] https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
>>>>>>> > [2] https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
>>>>>>> > [3] https://www.apache.org/foundation/voting.html
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>
>>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Lukasz Cwik
+1

On Tue, Oct 1, 2019 at 10:39 AM Ning Kang  wrote:

> +1
>
> On Tue, Oct 1, 2019 at 10:17 AM Pablo Estrada  wrote:
>
>> +1
>>
>> I guess it was http://python3statement.org : )
>>
>> On Tue, Oct 1, 2019 at 10:14 AM Mark Liu  wrote:
>>
>>> +1
>>>
>>> btw, the link (http://python3stament.org) you provided is broken.
>>>
>>> On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri  wrote:
>>>
>>>> +1
>>>>
>>>> On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Tue, 1 Oct 2019 at 11:29, Maximilian Michels wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On 30.09.19 23:03, Reza Rokni wrote:
>>>>>> > +1
>>>>>> >
>>>>>> > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli <ttanay...@gmail.com> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi <smar...@apache.org> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>> >
>>>>>> > Hi everyone,
>>>>>> >
>>>>>> > Please vote whether to sign a pledge on behalf of Apache Beam to
>>>>>> > sunset Beam Python 2 offering (in new releases) in 2020 on
>>>>>> > http://python3stament.org as follows:
>>>>>> >
>>>>>> > [ ] +1: Sign a pledge to discontinue support of Python 2 in Beam in 2020.
>>>>>> > [ ] -1: Do not sign a pledge to discontinue support of Python 2 in Beam in 2020.
>>>>>> >
>>>>>> > The motivation and details for this vote were discussed in [1, 2].
>>>>>> > Please follow up in [2] if you have any questions.
>>>>>> >
>>>>>> > This is a procedural vote [3] that will follow the majority approval
>>>>>> > rules and will be open for at least 72 hours.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Valentyn
>>>>>> >
>>>>>> > [1] https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
>>>>>> > [2] https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
>>>>>> > [3] https://www.apache.org/foundation/voting.html
>>>>>> >
>>>>>> >
>>>>>> >
>> >
>>
>


Re: Beam KinesisIO Migration V1 to V2

2019-10-01 Thread Ismaël Mejía
+dev

On Tue, Oct 1, 2019 at 8:35 PM Ismaël Mejía  wrote:
>
> Thanks a lot Cam for bringing this document to the mailing list (I left
> some comments there). There was a recent proposal doc about supporting
> async in Beam, so you may be interested in taking a look at how that
> evolves [1]. It is definitely interesting for its implications for IO
> authors.
>
> I am piggybacking on this email to make other members of the community
> aware of the ongoing work by Cam. He has been a contributor on the Java
> IO front for a while: he contributed the IO for DynamoDB, and because of
> that work he started updating all the Amazon Web Services (AWS) related
> IOs (SNS, SQS, etc.) to use the new AWS SDK (v2), which is how he got to
> the Kinesis migration. The AWS IO migration is an ongoing effort, so if
> other people are interested in contributing in this area, do not hesitate
> to contact him (or me). See BEAM-7555 [2] for more info. (Sorry for the
> ad :P)
>
> Ismaël
>
> [1] 
> https://lists.apache.org/thread.html/26d0b7b4f89dcc265fd5deb9cdca3f3bc6daa7cdf2fe56e09b5e7a36@%3Cdev.beam.apache.org%3E
> [2] https://issues.apache.org/jira/browse/BEAM-7555
>
>
> On Mon, Sep 30, 2019 at 2:02 PM Cam Mach  wrote:
> >
> > Hello Beam Dev,
> >
> > I have discussed this topic with a couple of Beam devs. We found
> > something interesting in the new AWS Kinesis SDK and Libraries V2, so I
> > would like to propose a design for this migration.
> >
> > Here is the design doc:
> > https://docs.google.com/document/d/1XeIVbiDHBReZY8rEI2OWA3cTEQuaR7RPdwGAup6S1DM
> >
> > I would love to hear your feedback and comments.
> >
> > Thanks,
> > Cam
> >
> >


Re: Dockerhub push denied for py3.6 and py3.7 image

2019-10-01 Thread Mark Liu
I can push them now. Thank you Pablo!

On Tue, Oct 1, 2019 at 11:05 AM Pablo Estrada  wrote:

> You were right that the push permissions for repository maintainers were
> missing. I've just added the permissions, and you should be able to push to
> them now.
> Thanks Mark!
>
> On Tue, Oct 1, 2019 at 11:02 AM Pablo Estrada  wrote:
>
>> I'll check for you. One second.
>>
>> On Tue, Oct 1, 2019 at 10:32 AM Mark Liu  wrote:
>>
>>> Hello Dockerhub Admins,
>>>
>>> I was able to push Java, Go, py2.7 and py3.5 images to
>>> hub.docker.com/u/apachebeam for 2.16 release, but failed for py3.6 and
>>> py3.7 due to "denied: requested access to the resource is denied".
>>> Wondering if I missed some permissions. Can any Dockerhub admins help me
>>> check it?
>>>
>>> Thanks,
>>> Mark, Release Manager
>>>
>>


Re: Dockerhub push denied for py3.6 and py3.7 image

2019-10-01 Thread Pablo Estrada
I'll check for you. One second.

On Tue, Oct 1, 2019 at 10:32 AM Mark Liu  wrote:

> Hello Dockerhub Admins,
>
> I was able to push Java, Go, py2.7 and py3.5 images to
> hub.docker.com/u/apachebeam for 2.16 release, but failed for py3.6 and
> py3.7 due to "denied: requested access to the resource is denied".
> Wondering if I missed some permissions. Can any Dockerhub admins help me
> check it?
>
> Thanks,
> Mark, Release Manager
>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Pablo Estrada
+1

I guess it was http://python3statement.org : )

On Tue, Oct 1, 2019 at 10:14 AM Mark Liu  wrote:

> +1
>
> btw, the link (http://python3stament.org) you provided is broken.
>
> On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri  wrote:
>
>> +1
>>
>> On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy 
>> wrote:
>>
>>> +1
>>>
>>> wt., 1 paź 2019 o 11:29 Maximilian Michels  napisał(a):
>>>
 +1

 On 30.09.19 23:03, Reza Rokni wrote:
 > +1
 >
 > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli wrote:
 >
 > +1
 >
 > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi wrote:
 >
 > +1
 >
 > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang wrote:
 >
 > +1
 >
 > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
 >
 > +1
 >
 > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev wrote:
 >
 > Hi everyone,
 >
 > Please vote whether to sign a pledge on behalf of
 > Apache Beam to sunset Beam Python 2 offering (in
 new
 > releases) in 2020 on http://python3stament.org as
 > follows:
 >
 > [ ] +1: Sign a pledge to discontinue support of
 > Python 2 in Beam in 2020.
 > [ ] -1: Do not sign a pledge to discontinue
 support
 > of Python 2 in Beam in 2020.
 >
 > The motivation and details for this vote were
 > discussed in [1, 2]. Please follow up in [2] if
 you
 > have any questions.
 >
 > This is a procedural vote [3] that will follow the
 > majority approval rules and will be open for at
 > least 72 hours.
 >
 > Thanks,
 > Valentyn
 >
 > [1]
 >
 https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
 > [2]
 >
 https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
 > [3] https://www.apache.org/foundation/voting.html
 >
 >
 >
 >

>>>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Robert Bradshaw
The correct link is https://python3statement.org/

On Tue, Oct 1, 2019 at 10:14 AM Mark Liu  wrote:
>
> +1
>
> btw, the link (http://python3stament.org) you provided is broken.
>
> On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri  wrote:
>>
>> +1
>>
>> On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy  wrote:
>>>
>>> +1
>>>
>>> wt., 1 paź 2019 o 11:29 Maximilian Michels  napisał(a):

 +1

 On 30.09.19 23:03, Reza Rokni wrote:
 > +1
 >
 > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli wrote:
 >
 > +1
 >
 > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi wrote:
 >
 > +1
 >
 > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang wrote:
 >
 > +1
 >
 > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
 >
 > +1
 >
 > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev wrote:
 >
 > Hi everyone,
 >
 > Please vote whether to sign a pledge on behalf of
 > Apache Beam to sunset Beam Python 2 offering (in new
 > releases) in 2020 on http://python3stament.org as
 > follows:
 >
 > [ ] +1: Sign a pledge to discontinue support of
 > Python 2 in Beam in 2020.
 > [ ] -1: Do not sign a pledge to discontinue support
 > of Python 2 in Beam in 2020.
 >
 > The motivation and details for this vote were
 > discussed in [1, 2]. Please follow up in [2] if you
 > have any questions.
 >
 > This is a procedural vote [3] that will follow the
 > majority approval rules and will be open for at
 > least 72 hours.
 >
 > Thanks,
 > Valentyn
 >
 > [1]
 > 
 > https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
 > [2]
 > 
 > https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
 > [3] https://www.apache.org/foundation/voting.html
 >
 >
 >
 >


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Valentyn Tymofieiev
Thanks, sorry, there is a typo. The link is https://python3statement.org.



On Tue, Oct 1, 2019 at 10:14 AM Mark Liu  wrote:

> +1
>
> btw, the link (http://python3stament.org) you provided is broken.
>
> On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri  wrote:
>
>> +1
>>
>> On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy 
>> wrote:
>>
>>> +1
>>>
>>> wt., 1 paź 2019 o 11:29 Maximilian Michels  napisał(a):
>>>
 +1

 On 30.09.19 23:03, Reza Rokni wrote:
 > +1
 >
 > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli wrote:
 >
 > +1
 >
 > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi wrote:
 >
 > +1
 >
 > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang wrote:
 >
 > +1
 >
 > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
 >
 > +1
 >
 > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev wrote:
 >
 > Hi everyone,
 >
 > Please vote whether to sign a pledge on behalf of
 > Apache Beam to sunset Beam Python 2 offering (in
 new
 > releases) in 2020 on http://python3stament.org as
 > follows:
 >
 > [ ] +1: Sign a pledge to discontinue support of
 > Python 2 in Beam in 2020.
 > [ ] -1: Do not sign a pledge to discontinue
 support
 > of Python 2 in Beam in 2020.
 >
 > The motivation and details for this vote were
 > discussed in [1, 2]. Please follow up in [2] if
 you
 > have any questions.
 >
 > This is a procedural vote [3] that will follow the
 > majority approval rules and will be open for at
 > least 72 hours.
 >
 > Thanks,
 > Valentyn
 >
 > [1]
 >
 https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
 > [2]
 >
 https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
 > [3] https://www.apache.org/foundation/voting.html
 >
 >
 >
 >

>>>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Mark Liu
+1

btw, the link (http://python3stament.org) you provided is broken.

On Tue, Oct 1, 2019 at 9:44 AM Udi Meiri  wrote:

> +1
>
> On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy 
> wrote:
>
>> +1
>>
>> wt., 1 paź 2019 o 11:29 Maximilian Michels  napisał(a):
>>
>>> +1
>>>
>>> On 30.09.19 23:03, Reza Rokni wrote:
>>> > +1
>>> >
>>> > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli wrote:
>>> >
>>> > +1
>>> >
>>> > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi wrote:
>>> >
>>> > +1
>>> >
>>> > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang wrote:
>>> >
>>> > +1
>>> >
>>> > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
>>> >
>>> > +1
>>> >
>>> > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev wrote:
>>> >
>>> > Hi everyone,
>>> >
>>> > Please vote whether to sign a pledge on behalf of
>>> > Apache Beam to sunset Beam Python 2 offering (in
>>> new
>>> > releases) in 2020 on http://python3stament.org as
>>> > follows:
>>> >
>>> > [ ] +1: Sign a pledge to discontinue support of
>>> > Python 2 in Beam in 2020.
>>> > [ ] -1: Do not sign a pledge to discontinue support
>>> > of Python 2 in Beam in 2020.
>>> >
>>> > The motivation and details for this vote were
>>> > discussed in [1, 2]. Please follow up in [2] if you
>>> > have any questions.
>>> >
>>> > This is a procedural vote [3] that will follow the
>>> > majority approval rules and will be open for at
>>> > least 72 hours.
>>> >
>>> > Thanks,
>>> > Valentyn
>>> >
>>> > [1]
>>> >
>>> https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
>>> > [2]
>>> >
>>> https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
>>> > [3] https://www.apache.org/foundation/voting.html
>>> >
>>> >
>>> >
>>> >
>>>
>>


Re: Introduction + Support in Comms for Beam!

2019-10-01 Thread Pablo Estrada
Welcome Maria! : )

On Tue, Oct 1, 2019 at 8:32 AM Ahmet Altay  wrote:

> Welcome!
>
> On Tue, Oct 1, 2019 at 3:26 AM Jesse Anderson 
> wrote:
>
>> Excellent and welcome!
>>
>> Jesse Anderson
>> Managing Director
>> Big Data Institute
>> (775) 393 9122 | je...@bigdatainstitute.io
>> bigdatainstitute.io 
>>
>>
>> On Tue, Oct 1, 2019 at 10:46 AM Łukasz Gajowy  wrote:
>>
>>> Welcome! :)
>>>
>>> wt., 1 paź 2019 o 11:30 Maximilian Michels  napisał(a):
>>>
 Welcome Maria! Looking forward to your proposal.

 Cheers,
 Max

 On 01.10.19 00:33, Reza Rokni wrote:
 > Welcome!
 >
 > On Tue, 1 Oct 2019 at 11:18, Lukasz Cwik wrote:
 >
 > Welcome to the community.
 >
 > On Mon, Sep 30, 2019 at 3:15 PM María Cruz wrote:
 >
 > Hi everyone,
 > my name is María Cruz, I am from Buenos Aires but I live in
 the
 > Bay Area. I recently became acquainted with Apache Beam
 project,
 > and I got a chance to meet some of the Beam community at
 Apache
 > Con North America this past September. I'm testing out a
 > communications framework
 > <
 https://medium.com/@marianarra_/designing-a-communications-framework-for-community-engagement-e087312f9b83
 >
 > for Open Source communities. I'm emailing the list now because
 > I'd like to work on a communications strategy for Beam, to
 make
 > the most of the content you produce during Beam Summits.
 >
 > A little bit more about me. I am a communications strategist
 > with 11 years of experience in the field, 8 of which are in
 the
 > non-profit sector. I started working in Open Source in 2013,
 > when I joined Wikimedia, the social movement behind
 Wikipedia. I
 > now work to support Google Open Source projects, and I also
 > volunteer in the communications team of the Apache Software
 > Foundation, working closely with Sally (for those of you who
 > know her).
 >
 > I will be sending the list a proposal in the coming days.
 > Looking forward to hearing from you!
 >
 > Best,
 >
 > María
 >
 >
 >
 >

>>>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Udi Meiri
+1

On Tue, Oct 1, 2019 at 3:22 AM Łukasz Gajowy 
wrote:

> +1
>
> wt., 1 paź 2019 o 11:29 Maximilian Michels  napisał(a):
>
>> +1
>>
>> On 30.09.19 23:03, Reza Rokni wrote:
>> > +1
>> >
>> > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli wrote:
>> >
>> > +1
>> >
>> > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi wrote:
>> >
>> > +1
>> >
>> > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang wrote:
>> >
>> > +1
>> >
>> > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
>> >
>> > +1
>> >
>> > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev wrote:
>> >
>> > Hi everyone,
>> >
>> > Please vote whether to sign a pledge on behalf of
>> > Apache Beam to sunset Beam Python 2 offering (in new
>> > releases) in 2020 on http://python3stament.org as
>> > follows:
>> >
>> > [ ] +1: Sign a pledge to discontinue support of
>> > Python 2 in Beam in 2020.
>> > [ ] -1: Do not sign a pledge to discontinue support
>> > of Python 2 in Beam in 2020.
>> >
>> > The motivation and details for this vote were
>> > discussed in [1, 2]. Please follow up in [2] if you
>> > have any questions.
>> >
>> > This is a procedural vote [3] that will follow the
>> > majority approval rules and will be open for at
>> > least 72 hours.
>> >
>> > Thanks,
>> > Valentyn
>> >
>> > [1]
>> >
>> https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
>> > [2]
>> >
>> https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
>> > [3] https://www.apache.org/foundation/voting.html
>> >
>> >
>> >
>> >
>>
>



Re: Reading from BigQuery on portable runners in Python SDK

2019-10-01 Thread Ahmet Altay
+Chamikara Jayalath and +Pablo Estrada might have ideas related to this.

On Tue, Oct 1, 2019 at 2:39 AM Kamil Wasilewski <
kamil.wasilew...@polidea.com> wrote:

> If anyone is interested, here is a link to my code:
> https://github.com/kamilwu/beam/tree/bounded-source-for-bq
>
> On Tue, Oct 1, 2019 at 11:17 AM Kamil Wasilewski <
> kamil.wasilew...@polidea.com> wrote:
>
>> Hi all,
>>
>> At the moment, we have a native BigQuery source for the Python SDK, which
>> can be used only with the Dataflow runner. Consequently, it doesn't work on
>> portable runners, such as Flink.
>>
>> Recently I have written a prototype source which implements
>> iobase.BoundedSource, so that other runners can read from BigQuery as well.
>> It works the same way as in the Java SDK [1], which means that it exports
>> the BigQuery table to JSON and returns TextSource objects in the split() call.
>> However, it has the following problems:
>> - it doesn't work on Direct runner,
>>
>
I believe DirectRunner already has an implementation for reading from BQ.


> - its API is highly experimental.
>>
>
Which API is highly experimental?


>
>> This is where my question begins. What should we do in order to provide
>> support for reading from BigQuery on runners other than Dataflow? Do you
>> think it's fine to continue working on the source I described? Or maybe it
>> should be done in an entirely different way (not by exporting tables to
>> JSON)?
>>
>> Thanks,
>> Kamil
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
>>
>


Re: Introduction + Support in Comms for Beam!

2019-10-01 Thread Ahmet Altay
Welcome!

On Tue, Oct 1, 2019 at 3:26 AM Jesse Anderson 
wrote:

> Excellent and welcome!
>
> Jesse Anderson
> Managing Director
> Big Data Institute
> (775) 393 9122 | je...@bigdatainstitute.io
> bigdatainstitute.io 
>
>
> On Tue, Oct 1, 2019 at 10:46 AM Łukasz Gajowy  wrote:
>
>> Welcome! :)
>>
>> wt., 1 paź 2019 o 11:30 Maximilian Michels  napisał(a):
>>
>>> Welcome Maria! Looking forward to your proposal.
>>>
>>> Cheers,
>>> Max
>>>
>>> On 01.10.19 00:33, Reza Rokni wrote:
>>> > Welcome!
>>> >
>>> > On Tue, 1 Oct 2019 at 11:18, Lukasz Cwik wrote:
>>> >
>>> > Welcome to the community.
>>> >
>>> > On Mon, Sep 30, 2019 at 3:15 PM María Cruz wrote:
>>> >
>>> > Hi everyone,
>>> > my name is María Cruz, I am from Buenos Aires but I live in the
>>> > Bay Area. I recently became acquainted with Apache Beam
>>> project,
>>> > and I got a chance to meet some of the Beam community at Apache
>>> > Con North America this past September. I'm testing out a
>>> > communications framework
>>> > <
>>> https://medium.com/@marianarra_/designing-a-communications-framework-for-community-engagement-e087312f9b83
>>> >
>>> > for Open Source communities. I'm emailing the list now because
>>> > I'd like to work on a communications strategy for Beam, to make
>>> > the most of the content you produce during Beam Summits.
>>> >
>>> > A little bit more about me. I am a communications strategist
>>> > with 11 years of experience in the field, 8 of which are in the
>>> > non-profit sector. I started working in Open Source in 2013,
>>> > when I joined Wikimedia, the social movement behind Wikipedia.
>>> I
>>> > now work to support Google Open Source projects, and I also
>>> > volunteer in the communications team of the Apache Software
>>> > Foundation, working closely with Sally (for those of you who
>>> > know her).
>>> >
>>> > I will be sending the list a proposal in the coming days.
>>> > Looking forward to hearing from you!
>>> >
>>> > Best,
>>> >
>>> > María
>>> >
>>> >
>>> >
>>> >
>>>
>>


Re: [ANNOUNCE] New committer: Alan Myrvold

2019-10-01 Thread Alexey Romanenko
Congratulations! Well deserved!

> On 30 Sep 2019, at 19:12, Udi Meiri wrote:
>
> Congrats Alan!
>
> On Mon, Sep 30, 2019 at 11:12 AM Alan Myrvold wrote:
> Thanks!! Looking forward to making more impact to Apache Beam
>
> On Mon, Sep 30, 2019 at 10:56 AM Mikhail Gryzykhin wrote:
> Congratulations!
>
> On Mon, Sep 30, 2019 at 9:47 AM David Cavazos wrote:
> Congratulations Alan!
>
> On Mon, Sep 30, 2019 at 7:57 AM Connell O'Callaghan wrote:
> Congratulations Alan - well done!!! Ahmet thank you for sharing this great
> news!!!
>
> On Mon, Sep 30, 2019 at 7:34 AM Łukasz Gajowy wrote:
> Congratulations :)
>
> pon., 30 wrz 2019 o 15:41 Reza Rokni napisał(a):
> Woohoo Congratulations!
>
> On Mon, 30 Sep 2019 at 21:06, Thomas Weise wrote:
> Congratulations, Alan!
>
>
> On Mon, Sep 30, 2019 at 4:47 AM Ismaël Mejía wrote:
> Congrats Alan!
>
> On Mon, Sep 30, 2019, 11:20 AM Tanay Tummalapalli wrote:
> Congratulations, Alan!
>
>
> On Mon, Sep 30, 2019 at 1:03 PM Gleb Kanterov wrote:
> Congratulations!
>
> On Sat, Sep 28, 2019 at 12:07 AM Valentyn Tymofieiev wrote:
> Congratulations, Alan. Well deserved.
>
> On Fri, Sep 27, 2019 at 2:09 PM Chamikara Jayalath wrote:
> Congrats Alan!!
>
> On Fri, Sep 27, 2019 at 1:49 PM Jan Lukavský wrote:
> Congrats Alan!
> On 9/27/19 10:22 PM, Mark Liu wrote:
>> Congratulations Alan!!!
>>
>> On Fri, Sep 27, 2019 at 12:55 PM Ning Kang wrote:
>> Congrats Alan!
>>
>> On Fri, Sep 27, 2019 at 12:02 PM Ankur Goenka wrote:
>> Congratulations Alan!
>>
>> On Fri, Sep 27, 2019 at 11:17 AM Yichi Zhang wrote:
>> Congrats, Alan!
>>
>> On Fri, Sep 27, 2019 at 10:26 AM Robin Qiu wrote:
>> Congrats, Alan!
>>
>> On Fri, Sep 27, 2019 at 10:15 AM Hannah Jiang wrote:
>> Congrats Alan!
>>
>> On Fri, Sep 27, 2019 at 9:57 AM Ruoyun Huang wrote:
>> Congratulations, Alan!
>>
>>
>> On Fri, Sep 27, 2019 at 9:55 AM Rui Wang wrote:
>> Congrats!
>>
>> -Rui
>>
>> On Fri, Sep 27, 2019 at 9:54 AM Pablo Estrada wrote:
>> Yooh! : D
>>
>> On Fri, Sep 27, 2019 at 9:53 AM Yifan Zou wrote:
>> Congratulations, Alan!
>>
>> On Fri, Sep 27, 2019 at 9:18 AM Ahmet Altay wrote:
>> Hi,
>> 
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Alan Myrvold
>> 
>> Alan has been a long time Beam contributor. His contributions made Beam more 
>> productive and friendlier [1] for all contributors with significant 
>> improvements to Beam release process, automation, and infrastructure.
>> 
>> In consideration of Alan's contributions, the Beam PMC trusts him
>> with the responsibilities of a Beam committer [2].
>> 
>> Thank you, Alan, for your contributions and looking forward to many more!
>> 
>> Ahmet, on behalf of the Apache Beam PMC
>> 
>> [1] https://beam-summit-na-2019.firebaseapp.com/schedule/2019-09-11?sessionId=1126
>> [2] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>> 
>> 
>> -- 
>> 
>> Ruoyun  Huang
>> 
> 
> 
> -- 
> Cheers,
> Gleb
> 
> 



Re: Introduction + Support in Comms for Beam!

2019-10-01 Thread Jesse Anderson
Excellent and welcome!

Jesse Anderson
Managing Director
Big Data Institute
(775) 393 9122 | je...@bigdatainstitute.io
bigdatainstitute.io 


On Tue, Oct 1, 2019 at 10:46 AM Łukasz Gajowy  wrote:

> Welcome! :)
>
> wt., 1 paź 2019 o 11:30 Maximilian Michels  napisał(a):
>
>> Welcome Maria! Looking forward to your proposal.
>>
>> Cheers,
>> Max
>>
>> On 01.10.19 00:33, Reza Rokni wrote:
>> > Welcome!
>> >
>> > On Tue, 1 Oct 2019 at 11:18, Lukasz Cwik wrote:
>> >
>> > Welcome to the community.
>> >
>> > On Mon, Sep 30, 2019 at 3:15 PM María Cruz wrote:
>> >
>> > Hi everyone,
>> > my name is María Cruz, I am from Buenos Aires but I live in the
>> > Bay Area. I recently became acquainted with Apache Beam project,
>> > and I got a chance to meet some of the Beam community at Apache
>> > Con North America this past September. I'm testing out a
>> > communications framework
>> > <
>> https://medium.com/@marianarra_/designing-a-communications-framework-for-community-engagement-e087312f9b83
>> >
>> > for Open Source communities. I'm emailing the list now because
>> > I'd like to work on a communications strategy for Beam, to make
>> > the most of the content you produce during Beam Summits.
>> >
>> > A little bit more about me. I am a communications strategist
>> > with 11 years of experience in the field, 8 of which are in the
>> > non-profit sector. I started working in Open Source in 2013,
>> > when I joined Wikimedia, the social movement behind Wikipedia. I
>> > now work to support Google Open Source projects, and I also
>> > volunteer in the communications team of the Apache Software
>> > Foundation, working closely with Sally (for those of you who
>> > know her).
>> >
>> > I will be sending the list a proposal in the coming days.
>> > Looking forward to hearing from you!
>> >
>> > Best,
>> >
>> > María
>> >
>> >
>> >
>> >
>>
>


Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Łukasz Gajowy
+1

wt., 1 paź 2019 o 11:29 Maximilian Michels  napisał(a):

> +1
>
> On 30.09.19 23:03, Reza Rokni wrote:
> > +1
> >
> > On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli wrote:
> >
> > +1
> >
> > On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi wrote:
> >
> > +1
> >
> > On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang wrote:
> >
> > +1
> >
> > On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett wrote:
> >
> > +1
> >
> > On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev wrote:
> >
> > Hi everyone,
> >
> > Please vote whether to sign a pledge on behalf of
> > Apache Beam to sunset Beam Python 2 offering (in new
> > releases) in 2020 on http://python3stament.org as
> > follows:
> >
> > [ ] +1: Sign a pledge to discontinue support of
> > Python 2 in Beam in 2020.
> > [ ] -1: Do not sign a pledge to discontinue support
> > of Python 2 in Beam in 2020.
> >
> > The motivation and details for this vote were
> > discussed in [1, 2]. Please follow up in [2] if you
> > have any questions.
> >
> > This is a procedural vote [3] that will follow the
> > majority approval rules and will be open for at
> > least 72 hours.
> >
> > Thanks,
> > Valentyn
> >
> > [1]
> >
> https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
> > [2]
> >
> https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
> > [3] https://www.apache.org/foundation/voting.html
> >
> >
> >
>


Re: Introduction + Support in Comms for Beam!

2019-10-01 Thread Łukasz Gajowy
Welcome! :)

On Tue, 1 Oct 2019 at 11:30, Maximilian Michels wrote:

> Welcome Maria! Looking forward to your proposal.
>
> Cheers,
> Max
>
> On 01.10.19 00:33, Reza Rokni wrote:
> > Welcome!
> >
> > On Tue, 1 Oct 2019 at 11:18, Lukasz Cwik wrote:
> >
> > Welcome to the community.
> >
> > On Mon, Sep 30, 2019 at 3:15 PM María Cruz wrote:
> >
> > Hi everyone,
> > my name is María Cruz, I am from Buenos Aires but I live in the
> > Bay Area. I recently became acquainted with Apache Beam project,
> > and I got a chance to meet some of the Beam community at Apache
> > Con North America this past September. I'm testing out a
> > communications framework
> > <https://medium.com/@marianarra_/designing-a-communications-framework-for-community-engagement-e087312f9b83>
> > for Open Source communities. I'm emailing the list now because
> > I'd like to work on a communications strategy for Beam, to make
> > the most of the content you produce during Beam Summits.
> >
> > A little bit more about me. I am a communications strategist
> > with 11 years of experience in the field, 8 of which are in the
> > non-profit sector. I started working in Open Source in 2013,
> > when I joined Wikimedia, the social movement behind Wikipedia. I
> > now work to support Google Open Source projects, and I also
> > volunteer in the communications team of the Apache Software
> > Foundation, working closely with Sally (for those of you who
> > know her).
> >
> > I will be sending the list a proposal in the coming days.
> > Looking forward to hearing from you!
> >
> > Best,
> >
> > María
> >
> >
> >
>


Re: Reading from BigQuery on portable runners in Python SDK

2019-10-01 Thread Kamil Wasilewski
If anyone is interested, here is a link to my code:
https://github.com/kamilwu/beam/tree/bounded-source-for-bq

On Tue, Oct 1, 2019 at 11:17 AM Kamil Wasilewski <
kamil.wasilew...@polidea.com> wrote:

> Hi all,
>
> At the moment, we have a native BigQuery source for the Python SDK, which
> can be used only by the Dataflow runner. Consequently, it doesn't work on
> portable runners, such as Flink.
>
> Recently I have written a prototype source which implements
> iobase.BoundedSource, so that other runners can read from BigQuery as well.
> It works the same way as in the Java SDK [1], which means that it exports
> the BigQuery table to JSON and returns TextSource objects in the split()
> call. However, it has the following problems:
> - it doesn't work on the Direct runner,
> - its API is highly experimental.
>
> This is where my question begins. What should we do in order to provide
> support for reading from BigQuery on runners other than Dataflow? Do you
> think it's fine to continue working on the source I described? Or maybe it
> should be done in an entirely different way (not by exporting tables to
> JSON)?
>
> Thanks,
> Kamil
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
>


Re: Introduction + Support in Comms for Beam!

2019-10-01 Thread Maximilian Michels

Welcome Maria! Looking forward to your proposal.

Cheers,
Max

On 01.10.19 00:33, Reza Rokni wrote:

Welcome!

On Tue, 1 Oct 2019 at 11:18, Lukasz Cwik wrote:

Welcome to the community.

On Mon, Sep 30, 2019 at 3:15 PM María Cruz <macruz...@gmail.com> wrote:

Hi everyone,
my name is María Cruz, I am from Buenos Aires but I live in the
Bay Area. I recently became acquainted with Apache Beam project,
and I got a chance to meet some of the Beam community at Apache
Con North America this past September. I'm testing out a
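communications framework
<https://medium.com/@marianarra_/designing-a-communications-framework-for-community-engagement-e087312f9b83>
for Open Source communities. I'm emailing the list now because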
I'd like to work on a communications strategy for Beam, to make
the most of the content you produce during Beam Summits.

A little bit more about me. I am a communications strategist
with 11 years of experience in the field, 8 of which are in the
non-profit sector. I started working in Open Source in 2013,
when I joined Wikimedia, the social movement behind Wikipedia. I
now work to support Google Open Source projects, and I also
volunteer in the communications team of the Apache Software
Foundation, working closely with Sally (for those of you who
know her).

I will be sending the list a proposal in the coming days.
Looking forward to hearing from you!

Best,

María







Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Maximilian Michels

+1

On 30.09.19 23:03, Reza Rokni wrote:

+1

On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli wrote:

+1

On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi <smar...@apache.org> wrote:

+1

On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang
<owenzhang1...@gmail.com> wrote:

+1

On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett
<whatwouldausti...@gmail.com> wrote:

+1

On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev
<valen...@google.com> wrote:

Hi everyone,

Please vote whether to sign a pledge on behalf of
Apache Beam to sunset Beam Python 2 offering (in new
releases) in 2020 on http://python3stament.org as
follows:

[ ] +1: Sign a pledge to discontinue support of
Python 2 in Beam in 2020.
[ ] -1: Do not sign a pledge to discontinue support
of Python 2 in Beam in 2020.

The motivation and details for this vote were
discussed in [1, 2]. Please follow up in [2] if you
have any questions.

This is a procedural vote [3] that will follow the
majority approval rules and will be open for at
least 72 hours.

Thanks,
Valentyn

[1]

https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
[2]

https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
[3] https://www.apache.org/foundation/voting.html







Reading from BigQuery on portable runners in Python SDK

2019-10-01 Thread Kamil Wasilewski
Hi all,

At the moment, we have a native BigQuery source for the Python SDK, which
can be used only by the Dataflow runner. Consequently, it doesn't work on
portable runners, such as Flink.

Recently I have written a prototype source which implements
iobase.BoundedSource, so that other runners can read from BigQuery as well.
It works the same way as in the Java SDK [1], which means that it exports
the BigQuery table to JSON and returns TextSource objects in the split()
call. However, it has the following problems:
- it doesn't work on the Direct runner,
- its API is highly experimental.

This is where my question begins. What should we do in order to provide
support for reading from BigQuery on runners other than Dataflow? Do you
think it's fine to continue working on the source I described? Or maybe it
should be done in an entirely different way (not by exporting tables to
JSON)?

Thanks,
Kamil

[1]
https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
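
For readers unfamiliar with the export-based approach described above, the
following is a minimal sketch of the idea in plain Python. It uses neither
Beam nor BigQuery: ExportingTableSource, JsonLinesFileSource, read_all and
the in-memory list of rows are hypothetical stand-ins for the bounded
source, the per-file text sources returned from split(), and the BigQuery
table. A local temporary directory stands in for the export location.

```python
import json
import os
import tempfile


class JsonLinesFileSource:
    """Hypothetical stand-in for a text source over one exported file."""

    def __init__(self, path):
        self.path = path

    def read(self):
        # Each non-empty line of the export is one JSON-encoded row.
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    yield json.loads(line)


class ExportingTableSource:
    """Hypothetical source that exports a table to JSON files, then splits."""

    def __init__(self, rows, export_dir, rows_per_file=2):
        self.rows = list(rows)        # stands in for the BigQuery table
        self.export_dir = export_dir  # stands in for the export location
        self.rows_per_file = rows_per_file
        self._files = None

    def _export(self):
        # Run the export once; later split() calls reuse the same files.
        if self._files is None:
            self._files = []
            for start in range(0, len(self.rows), self.rows_per_file):
                path = os.path.join(self.export_dir, f"part-{start:05d}.json")
                with open(path, "w", encoding="utf-8") as f:
                    for row in self.rows[start:start + self.rows_per_file]:
                        f.write(json.dumps(row) + "\n")
                self._files.append(path)
        return self._files

    def split(self):
        # One sub-source per exported file, so files can be read in parallel.
        return [JsonLinesFileSource(p) for p in self._export()]


def read_all(source):
    # Sequentially drain every sub-source (a runner would parallelize this).
    return [row for sub in source.split() for row in sub.read()]


if __name__ == "__main__":
    table = [{"id": i, "name": f"row-{i}"} for i in range(5)]
    with tempfile.TemporaryDirectory() as tmp:
        src = ExportingTableSource(table, tmp)
        print(len(src.split()), "exported files")
        print(read_all(src))
```

The point of the design is that the source itself never reads rows; it
only triggers the export and hands out per-file readers, which is why the
exported format and location matter for every runner that executes it.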


Re: Multiple iterations after GroupByKey with SparkRunner

2019-10-01 Thread Jan Lukavský
The car analogy was meant to say that in the real world you have to make
a decision before you take any action. No retroactivity is possible.


Reuven pointed out that it is possible (although it seems a little
weird to me, but that is the only thing I can say against it :-)) that
the way a grouped PCollection is produced might be out of the control of
a consuming operator. One example might be that the grouping is
produced in a submodule (some library), but the consumer still wants to
be able to specify whether or not he wants reiterations. There is still
a "classical" solution to this: the library might expose an
interface to specify a factory for the grouped PCollection, so that the
user of the library will be able to specify what he wants. But we can
say that we don't want to force users (or authors of libraries) to do
that. That's okay with me.


If we move on, our next option might be to specify the annotation on the
consumer (as suggested), but that has all the "not really nice"
properties of being counter-intuitive, ignoring strong types, etc.,
which is why I think that this should be ruled out as well.


This leaves us with a single option (at least I have not figured out any
other): we can bundle GBK and the associated ParDo into an atomic
PTransform, which can then be overridden by runners that need special
handling of this situation. These are all runners that need to buffer
data in memory in order to support reiterations (Spark and Flink; note
that this problem arises only in the batch case, because in the
streaming case one can reasonably assume that the data resides in a
state that supports reiterations). But we already have this PTransform
in Euphoria: it is called ReduceByKey and has all the required
properties (technically, it is not a PTransform now, but that is a minor
detail and can be changed trivially).
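
To make the reiteration problem concrete, here is a minimal sketch in
plain Python (not Beam or Euphoria code). The function reduce_by_key and
its reiterable flag are hypothetical: they stand in for a fused "group,
then reduce" step where the implementation decides whether to hand the
reducer a one-shot iterator or a buffered list. Sorting stands in for the
shuffle.

```python
from itertools import groupby
from operator import itemgetter


def reduce_by_key(pairs, reducer, reiterable=False):
    """Hypothetical fused group-and-reduce over (key, value) pairs.

    reiterable=False: the reducer receives a one-shot iterator (no buffering).
    reiterable=True: each group is buffered into a list, so the reducer can
    iterate more than once, at the cost of holding the group in memory.
    """
    ordered = sorted(pairs, key=itemgetter(0))  # stands in for the shuffle
    for key, group in groupby(ordered, key=itemgetter(0)):
        values = (value for _, value in group)
        if reiterable:
            values = list(values)
        yield key, reducer(values)


def total(values):
    # Single pass: works with a one-shot iterator.
    return sum(values)


def max_deviation_from_mean(values):
    # Two passes: one for the mean, one for the deviations.
    count = 0
    acc = 0
    for v in values:
        count += 1
        acc += v
    mean = acc / count
    # On a one-shot iterator this second pass sees no elements at all.
    return max((abs(v - mean) for v in values), default=None)


if __name__ == "__main__":
    data = [("a", 1), ("b", 10), ("a", 3)]
    print(dict(reduce_by_key(data, total)))
    print(dict(reduce_by_key(data, max_deviation_from_mean, reiterable=True)))
    print(dict(reduce_by_key(data, max_deviation_from_mean)))
```

The two-pass reducer silently produces no result on the one-shot variant,
which is the failure mode under discussion; because grouping and reducing
are fused into one step, the choice to buffer can be made in one place.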


So, the direction I was trying to take this discussion was: what could
be the best way for a runner to natively support a PTransform from a
DSL? I can imagine several options:


 a) support it directly and let runners depend on the DSL (a compileOnly
dependency might suffice, because users will include the DSL in their
code to be able to use it)


 b) create an interface in runners that lets user code provide
translations for user-specified operators (this could be absolutely
generic; DSLs might just use this feature the same way any user could).
After all, runners already use the concept of a Translator, but it is
pretty much copy-pasted, not abstracted into a general-purpose one


 c) move the operators that need to be translated into core

Option (c) then leaves open questions: if we wanted to move other
operators to core, would this be the right time to ask whether our
current set of "core" operators is the ideal one? Or could it be
optimized?
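
As an illustration of option (b) only, here is a minimal sketch in plain
Python of a runner that exposes a translator registry. Every name in it
(Runner, register_translator, MapOp, ReduceByKeyOp,
translate_reduce_by_key) is hypothetical and does not correspond to any
existing Beam or runner API.

```python
class MapOp:
    """Hypothetical core operator: apply fn to every element."""

    def __init__(self, fn):
        self.fn = fn


class ReduceByKeyOp:
    """Hypothetical DSL operator: group (key, value) pairs, reduce each group."""

    def __init__(self, reducer):
        self.reducer = reducer


class Runner:
    """Hypothetical runner with a registry of operator translators."""

    def __init__(self):
        self._translators = {}
        # Built-in translation for a core operator.
        self.register_translator(
            MapOp, lambda op, data: [op.fn(x) for x in data])

    def register_translator(self, operator_type, translator):
        # User code (or a DSL) plugs in a translation for its own operator.
        self._translators[operator_type] = translator

    def run(self, operators, data):
        for op in operators:
            translator = self._translators.get(type(op))
            if translator is None:
                raise NotImplementedError(
                    f"no translator registered for {type(op).__name__}")
            data = translator(op, data)
        return data


def translate_reduce_by_key(op, data):
    # Translation supplied by the DSL: buffer each group in a list so the
    # reducer may iterate over it more than once.
    groups = {}
    for key, value in data:
        groups.setdefault(key, []).append(value)
    return [(key, op.reducer(values)) for key, values in groups.items()]


if __name__ == "__main__":
    runner = Runner()
    runner.register_translator(ReduceByKeyOp, translate_reduce_by_key)
    pipeline = [MapOp(lambda word: (word[0], len(word))), ReduceByKeyOp(sum)]
    print(runner.run(pipeline, ["apple", "avocado", "banana"]))
```

In this sketch the runner knows nothing about ReduceByKeyOp; the DSL
registers its translation the same way any user could, which is the
property option (b) asks for.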


Jan

On 10/1/19 12:32 AM, Kenneth Knowles wrote:

In the car analogy, you have something like this:

    Iterable: car
    Iterator: taxi ride

They are related, but not as variations of a common concept.

In the discussion of Combine vs RSBK, if the reducer is required to be 
an associative and commutative operator, then it is the same thing 
under a different name. If the reducer can be non-associative or 
non-commutative, then it admits fewer transformations/optimizations.
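
A small sketch in plain Python of why these properties matter (the helper
names reduce_all and reduce_with_partial_combine are made up for this
example): an associative operator lets partial results be computed per
partition and combined afterwards, and a commutative one additionally
tolerates the partitions arriving in any order.

```python
from functools import reduce
from operator import add, sub


def reduce_all(values, op):
    # Reference result: fold the whole input left to right.
    return reduce(op, values)


def reduce_with_partial_combine(partitions, op):
    # Combine inside each partition first, then combine the partial results.
    partials = [reduce(op, part) for part in partitions if part]
    return reduce(op, partials)


if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5, 6]
    parts = [[1, 2], [3, 4], [5, 6]]

    # Addition is associative and commutative: partial combining is safe.
    print(reduce_all(numbers, add), reduce_with_partial_combine(parts, add))

    # Subtraction is not associative: partial combining changes the answer.
    print(reduce_all(numbers, sub), reduce_with_partial_combine(parts, sub))

    # String concatenation is associative but not commutative: partial
    # combining is fine only while the partition order is preserved.
    letters = [["a", "b"], ["c", "d"]]
    print(reduce_with_partial_combine(letters, add))
    print(reduce_with_partial_combine(letters[::-1], add))
```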


If you introduce a GroupIteratorsByKey and implement GroupByKey as a 
transform that combines the iterator by concatenation, I think you do 
get an internally consistent system. To execute efficiently, you need 
to always identify and replace the GroupByKey operation with a 
primitive one. It does make some sense to expose the weakest 
primitives for the sake of DSLs. But they are very poorly suited for 
end-users, and for GBK on most runners you get the more powerful one 
for free.
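
One possible reading of this layering, sketched in plain Python rather
than Beam: group_iterators_by_key is a hypothetical weak primitive that
yields, per key, one one-shot iterator per input bundle, and group_by_key
is built on top of it by concatenating those iterators. Materializing the
concatenation into a list is an assumption of this sketch about how the
stronger, reiterable result would be obtained.

```python
from collections import defaultdict
from itertools import chain


def group_iterators_by_key(bundles):
    """Hypothetical weak primitive: key -> list of one-shot iterators,
    one per input bundle that contained the key."""
    per_key = defaultdict(list)
    for bundle in bundles:
        local = defaultdict(list)
        for key, value in bundle:
            local[key].append(value)
        for key, values in local.items():
            per_key[key].append(iter(values))
    return dict(per_key)


def group_by_key(bundles):
    """GroupByKey on top of the weak primitive: concatenate the iterators
    and materialize the result so it can be iterated repeatedly."""
    grouped = group_iterators_by_key(bundles)
    return {key: list(chain.from_iterable(iterators))
            for key, iterators in grouped.items()}


if __name__ == "__main__":
    bundles = [[("a", 1), ("b", 2)], [("a", 3)]]
    result = group_by_key(bundles)
    print(result)
    # The materialized values can be traversed more than once.
    print(sum(result["a"]), max(result["a"]))
```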


Kenn

On Mon, Sep 30, 2019 at 2:02 AM Jan Lukavský wrote:


> The fact that the annotation on the ParDo "changes" the
GroupByKey implementation is very specific to the Spark runner
implementation.

I don't quite agree. It is not very specific to Spark; it is
specific to generally all runners that produce grouped elements
in a way that is not reiterable. That is the key property. The
example you gave with HDFS does not satisfy this condition (files
on HDFS are certainly reiterable), and that's why no change to the
GBK is needed (it actually already has the required property). A
quick look at the FlinkRunner (at least the non-portable one) shows
that it implements GBK by reducing elements into a List. That is
going to crash on a big PCollection, which is even nicely documented:

    * For internal use to translate {@link GroupByKey}. For a large 
{@link PCollection} this is
    * expected to crash!

If this is fixed, then it is likely to start behaving the same as
Spark. So actually I think the opposite is true - 

Re: Introduction + Support in Comms for Beam!

2019-10-01 Thread Reza Rokni
Welcome!

On Tue, 1 Oct 2019 at 11:18, Lukasz Cwik  wrote:

> Welcome to the community.
>
> On Mon, Sep 30, 2019 at 3:15 PM María Cruz  wrote:
>
>> Hi everyone,
>> my name is María Cruz, I am from Buenos Aires but I live in the Bay Area.
>> I recently became acquainted with Apache Beam project, and I got a chance
>> to meet some of the Beam community at Apache Con North America this past
>> September. I'm testing out a communications framework
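>> <https://medium.com/@marianarra_/designing-a-communications-framework-for-community-engagement-e087312f9b83>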
>> for Open Source communities. I'm emailing the list now because I'd like to
>> work on a communications strategy for Beam, to make the most of the content
>> you produce during Beam Summits.
>>
>> A little bit more about me. I am a communications strategist with 11
>> years of experience in the field, 8 of which are in the non-profit sector.
>> I started working in Open Source in 2013, when I joined Wikimedia, the
>> social movement behind Wikipedia. I now work to support Google Open Source
>> projects, and I also volunteer in the communications team of the Apache
>> Software Foundation, working closely with Sally (for those of you who know
>> her).
>>
>> I will be sending the list a proposal in the coming days. Looking forward
>> to hearing from you!
>>
>> Best,
>>
>> María
>>
>



Re: [VOTE] Sign a pledge to discontinue support of Python 2 in 2020.

2019-10-01 Thread Reza Rokni
+1

On Tue, 1 Oct 2019 at 13:54, Tanay Tummalapalli  wrote:

> +1
>
> On Tue, Oct 1, 2019 at 8:19 AM Suneel Marthi  wrote:
>
>> +1
>>
>> On Mon, Sep 30, 2019 at 10:33 PM Manu Zhang 
>> wrote:
>>
>>> +1
>>>
>>> On Tue, Oct 1, 2019 at 9:44 AM Austin Bennett <
>>> whatwouldausti...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> On Mon, Sep 30, 2019 at 5:22 PM Valentyn Tymofieiev <
>>>> valen...@google.com> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Please vote whether to sign a pledge on behalf of Apache Beam to
>>>>> sunset Beam Python 2 offering (in new releases) in 2020 on
>>>>> http://python3stament.org as follows:
>>>>>
>>>>> [ ] +1: Sign a pledge to discontinue support of Python 2 in Beam in
>>>>> 2020.
>>>>> [ ] -1: Do not sign a pledge to discontinue support of Python 2 in
>>>>> Beam in 2020.
>>>>>
>>>>> The motivation and details for this vote were discussed in [1, 2].
>>>>> Please follow up in [2] if you have any questions.
>>>>>
>>>>> This is a procedural vote [3] that will follow the majority approval
>>>>> rules and will be open for at least 72 hours.
>>>>>
>>>>> Thanks,
>>>>> Valentyn
>>>>>
>>>>> [1]
>>>>> https://lists.apache.org/thread.html/eba6caa58ea79a7ecbc8560d1c680a366b44c531d96ce5c699d41535@%3Cdev.beam.apache.org%3E
>>>>> [2]
>>>>> https://lists.apache.org/thread.html/456631fe1a696c537ef8ebfee42cd3ea8121bf7c639c52da5f7032e7@%3Cdev.beam.apache.org%3E
>>>>> [3] https://www.apache.org/foundation/voting.html
>>>>>
>>>>>
