Re: Exploding windows and FnApiDoFnRunner

2020-05-04 Thread Robert Bradshaw
In Python we only explode windows if the Window is being inspected. (There is no separate "DoFnRunner" for FnApi vs. Legacy execution.) On Mon, May 4, 2020 at 9:21 AM Luke Cwik wrote: > > Reuven you are correct that the optimization has yet to be implemented. > Robert the FnApiDoFnRunner is the
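
A minimal sketch of the "explode only when inspected" idea, in plain Python rather than the SDK's code. `WindowedValue`, `invoke`, and the `inspects_window` flag are hypothetical names used only for illustration:

```python
from typing import Callable, List, NamedTuple, Tuple


class WindowedValue(NamedTuple):
    """Hypothetical stand-in: one value assigned to several windows."""
    value: str
    windows: Tuple[str, ...]


def invoke(process: Callable[..., str], wv: WindowedValue,
           inspects_window: bool) -> List[str]:
    if not inspects_window:
        # The function never looks at the window, so one call covers all windows.
        return [process(wv.value)]
    # The function reads the window, so "explode": one call per window.
    return [process(wv.value, window) for window in wv.windows]


wv = WindowedValue("x", ("w1", "w2", "w3"))
unexploded = invoke(lambda v: v.upper(), wv, inspects_window=False)
exploded = invoke(lambda v, w: f"{v}@{w}", wv, inspects_window=True)
```

The saving is in the first branch: a value in three windows costs one call instead of three when the window is never read.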

Re: Jenkins jobs not running for my PR 10438

2020-05-04 Thread Robert Bradshaw
Done. On Mon, May 4, 2020 at 7:35 AM Rehman Murad Ali wrote: > > Hi Beam committers, > > Would you please trigger the basic checks as well as validatesRunner check > for this PR? > https://github.com/apache/beam/pull/11350 > > > Thanks & Regards > > Rehman Murad Ali > Software Engineer >

Re: JIRA priorities explaination

2020-05-01 Thread Robert Bradshaw
ConstantsHelp.jspa?decorator=popup#PriorityLevels > [2] https://jira.atlassian.com/browse/JRASERVER-3821 > > On Fri, Oct 25, 2019 at 4:25 PM Pablo Estrada wrote: >> >> That SGTM >> >> On Fri, Oct 25, 2019 at 4:18 PM Robert Bradshaw wrote: >>> >>> +1 to

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-01 Thread Robert Bradshaw
od change. In the past, >>>>>> a few times, we edited dates on posts (e.g. a release date was entered >>>>>> incorrectly) and we had to either have a mismatch between dates in the >>>>>> url >>>>>> and the date in the

Re: Rethinking Python's PortableRunner default job server

2020-04-30 Thread Robert Bradshaw
11 AM Kyle Weaver > wrote: > >> > >> I'll bite :) Thanks for the feedback everyone! > >> > >> On Thu, Apr 30, 2020 at 1:01 PM Robert Bradshaw > wrote: > >>> > >>> I filed https://issues.apache.org/jira/browse/BEAM-9860. Any takers?

Re: Rethinking Python's PortableRunner default job server

2020-04-30 Thread Robert Bradshaw
the user reported. > > On Wed, Apr 29, 2020 at 10:05 PM Robert Bradshaw > wrote: > > > > +1, I was actually thinking about this just the other day. > PortableRunner should require job_endpoint to be set, and we can have a > nice error message directing the explicit use o

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-04-29 Thread Robert Bradshaw
wer >>> a little bit due to the timezone. :) >>> >>> Best regards, >>> Nam >>> >>> >>> >>> On Tue, Apr 28, 2020 at 8:49 PM Aizhamal Nurmamat kyzy < >>> aizha...@apache.org> wrote: >>> >>>> Addi

Re: Rethinking Python's PortableRunner default job server

2020-04-29 Thread Robert Bradshaw
+1, I was actually thinking about this just the other day. PortableRunner should require job_endpoint to be set, and we can have a nice error message directing the explicit use of FlinkRunner for the old behavior. On Wed, Apr 29, 2020 at 11:50 AM Kyle Weaver wrote: > > Could the error message
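
A rough sketch of what such a check could look like. `check_job_endpoint` is a hypothetical helper and the message wording is an assumption, not the SDK's actual code:

```python
from typing import Optional


def check_job_endpoint(job_endpoint: Optional[str]) -> str:
    """Hypothetical validation helper, not the SDK's actual code."""
    if not job_endpoint:
        # Fail early with a message that points at the explicit alternative.
        raise ValueError(
            "PortableRunner requires job_endpoint to be set. "
            "For the old default behavior, use FlinkRunner explicitly.")
    return job_endpoint
```

Failing at option-validation time keeps the error close to the user's mistake instead of surfacing later as a connection failure.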

Re: Automation for Jira

2020-04-29 Thread Robert Bradshaw
+1 to more automation. I'm in favor of all but 4; I think it's quite common for issues to be noticed but not worked on for 60+ days. Most of the time when a developer files an issue they either (1) are working on it right now or (2) are filing it away because it's something they're not working

Re: sdks:java:container:generateThirdPartyLicenses effect on build time / stability

2020-04-28 Thread Robert Bradshaw
h.) >> >> I guess I assumed there was some reason we needed "lightweight images" in >> our tests (because licenses take up a lot of space IIRC), but maybe not. >> Can you elaborate on the purpose of this option Hannah? >> >> On Tue, Apr 28, 2020 at 6:

Re: Companies using Beam?

2020-04-28 Thread Robert Bradshaw
I think this is a great idea, as long as we can get critical mass. One danger I've seen is that such pages can grow stale/feel dated if not regularly updated/added to, so we should have a plan there. On Tue, Apr 28, 2020 at 4:21 PM Aizhamal Nurmamat kyzy wrote: > +1 on adding

Re: sdks:java:container:generateThirdPartyLicenses effect on build time / stability

2020-04-28 Thread Robert Bradshaw
ker for 2.21.0 because I was afraid >>>>>>> something was broken, but now it looks like the failures were just >>>>>>> flakes. >>>>>>> So BEAM-9764 <https://issues.apache.org/jira/browse/BEAM-9764> should >>>>>>

Re: How to submit PRs for dependant changes?

2020-04-28 Thread Robert Bradshaw
I prefer (c) as well, rebasing as things get merged. I would do (a) if they're really prerequisites for one another. On Tue, Apr 28, 2020 at 10:40 AM Udi Meiri wrote: > (a) or (c) should work. (c) is preferred if you want faster reviews. > > For multiple JIRAs, I've seen both

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-04-28 Thread Robert Bradshaw
Thanks. It'll be great to better support more languages. I looked at the PR and there seems to be no provenance/history. E.g. all the content seems to be entirely new files rather than diffs from the old. (There also seems to be a huge amount of auto-generated js code as well.) On Tue, Apr 28,

Re: [QUESTION] Reading Snappy Compressed Text Files

2020-04-22 Thread Robert Bradshaw
On Wed, Apr 22, 2020 at 11:06 AM Jeff Klukas wrote: > Beam is able to infer compression from file extensions for a variety of > formats, but snappy is not among them currently: > > > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java >

Re: Reference to Beam in upcoming Kubeflow Book

2020-04-17 Thread Robert Bradshaw
On Fri, Apr 17, 2020 at 4:58 PM Holden Karau wrote: > > On Fri, Apr 17, 2020 at 3:52 PM Robert Bradshaw > wrote: > >> On Fri, Apr 17, 2020 at 2:56 PM Holden Karau >> wrote: >> >>> >>> On Fri, Apr 17, 2020 at 2:45 PM Robert Bradshaw >>>

Re: Reference to Beam in upcoming Kubeflow Book

2020-04-17 Thread Robert Bradshaw
On Fri, Apr 17, 2020 at 2:56 PM Holden Karau wrote: > > On Fri, Apr 17, 2020 at 2:45 PM Robert Bradshaw > wrote: > >> Hi Holden! >> >> I agree with Kyle that it makes sense to have some caveat about Flink and >> Spark, though at this point they're not /that/

Re: Reference to Beam in upcoming Kubeflow Book

2020-04-17 Thread Robert Bradshaw
Hi Holden! I agree with Kyle that it makes sense to have some caveat about Flink and Spark, though at this point they're not /that/ new (at least not Flink). I am curious what extra support Kubeflow is "missing" (or, conversely, what extra support it has for Dataflow that goes beyond just

Re: sdks:java:container:generateThirdPartyLicenses effect on build time / stability

2020-04-16 Thread Robert Bradshaw
ate a Java docker image. >> >> The caching approach mentioned by Robert brings many benefits, not only >> to this use case. >> However, we would like to include this work as part of 2.21.0, so I will >> move with the multi processing approach this time. >> >

Re: sdks:java:container:generateThirdPartyLicenses effect on build time / stability

2020-04-15 Thread Robert Bradshaw
Is the cost primarily in pulling these remote licenses/sources? I'd guess that 99.9% of the URLs remain the same from run to run. Would a simple cache, or caching proxy, be sufficient? Otherwise, a tag to check that licenses can be pulled, but not really pull them, might be sufficient. (Making
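
The "simple cache" could be as small as the sketch below. `LicenseCache` and the injected `fetch` callable are hypothetical, the URLs are placeholders, and this version is in-memory only; a cache meant to survive from run to run would need to persist its store:

```python
from typing import Callable, Dict


class LicenseCache:
    """Hypothetical URL-keyed cache; the fetch function is injected by the caller."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._store: Dict[str, str] = {}
        self.misses = 0

    def get(self, url: str) -> str:
        if url not in self._store:
            # Only unseen URLs hit the network.
            self.misses += 1
            self._store[url] = self._fetch(url)
        return self._store[url]


cache = LicenseCache(fetch=lambda url: "license text for " + url)
urls = ["https://example.com/a/LICENSE", "https://example.com/b/LICENSE"]
texts = [cache.get(u) for u in urls + urls]  # second pass is served from the cache
```

If nearly all URLs repeat between runs, as guessed above, almost every lookup takes the cached path.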

Re: sdks:java:container:generateThirdPartyLicenses effect on build time / stability

2020-04-15 Thread Robert Bradshaw
In terms of pre-commit, 7-8 minutes seems worth not having to debug dependencies that broke this in post-commit. We should look at caching (IIRC we've long wanted to do this for pip and maven packages anyway). We could also consider whether, for development purposes, we could build "lite"

Re: [DISCUSS] Let's establish a guideline for using Python type annotations in Beam codebase

2020-04-13 Thread Robert Bradshaw
On Mon, Apr 13, 2020 at 11:48 AM Valentyn Tymofieiev wrote: > > On Mon, Apr 13, 2020 at 10:53 AM Robert Bradshaw > wrote: > >> On Mon, Apr 13, 2020 at 10:38 AM Valentyn Tymofieiev >> wrote: >> >>> To clarify, I don't suggest that every variable shou

Re: [DISCUSS] Let's establish a guideline for using Python type annotations in Beam codebase

2020-04-13 Thread Robert Bradshaw
eckers enabled in presubmit and see what it takes to keep those happy before establishing more strict criteria. (It does sound like we have consensus on using type comments until 2.7 is dropped.) > On Fri, Apr 10, 2020 at 4:56 PM Robert Bradshaw > wrote: > >> On Fri, Apr 10, 2020 at 4:

Re: [DISCUSS] Let's establish a guideline for using Python type annotations in Beam codebase

2020-04-10 Thread Robert Bradshaw
0, 2020 at 1:46 PM Robert Bradshaw > wrote: > >> I prefer type-comments, as they can be validated by type checkers. Once >> we drop 2.7, we can go with actual type annotations (and the comments can >> be automatically converted over). >> >> On Fri, Apr 10, 2020

Re: [DISCUSS] Let's establish a guideline for using Python type annotations in Beam codebase

2020-04-10 Thread Robert Bradshaw
I prefer type-comments, as they can be validated by type checkers. Once we drop 2.7, we can go with actual type annotations (and the comments can be automatically converted over). On Fri, Apr 10, 2020 at 11:17 AM Valentyn Tymofieiev wrote: > I am seeing several styles we use to annotate
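
For reference, the two styles side by side. The function names are made up for illustration; the type-comment form is the PEP 484 syntax that also parses on Python 2.7:

```python
from typing import List


def total_py27_style(values):
    # type: (List[int]) -> int
    """Type comment: valid syntax on Python 2.7 and 3, readable by type checkers."""
    return sum(values)


def total_py3_style(values: List[int]) -> int:
    """Annotation: Python 3 only, with the same information in the signature."""
    return sum(values)
```

The type comment leaves no trace at runtime (`total_py27_style.__annotations__` is empty), which is why the later conversion to annotations can be purely mechanical.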

Re: Usage metrics for Beam

2020-04-09 Thread Robert Bradshaw
raw absolute number is tricky. You can probably > manage to see certain kinds of trends if you just look at relative numbers. > > Kenn > > On Thu, Apr 9, 2020 at 6:42 PM Austin Bennett > wrote: > >> @Robert Bradshaw , you sent that pypi link [1] >> the other day

Re: [VOTE] Release 2.20.0, release candidate #2

2020-04-09 Thread Robert Bradshaw
+1, the artifacts and signatures all look good, and I also checked that the Python wheels work with a simple pipeline in a fresh virtual environment. On Thu, Apr 9, 2020 at 5:11 PM Ahmet Altay wrote: > +1 - validated python quickstarts batch/streaming with python 2.7. > > Thank you Rui! > > On

Re: Usage metrics for Beam

2020-04-09 Thread Robert Bradshaw
For Python, there's https://pypistats.org/packages/apache-beam . It's unclear how accurate these are, and how many of these downloads represent users vs. tools (e.g. setting up environments for continuous testing). On Thu, Apr 9, 2020 at 3:29 PM Griselda Cuevas wrote: > Hi folks - I'm

Re: [VOTE] Release 2.20.0, release candidate #1

2020-04-06 Thread Robert Bradshaw
>>>>>> missing that commit is -1, or that can be marked as a known issue in >>>>>>> release note. >>>>>>> >>>>>>> >>>>>>> -Rui >>>>>>> >>>>>>> On Mon, Apr 6,

Re: [VOTE] Release 2.20.0, release candidate #1

2020-04-06 Thread Robert Bradshaw
that likely not in the binary artifacts either. On Mon, Apr 6, 2020 at 1:22 PM Rui Wang wrote: > I think PR#11252 is in the release branch? See > https://github.com/apache/beam/commits/release-2.20.0 (the top commit) > > > > -Rui > > On Mon, Apr 6, 2020 at 1:21 PM

Re: [VOTE] Release 2.20.0, release candidate #1

2020-04-06 Thread Robert Bradshaw
Valentyn, do the container issues affect our external containers as well? I verified the signatures and sources, they all look good, except that we're missing https://github.com/apache/beam/pull/11252 if we were hoping to get that in. The wheel looks fine as well. On Mon, Apr 6, 2020 at 12:16 PM

Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-04-03 Thread Robert Bradshaw
t is still too long. The cost of >>> supporting a version may include: >>> - Developing against older Python version >>> - Release overhead (building & storing containers, wheels, doing >>> release validation) >>> - Complexity / development cost to supp

Re: Unportable Dataflow Pipeline Questions

2020-04-02 Thread Robert Bradshaw
multiple steps (we'll have to keep updating features >>> such as a cross-language to be in lockstep which will be hard and result in >>> a lot of throwaway work). >>> >>> Thanks, >>> Cham >>> >>> >>>> On Tue, Mar 31, 2020 at 6:01 PM Robert Bur

Re: Java PortabilityApi PreCommit failures

2020-04-01 Thread Robert Bradshaw
Alternatively we could roll back, but looks like disabling the tests has been merged. I'll take on fixing the container images (which looks like it will require getting the imports up to date). Thanks for tracking this down. On Wed, Apr 1, 2020 at 7:55 PM Luke Cwik wrote: > Been seeing these

Re: [BEAM-9322] Python SDK discussion on correct output tag names

2020-04-01 Thread Robert Bradshaw
E.g. something like https://github.com/apache/beam/pull/11283 On Wed, Apr 1, 2020 at 2:57 PM Robert Bradshaw wrote: > On Wed, Apr 1, 2020 at 1:48 PM Sam Rohde wrote: > >> To restate the original issue it is that the current method of setting >> the output tags on PCollectio

Re: [BEAM-9322] Python SDK discussion on correct output tag names

2020-04-01 Thread Robert Bradshaw
> PCollection, we use the keys as tags. We can extend this naturally to tuples, named tuples, nesting, etc. (though I don't know if there are any hidden assumptions left about having an output labeled None if we want to push this through to completion). > > > > On Wed, A
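
One way the "keys as tags" idea might extend to nesting, as a sketch only. The dotted tag scheme, the `flatten_tags` helper, and the `"output"` fallback tag are all assumptions for illustration, and plain strings stand in for PCollections:

```python
from collections import namedtuple
from typing import Any, Dict


def flatten_tags(outputs: Any, prefix: str = "") -> Dict[str, Any]:
    """Hypothetical: derive one tag per leaf from dict keys, field names, or indices."""
    if isinstance(outputs, dict):
        items = list(outputs.items())
    elif isinstance(outputs, tuple) and hasattr(outputs, "_fields"):
        items = list(zip(outputs._fields, outputs))  # named tuple: use field names
    elif isinstance(outputs, (tuple, list)):
        items = [(str(i), v) for i, v in enumerate(outputs)]  # plain tuple: use positions
    else:
        return {prefix or "output": outputs}  # a leaf, i.e. a single PCollection
    flat: Dict[str, Any] = {}
    for key, value in items:
        tag = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten_tags(value, tag))
    return flat


Out = namedtuple("Out", ["good", "bad"])
nested = flatten_tags({"a": "pc1", "b": ("pc2", {"c": "pc3"})})
named = flatten_tags(Out("pc4", "pc5"))
```

Whatever scheme is chosen, the point is that every leaf gets an unambiguous name derived from the structure the user returned.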

Re: [BEAM-9322] Python SDK discussion on correct output tag names

2020-04-01 Thread Robert Bradshaw
> Union[PValue, NamedTuple[str, PCollection], > Tuple[str, PCollection], Dict[str, PCollection], DoOutputsTuple] > > i.e. no arbitrary nesting when outputting from an expand > > On Tue, Mar 31, 2020 at 5:15 PM Robert Bradshaw > wrote: > >> On Tue, Mar 31, 2020 at 4:13 PM L

Re: Default WindowFn for Unbounded source

2020-04-01 Thread Robert Bradshaw
On Wed, Apr 1, 2020 at 12:53 AM Jan Lukavský wrote: > Hi Amit, > > answers inline. > On 4/1/20 12:23 AM, amit kumar wrote: > > Thanks Ankur for your reply. > > By default the allowed lateness for a global window is zero but we can > also set it to be non-zero which will be used in the

Re: Unportable Dataflow Pipeline Questions

2020-03-31 Thread Robert Bradshaw
On Tue, Mar 31, 2020 at 12:06 PM Sam Rohde wrote: > Hi All, > > I am currently investigating making the Python DataflowRunner to use a > portable pipeline representation so that we can eventually get rid of the > Pipeline(runner) weirdness. > > In that case, I have a lot questions about the

Re: [BEAM-9322] Python SDK discussion on correct output tag names

2020-03-31 Thread Robert Bradshaw
and outputs of PCollections as maps rather than lists generally across the Python representations (which also relates to some of the ugliness that Cham has been running into with cross-language). > On Tue, Mar 31, 2020 at 2:51 PM Robert Bradshaw wrote: >> >> On Tue, Mar 31, 2020

Re: [BEAM-9322] Python SDK discussion on correct output tag names

2020-03-31 Thread Robert Bradshaw
On Tue, Mar 31, 2020 at 1:13 PM Sam Rohde wrote: >>> >>> * Don't allow arbitrary nestings returned during expansion, force composite >>> transforms to always provide an unambiguous name (either a tuple with >>> PCollections with unique tags or a dictionary with untagged PCollections or >>> a

Re: [BEAM-9322] Python SDK discussion on correct output tag names

2020-03-31 Thread Robert Bradshaw
On Tue, Mar 24, 2020 at 1:07 PM Sam Rohde wrote: > > Hi All, > > Problem > I would like to discuss BEAM-9322 and the correct way to set the output tags > of a transform with nested PCollections, e.g. a dict of PCollections, a tuple > of dicts of PCollections. Before the fixing of BEAM-1833, the

Re: [PROPOSAL] Leveraging SQL TableProviders for Cross-Language IOs

2020-03-30 Thread Robert Bradshaw
A belated but very enthusiastic +1 to this proposal. Added some comments to the doc. On Thu, Jan 16, 2020 at 9:05 AM Kenneth Knowles wrote: > > Nice! This is quite clever. > > Kenn > > On Mon, Jan 13, 2020 at 5:08 PM Chamikara Jayalath > wrote: >> >> Thanks Brian. Added some comments. >> >> On

Re: Next LTS?

2020-03-24 Thread Robert Bradshaw
way of measuring the demand >> for LTS releases. >> >> There was a suggestion to mark the last release with python 2 support to be >> an LTS release, was there a conclusion on that? ( +Valentyn Tymofieiev ) >> >> Ahmet >> >> On Tue, Mar 24, 2020 at 2:34 PM R

Re: [PROPOSAL] Add licenses and notices to SDK docker images

2020-03-24 Thread Robert Bradshaw
Thank you for updating the doc. As I mentioned on the PR, I do not think we should check all 100K lines of auto-generated/pulled licence files into the repository and run separate asynchronous processes to try to keep things in sync and fix things up as dependencies evolve. Instead, we should

Re: Next LTS?

2020-03-24 Thread Robert Bradshaw
>> Though, worth ensuring we live up to what we keep on the website. And, >>> without an active LTS, probably something we should take off the site? >>> >>> On Thu, Sep 19, 2019 at 1:33 PM Pablo Estrada wrote: >>>> >>>> +Łukasz Gajowy had at som

Re: Special characters in Beam Schema field names

2020-03-19 Thread Robert Bradshaw
will make it totally clear that the dot is not > a field separator. If we're generating *new* field names, I'd just as soon have a convention that generates non-special ones just for ease of use. > On Wed, Mar 18, 2020 at 5:09 PM Robert Bradshaw wrote: >> >> Given the flexibility of

Re: Special characters in Beam Schema field names

2020-03-18 Thread Robert Bradshaw
Given the flexibility of SQL, and the diversity of upstream systems, I'd lean toward being maximally flexible and saying a field name is a UTF-8 string (including whitespace?), but special characters may require quoting and/or rule out some conveniences (e.g. POJO creation). On Wed, Mar
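
To make "may require quoting" concrete, here is a sketch of one possible rule. Backtick quoting and the definition of a "plain" name are assumptions for illustration, not a description of what Beam SQL does:

```python
import re

# Hypothetical rule: identifiers made of ASCII letters, digits, and underscores need no quoting.
_PLAIN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def quote_field(name: str) -> str:
    """Return the name unchanged if plain, else wrap it in backticks."""
    if _PLAIN.match(name):
        return name
    # Double any embedded backtick so the quoted form stays unambiguous.
    return "`" + name.replace("`", "``") + "`"


plain = quote_field("user_id")
dotted = quote_field("a.b")
spaced = quote_field("first name")
```

Under this rule any string is a legal field name, and only the unusual ones pay the cost of quoting.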

Re: Contributing Twister2 runner to Apache Beam

2020-03-05 Thread Robert Bradshaw
I think we will get to a point where it makes sense for runners to live in their own repositories, with their own release cadence, but we're not at that point yet. One prerequisite is a stable API--we're closing in on that with the portability protos, but many (java) runners actually share the

Re: Run Python PreCommit break?

2020-03-05 Thread Robert Bradshaw
https://github.com/apache/beam/pull/11021 for getting rid of these vestigial error logs. On Thu, Mar 5, 2020 at 1:21 PM Rui Wang wrote: > > Hi Community, > > Is python precommit breaking? I have observed a consistent test case failure > from >

Re: Python Static Typing: Next Steps

2020-03-03 Thread Robert Bradshaw
wever could be a good occasion to rework the current PythonLint >> > job. Since yapf has been introduced, some of the checks made by >> > pylint/flake are now unnecessary and could be dismantled. This would >> > speed-up PythonLint quite a lot. >> > I volunteer

Re: Java SplittableDoFn Watermark API

2020-03-03 Thread Robert Bradshaw
e *all* runners become portable runners. That doesn't mean they all need to use Docker images, or even gRPC, but I don't think having classical-only or classical-excluded features is where we want to be long-term. > On Tue, Mar 3, 2020 at 1:41 AM Robert Bradshaw wrote: > > > >

Re: Error logging from fn_api_runners

2020-03-02 Thread Robert Bradshaw
Yeah, this was an oversight on my part. I don't think we need to log this at all. https://github.com/apache/beam/pull/11021 for anyone to look at. On Mon, Mar 2, 2020 at 2:44 PM Heejong Lee wrote: > > I think it should be either info or debug but not error. > > On Mon, Mar 2, 2020 at 2:35 PM

Re: Python Static Typing: Next Steps

2020-03-02 Thread Robert Bradshaw
It seems people are conflating git pre-commit hooks (which IMHO should ideally be in the sub-second range, and run when an author does "git commit") with jenkins pre-commit tests (for which minutes is nothing compared to what we already do). I am +1 to adding mypy to the latter for sure, and think

Re: Java SplittableDoFn Watermark API

2020-03-02 Thread Robert Bradshaw
I don't have a strong preference for using a provider/having a set of tightly coupled methods in Java, other than that we be consistent (and we already use the methods style for restrictions). On Mon, Mar 2, 2020 at 3:32 PM Luke Cwik wrote: > > Jan, there are some parts of Apache Beam the

Re: Python Static Typing: Next Steps

2020-03-02 Thread Robert Bradshaw
+1. We should enable this on Jenkins, plus trivial instructions (ideally a one-liner tox command) to run it locally. Hopefully the errors will be easy enough for contributors to figure out (in particular local to and commensurate in complexity with the code that they're editing), and I agree it's

Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-02-26 Thread Robert Bradshaw
te: >>>> >>>> I feel 4+ versions take too long to run anything. >>>> >>>> would vote for lowest + highest, 2 versions. >>>> >>>> On Wed, Feb 26, 2020 at 4:52 PM Udi Meiri wrote: >>>>> >>>>> I agree

Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-02-26 Thread Robert Bradshaw
too long to run anything. >> >> would vote for lowest + highest, 2 versions. >> >> On Wed, Feb 26, 2020 at 4:52 PM Udi Meiri wrote: >>> >>> I agree with having low-frequency tests for low-priority versions. >>> Low-priority versions could be de

Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-02-26 Thread Robert Bradshaw
d highest version, and can get by with smoke tests + infrequent post-commits for the ones between. > Kenn > > On Wed, Feb 26, 2020 at 3:25 PM Robert Bradshaw wrote: >> >> +1 to consulting users. Currently 3.5 downloads sit at 3.7%, or about >> 20% of all Python 3 down

Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-02-26 Thread Robert Bradshaw
+1 to consulting users. Currently 3.5 downloads sit at 3.7%, or about 20% of all Python 3 downloads. I would propose getting in warnings about 3.5 EoL well ahead of time, at the very least as part of the 2.7 warning. Fortunately, supporting multiple 3.x versions is significantly easier than
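
An early EoL warning could be as simple as the sketch below. The `DEPRECATED_VERSIONS` table, the helper name, and the message text are hypothetical:

```python
import warnings
from typing import Tuple

# Hypothetical policy table for illustration only.
DEPRECATED_VERSIONS = {(2, 7), (3, 5)}


def warn_if_deprecated(version: Tuple[int, int]) -> bool:
    """Emit a warning when running on a version that is scheduled for removal."""
    if version in DEPRECATED_VERSIONS:
        warnings.warn(
            "Python %d.%d support is scheduled for removal; please upgrade." % version,
            UserWarning)
        return True
    return False
```

At import time this would be called as `warn_if_deprecated(sys.version_info[:2])`, so the 3.5 notice can ride along with the 2.7 one.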

Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-02-26 Thread Robert Bradshaw
Thanks for bringing this up. I've actually been thinking about the same thing (specifically with regard to 3.5 and 3.8). I think it would make sense to add support for 3.8 right away (or at least get a good sense of what work needs to be done and what our dependency situation is like), and to

Re: [VOTE] Vendored Dependencies Release Byte Buddy 1.10.8 RC2

2020-02-26 Thread Robert Bradshaw
+1 (binding) On Wed, Feb 26, 2020 at 1:11 PM Pablo Estrada wrote: > > +1 (binding) > Verified hashes. > Thank you Ismael! > > On Wed, Feb 26, 2020 at 11:30 AM Luke Cwik wrote: >> >> +1 (binding) >> >> Verified signatures and contents of jar to not contain module-info.class >> >> On Wed, Feb 26,

Re: python Multiprocessing started with in a do function

2020-02-26 Thread Robert Bradshaw
I suspect this may be due to long-standing bugs regarding forking a process that has grpc channels. See, e.g. https://github.com/grpc/grpc/issues/18321 On Wed, Feb 26, 2020 at 9:02 AM laxman reddy wrote: > > Hello Team, > i am using beam for experimenting for my project usecase >

Re: [DISCUSSION] Use github actions for python wheels ?

2020-02-25 Thread Robert Bradshaw
I'd be in favor of this, assuming it actually simplifies things. (Note that the wheels are for several variants of Linux; presumably we could do cross-compiles. Also, manylinux is a "minimal" Linux specifically built so as to produce shared object libraries compatible with a wide variety of

Re: [ANNOUNCE] New committer: Chad Dombrova

2020-02-24 Thread Robert Bradshaw
Well deserved, Chad. Congratulations! On Mon, Feb 24, 2020 at 2:43 PM Reza Rokni wrote: > > Congratulations! :-) > > On Tue, Feb 25, 2020 at 6:41 AM Chad Dombrova wrote: >> >> Thanks, folks! I'm very excited to "retest this" :) >> >> Especially big thanks to Robert and Udi for all their hard

Re: [VOTE] Vendored Dependencies Release gRPC 1.26.0 v0.2 for BEAM-9252

2020-02-21 Thread Robert Bradshaw
+1 (binding) On Fri, Feb 21, 2020 at 4:48 PM Ahmet Altay wrote: > > +1 > > On Fri, Feb 21, 2020 at 4:39 PM Luke Cwik wrote: >> >> +1 (binding) >> I diffed the binary contents of the 0.1 jar and 0.2 jar with no changes to >> the contents of the files and can confirm that module-info.class the

Re: FnAPI proto backwards compatibility

2020-02-20 Thread Robert Bradshaw
8:42 PM, Robert Burke wrote: > > +1 to deferring for now. Since they should not be modified after adoption, it > makes sense not to get ahead of ourselves. > > On Thu, Feb 13, 2020, 10:59 AM Robert Bradshaw wrote: >> >> On Thu, Feb 13, 2020 at 10:12 AM Robert Burke w

Re: Cross-language pipelines status

2020-02-19 Thread Robert Bradshaw
> -chad > > > On Wed, Feb 19, 2020 at 6:00 PM Robert Bradshaw wrote: >> >> Hopefully this should be resolved by >> https://issues.apache.org/jira/browse/BEAM-9229 >> >> On Wed, Feb 19, 2020 at 5:52 PM Chad Dombrova wrote: >> > >> > We are

Re: Cross-language pipelines status

2020-02-19 Thread Robert Bradshaw
Hopefully this should be resolved by https://issues.apache.org/jira/browse/BEAM-9229 On Wed, Feb 19, 2020 at 5:52 PM Chad Dombrova wrote: > > We are using external transforms to get access to PubSubIO within python. It > works well, but there is one major issue remaining to fix: we have to

Re: FnAPI proto backwards compatibility

2020-02-14 Thread Robert Bradshaw
Oh, sorry. Try it again https://docs.google.com/document/d/1CyVElQDYHBRfXu6k1VSXv3Yok_4r8c4V0bkh2nFAWYc/edit?usp=sharing On Fri, Feb 14, 2020 at 2:04 PM Jan Lukavský wrote: > > Hi Robert, > > the doc seems to be locked. > > Jan > > On 2/14/20 10:56 PM, Robert Bradshaw wr

Re: FnAPI proto backwards compatibility

2020-02-14 Thread Robert Bradshaw
gt;>> Kenn >>> >>>> >>>> c) we can take advantage of these pipeline features to get rid of the >>>> categories of @ValidatesRunner tests, because we could have just simply >>>> @ValidatesRunner and each test would be matched against ru

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Robert Bradshaw
understand" (eg. Combiner lifting, and state backed iterables), as > well as "what the pipeline requires from the runner" and "what the runner is > able to do" (eg. Requires sorted input) > > > On Thu, Feb 13, 2020, 9:06 AM Luke Cwik wrote: >> >>
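
The split between "what the pipeline requires" and "what the runner is able to do" can be pictured as a set difference. This is a sketch only; the URN strings and the `unsupported` helper are invented for illustration and are not real Beam identifiers:

```python
from typing import Set


def unsupported(requirements: Set[str], runner_supports: Set[str]) -> Set[str]:
    """Requirements the runner cannot satisfy; empty means the pipeline may run."""
    return requirements - runner_supports


# Hypothetical URNs, not real Beam identifiers.
pipeline_requires = {"example:requirement:sorted_input:v1",
                     "example:requirement:stateful:v1"}
runner_supports = {"example:requirement:stateful:v1"}
missing = unsupported(pipeline_requires, runner_supports)
```

A runner that rejects the pipeline when `missing` is non-empty can also name the exact requirement in its error message.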

Re: [PROPOSAL] Transition released containers to the official ASF dockerhub organization

2020-02-13 Thread Robert Bradshaw
gt;>> +1 very nice explanation >>> >>> On Wed, Jan 15, 2020 at 1:57 PM Ahmet Altay wrote: >>>> >>>> +1 - Thank you for driving this! >>>> >>>> On Wed, Jan 15, 2020 at 1:55 PM Thomas Weise wrote: >>>>

Re: FnAPI proto backwards compatibility

2020-02-12 Thread Robert Bradshaw
On Wed, Feb 12, 2020 at 11:08 AM Luke Cwik wrote: > > We can always detect on the runner/SDK side whether there is an unknown > field[1] within a payload and fail to process it but this is painful in two > situations: > 1) It doesn't provide for a good error message since you can't say what the

Re: FnAPI proto backwards compatibility

2020-02-12 Thread Robert Bradshaw
On Tue, Feb 11, 2020 at 7:25 PM Kenneth Knowles wrote: > > On Tue, Feb 11, 2020 at 8:38 AM Robert Bradshaw wrote: >> >> On Mon, Feb 10, 2020 at 7:35 PM Kenneth Knowles wrote: >> > >> > On the runner requirements side: if you have such a list at the

Re: FnAPI proto backwards compatibility

2020-02-11 Thread Robert Bradshaw
>> >> [1] >> https://lists.apache.org/thread.html/e93ac64d484551d61e559e1ba0cf4a15b760e69d74c5b1d0549ff74f%40%3Cdev.beam.apache.org%3E >> >> On Mon, Feb 10, 2020 at 3:55 PM Robert Bradshaw wrote: >>> >>> With an eye towards cross-language (which in

Re: Labels on PR

2020-02-11 Thread Robert Bradshaw
+1 to finding the right balance. I do think per-runner makes sense, rather than a general "runners." IOs might make sense as well. Not sure about all the extensions-* I'd leave those out for now. On Tue, Feb 11, 2020 at 5:56 AM Ismaël Mejía wrote: > > > So I propose going simple with a limited

FnAPI proto backwards compatibility

2020-02-10 Thread Robert Bradshaw
With an eye towards cross-language (which includes cross-version) pipelines and services (specifically looking at Dataflow) supporting portable pipelines, there's been a desire to stabilize the portability protos. There are currently many cleanups we'd like to do [1] (some essential, others nice

Re: Retest this please access?

2020-02-10 Thread Robert Bradshaw
We're working on that, follow https://issues.apache.org/jira/browse/INFRA-19670 On Mon, Feb 10, 2020 at 9:52 AM Daniel Collins wrote: > > Hello all, > > I'm feeling a bit bad about asking my reviewers to re-run presubmits. How > would I go about getting access to "Retest this please" being

Re: [BEAM-8550] @RequiresTimeSortedInput ready for merge to master

2020-02-07 Thread Robert Bradshaw
There are two separable concerns here. (1) The @RequiresTimeSortedInput feature itself. This is a subtle feature needed for certain pipelines, and if anything Jan has gone the extra mile discussing, documenting, and designing this and trying to reach consensus. I feel like there has been a

Re: [DISCUSS] Autoformat python code with Black

2020-02-07 Thread Robert Bradshaw
_time(1).advance_watermark_to( > +13).advance_processing_time(1).advance_watermark_to( > + > 14).advance_processing_time(1).advance_watermark_to( > +15).advance_processing_time(1)) > > On Thu, Feb 6, 2020 at 1:

Re: Time precision in Python

2020-02-07 Thread Robert Bradshaw
ha, I was just surprised by the precision loss. Thanks! >> >> On Thu, Feb 6, 2020 at 1:50 PM Robert Bradshaw wrote: >>> >>> Yes, the inconsistency of timestamp granularity is something that >>> hasn't yet been resolved (see previous messages on this list). As lon

Re: [DISCUSS] Autoformat python code with Black

2020-02-06 Thread Robert Bradshaw
Java >>>> case >>>> what we did was to just notice every PR that was affected by the change. >>>> And clearly document how to validate and autoformat the code. >>>> >>>> So the earlier the better, go go autoformat! >>>> >>>

Re: Time precision in Python

2020-02-06 Thread Robert Bradshaw
Yes, the inconsistency of timestamp granularity is something that hasn't yet been resolved (see previous messages on this list). As long as we round consistently, it won't result in out-of-order windows, but it may result in timestamp truncation and (for sub-millisecond small windows) even window
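
A small worked example of the effect, assuming one side keeps microseconds and the other milliseconds (the helper name is made up):

```python
def micros_to_millis(micros: int) -> int:
    """Floor division rounds every timestamp the same way, negatives included."""
    return micros // 1000


timestamps_us = [-1, 999, 1500, 1900, 2000]
timestamps_ms = [micros_to_millis(t) for t in timestamps_us]

# A 400-microsecond window [1500, 1900) keeps neither its width nor its contents.
window_ms = (micros_to_millis(1500), micros_to_millis(1900))
```

Because the rounding is monotonic, `timestamps_ms` is still sorted, so nothing ends up out of order; but 1500 and 1900 both become 1, and the sub-millisecond window collapses to an empty interval.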

Re: [DISCUSS] Autoformat python code with Black

2020-02-05 Thread Robert Bradshaw
, 2020 at 3:55 PM Ahmet Altay wrote: > > Do we need a formal vote? There is consensus on this thread and on the PR. > > On Wed, Feb 5, 2020 at 3:37 PM Robert Bradshaw wrote: >> >> The PR is looking good. Should we call a vote? >> >> On Mon, Jan 27, 202

Re: [DISCUSS] Autoformat python code with Black

2020-02-05 Thread Robert Bradshaw
The PR is looking good. Should we call a vote? On Mon, Jan 27, 2020 at 11:03 AM Robert Bradshaw wrote: > > Thanks. I commented on the PR. I think if we're going this route we > should add a pre-commit, plus instructions on how to run the tool > (similar to spotless). > > On

Re: Deterministic field ordering in derived schemas

2020-02-05 Thread Robert Bradshaw
+1 to standardizing on a deterministic ordering for inference if none is imposed by the structure. On Wed, Feb 5, 2020, 8:55 AM Gleb Kanterov wrote: > There are Beam schema providers that use Java reflection to get fields for > classes with fields and auto-value classes. It isn't relevant for
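
As a sketch of what "deterministic" buys: if reflection hands back fields in an arbitrary order, sorting them gives every worker the same schema. Sorting by name is only one possible choice, and `infer_fields` is a hypothetical helper:

```python
from typing import Dict, List, Tuple


def infer_fields(discovered: Dict[str, str]) -> List[Tuple[str, str]]:
    """Order (name, type) pairs by name so the result ignores discovery order."""
    return sorted(discovered.items())


# The same fields, discovered in two different orders.
run_a = infer_fields({"name": "STRING", "age": "INT32"})
run_b = infer_fields({"age": "INT32", "name": "STRING"})
```

Both runs produce the same list, so anything that depends on field position sees the same layout everywhere.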

Re: Python2.7 Beam End-of-Life Date

2020-02-04 Thread Robert Bradshaw
On Tue, Feb 4, 2020 at 12:12 PM Chad Dombrova wrote: >> >> Not to mention that all the nice work for the type hints will have to be >> redone in the for 3.x. > > Note that there's a tool for automatically converting type comments to > annotations: https://github.com/ilevkivskyi/com2ann > > So

Re: [DISCUSSION] Improve release notes by adding a change list file

2020-02-03 Thread Robert Bradshaw
On Mon, Feb 3, 2020 at 4:49 PM Ahmet Altay wrote: > > On Mon, Feb 3, 2020 at 2:09 PM Robert Bradshaw wrote: >> >> I would suggest we start with the simpler single file. If merge >> conflicts become an issue, we could look at other options, but I think >> it's w

Re: [DISCUSSION] Improve release notes by adding a change list file

2020-02-03 Thread Robert Bradshaw
] https://github.com/python-attrs/attrs >> >> >> On Fri, Jan 31, 2020 at 5:09 PM Ahmet Altay wrote: >>> >>> Thank you for the quick responses. I sent out >>> https://github.com/apache/beam/pull/10743 to make this change. Please >>> provide fee

Re: [DISCUSSION] Improve release notes by adding a change list file

2020-01-31 Thread Robert Bradshaw
Yes, yes, yes! This is the one model of release notes that I've actually seen work well at scale. https://lists.apache.org/thread.html/41e03ace17dbcccf7e267ba6d538736b2a99a8e73e7fb45702766b17%40%3Cdev.beam.apache.org%3E Let's make it happen. On Fri, Jan 31, 2020 at 3:47 PM Robert Burke wrote:

Re: Release 2.19.0, release candidate #1

2020-01-31 Thread Robert Bradshaw
+1 (binding) I validated the source tarball and all the signatures/checksums, and also tried out a Python wheel on a fresh install with some direct runner pipelines. On Fri, Jan 31, 2020 at 1:44 AM Jean-Baptiste Onofré wrote: > > +1 (binding) > > Checked quickly on beam-samples. > > Thanks, >

Re: [ANNOUNCE] New committer: Hannah Jiang

2020-01-29 Thread Robert Bradshaw
Congratulations, Hannah! On Wed, Jan 29, 2020 at 3:23 PM Chamikara Jayalath wrote: > Congrats Hannah! > > On Wed, Jan 29, 2020 at 9:22 AM Hannah Jiang > wrote: > >> Thanks everyone! >> It is a very rewarding journey and I am happy to be able to achieve a >> mini milestone. :) >> >> >> On Wed,

Re: [DISCUSS] Autoformat python code with Black

2020-01-27 Thread Robert Bradshaw
precommit job that >>> fails if any unformatted code is detected looks too strict. What do >>> you think? >>> >>> On Thu, Jan 23, 2020 at 8:37 PM Robert Bradshaw wrote: >>>> >>>> Thanks! Now we get to debate what knobs to twiddle :

Re: [DISCUSS] Autoformat python code with Black

2020-01-23 Thread Robert Bradshaw
>> iteration. We will skip some of conversations about code style. >>>>>>>>> >>>> >>>>>>>>> >>>> ... >>>>>>>>> >>>>> >>>>>>>>> >>>>>

Re: Updating Metrics Counter in user defined thread

2020-01-21 Thread Robert Bradshaw
he thread local setup is that parallelism >> is typically handled by Beam, rather than introducing a separate >> threading model. Though, perhaps breaking out of this threading model is >> more common than we initially thought. >> >> I hope that's helpful, sorry we d

Re: [DISCUSS] Autoformat python code with Black

2020-01-21 Thread Robert Bradshaw
ires Python 3 to be run. I don’t know how >>>>> big obstacle it would be. >>>>> >>>>> >>>>> I believe there are two options how it would be possible to introduce >>>>> Black. First: just do it, it will hurt but then it would be

Re: [DISCUSS] Integrate Google Cloud AI functionalities

2020-01-21 Thread Robert Bradshaw
The current state is that it works, and a large amount of testing is being added [1], but the public API is still in flux (especially the java-as-callee side [2], and the specification of dependencies [3,4]). It is being actively worked on though. [1] https://github.com/apache/beam/pull/10051 [2]

Re: [VOTE] Release 2.18.0, release candidate #1

2020-01-21 Thread Robert Bradshaw
ranch and tag JIRA > issues with the all relevant releases that should be blocked on it. > > Ahmet > > On Tue, Jan 21, 2020 at 11:36 AM Udi Meiri wrote: >> >> I was not aware of https://issues.apache.org/jira/browse/BEAM-9123 or the PR >> on the release br

Re: [VOTE] Release 2.18.0, release candidate #1

2020-01-21 Thread Robert Bradshaw
The source tarball seems to be missing the commit at https://github.com/apache/beam/commit/a61dfbf4570e3adb30e15315c116751faeda897e On Tue, Jan 21, 2020 at 9:49 AM Ahmet Altay wrote: > > All, could you help with validations and voting? > > On Wed, Jan 15, 2020 at 6:14 PM Ahmet Altay wrote: >>
