Re: TextIO binary file

2017-01-31 Thread Robert Bradshaw
On Tue, Jan 31, 2017 at 12:04 PM, Aviem Zur wrote: > +1 on what Stas said. > I think there is value in not having the user write a custom IO for a > protocol they use which is not covered by Beam IOs. Plus having them deal > with not only the encoding but also the IO part is

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-01-26 Thread Robert Bradshaw
First off, let me say that a *correctly* batching DoFn is a lot of value, especially because it's (too) easy to (often unknowingly) implement it incorrectly. My take is that a BatchingParDo should be a PTransform that takes a DoFn, ? extends Iterable> as a parameter, as

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-01-27 Thread Robert Bradshaw
On Fri, Jan 27, 2017 at 6:55 AM, Etienne Chauchot <echauc...@gmail.com> wrote: > Hi Robert, > > On 26/01/2017 at 18:17, Robert Bradshaw wrote: >> >> First off, let me say that a *correctly* batching DoFn is a lot of >> value, especially because it's (too) easy to

Re: Beam Fn API

2017-01-20 Thread Robert Bradshaw
Also, note that we can still support the "simple" case. For example, if the user supplies us with a jar file (as they do now) a runner could launch it as a subprocess and communicate with it via this same Fn API or install it in a fixed container itself--the user doesn't *need* to know about

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-01-26 Thread Robert Bradshaw
On Thu, Jan 26, 2017 at 6:58 PM, Kenneth Knowles <k...@google.com.invalid> wrote: > On Thu, Jan 26, 2017 at 4:15 PM, Robert Bradshaw < > rober...@google.com.invalid> wrote: > >> On Thu, Jan 26, 2017 at 3:42 PM, Eugene Kirpichov >> <kirpic...@google.com.inva

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-01-26 Thread Robert Bradshaw
On Thu, Jan 26, 2017 at 4:20 PM, Ben Chambers wrote: > Here's an example API that would make this part of a DoFn. The idea here is > that it would still be run as `ParDo.of(new MyBatchedDoFn())`, but the > runner (and DoFnRunner) could see that it has asked for

Re: [ANNOUNCEMENT] New committers, January 2017 edition!

2017-01-26 Thread Robert Bradshaw
Welcome and congratulations! On Thu, Jan 26, 2017 at 5:05 PM, Sourabh Bajaj wrote: > Congrats!! > > On Thu, Jan 26, 2017 at 5:02 PM Jason Kuster > wrote: > >> Congrats all! Very exciting. :) >> >> On Thu, Jan 26, 2017 at 4:48 PM,

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-01-26 Thread Robert Bradshaw
k the "make batches of at most N but don't wait too long if you don't get to N" is a very useful first (and tractable) start that can be built on. > On Thu, Jan 26, 2017 at 3:01 PM Robert Bradshaw <rober...@google.com.invalid> > wrote: > >> On Thu, Jan 26, 2017 at 12:48 PM
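
A minimal sketch of the "at most N, but don't wait too long" policy described above, in plain Python rather than against the Beam API. The class name `Batcher`, its parameters, and the `poll()` hook are hypothetical; in a real DoFn the time-based flush would presumably be driven by a timer or by bundle finalization, and windowing is ignored here.

```python
import time


class Batcher:
    """Groups elements into batches of at most max_size, flushing early
    once the oldest buffered element has waited max_wait_secs.

    Hypothetical helper for illustration; not a Beam class."""

    def __init__(self, max_size, max_wait_secs, clock=time.monotonic):
        self.max_size = max_size
        self.max_wait_secs = max_wait_secs
        self.clock = clock  # injectable so the policy can be tested
        self._buffer = []
        self._first_arrival = None

    def add(self, element):
        """Buffers an element; returns a full or overdue batch, else None."""
        if not self._buffer:
            self._first_arrival = self.clock()
        self._buffer.append(element)
        return self._maybe_flush()

    def poll(self):
        """Called periodically to flush a stale partial batch."""
        return self._maybe_flush()

    def _maybe_flush(self):
        if not self._buffer:
            return None
        full = len(self._buffer) >= self.max_size
        overdue = self.clock() - self._first_arrival >= self.max_wait_secs
        if full or overdue:
            batch, self._buffer = self._buffer, []
            return batch
        return None
```

The size limit bounds memory and request size, while the wait limit bounds latency when traffic is sparse; both are needed for the policy to be safe to use in a streaming setting.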

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-01-26 Thread Robert Bradshaw
On Thu, Jan 26, 2017 at 12:48 PM, Eugene Kirpichov wrote: > I don't think we should make batching a core feature of the Beam > programming model (by adding it to DoFn as this code snippet implies). I'm > reasonably sure there are less invasive ways of implementing

Re: Better developer instructions for using Maven?

2017-02-22 Thread Robert Bradshaw
On Wed, Feb 22, 2017 at 7:51 AM, Jean-Baptiste Onofré wrote: > Thanks Kenn, it's perfectly clear now ;) > That was Kenn's vote. I'm of the opposite opinion (at least I think checkstyle should be done by default, possibly others). It's clear many people aren't very happy with

Re: Better developer instructions for using Maven?

2017-02-10 Thread Robert Bradshaw
IMO) decide to go > > do something else. > > > > Folks other than newcomers can learn a repertoire of commands, like > Robert > > says. So we shouldn't consider them (aka "us") so much when deciding > > whether "fast" or "slow" is the default,

Re: Beam File System in the Python SDK

2017-03-01 Thread Robert Bradshaw
Much needed! Added a couple of comments. On Wed, Mar 1, 2017 at 3:08 PM, Sourabh Bajaj < sourabhba...@google.com.invalid> wrote: > Hi, > > BEAM-1441 is a ticket > for > implementing the Beam File System in the Python SDK similar to the one >

Re: Better developer instructions for using Maven?

2017-01-09 Thread Robert Bradshaw
On Mon, Jan 9, 2017 at 3:49 AM, Aljoscha Krettek wrote: > I also usually prefer "mvn verify" to do the expected thing but I see that > quick iteration times are key. I see https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html verify - run any

Re: Creating Sum.[*]Fn instances

2016-12-22 Thread Robert Bradshaw
I was about to comment the same. Generally the CombineFns are more composable units than the global and per-key wrappings; it's not clear why we favor the latter for some Combiners. On Thu, Dec 22, 2016 at 9:59 AM, Ben Chambers wrote: > Don't they need to be visible for use

Re: [jira] [Commented] (BEAM-1261) State API should allow state to be managed in different windows

2017-03-23 Thread Robert Bradshaw
I like the idea of being able to use WindowMappingFns to access state across windows in a manner similar to how side inputs are accessed. On Wed, Mar 22, 2017 at 9:56 PM, Kenneth Knowles (JIRA) wrote: > > [ https://issues.apache.org/jira/browse/BEAM-1261?page= >

Re: [DISCUSS] Change "RunnableOnService" To A More Intuitive Name

2017-03-27 Thread Robert Bradshaw
d the rename from RunnableOnService to ValidatesRunner in > > the > > > > Java codebase (Python was already there) > > > > https://github.com/apache/beam/pull/2157. > > > > > > > > I'm sure there will be stragglers throughout our docs, etc, so ple

Re: Splittable DoFn for Python SDK

2017-03-14 Thread Robert Bradshaw
+1, I think this is a natural extension of the SDF to Python. On Tue, Mar 14, 2017 at 1:19 PM, Chamikara Jayalath wrote: > Thanks Eugene. Will keep you cc'd. > > - Cham > > On Tue, Mar 14, 2017 at 1:15 PM Eugene Kirpichov > wrote: > >> Thanks Cham!

Re: Style: how much testing for transform builder classes?

2017-03-21 Thread Robert Bradshaw
re comprehensive than manual tests...)? AutoValue likely alleviates many, but not all, of these concerns - as Ismael > points out. > If two features are not orthogonal, that perhaps merits more tests (and documentation). > > > > On Tue, Mar 21, 2017 at 1:18 PM, Robert Bradshaw

Re: Python build artifacts seem to be misconfigured

2017-04-11 Thread Robert Bradshaw
We should also ignore them: https://github.com/apache/beam/pull/2494 On Thu, Apr 6, 2017 at 6:45 PM, Kenneth Knowles wrote: > Thanks for the pointer. I'll dig in to tox docs to see why this isn't > happening. Probably something to do with unclean shutdowns. > > On Thu,

Re: Renaming SideOutput

2017-04-11 Thread Robert Bradshaw
+1, I think this is a lot clearer. On Tue, Apr 11, 2017 at 2:24 PM, Stephen Sisk wrote: > strong +1 for changing the name away from sideOutput - the fact that > sideInput and sideOutput are not really related was definitely a source of > confusion for me when learning

Re: Renaming SideOutput

2017-04-11 Thread Robert Bradshaw
<k...@google.com.invalid> wrote: > +1 ditto about sideInput and sideOutput not actually being related > > On Tue, Apr 11, 2017 at 3:52 PM, Robert Bradshaw < > rober...@google.com.invalid> wrote: > >> +1, I think this is a lot clearer. >> >> On

Re: Question regarding loops in BEAM programs

2017-04-13 Thread Robert Bradshaw
There is no way (short of inspecting stack traces and bytecodes) for Beam to distinguish for (int i=0; i<3; i++) { pc = pc.apply(MyTransform()); } from pc.apply(MyTransform()).apply(MyTransform()).apply(MyTransform()); However, PTransforms are hierarchical, so you
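
To make the point concrete, here is a toy model (not the Beam API) in which a collection simply records the transforms applied to it. `ToyPCollection` and the string transform names are hypothetical stand-ins; the sketch only shows that a host-language loop leaves no trace in the graph that gets built.

```python
class ToyPCollection:
    """Hypothetical stand-in for a PCollection: it only remembers the
    sequence of transform names applied to reach it."""

    def __init__(self, graph=()):
        self.graph = tuple(graph)

    def apply(self, transform_name):
        # Applying a transform yields a new collection with a longer history.
        return ToyPCollection(self.graph + (transform_name,))


def build_with_loop(pc):
    # The loop runs at construction time, in the host language.
    for _ in range(3):
        pc = pc.apply("MyTransform")
    return pc


def build_with_chain(pc):
    return pc.apply("MyTransform").apply("MyTransform").apply("MyTransform")


looped = build_with_loop(ToyPCollection())
chained = build_with_chain(ToyPCollection())
print(looped.graph == chained.graph)
```

Both builders produce the same recorded graph, so anything that only sees the graph cannot tell which one was used.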

Re: Naming of Combine.Globally

2017-04-18 Thread Robert Bradshaw
On Tue, Apr 18, 2017 at 3:03 AM, Wesley Tanaka wrote: > I believe that foldl in Haskell https://www.haskell.org/hoogle/?hoogle=foldl > admits a separate accumulator type from the type of the data structure being > "folded" > And, well, python lets you have your way
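
As a small illustration of a fold whose accumulator type differs from the element type, the sketch below uses Python's `functools.reduce` with an explicit initial value. The helper names `add_input` and `mean` are chosen for this example only; the accumulator is a `(total, count)` pair while the elements are plain numbers.

```python
from functools import reduce


def add_input(acc, x):
    """Folds one number into a (total, count) accumulator."""
    total, count = acc
    return (total + x, count + 1)


def mean(values):
    """Computes a mean with a left fold; the accumulator is a pair,
    not a number, which is the point being illustrated."""
    total, count = reduce(add_input, values, (0, 0))
    return total / count if count else float("nan")


print(mean([1, 2, 3, 4]))
```

This is the same shape as a combiner that keeps an intermediate accumulator and extracts the final output only at the end.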

Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Robert Bradshaw
Strongly in favor of removing this. If it's actually needed one can incorporate the key into the value for inspection in the various phases of the CombineFn, so it's no loss of expressiveness. It's perfectly reasonable to make this (rare) usecase more complicated to greatly simplify the common
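
A sketch of "incorporate the key into the value", using a toy per-key combine in plain Python rather than Beam itself. The method names (`create_accumulator`, `add_input`, `extract_output`) are modeled on the Python SDK's CombineFn; `KeyAwareExtreme`, `combine_per_key`, and the key-prefix rule are hypothetical and exist only to show a combiner inspecting the key without a keyed variant of the interface.

```python
from collections import defaultdict


class KeyAwareExtreme:
    """Key-agnostic combiner interface whose inputs are (key, value)
    pairs, so the combining logic can still look at the key."""

    def create_accumulator(self):
        return None

    def add_input(self, acc, keyed_value):
        key, value = keyed_value
        if acc is None:
            return value
        # Illustrative key-dependent behavior: "min..." keys keep the
        # smallest value, all other keys keep the largest.
        pick = min if key.startswith("min") else max
        return pick(acc, value)

    def extract_output(self, acc):
        return acc


def combine_per_key(pairs, combine_fn):
    """Toy per-key combine that folds the key into each value first."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append((key, value))
    result = {}
    for key, keyed_values in grouped.items():
        acc = combine_fn.create_accumulator()
        for keyed_value in keyed_values:
            acc = combine_fn.add_input(acc, keyed_value)
        result[key] = combine_fn.extract_output(acc)
    return result
```

The rare key-dependent case pays for an extra pairing step, and the common case keeps a simpler interface.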

Re: Let's make Beam transforms comply with PTransform Style Guide

2017-03-03 Thread Robert Bradshaw
Here's a crazy idea: what if we had a virtual fixit/hackathon to knock these out (similar to the virtual meet-up, but with an agenda)? I find communal hacking sessions towards a common goal are a good way to get to know each other and get a lot done. Would there be any interest in this? On Wed,

Re: [VOTE] Release 0.6.0, release candidate #2

2017-03-11 Thread Robert Bradshaw
On Fri, Mar 10, 2017 at 9:05 PM, Ahmet Altay wrote: > Hi everyone, > > Please review and vote on the release candidate #2 for the version 0.6.0, > as follows: > [ ] +1, Approve the release > [ ] -1, Do not approve the release (please provide specific comments) > > > The

Re: Style: how much testing for transform builder classes?

2017-03-11 Thread Robert Bradshaw
+1 to reducing "trivial" tests such as these. More below. On Fri, Mar 10, 2017 at 7:53 PM, Kenneth Knowles wrote: > +0.5 > > Tests of trivial validation failures, if they check the error message, are > actually tests of effective communication in the error message

Re: [VOTE] Release 0.6.0, release candidate #2

2017-03-13 Thread Robert Bradshaw
On Sat, Mar 11, 2017 at 11:19 PM, Ahmet Altay <al...@google.com.invalid> wrote: > On Sat, Mar 11, 2017 at 11:48 AM, Robert Bradshaw < > rober...@google.com.invalid> wrote: > > > On Fri, Mar 10, 2017 at 9:05 PM, Ahmet Altay <al...@google.com.invalid>

Re: [VOTE] Release 0.6.0, release candidate #2

2017-03-13 Thread Robert Bradshaw
+1 (binding) On Mon, Mar 13, 2017 at 11:10 AM, Robert Bradshaw <rober...@google.com> wrote: > On Sat, Mar 11, 2017 at 11:19 PM, Ahmet Altay <al...@google.com.invalid> > wrote: > >> On Sat, Mar 11, 2017 at 11:48 AM, Robert Bradshaw < >> rober...@google.com.in

Re: Proposed API for a Whole File IO

2017-08-01 Thread Robert Bradshaw
On Tue, Aug 1, 2017 at 1:42 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote: > Hi, > As mentioned on the PR - I support the creation of such an IO (both read > and write) with the caveats that Reuven mentioned; we can refine the naming > during code review. > Note that you won't be

Re: [DISCUSS] Beam pipeline logical and physical DAGs visualization.

2017-08-03 Thread Robert Bradshaw
Nice. In terms of shared data structures, we have https://github.com/apache/beam/blob/master/sdks/common/runner-api/src/main/proto/beam_runner_api.proto . Presumably a utility that converts this to a dot file would be quite useful. It might be interesting to experiment with different ways of
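
A rough sketch of what such a converter could look like. It does not read the Runner API proto; instead it assumes a hypothetical in-memory form, a dict mapping each transform name to its input and output collection names, and emits Graphviz DOT text with edges labeled by the collection that connects two transforms.

```python
def to_dot(transforms):
    """Renders a pipeline as DOT text.

    transforms: dict of name -> (input collections, output collections).
    This structure is an assumption made for the example."""
    # Map every collection to the transform that produces it.
    producers = {
        pc: name for name, (_, outputs) in transforms.items() for pc in outputs
    }
    lines = ["digraph pipeline {"]
    for name in transforms:
        lines.append('  "%s";' % name)
    for name, (inputs, _) in transforms.items():
        for pc in inputs:
            if pc in producers:
                lines.append(
                    '  "%s" -> "%s" [label="%s"];' % (producers[pc], name, pc)
                )
    lines.append("}")
    return "\n".join(lines)


print(to_dot({"Read": ([], ["lines"]), "Count": (["lines"], ["counts"])}))
```

The resulting text can be passed to any DOT renderer; nesting composite transforms as subgraphs would be a natural next experiment.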

Re: Adding back PipelineRunner#apply method

2017-08-15 Thread Robert Bradshaw
On Tue, Aug 15, 2017 at 10:21 AM, Shen Li wrote: > Hi Thomas, > > Does it mean future Pipeline implementations would allow applications to > set the runner after a pipeline has been constructed? Correct, that's the intent. > > Thanks, > Shen > > On Tue, Aug 15, 2017 at

Re: [PROPOSAL] "Requires deterministic input"

2017-08-15 Thread Robert Bradshaw
On Tue, Aug 15, 2017 at 2:14 PM, Reuven Lax <re...@google.com.invalid> wrote: > On Tue, Aug 15, 2017 at 1:59 PM, Robert Bradshaw < > rober...@google.com.invalid> wrote: > >> On Sat, Aug 12, 2017 at 1:13 AM, Reuven Lax <re...@google.com.invalid> >> wrote

Re: [PROPOSAL] "Requires deterministic input"

2017-08-15 Thread Robert Bradshaw
On Sat, Aug 12, 2017 at 1:13 AM, Reuven Lax <re...@google.com.invalid> wrote: > On Fri, Aug 11, 2017 at 10:52 PM, Robert Bradshaw < >> The question here is whether the ordering is part of the "content" of >> an iterable. > > My initial instinct was to sa

Re: [PROPOSAL] "Requires deterministic input"

2017-08-11 Thread Robert Bradshaw
On Thu, Aug 10, 2017 at 1:53 PM, Reuven Lax wrote: > On Thu, Aug 10, 2017 at 1:07 PM, Kenneth Knowles > wrote: > >> > > >- Does it also imply fixed length and content for value >> iterators? >> > > > >> >> The concept of "value iterator"

Re: Style of messages for checkArgument/checkNotNull in IOs

2017-08-11 Thread Robert Bradshaw
Huge +1 to the checkArgument(username != null, ...) style. A note on validate(): aren't we trying to remove pipeline options from PTransforms altogether? (In addition, how does this even work with the Runner API and cross-language transforms?) On Thu, Aug 10, 2017 at 4:59 PM, Eugene

Re: Requiring PTransform to set a coder on its resulting collections

2017-08-11 Thread Robert Bradshaw
> kirpic...@google.com.invalid> wrote: > >> I've updated the guidance in PTransform Style Guide on setting coders >> https://beam.apache.org/contribute/ptransform-style-guide/#coders >> according to this discussion. >> https://github.com/apache/beam-site/pull/

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-07-10 Thread Robert Bradshaw
Sorry, just saw https://github.com/apache/beam/pull/2211 On Mon, Jul 10, 2017 at 5:37 PM, Robert Bradshaw <rober...@google.com> wrote: > Any progress on this? > > On Thu, Mar 9, 2017 at 1:43 AM, Etienne Chauchot <echauc...@gmail.com> wrote: >> Hi all, >> >>

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-07-10 Thread Robert Bradshaw
Any progress on this? On Thu, Mar 9, 2017 at 1:43 AM, Etienne Chauchot wrote: > Hi all, > > We had a discussion with Kenn yesterday about point 1 below, I would like > to note it here on the ML: > > Using new method timer.set() instead of timer.setForNowPlus() makes the >

Re: [jira] [Commented] (BEAM-2573) Better filesystem discovery mechanism in Python SDK

2017-07-07 Thread Robert Bradshaw
Wouldn't you want to import filesystems to do validation of paths, etc? (Also, registering an imported object is less error-prone than registering a string.) On Fri, Jul 7, 2017 at 9:29 PM, Chamikara Jayalath (JIRA) wrote: > > [ >

Re: MergeBot is here!

2017-07-15 Thread Robert Bradshaw
kets for some of this >> stuff at some point, but am more likely to track via github issues on the >> mergebot repository for now). Comments welcome. :) >> >> https://docs.google.com/document/d/13D1nUgTeonyvNtRi4bJM- >> Vyj9YOCVHZT7QA6EOauKT4/edit >> >> On Wed, Jul

Re: Should Pipeline wait till all processing time timers fire before exit?

2017-07-25 Thread Robert Bradshaw
I generally agree, but it's unclear what to do with timers that are scheduled during the execution of existing timers. (For example, a "heartbeat" source may process a timer by emitting an element and scheduling a timer for the future. One would never be able to fire "all" timers. I suppose this
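
The "heartbeat" problem can be shown with a toy timer loop in plain Python. Everything here (`fire_timers`, `heartbeat`, `one_shot`, the `max_firings` cap) is hypothetical and unrelated to any runner's actual timer implementation; the sketch only shows that a callback which always reschedules itself leaves the timer queue non-empty no matter how many timers are fired.

```python
import heapq


def fire_timers(initial_timers, on_timer, max_firings):
    """Fires timers in time order, letting each firing schedule more.

    Returns (fired_count, pending_count). The cap exists because the
    loop would otherwise never end for self-rescheduling callbacks."""
    heap = list(initial_timers)
    heapq.heapify(heap)
    fired = 0
    while heap and fired < max_firings:
        fire_time = heapq.heappop(heap)
        fired += 1
        for new_time in on_timer(fire_time):
            heapq.heappush(heap, new_time)
    return fired, len(heap)


def heartbeat(fire_time):
    # Emits (conceptually) an element, then schedules the next beat.
    return [fire_time + 1]


def one_shot(fire_time):
    # Does its work and schedules nothing further.
    return []
```

With `one_shot` the queue drains; with `heartbeat` there is always exactly one timer pending, so "wait until all timers have fired" has no terminating interpretation.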

Re: Custom window merging

2017-07-27 Thread Robert Bradshaw
t the > standardization deserves attention and documentation. > These are already called out individually in the compatibility matrix, which probably makes sense as it allows a runner to declare "partial" support for windowing. On Wed, Jul 26, 2017 at 9:34 PM, Robert Bradshaw <

Re: Requiring PTransform to set a coder on its resulting collections

2017-07-27 Thread Robert Bradshaw
On Thu, Jul 27, 2017 at 10:04 AM, Kenneth Knowles wrote: > On Thu, Jul 27, 2017 at 2:22 AM, Lukasz Cwik > wrote: >> >> Ken/Robert, I believe users will want the ability to set the output coder >> because coders may have intrinsic properties

Re: Requiring PTransform to set a coder on its resulting collections

2017-07-26 Thread Robert Bradshaw
+1, I'm a huge fan of moving this direction. Right now there's also the ugliness that setCoder() may be called any number of times before a PCollection is used (the last setter winning) but it is an error to call it once it has been used (and here "used" is not clear--if a PCollection is returned
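
The semantics described above can be modeled with a few lines of plain Python. `CoderSlot` is a hypothetical class, and treating the first read of the coder as the moment of "use" is an assumption made for this sketch, since the message itself notes that "used" is not well defined.

```python
class CoderSlot:
    """Toy model of a settable-until-used coder (not a Beam class)."""

    def __init__(self):
        self._coder = None
        self._used = False

    def set_coder(self, coder):
        if self._used:
            raise RuntimeError("coder cannot be changed after first use")
        self._coder = coder  # silently overwrites: the last setter wins
        return self

    def get_coder(self):
        # Assumption: reading the coder counts as "using" the collection.
        self._used = True
        return self._coder
```

The silent overwrite is the awkward part: two pieces of code can each believe they chose the coder, and only the order of calls decides which one is right.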

Re: Should Pipeline wait till all processing time timers fire before exit?

2017-07-26 Thread Robert Bradshaw
e about drain. > On Tue, Jul 25, 2017 at 5:34 PM, Eugene Kirpichov < > kirpic...@google.com.invalid> wrote: > >> Yes, and I think in this case the pipeline should never transition to DONE. >> >> On Tue, Jul 25, 2017 at 3:42 PM Robert Bradshaw >> <rober...

Re: MergeBot is here!

2017-07-12 Thread Robert Bradshaw
On Tue, Jul 11, 2017 at 7:14 PM, Kenneth Knowles <k...@google.com.invalid> wrote: > > On Tue, Jul 11, 2017 at 4:25 PM, Robert Bradshaw < > rober...@google.com.invalid> wrote: > > > On Tue, Jul 11, 2017 at 8:51 AM, Kenneth Knowles <k...@google.com.invalid>

Re: Passing pipeline options into PTransforms and Filesystems in Python

2017-07-11 Thread Robert Bradshaw
Templates, including ValueProviders, were recently added to the Python SDK. +1 to pursuing this train of thought (and as I mentioned on the bug, and has been mentioned here, we don't want to add PipelineOptions access to PTransforms/at construction time). On Tue, Jul 11, 2017 at 3:21 PM, Kenneth

Re: [jira] [Created] (BEAM-2729) Post commit fail: Input to GroupByKey must be of Tuple or Any type

2017-08-04 Thread Robert Bradshaw
Key: BEAM-2729 > URL: https://issues.apache.org/jira/browse/BEAM-2729 > Project: Beam > Issue Type: Bug > Components: sdk-py > Reporter: Ahmet Altay > Assignee: Robert Bradshaw > > > r

Re: [VOTE] Release 2.1.0, release candidate #3

2017-08-16 Thread Robert Bradshaw
+1 binding (I've been on vacation as well.) On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik wrote: > Back from vacation. > > +1 binding > > BEAM-2671 has been marked for 2.2.0 release. > > > > On Wed, Aug 16, 2017 at 2:08 AM, Kobi Salant wrote: >

Re: [Proposal] Progress Reporting in Fn API

2017-08-22 Thread Robert Bradshaw
I put together https://docs.google.com/document/d/1Dx18qBTvFWNqwLeecemOpKfleKzFyeV3Qwh71SHATvY/edit?usp=sharing which explains a bit how I think about progress and might be helpful. On Mon, Aug 21, 2017 at 10:22 AM, Vikas RK wrote: > Hi, > > I have updated the proposal >

Re: Issue with Coder documentation regarding context

2017-05-03 Thread Robert Bradshaw
I filed a https://issues.apache.org/jira/browse/BEAM-2166 simply removing these from the public API (for the reasons listed). We can always bring them back in a forward compatible way if it turns out that they're actually needed. On Thu, Feb 9, 2017 at 1:18 PM, Jean-Baptiste Onofré

Re: TextIO and .withWindowedWrites() - filenamepolicy

2017-05-11 Thread Robert Bradshaw
I like the idea of WWW and PPP, assuming there is a standard enough stringification of windows and panes. However, we may want to elide adjacent tokens if the window is global or the pane is the only possible (or first?) one to avoid writing things like --of-0005---. On Thu, May 11, 2017 at
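
One way to picture the eliding behavior is a formatter that drops an empty window or pane component together with its separator. The function `shard_filename` and the `prefix-window-pane-SSSSS-of-NNNNN` layout are hypothetical and are not TextIO's actual template syntax.

```python
def shard_filename(prefix, shard, num_shards, window="", pane=""):
    """Builds a shard name, skipping empty window/pane components so
    that no dangling separators such as "--" are produced."""
    parts = [prefix, window, pane, "%05d-of-%05d" % (shard, num_shards)]
    return "-".join(part for part in parts if part)


print(shard_filename("out", 0, 5))
print(shard_filename("out", 0, 5, window="w1", pane="p0"))
```

Passing an empty string for the global window or for a sole pane keeps unwindowed output names unchanged while still distinguishing windowed writes.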

Re: Behavior of Top.Largest

2017-05-15 Thread Robert Bradshaw
On Sun, May 14, 2017 at 3:36 PM, Ben Chambers wrote: > Exposing the CombineFn is necessary for use with composed combine or > combining value state. There may be other reasons we made this visible, but > these continue to justify it. > These are the CompareFns, not

Re: Behavior of Top.Largest

2017-05-22 Thread Robert Bradshaw
> > that is not possible, an easier alternative would be to file a JIRA issue >> > so that the work could be tracked in the other SDK. >> > >> > Ahmet >> > >> > On Fri, May 19, 2017 at 4:22 PM, Robert Bradshaw < >> > rober...@google.com.inva

Re: low availability in the coming 4 weeks

2017-05-25 Thread Robert Bradshaw
Congratulations! On Wed, May 24, 2017 at 8:50 PM, James wrote: > Congratulations Mingmin! Take your time with your new baby. > > On Thu, May 25, 2017 at 11:33 AM, Mingmin Xu wrote: > >> Hello everyone, >> >> I'll take 4 weeks off to take care of my newborn baby. I'm

Re: How can I disable running Python SDK tests when testing my Java change?

2017-05-18 Thread Robert Bradshaw
We could consider splitting Python up into the four things it runs: all tests with Cython, all tests without Cython, docs, and checkstyle. However, I never use Maven when developing the python portions. On Thu, May 18, 2017 at 6:35 PM, Thomas Groh wrote: > Generally I

Re: Proper developer instructions for Python SDK

2017-05-22 Thread Robert Bradshaw
On Mon, May 22, 2017 at 11:13 AM, Chamikara Jayalath <chamik...@apache.org> wrote: > (moving to a separate thread) > > On Mon, May 22, 2017 at 10:45 AM Robert Bradshaw > <rober...@google.com.invalid> wrote: > >> On Sun, May 21, 2017 at 11:45 AM, Dan Halperin >>

Re: Beam Fn API

2017-05-31 Thread Robert Bradshaw
> go >>>>>>>> >>>>>>>>> with a 1-1 relationship or break it up more finely grained and >>>>>>>>> >>>>>>>> dedicate >>>>>> >>>>>>> some machines to have sp

Re: [jira] [Commented] (BEAM-101) Data-driven triggers

2017-06-01 Thread Robert Bradshaw
>> URL: https://issues.apache.org/jira/browse/BEAM-101 >> Project: Beam >> Issue Type: New Feature >> Components: beam-model >>Reporter: Robert Bradshaw >> >> For some applications, it's useful to declare

Re: [jira] [Created] (BEAM-2426) Remove imports from runner/init

2017-06-08 Thread Robert Bradshaw
Yeah, wish we had gotten to this pre-alpha. On Thu, Jun 8, 2017 at 3:10 PM, Ahmet Altay (JIRA) wrote: > Ahmet Altay created BEAM-2426: > - > > Summary: Remove imports from runner/init > Key: BEAM-2426 >

On emitting from finishBundle

2017-05-05 Thread Robert Bradshaw
The JIRA issue https://issues.apache.org/jira/browse/BEAM-1283 suggests requiring an explicit Window when emitting from finishBundle. I'm starting a thread because JIRA/GitHub probably isn't the best (or most efficient) place to have this discussion. The original Spec requires the ambient WindowFn

Re: On emitting from finishBundle

2017-05-05 Thread Robert Bradshaw
> timestamp it wants to use, and output the correct thing to the correct > timestamp and window. I believe that having only the ability to > outputWindowed(value, timestamp, window) makes it quite obvious that this > is necessary. It is not boilerplate to do so, but core functionality. Yes, and

Re: First stable release: version designation?

2017-05-08 Thread Robert Bradshaw
I also have a definite (I guess that's closer to strong than slight) preference for 2.0. With version numbers, a gap is less likely to cause trouble than the ambiguity of an overlap, and easy to document (vs. with ambiguity, one wouldn't even think to consult the documentation without knowing the

Re: On emitting from finishBundle

2017-05-08 Thread Robert Bradshaw
to global combine), as long as we can detect its abuse. I'd be interested in hearing if others have thoughts on this as well. - Robert On Fri, May 5, 2017 at 2:05 PM, Kenneth Knowles <k...@google.com.invalid> wrote: > On Fri, May 5, 2017 at 1:53 PM, Robert Bradshaw <rober...@google

Re: Process for getting the first stable release out

2017-05-08 Thread Robert Bradshaw
hem, touched by significantly more people.) > With post commits running automatically on master only, that seems like a > logical starting point. But, it doesn't matter really -- either way works. > > On Mon, May 8, 2017 at 12:30 PM, Robert Bradshaw < > rober...@google.com.invalid> wr

Re: Process for getting the first stable release out

2017-05-08 Thread Robert Bradshaw
t one or the other before it's merged. There certainly may be cases where we decide to merge into master to be safe and optionally CP after the fact, but for many PRs it's clear where they should end up. > On Mon, May 8, 2017 at 1:10 PM, Robert Bradshaw <rober...@google.com.invalid >

Re: On emitting from finishBundle

2017-05-05 Thread Robert Bradshaw
On Fri, May 5, 2017 at 1:33 PM, Kenneth Knowles <k...@google.com.invalid> wrote: > On Fri, May 5, 2017 at 12:43 PM, Robert Bradshaw < > rober...@google.com.invalid> wrote: > >> On Fri, May 5, 2017 at 12:14 PM, Kenneth Knowles <k...@google.com.invalid> >> wr

Re: Congratulations Davor!

2017-05-04 Thread Robert Bradshaw
Congratulations, Davor! Well deserved. On Thu, May 4, 2017 at 9:53 AM, Hadar Hod wrote: > Congrats, Davor! > > On Thu, May 4, 2017 at 8:56 AM, Chamikara Jayalath > wrote: > >> Congrats Davor. Very well deserved. >> >> - Cham >> >> On Thu, May 4,

Re: Behavior of Top.Largest

2017-05-19 Thread Robert Bradshaw
I see this was implemented. Do we have a policy/guideline for when a name is "bad enough" to merit renaming (and keeping a duplicate, deprecated member around for a year or more)? On Mon, May 15, 2017 at 9:25 AM, Robert Bradshaw <rober...@google.com> wrote: > On Sun, May 14, 2

Re: Proposal: Unbreak Beam Python 2.1.0 with 2.1.1 bugfix release

2017-09-19 Thread Robert Bradshaw
s need to happen to fix it? >> >> On Tue, Sep 19, 2017, 5:49 PM Chamikara Jayalath <chamik...@apache.org> >> wrote: >> >> > +1 for cutting 2.1.1 for Python SDK only. >> > >> > Thanks, >> > Cham >> > >> > O

Re: Proposal: Unbreak Beam Python 2.1.0 with 2.1.1 bugfix release

2017-09-19 Thread Robert Bradshaw
+1. Right now anyone who follows our quickstart instructions or otherwise installs the latest release of apache_beam is broken. On Tue, Sep 19, 2017 at 2:05 PM, Charles Chen wrote: > The latest version (2.1.0) of Beam Python ( > https://pypi.python.org/pypi/apache-beam)

Re: Simplifying beam pipeline construction

2017-09-18 Thread Robert Bradshaw
use the proposed design pattern. On the back end, we could > adjust the portability framework protos. > > On Mon, Sep 18, 2017 at 5:49 PM, Robert Bradshaw < > rober...@google.com.invalid> wrote: > >> In the effort to simplify and clean up the Beam API, especially with

Re: [VOTE RESULT] Release 2.1.1, release candidate #1

2017-09-22 Thread Robert Bradshaw
Correction, Chamikara Jayalath is a committer, not a member of the PMC. This does not change the results; the voting still stands unanimous at 4 PMC votes + a significant committer vote in the affirmative. On Fri, Sep 22, 2017 at 2:16 PM, Robert Bradshaw <rober...@google.com> wrote: >

[VOTE RESULT] Release 2.1.1, release candidate #1

2017-09-22 Thread Robert Bradshaw
I'm happy to announce that we have unanimously approved this bugfix release. There are 5 approving PMC member votes: - Chamikara Jayalath - Kenneth Knowles - Daniel Halperin - Jean-Baptiste Onofré - Aljoscha Krettek There are no disapproving votes. Thanks everyone! I will be

Re: Using side inputs in any user code via thread-local side input accessor

2017-09-13 Thread Robert Bradshaw
+1 to reducing the amount of boilerplate for dealing with side inputs. I prefer the "NewDoFn" style of side inputs for consistency. The primary drawback seems to be lambda's incompatibility with annotations. This is solved in Python by letting all the first annotated argument of the process

Re: Using side inputs in any user code via thread-local side input accessor

2017-09-13 Thread Robert Bradshaw
mplicit" api. If we do go this direction for side inputs, we should also consider it for state and side outputs. > On Wed, Sep 13, 2017 at 1:03 PM Robert Bradshaw <rober...@google.com.invalid> > wrote: > >> +1 to reducing the amount of boilerplate for dealing with side

Re: Using side inputs in any user code via thread-local side input accessor

2017-09-13 Thread Robert Bradshaw
On Wed, Sep 13, 2017 at 1:56 PM, Eugene Kirpichov <kirpic...@google.com.invalid> wrote: > On Wed, Sep 13, 2017 at 1:44 PM Robert Bradshaw <rober...@google.com.invalid> > wrote: > >> On Wed, Sep 13, 2017 at 1:17 PM, Eugene Kirpichov >> <kirpic...@googl

Simplifying beam pipeline construction

2017-09-18 Thread Robert Bradshaw
In the effort to simplify and clean up the Beam API, especially with an eye towards making Beam more friendly towards interactive use, I propose getting rid of the Pipeline object. See the full proposal at https://s.apache.org/no-beam-pipeline . I'd like to hear people's thoughts on the idea. -

[VOTE] Release 2.1.1, release candidate #1

2017-09-21 Thread Robert Bradshaw
Hi everyone, As discussed earlier in this list [1] we'd like to get a bugfix release out for beam 2.1. Please review and vote on the release candidate #1 for the version 2.1.1, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments)

Re: [PROPOSAL] FileIO.write: a modular replacement for FileBasedSink/WriteFiles

2017-09-07 Thread Robert Bradshaw
Huge +1. This brings things more in line with Python's FileBasedSink where one simply overrides write[_encoded]_record and, usually, open/close. We may want to consider aligning the APIs. (And, of course bringing things like DynamicDestinations to Python.) On Wed, Sep 6, 2017 at 9:24 PM,

Re: [jira] [Commented] (BEAM-2815) Python DirectRunner is unusable with input files in the 100-250MB range

2017-09-07 Thread Robert Bradshaw
As another data point, you could try using --runner=apache_beam.runners.portability.fn_api_runner.FnApiRunner On Thu, Sep 7, 2017 at 1:20 PM, Ahmet Altay (JIRA) wrote: > > [ >

Re: [jira] [Commented] (BEAM-3040) Python precommit timed out after 150 minutes

2017-10-12 Thread Robert Bradshaw
The very old pip causes ~10 minute builds of the proto generation tools, but we're past that at this point. Not sure what these hangs are, it's probably some inter-process grpc synchronization thing. On Thu, Oct 12, 2017 at 11:46 AM, Ahmet Altay (JIRA) wrote: > > [

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Robert Bradshaw
e: >> >> Yea, I think voting is the next step. Luke - I think you are obviously the >> right person to set up the email of what exactly we are voting on, since >> you've driven this improvement. >> >> On Tue, Nov 28, 2017 at 12:08 AM, Robert Bradshaw <rober...@goo

Re: [VOTE] Release 2.2.0, release candidate #4

2017-11-25 Thread Robert Bradshaw
>>>>>> >>>>>>>>>> I build a pipeline using RC 2.2 today and ran with runner on >>>>> >>>>> yarn. >>>>>>>>>> >>>>>>>>>> It worked seamlessly for unbounded sources. C

Re: [VOTE] Release 2.2.0, release candidate #4

2017-11-25 Thread Robert Bradshaw
>>>>> > >>>>> > >>>>>> Hi, > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Typo in previous mail. I meant Flink runner. > >&

Re: [VOTE] Release 2.2.0, release candidate #4

2017-11-25 Thread Robert Bradshaw
Upload to PyPi done. https://pypi.python.org/pypi/apache-beam On Sat, Nov 25, 2017 at 8:56 AM, Robert Bradshaw <rober...@google.com> wrote: > I can do that. > > On Fri, Nov 24, 2017, 11:13 PM Reuven Lax <re...@google.com.invalid> wrote: >> >> I am not an owner

Re: [VOTE] Fixing @yyy.com.INVALID mailing addresses

2017-11-22 Thread Robert Bradshaw
+1 On Wed, Nov 22, 2017, 10:10 PM Jean-Baptiste Onofré wrote: > +1 > > Regards > JB > > On 11/23/2017 12:25 AM, Lukasz Cwik wrote: > > I have noticed that some e-mail addresses (notably @google.com) get > > .INVALID suffixed onto it so per...@yyy.com become >

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-28 Thread Robert Bradshaw
On Tue, Nov 28, 2017 at 9:48 AM, Reuven Lax wrote: > > On Tue, Nov 28, 2017 at 9:14 AM, Jean-Baptiste Onofré > wrote: >> >> Hi Reuven, >> >> Yes, I remember that we agreed on a release per month. However, we didn't >> do it before. I think the most important

On voting and process

2017-11-29 Thread Robert Bradshaw
those who felt differently to call for a vote, which would have been respected if anyone felt strongly. - Robert > On Wed, Nov 29, 2017 at 10:02 AM, Robert Bradshaw <rober...@google.com> > wrote: >> >> +1 (binding) >> >> I agree with what both JB and R

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-28 Thread Robert Bradshaw
It's great to see all the discussion going on here. I think it's important to point out that merging a parallel set of gradle build scripts is a separate (and much less disruptive) step than, say, switching over the default (or even recommended) build/release process to use them, let alone

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-29 Thread Robert Bradshaw
+1 (binding) I agree with what both JB and Reuven had to say about process. On Wed, Nov 29, 2017 at 7:45 AM, Jean-Baptiste Onofré wrote: > Hi Reuven, > > I know that the merge was not malicious. No problem at all ;) > > It's just about the community and consensus. > > For

Re: Callbacks/other functions run after a PDone/output transform

2017-12-04 Thread Robert Bradshaw
+1 At the very least an empty PCollection could be produced with no promises about its contents but the ability to be followed (e.g. as a side input), which is forward compatible with whatever actual metadata one may decide to produce in the future. On Mon, Dec 4, 2017 at 11:06 AM, Kenneth

Re: Guarding against unsafe triggers at construction time

2017-12-04 Thread Robert Bradshaw
On Mon, Dec 4, 2017 at 3:19 PM, Eugene Kirpichov wrote: > Hi, > > After a recent investigation of a data loss bug caused by unintuitive > behavior of some kinds of triggers, we had a discussion about how we can > protect against future issues like this, and I summarized it

Re: [PROPOSAL] Beam Go SDK feature branch

2017-12-01 Thread Robert Bradshaw
Very Exciting! The Go language is in a very different point of the design space than either Java or Python; it's interesting to see how you've explored making this fit with the Beam model. Thanks for the detailed design doc. +1 to targeting the portability framework directly. Once all runners

Re: [jira] [Commented] (BEAM-3343) beam_PostCommit_Python_Verify fails due to "ImportError: cannot import name coder_impl"

2017-12-14 Thread Robert Bradshaw
URL: https://issues.apache.org/jira/browse/BEAM-3343 >> Project: Beam >> Issue Type: Bug >> Components: sdk-py-core >>Reporter: Chamikara Jayalath >>Assignee: Robert Bradshaw >>Priority: Cr

Re: [jira] [Commented] (BEAM-3357) Python SDK head fails to run tests due to Requirement.parse('protobuf<=3.4.0,>=3.2.0')

2017-12-15 Thread Robert Bradshaw
I am also in favor of pinning as an immediate fix, bumping the bound otherwise. Regarding putting an upper bound to avoid being broken, the last two breaks have been due to just having an (unneeded) upper bound (which held us back to broken/incompatible releases in relationship to other

Re: [DISCUSS] Migrating Apache Beam PreCommits/PostCommits to Gradle Criteria

2017-12-15 Thread Robert Bradshaw
+1 to these being minimal criteria. (I assume there's an implicit "the coverage is the same or better" as well?) On Fri, Dec 15, 2017 at 12:37 PM, Lukasz Cwik wrote: > Romain, choosing the criteria is about setting the minimum expectations and > you're welcome to suggest

Re: [jira] [Commented] (BEAM-3357) Python SDK head fails to run tests due to Requirement.parse('protobuf<=3.4.0,>=3.2.0')

2017-12-15 Thread Robert Bradshaw
On Fri, Dec 15, 2017 at 1:51 PM, Ahmet Altay <al...@google.com> wrote: > > On Fri, Dec 15, 2017 at 1:38 PM, Robert Bradshaw <rober...@google.com> > wrote: >> >> I am also in favor of pinning as an immediate fix, bumping the bound >> otherwise. >> >>

Re: [jira] [Commented] (BEAM-3357) Python SDK head fails to run tests due to Requirement.parse('protobuf<=3.4.0,>=3.2.0')

2017-12-15 Thread Robert Bradshaw
ble we will prevent users from using latest versions of dependencies. >> On the other hand it will prevent breaking of already released versions. >> >>> >>> >>> Thanks, >>> Cham >>> >>> On Fri, Dec 15, 2017 at 2:19 PM Ahmet Altay
