Re: Build failed in Jenkins: beam_PostCommit_Java_RunnableOnService_Apex #435

2017-02-07 Thread Kenneth Knowles
Seems like a real configuration issue. The new use of Jackson YAML causing trouble with Apex RoS tests? Filed https://issues.apache.org/jira/browse/BEAM-1434. On Tue, Feb 7, 2017 at 11:04 PM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Dan Halperin
I am generally persuaded to at least change my number to something like 0 :). These are pretty reasonable perspectives, especially pointing out that withSideInputs is pretty useless in Count ;) On Tue, Feb 7, 2017 at 10:04 PM, Kenneth Knowles wrote: > On Tue, Feb 7, 2017 at 8:43 PM, Eugene Kirp

Re: Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Dataflow #2226

2017-02-07 Thread Kenneth Knowles
This was a manually triggered build against a PR. Ignore. On Tue, Feb 7, 2017 at 4:20 PM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See RunnableOnService_Dataflow/2226/> > >

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Kenneth Knowles
On Tue, Feb 7, 2017 at 8:43 PM, Eugene Kirpichov < > kirpic...@google.com.invalid> wrote: > I must admit I didn't quite > understand the option of "implements CombiningTransform". > On Tue, Feb 7, 2017 at 9:04 PM, Robert Bradshaw wrote: > Sorry, I'll try to clarify. ... <>... > FWIW this is a

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Kenneth Knowles
On Tue, Feb 7, 2017 at 8:43 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote: > So we have 3 options for Count.globally() and .perKey(): > 1. Return Combine.Globally, Combine.PerKey transforms (status quo) > This is not a full picture of the status quo. Count.perElement() returns a PerE

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Robert Bradshaw
On Tue, Feb 7, 2017 at 8:43 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote: > I like the idea of "if you're really just a Combine, then expose (only) the > CombineFn". > > However, in case of Count, another argument is that Count.perElement() > already returns a PerElement transform, a

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Eugene Kirpichov
I like the idea of "if you're really just a Combine, then expose (only) the CombineFn". However, in case of Count, another argument is that Count.perElement() already returns a PerElement transform, and it'd be awkward if globally() and perKey() were only exposed as CombineFn's. In a sense, if you

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Robert Bradshaw
On Tue, Feb 7, 2017 at 7:49 PM, Kenneth Knowles wrote: > I am +0.7 on this idea. My rationale is contained in this thread, but I > thought I would paraphrase it anyhow: > > "You automatically get all the features of Combine" / "If you add a feature > to Combine you have to update all wrappers" >

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Ben Chambers
Going back to the beginning, why not "Combine.perKey(Count.fn())" or some such? We have a lot of boilerplate already to support "Count.perKey()" and that will only become worse when we try to have a Count.PerKey class that has the functionality of Combine.PerKey, why not just get rid of the boiler

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Kenneth Knowles
I am +0.7 on this idea. My rationale is contained in this thread, but I thought I would paraphrase it anyhow: "You automatically get all the features of Combine" / "If you add a feature to Combine you have to update all wrappers" 0. I have been in some of the mentioned historical discussions and

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Eugene Kirpichov
There are 2 points here: 1. If Count.globally() is implemented via Combine.globally(), then should Count.globally() return a Combine.Globally, or should it wrap it into a new class Count.Globally? (that's what I'm wondering in this thread) I think the "least visibility" argument here would lead us t
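To make the options debated in this thread concrete, here is a minimal sketch in plain Java. It is not Beam code: `CombineFn`, `CombineGlobally`, `CountGlobally`, `withFanout`, and the factory method names are hypothetical stand-ins invented for illustration, and lists stand in for collections. Option A returns the generic combine type (callers get every knob for free), option B wraps it in a dedicated class (knobs are hidden unless forwarded), and `countFn()` shows the "expose only the function" alternative.

```java
import java.util.List;

// All types here are hypothetical stand-ins, NOT the Beam SDK classes.
class CountStyleSketch {
    /** Stand-in for a combining function: folds a list of inputs into one output. */
    interface CombineFn<I, O> {
        O apply(List<I> inputs);
    }

    /** Stand-in for a generic combine transform with one illustrative tuning knob. */
    static final class CombineGlobally<I, O> {
        private final CombineFn<I, O> fn;
        private final int fanout;

        CombineGlobally(CombineFn<I, O> fn, int fanout) {
            this.fn = fn;
            this.fanout = fanout;
        }

        /** Returns a copy with the fanout knob set (the knob has no effect in this sketch). */
        CombineGlobally<I, O> withFanout(int newFanout) {
            return new CombineGlobally<>(fn, newFanout);
        }

        int fanout() {
            return fanout;
        }

        O expand(List<I> input) {
            return fn.apply(input);
        }
    }

    /** The counting logic itself, exposed as a bare function. */
    static <T> CombineFn<T, Long> countFn() {
        return inputs -> (long) inputs.size();
    }

    /** Option A: return the generic combine type; callers see all of its knobs. */
    static <T> CombineGlobally<T, Long> globallyAsCombine() {
        return new CombineGlobally<>(CountStyleSketch.<T>countFn(), 0);
    }

    /** Option B: a dedicated wrapper class; generic knobs are hidden unless forwarded. */
    static final class CountGlobally<T> {
        private final CombineGlobally<T, Long> delegate =
            new CombineGlobally<>(CountStyleSketch.<T>countFn(), 0);

        Long expand(List<T> input) {
            return delegate.expand(input);
        }
    }

    static <T> CountGlobally<T> globallyAsWrapper() {
        return new CountGlobally<>();
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "c");
        // Option A: the generic knob is available without any extra code.
        System.out.println(
            CountStyleSketch.<String>globallyAsCombine().withFanout(4).expand(words));
        // Option B: only what the wrapper chooses to expose is available.
        System.out.println(CountStyleSketch.<String>globallyAsWrapper().expand(words));
        // Function-only alternative: the caller builds the combine itself.
        System.out.println(
            new CombineGlobally<>(CountStyleSketch.<String>countFn(), 0).expand(words));
    }
}
```

In this sketch the trade-off is visible in the signatures: adding a knob to `CombineGlobally` reaches option A callers automatically, while `CountGlobally` would need a forwarding method for each one.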

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Dan Halperin
A little bit more inline: On Tue, Feb 7, 2017 at 5:15 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote: > Hello, > > I was auditing Beam for violations of PTransform style guide > https://beam.apache.org/contribute/ptransform-style-guide/ and came across > another style point that deser

Re: Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Dan Halperin
I'll agree with the "Cons" by referencing back to this thread: https://lists.apache.org/thread.html/caa8k_flvcmx+tyksxdmcxxe9y_zyohe4ovht9f2jb1wckob...@mail.gmail.com On Tue, Feb 7, 2017 at 5:15 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote: > Hello, > > I was auditing Beam for viol

Should you always have a separate PTransform class for a new transform?

2017-02-07 Thread Eugene Kirpichov
Hello, I was auditing Beam for violations of PTransform style guide https://beam.apache.org/contribute/ptransform-style-guide/ and came across another style point that deserves discussion. Look at Count transform: public static Combine.Globally globally() { return Combine.globally(new Cou

Re: How does SideInputHandler work?

2017-02-07 Thread Shen Li
Hi Kenn, Thanks for explaining. What if a View directly follows a Create transform: https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ViewTest.java#L198 Does the runner need to implement the "Combine" behavior inside the View translator? Th

Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-02-07 Thread Eugene Kirpichov
Hello, I'm almost done adding support for Splittable DoFn http://s.apache.org/splittable-do-fn to Dataflow streaming runner*, and very excited about that. There's only 1 PR remaining, plus enabling some tests. * (batch runner is much harder because it's

Re: [DISCUSS] Beam data plane serialization tech

2017-02-07 Thread Kenneth Knowles
This has lain dormant as I was drawn off to other things. But now I'm looping back on this so there are no surprises in my upcoming (third) revision to PR #662 [1] to use protocol buffers instead of JSON schema or Avro (the two prior versions - now I know what the runner API looks like in every for

Re: How does SideInputHandler work?

2017-02-07 Thread Kenneth Knowles
I have a couple of answers inline. Some others may have more to say, or corrections for what I have said. On Tue, Feb 7, 2017 at 12:34 PM, Shen Li wrote: > Hi, > > I am trying to understand how the SideInputHandler works. It seems that > the SideInputHandler#addSideInputValue method overwrite

Re: Let's make Beam transforms comply with PTransform Style Guide

2017-02-07 Thread Eugene Kirpichov
Hey all, I bit the bullet and audited all PTransform classes in Beam Java SDK and filed JIRA issues for all violations I could find. I linked them all to the master JIRA issue https://issues.apache.org/jira/browse/BEAM-1353. In general, all of these should be fixed before declaring Beam stable API

How does SideInputHandler work?

2017-02-07 Thread Shen Li
Hi, I am trying to understand how the SideInputHandler works. It seems that the SideInputHandler#addSideInputValue method overwrites the ValueStates of all windows associated with the input WindowedValue (i.e., discards any existing side input states): https://github.com/apache/beam/blob/mas

Re: PTransform style guide PR

2017-02-07 Thread Aviem Zur
Very well written. Examples for every concept make it very easily relatable and understandable. On Tue, Jan 31, 2017 at 3:52 AM Eugene Kirpichov wrote: > I don't think I'll have capacity to review every PR that brings particular > Beam transforms in accordance with the style guide - but I'm happ

Re: Build failed in Jenkins: beam_PostCommit_Java_RunnableOnService_Apex #429

2017-02-07 Thread Kenneth Knowles
Trouble downloading os-maven-plugin. On Tue, Feb 7, 2017 at 9:00 AM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See RunnableOnService_Apex/429/changes> > > Changes: > > [tgroh] Check that Elements, Timers have permitted Times

Re: Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #2561

2017-02-07 Thread Kenneth Knowles
Seems we've got steady breakage; looks Maven-central related but I've only looked in on a couple of logs. Filed https://issues.apache.org/jira/browse/BEAM-1412, unassigned for now. On Tue, Feb 7, 2017 at 9:20 AM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See

Re: Report to the Board, February 2017 edition

2017-02-07 Thread Jean-Baptiste Onofré
Hi It looks good to me. Thanks Davor Regards JB On Feb 6, 2017, 19:32, at 19:32, Davor Bonaci wrote: >We are expected to submit a project report to the ASF Board of >Directors >ahead of its next meeting. The report is due on Wednesday, 2/8. > >This is the second in the series of three monthly r