Re: DoFn Reuse

2016-06-08 Thread Aljoscha Krettek
Ahh, what we could do is artificially induce bundles using either count or processing time or both. Just so that finishBundle() is called once in a while. On Wed, 8 Jun 2016 at 17:12 Aljoscha Krettek <aljos...@apache.org> wrote: > Pretty sure, yes. The Iterable in a MapPartitionFuncti

Re: DoFn Reuse

2016-06-08 Thread Aljoscha Krettek
s around it to > explain the pitfalls of doing this. > - Bobby > > On Wednesday, June 8, 2016 4:24 AM, Aljoscha Krettek < > aljos...@apache.org> wrote: > > > Hi, > a quick related question: In the Flink runner we basically see everything > as one big bundle, i.e.

Re: 0.1.0-incubating release

2016-06-07 Thread Aljoscha Krettek
By the way, is there any document where we keep track of what checks to run for a release? Maybe I missed something, there. On Tue, 7 Jun 2016 at 21:29 Jean-Baptiste Onofré wrote: > Just submitted: https://github.com/apache/incubator-beam/pull/428 > > to fix the src

Re: [VOTE] Release version 0.1.0-incubating

2016-06-09 Thread Aljoscha Krettek
+1 (binding) I ran "mvn clean verify" on the source package, executed WordCount using the FlinkPipelineRunner. NOTICE, LICENSE and DISCLAIMER also look good On Thu, 9 Jun 2016 at 18:50 Dan Halperin wrote: > +1 (binding) > > per checklist 2.1, I decompressed the

Re: Testing and the Capability Matrix

2016-06-14 Thread Aljoscha Krettek
@Thomas Completely agree, this is also how it is currently handled in the Flink runner. I was talking about the presentation of the compatibility matrix on the web site, whether we should have separate columns for Flink Stream/Batch and Spark Stream/Batch. (And possibly other runners in the

Re: [NOTICE] Change on Filter

2016-06-17 Thread Aljoscha Krettek
There has been an issue about this for a while now: https://issues.apache.org/jira/browse/BEAM-234 On Fri, 17 Jun 2016 at 09:55 Jean-Baptiste Onofré wrote: > Hi Ismaël, > > I didn't talk a change between Dataflow SDK and Beam, I'm talking about > a change between two Beam

Re: [DISCUSS] Beam data plane serialization tech

2016-06-17 Thread Aljoscha Krettek
Hi, am I correct in assuming that the transmitted envelopes would mostly contain coder-serialized values? If so, wouldn't the header of an envelope just be the number of contained bytes and number of values? I'm probably missing something but with these assumptions I don't see the benefit of using

Re: [DISCUSS] PTransform.named vs. named apply

2016-06-23 Thread Aljoscha Krettek
±1 for the named apply On Thu, Jun 23, 2016, 07:07 Robert Bradshaw wrote: > +1, I think it makes more sense to name the application of a transform > rather than the transform itself. (Still mulling on how best to do > this with Python...) > > On Wed, Jun 22, 2016 at

Re: [DISCUSS] Beam data plane serialization tech

2016-06-27 Thread Aljoscha Krettek
, not sure where's the benefit in having a > > > > "schematic" serialization. > > > > > > > > I know that Spark and I think Flink as well, use Kryo > > > > <https://github.com/EsotericSoftware/kryo> for serialization (to be > > > >

Re: Improvements to issue/version tracking

2016-06-28 Thread Aljoscha Krettek
+1 The release view and especially the automatic generation of release notes should come in quite handy. On Tue, 28 Jun 2016 at 01:01 Davor Bonaci wrote: > Hi everyone, > I'd like to propose a simple change in Beam JIRA that will hopefully > improve our issue and

Re: Scala DSL

2016-06-26 Thread Aljoscha Krettek
I'm also in favor of branding it a DSL rather than an SDK. Mostly because it uses the Java SDK and because it does not (necessarily) follow/implement the Beam model. As the Java SDK does and what the Python SDK is apparently going for. On Sat, 25 Jun 2016 at 10:04 Amit Sela

Re: Apache Beam blog

2016-02-12 Thread Aljoscha Krettek
+1 > On 12 Feb 2016, at 18:58, Tyler Akidau wrote: > > +1 > > -Tyler > > On Fri, Feb 12, 2016 at 9:57 AM Amit Sela wrote: > >> +1 >> >> I think we could also publish user's use-case examples and stories. "How we >> are using Beam" or something like

Re: PROPOSAL: Apache Beam (virtual) meeting: 05/11/2016 08:00 - 11:00 Pacific time

2016-04-13 Thread Aljoscha Krettek
Either works for me. On Tue, 12 Apr 2016 at 22:29 Kenneth Knowles wrote: > Either works for me. Thanks James! > > On Tue, Apr 12, 2016 at 11:31 AM, Amit Sela wrote: > > > Anytime works for me. > > > > On Tue, Apr 12, 2016, 21:24 Jean-Baptiste

Re: [DISCUSS] Beam IO native IO

2016-04-28 Thread Aljoscha Krettek
+1 I agree with what Robert said and Davor laid out in more detail. Portability is one of the primary concerns of Beam. On Thu, 28 Apr 2016 at 18:27 Davor Bonaci wrote: > Generally speaking, the SDKs define all user APIs, including all IOs. We > should strive that

Re: [DISCUSS] Adding Some Sort of SideInputRunner

2016-04-28 Thread Aljoscha Krettek
y. Cheers, Aljoscha On Thu, 28 Apr 2016 at 18:55 Kenneth Knowles <k...@google.com.invalid> wrote: > On Thu, Apr 28, 2016 at 1:26 AM, Aljoscha Krettek <aljos...@apache.org> > wrote: > > > Bump. > > > > I'm afraid this might have gotten lost during the conferen

Failing Jenkins Runs

2016-05-19 Thread Aljoscha Krettek
Hi, on all of the recent PRs Jenkins fails with this message: https://builds.apache.org/job/beam_PreCommit_MavenVerify/1213/console Does anyone have an idea what might be going on? Also, where is Jenkins configured? With this I could take a look myself. -Aljoscha

Re: Dynamic work rebalancing for Beam

2016-05-19 Thread Aljoscha Krettek
Interesting read, thanks for the link! On Thu, 19 May 2016 at 07:09 Dan Halperin wrote: > Hey folks, > > This morning, my colleagues Eugene & Malo posted *No shard left behind: > dynamic work rebalancing in Google Cloud Dataflow > < >

Re: [PROPOSAL] Writing More Expressive Beam Tests

2016-05-24 Thread Aljoscha Krettek
e system), so runners can implement it > independently. > > On Mon, May 16, 2016 at 9:00 AM, Aljoscha Krettek <aljos...@apache.org> > wrote: > > > Hi, > > sorry for resurrecting such an old thread but are there already thoughts > on > > how the quiescence h

Re: add component tag to pull request title / commit comment

2016-05-11 Thread Aljoscha Krettek
This will, however, also take precious space in the Commit Title. And some commits might not be about only one clear-cut component. On Wed, 11 May 2016 at 11:43 Maximilian Michels wrote: > +1 I think it makes it easier to see at a glance to which part of Beam > a commit

Re: [DISCUSS] Adding Some Sort of SideInputRunner

2016-05-03 Thread Aljoscha Krettek
and publish the design doc for that, and I want > everyone to have access to it for any discussion. > > Does this help? > > On Tue, May 3, 2016 at 1:58 AM, Aljoscha Krettek <aljos...@apache.org> > wrote: > > > I'm afraid I have yet another question. What's the

Re: [DISCUSS] Adding Some Sort of SideInputRunner

2016-04-28 Thread Aljoscha Krettek
Bump. I'm afraid this might have gotten lost during the conferences/summits. On Thu, 21 Apr 2016 at 13:30 Aljoscha Krettek <aljos...@apache.org> wrote: > Ok, I'll try and start such a design. Before I can start, I have a few > questions about how the side inputs actually

Re: [PROPOSAL] A brand new DoFn

2016-07-28 Thread Aljoscha Krettek
+1 At first I liked the API but was skeptical because I though that this would require reflective invocation. Then I read on and saw that code generation is used and was convinced. :-) I especially like how it both cleans up the API and allows more optimizations in the future, especially with

Re: [RESULT] Release version 0.2.0-incubating

2016-07-31 Thread Aljoscha Krettek
ul 31, 2016 at 12:29 PM, Dan Halperin <dhalp...@google.com> > wrote: > > > I'm happy to announce that we have unanimously approved this release. > > > > There are 3 binding approving votes: > > * Dan Halperin > > * Jean-Baptiste Onofré > > * Amit Sela >

Re: Beam/Flink : State access

2016-07-26 Thread Aljoscha Krettek
Hi, the purpose of Beam is to abstract the user from the underlying execution engine. IMHO, allowing access to state of the underlying execution engine will never be a goal for the Beam project. If you want/need to access Flink state, I think this is a good indicator that you should use Flink

Re: Help understand how Flink Runner translate triggering information

2016-07-25 Thread Aljoscha Krettek
Hi, for that you would have to look at how Combine.PerKey and GroupByKey are translated. We use a GroupAlsoByWindowViaWindowSetDoFn that internally uses a ReduceFnRunner to manage all the windowing. The windowing strategy as well as the SystemReduceFn is passed to

Re: [VOTE] Release version 0.2.0-incubating

2016-07-31 Thread Aljoscha Krettek
Just a quick note, these two were not fixed for 0.2.0: - [BEAM-478 ] - Create Vector, Matrix types and operations to enable linear algebra API - [BEAM-322 ] - Compare encoded keys in streaming mode I

Re: [VOTE] Release version 0.2.0-incubating

2016-07-31 Thread Aljoscha Krettek
ot; - source release contains no binaries - sources files have license headers (I'm relying on the rat plugin here, though) (As I said above I fixed the "fixed version" tag on two issues.) On Sun, 31 Jul 2016 at 08:22 Aljoscha Krettek <aljos...@apache.org> wrote: > Just

Re: [PROPOSAL] Pipeline Runner API design doc

2016-08-02 Thread Aljoscha Krettek
Hi, thanks for putting this together. Now that I'm seeing them side by side I think the Avro schema looks a lot nicer than the JSON schema but it's probably alright since we don't want to change this often (as you already said). The advantage of JSON is that the (intermediate) plans can easily be

Re: [PROPOSAL] CoGBK as primitive transform instead of GBK

2016-07-21 Thread Aljoscha Krettek
+1 Out of curiosity, does Cloud Dataflow have a CoGBK primitive or will it also be executed as a GBK there? On Thu, 21 Jul 2016 at 02:29 Kam Kasravi wrote: > +1 - awesome Manu. > > On Wednesday, July 20, 2016 1:53 PM, Kenneth Knowles > wrote:

Re: Adding DoFn Setup and Teardown methods

2016-07-18 Thread Aljoscha Krettek
Did you mean "usual" or "useful"? ;-) On Mon, 18 Jul 2016 at 12:42 Maximilian Michels <m...@apache.org> wrote: > +1 for setup() and teardown() methods. Very usual for proper initialization > and cleanup of DoFn related data structures. > > On Wed, Jun 29, 20

Re: Sliding-Windowed PCollectionView as SideInput

2016-06-27 Thread Aljoscha Krettek
Hi, the WindowFn is responsible for mapping from main-input window to side-input window. Have a look at WindowFn.getSideInputWindow(). For SlidingWindows this takes the last possible sliding window as the side-input window. Cheers, Aljoscha On Sun, 26 Jun 2016 at 22:30 Shen Li

Re: [PROPOSAL] Splittable DoFn - Replacing the Source API with non-monolithic element processing in DoFn

2016-08-05 Thread Aljoscha Krettek
I really like the proposal, especially how it unifies at lot of things. One question: How would this work with sources that (right now) return true from UnboundedSource.requiresDeduping(). As I understand it the code that executes such sources has to do bookkeeping to ensure that we don't get

Re: [PROPOSAL] Website page or Jira to host all current proposal discussion and docs

2016-08-08 Thread Aljoscha Krettek
Please have a look at this: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals We recently started using this process in Flink and so far are quite happy with it. On Mon, 8 Aug 2016 at 06:52 Jean-Baptiste Onofré wrote: > Good point Ben. > > I would

Re: [PROPOSAL] Splittable DoFn - Replacing the Source API with non-monolithic element processing in DoFn

2016-08-08 Thread Aljoscha Krettek
a canned deduping > transform. Does this address your question? > > Thanks! > > On Fri, Aug 5, 2016 at 7:14 AM Aljoscha Krettek <aljos...@apache.org> > wrote: > > > I really like the proposal, especially how it unifies at lot of things. > > > > On

Re: [PROPOSAL] State and Timers for DoFn (aka per-key workflows)

2016-07-29 Thread Aljoscha Krettek
+1 Very nice proposal and the API already looks very good. I guess the only thing people still like to discuss on this is naming of things. :-) I just have one general remark about giving users access to state and timers. The Beam model takes great care to mostly shield users from the reality of

Re: [DISCUSS] Beam data plane serialization tech

2016-06-29 Thread Aljoscha Krettek
My bad, I didn't know that. Thanks for the clarification! On Wed, 29 Jun 2016 at 16:38 Daniel Kulp <dk...@apache.org> wrote: > > > On Jun 27, 2016, at 10:24 AM, Aljoscha Krettek <aljos...@apache.org> > wrote: > > > > Out of the systems you suggested Thri

Re: Adding DoFn Setup and Teardown methods

2016-06-29 Thread Aljoscha Krettek
+1 I think some people might already mistake the startBundle()/finishBundle() methods for what the new methods are supposed to be On Tue, 28 Jun 2016 at 19:38 Raghu Angadi wrote: > This is terrific! > Thanks for the proposal. > > On Tue, Jun 28, 2016 at 9:06 AM,

Re: Display Data Runner Support

2016-07-04 Thread Aljoscha Krettek
Thanks Scott for this compilation of information! I'll look into how this can be incorporated into the Flink runner once I have some time on my hands. On Thu, 30 Jun 2016 at 17:05 Scott Wegner wrote: > Hi Beam Dev community, > > I wanted to circle-back on a recent

Re: [PROPOSAL] Splittable DoFn - Replacing the Source API with non-monolithic element processing in DoFn

2016-08-21 Thread Aljoscha Krettek
te associated with > > K1 will be GC'd. > > > > So basically it's almost like cooperative thread scheduling: things run > for > > a while, until the runner tells them to checkpoint, then they set a timer > > to resume themselves, and the runner fires the timers, and the pr

Re: [PROPOSAL] Splittable DoFn - Replacing the Source API with non-monolithic element processing in DoFn

2016-08-20 Thread Aljoscha Krettek
Hi, I have another question that I think wasn't addressed in the meeting. At least it wasn't mentioned in the notes. In the context of replacing sources by a combination of to SDFs, how do you determine how many "SDF executor" instances you need downstream? For the sake of argument assume that

Re: Proposal: Dynamic PIpelineOptions

2016-08-05 Thread Aljoscha Krettek
+1 It's true that Flink provides a way to pass dynamic parameters to operator instances. That's not used in any of the built-in sources and operators, however. They are instantiated with their parameters when the graph is constructed. So what you are suggesting for Beam would actually provide

Re: KafkaIO Windowing Fn

2016-09-01 Thread Aljoscha Krettek
ansformTreeNode.java:225)* > > * at > > org.apache.beam.sdk.runners.TransformTreeNode.visit( > > TransformTreeNode.java:220)* > > * at > > org.apache.beam.sdk.runners.TransformTreeNode.visit( > > TransformTreeNode.java:220)* > > * a* > > > > Regards > > Sum

Re: KafkaIO Windowing Fn

2016-09-02 Thread Aljoscha Krettek
m not sure whats actually triggering the window firing here. ( does not > look like to be 30 sec trigger) > > > > Regards > Sumit Chawla > > > On Wed, Aug 31, 2016 at 11:14 PM, Aljoscha Krettek <aljos...@apache.org> > wrote: > > > Ah I see, the Flink Ru

Re: KafkaIO Windowing Fn

2016-08-31 Thread Aljoscha Krettek
hen you try to run the Pipeline. > > On Tue, Aug 30, 2016 at 11:24 PM, Chawla,Sumit <sumitkcha...@gmail.com> > wrote: > > > Yes. I added it only for DirectRunner as it cannot translate > > Read(UnboundedSourceOfKafka) > > > > Regards > > Sumit Cha

Re: KafkaIO Windowing Fn

2016-08-31 Thread Aljoscha Krettek
Flink. It have removed business > specific transformations only. > > Regards > Sumit Chawla > > > On Tue, Aug 30, 2016 at 4:49 AM, Aljoscha Krettek <aljos...@apache.org> > wrote: > > > Hi, > > could you maybe also post the complete tha

Re: KafkaIO Windowing Fn

2016-08-30 Thread Aljoscha Krettek
Hi, could you maybe also post the complete that you're using with the FlinkRunner? I could have a look into it. Cheers, Aljoscha On Tue, 30 Aug 2016 at 09:01 Chawla,Sumit wrote: > Hi Thomas > > Sorry i tried with DirectRunner but ran into some kafka issues. Following >

Re: [PROPOSAL] Splittable DoFn - Replacing the Source API with non-monolithic element processing in DoFn

2016-08-30 Thread Aljoscha Krettek
turns. That's the way it is implemented > in my current prototype https://github.com/apache/incubator-beam/pull/896 > (see > SplittableParDo.ProcessFn) > > On Mon, Aug 29, 2016 at 2:55 AM Aljoscha Krettek <aljos...@apache.org> > wrote: > > > Hi, > > I have an

About Finishing Triggers

2016-09-14 Thread Aljoscha Krettek
Hi, I had a chat with Kenn at Flink Forward and he did an off-hand remark about how it might be better if triggers where not allowed to mark a window as finished and instead always be "Repeatedly" (if I understood correctly). Maybe you (Kenn) could go a bit more in depth about what you meant by

Re: Anyone @scale tomorrow?

2016-09-09 Thread Aljoscha Krettek
Cool, thanks for letting us know! On Fri, Sep 9, 2016, 18:45 Dan Halperin wrote: > Hey folks, > > Wanted to let you know that the Beam talk went pretty well. People were > *very* excited about Beam -- loved the idea of not having to rewrite their > pipelines every

Re: Simplifying User-Defined Metrics in Beam

2016-10-06 Thread Aljoscha Krettek
Hi, I'm currently in holidays but I'll put some thought into this and give my comments once I get back. Aljoscha On Wed, Oct 5, 2016, 21:51 Ben Chambers wrote: > To provide some more background I threw together a quick doc outlining my > current thinking for this

Re: Remove legacy import-order?

2016-08-24 Thread Aljoscha Krettek
+1 on the import order +1 on also starting a discussion about enforced formatting On Wed, 24 Aug 2016 at 06:43 Jean-Baptiste Onofré wrote: > Agreed. > > It makes sense for the import order. > > Regards > JB > > On 08/24/2016 02:32 AM, Ben Chambers wrote: > > I think

Re: [PROPOSAL] Splittable DoFn - Replacing the Source API with non-monolithic element processing in DoFn

2016-08-29 Thread Aljoscha Krettek
' into the state of K2. > > > Etc. > > > If partition 1 goes away, the processElement call will return "do not > > > resume", so a timer will not be set and instead the state associated > with > > > K1 will be GC'd. > > > > > &g

Re: [DISCUSS] Deferring (pre) combine for merging windows.

2016-10-24 Thread Aljoscha Krettek
@Amit: Yes, Flink is more "what you write is what you get". For example, in Flink we have a Fold function for windows which cannot be efficiently computed with merging windows (it would require using a "group by" window and then folding the iterable). We just don't allow this. For Beam, I think

[VOTE] Release 0.3.0-incubating, release candidate #1

2016-10-24 Thread Aljoscha Krettek
Hi Team! Please review and vote at your leisure on release candidate #1 for version 0.3.0-incubating, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) The complete staging area is available for your review, which includes: * JIRA

Re: The Availability of PipelineOptions

2016-10-25 Thread Aljoscha Krettek
+1 This sounds quite straightforward. On Tue, 25 Oct 2016 at 01:36 Thomas Groh wrote: > Hey everyone, > > I've been working on a declaration of intent for how we want to use > PipelineOptions and an API change to be consistent with that intent. This > is generally part

Re: [VOTE] Release 0.3.0-incubating, release candidate #1

2016-10-25 Thread Aljoscha Krettek
ndom in the test suite. > > >* How did you generated the checksums? Because both SHA1/MD5 can't be > > >automatically checked because "no properly formatted SHA1/MD5 checksum > > >lines found". > > > > > >Great to see the project moving forward at

Re: [DISCUSS] Current ongoing work on runners

2016-10-25 Thread Aljoscha Krettek
I think we might need to update the capability matrix with some of the new features that have popped up. Immediate things that come to mind are: * Timer/State API for user DoFns (coupled with new-style DoFn) (not yet completely in master) * SplittableDoFn This would allow tracking the process

Re: [ANNOUNCEMENT] New committers!

2016-10-22 Thread Aljoscha Krettek
Welcome everyone! +3 :-) On Sat, 22 Oct 2016 at 06:43 Jean-Baptiste Onofré wrote: > Just a small thing. > > If it's not already done, don't forget to sign a ICLA and let us know > your apache ID. > > Thanks, > Regards > JB > > On 10/22/2016 12:18 AM, Davor Bonaci wrote: > >

Re: Tracking backward-incompatible changes for Beam

2016-10-22 Thread Aljoscha Krettek
Very good idea! Should we already start thinking about automatic tests for backwards compatibility of the API? On Fri, 21 Oct 2016 at 10:56 Jean-Baptiste Onofré wrote: > Hi Dan, > > +1, good idea. > > Regards > JB > > On 10/21/2016 02:21 AM, Dan Halperin wrote: > > Hey

Re: Maven Release Plugin Does Not Update Version of Archetypes

2016-10-24 Thread Aljoscha Krettek
he version in the > > 0.3.0-release branch? > > > > On Mon, Oct 24, 2016 at 11:09 AM, Dan Halperin <dhalp...@google.com> > > wrote: > > > >> Correct issue link: https://issues.apache.org/jira/browse/BEAM-806 > >> > >> No answers, but l

Re: Start of release 0.3.0-incubating

2016-10-21 Thread Aljoscha Krettek
t; >> > >> Thanks Aljosha !! > >> > >> Do you mind to wait the week end or Monday to start the release ? I > would > >> like to include MqttIO if possible. > >> > >> Thanks ! > >> Regards > >> JB > >> >

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Aljoscha Krettek
+1 I'm quite enthusiastic about the growth of the community and the open discussions! On Tue, 22 Nov 2016 at 19:51 Jason Kuster wrote: > An enthusiastic +1! > > In particular it's been really great to see the commitment and interest of > the community in

Re: Flink runner. Wrapper for DoFn

2016-11-19 Thread Aljoscha Krettek
topics) > .withValueCoder(StringUtf8Coder.of()).withoutMetadata()) > .apply("startBundle", ParDo.of( new DoFn<KV<byte[], String>, KV<String, > String>>() { > Amir- > > From: Aljoscha Krettek <aljos...@apache.org> > To: amir

Fwd: Jenkins build became unstable: beam_PostCommit_RunnableOnService_FlinkLocal #813

2016-11-11 Thread Aljoscha Krettek
This looks like it was introduced by the new commit mentioned here but it's actually caused by a pre-existing (unknown) bug in the Flink runner: https://issues.apache.org/jira/browse/BEAM-965. I also have a fix ready. Btw, how should we deal with these messages from Jenkins? I'm writing to the

Re: Hosting data stores for IO Transform testing

2016-11-21 Thread Aljoscha Krettek
Hi Stephen, I really like your proposal! I don't have any comments because this seems very well "researched" already. I'm hoping others will also have a look at this as well because "real" integration testing provides a new level of confidence in the code, IMHO. Cheers, Aljoscha On Wed, 16 Nov

Re: Flink runner. Wrapper for DoFn

2016-11-21 Thread Aljoscha Krettek
t flush() on every element, because now it's bottleneck for me? Thanks, Alexey Diomin 2016-11-19 11:59 GMT+04:00 Aljoscha Krettek <aljos...@apache.org>: @amir, what do you mean? Naming a ParDo "startBundle" is not the same thing as having a @StartBundle or startBundle() (for OldDoFn)

Re: Release Guide

2016-10-20 Thread Aljoscha Krettek
Hi, thanks for taking the time and writing this extensive doc! If no-one is against this I would like to be the release manager for the next (0.3.0-incubating) release. I would work with the guide and update it with anything that I learn along the way. Should I open a new thread for this or is it

Re: Start of release 0.3.0-incubating

2016-10-26 Thread Aljoscha Krettek
ature, and then hold it again for some other new feature. > >>> > >>> Can you make a strong argument for why MQTT in particular should be > >>> release > >>> blocking? > >>> > >>> Dan > >>> > >>> O

[VOTE] Apache Beam release 0.3.0-incubating

2016-10-28 Thread Aljoscha Krettek
Hi everyone, Please review and vote on the release candidate #1 for the Apache Beam version 0.3.0-incubating, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) The complete staging area is available for your review, which includes: *

[RESULT] [VOTE] Release 0.3.0-incubating, release candidate #1

2016-10-28 Thread Aljoscha Krettek
voting. Thanks everyone! On Fri, 28 Oct 2016 at 09:09 Aljoscha Krettek <aljos...@apache.org> wrote: > The voting time has elapsed. I'm hereby closing this vote and will tally > the results in a separate thread. > > On Thu, 27 Oct 2016 at 17:38 Neelesh Salian <nsal...@cloude

Re: Simplifying User-Defined Metrics in Beam

2016-10-13 Thread Aljoscha Krettek
I finally found the time to have a look. :-) The API looks very good! (It's very similar to an API we recently added to Flink, which is inspired by the same Codahale/Dropwizard metrics). About the semantics, the "A", "B" and "C" you mention in the doc: doesn't this mean that we have to keep the

Re: [KUDOS] Contributed runner: Apache Apex!

2016-10-17 Thread Aljoscha Krettek
Congrats! :-) On Mon, 17 Oct 2016 at 18:55 Kenneth Knowles wrote: > *I would like to :-) > > On Mon, Oct 17, 2016 at 9:51 AM Kenneth Knowles wrote: > > > Hi all, > > > > I would to, once again, call attention to a great addition to Beam: a > > runner

Re: Flink runner. Optimization for sideOutput with tags

2016-12-06 Thread Aljoscha Krettek
euse, but we never use object after > > collect. > > In some cases we need more performance and serialization on every > > transformation very expensive, > > but try merge all business logic in one DoFn it to make processing > > unsupportable. > > > > >>

Re: [DISCUSS] [BEAM-438] Rename one of PTransform.apply or PInput.apply

2016-12-07 Thread Aljoscha Krettek
+1 I've seen this mistake myself in some PRs. On Thu, 8 Dec 2016 at 06:10 Ben Chambers wrote: > +1 -- This seems like the best option. It's a mechanical change, and the > compiler will let users know it needs to be made. It will make the mistake > much less

Re: Meet up at Strata+Hadoop World in Singapore

2016-11-29 Thread Aljoscha Krettek
Hi, I'll also be there to give a talk (and also at the Beam tutorial). Cheers, Aljoscha On Wed, Nov 30, 2016, 00:51 Dan Halperin wrote: > Hey folks, > > Who will be attending Strata+Hadoop World next week in Singapore? Tyler and > I will be there, giving a Beam tutorial

Re: How to create a Pipeline with Cycles

2016-11-30 Thread Aljoscha Krettek
Hi, there is support for cycles in Flink but the Yahoo benchmark is not making use of that feature, if I'm not completely mistaken. Cheers, Aljoscha On Wed, 30 Nov 2016 at 09:57 Ismaël Mejía wrote: > Hello, > > Shen you should probably first check the benchmark

Re: How to create a Pipeline with Cycles

2016-11-30 Thread Aljoscha Krettek
b/master/flink-benchmarks/src/main/java/flink/benchmark/AdvertisingTopologyNative.java > > My excuses again, > Ismael > > On Wed, Nov 30, 2016 at 11:30 AM, Aljoscha Krettek <aljos...@apache.org> > wrote: > > > Hi, > > there is support for cycles in Flink but the Yah

Re: Increase stream parallelism after reading from UnboundedSource

2016-12-05 Thread Aljoscha Krettek
Hi, I can only speak for Flink, there you usually fan-out/parallelise the stream after a non-parallel source. Cheers, Aljoscha On Mon, 5 Dec 2016 at 15:48 Amit Sela wrote: > Hi all, > > I have a general question about how stream-processing frameworks/engines > usually

Re: [VOTE] Release 0.4.0-incubating, release candidate #3

2016-12-20 Thread Aljoscha Krettek
+1 - verified signatures - ran Quickstart using the staging repository on Flink cluster - verified build form source (We'll probably do a 0.5.0 release shortly after this so we can fix the BigQuery issues.) On Mon, 19 Dec 2016 at 21:22 Dan Halperin wrote: > I