Re: Flink support for OrderedListState

2021-11-16 Thread Aljoscha Krettek
Hi, sorry for the slow response! I was on vacation and only saw this now. I can't add much more than what David already said: right now, it's not possible to do an *efficient* implementation on Flink. The closest Flink state type would be MapState, which keeps state in multiple underlying

Re: Beam Summit Europe: speakers and schedule online!

2019-05-24 Thread Aljoscha Krettek
You’re both right. The Kulturbrauerei Area is between Knaackstraße and Schönhauser Allee and there’s entrances on multiple sides. Schönhauser Alee is the more prominent street, though. Btw, I live on Schönhauser Allee. :-) > On 24. May 2019, at 13:48, Suneel Marthi wrote: > > Kulturbraurei

Re: DISCUSS: Sorted MapState API

2019-05-24 Thread Aljoscha Krettek
This is quite interesting! The Flink Table API (relational and SQL) has an implementation for the type of join you mention in the example. We call it Temporal Table Join, and it works on something we call Temporal Tables:

Re: Definition of Unified model (WAS: Semantics of PCollection.isBounded)

2019-05-16 Thread Aljoscha Krettek
> function properly on batch engines. Which is what I observe - there is non > determinism on batch pipeline although everything seems to be "well defined", > elements arrive arbitrarily out of order and are arbitrarily out of order > dropped. This leads to different results

Re: Semantics of PCollection.isBounded

2019-05-16 Thread Aljoscha Krettek
Please take this with a grain of salt, because I might be a bit rusty on this. I think the Beam model does not prescribe any ordering (by time or otherwise) on inputs. Mostly because always requiring it would be prohibitively expensive on most Runners, especially global sorting. If you want to

Re: Beam application upgrade on Flink crashes

2018-08-22 Thread Aljoscha Krettek
Hi, Unfortunately, there are currently no compatibility guarantees between different Beam versions. Beam itself doesn't have the required interfaces or procedures in place for supporting backwards compatibility of state and there have been quite some changes in the internals between Flink 1.4

Re: Kafka connector for Beam Python SDK

2018-04-30 Thread Aljoscha Krettek
Is this what we want to do in the long run, i.e. implement copies of connectors for different SDKs? I thought the plan was to enable using connectors written in different languages, i.e. use the Java Kafka I/O from python. This way we wouldn't duplicate bugs for three different language (Java,

Re: Add a (temporary) Portable Flink branch to the ASF repo?

2018-04-18 Thread Aljoscha Krettek
repo for > collaboration instead of personal github accounts seems like a worthy goal. > > >>> On Thu, Apr 12, 2018 at 4:21 PM Robert Bradshaw <rober...@google.com > >>> <mailto:rober...@google.com>> > wrote: > > >>>> I suppose with the hackath

Re: [Go SDK] Proposal: Set up a Vanity Import Path

2018-04-18 Thread Aljoscha Krettek
+1 this sounds super reasonable > On 17. Apr 2018, at 20:11, Kenneth Knowles wrote: > > This seems like a valuable layer of indirection to establish. The mechanisms > are pretty esoteric, but I trust Gophers to know the best way to do it. > Commented just a smidgin on the

Re: Add a (temporary) Portable Flink branch to the ASF repo?

2018-04-12 Thread Aljoscha Krettek
I would also be in favour of adding a branch to our main repo. A random branch on some personal GitHub account can seem a bit sketchy and adding a branch to our repo could make it more visible for people that are interested. > On 12. Apr 2018, at 15:29, Ben Sidhom wrote: >

Re: Gradle migration fixit: April 3

2018-03-31 Thread Aljoscha Krettek
Thanks Luke, that was helpful! I've been playing around with gradle a bit and now have a question: By default the shadow plugin will include dependencies that are "compile" or "runtime" (compile dependencies are by default also runtime dependencies) into the shaded jar. Is that correct? The

Re: Flink Runner display transform bug

2018-03-22 Thread Aljoscha Krettek
I think this might be a problem in how we set display names? Would you mind opening a Jira issue for that? > On 22. Mar 2018, at 04:27, Alexey Diomin wrote: > > Hi > > I have this display in 2.3.0 and 2.4.0 versions. > Main problem that RawParDo doesn't provide correctly

Re: (java) stream & beam?

2018-03-13 Thread Aljoscha Krettek
https://beam.apache.org/blog/2016/05/27/where-is-my-pcollection-dot-map.html > On 11. Mar 2018, at 22:21, Romain Manni-Bucau wrote: > > > > Le 12 mars 2018 00:16, "Reuven Lax" > a écrit : > I think it would be interesting to

Re: Portable Flink Runner plan

2018-03-07 Thread Aljoscha Krettek
@Axel I assigned https://issues.apache.org/jira/browse/BEAM-2588 <https://issues.apache.org/jira/browse/BEAM-2588> to you. It might make sense to also grab other issues that you're already working on. > On 7. Mar 2018, at 21:18, Aljoscha Krettek <aljos...@apache.org> wrote: >

Re: Portable Flink Runner plan

2018-03-07 Thread Aljoscha Krettek
gt; composed and reused with other runner implementations. Thomas and Axel have > more context around that. > > > On Wed, Mar 7, 2018 at 8:47 AM Aljoscha Krettek <aljos...@apache.org > <mailto:aljos...@apache.org>> wrote: > Hi, > > Has anyone started on http

Re: Portable Flink Runner plan

2018-03-07 Thread Aljoscha Krettek
Hi, Has anyone started on https://issues.apache.org/jira/browse/BEAM-2588 (FlinkRunner shim for serving Job API). If not I would start on that. My plan is to implement a FlinkJobService that implements JobServiceImplBase, similar to

Re: Code reviews in Beam

2018-02-20 Thread Aljoscha Krettek
This is excellent! I can't really add anything right now but I think having a PR dashboard is one of the most important points because it also indirectly solves "Review Latency" and "Code Review Response SLA" by making things more visible. -- Aljoscha > On 19. Feb 2018, at 19:32, Reuven Lax

Re: [INFO] Gradle build is flaky on Jenkins

2018-02-09 Thread Aljoscha Krettek
Yes, I was just about to write about this as well. In my recent PRs this always failed for different reasons. Thanks for looking into this! > On 9. Feb 2018, at 11:35, Jean-Baptiste Onofré wrote: > > Hi guys, > > I noticed that the Gradle build on Jenkins is flaky: it

Re: [CANCEL][VOTE] Release 2.3.0, release candidate #1

2018-02-05 Thread Aljoscha Krettek
I merged fixes for: - https://issues.apache.org/jira/browse/BEAM-3186 - https://issues.apache.org/jira/browse/BEAM-3589 @JB I didn't yet merge them on the 2.3.0 branch, though, but I can or you

Re: [VOTE] Release 2.3.0, release candidate #1

2018-02-01 Thread Aljoscha Krettek
-1 I think the issue discovered with unbounded sources on Flink Streaming Runner is a serious regression. Good news is that there is already a fix for that: https://github.com/apache/beam/pull/4558/files And BEAM-3587 also seems serious enough, IMHO. Btw, BEAM-3186, which seems quite serious,

Re: drop scala....version from artifact ;)

2018-02-01 Thread Aljoscha Krettek
ka IO, ...) > > Thoughts ? > > I'm OK to stay on 1 for now. > > Regards > JB > > On 02/01/2018 02:45 PM, Aljoscha Krettek wrote: >> I think it's not wise to remove the Scala suffix. When using the Flink >> Runner you have to make sure that the Scala ver

Re: drop scala....version from artifact ;)

2018-02-01 Thread Aljoscha Krettek
I think it's not wise to remove the Scala suffix. When using the Flink Runner you have to make sure that the Scala version matches the Scala version of the Flink Cluster. And I think comparing the suffix of your flink-runner dependency and the suffix of your Flink dist is an easy way of doing

Re: KafkaIO reading from latest offset when pipeline fails on FlinkRunner

2018-01-04 Thread Aljoscha Krettek
What is the exact behaviour you're seeing. It's not 100 % clear to me from the initial message. Best, Aljoscha > On 4. Jan 2018, at 12:56, Sushil Ks wrote: > > *bump* > > > On Dec 15, 2017 11:22 PM, "Lukasz Cwik" > wrote: >

Re: A personal update

2017-12-13 Thread Aljoscha Krettek
Welcome back! :-) > On 13. Dec 2017, at 15:42, Ismaël Mejía wrote: > > Hello Davor, great to know you are going to continue contributing to > the project. Welcome back and best of wishes for this new phase ! > > On Wed, Dec 13, 2017 at 3:12 PM, Kenneth Knowles

Re: [DISCUSS] Updating contribution guide for gitbox

2017-11-29 Thread Aljoscha Krettek
I think I agree with Kenn on the "merge question": - There should be a merge commit because this records important information, for example, I like having the option of figuring out what PR certain commits came from - Individual meaningful commits of a PR should be preserved, I think having

Re: [VOTE] Use Gradle for Apache Beam developmental processes

2017-11-29 Thread Aljoscha Krettek
+1 I agree with JB on the process but I think overall using Gradle will bring only benefits. > On 29. Nov 2017, at 09:44, Jean-Baptiste Onofré wrote: > > -0 > > It's not for the change itself (gradle build is interesting) but for the > process used here, which, IMHO, is

Re: [VOTE] Fixing @yyy.com.INVALID mailing addresses

2017-11-24 Thread Aljoscha Krettek
+1 > On 23. Nov 2017, at 23:22, Manu Zhang wrote: > > +1 > > On Thu, Nov 23, 2017 at 11:32 PM Maximilian Michels wrote: > >> +1 >> >> Thanks for looking into it! >> >> On 23.11.17 00:25, Lukasz Cwik wrote: >>> I have noticed that some e-mail

Re: [PROPOSAL] "Requires deterministic input"

2017-11-15 Thread Aljoscha Krettek
+1 > On 15. Nov 2017, at 14:07, Jean-Baptiste Onofré wrote: > > Agree ! > > Thanks Kenn, > Regards > JB > > On 11/15/2017 02:05 PM, Kenneth Knowles wrote: >> Reviving this again, since it came up again today in yet another context. I >> think it is time to add this as an

Re: [DISCUSS] Dealing with differing Scala versions in Runner dependencies

2017-11-07 Thread Aljoscha Krettek
it and by request > only, or multi-step in order to avoid running lots of extra tests, etc. > > Do you think you might have time to work on this goal of splitting apart > jobs that require splitting? > > > On Wed, Oct 11, 2017 at 2:08 AM, Aljoscha Krettek <aljos...

Re: [VOTE] Switch to new JIRA workflow for pending proposals

2017-11-07 Thread Aljoscha Krettek
+1 (better late than never) > On 3. Nov 2017, at 19:17, Reuven Lax wrote: > > At least three of these votes are from PMC members, so I'll let them know > that this proposal is adopted. > > On Wed, Nov 1, 2017 at 10:00 AM, Chamikara Jayalath < >

Re: [DISCUSS] Dealing with differing Scala versions in Runner dependencies

2017-10-11 Thread Aljoscha Krettek
les in .test-infra and change the --activate-profiles line, right? Are there still manual steps required for "re-seeding" the jobs? > On 9. Oct 2017, at 18:06, Kenneth Knowles <k...@google.com.INVALID> wrote: > > +1 to the goal, and replying inline on details. > > On Mon,

Re: [VOTE] Migrate to gitbox

2017-10-10 Thread Aljoscha Krettek
+1 > On 10. Oct 2017, at 09:42, Jean-Baptiste Onofré wrote: > > Hi all, > > following the discussion, here's the formal vote to migrate to gitbox: > > [ ] +1, Approve to migrate to gitbox > [ ] -1, Do not migrate (please provide specific comments) > > The vote will be open

Re: Unable to find registrar for s3n when restoring flink job from savepoint

2017-09-14 Thread Aljoscha Krettek
I responded on the issue. > On 12. Sep 2017, at 22:25, Lukasz Cwik wrote: > > Filed https://issues.apache.org/jira/browse/BEAM-2948 > > On Tue, Sep 12, 2017 at 2:10 AM, Pawel Bartoszek > wrote: > >> Hi, >> >> I am running a flink v1.2.1

Re: streaming output in just one files

2017-09-10 Thread Aljoscha Krettek
rs/dataflow/DataflowRunner.java#L354> > is the code that does this; it should be quite simple to do something > similar for Flink, and then there will be no need for users to explicitly > call withNumShards themselves. > > On Thu, Aug 10, 2017 at 3:09 AM, Aljoscha Krettek &

Re: Portability JIRA organization

2017-09-08 Thread Aljoscha Krettek
+1 Sounds great! > On 9. Sep 2017, at 00:44, Kenneth Knowles wrote: > > +1 to this. > > It is really easy to lose track of things in a sea of tickets, and > portability touches every SDK and runner, so getting this organized will be > hugely helpful. Especially

Re: [DISCUSS] Capability Matrix revamp

2017-08-28 Thread Aljoscha Krettek
I like where this is going! Regarding benchmarking, I think we could do this if we had common benchmarking infrastructure and pipelines that regularly run on different Runners so that we have up-to-date data. I think we can also have a more technical section where we show stats on the level

Re: ConcurrentModificationException while performing checkpoint for Kinesis stream

2017-08-17 Thread Aljoscha Krettek
Hi Pawel, I'll have a look! Best, Aljoscha > On 16. Aug 2017, at 18:30, Lukasz Cwik wrote: > > Moved to dev@beam.apache.org > > On Wed, Aug 16, 2017 at 9:22 AM, Pawel Bartoszek > wrote: > >> When flink performs a checkpoint I get

Re: [VOTE] Release 2.1.0, release candidate #3

2017-08-17 Thread Aljoscha Krettek
(Belated) +1 * verified signatures * verified that Quickstart works with Flink Runner > On 16. Aug 2017, at 20:41, Robert Bradshaw > wrote: > > +1 binding > > (I've been on vacation as well.) > > On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik

Re: [ANNOUNCEMENT] New PMC members, August 2017 edition!

2017-08-12 Thread Aljoscha Krettek
Congratulations! :-) Best, Aljoscha > On 12. Aug 2017, at 06:39, Robert Bradshaw > wrote: > > Congratulations! > > On Fri, Aug 11, 2017 at 2:23 PM, Jean-Baptiste Onofré > wrote: >> Congrats ! >> >> Regards >> JB >> >> >> On 08/11/2017

Re: [ANNOUNCEMENT] New committers, August 2017 edition!

2017-08-12 Thread Aljoscha Krettek
Congrats, everyone! It's well deserved. Best, Aljoscha > On 12. Aug 2017, at 08:06, Pei HE wrote: > > Congratulations to all! > -- > Pei > > On Sat, Aug 12, 2017 at 10:50 AM, James wrote: > >> Thank you guys, glad to contribute to this great project,

Re: Exactly-once Kafka sink

2017-08-10 Thread Aljoscha Krettek
017, at 11:13, Aljoscha Krettek <aljos...@apache.org> wrote: > > @Raghu: Yes, exactly, that's what I thought about this morning, actually. > These are the methods of an operator that are relevant to checkpointing: > > class FlinkOperator() { > open(); > snapshotStat

Re: [PROPOSAL] "Requires deterministic input"

2017-08-10 Thread Aljoscha Krettek
+1 to the annotation approach. I outlined how implementing this would work in the Flink runner in the Thread about the exactly-once Kafka Sink. > On 9. Aug 2017, at 23:03, Reuven Lax wrote: > > Yes - I don't think we should try and make any deterministic guarantees >

Re: Exactly-once Kafka sink

2017-08-10 Thread Aljoscha Krettek
gt;> Took a look at Flink PR, commented on a few issues I see in comments >> there >>> : https://github.com/apache/flink/pull/4239. May be an extra shuffle or >>> storing all them messages in state can get over those. >>> >>> On Wed, Aug 9, 2017 at 2:

Re: Exactly-once Kafka sink

2017-08-09 Thread Aljoscha Krettek
ess this operation in Beam, whether it be via an > explicit Checkpoint() operation or via marking DoFns as having side > effects, and having the runner automatically insert such a Checkpoint in > front of them. In Flink, this operation can be implemented using what > Aljoscha posted. &

Re: Exactly-once Kafka sink

2017-08-08 Thread Aljoscha Krettek
Hi, In Flink, there is a TwoPhaseCommit SinkFunction that can be used for such cases: [1]. The PR for a Kafka 0.11 exactly once producer builds on that: [2] Best, Aljoscha [1]

Re: [VOTE] Release 2.1.0, release candidate #2

2017-08-08 Thread Aljoscha Krettek
+1 to releasing now and working on a fix for a follow-up release. > On 8. Aug 2017, at 06:52, Jean-Baptiste Onofré wrote: > > Hi Kenn, > > As said, I just gave an extra couple of days to Stas and I to try to fix the > issue. However, we didn't fix it yet, and I'm still

Re: [CANCEL][VOTE] Release 2.1.0, release candidate #2

2017-07-24 Thread Aljoscha Krettek
Onofré <j...@nanthrax.net> wrote: > > +1 > > Definitely good to have it for RC3. > > Regards > JB > > On 07/24/2017 02:05 PM, Aljoscha Krettek wrote: >> When we're cutting a new RC anyways we could also include the fixes for >> https://issues.apache.or

Re: [CANCEL][VOTE] Release 2.1.0, release candidate #2

2017-07-24 Thread Aljoscha Krettek
When we're cutting a new RC anyways we could also include the fixes for https://issues.apache.org/jira/browse/BEAM-2571 . It's an actual bug in the Flink Runner and the fix for that is a set of three fixes that should be easy to cherry-pick on

Re: [VOTE] Release 2.1.0, release candidate #2

2017-07-20 Thread Aljoscha Krettek
tiste Onofré <j...@nanthrax.net> >>> wrote: >>> >>> Hi Aljoscha >>>> >>>> Do you have all python requirements installed on your machine ? >>>> >>>> Especially, pip, setuptools, tox, ... ? >>>> >>>>

Re: [VOTE] Release 2.1.0, release candidate #2

2017-07-20 Thread Aljoscha Krettek
ax.net> wrote: > > Hi Aljoscha > > Do you have all python requirements installed on your machine ? > > Especially, pip, setuptools, tox, ... ? > > It sounds like a missing Python requirement on your machine to me. > > Regards > JB > > On 07/20/2017 10:36

Re: [VOTE] Release 2.1.0, release candidate #2

2017-07-20 Thread Aljoscha Krettek
+ 0.8 I tried running "mvn package” on my machine build it fails. This is the log output: https://gist.github.com/aljoscha/dc194303bede8bc635e2d8b691bb58f8 . It fails when trying to build the Python part. Unfortunately I know

Re: [Proposal] Submitting pipelines to Runners in another language

2017-07-11 Thread Aljoscha Krettek
This looks excellent! Please let me know once we get to actually implement this for a specific runner. Flink in my case, of course! :-) > On 8. Jul 2017, at 00:07, Thomas Groh wrote: > > I left a couple of comments. > > I'm looking forwards to this - it's going to

Reduced Availability from 17.6. - 24.6

2017-06-16 Thread Aljoscha Krettek
Hi, I’ll be on vacation next week, just in case anyone is wondering why I’m not responding. :-) Best, Aljoscha

Re: [DISCUSS] Bundle in Flink Runner

2017-06-14 Thread Aljoscha Krettek
Hi, Thanks for summarising everything that was discussed so far and also for coming up with a good implementation plan! The plan looks good, it will unblock proper bundle support on Beam because we cannot wait for Flink to support bundles the way we like it. I also want to quickly highlight

Re: [DISCUSS] Source Watermark Metrics

2017-06-02 Thread Aljoscha Krettek
Hi, Thanks for reviving this thread. I think having the watermark is very good. Some runners, for example Dataflow and Flink have their own internal metric for the watermark but having it cross-runner seems beneficial (if maybe a bit wasteful). Best, Aljoscha > On 2. Jun 2017, at 03:52,

Re: Beam Fn API

2017-05-31 Thread Aljoscha Krettek
Thanks for banging these out Lukasz. I’ll try and read them all this week. We’re also planning to add support for the Fn API to the Flink Runner so that we can execute Python programs. I’m sure we’ll get some valuable feedback for you while doing that. > On 26. May 2017, at 22:49, Lukasz Cwik

Re: New doc: Beam Runner Guide

2017-05-22 Thread Aljoscha Krettek
Cool, I’ll give this a read once I’m back from conference season. :-) > On 22. May 2017, at 07:12, Kenneth Knowles wrote: > > Hi all, > > I spent a little time since 2.0.0 to write a guide for runner authors, > aimed at covering everything you need to know and what you need to

Re: [PROPOSAL] Running Splittable DoFn via Source API

2017-05-11 Thread Aljoscha Krettek
Yes, that additional piece of work was basically my concern. It’s a very mild concern, though, and I’m in favour of implementing SDF as a source. Best, Aljoscha > On 11. May 2017, at 01:00, Eugene Kirpichov > wrote: > > Hi, > > Aljoscha - can you clarify your

Re: Pipeline termination in the unified Beam model

2017-05-10 Thread Aljoscha Krettek
Hi, A bit of clarification, the Flink Runner does not terminate a Job when the timeout is reached in waitUntilFinish(Duration). When we reach the timeout we simply return and keep the job running. I thought that was the expected behaviour. Regarding job termination, I think it’s easy to change

Re: [DISCUSS] Remove TimerInternals.deleteTimer(*) and Timer.cancel()

2017-05-10 Thread Aljoscha Krettek
times for the same timestamp (probably fine/idempotent) >> >> Technically, timers are marked `@Experimental`. But, given the interest in >> state and timers, making changes here will be very hard on users. >> >> Unless someone objects with a strong case, I am comfort

[DISCUSS] Remove TimerInternals.deleteTimer(*) and Timer.cancel()

2017-05-08 Thread Aljoscha Krettek
I wanted to bring this up before the First Stable release and see what other people think. The methods I’m talking about are: void deleteTimer(StateNamespace namespace, String timerId, TimeDomain timeDomain); @Deprecated void deleteTimer(StateNamespace namespace, String timerId); @Deprecated

Re: Congratulations Davor!

2017-05-04 Thread Aljoscha Krettek
Congrats! :-) > On 4. May 2017, at 14:34, Kenneth Knowles wrote: > > Awesome! > > On Thu, May 4, 2017 at 1:19 AM, Ted Yu wrote: > >> Congratulations, Davor! >> >> On Thu, May 4, 2017 at 12:45 AM, Aviem Zur wrote: >> >>>

Re: [PROPOSAL] Running Splittable DoFn via Source API

2017-05-02 Thread Aljoscha Krettek
+1 I’m a bit hesitant, though, because stage 3 (in the new plan) could become the current stage 1: now, stage 1 is “waiting for Runners to support SDF” while stage 2 is “implement sources as SDF”. We are blocked by Runner support in stage 1 while in the new scheme we would be blocked on Runner

Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Aljoscha Krettek
+1, as I’m almost always in favour of simplification > On 21. Apr 2017, at 19:59, Robert Bradshaw > wrote: > > Strongly in favor of removing this. If it's actually needed one can > incorporate the key into the value for inspection in the various > phases of the

Re: Pipeline termination in the unified Beam model

2017-04-18 Thread Aljoscha Krettek
BEAM-593 is blocked by Flink issues: - https://issues.apache.org/jira/browse/FLINK-2313: Change Streaming Driver Execution Model - https://issues.apache.org/jira/browse/FLINK-4272: Create a JobClient for job control and monitoring where the second is kind of a duplicate of the first one.

Re: Naming of Combine.Globally

2017-04-18 Thread Aljoscha Krettek
Hi, I think both fold and reduce fail to capture all the power or (what we call) combine. Reduce requires a function of type (T, T) -> T. It requires that the output type be the same as the input type. Fold takes a function (T, A) -> A where T is the input type and A is the accumulation type.

Re: [DISCUSSION] PAssert success/failure count validation for all runners

2017-04-12 Thread Aljoscha Krettek
in a runner-agnostic way in TestStream, i.e. https://issues.apache.org/jira/browse/BEAM-1763? <https://issues.apache.org/jira/browse/BEAM-1763?> Best, Aljoscha > On 10. Apr 2017, at 10:10, Kenneth Knowles <k...@google.com.INVALID> wrote: > > On Sat, Apr 8, 2017 at 7:00 AM, Al

Re: Renaming SideOutput

2017-04-11 Thread Aljoscha Krettek
+1 On Wed, Apr 12, 2017, at 02:34, Thomas Groh wrote: > I think that's a good idea. I would call the outputs of a ParDo the "Main > Output" and "Additional Outputs" - it seems like an easy way to make it > clear that there's one output that is always expected, and there may be > more. > > On

Re: Proposed Splittable DoFn API changes

2017-04-08 Thread Aljoscha Krettek
+1 I was too busy with traveling and preparations for Flink Forward but I wanted to retroactively confirm that these are good changes. :-) > On 7. Apr 2017, at 22:43, Jean-Baptiste Onofré wrote: > > Hi Eugene, > > thanks for the update and nice example. > > I plan to start

Re: [DISCUSSION] Consistent use of loggers

2017-04-03 Thread Aljoscha Krettek
Yes, I think we can exclude log4j from the Flink dependencies. It’s somewhat annoying that they are there in the first place. The Flink doc has this to say about the topic: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/logging.html > On 3. Apr 2017, at 17:56, Aviem Zur

Re: [PROPOSAL] @OnWindowExpiration

2017-03-29 Thread Aljoscha Krettek
+1 I had also already commented on the issue a while back ;-) On Wed, Mar 29, 2017, at 21:23, Kenneth Knowles wrote: > I had totally forgotten that this was filed as > https://issues.apache.org/jira/browse/BEAM-1589 already, which I have now > assigned to myself. > > And, of course, there have

Re: Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-03-28 Thread Aljoscha Krettek
> Regards > JB > > On 03/28/2017 08:00 AM, Aljoscha Krettek wrote: > > Hi, > > sorry for being so slow but I’m currently traveling. > > > > The Flink code works but I think it could benefit from some refactoring > > to make the code nice and maintainable. >

Re: Side-Channel Inputs and an SDK

2017-03-26 Thread Aljoscha Krettek
+1 Having all connections be explicit seems like a must. > On 24 Mar 2017, at 19:48, Thomas Groh wrote: > > Hey everyone; > > I have a quick one-pager on PTransform capabilities and the ability for a > PTransform to receive inputs via a Side-channel. This is a

Re: Kafka Offset handling for Restart/failure scenarios.

2017-03-26 Thread Aljoscha Krettek
Thanks, > Jins George > > On 03/23/2017 10:27 AM, Aljoscha Krettek wrote: >> For this you would use externalised checkpoints: >> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html >> >> >> Unfortunately, the doc

Re: First IO IT Running!

2017-03-22 Thread Aljoscha Krettek
Great news! I can’t wait to also have support for this for the Flink Runner. Which is partially blocked by me or others working on the Flink Runner, I guess… :-( > On 22 Mar 2017, at 05:15, Jean-Baptiste Onofré wrote: > > Awesome !!! Great news ! > > Thanks guys for that !

Re: Kafka Offset handling for Restart/failure scenarios.

2017-03-22 Thread Aljoscha Krettek
he >>>> most >>>>>> recent savepoint upon restore. >>>>>> >>>>>> There shouldn't be any "special" Flink runner support needed, nor is >>>> the >>>>>> State API involved. >>>>>&

Re: [VOTE] Release 0.6.0, release candidate #2

2017-03-14 Thread Aljoscha Krettek
ion (see JIRA): Not a release blocker (but still a bug in > TestPipeline). > > On Mon, Mar 13, 2017 at 5:40 PM Eugene Kirpichov <kirpic...@google.com> > wrote: > > > +Aljoscha Krettek <aljos...@data-artisans.com> > > > > On Mon, Mar 13, 2017 at 5:30 PM E

Re: [VOTE] Release 0.6.0, release candidate #1

2017-03-10 Thread Aljoscha Krettek
Sorry for the BEAM-1674 problem. We just discovered this by chance because Kenneth added a more thorough Stateful DoFn test. I have a fix for it in this PR: https://github.com/apache/beam/pull/2217 I'm afraid we have to cancel the release yet again because this is a real bug that people can run

Fwd: Jenkins build is still unstable: beam_PostCommit_Java_RunnableOnService_Flink #1866

2017-03-09 Thread Aljoscha Krettek
I'm looking into the recent streak of unstable builds. - Original message - From: Apache Jenkins Server To: comm...@beam.apache.org, k...@google.com, al...@google.com Subject: Jenkins build is still unstable: beam_PostCommit_Java_RunnableOnService_Flink #1866

Re: Apache Beam (virtual) contributor meeting @ Tue Mar 7, 2017

2017-03-02 Thread Aljoscha Krettek
Shoot, I can't because I already have another meeting scheduled. Don't mind me, though. Will you also maybe produce a video of the meeting? On Wed, 1 Mar 2017 at 21:50 Davor Bonaci wrote: > Hi everyone, > Based on the high demand [1], let's try to organize a virtual

Re: First stable release: version designation?

2017-03-02 Thread Aljoscha Krettek
I prefer 2.0.0 for the first stable release. It totally makes sense for people coming from Dataflow 1.x and I can already envision the confusion between Beam 1.5 and Dataflow 1.5. On Thu, 2 Mar 2017 at 07:42 Jean-Baptiste Onofré wrote: > Hi Davor, > > > For a Beam community

Re: Release 0.6.0

2017-03-01 Thread Aljoscha Krettek
ion. I will document the friction points during this release process. Following the release we can start a discussion about how to fix those. Ahmet On Tue, Feb 28, 2017 at 9:22 AM, Aljoscha Krettek <aljos...@apache.org> wrote: > That was my mistake, sorry for that. I should have tagged [1] as a bl

Re: Performance Testing Next Steps

2017-03-01 Thread Aljoscha Krettek
Thanks for writing this and taking care of this, Jason! I'm afraid I also cannot add anything except that I'm excited to see some results from this. On Wed, 1 Mar 2017 at 03:28 Kenneth Knowles wrote: Just got a chance to look this over. I don't have anything to add,

Re: Release 0.6.0

2017-02-28 Thread Aljoscha Krettek
ate > thread > > we plan for this to be the last release before the "first stable > release"; > > and picking the new features now will provide additional coverage for it. > > > > So, +1, but please tag in JIRA. > > > > On Tue, Feb 28, 2017 at 2:09 AM, Al

[DISCUSS] Per-Key Watermark Maintenance

2017-02-27 Thread Aljoscha Krettek
We recently started a discussion on the Flink ML about this topic: [1] The gist of it is that for some use cases tracking the watermark per-key instead of globally (or rather per partition) can be useful for some cases. Think, for example, of tracking some user data off mobile phones where the

Re: Flink Runner with Flink 1.2

2017-02-27 Thread Aljoscha Krettek
Hi, I'm afraid you'll have to either use master or wait until the 0.6 release. We're currently releasing quite frequently, though, so that shouldn't take to long. Best, Aljoscha On Sun, 26 Feb 2017 at 23:31 Sathya Hariesh Prakash (sathypra) < sathy...@cisco.com> wrote: > Hi Guys – We are in

Re: Interest in a (virtual) contributor meeting?

2017-02-22 Thread Aljoscha Krettek
+1 On Wed, 22 Feb 2017 at 10:08 JingsongLee wrote: > +1 > > > 来自阿里邮箱 iPhone版 --原始邮件 --发件人:Davor Bonaci < > da...@apache.org>日期:2017-02-22 11:19:12收件人:dev@beam.apache.org < > dev@beam.apache.org>主题:Interest in a (virtual) contributor

Re: Better developer instructions for using Maven?

2017-02-10 Thread Aljoscha Krettek
at 1:55 PM Jean-Baptiste Onofré < > j...@nanthrax.net > > > > > > > >wrote: > > > > > > > > > >> Hi > > > > >> > > > > >> We discussed about that at the beginning of the project

Re: [VOTE] Apache Beam, version 0.5.0, release candidate #2

2017-02-04 Thread Aljoscha Krettek
+1 - Checked signatures - Verified that source release builds - Verified that code contains fix for the regression in the Flink Runner - Tried Quickstart against Staging Repository - Verified that Quickstart works against a Flink Cluster The top level README.md doesn't contain the Apex Runner

Re: [CANCEL][VOTE] Apache Beam, version 0.5.0, release candidate #1

2017-02-02 Thread Aljoscha Krettek
We should go with the option Kenn proposed instead of a "hard revert". On Thu, 2 Feb 2017 at 10:55 Jean-Baptiste Onofré wrote: > Hi everyone, > > Due to the regression in the Flink runner (and potentially other > runners), I cancel RC#1. > > The fixes/reverts will be done on

Re: [VOTE] Apache Beam, version 0.5.0, release candidate #1

2017-01-31 Thread Aljoscha Krettek
/beam_PostCommit_Java_RunnableOnService_Dataflow/ (still waiting in queue as of writing) I think the MavenInstall hooks fail because the (Google-internal) Dataflow Runner Harness doesn't work with the changed code, though I'm only guessing here. On Tue, 31 Jan 2017 at 21:26 Aljoscha Krettek <aljos...@apache.org> wrote:

Re: [VOTE] Apache Beam, version 0.5.0, release candidate #1

2017-01-31 Thread Aljoscha Krettek
. (There's more besides dropping.) I think we should go ahead and fix for 0.6. On Tue, Jan 31, 2017, 18:23 Jean-Baptiste Onofré <j...@nanthrax.net> wrote: > Hi Aljoscha, > > so you propose to cancel this vote to prepare a RC2 ? > > Regards > JB > > On 01/31/2017 05

Re: [VOTE] Apache Beam, version 0.5.0, release candidate #1

2017-01-31 Thread Aljoscha Krettek
It's not just an issue with the Flink Runner, if I'm not mistaken. Flink had late-data dropping via the LateDataDroppingDoFnRunner (which got "disabled" by the two commits I mention in the issue) while I think that the Apex and Spark Runners might not have had dropping in the first place. (Not

Re: [ANNOUNCEMENT] New committers, January 2017 edition!

2017-01-27 Thread Aljoscha Krettek
Welcome aboard! :-) On Fri, 27 Jan 2017 at 11:27 Ismaël Mejía wrote: > Congratulations, well deserved guys ! > > > On Fri, Jan 27, 2017 at 9:28 AM, Amit Sela wrote: > > > Welcome and congratulations to all! > > > > On Fri, Jan 27, 2017, 10:12 Ahmet

Re: Better developer instructions for using Maven?

2017-01-26 Thread Aljoscha Krettek
mportant that we mention the existing profiles (and the > > >intended > > >>>> checks) in the contribution guide (e.g. -Prelease (or -Pall-checks > > >>>> triggers > > >>>> these validations). > > >>>> > > >

Re: On my activity at the project

2017-01-21 Thread Aljoscha Krettek
It's great to work with you Max but enjoy your time off. P.S. I'm happy to be the component lead for the Flink Runner. Thanks for updating, Frances! On Tue, 17 Jan 2017 at 18:53 Kenneth Knowles wrote: > Great to work with you so far, and looking forward to it in the