Re: [RESULT] [VOTE] Release Apache Flink 0.8.0 (RC3)

2015-01-18 Thread Márton Balassi
The vote has passed with +6 binding votes from the PMC. +1 votes are from: Aljoscha Krettek, Robert Metzger, Vasiliki Kalavri, Henry Saputra, Fabian Hueske, Stephan Ewen. Thank you for checking the release. I'll publish the release now. On Thu, Jan 15, 2015 at 12:10 PM, Márton Balassi balassi.mar

Re: [VOTE] Release Apache Flink 0.8.1 (RC2)

2015-02-17 Thread Márton Balassi
+1 Checked signatures, checksums, pom. Built from src, run local examples. On Tue, Feb 17, 2015 at 11:59 PM, Robert Metzger rmetz...@apache.org wrote: +1 I've checked the RC on a HDP 2.2 sandbox (using Flink on YARN). Also ran wordcount on it. The hadoop1 quickstarts have the correct

Re: [DISCUSS] Dedicated streaming mode and start scripts

2015-02-17 Thread Márton Balassi
When it comes to the current use cases I'm for this separation. @Ufuk: As Gyula has already pointed out with the current design of integration it should not be a problem. Even if we submitted programs to the wrong clusters it would only cause performance issues. Eventually it would be nice to

Re: [SUGGESTION] Push latest doc to Flink website

2015-02-18 Thread Márton Balassi
+1 We used to have this a couple of releases ago. On Wed, Feb 18, 2015 at 4:30 PM, Henry Saputra henry.sapu...@gmail.com wrote: Hi All, I am thinking of pushing latest doc in master (i.e. the snapshot build) to Flink website to help people follow the latest change and development without

Re: Question about Commit Policy

2015-01-27 Thread Márton Balassi
. +1 to stick with them On Wed, Jan 7, 2015 at 3:03 PM, Márton Balassi balassi.mar...@gmail.com wrote: I prefer component declarations, the current best practice comes in handy when searching through commits. Answering a when

[VOTE] Release Apache Flink 0.8.0 (RC2)

2015-01-12 Thread Márton Balassi
Please vote on releasing the following candidate as Apache Flink version 0.8.0 This release will be the first major release for Flink as a top level project. - The commit to be voted on is in the branch release-0.8.0-rc2 (commit

Re: Gelly is in!

2015-02-11 Thread Márton Balassi
Woot! :) On Wed, Feb 11, 2015 at 11:53 AM, Stephan Ewen se...@apache.org wrote: Hi everyone! I am happy to say that the graph library Gelly is finally in the code :-) Thanks Vasia, Daniel, Andra, and Carsten for the great work! Greetings, Stephan

Re: Planning Release 0.8.1

2015-02-09 Thread Márton Balassi
at 12:52 PM, Aljoscha Krettek aljos...@apache.org wrote: @robert, yes, will do On Fri, Feb 6, 2015 at 12:28 PM, Márton Balassi balassi.mar...@gmail.com wrote: Found a streaming bug, Gyula fixed it. Pushing it soon to both master and branch-0.8. On Fri, Feb 6, 2015

Re: [DISCUSS] Name of Expression API and DataSet abstraction

2015-03-16 Thread Márton Balassi
+1 for Max's suggestion. On Mon, Mar 16, 2015 at 10:32 AM, Ufuk Celebi u...@apache.org wrote: On Fri, Mar 13, 2015 at 6:08 PM, Maximilian Michels m...@apache.org wrote: Thanks for starting the discussion. We should definitely not keep flink-expressions. I'm in favor of DataTable for

Re: Restructuring the maven projects

2015-03-17 Thread Márton Balassi
...@gmail.com wrote: Thanks Marton, having 2 threads discussing same thing can be confusing. - Henry On Mon, Jan 5, 2015 at 3:52 AM, Márton Balassi mbala...@apache.org wrote: Let us consider this thread the standard for the restructure, it is perfectly in line with the wishes I have

Re: Website documentation minor bug

2015-03-09 Thread Márton Balassi
+1 for the proposed solution from Max +1 for decreasing the size: but let's have a preview, I also think that the current one is a bit too large On Mon, Mar 9, 2015 at 2:16 PM, Maximilian Michels m...@apache.org wrote: We can fix this for the headings by adding the following CSS rule: h1, h2,

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-10 Thread Márton Balassi
to the similar concept of Spark RDD staging execution =P I suppose there will be a runtime configuration or hint to tell the Flink Job manager to indicate which execution is preferred? - Henry On Tue, Mar 3, 2015 at 2:09 AM, Márton Balassi balassi.mar...@gmail.com wrote: Hi Henry

Re: [VOTE] Name of Expression API Representation

2015-03-26 Thread Márton Balassi
+DataTable On Thu, Mar 26, 2015 at 9:29 AM, Markl, Volker, Prof. Dr. volker.ma...@tu-berlin.de wrote: +Table I also agree with that line of argument (think SQL ;-) ) -Original Message- From: Timo Walther [mailto:twal...@apache.org] Sent: Thursday, 26 March 2015 09:28

Re: A small Project I've been working on

2015-04-01 Thread Márton Balassi
Woot! On Wed, Apr 1, 2015 at 9:01 AM, Aljoscha Krettek aljos...@apache.org wrote: Right now, runtime is roughly thrice that of equivalent java programs. But I plan on bringing that to the same ballpark using code generation. On Wed, Apr 1, 2015 at 8:54 AM, Fabian Hueske fhue...@gmail.com

Re: Question about Infinite Streaming Job on Mini Cluster and ITCase

2015-04-01 Thread Márton Balassi
Hey Matthias, Thanks for reporting the Exception thrown, we were not preparing for this use case yet. We fixed it with Gyula, he is pushing a fix for it right now: When the job is cancelled (for example due to shutting down the executor underneath) you should not see that InterruptedException as

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-27 Thread Márton Balassi
+1 for 0.9.0-milestone-1. On Fri, Mar 27, 2015 at 12:47 PM, Stephan Ewen se...@apache.org wrote: Okay, to how about we make this <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-core</artifactId> <version>0.9.0-milestone-1</version> </dependency> I think it is common that milestones

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-03 Thread Márton Balassi
Hi Henry, Batch mode is a new execution mode for batch Flink jobs where instead of pipelining the whole execution the job is scheduled in stages, thus materializing the intermediate result before continuing to the next operators. For implications see [1]. [1]

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-02 Thread Márton Balassi
Hey, We have a nice list of new features - it definitely makes sense to have that as a release. On my side I really want to have a first limited version of streaming fault tolerance in it. +1 for Robert's proposal for the deadlines. I'm also volunteering for release manager. Best, Marton On

Re: [DISCUSS] Dedicated streaming mode and start scripts

2015-02-27 Thread Márton Balassi
ktzou...@apache.org wrote: +1 On Tue, Feb 17, 2015 at 12:14 PM, Márton Balassi mbala...@apache.org wrote: When it comes to the current use cases I'm for this separation. @Ufuk: As Gyula has already pointed out with the current design of integration

Re: Questions about flink-streaming-examples

2015-02-26 Thread Márton Balassi
Dear Mathias, Thanks for reporting the issue. I have successfully built flink-streaming-examples with maven, you can depend on test classes, the following in the pom does the trick: <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-streaming-core</artifactId>

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-26 Thread Márton Balassi
@Timo: No feature freeze for this, yes. On Thu, Mar 26, 2015 at 3:36 PM, Timo Walther twal...@apache.org wrote: +1 for a beta release. So there is no feature-freeze until the RC right? On 26.03.2015 15:32, Márton Balassi wrote: +1 for the early release. I'd call it 0.9-milestone1

Re: Tests for the Steaming classes

2015-03-23 Thread Márton Balassi
Thanks for looking into this, Stephan. +1 for the JIRAs. On Mon, Mar 23, 2015 at 10:55 AM, Ufuk Celebi u...@apache.org wrote: On 23 Mar 2015, at 10:44, Stephan Ewen se...@apache.org wrote: Hi everyone! With the streaming stuff getting heavier exposure, I think it needs a few more

Re: Storm compatibility layer for Flink (first beta available)

2015-04-02 Thread Márton Balassi
Hey Mathias, Thanks, this is a really nice contribution. I just scrolled through the code, but I really like it and big thanks for the tests for the examples. The rebase Fabian suggested would help a lot when merging. On Thu, Apr 2, 2015 at 9:19 PM, Fabian Hueske fhue...@gmail.com wrote:

Re: Rework of the window-join semantics

2015-04-03 Thread Márton Balassi
could define it like this: stream_A = a.window(...) stream_B = b.window(...) stream_A.join(stream_B).where().equals().with() So a join would just be a join of two WindowedDataStreamS. This would neatly move the windowing stuff into one place. On Thu, Apr 2, 2015 at 9:54 PM, Márton Balassi

Re: Test sources in wrong folder

2015-04-03 Thread Márton Balassi
Dear Flavio, 'mvn clean install -DskipTests' should do the trick. On Fri, Apr 3, 2015 at 12:11 AM, Flavio Pompermaier pomperma...@okkam.it wrote: Hi to all, I was trying to compile Flink 0.9 skipping test compilation (-Dmaven.test.skip=true) but this is not possible because there are

Re: Making state in streaming more explicit

2015-05-01 Thread Márton Balassi
The current aim is the first option as you have correctly derived. :) On May 1, 2015 5:39 PM, Aljoscha Krettek aljos...@apache.org wrote: From this discussion I derive that we will have a state abstraction that everyone who requires state will work with? Or will the state be in object fields

Re: [DISCUSS] Behaviour of Streaming Sources

2015-05-11 Thread Márton Balassi
We had a conversation with Stephan, Aljoscha, Gyula and Paris and converged on the following outline for the streaming source interface. The question is tricky because we need to coordinate between the actual source computation and triggering the checkpointing of the state of the source. We

[DISCUSS] Merging Storm compatibility to Flink-contrib

2015-05-12 Thread Márton Balassi
The purpose of flink-contrib currently is to hold contributions to the project that we do not consider part of the core flink functionality, but provide useful tools around it. In general code placed here has to meet fewer requirements in terms of covering all corner cases if it provides a nice

Re: Storm compatibility layer for Flink (first beta available)

2015-04-06 Thread Márton Balassi
:) Gyula On Thursday, April 2, 2015, Márton Balassi balassi.mar...@gmail.com wrote: Hey Mathias, Thanks, this is a really nice contribution. I just scrolled through the code, but I really like it and big thanks for the tests for the examples. The rebase Fabian suggested

Re: [DISCUSS] Break up streaming connectors into subprojects

2015-04-08 Thread Márton Balassi
Overall I think this is a nice approach, but let us then also discuss where we would like to put these jars. Currently these jars are not in the lib folder of the Flink distribution, which means that whenever a user would like to use them they have to package it with their user code which is a bit

Re: Rework of the window-join semantics

2015-04-08 Thread Márton Balassi
this intuitive. On Fri, Apr 3, 2015 at 11:23 AM, Márton Balassi balassi.mar...@gmail.com wrote: That would be really neat, the problem I see there is that we do not distinguish between dataStream.window() and dataStream.window().every() currently, they both

Re: [DISCUSS] Re-add record copy to chained operator calls

2015-05-20 Thread Márton Balassi
+1 for copying. On May 20, 2015 10:50 AM, Gyula Fóra gyf...@apache.org wrote: Hey, The latest streaming operator rework removed the copying of the outputs before passing them to chained operators. This is a major break for the previous operator semantics which guaranteed immutability. I

Re: Fwd: Discussion: Storm Comparability Layer

2015-06-03 Thread Márton Balassi
nemderogator...@gmail.com Date: 2015-06-03 15:31 GMT+02:00 Subject: Re: Discussion: Storm Comparability Layer To: Márton Balassi balassi.mar...@gmail.com Hey, Matthias, Of course, you can remove my last commit. I just wanted to remove the failing tests, and some unnecessary comments. Please

Re: Testing Apache Flink 0.9.0-rc1

2015-06-08 Thread Márton Balassi
Added F7 Running against Kafka cluster for me in the doc. Doing it tomorrow. On Mon, Jun 8, 2015 at 7:00 PM, Chiwan Park chiwanp...@icloud.com wrote: Hi. I’m very excited about preparing a new major release. :) I just picked two tests. I will report status as soon as possible. Regards,

Re: Closing JIRA issues

2015-06-05 Thread Márton Balassi
Hey Lokesh, The implicit practice is that the committer merging your PR closes the JIRA. Please do not close the JIRA before your patch is merged. If the committer forgets to close it after your code is in you are very welcome to close it yourself. Best, Marton On Fri, Jun 5, 2015 at 5:24 PM,

Github mirror is down

2015-06-06 Thread Márton Balassi
Our codebase github mirror is out of sync for at least 8 hours. I have filed a JIRA ticket for Infra. [1] [1] https://issues.apache.org/jira/browse/INFRA-9777

Re: Planning the 0.9 Release

2015-06-08 Thread Márton Balassi
The problem is still there. @Aljoscha: It would be great if you could take it. On Mon, Jun 8, 2015 at 9:41 AM, Gyula Fóra gyf...@apache.org wrote: I agree with Marton. I thought Aljoscha was working on that. On Monday, June 8, 2015, Márton Balassi balassi.mar...@gmail.com wrote: FLINK-2054

Re: checkstyle failure

2015-06-03 Thread Márton Balassi
Actually it is really important that you are using a Mac, it looks like Spark had the same issue with hbase annotations. [1] I have added a JIRA for Flink, should be easy as Spark had a PR for it. [2, 3] [1] https://issues.apache.org/jira/browse/SPARK-4455 [2]

Re: checkstyle failure

2015-06-03 Thread Márton Balassi
Sure, feel free to take it. :) On Wed, Jun 3, 2015 at 5:00 PM, Lokesh Rajaram rajaram.lok...@gmail.com wrote: Hello Marton, Nice find. can I work on this JIRA? Thanks, Lokesh On Wed, Jun 3, 2015 at 7:48 AM, Márton Balassi balassi.mar...@gmail.com wrote: Actually it is really

Re: Build works locally but fails on travis (Storm compatibility)

2015-06-10 Thread Márton Balassi
Hey, As the storm-compatibility-core build goes fine this is a dependency issue with storm-compatibility-examples. As a first try replace: <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-streaming-core</artifactId> <version>${project.version}</version> <scope>test</scope>

Re: Force enabling checkpoints for iterative streaming jobs

2015-06-10 Thread Márton Balassi
I agree that for the sake of the above mentioned use cases it is reasonable to add this to the release with the right documentation, for machine learning potentially losing one round of feedback data should not matter. Let us not block prominent users until the next release on this. On Wed, Jun

Re: Fwd: Discussion: Storm Comparability Layer

2015-06-04 Thread Márton Balassi
. It builds locally. I cannot reproduce the error... (Maybe just trigger the build again) -Matthias On 06/03/2015 09:09 PM, Márton Balassi wrote: Thanks for the updates, Matthias. Both of your questions get another context, because we have decided to go back to the run()/cancel() type

The household of the Kafka connector

2015-06-22 Thread Márton Balassi
Hey, Due to the effort invested in the Kafka connector mainly by Robert and Gabor Hermann we are going to ship a fairly nice solution for reading from and writing to Kafka with 0.9.0. This is the most prominent streaming connector currently, and rightfully so as pipeline level end-to-end exactly

Known minor streaming issue in 0.9.0

2015-06-22 Thread Márton Balassi
Hey, I have found that open and close methods of streaming RichWindowFunctions are not called. I have the fix [1] as I did implement a fix for a similar issue some time ago, [2] sorry for not realizing it back then. [1] https://github.com/apache/flink/pull/855 [2]

Re: Known minor streaming issue in 0.9.0

2015-06-22 Thread Márton Balassi
Added a ticket, so we can refer to it. https://issues.apache.org/jira/browse/FLINK-2257 On Mon, Jun 22, 2015 at 2:14 PM, Ufuk Celebi u...@apache.org wrote: On 22 Jun 2015, at 14:00, Maximilian Michels m...@apache.org wrote: Hi Marton, Thanks for spotting this issue. It is a bug we

Re: Thoughts About Streaming

2015-06-27 Thread Márton Balassi
@Matthias: Your point of working with a minimal number of clear concepts is desirable to say the least. :) The reasoning behind the KeyedDatastream is to associate Flink persisted operator state with the keys of the data that produced it, so that stateful computation becomes scalable in the

Re: Build works locally but fails on travis (Storm compatibility)

2015-06-10 Thread Márton Balassi
, Márton Balassi wrote: Hey, As the storm-compatibility-core build goes fine this is a dependency issue with storm-compatibility-examples. As a first try replace: <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-streaming-core</artifactId> <version>${project.version

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Márton Balassi
@Aljoscha: 1) I think this just means that you can set the state backend on a taskmanager basis. 3) This is a serious issue then. Does it work when you set it in the flink-conf.yaml? On Mon, Jun 15, 2015 at 3:17 PM, Aljoscha Krettek aljos...@apache.org wrote: So, during my testing of the state

Re: Testing Apache Flink 0.9.0-rc2

2015-06-16 Thread Márton Balassi
@Max: The PR is good to go on my side. Does the job, could be a bit nicer though. Added to the document. On Tue, Jun 16, 2015 at 10:54 PM, Aljoscha Krettek aljos...@apache.org wrote: I added the two relevant Table API commits to the release doc. On Tue, 16 Jun 2015 at 21:49 Maximilian Michels

Re: Testing Apache Flink 0.9.0-rc1

2015-06-12 Thread Márton Balassi
@Till: This also applies to the streaming connectors. On Fri, Jun 12, 2015 at 9:45 AM, Till Rohrmann trohrm...@apache.org wrote: Hi guys, I just noticed while testing the TableAPI on the cluster that it is not part of the dist module. Therefore, programs using the TableAPI will only run when

Re: Testing Apache Flink 0.9.0-rc1

2015-06-12 Thread Márton Balassi
As for outstanding issues I think streaming is good to go as far as I know. I am personally against including all libraries - at least speaking for the streaming connectors. Robert, Stephan and myself had a detailed discussion on that some time ago and the disadvantage of having all the libraries

Re: [VOTE] Release Apache Flink 0.9.0 (release-0.9.0-rc3)

2015-06-17 Thread Márton Balassi
+1 Verified checksums and signatures Built from source Run bundled batch examples on a local setup Run streaming example with checkpointed operator state on cluster, killed taskmanagers underneath On Wed, Jun 17, 2015 at 2:35 AM, Maximilian Michels m...@apache.org wrote: Dear community!

Re: Shading artifact name

2015-06-16 Thread Márton Balassi
Ok, never mind. I got it: it is decided later dependent on the hadoop profile. Makes sense. On Tue, Jun 16, 2015 at 1:24 PM, Márton Balassi balassi.mar...@gmail.com wrote: Hey, While reading through the flink-parent pom I have stumbled upon this. [1] <shading-artifact.name>error</shading

[DISCUSS] Consolidate method naming between the batch and streaming API

2015-06-01 Thread Márton Balassi
Looking at the DataSet and DataStream APIs we have come to the conclusion with Aljoscha that there are a few methods that although providing the same functionality are named differently. These are the following: 1. rebalance (batch) / distribute (streaming): Rebalances the data sent to the

Re: [DISCUSS] Streaming Sources (again)

2015-05-31 Thread Márton Balassi
I am also for having only one source interface. It seems that interruptability is too much of a burden on the sources, locking version should be still acceptable from the user point of view. We are dealing with inherently concurrent tasks, I suppose our users are familiar with locking - especially

Re: Memleak in the SessionWindowing example

2015-05-28 Thread Márton Balassi
Thanks for debugging this Gabor, indeed a good catch. I am not so sure about surfacing it in the API though - it seems very specific for the session windowing case. I am also wondering whether maybe this should actually be the default behavior - if there are already empty windows for a group why

Re: Is there Any api that let DataStream join DataSet ?

2015-06-28 Thread Márton Balassi
Hi, Flink currently does not have explicit Api support for that, but it is definitely possible to do. In fact Gyula (cc-d) mocked up a prototype for a similar problem some time ago. The idea needs some refinement to properly support all the viable use cases though and the streaming Api currently

Re: [ANNOUNCE] New Committer Chesnay Schepler

2015-08-20 Thread Márton Balassi
Welcome Chesnay! On Thu, Aug 20, 2015 at 7:29 PM, Henry Saputra henry.sapu...@gmail.com wrote: Welcome Chesnay! On Thu, Aug 20, 2015 at 2:18 AM, Robert Metzger rmetz...@apache.org wrote: The Project Management Committee (PMC) for Apache Flink has asked Chesnay Schepler to become a

Re: [FAILING TEST] StateCheckpoinedITCase

2015-08-22 Thread Márton Balassi
+1 for Vasia's suggestion On Aug 22, 2015 8:07 PM, Vasiliki Kalavri vasilikikala...@gmail.com wrote: I just came across 2 more :/ I'm also in favor of tracking these with JIRA. How about test-stability for a label? -V. On 21 August 2015 at 12:47, Matthias J. Sax

Off-by-one issues in the windowing code

2015-06-29 Thread Márton Balassi
I have found two off-by-one issues in the windowing code. The first may result in duplicate data in the last window and is easy to fix. [1] The second may result in data being swallowed in the last window, and is also not difficult to fix. [2] I've talked to Aljoscha about fixing the second one,

Re: FLINK-2066

2015-06-28 Thread Márton Balassi
Hey, Thanks for picking up the issue. This value can be specified as execution-retries.delay in the flink-conf.yaml. Hence you can check the associated value in the ConfigConstants [1] and track the way it is used. It is passed a couple of times, but is ultimately used in ExecutionGraph. [2] [1]

Re: On some GUI tools for building Flink Streaming data flows...

2015-08-11 Thread Márton Balassi
Hey Christian, Thanks for the insider view on SPQR. I have to agree with Gyula that dynamic topology build is not the highest priority for Flink currently, but certainly a very interesting feature and one that has already been requested by a couple of users. As for none of the open source

Re: Flink contributor list

2015-07-27 Thread Márton Balassi
manmat - Mátyás Manninger, aspired to be a GSoC student last year. On Mon, Jul 27, 2015 at 12:32 PM, Kostas Tzoumas ktzou...@apache.org wrote: Cool, I added a link to the website. Keep the corrections coming On Mon, Jul 27, 2015 at 11:30 AM, Vasiliki Kalavri vasilikikala...@gmail.com wrote:

Re: Student looking to contribute to Stratosphere

2015-07-15 Thread Márton Balassi
Hi, Hadoop is not a necessity for running Flink, but rather an option. Try the steps of the setup guide. [1] If you really need HDFS though to get the best IO performance I would suggest having Hadoop on all your machines running Flink. [1]

Re: Powered by Flink

2015-10-19 Thread Márton Balassi
Thanks for starting and big +1 for making it more prominent. On Mon, Oct 19, 2015 at 2:53 PM, Fabian Hueske wrote: > Thanks for starting this Kostas. > > I think the list is quite hidden in the wiki. Should we link from > flink.apache.org to that page? > > Cheers, Fabian > >

Re: streaming GroupBy + Fold

2015-10-14 Thread Márton Balassi
apache.org> > >> wrote: > >> > >> > Hi, > >> > If you are using a fold you are using none of the new code paths. I > will > >> > add support for Fold to the new windowing implementation today, > though. > >> > > >> >

Re: [DISCUSS] Java code style

2015-10-20 Thread Márton Balassi
+1 for both As we are planning to restructure the maven projects at the point that breaks the PRs anyway, so going one step further at this point in time is reasonable for me. On Tue, Oct 20, 2015 at 2:37 PM, Matthias J. Sax wrote: > big +1 for both! > > On 10/20/2015 02:31

Re: [DISCUSS] Introducing a review process for pull requests

2015-10-07 Thread Márton Balassi
+1 One minor comment: I suppose you implicitly mean that a committer can shepherd her own PR. On Wed, Oct 7, 2015 at 10:22 AM, Fabian Hueske wrote: > @Matthias: That's a good point. Each PR should be backed by a JIRA issue. > If that's not the case, we have to make the

Re: streaming GroupBy + Fold

2015-10-05 Thread Márton Balassi
at the appended > log in conjunction with the code. Each ERROR printout in the log relates to > an accumulator receiving wrong values. > > cheers Martin > > On Sat, Oct 3, 2015 at 11:29 AM, Márton Balassi <balassi.mar...@gmail.com> > wrote: > >> Hey, >>

Re: [DISCUSS] Collecting Instable Tests

2015-10-09 Thread Márton Balassi
I like Fabian's approach, you can also share these filters on JIRA. On Fri, Oct 9, 2015 at 2:55 PM, Fabian Hueske wrote: > Sorry, that was not the correct link. > You can create a bookmark for this one: > > >

Re: Does DataSet job also use Barriers to ensure exactly once.?

2015-07-09 Thread Márton Balassi
As Kostas mentioned the failure mechanisms for streaming and batch processing are different, but you can expect exactly once processing guarantees from both of them. On Thu, Jul 9, 2015 at 2:43 PM, 马国维 maguo...@outlook.com wrote: hi, everyone. The doc says Flink Streaming use Barriers to ensure

Re: [DISCUSSION] Release current master as 0.9.1 (mod few changes)

2015-08-26 Thread Márton Balassi
+1 On Wed, Aug 26, 2015 at 3:11 PM, Maximilian Michels m...@apache.org wrote: We will have a proper minor release and a preview of 0.10. After all, a good compromise. +1 On Wed, Aug 26, 2015 at 2:57 PM, Chiwan Park chiwanp...@icloud.com wrote: Robert's suggestion looks good. +1 Sent

Re: Using event timestamps

2015-09-13 Thread Márton Balassi
Hey Gyula, I have been recently looking at the streaming UdfOperators and can not recall a utility for the sources that you are looking for, but maybe I am also missing it. :) It would be a convenient addition though. Best, Marton On Sun, Sep 13, 2015 at 8:59 PM, Gyula Fóra

Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Márton Balassi
Great to see streaming graduating. :) I like the outline, both getting rid of staging, having the examples together and generally flattening the structure are very reasonable to me. You have listed flink-streaming-examples under flink-streaming-connectors and left out some less prominent maven

Re: streaming GroupBy + Fold

2015-10-03 Thread Márton Balassi
Hey, Thanks for reporting the problem, Martin. I have not merged the PR Stephan is referring to yet. [1] There I am cleaning up some of the internals too. Just out of curiosity, could you share the code for the failing test please? [1] https://github.com/apache/flink/pull/1155 On Fri, Oct 2,

Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Márton Balassi
@Matthias: +1. On Fri, Oct 2, 2015 at 11:27 AM, Stephan Ewen wrote: > @matthias +1 for that approach > > On Fri, Oct 2, 2015 at 11:21 AM, Matthias J. Sax wrote: > > > I think, rename "flink-storm-compatibility-core" to just "flink-storm" > > would be the

Re: streaming GroupBy + Fold

2015-10-05 Thread Márton Balassi
unfortunately. [1] Could you maybe produce a minimalistic example that we can actually execute? :) [1] https://github.com/mbalassi/flink/commit/9f1f02d05e2bc2043a8f514d39fbf7753ea7058d On Mon, Oct 5, 2015 at 10:06 PM, Márton Balassi <balassi.mar...@gmail.com> wrote: > Thanks, I am checki

Re: [VOTE][CANCEL] Release Apache Flink 0.10.0-milestone-1 (RC1)

2015-09-25 Thread Márton Balassi
@Matthias: Please open the PR. Maybe tag it with [0.10.0-milestone-1]. Thanks, Marton On Fri, Sep 25, 2015 at 11:38 AM, Matthias J. Sax wrote: > Do we need a Jira for the WebClient fix? Or can I just commit it? > > If anybody wants to review, please find it here: >

Re: [ANNOUNCE] Flink 0.10.1 released

2015-11-27 Thread Márton Balassi
Thanks, Robert! On Fri, Nov 27, 2015 at 5:02 PM, Vasiliki Kalavri wrote: > Thank you Robert ^^ > > On 27 November 2015 at 16:23, Till Rohrmann wrote: > > > Thanks Robert for being the release manager for 0.10.1 > > > > On Fri, Nov 27, 2015 at

Re: Object reuse documentation should be improved

2015-12-14 Thread Márton Balassi
Thanks for writing this up, Gábor. As Aljoscha suggested chaining changes all of these and makes it very tricky to work with these which should be clearly documented. That was the reason why some time ago the streaming API always copied the output of a UDF by default to avoid this ambiguous

Re: how to write dataset in a file?

2015-11-21 Thread Márton Balassi
Additionally as having multiple files under /output1.txt is standard in the Hadoop ecosystem you can transparently read all the files with env.readTextFile("/output1.txt"). You can also set parallelism on individual operators (e.g the file writer) if you really need a single output. On Fri, Nov

Re: HA Cluster restart behaviour

2016-06-01 Thread Márton Balassi
I also think that the current mechanism is weird. IMHO it makes sense to add the flag to both the start and stop scripts. On Wed, Jun 1, 2016 at 2:09 PM, Ufuk Celebi wrote: > Yes, it's expected, but you are certainly not the first one to be > confused by this behaviour. > > The

Re: Side-effects of DataSet::count

2016-05-29 Thread Márton Balassi
Hey Eron, Yes, DataSet#collect and count methods implicitly trigger a JobGraph execution, thus they also trigger writing to any previously defined sinks. The idea behind this behavior is to enable interactive querying (the one that you are used to get from a shell environment) and it is also a

Inconvenient (unforeseen?) consequences of PR #1683

2016-02-25 Thread Márton Balassi
Recent changes to the build [1] where many libraries got their core dependencies (the ones included in the flink-dist fat jar) moved to the provided scope. The reasoning was that when submitting to the Flink cluster the application already has these dependencies, while when a user writes a

Re: Inconvenient (unforeseen?) consequences of PR #1683

2016-02-25 Thread Márton Balassi
Issued JIRA ticket 3511 to make it referable in other discussions. [1] [1] https://issues.apache.org/jira/browse/FLINK-3511 On Thu, Feb 25, 2016 at 3:36 PM, Márton Balassi <balassi.mar...@gmail.com> wrote: > Recent changes to the build [1] where many libraries got their core >

Re: [VOTE] Release Apache Flink 1.0.0 (RC1)

2016-02-25 Thread Márton Balassi
Thanks for creating the candidate Robert and for the heads-up, Slim. I would like to get a PR [1] in before 1.0.0 as it breaks hashing behavior of DataStream.keyBy. The PR has the feature implemented and the Java tests adapted, there is still a bit of outstanding fix for the Scala tests. Gábor

Re: [VOTE] Release Apache Flink 1.0.0 (RC1)

2016-02-25 Thread Márton Balassi
; I did a quick check of the open pull requests yesterday evening and > > >> found > > >> >> one [1] to be included into the RC as well. Since the PR you > > mentioned > > >> is > > >> >> marked with [WIP] I thought it's not yet ready to

Re: Inconvenient (unforeseen?) consequences of PR #1683

2016-02-26 Thread Márton Balassi
t has both the "library" and the core flink > dependencies with scope "compile" > > That way the example should run in the IDE out of the box, and users that > reference the libraries will still get the correct packaging (include the > library in the user jar, but not

Re: Congrats on 1000 stars on Github

2016-02-26 Thread Márton Balassi
Great to see that. :) On Fri, Feb 26, 2016 at 1:56 PM, Theodore Vasiloudis < theodoros.vasilou...@gmail.com> wrote: > I'm sure others noticed this as well yesterday, but the project has passed > 1000 stars on Github, > just in time for the 1.0 release ;) > > Here's to the next 1000! > > --Theo >

Re: Dense matricies in FlinkML

2016-02-18 Thread Márton Balassi
Hi guys, They are at least already registered for serialization [1], so there should be no intentional conflict as Theo has suggested. [1] https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/common/FlinkMLTools.scala#L67-L73 Best, Marton On

Re: cancel running stream job

2016-02-19 Thread Márton Balassi
Adding to Ufuk's answer: yes, cancelling the job frees up the resources. :) Best: Marton On Fri, Feb 19, 2016 at 12:10 PM, Ufuk Celebi wrote: > Yes, you can cancel it via the web frontend or the CLI interface [1]. > > If you can send messages to the JobManager, you can also

Re: Tuple performance and the curious JIT compiler

2016-03-10 Thread Márton Balassi
"hold > >> > off" on working on that, would be good to know a bit more, like > >> > - when is it decided whether this project takes place? > >> > - when would results be there? > >> > - can we expect the results to be usable, i.e., how

Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-14 Thread Márton Balassi
Hey, I think we came to the agreement that this PR is not mergeable right now, so I am closing it. I personally find it inconsistent to not have the full API mirrored in Scala though, but this is something that we can revisit when preparing 2.0. Best, Marton On Mon, Mar 14, 2016 at 8:14 PM,

Re: Accessing the Configuration

2016-03-09 Thread Márton Balassi
le). > > On Wed, Mar 9, 2016 at 12:38 PM, Márton Balassi <balassi.mar...@gmail.com> > wrote: > > > Hey, > > > > I was wondering whether there is a way to access the Configuration from > an > > (Stream)ExecutionEnvironment or a RichFunction. Pr

Accessing the Configuration

2016-03-09 Thread Márton Balassi
Hey, I was wondering whether there is a way to access the Configuration from an (Stream)ExecutionEnvironment or a RichFunction. Practically I would like to set a temporary persist path in the Configuration and access the location somewhere during the topology. I have followed the way the

[streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-12 Thread Márton Balassi
Hey, I have just come across a shortcoming of the streaming Scala API: it completely lacks the Scala implementation of the DataStreamSink and instead the Java version is used. [1] I would regard this as a bug that needs a fix for 1.0.1. Unfortunately this is also api-breaking. Will post it to

Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-12 Thread Márton Balassi
The JIRA issue is FLINK-3610. On Sat, Mar 12, 2016 at 8:39 PM, Márton Balassi <balassi.mar...@gmail.com> wrote: > > I have just come across a shortcoming of the streaming Scala API: it > completely lacks the Scala implementation of the DataStreamSink and > instead the Java v

Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-13 Thread Márton Balassi
can > merge the change because it's API breaking. > One of the promises of the 1.0 release is that we are not breaking any APIs > in the 1.x.y series of Flink. We can fix those issues with a 2.x release. > > On Sun, Mar 13, 2016 at 5:27 AM, Márton Balassi <balassi.mar...@gmail.com>

Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-13 Thread Márton Balassi
't forbid adding new methods. Maybe we can > find a good way to resolve the issue without changing the signature of > existing methods. > And for tracking API breaking changes, maybe it makes sense to create a > 2.0.0 version in JIRA and set the "fix-for" for the issue to 2.0. >

Re: GSoC Project Proposal Draft: Code Generation in Serializers

2016-03-19 Thread Márton Balassi
> >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Tuple-performance-and-the-curious-JIT-compiler-td10666.html > ), > >> and I wanted to make this information available to be able to > incorporate > >> this into that discussion. I have written this dr
