Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Stephan Ewen
@matthias +1 for that approach. On Fri, Oct 2, 2015 at 11:21 AM, Matthias J. Sax wrote: > I think renaming "flink-storm-compatibility-core" to just "flink-storm" > would be the cleanest solution. > > So in flink-contrib there would be two modules: > - flink-storm > -

Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Márton Balassi
@Matthias: +1. On Fri, Oct 2, 2015 at 11:27 AM, Stephan Ewen wrote: > @matthias +1 for that approach > > On Fri, Oct 2, 2015 at 11:21 AM, Matthias J. Sax wrote: > > > I think renaming "flink-storm-compatibility-core" to just "flink-storm" > > would be the

Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Aljoscha Krettek
+1 On Fri, 2 Oct 2015 at 11:37 Márton Balassi wrote: > @Matthias: +1. > > On Fri, Oct 2, 2015 at 11:27 AM, Stephan Ewen wrote: > > > @matthias +1 for that approach > > > > On Fri, Oct 2, 2015 at 11:21 AM, Matthias J. Sax > wrote: >

Re: An update on the DataStream API refactoring WiP

2015-10-02 Thread Kostas Tzoumas
Oh, and of course, support for event time. I might be forgetting more; feel free to add to the list. On Fri, Oct 2, 2015 at 2:40 PM, Kostas Tzoumas wrote: > Hi folks, > > Currently, Aljoscha, Stephan, and I are reworking the DataStream API as > discussed before. Things are

Re: Release Flink 0.10

2015-10-02 Thread Robert Metzger
The new one does not have access to the JobManager log file. Also, the graphs for the TaskManagers are missing. On Fri, Oct 2, 2015 at 3:51 PM, Stephan Ewen wrote: > I would actually like to remove the old one, but I am okay with keeping it > and activating the new one by

Re: An update on the DataStream API refactoring WiP

2015-10-02 Thread Maximilian Michels
You made very sensible choices for improving and finalizing the Streaming API. The documentation is much clearer now. By the way, here is the pull request: https://github.com/apache/flink/pull/1208 On Fri, Oct 2, 2015 at 3:02 PM, Stephan Ewen wrote: > I added two comments to

Re: An update on the DataStream API refactoring WiP

2015-10-02 Thread Kostas Tzoumas
right, I meant DataStream On Fri, Oct 2, 2015 at 2:47 PM, Robert Metzger wrote: > I suspect: "- Deletion of "DataSet.forward() and .global()"" is a typo, you > meant DataStream ? > > On Fri, Oct 2, 2015 at 2:44 PM, Kostas Tzoumas > wrote: > > > Oh, and

[jira] [Created] (FLINK-2808) Rework / Extend the StatehandleProvider

2015-10-02 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2808: --- Summary: Rework / Extend the StatehandleProvider Key: FLINK-2808 URL: https://issues.apache.org/jira/browse/FLINK-2808 Project: Flink Issue Type: Improvement

Re: Release Flink 0.10

2015-10-02 Thread Stephan Ewen
I would actually like to remove the old one, but I am okay with keeping it and activating the new one by default On Fri, Oct 2, 2015 at 3:49 PM, Robert Metzger wrote: > The list from Kostas also contained the new JobManager front end. > > Do we want to enable it by default

An update on the DataStream API refactoring WiP

2015-10-02 Thread Kostas Tzoumas
Hi folks, Currently, Aljoscha, Stephan, and I are reworking the DataStream API as discussed before. Things are a bit in-flight right now with several commits and pull requests, and the current master containing code from both the old and the new API. I want to give you an idea of how the new API

Re: An update on the DataStream API refactoring WiP

2015-10-02 Thread Stephan Ewen
I added two comments to the pull request that this is based on... On Fri, Oct 2, 2015 at 2:47 PM, Robert Metzger wrote: > I suspect: "- Deletion of "DataSet.forward() and .global()"" is a typo, you > meant DataStream ? > > On Fri, Oct 2, 2015 at 2:44 PM, Kostas Tzoumas

Re: Release Flink 0.10

2015-10-02 Thread Robert Metzger
The list from Kostas also contained the new JobManager front end. Do we want to enable it by default in the 0.10 release? Are we going to keep the old interface, or are we removing it? I'm voting for enabling the new one by default and keeping the old one for the next release. What do you

[jira] [Created] (FLINK-2801) Rework Storm Compatibility Tests

2015-10-02 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2801: --- Summary: Rework Storm Compatibility Tests Key: FLINK-2801 URL: https://issues.apache.org/jira/browse/FLINK-2801 Project: Flink Issue Type: Bug

Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Maximilian Michels
+1 Matthias, let's limit the overhead this has for the module maintainers. On Fri, Oct 2, 2015 at 12:17 AM, Matthias J. Sax wrote: > I will commit something to flink-storm-compatibility tomorrow that > contains some internal package restructuring. I think, renaming the > three

Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Matthias J. Sax
I think renaming "flink-storm-compatibility-core" to just "flink-storm" would be the cleanest solution. So in flink-contrib there would be two modules: - flink-storm - flink-storm-examples Please let me know if you have any objections to it. -Matthias On 10/02/2015 10:45 AM, Matthias J.

Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Till Rohrmann
+1 for the new project structure. Getting rid of our code dump is a good thing. On Fri, Oct 2, 2015 at 10:25 AM, Maximilian Michels wrote: > +1 Matthias, let's limit the overhead this has for the module maintainers. > > On Fri, Oct 2, 2015 at 12:17 AM, Matthias J. Sax

Re: Hash-based aggregation

2015-10-02 Thread Stephan Ewen
I think that, roughly, an approach like the compacting hash table is the right one. Go ahead and take a stab at it if you want, and ping us if you run into obstacles. Here are a few thoughts on the hash-aggregator from discussions between Fabian and me: 1) It may be worthwhile to have a specialized

[jira] [Created] (FLINK-2806) No TypeInfo for Scala's Nothing type

2015-10-02 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2806: -- Summary: No TypeInfo for Scala's Nothing type Key: FLINK-2806 URL: https://issues.apache.org/jira/browse/FLINK-2806 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-2804) Support blocking job submission with Job Manager recovery

2015-10-02 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2804: -- Summary: Support blocking job submission with Job Manager recovery Key: FLINK-2804 URL: https://issues.apache.org/jira/browse/FLINK-2804 Project: Flink Issue

[jira] [Created] (FLINK-2803) Add test case for Flink's memory allocation

2015-10-02 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-2803: - Summary: Add test case for Flink's memory allocation Key: FLINK-2803 URL: https://issues.apache.org/jira/browse/FLINK-2803 Project: Flink Issue

Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Matthias J. Sax
Sure. Will do that. -Matthias On 10/02/2015 10:35 AM, Stephan Ewen wrote: > @Matthias: How about getting rid of the storm-compatibility-parent and > making the core and examples projects directly projects in "contrib" > > On Fri, Oct 2, 2015 at 10:34 AM, Till Rohrmann

[jira] [Created] (FLINK-2807) Add javadocs/comments to new windowing mechanics

2015-10-02 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-2807: - Summary: Add javadocs/comments to new windowing mechanics Key: FLINK-2807 URL: https://issues.apache.org/jira/browse/FLINK-2807 Project: Flink Issue Type:

Rethink the "always copy" policy for streaming topologies

2015-10-02 Thread Stephan Ewen
Hi all! Now that we are coming to the next release, I wanted to make sure we finalize the decision on that point, because it would be nice to not break the behavior of the system afterwards. Right now, when tasks are chained together, the system always copies the elements between different tasks in

[jira] [Created] (FLINK-2813) Document off-heap configuration

2015-10-02 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-2813: - Summary: Document off-heap configuration Key: FLINK-2813 URL: https://issues.apache.org/jira/browse/FLINK-2813 Project: Flink Issue Type: Bug

Re: Rethink the "always copy" policy for streaming topologies

2015-10-02 Thread Stephan Ewen
@Martin: I think you were a user of the Batch API before we made the non-reuse mode the default mode. By now, when you use a GroupReduceFunction or a MapPartitionFunction or so, you need not do any cloning or copying. All functions that receive groups will always get fresh elements. This

Re: Rethink the "always copy" policy for streaming topologies

2015-10-02 Thread Matthias J. Sax
+1 for disabling the copy by default. On 10/02/2015 05:53 PM, Stephan Ewen wrote: > Hi all! > > Now that we are coming to the next release, I wanted to make sure we > finalize the decision on that point, because it would be nice to not break > the behavior of the system afterwards. > > Right now, when

Re: Rethink the "always copy" policy for streaming topologies

2015-10-02 Thread Martin Neumann
It seems like I'm one of the few people that run into the mutable elements trap on the Batch API from time to time. At the moment I always clone when I'm not 100% sure to avoid hunting the bugs later. So far I was happy to learn that this is not a problem in Streaming, but that's just me. When
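The "mutable elements trap" discussed in this thread can be sketched outside of Flink. The following is a minimal plain-Python illustration, not Flink code: `Event`, `upstream`, and `collect` are hypothetical names, and the `copy_between_tasks` flag only mimics the idea of copying elements when handing them from one chained task to the next.

```python
import copy


class Event:
    """Hypothetical mutable record passed between two chained tasks."""

    def __init__(self, count):
        self.count = count


def upstream(n):
    """Upstream task that reuses a single mutable object for every element."""
    reused = Event(0)
    for i in range(n):
        reused.count = i
        yield reused


def collect(copy_between_tasks):
    """Downstream task that buffers the elements it receives.

    With copy_between_tasks=True each element is deep-copied at the
    hand-over, so later mutations upstream cannot affect the buffer.
    """
    buffered = []
    for element in upstream(3):
        handed_over = copy.deepcopy(element) if copy_between_tasks else element
        buffered.append(handed_over)
    return [e.count for e in buffered]


with_copy = collect(copy_between_tasks=True)
without_copy = collect(copy_between_tasks=False)
print(with_copy, without_copy)
```

Without the copy, every buffered entry refers to the same object, so the buffer only reflects the last mutation. This is the kind of bug that defensive cloning in user functions guards against, at the cost of extra work per element.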

[jira] [Created] (FLINK-2812) KeySelectorUtil.getSelectorForKeys and TypeExtractor.getKeySelectorTypes are incompatible

2015-10-02 Thread Márton Balassi (JIRA)
Márton Balassi created FLINK-2812: - Summary: KeySelectorUtil.getSelectorForKeys and TypeExtractor.getKeySelectorTypes are incompatible Key: FLINK-2812 URL: https://issues.apache.org/jira/browse/FLINK-2812

[jira] [Created] (FLINK-2811) Add page with configuration overview

2015-10-02 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-2811: - Summary: Add page with configuration overview Key: FLINK-2811 URL: https://issues.apache.org/jira/browse/FLINK-2811 Project: Flink Issue Type: Sub-task

streaming GroupBy + Fold

2015-10-02 Thread Martin Neumann
Hi, In one of my programs I run a Fold on a GroupedDataStream. The aim is to aggregate the values in each group. It seems the aggregator in the Fold function is shared at the operator level, so all groups that end up on the same operator get mashed together. Is this the intended behavior? If so, what
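The difference between the behavior described in this message and a per-group fold can be sketched in plain Python. This is an illustration only, not the Flink API: `fold_shared` and `fold_per_key` are hypothetical helpers, and the records are assumed to be `(key, value)` pairs that all arrive at the same operator instance.

```python
def fold_shared(records, initial, fold_fn):
    """Fold with ONE accumulator per operator instance.

    All keys that land on this instance are folded into the same
    accumulator, which is the behavior reported in the message.
    """
    acc = initial
    out = []
    for key, value in records:
        acc = fold_fn(acc, value)
        out.append((key, acc))
    return out


def fold_per_key(records, initial, fold_fn):
    """Fold with a separate accumulator per key (per group)."""
    state = {}
    out = []
    for key, value in records:
        acc = fold_fn(state.get(key, initial), value)
        state[key] = acc
        out.append((key, acc))
    return out


def add(acc, value):
    return acc + value


records = [("a", 1), ("b", 10), ("a", 2), ("b", 20)]
shared = fold_shared(records, 0, add)
per_key = fold_per_key(records, 0, add)
print(shared)
print(per_key)
```

With a shared accumulator, the running result for key "a" includes values that belong to key "b"; keeping state per key keeps the groups separate.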

Re: Rethink the "always copy" policy for streaming topologies

2015-10-02 Thread Till Rohrmann
Do we know what kind of impact the non-reuse policy has? Maybe the serialization overhead is subsumed by other effects. But in general I'm ok with changing the default to non copying. We just have to document this feature properly. On Oct 2, 2015 6:31 PM, "Maximilian Michels"

Re: streaming GroupBy + Fold

2015-10-02 Thread Martin Neumann
One of my colleagues found it today when we were hunting bugs. We were using the latest 0.10 version pulled from Maven this morning. The program we were testing is new code, so I can't tell you if the behavior has changed or if it was always like this. On Fri, Oct 2, 2015 at 7:46 PM,

RE: Pulling Streaming out of staging and project restructure

2015-10-02 Thread fhueske
+1 From: Henry Saputra Sent: Friday, October 2, 2015 19:34 To: dev@flink.apache.org Subject: Re: Pulling Streaming out of staging and project restructure +1 On Friday, October 2, 2015, Matthias J. Sax wrote: > I think renaming "flink-storm-compatibility-core" to just

[jira] [Created] (FLINK-2800) kryo serialization problem

2015-10-02 Thread Stefano Bortoli (JIRA)
Stefano Bortoli created FLINK-2800: -- Summary: kryo serialization problem Key: FLINK-2800 URL: https://issues.apache.org/jira/browse/FLINK-2800 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-2815) [REFACTOR] Remove Pact from class and file names since it is no longer valid reference

2015-10-02 Thread Henry Saputra (JIRA)
Henry Saputra created FLINK-2815: Summary: [REFACTOR] Remove Pact from class and file names since it is no longer valid reference Key: FLINK-2815 URL: https://issues.apache.org/jira/browse/FLINK-2815