[jira] [Created] (FLINK-1841) WindowJoinITCase fails

2015-04-07 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-1841: Summary: WindowJoinITCase fails Key: FLINK-1841 URL: https://issues.apache.org/jira/browse/FLINK-1841 Project: Flink Issue Type: Bug Components: St

Re: [DISCUSS] Break up streaming connectors into subprojects

2015-04-07 Thread Henry Saputra
Would this proposal also include packaging streaming connectors into separate source and binary jars? - Henry On Tue, Apr 7, 2015 at 12:21 PM, Stephan Ewen wrote: > What do you think about dividing the streaming connectors project into > various smaller projects, basically one per connector? > >

[DISCUSS] Break up streaming connectors into subprojects

2015-04-07 Thread Stephan Ewen
What do you think about dividing the streaming connectors project into various smaller projects, basically one per connector? I am personally always happy when projects offer me artifacts that contain what I need, and not a lot of other unnecessary dependencies as well. Many people using the strea

[jira] [Created] (FLINK-1840) Job execution fails on Windows (native and Cygwin)

2015-04-07 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-1840: Summary: Job execution fails on Windows (native and Cygwin) Key: FLINK-1840 URL: https://issues.apache.org/jira/browse/FLINK-1840 Project: Flink Issue Type:

Re: Should collect() and count() be treated as data sinks?

2015-04-07 Thread Stephan Ewen
For the sake of prototyping, can you use a util that simply materializes the intermediate result in a file system (using typeInfo input and output formats)? On Tue, Apr 7, 2015 at 6:21 PM, Maximilian Michels wrote: > On Mon, Apr 6, 2015 at 2:37 PM, Stephan Ewen wrote: > > > BTW: Should "print(

[jira] [Created] (FLINK-1839) Failures in TwitterStreamITCase

2015-04-07 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1839: Summary: Failures in TwitterStreamITCase Key: FLINK-1839 URL: https://issues.apache.org/jira/browse/FLINK-1839 Project: Flink Issue Type: Bug Compone

Re: Should collect() and count() be treated as data sinks?

2015-04-07 Thread Maximilian Michels
On Mon, Apr 6, 2015 at 2:37 PM, Stephan Ewen wrote: > BTW: Should "print()" be also an "eager" statement? I think it needs to be, > if we want to print to the driver's std out Yes, if we change print() to print on the Client, then it needs to execute eagerly. On Thu, Apr 2, 2015 at 6:59 PM, Al

Re: Parquet Article / Tutorial

2015-04-07 Thread Henry Saputra
+1 to the idea. Awesome work, Felix On Tuesday, April 7, 2015, Maximilian Michels wrote: > Hi Felix, > > Very nice informative read. > > +1 for a short blog post and a full version in the wiki. > +1 for putting this into flink-contrib > > > On Tue, Apr 7, 2015 at 1:46 PM, Fabian Hueske > wrote

Re: Should collect() and count() be treated as data sinks?

2015-04-07 Thread Alexander Alexandrov
> Should "print()" be also an "eager" statement? I would expect this to be the case as I can only imagine an implementation of print() via collect(). 2015-04-06 14:37 GMT+02:00 Stephan Ewen : > count() and collect() need to immediately trigger an execution, because the > driver program cannot pr

[jira] [Created] (FLINK-1838) Update streaming programming guide

2015-04-07 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1838: Summary: Update streaming programming guide Key: FLINK-1838 URL: https://issues.apache.org/jira/browse/FLINK-1838 Project: Flink Issue Type: Task Compone

[jira] [Created] (FLINK-1837) Throw an exceptions for iterative streaming programs with checkpointing enabled

2015-04-07 Thread Márton Balassi (JIRA)
Márton Balassi created FLINK-1837: Summary: Throw an exceptions for iterative streaming programs with checkpointing enabled Key: FLINK-1837 URL: https://issues.apache.org/jira/browse/FLINK-1837 Proje

Re: Parquet Article / Tutorial

2015-04-07 Thread Maximilian Michels
Hi Felix, Very nice informative read. +1 for a short blog post and a full version in the wiki. +1 for putting this into flink-contrib On Tue, Apr 7, 2015 at 1:46 PM, Fabian Hueske wrote: > Very nice article! > How about adding the full article to the wiki and having a shorter version > as a b

Flink Forward 2015

2015-04-07 Thread Kostas Tzoumas
Hi everyone, The folks at data Artisans and the Berlin Big Data Center are organizing the first physical conference all about Apache Flink in Berlin this coming October: http://flink-forward.org The conference will be held in a beautiful spot, an old brewery turned event space (the same space that

Re: Parquet Article / Tutorial

2015-04-07 Thread Fabian Hueske
Very nice article! How about adding the full article to the wiki and having a shorter version as a blog post (with a link to the wiki)? Adding the code to contrib would be great! 2015-04-07 12:45 GMT+02:00 Kostas Tzoumas : > Looks very nice! Would love to see a blog post on that! > > On Mon, Apr

[jira] [Created] (FLINK-1836) just test

2015-04-07 Thread zhu (JIRA)
zhu created FLINK-1836: Summary: just test Key: FLINK-1836 URL: https://issues.apache.org/jira/browse/FLINK-1836 Project: Flink Issue Type: Bug Reporter: zhu just test -- This message

[jira] [Created] (FLINK-1835) Spurious failure of YARN tests

2015-04-07 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1835: Summary: Spurious failure of YARN tests Key: FLINK-1835 URL: https://issues.apache.org/jira/browse/FLINK-1835 Project: Flink Issue Type: Bug Componen

Re: Parquet Article / Tutorial

2015-04-07 Thread Kostas Tzoumas
Looks very nice! Would love to see a blog post on that! On Mon, Apr 6, 2015 at 7:19 PM, Felix Neutatz wrote: > The intention was to post it on the blog, but if you think it would better > fit into the wiki, that would be also fine :) > > About the code: I have not thought about putting it to con

Re: Rework of the window-join semantics

2015-04-07 Thread Paris Carbone
Hello Matthias, Sure, ordering guarantees are indeed a tricky thing; I recall having that discussion back in TU Berlin. Bear in mind though that DataStream, our abstract data type, represents a *partitioned* unbounded sequence of events. There are no *global* ordering guarantees made whatsoeve

Re: Rework of the window-join semantics

2015-04-07 Thread Matthias J. Sax
Hi @all, please keep me in the loop for this work. I am highly interested and I want to help on it. My initial thoughts are as follows: 1) Currently, system timestamps are used and the suggested approach can be seen as state-of-the-art (there is actually a research paper using the exact same jo

Re: Rework of the window-join semantics

2015-04-07 Thread Gyula Fóra
Hey, I agree with Kostas, if we define the exact semantics of how this works, this is not more ad-hoc than any other stateful operator with multiple inputs. (And I don't think any other system supports something similar) We need to make some design choices that are similar to the issues we had for wi

[jira] [Created] (FLINK-1834) Is mapred.output.dir conf parameter really required?

2015-04-07 Thread Flavio Pompermaier (JIRA)
Flavio Pompermaier created FLINK-1834: Summary: Is mapred.output.dir conf parameter really required? Key: FLINK-1834 URL: https://issues.apache.org/jira/browse/FLINK-1834 Project: Flink

Re: Rework of the window-join semantics

2015-04-07 Thread Kostas Tzoumas
Yes, we should write these semantics down. I volunteer to help. I don't think that this is very ad-hoc. The semantics are basically the following. Assuming an arriving element from the left side: (1) We find the right-side matches (2) We insert the left-side arrival into the left window (3) We rec
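The first two steps listed in the message above can be sketched in a few lines. This is only an illustration under stated assumptions, not Flink's implementation: the class name `WindowJoinSketch`, the key-extractor parameters, and the use of plain in-memory deques as "windows" are all hypothetical. Step (3) is cut off in the archived snippet, so it is deliberately not modeled.

```python
from collections import deque


class WindowJoinSketch:
    """Illustrative per-element join of a left arrival against the right window.

    Hypothetical helper for this thread; not part of any Flink API.
    """

    def __init__(self, key_left, key_right):
        self.key_left = key_left      # extracts the join key from a left element
        self.key_right = key_right    # extracts the join key from a right element
        self.left_window = deque()    # elements currently held for the left side
        self.right_window = deque()   # elements currently held for the right side

    def on_left(self, element):
        # (1) Find the right-side matches among the elements currently windowed.
        key = self.key_left(element)
        matches = [(element, r) for r in self.right_window
                   if self.key_right(r) == key]
        # (2) Insert the left-side arrival into the left window.
        self.left_window.append(element)
        # Step (3) is truncated in the snippet above and is not modeled here.
        return matches
```

With the right window holding `("a", 1)`, `("b", 2)` and `("a", 3)`, a left arrival `("a", "x")` keyed on the first field is paired with the two `"a"` elements and then stored in the left window. The result depends on what the right window holds at arrival time, which is the predictability concern raised elsewhere in this thread.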

[jira] [Created] (FLINK-1833) Refactor partition availability notification in ExecutionGraph

2015-04-07 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1833: Summary: Refactor partition availability notification in ExecutionGraph Key: FLINK-1833 URL: https://issues.apache.org/jira/browse/FLINK-1833 Project: Flink Iss

Re: [QUESTION] Sort Key Types

2015-04-07 Thread Fabian Hueske
Sure, simple API concepts are important. But this concept is quite hidden and will only appear to people who are already quite involved with the system. Only users who define their own TypeInformations that require the distinction between sorting and regular keys need to worry about it. So it is only re

Re: Rework of the window-join semantics

2015-04-07 Thread Stephan Ewen
Is the approach of joining an element at a time from one input against a window on the other input not a bit arbitrary? This just joins whatever currently happens to be the window by the time the single element arrives - that is a bit non-predictable, right? As a more general point: The whole sem

Re: [QUESTION] Sort Key Types

2015-04-07 Thread Stephan Ewen
I think the point is that understanding the concepts becomes increasingly difficult if we just keep introducing more concepts all the time with little consideration. Nothing prohibits sorting on Tuples and case classes; it only requires a string or two more in the program code. I think it is

Re: [QUESTION] Sort Key Types

2015-04-07 Thread Fabian Hueske
Limiting sorting to atomic fields would also prohibit sorting on Tuples and CaseClasses. I guess that is something that is not too uncommon. As a user, I would also expect to sort POJOs that implement Comparable. Also, the default implementation of isSortKey() returns the result of isKeyType(). So use

[jira] [Created] (FLINK-1832) start-local.bat/start-local.sh does not work if there is a white space in the file path (windows)

2015-04-07 Thread Nikolaas Steenbergen (JIRA)
Nikolaas Steenbergen created FLINK-1832: Summary: start-local.bat/start-local.sh does not work if there is a white space in the file path (windows) Key: FLINK-1832 URL: https://issues.apache.org/jira/browse

Re: [QUESTION] Sort Key Types

2015-04-07 Thread Stephan Ewen
I am wondering if it is necessary to add this extra distinction and complexity in the code. One simple way to get around this would be to simply require that user-requested sorts specify all atomic fields directly. Wouldn't that be a fair restriction? I am saying this because I am seeing the API cla

Re: [jira] [Created] (FLINK-1831) runtime.taskmanager.RegistrationTests fails sporiously

2015-04-07 Thread Till Rohrmann
The error looks as if there is already another JobManager started with FAKE_JOB_MANAGER name. This might be caused by a JobManager which has not yet completely shut down. On Tue, Apr 7, 2015 at 9:52 AM, Márton Balassi (JIRA) wrote: > Márton Balassi created FLINK-1831: > -

Re: [QUESTION] Sort Key Types

2015-04-07 Thread Fabian Hueske
Regular keys differ from sort keys in that they can be (somehow) sorted, but their order is not necessarily "intuitive". So regular keys are sufficient for sort-based grouping, but not for explicit sorting (groupSort, partitionSort, outputSort). Right now, this difference is only relevant for POJO

[jira] [Created] (FLINK-1831) runtime.taskmanager.RegistrationTests fails sporiously

2015-04-07 Thread Márton Balassi (JIRA)
Márton Balassi created FLINK-1831: Summary: runtime.taskmanager.RegistrationTests fails sporiously Key: FLINK-1831 URL: https://issues.apache.org/jira/browse/FLINK-1831 Project: Flink Issue