[jira] [Created] (FLINK-2794) Exactly-once support

2015-10-01 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-2794:
-

 Summary: Exactly-once support
 Key: FLINK-2794
 URL: https://issues.apache.org/jira/browse/FLINK-2794
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Streaming
Reporter: Maximilian Michels






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Márton Balassi
Great to see streaming graduating. :)

I like the outline: getting rid of staging, having the examples together,
and generally flattening the structure all seem very reasonable to me.

You have listed flink-streaming-examples under flink-streaming-connectors
and left out some less prominent maven modules, but I assume the first is
accidental while the second is intentional to make the list a bit briefer.

Best,

Marton


On Thu, Oct 1, 2015 at 12:25 PM, Stephan Ewen wrote:

> Hi all!
>
> We are making good headway with reworking the last parts of the Window API.
> After that, the streaming API should be good to be pulled out of staging.
>
> Since we are reorganizing the projects as part of that, I would shift a bit
> more to bring things a bit more up to date.
>
> In this restructure, I would like to get rid of the "flink-staging"
> project. Anyone who only uses the maven artifacts sees no difference
> whether a project is in "staging" or not, so it does not help much to have
> that directory structure.
> On the other hand, projects have a tendency to linger in staging forever
> (like avro, spargel, hbase, jdbc, ...)
>
> The new structure could be
>
> flink-core
> flink-java
> flink-scala
> flink-streaming-core
> flink-streaming-scala
>
> flink-runtime
> flink-runtime-web
> flink-optimizer
> flink-clients
>
> flink-shaded
>   -> flink-shaded-hadoop
>   -> flink-shaded-hadoop2
>   -> flink-shaded-include-yarn-tests
>   -> flink-shaded-curator
>
> flink-examples
>   -> (have all examples, Scala and Java, Batch and Streaming)
>
> flink-batch-connectors
>   -> flink-avro
>   -> flink-jdbc
>   -> flink-hadoop-compatibility
>   -> flink-hbase
>   -> flink-hcatalog
>
> flink-streaming-connectors
>   -> flink-connector-twitter
>   -> flink-streaming-examples
>   -> flink-connector-flume
>   -> flink-connector-kafka
>   -> flink-connector-elasticsearch
>   -> flink-connector-rabbitmq
>   -> flink-connector-filesystem
>
> flink-libraries
>   -> flink-gelly
>   -> flink-gelly-scala
>   -> flink-ml
>   -> flink-table
>   -> flink-language-binding
>   -> flink-python
>
>
> flink-scala-shell
>
> flink-test-utils
> flink-tests
> flink-fs-tests
>
> flink-contrib
>   -> flink-storm-compatibility
>   -> flink-storm-compatibility-examples
>   -> flink-streaming-utils
>   -> flink-tweet-inputformat
>   -> flink-operator-stats
>   -> flink-tez
>
> flink-quickstart
>   -> flink-quickstart-java
>   -> flink-quickstart-scala
>   -> flink-tez-quickstart
>
> flink-yarn
> flink-yarn-tests
>
> flink-dist
>
> flink-benchmark
>
>
> Let me know if that makes sense!
>
> Greetings,
> Stephan
>


Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Stephan Ewen
+1 for Robert's comments.

On Thu, Oct 1, 2015 at 3:16 PM, Robert Metzger wrote:

> Big +1 for graduating streaming out of staging. It is widely used, also in
> production, and we are putting a lot of effort into hardening it.
> I also agree with the proposed new maven module structure.
>
> We have to carefully test the reworked structure for the scripts which are
> generating the hadoop1 and the scala 2.11 poms (they are transformed using
> a bunch of bash scripts). I can do that once the PR is open.
>
> @Chesnay: I would be fine with including the language binding into python
>
> > Where would new projects reside in, that previously would have been put
> > into flink-staging?
>
> flink-contrib
>
>
> @Kostas: I understand the idea behind your suggested renaming, but that's
> just a name. I don't think it's going to influence how people are seeing
> Flink: to me, adding "flink-streaming-core" to the dependencies doesn't
> feel second class.
> Also, the "flink-datastream-scala" module would depend on
> "flink-dataset-scala", which is kind of weird.
>
>
> I'm wondering whether we should remove the "flink-test-utils" module. I
> don't think it's really necessary, because we can put the test jars into the
> flink-tests project and include them using the "test-jar" dependency.
>
>
> On Thu, Oct 1, 2015 at 2:27 PM, Kostas Tzoumas 
> wrote:
>
> > +1
> >
> > I wanted to suggest that we rename modules to fully accept streaming as
> > first class, qualifying also "batch" as "batch" (e.g., flink-java -->
> > flink-dataset-java, flink-streaming --> flink-datastream, etc).
> >
> > This would break maven dependencies (temporary hell :-) so it's not a
> > decision to take lightly. I'm not strongly advocating for it.
> >
> >
> > On Thu, Oct 1, 2015 at 12:44 PM, Chesnay Schepler 
> > wrote:
> >
> > > I like it in general. But while we're at it, what is the purpose of the
> > > flink-tests project, or rather which tests belong there instead of the
> > > individual projects?
> > >
> > > Where would new projects reside in, that previously would have been put
> > > into flink-staging?
> > >
> > > Lastly, I'd like to merge flink-language-binding into flink-python. I
> can
> > > go more into detail but the gist of it is that the abstraction just
> > doesn't
> > > work.
> > >
> > >
> > > On 01.10.2015 12:40, Márton Balassi wrote:
> > >
> > >> Great to see streaming graduating. :)
> > >>
> > >> I like the outline, both getting rid of staging, having the examples
> > >> together and generally flattening the structure are very reasonable to
> > me.
> > >>
> > >> You have listed flink-streaming-examples under
> > flink-streaming-connectors
> > >> and left out some less prominent maven modules, but I assume the first
> > is
> > >> accidental while the second is intentional to make the list a bit
> > briefer.
> > >>
> > >> Best,
> > >>
> > >> Marton
> > >>
> > >>
> > >> On Thu, Oct 1, 2015 at 12:25 PM, Stephan Ewen 
> wrote:
> > >>
> > >> Hi all!
> > >>>
> > >>> We are making good headway with reworking the last parts of the
> Window
> > >>> API.
> > >>> After that, the streaming API should be good to be pulled out of
> > staging.
> > >>>
> > >>> Since we are reorganizing the projects as part of that, I would
> shift a
> > >>> bit
> > >>> more to bring things a bit more up to date.
> > >>>
> > >>> In this restructure, I would like to get rid of the "flink-staging"
> > >>> project. Anyone who only uses the maven artifacts sees no difference
> > >>> whether a project is in "staging" or not, so it does not help much to
> > >>> have
> > >>> that directory structure.
> > >>> On the other hand, projects have a tendency to linger in staging
> > forever
> > >>> (like avro, spargel, hbase, jdbc, ...)
> > >>>
> > >>> The new structure could be
> > >>>
> > >>> flink-core
> > >>> flink-java
> > >>> flink-scala
> > >>> flink-streaming-core
> > >>> flink-streaming-scala
> > >>>
> > >>> flink-runtime
> > >>> flink-runtime-web
> > >>> flink-optimizer
> > >>> flink-clients
> > >>>
> > >>> flink-shaded
> > >>>-> flink-shaded-hadoop
> > >>>-> flink-shaded-hadoop2
> > >>>-> flink-shaded-include-yarn-tests
> > >>>-> flink-shaded-curator
> > >>>
> > >>> flink-examples
> > >>>-> (have all examples, Scala and Java, Batch and Streaming)
> > >>>
> > >>> flink-batch-connectors
> > >>>-> flink-avro
> > >>>-> flink-jdbc
> > >>>-> flink-hadoop-compatibility
> > >>>-> flink-hbase
> > >>>-> flink-hcatalog
> > >>>
> > >>> flink-streaming-connectors
> > >>>-> flink-connector-twitter
> > >>>-> flink-streaming-examples
> > >>>-> flink-connector-flume
> > >>>-> flink-connector-kafka
> > >>>-> flink-connector-elasticsearch
> > >>>-> flink-connector-rabbitmq
> > >>>-> flink-connector-filesystem
> > >>>
> > >>> flink-libraries
> > >>>-> flink-gelly
> > >>>-> flink-gelly-scala
> > >>>-> flink-ml
> > >>>-> 

Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Chesnay Schepler
If we remove flink-staging because projects tend to get stuck there, 
what mechanism prevents the same happening with flink-contrib?


On 01.10.2015 15:19, Stephan Ewen wrote:

> +1 for Robert's comments.






[jira] [Created] (FLINK-2795) Print JobExecutionResult for interactively invoked jobs

2015-10-01 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-2795:
-

 Summary: Print JobExecutionResult for interactively invoked jobs
 Key: FLINK-2795
 URL: https://issues.apache.org/jira/browse/FLINK-2795
 Project: Flink
  Issue Type: Improvement
  Components: Command-line client
Affects Versions: 0.9, 0.10
Reporter: Maximilian Michels
 Fix For: 0.10


{{JobExecutionResult}}s are currently not available if the Flink job is
submitted interactively via the command-line client. The reason is that
execution goes through the {{ContextEnvironment}}, which holds a reference to
the {{Client}} and uses it to submit the job.

We need to pass the {{JobExecutionResult}} back to the {{Client}}, as we
already do for the JobID.





Pulling Streaming out of staging and project restructure

2015-10-01 Thread Stephan Ewen
Hi all!

We are making good headway with reworking the last parts of the Window API.
After that, the streaming API should be good to be pulled out of staging.

Since we are reorganizing the projects as part of that, I would like to move
a few more things around to bring the structure up to date.

In this restructure, I would like to get rid of the "flink-staging"
project. Anyone who only uses the maven artifacts sees no difference
whether a project is in "staging" or not, so it does not help much to have
that directory structure.
On the other hand, projects have a tendency to linger in staging forever
(like avro, spargel, hbase, jdbc, ...)

The new structure could be

flink-core
flink-java
flink-scala
flink-streaming-core
flink-streaming-scala

flink-runtime
flink-runtime-web
flink-optimizer
flink-clients

flink-shaded
  -> flink-shaded-hadoop
  -> flink-shaded-hadoop2
  -> flink-shaded-include-yarn-tests
  -> flink-shaded-curator

flink-examples
  -> (have all examples, Scala and Java, Batch and Streaming)

flink-batch-connectors
  -> flink-avro
  -> flink-jdbc
  -> flink-hadoop-compatibility
  -> flink-hbase
  -> flink-hcatalog

flink-streaming-connectors
  -> flink-connector-twitter
  -> flink-streaming-examples
  -> flink-connector-flume
  -> flink-connector-kafka
  -> flink-connector-elasticsearch
  -> flink-connector-rabbitmq
  -> flink-connector-filesystem

flink-libraries
  -> flink-gelly
  -> flink-gelly-scala
  -> flink-ml
  -> flink-table
  -> flink-language-binding
  -> flink-python


flink-scala-shell

flink-test-utils
flink-tests
flink-fs-tests

flink-contrib
  -> flink-storm-compatibility
  -> flink-storm-compatibility-examples
  -> flink-streaming-utils
  -> flink-tweet-inputformat
  -> flink-operator-stats
  -> flink-tez

flink-quickstart
  -> flink-quickstart-java
  -> flink-quickstart-scala
  -> flink-tez-quickstart

flink-yarn
flink-yarn-tests

flink-dist

flink-benchmark


Let me know if that makes sense!

Greetings,
Stephan


Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Chesnay Schepler
I like it in general. But while we're at it, what is the purpose of the 
flink-tests project, or rather which tests belong there instead of the 
individual projects?


Where would new projects reside in, that previously would have been put 
into flink-staging?


Lastly, I'd like to merge flink-language-binding into flink-python. I 
can go more into detail but the gist of it is that the abstraction just 
doesn't work.


On 01.10.2015 12:40, Márton Balassi wrote:

> Great to see streaming graduating. :)
>
> I like the outline, both getting rid of staging, having the examples
> together and generally flattening the structure are very reasonable to me.
>
> You have listed flink-streaming-examples under flink-streaming-connectors
> and left out some less prominent maven modules, but I assume the first is
> accidental while the second is intentional to make the list a bit briefer.
>
> Best,
>
> Marton







Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Kostas Tzoumas
+1

I wanted to suggest that we rename modules to fully accept streaming as
first class, qualifying also "batch" as "batch" (e.g., flink-java -->
flink-dataset-java, flink-streaming --> flink-datastream, etc).

This would break maven dependencies (temporary hell :-) so it's not a
decision to take lightly. I'm not strongly advocating for it.


On Thu, Oct 1, 2015 at 12:44 PM, Chesnay Schepler wrote:

> I like it in general. But while we're at it, what is the purpose of the
> flink-tests project, or rather which tests belong there instead of the
> individual projects?
>
> Where would new projects reside in, that previously would have been put
> into flink-staging?
>
> Lastly, I'd like to merge flink-language-binding into flink-python. I can
> go more into detail but the gist of it is that the abstraction just doesn't
> work.
>
>
> On 01.10.2015 12:40, Márton Balassi wrote:
>
>> Great to see streaming graduating. :)
>>
>> I like the outline, both getting rid of staging, having the examples
>> together and generally flattening the structure are very reasonable to me.
>>
>> You have listed flink-streaming-examples under flink-streaming-connectors
>> and left out some less prominent maven modules, but I assume the first is
>> accidental while the second is intentional to make the list a bit briefer.
>>
>> Best,
>>
>> Marton
>>
>>
>> On Thu, Oct 1, 2015 at 12:25 PM, Stephan Ewen  wrote:
>>
>> Hi all!
>>>
>>> We are making good headway with reworking the last parts of the Window
>>> API.
>>> After that, the streaming API should be good to be pulled out of staging.
>>>
>>> Since we are reorganizing the projects as part of that, I would shift a
>>> bit
>>> more to bring things a bit more up to date.
>>>
>>> In this restructure, I would like to get rid of the "flink-staging"
>>> project. Anyone who only uses the maven artifacts sees no difference
>>> whether a project is in "staging" or not, so it does not help much to
>>> have
>>> that directory structure.
>>> On the other hand, projects have a tendency to linger in staging forever
>>> (like avro, spargel, hbase, jdbc, ...)
>>>
>>> The new structure could be
>>>
>>> flink-core
>>> flink-java
>>> flink-scala
>>> flink-streaming-core
>>> flink-streaming-scala
>>>
>>> flink-runtime
>>> flink-runtime-web
>>> flink-optimizer
>>> flink-clients
>>>
>>> flink-shaded
>>>-> flink-shaded-hadoop
>>>-> flink-shaded-hadoop2
>>>-> flink-shaded-include-yarn-tests
>>>-> flink-shaded-curator
>>>
>>> flink-examples
>>>-> (have all examples, Scala and Java, Batch and Streaming)
>>>
>>> flink-batch-connectors
>>>-> flink-avro
>>>-> flink-jdbc
>>>-> flink-hadoop-compatibility
>>>-> flink-hbase
>>>-> flink-hcatalog
>>>
>>> flink-streaming-connectors
>>>-> flink-connector-twitter
>>>-> flink-streaming-examples
>>>-> flink-connector-flume
>>>-> flink-connector-kafka
>>>-> flink-connector-elasticsearch
>>>-> flink-connector-rabbitmq
>>>-> flink-connector-filesystem
>>>
>>> flink-libraries
>>>-> flink-gelly
>>>-> flink-gelly-scala
>>>-> flink-ml
>>>-> flink-table
>>>-> flink-language-binding
>>>-> flink-python
>>>
>>>
>>> flink-scala-shell
>>>
>>> flink-test-utils
>>> flink-tests
>>> flink-fs-tests
>>>
>>> flink-contrib
>>>-> flink-storm-compatibility
>>>-> flink-storm-compatibility-examples
>>>-> flink-streaming-utils
>>>-> flink-tweet-inputformat
>>>-> flink-operator-stats
>>>-> flink-tez
>>>
>>> flink-quickstart
>>>-> flink-quickstart-java
>>>-> flink-quickstart-scala
>>>-> flink-tez-quickstart
>>>
>>> flink-yarn
>>> flink-yarn-tests
>>>
>>> flink-dist
>>>
>>> flink-benchmark
>>>
>>>
>>> Let me know if that makes sense!
>>>
>>> Greetings,
>>> Stephan
>>>
>>>
>


Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Robert Metzger
Big +1 for graduating streaming out of staging. It is widely used, also in
production, and we are putting a lot of effort into hardening it.
I also agree with the proposed new maven module structure.

We have to carefully test the reworked structure for the scripts which are
generating the hadoop1 and the scala 2.11 poms (they are transformed using
a bunch of bash scripts). I can do that once the PR is open.

@Chesnay: I would be fine with including the language binding into python

> Where would new projects reside in, that previously would have been put
> into flink-staging?

flink-contrib


@Kostas: I understand the idea behind your suggested renaming, but that's
just a name. I don't think it's going to influence how people are seeing
Flink: to me, adding "flink-streaming-core" to the dependencies doesn't
feel second class.
Also, the "flink-datastream-scala" module would depend on
"flink-dataset-scala", which is kind of weird.


I'm wondering whether we should remove the "flink-test-utils" module. I
don't think it's really necessary, because we can put the test jars into the
flink-tests project and include them using the "test-jar" dependency.
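
For reference, the "test-jar" mechanism mentioned above is standard Maven:
the module that owns the test classes attaches them as an additional
artifact, and other modules depend on that artifact with type "test-jar".
A minimal sketch, assuming flink-tests is the producing module and the
consumer is built in the same reactor (the coordinates and the version
property are assumptions for illustration, not taken from this thread):

```xml
<!-- In the producing module's pom.xml (assumed here: flink-tests):
     package the src/test classes into an additional *-tests.jar -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <goal>test-jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

<!-- In a consuming module's pom.xml: depend on that test artifact,
     visible only on the test classpath -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-tests</artifactId>
  <version>${project.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
```

One caveat with this approach: Maven does not pull in the producing
module's test-scoped dependencies transitively, so a consumer may have to
declare those dependencies itself.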


On Thu, Oct 1, 2015 at 2:27 PM, Kostas Tzoumas wrote:

> +1
>
> I wanted to suggest that we rename modules to fully accept streaming as
> first class, qualifying also "batch" as "batch" (e.g., flink-java -->
> flink-dataset-java, flink-streaming --> flink-datastream, etc).
>
> This would break maven dependencies (temporary hell :-) so it's not a
> decision to take lightly. I'm not strongly advocating for it.
>
>
> On Thu, Oct 1, 2015 at 12:44 PM, Chesnay Schepler 
> wrote:
>
> > I like it in general. But while we're at it, what is the purpose of the
> > flink-tests project, or rather which tests belong there instead of the
> > individual projects?
> >
> > Where would new projects reside in, that previously would have been put
> > into flink-staging?
> >
> > Lastly, I'd like to merge flink-language-binding into flink-python. I can
> > go more into detail but the gist of it is that the abstraction just
> doesn't
> > work.
> >
> >
> > On 01.10.2015 12:40, Márton Balassi wrote:
> >
> >> Great to see streaming graduating. :)
> >>
> >> I like the outline, both getting rid of staging, having the examples
> >> together and generally flattening the structure are very reasonable to
> me.
> >>
> >> You have listed flink-streaming-examples under
> flink-streaming-connectors
> >> and left out some less prominent maven modules, but I assume the first
> is
> >> accidental while the second is intentional to make the list a bit
> briefer.
> >>
> >> Best,
> >>
> >> Marton
> >>
> >>
> >> On Thu, Oct 1, 2015 at 12:25 PM, Stephan Ewen  wrote:
> >>
> >> Hi all!
> >>>
> >>> We are making good headway with reworking the last parts of the Window
> >>> API.
> >>> After that, the streaming API should be good to be pulled out of
> staging.
> >>>
> >>> Since we are reorganizing the projects as part of that, I would shift a
> >>> bit
> >>> more to bring things a bit more up to date.
> >>>
> >>> In this restructure, I would like to get rid of the "flink-staging"
> >>> project. Anyone who only uses the maven artifacts sees no difference
> >>> whether a project is in "staging" or not, so it does not help much to
> >>> have
> >>> that directory structure.
> >>> On the other hand, projects have a tendency to linger in staging
> forever
> >>> (like avro, spargel, hbase, jdbc, ...)
> >>>
> >>> The new structure could be
> >>>
> >>> flink-core
> >>> flink-java
> >>> flink-scala
> >>> flink-streaming-core
> >>> flink-streaming-scala
> >>>
> >>> flink-runtime
> >>> flink-runtime-web
> >>> flink-optimizer
> >>> flink-clients
> >>>
> >>> flink-shaded
> >>>-> flink-shaded-hadoop
> >>>-> flink-shaded-hadoop2
> >>>-> flink-shaded-include-yarn-tests
> >>>-> flink-shaded-curator
> >>>
> >>> flink-examples
> >>>-> (have all examples, Scala and Java, Batch and Streaming)
> >>>
> >>> flink-batch-connectors
> >>>-> flink-avro
> >>>-> flink-jdbc
> >>>-> flink-hadoop-compatibility
> >>>-> flink-hbase
> >>>-> flink-hcatalog
> >>>
> >>> flink-streaming-connectors
> >>>-> flink-connector-twitter
> >>>-> flink-streaming-examples
> >>>-> flink-connector-flume
> >>>-> flink-connector-kafka
> >>>-> flink-connector-elasticsearch
> >>>-> flink-connector-rabbitmq
> >>>-> flink-connector-filesystem
> >>>
> >>> flink-libraries
> >>>-> flink-gelly
> >>>-> flink-gelly-scala
> >>>-> flink-ml
> >>>-> flink-table
> >>>-> flink-language-binding
> >>>-> flink-python
> >>>
> >>>
> >>> flink-scala-shell
> >>>
> >>> flink-test-utils
> >>> flink-tests
> >>> flink-fs-tests
> >>>
> >>> flink-contrib
> >>>-> flink-storm-compatibility
> >>>-> flink-storm-compatibility-examples
> >>>-> flink-streaming-utils
> >>>-> flink-tweet-inputformat
> >>>-> 

[jira] [Created] (FLINK-2798) Prepare new web dashboard for executing it on YARN

2015-10-01 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-2798:
-

 Summary: Prepare new web dashboard for executing it on YARN
 Key: FLINK-2798
 URL: https://issues.apache.org/jira/browse/FLINK-2798
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend, YARN Client
Reporter: Robert Metzger








Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Robert Metzger
@Chesnay: Nothing prevents projects from getting stuck there. But at least
we don't have two staging repositories (dist, staging). Also I would not
say that there has been no graduation out of staging. Yarn was also there
once, streaming and gelly are currently leaving it. So I would actually say
it worked.

One more thing: Can we actually put the web submission client into a
separate module until we have something new and better?

It really does not fit into "flink-clients". It makes Flink users pull in the
jetty dependencies even if they never use them, for example when running
Flink from an IDE.


On Thu, Oct 1, 2015 at 4:45 PM, Kostas Tzoumas wrote:

> +1 to Robert and practicality :-)
>
> As I said before, I do not feel strongly about this, I was torn myself.
>

Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Kostas Tzoumas
+1 to Robert and practicality :-)

As I said before, I do not feel strongly about this; I was torn myself.

On Thu, Oct 1, 2015 at 3:28 PM, Chesnay Schepler  wrote:

> If we remove flink-staging because projects tend to get stuck there, what
> mechanism prevents the same happening with flink-contrib?
>
>
> On 01.10.2015 15:19, Stephan Ewen wrote:
>
>> +1 for Robert's comments.
>>
>> On Thu, Oct 1, 2015 at 3:16 PM, Robert Metzger 
>> wrote:
>>
>>> Big +1 for graduating streaming out of staging. It is widely used, also in
>>> production, and we are putting a lot of effort into hardening it.
>>> I also agree with the proposed new maven module structure.
>>>
>>> We have to carefully test the reworked structure for the scripts which
>>> are
>>> generating the hadoop1 and the scala 2.11 poms (they are transformed
>>> using
>>> a bunch of bash scripts). I can do that once the PR is open.
>>>
>>> @Chesnay: I would be fine with including the language binding into python
>>>
>>>> Where would new projects reside in, that previously would have been put
>>>> into flink-staging?
>>>
>>> flink-contrib
>>>
>>>
>>> @Kostas: I understand the idea behind your suggested renaming, but that's
>>> just a name. I don't think it's going to influence how people are seeing
>>> Flink: It doesn't feel like second class when adding
>>> "flink-streaming-core"
>>> to the dependencies to me.
>>> Also, the "flink-datastream-scala" module would depend on
>>> "flink-dataset-scala", which is kind of weird.
>>>
>>>
>>> I'm wondering whether we should remove the "flink-test-utils" module. I
>>> don't think it's really necessary, because we can put the test jars into
>>> the
>>> flink-tests project and include them using the "test-jar" dependency.
>>>
>>>
>>> On Thu, Oct 1, 2015 at 2:27 PM, Kostas Tzoumas 
>>> wrote:
>>>
>>>> +1
>>>>
>>>> I wanted to suggest that we rename modules to fully accept streaming as
>>>> first class, qualifying also "batch" as "batch" (e.g., flink-java -->
>>>> flink-dataset-java, flink-streaming --> flink-datastream, etc).
>>>>
>>>> This would break maven dependencies (temporary hell :-) so it's not a
>>>> decision to take lightly. I'm not strongly advocating for it.
>>>>
>>>>
>>>> On Thu, Oct 1, 2015 at 12:44 PM, Chesnay Schepler wrote:
>>>>
>>>>> I like it in general. But while we're at it, what is the purpose of the
>>>>> flink-tests project, or rather which tests belong there instead of the
>>>>> individual projects?
>>>>>
>>>>> Where would new projects reside in, that previously would have been put
>>>>> into flink-staging?
>>>>>
>>>>> Lastly, I'd like to merge flink-language-binding into flink-python. I can
>>>>> go more into detail but the gist of it is that the abstraction just doesn't
>>>>> work.
>>>>>
>>>>>
>>>>> On 01.10.2015 12:40, Márton Balassi wrote:
>>>>>
>>>>>> Great to see streaming graduating. :)
>>>>>>
>>>>>> I like the outline, both getting rid of staging, having the examples
>>>>>> together and generally flattening the structure are very reasonable to me.
>>>>>>
>>>>>> You have listed flink-streaming-examples under flink-streaming-connectors
>>>>>> and left out some less prominent maven modules, but I assume the first is
>>>>>> accidental while the second is intentional to make the list a bit briefer.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Marton
>>>>>>
>>>>>>
>>>>>> On Thu, Oct 1, 2015 at 12:25 PM, Stephan Ewen wrote:
>>>>>>
>>>>>>> Hi all!
>>>>>>>
>>>>>>> We are making good headway with reworking the last parts of the Window API.
>>>>>>> After that, the streaming API should be good to be pulled out of staging.
>>>>>>>
>>>>>>> Since we are reorganizing the projects as part of that, I would shift a bit
>>>>>>> more to bring things a bit more up to date.
>>>>>>>
>>>>>>> In this restructure, I would like to get rid of the "flink-staging"
>>>>>>> project. Anyone who only uses the maven artifacts sees no difference
>>>>>>> whether a project is in "staging" or not, so it does not help much to have
>>>>>>> that directory structure.
>>>>>>> On the other hand, projects have a tendency to linger in staging forever
>>>>>>> (like avro, spargel, hbase, jdbc, ...)
>>>>>>>
>>>>>>> The new structure could be
>>>>>>>
>>>>>>> flink-core
>>>>>>> flink-java
>>>>>>> flink-scala
>>>>>>> flink-streaming-core
>>>>>>> flink-streaming-scala
>>>>>>>
>>>>>>> flink-runtime
>>>>>>> flink-runtime-web
>>>>>>> flink-optimizer
>>>>>>> flink-clients
>>>>>>>
>>>>>>> flink-shaded
>>>>>>>   -> flink-shaded-hadoop
>>>>>>>   -> flink-shaded-hadoop2
>>>>>>>   -> flink-shaded-include-yarn-tests
>>>>>>>   -> flink-shaded-curator
>>>>>>>
>>>>>>> flink-examples
>>>>>>>   -> (have all examples, Scala and Java, Batch and Streaming)
>>>>>>>
>>>>>>>

Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Aljoscha Krettek
+1 For pulling out and the restructure. Enough good arguments have been
brought forward and I agree with all of them.

On Thu, 1 Oct 2015 at 17:47 Ufuk Celebi  wrote:

>
> > On 01 Oct 2015, at 16:48, Robert Metzger  wrote:
> >
> > @Chesnay: Nothing prevents projects from getting stuck there. But at least
> > we don't have two staging repositories (dist, staging). Also I would not
> > say that there has been no graduation out of staging. Yarn was also there
> > once, streaming and gelly are currently leaving it. So I would actually say
> > it worked.
> >
> > One more thing: Can we actually put the web submission client into a
> > separate module until we have something new and better?
> >
> > It really does not fit into "flink-clients". It makes Flink users pull the
> > jetty dependencies even if they never use them, for example when running
> > Flink from an IDE.
>
> I think most people use it as part of the binary distribution and nothing
> else anyways. So +1.


Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Maximilian Michels
+1 for the new Maven project structure
+1 for removing the flink-test-utils module
+1 for moving flink-language-binding to flink-python

On Thu, Oct 1, 2015 at 6:27 PM, Aljoscha Krettek  wrote:
> +1 For pulling out and the restructure. Enough good arguments have been
> brought forward and I agree with all of them.
>
> On Thu, 1 Oct 2015 at 17:47 Ufuk Celebi  wrote:
>
>>
>> > On 01 Oct 2015, at 16:48, Robert Metzger  wrote:
>> >
>> > @Chesnay: Nothing prevents projects from getting stuck there. But at least
>> > we don't have two staging repositories (dist, staging). Also I would not
>> > say that there has been no graduation out of staging. Yarn was also there
>> > once, streaming and gelly are currently leaving it. So I would actually say
>> > it worked.
>> >
>> > One more thing: Can we actually put the web submission client into a
>> > separate module until we have something new and better?
>> >
>> > It really does not fit into "flink-clients". It makes Flink users pull the
>> > jetty dependencies even if they never use them, for example when running
>> > Flink from an IDE.
>>
>> I think most people use it as part of the binary distribution and nothing
>> else anyways. So +1.


Hash-based aggregation

2015-10-01 Thread Gábor Gévay
Hello,

I would really like to see FLINK-2237 solved.
I would implement this feature over the weekend, if the
CompactingHashTable can be used to solve it (see my comment there).
Could you please give me some advice on whether this is a viable
approach, or whether you see difficulties that I'm not aware of?

Best,
Gabor


Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Henry Saputra
+1

I like the idea of moving "staging" projects into appropriate modules.

While we are at it, I would like to propose renaming
"flink-hadoop-compatibility" to "flink-hadoop". It is on my bucket list,
but it would be nice if it were part of the re-org.
Supporting Hadoop in the connector implicitly means compatibility with Hadoop.
The same applies to renaming "flink-storm-compatibility" to "flink-storm".

- Henry

On Thu, Oct 1, 2015 at 3:25 AM, Stephan Ewen  wrote:
> Hi all!
>
> We are making good headway with reworking the last parts of the Window API.
> After that, the streaming API should be good to be pulled out of staging.
>
> Since we are reorganizing the projects as part of that, I would shift a bit
> more to bring things a bit more up to date.
>
> In this restructure, I would like to get rid of the "flink-staging"
> project. Anyone who only uses the maven artifacts sees no difference
> whether a project is in "staging" or not, so it does not help much to have
> that directory structure.
> On the other hand, projects have a tendency to linger in staging forever
> (like avro, spargel, hbase, jdbc, ...)
>
> The new structure could be
>
> flink-core
> flink-java
> flink-scala
> flink-streaming-core
> flink-streaming-scala
>
> flink-runtime
> flink-runtime-web
> flink-optimizer
> flink-clients
>
> flink-shaded
>   -> flink-shaded-hadoop
>   -> flink-shaded-hadoop2
>   -> flink-shaded-include-yarn-tests
>   -> flink-shaded-curator
>
> flink-examples
>   -> (have all examples, Scala and Java, Batch and Streaming)
>
> flink-batch-connectors
>   -> flink-avro
>   -> flink-jdbc
>   -> flink-hadoop-compatibility
>   -> flink-hbase
>   -> flink-hcatalog
>
> flink-streaming-connectors
>   -> flink-connector-twitter
>   -> flink-streaming-examples
>   -> flink-connector-flume
>   -> flink-connector-kafka
>   -> flink-connector-elasticsearch
>   -> flink-connector-rabbitmq
>   -> flink-connector-filesystem
>
> flink-libraries
>   -> flink-gelly
>   -> flink-gelly-scala
>   -> flink-ml
>   -> flink-table
>   -> flink-language-binding
>   -> flink-python
>
>
> flink-scala-shell
>
> flink-test-utils
> flink-tests
> flink-fs-tests
>
> flink-contrib
>   -> flink-storm-compatibility
>   -> flink-storm-compatibility-examples
>   -> flink-streaming-utils
>   -> flink-tweet-inputformat
>   -> flink-operator-stats
>   -> flink-tez
>
> flink-quickstart
>   -> flink-quickstart-java
>   -> flink-quickstart-scala
>   -> flink-tez-quickstart
>
> flink-yarn
> flink-yarn-tests
>
> flink-dist
>
> flink-benchmark
>
>
> Let me know if that makes sense!
>
> Greetings,
> Stephan


Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Matthias J. Sax
I will commit something to flink-storm-compatibility tomorrow that
contains some internal package restructuring. I think renaming the
three modules in this commit would be a smart move, as both changes
result in merge conflicts when rebasing open PRs. That way we limit this
pain to a single time. If there are no objections, I will commit those
changes tomorrow.

-Matthias

On 10/01/2015 09:52 PM, Henry Saputra wrote:
> +1
> 
> I like the idea of moving "staging" projects into appropriate modules.
>
> While we are at it, I would like to propose renaming
> "flink-hadoop-compatibility" to "flink-hadoop". It is on my bucket list,
> but it would be nice if it were part of the re-org.
> Supporting Hadoop in the connector implicitly means compatibility with Hadoop.
> The same applies to renaming "flink-storm-compatibility" to "flink-storm".
> 
> - Henry
> 
> On Thu, Oct 1, 2015 at 3:25 AM, Stephan Ewen  wrote:
>> Hi all!
>>
>> We are making good headway with reworking the last parts of the Window API.
>> After that, the streaming API should be good to be pulled out of staging.
>>
>> Since we are reorganizing the projects as part of that, I would shift a bit
>> more to bring things a bit more up to date.
>>
>> In this restructure, I would like to get rid of the "flink-staging"
>> project. Anyone who only uses the maven artifacts sees no difference
>> whether a project is in "staging" or not, so it does not help much to have
>> that directory structure.
>> On the other hand, projects have a tendency to linger in staging forever
>> (like avro, spargel, hbase, jdbc, ...)
>>
>> The new structure could be
>>
>> flink-core
>> flink-java
>> flink-scala
>> flink-streaming-core
>> flink-streaming-scala
>>
>> flink-runtime
>> flink-runtime-web
>> flink-optimizer
>> flink-clients
>>
>> flink-shaded
>>   -> flink-shaded-hadoop
>>   -> flink-shaded-hadoop2
>>   -> flink-shaded-include-yarn-tests
>>   -> flink-shaded-curator
>>
>> flink-examples
>>   -> (have all examples, Scala and Java, Batch and Streaming)
>>
>> flink-batch-connectors
>>   -> flink-avro
>>   -> flink-jdbc
>>   -> flink-hadoop-compatibility
>>   -> flink-hbase
>>   -> flink-hcatalog
>>
>> flink-streaming-connectors
>>   -> flink-connector-twitter
>>   -> flink-streaming-examples
>>   -> flink-connector-flume
>>   -> flink-connector-kafka
>>   -> flink-connector-elasticsearch
>>   -> flink-connector-rabbitmq
>>   -> flink-connector-filesystem
>>
>> flink-libraries
>>   -> flink-gelly
>>   -> flink-gelly-scala
>>   -> flink-ml
>>   -> flink-table
>>   -> flink-language-binding
>>   -> flink-python
>>
>>
>> flink-scala-shell
>>
>> flink-test-utils
>> flink-tests
>> flink-fs-tests
>>
>> flink-contrib
>>   -> flink-storm-compatibility
>>   -> flink-storm-compatibility-examples
>>   -> flink-streaming-utils
>>   -> flink-tweet-inputformat
>>   -> flink-operator-stats
>>   -> flink-tez
>>
>> flink-quickstart
>>   -> flink-quickstart-java
>>   -> flink-quickstart-scala
>>   -> flink-tez-quickstart
>>
>> flink-yarn
>> flink-yarn-tests
>>
>> flink-dist
>>
>> flink-benchmark
>>
>>
>> Let me know if that makes sense!
>>
>> Greetings,
>> Stephan


