[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328723#comment-14328723
 ] 

ASF GitHub Bot commented on FLINK-1501:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/421#issuecomment-75209814
  
Indeed :-)

What does the OS load mean? It would be really awesome to show the CPU 
load, too. I think this is a helpful indicator. 


> Integrate metrics library and report basic metrics to JobManager web interface
> --
>
> Key: FLINK-1501
> URL: https://issues.apache.org/jira/browse/FLINK-1501
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: pre-apache
>
>
> As per mailing list, the library: https://github.com/dropwizard/metrics
> The goal of this task is to get the basic infrastructure in place.
> Subsequent issues will integrate more features into the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1501] Add metrics library for monitorin...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/421#issuecomment-75209814
  
Indeed :-)

What does the OS load mean? It would be really awesome to show the CPU 
load, too. I think this is a helpful indicator. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-1585) Fix computation of TaskManager memory for Mini Cluster (tests)

2015-02-20 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1585.
-
Resolution: Fixed

Fixed via a1104491b88ff3293142992900e69253206736ac

> Fix computation of TaskManager memory for Mini Cluster (tests)
> --
>
> Key: FLINK-1585
> URL: https://issues.apache.org/jira/browse/FLINK-1585
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 0.9
>
>
> The mini cluster currently sets up the memory for the TaskManager incorrectly. 
> It fails to
>  - respect preconfigured memory settings
>  - evaluate the config for heap fraction and network buffers
>  - take the memory needed by the JobManager into account





[jira] [Resolved] (FLINK-1499) Make TaskManager to disconnect from JobManager in case of a restart

2015-02-20 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1499.
-
   Resolution: Fixed
Fix Version/s: 0.9
 Assignee: Till Rohrmann

Fixed via 665e60128e02cf1990d5a8b13e4704e180b5054f

> Make TaskManager to disconnect from JobManager in case of a restart
> ---
>
> Key: FLINK-1499
> URL: https://issues.apache.org/jira/browse/FLINK-1499
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> In case of a TaskManager restart, the TaskManager does not unregister from 
> the JobManager. However, it tries to reconnect once the restart has 
> finished. In order to maintain a consistent state, the TaskManager should 
> disconnect from the JobManager upon restart or termination.





[jira] [Resolved] (FLINK-1484) JobManager restart does not notify the TaskManager

2015-02-20 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1484.
-
   Resolution: Fixed
Fix Version/s: 0.9
 Assignee: Till Rohrmann

Fixed via 5ac1911a831c310e9f6634099e9767a2f0bce306

> JobManager restart does not notify the TaskManager
> --
>
> Key: FLINK-1484
> URL: https://issues.apache.org/jira/browse/FLINK-1484
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> A JobManager restart, which can happen due to an uncaught exception, is not 
> communicated to the connected TaskManagers: they continue sending messages to a 
> JobManager whose state has been reset. 
> TaskManagers should be informed about a possible restart and clean up their own 
> state in such a case. Afterwards, they can try to reconnect to the restarted 
> JobManager.





[jira] [Resolved] (FLINK-1556) JobClient does not wait until a job failed completely if submission exception

2015-02-20 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1556.
-
   Resolution: Fixed
Fix Version/s: 0.9

Re-fixed in 4ff91cd7cd3e05637d70a56cfc6f5a4ba2f2501a

> JobClient does not wait until a job failed completely if submission exception
> -
>
> Key: FLINK-1556
> URL: https://issues.apache.org/jira/browse/FLINK-1556
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> If an exception occurs during job submission, the {{JobClient}} receives a 
> {{SubmissionFailure}}. Upon receiving this message, the {{JobClient}} 
> terminates itself and returns the error to the {{Client}}. This indicates to 
> the user that the job has failed completely, which is not necessarily 
> true. 
> If the user submits another job directly after such a failure, it might be 
> the case that not all slots of the previously failed job have been returned. 
> This can lead to a {{NoResourceAvailableException}}.
> We can solve this problem by waiting in the {{JobClient}} for the job 
> failure to complete.





[jira] [Resolved] (FLINK-1584) Spurious failure of TaskManagerFailsITCase

2015-02-20 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1584.
-
   Resolution: Fixed
Fix Version/s: 0.9
 Assignee: Till Rohrmann

Fixed via 52b40debad6293cc0591eabb847eaac322d174aa

> Spurious failure of TaskManagerFailsITCase
> --
>
> Key: FLINK-1584
> URL: https://issues.apache.org/jira/browse/FLINK-1584
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> The {{TaskManagerFailsITCase}} fails spuriously on Travis. The reason might 
> be that different test cases try to access the same {{JobManager}}.





[jira] [Updated] (FLINK-1586) Add support for iteration visualization for Streaming programs

2015-02-20 Thread Gyula Fora (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-1586:
--
Component/s: Streaming

> Add support for iteration visualization for Streaming programs
> --
>
> Key: FLINK-1586
> URL: https://issues.apache.org/jira/browse/FLINK-1586
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Reporter: Gyula Fora
>Priority: Minor
>
> The plan visualizer currently does not support streaming programs containing 
> iterations. There is no visualization at all due to an exception thrown in 
> the visualizer script.





[GitHub] flink pull request: [FLINK-1501] Add metrics library for monitorin...

2015-02-20 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/421#issuecomment-75218017
  
This looks great! ^^




[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328795#comment-14328795
 ] 

ASF GitHub Bot commented on FLINK-1501:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/421#issuecomment-75218017
  
This looks great! ^^


> Integrate metrics library and report basic metrics to JobManager web interface
> --
>
> Key: FLINK-1501
> URL: https://issues.apache.org/jira/browse/FLINK-1501
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: pre-apache
>
>
> As per mailing list, the library: https://github.com/dropwizard/metrics
> The goal of this task is to get the basic infrastructure in place.
> Subsequent issues will integrate more features into the system.





[jira] [Commented] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()

2015-02-20 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328808#comment-14328808
 ] 

Vasia Kalavri commented on FLINK-1587:
--

Hi,

you can easily reproduce this if you have an edge set with an invalid edge ID, as 
[~andralungu] says.
This is a case that shouldn't happen, so I suggest we add a check and throw an 
exception. This should probably also be taken care of in the other two degree 
methods. 
[~andralungu] or [~cebe], would either of you like to take care of this? :)

-V.

> coGroup throws NoSuchElementException on iterator.next()
> 
>
> Key: FLINK-1587
> URL: https://issues.apache.org/jira/browse/FLINK-1587
> Project: Flink
>  Issue Type: Bug
>  Components: Gelly
> Environment: flink-0.8.0-SNAPSHOT
>Reporter: Carsten Brandt
>
> I am receiving the following exception when running a simple job that 
> extracts the outdegree from a graph using Gelly. It is currently only failing 
> on the cluster and I am not able to reproduce it locally. Will try that in 
> the next days.
> {noformat}
> 02/20/2015 02:27:02:  CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) 
> switched to FAILED
> java.util.NoSuchElementException
>   at java.util.Collections$EmptyIterator.next(Collections.java:3006)
>   at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665)
>   at 
> org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130)
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493)
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
>   at 
> org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
>   at java.lang.Thread.run(Thread.java:745)
> 02/20/2015 02:27:02:  Job execution switched to status FAILING
> ...
> {noformat}
> The error occurs in Gelly's Graph.java at this line: 
> https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636
> Is there any valid case where a coGroup Iterator may be empty? As far as I 
> see there is a bug somewhere.
> I'd like to write a test case for this to reproduce the issue. Where can I 
> put such a test?
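The check-and-throw suggested in the comments above can be sketched in plain Java. The `requireSingleVertex` helper and its error message are hypothetical illustrations, not actual Gelly code:

```java
import java.util.Collections;
import java.util.Iterator;

public class CoGroupGuard {
    // Hypothetical guard mirroring the suggested fix: instead of calling
    // next() on a possibly-empty vertex iterator inside coGroup, fail fast
    // with a descriptive error when an edge references a missing vertex ID.
    static <V> V requireSingleVertex(Iterator<V> vertices, Object vertexId) {
        if (!vertices.hasNext()) {
            throw new IllegalStateException(
                "Edge references vertex " + vertexId + " which is not in the vertex set");
        }
        return vertices.next();
    }

    public static void main(String[] args) {
        // An empty co-group side reproduces the reported situation; the guard
        // turns the bare NoSuchElementException into an actionable message.
        try {
            requireSingleVertex(Collections.<String>emptyIterator(), 42L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```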





[GitHub] flink pull request: [FLINK-1515]Splitted runVertexCentricIteration...

2015-02-20 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/402#issuecomment-75220973
  
Hi,
if there are no more comments, I'd like to merge this. 
There is a failing check in Travis (not related to this PR):
```
Tests in error: 
  
JobManagerFailsITCase.run:40->org$scalatest$BeforeAndAfterAll$$super$run:40->org$scalatest$WordSpecLike$$super$run:40->runTests:40->runTest:40->withFixture:40
 » Timeout
  
JobManagerFailsITCase.run:40->org$scalatest$BeforeAndAfterAll$$super$run:40->org$scalatest$WordSpecLike$$super$run:40->runTests:40->runTest:40->withFixture:40->TestKit.within:707->TestKit.within:707
 » Timeout
```
Is this fixed by #422? Shall I proceed?
Thanks!




[jira] [Commented] (FLINK-1515) [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328814#comment-14328814
 ] 

ASF GitHub Bot commented on FLINK-1515:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/402#issuecomment-75220973
  
Hi,
if there are no more comments, I'd like to merge this. 
There is a failing check in Travis (not related to this PR):
```
Tests in error: 
  
JobManagerFailsITCase.run:40->org$scalatest$BeforeAndAfterAll$$super$run:40->org$scalatest$WordSpecLike$$super$run:40->runTests:40->runTest:40->withFixture:40
 » Timeout
  
JobManagerFailsITCase.run:40->org$scalatest$BeforeAndAfterAll$$super$run:40->org$scalatest$WordSpecLike$$super$run:40->runTests:40->runTest:40->withFixture:40->TestKit.within:707->TestKit.within:707
 » Timeout
```
Is this fixed by #422? Shall I proceed?
Thanks!


> [Gelly] Enable access to aggregators and broadcast sets in vertex-centric 
> iteration
> ---
>
> Key: FLINK-1515
> URL: https://issues.apache.org/jira/browse/FLINK-1515
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Martin Kiefer
>
> Currently, aggregators and broadcast sets cannot be accessed through Gelly's  
> {{runVertexCentricIteration}} method. The functionality is already present in 
> the {{VertexCentricIteration}} and we just need to expose it.
> This could be done like this: We create a method 
> {{createVertexCentricIteration}}, which will return a 
> {{VertexCentricIteration}} object and we change {{runVertexCentricIteration}} 
> to accept this as a parameter (and return the graph after running this 
> iteration).
> The user can configure the {{VertexCentricIteration}} by directly calling the 
> public methods {{registerAggregator}}, {{setName}}, etc.
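The create/configure/run flow proposed in the description can be illustrated with a self-contained stand-in. The class below only mimics the described API shape; it is not the real Gelly `VertexCentricIteration`:

```java
import java.util.ArrayList;
import java.util.List;

public class IterationConfigSketch {
    // Hypothetical stand-in for VertexCentricIteration, exposing the public
    // configuration methods mentioned in the description.
    static class VertexCentricIteration {
        private String name = "unnamed";
        private final List<String> aggregators = new ArrayList<>();
        void setName(String name) { this.name = name; }
        void registerAggregator(String aggName) { aggregators.add(aggName); }
        String describe() { return name + " with aggregators " + aggregators; }
    }

    // Proposed factory: create the iteration so the user can configure it ...
    static VertexCentricIteration createVertexCentricIteration() {
        return new VertexCentricIteration();
    }

    // ... then pass it to run. In Gelly this would execute the iteration and
    // return the resulting graph; here it just reports the configuration.
    static String runVertexCentricIteration(VertexCentricIteration it) {
        return "ran " + it.describe();
    }

    public static void main(String[] args) {
        VertexCentricIteration it = createVertexCentricIteration();
        it.setName("SSSP");
        it.registerAggregator("componentCount");
        System.out.println(runVertexCentricIteration(it));
    }
}
```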





[jira] [Created] (FLINK-1588) Load flink configuration also from classloader

2015-02-20 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1588:
-

 Summary: Load flink configuration also from classloader
 Key: FLINK-1588
 URL: https://issues.apache.org/jira/browse/FLINK-1588
 Project: Flink
  Issue Type: New Feature
Reporter: Robert Metzger


The GlobalConfiguration object should also check if it finds the 
flink-config.yaml in the classpath and load it from there.

This allows users to inject configuration files in local "standalone" or 
embedded environments.
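A minimal sketch of the classpath lookup described above, assuming a simple `Properties`-style key/value file. The real GlobalConfiguration parses YAML, which this sketch does not attempt; the class and resource names are illustrative:

```java
import java.io.InputStream;
import java.util.Properties;

public class ClasspathConfigLoader {
    // Look up the configuration resource via the classloader; fall back to an
    // empty configuration (i.e. the existing file-based lookup) if absent.
    static Properties loadFromClasspath(String resource) {
        Properties props = new Properties();
        InputStream in = ClasspathConfigLoader.class.getClassLoader()
                .getResourceAsStream(resource);
        if (in == null) {
            return props; // not on the classpath: caller keeps its defaults
        }
        try (InputStream stream = in) {
            props.load(stream);
        } catch (Exception e) {
            throw new RuntimeException("Could not read " + resource, e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties p = loadFromClasspath("flink-config.yaml");
        System.out.println("loaded " + p.size() + " entries");
    }
}
```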





[jira] [Created] (FLINK-1589) Add option to pass Configuration to LocalExecutor

2015-02-20 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1589:
-

 Summary: Add option to pass Configuration to LocalExecutor
 Key: FLINK-1589
 URL: https://issues.apache.org/jira/browse/FLINK-1589
 Project: Flink
  Issue Type: New Feature
Reporter: Robert Metzger


Right now it's not possible for users to pass custom configuration values to 
Flink when running it from within an IDE.

It would be very convenient to be able to create a local execution environment 
that allows passing configuration files.
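A self-contained sketch of the requested shape. All names here (`LocalEnvironment`, `createLocalEnvironment`) are illustrative stand-ins; the actual API introduced by the corresponding PR may differ:

```java
import java.util.HashMap;
import java.util.Map;

public class LocalEnvSketch {
    // Stand-in for a local execution environment that carries user-supplied
    // configuration values instead of only built-in defaults.
    static class LocalEnvironment {
        final Map<String, String> config;
        LocalEnvironment(Map<String, String> config) { this.config = config; }
        String get(String key, String def) { return config.getOrDefault(key, def); }
    }

    // Proposed factory: accept a configuration at construction time.
    static LocalEnvironment createLocalEnvironment(Map<String, String> config) {
        return new LocalEnvironment(config);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("taskmanager.numberOfTaskSlots", "4");
        LocalEnvironment env = createLocalEnvironment(conf);
        // The user-supplied value overrides the default:
        System.out.println(env.get("taskmanager.numberOfTaskSlots", "1"));
    }
}
```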





[jira] [Assigned] (FLINK-1589) Add option to pass Configuration to LocalExecutor

2015-02-20 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1589:
-

Assignee: Robert Metzger

> Add option to pass Configuration to LocalExecutor
> -
>
> Key: FLINK-1589
> URL: https://issues.apache.org/jira/browse/FLINK-1589
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Right now it's not possible for users to pass custom configuration values to 
> Flink when running it from within an IDE.
> It would be very convenient to be able to create a local execution 
> environment that allows passing configuration files.





[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...

2015-02-20 Thread rmetzger
GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/427

[FLINK-1589] Add option to pass configuration to LocalExecutor

Please review the changes.

I'll add a testcase and update the documentation later today.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink1589

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/427.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #427


commit b75b4c285f4810faa5d02d638b61dc7b8e125c8d
Author: Robert Metzger 
Date:   2015-02-20T11:40:41Z

[FLINK-1589] Add option to pass configuration to LocalExecutor






[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328841#comment-14328841
 ] 

ASF GitHub Bot commented on FLINK-1589:
---

GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/427

[FLINK-1589] Add option to pass configuration to LocalExecutor

Please review the changes.

I'll add a testcase and update the documentation later today.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink1589

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/427.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #427


commit b75b4c285f4810faa5d02d638b61dc7b8e125c8d
Author: Robert Metzger 
Date:   2015-02-20T11:40:41Z

[FLINK-1589] Add option to pass configuration to LocalExecutor




> Add option to pass Configuration to LocalExecutor
> -
>
> Key: FLINK-1589
> URL: https://issues.apache.org/jira/browse/FLINK-1589
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Right now it's not possible for users to pass custom configuration values to 
> Flink when running it from within an IDE.
> It would be very convenient to be able to create a local execution 
> environment that allows passing configuration files.





[jira] [Updated] (FLINK-1421) Implement a SAMOA Adapter for Flink Streaming

2015-02-20 Thread Paris Carbone (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paris Carbone updated FLINK-1421:
-
Description: 
Yahoo's Samoa is an experimental incremental machine learning library that 
builds on an abstract compositional data streaming model to write streaming 
algorithms. The task is to provide an adapter from SAMOA topologies to 
Flink-streaming job graphs in order to support Flink as a backend engine for 
SAMOA tasks.

A startup guide can be viewed here:
https://docs.google.com/document/d/18glDJDYmnFGT1UGtZimaxZpGeeg1Ch14NgDoymhPk2A/pub

The main working branch of the adapter :
https://github.com/senorcarbone/samoa/tree/flink-integration

  was:
Yahoo's Samoa is an experimental incremental machine learning library that 
builds on an abstract compositional data streaming model to write streaming 
algorithms. The task is to provide an adapter from SAMOA topologies to 
Flink-streaming job graphs in order to support Flink as a backend engine for 
SAMOA tasks.

A startup guide can be viewed here:
https://docs.google.com/document/d/18glDJDYmnFGT1UGtZimaxZpGeeg1Ch14NgDoymhPk2A/pub

The main working branch of the adapter :
https://github.com/senorcarbone/samoa/tree/flink


> Implement a SAMOA Adapter for Flink Streaming
> -
>
> Key: FLINK-1421
> URL: https://issues.apache.org/jira/browse/FLINK-1421
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Paris Carbone
>Assignee: Paris Carbone
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Yahoo's Samoa is an experimental incremental machine learning library that 
> builds on an abstract compositional data streaming model to write streaming 
> algorithms. The task is to provide an adapter from SAMOA topologies to 
> Flink-streaming job graphs in order to support Flink as a backend engine for 
> SAMOA tasks.
> A startup guide can be viewed here:
> https://docs.google.com/document/d/18glDJDYmnFGT1UGtZimaxZpGeeg1Ch14NgDoymhPk2A/pub
> The main working branch of the adapter :
> https://github.com/senorcarbone/samoa/tree/flink-integration





[jira] [Commented] (FLINK-1421) Implement a SAMOA Adapter for Flink Streaming

2015-02-20 Thread Paris Carbone (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328882#comment-14328882
 ] 

Paris Carbone commented on FLINK-1421:
--

Thanks for all the great feedback everyone! I believe the adapter is ready for 
a PR! :)
The current Flink-Samoa adapter runs well for all existing Tasks in local and 
remote mode. The only external requirement, at least from the samoa executable 
script, is to have the Flink CLI and its dependencies installed and exported 
into a FLINK_HOME variable.

[~StephanEwen] We should address the proper serialisation in the next patch.

[~azaroth] We have rebased our current branch [1] with the incubator-samoa 
repository master and we also created a JIRA for the PR [2]. Do you think it is 
ok to do the PR now or should we wait for some upcoming refactoring that 
affects the adapters?

[1] https://github.com/senorcarbone/samoa/tree/flink-integration
[2] https://issues.apache.org/jira/browse/SAMOA-16

> Implement a SAMOA Adapter for Flink Streaming
> -
>
> Key: FLINK-1421
> URL: https://issues.apache.org/jira/browse/FLINK-1421
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Paris Carbone
>Assignee: Paris Carbone
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Yahoo's Samoa is an experimental incremental machine learning library that 
> builds on an abstract compositional data streaming model to write streaming 
> algorithms. The task is to provide an adapter from SAMOA topologies to 
> Flink-streaming job graphs in order to support Flink as a backend engine for 
> SAMOA tasks.
> A startup guide can be viewed here:
> https://docs.google.com/document/d/18glDJDYmnFGT1UGtZimaxZpGeeg1Ch14NgDoymhPk2A/pub
> The main working branch of the adapter :
> https://github.com/senorcarbone/samoa/tree/flink-integration





[jira] [Resolved] (FLINK-1451) Enable parallel execution of streaming file sources

2015-02-20 Thread Gyula Fora (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora resolved FLINK-1451.
---
Resolution: Fixed
  Assignee: Gyula Fora  (was: Márton Balassi)

https://github.com/apache/flink/commit/c56e3f10b27e1e5be38b8a731f330891b190a268

> Enable parallel execution of streaming file sources
> ---
>
> Key: FLINK-1451
> URL: https://issues.apache.org/jira/browse/FLINK-1451
> Project: Flink
>  Issue Type: Task
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Gyula Fora
>
> Currently the streaming FileSourceFunction does not support parallelism 
> greater than 1, as it does not implement the ParallelSourceFunction interface. 
> Usage with distributed filesystems should be checked. In relation to this 
> issue, a possible parallel solution for the FileMonitoringFunction should be 
> considered.





[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328963#comment-14328963
 ] 

ASF GitHub Bot commented on FLINK-1589:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/427#issuecomment-75244751
  
I've added documentation and tests to the change.
Let's see if Travis gives us a green light.


> Add option to pass Configuration to LocalExecutor
> -
>
> Key: FLINK-1589
> URL: https://issues.apache.org/jira/browse/FLINK-1589
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Right now its not possible for users to pass custom configuration values to 
> Flink when running it from within an IDE.
> It would be very convenient to be able to create a local execution 
> environment that allows passing configuration files.





[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...

2015-02-20 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/427#issuecomment-75244751
  
I've added documentation and tests to the change.
Let's see if Travis gives us a green light.




[jira] [Assigned] (FLINK-1555) Add utility to log the serializers of composite types

2015-02-20 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1555:
-

Assignee: Robert Metzger

> Add utility to log the serializers of composite types
> -
>
> Key: FLINK-1555
> URL: https://issues.apache.org/jira/browse/FLINK-1555
> Project: Flink
>  Issue Type: Improvement
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> Users affected by poor performance might want to understand how Flink is 
> serializing their data.
> Therefore, it would be cool to have a utility which logs the serializers 
> like this:
> {{SerializerUtils.getSerializers(TypeInformation t);}}
> to get 
> {code}
> PojoSerializer
> TupleSerializer
>   IntSer
>   DateSer
>   GenericTypeSer(java.sql.Date)
> PojoSerializer
>   GenericTypeSer(HashMap)
> {code}





[jira] [Created] (FLINK-1590) Log environment information also in YARN mode

2015-02-20 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1590:
-

 Summary: Log environment information also in YARN mode
 Key: FLINK-1590
 URL: https://issues.apache.org/jira/browse/FLINK-1590
 Project: Flink
  Issue Type: Improvement
Reporter: Robert Metzger
Priority: Minor








[jira] [Created] (FLINK-1591) Remove window merge before flatten as an optimization

2015-02-20 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1591:
-

 Summary: Remove window merge before flatten as an optimization
 Key: FLINK-1591
 URL: https://issues.apache.org/jira/browse/FLINK-1591
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gyula Fora


After a Window Reduce or Map transformation there is always a merge step when 
the transformation was parallel or grouped.

This merge step should be removed when the windowing operator is followed by 
flatten to avoid unnecessary bottlenecks in the program.

This feature should be added as an optimization step to the WindowingOptimizer 
class.





[GitHub] flink pull request: [FLINK-1514][Gelly] Add a Gather-Sum-Apply ite...

2015-02-20 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/408#issuecomment-75249793
  
Hi @balidani! Thanks a lot for this PR! Gather-Sum-Apply will be an awesome 
addition to Gelly ^^
Here come my comments:

- There's no need for Gather, Sum and Apply functions to implement 
MapFunction, FlatJoinFunction, etc., since they are wrapped inside those in 
GatherSumApplyIteration class. Actually, I would use the Rich* versions 
instead, so that we can have access to open() and close() methods. You can look 
at how `VertexCentricIteration` wraps the `VertexUpdateFunction` inside a 
`RichCoGroupFunction`.

- With this small change above, we could also allow access to aggregators 
and broadcast sets. This should be straightforward to add (again, look at 
`VertexCentricIteration` for hints). We should also add `getName()`, 
`setName()`, `getParallelism()`, `setParallelism()` methods to 
`GatherSumApplyIteration`.

- Finally, it'd be great if you could add the tests you have as examples, 
i.e. one for Greedy Graph Coloring and one for GSAShortestPaths.

Let me know if you have any doubts!

Thanks again :sunny:




[jira] [Commented] (FLINK-1514) [Gelly] Add a Gather-Sum-Apply iteration method

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328998#comment-14328998
 ] 

ASF GitHub Bot commented on FLINK-1514:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/408#issuecomment-75249793
  
Hi @balidani! Thanks a lot for this PR! Gather-Sum-Apply will be an awesome 
addition to Gelly ^^
Here come my comments:

- There's no need for the Gather, Sum and Apply functions to implement 
MapFunction, FlatJoinFunction, etc., since they are wrapped inside those in the 
GatherSumApplyIteration class. Actually, I would use the Rich* versions 
instead, so that we have access to the open() and close() methods. You can look 
at how `VertexCentricIteration` wraps the `VertexUpdateFunction` inside a 
`RichCoGroupFunction`.

- With this small change above, we could also allow access to aggregators 
and broadcast sets. This should be straightforward to add (again, look at 
`VertexCentricIteration` for hints). We should also add `getName()`, 
`setName()`, `getParallelism()`, `setParallelism()` methods to 
`GatherSumApplyIteration`.

- Finally, it'd be great if you could add the tests you have as examples, 
i.e. one for Greedy Graph Coloring and one for GSAShortestPaths.

Let me know if you have any doubts!

Thanks again :sunny:


> [Gelly] Add a Gather-Sum-Apply iteration method
> ---
>
> Key: FLINK-1514
> URL: https://issues.apache.org/jira/browse/FLINK-1514
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Assignee: Daniel Bali
>
> This will be a method that implements the GAS computation model, but without 
> the "scatter" step. The phases can be mapped into the following steps inside 
> a delta iteration:
> gather: a map on each < srcVertex, edge, trgVertex > that produces a partial 
> value
> sum: a reduce that combines the partial values
> apply: join with vertex set to update the vertex values using the results of 
> sum and the previous state.
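The three phases described above can be simulated in plain Java for single-source shortest paths. This is a conceptual sketch only: it uses no Flink or Gelly API, the method names are invented, and one loop iteration stands in for one delta-iteration superstep:

```java
import java.util.HashMap;
import java.util.Map;

public class GsaSssp {

    // edges given as {src, trg, weight}; vertex values are current distances
    static Map<Integer, Double> superstep(int[][] edges, Map<Integer, Double> dist) {
        // gather + sum: produce a candidate distance per edge, reduce per target by min
        Map<Integer, Double> candidates = new HashMap<>();
        for (int[] e : edges) {
            double viaSrc = dist.getOrDefault(e[0], Double.POSITIVE_INFINITY) + e[2];
            candidates.merge(e[1], viaSrc, Math::min);
        }
        // apply: join candidates with the previous state, keep only improvements
        Map<Integer, Double> next = new HashMap<>(dist);
        candidates.forEach((vertex, d) -> next.merge(vertex, d, Math::min));
        return next;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1, 1}, {1, 2, 1}, {0, 2, 5}};
        Map<Integer, Double> dist = new HashMap<>(Map.of(0, 0.0)); // source vertex 0
        for (int i = 0; i < 3; i++) {
            dist = superstep(edges, dist); // iterate to a fixpoint
        }
        System.out.println(dist); // {0=0.0, 1=1.0, 2=2.0}
    }
}
```

In the actual delta iteration, the "keep only improvements" step is what determines the workset for the next superstep.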



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1592) Refactor StreamGraph to store vertex IDs as Integers instead of Strings

2015-02-20 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1592:
-

 Summary: Refactor StreamGraph to store vertex IDs as Integers 
instead of Strings
 Key: FLINK-1592
 URL: https://issues.apache.org/jira/browse/FLINK-1592
 Project: Flink
  Issue Type: Task
  Components: Streaming
Reporter: Gyula Fora
Priority: Minor


The vertex IDs are currently stored as Strings, reflecting some deprecated 
usage. They should be refactored to use Integers instead.





[GitHub] flink pull request: [FLINK-1444][api-extending] Add support for sp...

2015-02-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/379




[GitHub] flink pull request: [FLINK-1466] Add HCatInputFormats to read from...

2015-02-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/411




[jira] [Commented] (FLINK-1461) Add sortPartition operator

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329033#comment-14329033
 ] 

ASF GitHub Bot commented on FLINK-1461:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/381


> Add sortPartition operator
> --
>
> Key: FLINK-1461
> URL: https://issues.apache.org/jira/browse/FLINK-1461
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Local Runtime, Optimizer, Scala API
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> A {{sortPartition()}} operator can be used to
> * sort the input of a {{mapPartition()}} operator
> * enforce a certain sorting of the input of a given operator of a program. 
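Modeling partitions as plain lists, the guarantee a sortPartition() operator provides can be illustrated without Flink: each partition is sorted independently (no global order), so a following mapPartition() can rely on locally ordered input. The helper names below are invented for illustration; this is not the proposed operator itself:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortPartitionSketch {

    /** Sort every partition in place; partitions stay independent of each other. */
    static void sortPartition(List<List<Integer>> partitions) {
        for (List<Integer> p : partitions) {
            Collections.sort(p);
        }
    }

    /** Example mapPartition that depends on sorted input: emit each partition's max. */
    static int[] maxPerPartition(List<List<Integer>> partitions) {
        return partitions.stream()
                .mapToInt(p -> p.get(p.size() - 1)) // last element is the max only if sorted
                .toArray();
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = Arrays.asList(
                new ArrayList<>(List.of(3, 1, 2)),
                new ArrayList<>(List.of(9, 7)));
        sortPartition(parts);
        System.out.println(parts);                                   // [[1, 2, 3], [7, 9]]
        System.out.println(Arrays.toString(maxPerPartition(parts))); // [3, 9]
    }
}
```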





[jira] [Resolved] (FLINK-1461) Add sortPartition operator

2015-02-20 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-1461.
--
Resolution: Implemented

Implemented with 3d84970364ced41d1497269dc3c9d0b5835f9e1e

> Add sortPartition operator
> --
>
> Key: FLINK-1461
> URL: https://issues.apache.org/jira/browse/FLINK-1461
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Local Runtime, Optimizer, Scala API
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> A {{sortPartition()}} operator can be used to
> * sort the input of a {{mapPartition()}} operator
> * enforce a certain sorting of the input of a given operator of a program. 





[jira] [Resolved] (FLINK-1466) Add InputFormat to read HCatalog tables

2015-02-20 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-1466.
--
Resolution: Implemented

Implemented with bed3da4a61a8637c0faa9632b8a05ccef8c5a6dc

> Add InputFormat to read HCatalog tables
> ---
>
> Key: FLINK-1466
> URL: https://issues.apache.org/jira/browse/FLINK-1466
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Scala API
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> HCatalog is a metadata repository and InputFormat to make Hive tables 
> accessible to other frameworks such as Pig.
> Adding support for HCatalog would give access to Hive managed data.





[GitHub] flink pull request: [FLINK-1461][api-extending] Add SortPartition ...

2015-02-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/381




[jira] [Created] (FLINK-1593) Improve exception handling in the Streaming runtime

2015-02-20 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1593:
-

 Summary: Improve exception handling in the Streaming runtime
 Key: FLINK-1593
 URL: https://issues.apache.org/jira/browse/FLINK-1593
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gyula Fora


The current exception handling logic is not very helpful when trying to debug 
an application. In many cases, with serialization, user-code, or other 
exceptions, the error is just logged and not propagated properly. This should 
mainly be fixed in the StreamInvokables.

Some improvements that could be made:

-Serialization/deserialization and other system errors should be propagated 
instead of just silently logged

-User-code exceptions should be handled better; for instance, it would be 
helpful to log them at INFO level so that users can see them immediately.
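The difference between silently logging and properly propagating such an error can be sketched in plain Java. The exception and method names here are illustrative, not the actual StreamInvokable code:

```java
import java.nio.charset.StandardCharsets;

public class PropagateSketch {

    static class SerializationException extends RuntimeException {
        SerializationException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    // Problematic: the error is swallowed and the job keeps running with lost records.
    static Object deserializeSwallow(byte[] bytes) {
        try {
            return deserialize(bytes);
        } catch (Exception e) {
            System.err.println("WARN: could not deserialize record"); // silently dropped
            return null;
        }
    }

    // Better: attach context and rethrow so the task fails visibly in the runtime.
    static Object deserializePropagate(byte[] bytes) {
        try {
            return deserialize(bytes);
        } catch (Exception e) {
            throw new SerializationException(
                    "Failed to deserialize record of " + bytes.length + " bytes", e);
        }
    }

    // Toy stand-in for a record deserializer that can fail.
    static Object deserialize(byte[] bytes) throws Exception {
        if (bytes.length == 0) {
            throw new Exception("empty buffer");
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(deserializeSwallow(new byte[0])); // null, error hidden
        try {
            deserializePropagate(new byte[0]);
        } catch (SerializationException e) {
            System.out.println("propagated: " + e.getMessage());
        }
    }
}
```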





[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329046#comment-14329046
 ] 

ASF GitHub Bot commented on FLINK-1484:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/368#issuecomment-75254417
  
@tillrohrmann and @StephanEwen worked on some other reliability issues. Will 
the changes in this PR be subsumed by the upcoming changes? If not, we should 
merge this. :-)


> JobManager restart does not notify the TaskManager
> --
>
> Key: FLINK-1484
> URL: https://issues.apache.org/jira/browse/FLINK-1484
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> A JobManager restart, which can happen due to an uncaught exception, leaves 
> connected TaskManagers uninformed about the disconnection, so they continue 
> sending messages to a JobManager with a reset state. 
> TaskManagers should be informed about such a restart and clean up their own 
> state in that case. Afterwards, they can try to reconnect to the restarted 
> JobManager.





[jira] [Resolved] (FLINK-1444) Add data properties for data sources

2015-02-20 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-1444.
--
Resolution: Implemented

Implemented with f0a28bf5345084a0a43df16021e60078e322e087

> Add data properties for data sources
> 
>
> Key: FLINK-1444
> URL: https://issues.apache.org/jira/browse/FLINK-1444
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, JobManager, Optimizer
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> This issue proposes to add support for attaching data properties to data 
> sources. These data properties are defined with respect to input splits.
> Possible properties are:
> - partitioning across splits: all elements of the same key (combination) are 
> contained in one split
> - sorting / grouping with splits: elements are sorted or grouped on certain 
> keys within a split
> - key uniqueness: a certain key (combination) is unique for all elements of 
> the data source. This property is not defined wrt. input splits.
> The optimizer can leverage this information to generate more efficient 
> execution plans.
> The InputFormat will be responsible to generate input splits such that the 
> promised data properties are actually in place. Otherwise, the program will 
> produce invalid results. 





[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...

2015-02-20 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/368#issuecomment-75254417
  
@tillrohrmann and @StephanEwen worked on some other reliability issues. Will 
the changes in this PR be subsumed by the upcoming changes? If not, we should 
merge this. :-)




[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329056#comment-14329056
 ] 

ASF GitHub Bot commented on FLINK-1484:
---

Github user tillrohrmann closed the pull request at:

https://github.com/apache/flink/pull/368


> JobManager restart does not notify the TaskManager
> --
>
> Key: FLINK-1484
> URL: https://issues.apache.org/jira/browse/FLINK-1484
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> A JobManager restart, which can happen due to an uncaught exception, leaves 
> connected TaskManagers uninformed about the disconnection, so they continue 
> sending messages to a JobManager with a reset state. 
> TaskManagers should be informed about such a restart and clean up their own 
> state in that case. Afterwards, they can try to reconnect to the restarted 
> JobManager.





[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann closed the pull request at:

https://github.com/apache/flink/pull/368




[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329055#comment-14329055
 ] 

ASF GitHub Bot commented on FLINK-1484:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/368#issuecomment-75256160
  
This PR has been merged as part of PR #423 


> JobManager restart does not notify the TaskManager
> --
>
> Key: FLINK-1484
> URL: https://issues.apache.org/jira/browse/FLINK-1484
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> A JobManager restart, which can happen due to an uncaught exception, leaves 
> connected TaskManagers uninformed about the disconnection, so they continue 
> sending messages to a JobManager with a reset state. 
> TaskManagers should be informed about such a restart and clean up their own 
> state in that case. Afterwards, they can try to reconnect to the restarted 
> JobManager.





[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/368#issuecomment-75256160
  
This PR has been merged as part of PR #423 




[jira] [Commented] (FLINK-1505) Separate buffer reader and channel consumption logic

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329061#comment-14329061
 ] 

ASF GitHub Bot commented on FLINK-1505:
---

Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/428#issuecomment-75257236
  
Thank you!
I will try it over the weekend and give you feedback.


> Separate buffer reader and channel consumption logic
> 
>
> Key: FLINK-1505
> URL: https://issues.apache.org/jira/browse/FLINK-1505
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Runtime
>Affects Versions: master
>Reporter: Ufuk Celebi
>Priority: Minor
>
> Currently, the hierarchy of readers (f.a.o.runtime.io.network.api) is 
> bloated. There is no separation between consumption of the input channels and 
> the buffer readers.
> This was not the case up until release-0.8 and has been introduced by me with 
intermediate results. I think this was a mistake and we should separate this 
again. flink-streaming is currently the heaviest user of these lower-level 
> APIs and I have received feedback from [~gyfora] to undo this as well.





[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...

2015-02-20 Thread uce
GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/428

[FLINK-1505] Separate reader API from result consumption

@gyfora, can you please rebase on this branch and verify that everything is 
still working as expected for you?

This PR separates the reader API (record and buffer readers) from result 
consumption (input gate). The buffer reader was a huge component with mixed 
responsibilities both as the runtime component to set up input channels for 
intermediate result consumption and as a lower-level user API to consume 
buffers/events.

The separation makes it easier for users of the API (e.g. flink-streaming) 
to extend the handling of low-level buffers and events. Gyula's initial 
feedback confirmed this.

In view of FLINK-1568, this PR makes it also easier to test the result 
consumption logic for failure scenarios.

I will rebase #356 on these changes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/incubator-flink flink-1505-input_gate

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/428.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #428


commit db1dc5be12427664a418ce6e4fb41de39838fac0
Author: Ufuk Celebi 
Date:   2015-02-10T14:05:44Z

[FLINK-1505] [distributed runtime] Separate reader API from result 
consumption






[jira] [Commented] (FLINK-1466) Add InputFormat to read HCatalog tables

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329032#comment-14329032
 ] 

ASF GitHub Bot commented on FLINK-1466:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/411


> Add InputFormat to read HCatalog tables
> ---
>
> Key: FLINK-1466
> URL: https://issues.apache.org/jira/browse/FLINK-1466
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Scala API
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> HCatalog is a metadata repository and InputFormat to make Hive tables 
> accessible to other frameworks such as Pig.
> Adding support for HCatalog would give access to Hive managed data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1444) Add data properties for data sources

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329031#comment-14329031
 ] 

ASF GitHub Bot commented on FLINK-1444:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/379


> Add data properties for data sources
> 
>
> Key: FLINK-1444
> URL: https://issues.apache.org/jira/browse/FLINK-1444
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, JobManager, Optimizer
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> This issue proposes to add support for attaching data properties to data 
> sources. These data properties are defined with respect to input splits.
> Possible properties are:
> - partitioning across splits: all elements of the same key (combination) are 
> contained in one split
> - sorting / grouping with splits: elements are sorted or grouped on certain 
> keys within a split
> - key uniqueness: a certain key (combination) is unique for all elements of 
> the data source. This property is not defined wrt. input splits.
> The optimizer can leverage this information to generate more efficient 
> execution plans.
> The InputFormat will be responsible to generate input splits such that the 
> promised data properties are actually in place. Otherwise, the program will 
> produce invalid results. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1505) Separate buffer reader and channel consumption logic

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329028#comment-14329028
 ] 

ASF GitHub Bot commented on FLINK-1505:
---

GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/428

[FLINK-1505] Separate reader API from result consumption

@gyfora, can you please rebase on this branch and verify that everything is 
still working as expected for you?

This PR separates the reader API (record and buffer readers) from result 
consumption (input gate). The buffer reader was a huge component with mixed 
responsibilities both as the runtime component to set up input channels for 
intermediate result consumption and as a lower-level user API to consume 
buffers/events.

The separation makes it easier for users of the API (e.g. flink-streaming) 
to extend the handling of low-level buffers and events. Gyula's initial 
feedback confirmed this.

In view of FLINK-1568, this PR makes it also easier to test the result 
consumption logic for failure scenarios.

I will rebase #356 on these changes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/incubator-flink flink-1505-input_gate

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/428.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #428


commit db1dc5be12427664a418ce6e4fb41de39838fac0
Author: Ufuk Celebi 
Date:   2015-02-10T14:05:44Z

[FLINK-1505] [distributed runtime] Separate reader API from result 
consumption




> Separate buffer reader and channel consumption logic
> 
>
> Key: FLINK-1505
> URL: https://issues.apache.org/jira/browse/FLINK-1505
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Runtime
>Affects Versions: master
>Reporter: Ufuk Celebi
>Priority: Minor
>
> Currently, the hierarchy of readers (f.a.o.runtime.io.network.api) is 
> bloated. There is no separation between consumption of the input channels and 
> the buffer readers.
> This was not the case up until release-0.8 and has been introduced by me with 
> intermediate results. I think this was a mistake and we should separate this 
> again. flink-streaming is currently the heaviest user of these lower-level 
> APIs and I have received feedback from [~gyfora] to undo this as well.





[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...

2015-02-20 Thread gyfora
Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/428#issuecomment-75257236
  
Thank you!
I will try it over the weekend and give you feedback.




[jira] [Created] (FLINK-1594) DataStreams don't support self-join

2015-02-20 Thread Daniel Bali (JIRA)
Daniel Bali created FLINK-1594:
--

 Summary: DataStreams don't support self-join
 Key: FLINK-1594
 URL: https://issues.apache.org/jira/browse/FLINK-1594
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
 Environment: flink-0.9.0-SNAPSHOT
Reporter: Daniel Bali


Trying to join a DataSet with itself will result in exceptions. I get the 
following stack trace:

{noformat}
java.lang.Exception: Error setting up runtime environment: Union buffer 
reader must be initialized with at least two individual buffer readers
at 
org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173)
at 
org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419)
at 
org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at 
org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.IllegalArgumentException: Union buffer reader must be 
initialized with at least two individual buffer readers
at 
org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125)
at 
org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69)
at 
org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101)
at 
org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63)
at 
org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65)
at 
org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:170)
... 20 more
{noformat}





[jira] [Updated] (FLINK-1594) DataStreams don't support self-join

2015-02-20 Thread Daniel Bali (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Bali updated FLINK-1594:
---
Description: 
Trying to join a DataSet with itself will result in exceptions. I get the 
following stack trace:

{noformat}
java.lang.Exception: Error setting up runtime environment: Union buffer 
reader must be initialized with at least two individual buffer readers
at 
org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173)
at 
org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419)
at 
org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at 
org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.IllegalArgumentException: Union buffer reader must be 
initialized with at least two individual buffer readers
at 
org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125)
at 
org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69)
at 
org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101)
at 
org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63)
at 
org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65)
at 
org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:170)
... 20 more
{noformat}

  was:
Trying to join a DataSet with itself will result in exceptions. I get the 
following stack trace:

{noformat}
java.lang.Exception: Error setting up runtime environment: Union buffer 
reader must be initialized with at least two individual buffer readers
at 
org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173)
at 
org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419)
at 
org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at 
org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java

[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329117#comment-14329117
 ] 

ASF GitHub Bot commented on FLINK-1589:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/427#discussion_r25080967
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/ExecutionEnvironmentITCase.java
 ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.test.javaApiOperators;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.configuration.ConfigConstants;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.test.javaApiOperators.util.CollectionDataSets;
+import org.apache.flink.test.util.MultipleProgramsTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import java.util.ArrayList;
+import java.util.Collection;
+
+
+@RunWith(Parameterized.class)
+public class ExecutionEnvironmentITCase extends MultipleProgramsTestBase {
+
+
+   public ExecutionEnvironmentITCase(ExecutionMode mode) {
+   super(mode);
+   }
+
+   @Parameterized.Parameters(name = "Execution mode = {0}")
+   public static Collection<ExecutionMode[]> executionModes(){
+   Collection<ExecutionMode[]> c = new ArrayList<ExecutionMode[]>(1);
+   c.add(new ExecutionMode[] {ExecutionMode.CLUSTER});
+   return c;
+   }
+
+
+   @Test
+   public void testLocalEnvironmentWithConfig() throws Exception {
+   IllegalArgumentException e = null;
+   try {
+   Configuration conf = new Configuration();
+   
conf.setBoolean(ConfigConstants.FILESYSTEM_DEFAULT_OVERWRITE_KEY, true);
+   
conf.setString(ConfigConstants.TASK_MANAGER_TMP_DIR_KEY, 
"/tmp/thelikelyhoodthatthisdirectoryexisitsisreallylow");
--- End diff --

Are you sure?


> Add option to pass Configuration to LocalExecutor
> -
>
> Key: FLINK-1589
> URL: https://issues.apache.org/jira/browse/FLINK-1589
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Right now it's not possible for users to pass custom configuration values to 
> Flink when running it from within an IDE.
> It would be very convenient to be able to create a local execution 
> environment that allows passing configuration files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/427#discussion_r25080967
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/ExecutionEnvironmentITCase.java
 ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.test.javaApiOperators;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.configuration.ConfigConstants;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.test.javaApiOperators.util.CollectionDataSets;
+import org.apache.flink.test.util.MultipleProgramsTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import java.util.ArrayList;
+import java.util.Collection;
+
+
+@RunWith(Parameterized.class)
+public class ExecutionEnvironmentITCase extends MultipleProgramsTestBase {
+
+
+   public ExecutionEnvironmentITCase(ExecutionMode mode) {
+   super(mode);
+   }
+
+   @Parameterized.Parameters(name = "Execution mode = {0}")
+   public static Collection<ExecutionMode[]> executionModes(){
+   Collection<ExecutionMode[]> c = new ArrayList<ExecutionMode[]>(1);
+   c.add(new ExecutionMode[] {ExecutionMode.CLUSTER});
+   return c;
+   }
+
+
+   @Test
+   public void testLocalEnvironmentWithConfig() throws Exception {
+   IllegalArgumentException e = null;
+   try {
+   Configuration conf = new Configuration();
+   
conf.setBoolean(ConfigConstants.FILESYSTEM_DEFAULT_OVERWRITE_KEY, true);
+   
conf.setString(ConfigConstants.TASK_MANAGER_TMP_DIR_KEY, 
"/tmp/thelikelyhoodthatthisdirectoryexisitsisreallylow");
--- End diff --

Are you sure?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/427#issuecomment-75271706
  
I think we should rework the test case to check that the configuration is 
properly passed to the system. Right now the exception is thrown in 
```ds.writeAsText(null)``` because we pass ```null```. 

I'd propose something like @StephanEwen did in the PR #410. We set the 
number of slots in the configuration and the job to ```PARALLELISM_AUTO_MAX```. 
With the special input format which produces only a single element per split, 
we can count the number of parallel tasks, given that every task receives only 
one input split.
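The counting trick described above can be simulated without Flink. Below is a minimal, JDK-only sketch (class and method names are illustrative, not Flink's API): each parallel task consumes whole input splits, every split yields exactly one record carrying the id of the task that read it, and counting the distinct ids in the output reveals how many tasks actually ran, i.e. whether the configured slot count took effect.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class ParallelismCountSketch {

    // Assigns splits to tasks round-robin and records one "output
    // record" (the task id) per split; the number of distinct ids in
    // the output equals the number of tasks that received work.
    static int distinctTaskIds(int parallelTasks, int splits) {
        List<Integer> output = new ArrayList<>();
        for (int split = 0; split < splits; split++) {
            int taskId = split % parallelTasks; // round-robin split assignment
            output.add(taskId);                 // one record per split
        }
        return new HashSet<>(output).size();
    }
}
```

With at least as many splits as tasks, the distinct count equals the parallelism, which is exactly the property the reworked test would assert.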


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329156#comment-14329156
 ] 

ASF GitHub Bot commented on FLINK-1589:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/427#issuecomment-75271706
  
I think we should rework the test case to check that the configuration is 
properly passed to the system. Right now the exception is thrown in 
```ds.writeAsText(null)``` because we pass ```null```. 

I'd propose something like @StephanEwen did in the PR #410. We set the 
number of slots in the configuration and the job to ```PARALLELISM_AUTO_MAX```. 
With the special input format which produces only a single element per split, 
we can count the number of parallel tasks, given that every task receives only 
one input split.


> Add option to pass Configuration to LocalExecutor
> -
>
> Key: FLINK-1589
> URL: https://issues.apache.org/jira/browse/FLINK-1589
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Right now it's not possible for users to pass custom configuration values to 
> Flink when running it from within an IDE.
> It would be very convenient to be able to create a local execution 
> environment that allows passing configuration files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1555] Add serializer hierarchy debug ut...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/415#discussion_r25085047
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/typeutils/CompositeType.java
 ---
@@ -132,7 +133,31 @@ public CompositeType(Class<T> typeClass) {
}
return getNewComparator(config);
}
-   
+
+   // --------------------------------------------------------------------------------------------
+
+   /**
+* Debugging utility to understand the hierarchy of serializers created 
by the Java API.
+*/
+   public static <T> String getSerializerTree(TypeInformation<T> ti) {
+   return getSerializerTree(ti, 0);
+   }
+
+   private static <T> String getSerializerTree(TypeInformation<T> ti, int indent) {
+   String ret = "";
+   if(ti instanceof CompositeType) {
+   ret += ti.toString()+"\n";
--- End diff --

Should the ```toString``` method not already print the whole tree? Thus, 
the information would be redundant.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1555) Add utility to log the serializers of composite types

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329188#comment-14329188
 ] 

ASF GitHub Bot commented on FLINK-1555:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/415#discussion_r25085047
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/typeutils/CompositeType.java
 ---
@@ -132,7 +133,31 @@ public CompositeType(Class<T> typeClass) {
}
return getNewComparator(config);
}
-   
+
+   // --------------------------------------------------------------------------------------------
+
+   /**
+* Debugging utility to understand the hierarchy of serializers created 
by the Java API.
+*/
+   public static <T> String getSerializerTree(TypeInformation<T> ti) {
+   return getSerializerTree(ti, 0);
+   }
+
+   private static <T> String getSerializerTree(TypeInformation<T> ti, int indent) {
+   String ret = "";
+   if(ti instanceof CompositeType) {
+   ret += ti.toString()+"\n";
--- End diff --

Should the ```toString``` method not already print the whole tree? Thus, 
the information would be redundant.


> Add utility to log the serializers of composite types
> -
>
> Key: FLINK-1555
> URL: https://issues.apache.org/jira/browse/FLINK-1555
> Project: Flink
>  Issue Type: Improvement
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> Users affected by poor performance might want to understand how Flink is 
> serializing their data.
> Therefore, it would be cool to have a utility that logs the serializers 
> like this:
> {{SerializerUtils.getSerializers(TypeInformation t);}}
> to get 
> {code}
> PojoSerializer
> TupleSerializer
>   IntSer
>   DateSer
>   GenericTypeSer(java.sql.Date)
> PojoSerializer
>   GenericTypeSer(HashMap)
> {code}
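The indentation scheme in the sample output above can be sketched with a small, self-contained model. The classes below are illustrative stand-ins, not Flink's actual serializer types: each node prints its own name and then its children two spaces deeper, reproducing the tree layout shown.

```java
import java.util.Arrays;
import java.util.List;

class SerializerNode {
    final String name;
    final List<SerializerNode> children;

    SerializerNode(String name, SerializerNode... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }

    // Renders this node at the given indentation, then recurses into
    // its children with the indent increased by two spaces.
    String render(int indent) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < indent; i++) sb.append(' ');
        sb.append(name).append('\n');
        for (SerializerNode child : children) {
            sb.append(child.render(indent + 2)); // one level deeper
        }
        return sb.toString();
    }
}
```

Building `new SerializerNode("PojoSerializer", new SerializerNode("TupleSerializer", new SerializerNode("IntSer"), new SerializerNode("DateSer")))` and calling `render(0)` yields the nested layout from the description.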



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1515]Splitted runVertexCentricIteration...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/402#issuecomment-75279454
  
I'm currently working on fixing this problem. You can ignore it for the
moment.

On Fri, Feb 20, 2015 at 11:58 AM, Vasia Kalavri 
wrote:

> Hi,
> if no more comments, I'd like to merge this.
> There is a failing check in Travis (not related to this PR):
>
> Tests in error:
>   
JobManagerFailsITCase.run:40->org$scalatest$BeforeAndAfterAll$$super$run:40->org$scalatest$WordSpecLike$$super$run:40->runTests:40->runTest:40->withFixture:40
 » Timeout
>   
JobManagerFailsITCase.run:40->org$scalatest$BeforeAndAfterAll$$super$run:40->org$scalatest$WordSpecLike$$super$run:40->runTests:40->runTest:40->withFixture:40->TestKit.within:707->TestKit.within:707
 » Timeout
>
> Is this fixed by #422 ? Shall I
> proceed?
> Thanks!
>
> —
> Reply to this email directly or view it on GitHub
> .
>



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1515) [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329204#comment-14329204
 ] 

ASF GitHub Bot commented on FLINK-1515:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/402#issuecomment-75279454
  
I'm currently working on fixing this problem. You can ignore it for the
moment.

On Fri, Feb 20, 2015 at 11:58 AM, Vasia Kalavri 
wrote:

> Hi,
> if no more comments, I'd like to merge this.
> There is a failing check in Travis (not related to this PR):
>
> Tests in error:
>   
JobManagerFailsITCase.run:40->org$scalatest$BeforeAndAfterAll$$super$run:40->org$scalatest$WordSpecLike$$super$run:40->runTests:40->runTest:40->withFixture:40
 » Timeout
>   
JobManagerFailsITCase.run:40->org$scalatest$BeforeAndAfterAll$$super$run:40->org$scalatest$WordSpecLike$$super$run:40->runTests:40->runTest:40->withFixture:40->TestKit.within:707->TestKit.within:707
 » Timeout
>
> Is this fixed by #422 ? Shall I
> proceed?
> Thanks!
>
> —
> Reply to this email directly or view it on GitHub
> .
>



> [Gelly] Enable access to aggregators and broadcast sets in vertex-centric 
> iteration
> ---
>
> Key: FLINK-1515
> URL: https://issues.apache.org/jira/browse/FLINK-1515
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Martin Kiefer
>
> Currently, aggregators and broadcast sets cannot be accessed through Gelly's  
> {{runVertexCentricIteration}} method. The functionality is already present in 
> the {{VertexCentricIteration}} and we just need to expose it.
> This could be done like this: We create a method 
> {{createVertexCentricIteration}}, which will return a 
> {{VertexCentricIteration}} object and we change {{runVertexCentricIteration}} 
> to accept this as a parameter (and return the graph after running this 
> iteration).
> The user can configure the {{VertexCentricIteration}} by directly calling the 
> public methods {{registerAggregator}}, {{setName}}, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1567] Add switch to use the AvroSeriali...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/413#discussion_r25086186
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/GenericTypeInfo.java
 ---
@@ -66,6 +71,9 @@ public boolean isKeyType() {
 
@Override
public TypeSerializer<T> createSerializer() {
+   if(useAvroSerializer) {
+   return new AvroSerializer<T>(this.typeClass);
+   }
return new KryoSerializer<T>(this.typeClass);
--- End diff --

If we enclose the ```return new KryoSerializer``` in an else block, then 
the control flow looks nicer imho.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()

2015-02-20 Thread Andra Lungu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andra Lungu reassigned FLINK-1587:
--

Assignee: Andra Lungu

> coGroup throws NoSuchElementException on iterator.next()
> 
>
> Key: FLINK-1587
> URL: https://issues.apache.org/jira/browse/FLINK-1587
> Project: Flink
>  Issue Type: Bug
>  Components: Gelly
> Environment: flink-0.8.0-SNAPSHOT
>Reporter: Carsten Brandt
>Assignee: Andra Lungu
>
> I am receiving the following exception when running a simple job that 
> extracts outdegree from a graph using Gelly. It is currently only failing on 
> the cluster and I am not able to reproduce it locally. Will try that the next 
> days.
> {noformat}
> 02/20/2015 02:27:02:  CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) 
> switched to FAILED
> java.util.NoSuchElementException
>   at java.util.Collections$EmptyIterator.next(Collections.java:3006)
>   at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665)
>   at 
> org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130)
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493)
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
>   at 
> org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
>   at java.lang.Thread.run(Thread.java:745)
> 02/20/2015 02:27:02:  Job execution switched to status FAILING
> ...
> {noformat}
> The error occurs in Gelly's Graph.java at this line: 
> https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636
> Is there any valid case where a coGroup Iterator may be empty? As far as I 
> see there is a bug somewhere.
> I'd like to write a test case for this to reproduce the issue. Where can I 
> put such a test?
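There is in fact a valid case for an empty iterator: coGroup has outer-join-like semantics, so a key that appears on only one side arrives with an empty group on the other side. A JDK-only sketch of that grouping behavior (not Flink code; names are illustrative):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class CoGroupSketch {

    // For every key seen on either side, report how many elements each
    // side contributes. A zero on one side corresponds to the empty
    // iterator a coGroup function must be prepared to receive.
    static Map<String, int[]> coGroupCounts(List<String> left, List<String> right) {
        TreeSet<String> keys = new TreeSet<>(left); // union of keys from both sides
        keys.addAll(right);
        Map<String, int[]> result = new LinkedHashMap<>();
        for (String key : keys) {
            result.put(key, new int[] {
                Collections.frequency(left, key),
                Collections.frequency(right, key)
            });
        }
        return result;
    }
}
```

A coGroup user function that unconditionally calls `iterator.next()` on one side, as in the quoted `CountNeighborsCoGroup`, will hit `NoSuchElementException` exactly in this one-sided case.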



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329215#comment-14329215
 ] 

ASF GitHub Bot commented on FLINK-1567:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/413#discussion_r25086186
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/GenericTypeInfo.java
 ---
@@ -66,6 +71,9 @@ public boolean isKeyType() {
 
@Override
public TypeSerializer<T> createSerializer() {
+   if(useAvroSerializer) {
+   return new AvroSerializer<T>(this.typeClass);
+   }
return new KryoSerializer<T>(this.typeClass);
--- End diff --

If we enclose the ```return new KryoSerializer``` in an else block, then 
the control flow looks nicer imho.


> Add option to switch between Avro and Kryo serialization for GenericTypes
> -
>
> Key: FLINK-1567
> URL: https://issues.apache.org/jira/browse/FLINK-1567
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.8, 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Allow users to switch the underlying serializer for GenericTypes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1567] Add switch to use the AvroSeriali...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/413#discussion_r25086445
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/runtime/AvroSerializer.java
 ---
@@ -91,7 +98,22 @@ public T createInstance() {
@Override
public T copy(T from) {
checkKryoInitialized();
-   return this.kryo.copy(from);
+   try {
+   return this.kryo.copy(from);
+   } catch(KryoException ke) {
+   // kryo was unable to copy it, so we do it through 
serialization:
+   ByteArrayOutputStream baout = new 
ByteArrayOutputStream();
+   Output output = new Output(baout);
+
+   kryo.writeObject(output, from);
+
+   output.close();
+
+   ByteArrayInputStream bain = new 
ByteArrayInputStream(baout.toByteArray());
+   Input input = new Input(bain);
+
+   return (T)kryo.readObject(input, from.getClass());
--- End diff --

Which cases do we cover with this method?
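The fallback in the diff is the classic copy-by-serialization pattern: when a direct field-by-field copy fails, write the object out and read a fresh instance back. A JDK-only equivalent using standard Java serialization instead of Kryo (a sketch, not the PR's code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationCopy {

    // Deep-copies an object graph by serializing it to a byte array
    // and deserializing an independent instance from those bytes.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T from)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream baout = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(baout)) {
            out.writeObject(from); // serialize the full object graph
        }
        ByteArrayInputStream bain = new ByteArrayInputStream(baout.toByteArray());
        try (ObjectInputStream in = new ObjectInputStream(bain)) {
            return (T) in.readObject(); // rebuild an independent copy
        }
    }
}
```

The roundtrip is slower than an in-memory copy but works for any serializable graph, which is presumably why the diff falls back to it when `kryo.copy` throws.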


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()

2015-02-20 Thread Andra Lungu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329218#comment-14329218
 ] 

Andra Lungu commented on FLINK-1587:


If there are no time constraints, I will take it :) 

> coGroup throws NoSuchElementException on iterator.next()
> 
>
> Key: FLINK-1587
> URL: https://issues.apache.org/jira/browse/FLINK-1587
> Project: Flink
>  Issue Type: Bug
>  Components: Gelly
> Environment: flink-0.8.0-SNAPSHOT
>Reporter: Carsten Brandt
>
> I am receiving the following exception when running a simple job that 
> extracts outdegree from a graph using Gelly. It is currently only failing on 
> the cluster and I am not able to reproduce it locally. Will try that the next 
> days.
> {noformat}
> 02/20/2015 02:27:02:  CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) 
> switched to FAILED
> java.util.NoSuchElementException
>   at java.util.Collections$EmptyIterator.next(Collections.java:3006)
>   at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665)
>   at 
> org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130)
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493)
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
>   at 
> org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
>   at java.lang.Thread.run(Thread.java:745)
> 02/20/2015 02:27:02:  Job execution switched to status FAILING
> ...
> {noformat}
> The error occurs in Gelly's Graph.java at this line: 
> https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636
> Is there any valid case where a coGroup Iterator may be empty? As far as I 
> see there is a bug somewhere.
> I'd like to write a test case for this to reproduce the issue. Where can I 
> put such a test?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329219#comment-14329219
 ] 

ASF GitHub Bot commented on FLINK-1567:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/413#discussion_r25086445
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/runtime/AvroSerializer.java
 ---
@@ -91,7 +98,22 @@ public T createInstance() {
@Override
public T copy(T from) {
checkKryoInitialized();
-   return this.kryo.copy(from);
+   try {
+   return this.kryo.copy(from);
+   } catch(KryoException ke) {
+   // kryo was unable to copy it, so we do it through 
serialization:
+   ByteArrayOutputStream baout = new 
ByteArrayOutputStream();
+   Output output = new Output(baout);
+
+   kryo.writeObject(output, from);
+
+   output.close();
+
+   ByteArrayInputStream bain = new 
ByteArrayInputStream(baout.toByteArray());
+   Input input = new Input(bain);
+
+   return (T)kryo.readObject(input, from.getClass());
--- End diff --

Which cases do we cover with this method?


> Add option to switch between Avro and Kryo serialization for GenericTypes
> -
>
> Key: FLINK-1567
> URL: https://issues.apache.org/jira/browse/FLINK-1567
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.8, 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Allow users to switch the underlying serializer for GenericTypes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1567] Add switch to use the AvroSeriali...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/413#discussion_r25086595
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/runtime/AvroSerializer.java
 ---
@@ -154,6 +176,7 @@ private void checkAvroInitialized() {
private void checkKryoInitialized() {
if (this.kryo == null) {
this.kryo = new Kryo();
+   this.kryo.register(GenericData.Array.class, new 
KryoSerializer.SpecificInstanceCollectionSerializer(ArrayList.class));
--- End diff --

Without that, it does not work?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329226#comment-14329226
 ] 

ASF GitHub Bot commented on FLINK-1567:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/413#discussion_r25086595
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/runtime/AvroSerializer.java
 ---
@@ -154,6 +176,7 @@ private void checkAvroInitialized() {
private void checkKryoInitialized() {
if (this.kryo == null) {
this.kryo = new Kryo();
+   this.kryo.register(GenericData.Array.class, new 
KryoSerializer.SpecificInstanceCollectionSerializer(ArrayList.class));
--- End diff --

Without that, it does not work?


> Add option to switch between Avro and Kryo serialization for GenericTypes
> -
>
> Key: FLINK-1567
> URL: https://issues.apache.org/jira/browse/FLINK-1567
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.8, 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Allow users to switch the underlying serializer for GenericTypes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329227#comment-14329227
 ] 

ASF GitHub Bot commented on FLINK-1567:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/413#issuecomment-75283551
  
LGTM besides my small comments.


> Add option to switch between Avro and Kryo serialization for GenericTypes
> -
>
> Key: FLINK-1567
> URL: https://issues.apache.org/jira/browse/FLINK-1567
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.8, 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Allow users to switch the underlying serializer for GenericTypes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1567] Add switch to use the AvroSeriali...

2015-02-20 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/413#issuecomment-75283551
  
LGTM besides my small comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1594) DataStreams don't support self-join

2015-02-20 Thread Gyula Fora (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329268#comment-14329268
 ] 

Gyula Fora commented on FLINK-1594:
---

We will have to take a look at how self-joins are handled in the batch runtime.

> DataStreams don't support self-join
> ---
>
> Key: FLINK-1594
> URL: https://issues.apache.org/jira/browse/FLINK-1594
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
> Environment: flink-0.9.0-SNAPSHOT
>Reporter: Daniel Bali
>
> Trying to window-join a DataStream with itself will result in exceptions. I 
> get the following stack trace:
> {noformat}
> java.lang.Exception: Error setting up runtime environment: Union buffer 
> reader must be initialized with at least two individual buffer readers
> at 
> org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44)
> at 
> org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> at 
> org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.IllegalArgumentException: Union buffer reader must 
> be initialized with at least two individual buffer readers
> at 
> org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125)
> at 
> org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69)
> at 
> org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101)
> at 
> org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63)
> at 
> org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65)
> at 
> org.apache.flink.runtime.execution.RuntimeEnvironment.(RuntimeEnvironment.java:170)
> ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1594) DataStreams don't support self-join

2015-02-20 Thread Gyula Fora (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-1594:
--
Description: 
Trying to window-join a DataStream with itself will result in exceptions. I get 
the following stack trace:

{noformat}
java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers
	at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173)
	at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419)
	at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers
	at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125)
	at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69)
	at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101)
	at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63)
	at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:170)
	... 20 more
{noformat}

  was:
Trying to join a DataSet with itself will result in exceptions. I get the 
following stack trace:

{noformat}
java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers
	at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173)
	at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419)
	at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTa
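The "Caused by" frames above point at a constructor precondition: `UnionBufferReader.<init>` rejects fewer than two inputs via `Preconditions.checkArgument`, and a self-join apparently collapses the two logical inputs of the co-task into a single physical reader. A minimal stand-in for that guard (illustrative names only; this is not Flink's actual code) shows the failure mode:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the guard that fails in the trace above.
// UnionReaderGuard and UnionReader are illustrative stand-ins, not Flink classes.
public class UnionReaderGuard {

    // Mirrors Guava's Preconditions.checkArgument seen in the stack trace.
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    static final class UnionReader {
        UnionReader(List<String> readers) {
            // A union over a single input makes no sense, so it is rejected
            // at construction time.
            checkArgument(readers.size() >= 2,
                "Union buffer reader must be initialized with at least two individual buffer readers");
        }
    }

    public static void main(String[] args) {
        // Two distinct inputs: accepted.
        new UnionReader(Arrays.asList("input-1", "input-2"));
        // Self-join degenerates to one input: the guard throws.
        try {
            new UnionReader(Arrays.asList("same-stream"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

So the exception is raised before any data flows; the fix would have to keep the two logical edges of a self-join distinct when wiring the readers.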

[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...

2015-02-20 Thread andralungu
GitHub user andralungu opened a pull request:

https://github.com/apache/flink/pull/429

[FLINK-1522][gelly] Added test for SSSP Example



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andralungu/flink tidySSSP

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/429.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #429


commit 93cb052c4c432ccd28b008b2436e00d2382ea0ba
Author: andralungu 
Date:   2015-02-20T19:11:42Z

[FLINK-1522][gelly] Added test for SSSP Example




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...

2015-02-20 Thread andralungu
Github user andralungu closed the pull request at:

https://github.com/apache/flink/pull/414




[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329381#comment-14329381
 ] 

ASF GitHub Bot commented on FLINK-1522:
---

GitHub user andralungu opened a pull request:

https://github.com/apache/flink/pull/429

[FLINK-1522][gelly] Added test for SSSP Example



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andralungu/flink tidySSSP

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/429.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #429


commit 93cb052c4c432ccd28b008b2436e00d2382ea0ba
Author: andralungu 
Date:   2015-02-20T19:11:42Z

[FLINK-1522][gelly] Added test for SSSP Example




> Add tests for the library methods and examples
> --
>
> Key: FLINK-1522
> URL: https://issues.apache.org/jira/browse/FLINK-1522
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Daniel Bali
>  Labels: easyfix, test
>
> The current tests in gelly test one method at a time. We should have some 
> tests for complete applications. As a start, we could add one test case per 
> example and this way also make sure that our graph library methods actually 
> give correct results.
> I'm assigning this to [~andralungu] because she has already implemented the 
> test for SSSP, but I will help as well.





[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329379#comment-14329379
 ] 

ASF GitHub Bot commented on FLINK-1522:
---

Github user andralungu closed the pull request at:

https://github.com/apache/flink/pull/414


> Add tests for the library methods and examples
> --
>
> Key: FLINK-1522
> URL: https://issues.apache.org/jira/browse/FLINK-1522
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Daniel Bali
>  Labels: easyfix, test
>
> The current tests in gelly test one method at a time. We should have some 
> tests for complete applications. As a start, we could add one test case per 
> example and this way also make sure that our graph library methods actually 
> give correct results.
> I'm assigning this to [~andralungu] because she has already implemented the 
> test for SSSP, but I will help as well.





[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...

2015-02-20 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/429#issuecomment-75310637
  
Btw, you can update a PR by pushing to your remote branch; you don't need to 
close it and open a new PR. If you want to change previous commits (including 
commit messages) or rebase the PR, you can do a "force push".
However, a force push loses all code review comments, so you should 
avoid it if somebody has already reviewed the code and made comments.
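The workflow described above can be sketched end-to-end. The repository paths below are placeholders and a throwaway local bare repository stands in for GitHub, so the commands are runnable; the branch name is borrowed from this thread:

```shell
# Set up a throwaway "remote" and clone (stand-in for GitHub and your fork).
rm -rf /tmp/demo-remote.git /tmp/demo-clone
git init --bare /tmp/demo-remote.git
git clone /tmp/demo-remote.git /tmp/demo-clone
cd /tmp/demo-clone
git config user.email "you@example.com"
git config user.name "Demo User"
git checkout -b tidySSSP

# First version of the PR branch.
echo "v1" > SSSPExampleTest.java
git add SSSPExampleTest.java
git commit -m "[FLINK-1522][gelly] Added test for SSSP Example"
git push origin tidySSSP

# Updating the PR with new commits needs only a normal push. To rewrite the
# last commit (content or message), amend and force-push instead:
echo "v2" > SSSPExampleTest.java
git add SSSPExampleTest.java
git commit --amend -m "[FLINK-1522][gelly] Added test for SSSP Example (reworked)"
git push --force origin tidySSSP
```

A plain `git push` of additional commits preserves review comments; only the history-rewriting `--force` variant is the one to avoid after a review has started.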




[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329479#comment-14329479
 ] 

ASF GitHub Bot commented on FLINK-1522:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/429#issuecomment-75310637
  
Btw, you can update a PR by pushing to your remote branch; you don't need to 
close it and open a new PR. If you want to change previous commits (including 
commit messages) or rebase the PR, you can do a "force push".
However, a force push loses all code review comments, so you should 
avoid it if somebody has already reviewed the code and made comments.


> Add tests for the library methods and examples
> --
>
> Key: FLINK-1522
> URL: https://issues.apache.org/jira/browse/FLINK-1522
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Daniel Bali
>  Labels: easyfix, test
>
> The current tests in gelly test one method at a time. We should have some 
> tests for complete applications. As a start, we could add one test case per 
> example and this way also make sure that our graph library methods actually 
> give correct results.
> I'm assigning this to [~andralungu] because she has already implemented the 
> test for SSSP, but I will help as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...

2015-02-20 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/429#issuecomment-75312569
  
Thank you for the tip! I am still learning the ins and outs of Git :) 
Next time I will just update the PR. 




[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329497#comment-14329497
 ] 

ASF GitHub Bot commented on FLINK-1522:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/429#issuecomment-75312569
  
Thank you for the tip! I am still learning the ins and outs of Git :) 
Next time I will just update the PR. 


> Add tests for the library methods and examples
> --
>
> Key: FLINK-1522
> URL: https://issues.apache.org/jira/browse/FLINK-1522
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Daniel Bali
>  Labels: easyfix, test
>
> The current tests in gelly test one method at a time. We should have some 
> tests for complete applications. As a start, we could add one test case per 
> example and this way also make sure that our graph library methods actually 
> give correct results.
> I'm assigning this to [~andralungu] because she has already implemented the 
> test for SSSP, but I will help as well.





[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329502#comment-14329502
 ] 

ASF GitHub Bot commented on FLINK-1501:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/421#issuecomment-75313413
  
This is much-needed monitoring and really great!
What are the current options for showing the detailed metrics? I see 
"show 3 TMs" and "show all TMs" buttons in the screenshot. Can you select which 
three to show?

How about showing small indicators (load, % non-Flink-managed heap, GC 
interval) with simple color coding (red for hot, blue for cool)? This would 
help to find TMs that are more loaded than others. The detailed view could be 
opened by clicking on a TM.

But we do not need to get to the perfect solution at once. 
How about we open a document, sketch the design of the monitoring, and 
create smaller PRs to get there step by step?
This PR is definitely a huge step in the right direction!


> Integrate metrics library and report basic metrics to JobManager web interface
> --
>
> Key: FLINK-1501
> URL: https://issues.apache.org/jira/browse/FLINK-1501
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: pre-apache
>
>
> As per mailing list, the library: https://github.com/dropwizard/metrics
> The goal of this task is to get the basic infrastructure in place.
> Subsequent issues will integrate more features into the system.





[GitHub] flink pull request: [FLINK-1501] Add metrics library for monitorin...

2015-02-20 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/421#issuecomment-75313413
  
This is much-needed monitoring and really great!
What are the current options for showing the detailed metrics? I see 
"show 3 TMs" and "show all TMs" buttons in the screenshot. Can you select which 
three to show?

How about showing small indicators (load, % non-Flink-managed heap, GC 
interval) with simple color coding (red for hot, blue for cool)? This would 
help to find TMs that are more loaded than others. The detailed view could be 
opened by clicking on a TM.

But we do not need to get to the perfect solution at once. 
How about we open a document, sketch the design of the monitoring, and 
create smaller PRs to get there step by step?
This PR is definitely a huge step in the right direction!




[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...

2015-02-20 Thread gyfora
Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/428#issuecomment-75322938
  
We have rebased and tried it; it works fine for us, thank you. I will add 
several minor changes afterwards, but that's just for our convenience.

So from my side 
+1




[jira] [Commented] (FLINK-1505) Separate buffer reader and channel consumption logic

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329591#comment-14329591
 ] 

ASF GitHub Bot commented on FLINK-1505:
---

Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/428#issuecomment-75322938
  
We have rebased and tried it; it works fine for us, thank you. I will add 
several minor changes afterwards, but that's just for our convenience.

So from my side 
+1


> Separate buffer reader and channel consumption logic
> 
>
> Key: FLINK-1505
> URL: https://issues.apache.org/jira/browse/FLINK-1505
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Runtime
>Affects Versions: master
>Reporter: Ufuk Celebi
>Priority: Minor
>
> Currently, the hierarchy of readers (f.a.o.runtime.io.network.api) is 
> bloated. There is no separation between consumption of the input channels and 
> the buffer readers.
> This was not the case up until release-0.8 and has been introduced by me with 
> intermediate results. I think this was a mistake and we should separate this 
> again. flink-streaming is currently the heaviest user of these lower-level 
> APIs and I have received feedback from [~gyfora] to undo this as well.





[jira] [Updated] (FLINK-1592) Refactor StreamGraph to store vertex IDs as Integers instead of Strings

2015-02-20 Thread Gyula Fora (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-1592:
--
Labels: starter  (was: )

> Refactor StreamGraph to store vertex IDs as Integers instead of Strings
> ---
>
> Key: FLINK-1592
> URL: https://issues.apache.org/jira/browse/FLINK-1592
> Project: Flink
>  Issue Type: Task
>  Components: Streaming
>Reporter: Gyula Fora
>Priority: Minor
>  Labels: starter
>
> The vertex IDs are currently stored as Strings reflecting some deprecated 
> usage. It should be refactored to use Integers instead.





[jira] [Created] (FLINK-1595) Add a complex integration test for Streaming API

2015-02-20 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1595:
-

 Summary: Add a complex integration test for Streaming API
 Key: FLINK-1595
 URL: https://issues.apache.org/jira/browse/FLINK-1595
 Project: Flink
  Issue Type: Task
  Components: Streaming
Reporter: Gyula Fora


The streaming tests currently lack a sophisticated integration test that would 
test many API features at once.

This should include different merging, partitioning, grouping, and aggregation 
types, as well as windowing and connected operators.

The results should be tested for correctness.

A test like this would help identify bugs that are hard to detect with 
unit tests.





[jira] [Updated] (FLINK-1595) Add a complex integration test for Streaming API

2015-02-20 Thread Gyula Fora (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-1595:
--
Labels: Starter  (was: )

> Add a complex integration test for Streaming API
> 
>
> Key: FLINK-1595
> URL: https://issues.apache.org/jira/browse/FLINK-1595
> Project: Flink
>  Issue Type: Task
>  Components: Streaming
>Reporter: Gyula Fora
>  Labels: Starter
>
> The streaming tests currently lack a sophisticated integration test that 
> would test many API features at once.
> This should include different merging, partitioning, grouping, and aggregation 
> types, as well as windowing and connected operators.
> The results should be tested for correctness.
> A test like this would help identify bugs that are hard to detect with 
> unit tests.





[jira] [Commented] (FLINK-1582) SocketStream gets stuck when socket closes

2015-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329670#comment-14329670
 ] 

ASF GitHub Bot commented on FLINK-1582:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/424#issuecomment-75330500
  
I was thinking of a limit as high as a minute. But after 
reconsidering, I am persuaded by your argument: trying to establish the connection 
every couple of seconds should not be a big overhead, so my whole idea might be 
overkill. Let us have your version for now and generalize it on request.
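The reconnect policy discussed above (retry every couple of seconds, optionally giving up after an overall limit) could be sketched as follows. This is not Flink code; the host, port, and timeout values are illustrative assumptions:

```java
import java.io.IOException;
import java.net.Socket;

// Hedged sketch of a retrying socket source: retry the connection at a fixed
// interval until an overall deadline expires, then give up (i.e. terminate).
public class RetryingConnect {

    static Socket connectWithRetry(String host, int port,
                                   long retryIntervalMs, long maxWaitMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        while (System.currentTimeMillis() < deadline) {
            try {
                // Success: the server side is (back) up.
                return new Socket(host, port);
            } catch (IOException e) {
                // Connection refused / host down: cheap to retry, as noted above.
                Thread.sleep(retryIntervalMs);
            }
        }
        // Deadline passed: give up; maxWaitMs = Long.MAX_VALUE would mean
        // "wait forever", maxWaitMs = 0 "terminate immediately".
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed host/port; with nothing listening this gives up after ~1s.
        Socket s = connectWithRetry("localhost", 9999, 200, 1000);
        System.out.println(s == null ? "gave up" : "connected");
    }
}
```

Making `retryIntervalMs` and `maxWaitMs` user-configurable covers both behaviors the issue asks for (terminate immediately, or wait a specified time, possibly forever) without a hard-coded one-minute cap.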


> SocketStream gets stuck when socket closes
> --
>
> Key: FLINK-1582
> URL: https://issues.apache.org/jira/browse/FLINK-1582
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.8, 0.9
>Reporter: Márton Balassi
>  Labels: starter
>
> When the server side of the socket closes, the socket stream reader does not 
> terminate. When the socket is reinitiated, it does not reconnect; it just gets 
> stuck.
> It would be nice to add options for how the reader should behave 
> when the socket is down: terminate immediately (good for testing and 
> examples) or wait a specified time, possibly forever.





[GitHub] flink pull request: [FLINK-1582][streaming]Allow SocketStream to r...

2015-02-20 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/424#issuecomment-75330500
  
I was thinking of a limit as high as a minute. But after 
reconsidering, I am persuaded by your argument: trying to establish the connection 
every couple of seconds should not be a big overhead, so my whole idea might be 
overkill. Let us have your version for now and generalize it on request.

