[jira] [Commented] (FLINK-1201) Graph API for Flink

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314537#comment-14314537
 ] 

ASF GitHub Bot commented on FLINK-1201:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73749137
  
Thanks @balidani. 
And the Travis tests now pass :tada: 


> Graph API for Flink 
> 
>
> Key: FLINK-1201
> URL: https://issues.apache.org/jira/browse/FLINK-1201
> Project: Flink
>  Issue Type: New Feature
>Reporter: Kostas Tzoumas
>Assignee: Vasia Kalavri
>
> This issue tracks the development of a Graph API/DSL for Flink.
> Until the code is pushed to the Flink repository, collaboration is happening 
> here: https://github.com/project-flink/flink-graph



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Created] (FLINK-1509) Streaming Group Reduce Combiner

2015-02-10 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-1509:


 Summary: Streaming Group Reduce Combiner
 Key: FLINK-1509
 URL: https://issues.apache.org/jira/browse/FLINK-1509
 Project: Flink
  Issue Type: New Feature
  Components: Local Runtime, Optimizer
Reporter: Fabian Hueske
Priority: Minor


Combiners for GroupReduce operators currently always (partially) sort their 
input, even if the data is already appropriately sorted.

It should be relatively easy to add chained and unchained implementations of a 
streaming group reduce combiner strategy.
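The idea behind such a strategy can be sketched outside of Flink's operator API: when the input already arrives sorted by key, groups can be combined on the fly and emitted as soon as the key changes, with no sort and no buffering. A minimal, hypothetical illustration (not the proposed implementation):

```python
from itertools import groupby
from operator import itemgetter

def streaming_combine(sorted_records, key=itemgetter(0)):
    # The input is already sorted by key, so each consecutive group can be
    # combined eagerly -- a streaming combine instead of a sorting combine.
    for k, group in groupby(sorted_records, key=key):
        yield k, sum(v for _, v in group)

records = [("a", 1), ("a", 2), ("b", 5), ("c", 1), ("c", 1)]  # pre-sorted by key
print(list(streaming_combine(records)))  # [('a', 3), ('b', 5), ('c', 2)]
```

Note that this only works because `groupby` relies on the same precondition the ticket describes: keys arriving in sorted (or at least grouped) order.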





[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314478#comment-14314478
 ] 

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-73740648
  
Now that's a lot more helpful. It looks like there is some problem getting the 
hostname. I'll keep googling a bit, but so far it appears that your OS, or 
rather its network configuration, is the cause.
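For reference, the failing call in the Python executor is `s.bind((socket.gethostname(), 0))`, which raises `socket.gaierror` when the machine's hostname does not resolve (a common misconfiguration on macOS). A hedged sketch of a loopback fallback; the `bind_ephemeral` helper is hypothetical and not part of the pull request:

```python
import socket

def bind_ephemeral(sock):
    # Hypothetical fallback: if the hostname does not resolve (or resolves
    # to a non-local address), bind to loopback instead of failing.
    try:
        sock.bind((socket.gethostname(), 0))
    except OSError:  # covers socket.gaierror and EADDRNOTAVAIL
        sock.bind(("127.0.0.1", 0))
    return sock.getsockname()

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host, port = bind_ephemeral(s)
print("bound to %s:%d" % (host, port))
s.close()
```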


> Create a general purpose framework for language bindings
> 
>
> Key: FLINK-377
> URL: https://issues.apache.org/jira/browse/FLINK-377
> Project: Flink
>  Issue Type: Improvement
>Reporter: GitHub Import
>Assignee: Chesnay Schepler
>  Labels: github-import
> Fix For: pre-apache
>
>
> A general purpose API to run operators with arbitrary binaries. 
> This will allow running Stratosphere programs written in Python, JavaScript, 
> Ruby, Go, or whatever you like. 
> We suggest using Google Protocol Buffers for data serialization. This is the 
> list of languages that currently support ProtoBuf: 
> https://code.google.com/p/protobuf/wiki/ThirdPartyAddOns 
> Very early prototype with python: 
> https://github.com/rmetzger/scratch/tree/learn-protobuf (basically testing 
> protobuf)
> For Ruby: https://github.com/infochimps-labs/wukong
> Two new students working at Stratosphere (@skunert and @filiphaase) are 
> working on this.
> The reference binding language will be Python, but other bindings are 
> very welcome.
> The best name for this so far is "stratosphere-lang-bindings".
> I created this issue to track the progress (and give everybody a chance to 
> comment on this)
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/377
> Created by: [rmetzger|https://github.com/rmetzger]
> Labels: enhancement, 
> Assignee: [filiphaase|https://github.com/filiphaase]
> Created at: Tue Jan 07 19:47:20 CET 2014
> State: open
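The core of such a binding is shipping serialized records across a process boundary. A toy length-prefixed framing in Python sketches the transport layer; in the actual proposal the payload encoding would be Protocol Buffers, and `frame`/`unframe` are illustrative names, not part of any Flink API:

```python
import struct

def frame(payload: bytes) -> bytes:
    # Length-prefixed framing: 4-byte big-endian size, then the payload.
    # A real binding would encode the payload with Protocol Buffers.
    return struct.pack(">I", len(payload)) + payload

def unframe(buf: bytes):
    # Yield the payloads back out of a concatenated byte stream.
    off = 0
    while off < len(buf):
        (size,) = struct.unpack_from(">I", buf, off)
        off += 4
        yield buf[off:off + size]
        off += size

stream = frame(b"hello") + frame(b"world")
print([p.decode() for p in unframe(stream)])  # ['hello', 'world']
```

Framing like this is what lets the Java side and an arbitrary external binary agree on record boundaries regardless of the guest language.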





[jira] [Commented] (FLINK-1502) Expose metrics to graphite, ganglia and JMX.

2015-02-10 Thread Henry Saputra (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314476#comment-14314476
 ] 

Henry Saputra commented on FLINK-1502:
--

Ah, makes sense. I was actually looking for FLINK-456.

> Expose metrics to graphite, ganglia and JMX.
> 
>
> Key: FLINK-1502
> URL: https://issues.apache.org/jira/browse/FLINK-1502
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
> Fix For: pre-apache
>
>
> The metrics library makes it easy to expose collected metrics to other 
> systems such as Graphite, Ganglia, or JVM tooling (VisualVM).
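As a rough sketch of the registry/reporter pattern such metrics libraries use (toy classes for illustration only, not the actual metrics-library API):

```python
class MetricRegistry:
    # Toy registry: named counters that tasks increment during execution.
    def __init__(self):
        self.counters = {}

    def inc(self, name, n=1):
        self.counters[name] = self.counters.get(name, 0) + n

class ConsoleReporter:
    # A reporter periodically snapshots the registry and pushes the values
    # to an external system (Graphite, Ganglia, JMX); here we just format lines.
    def __init__(self, registry):
        self.registry = registry

    def report(self):
        return ["%s=%s" % (name, value)
                for name, value in sorted(self.registry.counters.items())]

reg = MetricRegistry()
reg.inc("jobmanager.numRunningJobs")
reg.inc("taskmanager.bytesRead", 1024)
print(ConsoleReporter(reg).report())
```

Swapping the reporter is what makes the same registry exportable to Graphite, Ganglia, or JMX without touching the instrumented code.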






[GitHub] flink pull request: [FLINK-377] [FLINK-671] Generic Interface / PA...

2015-02-10 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-73736669
  
Now I'm getting:

02/10/2015 17:47:06 Job execution switched to status RUNNING.
02/10/2015 17:47:06 DataSource (ValueSource)(1/1) switched to SCHEDULED
02/10/2015 17:47:06 DataSource (ValueSource)(1/1) switched to DEPLOYING
02/10/2015 17:47:06 DataSource (ValueSource)(1/1) switched to RUNNING
02/10/2015 17:47:06 MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to SCHEDULED
02/10/2015 17:47:06 MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to DEPLOYING
02/10/2015 17:47:06 DataSource (ValueSource)(1/1) switched to FINISHED
02/10/2015 17:47:06 MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to RUNNING
02/10/2015 17:47:16 MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to FAILED
java.lang.Exception: The user defined 'open()' method caused an exception: External process for task MapPartition (PythonFlatMap -> PythonCombine) stopped responding.
Traceback (most recent call last):
  File "/var/folders/p0/153pkhvn2vn79b2yszvb64thgn/T/tmp_d688cd6ddf2347ab1d713a847b73d234/flink/executor.py", line 55, in 
    s.bind((socket.gethostname(), 0))
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:491)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:204)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: External process for task MapPartition (PythonFlatMap -> PythonCombine) stopped responding.
Traceback (most recent call last):
  File "/var/folders/p0/153pkhvn2vn79b2yszvb64thgn/T/tmp_d688cd6ddf2347ab1d713a847b73d234/flink/executor.py", line 55, in 
    s.bind((socket.gethostname(), 0))
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
	at org.apache.flink.languagebinding.api.java.common.streaming.Streamer.open(Streamer.java:72)
	at org.apache.flink.languagebinding.api.java.python.functions.PythonMapPartition.open(PythonMapPartition.java:48)
	at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33)
	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:487)
	... 3 more

02/10/2015 17:47:16 Job execution switched to status FAILING.
02/10/2015 17:47:16 GroupReduce (PythonGroupReducePreStep)(1/1) switched to CANCELED
02/10/2015 17:47:16 MapPartition (PythonGroupReduce)(1/1) switched to CANCELED
02/10/2015 17:47:16 DataSink(PrintSink)(1/1) switched to CANCELED
02/10/2015 17:47:16 Job execution switched to status FAILED.
Error: The program execution failed: java.lang.Exception: The user defined 'open()' method caused an exception: External process for task MapPartition (PythonFlatMap -> PythonCombine) stopped responding.
Traceback (most recent call last):
  File "/var/folders/p0/153pkhvn2vn79b2yszvb64thgn/T/tmp_d688cd6ddf2347ab1d713a847b73d234/flink/executor.py", line 55, in 
    s.bind((socket.gethostname(), 0))


[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons (...

2015-02-10 Thread balidani
Github user balidani commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73732358
  
@vasia my ICLA has been filed!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314428#comment-14314428
 ] 

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-73731632
  
If you don't mind, try the example again and let it run. Both processes 
should time out after 5 minutes, throwing exceptions that hopefully point to 
the origin of the lock.
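The timeout idea can be illustrated with a plain socket: an accept that would otherwise block forever surfaces an exception instead. The values here are shortened for the demo (the real setup reportedly waits 5 minutes):

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
srv.settimeout(0.2)  # 5 minutes in the real setup; shortened for the demo
try:
    srv.accept()  # no client ever connects, so this raises instead of hanging
except socket.timeout:
    print("timed out waiting for the external process")
finally:
    srv.close()
```

The resulting traceback at least shows which side of the handshake was still waiting, which is exactly the diagnostic value of the suggestion above.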






[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314412#comment-14314412
 ] 

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-73729595
  
Hmm, works for me. Could be the same issue Robert reported. Is a Python 
process active?






[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314393#comment-14314393
 ] 

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-73727421
  
Ah, probably something went wrong when adding the debugging mode, give me a sec...






[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314382#comment-14314382
 ] 

ASF GitHub Bot commented on FLINK-377:
--

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-73726322
  
With the current version of the pull request, I get the following when executing

/pyflink2.sh ../resources/python/flink/example/WordCount.py

Job execution switched to status RUNNING.
DataSource (ValueSource)(1/1) switched to SCHEDULED
DataSource (ValueSource)(1/1) switched to DEPLOYING
DataSource (ValueSource)(1/1) switched to RUNNING
MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to SCHEDULED
MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to DEPLOYING
DataSource (ValueSource)(1/1) switched to FINISHED
MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to RUNNING

This hangs forever. If I abort, no Exception is thrown. The job manager log states:

org.apache.flink.runtime.jobmanager.JobManager$$anonfun$main$1$$anon$1 - Received job 920dca962d0ba21e3be8d3997c0940f1 (Flink Java Job at Tue Feb 10 16:54:56 CET 2015).
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$main$1$$anon$1 - Scheduling job Flink Java Job at Tue Feb 10 16:54:56 CET 2015.
org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying DataSource (ValueSource) (1/1) (attempt #0) to localhost
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$main$1$$anon$1 - Status of job 920dca962d0ba21e3be8d3997c0940f1 (Flink Java Job at Tue Feb 10 16:54:56 CET 2015) changed to RUNNING.
org.apache.flink.runtime.taskmanager.TaskManager - There is no profiling enabled for the task manager.
org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying MapPartition (PythonFlatMap -> PythonCombine) (1/1) (attempt #0) to localhost
org.apache.flink.runtime.taskmanager.Task - DataSource (ValueSource) (1/1) switched to FINISHED






[jira] [Resolved] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1492.
---
   Resolution: Fixed
Fix Version/s: 0.8.1

Resolved for 0.8.1 in 
https://git1-us-west.apache.org/repos/asf?p=flink.git;a=commit;h=5b420d84
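The failure mode in the description is deleting a BLOB directory that another shutdown path already removed. A tolerant cleanup can be sketched as a Python analogue of the fix; the `safe_delete_dir` helper is hypothetical, not the actual patch:

```python
import shutil
import tempfile
from pathlib import Path

def safe_delete_dir(path):
    # Tolerate a directory that a concurrent shutdown hook already removed,
    # instead of raising the way FileUtils.deleteDirectory does on a
    # missing directory.
    p = Path(path)
    if p.is_dir():
        shutil.rmtree(p, ignore_errors=True)
    return not p.exists()

d = tempfile.mkdtemp(prefix="blobStore-")
print(safe_delete_dir(d))  # True: directory removed
print(safe_delete_dir(d))  # True again: a missing directory is not an error
```

Making shutdown idempotent like this is what turns the intermittent exceptions below into a silent no-op.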

> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9, 0.8.1
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
>   at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.jav
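The actual fix is the one linked in the resolution comments. Purely as an illustration, one way to make this kind of shutdown cleanup idempotent is to tolerate an already-missing directory, using only the JDK (class and method names here are hypothetical, not Flink's implementation):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class SafeCleanup {

    // Idempotent recursive delete: a directory that is already gone counts as
    // success, so two components racing to clean up on shutdown cannot trigger
    // an IllegalArgumentException like the one in the stack trace above.
    static void deleteQuietly(Path dir) {
        if (!Files.exists(dir)) {
            return; // another shutdown path already removed it
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // delete children before parents
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException ignored) {
                    // best effort during shutdown
                }
            });
        } catch (IOException ignored) {
            // best effort during shutdown
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("blobStore-demo");
        Files.createFile(tmp.resolve("blob_demo"));
        deleteQuietly(tmp);
        deleteQuietly(tmp); // second call is a no-op instead of throwing
        System.out.println("exists after cleanup: " + Files.exists(tmp)); // false
    }
}
```

Both the existence check and Files.deleteIfExists make the second (racing) caller a no-op rather than a failure.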

[jira] [Commented] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314348#comment-14314348
 ] 

ASF GitHub Bot commented on FLINK-1489:
---

Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/378#discussion_r24420806
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java ---
@@ -416,13 +426,26 @@ boolean scheduleOrUpdateConsumers(List> consumers) throws Ex
            final ExecutionState consumerState = consumerVertex.getExecutionState();

            if (consumerState == CREATED) {
-               if (state == RUNNING) {
-                   if (!consumerVertex.scheduleForExecution(consumerVertex.getExecutionGraph().getScheduler(), false)) {
-                       success = false;
+               consumerVertex.cachePartitionInfo(PartialPartitionInfo.fromEdge(edge));
+
+               future(new Callable(){
+                   @Override
+                   public Boolean call() throws Exception {
+                       try {
+                           consumerVertex.scheduleForExecution(
+                               consumerVertex.getExecutionGraph().getScheduler(), false);
+                       } catch (Exception exception) {
+                           fail(new IllegalStateException("Could not schedule consumer " +
+                               "vertex " + consumerVertex, exception));
+                       }
+
+                       return true;
                    }
-               }
-               else {
-                   success = false;
+               }, AkkaUtils.globalExecutionContext());
+
+               // double check to resolve race conditions
+               if (consumerVertex.getExecutionState() == RUNNING) {
+                   consumerVertex.sendPartitionInfos();
--- End diff --

Yeah, true. :-)


> Failing JobManager due to blocking calls in 
> Execution.scheduleOrUpdateConsumers
> ---
>
> Key: FLINK-1489
> URL: https://issues.apache.org/jira/browse/FLINK-1489
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> [~Zentol] reported that the JobManager failed to execute his python job. The 
> reason is that the JobManager executes blocking calls in the actor thread 
> in the method {{Execution.sendUpdateTaskRpcCall}} as a result of receiving a 
> {{ScheduleOrUpdateConsumers}} message. 
> Every TaskManager possibly sends a {{ScheduleOrUpdateConsumers}} to the 
> JobManager to notify the consumers about available data. The JobManager then 
> sends to each TaskManager the respective update call 
> {{Execution.sendUpdateTaskRpcCall}}. By blocking the actor thread, we 
> effectively execute the update calls sequentially. Due to the ever 
> accumulating delay, some of the initial timeouts on the TaskManager side in 
> {{IntermediateResultPartition.scheduleOrUpdateConsumers}} fail. As a result, 
> the execution of the respective Tasks fails.
> A solution would be to make the call non-blocking.
> A general caveat for actor programming is: we should never block the actor 
> thread; otherwise we seriously jeopardize the scalability of the system, or, 
> even worse, the system simply fails.
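The non-blocking behavior proposed above can be sketched outside of Akka with plain JDK futures. This only models the pattern (all names are hypothetical, not the pull request's code): offload the potentially slow scheduling call to an executor so the message handler returns immediately.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class NonBlockingScheduling {

    // Instead of blocking the (actor) thread on a slow scheduling call,
    // run it on a shared executor and return a future with the outcome.
    static CompletableFuture<Boolean> scheduleAsync(Runnable blockingCall, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                blockingCall.run(); // the slow part runs off the caller's thread
                return true;
            } catch (RuntimeException e) {
                // in the real fix, failures become an explicit fail(...) call
                return false;
            }
        }, pool);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // The handler thread is free immediately; the result arrives later.
        CompletableFuture<Boolean> result = scheduleAsync(() -> {
            try { Thread.sleep(100); } catch (InterruptedException ignored) { }
        }, pool);
        System.out.println("handler returned, scheduling still in flight");
        System.out.println("scheduled: " + result.get()); // true
        pool.shutdown();
    }
}
```

Because the handler never waits, many such calls proceed concurrently instead of queuing up behind one another in the actor mailbox.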



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-10 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/378#discussion_r24420806
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java ---
@@ -416,13 +426,26 @@ boolean scheduleOrUpdateConsumers(List> consumers) throws Ex
            final ExecutionState consumerState = consumerVertex.getExecutionState();

            if (consumerState == CREATED) {
-               if (state == RUNNING) {
-                   if (!consumerVertex.scheduleForExecution(consumerVertex.getExecutionGraph().getScheduler(), false)) {
-                       success = false;
+               consumerVertex.cachePartitionInfo(PartialPartitionInfo.fromEdge(edge));
+
+               future(new Callable(){
+                   @Override
+                   public Boolean call() throws Exception {
+                       try {
+                           consumerVertex.scheduleForExecution(
+                               consumerVertex.getExecutionGraph().getScheduler(), false);
+                       } catch (Exception exception) {
+                           fail(new IllegalStateException("Could not schedule consumer " +
+                               "vertex " + consumerVertex, exception));
+                       }
+
+                       return true;
                    }
-               }
-               else {
-                   success = false;
+               }, AkkaUtils.globalExecutionContext());
+
+               // double check to resolve race conditions
+               if (consumerVertex.getExecutionState() == RUNNING) {
+                   consumerVertex.sendPartitionInfos();
--- End diff --

Yeah, true. :-)




[jira] [Commented] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314343#comment-14314343
 ] 

ASF GitHub Bot commented on FLINK-1489:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/378#discussion_r24420364
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java ---
@@ -416,13 +426,26 @@ boolean scheduleOrUpdateConsumers(List> consumers) throws Ex
            final ExecutionState consumerState = consumerVertex.getExecutionState();

            if (consumerState == CREATED) {
-               if (state == RUNNING) {
-                   if (!consumerVertex.scheduleForExecution(consumerVertex.getExecutionGraph().getScheduler(), false)) {
-                       success = false;
+               consumerVertex.cachePartitionInfo(PartialPartitionInfo.fromEdge(edge));
+
+               future(new Callable(){
+                   @Override
+                   public Boolean call() throws Exception {
+                       try {
+                           consumerVertex.scheduleForExecution(
+                               consumerVertex.getExecutionGraph().getScheduler(), false);
+                       } catch (Exception exception) {
+                           fail(new IllegalStateException("Could not schedule consumer " +
+                               "vertex " + consumerVertex, exception));
+                       }
+
+                       return true;
                    }
-               }
-               else {
-                   success = false;
+               }, AkkaUtils.globalExecutionContext());
+
+               // double check to resolve race conditions
+               if (consumerVertex.getExecutionState() == RUNNING) {
+                   consumerVertex.sendPartitionInfos();
--- End diff --

The UpdateTask messages are idempotent in the ```BufferReader```. But my 
intention was not to send any UpdateTask messages twice. The 
```ConcurrentLinkedQueue``` should make sure that every element is only 
dequeued once.
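The exactly-once hand-off that ```ConcurrentLinkedQueue``` provides can be demonstrated in isolation (a standalone sketch, not Flink code): even with two racing consumers, each cached element is dequeued by exactly one of them.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DequeueOnce {
    public static void main(String[] args) throws Exception {
        ConcurrentLinkedQueue<Integer> cached = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 1000; i++) {
            cached.add(i);
        }

        // Two racing "senders" drain the same queue; poll() hands each element
        // to exactly one of them, so no element is sent twice.
        ConcurrentLinkedQueue<Integer> sent = new ConcurrentLinkedQueue<>();
        Runnable drain = () -> {
            Integer x;
            while ((x = cached.poll()) != null) {
                sent.add(x);
            }
        };
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<?> a = pool.submit(drain);
        Future<?> b = pool.submit(drain);
        a.get();
        b.get();
        pool.shutdown();

        System.out.println("delivered: " + sent.size()); // 1000, each exactly once
    }
}
```

poll() atomically removes the head, so duplicate delivery would require the same element to be returned twice, which the queue rules out.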


> Failing JobManager due to blocking calls in 
> Execution.scheduleOrUpdateConsumers
> ---
>
> Key: FLINK-1489
> URL: https://issues.apache.org/jira/browse/FLINK-1489
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> [~Zentol] reported that the JobManager failed to execute his python job. The 
> reason is that the JobManager executes blocking calls in the actor thread 
> in the method {{Execution.sendUpdateTaskRpcCall}} as a result of receiving a 
> {{ScheduleOrUpdateConsumers}} message. 
> Every TaskManager possibly sends a {{ScheduleOrUpdateConsumers}} to the 
> JobManager to notify the consumers about available data. The JobManager then 
> sends to each TaskManager the respective update call 
> {{Execution.sendUpdateTaskRpcCall}}. By blocking the actor thread, we 
> effectively execute the update calls sequentially. Due to the ever 
> accumulating delay, some of the initial timeouts on the TaskManager side in 
> {{IntermediateResultPartition.scheduleOrUpdateConsumers}} fail. As a result, 
> the execution of the respective Tasks fails.
> A solution would be to make the call non-blocking.
> A general caveat for actor programming is: we should never block the actor 
> thread; otherwise we seriously jeopardize the scalability of the system, or, 
> even worse, the system simply fails.





[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-10 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/378#discussion_r24420364
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java ---
@@ -416,13 +426,26 @@ boolean scheduleOrUpdateConsumers(List> consumers) throws Ex
            final ExecutionState consumerState = consumerVertex.getExecutionState();

            if (consumerState == CREATED) {
-               if (state == RUNNING) {
-                   if (!consumerVertex.scheduleForExecution(consumerVertex.getExecutionGraph().getScheduler(), false)) {
-                       success = false;
+               consumerVertex.cachePartitionInfo(PartialPartitionInfo.fromEdge(edge));
+
+               future(new Callable(){
+                   @Override
+                   public Boolean call() throws Exception {
+                       try {
+                           consumerVertex.scheduleForExecution(
+                               consumerVertex.getExecutionGraph().getScheduler(), false);
+                       } catch (Exception exception) {
+                           fail(new IllegalStateException("Could not schedule consumer " +
+                               "vertex " + consumerVertex, exception));
+                       }
+
+                       return true;
                    }
-               }
-               else {
-                   success = false;
+               }, AkkaUtils.globalExecutionContext());
+
+               // double check to resolve race conditions
+               if (consumerVertex.getExecutionState() == RUNNING) {
+                   consumerVertex.sendPartitionInfos();
--- End diff --

The UpdateTask messages are idempotent in the ```BufferReader```. But my 
intention was not to send any UpdateTask messages twice. The 
```ConcurrentLinkedQueue``` should make sure that every element is only 
dequeued once.




[jira] [Commented] (FLINK-1343) Branching Join Program Deadlocks

2015-02-10 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314326#comment-14314326
 ] 

Fabian Hueske commented on FLINK-1343:
--

After looking into the execution plan of the deadlocking program and getting 
stacktraces of the deadlocked execution, I think I found the reason for this 
deadlock.

The first join is executed as a HybridHashJoin. Both sides are fed from the same 
data source. In order to prevent a deadlock, the optimizer places a tempBarrier 
on the probe-side input. However, the implementation of the join 
({{MatchDriver}}) blocks until both inputs are available (lines 103/104: 
{{this.taskContext.getInput()}}) before it starts building the hashtable. This 
behavior causes the deadlock because the build side is not fully consumed (as 
assumed by the optimizer). Instead, the HashJoin waits for the probe side to 
become available from the tempBarrier and gets stuck because the build input is 
not consumed.

Forcing the first join to be executed as a MergeJoin fixes the deadlock.
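The blocking pattern in this analysis can be modeled with plain threads and bounded queues (a standalone sketch with hypothetical names, not Flink code): a single source feeds a small pipelined build-side buffer plus a spooled probe side that only becomes available after the source finishes; a join that waits for both inputs before draining the build side then deadlocks.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class JoinDeadlockModel {

    // Returns true if the probe side ever became available to the join.
    static boolean runScenario(int buildBufferCapacity, int records, long timeoutMs)
            throws InterruptedException {
        BlockingQueue<Integer> buildPipe = new ArrayBlockingQueue<>(buildBufferCapacity);
        BlockingQueue<Integer> probeSpool = new LinkedBlockingQueue<>();
        CountDownLatch probeAvailable = new CountDownLatch(1); // "tempBarrier done"

        Thread source = new Thread(() -> {
            try {
                for (int i = 0; i < records; i++) {
                    buildPipe.put(i);   // blocks once the small pipelined buffer is full
                    probeSpool.put(i);  // tempBarrier spools the probe side
                }
                probeAvailable.countDown(); // barrier releases only after the source finished
            } catch (InterruptedException ignored) { }
        });
        source.start();

        // MatchDriver-like behavior: wait for BOTH inputs before draining the
        // build side. If the source is stuck on a full buildPipe, this never fires.
        boolean available = probeAvailable.await(timeoutMs, TimeUnit.MILLISECONDS);
        source.interrupt();
        source.join();
        return available;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("small buffer, join waits for both inputs -> deadlock: "
                + !runScenario(2, 10, 300));
        System.out.println("buffer large enough to hold the build side -> completes: "
                + runScenario(100, 10, 2000));
    }
}
```

With a 2-slot build buffer the source blocks on its third record, the barrier never completes, and the "join" waits forever; a merge join (or draining the build side first) breaks the cycle.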

> Branching Join Program Deadlocks
> 
>
> Key: FLINK-1343
> URL: https://issues.apache.org/jira/browse/FLINK-1343
> Project: Flink
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 0.8, 0.9
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>
> The following program, which gets its data from a single non-parallel data 
> source, branches two times, and joins the branches with two joins, deadlocks.
> {code:java}
> public class DeadlockProgram {
>     public static void main(String[] args) throws Exception {
>         final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>         DataSet<Long> longs = env.generateSequence(0, 100l).setParallelism(1);
>         DataSet<Long> longs2 = env.generateSequence(0, 100l).setParallelism(1);
>         DataSet<Tuple1<Long>> longT1 = longs.map(new TupleWrapper());
>         DataSet<Tuple1<Long>> longT2 = longT1.project(0);
>         DataSet<Tuple1<Long>> longT3 = longs.map(new TupleWrapper()); // deadlocks
>         // DataSet<Tuple1<Long>> longT3 = longs2.map(new TupleWrapper()); // works
>         longT2.join(longT3).where(0).equalTo(0).projectFirst(0)
>                 .join(longT1).where(0).equalTo(0).projectFirst(0)
>                 .print();
>         env.execute();
>     }
>
>     public static class TupleWrapper implements MapFunction<Long, Tuple1<Long>> {
>         @Override
>         public Tuple1<Long> map(Long l) throws Exception {
>             return new Tuple1<Long>(l);
>         }
>     }
> }
> {code}
> If one of the branches reads its data from a second data source (see inline 
> comment) or if the single data source uses the default parallelism, the 
> program executes correctly.





[jira] [Assigned] (FLINK-1461) Add sortPartition operator

2015-02-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-1461:


Assignee: Fabian Hueske

> Add sortPartition operator
> --
>
> Key: FLINK-1461
> URL: https://issues.apache.org/jira/browse/FLINK-1461
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Local Runtime, Optimizer, Scala API
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> A {{sortPartition()}} operator can be used to
> * sort the input of a {{mapPartition()}} operator
> * enforce a certain sorting of the input of a given operator of a program. 
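The per-partition semantics the new operator would have can be illustrated without Flink (a sketch of the assumed behavior, since the operator does not exist yet): each parallel partition is sorted independently, with no global order across partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortPartitionSketch {

    // Hypothetical semantics of sortPartition(): sort every partition locally.
    // Partitions are modeled as plain lists here.
    static List<List<Integer>> sortPartitions(List<List<Integer>> partitions) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            List<Integer> sorted = new ArrayList<>(partition);
            Collections.sort(sorted); // local sort only, no shuffle
            out.add(sorted);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = Arrays.asList(
                Arrays.asList(3, 1, 2),
                Arrays.asList(9, 7, 8));
        // Each partition is ordered; elements never move between partitions.
        System.out.println(sortPartitions(parts)); // [[1, 2, 3], [7, 8, 9]]
    }
}
```

This is exactly the guarantee a downstream mapPartition() would rely on: sorted input within its partition, nothing more.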





[jira] [Commented] (FLINK-1502) Expose metrics to graphite, ganglia and JMX.

2015-02-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314262#comment-14314262
 ] 

Robert Metzger commented on FLINK-1502:
---

No. This task depends on FLINK-1501.
But I see FLINK-1501 and FLINK-1502 both as subtasks of FLINK-456.

> Expose metrics to graphite, ganglia and JMX.
> 
>
> Key: FLINK-1502
> URL: https://issues.apache.org/jira/browse/FLINK-1502
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
> Fix For: pre-apache
>
>
> The metrics library makes it easy to expose collected metrics to other 
> systems such as graphite, ganglia, or Java's JVM tooling (VisualVM).
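Without pulling in the metrics library itself, the JMX leg of this can be sketched with the JDK's built-in MBean support (hypothetical names, not the proposed Flink integration): register a standard MBean and its attributes become visible to VisualVM or any JMX client.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxMetricSketch {

    // Standard MBean pattern: class Counter + interface CounterMBean.
    public interface CounterMBean {
        long getCount();
    }

    public static class Counter implements CounterMBean {
        private volatile long count;
        public void inc() { count++; }
        @Override public long getCount() { return count; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Hypothetical metric name, just for the demo.
        ObjectName name = new ObjectName("demo.metrics:type=Counter,name=recordsProcessed");
        Counter counter = new Counter();
        server.registerMBean(counter, name);

        counter.inc();
        counter.inc();
        // A JMX client (e.g. VisualVM) would now see the attribute; read it back locally:
        System.out.println(server.getAttribute(name, "Count")); // 2
    }
}
```

The metrics library's JMX reporter does essentially this for every registered metric, so TaskManager/JobManager counters would show up under a stable ObjectName scheme.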





[jira] [Commented] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314240#comment-14314240
 ] 

ASF GitHub Bot commented on FLINK-1489:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73708514
  
Looks good to me. +1

We chatted about batching update task calls. Did you run into a problem with 
it, or can we open an "improvement" issue for it?


> Failing JobManager due to blocking calls in 
> Execution.scheduleOrUpdateConsumers
> ---
>
> Key: FLINK-1489
> URL: https://issues.apache.org/jira/browse/FLINK-1489
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> [~Zentol] reported that the JobManager failed to execute his python job. The 
> reason is that the the JobManager executes blocking calls in the actor thread 
> in the method {{Execution.sendUpdateTaskRpcCall}} as a result to receiving a 
> {{ScheduleOrUpdateConsumers}} message. 
> Every TaskManager possibly sends a {{ScheduleOrUpdateConsumers}} to the 
> JobManager to notify the consumers about available data. The JobManager then 
> sends to each TaskManager the respective update call 
> {{Execution.sendUpdateTaskRpcCall}}. By blocking the actor thread, we 
> effectively execute the update calls sequentially. Due to the ever 
> accumulating delay, some of the initial timeouts on the TaskManager side in 
> {{IntermediateResultParititon.scheduleOrUpdateConsumers}} fail. As a result 
> the execution of the respective Tasks fails.
> A solution would be to make the call non-blocking.
> A general caveat for actor programming is: We should never block the actor 
> thread, otherwise we seriously jeopardize the scalability of the system. Or 
> even worse, the system simply fails.





[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-10 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73708514
  
Looks good to me. +1

We chatted about batching update task calls. Did you run into a problem with 
it, or can we open an "improvement" issue for it?




[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314237#comment-14314237
 ] 

ASF GitHub Bot commented on FLINK-1492:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/376


> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
>   at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:134

[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...

2015-02-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/376




[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314238#comment-14314238
 ] 

Robert Metzger commented on FLINK-1492:
---

Resolved for master in 
http://git-wip-us.apache.org/repos/asf/flink/commit/b88f909c

> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: 
> /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
>   at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> 

[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-10 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/378#discussion_r24415175
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java ---
@@ -416,13 +426,26 @@ boolean scheduleOrUpdateConsumers(List> consumers) throws Ex
            final ExecutionState consumerState = consumerVertex.getExecutionState();

            if (consumerState == CREATED) {
-               if (state == RUNNING) {
-                   if (!consumerVertex.scheduleForExecution(consumerVertex.getExecutionGraph().getScheduler(), false)) {
-                       success = false;
+               consumerVertex.cachePartitionInfo(PartialPartitionInfo.fromEdge(edge));
+
+               future(new Callable(){
+                   @Override
+                   public Boolean call() throws Exception {
+                       try {
+                           consumerVertex.scheduleForExecution(
+                               consumerVertex.getExecutionGraph().getScheduler(), false);
+                       } catch (Exception exception) {
+                           fail(new IllegalStateException("Could not schedule consumer " +
+                               "vertex " + consumerVertex, exception));
+                       }
+
+                       return true;
                    }
-               }
-               else {
-                   success = false;
+               }, AkkaUtils.globalExecutionContext());
+
+               // double check to resolve race conditions
+               if (consumerVertex.getExecutionState() == RUNNING) {
+                   consumerVertex.sendPartitionInfos();
--- End diff --

Just to verify: the double check & send relies on the fact that update 
messages at the task manager are idempotent, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
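
The review question above hinges on update messages at the TaskManager being idempotent: the "double check & send" may deliver the same partition info twice, which is only safe if a repeated delivery leaves the receiver's state unchanged. A minimal sketch of that property, with hypothetical names (this is not the Flink runtime API):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical receiver whose update handling is idempotent: applying the
// same partition-info update twice leaves the state unchanged, so a
// duplicate send caused by the race-resolving double check is harmless.
public class IdempotentUpdates {

    // Known partition locations, keyed by a (hypothetical) partition id.
    private final Set<String> knownPartitions = ConcurrentHashMap.newKeySet();

    // Returns true if the update changed anything; a repeated delivery
    // returns false and is effectively a no-op.
    public boolean updatePartitionInfo(String partitionId) {
        return knownPartitions.add(partitionId);
    }

    public int knownCount() {
        return knownPartitions.size();
    }
}
```

Under this assumption, sending the same update once or twice yields the same receiver state, which is what makes the double check safe.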


[jira] [Commented] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314232#comment-14314232
 ] 

ASF GitHub Bot commented on FLINK-1489:
---

Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/378#discussion_r24415175
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java ---
@@ -416,13 +426,26 @@ boolean scheduleOrUpdateConsumers(List> consumers) throws Ex
 			final ExecutionState consumerState = consumerVertex.getExecutionState();
 
 			if (consumerState == CREATED) {
-				if (state == RUNNING) {
-					if (!consumerVertex.scheduleForExecution(consumerVertex.getExecutionGraph().getScheduler(), false)) {
-						success = false;
+				consumerVertex.cachePartitionInfo(PartialPartitionInfo.fromEdge(edge));
+
+				future(new Callable(){
+					@Override
+					public Boolean call() throws Exception {
+						try {
+							consumerVertex.scheduleForExecution(
+								consumerVertex.getExecutionGraph().getScheduler(), false);
+						} catch (Exception exception) {
+							fail(new IllegalStateException("Could not schedule consumer " +
+								"vertex " + consumerVertex, exception));
+						}
+
+						return true;
 					}
-				}
-				else {
-					success = false;
+				}, AkkaUtils.globalExecutionContext());
+
+				// double check to resolve race conditions
+				if (consumerVertex.getExecutionState() == RUNNING) {
+					consumerVertex.sendPartitionInfos();
--- End diff --

Just to verify: the double check & send relies on the fact that update 
messages at the task manager are idempotent, right?


> Failing JobManager due to blocking calls in 
> Execution.scheduleOrUpdateConsumers
> ---
>
> Key: FLINK-1489
> URL: https://issues.apache.org/jira/browse/FLINK-1489
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> [~Zentol] reported that the JobManager failed to execute his Python job. The 
> reason is that the JobManager executes blocking calls in the actor thread 
> in the method {{Execution.sendUpdateTaskRpcCall}} as a result of receiving a 
> {{ScheduleOrUpdateConsumers}} message. 
> Every TaskManager may send a {{ScheduleOrUpdateConsumers}} message to the 
> JobManager to notify the consumers about available data. The JobManager then 
> sends each TaskManager the respective update call via 
> {{Execution.sendUpdateTaskRpcCall}}. By blocking the actor thread, we 
> effectively execute the update calls sequentially. Due to the ever 
> accumulating delay, some of the initial timeouts on the TaskManager side in 
> {{IntermediateResultPartition.scheduleOrUpdateConsumers}} fail. As a result, 
> the execution of the respective Tasks fails.
> A solution would be to make the call non-blocking.
> A general caveat for actor programming is: We should never block the actor 
> thread, otherwise we seriously jeopardize the scalability of the system. Or 
> even worse, the system simply fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
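
The fix discussed above moves the scheduling call off the actor thread by wrapping it in a future. The idea can be sketched with a plain `ExecutorService` instead of Akka's `future(...)` helper (names like `scheduleForExecution` are stand-ins here, not the actual Flink API):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Simplified sketch of non-blocking scheduling: the caller (in Flink, the
// JobManager actor thread) submits the work and returns immediately, so
// update calls no longer execute sequentially in the actor thread.
public class AsyncScheduling {

    static final ExecutorService EXECUTOR = Executors.newCachedThreadPool();

    // Hypothetical stand-in for consumerVertex.scheduleForExecution(scheduler, false).
    static boolean scheduleForExecution(String vertexName) {
        return true;
    }

    // Failures surface through the Future (in the PR, fail(...) is called
    // on the execution instead), never by blocking the submitting thread.
    public static Future<Boolean> scheduleAsync(final String vertexName) {
        return EXECUTOR.submit(new Callable<Boolean>() {
            @Override
            public Boolean call() {
                try {
                    return scheduleForExecution(vertexName);
                } catch (Exception e) {
                    throw new IllegalStateException(
                        "Could not schedule consumer vertex " + vertexName, e);
                }
            }
        });
    }

    // Convenience for demonstration only: block until the result is available.
    public static boolean scheduleAndWait(String vertexName) {
        try {
            return scheduleAsync(vertexName).get();
        } catch (Exception e) {
            return false;
        }
    }
}
```

In the actual PR the continuation runs on `AkkaUtils.globalExecutionContext()` rather than a dedicated thread pool, but the non-blocking structure is the same.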


[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314178#comment-14314178
 ] 

ASF GitHub Bot commented on FLINK-1492:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/376#issuecomment-73699625
  
OK, thanks. Just a reminder: we need to include this in 0.8.1 as well.


> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: 
> /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at 
> org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at 
> akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR 
> org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
>   - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: 
> /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.jav

[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...

2015-02-10 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/376#issuecomment-73699625
  
OK, thanks. Just a reminder: we need to include this in 0.8.1 as well.




[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314171#comment-14314171
 ] 

ASF GitHub Bot commented on FLINK-1492:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/376#issuecomment-73699086
  
I'll merge the change.


> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: 
> /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at 
> org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at 
> akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR 
> org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
>   - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: 
> /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.fork

[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...

2015-02-10 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/376#issuecomment-73699086
  
I'll merge the change.




[jira] [Resolved] (FLINK-1369) The Pojo Serializers/Comparators fail when using Subclasses or Interfaces

2015-02-10 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek resolved FLINK-1369.
-
   Resolution: Fixed
Fix Version/s: 0.9

Resolved in 
https://github.com/apache/flink/commit/7407076d3990752eb5fa4072cd036efd2f656cbc

> The Pojo Serializers/Comparators fail when using Subclasses or Interfaces
> -
>
> Key: FLINK-1369
> URL: https://issues.apache.org/jira/browse/FLINK-1369
> Project: Flink
>  Issue Type: Improvement
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Minor
> Fix For: 0.9
>
>






[GitHub] flink pull request: Add support for Subclasses, Interfaces, Abstra...

2015-02-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/236




[jira] [Commented] (FLINK-1508) Remove AkkaUtils.ask to encourage explicit future handling

2015-02-10 Thread Chiwan Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314119#comment-14314119
 ] 

Chiwan Park commented on FLINK-1508:


+1
I agree. I misused it in [#374|https://github.com/apache/flink/pull/374].

> Remove AkkaUtils.ask to encourage explicit future handling
> --
>
> Key: FLINK-1508
> URL: https://issues.apache.org/jira/browse/FLINK-1508
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>
> {{AkkaUtils.ask}} asks another actor and awaits its response. Since this 
> constitutes a blocking call, it is potentially harmful when used in an 
> actor thread. In order to encourage developers to program asynchronously, I 
> propose to remove this helper function. That forces the developer to handle 
> futures explicitly.





[jira] [Commented] (FLINK-1201) Graph API for Flink

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314116#comment-14314116
 ] 

ASF GitHub Bot commented on FLINK-1201:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73694315
  
Ha! Thanks :))
On Feb 10, 2015 1:10 PM, "zentol"  wrote:

> in the pom.xml in flink-gelly; change flink-addons to flink-staging



> Graph API for Flink 
> 
>
> Key: FLINK-1201
> URL: https://issues.apache.org/jira/browse/FLINK-1201
> Project: Flink
>  Issue Type: New Feature
>Reporter: Kostas Tzoumas
>Assignee: Vasia Kalavri
>
> This issue tracks the development of a Graph API/DSL for Flink.
> Until the code is pushed to the Flink repository, collaboration is happening 
> here: https://github.com/project-flink/flink-graph





[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons (...

2015-02-10 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73694315
  
Ha! Thanks :))
On Feb 10, 2015 1:10 PM, "zentol"  wrote:

> in the pom.xml in flink-gelly; change flink-addons to flink-staging





[jira] [Commented] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314110#comment-14314110
 ] 

ASF GitHub Bot commented on FLINK-1489:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73693561
  
Very nice. I will have a detailed look later.

@zentol Can you also test it with the Python API? I think you initially 
noticed the problem.


> Failing JobManager due to blocking calls in 
> Execution.scheduleOrUpdateConsumers
> ---
>
> Key: FLINK-1489
> URL: https://issues.apache.org/jira/browse/FLINK-1489
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> [~Zentol] reported that the JobManager failed to execute his Python job. The 
> reason is that the JobManager executes blocking calls in the actor thread 
> in the method {{Execution.sendUpdateTaskRpcCall}} as a result of receiving a 
> {{ScheduleOrUpdateConsumers}} message. 
> Every TaskManager may send a {{ScheduleOrUpdateConsumers}} message to the 
> JobManager to notify the consumers about available data. The JobManager then 
> sends each TaskManager the respective update call via 
> {{Execution.sendUpdateTaskRpcCall}}. By blocking the actor thread, we 
> effectively execute the update calls sequentially. Due to the ever 
> accumulating delay, some of the initial timeouts on the TaskManager side in 
> {{IntermediateResultPartition.scheduleOrUpdateConsumers}} fail. As a result, 
> the execution of the respective Tasks fails.
> A solution would be to make the call non-blocking.
> A general caveat for actor programming is: We should never block the actor 
> thread, otherwise we seriously jeopardize the scalability of the system. Or 
> even worse, the system simply fails.





[GitHub] flink pull request: [FLINK-1489] Fixes blocking scheduleOrUpdateCo...

2015-02-10 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/378#issuecomment-73693561
  
Very nice. I will have a detailed look later.

@zentol Can you also test it with the Python API? I think you initially 
noticed the problem.




[jira] [Commented] (FLINK-1508) Remove AkkaUtils.ask to encourage explicit future handling

2015-02-10 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314086#comment-14314086
 ] 

Ufuk Celebi commented on FLINK-1508:


I agree.

I was actually misusing it earlier. Will it be used anywhere else after 
[#378|https://github.com/apache/flink/pull/378] is merged?

> Remove AkkaUtils.ask to encourage explicit future handling
> --
>
> Key: FLINK-1508
> URL: https://issues.apache.org/jira/browse/FLINK-1508
> Project: Flink
>  Issue Type: Bug
>Reporter: Till Rohrmann
>
> {{AkkaUtils.ask}} asks another actor and awaits its response. Since this 
> constitutes a blocking call, it is potentially harmful when used in an 
> actor thread. In order to encourage developers to program asynchronously, I 
> propose to remove this helper function. That forces the developer to handle 
> futures explicitly.





[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...

2015-02-10 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/376#discussion_r24407836
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServer.java ---
@@ -196,23 +199,28 @@ public void run() {
 	 */
 	@Override
 	public void shutdown() throws IOException {
-
-		this.shutdownRequested = true;
-		try {
-			this.serverSocket.close();
-		} catch (IOException ioe) {
+		if (shutdownRequested.compareAndSet(false, true)) {
+			try {
+				this.serverSocket.close();
+			}
+			catch (IOException ioe) {
 				LOG.debug("Error while closing the server socket.", ioe);
-		}
-		try {
-			join();
-		} catch (InterruptedException ie) {
-			LOG.debug("Error while waiting for this thread to die.", ie);
-		}
+			}
+			try {
+				join();
--- End diff --

Ah of course. Thanks for the clarification.


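
The `compareAndSet` guard discussed in this diff is a general pattern for making shutdown idempotent under concurrent calls (e.g. a JVM shutdown hook racing an explicit `shutdown()`). A minimal sketch of the guard in isolation, with hypothetical names:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of an idempotent shutdown guard: compareAndSet(false, true)
// succeeds for exactly one caller, so the cleanup body (closing sockets,
// joining threads, deleting blob directories, ...) runs at most once.
public class GuardedShutdown {

    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);
    private final AtomicInteger shutdownRuns = new AtomicInteger();

    public void shutdown() {
        if (shutdownRequested.compareAndSet(false, true)) {
            // cleanup work goes here; counted so the guard is observable
            shutdownRuns.incrementAndGet();
        }
    }

    public int runs() {
        return shutdownRuns.get();
    }
}
```

Compared to a plain boolean flag, the atomic compare-and-set also closes the window where two threads both read `false` and both run the cleanup.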


[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314080#comment-14314080
 ] 

ASF GitHub Bot commented on FLINK-1492:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/376#discussion_r24407836
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServer.java ---
@@ -196,23 +199,28 @@ public void run() {
 	 */
 	@Override
 	public void shutdown() throws IOException {
-
-		this.shutdownRequested = true;
-		try {
-			this.serverSocket.close();
-		} catch (IOException ioe) {
+		if (shutdownRequested.compareAndSet(false, true)) {
+			try {
+				this.serverSocket.close();
+			}
+			catch (IOException ioe) {
 				LOG.debug("Error while closing the server socket.", ioe);
-		}
-		try {
-			join();
-		} catch (InterruptedException ie) {
-			LOG.debug("Error while waiting for this thread to die.", ie);
-		}
+			}
+			try {
+				join();
--- End diff --

Ah of course. Thanks for the clarification.


> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: 
> /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at 
> org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at 
> akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR 
> org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
>   - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: 
> /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCa

[jira] [Commented] (FLINK-1201) Graph API for Flink

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314072#comment-14314072
 ] 

ASF GitHub Bot commented on FLINK-1201:
---

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73689455
  
in the pom.xml in flink-gelly; change flink-addons to flink-staging


> Graph API for Flink 
> 
>
> Key: FLINK-1201
> URL: https://issues.apache.org/jira/browse/FLINK-1201
> Project: Flink
>  Issue Type: New Feature
>Reporter: Kostas Tzoumas
>Assignee: Vasia Kalavri
>
> This issue tracks the development of a Graph API/DSL for Flink.
> Until the code is pushed to the Flink repository, collaboration is happening 
> here: https://github.com/project-flink/flink-graph





[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons (...

2015-02-10 Thread zentol
Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73689455
  
in the pom.xml in flink-gelly; change flink-addons to flink-staging




[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons (...

2015-02-10 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73688058
  
I'm getting the following error on travis:
```
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:2.12.1:check (validate) on 
project flink-gelly: Failed during checkstyle execution: Unable to find 
suppressions file at location: /tools/maven/suppressions.xml: Could not find 
resource '/tools/maven/suppressions.xml'. -> [Help 1]
```

Any idea how to solve this? Thanks!




[jira] [Commented] (FLINK-1201) Graph API for Flink

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314064#comment-14314064
 ] 

ASF GitHub Bot commented on FLINK-1201:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73688058
  
I'm getting the following error on travis:
```
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:2.12.1:check (validate) on 
project flink-gelly: Failed during checkstyle execution: Unable to find 
suppressions file at location: /tools/maven/suppressions.xml: Could not find 
resource '/tools/maven/suppressions.xml'. -> [Help 1]
```

Any idea how to solve this? Thanks!


> Graph API for Flink 
> 
>
> Key: FLINK-1201
> URL: https://issues.apache.org/jira/browse/FLINK-1201
> Project: Flink
>  Issue Type: New Feature
>Reporter: Kostas Tzoumas
>Assignee: Vasia Kalavri
>
> This issue tracks the development of a Graph API/DSL for Flink.
> Until the code is pushed to the Flink repository, collaboration is happening 
> here: https://github.com/project-flink/flink-graph





[jira] [Created] (FLINK-1508) Remove AkkaUtils.ask to encourage explicit future handling

2015-02-10 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1508:


 Summary: Remove AkkaUtils.ask to encourage explicit future handling
 Key: FLINK-1508
 URL: https://issues.apache.org/jira/browse/FLINK-1508
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann


{{AkkaUtils.ask}} asks another actor and awaits its response. Since this 
constitutes a blocking call, it might be potentially harmful when used in an 
actor thread. In order to encourage developers to program asynchronously I 
propose to remove this helper function. That forces the developer to handle 
futures explicitly.
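The blocking-versus-asynchronous distinction behind this proposal can be sketched with plain JDK futures. This is an analogy using `java.util.concurrent`, not the Akka API; `ask` and `askBlocking` are hypothetical names for illustration:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class AskStyles {
    // Simulates an actor answering a request after some asynchronous work.
    static CompletableFuture<String> ask(String question) {
        return CompletableFuture.supplyAsync(() -> "answer to " + question);
    }

    // Discouraged pattern: blocks the calling thread until the reply
    // arrives, which is harmful when used inside an actor thread.
    static String askBlocking(String question) throws Exception {
        return ask(question).get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        // Encouraged pattern: register a continuation on the future
        // instead of waiting for it.
        ask("ping").thenAccept(System.out::println).join();
    }
}
```

Removing the blocking helper forces callers into the second style, where the continuation runs when the reply arrives and no thread is parked in the meantime.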





[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314010#comment-14314010
 ] 

ASF GitHub Bot commented on FLINK-1492:
---

Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/376#discussion_r24403795
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServer.java ---
@@ -196,23 +199,28 @@ public void run() {
 */
@Override
public void shutdown() throws IOException {
-
-   this.shutdownRequested = true;
-   try {
-   this.serverSocket.close();
-   } catch (IOException ioe) {
+   if (shutdownRequested.compareAndSet(false, true)) {
+   try {
+   this.serverSocket.close();
+   }
+   catch (IOException ioe) {
LOG.debug("Error while closing the server 
socket.", ioe);
-   }
-   try {
-   join();
-   } catch (InterruptedException ie) {
-   LOG.debug("Error while waiting for this thread to 
die.", ie);
-   }
+   }
+   try {
+   join();
--- End diff --

It's from the old code and I am not sure if it really needs to stay, but it 
ensures that the BlobServer thread really finishes when calling the shutdown 
method (BlobServer is a Thread and because the join is called from outside of 
the run method it waits for the BlobServer thread to finish).
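The pattern under discussion — an idempotent shutdown that closes the resource and then joins the server thread from the caller's thread — can be shown with a simplified stand-in (plain JDK types, not the actual BlobServer code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified stand-in for a socket-accepting server thread.
class MiniServer extends Thread {
    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);

    @Override
    public void run() {
        // Stand-in for the accept loop; a real server would block on accept()
        // and exit when the socket is closed.
        while (!shutdownRequested.get()) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    public void shutdown() throws InterruptedException {
        // compareAndSet makes repeated shutdown() calls idempotent.
        if (shutdownRequested.compareAndSet(false, true)) {
            // join() runs on the caller's thread, so shutdown() only
            // returns once this server thread has really finished.
            join();
        }
    }
}
```

This is exactly why the `join()` is needed: without it, `shutdown()` could return while the server thread is still cleaning up.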


> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: 
> /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at 
> org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at 
> akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR 
> org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
>   - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: 
> /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.com

[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...

2015-02-10 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/376#discussion_r24403795
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServer.java ---
@@ -196,23 +199,28 @@ public void run() {
 */
@Override
public void shutdown() throws IOException {
-
-   this.shutdownRequested = true;
-   try {
-   this.serverSocket.close();
-   } catch (IOException ioe) {
+   if (shutdownRequested.compareAndSet(false, true)) {
+   try {
+   this.serverSocket.close();
+   }
+   catch (IOException ioe) {
LOG.debug("Error while closing the server 
socket.", ioe);
-   }
-   try {
-   join();
-   } catch (InterruptedException ie) {
-   LOG.debug("Error while waiting for this thread to 
die.", ie);
-   }
+   }
+   try {
+   join();
--- End diff --

It's from the old code and I am not sure if it really needs to stay, but it 
ensures that the BlobServer thread really finishes when calling the shutdown 
method (BlobServer is a Thread and because the join is called from outside of 
the run method it waits for the BlobServer thread to finish).




[GitHub] flink pull request: Add support for Subclasses, Interfaces, Abstra...

2015-02-10 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/236#discussion_r24403182
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java ---
@@ -169,4 +186,113 @@ public ExecutionConfig disableObjectReuse() {
public boolean isObjectReuseEnabled() {
return objectReuse;
}
+
+   // --------------------------------------------------------------------
+   //  Registry for types and serializers
+   // --------------------------------------------------------------------
+
+   /**
+* Registers the given Serializer as a default serializer for the given 
type at the
--- End diff --

yes :dancers: 




[GitHub] flink pull request: Add support for Subclasses, Interfaces, Abstra...

2015-02-10 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/236#discussion_r24403132
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java ---
@@ -169,4 +186,113 @@ public ExecutionConfig disableObjectReuse() {
public boolean isObjectReuseEnabled() {
return objectReuse;
}
+
+   // --------------------------------------------------------------------
+   //  Registry for types and serializers
+   // --------------------------------------------------------------------
+
+   /**
+* Registers the given Serializer as a default serializer for the given 
type at the
--- End diff --

If you want I can also change it. I've another change for the Kryo system 
pending, so I can do the change in the course of that.




[GitHub] flink pull request: Add support for Subclasses, Interfaces, Abstra...

2015-02-10 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/236#issuecomment-73679224
  
Ok, I'll merge this, and then you can make your changes on top of this.





[GitHub] flink pull request: Add support for Subclasses, Interfaces, Abstra...

2015-02-10 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/236#discussion_r24402955
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java ---
@@ -169,4 +186,113 @@ public ExecutionConfig disableObjectReuse() {
public boolean isObjectReuseEnabled() {
return objectReuse;
}
+
+   // --------------------------------------------------------------------
+   //  Registry for types and serializers
+   // --------------------------------------------------------------------
+
+   /**
+* Registers the given Serializer as a default serializer for the given 
type at the
--- End diff --

I think it would be better to call this method `registerTypeWithSerializer`.
defaultSerializers are something different in Kryo.

I'm sure users who know Kryo are confused by this "Registers the given 
Serializer as a default serializer for the given type at the"
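The distinction Kryo draws can be illustrated with a plain map-based registry (a stdlib sketch of the semantics only, not Kryo's implementation): an exact-type registration binds a serializer to one class, while a Kryo-style default serializer also applies to all subclasses of the registered base type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SerializerRegistry {
    // Serializer bound to exactly one class (analogous to Kryo register()).
    private final Map<Class<?>, String> exact = new LinkedHashMap<>();
    // Serializer applying to a class and all its subclasses
    // (analogous to Kryo addDefaultSerializer()).
    private final Map<Class<?>, String> defaults = new LinkedHashMap<>();

    void registerTypeWithSerializer(Class<?> type, String serializer) {
        exact.put(type, serializer);
    }

    void addDefaultSerializer(Class<?> baseType, String serializer) {
        defaults.put(baseType, serializer);
    }

    String lookup(Class<?> type) {
        String s = exact.get(type);
        if (s != null) {
            return s;   // exact registration wins
        }
        for (Map.Entry<Class<?>, String> e : defaults.entrySet()) {
            if (e.getKey().isAssignableFrom(type)) {
                return e.getValue();   // matches subclasses too
            }
        }
        return "fallback";
    }
}
```

Under these semantics the proposed method would register for an exact type only, so a name like `registerTypeWithSerializer` avoids suggesting the subclass-matching behavior that "default serializer" implies in Kryo.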




[GitHub] flink pull request: Add support for Subclasses, Interfaces, Abstra...

2015-02-10 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/236#discussion_r24402907
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/runtime/KryoSerializer.java
 ---
@@ -82,29 +75,15 @@

// ------------------------------------------------------------------------
-   public KryoSerializer(Class<T> type){
+   public KryoSerializer(Class<T> type, ExecutionConfig executionConfig){
if(type == null){
throw new NullPointerException("Type class cannot be 
null.");
}
this.type = type;
 
-   // create copies of the statically registered serializers
-   // we use static synchronization to safeguard against 
concurrent use
-   // of the static collections.
-   synchronized (KryoSerializer.class) {
-   this.registeredSerializers = staticRegisteredSerializers.isEmpty() ?
-   Collections.<Class<?>, Serializer<?>>emptyMap() :
-   new HashMap<Class<?>, Serializer<?>>(staticRegisteredSerializers);
-   
-   this.registeredSerializersClasses = staticRegisteredSerializersClasses.isEmpty() ?
-   Collections.<Class<?>, Class<? extends Serializer<?>>>emptyMap() :
-   new HashMap<Class<?>, Class<? extends Serializer<?>>>(staticRegisteredSerializersClasses);
-   
-   this.registeredTypes = staticRegisteredTypes.isEmpty() ?
-   Collections.<Class<?>>emptySet() :
-   new HashSet<Class<?>>(staticRegisteredTypes);
-   }
-   
+   this.registeredSerializers = 
executionConfig.getRegisteredKryoSerializers();
--- End diff --

We should add a way to register default Kryo serializers as well .. to have 
complete support.




[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...

2015-02-10 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/376#issuecomment-73678581
  
LGTM except for my single question.




[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313973#comment-14313973
 ] 

ASF GitHub Bot commented on FLINK-1492:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/376#discussion_r24402830
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServer.java ---
@@ -196,23 +199,28 @@ public void run() {
 */
@Override
public void shutdown() throws IOException {
-
-   this.shutdownRequested = true;
-   try {
-   this.serverSocket.close();
-   } catch (IOException ioe) {
+   if (shutdownRequested.compareAndSet(false, true)) {
+   try {
+   this.serverSocket.close();
+   }
+   catch (IOException ioe) {
LOG.debug("Error while closing the server 
socket.", ioe);
-   }
-   try {
-   join();
-   } catch (InterruptedException ie) {
-   LOG.debug("Error while waiting for this thread to 
die.", ie);
-   }
+   }
+   try {
+   join();
--- End diff --

What does this join do? I'm at a loss here.


> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: 
> /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at 
> org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at 
> akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR 
> org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
>   - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: 
> /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCa

[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313974#comment-14313974
 ] 

ASF GitHub Bot commented on FLINK-1492:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/376#issuecomment-73678581
  
LGTM except for my single question.


> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: 
> /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at 
> org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at 
> akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR 
> org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
>   - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: 
> /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scal

[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...

2015-02-10 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/376#discussion_r24402830
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServer.java ---
@@ -196,23 +199,28 @@ public void run() {
 */
@Override
public void shutdown() throws IOException {
-
-   this.shutdownRequested = true;
-   try {
-   this.serverSocket.close();
-   } catch (IOException ioe) {
+   if (shutdownRequested.compareAndSet(false, true)) {
+   try {
+   this.serverSocket.close();
+   }
+   catch (IOException ioe) {
LOG.debug("Error while closing the server 
socket.", ioe);
-   }
-   try {
-   join();
-   } catch (InterruptedException ie) {
-   LOG.debug("Error while waiting for this thread to 
die.", ie);
-   }
+   }
+   try {
+   join();
--- End diff --

What does this join do? I'm at a loss here.




[jira] [Commented] (FLINK-1432) CombineTaskTest.testCancelCombineTaskSorting sometimes fails

2015-02-10 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313897#comment-14313897
 ] 

Till Rohrmann commented on FLINK-1432:
--

Had another occurrence. 

> CombineTaskTest.testCancelCombineTaskSorting sometimes fails
> 
>
> Key: FLINK-1432
> URL: https://issues.apache.org/jira/browse/FLINK-1432
> Project: Flink
>  Issue Type: Bug
>Reporter: Robert Metzger
>
> We have a bunch of tests which fail only in rare cases on travis.
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/47783455/log.txt
> {code}
> Exception in thread "Thread-17" java.lang.AssertionError: Canceling task 
> failed: java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>   at java.util.ArrayList$Itr.next(ArrayList.java:831)
>   at 
> org.apache.flink.runtime.memorymanager.DefaultMemoryManager.release(DefaultMemoryManager.java:290)
>   at 
> org.apache.flink.runtime.operators.GroupReduceCombineDriver.cancel(GroupReduceCombineDriver.java:221)
>   at 
> org.apache.flink.runtime.operators.testutils.DriverTestBase.cancel(DriverTestBase.java:272)
>   at 
> org.apache.flink.runtime.operators.testutils.TaskCancelThread.run(TaskCancelThread.java:60)
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.runtime.operators.testutils.TaskCancelThread.run(TaskCancelThread.java:68)
> java.lang.NullPointerException
>   at 
> org.apache.flink.runtime.memorymanager.DefaultMemoryManager.release(DefaultMemoryManager.java:291)
>   at 
> org.apache.flink.runtime.operators.GroupReduceCombineDriver.cleanup(GroupReduceCombineDriver.java:213)
>   at 
> org.apache.flink.runtime.operators.testutils.DriverTestBase.testDriverInternal(DriverTestBase.java:245)
>   at 
> org.apache.flink.runtime.operators.testutils.DriverTestBase.testDriver(DriverTestBase.java:175)
>   at 
> org.apache.flink.runtime.operators.CombineTaskTest$1.run(CombineTaskTest.java:143)
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.172 sec <<< 
> FAILURE! - in org.apache.flink.runtime.operators.CombineTaskTest
> testCancelCombineTaskSorting[0](org.apache.flink.runtime.operators.CombineTaskTest)
>   Time elapsed: 1.023 sec  <<< FAILURE!
> java.lang.AssertionError: Exception was thrown despite proper canceling.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.flink.runtime.operators.CombineTaskTest.testCancelCombineTaskSorting(CombineTaskTest.java:162)
> {code}
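The `ConcurrentModificationException` in the trace above is the generic failure mode of structurally modifying a collection while it is being iterated, as happens when `cancel()` races with `release()`. A minimal single-threaded reproduction of that failure class (not the Flink code itself):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class CmeDemo {
    // Returns true if mutating the list during iteration triggers the
    // iterator's fail-fast comodification check.
    static boolean triggersCme() {
        List<Integer> segments = new ArrayList<>(List.of(1, 2, 3));
        try {
            for (Integer seg : segments) {
                // Structural modification while the enhanced-for loop's
                // iterator is live, as a concurrent cancel() effectively does.
                segments.remove(seg);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }
}
```

The usual fixes are to guard the shared collection with a lock held by both paths, or to iterate over a snapshot copy; ad-hoc unsynchronized access from the cancel path cannot be made safe.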





[jira] [Created] (FLINK-1507) Allow to configure produced intermediate result type via ExecutionConfig

2015-02-10 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1507:
--

 Summary: Allow to configure produced intermediate result type via 
ExecutionConfig
 Key: FLINK-1507
 URL: https://issues.apache.org/jira/browse/FLINK-1507
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime
Affects Versions: master
Reporter: Ufuk Celebi
Priority: Minor


Currently, the default type of intermediate results is pipelined. With 
[#356|https://github.com/apache/flink/pull/356], we will use blocking 
intermediate results for branching flows, which are merged later.

We can keep this mechanism, but we should make it configurable via 
ExecutionConfig to allow more fine-grained control of the flows.
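A minimal sketch of what such a knob could look like. Both the `ResultType` enum and the setter are hypothetical illustrations of the proposal, not the actual ExecutionConfig API:

```java
// Hypothetical per-job default for how intermediate results are produced.
enum ResultType { PIPELINED, BLOCKING }

class ResultConfig {
    // Mirrors the current behavior: pipelined unless overridden.
    private ResultType defaultResultType = ResultType.PIPELINED;

    ResultConfig setDefaultResultType(ResultType type) {
        this.defaultResultType = type;
        return this;   // fluent style, as ExecutionConfig setters use
    }

    ResultType getDefaultResultType() {
        return defaultResultType;
    }
}
```

The optimizer could then still force blocking results where branching flows require it, while users override the default for everything else.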





[jira] [Created] (FLINK-1506) Add buffer/event wrapper for runtime results

2015-02-10 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1506:
--

 Summary: Add buffer/event wrapper for runtime results
 Key: FLINK-1506
 URL: https://issues.apache.org/jira/browse/FLINK-1506
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime
Affects Versions: master
Reporter: Ufuk Celebi
Priority: Minor


There is no separation between buffers or events and their metadata when adding 
them to an intermediate result at runtime. This was not the case up until 
release-0.8 and has been introduced by me with intermediate results.

I think this was a mistake and we should separate this again using a similar 
wrapper as before (e.g. an Envelope).

Some of the problems of the current solution include that events are always 
serialized when adding them to an intermediate result and buffers are tagged to 
include whether they serialize a buffer or event.
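The separation described here could look like a small wrapper that carries either a buffer or an event together with its metadata, so that events travel as objects and need not be serialized eagerly. This is a hypothetical sketch of the idea, not the former Envelope class:

```java
import java.nio.ByteBuffer;

// Hypothetical wrapper separating the payload from its metadata, so an
// event can travel as an object and only be serialized at the network
// boundary, and buffers need no in-band tag bytes.
final class Envelope {
    enum Kind { BUFFER, EVENT }

    final Kind kind;
    final int sequenceNumber;   // metadata kept outside the payload
    final ByteBuffer buffer;    // set when kind == BUFFER
    final Object event;         // set when kind == EVENT, not serialized

    private Envelope(Kind kind, int seq, ByteBuffer buffer, Object event) {
        this.kind = kind;
        this.sequenceNumber = seq;
        this.buffer = buffer;
        this.event = event;
    }

    static Envelope forBuffer(int seq, ByteBuffer buf) {
        return new Envelope(Kind.BUFFER, seq, buf, null);
    }

    static Envelope forEvent(int seq, Object event) {
        return new Envelope(Kind.EVENT, seq, null, event);
    }
}
```

With such a wrapper, the consumer dispatches on `kind` instead of inspecting tag bytes inside the buffer, and event serialization is deferred to the point where the envelope actually crosses the wire.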





[jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313869#comment-14313869
 ] 

ASF GitHub Bot commented on FLINK-1492:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/376#issuecomment-73667317
  
Looks good.


> Exceptions on shutdown concerning BLOB store cleanup
> 
>
> Key: FLINK-1492
> URL: https://issues.apache.org/jira/browse/FLINK-1492
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Ufuk Celebi
> Fix For: 0.9
>
>
> The following stack traces occur not every time, but frequently.
> {code}
> java.lang.IllegalArgumentException: 
> /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at 
> org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
>   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
>   at 
> akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
>   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1  - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
>   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
>   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
>   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
>   at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
>   at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
>   at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at scala.concurrent.forkjoin.ForkJo
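The shutdown error quoted above is a failed recursive delete of the blob store's temp directory, which turns a routine cleanup into a logged exception. A minimal sketch of a delete that tolerates a directory already removed by another shutdown path follows; the class and method names are invented for illustration and are not Flink's actual `BlobCache` code.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class SafeShutdownSketch {

    /** Recursively deletes a directory, treating "already gone" as success. */
    static void deleteIfPresent(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return; // another component already cleaned up: not an error
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Delete children before parents by walking in reverse order.
            walk.sorted(Comparator.reverseOrder())
                .map(Path::toFile)
                .forEach(File::delete);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blobStore-sketch");
        Files.createFile(dir.resolve("blob_test"));
        deleteIfPresent(dir);
        deleteIfPresent(dir); // second call is a no-op instead of an IOException
        System.out.println(Files.exists(dir)); // prints false
    }
}
```

A shutdown hook that swallows "file not found" this way avoids the spurious ERROR log while still reporting genuine permission or I/O failures.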

[GitHub] flink pull request: [FLINK-1492] Fix exceptions on blob store shut...

2015-02-10 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/376#issuecomment-73667317
  
Looks good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-1505) Separate buffer reader and channel consumption logic

2015-02-10 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1505:
--

 Summary: Separate buffer reader and channel consumption logic
 Key: FLINK-1505
 URL: https://issues.apache.org/jira/browse/FLINK-1505
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime
Affects Versions: master
Reporter: Ufuk Celebi
Priority: Minor


Currently, the hierarchy of readers (f.a.o.runtime.io.network.api) is bloated. 
There is no separation between the consumption of the input channels and the 
buffer readers.

This was not the case up until release-0.8; it was introduced by me with 
intermediate results. I think this was a mistake and we should separate this 
again. flink-streaming is currently the heaviest user of these lower-level APIs, 
and I have received feedback from [~gyfora] to undo this as well.
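The separation the issue asks for can be sketched as two small abstractions: one that only hands out buffers from channels, and one that only turns buffers into records. The interface and class names below are invented for illustration, not Flink's actual API.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class SeparationSketch {

    /** Consumes buffers from input channels; knows nothing about records. */
    interface ChannelConsumer {
        byte[] nextBuffer(); // null when the channels are exhausted
    }

    /** Turns buffers into records; knows nothing about channels. */
    static final class RecordReader {
        private final ChannelConsumer consumer;

        RecordReader(ChannelConsumer consumer) {
            this.consumer = consumer;
        }

        String nextRecord() {
            byte[] buf = consumer.nextBuffer();
            return buf == null ? null : new String(buf);
        }
    }

    public static void main(String[] args) {
        // A fake "channel" backed by a queue of two buffers.
        Queue<byte[]> buffers = new ArrayDeque<>();
        buffers.add("a".getBytes());
        buffers.add("b".getBytes());

        RecordReader reader = new RecordReader(buffers::poll);
        String record;
        while ((record = reader.nextRecord()) != null) {
            System.out.println(record); // prints a, then b
        }
    }
}
```

With this split, a lower-level consumer (e.g. for flink-streaming) can bypass record deserialization entirely without inheriting from the reader hierarchy.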



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1496) Events at unitialized input channels are lost

2015-02-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313849#comment-14313849
 ] 

ASF GitHub Bot commented on FLINK-1496:
---

GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/380

[FLINK-1496] [distributed runtime] Fix loss of events at unitialized input 
channels

Related 
[discussion](http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Events-not-received-td3699.html).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/incubator-flink flink-1496-lost

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/380.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #380


commit d73585c346e7f10b7157305d640b740e786557b9
Author: Ufuk Celebi 
Date:   2015-02-09T16:17:19Z

[FLINK-1496] [distributed runtime] Fix loss of events at unitialized input 
channels




> Events at unitialized input channels are lost
> -
>
> Key: FLINK-1496
> URL: https://issues.apache.org/jira/browse/FLINK-1496
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Runtime
>Affects Versions: master
>Reporter: Ufuk Celebi
>
> If a program sends an event backwards to the producer task, it might happen 
> that some of its input channels have not been initialized yet 
> (UnknownInputChannel). In that case, the events are lost and will never be 
> received at the producer.
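The fix idea can be illustrated in miniature: events addressed to a not-yet-initialized channel are queued instead of dropped, and replayed once the channel becomes known. All names below are invented for the sketch and do not mirror Flink's actual classes.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class PendingEventsSketch {
    private final Set<Integer> initialized = new HashSet<>();
    private final Map<Integer, Queue<String>> pending = new HashMap<>();
    private final Map<Integer, Queue<String>> delivered = new HashMap<>();

    void sendEvent(int channel, String event) {
        if (initialized.contains(channel)) {
            delivered.computeIfAbsent(channel, k -> new ArrayDeque<>()).add(event);
        } else {
            // Without this queue the event would simply be dropped,
            // which is the bug described above.
            pending.computeIfAbsent(channel, k -> new ArrayDeque<>()).add(event);
        }
    }

    void initializeChannel(int channel) {
        initialized.add(channel);
        Queue<String> queued = pending.remove(channel);
        if (queued != null) {
            // Replay everything that arrived before initialization.
            delivered.computeIfAbsent(channel, k -> new ArrayDeque<>()).addAll(queued);
        }
    }

    public static void main(String[] args) {
        PendingEventsSketch sketch = new PendingEventsSketch();
        sketch.sendEvent(0, "backwards-event"); // channel 0 not yet initialized
        sketch.initializeChannel(0);            // event is replayed, not lost
        System.out.println(sketch.delivered.get(0).size()); // prints 1
    }
}
```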





[GitHub] flink pull request: [FLINK-1496] [distributed runtime] Fix loss of...

2015-02-10 Thread uce
GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/380

[FLINK-1496] [distributed runtime] Fix loss of events at unitialized input 
channels

Related 
[discussion](http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Events-not-received-td3699.html).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/incubator-flink flink-1496-lost

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/380.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #380


commit d73585c346e7f10b7157305d640b740e786557b9
Author: Ufuk Celebi 
Date:   2015-02-09T16:17:19Z

[FLINK-1496] [distributed runtime] Fix loss of events at unitialized input 
channels



