[GitHub] flink issue #4924: [FLINK-7949] AsyncWaitOperator is not restarting when que...

2017-12-28 Thread bartektartanus
Github user bartektartanus commented on the issue:

https://github.com/apache/flink/pull/4924
  
Hi @tillrohrmann 
I've finally managed to write a simple test that fails without my change in 
AsyncWaitOperator class. Test steps:
1. add enough records to fill up the AsyncWaitOperator queue
2. add record which processing takes more time than timeout which causes 
restart
3. data stream is restarted and in `open()` method it tries to add N+1 
recovered stream elements to queue of size N
4. test waits forever and fails due to timeout. 

Works exactly as I've mentioned before. But if the emitter is started 
earlier, eventually test passes after two restarts (`timeoutCounter` field in 
`TimeoutableFunction`). Tried to make this test as short and quick as possible. 
I don't know if this is the right file for this, please fix this if so. I also 
hope this change will be in next flink version :) 
Happy New Year!


---


[GitHub] flink issue #4924: [FLINK-7949] AsyncWaitOperator is not restarting when que...

2017-11-03 Thread bartektartanus
Github user bartektartanus commented on the issue:

https://github.com/apache/flink/pull/4924
  
Yes, it fixes our issue - now our test in 
[nussknacker](https://github.com/TouK/nussknacker) is passing. Process restarts 
seamlessly and works fine even after another restarts, but I haven't managed to 
reproduce this error in simple unit Flink test (yet). Maybe next week :)


---


[GitHub] flink pull request #4924: [FLINK-7949] AsyncWaitOperator is not restarting w...

2017-10-30 Thread bartektartanus
GitHub user bartektartanus opened a pull request:

https://github.com/apache/flink/pull/4924

[FLINK-7949] AsyncWaitOperator is not restarting when queue is full

Change:
Emitter thread is started BEFORE filling up the queue of recovered elements
Issue description:
During process restart, if the queue was full (with N elements) and there 
was pending element waiting to be added to the queue, then the queue couldn't 
fit N+1 elements and thread was blocked forever. As Till Rohrmann suggested 
here:

http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-was-declined-tasks-not-ready-td16066.html
I've changed the order of this code to start emitter thread earlier.   


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bartektartanus/flink master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4924.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4924


commit 97620649ddfcf8f0320b20bdfdb69d9b44dd8f0c
Author: Bartłomiej Tartanus <bartektarta...@gmail.com>
Date:   2017-10-30T14:39:43Z

start emmiter thread BEFORE filling up the queue of recovered elements




---