Romain Revol created FLINK-8717:
-----------------------------------

             Summary: Flink seems to deadlock due to buffer starvation when iterating
                 Key: FLINK-8717
                 URL: https://issues.apache.org/jira/browse/FLINK-8717
             Project: Flink
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 1.4.0
         Environment: Windows 10 Pro 64-bit
                      Core i7-6820HQ @ 2.7 GHz
                      16GB RAM
                      Flink 1.4
                      Scala client
                      Scala 2.11.7
            Reporter: Romain Revol
         Attachments: threadDump.txt


We are encountering what looks like a deadlock in Flink in one of our jobs that contains an "iterate".

I've reduced the job to the example in this gist: [https://gist.github.com/rrevol/06ddfecd5f5ac7cbc67785b5d3a84dd4]

Note that:
 * varying the parallelism affects how quickly the deadlock occurs, but it always occurs
 * varying MAX_LOOP_NB does affect the deadlock: the higher it is, the sooner we hit the deadlock. If MAX_LOOP_NB == 1, there is no deadlock.

This suggests that the deadlock happens once the number of iterations reaches some threshold.

From the [^threadDump.txt], it looks like starvation over buffer allocation, but I may be mistaken, since I don't know Flink's internals well.
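
To illustrate the kind of starvation suspected above, here is a toy, single-threaded Python model of a loop with two bounded buffers. It is only a sketch under stated assumptions: it is not Flink code and not the job from the gist, and it does not reproduce Flink's actual network stack. The function `run_loop`, the `capacity` parameter, the two buffers (`forward`, `feedback`), and the source-first read order of the head are all hypothetical choices made for the illustration; `max_loops` merely plays a role similar to MAX_LOOP_NB.

```python
from collections import deque


def run_loop(num_records, max_loops, capacity):
    """Toy model of an iteration with bounded buffers (NOT Flink code).

    Returns "finished" if every record leaves the loop, or "deadlock"
    if both operators end up blocked on a full buffer.
    """
    source = deque([1] * num_records)  # each record stores its pass number
    forward = deque()    # head -> body buffer, at most `capacity` records
    feedback = deque()   # body -> head buffer, at most `capacity` records
    head_held = None     # record the head has read but not yet written
    body_held = None     # record the body has read but not yet written
    done = 0

    while done < num_records:
        progressed = False

        # Head: read one record if idle (assumption: source first).
        if head_held is None:
            if source:
                head_held = source.popleft()
                progressed = True
            elif feedback:
                head_held = feedback.popleft()
                progressed = True
        # Head: write downstream only if the forward buffer has room.
        if head_held is not None and len(forward) < capacity:
            forward.append(head_held)
            head_held = None
            progressed = True

        # Body: read one record if idle.
        if body_held is None and forward:
            body_held = forward.popleft()
            progressed = True
        if body_held is not None:
            if body_held >= max_loops:
                # Last pass: the record leaves the loop.
                done += 1
                body_held = None
                progressed = True
            elif len(feedback) < capacity:
                # Send the record around for another pass.
                feedback.append(body_held + 1)
                body_held = None
                progressed = True

        if not progressed:
            # Head waits on a full forward buffer while the body
            # waits on a full feedback buffer: nobody can move.
            return "deadlock"

    return "finished"


if __name__ == "__main__":
    print(run_loop(100, 1, 2))
    print(run_loop(100, 3, 2))
```

In this toy model, `run_loop(100, 1, 2)` returns "finished" because nothing is ever written to the feedback buffer, while `run_loop(100, 3, 2)` returns "deadlock": the head blocks on a full forward buffer and therefore stops draining the feedback buffer the body is waiting on. Whether the real job stalls for the same reason is exactly what the thread dump would need to confirm.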