[
https://issues.apache.org/jira/browse/FLINK-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ufuk Celebi resolved FLINK-1063.
--------------------------------
Resolution: Fixed
Fix Version/s: 0.6.1-incubating, 0.7-incubating
Fixed in c0c2abda5eaaabc6291f765718d88dcbdc12d2a4 (master) and
828407d58e1894a1e51f74f2c615fcc941e909d2 (release-0.6.1) after [~rmetzger]
reported the problem as well.
I will move the second part, a non-multiplexed fallback, to a separate issue.
> Race condition in NettyConnectionManager
> ----------------------------------------
>
> Key: FLINK-1063
> URL: https://issues.apache.org/jira/browse/FLINK-1063
> Project: Flink
> Issue Type: Bug
> Components: Distributed Runtime
> Affects Versions: 0.6-incubating, 0.7-incubating
> Reporter: Ufuk Celebi
> Assignee: Ufuk Celebi
> Fix For: 0.7-incubating, 0.6.1-incubating
>
>
> The TCP channel queuing mechanism in {{NettyConnectionManager}} has a race
> condition, which may result in re-ordering of envelopes at the receiver (the
> dreaded {{"Expected data packet X but received Y"}} exception).
> Thanks to [AHeise|https://github.com/AHeise] for reporting the problem.
> The problem was introduced with commits
> 52512636444902497e47ccbfb1cabaffb3e23343 ...
> 32d168f439bdb5dfab02a3ab2d12e87d0622a67e.
> I will revert the respective commits and implement a fallback that limits
> TCP channel multiplexing and immediately closes TCP channels.