tillrohrmann commented on issue #6974: [FLINK-10727][network] remove 
unnecessary synchronization in SingleInputGate#requestPartitions()
URL: https://github.com/apache/flink/pull/6974#issuecomment-435073195
 
 
   This change breaks Flink's iteration mechanism, and potentially more, 
because `SingleInputGate#requestedPartitionsFlag` is read in 
`SingleInputGate#updateInputChannel`, which is not called from the `Task` 
thread. As a result, `updateInputChannel` may not request the subpartition of 
a newly registered partition, and the job can get stuck because it never 
consumes the input from that producer.
   
   You can easily reproduce the problem by adding a `Thread.sleep(10L)` before 
setting `requestedPartitionsFlag = true`.
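   
   To make the interleaving concrete, here is a minimal sketch of it. It is a 
simplified model and not Flink's actual code: `GateRaceSketch`, `Channel`, the 
two latches and `lateChannelRequested` are hypothetical names, and the latches 
only pin down the window that the `Thread.sleep(10L)` makes likely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/** Simplified, hypothetical model of the race; not Flink's actual SingleInputGate. */
class GateRaceSketch {

    /** Stand-in for an input channel; records whether its subpartition was requested. */
    static final class Channel {
        volatile boolean requested;
        void requestSubpartition() { requested = true; }
    }

    private final Object requestLock = new Object();
    private final List<Channel> channels = new ArrayList<>();
    private boolean requestedPartitionsFlag;

    // Hypothetical latches that force the problematic interleaving deterministically.
    private final CountDownLatch inWindow = new CountDownLatch(1);
    private final CountDownLatch windowDone = new CountDownLatch(1);

    /** Task thread: check-and-set of the flag WITHOUT holding requestLock. */
    void requestPartitions() throws InterruptedException {
        if (!requestedPartitionsFlag) {
            for (Channel c : channels) {
                c.requestSubpartition();
            }
            inWindow.countDown();  // window opens: channels requested, flag still false
            windowDone.await();    // stands in for Thread.sleep(10L)
            requestedPartitionsFlag = true;
        }
    }

    /** Other thread: registers a channel and requests it only if the flag is already set. */
    Channel updateInputChannel() {
        synchronized (requestLock) {
            Channel c = new Channel();
            channels.add(c);
            if (requestedPartitionsFlag) {
                c.requestSubpartition();
            }
            return c;
        }
    }

    /** Runs the bad interleaving and reports whether the late channel was ever requested. */
    static boolean lateChannelRequested() {
        GateRaceSketch gate = new GateRaceSketch();
        Channel[] late = new Channel[1];
        Thread updater = new Thread(() -> {
            try {
                gate.inWindow.await();
                late[0] = gate.updateInputChannel();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                gate.windowDone.countDown();
            }
        });
        try {
            updater.start();
            gate.requestPartitions();  // plays the Task thread
            updater.join();
            gate.requestPartitions();  // no-op: the flag is already true
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return late[0].requested;
    }

    public static void main(String[] args) {
        System.out.println("late channel requested: " + lateChannelRequested());
    }
}
```

   In this model the late channel is never requested. If `requestPartitions` 
held `requestLock` around its check-and-set, `updateInputChannel` could not 
run inside that window.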
   
   I'm wondering how much improvement this kind of change actually brings. 
I'm a bit concerned that changes to such a critical component as the network 
stack get merged into master just before feature freeze. If at all, something 
like this should be merged at the beginning of the release cycle to give it 
more exposure. Moreover, Travis never passed; it actually failed with an IT 
case running into exactly this problem. IntelliJ also warns that 
`requestedPartitionsFlag` is accessed in both synchronized and unsynchronized 
contexts, which should be a red flag in most cases. I think we should be more 
careful in the future!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
