[ 
https://issues.apache.org/jira/browse/FLINK-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374496#comment-16374496
 ] 

ASF GitHub Bot commented on FLINK-8694:
---------------------------------------

GitHub user pnowojski opened a pull request:

    https://github.com/apache/flink/pull/5572

    [FLINK-8694][runtime] Fix notifyDataAvailable race condition

    This fixes two bugs in the network stack:
    https://issues.apache.org/jira/browse/FLINK-8760
    https://issues.apache.org/jira/browse/FLINK-8694
    
    ## Brief change log
    
    Please check individual commit messages.
    
    ## Verifying this change
    
    This PR adds new tests covering the previously buggy cases.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (yes / **no**)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
      - The serializers: (yes / **no** / don't know)
      - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
      - The S3 file system connector: (yes / **no** / don't know)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (yes / **no**)
      - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pnowojski/flink f8694-proper-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5572.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5572
    
----
commit 388d16118763dddff7d4c3593572169ad3e65c0d
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2018-02-23T10:37:37Z

    [hotfix][tests] Deduplicate code in SingleInputGateTest

commit e22a44b24ab1e9f02c236440f899a1f4dfdfc873
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2018-02-23T11:11:14Z

    [hotfix][runtime] Remove duplicated check

commit 5c16e565c4a7f0ffdaec888696d98e3c2c221d99
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2018-02-23T10:20:21Z

    [FLINK-8760][runtime] Correctly propagate moreAvailable flag through SingleInputGate
    
    Previously, if SingleInputGate was re-enqueuing an input channel, isMoreAvailable
    might incorrectly return false. This might have caused some deadlocks.
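
To illustrate the kind of problem this commit message describes, here is a minimal single-threaded sketch. It is not Flink's SingleInputGate; `GateSketch`, `Channel`, `Result`, `addBuffer`, and `getNext` are hypothetical names chosen for the example. The assumption it demonstrates is that the gate-level "more available" flag has to be evaluated after a channel that still holds data has been re-enqueued; otherwise a lone channel with remaining buffers would be reported as having nothing more.

```java
import java.util.ArrayDeque;

/** Hypothetical sketch of a gate multiplexing several input channels. */
class GateSketch {
    /** Hypothetical input channel: a plain FIFO of buffers. */
    static final class Channel {
        final ArrayDeque<String> buffers = new ArrayDeque<>();
    }

    /** A returned buffer plus the gate-level "more data is queued" flag. */
    static final class Result {
        final String buffer;
        final boolean moreAvailable;

        Result(String buffer, boolean moreAvailable) {
            this.buffer = buffer;
            this.moreAvailable = moreAvailable;
        }
    }

    // Invariant: every queued channel holds at least one buffer.
    private final ArrayDeque<Channel> channelsWithData = new ArrayDeque<>();

    /** Adds a buffer to a channel and queues the channel if it is not queued yet. */
    void addBuffer(Channel channel, String buffer) {
        channel.buffers.add(buffer);
        if (!channelsWithData.contains(channel)) {
            channelsWithData.add(channel);
        }
    }

    /** Returns the next buffer, or null if no channel has data. */
    Result getNext() {
        Channel channel = channelsWithData.poll();
        if (channel == null) {
            return null;
        }
        String buffer = channel.buffers.poll();
        if (!channel.buffers.isEmpty()) {
            channelsWithData.add(channel); // re-enqueue: this channel still has data
        }
        // Evaluate the flag only after the re-enqueue above. Computing it before
        // would miss the data remaining in the channel that was just polled.
        boolean moreAvailable = !channelsWithData.isEmpty();
        return new Result(buffer, moreAvailable);
    }

    public static void main(String[] args) {
        GateSketch gate = new GateSketch();
        Channel only = new Channel();
        gate.addBuffer(only, "first");
        gate.addBuffer(only, "second");
        Result r = gate.getNext();
        System.out.println(r.buffer + " moreAvailable=" + r.moreAvailable);
    }
}
```

A consumer that stops polling when the flag is false and waits for a fresh notification would stall if the flag were wrongly false here, since no further notification arrives for data that is already queued.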

commit a451006fd2e38e478ef745fd9de0ebc5fb2fd5c2
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2018-02-23T10:27:54Z

    [hotfix][tests] Do not hide original exception in SuccessAfterNetworkBuffersFailureITCase

commit e70cd04424f0f92b9d5127e7c4a351d3823d20bd
Author: Piotr Nowojski <piotr.nowojski@...>
Date:   2018-02-23T10:28:20Z

    [FLINK-8694][runtime] Fix notifyDataAvailable race condition
    
    Previously, a race condition could result in some notifyDataAvailable calls being ignored.
    This fixes the problem by moving buffersAvailable handling to the subpartitions and adds
    a stress test for flushAlways (without this fix, the test deadlocks).
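
As a rough illustration of the idea in this commit message, the sketch below keeps the availability decision inside the subpartition, under the same lock that guards the buffer queue, instead of in a separate counter on the reader side. It is an assumption about the general technique, not the code from this pull request; `SubpartitionSketch`, `Polled`, `add`, and `poll` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: the subpartition owns both the queue and the availability logic. */
class SubpartitionSketch {
    /** A polled buffer plus whether more buffers were queued at poll time. */
    static final class Polled {
        final String buffer;
        final boolean moreAvailable;

        Polled(String buffer, boolean moreAvailable) {
            this.buffer = buffer;
            this.moreAvailable = moreAvailable;
        }
    }

    private final Object lock = new Object();
    private final ArrayDeque<String> buffers = new ArrayDeque<>();
    private final Runnable dataAvailableListener;

    SubpartitionSketch(Runnable dataAvailableListener) {
        this.dataAvailableListener = dataAvailableListener;
    }

    /** Adds a buffer; notifies the listener on the empty to non-empty transition. */
    void add(String buffer) {
        boolean wasEmpty;
        synchronized (lock) {
            wasEmpty = buffers.isEmpty();
            buffers.add(buffer);
        }
        if (wasEmpty) {
            // Called outside the lock so the listener may poll without deadlocking.
            dataAvailableListener.run();
        }
    }

    /** Polls one buffer; the flag is computed under the same lock as the queue. */
    Polled poll() {
        synchronized (lock) {
            String buffer = buffers.poll();
            if (buffer == null) {
                return null;
            }
            return new Polled(buffer, !buffers.isEmpty());
        }
    }

    public static void main(String[] args) {
        AtomicInteger wakeUps = new AtomicInteger();
        SubpartitionSketch subpartition = new SubpartitionSketch(wakeUps::incrementAndGet);
        subpartition.add("a");
        subpartition.add("b");
        Polled p = subpartition.poll();
        System.out.println(p.buffer + " moreAvailable=" + p.moreAvailable
                + " wakeUps=" + wakeUps.get());
    }
}
```

Because the emptiness check and the queue update happen atomically, a reader that keeps polling while `moreAvailable` is true and otherwise waits for the next notification cannot miss data: any buffer added after the reader saw an empty queue triggers a new notification.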

----


> Make notifyDataAvailable call reliable
> --------------------------------------
>
>                 Key: FLINK-8694
>                 URL: https://issues.apache.org/jira/browse/FLINK-8694
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Piotr Nowojski
>            Assignee: Piotr Nowojski
>            Priority: Major
>
> After FLINK-8591, calls to
> org.apache.flink.runtime.io.network.netty.SequenceNumberingViewReader#notifyDataAvailable
> (and the same for credit-based flow control) can sometimes be ignored due to a
> race condition.



