GitHub user hachikuji opened a pull request:

    https://github.com/apache/kafka/pull/4349

    KAFKA-6366 [WIP]: Fix stack overflow in consumer due to fast offset commits 
during coordinator disconnect

    When the coordinator is marked unknown, we explicitly disconnect its 
connection and cancel pending requests. Currently the disconnect happens before 
the coordinator state is set to null, which means that callbacks which inspect 
the coordinator state will see it still as active. This can lead to further 
requests being sent. In pathological cases, the disconnect itself is not able 
to return because new requests are sent to the coordinator before the 
disconnect can complete, which leads to the stack overflow error. To fix the 
problem, I have reordered the disconnect to happen after the coordinator is set 
to null.
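
The ordering issue described above can be illustrated with a minimal sketch. This is a simplified, hypothetical model and not the actual consumer code: the class `CoordinatorOrderingSketch`, its methods, and the `MAX_RESENDS` cap are all assumptions made for illustration. A queue drained in a loop stands in for the recursive resend, and the cap keeps the old ordering from running forever in the demo.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified model of the ordering problem; not Kafka's real classes.
class CoordinatorOrderingSketch {
    // Cap on resends so the "disconnect first" ordering terminates in this demo.
    private static final int MAX_RESENDS = 5;

    private String coordinator = "broker-1";          // null means "coordinator unknown"
    private final Deque<Runnable> pending = new ArrayDeque<>();
    private int resends = 0;

    private boolean coordinatorUnknown() {
        return coordinator == null;
    }

    // Queue a commit whose failure callback retries while the coordinator looks active.
    private void sendCommit() {
        pending.add(() -> {
            if (!coordinatorUnknown() && resends < MAX_RESENDS) {
                resends++;
                sendCommit();                          // new request created during the disconnect
            }
        });
    }

    // Fail every pending request, invoking its callback.
    private void disconnect() {
        while (!pending.isEmpty()) {
            pending.poll().run();
        }
    }

    private void markCoordinatorUnknown(boolean clearStateFirst) {
        if (clearStateFirst) {
            coordinator = null;                        // reordered: callbacks see "unknown"
            disconnect();
        } else {
            disconnect();                              // old order: callbacks still see the coordinator
            coordinator = null;
        }
    }

    // Returns how many resends the callbacks attempted under the given ordering.
    static int resendsAfterMarkUnknown(boolean clearStateFirst) {
        CoordinatorOrderingSketch sketch = new CoordinatorOrderingSketch();
        sketch.sendCommit();
        sketch.markCoordinatorUnknown(clearStateFirst);
        return sketch.resends;
    }

    public static void main(String[] args) {
        System.out.println("disconnect first:  " + resendsAfterMarkUnknown(false) + " resends");
        System.out.println("clear state first: " + resendsAfterMarkUnknown(true) + " resends");
    }
}
```

In this model, disconnecting first lets the callbacks resend until the artificial cap is reached, while clearing the state first yields no resends at all.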
    
    I have added a basic test case to verify that callbacks for in-flight or 
unsent requests see the coordinator as unknown, which prevents them from 
attempting to resend. We may need additional test cases after we determine 
whether this is in fact what is happening in the reported ticket.
    
    Note that I have also included some minor cleanups which I noticed along 
the way.
    
    ### Committer Checklist (excluded from commit message)
    - [ ] Verify design and implementation 
    - [ ] Verify test coverage and CI build status
    - [ ] Verify documentation (including upgrade notes)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hachikuji/kafka KAFKA-6366

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4349.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4349
    
----
commit 488de3dca5be6111fd447980c8e79477259dc99a
Author: Jason Gustafson <jason@...>
Date:   2017-12-18T18:53:38Z

    KAFKA-6366 [WIP]: Fix stack overflow in consumer due to fast offset commits 
during coordinator disconnect

----

