GitHub user czm1989 opened a pull request:

    https://github.com/apache/storm/pull/1986

    STORM-2394 KafkaSpout: Has no leader of partitions for a short time

    See https://issues.apache.org/jira/browse/STORM-2394
    In our case, the network misbehaved for a short time, so some Kafka 
partitions temporarily had no leader.
    The nextTuple method of KafkaSpout threw the exception "No leader found 
for partition 0" at the call "_coordinator.refresh();". The exception comes 
from the function getLeaderFor in DynamicBrokersReader.java. As a result, the 
spout hung.
    The Kafka partitions recovered shortly afterwards, but the spout could 
not recover from the failure. This problem has occurred several times on our 
servers. For example:
    Feb 25 06:31:19 CST 2017, KafkaSpout threw the exception.
    Feb 25 06:31:21 CST 2017, Kafka partitions recovered.
    To make the spout more robust, I think the "_coordinator.refresh();" 
call should be retried several times, throwing the exception only after the 
last attempt fails. The spout will die anyway, so why not try one more time?
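    The retry idea described above can be sketched roughly as follows. This 
is a minimal illustration, not the code in this pull request: the class name 
RefreshRetry, the method retry, and its parameters maxAttempts and sleepMillis 
are hypothetical names chosen for the example.

```java
/**
 * Hypothetical helper illustrating "retry a few times, then rethrow".
 * None of these names come from the Storm code base.
 */
final class RefreshRetry {

    private RefreshRetry() {
    }

    /**
     * Runs the action up to maxAttempts times. After each failed attempt
     * except the last, it sleeps for sleepMillis before trying again.
     * The exception from the last failed attempt is rethrown unchanged.
     */
    static void retry(Runnable action, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                action.run();
                return; // success, stop retrying
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // last attempt failed, give up
                }
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

    Under these assumptions, the call in nextTuple might be wrapped as 
RefreshRetry.retry(() -> _coordinator.refresh(), 3, 1000L), where the attempt 
count and sleep time are placeholder values and the lambda requires Java 8. 
Because Storm calls nextTuple, ack, and fail on the same thread, the total 
sleep time should probably be kept short.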

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/czm1989/storm feature/kafkaspout_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1986.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1986
    
----
commit 6acd60625d3ec0fda932c76c6803ae78c56800b7
Author: czm1989 <[email protected]>
Date:   2017-03-02T15:31:04Z

    fix nextTuple of KafkaSpout when there is something wrong with the network

commit 74c28b64d7734e17b650dc05c95af760fec82f19
Author: czm1989 <[email protected]>
Date:   2017-03-03T05:30:59Z

    fix sleep

----

