[ https://issues.apache.org/jira/browse/SAMZA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Riccomini resolved SAMZA-607.
-----------------------------------
Resolution: Duplicate
> BrokerProxy gets stuck on down brokers
> --------------------------------------
>
> Key: SAMZA-607
> URL: https://issues.apache.org/jira/browse/SAMZA-607
> Project: Samza
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Gian Merlino
>
> I took a broker offline for a few hours today and found that a Samza job was
> stuck trying to read from it while it was down, instead of switching to
> another broker in the ISR (this was a replicated topic with some partitions
> under-replicated, but all partitions available). During this time the
> BrokerProxy thread was in a retry loop logging a lot of
> ClosedChannelExceptions.
>
> The broker had done a clean shutdown, but I think what happened is that the
> BrokerProxy just hadn't made any calls between when that broker stopped being
> leader for its partitions and when that broker went offline. So, it never got
> a NotLeaderForPartitionException and never abdicated.
>
> Would it make sense for the BrokerProxy to abdicate all of its
> topic-partitions after getting too many network errors, and possibly shut
> itself down if it becomes empty? I think it'd be good to support brokers
> going offline temporarily or even permanently.
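
The proposal in the last paragraph of the description could be sketched roughly as follows. This is only an illustration of the idea, not Samza's BrokerProxy code: the class name, the error threshold, the string topic-partition keys, and the abdicate callback are all hypothetical. It counts consecutive network errors, hands every owned topic-partition back once the count reaches the threshold, and reports that the proxy is now empty so the caller can shut it down.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch: not the real BrokerProxy, all names are assumptions.
class AbdicatingProxySketch {
    private final int maxConsecutiveErrors;      // assumed tunable threshold
    private final Consumer<String> abdicate;     // hands a topic-partition back for reassignment
    private final Set<String> owned = new LinkedHashSet<>();
    private int consecutiveErrors = 0;

    AbdicatingProxySketch(int maxConsecutiveErrors, Consumer<String> abdicate) {
        if (maxConsecutiveErrors < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxConsecutiveErrors = maxConsecutiveErrors;
        this.abdicate = abdicate;
    }

    void addTopicPartition(String topicPartition) {
        owned.add(topicPartition);
    }

    // Any successful fetch resets the error streak.
    void onFetchSuccess() {
        consecutiveErrors = 0;
    }

    // Call on each network error (for example a ClosedChannelException).
    // Returns true when everything was abdicated and the proxy is empty,
    // which is the caller's cue to shut the proxy down.
    boolean onNetworkError() {
        consecutiveErrors++;
        if (consecutiveErrors < maxConsecutiveErrors) {
            return false;
        }
        // Copy first so the callback cannot disturb the iteration.
        List<String> toRelease = new ArrayList<>(owned);
        owned.clear();
        consecutiveErrors = 0;
        for (String topicPartition : toRelease) {
            abdicate.accept(topicPartition);
        }
        return true;
    }

    boolean isEmpty() {
        return owned.isEmpty();
    }
}
```

In this sketch the callback would be expected to refresh metadata and reassign each topic-partition to whichever broker currently leads it, so a broker that stays offline permanently is simply never chosen again.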