[ https://issues.apache.org/jira/browse/SAMZA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini resolved SAMZA-607.
-----------------------------------
    Resolution: Duplicate

> BrokerProxy gets stuck on down brokers
> --------------------------------------
>
>                 Key: SAMZA-607
>                 URL: https://issues.apache.org/jira/browse/SAMZA-607
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Gian Merlino
>
> I took a broker offline for a few hours today and found that a Samza job was
> stuck trying to read from it while it was down, instead of switching to
> another broker in the ISR (this was a replicated topic with some partitions
> under-replicated, but all partitions available). During this time the
> BrokerProxy thread was in a retry loop logging a lot of
> ClosedChannelExceptions.
>
> The broker had done a clean shutdown, but I think what happened is that the
> BrokerProxy just hadn't made any calls between when that broker stopped being
> leader for its partitions and when that broker went offline. So, it never got
> a NotLeaderForPartitionException and never abdicated.
>
> Would it make sense for the BrokerProxy to abdicate all of its
> topic-partitions after getting too many network errors, and possibly shut
> itself down if it becomes empty? I think it'd be good to support brokers
> going offline temporarily or even permanently.
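
For illustration only, the behaviour proposed in the description could be
sketched roughly as follows. This is not Samza's BrokerProxy code: the class
name, the method names, the abdicate callback, and the "consecutive errors"
threshold are all hypothetical, and the sketch assumes that "too many network
errors" means an unbroken run of failed fetches.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical sketch of the proposal; not Samza's actual BrokerProxy. */
final class AbdicatingProxySketch {
    private final int maxConsecutiveErrors;      // assumed threshold, not a real Samza setting
    private final Consumer<String> abdicate;     // hypothetical callback: hand a partition back for leader re-lookup
    private final Set<String> partitions = new HashSet<>();
    private int consecutiveErrors = 0;

    AbdicatingProxySketch(int maxConsecutiveErrors, Consumer<String> abdicate) {
        this.maxConsecutiveErrors = maxConsecutiveErrors;
        this.abdicate = abdicate;
    }

    void addPartition(String topicPartition) {
        partitions.add(topicPartition);
    }

    /** A successful fetch ends the current run of errors. */
    void onFetchSuccess() {
        consecutiveErrors = 0;
    }

    /** Called for each network error, e.g. a ClosedChannelException. */
    void onNetworkError() {
        consecutiveErrors++;
        if (consecutiveErrors >= maxConsecutiveErrors) {
            // Copy first so the callback cannot disturb the iteration.
            List<String> toRelease = new ArrayList<>(partitions);
            partitions.clear();
            consecutiveErrors = 0;
            for (String tp : toRelease) {
                abdicate.accept(tp);
            }
        }
    }

    /** True once nothing is owned; the caller could then stop the proxy thread. */
    boolean isEmpty() {
        return partitions.isEmpty();
    }
}
```

In this sketch the caller decides what to do with abdicated partitions (for
example, look up the current leader again) and whether to shut the proxy down
once isEmpty() returns true.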



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
