[ 
https://issues.apache.org/jira/browse/KAFKA-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515079#comment-14515079
 ] 

Jun Rao commented on KAFKA-2139:
--------------------------------

[~becket_qin], thanks for taking the initiative to redesign the controller. 
Yes, the controller logic is fairly complicated now, and we will need to 
simplify it in order to add more features. One of the issues with the 
controller is that reads/writes to ZK are done one partition at a time, 
sequentially. This lengthens the unavailability window during a hard failure. 
It would be useful to see if we can address this issue in the new design as 
well. For example, perhaps we can use the multi-operation and async support in 
ZK to speed things up. Also, timing-wise, do you plan to do this in 0.8.3? We 
are changing quite a few parts of trunk now. Perhaps we can start the design 
work, but do the implementation post 0.8.3?
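
To illustrate the async idea only, here is a minimal sketch of issuing all 
per-partition writes up front and waiting once for the batch, rather than 
blocking on each write in turn. It does not use the real ZooKeeper client: 
AsyncStore, InMemoryStore, and writeAll are hypothetical names made up for 
this example, and the in-memory store only stands in for an async ZK write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

final class PipelinedWrites {
    /** Hypothetical async store; a stand-in for an async ZooKeeper client. */
    interface AsyncStore {
        CompletableFuture<Void> write(String path, String data);
    }

    /** Issues every write before waiting, then blocks once for the whole batch. */
    static void writeAll(AsyncStore store, Map<String, String> updates) {
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (Map.Entry<String, String> e : updates.entrySet()) {
            // No join() here: the next write is issued without waiting for this one.
            pending.add(store.write(e.getKey(), e.getValue()));
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
    }

    /** In-memory store, present only so the sketch can run without ZooKeeper. */
    static final class InMemoryStore implements AsyncStore {
        final Map<String, String> data = new ConcurrentHashMap<>();

        @Override
        public CompletableFuture<Void> write(String path, String value) {
            return CompletableFuture.runAsync(() -> data.put(path, value));
        }
    }
}
```

With a sequential loop the total latency grows with the number of partitions 
times the round-trip time; with pipelined writes like the above it is closer 
to a single round trip plus the server-side processing time.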

> Add a separate controller message queue with higher priority on broker side 
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-2139
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2139
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>
> This ticket is meant to work together with KAFKA-2029. 
> There are two issues with the current controller-to-broker messages:
> 1. On the controller side, the messages are sent without synchronization.
> 2. On the broker side, the controller messages share the same queue as client 
> messages.
> The problem is that brokers process the controller messages for the same 
> partition at different times, and the variation can be large. This causes 
> unnecessary data loss and prolongs preferred leader election, controlled 
> shutdown, partition reassignment, etc.
> KAFKA-2029 tries to add a boundary between messages for different 
> partitions. For example, the leader migration for the next partition won't 
> begin until the leader migration for the previous partition finishes.
> This ticket tries to let brokers process controller messages faster. The 
> idea is to have a separate queue that holds controller messages. If there are 
> controller messages, the KafkaApi thread will take care of those first; 
> otherwise it will process messages from clients.
> These two tickets are not the ultimate solution to the current controller 
> problems; they just mitigate them with minor code changes. Moving forward, we 
> still need to think about rewriting the controller in a cleaner way.
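
The two-queue idea in the description can be sketched roughly as follows. 
This is not Kafka's request channel code: TwoQueueChannel and its methods are 
hypothetical names, and requests are plain strings to keep the example small. 
It only shows a handler checking the controller queue before the client queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: controller requests are always served before client requests. */
final class TwoQueueChannel {
    private final BlockingQueue<String> controllerQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<String> clientQueue = new LinkedBlockingQueue<>();

    void sendControllerRequest(String request) {
        controllerQueue.add(request);
    }

    void sendClientRequest(String request) {
        clientQueue.add(request);
    }

    /**
     * Returns a pending controller request if there is one; otherwise waits up
     * to timeoutMs for a client request. Returns null if nothing arrives in time
     * or the thread is interrupted.
     */
    String receive(long timeoutMs) {
        String request = controllerQueue.poll();
        if (request != null) {
            return request;
        }
        try {
            // A controller request that arrives during this wait is picked up on
            // the next call, so the timeout should be kept short.
            return clientQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

A handler thread would call receive in a loop. If a client request is 
enqueued first and a controller request second, the controller request is 
still returned first. One trade-off of this simple form is that a steady 
stream of controller requests would starve client requests.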



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
