[
https://issues.apache.org/jira/browse/KAFKA-340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478202#comment-13478202
]
Neha Narkhede commented on KAFKA-340:
-------------------------------------
Thanks for the patch, it looks very well thought out. A few questions/comments -
1. KafkaController
1.1 It seems like we might need to move the filter to the getter instead of the
setter for liveBrokerIds. This is because if the shutting down broker list
changes after setting the value for live brokers and reading it, the list of
live brokers might either miss out alive brokers or include dead brokers.
1.2 Since both alive and shutting down broker lists are set, you can use the
set difference notation instead of individually reading and filtering out
unwanted brokers. Not sure how sets are implemented under the covers in Scala,
but my guess is that the set difference method is more efficient.
1.3 Can we rename replicatedPartitionsBrokerLeads to
replicatePartitionsThisBrokerLeads ?
1.4 The leader movement sets the isr correctly for the partition by removing
the current leader as well as any broker that is shutting down from the isr.
Then, the replica is marked as offline which tries to remove the broker from
the isr again. Let's ignore the fact that the OfflineReplica state change reads
the isr from zookeeper since we are going to address that in a separate JIRA
already and I don't think we should optimize too early here. What we can do is
add a check to OfflineReplica state change that doesn't do the zookeeper write
and rpc request if the replica is already not in the isr.
1.5 In shutdownBroker,
controllerContext.shuttingDownBrokerIds.add(id)
controllerContext.liveBrokers = controllerContext.liveBrokers.filter(_.id
!= id)
It seems like the 2nd statement is not required since the shutting down brokers
should be removed from the live brokers list anyways. This is another reason
why moving it to the getter might be useful and safer.
1.6 If the shutdownBroker API throws a runtime exception, what value is
returned to the admin command ?
1.7 In shutdownBroker, the controllerLock is acquired and released atleast four
times. Probably you wanted to release the lock between the move for each
partition in order to not block on an unimportant operation like broker
shutdown. It is usually wise to release a lock for blocking operations to avoid
deadlock/starvation. So, I didn't get why the lock is released for sending the
stop replica request. It seems like the lock can be acquired up until the list
of partitions to be moved is computed ? This acquire/release lock multiple
times approach is always tricky, but right now after reading the code once, I'm
not sure if this particular API would run into any race condition or not. So
far, my intuition is that the only bad thing that can happen is we miss out on
moving some leaders from the broker, which is not a critical operation anyway.
2. StopReplicaRequest
2.1 I think it is a little awkward to overload the stop replica request with no
partitions to mean that it is not supposed to shutdown the replica fetcher
thread. In the future, if we find a use case where we need to do this only for
some partitions, we might need to change the stop replica request format. What
do people think about adding a flag to the stop replica request that tells the
broker if it should just shutdown the replica fetcher thread or delete the
replica as well ?
> Implement clean shutdown in 0.8
> -------------------------------
>
> Key: KAFKA-340
> URL: https://issues.apache.org/jira/browse/KAFKA-340
> Project: Kafka
> Issue Type: Sub-task
> Components: core
> Affects Versions: 0.8
> Reporter: Jun Rao
> Assignee: Joel Koshy
> Priority: Blocker
> Labels: bugs
> Attachments: KAFKA-340-v1.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> If we are shutting down a broker when the ISR of a partition includes only
> that broker, we could lose some messages that have been previously committed.
> For clean shutdown, we need to guarantee that there is at least 1 other
> broker in ISR after the broker is shut down.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira