[
https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592884#comment-13592884
]
Neha Narkhede commented on KAFKA-513:
-------------------------------------
Thanks for the patch! This patch is great and is very close to being checked
in. Few review suggestions -
1. ControllerChannelManager
1.1 The log message when the controller sends the data includes the
controller's epoch, but the response doesn't. It is useful to know the epoch
wherever the controller id is logged.
1.2 The "," between controller id and epoch probably could be skipped, this
pattern is not followed elsewhere in this patch.
2. Partition
2.1 The error statement when the broker is dropping the leaderAndIsr request
should be -
"since local leader epoch %d is >= the request's leader epoch"
2.2 Also, let's add "Broker %d aborted..." to the following statement similar
to "Broker %d discarded..."
"Aborted the become-follower state change since leader %d for partition [%s,%d]"
3. PartitionStateMachine
In electLeaderForPartition() API, it is useful to include the contents of the
StateChangedFailedException in the state change log. This is because it is
useful to know that after starting the state change, it got aborted because
another controller with a higher controller epoch was detected. So something
like -
"Controller %d epoch %d aborted leader election for partition [%s,%d] since ..."
4. ReplicaManager
The log statements here are of the form of [Replica Manager on Broker %d]:
Handling... This is different from the ones in Partition which are like "Broker
%d started become-follower...". I guess the default logIdent, which is meant
for the main log, is the cause of this discrepancy. For the purpose of the
state change log, we actually don't care if it is Replica Manager or Partition
on that broker, it would just suffice to have the simplest statement to
communicate the state changes like the ones in Partition except the logIdent.
5. ReplicaStateMachine
Same as 4
6. Rename stateChangeLogMerger.scala to have 1st letter in caps.
7. StateChangeLogMerger
This tool looks great now. Thanks for including the review suggestions. Few
minor observations -
7.1 You check if at least one of the two log input options are specified. I
think we should also check that at most one of them is specified. In other
words, what's the expected behavior if both are speci
fied ?
> Add state change log to Kafka brokers
> -------------------------------------
>
> Key: KAFKA-513
> URL: https://issues.apache.org/jira/browse/KAFKA-513
> Project: Kafka
> Issue Type: Sub-task
> Affects Versions: 0.8
> Reporter: Neha Narkhede
> Assignee: Swapnil Ghike
> Priority: Blocker
> Labels: p1, replication, tools
> Fix For: 0.8
>
> Attachments: kafka-513-v1.patch, kafka-513-v2.patch,
> kafka-513-v3.patch, kafka-513-v4.patch
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> Once KAFKA-499 is checked in, every controller to broker communication can be
> modelled as a state change for one or more partitions. Every state change
> request will carry the controller epoch. If there is a problem with the state
> of some partitions, it will be good to have a tool that can create a timeline
> of requested and completed state changes. This will require each broker to
> output a state change log that has entries like
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for
> partition [foo, 0] from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for
> partition [foo, 0] from controller 2, epoch 1
> On controller, this will look like -
> [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change
> request LeaderAndIsr() for partition [foo, 0]
> We need a tool that can collect the state change log from all brokers and
> create a per-partition timeline of state changes -
> [foo, 0]
> [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change
> request LeaderAndIsr()
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from
> controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from
> controller 2, epoch 1
> This JIRA involves adding the state change log to each broker and adding the
> tool to create the timeline
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira