> On Oct. 21, 2014, 12:18 a.m., Jun Rao wrote:
> > core/src/main/scala/kafka/server/ReplicaManager.scala, lines 481-506
> > <https://reviews.apache.org/r/26373/diff/3/?file=725140#file725140line481>
> >
> >     This doesn't quite fix the original problem though. The original 
> > problem is that if the leader is not alive, we won't call 
> > partition.makeFollower(), in which the local replica is created. If a local 
> > replica is not created, the partition will be ignored when checkingpoint HW 
> > and we lose the last checkpointed HW.
> >     
> >     So, we have to call partition.makerFollower() for every follower, 
> > whether its leader is live or not. After this, we can proceed with the rest 
> > of the steps for only those partitions with a live leader. We can log a 
> > warning for those partitions w/o a live leader.
> 
> Jiangjie Qin wrote:
>     Hi Jun, thanks for the review. 
>     I think the high watermark is actually read in 
> partition.getOrCreateReplica(). As you said, that means in order to preserve 
> the high watermark, the local replica is supposed to be created. The original 
> code did not create local replica when remote leader is not up so it lost the 
> high watermark.
>     partition.getOrCreateReplica() is actually called in the following 2 
> places:
>     1. partition.makeFollower()
>     2. ReplicaManager line 515, when we truncate the log for the partitions 
> in partitionsToMakeFollower.
>     Both of them could preserve high watermark. The difference between them 
> is that in (1) all the replica for the partition, including the 
> not-existing-yet remote replica, will be created. However in (2) only local 
> replica will be created. Because I'm not 100% sure if it could cause some 
> other issue if we create a remote replica when it is actually not up yet, so 
> I chose to preserve the high watermark in (2) instead of in (1), which has 
> less change compared with original code.

Did you get a chance to repeat the kind of testing that was done to find this 
bug? I'd be more comfortable accepting this patch if we did that sort of 
testing since this change is tricky.


- Neha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26373/#review57490
-----------------------------------------------------------


On Oct. 18, 2014, 7:26 a.m., Jiangjie Qin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26373/
> -----------------------------------------------------------
> 
> (Updated Oct. 18, 2014, 7:26 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1647
>     https://issues.apache.org/jira/browse/KAFKA-1647
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> Addressed Joel's comments.
> 
> 
> the version 2 code seems to be submitted by mistake... This should be the 
> code for review that addressed Joel's comments.
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 78b7514cc109547c562e635824684fad581af653 
> 
> Diff: https://reviews.apache.org/r/26373/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jiangjie Qin
> 
>

Reply via email to