> On May 20, 2015, 5:15 p.m., Onur Karaman wrote:
> > I only did a brief skim. This optimization tries to switch consumers over 
> > to a new coordinator without a rebalance. From my understanding, the 
> > consumers would detect a coordinator failure, discover the new coordinator 
> > to work with, and try heartbeating that new coordinator without a 
> > rebalance.
> > 
> > So it seems to me that putting the logic in handleJoinGroup isn't right, as 
> > the rebalance is what we're trying to avoid. The code should be in 
> > handleHeartbeat. It should look up the group info in ZK, add it to 
> > CoordinatorMetadata, and start up a DelayedHeartbeat for every consumer of 
> > that group.
> > 
> > **More importantly: given that this is just an optimization, and we haven't 
> > even seen the performance hit without this, I think KAFKA-2017 should be 
> > very low priority.**
> > 
> > The following are higher priority:
> > 1. Getting the consumer to properly handle error codes of the join group 
> > and heartbeat responses.
> > 2. Getting the consumer to detect coordinator failures and switch over to 
> > another coordinator (my KAFKA-1334 patch just had the coordinator detect 
> > consumer failures). A nice benefit of completing this first is that if we 
> > decide that the rebalances on coordinator failover are an actual issue, 
> > this would greatly facilitate testing any coordinator failover logic. Right 
> > now, it's unclear how this rb's logic can be tested.
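
For illustration, the handleHeartbeat suggestion above could look roughly like the toy sketch below. It is written in Python rather than the coordinator's Scala, and nothing in it comes from the patch: the `Coordinator` and `Group` classes, the dict standing in for ZK, the `pending_heartbeats` set standing in for DelayedHeartbeat operations, and the error-code strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    group_id: str
    consumers: set = field(default_factory=set)


class Coordinator:
    """Toy coordinator; `zk` is a plain dict standing in for ZooKeeper."""

    def __init__(self, zk):
        self.zk = zk                     # group_id -> iterable of consumer ids
        self.metadata = {}               # stand-in for CoordinatorMetadata
        self.pending_heartbeats = set()  # stand-in for DelayedHeartbeat ops

    def handle_heartbeat(self, group_id, consumer_id):
        group = self.metadata.get(group_id)
        if group is None:
            # Cache miss: this broker may have just taken over as coordinator,
            # so try to restore the group from the registry instead of
            # forcing a rebalance.
            registered = self.zk.get(group_id)
            if registered is None:
                return "UNKNOWN_GROUP"   # hypothetical error code
            group = Group(group_id, set(registered))
            self.metadata[group_id] = group
            # Expect a heartbeat from every member of the restored group.
            for member in group.consumers:
                self.pending_heartbeats.add((group_id, member))
        if consumer_id not in group.consumers:
            return "UNKNOWN_CONSUMER"    # hypothetical error code
        # Heartbeat accepted; a real coordinator would reschedule the timer.
        return "NONE"
```

With a registry of `{"g1": ["c1", "c2"]}`, the first heartbeat from `c1` restores `g1` into the local metadata and registers a pending heartbeat for both `c1` and `c2`, without going through a join-group.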

I added a ticket for 2: 
[KAFKA-2208](https://issues.apache.org/jira/browse/KAFKA-2208)


- Onur


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34450/#review84539
-----------------------------------------------------------


On May 20, 2015, 4:13 p.m., Guozhang Wang wrote:
> 
> (Updated May 20, 2015, 4:13 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2017
>     https://issues.apache.org/jira/browse/KAFKA-2017
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> 1. Upon receiving join-group, if the group metadata cannot be found in the 
> local cache, try to read it from ZK.
> 2. Upon completing a rebalance, update ZK with the new group registry, or 
> delete the registry if the group becomes empty.
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 
> af06ad45cdc46ac3bc27898ebc1a5bd5b1c7b19e 
>   core/src/main/scala/kafka/coordinator/ConsumerGroupMetadata.scala 
> 47bdfa7cc86fd4e841e2b1d6bfd40f1508e643bd 
>   core/src/main/scala/kafka/coordinator/CoordinatorMetadata.scala 
> c39e6de34ee531c6dfa9107b830752bd7f8fbe59 
>   core/src/main/scala/kafka/utils/ZkUtils.scala 
> 2618dd39b925b979ad6e4c0abd5c6eaafb3db5d5 
> 
> Diff: https://reviews.apache.org/r/34450/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>
