Oleksiy,

The join group protocol is general enough to handle multiple types of group
membership, not just consumers. It is also used in Kafka Connect to form a
group of workers (which, instead of splitting topic partitions between
members, splits connector tasks).

In order to make this work and allow flexibility in how assignment is
handled, the protocol is divided into two layers. The primary join group
protocol only a) keeps track of group membership and b) selects a group
protocol that all members agree they can work with. At this level, there's
no version information, no info about consumer subscriptions, and no
knowledge of partition assignment strategies other than the names and
opaque metadata submitted by clients.

The "embedded" layer is where the version info you're setting is specified.
This is never even parsed by the brokers -- the information is collected
and sent to one of the group members, which then decodes it and determines
the assignment. That result is then returned to the broker, which
disseminates it (and again, the broker never decodes this; it just forwards
the appropriate info to each member).
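
To make the split between the two layers concrete, here is a rough sketch in
Python. It is not the broker's actual code: the function names, the JSON
metadata format, the "first member leads" rule, and the choice to assign whole
topics rather than partitions are all assumptions made for illustration. The
point is only that the coordinator function touches nothing but bytes, while
the leader function is the one that decodes and assigns.

```python
import json
from typing import Callable, Dict, List, Tuple

Blobs = Dict[str, bytes]  # member id -> opaque bytes


def coordinator_round(
    requests: Dict[str, List[Tuple[str, bytes]]],
    chosen_protocol: str,
    leader_assign: Callable[[Blobs], Blobs],
) -> Tuple[str, Blobs]:
    """Broker side: moves bytes around and never decodes them."""
    leader_id = next(iter(requests))  # assumption: first member to join leads
    # Collect the metadata each member submitted for the chosen protocol.
    collected = {
        member: dict(protocols)[chosen_protocol]
        for member, protocols in requests.items()
    }
    # In the real protocol this step runs on the leader, not on the broker.
    assignments = leader_assign(collected)
    # Forward each member its own (still opaque) assignment.
    return leader_id, {member: assignments[member] for member in requests}


def leader_assign(collected: Blobs) -> Blobs:
    """Member side: decode metadata, compute an assignment, re-encode it."""
    # Hypothetical embedded format: JSON {"version": int, "topics": [...]}.
    decoded = {m: json.loads(blob) for m, blob in collected.items()}
    members = sorted(decoded)
    topics = sorted({t for d in decoded.values() for t in d["topics"]})
    result: Dict[str, List[str]] = {m: [] for m in members}
    for topic in topics:
        # Give each topic to the least-loaded member that subscribed to it.
        subscribers = [m for m in members if topic in decoded[m]["topics"]]
        target = min(subscribers, key=lambda m: len(result[m]))
        result[target].append(topic)
    return {m: json.dumps(ts).encode() for m, ts in result.items()}


requests = {
    "c1": [("round-robin", json.dumps({"version": 0, "topics": ["a", "b"]}).encode())],
    "c2": [("round-robin", json.dumps({"version": 0, "topics": ["b", "c"]}).encode())],
}
leader, assignments = coordinator_round(requests, "round-robin", leader_assign)
```

Note that the "version" field lives entirely inside the blob, so only
leader_assign could ever act on it.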

The version is included specifically in the consumer protocol to allow us
to extend the format in the future. For example, if we needed to add or
change the way subscriptions are expressed, we could increase that version
number and update the message format. In other words, it is the mechanism
we have chosen *only for the consumer embedded protocol* to allow metadata
format changes. (Note that for the consumer embedded protocol there is also
*yet another* layer of data, called "UserData" in that protocol
documentation; this is custom data that the partition assignment strategy in
the consumer, which is pluggable, might want to include, e.g. if you were
doing resource-based assignment you might need to include info like the
number of CPUs, which is specific to that assignment strategy).
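
For reference, this is roughly what encoding and decoding that consumer
metadata looks like, following the layout you quoted from the protocol guide
(Version, Subscription, UserData). Treat it as a sketch: the function names are
made up, I'm assuming the usual Kafka wire primitives (big-endian integers,
int16-length strings, int32-length arrays and bytes), and it does not handle
null UserData. The "cpus=4" payload is just an invented example of
strategy-specific data.

```python
import struct
from typing import List, Tuple


def encode_consumer_metadata(version: int, topics: List[str], user_data: bytes) -> bytes:
    out = struct.pack(">h", version)        # Version => int16
    out += struct.pack(">i", len(topics))   # Subscription => array length (int32)
    for topic in topics:
        raw = topic.encode("utf-8")
        out += struct.pack(">h", len(raw)) + raw  # Topic => int16 length + bytes
    out += struct.pack(">i", len(user_data)) + user_data  # UserData => int32 length + bytes
    return out


def decode_consumer_metadata(buf: bytes) -> Tuple[int, List[str], bytes]:
    (version,) = struct.unpack_from(">h", buf, 0)
    (count,) = struct.unpack_from(">i", buf, 2)
    offset = 6
    topics = []
    for _ in range(count):
        (length,) = struct.unpack_from(">h", buf, offset)
        offset += 2
        topics.append(buf[offset:offset + length].decode("utf-8"))
        offset += length
    (length,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    return version, topics, buf[offset:offset + length]


# Hypothetical resource-based strategy passing its own data along.
blob = encode_consumer_metadata(1, ["orders", "payments"], b"cpus=4")
version, topics, user_data = decode_consumer_metadata(blob)
```

A decoder on the leader can branch on the decoded version to support an older
and a newer metadata format side by side, which is the only thing the version
field is for.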

The broker only looks at the ProtocolName (which is equivalent to
AssignmentStrategy for consumers) when choosing which protocol to use for
consumers. If you want to version those in an incompatible way (i.e. you
can't handle the change just by updating the format of your metadata), you
should include version info in the ProtocolName itself to ensure the group
coordinator broker can differentiate them, e.g. round-robin vs
round-robin-2. But you should also think carefully about whether that
change is necessary -- in many cases if you're not adding any metadata
you'll be fine just keeping the same name since one member is selected to
perform the assignment and every other member just needs to respect
whatever assignment it makes. And of course if you're just trying to switch
to a completely different assignment strategy (e.g. from range to
round-robin), then the name itself is enough: bounce all consumers to add
round-robin as an option, then bounce them all again to remove range.
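
Here is a small simulation of that rolling switch, using the selection rule
described in my earlier message below (candidates are the names every member
supports; each member votes for the first candidate in its own list; the most
votes wins). This is a simplified model, not the GroupMetadata.scala code: the
function name is made up, and the tie-breaking here is whatever Python's
Counter gives, which may not match the broker.

```python
from collections import Counter
from typing import Dict, List, Optional


def select_protocol(members: Dict[str, List[str]]) -> Optional[str]:
    """members maps member id -> protocol names in preference order."""
    if not members:
        return None
    # Candidates: names supported by every member. Names are the only key,
    # so nothing inside the metadata (such as a version) can influence this.
    candidates = set.intersection(*(set(names) for names in members.values()))
    if not candidates:
        return None  # no common protocol; the group cannot agree
    # Each member votes for the first candidate in its own list.
    votes = Counter(
        next(name for name in names if name in candidates)
        for names in members.values()
    )
    return votes.most_common(1)[0][0]


# Rolling switch from range to round-robin, one consumer bounced at a time.
phases = [
    {"c1": ["range"], "c2": ["range"]},
    {"c1": ["range", "round-robin"], "c2": ["range"]},
    {"c1": ["range", "round-robin"], "c2": ["range", "round-robin"]},
    {"c1": ["round-robin"], "c2": ["range", "round-robin"]},
    {"c1": ["round-robin"], "c2": ["round-robin"]},
]
for phase in phases:
    print(select_protocol(phase))
```

At every step there is at least one name common to all members, so the group
keeps working throughout. With this ordering the switch happens when the first
consumer drops range; if you list round-robin first instead, it happens as soon
as every consumer offers it. An incompatible rename such as round-robin versus
round-robin-2 with no overlap gives no candidate at all, which is why you want
both names listed during the transition.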

We considered other options when designing this protocol, but decided this
was the best tradeoff. The current protocol is already pretty complex and
multi-layered, and the alternatives that tried to build in versioning at
this level too were even more complex and confusing.

-Ewen



On Wed, Dec 23, 2015 at 10:45 PM, Oleksiy Krivoshey <oleks...@gmail.com>
wrote:

> Hi Ewen,
>
> I specify version in ProtocolMetadata structure, as per this document:
>
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-JoinGroupResponse
>
> ---------------
> ProtocolType => "consumer"
>
> ProtocolName => AssignmentStrategy
>   AssignmentStrategy => string
>
> ProtocolMetadata => Version Subscription UserData
>   Version => int16
>   Subscription => [Topic]
>     Topic => string
>   UserData => bytes
> -----------------
>
> Maybe I misunderstood the purpose of this version field?
>
> On Thu, 24 Dec 2015 at 00:27 Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > Oleksiy,
> >
> > Where are you specifying the version? Unless I'm missing something, the
> > JoinGroup protocol doesn't include versions so I'm not sure I understand
> > the examples you are giving. Are the version numbers included in the
> > per-protocol metadata?
> >
> > You can see exactly how the consumer coordinator on the broker selects the
> > protocol here:
> >
> > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/GroupMetadata.scala#L179
> >
> > It is just taking the candidate protocols (ones that are available for all
> > consumers), then has each consumer "vote" by selecting whichever candidate
> > appears in its list of strategies first, then uses the one with the most
> > votes.
> >
> > Is it possible your example is behaving the way it is because it actually
> > has duplicates for "strategyX", and in the last case it chooses the first
> > strategyX despite the conflicting versions?
> >
> > -Ewen
> >
> > On Wed, Dec 23, 2015 at 9:44 AM, Oleksiy Krivoshey <oleks...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I can't understand how the protocol upgrades (to newer version) should
> > > work. When I send GroupJoinRequest with a list of assignment protocols
> > > (same protocol name, different versions) always the first protocol/version
> > > gets picked up as a member version. Even if all consumers in the group are
> > > configured with two versions still always the first specified version will
> > > be selected by coordinator and not the one with highest version number.
> > >
> > > So for example:
> > > consumer1: [ {name: strategyX, version: 0}, {name: strategyX, version: 1} ]
> > > consumer2: [ {name: strategyX, version: 0}, {name: strategyX, version: 1} ]
> > >
> > > Both will be assigned a version 0 in a response to leader. If I make it
> > > this way:
> > >
> > > consumer1: [ {name: strategyX, version: 1}, {name: strategyX, version: 0} ]
> > > consumer2: [ {name: strategyX, version: 1}, {name: strategyX, version: 0} ]
> > >
> > > Both will be assigned version 1.
> > >
> > > In this case:
> > >
> > > consumer1: [ {name: strategyX, version: 10}, {name: strategyX, version: 1} ]
> > > consumer2: [ {name: strategyX, version: 20}, {name: strategyX, version: 1} ]
> > >
> > > Kafka will endlessly try to rebalance the group without success because
> > > consumer1 will have version:10 and consumer2 - version:20 in a
> > > GroupJoinResponse.
> > >
> > > Can anyone please explain the process of the protocol version upgrade?
> > >
> >
> >
> >
> > --
> > Thanks,
> > Ewen
> >
>



-- 
Thanks,
Ewen
