Those docs at https://kafka.apache.org/090/documentation.html#upgrade are still mentioning 0.8.3 instead of 0.9.0. Is there a JIRA already to fix this?

On Thu, Nov 5, 2015 at 9:28 PM, Grant Henke <ghe...@cloudera.com> wrote:

> Hi Matthew,
>
> I have not read into the details of your issues but have done similar
> "rolling" upgrade testing myself. The reason replication breaks is due to
> some wire protocol changes.
>
> Just checking some preliminary things before digging in
>
>    - Have you followed the upgrade steps outlined here?
>       - https://kafka.apache.org/090/documentation.html#upgrade
>    - Does setting inter.broker.protocol.version=0.8.2.X resolve the issue?
>       - Note: you need to unset and restart again after all brokers are
>         upgraded.
>
> In the future KIP-35 may help alleviate the manual step of setting the
> inter.broker.protocol.version. You can read more about KIP-35 and
> participate in the discussion/design here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
>
> Thanks,
> Grant
>
>
> On Thu, Nov 5, 2015 at 2:18 PM, Matthew Bruce <mbr...@blackberry.com>
> wrote:
>
> > Hello Kafka Devs,
> >
> > I've been testing the upgrade procedure between Kafka 0.8.2.1 and Kafka
> > 0.9.0.0 and have been having Replication issues between the two version,
> > and I was wondering if anyone was aware of this issue (I just searched and
> > this seems to be related to KAFKA-2750 raised yesterday ).
> >
> > I start with 3 brokers running 0.8.2.1 all that contain data (1 topic with
> > 10 partitions), then I shut down one of the brokers, upgrade it to 0.9.0
> > (making sure to set 'inter.broker.protocol.version=0.8.2.X' in
> > broker.properties). Once the Broker is started I see errors like the
> > following:
> >
> > [2015-11-05 19:13:10,309] WARN [ReplicaFetcherThread-0-182050600], Error
> > in fetch kafka.server.ReplicaFetcherThread$FetchRequest@6cc18858.
> > Possible cause:
> > org.apache.kafka.common.protocol.types.SchemaException: Error reading
> > field 'responses': Error reading field 'topic':
> > java.nio.BufferUnderflowException
> > (kafka.server.ReplicaFetcherThread)
> >
> > And
> >
> > [2015-11-03 16:55:15,178] WARN [ReplicaFetcherThread-1-182050600], Error
> > in fetch kafka.server.ReplicaFetcherThread$FetchRequest@224388b2.
> > Possible cause:
> > org.apache.kafka.common.protocol.types.SchemaException: Error reading
> > field 'responses': Error reading field 'partition_responses': Error
> > reading field 'record_set': java.lang.IllegalArgumentException
> > (kafka.server.ReplicaFetcherThread)
> >
> >
> > I've spent some time in the Kafka code, and packet captures/wireshark
> > trying to figure this out, and I believe there is an issue in
> > org.apache.kafka.clients.networkClient.java in the handleCompletedReceives
> > function:
> >
> > When extracting the response body, this function is using
> > ProtoUtils.currentResponseSchema instead of ProtoUtils.ResponseSchema and
> > specifying the API version required by inter.broker.protocol.version.
> >
> > Struct body = (Struct)
> > ProtoUtils.currentResponseSchema(apiKey).read(receive.payload());
> >
> > This results in errors when the newer version of a Schema
> > (FETCH_RESPONSE_V1 instead of FETCH_RESPONSE_V0) is applied against the
> > fetch response returned by the 0.8.2.1 broker
> >
> >
> > Thanks,
> > Matthew Bruce
> > mbr...@blackberry.com
> >
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
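
To make the schema mismatch described in the quoted report concrete, here is a rough, self-contained sketch in Python (not Kafka code). It assumes a simplified fetch response layout in which the v1 schema differs from v0 only by a leading int32 throttle_time_ms field; that layout is my reading of the protocol guide and should be checked against it. The Reader class and the encode/decode helpers are hypothetical names made up for this illustration.

```python
import struct


class Reader:
    """Cursor over a bytes buffer using big-endian primitives."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def _take(self, fmt):
        size = struct.calcsize(fmt)
        if self.pos + size > len(self.data):
            raise EOFError("buffer underflow")
        (value,) = struct.unpack_from(fmt, self.data, self.pos)
        self.pos += size
        return value

    def int16(self):
        return self._take(">h")

    def int32(self):
        return self._take(">i")

    def int64(self):
        return self._take(">q")

    def raw(self, n):
        if n < 0 or self.pos + n > len(self.data):
            raise EOFError("buffer underflow")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk


def encode_fetch_response_v0(topic, partitions):
    """Encode a simplified v0-style body: no throttle_time_ms prefix (assumed layout)."""
    name = topic.encode("utf-8")
    out = struct.pack(">i", 1)                    # one entry in 'responses'
    out += struct.pack(">h", len(name)) + name    # 'topic'
    out += struct.pack(">i", len(partitions))     # 'partition_responses' count
    for partition, high_watermark, records in partitions:
        out += struct.pack(">ihq", partition, 0, high_watermark)  # error_code = 0
        out += struct.pack(">i", len(records)) + records          # 'record_set'
    return out


def decode_fetch_response(data, version):
    """Decode with the schema for `version`; v1 expects throttle_time_ms first."""
    r = Reader(data)
    throttle = r.int32() if version >= 1 else None
    topics = []
    for _ in range(r.int32()):
        topic = r.raw(r.int16()).decode("utf-8")
        parts = []
        for _ in range(r.int32()):
            partition, _error, hw = r.int32(), r.int16(), r.int64()
            parts.append((partition, hw, r.raw(r.int32())))
        topics.append((topic, parts))
    return throttle, topics


payload = encode_fetch_response_v0("test", [(0, 42, b"abc")])
print(decode_fetch_response(payload, 0))   # matching schema parses cleanly
try:
    decode_fetch_response(payload, 1)      # newer schema applied to older bytes
except EOFError as exc:
    print("v1 schema on v0 bytes:", exc)
```

With the v1 schema, the array count is consumed as throttle_time_ms and every later field is read from the wrong offset, so the parser runs off the end of the buffer. That is at least consistent with the BufferUnderflowException on 'topic' in the first log entry, although the real failure path in the broker may differ.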