I don't mind opening a JIRA and testing a patch, but I'm not sure how quickly I'd be able to contribute one on my own.

Thanks,
Matt

-----Original Message-----
From: Grant Henke [mailto:ghe...@cloudera.com]
Sent: Thursday, November 5, 2015 3:37 PM
To: dev@kafka.apache.org
Subject: Re: Replication Broken between Kafka 0.8.2.1 and 0.9 (trunk)

Hi Matt,

Great. Just wanted to be sure of the background.

Are you interested in opening a jira and contributing a patch? If not would you be interested in testing a patch if I provided one?

Thank you,
Grant

On Thu, Nov 5, 2015 at 2:31 PM, Matthew Bruce <mbr...@blackberry.com> wrote:

> Hi Grant,
>
> Yes, I have read and followed the Upgrade documentation and I have set
> inter.broker.protocol.version=0.8.2.X and it does not resolve the
> issue - based on the part of the code it's occurring in, it uses the 'latest'
> version for any API Key, and not the one specified by
> 'inter.broker.protocol.version'.
>
> Matt
>
> -----Original Message-----
> From: Grant Henke [mailto:ghe...@cloudera.com]
> Sent: Thursday, November 5, 2015 3:28 PM
> To: dev@kafka.apache.org
> Subject: Re: Replication Broken between Kafka 0.8.2.1 and 0.9 (trunk)
>
> Hi Matthew,
>
> I have not read into the details of your issues but have done similar
> "rolling" upgrade testing myself. The reason replication breaks is due
> to some wire protocol changes.
>
> Just checking some preliminary things before digging in
>
> - Have you followed the upgrade steps outlined here?
>   - https://kafka.apache.org/090/documentation.html#upgrade
> - Does setting inter.broker.protocol.version=0.8.2.X resolve the issue?
>   - Note: you need to unset and restart again after all brokers are
>     upgraded.
>
> In the future KIP-35 may help alleviate the manual step of setting the
> inter.broker.protocol.version. You can read more about KIP-35 and
> participate in the discussion/design here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
>
> Thanks,
> Grant
>
>
> On Thu, Nov 5, 2015 at 2:18 PM, Matthew Bruce <mbr...@blackberry.com>
> wrote:
>
> > Hello Kafka Devs,
> >
> > I've been testing the upgrade procedure between Kafka 0.8.2.1 and
> > Kafka 0.9.0.0 and have been having Replication issues between the two
> > version, and I was wondering if anyone was aware of this issue (I
> > just searched and this seems to be related to KAFKA-2750 raised
> > yesterday).
> >
> > I start with 3 brokers running 0.8.2.1 all that contain data (1
> > topic with 10 partitions), then I shut down one of the brokers,
> > upgrade it to 0.9.0 (making sure to set
> > 'inter.broker.protocol.version=0.8.2.X' in broker.properties). Once
> > the Broker is started I see errors like the following:
> >
> > [2015-11-05 19:13:10,309] WARN [ReplicaFetcherThread-0-182050600],
> > Error in fetch
> > kafka.server.ReplicaFetcherThread$FetchRequest@6cc18858. Possible cause:
> > org.apache.kafka.common.protocol.types.SchemaException: Error
> > reading field 'responses': Error reading field 'topic':
> > java.nio.BufferUnderflowException
> > (kafka.server.ReplicaFetcherThread)
> >
> > And
> >
> > [2015-11-03 16:55:15,178] WARN [ReplicaFetcherThread-1-182050600],
> > Error in fetch
> > kafka.server.ReplicaFetcherThread$FetchRequest@224388b2. Possible cause:
> > org.apache.kafka.common.protocol.types.SchemaException: Error
> > reading field 'responses': Error reading field 'partition_responses':
> > Error reading field 'record_set': java.lang.IllegalArgumentException
> > (kafka.server.ReplicaFetcherThread)
> >
> >
> > I've spent some time in the Kafka code, and packet
> > captures/wireshark trying to figure this out, and I believe there is
> > an issue in org.apache.kafka.clients.networkClient.java in the
> > handleCompletedReceives function:
> >
> > When extracting the response body, this function is using
> > ProtoUtils.currentResponseSchema instead of
> > ProtoUtils.ResponseSchema and specifying the API version required by
> > inter.broker.protocol.version.
> >
> > Struct body = (Struct)
> > ProtoUtils.currentResponseSchema(apiKey).read(receive.payload());
> >
> > This results in errors when the newer version of a Schema
> > (FETCH_RESPONSE_V1 instead of FETCH_RESPONSE_V0) is applied against
> > the fetch response returned by the 0.8.2.1 broker
> >
> >
> > Thanks,
> > Matthew Bruce
> > mbr...@blackberry.com
> >
> >
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>

--
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
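
For anyone following the thread, the failure mode in the original report (a response written with an older schema but parsed with the newest one) can be sketched in isolation. This is a minimal, self-contained illustration and not Kafka code: the class `SchemaVersionMismatchSketch`, the methods `encodeV0` and `read`, and the byte layouts are all hypothetical. The only assumption about the real protocol is that the newer response version carries an extra leading field, modeled here as a single int32.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for versioned response schemas; not Kafka code.
final class SchemaVersionMismatchSketch {

    // Assumed v0 layout: [int32 topicCount], then per topic [int16 length][UTF-8 bytes].
    static ByteBuffer encodeV0(List<String> topics) {
        ByteBuffer buf = ByteBuffer.allocate(256);
        buf.putInt(topics.size());
        for (String topic : topics) {
            byte[] name = topic.getBytes(StandardCharsets.UTF_8);
            buf.putShort((short) name.length);
            buf.put(name);
        }
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    // Assumed v1 layout: one extra leading int32, followed by the v0 layout.
    // The caller chooses the version, which is the point of the sketch.
    static List<String> read(ByteBuffer buf, int version) {
        if (version >= 1) {
            buf.getInt(); // consume the extra leading field that v0 never wrote
        }
        int count = buf.getInt();
        List<String> topics = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            byte[] name = new byte[buf.getShort()];
            buf.get(name); // throws BufferUnderflowException if the bytes are not there
            topics.add(new String(name, StandardCharsets.UTF_8));
        }
        return topics;
    }

    public static void main(String[] args) {
        List<String> sent = Collections.singletonList("test");

        // Reader and writer agree on the version: the topic list round-trips.
        System.out.println("v0 reader: " + read(encodeV0(sent), 0));

        // Reader assumes the newest version: field boundaries shift and parsing fails.
        try {
            read(encodeV0(sent), 1);
        } catch (BufferUnderflowException e) {
            System.out.println("v1 reader failed: " + e);
        }
    }
}
```

In this sketch the reader's version is an explicit argument, which mirrors the suggestion in the report: pick the response schema from the negotiated inter-broker version rather than always using the latest one.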