I can confirm; this bit me a few days ago. Had it been documented, I wouldn't have needed to spend an hour reading the Scala code that serializes the gzipped message, only to find out there's another message inside, while I was writing a PHP client.
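For anyone else writing a client, here is roughly what the consumer ends up doing, as a Python sketch rather than the actual Scala or PHP code. The field layout (4-byte length, magic byte, attributes byte, CRC32, payload), the codec ids, and all function names are my assumptions for illustration; check them against the Kafka source before relying on them.

```python
import gzip
import struct
import zlib

# Assumed codec ids, stored in the low bits of the attributes byte.
NO_COMPRESSION = 0
GZIP = 1


def encode_message(payload: bytes, codec: int = NO_COMPRESSION) -> bytes:
    """Serialize one message: length, magic, attributes, CRC32, payload."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    body = struct.pack(">BBI", 1, codec, crc) + payload
    return struct.pack(">I", len(body)) + body


def encode_gzipped_set(payloads) -> bytes:
    """Producer side: serialize full messages (headers included),
    gzip the result, then wrap it in one more message flagged as gzip."""
    inner = b"".join(encode_message(p) for p in payloads)
    return encode_message(gzip.compress(inner), GZIP)


def decode_message_set(buf: bytes, depth: int = 0, max_depth: int = 1):
    """Consumer side: yield plain payloads, unwrapping gzipped messages."""
    offset = 0
    while offset < len(buf):
        (length,) = struct.unpack_from(">I", buf, offset)
        _magic, attributes, crc = struct.unpack_from(">BBI", buf, offset + 4)
        # Header is 4 (length) + 1 (magic) + 1 (attributes) + 4 (CRC) bytes.
        payload = buf[offset + 10 : offset + 4 + length]
        offset += 4 + length
        if zlib.crc32(payload) & 0xFFFFFFFF != crc:
            raise ValueError("CRC mismatch")
        codec = attributes & 0x03
        if codec == NO_COMPRESSION:
            yield payload
        elif codec == GZIP:
            # The unzipped payload is itself a message set, so parse it again.
            if depth >= max_depth:
                raise ValueError("compressed message nested too deeply")
            yield from decode_message_set(
                gzip.decompress(payload), depth + 1, max_depth
            )
        else:
            raise ValueError(f"unsupported codec {codec}")
```

With this sketch, `list(decode_message_set(encode_gzipped_set([b"a", b"bc"])))` gives back the two original payloads. The `max_depth` guard is my own addition: it rejects a compressed message found inside an already unzipped one instead of recursing without bound.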
I also think it actually complicates matters: the code that is supposed to handle the payload has to treat a compressed payload as a message again, but only for compressed ones, and the same applies when producing. Not to mention the potential for the compression flag slipping to 1 and messages being wrapped indefinitely.

On Jul 18, 2012 7:55 PM, "Lorenzo Alberton (JIRA)" <j...@apache.org> wrote:

>
> [ https://issues.apache.org/jira/browse/KAFKA-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417367#comment-13417367 ]
>
> Lorenzo Alberton commented on KAFKA-406:
> ----------------------------------------
>
> I see. It needs to be documented somewhere though, as people writing
> client libs might have trouble reverse-engineering the protocol. The fact
> that only gzipped messages are double-wrapped (requiring different handling
> depending on a flag) can be confusing.
>
> This might be a stupid question, but if a Message can contain a MessageSet
> with more than one Message, how is the consumer supposed to iterate through
> them? Children of a MessageSet might be Message objects OR MessageSet
> objects?
>
> > Gzipped payload is a fully wrapped Message (with headers), not just payload
> > ---------------------------------------------------------------------------
> >
> > Key: KAFKA-406
> > URL: https://issues.apache.org/jira/browse/KAFKA-406
> > Project: Kafka
> > Issue Type: Bug
> > Components: core
> > Affects Versions: 0.7.1
> > Environment: N/A
> > Reporter: Lorenzo Alberton
> >
> > When creating a gzipped MessageSet, the collection of Messages is passed
> > to CompressionUtils.compress(), where each message is serialised [1] into a
> > buffer (not just the payload, the full Message with headers, uncompressed),
> > then gzipped, and finally wrapped into another Message [2].
> >
> > In other words, the consumer has to unwrap the Message flagged as
> > gzipped, unzip the payload, and unwrap the unzipped payload again as a
> > non-compressed Message.
> >
> > Is this double-wrapping the intended behaviour?
> >
> > [1] messages.foreach(m => m.serializeTo(messageByteBuffer))
> > [2] new Message(outputStream.toByteArray, compressionCodec)