Thanks Niek, good point; I can certainly do that and limit the payload arbitrarily, based on size (I just need to include my "contextual" information again in the next message), or based on latency requirements (which I was planning to do for the message set too).
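Roughly what I have in mind for packing a group into one message (an untested sketch; the consumer would reverse it with a DataInputStream, and the record contents here are just placeholders):

    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.util.Arrays;
    import java.util.List;

    public class GroupPacker {
        // Pack N logical records (context record first) into one byte[]
        // payload so they always travel inside a single Kafka message.
        public static byte[] pack(List<byte[]> records) throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(records.size());     // record count
            for (byte[] r : records) {
                out.writeInt(r.length);       // length prefix
                out.write(r);                 // record bytes
            }
            out.flush();
            return buf.toByteArray();
        }

        public static void main(String[] args) throws IOException {
            byte[] payload = pack(Arrays.asList(
                "context".getBytes(), "payload-1".getBytes()));
            System.out.println(payload.length + " bytes");
        }
    }

A four-byte count plus per-record length prefixes keeps the framing trivial, and for small payloads it costs far less than a per-message Kafka header.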
I guess I sort of thought that message sets might be a built-in concept that already did what I wanted. Since many of my payloads may be smaller than the message header anyway, this will probably save me a bunch of space too.

On Jul 7, 2012, at 3:30 PM, Niek Sanders <niek.sand...@gmail.com> wrote:

> Graham,
>
> If you have a collection of data that should always be sent and
> consumed together and in order, why not send it using a single Kafka
> message? Or is the payload really huge?
>
> - Niek
>
>
> On Sat, Jul 7, 2012 at 10:18 AM, graham sanderson <gra...@vast.com> wrote:
>> 1) I would like to guarantee that a group of messages is always delivered
>> in its entirety (because there is contextual information in messages which
>> precede other messages). I'm a little confused by the use of the term
>> "nested message sets", since I don't really see much in the code (though I
>> don't really know Scala); perhaps this refers to the fact that you can have
>> a set of messages within a message set file on disk. Anyway, I was curious
>> (I'm using the Java API now, but may move to Scala later) what I need to do
>> to guarantee N messages are sent and delivered as a single message set. Is a
>> single ProducerData with a List of messages always sent as a single message
>> set? Does compression need to be turned on? How does this affect network
>> limits, etc. (i.e. does the entire message set have to fit)? I'm also
>> assuming that once I have my message set containing all my messages, it will
>> be discarded in its entirety.
>>
>> 2) Related to 1), from the consumer side: can I tell the boundaries of a
>> message set (perhaps not required for me)? Nevertheless, I do want to make
>> sure I receive the entire set in one go (again, do I have to set network
>> limits accordingly?). The docs say that the entire message set is always
>> delivered to the client when compressed, but I'm not sure whether it can be
>> subdivided when not compressed. Note that I'm happy to stick with
>> compression if required.
>>
>> 3) So I'm using the ZookeeperConsumerConnector, since I don't want to
>> manage finding the brokers myself, but I was wondering whether there are any
>> plans to decouple the consumer offset tracking from it. One of my use cases
>> is that I'll have a lot of ad-hoc, one-off consumers that simply read a
>> subset of data until they die; from looking at ConsoleConsumer, there is
>> currently a hack that simply deletes the ZooKeeper info after the fact to
>> get around this.
>>
>> Thanks,
>>
>> Graham.
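P.S. For reference, here is roughly how I'd send the batch with the 0.7 Java API (an untested sketch; the topic name and ZooKeeper address are placeholders, and whether one send() of a List always lands as a single compressed message set is exactly what I'm asking in 1)):

    import java.util.Arrays;
    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.javaapi.producer.ProducerData;
    import kafka.producer.ProducerConfig;

    public class BatchSend {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zk.connect", "localhost:2181");  // placeholder ZK address
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("compression.codec", "1");        // 1 = gzip in 0.7

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

            // One ProducerData carrying the whole group: context first,
            // then the payloads that depend on it.
            producer.send(new ProducerData<String, String>(
                "my-topic", Arrays.asList("context", "payload-1", "payload-2")));
            producer.close();
        }
    }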
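P.P.S. The ConsoleConsumer-style cleanup I mentioned in 3) amounts to something like this (a sketch assuming ZkClient's deleteRecursive; the group id and ZooKeeper address are placeholders):

    import org.I0Itec.zkclient.ZkClient;

    public class GroupCleanup {
        public static void main(String[] args) {
            String zkConnect = "localhost:2181";  // placeholder ZK address
            String groupId = "adhoc-group-123";   // the throwaway group id
            ZkClient zk = new ZkClient(zkConnect, 30000, 30000);
            try {
                // A group's offset/ownership state lives under
                // /consumers/<groupId>; delete it once the one-off
                // consumer is done with it.
                zk.deleteRecursive("/consumers/" + groupId);
            } finally {
                zk.close();
            }
        }
    }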