There is no scanning; we compute the message location from the offset and begin fetching there.
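As a rough illustration of what "compute the location from the offset" can look like, here is a minimal sketch, not the actual broker code. It assumes that an offset is a byte position in the overall log, that the log is split into segment files each identified by the offset of its first byte, and that segment files are named with that base offset zero-padded to 20 digits plus ".kafka". The function names `locate` and `segment_file_name` are hypothetical.

```python
import bisect


def locate(segment_base_offsets, offset):
    """Map a log offset to (segment base offset, byte position in that segment).

    segment_base_offsets is a sorted list of the starting offsets of the
    segment files (an assumed input for this sketch). No upper-bound check is
    done here: an offset past the end of the log maps into the last segment.
    """
    if not segment_base_offsets or offset < segment_base_offsets[0]:
        raise ValueError("offset is before the start of the log")
    # Pick the rightmost segment whose base offset is <= the requested offset.
    i = bisect.bisect_right(segment_base_offsets, offset) - 1
    base = segment_base_offsets[i]
    # The position inside the file is simple arithmetic, so nothing is scanned.
    return base, offset - base


def segment_file_name(base):
    # Assumed naming scheme: base offset zero-padded to 20 digits.
    return f"{base:020d}.kafka"


if __name__ == "__main__":
    bases = [0, 536870912, 1073741824]
    base, position = locate(bases, 536871000)
    print(segment_file_name(base), position)
```

The lookup is a binary search over a small in-memory list followed by a subtraction, which is why the cost does not depend on where in the file the offset falls.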
On Jun 13, 2012, at 6:40 AM, S Ahmed <sahmed1...@gmail.com> wrote:

> I was thinking of replicating messages to a central location, and having a
> very long expire date on the messages (say, 1 year).
>
> My requirement would be to not just stream messages, but also access
> messages by key, similar to a "SELECT * FROM TABLE WHERE id=123".
>
> From what I understand, there is currently no index file that maps messages
> to their exact location in a file, correct? i.e. Kafka streams the messages,
> so it goes to a .kafka file, starts from the beginning, and streams the data
> to a consumer. If your offset happens to be in the middle of the file, it
> will scan the file, start at the beginning of the message, figure out the
> length of the message, and then jump to the position of the next message
> until it finds the correct message offset. Is this correct?
>
> i.e. I would have to create some sort of index that maps the offset to the
> 'messageId' (where the messageId is stored in the body of the message
> itself).