I've always hoped that, since Kafka is agnostic about message payload format (right?), the written format might be too... but maybe that is a bit oversimplified.

Russell Jurney http://datasyndrome.com

On May 23, 2012, at 11:19 AM, S Ahmed <sahmed1...@gmail.com> wrote:

>> Kafka handles
>> scaling the consumption while making sure each consumer gets a subset of
>> data.
>
> Is there a writeup on the algorithm used to do that? Sounds interesting :)
>
> Agreed, this sounds like more of a contrib.
>
> On Wed, May 23, 2012 at 1:49 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
>> Basically it would just be a consumer that wrote to S3. Kafka handles
>> scaling the consumption while making sure each consumer gets a subset of
>> data. Probably we could make some command line tool. You would need some
>> way to let the user control the format of the S3 data in a pluggable
>> fashion. It could be a contrib package, or even just a separate github
>> mini-project since it just works off the public api and would really just
>> be used by people who want to get stuff into S3.
>>
>> -Jay
>>
>> On Wed, May 23, 2012 at 8:21 AM, S Ahmed <sahmed1...@gmail.com> wrote:
>>
>>> What would be needed to do this?
>>>
>>> Just thinking off the top of my head:
>>>
>>> 1. create a zookeeper store to keep track of the last message offset
>>> persisted to s3, and which messages each consumer is processing.
>>>
>>> 2. pull messages off and group in whatever grouping you want (per message,
>>> 10 messages, etc.), and spin off a executorservice to push to s3, update
>>> the zookeeper offset.
>>>
>>> I'm new to kafka, but I would have to investigate on how multiple consumers
>>> can pull messages and push to s3, while not having the consumers pull the
>>> same messages.
>>> Setting up a zookeeper store to track progress specifically for what has
>>> been pushed to s3.
>>>
>>> On Wed, May 23, 2012 at 1:35 AM, Russell Jurney <russell.jur...@gmail.com>
>>> wrote:
>>>
>>>> Yeah, no kidding. I keep waiting on one :)
>>>>
>>>> Russell Jurney http://datasyndrome.com
>>>>
>>>> On May 22, 2012, at 10:31 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>
>>>>> No. Patches accepted.
>>>>>
>>>>> -Jay
>>>>>
>>>>> On Tue, May 22, 2012 at 10:23 PM, Russell Jurney
>>>>> <russell.jur...@gmail.com> wrote:
>>>>>
>>>>>> Is there a simple way to dump Kafka events to S3 yet?
>>>>>>
>>>>>> Russell Jurney http://datasyndrome.com
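
For what it's worth, the shape discussed in this thread (pull messages, group them into batches, write each batch through a pluggable formatter, then record the last offset persisted) can be sketched without touching any real Kafka or S3 API. Everything below is a hypothetical stand-in for illustration: `InMemoryStore` plays the role of the S3 bucket, the `(offset, payload)` tuples play the role of consumed messages, and the returned offset is what you would persist in ZooKeeper. The names `archive`, `newline_delimited`, and the key layout are assumptions, not anything from Kafka itself.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Message = Tuple[int, bytes]                 # (offset, raw payload)
Formatter = Callable[[List[bytes]], bytes]  # pluggable: payloads -> object body


def newline_delimited(payloads: List[bytes]) -> bytes:
    """One example format: payloads written as-is, one per line."""
    return b"\n".join(payloads) + b"\n"


class InMemoryStore:
    """Hypothetical stand-in for an S3 bucket (key -> object body)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, body: bytes) -> None:
        self.objects[key] = body


def _flush(batch: List[Message], store: InMemoryStore,
           formatter: Formatter, topic: str) -> int:
    """Write one batch as a single object and return its last offset."""
    first, last = batch[0][0], batch[-1][0]
    # Zero-padded offsets keep keys in offset order when listed.
    key = f"{topic}/{first:020d}-{last:020d}"
    store.put(key, formatter([payload for _, payload in batch]))
    return last


def archive(messages: Iterable[Message], store: InMemoryStore,
            formatter: Formatter, batch_size: int,
            topic: str = "events", committed: int = -1) -> int:
    """Batch messages into the store; return the last offset persisted.

    `committed` is the offset recorded by a previous run; anything at or
    below it is skipped so a restart does not re-upload old messages.
    """
    batch: List[Message] = []
    for offset, payload in messages:
        if offset <= committed:
            continue
        batch.append((offset, payload))
        if len(batch) == batch_size:
            # Advance the offset only after the write has succeeded.
            committed = _flush(batch, store, formatter, topic)
            batch = []
    if batch:  # final partial batch
        committed = _flush(batch, store, formatter, topic)
    return committed


if __name__ == "__main__":
    msgs = [(i, f"m{i}".encode()) for i in range(5)]
    bucket = InMemoryStore()
    print(archive(msgs, bucket, newline_delimited, batch_size=2))
    print(sorted(bucket.objects))
```

Because the offset is advanced only after the write, a crash between the two steps means a batch may be written twice on restart (at-least-once), which is harmless here as long as the same batch maps to the same key. Swapping `newline_delimited` for another function is the "pluggable format" part.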