> Kafka handles scaling the consumption while making sure each consumer
> gets a subset of data.

Is there a writeup on the algorithm used to do that? Sounds interesting :)
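
To make the question concrete, here is a rough sketch of one simple way a consumer group could split partitions so that each partition has exactly one owner. It is an illustration only and is not taken from the Kafka source; the function name `assign_partitions` and the consumer ids are made up for the example. The idea is that every consumer sorts the same two lists and takes a contiguous slice, so all of them arrive at the same answer without talking to each other.

```python
from typing import Dict, Hashable, Iterable, List


def assign_partitions(
    partitions: Iterable[int], consumers: Iterable[Hashable]
) -> Dict[Hashable, List[int]]:
    """Give each consumer a contiguous, non-overlapping slice of partitions.

    Hypothetical helper for illustration; not Kafka's actual code.
    """
    sorted_partitions = sorted(partitions)
    sorted_consumers = sorted(consumers)
    if not sorted_consumers:
        return {}

    # Every consumer gets `base` partitions; the first `extra` get one more.
    base, extra = divmod(len(sorted_partitions), len(sorted_consumers))

    assignment: Dict[Hashable, List[int]] = {}
    start = 0
    for index, consumer in enumerate(sorted_consumers):
        count = base + (1 if index < extra else 0)
        assignment[consumer] = sorted_partitions[start:start + count]
        start += count
    return assignment


if __name__ == "__main__":
    print(assign_partitions(range(5), ["consumer-b", "consumer-a"]))
```

When a consumer joins or leaves, each remaining member would rerun the same calculation on the new membership list, which is presumably where the group coordination comes in.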
Agreed, this sounds like more of a contrib.

On Wed, May 23, 2012 at 1:49 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Basically it would just be a consumer that wrote to S3. Kafka handles
> scaling the consumption while making sure each consumer gets a subset of
> data. Probably we could make some command line tool. You would need some
> way to let the user control the format of the S3 data in a pluggable
> fashion. It could be a contrib package, or even just a separate github
> mini-project since it just works off the public api and would really just
> be used by people who want to get stuff into S3.
>
> -Jay
>
> On Wed, May 23, 2012 at 8:21 AM, S Ahmed <sahmed1...@gmail.com> wrote:
>
> > What would be needed to do this?
> >
> > Just thinking off the top of my head:
> >
> > 1. create a zookeeper store to keep track of the last message offset
> > persisted to s3, and which messages each consumer is processing.
> >
> > 2. pull messages off and group in whatever grouping you want (per
> > message, 10 messages, etc.), and spin off a executorservice to push to
> > s3, update the zookeeper offset.
> >
> > I'm new to kafka, but I would have to investigate on how multiple
> > consumers can pull messages and push to s3, while not having the
> > consumers pull the same messages.
> > Setting up a zookeeper store to track progress specifically for what has
> > been pushed to s3.
> >
> > On Wed, May 23, 2012 at 1:35 AM, Russell Jurney
> > <russell.jur...@gmail.com> wrote:
> >
> > > Yeah, no kidding. I keep waiting on one :)
> > >
> > > Russell Jurney http://datasyndrome.com
> > >
> > > On May 22, 2012, at 10:31 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > >
> > > > No. Patches accepted.
> > > >
> > > > -Jay
> > > >
> > > > On Tue, May 22, 2012 at 10:23 PM, Russell Jurney
> > > > <russell.jur...@gmail.com> wrote:
> > > >
> > > >> Is there a simple way to dump Kafka events to S3 yet?
> > > >>
> > > >> Russell Jurney http://datasyndrome.com
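
For anyone who wants to try this, the two steps S Ahmed lists above (batch the messages, push the batch to S3, then record the last persisted offset) might look roughly like the sketch below. Everything here is an assumption for illustration: `drain_to_store`, the `upload` and `commit_offset` callbacks, and the `events/...` key layout are hypothetical names, and plain in-memory objects stand in for the real S3 client and the ZooKeeper offset store. It also assumes offsets are integers that increase with each message.

```python
from typing import Callable, Iterable, List, Tuple


def drain_to_store(
    messages: Iterable[Tuple[int, bytes]],
    upload: Callable[[str, bytes], None],
    commit_offset: Callable[[int], None],
    batch_size: int = 10,
) -> int:
    """Batch (offset, payload) pairs, upload each batch, then record progress.

    Returns the number of batches uploaded. Hypothetical helper, not a Kafka API.
    """
    batch: List[Tuple[int, bytes]] = []
    uploaded = 0

    def flush() -> None:
        nonlocal uploaded
        first_offset, last_offset = batch[0][0], batch[-1][0]
        # Assumed key layout: the offset range makes each object name unique.
        key = "events/%020d-%020d" % (first_offset, last_offset)
        upload(key, b"\n".join(payload for _, payload in batch))
        # Record the offset only after the upload call has returned, so a
        # failed upload leaves the stored offset untouched.
        commit_offset(last_offset)
        batch.clear()
        uploaded += 1

    for offset, payload in messages:
        batch.append((offset, payload))
        if len(batch) >= batch_size:
            flush()
    if batch:  # flush the final partial batch
        flush()
    return uploaded


if __name__ == "__main__":
    fake_s3 = {}            # stand-in for an S3 bucket
    committed_offsets = []  # stand-in for the ZooKeeper offset node
    events = [(i, ("event-%d" % i).encode()) for i in range(25)]
    drain_to_store(events, fake_s3.__setitem__, committed_offsets.append)
    print(sorted(fake_s3), committed_offsets)
```

Because the offset is written only after the upload returns, a crash between the two would cause the last batch to be uploaded again on restart rather than lost, so whatever reads the S3 data should tolerate duplicates. The pluggable format Jay mentions would slot in where the payloads are joined with newlines.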