That is true, but you want lots of messages in each object you write to a particular S3 bucket, so you need some kind of separator or delimiter.
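As a minimal sketch of that point: if each S3 object is just many messages concatenated, a delimiter (newline is assumed here; the class and method names are hypothetical, not from any Kafka or AWS API) lets a reader split the object back into messages:

```java
import java.util.Arrays;
import java.util.List;

public class S3BatchFormat {
    // Join raw messages with a newline delimiter so a single S3 object
    // can hold many messages and still be split apart after download.
    static String join(List<String> messages) {
        StringBuilder sb = new StringBuilder();
        for (String m : messages) {
            sb.append(m).append('\n');
        }
        return sb.toString();
    }

    // Recover the individual messages from a downloaded object body.
    static List<String> split(String blob) {
        return Arrays.asList(blob.split("\n"));
    }

    public static void main(String[] args) {
        String blob = join(Arrays.asList("event1", "event2"));
        List<String> back = split(blob); // recovers ["event1", "event2"]
        System.out.println(back);
    }
}
```

Obviously this only works if the messages themselves can't contain the delimiter; a length-prefixed framing would avoid that restriction.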
-Jay

On Wed, May 23, 2012 at 11:34 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
> I've always hoped that since Kafka is agnostic about message payload
> format (right?), that written format might be too... but maybe that is
> a bit oversimplified.
>
> Russell Jurney http://datasyndrome.com
>
> On May 23, 2012, at 11:19 AM, S Ahmed <sahmed1...@gmail.com> wrote:
>
> >> Kafka handles scaling the consumption while making sure each
> >> consumer gets a subset of data.
>
> > Is there a writeup on the algorithm used to do that? Sounds interesting :)
> >
> > Agreed, this sounds like more of a contrib.
> >
> > On Wed, May 23, 2012 at 1:49 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> >> Basically it would just be a consumer that wrote to S3. Kafka handles
> >> scaling the consumption while making sure each consumer gets a subset
> >> of data. Probably we could make some command-line tool. You would need
> >> some way to let the user control the format of the S3 data in a
> >> pluggable fashion. It could be a contrib package, or even just a
> >> separate GitHub mini-project, since it just works off the public API
> >> and would really just be used by people who want to get stuff into S3.
> >>
> >> -Jay
> >>
> >> On Wed, May 23, 2012 at 8:21 AM, S Ahmed <sahmed1...@gmail.com> wrote:
> >>
> >>> What would be needed to do this?
> >>>
> >>> Just thinking off the top of my head:
> >>>
> >>> 1. Create a ZooKeeper store to keep track of the last message offset
> >>> persisted to S3, and which messages each consumer is processing.
> >>>
> >>> 2. Pull messages off, group them in whatever grouping you want (per
> >>> message, 10 messages, etc.), spin off an ExecutorService to push to
> >>> S3, and update the ZooKeeper offset.
> >>>
> >>> I'm new to Kafka, but I would have to investigate how multiple
> >>> consumers can pull messages and push to S3 while not having the
> >>> consumers pull the same messages.
> >>> Setting up a ZooKeeper store to track progress specifically for what
> >>> has been pushed to S3.
> >>>
> >>> On Wed, May 23, 2012 at 1:35 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
> >>>
> >>>> Yeah, no kidding. I keep waiting on one :)
> >>>>
> >>>> Russell Jurney http://datasyndrome.com
> >>>>
> >>>> On May 22, 2012, at 10:31 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >>>>
> >>>>> No. Patches accepted.
> >>>>>
> >>>>> -Jay
> >>>>>
> >>>>> On Tue, May 22, 2012 at 10:23 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
> >>>>>
> >>>>>> Is there a simple way to dump Kafka events to S3 yet?
> >>>>>>
> >>>>>> Russell Jurney http://datasyndrome.com
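The two steps quoted above (buffer a group of messages, push the batch to S3, then record the last offset) could be sketched roughly like this. This is not actual Kafka or AWS code: the upload and offset commit are stubbed with a single callback, and all names here are hypothetical; a real version would call the AWS SDK and ZooKeeper behind that callback, perhaps from an ExecutorService as suggested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical sketch of the batching S3 sink described in the thread.
public class S3Sink {
    private final int batchSize;
    // Callback standing in for "upload this payload to S3, then commit
    // this offset to ZooKeeper": (batchPayload, lastOffsetInBatch).
    private final BiConsumer<String, Long> uploadAndCommit;
    private final List<String> buffer = new ArrayList<>();
    private long lastOffset = -1;

    public S3Sink(int batchSize, BiConsumer<String, Long> uploadAndCommit) {
        this.batchSize = batchSize;
        this.uploadAndCommit = uploadAndCommit;
    }

    // Called for each message pulled off the stream, with its offset.
    public void onMessage(String message, long offset) {
        buffer.add(message);
        lastOffset = offset;
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Write the delimited batch, then checkpoint the offset, so a restart
    // resumes after the last message that actually reached S3. Committing
    // only after the upload gives at-least-once (not exactly-once) delivery.
    public void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        StringBuilder sb = new StringBuilder();
        for (String m : buffer) {
            sb.append(m).append('\n');
        }
        uploadAndCommit.accept(sb.toString(), lastOffset);
        buffer.clear();
    }
}
```

Committing the offset only after the upload succeeds means a crash between upload and commit can re-deliver a batch, which is usually the acceptable trade-off for a sink like this.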