That is true, but you want lots of messages in a particular S3 bucket, so
you need some kind of separator or delimiter.

-Jay
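[Editor's note: a minimal sketch of one way to frame concatenated messages, since a textual delimiter breaks if payloads contain it. Length-prefixing is a common alternative; the function names here are illustrative, not from any Kafka tooling.]

```python
import struct

def frame_messages(messages):
    # Prefix each message with a 4-byte big-endian length so the
    # concatenated blob can be split even if payloads contain newlines
    # or any other would-be delimiter byte.
    out = bytearray()
    for m in messages:
        out += struct.pack(">I", len(m)) + m
    return bytes(out)

def unframe_messages(blob):
    # Walk the blob, reading each length header and then that many bytes.
    msgs, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        msgs.append(bytes(blob[pos:pos + n]))
        pos += n
    return msgs
```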

On Wed, May 23, 2012 at 11:34 AM, Russell Jurney
<russell.jur...@gmail.com> wrote:

> I've always hoped that since Kafka is agnostic about message payload
> format (right?), the written format might be too... but maybe that is
> a bit oversimplified.
>
> Russell Jurney http://datasyndrome.com
>
> On May 23, 2012, at 11:19 AM, S Ahmed <sahmed1...@gmail.com> wrote:
>
> >> Kafka handles
> >> scaling the consumption while making sure each consumer gets a subset of
> >> data.
> > Is there a writeup on the algorithm used to do that? Sounds
> > interesting :)
> >
> > Agreed, this sounds like more of a contrib.
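[Editor's note: the mechanism asked about above is Kafka's consumer-group rebalancing: a topic's partitions are divided among the live consumers in a group so each partition is read by exactly one of them. A rough illustrative sketch of range-style assignment, not Kafka's actual code:]

```python
def range_assign(partitions, consumers):
    # Sort both lists, then hand each consumer a contiguous range of
    # partitions; the first few consumers absorb any remainder.
    partitions, consumers = sorted(partitions), sorted(consumers)
    per, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for i, c in enumerate(consumers):
        count = per + (1 if i < extra else 0)
        assignment[c] = partitions[start:start + count]
        start += count
    return assignment
```

With 5 partitions and 2 consumers, one consumer gets 3 partitions and the other gets 2, so no message is delivered to two members of the same group.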
> >
> > On Wed, May 23, 2012 at 1:49 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> >> Basically it would just be a consumer that wrote to S3. Kafka handles
> >> scaling the consumption while making sure each consumer gets a subset of
> >> data. Probably we could make some command line tool. You would need some
> >> way to let the user control the format of the S3 data in a pluggable
> >> fashion. It could be a contrib package, or even just a separate github
> >> mini-project since it just works off the public API and would really
> >> just be used by people who want to get stuff into S3.
> >>
> >> -Jay
> >>
> >> On Wed, May 23, 2012 at 8:21 AM, S Ahmed <sahmed1...@gmail.com> wrote:
> >>
> >>> What would be needed to do this?
> >>>
> >>> Just thinking off the top of my head:
> >>>
> >>> 1. create a ZooKeeper store to keep track of the last message offset
> >>> persisted to S3, and which messages each consumer is processing.
> >>>
> >>> 2. pull messages off and group them however you want (per message,
> >>> 10 messages, etc.), then spin off an ExecutorService to push to S3
> >>> and update the ZooKeeper offset.
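[Editor's note: the two steps above can be sketched as a loop that uploads a batch and only then records its offset, so a crash re-delivers messages rather than losing them. `consume`, `upload_to_s3`, and `commit_offset` are hypothetical stand-ins, not real Kafka, ZooKeeper, or AWS client calls.]

```python
def archive_to_s3(consume, upload_to_s3, commit_offset, batch_size=10):
    # consume() yields (offset, message) pairs in order.
    batch, last_offset = [], None
    for offset, message in consume():
        batch.append(message)
        last_offset = offset
        if len(batch) >= batch_size:
            upload_to_s3(batch)         # step 2: push the grouped batch to S3
            commit_offset(last_offset)  # step 1: record progress only after upload
            batch, last_offset = [], None
    if batch:                           # flush any trailing partial batch
        upload_to_s3(batch)
        commit_offset(last_offset)
```

Committing the offset only after a successful upload gives at-least-once delivery to S3; duplicates after a crash can be deduplicated by keying objects on the offset range.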
> >>>
> >>> I'm new to Kafka, but I would have to investigate how multiple
> >>> consumers can pull messages and push to S3 without pulling the
> >>> same messages, and how to set up a ZooKeeper store to track
> >>> progress specifically for what has been pushed to S3.
> >>>
> >>>
> >>> On Wed, May 23, 2012 at 1:35 AM, Russell Jurney
> >>> <russell.jur...@gmail.com> wrote:
> >>>
> >>>> Yeah, no kidding. I keep waiting on one :)
> >>>>
> >>>> Russell Jurney http://datasyndrome.com
> >>>>
> >>>> On May 22, 2012, at 10:31 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >>>>
> >>>>> No. Patches accepted.
> >>>>>
> >>>>> -Jay
> >>>>>
> >>>>> On Tue, May 22, 2012 at 10:23 PM, Russell Jurney
> >>>>> <russell.jur...@gmail.com> wrote:
> >>>>>
> >>>>>> Is there a simple way to dump Kafka events to S3 yet?
> >>>>>>
> >>>>>> Russell Jurney http://datasyndrome.com
> >>>>>>
> >>>>
> >>>
> >>
>
