bump

On Wed, Mar 21, 2012 at 10:01 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> Let me ask my boss what I can share. Let's talk off the mailing list.
>
> Regards,
> Vaibhav
>
> On Wed, Mar 21, 2012 at 1:44 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
>
> > You have code that puts records in bigger blocks on s3? Plz to share? :)
> >
> > Russell Jurney http://datasyndrome.com
> >
> > On Mar 21, 2012, at 1:37 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> >
> > > We also have s3 files organized by date in the following fashion.
> > >
> > > yyyy/MM/dd/hh
> > >
> > > Our messages are in JSON.
> > >
> > > Regards,
> > > Vaibhav
> > >
> > > On Wed, Mar 21, 2012 at 1:33 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
> > >
> > >> I want the S3 files to be organized by type and date. Folders for types,
> > >> subfolders for date down to the hour: year/month/day/hour. All payloads
> > >> of a given type get written together.
> > >>
> > >> It would be ideal if there was no integration with the end format, but
> > >> in practice I'm not sure if all the serialization protocols mentioned
> > >> can be written in this way.
> > >>
> > >> Russell Jurney http://datasyndrome.com
> > >>
> > >> On Mar 21, 2012, at 12:50 PM, Tim Lossen <t...@lossen.de> wrote:
> > >>
> > >>> another good option would be messagepack -- flexible & schemaless
> > >>> like json, but binary.
> > >>>
> > >>> Sent from my iPhone
> > >>>
> > >>> On 21 Mar 2012, at 20:46, Russell Jurney <russell.jur...@gmail.com> wrote:
> > >>>
> > >>>> I'm going to use thrift, avro or protobuf for serialization.
> > >>>>
> > >>>> Russell Jurney http://datasyndrome.com
> > >>>>
> > >>>> On Mar 21, 2012, at 11:59 AM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> > >>>>
> > >>>>> I would use the payload. I want the message to be exactly as it is.
> > >>>>> We want to name the files as per topic.
> > >>>>> (That's how we differentiate right now).
> > >>>>>
> > >>>>> Regards,
> > >>>>> Vaibhav
> > >>>>>
> > >>>>> On Wed, Mar 21, 2012 at 11:01 AM, Niek Sanders <niek.sand...@gmail.com> wrote:
> > >>>>>
> > >>>>>> So what would you like the S3 files to actually look like?
> > >>>>>>
> > >>>>>> One Kafka message body per line? Should the message topic be
> > >>>>>> tossed in there too?
> > >>>>>>
> > >>>>>> A tricky aspect is that the Kafka message body is an opaque byte
> > >>>>>> array. For my own case I'm using JSON for the payload so it makes
> > >>>>>> my requirements simpler.
> > >>>>>>
> > >>>>>> - Niek
> > >>>>>>
> > >>>>>> On Tue, Mar 20, 2012 at 10:07 PM, Russell Jurney
> > >>>>>> <russell.jur...@gmail.com> wrote:
> > >>>>>>> I want events in S3 to process them in Hadoop. I'd like to emit
> > >>>>>>> them in my app, and have them magically show up in 64MB chunks
> > >>>>>>> on S3. Like most everyone else.
> > >>>>>>>
> > >>>>>>> Russell Jurney http://datasyndrome.com

--
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com
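
For reference, the layout discussed in this thread (one folder per topic, date subfolders down to the hour, one payload per line, objects of roughly 64MB) could be sketched as below. This is only an illustration, not code shared by anyone on the list: `s3_key`, `ChunkBuffer`, the `part-NNNNN.log` naming and the `upload` callback are hypothetical, payloads are assumed to contain no newlines (e.g. single-line JSON), and the actual S3 upload is left to the caller.

```python
from datetime import datetime
from typing import Callable, List, Optional

CHUNK_BYTES = 64 * 1024 * 1024  # target object size mentioned in the thread


def s3_key(topic: str, hour: datetime, seq: int) -> str:
    """Build a key of the form topic/yyyy/MM/dd/hh/part-NNNNN.log (hypothetical naming)."""
    return f"{topic}/{hour:%Y/%m/%d/%H}/part-{seq:05d}.log"


class ChunkBuffer:
    """Buffer payloads for one topic and hand them to `upload` one chunk at a time."""

    def __init__(self, topic: str, upload: Callable[[str, bytes], None],
                 chunk_bytes: int = CHUNK_BYTES) -> None:
        self.topic = topic
        self.upload = upload            # assumed callback: (key, body) -> None
        self.chunk_bytes = chunk_bytes
        self.lines: List[bytes] = []
        self.size = 0
        self.seq = 0
        self.hour: Optional[datetime] = None

    def add(self, payload: bytes, ts: datetime) -> None:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        # Never mix two different hours in one object.
        if self.lines and hour != self.hour:
            self.flush()
        if not self.lines:
            self.hour = hour
        self.lines.append(payload + b"\n")   # one message body per line
        self.size += len(payload) + 1
        if self.size >= self.chunk_bytes:
            self.flush()

    def flush(self) -> None:
        """Upload whatever is buffered; a no-op when the buffer is empty."""
        if not self.lines:
            return
        self.upload(s3_key(self.topic, self.hour, self.seq), b"".join(self.lines))
        self.seq += 1
        self.lines = []
        self.size = 0
```

A caller would create one `ChunkBuffer` per topic, call `add(payload, timestamp)` for each message, and call `flush()` on shutdown. Binary formats such as thrift, avro, protobuf or messagepack would need length-prefixed framing instead of the newline separator used here.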