Let me ask my boss what I can share. Let's talk off the mailing list.

Regards,
Vaibhav

On Wed, Mar 21, 2012 at 1:44 PM, Russell Jurney <russell.jur...@gmail.com> wrote:

> You have code that puts records in bigger blocks on s3? Plz to share? :)
>
> Russell Jurney http://datasyndrome.com
>
> On Mar 21, 2012, at 1:37 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
>
> > We also have s3 files organized by date in the following fashion.
> >
> > yyyy/MM/dd/hh
> >
> > Our messages are in JSON.
> >
> > Regards,
> > Vaibhav
> >
> > On Wed, Mar 21, 2012 at 1:33 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
> >
> >> I want the S3 files to be organized by type and date. Folders for types,
> >> subfolders for date down to the hour: year/month/day/hour. All payloads of
> >> a given type get written together.
> >>
> >> It would be ideal if there was no integration with the end format, but in
> >> practice I'm not sure if all the serialization protocols mentioned can be
> >> written in this way.
> >>
> >> Russell Jurney http://datasyndrome.com
> >>
> >> On Mar 21, 2012, at 12:50 PM, Tim Lossen <t...@lossen.de> wrote:
> >>
> >>> another good option would be messagepack -- flexible & schemaless like
> >>> json, but binary.
> >>>
> >>> Sent from my iPhone
> >>>
> >>> On 21 Mar 2012, at 20:46, Russell Jurney <russell.jur...@gmail.com> wrote:
> >>>
> >>>> I'm going to use thrift, avro or protobuf for serialization.
> >>>>
> >>>> Russell Jurney http://datasyndrome.com
> >>>>
> >>>> On Mar 21, 2012, at 11:59 AM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> >>>>
> >>>>> I would use the payload. I want the message to be exactly as it is. We want
> >>>>> to name the files as per topic.
> >>>>> (That's how we differentiate right now).
> >>>>>
> >>>>> Regards,
> >>>>> Vaibhav
> >>>>>
> >>>>> On Wed, Mar 21, 2012 at 11:01 AM, Niek Sanders <niek.sand...@gmail.com> wrote:
> >>>>>
> >>>>>> So what would you like the S3 files to actually look like?
> >>>>>>
> >>>>>> One Kafka message body per line? Should the message topic be tossed
> >>>>>> in there too?
> >>>>>>
> >>>>>> A tricky aspect is that the Kafka message body is an opaque byte
> >>>>>> array. For my own case I'm using JSON for the payload so it makes my
> >>>>>> requirements simpler.
> >>>>>>
> >>>>>> - Niek
> >>>>>>
> >>>>>> On Tue, Mar 20, 2012 at 10:07 PM, Russell Jurney
> >>>>>> <russell.jur...@gmail.com> wrote:
> >>>>>>> I want events in S3 to process them in Hadoop. I'd like to emit them in
> >>>>>>> my app, and have them magically show up in 64MB chunks on S3. Like most
> >>>>>>> everyone else.
> >>>>>>>
> >>>>>>> Russell Jurney http://datasyndrome.com
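
For anyone reading the archive later, the layout discussed in this thread (one folder per topic, date subfolders down to the hour, payloads batched into large chunks) could be sketched roughly as below. This is not Vaibhav's code, which was never posted to the list. The names `hour_prefix`, `Chunker`, and the `upload` callback are hypothetical, the use of UTC for the hour bucket is an assumption, and one-payload-per-line only makes sense for text payloads such as JSON, since a binary payload (thrift, avro, protobuf, messagepack) can itself contain newline bytes.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List

CHUNK_BYTES = 64 * 1024 * 1024  # the 64MB target mentioned in the thread


def hour_prefix(topic: str, ts: datetime) -> str:
    """Return '<topic>/yyyy/MM/dd/hh' for a timezone-aware timestamp.

    Converting to UTC first is an assumption; the thread does not say
    which time zone the hour folders use.
    """
    return f"{topic}/{ts.astimezone(timezone.utc):%Y/%m/%d/%H}"


class Chunker:
    """Buffers payloads per topic and hour, one per line, and hands a
    chunk to `upload` once the buffer reaches `limit` bytes."""

    def __init__(self, upload: Callable[[str, bytes], None],
                 limit: int = CHUNK_BYTES) -> None:
        self.upload = upload              # hypothetical: wraps your S3 client
        self.limit = limit
        self.lines: Dict[str, List[bytes]] = {}
        self.sizes: Dict[str, int] = {}
        self.seq: Dict[str, int] = {}

    def add(self, topic: str, ts: datetime, payload: bytes) -> None:
        prefix = hour_prefix(topic, ts)
        line = payload + b"\n"            # payload is written unchanged
        self.lines.setdefault(prefix, []).append(line)
        self.sizes[prefix] = self.sizes.get(prefix, 0) + len(line)
        if self.sizes[prefix] >= self.limit:
            self.flush(prefix)

    def flush(self, prefix: str) -> None:
        """Upload whatever is buffered for this prefix, if anything."""
        buf = self.lines.pop(prefix, [])
        self.sizes.pop(prefix, None)
        if not buf:
            return
        n = self.seq.get(prefix, 0)
        self.seq[prefix] = n + 1          # next chunk in this hour gets n + 1
        self.upload(f"{prefix}/chunk-{n:05d}", b"".join(buf))
```

A real consumer would also call `flush` for every open prefix once an hour has passed, so that low-volume topics do not sit in memory waiting to reach the size limit.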