To be fair, you can test types as you parse JSON. But only a few. The Avro schemas even include comments... huge win.
Russell Jurney http://datasyndrome.com

On Aug 12, 2012, at 7:42 PM, Bill Graham <[email protected]> wrote:

The benefit of having a schema associated with your data should not be
understated. I think when debating whether to use JSON or some other data
serialization format that has a schema (like Avro), you should choose the
latter. The schema support alone will pay dividends over the long run.

On Sun, Aug 12, 2012 at 3:34 PM, Russell Jurney <[email protected]> wrote:

> You'll need to compress JSON. Avro can compress itself. Avro
> represents more types, you'll need to serialize your types beyond what
> json supports with annotation or by convention. JSON is simpler.
>
> Short answer: use JSON if its types are expressive enough for your
> data, and if you don't mind compressing it yourself. Avro has more
> types, has the schema onboard and self compresses.
>
> Russell Jurney
>
> On Aug 12, 2012, at 3:27 PM, Tatu Saloranta <[email protected]> wrote:
>
> > I would ask questions from specific subset of users: those with actual
> > experience in using both, to compare approaches. If you ask someone
> > who has only used one, all you get to know is that both can be made to
> > work well enough. Which of course may be enough for your needs. :-)
> >
> > -+ Tatu +-
> >
> > On Sun, Aug 12, 2012 at 10:32 AM, Harsh J <[email protected]> wrote:
> >> Moving this to the user@avro lists. Please use the right lists for the
> >> best answers and the right people.
> >>
> >> I'd pick Avro out of the two - it is very well designed for typed data
> >> and has a very good implementation of the serializer/deserializer,
> >> aside of the schema advantages. FWIW, Avro has a tojson CLI tool to
> >> dump Avro binary format out as JSON structures, which would be of help
> >> if you seek readability and/or integration with apps/systems that
> >> already depend on JSON.
> >>
> >> On Sun, Aug 12, 2012 at 10:41 PM, Mohit Anchlia <[email protected]> wrote:
> >>> We get data in Json format. I was initially thinking of simply
> >>> storing Json in hdfs for processing. I see there is Avro that does
> >>> the similar thing but most likely stores it in more optimized format.
> >>> I wanted to get users opinion on which one is better.
> >>
> >> --
> >> Harsh J

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
[email protected] going forward.*
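
The trade-offs raised in this thread can be sketched with nothing but Python's standard library. The example below is only an illustration under stated assumptions: the `Event` record and its field names are hypothetical, and no Avro library is used, so it shows the ideas (a schema that carries documentation, type checks done by hand when parsing JSON, and compressing JSON yourself) rather than Avro's actual binary encoding.

```python
import gzip
import json

# Hypothetical Avro-style record schema. Avro schemas are themselves JSON,
# and records and fields may carry a "doc" attribute for documentation.
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "Event",
  "doc": "Hypothetical event record, invented for this sketch.",
  "fields": [
    {"name": "user_id", "type": "long",   "doc": "Numeric user identifier."},
    {"name": "action",  "type": "string", "doc": "What the user did."},
    {"name": "score",   "type": "double", "doc": "Relevance score."}
  ]
}
""")

# The few primitive types that plain JSON parsing lets us check directly.
PYTHON_TYPES = {"long": int, "string": str, "double": float, "boolean": bool}


def check_record(record, schema=SCHEMA):
    """Return a list of type problems found in a parsed JSON object."""
    problems = []
    for field in schema["fields"]:
        name = field["name"]
        expected = PYTHON_TYPES[field["type"]]
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        wrong_bool = isinstance(value, bool) and expected is not bool
        if wrong_bool or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {field['type']}, got {type(value).__name__}"
            )
    return problems


def compress_lines(records):
    """Serialize records as newline-delimited JSON and gzip the result."""
    payload = "\n".join(json.dumps(r) for r in records).encode("utf-8")
    return gzip.compress(payload)


def decompress_lines(blob):
    """Reverse compress_lines: gunzip and parse one JSON object per line."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]
```

With plain JSON, both the validation step and the compression step are the application's responsibility, which is the extra work the thread attributes to choosing JSON. An Avro data file would instead store the schema alongside the data and handle compression itself.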
