Ben,

You can create a "materialized path" for each field in the document:

{
  ["user", "firstName"]: "ben",
  ["user", "skills", <TimeUUID>]: "java",
  ["user", "skills", <TimeUUID>]: "javascript",
  ["user", "skills", <TimeUUID>]: "html",
  ["user", "education", "school"]: "cmu",
  ["user", "education", "major"]: "computer science"
}

This way each field could be independently updated, and you can take
sub-document slices with queries such as "give me everything under
user/skills."

Rick

On Thursday, March 29, 2012 at 7:27 AM, Ben McCann wrote:

> Could you explain further how I would use CASSANDRA-3647? There's still
> very little documentation on composite columns and it was not clear to me
> whether they could be used to store document oriented data. Say for
> example that I had a document like:
>
> user: {
>   firstName: 'ben',
>   skills: ['java', 'javascript', 'html'],
>   education {
>     school: 'cmu',
>     major: 'computer science'
>   }
> }
>
> How would I flatten this to be stored and then reconstruct the document?
>
>
> On Thu, Mar 29, 2012 at 5:44 AM, Jake Luciani <jak...@gmail.com> wrote:
>
> > Is there a reason you would prefer a JSONType over CASSANDRA-3647? It
> > would seem the only thing a JSON type offers you is validation. 3647 takes
> > it much further by deconstructing a JSON document using composite columns
> > to flatten the document out, with the ability to access and update portions
> > of the document (as well as reconstruct it).
> >
> > On Wed, Mar 28, 2012 at 11:58 AM, Ben McCann <b...@benmccann.com> wrote:
> >
> > > Hi,
> > >
> > > I was wondering if it would be interesting to add some type of
> > > document-oriented data type.
> > >
> > > I've found it somewhat awkward to store document-oriented data in
> > > Cassandra today. I can make a JSON/Protobuf/Thrift, serialize it, and
> > > store it, but Cassandra cannot differentiate it from any other string
> > > or byte array.
> > > However, if my column validation_class could be a JsonType that would
> > > allow tools to potentially do more interesting introspection on the
> > > column value. E.g. bug 3647
> > > <https://issues.apache.org/jira/browse/CASSANDRA-3647> calls for
> > > supporting arbitrarily nested "documents" in CQL. Running a
> > > query against the JSON column in Pig is possible as well, but again in
> > > this use case it would be helpful to be able to encode in column
> > > metadata that the column is stored as JSON. For debugging, running
> > > nightly reports, etc. it would be quite useful compared to the opaque
> > > string and byte array types we have today. JSON is appealing because
> > > it would be easy to implement. Something like Thrift or Protocol
> > > Buffers would actually be interesting since they would be more space
> > > efficient. However, they would also be a bit more difficult to
> > > implement because of the extra typing information they provide. I'm
> > > hoping with Cassandra 1.0's addition of compression that storing JSON
> > > is not too inefficient.
> > >
> > > Would there be interest in adding a JsonType? I could look at putting a
> > > patch together.
> > >
> > > Thanks,
> > > Ben
> >
> > --
> > http://twitter.com/tjake
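
For what it's worth, here is a rough Python sketch of the materialized-path idea from the top of this message. It works on plain in-memory dicts, not against a real Cassandra client, so treat it as an illustration of the flattening only. The function names (flatten, slice_under, rebuild) are made up for this example, uuid.uuid1() is assumed to be a reasonable stand-in for a TimeUUID component, and a tuple stands in for the composite column name. Empty lists and empty sub-documents produce no columns, so they would not survive a round trip.

```python
import uuid


def flatten(doc, prefix=()):
    """Flatten a nested dict/list into {path_tuple: leaf_value}."""
    flat = {}
    if isinstance(doc, dict):
        for key, value in doc.items():
            flat.update(flatten(value, prefix + (key,)))
    elif isinstance(doc, list):
        for item in doc:
            # Assumption: a time-based UUID stands in for the TimeUUID
            # path component, so each list element gets its own column.
            flat.update(flatten(item, prefix + (uuid.uuid1(),)))
    else:
        flat[prefix] = doc
    return flat


def slice_under(flat, prefix):
    """Return every column whose path starts with the given prefix."""
    n = len(prefix)
    return {path: value for path, value in flat.items() if path[:n] == prefix}


def rebuild(flat):
    """Reassemble the nested document from its flattened columns."""
    root = {}
    for path, value in flat.items():
        node = root
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return _restore_lists(root)


def _restore_lists(node):
    """Turn any dict keyed only by UUIDs back into a list, ordered by UUID timestamp."""
    if not isinstance(node, dict):
        return node
    if node and all(isinstance(key, uuid.UUID) for key in node):
        ordered = sorted(node, key=lambda u: u.time)
        return [_restore_lists(node[key]) for key in ordered]
    return {key: _restore_lists(value) for key, value in node.items()}
```

With Ben's example document, flatten(doc) gives six columns, slice_under(flat, ("user", "skills")) is the "everything under user/skills" query, and rebuild(flat) puts the document back together. Updating one field is then just overwriting one entry, e.g. flat[("user", "firstName")] = "benjamin".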