Filipe, this looks awesome - I'm sure lots of comments will come on this one :-)
A+
Dave

On 16 March 2011 09:54, Filipe Manana (JIRA) <[email protected]> wrote:
> Storing documents bodies as raw JSON binaries instead of serialized JSON terms
> ------------------------------------------------------------------------------
>
> Key: COUCHDB-1092
> URL: https://issues.apache.org/jira/browse/COUCHDB-1092
> Project: CouchDB
> Issue Type: Improvement
> Components: Database Core
> Reporter: Filipe Manana
> Assignee: Filipe Manana
>
>
> Currently we store documents as Erlang serialized (via the term_to_binary/1
> BIF) EJSON. The proposed patch changes the database file format so that,
> instead of storing serialized EJSON document bodies, it stores raw JSON
> binaries.
>
> The GitHub branch is at:
> https://github.com/fdmanana/couchdb/tree/raw_json_docs
>
> Advantages:
>
> * what we write to disk is much smaller - a raw JSON binary can easily get
>   up to 50% smaller (at least according to the tests I did)
>
> * when serving documents to a client we no longer need to JSON encode the
>   document body read from the disk - this applies to individual document
>   requests, view queries with ?include_docs=true, pull and push
>   replications, and possibly other use cases. We just grab its body and
>   prepend the _id, _rev and all the necessary metadata fields (this is via
>   simple Erlang binary operations)
>
> * we avoid the EJSON term copying between request handlers and the db
>   updater processes, between the work queues and the view updater process,
>   between replicator processes, etc
>
> * before sending a document to the JavaScript view server, we no longer
>   need to convert it from EJSON to JSON
>
> The changes done to the document write workflow are minimal - after JSON
> decoding the document's JSON into EJSON and removing the metadata top level
> fields (_id, _rev, etc), it JSON encodes the resulting EJSON body into a
> binary - this consumes CPU of course but it brings 2 advantages:
>
> 1) we avoid the EJSON copy between the request process and the database
>    updater process - for any realistic document size (4kb or more) this
>    can be very expensive, especially when there are many nested structures
>    (lists inside objects inside lists, etc)
>
> 2) before writing anything to the file, we do a
>    term_to_binary([Len, Md5, TheThingToWrite]) and then write the result to
>    the file. A term_to_binary call with a binary as the input is very fast
>    compared to a term_to_binary call with EJSON as input (or some other
>    nested structure)
>
> I think both compensate for the extra JSON encoding done after separating
> the metadata fields from the non-metadata fields.
>
> The following relaximation graph, for documents with sizes of 4Kb, shows a
> significant performance increase both for writes and reads - especially
> reads.
>
> http://graphs.mikeal.couchone.com/#/graph/698bf36b6c64dbd19aa2bef63400b94f
>
>
> I've also run a few tests to see how large the improvement is when querying
> a view for the first time, without ?stale=ok. The size difference of the
> databases (after compaction) is also very significant - this change can
> reduce the size by at least 50% in common cases.
>
> The test databases were created in an instance built from that experimental
> branch. Then they were replicated into a CouchDB instance built from the
> current trunk. At the end both databases were compacted (to fairly compare
> their final sizes).
>
> The databases contain the following view:
> {
>   "_id": "_design/test",
>   "language": "javascript",
>   "views": {
>     "simple": {
>       "map": "function(doc) { emit(doc.float1, doc.strings[1]); }"
>     }
>   }
> }
>
>
> ## Database with 500 000 docs of 2.5Kb each
>
> Document template is at:
> https://github.com/fdmanana/couchdb/blob/raw_json_docs/doc_2_5k.json
>
> Sizes (branch vs trunk):
>
> $ du -m couchdb/tmp/lib/disk_json_test.couch
> 1996 couchdb/tmp/lib/disk_json_test.couch
>
> $ du -m couchdb-trunk/tmp/lib/disk_ejson_test.couch
> 2693 couchdb-trunk/tmp/lib/disk_ejson_test.couch
>
>
> Time, from a user's perspective, to build the view index from scratch:
>
> $ time curl http://localhost:5984/disk_json_test/_design/test/_view/simple?limit=1
> {"total_rows":500000,"offset":0,"rows":[
> {"id":"0000076a-c1ae-4999-b508-c03f4d0620c5","key":null,"value":"wfxuF3N8XEK6"}
> ]}
>
> real 6m6.740s
> user 0m0.016s
> sys 0m0.008s
>
> $ time curl http://localhost:5985/disk_ejson_test/_design/test/_view/simple?limit=1
> {"total_rows":500000,"offset":0,"rows":[
> {"id":"0000076a-c1ae-4999-b508-c03f4d0620c5","key":null,"value":"wfxuF3N8XEK6"}
> ]}
>
> real 15m41.439s
> user 0m0.012s
> sys 0m0.012s
>
>
>
> ## Database with 100 000 docs of 11Kb each
>
> Document template is at:
> https://github.com/fdmanana/couchdb/blob/raw_json_docs/doc_11k.json
>
> Sizes (branch vs trunk):
>
> $ du -m couchdb/tmp/lib/disk_json_test_11kb.couch
> 1185 couchdb/tmp/lib/disk_json_test_11kb.couch
>
> $ du -m couchdb-trunk/tmp/lib/disk_ejson_test_11kb.couch
> 2202 couchdb-trunk/tmp/lib/disk_ejson_test_11kb.couch
>
>
> Time, from a user's perspective, to build the view index from scratch:
>
> $ time curl http://localhost:5984/disk_json_test_11kb/_design/test/_view/simple?limit=1
> {"total_rows":100000,"offset":0,"rows":[
> {"id":"00001511-831c-41ff-9753-02861bff73b3","key":null,"value":"2fQUbzRUax4A"}
> ]}
>
> real 4m19.306s
> user 0m0.008s
> sys 0m0.004s
>
> $ time curl http://localhost:5985/disk_ejson_test_11kb/_design/test/_view/simple?limit=1
> {"total_rows":100000,"offset":0,"rows":[
> {"id":"00001511-831c-41ff-9753-02861bff73b3","key":null,"value":"2fQUbzRUax4A"}
> ]}
>
> real 18m46.051s
> user 0m0.008s
> sys 0m0.016s
>
>
>
> All in all, I haven't yet seen any disadvantage with this approach. Also,
> the code changes don't bring additional complexity. I'd say the performance
> and disk space gains it gives are very positive.
>
> This branch still needs to be polished in a few places. But I think it
> isn't far from being mature.
>
> Other experiments that can be done are to store view values as raw JSON
> binaries as well (instead of EJSON) and optional compression of the stored
> JSON binaries (since it's pure text, the compression ratio is very high).
> However, I would prefer to do these other 2 suggestions in separate
> branches/patches - I haven't actually tested any of them yet, so maybe they
> won't bring significant gains.
>
> Thoughts? :)
>
>
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
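
The write and read paths described in the quoted proposal (strip the top-level metadata fields, keep the remaining body as a raw JSON binary, then splice the metadata back in front of the stored body when serving it) could look roughly like the sketch below. It is in Python rather than Erlang, purely as an illustration: the names `META_FIELDS`, `split_doc` and `join_doc` are hypothetical, only `_id` and `_rev` are handled, and nothing here is taken from the actual patch.

```python
import json

# Assumption: only these two metadata fields are handled in this sketch;
# the proposal mentions "_id, _rev, etc" without listing the rest.
META_FIELDS = ("_id", "_rev")


def split_doc(raw_json: bytes):
    """Write path: decode once, pull out metadata, re-encode the body."""
    doc = json.loads(raw_json)
    meta = {key: doc.pop(key) for key in META_FIELDS if key in doc}
    # Compact separators keep the stored binary as small as possible.
    body = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return meta, body


def join_doc(meta: dict, body: bytes) -> bytes:
    """Read path: prepend metadata to the stored body without decoding it."""
    head = json.dumps(meta, separators=(",", ":")).encode("utf-8")
    if body == b"{}":
        return head
    if head == b"{}":
        return body
    # Drop the closing brace of the metadata object and the opening brace
    # of the body, then join the two with a comma.
    return head[:-1] + b"," + body[1:]


meta, body = split_doc(b'{"_id": "doc1", "_rev": "1-abc", "float1": 0.5}')
print(body)
print(join_doc(meta, body))
```

The point of `join_doc` is that the stored body is only sliced and concatenated, never parsed, which is the Python analogue of the "simple Erlang binary operations" mentioned in the proposal.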

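To put the quoted `du -m` and `time curl` figures side by side, here is a small Python sketch that turns them into a size reduction and a view-build speed-up per test database. The numbers are copied from the message above; the helper names `percent_smaller` and `to_seconds` are made up for this example.

```python
def percent_smaller(branch_mb: float, trunk_mb: float) -> float:
    """How much smaller the branch file is, as a percentage of the trunk file."""
    return 100.0 * (trunk_mb - branch_mb) / trunk_mb


def to_seconds(minutes: int, seconds: float) -> float:
    """Convert the 'real XmY.ZZZs' output of time(1) to seconds."""
    return 60 * minutes + seconds


# (branch, trunk) pairs taken from the quoted du and time output.
runs = {
    "500 000 docs of 2.5Kb": {
        "size_mb": (1996, 2693),
        "build_s": (to_seconds(6, 6.740), to_seconds(15, 41.439)),
    },
    "100 000 docs of 11Kb": {
        "size_mb": (1185, 2202),
        "build_s": (to_seconds(4, 19.306), to_seconds(18, 46.051)),
    },
}

for label, run in runs.items():
    branch_mb, trunk_mb = run["size_mb"]
    branch_s, trunk_s = run["build_s"]
    print(
        f"{label}: {percent_smaller(branch_mb, trunk_mb):.1f}% smaller on disk, "
        f"{trunk_s / branch_s:.1f}x faster view build"
    )
```

On these figures the compacted branch databases come out roughly 26% and 46% smaller than trunk, and the initial view build is about 2.6x and 4.3x faster.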