Thanks Dave

On Wed, Mar 16, 2011 at 10:46 AM, Dave Cottlehuber <[email protected]> wrote:
> Filipe
>
> this looks awesome - sure lots of comments will come on this one :-)
>
> A+
> Dave
>
> On 16 March 2011 09:54, Filipe Manana (JIRA) <[email protected]> wrote:
>> Storing documents bodies as raw JSON binaries instead of serialized JSON terms
>> ------------------------------------------------------------------------------
>>
>> Key: COUCHDB-1092
>> URL: https://issues.apache.org/jira/browse/COUCHDB-1092
>> Project: CouchDB
>> Issue Type: Improvement
>> Components: Database Core
>> Reporter: Filipe Manana
>> Assignee: Filipe Manana
>>
>>
>> Currently we store documents as Erlang serialized (via the term_to_binary/1 BIF) EJSON.
>> The proposed patch changes the database file format so that instead of storing serialized
>> EJSON document bodies, it stores raw JSON binaries.
>>
>> The github branch is at: https://github.com/fdmanana/couchdb/tree/raw_json_docs
>>
>> Advantages:
>>
>> * what we write to disk is much smaller - a raw JSON binary can easily get up to 50% smaller
>>   (at least according to the tests I did)
>>
>> * when serving documents to a client we no longer need to JSON encode the document body
>>   read from the disk - this applies to individual document requests, view queries with
>>   ?include_docs=true, pull and push replications, and possibly other use cases.
>>   We just grab its body and prepend the _id, _rev and all the necessary metadata fields
>>   (this is via simple Erlang binary operations)
>>
>> * we avoid the EJSON term copying between request handlers and the db updater processes,
>>   between the work queues and the view updater process, between replicator processes, etc
>>
>> * before sending a document to the JavaScript view server, we no longer need to convert it
>>   from EJSON to JSON
>>
>> The changes done to the document write workflow are minimalist - after JSON decoding the
>> document's JSON into EJSON and removing the metadata top level fields (_id, _rev, etc), it
>> JSON encodes the resulting EJSON body into a binary - this consumes CPU of course but it
>> brings 2 advantages:
>>
>> 1) we avoid the EJSON copy between the request process and the database updater process -
>>    for any realistic document size (4kb or more) this can be very expensive, especially
>>    when there are many nested structures (lists inside objects inside lists, etc)
>>
>> 2) before writing anything to the file, we do a term_to_binary([Len, Md5, TheThingToWrite])
>>    and then write the result to the file. A term_to_binary call with a binary as the input
>>    is very fast compared to a term_to_binary call with EJSON as input (or some other nested
>>    structure)
>>
>> I think both compensate for the JSON encoding after the separation of meta data fields and
>> non-meta data fields.
>>
>> The following relaximation graph, for documents with sizes of 4Kb, shows a significant
>> performance increase both for writes and reads - especially reads.
>>
>> http://graphs.mikeal.couchone.com/#/graph/698bf36b6c64dbd19aa2bef63400b94f
>>
>>
>> I've also made a few tests to see how much the improvement is when querying a view, for the
>> first time, without ?stale=ok.
>> The size difference of the databases (after compaction) is also very significant - this
>> change can reduce the size by at least 50% in common cases.
>>
>> The test databases were created in an instance built from that experimental branch.
>> Then they were replicated into a CouchDB instance built from the current trunk.
>> At the end both databases were compacted (to fairly compare their final sizes).
>>
>> The databases contain the following view:
>> {
>>     "_id": "_design/test",
>>     "language": "javascript",
>>     "views": {
>>         "simple": {
>>             "map": "function(doc) { emit(doc.float1, doc.strings[1]); }"
>>         }
>>     }
>> }
>>
>>
>> ## Database with 500 000 docs of 2.5Kb each
>>
>> Document template is at: https://github.com/fdmanana/couchdb/blob/raw_json_docs/doc_2_5k.json
>>
>> Sizes (branch vs trunk):
>>
>> $ du -m couchdb/tmp/lib/disk_json_test.couch
>> 1996    couchdb/tmp/lib/disk_json_test.couch
>>
>> $ du -m couchdb-trunk/tmp/lib/disk_ejson_test.couch
>> 2693    couchdb-trunk/tmp/lib/disk_ejson_test.couch
>>
>>
>> Time, from a user's perspective, to build the view index from scratch:
>>
>> $ time curl http://localhost:5984/disk_json_test/_design/test/_view/simple?limit=1
>> {"total_rows":500000,"offset":0,"rows":[
>> {"id":"0000076a-c1ae-4999-b508-c03f4d0620c5","key":null,"value":"wfxuF3N8XEK6"}
>> ]}
>>
>> real    6m6.740s
>> user    0m0.016s
>> sys     0m0.008s
>>
>> $ time curl http://localhost:5985/disk_ejson_test/_design/test/_view/simple?limit=1
>> {"total_rows":500000,"offset":0,"rows":[
>> {"id":"0000076a-c1ae-4999-b508-c03f4d0620c5","key":null,"value":"wfxuF3N8XEK6"}
>> ]}
>>
>> real    15m41.439s
>> user    0m0.012s
>> sys     0m0.012s
>>
>>
>>
>> ## Database with 100 000 docs of 11Kb each
>>
>> Document template is at: https://github.com/fdmanana/couchdb/blob/raw_json_docs/doc_11k.json
>>
>> Sizes (branch vs trunk):
>>
>> $ du -m couchdb/tmp/lib/disk_json_test_11kb.couch
>> 1185    couchdb/tmp/lib/disk_json_test_11kb.couch
>>
>> $ du -m couchdb-trunk/tmp/lib/disk_ejson_test_11kb.couch
>> 2202    couchdb-trunk/tmp/lib/disk_ejson_test_11kb.couch
>>
>>
>> Time, from a user's perspective, to build the view index from scratch:
>>
>> $ time curl http://localhost:5984/disk_json_test_11kb/_design/test/_view/simple?limit=1
>> {"total_rows":100000,"offset":0,"rows":[
>> {"id":"00001511-831c-41ff-9753-02861bff73b3","key":null,"value":"2fQUbzRUax4A"}
>> ]}
>>
>> real    4m19.306s
>> user    0m0.008s
>> sys     0m0.004s
>>
>> $ time curl http://localhost:5985/disk_ejson_test_11kb/_design/test/_view/simple?limit=1
>> {"total_rows":100000,"offset":0,"rows":[
>> {"id":"00001511-831c-41ff-9753-02861bff73b3","key":null,"value":"2fQUbzRUax4A"}
>> ]}
>>
>> real    18m46.051s
>> user    0m0.008s
>> sys     0m0.016s
>>
>>
>>
>> All in all, I haven't yet seen any disadvantage with this approach. Also, the code changes
>> don't bring additional complexity. I say the performance and disk space gains it gives are
>> very positive.
>>
>> This branch still needs to be polished in a few places. But I think it isn't far from
>> getting mature.
>>
>> Other experiments that can be done are to store view values as raw JSON binaries as well
>> (instead of EJSON) and optional compression of the stored JSON binaries (since it's pure
>> text, the compression ratio is very high).
>> However, I would prefer to do these other 2 suggestions in separate branches/patches - I
>> haven't actually tested any of them yet, so maybe they would not bring significant gains.
>>
>> Thoughts? :)
>>
>>
>> --
>> This message is automatically generated by JIRA.
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>
>
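
For reference, the size and timing figures quoted above can be turned into relative numbers with a few lines of Python. This is only a sketch for reading the results: the values are copied from the du and time output in the quoted message, while the names RESULTS, to_seconds and summarize are made up for illustration and are not part of the branch.

```python
# Figures copied from the quoted `du -m` (MB) and `time curl` ("real") output.
# The dictionary layout and all names here are illustrative assumptions.
RESULTS = {
    "500 000 docs of 2.5Kb": {
        "branch_mb": 1996, "trunk_mb": 2693,
        "branch_time": "6m6.740s", "trunk_time": "15m41.439s",
    },
    "100 000 docs of 11Kb": {
        "branch_mb": 1185, "trunk_mb": 2202,
        "branch_time": "4m19.306s", "trunk_time": "18m46.051s",
    },
}


def to_seconds(real):
    """Convert a `time` value such as '6m6.740s' to seconds."""
    minutes, seconds = real.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def summarize(row):
    """Return (size reduction in percent, view build speedup factor)."""
    reduction = 100.0 * (1 - row["branch_mb"] / row["trunk_mb"])
    speedup = to_seconds(row["trunk_time"]) / to_seconds(row["branch_time"])
    return reduction, speedup


for name, row in RESULTS.items():
    reduction, speedup = summarize(row)
    print(f"{name}: {reduction:.1f}% smaller on disk, "
          f"{speedup:.2f}x faster view build")
```

With the numbers above this gives about 25.9% smaller and 2.57x faster for the 2.5Kb documents, and about 46.2% smaller and 4.34x faster for the 11Kb documents.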
--
Filipe David Manana,
[email protected], [email protected]

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
