[
https://issues.apache.org/jira/browse/COUCHDB-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015705#comment-13015705
]
Paul Joseph Davis commented on COUCHDB-1118:
--------------------------------------------
Just a quick heads up on some work I did this weekend.
I was working on a thing that takes a different approach to JSON
encoding/decoding than the token-based approach in ejson. Currently I return
{ok, EJson}, {error, Error} or {bignum, Terms}, and if it's bignum I have
a function that goes through and parses all the bignums present before
returning the EJson.
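To make the bignum pass concrete, here's a rough sketch of what such a post-processing step could look like. This is not the actual code: the module name bignum_sketch, the function finish/1 and the {bignum, Digits} placeholder shape inside Terms are all assumptions for illustration, and it only handles integers.

```erlang
-module(bignum_sketch).
-export([finish/1]).

%% Normalize the three decoder results described above.
%% ASSUMPTION: inside Terms, each unparsed big number appears as a
%% {bignum, Digits} placeholder where Digits is a binary of digits.
finish({ok, EJson}) ->
    {ok, EJson};
finish({error, _} = Error) ->
    Error;
finish({bignum, Terms}) ->
    {ok, fix(Terms)}.

%% Walk the EJSON term and replace every placeholder with an integer.
fix({bignum, Digits}) when is_binary(Digits) ->
    list_to_integer(binary_to_list(Digits));
fix({Props}) when is_list(Props) ->
    %% EJSON object: {[{Key, Value}]}
    {[{K, fix(V)} || {K, V} <- Props]};
fix(List) when is_list(List) ->
    %% EJSON array
    [fix(V) || V <- List];
fix(Other) ->
    %% Strings, small numbers, true/false/null pass through untouched.
    Other.
```

The point of splitting it this way is that the common case (no bignums) never pays for the extra walk over the term.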
Encoding *will* have a similar method based on iolists, but currently it only
supports JSON that can be encoded directly through the NIF API.
I've finally managed to suss out the remaining memory bugs I was having last
night and have slapped together a repo for testing the new project against
ejson and mochijson.
The steps to running it are:
$ git clone git://github.com/davisp/erljson_bench.git
$ cd erljson_bench
$ make
$ ./bench
That'll spit out something like this:
encode: jiffy: 2444465
encode: ejson_test: 12169427
encode: mochijson2: 25071078
decode: jiffy: 1118045
decode: ejson_test: 9873485
decode: mochijson2: 25117838
The current test runs N workers (default 10) for M iterations per worker
(default 1,000). Each iteration runs timer:tc(Module, encode|decode, [Doc])
(where Doc is roughly 5K by default). The number in the third column is the sum
of the times, in microseconds, reported by timer:tc.
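For anyone who wants the gist without cloning the repo, the measurement loop is roughly the following. This is a minimal sketch and not the code in erljson_bench; the module name bench_sketch and run/5 are made up for illustration.

```erlang
-module(bench_sketch).
-export([run/5]).

%% Spawn N workers. Each one calls timer:tc(Mod, Fun, [Doc]) M times
%% and sends its summed microseconds back to the parent, which adds
%% up the per-worker totals.
run(Mod, Fun, Doc, N, M) ->
    Parent = self(),
    Pids = [spawn_link(fun() ->
                Parent ! {self(), worker(Mod, Fun, Doc, M, 0)}
            end) || _ <- lists:seq(1, N)],
    lists:sum([receive {Pid, Total} -> Total end || Pid <- Pids]).

%% One worker: accumulate the time reported by timer:tc over M calls.
worker(_Mod, _Fun, _Doc, 0, Acc) ->
    Acc;
worker(Mod, Fun, Doc, M, Acc) ->
    {Micros, _Result} = timer:tc(Mod, Fun, [Doc]),
    worker(Mod, Fun, Doc, M - 1, Acc + Micros).
```

With the defaults above that would be called as bench_sketch:run(jiffy, encode, Doc, 10, 1000). Note that the result is summed time across all workers, not wall-clock time.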
Here are some quick results on multiple machines from Dale Harvey, Koco and me
from earlier tonight. I think all three of us are on relatively newish Mac
laptops of some sort.
My next bit will be to add the rest of the special encoding as well as adding
some tests to look at how the Erlang VM reacts to having these types of NIF
calls.
Anyway, just a heads up on some hopeful gains to be had here.
> Adding a NIF based JSON decoding/encoding module
> ------------------------------------------------
>
> Key: COUCHDB-1118
> URL: https://issues.apache.org/jira/browse/COUCHDB-1118
> Project: CouchDB
> Issue Type: Improvement
> Components: Database Core
> Reporter: Filipe Manana
> Fix For: 1.2
>
>
> Currently, all the Erlang based JSON encoders and decoders are very slow, and
> decoding and encoding JSON is something that we do basically everywhere.
> Via IRC, adding a JSON NIF encoder/decoder was recently discussed.
> Damien also started a thread at the development mailing list about adding
> NIFs to trunk.
> The patch/branch at [1] adds such a JSON encoder/decoder. It is based on Paul
> Davis' eep0018 project [2]. Damien made some modifications [3] to it mostly
> to add support for big numbers (Paul's eep0018 limits the precision to 32/64
> bits) and a few optimizations. I made a few corrections and minor
> enhancements on top of Damien's fork as well [4]. Finally Benoît identified
> some missing capabilities compared to mochijson2 (on encoding, allow atoms as
> strings and strings as object properties).
> Also, the version added in the patch at [1] uses mochijson2 when the C NIF is
> not loaded. Autotools configuration was adapted to compile the NIF only when
> we're using an OTP release >= R13B04 (R13B03 NIF API is too limited and
> suffered many changes compared to R13B04 and R14) - therefore it should work
> on any OTP release > R13B at least.
> I successfully tested this on R13B03, R13B04 and R14B02 in an Ubuntu
> environment.
> I'm not sure if it builds at all on Windows - would appreciate if someone
> could verify it.
> Also, I'm far from being good with the autotools, so I probably missed
> something important or I'm doing something in a not very standard way.
> This NIF encoder/decoder is about one order of magnitude faster compared to
> mochijson2 and other Erlang-only solutions such as jsx. A read and write
> test with relaximation shows this has a very positive impact, especially on
> reads (the EJSON encoding is more expensive than JSON decoding) -
> http://graphs.mikeal.couchone.com/#/graph/698bf36b6c64dbd19aa2bef634052381
> @Paul, since this is based on your eep0018 effort, do you think any other
> missing files should be added (README, etap tests, etc)? Also, should we put
> somewhere a note this is based on your project?
> [1] - https://github.com/fdmanana/couchdb/compare/json_nif
> [2] - https://github.com/davisp/eep0018
> [3] - https://github.com/Damienkatz/eep0018/commits/master
> [4] - https://github.com/fdmanana/eep0018/commits/final_damien