> On Feb 21, 2020, at 4:20 AM, Wout Mertens <wout.mert...@gmail.com> wrote: > > I was wondering if the JSON extension could not do the same thing: for each > table, keep a hidden stash of object layouts, and store the values as > sqlite primitives. (you'd be able to disable this, in case the layouts > rarely repeat)
We do something a bit like this in Couchbase Lite[1]. We don't directly store JSON, we store a binary data format we designed called Fleece, which has the same semantics but is faster to traverse and more compact. https://github.com/couchbaselabs/fleece In Fleece a JSON object is represented as an array of fixed-width {key, value} pairs. - The keys are typically 16-bit integers that map into a global (per-database) table of common key strings. - The values are either inline if they fit (small ints, boolean, null) or else offsets to where the real data is. The array is sorted by key, so lookup is a binary search, usually on 16-bit ints. Numbers are stored in a compact variable-width encoding. Strings are unescaped UTF-8, and multiple occurrences of the same string are only stored once. (I've recently discovered that this is similar to FlexBuffers. Fleece has been around since 2015, so I don't know if the designers of FlexBuffers looked at it…) I've thought about the 'layouts' concept but haven't tried implementing it yet. (Most JavaScript runtimes use something like it, btw.) It would only save about 4*n bytes per record, where n is the average number of keys in a layout, and a small number of clock cycles per key lookup. > On Feb 21, 2020, at 6:03 AM, Richard Hipp <d...@sqlite.org> wrote: > > I experimented with a number of similar ideas for storing JSON when I > was first designing the JSON components for SQLite. I was never able > to find anything that was as fast or as compact as just storing the > original JSON text. I had similar experiences with BSON, Bencode, etc. In my use case I found that the major expense wasn't parsing, rather allocating an object tree. So I designed the Fleece format to be fully useable in-place without requiring parsing. A secondary factor is that traversing objects and arrays is expensive because the items are variable width, so you have to parse through all the (potentially nested) items before the one you're looking for. That's why Fleece objects and arrays are fixed-width, using offsets to point to variable-size values. —Jens [1] https://www.couchbase.com/products/mobile _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users