Re: [sqlite] more efficient JSON encoding: idle musing

Jens Alfke Tue, 25 Feb 2020 14:54:46 -0800


> On Feb 25, 2020, at 6:12 AM, J Decker <d3c...@gmail.com> wrote:
> 
> other than that; if space is really a concern, maybe a zip layer?


In my experience, the concern is more about speed than size. Given the raw 
string/blob data from a SQLite column, and a specific property name/path, how 
fast can you find its value, convert it to a format SQLite's query engine 
understand, and return it?

The JSON1 extension's parser seems pretty darn fast, but given the nature of 
JSON, it has to do a bunch of scanning that's O(n) with the length of the data. 
It's generally impossible to find a value in the middle of a JSON string 
without examining every single byte that came before it.

The way to go faster is to use formats that let you jump through the structure 
faster. For example, tokenizing dictionary keys and encoding an object's keys 
as a fixed-width sorted list lets you look up a key in O(log n) time, and the 
constant factor is very small because you're comparing integers not strings.

I don't know if extracting 'classes' will help much in a query. The data will 
be smaller, and it makes it extremely fast to look up values if you know the 
class ahead of time, but in a query you don't know the class. Compared to what 
I described above, there's an extra step where you have to look up the class 
description from the object.

I also worry about use cases where the number of 'classes' becomes unwieldy, 
because the schema might be using a huge set of keys. For example, something 
where a customer order contains an object that maps SKUs to quantities, like 
{"A73563M1": 3, "A73522M0": 7, …}. And there are tens of thousands of SKUs.

—Jens
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] more efficient JSON encoding: idle musing

Reply via email to