> It could be; my knowledge of optimization gets tenuous when it comes to 
> down-to-the-metal areas like CPU caching. But for large data, you run the 
> risk of blowing out the cache traversing it. And if the data is 
> memory-mapped, it becomes hugely faster to skip right to the relevant page 
> instead of faulting in every page ahead of it.

Yeah, OK: if you take I/O hits, then things like memory prefetching make zero 
difference. We're more in the business of "you take a page fault" == "you buy 
more memory". Different level of performance requirements. (And I'm glad that 
SQLite works well for both of us.)


> In Fleece I put a lot of effort into making the C++ API nice to use, so that 
> I don’t have to have any other data structure. That's worked well so far.

Strong typing?


- Deon

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Jens Alfke
Sent: Thursday, March 23, 2017 6:09 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Memoization in sqlite json1 functions


> On Mar 23, 2017, at 3:17 PM, Deon Brewis <de...@outlook.com> wrote:
> 
> If, however, you can use a forward-only push or pull parser like SAX or StAX, 
> it's a different story. I'm using a StAX-like pull parser for a binary JSON-ish 
> internal format we have, and reading and parsing through it is on par with the 
> performance of reading the equivalent SQLite columns directly.

I agree that’s a lot faster, but you’re still looking at O(n) lookup time in an 
array or dictionary. And the proportionality constant gets worse the bigger the 
document is, since jumping to the next item involves parsing through all of the 
nested items in that collection.

> That obviously implies that if you do random access into a structure, you have 
> to keep reparsing it (which is where memoization would be nice). However, CPU 
> caches are better at reading contiguous data streams in forward-only fashion 
> than they are at chasing pointers, so forward-only pull parsers, even when you 
> have to repeat the entire parse, are often faster than the math suggests.

It could be; my knowledge of optimization gets tenuous when it comes to 
down-to-the-metal areas like CPU caching. But for large data, you run the risk 
of blowing out the cache traversing it. And if the data is memory-mapped, it 
becomes hugely faster to skip right to the relevant page instead of faulting in 
every page ahead of it.

> Besides, in 99% of cases my users take the outcome of a JSON parse and just 
> store the results into a C++ data structure anyway. In that case the 
> intermediate object tree is just a throwaway, and you may as well have built 
> the C++ structure up using a pull or push parser.

In Fleece I put a lot of effort into making the C++ API nice to use, so that I 
don’t have to have any other data structure. That's worked well so far.

—Jens

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
