On 2/25/19, Richard Hipp <d...@sqlite.org> wrote:
> performance of just over 3GB/sec, which is slightly
> faster than reported simdjson performance of 2.9GB/sec.

Further analysis shows that SQLite was caching its parse tree, which was distorting the measurement. The following script appends a different string of spaces to each instance of gsoc-2018.json that is parsed, thereby invalidating the cache.

    .timer on
    CREATE TEMP TABLE [$Parameters](key TEXT PRIMARY KEY,value) WITHOUT ROWID;
    INSERT INTO [$Parameters](key,value)
      VALUES('$json',readfile('/home/drh/tmp/gsoc-2018.json'));
    SELECT length($json);
    WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x<1000)
    SELECT count(json_valid($json||printf('%*c',x,' '))) FROM c;

In this case, SQLite parses JSON at 1.1GB/sec. That is slower than simdjson, but it is still pretty fast. And there are other reasons to prefer the current SQLite implementation:

(1) The SQLite code is public domain. Simdjson is not. We do not want a license on SQLite that says something like "Public Domain unless you use JSON functions, in which case the license is Apache."

(2) SQLite is written in portable C code. It runs everywhere. Simdjson is written in C++ and makes use of SIMD extensions that are not universally available.

(3) Simdjson is optimized for large JSON blobs. SQLite is optimized for the common database case of small JSON blobs.

-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
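
For anyone who wants to try the cache-busting measurement outside the sqlite3 shell, here is a rough Python sketch of the same idea. It is not the script used for the numbers above. The function name padded_throughput and its parameters doc and runs are made up for illustration, and it assumes the SQLite library bundled with your Python was built with the JSON functions enabled. Timings taken this way include Python call overhead, so treat the result as a ballpark figure only.

```python
import sqlite3
import time


def padded_throughput(doc: str, runs: int = 1000) -> float:
    """Rough bytes/sec for json_valid() over `runs` uniquely padded copies of doc.

    Hypothetical helper, not part of SQLite. Each copy gets x trailing spaces
    (x = 1..runs), so no two inputs are identical and a parse cache cannot
    serve a repeated string.
    """
    con = sqlite3.connect(":memory:")
    sql = (
        "WITH RECURSIVE c(x) AS "
        "(VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x < :runs) "
        "SELECT count(json_valid(:doc || printf('%*c', x, ' '))) FROM c"
    )
    start = time.perf_counter()
    (count,) = con.execute(sql, {"doc": doc, "runs": runs}).fetchone()
    elapsed = time.perf_counter() - start
    con.close()

    if count != runs:
        raise RuntimeError("expected one json_valid() result per padded copy")

    # Total bytes handed to the parser: the document itself `runs` times,
    # plus 1 + 2 + ... + runs padding spaces.
    total_bytes = runs * len(doc.encode("utf-8")) + runs * (runs + 1) // 2
    return total_bytes / elapsed
```

To use it, read the JSON file into a string and call padded_throughput(text); dividing the result by 1e9 gives GB/sec. Passing the document as a bound parameter plays the same role as the $json entry in the temp table above.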