Re: [sqlite] JSON_EACH + recursive query = unexpected performance degradation

2020-02-12 Thread Warren Young
On Feb 12, 2020, at 10:53 AM, Jens Alfke  wrote:
> 
> You should be able to speed this up by creating temporary tables from the 
> JSON first, and then changing the CTE to use those tables.

Do you not get the same effect by using the new generated columns feature, only 
without the manual work of maintaining the temporary table?

https://www.sqlite.org/gencol.html

sqlite> create table a (
  json text,
  b text generated always as (json_extract(json, '$.field')) stored
);
sqlite> insert into a values('{"field": "hello"}'); 
sqlite> select b from a;
hello


It’s probably critical to the success of this that you use the STORED attribute 
rather than VIRTUAL, which means you can’t ALTER TABLE your way to success, but 
you’d be looking at table copies with the temporary table idea anyway.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JSON_EACH + recursive query = unexpected performance degradation

2020-02-12 Thread Keith Medcalf

The easiest way is to phrase the query such that the table is internally 
materialized.  In query1.sql the easiest way to do that is to change the:

WITH
build a in memory table with parent-child relations 
from 3 json arrays _which have the same size_
"objects_in_memory" AS (
   SELECT "id"."value" AS "id",

to read like this:

WITH
build a in memory table with parent-child relations 
from 3 json arrays _which have the same size_
"objects_in_memory" AS (
   SELECT DISTINCT "id"."value" AS "id",

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of  ?
>Sent: Monday, 10 February, 2020 23:28
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] JSON_EACH + recursive query = unexpected performance
>degradation
>
>I have a dataset of about 300 rows which have parent-child relations.
>Due to factors unrelated to the issue I build the rows by zipping JSON
>arrays with values from each column.
>Then I run a simplest recursive query on it to get the whole tree(ends
>up being ~4 levels).
>
>Problem: the query takes 3000 ms (3 seconds) on my machine to complete.
>If I create a real table with the SAME structure, insert the SAME data
>into it, and run the SAME query, get the SAME result back, it takes
>10-15 ms (200-300 TIMES faster).
>
>
>I attached both queries(don't require schema to run)




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JSON_EACH + recursive query = unexpected performance degradation

2020-02-12 Thread Jens Alfke


> On Feb 10, 2020, at 10:27 PM, Хусаинов Динар  
> wrote:
> 
> Problem: the query takes 3000 ms (3 seconds) on my machine to complete. If I 
> create a real table with the SAME structure, insert the SAME data into it, 
> and run the SAME query, get the SAME result back, it takes 10-15 ms (200-300 
> TIMES faster).

It sounds like SQLite is much faster at traversing real tables than it is at 
traversing JSON arrays. Which I would expect. This is exacerbated by the fact 
that the recursive CTE must be doing a lot of traversals of the JSON.

You should be able to speed this up by creating temporary tables from the JSON 
first, and then changing the CTE to use those tables.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JSON_EACH + recursive query = unexpected performance degradation

2020-02-12 Thread Richard Hipp
On 2/12/20, David Raymond  wrote:
> Not necessarily related to the question itself, but how did the attachments
> actually come through with this mail? Every single other person to try and
> attach something to this list has had it stripped off.

Sometimes Mailman asks me for approval for messages containing
attachments.  I think this depends on the size and mimetype of the
attachments.  In this case, I was asked and I approved.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JSON_EACH + recursive query = unexpected performance degradation

2020-02-12 Thread David Raymond
Not necessarily related to the question itself, but how did the attachments 
actually come through with this mail? Every single other person to try and 
attach something to this list has had it stripped off.


-Original Message-
From: sqlite-users  On Behalf Of 
 ?
Sent: Tuesday, February 11, 2020 1:28 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] JSON_EACH + recursive query = unexpected performance 
degradation

I have a dataset of about 300 rows which have parent-child relations. 
Due to factors unrelated to the issue I build the rows by zipping JSON 
arrays with values from each column.
Then I run a simplest recursive query on it to get the whole tree(ends 
up being ~4 levels).

Problem: the query takes 3000 ms (3 seconds) on my machine to complete. 
If I create a real table with the SAME structure, insert the SAME data 
into it, and run the SAME query, get the SAME result back, it takes 
10-15 ms (200-300 TIMES faster).


I attached both queries(don't require schema to run)

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JSON_EACH + recursive query = unexpected performance degradation

2020-02-12 Thread Хусаинов Динар
I have a dataset of about 300 rows which have parent-child relations. 
Due to factors unrelated to the issue I build the rows by zipping JSON 
arrays with values from each column.
Then I run a simplest recursive query on it to get the whole tree(ends 
up being ~4 levels).


Problem: the query takes 3000 ms (3 seconds) on my machine to complete. 
If I create a real table with the SAME structure, insert the SAME data 
into it, and run the SAME query, get the SAME result back, it takes 
10-15 ms (200-300 TIMES faster).



I attached both queries(don't require schema to run)

/*query1*/

WITH 
build a in memory table with parent-child relations 
from 3 json arrays _which have the same size_
"objects_in_memory" AS (
   SELECT "id"."value" AS "id",
  "parent"."value" AS "parent",
  "code"."value"   AS "code"
   FROM