Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Teg
Hello Domingo,


DAD>db.exec_dml("PRAGMA synchronous = 0;");
DAD>db.exec_dml("PRAGMA journal_mode = WAL");
DAD>//db.exec_dml("PRAGMA journal_mode = MEMORY;");
DAD>//db.exec_dml("PRAGMA journal_mode = OFF;");
DAD>//db.exec_dml("PRAGMA locking_mode = EXCLUSIVE;");
DAD>db.exec_dml("PRAGMA temp_store = MEMORY;");
DAD>//db.exec_dml("PRAGMA threads = 4;");


Here are some of the pragmas you reported this morning.  Any place
you have "MEMORY" can consume most of your memory when doing a bunch of
transactions.

I'm going to guess you're still using some of these "use memory"
pragmas.
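
For reference, a minimal sketch of disk-backed equivalents of those
settings (the cache value is just an example, not something from your
report):

PRAGMA temp_store = FILE;       -- keep temp tables and sorting on disk
PRAGMA journal_mode = WAL;      -- WAL journals to disk rather than memory
PRAGMA cache_size = -200000;    -- negative = KiB, so roughly a 200 MB page cache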


Sunday, October 2, 2016, 4:35:23 PM, you wrote:

DAD> Hello !

DAD> I'm still fighting to use sqlite with a 20GB database and now I'm trying
DAD> to update a table with 980M rows, but sqlite is eating all my memory
DAD> (2GB) and blocking my computer (too much swap).

DAD> I'm in doubt whether sqlite can really manage databases above 1GB.

DAD> Does anyone have any experience with big databases ?

DAD> Cheers !

DAD> The query:

DAD> --explain query plan
DAD> update bolsas_familia set favorecido_id=(select id from favorecidos 
DAD> where nis=favorecido_id);

DAD> Query plan:

DAD> selectid|order|from|detail
DAD> 0|0|0|SCAN TABLE bolsas_familia
DAD> 0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
DAD> 0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX favorecidos_nis_idx
DAD> (nis=?)

DAD> The schema is:

DAD> CREATE TABLE bolsas_familia(
DAD>  id integer primary key,
DAD>  favorecido_id integer not null,
DAD>  valor_parcela numeric,
DAD>  mes_competencia integer
DAD> ); //980M rows

DAD> CREATE TABLE "favorecidos"(
DAD>  id integer primary key,
DAD>  nis integer,
DAD>  nome varchar collate nocase_slna,
DAD>  municipio_id integer,
DAD>  programa_id integer not null,
DAD>  fonte_finalidade_id integer not null,
DAD>  first_ano_mes integer
DAD> ); //14M rows

DAD> CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);


DAD> Cheers !

DAD> ___
DAD> sqlite-users mailing list
DAD> sqlite-users@mailinglists.sqlite.org
DAD> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



-- 
 Teg    mailto:t...@djii.com

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug: optional transfer optimization does not update sqlite_sequence

2016-10-02 Thread Richard Hipp
Thanks for the bug report.  The ticket is
https://www.sqlite.org/src/tktview/7b3328086a5c1

Your analysis of the problem seems spot on.  I'm guessing you spent a
long time debugging this.  Your efforts are appreciated.

On 10/2/16, Clemens Ladisch  wrote:
> Hi,
>
> if the transfer optimization needs to check if the destination table is
> empty, and if the optimization is actually used, and if the destination
> table uses AUTOINCREMENT, then the generated code does not update the
> sqlite_sequence table.
>
> After these statements, the row in dst should have rowid 2, but the
> rowid 1 got reused:
>
> create table src(a integer primary key autoincrement, b unique);
> create table dst(a integer primary key autoincrement, b unique);
> insert into src default values;
> insert into dst select * from src;
> -- select * from sqlite_sequence where name = 'dst'; is empty
> delete from dst;
> insert into dst default values;
> select a from dst;
> 1
>
> (reported by Matthew Moisen on )
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Rob Willett
We tend to use it append-only. We delete some small transient data as we
use it; this transient data maintains a bit of state between appends.
We're talking 300-500 rows of a table.


It's not read-only; we add a lot of data every 3-5 minutes, but it's all
driven from a single Perl process that captures XML data, processes it
as only Perl can, and then updates the database. Nothing updates the
database between those time intervals. No foreign keys at all.


The schema is quite long but rather simplistic. There are lots of tables
all referenced by a big central table. Since a lot of our data is common
and referenced thousands of times by different rows, we needed to reduce
the amount of data, or else we would have tripled the size of the main
database to over 100GB.


It's 1,200 lines long so not really suitable for posting here, but the
main table is


-- 
--  Table structure for Disruptions
-- 
DROP TABLE IF EXISTS "Disruptions";
CREATE TABLE "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
INSERT INTO "main".sqlite_sequence (name, seq) VALUES ("Disruptions", 
'0');


All the integer fields are really indexes into other tables. We made a
deliberate decision to move some of the more complex logic out of SQL
into Perl as it sat better there. This was no reflection on SQLite; we
would have made the same decision with Postgres or MySQL or Oracle.
Right tool for the right job, etc.


Rob

On 2 Oct 2016, at 22:47, Domingo Alvarez Duarte wrote:


Hello Rob !

How do you use those big databases ?

Append only ? Read only ? Constant read write ? Foreign keys ?

Could you show the schema to talk about it ?

Cheers !

On 02/10/16 17:44, Rob Willett wrote:
We have production databases of 30-50GB and have no issues with
managing them. Mind you, we have more memory, though we only have 8GB
as that's all our VM provider allows us.


Having said that, we have never seen any performance issues that
weren't due to our schemas or our bad design. Our working assumption
is that if there's a problem, it's our fault before we blame SQLite.
So far that working assumption has held true :)


We did some tests (just for the hell of it) to 200GB with dummy data 
and had no issues I can recall.


I know that other people on this list have far, far bigger production 
databases than us.


I'm not a SQLite expert but I look on machines with 2GB of main 
memory as rather small for this sort of thing.


Rob

On 2 Oct 2016, at 21:35, Domingo Alvarez Duarte wrote:


Hello !

I'm still fighting to use sqlite with a 20GB database and now I'm
trying to update a table with 980M rows, but sqlite is eating all my
memory (2GB) and blocking my computer (too much swap).


I'm in doubt whether sqlite can really manage databases above 1GB.

Does anyone have any experience with big databases ?

Cheers !

The query:

--explain query plan
update bolsas_familia set favorecido_id=(select id from favorecidos 
where nis=favorecido_id);


Query plan:

selectid|order|from|detail
0|0|0|SCAN TABLE bolsas_familia
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX 
favorecidos_nis_idx (nis=?)


The schema is:

CREATE TABLE bolsas_familia(
id integer primary key,
favorecido_id integer not null,
valor_parcela numeric,
mes_competencia integer
); //980M rows

CREATE TABLE "favorecidos"(
id integer primary key,
nis integer,
nome varchar collate nocase_slna,
municipio_id integer,
programa_id integer not null,
fonte_finalidade_id integer not null,
first_ano_mes integer
); //14M rows

CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);

Cheers !

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Simon Slavin

On 2 Oct 2016, at 10:47pm, Domingo Alvarez Duarte  wrote:

> How do you use those big databases ?

Just a note that it's not the big databases which are a problem.  Like another 
poster I too use big databases in production -- 43GB in my case.  And that 
particular database resembles the one from the OP: it's mostly one very thin 
table with a huge number of rows.

The problem here is that the OP has an UPDATE command which updates a lot of
rows, and they have only 2GB of cache space.  Since the UPDATE command is all
one transaction, all the rows which are changed have to be held in the journal
before any changes can be made to the actual database file, which means that
SQLite is effectively copying the entire table.

To see if it helps, I have recommended that the OP change their journal mode.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Domingo Alvarez Duarte

Hello Rob !

How do you use those big databases ?

Append only ? Read only ? Constant read write ? Foreign keys ?

Could you show the schema to talk about it ?

Cheers !


On 02/10/16 17:44, Rob Willett wrote:
We have production databases of 30-50GB and have no issues with
managing them. Mind you, we have more memory, though we only have 8GB
as that's all our VM provider allows us.


Having said that, we have never seen any performance issues that
weren't due to our schemas or our bad design. Our working assumption
is that if there's a problem, it's our fault before we blame SQLite.
So far that working assumption has held true :)


We did some tests (just for the hell of it) to 200GB with dummy data 
and had no issues I can recall.


I know that other people on this list have far, far bigger production 
databases than us.


I'm not a SQLite expert but I look on machines with 2GB of main memory 
as rather small for this sort of thing.


Rob

On 2 Oct 2016, at 21:35, Domingo Alvarez Duarte wrote:


Hello !

I'm still fighting to use sqlite with a 20GB database and now I'm
trying to update a table with 980M rows, but sqlite is eating all my
memory (2GB) and blocking my computer (too much swap).


I'm in doubt whether sqlite can really manage databases above 1GB.

Does anyone have any experience with big databases ?

Cheers !

The query:

--explain query plan
update bolsas_familia set favorecido_id=(select id from favorecidos 
where nis=favorecido_id);


Query plan:

selectid|order|from|detail
0|0|0|SCAN TABLE bolsas_familia
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX 
favorecidos_nis_idx (nis=?)


The schema is:

CREATE TABLE bolsas_familia(
id integer primary key,
favorecido_id integer not null,
valor_parcela numeric,
mes_competencia integer
); //980M rows

CREATE TABLE "favorecidos"(
id integer primary key,
nis integer,
nome varchar collate nocase_slna,
municipio_id integer,
programa_id integer not null,
fonte_finalidade_id integer not null,
first_ano_mes integer
); //14M rows

CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);

Cheers !

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Domingo Alvarez Duarte

Hello Richard !

On 02/10/16 18:02, Richard Hipp wrote:

For your specific UPDATE, it seems like SQLite ought to be able to do
it all in one pass, and thus avoid using all that memory to store the
rowids.  This looks like an opportunity to improve the query planner.


That would be nice. I also noticed that sqlite seems to do a delete then
an insert for updates even when it would be possible to update in place;
with tables with several fields and indices this seems to be a waste of work.


Cheers !

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Richard Hipp
On 10/2/16, Domingo Alvarez Duarte  wrote:
> Hello !
>
> I'm still fighting to use sqlite with a 20GB database and now I'm trying
> to update a table with 980M rows, but sqlite is eating all my memory
> (2GB) and blocking my computer (too much swap).
>
> I'm in doubt whether sqlite can really manage databases above 1GB.

For some (many) UPDATE statements, SQLite has to make two passes.  On
the first pass, it determines the ROWID of every row in the table that
needs to be changed, and then on the second pass it updates those
rows.

You are trying to update 900 million rows all in one go, and SQLite
requires 8 bytes for each row, so that is 7.2 GB of RAM right there,
just to hold all of the rowids.

For your specific UPDATE, it seems like SQLite ought to be able to do
it all in one pass, and thus avoid using all that memory to store the
rowids.  This looks like an opportunity to improve the query planner.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Keith Medcalf

You do realize that your UPDATE is spooling an update to 980 million rows
  (980 million integer updates means the update spool needs to contain
   980 million 64-bit integer primary keys plus 980 million updates,
   which is probably at LEAST 4GB of temporary data)
  -- even if the correlated subquery cannot be satisfied (results in NULL)
  -- even if the update does not need to be performed?
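
For illustration, a minimal sketch of restricting the UPDATE so that rows
with no matching nis are not spooled at all (assuming favorecido_id still
holds the nis value, as in the schema quoted below):

UPDATE bolsas_familia
   SET favorecido_id = (SELECT id FROM favorecidos WHERE nis = favorecido_id)
 WHERE favorecido_id IN (SELECT nis FROM favorecidos);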

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Domingo Alvarez Duarte
> Sent: Sunday, 2 October, 2016 14:35
> To: SQLite mailing list
> Subject: [sqlite] Using too much memory to execute an update query
> 
> Hello !
> 
> I'm still fighting to use sqlite with a 20GB database and now I'm trying
> to update a table with 980M rows, but sqlite is eating all my memory
> (2GB) and blocking my computer (too much swap).
>
> I'm in doubt whether sqlite can really manage databases above 1GB.
>
> Does anyone have any experience with big databases ?
> 
> Cheers !
> 
> The query:
> 
> --explain query plan
> update bolsas_familia set favorecido_id=(select id from favorecidos
> where nis=favorecido_id);
> 
> Query plan:
> 
> selectid|order|from|detail
> 0|0|0|SCAN TABLE bolsas_familia
> 0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
> 0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX favorecidos_nis_idx
> (nis=?)
> 
> The schema is:
> 
> CREATE TABLE bolsas_familia(
>  id integer primary key,
>  favorecido_id integer not null,
>  valor_parcela numeric,
>  mes_competencia integer
> ); //980M rows
> 
> CREATE TABLE "favorecidos"(
>  id integer primary key,
>  nis integer,
>  nome varchar collate nocase_slna,
>  municipio_id integer,
>  programa_id integer not null,
>  fonte_finalidade_id integer not null,
>  first_ano_mes integer
> ); //14M rows
> 
> CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);
> 
> 
> Cheers !
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Clemens Ladisch
Domingo Alvarez Duarte wrote:
> I'm still fighting to use sqlite with a 20GB database and now I'm
> trying to update a table with 980M rows, but sqlite is eating all my
> memory (2GB) and blocking my computer (too much swap).

IIRC SQLite keeps a list of (the IDs of) all rows to be updated.

Try making the updates in smaller chunks (WHERE ... BETWEEN ...).
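
For example, a minimal sketch of the chunked approach (the ten-million-row
step is only an illustration, and it assumes bolsas_familia.id is a
reasonably dense rowid):

BEGIN;
UPDATE bolsas_familia
   SET favorecido_id = (SELECT id FROM favorecidos WHERE nis = favorecido_id)
 WHERE id BETWEEN 1 AND 10000000;
COMMIT;
-- then repeat with id BETWEEN 10000001 AND 20000000, and so on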


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial indexes on JSON properties?

2016-10-02 Thread Clemens Ladisch
Jens Alfke wrote:
> if the key is null and the table has a rowid, the node would be small, on the 
> order of 10 bytes or so … ?

Typically less than 10 bytes.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Rob Willett
We have production databases of 30-50GB and have no issues with
managing them. Mind you, we have more memory, though we only have 8GB
as that's all our VM provider allows us.


Having said that, we have never seen any performance issues that weren't
due to our schemas or our bad design. Our working assumption is that if
there's a problem, it's our fault before we blame SQLite. So far that
working assumption has held true :)


We did some tests (just for the hell of it) to 200GB with dummy data and 
had no issues I can recall.


I know that other people on this list have far, far bigger production 
databases than us.


I'm not a SQLite expert but I look on machines with 2GB of main memory 
as rather small for this sort of thing.


Rob

On 2 Oct 2016, at 21:35, Domingo Alvarez Duarte wrote:


Hello !

I'm still fighting to use sqlite with a 20GB database and now I'm
trying to update a table with 980M rows, but sqlite is eating all my
memory (2GB) and blocking my computer (too much swap).


I'm in doubt whether sqlite can really manage databases above 1GB.

Does anyone have any experience with big databases ?

Cheers !

The query:

--explain query plan
update bolsas_familia set favorecido_id=(select id from favorecidos 
where nis=favorecido_id);


Query plan:

selectid|order|from|detail
0|0|0|SCAN TABLE bolsas_familia
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX 
favorecidos_nis_idx (nis=?)


The schema is:

CREATE TABLE bolsas_familia(
id integer primary key,
favorecido_id integer not null,
valor_parcela numeric,
mes_competencia integer
); //980M rows

CREATE TABLE "favorecidos"(
id integer primary key,
nis integer,
nome varchar collate nocase_slna,
municipio_id integer,
programa_id integer not null,
fonte_finalidade_id integer not null,
first_ano_mes integer
); //14M rows

CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);

Cheers !

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug: optional transfer optimization does not update sqlite_sequence

2016-10-02 Thread Clemens Ladisch
Hi,

if the transfer optimization needs to check if the destination table is
empty, and if the optimization is actually used, and if the destination
table uses AUTOINCREMENT, then the generated code does not update the
sqlite_sequence table.

After these statements, the row in dst should have rowid 2, but the
rowid 1 got reused:

create table src(a integer primary key autoincrement, b unique);
create table dst(a integer primary key autoincrement, b unique);
insert into src default values;
insert into dst select * from src;
-- select * from sqlite_sequence where name = 'dst'; is empty
delete from dst;
insert into dst default values;
select a from dst;
1

(reported by Matthew Moisen on )


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Simon Slavin

On 2 Oct 2016, at 9:35pm, Domingo Alvarez Duarte  wrote:

> I'm still fighting to use sqlite with a 20GB database and now I'm trying to
> update a table with 980M rows, but sqlite is eating all my memory (2GB) and
> blocking my computer (too much swap).

I assume you are not using any unusual PRAGMAs and are not using 
multiprocessing while your update executes.  Please execute this command on the 
database (you can use the shell tool).

PRAGMA journal_mode

If the result is WAL then execute

PRAGMA journal_mode = TRUNCATE

otherwise execute

PRAGMA journal_mode = WAL

Then try your operation again.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Using too much memory to execute an update query

2016-10-02 Thread Domingo Alvarez Duarte

Hello !

I'm still fighting to use sqlite with a 20GB database and now I'm trying
to update a table with 980M rows, but sqlite is eating all my memory
(2GB) and blocking my computer (too much swap).


I'm in doubt whether sqlite can really manage databases above 1GB.

Does anyone have any experience with big databases ?

Cheers !

The query:

--explain query plan
update bolsas_familia set favorecido_id=(select id from favorecidos 
where nis=favorecido_id);


Query plan:

selectid|order|from|detail
0|0|0|SCAN TABLE bolsas_familia
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX favorecidos_nis_idx 
(nis=?)


The schema is:

CREATE TABLE bolsas_familia(
id integer primary key,
favorecido_id integer not null,
valor_parcela numeric,
mes_competencia integer
); //980M rows

CREATE TABLE "favorecidos"(
id integer primary key,
nis integer,
nome varchar collate nocase_slna,
municipio_id integer,
programa_id integer not null,
fonte_finalidade_id integer not null,
first_ano_mes integer
); //14M rows

CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);


Cheers !

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial indexes on JSON properties?

2016-10-02 Thread Jens Alfke

> On Oct 2, 2016, at 6:20 AM, Clemens Ladisch  wrote:
> 
> Changing the function in any way (including not registering the
> function) would essentially corrupt the index.

Well, the same can be said of using a custom collation function, which has been 
supported since 3.0; or of using a function in the expression being indexed, 
which is supported since 3.9.

- “Changing the function” is covered by the SQLITE_DETERMINISTIC flag, which is 
a promise that the function is pure, i.e. always produces the same output given 
the same input. If that guarantee were to fail, you’d need to drop and 
re-create the index.
- “Not registering the function” is handled by SQLite — any query/command that 
would involve calling an unregistered function will instead return an error. (I 
already run into this in the current version of my library, which uses a custom 
collator, when I try to query databases with the `sqlite3` tool.)

So I don’t see this as a reason not to support (deterministic) functions in the 
WHERE clause.

> If only 10% of your rows have a professor, the first 90% of the index
> entries will be NULL.  This is probably not a performance problem when
> searching in a B-tree (you can test this with a table with 'real'
> columns).  Whether the storage overhead matters is something you have
> to decide yourself.

It’s 1%, not 10%, in my example; but in reality I can’t predict or control what 
types of data sets might be used by customers.

I don’t know the details of SQLite’s b-tree, but I’d guess that the root node 
would have one child pointer for ‘null’ values, with the rest being non-null, 
so there’s little overhead in lookup time. As for storage overhead, hopefully 
if the key is null and the table has a rowid, the node would be small, on the 
order of 10 bytes or so … ?

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial indexes on JSON properties?

2016-10-02 Thread Clemens Ladisch
Jens Alfke wrote:
> the WHERE clause in a CREATE INDEX statement explicitly disallows
> function calls, so I can’t constrain the index to only the rows that
> contain the JSON property. Is this limitation something that might be
> lifted soon (now that functions can be identified as ‘pure’), or is it
> somehow a direct consequence of the way partial indexes work?

Changing the function in any way (including not registering the
function) would essentially corrupt the index.

>   CREATE INDEX profs ON dataset (json_extract(doc, ‘$.professor’)) WHERE 
> json_extract(doc, ‘$.professor’) IS NOT NULL
>
> (Now, maybe I’m prematurely optimizing, and in reality all those null
> values don’t have much overhead

If only 10% of your rows have a professor, the first 90% of the index
entries will be NULL.  This is probably not a performance problem when
searching in a B-tree (you can test this with a table with 'real'
columns).  Whether the storage overhead matters is something you have
to decide yourself.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why so much I/O ? Can sqlite be improved ?

2016-10-02 Thread Florian Weimer
* Domingo Alvarez Duarte:

> After 12 hours inserting of:
>
> 934,135,285 records on bolsas_familia
>
> 22,711,259 records in favorecidos
>
> 5,570 records in municipios
>
> ...

Insertion will be faster if you create the index after populating the
tables.

> time sqlite3 bolsa_familia3.db "vacuum;"
>
> real    147m6.252s
> user    10m53.790s
> sys     3m43.663s

You really need to increase the cache size if at all possible.
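
As a rough sketch of both suggestions combined (the cache value is only an
example and should be sized to the RAM actually available; table and index
names follow the schema posted earlier):

PRAGMA cache_size = -1000000;   -- negative = KiB, so roughly a 1 GB page cache
BEGIN;
-- bulk INSERTs into favorecidos, bolsas_familia, municipios go here
COMMIT;
CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);  -- build the index last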
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users