Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Teg
Hello Domingo,


DAD>db.exec_dml("PRAGMA synchronous = 0;");
DAD>db.exec_dml("PRAGMA journal_mode = WAL");
DAD>//db.exec_dml("PRAGMA journal_mode = MEMORY;");
DAD>//db.exec_dml("PRAGMA journal_mode = OFF;");
DAD>//db.exec_dml("PRAGMA locking_mode = EXCLUSIVE;");
DAD>db.exec_dml("PRAGMA temp_store = MEMORY;");
DAD>//db.exec_dml("PRAGMA threads = 4;");


Here are some of the pragmas you reported this morning. Any place
you have "MEMORY" can consume most of your memory when doing a bunch of
transactions.

I'm going to guess you're still using some of these "use memory"
pragmas.
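A minimal sketch of the opposite, memory-bounding settings, using Python's built-in sqlite3 module (the in-memory database here is only a stand-in for the real file):

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the real 20GB database file

# Spill temporary tables/indices to disk instead of RAM, and cap the
# page cache at roughly 64 MB (a negative cache_size is in KiB, not pages).
db.execute("PRAGMA temp_store = FILE")
db.execute("PRAGMA cache_size = -64000")

print(db.execute("PRAGMA cache_size").fetchone()[0])   # -64000
print(db.execute("PRAGMA temp_store").fetchone()[0])   # 1 == FILE
```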


Sunday, October 2, 2016, 4:35:23 PM, you wrote:

DAD> Hello !

DAD> I'm still fighting to use sqlite with a 20GB database and now I'm trying
DAD> to update a table with 980M rows, but sqlite is eating all my memory
DAD> (2GB) and blocking my computer (too much swap).

DAD> I'm in doubt whether sqlite can really manage databases above 1GB.

DAD> Does anyone have any experience with big databases ?

DAD> Cheers !

DAD> The query:

DAD> --explain query plan
DAD> update bolsas_familia set favorecido_id=(select id from favorecidos 
DAD> where nis=favorecido_id);

DAD> Query plan:

DAD> selectid|order|from|detail
DAD> 0|0|0|SCAN TABLE bolsas_familia
DAD> 0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
DAD> 0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX favorecidos_nis_idx
DAD> (nis=?)

DAD> The schema is:

DAD> CREATE TABLE bolsas_familia(
DAD>  id integer primary key,
DAD>  favorecido_id integer not null,
DAD>  valor_parcela numeric,
DAD>  mes_competencia integer
DAD> ); //980M rows

DAD> CREATE TABLE "favorecidos"(
DAD>  id integer primary key,
DAD>  nis integer,
DAD>  nome varchar collate nocase_slna,
DAD>  municipio_id integer,
DAD>  programa_id integer not null,
DAD>  fonte_finalidade_id integer not null,
DAD>  first_ano_mes integer
DAD> ); //14M rows

DAD> CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);


DAD> Cheers !

DAD> ___
DAD> sqlite-users mailing list
DAD> sqlite-users@mailinglists.sqlite.org
DAD> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



-- 
 Teg    mailto:t...@djii.com



Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Rob Willett
We tend to use the database append-only. We delete some small transient 
data as we go; this transient data maintains a bit of state between 
appends. We're talking 300-500 rows of a table.


It's not read-only; we add a lot of data every 3-5 minutes, but it's all 
driven from a single Perl process that captures XML data, processes it 
as only Perl can, and then updates the database. Nothing updates the 
database between those time intervals. No foreign keys at all.


The schema is quite long but rather simplistic. There are lots of tables 
all referenced by a big central table. Since a lot of our data is common 
and referenced thousands of times by different rows, we needed to reduce 
the amount of duplicated data, or else we would have tripled the size of 
the main database to over 100GB.


It's 1,200 lines long, so not really suitable for posting here, but the 
main table is:


-- 
--  Table structure for Disruptions
-- 
DROP TABLE IF EXISTS "Disruptions";
CREATE TABLE "Disruptions" (
 "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
 "version" integer NOT NULL,
 "Disruption_id" INTEGER NOT NULL,
 "status" integer NOT NULL,
 "severity" integer NOT NULL,
 "levelOfInterest" integer NOT NULL,
 "category" integer NOT NULL,
 "subCategory" integer NOT NULL,
 "startTime" TEXT NOT NULL,
 "endTime" text NOT NULL,
 "location" integer NOT NULL,
 "corridor" integer NOT NULL,
 "comments" integer NOT NULL,
 "currentUpdate" integer NOT NULL,
 "remarkTime" TEXT NOT NULL,
 "lastModTime" TEXT NOT NULL,
 "CauseAreaPointX" real NOT NULL,
 "CauseAreaPointY" real NOT NULL,
 "Direction" TEXT
);
INSERT INTO "main".sqlite_sequence (name, seq) VALUES ("Disruptions", 
'0');


All the integer fields are really indexes into other tables. We made a 
deliberate decision to move some of the more complex logic out of SQL 
into Perl, as it sat better there. This was no reflection on SQLite; we 
would have made the same decision with Postgres, MySQL, or Oracle. 
Right tool for the right job, etc.
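The pattern Rob describes, small integer keys pointing into lookup tables so common values are never repeated, might be sketched like this (the `categories` lookup table and its contents are hypothetical, not part of Rob's real schema):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE disruptions (id INTEGER PRIMARY KEY, category INTEGER NOT NULL);
INSERT INTO categories (name) VALUES ('roadworks'), ('accident');
-- Store the small integer key, not the repeated string.
INSERT INTO disruptions (category)
    SELECT id FROM categories WHERE name = 'roadworks';
""")

# Resolve the integer back to its text value with a join.
name = db.execute("""
    SELECT c.name
      FROM disruptions d JOIN categories c ON c.id = d.category
""").fetchone()[0]
print(name)  # roadworks
```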


Rob

On 2 Oct 2016, at 22:47, Domingo Alvarez Duarte wrote:


Hello Rob !

How do you use those big databases ?

Append only ? Read only ? Constant read write ? Foreign keys ?

Could you show the schema to talk about it ?

Cheers !



Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Simon Slavin

On 2 Oct 2016, at 10:47pm, Domingo Alvarez Duarte  wrote:

> How do you use those big databases ?

Just a note that it's not the big databases which are a problem.  Like another 
poster, I too use big databases in production -- 43GB in my case.  And that 
particular database resembles the one from the OP: it's mostly one very thin 
table with a huge number of rows.

The problem here is that the OP has an UPDATE command which updates a lot of 
rows, and they have only 2GB of cache space.  Since the UPDATE command is all 
one transaction, all the rows which are changed have to be held in the journal 
before any changes can be made to the actual database file.  Which means that 
SQLite is effectively copying the entire table.

To see if it helps, I have recommended that the OP change their journal mode.

Simon.


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Domingo Alvarez Duarte

Hello Rob !

How do you use those big databases ?

Append only ? Read only ? Constant read write ? Foreign keys ?

Could you show the schema to talk about it ?

Cheers !


On 02/10/16 17:44, Rob Willett wrote:
We have production databases with 30-50GB and have no issues with 
managing them. Mind you, we have more memory, though we only have 8GB 
as that's all our VM provider allows us.


After saying that, we have never seen any performance issues that 
weren't due to our schemas or our bad design. Our working assumption 
is that if there's a problem, it's our fault before we blame SQLite. So 
far that working assumption has held true :)


We did some tests (just for the hell of it) to 200GB with dummy data 
and had no issues I can recall.


I know that other people on this list have far, far bigger production 
databases than us.


I'm not a SQLite expert but I look on machines with 2GB of main memory 
as rather small for this sort of thing.


Rob



Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Domingo Alvarez Duarte

Hello Richard !

On 02/10/16 18:02, Richard Hipp wrote:

For your specific UPDATE, it seems like SQLite ought to be able to do
it all in one pass, and thus avoid using all that memory to store the
rowids.  This looks like an opportunity to improve the query planner.


That would be nice. I also noticed that sqlite seems to do a delete then 
an insert for updates, even when it would be possible to update in place; 
with tables with several fields and indexes this seems to be a waste of work.


Cheers !



Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Richard Hipp
On 10/2/16, Domingo Alvarez Duarte  wrote:
> Hello !
>
> I'm still fighting to use sqlite with a 20GB database and now I'm trying
> to update a table with 980M rows, but sqlite is eating all my memory
> (2GB) and blocking my computer (too much swap).
>
> I'm in doubt whether sqlite can really manage databases above 1GB.

For some (many) UPDATE statements, SQLite has to make two passes.  On
the first pass, it determines the ROWID of every row in the table that
needs to be changed, and then on the second pass it updates those
rows.

You are trying to update 980 million rows all in one go, and SQLite
requires 8 bytes for each row, so that is roughly 7.8 GB of RAM right
there, just to hold all of the rowids.
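Checking that arithmetic for the 980-million-row table in the original post (a rough lower bound; the real spool carries more than just the rowids):

```python
rows = 980_000_000       # rows in bolsas_familia
bytes_per_rowid = 8      # one 64-bit rowid per affected row
gib = rows * bytes_per_rowid / 2**30
print(f"{gib:.1f} GiB just for the rowid list")
```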

For your specific UPDATE, it seems like SQLite ought to be able to do
it all in one pass, and thus avoid using all that memory to store the
rowids.  This looks like an opportunity to improve the query planner.
-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Keith Medcalf

You do realize that your UPDATE is spooling an update to 980 million rows
  (980 million integer updates means the update spool needs to contain
   980 million 64-bit integer primary keys plus 980 million updates,
   which is probably at LEAST 4GB of temporary data)
  -- even if the correlated subquery cannot be satisfied (resulting in NULL)
  -- even if the update does not need to be performed?
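One hedged way to avoid spooling the no-op rows Keith describes is to restrict the UPDATE to rows that actually have a match; a sketch against a toy copy of the posted schema (not tested at 980M-row scale):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE favorecidos (id INTEGER PRIMARY KEY, nis INTEGER);
CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);
CREATE TABLE bolsas_familia (id INTEGER PRIMARY KEY,
                             favorecido_id INTEGER NOT NULL);
INSERT INTO favorecidos (id, nis) VALUES (1, 100), (2, 200);
INSERT INTO bolsas_familia (id, favorecido_id) VALUES (1, 100), (2, 999);
""")

# Only rows with a matching nis are updated; row 2 (favorecido_id 999 has
# no match) is skipped instead of being set to NULL, which would also
# violate the NOT NULL constraint.
db.execute("""
UPDATE bolsas_familia
   SET favorecido_id = (SELECT id FROM favorecidos WHERE nis = favorecido_id)
 WHERE EXISTS (SELECT 1 FROM favorecidos WHERE nis = favorecido_id)
""")

updated = db.execute(
    "SELECT favorecido_id FROM bolsas_familia ORDER BY id").fetchall()
print(updated)  # [(1,), (999,)]
```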

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Domingo Alvarez Duarte
> Sent: Sunday, 2 October, 2016 14:35
> To: SQLite mailing list
> Subject: [sqlite] Using too much memory to execute an update query
> 
> Hello !
> 
> I'm still fighting to use sqlite with a 20GB database and now I'm trying
> to update a table with 980M rows, but sqlite is eating all my memory
> (2GB) and blocking my computer (too much swap).
> 
> I'm in doubt whether sqlite can really manage databases above 1GB.
> 
> Does anyone have any experience with big databases ?
> 
> Cheers !
> 
> The query:
> 
> --explain query plan
> update bolsas_familia set favorecido_id=(select id from favorecidos
> where nis=favorecido_id);
> 
> Query plan:
> 
> selectid|order|from|detail
> 0|0|0|SCAN TABLE bolsas_familia
> 0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
> 0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX favorecidos_nis_idx
> (nis=?)
> 
> The schema is:
> 
> CREATE TABLE bolsas_familia(
>  id integer primary key,
>  favorecido_id integer not null,
>  valor_parcela numeric,
>  mes_competencia integer
> ); //980M rows
> 
> CREATE TABLE "favorecidos"(
>  id integer primary key,
>  nis integer,
>  nome varchar collate nocase_slna,
>  municipio_id integer,
>  programa_id integer not null,
>  fonte_finalidade_id integer not null,
>  first_ano_mes integer
> ); //14M rows
> 
> CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);
> 
> 
> Cheers !
> 


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Clemens Ladisch
Domingo Alvarez Duarte wrote:
> I'm still fighting to use sqlite with a 20GB database and now I'm
> trying to update a table with 980M rows, but sqlite is eating all my
> memory (2GB) and blocking my computer (too much swap).

IIRC SQLite keeps a list of (the IDs of) all rows to be updated.

Try making the updates in smaller chunks (WHERE ... BETWEEN ...).
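A sketch of that chunked approach with Python's sqlite3 and toy data; on the real table the chunk size would be something like a million rows, and each chunk gets its own transaction so the journal and rowid list stay small:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE favorecidos (id INTEGER PRIMARY KEY, nis INTEGER);
CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);
CREATE TABLE bolsas_familia (id INTEGER PRIMARY KEY,
                             favorecido_id INTEGER NOT NULL);
INSERT INTO favorecidos (id, nis) VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO bolsas_familia (id, favorecido_id)
    VALUES (1, 10), (2, 20), (3, 30), (4, 30);
""")

CHUNK = 2  # e.g. 1_000_000 on the real table
max_id = db.execute("SELECT max(id) FROM bolsas_familia").fetchone()[0]
for lo in range(1, max_id + 1, CHUNK):
    # Each chunk is its own transaction, bounding memory and journal growth.
    with db:
        db.execute("""
            UPDATE bolsas_familia
               SET favorecido_id =
                   (SELECT id FROM favorecidos WHERE nis = favorecido_id)
             WHERE id BETWEEN ? AND ?
        """, (lo, lo + CHUNK - 1))

result = [r[0] for r in
          db.execute("SELECT favorecido_id FROM bolsas_familia ORDER BY id")]
print(result)  # [1, 2, 3, 3]
```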


Regards,
Clemens


Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Rob Willett
We have production databases with 30-50GB and have no issues with 
managing them. Mind you, we have more memory, though we only have 8GB as 
that's all our VM provider allows us.


After saying that, we have never seen any performance issues that weren't 
due to our schemas or our bad design. Our working assumption is that if 
there's a problem, it's our fault before we blame SQLite. So far that 
working assumption has held true :)


We did some tests (just for the hell of it) to 200GB with dummy data and 
had no issues I can recall.


I know that other people on this list have far, far bigger production 
databases than us.


I'm not a SQLite expert but I look on machines with 2GB of main memory 
as rather small for this sort of thing.


Rob

On 2 Oct 2016, at 21:35, Domingo Alvarez Duarte wrote:


Hello !

I'm still fighting to use sqlite with a 20GB database and now I'm 
trying to update a table with 980M rows, but sqlite is eating all my 
memory (2GB) and blocking my computer (too much swap).


I'm in doubt whether sqlite can really manage databases above 1GB.

Does anyone have any experience with big databases ?

Cheers !

The query:

--explain query plan
update bolsas_familia set favorecido_id=(select id from favorecidos 
where nis=favorecido_id);


Query plan:

selectid|order|from|detail
0|0|0|SCAN TABLE bolsas_familia
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX 
favorecidos_nis_idx (nis=?)


The schema is:

CREATE TABLE bolsas_familia(
id integer primary key,
favorecido_id integer not null,
valor_parcela numeric,
mes_competencia integer
); //980M rows

CREATE TABLE "favorecidos"(
id integer primary key,
nis integer,
nome varchar collate nocase_slna,
municipio_id integer,
programa_id integer not null,
fonte_finalidade_id integer not null,
first_ano_mes integer
); //14M rows

CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);

Cheers !



Re: [sqlite] Using too much memory to execute an update query

2016-10-02 Thread Simon Slavin

On 2 Oct 2016, at 9:35pm, Domingo Alvarez Duarte  wrote:

> I'm still fighting to use sqlite with a 20GB database and now I'm trying to 
> update a table with 980M rows, but sqlite is eating all my memory (2GB) and 
> blocking my computer (too much swap).

I assume you are not using any unusual PRAGMAs and are not using 
multiprocessing while your update executes.  Please execute this command on the 
database (you can use the shell tool).

PRAGMA journal_mode

If the result is WAL then execute

PRAGMA journal_mode = TRUNCATE

otherwise execute

PRAGMA journal_mode = WAL

Then try your operation again.
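Simon's check-and-flip procedure, sketched with Python's sqlite3 against a throwaway database file (a real path is used because WAL mode is not available on in-memory databases):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.db")  # stand-in for the real file
db = sqlite3.connect(path)

mode = db.execute("PRAGMA journal_mode").fetchone()[0]
if mode.lower() == "wal":
    new_mode = db.execute("PRAGMA journal_mode = TRUNCATE").fetchone()[0]
else:
    new_mode = db.execute("PRAGMA journal_mode = WAL").fetchone()[0]
print(mode, "->", new_mode)  # a fresh database starts in delete mode
```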

Simon.