Re: [sqlite] Using too much memory to execute an update query
Hello Domingo,

DAD> db.exec_dml("PRAGMA synchronous = 0;");
DAD> db.exec_dml("PRAGMA journal_mode = WAL");
DAD> //db.exec_dml("PRAGMA journal_mode = MEMORY;");
DAD> //db.exec_dml("PRAGMA journal_mode = OFF;");
DAD> //db.exec_dml("PRAGMA locking_mode = EXCLUSIVE;");
DAD> db.exec_dml("PRAGMA temp_store = MEMORY;");
DAD> //db.exec_dml("PRAGMA threads = 4;");

Here are some of the pragmas you reported this morning. Any place you have "MEMORY" can consume most of your memory when doing a bunch of transactions. I'm going to guess you're still using some of these "use memory" pragmas.

Sunday, October 2, 2016, 4:35:23 PM, you wrote:

DAD> Hello !
DAD>
DAD> I'm still fighting to use sqlite with a 20GB database and now I'm trying
DAD> to update a table with 980M rows but sqlite is eating all my memory
DAD> (2GB) and blocking my computer (too much swap).
DAD>
DAD> I'm in doubt if sqlite can really manage databases above 1GB.
DAD>
DAD> Does anyone have experience with big databases ?
DAD>
DAD> Cheers !
DAD>
DAD> The query:
DAD>
DAD> --explain query plan
DAD> update bolsas_familia set favorecido_id=(select id from favorecidos
DAD> where nis=favorecido_id);
DAD>
DAD> Query plan:
DAD>
DAD> selectid|order|from|detail
DAD> 0|0|0|SCAN TABLE bolsas_familia
DAD> 0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
DAD> 0|0|0|SEARCH TABLE favorecidos USING COVERING INDEX favorecidos_nis_idx
DAD> (nis=?)
DAD>
DAD> The schema is:
DAD>
DAD> CREATE TABLE bolsas_familia(
DAD>   id integer primary key,
DAD>   favorecido_id integer not null,
DAD>   valor_parcela numeric,
DAD>   mes_competencia integer
DAD> ); //980M rows
DAD>
DAD> CREATE TABLE "favorecidos"(
DAD>   id integer primary key,
DAD>   nis integer,
DAD>   nome varchar collate nocase_slna,
DAD>   municipio_id integer,
DAD>   programa_id integer not null,
DAD>   fonte_finalidade_id integer not null,
DAD>   first_ano_mes integer
DAD> ); //14M rows
DAD>
DAD> CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);
DAD>
DAD> Cheers !
-- 
Teg  mailto:t...@djii.com

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
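Ted's warning about the MEMORY-backed pragmas can be illustrated with Python's sqlite3 module. This is only a sketch: the in-memory database stands in for the OP's real 20GB file, `exec_dml` is the OP's own wrapper (not shown), and the cache-size figure is an illustrative choice, not a recommendation from the thread.

```python
import sqlite3

# Stand-in database; substitute the real 20GB file in practice.
conn = sqlite3.connect(":memory:")

# Disk-backed counterparts of the "use memory" pragmas, keeping RAM bounded:
conn.execute("PRAGMA temp_store = FILE")     # spill temporary data to disk
conn.execute("PRAGMA cache_size = -200000")  # cap the page cache at ~200 MB

# Confirm what is actually in effect (temp_store: 0=DEFAULT, 1=FILE, 2=MEMORY).
print(conn.execute("PRAGMA temp_store").fetchone()[0])   # 1
print(conn.execute("PRAGMA cache_size").fetchone()[0])   # -200000
```

A negative `cache_size` is interpreted by SQLite as a limit in kibibytes rather than pages, which is a convenient way to cap memory on a small machine.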
Re: [sqlite] Using too much memory to execute an update query
We tend to use append only. We delete some small transient data as we use it; this transient data maintains a bit of state between appends. We're talking 300-500 rows of a table. It's not read only -- we add a lot of data every 3-5 minutes -- but it's all driven from a single Perl process that captures XML data, processes it as only Perl can, and then updates the database. Nothing updates the database between those time intervals. No foreign keys at all.

The schema is quite long but rather simplistic. There are lots of tables all referenced by a big central table; since a lot of our data is common and referenced thousands of times by different rows, we needed to reduce the amount of data or else we would have tripled the size of the main database to over 100GB. It's 1,200 lines long so not really suitable for posting here, but the main table is:

--
-- Table structure for Disruptions
--
DROP TABLE IF EXISTS "Disruptions";
CREATE TABLE "Disruptions" (
  "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
  "version" integer NOT NULL,
  "Disruption_id" INTEGER NOT NULL,
  "status" integer NOT NULL,
  "severity" integer NOT NULL,
  "levelOfInterest" integer NOT NULL,
  "category" integer NOT NULL,
  "subCategory" integer NOT NULL,
  "startTime" TEXT NOT NULL,
  "endTime" text NOT NULL,
  "location" integer NOT NULL,
  "corridor" integer NOT NULL,
  "comments" integer NOT NULL,
  "currentUpdate" integer NOT NULL,
  "remarkTime" TEXT NOT NULL,
  "lastModTime" TEXT NOT NULL,
  "CauseAreaPointX" real NOT NULL,
  "CauseAreaPointY" real NOT NULL,
  "Direction" TEXT
);
INSERT INTO "main".sqlite_sequence (name, seq) VALUES ("Disruptions", '0');

All the integer fields are really indexes into other tables. We made a deliberate decision to move some of the more complex logic out of SQL into Perl as it sat better there. This was no reflection on SQLite; we would have made the same decision with Postgres or MySQL or Oracle. Right tool for the right job, etc.
Rob

On 2 Oct 2016, at 22:47, Domingo Alvarez Duarte wrote:

Hello Rob !

How do you use those big databases ? Append only ? Read only ? Constant read write ? Foreign keys ?

Could you show the schema to talk about it ?

Cheers !
[...]
Re: [sqlite] Using too much memory to execute an update query
On 2 Oct 2016, at 10:47pm, Domingo Alvarez Duarte wrote:

> How do you use those big databases ?

Just a note that it's not the big databases which are a problem. Like another poster, I too use big databases in production -- 43GB in my case. And that particular database resembles the one from the OP: it's mostly one very thin table with a huge number of rows.

The problem here is that the OP has an UPDATE command which updates a lot of rows, and they have only 2GB of cache space. Since the UPDATE command is all one transaction, all the rows which are changed have to be held in the journal before any changes can be made to the actual database file. Which means that SQLite is effectively copying the entire table. To see if it helps, I have recommended that the OP change their journal mode.

Simon.
Re: [sqlite] Using too much memory to execute an update query
Hello Rob !

How do you use those big databases ? Append only ? Read only ? Constant read write ? Foreign keys ?

Could you show the schema to talk about it ?

Cheers !

On 02/10/16 17:44, Rob Willett wrote:
> We have production databases with 30-50GB and have no issues with
> managing them. [...]
Re: [sqlite] Using too much memory to execute an update query
Hello Richard !

On 02/10/16 18:02, Richard Hipp wrote:
> For your specific UPDATE, it seems like SQLite ought to be able to do
> it all in one pass, and thus avoid using all that memory to store the
> rowids. This looks like an opportunity to improve the query planner.

That would be nice. I also noticed that SQLite seems to do a delete then an insert for updates, even when it would be possible to update in place; with tables with several fields and indices this seems to be a waste of work.

Cheers !
Re: [sqlite] Using too much memory to execute an update query
On 10/2/16, Domingo Alvarez Duarte wrote:
> Hello !
>
> I'm still fighting to use sqlite with a 20GB database and now I'm trying
> to update a table with 980M rows but sqlite is eating all my memory
> (2GB) and blocking my computer (too much swap).
>
> I'm in doubt if sqlite can really manage databases above 1GB.

For some (many) UPDATE statements, SQLite has to make two passes. On the first pass, it determines the ROWID of every row in the table that needs to be changed, and then on the second pass it updates those rows. You are trying to update 900 million rows all in one go, and SQLite requires 8 bytes for each row, so that is 7.2 GB of RAM right there, just to hold all of the rowids.

For your specific UPDATE, it seems like SQLite ought to be able to do it all in one pass, and thus avoid using all that memory to store the rowids. This looks like an opportunity to improve the query planner.

-- 
D. Richard Hipp
d...@sqlite.org
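The rowid-spool arithmetic above (using the message's rounded 900 million figure) works out as a quick check:

```python
# Rough memory cost of the first UPDATE pass: one 64-bit rowid per row.
ROWS = 900_000_000      # rounded row count from the message (table has 980M)
BYTES_PER_ROWID = 8     # SQLite rowids are 64-bit integers

spool_bytes = ROWS * BYTES_PER_ROWID
print(spool_bytes / 1e9)  # 7.2 (GB), before counting the journal itself
```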
Re: [sqlite] Using too much memory to execute an update query
You do realize that your UPDATE is spooling an update to 980 million rows? 980 million integer updates mean the update spool needs to contain 980 million 64-bit integer primary keys plus 980 million updates, which is probably at LEAST 4GB of temporary data -- even if the correlated subquery cannot be satisfied (resulting in NULL), and even if the update does not need to be performed.

> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Domingo Alvarez Duarte
> Sent: Sunday, 2 October, 2016 14:35
> To: SQLite mailing list
> Subject: [sqlite] Using too much memory to execute an update query
>
> Hello !
>
> I'm still fighting to use sqlite with a 20GB database and now I'm trying
> to update a table with 980M rows but sqlite is eating all my memory
> (2GB) and blocking my computer (too much swap).
>
> I'm in doubt if sqlite can really manage databases above 1GB.
>
> The query:
>
> update bolsas_familia set favorecido_id=(select id from favorecidos
> where nis=favorecido_id);
> [...]
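Keith's point -- that every row is spooled even when the update would change nothing or set NULL -- suggests restricting the UPDATE with a WHERE clause. A sketch using Python's sqlite3 and tiny stand-in tables (the data is invented for illustration, not the OP's 20GB database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bolsas_familia(id INTEGER PRIMARY KEY, favorecido_id INTEGER NOT NULL);
CREATE TABLE favorecidos(id INTEGER PRIMARY KEY, nis INTEGER);
CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);
INSERT INTO favorecidos(id, nis) VALUES (1, 100), (2, 200);
INSERT INTO bolsas_familia(id, favorecido_id) VALUES (1, 100), (2, 200), (3, 999);
""")

# Only touch rows whose nis actually resolves; row 3 (favorecido_id=999,
# no match in favorecidos) is neither spooled nor overwritten with NULL.
conn.execute("""
UPDATE bolsas_familia
   SET favorecido_id = (SELECT id FROM favorecidos WHERE nis = favorecido_id)
 WHERE favorecido_id IN (SELECT nis FROM favorecidos)
""")
print(conn.execute(
    "SELECT id, favorecido_id FROM bolsas_familia ORDER BY id").fetchall())
# [(1, 1), (2, 2), (3, 999)]
```

Without the WHERE clause, the unmatched row would have its favorecido_id set to NULL and fail the NOT NULL constraint, so the filter also protects data integrity.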
Re: [sqlite] Using too much memory to execute an update query
Domingo Alvarez Duarte wrote:
> I'm still fighting to use sqlite with a 20GB database and now I'm
> trying to update a table with 980M rows but sqlite is eating all my
> memory (2GB) and blocking my computer (too much swap).

IIRC SQLite keeps a list of (the IDs of) all rows to be updated. Try making the updates in smaller chunks (WHERE ... BETWEEN ...).

Regards,
Clemens
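Clemens' chunking suggestion might look like the following, sketched with Python's sqlite3 and a tiny stand-in table; the chunk size and toy data are illustrative only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bolsas_familia(id INTEGER PRIMARY KEY, favorecido_id INTEGER NOT NULL);
CREATE TABLE favorecidos(id INTEGER PRIMARY KEY, nis INTEGER);
CREATE UNIQUE INDEX favorecidos_nis_idx ON favorecidos(nis);
""")
# Toy data: favorecidos maps nis = 10*i back to id = i.
rows = [(i, i * 10) for i in range(1, 101)]
conn.executemany("INSERT INTO favorecidos(id, nis) VALUES (?, ?)", rows)
conn.executemany("INSERT INTO bolsas_familia(id, favorecido_id) VALUES (?, ?)", rows)
conn.commit()

CHUNK = 25  # for 980M rows, something like 1-10 million per chunk is more realistic
max_id = conn.execute("SELECT max(id) FROM bolsas_familia").fetchone()[0]
for lo in range(1, max_id + 1, CHUNK):
    conn.execute("""
        UPDATE bolsas_familia
           SET favorecido_id = (SELECT id FROM favorecidos WHERE nis = favorecido_id)
         WHERE id BETWEEN ? AND ?""", (lo, lo + CHUNK - 1))
    conn.commit()  # each chunk is its own transaction, bounding the journal and rowid list

# Every favorecido_id now holds the matching favorecidos.id.
print(conn.execute(
    "SELECT count(*) FROM bolsas_familia WHERE favorecido_id = id").fetchone()[0])
# 100
```

Committing per chunk means a crash mid-way leaves some rows translated and some not, so on the real data each chunk's WHERE clause must make the update idempotent or track progress separately.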
Re: [sqlite] Using too much memory to execute an update query
We have production databases with 30-50GB and have no issues with managing them. Mind you, we have more memory, though we only have 8GB as that's all our VM provider allows us.

Having said that, we have never seen any performance issues that weren't due to our schemas or our bad design. Our working assumption is that if there's a problem, it's our fault before we blame SQLite. So far that working assumption has held true :) We did some tests (just for the hell of it) to 200GB with dummy data and had no issues I can recall. I know that other people on this list have far, far bigger production databases than us.

I'm not a SQLite expert, but I look on machines with 2GB of main memory as rather small for this sort of thing.

Rob

On 2 Oct 2016, at 21:35, Domingo Alvarez Duarte wrote:

Hello !

I'm still fighting to use sqlite with a 20GB database and now I'm trying to update a table with 980M rows but sqlite is eating all my memory (2GB) and blocking my computer (too much swap).

I'm in doubt if sqlite can really manage databases above 1GB.

Does anyone have experience with big databases ?

Cheers !

The query:

update bolsas_familia set favorecido_id=(select id from favorecidos
where nis=favorecido_id);
[...]
Re: [sqlite] Using too much memory to execute an update query
On 2 Oct 2016, at 9:35pm, Domingo Alvarez Duarte wrote:

> I'm still fighting to use sqlite with a 20GB database and now I'm trying to
> update a table with 980M rows but sqlite is eating all my memory (2GB) and
> blocking my computer (too much swap).

I assume you are not using any unusual PRAGMAs and are not using multiprocessing while your update executes. Please execute this command on the database (you can use the shell tool):

PRAGMA journal_mode

If the result is WAL, then execute

PRAGMA journal_mode = TRUNCATE

otherwise execute

PRAGMA journal_mode = WAL

Then try your operation again.

Simon.
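Simon's check-and-switch recipe, sketched with Python's sqlite3. The throwaway file path is an assumption of the sketch; WAL needs a real file, so an in-memory database would not do here.

```python
import os
import sqlite3
import tempfile

# WAL requires an on-disk database, so use a throwaway file for the sketch.
path = os.path.join(tempfile.mkdtemp(), "test.db")
conn = sqlite3.connect(path)

# Check the current mode, then flip between WAL and a rollback-journal
# mode, as suggested. (A fresh database starts in "delete" mode.)
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
target = "truncate" if mode == "wal" else "wal"
new_mode = conn.execute("PRAGMA journal_mode = %s" % target).fetchone()[0]
print(mode, "->", new_mode)  # delete -> wal
```

Note that `PRAGMA journal_mode = ...` returns the mode actually in effect, which may differ from the one requested (e.g. inside a transaction), so checking the returned value rather than assuming success is worthwhile.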