Re: [sqlite] performance difference based on the sqlite configuration?
On Tue, Jan 28, 2014 at 6:57 PM, veeresh kumar wrote: > I do have a similar issue. I am finding it hard which is > the best configuration for most of the large scale application. Our > database > size grows from 0 - 45 GB . As the database size grows, performance seems > to be > degrading. > > Major operations include insertion/read/delete > > Current settings: > > PRAGMA journal_mode = TRUNCATE > PRAGMA page_size;", lPageSize) > UINT64 ui64MaxPageCount = 53687091200 / lPageSize; //50 GB > > PRAGMA max_page_count = %I64u;", ui64MaxPageCount)); > > I am planning to change it to below settings to see if that > improves the performance. > > PRAGMA journal_mode = WAL > pragma page_size=4096 > pragma cache_size=16384 > PRAGMA wal_autocheckpoint=10 > > Setting wal_autocheckpoint to 10 mean that the data gets committed > to the disk after it reaches 100 MB , Am I right ? No. the autocheckpoint threshold is the number of *pages* in the WAL file. If you have a 4K page, then the WAL file will grow to 400MB before the automatic checkpoint occurs. > If the machine crashes or > power goes off during this time, 100 MB data is lost? No. The content of the WAL file will be rolled forward automatically the next time any application opens the database (after the machine reboots). If the last one or two transactions failed to be completely written into the WAL file, then those transactions will be lost. If you are in the middle of a transaction which has not yet committed, that transaction will certainly be lost. If you set PRAGMA synchronous=FULL then fsync() will be called on the WAL file after every transaction to make sure it is completely flushed to disk. This will slow down writes, but will ensure that all content is recoverable if the machine crashes. The default value for synchronous in WAL mode is NORMAL, which means that fsync() is only called when a checkpoint occurs. This is much faster, but you risk having the last one or two transactions rollback if the machine crashes. > Any API calls which would > commit the data to disk forcibly? > > Let me know if the above settings would improve the > application performance or something needs to be configured. > > Thanks > > -Veeresh > > > -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto: > sqlite-users-boun...@sqlite.org] On > Behalf Of Richard Hipp > Sent: Sunday, January 19, 2014 9:37 AM > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Dramatic performance difference between different > "PRAGMA synchronous" settings for bulk inserts > > In WAL mode with synchronous=NORMAL, SQLite only syncs > (FlushFileBuffers() on windows) when it does a checkpoint operation. > Checkpoints should be happening automatically > whenever the WAL file exceeds about 1MB in size. > > For an 8GB database, probably there are about 8000 sync > operations, therefore. If each takes > about 2 seconds, that would pretty much account for the extra 4.5 hours. > > If you are creating a new database from scratch, it is > safe to set synchronous=OFF. If you lose > power in the middle, your database file will probably be corrupt, but > since you > were creating it from scratch you can easily recover just be starting the > database creation process over again from the beginning. > > If you want to try running with synchronous=NORMAL, you > might try setting PRAGMA wal_autocheckpoint=10; (from the default of > 1000) > which will make for dramatically larger WAL files, but also dramatically > fewer > syncs. > Then the syncs will use just 5 or 6 minutes instead of > 4.5 hours. Hopefully. > > > > On Sun, Jan 19, 2014 at 9:00 AM, Mario M. Westphal wrote: > > > I have a performance effect which I don't quite > understand. > > Maybe I'm using the wrong settings or something. > Sorry for the long > > post, but I wanted to include all the info that may > be important. > > > > My software is written in C++, runs on Windows 7/8, > the SQLite > > database file is either on a local SATA RAID disk or > a SSD. > > Typical database sizes are between 2 GB and 8 GB. > > The largest tables hold several million entries. > Also FTS4 is used, > > which also creates large tables. > > Fast internal RAID disks, SDD. Four Xeon cores. 8 GB > RAM. > > > > I'm using SQLite 3.8.0.2 > > WAL mode, shared cache enabled. > > locking_mode=NORMAL > > checkpoint_fullfsync=0 > > pragma page_size=4096 > > pragma cache_size=16384 > > > > General (retrieval) performance is excellent! > > > > > > During an ingest phase, my application pumps in > hundreds of thousands > > of records into multiple tables. > > There are massive amounts of writes during that > phase, different > > record sizes, tables with one to four indices etc. > > > > My application is multi-threaded and inserts data > into the database > > concurrently from multiple threads. > > The threads process data in batches, and use SQLite > transactions to > > process all records of a batch into the databas
[sqlite] Performance based on Sqlite Configuration
I am finding it hard which is the best configuration for most of the large scale application. Our database size grows from 0 – 45 GB . As the database size grows, performance seems to be degrading. Performance of the same application is better when it compared to Sql Server. I am in the middle of identifying the bottle neck and first thing I am looking for is the configuration. Current settings: PRAGMA journal_mode = TRUNCATE PRAGMA page_size;", lPageSize) UINT64 ui64MaxPageCount = 53687091200 / lPageSize; PRAGMA max_page_count = %I64u;", ui64MaxPageCount)); I am planning to change to below settings to see if that improves the performance. PRAGMA journal_mode = WAL PRAGMA page_size=4096 PRAGMA cache_size=16384 PRAGMA synchronous=NORMAL PRAGMA wal_autocheckpoint=10 Setting wal_autocheckpoint to 10 mean that the data gets committed to the disk after it reaches 100 MB , Am I right ? If the machine crashes or power goes off during this time, 100 MB data is lost? Any API calls which would commit the data to disk forcibly? Let me know if the above settings would improve the application performance or something needs to be configured. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] performance difference based on the sqlite configuration?
I do have a similar issue. I am finding it hard which is the best configuration for most of the large scale application. Our database size grows from 0 – 45 GB . As the database size grows, performance seems to be degrading. Major operations include insertion/read/delete Current settings: PRAGMA journal_mode = TRUNCATE PRAGMA page_size;", lPageSize) UINT64 ui64MaxPageCount = 53687091200 / lPageSize; //50 GB PRAGMA max_page_count = %I64u;", ui64MaxPageCount)); I am planning to change it to below settings to see if that improves the performance. PRAGMA journal_mode = WAL pragma page_size=4096 pragma cache_size=16384 PRAGMA wal_autocheckpoint=10 Setting wal_autocheckpoint to 10 mean that the data gets committed to the disk after it reaches 100 MB , Am I right ? If the machine crashes or power goes off during this time, 100 MB data is lost? Any API calls which would commit the data to disk forcibly? Let me know if the above settings would improve the application performance or something needs to be configured. Thanks -Veeresh -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Richard Hipp Sent: Sunday, January 19, 2014 9:37 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts In WAL mode with synchronous=NORMAL, SQLite only syncs (FlushFileBuffers() on windows) when it does a checkpoint operation. Checkpoints should be happening automatically whenever the WAL file exceeds about 1MB in size. For an 8GB database, probably there are about 8000 sync operations, therefore. If each takes about 2 seconds, that would pretty much account for the extra 4.5 hours. If you are creating a new database from scratch, it is safe to set synchronous=OFF. If you lose power in the middle, your database file will probably be corrupt, but since you were creating it from scratch you can easily recover just be starting the database creation process over again from the beginning. If you want to try running with synchronous=NORMAL, you might try setting PRAGMA wal_autocheckpoint=10; (from the default of 1000) which will make for dramatically larger WAL files, but also dramatically fewer syncs. Then the syncs will use just 5 or 6 minutes instead of 4.5 hours. Hopefully. On Sun, Jan 19, 2014 at 9:00 AM, Mario M. Westphal wrote: > I have a performance effect which I don't quite understand. > Maybe I'm using the wrong settings or something. Sorry for the long > post, but I wanted to include all the info that may be important. > > My software is written in C++, runs on Windows 7/8, the SQLite > database file is either on a local SATA RAID disk or a SSD. > Typical database sizes are between 2 GB and 8 GB. > The largest tables hold several million entries. Also FTS4 is used, > which also creates large tables. > Fast internal RAID disks, SDD. Four Xeon cores. 8 GB RAM. > > I'm using SQLite 3.8.0.2 > WAL mode, shared cache enabled. > locking_mode=NORMAL > checkpoint_fullfsync=0 > pragma page_size=4096 > pragma cache_size=16384 > > General (retrieval) performance is excellent! > > > During an ingest phase, my application pumps in hundreds of thousands > of records into multiple tables. > There are massive amounts of writes during that phase, different > record sizes, tables with one to four indices etc. > > My application is multi-threaded and inserts data into the database > concurrently from multiple threads. > The threads process data in batches, and use SQLite transactions to > process all records of a batch into the database. Transactions gain a > lot of speed, which outweighs the side effects of potential blocking. > The threads monitor the execution times of the database operations and > adjust the batch size to balance speed and transaction lock duration. > Slower > operations cause smaller batches, which results in shorter database > locks and better concurrency. The system adapts fairly well to system > performance and data structure. > > The performance was not that bad, but far from good. > > For a given set of input data (100,000 "elements"), the execution > estimate was about 5 hours. > Database on a high-speed SSD. > The largest table holds about 5 million entries afterwards. > > ***With one single change*** I improved the execution time from 5 > hours down to about 30 minutes! > > I changed > > PRAGMA synchronous=NORMAL > > to > > PRAGMA synchronous=OFF > > Also all other database write operations just 'fly' now. > I'm even more impressed with SQLite than before, but I wonder why is > the change so _dramatic_ ? > > From the docs my impression was that using WAL mode is ideal for bulk > inserts. That wrapping large bulks of data into smaller batches, > wrapped in BEGIN/COMMT is best for performance etc. That using > synchronous=NORMAL limits the file system flush/wa
Re: [sqlite] (no subject)
On Tue, 1/28/14, Igor Tandetnik wrote: Subject: Re: [sqlite] (no subject) To: sqlite-users@sqlite.org Date: Tuesday, January 28, 2014, 2:41 PM On 1/28/2014 2:26 PM, David Bicking wrote: > I have two tables: > > ARB > KEY (PRIMARY KEY) > ASSIGN (NOT NECESSARILY UNIQUE) > > DMC > KEY (NOT UNIQUE) > ASSIGN (NOT UNIQUE) > VALUE > > I need to report all the records from ARB, and sum up the values if the keys match OR if the keys don't match, then sum up the values where the ASSIGN matches, but only if the ASSIGN is unique in ARB. > > SELECT ARB.KEY > , ARB.ASSIGN > , COALESCE((SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.KEY = ARB.KEY) > , (SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.ASSIGN = ARB.ASSIGN AND ...), 0) > FROM ARB; > > I can't think of >what to put after that AND >Something like this perhaps: >and 1 = (select count(*) from >ARB t2 where t2.ASSIGN=ARB.ASSIGN) > Igor Tandetnik Thanks. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] (no subject)
On 1/28/2014 2:26 PM, David Bicking wrote: I have two tables: ARB KEY (PRIMARY KEY) ASSIGN (NOT NECESSARILY UNIQUE) DMC KEY (NOT UNIQUE) ASSIGN (NOT UNIQUE) VALUE I need to report all the records from ARB, and sum up the values if the keys match OR if the keys don't match, then sum up the values where the ASSIGN matches, but only if the ASSIGN is unique in ARB. SELECT ARB.KEY , ARB.ASSIGN , COALESCE((SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.KEY = ARB.KEY) , (SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.ASSIGN = ARB.ASSIGN AND ...), 0) FROM ARB; I can't think of what to put after that AND Something like this perhaps: and 1 = (select count(*) from ARB t2 where t2.ASSIGN=ARB.ASSIGN) -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] (no subject)
I have two tables: ARB KEY (PRIMARY KEY) ASSIGN (NOT NECESSARILY UNIQUE) DMC KEY (NOT UNIQUE) ASSIGN (NOT UNIQUE) VALUE I need to report all the records from ARB, and sum up the values if the keys match OR if the keys don't match, then sum up the values where the ASSIGN matches, but only if the ASSIGN is unique in ARB. SELECT ARB.KEY , ARB.ASSIGN , COALESCE((SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.KEY = ARB.KEY) , (SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.ASSIGN = ARB.ASSIGN AND ...), 0) FROM ARB; I can't think of what to put after that AND I don't think it matters, but I simplified things up there. DMC is actually a view of a table with KEY/ASSIGN and another with KEY/VALUE. Any help is appreciated. Thanks, David ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How should I use parenthesis?
On 1/27/2014, 9:57 AM, Jean-Christophe Deschamps wrote: I'm trying to find the correct syntaxt for this, but I hit a syntax error each time: either SQLite shokes on outer parenthesis or on union all. Try something like this, which is a minimal change from yours: (select * from (select * from A where x in (subselectA)) dx left outer join (select * from B where y in (subselectB)) dy using (...) ) union all (select * from (select * from B where y in (subselectC)) dx left outer join (select * from A where x in (subselectD)) dy using (...) ) ... but replace the "using (...)" with a join condition saying which fields you want to be used for matching in the join, and also replace the "select *" with a specific list of fields you want to match up for the union. -- Darren Duncan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How should I use parenthesis?
Perhaps you want: select * from ( select * from A where x in (subselectA) left outer join ( select * from B where y in (subselectB) ) as a on -- something ) as A1 UNION ALL select * from ( select * from B where y in (subselectC) left outer join ( select * from A where x in (subselectD) ) as b on -- something ) as B1 JOIN takes tables (or table-like objects) as operands. SELECT does not yield a table unless it's wrapped in parenthesis. (I generally put UNION in uppercase to make it stand out, lest on a quick scan it seem like two separate statements.) Yes that's it. In fact the actual statement is much more complex than the sketch I typed, where I forgot to type the where condition and the outer select. And I got lost in nesting the parenthesis when "simplifying" (i.e. emasculating) my own example. Thanks for the answers and sorry for the noise. I now have both eyes wide open... ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How should I use parenthesis?
Select A.*,B.* from A left outer join B on x in (subselectA) and y in (subselectB) Union all Select A.*,B.* from B left outer join A on x in (subselectD) and y in (subselectC) It seems strange that there is no condition limiting which rows from A and B match, which makes the LEFT OUTER JOIN rather pointless... -Ursprüngliche Nachricht- Von: Jean-Christophe Deschamps [mailto:j...@antichoc.net] Gesendet: Montag, 27. Jänner 2014 18:57 An: sqlite-users@sqlite.org Betreff: [sqlite] How should I use parenthesis? Dear list, I'm trying to find the correct syntaxt for this, but I hit a syntax error each time: either SQLite shokes on outer parenthesis or on union all. ( select * from A where x in (subselectA) left outer join select * from B where y in (subselectB) ) union all ( -- <-- error select * from B where y in (subselectC) left outer join select * from A where x in (subselectD) ) Union [all], except, intersect don't seem to accept parenthesis around the left or right parts. Of course a workaround could be to create two views then union all them, but there must be a way to express this construct in a single statement. I don't want the statement to be interpreted this way: select * from A where x in (subselectA) left outer join ( select * from B where y in (subselectB) union all select * from B where y in (subselectC) ) left outer join select * from A where x in (subselectD) -- jcd ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- Gunter Hick Software Engineer Scientific Games International GmbH Klitschgasse 2 – 4, A - 1130 Vienna, Austria FN 157284 a, HG Wien Tel: +43 1 80100 0 E-Mail: h...@scigames.at This e-mail is confidential and may well also be legally privileged. If you have received it in error, you are on notice as to its status and accordingly please notify us immediately by reply e-mail and then delete this message from your system. Please do not copy it or use it for any purposes, or disclose its contents to any person as to do so could be a breach of confidence. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users