Re: [sqlite] performance difference based on the sqlite configuration?

2014-01-28 Thread Richard Hipp
On Tue, Jan 28, 2014 at 6:57 PM, veeresh kumar  wrote:

> I do have a similar issue. I am finding it hard which is
> the best configuration for most of the large scale application. Our
> database
> size grows from 0 - 45 GB . As the database size grows, performance seems
> to be
> degrading.
>
> Major operations include insertion/read/delete
>
> Current settings:
>
> PRAGMA journal_mode = TRUNCATE
> PRAGMA page_size;", lPageSize)
> UINT64 ui64MaxPageCount = 53687091200 / lPageSize; //50 GB
>
> PRAGMA max_page_count = %I64u;", ui64MaxPageCount));
>
> I am planning to change it to below settings to see if that
> improves the performance.
>
> PRAGMA journal_mode = WAL
> pragma page_size=4096
> pragma cache_size=16384
> PRAGMA wal_autocheckpoint=10
>
> Setting wal_autocheckpoint  to 10 mean that the data gets committed
> to the disk after it reaches 100 MB , Am I right ?


No.  the autocheckpoint threshold is the number of *pages* in the WAL
file.  If you have a 4K page, then the WAL file will grow to 400MB before
the automatic checkpoint occurs.



> If the machine crashes or
> power goes off during this time, 100 MB data is lost?


No.  The content of the WAL file will be rolled forward automatically the
next time any application opens the database (after the machine reboots).

If the last one or two transactions failed to be completely written into
the WAL file, then those transactions will be lost.  If you are in the
middle of a transaction which has not yet committed, that transaction will
certainly be lost.

If you set PRAGMA synchronous=FULL then fsync() will be called on the WAL
file after every transaction to make sure it is completely flushed to
disk.  This will slow down writes, but will ensure that all content is
recoverable if the machine crashes.  The default value for synchronous in
WAL mode is NORMAL, which means that fsync() is only called when a
checkpoint occurs.  This is much faster, but you risk having the last one
or two transactions rollback if the machine crashes.



> Any API calls which would
> commit the data to disk forcibly?
>
> Let me know if the above settings would improve the
> application performance or something needs to be configured.
>
> Thanks
>
> -Veeresh
>
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:
> sqlite-users-boun...@sqlite.org] On
> Behalf Of Richard Hipp
> Sent: Sunday, January 19, 2014 9:37 AM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Dramatic performance difference between different
> "PRAGMA synchronous" settings for bulk inserts
>
> In WAL mode with synchronous=NORMAL, SQLite only syncs
> (FlushFileBuffers() on windows) when it does a checkpoint operation.
> Checkpoints should be happening automatically
> whenever the WAL file exceeds about 1MB in size.
>
> For an 8GB database, probably there are about 8000 sync
> operations, therefore.  If each takes
> about 2 seconds, that would pretty much account for the extra 4.5 hours.
>
> If you are creating a new database from scratch, it is
> safe to set synchronous=OFF.  If you lose
> power in the middle, your database file will probably be corrupt, but
> since you
> were creating it from scratch you can easily recover just be starting the
> database creation process over again from the beginning.
>
> If you want to try running with synchronous=NORMAL, you
> might try setting PRAGMA wal_autocheckpoint=10; (from the default of
> 1000)
> which will make for dramatically larger WAL files, but also dramatically
> fewer
> syncs.
> Then the syncs will use just 5 or 6 minutes instead of
> 4.5 hours. Hopefully.
>
>
>
> On Sun, Jan 19, 2014 at 9:00 AM, Mario M. Westphal  wrote:
>
> > I have a performance effect which I don't quite
> understand.
> > Maybe I'm using the wrong settings or something.
> Sorry for the long
> > post, but I wanted to include all the info that may
> be important.
> >
> > My software is written in C++, runs on Windows 7/8,
> the SQLite
> > database file is either on a local SATA RAID disk or
> a SSD.
> > Typical database sizes are between 2 GB and 8 GB.
> > The largest tables hold several million entries.
> Also FTS4 is used,
> > which also creates large tables.
> > Fast internal RAID disks, SDD. Four Xeon cores. 8 GB
> RAM.
> >
> > I'm using SQLite 3.8.0.2
> > WAL mode, shared cache enabled.
> > locking_mode=NORMAL
> > checkpoint_fullfsync=0
> > pragma page_size=4096
> > pragma cache_size=16384
> >
> > General (retrieval) performance is excellent!
> >
> >
> > During an ingest phase, my application pumps in
> hundreds of thousands
> > of records into multiple tables.
> > There are massive amounts of writes during that
> phase, different
> > record sizes, tables with one to four indices etc.
> >
> > My application is multi-threaded and inserts data
> into the database
> > concurrently from multiple threads.
> > The threads process data in batches, and use SQLite
> transactions to
> > process all records of a batch into the databas

[sqlite] Performance based on Sqlite Configuration

2014-01-28 Thread veeresh kumar
I am finding it hard which is the best configuration for most
of the large scale application. Our database size grows from 0 – 45 GB . As the
database size grows, performance seems to be degrading. Performance of the same 
application is better when it compared to Sql Server.

I am in the middle of identifying the bottle neck and first thing I am looking 
for is the configuration. 

 
Current settings:
 
PRAGMA journal_mode = TRUNCATE
PRAGMA page_size;", lPageSize)
UINT64 ui64MaxPageCount = 53687091200 / lPageSize;
PRAGMA max_page_count = %I64u;", ui64MaxPageCount));
 
I am planning to change to below settings to see if that
improves the performance.
 
PRAGMA journal_mode = WAL
PRAGMA page_size=4096
PRAGMA cache_size=16384
PRAGMA synchronous=NORMAL
PRAGMA wal_autocheckpoint=10 
 
Setting wal_autocheckpoint  to 10 mean that the data gets committed to
the disk after it reaches 100 MB , Am I right ? If the machine crashes or power
goes off during this time, 100 MB data is lost? Any API calls which would commit
the data to disk forcibly?
 
Let me know if the above settings would improve the
application performance or something needs to be configured.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] performance difference based on the sqlite configuration?

2014-01-28 Thread veeresh kumar
I do have a similar issue. I am finding it hard which is
the best configuration for most of the large scale application. Our database
size grows from 0 – 45 GB . As the database size grows, performance seems to be
degrading.
 
Major operations include insertion/read/delete
 
Current settings:
 
PRAGMA journal_mode = TRUNCATE
PRAGMA page_size;", lPageSize)
UINT64 ui64MaxPageCount = 53687091200 / lPageSize; //50 GB

PRAGMA max_page_count = %I64u;", ui64MaxPageCount));
 
I am planning to change it to below settings to see if that
improves the performance.
 
PRAGMA journal_mode = WAL
pragma page_size=4096
pragma cache_size=16384
PRAGMA wal_autocheckpoint=10   
 
Setting wal_autocheckpoint  to 10 mean that the data gets committed
to the disk after it reaches 100 MB , Am I right ? If the machine crashes or
power goes off during this time, 100 MB data is lost? Any API calls which would
commit the data to disk forcibly?
 
Let me know if the above settings would improve the
application performance or something needs to be configured.

Thanks 

-Veeresh 

 
-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On
Behalf Of Richard Hipp
Sent: Sunday, January 19, 2014 9:37 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Dramatic performance difference between different
"PRAGMA synchronous" settings for bulk inserts
 
In WAL mode with synchronous=NORMAL, SQLite only syncs
(FlushFileBuffers() on windows) when it does a checkpoint operation.  
Checkpoints should be happening automatically
whenever the WAL file exceeds about 1MB in size.
 
For an 8GB database, probably there are about 8000 sync
operations, therefore.  If each takes
about 2 seconds, that would pretty much account for the extra 4.5 hours.
 
If you are creating a new database from scratch, it is
safe to set synchronous=OFF.  If you lose
power in the middle, your database file will probably be corrupt, but since you
were creating it from scratch you can easily recover just be starting the
database creation process over again from the beginning.
 
If you want to try running with synchronous=NORMAL, you
might try setting PRAGMA wal_autocheckpoint=10; (from the default of 1000)
which will make for dramatically larger WAL files, but also dramatically fewer
syncs.
Then the syncs will use just 5 or 6 minutes instead of
4.5 hours. Hopefully.
 
 
 
On Sun, Jan 19, 2014 at 9:00 AM, Mario M. Westphal  wrote:
 
> I have a performance effect which I don't quite
understand.
> Maybe I'm using the wrong settings or something.
Sorry for the long 
> post, but I wanted to include all the info that may
be important.
> 
> My software is written in C++, runs on Windows 7/8,
the SQLite 
> database file is either on a local SATA RAID disk or
a SSD.
> Typical database sizes are between 2 GB and 8 GB.
> The largest tables hold several million entries.
Also FTS4 is used, 
> which also creates large tables.
> Fast internal RAID disks, SDD. Four Xeon cores. 8 GB
RAM.
> 
> I'm using SQLite 3.8.0.2
> WAL mode, shared cache enabled.
> locking_mode=NORMAL
> checkpoint_fullfsync=0
> pragma page_size=4096
> pragma cache_size=16384
> 
> General (retrieval) performance is excellent!
> 
> 
> During an ingest phase, my application pumps in
hundreds of thousands 
> of records into multiple tables.
> There are massive amounts of writes during that
phase, different 
> record sizes, tables with one to four indices etc.
> 
> My application is multi-threaded and inserts data
into the database 
> concurrently from multiple threads.
> The threads process data in batches, and use SQLite
transactions to 
> process all records of a batch into the database.
Transactions gain a 
> lot of speed, which outweighs the side effects of
potential blocking.
> The threads monitor the execution times of the
database operations and 
> adjust the batch size to balance speed and
transaction lock duration.
> Slower
> operations cause smaller batches, which results in
shorter database 
> locks and better concurrency. The system adapts
fairly well to system 
> performance and data structure.
> 
> The performance was not that bad, but far from good.
> 
> For a given set of input data (100,000
"elements"), the execution 
> estimate was about 5 hours.
> Database on a high-speed SSD.
> The largest table holds about 5 million entries afterwards.
> 
> ***With one single change*** I improved the
execution time from 5 
> hours down to about 30 minutes!
> 
> I changed
> 
> PRAGMA synchronous=NORMAL
> 
> to
> 
> PRAGMA synchronous=OFF
> 
> Also all other database write operations just 'fly'
now.
> I'm even more impressed with SQLite than before, but
I wonder why is 
> the change so _dramatic_ ?
> 
> From the docs my impression was that using WAL mode
is ideal for bulk 
> inserts. That wrapping large bulks of data into
smaller batches, 
> wrapped in BEGIN/COMMT is best for performance etc.
That using 
> synchronous=NORMAL limits the file system flush/wa

Re: [sqlite] (no subject)

2014-01-28 Thread David Bicking


On Tue, 1/28/14, Igor Tandetnik  wrote:

 Subject: Re: [sqlite] (no subject)
 To: sqlite-users@sqlite.org
 Date: Tuesday, January 28, 2014, 2:41 PM
 
 On 1/28/2014 2:26 PM,
 David Bicking wrote:
 > I have two tables:
 >
 > ARB
 >   KEY (PRIMARY KEY)
 >   ASSIGN (NOT NECESSARILY
 UNIQUE)
 >
 > DMC
 >   KEY (NOT UNIQUE)
 >   ASSIGN (NOT UNIQUE)
 >   VALUE
 >
 > I need to report all the records from ARB,
 and sum up the values if the keys match OR if the keys
 don't match, then sum up the values where the ASSIGN
 matches, but only if the ASSIGN is unique in ARB.
 >
 > SELECT ARB.KEY
 > , ARB.ASSIGN
 > ,
 COALESCE((SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.KEY =
 ARB.KEY)
 >                   
     , (SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.ASSIGN =
 ARB.ASSIGN AND ...), 0)
 > FROM ARB;
 >
 > I can't think of
 >what to put after that AND
 
 >Something like this perhaps:
 
 >and 1 = (select count(*) from
 >ARB t2 where t2.ASSIGN=ARB.ASSIGN)
 
> Igor Tandetnik

Thanks.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] (no subject)

2014-01-28 Thread Igor Tandetnik

On 1/28/2014 2:26 PM, David Bicking wrote:

I have two tables:

ARB
  KEY (PRIMARY KEY)
  ASSIGN (NOT NECESSARILY UNIQUE)

DMC
  KEY (NOT UNIQUE)
  ASSIGN (NOT UNIQUE)
  VALUE

I need to report all the records from ARB, and sum up the values if the keys 
match OR if the keys don't match, then sum up the values where the ASSIGN 
matches, but only if the ASSIGN is unique in ARB.

SELECT ARB.KEY
, ARB.ASSIGN
, COALESCE((SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.KEY = ARB.KEY)
   , (SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.ASSIGN = 
ARB.ASSIGN AND ...), 0)
FROM ARB;

I can't think of what to put after that AND


Something like this perhaps:

and 1 = (select count(*) from ARB t2 where t2.ASSIGN=ARB.ASSIGN)

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] (no subject)

2014-01-28 Thread David Bicking
I have two tables:

ARB 
 KEY (PRIMARY KEY)
 ASSIGN (NOT NECESSARILY UNIQUE)

DMC
 KEY (NOT UNIQUE)
 ASSIGN (NOT UNIQUE)
 VALUE

I need to report all the records from ARB, and sum up the values if the keys 
match OR if the keys don't match, then sum up the values where the ASSIGN 
matches, but only if the ASSIGN is unique in ARB.

SELECT ARB.KEY
, ARB.ASSIGN
, COALESCE((SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.KEY = ARB.KEY)
  , (SELECT SUM(DMC.VALUE) FROM DMC WHERE DMC.ASSIGN = 
ARB.ASSIGN AND ...), 0)
FROM ARB;

I can't think of what to put after that AND

I don't think it matters, but I simplified things up there. DMC is actually a 
view of a table with KEY/ASSIGN and another with KEY/VALUE.

Any help is appreciated.

Thanks,
David
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How should I use parenthesis?

2014-01-28 Thread Darren Duncan

On 1/27/2014, 9:57 AM, Jean-Christophe Deschamps wrote:

I'm trying to find the correct syntaxt for this, but I hit a syntax error each
time: either SQLite shokes on outer parenthesis or on union all.


Try something like this, which is a minimal change from yours:

(select * from
(select * from A where x in (subselectA)) dx
left outer join
(select * from B where y in (subselectB)) dy using (...)
)
union all
(select * from
(select * from B where y in (subselectC)) dx
left outer join
(select * from A where x in (subselectD)) dy using (...)
)

... but replace the "using (...)" with a join condition saying which fields you 
want to be used for matching in the join, and also replace the "select *" with a 
specific list of fields you want to match up for the union.


-- Darren Duncan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How should I use parenthesis?

2014-01-28 Thread Jean-Christophe Deschamps




Perhaps you want:

select * from (
   select * from A where x in (subselectA)
   left outer join (
select * from B where y in (subselectB)
   ) as a
   on -- something
) as A1

  UNION ALL

select * from (
   select * from B where y in (subselectC)
   left outer join (
select * from A where x in (subselectD)
   ) as b
   on -- something
) as B1

JOIN takes tables (or table-like objects) as operands.  SELECT does not
yield a table unless it's wrapped in parenthesis.

(I generally put UNION in uppercase to make it stand out, lest on a
quick scan it seem like two separate statements.)


Yes that's it. In fact the actual statement is much more complex than 
the sketch I typed, where I forgot to type the where condition and the 
outer select. And I got lost in nesting the parenthesis when 
"simplifying" (i.e. emasculating) my own example.


Thanks for the answers and sorry for the noise. I now have both eyes 
wide open...


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How should I use parenthesis?

2014-01-28 Thread Hick Gunter
Select A.*,B.* from A left outer join B on x in (subselectA) and y in 
(subselectB)
Union all
Select A.*,B.* from B left outer join A on x in (subselectD) and y in 
(subselectC)

It seems strange that there is no condition limiting which rows from A and B 
match, which makes the LEFT OUTER JOIN rather pointless...

-Ursprüngliche Nachricht-
Von: Jean-Christophe Deschamps [mailto:j...@antichoc.net]
Gesendet: Montag, 27. Jänner 2014 18:57
An: sqlite-users@sqlite.org
Betreff: [sqlite] How should I use parenthesis?

Dear list,

I'm trying to find the correct syntaxt for this, but I hit a syntax error each 
time: either SQLite shokes on outer parenthesis or on union all.

(
select * from A where x in (subselectA)
left outer join
select * from B where y in (subselectB)
)

union all

(  -- <-- error
select * from B where y in (subselectC)
left outer join
select * from A where x in (subselectD)
)

Union [all], except, intersect don't seem to accept parenthesis around the left 
or right parts.

Of course a workaround could be to create two views then union all them, but 
there must be a way to express this construct in a single statement.

I don't want the statement to be interpreted this way:

select * from A where x in (subselectA)
left outer join
(
select * from B where y in (subselectB)
union all
select * from B where y in (subselectC)
)
left outer join
select * from A where x in (subselectD)

--
jcd

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


--
 Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then delete this message from 
your system. Please do not copy it or use it for any purposes, or disclose its 
contents to any person as to do so could be a breach of confidence. Thank you 
for your cooperation.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users