date:20140119

Re: [sqlite] Mutally dependent JOIN clauses

2014-01-19 Thread Hick Gunter



>-Ursprüngliche Nachricht-
>Von: Rob Golsteijn [mailto:rob.golste...@mapscape.eu]
>Gesendet: Freitag, 17. Jänner 2014 11:38
>An: sqlite-users@sqlite.org
>Betreff: Re: [sqlite] Mutally dependent JOIN clauses
>
>...
>My statement:
>
>SELECT * FROM C
>LEFT JOIN A ON A.a*A.a + B.b*B.b = C.c*c.c
>LEFT JOIN B ON A.a*A.a + B.b*B.b = C.c*c.c;
>
>Should be interpreted as
>
>SELECT * FROM (C LEFT JOIN
>   A ON A.a*A.a + B.b*B.b = C.c*c.c)
> LEFT JOIN B ON A.a*A.a + B.b*B.b = C.c*c.c;
>
>The "C LEFT JOIN A" part is to be evaluated first; produces output for all 
>value in C (only value 5). Each of these rows of C is completed either with 
>all suitable value of A, or >NULL if such value does not exist.
>Sqlite should find 2 candidates from A (values 3 and 4).  The a values 1, 2, 
>and 5 are not suitable, since they can never satisfy the ON clause. (Note that 
>for finding these suitable >values in A  SqLite also needs to look into table 
>B, but that is, I think, an implementtion detail)
>

I don't think it is an implementation detail.

Looking at the left left join you get (C left join A on A.a*A.a + B.b*B.b = 
C.c*c.c) with an expression that evaluates to "don't know (yet)". As a DB 
Engine you have two options

a) return an error code (i.e. "there is no table B to read from here")
b) postpone the evaluation until there is a table B (which gives the Cartesian 
Product as a result set of the join)

As the ON clause is supposed to be evaluated BEFORE the join, option a would be 
the safe bet



--
 Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then delete this message from 
your system. Please do not copy it or use it for any purposes, or disclose its 
contents to any person as to do so could be a breach of confidence. Thank you 
for your cooperation.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Mario M. Westphal

> Unrelated to your question, but, take a look at "external content" 
> FTS4 table they dramatically cut down the amount of duplicated data 
> [1])

Thanks for the tip. I'll definitely check that.
Currently I build the contents for FTS dynamically from several other
tables, combining, splitting, merging data via SQL as needed when INSERTing
into the FTS tables. Maybe I can safe some of these efforts and reduce the
amount of data in the FTS tables.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Mario M. Westphal


>
If you want to try running with synchronous=NORMAL, you might try setting
PRAGMA wal_autocheckpoint=10; (from the default of 1000) which will
make for dramatically larger WAL files, but also dramatically fewer syncs.
Then the syncs will use just 5 or 6 minutes instead of 4.5 hours. Hopefully.
<

Thanks for the tip! I will add that and combine it with synchronous=NORMAL.

The wal_autocheckpoint documentation is not that clear (IMHO) about how this
setting can impact performance.
Maybe adding a sentence to the documentation, explaining the relation
between wal_autocheckpoint, synch frequency and performance, would help
other (new) users.

If I set wal_autocheckpoint=1, I will get 1/10 of the synchs and WAL
file of about 10 MB, correct?

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Command line shell not flushing stderr when interactive

2014-01-19 Thread Simon Slavin

On 19 Jan 2014, at 7:32pm, Luuk  wrote:

> It is acceptable—and normal—for standard output and standard error to be 
> directed to the same destination, such as the text terminal. Messages appear 
> in the same order as the program writes them, unless buffering is involved. 
> (For example, a common situation is when the standard error stream is 
> unbuffered but the standard output stream is line-buffered; in this case, 
> text written to standard error later may appear on the terminal earlier, if 
> the standard output stream's buffer is not yet full.)
> 
> source:
> http://en.wikipedia.org/wiki/Standard_streams#Standard_error_.28stderr.29

Buffering matters only if an app is going to use stderr as a warning stream 
instead of its original purpose of "I'm about to crash and here's why.".

When stderr was thought up, a program wrote some text to it just before it 
quit.  The question of buffering wasn't important because any buffer would be 
flushed an instant later when the program that wrote it quit.  So it didn't 
matter whether stderr was buffered or not.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Command line shell not flushing stderr when interactive

2014-01-19 Thread Luuk


On 19-01-2014 19:59, Christopher Wellons wrote:



When the shell is set to interactive (i.e. "-interactive"), the output
(stdout) is flushed with every prompt (shell.c:422) but stderr is not.



Stderr is suppose to be unbuffered so that flushing is not required.  Or is
that different for windows?


According to the stderr Linux man page stderr is unbuffered, which would
be why I'm not having a problem in Linux:


The stream stderr is unbuffered.  The  stream  stdout  is  line-buffered
when  it  points  to  a  terminal.


I'm unable to find any documentation about this for Windows, but since
I'm seeing stderr buffering it must not be unbuffered in Windows.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



Is this not biting you?


It is acceptable—and normal—for standard output and standard error to be 
directed to the same destination, such as the text terminal. Messages 
appear in the same order as the program writes them, unless buffering is 
involved. (For example, a common situation is when the standard error 
stream is unbuffered but the standard output stream is line-buffered; in 
this case, text written to standard error later may appear on the 
terminal earlier, if the standard output stream's buffer is not yet full.)


source:
http://en.wikipedia.org/wiki/Standard_streams#Standard_error_.28stderr.29

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Solving the Sudoku with SQlite 3.8.3 trunk

2014-01-19 Thread Keith Medcalf

soduko1.sql and soduko2.sql are the two originals.  soduko3.sql removes the 
digits view to an actual table (from soduko2.sql) and soduko3.sql puts digits 
back in as a CTE but is a select from the wholenumber module rather than 
generating the digits recursively.

So, the fastest one uses digits pregenerated as a table with both text and 
integer columns used in the appropriate places dictated by the format 
requirement in order to avoid conversions.  All run one after the other on a 
3Ghz core, single threaded, solving the same problem.

The main determinants are whether or not the digits is a CTE thus regenerated 
each time needed, and whether the digits table contains both text and integer 
values so that conversions can be avoided.  The former (CTE regeneration) may 
be able to be fixed in the optimizer.  The latter (conversions) is pretty much 
expected and why we have datatypes.

>timethis sqlite < soduko1.sql

TimeThis :  Command Line :  sqlite
TimeThis :Start Time :  Sun Jan 19 11:43:39 2014

812753649943682175675491283154237896369845721287169534521974368438526917796318452

TimeThis :  Command Line :  sqlite
TimeThis :Start Time :  Sun Jan 19 11:43:39 2014
TimeThis :  End Time :  Sun Jan 19 11:45:27 2014
TimeThis :  Elapsed Time :  00:01:47.919


>timethis sqlite < soduko2.sql

TimeThis :  Command Line :  sqlite
TimeThis :Start Time :  Sun Jan 19 11:46:39 2014

812753649943682175675491283154237896369845721287169534521974368438526917796318452

TimeThis :  Command Line :  sqlite
TimeThis :Start Time :  Sun Jan 19 11:46:39 2014
TimeThis :  End Time :  Sun Jan 19 11:50:42 2014
TimeThis :  Elapsed Time :  00:04:02.752


>timethis sqlite < soduko3.sql

TimeThis :  Command Line :  sqlite
TimeThis :Start Time :  Sun Jan 19 11:50:50 2014

812753649943682175675491283154237896369845721287169534521974368438526917796318452

TimeThis :  Command Line :  sqlite
TimeThis :Start Time :  Sun Jan 19 11:50:50 2014
TimeThis :  End Time :  Sun Jan 19 11:52:10 2014
TimeThis :  Elapsed Time :  00:01:19.912


>timethis sqlite < soduko4.sql

TimeThis :  Command Line :  sqlite
TimeThis :Start Time :  Sun Jan 19 11:52:17 2014

812753649943682175675491283154237896369845721287169534521974368438526917796318452

TimeThis :  Command Line :  sqlite
TimeThis :Start Time :  Sun Jan 19 11:52:17 2014
TimeThis :  End Time :  Sun Jan 19 11:54:14 2014
TimeThis :  Elapsed Time :  00:01:56.807

soduko1 uses:


drop table if exists gen9;
create table gen9(z);
insert into gen9 values ('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9');
drop table if exists initial;
create table initial (s, ind);

insert into initial
select  sud, instr( sud, ' ')
  from  (SELECT
--- '53  76  195986 8   6   34  8 3  17   2   6 6   
 28419  58  79'
'8  36  7  9 2   5   7   457 1   3   1
68  85   1  94  '
 as sud) as q;

soduko2:

WITH RECURSIVE input(sud) AS (
   VALUES(
'8..36..7..9.2...5...7...457.1...3...168..85...1..94..'
--- 
'53..76..195986.8...6...34..8.3..17...2...6.628419..58..79'
   )
),

/* A table filled with digits 1..9, inclusive. */
digits(z, lp) AS (
   VALUES('1', 1)
   UNION ALL SELECT
   CAST(lp+1 AS TEXT), lp+1 FROM digits WHERE lp<9
),

soduko3:

create table digits(lp integer primary key, z text);
create virtual table w using wholenumber;
insert into digits (lp, z) select value, cast(value as text) from w where value 
between 1 and 9;

WITH RECURSIVE input(sud) AS (
   VALUES(
'8..36..7..9.2...5...7...457.1...3...168..85...1..94..'
--- 
'53..76..195986.8...6...34..8.3..17...2...6.628419..58..79'
   )
),

and soduko4:

create virtual table w using wholenumber;
WITH RECURSIVE input(sud) AS (
   VALUES(
'8..36..7..9.2...5...7...457.1...3...168..85...1..94..'
--- 
'53..76..195986.8...6...34..8.3..17...2...6.628419..58..79'
   )
),

/* A table filled with digits 1..9, inclusive. */
digits(z, lp) AS (
select cast(value as text), value from w where value between 1 and 9
--   VALUES('1', 1)
--   UNION ALL SELECT
--   CAST(lp+1 AS TEXT), lp+1 FROM digits WHERE lp<9
),



>-Original Message-
>From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
>boun...@sqlite.org] On Behalf Of big stone
>Sent: Sunday, 19 January, 2014 04:54
>To: sqlite-users@sqlite.org
>Subject: Re: [sqlite] Solving the Sudoku with SQlite 3.8.3 trunk
>
>Hi Keith,
>
>Indeed just removing the CTE creation of the DIGITS makes Dan's version
>up
>to speed.
>
>Would the "wholenumber" external SQLite module help :
>- to make SQLite code cleaner ? (like "generate_series" of Postgresql, or
>"dual" of Oracle)
>- still provide the same speed-up ?
>
>
>Portfolio of typical Sudokus
>-- easy   (0 sec)
>'53..76..195986.8...6...34..8.3..17...2...6.628419..5
>8..79'

Re: [sqlite] Command line shell not flushing stderr when interactive

2014-01-19 Thread Christopher Wellons


>> When the shell is set to interactive (i.e. "-interactive"), the output
>> (stdout) is flushed with every prompt (shell.c:422) but stderr is not.

> Stderr is suppose to be unbuffered so that flushing is not required.  Or is
> that different for windows?

According to the stderr Linux man page stderr is unbuffered, which would
be why I'm not having a problem in Linux:

> The stream stderr is unbuffered.  The  stream  stdout  is  line-buffered
> when  it  points  to  a  terminal.

I'm unable to find any documentation about this for Windows, but since
I'm seeing stderr buffering it must not be unbuffered in Windows.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Petite Abeille


On Jan 19, 2014, at 3:00 PM, Mario M. Westphal  wrote:

> Also FTS4 is used, which also creates large tables.

(Unrelated to your question, but, take a look at "external content" FTS4 table… 
they dramatically cut down the amount of duplicated data [1])

> During an ingest phase, my application pumps in hundreds of thousands of
> records into multiple tables.

For initial, bulk loading, I tend to use the following pragma combo:

pragma journal_mode = off
pragma locking_mode = exclusive
pragma synchronous = off


[1] http://www.sqlite.org/fts3.html#section_6_2_2
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Richard Hipp

In WAL mode with synchronous=NORMAL, SQLite only syncs (FlushFileBuffers()
on windows) when it does a checkpoint operation.  Checkpoints should be
happening automatically whenever the WAL file exceeds about 1MB in size.

For an 8GB database, probably there are about 8000 sync operations,
therefore.  If each takes about 2 seconds, that would pretty much account
for the extra 4.5 hours.

If you are creating a new database from scratch, it is safe to set
synchronous=OFF.  If you lose power in the middle, your database file will
probably be corrupt, but since you were creating it from scratch you can
easily recover just be starting the database creation process over again
from the beginning.

If you want to try running with synchronous=NORMAL, you might try setting
PRAGMA wal_autocheckpoint=10; (from the default of 1000) which will
make for dramatically larger WAL files, but also dramatically fewer syncs.
Then the syncs will use just 5 or 6 minutes instead of 4.5 hours. Hopefully.



On Sun, Jan 19, 2014 at 9:00 AM, Mario M. Westphal  wrote:

> I have a performance effect which I don't quite understand.
> Maybe I'm using the wrong settings or something. Sorry for the long post,
> but I wanted to include all the info that may be important.
>
> My software is written in C++, runs on Windows 7/8, the SQLite database
> file
> is either on a local SATA RAID disk or a SSD.
> Typical database sizes are between 2 GB and 8 GB.
> The largest tables hold several million entries. Also FTS4 is used, which
> also creates large tables.
> Fast internal RAID disks, SDD. Four Xeon cores. 8 GB RAM.
>
> I'm using SQLite 3.8.0.2
> WAL mode, shared cache enabled.
> locking_mode=NORMAL
> checkpoint_fullfsync=0
> pragma page_size=4096
> pragma cache_size=16384
>
> General (retrieval) performance is excellent!
>
>
> During an ingest phase, my application pumps in hundreds of thousands of
> records into multiple tables.
> There are massive amounts of writes during that phase, different record
> sizes, tables with one to four indices etc.
>
> My application is multi-threaded and inserts data into the database
> concurrently from multiple threads.
> The threads process data in batches, and use SQLite transactions to process
> all records of a batch into the database. Transactions gain a lot of speed,
> which outweighs the side effects of potential blocking.
> The threads monitor the execution times of the database operations and
> adjust the batch size to balance speed and transaction lock duration.
> Slower
> operations cause smaller batches, which results in shorter database locks
> and better concurrency. The system adapts fairly well to system performance
> and data structure.
>
> The performance was not that bad, but far from good.
>
> For a given set of input data (100,000 "elements"), the execution estimate
> was about 5 hours.
> Database on a high-speed SSD.
> The largest table holds about 5 million entries afterwards.
>
> ***With one single change*** I improved the execution time from 5 hours
> down
> to about 30 minutes!
>
> I changed
>
> PRAGMA synchronous=NORMAL
>
> to
>
> PRAGMA synchronous=OFF
>
> Also all other database write operations just 'fly' now.
> I'm even more impressed with SQLite than before, but I wonder why is the
> change so _dramatic_ ?
>
> From the docs my impression was that using WAL mode is ideal for bulk
> inserts. That wrapping large bulks of data into smaller batches, wrapped in
> BEGIN/COMMT is best for performance etc. That using synchronous=NORMAL
> limits the file system flush/wait operations certain really important
> operations.
>
> I logged the execution times of various operations in this phase to a text
> file. Everything was fast, the processing, the INSERTs etc.
> But COMMIT operations sometimes took 20s, then 0.2s, then again 10s. That's
> the time SQLite spends in the execute call with "COMMIT".
>
> Of course the amount of data written in each transaction block varied, but
> in general, 85% of the total execution time of my code was spent in the
> COMMIT call.
> My application was the only application with measurable disk I/O at the
> time. No virus checker etc.
>
> Changing to synchronous=OFF made the commits 10 times faster.
>
> Is this the expected behavior or am I missing something obvious?
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Command line shell not flushing stderr when interactive

2014-01-19 Thread Richard Hipp

On Sun, Jan 19, 2014 at 11:10 AM, Christopher Wellons <
well...@nullprogram.com> wrote:

>
> When the shell is set to interactive (i.e. "-interactive"), the output
> (stdout) is flushed with every prompt (shell.c:422) but stderr is not.
>

Stderr is suppose to be unbuffered so that flushing is not required.  Or is
that different for windows?



> In some situations this leads to no error messages being displayed until
> the stderr buffer fills.
>
> This happens when running the official sqlite3 binary as subprocess of
> Emacs under Windows 7. The error messages do not appear in a timely
> fashion. I was unable to trigger the misbehavior in a plain shell so my
> only demo is a bit of Emacs Lisp. When this Elisp code below is run, a
> buffer will pop up that *should* contain the output of .help. Under
> Windows it does not. The same occurs even when it's launched via a shell
> subprocess using "2>&1", so it's not simply an issue with Emacs not
> reading from the subprocess's stderr output fast enough.
>
> (let* ((buffer (generate-new-buffer "sqlite"))
>(proc (start-process "sqlite" buffer "sqlite3" "-interactive")))
>   (process-send-string proc ".help\n")
>   (pop-to-buffer buffer))
>
> I suspect it has to do with being compiled without readline, which is
> why it behaves better elsewhere. I couldn't figure out how to link with
> libreadline on Windows, though, so I couldn't test this.
>
> With the following change to the amalgamation release I got the behavior
> I was looking for: timely error messages from the SQLite command line
> shell. I understand this is probably not the Right Way to do this, but
> it's just a demonstation of a possible fix.
>
> --- a/shell.c
> +++ b/shell.c
> @@ -418,6 +418,7 @@ static char *one_input_line(FILE *in, char *zPrior,
> int isCont
>  zResult = readline(zPrompt);
>  if( zResult && *zResult ) add_history(zResult);
>  #else
> +fflush(stderr);
>  printf("%s", zPrompt);
>  fflush(stdout);
>  zResult = local_getline(zPrior, stdin);
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Sqlie from c#: encoding bug?

2014-01-19 Thread cinema cinema

I'm reading a text from c# saving it through a insert parameter in a
varchar column of a table. Both the file and the sqlite db encoding is
utf-8 but I see different characters (it seems due to a bad encoding) while
reading data from the sqlite3 command line or from other clients...
Any suggestion?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Command line shell not flushing stderr when interactive

2014-01-19 Thread Christopher Wellons


When the shell is set to interactive (i.e. "-interactive"), the output
(stdout) is flushed with every prompt (shell.c:422) but stderr is not.
In some situations this leads to no error messages being displayed until
the stderr buffer fills.

This happens when running the official sqlite3 binary as subprocess of
Emacs under Windows 7. The error messages do not appear in a timely
fashion. I was unable to trigger the misbehavior in a plain shell so my
only demo is a bit of Emacs Lisp. When this Elisp code below is run, a
buffer will pop up that *should* contain the output of .help. Under
Windows it does not. The same occurs even when it's launched via a shell
subprocess using "2>&1", so it's not simply an issue with Emacs not
reading from the subprocess's stderr output fast enough.

(let* ((buffer (generate-new-buffer "sqlite"))
   (proc (start-process "sqlite" buffer "sqlite3" "-interactive")))
  (process-send-string proc ".help\n")
  (pop-to-buffer buffer))

I suspect it has to do with being compiled without readline, which is
why it behaves better elsewhere. I couldn't figure out how to link with
libreadline on Windows, though, so I couldn't test this.

With the following change to the amalgamation release I got the behavior
I was looking for: timely error messages from the SQLite command line
shell. I understand this is probably not the Right Way to do this, but
it's just a demonstation of a possible fix.

--- a/shell.c
+++ b/shell.c
@@ -418,6 +418,7 @@ static char *one_input_line(FILE *in, char *zPrior, int 
isCont
 zResult = readline(zPrompt);
 if( zResult && *zResult ) add_history(zResult);
 #else
+fflush(stderr);
 printf("%s", zPrompt);
 fflush(stdout);
 zResult = local_getline(zPrior, stdin);
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Simon Slavin

On 19 Jan 2014, at 2:00pm, Mario M. Westphal  wrote:

> I logged the execution times of various operations in this phase to a text
> file. Everything was fast, the processing, the INSERTs etc.
> But COMMIT operations sometimes took 20s, then 0.2s, then again 10s. That's
> the time SQLite spends in the execute call with "COMMIT".

First, I want to check that you've read

I can't answer your question but only a big UPDATE or DELETE would legitimately 
take 20s.  If you're seeing INSERT times of 10s or 20s then you're seeing the 
result of two threads clashing over database access.  One thread has to back 
off and wait for the other to finish, and the retry times eventually reach 10s 
and 20s before they get so long SQLite gives up and returns an error.  So 
you're not seeing a process take 20s to do useful stuff, you're seeing one 
thread keep the database busy -- so busy that it's always busy when the other 
thread tries to write to it.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

2014-01-19 Thread Mario M. Westphal

I have a performance effect which I don't quite understand.
Maybe I'm using the wrong settings or something. Sorry for the long post,
but I wanted to include all the info that may be important.
 
My software is written in C++, runs on Windows 7/8, the SQLite database file
is either on a local SATA RAID disk or a SSD.
Typical database sizes are between 2 GB and 8 GB.
The largest tables hold several million entries. Also FTS4 is used, which
also creates large tables.
Fast internal RAID disks, SDD. Four Xeon cores. 8 GB RAM.
 
I'm using SQLite 3.8.0.2
WAL mode, shared cache enabled.
locking_mode=NORMAL
checkpoint_fullfsync=0
pragma page_size=4096
pragma cache_size=16384
 
General (retrieval) performance is excellent!
 
 
During an ingest phase, my application pumps in hundreds of thousands of
records into multiple tables.
There are massive amounts of writes during that phase, different record
sizes, tables with one to four indices etc.
 
My application is multi-threaded and inserts data into the database
concurrently from multiple threads.
The threads process data in batches, and use SQLite transactions to process
all records of a batch into the database. Transactions gain a lot of speed,
which outweighs the side effects of potential blocking.
The threads monitor the execution times of the database operations and
adjust the batch size to balance speed and transaction lock duration. Slower
operations cause smaller batches, which results in shorter database locks
and better concurrency. The system adapts fairly well to system performance
and data structure.
 
The performance was not that bad, but far from good.
 
For a given set of input data (100,000 "elements"), the execution estimate
was about 5 hours.
Database on a high-speed SSD.
The largest table holds about 5 million entries afterwards.
 
***With one single change*** I improved the execution time from 5 hours down
to about 30 minutes!
 
I changed 
 
PRAGMA synchronous=NORMAL
 
to 
 
PRAGMA synchronous=OFF
 
Also all other database write operations just 'fly' now.
I'm even more impressed with SQLite than before, but I wonder why is the
change so _dramatic_ ?
 
>From the docs my impression was that using WAL mode is ideal for bulk
inserts. That wrapping large bulks of data into smaller batches, wrapped in
BEGIN/COMMT is best for performance etc. That using synchronous=NORMAL
limits the file system flush/wait operations certain really important
operations.

I logged the execution times of various operations in this phase to a text
file. Everything was fast, the processing, the INSERTs etc.
But COMMIT operations sometimes took 20s, then 0.2s, then again 10s. That's
the time SQLite spends in the execute call with "COMMIT".

Of course the amount of data written in each transaction block varied, but
in general, 85% of the total execution time of my code was spent in the
COMMIT call.
My application was the only application with measurable disk I/O at the
time. No virus checker etc.
 
Changing to synchronous=OFF made the commits 10 times faster.
 
Is this the expected behavior or am I missing something obvious?
 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Solving the Sudoku with SQlite 3.8.3 trunk

2014-01-19 Thread big stone

Hi Keith,

Indeed just removing the CTE creation of the DIGITS makes Dan's version up
to speed.

Would the "wholenumber" external SQLite module help :
- to make SQLite code cleaner ? (like "generate_series" of Postgresql, or
"dual" of Oracle)
- still provide the same speed-up ?


Portfolio of typical Sudokus
-- easy   (0 sec)
'53..76..195986.8...6...34..8.3..17...2...6.628419..58..79'
-- medium (2 sec)
'17.9..3..2...8..96..553..9...1..8...264...3..1..4..7..7...3..'
-- hard   (200 s)
'8..36..7..9.2...5...7...457.1...3...168..85...1..94..'

WITH RECURSIVE input(sud) AS (
   VALUES(
'53..76..195986.8...6...34..8.3..17...2...6.628419..58..79'
   )
),

/* A table filled with digits 1..9, inclusive. */
digits(z, lp) AS (
VALUES('1', 1),('2', 2) ,('3', 3),('4', 4),('5', 5),('6', 6),('7',
7),('8', 8),('9', 9)
),

/* The tricky bit. */
x(s, ind) AS (
   SELECT sud, instr(sud, '.') FROM input
   UNION ALL
   SELECT
   substr(s, 1, ind-1) || z || substr(s, ind+1),
   instr( substr(s, 1, ind-1) || z || substr(s, ind+1), '.' )
   FROM x, digits AS z
   WHERE ind>0
   AND NOT EXISTS (
 SELECT 1 FROM digits AS lp
 WHERE z.z = substr(s, ((ind-1)/9)*9 + lp, 1)
 OR z.z = substr(s, ((ind-1)%9) + (lp-1)*9 + 1, 1)
 OR z.z = substr(s, (((ind-1)/3) % 3) * 3
   + ((ind-1)/27) * 27 + lp
   + ((lp-1) / 3) * 6
   , 1)
   )
)

SELECT s FROM x WHERE ind=0;
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Mutally dependent JOIN clauses

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

Re: [sqlite] Command line shell not flushing stderr when interactive

Re: [sqlite] Command line shell not flushing stderr when interactive

Re: [sqlite] Solving the Sudoku with SQlite 3.8.3 trunk

Re: [sqlite] Command line shell not flushing stderr when interactive

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

Re: [sqlite] Command line shell not flushing stderr when interactive

[sqlite] Sqlie from c#: encoding bug?

[sqlite] Command line shell not flushing stderr when interactive

Re: [sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

[sqlite] Dramatic performance difference between different "PRAGMA synchronous" settings for bulk inserts

Re: [sqlite] Solving the Sudoku with SQlite 3.8.3 trunk

15 matches

Site Navigation

Mail list logo

Footer information