Re: [sqlite] Tracing latencies

2012-12-07 Thread Keith Chew
On Fri, Dec 7, 2012 at 3:28 PM, Keith Chew  wrote:
>
> We can see the times taken to do the syscalls are fast, so what is
> causing the delay between the seek and write? I had a look at the
> code, and found the method seekAndWrite(), which looks pretty straight
> forward. The only time it could be slow is when osWrite() return -1.
> Any ideas?
>
> Please advise if I should still try to get vfs=unix-none to work.

Looks like I have managed to pin point the problem area. Here are the results:
---
Config 1: No latency
---
PRAGMA journal_mode = OFF;
PRAGMA locking_mode = EXCLUSIVE;
---
Config 2: Latency observed
---
PRAGMA journal_mode = WAL;
PRAGMA locking_mode = EXCLUSIVE;
---
Config 3: Latency observed
---
PRAGMA journal_mode = OFF;
PRAGMA locking_mode = NORMAL;
---

>From the above, we can see that:
- WAL itself is not the issue, because config 3 shows the latency without WAL
- Speed of writing to disk is not the issue, because of config 1 & 3

What causes the latency is the common thing between WAL and
locking_mode=normal, and just briefly scanning the code, it looks like
they both lock files before writing to disk.

I am happy to try out any suggestions to pin point this even further.
Any guidance on where I should next look at would be greatly
appreciated! I am happy to change or comment out sections of the code
just to prove things...

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-06 Thread Keith Chew
Hi Richard

On Fri, Dec 7, 2012 at 2:08 AM, Richard Hipp  wrote:
> If that does cure your latencies, it also adds another problem, in that it
> does nothing to prevent two or more processes, or even two threads within
> the same process, from trying to use the database at the same time and
> tripping over one another.  So this solution will only work if you can
> absolutely guarantee that nobody will ever fire up an "sqlite3" shell on
> your database file at the same time you application is using it.

First, thank you so much for having the patience to help me out,
really appreciate it.

I tried to see if unix-none works from the shell:
-
sqlite3 file:/tmp/test.db?vfs=unix-none
SQLite version 3.7.10 2012-01-16 13:28:40
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .tables
Error: unable to open database file
-

I can't seem to get this to work, but I know URI name format is
working because sqlite3 file:/tmp/test.db?vfs=unix works fine.

I had another idea. Would setting "PRAGMA locking_mode=EXCLUSIVE;"
have the same effect? Ie it only locks once when opening the DB, and
never releases it?

I tried this, but the latency is still there. I have gone another step
further and straced it:
---
19547 22:04:13.331096 _llseek(79, 1050632, [1050632], SEEK_SET) = 0 <0.16>

19547 22:04:13.545595 write(79,
"\0\0\0\4\0\0\r\222\313\370\227\2001\342\366\304p\5\254\250W}<\372",
24) = 24 <0.000243>
19547 22:04:13.546570 _llseek(79, 1050656,  
19547 22:04:13.547100 <... _llseek resumed> [1050656], SEEK_SET) = 0 <0.000502>
19547 22:04:13.548011 write(79,
"\r\0\0\0\1\0173\0\0173\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
4096 
19547 22:04:13.548368 <... write resumed> ) = 4096 <0.000293>
---

We can see the times taken to do the syscalls are fast, so what is
causing the delay between the seek and write? I had a look at the
code, and found the method seekAndWrite(), which looks pretty straight
forward. The only time it could be slow is when osWrite() return -1.
Any ideas?

Please advise if I should still try to get vfs=unix-none to work.

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-06 Thread Richard Hipp
On Thu, Dec 6, 2012 at 12:14 AM, Keith Chew  wrote:

> On Thu, Dec 6, 2012 at 3:52 PM, Keith Chew  wrote:
> > I am guess it may have something to do with the locking of the files.
> > sqlite could be getting blocked by the filesystem when trying to
> > obtain the lock before writing to the file...
>
> This is a stab in the dark... I had a look at the sqlite code, and
> noticed that the file lock is acquired using fcntl(F_SETLK). So, when
> kjournald wants to flush the data to disk, and sqlite wants to write
> to the file at the same time, I guess there will be contention, and
> sqlite can only get the lock when the file is free. This explains the
> latencies that I am seeing being consistent with the ext3 journal's
> parameters. Eg increase commit time removes the contention (thus 0ms
> between commits), removing write-cache, disabling barriers all affect
> the latency accordingly.
>
> So, if this is the case, how do I get around it?
>

Specify "unix-none" at the VFS in the 4th parameter to sqlite3_open_v2().
(Or you can use URI pathnames and add the vfs=unix-none query parameter.)
This will load a version of the backend that never does any file locking.
Does that make your problem go away?

If that does cure your latencies, it also adds another problem, in that it
does nothing to prevent two or more processes, or even two threads within
the same process, from trying to use the database at the same time and
tripping over one another.  So this solution will only work if you can
absolutely guarantee that nobody will ever fire up an "sqlite3" shell on
your database file at the same time you application is using it.



>
> Regards
> Keith
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Keith Chew
On Thu, Dec 6, 2012 at 3:52 PM, Keith Chew  wrote:
> I am guess it may have something to do with the locking of the files.
> sqlite could be getting blocked by the filesystem when trying to
> obtain the lock before writing to the file...

This is a stab in the dark... I had a look at the sqlite code, and
noticed that the file lock is acquired using fcntl(F_SETLK). So, when
kjournald wants to flush the data to disk, and sqlite wants to write
to the file at the same time, I guess there will be contention, and
sqlite can only get the lock when the file is free. This explains the
latencies that I am seeing being consistent with the ext3 journal's
parameters. Eg increase commit time removes the contention (thus 0ms
between commits), removing write-cache, disabling barriers all affect
the latency accordingly.

So, if this is the case, how do I get around it?

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Keith Chew
On Thu, Dec 6, 2012 at 1:01 PM, Keith Chew  wrote:
> I used sar and iostat to profile what is happening at the disk level,
> and below are the results. You can see that the TPS on the disk is
> very small, only 6-7 tps, and both mysql and sqlite profiles show very
> close patterns, except sqlite is writing a bit more. I would expect at
> this tps level, writes should be very fast. Is there anything else I
> can try to narrow this down further?
>

Hi

I have found that by changing the commit time of ext3 (ie from
commit=1 to commit=5), the latency drops to 0s, and only goes up to
160ms during the commit from the kernel.

So, I know that the latency is actually caused by the kernel journal
commits, but strangely it is not affecting mysql. Is there anything I
can do in the code to investigate this further?

I am guess it may have something to do with the locking of the files.
sqlite could be getting blocked by the filesystem when trying to
obtain the lock before writing to the file...
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Keith Chew
On Thu, Dec 6, 2012 at 9:16 AM, Keith Chew  wrote:
> Thank you so much for all the input. I tried 2 things, using only a
> single connection (as suggested by Clemen) and to change the page size
> to 32K. Single connection did not reduce the average latency, but did
> help in reducing the spikes. So, I believe Clemen has helped identify
> one issue (write concurrencies). Using 32K page size also did help
> reduce the spikes (not sure logic behind this, have yet to do a
> strace), but average latency is still there.

I used sar and iostat to profile what is happening at the disk level,
and below are the results. You can see that the TPS on the disk is
very small, only 6-7 tps, and both mysql and sqlite profiles show very
close patterns, except sqlite is writing a bit more. I would expect at
this tps level, writes should be very fast. Is there anything else I
can try to narrow this down further?

Sqlite

12/06/2012 12:43:10 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 6.930.006.93 0.00   110.89
16.00 0.35   64.57  25.14  17.43

12/06/2012 12:43:11 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 9.000.006.00 0.00   120.00
20.00 0.45   59.33  29.33  17.60

12/06/2012 12:43:12 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 6.000.007.00 0.00   104.00
14.86 0.37   67.43  24.57  17.20

12/06/2012 12:43:13 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 5.940.005.94 0.0095.05
16.00 0.36   59.33  30.00  17.82

12/06/2012 12:43:14 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 6.000.004.00 0.0080.00
20.00 0.46   34.00  39.00  15.60

12/06/2012 12:43:15 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 6.000.009.00 0.00   120.00
13.33 0.49   89.78  27.11  24.40

Mysql
12/06/2012 12:50:22 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 3.060.006.12 0.0073.47
12.00 0.38   62.67  27.33  16.73

12/06/2012 12:50:23 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 2.910.004.85 0.0062.14
12.80 0.31   63.20  30.40  14.76

12/06/2012 12:50:24 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 3.030.006.06 0.0072.73
12.00 0.34   57.33  24.67  14.95

12/06/2012 12:50:25 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 3.000.005.00 0.0064.00
12.80 0.30   57.60  29.60  14.80

12/06/2012 12:50:26 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 2.970.005.94 0.0071.29
12.00 0.36   60.00  26.00  15.45

12/06/2012 12:50:27 PM
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s
avgrq-sz avgqu-sz   await  svctm  %util
sda   0.00 3.000.005.00 0.0064.00
12.80 0.29   57.60  28.80  14.40


Sqlite
12:43:44 PM   DEV   tps  rd_sec/s  wr_sec/s  avgrq-sz
avgqu-sz await svctm %util
12:43:45 PMdev8-0  6.93  0.00102.97 14.86
0.39 56.00 21.71 15.05

12:43:45 PM   DEV   tps  rd_sec/s  wr_sec/s  avgrq-sz
avgqu-sz await svctm %util
12:43:46 PMdev8-0  5.94  0.00 95.05 16.00
0.31 52.67 24.67 14.65

12:43:46 PM   DEV   tps  rd_sec/s  wr_sec/s  avgrq-sz
avgqu-sz await svctm %util
12:43:47 PMdev8-0  6.93  0.00102.97 14.86
0.41 58.86 22.86 15.84

12:43:47 PM   DEV   tps  rd_sec/s  wr_sec/s  avgrq-sz
avgqu-sz await svctm %util
12:43:48 PMdev8-0  6.00  0.00 96.00 16.00
0.33 54.67 28.00 16.80

12:43:48 PM   DEV   tps  rd_sec/s  wr_sec/s  avgrq-sz
avgqu-sz await svctm %util
12:43:49 PMdev8-0  7.00  0.00120.00 17.14
0.46 66.29 26.86 18.80

12:43:49 PM   DEV   tps  rd_sec/s  wr_sec/s  avgrq-sz
avgqu-sz await svctm %util
12:43:50 PMdev8-0  5.94  0.00 95.05 16.00
0.36 60.00 31.33 18.61

12:43:50 PM   DEV   tps  rd_sec/s  wr_sec/s  

Re: [sqlite] Tracing latencies

2012-12-05 Thread Keith Chew
Hi All

> That hypothesis could be easily tested by examining the corresponding strace
> output for mysql... I don't know if sqlite3's unaligned log writes are
> actually slow (didn't see this info in any of Keith's messages), but if they
> are that would suggest the hypothesis is false; it would really put nails in
> the coffin if mysql uses aligned log writes.
>
> (BTW Michael, whatever mechanism you use to reply breaks threading in both
> Thunderbird and the list archives at sqlite.org; makes it harder to follow
> conversations)

Thank you so much for all the input. I tried 2 things, using only a
single connection (as suggested by Clemen) and to change the page size
to 32K. Single connection did not reduce the average latency, but did
help in reducing the spikes. So, I believe Clemen has helped identify
one issue (write concurrencies). Using 32K page size also did help
reduce the spikes (not sure logic behind this, have yet to do a
strace), but average latency is still there.

As expected, combining the 2 solutions would not reduce the average
latency, but spikes are not as frequent anymore. Eg I still see 160ms
average, and sometimes it spikes to 300ms, but I have not seen a
4000ms transaction.

I don't think there is anything more I do about this at this stage, I
am getting a 7200rpm drive later today to test and see if that makes a
difference. I am expecting the average latency to drop by about 25%.

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Ryan Johnson

On 05/12/2012 1:25 PM, Dan Kennedy wrote:

On 12/06/2012 12:47 AM, Black, Michael (IS) wrote:

I'm a bit confused with "always aligned". None of the lseeks in this
log are 1024 aligned.

And I just ran a test with the 3.7.15.1 latest amalgamation and most
of these seeks are not aligned.  Once in a while it gets lucky.
Alignment sure isn't deliberate in this.  It appears the first page
is 1080 which is already out of alignment.  File#4 here is the wal
file.


I wasn't clear. Writes to the WAL file are not aligned. However,
they are usually sequential, which means on an OS with a smart
enough cache you don't really gain anything at all trying to align
them.
That hypothesis could be easily tested by examining the corresponding 
strace output for mysql... I don't know if sqlite3's unaligned log 
writes are actually slow (didn't see this info in any of Keith's 
messages), but if they are that would suggest the hypothesis is false; 
it would really put nails in the coffin if mysql uses aligned log writes.


(BTW Michael, whatever mechanism you use to reply breaks threading in 
both Thunderbird and the list archives at sqlite.org; makes it harder to 
follow conversations)


Ryan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Dan Kennedy

On 12/06/2012 12:47 AM, Black, Michael (IS) wrote:

I'm a bit confused with "always aligned".  None of the lseeks in this
log are 1024 aligned.

And I just ran a test with the 3.7.15.1 latest amalgamation and most
of these seeks are not aligned.  Once in a while it gets lucky.
Alignment sure isn't deliberate in this.  It appears the first page
is 1080 which is already out of alignment.  File#4 here is the wal
file.


I wasn't clear. Writes to the WAL file are not aligned. However,
they are usually sequential, which means on an OS with a smart
enough cache you don't really gain anything at all trying to align
them.

To the database file, where it does matter, writes are always
aligned.






lseek(4, 0, SEEK_SET)   = 0 lseek(4, 0, SEEK_SET)
= 0 lseek(4, 32, SEEK_SET)  = 32 lseek(4, 56,
SEEK_SET)  = 56 lseek(4, 1080, SEEK_SET)
= 1080 lseek(4, 1104, SEEK_SET)= 1104 lseek(4, 2128,
SEEK_SET)= 2128 lseek(4, 2152, SEEK_SET)
= 2152 lseek(4, 3176, SEEK_SET)= 3176 lseek(4, 3200,
SEEK_SET)= 3200




Michael D. Black Senior Scientist Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit Northrop Grumman Information
Systems

 From:
sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on
behalf of Dan Kennedy [danielk1...@gmail.com] Sent: Wednesday,
December 05, 2012 10:27 AM To: sqlite-users@sqlite.org Subject: EXT
:Re: [sqlite] Tracing latencies

On 12/05/2012 09:03 PM, Black, Michael (IS) wrote:

Hmmm...looking at that strace sequencing is there some reason those
24-byte and 1024-byte writes can't be combined?  The 1024-byte
write is occurring at the end boundary of the 24-byte.

That would cut the i/o ops in half and might be a noticeable
improvement. A memory copy would be a lot faster than 2 i/o
requests.

And since disk page size is always a power of 2 would the 1024
buffer be better off as 1000 to align the page i/o better?



Those writes are appending to the WAL file. Writes to the database
file are always aligned page-sized (in this case 1024 byte) blocks.

For the WAL file, we tried it both ways (combining the 24 and 1024
bytes writes into one and leaving them separate) and found that, on
linux, it's faster to call write() twice. In other words, the extra
write() call is cheaper than doing a 1048 byte memcpy(). And you
can't just use 1048 byte buffers everywhere internally, as memory
allocators tend to waste lots of space if you allocate many blocks
that are all just a bit larger than a power-of-two.

Having said that, there has been at least one closed-source VFS
backend that buffers the sequential writes SQLite makes on the
journal and WAL files so that it can make mostly 8KB aligned writes
to the underlying file-system. So on some systems there is a benefit
to writing aligned page blocks even if you are writing sequentially.






Michael D. Black Senior Scientist Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit Northrop Grumman
Information Systems

 From:
sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org]
on behalf of Keith Chew [keith.c...@gmail.com] Sent: Wednesday,
December 05, 2012 2:11 AM To: General Discussion of SQLite
Database Subject: EXT :Re: [sqlite] Tracing latencies

Hi Dan

On Wed, Dec 5, 2012 at 6:38 PM, Dan Kennedy<danielk1...@gmail.com>
wrote:

If it's not fsync() then IO delays are normally caused by
read(). You could try [strace -T -eread ...] to check.

Are SELECT statements fast on the same database? How large is
the database compared to the machines memory?


Ah, do you think the read on the OS level is starving the writes?
There are not many DB selects from the app, but there are other
read IO activities happening in the background. Still it doesn't
make sense, because WAL mode ensures from sqlite's point of view,
the reader will not be blocked by the writer. So, sqlite is
subjected to similar levels of read IO activity as Mysql (because
there are very few DB selects).

The tables are small, around 20MB in total, compared to 2GB of
memory available.

I have done more investigation. Using strace, I waited to capture
an insert/update with a long latency, and caught one that is 2s
long, below is the strace to the WAL file. We can see that it is
very bursty, all the seeks and writes span close to over 1s.

I recall doing a strace on Mysql a long time ago, and each insert
only does a single seek and write. Maybe it is sqlite's file format
that requires it to seek/write multiple places causing the
slowness? I am only guessing here, as I do not know how to
interpret the strace logs below.

[pid  4015] 21:01:53.634099 _llseek(98, 499928,
[pid  4015] 21:01:53.634245 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247+\214\332\260\314Wf ", 24
 [pid  4015] 21:01:53.634546 _llseek(98,
499952, [pid  4015] 21:01:

Re: [sqlite] Tracing latencies

2012-12-05 Thread Black, Michael (IS)
I'm a bit confused with "always aligned".  None of the lseeks in this log are 
1024 aligned.

And I just ran a test with the 3.7.15.1 latest amalgamation and most of these 
seeks are not aligned.  Once in a while it gets lucky.  Alignment sure isn't 
deliberate in this.  It appears the first page is 1080 which is already out of 
alignment.  File#4 here is the wal file.

lseek(4, 0, SEEK_SET)   = 0
lseek(4, 0, SEEK_SET)   = 0
lseek(4, 32, SEEK_SET)  = 32
lseek(4, 56, SEEK_SET)  = 56
lseek(4, 1080, SEEK_SET)= 1080
lseek(4, 1104, SEEK_SET)= 1104
lseek(4, 2128, SEEK_SET)= 2128
lseek(4, 2152, SEEK_SET)= 2152
lseek(4, 3176, SEEK_SET)= 3176
lseek(4, 3200, SEEK_SET)= 3200




Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Dan Kennedy [danielk1...@gmail.com]
Sent: Wednesday, December 05, 2012 10:27 AM
To: sqlite-users@sqlite.org
Subject: EXT :Re: [sqlite] Tracing latencies

On 12/05/2012 09:03 PM, Black, Michael (IS) wrote:
> Hmmm...looking at that strace sequencing is there some reason those 24-byte 
> and 1024-byte writes can't be combined?  The 1024-byte write is occurring at 
> the end boundary of the 24-byte.
>
> That would cut the i/o ops in half and might be a noticeable improvement. A 
> memory copy would be a lot faster than 2 i/o requests.
>
> And since disk page size is always a power of 2 would the 1024 buffer be 
> better off as 1000 to align the page i/o better?


Those writes are appending to the WAL file. Writes to the
database file are always aligned page-sized (in this case
1024 byte) blocks.

For the WAL file, we tried it both ways (combining the 24
and 1024 bytes writes into one and leaving them separate) and
found that, on linux, it's faster to call write() twice. In
other words, the extra write() call is cheaper than doing
a 1048 byte memcpy(). And you can't just use 1048 byte buffers
everywhere internally, as memory allocators tend to waste lots
of space if you allocate many blocks that are all just a bit
larger than a power-of-two.

Having said that, there has been at least one closed-source VFS backend
that buffers the sequential writes SQLite makes on the journal
and WAL files so that it can make mostly 8KB aligned writes to
the underlying file-system. So on some systems there is a benefit
to writing aligned page blocks even if you are writing sequentially.




>
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Advanced GEOINT Solutions Operating Unit
> Northrop Grumman Information Systems
>
> 
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Keith Chew [keith.c...@gmail.com]
> Sent: Wednesday, December 05, 2012 2:11 AM
> To: General Discussion of SQLite Database
> Subject: EXT :Re: [sqlite] Tracing latencies
>
> Hi Dan
>
> On Wed, Dec 5, 2012 at 6:38 PM, Dan Kennedy<danielk1...@gmail.com>  wrote:
>> If it's not fsync() then IO delays are normally caused by read().
>> You could try [strace -T -eread ...] to check.
>>
>> Are SELECT statements fast on the same database? How large is the
>> database compared to the machines memory?
>
> Ah, do you think the read on the OS level is starving the writes?
> There are not many DB selects from the app, but there are other read
> IO activities happening in the background. Still it doesn't make
> sense, because WAL mode ensures from sqlite's point of view, the
> reader will not be blocked by the writer. So, sqlite is subjected to
> similar levels of read IO activity as Mysql (because there are very
> few DB selects).
>
> The tables are small, around 20MB in total, compared to 2GB of memory 
> available.
>
> I have done more investigation. Using strace, I waited to capture an
> insert/update with a long latency, and caught one that is 2s long,
> below is the strace to the WAL file. We can see that it is very
> bursty, all the seeks and writes span close to over 1s.
>
> I recall doing a strace on Mysql a long time ago, and each insert only
> does a single seek and write. Maybe it is sqlite's file format that
> requires it to seek/write multiple places causing the slowness? I am
> only guessing here, as I do not know how to interpret the strace logs
> below.
>
> [pid  4015] 21:01:53.634099 _llseek(98, 499928,
> [pid  4015] 21:01:53.634245 write(98,
> "\0\0\0\4\0\0HU^=\226\213\23\10<\247+\214\332\260\314Wf ", 24
> 
> [pid  4015] 21:01:53.634546 _llseek(98,

Re: [sqlite] Tracing latencies

2012-12-05 Thread Dan Kennedy

On 12/05/2012 09:03 PM, Black, Michael (IS) wrote:

Hmmm...looking at that strace sequencing is there some reason those 24-byte and 
1024-byte writes can't be combined?  The 1024-byte write is occurring at the 
end boundary of the 24-byte.

That would cut the i/o ops in half and might be a noticeable improvement. A 
memory copy would be a lot faster than 2 i/o requests.

And since disk page size is always a power of 2 would the 1024 buffer be better 
off as 1000 to align the page i/o better?



Those writes are appending to the WAL file. Writes to the
database file are always aligned page-sized (in this case
1024 byte) blocks.

For the WAL file, we tried it both ways (combining the 24
and 1024 bytes writes into one and leaving them separate) and
found that, on linux, it's faster to call write() twice. In
other words, the extra write() call is cheaper than doing
a 1048 byte memcpy(). And you can't just use 1048 byte buffers
everywhere internally, as memory allocators tend to waste lots
of space if you allocate many blocks that are all just a bit
larger than a power-of-two.

Having said that, there has been at least one closed-source VFS backend
that buffers the sequential writes SQLite makes on the journal
and WAL files so that it can make mostly 8KB aligned writes to
the underlying file-system. So on some systems there is a benefit
to writing aligned page blocks even if you are writing sequentially.






Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Keith Chew [keith.c...@gmail.com]
Sent: Wednesday, December 05, 2012 2:11 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] Tracing latencies

Hi Dan

On Wed, Dec 5, 2012 at 6:38 PM, Dan Kennedy<danielk1...@gmail.com>  wrote:

If it's not fsync() then IO delays are normally caused by read().
You could try [strace -T -eread ...] to check.

Are SELECT statements fast on the same database? How large is the
database compared to the machines memory?


Ah, do you think the read on the OS level is starving the writes?
There are not many DB selects from the app, but there are other read
IO activities happening in the background. Still it doesn't make
sense, because WAL mode ensures from sqlite's point of view, the
reader will not be blocked by the writer. So, sqlite is subjected to
similar levels of read IO activity as Mysql (because there are very
few DB selects).

The tables are small, around 20MB in total, compared to 2GB of memory available.

I have done more investigation. Using strace, I waited to capture an
insert/update with a long latency, and caught one that is 2s long,
below is the strace to the WAL file. We can see that it is very
bursty, all the seeks and writes span close to over 1s.

I recall doing a strace on Mysql a long time ago, and each insert only
does a single seek and write. Maybe it is sqlite's file format that
requires it to seek/write multiple places causing the slowness? I am
only guessing here, as I do not know how to interpret the strace logs
below.

[pid  4015] 21:01:53.634099 _llseek(98, 499928,
[pid  4015] 21:01:53.634245 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247+\214\332\260\314Wf ", 24

[pid  4015] 21:01:53.634546 _llseek(98, 499952,
[pid  4015] 21:01:53.634712 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024
[pid  4015] 21:01:53.740378 _llseek(98, 500976, [500976], SEEK_SET) = 0
[pid  4015] 21:01:53.740449 write(98,
"\0\0\0\3\0\0\0\0^=\226\213\23\10<\247\327\201>\32\227\323\f8", 24) =
24
[pid  4015] 21:01:53.740521 _llseek(98, 501000, [501000], SEEK_SET) = 0
[pid  4015] 21:01:53.740566 write(98,
"\r\3\221\0\t\3G\1\3\243\3\261\3\352\3\325\3\304\3G\3\177\3m\3\\\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740632 _llseek(98, 502024, [502024], SEEK_SET) = 0
[pid  4015] 21:01:53.740677 write(98,
"\0\0\37X\0\0\0\0^=\226\213\23\10<\247dL\17\316\32\30\301\237", 24) =
24
[pid  4015] 21:01:53.740736 _llseek(98, 502048, [502048], SEEK_SET) = 0
[pid  4015] 21:01:53.740781 write(98,
"\r\0\0\0\5\1*\0\3o\2\336\2L\1\273\1*\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740844 _llseek(98, 503072, [503072], SEEK_SET) = 0
[pid  4015] 21:01:53.740889 write(98,
"\0\0HR\0\0HU^=\226\213\23\10<\247\276\32g\304j\372Q.", 24) = 24
[pid  4015] 21:01:53.740945 _llseek(98, 503096, [503096], SEEK_SET) = 0
[pid  4015] 21:01:53.740989 write(98,
"\n\0\0\0\25\1\264\0\2@\2\\\2x\2\224\2\260\2\314\2\350\3\4\3
\3<\3X\3t"..., 1024) = 1024
[pid  4015] 21:01:54.636566 _llseek(98, 504120, [504120], SEEK_SET) = 0
[pid  4015] 21:01:54.636636 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247=]`\3700\351\226n", 24) = 24
[pid  4015] 21:01:54.636812 _llsee

Re: [sqlite] Tracing latencies

2012-12-05 Thread Black, Michael (IS)
Run this program (change the diff threshold if you want) on the strace log file.
This will only show the calls that take "too long" and the time involved.

#include 
#include 

main(int argc, char *argv[])
{
  char buf[65535];
  double t1=0,t2;
  FILE *fp=fopen(argv[1],"r");
  while(fgets(buf,sizeof(buf),fp)) {
if (t1==0) {
  t1 = atof(buf);
}
else {
  t2 = atof(buf);
  double diff = t2 - t1;
  if (diff >  .0001) {
printf("%.6f %s",t2-t1,buf);
  }
  t1 = t2;
}
  }
}


Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Keith Chew [keith.c...@gmail.com]
Sent: Wednesday, December 05, 2012 2:11 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] Tracing latencies

Hi Dan

On Wed, Dec 5, 2012 at 6:38 PM, Dan Kennedy <danielk1...@gmail.com> wrote:
> If it's not fsync() then IO delays are normally caused by read().
> You could try [strace -T -eread ...] to check.
>
> Are SELECT statements fast on the same database? How large is the
> database compared to the machines memory?

Ah, do you think the read on the OS level is starving the writes?
There are not many DB selects from the app, but there are other read
IO activities happening in the background. Still it doesn't make
sense, because WAL mode ensures from sqlite's point of view, the
reader will not be blocked by the writer. So, sqlite is subjected to
similar levels of read IO activity as Mysql (because there are very
few DB selects).

The tables are small, around 20MB in total, compared to 2GB of memory available.

I have done more investigation. Using strace, I waited to capture an
insert/update with a long latency, and caught one that is 2s long,
below is the strace to the WAL file. We can see that it is very
bursty, all the seeks and writes span close to over 1s.

I recall doing a strace on Mysql a long time ago, and each insert only
does a single seek and write. Maybe it is sqlite's file format that
requires it to seek/write multiple places causing the slowness? I am
only guessing here, as I do not know how to interpret the strace logs
below.

[pid  4015] 21:01:53.634099 _llseek(98, 499928,  
[pid  4015] 21:01:53.634245 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247+\214\332\260\314Wf ", 24

[pid  4015] 21:01:53.634546 _llseek(98, 499952,  
[pid  4015] 21:01:53.634712 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024 
[pid  4015] 21:01:53.740378 _llseek(98, 500976, [500976], SEEK_SET) = 0
[pid  4015] 21:01:53.740449 write(98,
"\0\0\0\3\0\0\0\0^=\226\213\23\10<\247\327\201>\32\227\323\f8", 24) =
24
[pid  4015] 21:01:53.740521 _llseek(98, 501000, [501000], SEEK_SET) = 0
[pid  4015] 21:01:53.740566 write(98,
"\r\3\221\0\t\3G\1\3\243\3\261\3\352\3\325\3\304\3G\3\177\3m\3\\\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740632 _llseek(98, 502024, [502024], SEEK_SET) = 0
[pid  4015] 21:01:53.740677 write(98,
"\0\0\37X\0\0\0\0^=\226\213\23\10<\247dL\17\316\32\30\301\237", 24) =
24
[pid  4015] 21:01:53.740736 _llseek(98, 502048, [502048], SEEK_SET) = 0
[pid  4015] 21:01:53.740781 write(98,
"\r\0\0\0\5\1*\0\3o\2\336\2L\1\273\1*\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740844 _llseek(98, 503072, [503072], SEEK_SET) = 0
[pid  4015] 21:01:53.740889 write(98,
"\0\0HR\0\0HU^=\226\213\23\10<\247\276\32g\304j\372Q.", 24) = 24
[pid  4015] 21:01:53.740945 _llseek(98, 503096, [503096], SEEK_SET) = 0
[pid  4015] 21:01:53.740989 write(98,
"\n\0\0\0\25\1\264\0\2@\2\\\2x\2\224\2\260\2\314\2\350\3\4\3
\3<\3X\3t"..., 1024) = 1024
[pid  4015] 21:01:54.636566 _llseek(98, 504120, [504120], SEEK_SET) = 0
[pid  4015] 21:01:54.636636 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247=]`\3700\351\226n", 24) = 24
[pid  4015] 21:01:54.636812 _llseek(98, 504144, [504144], SEEK_SET) = 0
[pid  4015] 21:01:54.636860 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024 
[pid  6744] 21:01:54.676590 close(98)   = 0


Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Black, Michael (IS)
Hmmm...looking at that strace sequencing is there some reason those 24-byte and 
1024-byte writes can't be combined?  The 1024-byte write is occurring at the 
end boundary of the 24-byte.

That would cut the i/o ops in half and might be a noticeable improvement. A 
memory copy would be a lot faster than 2 i/o requests.

And since disk page size is always a power of 2 would the 1024 buffer be better 
off as 1000 to align the page i/o better?

Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Keith Chew [keith.c...@gmail.com]
Sent: Wednesday, December 05, 2012 2:11 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] Tracing latencies

Hi Dan

On Wed, Dec 5, 2012 at 6:38 PM, Dan Kennedy <danielk1...@gmail.com> wrote:
> If it's not fsync() then IO delays are normally caused by read().
> You could try [strace -T -eread ...] to check.
>
> Are SELECT statements fast on the same database? How large is the
> database compared to the machines memory?

Ah, do you think the read on the OS level is starving the writes?
There are not many DB selects from the app, but there are other read
IO activities happening in the background. Still it doesn't make
sense, because WAL mode ensures from sqlite's point of view, the
reader will not be blocked by the writer. So, sqlite is subjected to
similar levels of read IO activity as Mysql (because there are very
few DB selects).

The tables are small, around 20MB in total, compared to 2GB of memory available.

I have done more investigation. Using strace, I waited to capture an
insert/update with a long latency, and caught one that is 2s long,
below is the strace to the WAL file. We can see that it is very
bursty, all the seeks and writes span close to over 1s.

I recall doing a strace on Mysql a long time ago, and each insert only
does a single seek and write. Maybe it is sqlite's file format that
requires it to seek/write multiple places causing the slowness? I am
only guessing here, as I do not know how to interpret the strace logs
below.

[pid  4015] 21:01:53.634099 _llseek(98, 499928,  
[pid  4015] 21:01:53.634245 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247+\214\332\260\314Wf ", 24

[pid  4015] 21:01:53.634546 _llseek(98, 499952,  
[pid  4015] 21:01:53.634712 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024 
[pid  4015] 21:01:53.740378 _llseek(98, 500976, [500976], SEEK_SET) = 0
[pid  4015] 21:01:53.740449 write(98,
"\0\0\0\3\0\0\0\0^=\226\213\23\10<\247\327\201>\32\227\323\f8", 24) =
24
[pid  4015] 21:01:53.740521 _llseek(98, 501000, [501000], SEEK_SET) = 0
[pid  4015] 21:01:53.740566 write(98,
"\r\3\221\0\t\3G\1\3\243\3\261\3\352\3\325\3\304\3G\3\177\3m\3\\\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740632 _llseek(98, 502024, [502024], SEEK_SET) = 0
[pid  4015] 21:01:53.740677 write(98,
"\0\0\37X\0\0\0\0^=\226\213\23\10<\247dL\17\316\32\30\301\237", 24) =
24
[pid  4015] 21:01:53.740736 _llseek(98, 502048, [502048], SEEK_SET) = 0
[pid  4015] 21:01:53.740781 write(98,
"\r\0\0\0\5\1*\0\3o\2\336\2L\1\273\1*\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740844 _llseek(98, 503072, [503072], SEEK_SET) = 0
[pid  4015] 21:01:53.740889 write(98,
"\0\0HR\0\0HU^=\226\213\23\10<\247\276\32g\304j\372Q.", 24) = 24
[pid  4015] 21:01:53.740945 _llseek(98, 503096, [503096], SEEK_SET) = 0
[pid  4015] 21:01:53.740989 write(98,
"\n\0\0\0\25\1\264\0\2@\2\\\2x\2\224\2\260\2\314\2\350\3\4\3
\3<\3X\3t"..., 1024) = 1024
[pid  4015] 21:01:54.636566 _llseek(98, 504120, [504120], SEEK_SET) = 0
[pid  4015] 21:01:54.636636 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247=]`\3700\351\226n", 24) = 24
[pid  4015] 21:01:54.636812 _llseek(98, 504144, [504144], SEEK_SET) = 0
[pid  4015] 21:01:54.636860 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024 
[pid  6744] 21:01:54.676590 close(98)   = 0


Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Clemens Ladisch
Keith Chew wrote:
> In the application, it opens 5 connections and put them in a pool.

Please note that SQLite has no write concurrency (actual writing to the
DB file requires an exclusive lock), and that there is also no
synchronization between multiple connections -- when one connection does
not get the exclusive lock, it just waits a little time before trying
again.  It's possible that it's these waits that you're seeing.

> I had a look, and at the most 3 connections are used at a time.

Try it with one connection at a time.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Dan Kennedy

On 12/05/2012 03:11 PM, Keith Chew wrote:

Hi Dan

On Wed, Dec 5, 2012 at 6:38 PM, Dan Kennedy  wrote:

If it's not fsync() then IO delays are normally caused by read().
You could try [strace -T -eread ...] to check.

Are SELECT statements fast on the same database? How large is the
database compared to the machines memory?


Ah, do you think the read on the OS level is starving the writes?
There are not many DB selects from the app, but there are other read
IO activities happening in the background. Still it doesn't make
sense, because WAL mode ensures from sqlite's point of view, the
reader will not be blocked by the writer. So, sqlite is subjected to
similar levels of read IO activity as Mysql (because there are very
few DB selects).

The tables are small, around 20MB in total, compared to 2GB of memory available.

I have done more investigation. Using strace, I waited to capture an
insert/update with a long latency, and caught one that is 2s long,
below is the strace to the WAL file. We can see that it is very
bursty, all the seeks and writes span close to over 1s.

I recall doing a strace on Mysql a long time ago, and each insert only
does a single seek and write. Maybe it is sqlite's file format that
requires it to seek/write multiple places causing the slowness? I am
only guessing here, as I do not know how to interpret the strace logs
below.


A bunch of sequential writes to the wal file, totaling somewhere
around 4KB. This should be fast. But in the middle you have a 1
second delay:

> [pid  4015] 21:01:53.740989 write(98,
> "\n\0\0\0\25\1\264\0\2@\2\\\2x\2\224\2\260\2\314\2\350\3\4\3
> \3<\3X\3t"..., 1024) = 1024
> [pid  4015] 21:01:54.636566 _llseek(98, 504120, [504120], SEEK_SET) = 0

If you use -T instead of -t with strace, it reports the amount
of wall-clock time spent in each system call.

Dan.




[pid  4015] 21:01:53.634099 _llseek(98, 499928,
[pid  4015] 21:01:53.634245 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247+\214\332\260\314Wf ", 24

[pid  4015] 21:01:53.634546 _llseek(98, 499952,
[pid  4015] 21:01:53.634712 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024
[pid  4015] 21:01:53.740378 _llseek(98, 500976, [500976], SEEK_SET) = 0
[pid  4015] 21:01:53.740449 write(98,
"\0\0\0\3\0\0\0\0^=\226\213\23\10<\247\327\201>\32\227\323\f8", 24) =
24
[pid  4015] 21:01:53.740521 _llseek(98, 501000, [501000], SEEK_SET) = 0
[pid  4015] 21:01:53.740566 write(98,
"\r\3\221\0\t\3G\1\3\243\3\261\3\352\3\325\3\304\3G\3\177\3m\3\\\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740632 _llseek(98, 502024, [502024], SEEK_SET) = 0
[pid  4015] 21:01:53.740677 write(98,
"\0\0\37X\0\0\0\0^=\226\213\23\10<\247dL\17\316\32\30\301\237", 24) =
24
[pid  4015] 21:01:53.740736 _llseek(98, 502048, [502048], SEEK_SET) = 0
[pid  4015] 21:01:53.740781 write(98,
"\r\0\0\0\5\1*\0\3o\2\336\2L\1\273\1*\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740844 _llseek(98, 503072, [503072], SEEK_SET) = 0
[pid  4015] 21:01:53.740889 write(98,
"\0\0HR\0\0HU^=\226\213\23\10<\247\276\32g\304j\372Q.", 24) = 24
[pid  4015] 21:01:53.740945 _llseek(98, 503096, [503096], SEEK_SET) = 0
[pid  4015] 21:01:53.740989 write(98,
"\n\0\0\0\25\1\264\0\2@\2\\\2x\2\224\2\260\2\314\2\350\3\4\3
\3<\3X\3t"..., 1024) = 1024
[pid  4015] 21:01:54.636566 _llseek(98, 504120, [504120], SEEK_SET) = 0
[pid  4015] 21:01:54.636636 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247=]`\3700\351\226n", 24) = 24
[pid  4015] 21:01:54.636812 _llseek(98, 504144, [504144], SEEK_SET) = 0
[pid  4015] 21:01:54.636860 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024
[pid  6744] 21:01:54.676590 close(98)   = 0


Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Simon Slavin

On 5 Dec 2012, at 8:11am, Keith Chew  wrote:

> Ah, do you think the read on the OS level is starving the writes?
> There are not many DB selects from the app, but there are other read
> IO activities happening in the background. Still it doesn't make
> sense, because WAL mode ensures from sqlite's point of view, the
> reader will not be blocked by the writer. So, sqlite is subjected to
> similar levels of read IO activity as Mysql (because there are very
> few DB selects).

You may be right.  If your cache is big enough, some storage setups prioritise 
all reads ahead of writes.  They'll accumulate write commands in cache but 
won't actually do them until nothing is asking to read data any more.  The 
theory is that all programs, sooner or later, will need some writes to happen 
before they know what to read next.

In these days when programs do several things simultaneously the theory is 
wrong, and many programs do things like constantly updating one window while 
allowing another to be used to input new information.  If they ask for data to 
be read fast enough, the writes never get done.  This is called 'write 
starvation'.

I can verify that there's nothing in SQLite that makes this assumption or works 
this way.  What you're seeing must be provided by some subset of your firmware, 
networking protocol, framework or database library.  The cures you've already 
discovered: don't use that combination of PRAGMAs on that setup.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-05 Thread Keith Chew
Hi Dan

On Wed, Dec 5, 2012 at 6:38 PM, Dan Kennedy  wrote:
> If it's not fsync() then IO delays are normally caused by read().
> You could try [strace -T -eread ...] to check.
>
> Are SELECT statements fast on the same database? How large is the
> database compared to the machines memory?

Ah, do you think the read on the OS level is starving the writes?
There are not many DB selects from the app, but there are other read
IO activities happening in the background. Still it doesn't make
sense, because WAL mode ensures from sqlite's point of view, the
reader will not be blocked by the writer. So, sqlite is subjected to
similar levels of read IO activity as Mysql (because there are very
few DB selects).

The tables are small, around 20MB in total, compared to 2GB of memory available.

I have done more investigation. Using strace, I waited to capture an
insert/update with a long latency, and caught one that is 2s long,
below is the strace to the WAL file. We can see that it is very
bursty, all the seeks and writes span close to over 1s.

I recall doing a strace on Mysql a long time ago, and each insert only
does a single seek and write. Maybe it is sqlite's file format that
requires it to seek/write multiple places causing the slowness? I am
only guessing here, as I do not know how to interpret the strace logs
below.

[pid  4015] 21:01:53.634099 _llseek(98, 499928,  
[pid  4015] 21:01:53.634245 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247+\214\332\260\314Wf ", 24

[pid  4015] 21:01:53.634546 _llseek(98, 499952,  
[pid  4015] 21:01:53.634712 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024 
[pid  4015] 21:01:53.740378 _llseek(98, 500976, [500976], SEEK_SET) = 0
[pid  4015] 21:01:53.740449 write(98,
"\0\0\0\3\0\0\0\0^=\226\213\23\10<\247\327\201>\32\227\323\f8", 24) =
24
[pid  4015] 21:01:53.740521 _llseek(98, 501000, [501000], SEEK_SET) = 0
[pid  4015] 21:01:53.740566 write(98,
"\r\3\221\0\t\3G\1\3\243\3\261\3\352\3\325\3\304\3G\3\177\3m\3\\\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740632 _llseek(98, 502024, [502024], SEEK_SET) = 0
[pid  4015] 21:01:53.740677 write(98,
"\0\0\37X\0\0\0\0^=\226\213\23\10<\247dL\17\316\32\30\301\237", 24) =
24
[pid  4015] 21:01:53.740736 _llseek(98, 502048, [502048], SEEK_SET) = 0
[pid  4015] 21:01:53.740781 write(98,
"\r\0\0\0\5\1*\0\3o\2\336\2L\1\273\1*\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024) = 1024
[pid  4015] 21:01:53.740844 _llseek(98, 503072, [503072], SEEK_SET) = 0
[pid  4015] 21:01:53.740889 write(98,
"\0\0HR\0\0HU^=\226\213\23\10<\247\276\32g\304j\372Q.", 24) = 24
[pid  4015] 21:01:53.740945 _llseek(98, 503096, [503096], SEEK_SET) = 0
[pid  4015] 21:01:53.740989 write(98,
"\n\0\0\0\25\1\264\0\2@\2\\\2x\2\224\2\260\2\314\2\350\3\4\3
\3<\3X\3t"..., 1024) = 1024
[pid  4015] 21:01:54.636566 _llseek(98, 504120, [504120], SEEK_SET) = 0
[pid  4015] 21:01:54.636636 write(98,
"\0\0\0\4\0\0HU^=\226\213\23\10<\247=]`\3700\351\226n", 24) = 24
[pid  4015] 21:01:54.636812 _llseek(98, 504144, [504144], SEEK_SET) = 0
[pid  4015] 21:01:54.636860 write(98,
"\r\0\0\0\1\0033\0\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1024 
[pid  6744] 21:01:54.676590 close(98)   = 0


Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Dan Kennedy

On 12/05/2012 09:22 AM, Keith Chew wrote:

On Wed, Dec 5, 2012 at 1:14 PM, Keith Chew  wrote:

Since there is no fsync (I know this using grep on the strace ouput),
it is hard to tell what to look for in the strace. The output is very
noisy, so it makes it hard to go through them.

Does anyone know any handy flags for strace to only see all the
activities associated with sqlite file access?


I have confirmed using strace there is no fsyncs done by sqlite (using
synchronous=0 or synchronous=1), just seeks and writes. fsync is seen
only when synchronous=2.

After conducting more tests (using synchronous=0), I have some results
which is baffling me:
- Mysql average insert time, with write-cache OFF: 1-2ms
- Mysql average insert time, with write-cache ON: 0-1ms
- sqlite average insert time, with write-cache OFF: 160ms (have seen
up to 4000ms)
- sqlite average insert time, with write-cache ON: 50ms (have seen up to 150ms)

Quite a significant difference. I am stumped on how to make sqlite run faster.


If it's not fsync() then IO delays are normally caused by read().
You could try [strace -T -eread ...] to check.

Are SELECT statements fast on the same database? How large is the
database compared to the machines memory?

Dan.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Keith Chew
On Wed, Dec 5, 2012 at 1:14 PM, Keith Chew  wrote:
> Since there is no fsync (I know this using grep on the strace ouput),
> it is hard to tell what to look for in the strace. The output is very
> noisy, so it makes it hard to go through them.
>
> Does anyone know any handy flags for strace to only see all the
> activities associated with sqlite file access?

I have confirmed using strace there is no fsyncs done by sqlite (using
synchronous=0 or synchronous=1), just seeks and writes. fsync is seen
only when synchronous=2.

After conducting more tests (using synchronous=0), I have some results
which is baffling me:
- Mysql average insert time, with write-cache OFF: 1-2ms
- Mysql average insert time, with write-cache ON: 0-1ms
- sqlite average insert time, with write-cache OFF: 160ms (have seen
up to 4000ms)
- sqlite average insert time, with write-cache ON: 50ms (have seen up to 150ms)

Quite a significant difference. I am stumped on how to make sqlite run faster.

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Keith Chew
Hi Michael

On Wed, Dec 5, 2012 at 12:02 PM, Black, Michael (IS)
 wrote:
> Can you re-run your strace as "strace -tt" and look at the timings to help 
> pinpoint it?

Since there is no fsync (I know this using grep on the strace ouput),
it is hard to tell what to look for in the strace. The output is very
noisy, so it makes it hard to go through them.

Does anyone know any handy flags for strace to only see all the
activities associated with sqlite file access?

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Black, Michael (IS)
Can you re-run your strace as "strace -tt" and look at the timings to help 
pinpoint it?

Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Keith Chew [keith.c...@gmail.com]
Sent: Tuesday, December 04, 2012 5:00 PM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] Tracing latencies

On Wed, Dec 5, 2012 at 11:10 AM, Keith Chew <keith.c...@gmail.com> wrote:
> The strange thing is that I am setting sqlite it to use WAL,
> autocheckpoint off and synchronous off. Even in this setup, I still
> see > 350ms transactions times for less than 3 TPS. A bit hard to
> believe, so I am now doing a strace to find out exactly what is
> hitting the disk.

II can confirm using strace that there are no fsyncs happening from
the application (which we expect because synchronous=0). So, it must
be something else that is causing these blocks. Somekind of file
locking issue (particular to my environment)? What else should I be
looking at?

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Keith Chew
On Wed, Dec 5, 2012 at 11:10 AM, Keith Chew  wrote:
> The strange thing is that I am setting sqlite it to use WAL,
> autocheckpoint off and synchronous off. Even in this setup, I still
> see > 350ms transactions times for less than 3 TPS. A bit hard to
> believe, so I am now doing a strace to find out exactly what is
> hitting the disk.

II can confirm using strace that there are no fsyncs happening from
the application (which we expect because synchronous=0). So, it must
be something else that is causing these blocks. Somekind of file
locking issue (particular to my environment)? What else should I be
looking at?

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Keith Chew
Hi Michael

On Wed, Dec 5, 2012 at 10:57 AM, Black, Michael (IS)
 wrote:
> Could it be waitiing on the prior transaction though?
>
> Since disk I/O lies it might be syncing the last transaction causing the new 
> one to wait longer.

Yes, that is possible. In the application, it opens 5 connections and
put them in a pool. I had a look, and at the most 3 connections are
used at a time. So, I am looking at 3 TPS max.

The strange thing is that I am setting sqlite it to use WAL,
autocheckpoint off and synchronous off. Even in this setup, I still
see > 350ms transactions times for less than 3 TPS. A bit hard to
believe, so I am now doing a strace to find out exactly what is
hitting the disk.

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Keith Chew
On Wed, Dec 5, 2012 at 10:56 AM, Richard Hipp  wrote:
>> It is not an indexing issue. For one of the UPDATE SQLs, it is
>> updating a table with only 1 record in it. And this takes > 350ms...
>>
>
> Can you provide specifics:  The schema and the UPDATE statement?

Schema and SQL are below:

CREATE TABLE `k_user` (
  `user_id` integer primary key autoincrement,
  `user_name` varchar(50) default NULL,
  `created_at` datetime default NULL,
  `user_group_id` bigint(20) default NULL,
  `x` double default NULL,
  `y` double default NULL,
  `heading` double default NULL,
  `altitude` double default NULL,
  `speed` double default NULL,
  `state_id` bigint(20) default NULL,
  `status` varchar(15) default NULL,
  `last_updated_at` datetime default NULL
);

update k_user set x = ? , y = ? , heading = ? , last_updated_at = ?
where user_id=?

It cannot get simpler than this. For the inserts, it is just as
simple. A table with 15 columns, and a straight insert.

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Black, Michael (IS)
Could it be waitiing on the prior transaction though?

Since disk I/O lies it might be syncing the last transaction causing the new 
one to wait longer.

Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Keith Chew [keith.c...@gmail.com]
Sent: Tuesday, December 04, 2012 3:45 PM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] Tracing latencies

On Wed, Dec 5, 2012 at 10:28 AM, Keith Chew <keith.c...@gmail.com> wrote:
>
> I wonder what could be causing sqlite to hang so long? Will try to
> remove all indexes to see if that narrows things down.

It is not an indexing issue. For one of the UPDATE SQLs, it is
updating a table with only 1 record in it. And this takes > 350ms...
All the other tables have only 1 or 2 indexes, so should not be
impacting the inserts/updates.

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Richard Hipp
On Tue, Dec 4, 2012 at 4:45 PM, Keith Chew  wrote:

> On Wed, Dec 5, 2012 at 10:28 AM, Keith Chew  wrote:
> >
> > I wonder what could be causing sqlite to hang so long? Will try to
> > remove all indexes to see if that narrows things down.
>
> It is not an indexing issue. For one of the UPDATE SQLs, it is
> updating a table with only 1 record in it. And this takes > 350ms...
>

Can you provide specifics:  The schema and the UPDATE statement?


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Keith Chew
On Wed, Dec 5, 2012 at 10:28 AM, Keith Chew  wrote:
>
> I wonder what could be causing sqlite to hang so long? Will try to
> remove all indexes to see if that narrows things down.

It is not an indexing issue. For one of the UPDATE SQLs, it is
updating a table with only 1 record in it. And this takes > 350ms...
All the other tables have only 1 or 2 indexes, so should not be
impacting the inserts/updates.

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Keith Chew
On Wed, Dec 5, 2012 at 9:33 AM, Keith Chew  wrote:
> Any other ideas for me to trace this at a lower level would be greatly
> appreciated. I think I will need to get deeper into linux's block
> layer stuff, so perhaps this is not the correct place to ask the
> question, but I thought I'd start here.

I have done a bit more comparison on the IO pattern when running MySQL
and sqlite, using vmstat (results below). I can observe that sqlite
causes a bit more bytes to be written to the disk, but not by much.
Wait times are also reasonable.

I wonder what could be causing sqlite to hang so long? Will try to
remove all indexes to see if that narrows things down.

Sqlite:
 2  0  0 184148  51564 44622400 028  290  602  0  0 86 14  0
 1  0  0 184148  51572 44623200 028  293 1041 26  7 51 16  0
 0  0  0 184148  51580 44623200 052  300  926  0  0 84 16  0
 1  0  0 184024  51588 44623200 028  293 1071 24  9 53 14  0
 1  0  0 184024  51596 44622800 036  287  802  0  1 83 16  0
 2  0  0 183588  51604 44623600 028  291 1004 10  4 45 41  0
 0  0  0 184024  51612 44623600 056  293 1003 16  3 65 17  0
 0  0  0 184024  51620 44624000 028  293  854  0  0 87 13  0
 0  0  0 184024  51628 44624400 040  292 1060 28  6 49 18  0
 0  0  0 184024  51636 44624000 040  293  598  0  1 82 17  0
 1  0  0 184024  51644 44624400 036  290  899 24  7 54 15  0
 0  0  0 184024  51652 44624800 036  292  591  0  1 90  9  0
 1  0  0 183580  51660 44624800 040  290  758 16  3 67 14  0
 0  0  0 183900  51668 44624800 036  291  730 11  3 71 16  0
 1  0  0 183900  51676 44624800 028  292  601  0  1 59 40  0
 0  0  0 183900  51684 44625200 036  292  894 25  8 67  0  0
 1  0  0 183900  51692 44625200 064  295  590  0  0 91  9  0
 0  0  0 183900  51700 44625600 028  289  860 25  7 55 13  0


Mysql:
 2  0  0 192408  48120 44830800 048  260 1076 30  5 62  3  0
 0  0  0 195300  48128 45031600 032  263 1027 15  4 71 10  0
 2  1  0 194708  48128 45027600 056  257  768 15  4 81  0  0
 2  0  0 199004  48136 44640000 052  263  722 10  2 85  3  0
 1  0  0 197268  48144 44842800 040  258  636  7  0 81 12  0
 0  0  0 196896  48152 44845600 036  260  916 22  9 59 10  0
 1  0  0 198880  48152 44640800 0 0  253  496  0
0 100  0  0
 1  0  0 199012  48160 44626400 024  258  765 28  9 63  0  0
 1  0  0 199012  48168 44626400 032  260  382  1  0 85 14  0
 2  0  0 198868  48168 44626400 0 0  254  701 23  8 69  0  0
 0  0  0 199012  48176 44629200 032  259  434  6  0 82 12  0
 3  0  0 198832  48184 44628000 032  259  445  0  2 86 12  0
 1  0  0 199012  48184 44626400 0 0  253  656 22  8 70  0  0
 1  0  0 199012  48192 44626400 032  259  412  6  1 79 14  0
 1  0  0 19  48200 44625600 056  261  702 24  6 53 17  0
 0  0  0 19  48200 44626400 056  266  376  0  1 96  3  0
 1  0  0 198864  48208 44626400 024  257  757 31  5 53 11  0
 0  0  0 19  48216 44626400 032  259  382  1  1 85 13  0
 2  0  0 19  48216 44626400 0 0  253  368  0
0 100  0  0
 0  0  0 19  48224 44626800 036  258  430  6  1 73 20  0
 1  0  0 197568  48232 44468800 024  267  687  8  3 80  9  0

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Keith Chew
Hi Richard

On Wed, Dec 5, 2012 at 9:05 AM, Richard Hipp  wrote:
> The $dir will contain initial database contents and scripts of SQL
> statements that were run.  You can rerun those SQL statement using the
> command-line shell to find slow ones, then do things like EXPLAIN QUERY
> PLAN to figure out why they are slow and perhaps fix them by adding indices.

Thank you very much for your comprehensive reply. I actually know the
queries, because the application logs both the SQL and times. They are
simple SQL statements, eg "insert into table_a (...) values (...)" or
"update table_a set aaa='...' where bbb='...'".

I just tried enabling the write-cache of the drive, and the warn logs
disappeared. So, I know that somehow the OS/filesystem/disk setup is
blocking the inserts/updates quite severely for sqlite (but not for
MySQL). Is there anything else I can do to improve the situation?
Ideally I would like to understand what is happening under the hood of
sqlite, to improve my understanding of the difference between the 2
DBs.

Any other ideas for me to trace this at a lower level would be greatly
appreciated. I think I will need to get deeper into linux's block
layer stuff, so perhaps this is not the correct place to ask the
question, but I thought I'd start here.

Regards
Keith
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tracing latencies

2012-12-04 Thread Richard Hipp
On Tue, Dec 4, 2012 at 2:30 PM, Keith Chew  wrote:

> Hi
>
> I have an existing application, which I am experimenting with
> replacing MySQL with sqlite. I have 2 boxes, one running MySQL, the
> other sqlite.
>
> Execution times for every insert/update/delete SQLs is measured, and a
> warn log is printed if it takes > 250ms to perform the SQL.
>
> OS configuration:
> - kernel 2.6.39
> - ext3, data=ordered, commit=1, barrier=1
>
> On the MySQL box, I do not see any warn logs, so everything is going
> well. On the sqlite box, I see warn logs come up quite often, ranging
> from 350ms up to 4000ms to run simple insert/update/delete SQL.
>
> I have tried many sqlite configurations including:
> - WAL, synchronous=0, wal_autocheckpoint=0
>
> which should make sqlite run the fastest!
>
> My question is how do I go about tracing the cause of these latencies?
>

Here is one approach:

(1) Get the latest SQLite sources from the Fossil repository
(2) Do "./configure; make sqlite3.c" to build the amalgamation
(3) Do "cat src/test_sqllog.c >>sqlite3.c" to append the "test_sqllog.c"
file to the end of the amalgamation from step (2).
(4) Edit the "sqlite3.c" file to insert a line near the top:

   #define SQLITE_ENABLE_SQLLOG

Or, you can compile with -DSQLITE_ENABLE_SQLLOG, whichever is easiest for
you.
(5) Set the environment variable SQLITE_SQLLOG_DIR=$dir where $dir is an
empty directory someplace where you want logging information to be written.
(6) Run your application

The $dir will contain initial database contents and scripts of SQL
statements that were run.  You can rerun those SQL statement using the
command-line shell to find slow ones, then do things like EXPLAIN QUERY
PLAN to figure out why they are slow and perhaps fix them by adding indices.

Further information in the header comment of test_sqllog.c.


http://www.sqlite.org/src/artifact/7813b47021a6d4e39bb7b1b328a8893dc59885cb

The SQLLOG facility is new.  So you'll have to use the latest sources out
of Fossil for it to work.



> I am starting to look at tools like blktrace, is that the right path?
> Some guidance on how to use some low level tools would be appreciated,
>
> Regards
> Keith
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users