[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Simon Slavin

On 17 Feb 2016, at 4:27pm, Dave Baggett  wrote:

> One clarification: is it the case that transaction bundling ONLY affects 
> write/delete operations -- i.e., those operations that alter the database?

A transaction is needed for read operations too.  Otherwise the database might 
be changed in the middle of one, giving you an inconsistent set of rows.
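
(As a minimal sketch of what that looks like in practice -- this uses Python's 
standard sqlite3 module rather than apsw, and the table name is invented -- 
wrapping several reads in one transaction makes them all see the same snapshot:)

    import sqlite3

    con = sqlite3.connect("cache.db", isolation_level=None)  # autocommit; transactions managed by hand
    cur = con.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS messages (folder TEXT, subject TEXT)")

    cur.execute("BEGIN")  # deferred transaction; the read lock is taken at the first SELECT
    folder_rows = cur.execute("SELECT * FROM messages WHERE folder = ?",
                              ("inbox",)).fetchall()
    total = cur.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
    cur.execute("COMMIT")  # both queries saw one consistent view of the database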

> Another clarification: is it the case that writes within a single transaction 
> will remain in the in-memory page cache until COMMIT is issued?

No, but you can get close to this by holding your journal in memory.

> I see various pragmas like cache_spill that seem to control the page cache 
> but I'm confused as to what they do.

Ignore them all.  They won't do what you're hoping for here.

> Finally, is there a way to tell SQLite I would like to buffer, say, 16MB of 
> write/delete transactions in memory before the transactions are written to 
> disk?

No.  But you can use



and turn "PRAGMA synchronous = OFF".

Simon.


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Dave Baggett
Clarification to my own post:

When I talk about buffering, say, 16MB of write transactions in memory, I want 
the effects of those transactions to be visible to readers immediately (once 
they are "committed" to RAM). This seems akin to WAL mode, where SQLite scans 
the .wal file prior to scanning the .dat file on SELECTs.

(BTW, I'm using WAL mode and have found it performs better than standard 
journal mode on pretty much every target device.)

Dave


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Dave Baggett
OK, that helps -- thank you.

One clarification: is it the case that transaction bundling ONLY affects 
write/delete operations -- i.e., those operations that alter the database?

Another clarification: is it the case that writes within a single transaction 
will remain in the in-memory page cache until COMMIT is issued? I see various 
pragmas like cache_spill that seem to control the page cache but I'm confused 
as to what they do.

Finally, is there a way to tell SQLite I would like to buffer, say, 16MB of 
write/delete transactions in memory before the transactions are written to 
disk? Here I am talking about the meta level above the transaction level -- I 
have atomic transactions and I want to defer physically writing them until I 
have enough of them (say, 16MB worth of altered pages).

Dave


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Dave Baggett
That's a great suggestion. One issue, though, is that I'd have to run two FTS 
searches -- one on the disk-based database, and one on the memory-based one. I 
also already have the database split into 8 .dat files for scaling purposes. :)

But this may be workable -- thanks. (BTW, I am using SQLite via apsw -- thanks 
for that too!)

Dave



[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Simon Slavin

On 17 Feb 2016, at 3:34pm, Simon Slavin  wrote:

> A lot of operations on the database file are done at the beginning and end of 
> every transaction.  If your journal is in memory, then you can dramatically 
> reduce disk usage by using large transactions.  So I think you are right and 
> you should check out that strategy.

I'm sorry, that was poorly phrased.  Here's a better version:

A high proportion of the disk activity involved in making changes to the 
database is done to support the transaction structure, rather than the 
individual operation (INSERT/UPDATE/DELETE) you asked for.  Grouping lots of 
operations together into one transaction will reduce the overhead needed for 
locking and ACID.  In addition, in some journal modes, operations relating to 
the transaction as a whole are done on the database file, whereas much of the 
work relating to the individual operations is done on the journal file.  This 
should increase the advantage you gain in your situation from using large 
transactions.
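
(A small illustration of the difference -- standard Python sqlite3 module 
rather than apsw, with a made-up table and row count -- comparing 
one-transaction-per-INSERT against a single grouped transaction:)

    import sqlite3, time

    con = sqlite3.connect("demo.db", isolation_level=None)  # autocommit mode
    con.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT, v TEXT)")

    def autocommit_inserts(n):
        # Each INSERT is its own transaction: locking and journal work every time.
        for i in range(n):
            con.execute("INSERT INTO kv VALUES (?, ?)", (str(i), "x"))

    def one_big_transaction(n):
        # All INSERTs share one transaction: the per-transaction overhead is paid once.
        con.execute("BEGIN")
        for i in range(n):
            con.execute("INSERT INTO kv VALUES (?, ?)", (str(i), "x"))
        con.execute("COMMIT")

    for fn in (autocommit_inserts, one_big_transaction):
        start = time.perf_counter()
        fn(5000)
        print(fn.__name__, round(time.perf_counter() - start, 3), "seconds")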

Simon.


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Simon Slavin

On 17 Feb 2016, at 2:48pm, Dave Baggett  wrote:

> Regarding transactions, I'm bundling write operations into transactions, but 
> not optimally. If, say, I do a huge write as a single transaction, will that 
> cause SQLite to keep everything in memory until I COMMIT? Perhaps that's the 
> right strategy for me to pursue.

A lot of operations on the database file are done at the beginning and end of 
every transaction.  If your journal is in memory, then you can dramatically 
reduce disk usage by using large transactions.  So I think you are right and 
you should check out that strategy.

I've forgotten whether you already said which journal mode you are using or 
not, but there is a big difference in journal usage between WAL mode and the 
original mode.  Benchmarking between these two modes would be another good test 
for you.  Unfortunately I have no idea which option would be better for you, 
but there are people here with more experience of these differences than I 
have, and I hope some might help.

> If there is a document (or even section of the SQLite source) that I could 
> read to fully understand where the transition from memory (page cache, etc.) 
> to disk occurs, that would probably get me above n00b level of understanding, 
> which would help.

I don't think it's gathered together at one point.  For traditional mode you 
might find section 4 of



useful.  Sorry, but it mixes information at the level you want with far more 
detail than you need.  The WAL information is gathered together, and



explains a lot about how WAL mode uses the journal file.  

Simon.


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Dave Baggett
I should have clarified: the problematic locking is happening in the OS layer. 
I've completely disabled SQLite (thread) locking by building with 
SQLITE_THREADSAFE=0.
(And, yes, I'm only using SQLite from a single thread!)

Regarding transactions, I'm bundling write operations into transactions, but 
not optimally. If, say, I do a huge write as a single transaction, will that 
cause SQLite to keep everything in memory until I COMMIT? Perhaps that's the 
right strategy for me to pursue.

If there is a document (or even section of the SQLite source) that I could read 
to fully understand where the transition from memory (page cache, etc.) to disk 
occurs, that would probably get me above n00b level of understanding, which 
would help.

Dave



[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Simon Slavin

On 17 Feb 2016, at 2:37pm, Dave Baggett  wrote:

> I'm seeking specific advice on how to tune SQLite for this application when 
> deployed on a target with extremely poor write performance. On this target 
> writing in many small chunks is much more expensive than writing in a single 
> big sequential chunk. In particular, the write syscall is very expensive, 
> because frequent writes cause the host to spend huge amounts of time in lock 
> contention, because the locking strategy is very coarse.

The vast majority of time spent writing is not in the change you asked to make 
(INSERT, UPDATE, etc.) but in the measures taken to handle the transaction: 
locking, ensuring ACID, and so on.

Are you doing multiple write commands in one chunk of time?  If so, are you 
enclosing them in a single transaction?

Simon.


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Dave Baggett
My application uses SQLite to cache data. Specifically, the cache isn't the 
storage of record for anything; it's just a local copy of data that's stored 
remotely on servers of record. The cache primarily facilitates rapid searching 
of the data locally, via FTS.

I'm seeking specific advice on how to tune SQLite for this application when 
deployed on a target with extremely poor write performance. On this target 
writing in many small chunks is much more expensive than writing in a single 
big sequential chunk. In particular, the write syscall is very expensive, 
because frequent writes cause the host to spend huge amounts of time in lock 
contention, because the locking strategy is very coarse.

Given that the data I am storing in SQLite is expendable -- I can always fetch 
it again from the server -- I don't really care if I lose, say, the last 30 
seconds of data written if the machine loses power, the app is killed by the 
host operating system, etc. However, I want to avoid the database becoming 
corrupt, since that would require fetching everything again.

It seems like an optimal strategy for this would be to keep a WAL-like journal 
in memory. I have plenty of RAM so 64MB for an in-memory WAL "file" would work. 
However, I don't see any way to tell SQLite to use WAL mode but keep the WAL 
file in memory. I also believe from reading the docs that if the memory-based 
WAL file is lost (e.g., loss of power) then the database will be corrupt.

I've tried journal_mode=MEMORY, but that didn't seem to help much. It seems 
that no matter what pragmas I use, I can't convince SQLite to keep transactions 
in memory for very long.

What I need, I guess, is some tuning parameter -- or I can write my own VFS -- 
that buffers entire transactions, then periodically flushes large groups of 
transactions at once, minimizing the number of write calls.
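
(One way to approximate this above SQLite rather than inside it is an 
application-side buffer that queues statements in RAM and replays them to disk 
as one large transaction once enough have accumulated.  The sketch below is 
only illustrative -- the class name, size accounting and 16MB threshold are 
invented, and it uses the standard sqlite3 module rather than apsw.  Note that 
buffered statements are not visible to readers until the flush, which is 
exactly the visibility concern raised in the clarification above.)

    import sqlite3

    class BufferedWriter:
        """Queue write statements in RAM; replay them to disk in one big transaction."""

        def __init__(self, path, flush_bytes=16 * 1024 * 1024):
            self.con = sqlite3.connect(path, isolation_level=None)
            self.flush_bytes = flush_bytes
            self.pending = []       # list of (sql, params) waiting to be written
            self.pending_bytes = 0  # rough estimate of how much data is buffered

        def execute(self, sql, params=()):
            self.pending.append((sql, params))
            self.pending_bytes += len(sql) + sum(len(str(p)) for p in params)
            if self.pending_bytes >= self.flush_bytes:
                self.flush()

        def flush(self):
            if not self.pending:
                return
            self.con.execute("BEGIN")
            for sql, params in self.pending:
                self.con.execute(sql, params)
            self.con.execute("COMMIT")  # one burst of disk writes for many logical updates
            self.pending = []
            self.pending_bytes = 0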

I'd welcome any suggestions from SQLite experts on this.

Dave



[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Roger Binns
On 17/02/16 08:20, Dave Baggett wrote:
> One issue, though is that I'd have to run two FTS searches to
> search -- one on the disk-based database, and one on the
> memory-based one

You see issues, I see features :-)

The memory-based cache would contain the most recently cached "fresh" 
information, which is probably what is of most interest to the user.  So the 
FTS searches against it will be very fast, which is great, while older 
information is searched at "normal" speed.

Roger


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Scott Hess
Just FYI, FTS writes each transaction's index data in a segment, then does
segment merges over time.  So there's some advantage to bulk updates versus
one-at-a-time updates in terms of index fragmentation and write overhead.
Having an in-memory FTS table which you spill to the on-disk table(s) as a
big transaction can work pretty well.
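
(As a hedged illustration: both FTS4 and FTS5 accept a special 'optimize' 
command that merges the accumulated index segments into one, which can be 
worth running after a bulk load.  The sketch below assumes an SQLite build 
with FTS5 enabled and uses the standard Python sqlite3 module; the table name 
is invented:)

    import sqlite3

    con = sqlite3.connect("cache.db", isolation_level=None)
    con.execute("CREATE VIRTUAL TABLE IF NOT EXISTS docs USING fts5(body)")

    # Bulk-load many documents in one transaction so FTS writes fewer, larger segments.
    con.execute("BEGIN")
    con.executemany("INSERT INTO docs(body) VALUES (?)",
                    [("document %d text" % i,) for i in range(10000)])
    con.execute("COMMIT")

    # Merge the accumulated index segments into one to reduce fragmentation.
    con.execute("INSERT INTO docs(docs) VALUES ('optimize')")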

Note that big transactions can have their own issues, for instance if you
end up having to spill the page cache.

[Obviously, you'll want to test this for your case.]

-scott




[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Roger Binns
On 17/02/16 06:37, Dave Baggett wrote:
> I'd welcome any suggestions

How about two databases?  Create an in-memory database for the cache.  Then 
whenever it hits a certain size (e.g. 64MB) or a certain amount of time has 
passed (e.g. 5 minutes), copy/move data from the memory database to the 
persistent (disk) one.  This ensures the writes to the disk database happen in 
big chunks.
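
(A rough sketch of that flush step, assuming the standard Python sqlite3 
module; the kv schema, thresholds and size estimate are all invented for 
illustration:)

    import sqlite3, time

    mem = sqlite3.connect(":memory:", isolation_level=None)
    mem.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    mem.execute("ATTACH DATABASE 'cache.db' AS disk")
    mem.execute("CREATE TABLE IF NOT EXISTS disk.kv (k TEXT PRIMARY KEY, v TEXT)")

    FLUSH_BYTES = 64 * 1024 * 1024  # ~64MB threshold
    FLUSH_SECONDS = 5 * 60          # or every 5 minutes
    last_flush = time.monotonic()

    def cached_bytes():
        # Approximate the in-memory database size as page_count * page_size.
        pages = mem.execute("PRAGMA main.page_count").fetchone()[0]
        page_size = mem.execute("PRAGMA main.page_size").fetchone()[0]
        return pages * page_size

    def maybe_flush():
        global last_flush
        too_small = cached_bytes() < FLUSH_BYTES
        too_soon = time.monotonic() - last_flush < FLUSH_SECONDS
        if too_small and too_soon:
            return
        mem.execute("BEGIN")
        mem.execute("INSERT OR REPLACE INTO disk.kv SELECT k, v FROM main.kv")
        mem.execute("DELETE FROM main.kv")
        mem.execute("COMMIT")  # one big, mostly sequential write to the disk database
        last_flush = time.monotonic()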

Roger