What was the size of the original database?

To VACUUM a database, the process is:

1) Read the logical contents of the database and write them to a new file.  
This generates at least 1x reads (original size) and 1x writes (final size).  
In practice, both the reads and the writes will be larger, because some blocks 
are touched more than once.  This is especially true if there are large 
indexes.

2) The new file is then copied back to the original file, one block at a time.  
This requires another 1x reads (final size) and 1x writes (final size).

3) Except that, to make the write-back ACID-safe, each block of the original 
database must also be copied out to the rollback journal as it is overwritten, 
which requires another 1x reads (final size) and 1x writes (final size) in 
journal mode.  The WAL numbers are similar, if not higher.

In other words, the ideal case is a bare minimum of 3x final-size writes and 
2x final-size plus 1x original-size reads.  But even on a freshly VACUUMed 
database you'll never see the ideal numbers, especially if there are indexes: 
indexes are rebuilt by insertion, so if the source table data is not already 
in index order, that can require a lot of data shuffling (i.e. extra reads 
and writes).
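
To put rough numbers on the 22 GB case quoted below (a back-of-the-envelope 
illustration only; it assumes the original file was not much larger than the 
final 22 GB, which is exactly why I asked about the original size):

  minimum writes  ~ 3 x 22 GB                =  66 GB
  minimum reads   ~ 1 x original + 2 x 22 GB = ~66 GB  (if original ~ 22 GB)

The reported 117 GB of writes and 134 GB of reads are well above that floor, 
but given the index rebuilds just described, landing at roughly twice the 
theoretical minimum is not a shocking result.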


On a similar note, SQLite typically requires roughly 2x the final database 
size in free storage space to complete a VACUUM.
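
As I understand it, that figure comes from where the extra data lives during 
the operation: the rebuilt copy is written to a temporary file in SQLite's 
temp directory, while the rollback journal grows alongside the original.  On 
a unix build, if the database's own filesystem is tight on space, the 
temporary copy can be pointed at a roomier one, something like:

  $ SQLITE_TMPDIR=/big/scratch sqlite3 mydb.db "VACUUM;"

(SQLITE_TMPDIR is the standard unix environment variable for the temp 
directory; the path and database name above are just placeholders.)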


There are a number of ways to improve this.  Most of the I/O is in the 
write-back process, which is required to make the VACUUM transaction 
ACID-safe.  In 2010 I proposed a “VACUUM TO” command that would VACUUM one 
database file into a new database file, essentially making a compacted copy.  
This would only require 1x original reads, roughly 1x final writes, and only 
1x final free space.  The disadvantage is that you end up with a new file, 
which would require closing all connections (including those in other 
applications) and re-opening them.  SQLite also does not trust OS filesystem 
commands (such as renaming a new file over an old one) to operate in any kind 
of transaction/rollback-safe way, so it avoids them.  There seem to be plenty 
of situations where that's an acceptable trade-off, however.

See:

http://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg87972.html
http://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg50941.html
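
In the meantime, you can approximate the copy-to-a-new-file approach with 
nothing but the stock sqlite3 shell.  This is not the proposed VACUUM TO and 
it is not transactional, but the I/O profile is similar (about 1x original 
reads and 1x final writes, plus the usual index-rebuild overhead).  The file 
names below are made up for the example:

  # Rebuild the database into a fresh, compacted file.  Make sure nothing
  # writes to big.db while this runs.
  $ sqlite3 big.db ".dump" | sqlite3 compacted.db

  # Swapping the files is up to you, and every connection to big.db must be
  # closed first; that's exactly the caveat described above.
  $ mv compacted.db big.db

The text round-trip costs some CPU compared to a native copy, but it skips 
the write-back step entirely.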


  -j, author of “Using SQLite”, O’Reilly Media



On Oct 1, 2016, at 3:27 PM, Domingo Alvarez Duarte <mingo...@gmail.com> wrote:

> Hello !
> 
> I'm using sqlite (trunk) for a database (see below) and for a final database 
> file of 22GB a "vacuum" was executed, and doing so it made a lot of I/O 
> (134GB reads and 117GB writes in 2h:30min).
> 
> Can something be improved on sqlite to achieve a better performance ?
> 
> The data is publicly available, just in case it can be useful to perform tests.
> 
> Cheers !
> 
> 

--  
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it, but showing 
it to the wrong people has the tendency to make them feel uncomfortable." -- 
Angela Johnson


