Re: [sqlite] VACUUM & journal size
> First, I'm sure it is nearly impossible to do this as a
> guaranteed, atomic operation on most OSes and filesystems. ...
>
> Second, if this is meant to look like a cleanup operation on the
> original file, the original file (including any filesystem meta-data)
> should be kept intact. ...
>
> Last, file manipulations of this sort aren't supported by the current
> VFS interface, and (IMHO) with good reason. ...
> ...
> However, it might be nice to have a "VACUUM TO " version of the
> command.

Thanks, Jay (and Pavel too), you gave a very detailed description of
the situation. Although it is unlikely that your interesting suggestion
(VACUUM TO) will be implemented, I think one can always make his own
equivalent if it is necessary (creating a subset of the operations from
sqlite3RunVacuum).

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] VACUUM & journal size
On Tue, Mar 16, 2010 at 06:18:13PM +0300, Max Vlasov scratched on the wall:

> When I read the comments it was obvious that the algorithm uses a very
> simple approach: attach a blank database, copy all data, detach,
> rename. Sure, I might be wrong in the details, but generally it looks
> like this.

With one exception. The temporary database is not renamed back to the
original database name. Rather, the data is copied back, page by page,
from the temp database file into the original database file. The
original database file is then truncated to the correct size.

> With these actions the journal file(s) for the new database should not
> contain much data.

That is correct. The comments in the code support this.

> So my guess is one only needs at most the size of the actual data
> from the source database plus a very tiny journal file. But in fact
> (as everyone sees) it uses much more.

Almost. In addition to the original file, you need enough free space
for the newly vacuumed database, as well as the journal files for both
databases.

As mentioned, the journal file for the temporary database is quite
small. However, the "copy back" operation creates a journal file on
the original database. This file is going to be similar in size to the
original database, since every page in that file will be touched as
part of the vacuum process. The journal file is required, as any
failure or error (including a process or power failure) will require
the journal file to recreate the original pre-vacuum database file.

> So the question is what is so special about sqlite3RunVacuum that it
> needs more space than a simple emulation of its actions?

The copy-back operation.

Yes, it would be faster and easier to just copy the temp database back
over to the original database name, but there are issues with that.

First, I'm sure it is nearly impossible to do this as a guaranteed,
atomic operation on most OSes and filesystems. That's bad for data
security.
Second, if this is meant to look like a cleanup operation on the
original file, the original file (including any filesystem meta-data)
should be kept intact. Replacing the file by copying something on top
of it won't do this.

Last, file manipulations of this sort aren't supported by the current
VFS interface, and (IMHO) with good reason. The current vacuum process
works regardless of the operating environment, including in-memory
databases and any custom VFS modules. Reaching outside of that context
to things like filesystem and directory manipulations complicates this.

This makes modification of the current vacuum process unlikely.
However, it might be nice to have a "VACUUM TO " version of the
command. This would allow the user/application to vacuum a database to
another non-temporary file and skip the copy-back operation. This would
save considerable space and some amount of time. You could then do as
you want with the new file.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs. We have
a protractor." "I'll go home and see if I can scrounge up a ruler
and a piece of string." --from Anathem by Neal Stephenson
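[Editorial note: the "make your own equivalent" idea discussed here can be
sketched in a few lines of Python with the standard sqlite3 module. This is
a rough "VACUUM TO"-style workalike under the same approach the thread
describes (attach a fresh database, replay each table's schema from
sqlite_master, bulk-copy the rows), not anything from the SQLite source;
the function name is made up, and indexes/triggers/views are omitted for
brevity.]

```python
import sqlite3

def vacuum_to(src_path, dest_path):
    """Copy every user table from src_path into a fresh database at
    dest_path, skipping the copy-back step of a real VACUUM.
    Indexes, triggers, and views are omitted in this sketch."""
    conn = sqlite3.connect(src_path)
    try:
        conn.execute("ATTACH DATABASE ? AS newdb", (dest_path,))
        tables = conn.execute(
            "SELECT name, sql FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        ).fetchall()
        with conn:  # one transaction covers the whole copy
            for name, sql in tables:
                # Re-run the stored CREATE TABLE statement against newdb.
                conn.execute(sql.replace("CREATE TABLE ",
                                         "CREATE TABLE newdb.", 1))
                conn.execute('INSERT INTO newdb."%s" SELECT * FROM main."%s"'
                             % (name, name))
        conn.execute("DETACH DATABASE newdb")
    finally:
        conn.close()
```

As the thread notes, this only needs space for the new file plus a tiny
journal on the destination, at the cost of leaving the original file (and
its freed pages) untouched.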
Re: [sqlite] VACUUM & journal size
> So the question is what is so special about sqlite3RunVacuum that it
> needs more space than a simple emulation of its actions?

I believe renaming of the file cannot be atomic. So in case of an OS
crash you can end up in a situation without a database at all - no old
one and no new one. Also, deleting the old file can be problematic when
other processes have open handles to it. Even if a Unix system allows
that, all those processes won't know that they should re-open their
handles to work with the new file.

Pavel

On Tue, Mar 16, 2010 at 11:18 AM, Max Vlasov wrote:
>> This means that to VACUUM a SQLite database of size X, you need at
>> least 2X of _additional_ free disk space available. That seems rather
>> wasteful, just looking at it as a SQLite user. Although
>> programmatically there may be reasons for it that I'm not aware of.
>
> Hmm, did some research, I think that the VACUUM requirement for free
> disk space is too big.
>
> When I read the comments it was obvious that the algorithm uses a very
> simple approach: attach a blank database, copy all data, detach,
> rename. Sure, I might be wrong in the details, but generally it looks
> like this.
>
> With these actions the journal file(s) for the new database should not
> contain much data. So my guess is one only needs at most the size of
> the actual data from the source database plus a very tiny journal
> file. But in fact (as everyone sees) it uses much more.
>
> I just tried to perform VACUUM on a test database and emulate the
> actions with a manual attach/copy on the same database.
>
> We have:
> - truecrypt volume, size: 10M, in order to see all the "disk full"
>   errors.
> - new connection, testdb.db3
> - CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY
>   AUTOINCREMENT,[Value] Text)
> - 100,000 times
>   INSERT INTO TestTable (Value) VALUES
>   ("12345678901234567890123456789012345678901234567890")
>   Final size: 6M
> - After we have completely filled the db, let's remove half of the
>   records.
>   DELETE FROM TestTable WHERE Id < 5
>
> Now we have our 6M db (ready to be decreased to ~3M) and about 4M of
> free space.
>
> So, two scenarios:
>
> 1. Simple vacuum
>
> Trying to perform VACUUM:
> Result: "database or disk is full" error.
>
> 2. Emulation of VACUUM actions.
>
> testdb_new.db3 is a new connection (tiny file, 3K in size)
>
> ATTACH 'Q:\testdb_new.db3' AS newdb
>
> BEGIN TRANSACTION
> ; the following operations create two db-journal files so rollback is
> ; possible.
>
> CREATE TABLE newdb.TestTable ([Id] INTEGER PRIMARY KEY
> AUTOINCREMENT,[Value] Text)
> ; actually sqlite3RunVacuum does some string formatting with results
> ; from sqlite_master and performs the corresponding queries, so this
> ; CREATE TABLE is probably the same query.
>
> INSERT INTO newdb.TestTable SELECT * FROM main.TestTable
> ; the operation is OK, Windows reports 3M free, but this is probably
> ; due to cached writes.
>
> END TRANSACTION
> ; OK, testdb_new is 3M and 1M is free
>
> =
>
> So the question is what is so special about sqlite3RunVacuum that it
> needs more space than a simple emulation of its actions?
>
> Thanks
>
> Max
Re: [sqlite] VACUUM & journal size
> This means that to VACUUM a SQLite database of size X, you need at
> least 2X of _additional_ free disk space available. That seems rather
> wasteful, just looking at it as a SQLite user. Although
> programmatically there may be reasons for it that I'm not aware of.

Hmm, did some research, I think that the VACUUM requirement for free
disk space is too big.

When I read the comments it was obvious that the algorithm uses a very
simple approach: attach a blank database, copy all data, detach,
rename. Sure, I might be wrong in the details, but generally it looks
like this.

With these actions the journal file(s) for the new database should not
contain much data. So my guess is one only needs at most the size of
the actual data from the source database plus a very tiny journal file.
But in fact (as everyone sees) it uses much more.

I just tried to perform VACUUM on a test database and emulate the
actions with a manual attach/copy on the same database.

We have:
- truecrypt volume, size: 10M, in order to see all the "disk full"
  errors.
- new connection, testdb.db3
- CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY
  AUTOINCREMENT,[Value] Text)
- 100,000 times
  INSERT INTO TestTable (Value) VALUES
  ("12345678901234567890123456789012345678901234567890")
  Final size: 6M
- After we have completely filled the db, let's remove half of the
  records.
  DELETE FROM TestTable WHERE Id < 5

Now we have our 6M db (ready to be decreased to ~3M) and about 4M of
free space.

So, two scenarios:

1. Simple vacuum

Trying to perform VACUUM:
Result: "database or disk is full" error.

2. Emulation of VACUUM actions.

testdb_new.db3 is a new connection (tiny file, 3K in size)

ATTACH 'Q:\testdb_new.db3' AS newdb

BEGIN TRANSACTION
; the following operations create two db-journal files so rollback is
; possible.

CREATE TABLE newdb.TestTable ([Id] INTEGER PRIMARY KEY
AUTOINCREMENT,[Value] Text)
; actually sqlite3RunVacuum does some string formatting with results
; from sqlite_master and performs the corresponding queries, so this
; CREATE TABLE is probably the same query.

INSERT INTO newdb.TestTable SELECT * FROM main.TestTable
; the operation is OK, Windows reports 3M free, but this is probably
; due to cached writes.

END TRANSACTION
; OK, testdb_new is 3M and 1M is free

=

So the question is what is so special about sqlite3RunVacuum that it
needs more space than a simple emulation of its actions?

Thanks

Max
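[Editorial note: the attach/copy experiment above can be reproduced in
Python on a smaller scale (1,000 rows instead of 100,000). The paths,
row count, and cutoff value here are illustrative, not taken from the
message.]

```python
import os
import sqlite3
import tempfile

# Build a test database, delete roughly half of it, then emulate
# VACUUM by copying the survivors into a freshly attached database.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "testdb.db3")
new = os.path.join(workdir, "testdb_new.db3")

conn = sqlite3.connect(src)
conn.execute("CREATE TABLE TestTable "
             "(Id INTEGER PRIMARY KEY AUTOINCREMENT, Value TEXT)")
conn.executemany("INSERT INTO TestTable (Value) VALUES (?)",
                 [("1234567890" * 5,)] * 1000)  # 50-char values
conn.commit()

# Remove roughly half the records; the file does not shrink on its own.
conn.execute("DELETE FROM TestTable WHERE Id < 500")
conn.commit()

# The manual emulation: attach, recreate the table, bulk-copy the rows.
conn.execute("ATTACH ? AS newdb", (new,))
with conn:
    conn.execute("CREATE TABLE newdb.TestTable "
                 "(Id INTEGER PRIMARY KEY AUTOINCREMENT, Value TEXT)")
    conn.execute("INSERT INTO newdb.TestTable SELECT * FROM main.TestTable")
conn.execute("DETACH newdb")
conn.close()

# The copy holds the remaining rows; the original still carries its
# freed pages, so the new file is smaller.
print(os.path.getsize(src), os.path.getsize(new))
```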
Re: [sqlite] VACUUM & journal size
On Mon, Mar 15, 2010 at 11:18 AM, Scott Hess wrote:
> AFAICT, the operation to copy the pages back _is_ journaled, and the
> journal will get any pages which are overwritten in the front of the
> main database. If the initial database has half of the pages used, it
> seems like the journal could range from almost nothing (if the used
> pages were mostly at the end) to almost half the database size (if the
> used pages were mostly at the front), so to be safe one would need 2x
> the final database size available.

As an aside, it occurs to me that one could implement an alternate
VACUUM which wrote the new database back to the main database file,
journalling the original versions of those pages. The main database
would be locked for longer, but that might be a fair tradeoff for some
applications.

-scott
Re: [sqlite] VACUUM & journal size
On Mon, Mar 15, 2010 at 11:18:32AM -0800, Scott Hess scratched on the wall:
> On Sun, Mar 14, 2010 at 5:18 PM, Jay A. Kreibich wrote:
> > While I have not tested this, I was under the impression that the
> > journal file is very very small, as no modifications are made to the
> > database, other than the final low-level copy (which is not a
> > journaled operation).
>
> AFAICT, the operation to copy the pages back _is_ journaled, and the
> journal will get any pages which are overwritten in the front of the
> main database.

Taking a closer look at the comments in the code, I believe you are
correct. That makes sense anyway, as there is still a need to be able
to back out. There are some comments in the code about how turning
journaling off is slower than leaving it on, but I see now that those
comments are for the temp database, not the original file.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs. We have
a protractor." "I'll go home and see if I can scrounge up a ruler
and a piece of string." --from Anathem by Neal Stephenson
Re: [sqlite] VACUUM & journal size
On Sun, Mar 14, 2010 at 5:18 PM, Jay A. Kreibich wrote:
> On Sun, Mar 14, 2010 at 07:19:59PM -0400, Matthew L. Creech scratched
> on the wall:
>> I have a SQLite database with one large table, and I'd like to shrink
>> the size of that table to free up space in the filesystem. My problem
>> is that the database is (for example) 100 MB, and I have 80 MB of
>> free filesystem space. I figured that I could DELETE, say, half of
>> the records from the table, then VACUUM, and the VACUUM would
>> [temporarily] need ~50 MB of free space for the journal (since that's
>> how much real data there is).
>>
>> Instead, I'm finding that it needs a full 100 MB for the journal,
>> even though once the VACUUM succeeds the resulting DB is only 50 MB.
>> As a result, I'm stuck unable to shrink the database, since VACUUM
>> fails with a disk I/O error (out of space), seemingly no matter how
>> many entries I remove ahead of time. I know the space is being freed,
>> since "PRAGMA freelist_count" shows the expected numbers. So
>> presumably this is just an artifact of the way VACUUM is implemented
>> internally.
>
> Are you sure it is the journal file that is growing too large?
>
> VACUUM works by making a logical copy of the database from the
> original database into a temp database. This restructures the
> database and recovers space. The temp database is then copied back
> to the original database using a low-level page copy. This low-level
> copy then truncates the original database file, recovering
> filesystem space.
>
> This also means the total space required to VACUUM a database is:
>
>   [old database size] + [new database size] + [journal file]
>
> While I have not tested this, I was under the impression that the
> journal file is very very small, as no modifications are made to the
> database, other than the final low-level copy (which is not a
> journaled operation).

AFAICT, the operation to copy the pages back _is_ journaled, and the
journal will get any pages which are overwritten in the front of the
main database. If the initial database has half of the pages used, it
seems like the journal could range from almost nothing (if the used
pages were mostly at the end) to almost half the database size (if the
used pages were mostly at the front), so to be safe one would need 2x
the final database size available.

I could be mis-reading the code.

-scott
Re: [sqlite] VACUUM & journal size
On Mar 14, 2010, at 7:19 PM, Matthew L. Creech wrote:
> Hi,
>
> I have a SQLite database with one large table, and I'd like to shrink
> the size of that table to free up space in the filesystem.
>
> I'm finding that it needs a full 100 MB for the journal, even
> though once the VACUUM succeeds the resulting DB is only 50 MB.
>
> Any tips are appreciated. Thanks!

PRAGMA journal_mode=OFF;

Be warned, though, that if your session crashes or you lose power, your
database will go corrupt if there is no journal.

D. Richard Hipp
d...@hwaci.com
Re: [sqlite] VACUUM & journal size
On Mon, Mar 15, 2010 at 2:11 AM, Matthew L. Creech wrote:
>
> I'll give this a try tomorrow on a real device with journaling off,
> and see how much space it uses in /tmp with journaling turned off.
>

I ran some tests on a real device with a real database, and got the
following results:

Test 1: 250,000 entries, normal VACUUM

Before the test:
- 27.9 MB database
- No journal
- 37.5 MB free disk space

By the time the journal file starts growing:
- 27.9 MB database
- Small journal
- ~15 MB free disk space

Toward the end of the operation (before failure):
- 27.9 MB database
- 14.6 MB journal
- < 2 MB free disk space

The VACUUM operation fails with a disk I/O error, presumably due to
running out of space.

Test 2: 250,000 entries, journal_mode = OFF, VACUUM

Before the test:
- 27.9 MB database
- 37.5 MB free disk space

Toward the end of the operation:
- 28.1 MB database
- 10.6 MB free disk space

The VACUUM operation succeeds.

Test 3: 250,000 entries trimmed to 100,000 entries, normal VACUUM

Before the test:
- 27.9 MB database
- No journal
- 37.5 MB free disk space

By the time the journal file starts growing:
- 27.9 MB database
- Small journal
- ~33 MB free disk space

Toward the end of the operation:
- 27.9 MB database
- ~28 MB journal
- ~5 MB free disk space

Afterward:
- 11.2 MB database
- 54.3 MB free disk space

The VACUUM operation succeeds.

Test 4: 250,000 entries trimmed to 100,000 entries, journal_mode = OFF,
VACUUM

Before the test:
- 27.9 MB database
- 37.5 MB free disk space

Toward the end of the operation:
- 28.1 MB database
- 33.3 MB free disk space

The VACUUM operation succeeds.

I never did see any temporary files, but space was obviously being
taken up by a temp database, so I assume that SQLite opens a file and
then unlink()s it, or something like that.

It looks like in the normal (journaled) case, the journal file
consistently grows about as large as the original database, but the
extra disk space used up by the temporary database is dependent on the
new database size. So Jay's estimate of disk usage:

  [old database size] + [new database size] + [journal file]

is correct, but for the normal VACUUM case, [journal file] is basically
equivalent to [old database size]. So it's really just:

  (2 * [old database size]) + [new database size]

This means that to VACUUM a SQLite database of size X, you need at
least 2X of _additional_ free disk space available. That seems rather
wasteful, just looking at it as a SQLite user. Although
programmatically there may be reasons for it that I'm not aware of.

At any rate, I guess my immediate problem will have to be solved with a
one-off patch that disables journaling and does a VACUUM, and for
long-term usage I'll need to shrink the database capacity even more to
account for the 3x disk space usage if I need to VACUUM again in the
future.

Thanks

--
Matthew L. Creech
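[Editorial note: the space accounting derived in this message can be
written down as a tiny helper. The function name is made up; the formula
is exactly the one stated above: a temp copy roughly the size of the new
database, plus a rollback journal roughly the size of the old one.]

```python
def vacuum_space_needed(old_db_bytes, new_db_bytes):
    """Estimate the ADDITIONAL free space a journaled VACUUM needs:
    a temp database about the size of the new file, plus a rollback
    journal that grows to roughly the old database size."""
    temp_copy = new_db_bytes   # temp database built by the logical copy
    journal = old_db_bytes     # copy-back journals every original page
    return temp_copy + journal

# For the original question's numbers (100 MB shrinking to 50 MB),
# this predicts roughly 150 MB of extra space - more than the 80 MB
# that was actually free.
```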
Re: [sqlite] VACUUM & journal size
On Sun, Mar 14, 2010 at 9:18 PM, Jay A. Kreibich wrote:
>
> Are you sure it is the journal file that is growing too large?
> ...
>
> Now, if I'm following you correctly, the numbers you gave seem to
> indicate that this should work... If the old database is 100MB and
> the new database is 50MB and I'm saying the journal file is small,
> then 80MB free before you start should be enough.
>

I'm sure that it's the journal file. This is happening on an embedded
device with a flash-based filesystem which is rather slow, so I was
logged in via SSH and could see the journal file growing unusually
large as free space shrank to zero. To double-check, I just tried a
similar test on my desktop:

$ ls -l deadband.db
-rw-r--r-- 1 mlcreech mlcreech 85209088 Mar 15 01:35 deadband.db
$ ./sqlite3 deadband.db
SQLite version 3.6.22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> SELECT count() FROM val_table;
663552
sqlite> DELETE FROM val_table WHERE idx < 60;
sqlite> SELECT count() FROM val_table;
51840
sqlite> VACUUM;

In another shell, I've got a script monitoring the size of
deadband.db-journal every 100ms. When I do the VACUUM, it shows:

Journal size: 4096 bytes
Journal size: 4096 bytes
Journal size: 7389184 bytes
Journal size: 14774272 bytes
Journal size: 22159360 bytes
Journal size: 2958 bytes
Journal size: 36929536 bytes
Journal size: 44314624 bytes
Journal size: 51699712 bytes
Journal size: 59080704 bytes
Journal size: 66658304 bytes
Journal size: 81235968 bytes
Journal size: 85393408 bytes

That last size is actually larger than the original database
(additional transaction metadata and whatnot, I guess). After it's
done, though, the file size is appropriately smaller:

$ ls -l deadband.db
-rw-r--r-- 1 mlcreech mlcreech 6709248 Mar 15 01:43 deadband.db

I notice that when I've mostly emptied the database (as in this
example), the VACUUM completes quickly, roughly in proportion to how
many real entries are left. But the journal size still follows a
near-linear growth throughout the operation, regardless. So presumably
it's not really doing anything with the old data, but it's still
copying it over to the journal as it goes through the old DB
page-by-page, or something.

You mentioned that it's creating a temp database, and looking at
sqlite3RunVacuum() I see 'vacuum_db', which seems to be just that.
However, I'm not sure where that data is actually going. Setting
temp_store_directory to my current directory didn't actually generate
any files while the VACUUM was happening, that I could see.

> PRAGMA journal_mode = OFF;
>
> I would do this on a test system. If the problem really is the
> journal file, this should allow things to work. If you still get a
> space error, we're dealing with something else.

This could work as a last resort, although I'd hoped to find a better
way - these are embedded devices, so power failure or reboot in the
middle of this operation is a possibility. For this one-time case,
though, I may be able to swing it if there's no other choice.

By the way, there's one single read/write filesystem in flash on these
systems, so in my case "/tmp" is no different from the directory that
the database & journal are stored in. There's also not nearly enough
free memory to hold the small database, so I can't use temp_store =
MEMORY, unfortunately.

I'll give this a try tomorrow on a real device with journaling off, and
see how much space it uses in /tmp.

Thanks for the response!

--
Matthew L. Creech
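[Editorial note: the monitoring script used above isn't shown in the
thread; a minimal Python equivalent might look like this. The 100 ms
interval and "Journal size:" format just mirror the output quoted in the
message; a missing journal file is reported as 0 bytes, since SQLite
deletes the journal when the transaction completes.]

```python
import os
import time

def journal_size(path):
    """Size of the journal file in bytes, or 0 if it doesn't exist."""
    try:
        return os.path.getsize(path)
    except OSError:
        return 0

def monitor(journal_path, interval=0.1):
    """Poll the journal file every `interval` seconds, printing each
    change in size. Runs until interrupted (Ctrl-C)."""
    last = None
    while True:
        size = journal_size(journal_path)
        if size != last:
            print("Journal size: %d bytes" % size)
            last = size
        time.sleep(interval)

# Example (runs until interrupted):
# monitor("deadband.db-journal")
```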
Re: [sqlite] VACUUM & journal size
On Sun, Mar 14, 2010 at 07:19:59PM -0400, Matthew L. Creech scratched
on the wall:

> Hi,
>
> I have a SQLite database with one large table, and I'd like to shrink
> the size of that table to free up space in the filesystem. My problem
> is that the database is (for example) 100 MB, and I have 80 MB of free
> filesystem space. I figured that I could DELETE, say, half of the
> records from the table, then VACUUM, and the VACUUM would
> [temporarily] need ~50 MB of free space for the journal (since that's
> how much real data there is).
>
> Instead, I'm finding that it needs a full 100 MB for the journal, even
> though once the VACUUM succeeds the resulting DB is only 50 MB. As a
> result, I'm stuck unable to shrink the database, since VACUUM fails
> with a disk I/O error (out of space), seemingly no matter how many
> entries I remove ahead of time. I know the space is being freed, since
> "PRAGMA freelist_count" shows the expected numbers. So presumably
> this is just an artifact of the way VACUUM is implemented internally.

Are you sure it is the journal file that is growing too large?

VACUUM works by making a logical copy of the database from the original
database into a temp database. This restructures the database and
recovers space. The temp database is then copied back to the original
database using a low-level page copy. This low-level copy then
truncates the original database file, recovering filesystem space.

This also means the total space required to VACUUM a database is:

  [old database size] + [new database size] + [journal file]

While I have not tested this, I was under the impression that the
journal file is very very small, as no modifications are made to the
database, other than the final low-level copy (which is not a journaled
operation).

Now, if I'm following you correctly, the numbers you gave seem to
indicate that this should work... If the old database is 100MB and the
new database is 50MB and I'm saying the journal file is small, then
80MB free before you start should be enough.

Except... all that disk space isn't taken from the same spot. The temp
database is opened as '' (e.g. a zero-length string), which creates it
in the temporary space. Where, exactly, this is depends on the host OS.
Unix likes /var/tmp, /usr/tmp, or /tmp, for example. If /tmp is a
different filesystem (as it often is), and if it isn't big enough to
hold the new database, you're going to get a space error.

So... Just to verify whether this really is or isn't the journal file,
you can try to turn the journal file off:

  PRAGMA journal_mode = OFF;

I would do this on a test system. If the problem really is the journal
file, this should allow things to work. If you still get a space error,
we're dealing with something else.

If that doesn't work, move the temporary files to some place that has
enough space to hold the new database file. It should be OK to set this
to the same directory as the original database:

  PRAGMA temp_store_directory = '/path/to/database/directory';

If that filesystem has enough space to hold both the old database and
the new database, you should be able to vacuum it correctly.

Let us know how it works out.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs. We have
a protractor." "I'll go home and see if I can scrounge up a ruler
and a piece of string." --from Anathem by Neal Stephenson
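[Editorial note: the two pragmas suggested above can be combined into a
small Python helper. The function name is made up; the warning from
elsewhere in the thread applies - with journal_mode = OFF, a crash or
power loss during the VACUUM can corrupt the database.]

```python
import sqlite3

def space_saving_vacuum(db_path, temp_dir):
    """Run VACUUM with the rollback journal disabled (NO crash
    protection while this runs!) and with SQLite's temporary files
    pointed at a directory known to have enough room."""
    conn = sqlite3.connect(db_path)
    try:
        # Both pragmas must be issued before the VACUUM itself.
        # temp_store_directory is ignored by builds that omit it.
        conn.execute("PRAGMA temp_store_directory = '%s'" % temp_dir)
        conn.execute("PRAGMA journal_mode = OFF")
        conn.execute("VACUUM")
    finally:
        conn.close()
```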
[sqlite] VACUUM & journal size
Hi,

I have a SQLite database with one large table, and I'd like to shrink
the size of that table to free up space in the filesystem. My problem
is that the database is (for example) 100 MB, and I have 80 MB of free
filesystem space. I figured that I could DELETE, say, half of the
records from the table, then VACUUM, and the VACUUM would [temporarily]
need ~50 MB of free space for the journal (since that's how much real
data there is).

Instead, I'm finding that it needs a full 100 MB for the journal, even
though once the VACUUM succeeds the resulting DB is only 50 MB. As a
result, I'm stuck unable to shrink the database, since VACUUM fails
with a disk I/O error (out of space), seemingly no matter how many
entries I remove ahead of time. I know the space is being freed, since
"PRAGMA freelist_count" shows the expected numbers. So presumably this
is just an artifact of the way VACUUM is implemented internally.

Is there anything that I can do to shrink the database in-place? It
looks like auto-vacuum is out of the question, since it has to have
been enabled before the table was initially created.

FYI, this situation exists on a few dozen devices in various locations,
so I'd need a programmatic solution - "move the DB somewhere else,
VACUUM, then move it back" won't work, unfortunately. :-)

Any tips are appreciated. Thanks!

--
Matthew L. Creech
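[Editorial note: the behavior described in this message - DELETE freeing
pages onto the freelist without shrinking the file, and only VACUUM
returning space to the filesystem - can be demonstrated with a short
script. The table and column names are made up for the demonstration.]

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(db)
conn.execute("CREATE TABLE big (v TEXT)")
conn.executemany("INSERT INTO big VALUES (?)", [("x" * 200,)] * 5000)
conn.commit()
size_full = os.path.getsize(db)

# Delete most of the rows: pages go onto the freelist, but without
# auto-vacuum the file itself does not shrink.
conn.execute("DELETE FROM big WHERE rowid > 100")
conn.commit()
freed = conn.execute("PRAGMA freelist_count").fetchone()[0]
size_after_delete = os.path.getsize(db)

# Only VACUUM actually returns the space to the filesystem.
conn.execute("VACUUM")
conn.close()
size_after_vacuum = os.path.getsize(db)

print(freed, size_full, size_after_delete, size_after_vacuum)
```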