Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
Sorry for the slow response -- yes, this is great logic. We're just disabling vacuum. Thanks! -david On Mon, Dec 8, 2014 at 6:18 PM, Simon Slavinwrote: > > On 9 Dec 2014, at 1:36am, David Barrett wrote: > > > *Re: Why VACUUM.* We vacuum weekly. This particular database is a > > "rolling journal" -- we are constantly adding new rows to the end of the > > table, and every week we truncate off the head of the journal to only > keep > > 3M rows at the "tail". Given that we're truncating the "head", without > > vacuuming we'd be inserting the new rows at the "front" of the database > > with the old rows at the "end" -- and then each truncation would leave > the > > database more and more fragmented. Granted, this is on SSDs so the > > fragmentation doesn't matter a *ton*, but it just adds up and gets worse > > over time. Anyway, agreed it's not the most important thing to do, but > all > > things being equal I'd like to do it if I can to keep things clean and > > snappy. > > Okay. I have some great news for you. You can completely ignore > VACUUMing without any time or space drawbacks. You're wasting your time > and using up the life of your SSD for no advantage. > > Fragmentation ceases to become a problem when you move from rotating disks > to SSD. SSD is a truly random access medium. It's no faster to access > block b then block b+1 than it is block b then block b+1000. Two > contiguous blocks used to be faster in rotating disks only because there is > a physical read/write head and it will already be in the right place. SSDs > have no read/write head. It's all solid state and accessing one block is > no faster than another. > > Delete old rows and you'll release space. Insert new rows and they'll > take up the space released. Don't worry about the internal 'neatness' of > the file. Over a long series of operations you might see an extra block > used from time to time. But it will be either zero or one extra block per > table/index. No more than that. A messy internal file structure might > niggle the OCD side of your nature but that's the only disadvantage. > > Also, SSD drives wear out fast. We don't have good figures yet for > mass-produced drives (manufacturers introduce new models faster than the > old ones wear out, so it's hard to gather stats) but typical figures show a > drive failing in from 2,000 to 3,000 write cycles of each single block. > Your drive does something called 'wear levelling' and it has a certain > number of blocks spare and will automatically swap them in when the first > blocks fail, but after that your drive is smoke. And VACUUM /thrashes/ a > drive, doing huge amounts of reading and writing as it rebuilds tables and > indexes. You don't want to do something like that on an SSD without a good > reason. > > So maybe once every few years, or perhaps if you have another more > complicated maintenance routine which already takes up lots of time, do a > VACUUM then. But it doesn't really matter if you never VACUUM. (Which is > entirely unlike my home, dammit.) > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
> Also, SSD drives wear out fast. We don't have good figures yet for > mass-produced drives (manufacturers introduce new models faster than the old > ones wear out, so it's hard to gather stats) but typical figures show a drive > failing in from 2,000 to 3,000 write cycles of each single block. Your drive > does something called 'wear levelling' and it has a certain number of blocks > spare and will automatically swap them in when the first blocks fail, but > after that your drive is smoke. And VACUUM /thrashes/ a drive, doing huge > amounts of reading and writing as it rebuilds tables and indexes. You don't > want to do something like that on an SSD without a good reason. The SSD endurance experiment suggests that you might not need to worry too much about it: http://techreport.com/review/27436/the-ssd-endurance-experiment-two-freaking-petabytes Hadley -- http://had.co.nz/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/08/2014 05:36 PM, David Barrett wrote: > *Re: "a simple way is to sleep in the progress callback"* -- Can > you tell me more about this? Are you referring to the callback > provided to sqlite3_exec(), or something else? https://sqlite.org/c3ref/progress_handler.html Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1 iEYEARECAAYFAlSHIZ4ACgkQmOOfHg372QResgCg1AXMQWpW0LnhKVc9k02TXRfN P0wAoLdmiexWvkkiZOojFb7BSwZXF07X =97eR -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
On 2014/12/09 03:36, David Barrett wrote: Hi all, great questions: *Re: Why VACUUM.* We vacuum weekly. This particular database is a "rolling journal" -- we are constantly adding new rows to the end of the table, and every week we truncate off the head of the journal to only keep 3M rows at the "tail". Given that we're truncating the "head", without vacuuming we'd be inserting the new rows at the "front" of the database with the old rows at the "end" -- and then each truncation would leave the database more and more fragmented. Granted, this is on SSDs so the fragmentation doesn't matter a *ton*, but it just adds up and gets worse over time. Anyway, agreed it's not the most important thing to do, but all things being equal I'd like to do it if I can to keep things clean and snappy. Hi David - this is a completely unneeded step. SQLite will maintain it's internal structure, it doesn't do 'rolling' data usage and it will do so using the minimum IO. SQLite knows about people deleting and inserting, it will re-use and not waste, so no need to try and accomplish that. Only use Vacuum maybe after some months or such when you are doing all your proper DB maintenance - otherwise the performance gain will be negligible (and if you do experience some or other significant performance gain after a vacuum in your use-case, please let us know because something might be broken then). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
On 9 Dec 2014, at 1:36am, David Barrettwrote: > *Re: Why VACUUM.* We vacuum weekly. This particular database is a > "rolling journal" -- we are constantly adding new rows to the end of the > table, and every week we truncate off the head of the journal to only keep > 3M rows at the "tail". Given that we're truncating the "head", without > vacuuming we'd be inserting the new rows at the "front" of the database > with the old rows at the "end" -- and then each truncation would leave the > database more and more fragmented. Granted, this is on SSDs so the > fragmentation doesn't matter a *ton*, but it just adds up and gets worse > over time. Anyway, agreed it's not the most important thing to do, but all > things being equal I'd like to do it if I can to keep things clean and > snappy. Okay. I have some great news for you. You can completely ignore VACUUMing without any time or space drawbacks. You're wasting your time and using up the life of your SSD for no advantage. Fragmentation ceases to become a problem when you move from rotating disks to SSD. SSD is a truly random access medium. It's no faster to access block b then block b+1 than it is block b then block b+1000. Two contiguous blocks used to be faster in rotating disks only because there is a physical read/write head and it will already be in the right place. SSDs have no read/write head. It's all solid state and accessing one block is no faster than another. Delete old rows and you'll release space. Insert new rows and they'll take up the space released. Don't worry about the internal 'neatness' of the file. Over a long series of operations you might see an extra block used from time to time. But it will be either zero or one extra block per table/index. No more than that. A messy internal file structure might niggle the OCD side of your nature but that's the only disadvantage. Also, SSD drives wear out fast. We don't have good figures yet for mass-produced drives (manufacturers introduce new models faster than the old ones wear out, so it's hard to gather stats) but typical figures show a drive failing in from 2,000 to 3,000 write cycles of each single block. Your drive does something called 'wear levelling' and it has a certain number of blocks spare and will automatically swap them in when the first blocks fail, but after that your drive is smoke. And VACUUM /thrashes/ a drive, doing huge amounts of reading and writing as it rebuilds tables and indexes. You don't want to do something like that on an SSD without a good reason. So maybe once every few years, or perhaps if you have another more complicated maintenance routine which already takes up lots of time, do a VACUUM then. But it doesn't really matter if you never VACUUM. (Which is entirely unlike my home, dammit.) Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
Hi all, great questions: *Re: Why VACUUM.* We vacuum weekly. This particular database is a "rolling journal" -- we are constantly adding new rows to the end of the table, and every week we truncate off the head of the journal to only keep 3M rows at the "tail". Given that we're truncating the "head", without vacuuming we'd be inserting the new rows at the "front" of the database with the old rows at the "end" -- and then each truncation would leave the database more and more fragmented. Granted, this is on SSDs so the fragmentation doesn't matter a *ton*, but it just adds up and gets worse over time. Anyway, agreed it's not the most important thing to do, but all things being equal I'd like to do it if I can to keep things clean and snappy. *Re: "a simple way is to sleep in the progress callback"* -- Can you tell me more about this? Are you referring to the callback provided to sqlite3_exec(), or something else? Thanks! -david On Mon, Dec 8, 2014 at 3:16 PM, Roger Binnswrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > On 12/08/2014 01:35 PM, Max Vlasov wrote: > > I wonder whether I/O "sleeping" possible in the first place. > > In this particular case the OP wants to vacuum while the machine is > doing other I/O activity unrelated to the vacuum. Having more > sleeping during the vacuum will allow the other I/O a greater share. > > Roger > > -BEGIN PGP SIGNATURE- > Version: GnuPG v1 > > iEYEARECAAYFAlSGMTgACgkQmOOfHg372QRxMACgz3qZHBGcUrOyf4DkFR5Km1a4 > jm4AoL49txXLfzPQefbjlnGg9UZ4GtcP > =9gAV > -END PGP SIGNATURE- > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
Wanting is not needing. If a highly I/O bound process interferes with the I/O performed by other (not I/O bound) processes, then the OS is broken and the proper solution is to get a better O/S. These sorts of problems were solved back in the 60's (okay, maybe 70's). Therefore, unless a Redmond Operating System is in use (for which the only repair is choosing a better OS), then the problem is most likely bad or ill-suited hardware choices (insufficient cache, no data phase disconnect, excessive queueing, etc) or OS configuration deliberately set to ill-suited values (such as to use a "user" scheduler rather than a "server" scheduler, or a lack of pre-emption). --- Theory is when you know everything but nothing works. Practice is when everything works but no one knows why. Sometimes theory and practice are combined: nothing works and no one knows why. >-Original Message- >From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- >boun...@sqlite.org] On Behalf Of Roger Binns >Sent: Monday, 8 December, 2014 16:16 >To: General Discussion of SQLite Database >Subject: Re: [sqlite] Artificially slow VACUUM by injecting a sleep() >somewhere? > >-BEGIN PGP SIGNED MESSAGE- >Hash: SHA1 > >On 12/08/2014 01:35 PM, Max Vlasov wrote: >> I wonder whether I/O "sleeping" possible in the first place. > >In this particular case the OP wants to vacuum while the machine is >doing other I/O activity unrelated to the vacuum. Having more >sleeping during the vacuum will allow the other I/O a greater share. > >Roger > >-BEGIN PGP SIGNATURE- >Version: GnuPG v1 > >iEYEARECAAYFAlSGMTgACgkQmOOfHg372QRxMACgz3qZHBGcUrOyf4DkFR5Km1a4 >jm4AoL49txXLfzPQefbjlnGg9UZ4GtcP >=9gAV >-END PGP SIGNATURE- >___ >sqlite-users mailing list >sqlite-users@sqlite.org >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/08/2014 01:35 PM, Max Vlasov wrote: > I wonder whether I/O "sleeping" possible in the first place. In this particular case the OP wants to vacuum while the machine is doing other I/O activity unrelated to the vacuum. Having more sleeping during the vacuum will allow the other I/O a greater share. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1 iEYEARECAAYFAlSGMTgACgkQmOOfHg372QRxMACgz3qZHBGcUrOyf4DkFR5Km1a4 jm4AoL49txXLfzPQefbjlnGg9UZ4GtcP =9gAV -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
On Mon, Dec 8, 2014 at 5:56 PM, Roger Binnswrote: > On 12/07/2014 04:43 PM, David Barrett wrote: >> so I'm curious if you can think of a way using the API (or any >> other way) to essentially "nice" the process by inserting a short >> "sleep" into whatever loop runs inside the VACUUM command. > > Using OS provided functionality will be the most reliable. Other than > that, a simple way is to sleep in the progress callback, although that > will make I/O lumpy. I wonder whether I/O "sleeping" possible in the first place. Correct me, but what we usually call "sleeping" is about CPU that already sleeps during most I/O operations waiting for rotating media finishing its slow tasks. As a consequence, the more fragmented the data on disk, the less relative cpu time will be spent trying to read and write data. In this case the strategy might be to measure cpu consumption value for consecutive blocks of data and if it's 100% (or other heuristically adequate value) then never sleep (this probably means either cached data or non-fragmented data on a fast disk). But when the cpu consumption drops significantly (much time spent waiting for I/O), the sleeping indeed might be needed. Max ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/07/2014 04:43 PM, David Barrett wrote: > so I'm curious if you can think of a way using the API (or any > other way) to essentially "nice" the process by inserting a short > "sleep" into whatever loop runs inside the VACUUM command. Using OS provided functionality will be the most reliable. Other than that, a simple way is to sleep in the progress callback, although that will make I/O lumpy. If you want finer grained control then you can copy the pointers for the default VFS into your own VFS, and override the read/write methods to rate limit themselves. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1 iEYEARECAAYFAlSFvBsACgkQmOOfHg372QRv9wCfYrybsVowHx6QTpbw/WjMoSZh AJIAoNc4HyP1pUU/AvTGkdjJeQm93I7Y =IKzd -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
On 8 Dec 2014, at 12:43am, David Barrettwrote: > Other alternatives we're considering are to fork and nice the process, or > to call ioprio_set() directly, but I'm curious if there's a simpler way to > do it. Thanks! VACUUM does the same job (in a very different way) as copying all the TABLEs, then creating the VIEWS, INDEXes and TRIGGERs on the new tables. These can all be done using SQL statements. Had you considered creating a VACUUMed copy yourself ? You could do one table/view/index/trigger at a time. And you could engineer a pause of a few seconds after every ten thousand rows are put in a table. But I'm wondering why you need to VACUUM often enough that anything it does is a problem. It can save filespace after deletion (before new data is put in to take up the released filespace), and it can increase speed, but the speed increase is small. It's not needed in normal use. It should be kept for a maintenance routine, perhaps once a month at most. If your users are putting in more data than they are deleting, VACUUM has no noticable effect and I know of SQLite databases which have been amended daily for years without ever once having been VACUUMed. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?
Hi! I have a large database, and running the VACUUM pragma hammers disk IO so badly that it dramatically affects performance on the box. I'm not in a hurry for the results, so I'm curious if you can think of a way using the API (or any other way) to essentially "nice" the process by inserting a short "sleep" into whatever loop runs inside the VACUUM command. (I had initially thought of putting a sleep() into the callback to sqlite3_exec() callback, but that's only called once for each result row -- I need something that's called consistently through the running of the query.) Other alternatives we're considering are to fork and nice the process, or to call ioprio_set() directly, but I'm curious if there's a simpler way to do it. Thanks! -david ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users