Re: [WebSQLDatabase] Adding a vacuum() call
shane, i was hoping you could clarify a few things about AUTO_VACUUM: However, the B-Tree balancing algorithm used by SQLite will attempt to merge pages with neighbors when there space utilization drops below certain thresholds. Minimum average fill for intkey leaves is 50%. For other pages I think it's 66%. according to http://www.sqlite.org/pragma.html#pragma_auto_vacuum: Auto-vacuum does not defragment the database nor repack individual database pages the way that the VACUUM command does. so the way i understand these statements is: 1. sqlite always attempts to merge mostly-empty pages. the (auto-)vacuum settings have no effect on that. 2. AUTO_VACUUM only moves pages around and deletes the empty ones. it does not try to repack individual pages. however, because of #1, page repacking happens anyway (to some extent), when AUTO_VACUUM is on. am i right? and one more question: Auto-vacuuming is only possible if the database stores some additional information that allows each database page to be traced backwards to its referer. Therefore, auto-vacuuming must be turned on before any tables are created. It is not possible to enable or disable auto-vacuum after a table has been created. what happens if we create a database without AUTO_VACUUM on, insert some data, save it to a file, then turn on AUTO_VACUUM and try to open that database again? will sqlite add the missing information? will AUTO_VACUUM be silently turned off for that database? will we get an error when we try to open/read from/write to that database? anything else that we need to be aware of in this case? thanks, dumi
Re: [WebSQLDatabase] Adding a vacuum() call
On Wed, 17 Mar 2010 01:50:00 +0100, Dumitru Daniliuc d...@chromium.org wrote: shane, i was hoping you could clarify a few things about AUTO_VACUUM: Perhaps this is a bit out of scope of this mailing list ?
Re: [WebSQLDatabase] Adding a vacuum() call
Some things to consider: (1) VACUUM is never necessary, though can sometimes reduce the size of the database file and provide a performance improvement if the database is large and fragmented. (2) Use PRAGMA freelist_count to estimate how much VACUUM will shrink a database. (3) VACUUM requires O(N) time where N is the size of the database file, and can require up to 2N temporary space while processing. Our recommendation is that the use of VACUUM be discouraged. The performance improvement only comes up with very large databases that are highly fragmented. Where space reclamation is an issue, AUTO_VACUUM would be more appropriate. -Shane On Thu, Mar 11, 2010 at 7:33 PM, Dimitri Glazkov dglaz...@chromium.orgwrote: I like the completion callback idea. Also like the notion of some sort of protection from over-eager vacuum-calling syndrome. :DG On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman micha...@google.com wrote: Instead of calling back on success only, maybe call back on completion regardless of success or failure. This way the caller would know when the potentially lengthy operation was done, regardless of the outcome. 2010/3/11 Dumitru Daniliuc d...@chromium.org joao, it looks like we mostly agree on this feature, so i was wondering if we could formally agree on a spec. here's what i propose: 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases implementations use SQLite, and in SQLite the command is called VACUUM.. so it seems to me that we might as well call the new function vacuum(). what do you think? 2. spec: no need for an error callback. interface Database { // the methods and properties currently in the spec void vacuum(in optional SQLVoidCallback successCallback); }; 3. what the call should do: the purpose of this call is to allow apps to vacuum/compact/defragment/clean up their databases whenever they see fit.. a call to vacuum() could take a considerable amount of time (especially on larger or highly fragmented databases); therefore, it is not recommended for web apps to call this method during periods of high activity. how to process a vacuum() call: if the UA does not support this call (mobile browsers?), jump to step 3. queue up a task to vacuum/compact/defragment/clean up the database. if the task succeeds, and a success callback is provided, queue up a task to invoke the success callback; in all other cases (the task failed, or no success callback was provided), do nothing: proceed to the next task in the queue. does this seem acceptable? we (google engineers interested in this) feel that UAs should either not implement the vacuum() call, or they should respect it (rather than taking it as a hint). it is ok for UAs to keep track of things like system idleness or databases closing to do more vacuuming that the apps asked for, if they want to. however, we feel that a vacuum() request by an app should not be postponed, simply because sometimes apps know better than UAs when the best time to vacuum is (it might be nice to give apps more information on how fragmented their databases are, but that's a separate discussion). thanks, dumi 2010/3/9 Jeremy Orlow jor...@google.com On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote: On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote: I don't see how the callbacks are useful though. Vacuum works transparently, its effects are not visible, and what should the page do in case of error ? i was thinking of something like: db.defragment(errorCallback, successCallback); showAPrettyImageAndAskTheUserToWait(); function errorCallback(error) {} function successCallback() { getRidOfThePrettyImageAndRestartTheApp(); } just like you, i'm not sure if the error callback is useful at all, but i thought i'd add it to make the defragment() call look more like a transaction. maybe we don't need it. True, but this is a kind of operation that could very well just run on the background, with a single optional callback when it's done (the webpage can't do anything if an error is detected anyway). ok, so let's drop the errorCallback: vacuum([optional] successCallback); The user agent would need to queue any subsequent transactions if a vacuum is running. I would consider it as an hint, and after all webpages that own references to the underlying data files are closed, would do a vacuum. So, if you have many tabs on gmail, and that a single gmail instance tries to do multiple vacuums, it would equiv to one single vacuum operation. what do we do if some databases are opened for the entire life of the browser? for example, i open my browser which has myfavoriteapp.comset as its homepage. myfavoriteapp.com immediately opens a DB, and i only close that app when i close the browser.
Re: [WebSQLDatabase] Adding a vacuum() call
On Fri, 12 Mar 2010 22:31:15 +0100, Dirk Pranke dpra...@chromium.org wrote: I admit to not being super familiar with the spec as it currently stands, but I find the idea that we would add something like this fairly unappealing. I'm not familiar with any other database API that asks the application programmer to some sort of GC as part of the application. I almost feel like if you're going to add this, you should drop any pretense of calling this a generic SQL interface, and just call it the WebSQLLite spec. It isn't a generic SQL spec, it is already a WebSQLite spec. But I'm still not sure what the rationale is for this functionality. In particular, because the sense I got from the TPAC meeting was that nobody except Apple is especially keen on this spec as a long-term solution. So working on such low-level functionality doesn't seem very worthwhile - although if people are seriously going to implement it then knowing that will at least allow ongoing interoperability for those who choose to implement it. cheers Chaals -- Dirk 2010/3/9 Jeremy Orlow jor...@google.com: On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote: On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote: I don't see how the callbacks are useful though. Vacuum works transparently, its effects are not visible, and what should the page do in case of error ? i was thinking of something like: db.defragment(errorCallback, successCallback); showAPrettyImageAndAskTheUserToWait(); function errorCallback(error) {} function successCallback() { getRidOfThePrettyImageAndRestartTheApp(); } just like you, i'm not sure if the error callback is useful at all, but i thought i'd add it to make the defragment() call look more like a transaction. maybe we don't need it. True, but this is a kind of operation that could very well just run on the background, with a single optional callback when it's done (the webpage can't do anything if an error is detected anyway). ok, so let's drop the errorCallback: vacuum([optional] successCallback); The user agent would need to queue any subsequent transactions if a vacuum is running. I would consider it as an hint, and after all webpages that own references to the underlying data files are closed, would do a vacuum. So, if you have many tabs on gmail, and that a single gmail instance tries to do multiple vacuums, it would equiv to one single vacuum operation. what do we do if some databases are opened for the entire life of the browser? for example, i open my browser which has myfavoriteapp.com set as its homepage. myfavoriteapp.com immediately opens a DB, and i only close that app when i close the browser. when would the browser vacuum myfavoriteapp's DBs in this case? i think it's ok for the UA to vacuum some DBs automatically when it thinks it's a good time to do so; however, if a platform supports the vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call coming from the app should be immediately scheduled (yes, the subsequent transactions would have to wait for the vacuuming to finish running). in some cases, the apps know better than the UA when to vacuum their DBs. by the way, we should probably agree on a name for this call. which one do you prefer? vacuum, defragment, defrag, something else? i don't have a strong opinion. I think vacuum is fine since the spec is already tied to the SQLite SQL dialect. collectGarbage() is another possibility Go with whatever you think is most clear and accurate though. J -- Charles McCathieNevile Opera Software, Standards Group je parle français -- hablo español -- jeg lærer norsk http://my.opera.com/chaals Try Opera: http://www.opera.com
Re: [WebSQLDatabase] Adding a vacuum() call
joao, it looks like we mostly agree on this feature, so i was wondering if we could formally agree on a spec. here's what i propose: 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases implementations use SQLite, and in SQLite the command is called VACUUM. so it seems to me that we might as well call the new function vacuum(). what do you think? 2. spec: no need for an error callback. interface Database { // the methods and properties currently in the spec void vacuum(in optional SQLVoidCallback successCallback); }; 3. what the call should do: the purpose of this call is to allow apps to vacuum/compact/defragment/clean up their databases whenever they see fit. a call to vacuum() could take a considerable amount of time (especially on larger or highly fragmented databases); therefore, it is not recommended for web apps to call this method during periods of high activity. how to process a vacuum() call: 1. if the UA does not support this call (mobile browsers?), jump to step 3. 2. queue up a task to vacuum/compact/defragment/clean up the database. 3. if the task succeeds, and a success callback is provided, queue up a task to invoke the success callback; in all other cases (the task failed, or no success callback was provided), do nothing: proceed to the next task in the queue. does this seem acceptable? we (google engineers interested in this) feel that UAs should either not implement the vacuum() call, or they should respect it (rather than taking it as a hint). it is ok for UAs to keep track of things like system idleness or databases closing to do more vacuuming that the apps asked for, if they want to. however, we feel that a vacuum() request by an app should not be postponed, simply because sometimes apps know better than UAs when the best time to vacuum is (it might be nice to give apps more information on how fragmented their databases are, but that's a separate discussion). thanks, dumi 2010/3/9 Jeremy Orlow jor...@google.com On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote: On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote: I don't see how the callbacks are useful though. Vacuum works transparently, its effects are not visible, and what should the page do in case of error ? i was thinking of something like: db.defragment(errorCallback, successCallback); showAPrettyImageAndAskTheUserToWait(); function errorCallback(error) {} function successCallback() { getRidOfThePrettyImageAndRestartTheApp(); } just like you, i'm not sure if the error callback is useful at all, but i thought i'd add it to make the defragment() call look more like a transaction. maybe we don't need it. True, but this is a kind of operation that could very well just run on the background, with a single optional callback when it's done (the webpage can't do anything if an error is detected anyway). ok, so let's drop the errorCallback: vacuum([optional] successCallback); The user agent would need to queue any subsequent transactions if a vacuum is running. I would consider it as an hint, and after all webpages that own references to the underlying data files are closed, would do a vacuum. So, if you have many tabs on gmail, and that a single gmail instance tries to do multiple vacuums, it would equiv to one single vacuum operation. what do we do if some databases are opened for the entire life of the browser? for example, i open my browser which has myfavoriteapp.com set as its homepage. myfavoriteapp.com immediately opens a DB, and i only close that app when i close the browser. when would the browser vacuum myfavoriteapp's DBs in this case? i think it's ok for the UA to vacuum some DBs automatically when it thinks it's a good time to do so; however, if a platform supports the vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call coming from the app should be immediately scheduled (yes, the subsequent transactions would have to wait for the vacuuming to finish running). in some cases, the apps know better than the UA when to vacuum their DBs. by the way, we should probably agree on a name for this call. which one do you prefer? vacuum, defragment, defrag, something else? i don't have a strong opinion. I think vacuum is fine since the spec is already tied to the SQLite SQL dialect. collectGarbage() is another possibility Go with whatever you think is most clear and accurate though. J
Re: [WebSQLDatabase] Adding a vacuum() call
On Fri, 12 Mar 2010 01:08:41 +0100, Dumitru Daniliuc d...@chromium.org wrote: joao, it looks like we mostly agree on this feature, so i was wondering if we could formally agree on a spec. here's what i propose: 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases implementations use SQLite, and in SQLite the command is called VACUUM. so it seems to me that we might as well call the new function vacuum(). what do you think? I'm fine with it but... 2. spec: no need for an error callback. interface Database { // the methods and properties currently in the spec void vacuum(in optional SQLVoidCallback successCallback); }; 3. what the call should do: the purpose of this call is to allow apps to vacuum/compact/defragment/clean up their databases whenever they see fit. a call to vacuum() could take a considerable amount of time (especially on larger or highly fragmented databases); therefore, it is not recommended for web apps to call this method during periods of high activity. how to process a vacuum() call: 1. if the UA does not support this call (mobile browsers?), jump to step 3. 2. queue up a task to vacuum/compact/defragment/clean up the database. 3. if the task succeeds, and a success callback is provided, queue up a task to invoke the success callback; in all other cases (the task failed, or no success callback was provided), do nothing: proceed to the next task in the queue. does this seem acceptable? we (google engineers interested in this) feel that UAs should either not implement the vacuum() call, or they should respect it (rather than taking it as a hint). it is ok for UAs to keep track of things like system idleness or databases closing to do more vacuuming that the apps asked for, if they want to. however, we feel that a vacuum() request by an app should not be postponed, simply because sometimes apps know better than UAs when the best time to vacuum is (it might be nice to give apps more information on how fragmented their databases are, but that's a separate discussion). I unfortunately don't agree with this last sentence :) Web pages do not have a way to know the level of fragmentation, and the last thing user wants is a web page calling vacuum() before each transaction, *just to be safe*, because of potential fragmentation. But the API is good I think. thanks, dumi
Re: [WebSQLDatabase] Adding a vacuum() call
Instead of calling back on success only, maybe call back on completion regardless of success or failure. This way the caller would know when the potentially lengthy operation was done, regardless of the outcome. 2010/3/11 Dumitru Daniliuc d...@chromium.org joao, it looks like we mostly agree on this feature, so i was wondering if we could formally agree on a spec. here's what i propose: 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases implementations use SQLite, and in SQLite the command is called VACUUM. so it seems to me that we might as well call the new function vacuum(). what do you think? 2. spec: no need for an error callback. interface Database { // the methods and properties currently in the spec void vacuum(in optional SQLVoidCallback successCallback); }; 3. what the call should do: the purpose of this call is to allow apps to vacuum/compact/defragment/clean up their databases whenever they see fit. a call to vacuum() could take a considerable amount of time (especially on larger or highly fragmented databases); therefore, it is not recommended for web apps to call this method during periods of high activity. how to process a vacuum() call: 1. if the UA does not support this call (mobile browsers?), jump to step 3. 2. queue up a task to vacuum/compact/defragment/clean up the database. 3. if the task succeeds, and a success callback is provided, queue up a task to invoke the success callback; in all other cases (the task failed, or no success callback was provided), do nothing: proceed to the next task in the queue. does this seem acceptable? we (google engineers interested in this) feel that UAs should either not implement the vacuum() call, or they should respect it (rather than taking it as a hint). it is ok for UAs to keep track of things like system idleness or databases closing to do more vacuuming that the apps asked for, if they want to. however, we feel that a vacuum() request by an app should not be postponed, simply because sometimes apps know better than UAs when the best time to vacuum is (it might be nice to give apps more information on how fragmented their databases are, but that's a separate discussion). thanks, dumi 2010/3/9 Jeremy Orlow jor...@google.com On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote: On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote: I don't see how the callbacks are useful though. Vacuum works transparently, its effects are not visible, and what should the page do in case of error ? i was thinking of something like: db.defragment(errorCallback, successCallback); showAPrettyImageAndAskTheUserToWait(); function errorCallback(error) {} function successCallback() { getRidOfThePrettyImageAndRestartTheApp(); } just like you, i'm not sure if the error callback is useful at all, but i thought i'd add it to make the defragment() call look more like a transaction. maybe we don't need it. True, but this is a kind of operation that could very well just run on the background, with a single optional callback when it's done (the webpage can't do anything if an error is detected anyway). ok, so let's drop the errorCallback: vacuum([optional] successCallback); The user agent would need to queue any subsequent transactions if a vacuum is running. I would consider it as an hint, and after all webpages that own references to the underlying data files are closed, would do a vacuum. So, if you have many tabs on gmail, and that a single gmail instance tries to do multiple vacuums, it would equiv to one single vacuum operation. what do we do if some databases are opened for the entire life of the browser? for example, i open my browser which has myfavoriteapp.com set as its homepage. myfavoriteapp.com immediately opens a DB, and i only close that app when i close the browser. when would the browser vacuum myfavoriteapp's DBs in this case? i think it's ok for the UA to vacuum some DBs automatically when it thinks it's a good time to do so; however, if a platform supports the vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call coming from the app should be immediately scheduled (yes, the subsequent transactions would have to wait for the vacuuming to finish running). in some cases, the apps know better than the UA when to vacuum their DBs. by the way, we should probably agree on a name for this call. which one do you prefer? vacuum, defragment, defrag, something else? i don't have a strong opinion. I think vacuum is fine since the spec is already tied to the SQLite SQL dialect. collectGarbage() is another possibility Go with whatever you think is most clear and accurate though. J
Re: [WebSQLDatabase] Adding a vacuum() call
sounds good to me: interface Database { // the methods and properties currently in the spec void vacuum(in optional SQLVoidCallback completionCallback); }; ... upon completion, queue up a task to invoke the completionCallback, if one was provided. dumi On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman micha...@google.comwrote: Instead of calling back on success only, maybe call back on completion regardless of success or failure. This way the caller would know when the potentially lengthy operation was done, regardless of the outcome. 2010/3/11 Dumitru Daniliuc d...@chromium.org joao, it looks like we mostly agree on this feature, so i was wondering if we could formally agree on a spec. here's what i propose: 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases implementations use SQLite, and in SQLite the command is called VACUUM. so it seems to me that we might as well call the new function vacuum(). what do you think? 2. spec: no need for an error callback. interface Database { // the methods and properties currently in the spec void vacuum(in optional SQLVoidCallback successCallback); }; 3. what the call should do: the purpose of this call is to allow apps to vacuum/compact/defragment/clean up their databases whenever they see fit. a call to vacuum() could take a considerable amount of time (especially on larger or highly fragmented databases); therefore, it is not recommended for web apps to call this method during periods of high activity. how to process a vacuum() call: 1. if the UA does not support this call (mobile browsers?), jump to step 3. 2. queue up a task to vacuum/compact/defragment/clean up the database. 3. if the task succeeds, and a success callback is provided, queue up a task to invoke the success callback; in all other cases (the task failed, or no success callback was provided), do nothing: proceed to the next task in the queue. does this seem acceptable? we (google engineers interested in this) feel that UAs should either not implement the vacuum() call, or they should respect it (rather than taking it as a hint). it is ok for UAs to keep track of things like system idleness or databases closing to do more vacuuming that the apps asked for, if they want to. however, we feel that a vacuum() request by an app should not be postponed, simply because sometimes apps know better than UAs when the best time to vacuum is (it might be nice to give apps more information on how fragmented their databases are, but that's a separate discussion). thanks, dumi 2010/3/9 Jeremy Orlow jor...@google.com On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote: On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote: I don't see how the callbacks are useful though. Vacuum works transparently, its effects are not visible, and what should the page do in case of error ? i was thinking of something like: db.defragment(errorCallback, successCallback); showAPrettyImageAndAskTheUserToWait(); function errorCallback(error) {} function successCallback() { getRidOfThePrettyImageAndRestartTheApp(); } just like you, i'm not sure if the error callback is useful at all, but i thought i'd add it to make the defragment() call look more like a transaction. maybe we don't need it. True, but this is a kind of operation that could very well just run on the background, with a single optional callback when it's done (the webpage can't do anything if an error is detected anyway). ok, so let's drop the errorCallback: vacuum([optional] successCallback); The user agent would need to queue any subsequent transactions if a vacuum is running. I would consider it as an hint, and after all webpages that own references to the underlying data files are closed, would do a vacuum. So, if you have many tabs on gmail, and that a single gmail instance tries to do multiple vacuums, it would equiv to one single vacuum operation. what do we do if some databases are opened for the entire life of the browser? for example, i open my browser which has myfavoriteapp.com set as its homepage. myfavoriteapp.com immediately opens a DB, and i only close that app when i close the browser. when would the browser vacuum myfavoriteapp's DBs in this case? i think it's ok for the UA to vacuum some DBs automatically when it thinks it's a good time to do so; however, if a platform supports the vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call coming from the app should be immediately scheduled (yes, the subsequent transactions would have to wait for the vacuuming to finish running). in some cases, the apps know better than the UA when to vacuum their DBs. by the way, we should probably agree on a name for this call. which one do you prefer? vacuum, defragment, defrag, something else? i don't have a strong opinion. I
Re: [WebSQLDatabase] Adding a vacuum() call
joao, if i understand correctly, you basically want to have an automated system implemented in the browser that decides when to vacuum databases (or at least make sure it doesn't happen too often). and the vacuum() calls would be just one of the parameters that the system takes into account. i think having such a system is fine, but doesn't need to be spec'ed. at the same time, i think it's worth having a spec'ed vacuum() call that gives the app some guarantees. in particular, i would like the completionCallback to be a signal that the database is in a good shape and ready for more work. so how about this: 1. if the UA doesn't support the vacuum() call, go to step 3. 2. queue up a task to vacuum the database. if the UA decides that vacuuming is not needed at this time, the vacuuming task scheduled in this step could be a no-op. 3. when the vacuuming task completes, queue up a task to invoke the completion callback, if one was specified. i think this spec should allow you to treat the vacuum() call as a hint, while also guaranteeing that: 1. the UA believes the database is in a good shape when the completion callback is called. 2. the completion callback is called as soon as UA believes the DB is in a good shape. what do you think? thanks, dumi On Thu, Mar 11, 2010 at 4:13 PM, João Eiras jo...@opera.com wrote: On Fri, 12 Mar 2010 01:08:41 +0100, Dumitru Daniliuc d...@chromium.org wrote: joao, it looks like we mostly agree on this feature, so i was wondering if we could formally agree on a spec. here's what i propose: 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases implementations use SQLite, and in SQLite the command is called VACUUM. so it seems to me that we might as well call the new function vacuum(). what do you think? I'm fine with it but... 2. spec: no need for an error callback. interface Database { // the methods and properties currently in the spec void vacuum(in optional SQLVoidCallback successCallback); }; 3. what the call should do: the purpose of this call is to allow apps to vacuum/compact/defragment/clean up their databases whenever they see fit. a call to vacuum() could take a considerable amount of time (especially on larger or highly fragmented databases); therefore, it is not recommended for web apps to call this method during periods of high activity. how to process a vacuum() call: 1. if the UA does not support this call (mobile browsers?), jump to step 3. 2. queue up a task to vacuum/compact/defragment/clean up the database. 3. if the task succeeds, and a success callback is provided, queue up a task to invoke the success callback; in all other cases (the task failed, or no success callback was provided), do nothing: proceed to the next task in the queue. does this seem acceptable? we (google engineers interested in this) feel that UAs should either not implement the vacuum() call, or they should respect it (rather than taking it as a hint). it is ok for UAs to keep track of things like system idleness or databases closing to do more vacuuming that the apps asked for, if they want to. however, we feel that a vacuum() request by an app should not be postponed, simply because sometimes apps know better than UAs when the best time to vacuum is (it might be nice to give apps more information on how fragmented their databases are, but that's a separate discussion). I unfortunately don't agree with this last sentence :) Web pages do not have a way to know the level of fragmentation, and the last thing user wants is a web page calling vacuum() before each transaction, *just to be safe*, because of potential fragmentation. But the API is good I think. thanks, dumi
Re: [WebSQLDatabase] Adding a vacuum() call
On Fri, 12 Mar 2010 04:07:21 +0100, Dumitru Daniliuc d...@chromium.org wrote: joao, if i understand correctly, you basically want to have an automated system implemented in the browser that decides when to vacuum databases (or at least make sure it doesn't happen too often). and the vacuum() calls would be just one of the parameters that the system takes into account. i think having such a system is fine, but doesn't need to be spec'ed. at the same time, i think it's worth having a spec'ed vacuum() call that gives the app some guarantees. in particular, i would like the completionCallback to be a signal that the database is in a good shape and ready for more work. so how about this: 1. if the UA doesn't support the vacuum() call, go to step 3. 2. queue up a task to vacuum the database. if the UA decides that vacuuming is not needed at this time, the vacuuming task scheduled in this step could be a no-op. 3. when the vacuuming task completes, queue up a task to invoke the completion callback, if one was specified. i think this spec should allow you to treat the vacuum() call as a hint, while also guaranteeing that: 1. the UA believes the database is in a good shape when the completion callback is called. 2. the completion callback is called as soon as UA believes the DB is in a good shape. what do you think? Looks better, and more flexible. I wonder if sqlite has a way to query the number of pages in the freelist ? Probably something like 10% of pages in the freelist would be a good threshold to allow a vacuum. thanks, dumi
Re: [WebSQLDatabase] Adding a vacuum() call
Looks better, and more flexible. I wonder if sqlite has a way to query the number of pages in the freelist ? Probably something like 10% of pages in the freelist would be a good threshold to allow a vacuum. Oh and btw, read only transactions should disallow vacuum, I think/hope :)
Re: [WebSQLDatabase] Adding a vacuum() call
On Thu, Mar 11, 2010 at 7:17 PM, João Eiras jo...@opera.com wrote: On Fri, 12 Mar 2010 04:07:21 +0100, Dumitru Daniliuc d...@chromium.org wrote: joao, if i understand correctly, you basically want to have an automated system implemented in the browser that decides when to vacuum databases (or at least make sure it doesn't happen too often). and the vacuum() calls would be just one of the parameters that the system takes into account. i think having such a system is fine, but doesn't need to be spec'ed. at the same time, i think it's worth having a spec'ed vacuum() call that gives the app some guarantees. in particular, i would like the completionCallback to be a signal that the database is in a good shape and ready for more work. so how about this: 1. if the UA doesn't support the vacuum() call, go to step 3. 2. queue up a task to vacuum the database. if the UA decides that vacuuming is not needed at this time, the vacuuming task scheduled in this step could be a no-op. 3. when the vacuuming task completes, queue up a task to invoke the completion callback, if one was specified. i think this spec should allow you to treat the vacuum() call as a hint, while also guaranteeing that: 1. the UA believes the database is in a good shape when the completion callback is called. 2. the completion callback is called as soon as UA believes the DB is in a good shape. what do you think? Looks better, and more flexible. thanks. is there anything else you'd like to change? i'm pushing for this so hard because i'd like to implement this feature, but i don't want to start before all interested parties agree on a spec. I wonder if sqlite has a way to query the number of pages in the freelist ? Probably something like 10% of pages in the freelist would be a good threshold to allow a vacuum. you can get the number of free pages with PRAGMA freelist_count ( http://www.sqlite.org/pragma.html). unfortunately, looking only at this number would result in a lot of false negatives. for example, in theory, you could have a DB with 1000 2KB pages, and each page could have only one useful byte written on it. so PRAGMA freelist_count would return 0, even though doing a vacuuming would shrink your DB from ~2MB to ~1KB. but i agree that it could be one of the parameters to take into account.
Re: [WebSQLDatabase] Adding a vacuum() call
On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote: On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote: I don't see how the callbacks are useful though. Vacuum works transparently, its effects are not visible, and what should the page do in case of error ? i was thinking of something like: db.defragment(errorCallback, successCallback); showAPrettyImageAndAskTheUserToWait(); function errorCallback(error) {} function successCallback() { getRidOfThePrettyImageAndRestartTheApp(); } just like you, i'm not sure if the error callback is useful at all, but i thought i'd add it to make the defragment() call look more like a transaction. maybe we don't need it. True, but this is a kind of operation that could very well just run on the background, with a single optional callback when it's done (the webpage can't do anything if an error is detected anyway). ok, so let's drop the errorCallback: vacuum([optional] successCallback); The user agent would need to queue any subsequent transactions if a vacuum is running. I would consider it as an hint, and after all webpages that own references to the underlying data files are closed, would do a vacuum. So, if you have many tabs on gmail, and that a single gmail instance tries to do multiple vacuums, it would equiv to one single vacuum operation. what do we do if some databases are opened for the entire life of the browser? for example, i open my browser which has myfavoriteapp.com set as its homepage. myfavoriteapp.com immediately opens a DB, and i only close that app when i close the browser. when would the browser vacuum myfavoriteapp's DBs in this case? i think it's ok for the UA to vacuum some DBs automatically when it thinks it's a good time to do so; however, if a platform supports the vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call coming from the app should be immediately scheduled (yes, the subsequent transactions would have to wait for the vacuuming to finish running). in some cases, the apps know better than the UA when to vacuum their DBs. by the way, we should probably agree on a name for this call. which one do you prefer? vacuum, defragment, defrag, something else? i don't have a strong opinion. I think vacuum is fine since the spec is already tied to the SQLite SQL dialect. collectGarbage() is another possibility Go with whatever you think is most clear and accurate though. J
Re: [WebSQLDatabase] Adding a vacuum() call
On Fri, Mar 5, 2010 at 2:32 AM, João Eiras jo...@opera.com wrote: On Fri, 05 Mar 2010 03:22:00 +0100, Dumitru Daniliuc d...@chromium.org wrote: Hi, We (Chromium) would like to add a vacuum() call on the Database object. [...] I would argue about having something a bit more generic for naming like defragment(). I don't see how the callbacks are useful though. Vacuum works transparently, its effects are not visible, and what should the page do in case of error ? Given that an operation like vacuum would be disk IO intensive, would it be good to give the webpage control over a feature that can have significant performance concerns ? I'm sure a UA could always ignore it and/or vacuum whenever it wants...but the drawback is that if you start vacuuming and then the app tries to open the database, you might get some serious jank in the web app. While computers benefit from good file IO performance, that's not quite true in many mobile devices. So, the API would be more like an hint ? How can the webpage know the level of fragmentation of the data file ? Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so this kind of feature should be left for sqlite to handle for itself. From that page: Note, however, that auto-vacuum only truncates the freelist pages from the file. Auto-vacuum does not defragment the database nor repack individual database pages the way that the VACUUM lang_vacuum.html command does. In fact, because it moves pages around within the file, auto-vacuum can actually make fragmentation worse.
Re: [WebSQLDatabase] Adding a vacuum() call
hi joao, thanks for your comments! I would argue about having something a bit more generic for naming like defragment(). that's totally fine with me. I don't see how the callbacks are useful though. Vacuum works transparently, its effects are not visible, and what should the page do in case of error ? i was thinking of something like: db.defragment(errorCallback, successCallback); showAPrettyImageAndAskTheUserToWait(); function errorCallback(error) {} function successCallback() { getRidOfThePrettyImageAndRestartTheApp(); } just like you, i'm not sure if the error callback is useful at all, but i thought i'd add it to make the defragment() call look more like a transaction. maybe we don't need it. Given that an operation like vacuum would be disk IO intensive, would it be good to give the webpage control over a feature that can have significant performance concerns ? first, i think it's better to give big, always-on apps access to a potentially expensive clean-up call, than not give them any way at all to defragment their databases. second, the app which called defragment() is the only one that should notice the performance overhead (in practice, other web apps might notice it too because of slower IO, but this is no different from having an external, non-web app doing some intensive IO work and slowing down all web apps). and third, most apps will probably never need to use this call; it is mostly intended for apps with big databases that are always open on a user's PC. those apps would usually know when they can call defragment() without slowing down the user (for example, they could do it when the app is idle). While computers benefit from good file IO performance, that's not quite true in many mobile devices. So, the API would be more like an hint ? How can the webpage know the level of fragmentation of the data file ? on mobile devices i'm imagining the quotas and DBs are pretty small, in which case defragmenting databases probably wouldn't save much space anyway. so maybe those platforms should decide that they never need to do any kind of maintenance on their DBs and make defragment() a no-op? i don't think a webapp can figure out the level of fragmentation (maybe we need an API to expose some quota/DB size information to the webapp?). however, it could use other criteria to decide when to call defragment(): once every couple of weeks, when the system is idle, etc. Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so this kind of feature should be left for sqlite to handle for itself. in addition to what jeremy said, having auto-vacuum on is going to add some performance overhead to every transaction. this is not acceptable to some apps. thanks, dumi