Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-16 Thread Dumitru Daniliuc
shane, i was hoping you could clarify a few things about AUTO_VACUUM:


 However, the B-Tree balancing algorithm used by SQLite will attempt to
 merge pages with neighbors when there space utilization drops below certain
 thresholds. Minimum average fill for intkey leaves is 50%. For other pages I
 think it's 66%.


according to http://www.sqlite.org/pragma.html#pragma_auto_vacuum:
Auto-vacuum does not defragment the database nor repack individual database
pages the way that the VACUUM command does.

so the way i understand these statements is:
1. sqlite always attempts to merge mostly-empty pages. the (auto-)vacuum
settings have no effect on that.
2. AUTO_VACUUM only moves pages around and deletes the empty ones. it does
not try to repack individual pages. however, because of #1, page repacking
happens anyway (to some extent), when AUTO_VACUUM is on.

am i right?

and one more question:

Auto-vacuuming is only possible if the database stores some additional
information that allows each database page to be traced backwards to its
referer. Therefore, auto-vacuuming must be turned on before any tables are
created. It is not possible to enable or disable auto-vacuum after a table
has been created.

what happens if we create a database without AUTO_VACUUM on, insert some
data, save it to a file, then turn on AUTO_VACUUM and try to open that
database again? will sqlite add the missing information? will AUTO_VACUUM be
silently turned off for that database? will we get an error when we try to
open/read from/write to that database? anything else that we need to be
aware of in this case?

thanks,
dumi


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-16 Thread João Eiras
On Wed, 17 Mar 2010 01:50:00 +0100, Dumitru Daniliuc d...@chromium.org  
wrote:



shane, i was hoping you could clarify a few things about AUTO_VACUUM:



Perhaps this is a bit out of scope of this mailing list ?




Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Shane Harrelson
Some things to consider:

(1) VACUUM is never necessary, though can sometimes reduce the size of the
database file and provide a performance improvement if the database is large
and fragmented.

(2) Use PRAGMA freelist_count to estimate how much VACUUM will shrink a
database.

(3) VACUUM requires O(N) time where N is the size of the database file, and
can require up to 2N temporary space while processing.

Our recommendation is that the use of VACUUM be discouraged.   The
performance improvement only comes up with very large databases that are
highly fragmented.  Where space reclamation is an issue, AUTO_VACUUM would
be more appropriate.

-Shane


On Thu, Mar 11, 2010 at 7:33 PM, Dimitri Glazkov dglaz...@chromium.orgwrote:

 I like the completion callback idea.

 Also like the notion of some sort of protection from over-eager
 vacuum-calling syndrome.

 :DG

 On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman micha...@google.com
 wrote:
  Instead of calling back on success only, maybe call back on completion
  regardless of success or failure. This way the caller would know when the
  potentially lengthy operation was done, regardless of the outcome.
 
  2010/3/11 Dumitru Daniliuc d...@chromium.org
 
  joao,
 
  it looks like we mostly agree on this feature, so i was wondering if we
  could formally agree on a spec. here's what i propose:
 
  1. name: vacuum. to the best of my knowledge, all current
 WebSQLDatabases
  implementations use SQLite, and in SQLite the command is called VACUUM..
 so
  it seems to me that we might as well call the new function vacuum().
 what do
  you think?
 
  2. spec: no need for an error callback.
 
  interface Database {
// the methods and properties currently in the spec
void vacuum(in optional SQLVoidCallback successCallback);
  };
 
  3. what the call should do: the purpose of this call is to allow apps to
  vacuum/compact/defragment/clean up their databases whenever they see
 fit.. a
  call to vacuum() could take a considerable amount of time (especially on
  larger or highly fragmented databases); therefore, it is not recommended
 for
  web apps to call this method during periods of high activity.
 
  how to process a vacuum() call:
 
  if the UA does not support this call (mobile browsers?), jump to step 3.
  queue up a task to vacuum/compact/defragment/clean up the database.
  if the task succeeds, and a success callback is provided, queue up a
 task
  to invoke the success callback; in all other cases (the task failed, or
 no
  success callback was provided), do nothing: proceed to the next task in
 the
  queue.
 
  does this seem acceptable? we (google engineers interested in this) feel
  that UAs should either not implement the vacuum() call, or they should
  respect it (rather than taking it as a hint). it is ok for UAs to keep
 track
  of things like system idleness or databases closing to do more vacuuming
  that the apps asked for, if they want to. however, we feel that a
 vacuum()
  request by an app should not be postponed, simply because sometimes apps
  know better than UAs when the best time to vacuum is (it might be nice
 to
  give apps more information on how fragmented their databases are, but
 that's
  a separate discussion).
 
  thanks,
  dumi
 
 
  2010/3/9 Jeremy Orlow jor...@google.com
 
  On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com
 wrote:
 
 
  On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote:
 
  I don't see how the callbacks are useful though. Vacuum works
  transparently, its effects are not visible, and what should the
 page
  do in
  case of error ?
 
 
  i was thinking of something like:
 
  db.defragment(errorCallback, successCallback);
  showAPrettyImageAndAskTheUserToWait();
 
  function errorCallback(error) {}
  function successCallback() {
   getRidOfThePrettyImageAndRestartTheApp();
  }
 
  just like you, i'm not sure if the error callback is useful at all,
  but i
  thought i'd add it to make the defragment() call look more like a
  transaction. maybe we don't need it.
 
 
  True, but this is a kind of operation that could very well just run
 on
  the background, with a single optional callback when it's done (the
 webpage
  can't do anything if an error is detected anyway).
 
  ok, so let's drop the errorCallback: vacuum([optional]
 successCallback);
 
 
  The user agent would need to queue any subsequent transactions if a
  vacuum is running. I would consider it as an hint, and after all
 webpages
  that own references to the underlying data files are closed, would do
 a
  vacuum. So, if you have many tabs on gmail, and that a single gmail
 instance
  tries to do multiple vacuums, it would equiv to one single vacuum
 operation.
 
  what do we do if some databases are opened for the entire life of the
  browser? for example, i open my browser which has myfavoriteapp.comset as
  its homepage. myfavoriteapp.com immediately opens a DB, and i only
 close
  that app when i close the browser. 

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Charles McCathieNevile
On Fri, 12 Mar 2010 22:31:15 +0100, Dirk Pranke dpra...@chromium.org  
wrote:



I admit to not being super familiar with the spec as it currently
stands, but I find the idea that we would add something like this
fairly unappealing. I'm not familiar with any other database API
that asks the application programmer to some sort of GC as part of
the application. I almost feel like if you're going to add this,
you should drop any pretense of calling this a generic SQL
interface, and just call it the WebSQLLite spec.


It isn't a generic SQL spec, it is already a WebSQLite spec.

But I'm still not sure what the rationale is for this functionality. In  
particular, because the sense I got from the TPAC meeting was that nobody  
except Apple is especially keen on this spec as a long-term solution. So  
working on such low-level functionality doesn't seem very worthwhile -  
although if people are seriously going to implement it then knowing that  
will at least allow ongoing interoperability for those who choose to  
implement it.


cheers

Chaals


-- Dirk

2010/3/9 Jeremy Orlow jor...@google.com:
On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com  
wrote:



On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote:



I don't see how the callbacks are useful though. Vacuum works
transparently, its effects are not visible, and what should the  
page do

in
case of error ?



i was thinking of something like:

db.defragment(errorCallback, successCallback);
showAPrettyImageAndAskTheUserToWait();

function errorCallback(error) {}
function successCallback() {
 getRidOfThePrettyImageAndRestartTheApp();
}

just like you, i'm not sure if the error callback is useful at all,  
but

i
thought i'd add it to make the defragment() call look more like a
transaction. maybe we don't need it.



True, but this is a kind of operation that could very well just run on
the background, with a single optional callback when it's done (the  
webpage

can't do anything if an error is detected anyway).


ok, so let's drop the errorCallback: vacuum([optional]  
successCallback);




The user agent would need to queue any subsequent transactions if a
vacuum is running. I would consider it as an hint, and after all  
webpages
that own references to the underlying data files are closed, would do  
a
vacuum. So, if you have many tabs on gmail, and that a single gmail  
instance
tries to do multiple vacuums, it would equiv to one single vacuum  
operation.


what do we do if some databases are opened for the entire life of the
browser? for example, i open my browser which has myfavoriteapp.com  
set as
its homepage. myfavoriteapp.com immediately opens a DB, and i only  
close

that app when i close the browser. when would the browser vacuum
myfavoriteapp's DBs in this case?

i think it's ok for the UA to vacuum some DBs automatically when it  
thinks
it's a good time to do so; however, if a platform supports the  
vacuum/defrag
call (i.e. if it doesn't treat it is a no-op), then i think a vacuum  
call
coming from the app should be immediately scheduled (yes, the  
subsequent
transactions would have to wait for the vacuuming to finish running).  
in

some cases, the apps know better than the UA when to vacuum their DBs.

by the way, we should probably agree on a name for this call. which  
one do

you prefer? vacuum, defragment, defrag, something else? i don't have a
strong opinion.


I think vacuum is fine since the spec is already tied to the SQLite SQL
dialect.
collectGarbage() is another possibility
Go with whatever you think is most clear and accurate though.
J





--
Charles McCathieNevile  Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals   Try Opera: http://www.opera.com



Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Dumitru Daniliuc
joao,

it looks like we mostly agree on this feature, so i was wondering if we
could formally agree on a spec. here's what i propose:

1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
implementations use SQLite, and in SQLite the command is called VACUUM. so
it seems to me that we might as well call the new function vacuum(). what do
you think?

2. spec: no need for an error callback.

interface Database {
  // the methods and properties currently in the spec
  void vacuum(in optional SQLVoidCallback successCallback);
};

3. what the call should do: the purpose of this call is to allow apps to
vacuum/compact/defragment/clean up their databases whenever they see fit. a
call to vacuum() could take a considerable amount of time (especially on
larger or highly fragmented databases); therefore, it is not recommended for
web apps to call this method during periods of high activity.

how to process a vacuum() call:

   1. if the UA does not support this call (mobile browsers?), jump to step
   3.
   2. queue up a task to vacuum/compact/defragment/clean up the database.
   3. if the task succeeds, and a success callback is provided, queue up a
   task to invoke the success callback; in all other cases (the task failed, or
   no success callback was provided), do nothing: proceed to the next task in
   the queue.

does this seem acceptable? we (google engineers interested in this) feel
that UAs should either not implement the vacuum() call, or they should
respect it (rather than taking it as a hint). it is ok for UAs to keep track
of things like system idleness or databases closing to do more vacuuming
that the apps asked for, if they want to. however, we feel that a vacuum()
request by an app should not be postponed, simply because sometimes apps
know better than UAs when the best time to vacuum is (it might be nice to
give apps more information on how fragmented their databases are, but that's
a separate discussion).

thanks,
dumi


2010/3/9 Jeremy Orlow jor...@google.com

 On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote:



 On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote:


  I don't see how the callbacks are useful though. Vacuum works
 transparently, its effects are not visible, and what should the page do
 in
 case of error ?


 i was thinking of something like:

 db.defragment(errorCallback, successCallback);
 showAPrettyImageAndAskTheUserToWait();

 function errorCallback(error) {}
 function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
 }

 just like you, i'm not sure if the error callback is useful at all, but
 i
 thought i'd add it to make the defragment() call look more like a
 transaction. maybe we don't need it.


 True, but this is a kind of operation that could very well just run on
 the background, with a single optional callback when it's done (the webpage
 can't do anything if an error is detected anyway).


 ok, so let's drop the errorCallback: vacuum([optional] successCallback);


 The user agent would need to queue any subsequent transactions if a
 vacuum is running. I would consider it as an hint, and after all webpages
 that own references to the underlying data files are closed, would do a
 vacuum. So, if you have many tabs on gmail, and that a single gmail instance
 tries to do multiple vacuums, it would equiv to one single vacuum operation.


 what do we do if some databases are opened for the entire life of the
 browser? for example, i open my browser which has myfavoriteapp.com set
 as its homepage. myfavoriteapp.com immediately opens a DB, and i only
 close that app when i close the browser. when would the browser vacuum
 myfavoriteapp's DBs in this case?

 i think it's ok for the UA to vacuum some DBs automatically when it thinks
 it's a good time to do so; however, if a platform supports the vacuum/defrag
 call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call
 coming from the app should be immediately scheduled (yes, the subsequent
 transactions would have to wait for the vacuuming to finish running). in
 some cases, the apps know better than the UA when to vacuum their DBs.

 by the way, we should probably agree on a name for this call. which one do
 you prefer? vacuum, defragment, defrag, something else? i don't have a
 strong opinion.


 I think vacuum is fine since the spec is already tied to the SQLite SQL
 dialect.

 collectGarbage() is another possibility

 Go with whatever you think is most clear and accurate though.

 J



Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread João Eiras
On Fri, 12 Mar 2010 01:08:41 +0100, Dumitru Daniliuc d...@chromium.org  
wrote:



joao,

it looks like we mostly agree on this feature, so i was wondering if we
could formally agree on a spec. here's what i propose:

1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
implementations use SQLite, and in SQLite the command is called VACUUM.  
so
it seems to me that we might as well call the new function vacuum().  
what do

you think?


I'm fine with it but...



2. spec: no need for an error callback.

interface Database {
  // the methods and properties currently in the spec
  void vacuum(in optional SQLVoidCallback successCallback);
};

3. what the call should do: the purpose of this call is to allow apps to
vacuum/compact/defragment/clean up their databases whenever they see  
fit. a

call to vacuum() could take a considerable amount of time (especially on
larger or highly fragmented databases); therefore, it is not recommended  
for

web apps to call this method during periods of high activity.

how to process a vacuum() call:

   1. if the UA does not support this call (mobile browsers?), jump to  
step

   3.
   2. queue up a task to vacuum/compact/defragment/clean up the database.
   3. if the task succeeds, and a success callback is provided, queue up  
a
   task to invoke the success callback; in all other cases (the task  
failed, or
   no success callback was provided), do nothing: proceed to the next  
task in

   the queue.

does this seem acceptable? we (google engineers interested in this) feel
that UAs should either not implement the vacuum() call, or they should
respect it (rather than taking it as a hint). it is ok for UAs to keep  
track

of things like system idleness or databases closing to do more vacuuming
that the apps asked for, if they want to. however, we feel that a  
vacuum()

request by an app should not be postponed, simply because sometimes apps
know better than UAs when the best time to vacuum is (it might be nice to
give apps more information on how fragmented their databases are, but  
that's

a separate discussion).



I unfortunately don't agree with this last sentence :) Web pages do not  
have a way to know the level of fragmentation, and the last thing user  
wants is a web page calling vacuum() before each transaction, *just to be  
safe*, because of potential fragmentation.


But the API is good I think.


thanks,
dumi






Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Michael Nordman
Instead of calling back on success only, maybe call back on completion
regardless of success or failure. This way the caller would know when the
potentially lengthy operation was done, regardless of the outcome.

2010/3/11 Dumitru Daniliuc d...@chromium.org

 joao,

 it looks like we mostly agree on this feature, so i was wondering if we
 could formally agree on a spec. here's what i propose:

 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
 implementations use SQLite, and in SQLite the command is called VACUUM. so
 it seems to me that we might as well call the new function vacuum(). what do
 you think?

 2. spec: no need for an error callback.


 interface Database {
   // the methods and properties currently in the spec
   void vacuum(in optional SQLVoidCallback successCallback);
 };

 3. what the call should do: the purpose of this call is to allow apps to
 vacuum/compact/defragment/clean up their databases whenever they see fit. a
 call to vacuum() could take a considerable amount of time (especially on
 larger or highly fragmented databases); therefore, it is not recommended for
 web apps to call this method during periods of high activity.

 how to process a vacuum() call:

1. if the UA does not support this call (mobile browsers?), jump to
step 3.
2. queue up a task to vacuum/compact/defragment/clean up the database.
3. if the task succeeds, and a success callback is provided, queue up a
task to invoke the success callback; in all other cases (the task failed, 
 or
no success callback was provided), do nothing: proceed to the next task in
the queue.

 does this seem acceptable? we (google engineers interested in this) feel
 that UAs should either not implement the vacuum() call, or they should
 respect it (rather than taking it as a hint). it is ok for UAs to keep track
 of things like system idleness or databases closing to do more vacuuming
 that the apps asked for, if they want to. however, we feel that a vacuum()
 request by an app should not be postponed, simply because sometimes apps
 know better than UAs when the best time to vacuum is (it might be nice to
 give apps more information on how fragmented their databases are, but that's
 a separate discussion).

 thanks,
 dumi


 2010/3/9 Jeremy Orlow jor...@google.com

 On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote:



 On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote:


  I don't see how the callbacks are useful though. Vacuum works
 transparently, its effects are not visible, and what should the page
 do in
 case of error ?


 i was thinking of something like:

 db.defragment(errorCallback, successCallback);
 showAPrettyImageAndAskTheUserToWait();

 function errorCallback(error) {}
 function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
 }

 just like you, i'm not sure if the error callback is useful at all, but
 i
 thought i'd add it to make the defragment() call look more like a
 transaction. maybe we don't need it.


 True, but this is a kind of operation that could very well just run on
 the background, with a single optional callback when it's done (the webpage
 can't do anything if an error is detected anyway).


 ok, so let's drop the errorCallback: vacuum([optional] successCallback);


 The user agent would need to queue any subsequent transactions if a
 vacuum is running. I would consider it as an hint, and after all webpages
 that own references to the underlying data files are closed, would do a
 vacuum. So, if you have many tabs on gmail, and that a single gmail 
 instance
 tries to do multiple vacuums, it would equiv to one single vacuum 
 operation.


 what do we do if some databases are opened for the entire life of the
 browser? for example, i open my browser which has myfavoriteapp.com set
 as its homepage. myfavoriteapp.com immediately opens a DB, and i only
 close that app when i close the browser. when would the browser vacuum
 myfavoriteapp's DBs in this case?

 i think it's ok for the UA to vacuum some DBs automatically when it
 thinks it's a good time to do so; however, if a platform supports the
 vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a
 vacuum call coming from the app should be immediately scheduled (yes, the
 subsequent transactions would have to wait for the vacuuming to finish
 running). in some cases, the apps know better than the UA when to vacuum
 their DBs.

 by the way, we should probably agree on a name for this call. which one
 do you prefer? vacuum, defragment, defrag, something else? i don't have a
 strong opinion.


 I think vacuum is fine since the spec is already tied to the SQLite SQL
 dialect.

 collectGarbage() is another possibility

 Go with whatever you think is most clear and accurate though.

 J





Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Dumitru Daniliuc
sounds good to me:

interface Database {
 // the methods and properties currently in the spec
 void vacuum(in optional SQLVoidCallback completionCallback);
};

... upon completion, queue up a task to invoke the completionCallback, if
one was provided.

dumi


On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman micha...@google.comwrote:

 Instead of calling back on success only, maybe call back on completion
 regardless of success or failure. This way the caller would know when the
 potentially lengthy operation was done, regardless of the outcome.

 2010/3/11 Dumitru Daniliuc d...@chromium.org

 joao,

 it looks like we mostly agree on this feature, so i was wondering if we
 could formally agree on a spec. here's what i propose:

 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
 implementations use SQLite, and in SQLite the command is called VACUUM. so
 it seems to me that we might as well call the new function vacuum(). what do
 you think?

 2. spec: no need for an error callback.


 interface Database {
   // the methods and properties currently in the spec
   void vacuum(in optional SQLVoidCallback successCallback);
 };

 3. what the call should do: the purpose of this call is to allow apps to
 vacuum/compact/defragment/clean up their databases whenever they see fit. a
 call to vacuum() could take a considerable amount of time (especially on
 larger or highly fragmented databases); therefore, it is not recommended for
 web apps to call this method during periods of high activity.

 how to process a vacuum() call:

1. if the UA does not support this call (mobile browsers?), jump to
step 3.
2. queue up a task to vacuum/compact/defragment/clean up the database.
3. if the task succeeds, and a success callback is provided, queue up
a task to invoke the success callback; in all other cases (the task 
 failed,
or no success callback was provided), do nothing: proceed to the next task
in the queue.

 does this seem acceptable? we (google engineers interested in this) feel
 that UAs should either not implement the vacuum() call, or they should
 respect it (rather than taking it as a hint). it is ok for UAs to keep track
 of things like system idleness or databases closing to do more vacuuming
 that the apps asked for, if they want to. however, we feel that a vacuum()
 request by an app should not be postponed, simply because sometimes apps
 know better than UAs when the best time to vacuum is (it might be nice to
 give apps more information on how fragmented their databases are, but that's
 a separate discussion).

 thanks,
 dumi


 2010/3/9 Jeremy Orlow jor...@google.com

 On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote:



 On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote:


  I don't see how the callbacks are useful though. Vacuum works
 transparently, its effects are not visible, and what should the page
 do in
 case of error ?


 i was thinking of something like:

 db.defragment(errorCallback, successCallback);
 showAPrettyImageAndAskTheUserToWait();

 function errorCallback(error) {}
 function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
 }

 just like you, i'm not sure if the error callback is useful at all,
 but i
 thought i'd add it to make the defragment() call look more like a
 transaction. maybe we don't need it.


 True, but this is a kind of operation that could very well just run on
 the background, with a single optional callback when it's done (the 
 webpage
 can't do anything if an error is detected anyway).


 ok, so let's drop the errorCallback: vacuum([optional] successCallback);


 The user agent would need to queue any subsequent transactions if a
 vacuum is running. I would consider it as an hint, and after all webpages
 that own references to the underlying data files are closed, would do a
 vacuum. So, if you have many tabs on gmail, and that a single gmail 
 instance
 tries to do multiple vacuums, it would equiv to one single vacuum 
 operation.


 what do we do if some databases are opened for the entire life of the
 browser? for example, i open my browser which has myfavoriteapp.com set
 as its homepage. myfavoriteapp.com immediately opens a DB, and i only
 close that app when i close the browser. when would the browser vacuum
 myfavoriteapp's DBs in this case?

 i think it's ok for the UA to vacuum some DBs automatically when it
 thinks it's a good time to do so; however, if a platform supports the
 vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a
 vacuum call coming from the app should be immediately scheduled (yes, the
 subsequent transactions would have to wait for the vacuuming to finish
 running). in some cases, the apps know better than the UA when to vacuum
 their DBs.

 by the way, we should probably agree on a name for this call. which one
 do you prefer? vacuum, defragment, defrag, something else? i don't have a
 strong opinion.


 I 

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Dumitru Daniliuc
joao,

if i understand correctly, you basically want to have an automated system
implemented in the browser that decides when to vacuum databases (or at
least make sure it doesn't happen too often). and the vacuum() calls would
be just one of the parameters that the system takes into account. i think
having such a system is fine, but doesn't need to be spec'ed. at the same
time, i think it's worth having a spec'ed vacuum() call that gives the app
some guarantees. in particular, i would like the completionCallback to be a
signal that the database is in a good shape and ready for more work. so how
about this:


   1. if the UA doesn't support the vacuum() call, go to step 3.
   2. queue up a task to vacuum the database. if the UA decides that
   vacuuming is not needed at this time, the vacuuming task scheduled in this
   step could be a no-op.
   3. when the vacuuming task completes, queue up a task to invoke the
   completion callback, if one was specified.

i think this spec should allow you to treat the vacuum() call as a hint,
while also guaranteeing that:
1. the UA believes the database is in a good shape when the completion
callback is called.
2. the completion callback is called as soon as UA believes the DB is in a
good shape.

what do you think?

thanks,
dumi


On Thu, Mar 11, 2010 at 4:13 PM, João Eiras jo...@opera.com wrote:

 On Fri, 12 Mar 2010 01:08:41 +0100, Dumitru Daniliuc d...@chromium.org
 wrote:

  joao,

 it looks like we mostly agree on this feature, so i was wondering if we
 could formally agree on a spec. here's what i propose:

 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
 implementations use SQLite, and in SQLite the command is called VACUUM. so
 it seems to me that we might as well call the new function vacuum(). what
 do
 you think?


 I'm fine with it but...


 2. spec: no need for an error callback.

 interface Database {
  // the methods and properties currently in the spec
  void vacuum(in optional SQLVoidCallback successCallback);
 };

 3. what the call should do: the purpose of this call is to allow apps to
 vacuum/compact/defragment/clean up their databases whenever they see fit.
 a
 call to vacuum() could take a considerable amount of time (especially on
 larger or highly fragmented databases); therefore, it is not recommended
 for
 web apps to call this method during periods of high activity.

 how to process a vacuum() call:

   1. if the UA does not support this call (mobile browsers?), jump to step
   3.
   2. queue up a task to vacuum/compact/defragment/clean up the database.
   3. if the task succeeds, and a success callback is provided, queue up a

   task to invoke the success callback; in all other cases (the task
 failed, or
   no success callback was provided), do nothing: proceed to the next task
 in
   the queue.

 does this seem acceptable? we (google engineers interested in this) feel
 that UAs should either not implement the vacuum() call, or they should
 respect it (rather than taking it as a hint). it is ok for UAs to keep
 track
 of things like system idleness or databases closing to do more vacuuming
 that the apps asked for, if they want to. however, we feel that a vacuum()
 request by an app should not be postponed, simply because sometimes apps
 know better than UAs when the best time to vacuum is (it might be nice to
 give apps more information on how fragmented their databases are, but
 that's
 a separate discussion).


 I unfortunately don't agree with this last sentence :) Web pages do not
 have a way to know the level of fragmentation, and the last thing user wants
 is a web page calling vacuum() before each transaction, *just to be safe*,
 because of potential fragmentation.

 But the API is good I think.

  thanks,
 dumi





Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread João Eiras
On Fri, 12 Mar 2010 04:07:21 +0100, Dumitru Daniliuc d...@chromium.org  
wrote:



joao,

if i understand correctly, you basically want to have an automated system
implemented in the browser that decides when to vacuum databases (or at
least make sure it doesn't happen too often). and the vacuum() calls  
would

be just one of the parameters that the system takes into account. i think
having such a system is fine, but doesn't need to be spec'ed. at the same
time, i think it's worth having a spec'ed vacuum() call that gives the  
app
some guarantees. in particular, i would like the completionCallback to  
be a
signal that the database is in a good shape and ready for more work. so  
how

about this:


   1. if the UA doesn't support the vacuum() call, go to step 3.
   2. queue up a task to vacuum the database. if the UA decides that
   vacuuming is not needed at this time, the vacuuming task scheduled in  
this

   step could be a no-op.
   3. when the vacuuming task completes, queue up a task to invoke the
   completion callback, if one was specified.

i think this spec should allow you to treat the vacuum() call as a hint,
while also guaranteeing that:
1. the UA believes the database is in a good shape when the completion
callback is called.
2. the completion callback is called as soon as UA believes the DB is in  
a

good shape.

what do you think?


Looks better, and more flexible.

I wonder if sqlite has a way to query the number of pages in the freelist ?
Probably something like 10% of pages in the freelist would be a good  
threshold to allow a vacuum.




thanks,
dumi




Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread João Eiras



Looks better, and more flexible.

I wonder if sqlite has a way to query the number of pages in the  
freelist ?
Probably something like 10% of pages in the freelist would be a good  
threshold to allow a vacuum.




Oh and btw, read only transactions should disallow vacuum, I think/hope :)



Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Dumitru Daniliuc
On Thu, Mar 11, 2010 at 7:17 PM, João Eiras jo...@opera.com wrote:

 On Fri, 12 Mar 2010 04:07:21 +0100, Dumitru Daniliuc d...@chromium.org
 wrote:

  joao,

 if i understand correctly, you basically want to have an automated system
 implemented in the browser that decides when to vacuum databases (or at
 least make sure it doesn't happen too often). and the vacuum() calls would
 be just one of the parameters that the system takes into account. i think
 having such a system is fine, but doesn't need to be spec'ed. at the same
 time, i think it's worth having a spec'ed vacuum() call that gives the app
 some guarantees. in particular, i would like the completionCallback to be
 a
 signal that the database is in a good shape and ready for more work. so
 how
 about this:


   1. if the UA doesn't support the vacuum() call, go to step 3.
   2. queue up a task to vacuum the database. if the UA decides that

   vacuuming is not needed at this time, the vacuuming task scheduled in
 this
   step could be a no-op.
   3. when the vacuuming task completes, queue up a task to invoke the

   completion callback, if one was specified.

 i think this spec should allow you to treat the vacuum() call as a hint,
 while also guaranteeing that:
 1. the UA believes the database is in a good shape when the completion
 callback is called.
 2. the completion callback is called as soon as UA believes the DB is in a
 good shape.

 what do you think?


 Looks better, and more flexible.


thanks. is there anything else you'd like to change? i'm pushing for this so
hard because i'd like to implement this feature, but i don't want to start
before all interested parties agree on a spec.


 I wonder if sqlite has a way to query the number of pages in the freelist ?
 Probably something like 10% of pages in the freelist would be a good
 threshold to allow a vacuum.


you can get the number of free pages with PRAGMA freelist_count (
http://www.sqlite.org/pragma.html). unfortunately, looking only at this
number would result in a lot of false negatives. for example, in theory, you
could have a DB with 1000 2KB pages, and each page could have only one
useful byte written on it. so PRAGMA freelist_count would return 0, even
though doing a vacuuming would shrink your DB from ~2MB to ~1KB. but i agree
that it could be one of the parameters to take into account.


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-09 Thread Jeremy Orlow
On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc d...@google.com wrote:



 On Mon, Mar 8, 2010 at 3:39 AM, João Eiras jo...@opera.com wrote:


  I don't see how the callbacks are useful though. Vacuum works
 transparently, its effects are not visible, and what should the page do
 in
 case of error ?


 i was thinking of something like:

 db.defragment(errorCallback, successCallback);
 showAPrettyImageAndAskTheUserToWait();

 function errorCallback(error) {}
 function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
 }

 just like you, i'm not sure if the error callback is useful at all, but i
 thought i'd add it to make the defragment() call look more like a
 transaction. maybe we don't need it.


 True, but this is a kind of operation that could very well just run on the
 background, with a single optional callback when it's done (the webpage
 can't do anything if an error is detected anyway).


 ok, so let's drop the errorCallback: vacuum([optional] successCallback);


 The user agent would need to queue any subsequent transactions if a vacuum
 is running. I would consider it as an hint, and after all webpages that own
 references to the underlying data files are closed, would do a vacuum. So,
 if you have many tabs on gmail, and that a single gmail instance tries to do
 multiple vacuums, it would equiv to one single vacuum operation.


 what do we do if some databases are opened for the entire life of the
 browser? for example, i open my browser which has myfavoriteapp.com set as
 its homepage. myfavoriteapp.com immediately opens a DB, and i only close
 that app when i close the browser. when would the browser vacuum
 myfavoriteapp's DBs in this case?

 i think it's ok for the UA to vacuum some DBs automatically when it thinks
 it's a good time to do so; however, if a platform supports the vacuum/defrag
 call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call
 coming from the app should be immediately scheduled (yes, the subsequent
 transactions would have to wait for the vacuuming to finish running). in
 some cases, the apps know better than the UA when to vacuum their DBs.

 by the way, we should probably agree on a name for this call. which one do
 you prefer? vacuum, defragment, defrag, something else? i don't have a
 strong opinion.


I think vacuum is fine since the spec is already tied to the SQLite SQL
dialect.

collectGarbage() is another possibility

Go with whatever you think is most clear and accurate though.

J


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-05 Thread Jeremy Orlow
On Fri, Mar 5, 2010 at 2:32 AM, João Eiras jo...@opera.com wrote:

 On Fri, 05 Mar 2010 03:22:00 +0100, Dumitru Daniliuc d...@chromium.org
 wrote:

  Hi,

 We (Chromium) would like to add a vacuum() call on the Database object.


 [...]

 I would argue about having something a bit more generic for naming like
 defragment().

 I don't see how the callbacks are useful though. Vacuum works
 transparently, its effects are not visible, and what should the page do in
 case of error ?

 Given that an operation like vacuum would be disk IO intensive, would it be
 good to give the webpage control over a feature that can have significant
 performance concerns ?


I'm sure a UA could always ignore it and/or vacuum whenever it wants...but
the drawback is that if you start vacuuming and then the app tries to open
the database, you might get some serious jank in the web app.


 While computers benefit from good file IO performance, that's not quite
 true in many mobile devices. So, the API would be more like an hint ? How
 can the webpage know the level of fragmentation of the data file ?

 Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so
 this kind of feature should be left for sqlite to handle for itself.


From that page:

Note, however, that auto-vacuum only truncates the freelist pages from the
file. Auto-vacuum does not defragment the database nor repack individual
database pages the way that the VACUUM lang_vacuum.html command does. In
fact, because it moves pages around within the file, auto-vacuum can
actually make fragmentation worse.



Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-05 Thread Dumitru Daniliuc
hi joao,

thanks for your comments!


 I would argue about having something a bit more generic for naming like
 defragment().


that's totally fine with me.


 I don't see how the callbacks are useful though. Vacuum works
 transparently, its effects are not visible, and what should the page do in
 case of error ?


i was thinking of something like:

db.defragment(errorCallback, successCallback);
showAPrettyImageAndAskTheUserToWait();

function errorCallback(error) {}
function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
}

just like you, i'm not sure if the error callback is useful at all, but i
thought i'd add it to make the defragment() call look more like a
transaction. maybe we don't need it.


 Given that an operation like vacuum would be disk IO intensive, would it be
 good to give the webpage control over a feature that can have significant
 performance concerns ?


first, i think it's better to give big, always-on apps access to a
potentially expensive clean-up call, than not give them any way at all to
defragment their databases. second, the app which called defragment() is the
only one that should notice the performance overhead (in practice, other web
apps might notice it too because of slower IO, but this is no different from
having an external, non-web app doing some intensive IO work and slowing
down all web apps). and third, most apps will probably never need to use
this call; it is mostly intended for apps with big databases that are always
open on a user's PC. those apps would usually know when they can call
defragment() without slowing down the user (for example, they could do it
when the app is idle).


 While computers benefit from good file IO performance, that's not quite
 true in many mobile devices. So, the API would be more like an hint ? How
 can the webpage know the level of fragmentation of the data file ?


on mobile devices i'm imagining the quotas and DBs are pretty small, in
which case defragmenting databases probably wouldn't save much space anyway.
so maybe those platforms should decide that they never need to do any kind
of maintenance on their DBs and make defragment() a no-op?

i don't think a webapp can figure out the level of fragmentation (maybe we
need an API to expose some quota/DB size information to the webapp?).
however, it could use other criteria to decide when to call defragment():
once every couple of weeks, when the system is idle, etc.


 Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so
 this kind of feature should be left for sqlite to handle for itself.


in addition to what jeremy said, having auto-vacuum on is going to add some
performance overhead to every transaction. this is not acceptable to some
apps.

thanks,
dumi