Re: [IndexDB] Proposal for async API changes

2010-06-21 Thread Andrei Popescu
On Tue, Jun 15, 2010 at 5:44 PM, Nikunj Mehta nik...@o-micron.com wrote:
 (specifically answering out of context)

 On May 17, 2010, at 6:15 PM, Jonas Sicking wrote:

 9. IDBKeyRanges are created using functions on IndexedDatabaseRequest.
 We couldn't figure out how the old API allowed you to create a range
 object without first having a range object.

 Hey Jonas,

 What was the problem with simply creating it as shown in the examples? The
 API is intentionally designed that way so that constants such as
 LEFT_BOUND and operations like only() can be used directly from the
 interface.

 For example,
 IDBKeyRange.LEFT_BOUND; // this should evaluate to 4
 IDBKeyRange.only("a").left; // this should evaluate to "a"


But in http://dvcs.w3.org/hg/IndexedDB/rev/fc747a407817 you added
[NoInterfaceObject] to the IDBKeyRange interface. Does the above
syntax still work? My understanding is that it doesn't anymore.

Thanks,
Andrei



Re: [IndexDB] Proposal for async API changes

2010-06-21 Thread Nikunj Mehta

On Jun 22, 2010, at 12:44 AM, Andrei Popescu wrote:

 On Tue, Jun 15, 2010 at 5:44 PM, Nikunj Mehta nik...@o-micron.com wrote:
 (specifically answering out of context)
 
 On May 17, 2010, at 6:15 PM, Jonas Sicking wrote:
 
 9. IDBKeyRanges are created using functions on IndexedDatabaseRequest.
 We couldn't figure out how the old API allowed you to create a range
 object without first having a range object.
 
 Hey Jonas,
 
 What was the problem with simply creating it as shown in the examples? The
 API is intentionally designed that way so that constants such as
 LEFT_BOUND and operations like only() can be used directly from the
 interface.
 
 For example,
 IDBKeyRange.LEFT_BOUND; // this should evaluate to 4
 IDBKeyRange.only("a").left; // this should evaluate to "a"
 
 
 But in http://dvcs.w3.org/hg/IndexedDB/rev/fc747a407817 you added
 [NoInterfaceObject] to the IDBKeyRange interface. Does the above
 syntax still work? My understanding is that it doesn't anymore.

You are right. I will remove that modifier.

Nikunj




Re: [IndexDB] Proposal for async API changes

2010-06-15 Thread Nikunj Mehta
(specifically answering out of context)

On May 17, 2010, at 6:15 PM, Jonas Sicking wrote:

 9. IDBKeyRanges are created using functions on IndexedDatabaseRequest.
 We couldn't figure out how the old API allowed you to create a range
 object without first having a range object.

Hey Jonas,

What was the problem with simply creating it as shown in the examples? The
API is intentionally designed that way so that constants such as
LEFT_BOUND and operations like only() can be used directly from the
interface.

For example, 
IDBKeyRange.LEFT_BOUND; // this should evaluate to 4
IDBKeyRange.only("a").left; // this should evaluate to "a"
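
For illustration, here is a minimal JavaScript sketch of the pattern being
described: an interface object carrying both constants and factory
operations. Only LEFT_BOUND = 4 and only() come from the example above; the
other constants and the flags field are assumptions modeled on the draft's
bit-flag style.

var IDBKeyRangeSketch = {
  SINGLE: 0,       // assumed values for the remaining flags
  LEFT_OPEN: 1,
  RIGHT_OPEN: 2,
  LEFT_BOUND: 4,   // the "should evaluate to 4" constant from the example
  RIGHT_BOUND: 8,

  // factory for a range containing exactly one key
  only: function (value) {
    return { left: value, right: value, flags: this.SINGLE };
  }
};

IDBKeyRangeSketch.LEFT_BOUND;      // evaluates to 4
IDBKeyRangeSketch.only("a").left;  // evaluates to "a"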

Let me know if you need help with this IDL. Also, it might be a good idea to 
get the WebIDL experts involved in clarifying such questions rather than 
changing the API.

Nikunj


Re: [IndexDB] Proposal for async API changes

2010-06-15 Thread Jonas Sicking
On Tue, Jun 15, 2010 at 9:44 AM, Nikunj Mehta nik...@o-micron.com wrote:
 (specifically answering out of context)

 On May 17, 2010, at 6:15 PM, Jonas Sicking wrote:

 9. IDBKeyRanges are created using functions on IndexedDatabaseRequest.
 We couldn't figure out how the old API allowed you to create a range
 object without first having a range object.

 Hey Jonas,

 What was the problem with simply creating it as shown in the examples? The
 API is intentionally designed that way so that constants such as
 LEFT_BOUND and operations like only() can be used directly from the
 interface.

 For example,
 IDBKeyRange.LEFT_BOUND; // this should evaluate to 4
 IDBKeyRange.only("a").left; // this should evaluate to "a"

 Let me know if you need help with this IDL. Also, it might be a good idea to 
 get the WebIDL experts involved in clarifying such questions rather than 
 changing the API.

If that is the intended syntax then that looks OK to me. What
confused me was the IDL. We should definitely discuss the key range
stuff in a separate thread, though; I'll start one today.

/ Jonas



Re: [IndexDB] Proposal for async API changes

2010-06-10 Thread Mikeal Rogers
I've been looking through the current spec and all the proposed changes.

Great work. I'm going to be building a CouchDB compatible API on top
of IndexedDB that can support peer-to-peer replication without other
CouchDB instances.

One of the things that will entail is a by-sequence index for all the
changes in a given database (in my case, a database will be scoped to
more than one ObjectStore). To accomplish this, I'll need to keep the
last known sequence around so that each new write can create a new
entry in the by-sequence index. The problem is that if another
tab/window writes to the database, it'll increment that sequence and I
won't be notified, so I would have to start every transaction with a
check on the sequence index for the last sequence, which seems like a
lot of extra cursor calls.

What I really need is an event listener on an ObjectStore that fires
after a transaction is committed to the store but before the next
transaction runs, and that gives me information about the commits to
the ObjectStore.
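
For concreteness, here is one hypothetical shape such a listener could take
(every name below is invented; nothing like this exists in the current
proposal):

var lastKnownSequence = 0;

// hypothetical "commit" event: fires after a transaction commits to the
// store and before the next transaction runs against it
objectStore.addEventListener("commit", function (e) {
  // e.changes would carry the records written by the committed transaction
  for (var i = 0; i < e.changes.length; i++) {
    lastKnownSequence = Math.max(lastKnownSequence, e.changes[i].sequence);
  }
});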

Thoughts?

-Mikeal

On Wed, Jun 9, 2010 at 11:40 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Wed, Jun 9, 2010 at 7:25 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow jor...@chromium.org wrote:
  On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org
  wrote:
   I'm not sure I like the idea of offering sync cursors either since
   the
   UA
   will either need to load everything into memory before starting or
   risk
   blocking on disk IO for large data sets.  Thus I'm not sure I support
   the
   idea of synchronous cursors.  But, at the same time, I'm concerned
   about
   the
   overhead of firing one event per value with async cursors.  Which is
   why I
   was suggesting an interface where the common case (the data is in
   memory) is
   done synchronously but the uncommon case (we'd block if we had to
   respond
   synchronously) has to be handled since we guarantee that the first
   time
   will
   be forced to be asynchronous.
   Like I said, I'm not super happy with what I proposed, but I think
   some
   hybrid async/sync interface is really what we need.  Have you guys
   spent
   any
   time thinking about something like this?  How dead-set are you on
   synchronous cursors?
 
  The idea is that synchronous cursors load all the required data into
  memory, yes. I think it would help authors a lot to be able to load
  small chunks of data into memory and read and write to it
  synchronously. Dealing with asynchronous operations constantly is
  certainly possible, but a bit of a pain for authors.
 
  I don't think we should obsess too much about not keeping things in
   memory; we already have things like canvas and the DOM, which add up
  to non-trivial amounts of memory.
 
  Just because data is loaded from a database doesn't mean it's huge.
 
  I do note that you're not as concerned about getAll(), which actually
  has worse memory characteristics than synchronous cursors since you
  need to create the full JS object graph in memory.
 
  I've been thinking about this off and on since the original proposal was
  made, and I just don't feel right about getAll() or synchronous cursors.
  You make some good points about there already being many ways to
  overwhelm RAM with web APIs, but is there any place we make it so
  easy?  You're right that just because it's a database doesn't mean it
  needs to be huge, but oftentimes they can get quite big.  And if a
  developer doesn't spend time making sure they test their app with the
  upper ends of what users may possibly see, it just seems like this is
  a recipe for problems.
  Here's a concrete example: structured clone allows you to store image
  data.  Let's say I'm building an image hosting site and that I cache
  all the images along with their thumbnails locally in an IndexedDB
  entity store.  Let's say each thumbnail is a trivial amount, but each
  image is 1MB.  I have an album with 1000 images.  I do |var photos =
  albumIndex.getAllObjects(albumName);|
  and then iterate over that to get the thumbnails.  But I've just
  loaded over 1GB of stuff into RAM (assuming no additional
  inefficiency/blowup).  I suppose it's possible JavaScript engines
  could build mechanisms to fetch this stuff lazily (like you could even
  with a synchronous cursor) but that will take time/effort and
  introduce lag in the page (while fetching additional info from disk).
 
  I'm not completely against the idea of getAll/sync cursors, but I do
  think they should be de-coupled from this proposed API.  I would also
  suggest that we re-consider them only after at least one implementation
  has normal cursors working and there's been some experimentation with
  it.  Until then, we're basing most of our arguments on intuition and
  assumptions.

 I'm not married to the concept of sync cursors. However I pretty
 strongly feel that getAll is something we need.

Re: [IndexDB] Proposal for async API changes

2010-06-10 Thread Andrei Popescu
Hi Jonas,

On Wed, Jun 9, 2010 at 11:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 I'm well aware of this. My argument is that I think we'll see people
 write code like this:

 results = [];
 db.objectStore("foo").openCursor(range).onsuccess = function(e) {
  var cursor = e.result;
  if (!cursor) {
    weAreDone(results);
    return;
  }
  results.push(cursor.value);
  cursor.continue();
 }

 While the indexedDB implementation doesn't hold much data in memory at
 a time, the webpage will hold just as much as if we had had a getAll
 function. Thus we haven't actually improved anything, only forced the
 author to write more code.
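
For comparison, the getAll() form of the same read would look roughly like
this (getAll() is part of the proposal under discussion, but its exact
signature was still unsettled, so treat this as a sketch):

db.objectStore("foo").getAll(range).onsuccess = function(e) {
  // the entire result set is materialized as a single array before this fires
  weAreDone(e.result);
};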


True, but the difference here is that the author's code is the one
that may cause an OOM situation, not the indexedDB implementation. I
am afraid that, by allowing getAll(), we are designing an API that may or
may not work depending on how large the underlying data set is and
what platform the code is running on (e.g. a mobile with a few MB of
RAM available or a desktop with a few GB free). To me, that is not
ideal.


 Put it another way: The raised concern is that people won't think
 about the fact that getAll can load a lot of data into memory. And the
 proposed solution is to remove the getAll function and tell people to
 use openCursor. However if they weren't thinking about that a lot of
 data will be in memory at one time, then why wouldn't they write code
 like the above? Which results in just as much data being in memory?


If they write code like the above and they run out of memory, I think
there's a chance they can trace the problem back to their own code and
attempt to fix it. On the other hand, if they trace the problem to the
indexedDB implementation, then their only choice is to avoid using
getAll().  Like you said, perhaps it's best to leave this method out
for now and see what kind of feedback we get from API users. If there
is demand, we can add it at that point?

Thanks,
Andrei



Re: [IndexDB] Proposal for async API changes

2010-06-10 Thread Jonas Sicking
On Thu, Jun 10, 2010 at 4:46 AM, Andrei Popescu andr...@google.com wrote:
 Hi Jonas,

 On Wed, Jun 9, 2010 at 11:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 I'm well aware of this. My argument is that I think we'll see people
 write code like this:

 results = [];
 db.objectStore("foo").openCursor(range).onsuccess = function(e) {
  var cursor = e.result;
  if (!cursor) {
    weAreDone(results);
    return;
  }
  results.push(cursor.value);
  cursor.continue();
 }

 While the indexedDB implementation doesn't hold much data in memory at
 a time, the webpage will hold just as much as if we had had a getAll
 function. Thus we haven't actually improved anything, only forced the
 author to write more code.


 True, but the difference here is that the author's code is the one
 that may cause an OOM situation, not the indexedDB implementation.

I don't see that the two are different. The user likely sees the same
behavior and the action on the part of the website author is the same,
i.e. to load the data in chunks rather than all at once.

Why does it make a difference on which side of the API the out-of-memory happens?


 Put it another way: The raised concern is that people won't think
 about the fact that getAll can load a lot of data into memory. And the
 proposed solution is to remove the getAll function and tell people to
 use openCursor. However if they weren't thinking about that a lot of
 data will be in memory at one time, then why wouldn't they write code
 like the above? Which results in just as much data being in memory?


 If they write code like the above and they run out of memory, I think
 there's a chance they can trace the problem back to their own code and
 attempt to fix it. On the other hand, if they trace the problem to the
 indexedDB implementation, then their only choice is to avoid using
 getAll().

Yes, their only choice is to rewrite the code to read data in chunks.
However you could do that both using getAll (using limits and making
several calls to getAll) and using cursors. So again, I don't really
see a difference.
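
A rough sketch of that chunked pattern, assuming getAll() accepts a maximum
count and that leftBound() takes an "open" flag as in the draft (both
signatures were still in flux, and keyOf() is an invented stand-in for
extracting a record's key):

function readInChunks(store, chunkSize, onChunk, afterKey) {
  var range = (afterKey === undefined)
      ? null                                    // full range on the first call
      : IDBKeyRange.leftBound(afterKey, true);  // exclusive lower bound
  store.getAll(range, chunkSize).onsuccess = function(e) {
    var chunk = e.result;
    onChunk(chunk);
    if (chunk.length === chunkSize) {           // a short chunk means we're done
      readInChunks(store, chunkSize, onChunk, keyOf(chunk[chunk.length - 1]));
    }
  };
}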

/ Jonas



Re: [IndexDB] Proposal for async API changes

2010-06-10 Thread Andrei Popescu
On Thu, Jun 10, 2010 at 5:52 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Thu, Jun 10, 2010 at 4:46 AM, Andrei Popescu andr...@google.com wrote:
 Hi Jonas,

 On Wed, Jun 9, 2010 at 11:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 I'm well aware of this. My argument is that I think we'll see people
 write code like this:

 results = [];
 db.objectStore("foo").openCursor(range).onsuccess = function(e) {
  var cursor = e.result;
  if (!cursor) {
    weAreDone(results);
    return;
  }
  results.push(cursor.value);
  cursor.continue();
 }

 While the indexedDB implementation doesn't hold much data in memory at
 a time, the webpage will hold just as much as if we had had a getAll
 function. Thus we haven't actually improved anything, only forced the
 author to write more code.


 True, but the difference here is that the author's code is the one
 that may cause an OOM situation, not the indexedDB implementation.

 I don't see that the two are different. The user likely sees the same
 behavior and the action on the part of the website author is the same,
 i.e. to load the data in chunks rather than all at once.

 Why does it make a difference on which side of the API the out-of-memory
 happens?


Yep, you are right in saying that the two situations are identical
from the point of view of the user or from the point of view of the
action that the website author takes.

I just thought that in one case, the website author wrote code to
explicitly load the entire store into memory, so when an OOM
happens, the culprit may be easy to spot. In the other case, the
website author may not have realized how getAll() is implemented and
may not know immediately what is going on. On the other hand, getAll()
asynchronously returns an Array containing all the requested values, so
it should be just as obvious that it may cause an OOM. So OK, this
isn't such a big concern after all.


 Put it another way: The raised concern is that people won't think
 about the fact that getAll can load a lot of data into memory. And the
 proposed solution is to remove the getAll function and tell people to
 use openCursor. However if they weren't thinking about that a lot of
 data will be in memory at one time, then why wouldn't they write code
 like the above? Which results in just as much data being in memory?


 If they write code like the above and they run out of memory, I think
 there's a chance they can trace the problem back to their own code and
 attempt to fix it. On the other hand, if they trace the problem to the
 indexedDB implementation, then their only choice is to avoid using
 getAll().

 Yes, their only choice is to rewrite the code to read data in chunks.
 However you could do that both using getAll (using limits and making
 several calls to getAll) and using cursors. So again, I don't really
 see a difference.


Well, I don't feel very strongly about it but I personally would lean
towards keeping the API simple and, where possible, avoiding
multiple ways of doing the same thing until we're sure there's demand
for them...

Thanks,
Andrei



Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Jeremy Orlow
On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  I'm not sure I like the idea of offering sync cursors either since the UA
  will either need to load everything into memory before starting or risk
  blocking on disk IO for large data sets.  Thus I'm not sure I support the
  idea of synchronous cursors.  But, at the same time, I'm concerned about
 the
  overhead of firing one event per value with async cursors.  Which is
 why I
  was suggesting an interface where the common case (the data is in memory)
 is
  done synchronously but the uncommon case (we'd block if we had to respond
  synchronously) has to be handled since we guarantee that the first time
 will
  be forced to be asynchronous.
  Like I said, I'm not super happy with what I proposed, but I think some
  hybrid async/sync interface is really what we need.  Have you guys spent
 any
  time thinking about something like this?  How dead-set are you on
  synchronous cursors?
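
As an illustration only, a hypothetical consumption loop for such a hybrid
cursor (all member names here, and handleValue(), are invented):

function drain(cursor) {
  // common case: the next value is already in memory, consume synchronously
  while (cursor.valueReady) {
    handleValue(cursor.value);
    cursor.advance();
  }
  if (!cursor.done) {
    // uncommon case: the UA would block on disk IO, so wait for a callback;
    // per the proposal above, the first iteration is always forced down
    // this asynchronous path
    cursor.onready = function() { drain(cursor); };
  }
}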

 The idea is that synchronous cursors load all the required data into
 memory, yes. I think it would help authors a lot to be able to load
 small chunks of data into memory and read and write to it
 synchronously. Dealing with asynchronous operations constantly is
 certainly possible, but a bit of a pain for authors.

 I don't think we should obsess too much about not keeping things in
 memory; we already have things like canvas and the DOM, which add up
 to non-trivial amounts of memory.

 Just because data is loaded from a database doesn't mean it's huge.

 I do note that you're not as concerned about getAll(), which actually
 has worse memory characteristics than synchronous cursors since you
 need to create the full JS object graph in memory.


I've been thinking about this off and on since the original proposal was
made, and I just don't feel right about getAll() or synchronous cursors.
You make some good points about there already being many ways to overwhelm
RAM with web APIs, but is there any place we make it so easy?  You're right
that just because it's a database doesn't mean it needs to be huge, but
oftentimes they can get quite big.  And if a developer doesn't spend time
making sure they test their app with the upper ends of what users may
possibly see, it just seems like this is a recipe for problems.

Here's a concrete example: structured clone allows you to store image data.
Let's say I'm building an image hosting site and that I cache all the images
along with their thumbnails locally in an IndexedDB entity store.  Let's say
each thumbnail is a trivial amount, but each image is 1MB.  I have an album
with 1000 images.  I do |var photos = albumIndex.getAllObjects(albumName);|
and then iterate over that to get the thumbnails.  But I've just loaded over
1GB of stuff into RAM (assuming no additional inefficiency/blowup).  I
suppose it's possible JavaScript engines could build mechanisms to fetch
this stuff lazily (like you could even with a synchronous cursor) but that
will take time/effort and introduce lag in the page (while fetching
additional info from disk).


I'm not completely against the idea of getAll/sync cursors, but I do think
they should be de-coupled from this proposed API.  I would also suggest that
we re-consider them only after at least one implementation has normal
cursors working and there's been some experimentation with it.  Until then,
we're basing most of our arguments on intuition and assumptions.

J


Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Jonas Sicking
On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  I'm not sure I like the idea of offering sync cursors either since the
  UA
  will either need to load everything into memory before starting or risk
  blocking on disk IO for large data sets.  Thus I'm not sure I support
  the
  idea of synchronous cursors.  But, at the same time, I'm concerned about
  the
  overhead of firing one event per value with async cursors.  Which is
  why I
  was suggesting an interface where the common case (the data is in
  memory) is
  done synchronously but the uncommon case (we'd block if we had to
  respond
  synchronously) has to be handled since we guarantee that the first time
  will
  be forced to be asynchronous.
  Like I said, I'm not super happy with what I proposed, but I think some
  hybrid async/sync interface is really what we need.  Have you guys spent
  any
  time thinking about something like this?  How dead-set are you on
  synchronous cursors?

 The idea is that synchronous cursors load all the required data into
 memory, yes. I think it would help authors a lot to be able to load
 small chunks of data into memory and read and write to it
 synchronously. Dealing with asynchronous operations constantly is
 certainly possible, but a bit of a pain for authors.

 I don't think we should obsess too much about not keeping things in
 memory; we already have things like canvas and the DOM, which add up
 to non-trivial amounts of memory.

 Just because data is loaded from a database doesn't mean it's huge.

 I do note that you're not as concerned about getAll(), which actually
 has worse memory characteristics than synchronous cursors since you
 need to create the full JS object graph in memory.

 I've been thinking about this off and on since the original proposal was
 made, and I just don't feel right about getAll() or synchronous cursors.
 You make some good points about there already being many ways to overwhelm
 RAM with web APIs, but is there any place we make it so easy?  You're right
 that just because it's a database doesn't mean it needs to be huge, but
 oftentimes they can get quite big.  And if a developer doesn't spend time
 making sure they test their app with the upper ends of what users may
 possibly see, it just seems like this is a recipe for problems.
 Here's a concrete example: structured clone allows you to store image data.
 Let's say I'm building an image hosting site and that I cache all the images
 along with their thumbnails locally in an IndexedDB entity store.  Let's say
 each thumbnail is a trivial amount, but each image is 1MB.  I have an album
 with 1000 images.  I do |var photos = albumIndex.getAllObjects(albumName);|
 and then iterate over that to get the thumbnails.  But I've just loaded over
 1GB of stuff into RAM (assuming no additional inefficiency/blowup).  I
 suppose it's possible JavaScript engines could build mechanisms to fetch
 this stuff lazily (like you could even with a synchronous cursor) but that
 will take time/effort and introduce lag in the page (while fetching
 additional info from disk).

 I'm not completely against the idea of getAll/sync cursors, but I do think
 they should be de-coupled from this proposed API.  I would also suggest that
 we re-consider them only after at least one implementation has normal
 cursors working and there's been some experimentation with it.  Until then,
 we're basing most of our arguments on intuition and assumptions.

I'm not married to the concept of sync cursors. However I pretty
strongly feel that getAll is something we need. If we just allow
cursors for getting multiple results I think we'll see an extremely
common pattern of people using a cursor to loop through a result set
and put values into an array.

Yes, it can be misused, but I don't see a reason why people wouldn't
misuse a cursor just as much. If they don't think about the fact that
a range contains lots of data when using getAll, why would they think
about it when using cursors?

/ Jonas



RE: [IndexDB] Proposal for async API changes

2010-06-09 Thread Laxmi Narsimha Rao Oruganti
Inline...

-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jonas Sicking
Sent: Wednesday, June 09, 2010 11:55 PM
To: Jeremy Orlow
Cc: Shawn Wilsher; Webapps WG
Subject: Re: [IndexDB] Proposal for async API changes

On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  I'm not sure I like the idea of offering sync cursors either since the
  UA
  will either need to load everything into memory before starting or risk
  blocking on disk IO for large data sets.  Thus I'm not sure I support
  the
  idea of synchronous cursors.  But, at the same time, I'm concerned about
  the
  overhead of firing one event per value with async cursors.  Which is
  why I
  was suggesting an interface where the common case (the data is in
  memory) is
  done synchronously but the uncommon case (we'd block if we had to
  respond
  synchronously) has to be handled since we guarantee that the first time
  will
  be forced to be asynchronous.
  Like I said, I'm not super happy with what I proposed, but I think some
  hybrid async/sync interface is really what we need.  Have you guys spent
  any
  time thinking about something like this?  How dead-set are you on
  synchronous cursors?

 The idea is that synchronous cursors load all the required data into
 memory, yes. I think it would help authors a lot to be able to load
 small chunks of data into memory and read and write to it
 synchronously. Dealing with asynchronous operations constantly is
 certainly possible, but a bit of a pain for authors.

 I don't think we should obsess too much about not keeping things in
 memory; we already have things like canvas and the DOM, which add up
 to non-trivial amounts of memory.

 Just because data is loaded from a database doesn't mean it's huge.

 I do note that you're not as concerned about getAll(), which actually
 has worse memory characteristics than synchronous cursors since you
 need to create the full JS object graph in memory.

 I've been thinking about this off and on since the original proposal was
 made, and I just don't feel right about getAll() or synchronous cursors.
 You make some good points about there already being many ways to overwhelm
 RAM with web APIs, but is there any place we make it so easy?  You're right
 that just because it's a database doesn't mean it needs to be huge, but
 oftentimes they can get quite big.  And if a developer doesn't spend time
 making sure they test their app with the upper ends of what users may
 possibly see, it just seems like this is a recipe for problems.
 Here's a concrete example: structured clone allows you to store image data.
 Let's say I'm building an image hosting site and that I cache all the images
 along with their thumbnails locally in an IndexedDB entity store.  Let's say
 each thumbnail is a trivial amount, but each image is 1MB.  I have an album
 with 1000 images.  I do |var photos = albumIndex.getAllObjects(albumName);|
 and then iterate over that to get the thumbnails.  But I've just loaded over
 1GB of stuff into RAM (assuming no additional inefficiency/blowup).  I
 suppose it's possible JavaScript engines could build mechanisms to fetch
 this stuff lazily (like you could even with a synchronous cursor) but that
 will take time/effort and introduce lag in the page (while fetching
 additional info from disk).

 I'm not completely against the idea of getAll/sync cursors, but I do think
 they should be de-coupled from this proposed API.  I would also suggest that
 we re-consider them only after at least one implementation has normal
 cursors working and there's been some experimentation with it.  Until then,
 we're basing most of our arguments on intuition and assumptions.

I'm not married to the concept of sync cursors. However I pretty
strongly feel that getAll is something we need. If we just allow
cursors for getting multiple results I think we'll see an extremely
common pattern of people using a cursor to loop through a result set
and put values into an array.

Yes, it can be misused, but I don't see a reason why people wouldn't
misuse a cursor just as much. If they don't think about the fact that
a range contains lots of data when using getAll, why would they think
about it when using cursors?

[Laxmi] A cursor is a streaming operator: only the current row or page
is available in memory and the rest sits on disk.  As the program moves the
cursor through the result, old pages are thrown away and new pages are loaded
from the result set.  Whereas with getAll, everything has to come into memory
before returning to the caller.  If there is not enough memory to hold the
whole result at once, we would end up out of memory.  In short, getAll suits
small results/ranges well but not big databases.  That is, with getAll

Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Jeremy Orlow
On Wed, Jun 9, 2010 at 7:25 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow jor...@chromium.org wrote:
  On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org
  wrote:
   I'm not sure I like the idea of offering sync cursors either since the
   UA
   will either need to load everything into memory before starting or
 risk
   blocking on disk IO for large data sets.  Thus I'm not sure I support
   the
   idea of synchronous cursors.  But, at the same time, I'm concerned
 about
   the
   overhead of firing one event per value with async cursors.  Which is
   why I
   was suggesting an interface where the common case (the data is in
   memory) is
   done synchronously but the uncommon case (we'd block if we had to
   respond
   synchronously) has to be handled since we guarantee that the first
 time
   will
   be forced to be asynchronous.
   Like I said, I'm not super happy with what I proposed, but I think
 some
   hybrid async/sync interface is really what we need.  Have you guys
 spent
   any
   time thinking about something like this?  How dead-set are you on
   synchronous cursors?
 
  The idea is that synchronous cursors load all the required data into
  memory, yes. I think it would help authors a lot to be able to load
  small chunks of data into memory and read and write to it
  synchronously. Dealing with asynchronous operations constantly is
  certainly possible, but a bit of a pain for authors.
 
  I don't think we should obsess too much about not keeping things in
  memory; we already have things like canvas and the DOM, which add up
  to non-trivial amounts of memory.
 
  Just because data is loaded from a database doesn't mean it's huge.
 
  I do note that you're not as concerned about getAll(), which actually
  has worse memory characteristics than synchronous cursors since you
  need to create the full JS object graph in memory.
 
  I've been thinking about this off and on since the original proposal was
  made, and I just don't feel right about getAll() or synchronous cursors.
  You make some good points about there already being many ways to
  overwhelm RAM with web APIs, but is there any place we make it so
  easy?  You're right that just because it's a database doesn't mean it
  needs to be huge, but oftentimes they can get quite big.  And if a
  developer doesn't spend time making sure they test their app with the
  upper ends of what users may possibly see, it just seems like this is
  a recipe for problems.
  Here's a concrete example: structured clone allows you to store image
  data.  Let's say I'm building an image hosting site and that I cache
  all the images along with their thumbnails locally in an IndexedDB
  entity store.  Let's say each thumbnail is a trivial amount, but each
  image is 1MB.  I have an album with 1000 images.  I do |var photos =
  albumIndex.getAllObjects(albumName);|
  and then iterate over that to get the thumbnails.  But I've just
  loaded over 1GB of stuff into RAM (assuming no additional
  inefficiency/blowup).  I suppose it's possible JavaScript engines
  could build mechanisms to fetch this stuff lazily (like you could even
  with a synchronous cursor) but that will take time/effort and
  introduce lag in the page (while fetching additional info from disk).
 
  I'm not completely against the idea of getAll/sync cursors, but I do
  think they should be de-coupled from this proposed API.  I would also
  suggest that we re-consider them only after at least one implementation
  has normal cursors working and there's been some experimentation with
  it.  Until then, we're basing most of our arguments on intuition and
  assumptions.

 I'm not married to the concept of sync cursors. However I pretty
 strongly feel that getAll is something we need. If we just allow
 cursors for getting multiple results I think we'll see an extremely
 common pattern of people using a cursor to loop through a result set
 and put values into an array.

 Yes, it can be misused, but I don't see a reason why people wouldn't
 misuse a cursor just as much. If they don't think about the fact that
 a range contains lots of data when using getAll, why would they think
 about it when using cursors?


Once again, I feel like there is a lot of speculation (more than normal)
happening here.  I'd prefer we take the Async API without the sync cursors
or getAll and give the rest of the API some time to bake before considering
it again.  Ideally by then we'd have at least one or two early adopters that
can give their perspective on the issue.

J


Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Jonas Sicking
On Wed, Jun 9, 2010 at 11:39 AM, Laxmi Narsimha Rao Oruganti
laxmi.oruga...@microsoft.com wrote:
 Inline...

 -Original Message-
 From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
 Behalf Of Jonas Sicking
 Sent: Wednesday, June 09, 2010 11:55 PM
 To: Jeremy Orlow
 Cc: Shawn Wilsher; Webapps WG
 Subject: Re: [IndexDB] Proposal for async API changes

 On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  I'm not sure I like the idea of offering sync cursors either since the
  UA
  will either need to load everything into memory before starting or risk
  blocking on disk IO for large data sets.  Thus I'm not sure I support
  the
  idea of synchronous cursors.  But, at the same time, I'm concerned about
  the
  overhead of firing one event per value with async cursors.  Which is
  why I
  was suggesting an interface where the common case (the data is in
  memory) is
  done synchronously but the uncommon case (we'd block if we had to
  respond
  synchronously) has to be handled since we guarantee that the first time
  will
  be forced to be asynchronous.
  Like I said, I'm not super happy with what I proposed, but I think some
  hybrid async/sync interface is really what we need.  Have you guys spent
  any
  time thinking about something like this?  How dead-set are you on
  synchronous cursors?

 The idea is that synchronous cursors load all the required data into
 memory, yes. I think it would help authors a lot to be able to load
 small chunks of data into memory and read and write to it
 synchronously. Dealing with asynchronous operations constantly is
 certainly possible, but a bit of a pain for authors.

 I don't think we should obsess too much about not keeping things in
 memory; we already have things like canvas and the DOM, which add up
 to non-trivial amounts of memory.

 Just because data is loaded from a database doesn't mean it's huge.

 I do note that you're not as concerned about getAll(), which actually
 has worse memory characteristics than synchronous cursors since you
 need to create the full JS object graph in memory.

 I've been thinking about this off and on since the original proposal was
 made, and I just don't feel right about getAll() or synchronous cursors.
 You make some good points about there already being many ways to overwhelm
 RAM with web APIs, but is there any place we make it so easy?  You're right
 that just because it's a database doesn't mean it needs to be huge, but
 oftentimes they can get quite big.  And if a developer doesn't spend time
 making sure they test their app with the upper ends of what users may
 possibly see, it just seems like this is a recipe for problems.
 Here's a concrete example: structured clone allows you to store image data.
 Let's say I'm building an image hosting site and that I cache all the images
 along with their thumbnails locally in an IndexedDB entity store.  Let's say
 each thumbnail is a trivial amount, but each image is 1MB.  I have an album
 with 1000 images.  I do |var photos = albumIndex.getAllObjects(albumName);|
 and then iterate over that to get the thumbnails.  But I've just loaded over
 1GB of stuff into RAM (assuming no additional inefficiency/blowup).  I
 suppose it's possible JavaScript engines could build mechanisms to fetch
 this stuff lazily (like you could even with a synchronous cursor) but that
 will take time/effort and introduce lag in the page (while fetching
 additional info from disk).

 I'm not completely against the idea of getAll/sync cursors, but I do think
 they should be de-coupled from this proposed API.  I would also suggest that
 we re-consider them only after at least one implementation has normal
 cursors working and there's been some experimentation with it.  Until then,
 we're basing most of our arguments on intuition and assumptions.

 I'm not married to the concept of sync cursors. However I pretty
 strongly feel that getAll is something we need. If we just allow
 cursors for getting multiple results I think we'll see an extremely
 common pattern of people using a cursor to loop through a result set
 and put values into an array.

 Yes, it can be misused, but I don't see a reason why people wouldn't
 misuse a cursor just as much. If they don't think about the fact that
 a range contains lots of data when using getAll, why would they think
 about it when using cursors?

 [Laxmi] A cursor is a streaming operator: only the current row or
 page is available in memory and the rest sits on disk.  As the program
 moves the cursor through the result, old pages are thrown away and new pages
 are loaded from the result set.  Whereas with getAll, everything has to come
 into memory before returning to the caller.  If there is not enough memory
 to hold the whole result at once, we would end up out of memory.

Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Tab Atkins Jr.
On Wed, Jun 9, 2010 at 3:27 PM, Jonas Sicking jo...@sicking.cc wrote:
 I'm well aware of this. My argument is that I think we'll see people
 write code like this:

 results = [];
 db.objectStore("foo").openCursor(range).onsuccess = function(e) {
  var cursor = e.result;
  if (!cursor) {
    weAreDone(results);
    return;
  }
  results.push(cursor.value);
  cursor.continue();
 }

 While the indexedDB implementation doesn't hold much data in memory at
 a time, the webpage will hold just as much as if we had had a getAll
 function. Thus we haven't actually improved anything, only forced the
 author to write more code.


 Put it another way: The raised concern is that people won't think
 about the fact that getAll can load a lot of data into memory. And the
 proposed solution is to remove the getAll function and tell people to
 use openCursor. However if they weren't thinking about that a lot of
 data will be in memory at one time, then why wouldn't they write code
 like the above? Which results in just as much data being in memory?

At the very least, explicitly loading things into an honest-to-god
array can make it more obvious that you're eating memory in the form
of a big array, as opposed to a magical "transform my blob of
data into something more convenient" operation.

(That said, I dislike cursors and explicitly avoid them in my own
code.  In the PHP db abstraction layer I wrote for myself, every query
slurps the results into an array and just returns that - I don't give
myself any access to the cursor at all.  I probably like this better
simply because I can easily foreach through an array, while I can't do
the same with a cursor unless I write some moderately more complex
code.  I hate using while loops when foreach is beckoning to me.)

~TJ



Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Jonas Sicking
On Wed, Jun 9, 2010 at 11:40 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Wed, Jun 9, 2010 at 7:25 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow jor...@chromium.org wrote:
  On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org
  wrote:
   I'm not sure I like the idea of offering sync cursors either since
   the
   UA
   will either need to load everything into memory before starting or
   risk
   blocking on disk IO for large data sets.  Thus I'm not sure I support
   the
   idea of synchronous cursors.  But, at the same time, I'm concerned
   about
   the
   overhead of firing one event per value with async cursors.  Which is
   why I
   was suggesting an interface where the common case (the data is in
   memory) is
   done synchronously but the uncommon case (we'd block if we had to
   respond
   synchronously) has to be handled since we guarantee that the first
   time
   will
   be forced to be asynchronous.
   Like I said, I'm not super happy with what I proposed, but I think
   some
   hybrid async/sync interface is really what we need.  Have you guys
   spent
   any
   time thinking about something like this?  How dead-set are you on
   synchronous cursors?
 
  The idea is that synchronous cursors load all the required data into
  memory, yes. I think it would help authors a lot to be able to load
  small chunks of data into memory and read and write to it
  synchronously. Dealing with asynchronous operations constantly is
  certainly possible, but a bit of a pain for authors.
 
  I don't think we should obsess too much about not keeping things in
  memory; we already have things like canvas and the DOM, which add up
  to non-trivial amounts of memory.
 
  Just because data is loaded from a database doesn't mean it's huge.
 
  I do note that you're not as concerned about getAll(), which actually
  has worse memory characteristics than synchronous cursors since you
  need to create the full JS object graph in memory.
 
  I've been thinking about this off and on since the original proposal was
  made, and I just don't feel right about getAll() or synchronous cursors.
  You make some good points about there already being many ways to
  overwhelm RAM with web APIs, but is there any place we make it so
  easy?  You're right that just because it's a database doesn't mean it
  needs to be huge, but oftentimes they can get quite big.  And if a
  developer doesn't spend time making sure they test their app with the
  upper ends of what users may possibly see, it just seems like this is
  a recipe for problems.
  Here's a concrete example: structured clone allows you to store image
  data.  Let's say I'm building an image hosting site and that I cache
  all the images along with their thumbnails locally in an IndexedDB
  entity store.  Let's say each thumbnail is a trivial amount, but each
  image is 1MB.  I have an album with 1000 images.  I do |var photos =
  albumIndex.getAllObjects(albumName);|
  and then iterate over that to get the thumbnails.  But I've just
  loaded over 1GB of stuff into RAM (assuming no additional
  inefficiency/blowup).  I suppose it's possible JavaScript engines
  could build mechanisms to fetch this stuff lazily (like you could even
  with a synchronous cursor) but that will take time/effort and
  introduce lag in the page (while fetching additional info from disk).
 
  I'm not completely against the idea of getAll/sync cursors, but I do
  think they should be de-coupled from this proposed API.  I would also
  suggest that we re-consider them only after at least one implementation
  has normal cursors working and there's been some experimentation with
  it.  Until then, we're basing most of our arguments on intuition and
  assumptions.

 I'm not married to the concept of sync cursors. However I pretty
 strongly feel that getAll is something we need. If we just allow
 cursors for getting multiple results I think we'll see an extremely
 common pattern of people using a cursor to loop through a result set
 and put values into an array.

 Yes, it can be misused, but I don't see a reason why people wouldn't
 misuse a cursor just as much. If they don't think about the fact that
 a range contains lots of data when using getAll, why would they think
 about it when using cursors?

 Once again, I feel like there is a lot of speculation (more than normal)
 happening here.  I'd prefer we take the Async API without the sync cursors
 or getAll and give the rest of the API some time to bake before considering
 it again.  Ideally by then we'd have at least one or two early adopters that
 can give their perspective on the issue.

If it helps move things forward we can keep getAll out of the spec for
now. I still think that Mozilla will keep the implementation, though, so
as to allow people to experiment with it. This will also allow us to
guess less 

Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Kris Zyp

On 6/9/2010 4:27 PM, Jonas Sicking wrote:
 On Wed, Jun 9, 2010 at 11:39 AM, Laxmi Narsimha Rao Oruganti
 laxmi.oruga...@microsoft.com wrote:
 Inline...

 -Original Message-
 From: public-webapps-requ...@w3.org
[mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking
 Sent: Wednesday, June 09, 2010 11:55 PM
 To: Jeremy Orlow
 Cc: Shawn Wilsher; Webapps WG
 Subject: Re: [IndexDB] Proposal for async API changes

 On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org
 wrote:
 I'm not sure I like the idea of offering sync cursors either since the
 UA
 will either need to load everything into memory before starting or risk
 blocking on disk IO for large data sets.  Thus I'm not sure I support
 the
 idea of synchronous cursors.  But, at the same time, I'm concerned
about
 the
 overhead of firing one event per value with async cursors.  Which is
 why I
 was suggesting an interface where the common case (the data is in
 memory) is
 done synchronously but the uncommon case (we'd block if we had to
 respond
 synchronously) has to be handled since we guarantee that the first time
 will
 be forced to be asynchronous.
 Like I said, I'm not super happy with what I proposed, but I think some
 hybrid async/sync interface is really what we need.  Have you guys
spent
 any
 time thinking about something like this?  How dead-set are you on
 synchronous cursors?

 The idea is that synchronous cursors load all the required data into
 memory, yes. I think it would help authors a lot to be able to load
 small chunks of data into memory and read and write to it
 synchronously. Dealing with asynchronous operations constantly is
 certainly possible, but a bit of a pain for authors.

 I don't think we should obsess too much about not keeping things in
 memory; we already have things like canvas and the DOM, which add up
 to non-trivial amounts of memory.

 Just because data is loaded from a database doesn't mean it's huge.

 I do note that you're not as concerned about getAll(), which actually
 has worse memory characteristics than synchronous cursors since you
 need to create the full JS object graph in memory.

 I've been thinking about this off and on since the original proposal was
 made, and I just don't feel right about getAll() or synchronous cursors.
 You make some good points about there already being many ways to overwhelm
 RAM with web APIs, but is there any place we make it so easy?  You're right
 that just because it's a database doesn't mean it needs to be huge, but
 oftentimes they can get quite big.  And if a developer doesn't spend time
 making sure they test their app with the upper ends of what users may
 possibly see, it just seems like this is a recipe for problems.
 Here's a concrete example: structured clone allows you to store image data.
 Let's say I'm building an image hosting site and that I cache all the images
 along with their thumbnails locally in an IndexedDB entity store.  Let's say
 each thumbnail is a trivial amount, but each image is 1MB.  I have an album
 with 1000 images.  I do |var photos = albumIndex.getAllObjects(albumName);|
 and then iterate over that to get the thumbnails.  But I've just loaded over
 1GB of stuff into RAM (assuming no additional inefficiency/blowup).  I
 suppose it's possible JavaScript engines could build mechanisms to fetch
 this stuff lazily (like you could even with a synchronous cursor) but that
 will take time/effort and introduce lag in the page (while fetching
 additional info from disk).

 I'm not completely against the idea of getAll/sync cursors, but I do think
 they should be de-coupled from this proposed API.  I would also suggest that
 we re-consider them only after at least one implementation has normal
 cursors working and there's been some experimentation with it.  Until then,
 we're basing most of our arguments on intuition and assumptions.

 I'm not married to the concept of sync cursors. However I pretty
 strongly feel that getAll is something we need. If we just allow
 cursors for getting multiple results I think we'll see an extremely
 common pattern of people using a cursor to loop through a result set
 and put values into an array.

 Yes, it can be misused, but I don't see a reason why people wouldn't
 misuse a cursor just as much. If they don't think about the fact that
 a range contains lots of data when using getAll, why would they think
 about it when using cursors?

 [Laxmi] A cursor is a streaming operator: only the current row
 or page is available in memory and the rest sits on disk.  As the
 program moves the cursor through the result, old pages are thrown away and
 new pages are loaded from the result set.  Whereas with getAll,
 everything has to come into memory before returning to the caller.

Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Shawn Wilsher

On 6/9/2010 3:48 PM, Kris Zyp wrote:

Another option would be to have cursors essentially implement a JS
array-like API:

db.objectStore("foo").openCursor(range).forEach(function(object){
   // do something with each object
}).onsuccess = function(){
  // all done
};

(Or perhaps the cursor with a forEach would be nested inside a
callback, not sure).

The standard some function is also useful if you know you probably
won't need to iterate through everything:

db.objectStore("foo").openCursor(range).some(function(object){
   return object.name == "John";
}).onsuccess = function(johnIsInDatabase){
  if(johnIsInDatabase){
    ...
  }
};

This allows us to have an async interface (the callbacks can be called
at any time) and still follows normal JS array patterns, for
programmer convenience (so programmers wouldn't need to iterate over a
cursor and push the results into another array). I don't think anyone
would miss getAll() with this design, since cursors would already be
array-like.
To me, this feels like we are basically doing what we expect a library 
to do: make the syntactic sugar work.  I don't see why a library 
couldn't provide a some or forEach method with the currently proposed API.
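
For what it's worth, a minimal sketch of such a library shim over the
proposed cursor API (assuming openCursor() returns a request whose success
events carry the cursor, as in Jonas's earlier example):

function forEachRecord(cursorRequest, callback, done) {
  cursorRequest.onsuccess = function(e) {
    var cursor = e.result;
    if (!cursor) {          // a null cursor signals the end of the range
      if (done) done();
      return;
    }
    callback(cursor.value);
    cursor.continue();      // queues the next success event
  };
}

// usage, mirroring the forEach example above:
forEachRecord(db.objectStore("foo").openCursor(range),
              function(object) { /* do something with each object */ },
              function() { /* all done */ });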


Cheers,

Shawn





Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Jonas Sicking
On Wed, Jun 9, 2010 at 3:36 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 On Wed, Jun 9, 2010 at 3:27 PM, Jonas Sicking jo...@sicking.cc wrote:
 I'm well aware of this. My argument is that I think we'll see people
 write code like this:

 results = [];
 db.objectStore("foo").openCursor(range).onsuccess = function(e) {
  var cursor = e.result;
  if (!cursor) {
    weAreDone(results);
    return;
  }
  results.push(cursor.value);
  cursor.continue();
 }

 While the indexedDB implementation doesn't hold much data in memory at
 a time, the webpage will hold just as much as if we had had a getAll
 function. Thus we haven't actually improved anything, only forced the
 author to write more code.


 Put it another way: The raised concern is that people won't think
 about the fact that getAll can load a lot of data into memory. And the
 proposed solution is to remove the getAll function and tell people to
 use openCursor. However if they weren't thinking about that a lot of
 data will be in memory at one time, then why wouldn't they write code
 like the above? Which results in just as much data being in memory?

 At the very least, explicitly loading things into an honest-to-god
 array can make it more obvious that you're eating memory in the form
 of a big array, as opposed to a magical "transform my blob of
 data into something more convenient" operation.

I don't fully understand this. getAll also returns an honest-to-god array.

 (That said, I dislike cursors and explicitly avoid them in my own
 code.  In the PHP db abstraction layer I wrote for myself, every query
 slurps the results into an array and just returns that - I don't give
 myself any access to the cursor at all.  I probably like this better
 simply because I can easily foreach through an array, while I can't do
 the same with a cursor unless I write some moderately more complex
 code.  I hate using while loops when foreach is beckoning to me.)

This is what I'd expect many/most people to do.

/ Jonas



Re: [IndexDB] Proposal for async API changes

2010-06-09 Thread Shawn Wilsher

On 6/9/2010 3:36 PM, Tab Atkins Jr. wrote:

At the very least, explicitly loading things into an honest-to-god
array can make it more obvious that you're eating memory in the form
of a big array, as opposed to a magical "transform my blob of
data into something more convenient" operation.
I'm sorry, but if a developer can't figure out that the big array they
are given (a proper Array in JavaScript) is the cause of large amounts
of memory usage, I don't see how populating it themselves is going to
raise any additional flags.


Cheers,

Shawn





Re: [IndexDB] Proposal for async API changes

2010-05-20 Thread Andrei Popescu
Hi Jonas,


 A draft of the proposed API is here:

 http://docs.google.com/View?id=dfs2skx2_4g3s5f857


As someone new to this API, I thought the naming used in the current
draft is somewhat confusing. Consider the following interfaces:

IndexedDatabase
IndexedDatabaseRequest,
IDBDatabaseRequest,
IDBDatabase,
IDBRequest

Just by looking at this, it is pretty hard to understand what the
relationship between these interfaces really is and what roles they
play in the API. For instance, I thought that IDBDatabaseRequest
was some type of Request when, in fact, it isn't a Request at all. It
also isn't immediately obvious what the difference between
IndexedDatabase and IDBDatabase really is, etc.

I really don't want to start a "color of the bikeshed" argument, and I
fully understand how you reached the current naming convention.
However, I thought I'd suggest three small changes that could help
other people understand this API more easily:

- I know we need to keep the IDB prefix in order to avoid collisions
with other APIs. I would therefore think we should keep the IDB prefix
and make sure all the interfaces start with it (right now they don't).
- The Request suffix is now used to denote the asynchronous versions
of the API interfaces. These interfaces aren't actually Requests of
any kind, so I would like to suggest changing this suffix. In fact, if
the primary usage of this API is via its async version, we could even
drop this suffix altogether and just add Sync to the synchronous
versions?
- Some of the interfaces could have names that would more closely
reflect their roles in the API. For instance, IDBDatabase could be
renamed to IDBConnection, since in the spec it is described as a
connection to the database. Likewise, IndexedDatabase could become
IDBFactory since it is used to create database connections or key
ranges.

In any case, I want to make it clear that the current naming works
once one takes the time to understand it. On the other hand, if we
make it easier for people to understand the API, we could hopefully
get feedback from more developers.

Thanks,
Andrei



Re: [IndexDB] Proposal for async API changes

2010-05-19 Thread Jeremy Orlow
On Tue, May 18, 2010 at 2:15 AM, Jonas Sicking jo...@sicking.cc wrote:

 A draft of the proposed API is here:

 http://docs.google.com/View?id=dfs2skx2_4g3s5f857


I just noticed another nit.  Your proposal says |interface IDBIndex { };
// Unchanged| but the spec's IDBIndex interface includes |readonly attribute
DOMString storeName;|, which is the owning object store's name.  This
attribute is probably no longer necessary now that indexes hang off of
objectStores (and thus it's pretty clear which objectStore an index is
associated with).

J


Re: [IndexDB] Proposal for async API changes

2010-05-18 Thread Jonas Sicking
On Tue, May 18, 2010 at 7:20 AM, Jeremy Orlow jor...@chromium.org wrote:
 Overall, I'm pretty happy with these changes.  I support making these
 changes to the spec.  Additional comments inline...
 On Tue, May 18, 2010 at 2:15 AM, Jonas Sicking jo...@sicking.cc wrote:

 Hi All,

 I, together with Ben Turner and Shawn Wilsher have been looking at the
 asynchronous API defined in the IndexDB specification and have a set
 of changes to propose. The main goal of these changes is to simplify
 the API that we expose to authors, making it easier for them to work
 with. Another goal has been to reduce the risk that authors misuse the
 API and use long running transactions. Finally, it has been a goal to
 reduce the risk of situations that can race.

 It has explicitly not been a goal to simplify the implementation. In
 some cases it is definitely harder to implement the proposed API.
 However, we believe that the extra complexity in implementation is
 outweighed by simplicity for users of the API.

 The main changes are:

 1. Once a database has been opened (a database connection has been
  established) read access to metadata, such as objectStore and index
  names, is synchronous. Changes to such metadata, such as creating
  objectStores and indexes, are still asynchronous.

 I believe this is already how it's specced.  The IDBDatabase interface
 already gives you synchronous access to all of this.

The big difference is that the current spec makes openObjectStore()
and openIndex() asynchronous. Our proposal makes openObjectStore() and
openIndex() (renamed objectStore() and index()) synchronous. So
opening an object store, or even starting a transaction, only
synchronously accesses metadata. But any requests you make on the
transaction will be held until the transaction has managed to grab the
requested tables.

So when you, in our proposal, call:

db.objectStore("foo", READ_WRITE).put(...);

the objectStore function synchronously creates a transaction object
representing a transaction which only contains the foo objectStore.
The implementation then fires off an asynchronous request to lock the
foo objectStore with a write lock. It then returns the synchronously
created transaction object.

When the put() function is called, the implementation notices that the
lock is not yet acquired. So it simply records what information should
be written to the object store.

Later, when the write lock is successfully acquired, the
implementation executes the recorded operations and, once they finish,
fires their callbacks.
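
A rough sketch of that deferred-execution model (all names below are
invented for illustration; acquireLock and writeRecord stand in for the
engine's internal lock manager and storage layer):

function TransactionSketch(storeNames, mode) {
  var pending = [];   // operations recorded before the lock is held
  var locked = false;

  // fired off asynchronously when the transaction object is created
  acquireLock(storeNames, mode, function() {
    locked = true;
    pending.forEach(function(op) { op(); });  // replay recorded operations
    pending = [];
  });

  this.put = function(value, key) {
    var op = function() { writeRecord(storeNames[0], value, key); };
    if (locked) op();
    else pending.push(op);  // record until the write lock is acquired
  };
}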

 9. IDBKeyRanges are created using functions on IndexedDatabaseRequest.
 We couldn't figure out how the old API allowed you to create a range
 object without first having a range object.

 In the spec, I see the following in examples:
 var range = new IDBKeyRange.bound(2, 4);
 and
 var range = IDBKeyRange.leftBound(key);
 I'm not particularly happy with hanging functions off of
 IndexedDatabaseRequest for this.  Can it work something like what I listed
 above?  If not, maybe we can find a better place to put them?  Or just
 create multiple openCursor functions for each case?

Mostly we were just confused as to what syntax was actually proposed.
You are listing two syntaxes (with and without 'new'), neither of
which match the WebIDL in the spec. I personally think that most
proposed syntaxes are ok and don't care much which one we choose, as
long as it's clearly defined.

 10. You are allowed to have multiple transactions per database
 connection. However if they use overlapping tables, only the first one
 will receive events until it is finished (with the usual exceptions of
 allowing multiple readers of the same table).

 Can you please clarify what you mean here?  This seems like simply an
 implementation detail to me, so maybe I'm missing something?

The spec currently explicitly forbids having multiple transactions
per database connection. The syntax doesn't even support it since
there is a .currentTransaction property on IDBDatabase. I.e. the
following code seems forbidden (though I'm not sure if forbidden means
that an exception will be thrown somewhere, or if it means that the
code will just silently fail to work).

request = indexedDB.openDatabase("myDB", ...);
request.onsuccess = function() {
  db = request.result;
  r1 = db.openTransaction(["foo"]);
  r1.onsuccess = function() { ... };
  r2 = db.openTransaction(["bar"]);
  r2.onsuccess = function() { ... };
};

The spec says that the above is forbidden. In our new proposal the
following would be allowed:

request = indexedDB.openDatabase("myDB", ...);
request.onsuccess = function() {
  db = request.result;
  t1 = db.transaction(["foo"], READ_WRITE);
  t2 = db.transaction(["bar"], READ_WRITE);
};

And would allow the two transactions to run concurrently.

 A draft of the proposed API is here:

 http://docs.google.com/View?id=dfs2skx2_4g3s5f857

 Comments:
 1)  IDBRequest.abort() needs to be able to raise an exception.

I think in general we haven't indicated