RE: [IndexedDB] Closing on bug 9903 (collations)

2011-06-17 Thread Pablo Castro

From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On 
Behalf Of Keean Schupke
Sent: Tuesday, May 31, 2011 11:51 PM

 On 1 June 2011 01:37, Pablo Castro pablo.cas...@microsoft.com wrote:

 -Original Message-
 From: simetri...@gmail.com [mailto:simetri...@gmail.com] On Behalf Of Aryeh 
 Gregor
 Sent: Tuesday, May 31, 2011 3:49 PM

  On Tue, May 31, 2011 at 6:39 PM, Pablo Castro
  pablo.cas...@microsoft.com wrote:
   No, that was poor wording on my part, I keep using locale in the 
   wrong context. I meant to have the API take a proper collation 
   identifier. The identifier can be as specific as the caller wants it to 
   be. The implementation could choose to not honor some specific detail 
   if it can't handle it (to the extent that doing so is allowed by the 
   specification of collation names), or fail because it considers that 
   not handling a particular aspect of the collation identifier would 
   severely deviate from the caller's expectations.
 
  I'm not sure I understand you.  My personal opinion is that there
  should be no undefined behavior here.  If authors are allowed to pass
  collation identifiers, the spec needs to say exactly how they're to be
  interpreted, so the same identifier passed to two different browsers
  will result in the same collation, i.e., the same strings need to sort
  the same cross-browser.  Having only binary collation is better than
  having non-binary collations but not defining them, IMO.
 I thought BCP47 allowed implementations to drop subtags if needed. I just 
 re-read the spec and it seems that it only allows dropping them in constrained 
 cases where you can't fit the whole name in your buffer (which wouldn't 
 apply to the context discussed here). My first instinct is that this is 
 quite a bit to guarantee (full consistency in collation), but it seems that 
 that's what the spec is shooting for.

   Given the amount of debate on this, could we at least agree that we can 
   do binary for v1? We can then have an open item for v2 on taking 
   collation names and sort according to UCA or taking callbacks and such.
 
  I'm okay with supporting only binary to start with.
 Great. I'll still wait a bit to see what other folks think, and then update 
 the bug one way or the other.

 Thanks
 -pablo

 The discussion sounds like it is headed in the right direction. Are there 
 any issues with non-Unicode encodings that need to be dealt with (HTTP 
 headers default to ISO-8859, I think)? Would people be expected to convert on 
 read into UTF-16 strings or use typed arrays?

I asked around here and folks actually pointed out that the JavaScript spec 
seems to be describing exactly what we needed. Looking at [1], section 11.8.5, 
the relevant fragment starting at step 4 goes:

Else, both px and py are Strings
a. If py is a prefix of px, return false. (A String value p is a prefix of 
String value q if q can be the result of concatenating p and some other String 
r. Note that any String is a prefix of itself, because r may be the empty 
String.)
b. If px is a prefix of py, return true.
c. Let k be the smallest nonnegative integer such that the character at 
position k within px is different from the character at position k within py. 
(There must be such a k, for neither String is a prefix of the other.)
d. Let m be the integer that is the code unit value for the character at 
position k within px.
e. Let n be the integer that is the code unit value for the character at 
position k within py.
f. If m < n, return true. Otherwise, return false.

It also has a note below indicating:

NOTE 2 The comparison of Strings uses a simple lexicographic ordering on 
sequences of code unit values. There is no attempt to use the more complex, 
semantically oriented definitions of character or string equality and collating 
order defined in the Unicode specification. Therefore String values that are 
canonically equal according to the Unicode standard could test as unequal. In 
effect this algorithm assumes that both Strings are already in normalised form. 
Also, note that for strings containing supplementary characters, lexicographic 
ordering on sequences of UTF-16 code unit values differs from that on sequences 
of code point values.

Which is very much in line with what we've been discussing, and has the extra 
feature of being compatible with JavaScript order. 

So it looks like we could reference (or inline) this in the spec and have a 
fully specified order for keys with string content.
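
For illustration, a minimal JavaScript sketch of that algorithm (function and 
variable names are mine, not from the spec; assumes both inputs are plain 
strings, i.e. sequences of UTF-16 code units):

function compareStringKeys(px, py) {
  var len = Math.min(px.length, py.length);
  for (var k = 0; k < len; k++) {
    var m = px.charCodeAt(k); // code unit value at position k in px (step 4.d)
    var n = py.charCodeAt(k); // code unit value at position k in py (step 4.e)
    if (m !== n) return m < n; // step 4.f: true means px sorts before py
  }
  // Steps 4.a/4.b: one string is a prefix of the other. The shorter one
  // sorts first; a string is never "less than" itself.
  return px.length < py.length;
}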

Thoughts? 

Thanks
-pablo

[1] http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf





RE: [IndexedDB] Evictable stores

2011-06-07 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Tuesday, May 31, 2011 5:34 PM

 On Tue, May 31, 2011 at 3:46 PM, Pablo Castro
 pablo.cas...@microsoft.com wrote:
  We discussed evictable stores some time ago and captured it in bug 11350 
  [1], but I haven't seen further discussion on it and it hasn't gone into 
  the spec. I'm curious on where folks are with this? Should we move it to 
  v2? Should we just allow UAs to have their own policy around eviction 
  (back at TPAC it seemed folks had reasonable but different strategies for 
  handling when to allow websites to use storage already).

 I think this is a very interesting feature, but one that I'd prefer to
 move to a version 2 as it isn't a required feature and is one that
 seems easy to retrofit.

 / Jonas

The feature is already captured in the wiki page that tracks future features 
[1]. So I guess we can just resolve the bug as "later".

Jeremy, the bug is currently assigned to you, were you doing work on it or 
should I just resolve it?

Thanks
-pablo

[1] http://www.w3.org/2008/webapps/wiki/IndexedDatabaseFeatures




RE: [IndexedDB] Evictable stores

2011-06-07 Thread Pablo Castro

From: dgro...@google.com [mailto:dgro...@google.com] On Behalf Of David Grogan
Sent: Tuesday, June 07, 2011 1:01 PM

 We (chrome) are still having internal discussions about evictable vs 
 non-evictable storage; we're on board with worrying about this in v2.
 On Tue, May 31, 2011 at 5:33 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, May 31, 2011 at 3:46 PM, Pablo Castro
 pablo.cas...@microsoft.com wrote:
   We discussed evictable stores some time ago and captured it in bug 
   11350 [1], but I haven't seen further discussion on it and it hasn't 
   gone into the spec. I'm curious on where folks are with this? Should we 
   move it to v2? Should we just allow UAs to have their own policy around 
   eviction (back at TPAC it seemed folks had reasonable but different 
   strategies for handling when to allow websites to use storage already).
  I think this is a very interesting feature, but one that I'd prefer to
  move to a version 2 as it isn't a required feature and is one that
  seems easy to retrofit.
 
   / Jonas

Got it. I postponed the bug.




RE: [IndexedDB] Closing on bug 9903 (collations)

2011-05-31 Thread Pablo Castro

-Original Message-
From: simetri...@gmail.com [mailto:simetri...@gmail.com] On Behalf Of Aryeh 
Gregor
Sent: Friday, May 06, 2011 10:05 AM


 On Fri, May 6, 2011 at 5:18 AM, Jonas Sicking jo...@sicking.cc wrote:
  Based on that, my conclusion is that we should go with what Pablo is
  proposing. And I think we should do it for v1.

 If I understand correctly, Pablo's proposal is that the author be
 allowed to specify a locale, and the browser can collate in some
 undefined way based on that locale.  That sounds like a really bad
 idea for interop.  If non-binary collation is supported in a first
 version, it should be either

No, that was poor wording on my part, I keep using locale in the wrong 
context. I meant to have the API take a proper collation identifier. The 
identifier can be as specific as the caller wants it to be. The implementation 
could choose to not honor some specific detail if it can't handle it (to the 
extent that doing so is allowed by the specification of collation names), or 
fail because it considers that not handling a particular aspect of the 
collation identifier would severely deviate from the caller's expectations.

 1) Two choices, binary or UCA 6.0.0.  (AFAIK, UCA gives fairly good
 results for most languages even without tailoring, so it might be just
 fine for v1.  It's vastly better than binary, for sure.)

Given the amount of debate on this, could we at least agree that we can do 
binary for v1? We can then have an open item for v2 on taking collation names 
and sort according to UCA or taking callbacks and such.

 2) In addition to binary and UCA 6.0.0, allow UCA 6.0.0 tailored by
 any of the locales defined by CLDR 1.9.1.

 There also needs to be some thought put into how to handle version
 updates, since browsers cannot update their UCA or CLDR implementation
 without rebuilding all existing indexes that used it (unless they keep
 the old implementation forever).  It might be that browsers should
 just stick to a fixed version for the time being (like 6.0.0 and
 1.9.1), and we might decide that no further APIs are needed now to
 accommodate possible future switches, but at least some thought needs
 to be given to it.

I wonder if the API (independently of when we get to this) should include the 
version either as part of the collation identifier or as a separate argument. 
This would allow UAs to support a version or two for a while, and then phase 
them out as they fall out of use in favor of newer ones.
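
Purely to illustrate the two shapes (hypothetical sketch; neither option was in 
any draft, and the parameter names are made up):

// Version as a separate argument:
store.createIndex("byName", "name",
    { collation: "de-u-co-phonebk", collationVersion: "UCA 6.0.0" });

// ...or folded into the identifier itself, e.g. via a private-use subtag:
store.createIndex("byName", "name",
    { collation: "de-u-co-phonebk-x-uca600" });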

 On consideration, I don't think user-specified sortkey functions are
 necessary at this stage.  If collations are to be identified by
 strings for now, we could always overload the value to accept a
 function at some later date if we wanted to support that.  So I
 wouldn't worry about that further.

I agree.

-pablo



[IndexedDB] Evictable stores

2011-05-31 Thread Pablo Castro
We discussed evictable stores some time ago and captured it in bug 11350 [1], 
but I haven't seen further discussion on it and it hasn't gone into the spec. 
I'm curious on where folks are with this? Should we move it to v2? Should we 
just allow UAs to have their own policy around eviction (back at TPAC it seemed 
folks had reasonable but different strategies for handling when to allow 
websites to use storage already).

Thanks,
-pablo

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=11350




RE: [IndexedDB] Closing on bug 9903 (collations)

2011-05-31 Thread Pablo Castro

-Original Message-
From: simetri...@gmail.com [mailto:simetri...@gmail.com] On Behalf Of Aryeh 
Gregor
Sent: Tuesday, May 31, 2011 3:49 PM

 On Tue, May 31, 2011 at 6:39 PM, Pablo Castro
 pablo.cas...@microsoft.com wrote:
  No, that was poor wording on my part, I keep using locale in the wrong 
  context. I meant to have the API take a proper collation identifier. The 
  identifier can be as specific as the caller wants it to be. The 
  implementation could choose to not honor some specific detail if it can't 
  handle it (to the extent that doing so is allowed by the specification of 
  collation names), or fail because it considers that not handling a 
  particular aspect of the collation identifier would severely deviate from 
  the caller's expectations.

 I'm not sure I understand you.  My personal opinion is that there
 should be no undefined behavior here.  If authors are allowed to pass
 collation identifiers, the spec needs to say exactly how they're to be
 interpreted, so the same identifier passed to two different browsers
 will result in the same collation, i.e., the same strings need to sort
 the same cross-browser.  Having only binary collation is better than
 having non-binary collations but not defining them, IMO.

I thought BCP47 allowed implementations to drop subtags if needed. I just 
re-read the spec and it seems that it only allows dropping them in constrained 
cases where you can't fit the whole name in your buffer (which wouldn't apply 
to the context discussed here). My first instinct is that this is quite a bit 
to guarantee (full consistency in collation), but it seems that that's what the 
spec is shooting for. 

  Given the amount of debate on this, could we at least agree that we can do 
  binary for v1? We can then have an open item for v2 on taking collation 
  names and sort according to UCA or taking callbacks and such.

 I'm okay with supporting only binary to start with.

Great. I'll still wait a bit to see what other folks think, and then update the 
bug one way or the other.

Thanks
-pablo



[IndexedDB] Closing on bug 9903 (collations)

2011-04-29 Thread Pablo Castro
We've had quite a bit of debate on this but I don't think we've reached 
closure. At this point I would be fine with either one of a) postpone to v2 and 
agree that for now we'll just do binary collation everywhere or b) the last 
form of the proposal sent around: extra collation argument (following BCP47 
plus whatever the UA wants to allow) in createObjectStore/createIndex, plus a 
collation property to interrogate it; no way to change the collation of a 
store/index once created.

Given that this turned out to be a more elaborate topic than I had originally 
expected and that it doesn't seem to have a lot of traction right now, my 
preference would be to postpone to v2. Thoughts? Once we make a call I'll make 
sure the spec reflects it.

Thanks
-pablo




[IndexedDB] Exceptions in IDB and the DOMException

2011-04-20 Thread Pablo Castro
This came up today that I didn't remember having a conversation about it with 
folks.

We currently have IDBDatabaseException with some error codes as constants and 
code/message properties. Looking at DOMException as defined in DOM Core [1], it 
turns out that a) the pattern of the class is identical, but instead of 
code/message it has code/name, and b) there are some errors present in both or 
that are very close (e.g. NOT_FOUND_ERR, DATA_CLONE_ERR, QUOTA_EXCEEDED_ERR). 

Would it be worth trying to use the constants of DOMException when there's one 
already there that matches the need? If that was the case, would it be the 
constants that we would be reusing, or would we have to throw a DOMException 
instead of an IDBDatabaseException?

Separately, in reference to a) above, should we change 
IDBDatabaseException.message to IDBDatabaseException.name for consistency?

Thanks
-pablo

[1] http://www.w3.org/TR/2010/WD-domcore-20101007/#exception-domexception




RE: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-05 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Keean Schupke
Sent: Monday, April 04, 2011 10:17 PM

 Something like RelationalDB gives you the power of a relational-db with no 
 dependence on a specific implementation of SQL, so it would be compatible 
 enough for the web.  It fixes all the problems with the standardisation of 
 WebSQL that have been talked about so far.  I think it would find no 
 technical issues that block its standardisation.  As a high level DB API it 
 does not need all the low-level features of IndexedDB, so its API can be 
 much simpler and cleaner. RelationalDB can at least be provided as a library 
 on top of IndexedDB, and it can use WebSQL where it is supported. My concern 
 with the library approach is performance when implemented on top of 
 IndexedDB.

The goal of IndexedDB has always been to enable things like RelationalDB and 
CouchDB to be built on top, while maintaining a reasonable level of 
functionality for those that wanted to use it directly. I really like the idea 
of thinking of RelationalDB as something that's built as a library on top of 
IndexedDB. Are there specific tweaks we can make to IndexedDB so it can be a 
good lower-layer for RelationalDB, such that RelationalDB could be built as a 
pure JavaScript library?

Thanks
-pablo




RE: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Thursday, March 31, 2011 11:36 AM

 I can find a lot of stuff on collation, but not a lot about why it could not 
 be done in a library. Could you summarise the reasons why this needs to be 
 core functionality for me?

 Sorry, but that stuff is paged out of my brain.  Pablo, can you?
 
 A library could chose to use an object store as meta-data to store the 
 collation orders that it is using for various indexes for example.

- Currently there are no APIs in JavaScript to compare strings using specific 
collations. There are folks looking into this, but it will need time.
- I'm far from an expert in the topic, but from talking to folks that 
understand this well, it seems that implementing this entirely in JavaScript 
would mean you have to download collation tables and apply them as needed in 
callbacks. Not only does this mean a hit in download size/time for the app, but 
also that callbacks have to either download stuff or inline collation 
rules/tables in the callback itself. 
- In pure practical terms, I suspect the 80% scenario can be covered by 
implementing this natively, having it be fast and simple to use for common 
cases. Not pushing back on the callback stuff, just saying that I find it 
valuable to have users simply say "en-US" and get what they wanted.
- Also from the practical perspective, simple cases that don't require the 
flexibility can avoid having to keep the callbacks perfectly consistent even as 
you roll out updates that may hit only some of the pages, use components 
written by someone else, etc.
- By default we would still do binary collation (there was a question in the 
thread, I forget exactly where).

Thanks
-pablo




RE: [IndexedDB] Spec changes for international language support

2011-03-22 Thread Pablo Castro

From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On 
Behalf Of Keean Schupke
Sent: Friday, March 18, 2011 8:17 PM

 On 18 March 2011 19:29, Pablo Castro pablo.cas...@microsoft.com wrote:

 From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On 
 Behalf Of Keean Schupke
 Sent: Friday, March 18, 2011 1:53 AM

  See my proposal in another thread. The basic idea is to copy BDB. Have a 
  primary index that is based on an integer, something primitive and fast. 
  Allow secondary indexes which use a callback to generate a binary index 
  key. IDB shifts the complexity out into a library. Common use cases can 
  be provided (a hash of all fields in the object, internationalised 
  bidirectional lexicographic etc...), but the user is free to write their 
  own for less usual cases (for example indexing by the last word in a name 
  string to order by surname).
I agree with Jeremy's comments on the other thread for this. Having the 
callback mechanism definitely sounds interesting but there are a ton of common 
cases that we can solve by just taking a language identifier, I'm not sure we 
want to make people work hard to get something that's already supported in most 
systems. The idea of having a callback to compute the index value feels 
incremental to this, so we could take on it later on without disrupting the 
explicit international collation stuff.

 The idea would be to provide pre-defined implementations of the callback for 
 common use cases, then it is just as simple to register a callback as set 
 any other option. All this means to the API is you pass a function instead 
 of a string. It also is better for modularity as all the code relating to 
 the sort order is kept in the callback functions.

 The difference comes down to something like:

 index.set_order_lexicographic('us');

 vs

 index.set_order_method(order_lexicographic('us'));

 So more than just setting a property like the first case, where presumably 
 all the ordering code is mixed in with the indexing code, the second case 
 encapsulates all the ordering code in the function returned from the 
 execution of order_lexicographic('us'). This function would represent a 
 mapping from the object being indexed to a binary blob that is the actual 
 stored index data.

 So doing it this way does not necessarily make things harder, and it 
 improves encapsulation, the type-safety, and the flexibility of the API.

Yep, we talked about supporting callbacks already in the other threads and in 
this one. As I mentioned before, I think this is incremental to the basic 
feature of taking a collation name. I do realize you can just pass a 
pre-implemented function, but that opens the door to a bunch of things we'd 
need to handle, including possibly storing code in the database (such that 
proper updates don't depend on each page re-registering all the index 
callbacks), handling scripts with the appropriate context to run during index 
updates, etc. I would much rather have basic functionality in place and then 
expand as needed once we have users using the API.

Thanks
-pablo




RE: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Pablo Castro

From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On 
Behalf Of Keean Schupke
Sent: Friday, March 18, 2011 1:53 AM

 See my proposal in another thread. The basic idea is to copy BDB. Have a 
 primary index that is based on an integer, something primitive and fast. 
 Allow secondary indexes which use a callback to generate a binary index key. 
 IDB shifts the complexity out into a library. Common use cases can be 
 provided (a hash of all fields in the object, internationalised 
 bidirectional lexicographic etc...), but the user is free to write their own 
 for less usual cases (for example indexing by the last word in a name string 
 to order by surname).

I agree with Jeremy's comments on the other thread for this. Having the 
callback mechanism definitely sounds interesting but there are a ton of common 
cases that we can solve by just taking a language identifier, I'm not sure we 
want to make people work hard to get something that's already supported in most 
systems. The idea of having a callback to compute the index value feels 
incremental to this, so we could take on it later on without disrupting the 
explicit international collation stuff.

 On 18 March 2011 02:19, Jonas Sicking jo...@sicking.cc wrote:
 2011/3/17 Pablo Castro pablo.cas...@microsoft.com:
 
  From: Jonas Sicking [mailto:jo...@sicking.cc]
  Sent: Tuesday, March 08, 2011 1:11 PM
 
  All in all, is there anything preventing adding the API Pablo suggests
  in this thread to the IndexedDB spec drafts?
 
  I wanted to propose a couple of specific tweaks to the initial proposal 
  and then unless I hear pushback start editing this into the spec.
 
  From reading the details on this thread I'm starting to realize that 
  per-database collations won't do it. What did it for me was the example 
  that has a fuzzier matching mode (case/accent insensitive). This is 
  exactly the kind of index I would want to sort people's names in my 
  address book, but most likely not the index I'll want to use for my 
  primary key.
 
  Refactoring the API to accommodate for this would mean to move the 
  setCollation() method and the collation property to the object store and 
  index objects. If we were willing to live without the ability to change 
  them we could take collation as one of the optional parameters to 
  createObjectStore()/createIndex() and reduce a bit of surface area...
 Unfortunately I think you bring up good use cases for
 per-objectStore/index collations. It's definitely tempting to just add
 it as a optional parameter to createObjectStore/createIndex. The
 downside is obviously pushing more complexity onto web developers.
 Complexity which will be duplicated across sites.

 However there is another problem to consider here. Can switching
 collation on an objectStore or a unique index affect its validity?
 I.e. if you switch from a case sensitive to a case insensitive
 collation, does that mean that if you have two entries with the
 primary keys "Sweden" and "sweden" they collide, and thus the change of
 collation must result in an error (or aborted transaction)?

 I do seem to recall that there are ways to do at least case
 sensitivity such that you generally don't take case into account when
 sorting, unless two entries are exactly the same, in which case you do
 look at casing to differentiate them. However I don't really know a
 whole lot about this and so defer to people that know
 internationalization better.

This is a good point. It makes me lean toward not allowing changing the 
collation of an index or store. That means we could just have an optional 
parameter (in the generic parameter object thingy we have now) on 
createObjectStore and createIndex that indicates the collation name. It seems 
minimally disruptive, it doesn't tax people that don't care about it, and since 
there is no setCollation we don't have the problem of not being able to 
re-index the data.
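
Sketched against the current shape of the API (illustrative only; the option 
name and the accepted values were not settled):

// Collation fixed at creation time via the optional-parameters object:
var store = db.createObjectStore("contacts", { keyPath: "id", collation: "en-US" });
var index = store.createIndex("byName", "name", { collation: "de-DE-u-co-phonebk" });
// With no setCollation(), existing data never needs to be re-indexed
// under a different order.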

  Another piece of feedback I heard consistently as I discussed this with 
  various folks at Microsoft is the need to be able to pick up what the UA 
  would consider the collation that's most appropriate for the user 
  environment (derived from settings, page language or whatever). We could 
  support this by introducing a special value that you can pass to 
  setCollation that indicates "pick whatever is right for the 
  environment's language right now". Given that there is no other way for 
  people to discover the user preference on this, I think this is pretty 
  important.
 I would be fine with this as long as it's an explicit opt-in. There is
 definitely a risk that people will do this and then only do testing in
 one language, but it seems to me like a useful use case to support,
 and I don't see a way of supporting this while completely avoiding the
 risk of internationalization bugs.

I agree, it should be opt-in. I still assume we'll default to binary collation 
(same if you specify the collation value as null). I was reading the BCP 47 [1] 
and in section 4.1 "Choice of Language Tag" the item #7 seems to describe what 
we're looking for. The value "i-default" seems to match our needs close enough, 
so callers could use that value. Discoverability is not great, but we avoid 
having to specify something new, and arguably they'll need to read somewhere 
that this argument is a BCP47-compatible value, and we could put a comment 
about i-default right there.

RE: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Friday, March 18, 2011 1:57 PM

  However there is another problem to consider here. Can switching
  collation on an objectStore or a unique index affect its validity?
  I.e. if you switch from a case sensitive to a case insensitive
  collation, does that mean that if you have two entries with the
  primary keys "Sweden" and "sweden" they collide, and thus the change of
  collation must result in an error (or aborted transaction)?
 
  I do seem to recall that there are ways to do at least case
  sensitivity such that you generally don't take case into account when
  sorting, unless two entries are exactly the same, in which case you do
  look at casing to differentiate them. However I don't really know a
  whole lot about this and so defer to people that know
  internationalization better.
 
  This is a good point. It makes me lean toward not allowing changing the 
  collation of an index or store. That means we could just have an optional 
  parameter (in the generic parameter object thingy we have now) on 
  createObjectStore and createIndex that indicates the collation name. It 
  seems minimally disruptive, it doesn't tax people that don't care about 
  it, and since there is no setCollation we don't have the problem of not 
  being able to re-index the data.

 So there is no way to specify things such that the collation doesn't
 affect unique-ness? If so, I tend to agree.

The problem is that different collations will consider different things unique. 
This is bound to vary across languages and such, so I'm not sure we want to be 
in the business of fine-tuning this. It seems that being a bit more restrictive 
could give a more robust result overall. If someone really needs to change the 
collation they can copy the table manually...not great, but if we think it's a 
corner case it's probably fine.

   Another piece of feedback I heard consistently as I discussed this 
   with various folks at Microsoft is the need to be able to pick up what 
   the UA would consider the collation that's most appropriate for the 
   user environment (derived from settings, page language or whatever). 
   We could support this by introducing a special value that you can 
   pass to setCollation that indicates "pick whatever is right for 
   the environment's language right now". Given that there is no other 
   way for people to discover the user preference on this, I think this 
   is pretty important.
  I would be fine with this as long as it's an explicit opt-in. There is
  definitely a risk that people will do this and then only do testing in
  one language, but it seems to me like a useful use case to support,
  and I don't see a way of supporting this while completely avoiding the
  risk of internationalization bugs.
 
  I agree, it should be opt-in. I still assume we'll default to binary 
  collation (same if you specify the collation value as null). I was reading 
  the BCP 47 [1] and in section 4.1 "Choice of Language Tag" the item #7 
  seems to describe what we're looking for. The value "i-default" seems to 
  match our needs close enough, so callers could use that value. 
  Discoverability is not great, but we avoid having to specify something 
  new, and arguably they'll need to read somewhere that this argument is a 
  BCP47-compatible value, and we could put a comment about i-default right 
  there.

 Sounds good to me. Though you seem to have forgotten to include the
 [1] reference.

Oops, here it goes:
 [1] http://tools.ietf.org/html/bcp47
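
In usage terms, the special value would work something like this (hypothetical 
sketch, reusing the setCollation naming from earlier in the thread):

// Explicit opt-in to whatever the environment considers right:
db.setCollation("i-default");

// ...versus a fixed, author-chosen collation:
db.setCollation("sv-SE");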





RE: Indexed Database API

2011-03-17 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Tuesday, March 15, 2011 3:08 PM

 Filed: http://www.w3.org/Bugs/Public/show_bug.cgi?id=12310

I'm not sure if this is a lot more valuable than just creating an index over 
whatever index key you want plus the primary key, and then seeking to the 
compound key of the last row in the previous page to resume scanning the next 
page of records. No strong pushback, just not sure this is worth the extra 
method.
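
A rough sketch of what I mean (illustrative names; assumes array keys per the 
compound-key proposal, with the index key being [sortField, primaryKey]):

var pageRows = [], pageSize = 20;
// Seek past the compound key of the last row on the previous page:
var range = IDBKeyRange.lowerBound([lastSortValue, lastPrimaryKey], true);
index.openCursor(range).onsuccess = function (e) {
  var cursor = e.target.result;
  if (cursor && pageRows.length < pageSize) {
    pageRows.push(cursor.value); // accumulate the next page
    cursor.continue();
  }
};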

-pablo




RE: [IndexedDB] Compound and multiple keys

2011-03-08 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Keean Schupke
Sent: Tuesday, March 08, 2011 3:03 PM

 No objections here.

 Keean.

 On 8 March 2011 21:14, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, Mar 7, 2011 at 10:43 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Fri, Jan 21, 2011 at 1:41 AM, Jeremy Orlow jor...@chromium.org wrote:
 
  On Thu, Jan 20, 2011 at 6:29 PM, Tab Atkins Jr. jackalm...@gmail.com
  wrote:
 
  On Thu, Jan 20, 2011 at 10:12 AM, Keean Schupke ke...@fry-it.com wrote:
   Compound primary keys are commonly used afaik.
 
  Indeed.  It's one of the common themes in the debate between natural
  and synthetic keys.
 
  Fair enough.
  Should we allow explicit compound keys?  I.e myOS.put({...}, ['first
  name', 'last name'])?  I feel pretty strongly that if we do, we should
  require this be specified up-front when creating the objectStore.  I.e. 
  add
  some additional parameter to the optional options object.  Otherwise, 
  we'll
  force implementations to handle variable compound keys for just this one
  case, which seems kind of silly.
  The other option is to just disallow them.
 
  After thinking about it a bunch and talking to others, I'm actually leaning
  towards both option A and B.  Although this will be a little harder for
  implementors, it seems like there are solid reasons why some users would
  want to use A and solid reasons why others would want to use B.
  Any objections to us going that route?
 Not from me. If I don't hear objections I'll write up a spec draft and
 attach it here before committing to the spec.

Option A is pretty well understood, I like that one.

For option B, at some point we had a debate on whether when indexing an array 
value we should consider it a single key value or unfold it into multiple 
index records. The first option makes it very similar to A in that an array is 
just a composite value (it is quite a bit more painful to implement...); the 
second option is interesting in that it allows for new scenarios such as 
objects with an array for tags, where you want to look up by tag (even after 
doing options A and B as currently defined, in order to support multiple tags 
you'd need a second store that keeps the tags + key for the objects you want 
to tag). Is there any interest in that scenario?
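
To make the two readings concrete (hypothetical store/index names; the 
unfolding behavior is the open question):

// Array as a single composite key: the whole array is one index record.
store.put({ title: "Note", tags: ["db", "web"] }, 1);
// index on "tags" holds one record: (["db", "web"] -> 1)

// Array unfolded into multiple records: one per element, so the same call
// yields two records, ("db" -> 1) and ("web" -> 1), and a single-tag
// lookup works directly:
tagsIndex.get("db").onsuccess = function (e) { /* the object with key 1 */ };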

Thanks
-pablo
 




RE: [IndexedDB] Spec changes for international language support

2011-02-23 Thread Pablo Castro

From: jungs...@google.com [mailto:jungs...@google.com] On Behalf Of Jungshik 
Shin (???, ???)
Sent: Tuesday, February 22, 2011 2:08 PM


 On Fri, Feb 18, 2011 at 2:34 AM, Bjoern Hoehrmann derhoe...@gmx.net wrote:
 * Pablo Castro wrote:
 We discussed international language support last time at the TPAC and I
 said I'd propose spec text for it. Please find the patch below, the
 changes mirror exactly the proposal described in the bug we have for
 tracking this: http://www.w3.org/Bugs/Public/show_bug.cgi?id=9903
 You should anticipate objections to that; collation is not a property of
 language, for instance, for de-de you typically have dictionary sorting
 and phone book sorting (and of course you have de-de, de-ch, and so
 on, so de alone would be rather meaningless). So far the W3C and the
 IETF have used resource identifiers to specify collations (see XPath 2.0
 and RFC 4790) where the IETF allows shorthands like i;ascii-casemap.

 I agree that simply specifying that 'language' be used without saying what 
 it means is not sufficient. However, your examples (German phonebook vs 
 dictionary) can be covered with the language identifier framework laid out in 
 BCP47 (with the 'u' extension). 

Fair enough. I'll adjust this part of the write-up to discuss this in terms of 
a collation identifier or language identifier.

 I do understand that Microsoft uses an extension of language tags for
 the `CultureInfo` in the .NET Framework, where, say, `de-DE_phoneb` is
 used to refer to german phone book sorting, but BCP 47 does not allow
 for that, 

 There's a way to specify alternate sorting orders (e.g. German phonebook, 
 Chinese pinyin, stroke count, radical-stroke count order, etc) under the BCP 
 47 framework  because it has a mechanism for defining an extension and 
 registering it. The Unicode consortium uses that mechanism to define 'u' 
 extension and a set of subtags that can  be used with 'u'. 
 For instance, German phonebook sorting can be identified with 
 'de-DE-u-co-phonebk'. See 

 https://tools.ietf.org/html/bcp47
 https://tools.ietf.org/html/rfc6067
 http://unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers

 Also, see Bug 9903 comment 6 by Mark Davis for more examples. Well, I'm just 
 copying his comment directly here:


 To add to what Jungshik said, BCP47 defines standard extensions. The 
 extension
 defined by the Unicode consortium
 (http://cldr.unicode.org/index/bcp47-extension) provides for fine-grained

 specifications of collation behavior.
 Examples for German:
 de-u-co-phonebk // phonebook order
 de-u-kn-true // numeric sorting, eg Tom2 comes before Tom12
 de-u-ks-level1 // ignore accents, case differences
 de-u-ks-level2 // ignore case differences
 de-u-ks-level1-kc-true // ignore accents, but not case
 These can be combined, such as:
 de-u-co-phonebk-kn-true-ks-level1-kc-true
 
 neither could you devise a language tag to define something
 like i;ascii-casemap (which simply defines A-Z = a-z).


I'm not sure how specific we want to get on this. In particular, would it be 
better if we specified it all the way (including which extensions UAs need to 
support), or if we used BCP47 as the starting point and allowed UAs to support 
additional extensions as needed?

 I would expect that if browsers offer collations, there would be an 
 interface for that so you can use them in other places, as such it might 
 be wiser to accept something other than a language identifier string. 

 There's an on-going effort to expose a 'rich' set of I18N API to client-side 
 development using Javascript ( 
 http://wiki.ecmascript.org/doku.php?id=strawman:i18n_api : The API used to be 
 much more extensive than now, but has been scaled down significantly to get 
 more browsers on board in its 1st iteration). There we're likely to use BCP 
 47 with the 'u' extension (see above). So, I think it'd be better if IndexedDB 
 matches what ECMAScript plans to do. 

This is interesting, do you know how far along this is?


 I also note that collation often involves equivalence testing, but it
 is not clear from your proposal whether that is the case here. It might
 also be a good idea to clearly spell out interoperability expectations;
 if two implementations support some collation, will they behave the same
 for any and all inputs as far as collation is concerned, or should one
 be prepared for slight differences among implementations?

I think it's more practical to assume that users should be prepared for slight 
differences among implementations.

Thanks
-pablo



RE: [IndexedDB] More questions about IDBRequests always firing (WAS: Reason for aborting transactions)

2011-02-17 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Thursday, February 17, 2011 11:51 AM

 On Thu, Feb 17, 2011 at 11:12 AM, Jonas Sicking jo...@sicking.cc wrote:
 On Thu, Feb 17, 2011 at 11:02 AM, ben turner bent.mozi...@gmail.com wrote:
  Also, what should we do when you enqueue a setVersion transaction and 
  then
  close the database handle?  Maybe an ABORT_ERR there too?
 
  Yeah, that'd make sense to me. Just like if you enque any other
  transaction and then close the db handle.
 
  We don't abort transactions that are already in progress when you call
  db.close()... We just set a flag and prevent further transactions from
  being created.
 Doh! Of course.

 If the setVersion transaction has started then we should definitely
 allow it finish, just like all other transactions. I don't have a
 strong opinion on if we should let the setVersion transaction start if
 it hasn't yet. Seems most consistent to let it, but if there's a
 strong reason not to I could be convinced.

 What if you have two database connections open and both do a setVersion 
 transaction and one calls .close (to yield to the other)?  Neither can start 
 until one or the other actually is closed.  If a database is closed (not 
 just close pending) then I think we need to abort any blocked setVersion 
 calls.  If one is already running, it should certainly be allowed to finish 
 before we close the database.

This sounds reasonable to me (special-case and abort the transaction only for 
blocked setVersion transactions). We should capture it explicitly in the spec; 
it's the kind of little detail that's easy to forget. 

-pc




[IndexedDB] Spec changes for international language support

2011-02-17 Thread Pablo Castro
We discussed international language support last time at the TPAC and I said 
I'd propose spec text for it. Please find the patch below, the changes mirror 
exactly the proposal described in the bug we have for tracking this:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9903

btw - the bug is assigned to Nikunj right now but I think that's just because 
of an editing glitch. Nikunj please let me know if you were working on it, 
otherwise I'll just submit the changes once I hear some feedback from this 
group.

Thanks
-pablo


Left file: \IndexedDB 
Specs\20110217\Speclet_023_IDB_API_Asynchronous_APIs.original.html
Right file: \IndexedDB Specs\20110217\Speclet_023_IDB_API_Asynchronous_APIs.html
copy 6
add 7
<dt>readonly attribute DOMString language</dt>
<dd>
On getting, this attribute MUST return the <a title="database language">language</a>
that is configured in this database for string collation. If no collation has been
configured for a database this value is <code>null</code> and the database will
use binary collation.
</dd>
copy 6
copy 6
add 24
<dt>IDBRequest setLanguage()</dt>
<dd>
<p>
This method changes the <a title="database language">language</a> used by the database
for string collation. Note that this method must only
be called from a <a><code>VERSION_CHANGE</code></a> <a>transaction</a> callback.
</p>
<p class="note">
Changing the language in a database that already contains data
typically involves reading and re-writing the entire database
and thus can be a time consuming operation.
</p>
<dl class="parameters">
<dt>optional DOMString language</dt>
<dd>The language to be used in the database, specified as a language identifier as
described in [[!BCP47]].</dd>
</dl>
<dl class="exception" title="IDBDatabaseException">
<dt>NOT_ALLOWED_ERR</dt>
<dd>This method was not called from a <a><code>VERSION_CHANGE</code></a>
<a>transaction</a> callback.</dd>
<dt>DATA_ERR</dt>
<dd>The language parameter contained a string that was not a valid language
identifier, or was a language identifier not supported by the system.</dd>
</dl>
</dd>
copy 6



Left file: \IndexedDB 
Specs\20110217\Speclet_022_IDB_API_Synchronous_APIs.original.html
Right file: \IndexedDB Specs\20110217\Speclet_022_IDB_API_Synchronous_APIs.html
copy 6
add 7
<dt>readonly attribute DOMString language</dt>
<dd>
On getting, this attribute MUST return the <a title="database language">language</a>
that is configured in this database for string collation. If no collation has been
configured for a database this value is <code>null</code> and the database will
use binary collation.
</dd>
copy 6
copy 6
add 24
<dt>void setLanguage()</dt>
<dd>
<p>
This method changes the <a title="database language">language</a> used by the database
for string collation. Note that this method must only
be called from a <a><code>VERSION_CHANGE</code></a> <a>transaction</a> callback.
</p>
<p class="note">
Changing the language in a database that already contains data
typically involves reading and re-writing the entire database
and thus can be a time consuming operation.
</p>
<dl class="parameters">
<dt>optional DOMString language</dt>
<dd>The language to be used in the database, specified as a language identifier as
described in [[!BCP47]].</dd>
</dl>
<dl class="exception" title="IDBDatabaseException">
<dt>NOT_ALLOWED_ERR</dt>
<dd>This method was not called from a <a><code>VERSION_CHANGE</code></a>
<a>transaction</a> callback.</dd>
<dt>DATA_ERR</dt>
<dd>The language parameter contained a string that was not a valid language
identifier, or was a language identifier not supported by the system.</dd>
</dl>
</dd>
copy 6



Left file: \IndexedDB 
Specs\20110217\Speclet_020_IDB_API_Constructs.original.html
Right file: \IndexedDB Specs\20110217\Speclet_020_IDB_API_Constructs.html
copy 6
add 4
Every <a>database</a> also has a <dfn title="database language">language</dfn> that indicates the
language that should be used for collating strings when comparing keys.
  </p>
  <p>
copy 6
copy 6
delete 1
add 2
value with no need to separate them by type. When comparing a
<code>DOMString</code> with another <code>DOMString</code>, the <a>database
  

RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2011-02-14 Thread Pablo Castro

 From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
 Sent: Sunday, February 06, 2011 12:43 PM

 On Tue, Dec 14, 2010 at 4:26 PM, Pablo Castro pablo.cas...@microsoft.com 
 wrote:

 From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
 Sent: Tuesday, December 14, 2010 4:23 PM

  On Wed, Dec 15, 2010 at 12:19 AM, Pablo Castro 
  pablo.cas...@microsoft.com wrote:
 
  From: public-webapps-requ...@w3.org 
  [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking
  Sent: Friday, December 10, 2010 1:42 PM
 
   On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow jor...@chromium.org 
   wrote:
Any more thoughts on this?
  
   I don't feel strongly one way or another. Implementation wise I don't
   really understand why implementations couldn't use keys of unlimited
   size. I wouldn't imagine implementations would want to use fixed-size
   allocations for every key anyway, right (which would be a strong
   reason to keep maximum size down).
  I don't have a very strong opinion either. I don't quite agree with the 
  guideline of having something working slowly is better than not working 
  at all...as having something not work at all sometimes may help 
  developers hit a wall and think differently about their approach for a 
  given problem. That said, if folks think this is an instance where we're 
  better off not having a limit I'm fine with it.
 
  My only concern is that the developer might not hit this wall, but then 
  some user (doing things the developer didn't fully anticipate) could hit 
  that wall.  I can definitely see both sides of the argument though.  And 
  elsewhere we've headed more in the direction of forcing the developer to 
  think about performance, but this case seems a bit more non-deterministic 
  than any of those.
 
 Yeah, that's a good point for this case, avoiding data-dependent errors is 
 probably worth the perf hit.
 
 My current thinking is that we should have some relatively large 
 limit... maybe on the order of 64k?  It seems like it'd be very difficult to 
 hit such a limit with any sort of legitimate use case, and the chances of 
 some subtle data-dependent error would be much less.  But a 1GB key is just 
 not going to work well in any implementation (if it doesn't simply oom the 
 process!).  So despite what I said earlier, I guess I think we should have 
 some limit...but keep it an order of magnitude or two larger than what we 
 expect any legitimate usage to hit just to keep the system as flexible as 
 possible.

 Does that sound reasonable to people?

I thought we were trying to avoid data-dependent errors and thus shooting for 
having no limit (which may translate into having very large limits in actual 
implementations but not the kind of thing you'd typically hit).  

Specifying an exact size may be a bit weird... I guess an alternative could be 
to spec the minimum size UAs need to support. A related problem is what units 
this is specified in: if it's bytes, then developers need to make assumptions 
about how strings are stored or something.
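
To illustrate (plain JavaScript; the UTF-8 byte count uses the common 
encodeURIComponent idiom, nothing from the spec):

var key = "clé";                 // 3 UTF-16 code units
var utf16Bytes = key.length * 2; // 6 bytes if the limit counts UTF-16 units
var utf8Bytes = unescape(encodeURIComponent(key)).length; // 4 bytes in UTF-8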

-pablo
 



RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2011-02-14 Thread Pablo Castro
(sorry for my out-of-order previous email on this thread; please see below for 
an actually up-to-date reply)

-Original Message-
From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Monday, February 07, 2011 3:31 PM

On Mon, Feb 7, 2011 at 3:07 PM, Jeremy Orlow jor...@chromium.org wrote:
 On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow jor...@chromium.org
  wrote:
   On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher sdwi...@mozilla.com
   wrote:
  
   On 2/6/2011 12:42 PM, Jeremy Orlow wrote:
  
   My current thinking is that we should have some relatively large
    limit... maybe on the order of 64k?  It seems like it'd be very
   difficult
   to
   hit such a limit with any sort of legitimate use case, and the
   chances
   of
   some subtle data-dependent error would be much less.  But a 1GB key
   is
   just
   not going to work well in any implementation (if it doesn't simply
   oom
   the
   process!).  So despite what I said earlier, I guess I think we
   should
   have
   some limit...but keep it an order of magnitude or two larger than
   what
   we
   expect any legitimate usage to hit just to keep the system as
   flexible
   as
   possible.
  
   Does that sound reasonable to people?
  
   Are we thinking about making this a MUST requirement, or a SHOULD?
    I'm
   hesitant to spec an exact size as a MUST given how technology has a
   way
   of
   changing in unexpected ways that makes old constraints obsolete.
    But
   then,
   I may just be overly concerned about this too.
  
   If we put a limit, it'd be a MUST for sure.  Otherwise people would
   develop
   against one of the implementations that don't place a limit and then
   their
   app would break on the others.
   The reason that I suggested 64K is that it seems outrageously big for
   the
   data types that we're looking at.  But it's too small to do much with
   base64
   encoding binary blobs into it or anything else like that that I could
   see
   becoming rather large.  So it seems like a limit that'd avoid major
   abuses
   (where someone is probably approaching the problem wrong) but would
   not
   come
   close to limiting any practical use I can imagine.
   With our architecture in Chrome, we will probably need to have some
   limit.
    We haven't decided what that is yet, but since I remember others
   saying
   similar things when we talked about this at TPAC, it seems like it
   might
   be
   best to standardize it--even though it does feel a bit dirty.
 
  One problem with putting a limit is that it basically forces
  implementations to use a specific encoding, or pay a hefty price. For
  example if we choose a 64K limit, is that of UTF8 data or of UTF16
  data? If it is of UTF8 data, and the implementation uses something
  else to store the date, you risk having to convert the data just to
  measure the size. Possibly this would be different if we measured size
  using UTF16 as javascript more or less enforces that the source string
  is UTF16 which means that you can measure utf16 size on the cheap,
  even if the stored data uses a different format.
 
  That's a very good point.  What's your suggestion then?  Spec unlimited
  storage and have non-normative text saying that
  most implementations will
  likely have some limit?  Maybe we can at least spec a minimum limit in
  terms
  of a particular character encoding?  (Implementations could translate
  this
  into the worst case size for their own native encoding and then ensure
  their
  limit is higher.)

 I'm fine with relying on UTF16 encoding size and specifying a 64K
 limit. Like Shawn points out, this API is fairly geared towards
 JavaScript anyway (and I personally don't think that's a bad thing).
 One thing that I just thought of is that even if implementations use
 other encodings, you can in the vast majority of cases do a worst-case
 estimate and easily see that the key that is used is below 64K.

 That said, does having a 64K limit really help anyone? In SQLite we
 can easily store vastly more than that, enough that we don't have to
 specify a limit. And my understanding is that in the Microsoft
 implementation, the limits for what they can store without resorting
 to various tricks, is much lower. So since that implementation will
 have to implement special handling of long keys anyway, is there a
 difference between saying a 64K limit vs. saying unlimited?

 As I explained earlier: The reason that I suggested 64K is that it seems
 outrageously big for the data types that we're looking at.  But it's too
 small to do much with base64 encoding binary blobs into it or anything else
 like that that I could see becoming rather large.  So it seems like a limit
 that'd avoid major abuses (where someone is probably approaching 

RE: [IndexedDB] Reason for aborting transactions

2011-02-09 Thread Pablo Castro
(back!)

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Wednesday, February 09, 2011 6:47 PM

 On Wed, Feb 9, 2011 at 5:54 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Feb 9, 2011 at 5:43 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Wed, Feb 9, 2011 at 5:37 PM, ben turner bent.mozi...@gmail.com wrote:
 
   Normal exceptions have error messages that are not consistent across
   implementations and are not localized.  What's the difference?
 
  These messages aren't part of any exception though, it's just some
  property on a transaction object. (None of our DOM exceptions, IDB or
  otherwise, have message properties btw, they're only converted to some
  message if they make it to the error console).
 
   For stuff like internal errors, they seem especially important.
 
  You're thinking of having multiple messages for the INTERAL_ERROR_ABORT
  code?
 
  I think that'd be ideal, yes.  Since internal errors will be UA specific,
  string matching wouldn't be so bad there.
  If no one likes this idea, I'm happy hiding away the message in some
  webkitAbortMessage attribute so it's super clear it's just us who 
  implements
  this.  (Speaking of which, maybe you guys should do that with getAll.)
 We'll definitely put getAll under a vendor prefix once we drop the
 front door prefix on .indexeddb.

 I'm with Ben here. I'd prefer to hide the message away under a vendor
 prefix (either now or once you drop the front door one) for now to
 gather feedback on how it'll be used.


I'm not sure about this... as I was catching up on the thread I understood this 
more as a debugging helper feature. In the end, if we didn't have this you 
could just have a database-wide error handler and stash errors in a global 
array as they come in, and that's okay for diagnostics. If we want to make it 
easier to just look at the transaction and see what happened, we may as well 
let UAs include a descriptive string so you can really find out on the spot. I 
don't have a strong opinion about excluding (or vendor-prefixing) the property, 
but it seems it would come in handy.
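
Roughly, the workaround I mean (sketch; assumes errors bubble up to a 
database-wide handler, and that requests expose an errorCode as in the current 
drafts):

var errorLog = [];
db.onerror = function (event) {
  // Stash whatever diagnostics are available for later inspection.
  errorLog.push({ code: event.target.errorCode, when: new Date() });
};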

-pablo




RE: [IndexedDB] Do we need a timeout for VERSION_CHANGE?

2010-12-16 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Thursday, December 16, 2010 2:35 AM

In another thread (in the last couple days) we actually decided to remove 
timeouts from normal transactions since they can be implemented as a 
setTimeout+abort.

But I agree that we need a way to abort setVersion transactions before 
getting the callback (so that we can implement timeouts for them as well).  
Unfortunately, I don't immediately have any good ideas on how to do that 
though.
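
For reference, the setTimeout+abort pattern being referred to, in async terms 
(sketch; store name is illustrative, transaction-mode constant as in the drafts 
of the time):

var tx = db.transaction(["store"], IDBTransaction.READ_WRITE);
var timer = setTimeout(function () {
  tx.abort(); // force progress: pending requests fail and onabort fires
}, 5000);
tx.oncomplete = function () { clearTimeout(timer); };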

Sorry, forgot to qualify it: context == sync API. I assume that the sync 
versions of the API will truly block, so setTimeout won't do, as code won't 
just reenter the timeout callback while blocked on a sync IndexedDB call; are 
we all on the same page on that? If that's the case, then I don't think we can 
remove the timeout parameter from the sync versions of transaction() and 
setVersion(). Does that sound reasonable? I'll add them for now; we can adjust 
if somebody comes up with a better approach.

As for setVersion in async...that's actually a problem as well now that I think 
about it because you don't have access to the (version) transaction object 
until it actually was able to start. One option besides having a timeout 
parameter in the method would be to have an abort() method in 
IDBVersionChangeRequest. 
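
That second option might look like this (hypothetical sketch; an abort() on 
IDBVersionChangeRequest is not in any draft):

var req = db.setVersion("2");
req.onsuccess = function (e) { /* upgrade schema inside the version transaction */ };
// If other connections keep the version change blocked, give up eventually:
setTimeout(function () { req.abort(); }, 5000);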

Thanks,
-pablo




[IndexedDB] KeyRange factory methods

2010-12-16 Thread Pablo Castro
I was going to file a bug on this but wanted to make sure I'm not missing 
something first.

All the factory methods for ranges (e.g. bound, lowerBound, etc.) are in the 
IDBKeyRangeConstructors interface now, but I don't see the interface referenced 
anywhere. Who implements this interface, the Window object, IDBFactory[Sync], 
something else?

Thanks
-pablo




RE: [Bug 11553] New: Ensure indexedDBSync is on the right worker interface

2010-12-15 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Wednesday, December 15, 2010 3:21 AM
 
 I believe the instance of WorkerUtils is much like window in a page.  I.e. 
 you put stuff on there that you want in the global scope.  Thus I'm pretty 
 sure that WorkerUtils is the right place for both.

Yeah, I read the workers spec too quickly yesterday. You're right, WorkerUtils 
is what we need, I'll make it implement both IDBEnvironment and 
IDBEnvironmentSync.

Thanks,
-pablo




[IndexedDB] Do we need a timeout for VERSION_CHANGE?

2010-12-15 Thread Pablo Castro
Regular transactions take a timeout parameter when started, which ensures that 
we eventually make progress one way or the other if there's an uncooperative 
script that won't let go of an object store or something like that.

I'm not sure if we discussed this before, it seems that we need to add a 
similar thing for setVersion(), and it's basically a way of starting a 
transaction.

I was thinking we could have an optional timeout argument in setVersion with a 
UA-specific default. In the async case we would fire the onerror event and in 
the sync case just throw, both with TIMEOUT_ERR.
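
To make the proposal concrete, a sketch of the async case (the timeout argument 
below is the new thing being proposed, not existing API):

  var request = db.setVersion("2", 5000);  // hypothetical optional timeout in ms
  request.onerror = function (event) {
    if (event.target.errorCode === IDBDatabaseException.TIMEOUT_ERR) {
      // could not get exclusive access to the database within the timeout
    }
  };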

Thanks
-pablo




RE: [Bug 11375] New: [IndexedDB] Error codes need to be assigned new numbers

2010-12-14 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Friday, December 10, 2010 5:03 AM

 I noticed that QUOTA_ERR is commented out.  I can't remember when or why and 
 the blame history is a bit mangled.  Does anyone else?  In Chromium we 
 currently use UNKNOWN_ERR for whenever we have issues writing stuff to disk. 
  We could probably tease quota related issues out into their own error.  
 And/or we should probably create or find a good existing error for such uses.

It sounds like a good idea to keep QUOTA_ERR separated from other general 
errors that come up when writing stuff to disk.

 Speaking of which, we use UNKNOWN_ERR for a bunch of other 
 internal consistency issues.  Is this OK by everyone, should we use another, 
 or should we create a new one?  (Ideally these issues will be few and far 
 between as we make things more robust.)

That sounds reasonable to me. 

 We also use UNKNOWN_ERR for when things are not yet implemented.  Any 
 concerns?

I don't think it's a big deal, but are we going to have a bunch of 
unimplemented stuff across browsers? If this becomes common, I wonder if we 
should have a separate error so calling code can choose to compensate or 
something.

 What error code should we use for IDBCursor.update/delete when the cursor is 
 not currently on an item (or that item has been deleted)?

NOT_ALLOWED_ERR?

 TRANSIENT_ERR doesn't seem to be used anywhere in the spec.  Should it be 
 removed?

Sure.

 As for the numbering: does anyone object to me just starting from 1 and 
 going sequentially?  I.e. does anyone have a problem with them all getting 
 new numbers, or should I keep the numbers the same when possible.  (i.e. 
 only UNKNOWN_ERR, RECOVERABLE_ERR, TRANSIENT_ERR, TIMEOUT_ERR, DEADLOCK_ERR 
 would change number, but the ordering of those on the page would change.)

I'm fine with that.

-pc




RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2010-12-14 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jonas Sicking
Sent: Friday, December 10, 2010 1:42 PM

 On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow jor...@chromium.org wrote:
  Any more thoughts on this?

 I don't feel strongly one way or another. Implementation wise I don't
 really understand why implementations couldn't use keys of unlimited
 size. I wouldn't imagine implementations would want to use fixed-size
 allocations for every key anyway, right (which would be a strong
 reason to keep maximum size down).

I don't have a very strong opinion either. I don't quite agree with the 
guideline that having something working slowly is better than not working at 
all... having something not work at all sometimes helps developers hit a wall 
and think differently about their approach to a given problem. That said, if 
folks think this is an instance where we're better off not having a limit, I'm 
fine with it. 

 Pablo, do you know why the back ends you were looking at had such
 relatively low limits?

Mostly an implementation thing. Keys (and all other non-blob columns) typically 
need to fit in a page.  Predictable perf is also nice (no linked lists, high 
density/locality, etc.), but not as fundamental as page size.

-pablo




RE: [Bug 11398] New: [IndexedDB] Methods that take multiple optional parameters should instead take an options object

2010-12-10 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Friday, December 10, 2010 7:27 AM
 
 In addition to createObjectStore, I also intend to convert the following 
 over:


 IDBObjectStore.createIndex
 IDBObjectStore.openCursor
 IDBIndex.openCursor
 IDBIndex.openKeyCursor
 IDBKeyRange.bound

Sounds great.

 We did all of these two weeks ago in Chromium and have gotten some feedback. 
  The main downside is that typos are silently ignored by JavaScript.  We 
 considered throwing if someone passed in an option we didn't recognize, but 
 this would make it impossible to add more options later (which is one of the 
 main reasons for doing this change).  I think what we might do is just log 
 something in the console when this happens.  (Should the spec actually make 
 a recommendation to this effect?)  Besides that, I think overall we're happy 
 with the change.

I'm not sure what the problem is with throwing. Can't each implementation throw 
if it receives a parameter that has no meaning for it? Given that we can't know 
whether future options will have a substantial impact on the behavior of the 
function when they are present, it looks safer to go that route.

Is there prior art in some other webapps API that takes JavaScript objects as 
parameters? What do they do?
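
For concreteness, this is the shape of the change under discussion, using 
createObjectStore as the example (sketch; the option names are illustrative):

  // Instead of positional optional parameters:
  //   db.createObjectStore("people", "id", true);
  // an options object makes each setting self-describing:
  var store = db.createObjectStore("people", { keyPath: "id", autoIncrement: true });
  // Note that a typo such as { autoIncremnt: true } is silently ignored
  // unless the implementation throws or logs -- the trade-off discussed above.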

Thanks
-pablo
 



RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2010-11-19 Thread Pablo Castro

-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of bugzi...@jessica.w3.org
Sent: Friday, November 19, 2010 4:16 AM

 Just looking at this list, I guess I'm leaning towards _not_ limiting the
 maximum key size and instead pushing it onto implementations to do the hard
 work here.  If so, we should probably have some normative text about how 
 bigger
 keys will probably not be handled very efficiently.

I was trying to make up my mind on this, and I'm not sure this is a good idea. 
What would be the options for an implementation? Hashing keys into smaller 
values is pretty painful because of sorting requirements (we'd have to index 
the data twice, once for the key prefix that fits within limits, and a second 
one for a hash plus some sort of discriminator for collisions). Just storing a 
prefix as part of the key under the covers obviously won't fly...am I missing 
some other option?

Clearly, consistency in these things is important so people don't get caught 
off guard. I wonder if we should just pick a reasonable limit, say 1K 
characters (yeah, trying to do something slightly odd to avoid details of how 
stuff is actually stored), and run with it. I looked around at a few databases 
(from a single vendor :)), and they all seem to be well over this but not by 
orders of magnitude (2KB to 8KB seems to be the range of upper limits for this 
in practice).

Thanks
-pablo




RE: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-10 Thread Pablo Castro

From: Tab Atkins Jr. [mailto:jackalm...@gmail.com] 
Sent: Wednesday, November 10, 2010 1:50 PM

 On Wed, Nov 10, 2010 at 1:43 PM, Pablo Castro
 pablo.cas...@microsoft.com wrote:
 
  From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] 
  On Behalf Of bugzi...@jessica.w3.org
  Sent: Monday, November 08, 2010 5:07 PM
 
  So what happens if trying save in an object store which has the following
  keypath, the following value. (The generated key is 4):
 
  foo.bar
  { foo: {} }
 
  Here the resulting object is clearly { foo: { bar: 4 } }
 
  But what about
 
  foo.bar
  { foo: { bar: 10 } }
 
  Does this use the value 10 rather than generate a new key, does it throw 
  an
  exception or does it store the value { foo: { bar: 4 } }?
 
  I suspect that all options are somewhat arbitrary here. I'll just propose 
  that we error out to ensure that nobody has the wrong expectations about 
  the implementation preserving the initial value. I would be open to other 
  options except silently overwriting the initial value with a generated 
  one, as that's likely to confuse folks.

 It's relatively common for me to need to supply a manual value for an
 id field that's automatically generated when working with databases,
 and I don't see any particular reason that my situation would change
 if using IndexedDB.  So I think that a manually-supplied key should be
 kept.

That would be okay with me. One bit of fine print on this one is that if you're 
calling store.add() with an explicit key then you may get a unique constraint 
error (which would never happen with a generator if you never provided your own 
keys). Also, did we settle on having put() never add a new record if one didn't 
exist? If put() can create a record, then things still work but become a bit 
more elaborate: put() would create a new record either if the key is not 
present in the object, or if it's present but the value doesn't exist in the 
database, while it would update a record if the value was present and existed 
as a key in the store.
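
To summarize the cases with a sketch (a store with keyPath "foo.bar" and a key 
generator whose next key is 4, per the bug's example; the second case follows 
Tab's suggestion):

  store.add({ foo: {} });           // key generated: stored as { foo: { bar: 4 } }
  store.add({ foo: { bar: 10 } });  // explicit key 10 is kept; may fail with a
                                    // unique constraint error if 10 already exists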

-pablo



RE: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-10 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Wednesday, November 10, 2010 2:08 PM

 On Wed, Nov 10, 2010 at 1:50 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
  On Wed, Nov 10, 2010 at 1:43 PM, Pablo Castro
  pablo.cas...@microsoft.com wrote:
 
  From: public-webapps-requ...@w3.org 
  [mailto:public-webapps-requ...@w3.org] On Behalf Of 
  bugzi...@jessica.w3.org
  Sent: Monday, November 08, 2010 5:07 PM
 

 I'm fine with either solution here. My database experience is too weak
 to have strong opinions on this matter.

 What do databases usually do with columns that use autoincrement but a
 value is still supplied? My recollection is that that is generally
 allowed?

It does happen in practice that sometimes you need to use explicit keys. The 
typical case is when you're initializing a database with base data and you want 
to have known keys. 

As for what databases do, I'll use SQL Server as an example (for no particular 
reason :) ). In SQL Server, by default, if you try to insert a row with a value 
in an identity column you get an error and the operation is aborted; however, 
developers can issue a command (SET IDENTITY_INSERT table ON) to turn that off 
temporarily and insert rows with an explicitly provided primary key. Usually 
when you do this you have to be careful to use keys that are way out of the 
range the generator will use (or you may not be able to insert new rows 
anymore), or you have to reset the next key (using the obscure DBCC CHECKIDENT 
(table, RESEED, next-key) command). 

I don't know much about Oracle, but I believe the typical pattern is still to 
use a sequence object and set the default value for the key column to 
sequence.nextval, allowing callers to override the next value in the sequence 
by just providing one; if necessary they then need to go and fix up the 
sequence. 

From writing the above paragraphs I realized one more detail we need to be 
explicit about: the fact that you do an add() with an explicit key does not 
mean the implementation will fix up the next key it'll assign. You'll still get 
the value that comes after the last generated one, and if you explicitly 
inserted that value in the store you've just made the store unable to add new 
objects with generated keys until you delete it.
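
A sketch of that fine print (key values illustrative; assume the generator's 
next key is 5 and the store's keyPath is "id"):

  store.add({ id: 7, name: "explicit" });  // explicit key; generator NOT reseeded
  store.add({ name: "a" });                // generated key 5, fine
  store.add({ name: "b" });                // generated key 6, fine
  store.add({ name: "c" });                // generated key 7: unique constraint
                                           // error until the explicit record is
                                           // deleted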

If that's too much fine print then we should just disallow it. I like the 
ability to set explicit key values, but it does come with some extra care that 
both implementers and users will have to take.

-pablo
 



RE: IndexedDB TPAC agenda

2010-11-02 Thread Pablo Castro
To hit the ground running on this, here is a consolidated list of issues coming 
both from the thread below and from various pending bugs/discussions we've had. 
I picked an arbitrary order and grouping; feel free to tweak it in any way.

- keys (arrays as keys, compound keys, general keypath restrictions)
- index keys (arrays as keys, empty values, general keypath restrictions)
- internationalization (collation specification, collation algorithm)
- quotas (how do apps request more storage, is there a temp/permanent 
distinction?)
- error handling (propagation, relationship to window.error, db scoped event 
handlers, errors vs return empty values)
- blobs (be explicit about behavior of blobs in indexeddb objects)
- transactions error modes (abort-on-unwind in error conditions; what happens 
when user leaves the page with pending transactions?)
- transactions isolation/concurrent aspects
- transactions scopes (dynamic support)
- synchronous api

Thanks
-pablo

-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Pablo Castro
Sent: Monday, November 01, 2010 10:39 PM
To: Jeremy Orlow; Jonas Sicking
Cc: public-webapps@w3.org
Subject: RE: IndexedDB TPAC agenda

A few other items to add to the list to discuss tomorrow:

- Blobs support: have we discussed explicitly how things work when an object 
has a blob (file, array, etc.) as one of its properties?
- Close on collation and international support
- How do applications request that they need more storage? And related to this, 
at some point we discussed temporary vs permanent stores. Close on the whole 
story of how space is managed.
- Database-wide exception handlers

Looking forward to the discussion tomorrow.

-pablo


From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Monday, November 01, 2010 1:34 PM
To: Jonas Sicking
Cc: public-webapps@w3.org
Subject: Re: IndexedDB TPAC agenda

On Mon, Nov 1, 2010 at 12:23 PM, Jonas Sicking jo...@sicking.cc wrote:
On Mon, Nov 1, 2010 at 5:13 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Mon, Nov 1, 2010 at 11:53 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Nov 1, 2010 at 4:40 AM, Jeremy Orlow jor...@chromium.org wrote:
  What items should we try to cover during the f2f?
  On Mon, Nov 1, 2010 at 11:08 AM, Jonas Sicking jo...@sicking.cc wrote:
 
   P.S. I'm happy to discuss all of this f2f tomorrow rather than over
   email
   now.
 
  Speaking of which, would be great to have an agenda. Some of the
  bigger items are:
 
  * Dynamic transactions
  * Arrays-as-keys
  * Arrays and indexes (what to do if the keyPath for an index evaluates
  to an array)
  * Synchronous API
 
  * Compound keys.
  * What should be allowed in a keyPath.

 Aren't compound keys same as arrays-as-keys?

 Sorry, I meant to say compound indexes.
 We've talked about using indexes in many different ways--including compound
 indexes and allowing keys to include indexes.  I assumed you meant the
 latter?
I'm lost as to what you're saying here. Could you elaborate? Are you
saying index when you mean array anywhere?

oops.  Yes, I meant to say: We've talked about using arrays in many different 
ways--including compound indexes and allowing keys to include arrays.  I 
assumed you meant the latter?
 
 * What should happen if an index's keyPath points to a property which
 doesn't exist or which isn't a valid key-value? (same general topic as
 arrays and indexes above)

 We've talked about this several times.  It'd be great to settle on something
 once and for all.
Agreed.

 * What happens if the user leaves a page in the middle of a
 transaction? (this might be nice to tackle since there'll be lots of
 relevant people in the room)

 I'm pretty sure this is simple: if there's an onsuccess/onerror handler that
 has not yet fired (or we're in the middle of firing), then you abort the
 transaction.  If not, the behavior is undefined (because there's no way the
 app could have observed the difference anyway).  The aborting behavior is
 necessary since the user could have planned to execute additional commands
 atomically in the handler.
There is also the option to let the transaction finish. They should be
short-lived so it shouldn't be too bad.

I.e. keep the page alive for a bit longer in the background or something that 
blocks page unload?  Is there precedent for this elsewhere?  This sounds pretty 
complicated to get right both in terms of implementation and speccing.  Let's 
chat about it though.
 
 * Error handling

 What do you mean by this?
How to handle exceptions in various places. Where (error) events
propagate. How does it relate to window.onerror. What happens if you
do/don't call preventDefault on the error event?


Sounds good.





RE: IndexedDB TPAC agenda

2010-11-01 Thread Pablo Castro
A few other items to add to the list to discuss tomorrow:

- Blobs support: have we discussed explicitly how things work when an object 
has a blob (file, array, etc.) as one of its properties?
- Close on collation and international support
- How do applications request that they need more storage? And related to this, 
at some point we discussed temporary vs permanent stores. Close on the whole 
story of how space is managed.
- Database-wide exception handlers

Looking forward to the discussion tomorrow.

-pablo


From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Monday, November 01, 2010 1:34 PM
To: Jonas Sicking
Cc: public-webapps@w3.org
Subject: Re: IndexedDB TPAC agenda

On Mon, Nov 1, 2010 at 12:23 PM, Jonas Sicking jo...@sicking.cc wrote:
On Mon, Nov 1, 2010 at 5:13 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Mon, Nov 1, 2010 at 11:53 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Nov 1, 2010 at 4:40 AM, Jeremy Orlow jor...@chromium.org wrote:
  What items should we try to cover during the f2f?
  On Mon, Nov 1, 2010 at 11:08 AM, Jonas Sicking jo...@sicking.cc wrote:
 
   P.S. I'm happy to discuss all of this f2f tomorrow rather than over
   email
   now.
 
  Speaking of which, would be great to have an agenda. Some of the
  bigger items are:
 
  * Dynamic transactions
  * Arrays-as-keys
  * Arrays and indexes (what to do if the keyPath for an index evaluates
  to an array)
  * Synchronous API
 
  * Compound keys.
  * What should be allowed in a keyPath.

 Aren't compound keys same as arrays-as-keys?

 Sorry, I meant to say compound indexes.
 We've talked about using indexes in many different ways--including compound
 indexes and allowing keys to include indexes.  I assumed you meant the
 latter?
I'm lost as to what you're saying here. Could you elaborate? Are you
saying index when you mean array anywhere?

oops.  Yes, I meant to say: We've talked about using arrays in many different 
ways--including compound indexes and allowing keys to include arrays.  I 
assumed you meant the latter?
 
 * What should happen if an index's keyPath points to a property which
 doesn't exist or which isn't a valid key-value? (same general topic as
 arrays and indexes above)

 We've talked about this several times.  It'd be great to settle on something
 once and for all.
Agreed.

 * What happens if the user leaves a page in the middle of a
 transaction? (this might be nice to tackle since there'll be lots of
 relevant people in the room)

 I'm pretty sure this is simple: if there's an onsuccess/onerror handler that
 has not yet fired (or we're in the middle of firing), then you abort the
 transaction.  If not, the behavior is undefined (because there's no way the
 app could have observed the difference anyway).  The aborting behavior is
 necessary since the user could have planned to execute additional commands
 atomically in the handler.
There is also the option to let the transaction finish. They should be
short-lived so it shouldn't be too bad.

I.e. keep the page alive for a bit longer in the background or something that 
blocks page unload?  Is there precedent for this elsewhere?  This sounds pretty 
complicated to get right both in terms of implementation and speccing.  Let's 
chat about it though.
 
 * Error handling

 What do you mean by this?
How to handle exceptions in various places. Where (error) events
propagate. How does it relate to window.onerror. What happens if you
do/don't call preventDefault on the error event?


Sounds good.



Re: [IndexedDB] Explicitly establishing the timing of clone creation

2010-10-04 Thread Pablo Castro

On Mon, Aug 16, 2010 at 12:11 AM, Jonas Sicking jo...@sicking.cc wrote:


  On Fri, Aug 13, 2010 at 1:43 PM, Pablo Castro
  pablo.cas...@microsoft.com wrote:
   The spec for the asynchronous put and add methods in object store as
  well as update in cursors doesn't explicitly state when clones are created,
  and can even be read as if clones should be created after the function call
  returned, when the queued up task is executed. This leads to problems where
  the clone may be modified after the call to put/add/update happens. 
  Wouldn't
  it be more reasonable to require implementations to always create a clone 
  of
  the object before returning (i.e. synchronously) and perform the rest of 
  the
  operation asynchronously?
 
  Yes.
 
   If we agree on this I'll file a bug and later follow up with some text
  for the spec.
 
  Please do.
 

 Agreed.

Closing the loop on this one. Proposed text is below; any feedback is welcome. 
I also updated the bug with it.
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10381

Thanks
-pablo


Proposed text changes for this:

In section 3.2.5 Object Store, the description for the add method says:
This method returns immediately and stores the given value in this object store
by following the steps for storing a record into an object store with the
no-overwrite flag set. If the record can be successfully stored in the object
store, then a success event is fired on this method's returned object using the
IDBTransactionEvent interface with its result set to the key for the stored
record and transaction set to the transaction in which this object store is
opened. If a record exists in this object store for the key key parameter, then
an error event is fired on this method's returned object with its code set to
CONSTRAINT_ERR

We should change it to:
This method stores the given value in this object store by first synchronously
creating a copy of the value following steps 1 through 4 of the algorithm
described in 4.2 Object Store Storage steps, then returning immediately and
asynchronously performing the remaining steps for the algorithm that actually
store the object in the object store, with the no-overwrite flag set. If the
record can be successfully stored in the object store, then a success event is
fired on this method's returned object using the IDBTransactionEvent interface
with its result set to the key for the stored record and transaction set to the
transaction in which this object store is opened. If a record exists in this
object store for the key key parameter, then an error event is fired on this
method's returned object with its code set to CONSTRAINT_ERR.



In section 3.2.5 Object Store, the description for the put method says:
This method returns immediately and stores the given value in this object store
by following the steps for storing a record into an object store. If the record
can be successfully stored in the object store, then a success event is fired
on this method's returned object using the IDBTransactionEvent interface with
its result set to the key for the stored record and transaction set to the
transaction in which this object store is opened.

We should change it to:
This method stores the given value in this object store by first synchronously
creating a copy of the value following steps 1 through 4 of the algorithm
described in 4.2 Object Store Storage steps, then returning immediately and
asynchronously performing the remaining steps for the algorithm that actually
store the object in the object store. If the record can be successfully stored
in the object store, then a success event is fired on this method's returned
object using the IDBTransactionEvent interface with its result set to the key
for the stored record and transaction set to the transaction in which this
object store is opened.



In section 3.2.7 Cursor the description of the update method says:
This method returns immediately and sets the value for the record at the
cursor's position.

We should change it to:
This method sets the value for the record at the cursor's position by first
synchronously creating a copy of the value using the structured clone
algorithm, then returning immediately and asynchronously updating the record in
the underlying store.
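
For illustration, this is the hazard the synchronous copy closes (sketch):

  var obj = { name: "original" };
  store.put(obj);         // the copy must be taken here, before put() returns
  obj.name = "mutated";   // with a lazily created clone, this mutation could
                          // end up in the database; with the text above it can't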




RE: Seeking agenda items for WebApps' Nov 1-2 f2f meeting

2010-10-04 Thread Pablo Castro
Are these slots more or less frozen at this point? Just wanted to confirm to 
make travel arrangements.

Thanks
-pablo
 

-Original Message-
From: Arthur Barstow [mailto:art.bars...@nokia.com] 
Sent: Wednesday, September 29, 2010 5:41 AM
To: ext Eric Uhrhane; Jonas Sicking; Jeremy Orlow; Pablo Castro; 
public-webapps; Arun Ranganathan
Subject: Re: Seeking agenda items for WebApps' Nov 1-2 f2f meeting


  I added the following slots for November 2:

[[
http://www.w3.org/2008/webapps/wiki/TPAC2010#Tuesday.2C_November_2

13:30-15:00: Indexed DB
15:30-16:30: Indexed DB
16:30-18:00: File * APIs
]]

Of course we can fine-tune the times as needed.

Arun - we reserved a speaker phone for remote participants for both days.

-Art Barstow

On 9/28/10 5:45 PM, ext Eric Uhrhane wrote:
 Works fine for me.  I'll be there all of Monday and Tuesday.  Due to
 jetlag, morning vs. afternoon is probably irrelevant to me, as I won't
 have any idea what time it is ;)

 On Tue, Sep 28, 2010 at 2:30 PM, Jonas Sickingjo...@sicking.cc  wrote:
 The later the better for me. If we can make it after noon I'll be
 there for sure.

 / Jonas

 On Tue, Sep 28, 2010 at 1:37 PM, Jeremy Orlowjor...@google.com  wrote:
 I'm OK with any time slot.

 On Tue, Sep 28, 2010 at 8:57 PM, Arthur Barstowart.bars...@nokia.com
 wrote:
   Hi All,

 Currently, no one has requested a specific day + time slot for any of the
 proposed topics:

   http://www.w3.org/2008/webapps/wiki/TPAC2010

 When our IndexedDB participants agree on a time slot on Tuesday the 2nd,
 I'll add it to the agenda. Pablo, Jonas, Jeremy - please propose a time.

 Day + time slot proposals for the agenda topics already proposed are also
 welcome (as are proposals for additional topics).

 -Art Barstow

 On 9/28/10 3:28 PM, ext Pablo Castro wrote:
 It looks like there will be good critical mass for IndexedDB discussions,
 so I'll try to make it as well. Tuesday would be best for me as well for 
 an
 IndexedDB meeting so I can travel on Sunday/Monday.

 -pablo

 -Original Message-
 From: Jonas Sicking [mailto:jo...@sicking.cc]
 Sent: Tuesday, September 28, 2010 10:53 AM
 To: Jeremy Orlow
 Cc: Pablo Castro; art.bars...@nokia.com; public-webapps
 Subject: Re: Seeking agenda items for WebApps' Nov 1-2 f2f meeting

 I'm not 100% sure that I'll make TPAC this year, but if I do, I likely
 won't make monday. So a tuesday schedule would fit me better too.

 / Jonas

 On Tue, Sep 28, 2010 at 8:36 AM, Jeremy Orlowjor...@google.comwrote:
 Is it possible to schedule IndexedDB for Tuesday?  I'm pretty sure that
 I
 can be there then, but Monday is more up in the air at this moment.
 Thanks!
 Jeremy
 On Thu, Sep 2, 2010 at 3:28 AM, Jonas Sickingjo...@sicking.ccwrote:
 I'm hoping to be there yes. Especially if we'll get a critical mass of
 IndexedDB contributors.

 / Jonas

 On Wed, Sep 1, 2010 at 7:18 PM, Pablo
 Castropablo.cas...@microsoft.com
 wrote:
 -Original Message-
 From: public-webapps-requ...@w3.org
 [mailto:public-webapps-requ...@w3.org] On Behalf Of Arthur Barstow
 Sent: Tuesday, August 31, 2010 4:32 AM

 The WebApps WG will meet face-to-face November 1-2 as part of the
 W3C's
 2010 TPAC meeting week [TPAC].

 I created a stub agenda item page and seek input to flesh out
 agenda:

 http://www.w3.org/2008/webapps/wiki/TPAC2010

 [TPAC] includes a link to the Registration page, a detailed schedule
 of
 the group meetings, and other useful information.

 The registration fee is 40€ per day and will increase to 120€ per
 day
 after October 22.

 -Art Barstow

 [TPAC] http://www.w3.org/2010/11/TPAC/
 For folks working on IndexedDB, are you guys planning on attending the
 TPAC? Given the timing of the event it may be a great opportunity to
 get
 together and iron out a whole bunch of issues at once. It would be
 good to
 know ahead of time so we can all make plans if we have critical mass.

 Thanks
 -pablo







RE: [IndexedDB] setVersion with multiple IDBDatabase objects

2010-09-29 Thread Pablo Castro

-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of ben turner
Sent: Tuesday, September 28, 2010 8:19 AM

 Yes, let's have it tied to the instance on which setVersion() was called.

 As Shawn pointed out that is consistent with the behavior that
 database instances from different windows will observe. As Jeremy
 pointed out that is consistent with the way object stores and indexes
 are tied to a transaction instance. Also, the |event.source| will be
 db1 in the given example, so it seems natural to allow changes only to
 the database we pass in the event and no other.

 -Ben

+1, let's tie it to the instance and make it consistent with stores/indexes.

-pablo




RE: [IndexedDB] IDBCursor.update for cursors returned from IDBIndex.openCursor

2010-09-29 Thread Pablo Castro
I agree with Jonas on this. I think accessing the index values is an important 
feature (in addition to joins, you can imagine adding an extra property or two 
to the index key to create a covering index and avoid fetching the object in a 
perf-critical path).

That said, to me it's just about allowing retrieval. For update/delete it would 
be perfectly reasonable to have to go to the store, in my opinion.
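
A sketch of the covering-index pattern, using Jonas's employees/sales example 
below (per the draft semantics he describes, where an index cursor's .value is 
the record's primary key; lookUpSales is a hypothetical helper):

  var index = employeesStore.index("name");
  index.openCursor().onsuccess = function (event) {
    var cursor = event.target.result;
    if (!cursor) return;              // done iterating
    // cursor.key is the employee name, cursor.value is the employee id,
    // so no read from the employees store itself is needed.
    lookUpSales(cursor.value);
    cursor.continue();
  };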

-pablo

-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jonas Sicking
Sent: Friday, September 17, 2010 3:15 PM

On Fri, Sep 17, 2010 at 2:46 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Fri, Sep 17, 2010 at 1:06 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Sep 16, 2010 at 2:23 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Thu, Sep 16, 2010 at 8:53 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Thu, Sep 16, 2010 at 2:15 AM, Jeremy Orlow jor...@chromium.org
  wrote:
   Wait a sec.  What are the use cases for non-object cursors anyway?
    They
   made perfect sense back when we allowed explicit index management,
   but
   now
   they kind of seem like a premature optimization or possibly even dead
   weight.  Maybe we should just remove them altogether?
 
  They are still useful for joins. Consider an objectStore "employees":
 
  { id: 1, name: "Sven", employed: "1-1-2010" }
  { id: 2, name: "Bert", employed: "5-1-2009" }
  { id: 3, name: "Adam", employed: "6-6-2008" }
  And an objectStore "sales":
 
  { seller: 1, candyName: "lollipop", quantity: 5, date: "9-15-2010" }
  { seller: 1, candyName: "swedish fish", quantity: 12, date: "9-15-2010" }
  { seller: 2, candyName: "jelly belly", quantity: 3, date: "9-14-2010" }
  { seller: 3, candyName: "heath bar", quantity: 3, date: "9-13-2010" }
  If you want to display the amount of sales per person, sorted by the names
  of the salespeople, you could do this by first creating an index for
  "employees" with keyPath "name". You'd then use IDBIndex.openCursor to
  iterate that index, and for each entry find all entries in the "sales"
  objectStore where seller matches the cursor's .value.
 
  So in this case you don't actually need any data from the employees
  objectStore, all the data is available in the index. Thus it is
  sufficient, and faster, to use openCursor than openObjectCursor.
 
  In general, it's a common optimization to stick enough data in an
  index that you don't have to actually look up in the objectStore
  itself. This is slightly less commonly doable since we have relatively
  simple indexes so far. But still doable as the example above shows.
  Once we add support for arrays as keys this will be much more common
  as you can then stick arbitrary data into the index by simply adding
  additional entries to all key arrays. And even more so once we
  (probably in a future version) add support for computed indexes.
 
 
  On Thu, Sep 16, 2010 at 8:57 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Thu, Sep 16, 2010 at 4:08 AM, Jeremy Orlow jor...@chromium.org
  wrote:
   Actually, for that matter, are remove and update needed at all?  I
   think
   they may just be more cruft left over from the explicit index days.
    As
   far
   as I can tell, any .delete or .remove should be doable via an
   objectCursor +
   .puts/.removes on the objectStore.
 
  They are not strictly needed, but they are a decent convenience
  feature, and with a proper implementation they can even be a
  performance optimization. With a cursor iterating a b-tree you can let
  the cursor keep a pointer to the b-tree entry. That way .delete and
  .update don't have to do a b-tree lookup at all.
 
  We're currently not able to do this since our backend (sqlite) doesn't
  have good enough cursor support, but I suspect that this will change
  at some point in the future. In the mean time it seems like a good
  thing to allow people to use API that will be faster in the future.
 
  All your arguments revolve around what the spec
  and implementations might do
  in the future.

 I disagree. The IDBIndex.openCursor example I included uses only
 existing API, and is a performance improvement in at least our current
 implementation. Would be interested to hear if it's not a performance
 improvement in others.

 It's not in ours because we join to the ObjectStore's data table either way.
  But that's not at all why I'm bringing this up.

Why?

  Typically we add API surface area only for use cases that
  are currently impossible to satisfy or proven performance bottlenecks. I
  agree that it's likely implementations will want to do optimizations
  like
  this in the future, but until they do, it'll be hard to really
  understand
  the implications and complications that might arrise.

 That's not entirely true. All the databases I have worked with have
 had significant performance degradations when having to look up the
 main table contents rather than simply looking at the contents in the
 index. I doubt that we'll be able to create a backend where 

RE: [IndexedDB] Languages for collation

2010-09-29 Thread Pablo Castro

From: Jungshik Shin (신정식, 申政湜) [mailto:jungs...@google.com] 
Sent: Tuesday, August 24, 2010 10:34 PM

 As for the locale identifiers, my understanding is that Windows APIs (newer 
 'name-based' locale APIs) more or less follows BCP 47. 


Picking this back up from this August thread. I went around and asked Windows 
folks about this. Locale identifiers based on BCP 47 sound good.

On the other hand, we probably wouldn't do UCA. I heard various worries from 
folks that work in this space, including the fact that it seems it's still 
changing so it would be a moving target (which btw means that collisions could 
still happen) and that we don't support it in a number of places today. Given 
that feedback, I would rather leave this open and let implementations choose 
the algorithm for collation (still need to do language-sensitive collation, of 
course). Would that work?

Thanks
-pablo
 


RE: Seeking agenda items for WebApps' Nov 1-2 f2f meeting

2010-09-28 Thread Pablo Castro
It looks like there will be good critical mass for IndexedDB discussions, so 
I'll try to make it as well. Tuesday would be best for me as well for an 
IndexedDB meeting so I can travel on Sunday/Monday.

-pablo

-Original Message-
From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Tuesday, September 28, 2010 10:53 AM
To: Jeremy Orlow
Cc: Pablo Castro; art.bars...@nokia.com; public-webapps
Subject: Re: Seeking agenda items for WebApps' Nov 1-2 f2f meeting

I'm not 100% sure that I'll make TPAC this year, but if I do, I likely
won't make monday. So a tuesday schedule would fit me better too.

/ Jonas

On Tue, Sep 28, 2010 at 8:36 AM, Jeremy Orlow jor...@google.com wrote:
 Is it possible to schedule IndexedDB for Tuesday?  I'm pretty sure that I
 can be there then, but Monday is more up in the air at this moment.
 Thanks!
 Jeremy
 On Thu, Sep 2, 2010 at 3:28 AM, Jonas Sicking jo...@sicking.cc wrote:

 I'm hoping to be there yes. Especially if we'll get a critical mass of
 IndexedDB contributors.

 / Jonas

 On Wed, Sep 1, 2010 at 7:18 PM, Pablo Castro pablo.cas...@microsoft.com
 wrote:
 
  -Original Message-
  From: public-webapps-requ...@w3.org
  [mailto:public-webapps-requ...@w3.org] On Behalf Of Arthur Barstow
  Sent: Tuesday, August 31, 2010 4:32 AM
 
  The WebApps WG will meet face-to-face November 1-2 as part of the
  W3C's
  2010 TPAC meeting week [TPAC].
 
  I created a stub agenda item page and seek input to flesh out agenda:
 
  http://www.w3.org/2008/webapps/wiki/TPAC2010
 
  [TPAC] includes a link to the Registration page, a detailed schedule
  of
  the group meetings, and other useful information.
 
  The registration fee is 40€ per day and will increase to 120€ per day
  after October 22.
 
  -Art Barstow
 
  [TPAC] http://www.w3.org/2010/11/TPAC/
 
  For folks working on IndexedDB, are you guys planning on attending the
  TPAC? Given the timing of the event it may be a great opportunity to get
  together and iron out a whole bunch of issues at once. It would be good to
  know ahead of time so we can all make plans if we have critical mass.
 
  Thanks
  -pablo
 
 






RE: Seeking agenda items for WebApps' Nov 1-2 f2f meeting

2010-09-01 Thread Pablo Castro

-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Arthur Barstow
Sent: Tuesday, August 31, 2010 4:32 AM

 The WebApps WG will meet face-to-face November 1-2 as part of the W3C's 
 2010 TPAC meeting week [TPAC].

 I created a stub agenda item page and seek input to flesh out agenda:

 http://www.w3.org/2008/webapps/wiki/TPAC2010

 [TPAC] includes a link to the Registration page, a detailed schedule of 
 the group meetings, and other useful information.

 The registration fee is 40€ per day and will increase to 120€ per day 
 after October 22.

 -Art Barstow

 [TPAC] http://www.w3.org/2010/11/TPAC/

For folks working on IndexedDB, are you guys planning on attending the TPAC? 
Given the timing of the event it may be a great opportunity to get together and 
iron out a whole bunch of issues at once. It would be good to know ahead of 
time so we can all make plans if we have critical mass.

Thanks
-pablo



RE: [IndexedDB] Let's remove IDBDatabase.objectStore()

2010-08-24 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Tuesday, August 24, 2010 12:40 AM

 On Tue, Aug 24, 2010 at 12:43 AM, ben turner bent.mozi...@gmail.com wrote:
 Hi folks,

 We originally included IDBDatabase.objectStore() as a convenience
 function because we figured that everyone would hate typing
 |myDatabase.transaction('myObjectStore').objectStore('myObjectStore')|.
 Unfortunately I think we should remove it - too many developers have
 used the function without realizing that the returned object was tied
 to a particular transaction. Any objections?

 It does seem like it could be confusing and it doesn't seem to save all that 
 many characters.  So I'm fine with it.

+1
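
For context, the trap this removes looks like this (sketch):

  var store = myDatabase.objectStore("myObjectStore");  // implicitly starts a transaction
  // ... later, after that implicit transaction has finished:
  store.put(value);  // fails: the store object is still bound to the old transaction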




[IndexedDB] Avoiding reader/writer starvation

2010-08-13 Thread Pablo Castro
In the context of transactions, readers using READ_ONLY and writers using 
READ_WRITE may block each other when starting transactions, at least in cases 
where the underlying implementation uses locking for isolation. Since we allow 
multiple readers, and they can start while other readers are already running, 
it's possible that readers end up starving writers in a concurrent setting. It 
seems it would be a good idea to add some minimum guarantees to the spec that 
ensure some amount of fairness to concurrent activities against a given 
database. 

We could either include a loose recommendation or try to mandate a strict 
behavior. The loose recommendation seems more practical; the questions are a) 
whether there is a risk of incompatible behavior because of 
under-specification, and b) whether some implementations will just ignore this 
aspect if it's specified too informally.

The loose recommendation could just be a sentence in the transactions section:

"UAs need to ensure a reasonable level of fairness across readers and writers 
to prevent starvation."

If we wanted to be more specific, we could go with something like this (we'd 
probably spell it out as rules if we decide to put this strict version in the 
spec):

All readers can run concurrently, but once a writer tries to start a 
transaction we stop allowing new readers to start and queue up the writer and 
any subsequent reader/writer. Once the existing readers are drained the writer 
runs, and after that whatever is queued up next runs, which can be another 
writer or all the remaining readers (depending upon what came first, another 
writer or another reader; readers are released all simultaneously since they 
run concurrently).
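
To pin the strict version down, here is a minimal sketch of that queueing rule 
(illustrative only, not proposed spec text):

  class FairRWQueue {
    constructor() {
      this.activeReaders = 0;
      this.writerActive = false;
      this.pending = [];  // FIFO of { mode, resolve }
    }
    acquire(mode) {  // mode is "read" or "write"
      return new Promise(resolve => {
        this.pending.push({ mode, resolve });
        this.dispatch();
      });
    }
    release(mode) {
      if (mode === "read") this.activeReaders--;
      else this.writerActive = false;
      this.dispatch();
    }
    dispatch() {
      // FIFO head-blocking: once a writer is queued, later readers wait behind it.
      while (this.pending.length > 0) {
        const next = this.pending[0];
        if (next.mode === "read") {
          if (this.writerActive) return;
          this.pending.shift();
          this.activeReaders++;   // concurrent readers are released together
          next.resolve();
        } else {
          if (this.writerActive || this.activeReaders > 0) return;  // drain readers
          this.pending.shift();
          this.writerActive = true;
          next.resolve();
          return;  // one writer at a time
        }
      }
    }
  }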

Given that not all implementations will have to deal with this and that 
different implementations may want to have different strategies, it seems that 
just having the recommendation around starvation is the best option.

Thanks
-pablo




RE: [IndexedDB] Languages for collation

2010-08-12 Thread Pablo Castro

From: Mikeal Rogers [mailto:mikeal.rog...@gmail.com] 
Sent: Wednesday, August 11, 2010 11:35 PM

 Why not just use the unicode collation algorithm?

 Then you won't have to hint the locale.

Unless I'm missing something, the UCA defines the general algorithm for 
collating strings, but you still need to know the language in order to sort 
strings properly in that language. For example, in Spanish the letters "c" and 
"h" together (e.g. in "chau" (bye)) traditionally sort as a single letter, 
causing the expected sort order to differ from English, where they are always 
two independent letters (so "chau" comes before "cuando" (when) when sorted in 
English, but after it when sorted in Spanish).
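
A concrete way to see the difference, sketched with the (much later) ECMAScript 
Internationalization API; support for the traditional Spanish tailoring varies 
by engine:

  new Intl.Collator("en").compare("chau", "cuando");            // < 0: "chau" first
  new Intl.Collator("es-u-co-trad").compare("chau", "cuando");  // > 0: "cuando" first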


 http://en.wikipedia.org/wiki/Unicode_collation_algorithm

 CouchDB uses some definitions around sorting complex types like arrays and 
 objects but when it comes down to sorting strings it just defaults to to the 
 unicode collation algorithm and all the locale's are happy.

 -Mikeal

 On Wed, Aug 11, 2010 at 11:28 PM, Pablo Castro pablo.cas...@microsoft.com 
 wrote:
 We had some discussions about collation algorithms and such in the past, but 
 I don't think we have settled on the language aspect of it. In order to have 
 stores and indexes sort character-based keys in a way that is consistent 
 with users' expectations we'll have to take an indication in the API of what 
 language we should use to collate strings.

 Trying to take a minimalist approach, we could add an optional parameter on 
 the database open call that indicates the language to use (e.g. en or 
 en-UK, etc.). If the language is not specified and the database does not 
 exist, then we can use the current browser/OS language to create the 
 database. If not specified and database already exists, then use the one 
 it's already there (this accommodates the fact that a user may be able to 
 change their default language in the browser/OS after the database has been 
 created using the default). If the language is specified and the database 
 already exists and the specified language is not the one the database has 
 then we'll throw an exception (same behavior as with description, although 
 we have that one in flight right now as well).

 We should probably also add a read-only attribute to the database object 
 that exposes the language.

 If this works for folks I can write a proposal for the specific changes to 
 the spec.

 Thanks
 -pablo





RE: [IndexedDB] question about description argument of IDBFactory::open()

2010-08-12 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Thursday, August 12, 2010 3:59 AM

 On Thu, Aug 12, 2010 at 11:55 AM, Jonas Sicking jo...@sicking.cc wrote:
 On Thu, Aug 12, 2010 at 3:41 AM, Jeremy Orlow jor...@chromium.org wrote:
  http://www.w3.org/Bugs/Public/show_bug.cgi?id=10349
  One quesiton though: if they pass in null or undefined, do we want to
  interpret this as the argument not being passed in or simply let them
  convert to undefined and null (which is the default behavior in WebIDL,
  I believe).  I feel somewhat strongly we should do the former.  Especially
  since the latter would make it impossible to add additional parameters to
  .open() in the future.
 I don't understand why it would make it impossible to add optional
 parameters in the future. Wouldn't it be a matter of people writing

 indexeddb.open("mydatabase", "", SOME_OTHER_PARAM);

 vs.

 indexeddb.open("mydatabase", null, SOME_OTHER_PARAM);

 So "" is assumed to mean "don't update"?  My assumption was that "" meant 
 "empty description".

 It seems silly to make someone replace the description with a space (or 
 something like that) if they truly want to zero it out.  And it seems silly 
 to ever make your description be "null".  So it seemed natural to make 
 null and/or undefined be such a signal.

Given that open() is one of those functions that are likely to grow in 
parameters over time, I wonder if we should consider taking an object as the 
second argument with names/values (e.g. open("mydatabase", { description: 
"foo" });). That would allow us to keep the minimum signature small and easily 
add more parameters later without resulting in hard-to-read code that has a 
bunch of "undefined" arguments. The only thing I'm not sure about is whether 
there is precedent for doing this in one of the standard APIs.

Thanks
-pablo




[IndexedDB] READ_ONLY vs SNAPSHOT_READ transactions

2010-08-12 Thread Pablo Castro
We currently have two read-only transaction modes, READ_ONLY and SNAPSHOT_READ. 
As we mapped this out to an implementation we ran into various questions that 
made me wonder whether we have the right set of modes. 

It seems that READ_ONLY and SNAPSHOT_READ are identical in every aspect 
(point-in-time consistency for readers, allow multiple concurrent readers, 
etc.), except that they have different concurrency characteristics, with 
READ_ONLY blocking writers and SNAPSHOT_READ allowing concurrent writers come 
and go while readers are active. Does that match everybody's interpretation?

Assuming that interpretation, I'm not sure we need both. Should we consider 
having only READ_ONLY, where transactions are guaranteed a stable view of the 
world regardless of the implementation strategy, and then let implementations 
either block writers or version the data? I understand that this introduces 
variability in the reader-writer interaction. On the other hand, I also suspect 
that the cost of SNAPSHOT_READ will vary a lot across implementations (e.g. 
MVCC-based stores versus non-MVCC stores, which would have to make copies of 
all stores included in a transaction to support this mode). 

Thanks
-pablo




RE: [IndexedDB] Languages for collation

2010-08-12 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Thursday, August 12, 2010 2:18 AM

 I think we should first break down the use cases and look at how many of 
 them just need _a_ sort order, how many of them a per-database sort order is 
 ok, and how many of them would need something finer grained (like a per-key 
 ordering).

That's reasonable. What I was thinking is that in any case where you'll use the 
order of items in a store/index to display things to the user (e.g. a list of 
contacts), you'd want the items to be in the proper order for the user's 
language. That will not only match users' expectations but also match other 
applications (or even other parts of the UA) that display data based on the 
current OS language or the user's choice of language. 

That covers a very broad spectrum of scenarios that need language-specific sort 
order. 

I find it unlikely that a single web app will need more than one language per 
database (or even per origin/OS account), given that most applications operate 
in a single language at any one point in time. 

 Are there work-arounds for getting an UCA ordered data structure to hold 
 data other language's order?  For example, I could imagine it'd be possible 
 to do some sort of encode step on the data before insertion (and decode on 
 removal) that would make UCA work.  I have no idea, but if such algorithms 
 existed and were well understood, then it'd definitely make me lean towards 
 punting language specification to v2.

I'm not sure I understand this paragraph. "UCA ordered" may not mean much more 
than just ordering using a binary collation if the language is not specified. 
While this is typically not an issue in English, in other languages this 
introduces a varying level of deviation from users' expectations. Given that 
different languages have conflicting rules for collation, I'm not sure how this 
can be generalized independently of the language. Even in the UCA specification 
[1] the aspect of input language is mentioned as the most important feature of 
collation.

[1] http://www.unicode.org/reports/tr10/




RE: [IndexedDB] Languages for collation

2010-08-12 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Thursday, August 12, 2010 3:36 AM

 On Thu, Aug 12, 2010 at 11:19 AM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Aug 11, 2010 at 11:28 PM, Pablo Castro
 pablo.cas...@microsoft.com wrote:
  We had some discussions about collation algorithms and such in the past, 
  but I don't think we have settled on the language aspect of it. In order 
  to have stores and indexes sort character-based keys in a way that is 
  consistent with users' expectations we'll have to take an indication in the 
  API of what language we should use to collate strings.
 
  Trying to take a minimalist approach, we could add an optional parameter 
  on the database open call that indicates the language to use (e.g. en or 
  en-UK, etc.). If the language is not specified and the database does not 
  exist, then we can use the current browser/OS language to create the 
  database. If not specified and database already exists, then use the one 
  it's already there (this accommodates the fact that a user may be able to 
  change their default language in the browser/OS after the database has 
  been created using the default). If the language is specified and the 
  database already exists and the specified language is not the one the 
  database has then we'll throw an exception (same behavior as with 
  description, although we have that one in flight right now as well).
 
  We should probably also add a read-only attribute to the database object 
  that exposes the language.
 
  If this works for folks I can write a proposal for the specific changes to 
  the spec.
 If we make it part of the database open call, then that makes it
 impossible to change the sorting order of an existing database, no?
 This seems like it could be a problem. I.e. it quite possible that an
 application will want to allow the user to change the sorting
 language, for example when changing the language of the UI.

 One solution would be to allow language to be set as part of the
 setVersion call.

 Whether it's per-database or more fine grained I think it absolutely must be 
 part of setVersion.  Changing the language will be a very heavyweight 
 operation that'll require a similar level of isolation to schema changes 
 of the database.  (Not sure how I missed this point of Pablo's original 
 email.)

Yes, changing the collation would effectively mean re-creating all the stores 
and indexes. At the very minimum it needs to be a setVersion thing. I also 
don't think it would be too crazy to not support changing collations, period. 
In the unusual case where a user absolutely must do this, it can be done by 
creating a separate database and copying the data over using the APIs.





RE: CfC: to publish new WD of Indexed Database API; deadline August 17

2010-08-11 Thread Pablo Castro
We support this as well.

-pablo


-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jonas Sicking
Sent: Tuesday, August 10, 2010 8:06 AM
To: Jeremy Orlow
Cc: art.bars...@nokia.com; public-webapps
Subject: Re: CfC: to publish new WD of Indexed Database API; deadline August 17

I support this.

On Tue, Aug 10, 2010 at 4:38 AM, Jeremy Orlow jor...@google.com wrote:
 On Tue, Aug 10, 2010 at 12:04 PM, Arthur Barstow art.bars...@nokia.com
 wrote:

 All - the Editors of the Indexed Database API would like to publish a new
 Working Draft:

  http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html

 If you have any comments or concerns about this proposal, please send them
 to public-webapps by August 10 at the latest.

 I assume you mean the 17th?

 As with all of our CfCs, positive response is preferred and encouraged and
 silence will be assumed to be assent.

 We support.





RE: [IndexedDB] Need a method to remove a database

2010-08-09 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Friday, August 06, 2010 2:34 AM

 On Fri, Aug 6, 2010 at 12:37 AM, Jonas Sicking jo...@sicking.cc wrote:
 On Thu, Aug 5, 2010 at 4:02 PM, Pablo Castro pablo.cas...@microsoft.com 
 wrote:
 
  -Original Message-
  From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] 
  On Behalf Of Jonas Sicking
  Sent: Thursday, August 05, 2010 2:12 PM
 
   I suggest we make removeDatabase (or whatever we call it) schedule a
   database to be deleted, but not actually delete it until all
   existing connections to it are closed (through either explicit calls to
   IDBDatabase.close() or the tab being closed).
  
   Any calls to IDBFactory.open with the same name will hold the callback
   until the removeDatabase() operation is finished. I.e. after all
   existing connections are closed and the database is removed.
  
   This is similar to how setVersion works.
  
   If we're not going to keep it simple, then we should match the 
   setVersion
   semantics as much as is possible.  I.e. add the blocked event and 
   stuff like
   that.
 
  The blocked event fires on the IDBDatabase object. Do we want to
  require that the database is opened before it can be removed? I don't
  really feel strongly either way.
 
  The other question is if we should fire a versionchange event on
  other open IDBDatabases, like setVersion does. Or should we fire a
  holy hell, your database is about to get nuked! event? The former
  would keep things simpler since there is just one event to listen to.
  The latter might be more correct.
 
  / Jonas
 
  I like the idea of just scheduling the database to be deleted once the 
  last connection to it closes, and also preventing any new connection from 
  being established  once the database has been scheduled for deletion. 
  This adds as little surface area as possible to the API.
 
  If we find that that's not a good idea for some reason, I wonder if we 
  should unify the versionchange event and this into a single stuff 
  seriously changed event where subscribers need to close their handles and 
  let go of any assumptions they had about the database. Once they can 
  re-open, they need to re-establish all their context (this is already true 
  for a version change, we may as well extend it to database deletes and any 
  other future big changes to the database schema, options, etc.)
 Here's my proposal, please poke holes in it:

 interface IDBFactory {
 ...
 IDBRequest deleteDatabase(in DOMString name);
 ...
 };

 When deleteDatabase is called, the given database is scheduled for
 deletion. If any IDBDatabase objects are opened to the database fire a
 versionchange event on those IDBDatabase objects, with a .version
 set to null. If any calls to IDBFactory.open occur, stall those until
 after this algorithm is finished. Note that this generally won't mean
 that those open calls will fail. They'll generally receive a
 newly created database instead.

 Once all existing IDBDatabase are closed (implicitly or explicitly),
 the database is removed. At this point any IDBFactory.open calls are
 fulfilled and a success event is fired on the returned IDBRequest.

 So no blocked event is fired, as I'm not sure where to fire it. I'm
 also not sure that this is a big problem. I'm not even sure that
 returning an IDBRequest is worth it. The only value I can see is
 wanting to show the user when a database is for sure deleted, so as
 to allow the user to, for example, safely shut down the computer without
 worrying that sensitive data is still in the database.

 All of this sounds good to me.  I'd probably still return an IDBRequest 
 for consistency and so that the app can get a confirmation when it's really 
 gone.  onsuccess would fire with a null result field, I'd think.

This looks good to me too. I agree with still having deleteDatabase return an 
IDBRequest so the caller can tell when the operation is done.
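
For illustration, here's a minimal sketch of how a page might use the proposed 
API, under the semantics described above (the event shapes and the null 
version convention are assumptions based on this thread, not final spec text):

  // Open a connection and be prepared to get out of the way of a delete.
  var openReq = indexedDB.open("mail");
  openReq.onsuccess = function (e) {
    var db = e.result;
    db.onversionchange = function (evt) {
      // Per the proposal, a pending delete shows up as a versionchange
      // with a null version; closing lets the delete proceed.
      if (evt.version === null) {
        db.close();
      }
    };
  };

  // Schedule the delete; success fires only once all connections close
  // and the database is actually gone.
  var delReq = indexedDB.deleteDatabase("mail");
  delReq.onsuccess = function (e) {
    // e.result would be null, per the discussion above.
  };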

-pablo




RE: [IndexedDB] Need a method to remove a database

2010-08-05 Thread Pablo Castro

-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jonas Sicking
Sent: Thursday, August 05, 2010 2:12 PM

  I suggest we make removeDatabase (or whatever we call it) schedule a
  database to be deleted, but not actually delete it until all
  existing connections to it are closed (through either explicit calls to
  IDBDatabase.close(), or through the tab being closed).
 
  Any calls to IDBFactory.open with the same name will hold the callback
  until the removeDatabase() operation is finished. I.e. after all
  existing connections are closed and the database is removed.
 
  This is similar to how setVersion works.
 
  If we're not going to keep it simple, then we should match the setVersion
  semantics as much as is possible.  I.e. add the blocked event and stuff 
  like
  that.

 The blocked event fires on the IDBDatabase object. Do we want to
 require that the database is opened before it can be removed? I don't
 really feel strongly either way.

 The other question is if we should fire a versionchange event on
 other open IDBDatabases, like setVersion does. Or should we fire a
 "holy hell, your database is about to get nuked!" event? The former
 would keep things simpler since there is just one event to listen to.
 The latter might be more correct.

 / Jonas

I like the idea of just scheduling the database to be deleted once the last 
connection to it closes, and also preventing any new connection from being 
established once the database has been scheduled for deletion. This adds as 
little surface area as possible to the API.

If we find that that's not a good idea for some reason, I wonder if we should 
unify the versionchange event and this into a single stuff seriously 
changed event where subscribers need to close their handles and let go of any 
assumptions they had about the database. Once they can re-open, they need to 
re-establish all their context (this is already true for a version change, we 
may as well extend it to database deletes and any other future big changes to 
the database schema, options, etc.)

-pablo




RE: [IndexedDB] Need a method to remove a database

2010-08-04 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Wednesday, August 04, 2010 2:56 AM

 On Tue, Aug 3, 2010 at 11:26 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Aug 3, 2010 at 3:20 PM, Shawn Wilsher sdwi...@mozilla.com wrote:
  Hey all,
 
  Some of the feedback I've been seeing on the web is that there is no way to
  remove a database.  Examples seem to be "web page wants to allow the user to
  remove the data they stored."  A site can almost accomplish this now by
  removing all object stores, but we still end up storing some metadata
  (version number).  Does this seem like a legit request to everyone?
 Sounds legit to me. Feel somewhat embarrassed that I've missed this so far :)

 Agreed.

 What should the semantics be for open database connections?  We could do 
 something like setVersion, but I'd just as soon nuke any existing connection 
 (i.e. make all future operations fail).  This seems  reasonable since the 
 reasons we didn't do this for setVersion (data loss) don't really seem to 
 apply here.

 J

+1

Nuking is fine...another option would be to queue up the delete until all 
database sessions are gone, but that would probably complicate things and not 
add much. The only thing I wonder is whether we'll create a bunch of pain for 
implementations where nuking is tricky (thinking of multi-process scenarios 
where maybe files are locked or something).

-pablo
 



RE: [IndexedDB] Need a method to clear an object store

2010-08-04 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jonas Sicking
Sent: Tuesday, August 03, 2010 12:21 PM

 On Tue, Aug 3, 2010 at 12:09 PM, ben turner bent.mozi...@gmail.com wrote:
  Hi folks,
 
  Currently there are only two ways to clear an object store of all
   data: (i) remove the object store and recreate it, or (ii) open a
   cursor and call remove() for all entries. I propose a third, simpler
  approach:
 
  interface IDBObjectStore
  {
   ...
   void clear();
   ...
  };
 
  Any thoughts?

 Some background. At least in our implementation, removing each
 individual item is significantly slower than removing and recreating
 the objectStore. It's also significantly slower than a 'clear'
 function is. And while tearing down and recreating the objectStore
 works, it's fairly complex if there are multiple indexes on the store.
 Adding a clear() function, while redundant, should make things easier
 for developers while adding very little work in the implementation.

  I think there is a bug in the above proposal though. clear() should
  return an IDBRequest. However the .result of the request should likely
  be null.

 / Jonas

+1 on having clear(). We also ran into the need while playing with samples and 
such.
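
As a quick illustration, a sketch of the difference (store and transaction 
names are hypothetical; the request semantics follow the proposal above):

  // With the proposal: one request empties the store.
  var clearReq = trans.objectStore("words").clear();
  clearReq.onsuccess = function (e) {
    // e.result would likely be null, per Jonas' note above.
  };

  // Without it: cursor through the store and remove entries one by one,
  // which is significantly slower in at least some implementations.
  trans.objectStore("words").openCursor().onsuccess = function (e) {
    var cursor = e.result;
    if (!cursor) return; // done
    cursor.remove();
    cursor.continue();
  };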

-pablo




RE: [IndexedDB] Current editor's draft

2010-07-22 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Thursday, July 22, 2010 11:27 AM

 On Thu, Jul 22, 2010 at 3:43 AM, Nikunj Mehta nik...@o-micron.com wrote:
 
  On Jul 16, 2010, at 5:41 AM, Pablo Castro wrote:
 
 
  From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy 
  Orlow
  Sent: Thursday, July 15, 2010 8:41 AM
 
  On Thu, Jul 15, 2010 at 4:30 PM, Andrei Popescu andr...@google.com 
  wrote:
  On Thu, Jul 15, 2010 at 3:24 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Thu, Jul 15, 2010 at 3:09 PM, Andrei Popescu andr...@google.com 
  wrote:
 
  On Thu, Jul 15, 2010 at 9:50 AM, Jeremy Orlow jor...@chromium.org 
  wrote:
  Nikunj, could you clarify how locking works for the dynamic
  transactions proposal that is in the spec draft right now?
 
  I'd definitely like to hear what Nikunj originally intended here.
 
 
  Hmm, after re-reading the current spec, my understanding is that:
 
   - Scope consists of a set of object stores that the transaction
   operates on.
  - A connection may have zero or one active transactions.
  - There may not be any overlap among the scopes of all active
  transactions (static or dynamic) in a given database. So you cannot
  have two READ_ONLY static transactions operating simultaneously over
  the same object store.
  - The granularity of locking for dynamic transactions is not specified
  (all the spec says about this is do not acquire locks on any database
  objects now. Locks are obtained as the application attempts to access
  those objects).
   - Using dynamic transactions can lead to deadlocks.
 
  Given the changes in 9975, here's what I think the spec should say for
  now:
 
  - There can be multiple active static transactions, as long as their
  scopes do not overlap, or the overlapping objects are locked in modes
  that are not mutually exclusive.
  - [If we decide to keep dynamic transactions] There can be multiple
  active dynamic transactions. TODO: Decide what to do if they start
  overlapping:
    -- proceed anyway and then fail at commit time in case of
  conflicts. However, I think this would require implementing MVCC, so
  implementations that use SQLite would be in trouble?
 
  Such implementations could just lock more conservatively (i.e. not 
  allow
  other transactions during a dynamic transaction).
 
  Umm, I am not sure how useful dynamic transactions would be in that
  case...Ben Turner made the same comment earlier in the thread and I
  agree with him.
 
  Yes, dynamic transactions would not be useful on those implementations, 
  but the point is that you could still implement the spec without a MVCC 
  backend--though it  would limit the concurrency that's possible.  
  Thus implementations that use SQLite would NOT necessarily be in 
  trouble.
 
  Interesting, I'm glad this conversation came up so we can sync up on 
   assumptions...mine were:
  - There can be multiple transactions of any kind active against a given 
  database session (see note below)
  - Multiple static transactions may overlap as long as they have 
  compatible modes, which in practice means they are all READ_ONLY
  - Dynamic transactions have arbitrary granularity for scope 
  (implementation specific, down to row-level locking/scope)
 
  Dynamic transactions should be able to lock as little as necessary and as 
  late as required.

 So dynamic transactions, as defined in your proposal, didn't lock on a
 whole-objectStore level? If so, how does the author specify which rows
  are locked? And why is openObjectStore then an asynchronous operation
 that could possibly fail, since at the time when openObjectStore is
 called, the implementation doesn't know which rows are going to be
 accessed and so can't determine if a deadlock is occurring? And is it
 only possible to lock existing rows, or can you prevent new records
 from being created? And is it possible to only use read-locking for
 some rows, but write-locking for others, in the same objectStore?

That's my interpretation, dynamic transactions don't lock whole object stores. 
To me dynamic transactions are the same as what typical SQL databases do today. 

The author doesn't explicitly specify which rows to lock. All rows that you 
see become locked (e.g. through get(), put(), scanning with a cursor, etc.). 
If you start the transaction as read-only then they'll all have shared locks. 
If you start the transaction as read-write then we can choose whether the 
implementation should always attempt to take exclusive locks or if it should 
take shared locks on read, and attempt to upgrade to an exclusive lock on first 
write (this affects failure modes a bit).

Regarding deadlocks, that's right, the implementation cannot determine if a 
deadlock will occur ahead of time. Sophisticated implementations could track 
locks/owners and do deadlock detection, although a simple timeout-based 
mechanism is probably enough for IndexedDB.

As for locking only existing rows, that depends on how much isolation we want 
to provide. If we want serializable, then we'd have to put in things such as 
range locks and locks on non-existing keys so reads are consistent w.r.t. 
newly created rows.

RE: [IndexedDB] Current editor's draft

2010-07-22 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Thursday, July 22, 2010 5:18 PM

  The author doesn't explicitly specify which rows to lock. All rows that 
  you see become locked (e.g. through get(), put(), scanning with a 
  cursor, etc.). If you start the transaction as read-only then they'll all 
  have shared locks. If you start the transaction as read-write then we can 
  choose whether the implementation should always attempt to take exclusive 
  locks or if it should take shared locks on read, and attempt to upgrade to 
  an exclusive lock on first write (this affects failure modes a bit).

 What counts as "see"? If you iterate using an index-cursor all the
 rows that have some value between A and B, but another, not yet
 committed, transaction changes a row such that its value now is
 between A and B, what happens?

We need to design something a bit more formal that covers the whole spectrum. 
As a short answer, assuming we want to have serializable as our isolation 
level, then we'd have a range lock that goes from the start of a cursor to the 
point you've reached, so if you were to start another cursor you'd be 
guaranteed the exact same view of the world. In that case it wouldn't be 
possible for another transaction to insert a row between two rows you scanned 
through with a cursor.
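
To make that concrete, here's a hedged sketch of the interleaving in question 
(names are illustrative, and this presumes dynamic-transaction-style 
concurrency where both transactions can be active at once):

  // Transaction 1 scans an index; under serializable isolation a range
  // lock would cover everything from the start of the scan up to the
  // cursor's current position.
  var t1 = db.transaction(["people"], READ_ONLY);
  t1.objectStore("people").index("countIndex").openCursor().onsuccess =
    function (e) {
      var cursor = e.result;
      if (!cursor) return;
      cursor.continue(); // extends the locked range as we advance
    };

  // Transaction 2 tries to insert into the already-scanned range; with
  // the range lock in place this put() would block (or eventually time
  // out) until t1 finishes, so a second scan in t1 sees the same rows.
  var t2 = db.transaction(["people"], READ_WRITE);
  t2.objectStore("people").put({ name: "Frida", count: 31 });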

-pablo




RE: [IndexedDB] Current editor's draft

2010-07-22 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Thursday, July 22, 2010 5:25 PM

  Regarding deadlocks, that's right, the implementation cannot determine if
  a deadlock will occur ahead of time. Sophisticated implementations could
  track locks/owners and do deadlock detection, although a simple
  timeout-based mechanism is probably enough for IndexedDB.
 
  Simple implementations will not deadlock because they're only doing object
  store level locking in a constant locking order.

Well, it's not really simple vs sophisticated, but whether they do dynamically 
scoped transactions or not, isn't it? If you do dynamic transactions, then 
regardless of the granularity of your locks, code will grow the lock space in a 
way that you cannot predict, so you can't use a well-known locking order and 
deadlocks are not avoidable. 

   Sophisticated implementations will be doing key level (IndexedDB's analog
  to row level) locking with deadlock detection or using methods to 
  completely
  avoid it.  I'm not sure I'm comfortable with having one or two in-between
  implementations relying on timeouts to resolve deadlocks.

Deadlock detection is quite a bit to ask from the storage engine. From the 
developer's perspective, the difference between deadlock detection and timeouts 
for deadlocks is the fact that the timeout approach will take a bit longer, and 
the error won't be as definitive. I don't think this particular difference is 
enough to require deadlock detection.

  Of course, if we're breaking deadlocks that means that web developers need
  to handle this error case on every async request they make.  As such, I'd
  rather that we require implementations to make deadlocks impossible.  This
  means that they either need to be conservative about locking or to do MVCC
  (or something similar) so that transactions can continue on even beyond the
  point where we know they can't be serialized.  This would 
  be consistent with
  our usual policy of trying to put as much of the burden as is practical on
  the browser developers rather than web developers.

Same as above...MVCC is quite a bit to mandate from all implementations. For 
example, from my basic understanding of SQLite, I think it always does 
straight-up locking and doesn't have support for versioning.

 
  As for locking only existing rows, that depends on how much isolation we
  want to provide. If we want serializable, then we'd have to put in 
  things
  such as range locks and locks on non-existing keys so reads are consistent
  w.r.t. newly created rows.
 
  For the record, I am completely against anything other than serializable
  being the default.  Everything a web developer deals with follows run to
  completion.  If you want to have optional modes that relax things in terms
  of serializability, maybe we should start a new thread?

 Agreed.

 I was against dynamic transactions even when they used
 whole-objectStore locking. So I'm even more so now that people are
 proposing row-level locking. But I'd like to understand what people
 are proposing, and make sure that what is being proposed is a coherent
 solution, so that we can correctly evaluate its risks versus
 benefits.

The way I see the risk/benefit tradeoff of dynamic transactions: they bring 
better concurrency and more flexibility at the cost of new failure modes. I 
think that weighing them in those terms is more important than the specifics 
such as whether it's okay to have timeouts versus explicit deadlock errors. 

-pablo





RE: [IndexedDB] Current editor's draft

2010-07-22 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Thursday, July 22, 2010 5:30 PM

 On Thu, Jul 22, 2010 at 5:26 PM, Pablo Castro
 pablo.cas...@microsoft.com wrote:
 
  From: Jonas Sicking [mailto:jo...@sicking.cc]
  Sent: Thursday, July 22, 2010 5:18 PM
 
   The author doesn't explicitly specify which rows to lock. All rows 
   that you see become locked (e.g. through get(), put(), scanning with 
   a cursor, etc.). If you start the transaction as read-only then 
   they'll all have shared locks. If you start the transaction as 
   read-write then we can choose whether the implementation should always 
   attempt to take exclusive locks or if it should take shared locks on 
   read, and attempt to upgrade to an exclusive lock on first write (this 
   affects failure modes a bit).

 
   What counts as "see"? If you iterate using an index-cursor all the
  rows that have some value between A and B, but another, not yet
  committed, transaction changes a row such that its value now is
  between A and B, what happens?
 
  We need to design something a bit more formal that covers the whole 
  spectrum. As a short answer, assuming we want to have serializable as 
  our isolation level, then we'd have a range lock that goes from the start 
  of a cursor to the point you've reached, so if you were to start another 
  cursor you'd be guaranteed the exact same view of the world. In that case 
   it wouldn't be possible for another transaction to insert a row between two 
  rows you scanned through with a cursor.

 How would you prevent that? Would a call to .modify() or .put() block
 until the other transaction finishes? With appropriate timeouts on
 deadlocks of course.

That's right, calls would block if they need to acquire a lock for a key or a 
range and there is an incompatible lock present that overlaps somehow with that.

-pablo




RE: [IndexedDB] Cursors and modifications

2010-07-15 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Thursday, July 15, 2010 2:04 AM

On Thu, Jul 15, 2010 at 2:44 AM, Jonas Sicking jo...@sicking.cc wrote:
On Wed, Jul 14, 2010 at 6:20 PM, Pablo Castro pablo.cas...@microsoft.com 
wrote:

  If it's accurate, as a side note, for the async API it seems that this 
  makes it more interesting to enforce callback order, so we can more easily 
  explain what we mean by before.
 Indeed.

 What do you mean by "enforce callback order"?  Are you saying that callbacks 
 should be done in the order the requests are made (rather than prioritizing 
 cursor callbacks)?  (That's how I read it, but Jonas' "Indeed" makes me 
 suspect I missed something. :-)

That's right. If changes are visible as they are made within a transaction, 
then reordering the callbacks would have a visible effect. In particular if we 
prioritize the cursor callbacks then you'll tend to see a callback for a cursor 
move before you see a callback for say an add/modify, and it's not clear at 
that point whether the add/modify happened already and is visible (but the 
callback didn't land yet) or if the change hasn't happened yet. If callbacks 
are in order, you see changes within your transaction strictly in the order 
that each request is made, avoiding surprises in cursor callbacks. 
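
A small example of the difference (store name hypothetical, in-order callback 
semantics assumed):

  var store = trans.objectStore("words");
  var putReq = store.put({ name: "delta", myModifiedValue: 17 }); // request 1
  var curReq = store.openCursor();                                // request 2
  putReq.onsuccess = function (e) {
    // Fires first under in-order callbacks: the write is now visible.
  };
  curReq.onsuccess = function (e) {
    // Fires second, so whether the cursor sees the modified "delta" row
    // is unambiguous. If cursor callbacks were prioritized, this could
    // fire first and the state of the put would be unclear.
  };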

-pablo




RE: [IndexedDB] Current editor's draft

2010-07-15 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Thursday, July 15, 2010 8:41 AM

On Thu, Jul 15, 2010 at 4:30 PM, Andrei Popescu andr...@google.com wrote:
On Thu, Jul 15, 2010 at 3:24 PM, Jeremy Orlow jor...@chromium.org wrote:
 On Thu, Jul 15, 2010 at 3:09 PM, Andrei Popescu andr...@google.com wrote:

 On Thu, Jul 15, 2010 at 9:50 AM, Jeremy Orlow jor...@chromium.org wrote:
   Nikunj, could you clarify how locking works for the dynamic
   transactions proposal that is in the spec draft right now?
  
   I'd definitely like to hear what Nikunj originally intended here.
  
 
  Hmm, after re-reading the current spec, my understanding is that:
 
  - Scope consists of a set of object stores that the transaction operates
  on.
  - A connection may have zero or one active transactions.
  - There may not be any overlap among the scopes of all active
  transactions (static or dynamic) in a given database. So you cannot
  have two READ_ONLY static transactions operating simultaneously over
  the same object store.
  - The granularity of locking for dynamic transactions is not specified
  (all the spec says about this is do not acquire locks on any database
  objects now. Locks are obtained as the application attempts to access
  those objects).
  - Using dynamic transactions can lead to deadlocks.
 
  Given the changes in 9975, here's what I think the spec should say for
  now:
 
  - There can be multiple active static transactions, as long as their
  scopes do not overlap, or the overlapping objects are locked in modes
  that are not mutually exclusive.
  - [If we decide to keep dynamic transactions] There can be multiple
  active dynamic transactions. TODO: Decide what to do if they start
  overlapping:
    -- proceed anyway and then fail at commit time in case of
  conflicts. However, I think this would require implementing MVCC, so
  implementations that use SQLite would be in trouble?
 
  Such implementations could just lock more conservatively (i.e. not allow
  other transactions during a dynamic transaction).
 
 Umm, I am not sure how useful dynamic transactions would be in that
 case...Ben Turner made the same comment earlier in the thread and I
 agree with him.

 Yes, dynamic transactions would not be useful on those implementations, but 
 the point is that you could still implement the spec without a MVCC 
 backend--though it would limit the concurrency that's possible.  Thus 
 implementations that use SQLite would NOT necessarily be in trouble.

Interesting, I'm glad this conversation came up so we can sync up on 
assumptions...mine were:
- There can be multiple transactions of any kind active against a given 
database session (see note below)
- Multiple static transactions may overlap as long as they have compatible 
modes, which in practice means they are all READ_ONLY
- Dynamic transactions have arbitrary granularity for scope (implementation 
specific, down to row-level locking/scope)
- Overlapping between statically and dynamically scoped transactions follows 
the same rules as static-static overlaps; they can only overlap on compatible 
scopes. The only difference is that dynamic transactions may need to block 
mid-flight until they can grab the resources they need to proceed.

Note: for some databases having multiple transactions active on a single 
connection may be an unsupported thing. This could probably be handled in the 
IndexedDB layer though by using multiple connections under the covers.

-pablo




RE: [IndexedDB] Cursors and modifications

2010-07-15 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Thursday, July 15, 2010 11:59 AM

On Thu, Jul 15, 2010 at 11:02 AM, Pablo Castro
pablo.cas...@microsoft.com wrote:
 
  From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy 
  Orlow
  Sent: Thursday, July 15, 2010 2:04 AM
 
  On Thu, Jul 15, 2010 at 2:44 AM, Jonas Sicking jo...@sicking.cc wrote:
  On Wed, Jul 14, 2010 at 6:20 PM, Pablo Castro pablo.cas...@microsoft.com 
  wrote:
 
   If it's accurate, as a side note, for the async API it seems that this 
   makes it more interesting to enforce callback order, so we can more 
   easily explain what we mean by before.
  Indeed.
 
  What do you mean by "enforce callback order"?  Are you saying that 
  callbacks should be done in the order the requests are made (rather than 
  prioritizing cursor callbacks)?  (That's how I read it, but Jonas' 
  "Indeed" makes me suspect I missed something. :-)
 
  That's right. If changes are visible as they are made within a 
  transaction, then reordering the callbacks would have a visible effect. In 
  particular if we prioritize the cursor callbacks then you'll tend to see a 
  callback for a cursor move before you see a callback for say an 
  add/modify, and it's not clear at that point whether the add/modify 
  happened already and is visible (but the callback didn't land yet) or if 
  the change hasn't happened yet. If callbacks are in order, you see changes 
  within your transaction strictly in the order that each request is made, 
  avoiding surprises in cursor callbacks.

 Oh, I took what you said just as that we need to have a defined
 callback order, not anything in particular about what that definition
 should be.

 Regarding when a modification happens, I think the design should be
 that changes logically happen as soon as the 'success' call is fired.
 Any success calls after that will see the modified values.

Yep, I agree with this: a change has happened for sure when you see the success 
callback. Before that you may or may not observe the change if you do a get or 
open a cursor to look at the record.
 
 I still think given the quite substantial speedups gained from
 prioritizing cursor callbacks, that it's the right thing to do. It
 arguably also has some benefits from a practical point of view when it
 comes to the very topic we're discussing. If we prioritize cursor
 callbacks, that makes it much easier to iterate a set of entries and
 update them, without having to worry about those updates messing up
 your iterator.

I hear you on the perf implications, but I'm worried that non-sequential order 
for callbacks will be completely non-intuitive for users. In particular, if 
you're changing things as you scan a cursor and then cursor through those 
changes, you're not sure whether you'll see the changes or not (because the 
callback is the only definitive point where the change is visible). That seems 
quite problematic...

-pablo
 



RE: [IndexedDB] Current editor's draft

2010-07-14 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Wednesday, July 14, 2010 12:07 AM

  Dynamic transactions:
  I see that most folks would like to see these going away. While I like the 
  predictability and simplifications that we're able to make by using static 
  scopes for transactions, I worry that we'll close the door for two 
  scenarios: background tasks and query processors. Background tasks such as 
  synchronization and post-processing of content would seem to be almost 
  impossible with the static scope approach, mostly due to the granularity 
  of the scope specification (whole stores). Are we okay with saying that 
  you can't for example sync something in the background (e.g. in a worker) 
  while your app is still working? Am I missing something that would enable 
  this class of scenarios? Query processors are also tricky because you 
  usually take the query specification in some form after the transaction 
  started (especially if you want to execute multiple queries with later 
  queries depending on the outcome of the previous ones). The background 
  tasks issue in particular looks pretty painful to me if we don't have a 
  way to achieve it without freezing the application while it happens.

 I don't understand enough of the details here to be able to make a
 decision. The use cases you are bringing up I definitely agree are
 important, but I would love to look at even a rough draft of what code
 you are expecting people will need to write.

I'll try and hack up an example. In general any scenario that has a worker and 
the UI thread working on the same database will be quite a challenge, because 
the worker will have to a) split the work in small pieces, even if it was 
naturally a bigger chunk, and b) consider interleaving implications with the UI 
thread; otherwise, even when split in chunks, you're not guaranteed that one of 
the two won't starve the other one (the worker running on a tight loop will 
effectively always have an active transaction, it'll just be changing the 
actual transaction from time to time). This can certainly happen with dynamic 
transactions as well, the only difference is that since the locking granularity 
is different, it may be that what you're working on in the worker and in the UI 
threads is independent enough that they don't interfere too much, allowing for 
some more concurrency.
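
A rough sketch of the kind of chunking a worker would have to do (the API 
shapes and the completion callback are assumptions based on this thread, and 
transform() is a hypothetical helper):

  // Process a big job in small static transactions so the UI thread can
  // grab the database between chunks.
  function processNextChunk(db) {
    var trans = db.transaction(["queue"], READ_WRITE);
    var store = trans.objectStore("queue");
    var count = 0;
    store.openCursor().onsuccess = function (e) {
      var cursor = e.result;
      if (!cursor || count >= 50) return; // end of chunk; commit happens
      store.put(transform(cursor.value));
      count++;
      cursor.continue();
    };
    trans.oncomplete = function () {
      // Yield briefly so we don't immediately reacquire the locks.
      setTimeout(function () { processNextChunk(db); }, 0);
    };
  }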

 What I suggest is that we keep dynamic transactions in the spec for
 now, but separate the API from static transactions, start a separate
 thread and try to hammer out the details and see what we arrive at. I
 do want to clarify that I don't think dynamic transactions are
 particularly hard to implement, I just suspect they are hard to use
 correctly.

Sounds reasonable.

  Implicit commit:
  Does this really work? I need to play with sample app code more, it may 
  just be that I'm old-fashioned. For example, if I'm downloading a bunch of 
  data from somewhere and pushing rows into the store within a transaction, 
  wouldn't it be reasonable to do the whole thing in a transaction? In that 
  case I'm likely to have to unwind while I wait for the next callback from 
  XmlHttpRequest with the next chunk of data.

 You definitely want to do it in a transaction. In our proposal there
 is no way to even call .get or .put if you aren't inside a
 transaction. For the case you are describing, you'd download the data
 using XMLHttpRequest first. Once the data has been downloaded you
 start a transaction, parse the data, and make the desired
 modifications. Once that is done the transaction is automatically
 committed.

 The idea here is to avoid keeping transactions open for long periods
 of time, while at the same time making the API easier to work with.
 I'm very concerned that any API that requires people to do:

 startOperation();
... do lots of stuff here ...
 endOperation();

 people will forget to do the endOperation call. This is especially
 true if the startOperation/endOperation calls are spread out over
 multiple different asynchronously called functions, which seems to be
 the use case you're concerned about above. One very easy way to
 forget to call endOperation is if something inbetween the two
 function calls throw an exception.

Fair enough, maybe I need to think of this scenario differently. If someone 
needs to download a bunch of data and then put it in the database atomically, 
the right way is to download into work tables first, over a long time and 
independent transactions, and then use a transaction only to move the data 
around into its final spot.
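
A sketch of that shape (URL and store name hypothetical): all the slow network 
work happens outside any transaction, and the database work is one short 
transaction that auto-commits when its requests drain.

  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/export.json");
  xhr.onload = function () {
    var rows = JSON.parse(xhr.responseText);
    // One short-lived transaction for all the writes.
    var trans = db.transaction(["words"], READ_WRITE);
    var store = trans.objectStore("words");
    for (var i = 0; i < rows.length; i++) {
      store.put(rows[i]);
    }
    // No explicit commit: the transaction commits implicitly once no
    // more requests are pending, per the proposal being discussed.
  };
  xhr.send();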

  This will likely be extra bad for transactions where no write
  operations are done. In this case failure to call a 'commit()'
  function won't result in any broken behavior. The transaction will
  just sit open for a long time and eventually be rolled back, though
  since no changes were done, the rollback is transparent, and the only
  noticeable effect is that the application halts for a while while the
  transaction is waiting to time out.

RE: [IndexedDB] Current editor's draft

2010-07-14 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Wednesday, July 14, 2010 12:10 AM

On Wed, Jul 14, 2010 at 3:52 AM, Pablo Castro pablo.cas...@microsoft.com 
wrote:

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Andrei Popescu
Sent: Monday, July 12, 2010 5:23 AM

  Dynamic transactions:
  I see that most folks would like to see these going away. While I like 
  the predictability and simplifications that we're able to make by using 
  static scopes for transactions, I worry that we'll close the door for two 
  scenarios: background tasks and query processors. Background tasks such 
  as synchronization and post-processing of content would seem to be almost 
  impossible with the static scope approach, mostly due to the granularity 
  of the scope specification (whole stores). Are we okay with saying that 
  you can't for example sync something in the background (e.g. in a worker) 
  while your app is still working? Am I missing something that would enable 
  this class of scenarios? Query processors are also tricky because you 
  usually take the query specification in some form after the transaction 
  started (especially if you want to execute multiple queries with later 
  queries depending on the outcome of the previous ones). The background 
  tasks issue in particular looks pretty painful to me if we don't have a 
  way to achieve it without freezing the application while it happens.

 Well, the application should never freeze in terms of the UI locking up, but 
 in what you described I could see it taking a while for data to show up on 
 the screen.  This is something that can be fixed by doing smaller updates on 
 the background thread, sending a message to the background thread that it 
 should abort for now, doing all database access on the background thread, 
 etc.

This is an issue regardless, isn't it? Let's say you have a worker churning on 
the database somehow. The worker has no UI or user to wait for, so it'll run in 
a tight loop at full speed. If it splits the work in small transactions, in 
cases where it doesn't have to wait for something external there will still be 
only a small gap between transactions. That could easily starve the UI thread, 
which needs to find an opportunity to get in and do a quick thing against the 
database. As you say, the difference between freezing and locking up at this 
point is not that critical, as the end user is just waiting either way.

 One point that I never saw made in the thread that I think is really 
 important is that dynamic transactions can make concurrency worse in some 
 cases.  For example, with dynamic transactions you can get into live-lock 
 situations.  Also, using Pablo's example, you could easily get into a 
 situation where the long running transaction on the worker keeps hitting 
 serialization issues and thus it's never able to make progress.

While it could certainly happen, I don't remember seeing something like a 
live-lock in a long, long time. Deadlocks are common, but a simple timeout will 
kill one of the transactions and let the other make progress. A bit violent, 
but always effective. 

 I do see that there are use cases where having dynamic transactions would be 
 much nicer, but the amount of non-determinism they add (including to 
 performance) has me pretty worried.  I pretty firmly believe we should look 
 into adding them in v2 and remove them for now.  If we do leave them in, it 
 should definitely be in its own method to make it quite clear that the 
 semantics are more complex.
 
Let's explore a bit more and see where we land. I'm not pushing for dynamic 
transactions themselves, but more for the scenarios they enable (background 
processing and such). If we find other ways of doing that, then all the better. 
Having different entry points is reasonable.

  Nested transactions:
  Not sure why we're considering this an advanced scenario. To be clear 
  about what the feature means to me: make it legal to start a transaction 
  when one is already in progress, and the nested one is effectively a 
  no-op, just refcounts the transaction, so you need equal amounts of 
  commit()'s, implicit or explicit, and an abort() cancels all nested 
  transactions. The purpose of this is to allow composition, where a piece 
  of code that needs a transaction can start one locally, independently of 
  whether the caller had already one going.

 I believe it's actually a bit more tricky than what you said.  For example, 
 if we only support static transactions, will we require that any nested 
 transaction only request a subset of the locks the outer one took?  What if 
 we try to start a dynamic transaction inside of a static one?  Etc.  But I 
 agree it's not _that_ tricky and I'm also not convinced it's an advanced 
 feature.

 I'd suggest we take it out for now and look at re-adding it when the basics 
 of the async API are more solidified.  I hope we can

RE: [IndexedDB] IDBRequest.abort on writing requests

2010-07-14 Thread Pablo Castro
From my perspective cancelling is not something that happens that often, and 
when it happens it's probably ok to cancel the whole transaction. If we can 
spec abort() on the transaction object such that it tries to cancel all pending 
operations and then rolls back any work that has been done so far, then we 
probably don't need abort on individual operations (with the added value that 
it's uniform across read and write operations).
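
A sketch of what that uniform shape would look like (names follow the examples 
elsewhere in this thread; the transaction-level abort() is the proposal, not 
settled spec):

  var trans = db.transaction(["words"], READ_WRITE);
  var store = trans.objectStore("words");
  store.remove(12);
  store.add({ id: 12, name: "Benny Andersson" });

  // Cancels whatever is still pending and rolls back anything already
  // applied, so there's no per-request race about whether an individual
  // write has executed yet.
  trans.abort();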

-pablo

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Wednesday, July 14, 2010 1:57 AM

On Wed, Jul 14, 2010 at 9:14 AM, Jonas Sicking jo...@sicking.cc wrote:
On Wed, Jul 14, 2010 at 1:02 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Wed, Jul 14, 2010 at 8:53 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Jul 13, 2010 at 11:33 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  On Wed, Jul 14, 2010 at 7:28 AM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Tue, Jul 13, 2010 at 11:12 PM, Jeremy Orlow jor...@chromium.org
  wrote:
   On Tue, Jul 13, 2010 at 9:41 PM, Jonas Sicking jo...@sicking.cc
   wrote:
  
   On Tue, Jul 13, 2010 at 1:17 PM, Jeremy Orlow jor...@chromium.org
   wrote:
On Tue, Jul 13, 2010 at 8:25 PM, Jonas Sicking jo...@sicking.cc
wrote:
   
Hi All,
   
Sorry if this is something that I've brought up before. I know I
meant
to bring this up in the past, but I couldn't find any actual
emails.
   
One thing that we discussed while implementing IndexedDB was what
to
do for IDBRequest.abort() or writing requests. For example on
the
request object returned from IDBObjectStore.remove() or
IDBCursor.update().
   
Ideal would of course be if it would cancel the write operation,
however this isn't always possible. If the call to .abort() comes
after the write operation has already executed in the database,
but
before the 'success' event has had a chance to fire. What's worse
is
that other write operations might already have been performed on
top
of the aborted request. Consider for example the following code:
   
req1 = myObjectStore.remove(12);
req2 = myObjectStore.add({ id: 12, name: "Benny Andersson" });
... do other stuff ...
req1.abort();
   
In this case, even if the database supported aborting a specific
operation, it's very hard to say what the correct thing to do
with
operations performed after it. As far as I know, databases
generally
don't support rolling back a given operation, only rolling back
to a
specific point, i.e. rolling back a given operation and all
operations
performed after it.
   
We could say that abort() signals some sort of error if the
operation
has already been performed in the database, however that makes
abort()
very racy.
   
Instead we concluded that the best thing to do was to specify
that
IDBRequest.abort() should throw if called on a modifying request.
If
this sounds good I'll make this change to the spec.
   
I'd be fine with that.
Or we could remove abort all together.  I can't really think of
what
types
of operations you'd really want to abort until (at least) we have
some
sort
of join language or other mechanism to do really expensive
read-only
calls.
  
   I think there are expensive-ish read-only calls. Indexes are
   effectively a join mechanism since you'll hit one b-tree to do the
   index lookup, and then a second b-tree to look up the full object in
   the objectStore.
  
   But each individual call (the scope of canceling an IDBRequest) is
   pretty
   short.
  
  
   I don't really feel strongly either way. I think abort() isn't too
   hard to implement, but also doesn't provide a ton of value. At least
   not, like you say, until we add expensive calls like getAll or
   multi-step joins.
  
   I agree that when we look at adding such calls we may want to add an
   abort
   on just IDBRequest, but until then I don't think it's a very useful
   feature.
    And being easy to add is not a good reason to lock ourselves into
   a particular design in the future.  I think we should remove it until
   there's a good reason for it to exist.
  
  
Or we could take abort off IDBRequest and instead put a rollback
on
transactions (and not do the modify limitation).
  
   I definitely think we should have IDBTransaction.abort() no matter
   what. And that should allow rolling back write operations.
  
   Agreed.  In which case it seems as though being able to abort
   individual
   operations isn't that important...especially given what we just
   talked
   about
   above.
   So can we just get rid of abort() on IDBRequest?
 
  I don't feel strongly either way. We'll probably keep them in the
  mozilla implementation since we have experimental
  objectStore.getAll(key) and index.getAllObjects(key) implementations,
  which both probably count 

RE: [IndexedDB] Current editor's draft

2010-07-14 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Wednesday, July 14, 2010 5:43 PM

On Wed, Jul 14, 2010 at 5:03 PM, Pablo Castro
pablo.cas...@microsoft.com wrote:

 From: Jonas Sicking [mailto:jo...@sicking.cc]
 Sent: Wednesday, July 14, 2010 12:07 AM


 I think what I'm struggling with is how dynamic transactions will help
 since they are still doing whole-objectStore locking. I'm also curious
  how you envision people dealing with deadlock hazards. Nikunj's
 examples in the beginning of this thread simply throw up their hands
 and report an error if there was a deadlock. That is obviously not
 good enough for an actual application.

 So in short, looking forward to an example :)

I'll try to come up with one, although I doubt the code itself will be very 
interesting in this particular case. Not sure what you mean by "they are still 
doing whole-objectStore locking". The point of dynamic transactions is that 
they *don't* lock the whole store, but instead have the freedom to choose the 
granularity (e.g. you could do row-level locking). 

As for deadlocks, whenever you're doing an operation you need to be ready to 
handle errors (out of disk, timeout, etc.). I'm not sure why deadlocks are 
different. If the underlying implementation has deadlock detection then you may 
get a specific error, otherwise you'll just get a timeout. 

  This will likely be extra bad for transactions where no write
  operations are done. In this case failure to call a 'commit()'
  function won't result in any broken behavior. The transaction will
   just sit open for a long time and eventually be rolled back, though
  since no changes were done, the rollback is transparent, and the only
  noticeable effect is that the application halts for a while while the
  transaction is waiting to time out.
 
  I should add that the WebSQLDatabase uses automatically committing
  transactions very similar to what we're proposing, and it seems to
  have worked fine there.
 
  I find this a bit scary, although it could be that I'm permanently tainted 
   with traditional database stuff. Typical databases follow a "presumed abort" 
  protocol, where if your code is interrupted by an exception, a process 
  crash or whatever, you can always assume transactions will be rolled back 
  if you didn't reach an explicit call to commit. The implicit commit here 
  takes that away, and I'm not sure how safe that is.
 
  For example, if I don't have proper exception handling in place, an 
  illegal call to some other non-indexeddb related API may throw an 
  exception causing the whole thing to unwind, at which point nothing will 
  be pending to do in the database and thus the currently active transaction 
  will be committed.
 
  Using the same line of thought we used for READ_ONLY, forgetting to call 
  commit() is easy to detect the first time you try out your code. Your 
  changes will simply not stick. It's not as clear as the READ_ONLY example 
  because there is no opportunity to throw an explicit exception with an 
  explanation, but the data not being around will certainly prompt 
  developers to look for the issue :)

 Ah, I see where we are differing in thinking. My main concern has been
 that of rollbacks, and associated dataloss, in the non-error case. For
 example people forget to call commit() in some branch of their code,
 thus causing dataloss when the transaction is rolled back.

 Your concern seems to be that of lack of rollback in the error case,
 for example when an exception is thrown and not caught somewhere in
 the code. In this case you'd want to have the transaction rolled back.

 One way to handle this is to try to detect unhandled errors and
 implicitly roll back the transaction. Two situations where we could do
 this is:
  1. When an 'error' event is fired, but .preventDefault() has not been
  called by any handler. The result is that if an error is ever
 fired, but no one explicitly handles it, we roll back the transaction.
 See also below.
 2. When a success handler is called, but the handler throws an exception.

 The second is a bit of a problem from a spec point of view. I'm not
 sure it is allowed by the DOM Events spec, or by all existing DOM
 Events implementations. I do still think we can pull it off though.
 This is something I've been thinking about raising for a while, but I
 wanted to nail down the raised issues first.

 Would you feel more comfortable with implicit commits if we did the above?

It does make it better, although this seems to introduce quite a few moving 
parts to the process. I still think an explicit commit() would be better, but 
I'm open to explore more options.
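
For reference, a sketch of rule 1 above from the page's point of view (handler 
shape assumed; the preventDefault() convention mirrors DOM Events):

  var req = store.put({ id: 12, name: "Benny Andersson" });
  req.onerror = function (e) {
    // Handling the error and calling preventDefault() signals "dealt
    // with it"; the transaction is not rolled back and can continue.
    e.preventDefault();
  };
  // If no handler calls preventDefault(), the error counts as unhandled
  // and, under the proposal, the whole transaction is rolled back.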

  And as you say, you still usually need error callbacks. In fact, we
  have found while writing examples using our implementation, that you
  almost always want to add a generic error handler. It's very easy to
  make a mistake, and if you don't add error handlers then these just go
  by silently, offering no help as to why your program

RE: [IndexedDB] Cursors and modifications

2010-07-14 Thread Pablo Castro
Making sure I get the essence of this thread: we're saying that cursors see 
live changes as they happen on objects that are after the object you're 
currently standing on; and of course, any other activity within a transaction 
sees all the changes that happened before that activity took place. Is that 
accurate? 

If it's accurate, as a side note, for the async API it seems that this makes it 
more interesting to enforce callback order, so we can more easily explain what 
we mean by before.

Thanks
-pablo


From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Wednesday, July 14, 2010 9:27 AM

On Wed, Jul 14, 2010 at 5:17 PM, Jonas Sicking jo...@sicking.cc wrote:
On Wed, Jul 14, 2010 at 5:12 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Thu, Jul 8, 2010 at 8:42 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Jul 5, 2010 at 9:45 AM, Andrei Popescu andr...@google.com wrote:
  On Sat, Jul 3, 2010 at 2:09 AM, Jonas Sicking jo...@sicking.cc wrote:
  On Fri, Jul 2, 2010 at 5:44 PM, Andrei Popescu andr...@google.com
  wrote:
  On Sat, Jul 3, 2010 at 1:14 AM, Jonas Sicking jo...@sicking.cc
  wrote:
  On Fri, Jul 2, 2010 at 4:40 PM, Pablo Castro
  pablo.cas...@microsoft.com wrote:
 
  From: public-webapps-requ...@w3.org
  [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking
  Sent: Friday, July 02, 2010 4:00 PM
 
   We ran into a complicated issue while implementing IndexedDB. In
   short, what should happen if an object store is modified while a
   cursor is iterating it?  Note that the modification can be done within
   the same transaction, so the read/write locks preventing several
   transactions from accessing the same table aren't helping here.
 
  Detailed problem description (this assumes the API proposed by
  mozilla):
 
   Consider an objectStore "words" containing the following objects:
   { name: "alpha" }
   { name: "bravo" }
   { name: "charlie" }
   { name: "delta" }
 
  and the following program (db is a previously opened IDBDatabase):
 
   var trans = db.transaction(["words"], READ_WRITE);
   var cursor;
   var result = [];
   trans.objectStore("words").openCursor().onsuccess = function(e) {
     cursor = e.result;
     result.push(cursor.value);
     cursor.continue();
   }
   trans.objectStore("words").get("delta").onsuccess = function(e) {
     trans.objectStore("words").put({ name: "delta", myModifiedValue: 17 });
   }
 
   When the cursor reads the "delta" entry, will it see the
   'myModifiedValue' property? Since we have so far defined the callback
   order to be the request order, that means that the put request will
   finish before the "delta" entry is iterated by the cursor.
 
  The problem is even more serious with cursors that iterate
  indexes.
  Here a modification can even affect the position of the currently
   iterated object in the index, and the modification can (if I'm
   reading the spec correctly) come from the cursor itself.
 
   Consider the following objectStore "people" with keyPath "name"
   containing the following objects:

   { name: "Adam", count: 30 }
   { name: "Bertil", count: 31 }
   { name: "Cesar", count: 32 }
   { name: "David", count: 33 }
   { name: "Erik", count: 35 }
 
   and an index "countIndex" with keyPath "count". What would the
   following code do?
 
   results = [];
   db.objectStore("people", READ_WRITE)
     .index("countIndex").openObjectCursor().onsuccess = function (e) {
     cursor = e.result;
     if (!cursor) {
       alert(results);
       return;
     }
     if (cursor.value.name == "Bertil") {
       cursor.update({name: "Bertil", count: 34 });
     }
     results.push(cursor.value.name);
     cursor.continue();
   };
 
   What does this alert? Would it alert "Adam,Bertil,Erik", as the
   cursor would stay on the "Bertil" object as it is moved in the
   index? Or would it alert "Adam,Bertil,Cesar,David,Bertil,Erik", as
   we would iterate "Bertil" again at its new position in the index?
 
   My first reaction is that both from the expected-behavior
   perspective (the transaction is the scope of isolation) and from the
   implementation perspective it would be better to see live changes if 
  they
  happened in the same transaction as the cursor (over a store or 
  index). So
  in your example you would iterate one of the rows twice. Maintaining 
  order
  and membership stable would mean creating another scope of isolation 
  within
  the transaction, which to me would be unusual and it would be probably 
  quite
  painful to implement without spilling a copy of the records to disk (at
  least a copy of the keys/order if you don't care about protecting from
  changes that don't affect membership/order; some databases call these 
  keyset
  cursors).
 
 
  We could say that cursors always iterate snapshots, however this
  introduces MVCC. Though it seems to me that SNAPSHOT_READ already 
  does that.
 
  Actually, even with MVCC you'd see your own changes, because they
  happen in the same transaction so the buffer pool will use the same 
  version
  of the page

RE: [IndexedDB] Current editor's draft

2010-07-13 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Andrei Popescu
Sent: Monday, July 12, 2010 5:23 AM

Sorry I disappeared for a while. Catching up with this discussion was an 
interesting exercise...there is no particular message in this thread I can 
respond to, so I thought I'd just reply to the last one. Overall I think the 
new proposal is shaping up well and is being effective in simplifying 
scenarios. I do have a few suggestions and questions for things I'm not sure I 
see all the way.

READ_ONLY vs READ_WRITE as defaults for transactions:
To be perfectly honest, I think this discussion went really deep over an issue 
that won't be a huge deal for most people. My perspective, trying to avoid 
performance or usage-frequency speculation, is around what's easier to detect. 
Concurrency issues are hard to see. On the other hand, whenever we can throw an 
exception and give explicit guidance, that unblocks people right away. For this 
case I suspect it's best to default to READ_ONLY, because if someone doesn't 
read or think about it and just uses the stuff and tries to change something, 
they'll get a clear error message saying "if you want to change stuff, use 
READ_WRITE please". The error is not data- or context-dependent, so it'll fail 
on first try, at most once per developer, and once they fix it they'll know for 
all future cases.
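
A sketch of the failure mode I have in mind (the exact exception is an 
assumption; the point is that the error is immediate and context-independent):

  // No mode given, so the transaction defaults to READ_ONLY.
  var trans = db.transaction(["words"]);
  var store = trans.objectStore("words");

  // This write fails right away, the first time the code runs, with a
  // clear "use READ_WRITE" style error, instead of surfacing later as a
  // hard-to-reproduce concurrency problem.
  store.put({ name: "echo" });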

Dynamic transactions:
I see that most folks would like to see these going away. While I like the 
predictability and simplifications that we're able to make by using static 
scopes for transactions, I worry that we'll close the door for two scenarios: 
background tasks and query processors. Background tasks such as synchronization 
and post-processing of content would seem to be almost impossible with the 
static scope approach, mostly due to the granularity of the scope specification 
(whole stores). Are we okay with saying that you can't for example sync 
something in the background (e.g. in a worker) while your app is still working? 
Am I missing something that would enable this class of scenarios? Query 
processors are also tricky because you usually take the query specification in 
some form after the transaction started (especially if you want to execute 
multiple queries with later queries depending on the outcome of the previous 
ones). The background tasks issue in particular looks pretty painful to me if 
we don't have a way to achieve it without freezing the application while it 
happens. 

Implicit commit:
Does this really work? I need to play with sample app code more, it may just be 
that I'm old-fashioned. For example, if I'm downloading a bunch of data from 
somewhere and pushing rows into the store within a transaction, wouldn't it be 
reasonable to do the whole thing in a transaction? In that case I'm likely to 
have to unwind while I wait for the next callback from XmlHttpRequest with the 
next chunk of data. I understand that avoiding it results in nicer patterns 
(e.g. db.objectStores("foo").get(123).onsuccess = ...), but in practice I'm not 
sure if that will hold given that you still need error callbacks and such.

Nested transactions:
Not sure why we're considering this an advanced scenario. To be clear about 
what the feature means to me: make it legal to start a transaction when one is 
already in progress; the nested one is effectively a no-op that just refcounts 
the transaction, so you need matching numbers of commit()s, implicit or explicit, 
and an abort() cancels all nested transactions. The purpose of this is to allow 
composition, where a piece of code that needs a transaction can start one 
locally, independently of whether the caller already had one going.
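
A sketch of the composition I mean, with a hypothetical nestable 
transaction()/commit() pair (this is the proposed refcounting semantics, not 
an existing API):

  // Library code: needs a transaction but doesn't care whether the
  // caller already has one going.
  function addWord(db, word) {
    var trans = db.transaction(["words"], READ_WRITE); // refcount +1 if nested
    trans.objectStore("words").put(word);
    trans.commit(); // refcount -1; only the outermost commit really commits
  }

  // Caller composes it inside its own transaction; an abort() anywhere
  // in the nest cancels the whole thing.
  var outer = db.transaction(["words"], READ_WRITE);
  addWord(db, { name: "foxtrot" });
  outer.commit();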

Schema versioning:
It's unfortunate that we need to have explicit elements in the page for the 
versioning protocol to work, but the fact that we can have a reliable mechanism 
for pages to coordinate a version bump is really nice. For folks that don't 
know about this the first time they build it, an explicit error message on the 
schema change timeout can explain where to start. I do think that there may be 
a need for non-breaking changes to the schema to happen without a version 
dance. For example, query processors regularly create temporary tables during 
sorts and such. Those shouldn't require any coordination (maybe we allow 
non-versioned additions, or we just introduce temporary, unnamed tables that 
evaporate on commit() or database close()...).

Thanks
-pablo




RE: [IndexedDB] Cursors and modifications

2010-07-02 Thread Pablo Castro

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jonas Sicking
Sent: Friday, July 02, 2010 4:00 PM

 We ran into a complicated issue while implementing IndexedDB. In short, 
 what should happen if an object store is modified while a cursor is 
 iterating it?  Note that the modification can be done within the same 
 transaction, so the read/write locks preventing several transactions from 
 accessing the same table aren't helping here.

 Detailed problem description (this assumes the API proposed by mozilla):

 Consider an objectStore "words" containing the following objects:
 { name: "alpha" }
 { name: "bravo" }
 { name: "charlie" }
 { name: "delta" }

 and the following program (db is a previously opened IDBDatabase):

 var trans = db.transaction(["words"], READ_WRITE);
 var cursor;
 var result = [];
 trans.objectStore("words").openCursor().onsuccess = function(e) {
   cursor = e.result;
   result.push(cursor.value);
   cursor.continue();
 }
 trans.objectStore("words").get("delta").onsuccess = function(e) {
   trans.objectStore("words").put({ name: "delta", myModifiedValue: 17 });
 }

 When the cursor reads the "delta" entry, will it see the 'myModifiedValue' 
 property? Since we have so far defined the callback order to be the request 
 order, that means that the put request will finish before the "delta" entry 
 is iterated by the cursor.

 The problem is even more serious with cursors that iterate indexes.
 Here a modification can even affect the position of the currently iterated 
 object in the index, and the modification can (if I'm reading the spec 
 correctly) come from the cursor itself.

 Consider the following objectStore "people" with keyPath "name"
 containing the following objects:

 { name: "Adam", count: 30 }
 { name: "Bertil", count: 31 }
 { name: "Cesar", count: 32 }
 { name: "David", count: 33 }
 { name: "Erik", count: 35 }

 and an index "countIndex" with keyPath "count". What would the following 
 code do?

 results = [];
 db.objectStore("people",
 READ_WRITE).index("countIndex").openObjectCursor().onsuccess = function (e) {
   cursor = e.result;
   if (!cursor) {
     alert(results);
     return;
   }
   if (cursor.value.name == "Bertil") {
     cursor.update({name: "Bertil", count: 34 });
   }
   results.push(cursor.value.name);
   cursor.continue();
 };

 What does this alert? Would it alert "Adam,Bertil,Erik", as the cursor would 
 stay on the "Bertil" object as it is moved in the index? Or would it alert 
 "Adam,Bertil,Cesar,David,Bertil,Erik", as we would iterate "Bertil" again at 
 its new position in the index?

My first reaction is that both from the expected-behavior perspective (the 
transaction is the scope of isolation) and from the implementation perspective, 
it would be better to see live changes if they happened in the same transaction 
as the cursor (over a store or index). So in your example you would iterate one 
of the rows twice. Keeping order and membership stable would mean creating 
another scope of isolation within the transaction, which to me would be unusual 
and probably quite painful to implement without spilling a copy of 
the records to disk (at least a copy of the keys/order if you don't care about 
protecting from changes that don't affect membership/order; some databases call 
these keyset cursors).


 We could say that cursors always iterate snapshots; however, this introduces 
 MVCC. Though it seems to me that SNAPSHOT_READ already does that.

Actually, even with MVCC you'd see your own changes, because they happen in the 
same transaction so the buffer pool will use the same version of the page. 
While it may be possible to reuse the MVCC infrastructure, it would still 
require the introduction of a second scope for stability. 


 We could also say that cursors iterate live data, though that can be pretty 
 confusing and forces the implementation to deal with entries being added and 
 removed during iteration, and it'd be tricky to define all edge cases.

Would this be any different, from the implementation perspective, than dealing 
with changes that happen through other transactions once they are committed? 
Typically, at least in non-MVCC systems, committed changes that are further 
ahead in a cursor scan end up showing up even when the cursor was opened 
before the other transaction committed.


 It's certainly debatable how much of a problem any of these edge cases are 
 for users. Note that all of this is only an issue if you modify and read 
 from the same records *in the same transaction*. I can't think of a case 
 where it isn't trivial to avoid these problems by separating things into 
 separate transactions.  However it'd be nice to avoid creating foot-guns 
 for people to play with (think of the children!).

 However we still need to define *something*. I would suggest that we define 
 that cursors iterate snapshots. It seems the cleanest for users and easiest 
  to define. And once implementations add MVCC support it should be 

RE: [IndexedDB] Multi-value keys

2010-06-18 Thread Pablo Castro
+1 on composite keys in general. The alternative to the proposal below would be 
to have the actual key path specification include multiple members (e.g. 
db.createObjectStore("foo", ["a", "b"])). I like the proposal below as well; I 
just wonder if having the key path specification (which is external to the 
object) indicate which members are keys would be less invasive for scenarios 
where you already have Javascript objects you're getting from a web service or 
something and want to store them as-is. 

-pablo

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jonas Sicking
Sent: Friday, June 18, 2010 4:08 PM

Hi All,

One thing that (if I'm reading the spec correctly) is currently impossible is 
to create multi-valued keys. Consider for example an object store containing 
objects like:

{ firstName: "Sven", lastName: "Svensson", age: 57 }
{ firstName: "Benny", lastName: "Andersson", age: 63 }
{ firstName: "Benny", lastName: "Bedrup", age: 9 }

It is easy to create an index which lets you quickly find everyone with a given 
firstName or a given lastName. However it doesn't seem possible to create an 
index that finds everyone with a given firstName
*and* lastName, or sort the list of people based on firstName and then lastName.

The best thing you could do is to concatenate the firstName and lastName and 
insert an ASCII null character in between and then use that as a key in the 
index. However this doesn't work if firstName or lastName can contain null 
characters. Also, if you want to be able to sort by firstName and then age 
there is no good way to put all the information into a single string while 
having sorting work.
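
A quick illustration of the null-separator trick just described, and where it 
breaks:

// Build a single-string index key from two fields, separated by a null char.
function compositeKey(a, b) {
  return a + "\u0000" + b;
}

compositeKey("Benny", "Andersson") < compositeKey("Benny", "Bedrup"); // true, as desired
// But it breaks if a field can contain "\u0000", and numbers sort as text:
compositeKey("Benny", 9) < compositeKey("Benny", 63);  // false, since "9" > "6"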

Generally the way this is done in SQL is that you can create an index on 
multiple columns. That way each row has multiple values as the key, and sorting 
is first done on the first value, then the second, then the third etc.

However since we don't really have columns we can't use that exact solution. 
Instead, the way we could allow multiple values is to add an additional type as 
keys: Arrays.

That way you can use ["Sven", 57], ["Benny", 63] and ["Benny", 9] as keys for 
the respective objects above. This would allow sorting and searching on 
firstName and age.

The way that array keys would be compared is that we'd first compare the first 
item in both arrays. If they are different, the arrays are ordered the same way 
as the two first values. If they are the same you look at the second 
value and so on. If you reach the end of one array before finding a difference 
then that array is sorted before the other.

We'd also have to define the order when an array is compared to a non-array 
value. It doesn't really matter what we say here, but I propose that we put all 
arrays after all non-arrays.

Note that I don't think we need to allow arrays to contain arrays.
That just seems to add complication without adding additional functionality.
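
For what it's worth, the comparison just described fits in a few lines. A 
sketch (compareScalar is a stand-in for whatever rule the spec defines for 
non-array keys; nested arrays are assumed away per the note above):

// Returns negative/zero/positive, in the style Array.prototype.sort expects.
function compareKeys(a, b) {
  var aArr = Array.isArray(a), bArr = Array.isArray(b);
  if (aArr !== bArr) return aArr ? 1 : -1;  // all arrays after all non-arrays
  if (!aArr) return compareScalar(a, b);    // plain keys use the existing rules
  var n = Math.min(a.length, b.length);
  for (var i = 0; i < n; i++) {
    var c = compareScalar(a[i], b[i]);      // element-wise, left to right
    if (c !== 0) return c;
  }
  return a.length - b.length;               // a prefix sorts before the longer array
}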

Let me know what you think.

/ Jonas





RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2

2010-06-15 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Friday, June 11, 2010 3:20 PM

   So there is a real likelihood of a browser implementation that 
   will predate its associated JS engine's upgrade to ES5? 
   Feeling a concern isn't really much of a technical argument on 
   its own, and designing for outdated technology is a poor approach.
  I don't think there is, just wanted to avoid imposing it. If you 
  think it's really important then let's change it back to delete 
  assuming other folks are good with it.

  I had the same concerns Pablo did, but I don't feel strongly 
  either way.

Besides the maneuvering we'll have to do on the C++ side of things to avoid 
clashes with language keywords, the question is whether we expect plugins and 
such to add support for IndexedDB in existing browsers that don't do ES5. For 
example:
http://code.google.com/p/firebreath/wiki/FireBreathUsers


 Before we close on this, let me validate one more thing independently 
 of the JS version. Are we going to have trouble when trying to expose 
 these interfaces in C++? Not sure about other compilers and IDL 
 processing tools, but I'm playing around with Visual Studio 2010 and 
 while the COM IDL compiler will take "delete" as an interface member, 
 my C++ compiler really doesn't like it. As far as I know there is no 
 standard syntax to indicate that a symbol wasn't meant to be a 
 keyword in C++, so having "delete" (or other C++ keywords for that 
 matter) would be problematic. Am I missing something?

 Good point.  Does anyone have a strong opinion on how much we should 
 care about reserved-word conflicts in languages other than JavaScript?  
 It seems like a slippery slope.
 As an example, IDBDatabase.description is actually used by the 
 Objective-C base object class and so this caused some problems 
 initially.  We worked around it by having the Objective-C bindings 
 generator add a suffix whenever an attribute named "description" is 
 hit.  (Something similar was done for "hash" and "id" in other APIs.) 
 To be honest, I hadn't even considered bringing this up and asking for 
 it to be changed, but if we're going to avoid delete because it's a 
 reserved word in JavaScript (pre v5) and/or because it's a reserved 
 word in C++, perhaps we should consider changing description as well?

 We've had to do this a few times in the past already. One example was 
 Window.postMessage where we couldn't use the name PostMessage in C++ 
 because it was a predefined macro on some platform (windows iirc, not to 
 point fingers ;) ).

:)

 We developed a similar trick where we can indicate in the IDL that different 
 names are used for scripted languages and for compiled languages.

 So all in all I believe this problem can be overcome. I prefer to focus on 
 making the JS API be the best it can be, and let other languages take a back 
 seat. As long as it's solvable without too much of an issue (such as large 
 performance penalties) in other languages.

I agree we can sort this out and certainly limitations on the implementation 
language shouldn't surface here. The issue is more whether folks care about a 
C++ binding (or some other language with a similar issue) where we'll have to 
have a different name for this method.

Even though I've been bringing this up I'm ok with keeping delete(), I just 
want to make sure we understand all the implications that come with that.

-pablo
 



RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2

2010-06-11 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Friday, June 11, 2010 3:20 AM
Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline 
February 2

On Fri, Jun 11, 2010 at 1:54 AM, Pablo Castro pablo.cas...@microsoft.com 
wrote:


From: Kris Zyp [mailto:k...@sitepen.com]
Sent: Thursday, June 10, 2010 4:38 PM
Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline 
February 2

  So there is a real likelihood of a browser implementation that will
  predate its associated JS engine's upgrade to ES5? Feeling a
  concern isn't really much of a technical argument on its own, and
  designing for outdated technology is a poor approach.
 I don't think there is, just wanted to avoid imposing it. If you think it's 
 really important then let's change it back to delete assuming other folks 
 are good with it.

 I had the same concerns Pablo did, but I don't feel strongly either way.

Before we close on this, let me validate one more thing independently of the JS 
version. Are we going to have trouble when trying to expose these interfaces in 
C++? Not sure about other compilers and IDL processing tools, but I'm playing 
around with Visual Studio 2010 and while the COM IDL compiler will take 
"delete" as an interface member, my C++ compiler really doesn't like it. As far 
as I know there is no standard syntax to indicate that a symbol wasn't meant to 
be a keyword in C++, so having "delete" (or other C++ keywords for that matter) 
would be problematic. Am I missing something?

-pablo




RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2

2010-06-10 Thread Pablo Castro

 From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] 
 On Behalf Of Kris Zyp
 Sent: Thursday, June 10, 2010 9:49 AM
 Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline 
 February 2

 I see in the trunk version of the spec [1] that delete() was
 changed to remove(). I thought we had established that there is no
 reason to make this change. Is anyone seriously expecting to have an
 implementation prior to or without ES5's contextually unreserved
 keywords? I would greatly prefer delete(), as it is much more
 consistent with standard DB and REST terminology.

My concern is that it seems like taking an unnecessary risk. I understand the 
familiarity aspect (and I like delete() better as well), but to me that's not a 
strong enough reason to use it and potentially cause trouble in some browser.

-pablo




RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2

2010-06-10 Thread Pablo Castro


From: Kris Zyp [mailto:k...@sitepen.com] 
Sent: Thursday, June 10, 2010 4:38 PM
Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline 
February 2

 On 6/10/2010 4:15 PM, Pablo Castro wrote:
 
  From: public-webapps-requ...@w3.org
  [mailto:public-webapps-requ...@w3.org] On Behalf Of Kris Zyp
  Sent: Thursday, June 10, 2010 9:49 AM Subject: Re: Seeking
  pre-LCWD comments for Indexed Database API; deadline February
  2
 
   I see in the trunk version of the spec [1] that delete()
   was changed to remove(). I thought we had established that
  there is no reason to make this change. Is anyone seriously
  expecting to have an implementation prior to or without ES5's
  contextually unreserved keywords? I would greatly prefer
  delete(), as it is much more consistent with standard DB and
  REST terminology.
 
  My concern is that it seems like taking an unnecessary risk. I
  understand the familiarity aspect (and I like delete() better as
  well), but to me that's not a strong enough reason to use it and
  potentially cause trouble in some browser.
 
  So there is a real likelihood of a browser implementation that will
  predate its associated JS engine's upgrade to ES5? Feeling a
  concern isn't really much of a technical argument on its own, and
  designing for outdated technology is a poor approach.

I don't think there is, just wanted to avoid imposing it. If you think it's 
really important then let's change it back to delete assuming other folks are 
good with it.

-pablo




RE: Can IndexedDB depend on JavaScript? (WAS: [Bug 9793] New: Allow dates and floating point numbers in keys)

2010-06-03 Thread Pablo Castro

From: Jeremy Orlow
Sent: Tuesday, May 25, 2010 6:54 AM

 On Mon, May 24, 2010 at 9:21 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Sat, May 22, 2010 at 3:58 AM, Jeremy Orlow jor...@chromium.org wrote:
  On Fri, May 21, 2010 at 11:42 PM, bugzi...@jessica.w3.org wrote:
 
  http://www.w3.org/Bugs/Public/show_bug.cgi?id=9793
 
            Summary: Allow dates and floating point numbers in keys
            Product: WebAppsWG
            Version: unspecified
           Platform: All
         OS/Version: All
             Status: NEW
           Severity: normal
           Priority: P2
          Component: Indexed Database API
         AssignedTo: nikunj.me...@oracle.com
         ReportedBy: pablo.cas...@microsoft.com
          QAContact: member-webapi-...@w3.org
                 CC: m...@w3.org, public-webapps@w3.org
 
 
  Currently the spec requires the values referenced by the key path to be
  integers or strings. I strongly believe that we should also allow dates
  and
  floating point numbers (am I missing any other important types?). While
  dates
  and floating point numbers alone are not good for a primary key, they are
  important for non-unique indexes and as part of a composite key, allowing
  for
  things such as scanning in temporal order.
 
  This is the change I'd like to propose:
 
  Section 3.1.1 Keys of the currently published draft reads:
 
  -
  In order to efficiently retrieve records stored in an indexed database, a
  user
  agent needs to organize each record by its key. Conforming user agents
  must
  support the use of values of IDL data types [WEBIDL] DOMString and long as
  well
  as the value null as keys.
 
  For purposes of comparison, a DOMString key is always evaluated higher
  than any
  long key. Moreover, null always evaluates lower than any DOMString or long
  key.
  -
 
  New proposed text:
 
  -
  In order to efficiently retrieve records stored in an indexed database, a
  user
  agent needs to organize each record by its key. Conforming user agents
  must
  support the use of values of IDL data types [WEBIDL] DOMString, long,
  float,
  and the Date JavaScript object
 
  We really need to decide, once and for all, whether or not IndexedDB is
  going to be tied to JavaScript.  The two major reasons to do so are
  the lack of date in WebIDL and keyPath.
  KeyPath may be tricky to spec in a way that would work for any language
  without cutting out a lot of flexibility.  In order to keep what we're
  speccing sane, it will probably need to be a pretty small subset of what's
  possible in JavaScript and thus even browsers will likely need to roll 
  their
  own parser and such to support it.  (If we do decide to depend on
  JavaScript, it should enable some really neat things with the keyPath as
  well.)
  The HTML spec defines its own date type, but does not specify sort order at
  all.  I started a thread on this a bit ago (subject: [IndexedDB/WebIDL]
  Dates + Sorting (WAS: Detailed comments for the current draft)) but it 
  only
  got one response [3].
 Note that a Date type for WebIDL doesn't really affect things a whole
 lot for the interfaces in IndexedDB. The relevant functions all
 take 'any' as the type, so we'll still have to describe in prose
 what types are permitted. I don't think this makes IndexedDB depend on
 JavaScript though.

Closing the loop on this one. Now that we've agreed to add some language to WebIDL 
for the Date type [1], should we go ahead and make this change to the spec? I 
can ask Eliot to do it so we can close this one, if folks feel it makes sense.

Thanks
-pablo

[1] http://www.mail-archive.com/public-webapps@w3.org/msg08939.html




RE: [IndexedDB] Proposal for async API changes

2010-05-20 Thread Pablo Castro
(still catching up on the rest of the long thread of API changes, will get back 
to that a bit later)

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Jeremy Orlow
Sent: Thursday, May 20, 2010 3:34 PM

  On Thu, May 20, 2010 at 11:25 PM, Shawn Wilsher sdwi...@mozilla.com 
  wrote:
  On 5/20/2010 7:34 AM, Shawn Wilsher wrote:
  So far it's really just that joins are painful in IndexedDB. I'm working
  on a blog post on this very topic though, and I'll be sure to point
  everyone in this thread to it (I figure this is useful stuff to get out
  to a wider audience).
  And honestly, I thought that we had discussed joins on this list, but I 
  only see a thread from Pablo mentioning it, and no real discussion. 
  Should we start that?

 Joins were actually in the original spec but taken out during the effort to 
 simplify the API greatly.  IIRC, the main reason why Nikunj took them out is 
 that we believed you could fairly efficiently join yourself if you had 2 
 sorted lists, and because we didn't see a simple way to do them without 
 introducing a lot of API surface area or creating (or borrowing) some sort 
 of syntax for the joins.  (Now that I think about it, though, maybe doing 
 this is not that big of a leap from what we're going to need to do to spec 
 keyPaths.  I'm starting to wonder if we need to rethink that as well.)

 Anyway, the decision was made so long ago that maybe it's worth re-opening 
 the discussion.  I'll hunt through my mail archives tomorrow and start a new 
 thread with references to any original bits of info I can find.

My main concern with joins, besides API surface, was that in order to implement 
joins you need to choose an actual strategy. Depending on whether you have 
indexes or not and other circumstances, you could choose to do range 
scans/lookups, a merge join, etc. So at least for fancier libraries this would 
only be of partial help, as they would probably want to do their own joins 
sometimes. 
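
For reference, the join-yourself approach Jeremy mentions is small when both 
inputs are already sorted. A sketch of a merge join over two sorted arrays (a 
library would feed these from two index cursors):

// Merge-join two arrays sorted by joinKey; calls emit(l, r) for each
// matching pair. The inner loop emits the run of equal keys on the right.
function mergeJoin(left, right, joinKey, emit) {
  var i = 0, j = 0;
  while (i < left.length && j < right.length) {
    var lk = joinKey(left[i]), rk = joinKey(right[j]);
    if (lk < rk) i++;
    else if (lk > rk) j++;
    else {
      for (var j2 = j; j2 < right.length && joinKey(right[j2]) === lk; j2++) {
        emit(left[i], right[j2]);
      }
      i++; // keep j at the start of the run in case left has duplicate keys
    }
  }
}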

I'm happy to explore again though. It's certainly the case that for simpler 
cases it might help users pull off tasks without depending on a library. I do 
wonder if we should try and land the async API first.

-pablo




RE: [IndexedDB] Lots of small nits and clarifying questions

2010-03-30 Thread Pablo Castro
Sorry for having disappeared for a while, OData was keeping me busy. I agree 
with all the clarifications listed in this thread as required, so I won't 
redundantly mark each with "same here", but I have a few comments on one or two 
of them below. 

On Mon, Mar 15, 2010 at 8:14 AM, Jeremy Orlow wrote:

On Sat, Mar 13, 2010 at 9:02 AM, Nikunj Mehta nik...@o-micron.com wrote:
Thanks for your patience. Most questions below don't seem to need new spec text.

On Feb 18, 2010, at 9:08 AM, Jeremy Orlow wrote:


 6) The specific ordering of elements should probably be specced including a 
 mix of types.

 Can you propose spec text for this? What do you think about the text 
 in http://www.w3.org/TR/IndexedDB/#key-construct?

 If we're only adding long long for v1, then I think language similar to 
 what's there now is probably OK.  But now that I think about it, I'm a bit 
 concerned that we might be backing ourselves into a corner for the future.  
 I also noticed that the sort order in JavaScript seems to be numbers, 
 strings, and then nulls (not strings, numbers, nulls).

 I wonder if there is some other spec on sort order we can cite rather than 
 rolling our own.

I really think that just doing long/strings won't do, even for v1. For 
non-primary-key indexes we'll need at least Date and number (not just integers) 
in addition to long/string. Without that there is no ordering by "date sent" 
for emails or "list price" for products, or lots of other scenarios where 
you're caching data coming from a server.


 2) What happens when data mutates while you're iterating via a cursor?

 This is covered by http://www.w3.org/TR/IndexedDB/#dfn-mode

 That applies to two separate transactions.  As far as I can tell, it should 
 be possible to have a cursor open and then delete an element that the cursor 
 is currently traversing all within the same transaction.  Am I missing 
 something?
 
I was assuming that within the same transaction you could change rows and those 
changes would be observable from open cursors. If it happens to be the current 
row then you won't be able to fetch it anymore but you can still move to the 
next one and continue scanning (and seeing any new changes that happened since 
you last moved).


 1) Structured clone is going to change over time.  And, realistically, UAs 
 won't support every type right away anyway.  What do we do when a value is 
 inserted that we do not support?

 We will evolve the text as and when the same evolves in WebStorage.

 I don't know of any implementations which have moved away from only allowing 
 strings within WebStorage.  I suspect that not fully supporting the 
 structured clone algorithm as specced is one of the reasons for this.

 As far as I can tell, you're essentially saying that fully supporting the 
 structured clone algorithm is a prerequisite for IndexedDB?  I guess I can't 
 argue too much with that, but I'm not sure how realistic it is.  I know we 
 only half support it at the moment in Chromium.

I have the same worry about structured clones... it's right in principle but I 
can't see implementations converging, and that will just hurt interoperability. 
Unfortunately there doesn't seem to be a well-known middle ground. JSON is way 
too restrictive (e.g. no Date). Should we consider defining a subset of 
structured clones that works (maybe something like Javascript primitives plus 
Date, plus whatever extra we feel we should include, such as perhaps File 
objects)?
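
If we went the subset route, the membership check is small. A sketch of what 
the proposed subset could look like (the exact allowed set is the open 
question, and a real check would also have to guard against cycles):

// Hypothetical membership check for a storable-values subset: primitives,
// Date, plain arrays and objects (File left out for brevity).
function isStorable(v) {
  if (v === null) return true;
  var t = typeof v;
  if (t === "string" || t === "number" || t === "boolean") return true;
  if (v instanceof Date) return true;
  if (Array.isArray(v)) return v.every(isStorable);
  if (t === "object") {
    for (var k in v) {
      if (!isStorable(v[k])) return false;
    }
    return true;
  }
  return false; // functions, host objects, etc.
}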


Thanks
-pablo
 



RE: [IndexedDB] Promises (WAS: Seeking pre-LCWD comments for Indexed Database API; deadline February 2)

2010-03-30 Thread Pablo Castro
On Fri, Mar 12, 2010 at 7:26 AM, Jeremy Orlow wrote:

On Fri, Mar 12, 2010 at 3:23 PM, Jeremy Orlow jor...@chromium.org wrote:
On Fri, Mar 12, 2010 at 3:04 PM, Kris Zyp k...@sitepen.com wrote:


 I believe computer science has clearly
 observed the fragility of passing callbacks to the initial function
 since it conflates the concerns of the operation with the asynchronous
 notifications and consequently greatly complicates composability.

 I don't understand this sentence.  I'm pretty sure that you can wrap any 
 callback-based API in JavaScript with a promise, deferred, etc. based API.  
 As Nikunj mentioned earlier, we're more concerned about creating a small 
 API surface area and sticking with well-understood API designs rather than 
 eliminating the need for libraries that wrap IndexedDB.
 
Trying to digest this thread, I think we've sort of gone full circle with the 
whole promises thing. When looking at code written with the chained "then" 
pattern I just love the result, but it seems that we can't get all the way 
there (and nesting instead of chaining kind of lacks the magic). My take is 
that either we get the really nice pattern by going all the way, or we create a 
more traditional callback/events-based API and then build promises on top. 
Things seem to indicate that frameworks are still cooking on promises, so it 
may be safe to stay with callbacks/events and just build libraries on top (I 
would have loved for this to be the thing that saved us from always needing a 
library... but it seems we'll fall just a bit short).
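
Building promises on top is indeed mechanical. A sketch of wrapping the 
request/onsuccess shape discussed in this thread with a then()-style object 
(not a full promise implementation):

// Wrap a request-style async call in an object with then(); handlers
// attached after completion still fire.
function promisify(request) {
  var pending = [], state = null, value;
  function settle(s, v) {
    state = s; value = v;
    pending.forEach(function(p) { if (p[s]) p[s](v); });
    pending = [];
  }
  request.onsuccess = function() { settle("ok", request.result); };
  request.onerror = function(e) { settle("err", e); };
  return {
    then: function(onOk, onErr) {
      var p = { ok: onOk, err: onErr };
      if (state) { if (p[state]) p[state](value); }
      else pending.push(p);
      return this; // allows attaching several handlers in a row
    }
  };
}

// e.g. promisify(asyncDb.getFromStore("people", key)).then(function(p) { ... });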

As for callbacks versus events: while I'm now starting to get used to the 
events hooked up to the result object after the call, callbacks may be a 
more natural mechanism for this particular usage. I'm not sure why this would 
be fundamentally broken... I would love to see examples or a reference. If 
that's the case, then events are the obvious choice.

Thanks
-pablo




RE: [IndexedDB] Detailed comments for the current draft

2010-02-02 Thread Pablo Castro

On Mon, Feb 1, 2010 at 1:30 AM, Jeremy Orlow jor...@google.com wrote:

   1. Keys and sorting

   a.       3.1.1:  it would seem that having also date/time values as keys 
   would be important and it's a common sorting criteria (e.g. as part of a 
   composite primary key or in general as an index key).

  The Web IDL spec does not support a Date/Time data type. Could your use 
  case be supported by storing the underlying time with millisecond precision 
  using an IDL long long type? I am willing to change the spec so that it 
  allows long long instead of long IDL type, which will provide adequate 
  support for Date and time sorting.

 Can the spec not be augmented?  It seems like other specs like WebGL have 
 created their own types.  If not, I suppose your suggested change would 
 suffice as well.  This does seem like an important use case.
 
I agree, either we could augment the spec or we could describe it in terms of 
Javascript object values. That is, we can say something specific about the 
treatment of Javascript's Date object. Would that be possible? E.g. we could 
require implementations to provide full order for dates if they find an 
instance of that type in a path.

   b.      3.1.1: similarly, sorting on number in general (not just 
   integers/longs) would be important (e.g. price lists, scores, etc.)

  I am once again hampered by Web IDL spec. Is it possible to leave this for 
  future versions of the spec?

Actually Web IDL does define the double type and its Javascript binding. Can 
we add double to the list of types an index can be applied to?

   c.       3.1.1: cross type sorting and sorting of long values are clear. 
   Sorting of strings however needs more elaboration. In particular, which 
   collation do we use? Does the user or developer get to choose a 
   collation? If we pick up a collation from the environment (e.g. the OS), 
   if the collation changes we'd have to re-index all the databases.

  I propose to use Unicode collation algorithm, which was also suggested by 
  Jonas during a conversation.

I don't think this is specific enough, in that it still doesn't say which 
collation tables to use and how to specify them. A single collation strategy 
won't do for all languages (it'll range from slightly wrong to nonsense 
depending on the target language). This is a trickier area than I had 
initially thought. We'll bake on this a bit and get back to this group with 
ideas. 

   d.      3.1.3: spec reads "key path must be the name of an enumerated 
   property"; how about composite keys (would make the related APIs take a 
   DOMString or DOMStringList)?

  I prefer to leave composite keys to a future version.

I don't think we can get away with this. For indexes this is quite common (if 
nothing else, to have stable ordering when the prefix of the index has 
repeats). Once we have it for indexes, the delta for having it for primary keys 
as well is pretty small (although I wouldn't oppose leaving out composite 
primary keys if that would help scope the feature).


   b.      Query processing libraries will need temporary stores, which need 
   temporary names. Should we introduce an API for the creation of temporary 
   stores with transaction lifetime and no name?

  Firstly, I think we can leave this safely to a future version. Secondly, my 
  suggestion would be to provide a parameter to the create call to indicate 
  that an object store being created is a transient one, i.e., not backed by 
  durable storage. They could be available across different transactions. If 
  your intention is to make these object stores unavailable across 
  connections, then we can also offer a connection-specific transient object 
  store.

  In general, it requires us to introduce the notion of create params, which 
  would simplify the evolution of the API. This is also similar to how 
  Berkeley DB handles various options, not just those related to creation of 
  a Berkeley database.

Let's see how we progress on this one, and maybe revisit it a bit later. I'm 
worried about code that wants to do things such as a block sort that needs to 
spill to disk, as it would have to either use some naming pattern or ask the 
user for temp table names.

   c.      It would be nice to have an estimate row count on each store. 
   This comes at an implementation and runtime cost. Strong opinions? 
   Lacking everything else, this would be the only statistic to base 
   decisions on for a query processor.

  I believe we need to have a general way of estimating the number of records 
  in a cursor once a key range has been specified. Kris Zyp also brings this 
  up in a separate email. I am willing to add an estimateCount attribute to 
  IDBCursor for this.

EstimateCount sounds good.

   d.      The draft does not touch on how applications would do optimistic 
   concurrency. A common way of doing this is to use a timestamp value 
   that's automatically updated by the system every time someone 

RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2

2010-02-01 Thread Pablo Castro
A few comments inline marked with [PC].

From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Nikunj Mehta
Sent: Sunday, January 31, 2010 11:37 PM
To: Kris Zyp
Cc: Arthur Barstow; public-webapps
Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline 
February 2


On Jan 27, 2010, at 1:46 PM, Kris Zyp wrote:



A few comments I've been meaning to suggest:

* count on KeyRange - Previously I had asked if there would be a way
to get a count of the number of objects within a given key range. The
addition of the KeyRange interface seems to be a step towards that,
but the cursor generated with a KeyRange still only provides a count
property that returns the total number of objects that share the
current key. There is still no way to determine how many objects are
within a range. Was the intent to make count return the number of
objects in a KeyRange and the wording is just not up to date?
Otherwise could we add such a count property (countForRange maybe, or
have a count and countForKey, I think Pablo suggested something like
that).

I agree with the concept. I have doubts about implementation success. However, 
I will include this in the editor's draft.

[PC] I agree with Nikunj; I suspect that implementations will have to just 
compute the count, as it's unlikely that updating intermediate nodes in the 
tree on each update would be desired (to try to maintain extra information 
for fast range-size computation). At that point it's almost the same as user 
code iterating over the range (modulo the Javascript interface overhead). I'm 
also not sure how often you'd use this, as it would only work on simple 
conditions (no composite expressions, no functions in expressions) that happen 
to have an index.
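
In other words, user code can already compute the count by walking the range. 
A sketch (the store/cursor shapes follow the draft loosely, and the key-range 
construction is elided):

// Count the records in a key range by walking a cursor (roughly what an
// implementation without augmented trees would do internally anyway).
function countRange(store, range, done) {
  var count = 0;
  store.openCursor(range).onsuccess = function(e) {
    var cursor = e.result;
    if (!cursor) {
      done(count); // reached the end of the range
      return;
    }
    count++;
    cursor["continue"](); // bracket access dodges the keyword issue noted below
  };
}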

* Use promises for async interfaces - In server side JavaScript, most
projects are moving towards using promises for asynchronous interfaces
instead of trying to define the specific callback parameters for each
interface. I believe the advantages of using promises over callbacks
are pretty well understood in terms of decoupling async semantics from
interface definitions, and improving encapsulation of concerns. For
the indexed database API this would mean that sync and async
interfaces could essentially look the same except sync would return
completed values and async would return promises. I realize that
defining a promise interface would have implications beyond the
indexed database API, as the goal of promises is to provide a
consistent interface for asynchronous interaction across components,
but perhaps this would be a good time for the W3C to define such an
API. It seems like the indexed database API would be a perfect
interface to leverage promises. If you are interested in a proposal,
there is one from CommonJS here [1] (the get() and call() wouldn't
apply here). With this interface, a promise.then(callback,
errorHandler) function is the only function a promise would need to
provide.

Thanks for the pointer. I will look into this as even Pablo had related 
requirements.


[1] http://wiki.commonjs.org/wiki/Promises

and a comment on this:
On 1/26/2010 1:47 PM, Pablo Castro wrote:
 11. API Names

  a.   transaction is really non-intuitive (particularly given
 the existence of currentTransaction in the same class).
 beginTransaction would capture semantics more accurately.
  b.   ObjectStoreSync.delete: delete is a Javascript keyword, can we use
 remove instead?
I'd prefer to keep both of these as is. Since commit and abort are
part of the transaction interface, using transaction() to denote the
transaction creator seems brief and appropriate. As far as
ObjectStoreSync.delete, most JS engines have or should be contextually
unreserving delete. I certainly prefer delete in preserving the
familiarity of REST terminology.
[PC] I understand the term-familiarity aspect, but this seems to be something 
that would just cause trouble. From a quick check with the browsers I had at 
hand, both IE8 and Safari 4 reject scripts where you try to add a method called 
delete to an object's prototype. Natively-implemented objects may be able to 
work around this, but I see no reason to push it. remove() is probably equally 
intuitive. Note that the method continue on async cursors is likely to have 
the same issue, as continue is also a Javascript keyword.
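
For completeness, the workaround available on pre-ES5 engines is bracket 
access, which always parses; the literal forms only became unambiguously legal 
with ES5's contextually unreserved keywords:

store["delete"](key);   // parses everywhere
cursor["continue"]();   // parses everywhere

store.delete(key);      // rejected by some pre-ES5 parsers
cursor.continue();      // rejected by some pre-ES5 parsers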


Thanks,

- --
Kris Zyp
SitePen
(503) 806-1841
http://sitepen.com


-pablo



[IndexedDB] Detailed comments for the current draft

2010-01-26 Thread Pablo Castro
These are notes that we collected both from reviewing the spec (editor's draft 
up to Jan 24th) and from a prototype implementation that we are working on. I 
didn't realize we had this many notes, otherwise I would have been sending 
intermediate notes early. Will do so next round.


1. Keys and sorting

a.   3.1.1:  it would seem that having also date/time values as keys would 
be important and it's a common sorting criteria (e.g. as part of a composite 
primary key or in general as an index key).
b.  3.1.1: similarly, sorting on number in general (not just 
integers/longs) would be important (e.g. price lists, scores, etc.)
c.   3.1.1: cross type sorting and sorting of long values are clear. 
Sorting of strings however needs more elaboration. In particular, which 
collation do we use? Does the user or developer get to choose a collation? If 
we pick up a collation from the environment (e.g. the OS), if the collation 
changes we'd have to re-index all the databases.
d.  3.1.3: spec reads "key path must be the name of an enumerated 
property"; how about composite keys (would make the related APIs take a 
DOMString or DOMStringList)?


2. Values

a.   3.1.2: isn't the requirement for structured clones too much? It 
would mean implementations would have to be able to store and retrieve File 
objects and such. Would it be more appropriate to say it's just graphs of 
Javascript primitive objects/values (object, string, number, date, arrays, 
null)? 


3. Object store

a.   3.1.3: do we really need in-line + out-of-line keys? Besides the 
concept-count increase, we wonder whether out-of-line keys would cause trouble 
to generic libraries, as the values for the keys wouldn't be part of the values 
iterated when doing a foreach over the table.
b.  Query processing libraries will need temporary stores, which need 
temporary names. Should we introduce an API for the creation of temporary 
stores with transaction lifetime and no name?
c.  It would be nice to have an estimate row count on each store. This 
comes at an implementation and runtime cost. Strong opinions? Lacking 
everything else, this would be the only statistic to base decisions on for a 
query processor. 
d.  The draft does not touch on how applications would do optimistic 
concurrency. A common way of doing this is to use a timestamp value that's 
automatically updated by the system every time someone touches the row. While 
we don't feel it's a must have, it certainly supports common scenarios.


4. Indexes

a.   3.1.4 mentions auto-populated indexes, but then there is no mention 
of other types. We suggest that we remove this and in the algorithms section 
describe side-effecting operations as always updating the indexes as well.
b.  If during insert/update the value of the key is not present (i.e. 
undefined as opposite to null or a value), is that a failure, does the row not 
get indexed, or is it indexed as null? Failure would probably cause a lot of 
trouble to users; the other two have correctness problems. An option is to 
index them as undefined, but now we have undefined and null as indexable keys. 
We lean toward this last option. 

5. Databases

a.   Not being able to enumerate databases gets in the way of creating good 
tools and frameworks such as database explorers. What was the motivation for 
this? Is it security related?
b.  Clarification on transactions: all database operations that affect the 
schema (create/remove store/index, setVersion, etc.) as well as data 
modification operations are assumed to be auto-commit by default, correct? 
Furthermore, all those operations (both schema and data) can happen within a 
transaction, including mixing schema and data changes. Does that line up with 
others' expectations? If so we should find a spot to articulate this explicitly.
c.   No way to delete a database? It would be reasonable for applications 
to want to do that and let go of the user data (e.g. a "forget me" feature in a 
web site)

6. Transactions

a.   While we understand the goal of simplifying developers' lives with an 
error-free transactional model, we're not sure if we're doing more harm by 
introducing more concepts into this space. Wouldn't it be better to use regular 
transactions with a well-known failure mode (e.g. either deadlocks or 
optimistic concurrency failure on commit)?
b.  If in auto-commit mode, if two cursors are opened at the same time (e.g. 
to scan them in an interleaved way), are they in independent transactions 
simultaneously active in the same connection?


7. Algorithms

a.   3.2.2: steps 4 and 5 are inverted in order.
b.  3.2.2: when there is a key generator and the store uses in-line keys, 
should the generated key value be propagated to the original object (in 
addition to the clone), such that both are in sync after the put operation?
c.   3.2.3: step 2, probably editorial mistake? Wouldn't all indexes have a 
key 

RE: IndexedDB and MVCC

2010-01-18 Thread Pablo Castro
Hi Chris,

 -Original Message-
 From: public-webapps-requ...@w3.org [mailto:public-webapps-
 requ...@w3.org] On Behalf Of Chris Anderson
 Sent: Friday, January 15, 2010 11:14 AM
 To: public-webapps WG
 Subject: IndexedDB and MVCC
 
 Hi,
 
 I've been reading the new IndexedDB spec as published here:
 http://www.w3.org/TR/IndexedDB/
 
  My first impression is that this is simpler than WebSimpleDB, but not too
  simple. I'm happy to see detached readers being mentioned.
 
 There's one other piece of the concurrency story that could be useful.
 
 In section 3.2.2 Object Store Storage steps
 
 step 7: If the no-overwrite flag was passed to these steps and is set,
 and a record already exists with its key being key, then terminate
 these steps and set error code CONSTRAINT_ERR.
 
 I think it wouldn't add much complexity to use a compare-and-swap
 pattern, instead of a no-write-if-exists pattern. This would allow for
 better concurrency via optimistic updates, and look a lot like HTTP
 etags.

Wouldn't these be different scenarios? The purpose of the flag is to help in 
scenarios where you don't want to automatically create an item, only update an 
existing one. What you're describing seems to be oriented towards the case 
where you're updating an existing item, have an optimistic concurrency token, 
and want to use it to check for conflicts before the update goes through. 

You definitely make a good point that the current document 
doesn't touch on how applications would handle optimistic concurrency. One way 
would be to build in support for it (as you suggest, an optional path for the 
concurrency token, and perhaps also a timestamp sort of thing that gets 
automatically updated). Alternatively, application code could do the 
check-and-update-or-fail deal within a transaction. 
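
The application-level variant is compact. A sketch of check-and-update-or-fail 
inside one transaction (assumes an app-maintained version property on each 
row; store and method names follow the drafts loosely):

function updateIfUnchanged(db, key, expectedVersion, newValue, onConflict) {
  var trans = db.transaction(["items"], READ_WRITE);
  var store = trans.objectStore("items");
  store.get(key).onsuccess = function(e) {
    var current = e.result;
    if (current.version !== expectedVersion) {
      trans.abort();  // someone else changed the row; report a conflict
      onConflict();
      return;
    }
    newValue.version = expectedVersion + 1; // bump the app-maintained token
    store.put(newValue);
  };
}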

 
 It could be accomplished by allowing an object store to take a
 key-path for the update-token. Then subsequent updates could require
 that the key-path match. (Some additional complexity: we'd need the
 ability to check for a matching update-token, then change it, in a
 transaction).
 
 CouchDB uses an MVCC token that must match to allow updates. This
 allows us to avoid locking. But even more important is the parallels
 we have with HTTP Etags (if-match for idempotence, if-none-match for
 caching).
 
 The CouchDB style of MVCC can be accomplished by updates in a
 compare-and-swap transaction, so technically I can do what I want in
 the spec as it stands. But I still think the parallels to HTTP etags
 can be instructive.

Out of curiosity: if you were to layer CouchDB on top of IndexedDB, would you 
always just use the dynamic locking mode, or do you actually have use for the 
other options offered?

I ask because I'm seriously concerned that the extra modes will add to the 
overall concept count in an attempt to simplify the use of transactions, and 
don't really simplify things end to end.

 
 Chris
 
 
 --
 Chris Anderson
 http://jchrisa.net
 http://couch.io
 

Thanks
-pablo




RE: [WebSimpleDB] Allowing schema operations anywhere

2009-12-22 Thread Pablo Castro
My apologies for my late reply, I've been out for a while.

 -Original Message-
 From: Nikunj R. Mehta [mailto:nikunj.me...@oracle.com]
 Sent: Friday, December 11, 2009 10:47 AM
 To: public-webapps@w3.org WG
 Cc: Pablo Castro
 Subject: Re: [WebSimpleDB] Allowing schema operations anywhere
 
 I have gone ahead and updated the spec to allow option B (only).
 Please take a look.

Option B makes sense, as without it there is a class of algorithms that cannot 
be implemented, or that would be quite difficult to implement (e.g. a sort 
construct that a query language might want to support wouldn't be possible 
without a backing index). 

This certainly means versioning becomes the responsibility of the app/library 
and not the user agent. This makes sense to me, given that not all schema 
changes are really version changes (e.g. creation of a spill-to-disk table 
shouldn't bump up the database version).

Thanks
-pablo

 
 Nikunj
 On Dec 8, 2009, at 10:14 AM, Nikunj R. Mehta wrote:
 
  Hi Pablo,
 
  Sorry for the long delay in responding to your comments. Hopefully, we
  can continue the discussion now.
 
  Schema changes interact with the locking model of the database. As I
  see it, here are several ways in which the API could be designed and
  the consequences of doing so:
 
   A. Allow schema changes inside a metadata transaction which can only
   be performed at connection time
   B. Allow schema changes inside a data transaction, which can be
   performed any time a connection is open
   C. Allow schema changes inside a metadata transaction, which can be
   performed any time a connection is open
 
  Option A's disadvantages are that metadata manipulation cannot be
  combined with data changes. Moreover, version numbers are no longer
  issued by the application but rather by a user agent.
 
  Option A's advantages are that resource acquisition is simplified and
  deadlocks can be avoided considering that a connection acquires and
  releases the metadata resource in a consistent sequence. Another
  upside is that version number maintenance is automated.
 
  Option B's main disadvantage is that there is no real notion of
  version that can be managed by the user agent. Another is that
  deadlocks could occur because there is no a priori declaration of
  intent about metadata modification. This could be remedied by
  including the database itself in the list of objects that are intended
  to be modified in the transaction.
 
  Option B's advantages are closer interleaving of and atomic metadata
  changes with data changes, and application controlled version numbers
  used for the database.
 
  Option C's disadvantage is that data and metadata changes cannot be
  interleaved atomically.
 
  Option C's advantages are that deadlocks can be avoided and version
  number management can be performed  by an application.
 
  Overall, I think version management and metadata changes are exclusive
  in some sense. IOW, if we want Option B and Option C, then we have to
  remove the connection time version check.
 
  Hope that helps. Please feel free to add if I missed anything.
 
  Nikunj
 
  On Nov 22, 2009, at 3:14 PM, Pablo Castro wrote:
 
  We are finding a number of reasons for wanting to create tables on
  the fly, and without bumping up the database version. A few examples:
   - Packaged components that create side tables to maintain their own
   state
  - Query processors often need to spill to disk during query
  execution. For example, sorting large sets requires storing temporary
  sets of rows on disk to be merged later.
 
  So we're thinking it would be better to have these methods directly
  in the DatabaseSync/DatabaseAsync objects (with proper corresponding
  patterns), instead of their current location in the Upgrade
  interface.
 
  For the common case where several schema changes need to be done
  atomically, developers can simply wrap the calls in a transaction,
  and they would do for regular data manipulation.
 
  We would need an extra method to bump up the version explicitly, as
  that would no longer be in the upgrade callback.
 
  Does this seem reasonable?
 
  Regards,
  -pablo
 
 
 
  Nikunj
  http://o-micron.blogspot.com
 
 
 
 
 
 Nikunj
 http://o-micron.blogspot.com
 
 
 




[WebSimpleDB] Introduce a pause/resume pattern for coordinated access to multiple stores

2009-12-22 Thread Pablo Castro
Whenever we take a callback that's to be called for each item in a set (e.g. 
with a .forEach(callback) pattern), we need a way to indicate to the system 
whether it's OK to move to the next row and invoke the next callback or not. 
Otherwise, in scenarios where the callback itself performs an operation that 
doesn't finish immediately (such as another database async call), the system 
will keep queuing up top-level callbacks, which in turn may queue up more 
callbacks as part of their implementation, and execution will happen in some 
order that's very hard to predict at best.

This comes up in several contexts. Applications will often need to scan more 
than one object store in coordination. Query processors will also need this 
when implementing physical operators for joins and such. A different context 
would be a system that needs to submit an HTTP request per row, where you may 
want to use an XmlHttpRequest and unwind after calling open. While the HTTP 
request is in flight you don't want to move to the next row.

In most cases one of the key aspects is that we need separate components to 
work cooperatively as they pull rows from one or multiple scans, and there 
needs to be a way of controlling the advance of cursors through the rows.

We would like to introduce pause and resume functions for scans to support 
this. Since there is no obvious place to put this right now, we could introduce 
an iterator object that can be used to control things related to the current 
state of the iteration as of when the callback happens, or maybe this is the 
cursor itself.

The resulting code would look like this (the example uses the 
single-async-level pattern we're playing around, but these two are actually 
independent things):

async_db.forEachObjectInStore("people", function(person, iteration) {
  iteration.pause(); // we won't be done with 'person' until later...
  var request = async_db.getFromStore("people", person.managerId);
  request.onsuccess = function() {
    var manager = request.result;
    // Do something with both 'person' and 'manager', and now we're ready to
    // process the next person.
    iteration.resume();
  };
});

The nice thing about adding these as methods on the side is that it's 
completely out of sight in simple scenarios where you may be just scanning to 
build some HTML for example. Only if you're doing multiple coordinated, async 
tasks you need to know about these functions.

Regards,
-pablo




[WebSimpleDB] Allowing schema operations anywhere

2009-11-22 Thread Pablo Castro
We are finding a number of reasons for wanting to create tables on the fly, and 
without bumping up the database version. A few examples:
- Packaged components that create side tables to maintain their own state
- Query processors often need to spill to disk during query execution. For 
example, sorting large sets requires storing temporary sets of rows on disk to 
be merged later.

So we're thinking it would be better to have these methods directly in the 
DatabaseSync/DatabaseAsync objects (with proper corresponding patterns), 
instead of their current location in the Upgrade interface.

For the common case where several schema changes need to be done atomically, 
developers can simply wrap the calls in a transaction, and they would do for 
regular data manipulation.

We would need an extra method to bump up the version explicitly, as that would 
no longer be in the upgrade callback.

Does this seem reasonable?

Regards,
-pablo




[WebSimpleDB] Flatting APIs to simplify primary cases

2009-11-19 Thread Pablo Castro
We're busy creating experimental implementations of WebSimpleDB to both 
understand what it takes to implement and also to see what the developer 
experience looks like. 

As we started to write application code against the API (particularly the 
async one) the first thing that popped out is the fact that you need two levels of 
nested callbacks for everything. While the current factoring of the API makes 
sense on the design board, it's kind of noisy in app code. For example:

// assume you already have a database opened in dbReq
var html = "<ul>";
var storeReq = new ObjectStoreRequest(dbReq.database);
storeReq.onsuccess = function() {
  var cursorReq = new CursorRequest(storeReq.store);
  cursorReq.callback = function(key, cursor, value) {
    html += "<li>" + value.Name + "</li>";
  }
  cursorReq.onsuccess = function(r) {
    document.getElementById("output").innerHTML = html + "</ul>";
  }
  cursorReq.open();
}
storeReq.open();

One option that we would like to explore is to flatten the API, so most 
common methods are straight in the database class. This trades off some of the 
factoring in favor of usability for common cases using the async API.

The change would span a couple of aspects:

1. Move operations from object store interface and the index interface into the 
Database interface.

Accessing indexes and stores through specialized objects is problematic for the 
following reasons:
- It's always the case that we need to consider when objects are invalidated 
because something changes underneath them, such as a schema change. For 
example, if there is an explicit store object, then when the store is 
dropped we need to consider what is valid/invalid and what its failure points 
and modes are. By not having a standalone store object, we significantly reduce 
the gotchas to consider.
- From a usability perspective, it's simpler to work with a store in a single 
step, rather than having to open it first and then work with it (see patterns 
below with a single request and one DBRequest object).
- With no two-step access pattern, the API has one less level of 
asynchronicity, as effectively the table lookup + operation are atomic within 
the store. This also consolidates all operations with an async variant in a 
single interface (the Database), which is a great simplification for 
discoverability.

var html = "<ul>";
var request = asyncDb.forEachStoreObject("contacts", function(row) {
   html += "<li>" + row.Name + "</li>";
});
request.onsuccess = function(r) {
  document.getElementById("output").innerHTML = html + "</ul>";
}

In moving the operations, it's probably best to rename them to something more 
descriptive, so we can have for example 'getFromStore(storeName, key)' and 
'getFromIndex(storeName, indexName, key)'. This also helps in that 'delete' 
won't collide with the Javascript keyword.

Note that the store and index interfaces are still around to provide metadata, 
but at this point they behave as simple read-only snapshots.

2. Generalize the use of DBRequest, add a 'result' member to it and have all 
asynchronous operations be initiated from a DatabaseAsync interface.

As a result of the previous changes, all operations that have an async 
counterpart should now exist on the DatabaseAsync interface. Rather than having 
multiple types of requests depending on the target object, it is possible to 
have operations on a DatabaseAsync interface that provide a uniform invocation 
and handling programming pattern.

This gives a nice pattern for understanding how a sync API maps to an async API.

So for example:

var record = db.getFromStore("store", key); // use record...

Becomes:

var request = asyncDb.getFromStore("store", key);
request.onsuccess = function(req) {
  var record = req.result;
  // use record...
};

We could include more data in DBRequest or DBRequest.result as needed if in 
some cases a method produces more than just a simple result. Further 
specializatons of DBRequest (subtypes) are still possible in the future if we 
need to introduce special cases for specific operations.

Similarly, we would have something like asyncDb.forEachStoreObject() that 
queues a task to call a callback for each element in a store/index, potentially 
within a range if specified. The pattern scales well to all the other APIs 
present in db/store/index today.

If this seems like a good idea to folks, we'd be happy to write up a more 
complete version that articulates the tweaks across all the WebSimpleDB APIs to 
make this happen.

Regards,
-pablo




Web Data APIs

2009-10-31 Thread Pablo Castro
We've been looking at the web database space here at Microsoft, trying to
understand scenarios and requirements. After assessing what's out there, we
have started to form an opinion. I wanted to write to this group to share how
we think about the space, what principles we try to apply, and to discuss
specifics.

The short story is that we believe Nikunj's WebSimpleDB proposal, which 
basically describes a minimum-bar web database API and enables a whole set of 
diverse options to be built on top, is the right thing to do.

During the last couple of weeks we have been talking with various folks from 
Mozilla and Oracle and iterating over details of the WebSimpleDB draft. In the 
process it has become clear that we all share the same high-level expectations 
on the scope and capabilities of this API, and Nikunj has been hard at work 
making changes to the draft to keep up with them. I'll touch on a few details 
below, but bear in mind that several of them are already in the process of 
being addressed.

We would love to hear feedback, requirements, specific application scenarios, 
etc. We want to make progress quickly and get experimental implementations 
going to ensure that as we explore we stay grounded, with things that are 
implementable.


Guiding principles and why we think the ISAM style proposed in WebSimpleDB is a 
good idea
As we tried to understand the problem space we formulated a couple of guiding
principles:
- Get into the standard the key building blocks that are either impossible to
build on top, or so common that it would be very redundant to do so
- Focus on an API that is simple enough that it can be reliably specified, and
that can be implemented to follow the spec in a relatively simple manner
We believe that WebSimpleDB sets the stage in this direction. An ISAM layer can
be used directly, or it can be a building block for more elaborate layers built
entirely in Javascript on top. Also, ISAM is simple enough that it can be
specified in a way that should enable highly interoperable implementations.


Trimming down

There are a number of elements of WebSimpleDB that we can probably live 
without, at least for a first version, such as Queues and Sequences. This may 
help simplify the database API even further.

Also, there are a few simplifying assumptions we can make from the get-go. For
example, that paths as informally mentioned in the spec only reference
Javascript identifiers (perhaps with dot-notation), and that when used for
index/primary keys they point to Javascript primitive values and not to
objects/arrays.
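
Under those assumptions, a record and the paths into it might look like this
sketch (the record shape is invented for illustration):

// Illustrative record: the primary key path "id" and an index key path
// "address.city" both use dot-notation over Javascript identifiers and
// resolve to primitive values, never to objects or arrays.
var contact = {
  id: 42,                          // primary key -> primitive
  address: { city: "Seattle" }     // "address.city" -> "Seattle"
};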


Terminology

The word Entity has a lot of different meanings depending on who you talk to. 
It would be interesting to find a simpler term, perhaps something that matches 
the Javascript terminology better.


Areas where we need to dig deeper and have broader discussions to understand 
better

Isolation model and its implications for locking: Various isolation models lead
to different failure modes; for example, regular locks mean that application
code needs to be ready to deal with deadlocks, while with multi-versioning you
can see optimistic concurrency violation exceptions during commit. There is a
tricky balance between not dictating too much of the implementation and
ensuring that observable behavior across implementations really enables
interoperability.
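
As a sketch of what the second failure mode means for application code,
consider a retry loop around commit. The transaction API shown here
(beginTransaction/commit) is invented purely for illustration; no isolation
model has been settled on:

// Sketch only: under a multi-versioning model, commit can throw an
// optimistic concurrency violation, so callers retry. The transaction
// API below is hypothetical.
function updateWithRetry(db, attempts) {
  try {
    var tx = db.beginTransaction();
    // ...read and write records...
    tx.commit(); // may throw if a concurrent writer won the race
  } catch (e) {
    if (attempts > 0) updateWithRetry(db, attempts - 1);
    else throw e;
  }
}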

What's the sweet spot for the API? Is the primary use for this API to be
directly consumed by application code, or is it a building block to create
various libraries that present a diversity of styles for query formulation and
execution? We lean to the side of making it an API that's great for libraries
to build nice layers on top of, but that is still usable directly in
application code (along the lines of what happens with XMLHttpRequest, where
most developers actually use a wrapper that fits the particular
scenario/library better).
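
To illustrate the kind of thin layer we have in mind, here is a sketch of a
convenience wrapper over the callback-style API proposed above; everything
except getFromStore and onsuccess is an assumption made for the example:

// Hypothetical convenience wrapper, analogous to the wrappers most
// developers put around XMLHttpRequest. The onerror hook is assumed.
function getRecord(asyncDb, storeName, key, callback) {
  var req = asyncDb.getFromStore(storeName, key);
  req.onsuccess = function(r) { callback(null, r.result); };
  req.onerror = function(r) { callback(r, null); };
}

// Usage:
getRecord(asyncDb, "contacts", 42, function(err, contact) {
  if (!err) { /* use contact... */ }
});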

Regards,
-pablo