RE: [IndexedDB] Closing on bug 9903 (collations)
From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On Behalf Of Keean Schupke Sent: Tuesday, May 31, 2011 11:51 PM >> On 1 June 2011 01:37, Pablo Castro wrote: >> >> -Original Message- >> From: simetri...@gmail.com [mailto:simetri...@gmail.com] On Behalf Of Aryeh >> Gregor >> Sent: Tuesday, May 31, 2011 3:49 PM >> >> >> On Tue, May 31, 2011 at 6:39 PM, Pablo Castro >> >> wrote: >> >> > No, that was poor wording on my part, I keep using "locale" in the >> >> > wrong context. I meant to have the API take a proper collation >> >> > identifier. The identifier can be as specific as the caller wants it to >> >> > be. The implementation could choose to not honor some specific detail >> >> > if it can't handle it (to the extent that doing so is allowed by the >> >> > specification of collation names), or fail because it considers that >> >> > not handling a particular aspect of the collation identifier would >> >> > severely deviate from the caller's expectations. >> >> >> >> I'm not sure I understand you. My personal opinion is that there >> >> should be no undefined behavior here. If authors are allowed to pass >> >> collation identifiers, the spec needs to say exactly how they're to be >> >> interpreted, so the same identifier passed to two different browsers >> >> will result in the same collation, i.e., the same strings need to sort >> >> the same cross-browser. Having only binary collation is better than >> >> having non-binary collations but not defining them, IMO. >> I thought BCP47 allowed implementations to drop subtags if needed. I just >> re-read the spec and it seems that it only allows to do that in constrained >> cases where you can't fit the whole name in your buffer (which wouldn't >> apply to the context discussed here). My first instinct is that this is >> quite a bit to guarantee (full consistency in collation), but it seems that >> that's what the spec is shooting for. 
>>
>> >> > Given the amount of debate on this, could we at least agree that we can
>> >> > do binary for v1? We can then have an open item for v2 on taking
>> >> > collation names and sort according to UCA or taking callbacks and such.
>> >>
>> >> I'm okay with supporting only binary to start with.
>> Great. I'll still wait a bit to see what other folks think, and then update
>> the bug one way or the other.
>>
>> Thanks
>> -pablo
>>
>> The discussion sounds like it is headed in the right direction. Are there
>> any issues with non-unicode encodings that need to be dealt with (HTTP
>> headers default to ISO-8859 I think). Would people be expected to convert on
>> read into UTF-16 strings or use typed-arrays?

I asked around here and folks actually pointed out that the JavaScript spec seems to describe exactly what we need. In [1], section 11.8.5, the relevant fragment starting at step 4 goes:

Else, both px and py are Strings
a. If py is a prefix of px, return false. (A String value p is a prefix of String value q if q can be the result of concatenating p and some other String r. Note that any String is a prefix of itself, because r may be the empty String.)
b. If px is a prefix of py, return true.
c. Let k be the smallest nonnegative integer such that the character at position k within px is different from the character at position k within py. (There must be such a k, for neither String is a prefix of the other.)
d. Let m be the integer that is the code unit value for the character at position k within px.
e. Let n be the integer that is the code unit value for the character at position k within py.
f. If m < n, return true. Otherwise, return false.

It also has a note below indicating:

NOTE 2 The comparison of Strings uses a simple lexicographic ordering on sequences of code unit values. There is no attempt to use the more complex, semantically oriented definitions of character or string equality and collating order defined in the Unicode specification. Therefore String values that are canonically equal according to the Unicode standard could test as unequal. In effect this algorithm assumes that both Strings are already in normalised form. Also, note that for strings containing supplementary characters, lexicographic ordering on sequences of UTF-16 code unit values differs from that on sequences of code point values.

This is very much in line with what we've been discussing, and has the extra feature of being compatible with JavaScript order. So it looks like we could reference (or inline) this in the spec and have a fully specified order for keys with string content.

Thoughts?

Thanks
-pablo

[1] http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf
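
The comparison steps quoted from section 11.8.5 can be sketched directly in JavaScript. This is only an illustration, assuming ES2015 string methods; the function name lessThanByCodeUnits is made up for this note and is not part of any spec. It should agree with the built-in < operator on strings, and the last few lines show the supplementary-character caveat from NOTE 2.

```javascript
// Sketch of ECMA-262 section 11.8.5, step 4 (both operands are Strings).
// lessThanByCodeUnits is a hypothetical name used only for this illustration.
function lessThanByCodeUnits(px, py) {
  if (px.startsWith(py)) return false; // a. py is a prefix of px
  if (py.startsWith(px)) return true;  // b. px is a prefix of py
  let k = 0;                           // c. first position where they differ
  while (px.charCodeAt(k) === py.charCodeAt(k)) k++;
  const m = px.charCodeAt(k);          // d. code unit value at k in px
  const n = py.charCodeAt(k);          // e. code unit value at k in py
  return m < n;                        // f.
}

// U+1F600 is stored as the surrogate pair D83D DE00, so by code units it
// sorts before U+FF5E even though its code point value is larger.
const supplementary = "\u{1F600}";
const bmp = "\uFF5E";
const byCodeUnit = lessThanByCodeUnits(supplementary, bmp);            // true
const byCodePoint = supplementary.codePointAt(0) < bmp.codePointAt(0); // false
```

The two results differ, which is the situation the note warns about: a store that sorts keys by code unit and one that sorts by code point would disagree on such keys.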
RE: [IndexedDB] Evictable stores
From: dgro...@google.com [mailto:dgro...@google.com] On Behalf Of David Grogan Sent: Tuesday, June 07, 2011 1:01 PM >> We (chrome) are still having internal discussions about evictable vs >> non-evictable storage; we're on board with worrying about this in v2. >> On Tue, May 31, 2011 at 5:33 PM, Jonas Sicking wrote: >> On Tue, May 31, 2011 at 3:46 PM, Pablo Castro >> wrote: >> >> > We discussed evictable stores some time ago and captured it in bug >> >> > 11350 [1], but I haven't seen further discussion on it and it hasn't >> >> > gone into the spec. I'm curious on where folks are with this? Should we >> >> > move it to v2? Should we just allow UAs to have their own policy around >> >> > eviction (back at TPAC it seemed folks had reasonable but different >> >> > strategies for handling when to allow websites to use storage already). >> >> I think this is a very interesting feature, but one that I'd prefer to >> >> move to a version 2 as it isn't a required feature and is one that >> >> seems easy to "retrofit". >> >> >> >> / Jonas Got it. I postponed the bug.
RE: [IndexedDB] Evictable stores
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Tuesday, May 31, 2011 5:34 PM >> On Tue, May 31, 2011 at 3:46 PM, Pablo Castro >> wrote: >> > We discussed evictable stores some time ago and captured it in bug 11350 >> > [1], but I haven't seen further discussion on it and it hasn't gone into >> > the spec. I'm curious on where folks are with this? Should we move it to >> > v2? Should we just allow UAs to have their own policy around eviction >> > (back at TPAC it seemed folks had reasonable but different strategies for >> > handling when to allow websites to use storage already). >> >> I think this is a very interesting feature, but one that I'd prefer to >> move to a version 2 as it isn't a required feature and is one that >> seems easy to "retrofit". >> >> / Jonas The feature is already captured in the wiki page that tracks future features [1]. So I guess we can just resolve the bug as "later". Jeremy, the bug is currently assigned to you, were you doing work on it or should I just resolve it? Thanks -pablo [1] http://www.w3.org/2008/webapps/wiki/IndexedDatabaseFeatures
RE: [IndexedDB] Closing on bug 9903 (collations)
-Original Message- From: simetri...@gmail.com [mailto:simetri...@gmail.com] On Behalf Of Aryeh Gregor Sent: Tuesday, May 31, 2011 3:49 PM >> On Tue, May 31, 2011 at 6:39 PM, Pablo Castro >> wrote: >> > No, that was poor wording on my part, I keep using "locale" in the wrong >> > context. I meant to have the API take a proper collation identifier. The >> > identifier can be as specific as the caller wants it to be. The >> > implementation could choose to not honor some specific detail if it can't >> > handle it (to the extent that doing so is allowed by the specification of >> > collation names), or fail because it considers that not handling a >> > particular aspect of the collation identifier would severely deviate from >> > the caller's expectations. >> >> I'm not sure I understand you. My personal opinion is that there >> should be no undefined behavior here. If authors are allowed to pass >> collation identifiers, the spec needs to say exactly how they're to be >> interpreted, so the same identifier passed to two different browsers >> will result in the same collation, i.e., the same strings need to sort >> the same cross-browser. Having only binary collation is better than >> having non-binary collations but not defining them, IMO. I thought BCP47 allowed implementations to drop subtags if needed. I just re-read the spec and it seems that it only allows to do that in constrained cases where you can't fit the whole name in your buffer (which wouldn't apply to the context discussed here). My first instinct is that this is quite a bit to guarantee (full consistency in collation), but it seems that that's what the spec is shooting for. >> > Given the amount of debate on this, could we at least agree that we can do >> > binary for v1? We can then have an open item for v2 on taking collation >> > names and sort according to UCA or taking callbacks and such. >> >> I'm okay with supporting only binary to start with. Great. 
I'll still wait a bit to see what other folks think, and then update the bug one way or the other. Thanks -pablo
[IndexedDB] Evictable stores
We discussed evictable stores some time ago and captured it in bug 11350 [1], but I haven't seen further discussion on it and it hasn't gone into the spec. I'm curious on where folks are with this? Should we move it to v2? Should we just allow UAs to have their own policy around eviction (back at TPAC it seemed folks had reasonable but different strategies for handling when to allow websites to use storage already). Thanks, -pablo [1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=11350
RE: [IndexedDB] Closing on bug 9903 (collations)
-Original Message- From: simetri...@gmail.com [mailto:simetri...@gmail.com] On Behalf Of Aryeh Gregor Sent: Friday, May 06, 2011 10:05 AM >> On Fri, May 6, 2011 at 5:18 AM, Jonas Sicking wrote: >> > Based on that, my conclusion is that we should go with what Pablo is >> > proposing. And I think we should do it for v1. >> >> If I understand correctly, Pablo's proposal is that the author be >> allowed to specify a locale, and the browser can collate in some >> undefined way based on that locale. That sounds like a really bad >> idea for interop. If non-binary collation is supported in a first >> version, it should be either No, that was poor wording on my part, I keep using "locale" in the wrong context. I meant to have the API take a proper collation identifier. The identifier can be as specific as the caller wants it to be. The implementation could choose to not honor some specific detail if it can't handle it (to the extent that doing so is allowed by the specification of collation names), or fail because it considers that not handling a particular aspect of the collation identifier would severely deviate from the caller's expectations. >> 1) Two choices, binary or UCA 6.0.0. (AFAIK, UCA gives fairly good >> results for most languages even without tailoring, so it might be just >> fine for v1. It's vastly better than binary, for sure.) Given the amount of debate on this, could we at least agree that we can do binary for v1? We can then have an open item for v2 on taking collation names and sort according to UCA or taking callbacks and such. >> 2) In addition to binary and UCA 6.0.0, allow UCA 6.0.0 tailored by >> any of the locales defined by CLDR 1.9.1. >> >> There also needs to be some thought put into how to handle version >> updates, since browsers cannot update their UCA or CLDR implementation >> without rebuilding all existing indexes that used it (unless they keep >> the old implementation forever). 
It might be that browsers should >> just stick to a fixed version for the time being (like 6.0.0 and >> 1.9.1), and we might decide that no further APIs are needed now to >> accommodate possible future switches, but at least some thought needs >> to be given to it. I wonder if the API (independently of when we get to this) should include the version either as part of the collation identifier or as a separate argument. This would allow UAs to support a version or two for a while, and then phase them out as they fall out of use in favor of newer ones. >> On consideration, I don't think user-specified sortkey functions are >> necessary at this stage. If collations are to be identified by >> strings for now, we could always overload the value to accept a >> function at some later date if we wanted to support that. So I >> wouldn't worry about that further. I agree. -pablo
[IndexedDB] Closing on bug 9903 (collations)
We've had quite a bit of debate on this but I don't think we've reached closure. At this point I would be fine with either one of a) postpone to v2 and agree that for now we'll just do binary collation everywhere or b) the last form of the proposal sent around: extra "collation" argument (following BCP47 plus whatever the UA wants to allow) in createObjectStore/createIndex, plus a collation property to interrogate it; no way to change the collation of a store/index once created. Given that this turned out to be a more elaborate topic than I had originally expected and that it doesn't seem to have a lot of traction right now, my preference would be to postpone to v2. Thoughts? Once we make a call I'll make sure the spec reflects it. Thanks -pablo
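
To make option b) concrete, here is a rough sketch of the shape being discussed: a "collation" option passed at creation time, a read-only collation property, and no way to change it afterwards. Everything below is a hypothetical in-memory mock (MockDatabase and MockObjectStore are invented names); it is not the IndexedDB API, and the option was never specified in this form in the thread.

```javascript
// Hypothetical mock of proposal (b); this is NOT the real IndexedDB API.
class MockObjectStore {
  #collation; // fixed at creation time, no setter is exposed

  constructor(name, options = {}) {
    this.name = name;
    // No collation given (or null) stands for binary collation.
    this.#collation = options.collation ?? null;
  }

  // Read-only property to interrogate the collation.
  get collation() {
    return this.#collation;
  }
}

class MockDatabase {
  constructor() {
    this.stores = new Map();
  }

  createObjectStore(name, options = {}) {
    const store = new MockObjectStore(name, options);
    this.stores.set(name, store);
    return store;
  }
}

const mockDb = new MockDatabase();
const contactsStore = mockDb.createObjectStore("contacts", { collation: "en-US" });
const logsStore = mockDb.createObjectStore("logs"); // defaults to binary (null)
```

Keeping the collation immutable sidesteps the re-indexing and uniqueness questions raised elsewhere in the thread, at the cost of having to copy the data into a new store to change it.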
[IndexedDB] Exceptions in IDB and the DOMException
This came up today, and I don't remember having a conversation about it with folks. We currently have IDBDatabaseException with some error codes as constants and code/message properties. Looking at DOMException as defined in DOM Core [1], it turns out that a) the pattern of the class is identical, but instead of code/message it has code/name, and b) there are some errors that are present in both or that are very close (e.g. NOT_FOUND_ERR, DATA_CLONE_ERR, QUOTA_EXCEEDED_ERR). Would it be worth trying to use the constants of DOMException when there's one already there that matches the need? If so, would it be just the constants that we would be reusing, or would we have to throw a DOMException instead of an IDBDatabaseException? Separately, in reference to a) above, should we change IDBDatabaseException.message to IDBDatabaseException.name for consistency?

Thanks
-pablo

[1] http://www.w3.org/TR/2010/WD-domcore-20101007/#exception-domexception
RE: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Keean Schupke Sent: Monday, April 04, 2011 10:17 PM >> Something like RelationalDB gives you the power of a relational-db with no >> dependence on a specific implementation of SQL, so it would be compatible >> enough for the web. It fixes all the problems with the standardisation of >> WebSQL that have been talked about so far. I think it would find no >> technical issues that block its standardisation. As a high level DB API it >> does not need all the low-level features of IndexedDB, so its API can be >> much simpler and cleaner. RelationalDB can at least be provided as a library >> on top of IndexedDB, and it can use WebSQL where it is supported. My concern >> with the library approach is performance when implemented on top of >> IndexedDB. The goal of IndexedDB has always been to enable things like RelationalDB and CouchDB to be built on top, while maintaining a reasonable level of functionality for those that wanted to use it directly. I really like the idea of thinking of RelationalDB as something that's built as a library on top of IndexedDB. Are there specific tweaks we can make to IndexedDB so it can be a good lower-layer for RelationalDB, such that RelationalDB could be built as a pure JavaScript library? Thanks -pablo
RE: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Thursday, March 31, 2011 11:36 AM

>> I can find a lot of stuff on collation, but not a lot about why it could not
>> be done in a library. Could you summarise the reasons why this needs to be
>> core functionality for me?
>>
>> Sorry, but that stuff is paged out of my brain. Pablo, can you?
>>
>> A library could choose to use an object store as meta-data to store the
>> collation orders that it is using for various indexes for example.

- Currently there are no APIs in JavaScript to compare strings using specific collations. There are folks that are looking into this, but it will need time.
- I'm far from an expert in the topic, but from talking to folks that understand this well, it seems that implementing this entirely in JavaScript would mean downloading collation tables and applying them as needed in callbacks. Not only does this mean a hit in download size/time for the app, but also that callbacks have to either download stuff or inline collation rules/tables in the callback itself.
- In purely practical terms, I suspect the 80% scenario can be covered by implementing this natively, having it be fast and simple to use for common cases. I'm not pushing back on the callback stuff, just saying that I find it valuable to have users simply say "en-US" and get what they wanted.
- Also from the practical perspective, simple cases that don't require the flexibility can avoid having to take care of making the callbacks perfectly consistent even as you roll out updates that may hit only some of the pages, use components written by someone else, etc.
- By default we would still do binary collation (there was a question in the thread, I forget exactly where).

Thanks
-pablo
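
A note for readers coming to this thread later: the statement that JavaScript has no collation-aware comparison reflects 2011. The ECMAScript Internationalization API (ECMA-402) has since added Intl.Collator, which takes a BCP 47 language tag. The following is a minimal sketch, assuming a runtime that ships Intl (current browsers and Node.js do); the exact order produced for a given tag depends on the runtime's locale data.

```javascript
const sampleWords = ["z", "ä", "a"];

// Default sort compares UTF-16 code units, i.e. "binary" order.
const binaryOrder = [...sampleWords].sort();           // ["a", "z", "ä"]

// Collation-aware sort using a BCP 47 tag.
const germanCollator = new Intl.Collator("de");
const germanOrder = [...sampleWords].sort(germanCollator.compare); // ["a", "ä", "z"]
```

This covers comparison in page script only; it does not by itself give an index inside the database a collation, which is the part the thread is debating.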
RE: [IndexedDB] Spec changes for international language support
From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On Behalf Of Keean Schupke
Sent: Tuesday, March 22, 2011 5:34 PM

>> IMHO not the job of Idb to store the callbacks, so I don't see this
>> complexity as a reason not to implement the API using callbacks. I think
>> having one consistent API is more important.
>> Specifying the collation 'name' has all the same problems as callbacks
>> (needs to be re-done on every page, possibility of using different
>> collations on different pages).
>> Really a 'function' is just a symbol for a collation. A function name is a
>> better symbol for a collation than a string. Functions have a uniqueness
>> property strings do not. So specifying a function as the collation
>> instead of a string really is the same thing. Consider below:

I don't think it's the same. If we don't store the callbacks in the database, every page has to have full knowledge of the database schema (at least all the indexes) all the time, instead of just pulling that in on demand when needed. It also means we can never allow browser developer tools or generic dev-tool webpages to modify the database, because indexes would become invalid (I'm not sure allowing tools to mess with the database in general is a good idea, but I thought it illustrated the point well).

I wonder if the overall issue we're discussing has to do with "how embedded" the database is. In BDB scenarios, where the database is completely invisible outside of an application, many of these decisions make more sense. I don't think of web applications that way. I think of them more as a number of building blocks (pages, pieces within pages, tool pages added on the side) that are authored and sometimes even versioned independently, and the interface between those building blocks and the store is public and visible to tools and generic data browsers. All that changes the assumptions in the overall picture.

-pablo
RE: [IndexedDB] Spec changes for international language support
From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On Behalf Of Keean Schupke Sent: Friday, March 18, 2011 8:17 PM >> On 18 March 2011 19:29, Pablo Castro wrote: >> >> From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On >> Behalf Of Keean Schupke >> Sent: Friday, March 18, 2011 1:53 AM >> >> >> See my proposal in another thread. The basic idea is to copy BDB. Have a >> >> primary index that is based on an integer, something primitive and fast. >> >> Allow secondary indexes which use a callback to generate a binary index >> >> key. IDB shifts the complexity out into a library. Common use cases can >> >> be provided (a hash of all fields in the object, internationalised >> >> bidirectional lexicographic etc...), but the user is free to write their >> >> own for less usual cases (for example indexing by the last word in a name >> >> string to order by surname). I agree with Jeremy's comments on the other thread for this. Having the callback mechanism definitely sounds interesting but there are a ton of common cases that we can solve by just taking a language identifier, I'm not sure we want to make people work hard to get something that's already supported in most systems. The idea of having a callback to compute the index value feels incremental to this, so we could take on it later on without disrupting the explicit international collation stuff. >> >> The idea would be to provide pre-defined implementations of the callback for >> common use cases, then it is just as simple to register a callback as set >> any other option. All this means to the API is you pass a function instead >> of a string. It also is better for modularity as all the code relating to >> the sort order is kept in the callback functions. 
>>
>> The difference comes down to something like:
>>
>> index.set_order_lexicographic('us');
>>
>> vs
>>
>> index.set_order_method(order_lexicographic('us'));
>>
>> So more than just setting a property like the first case, where presumably
>> all the ordering code is mixed in with the indexing code, the second case
>> encapsulates all the ordering code in the function returned from the
>> execution of order_lexicographic('us'). This function would represent a
>> mapping from the object being indexed to a binary blob that is the actual
>> stored index data.
>>
>> So doing it this way does not necessarily make things harder, and it
>> improves encapsulation, the type-safety, and the flexibility of the API.

Yep, we talked about supporting callbacks already in the other threads and in this one. As I mentioned before, I think this is incremental to the basic feature of taking a collation name. I do realize you can just pass a pre-implemented function, but that opens the door to a bunch of things we'd need to handle, including possibly storing code in the database (so that proper updates don't depend on each page re-registering all the index callbacks), handling scripts with the appropriate context to run during index updates, etc. I would much rather have basic functionality in place and then expand as needed once we have users using the API.

Thanks
-pablo
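
For readers trying to picture the callback approach being debated, here is a toy sketch of a key-generating function, using the "index by the last word in a name" case mentioned in the quoted proposal. The names orderByLastWord and ToyIndex are invented for this note; nothing like them exists in IndexedDB, and a real implementation would have to deal with the persistence and re-registration issues raised in the reply.

```javascript
// Hypothetical factory: returns a function that maps an object to its sort key
// (here, the last word of the given field, lowercased).
function orderByLastWord(field) {
  return (obj) => {
    const parts = obj[field].trim().split(/\s+/);
    return parts[parts.length - 1].toLowerCase();
  };
}

// Toy in-memory index: keeps entries ordered by the key the callback returns.
// Keys are compared with < and >, i.e. by UTF-16 code unit ("binary") order.
class ToyIndex {
  constructor(keyFn) {
    this.keyFn = keyFn;
    this.entries = [];
  }

  put(obj) {
    this.entries.push({ key: this.keyFn(obj), value: obj });
    this.entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  }

  values() {
    return this.entries.map((entry) => entry.value);
  }
}

const bySurname = new ToyIndex(orderByLastWord("name"));
bySurname.put({ name: "Alan Turing" });
bySurname.put({ name: "Grace Hopper" });
bySurname.put({ name: "Ada Lovelace" });
const surnameOrder = bySurname.values().map((person) => person.name);
// ["Grace Hopper", "Ada Lovelace", "Alan Turing"]
```

Note that the index is only consistent as long as every writer uses the same key function, which is the concern about pages, tools, and independently versioned components.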
RE: [IndexedDB] Any particular reason built-in properties are not indexable?
-Original Message- From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Monday, March 21, 2011 2:54 PM >> On Mon, Mar 21, 2011 at 11:51 AM, Pablo Castro >> wrote: >> > The spec today requires that properties key paths point at need to be >> > enumerated (see 3.1.2 "Object Store"). Any particular reason for that? It >> > would be reasonable to allow an index on say the "length" property of a >> > string. Perhaps we're opening the door for too much, so I wanted to double >> > check so we make an explicit call one way or the other. Thoughts? >> >> The structured clone algorithm only copies enumerable properties, >> given how we currently do indexes it would be sort of strange if you >> could add an index on a property that isn't stored. >> >> This is generally not a problem though. Before ES5 there wasn't even a >> way to create non-enumerated properties. They only appeared on host >> objects which you can't structured-clone anyway. >> >> One exception to this is Array.length which you mention. While that >> property isn't copied by the structured-clone algorithm, it's >> recreated by it since a new array object is created which contains a >> length property computed according to the same rules. >> >> We could special-case the array.length property in the keyPath >> evaluation algorithm. We might want to do the same for other >> host-object properties such as Blob.size and Blob.type since they >> aren't actually structured-cloned since they live on the prototype >> chain rather than the objects themselves. I'm fine not supporting it, I just wanted to bring it up because it came up here and wanted to make sure we made an explicit call. I'd rather not one-off Array.length, so it seems it would be best to just not do it across the board. Thanks -pablo
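
The enumerability point can be checked in plain JavaScript. The sketch below walks a dotted key path and only follows own enumerable properties, which is the rule as described in this thread; evaluateKeyPathSketch is a made-up name and this is not the spec's key path algorithm.

```javascript
// Illustrative only: follow a dotted key path, accepting a step only when it
// names an own enumerable property of an object.
function evaluateKeyPathSketch(value, keyPath) {
  let current = value;
  for (const part of keyPath.split(".")) {
    const isObject = current !== null && typeof current === "object";
    if (!isObject || !Object.prototype.propertyIsEnumerable.call(current, part)) {
      return undefined; // no usable key under this rule
    }
    current = current[part];
  }
  return current;
}

const sampleRecord = { name: "abc", tags: ["x", "y"], address: { city: "Oslo" } };

const cityKey = evaluateKeyPathSketch(sampleRecord, "address.city");  // "Oslo"
const tagCountKey = evaluateKeyPathSketch(sampleRecord, "tags.length"); // undefined
const nameLengthKey = evaluateKeyPathSketch(sampleRecord, "name.length"); // undefined

// Array length exists but is not enumerable, which is why it falls outside the rule.
const lengthIsEnumerable = Object.prototype.propertyIsEnumerable.call(["x", "y"], "length"); // false
```

Special-casing Array.length, as floated in the quoted message, would amount to adding an exception branch before the enumerability check.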
[IndexedDB] Any particular reason built-in properties are not indexable?
The spec today requires that the properties that key paths point at be enumerable (see 3.1.2 "Object Store"). Any particular reason for that? It would be reasonable to allow an index on, say, the "length" property of a string. Perhaps we're opening the door for too much, so I wanted to double check so that we make an explicit call one way or the other. Thoughts? Thanks -pablo
RE: [IndexedDB] Spec changes for international language support
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Friday, March 18, 2011 1:57 PM >> >>> However there is another problem to consider here. Can switching >> >>> collation on a objectStore or a unique index can affect its validity? >> >>> I.e. if you switch from a case sensitive to a case insensitive >> >>> collation, does that mean that if you have two entries with the >> >>> primary keys "Sweden" and "sweden" they collide and thus the change of >> >>> collation must result in an error (or aborted transaction)? >> >>> >> >>> I do seem to recall that there are ways to do at least case >> >>> sensitivity such that you generally don't take case into account when >> >>> sorting, unless two entries are exactly the same, in which case you do >> >>> look at casing to differentiate them. However I don't really know a >> >>> whole lot about this and so defer to people that know >> >>> internationalization better. >> > >> > This is a good point. It makes me lean toward not allowing changing the >> > collation of an index or store. That means we could just have an optional >> > parameter (in the generic parameter object thingy we have now) on >> > createObjectStore and createIndex that indicates the collation name. It >> > seems minimally disruptive, it doesn't tax people that don't care about >> > it, and since there is no setCollation we don't have the problem of not >> > being able to re-index the data. >> >> So there is no way to specify things such that the collation doesn't >> affect unique-ness? If so, I tend to agree. The problem is that different collations will consider different things unique. This is bound to be variable across languages and such, so I'm not sure we want to be in the business of fine-tuning this. It seems that being a bit more restrictive could result in a more robust result overall. If someone really needs to change the collation they can copy the table manually...not great, but if we think it's a corner case it's probably fine. 
>> >>> > Another piece of feedback I heard consistently as I discussed this >> >>> > with various folks at Microsoft is the need to be able to pick up what >> >>> > the UA would consider the collation that's most appropriate for the >> >>> > user environment (derived from settings, page language or whatever). >> >>> > We could support this by introducing a special value that you can >> >>> > pass to setCollation that indicates "pick whatever is the right for >> >>> > the environment's language right now". Given that there is no other >> >>> > way for people to discover the user preference on this, I think this >> >>> > is pretty important. >> >>> I would be fine with this as long as it's a explicit opt-in. There is >> >>> definitely a risk that people will do this and then only do testing in >> >>> one language, but it seems to me like a useful use case to support, >> >>> and I don't see a way of supporting this while completely avoiding the >> >>> risk of internationalization bugs. >> > >> > I agree, it should be opt-in. I still assume we'll default to binary >> > collation (same if you specify the collation value as null). I was reading >> > the BCP 47 [1] and in section 4.1 "Choice of Language Tag" the item #7 >> > seems to describe what we're looking for. The value "i-default" seems to >> > match our needs close enough, so callers could use that value. >> > Discoverability is not great, but we avoid having to specify something >> > new, and arguably they'll need to read somewhere that this argument is a >> > BCP47-compatible value, and we could put a comment about "i-default" right >> > there. >> >> Sounds good to me. Though you seem to have forgotten to include the >> [1] reference. Oops, here it goes: [1] http://tools.ietf.org/html/bcp47
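
The uniqueness concern discussed in this message ("Sweden" vs "sweden") can be shown with a small check. This is a sketch under simplifying assumptions: a collation is modelled as a function that normalizes a key, with identity standing in for binary collation and toLowerCase() standing in for a case-insensitive one. The helper name findUniqueCollisions is invented for this note.

```javascript
// Returns pairs of keys that are distinct as given but equal after normalization,
// i.e. the entries that would violate a unique constraint under that collation.
function findUniqueCollisions(keys, normalize) {
  const seen = new Map();
  const collisions = [];
  for (const key of keys) {
    const normalized = normalize(key);
    if (seen.has(normalized)) {
      collisions.push([seen.get(normalized), key]);
    } else {
      seen.set(normalized, key);
    }
  }
  return collisions;
}

const primaryKeys = ["Sweden", "sweden", "Norway"];

// Binary collation: every key is distinct.
const binaryCollisions = findUniqueCollisions(primaryKeys, (k) => k);
// []

// Crude case-insensitive stand-in: two keys now collide.
const caseInsensitiveCollisions = findUniqueCollisions(primaryKeys, (k) => k.toLowerCase());
// [["Sweden", "sweden"]]
```

Data that is valid under one collation can therefore be invalid under another, which is the argument for fixing the collation when the store or index is created.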
RE: [IndexedDB] Spec changes for international language support
From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On Behalf Of Keean Schupke Sent: Friday, March 18, 2011 1:53 AM >> See my proposal in another thread. The basic idea is to copy BDB. Have a >> primary index that is based on an integer, something primitive and fast. >> Allow secondary indexes which use a callback to generate a binary index key. >> IDB shifts the complexity out into a library. Common use cases can be >> provided (a hash of all fields in the object, internationalised >> bidirectional lexicographic etc...), but the user is free to write their own >> for less usual cases (for example indexing by the last word in a name string >> to order by surname). I agree with Jeremy's comments on the other thread for this. Having the callback mechanism definitely sounds interesting but there are a ton of common cases that we can solve by just taking a language identifier, I'm not sure we want to make people work hard to get something that's already supported in most systems. The idea of having a callback to compute the index value feels incremental to this, so we could take on it later on without disrupting the explicit international collation stuff. >> On 18 March 2011 02:19, Jonas Sicking wrote: >> 2011/3/17 Pablo Castro : >> > >> > From: Jonas Sicking [mailto:jo...@sicking.cc] >> > Sent: Tuesday, March 08, 2011 1:11 PM >> > >> >>> All in all, is there anything preventing adding the API Pablo suggests >> >>> in this thread to the IndexedDB spec drafts? >> > >> > I wanted to propose a couple of specific tweaks to the initial proposal >> > and then unless I hear pushback start editing this into the spec. >> > >> > From reading the details on this thread I'm starting to realize that >> > per-database collations won't do it. What did it for me was the example >> > that has a fuzzier matching mode (case/accent insensitive). 
This is >> > exactly the kind of index I would want to sort people's names in my >> > address book, but most likely not the index I'll want to use for my >> > primary key. >> > >> > Refactoring the API to accommodate for this would mean to move the >> > setCollation() method and the collation property to the object store and >> > index objects. If we were willing to live without the ability to change >> > them we could take collation as one of the optional parameters to >> > createObjectStore()/createIndex() and reduce a bit of surface area... >> Unfortunately I think you bring up good use cases for >> per-objectStore/index collations. It's definitely tempting to just add >> it as a optional parameter to createObjectStore/createIndex. The >> downside is obviously pushing more complexity onto web developers. >> Complexity which will be duplicated across sites. >> >> However there is another problem to consider here. Can switching >> collation on a objectStore or a unique index can affect its validity? >> I.e. if you switch from a case sensitive to a case insensitive >> collation, does that mean that if you have two entries with the >> primary keys "Sweden" and "sweden" they collide and thus the change of >> collation must result in an error (or aborted transaction)? >> >> I do seem to recall that there are ways to do at least case >> sensitivity such that you generally don't take case into account when >> sorting, unless two entries are exactly the same, in which case you do >> look at casing to differentiate them. However I don't really know a >> whole lot about this and so defer to people that know >> internationalization better. This is a good point. It makes me lean toward not allowing changing the collation of an index or store. That means we could just have an optional parameter (in the generic parameter object thingy we have now) on createObjectStore and createIndex that indicates the collation name. 
It seems minimally disruptive, it doesn't tax people that don't care about it, and since there is no setCollation we don't have the problem of not being able to re-index the data. >> > Another piece of feedback I heard consistently as I discussed this with >> > various folks at Microsoft is the need to be able to pick up what the UA >> > would consider the collation that's most appropriate for the user >> > environment (derived from settings, page language or whatever). We could >> > support this by introducing a special value that you can pass to >> > setCollation that indicates "pick whatever is
RE: [IndexedDB] Spec changes for international language support
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Tuesday, March 08, 2011 1:11 PM >> All in all, is there anything preventing adding the API Pablo suggests >> in this thread to the IndexedDB spec drafts? I wanted to propose a couple of specific tweaks to the initial proposal and then, unless I hear pushback, start editing this into the spec. From reading the details on this thread I'm starting to realize that per-database collations won't do it. What did it for me was the example that has a fuzzier matching mode (case/accent insensitive). This is exactly the kind of index I would want to sort people's names in my address book, but most likely not the index I'll want to use for my primary key. Refactoring the API to accommodate this would mean moving the setCollation() method and the collation property to the object store and index objects. If we were willing to live without the ability to change them we could take collation as one of the optional parameters to createObjectStore()/createIndex() and reduce a bit of surface area... I don't have a strong preference there. In any case both would use BCP47 names as discussed in this thread (as Jonas pointed out, implementations can also do their thing as long as they don't interfere with BCP47). Another piece of feedback I heard consistently as I discussed this with various folks at Microsoft is the need to be able to pick up what the UA would consider the collation that's most appropriate for the user environment (derived from settings, page language or whatever). We could support this by introducing a special value that you can pass to setCollation that indicates "pick whatever is right for the environment's language right now". Given that there is no other way for people to discover the user preference on this, I think this is pretty important. Thanks -pablo
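To make concrete why a case/accent-insensitive collation is a poor fit for a primary key: under such a collation, keys that differ only by case compare as equal, so they would collide in a store that requires unique keys. The sketch below is plain JavaScript and does not touch IndexedDB. It uses `Intl.Collator`, which did not exist when this thread was written, and `findCollisions` plus the sample keys are hypothetical names introduced only for illustration.

```javascript
// Return groups of keys that compare as equal under the given collator.
// (Hypothetical helper, not part of any IndexedDB proposal.)
function findCollisions(keys, collator) {
  const sorted = [...keys].sort(collator.compare);
  const groups = [];
  let current = [];
  for (const key of sorted) {
    // Start a new group whenever the key differs from the group's first key.
    if (current.length > 0 && collator.compare(current[0], key) !== 0) {
      if (current.length > 1) groups.push(current);
      current = [];
    }
    current.push(key);
  }
  if (current.length > 1) groups.push(current);
  return groups;
}

const keys = ["Sweden", "sweden", "Norway"];

// "variant" distinguishes case; "accent" ignores case but keeps accents.
const caseSensitive = new Intl.Collator("en", { sensitivity: "variant" });
const caseInsensitive = new Intl.Collator("en", { sensitivity: "accent" });

const strictCollisions = findCollisions(keys, caseSensitive);
const fuzzyCollisions = findCollisions(keys, caseInsensitive);
console.log(strictCollisions.length, fuzzyCollisions.length);
```

If the collation of an existing store could be changed, an implementation would have to run a check of this kind over all existing keys, which is one argument for fixing the collation at creation time.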
RE: Indexed Database API
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Tuesday, March 15, 2011 3:08 PM >> Filed: http://www.w3.org/Bugs/Public/show_bug.cgi?id=12310 I'm not sure if this is a lot more valuable than just creating an index over whatever index key you want plus the primary key, and then seeking to the compound key of the last row in the previous page to resume scanning the next page of records. No strong pushback, just not sure this is worth the extra method. -pablo
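The alternative described above (an index over the chosen key plus the primary key, then seeking to the compound key of the last row of the previous page) can be sketched over an in-memory sorted array. This is an illustration only, not IndexedDB code: `compareKeys`, `nextPage`, and the sample entries are hypothetical names made up for the sketch.

```javascript
// Hypothetical stand-in for an index whose entries are compound keys
// of the form [indexKey, primaryKey], kept in sorted order.
function compareKeys(a, b) {
  // Compare compound keys component by component.
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Return up to pageSize entries strictly after lastKey (or from the start
// when lastKey is null), mimicking a cursor opened with an open lower bound.
function nextPage(sortedEntries, lastKey, pageSize) {
  const page = [];
  for (const entry of sortedEntries) {
    if (lastKey !== null && compareKeys(entry, lastKey) <= 0) continue;
    page.push(entry);
    if (page.length === pageSize) break;
  }
  return page;
}

const entries = [
  ["garcia", 3], ["lee", 1], ["lee", 7], ["novak", 2], ["smith", 5],
].sort(compareKeys);

// Resume each page from the compound key of the previous page's last row.
const first = nextPage(entries, null, 2);
const second = nextPage(entries, first[first.length - 1], 2);
console.log(first, second);
```

Including the primary key in the compound key is what makes resuming safe when several rows share the same index key (the two "lee" rows here).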
RE: [IndexedDB] Compound and multiple keys
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Keean Schupke Sent: Tuesday, March 08, 2011 3:03 PM >> No objections here. >> >> Keean. >> >> On 8 March 2011 21:14, Jonas Sicking wrote: >> On Mon, Mar 7, 2011 at 10:43 PM, Jeremy Orlow wrote: >> > On Fri, Jan 21, 2011 at 1:41 AM, Jeremy Orlow wrote: >> >> >> >> On Thu, Jan 20, 2011 at 6:29 PM, Tab Atkins Jr. >> >> wrote: >> >>> >> >>> On Thu, Jan 20, 2011 at 10:12 AM, Keean Schupke wrote: >> >>> > Compound primary keys are commonly used afaik. >> >>> >> >>> Indeed. It's one of the common themes in the debate between natural >> >>> and synthetic keys. >> >> >> >> Fair enough. >> >> Should we allow explicit compound keys? I.e myOS.put({...}, ['first >> >> name', 'last name'])? I feel pretty strongly that if we do, we should >> >> require this be specified up-front when creating the objectStore. I.e. >> >> add >> >> some additional parameter to the optional options object. Otherwise, >> >> we'll >> >> force implementations to handle variable compound keys for just this one >> >> case, which seems kind of silly. >> >> The other option is to just disallow them. >> > >> > After thinking about it a bunch and talking to others, I'm actually leaning >> > towards both option A and B. Although this will be a little harder for >> > implementors, it seems like there are solid reasons why some users would >> > want to use A and solid reasons why others would want to use B. >> > Any objections to us going that route? >> Not from me. If I don't hear objections I'll write up a spec draft and >> attach it here before committing to the spec. Option A is pretty well understood, I like that one. For option B, at some point we had a debate on whether when indexing an array value we should consider it a single key value or we should unfold it into multiple index records. 
The first option makes it very similar to A in that an array is just a composite value (it is quite a bit more painful to implement...); the second option is interesting in that it allows for new scenarios such as objects with an array for tags, where you want to look up by tag (even after doing options A and B as currently defined, in order to support multiple tags you'd need a second store that keeps the tags + key for the objects you want to tag). Is there any interest in that scenario? Thanks -pablo
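The two readings of an array-valued key can be contrasted with a small in-memory sketch. It is not IndexedDB code and assumes nothing about how either reading would be specified: the record shape and the helpers `indexAsCompositeKey`, `indexUnfolded`, and `lookupByTag` are hypothetical.

```javascript
// Hypothetical records; "tags" is the array-valued property being indexed.
const records = [
  { id: 1, tags: ["db", "web"] },
  { id: 2, tags: ["web"] },
];

// Reading 1: the whole array is a single composite key value,
// so each record contributes exactly one index record.
function indexAsCompositeKey(recs) {
  return recs.map((r) => ({ key: r.tags, primaryKey: r.id }));
}

// Reading 2: the array is unfolded into one index record per element.
function indexUnfolded(recs) {
  const out = [];
  for (const r of recs) {
    for (const tag of r.tags) out.push({ key: tag, primaryKey: r.id });
  }
  return out;
}

// Looking up by a single tag is only direct with the unfolded index.
function lookupByTag(unfoldedIndex, tag) {
  return unfoldedIndex.filter((e) => e.key === tag).map((e) => e.primaryKey);
}

const composite = indexAsCompositeKey(records);
const unfolded = indexUnfolded(records);
const webIds = lookupByTag(unfolded, "web");
console.log(composite.length, unfolded.length, webIds);
```

With the composite reading, finding every record tagged "web" needs the separate tags + key store mentioned above; with the unfolded reading the index itself answers the query.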
RE: [IndexedDB] Spec changes for international language support
From: jungs...@google.com [mailto:jungs...@google.com] On Behalf Of Jungshik Shin Sent: Tuesday, February 22, 2011 2:08 PM >> On Fri, Feb 18, 2011 at 2:34 AM, Bjoern Hoehrmann wrote: >> * Pablo Castro wrote: >> >We discussed international language support last time at the TPAC and I >> >said I'd propose spec text for it. Please find the patch below, the >> >changes mirror exactly the proposal described in the bug we have for >> >tracking this: http://www.w3.org/Bugs/Public/show_bug.cgi?id=9903 >> You should anticipate objections to that; collation is not a property of >> language, for instance, for de-de you typically have dictionary sorting >> and phone book sorting (and of course you have "de-de", "de-ch", and so >> on, so "de" alone would be rather meaningless). So far the W3C and the >> IETF have used resource identifiers to specify collations (see XPath 2.0 >> and RFC 4790) where the IETF allows shorthands like "i;ascii-casemap". >> >> I agree that simply specifying that 'language' be used without saying what >> it means is not sufficient. However, your examples (German phonebook vs >> dictionary) can be >> covered with language identifier framework laid out in >> BCP47 (with 'u' extension). Fair enough. I'll adjust this part of the write up to discuss this in terms of "collation identifier" or "language identifier". >> I do understand that Microsoft uses an extension of language tags for >> the `CultureInfo` in the .NET Framework, where, say, `de-DE_phoneb` is >> used to refer to German phone book sorting, but BCP 47 does not allow >> for that, >> >> There's a way to specify alternate sorting orders (e.g. German phonebook, >> Chinese pinyin, stroke count, radical-stroke count order, etc) under the BCP >> 47 framework >> because it has a mechanism for defining an extension and >> registering it. The Unicode consortium uses that mechanism to define 'u' >> extension and a set of subtags that can >> be used with 'u'.
>> For instance, German phonebook sorting can be identified with >> 'de-DE-u-co-phonebk'. See >> >> https://tools.ietf.org/html/bcp47 >> https://tools.ietf.org/html/rfc6067 >> http://unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers >> >> Also, see Bug 9903 comment 6 by Mark Davis for more examples. Well, I'm just >> copying his comment directly here: >> >> >> To add to what Jungshik said, BCP47 defines standard extensions. The >> extension >> defined by the Unicode consortium >> (http://cldr.unicode.org/index/bcp47-extension) provides for fine-grained >> >> specifications of collation behavior. >> Examples for German: >> de-u-co-phonebk // phonebook order >> de-u-kn-true // numeric sorting, eg Tom2 comes before Tom12 >> de-u-ks-level1 // ignore accents, case differences >> de-u-ks-level2 // ignore case differences >> de-u-ks-level1-kc-true // ignore accents, but not case >> These can be combined, such as: >> de-u-co-phonebk-kn-true-ks-level1-kc-true >> >> neither could you devise a language tag to define something >> like "i;ascii-casemap" (which simply defines A-Z = a-z). >> I'm not sure how specific we want to get on this. In particular, would it be better if we specified it all the way (including which extensions UAs need to support) or if we used BCP47 as the starting point and allowed UAs to support additional extensions as needed? >> I would expect that if browsers offer collations, there would be an in- >> terface for that so you can use them in other places, as such it might >> be wiser to accept something other than a language identifier string. >> >> There's an on-going effort to expose a 'rich' set of I18N API to client-side >> development using Javascript ( >> http://wiki.ecmascript.org/doku.php?id=strawman:i18n_api : The API used to be >> much more extensive than now, but has been scaled down significantly to get >> more browsers on board in its 1st iteration). There we're likely to use BCP >> 47 with 'u' extension (see above).
>> So, I think it'd be better if IndexedDB >> matches what ECMAScript plans to do. This is interesting, do you know how far along this is? >> I also note that collation often involves equivalence testing, but it >> is not clear from your proposal whether that is the case here. It might >> also be a good idea to clearly spell out interoperability expectations; >> if two implementations support some collation, will they behave the same >> for any and all inputs as far as collation is concerned, or should one >> be prepared for slight differences among implementations? I think it's more practical to assume that users should be prepared for slight differences among implementations. Thanks -pablo
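As an illustration of how a BCP 47 identifier with the Unicode 'u' extension selects collation behavior, here is a small sketch using `Intl.Collator`. That API postdates this thread (it grew out of the ECMAScript i18n effort referenced above), so this shows the idea rather than anything IndexedDB specifies. It assumes a runtime that ships ICU collation data for German; the sample names are made up.

```javascript
// Assumption: the runtime has ICU collation data available for "de".
const names = ["Tom12", "Tom2", "Tom1"];

// "kn-true" in the 'u' extension requests numeric ordering of digit runs.
const numeric = new Intl.Collator("de-u-kn-true");
const sortedNumeric = [...names].sort(numeric.compare);

// Without the extension, digits are compared one character at a time.
const plain = new Intl.Collator("de");
const sortedPlain = [...names].sort(plain.compare);

console.log(sortedNumeric, sortedPlain);

// resolvedOptions() reports which settings the implementation actually honored,
// which is one way to detect the implementation differences discussed above.
console.log(numeric.resolvedOptions().numeric);
```

Other identifiers from the list quoted above, such as `de-u-co-phonebk`, are passed the same way; whether a given implementation honors them depends on its collation data.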
[IndexedDB] Spec changes for international language support
We discussed international language support last time at the TPAC and I said I'd propose spec text for it. Please find the patch below, the changes mirror exactly the proposal described in the bug we have for tracking this: http://www.w3.org/Bugs/Public/show_bug.cgi?id=9903 btw - the bug is assigned to Nikunj right now but I think that's just because of an editing glitch. Nikunj please let me know if you were working on it, otherwise I'll just submit the changes once I hear some feedback from this group. Thanks -pablo Left file: \IndexedDB Specs\20110217\Speclet_023_IDB_API_Asynchronous_APIs.original.html Right file: \IndexedDB Specs\20110217\Speclet_023_IDB_API_Asynchronous_APIs.html copy 6 add 7 readonly attribute DOMString language On getting, this attribute MUST return the language that is configured in this database for string collation. If no collation has been configured for a database this value is null and the database will use binary collation. copy 6 copy 6 add 24 IDBRequest setLanguage() This method changes the language used by the database for string collation. Note that this method must only be called from a VERSION_CHANGE transaction callback. Changing the language in a database that already contains data typically involves reading and re-writing the entire database and thus can be a time consuming operation. optional DOMString language The language to be used in the database specified as a language identifier as described in [[!BCP47]]. NOT_ALLOWED_ERR This method was not called from a VERSION_CHANGE transaction callback. DATA_ERR The language parameter contained a string that was not a valid language identifier or was a language identifier not supported by the system. 
copy 6 Left file: \IndexedDB Specs\20110217\Speclet_022_IDB_API_Synchronous_APIs.original.html Right file: \IndexedDB Specs\20110217\Speclet_022_IDB_API_Synchronous_APIs.html copy 6 add 7 readonly attribute DOMString language On getting, this attribute MUST return the language that is configured in this database for string collation. If no collation has been configured for a database this value is null and the database will use binary collation. copy 6 copy 6 add 24 void setLanguage() This method changes the language used by the database for string collation. Note that this method must only be called from a VERSION_CHANGE transaction callback. Changing the language in a database that already contains data typically involves reading and re-writing the entire database and thus can be a time consuming operation. optional DOMString language The language to be used in the database specified as a language identifier as described in [[!BCP47]]. NOT_ALLOWED_ERR This method was not called from a VERSION_CHANGE transaction callback. DATA_ERR The language parameter contained a string that was not a valid language identifier or was a language identifier not supported by the system. copy 6 Left file: \IndexedDB Specs\20110217\Speclet_020_IDB_API_Constructs.original.html Right file: \IndexedDB Specs\20110217\Speclet_020_IDB_API_Constructs.html copy 6 add 4 Every database also has a language that indicates the language that should be used for collating strings when comparing keys. copy 6 copy 6 delete 1 add 2 value with no need to separate them by type. When comparing a DOMString with another DOMString, the database language should be used to determine the specific collation rules to be used. copy 6
RE: [IndexedDB] More questions about IDBRequests always firing (WAS: Reason for aborting transactions)
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Thursday, February 17, 2011 11:51 AM >> On Thu, Feb 17, 2011 at 11:12 AM, Jonas Sicking wrote: >> On Thu, Feb 17, 2011 at 11:02 AM, ben turner wrote: >> >>> Also, what should we do when you enqueue a setVersion transaction and >> >>> then >> >>> close the database handle? Maybe an ABORT_ERR there too? >> >> >> >> Yeah, that'd make sense to me. Just like if you enqueue any other >> >> transaction and then close the db handle. >> > >> > We don't abort transactions that are already in progress when you call >> > db.close()... We just set a flag and prevent further transactions from >> > being created. >> Doh! Of course. >> >> If the setVersion transaction has started then we should definitely >> allow it to finish, just like all other transactions. I don't have a >> strong opinion on if we should let the setVersion transaction start if >> it hasn't yet. Seems most consistent to let it, but if there's a >> strong reason not to I could be convinced. >> >> What if you have two database connections open and both do a setVersion >> transaction and one calls .close (to yield to the other)? Neither can start >> until one or the other actually is closed. If a database is closed (not >> just close pending) then I think we need to abort any blocked setVersion >> calls. If one is already running, it should certainly be allowed to finish >> before we close the database. This sounds reasonable to me (special case and abort the transaction only for blocked setVersion transactions). We should capture it explicitly in the spec, it's the kind of little detail that's easy to forget. -pc
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
(sorry for my random out-of-timing previous email on this thread. please see below for an actually up to date reply) -----Original Message----- From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Monday, February 07, 2011 3:31 PM On Mon, Feb 7, 2011 at 3:07 PM, Jeremy Orlow wrote: > On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking wrote: >> >> On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow wrote: >> > On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking wrote: >> >> >> >> On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow >> >> wrote: >> >> > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher >> >> > wrote: >> >> >> >> >> >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote: >> >> >>> >> >> >>> My current thinking is that we should have some relatively large >> >> >>> limit, maybe on the order of 64k? It seems like it'd be very >> >> >>> difficult >> >> >>> to >> >> >>> hit such a limit with any sort of legitimate use case, and the >> >> >>> chances >> >> >>> of >> >> >>> some subtle data-dependent error would be much less. But a 1GB key >> >> >>> is >> >> >>> just >> >> >>> not going to work well in any implementation (if it doesn't simply >> >> >>> oom >> >> >>> the >> >> >>> process!). So despite what I said earlier, I guess I think we >> >> >>> should >> >> >>> have >> >> >>> some limit...but keep it an order of magnitude or two larger than >> >> >>> what >> >> >>> we >> >> >>> expect any legitimate usage to hit just to keep the system as >> >> >>> flexible >> >> >>> as >> >> >>> possible. >> >> >>> >> >> >>> Does that sound reasonable to people? >> >> >> >> >> >> Are we thinking about making this a MUST requirement, or a SHOULD? >> >> >> I'm >> >> >> hesitant to spec an exact size as a MUST given how technology has a >> >> >> way >> >> >> of >> >> >> changing in unexpected ways that makes old constraints obsolete. >> >> >> But >> >> >> then, >> >> >> I may just be overly concerned about this too. >> >> > >> >> > If we put a limit, it'd be a MUST for sure.
Otherwise people would >> >> > develop >> >> > against one of the implementations that don't place a limit and then >> >> > their >> >> > app would break on the others. >> >> > The reason that I suggested 64K is that it seems outrageously big for >> >> > the >> >> > data types that we're looking at. But it's too small to do much with >> >> > base64 >> >> > encoding binary blobs into it or anything else like that that I could >> >> > see >> >> > becoming rather large. So it seems like a limit that'd avoid major >> >> > abuses >> >> > (where someone is probably approaching the problem wrong) but would >> >> > not >> >> > come >> >> > close to limiting any practical use I can imagine. >> >> > With our architecture in Chrome, we will probably need to have some >> >> > limit. >> >> > We haven't decided what that is yet, but since I remember others >> >> > saying >> >> > similar things when we talked about this at TPAC, it seems like it >> >> > might >> >> > be >> >> > best to standardize it--even though it does feel a bit dirty. >> >> >> >> One problem with putting a limit is that it basically forces >> >> implementations to use a specific encoding, or pay a hefty price. For >> >> example if we choose a 64K limit, is that of UTF8 data or of UTF16 >> >> data? If it is of UTF8 data, and the implementation uses something >> >> else to store the data, you risk having to convert the data just to >> >> measure the size. Possibly this would be different if we measured size >> >> using UTF16 as javascript more or less enforces that the source string >> >> is UTF16 which means that you can measure utf16 size on the cheap, >> >> even if the stored data uses a different format. >> > >> > That's a very good point. What's your suggestion then? Spec unlimited >> > storage and have non-normative text saying that >> > most implementations will >> > likely have some limit? Maybe we can at least spec a minimum limit in >> > terms >> > of a particular character encoding?
(Implementations could translate >> > this >> > into the worst case size for their own native encoding and then ensure >> > their >> > limit is higher.) >> >> I'm fine with relying on UTF16 encoding size and specifying a 64K >> limit. Like Shawn points out, this API is fairly geared towards >> JavaScript anyway (and I personally don't think that's a bad thing). >> One thing that I just thought of is that even if implementations use >> other encodings, you can in the vast majority of cases do a worst-case >> estimate and easily see that the key that is used is below 64K. >> >> That said, does having a 64K limit really help anyone? In SQLite we >> can easily store vastly more than that, enough that we don't have to >> specify a limit. And my understanding is that in the Microsoft >> implementation, the limits for what they can store without resorting >> to various tricks, is much lower. So since that implementation will >> have to implement special handling of long keys anyway, is there a >> difference between say
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
>> From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow >> Sent: Sunday, February 06, 2011 12:43 PM >> >> On Tue, Dec 14, 2010 at 4:26 PM, Pablo Castro >> wrote: >> >> From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow >> Sent: Tuesday, December 14, 2010 4:23 PM >> >> >> On Wed, Dec 15, 2010 at 12:19 AM, Pablo Castro >> >> wrote: >> >> >> >> From: public-webapps-requ...@w3.org >> >> [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking >> >> Sent: Friday, December 10, 2010 1:42 PM >> >> >> >> >> On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow >> >> >> wrote: >> >> >> > Any more thoughts on this? >> >> >> >> >> >> I don't feel strongly one way or another. Implementation wise I don't >> >> >> really understand why implementations couldn't use keys of unlimited >> >> >> size. I wouldn't imagine implementations would want to use fixed-size >> >> >> allocations for every key anyway, right (which would be a strong >> >> >> reason to keep maximum size down). >> >> I don't have a very strong opinion either. I don't quite agree with the >> >> guideline of "having something working slowly is better than not working >> >> at all"...as having something not work at all sometimes may help >> >> developers hit a wall and think differently about their approach for a >> >> given problem. That said, if folks think this is an instance where we're >> >> better off not having a limit I'm fine with it. >> >> >> >> My only concern is that the developer might not hit this wall, but then >> >> some user (doing things the developer didn't fully anticipate) could hit >> >> that wall. I can definitely see both sides of the argument though. And >> >> elsewhere we've headed more in the direction of forcing the developer to >> >> think about performance, but this case seems a bit more non-deterministic >> >> than any of those. 
>> >> Yeah, that's a good point for this case, avoiding data-dependent errors is >> probably worth the perf hit. >> >> My current thinking is that we should have some relatively large >> limit, maybe on the order of 64k? It seems like it'd be very difficult to >> hit such a limit with any sort of legitimate use case, and the chances of >> some subtle data-dependent error would be much less. But a 1GB key is just >> not going to work well in any implementation (if it doesn't simply oom the >> process!). So despite what I said earlier, I guess I think we should have >> some limit...but keep it an order of magnitude or two larger than what we >> expect any legitimate usage to hit just to keep the system as flexible as >> possible. >> >> Does that sound reasonable to people? I thought we were trying to avoid data-dependent errors and thus shooting for having no limit (which may translate into having very large limits in actual implementations but not the kind of thing you'd typically hit). Specifying an exact size may be a bit weird...I guess an alternative could be to spec what is the minimum size UAs need to support. A related problem is what units this is specified in; if it's bytes, then developers need to make assumptions about how strings are stored. -pablo
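On the units question, the idea raised elsewhere in this thread (measure in UTF-16, which is cheap for JavaScript strings, and use a worst-case estimate for other encodings) can be sketched as follows. The limit value and the function names are hypothetical; 64K is only the figure floated in the discussion, not anything the spec defines.

```javascript
// Hypothetical limit, taken from the 64K figure floated in the thread.
const MAX_KEY_BYTES = 64 * 1024;

// Size of a JavaScript string in UTF-16: two bytes per code unit.
// This needs no conversion, since string length counts UTF-16 code units.
function utf16ByteLength(s) {
  return s.length * 2;
}

// Upper bound on the UTF-8 size without encoding anything: each UTF-16
// code unit contributes at most 3 bytes (a surrogate pair, two code
// units, encodes to 4 bytes in total).
function utf8WorstCaseByteLength(s) {
  return s.length * 3;
}

// Exact UTF-8 size, for comparison; this requires encoding the string.
function utf8ByteLength(s) {
  return new TextEncoder().encode(s).length;
}

// One ASCII character, one BMP character, one character outside the BMP.
const key = "a\u20AC\u{1F600}";
console.log(utf16ByteLength(key), utf8WorstCaseByteLength(key), utf8ByteLength(key));
console.log(utf8WorstCaseByteLength(key) <= MAX_KEY_BYTES);
```

An implementation that stores keys as UTF-8 could accept any key whose worst-case estimate is under its limit and only pay for an exact measurement in the rare borderline case.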
RE: [IndexedDB] Reason for aborting transactions
(back!) From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Wednesday, February 09, 2011 6:47 PM >> On Wed, Feb 9, 2011 at 5:54 PM, Jonas Sicking wrote: >> On Wed, Feb 9, 2011 at 5:43 PM, Jeremy Orlow wrote: >> > On Wed, Feb 9, 2011 at 5:37 PM, ben turner wrote: >> >> >> >> > Normal exceptions have error messages that are not consistent across >> >> > implementations and are not localized. What's the difference? >> >> >> >> These messages aren't part of any exception though, it's just some >> >> property on a transaction object. (None of our DOM exceptions, IDB or >> >> otherwise, have message properties btw, they're only converted to some >> >> message if they make it to the error console). >> >> >> >> > For stuff like internal errors, they seem especially important. >> >> >> >> You're thinking of having multiple messages for the INTERAL_ERROR_ABORT >> >> code? >> > >> > I think that'd be ideal, yes. Since internal errors will be UA specific, >> > string matching wouldn't be so bad there. >> > If no one likes this idea, I'm happy hiding away the message in some >> > webkitAbortMessage attribute so it's super clear it's just us who >> > implements >> > this. (Speaking of which, maybe you guys should do that with getAll.) >> We'll definitely put getAll under a vendor prefix once we drop the >> "front door" prefix on .indexeddb. >> >> I'm with Ben here. I'd prefer to hide the message away under a vendor >> prefix (either now or once you drop the front door one) for now to >> gather feedback on how it'll be used. >> I'm not sure about this...as I was catching up on the thread I understood this more as a debugging helper feature. In the end if we didn't have this you could just have a database-wide error handler and stash errors as they come in a global array or something, and that's okay for diagnostics.
If we want to make it easier to just look at the transaction and see what happened, we may as well let UAs include a descriptive string so you can really find out on the spot. I don't have a strong opinion about excluding (or vendor-prefixing) the property, but it seems it would come in handy. -pablo
[IndexedDB] KeyRange factory methods
I was going to file a bug on this but wanted to make sure I'm not missing something first. All the factory methods for ranges (e.g. bound, lowerBound, etc.) are in the IDBKeyRangeConstructors interface now, but I don't see the interface referenced anywhere. Who implements this interface, the Window object, IDBFactory[Sync], something else? Thanks -pablo
RE: [IndexedDB] Do we need a timeout for VERSION_CHANGE?
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Thursday, December 16, 2010 2:35 AM >>In another thread (in the last couple days) we actually decided to remove >>timeouts from normal transactions since they can be implemented as a >>setTimeout+abort. >> >>But I agree that we need a way to abort setVersion transactions before >>getting the callback (so that we implement timeouts for them as well). >>Unfortunately, I don't immediately have any good ideas on how to do that >>though. Sorry, forgot to qualify it, context == sync api. I assume that the sync versions of the API will truly block, so setTimeout won't do, as code won't just reenter into the timeout callback while blocked on a sync IndexedDB call. Are we all on the same page on that? If that's the case, then I don't think we can remove the timeout parameter from the sync versions of transaction() and setVersion(). Does that sound reasonable? I'll add them for now, we can adjust if somebody comes up with a better approach. As for setVersion in async...that's actually a problem as well now that I think about it, because you don't have access to the (version) transaction object until it has actually been able to start. One option besides having a timeout parameter in the method would be to have an abort() method in IDBVersionChangeRequest. Thanks, -pablo
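For the async case, the "setTimeout+abort" pattern mentioned above might look roughly like the sketch below. It only assumes an object with an `abort()` method; `abortAfter` and the injectable `timers` argument are hypothetical and exist so the helper can be exercised without a real transaction. As noted above, this does not help the sync API (a blocked sync call never yields to the timer callback), nor setVersion before its transaction object is available.

```javascript
// Abort `transaction` if it is still pending after `ms` milliseconds.
// Returns a cancel function the caller should invoke once the transaction
// completes, so that a finished transaction is never aborted late.
// `timers` is injectable purely for illustration and testing (hypothetical).
function abortAfter(
  transaction,
  ms,
  timers = {
    setTimeout: (fn, delay) => setTimeout(fn, delay),
    clearTimeout: (handle) => clearTimeout(handle),
  }
) {
  const handle = timers.setTimeout(() => transaction.abort(), ms);
  return () => timers.clearTimeout(handle);
}

// Stand-in for a transaction object; only abort() is assumed to exist.
const fakeTransaction = {
  aborted: false,
  abort() {
    this.aborted = true;
  },
};

// Typical use: arm the timeout, then cancel it from the completion handler.
const cancel = abortAfter(fakeTransaction, 1000);
cancel(); // simulate the transaction completing before the deadline
console.log(fakeTransaction.aborted);
```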
[IndexedDB] Do we need a timeout for VERSION_CHANGE?
Regular transactions take a timeout parameter when started, which ensures that we eventually make progress one way or the other if there's an uncooperative script that won't let go of an object store or something like that. I'm not sure if we discussed this before, but it seems that we need to add a similar thing for setVersion(), since it's basically a way of starting a transaction. I was thinking we could have an optional timeout argument in setVersion with a UA-specific default. In the async case we would fire the onerror event and in the sync case just throw, both with TIMEOUT_ERR. Thanks -pablo
RE: [Bug 11553] New: Ensure indexedDBSync is on the right worker interface
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Wednesday, December 15, 2010 3:21 AM >> >> I believe the instance of WorkerUtils is much like window in a page. I.e. >> you put stuff on there that you want in the global scope. Thus I'm pretty >> sure that WorkerUtils is the right place for both. Yeah, I read the workers spec too quickly yesterday. You're right, WorkerUtils is what we need, I'll make it implement both IDBEnvironment and IDBEnvironmentSync. Thanks, -pablo
[IndexedDB] versionchange event gone?
Just noticed that the algorithm for updating versions refers to the "versionchange" event but the event is actually not defined in IDBDatabase or IDBDatabaseSync. Just an omission? On a related note, I'm updating the sync API and changing the setVersion method so that it does all the version change notification dance synchronously and returns a transaction object that's the "version change" transaction. Given this behavior we probably don't need anything similar to the "blocked" event for the sync API. Any concerns? Thanks -pablo
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Tuesday, December 14, 2010 4:23 PM >> On Wed, Dec 15, 2010 at 12:19 AM, Pablo Castro >> wrote: >> >> From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] >> On Behalf Of Jonas Sicking >> Sent: Friday, December 10, 2010 1:42 PM >> >> >> On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow wrote: >> >> > Any more thoughts on this? >> >> >> >> I don't feel strongly one way or another. Implementation wise I don't >> >> really understand why implementations couldn't use keys of unlimited >> >> size. I wouldn't imagine implementations would want to use fixed-size >> >> allocations for every key anyway, right (which would be a strong >> >> reason to keep maximum size down). >> I don't have a very strong opinion either. I don't quite agree with the >> guideline of "having something working slowly is better than not working at >> all"...as having something not work at all sometimes may help developers hit >> a wall and think differently about their approach for a given problem. That >> said, if folks think this is an instance where we're better off not having a >> limit I'm fine with it. >> >> My only concern is that the developer might not hit this wall, but then some >> user (doing things the developer didn't fully anticipate) could hit that >> wall. I can definitely see both sides of the argument though. And >> elsewhere we've headed more in the direction of forcing the developer to >> think about performance, but this case seems a bit more non-deterministic >> than any of those. Yeah, that's a good point for this case, avoiding data-dependent errors is probably worth the perf hit. -pc
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Friday, December 10, 2010 1:42 PM >> On Fri, Dec 10, 2010 at 7:32 AM, Jeremy Orlow wrote: >> > Any more thoughts on this? >> >> I don't feel strongly one way or another. Implementation wise I don't >> really understand why implementations couldn't use keys of unlimited >> size. I wouldn't imagine implementations would want to use fixed-size >> allocations for every key anyway, right (which would be a strong >> reason to keep maximum size down). I don't have a very strong opinion either. I don't quite agree with the guideline of "having something working slowly is better than not working at all"...as having something not work at all sometimes may help developers hit a wall and think differently about their approach for a given problem. That said, if folks think this is an instance where we're better off not having a limit I'm fine with it. >> Pablo, do you know why the back ends you were looking at had such >> relatively low limits? Mostly an implementation thing. Keys (and all other non-blob columns) typically need to fit in a page. Predictable perf is also nice (no linked lists, high density/locality, etc.), but not as fundamental as page size. -pablo
RE: [Bug 11375] New: [IndexedDB] Error codes need to be assigned new numbers
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Friday, December 10, 2010 5:03 AM >> I noticed that QUOTA_ERR is commented out. I can't remember when or why and >> the blame history is a bit mangled. Does anyone else? In Chromium we >> currently use UNKNOWN_ERR for whenever we have issues writing stuff to disk. >> We could probably tease quota related issues out into their own error. >> And/or we should probably create or find a good existing error for such uses. It sounds like a good idea to keep QUOTA_ERR separated from other general errors that come up when writing stuff to disk. >> Speaking of which, we use UNKNOWN_ERR for a bunch of other >> internal consistency issues. Is this OK by everyone, should we use another, >> or should we create a new one? (Ideally these issues will be few and far >> between as we make things more robust.) That sounds reasonable to me. >> We also use UNKNOWN_ERR for when things are not yet implemented. Any >> concerns? I don't think it's a big deal, but are we going to have a bunch of unimplemented stuff across browsers? If this becomes common, I wonder if we should have a separate error so calling code can choose to compensate or something. >> What error code should we use for IDBCursor.update/delete when the cursor is >> not currently on an item (or that item has been deleted)? NOT_ALLOWED_ERR? >> TRANSIENT_ERR doesn't seem to be used anywhere in the spec. Should it be >> removed? Sure. >> As for the numbering: does anyone object to me just starting from 1 and >> going sequentially? I.e. does anyone have a problem with them all getting >> new numbers, or should I keep the numbers the same when possible. (i.e. >> only UNKNOWN_ERR, RECOVERABLE_ERR, TRANSIENT_ERR, TIMEOUT_ERR, DEADLOCK_ERR >> would change number, but the ordering of those on the page would change.) I'm fine with that. -pc
RE: [Bug 11398] New: [IndexedDB] Methods that take multiple optional parameters should instead take an options object
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow
Sent: Friday, December 10, 2010 7:27 AM

>> In addition to createObjectStore, I also intend to convert the following
>> over:
>>
>> IDBObjectStore.createIndex
>> IDBObjectStore.openCursor
>> IDBIndex.openCursor
>> IDBIndex.openKeyCursor
>> IDBKeyRange.bound

Sounds great.

>> We did all of these two weeks ago in Chromium and have gotten some feedback.
>> The main downside is that typos are silently ignored by JavaScript. We
>> considered throwing if someone passed in an option we didn't recognize, but
>> this would make it impossible to add more options later (which is one of the
>> main reasons for doing this change). I think what we might do is just log
>> something in the console when this happens. (Should the spec actually make
>> a recommendation to this effect?) Besides that, I think overall we're happy
>> with the change.

I'm not sure what the problem is with throwing. Can't each implementation throw if it receives a parameter that has no meaning for it? Given that we can't know if future options will have substantial impact on the behavior of the function when they are present, it looks safer to go that route. Is there prior art in some other webapps API that takes JavaScript objects as parameters? What do they do?

Thanks
-pablo
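To make the two positions in this message concrete, here is a minimal TypeScript sketch of an options check that can either throw or merely report unrecognized properties. It is only an illustration: the option list, the `checkOptions` helper and the policy names are assumptions made for this sketch, not anything defined by the spec or by Chromium.

```typescript
// Hypothetical list of recognized options; not taken from the spec.
const KNOWN_OPTIONS: readonly string[] = ["keyPath", "autoIncrement"];

type UnknownOptionPolicy = "throw" | "warn";

// Returns warning messages for unrecognized options (the "log something"
// approach), or throws (the approach suggested in the reply).
function checkOptions(
  options: { [name: string]: unknown },
  policy: UnknownOptionPolicy,
): string[] {
  const unknown = Object.keys(options).filter(
    (name) => KNOWN_OPTIONS.indexOf(name) === -1,
  );
  if (unknown.length === 0) {
    return [];
  }
  const message = "Unrecognized option(s): " + unknown.join(", ");
  if (policy === "throw") {
    throw new TypeError(message);
  }
  return [message];
}

// A typo such as "autoIncremnt" is reported instead of silently ignored.
const optionWarnings = checkOptions({ keyPath: "id", autoIncremnt: true }, "warn");
```

With the "warn" policy the caller decides where the messages go (for example, the console); with "throw" the typo surfaces immediately, at the cost Jeremy describes when a page passes an option that an older implementation does not know yet.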
RE: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of bugzi...@jessica.w3.org
Sent: Friday, November 19, 2010 4:16 AM

>> Just looking at this list, I guess I'm leaning towards _not_ limiting the
>> maximum key size and instead pushing it onto implementations to do the hard
>> work here. If so, we should probably have some normative text about how
>> bigger keys will probably not be handled very efficiently.

I was trying to make up my mind on this, and I'm not sure this is a good idea. What would be the options for an implementation? Hashing keys into smaller values is pretty painful because of sorting requirements (we'd have to index the data twice, once for the key prefix that fits within limits, and a second one for a hash plus some sort of discriminator for collisions). Just storing a prefix as part of the key under the covers obviously won't fly...am I missing some other option?

Clearly consistency in these things is important so people don't get caught off guard. I wonder if we just pick a "reasonable" limit, say 1 K characters (yeah, trying to do something weird to avoid details of how stuff is actually stored), and run with it. I looked around at a few databases (from a single vendor :)), and they seem to all be well over this but not by orders of magnitude (2KB to 8KB seems to be the range of upper limits for this in practice).

Thanks
-pablo
RE: [Bug 11270] New: Interaction between in-line keys and key generators
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, November 10, 2010 2:08 PM >> On Wed, Nov 10, 2010 at 1:50 PM, Tab Atkins Jr. wrote: >> > On Wed, Nov 10, 2010 at 1:43 PM, Pablo Castro >> > wrote: >> >> >> >> From: public-webapps-requ...@w3.org >> >> [mailto:public-webapps-requ...@w3.org] On Behalf Of >> >> bugzi...@jessica.w3.org >> >> Sent: Monday, November 08, 2010 5:07 PM >> >> >> >> I'm fine with either solution here. My database experience is too weak >> to have strong opinions on this matter. >> >> What do databases usually do with columns that use autoincrement but a >> value is still supplied? My recollection is that that is generally >> allowed? It does happen in practice that sometimes you need to use explicit keys. The typical case is when you're initializing a database with base data and you want to have known keys. As for what databases do, I'll use SQL Server as an example (for no particular reason :) ). In SQL Server by default if you try to insert a row with a value in an "identity" column you get an error and the operation is aborted; however, developers can issue a command (SET IDENTITY_INSERT ON) to turn it off temporarily and insert rows with an explicitly provided primary key. Usually when you do this you have to be careful to use keys that are either way out of the range of keys the generator will use (or you may not be able to insert keys anymore) or you have to reset the next key (using an obscure DBCC CHECKIDENT (, RESEED, ) command). I don't know much about Oracle, but I believe the typical pattern is still to use a sequence object and set the default value for the key column to < sequence>.nextval, thus allowing callers to override the next value in the sequence by just providing one, and if necessary they may need to go and fix up the sequence. 
From writing the above paragraph I'm realizing one more detail we need to be explicit about: the fact that you do an add() with an explicit key does not mean the implementation will fix up the next key it'll assign. You'll still get the value that comes after the one generated last, and if you inserted that value in the store explicitly you just made the store unable to add new objects with generated keys until you delete it. If that's too much fine-print then we should just disallow it. I like the ability to set explicit key values, but it does come with some extra care that both implementers and users will have to have. -pablo
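The fine print described here can be sketched with a small in-memory model. This is a hedged illustration only: `GeneratedKeyStore`, its `nextKey` counter and the error text are hypothetical and stand in for whatever an implementation would actually do; the one behavior taken from the message is that an explicit key does not move the generator forward.

```typescript
// Toy object store with a key generator that is NOT fixed up
// when a caller supplies an explicit key.
class GeneratedKeyStore<V> {
  private records: { [key: number]: V } = {};
  private nextKey = 1; // hypothetical generator state

  // add() has no-overwrite semantics: an existing key is an error.
  add(value: V, explicitKey?: number): number {
    const key = explicitKey === undefined ? this.nextKey : explicitKey;
    if (key in this.records) {
      throw new Error("CONSTRAINT_ERR: key " + key + " already exists");
    }
    this.records[key] = value;
    if (explicitKey === undefined) {
      this.nextKey += 1; // only generated keys advance the generator
    }
    return key;
  }

  delete(key: number): void {
    delete this.records[key];
  }
}

const keyStore = new GeneratedKeyStore<string>();
const firstKey = keyStore.add("a"); // generated key
const secondKey = keyStore.add("b", 2); // explicit key; generator still points at 2
let generatorBlocked = false;
try {
  keyStore.add("c"); // generator hands out 2, which is taken
} catch (e) {
  generatorBlocked = true;
}
keyStore.delete(2);
const thirdKey = keyStore.add("c"); // works again once the record is deleted
```

In this model the store stays blocked for generated keys until the explicitly inserted record is removed, which is the situation the message suggests either documenting or disallowing.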
RE: [Bug 11270] New: Interaction between in-line keys and key generators
From: Tab Atkins Jr. [mailto:jackalm...@gmail.com] Sent: Wednesday, November 10, 2010 1:50 PM >> On Wed, Nov 10, 2010 at 1:43 PM, Pablo Castro >> wrote: >> > >> > From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] >> > On Behalf Of bugzi...@jessica.w3.org >> > Sent: Monday, November 08, 2010 5:07 PM >> > >> >>> So what happens if trying save in an object store which has the following >> >>> keypath, the following value. (The generated key is 4): >> >>> >> >>> "foo.bar" >> >>> { foo: {} } >> >>> >> >>> Here the resulting object is clearly { foo: { bar: 4 } } >> >>> >> >>> But what about >> >>> >> >>> "foo.bar" >> >>> { foo: { bar: 10 } } >> >>> >> >>> Does this use the value 10 rather than generate a new key, does it throw >> >>> an >> >>> exception or does it store the value { foo: { bar: 4 } }? >> > >> > I suspect that all options are somewhat arbitrary here. I'll just propose >> > that we error out to ensure that nobody has the wrong expectations about >> > the implementation preserving the initial value. I would be open to other >> > options except silently overwriting the initial value with a generated >> > one, as that's likely to confuse folks. >> >> It's relatively common for me to need to supply a manual value for an >> id field that's automatically generated when working with databases, >> and I don't see any particular reason that my situation would change >> if using IndexedDB. So I think that a manually-supplied key should be >> kept. That would be okay with me. One bit of fine-print on this one is that if you're calling store.add() with an explicit key then you may get a unique constraint error (which would never happen with a generator if you never provided your own keys). Also, did we settle for having put() never adding a new record if one didn't exist? 
If put() can create a record, then things still work but become a bit more elaborate in that put() would create a new record either if the key is not present in the object or if it's present but the value doesn't exist in the database, while it would update a record if the value was present and it existed as a key in the store. -pablo
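The create-or-update behavior described for put() can be sketched as follows. This assumes the variant in which put() is allowed to create records; `InlineKeyStore`, the `Item` shape and the `id` key path are made up for the illustration, and the sketch deliberately leaves the generator fix-up question from the earlier message aside.

```typescript
interface Item {
  id?: number; // in-line key, optional because the generator may supply it
  name: string;
}

class InlineKeyStore {
  private records: { [key: number]: Item } = {};
  private nextKey = 1; // hypothetical generator state

  // Creates a record when the key is absent from the object or unknown
  // to the store; updates it when the key already exists in the store.
  put(item: Item): "created" | "updated" {
    const key = item.id === undefined ? this.nextKey++ : item.id;
    const existed = key in this.records;
    this.records[key] = { id: key, name: item.name };
    return existed ? "updated" : "created";
  }

  get(id: number): Item | undefined {
    return this.records[id];
  }
}

const inlineStore = new InlineKeyStore();
const putNoKey = inlineStore.put({ name: "no key yet" }); // key generated
const putNewKey = inlineStore.put({ id: 7, name: "explicit" }); // key not in store
const putExisting = inlineStore.put({ id: 7, name: "renamed" }); // key in store
```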
RE: [Bug 11270] New: Interaction between in-line keys and key generators
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of bugzi...@jessica.w3.org Sent: Monday, November 08, 2010 5:07 PM >> So what happens if trying save in an object store which has the following >> keypath, the following value. (The generated key is 4): >> >> "foo.bar" >> { foo: {} } >> >> Here the resulting object is clearly { foo: { bar: 4 } } >> >> But what about >> >> "foo.bar" >> { foo: { bar: 10 } } >> >> Does this use the value 10 rather than generate a new key, does it throw an >> exception or does it store the value { foo: { bar: 4 } }? I suspect that all options are somewhat arbitrary here. I'll just propose that we error out to ensure that nobody has the wrong expectations about the implementation preserving the initial value. I would be open to other options except silently overwriting the initial value with a generated one, as that's likely to confuse folks. >> What happens if the property is missing several parents, such as >> >> "foo.bar.baz" >> { zip: {} } >> >> Does this throw or does it store { zip: {}, foo: { bar: { baz: 4 } } } We should just complete the object with all the missing parents. >> If we end up allowing array indexes in key paths (like "foo[1].bar") what >> does >> the following keypath/object result in? I think we can live without array indexing in keys for this round, it's probably best to just leave them out and only allow paths. -pablo
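The two proposals in this reply (error out when the key path already holds a value, and fill in any missing parents) can be sketched like this. The helper name `assignGeneratedKey` and the error messages are assumptions for the illustration; array indexes in key paths are left out, as suggested above.

```typescript
type PlainObject = { [name: string]: unknown };

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Writes a generated key at a dotted key path such as "foo.bar.baz",
// creating missing parent objects along the way.
function assignGeneratedKey(target: PlainObject, keyPath: string, key: number): void {
  const parents = keyPath.split(".");
  const last = parents.pop();
  if (last === undefined || last === "") {
    throw new TypeError("empty key path");
  }
  let current: PlainObject = target;
  for (const name of parents) {
    const next = current[name];
    if (next === undefined) {
      const created: PlainObject = {};
      current[name] = created; // complete the object with the missing parent
      current = created;
    } else if (isPlainObject(next)) {
      current = next;
    } else {
      throw new TypeError("key path segment '" + name + "' is not an object");
    }
  }
  if (current[last] !== undefined) {
    // The "error out" proposal: never silently overwrite a supplied value.
    throw new Error("value already present at key path '" + keyPath + "'");
  }
  current[last] = key;
}

const missingParents = { zip: {} };
assignGeneratedKey(missingParents, "foo.bar.baz", 4);
```

Tab's alternative elsewhere in the thread (keep the manually supplied value) would replace the final throw with an early return.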
RE: IndexedDB TPAC agenda
To hit the ground running on this, here is a consolidated list of issues coming both from the thread below and various pending bugs/discussions we've had. I picked an arbitrary order and grouping, feel free to tweak in any way.

- keys (arrays as keys, compound keys, general keypath restrictions)
- index keys (arrays as keys, empty values, general keypath restrictions)
- internationalization (collation specification, collation algorithm)
- quotas (how do apps request more storage, is there a temp/permanent distinction?)
- error handling (propagation, relationship to window.error, db scoped event handlers, errors vs return empty values)
- blobs (be explicit about behavior of blobs in indexeddb objects)
- transactions error modes (abort-on-unwind in error conditions; what happens when user leaves the page with pending transactions?)
- transactions isolation/concurrent aspects
- transactions scopes (dynamic support)
- synchronous api

Thanks
-pablo

-Original Message-
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Pablo Castro
Sent: Monday, November 01, 2010 10:39 PM
To: Jeremy Orlow; Jonas Sicking
Cc: public-webapps@w3.org
Subject: RE: IndexedDB TPAC agenda

A few other items to add to the list to discuss tomorrow:

- Blobs support: have we discussed explicitly how things work when an object has a blob (file, array, etc.) as one of its properties?
- Close on collation and international support
- How do applications request that they need more storage? And related to this, at some point we discussed temporary vs permanent stores. Close on the whole story of how space is managed.
- Database-wide exception handlers

Looking forward to the discussion tomorrow.
-pablo From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Monday, November 01, 2010 1:34 PM To: Jonas Sicking Cc: public-webapps@w3.org Subject: Re: IndexedDB TPAC agenda On Mon, Nov 1, 2010 at 12:23 PM, Jonas Sicking wrote: On Mon, Nov 1, 2010 at 5:13 AM, Jeremy Orlow wrote: > On Mon, Nov 1, 2010 at 11:53 AM, Jonas Sicking wrote: >> >> On Mon, Nov 1, 2010 at 4:40 AM, Jeremy Orlow wrote: >> > What items should we try to cover during the f2f? >> > On Mon, Nov 1, 2010 at 11:08 AM, Jonas Sicking wrote: >> >> >> >> > P.S. I'm happy to discuss all of this f2f tomorrow rather than over >> >> > email >> >> > now. >> >> >> >> Speaking of which, would be great to have an agenda. Some of the >> >> bigger items are: >> >> >> >> * Dynamic transactions >> >> * Arrays-as-keys >> >> * Arrays and indexes (what to do if the keyPath for an index evaluates >> >> to an array) >> >> * Synchronous API >> > >> > * Compound keys. >> > * What should be allowed in a keyPath. >> >> Aren't "compound keys" same as "arrays-as-keys"? > > Sorry, I meant to say compound indexes. > We've talked about using indexes in many different ways--including compound > indexes and allowing keys to include indexes. I assumed you meant the > latter? I'm lost as to what you're saying here. Could you elaborate? Are you saying "index" when you mean "array" anywhere? oops. Yes, I meant to say: "We've talked about using arrays in many different ways--including compound indexes and allowing keys to include arrays. I assumed you meant the latter?" >> * What should happen if an index's keyPath points to a property which >> doesn't exist or which isn't a valid key-value? (same general topic as >> "arrays and indexes" above) > > We've talked about this several times. It'd be great to settle on something > once and for all. Agreed. >> * What happens if the user leaves a page in the middle of a >> transaction? 
(this might be nice to tackle since there'll be lots of >> relevant people in the room) > > I'm pretty sure this is simple: if there's an onsuccess/onerror handler that > has not yet fired (or we're in the middle of firing), then you abort the > transaction. If not, the behavior is undefined (because there's no way the > app could have observed the difference anyway). The aborting behavior is > necessary since the user could have planned to execute additional commands > atomically in the handler. There is also the option to let the transaction finish. They should be short-lived so it shouldn't be too bad. I.e. keep the page alive for a bit longer in the background or something that blocks page unload? Is there precedent for this elsewhere? This sounds pretty complicated to get right both in terms of implementation and speccing. Let's chat about it though. >> * Error handling > > What do you mean by this? How to handle exceptions in various places. Where (error) events propagate. How does it relate to window.onerror. What happens if you do/don't call preventDefault on the error event? Sounds good.
RE: Seeking agenda items for WebApps' Nov 1-2 f2f meeting
Are these slots more or less frozen at this point? Just wanted to confirm to make travel arrangements. Thanks -pablo -Original Message- From: Arthur Barstow [mailto:art.bars...@nokia.com] Sent: Wednesday, September 29, 2010 5:41 AM To: ext Eric Uhrhane; Jonas Sicking; Jeremy Orlow; Pablo Castro; public-webapps; Arun Ranganathan Subject: Re: Seeking agenda items for WebApps' Nov 1-2 f2f meeting I added the following slots for November 2: [[ http://www.w3.org/2008/webapps/wiki/TPAC2010#Tuesday.2C_November_2 13:30-15:00: Indexed DB 15:30-16:30: Indexed DB 16:30-18:00: File * APIs ]] Of course we can fine-tune the times as needed. Arun - we reserved a speaker phone for remote participants for both days. -Art Barstow On 9/28/10 5:45 PM, ext Eric Uhrhane wrote: > Works fine for me. I'll be there all of Monday and Tuesday. Due to > jetlag morning vs. afternoon's probably irrelevant to me, as I won't > have any idea what time it is ;'>. > > On Tue, Sep 28, 2010 at 2:30 PM, Jonas Sicking wrote: >> The later the better for me. If we can make it after noon I'll be >> there for sure. >> >> / Jonas >> >> On Tue, Sep 28, 2010 at 1:37 PM, Jeremy Orlow wrote: >>> I'm OK with any time slot. >>> >>> On Tue, Sep 28, 2010 at 8:57 PM, Arthur Barstow >>> wrote: >>>> Hi All, >>>> >>>> Currently, no one has requested a specific day + time slot for any of the >>>> proposed topics: >>>> >>>> http://www.w3.org/2008/webapps/wiki/TPAC2010 >>>> >>>> When our IndexedDB participants agree on a time slot on Tuesday the 2nd, >>>> I'll add it to the agenda. Pablo, Jonas, Jeremy - please propose a time. >>>> >>>> Day + time slot proposals for the agenda topics already proposed are also >>>> welcome (as are proposals for additional topics). >>>> >>>> -Art Barstow >>>> >>>> On 9/28/10 3:28 PM, ext Pablo Castro wrote: >>>>> It looks like there will be good critical mass for IndexedDB discussions, >>>>> so I'll try to make it as well. 
Tuesday would be best for me as well for >>>>> an >>>>> IndexedDB meeting so I can travel on Sunday/Monday. >>>>> >>>>> -pablo >>>>> >>>>> -Original Message- >>>>> From: Jonas Sicking [mailto:jo...@sicking.cc] >>>>> Sent: Tuesday, September 28, 2010 10:53 AM >>>>> To: Jeremy Orlow >>>>> Cc: Pablo Castro; art.bars...@nokia.com; public-webapps >>>>> Subject: Re: Seeking agenda items for WebApps' Nov 1-2 f2f meeting >>>>> >>>>> I'm not 100% sure that I'll make TPAC this year, but if I do, I likely >>>>> won't make monday. So a tuesday schedule would fit me better too. >>>>> >>>>> / Jonas >>>>> >>>>> On Tue, Sep 28, 2010 at 8:36 AM, Jeremy Orlowwrote: >>>>>> Is it possible to schedule IndexedDB for Tuesday? I'm pretty sure that >>>>>> I >>>>>> can be there then, but Monday is more up in the air at this moment. >>>>>> Thanks! >>>>>> Jeremy >>>>>> On Thu, Sep 2, 2010 at 3:28 AM, Jonas Sickingwrote: >>>>>>> I'm hoping to be there yes. Especially if we'll get a critical mass of >>>>>>> IndexedDB contributors. >>>>>>> >>>>>>> / Jonas >>>>>>> >>>>>>> On Wed, Sep 1, 2010 at 7:18 PM, Pablo >>>>>>> Castro >>>>>>> wrote: >>>>>>>> -Original Message- >>>>>>>> From: public-webapps-requ...@w3.org >>>>>>>> [mailto:public-webapps-requ...@w3.org] On Behalf Of Arthur Barstow >>>>>>>> Sent: Tuesday, August 31, 2010 4:32 AM >>>>>>>> >>>>>>>>>> The WebApps WG will meet face-to-face November 1-2 as part of the >>>>>>>>>> W3C's >>>>>>>>>> 2010 TPAC meeting week [TPAC]. >>>>>>>>>> >>>>>>>>>> I created a stub agenda item page and seek input to flesh out >>>>>>>>>> agenda: >>>>>>>>>> >>>>>>>>>> http://www.w3.org/2008/webapps/wiki/TPAC2010 >>>>>>>>>> >>>>>>>>>> [TPAC] includes a link to the Registration page, a detailed schedule >>>>>>>>>> of >>>>>>>>>> the group meetings, and other useful information. >>>>>>>>>> >>>>>>>>>> The registration fee is 40€ per day and will increase to 120€ per >>>>>>>>>> day >>>>>>>>>> after October 22. 
>>>>>>>>>> >>>>>>>>>> -Art Barstow >>>>>>>>>> >>>>>>>>>> [TPAC] http://www.w3.org/2010/11/TPAC/ >>>>>>>> For folks working on IndexedDB, are you guys planning on attending the >>>>>>>> TPAC? Given the timing of the event it may be a great opportunity to >>>>>>>> get >>>>>>>> together and iron out a whole bunch of issues at once. It would be >>>>>>>> good to >>>>>>>> know ahead of time so we can all make plans if we have critical mass. >>>>>>>> >>>>>>>> Thanks >>>>>>>> -pablo >>>>>>>> >>>>>>>> >>> >>
Re: [IndexedDB] Explicitly establishing the timing of clone creation
On Mon, Aug 16, 2010 at 12:11 AM, Jonas Sicking wrote: >> >> > On Fri, Aug 13, 2010 at 1:43 PM, Pablo Castro >> > wrote: >> > > The spec for the asynchronous "put" and "add" methods in object store as >> > well as "update" in cursors don't explicitly state when clones are created, >> > and can even be read as if clones should be created after the function call >> > returned, when the queued up task is executed. This leads to problems where >> > the clone may be modified after the call to put/add/update happens. >> > Wouldn't >> > it be more reasonable to require implementations to always create a clone >> > of >> > the object before returning (i.e. synchronously) and perform the rest of >> > the >> > operation asynchronously? >> > >> > Yes. >> > >> > > If we agree on this I'll file a bug and later follow up with some text >> > for the spec. >> > >> > Please do. >> > >> >> Agreed. Closing the loop on this one. Proposed text is below, any feedback is welcome. I also updated the bug with it. http://www.w3.org/Bugs/Public/show_bug.cgi?id=10381 Thanks -pablo Proposed text changes for this: In section "3.2.5 Object Store", the description for the "add" method says: This method returns immediately and stores the given value in this object store by following the steps for storing a record into an object store with the no-overwrite flag set. If the record can be successfully stored in the object store, then a success event is fired on this method's returned object using the IDBTransactionEvent interface with its result set to the key for the stored record and transaction set to the transaction in which this object store is opened. 
If a record exists in this object store for the key key parameter, then an error event is fired on this method's returned object with its code set to CONSTRAINT_ERR We should change it to: This method stores the given value in this object store by first synchronously creating a copy of the value following steps 1 through 4 of the algorithm described in "4.2 Object Store Storage steps", then returning immediately and asynchronously performing the remaining steps for the algorithm that actually store the object in the object store, with the no-overwrite flag set. If the record can be successfully stored in the object store, then a success event is fired on this method's returned object using the IDBTransactionEvent interface with its result set to the key for the stored record and transaction set to the transaction in which this object store is opened. If a record exists in this object store for the key key parameter, then an error event is fired on this method's returned object with its code set to CONSTRAINT_ERR. In section "3.2.5 Object Store", the description for the "put" method says: This method returns immediately and stores the given value in this object store by following the steps for storing a record into an object store. If the record can be successfully stored in the object store, then a success event is fired on this method's returned object using the IDBTransactionEvent interface with its result set to the key for the stored record and transaction set to the transaction in which this object store is opened. We should change it to: This method stores the given value in this object store by first synchronously creating a copy of the value following steps 1 through 4 of the algorithm described in "4.2 Object Store Storage steps", then returning immediately and asynchronously performing the remaining steps for the algorithm that actually store the object in the object store. 
If the record can be successfully stored in the object store, then a success event is fired on this method's returned object using the IDBTransactionEvent interface with its result set to the key for the stored record and transaction set to the transaction in which this object store is opened. In section "3.2.7 Cursor" the description of the "update" method says: This method returns immediately and sets the value for the record at the cursor's position. We should change it to: This method sets the value for the record at the cursor's position by first synchronously creating a copy of the value using the structured clone algorithm, then returning immediately and asynchronously updating the record in the underlying store.
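The problem the proposed text addresses can be shown with a small model: the copy is taken synchronously inside put(), while the actual store operation runs later. This is only a sketch under stated assumptions: `CloningStore` and `runPendingTasks` are hypothetical names, the explicit task queue stands in for the browser's asynchronous machinery, and a JSON round-trip stands in for the structured clone algorithm (it handles far fewer types).

```typescript
type Task = () => void;

class CloningStore {
  private pending: Task[] = [];
  private records: { [key: string]: unknown } = {};

  put(key: string, value: unknown): void {
    // Copy synchronously, before put() returns.
    // JSON round-tripping is only a stand-in for structured clone.
    const snapshot: unknown = JSON.parse(JSON.stringify(value));
    // The rest of the operation is queued and runs later.
    this.pending.push(() => {
      this.records[key] = snapshot;
    });
  }

  // Stand-in for the event loop picking up the queued tasks.
  runPendingTasks(): void {
    for (const task of this.pending.splice(0)) {
      task();
    }
  }

  get(key: string): unknown {
    return this.records[key];
  }
}

const cloningStore = new CloningStore();
const draft = { count: 1 };
cloningStore.put("a", draft);
draft.count = 2; // mutation after put() returned, before the task ran
cloningStore.runPendingTasks();
const storedDraft = cloningStore.get("a"); // reflects the state at put() time
```

If the copy were taken inside the queued task instead, the stored record would contain the later mutation, which is the ambiguity the message asks the spec to rule out.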
RE: [IndexedDB] Languages for collation
From: Jungshik Shin (신정식, 申政湜) [mailto:jungs...@google.com] Sent: Tuesday, August 24, 2010 10:34 PM >> As for the locale identifiers, my understanding is that Windows APIs (newer >> 'name-based' locale APIs) more or less follows BCP 47. >> Picking this back up from this August thread. I went around and asked Windows folks about this. Locale identifiers based on BCP 47 sound good. On the other hand, we probably wouldn't do UCA. I heard various worries from folks that work in this space, including the fact that it seems it's still changing so it would be a moving target (which btw means that collisions could still happen) and that we don't support it in a number of places today. Given that feedback, I would rather leave this open and let implementations choose the algorithm for collation (still need to do language-sensitive collation, of course). Would that work? Thanks -pablo
RE: [IndexedDB] IDBCursor.update for cursors returned from IDBIndex.openCursor
I agree with Jonas on this. I think accessing the index values is an important feature (in addition to joins you can imagine adding an extra property or two to the index key to create a covering index and avoid fetching the object in a perf-critical path). That said, to me it's just about allowing retrieval. For update/delete it would be perfectly reasonable to have to go to the store in my opinion. -pablo -Original Message- From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Friday, September 17, 2010 3:15 PM On Fri, Sep 17, 2010 at 2:46 AM, Jeremy Orlow wrote: > On Fri, Sep 17, 2010 at 1:06 AM, Jonas Sicking wrote: >> >> On Thu, Sep 16, 2010 at 2:23 PM, Jeremy Orlow wrote: >> > On Thu, Sep 16, 2010 at 8:53 PM, Jonas Sicking wrote: >> >> >> >> On Thu, Sep 16, 2010 at 2:15 AM, Jeremy Orlow >> >> wrote: >> >> > Wait a sec. What are the use cases for non-object cursors anyway? >> >> > They >> >> > made perfect sense back when we allowed explicit index management, >> >> > but >> >> > now >> >> > they kind of seem like a premature optimization or possibly even dead >> >> > weight. Maybe we should just remove them altogether? >> >> >> >> They are still useful for joins. Consider an objectStore "employees": >> >> >> >> { id: 1, name: "Sven", employed: "1-1-2010" } >> >> { id: 2, name: "Bert", employed: "5-1-2009" } >> >> { id: 3, name: "Adam", employed: "6-6-2008" } >> >> And objectStore "sales" >> >> >> >> { seller: 1, candyName: "lollipop", quantity: 5, date: "9-15-2010" } >> >> { seller: 1, candyName: "swedish fish", quantity: 12, date: "9-15-2010" >> >> } >> >> { seller: 2, candyName: "jelly belly", quantity: 3, date: "9-14-2010" } >> >> { seller: 3, candyName: "heath bar", quantity: 3, date: "9-13-2010" } >> >> If you want to display the amount of sales per person, sorted by names >> >> of sales person, you could do this by first creating an index for >> >> "employees" with keyPath "name".
You'd then use IDBIndex.openCursor to >> >> iterate that index, and for each entry find all entries in the "sales" >> >> objectStore where "seller" matches the cursor's .value. >> >> >> >> So in this case you don't actually need any data from the "employees" >> >> objectStore, all the data is available in the index. Thus it is >> >> sufficient, and faster, to use openCursor than openObjectCursor. >> >> >> >> In general, it's a common optimization to stick enough data in an >> >> index that you don't have to actually look up in the objectStore >> >> itself. This is slightly less commonly doable since we have relatively >> >> simple indexes so far. But still doable as the example above shows. >> >> Once we add support for arrays as keys this will be much more common >> >> as you can then stick arbitrary data into the index by simply adding >> >> additional entries to all key arrays. And even more so once we >> >> (probably in a future version) add support for computed indexes. >> > >> > >> > On Thu, Sep 16, 2010 at 8:57 PM, Jonas Sicking wrote: >> >> >> >> On Thu, Sep 16, 2010 at 4:08 AM, Jeremy Orlow >> >> wrote: >> >> > Actually, for that matter, are remove and update needed at all? I >> >> > think >> >> > they may just be more cruft left over from the explicit index days. >> >> > As >> >> > far >> >> > as I can tell, any .delete or .remove should be doable via an >> >> > objectCursor + >> >> > .puts/.removes on the objectStore. >> >> >> >> They are not strictly needed, but they are a decent convenience >> >> feature, and with a proper implementation they can even be a >> >> performance optimization. With a cursor iterating a b-tree you can let >> >> the cursor keep a pointer to the b-tree entry. That way .delete and >> >> .update don't have to do a b-tree lookup at all. >> >> >> >> We're currently not able to do this since our backend (sqlite) doesn't >> >> have good enough cursor support, but I suspect that this will change >> >> at some point in the future.
In the meantime it seems like a good >> >> thing to allow people to use API that will be faster in the future. >> > >> > All your arguments revolve around what the spec >> > and implementations might do >> > in the future. >> >> I disagree. The IDBIndex.openCursor example I included uses only >> existing API, and is a performance improvement in at least our current >> implementation. Would be interested to hear if it's not a performance >> improvement in others. > > It's not in ours because we join to the ObjectStore's data table either way. > But that's not at all why I'm bringing this up. Why? >> > Typically we add API surface area only for use cases that >> > are currently impossible to satisfy or proven performance bottlenecks. I >> > agree that it's likely implementations will want to do optimizations >> > like >> > this in the future, but until they do, it'll be hard to really >> > understand >> > the implications and complications that might arise. >> >> That's not entire
RE: [IndexedDB] setVersion with multiple IDBDatabase objects
-Original Message- From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of ben turner Sent: Tuesday, September 28, 2010 8:19 AM >> Yes, let's have it tied to the instance on which setVersion() was called. >> >> As Shawn pointed out that is consistent with the behavior that >> database instances from different windows will observe. As Jeremy >> pointed out that is consistent with the way object stores and indexes >> are tied to a transaction instance. Also, the |event.source| will be >> db1 in the given example, so it seems natural to allow changes only to >> the database we pass in the event and no other. >> >> -Ben +1, let's tie it to the instance and make it consistent with stores/indexes. -pablo
RE: Seeking agenda items for WebApps' Nov 1-2 f2f meeting
It looks like there will be good critical mass for IndexedDB discussions, so I'll try to make it as well. Tuesday would be best for me as well for an IndexedDB meeting so I can travel on Sunday/Monday. -pablo -Original Message- From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Tuesday, September 28, 2010 10:53 AM To: Jeremy Orlow Cc: Pablo Castro; art.bars...@nokia.com; public-webapps Subject: Re: Seeking agenda items for WebApps' Nov 1-2 f2f meeting I'm not 100% sure that I'll make TPAC this year, but if I do, I likely won't make monday. So a tuesday schedule would fit me better too. / Jonas On Tue, Sep 28, 2010 at 8:36 AM, Jeremy Orlow wrote: > Is it possible to schedule IndexedDB for Tuesday? I'm pretty sure that I > can be there then, but Monday is more up in the air at this moment. > Thanks! > Jeremy > On Thu, Sep 2, 2010 at 3:28 AM, Jonas Sicking wrote: >> >> I'm hoping to be there yes. Especially if we'll get a critical mass of >> IndexedDB contributors. >> >> / Jonas >> >> On Wed, Sep 1, 2010 at 7:18 PM, Pablo Castro >> wrote: >> > >> > -Original Message- >> > From: public-webapps-requ...@w3.org >> > [mailto:public-webapps-requ...@w3.org] On Behalf Of Arthur Barstow >> > Sent: Tuesday, August 31, 2010 4:32 AM >> > >> >>> The WebApps WG will meet face-to-face November 1-2 as part of the >> >>> W3C's >> >>> 2010 TPAC meeting week [TPAC]. >> >>> >> >>> I created a stub agenda item page and seek input to flesh out agenda: >> >>> >> >>> http://www.w3.org/2008/webapps/wiki/TPAC2010 >> >>> >> >>> [TPAC] includes a link to the Registration page, a detailed schedule >> >>> of >> >>> the group meetings, and other useful information. >> >>> >> >>> The registration fee is 40€ per day and will increase to 120€ per day >> >>> after October 22. >> >>> >> >>> -Art Barstow >> >>> >> >>> [TPAC] http://www.w3.org/2010/11/TPAC/ >> > >> > For folks working on IndexedDB, are you guys planning on attending the >> > TPAC? 
Given the timing of the event it may be a great opportunity to get >> > together and iron out a whole bunch of issues at once. It would be good to >> > know ahead of time so we can all make plans if we have critical mass. >> > >> > Thanks >> > -pablo >> > >> > >> > >
RE: Seeking agenda items for WebApps' Nov 1-2 f2f meeting
-Original Message- From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Arthur Barstow Sent: Tuesday, August 31, 2010 4:32 AM >> The WebApps WG will meet face-to-face November 1-2 as part of the W3C's >> 2010 TPAC meeting week [TPAC]. >> >> I created a stub agenda item page and seek input to flesh out agenda: >> >> http://www.w3.org/2008/webapps/wiki/TPAC2010 >> >> [TPAC] includes a link to the Registration page, a detailed schedule of >> the group meetings, and other useful information. >> >> The registration fee is 40€ per day and will increase to 120€ per day >> after October 22. >> >> -Art Barstow >> >> [TPAC] http://www.w3.org/2010/11/TPAC/ For folks working on IndexedDB, are you guys planning on attending the TPAC? Given the timing of the event it may be a great opportunity to get together and iron out a whole bunch of issues at once. It would be good to know ahead of time so we can all make plans if we have critical mass. Thanks -pablo
RE: [IndexedDB] Let's remove IDBDatabase.objectStore()
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Tuesday, August 24, 2010 12:40 AM >> On Tue, Aug 24, 2010 at 12:43 AM, ben turner wrote: >> Hi folks, >> >> We originally included IDBDatabase.objectStore() as a convenience >> function because we figured that everyone would hate typing >> |myDatabase.transaction('myObjectStore').objectStore('myObjectStore')|. >> Unfortunately I think we should remove it - too many developers have >> used the function without realizing that the returned object was tied >> to a particular transaction. Any objections? >> >> It does seem like it could be confusing and it doesn't seem to save all that >> many characters. So I'm fine with it. +1
[IndexedDB] Avoiding reader/writer starvation
In the context of transactions, readers using READ_ONLY and writers using READ_WRITE may block each other when starting transactions, at least for cases where the underlying implementation uses locking for isolation. Since we allow multiple readers and they can start while other readers are already running, it's possible that readers end up starving writers in a concurrent setting. It seems it would be a good idea to add some minimum guarantees to the spec that ensure some amount of fairness to concurrent activities against a given database. We could either include a loose recommendation or try to mandate a strict behavior. It seems the loose recommendation is more practical; the questions are a) is there a risk of incompatible behavior because of under-specification, and b) will we risk that some implementations will just ignore this aspect if it's specified too informally. The loose recommendation could just be a sentence in the transactions section: "UAs need to ensure a reasonable level of fairness across readers and writers to prevent starvation." If we wanted to be more specific, we could go with something like this (we'd probably spell it out as rules if we decide to put this strict version in the spec): "All readers can run concurrently, but once a writer tries to start a transaction we stop allowing new readers to start and queue up the writer and any subsequent reader/writer. Once the existing readers are drained the writer runs, and after that whatever is queued up next runs, which can be another writer or all the remaining readers (depending upon what came first, another writer or another reader; readers are all released simultaneously since they run concurrently)." Given that not all implementations will have to deal with this and that different implementations may want to have different strategies, it seems that just having the recommendation around starvation is the best option. Thanks -pablo
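The stricter queueing rule quoted above can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: the FairScheduler class and its request/finish methods are hypothetical names, not anything from the spec, and "all the remaining readers" is interpreted here as "every queued reader up to the next queued writer".

```javascript
// Hypothetical sketch of the "strict" fairness rule; not spec text or a real UA algorithm.
class FairScheduler {
  constructor() {
    this.activeReaders = 0;
    this.activeWriter = false;
    this.queue = []; // pending transactions, in arrival order
    this.log = [];   // order in which transactions were allowed to start
  }

  // mode is "read" or "write"; name is only a label for the log.
  request(mode, name) {
    const canStartNow =
      this.queue.length === 0 &&
      !this.activeWriter &&
      (mode === "read" || this.activeReaders === 0);
    if (canStartNow) {
      this.start({ mode, name });
    } else {
      // Once anything is queued, later arrivals queue behind it,
      // so new readers cannot overtake a waiting writer.
      this.queue.push({ mode, name });
    }
  }

  start(txn) {
    if (txn.mode === "read") this.activeReaders++;
    else this.activeWriter = true;
    this.log.push(txn.name);
  }

  // Called when a running transaction of the given mode completes.
  finish(mode) {
    if (mode === "read") this.activeReaders--;
    else this.activeWriter = false;
    this.drain();
  }

  drain() {
    while (this.queue.length > 0) {
      const next = this.queue[0];
      if (next.mode === "write") {
        // A writer needs exclusive access, so wait for running work to finish.
        if (this.activeReaders > 0 || this.activeWriter) return;
        this.start(this.queue.shift());
        return;
      }
      // A reader is at the head: release it unless a writer is running,
      // then keep looping so consecutive readers start together.
      if (this.activeWriter) return;
      this.start(this.queue.shift());
    }
  }
}

const scheduler = new FairScheduler();
scheduler.request("read", "r1");  // starts immediately
scheduler.request("write", "w1"); // queued behind r1
scheduler.request("read", "r2");  // queued behind w1, not started alongside r1
scheduler.finish("read");         // r1 done, so w1 starts
scheduler.finish("write");        // w1 done, so r2 starts
console.log(scheduler.log.join(",")); // r1,w1,r2
```

Without the queue check in request(), r2 would start next to r1 and a steady stream of readers could keep w1 waiting indefinitely, which is the starvation case the message describes.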
[IndexedDB] Explicitly establishing the timing of clone creation
The spec for the asynchronous "put" and "add" methods on object stores, as well as "update" on cursors, doesn't explicitly state when clones are created, and can even be read as if clones should be created after the function call has returned, when the queued-up task is executed. This leads to problems where the clone may be modified after the call to put/add/update happens. Wouldn't it be more reasonable to require implementations to always create a clone of the object before returning (i.e. synchronously) and perform the rest of the operation asynchronously? If we agree on this I'll file a bug and later follow up with some text for the spec. Thanks -pablo
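A small sketch of the behavior being proposed: clone synchronously inside the call, and perform the write later. Everything here is a stand-in under stated assumptions: put, pendingTasks and store are hypothetical in-memory substitutes for the real API, and a JSON round-trip substitutes for the structured clone algorithm.

```javascript
// Hypothetical in-memory stand-ins; no real IndexedDB calls are made.
const pendingTasks = []; // stands in for the queued asynchronous work
const store = new Map(); // stands in for an object store

// A JSON round-trip is used only as a stand-in for structured cloning.
function cloneValue(value) {
  return JSON.parse(JSON.stringify(value));
}

function put(key, value) {
  const snapshot = cloneValue(value); // clone before the call returns
  pendingTasks.push(() => store.set(key, snapshot)); // the write happens later
}

// Simulates the point at which the queued-up tasks actually run.
function runPendingTasks() {
  while (pendingTasks.length > 0) pendingTasks.shift()();
}

const record = { id: 1, name: "Sven" };
put(1, record);
record.name = "changed after put()"; // mutation after the call has returned
runPendingTasks();
console.log(store.get(1).name); // Sven
```

If the clone were instead taken inside the queued task, the stored name would be the mutated one, which is the ambiguity the message asks the spec to rule out.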
RE: [IndexedDB] Languages for collation
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Thursday, August 12, 2010 3:36 AM >> On Thu, Aug 12, 2010 at 11:19 AM, Jonas Sicking wrote: >> On Wed, Aug 11, 2010 at 11:28 PM, Pablo Castro >> wrote: >> > We had some discussions about collation algorithms and such in the past, >> > but I don't think we have settled on the language aspect of it. In order >> > to have stores and indexes sort character-based keys in a way that is >> > consistent with users' expectations we'll have to take indication in the >> > API of what language we should use to collate strings. >> > >> > Trying to take a minimalist approach, we could add an optional parameter >> > on the database open call that indicates the language to use (e.g. "en" or >> > "en-UK", etc.). If the language is not specified and the database does not >> > exist, then we can use the current browser/OS language to create the >> > database. If not specified and database already exists, then use the one >> > it's already there (this accommodates the fact that a user may be able to >> > change their default language in the browser/OS after the database has >> > been created using the default). If the language is specified and the >> > database already exists and the specified language is not the one the >> > database has then we'll throw an exception (same behavior as with >> > "description", although we have that one in flight right now as well). >> > >> > We should probably also add a read-only attribute to the database object >> > that exposes the language. >> > >> > If this works for folks I can write a proposal for the specific changes to >> > the spec. >> If we make it part of the database open call, then that makes it >> impossible to change the sorting order of an existing database, no? >> This seems like it could be a problem. I.e. 
it's quite possible that an >> application will want to allow the user to change the sorting >> language, for example when changing the language of the UI. >> >> One solution would be to allow language to be set as part of the >> setVersion call. >> >> Whether it's per-database or more fine grained I think it absolutely must be >> part of setVersion. Changing the language will be a very heavyweight >> operation that'll require a similar level of isolation to "schema" changes >> of the database. (Not sure how I missed this point of Pablo's original >> email.) Yes, changing the collation would effectively mean re-creating all the stores and indexes. At a very minimum it needs to be a setVersion thing. I also don't think it would be too crazy to not support changing collations, period. In the unusual case where a user must absolutely do this, it can be done by creating a separate database and copying the data over using the APIs.
RE: [IndexedDB] Languages for collation
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Thursday, August 12, 2010 2:18 AM >> I think we should first break down the use cases and look at how many of >> them just need _a_ sort order, how many of them a per-database sort order is >> ok, and how many of them would need something finer grained (like a per-key >> ordering). That's reasonable. What I was thinking is that any case where you'll use the order of items in a store/index to display things to the user (e.g. a list of contacts) you'd want the items to be in proper order for the user's language. That will not only match users' expectations but also match other applications (or even other parts of the UA) that display data based on the current OS language or the users' choice of language. That covers a very broad spectrum of scenarios that need language-specific sort order. I find it unlikely that a single web app will need more than one language per database (or even per origin/OS account), given that most applications operate in a single language at any one point in time. >> Are there work-arounds for getting an UCA ordered data structure to hold >> data other language's order? For example, I could imagine it'd be possible >> to do some sort of encode step on the data before insertion (and decode on >> removal) that would make UCA work. I have no idea, but if such algorithms >> existed and were well understood, then it'd definitely make me lean towards >> punting language specification to v2. I'm not sure I understand this paragraph. "UCA ordered" may not mean much more than just ordering using a binary collation if the language is not specified. While this is typically not an issue in English, in other languages this introduces a varying level of deviation from users' expectations. Given that different languages have conflicting rules for collation, I'm not sure how this can be generalized independently of the language. 
Even in the UCA specification [1] the aspect of input language is mentioned as the most important feature of collation. [1] http://www.unicode.org/reports/tr10/
[IndexedDB] READ_ONLY vs SNAPSHOT_READ transactions
We currently have two read-only transaction modes, READ_ONLY and SNAPSHOT_READ. As we mapped this out to an implementation, we ran into various questions that made me wonder whether we have the right set of modes. It seems that READ_ONLY and SNAPSHOT_READ are identical in every aspect (point-in-time consistency for readers, allow multiple concurrent readers, etc.), except that they have different concurrency characteristics, with READ_ONLY blocking writers and SNAPSHOT_READ allowing concurrent writers to come and go while readers are active. Does that match everybody's interpretation? Assuming that interpretation, I'm not sure we need both. Should we consider having only READ_ONLY, where transactions are guaranteed a stable view of the world regardless of the implementation strategy, and then let implementations either block writers or version the data? I understand that this introduces variability in the reader-writer interaction. On the other hand, I also suspect that the cost of SNAPSHOT_READ will vary a lot across implementations (e.g. mvcc-based stores versus non-mvcc stores that will have to make copies of all stores included in a transaction to support this mode). Thanks -pablo
RE: [IndexedDB] question about description argument of IDBFactory::open()
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Thursday, August 12, 2010 3:59 AM >> On Thu, Aug 12, 2010 at 11:55 AM, Jonas Sicking wrote: >> On Thu, Aug 12, 2010 at 3:41 AM, Jeremy Orlow wrote: >> > http://www.w3.org/Bugs/Public/show_bug.cgi?id=10349 >> > One question though: if they pass in null or undefined, do we want to >> > interpret this as the argument not being passed in or simply let them >> > convert to "undefined" and "null" (which is the default behavior in WebIDL, >> > I believe). I feel somewhat strongly we should do the former. Especially >> > since the latter would make it impossible to add additional parameters to >> > .open() in the future. >> I don't understand why it would make it impossible to add optional >> parameters in the future. Wouldn't it be a matter of people writing >> >> indexeddb.open("mydatabase", "", SOME_OTHER_PARAM); >> >> vs. >> >> indexeddb.open("mydatabase", null, SOME_OTHER_PARAM); >> >> So "" is assumed to mean "don't update"? My assumption was that "" meant >> empty description. >> >> It seems silly to make someone replace the description with a space (or >> something like that) if they truly want to zero it out. And it seems silly >> to ever make your description be >> "null". So it seemed natural to make >> null and/or undefined be such a signal. Given that open() is one of those functions that are likely to grow in parameters over time, I wonder if we should consider taking an object as the second argument with names/values (e.g. open("mydatabase", { description: "foo" }); ). That would allow us to keep the minimum specification small and easily add more parameters later without resulting in hard-to-read code that has a bunch of "undefined" in arguments. The only thing I'm not sure about is whether there is precedent for doing this in one of the standard APIs. Thanks -pablo
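To make the options-object idea concrete, here is a minimal sketch of how such a second argument could separate "no description passed" from "explicitly empty description". The function name describeOpen and the shape of its return value are hypothetical and exist only for illustration; the only name taken from the thread is the description property.

```javascript
// Hypothetical helper: reports how an open(name, options) call would be interpreted.
function describeOpen(name, options) {
  // null and undefined both mean "no options were passed".
  const opts = options == null ? {} : options;
  const hasDescription = Object.prototype.hasOwnProperty.call(opts, "description");
  return {
    name,
    // Only a property that is actually present counts as a request to update.
    updateDescription: hasDescription,
    // "" is a legitimate, explicitly empty description.
    description: hasDescription ? String(opts.description) : undefined,
  };
}

console.log(describeOpen("mydatabase"));
console.log(describeOpen("mydatabase", { description: "" }));
console.log(describeOpen("mydatabase", { description: "foo" }));
```

With this shape, adding another parameter later means adding another property name, and callers never need to pad the argument list with null or undefined placeholders.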
RE: [IndexedDB] Languages for collation
From: Mikeal Rogers [mailto:mikeal.rog...@gmail.com] Sent: Wednesday, August 11, 2010 11:35 PM >> Why not just use the unicode collation algorithm? >> >> Then you won't have to hint the locale. Unless I'm missing something, the UCA defines the general algorithm for collating strings but you still need to know the language in order to sort strings properly in that language. For example, in Spanish the letters "c" and "h" together (e.g. in "chau" (bye)) sort as a single letter, causing the expected sort order to be different from English where they are always two independent letters (e.g. so "chau" comes before "cuando" (when) when sorted in English, but after when sorted in Spanish). >> >> http://en.wikipedia.org/wiki/Unicode_collation_algorithm >> >> CouchDB uses some definitions around sorting complex types like arrays and >> objects but when it comes down to sorting strings it just defaults to to the >> unicode collation algorithm and all the locale's are happy. >> >> -Mikeal >> >> On Wed, Aug 11, 2010 at 11:28 PM, Pablo Castro >> wrote: >> We had some discussions about collation algorithms and such in the past, but >> I don't think we have settled on the language aspect of it. In order to have >> stores and indexes sort character-based keys in a way that is consistent >> with users' expectations we'll have to take indication in the API of what >> language we should use to collate strings. >> >> Trying to take a minimalist approach, we could add an optional parameter on >> the database open call that indicates the language to use (e.g. "en" or >> "en-UK", etc.). If the language is not specified and the database does not >> exist, then we can use the current browser/OS language to create the >> database. If not specified and database already exists, then use the one >> it's already there (this accommodates the fact that a user may be able to >> change their default language in the browser/OS after the database has been >> created using the default). 
If the language is specified and the database >> already exists and the specified language is not the one the database has >> then we'll throw an exception (same behavior as with "description", although >> we have that one in flight right now as well). >> >> We should probably also add a read-only attribute to the database object >> that exposes the language. >> >> If this works for folks I can write a proposal for the specific changes to >> the spec. >> >> Thanks >> -pablo
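The "chau"/"cuando" example in the reply above can be reproduced with two small comparators. This is a hand-written illustration, not a real collation implementation: compareChAsLetter is a hypothetical function that handles only lower-case a to z and treats "ch" as one letter sorted after "c", while compareCodeUnits stands in for a plain binary ordering.

```javascript
// Simplified alphabet in which "ch" is a single letter after "c" (illustration only).
const ALPHABET = "a b c ch d e f g h i j k l m n o p q r s t u v w x y z".split(" ");
const RANK = new Map(ALPHABET.map((letter, index) => [letter, index]));

// Splits a lower-case word into letters, keeping "ch" together.
function toLetters(word) {
  const letters = [];
  for (let i = 0; i < word.length; i++) {
    if (word[i] === "c" && word[i + 1] === "h") {
      letters.push("ch");
      i++; // skip the "h" that was just consumed
    } else {
      letters.push(word[i]);
    }
  }
  return letters;
}

// Hypothetical comparator: "ch" sorts as one letter between "c" and "d".
function compareChAsLetter(a, b) {
  const lettersA = toLetters(a);
  const lettersB = toLetters(b);
  const shared = Math.min(lettersA.length, lettersB.length);
  for (let i = 0; i < shared; i++) {
    const diff = RANK.get(lettersA[i]) - RANK.get(lettersB[i]);
    if (diff !== 0) return diff;
  }
  return lettersA.length - lettersB.length;
}

// Plain code unit comparison, which is what a binary ordering amounts to.
function compareCodeUnits(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

const words = ["cuando", "chau", "dia"];
console.log([...words].sort(compareCodeUnits).join(", "));  // chau, cuando, dia
console.log([...words].sort(compareChAsLetter).join(", ")); // cuando, chau, dia
```

The same three keys come back in a different order depending on the rule, which is why an index built under one ordering cannot simply be reused under another.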
[IndexedDB] Languages for collation
We had some discussions about collation algorithms and such in the past, but I don't think we have settled on the language aspect of it. In order to have stores and indexes sort character-based keys in a way that is consistent with users' expectations we'll have to take indication in the API of what language we should use to collate strings. Trying to take a minimalist approach, we could add an optional parameter on the database open call that indicates the language to use (e.g. "en" or "en-UK", etc.). If the language is not specified and the database does not exist, then we can use the current browser/OS language to create the database. If not specified and database already exists, then use the one it's already there (this accommodates the fact that a user may be able to change their default language in the browser/OS after the database has been created using the default). If the language is specified and the database already exists and the specified language is not the one the database has then we'll throw an exception (same behavior as with "description", although we have that one in flight right now as well). We should probably also add a read-only attribute to the database object that exposes the language. If this works for folks I can write a proposal for the specific changes to the spec. Thanks -pablo
RE: CfC: to publish new WD of Indexed Database API; deadline August 17
We support this as well. -pablo -Original Message- From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Tuesday, August 10, 2010 8:06 AM To: Jeremy Orlow Cc: art.bars...@nokia.com; public-webapps Subject: Re: CfC: to publish new WD of Indexed Database API; deadline August 17 I support this. On Tue, Aug 10, 2010 at 4:38 AM, Jeremy Orlow wrote: > On Tue, Aug 10, 2010 at 12:04 PM, Arthur Barstow > wrote: >> >> All - the Editors of the Indexed Database API would like to publish a new >> Working Draft: >> >> http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html >> >> If you have any comments or concerns about this proposal, please send them >> to public-webapps by August 10 at the latest. > > I assume you mean the 17th? >> >> As with all of our CfCs, positive response is preferred and encouraged and >> silence will be assumed to be assent. > > We support.
RE: [IndexedDB] Need a method to remove a database
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Friday, August 06, 2010 2:34 AM >> On Fri, Aug 6, 2010 at 12:37 AM, Jonas Sicking wrote: >> On Thu, Aug 5, 2010 at 4:02 PM, Pablo Castro >> wrote: >> > >> > -Original Message- >> > From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] >> > On Behalf Of Jonas Sicking >> > Sent: Thursday, August 05, 2010 2:12 PM >> > >> >>> >> I suggest we make removeDatabase (or whatever we call it) schedule a >> >>> >> database to be deleted, but doesn't actually delete it until all >> >>> >> existing connections to it are closed (though either explicit calls to >> >>> >> IDBDatabase.close(), or through the tab being closed). >> >>> >> >> >>> >> Any calls to IDBFactory.open with the same name will hold the callback >> >>> >> until the removeDatabase() operation is finished. I.e. after all >> >>> >> existing connections are closed and the database is removed. >> >>> >> >> >>> >> This is similar to how setVersion works. >> >>> > >> >>> > If we're not going to keep it simple, then we should match the >> >>> > setVersion >> >>> > semantics as much as is possible. I.e. add the blocked event and >> >>> > stuff like >> >>> > that. >> >>> >> >>> The "blocked" event fires on the IDBDatabase object. Do we want to >> >>> require that the database is opened before it can be removed? I don't >> >>> really feel strongly either way. >> >>> >> >>> The other question is if we should fire a "versionchange" event on >> >>> other open IDBDatabases, like setVersion does. Or should we fire a >> >>> "holy hell, your database is about to get nuked!" event? The former >> >>> would keep things simpler since there is just one event to listen to. >> >>> The latter might be more correct. 
>> >>> >> >>> / Jonas >> > >> > I like the idea of just scheduling the database to be deleted once the >> > last connection to it closes, and also preventing any new connection from >> > being established >> once the database has been scheduled for deletion. >> > This adds as little surface area as possible to the API. >> > >> > If we find that that's not a good idea for some reason, I wonder if we >> > should unify the "versionchange" event and this into a single "stuff >> > seriously changed" event where subscribers need to close their handles and >> > let go of any assumptions they had about the database. Once they can >> > re-open, they need to re-establish all their context (this is already true >> > for a version change, we may as well extend it to database deletes and any >> > other future big changes to the database schema, options, etc.) >> Here's my proposal, please poke holes in it: >> >> interface IDBFactory { >> ... >> IDBRequest deleteDatabase(in DOMString name); >> ... >> }; >> >> When deleteDatabase is called, the given database is scheduled for >> deletion. If any IDBDatabase objects are opened to the database fire a >> "versionchange" event on those IDBDatabase objects, with a .version >> set to null. If any calls to IDBFactory.open occur, stall those until >> after this algorithm is finished. Note that this generally won't mean >> that those open calls will fail. They'll generally will receive a >> newly created database instead. >> >> Once all existing IDBDatabase are closed (implicitly or explicitly), >> the database is removed. At this point any IDBFactory.open calls are >> fulfilled and a "success" event is fired on the returned IDBRequest. >> >> So no "blocked" event is fired as I'm not sure where to fire it. I'm >> also not sure that this is a big problem. I'm not even sure that >> returning a IDBRequest is worth it. 
The only value I can see is >> wanting to display to a user when a database is for sure deleted as to >> allow the user to for example safely shut down the computer without >> worrying that sensitive data is still in the database. >> >> All of this sounds good to me. I'd probably still return an IDBRequest >> for consistency and so that the app can get a confirmation when it's really >> gone. On success would fire with a "null" result field, I'd think. This looks good to me too. I agree with still having deleteDatabase return an IDBRequest so the caller can tell when the operation is done. -pablo
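The deleteDatabase semantics proposed in this thread can be walked through with a small in-memory model. All names below (FakeFactory, its methods and callbacks) are hypothetical and exist only to show the ordering of events; nothing here calls a real IndexedDB implementation, and the choice to report the delete before fulfilling stalled opens is an assumption.

```javascript
// Hypothetical in-memory model of "schedule the delete, stall opens, finish on last close".
class FakeFactory {
  constructor() {
    this.databases = new Map(); // name -> bookkeeping entry
  }

  entry(name) {
    if (!this.databases.has(name)) {
      this.databases.set(name, {
        connections: new Set(),
        deletePending: false,
        waitingOpens: [],
        waitingDeletes: [],
      });
    }
    return this.databases.get(name);
  }

  open(name, onsuccess) {
    const db = this.entry(name);
    if (db.deletePending) {
      db.waitingOpens.push(onsuccess); // stall until the delete has finished
      return;
    }
    const connection = {
      onversionchange: null,
      close: () => {
        db.connections.delete(connection);
        this.maybeFinishDelete(name, db);
      },
    };
    db.connections.add(connection);
    onsuccess(connection);
  }

  deleteDatabase(name, onsuccess) {
    const db = this.entry(name);
    db.deletePending = true;
    db.waitingDeletes.push(onsuccess);
    // Tell every open connection, with version set to null.
    for (const connection of [...db.connections]) {
      if (connection.onversionchange) connection.onversionchange({ version: null });
    }
    this.maybeFinishDelete(name, db);
  }

  maybeFinishDelete(name, db) {
    if (!db.deletePending || db.connections.size > 0) return;
    db.deletePending = false; // guard against finishing twice
    this.databases.delete(name);
    db.waitingDeletes.forEach((callback) => callback(null)); // success, null result
    // Stalled opens now proceed and receive a newly created database.
    db.waitingOpens.forEach((callback) => this.open(name, callback));
  }
}

const factory = new FakeFactory();
const eventLog = [];
let firstConnection;
factory.open("notes", (connection) => {
  firstConnection = connection;
  eventLog.push("open #1");
});
firstConnection.onversionchange = (event) => eventLog.push("versionchange " + event.version);
factory.deleteDatabase("notes", () => eventLog.push("delete done"));
factory.open("notes", () => eventLog.push("open #2")); // stalled behind the delete
firstConnection.close(); // last connection closes, so the delete completes
console.log(eventLog.join(" | ")); // open #1 | versionchange null | delete done | open #2
```

The second open is never rejected; it simply waits and then sees a fresh database, matching the note in the proposal that stalled open calls generally will not fail.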
RE: [IndexedDB] Need a method to remove a database
-Original Message- From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Thursday, August 05, 2010 2:12 PM >> >> I suggest we make removeDatabase (or whatever we call it) schedule a >> >> database to be deleted, but doesn't actually delete it until all >> >> existing connections to it are closed (though either explicit calls to >> >> IDBDatabase.close(), or through the tab being closed). >> >> >> >> Any calls to IDBFactory.open with the same name will hold the callback >> >> until the removeDatabase() operation is finished. I.e. after all >> >> existing connections are closed and the database is removed. >> >> >> >> This is similar to how setVersion works. >> > >> > If we're not going to keep it simple, then we should match the setVersion >> > semantics as much as is possible. I.e. add the blocked event and stuff >> > like >> > that. >> >> The "blocked" event fires on the IDBDatabase object. Do we want to >> require that the database is opened before it can be removed? I don't >> really feel strongly either way. >> >> The other question is if we should fire a "versionchange" event on >> other open IDBDatabases, like setVersion does. Or should we fire a >> "holy hell, your database is about to get nuked!" event? The former >> would keep things simpler since there is just one event to listen to. >> The latter might be more correct. >> >> / Jonas I like the idea of just scheduling the database to be deleted once the last connection to it closes, and also preventing any new connection from being established once the database has been scheduled for deletion. This adds as little surface area as possible to the API. If we find that that's not a good idea for some reason, I wonder if we should unify the "versionchange" event and this into a single "stuff seriously changed" event where subscribers need to close their handles and let go of any assumptions they had about the database. 
Once they can re-open, they need to re-establish all their context (this is already true for a version change, we may as well extend it to database deletes and any other future big changes to the database schema, options, etc.) -pablo
RE: [IndexedDB] Need a method to clear an object store
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Tuesday, August 03, 2010 12:21 PM >> On Tue, Aug 3, 2010 at 12:09 PM, ben turner wrote: >> > Hi folks, >> > >> > Currently there are only two ways to clear an object store of all >> > data: (i) remove the object store and recreate it, or (ii) open a >> > cursor and call remove for all entries. I propose a third, simpler >> > approach: >> > >> > interface IDBObjectStore >> > { >> > ... >> > void clear(); >> > ... >> > }; >> > >> > Any thoughts? >> >> Some background. At least in our implementation, removing each >> individual item is significantly slower than removing and recreating >> the objectStore. It's also significantly slower than a 'clear' >> function is. And while tearing down and recreating the objectStore >> works, it's fairly complex if there are multiple indexes on the store. >> Adding a clear() function, while redundant, should make things easier >> for developers while adding very little work in the implementation. >> >> I think there is a bug in the above proposal though. clear() should >> return a IDBRequest. However the .result of the request should likely >> be null. >> >> / Jonas +1 on having clear(). We ran into the need also while playing with samples and such. -pablo
RE: [IndexedDB] Need a method to remove a database
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Wednesday, August 04, 2010 2:56 AM >> On Tue, Aug 3, 2010 at 11:26 PM, Jonas Sicking wrote: >> On Tue, Aug 3, 2010 at 3:20 PM, Shawn Wilsher wrote: >> > Hey all, >> > >> > Some of the feedback I've been seeing on the web is that there is no way to >> > remove a database. Examples seem to be "web page wants to allow the user >> > to >> > remove the data they stored". A site can almost accomplish this now by >> > removing all object stores, but we still end up storing some meta data >> > (version number). Does this seem like a legit request to everyone? >> Sounds legit to me. Feel somewhat embarrassed that I've missed this so far :) >> >> Agreed. >> >> What should the semantics be for open database connections? We could do >> something like setVersion, but I'd just as soon nuke any existing connection >> (i.e. make all future operations fail). This seems >> reasonable since the >> reasons we didn't do this for setVersion (data loss) don't really seem to >> apply here. >> >> J +1 Nuking is fine...another option would be to queue up the delete until all database sessions are gone, but probably will complicate things and not add much. The only thing I wonder is if we'll create a bunch of pain for implementations where nuking is tricky (thinking of multi-process scenarios where maybe files are locked or something). -pablo
RE: [IndexedDB] Current editor's draft
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Thursday, July 22, 2010 5:30 PM >> On Thu, Jul 22, 2010 at 5:26 PM, Pablo Castro >> wrote: >> > >> > From: Jonas Sicking [mailto:jo...@sicking.cc] >> > Sent: Thursday, July 22, 2010 5:18 PM >> > >> >>> > The author doesn't explicitly specify which rows to lock. All rows >> >>> > that you "see" become locked (e.g. through get(), put(), scanning with >> >>> > a cursor, etc.). If you start the transaction as read-only then >> >>> > they'll all have shared locks. If you start the transaction as >> >>> > read-write then we can choose whether the implementation should always >> >>> > attempt to take exclusive locks or if it should take shared locks on >> >>> > read, and attempt to upgrade to an exclusive lock on first write (this >> >>> > affects failure modes a bit). >> > >> >>> What counts as "see"? If you iterate using an index-cursor all the >> >>> rows that have some value between "A" and "B", but another, not yet >> >>> committed, transaction changes a row such that its value now is >> >>> between "A" and "B", what happens? >> > >> > We need to design something a bit more formal that covers the whole >> > spectrum. As a short answer, assuming we want to have "serializable" as >> > our isolation level, then we'd have a range lock that goes from the start >> > of a cursor to the point you've reached, so if you were to start another >> > cursor you'd be guaranteed the exact same view of the world. In that case >> > it wouldn't be possible for other transaction to insert a row between two >> > rows you scanned through with a cursor. >> >> How would you prevent that? Would a call to .modify() or .put() block >> until the other transaction finishes? With appropriate timeouts on >> deadlocks of course. That's right, calls would block if they need to acquire a lock for a key or a range and there is an incompatible lock present that overlaps somehow with that. -pablo
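The blocking rule in the reply above (a call waits when an incompatible lock overlaps the key or range it needs) can be sketched as a compatibility check. This is a sketch under stated assumptions: the lock shape { owner, mode, lower, upper } and the function names are hypothetical, ranges are inclusive, and keys are compared as plain strings.

```javascript
// Hypothetical lock record: { owner, mode: "shared" | "exclusive", lower, upper }.

// Two inclusive key ranges overlap when each starts before the other ends.
function rangesOverlap(a, b) {
  return a.lower <= b.upper && b.lower <= a.upper;
}

// Locks conflict only across transactions, on overlapping ranges,
// and when at least one of them is exclusive.
function conflicts(held, requested) {
  if (held.owner === requested.owner) return false;
  if (!rangesOverlap(held, requested)) return false;
  return held.mode === "exclusive" || requested.mode === "exclusive";
}

// A request has to wait if any currently held lock conflicts with it.
function mustBlock(heldLocks, requested) {
  return heldLocks.some((held) => conflicts(held, requested));
}

// txn1 has scanned an index from "A" to "B" and holds a shared range lock.
const heldLocks = [{ owner: "txn1", mode: "shared", lower: "A", upper: "B" }];

// txn2 tries to write a key inside the scanned range: it has to wait.
console.log(mustBlock(heldLocks, { owner: "txn2", mode: "exclusive", lower: "Adam", upper: "Adam" })); // true
// txn2 only reads inside the range: shared locks are compatible.
console.log(mustBlock(heldLocks, { owner: "txn2", mode: "shared", lower: "Adam", upper: "Adam" })); // false
// txn2 writes a key outside the range: no overlap, no wait.
console.log(mustBlock(heldLocks, { owner: "txn2", mode: "exclusive", lower: "Sven", upper: "Sven" })); // false
```

The first case is the scenario raised earlier in the thread: a write that would land between "A" and "B" is held back so that a repeated scan by txn1 sees the same rows.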
RE: [IndexedDB] Current editor's draft
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Thursday, July 22, 2010 5:25 PM >> >> Regarding deadlocks, that's right, the implementation cannot determine if >> >> a deadlock will occur ahead of time. Sophisticated implementations could >> >> track locks/owners and do deadlock detection, although a simple >> >> timeout-based mechanism is probably enough for IndexedDB. >> > >> > Simple implementations will not deadlock because they're only doing object >> > store level locking in a constant locking order. Well, it's not really simple vs sophisticated, but whether they do dynamically scoped transactions or not, isn't it? If you do dynamic transactions, then regardless of the granularity of your locks, code will grow the lock space in a way that you cannot predict so you can't use a well-known locking order, so deadlocks are not avoidable. >> > Sophisticated implementations will be doing key level (IndexedDB's analog >> > to row level) locking with deadlock detection or using methods to >> > completely >> > avoid it. I'm not sure I'm comfortable with having one or two in-between >> > implementations relying on timeouts to resolve deadlocks. Deadlock detection is quite a bit to ask from the storage engine. From the developer's perspective, the difference between deadlock detection and timeouts for deadlocks is the fact that the timeout approach will take a bit longer, and the error won't be as definitive. I don't think this particular difference is enough to require deadlock detection. >> > Of course, if we're breaking deadlocks that means that web developers need >> > to handle this error case on every async request they make. As such, I'd >> > rather that we require implementations to make deadlocks impossible. This >> > means that they either need to be conservative about locking or to do MVCC >> > (or something similar) so that transactions can continue on even beyond the >> > point where we know they can't be serialized. 
This would >> > be consistent with >> > our usual policy of trying to put as much of the burden as is practical on >> > the browser developers rather than web developers. Same as above...MVCC is quite a bit to mandate from all implementations. For example, I'm not sure but from my basic understanding of SQLite I think it always does straight up locking and doesn't have support for versioning. >> >> >> >> As for locking only existing rows, that depends on how much isolation we >> >> want to provide. If we want "serializable", then we'd have to put in >> >> things >> >> such as range locks and locks on non-existing keys so reads are consistent >> >> w.r.t. newly created rows. >> > >> > For the record, I am completely against anything other than "serializable" >> > being the default. Everything a web developer deals with follows run to >> > completion. If you want to have optional modes that relax things in terms >> > of serializability, maybe we should start a new thread? >> >> Agreed. >> >> I was against dynamic transactions even when they used >> whole-objectStore locking. So I'm even more so now that people are >> proposing row-level locking. But I'd like to understand what people >> are proposing, and make sure that what is being proposed is a coherent >> solution, so that we can correctly evaluate it's risks versus >> benefits. The way I see the risk/benefit tradeoff of dynamic transactions: they bring better concurrency and more flexibility at the cost of new failure modes. I think that weighing them in those terms is more important than the specifics such as whether it's okay to have timeouts versus explicit deadlock errors. -pablo
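On the deadlock detection versus timeout point earlier in this message: the "track locks/owners" approach usually comes down to looking for a cycle in a wait-for graph. The sketch below assumes the engine can produce a map from each blocked transaction to the transactions it is waiting on; findDeadlock and that map shape are made up for illustration and say nothing about how any actual engine does it.

```javascript
// waitsFor: Map from a transaction id to an array of the transaction ids it
// is currently blocked on. Hypothetical helper, not part of any IndexedDB draft.
// Returns the ids forming a cycle (a deadlock), or null if there is none.
function findDeadlock(waitsFor) {
  const visiting = new Set(); // transactions on the current DFS path
  const done = new Set();     // transactions already proven cycle-free

  function visit(tx, path) {
    if (done.has(tx)) return null;
    if (visiting.has(tx)) return path.slice(path.indexOf(tx)); // cycle found
    visiting.add(tx);
    path.push(tx);
    for (const next of waitsFor.get(tx) || []) {
      const cycle = visit(next, path);
      if (cycle) return cycle;
    }
    path.pop();
    visiting.delete(tx);
    done.add(tx);
    return null;
  }

  for (const tx of waitsFor.keys()) {
    const cycle = visit(tx, []);
    if (cycle) return cycle;
  }
  return null;
}
```

An engine that finds a cycle can abort one of the listed transactions right away with a specific error; a timeout-only engine reaches the same outcome later and with a less definitive error, which is the difference described above.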
RE: [IndexedDB] Current editor's draft
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Thursday, July 22, 2010 5:18 PM >> > The author doesn't explicitly specify which rows to lock. All rows that >> > you "see" become locked (e.g. through get(), put(), scanning with a >> > cursor, etc.). If you start the transaction as read-only then they'll all >> > have shared locks. If you start the transaction as read-write then we can >> > choose whether the implementation should always attempt to take exclusive >> > locks or if it should take shared locks on read, and attempt to upgrade to >> > an exclusive lock on first write (this affects failure modes a bit). >> What counts as "see"? If you iterate using an index-cursor all the >> rows that have some value between "A" and "B", but another, not yet >> committed, transaction changes a row such that its value now is >> between "A" and "B", what happens?

We need to design something a bit more formal that covers the whole spectrum. As a short answer, assuming we want to have "serializable" as our isolation level, we'd have a range lock that goes from the start of a cursor to the point you've reached, so if you were to start another cursor you'd be guaranteed the exact same view of the world. In that case it wouldn't be possible for another transaction to insert a row between two rows you scanned through with a cursor.

-pablo
RE: [IndexedDB] Current editor's draft
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Thursday, July 22, 2010 11:27 AM >> On Thu, Jul 22, 2010 at 3:43 AM, Nikunj Mehta wrote: >> > >> > On Jul 16, 2010, at 5:41 AM, Pablo Castro wrote: >> > >> >> >> >> From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy >> >> Orlow >> >> Sent: Thursday, July 15, 2010 8:41 AM >> >> >> >> On Thu, Jul 15, 2010 at 4:30 PM, Andrei Popescu >> >> wrote: >> >> On Thu, Jul 15, 2010 at 3:24 PM, Jeremy Orlow wrote: >> >>> On Thu, Jul 15, 2010 at 3:09 PM, Andrei Popescu >> >>> wrote: >> >>>> >> >>>> On Thu, Jul 15, 2010 at 9:50 AM, Jeremy Orlow >> >>>> wrote: >> >>>>>>>> Nikunj, could you clarify how locking works for the dynamic >> >>>>>>>> transactions proposal that is in the spec draft right now? >> >>>>>>> >> >>>>>>> I'd definitely like to hear what Nikunj originally intended here. >> >>>>>>>> >> >>>>>> >> >>>>>> Hmm, after re-reading the current spec, my understanding is that: >> >>>>>> >> >>>>>> - Scope consists in a set of object stores that the transaction >> >>>>>> operates >> >>>>>> on. >> >>>>>> - A connection may have zero or one active transactions. >> >>>>>> - There may not be any overlap among the scopes of all active >> >>>>>> transactions (static or dynamic) in a given database. So you cannot >> >>>>>> have two READ_ONLY static transactions operating simultaneously over >> >>>>>> the same object store. >> >>>>>> - The granularity of locking for dynamic transactions is not specified >> >>>>>> (all the spec says about this is "do not acquire locks on any database >> >>>>>> objects now. Locks are obtained as the application attempts to access >> >>>>>> those objects"). >> >>>>>> - Using dynamic transactions can lead to dealocks. 
>> >>>>>> >> >>>>>> Given the changes in 9975, here's what I think the spec should say for >> >>>>>> now: >> >>>>>> >> >>>>>> - There can be multiple active static transactions, as long as their >> >>>>>> scopes do not overlap, or the overlapping objects are locked in modes >> >>>>>> that are not mutually exclusive. >> >>>>>> - [If we decide to keep dynamic transactions] There can be multiple >> >>>>>> active dynamic transactions. TODO: Decide what to do if they start >> >>>>>> overlapping: >> >>>>>> -- proceed anyway and then fail at commit time in case of >> >>>>>> conflicts. However, I think this would require implementing MVCC, so >> >>>>>> implementations that use SQLite would be in trouble? >> >>>>> >> >>>>> Such implementations could just lock more conservatively (i.e. not >> >>>>> allow >> >>>>> other transactions during a dynamic transaction). >> >>>>> >> >>>> Umm, I am not sure how useful dynamic transactions would be in that >> >>>> case...Ben Turner made the same comment earlier in the thread and I >> >>>> agree with him. >> >>>> >> >>>> Yes, dynamic transactions would not be useful on those implementations, >> >>>> but the point is that you could still implement the spec without a MVCC >> >>>> backend--though it >> would limit the concurrency that's possible. >> >>>> Thus "implementations that use SQLite would" NOT necessarily "be in >> >>>> trouble". >> >> >> >> Interesting, I'm glad this conversation came up so we can sync up on >> >> assumptions...mine where: >> >> - There can be multiple transactions of any kind active against a given >> >> database session (see note below) >> >> - Multiple static transactions may overlap as long as they have >> >> compatible modes, which in practice means they are all READ_ONLY >> >> - D
RE: [IndexedDB] Cursors and modifications
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Thursday, July 15, 2010 11:59 AM On Thu, Jul 15, 2010 at 11:02 AM, Pablo Castro wrote: >> > >> > From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy >> > Orlow >> > Sent: Thursday, July 15, 2010 2:04 AM >> > >> > On Thu, Jul 15, 2010 at 2:44 AM, Jonas Sicking wrote: >> > On Wed, Jul 14, 2010 at 6:20 PM, Pablo Castro >> > wrote: >> > >> >>> > If it's accurate, as a side note, for the async API it seems that this >> >>> > makes it more interesting to enforce callback order, so we can more >> >>> > easily explain what we mean by "before". >> >>> Indeed. >> >>> >> >>> What do you mean by enforce callback order? Are you saying that >> >>> callbacks should be done in the order the requests are made (rather than >> >>> prioritizing cursor callbacks)? (That's how I read it, but Jonas' >> >>> "Indeed" makes me suspect I missed something. :-) >> > >> > That's right. If changes are visible as they are made within a >> > transaction, then reordering the callbacks would have a visible effect. In >> > particular if we prioritize the cursor callbacks then you'll tend to see a >> > callback for a cursor move before you see a callback for say an >> > add/modify, and it's not clear at that point whether the add/modify >> > happened already and is visible (but the callback didn't land yet) or if >> > the change hasn't happened yet. If callbacks are in order, you see changes >> > within your transaction strictly in the order that each request is made, >> > avoiding surprises in cursor callbacks. >> Oh, I took what you said just as that we need to have a defined >> callback order. Not anything in particular what that definition should >> be. >> >> Regarding when a modification happens, I think the design should be >> that changes logically happen as soon as the 'success' call is fired. >> Any success calls after that will see the modified values. 
Yep, I agree with this: a change has happened "for sure" once you see the success callback. Before that you may or may not observe the change if you do a get or open a cursor to look at the record.

>> I still think given the quite substantial speedups gained from >> prioritizing cursor callbacks, that it's the right thing to do. It >> arguably also has some benefits from a practical point of view when it >> comes to the very topic we're discussing. If we prioritize cursor >> callbacks, that makes it much easier to iterate a set of entries and >> update them, without having to worry about those updates messing up >> your iterator.

I hear you on the perf implications, but I'm worried that a non-sequential order for callbacks will be completely non-intuitive for users. In particular, if you're changing things as you scan a cursor and then cursor through the changed records, you can't be sure whether you'll see the changes or not (because the callback is the only "definitive" point where the change is visible). That seems quite problematic...

-pablo
RE: [IndexedDB] Current editor's draft
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Thursday, July 15, 2010 8:41 AM On Thu, Jul 15, 2010 at 4:30 PM, Andrei Popescu wrote: On Thu, Jul 15, 2010 at 3:24 PM, Jeremy Orlow wrote: > On Thu, Jul 15, 2010 at 3:09 PM, Andrei Popescu wrote: >> >> On Thu, Jul 15, 2010 at 9:50 AM, Jeremy Orlow wrote: >> >> >> Nikunj, could you clarify how locking works for the dynamic >> >> >> transactions proposal that is in the spec draft right now? >> >> > >> >> > I'd definitely like to hear what Nikunj originally intended here. >> >> >> >> >> >> >> Hmm, after re-reading the current spec, my understanding is that: >> >> >> >> - Scope consists in a set of object stores that the transaction operates >> >> on. >> >> - A connection may have zero or one active transactions. >> >> - There may not be any overlap among the scopes of all active >> >> transactions (static or dynamic) in a given database. So you cannot >> >> have two READ_ONLY static transactions operating simultaneously over >> >> the same object store. >> >> - The granularity of locking for dynamic transactions is not specified >> >> (all the spec says about this is "do not acquire locks on any database >> >> objects now. Locks are obtained as the application attempts to access >> >> those objects"). >> >> - Using dynamic transactions can lead to dealocks. >> >> >> >> Given the changes in 9975, here's what I think the spec should say for >> >> now: >> >> >> >> - There can be multiple active static transactions, as long as their >> >> scopes do not overlap, or the overlapping objects are locked in modes >> >> that are not mutually exclusive. >> >> - [If we decide to keep dynamic transactions] There can be multiple >> >> active dynamic transactions. TODO: Decide what to do if they start >> >> overlapping: >> >> -- proceed anyway and then fail at commit time in case of >> >> conflicts. 
However, I think this would require implementing MVCC, so >> >> implementations that use SQLite would be in trouble? >> > >> > Such implementations could just lock more conservatively (i.e. not allow >> > other transactions during a dynamic transaction). >> > >> Umm, I am not sure how useful dynamic transactions would be in that >> case...Ben Turner made the same comment earlier in the thread and I >> agree with him. >> >> Yes, dynamic transactions would not be useful on those implementations, but >> the point is that you could still implement the spec without a MVCC >> backend--though it would limit the concurrency that's possible. Thus >> "implementations that use SQLite would" NOT necessarily "be in trouble".

Interesting, I'm glad this conversation came up so we can sync up on assumptions...mine were:
- There can be multiple transactions of any kind active against a given database session (see note below)
- Multiple static transactions may overlap as long as they have compatible modes, which in practice means they are all READ_ONLY
- Dynamic transactions have arbitrary granularity for scope (implementation specific, down to row-level locking/scope)
- Overlapping between statically and dynamically scoped transactions follows the same rules as static-static overlaps; they can only overlap on compatible scopes. The only difference is that a dynamic transaction may need to block mid-flight until it can grab the resources it needs to proceed.

Note: for some databases having multiple transactions active on a single connection may be unsupported. This could probably be handled in the IndexedDB layer though by using multiple connections under the covers.

-pablo
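For what it's worth, the static-static overlap rule above is small enough to write down. This is only a sketch under the assumption that a static transaction is described by a mode and a list of object store names; canRunConcurrently and the object shape are invented here, and the mode constants just mirror the READ_ONLY / READ_WRITE names used in this thread.

```javascript
const READ_ONLY = "READ_ONLY";
const READ_WRITE = "READ_WRITE";

// Hypothetical description of a static transaction:
//   { mode: READ_ONLY | READ_WRITE, scope: ["storeName", ...] }
// Two static transactions may be active at the same time if their scopes are
// disjoint, or if every store they share is only read by both of them.
function canRunConcurrently(a, b) {
  const sharedStores = a.scope.filter((store) => b.scope.includes(store));
  if (sharedStores.length === 0) return true; // disjoint scopes never conflict
  return a.mode === READ_ONLY && b.mode === READ_ONLY;
}
```

So two READ_ONLY transactions over "words" can overlap, a READ_ONLY and a READ_WRITE transaction over "words" cannot, and a READ_WRITE transaction over "words" can run alongside one over a different store.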
RE: [IndexedDB] Cursors and modifications
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Thursday, July 15, 2010 2:04 AM On Thu, Jul 15, 2010 at 2:44 AM, Jonas Sicking wrote: On Wed, Jul 14, 2010 at 6:20 PM, Pablo Castro wrote: >> > If it's accurate, as a side note, for the async API it seems that this >> > makes it more interesting to enforce callback order, so we can more easily >> > explain what we mean by "before". >> Indeed. >> >> What do you mean by enforce callback order? Are you saying that callbacks >> should be done in the order the requests are made (rather than prioritizing >> cursor callbacks)? (That's how I read it, but Jonas' "Indeed" makes me >> suspect I missed something. :-) That's right. If changes are visible as they are made within a transaction, then reordering the callbacks would have a visible effect. In particular if we prioritize the cursor callbacks then you'll tend to see a callback for a cursor move before you see a callback for say an add/modify, and it's not clear at that point whether the add/modify happened already and is visible (but the callback didn't land yet) or if the change hasn't happened yet. If callbacks are in order, you see changes within your transaction strictly in the order that each request is made, avoiding surprises in cursor callbacks. -pablo
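A small simulation of the two delivery policies may help show the difference. It is a deliberate simplification: it assumes every request in the list has already completed and only its callback is pending, and deliveryOrder plus the seq/kind/label fields are hypothetical names for this sketch.

```javascript
// Each entry stands for a completed request whose callback has not fired yet.
// 'seq' is the order in which the requests were issued (hypothetical field).
function deliveryOrder(completed, prioritizeCursors) {
  const bySeq = (a, b) => a.seq - b.seq;
  if (!prioritizeCursors) {
    // In-order policy: callbacks fire strictly in request order.
    return [...completed].sort(bySeq).map((r) => r.label);
  }
  // Cursor-priority policy: all pending cursor callbacks fire first.
  const cursors = completed.filter((r) => r.kind === "cursor").sort(bySeq);
  const others = completed.filter((r) => r.kind !== "cursor").sort(bySeq);
  return [...cursors, ...others].map((r) => r.label);
}

const pending = [
  { seq: 1, kind: "cursor", label: "cursor at alpha" },
  { seq: 2, kind: "write", label: "put delta" },
  { seq: 3, kind: "cursor", label: "cursor at bravo" },
];

console.log(deliveryOrder(pending, false));
// → [ 'cursor at alpha', 'put delta', 'cursor at bravo' ]
console.log(deliveryOrder(pending, true));
// → [ 'cursor at alpha', 'cursor at bravo', 'put delta' ]
```

In the second ordering the "cursor at bravo" callback runs before the code has been told whether "put delta" landed, which is the ambiguity described above.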
RE: [IndexedDB] Cursors and modifications
Making sure I get the essence of this thread: we're saying that cursors see live changes as they happen on objects that are "after" the object you're currently standing on; and of course, any other activity within a transaction sees all the changes that happened before that activity took place. Is that accurate? If it's accurate, as a side note, for the async API it seems that this makes it more interesting to enforce callback order, so we can more easily explain what we mean by "before". Thanks -pablo From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Wednesday, July 14, 2010 9:27 AM On Wed, Jul 14, 2010 at 5:17 PM, Jonas Sicking wrote: On Wed, Jul 14, 2010 at 5:12 AM, Jeremy Orlow wrote: > On Thu, Jul 8, 2010 at 8:42 PM, Jonas Sicking wrote: >> >> On Mon, Jul 5, 2010 at 9:45 AM, Andrei Popescu wrote: >> > On Sat, Jul 3, 2010 at 2:09 AM, Jonas Sicking wrote: >> >> On Fri, Jul 2, 2010 at 5:44 PM, Andrei Popescu >> >> wrote: >> >>> On Sat, Jul 3, 2010 at 1:14 AM, Jonas Sicking >> >>> wrote: >> >>>> On Fri, Jul 2, 2010 at 4:40 PM, Pablo Castro >> >>>> wrote: >> >>>>> >> >>>>> From: public-webapps-requ...@w3.org >> >>>>> [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking >> >>>>> Sent: Friday, July 02, 2010 4:00 PM >> >>>>> >> >>>>>>> We ran into an complicated issue while implementing IndexedDB. In >> >>>>>>> short, what should happen if an object store is modified while a >> >>>>>>> cursor is >> >>>>>>> iterating it? >> Note that the modification can be done within the >> >>>>>>> same >> >>>>>>> transaction, so the read/write locks preventing several transactions >> >>>>>>> from >> >>>>>>> accessing the same table isn't helping here. 
>> >>>>>>> >> >>>>>>> Detailed problem description (this assumes the API proposed by >> >>>>>>> mozilla): >> >>>>>>> >> >>>>>>> Consider a objectStore "words" containing the following objects: >> >>>>>>> { name: "alpha" } >> >>>>>>> { name: "bravo" } >> >>>>>>> { name: "charlie" } >> >>>>>>> { name: "delta" } >> >>>>>>> >> >>>>>>> and the following program (db is a previously opened IDBDatabase): >> >>>>>>> >> >>>>>>> var trans = db.transaction(["words"], READ_WRITE); var cursor; var >> >>>>>>> result = []; trans.objectStore("words").openCursor().onsuccess = >> >>>>>>> function(e) >> >>>>>>> { >> >>>>>>> cursor = e.result; >> >>>>>>> result.push(cursor.value); >> >>>>>>> cursor.continue(); >> >>>>>>> } >> >>>>>>> trans.objectStore("words").get("delta").onsuccess = function(e) { >> >>>>>>> trans.objectStore("words").put({ name: "delta", myModifiedValue: >> >>>>>>> 17 }); } >> >>>>>>> >> >>>>>>> When the cursor reads the "delta" entry, will it see the >> >>>>>>> 'myModifiedValue' property? Since we so far has defined that the >> >>>>>>> callback >> >>>>>>> order is defined to be >> the request order, that means that put >> >>>>>>> request >> >>>>>>> will be finished before the "delta" entry is iterated by the cursor. >> >>>>>>> >> >>>>>>> The problem is even more serious with cursors that iterate >> >>>>>>> indexes. >> >>>>>>> Here a modification can even affect the position of the currently >> >>>>>>> iterated object in the index, and the modification can (if i'm >> >>>>>>> reading the >> >>>>>>> spec correctly) >> come from the cursor itself. >> >>>>>>> >> &
RE: [IndexedDB] Current editor's draft
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, July 14, 2010 5:43 PM On Wed, Jul 14, 2010 at 5:03 PM, Pablo Castro wrote: > > From: Jonas Sicking [mailto:jo...@sicking.cc] > Sent: Wednesday, July 14, 2010 12:07 AM > >> I think what I'm struggling with is how dynamic transactions will help >> since they are still doing whole-objectStore locking. I'm also curious >> how you envision people dealing with deadlock hazards. Nikunjs >> examples in the beginning of this thread simply throw up their hands >> and report an error if there was a deadlock. That is obviously not >> good enough for an actual application. >> >> So in short, looking forward to an example :) I'll try to come up with one, although I doubt the code itself will be very interesting in this particular case. Not sure what you mean by "they are still doing whole-objectStore locking". The point of dynamic transactions is that they *don't* lock the whole store, but instead have the freedom to choose the granularity (e.g. you could do row-level locking). As for deadlocks, whenever you're doing an operation you need to be ready to handle errors (out of disk, timeout, etc.). I'm not sure why deadlocks are different. If the underlying implementation has deadlock detection then you may get a specific error, otherwise you'll just get a timeout. >> >>> This will likely be extra bad for transactions where no write >> >>> operations are done. In this case failure to call a 'commit()' >> >>> function won't result in any broken behavior. The transaction will >> >>> just sit open for a long time and eventually "rolled back", though >> >>> since no changes were done, the rollback is transparent, and the only >> >>> noticeable effect is that the application halts for a while while the >> >>> transaction is waiting to time out. 
>> >>> >> >>> I should add that the WebSQLDatabase uses automatically committing >> >>> transactions very similar to what we're proposing, and it seems to >> >>> have worked fine there. >> > >> > I find this a bit scary, although it could be that I'm permanently tainted >> > with traditional database stuff. Typical databases follow a presumed abort >> > protocol, where if your code is interrupted by an exception, a process >> > crash or whatever, you can always assume transactions will be rolled back >> > if you didn't reach an explicit call to commit. The implicit commit here >> > takes that away, and I'm not sure how safe that is. >> > >> > For example, if I don't have proper exception handling in place, an >> > illegal call to some other non-indexeddb related API may throw an >> > exception causing the whole thing to unwind, at which point nothing will >> > be pending to do in the database and thus the currently active transaction >> > will be committed. >> > >> > Using the same line of thought we used for READ_ONLY, forgetting to call >> > commit() is easy to detect the first time you try out your code. Your >> > changes will simply not stick. It's not as clear as the READ_ONLY example >> > because there is no opportunity to throw an explicit exception with an >> > explanation, but the data not being around will certainly prompt >> > developers to look for the issue :) >> Ah, I see where we are differing in thinking. My main concern has been >> that of rollbacks, and associated dataloss, in the non-error case. For >> example people forget to call commit() in some branch of their code, >> thus causing dataloss when the transaction is rolled back. >> >> Your concern seems to be that of lack of rollback in the error case, >> for example when an exception is thrown and not caught somewhere in >> the code. In this case you'd want to have the transaction rolled back. 
>> >> One way to handle this is to try to detect unhandled errors and >> implicitly roll back the transaction. Two situations where we could do >> this is: >> 1. When an 'error' event is fired, but where .preventDefault() has is >> not called by any handler. The result is that if an error is ever >> fired, but no one explicitly handles it, we roll back the transaction. >> See also below. >> 2. When a success handler is called, but the handler throws an exception. >> >> The second is a bit of a problem from a spec point of view. I'm not >> sure it is allowed by the DOM Events spec, or by all existi
RE: [IndexedDB] IDBRequest.abort on writing requests
>From my perspective cancelling is not something that happens that often, and >when it happens it's probably ok to cancel the whole transaction. If we can >spec abort() in the transaction object such that it try to cancel all pending >operations and then rollback any work that has been done so far, then we >probably don't need abort on individual operations (with the added value that >it's uniform across read and write operations). -pablo From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Wednesday, July 14, 2010 1:57 AM On Wed, Jul 14, 2010 at 9:14 AM, Jonas Sicking wrote: On Wed, Jul 14, 2010 at 1:02 AM, Jeremy Orlow wrote: > On Wed, Jul 14, 2010 at 8:53 AM, Jonas Sicking wrote: >> >> On Tue, Jul 13, 2010 at 11:33 PM, Jeremy Orlow >> wrote: >> > On Wed, Jul 14, 2010 at 7:28 AM, Jonas Sicking wrote: >> >> >> >> On Tue, Jul 13, 2010 at 11:12 PM, Jeremy Orlow >> >> wrote: >> >> > On Tue, Jul 13, 2010 at 9:41 PM, Jonas Sicking >> >> > wrote: >> >> >> >> >> >> On Tue, Jul 13, 2010 at 1:17 PM, Jeremy Orlow >> >> >> wrote: >> >> >> > On Tue, Jul 13, 2010 at 8:25 PM, Jonas Sicking >> >> >> > wrote: >> >> >> >> >> >> >> >> Hi All, >> >> >> >> >> >> >> >> Sorry if this is something that I've brought up before. I know I >> >> >> >> meant >> >> >> >> to bring this up in the past, but I couldn't find any actual >> >> >> >> emails. >> >> >> >> >> >> >> >> One thing that we discussed while implementing IndexedDB was what >> >> >> >> to >> >> >> >> do for IDBRequest.abort() or "writing" requests. For example on >> >> >> >> the >> >> >> >> request object returned from IDBObjectStore.remove() or >> >> >> >> IDBCursor.update(). >> >> >> >> >> >> >> >> Ideal would of course be if it would cancel the write operation, >> >> >> >> however this isn't always possible. 
If the call to .abort() comes >> >> >> >> after the write operation has already executed in the database, >> >> >> >> but >> >> >> >> before the 'success' event has had a chance to fire. What's worse >> >> >> >> is >> >> >> >> that other write operations might already have been performed on >> >> >> >> top >> >> >> >> of the aborted request. Consider for example the following code: >> >> >> >> >> >> >> >> req1 = myObjectStore.remove(12); >> >> >> >> req2 = myObjectStore.add({ id: 12, name: "Benny Andersson" }); >> >> >> >> do other stuff >> >> >> >> req1.abort(); >> >> >> >> >> >> >> >> In this case, even if the database supported aborting a specific >> >> >> >> operation, it's very hard to say what the correct thing to do >> >> >> >> with >> >> >> >> operations performed after it. As far as I know, databases >> >> >> >> generally >> >> >> >> don't support rolling back a given operation, only rolling back >> >> >> >> to a >> >> >> >> specific point, i.e. rolling back a given operation and all >> >> >> >> operations >> >> >> >> performed after it. >> >> >> >> >> >> >> >> We could say that abort() signals some sort of error if the >> >> >> >> operation >> >> >> >> has already been performed in the database, however that makes >> >> >> >> abort() >> >> >> >> very racy. >> >> >> >> >> >> >> >> Instead we concluded that the best thing to do was to specify >> >> >> >> that >> >> >> >> IDBRequest.abort() should throw if called on a modifying request. >> >> >> >> If >> >> >> >> this sounds good I'll make this change to the spec. >> >> >> > >> >> >> > I'd be fine with that. >> >> >> > Or we could remove abort all together. I can't really think of >> >> >> > what >> >> >> > types >> >> >> > of operations you'd really want to abort until (at least) we have >> >> >> > some >> >> >> > sort >> >> >> > of join language or other mechanism to do really expensive >> >> >> > read-only >> >> >> > calls. >> >> >> >> >> >> I think there are expensive-ish read-only calls. 
Indexes are >> >> >> effectively a join mechanism since you'll hit one b-tree to do the >> >> >> index lookup, and then a second b-tree to look up the full object in >> >> >> the objectStore. >> >> > >> >> > But each individual call (the scope of canceling an IDBRequest) is >> >> > pretty >> >> > short. >> >> > >> >> >> >> >> >> I don't really feel strongly either way. I think abort() isn't too >> >> >> hard to implement, but also doesn't provide a ton of value. At least >> >> >> not, like you say, until we add expensive calls like getAll or >> >> >> multi-step joins. >> >> > >> >> > I agree that when we look at adding such calls we may want to add an >> >> > abort >> >> > on just IDBRequest, but until then I don't think it's a very useful >> >> > feature. >> >> > And being easy to add is not a good reason to lock ourselves into >> >> > a particular design in the future. I think we should remove it until >> >> > there's a good reason for it to exist. >> >> > >> >> >> >> >> >> > Or we could take abort off IDBRequest and instead put a rollback >> >> >> > on >> >> >> > transactions (and not do the modify limitation). >> >> >> >> >> >> I
RE: [IndexedDB] Current editor's draft
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Wednesday, July 14, 2010 12:10 AM On Wed, Jul 14, 2010 at 3:52 AM, Pablo Castro wrote: From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Andrei Popescu Sent: Monday, July 12, 2010 5:23 AM >> >> Dynamic transactions: >> >> I see that most folks would like to see these going away. While I like >> >> the predictability and simplifications that we're able to make by using >> >> static scopes for transactions, I worry that we'll close the door for two >> >> scenarios: background tasks and query processors. Background tasks such >> >> as synchronization and post-processing of content would seem to be almost >> >> impossible with the static scope approach, mostly due to the granularity >> >> of the scope specification (whole stores). Are we okay with saying that >> >> you can't for example sync something in the background (e.g. in a worker) >> >> while your app is still working? Am I missing something that would enable >> >> this class of scenarios? Query processors are also tricky because you >> >> usually take the query specification in some form after the transaction >> >> started (especially if you want to execute multiple queries with later >> >> queries depending on the outcome of the previous ones). The background >> >> tasks issue in particular looks pretty painful to me if we don't have a >> >> way to achieve it without freezing the application while it happens. >> Well, the application should never freeze in terms of the UI locking up, but >> in what you described I could see it taking a while for data to show up on >> the screen. This is something that can be fixed by doing smaller updates on >> the background thread, sending a message to the background thread that it >> should abort for now, doing all database access on the background thread, >> etc. This is an issue regardless, isn't it? 
Let's say you have a worker churning on the database somehow. The worker has no UI or user to wait for, so it'll run in a tight loop at full speed. If it splits the work in small transactions, in cases where it doesn't have to wait for something external there will still be a small gap between transactions. That could easily starve the UI thread that needs to find an opportunity to get in and do a quick thing against the database. As you say the difference between freezing and locking up at this point is not that critical, as the end user in the end is just waiting. >> One point that I never saw made in the thread that I think is really >> important is that dynamic transactions can make concurrency worse in some >> cases. For example, with dynamic transactions you can get into live-lock >> situations. Also, using Pablo's example, you could easily get into a >> situation where the long running transaction on the worker keeps hitting >> serialization issues and thus it's never able to make progress. While it could certainly happen, I don't remember seeing something like a live-lock in a long, long time. Deadlocks are common, but a simple timeout will kill one of the transactions and let the other make progress. A bit violent, but always effective. >> I do see that there are use cases where having dynamic transactions would be >> much nicer, but the amount of non-determinism they add (including to >> performance) has me pretty worried. I pretty firmly believe we should look >> into adding them in v2 and remove them for now. If we do leave them in, it >> should definitely be in its own method to make it quite clear that the >> semantics are more complex. Let's explore a bit more and see where we land. I'm not pushing for dynamic transactions themselves, but more for the scenarios they enable (background processing and such). If we find other ways of doing that, then all the better. Having different entry points is reasonable. 
>> >> Nested transactions: >> >> Not sure why we're considering this an advanced scenario. To be clear >> >> about what the feature means to me: make it legal to start a transaction >> >> when one is already in progress, and the nested one is effectively a >> >> no-op, just refcounts the transaction, so you need equal amounts of >> >> commit()'s, implicit or explicit, and an abort() cancels all nested >> >> transactions. The purpose of this is to allow composition, where a piece >> >> of code that needs a transaction can start one locally, independe
RE: [IndexedDB] Current editor's draft
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, July 14, 2010 12:07 AM >> > Dynamic transactions: >> > I see that most folks would like to see these going away. While I like the >> > predictability and simplifications that we're able to make by using static >> > scopes for transactions, I worry that we'll close the door for two >> > scenarios: background tasks and query processors. Background tasks such as >> > synchronization and post-processing of content would seem to be almost >> > impossible with the static scope approach, mostly due to the granularity >> > of the scope specification (whole stores). Are we okay with saying that >> > you can't for example sync something in the background (e.g. in a worker) >> > while your app is still working? Am I missing something that would enable >> > this class of scenarios? Query processors are also tricky because you >> > usually take the query specification in some form after the transaction >> > started (especially if you want to execute multiple queries with later >> > queries depending on the outcome of the previous ones). The background >> > tasks issue in particular looks pretty painful to me if we don't have a >> > way to achieve it without freezing the application while it happens. >> I don't understand enough of the details here to be able to make a >> decision. The use cases you are bringing up I definitely agree are >> important, but I would love to look at even a rough draft of what code >> you are expecting people will need to write. I'll try and hack up and example. 
In general any scenario that has a worker and the UI thread working on the same database will be quite a challenge, because the worker will have to a) split the work into small pieces, even if it was naturally a bigger chunk and b) consider interleaving implications with the UI thread; otherwise, even when the work is split in chunks, you're not guaranteed that one of the two won't starve the other one (the worker running in a tight loop will effectively always have an active transaction, it'll just be changing the actual transaction from time to time). This can certainly happen with dynamic transactions as well, the only difference is that since the locking granularity is different, it may be that what you're working on in the worker and in the UI threads is independent enough that they don't interfere too much, allowing for some more concurrency. >> What I suggest is that we keep dynamic transactions in the spec for >> now, but separate the API from static transactions, start a separate >> thread and try to hammer out the details and see what we arrive at. I >> do want to clarify that I don't think dynamic transactions are >> particularly hard to implement, I just suspect they are hard to use >> correctly. Sounds reasonable. >> > Implicit commit: >> > Does this really work? I need to play with sample app code more, it may >> > just be that I'm old-fashioned. For example, if I'm downloading a bunch of >> > data from somewhere and pushing rows into the store within a transaction, >> > wouldn't it be reasonable to do the whole thing in a transaction? In that >> > case I'm likely to have to unwind while I wait for the next callback from >> > XmlHttpRequest with the next chunk of data. >> You definitely want to do it in a transaction. In our proposal there >> is no way to even call .get or .put if you aren't inside a >> transaction. For the case you are describing, you'd download the data >> using XMLHttpRequest first.
Once the data has been downloaded you >> start a transaction, parse the data, and make the desired >> modifications. Once that is done the transaction is automatically >> committed. >> >> The idea here is to avoid keeping transactions open for long periods >> of time, while at the same time making the API easier to work with. >> I'm very concerned that any API that requires people to do: >> >> startOperation(); >>... do lots of stuff here ... >> endOperation(); >> >> people will forget to do the endOperation call. This is especially >> true if the startOperation/endOperation calls are spread out over >> multiple different asynchronously called functions, which seems to be >> the use case you're concerned about above. One very easy way to >> "forget" to call endOperation is if something inbetween the two >> function calls throw an exception. Fair enough, maybe I need to think of this scenario differently, and if someone needs to download a bunch of data and then put it in the database atomically the right way is to download to work tables first over a long time and independent transactions, and then use a transaction only to move the data around into its final spot. >> This will likely be extra bad for transactions where no write >> operations are done. In this case failure to call a 'commit()' >> function won't result in any broken behavior. The transaction will >> just sit open for a long time and eventually "rolled back", though >
RE: [IndexedDB] Current editor's draft
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Andrei Popescu Sent: Monday, July 12, 2010 5:23 AM Sorry I disappeared for a while. Catching up with this discussion was an interesting exercise...there is no particular message in this thread I can respond to, so I thought I'd just reply to the last one. Overall I think the new proposal is shaping up well and is being effective in simplifying scenarios. I do have a few suggestions and questions for things I'm not sure I see all the way. READ_ONLY vs READ_WRITE as defaults for transactions: To be perfectly honest, I think this discussion went really deep over an issue that won't be a huge deal for most people. My perspective, trying to avoid performance or usage frequency speculation, is around what's easier to detect. Concurrency issues are hard to see. On the other hand, whenever we can throw an exception, we give explicit guidance that unblocks people right away. For this case I suspect it's best to default to READ_ONLY, because if someone doesn't read or think about it and just uses the stuff and tries to change something they'll get a clear error message saying "if you want to change stuff, use READ_WRITE please". The error is not data- or context-dependent, so it'll fail on first try at most once per developer and once they fix it they'll know for all future cases. Dynamic transactions: I see that most folks would like to see these going away. While I like the predictability and simplifications that we're able to make by using static scopes for transactions, I worry that we'll close the door for two scenarios: background tasks and query processors. Background tasks such as synchronization and post-processing of content would seem to be almost impossible with the static scope approach, mostly due to the granularity of the scope specification (whole stores). Are we okay with saying that you can't for example sync something in the background (e.g.
in a worker) while your app is still working? Am I missing something that would enable this class of scenarios? Query processors are also tricky because you usually take the query specification in some form after the transaction started (especially if you want to execute multiple queries with later queries depending on the outcome of the previous ones). The background tasks issue in particular looks pretty painful to me if we don't have a way to achieve it without freezing the application while it happens. Implicit commit: Does this really work? I need to play with sample app code more, it may just be that I'm old-fashioned. For example, if I'm downloading a bunch of data form somewhere and pushing rows into the store within a transaction, wouldn't it be reasonable to do the whole thing in a transaction? In that case I'm likely to have to unwind while I wait for the next callback from XmlHttpRequest with the next chunk of data. I understand that avoiding it results in nicer patterns (e.g. db.objectStores("foo").get(123).onsuccess = ...), but in practice I'm not sure if that will hold given that you still need error callbacks and such. Nested transactions: Not sure why we're considering this an advanced scenario. To be clear about what the feature means to me: make it legal to start a transaction when one is already in progress, and the nested one is effectively a no-op, just refcounts the transaction, so you need equal amounts of commit()'s, implicit or explicit, and an abort() cancels all nested transactions. The purpose of this is to allow composition, where a piece of code that needs a transaction can start one locally, independently of whether the caller had already one going. Schema versioning: It's unfortunate that we need to have explicit elements in the page for the versioning protocol to work, but the fact that we can have a reliable mechanism for pages to coordinate a version bump is really nice. 
For folks that don't know about this the first time they build it, an explicit error message on the schema change timeout can explain where to start. I do think that there may be a need for non-breaking changes to the schema to happen without a "version dance". For example, query processors regularly create temporary tables during sorts and such. Those shouldn't require any coordination (maybe we allow non-versioned additions, or we just introduce temporary, unnamed tables that evaporate on commit() or database close()...). Thanks -pablo
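The refcounting idea described under "Nested transactions" above can be sketched in a few lines. This is only an illustration and not anything from the draft: the class name RefCountedTransaction, its begin()/commit()/abort() methods and the log array are all made up for the example, and a real implementation would sit on top of an actual store.

```javascript
// Hypothetical sketch of refcounted "nested" transactions.
// None of these names come from the IndexedDB draft.
class RefCountedTransaction {
  constructor() {
    this.depth = 0;      // how many begin() calls are still open
    this.state = "idle"; // "idle" | "active" | "committed" | "aborted"
    this.log = [];       // records what would reach the real store
  }

  begin() {
    // Only the outermost begin() starts real work; inner ones just count.
    if (this.depth === 0) {
      this.state = "active";
      this.log.push("BEGIN");
    }
    this.depth += 1;
  }

  commit() {
    if (this.state !== "active") throw new Error("no active transaction");
    this.depth -= 1;
    // The real commit happens when the outermost caller commits.
    if (this.depth === 0) {
      this.state = "committed";
      this.log.push("COMMIT");
    }
  }

  abort() {
    if (this.state !== "active") throw new Error("no active transaction");
    // An abort at any depth cancels every nesting level at once.
    this.depth = 0;
    this.state = "aborted";
    this.log.push("ROLLBACK");
  }
}

// Composition: each helper starts a transaction locally, whether or not
// its caller already has one going.
function saveItem(tx, item) {
  tx.begin();
  // ... write `item` here ...
  tx.commit();
}

function saveAll(tx, items) {
  tx.begin();
  for (const item of items) saveItem(tx, item);
  tx.commit();
}

const tx = new RefCountedTransaction();
saveAll(tx, ["a", "b"]);
console.log(tx.log.join(" ")); // BEGIN COMMIT
```

The point of the sketch is that saveItem() works the same whether it is called on its own or from inside saveAll(); only the outermost pair touches the store.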
RE: [IndexedDB] Cursors and modifications
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking
Sent: Friday, July 02, 2010 4:00 PM

>> We ran into a complicated issue while implementing IndexedDB. In short,
>> what should happen if an object store is modified while a cursor is
>> iterating it? Note that the modification can be done within the same
>> transaction, so the read/write locks preventing several transactions from
>> accessing the same table aren't helping here.
>>
>> Detailed problem description (this assumes the API proposed by Mozilla):
>>
>> Consider an objectStore "words" containing the following objects:
>> { name: "alpha" }
>> { name: "bravo" }
>> { name: "charlie" }
>> { name: "delta" }
>>
>> and the following program (db is a previously opened IDBDatabase):
>>
>> var trans = db.transaction(["words"], READ_WRITE);
>> var cursor;
>> var result = [];
>> trans.objectStore("words").openCursor().onsuccess = function(e) {
>>   cursor = e.result;
>>   result.push(cursor.value);
>>   cursor.continue();
>> }
>> trans.objectStore("words").get("delta").onsuccess = function(e) {
>>   trans.objectStore("words").put({ name: "delta", myModifiedValue: 17 });
>> }
>>
>> When the cursor reads the "delta" entry, will it see the 'myModifiedValue'
>> property? Since we have so far defined the callback order to be the
>> request order, the put request will be finished before the "delta" entry
>> is iterated by the cursor.
>>
>> The problem is even more serious with cursors that iterate indexes.
>> Here a modification can even affect the position of the currently iterated
>> object in the index, and the modification can (if I'm reading the spec
>> correctly) come from the cursor itself.
>>
>> Consider the following objectStore "people" with keyPath "name"
>> containing the following objects:
>>
>> { name: "Adam", count: 30 }
>> { name: "Bertil", count: 31 }
>> { name: "Cesar", count: 32 }
>> { name: "David", count: 33 }
>> { name: "Erik", count: 35 }
>>
>> and an index "countIndex" with keyPath "count". What would the following
>> code do?
>>
>> results = [];
>> db.objectStore("people", READ_WRITE).index("countIndex").openObjectCursor().onsuccess = function (e) {
>>   cursor = e.result;
>>   if (!cursor) {
>>     alert(results);
>>     return;
>>   }
>>   if (cursor.value.name == "Bertil") {
>>     cursor.update({name: "Bertil", count: 34 });
>>   }
>>   results.push(cursor.value.name);
>>   cursor.continue();
>> };
>>
>> What does this alert? Would it alert "Adam,Bertil,Erik" as the cursor would
>> stay on the "Bertil" object as it is moved in the index? Or would it alert
>> "Adam,Bertil,Cesar,David,Bertil,Erik" as we would iterate "Bertil" again at
>> its new position in the index?

My first reaction is that, both from the expected-behavior perspective (the transaction is the scope of isolation) and from the implementation perspective, it would be better to see live changes if they happened in the same transaction as the cursor (over a store or index). So in your example you would iterate one of the rows twice. Keeping order and membership stable would mean creating another scope of isolation within the transaction, which to me would be unusual, and it would probably be quite painful to implement without spilling a copy of the records to disk (at least a copy of the keys/order if you don't care about protecting from changes that don't affect membership/order; some databases call these keyset cursors).

>>
>> We could say that cursors always iterate snapshots, however this introduces
>> MVCC. Though it seems to me that SNAPSHOT_READ already does that.
Actually, even with MVCC you'd see your own changes, because they happen in the same transaction so the buffer pool will use the same version of the page. While it may be possible to reuse the MVCC infrastructure, it would still require the introduction of a second scope for stability. >> >> We could also say that cursors iterate live data though that can be pretty >> confusing and forces the implementation to deal with entries being added and >> >> removed during iteration, and it'd be tricky to define all edge cases. Would this be any different from the implementation perspective than dealing with changes that happen through other transactions once they are committed? Typically at least in non-MVCC systems committed changes that are "further ahead" in a cursor scan end up showing up even when the cursor was opened before the other transaction committed. >> >> It's certainly debatable how much of a problem any of these edgecases are >> for users. Note that all of this is only an issue if you modify and read >> from the >> same records *in the same transaction*. I can't think of a case >> where it isn't trivial to avoid these problems by separating things into >> separate transactions. >> However it'd be nice to avoid creating foot-guns >> for people to play with (think of the childre
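To make the two candidate behaviours for the "people" / "countIndex" example concrete, here is a small in-memory simulation. It is a sketch under stated assumptions and does not use the IndexedDB API at all: makeStore, indexEntries, iterateLive, iterateSnapshot and run are hypothetical helpers, the "index" is simply re-sorted on every step, and a live cursor is assumed to resume from the (count, name) position it last returned.

```javascript
// Hypothetical in-memory model of the "people" store; not the IndexedDB API.
function makeStore() {
  return new Map([
    ["Adam", { name: "Adam", count: 30 }],
    ["Bertil", { name: "Bertil", count: 31 }],
    ["Cesar", { name: "Cesar", count: 32 }],
    ["David", { name: "David", count: 33 }],
    ["Erik", { name: "Erik", count: 35 }],
  ]);
}

// Index entries ordered by (count, name), rebuilt from the current store.
function indexEntries(store) {
  return [...store.values()]
    .map((v) => ({ key: v.count, primaryKey: v.name }))
    .sort((a, b) =>
      a.key - b.key ||
      (a.primaryKey < b.primaryKey ? -1 : a.primaryKey > b.primaryKey ? 1 : 0));
}

// Live iteration: each step looks up the first entry after the last
// position in the index as it exists right now.
function iterateLive(store, visit) {
  let last = null;
  for (;;) {
    const next = indexEntries(store).find((e) =>
      last === null ||
      e.key > last.key ||
      (e.key === last.key && e.primaryKey > last.primaryKey));
    if (!next) return;
    last = next;
    visit(store.get(next.primaryKey), store);
  }
}

// Snapshot iteration: copy the values once, then ignore later changes.
function iterateSnapshot(store, visit) {
  const snapshot = indexEntries(store).map((e) => ({ ...store.get(e.primaryKey) }));
  for (const value of snapshot) visit(value, store);
}

// Mirrors the quoted program: bump Bertil's count to 34 when visited.
function run(iterate) {
  const store = makeStore();
  const results = [];
  iterate(store, (value, s) => {
    if (value.name === "Bertil") s.set("Bertil", { name: "Bertil", count: 34 });
    results.push(value.name);
  });
  return results.join(",");
}

console.log(run(iterateLive));     // Adam,Bertil,Cesar,David,Bertil,Erik
console.log(run(iterateSnapshot)); // Adam,Bertil,Cesar,David,Erik
```

Under these assumptions the live variant visits "Bertil" twice, which matches the "iterate one of the rows twice" outcome, while the snapshot variant needs its own copy of the records.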
RE: [IndexedDB] Multi-value keys
+1 on composite keys in general. The alternative to the proposal below would be to have the actual key path specification include multiple members (e.g. db.createObjectStore("foo", ["a", "b"])). I like the proposal below as well, I just wonder if having the key path specification (that's external to the object) indicate which members are keys would be less invasive for scenarios where you already have javascript objects you're getting from a web service or something and want to store them "as is". -pablo From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Friday, June 18, 2010 4:08 PM Hi All, One thing that (if I'm reading the spec correctly) is currently impossible is to create multi-valued keys. Consider for example an object store containing objects like: { firstName: "Sven", lastName: "Svensson", age: 57 } { firstName: "Benny", lastName: "Andersson", age: 63 } { firstName: "Benny", lastName: "Bedrup", age: 9 } It is easy to create an index which lets you quickly find everyone with a given firstName or a given lastName. However it doesn't seem possible to create an index that finds everyone with a given firstName *and* lastName, or sort the list of people based on firstName and then lastName. The best thing you could do is to concatenate the firstname and lastname and insert a ascii-null character in between and then use that as a key in the index. However this doesn't work if firstName or lastName can contain null characters. Also, if you want to be able to sort by firstName and then age there is no good way to put all the information into a single string while having sorting work. Generally the way this is done in SQL is that you can create an index on multiple columns. That way each row has multiple values as the key, and sorting is first done on the first value, then the second, then the third etc. However since we don't really have columns we can't use that exact solution. 
Instead, the way we could allow multiple values is to add an additional type as keys: Arrays. That way you can use ["Sven", 57], ["Benny", 63] and ["Benny", 9] as keys for the respective objects above. This would allow sorting and searching on firstName and age. The way that array keys would be compared is that we'd first compare the first item in both arrays. If they are different the arrays are ordered the same way as the two first-values are ordered. If they are the same you look at the second value and so on. If you reach the end of one array before finding a difference then that array is sorted before the other. We'd also have to define the order if an array is compared to a non-array value. It doesn't really matter what we say here, but I propose that we put all arrays after all non-arrays. Note that I don't think we need to allow arrays to contain arrays. That just seems to add complication without adding additional functionality. Let me know what you think. / Jonas
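The comparison rule proposed above can be written out as a comparator. This is a minimal sketch of that proposal only: compareKeys and compareScalars are names invented here, arrays are assumed not to contain arrays, and items at the same position are assumed to have the same type (how a string compares to a number is left out).

```javascript
// Compare two non-array values of the same type (assumption: both strings
// or both numbers; mixed types are outside this sketch).
function compareScalars(a, b) {
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

// Hypothetical comparator following the proposal in the message above.
function compareKeys(a, b) {
  const aIsArray = Array.isArray(a);
  const bIsArray = Array.isArray(b);
  // All arrays sort after all non-arrays.
  if (aIsArray !== bIsArray) return aIsArray ? 1 : -1;
  if (!aIsArray) return compareScalars(a, b);
  // Compare item by item until a difference is found.
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    const c = compareScalars(a[i], b[i]);
    if (c !== 0) return c;
  }
  // No difference in the shared part: the shorter array sorts first.
  return Math.sign(a.length - b.length);
}

const keys = [["Sven", 57], ["Benny", 63], "zeta", ["Benny", 9], ["Benny"]];
keys.sort(compareKeys);
console.log(JSON.stringify(keys));
// ["zeta",["Benny"],["Benny",9],["Benny",63],["Sven",57]]
```

The string key "zeta" and the one-element key ["Benny"] are not in the original example; they are included only to exercise the non-array and shorter-array rules.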
RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Friday, June 11, 2010 3:20 PM >> >> >> So there is a real likelyhood of a browser implementation that >> >> >> will predate it's associated JS engine's upgrade to ES5? >> >> >> Feeling a "concern" isn't really much of technical argument on >> >> >> it's own, and designing for outdated technology is a poor approach. >> >> I don't think there is, just wanted to avoid imposing it. If you >> >> think it's really important then let's change it back to delete >> >> assuming other folks are good with it. >> >> >> I had the same concerns Pablo did, but I don't feel strongly >> >> either way. Besides the maneuvering we'll have to do on the C++ side of things to avoid clashes with language keywords, the question is whether we expect plugins and such to add support for IndexedDB in existing browsers that don't do ES5. For example: http://code.google.com/p/firebreath/wiki/FireBreathUsers >> >> Before we close on this, let me validate one more thing independently >> of the JS version. Are we going to have trouble when trying to expose >> these interfaces in C++? Not sure about other compilers and IDL >> processing tools, but I'm playing around with Visual Studio 2010 and >> while the COM IDL compiler will take "delete" as an interface member, >> my C++ compiler really doesn't like it. As far as I know there is no >> standard syntax to indicate that a symbol wasn't meant to be a >> keyword in C++, so having "delete" (or other C++ keywords for that >> matter) would be problematic. Am I missing something? > > Good point. Does anyone have a strong opinion on how much we should > care about reserved word conflicts in language other than JavaScript? > it seems like a slippery slope. > As an example, "IDBDatabase.description" is actually used by the > ObjectiveC base object class and so this caused some problems > initially. 
We worked around it by having the ObjectiveC bindings > generator add a suffix whenever an attribute named "description" is > hit. (Something similar was done for "hash" and "id" in other APIs.) > To be honest, I hadn't even considered bringing this up and asking for > it to be changed, but if we're going to avoid delete because it's a > reserved word in JavaScript (pre v5) and/or because it's a reserved > word in C++, perhaps we should consider changing description as well? >> We've had to do this a few times in the past already. One example was >> Window.postMessage where we couldn't use the name "PostMessage" in C++ >> because it was a predefined macro on some platform (windows iirc, not to >> point fingers ;) ). :) >> We developed a similar trick where we can indicate in the IDL that different >> names are used for scripted languages and for compiled languages. >> So all in all I believe this problem can be overcome. I prefer to focus on >> making the JS API be the best it can be, and let other languages take a back >> seat. As long as it's solvable without too much of an issue (such as large >> performance penalties) in other languages. I agree we can sort this out and certainly limitations on the implementation language shouldn't surface here. The issue is more whether folks care about a C++ binding (or some other language with a similar issue) where we'll have to have a different name for this method. Even though I've been bringing this up I'm ok with keeping delete(), I just want to make sure we understand all the implications that come with that. -pablo
RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Friday, June 11, 2010 3:20 AM Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline February 2 On Fri, Jun 11, 2010 at 1:54 AM, Pablo Castro wrote: From: Kris Zyp [mailto:k...@sitepen.com] Sent: Thursday, June 10, 2010 4:38 PM Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline February 2 >> >> So there is a real likelyhood of a browser implementation that will >> >> predate it's associated JS engine's upgrade to ES5? Feeling a >> >> "concern" isn't really much of technical argument on it's own, and >> >> designing for outdated technology is a poor approach. >> I don't think there is, just wanted to avoid imposing it. If you think it's >> really important then let's change it back to delete assuming other folks >> are good with it. >> I had the same concerns Pablo did, but I don't feel strongly either way. Before we close on this, let me validate one more thing independently of the JS version. Are we going to have trouble when trying to expose these interfaces in C++? Not sure about other compilers and IDL processing tools, but I'm playing around with Visual Studio 2010 and while the COM IDL compiler will take "delete" as an interface member, my C++ compiler really doesn't like it. As far as I know there is no standard syntax to indicate that a symbol wasn't meant to be a keyword in C++, so having "delete" (or other C++ keywords for that matter) would be problematic. Am I missing something? -pablo
RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2
From: Kris Zyp [mailto:k...@sitepen.com] Sent: Thursday, June 10, 2010 4:38 PM Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline February 2 >> On 6/10/2010 4:15 PM, Pablo Castro wrote: >> > >> >>> From: public-webapps-requ...@w3.org >> >>> [mailto:public-webapps-requ...@w3.org] On Behalf Of Kris Zyp >> >>> Sent: Thursday, June 10, 2010 9:49 AM Subject: Re: Seeking >> >>> pre-LCWD comments for Indexed Database API; deadline February >> >>> 2 >> > >> >>> I see that in the trunk version of the spec [1] that delete() >> >>> was changed to remove(). I thought we had established that >> >>> there is no reason to make this change. Is anyone seriously >> >>> expecting to have an implementation prior to or without ES5's >> >>> contextually unreserved keywords? I would greatly prefer >> >>> delete(), as it is much more consistent with standard DB and >> >>> REST terminology. >> > >> > My concern is that it seems like taking an unnecessary risk. I >> > understand the familiarity aspect (and I like delete() better as >> > well), but to me that's not a strong enough reason to use it and >> > potentially cause trouble in some browser. >> > >> So there is a real likelyhood of a browser implementation that will >> predate it's associated JS engine's upgrade to ES5? Feeling a >> "concern" isn't really much of technical argument on it's own, and >> designing for outdated technology is a poor approach. I don't think there is, just wanted to avoid imposing it. If you think it's really important then let's change it back to delete assuming other folks are good with it. -pablo
RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2
>> From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Kris Zyp
>> Sent: Thursday, June 10, 2010 9:49 AM
>> Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline February 2
>>
>> I see that in the trunk version of the spec [1] that delete() was
>> changed to remove(). I thought we had established that there is no
>> reason to make this change. Is anyone seriously expecting to have an
>> implementation prior to or without ES5's contextually unreserved
>> keywords? I would greatly prefer delete(), as it is much more
>> consistent with standard DB and REST terminology.

My concern is that it seems like taking an unnecessary risk. I understand the familiarity aspect (and I like delete() better as well), but to me that's not a strong enough reason to use it and potentially cause trouble in some browser.

-pablo
RE: [IndexedDB] Event on commits (WAS: Proposal for async API changes)
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Thursday, June 10, 2010 1:27 PM Subject: Re: [IndexedDB] Event on commits (WAS: Proposal for async API changes) >> >> >>> One of the things that will entail is a by-sequence index for all >> >> >>> the >> >> >>> changes in a give "database" (in my case a database will be scoped >> >> >>> to >> >> >>> more than one ObjectStore). In order to accomplish this I'll need >> >> >>> to >> >> >>> keep the last known sequence around so that each new write can >> >> >>> create >> >> >>> a new entry in the by-sequence index. The problem is that if >> >> >>> another >> >> >>> tab/window writes to the database it'll increment that sequence and >> >> >>> I >> >> >>> won't be notified so I would have to start every transaction with a >> >> >>> check on the sequence index for the last sequence which seems like >> >> >>> a >> >> >>> lot of extra cursor calls. >> >> >> >> >> >> It would be a lot of extra calls, but I'm a bit hesitant to add much >> >> >> more >> >> >> API surface area to v1, and the fall back plan doesn't seem too >> >> >> unreasonable. >> >> >> >> >> >>> >> >> >>> What I really need is an event listener on an ObjectStore that >> >> >>> fires >> >> >>> after a transaction is committed to the store but before the next >> >> >>> transaction is run that gives me information about the commits to >> >> >>> the >> >> >>> ObjectStore. >> >> >>> >> >> >>> Thoughts? >> >> >> >> >> >> To do this, we could specify an >> >> >> IndexedDatabaseRequest.ontransactioncommitted event that would >> >> >> be guaranteed to fire after every commit and before we started the >> >> >> next >> >> >> transaction. I think that'd meet your needs and not add too much >> >> >> additional >> >> >> surface area... What do others think? 
>> >> > >> >> > It sounds reasonable but, to clarify, it seems to me that >> >> > 'ontransactioncommitted' can only be guaranteed to fire after every >> >> > commit and before the next transaction starts in the current window. >> >> > Other transactions may have already started in other windows. >> >> >> >> We could technically enforce that other transactions won't be allowed >> >> to start until the event has fired in all windows that has the >> >> database open. >> > >> > Sure, but I can't think of any reason you'd want such semantics. Can >> > you? >> >> I'm not entirely sure what the requirements are, so not sure. >> >> If the requirement is that you are always notified about changes to a >> table before those changes start affecting reads, so that you can keep >> some separate information in sync, then we need to block further >> transactions until the event has been fired in all relevant windows. > > This would only make sense if all the oncommit handlers were started in > their own transaction so that you could at least read data. Otherwise all > you know is that something changed--so you wouldn't really have much to go > on for the goal of "keep[ing] some separate information in sync". Or you'd > then have to schedule a transaction which wouldn't necessarily run before > other stuff is updated which means there was no point for us to block > transactions on everyone being notified anyway. The only reason I can think > of why it'd matter is if your app was doing synchronization via other means > as well, but I can't immediately think of any places where waiting on all to > be notified would save you, even then. >> >> Possibly it would be ok to allow windows that has already received the >> transaction to start reading the updated data though. That should make >> this have virtually no performance impact. > > We should think VERY carefully about anything that has a perf impact. But > what I originally suggested should have a small one at worst, I would think. 
> >> >> But yes, we should definitely figure out what the actual requirements are. > > Agreed. >> >> >> Either way though, I'm wondering how this relates to the fact that you >> >> can (in our proposal, I'm unclear what the current draft allows) have >> >> several writing transactions at the same time, as long as they operate >> >> on different tables. Here another transaction might have already >> >> started by the time another transaction is committed. Be that in this >> >> window or another. >> > >> > That's only true of dynamic transactions. >> >> That isn't true in at least our proposal (again, I'm unclear what the >> current draft allows). In our proposal you can have multiple static >> write transactions in progress at the same time. As long as they don't >> overlap in which objectStores they use. > > I was assuming the sequence number would be stored in a single objectStore. >Ah, I see what you mean. Good point. >Mikeal, could you describe in detail how you were planning on using this event. We should drill more into the actual requirements. I would be really wary of introducing constructs that require cross-process coo
RE: [IndexDB] Collation Algorithm?
>> From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Mikeal Rogers
>> Sent: Wednesday, June 09, 2010 2:42 PM
>> Subject: [IndexDB] Collation Algorithm?
>>
>> One of the things I noticed that seems to be missing from the IndexDB
>> specification is the collation algorithm used for sorting the index
>> keys.
>>
>> There are lots of collation differences between databases, if left
>> unspecified I'm afraid this would negatively affect interoperability
>> between IndexDB implementations.
>>
>> CouchDB has a good collation specification for rich keys (any JSON
>> type) and defers to the Unicode Collation Algorithm once it hits
>> string comparisons. This might be a good starting point.
>>
>> http://wiki.apache.org/couchdb/View_collation#Collation_Specification
>> http://www.unicode.org/reports/tr10/
>>
>> -Mikeal

We've touched on this in the past but haven't closed on a plan. I agree that this needs to be specified. I suspect that this will mean we'll have to take a collation name at some level (database, index) if we want to allow apps to get proper order for strings for different languages. I filed a bug to make sure we track this.

-pablo
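A short sketch of why the choice of collation matters for index order. It uses only standard JavaScript: the default Array.prototype.sort order (by UTF-16 code units) and Intl.Collator, which postdates this thread and is used here purely as an illustration of a named, language-aware collation. The sample strings and the "en" locale are assumptions made for the example, not anything proposed in the discussion.

```javascript
// "\u00C9mile" is "Émile" with a precomposed É (code unit 0x00C9).
const names = ["Zebra", "apple", "\u00C9mile"];

// Default sort compares UTF-16 code units: 'Z' (0x5A) < 'a' (0x61) < 'É' (0xC9).
const byCodeUnit = [...names].sort();
console.log(byCodeUnit.join(",")); // Zebra,apple,Émile

// A language-aware collation, selected by name. The exact result depends on
// the collation data shipped with the engine, so no output is asserted here.
const collator = new Intl.Collator("en");
const byCollator = [...names].sort(collator.compare);
console.log(byCollator.join(","));
```

Two stores that sort the same keys with these two comparators would return cursor results in different orders, which is the interoperability concern raised above.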
RE: Can IndexedDB depend on JavaScript? (WAS: [Bug 9793] New: Allow dates and floating point numbers in keys)
From: Jeremy Orlow Sent: Tuesday, May 25, 2010 6:54 AM >> On Mon, May 24, 2010 at 9:21 PM, Jonas Sicking wrote: >> On Sat, May 22, 2010 at 3:58 AM, Jeremy Orlow wrote: >> > On Fri, May 21, 2010 at 11:42 PM, wrote: >> >> >> >> http://www.w3.org/Bugs/Public/show_bug.cgi?id=9793 >> >> >> >> Summary: Allow dates and floating point numbers in keys >> >> Product: WebAppsWG >> >> Version: unspecified >> >> Platform: All >> >> OS/Version: All >> >> Status: NEW >> >> Severity: normal >> >> Priority: P2 >> >> Component: Indexed Database API >> >> AssignedTo: nikunj.me...@oracle.com >> >> ReportedBy: pablo.cas...@microsoft.com >> >> QAContact: member-webapi-...@w3.org >> >> CC: m...@w3.org, public-webapps@w3.org >> >> >> >> >> >> Currently the spec requires the values referenced by the key path to be >> >> integers or strings. I strongly believe that we should also allow dates >> >> and >> >> floating point numbers (am I missing any other important types?). While >> >> dates >> >> and floating point numbers alone are not good for a primary key, they are >> >> important for non-unique indexes and as part of a composite key, allowing >> >> for >> >> things such as scanning in temporal order. >> >> >> >> This is the change I'd like to propose: >> >> >> >> Section "3.1.1 Keys" of the currently published draft reads: >> >> >> >> - >> >> In order to efficiently retrieve records stored in an indexed database, a >> >> user >> >> agent needs to organize each record by its key. Conforming user agents >> >> must >> >> support the use of values of IDL data types [WEBIDL] DOMString and long as >> >> well >> >> as the value null as keys. >> >> >> >> For purposes of comparison, a DOMString key is always evaluated higher >> >> than any >> >> long key. Moreover, null always evaluates lower than any DOMString or long >> >> key. 
>> >> - >> >> >> >> New proposed text: >> >> >> >> - >> >> In order to efficiently retrieve records stored in an indexed database, a >> >> user >> >> agent needs to organize each record by its key. Conforming user agents >> >> must >> >> support the use of values of IDL data types [WEBIDL] DOMString, long, >> >> float, >> >> and the Date JavaScript object >> > >> > We really need to decide, once and for all, whether or not IndexedDB is >> > going to be tied to JavaScript or not. The two major reasons to do so are >> > the lack of date in WebIDL and keyPath. >> > KeyPath may be tricky to spec in a way that would work for any language >> > without cutting out a lot of flexibility. In order to keep what we're >> > speccing sane, it will probably need to be a pretty small subset of what's >> > possible in JavaScript and thus even browsers will likely need to roll >> > their >> > own parser and such to support it. (If we do decide to depend on >> > JavaScript, it should enable some really neat things with the keyPath as >> > well.) >> > The HTML spec defines its own date type, but does not specify sort order at >> > all. I started a thread on this a bit ago (subject: "[IndexedDB/WebIDL] >> > Dates + Sorting (WAS: Detailed comments for the current draft)") but it >> > only >> > got one response [3]. >> Note that a Date type for WebIDL doesn't really affect things a whole >> lot for the interfaces in IndexedDB though. The relevant functions all >> take 'any' as type though, so we'll still have to describe in prose >> what types are permitted. I don't think this makes IndexedDB depend on >> javascript though. Closing the loop on this one. Now that we agreed to add some language to WebIDL for the Date type [1], should we go ahead and make this change to the spec? I can ask Eliot to do it so we can close this one if folks feel it makes sense. Thanks -pablo [1] http://www.mail-archive.com/public-webapps@w3.org/msg08939.html
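As a rough illustration of what a cross-type key ordering could look like once floats and dates are allowed, here is a minimal sketch. The names `Key`, `typeRank` and `compareKeys` are hypothetical. Placing null lowest and strings above numbers follows the draft text quoted above; where Date values fall relative to the other types is purely an assumption made for this example.

```typescript
type Key = null | number | Date | string;

// Rank of each key type. null lowest and string above number follow the
// quoted draft text; ranking Date between number and string is an assumption.
function typeRank(k: Key): number {
  if (k === null) return 0;
  if (typeof k === "number") return 1;
  if (k instanceof Date) return 2;
  return 3;
}

function compareKeys(a: Key, b: Key): number {
  const ra = typeRank(a);
  const rb = typeRank(b);
  if (ra !== rb) return ra < rb ? -1 : 1;
  if (typeof a === "number" && typeof b === "number") {
    return a < b ? -1 : a > b ? 1 : 0;
  }
  if (a instanceof Date && b instanceof Date) {
    const ta = a.getTime();
    const tb = b.getTime();
    return ta < tb ? -1 : ta > tb ? 1 : 0;
  }
  if (typeof a === "string" && typeof b === "string") {
    return a < b ? -1 : a > b ? 1 : 0; // code unit order
  }
  return 0; // both null
}
```

A comparator of this shape is enough to scan a non-unique index in temporal or numeric order, which is the scenario the bug report asks for.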
RE: [IndexedDB] Proposal for async API changes
(still catching up on the rest of the long thread of API changes, will get back to that a bit later) From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Thursday, May 20, 2010 3:34 PM >> >> On Thu, May 20, 2010 at 11:25 PM, Shawn Wilsher >> >> wrote: >> >> On 5/20/2010 7:34 AM, Shawn Wilsher wrote: >> >> So far it's really just that joins are painful in IndexedDB. I'm working >> >> on a blog post on this very topic though, and I'll be sure to point >> >> everyone in this thread to it (I figure this is useful stuff to get out >> >> to a wider audience). >> >> And honestly, I thought that we had discussed joins on this list, but I >> >> only see a thread from Pablo mentioning it, but no real discussions. >> >> Should we start that? >> Joins were actually in the original spec but taken out during the effort to >> simplify the API greatly. IIRC, the main reason why Nikunj took them out is >> that we believed you could fairly efficiently join yourself if you had 2 >> sorted lists and because we didn't see a simple way to do them without >> introducing a lot of API surface area or creating (or borrowing) some sort >> of syntax for the joins. (Now that I think about it, though, maybe doing >> this is not that big of a leap from what we're going to need to do to spec >> keyPaths. I'm starting to wonder if we need to rethink that as well) >> Anyway, the decision was made so long ago that maybe it's worth re-opening >> the discussion. I'll hunt through my mail archives tomorrow and start a new >> thread with references to any original bits of info I can find. My main concern with joins, besides API surface, was that in order to implement joins you need to choose an actual strategy. Depending on whether you have indexes or not and other circumstances you could choose to do range scans/lookups, a merge join, etc. 
So at least for fancier libraries this would only be of partial help, as they would probably want to do their own joins sometimes. I'm happy to explore again though. It's certainly the case that for simpler cases it might help users pull off tasks without depending on a library. I do wonder if we should try and land the async API first. -pablo
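For reference, the "join it yourself over two sorted lists" strategy mentioned above is a merge join. The following is a minimal sketch under the assumption that both inputs arrive already sorted by the join key, as two index cursors would deliver them; `Row` and `mergeJoin` are hypothetical names, not part of any draft.

```typescript
interface Row {
  key: number;
  value: string;
}

// Merge join over two inputs that are already sorted by key.
// Emits one pair for every matching (left, right) combination.
function mergeJoin(left: Row[], right: Row[]): Array<[Row, Row]> {
  const out: Array<[Row, Row]> = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    if (left[i].key < right[j].key) {
      i++;
    } else if (left[i].key > right[j].key) {
      j++;
    } else {
      const k = left[i].key;
      // Find the end of the run of equal keys on the right side.
      let jEnd = j;
      while (jEnd < right.length && right[jEnd].key === k) jEnd++;
      // Pair every left row in the run with every right row in the run.
      for (; i < left.length && left[i].key === k; i++) {
        for (let r = j; r < jEnd; r++) out.push([left[i], right[r]]);
      }
      j = jEnd;
    }
  }
  return out;
}
```

This is only one of the strategies listed above; when just one side has an index, a lookup per row may be the better choice, which is part of why a library may prefer to pick its own plan.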
[IndexedDB] Interaction between transactions and objects that allow multiple operations
The interaction between transactions and objects that allow multiple operations is giving us trouble. I need to elaborate a little to explain the problem.

You can perform operations in IndexedDB with or without an explicitly started transaction. When no transaction is present, you get an implicit one that lasts for the duration of the operation and is committed at the end (or rolled back if an error occurs).

There are a number of operations in IndexedDB that are a single step. For example, store.put() occurs either entirely in the current transaction (if the user started one explicitly) or in an implicit transaction if there isn't one active at the time the operation starts. The interaction between the operation and transactions is straightforward in this case.

On the other hand, other operations in IndexedDB return an object that then allows multiple operations on it. For example, when you open a cursor over a store, you can then move to the next row, update a row, delete a row, etc. The question is, what is the interaction between these operations and transactions? Are all interactions with a given cursor supposed to happen within the transaction that was active (implicit or explicit) when the cursor was opened? Or should each interaction happen in its own transaction (unless there is a long-lived active transaction, of course)?

We have a few options:

a) make multi-step objects bound to the transaction that was present when the object is first created (or an implicit one if none was present). This requires new APIs to mark cursors and such as "done" so implicit transactions can commit/abort, and has issues around use of the database object while a cursor with an implicit transaction is open.

b) make each interaction happen in its own transaction (explicit or implicit). This is quite unusual and means you'll get inconsistent reads from row to row while scanning unless you wrap cursor/index scans in transactions. It also probably poses interesting implementation challenges depending on what you're using as your storage engine.

c) require an explicit transaction always, along the lines of Nikunj's original proposal. We would move most methods from database to transaction (except a few properties such as version, which it may still be ok to handle implicitly from the transaction's perspective). This eliminates the whole problem at the cost of an extra step that is always required.

We would prefer to go with option c) and always require explicit transactions. Thoughts?

Thanks
-pablo
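To illustrate the shape of option c), here is a toy in-memory sketch in which data can only be reached through an explicitly created transaction. None of these names (`ToyDatabase`, `ToyTransaction`, `put`, `get`, `commit`, `abort`) come from the draft; they are assumptions made purely to show the extra step and what it buys.

```typescript
type Rows = { [key: string]: unknown };

function hasRow(rows: Rows, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(rows, key);
}

// Hypothetical transaction object: writes are buffered until commit().
class ToyTransaction {
  private pending: Rows = {};
  private finished = false;

  constructor(private committed: Rows) {}

  private check(): void {
    if (this.finished) throw new Error("transaction already finished");
  }

  put(key: string, value: unknown): void {
    this.check();
    this.pending[key] = value;
  }

  get(key: string): unknown {
    this.check();
    // Reads see this transaction's own uncommitted writes first.
    if (hasRow(this.pending, key)) return this.pending[key];
    return hasRow(this.committed, key) ? this.committed[key] : undefined;
  }

  commit(): void {
    this.check();
    for (const key of Object.keys(this.pending)) {
      this.committed[key] = this.pending[key];
    }
    this.finished = true;
  }

  abort(): void {
    this.check();
    this.pending = {};
    this.finished = true;
  }
}

// Hypothetical database object: the only way to touch data is a transaction.
class ToyDatabase {
  private data: Rows = {};

  transaction(): ToyTransaction {
    return new ToyTransaction(this.data);
  }
}
```

With this shape a cursor or any other multi-step object would simply belong to the transaction it was created from, so the question of which transaction each step runs in does not arise.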
RE: [IndexedDB] Dynamic Transactions (WAS: Lots of small nits and clarifying questions)
On Apr 21, 2010, 11:18 PM Nikunj Mehta wrote: On Apr 21, 2010, at 5:11 PM, Jeremy Orlow wrote: On Mon, Apr 19, 2010 at 11:44 PM, Nikunj Mehta wrote: On Mar 15, 2010, at 10:45 AM, Jeremy Orlow wrote: On Mon, Mar 15, 2010 at 3:14 PM, Jeremy Orlow wrote: On Sat, Mar 13, 2010 at 9:02 AM, Nikunj Mehta wrote: On Feb 18, 2010, at 9:08 AM, Jeremy Orlow wrote: >> 2) In the spec, dynamic transactions and the difference between static and >> dynamic are not very well explained. >> >> Can you propose spec text? >> >> In 3.1.8 of http://dev.w3.org/2006/webapi/WebSimpleDB/ in the first >> paragraph, adding a sentence would probably be good enough. "If the scope >> is dynamic, the transaction may use any object stores or indexes in the >> database, but if another transaction touches any of the resources in a >> manner that could not be serialized by the implementation, a RECOVERABLE_ERR >> exception will be thrown on commit." maybe? >> >> By the way, are there strong use cases for Dynamic transactions? The more >> that I think about them, the more out of place they seem. >> >> Dynamic transactions are in common place use in server applications. It >> follows naturally that client applications would want to use them. >> >> There are a LOT of things that are common place in server applications that >> are not in v1 of IndexedDB. >> >> Consider the use case where you want to view records in entityStore A, >> while, at the same time, modifying another entityStore B using the records >> in entityStore A. Unless you use dynamic transactions, you will not be able >> to perform the two together. >> >>...unless you plan ahead. The only thing dynamic transactions buy you is not >>needing to plan ahead about using resources. >> >> The dynamic transaction case is particularly important when dealing with >> asynchronous update processing while keeping the UI updated with data. I strongly agree that dynamic transactions are important. 
Funnily enough we were considering proposing the other extreme: dropping all the static modes in favor of dynamic. This is not only about being able to transport server-service code to the client, but more generally about supporting modes of operation where the complete set of objects you'll use in a transaction is dependent upon things you'll only find out as you process the transaction; this includes the particular case where your application will make decisions based on data in the same database, so there is no way to plan ahead short of locking the whole thing. >> 1) Treat Dynamic transactions as "lock everything". >> >> This is not consistent with the spec behavior. Locking everything is the >> static global scope. >> >> I don't understand what you're trying to say in the second sentence. And I >> don't understand how this is inconsistent with spec behavior--it's simply >> more conservative. One of my main concerns around being overly conservative, and with the static locking model in general, is its impact on concurrency. While the client scenarios of IndexedDB don't have the same pressure for concurrency as server databases, things like synchronization and other background processing tasks do need a base level of concurrency to operate in a user-friendly way. >> 2) Implement MVCC so that dynamic transactions can operate on >> a consistent view of data. (At times, we'll know a transaction is doomed >> long before commit, but we'll need to let it keep running since only >> .commit() can raise the proper error.) >> >> MVCC is not required for dynamic transactions. MVCC is only required to open >> a database in the DETACHED_READ mode. >> >> Since locks are acquired in the order in which they are requested, a failure >> could occur when an object store is being opened, but it is locked by >> another transaction. One doesn't have to wait until commit is invoked. >> >> Am I missing something here? 
>> >> If we really expect UAs to implement MVCC (or something else along those >> lines), I would expect other more advanced transaction concepts to be >> exposed. >> >> What precisely are you referring to? Why are these other more advanced >> transaction concepts required? >> >> >> If we expect most v1 implementations to just use objectStore locks and thus >>use option 1, then is there any reason to include Dynamic transactions? >> >> Why do you conclude that most implementations just use object store locks? We were actually favoring use of the dynamic pattern. Note that other than the failure mode (which is a separate discussion we should have), you can do dynamic using regular locks instead of versioning if you follow the two-phase protocol[1]; that still results in a serializable schedule, although not with point-in-time consistency. More in general, I'm a bit worried about the number of options around transactions. I understand the goal of creating an "error free" model where once you succeed at starting a transaction you know you won't
RE: [IndexedDB] Lots of small nits and clarifying questions
On Tue, March 30, 2010 at 2:53 AM, Jeremy Orlow wrote: >> On Tue, Mar 30, 2010 at 9:10 AM, Pablo Castro >> wrote: >> Sorry for having disappeared for a while, "odata" was keeping me busy. I >> agree with all the clarifications listed in this thread that are required, >> so I won't redundantly mark each with "same here", but I have a few comments >> on one or two of them below. >> On Mon, Mar 15, 2010 at 8:14 AM, Jeremy Orlow wrote: >> On Sat, Mar 13, 2010 at 9:02 AM, Nikunj Mehta wrote: >> Thanks for your patience. Most questions below don't seem to need new spec >> text. >> On Feb 18, 2010, at 9:08 AM, Jeremy Orlow wrote: >> >> 1) Structured clone is going to change over time. And, realistically, >> >> UAs won't support every type right away anyway. What do we do when a >> >> value is inserted that we do not support? >> >> We will evolve the text as and when the same evolves in WebStorage. >> >> I don't know of any implementations which have moved away from only >> >> allowing strings within WebStorage. I suspect that not >> >> fully supporting the structured clone algorithm as specced is one of the >> >> reasons for this. >> >> As far as I can tell, you're essentially saying that fully supporting the >> >> the structured clone algorithm a pre-req for IndexedDB? I guess I can't >> >> argue too much with that, but I'm not sure how realistic it is. I know >> >> we only half support it at the moment in Chromium. I have the same worry about structured clones...it's right in principle but I can't see implementations converging and that will just hurt interoperability. Unfortunately there doesn't seem to be a well-known middle-ground. JSON is way too restrictive (e.g. no Date). Should we consider defining a subset of structured clones that work (maybe something like Javascript primitives plus Date plus whatever extra we feel we should include such as perhaps File objects)? 
>> There is some precedent for what you suggest: the spec for LocalStorage >> already specifies that storing ImageData isn't allowed. >> (http://dev.w3.org/html5/webstorage/#the-storage-interface see setItem >> section.) >> On the other hand, I'm not sure I like the idea of each API supporting >> different subsets of the structured clone algorithm. Even if all UAs >> support the same subset for each API, it still seems fairly confusing to web >> developers. And I'm guessing that UAs won't be too keen on adding more >> complex control flow to their structured clone implementations to disallow >> different parts of the algorithm based on what it's using. Thus any specced >> subset of the algorithm will probably need to be a MAY not a MUST. >> I still think we should spec an error to be returned when the UA doesn't >> fully support the structured clone algorithm and thus can't handle the data >> provided. I agree it's sub-optimal, but I think it's the pragmatic choice. >> Especially if the structured clone algorithm ever changes (and thus >> implementations can fall out of compliance with the spec). I agree with that concern, but I also worry that we'll end up with UAs implementing different subsets and then developers having to settle for the lowest common denominator or doing a bunch of guesswork. Maybe we use structured clone but add some non-normative text that recommends a reasonable subset that we can all agree to implement consistently? -pablo
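As a strawman for what such a recommended subset might mean in practice, here is a minimal sketch of a check that accepts only primitives, Date, arrays and plain objects. The function name `inStorableSubset` and the exact membership of the subset are assumptions for illustration; nothing here is proposed spec text.

```typescript
// Hypothetical subset: primitives, Date, arrays, and plain objects.
function inStorableSubset(value: unknown, seen: unknown[] = []): boolean {
  if (value === null) return true;
  const t = typeof value;
  if (t === "string" || t === "number" || t === "boolean" || t === "undefined") {
    return true;
  }
  if (t !== "object") return false; // functions, symbols, etc.
  if (value instanceof Date) return true;
  // Already being checked higher up the graph (cycle or shared reference).
  if (seen.indexOf(value) !== -1) return true;
  seen.push(value);
  if (Array.isArray(value)) {
    return value.every(v => inStorableSubset(v, seen));
  }
  // Only plain objects qualify; RegExp, class instances and the like do not.
  const proto = Object.getPrototypeOf(value);
  if (proto !== Object.prototype && proto !== null) return false;
  const obj = value as { [key: string]: unknown };
  return Object.keys(obj).every(k => inStorableSubset(obj[k], seen));
}
```

A UA that cannot store a given value could use a check of this kind to raise the error discussed above at the time of the put, rather than failing later or silently dropping data.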
RE: [IndexedDB] Promises (WAS: Seeking pre-LCWD comments for Indexed Database API; deadline February 2)
On Fri, Mar 12, 2010 at 7:26 AM, Jeremy Orlow wrote: On Fri, Mar 12, 2010 at 3:23 PM, Jeremy Orlow wrote: On Fri, Mar 12, 2010 at 3:04 PM, Kris Zyp wrote: >> I believe computer science has clearly >> observed the fragility of passing callbacks to the initial function >> since it conflates the concerns of the operation with the asynchronous >> notifications and consequently greatly complicates composability. >> I don't understand this sentence. I'm pretty sure that you can wrap any >> callback based API in JavaScript with a promise, deferred, etc. based API. >> As >> Nikunj mentioned earlier, we're more concerned about creating a small >> API surface area and sticking with well understood API designs rather than >> eliminating the need for libraries that wrap IndexedDB. Trying to digest this thread, I think we've sort of gone full-circle with the whole promises thing. When looking at the code with the chained "then" pattern I just love the result, but it seems that we can't get all the way there (and nesting instead of chaining stuff kind of lacks the magic). My take is that either we get the really nice pattern by going all the way or we create a more traditional callback/events-based API and then we build promises on top. Things seem to indicate that frameworks are still cooking on promises, so it may be safe to stay with callbacks/events and just build libraries on top (I would have loved to have this be the thing that saved us from needing a library always...but it seems we'll fall just a bit short). As for callbacks versus events, while now I'm starting to get used to the events hooked up to the result object after the call, the callbacks may be a more natural mechanism for this particular usage. I'm not sure why this is fundamentally broken...would love to see examples or a reference. If that's the case, then events are the obvious choice. Thanks -pablo
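The claim that a callback-based API can always be wrapped by a library into a promise-based one can be sketched as follows. This is a minimal illustration, assuming a callback convention of `(error, result)`; `getAsync` is a hypothetical stand-in for an asynchronous store read, and `promisify` is a hypothetical helper, neither being part of any IndexedDB draft.

```typescript
type Callback<T> = (error: Error | null, result?: T) => void;

// Hypothetical callback-style operation standing in for an async store read.
function getAsync(key: string, done: Callback<string>): void {
  setTimeout(() => {
    if (key === "") done(new Error("empty key"));
    else done(null, "value-for-" + key);
  }, 0);
}

// Wrap an (arg, callback) function so that it returns a promise instead.
function promisify<A, T>(
  fn: (arg: A, done: Callback<T>) => void
): (arg: A) => Promise<T> {
  return (arg: A) =>
    new Promise<T>((resolve, reject) => {
      fn(arg, (error, result) => {
        if (error) reject(error);
        else resolve(result as T);
      });
    });
}

const get = promisify(getAsync);

// The chained "then" style, built purely on top of the callback API:
// the result of the first read is used as the key of the second.
const chained = get("a").then(v => get(v));
```

The wrapper lives entirely in library code, which is the point made above: the platform can ship callbacks or events and leave the promise layer to frameworks while they are still settling on a design.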
RE: [IndexedDB] Lots of small nits and clarifying questions
Sorry for having disappeared for a while, "odata" was keeping me busy. I agree with all the clarifications listed in this thread that are required, so I won't redundantly mark each with "same here", but I have a few comments on one or two of them below. On Mon, Mar 15, 2010 at 8:14 AM, Jeremy Orlow wrote: On Sat, Mar 13, 2010 at 9:02 AM, Nikunj Mehta wrote: Thanks for your patience. Most questions below don't seem to need new spec text. On Feb 18, 2010, at 9:08 AM, Jeremy Orlow wrote: >> 6) The specific ordering of elements should probably be specced including a >> mix of types. >> >> Can you propose spec text for this? What do you think about the text >> in http://www.w3.org/TR/IndexedDB/#key-construct? >> >> If we're only adding long long for v1, then I think language similar to >> what's there now is probably OK. But now that I think about it, I'm a bit >> concerned that we might be backing ourselves into a corner for the future. >> I also noticed that the sort order of JavaScript seems to order it numbers, >> strings, and then nulls (not strings, numbers, nulls). >> I wonder if there is some other spec on sort order we can cite rather than >> rolling our own. I really think that just doing long/strings won't do, even for v1. For non-primary-key indexes we'll need at least Date and number (not just integers) in addition to long/string. Without that there is no ordering by "date sent" for emails or "list price" for products or lots of other scenarios where you're caching data coming from a server. >> 2) What happens when data mutates while you're iterating via a cursor? >> >> This is covered by http://www.w3.org/TR/IndexedDB/#dfn-mode >> >> That applies to two separate transactions. As far as I can tell, it should >> be possible to have a cursor open and then delete an element that the cursor >> is currently traversing all within the same transaction. Am I missing >> something? 
I was assuming that within the same transaction you could change rows and those changes would be observable from open cursors. If it happens to be the current row then you won't be able to fetch it anymore but you can still move to the next one and continue scanning (and seeing any new changes that happened since you last moved). >> 1) Structured clone is going to change over time. And, realistically, UAs >> won't support every type right away anyway. What do we do when a value is >> inserted that we do not support? >> We will evolve the text as and when the same evolves in WebStorage. >> I don't know of any implementations which have moved away from only allowing >> strings within WebStorage. I suspect that not fully supporting the >> structured clone algorithm as specced is one of the reasons for this. >> As far as I can tell, you're essentially saying that fully supporting the >> the structured clone algorithm a pre-req for IndexedDB? I guess I can't >> argue too much with that, but I'm not sure how realistic it is. I know we >> only half support it at the moment in Chromium. I have the same worry about structured clones...it's right in principle but I can't see implementations converging and that will just hurt interoperability. Unfortunately there doesn't seem to be a well-known middle-ground. JSON is way too restrictive (e.g. no Date). Should we consider defining a subset of structured clones that work (maybe something like Javascript primitives plus Date plus whatever extra we feel we should include such as perhaps File objects)? Thanks -pablo
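On the cursor point at the start of this message, one way to get "changes are observable from open cursors" is for the cursor to remember only its current key and re-seek on every move. The sketch below assumes a toy in-memory store; `ToyStore` and `ToyCursor` are hypothetical names and this is not a claim about how any implementation works.

```typescript
// Hypothetical in-memory store keyed by number.
class ToyStore {
  private rows: { [key: number]: string } = {};

  put(key: number, value: string): void {
    this.rows[key] = value;
  }

  remove(key: number): void {
    delete this.rows[key];
  }

  sortedKeys(): number[] {
    return Object.keys(this.rows).map(Number).sort((a, b) => a - b);
  }
}

// Hypothetical cursor that keeps only its current key, not a snapshot.
class ToyCursor {
  private current: number | null = null;

  constructor(private store: ToyStore) {}

  // Move to the smallest key strictly greater than the current one,
  // looking at the store as it is now. Returns null when there is none.
  next(): number | null {
    for (const k of this.store.sortedKeys()) {
      if (this.current === null || k > this.current) {
        this.current = k;
        return k;
      }
    }
    return null;
  }
}
```

Deleting the row the cursor sits on does not break the scan, because the next move only needs the remembered key, and rows added ahead of the cursor are picked up as the scan reaches them.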
RE: [IndexedDB] Detailed comments for the current draft
On Mon, Feb 1, 2010 at 1:30 AM, Jeremy Orlow wrote: > > > 1. Keys and sorting > > > a. 3.1.1: it would seem that having also date/time values as keys > > > would be important and it's a common sorting criteria (e.g. as part of a > > > composite primary key or in general as an index key). > > The Web IDL spec does not support a Date/Time data type. Could your use > > case be supported by storing the underlying time with millisecond precision > > using an IDL long long type? I am willing to change the spec so that it > > allows long long instead of long IDL type, which will provide adequate > > support for Date and time sorting. > Can the spec not be augmented? It seems like other specs like WebGL have > created their own types. If not, I suppose your suggested change would > suffice as well. This does seem like an important use case. I agree, either we could augment the spec or we could describe it in terms of Javascript object values. That is, we can say something specific about the treatment of Javascript's Date object. Would that be possible? E.g. we could require implementations to provide full order for dates if they find an instance of that type in a path. > > > b. 3.1.1: similarly, sorting on number in general (not just > > > integers/longs) would be important (e.g. price lists, scores, etc.) > > I am once again hampered by Web IDL spec. Is it possible to leave this for > > future versions of the spec? Actually Web IDL does define the "double" type and its Javascript binding. Can we add double to the list of types an index can be applied to? > > > c. 3.1.1: cross type sorting and sorting of long values are clear. > > > Sorting of strings however needs more elaboration. In particular, which > > > collation do we use? Does the user or developer get to choose a > > > collation? If we pick up a collation from the environment (e.g. the OS), > > > if the collation changes we'd have to re-index all the databases. 
> > I propose to use Unicode collation algorithm, which was also suggested by > > Jonas during a conversation. I don't think this is specific enough, in that it still doesn't say which collation tables to use and how to specify them. A single collation strategy won't do for all languages (it'll range from slightly wrong to nonsense depending on the target language). This is a trickier area than I had initially thought. We'll bake on this a bit and get back to this group with ideas. > > > d. 3.1.3: spec reads ".key path must be the name of an enumerated > > > property."; how about composite keys (would make the related APIs take a > > > DOMString or DOMStringList) > > I prefer to leave composite keys to a future version. I don't think we can get away with this. For indexes this is quite common (if nothing else, to have stable ordering when the prefix of the index has repeats). Once we have it for indexes the delta for having it for primary keys as well is pretty small (although I wouldn't oppose leaving out composite primary keys if that would help scope the feature). > > > b. Query processing libraries will need temporary stores, which need > > > temporary names. Should we introduce an API for the creation of temporary > > > stores with transaction lifetime and no name? > > Firstly, I think we can leave this safely to a future version. Secondly, my > > suggestion would be to provide a parameter to the create call to indicate > > that an object store being created is a transient one, i.e., not backed by > > durable storage. They could be available across different transactions. If > > your intention is to not make these object stores unavailable across > > connections, then we can also offer a connection-specific transient object > > store. > > In general, it requires us to introduce the notion of create params, which > > would simplify the evolution of the API. 
This is also similar to how > > Berkeley DB handles various options, not just those related to creation of > > a Berkeley "database". Let's see how we progress on this one, and maybe revisit it a bit later. I'm worried about code that wants to do things such as a block-sort that needs to spill to disk, as it would have to either use some pattern or ask the user for temp table names. > > > c. It would be nice to have an estimate row count on each store. > > > This comes at an implementation and runtime cost. Strong opinions? > > > Lacking everything else, this would be the only statistic to base > > > decisions on for a query processor. > > I believe we need to have a general way of estimating the number of records > > in a cursor once a key range has been specified. Kris Zyp also brings this > > up in a separate email. I am willing to add an estimateCount attribute to > > IDBCursor for this. EstimateCount sounds good. > > > d. The draft does not touch on how applications would do optimistic > > > concurrency. A common way
RE: Seeking pre-LCWD comments for Indexed Database API; deadline February 2
A few comments inline marked with [PC]. From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Nikunj Mehta Sent: Sunday, January 31, 2010 11:37 PM To: Kris Zyp Cc: Arthur Barstow; public-webapps Subject: Re: Seeking pre-LCWD comments for Indexed Database API; deadline February 2 On Jan 27, 2010, at 1:46 PM, Kris Zyp wrote: A few comments I've been meaning to suggest: * count on KeyRange - Previously I had asked if there would be a way to get a count of the number of objects within a given key range. The addition of the KeyRange interface seems to be a step towards that, but the cursor generated with a KeyRange still only provides a "count" property that returns "the total number of objects that share the current key". There is still no way to determine how many objects are within a range. Was the intent to make "count" return the number of objects in a KeyRange and the wording is just not up to date? Otherwise could we add such a count property (countForRange maybe, or have a count and countForKey, I think Pablo suggested something like that). I agree with the concept. I have doubts about implementation success. However, I will include this in the editor's draft. [PC] I agree with Nikunj, I suspect that implementations will have to just compute the count, as it's unlikely that updating intermediate nodes in the tree for each update would be desired (to try to maintain extra information for fast range size computation). At that point it's almost the same as user code iterating over the range (modulo the Javascript interface overhead). I'm also not sure how often you'd use this, as it would only work on simple conditions (no composite expressions, no functions in expressions) that happen to have an index. 
* Use promises for async interfaces - In server side JavaScript, most projects are moving towards using promises for asynchronous interfaces instead of trying to define the specific callback parameters for each interface. I believe the advantages of using promises over callbacks are pretty well understood in terms of decoupling async semantics from interface definitions, and improving encapsulation of concerns. For the indexed database API this would mean that sync and async interfaces could essentially look the same except sync would return completed values and async would return promises. I realize that defining a promise interface would have implications beyond the indexed database API, as the goal of promises is to provide a consistent interface for asynchronous interaction across components, but perhaps this would be a good time for the W3C to define such an API. It seems like the indexed database API would be a perfect interface to leverage promises. If you are interested in proposal, there is one from CommonJS here [1] (the get() and call() wouldn't apply here). With this interface, a promise.then(callback, errorHandler) function is the only function a promise would need to provide. Thanks for the pointer. I will look in to this as even Pablo had related requirements. [1] http://wiki.commonjs.org/wiki/Promises and a comment on this: On 1/26/2010 1:47 PM, Pablo Castro wrote: > 11. API Names > > a. "transaction" is really non-intuitive (particularly given > the existence of currentTransaction in the same class). > "beginTransaction" would capture semantics more accurately. b. > ObjectStoreSync.delete: delete is a Javascript keyword, can we use > "remove" instead? I'd prefer to keep both of these as is. Since commit and abort are part of the transaction interface, using transaction() to denote the transaction creator seems brief and appropriate. As far as ObjectStoreSync.delete, most JS engines have or should be contextually reserving "delete". 
I certainly prefer delete in preserving the familiarity of REST terminology. [PC] I understand the term familiarity aspect, but this seems to be something that would just cause trouble. From a quick check with the browsers I had at hand, both IE8 and Safari 4 reject scripts where you try to add a method called "delete" to an object's prototype. Natively-implemented objects may be able to work-around this but I see no reason to push it. remove() is probably equally intuitive. Note that the method "continue" on async cursors is likely to have the same issue as continue is also a Javascript keyword. Thanks, -- Kris Zyp SitePen (503) 806-1841 http://sitepen.com -pablo
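For context on the keyword issue: ECMAScript 5 and later allow reserved words as property names after a dot, while ECMAScript 3 era parsers do not, although the bracket form with a string literal is accepted by both. The sketch below uses a hypothetical `ToyObjectStore` class (not the draft's interface) to show a `delete` method alongside a `remove` alias.

```typescript
// Hypothetical store class, for illustration only.
class ToyObjectStore {
  private rows: { [key: string]: string } = {};

  put(key: string, value: string): void {
    this.rows[key] = value;
  }

  has(key: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.rows, key);
  }

  // Reserved words are legal property names in ECMAScript 5 and later.
  delete(key: string): void {
    delete this.rows[key];
  }

  // An alias whose name is not a keyword in any edition.
  remove(key: string): void {
    this.delete(key);
  }
}

const objectStore = new ToyObjectStore();
objectStore.put("a", "1");
objectStore.put("b", "2");
objectStore.put("c", "3");

objectStore.delete("a");    // dot form: parses on ES5+ engines only
objectStore["delete"]("b"); // bracket form: also parses on ES3-era engines
objectStore.remove("c");    // keyword-free alias
```

Either spelling can be made to work on engines that follow ES5; the question raised above is about the script authors and engines that do not.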
[IndexedDB] Detailed comments for the current draft
These are notes that we collected both from reviewing the spec (editor's draft up to Jan 24th) and from a prototype implementation that we are working on. I didn't realize we had this many notes, otherwise I would have been sending intermediate notes early. Will do so next round.

1. Keys and sorting

a. 3.1.1: it would seem that having also date/time values as keys would be important, as it's a common sorting criterion (e.g. as part of a composite primary key or in general as an index key).
b. 3.1.1: similarly, sorting on numbers in general (not just integers/longs) would be important (e.g. price lists, scores, etc.)
c. 3.1.1: cross type sorting and sorting of long values are clear. Sorting of strings however needs more elaboration. In particular, which collation do we use? Does the user or developer get to choose a collation? If we pick up a collation from the environment (e.g. the OS), if the collation changes we'd have to re-index all the databases.
d. 3.1.3: spec reads "…key path must be the name of an enumerated property…"; how about composite keys (would make the related APIs take a DOMString or DOMStringList)

2. Values

a. 3.1.2: isn't the requirement for "structured clones" too much? It would mean implementations would have to be able to store and retrieve File objects and such. Would it be more appropriate to say it's just graphs of Javascript primitive objects/values (object, string, number, date, arrays, null)?

3. Object store

a. 3.1.3: do we really need in-line + out-of-line keys? Besides the concept-count increase, we wonder whether out-of-line keys would cause trouble to generic libraries, as the values for the keys wouldn't be part of the values iterated when doing a "foreach" over the table.
b. Query processing libraries will need temporary stores, which need temporary names. Should we introduce an API for the creation of temporary stores with transaction lifetime and no name?
c. It would be nice to have an estimated row count on each store. This comes at an implementation and runtime cost. Strong opinions? Lacking everything else, this would be the only statistic to base decisions on for a query processor.
d. The draft does not touch on how applications would do optimistic concurrency. A common way of doing this is to use a timestamp value that's automatically updated by the system every time someone touches the row. While we don't feel it's a must have, it certainly supports common scenarios.

4. Indexes

a. 3.1.4 mentions "auto-populated" indexes, but then there is no mention of other types. We suggest that we remove this and in the algorithms section describe side-effecting operations as always updating the indexes as well.
b. If during insert/update the value of the key is not present (i.e. undefined as opposed to null or a value), is that a failure, does the row not get indexed, or is it indexed as null? Failure would probably cause a lot of trouble to users; the other two have correctness problems. An option is to index them as undefined, but now we have undefined and null as indexable keys. We lean toward this last option.

5. Databases

a. Not being able to enumerate databases gets in the way of creating good tools and frameworks such as database explorers. What was the motivation for this? Is it security related?
b. Clarification on transactions: all database operations that affect the schema (create/remove store/index, setVersion, etc.) as well as data modification operations are assumed to be auto-commit by default, correct? Furthermore, all those operations (both schema and data) can happen within a transaction, including mixing schema and data changes. Does that line up with others' expectations? If so we should find a spot to articulate this explicitly.
c. No way to delete a database? It would be reasonable for applications to want to do that and let go of the user data (e.g. a "forget me" feature in a web site)

6. Transactions

a. While we understand the goal of simplifying developers' life with an error-free transactional model, we're not sure if we're doing more harm by introducing more concepts into this space. Wouldn't it be better to use regular transactions with a well-known failure mode (e.g. either deadlocks or optimistic concurrency failure on commit)?
b. If in auto-commit mode, if two cursors are opened at the same time (e.g. to scan them in an interleaved way), are they in independent transactions simultaneously active in the same connection?

7. Algorithms

a. 3.2.2: steps 4 and 5 are inverted in order.
b. 3.2.2: when there is a key generator and the store uses in-line keys, should the generated key value be propagated to the original object (in addition to the clone), such that both are in sync after the put operation?
c. 3.2.3: step 2, probably editorial mistake? Wouldn't all indexes have
RE: IndexedDB and MVCC
Hi Chris,

> -----Original Message-----
> From: public-webapps-requ...@w3.org [mailto:public-webapps-
> requ...@w3.org] On Behalf Of Chris Anderson
> Sent: Friday, January 15, 2010 11:14 AM
> To: public-webapps WG
> Subject: IndexedDB and MVCC
>
> Hi,
>
> I've been reading the new IndexedDB spec as published here:
> http://www.w3.org/TR/IndexedDB/
>
> My first impression is that this is simpler than WebSimpleDB, but not too
> simple. I'm happy to see detached readers being mentioned.
>
> There's one other piece of the concurrency story that could be useful.
>
> In section 3.2.2 Object Store Storage steps
>
> step 7: If the no-overwrite flag was passed to these steps and is set,
> and a record already exists with its key being key, then terminate
> these steps and set error code CONSTRAINT_ERR.
>
> I think it wouldn't add much complexity to use a compare-and-swap
> pattern, instead of a no-write-if-exists pattern. This would allow for
> better concurrency via optimistic updates, and look a lot like HTTP
> etags.

Wouldn't these be different scenarios? The purpose of the flag is to help in scenarios where you don't want to automatically create an item, only update an existing one. What you're describing seems to be oriented towards the case where you're updating an existing item, have an optimistic concurrency token, and want to use it to check for conflicts before the update goes through.

You definitely make a good point about the fact that the current document doesn't touch on how applications would handle optimistic concurrency. One way would be to build in support for it (as you suggest, an optional path for the concurrency token, and perhaps also a timestamp sort of thing that gets automatically updated). Alternatively, application code could do the check-and-update-or-fail deal within a transaction.

>
> It could be accomplished by allowing an object store to take a
> key-path for the update-token. Then subsequent updates could require
> that the key-path match. (Some additional complexity: we'd need the
> ability to check for a matching update-token, then change it, in a
> transaction).
>
> CouchDB uses an MVCC token that must match to allow updates. This
> allows us to avoid locking. But even more important is the parallels
> we have with HTTP Etags (if-match for idempotence, if-none-match for
> caching).
>
> The CouchDB style of MVCC can be accomplished by updates in a
> compare-and-swap transaction, so technically I can do what I want in
> the spec as it stands. But I still think the parallels to HTTP etags
> can be instructive.

Out of curiosity: if you were to layer CouchDB on top of IndexedDB, would you always just use the dynamic locking mode, or do you actually have use for the other options offered? I ask because I'm seriously concerned that the extra modes will add to the overall concept count in an attempt to simplify the use of transactions, and don't really simplify the end to end.

>
> Chris
>
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
>

Thanks
-pablo
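The compare-and-swap idea discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: createStore, putIfMatch and the integer tokens are hypothetical names, the store is a plain in-memory Map, and the single-threaded check-then-write stands in for what would have to happen inside a transaction in a real implementation.

```javascript
// Hypothetical in-memory object store with one update token per record.
function createStore() {
  const records = new Map(); // key -> { value, token }
  let nextToken = 1;

  return {
    get(key) {
      return records.get(key); // undefined when the key is absent
    },
    // Writes only if the caller's token matches the stored one.
    // Passing `undefined` means "the record must not exist yet".
    putIfMatch(key, value, expectedToken) {
      const current = records.get(key);
      const currentToken = current ? current.token : undefined;
      if (currentToken !== expectedToken) {
        // Conflict: someone else wrote after the caller last read.
        return { ok: false, token: currentToken };
      }
      const token = nextToken++;
      records.set(key, { value, token });
      return { ok: true, token };
    },
  };
}

const store = createStore();
const created = store.putIfMatch("doc1", { title: "draft" }, undefined);
const firstWriter = store.putIfMatch("doc1", { title: "v2" }, created.token);
const staleWriter = store.putIfMatch("doc1", { title: "other" }, created.token);
```

The second writer presents a token that is no longer current, so its update is refused, much as an HTTP If-Match request fails when the ETag has changed.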
[WebSimpleDB] Introduce a pause/resume pattern for coordinated access to multiple stores
Whenever we take a callback that's to be called for each item in a set (e.g. with a .forEach(callback) pattern), we need a way to indicate to the system whether it's ok to move to the next row and invoke the next callback or not. Otherwise, in scenarios where the callback itself performs an operation that doesn't finish immediately (such as another database async call) the system will keep queuing up top-level callbacks, which in turn may queue up more callbacks as part of its implementation, and execution will be in "some order" that's very hard to predict at best.

This comes up in several contexts. Applications will often need to scan more than one object store in coordination. Query processors will also need this when implementing physical operators for joins and such. A different context would be a system that needs to submit an HTTP request per row, where you may want to use an XmlHttpRequest and unwind after calling open. While the HTTP request is in flight you don't want to move to the next row.

In most cases one of the key aspects is that we need separate components to work cooperatively as they pull rows from one or multiple scans, and there needs to be a way of controlling the advance of cursors through the rows.

We would like to introduce "pause" and "resume" functions for scans to support this. Since there is no obvious place to put this right now, we could introduce an "iterator" object that can be used to control things related to the current state of the iteration as of when the callback happens, or maybe this is the cursor itself.

The resulting code would look like this (the example uses the single-async-level pattern we're playing around with, but these two are actually independent things):

    async_db.forEachObjectInStore("people", function(person, iteration) {
        iteration.pause(); // we won't be done with 'person' until later...
        var request = async_db.getFromStore("people", person.managerId);
        request.onsuccess = function() {
            var manager = request.result;
            // Do something with both 'person' and 'manager', and now
            // we're ready to process the next person.
            iteration.resume();
        };
    });

The nice thing about adding these as methods on the side is that it's completely out of sight in simple scenarios where you may be just scanning to build some HTML for example. Only if you're doing multiple coordinated, async tasks do you need to know about these functions.

Regards,
-pablo
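To make the intended pause/resume semantics concrete, here is one possible way the scan side could behave, written as a self-contained sketch. Everything in it is an assumption for illustration: forEachPausable is a hypothetical helper over a plain array, and the pending array stands in for async requests that are still in flight.

```javascript
// Hypothetical helper: calls `callback(row, iteration)` for each row, but
// stops advancing while the iteration is paused.
function forEachPausable(rows, callback, onDone) {
  let index = 0;
  let paused = false;
  let running = false;
  let finished = false;

  const iteration = {
    pause() { paused = true; },
    resume() {
      paused = false;
      if (!running) pump(); // restart the scan if it had stopped
    },
  };

  function pump() {
    running = true;
    while (!paused && index < rows.length) {
      callback(rows[index++], iteration);
    }
    running = false;
    if (!paused && index >= rows.length && !finished) {
      finished = true;
      if (onDone) onDone();
    }
  }

  pump();
}

const pending = []; // stand-in for async requests still in flight
const log = [];

forEachPausable(["ann", "bob"], (person, iteration) => {
  iteration.pause(); // not done with `person` until the lookup returns
  log.push(person + ":start");
  pending.push(() => {
    log.push(person + ":done");
    iteration.resume(); // now it is safe to move to the next row
  });
}, () => log.push("end"));

// Complete the simulated lookups one at a time, in arrival order.
while (pending.length > 0) pending.shift()();
```

Because each callback pauses before starting its lookup, the second row is not visited until the first lookup completes, so the work on each row stays strictly ordered.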
RE: [WebSimpleDB] Allowing schema operations anywhere
My apologies for my late reply, I've been out for a while.

> -----Original Message-----
> From: Nikunj R. Mehta [mailto:nikunj.me...@oracle.com]
> Sent: Friday, December 11, 2009 10:47 AM
> To: public-webapps@w3.org WG
> Cc: Pablo Castro
> Subject: Re: [WebSimpleDB] Allowing schema operations anywhere
>
> I have gone ahead and updated the spec to allow option B (only).
> Please take a look.

Option B makes sense, as without it there is a class of algorithms that cannot be implemented, or only with considerable difficulty (e.g. a "sort" type of construct a query language might want to support wouldn't be possible without a backing index).

This certainly means versioning becomes the responsibility of the app/library and not the user agent. This makes sense to me, given that not all schema changes are really version changes (e.g. creation of a spill-to-disk table shouldn't bump up the database version).

Thanks
-pablo

>
> Nikunj
> On Dec 8, 2009, at 10:14 AM, Nikunj R. Mehta wrote:
>
> > Hi Pablo,
> >
> > Sorry for the long delay in responding to your comments. Hopefully, we
> > can continue the discussion now.
> >
> > Schema changes interact with the locking model of the database. As I
> > see it, here are several ways in which the API could be designed and
> > the consequences of doing so:
> >
> > A. Allow schema changes inside a metadata transaction which can only
> > be performed at connection time
> > B. Allow schema changes inside a data
> > transaction, which can be performed any time a connection is open
> > C. Allow schema changes inside a metadata transaction, which can be
> > performed any time a connection is open
> >
> > Option A's disadvantages are that metadata manipulation cannot be
> > combined with data changes. Moreover, version numbers are no longer
> > issued by the application but rather by a user agent.
> >
> > Option A's advantages are that resource acquisition is simplified and
> > deadlocks can be avoided considering that a connection acquires and
> > releases the metadata resource in a consistent sequence. Another
> > upside is that version number maintenance is automated.
> >
> > Option B's main disadvantage is that there is no real notion of
> > version that can be managed by the user agent. Another is that
> > deadlocks could occur because there is no a priori declaration of
> > intent about metadata modification. This could be remedied by
> > including the database itself in the list of objects that are intended
> > to be modified in the transaction.
> >
> > Option B's advantages are closer interleaving of and atomic metadata
> > changes with data changes, and application controlled version numbers
> > used for the database.
> >
> > Option C's disadvantage is that data and metadata changes cannot be
> > interleaved atomically.
> >
> > Option C's advantages are that deadlocks can be avoided and version
> > number management can be performed by an application.
> >
> > Overall, I think version management and metadata changes are exclusive
> > in some sense. IOW, if we want Option B and Option C, then we have to
> > remove the connection time version check.
> >
> > Hope that helps. Please feel free to add if I missed anything.
> >
> > Nikunj
> >
> > On Nov 22, 2009, at 3:14 PM, Pablo Castro wrote:
> >
> >> We are finding a number of reasons for wanting to create tables on
> >> the fly, and without bumping up the database version. A few examples:
> >> - Packaged components that create side tables to maintain their own
> >> state
> >> - Query processors often need to "spill to disk" during query
> >> execution. For example, sorting large sets requires storing temporary
> >> sets of rows on disk to be merged later.
> >>
> >> So we're thinking it would be better to have these methods directly
> >> in the DatabaseSync/DatabaseAsync objects (with proper corresponding
> >> patterns), instead of their current location in the Upgrade
> >> interface.
> >>
> >> For the common case where several schema changes need to be done
> >> atomically, developers can simply wrap the calls in a transaction,
> >> as they would do for regular data manipulation.
> >>
> >> We would need an extra method to bump up the version explicitly, as
> >> that would no longer be in the upgrade callback.
> >>
> >> Does this seem reasonable?
> >>
> >> Regards,
> >> -pablo
> >
> > Nikunj
> > http://o-micron.blogspot.com
>
> Nikunj
> http://o-micron.blogspot.com
[WebSimpleDB] Allowing schema operations anywhere
We are finding a number of reasons for wanting to create tables on the fly, and without bumping up the database version. A few examples:
- Packaged components that create side tables to maintain their own state
- Query processors often need to "spill to disk" during query execution. For example, sorting large sets requires storing temporary sets of rows on disk to be merged later.

So we're thinking it would be better to have these methods directly in the DatabaseSync/DatabaseAsync objects (with proper corresponding patterns), instead of their current location in the Upgrade interface.

For the common case where several schema changes need to be done atomically, developers can simply wrap the calls in a transaction, as they would do for regular data manipulation.

We would need an extra method to bump up the version explicitly, as that would no longer be in the upgrade callback.

Does this seem reasonable?

Regards,
-pablo
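A small sketch may help show what this proposal would look like from application code. It is only an illustration: createDatabase, createObjectStore, setVersion and transaction are hypothetical names on a toy in-memory object, and the snapshot-and-restore rollback is a stand-in for real transactional behaviour.

```javascript
// Hypothetical in-memory database; names are illustrative, not from the draft.
function createDatabase() {
  let state = { version: 1, stores: {} };

  const db = {
    createObjectStore(name) {
      if (name in state.stores) throw new Error("store exists: " + name);
      state.stores[name] = [];
    },
    setVersion(version) {
      state.version = version; // explicit bump, no upgrade callback involved
    },
    storeNames() {
      return Object.keys(state.stores);
    },
    version() {
      return state.version;
    },
    // Runs `work` atomically: any exception restores the previous state.
    transaction(work) {
      const snapshot = JSON.parse(JSON.stringify(state));
      try {
        work(db);
      } catch (error) {
        state = snapshot;
        throw error;
      }
    },
  };
  return db;
}

const db = createDatabase();

// Several schema changes plus a version bump, applied atomically.
db.transaction((tx) => {
  tx.createObjectStore("people");
  tx.createObjectStore("orders");
  tx.setVersion(2);
});

// A temporary "spill" table created on the fly, without touching the version.
db.createObjectStore("sort_spill_1");

// A failing transaction leaves no partial schema changes behind.
let failed = false;
try {
  db.transaction((tx) => {
    tx.createObjectStore("audit");
    tx.createObjectStore("people"); // throws: already exists
  });
} catch (error) {
  failed = true;
}
```

Here the version changes only when the application asks for it, which matches the point that not every schema change is a version change.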
[WebSimpleDB] Flatting APIs to simplify primary cases
We're busy creating experimental implementations of WebSimpleDB to both understand what it takes to implement and also to see what the developer experience looks like. As we started to write "application code" against the API (particularly the async one) the first thing that popped up is the fact that you need two levels of nested callbacks for everything. While the current factoring of the API makes sense on the design board, it's kind of noisy in app code. For example:

    // assume you already have a database opened in dbReq
    var html = "";
    var storeReq = new ObjectStoreRequest(dbReq.database);
    storeReq.success = function() {
        var cursorReq = new CursorRequest(storeReq.store);
        cursorReq.callback = function(key, cursor, value) {
            html += "" + value.Name + "";
        }
        cursorReq.onsuccess = function(r) {
            document.getElementById("output").innerHTML = html + "";
        }
        cursorReq.open();
    }
    storeReq.open();

One option that we would like to explore is to "flatten" the API, so most common methods are straight in the database class. This trades off some of the factoring in favor of usability for common cases using the async API. The change would span a couple of aspects:

1. Move operations from object store interface and the index interface into the Database interface.

Accessing indexes and stores through specialized objects is problematic for the following reasons:

- It's always the case that we need to consider when objects are invalidated because something changes from underneath them, for example a schema change. So for example, if there is an explicit store object, then when the store is dropped we need to consider what is valid/invalid and what its failure points and modes are. By not having a standalone store object, we significantly reduce the "gotchas" to consider.
- From a usability perspective, it's simpler to work with a store in a single step, rather than having to open it first and then work with it (see patterns below with a single request and one DBRequest object).
- With no "two-step" access pattern, the API has one less level of asynchronicity, as effectively the table lookup + operation are atomic within the store. This also consolidates all operations with an async variant in a single interface (the Database), which is a great simplification for discoverability.

    var html = "";
    var request = asyncDb.forEachStoreObject("contacts", function(row) {
        html += "" + row.Name + "";
    });
    request.onsuccess = function(r) {
        document.getElementById("output").innerHTML = html + "";
    }

In moving the operations, it's probably best to rename them to something more descriptive, so we can have for example 'getFromStore(storeName, key)' and 'getFromIndex(storeName, indexName, key)'. This also helps in that 'delete' won't collide with the Javascript keyword.

Note that the store and index interfaces are still around to provide metadata, but at this point they behave as simple read-only snapshots.

2. Generalize the use of DBRequest, add a 'result' member to it and have all asynchronous operations be initiated from a DatabaseAsync interface.

As a result of the previous changes, all operations that have an async counterpart should now exist on the DatabaseAsync interface. Rather than having multiple types of requests depending on the target object, it is possible to have operations on a DatabaseAsync interface that provide a uniform invocation and handling programming pattern. This gives a nice pattern for understanding how a sync API maps to an async API. So for example:

    var record = db.getFromStore("store", key);
    // use record...

Becomes:

    var request = asyncDb.getFromStore("store", key);
    request.onsuccess = function(req) {
        var record = req.result;
        // use record...
    };

We could include more data in DBRequest or DBRequest.result as needed if in some cases a method produces more than just a simple result. Further specializations of DBRequest (subtypes) are still possible in the future if we need to introduce special cases for specific operations.

Similarly, we would have something like asyncDb.forEachStoreObject() that queues a task to call a callback for each element in a store/index, potentially within a range if specified.

The pattern scales well to all the other APIs present in db/store/index today. If this seems like a good idea to folks, we'd be happy to write up a more complete version that articulates the tweaks across all the WebSimpleDB APIs to make this happen.

Regards,
-pablo
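The uniform sync-to-async mapping described in this message could be mocked up along the following lines. This is a sketch under several assumptions: createSyncDb and createAsyncDb are hypothetical helpers, the request object carries only the result and onsuccess members mentioned above, and the explicit tasks queue with runTasks() stands in for the browser's task queue so the example runs deterministically.

```javascript
// Hypothetical sketch; `tasks`/`runTasks` stand in for the browser's task queue.
const tasks = [];
function runTasks() {
  while (tasks.length > 0) tasks.shift()();
}

// Sync API: operations live directly on the database and return values.
function createSyncDb(stores) {
  return {
    getFromStore(storeName, key) {
      return stores[storeName][key];
    },
  };
}

// Async API: same method names, but every call returns a uniform request
// object whose `result` is filled in before `onsuccess` fires.
function createAsyncDb(syncDb) {
  const asyncDb = {};
  for (const name of Object.keys(syncDb)) {
    asyncDb[name] = (...args) => {
      const request = { result: undefined, onsuccess: null };
      tasks.push(() => {
        request.result = syncDb[name](...args);
        if (request.onsuccess) request.onsuccess(request);
      });
      return request;
    };
  }
  return asyncDb;
}

const db = createSyncDb({ contacts: { 7: { Name: "Ann" } } });
const asyncDb = createAsyncDb(db);

// Sync usage.
const record = db.getFromStore("contacts", 7);

// Async usage: one request object, one level of callbacks.
const seen = [];
const request = asyncDb.getFromStore("contacts", 7);
request.onsuccess = function (req) {
  seen.push(req.result.Name);
};
runTasks(); // in a browser the event loop would do this
```

Since every async method is derived mechanically from its sync counterpart, there is a single request shape to learn regardless of which store or index is being touched.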
Web Data APIs
We've been looking at the web database space here at Microsoft, trying to understand scenarios and requirements. After assessing what was out there we are forming an opinion around this. I wanted to write to this group to share how we think about the space, what principles we try to apply, and to discuss specifics.

The short story is that we believe Nikunj's WebSimpleDB proposal, which basically describes a minimum-bar web database API and enables a whole set of diverse options to be built on top, is the right thing to do.

During the last couple of weeks we have been talking with various folks from Mozilla and Oracle and iterating over details of the WebSimpleDB draft. In the process it has become clear that we all share the same high-level expectations on the scope and capabilities of this API, and Nikunj has been hard at work making changes to the draft to keep up with them. I'll touch on a few details below, but bear in mind that several of them are already in the process of being addressed.

We would love to hear feedback, requirements, specific application scenarios, etc. We want to make progress quickly and get experimental implementations going to ensure that as we explore we stay grounded, with things that are implementable.

Guiding principles and why we think the ISAM style proposed in WebSimpleDB is a good idea

As we try to understand the problem space we formulated a couple of guiding principles:
- Get into the standard the key building blocks that are either impossible to build on top, or so common that it would be very redundant to do so
- Focus on an API that is simple enough that it can be reliably specified and that can be implemented to follow the spec in a relatively simple manner

We believe that WebSimpleDB sets the stage in this direction. An ISAM layer can be used directly or can be a building block for more elaborate layers that can be built entirely in Javascript on top. Also, ISAM is simple enough that it can be specified in a way that should enable highly interoperable implementations.

Trimming down

There are a number of elements of WebSimpleDB that we can probably live without, at least for a first version, such as Queues and Sequences. This may help simplify the database API even further. Also, there are a few simplifying assumptions we can make from the get-go. For example, that "paths" as informally mentioned in the spec only reference Javascript identifiers (perhaps with dot-notation) and when used for index/primary keys they point to Javascript primitive values and not to objects/arrays.

Terminology

The word "Entity" has a lot of different meanings depending on who you talk to. It would be interesting to find a simpler term, perhaps something that matches the Javascript terminology better.

Areas where we need to dig deeper and have broader discussions to understand better

Isolation model and its implications in locking: Various isolation models lead to different failure modes; for example, regular locks mean that application code needs to be ready to deal with deadlocks, or in the case of multi-versioning you can see optimistic concurrency violation exceptions during commit. There is a tricky balance between not dictating too much of the implementation and ensuring that observable behavior across implementations really enables interoperability.

What's the sweet spot for the API?: is the primary use for this API to be directly consumed by application code? Or is it a building block to create various different libraries that present a diversity of styles for query formulation and execution? We lean to the side of making it an API that's great for libraries to build nice layers on top, but it's still useable directly in application code (along the lines of what happens with XmlHttpRequest, where most developers will actually use a wrapper that fits the particular scenario/library better).

Regards,
-pablo
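The simplifying assumption about "paths" mentioned under "Trimming down" can be illustrated with a short sketch. The resolveKeyPath helper is hypothetical, and its identifier pattern is deliberately restricted to ASCII for brevity, so it is narrower than what Javascript actually allows.

```javascript
// Hypothetical helper illustrating the assumption; not taken from the draft.
// Simplified: only ASCII identifiers are accepted.
const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

function resolveKeyPath(value, path) {
  const parts = path.split(".");
  if (!parts.every((part) => IDENTIFIER.test(part))) {
    throw new Error("invalid key path: " + path);
  }
  let current = value;
  for (const part of parts) {
    // A missing intermediate property means there is no key at this path.
    if (current === null || typeof current !== "object") return undefined;
    current = current[part];
  }
  // Keys must be primitives; objects and arrays are rejected.
  if (current !== null && typeof current === "object") {
    throw new Error("key path must point to a primitive value: " + path);
  }
  return current;
}

const person = { id: 42, name: { first: "Ann", last: "Lee" }, tags: ["a", "b"] };

const primaryKey = resolveKeyPath(person, "id");       // 42
const indexKey = resolveKeyPath(person, "name.last");  // "Lee"
const missing = resolveKeyPath(person, "name.middle"); // undefined
```

Restricting keys to primitives reached through identifier paths keeps comparison and sorting rules small, which fits the goal of an API that can be specified reliably.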