Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
i've mostly stayed out of this thread because i felt like i'd just be fanning the flames, but i really can't stay out anymore. databases are more than SQL, always have been. SQL is a DSL for relational database access. all implementations of SQL have a similar set of tools they implement first and layer SQL on top of. those tools tend to be a storage engine, a btree, and some kind of transactional model between them. under the ugly covers, most databases look like berkeleydb, and the layer you live in is just sugar on top.

creating an in-browser specification/implementation on top of a given relational/SQL story is a terrible idea. it's unnecessarily limiting to a higher level api and can't be easily extended the way a simple set of tools like IndexedDB can. suggesting that other databases be implemented on top of SQL rather than on top of the tools on which SQL is built is just backwards to anyone who's built a database.

it's not very hard to write the abstraction you're talking about on top of IndexedDB, and until you do it i'm going to have a hard time taking you seriously because it's clearly doable. i implemented a CouchDB compatible datastore on top of IndexedDB; it took me less than a week, at a time when there was only one implementation, which was still changing and still had bugs. it would be much easier now. https://github.com/mikeal/idbcouch (it needs to be updated to use the latest version of the spec, which is a day of work i just haven't gotten to yet.)

the constructs in IndexedDB are pretty low level but sufficient if you know how to implement databases. performance is definitely an issue, but making these constructs faster would be much easier than trying to tweak an off-the-shelf SQL implementation to your use case.
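The claim above, that relational queries bottom out in btree/key-range primitives of exactly the kind IndexedDB exposes, can be sketched in a few lines. This is a toy illustration with invented data and names, not any real API:

```javascript
// Toy sketch: a relational query like "SELECT name WHERE age >= 30"
// reduces to a range scan over a sorted index -- the primitive that
// IndexedDB exposes directly (key ranges over btree-backed stores).
const byAge = [[9, "Benny"], [57, "Sven"], [63, "Benny"]]; // sorted (key, value) pairs

function rangeScan(index, lower) {
  // the moral equivalent of a WHERE key >= lower clause
  return index.filter(([key]) => key >= lower).map(([, value]) => value);
}
```

In real IndexedDB code the same scan would be a cursor opened over an IDBKeyRange.lowerBound; the SQL layer is sugar over exactly this kind of operation.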
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
Joran Greef, April 4, 2011, 10:18 AM

On 04 Apr 2011, at 6:10 PM, Mikeal Rogers wrote: it's not very hard to write the abstraction you're talking about on top of IndexedDB, and until you do it i'm going to have a hard time taking you seriously because it's clearly doable.

You assume I have not written the abstraction I am talking about on top of IndexedDB?

the constructs in IndexedDB are pretty low level but sufficient if you know how to implement databases. performance is definitely an issue, but making these constructs faster would be much easier than trying to tweak an off-the-shelf SQL implementation to your use case.

How exactly would you make a schema-enforcing interface faster than a stateless interface? How would you implement application-managed indices on top of IndexedDB without being slower than SQLite?

this assumes your indexes can be described well in sqlite and that you want them generated at write time. one of the performance optimizations couchdb makes is to generate secondary indexes at read time, which is much more efficient because you can batch the processing and do a single bulk transaction, which limits IO. if you need exactly the indexes sqlite provides then i'm sure they will be well optimized in sqlite, but not everyone does.

How would you implement set operations on indices in IndexedDB without being slower or less memory efficient than SQLite?

you can do pretty much anything if you write a good abstraction and leverage transactions. if it's too slow, complain to the vendor and it'll get improved; that's how we've been improving performance in the browser for many, many years and it seems to be working.

How would you create an index on an existing object store in IndexedDB containing more than 50,000 objects on an iPad, without incurring any object deserialization/serialization overhead, without being an order of magnitude slower than SQLite, and without bringing the iPad to its knees?
If you can do it with even one IndexedDB implementation out there then kudos and hats off to you. :)

the biggest bottleneck here in the current implementation would be the transaction overhead on a database this size, which is because of performance problems in sqlite, which underlies the implementation. sqlite can't fix this; it's currently the problem. the object serialization is not a huge performance issue; performance issues in databases are almost always due to IO or transaction locks.

I understand your point of view. I once thought the same. You would think that IndexedDB would be more than satisfactory for these things. The question is whether IndexedDB provides adequate and performant database primitives, to the same degree as SQLite (and of course SQL is merely an interface to database storage primitives; I do not recall saying otherwise).

sqlite doesn't expose its primitives well, it exposes its abstractions. berkeleydb exposes its primitives well, and you could make the case that it would be a better thing to standardize on, although i wouldn't think that would be a great idea either.

You can build IndexedDB on top of SQLite (as some browsers are indeed doing), but you cannot build SQLite on IndexedDB.

you should most definitely be able to build sqlite on top of IDB. there would be a performance penalty of course, which we can address, but you should be able to do it. if you can't, then we need to extend the specification.

Keean Schupke, April 4, 2011, 8:55 AM

Yes, it already has well defined set operations. Solid is a matter of testing by enough people (and if you wanted to try it and feed back, that would be a start). Fast should not be a problem, as the SQL database does all the heavy lifting. In more detail, Codd's five primitive operators are project, restrict, cross-product, union, and difference. Relations are an extension of sets, so intersection and difference on compatible relations behave like they would on sets.
RelationalDB already implements the following 5 methods, making it relationally complete, meaning it can do anything you could possibly want to do with relations using combinations of these 5 methods.
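The read-time index generation Mikeal mentions earlier in this thread (batch the processing, do a single bulk transaction) can be sketched independently of any real database API. The function names and document shape below are illustrative; the map function follows the CouchDB shape of emitting key/value pairs per document:

```javascript
// Sketch of read-time secondary index generation: instead of updating an
// index on every write, process all new documents in one batch and do a
// single bulk write of the sorted entries, which limits IO.
function buildIndex(docs, mapFn) {
  const entries = [];
  for (const doc of docs) {
    // emit collects (key, value) pairs with a back-reference to the doc
    mapFn(doc, (key, value) => entries.push({ key, value, docId: doc._id }));
  }
  // one sort + one bulk transaction instead of per-write index maintenance
  entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return entries;
}
```

In a real implementation the returned entries would be written to an index objectStore in one IndexedDB transaction, which is the IO-batching advantage being argued for.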
Re: Seeking agenda items for WebApps' Nov 1-2 f2f meeting
I might attend if there are enough IndexedDB people there. On Wed, Sep 1, 2010 at 7:28 PM, Jonas Sicking jo...@sicking.cc wrote: I'm hoping to be there yes. Especially if we'll get a critical mass of IndexedDB contributors. / Jonas On Wed, Sep 1, 2010 at 7:18 PM, Pablo Castro pablo.cas...@microsoft.com wrote: -Original Message- From: public-webapps-requ...@w3.org [mailto: public-webapps-requ...@w3.org] On Behalf Of Arthur Barstow Sent: Tuesday, August 31, 2010 4:32 AM The WebApps WG will meet face-to-face November 1-2 as part of the W3C's 2010 TPAC meeting week [TPAC]. I created a stub agenda item page and seek input to flesh out agenda: http://www.w3.org/2008/webapps/wiki/TPAC2010 [TPAC] includes a link to the Registration page, a detailed schedule of the group meetings, and other useful information. The registration fee is 40€ per day and will increase to 120€ per day after October 22. -Art Barstow [TPAC] http://www.w3.org/2010/11/TPAC/ For folks working on IndexedDB, are you guys planning on attending the TPAC? Given the timing of the event it may be a great opportunity to get together and iron out a whole bunch of issues at once. It would be good to know ahead of time so we can all make plans if we have critical mass. Thanks -pablo
Re: [IndexedDB] Languages for collation
Why not just use the unicode collation algorithm? Then you won't have to hint the locale. http://en.wikipedia.org/wiki/Unicode_collation_algorithm

CouchDB uses some definitions around sorting complex types like arrays and objects, but when it comes down to sorting strings it just defaults to the unicode collation algorithm and all the locales are happy. -Mikeal

On Wed, Aug 11, 2010 at 11:28 PM, Pablo Castro pablo.cas...@microsoft.com wrote: We had some discussions about collation algorithms and such in the past, but I don't think we have settled on the language aspect of it. In order to have stores and indexes sort character-based keys in a way that is consistent with users' expectations, we'll have to take an indication in the API of what language we should use to collate strings. Trying to take a minimalist approach, we could add an optional parameter on the database open call that indicates the language to use (e.g. en or en-UK, etc.). If the language is not specified and the database does not exist, then we can use the current browser/OS language to create the database. If not specified and the database already exists, then use the one that's already there (this accommodates the fact that a user may be able to change their default language in the browser/OS after the database has been created using the default). If the language is specified, the database already exists, and the specified language is not the one the database has, then we'll throw an exception (same behavior as with description, although we have that one in flight right now as well). We should probably also add a read-only attribute to the database object that exposes the language. If this works for folks I can write a proposal for the specific changes to the spec. Thanks -pablo
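As an aside illustrating what is at stake in this thread: the same two strings order differently by raw code units, under German rules, and under Swedish rules. The sketch below uses JavaScript's Intl.Collator purely as a present-day demonstration; it is not part of either proposal being discussed:

```javascript
// Why a language hint (or a fixed algorithm like the UCA) matters:
// "ä" vs "z" sorts three different ways.
const byCodeUnit = ["ä", "z"].sort();                             // "ä" (U+00E4) sorts after "z"
const german = ["ä", "z"].sort(new Intl.Collator("de").compare);  // "ä" sorts with "a", before "z"
const swedish = ["ä", "z"].sort(new Intl.Collator("sv").compare); // "ä" comes after "z"
```

The Unicode Collation Algorithm that Mikeal points to defines a locale-independent default ordering, which is what lets CouchDB avoid a per-database language setting.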
Re: [IndexedDB] Implicit transactions
For what it's worth, I haven't found using it this way to be that hard or confusing, but that could be because I'm a little more aware of the underlying implications when opening object stores. -Mikeal On Wed, Aug 4, 2010 at 11:47 AM, Shawn Wilsher sdwi...@mozilla.com wrote: On 8/4/2010 10:53 AM, Jeremy Orlow wrote: Whoa... transaction() is synchronous?!? Ok, so I guess the entire premise of my question was super confused. :-) It is certainly spec'd that way [1]. The locks do not get acquired until the first actual bit of work is done, though. Cheers, Shawn [1] http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#database-interface
Re: [IndexedDB] Editors
+1 On Tue, Jul 6, 2010 at 4:11 PM, art.bars...@nokia.com wrote: Nikunj, Jonas, All, Chaals, the Team and I all support this proposal. Thanks to you both! -Art Barstow From: public-webapps-requ...@w3.org [public-webapps-requ...@w3.org] On Behalf Of ext Nikunj Mehta [nik...@o-micron.com] Sent: Tuesday, July 06, 2010 12:39 PM To: public-webapps Subject: [IndexedDB] Editors Hi folks, I would like to propose adding Jonas Sicking to the list of editors for the IndexedDB spec. Many of you have seen the tremendous amount of work Jonas has done to assist in finalizing the asynchronous API as well as providing implementation feedback. I hope the WG will support this change. Best, Nikunj
Re: [IndexedDB] Atomic schema changes
In IDBCouch I don't have a schema, but I do have to maintain consistency of the by-sequence index, which is a similar problem to validating schema state before these kinds of operations. What I'm currently doing is just starting each write transaction with a lookup to the end of the by-sequence index to make sure the lastSequence I have is, in fact, the current one and another tab/window hasn't updated it. My plan for view generation is a similar problem and I plan to solve it with an objectStore of meta information about all of the views. Storing the last known sequence and conflict resolution information about replicas is also a similar problem and I'll solve it the same way with a meta objectStore. I don't see why schema information couldn't also be stored in a meta objectStore at the end of transactions that modify it, and all of these higher level APIs could just start their transaction with a validation of the meta info. Rather than trying to keep the information globally and updating it with an event, you can just validate it at the beginning of each transaction. The overhead is minimal and it seems, to me at least, to be a little less error prone. -Mikeal On Fri, Jun 25, 2010 at 2:43 AM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Jun 25, 2010 at 9:04 AM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Jun 24, 2010 at 4:32 AM, Jeremy Orlow jor...@chromium.org wrote: On Thu, Jun 24, 2010 at 1:48 AM, Jonas Sicking jo...@sicking.cc wrote: Hi All, In bug 9975 comment 1 [1] Nikunj pointed out that it is unclear how to make atomic changes to the schema of a database. For example adding an objectStore and a couple of indexes. While it actually currently is possible, it is quite quirky, and so I think we need to find a better solution. One way this is already possible is by calling setVersion.
When the success event fires for this request, it contains an implicitly created transaction which, while it is alive, holds a lock on the whole database and prevents any other interactions with the database. However setVersion is a fairly heavy operation. We have discussed a couple of different ways it can work, but it seems like there is agreement that any other open database connections will either have to be closed manually (by for example the user leaving the page), or they will be closed forcefully (by making any requests on them fail). I intend to start a separate thread on defining the details of setVersion. We might want to allow making smaller schema changes, such as adding a new objectStore or a new index, without requiring all other database connections to be closed. Further, it would be nice if atomicness was a default behavior so as to avoid people accidentally creating race conditions. We've talked a bit about this at mozilla and have three alternative proposals. In all three proposals we suggest moving the createObjectStore function to the Transaction interface (or possibly a new SchemaTransaction interface). The createIndex function remains on objectStore, but is defined to throw an exception if called outside a transaction which allows schema changes. Proposal A: Always require calls to setVersion for changes to the database schema. The success event fired on the setVersion request is an IDBTransactionEvent. The author can use the createObjectStore method on the transaction available on the event to create new object stores. Additionally, since we know that no one else currently has an open database connection, we can make creating objectStores synchronous. The implementation can probably still asynchronously fail to create an objectStore, due to diskspace or other hardware issues. This failure will likely only be detected asynchronously, but can be raised as a failure to commit the transaction as it is extremely rare.
The code would look something like:

  if (db.version == 2.0) { weAreDoneFunction(); }
  db.setVersion(2.0).onsuccess = function(event) {
    trans = event.transaction;
    store1 = trans.createObjectStore("myStore1", ...);
    store2 = trans.createObjectStore("myStore2", ...);
    store1.createIndex(...);
    store1.createIndex(...);
    store2.createIndex(...);
    trans.oncomplete = weAreDoneFunction;
  }

Proposal B: Add a new type of transaction, SCHEMA_CHANGE, in addition to READ and READ_WRITE. This transaction is required for any schema changes. As long as the transaction is open, no other schema changes can be done. The transaction is opened asynchronously using a new 'startSchemaTransaction' function. This ensures that no other modifications are attempted at the same time. Additionally, since we know that no one else currently is inside a SCHEMA_CHANGE transaction, we can make creating objectStores synchronous. The implementation can probably still asynchronously fail to create an objectStore, due to diskspace or other hardware issues. This failure will
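Mikeal's suggestion at the top of this thread, storing meta/schema information in a meta objectStore and validating it at the start of each transaction, has a small pure core. A sketch with the IndexedDB plumbing elided and all names invented:

```javascript
// Pure core of the "validate meta at the start of each transaction"
// pattern: before writing, re-read the stored sequence and compare it to
// the one this tab last saw. If another tab/window moved it, abort and
// re-sync instead of writing with stale state. In real code, storedSeq
// would come from a get() on a meta objectStore inside the same
// transaction as the write.
function nextSequence(knownSeq, storedSeq) {
  if (knownSeq !== storedSeq) {
    throw new Error("stale meta: expected " + knownSeq + ", store has " + storedSeq);
  }
  return storedSeq + 1; // sequence to assign to this write
}
```

Because the check and the write share one transaction, the comparison is race-free, which is what makes the per-transaction validation cheaper and less error-prone than a global cache kept up to date by events.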
Re: [IndexedDB] Multi-value keys
I'm currently building out the CouchDB API on top of IndexedDB and achieving this particular use case is pretty trivial. I would just process a map function like: function (doc) { doc.names.forEach(function(n){ emit(n, 1) }) }; Then I would process that map function and create an index with each key/value and a reference back to the document that was used to create the index. When I pull out the view I would optionally also pull out the originating document from the other objectStore. I think it's simple enough to do on top of the existing API that I wouldn't want to exclude Jonas' use case, or complicate things by specifying sorting of JSON-style arrays inside the string sorting algorithm. -Mikeal On Fri, Jun 18, 2010 at 7:13 PM, Jeremy Orlow jor...@chromium.org wrote: Another possible meaning for arrays is allowing someone to insert multiple values into an index that point to one object store. For example: { names: ["Sarah", "Jessica", "Parker"], ...} { names: ["Bono"], ...} { names: ["Jonas", "Sicking"], ...} Then I could look up "Sicking" inside an index with a keyPath of "names" and find Jonas even though I didn't know whether I was looking for his first name or last. I'm not sure whether creating semantics like this (or at least reserving the possibility for them in the future) is worth not using indexes as Jonas proposed, but it's worth considering. I'm also not so hot on the idea that if I want to index into something I either need to duplicate/mangle data in order to use keyPath or do explicit key management (which I'm not so hot on in general). I wonder if we could define keyPath to take some sort of array-like syntax so that your example could work via a keyPath of ["firstName", "lastName"] instead. Of course then the spec for the keyPath syntax is more complex. I'm sold on the need for ways to do composite indexes, but I'm not sure what the best way to express them will be.
The fact that couchDB allows indexing on arrays definitely makes me lean towards your proposal though, Jonas. J On Fri, Jun 18, 2010 at 4:08 PM, Jonas Sicking jo...@sicking.cc wrote: Hi All, One thing that (if I'm reading the spec correctly) is currently impossible is to create multi-valued keys. Consider for example an object store containing objects like: { firstName: "Sven", lastName: "Svensson", age: 57 } { firstName: "Benny", lastName: "Andersson", age: 63 } { firstName: "Benny", lastName: "Bedrup", age: 9 } It is easy to create an index which lets you quickly find everyone with a given firstName or a given lastName. However it doesn't seem possible to create an index that finds everyone with a given firstName *and* lastName, or sort the list of people based on firstName and then lastName. The best thing you could do is to concatenate the firstName and lastName and insert an ascii-null character in between and then use that as a key in the index. However this doesn't work if firstName or lastName can contain null characters. Also, if you want to be able to sort by firstName and then age there is no good way to put all the information into a single string while having sorting work. Generally the way this is done in SQL is that you can create an index on multiple columns. That way each row has multiple values as the key, and sorting is first done on the first value, then the second, then the third etc. However since we don't really have columns we can't use that exact solution. Instead, the way we could allow multiple values is to add an additional type as keys: Arrays. That way you can use ["Sven", 57], ["Benny", 63] and ["Benny", 9] as keys for the respective objects above. This would allow sorting and searching on firstName and age. The way that array keys would be compared is that we'd first compare the first item in both arrays. If they are different, the arrays are ordered the same way as the two first values are ordered.
If they are the same you look at the second value and so on. If you reach the end of one array before finding a difference then that array is sorted before the other. We'd also have to define the order if an array is compared to a non-array value. It doesn't really matter what we say here, but I propose that we put all arrays after all non-arrays. Note that I don't think we need to allow arrays to contain arrays. That just seems to add complication without adding additional functionality. Let me know what you think. / Jonas
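The comparison rules spelled out above (element-wise comparison, shorter prefix first, arrays after non-arrays) fit in a short function. A sketch, assuming keys are mutually comparable scalars (strings with strings, numbers with numbers) or flat arrays of them:

```javascript
// Sketch of the proposed array-key ordering: compare item by item; if one
// array is a prefix of the other, the shorter sorts first; all arrays
// sort after all non-array keys. Returns <0, 0, or >0 like a comparator.
function compareKeys(a, b) {
  const aArr = Array.isArray(a), bArr = Array.isArray(b);
  if (aArr !== bArr) return aArr ? 1 : -1;      // arrays after non-arrays
  if (!aArr) return a < b ? -1 : a > b ? 1 : 0; // scalar keys compare directly
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const c = compareKeys(a[i], b[i]);
    if (c !== 0) return c;                      // first difference decides
  }
  return a.length - b.length;                   // prefix sorts first
}
```

This also produces the ordering discussed later in the thread, where ["a", "z"] falls between ["a"] and ["b"].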
Re: [IndexedDB] Multi-value keys
The complex keys are how we do this in CouchDB as well. But, again, the sorting algorithm needs to be well defined in order for it to work. http://wiki.apache.org/couchdb/View_collation#Collation_Specification Most pertinent to your example is how arrays of varying length might be ordered; for instance, range queries over your example would break for ["firstName", "lastName"] if an entry omitted lastName and arrays were sorted by length and then by comparison of each item. This is why the CouchDB collation algorithm sorts: [a], [b], [b,c], [b,c,a], [b,d], [b,d,e] -Mikeal On Fri, Jun 18, 2010 at 4:08 PM, Jonas Sicking jo...@sicking.cc wrote: Hi All, One thing that (if I'm reading the spec correctly) is currently impossible is to create multi-valued keys. Consider for example an object store containing objects like: { firstName: "Sven", lastName: "Svensson", age: 57 } { firstName: "Benny", lastName: "Andersson", age: 63 } { firstName: "Benny", lastName: "Bedrup", age: 9 } It is easy to create an index which lets you quickly find everyone with a given firstName or a given lastName. However it doesn't seem possible to create an index that finds everyone with a given firstName *and* lastName, or sort the list of people based on firstName and then lastName. The best thing you could do is to concatenate the firstName and lastName and insert an ascii-null character in between and then use that as a key in the index. However this doesn't work if firstName or lastName can contain null characters. Also, if you want to be able to sort by firstName and then age there is no good way to put all the information into a single string while having sorting work. Generally the way this is done in SQL is that you can create an index on multiple columns. That way each row has multiple values as the key, and sorting is first done on the first value, then the second, then the third etc. However since we don't really have columns we can't use that exact solution.
Instead, the way we could allow multiple values is to add an additional type as keys: Arrays. That way you can use ["Sven", 57], ["Benny", 63] and ["Benny", 9] as keys for the respective objects above. This would allow sorting and searching on firstName and age. The way that array keys would be compared is that we'd first compare the first item in both arrays. If they are different, the arrays are ordered the same way as the two first values are ordered. If they are the same you look at the second value and so on. If you reach the end of one array before finding a difference then that array is sorted before the other. We'd also have to define the order if an array is compared to a non-array value. It doesn't really matter what we say here, but I propose that we put all arrays after all non-arrays. Note that I don't think we need to allow arrays to contain arrays. That just seems to add complication without adding additional functionality. Let me know what you think. / Jonas
Re: [IndexedDB] Multi-value keys
Reading back over my email it sounds opposing and that wasn't my intention; it was a long way of saying +1 and giving an explanation for why we went with the same approach in CouchDB. -Mikeal On Fri, Jun 18, 2010 at 5:06 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 5:03 PM, Mikeal Rogers mikeal.rog...@gmail.com wrote: The complex keys are how we do this in CouchDB as well. But, again, the sorting algorithm needs to be well defined in order for it to work. http://wiki.apache.org/couchdb/View_collation#Collation_Specification Most pertinent to your example is how arrays of varying length might be ordered; for instance, range queries over your example would break for ["firstName", "lastName"] if an entry omitted lastName and arrays were sorted by length and then by comparison of each item. This is why the CouchDB collation algorithm sorts: [a], [b], [b,c], [b,c,a], [b,d], [b,d,e] How is that different from what I proposed? I think that was what I intended to propose, but I might be missing some edge cases :) I take it that [a, z] would be sorted between [a] and [b]? / Jonas
Re: [IndexedDB] Multi-value keys
The biggest hole I see, even larger than sorting other types, is what we use for string comparisons. In CouchDB we use the unicode collation algorithm, which is heavy but very well defined and works across various localizations. -Mikeal On Fri, Jun 18, 2010 at 5:26 PM, Jonas Sicking jo...@sicking.cc wrote: I didn't take it as opposing at all. I figured you'd like it as I based it on your description of how you do it in CouchDB ;-) I just wanted to make sure that we nail down all the details, including the sorting order, so if you see anything wrong definitely point it out! / Jonas On Fri, Jun 18, 2010 at 5:22 PM, Mikeal Rogers mikeal.rog...@gmail.com wrote: Reading back over my email it sounds opposing and that wasn't my intention; it was a long way of saying +1 and giving an explanation for why we went with the same approach in CouchDB. -Mikeal On Fri, Jun 18, 2010 at 5:06 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 5:03 PM, Mikeal Rogers mikeal.rog...@gmail.com wrote: The complex keys are how we do this in CouchDB as well. But, again, the sorting algorithm needs to be well defined in order for it to work. http://wiki.apache.org/couchdb/View_collation#Collation_Specification Most pertinent to your example is how arrays of varying length might be ordered; for instance, range queries over your example would break for ["firstName", "lastName"] if an entry omitted lastName and arrays were sorted by length and then by comparison of each item. This is why the CouchDB collation algorithm sorts: [a], [b], [b,c], [b,c,a], [b,d], [b,d,e] How is that different from what I proposed? I think that was what I intended to propose, but I might be missing some edge cases :) I take it that [a, z] would be sorted between [a] and [b]? / Jonas
Re: [IndexedDB] Multi-value keys
I would like to see null and bool types in arrays as well. null is useful if it is assured to sort before any other type; bool types are useful if you want to use integers in the same sort as bools and therefore could not just use 0 and 1 instead. If you add all these new types (null, bool, int, float, date) as valid entries in the array and define a sort order, I don't see why you wouldn't just add them as valid key types; it doesn't seem like it would be much extra work and it's definitely useful. objects would be hard; I'm +1 on punting on them as key types. -Mikeal On Fri, Jun 18, 2010 at 5:36 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 5:18 PM, Pablo Castro pablo.cas...@microsoft.com wrote: +1 on composite keys in general. The alternative to the proposal below would be to have the actual key path specification include multiple members (e.g. db.createObjectStore("foo", ["a", "b"])). I like the proposal below as well; I just wonder if having the key path specification (that's external to the object) indicate which members are keys would be less invasive for scenarios where you already have javascript objects you're getting from a web service or something and want to store them as-is. My intention was for now to *just* add Arrays (of strings, ints, floats and dates) as a valid key type. No other changes at this point. This would mean that you would be allowed to pass an array as the 'key' argument to objectStore.add or have a keyPath point to a member that is an array. But also in the future other means might be possible too. See [1]. For the case that you bring up (and which I used in my example), where you have several separate members that you want to index on, I for now was going to defer to the thread that I started yesterday [1]. That would allow you to pick out several separate members and use them as an index. Though it certainly is an interesting idea to allow multiple keyPaths.
This would likely be more performant than falling back to the method in [1]. [1] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/1094.html / Jonas
Re: [IndexedDB] Changing the default overwrite behavior of Put
I don't have an opinion about addOrModify, but in the Firefox build I'm messing with, the cursor has an update method that I find highly useful and efficient. -Mikeal On Wed, Jun 16, 2010 at 11:08 AM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jun 16, 2010 at 10:46 AM, Nikunj Mehta nik...@o-micron.com wrote: On Jun 16, 2010, at 9:58 AM, Shawn Wilsher wrote: On 6/16/2010 9:43 AM, Nikunj Mehta wrote: There are three theoretical modes as you say. However, the second mode does not exist in practice. If you must overwrite, then you know that the record exists and hence don't need to specify that option. To be clear, you are saying that there are only two modes in practice: 1) add 2) add or modify. But you don't believe that modify exists in practice? In terms of SQL, these three concepts exist and get used all the time. add maps to INSERT INTO, add or modify maps to INSERT OR REPLACE INTO, and modify maps to UPDATE. IndexedDB is not SQL, I think you would agree. UPDATE is useful when you replace on a column-by-column basis and, hence, need to do a blind update. When updating a record in IndexedDB, you'd have to be certain about the state of the entire record. Hence, it makes sense to leave out UPDATE semantics in IndexedDB. I can't say that I have a strong sense of whether modify is needed or not. On the surface it seems strange to leave out, but it's entirely possible that it isn't needed. Unless someone provides a good use case, I would be fine with leaving it out and seeing if people ask for it. / Jonas
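For reference, the two modes discussed above map onto the calls that made it into IndexedDB: add() fails if the key already exists, while put() inserts or overwrites. The decision logic can be modeled over a plain Map (standing in for an object store, since the real calls need a browser); the function name is invented:

```javascript
// Model of IndexedDB's two write modes: "add" refuses to clobber an
// existing key (the real add() raises a ConstraintError), while "put"
// inserts or replaces (the add-or-modify mode).
function write(store, key, value, mode) {
  if (mode === "add" && store.has(key)) {
    throw new Error("ConstraintError: key already exists");
  }
  store.set(key, value);
}
```

The blind-update ("modify") mode is exactly the one missing from this pair, which is what the thread is debating.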
[IndexDB] Collation Algorithm?
One of the things I noticed that seems to be missing from the IndexedDB specification is the collation algorithm used for sorting index keys. There are lots of collation differences between databases; if left unspecified, I'm afraid this would negatively affect interoperability between IndexedDB implementations. CouchDB has a good collation specification for rich keys (any JSON type) and defers to the Unicode Collation Algorithm once it hits string comparisons. This might be a good starting point. http://wiki.apache.org/couchdb/View_collation#Collation_Specification http://www.unicode.org/reports/tr10/ -Mikeal
Re: [IndexDB] Proposal for async API changes
I've been looking through the current spec and all the proposed changes. Great work. I'm going to be building a CouchDB compatible API on top of IndexedDB that can support peer-to-peer replication without other CouchDB instances. One of the things that will entail is a by-sequence index for all the changes in a given database (in my case a database will be scoped to more than one ObjectStore). In order to accomplish this I'll need to keep the last known sequence around so that each new write can create a new entry in the by-sequence index. The problem is that if another tab/window writes to the database it'll increment that sequence and I won't be notified, so I would have to start every transaction with a check on the sequence index for the last sequence, which seems like a lot of extra cursor calls. What I really need is an event listener on an ObjectStore that fires after a transaction is committed to the store but before the next transaction is run, and that gives me information about the commits to the ObjectStore. Thoughts? -Mikeal On Wed, Jun 9, 2010 at 11:40 AM, Jeremy Orlow jor...@chromium.org wrote: On Wed, Jun 9, 2010 at 7:25 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow jor...@chromium.org wrote: On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow jor...@chromium.org wrote: I'm not sure I like the idea of offering sync cursors either, since the UA will either need to load everything into memory before starting or risk blocking on disk IO for large data sets. Thus I'm not sure I support the idea of synchronous cursors. But, at the same time, I'm concerned about the overhead of firing one event per value with async cursors.
Which is why I was suggesting an interface where the common case (the data is in memory) is done synchronously but the uncommon case (we'd block if we had to respond synchronously) has to be handled, since we guarantee that the first time will be forced to be asynchronous. Like I said, I'm not super happy with what I proposed, but I think some hybrid async/sync interface is really what we need. Have you guys spent any time thinking about something like this? How dead-set are you on synchronous cursors? The idea is that synchronous cursors load all the required data into memory, yes. I think it would help authors a lot to be able to load small chunks of data into memory and read and write to it synchronously. Dealing with asynchronous operations constantly is certainly possible, but a bit of a pain for authors. I don't think we should obsess too much about not keeping things in memory; we already have things like canvas and the DOM which add up to non-trivial amounts of memory. Just because data is loaded from a database doesn't mean it's huge. I do note that you're not as concerned about getAll(), which actually has worse memory characteristics than synchronous cursors, since you need to create the full JS object graph in memory. I've been thinking about this off and on since the original proposal was made, and I just don't feel right about getAll() or synchronous cursors. You make some good points about there already being many ways to overwhelm RAM with web APIs, but is there any place we make it so easy? You're right that just because it's a database doesn't mean it needs to be huge, but often times they can get quite big. And if a developer doesn't spend time making sure they test their app with the upper ends of what users may possibly see, it just seems like this is a recipe for problems. Here's a concrete example: structured clone allows you to store image data.
Let's say I'm building an image hosting site and that I cache all the images along with their thumbnails locally in an IndexedDB entity store. Let's say each thumbnail is a trivial amount, but each image is 1MB, and I have an album with 1000 images. I do |var photos = albumIndex.getAllObjects(albumName);| and then iterate over that to get the thumbnails. But I've just loaded over 1GB of stuff into RAM (assuming no additional inefficiency/blowup). I suppose it's possible JavaScript engines could build mechanisms to fetch this stuff lazily (like you could even with a synchronous cursor), but that will take time/effort and introduce lag in the page (while fetching additional info from disk).

I'm not completely against the idea of getAll/sync cursors, but I do think they should be de-coupled from this proposed API. I would also suggest that we re-consider them only after at least one implementation has normal cursors working and there's been some experimentation with it. Until then, we're basing most of our arguments on intuition and assumptions.

I'm not married to the concept of sync cursors. However I
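As a hedged sketch of the alternative Jeremy is contrasting with getAll(): an async cursor over an index visits one record per event, so peak memory is roughly one image rather than the whole album. The database, store, and index names ('photos', 'by-album') below are invented for illustration; the thread's own API was still in flux at the time.

```javascript
// Sketch only: store/index names ('photos', 'by-album') are hypothetical.
// An async cursor visits one record per 'success' event instead of
// materializing the whole result set, so peak memory is roughly one
// image at a time rather than the entire album.
function eachPhotoInAlbum(db, albumName, onPhoto, onDone) {
  const tx = db.transaction(['photos'], 'readonly');
  const index = tx.objectStore('photos').index('by-album');
  const req = index.openCursor(IDBKeyRange.only(albumName));
  req.onsuccess = () => {
    const cursor = req.result;
    if (!cursor) return onDone();   // cursor is null after the last record
    onPhoto(cursor.value);          // one ~1MB image in memory at a time
    cursor.continue();              // schedules the next 'success' event
  };
}

// Pure helper for Jeremy's arithmetic: getAll() on an album materializes
// every record at once, so 1000 one-megabyte images is about a gigabyte.
function bytesMaterializedByGetAll(imageCount, bytesPerImage) {
  return imageCount * bytesPerImage;
}
```

With getAll(), the working set is bytesMaterializedByGetAll(1000, 1 << 20), a little over 1GB; with the cursor it stays near one record, at the cost of one event dispatch per value.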
Re: [IndexedDB] Event on commits (WAS: Proposal for async API changes)
Ah, good point. I hadn't thought about just using postMessage in my ontransactioncommitted handler; that'll work. Thanks.

-Mikeal

On Thu, Jun 10, 2010 at 11:48 AM, Jeremy Orlow jor...@chromium.org wrote:

On Thu, Jun 10, 2010 at 6:15 PM, Jonas Sicking jo...@sicking.cc wrote:

On Thu, Jun 10, 2010 at 6:31 AM, Andrei Popescu andr...@google.com wrote:

On Thu, Jun 10, 2010 at 1:39 PM, Jeremy Orlow jor...@chromium.org wrote:

Splitting this into its own thread since it isn't really connected to the new Async interface and that thread is already pretty big.

On Wed, Jun 9, 2010 at 10:36 PM, Mikeal Rogers mikeal.rog...@gmail.com wrote:

I've been looking through the current spec and all the proposed changes. Great work. I'm going to be building a CouchDB compatible API on top of IndexedDB that can support peer-to-peer replication without other CouchDB instances. One of the things that will entail is a by-sequence index for all the changes in a given database (in my case a database will be scoped to more than one ObjectStore). In order to accomplish this I'll need to keep the last known sequence around so that each new write can create a new entry in the by-sequence index. The problem is that if another tab/window writes to the database it'll increment that sequence and I won't be notified, so I would have to start every transaction with a check on the sequence index for the last sequence, which seems like a lot of extra cursor calls.

It would be a lot of extra calls, but I'm a bit hesitant to add much more API surface area to v1, and the fallback plan doesn't seem too unreasonable.

What I really need is an event listener on an ObjectStore that fires after a transaction is committed to the store but before the next transaction is run, and that gives me information about the commits to the ObjectStore. Thoughts?

To do this, we could specify an IndexedDatabaseRequest.ontransactioncommitted event that would be guaranteed to fire after every commit and before we started the next transaction.
I think that'd meet your needs and not add too much additional surface area... What do others think?

It sounds reasonable, but, to clarify, it seems to me that 'ontransactioncommitted' can only be guaranteed to fire after every commit and before the next transaction starts in the current window. Other transactions may have already started in other windows.

We could technically enforce that other transactions won't be allowed to start until the event has fired in all windows that have the database open.

Sure, but I can't think of any reason you'd want such semantics. Can you? Either way, though, I'm wondering how this relates to the fact that you can (in our proposal; I'm unclear what the current draft allows) have several writing transactions at the same time, as long as they operate on different tables. Here another transaction might have already started by the time another transaction is committed, be that in this window or another.

That's only true of dynamic transactions. Mikeal, were you planning on using static or dynamic transactions? If it's the latter, then I think Jonas has a good point and this API might not help you after all. That said, I can think of other uses for notifications that other windows have committed a transaction. But I think many if not all of them can be emulated with postMessage... I guess I'm starting to lean towards thinking this is extra API surface area for limited gain.

J
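The "fallback plan" discussed above — re-reading the high-water sequence at the start of every transaction instead of being pushed a commit event — could look roughly like the sketch below. The 'by-sequence' store name is invented for illustration; Mikeal's actual idbcouch layout may differ.

```javascript
// Sketch of the fallback discussed in this thread: open a reverse ('prev')
// cursor on a hypothetical 'by-sequence' store so the first record seen is
// the highest committed sequence number. All names are illustrative only.
function readLastSequence(db, onDone) {
  const tx = db.transaction(['by-sequence'], 'readonly');
  const req = tx.objectStore('by-sequence').openCursor(null, 'prev');
  req.onsuccess = () => {
    const cursor = req.result;
    onDone(cursor ? cursor.key : 0); // 0 when no changes are recorded yet
  };
}

// Pure helper: each new write is recorded under lastSeq + 1, which is why
// a cached lastSeq goes stale the moment another tab commits a write and
// has to be re-read at the start of every transaction.
function nextSequence(lastSeq) {
  return lastSeq + 1;
}
```

This is the extra cursor call per transaction that Mikeal was hoping an ontransactioncommitted event would make unnecessary.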
Re: [IndexedDB] Event on commits (WAS: Proposal for async API changes)
For some reason I thought postMessage was broadcast, but looking at it further I was entirely incorrect.

-Mikeal

On Thu, Jun 10, 2010 at 12:02 PM, Jonas Sicking jo...@sicking.cc wrote:

On Thu, Jun 10, 2010 at 12:00 PM, Mikeal Rogers mikeal.rog...@gmail.com wrote:

Ah, good point. I hadn't thought about just using postMessage in my ontransactioncommitted handler; that'll work. Thanks.

Except, how do you get a reference to the other windows that are affected? There is no API to enumerate all windows that use a given site, much less all windows that are using a given database.

/ Jonas
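A note from well after this thread: the enumeration problem Jonas raises was eventually addressed by the BroadcastChannel API, which delivers a message to every same-origin context without needing window references. A minimal sketch, with a channel name and message shape of my own invention:

```javascript
// Sketch using BroadcastChannel, an API that did not exist when this
// thread was written: it reaches all same-origin tabs without having to
// enumerate windows. Channel name and payload shape are invented here.
function announceCommit(lastSeq) {
  if (typeof BroadcastChannel === 'undefined') return false; // older runtimes
  const ch = new BroadcastChannel('idb-commits');
  ch.postMessage({ type: 'committed', lastSeq: lastSeq });
  ch.close(); // release the channel; listeners in other tabs keep theirs open
  return true;
}
```

Other tabs would hold their own `new BroadcastChannel('idb-commits')` and react in its onmessage handler, which is exactly the cross-window commit notification Mikeal was after.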