Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 Mar 2011, at 1:01 AM, Jonas Sicking wrote: Anyhow, I do think that the idea of passing in index values at the same time as a entry is created/modified is an interesting idea. And I have said so in the past on this list. It's definitely something we should consider for v2. Oh, and if we did this, I wouldn't really know how to support things like collations. Neither if you did collations using built in sets of locales (like in Pablo's recent proposal), nor if you used some sort of callback to do collation. / Jonas That's fine. You don't need to figure it out. Just look at how stateless databases have done it (or not done it) and do likewise. I submit to you that there is inadequate understanding of the concerns raised, hence the lack of urgency in trying to address them. That there is even a need for a V2 is symptomatic of this. It may be a good idea to start looking at these things not as interesting ideas but as essential database concepts. If someone were trying to build some kind of transactional indexed key value store for the web, and they wanted to do a truly great job of it, they would certainly want to learn everything they could from databases that have made contributions to the field.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On Thu, Mar 31, 2011 at 12:16 AM, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 1:01 AM, Jonas Sicking wrote: Anyhow, I do think that the idea of passing in index values at the same time as a entry is created/modified is an interesting idea. And I have said so in the past on this list. It's definitely something we should consider for v2. Oh, and if we did this, I wouldn't really know how to support things like collations. Neither if you did collations using built in sets of locales (like in Pablo's recent proposal), nor if you used some sort of callback to do collation. / Jonas That's fine. You don't need to figure it out. Just look at how stateless databases have done it (or not done it) and do likewise. I submit to you that there is inadequate understanding of the concerns raised, hence the lack of urgency in trying to address them. That there is even a need for a V2 is symptomatic of this. It may be a good idea to start looking at these things not as interesting ideas but as essential database concepts. If someone were trying to build some kind of transactional indexed key value store for the web, and they wanted to do a truly great job of it, they would certainly want to learn everything they could from databases that have made contributions to the field. I previously have asked for a detailed proposal, but so far you have not supplied one but instead keep referring to other unnamed database APIs. It has also been pointed out that there are unique constraints on APIs in browsers. For example due to the fact that several applications, i.e. pages, will be interacting with a given database. At the same time, and in some browsers from different processes. Additionally we're aiming to make an API which is easier to use, and where we can't place as much trust in the web page as you'd normally put in the user of a database API. 
For example, you've asked for callbacks to implement collations, but what do we do if those callbacks don't return consistent results? Or even do evil things like modify the stores where data is being inserted? In short, I don't think we'll get much further here without a concrete proposal. Especially not for IndexedDB v1. / Jonas
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote: I previously have asked for a detailed proposal, but so far you have not supplied one but instead keep referring to other unnamed database APIs. I have already provided an adequate interface proposal for putObject and deleteObject. I have already referenced at least Redis and Tokyo Cabinet as examples of stateless database interfaces, on numerous occasions. For example, you've asked for callbacks to implement collations, but what do we do if those callbacks don't return consistent results? I have not once asked for callbacks, let alone callbacks to implement collations. You have jumped to this conclusion from my previous post, and missed the point of it entirely.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 Mar 2011, at 9:34 AM, Jeremy Orlow wrote: We have made an effort to understand other contributions to the field. I'm not convinced that these are essential database concepts and having personally spent quite some time working with the API in JS and implementing it, I feel pretty confident that what we have for v1 is pretty solid. There are definitely some things I wouldn't mind re-visiting or looking at closer, possibly even for v1, but they all seem reasonable to study further for v2 as well. We've spent a lot of time over the last year and a half talking about IndexedDB. But now it's shipping in Firefox 4 and soon Chrome 11. So realistically v1 is not going to change much unless we are convinced that what's there is fundamentally broken. We intentionally limited the scope of v1, which is why we know there'll be a v2. We can't solve all the problems at once, and the difficulty of speccing something is typically exponential to the size of the API. Maybe a constructive way to discuss this would be to look at what use cases will be difficult or impossible to achieve with the current design? Application-managed indices for starters. I would consider that to be essential when designing indexed key/value stores, and I would consider that to be the contribution made by almost every other indexed key/value store to date. If we have to use IDB the way FriendFeed used MySQL to achieve application-managed indices then I would argue that the API is in fact fundamentally broken and we would be better off with an embedding of SQLite by Mozilla. Regarding the difficulty of speccing something is typically exponential to the size of the API, if people want to build a Rube Goldberg device then they must deal with the spec issues of that. 
If we were provided with the primitives for an indexed key/value store with application-managed indices (as Nikunj suggested at the time), we would have been well out of the starting blocks by now, and issues such as computed indexes, indexing array values etc. would have been non-issues. Summary: 1. There's a problem. 2. It can still be fixed with a minimum of fuss. 3. This requires an adjustment to the putObject and deleteObject interfaces (see previous threads).
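To make the idea of application-managed indices concrete, here is a minimal in-memory sketch. It is not the interface proposed in the earlier threads (which is not reproduced here) and not part of IndexedDB: `StatelessStore`, `IndexEntries`, `lookup` and the exact `putObject`/`deleteObject` signatures are assumptions made for illustration. The point it tries to show is that the store never parses the value and has no index schema of its own; the caller passes the index entries with each put.

```typescript
// Hypothetical sketch: none of these names come from the IndexedDB spec or
// from the proposal referred to in the thread.
type IndexEntries = Record<string, string[]>; // index name -> keys for this object

class StatelessStore {
  private objects = new Map<string, string>(); // id -> opaque (e.g. JSON) value
  private indexes = new Map<string, Map<string, Set<string>>>(); // index -> key -> ids
  private entriesById = new Map<string, IndexEntries>(); // what each id was indexed under

  // The caller states which index entries point at the object; the value is never parsed.
  putObject(id: string, value: string, entries: IndexEntries = {}): void {
    this.unindex(id); // drop entries left over from a previous version of the object
    this.objects.set(id, value);
    this.entriesById.set(id, entries);
    for (const indexName of Object.keys(entries)) {
      let index = this.indexes.get(indexName);
      if (!index) {
        index = new Map<string, Set<string>>();
        this.indexes.set(indexName, index);
      }
      for (const key of entries[indexName]) {
        let ids = index.get(key);
        if (!ids) {
          ids = new Set<string>();
          index.set(key, ids);
        }
        ids.add(id);
      }
    }
  }

  deleteObject(id: string): void {
    this.unindex(id);
    this.objects.delete(id);
  }

  getObject(id: string): string | undefined {
    return this.objects.get(id);
  }

  // Ids of the objects filed under `key` in the named index, sorted.
  lookup(indexName: string, key: string): string[] {
    const index = this.indexes.get(indexName);
    const ids = index ? index.get(key) : undefined;
    return ids ? Array.from(ids).sort() : [];
  }

  // Remove every index entry that was recorded for this id.
  private unindex(id: string): void {
    const entries = this.entriesById.get(id);
    if (!entries) return;
    for (const indexName of Object.keys(entries)) {
      const index = this.indexes.get(indexName);
      if (!index) continue;
      for (const key of entries[indexName]) {
        const ids = index.get(key);
        if (!ids) continue;
        ids.delete(id);
        if (ids.size === 0) index.delete(key);
      }
    }
    this.entriesById.delete(id);
  }
}
```

In this sketch an index comes into existence the first time a put mentions it, so there is no createIndex step. Whether a real design would remember the entries per id, as done here, or require the caller to pass the entries to remove, is left open.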
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
I was the one that asked for callbacks. but what do we do if those callbacks don't return consistent results? Or even do evil things like modify the stores where data is being inserted? If the callback maps all values to a sort-order of '1' there could only ever be one entry in the index... it's not hard: the callback is passed an immutable copy of the object and returns a sort-order as a binary-blob. If you capture the object store in the closure, you could of course do evil things as side-effects. But that is true in any non-purely-functional language; you can always do evil things with side-effects. In short, I don't think we'll get much further here without a concrete proposal. Which basically means nobody working on the current implementations understands the issues, or thinks the issues are unimportant? Cheers, Keean.
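The callback behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not an API from the thread or from IndexedDB: `CallbackIndex` and `KeyFn` are hypothetical names, the "immutable copy" is approximated with a frozen shallow copy, and the sort-order is a plain string rather than a binary blob.

```typescript
// Hypothetical sketch of a unique index whose keys come from a callback.
type KeyFn<T> = (record: Readonly<T>) => string;

class CallbackIndex<T extends object> {
  private entries = new Map<string, T>(); // sort key -> record (unique index)
  private keyFn: KeyFn<T>;

  constructor(keyFn: KeyFn<T>) {
    this.keyFn = keyFn;
  }

  put(record: T): void {
    // The callback only ever sees a frozen shallow copy, so it cannot
    // change the top-level fields of the record being stored.
    const copy = Object.freeze({ ...record });
    const key = this.keyFn(copy);
    this.entries.set(key, record); // an equal key replaces the earlier entry
  }

  get size(): number {
    return this.entries.size;
  }

  // Records ordered by their sort keys (default string order, by UTF-16 code unit).
  sorted(): T[] {
    return Array.from(this.entries.keys())
      .sort()
      .map((key) => this.entries.get(key) as T);
  }
}
```

With a key function that returns the same value for every record, the index ends up holding a single entry, which matches the "sort-order of '1'" case above. The sketch does nothing about side effects through captured variables.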
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 March 2011 08:38, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 9:34 AM, Jeremy Orlow wrote: We have made an effort to understand other contributions to the field. I'm not convinced that these are essential database concepts and having personally spent quite some time working with the API in JS and implementing it, I feel pretty confident that what we have for v1 is pretty solid. There are definitely some things I wouldn't mind re-visiting or looking at closer, possibly even for v1, but they all seem reasonable to study further for v2 as well. We've spent a lot of time over the last year and a half talking about IndexedDB. But now it's shipping in Firefox 4 and soon Chrome 11. So realistically v1 is not going to change much unless we are convinced that what's there is fundamentally broken. We intentionally limited the scope of v1, which is why we know there'll be a v2. We can't solve all the problems at once, and the difficulty of speccing something is typically exponential to the size of the API. Maybe a constructive way to discuss this would be to look at what use cases will be difficult or impossible to achieve with the current design? Application-managed indices for starters. I would consider that to be essential when designing indexed key/value stores, and I would consider that to be the contribution made by almost every other indexed key/value store to date. If we have to use IDB the way FriendFeed used MySQL to achieve application-managed indices then I would argue that the API is in fact fundamentally broken and we would be better off with an embedding of SQLite by Mozilla. Regarding the difficulty of speccing something is typically exponential to the size of the API, if people want to build a Rube Goldberg device then they must deal with the spec issues of that. 
If we were provided with the primitives for an indexed key/value store with application-managed indices (as Nikunj suggested at the time), we would have been well out of the starting blocks by now, and issues such as computed indexes, indexing array values etc. would have been non-issues. Summary: 1. There's a problem. 2. It can still be fixed with a minimum of fuss. I totally agree with everything so far... 3. This requires an adjustment to the putObject and deleteObject interfaces (see previous threads). I disagree that a simple API change is the answer. The problem is architectural, not just a superficial API issue. Cheers, Keean.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 Mar 2011, at 12:52 PM, Keean Schupke wrote: I totally agree with everything so far... 3. This requires an adjustment to the putObject and deleteObject interfaces (see previous threads). I disagree that a simple API change is the answer. The problem is architectural, not just a superficial API issue. Yes, for IndexedDB to be stateless with respect to application schema, one would need to: 1. Provide the application with a first-class means to manage indexes at time of putting/deleting objects. 2. Treat objects as opaque (remove key path, structured clone mechanisms, application must provide an id and JSON value to put/delete calls, reduces serialization/deserialization overhead where application already has the object as a string). 3. Remove setVersion (redundant, application migrates objects and indexes using transactions as it needs to). 4. Remove createIndex. This would rip so much from the spec as to reduce it to a bunch of tatters, defining nothing more than an interface for index/key/value primitives in terms of well-established interfaces. Essentially, we need LocalStorage with asynchronous IO (based on Node's callback style), large quota support, and a BTree API. Failing that, a decent FileSystem API on which to build these.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 March 2011 12:41, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 12:52 PM, Keean Schupke wrote: I totally agree with everything so far... 3. This requires an adjustment to the putObject and deleteObject interfaces (see previous threads). I disagree that a simple API change is the answer. The problem is architectural, not just a superficial API issue. Yes, for IndexedDB to be stateless with respect to application schema, one would need to: 1. Provide the application with a first-class means to manage indexes at time of putting/deleting objects. 2. Treat objects as opaque (remove key path, structured clone mechanisms, application must provide an id and JSON value to put/delete calls, reduces serialization/deserialization overhead where application already has the object as a string). 3. Remove setVersion (redundant, application migrates objects and indexes using transactions as it needs to). 4. Remove createIndex. This would rip so much from the spec as to reduce it to a bunch of tatters, defining nothing more than an interface for index/key/value primitives in terms of well-established interfaces. Essentially, we need LocalStorage with asynchronous IO (based on Node's callback style), large quota support, and a BTree API. Failing that, a decent FileSystem API on which to build these. Stateless indexes can be provided differently to how you suggest. You can have a 'validate_index' call that checks the index exists and creates it if it does not. It is stateless in the sense that you call that to open an existing index or create one; you don't care whether the database has one already or not. In fact you can make SQL stateless by providing a validate_schema call that succeeds if the schema of the database matches the passed schema, can be modified with no data loss to be the same, or needs to be created. The RelationalDB wrapper for WebSQL provides this kind of stateless approach for SQL... 
you can check it out on GitHub if you like (it's a work in progress though): https://github.com/keean/RelationalDB Cheers, Keean.
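A rough sketch of the validate idea follows. It is not taken from RelationalDB; `SchemaCatalog`, `Schema` and `ValidateResult` are hypothetical, and the upgrade policy (adding stores or indexes is lossless, dropping a store is refused) is an assumption chosen to keep the example small.

```typescript
// Hypothetical sketch of an idempotent "validate" call.
type Schema = Record<string, string[]>; // store name -> index names
type ValidateResult = "created" | "unchanged" | "upgraded";

class SchemaCatalog {
  private stored: Schema | null = null; // stands in for the schema held by the database

  // Succeeds if the stored schema matches `wanted`, can be brought in line
  // without data loss, or does not exist yet; throws otherwise.
  validate(wanted: Schema): ValidateResult {
    if (this.stored === null) {
      this.stored = copySchema(wanted);
      return "created";
    }
    // Refuse before changing anything if a store would have to be dropped.
    for (const name of Object.keys(this.stored)) {
      if (!(name in wanted)) {
        throw new Error("cannot upgrade automatically: store '" + name + "' would be dropped");
      }
    }
    let upgraded = false;
    for (const name of Object.keys(wanted)) {
      const existing = this.stored[name];
      if (existing === undefined) {
        this.stored[name] = wanted[name].slice(); // new store: lossless
        upgraded = true;
        continue;
      }
      for (const index of wanted[name]) {
        if (existing.indexOf(index) === -1) {
          existing.push(index); // new index: lossless
          upgraded = true;
        }
      }
    }
    return upgraded ? "upgraded" : "unchanged";
  }
}

function copySchema(schema: Schema): Schema {
  const copy: Schema = {};
  for (const name of Object.keys(schema)) copy[name] = schema[name].slice();
  return copy;
}
```

The caller issues the same `validate` call on every start and does not need to know whether the database already exists, which is the sense of "stateless" used above.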
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On Thu, Mar 31, 2011 at 1:38 AM, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 9:34 AM, Jeremy Orlow wrote: We have made an effort to understand other contributions to the field. I'm not convinced that these are essential database concepts and having personally spent quite some time working with the API in JS and implementing it, I feel pretty confident that what we have for v1 is pretty solid. There are definitely some things I wouldn't mind re-visiting or looking at closer, possibly even for v1, but they all seem reasonable to study further for v2 as well. We've spent a lot of time over the last year and a half talking about IndexedDB. But now it's shipping in Firefox 4 and soon Chrome 11. So realistically v1 is not going to change much unless we are convinced that what's there is fundamentally broken. We intentionally limited the scope of v1, which is why we know there'll be a v2. We can't solve all the problems at once, and the difficulty of speccing something is typically exponential to the size of the API. Maybe a constructive way to discuss this would be to look at what use cases will be difficult or impossible to achieve with the current design? Application-managed indices for starters That's not a use case. I would consider that to be essential when designing indexed key/value stores, and I would consider that to be the contribution made by almost every other indexed key/value store to date. If we have to use IDB the way FriendFeed used MySQL to achieve application-managed indices then I would argue that the API is in fact fundamentally broken and we would be better off with an embedding of SQLite by Mozilla. Regarding the difficulty of speccing something is typically exponential to the size of the API, if people want to build a Rube Goldberg device then they must deal with the spec issues of that. 
If we were provided with the primitives for an indexed key/value store with application-managed indices (as Nikunj suggested at the time), we would have been well out of the starting blocks by now, and issues such as computed indexes, indexing array values etc. would have been non-issues. Summary: 1. There's a problem. 2. It can still be fixed with a minimum of fuss. 3. This requires an adjustment to the putObject and deleteObject interfaces (see previous threads).
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On Thu, Mar 31, 2011 at 5:41 AM, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 12:52 PM, Keean Schupke wrote: I totally agree with everything so far... 3. This requires an adjustment to the putObject and deleteObject interfaces (see previous threads). I disagree that a simple API change is the answer. The problem is architectural, not just a superficial API issue. Yes, for IndexedDB to be stateless with respect to application schema, one would need to: 1. Provide the application with a first-class means to manage indexes at time of putting/deleting objects. I'm OK with doing this for v1 if the others are. It doesn't seem like that big of an addition and it would give a decent amount of additional flexibility. 2. Treat objects as opaque (remove key path, Key paths are quite useful. I agree that making it possible to use statelessly is good, but I don't see any reason why making it 100% stateless should be a goal. structured clone mechanisms For sure, not going to happen. , application must provide an id and JSON value to put/delete calls, reduces serialization/deserialization overhead where application already has the object as a string). I'm not sure why you think this would reduce overhead. 3. Remove setVersion (redundant, application migrates objects and indexes using transactions as it needs to). 4. Remove createIndex. Like I said above, although I think we should make it possible to operate more statelessly, I don't see a reason we need to remove stuff like this. Some users will find it more convenient to work this way. J
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote: I previously have asked for a detailed proposal, but so far you have not supplied one but instead keep referring to other unnamed database APIs. I have already provided an adequate interface proposal for putObject and deleteObject. That is hardly a comprehensive proposal, but rather just one small part of it. I do really think the idea of not having the implementation keep track of the set of indexes for an objectStore is a really interesting one. As is the idea of not even having a set set of objectStores. However, there are several problems that need to be solved. In particular how do you deal with collations? I.e. we have concluded that there are important use cases which require using different collations for different indexes and objectStores. Even for different indexes attached to the same objectStore. Additionally, if we're getting rid of setVersion, how do we expect pages to deal with the (application-managed) schema changing while the page has a connection open to the database? So pretty please, with sugar on top, please come up with a proposal for the full API rather than bits and pieces. And I should mention that I have as an absolute requirement that you should be able to specify collation by simply saying that you want to use en-US or sv-SV sorting. Using callbacks or other means is ok *in addition to this*, but callback mechanisms tend to be a lot more complex since they have to deal with the callback doing all sorts of evil things such as returning inconsistent results (think return Math.random()), or simply doing evil things like navigating the current page, deleting the database, or modifying the record that is in the process of being stored. / Jonas
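The "just name a locale" requirement might look like the sketch below from the application's side. `LocaleIndex` is a hypothetical class, not an IndexedDB interface; the only real API used is the standard ECMAScript `Intl.Collator`. Which locale tags are supported, and the exact order they produce, depends on the runtime's locale data, so no particular Swedish ordering is asserted here.

```typescript
// Sketch only: an index whose ordering is chosen by naming a locale.
class LocaleIndex {
  private keys: string[] = [];
  private collator: Intl.Collator;

  constructor(locale: string) {
    // Intl.Collator is standard ECMAScript; the available locales vary by runtime.
    this.collator = new Intl.Collator(locale);
  }

  add(key: string): void {
    this.keys.push(key);
    // Re-sorting on every insert keeps the sketch short; a real index would not.
    this.keys.sort((a, b) => this.collator.compare(a, b));
  }

  // A copy of the keys in collation order.
  inOrder(): string[] {
    return this.keys.slice();
  }
}
```

Two indexes over the same data can be given different collations simply by constructing them with different locale names, for example `new LocaleIndex("en-US")` and `new LocaleIndex("sv")`. Because the comparison is supplied by the platform rather than by page script, the concerns about inconsistent or side-effecting callbacks do not arise in this form.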
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote: On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.com wrote: On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote: On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote: I previously have asked for a detailed proposal, but so far you have not supplied one but instead keep referring to other unnamed database APIs. I have already provided an adequate interface proposal for putObject and deleteObject. That is hardly a comprehensive proposal, but rather just one small part of it. I wanted to make a few comments about these points :- I do really think the idea of not having the implementation keep track of the set of indexes for an objectStore is a really interesting one. As is the idea of not even having a set set of objectStores. However, there are several problems that need to be solved. In particular how do you deal with collations? no indexes, no object stores... well I for one prefer the validate_object_store, validate_index approach, in that it can hide statefulness if necessary (like I do with RelationalDB) whilst presenting a stateless API. It also keeps the size of the put statements down. I.e. we have concluded that there are important use cases which require using different collations for different indexes and objectStores. Even for different indexes attached to the same objectStore. Additionally, if we're getting rid of setVersion, how do we expect pages to deal with the (application-managed) schema changing while the page has a connection open to the database? 1 - there is no schema 2 - don't allow it to change whilst the database is open In reality a schema is implicitly tied to a code version. In other words the source code of the application assumes a certain schema. If the assumed schema and the schema in the DB do not match things are going to go very wrong very quickly. 
Schema changes _always_ accompany code changes (otherwise they are not schema changes just data changes). As such they never happen when a DB is open. The way I handle this in RelationalDB, by validating the actual schema against the source-code schema in the db-open (actually the method is called validate), is probably the best way to handle this. If the database does not exist we create it according to the schema. If it exists we check it matches the schema. If there is a difference we see if we can 'upgrade' the database automatically (certain changes like adding a new column with a default value can be done automatically); if we cannot upgrade automatically, we exit with an error - as allowing the program to run will result in corruption of the data already in the database. At this point it is up to the application to figure out how to upgrade the database (by opening one database with an old schema and another with a new schema)... There is no point in ever allowing a database to be opened with the wrong schema. So pretty please, with sugar on top, please come up with a proposal for the full API rather than bits and pieces. And I should mention that I have as an absolute requirement that you should be able to specify collation by simply saying that you want to use en-US or sv-SV sorting. Using callbacks or other means is ok *in addition to this*, but callback mechanisms tend to be a lot more complex since they have to deal with the callback doing all sorts of evil things such as returning inconsistent results (think return Math.random()), or simply doing evil things like navigating the current page, deleting the database, or modifying the record that is in the process of being stored. The core API only needs to deal with sorting binary-blob sort orders. A library wrapper could provide all the collation ordering goodness that people want. For example RelationalDB will have to deal with sorting orders; it does not need the browser to provide that functionality. 
In fact browser-provided functionality may limit what can be done in libraries on top. This is difficult if not impossible to do. See previous threads on the matter. J I can find a lot of stuff on collation, but not a lot about why it could not be done in a library. Could you summarise the reasons why this needs to be core functionality for me? A library could choose to use an object store as meta-data to store the collation orders that it is using for various indexes for example. Cheers, Keean.
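The split suggested here, a core that only compares binary sort keys and a library that decides how to produce them, could be sketched as below. All names are hypothetical, and the "collation" is a deliberately crude case-insensitive key rather than anything locale-aware.

```typescript
// Core side (hypothetical): order entries by comparing sort-key bytes only.
function compareBytes(a: Uint8Array, b: Uint8Array): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length; // a shorter key sorts before its extensions
}

// Library side (hypothetical): derive a sort key from a string.
// Lower-casing gives a crude case-insensitive order; each UTF-16 code unit is
// written big-endian so that byte order matches code-unit order.
function caseInsensitiveKey(text: string): Uint8Array {
  const lower = text.toLowerCase();
  const bytes = new Uint8Array(lower.length * 2);
  for (let i = 0; i < lower.length; i++) {
    const unit = lower.charCodeAt(i);
    bytes[2 * i] = unit >> 8;
    bytes[2 * i + 1] = unit & 0xff;
  }
  return bytes;
}

// Sort values by the keys a library-supplied function assigns to them.
function sortBySortKey(values: string[], keyOf: (value: string) => Uint8Array): string[] {
  return values
    .map((value) => ({ value, key: keyOf(value) }))
    .sort((x, y) => compareBytes(x.key, y.key))
    .map((entry) => entry.value);
}
```

Here the keys are computed once per value and the core never calls back into library code while comparing, which is one way to sidestep the inconsistent-callback concern. The sketch does not address the objections raised in the earlier collation threads, which are not reproduced in this message.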
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On Thu, Mar 31, 2011 at 11:24 AM, Keean Schupke ke...@fry-it.com wrote: On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote: On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.com wrote: On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote: On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote: I previously have asked for a detailed proposal, but so far you have not supplied one but instead keep referring to other unnamed database APIs. I have already provided an adequate interface proposal for putObject and deleteObject. That is hardly a comprehensive proposal, but rather just one small part of it. I wanted to make a few comments about these points :- I do really think the idea of not having the implementation keep track of the set of indexes for an objectStore is a really interesting one. As is the idea of not even having a set set of objectStores. However, there are several problems that need to be solved. In particular how do you deal with collations? no indexes, no object stores... well I for one prefer the validate_object_store, validate_index approach, in that it can hide statefulness if necessary (like I do with RelationalDB) whilst presenting a stateless API. It also keeps the size of the put statements down. I.e. we have concluded that there are important use cases which require using different collations for different indexes and objectStores. Even for different indexes attached to the same objectStore. Additionally, if we're getting rid of setVersion, how do we expect pages to deal with the (application-managed) schema changing while the page has a connection open to the database? 1 - there is no schema 2 - don't allow it to change whilst the database is open In reality a schema is implicitly tied to a code version. In other words the source code of the application assumes a certain schema. 
If the assumed schema and the schema in the DB do not match things are going to go very wrong very quickly. Schema changes _always_ accompany code changes (otherwise they are not schema changes just data changes). As such they never happen when a DB is open. The way I handle this in RelationalDB, by validating the actual schema against the source-code schema in the db-open (actually the method is called validate), is probably the best way to handle this. If the database does not exist we create it according to the schema. If it exists we check it matches the schema. If there is a difference we see if we can 'upgrade' the database automatically (certain changes like adding a new column with a default value can be done automatically); if we cannot upgrade automatically, we exit with an error - as allowing the program to run will result in corruption of the data already in the database. At this point it is up to the application to figure out how to upgrade the database (by opening one database with an old schema and another with a new schema)... There is no point in ever allowing a database to be opened with the wrong schema. So pretty please, with sugar on top, please come up with a proposal for the full API rather than bits and pieces. And I should mention that I have as an absolute requirement that you should be able to specify collation by simply saying that you want to use en-US or sv-SV sorting. Using callbacks or other means is ok *in addition to this*, but callback mechanisms tend to be a lot more complex since they have to deal with the callback doing all sorts of evil things such as returning inconsistent results (think return Math.random()), or simply doing evil things like navigating the current page, deleting the database, or modifying the record that is in the process of being stored. The core API only needs to deal with sorting binary-blob sort orders. A library wrapper could provide all the collation ordering goodness that people want. 
For example RelationalDB will have to deal with sorting orders; it does not need the browser to provide that functionality. In fact browser-provided functionality may limit what can be done in libraries on top. This is difficult if not impossible to do. See previous threads on the matter. J I can find a lot of stuff on collation, but not a lot about why it could not be done in a library. Could you summarise the reasons why this needs to be core functionality for me? Sorry, but that stuff is paged out of my brain. Pablo, can you? A library could choose to use an object store as meta-data to store the collation orders that it is using for various indexes for example. Cheers, Keean.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 Mar 2011, at 7:27 PM, Jeremy Orlow wrote: 1. Provide the application with a first-class means to manage indexes at time of putting/deleting objects. I'm OK with doing this for v1 if the others are. It doesn't seem like that big of an addition and it would give a decent amount of additional flexibility. Thanks Jeremy, that would be great. (reduces serialization/deserialization overhead where application already has the object as a string) I'm not sure why you think this would reduce overhead. How long would it take an iPad to JSON deserialize/serialize 500 / 5,000 / 50,000 / 500,000 / 5,000,000 2KB objects? That's a reasonable device and those are reasonable workloads. In its present state, IndexedDB needs to do this every time setVersion is called with a createIndex in there... you see the problem is there's no way for the application to control this. The application would arguably be able to find better ways of migrating indexes than using key paths which necessitate deserialization/serialization to be performed on the client. For instance, you could use batch jobs on the server to do this on behalf of clients, and this would make sense especially where many clients/devices share the same objects. With IndexedDB this is not possible. With pure storage primitives it would have been possible. This is just one use-case, and for every one of these there will be plenty more. Like I said above, although I think we should make it possible to operate more statelessly, I don't see a reason we need to remove stuff like this. Some users will find it more convenient to work this way. Agreed on both counts. It is clearly too late to remove it now. But it may be a good idea in future to keep the focus on providing low-level primitives rather than convenience features, since the latter often get in the way of the former.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 March 2011 18:36, Jeremy Orlow jor...@chromium.org wrote: On Thu, Mar 31, 2011 at 11:24 AM, Keean Schupke ke...@fry-it.com wrote: On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote: On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.com wrote: On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote: On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote: I previously have asked for a detailed proposal, but so far you have not supplied one but instead keep referring to other unnamed database APIs. I have already provided an adequate interface proposal for putObject and deleteObject. That is hardly a comprehensive proposal, but rather just one small part of it. I wanted to make a few comments about these points :- I do really think the idea of not having the implementation keep track of the set of indexes for an objectStore is a really interesting one. As is the idea of not even having a set set of objectStores. However, there are several problems that need to be solved. In particular how do you deal with collations? no indexes, no object stores... well I for one prefer the validate_object_store, validate_index approach, in that it can hide statefulness if necessary (like I do with RelationalDB) whilst presenting a stateless API. It also keeps the size of the put statements down. I.e. we have concluded that there are important use cases which require using different collations for different indexes and objectStores. Even for different indexes attached to the same objectStore. Additionally, if we're getting rid of setVersion, how do we expect pages to deal with the (application-managed) schema changing while the page has a connection open to the database? 1 - there is no schema 2 - don't allow it to change whilst the database is open In reality a schema is implicitly tied to a code version. In other words the source code of the application assumes a certain schema. 
If the assumed schema and the schema in the DB do not match, things are going to go very wrong very quickly. Schema changes _always_ accompany code changes (otherwise they are not schema changes, just data changes). As such they never happen when a DB is open. The way I handle this in RelationalDB, by validating the actual schema against the source-code schema in the db-open (actually the method is called validate), is probably the best way to handle this. If the database does not exist we create it according to the schema. If it exists we check it matches the schema. If there is a difference we see if we can 'upgrade' the database automatically (certain changes like adding a new column with a default value can be done automatically); if we cannot automatically upgrade, we exit with an error, as allowing the program to run will result in corruption of the data already in the database. At this point it is up to the application to figure out how to upgrade the database (by opening one database with an old schema and another with a new schema)... There is no point in ever allowing a database to be opened with the wrong schema. So pretty please, with sugar on top, please come up with a proposal for the full API rather than bits and pieces. And I should mention that I have as an absolute requirement that you should be able to specify collation by simply saying that you want to use en-US or sv-SV sorting. Using callbacks or other means is ok *in addition to this*, but callback mechanisms tend to be a lot more complex since they have to deal with the callback doing all sorts of evil things such as returning inconsistent results (think return Math.random()), or simply doing evil things like navigating the current page, deleting the database, or modifying the record that is in the process of being stored. The core API only needs to deal with sorting binary-blob sort orders. A library wrapper could provide all the collation ordering goodness that people want.
For example RelationalDB will have to deal with sorting orders; it does not need the browser to provide that functionality. In fact browser-provided functionality may limit what can be done in libraries on top. This is difficult if not impossible to do. See previous threads on the matter. J I can find a lot of stuff on collation, but not a lot about why it could not be done in a library. Could you summarise the reasons why this needs to be core functionality for me? Sorry, but that stuff is paged out of my brain. Pablo, can you? A library could choose to use an object store as meta-data to store the collation orders that it is using for various indexes, for example. Cheers, Keean. Thanks, that would help me understand. As long as there is a way to turn default collation off and just have a binary string sort order, that's fine for my needs. Cheers, Keean.
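The validate-on-open flow Keean describes (create if missing, accept if matching, auto-upgrade when only defaulted columns were added, otherwise refuse to open) can be sketched roughly as follows. This is a minimal illustration, not RelationalDB's actual code and not an IndexedDB API: the function name validateSchema, the plain-object schema shape, and the returned action values are all assumptions made for the example.

```javascript
// Hypothetical sketch: a schema is a plain object mapping column names to
// { type, default? }. `stored` is null when the database does not exist yet.
function validateSchema(stored, expected) {
  if (stored === null) {
    // No database yet: create it according to the source-code schema.
    return { action: 'create', schema: expected };
  }
  const added = [];
  for (const [name, column] of Object.entries(expected)) {
    const old = stored[name];
    if (old === undefined) {
      // A new column can only be filled in automatically if it has a default.
      if (!('default' in column)) {
        throw new Error(`cannot auto-upgrade: new column "${name}" has no default`);
      }
      added.push(name);
    } else if (old.type !== column.type) {
      throw new Error(`cannot auto-upgrade: column "${name}" changed type`);
    }
  }
  for (const name of Object.keys(stored)) {
    if (!(name in expected)) {
      throw new Error(`cannot auto-upgrade: column "${name}" was removed`);
    }
  }
  return added.length === 0
    ? { action: 'ok', schema: stored }
    : { action: 'upgrade', added, schema: expected };
}
```

Throwing on any change that cannot be applied automatically mirrors the point made above: the application should never run against a database whose schema it does not expect.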
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 3/31/2011 11:47 AM, Joran Greef wrote: Let those who introduced these design flaws be among the first to take responsibility and fix them. You aren't being constructive, and that's a surefire way to be ignored. You have yet to convince the working group that these are design flaws in the first place. /sdwilsh
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 Mar 2011, at 10:07 PM, Shawn Wilsher wrote: On 3/31/2011 11:47 AM, Joran Greef wrote: Let those who introduced these design flaws be among the first to take responsibility and fix them. You aren't being constructive, and that's a surefire way to be ignored. You have yet to convince the working group that these are design flaws in the first place. /sdwilsh Agreed. I am actively using the API with real-world data and I am providing feedback. You are welcome to use it or not. It is not for me to convince anyone. As I said, if people think there is a problem, let those who introduced it fix it. Joran Greef
RE: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Thursday, March 31, 2011 11:36 AM
I can find a lot of stuff on collation, but not a lot about why it could not be done in a library. Could you summarise the reasons why this needs to be core functionality for me? Sorry, but that stuff is paged out of my brain. Pablo, can you? A library could choose to use an object store as meta-data to store the collation orders that it is using for various indexes, for example.
- Currently there are no APIs in JavaScript to compare strings using specific collations. There are folks that are looking into this, but it will need time.
- I'm far from an expert in the topic, but from talking to folks that understand this well it seems that to actually implement this entirely in JavaScript you would have to download collation tables and apply them as needed in callbacks. Not only does this mean a hit in download size/time for the app, but also that callbacks have to either download stuff or inline collation rules/tables in the callback itself.
- In pure practical terms, I suspect the 80% scenario can be covered by implementing this natively, having it be fast and simple to use for common cases. Not pushing back on the callback stuff, just saying that I find it valuable to have users simply say en-US and get what they wanted.
- Also from a practical perspective, simple cases that don't require the flexibility can avoid having to make the callbacks perfectly consistent, even as you roll out updates that may hit only some of the pages, use components written by someone else, etc.
- By default we would still do binary collation (there was a question in the thread, I forget exactly where).
Thanks -pablo
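To make the difference between binary collation and a named-locale collation concrete, here is a small sketch. It relies on Intl.Collator from the ECMAScript Internationalization API, which was standardized after this thread and was not available to the participants; it is shown only to illustrate the "simply say en-US" behaviour discussed above, not as anything IndexedDB itself provides. The variable names are illustrative.

```javascript
// Intl.Collator postdates this thread; it is used here only to contrast
// binary (UTF-16 code unit) ordering with locale-aware ordering.
const words = ['b', 'a', 'B'];

// Default Array.prototype.sort compares strings by UTF-16 code units,
// so every uppercase ASCII letter sorts before every lowercase one.
const binaryOrder = [...words].sort();

// A collator for a named locale compares strings by that locale's rules.
const collator = new Intl.Collator('en-US');
const localeOrder = [...words].sort(collator.compare);

console.log(binaryOrder, localeOrder);
```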
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
"Currently there are no APIs in JavaScript to compare strings using specific collations." We don't actually need this, just a mapping from a UTF-16 string to a sort-score (binary blob). It's true that downloading the collation tables might take time, so we could just provide: var blob = string_to_score('utf-16 string', 'en-US'); as a built-in function to make this efficient. I agree with the other points though. Cheers, Keean. On 31 March 2011 22:38, Pablo Castro pablo.cas...@microsoft.com wrote: [snip]
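The sort-score idea can be illustrated with a toy key function. The sketch below is not the proposed string_to_score and implements no real locale's collation; toySortKey and sortByKey are hypothetical names. It shows only the mechanism: derive a key per string so that a plain binary comparison of the keys yields the desired order (here, a crude case-insensitive order with the original string as a tiebreak).

```javascript
// Toy stand-in for a collation sort key: lowercase form first, then a
// separator, then the original string so that distinct inputs get distinct keys.
function toySortKey(text) {
  return text.toLowerCase() + '\u0000' + text;
}

// Sort values by binary comparison of their keys only; the comparison itself
// knows nothing about case or locale.
function sortByKey(values) {
  return [...values].sort((x, y) => {
    const keyX = toySortKey(x);
    const keyY = toySortKey(y);
    return keyX < keyY ? -1 : keyX > keyY ? 1 : 0;
  });
}

console.log(sortByKey(['cherry', 'Banana', 'apple']));
```

A store that only understands binary ordering could index such keys directly, which is the division of labour Keean suggests between the core API and a library.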
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On Sat, Mar 26, 2011 at 1:14 AM, Nikunj Mehta nik...@o-micron.com wrote: What is the minimum that can be in IDB? I am guessing the following: 1. Sorted key-opaque value transactional store 2. Lookup of keys by values (or parts thereof) #1 is essential. #2 is unavoidable because you would want to efficiently manipulate values by values as opposed to values by key. I know of no efficient way of doing callbacks with JS. Moreover, avoiding indices completely seems to miss the point. Yes, IDB can be used without key paths and indices. When you do that, you would not have any headache of setVersion since every version change either adds or removes an object store. Next, originally, I also had floated the idea of application managed indices, but implementors thought of it as cruft. For what it's worth, I'm not sure anyone ever thought it was cruft. The main problem, IMHO, was that it was underdefined. It also created a somewhat awkward API since even indexes which were not manually managed would have functions for explicitly managing them. (It also wouldn't help with the state issues that are raised in the original email in this thread). Anyhow, I do think that the idea of passing in index values at the same time as an entry is created/modified is an interesting idea. And I have said so in the past on this list. It's definitely something we should consider for v2. However this still wouldn't solve the state issue raised in this thread. The browser still keeps track of the set of objectStores. You could get rid of that by changing the APIs such that there isn't a set list of object stores. I.e. remove IDBDatabase.objectStoreNames, IDBDatabase.createObjectStore and IDBDatabase.deleteObjectStore and make IDBDatabase.transaction allow any object store names to be passed to it. This would let you create a new objectStore by simply starting a transaction which uses the new name and start storing data into it. But even then you still want the setVersion API.
Otherwise the web page has no way of migrating stored data to a new schema. Even though the browser doesn't keep track of a schema, the app still does, and likely will want to change that from time to time. / Jonas
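The variant Jonas outlines, where there is no fixed list of object stores and a transaction may name any store, can be modelled in memory along these lines. This is a hypothetical sketch of the idea only: ImplicitStoreDb and its methods are invented for illustration, are synchronous, and do not correspond to the real IDBDatabase interface or to any shipped behaviour.

```javascript
// Hypothetical in-memory model: stores come into existence the first time a
// transaction touches them, so there is no createObjectStore step.
class ImplicitStoreDb {
  constructor() {
    this.stores = new Map(); // store name -> Map of key -> value
  }

  transaction(storeNames) {
    const db = this;
    return {
      objectStore(name) {
        if (!storeNames.includes(name)) {
          throw new Error(`"${name}" is outside this transaction's scope`);
        }
        if (!db.stores.has(name)) {
          db.stores.set(name, new Map()); // created implicitly on first use
        }
        const store = db.stores.get(name);
        return {
          put(key, value) { store.set(key, value); },
          get(key) { return store.get(key); },
        };
      },
    };
  }
}
```

Note that nothing in this model tells an already-open page that the application's own idea of the schema has changed, which is the gap Jonas argues setVersion would still need to fill.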
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
What is the minimum that can be in IDB? I am guessing the following: 1. Sorted key-opaque value transactional store 2. Lookup of keys by values (or parts thereof) #1 is essential. #2 is unavoidable because you would want to efficiently manipulate values by values as opposed to values by key. I know of no efficient way of doing callbacks with JS. Moreover, avoiding indices completely seems to miss the point. Yes, IDB can be used without key paths and indices. When you do that, you would not have any headache of setVersion since every version change either adds or removes an object store. Next, originally, I also had floated the idea of application managed indices, but implementors thought of it as cruft. On Sun, Mar 20, 2011 at 3:10 PM, Joran Greef jo...@ronomon.com wrote: On 20 Mar 2011, at 4:54 AM, Jonas Sicking wrote: I don't understand what you are saying about application state though, so please do start that as a separate thread. At present, there's no way for an application to tell IDB what indexes to modify w.r.t. an object at the exact moment when putting or deleting that object. That's because this behavior is defined in advance using createIndex in a setVersion transaction. And then how IDB extracts the referenced value from the object is done using an IDB idea of key paths. But right there, in defining the indexes in advance (and not when the index is actually modified, which is when the object itself is modified), you've captured application state (data relationships that should be known only to the application) within IDB. Because this is done in advance (because IDB seems to have inherited this assumption that this is just the way MySQL happens to do it), there's a disconnect between when the index is defined and when it's actually used. And because of key paths you now need to spec out all kinds of things like how to handle compound keys, multiple values. It's becoming a bit of a spec-fest. 
That this bubble of state gets captured in IDB, it also means that IDB now needs to provide ways of updating that captured state within IDB when it changes in the application (which will happen, so essentially you now have your indexing logic stuck in the database AND in the application and the application developer now has to try and keep BOTH in sync using this awkward pre-defined indexes interface), thus the need for a setVersion transaction in the first place. None of this would be necessary if the application could reference indexes to be modified (and created if they don't exist, or deleted if they would then become empty) AT THE POINT of putting or deleting an object. Things like data migrations would also be better served if this were possible since this is something the application would need to manage anyway. Do you follow? The application is the right place to be handling indexing logic. IDB just needs to provide an interface to the indexing implementation, but not handle extracting values from objects or deciding which indexes to modify. That's the domain of the application. It's a question of encapsulation. IDB is crossing the boundaries by demanding to know ABOUT the data stored, and not just providing a simple way to put an object, and a simple way to put a reference to an object to an index, and a simple way to query an index and intersect or union an index with another. Essentially an object and its index memberships need to be completely opaque to IDB and you are doing the opposite. Take a look at the BDB interface. Do you see a setVersion or createIndex semantic in there? BDB has secondary databases, which are the same as indices with a one to many relation between primary and secondary database. Moreover, BDB uses application callbacks to let the application encapsulate the definition of the index. Take a look at Redis and Tokyo and many other things. Do you see a setVersion or createIndex semantic in there? 
Do these databases have any idea about the contents of objects? Any concept of key paths? I, for one, am not enamored by key paths. However, I am also morbidly aware of the perils in JS land when using callback like mechanisms. Certainly, I would like to hear from developers like you how you find IDB if you were to not use any createIndex at all. Or at least that you would like to manage your own indices. No, and that's the whole reason these databases were created in the first place. I'm sure you have read the BDB papers. Obviously this is not the approach of MySQL. But if IDB is trying to be MySQL but saying it wants to be BDB then I don't know. In any event, Firefox would be brave to also embed SQLite. Let the better API win. How much simpler could it be? At the end of the day, it's all objects and sets and sorted sets, and see Redis' epiphany on this point. IDB just needs to provide transactional access to these sets. The application must decide what goes in and out of these sets, and must be able to do
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 26 Mar 2011, at 10:14 AM, Nikunj Mehta wrote: What is the minimum that can be in IDB? I am guessing the following: 1. Sorted key-opaque value transactional store 2. Lookup of keys by values (or parts thereof) Yes, this is what we need. In programmer speak: objects (opaque strings), sets (hash indexes), sorted sets (range indexes). I know of no efficient way of doing callbacks with JS. Moreover, avoiding indices completely seems to miss the point. Callbacks are unnecessary. This is what you would want to do as a developer using the current form of IDB: objectStore.putObject({ name: 'Joran', emails: ['jo...@gmail.com', 'jo...@ronomon.com'] }, { id: 'arbitraryObjectIdProvidedByTheApplication', indexes: ['emails=jo...@gmail.com', 'emails=jo...@ronomon.com', 'name=Joran'] }); IDB would then store the user object using the id provided by the application, and make sure it's referenced by this id in the emails=jo...@gmail.com, emails=jo...@ronomon.com, name=Joran index references provided (creating these indexes along the way if need be). The application is responsible for passing in the extra id and indexes options to putObject. Supporting range indexes would be a question of expanding the above to let the developer pass in a sort score along with the index reference. Next, originally, I also had floated the idea of application managed indices, but implementors thought of it as cruft. I can understand how application managed indices would lead to less work on the part of the spec committee. There seems to be some perverse human characteristic that likes to make easy things difficult. Ships will sail around the world but the Flat Earth Society will flourish. I, for one, am not enamored by key paths. However, I am also morbidly aware of the perils in JS land when using callback like mechanisms. Certainly, I would like to hear from developers like you how you find IDB if you were to not use any createIndex at all. Or at least that you would like to manage your own indices.
I am begging to be able to manage my indices. I know my data. I do not want to use any createIndex to declare indexes in advance of when I may or may not use them. What advantage would that give me? I want to create/update indexes only when I put or delete objects and I want to have control over which indexes to update accordingly. With one small change to the putObject and deleteObject interfaces, in the form of the indexes option, we can make that possible. We need these primitives in IDB: opaque strings, sets, sorted sets. Ideally, IDB need simply store these things and provide the standard interfaces (see Redis) to them along with a transactional mechanism. That's the perfect low-level API on which to build almost any database wrapper.
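The putObject/deleteObject interface with an indexes option can be sketched as a small in-memory model. This is an illustration under assumptions, not an IndexedDB API: the OpaqueStore class, the query method, and the choice to have deleteObject take the same options object are all hypothetical, and transactions are left out entirely.

```javascript
// Hypothetical model of application-managed indexes: the store never inspects
// object contents; the caller names every index an object belongs to.
class OpaqueStore {
  constructor() {
    this.objects = new Map(); // id -> opaque serialized string
    this.indexes = new Map(); // index reference -> Set of ids
  }

  putObject(value, { id, indexes = [] }) {
    this.objects.set(id, JSON.stringify(value));
    for (const name of indexes) {
      if (!this.indexes.has(name)) {
        this.indexes.set(name, new Set()); // index created on first use
      }
      this.indexes.get(name).add(id);
    }
  }

  deleteObject({ id, indexes = [] }) {
    this.objects.delete(id);
    for (const name of indexes) {
      const members = this.indexes.get(name);
      if (!members) continue;
      members.delete(id);
      if (members.size === 0) {
        this.indexes.delete(name); // index dropped once it becomes empty
      }
    }
  }

  query(name) {
    const members = this.indexes.get(name) || new Set();
    return [...members]
      .filter((id) => this.objects.has(id))
      .map((id) => JSON.parse(this.objects.get(id)));
  }
}
```

In this model the application must pass the same index references on delete that it passed on put; that bookkeeping is exactly the responsibility Joran wants to move from IDB to the application.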
[IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 20 Mar 2011, at 4:54 AM, Jonas Sicking wrote: I don't understand what you are saying about application state though, so please do start that as a separate thread. At present, there's no way for an application to tell IDB what indexes to modify w.r.t. an object at the exact moment when putting or deleting that object. That's because this behavior is defined in advance using createIndex in a setVersion transaction. And then how IDB extracts the referenced value from the object is done using an IDB idea of key paths. But right there, in defining the indexes in advance (and not when the index is actually modified, which is when the object itself is modified), you've captured application state (data relationships that should be known only to the application) within IDB. Because this is done in advance (because IDB seems to have inherited this assumption that this is just the way MySQL happens to do it), there's a disconnect between when the index is defined and when it's actually used. And because of key paths you now need to spec out all kinds of things like how to handle compound keys, multiple values. It's becoming a bit of a spec-fest. That this bubble of state gets captured in IDB, it also means that IDB now needs to provide ways of updating that captured state within IDB when it changes in the application (which will happen, so essentially you now have your indexing logic stuck in the database AND in the application and the application developer now has to try and keep BOTH in sync using this awkward pre-defined indexes interface), thus the need for a setVersion transaction in the first place. None of this would be necessary if the application could reference indexes to be modified (and created if they don't exist, or deleted if they would then become empty) AT THE POINT of putting or deleting an object. Things like data migrations would also be better served if this were possible since this is something the application would need to manage anyway. 
Do you follow? The application is the right place to be handling indexing logic. IDB just needs to provide an interface to the indexing implementation, but not handle extracting values from objects or deciding which indexes to modify. That's the domain of the application. It's a question of encapsulation. IDB is crossing the boundaries by demanding to know ABOUT the data stored, and not just providing a simple way to put an object, and a simple way to put a reference to an object to an index, and a simple way to query an index and intersect or union an index with another. Essentially an object and its index memberships need to be completely opaque to IDB and you are doing the opposite. Take a look at the BDB interface. Do you see a setVersion or createIndex semantic in there? Take a look at Redis and Tokyo and many other things. Do you see a setVersion or createIndex semantic in there? Do these databases have any idea about the contents of objects? Any concept of key paths? No, and that's the whole reason these databases were created in the first place. I'm sure you have read the BDB papers. Obviously this is not the approach of MySQL. But if IDB is trying to be MySQL but saying it wants to be BDB then I don't know. In any event, Firefox would be brave to also embed SQLite. Let the better API win. How much simpler could it be? At the end of the day, it's all objects and sets and sorted sets, and see Redis' epiphany on this point. IDB just needs to provide transactional access to these sets. The application must decide what goes in and out of these sets, and must be able to do it when it wants to, not some time in advance. I bring this up because I once wrote the exact same kind of database that you are writing now (where one thinks it would be good if the database did NOT treat objects as opaque... that the database should be smart about the contents of objects and share control for how objects relate to each other etc.) 
and I have since seen how much better, simpler, faster the alternative is. So unless you have formidable reasons for maintaining the status quo in light of the above, even if you don't understand this concept of application state getting stuck in IDB, and even though you advocate that WebSQL is not deprecated and that we can consider LocalStorage to be an alternative, then it is my hope that you will heed this and make something of it. I'm sorry if this is not the kind of feedback you want to hear at this stage, but IDB needs to be good for more than just HTML 5 todo list demos.
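The primitives this message asks for (sets that can be intersected or unioned, and sorted sets queried by score) can be sketched in a few lines. The helper names and the SortedSet class below are hypothetical and only loosely inspired by the Redis-style sets mentioned above; they are not a proposal for IDB's actual interface and ignore transactions and persistence.

```javascript
// Set algebra over index memberships (sets of object ids).
function intersect(a, b) {
  return new Set([...a].filter((id) => b.has(id)));
}

function union(a, b) {
  return new Set([...a, ...b]);
}

// A sorted set (range index): each id carries an application-supplied score.
class SortedSet {
  constructor() {
    this.entries = []; // kept ordered by ascending score
  }

  add(score, id) {
    // Re-adding an id replaces its previous score.
    this.entries = this.entries.filter((entry) => entry.id !== id);
    this.entries.push({ score, id });
    this.entries.sort((x, y) => x.score - y.score);
  }

  // Ids whose score lies in the inclusive range [min, max], in score order.
  range(min, max) {
    return this.entries
      .filter((entry) => entry.score >= min && entry.score <= max)
      .map((entry) => entry.id);
  }
}
```

Re-sorting on every add keeps the sketch short; a real implementation would use an ordered structure such as a B-tree.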