Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Joran Greef
On 31 Mar 2011, at 1:01 AM, Jonas Sicking wrote:

 Anyhow, I do think that the idea of passing in index values at the
 same time as a entry is created/modified is an interesting idea. And I
 have said so in the past on this list. It's definitely something we
 should consider for v2.

 Oh, and if we did this, I wouldn't really know how to support things
 like collations. Neither if you did collations using built in sets of
 locales (like in Pablo's recent proposal), nor if you used some sort
 of callback to do collation.
 
 / Jonas

That's fine. You don't need to figure it out. Just look at how stateless 
databases have done it (or not done it) and do likewise.

I submit to you that there is inadequate understanding of the concerns raised, 
hence the lack of urgency in trying to address them. That there is even a need 
for a V2 is symptomatic of this.

It may be a good idea to start looking at these things not as interesting 
ideas but as essential database concepts.

If someone were trying to build some kind of transactional indexed key value 
store for the web, and they wanted to do a truly great job of it, they would 
certainly want to learn everything they could from databases that have made 
contributions to the field.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Jonas Sicking
On Thu, Mar 31, 2011 at 12:16 AM, Joran Greef jo...@ronomon.com wrote:
 On 31 Mar 2011, at 1:01 AM, Jonas Sicking wrote:

 Anyhow, I do think that the idea of passing in index values at the
 same time as a entry is created/modified is an interesting idea. And I
 have said so in the past on this list. It's definitely something we
 should consider for v2.

 Oh, and if we did this, I wouldn't really know how to support things
 like collations. Neither if you did collations using built in sets of
 locales (like in Pablo's recent proposal), nor if you used some sort
 of callback to do collation.

 / Jonas

 That's fine. You don't need to figure it out. Just look at how stateless 
 databases have done it (or not done it) and do likewise.

 I submit to you that there is inadequate understanding of the concerns 
 raised, hence the lack of urgency in trying to address them. That there is 
 even a need for a V2 is symptomatic of this.

 It may be a good idea to start looking at these things not as interesting 
 ideas but as essential database concepts.

 If someone were trying to build some kind of transactional indexed key value 
 store for the web, and they wanted to do a truly great job of it, they would 
 certainly want to learn everything they could from databases that have made 
 contributions to the field.

I previously have asked for a detailed proposal, but so far you have
not supplied one but instead keep referring to other unnamed database
APIs.

It has also been pointed out that there are unique constraints on APIs
in browsers. For example due to the fact that several applications,
i.e. pages, will be interacting with a given database. At the same
time, and in some browsers from different processes. Additionally
we're aiming to make an API which is easier to use, and where we can't
place as much trust in the web page as you'd normally put in the user
of a database API. For example, you've asked for callbacks to
implement collations, but what do we do if those callbacks don't
return consistent results? Or even do evil things like modify the
stores where data is being inserted?

In short, I don't think we'll get much further here without a concrete
proposal. Especially not for IndexedDB v1.

/ Jonas



Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Joran Greef
On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote:

 I previously have asked for a detailed proposal, but so far you have
 not supplied one but instead keep referring to other unnamed database
 APIs.

I have already provided an adequate interface proposal for putObject and 
deleteObject.

I have already referenced at least Redis and Tokyo Cabinet as examples of 
stateless database interfaces, on numerous occasions.

 For example, you've asked for callbacks to
 implement collations, but what do we do if those callbacks don't
 return consistent results?

I have not once asked for callbacks, let alone callbacks to implement 
collations. You have jumped to this conclusion from my previous post, and 
missed the point of it entirely.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Joran Greef
On 31 Mar 2011, at 9:34 AM, Jeremy Orlow wrote:

 We have made an effort to understand other contributions to the field.
 
 I'm not convinced that these are essential database concepts and having 
 personally spent quite some time working with the API in JS and implementing 
 it, I feel pretty confident that what we have for v1 is pretty solid.  There 
 are definitely some things I wouldn't mind re-visiting or looking at closer, 
 possibly even for v1, but they all seem reasonable to study further for v2 as 
 well.
 
 We've spent a lot of time over the last year and a half talking about 
 IndexedDB.  But now it's shipping in Firefox 4 and soon Chrome 11.  So 
 realistically v1 is not going to change much unless we are convinced that 
 what's there is fundamentally broken.
 
 We intentionally limited the scope of v1, which is why we know there'll be a 
 v2.  We can't solve all the problems at once, and the difficulty of speccing 
 something is typically exponential to the size of the API.
 
 Maybe a constructive way to discuss this would be to look at what use cases 
 will be difficult or impossible to achieve with the current design?

Application-managed indices for starters. I would consider that to be essential 
when designing indexed key/value stores, and I would consider that to be the 
contribution made by almost every other indexed key/value store to date. If we 
have to use IDB the way FriendFeed used MySQL to achieve application-managed 
indices then I would argue that the API is in fact fundamentally broken and 
we would be better off with an embedding of SQLite by Mozilla.

Regarding the difficulty of speccing something is typically exponential to the 
size of the API, if people want to build a Rube Goldberg device then they must 
deal with the spec issues of that.

If we were provided with the primitives for an indexed key/value store with 
application-managed indices (as Nikunj suggested at the time), we would have 
been well out of the starting blocks by now, and issues such as computed 
indexes, indexing array values etc. would have been non-issues.

Summary:

1. There's a problem.
2. It can still be fixed with a minimum of fuss.
3. This requires an adjustment to the putObject and deleteObject interfaces 
(see previous threads).


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
I was the one that asked for callbacks.

 but what do we do if those callbacks don't
 return consistent results? Or even do evil things like modify the
 stores where data is being inserted?

If the callback maps all values to a sort-order of '1' there could only ever
be one entry in the index... its not hard, the callback is passed an
immutable copy of the object and returns a sort-order as a binary-blob. If
you capture the object store in the closure you of course you could do evil
things as side-effects. But that is true in any non-purely-functional
language, you can always do evil things with side-effects.

 In short, I don't think we'll get much further here without a concrete
proposal.

Which basically means nobody working on the current implementations
understands the issues, or thinks the issues are unimportant?

Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
On 31 March 2011 08:38, Joran Greef jo...@ronomon.com wrote:

 On 31 Mar 2011, at 9:34 AM, Jeremy Orlow wrote:

  We have made an effort to understand other contributions to the field.
 
  I'm not convinced that these are essential database concepts and having
 personally spent quite some time working with the API in JS and implementing
 it, I feel pretty confident that what we have for v1 is pretty solid.  There
 are definitely some things I wouldn't mind re-visiting or looking at closer,
 possibly even for v1, but they all seem reasonable to study further for v2
 as well.
 
  We've spent a lot of time over the last year and a half talking about
 IndexedDB.  But now it's shipping in Firefox 4 and soon Chrome 11.  So
 realistically v1 is not going to change much unless we are convinced that
 what's there is fundamentally broken.
 
  We intentionally limited the scope of v1, which is why we know there'll
 be a v2.  We can't solve all the problems at once, and the difficulty of
 speccing something is typically exponential to the size of the API.
 
  Maybe a constructive way to discuss this would be to look at what use
 cases will be difficult or impossible to achieve with the current design?

 Application-managed indices for starters. I would consider that to be
 essential when designing indexed key/value stores, and I would consider that
 to be the contribution made by almost every other indexed key/value store to
 date. If we have to use IDB the way FriendFeed used MySQL to achieve
 application-managed indices then I would argue that the API is in fact
 fundamentally broken and we would be better off with an embedding of
 SQLite by Mozilla.

 Regarding the difficulty of speccing something is typically exponential to
 the size of the API, if people want to build a Rube Goldberg device then
 they must deal with the spec issues of that.

 If we were provided with the primitives for an indexed key/value store with
 application-managed indices (as Nikunj suggested at the time), we would have
 been well out of the starting blocks by now, and issues such as computed
 indexes, indexing array values etc. would have been non-issues.

 Summary:

 1. There's a problem.
 2. It can still be fixed with a minimum of fuss.


I totally agree with everything so far...


 3. This requires an adjustment to the putObject and deleteObject interfaces
 (see previous threads).


I disagree that a simple API change is the answer. The problem is
architectural, not just a superficial API issue.


Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Joran Greef
On 31 Mar 2011, at 12:52 PM, Keean Schupke wrote:

 I totally agree with everything so far...
 
 3. This requires an adjustment to the putObject and deleteObject interfaces 
 (see previous threads).
 
 I disagree that a simple API change is the answer. The problem is 
 architectural, not just a superficial API issue.

Yes, for IndexedDB to be stateless with respect to application schema, one 
would need to:

1. Provide the application with a first-class means to manage indexes at time 
of putting/deleting objects.
2. Treat objects as opaque (remove key path, structured clone mechanisms, 
application must provide an id and JSON value to put/delete calls, reduces 
serialization/deserialization overhead where application already has the object 
as a string).
3. Remove setVersion (redundant, application migrates objects and indexes using 
transactions as it needs to).
4. Remove createIndex.

This would rip so much from the spec as to reduce it to a bunch of tatters, 
defining nothing more than an interface for index/key/value primitives in terms 
of well-established interfaces.

Essentially, we need LocalStorage with asynchronous IO (based on Node's 
callback style), large quota support, and a BTree API. Failing that, a decent 
FileSystem API on which to build these.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
On 31 March 2011 12:41, Joran Greef jo...@ronomon.com wrote:

 On 31 Mar 2011, at 12:52 PM, Keean Schupke wrote:

  I totally agree with everything so far...
 
  3. This requires an adjustment to the putObject and deleteObject
 interfaces (see previous threads).
 
  I disagree that a simple API change is the answer. The problem is
 architectural, not just a superficial API issue.

 Yes, for IndexedDB to be stateless with respect to application schema, one
 would need to:

 1. Provide the application with a first-class means to manage indexes at
 time of putting/deleting objects.
 2. Treat objects as opaque (remove key path, structured clone mechanisms,
 application must provide an id and JSON value to put/delete calls, reduces
 serialization/deserialization overhead where application already has the
 object as a string).
 3. Remove setVersion (redundant, application migrates objects and indexes
 using transactions as it needs to).
 4. Remove createIndex.

 This would rip so much from the spec as to reduce it to a bunch of tatters,
 defining nothing more than an interface for index/key/value primitives in
 terms of well-established interfaces.

 Essentially, we need LocalStorage with asynchronous IO (based on Node's
 callback style), large quota support, and a BTree API. Failing that, a
 decent FileSystem API on which to build these.


Stateless indexes can be provided differently to how you suggest. You can
have a 'validate_index' call that checks the index exists and creates it if
it does not. It is stateless in the sense that you call that to open
existing index or create one, you dont care if the database has one already
or not.

Infact you can make SQL stateless by providing a validate_schema call that
succeeds if the schema of the database matches the passed schema, can be
modified with no data loss to be the same, or needs to be created.

The RelationalDB wrapper for WebSQL provides this kind of stateless approach
for SQL... you can check it out on github if you like (its a work in
progress though):

https://github.com/keean/RelationalDB


Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Jeremy Orlow
On Thu, Mar 31, 2011 at 1:38 AM, Joran Greef jo...@ronomon.com wrote:

 On 31 Mar 2011, at 9:34 AM, Jeremy Orlow wrote:

  We have made an effort to understand other contributions to the field.
 
  I'm not convinced that these are essential database concepts and having
 personally spent quite some time working with the API in JS and implementing
 it, I feel pretty confident that what we have for v1 is pretty solid.  There
 are definitely some things I wouldn't mind re-visiting or looking at closer,
 possibly even for v1, but they all seem reasonable to study further for v2
 as well.
 
  We've spent a lot of time over the last year and a half talking about
 IndexedDB.  But now it's shipping in Firefox 4 and soon Chrome 11.  So
 realistically v1 is not going to change much unless we are convinced that
 what's there is fundamentally broken.
 
  We intentionally limited the scope of v1, which is why we know there'll
 be a v2.  We can't solve all the problems at once, and the difficulty of
 speccing something is typically exponential to the size of the API.
 
  Maybe a constructive way to discuss this would be to look at what use
 cases will be difficult or impossible to achieve with the current design?

 Application-managed indices for starters


That's not a use case.


 I would consider that to be essential when designing indexed key/value
 stores, and I would consider that to be the contribution made by almost
 every other indexed key/value store to date. If we have to use IDB the way
 FriendFeed used MySQL to achieve application-managed indices then I would
 argue that the API is in fact fundamentally broken and we would be better
 off with an embedding of SQLite by Mozilla.

 Regarding the difficulty of speccing something is typically exponential to
 the size of the API, if people want to build a Rube Goldberg device then
 they must deal with the spec issues of that.

 If we were provided with the primitives for an indexed key/value store with
 application-managed indices (as Nikunj suggested at the time), we would have
 been well out of the starting blocks by now, and issues such as computed
 indexes, indexing array values etc. would have been non-issues.

 Summary:

 1. There's a problem.
 2. It can still be fixed with a minimum of fuss.
 3. This requires an adjustment to the putObject and deleteObject interfaces
 (see previous threads).


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Jeremy Orlow
On Thu, Mar 31, 2011 at 5:41 AM, Joran Greef jo...@ronomon.com wrote:

 On 31 Mar 2011, at 12:52 PM, Keean Schupke wrote:

  I totally agree with everything so far...
 
  3. This requires an adjustment to the putObject and deleteObject
 interfaces (see previous threads).
 
  I disagree that a simple API change is the answer. The problem is
 architectural, not just a superficial API issue.

 Yes, for IndexedDB to be stateless with respect to application schema, one
 would need to:

 1. Provide the application with a first-class means to manage indexes at
 time of putting/deleting objects.


I'm OK with doing this for v1 if the others are.  It doesn't seem like that
big of an addition and it would give a decent amount of additional
flexibility.


 2. Treat objects as opaque (remove key path,


Key paths are quite useful.  I agree that making it possible to use
statelessly is good, but I don't see any reason why making it 100% stateless
should be a goal.


 structured clone mechanisms


For sure, not going to happen.


 , application must provide an id and JSON value to put/delete calls,
 reduces serialization/deserialization overhead where application already has
 the object as a string).


I'm not sure why you think this would reduce overhead.


 3. Remove setVersion (redundant, application migrates objects and indexes
 using transactions as it needs to).
 4. Remove createIndex.


Like I said above, although I think we should make it possible to operate
more statelessly, I don't see a reason we need to remove stuff like this.
 Some users will find it more convenient to work this way.

J


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Jonas Sicking
On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote:
 On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote:

 I previously have asked for a detailed proposal, but so far you have
 not supplied one but instead keep referring to other unnamed database
 APIs.

 I have already provided an adequate interface proposal for putObject and 
 deleteObject.

That is hardly a comprehensive proposal, but rather just one small part of it.

I do really think the idea of not having the implementation keep track
of the set of indexes for a objectStore is a really interesting one.
As is the idea of not even having a set set of objectStores. However,
there are several problems that needs to be solved. In particular how
do you deal with collations?

I.e. we have concluded that there are important use cases which
require using different collations for different indexes and
objectStores. Even for different indexes attached to the same
objectStore.

Additionally, if we're getting rid of setVersion, how do we expect
pages dealing with the (application managed) schema changing while the
page has a connection open to the database?

So pretty please, with sugar on top, please come up with a proposal
for the full API rather than bits and pieces.

And I should mention that I have as an absolute requirement that you
should be able to specify collation by simply saying that you want to
use en-US or sv-SV sorting. Using callbacks or other means is ok
*in addition to this*, but callback mechanisms tend to be a lot more
complex since they have to deal with the callback doing all sorts of
evil things such as returning inconsistent results (think return
Math.random()), or simply do evil things like navigate the current
page, deleting the database, or modifying the record that is in the
process of being stored.

/ Jonas



Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote:

 On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.com wrote:

 On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote:
  On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote:
 
  I previously have asked for a detailed proposal, but so far you have
  not supplied one but instead keep referring to other unnamed database
  APIs.
 
  I have already provided an adequate interface proposal for putObject
 and deleteObject.

 That is hardly a comprehensive proposal, but rather just one small part
 of it.


 I wanted to make a few comments about these points :-



 I do really think the idea of not having the implementation keep track
 of the set of indexes for a objectStore is a really interesting one.
 As is the idea of not even having a set set of objectStores. However,
 there are several problems that needs to be solved. In particular how
 do you deal with collations?


 no indexes, no object stores... well I for one prefer the
 validate_object_store, validate_index approach, in that it can hide
 statefullness if necessary (like I do with RelationalDB) whilst presenting a
 stateless API. It also keeps the size of the put statements down.



 I.e. we have concluded that there are important use cases which
 require using different collations for different indexes and
 objectStores. Even for different indexes attached to the same
 objectStore.

 Additionally, if we're getting rid of setVersion, how do we expect
 pages dealing with the (application managed) schema changing while the
 page has a connection open to the database?


 1 - there is no schema
 2 - dont allow it to change whilst the database is open

 In reality a schema is implicitly tied to a code version. In other words
 the source code of the application assumes a certain schema. If the assumed
 schema and the schema in the DB do not match things are going to go very
 wrong very quickly. Schema changes _always_ accompany code changes
 (otherwise they are not schema changes just data changes). As such they
 never happen when a DB is open. The way I handle this in RelationalDB, by
 validating the actual schema against the source-code schema in the db-open
 (actually the method is called validate), is probably the best way to handle
 this. If the database does not exist we create it according to the schema.
 If it exists we check it matches the schema. If there is a difference we see
 if we can 'upgrade' the database automatically (certain changes like adding
 a new column with a default value can be done automaticall), if we cannot
 automaticall upgrade, we exit with an error - as allowing the program to run
 will result in corruption of the data already in the database. At this point
 it is up to the application to figure out how to upgrade the database (by
 opening one database with an old schema and another with a new schema)...
 There is not point in ever allowing a database to be opened with the wrong
 schema.


 So pretty please, with sugar on top, please come up with a proposal
 for the full API rather than bits and pieces.

 And I should mention that I have as an absolute requirement that you
 should be able to specify collation by simply saying that you want to
 use en-US or sv-SV sorting. Using callbacks or other means is ok
 *in addition to this*, but callback mechanisms tend to be a lot more
 complex since they have to deal with the callback doing all sorts of
 evil things such as returning inconsistent results (think return
 Math.random()), or simply do evil things like navigate the current
 page, deleting the database, or modifying the record that is in the
 process of being stored.


 The core API only needs to deal with sorting binary-blob sort orders. A
 library wrapper could provide all the collation ordering goodness that
 people want. For example RelationalDB will have to deal with sorting orders,
 it does not need the browser to provide that functionality. In fact browser
 provided functionality may limit what can be done in libraries on top.


 This is difficult if not impossible to do.  See previous threads on the
 matter.

 J


I can find a lot of stuff on collation, but not a lot about why it could not
be done in a library. Could you summerise the reasons why this needs to be
core functionality for me?

A library could chose to use an object store as meta-data to store the
collation orders that it is using for various indexes for example.


Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Jeremy Orlow
On Thu, Mar 31, 2011 at 11:24 AM, Keean Schupke ke...@fry-it.com wrote:

 On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote:

 On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.com wrote:

 On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote:
  On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote:
 
  I previously have asked for a detailed proposal, but so far you have
  not supplied one but instead keep referring to other unnamed database
  APIs.
 
  I have already provided an adequate interface proposal for putObject
 and deleteObject.

 That is hardly a comprehensive proposal, but rather just one small part
 of it.


 I wanted to make a few comments about these points :-



 I do really think the idea of not having the implementation keep track
 of the set of indexes for a objectStore is a really interesting one.
 As is the idea of not even having a set set of objectStores. However,
 there are several problems that needs to be solved. In particular how
 do you deal with collations?


 no indexes, no object stores... well I for one prefer the
 validate_object_store, validate_index approach, in that it can hide
 statefullness if necessary (like I do with RelationalDB) whilst presenting a
 stateless API. It also keeps the size of the put statements down.



 I.e. we have concluded that there are important use cases which
 require using different collations for different indexes and
 objectStores. Even for different indexes attached to the same
 objectStore.

 Additionally, if we're getting rid of setVersion, how do we expect
 pages dealing with the (application managed) schema changing while the
 page has a connection open to the database?


 1 - there is no schema
 2 - dont allow it to change whilst the database is open

 In reality a schema is implicitly tied to a code version. In other words
 the source code of the application assumes a certain schema. If the assumed
 schema and the schema in the DB do not match things are going to go very
 wrong very quickly. Schema changes _always_ accompany code changes
 (otherwise they are not schema changes just data changes). As such they
 never happen when a DB is open. The way I handle this in RelationalDB, by
 validating the actual schema against the source-code schema in the db-open
 (actually the method is called validate), is probably the best way to handle
 this. If the database does not exist we create it according to the schema.
 If it exists we check it matches the schema. If there is a difference we see
 if we can 'upgrade' the database automatically (certain changes like adding
 a new column with a default value can be done automaticall), if we cannot
 automaticall upgrade, we exit with an error - as allowing the program to run
 will result in corruption of the data already in the database. At this point
 it is up to the application to figure out how to upgrade the database (by
 opening one database with an old schema and another with a new schema)...
 There is not point in ever allowing a database to be opened with the wrong
 schema.


 So pretty please, with sugar on top, please come up with a proposal
 for the full API rather than bits and pieces.

 And I should mention that I have as an absolute requirement that you
 should be able to specify collation by simply saying that you want to
 use en-US or sv-SV sorting. Using callbacks or other means is ok
 *in addition to this*, but callback mechanisms tend to be a lot more
 complex since they have to deal with the callback doing all sorts of
 evil things such as returning inconsistent results (think return
 Math.random()), or simply do evil things like navigate the current
 page, deleting the database, or modifying the record that is in the
 process of being stored.


 The core API only needs to deal with sorting binary-blob sort orders. A
 library wrapper could provide all the collation ordering goodness that
 people want. For example RelationalDB will have to deal with sorting orders,
 it does not need the browser to provide that functionality. In fact browser
 provided functionality may limit what can be done in libraries on top.


 This is difficult if not impossible to do.  See previous threads on the
 matter.

 J


 I can find a lot of stuff on collation, but not a lot about why it could
 not be done in a library. Could you summerise the reasons why this needs to
 be core functionality for me?


Sorry, but that stuff is paged out of my brain.  Pablo, can you?


 A library could chose to use an object store as meta-data to store the
 collation orders that it is using for various indexes for example.


 Cheers,
 Keean.




Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Joran Greef
On 31 Mar 2011, at 7:27 PM, Jeremy Orlow wrote:

 1. Provide the application with a first-class means to manage indexes at 
 time of putting/deleting objects.
 
 I'm OK with doing this for v1 if the others are.  It doesn't seem like that 
 big of an addition and it would give a decent amount of additional 
 flexibility.

Thanks Jeremy that would be great.

 (reduces serialization/deserialization overhead where application already 
 has the object as a string)
 
 I'm not sure why you think this would reduce overhead.

How long would it take an iPad to JSON deserialize/serialize 500 / 5,000 / 
50,000 / 500,000 / 5,000,000 2KB objects? That's a reasonable device and those 
are reasonable workloads. In it's present state, IndexedDB needs to do this 
every time setVersion is called with a createIndex in there... you see the 
problem is there's no way for the application to control this. The application 
would arguably be able to find better ways of migrating indexes than using key 
paths which necessitate deserialization/serialization to be performed on the 
client. For instance, you could use batch jobs on the server to do this on 
behalf of clients, and this would make sense especially where many 
clients/devices share the same objects. With IndexedDB this is not possible. 
With pure storage primitives it would have been possible. This is just one 
use-case, and for every one of these there will be plenty more.

 Like I said above, although I think we should make it possible to operate 
 more statelessly, I don't see a reason we need to remove stuff like this. 
 Some users will find it more convenient to work this way.

Agreed on both counts. It is clearly too late to remove it now. But it may be a 
good idea in future to keep the focus on providing low-level primitives rather 
than convenience features, since the latter often get in the way of the former.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
On 31 March 2011 18:36, Jeremy Orlow jor...@chromium.org wrote:

 On Thu, Mar 31, 2011 at 11:24 AM, Keean Schupke ke...@fry-it.com wrote:

 On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote:

 On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.comwrote:

 On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com
 wrote:
  On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote:
 
  I previously have asked for a detailed proposal, but so far you have
  not supplied one but instead keep referring to other unnamed
 database
  APIs.
 
  I have already provided an adequate interface proposal for putObject
 and deleteObject.

 That is hardly a comprehensive proposal, but rather just one small part
 of it.


 I wanted to make a few comments about these points :-



 I do really think the idea of not having the implementation keep track
 of the set of indexes for a objectStore is a really interesting one.
 As is the idea of not even having a set set of objectStores. However,
 there are several problems that needs to be solved. In particular how
 do you deal with collations?


 no indexes, no object stores... well I for one prefer the
 validate_object_store, validate_index approach, in that it can hide
 statefullness if necessary (like I do with RelationalDB) whilst presenting 
 a
 stateless API. It also keeps the size of the put statements down.



 I.e. we have concluded that there are important use cases which
 require using different collations for different indexes and
 objectStores. Even for different indexes attached to the same
 objectStore.

 Additionally, if we're getting rid of setVersion, how do we expect
 pages dealing with the (application managed) schema changing while the
 page has a connection open to the database?


 1 - there is no schema
 2 - dont allow it to change whilst the database is open

 In reality a schema is implicitly tied to a code version. In other words
 the source code of the application assumes a certain schema. If the assumed
 schema and the schema in the DB do not match things are going to go very
 wrong very quickly. Schema changes _always_ accompany code changes
 (otherwise they are not schema changes just data changes). As such they
 never happen when a DB is open. The way I handle this in RelationalDB, by
 validating the actual schema against the source-code schema in the db-open
 (actually the method is called validate), is probably the best way to 
 handle
 this. If the database does not exist we create it according to the schema.
 If it exists we check it matches the schema. If there is a difference we 
 see
 if we can 'upgrade' the database automatically (certain changes like adding
 a new column with a default value can be done automaticall), if we cannot
 automaticall upgrade, we exit with an error - as allowing the program to 
 run
 will result in corruption of the data already in the database. At this 
 point
 it is up to the application to figure out how to upgrade the database (by
 opening one database with an old schema and another with a new schema)...
 There is not point in ever allowing a database to be opened with the wrong
 schema.


 So pretty please, with sugar on top, please come up with a proposal
 for the full API rather than bits and pieces.

 And I should mention that I have as an absolute requirement that you
 should be able to specify collation by simply saying that you want to
 use en-US or sv-SV sorting. Using callbacks or other means is ok
 *in addition to this*, but callback mechanisms tend to be a lot more
 complex since they have to deal with the callback doing all sorts of
 evil things such as returning inconsistent results (think return
 Math.random()), or simply do evil things like navigate the current
 page, deleting the database, or modifying the record that is in the
 process of being stored.


 The core API only needs to deal with sorting binary-blob sort orders. A
 library wrapper could provide all the collation ordering goodness that
 people want. For example RelationalDB will have to deal with sorting 
 orders,
 it does not need the browser to provide that functionality. In fact browser
 provided functionality may limit what can be done in libraries on top.


 This is difficult if not impossible to do.  See previous threads on the
 matter.

 J


 I can find a lot of stuff on collation, but not a lot about why it could
 not be done in a library. Could you summerise the reasons why this needs to
 be core functionality for me?


 Sorry, but that stuff is paged out of my brain.  Pablo, can you?


 A library could chose to use an object store as meta-data to store the
 collation orders that it is using for various indexes for example.


 Cheers,
 Keean.




Thanks would help me understand. As long as there is a way to turn default
collation off and just have a binary string sort order thats fine for my
needs.


Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Shawn Wilsher

On 3/31/2011 11:47 AM, Joran Greef wrote:

Let those who introduced these design flaws be among the first to take 
responsibility and fix them.
You aren't being constructive, and that's a surefire way to be ignored. 
 You have yet to convince the working group that these are design 
flaws in the first place.


/sdwilsh



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Joran Greef
On 31 Mar 2011, at 10:07 PM, Shawn Wilsher wrote:

 On 3/31/2011 11:47 AM, Joran Greef wrote:
 Let those who introduced these design flaws be among the first to take 
 responsibility and fix them.
 You aren't being constructive, and that's a surefire way to be ignored.  You 
 have yet to convince the working group that these are design flaws in the 
 first place.
 
 /sdwilsh

Agreed. I am actively using the API with real-world data and I am providing 
feedback. You are welcome to use it or not. It is not for me to convince 
anyone. As I said, if people think there is a problem, let those who introduced 
it fix it.

Joran Greef


RE: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Pablo Castro

From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow
Sent: Thursday, March 31, 2011 11:36 AM

 I can find a lot of stuff on collation, but not a lot about why it could not 
 be done in a library. Could you summerise the reasons why this needs to be 
 core functionality for me?

 Sorry, but that stuff is paged out of my brain.  Pablo, can you?
 
 A library could chose to use an object store as meta-data to store the 
 collation orders that it is using for various indexes for example.

- Currently there are no APIs in JavaScript to compare strings using specific 
collations. There are folks that are looking into this, but it will need time.
- I'm far from an expert in the topic, but from talking to folks that 
understand this well it seems that to actually implement this entirely in 
JavaScript it would mean you have to download collation tables and apply them 
as needed in callbacks. Not only this means a hit in download size/time for the 
app but also that callbacks have to either download stuff or inline collation 
rules/tables in the callback itself. 
- In pure practical terms, I suspect the 80% scenario can be covered by 
implementing this natively, having it be fast and simple to use for common 
cases. Not pushing back on the callback stuff, just saying that I find it 
valuable to have users simply say en-US and get what they wanted.
- Also from the practical perspective, simple cases that don't require the 
flexibility and can avoid having to take care of making the callbacks perfectly 
consistent even as you roll out updates that may hit only some of the pages, 
use components written by someone else, etc.
- By default we would still do binary collation (there was a question in the 
thread, I forget exactly where).

Thanks
-pablo




Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
 Currently there are no APIs in JavaScript to compare strings using
specific collations

We dont actually need this, just a mapping from UTF-16 string to a
sort-score (binary blob).

Its true that downloading the collation tables might take time, so we could
just provide:

var blob = string_to_score('utf-16 string', 'en-US');

as a built in function to make this efficient.

I agree with the other points though.


Cheers,
Keean.


On 31 March 2011 22:38, Pablo Castro pablo.cas...@microsoft.com wrote:


 From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy
 Orlow
 Sent: Thursday, March 31, 2011 11:36 AM

  I can find a lot of stuff on collation, but not a lot about why it could
 not be done in a library. Could you summerise the reasons why this needs to
 be core functionality for me?
 
  Sorry, but that stuff is paged out of my brain.  Pablo, can you?
  
  A library could chose to use an object store as meta-data to store the
 collation orders that it is using for various indexes for example.

 - Currently there are no APIs in JavaScript to compare strings using
 specific collations. There are folks that are looking into this, but it will
 need time.
 - I'm far from an expert in the topic, but from talking to folks that
 understand this well it seems that to actually implement this entirely in
 JavaScript it would mean you have to download collation tables and apply
 them as needed in callbacks. Not only this means a hit in download size/time
 for the app but also that callbacks have to either download stuff or inline
 collation rules/tables in the callback itself.
 - In pure practical terms, I suspect the 80% scenario can be covered by
 implementing this natively, having it be fast and simple to use for common
 cases. Not pushing back on the callback stuff, just saying that I find it
 valuable to have users simply say en-US and get what they wanted.
 - Also from the practical perspective, simple cases that don't require the
 flexibility and can avoid having to take care of making the callbacks
 perfectly consistent even as you roll out updates that may hit only some of
 the pages, use components written by someone else, etc.
 - By default we would still do binary collation (there was a question in
 the thread, I forget exactly where).

 Thanks
 -pablo




Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-30 Thread Jonas Sicking
On Sat, Mar 26, 2011 at 1:14 AM, Nikunj Mehta nik...@o-micron.com wrote:
 What is the minimum that can be in IDB? I am guessing the following:
 1. Sorted key-opaque value transactional store
 2. Lookup of keys by values (or parts thereof)
 #1 is essential.
 #2 is unavoidable because you would want to efficiently manipulate values by
 values as opposed to values by key.
 I know of no efficient way of doing callbacks with JS. Moreover, avoiding
 indices completely seems to miss the point. Yes, IDB can be used without key
 paths and indices. When you do that, you would not have any headache of
 setVersion since every version change either adds or removes an object
 store. Next, originally, I also had floated the idea of application managed
 indices, but implementors thought of it as cruft.

For what it's worth, I'm not sure anyone ever thought it was cruft.
The main problem, IMHO, was that it was underdefined. It also created
a somewhat awkward API since even indexes which were not manually
managed would have functions for explicitly managing them. (It also
wouldn't help with the state issues that is raised in the original
email in this thread).

Anyhow, I do think that the idea of passing in index values at the
same time as a entry is created/modified is an interesting idea. And I
have said so in the past on this list. It's definitely something we
should consider for v2.

However this still wouldn't solve the state issue raised in this
thread. The browser still keeps track of the set of objectStores. You
could get rid of that by changing the APIs such that there isn't a set
list of object stores. I.e. remove IDBDatabase.objectStoreNames,
IDBDatabase.createObjectStore and IDBDatabase.deleteObjectStore and
make IDBDatabase.transaction allow any object store names to be passed
to it. This would let you create a new objectStore by simply
starting a transaction which uses the new name and start storing data
into it.

But even then you still want the setVersion API. Otherwise the web
page has no way of migrating stored data to a new schema. Even though
the browser doesn't keep track of a schema, the app still does, and
likely will want to change that from time to time.

/ Jonas



Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-26 Thread Nikunj Mehta
What is the minimum that can be in IDB? I am guessing the following:

1. Sorted key-opaque value transactional store
2. Lookup of keys by values (or parts thereof)

#1 is essential.
#2 is unavoidable because you would want to efficiently manipulate values by
values as opposed to values by key.

I know of no efficient way of doing callbacks with JS. Moreover, avoiding
indices completely seems to miss the point. Yes, IDB can be used without key
paths and indices. When you do that, you would not have any headache of
setVersion since every version change either adds or removes an object
store. Next, originally, I also had floated the idea of application managed
indices, but implementors thought of it as cruft.

On Sun, Mar 20, 2011 at 3:10 PM, Joran Greef jo...@ronomon.com wrote:


  On 20 Mar 2011, at 4:54 AM, Jonas Sicking wrote:
 
  I don't understand what you are saying about application state though,
  so please do start that as a separate thread.

 At present, there's no way for an application to tell IDB what indexes to
 modify w.r.t. an object at the exact moment when putting or deleting that
 object. That's because this behavior is defined in advance using
 createIndex in a setVersion transaction. And then how IDB extracts the
 referenced value from the object is done using an IDB idea of key paths.
 But right there, in defining the indexes in advance (and not when the index
 is actually modified, which is when the object itself is modified), you've
 captured application state (data relationships that should be known only to
 the application) within IDB. Because this is done in advance (because IDB
 seems to have inherited this assumption that this is just the way MySQL
 happens to do it), there's a disconnect between when the index is defined
 and when it's actually used. And because of key paths you now need to spec
 out all kinds of things like how to handle compound keys, multiple values.
 It's becoming a bit of a spec-fest.

 That this bubble of state gets captured in IDB, it also means that IDB now
 needs to provide ways of updating that captured state within IDB when it
 changes in the application (which will happen, so essentially you now have
 your indexing logic stuck in the database AND in the application and the
 application developer now has to try and keep BOTH in sync using this
 awkward pre-defined indexes interface), thus the need for a setVersion
 transaction in the first place. None of this would be necessary if the
 application could reference indexes to be modified (and created if they
 don't exist, or deleted if they would then become empty) AT THE POINT of
 putting or deleting an object. Things like data migrations would also be
 better served if this were possible since this is something the application
 would need to manage anyway. Do you follow?

 The application is the right place to be handling indexing logic. IDB just
 needs to provide an interface to the indexing implementation, but not handle
 extracting values from objects or deciding which indexes to modify. That's
 the domain of the application. It's a question of encapsulation. IDB is
 crossing the boundaries by demanding to know ABOUT the data stored, and not
 just providing a simple way to put an object, and a simple way to put a
 reference to an object to an index, and a simple way to query an index and
 intersect or union an index with another. Essentially an object and its
 index memberships need to be completely opaque to IDB and you are doing the
 opposite. Take a look at the BDB interface. Do you see a setVersion or
 createIndex semantic in there?


BDB has secondary databases, which are the same as indices with a one to
many relation between primary and secondary database. Moreover, BDB uses
application callbacks to let the application encapsulate the definition of
the index.


 Take a look at Redis and Tokyo and many other things. Do you see a
 setVersion or createIndex semantic in there? Do these databases have any
 idea about the contents of objects? Any concept of key paths?


I, for one, am not enamored by key paths. However, I am also morbidly aware
of the perils in JS land when using callback like mechanisms. Certainly, I
would like to hear from developers like you how you find IDB if you were to
not use any createIndex at all. Or at least that you would like to manage
your own indices.


 No, and that's the whole reason these databases were created in the first
 place. I'm sure you have read the BDB papers. Obviously this is not the
 approach of MySQL. But if IDB is trying to be MySQL but saying it wants to
 be BDB then I don't know. In any event, Firefox would be brave to also embed
 SQLite. Let the better API win.

 How much simpler could it be? At the end of the day, it's all objects and
 sets and sorted sets, and see Redis' epiphany on this point. IDB just needs
 to provide transactional access to these sets. The application must decide
 what goes in and out of these sets, and must be able to do 

Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-26 Thread Joran Greef
 On 26 Mar 2011, at 10:14 AM, Nikunj Mehta wrote:
 
 What is the minimum that can be in IDB? I am guessing the following:
 
 1. Sorted key-opaque value transactional store
 2. Lookup of keys by values (or parts thereof)

Yes, this is what we need. In programmer speak: objects (opaque strings), sets 
(hash indexes), sorted sets (range indexes).

 I know of no efficient way of doing callbacks with JS. Moreover, avoiding 
 indices completely seems to miss the point.

Callbacks are unnecessary. This is what you would want to do as a developer 
using the current form of IDB:

objectStore.putObject({ name: Joran, emails: [jo...@gmail.com, 
jo...@ronomon.com] }, { id: 'arbitraryObjectIdProvidedByTheApplication', 
indexes: [emails=jo...@gmail.com, emails=jo...@ronomon.com, name=Joran] 
});

IDB would then store the user object using the id provided by the application, 
and make sure it's referenced by this id in the emails=jo...@gmail.com, 
emails=jo...@ronomon.com, name=Joran index references provided (creating 
these indexes along the way if need be). The application is responsible for 
passing in the extra id and indexes options to putObject.

Supporting range indexes would be a question of expanding the above to let the 
developer pass in a sort score along with the index reference.

 Next, originally, I also had floated the idea of application managed indices, 
 but implementors thought of it as cruft.

I can understand how application managed indices would lead to less work on the 
part of the spec committee. There seems to be some perverse human 
characteristic that likes to make easy things difficult. Ships will sail around 
the world but the Flat Earth Society will flourish.

 I, for one, am not enamored by key paths. However, I am also morbidly aware 
 of the perils in JS land when using callback like mechanisms. Certainly, I 
 would like to hear from developers like you how you find IDB if you were to 
 not use any createIndex at all. Or at least that you would like to manage 
 your own indices.

I am begging to be able to manage my indices. I know my data. I do not want to 
use any createIndex to declare indexes in advance of when I may or may not use 
them. What advantage would that give me? I want to create/update indexes only 
when I put or delete objects and I want to have control over which indexes to 
update accordingly. With one small change to the putObject and deleteObject 
interfaces, in the form of the indexes option, we can make that possible.

We need these primitives in IDB: opaque strings, sets, sorted sets. Ideally, 
IDB need simply store these things and provide the standard interfaces (see 
Redis) to them along with a transactional mechanism. That's the perfect 
low-level API on which to build almost any database wrapper.


[IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-20 Thread Joran Greef

 On 20 Mar 2011, at 4:54 AM, Jonas Sicking wrote:
 
 I don't understand what you are saying about application state though,
 so please do start that as a separate thread.

At present, there's no way for an application to tell IDB what indexes to 
modify w.r.t. an object at the exact moment when putting or deleting that 
object. That's because this behavior is defined in advance using createIndex 
in a setVersion transaction. And then how IDB extracts the referenced value 
from the object is done using an IDB idea of key paths. But right there, in 
defining the indexes in advance (and not when the index is actually modified, 
which is when the object itself is modified), you've captured application state 
(data relationships that should be known only to the application) within IDB. 
Because this is done in advance (because IDB seems to have inherited this 
assumption that this is just the way MySQL happens to do it), there's a 
disconnect between when the index is defined and when it's actually used. And 
because of key paths you now need to spec out all kinds of things like how to 
handle compound keys, multiple values. It's becoming a bit of a spec-fest.

That this bubble of state gets captured in IDB, it also means that IDB now 
needs to provide ways of updating that captured state within IDB when it 
changes in the application (which will happen, so essentially you now have your 
indexing logic stuck in the database AND in the application and the application 
developer now has to try and keep BOTH in sync using this awkward pre-defined 
indexes interface), thus the need for a setVersion transaction in the first 
place. None of this would be necessary if the application could reference 
indexes to be modified (and created if they don't exist, or deleted if they 
would then become empty) AT THE POINT of putting or deleting an object. Things 
like data migrations would also be better served if this were possible since 
this is something the application would need to manage anyway. Do you follow?

The application is the right place to be handling indexing logic. IDB just 
needs to provide an interface to the indexing implementation, but not handle 
extracting values from objects or deciding which indexes to modify. That's the 
domain of the application. It's a question of encapsulation. IDB is crossing 
the boundaries by demanding to know ABOUT the data stored, and not just 
providing a simple way to put an object, and a simple way to put a reference to 
an object to an index, and a simple way to query an index and intersect or 
union an index with another. Essentially an object and its index memberships 
need to be completely opaque to IDB and you are doing the opposite. Take a look 
at the BDB interface. Do you see a setVersion or createIndex semantic in there? 
Take a look at Redis and Tokyo and many other things. Do you see a setVersion 
or createIndex semantic in there? Do these databases have any idea about the 
contents of objects? Any concept of key paths? No, and that's the whole reason 
these databases were created in the first place. I'm sure you have read the BDB 
papers. Obviously this is not the approach of MySQL. But if IDB is trying to be 
MySQL but saying it wants to be BDB then I don't know. In any event, Firefox 
would be brave to also embed SQLite. Let the better API win.

How much simpler could it be? At the end of the day, it's all objects and sets 
and sorted sets, and see Redis' epiphany on this point. IDB just needs to 
provide transactional access to these sets. The application must decide what 
goes in and out of these sets, and must be able to do it when it wants to, not 
some time in advance. I bring this up because I once wrote the exact same kind 
of database that you are writing now (where one thinks it would be good if the 
database did NOT treat objects as opaque... that the database should be smart 
about the contents of objects and share control for how objects relate to each 
other etc.) and I have since seen how much better, simpler, faster the 
alternative is. So unless you have formidable reasons for maintaining the 
status quo in light of the above, even if you don't understand this concept of 
application state getting stuck in IDB, and even though you advocate that 
WebSQL is not deprecated and that we can consider LocalStorage to be an 
alternative, then it is my hope that you will heed this and make something of 
it. I'm sorry if this is not the kind of feedback you want to hear at this 
stage, but IDB needs to be good for more than just HTML 5 todo list demos.