Re: [Bug 12321] New: Add compound keys to IndexedDB

2011-03-18 Thread Keean Schupke
I like BDB's solution. You have one primary key you cannot mess with (say an
integer for fast comparisons); you can then add any number of secondary
indexes. With a secondary index there is a callback to generate a binary
blob that is used for indexing. The callback has access to all the fields of
the object plus any info in the closure and can use that to generate the
index data any way it likes.

This has the advantage of supporting any indexing schemes the user may wish
to implement (by writing a custom callback), whilst allowing a few common
options to be provided for the user (say a hash of all fields, or a field
name, international char set, and direction captured in a closure). The user
gets the power, the core implementation is simple, and common cases can be
implemented in an easy to use way.

var lex_order = function(field, charset, direction) {
  return function(object) {
    /* map indexed 'field' to blob in required order */
    return key;
  };
};

Then create a new index:

object_store.validate_index(1, lex_order('name', 'us',
'ascending')).on_done(function(status) {/* status ok or error */})

validate_index checks whether the requested secondary index (1) exists. If it
does not, it creates the index and calls the done callback (with a status
code indicating successful creation). If it does, and it passes some
validation checks, it also calls the done callback (with a status code
indicating successful validation). If anything goes wrong with either the
creation or validation of the secondary index, it calls the done
callback with an error status code.
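
A minimal sketch of that flow, assuming a toy in-memory store. Only validate_index and on_done come from the proposal above; ToyStore, by_field, the status strings, and the use of a string key in place of a binary blob are hypothetical. The "validation check" used here (rebuilding the index and comparing entries) is also an assumption, and the done callback is invoked synchronously, whereas a real API would presumably be asynchronous.

```javascript
// Toy in-memory object store. Everything except the names validate_index
// and on_done is hypothetical and exists only for illustration.
function ToyStore() {
  this.records = {};  // primary key (integer) -> stored object
  this.indexes = {};  // index id -> { keyOf: function, entries: [...] }
}

ToyStore.prototype.put = function (primary, object) {
  this.records[primary] = object;
};

// Run the key callback over every record and return sorted index entries.
ToyStore.prototype.build = function (keyOf) {
  var records = this.records;
  return Object.keys(records).map(function (primary) {
    return { key: keyOf(records[primary]), primary: Number(primary) };
  }).sort(function (a, b) {
    return a.key < b.key ? -1 : a.key > b.key ? 1 : a.primary - b.primary;
  });
};

ToyStore.prototype.validate_index = function (id, keyOf) {
  var status;
  try {
    var fresh = this.build(keyOf);
    var existing = this.indexes[id];
    if (!existing) {
      // Index does not exist yet: create it.
      this.indexes[id] = { keyOf: keyOf, entries: fresh };
      status = 'created';
    } else if (JSON.stringify(existing.entries) === JSON.stringify(fresh)) {
      // Index exists and the callback reproduces the stored entries.
      status = 'validated';
    } else {
      status = 'error';
    }
  } catch (e) {
    // The key callback threw while building or validating.
    status = 'error';
  }
  return { on_done: function (callback) { callback(status); } };
};

// Hypothetical key generator: index on one field, using a string as the key.
var by_field = function (field) {
  return function (object) { return String(object[field]); };
};

var store = new ToyStore();
store.put(1, { name: 'Zysk' });
store.put(2, { name: 'Andersson' });
store.validate_index(1, by_field('name')).on_done(function (status) {
  console.log(status); // logs "created" on this first call
});
```

Calling validate_index(1, by_field('name')) a second time reports 'validated', while passing a different key generator for the same index id reports 'error'.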


Cheers,
Keean.


On 18 March 2011 02:03, Jeremy Orlow jor...@chromium.org wrote:

 Here's one ugliness with A: There's no way to specify ascending
 or descending for the individual components of the key.  So there's no way
 for me to open a cursor that looks at one field ascending and the other
 field descending.  In addition, I can't think of any easy/good ways to hack
 around this.

 Any thoughts on how we could address this use case?

 J

 On Wed, Mar 16, 2011 at 4:50 PM, bugzi...@jessica.w3.org wrote:

 http://www.w3.org/Bugs/Public/show_bug.cgi?id=12321

   Summary: Add compound keys to IndexedDB
   Product: WebAppsWG
   Version: unspecified
  Platform: PC
OS/Version: All
Status: NEW
  Severity: normal
  Priority: P2
 Component: Indexed Database API
AssignedTo: dave.n...@w3.org
ReportedBy: jor...@chromium.org
 QAContact: member-webapi-...@w3.org
CC: m...@w3.org, public-webapps@w3.org


 From the thread "[IndexedDB] Compound and multiple keys" by Jonas
 Sicking,
 we're going to go with both options A and B.

 =

 Hi IndexedDB fans (yay!!),

 Problem description:

 One of the current shortcomings of IndexedDB is that it doesn't
 support compound indexes. I.e. indexing on more than one value. For
 example it's impossible to index on, and therefore efficiently search
 for, firstname and lastname in an objectStore which stores people. Or
 index on to-address and date sent in an objectStore holding emails.

 The way this is traditionally done is that multiple values are used as
 key for each individual entry in an index or objectStore. For example
 the CREATE INDEX statement in SQL can list multiple columns, and the
 CREATE TABLE statement can list several columns as PRIMARY KEY.

 There have been a couple of suggestions how to do this in IndexedDB

 Option A)
 When specifying a key path in createObjectStore and createIndex, allow
 an array of key-paths to be specified. Such as

 store = db.createObjectStore("mystore", ["firstName", "lastName"]);
 store.add({firstName: "Benny", lastName: "Zysk", age: 28});
 store.add({firstName: "Benny", lastName: "Andersson", age: 63});
 store.add({firstName: "Charlie", lastName: "Brown", age: 8});

 The records are stored in the following order
 Benny, Andersson
 Benny, Zysk
 Charlie, Brown

 Similarly, createIndex accepts the same syntax:
 store.createIndex("myindex", ["lastName", "age"]);

 Option B)
 Allowing arrays as an additional data type for keys.
 store = db.createObjectStore("mystore", "fullName");
 store.add({fullName: ["Benny", "Zysk"], age: 28});
 store.add({fullName: ["Benny", "Andersson"], age: 63});
 store.add({fullName: ["Charlie", "Brown"], age: 8});

 Also allows out-of-line keys using:
 store = db.createObjectStore("mystore");
 store.add({age: 28}, ["Benny", "Zysk"]);
 store.add({age: 63}, ["Benny", "Andersson"]);
 store.add({age: 8}, ["Charlie", "Brown"]);

 (the sort order here is the same as in option A).

 Similarly, if an index used a keyPath which points to an
 array, this would create an entry in the index which used a compound
 key consisting of the values in the array.

 There are of course advantages and disadvantages with both options.

 Option A advantages:
 * Ensures that at objectStore/index creation time the number of keys
 is known. This allows the implementation to create and optimize the
 index using this 

Re: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Keean Schupke
See my proposal in another thread. The basic idea is to copy BDB. Have a
primary index that is based on an integer, something primitive and fast.
Allow secondary indexes which use a callback to generate a binary index key.
IDB shifts the complexity out into a library. Common use cases can be
provided (a hash of all fields in the object, internationalised
bidirectional lexicographic etc...), but the user is free to write their own
for less usual cases (for example indexing by the last word in a name string
to order by surname).


Cheers,
Keean.


On 18 March 2011 02:19, Jonas Sicking jo...@sicking.cc wrote:

 2011/3/17 Pablo Castro pablo.cas...@microsoft.com:
 
  From: Jonas Sicking [mailto:jo...@sicking.cc]
  Sent: Tuesday, March 08, 2011 1:11 PM
 
  All in all, is there anything preventing adding the API Pablo suggests
  in this thread to the IndexedDB spec drafts?
 
  I wanted to propose a couple of specific tweaks to the initial proposal
 and then unless I hear pushback start editing this into the spec.
 
  From reading the details on this thread I'm starting to realize that
 per-database collations won't do it. What did it for me was the example that
 has a fuzzier matching mode (case/accent insensitive). This is exactly the
 kind of index I would want to sort people's names in my address book, but
 most likely not the index I'll want to use for my primary key.
 
  Refactoring the API to accommodate for this would mean to move the
 setCollation() method and the collation property to the object store and
 index objects. If we were willing to live without the ability to change them
 we could take collation as one of the optional parameters to
 createObjectStore()/createIndex() and reduce a bit of surface area...

 Unfortunately I think you bring up good use cases for
 per-objectStore/index collations. It's definitely tempting to just add
 it as an optional parameter to createObjectStore/createIndex. The
 downside is obviously pushing more complexity onto web developers.
 Complexity which will be duplicated across sites.

 However there is another problem to consider here. Can switching
 collation on an objectStore or a unique index affect its validity?
 I.e. if you switch from a case sensitive to a case insensitive
 collation, does that mean that if you have two entries with the
 primary keys "Sweden" and "sweden" they collide and thus the change of
 collation must result in an error (or aborted transaction)?

 I do seem to recall that there are ways to do at least case
 sensitivity such that you generally don't take case into account when
 sorting, unless two entries are exactly the same, in which case you do
 look at casing to differentiate them. However I don't really know a
 whole lot about this and so defer to people that know
 internationalization better.

  I don't have a strong preference there. In any case both would use BCP47
 names as discussed in this thread (as Jonas pointed out, implementations can
 also do their thing as long as they don't interfere with BCP47).
 
  Another piece of feedback I heard consistently as I discussed this with
 various folks at Microsoft is the need to be able to pick up what the UA
 would consider the collation that's most appropriate for the user
 environment (derived from settings, page language or whatever). We could
 support this by introducing a special value that  you can pass to
 setCollation that indicates pick whatever is the right for the
 environment's language right now. Given that there is no other way for
 people to discover the user preference on this, I think this is pretty
 important.

 I would be fine with this as long as it's an explicit opt-in. There is
 definitely a risk that people will do this and then only do testing in
 one language, but it seems to me like a useful use case to support,
 and I don't see a way of supporting this while completely avoiding the
 risk of internationalization bugs.

 / Jonas




Re: [Bug 12321] New: Add compound keys to IndexedDB

2011-03-18 Thread Glenn Maynard
On Fri, Mar 18, 2011 at 12:27 PM, Aryeh Gregor simetrical+...@gmail.com wrote:

 On Thu, Mar 17, 2011 at 10:03 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  Here's one ugliness with A: There's no way to specify ascending
  or descending for the individual components of the key.  So there's no
 way
  for me to open a cursor that looks at one field ascending and the other
  field descending.  In addition, I can't think of any easy/good ways to
 hack
  around this.
  Any thoughts on how we could address this use case?

 For what it's worth, the way MySQL does it is it doesn't.  If you have
 an index on (a, b), then it can be used for ORDER BY a, b or ORDER BY
 a DESC, b DESC, but not ORDER BY a DESC, b or ORDER BY a, b DESC.  In
 practice this usually works fine -- it's pretty rare that you really
 want to sort different columns in a different order.


Most SQL engines (PostgreSQL, SQLite) support CREATE INDEX idx ON tbl (date
DESC, name ASC).  This allows ORDER BY date DESC, name ASC (e.g. newest
events first, events per date sorted by name) and its reverse, date ASC,
name DESC.  MySQL is an outlier in not supporting this.
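
To illustrate what a mixed-direction index provides, here is a small JavaScript sketch of a comparator over compound (array) keys with a direction per component. compoundComparator and the sample data are hypothetical and not part of any IndexedDB draft.

```javascript
// Build a comparator for array keys; dirs[i] is 'asc' or 'desc' for
// component i of the key.
function compoundComparator(dirs) {
  return function (a, b) {
    for (var i = 0; i < dirs.length; i++) {
      if (a[i] < b[i]) return dirs[i] === 'desc' ? 1 : -1;
      if (a[i] > b[i]) return dirs[i] === 'desc' ? -1 : 1;
    }
    return 0;
  };
}

// Keys are [date, name] pairs.
var events = [
  ['2011-03-17', 'beta'],
  ['2011-03-18', 'zeta'],
  ['2011-03-18', 'alpha']
];

// Equivalent of ORDER BY date DESC, name ASC.
var sorted = events.slice().sort(compoundComparator(['desc', 'asc']));

// Walking the same ordering backwards flips every component:
// date ASC, name DESC.
var reversed = sorted.slice().reverse();
```

A single index on (date DESC, name ASC) can serve both orderings, because a cursor can traverse it forwards or backwards. It cannot serve date ASC, name ASC, which is the case that needs per-component directions.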

-- 
Glenn Maynard


Re: [Bug 12321] New: Add compound keys to IndexedDB

2011-03-18 Thread Aryeh Gregor
On Fri, Mar 18, 2011 at 12:51 PM, Glenn Maynard gl...@zewt.org wrote:
 Most SQL engines (PostgreSQL, SQLite) support CREATE INDEX idx ON tbl (date
 DESC, name ASC).  This allows ORDER BY date DESC, name ASC (e.g. newest
 events first, events per date sorted by name) and its reverse, date ASC,
 name DESC.  MySQL is an outlier in not supporting this.

In this as in many things.  Nevertheless, in my experience using
MySQL, it's rarely a problem, so solving it in a first pass is perhaps
not essential.  If you're going to solve it, I suggest allowing array
keys to be a dictionary, like

store = db.createObjectStore("mystore", [{col: "firstName", dir:
"asc"}, {col: "lastName", dir: "desc"}]);

where "foo" is equivalent to {col: "foo"} or whatever.  This will be
useful not just for sorting, but for any per-column option in the
index, such as what collation you want to use for text (which is
essential for international sorting).

If this syntax is to be used, though, the current syntax would remain
compatible, so column options can be pushed off to a later version of
the standard.
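
A sketch of how the two syntaxes could coexist, assuming a hypothetical normalizeKeyPath helper. The default direction of "asc" is an assumption made for this example and is not stated in the thread.

```javascript
// Accept a plain column name, a {col, dir} dictionary, or an array mixing
// both, and return a uniform list of {col, dir} entries.
function normalizeKeyPath(spec) {
  var list = Array.isArray(spec) ? spec : [spec];
  return list.map(function (entry) {
    if (typeof entry === 'string') {
      return { col: entry, dir: 'asc' }; // "foo" means {col: "foo"}
    }
    return { col: entry.col, dir: entry.dir || 'asc' };
  });
}

// Existing syntax keeps working...
var plain = normalizeKeyPath(['firstName', 'lastName']);

// ...and per-column options can be added later without breaking it.
var mixed = normalizeKeyPath(['firstName', { col: 'lastName', dir: 'desc' }]);
```

Other per-column options, such as a collation name, would ride along in the same dictionary.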



Re: [Bug 12321] New: Add compound keys to IndexedDB

2011-03-18 Thread Jeremy Orlow
On Fri, Mar 18, 2011 at 1:45 AM, Keean Schupke ke...@fry-it.com wrote:

 I like BDB's solution. You have one primary key you cannot mess with (say
 an integer for fast comparisons) you can then add any number of secondary
 indexes. With a secondary index there is a callback to generate a binary
 blob that is used for indexing. The callback has access to all the fields of
 the object plus any info in the closure and can use that to generate the
 index data any way it likes.


We discussed this a while ago.  IIRC, we decided to look at something like
it for v2.  It sounds like a good, general way to solve the problem though.
 And given the other discussion in this thread, it sounds like maybe this
isn't a super important use case to fix in the meantime.

J


 This has the advantage of supporting any indexing schemes the user may
 wish to implement (by writing a custom callback), whilst allowing a few
 common options to be provided for the user (say a hash of all fields, or a
 field name, international char set, and direction captured in a closure).
 The user gets the power, the core implementation is simple, and common cases
 can be implemented in an easy to use way.

 var lex_order = function(field, charset, direction) {return
 function(object) {/* map indexed 'field' to blob in required order */ return
 key;};};

 Then create a new index:

 object_store.validate_index(1, lex_order('name', 'us',
 'ascending')).on_done(function(status) {/* status ok or error */})

 validate index checks if the requested secondary index (1) exists, if it
 does not it creates the index and calls the done callback (with a status
 code indicating successful creation), if it does and it passes some
 validation checks it also calls the done callback (with a status code
 indicating successful validation). If anything goes wrong with either the
 creation or validation of the secondary index it would call the done
 callback with an error status code.


 Cheers,
 Keean.


 On 18 March 2011 02:03, Jeremy Orlow jor...@chromium.org wrote:

 Here's one ugliness with A: There's no way to specify ascending
 or descending for the individual components of the key.  So there's no way
 for me to open a cursor that looks at one field ascending and the other
 field descending.  In addition, I can't think of any easy/good ways to hack
 around this.

 Any thoughts on how we could address this use case?

 J

 On Wed, Mar 16, 2011 at 4:50 PM, bugzi...@jessica.w3.org wrote:

 http://www.w3.org/Bugs/Public/show_bug.cgi?id=12321

   Summary: Add compound keys to IndexedDB
   Product: WebAppsWG
   Version: unspecified
  Platform: PC
OS/Version: All
Status: NEW
  Severity: normal
  Priority: P2
 Component: Indexed Database API
AssignedTo: dave.n...@w3.org
ReportedBy: jor...@chromium.org
 QAContact: member-webapi-...@w3.org
CC: m...@w3.org, public-webapps@w3.org


 From the thread "[IndexedDB] Compound and multiple keys" by Jonas
 Sicking,
 we're going to go with both options A and B.

 =

 Hi IndexedDB fans (yay!!),

 Problem description:

 One of the current shortcomings of IndexedDB is that it doesn't
 support compound indexes. I.e. indexing on more than one value. For
 example it's impossible to index on, and therefore efficiently search
 for, firstname and lastname in an objectStore which stores people. Or
 index on to-address and date sent in an objectStore holding emails.

 The way this is traditionally done is that multiple values are used as
 key for each individual entry in an index or objectStore. For example
 the CREATE INDEX statement in SQL can list multiple columns, and the
 CREATE TABLE statement can list several columns as PRIMARY KEY.

 There have been a couple of suggestions how to do this in IndexedDB

 Option A)
 When specifying a key path in createObjectStore and createIndex, allow
 an array of key-paths to be specified. Such as

 store = db.createObjectStore("mystore", ["firstName", "lastName"]);
 store.add({firstName: "Benny", lastName: "Zysk", age: 28});
 store.add({firstName: "Benny", lastName: "Andersson", age: 63});
 store.add({firstName: "Charlie", lastName: "Brown", age: 8});

 The records are stored in the following order
 Benny, Andersson
 Benny, Zysk
 Charlie, Brown

 Similarly, createIndex accepts the same syntax:
 store.createIndex("myindex", ["lastName", "age"]);

 Option B)
 Allowing arrays as an additional data type for keys.
 store = db.createObjectStore("mystore", "fullName");
 store.add({fullName: ["Benny", "Zysk"], age: 28});
 store.add({fullName: ["Benny", "Andersson"], age: 63});
 store.add({fullName: ["Charlie", "Brown"], age: 8});

 Also allows out-of-line keys using:
 store = db.createObjectStore("mystore");
 store.add({age: 28}, ["Benny", "Zysk"]);
 store.add({age: 63}, ["Benny", "Andersson"]);
 store.add({age: 8}, ["Charlie", "Brown"]);

 (the sort order here is the same as in option A).

 Similarly, if an index pointed used a 

RE: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Pablo Castro

From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On 
Behalf Of Keean Schupke
Sent: Friday, March 18, 2011 1:53 AM

 See my proposal in another thread. The basic idea is to copy BDB. Have a 
 primary index that is based on an integer, something primitive and fast. 
 Allow secondary indexes which use a callback to generate a binary index key. 
 IDB shifts the complexity out into a library. Common use cases can be 
 provided (a hash of all fields in the object, internationalised 
 bidirectional lexicographic etc...), but the user is free to write their own 
 for less usual cases (for example indexing by the last word in a name string 
 to order by surname).

I agree with Jeremy's comments on the other thread for this. Having the 
callback mechanism definitely sounds interesting but there are a ton of common 
cases that we can solve by just taking a language identifier, I'm not sure we 
want to make people work hard to get something that's already supported in most 
systems. The idea of having a callback to compute the index value feels 
incremental to this, so we could take it on later without disrupting the
explicit international collation stuff.

 On 18 March 2011 02:19, Jonas Sicking jo...@sicking.cc wrote:
 2011/3/17 Pablo Castro pablo.cas...@microsoft.com:
 
  From: Jonas Sicking [mailto:jo...@sicking.cc]
  Sent: Tuesday, March 08, 2011 1:11 PM
 
  All in all, is there anything preventing adding the API Pablo suggests
  in this thread to the IndexedDB spec drafts?
 
  I wanted to propose a couple of specific tweaks to the initial proposal 
  and then unless I hear pushback start editing this into the spec.
 
  From reading the details on this thread I'm starting to realize that 
  per-database collations won't do it. What did it for me was the example 
  that has a fuzzier matching mode (case/accent insensitive). This is 
  exactly the kind of index I would want to sort people's names in my 
  address book, but most likely not the index I'll want to use for my 
  primary key.
 
  Refactoring the API to accommodate for this would mean to move the 
  setCollation() method and the collation property to the object store and 
  index objects. If we were willing to live without the ability to change 
  them we could take collation as one of the optional parameters to 
  createObjectStore()/createIndex() and reduce a bit of surface area...
 Unfortunately I think you bring up good use cases for
 per-objectStore/index collations. It's definitely tempting to just add
 it as an optional parameter to createObjectStore/createIndex. The
 downside is obviously pushing more complexity onto web developers.
 Complexity which will be duplicated across sites.

 However there is another problem to consider here. Can switching
 collation on an objectStore or a unique index affect its validity?
 I.e. if you switch from a case sensitive to a case insensitive
 collation, does that mean that if you have two entries with the
 primary keys "Sweden" and "sweden" they collide and thus the change of
 collation must result in an error (or aborted transaction)?

 I do seem to recall that there are ways to do at least case
 sensitivity such that you generally don't take case into account when
 sorting, unless two entries are exactly the same, in which case you do
 look at casing to differentiate them. However I don't really know a
 whole lot about this and so defer to people that know
 internationalization better.

This is a good point. It makes me lean toward not allowing changing the 
collation of an index or store. That means we could just have an optional 
parameter (in the generic parameter object thingy we have now) on 
createObjectStore and createIndex that indicates the collation name. It seems 
minimally disruptive, it doesn't tax people that don't care about it, and since 
there is no setCollation we don't have the problem of not being able to 
re-index the data.

  Another piece of feedback I heard consistently as I discussed this with 
  various folks at Microsoft is the need to be able to pick up what the UA 
  would consider the collation that's most appropriate for the user 
  environment (derived from settings, page language or whatever). We could 
  support this by introducing a special value that  you can pass to 
  setCollation that indicates pick whatever is the right for the 
  environment's language right now. Given that there is no other way for 
  people to discover the user preference on this, I think this is pretty 
  important.
 I would be fine with this as long as it's an explicit opt-in. There is
 definitely a risk that people will do this and then only do testing in
 one language, but it seems to me like a useful use case to support,
 and I don't see a way of supporting this while completely avoiding the
 risk of internationalization bugs.

I agree, it should be opt-in. I still assume we'll default to binary collation 
(same if you specify the collation value as null). I was 

Re: API for matrix manipulation

2011-03-18 Thread Chris Marrin

On Mar 15, 2011, at 5:08 PM, Tab Atkins Jr. wrote:

 On Tue, Mar 15, 2011 at 5:00 PM, Chris Marrin cmar...@apple.com wrote:
 I think it would be nice to unify the classes somehow. But that might be 
 difficult since SVG and CSS are (necessarily) separate specs. But maybe one 
 of the API gurus has a solution?
 
 We just discussed this on Monday at the FXTF telcon.  Sounds like
 people are, in general, okay with just using a 4x4 matrix, though
 there are some possible implementation issues with devices that can't
 do 3d at all.  (It was suggested that they can simply do a 2d
 projection, which is simple.)

I don't think there are implementation issues other than performance related 
ones. As you say, you can always flatten a 3D matrix for use in a purely 2D 
renderer. We do this in the WebKit implementation in some cases. But doing 4x4 
matrix math can be expensive, especially on less capable hardware. So it would
probably be valuable to have a set of 2D affine calls on any future 4x4 class, 
so an implementation can easily optimize by knowing they can get away with 
doing a subset of the math if all the operands are 2D affine matrices. But 
that's just a bit of extra API.
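
A rough sketch of that optimization in JavaScript. The 16-element column-major array layout (the same order as the arguments of CSS matrix3d()) and all function names are assumptions of this example, not a proposed API.

```javascript
// Slots that carry the six 2D affine values a, b, c, d, e, f.
var FREE_SLOTS = { 0: true, 1: true, 4: true, 5: true, 12: true, 13: true };
// Slots that must be exactly 1 in a 2D affine matrix; all others must be 0.
var ONE_SLOTS = { 10: true, 15: true };

function is2DAffine(m) {
  for (var i = 0; i < 16; i++) {
    if (FREE_SLOTS[i]) continue;
    if (m[i] !== (ONE_SLOTS[i] ? 1 : 0)) return false;
  }
  return true;
}

// General 4x4 product p * q: 64 multiplications.
function multiplyFull(p, q) {
  var r = new Array(16);
  for (var col = 0; col < 4; col++) {
    for (var row = 0; row < 4; row++) {
      var sum = 0;
      for (var k = 0; k < 4; k++) sum += p[k * 4 + row] * q[col * 4 + k];
      r[col * 4 + row] = sum;
    }
  }
  return r;
}

// Product of two 2D affine matrices: 12 multiplications.
function multiply2D(p, q) {
  var a = p[0] * q[0] + p[4] * q[1];
  var b = p[1] * q[0] + p[5] * q[1];
  var c = p[0] * q[4] + p[4] * q[5];
  var d = p[1] * q[4] + p[5] * q[5];
  var e = p[0] * q[12] + p[4] * q[13] + p[12];
  var f = p[1] * q[12] + p[5] * q[13] + p[13];
  return [a, b, 0, 0, c, d, 0, 0, 0, 0, 1, 0, e, f, 0, 1];
}

// Take the cheap path only when both operands are known to be 2D affine.
function multiply(p, q) {
  return is2DAffine(p) && is2DAffine(q) ? multiply2D(p, q) : multiplyFull(p, q);
}
```

An implementation that tracks the "is 2D affine" flag on the matrix object itself could skip the per-call check as well.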

-
~Chris
cmar...@apple.com







Re: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Jonas Sicking
On Fri, Mar 18, 2011 at 12:29 PM, Pablo Castro
pablo.cas...@microsoft.com wrote:

 From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On 
 Behalf Of Keean Schupke
 Sent: Friday, March 18, 2011 1:53 AM

 See my proposal in another thread. The basic idea is to copy BDB. Have a 
 primary index that is based on an integer, something primitive and fast. 
 Allow secondary indexes which use a callback to generate a binary index 
 key. IDB shifts the complexity out into a library. Common use cases can be 
 provided (a hash of all fields in the object, internationalised 
 bidirectional lexicographic etc...), but the user is free to write their 
 own for less usual cases (for example indexing by the last word in a name 
 string to order by surname).

 I agree with Jeremy's comments on the other thread for this. Having the 
 callback mechanism definitely sounds interesting but there are a ton of 
 common cases that we can solve by just taking a language identifier, I'm not 
 sure we want to make people work hard to get something that's already 
 supported in most systems. The idea of having a callback to compute the index 
 value feels incremental to this, so we could take it on later without
 disrupting the explicit international collation stuff.

 On 18 March 2011 02:19, Jonas Sicking jo...@sicking.cc wrote:
 2011/3/17 Pablo Castro pablo.cas...@microsoft.com:
 
  From: Jonas Sicking [mailto:jo...@sicking.cc]
  Sent: Tuesday, March 08, 2011 1:11 PM
 
  All in all, is there anything preventing adding the API Pablo suggests
  in this thread to the IndexedDB spec drafts?
 
  I wanted to propose a couple of specific tweaks to the initial proposal 
  and then unless I hear pushback start editing this into the spec.
 
  From reading the details on this thread I'm starting to realize that 
  per-database collations won't do it. What did it for me was the example 
  that has a fuzzier matching mode (case/accent insensitive). This is 
  exactly the kind of index I would want to sort people's names in my 
  address book, but most likely not the index I'll want to use for my 
  primary key.
 
  Refactoring the API to accommodate for this would mean to move the 
  setCollation() method and the collation property to the object store and 
  index objects. If we were willing to live without the ability to change 
  them we could take collation as one of the optional parameters to 
  createObjectStore()/createIndex() and reduce a bit of surface area...
 Unfortunately I think you bring up good use cases for
 per-objectStore/index collations. It's definitely tempting to just add
 it as an optional parameter to createObjectStore/createIndex. The
 downside is obviously pushing more complexity onto web developers.
 Complexity which will be duplicated across sites.

 However there is another problem to consider here. Can switching
 collation on an objectStore or a unique index affect its validity?
 I.e. if you switch from a case sensitive to a case insensitive
 collation, does that mean that if you have two entries with the
 primary keys "Sweden" and "sweden" they collide and thus the change of
 collation must result in an error (or aborted transaction)?

 I do seem to recall that there are ways to do at least case
 sensitivity such that you generally don't take case into account when
 sorting, unless two entries are exactly the same, in which case you do
 look at casing to differentiate them. However I don't really know a
 whole lot about this and so defer to people that know
 internationalization better.

 This is a good point. It makes me lean toward not allowing changing the 
 collation of an index or store. That means we could just have an optional 
 parameter (in the generic parameter object thingy we have now) on 
 createObjectStore and createIndex that indicates the collation name. It seems 
 minimally disruptive, it doesn't tax people that don't care about it, and 
 since there is no setCollation we don't have the problem of not being able to 
 re-index the data.

So there is no way to specify things such that the collation doesn't
affect unique-ness? If so, I tend to agree.
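
One possible reading of the tie-breaking approach recalled in the quoted text above, sketched with the ECMAScript Intl.Collator API. That API postdates this thread and is used here only as a stand-in for whatever collation a UA might implement; the sample names are made up.

```javascript
// A collator that ignores case for the main ordering but still uses it to
// separate otherwise identical strings (sensitivity 'variant').
var tieBreaking = new Intl.Collator('en', { sensitivity: 'variant' });

var names = ['sweden', 'Switzerland', 'Sweden'];

// Plain code unit order puts every uppercase letter before any lowercase
// one, so 'Switzerland' lands between the two spellings of Sweden.
var binarySorted = names.slice().sort();

// The collator keeps both spellings of Sweden next to each other while
// still treating them as distinct keys.
var collated = names.slice().sort(tieBreaking.compare);
var distinct = tieBreaking.compare('Sweden', 'sweden') !== 0;
```

With such a collation, two keys that are unequal under binary comparison stay unequal, so switching to it would not by itself invalidate a unique index.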

  Another piece of feedback I heard consistently as I discussed this with 
  various folks at Microsoft is the need to be able to pick up what the UA 
  would consider the collation that's most appropriate for the user 
  environment (derived from settings, page language or whatever). We could 
  support this by introducing a special value that  you can pass to 
  setCollation that indicates pick whatever is the right for the 
  environment's language right now. Given that there is no other way for 
  people to discover the user preference on this, I think this is pretty 
  important.
 I would be fine with this as long as it's an explicit opt-in. There is
 definitely a risk that people will do this and then only do testing in
 one language, but it seems to me like a useful use case to support,
 and I don't see a way of supporting 

RE: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Pablo Castro

From: Jonas Sicking [mailto:jo...@sicking.cc] 
Sent: Friday, March 18, 2011 1:57 PM

  However there is another problem to consider here. Can switching
  collation on an objectStore or a unique index affect its validity?
  I.e. if you switch from a case sensitive to a case insensitive
  collation, does that mean that if you have two entries with the
  primary keys "Sweden" and "sweden" they collide and thus the change of
  collation must result in an error (or aborted transaction)?
 
  I do seem to recall that there are ways to do at least case
  sensitivity such that you generally don't take case into account when
  sorting, unless two entries are exactly the same, in which case you do
  look at casing to differentiate them. However I don't really know a
  whole lot about this and so defer to people that know
  internationalization better.
 
  This is a good point. It makes me lean toward not allowing changing the 
  collation of an index or store. That means we could just have an optional 
  parameter (in the generic parameter object thingy we have now) on 
  createObjectStore and createIndex that indicates the collation name. It 
  seems minimally disruptive, it doesn't tax people that don't care about 
  it, and since there is no setCollation we don't have the problem of not 
  being able to re-index the data.

 So there is no way to specify things such that the collation doesn't
 affect unique-ness? If so, I tend to agree.

The problem is that different collations will consider different things unique. 
This is bound to be variable across languages and such, so I'm not sure we want 
to be in the business of fine-tuning this. It seems that being a bit more 
restrictive could result in a more robust result overall. If someone really 
needs to change the collation they can copy the table manually...not great, but 
if we think it's a corner case it's probably fine.
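
The "Sweden"/"sweden" case can be sketched as follows. Intl.Collator is a later ECMAScript API and is used here only as a stand-in for a UA-provided collation; findCollisions and binaryCompare are hypothetical helpers.

```javascript
// Return pairs of keys that the given comparator treats as equal.
function findCollisions(keys, compare) {
  var sorted = keys.slice().sort(compare);
  var collisions = [];
  for (var i = 1; i < sorted.length; i++) {
    if (compare(sorted[i - 1], sorted[i]) === 0) {
      collisions.push([sorted[i - 1], sorted[i]]);
    }
  }
  return collisions;
}

// Binary (code unit) comparison, the default assumed in this thread.
function binaryCompare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Case- and accent-insensitive comparison.
var insensitive = new Intl.Collator('en', { sensitivity: 'base' }).compare;

var keys = ['Sweden', 'sweden', 'Norway'];
var binaryCollisions = findCollisions(keys, binaryCompare);    // none
var insensitiveCollisions = findCollisions(keys, insensitive); // one pair
```

A store that is valid under the binary collation would violate uniqueness under the insensitive one, which is the reason for fixing the collation at creation time and copying the data into a new store when it has to change.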

   Another piece of feedback I heard consistently as I discussed this 
   with various folks at Microsoft is the need to be able to pick up what 
   the UA would consider the collation that's most appropriate for the 
   user environment (derived from settings, page language or whatever). 
   We could support this by introducing a special value that  you can 
   pass to setCollation that indicates pick whatever is the right for 
   the environment's language right now. Given that there is no other 
   way for people to discover the user preference on this, I think this 
   is pretty important.
  I would be fine with this as long as it's an explicit opt-in. There is
  definitely a risk that people will do this and then only do testing in
  one language, but it seems to me like a useful use case to support,
  and I don't see a way of supporting this while completely avoiding the
  risk of internationalization bugs.
 
  I agree, it should be opt-in. I still assume we'll default to binary 
  collation (same if you specify the collation value as null). I was reading 
  the BCP 47 [1] and in section 4.1 Choice of Language Tag the item #7 
  seems to describe what we're looking for. The value i-default seems to 
  match our needs close enough, so callers could use that value. 
  Discoverability is not great, but we avoid having to specify something 
  new, and arguably they'll need to read somewhere that this argument is a 
  BCP47-compatible value, and we could put a comment about i-default right 
  there.

 Sounds good to me. Though you seem to have forgotten to include the
 [1] reference.

Oops, here it goes:
 [1] http://tools.ietf.org/html/bcp47





Re: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Keean Schupke
On 18 March 2011 19:29, Pablo Castro pablo.cas...@microsoft.com wrote:


 From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com]
 On Behalf Of Keean Schupke
 Sent: Friday, March 18, 2011 1:53 AM

  See my proposal in another thread. The basic idea is to copy BDB. Have a
 primary index that is based on an integer, something primitive and fast.
 Allow secondary indexes which use a callback to generate a binary index key.
 IDB shifts the complexity out into a library. Common use cases can be
 provided (a hash of all fields in the object, internationalised
 bidirectional lexicographic etc...), but the user is free to write their own
 for less usual cases (for example indexing by the last word in a name string
 to order by surname).

 I agree with Jeremy's comments on the other thread for this. Having the
 callback mechanism definitely sounds interesting but there are a ton of
 common cases that we can solve by just taking a language identifier, I'm not
 sure we want to make people work hard to get something that's already
 supported in most systems. The idea of having a callback to compute the
 index value feels incremental to this, so we could take it on later
 without disrupting the explicit international collation stuff.


The idea would be to provide pre-defined implementations of the callback for
common use cases, then it is just as simple to register a callback as set
any other option. All this means to the API is you pass a function instead
of a string. It also is better for modularity as all the code relating to
the sort order is kept in the callback functions.

The difference comes down to something like:

index.set_order_lexicographic('us');

vs

index.set_order_method(order_lexicographic('us'));

So more than just setting a property like the first case, where presumably
all the ordering code is mixed in with the indexing code, the second case
encapsulates all the ordering code in the function returned from the
execution of order_lexicographic('us'). This function would represent a
mapping from the object being indexed to a binary blob that is the actual
stored index data.

So doing it this way does not necessarily make things harder, and it
improves encapsulation, type safety, and the flexibility of the API.
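
A rough sketch of the second style, under stated assumptions. Only the name set_order_method comes from the example above; ToyIndex, order_field and order_surname are hypothetical, and string keys stand in for the binary blob.

```javascript
// Toy index that delegates key generation to a caller-supplied function.
function ToyIndex() {
  this.keyOf = function (object) { return String(object.id); }; // default order
  this.entries = [];
}

ToyIndex.prototype.set_order_method = function (keyOf) {
  this.keyOf = keyOf;
};

ToyIndex.prototype.add = function (object) {
  this.entries.push({ key: this.keyOf(object), object: object });
  this.entries.sort(function (a, b) {
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  });
};

// Predefined generator for a common case: order by one field as-is.
function order_field(field) {
  return function (object) { return String(object[field]); };
}

// Custom generator for a less usual case: order by the last word of a
// name string, i.e. by surname.
function order_surname(field) {
  return function (object) {
    var words = String(object[field]).trim().split(/\s+/);
    return words[words.length - 1] + ' ' + words.slice(0, -1).join(' ');
  };
}

var people = [
  { name: 'Benny Zysk' },
  { name: 'Charlie Brown' },
  { name: 'Benny Andersson' }
];

var bySurname = new ToyIndex();
bySurname.set_order_method(order_surname('name'));
people.forEach(function (p) { bySurname.add(p); });
```

The index code never inspects the objects itself; swapping order_surname('name') for order_field('name') changes the ordering without touching ToyIndex.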


Cheers,
Keean.