On 01 Mar 2011, at 7:27 PM, Jeremy Orlow wrote:
> 1. Be able to put an object and pass an array of index names which must
> reference the object. This may remove the need for a complicated indexing
> spec (perhaps the reason why this issue has been pushed into the future) and
> give developers all the flexibility they need.
>
> You're talking about having multiple entries in a single index that point
> towards the same primary key? If so, then I strongly agree, and I think
> others agree as well. It's mostly a question of syntax. A while ago we
> brainstormed a couple possibilities. I'll try to send out a proposal this
> week. I think this + compound keys should probably be our last v1 features
> though. (Though they almost certainly won't make Chrome 11 or Firefox 4,
> unfortunately, hopefully they'll be done in the next version of each, and
> hopefully that release will be fairly soon after for both.)
Yes, for example this user object { name: "Joran Greef", emails:
["[email protected]", "[email protected]"] } with indexes on the "emails"
property, would be found in the "[email protected]" index as well as in the
"[email protected]" index.
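To make the multi-entry idea concrete, here is a minimal in-memory sketch. It is not IDB code: the addToIndex helper, the Map-of-Sets layout, the id field and the example.com addresses are all assumptions made up for illustration.

```javascript
// Hypothetical in-memory index: maps each index key to the set of
// primary keys stored under that key. Not part of any IDB API.
function addToIndex(index, keys, primaryKey) {
  for (const key of keys) {
    if (!index.has(key)) {
      index.set(key, new Set());
    }
    index.get(key).add(primaryKey);
  }
}

// Placeholder user object; the id and the addresses are invented.
const emailIndex = new Map();
const user = {
  id: 1,
  name: "Joran Greef",
  emails: ["first@example.com", "second@example.com"],
};

// One object, two index entries pointing at the same primary key.
addToIndex(emailIndex, user.emails, user.id);
```

Looking up either address in emailIndex then yields a set containing the same primary key.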
What I've been thinking, though, is that even formally specifying indexes in
advance of object put calls is a problem: it pushes too much application model
logic into the database layer, making the database enforce a schema (at least
in terms of indexes). Of course IDB facilitates migrations in the form of
setVersion, but most schema migrations are also coupled with changes to the
data itself, and those would still have to be done by the application in any
event. So at the moment IDB takes too much responsibility on behalf of the
application (computing indexes, pre-defined indexes, pseudo migrations) and not
enough responsibility for pure database operations (index intersections and
index unions).
I would argue that things like migrations and schemas are best handled by the
application, even if this is more work for the application, as most people will
write wrappers for IDB in any event and IDB is supposed to be a core-level API.
The acid test must be that the database is oblivious to schemas or anything
pre-defined or application-specific (i.e. stateless). Otherwise IDB risks being
a database for newbies who wouldn't use it, and a database that others would
treat as a KV store anyway (see MySQL at FriendFeed).
A suggested interface for putting or deleting objects would then be:
objectStore.put(object, ["indexname1", "indexname2", "indexname3"]), and IDB
would need to ensure that the object is referenced by the given index
names. When removing the object, the application would need to provide the
indexes again (or IDB could keep track of the indexes associated with an
object).
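A rough in-memory sketch of that interface follows, using the variant where the store keeps track of the indexes associated with each object. Everything here is hypothetical (the SketchStore class, its method names, and the explicit id argument) and it is not how any IDB implementation works.

```javascript
// Hypothetical store: objects are opaque values, and index membership is
// supplied by the caller on every put rather than computed from the object.
class SketchStore {
  constructor() {
    this.objects = new Map();     // id -> opaque object
    this.indexes = new Map();     // index name -> Set of ids
    this.memberships = new Map(); // id -> index names given at the last put
  }

  put(id, object, indexNames) {
    this.delete(id); // drop references left over from an earlier put
    this.objects.set(id, object);
    this.memberships.set(id, [...indexNames]);
    for (const name of indexNames) {
      if (!this.indexes.has(name)) {
        this.indexes.set(name, new Set());
      }
      this.indexes.get(name).add(id);
    }
  }

  delete(id) {
    // The store remembers the memberships, so the caller need not repeat them.
    for (const name of this.memberships.get(id) || []) {
      this.indexes.get(name).delete(id);
    }
    this.memberships.delete(id);
    this.objects.delete(id);
  }

  ids(indexName) {
    return this.indexes.get(indexName) || new Set();
  }
}
```

Usage would be along the lines of store.put(1, user, ["tenant:a", "role:admin"]) followed later by store.delete(1). The store never inspects the object, so no schema has to be declared in advance.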
Using a function to compute indexes would not work, as this would entrap
application-specific schema knowledge within the function (which would need to
be persisted). That knowledge may subsequently change in the application, which
would then need a way to modify the function again. The key is that these
things must be stateless.
The objects must be opaque to IDB (no need for serialization/deserialization
overhead at the DB layer). Things like key-paths etc. could be removed and the
object id just passed in to put or delete calls.
> 2. Be able to intersect and union indexes. This covers a tremendous amount of
> ground in terms of authorization and filtering.
>
> Our plan was to punt some sort of join language to v2. Could you give a more
> concrete proposal for what we'd add? It'd make it easier to see if it's
> something realistic for v1 or not.
If you can perform intersect or union operations (and combinations of these) on
indexes (which are essentially sets or sorted sets), then this would be the
join language. It has the benefit that the interface would then be described in
terms of operations on data structures (set operations on sets) rather than a
custom language which would take longer to spec out.
I've written databases over append-only files, S3, WebSQL and even LocalStorage
(!) and from what I've found with my own applications, you could handle
everything from multi-tenant authorization to adequate filtering with the
following operations:
1. intersect([ index1, index2 ])
2. union([ index1, index2 ])
3. intersect([ union([ index1, index2 ]), index3, index4, index5, index6,
index7 ])
Hopefully, a join language described in terms of pure set operations would be
much simpler to implement and easier to use and reason about.
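As an illustration of the three operations listed above, here is a small sketch that treats each index as a plain JavaScript Set of primary keys. The function names mirror the list, but the index names and contents are invented, and a real implementation would presumably work over sorted on-disk structures rather than in-memory Sets.

```javascript
// Union: every id that appears in at least one of the given sets.
function union(sets) {
  const result = new Set();
  for (const set of sets) {
    for (const id of set) {
      result.add(id);
    }
  }
  return result;
}

// Intersection: every id that appears in all of the given sets.
// Iterating over the smallest set first keeps the work proportional to it.
function intersect(sets) {
  if (sets.length === 0) {
    return new Set();
  }
  const [smallest, ...rest] = [...sets].sort((a, b) => a.size - b.size);
  const result = new Set();
  for (const id of smallest) {
    if (rest.every((set) => set.has(id))) {
      result.add(id);
    }
  }
  return result;
}

// Invented example indexes for a multi-tenant authorization check.
const tenantA = new Set([1, 2, 3, 4]);
const admins = new Set([2, 9]);
const editors = new Set([3, 4]);
const active = new Set([2, 3, 7]);

// Objects in tenant A that are active and owned by an admin or an editor.
const visible = intersect([union([admins, editors]), tenantA, active]);
// visible contains 2 and 3
```

Because both functions take and return the same kind of value, they nest freely, which is what lets the third form in the list express a filter without a separate query language.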
In fact, I think if IDB offered only a single object store and the indexing
system described above, it would be completely perfect. That's all that's
needed. No need for a V2. Just a focus on high performance thereafter.