On Apr 9, 2009, at 12:52 PM, Maciej Stachowiak wrote:
On Apr 9, 2009, at 8:19 AM, Boris Zbarsky wrote:
Giovanni Campagna wrote:
So why not add a parameter on openDatabase() to specify what kind
of database we want (and what kind of query language we will use)?
I mean something like
openDatabase(name, version, type, displayName, estimatedSize)
where type can be any string
so, for example, type = "sql" uses the standard SQL, type="sqlite"
uses SQLite extensions, type="-vendor-xyz" is a vendor specific
extension, etc.
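
To make the proposal concrete, here is a rough sketch of how such a call might behave. Everything in it is an assumption made for illustration: the function name openDatabaseWithType, the DatabaseHandle shape, and the list of supported types are hypothetical and come from no spec or implementation.

```typescript
// Hypothetical sketch only: these names are illustrative, not from any spec.
type DatabaseHandle = {
  name: string;
  version: string;
  type: string;
  displayName: string;
  estimatedSize: number;
};

// Dialects this imaginary user agent claims to understand.
const supportedTypes: string[] = ["sql", "sqlite"];

function openDatabaseWithType(
  name: string,
  version: string,
  type: string,
  displayName: string,
  estimatedSize: number
): DatabaseHandle {
  // Refuse dialects the user agent does not implement, so a page can
  // catch the failure and retry with another type.
  if (supportedTypes.indexOf(type) === -1) {
    throw new Error("Unsupported database type: " + type);
  }
  return { name, version, type, displayName, estimatedSize };
}

const db = openDatabaseWithType("notes", "1.0", "sqlite", "Notes", 1024 * 1024);
```

The type string here only selects a dialect by name; the sketch says nothing about what each dialect actually accepts.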
How does this solve the original "no such thing as standard SQL,
really" issue?
I agree that "no such thing as standard SQL" (or rather the fact
that implementations all have extensions and divergences from the
spec) is a problem. But I am not sure inventing a brand new query
language and database model as proposed by Vlad is a good solution
to this problem. A few thoughts off the cuff in no particular order:
1) Applications are starting to be deployed which use the SQL-based
storage API, such as the mobile version of GMail. So it may be too
late for us to remove SQL storage from WebKit entirely. If we want
this content to interoperate with non-WebKit-based user agents, then
we will ultimately need a clear spec for the SQL dialect to use,
even if we also added an OODB or a relational database using some
other query language.
Not clear who is 'us' or 'we'. It's great that WebKit is sporting a
preliminary spec and people are building applications on it, but
anyone can rely on it at their own risk. It does not justify why the
SQL technology is superior to other choices, however preliminary they
might be. Or you risk going down the path of EJB which only 'got
persistence right' after 5 iterations and only when the public
declared a revolt and completely gave up on J2EE persistence.
I find it hard to accept that the choice made by any current
application makes us all beholden to a technology as standard for
something as fundamentally path-breaking as locally persistent
structured storage.
Take, for example, the history of CSS. Its origins lie in TBL's
original user agent on NeXT, and it was first proposed for
standardization in 1993 [1]. Early on there was no notion of cascading,
or even an understanding of selectors. Eventually, selectors became
critical to its success and allowed syntax to be separated from
semantics rather
gracefully. By the same token, we are beginning to talk about various
storage techniques including opaque and clear structures, which is
great. Nonetheless, SQL as a given cannot be assumed to be the only
acceptable query language for the data being managed. Problems of this
model include:
1. Problems caused by version mismatches between server and client
schemas significantly affect decentralized evolution.
2. The relational model does not allow data for which there is no schema
placeholder, thus rejecting the principle of 'ignore what you don't
understand'.
3. Incompatibility between database ids and URLs means that
applications live a schizophrenic life when it comes to identifying
data.
4. Incompatibility between database operations and HTTP methods means
that atomicity and idempotence guarantees of HTTP are inapplicable to
local data.
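
Point 2 in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain array stands in for a table, a fixed column list stands in for a relational schema, and every name (contactColumns, strictInsert, tolerantInsert, readKnownFields) is hypothetical.

```typescript
// Hypothetical illustration: a fixed column list stands in for a schema.
type Row = { [column: string]: string };
const contactColumns: string[] = ["id", "fullName", "email"];

// Schema-bound insert: a field with no matching column is an error,
// roughly as an INSERT naming an unknown column would be.
function strictInsert(table: Row[], record: Row): void {
  for (const field of Object.keys(record)) {
    if (contactColumns.indexOf(field) === -1) {
      throw new Error("no such column: " + field);
    }
  }
  table.push(record);
}

// Tolerant insert: the record is kept whole, unknown fields included.
function tolerantInsert(store: Row[], record: Row): void {
  store.push(record);
}

// A reader that ignores what it does not understand: it picks out the
// fields it knows and leaves the rest untouched in the store.
function readKnownFields(record: Row): Row {
  const known: Row = {};
  for (const column of contactColumns) {
    if (Object.prototype.hasOwnProperty.call(record, column)) {
      known[column] = record[column];
    }
  }
  return known;
}

// A record produced against a newer server schema with an extra field.
const fromNewerServer: Row = {
  id: "7",
  fullName: "Ada",
  email: "ada@example.org",
  pager: "555-0100",
};

const strictTable: Row[] = [];
const tolerantStore: Row[] = [];
let strictError = "";
try {
  strictInsert(strictTable, fromNewerServer);
} catch (e) {
  strictError = String(e);
}
tolerantInsert(tolerantStore, fromNewerServer);
```

The older client fails outright in the strict case, while in the tolerant case it keeps working and the extra field survives for a newer client to use.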
2) It's true that the server side code for many Web sites uses an
object-relational mapping layer. However, so far as I know, very few
use an actual OODB. Relational databases are dominant in the market
and OODBs are a rarely used niche product. Thus, I question Vlad's
suggestion that a client-side OODB would sufficiently meet the needs
of authors. Rather, we should make sure that the platform supports
adding an object-relational mapping on top of SQL storage.
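
As a rough idea of what a thin mapping layer on top of SQL storage could look like, the sketch below turns a plain object into a parameterized INSERT statement plus its argument list, the kind of pair a script might hand to an executeSql-style call. The helper toInsert and the Statement type are hypothetical, and the sketch assumes table and column names are trusted, since it does not escape them.

```typescript
// Hypothetical mapping helper; not part of any storage API.
type Statement = { sql: string; args: Array<string | number> };

function toInsert(
  table: string,
  obj: { [key: string]: string | number }
): Statement {
  // Sort the keys so the generated SQL is stable for a given object shape.
  const columns = Object.keys(obj).sort();
  // One "?" placeholder per column keeps the values out of the SQL text.
  const placeholders = columns.map(() => "?").join(", ");
  return {
    sql:
      "INSERT INTO " + table +
      " (" + columns.join(", ") + ") VALUES (" + placeholders + ")",
    args: columns.map((column) => obj[column]),
  };
}

const stmt = toInsert("messages", { id: 1, subject: "hello" });
// stmt.sql  is "INSERT INTO messages (id, subject) VALUES (?, ?)"
// stmt.args is [1, "hello"]
```

A real mapping layer would also need to handle reads, updates, and schema creation; this only shows that the object-to-row direction needs nothing beyond parameterized statements.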
It is fine for browsers to use a database to store data and then
provide a limited IDL interface to access this data just like Mozilla
decided to use SQLite for storing browsing history. It is another
thing entirely to expose the full syntax and semantics of SQL as a
general-purpose API. That is fraught with peril, as anyone who has
tried to standardize data access technologies would attest.
3) It's not obvious to me that designing and clearly specifying a
brand new query language would be easier than specifying a dialect
of SQL. Note that this may require implementations to actually parse
queries themselves and possibly change them, to ensure that the
accepted syntax and semantics conform to the dialect. We are ok with
this.
It is not clear to me whether the converse is true. Have you seen any
attempt to standardize a syntax as complicated as SQL? When do you
hope to have a Rec of a "SQL subset for Web applications" ready?
4) It's not obvious to me that writing a spec for a query language
with (afaik) a single implementation, such as jLINQ, is easier than
writing a clear and correct spec for "what SQLite does" or some
subset thereof.
If 10 years of successive attempts at harmonizing JDBC access have
still skirted the issue of "standardizing" SQL for the purposes of
data access from Java applications, what hope do we have to answer the
same questions in a Web browser world?
Thus, I think the best path forward is to spec a particular SQL
dialect, even though that task may be boring and unpleasant and not
as fun as inventing a new kind of database.
It is not about being boring or uninteresting - the problem is that
fundamentally, we are sowing the seeds of our own destruction by
leading the wider Web community to believe that SQL and the relational
model are the right approach for dealing with Web application (locally
persistent) data.
Regards,
Maciej
[1] http://virtuelvis.com/archives/2005/01/css-history