RE: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Laxmi Narsimha Rao Oruganti
That is all the responsibility of the database system.  We don't need to tell 
database systems how to do it; we just need to tell them what to do.  
Today database systems have a lock manager that takes care of these 
responsibilities.  

Coming to the question of transactions failing unpredictably: even with the 
current specification, transactions do fail.  For example, if there is a writer 
transaction already holding an exclusive lock, the new thread would 
fail to acquire the lock.  The failures would still be there.

Now the next question people would ask is how we make sure that partial 
changes do not cause problems if a failure occurs in the middle of a sequence 
of operations.  That is the responsibility of the transaction manager.  Note that 
the transaction manager treats the whole sequence as a single atomic unit.  

Are we missing something?

Thanks,
Laxmi 

-Original Message-
From: Ian Hickson [mailto:i...@hixie.ch] 
Sent: Friday, July 24, 2009 6:55 AM
To: Nikunj R. Mehta
Cc: public-webapps WG; Laxmi Narsimha Rao Oruganti
Subject: Re: [WebStorage] Concerns on spec section 'Processing Model'

On Thu, 16 Jul 2009, Nikunj R. Mehta wrote:

 The spec should not restrict implementations to any one level of 
 concurrency unless there are specific undesirable effects.
 
 Restricting the database to a single writer means that if there are 
 separate workers or background threads working to update 
 non-overlapping portions, then they have to wait for the lone current 
 writer. Implementations can certainly compete to produce the level of 
 concurrency that developers need.
 Specifically, I propose that the following text [[ If the mode is 
 read/write, the transaction must have an exclusive write lock over the 
 entire database. If the mode is read-only, the transaction must have a 
 shared read lock over the entire database. The user agent should wait 
 for an appropriate lock to be available.
 ]]
 
 be replaced with the following text
 
 [[
 Multiple read-only transactions may share the same data as long as 
 there is no transaction attempting to write the  data being read. The 
 user agent must wait for transactions that are reading some data 
 before allowing a read/write transaction on the same data to continue.
 ]]

Since there's no way for the author to say ahead of time which rows or cells 
the transactions are going to use, how can you do the above without ending up 
with some transactions failing unpredictably?

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'




RE: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Ian Hickson
On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:

 That is all the responsibility of database system.  We don't need to 
 tell database systems on how to do it, we just need to tell them on what 
 to do.  Today database systems do have lock manager which takes care of 
 these responsibilities.
 
 Coming to the question of failing transaction unpredictably, even with 
 current specification; transaction do fail.  For example, if there 
 exists a writer transaction which is already holding an exclusive lock, 
 this new thread would fail to acquire lock.  The failures would be 
 there.
 
 Now the next question people would ask is on how do we make sure that 
 partial changes are not causing problem in case of a failure in the 
 middle of sequence of operations.  That is the responsibility of 
 transaction manager.  Note that transaction manager treats the whole 
 sequence as a single atomic unit.

As I understand it, with what is specced now, if you try to get a write 
transaction lock, it will only fail if it times out, which would probably 
be a symptom of a more serious bug anyway. There's never going to be a 
forced rollback; once you have got a transaction lock, you are not going 
to ever have it fail on you unexpectedly.

I think this is an important invariant, because otherwise script writers 
_will_ shoot themselves in the foot. These aren't professional database 
developers; Web authors span the gamut of developer experience from the 
novice who is writing code more by luck than by knowledge all the way to 
the UI designer who wound up stuck with the task of writing the UI logic 
but has no professional background in programming, let alone concurrency in 
databases. We can't be firing unexpected exceptions when their users 
happen to open two tabs to the same application at the same time, leaving 
data unsaved.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'



RE: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Laxmi Narsimha Rao Oruganti
Let me probe this further to get clarity.

[Ian]   As I understand it, with what is specced now, if you try to get a 
write transaction lock, it will only fail if it times out, which would probably 
be a symptom of a more serious bug anyway. 
[Ian]   There's never going to be a forced rollback; once you have got a 
transaction lock, you are not going to ever have it fail on you unexpectedly.

My understanding of your requirement is that the database should allow only one 
active writer transaction.  How database systems achieve this need not be 
explained.  

Note that this need not be achieved only by acquiring an exclusive lock on the 
database file.  Think about a database implementation that is not based on a 
single file (the Log + Checkpoint design model), where there is one data file 
and a couple of log files.  Specifying that they have to hold an exclusive lock 
on the database file is ambiguous between the data file and the log files.  If 
you take BDB JE as an example, it doesn't even have a data file.  Its model is 
a sequence of log files.  

I have a question:
- Many database systems today don't ask the user to specify the purpose of use 
(read/write) when opening a transaction.  But the spec expects the purpose 
(read-only or read/write) to be specified for each transaction.  If we really 
need this to fly with multiple database systems, we should make that choice at 
connection opening time and not at transaction creation time.  A connection 
can typically be opened as read/write or read-only.  Nikunj or others should 
comment on whether making the read/write vs. read-only choice at connection 
time is practiced in their corresponding products.  


Thanks,
Laxmi

-Original Message-
From: Ian Hickson [mailto:i...@hixie.ch] 
Sent: Friday, July 24, 2009 2:07 PM
To: Laxmi Narsimha Rao Oruganti
Cc: Nikunj R. Mehta; public-webapps WG
Subject: RE: [WebStorage] Concerns on spec section 'Processing Model'

On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:

 That is all the responsibility of database system.  We don't need to 
 tell database systems on how to do it, we just need to tell them on 
 what to do.  Today database systems do have lock manager which takes 
 care of these responsibilities.
 
 Coming to the question of failing transaction unpredictably, even with 
 current specification; transaction do fail.  For example, if there 
 exists a writer transaction which is already holding an exclusive 
 lock, this new thread would fail to acquire lock.  The failures would 
 be there.
 
 Now the next question people would ask is on how do we make sure that 
 partial changes are not causing problem in case of a failure in the 
 middle of sequence of operations.  That is the responsibility of 
 transaction manager.  Note that transaction manager treats the whole 
 sequence as a single atomic unit.

As I understand it, with what is specced now, if you try to get a write 
transaction lock, it will only fail if it times out, which would probably be a 
symptom of a more serious bug anyway. There's never going to be a forced 
rollback; once you have got a transaction lock, you are not going to ever have 
it fail on you unexpectedly.

I think this is an important invariant, because otherwise script writers _will_ 
shoot themselves in the foot. These aren't professional database developers; 
Web authors span the gamut of developer experience from the novice who is 
writing code more by luck than by knowledge all the way to the UI designer who 
wound up stuck with the task for writing the UI logic but has no professional 
background in programing, let alone concurrency in databases. We can't be 
firing unexpected exceptions when their users happen to open two tabs to the 
same application at the same time, leaving data unsaved.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'




[selectors-api] Test Suite Progress

2009-07-24 Thread Lachlan Hunt

Hi,
  I've made some progress with the test suite.

I have now split the test suite up into 3 files, similar to how I 
previously described [1]:


1. Baseline Tests: HTML with CSS Level 2.1 Selectors.
2. Additional Tests: HTML with Selectors Level 3.
3. Additional Tests: XHTML+SVG with Selectors Level 3.

http://dev.w3.org/2006/webapi/selectors-api-testsuite/

The baseline tests in the first file are a subset of all the tests in 
the second.  To create it, I basically removed any test using a selector 
introduced in Selectors 3 without modifying other tests.


I've also begun to add tests for the namespace selector syntax [2] to 
the second set, but they are currently a work in progress and are not 
functioning properly.  If anyone can figure out what I've done wrong, 
please let me know.  Once these are written, they will also be 
incorporated into the third set.


The third set is an XHTML version of the tests which incorporates the 
tests that Erik had submitted [3].  I haven't verified that all these 
tests are functioning properly, I simply applied the patch from Erik 
without modification.


Finally, I still need to merge in the tests from Hixie [4].

If anyone has the time and motivation and would like to assist with 
incorporating/fixing these tests, please let me know.


[1] http://lists.w3.org/Archives/Public/public-webapps/2009AprJun/1221.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2009JanMar/0713.html
[3] http://lists.w3.org/Archives/Public/public-webapps/2009JanMar/0788.html
[4] http://www.hixie.ch/tests/adhoc/dom/selectors/001.html

--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/



Re: [selectors-api] Test Suite Progress

2009-07-24 Thread Boris Zbarsky

Lachlan Hunt wrote:
I've also begun to add tests for the namespace selector syntax [2] to 
the second set, but they are currently a work in progress and are not 
functioning properly.  If anyone can figure out what I've done wrong, 
please let me know.


I'm glad to try to figure that out, if you give me some idea of what 
"not functioning properly" means here...  As far as I can tell, Gecko 
and Webkit are both passing the namespace syntax tests you have here. 
Is the problem that they're passing when they shouldn't be?  Or 
something else?


-Boris



Re: [selectors-api] Test Suite Progress

2009-07-24 Thread Lachlan Hunt

Boris Zbarsky wrote:

Lachlan Hunt wrote:

I've also begun to add tests for the namespace selector syntax [2] to
the second set, but they are currently a work in progress and are not
functioning properly. If anyone can figure out what I've done wrong,
please let me know.


I'm glad to try to figure that out, if you give me some idea of what
not functioning properly means here... As far as I can tell, Gecko and
Webkit are both passing the namespace syntax tests you have here. Is the
problem that they're passing when they shouldn't be? Or something else?


Both Minefield and Webkit trunk are failing those tests for me.  I have 
all but one commented out just so it would make my debugging easier (see 
lines 287 to 292 in 002.html).


But for the one that's still uncommented, it's failing.  In the current 
version, the output is now:


FAIL Element.querySelectorAll: .any-namespace *|div, e.code = TypeError: 
Result of expression 'found[f].style' [null] is not an object.


I just checked in a new copy that outputs the exception message for 
debugging.  So based on that, the problem is that when it tries to check 
the style of the found element, it fails because it's dealing with 
elements in namespaces other than the HTML namespace, so they don't have 
a .style property.  I'm not sure what we can do to work around that easily.
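
One possible direction, sketched here purely as an illustration: guard the style check so that elements without a usable .style are reported as skipped instead of raising a TypeError. The helper name checkFoundStyle, the "pass"/"fail"/"skipped" results, and the example namespace URI are all assumptions made for this sketch; none of them come from the actual test suite, and plain objects stand in for DOM elements so the snippet runs outside a browser.

```javascript
// Hypothetical helper for a test harness; not taken from the actual suite.
// Returns "pass" or "fail" for elements that expose .style, and "skipped"
// for elements (for example, in a non-HTML namespace) that do not.
function checkFoundStyle(el, isExpected) {
  if (el.style == null) return "skipped";
  return isExpected(el.style) ? "pass" : "fail";
}

// Plain objects stand in for DOM elements here.
const htmlDiv = {
  namespaceURI: "http://www.w3.org/1999/xhtml",
  style: { color: "green" },
};
const foreignDiv = {
  namespaceURI: "http://www.example.org/ns", // made-up namespace
  style: null,
};
const isGreen = (style) => style.color === "green";

const htmlResult = checkFoundStyle(htmlDiv, isGreen);       // "pass"
const foreignResult = checkFoundStyle(foreignDiv, isGreen); // "skipped"
```

Whether skipping is acceptable depends on what the style check is meant to prove; if the check is the only evidence that the right element was found, some other marker would be needed for non-HTML elements.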


--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/



Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Aaron Boodman
On Fri, Jul 24, 2009 at 1:54 PM, Nikunj R. Mehta nikunj.me...@oracle.com wrote:
 Experience has shown that there is no easy way out when dealing with
 transactions, and locking at the whole database level is no solution to
 failures.

The thing that makes the web browser environment different and
interesting is that multiple independent applications can end up
having access to the same database if they run in the same origin. This
could be multiple instances of the same app (eg multiple gmail
windows) or just different apps that happen to be on the same origin
(many Google apps run on www.google.com).

Because these apps are isolated from each other, they have no way to
cooperate to reduce conflicts. They also have no control over whether
there are multiple copies of themselves (the user controls this).

Therefore if the platform does not protect against this, basically any
statement can fail due to conflict. This was a big problem with Gears,
and led to applications having to go to crazy contortions to do things
like master election.

When we designed the HTML5 version of the database API we specifically
tried to avoid it.

I do not agree that database-level locking is a big problem for web
applications. They are typically serving at most a handful of clients.
As long as you can specify read vs read/write transactions, I think
the current design is the correct trade-off in terms of complexity and
correctness.

- a



Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Aaron Boodman
On Fri, Jul 24, 2009 at 2:06 PM, Aaron Boodman a...@google.com wrote:
 I do not agree that database-level locking is a big problem for web
 applications.

Preemptive correction: I mean for the client-side of web applications.
There are usually at most a handful of clients accessing an HTML5
database instance.

- a



Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Aaron Boodman
On Fri, Jul 24, 2009 at 2:17 PM, Nikunj R. Mehta nikunj.me...@oracle.com wrote:

 On Jul 24, 2009, at 1:36 AM, Ian Hickson wrote:

 On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:

 That is all the responsibility of database system.  We don't need to
 tell database systems on how to do it, we just need to tell them on what
 to do.  Today database systems do have lock manager which takes care of
 these responsibilities.

 Coming to the question of failing transaction unpredictably, even with
 current specification; transaction do fail.  For example, if there
 exists a writer transaction which is already holding an exclusive lock,
 this new thread would fail to acquire lock.  The failures would be
 there.

 Now the next question people would ask is on how do we make sure that
 partial changes are not causing problem in case of a failure in the
 middle of sequence of operations.  That is the responsibility of
 transaction manager.  Note that transaction manager treats the whole
 sequence as a single atomic unit.

 As I understand it, with what is specced now, if you try to get a write
 transaction lock, it will only fail if it times out, which would probably
 be a symptom of a more serious bug anyway.

 Can you explain a more serious bug? The write lock may actually happen in
 the middle of a read-only transaction, can't it? I don't see spec text
 prohibiting that.

It's not clear to me what you're asking about in this paragraph. Write
transactions should be exclusive, read transactions can be shared.
Write transactions queue until all open read transactions complete.
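
A minimal sketch of that queueing behaviour, assuming a single database-wide lock and strict first-come, first-served ordering. The class DatabaseLock and its methods are hypothetical names for illustration only; the spec does not define them, and the FIFO policy is an assumption of this sketch.

```javascript
// Hypothetical model of a database-wide lock: shared readers, one writer.
class DatabaseLock {
  constructor() {
    this.activeReaders = 0;
    this.writerActive = false;
    this.queue = []; // pending requests in arrival order
  }

  // Queue a request; `callback` runs only once the lock has been granted.
  request(mode, callback) {
    this.queue.push({ mode, callback });
    this._drain();
  }

  // Called when a transaction commits or rolls back.
  release(mode) {
    if (mode === "readwrite") this.writerActive = false;
    else this.activeReaders -= 1;
    this._drain();
  }

  // Grant queued requests in FIFO order for as long as the state allows.
  _drain() {
    while (this.queue.length > 0) {
      const next = this.queue[0];
      if (this.writerActive) return; // a writer excludes everyone
      if (next.mode === "readwrite" && this.activeReaders > 0) return;
      this.queue.shift();
      if (next.mode === "readwrite") this.writerActive = true;
      else this.activeReaders += 1;
      next.callback();
    }
  }
}

const lock = new DatabaseLock();
const log = [];
lock.request("read", () => log.push("reader 1 granted"));
lock.request("readwrite", () => log.push("writer granted"));
lock.request("read", () => log.push("reader 2 granted"));
const afterRequests = log.slice(); // only reader 1 has been granted so far

lock.release("read");      // reader 1 finishes, so the writer is granted
lock.release("readwrite"); // the writer finishes, so reader 2 is granted
```

In this model a request never fails because of a conflict; it only waits. Reader 2 queues behind the writer so that a steady stream of readers cannot starve it, which is a design choice of the sketch and not something stated in the thread.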

I can't find the actual spec right now (where is its canonical home
now?) so I can't point at the exact text, but that is the goal I
believe.

 There's never going to be a
 forced rollback; once you have got a transaction lock, you are not going
 to ever have it fail on you unexpectedly.

 Even if you have a transaction lock,

 1. the application logic could cause an exception
 2. the application finds an unacceptable data condition and needs to
 rollback the transaction

In these cases, the developer caused the exception himself, and it
only affects his own application. So it's not unexpected in the same
sense locking exceptions are.

 3. face a disk failure

Yes, but there is nothing that the platform can do about this case. It
is an exception in the truest sense of the word.

 4. encounter a bug in the underlying software

The platform shouldn't have an API just in case underlying software has
bugs. Throwing an exception and unwinding the transaction is the most
sensible thing to do and what it does now.

 In either of these cases, how would the application code be expected to
 recover?

It can't, but these are exceptional circumstances that are generally
unexpected and indicate bigger problems. This is different from access
conflicts which are expected and do not indicate bigger problems.  The
platform can and should make this easier.

 These aren't professional database
 developers; Web authors span the gamut of developer experience from the
 novice who is writing code more by luck than by knowledge all the way to
 the UI designer who wound up stuck with the task for writing the UI logic
 but has no professional background in programing, let alone concurrency in
 databases.

 This is a strong reason to avoid SQL in the front-end.

I am also interested in a non-SQL storage API, but I think this is a
separate issue.

 We can't be firing unexpected exceptions when their users
 happen to open two tabs to the same application at the same time, leaving
 data unsaved.

 So you'd much rather tell an application user that they should close one of
 the two tabs since they can't obtain a read-write lock in both. I still
 don't understand how the exclusive database lock helps. Would you please
 elaborate?

I think you are either misunderstanding the spec or it has a bug,
because this is not the intent. Requests to obtain read/write
transactions are asynchronous and are queued until they can be
granted:

myDatabase.transaction(function(tx) {
  // this callback doesn't occur until the caller has exclusive access
});


- a



RE: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Ian Hickson
On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:

 Let me probe this further to get clarity.
 
  As I understand it, with what is specced now, if you try to get a 
  write transaction lock, it will only fail if it times out, which would 
  probably be a symptom of a more serious bug anyway. There's never 
  going to be a forced rollback; once you have got a transaction lock, 
  you are not going to ever have it fail on you unexpectedly.
 
 My understanding of your requirement is Database should allow only one 
 active writer transaction.  How the database systems achieve this need 
 not be explained.

Sure, so long as the implementation is black-box indistinguishable from 
what the spec says, it can do whatever it wants.


 Note that, this need not be achieved only by acquiring an exclusive lock 
 on the database file.  Think about a database implementation which is 
 not a single file based (Log + Checkpoint design model) where there is 
 one data file and a couple of log files.  Spec-ing that they have to 
 hold exclusive lock on database file is ambiguous between data file and 
 log file.  If you take BDB JE as an example, they don't even have data 
 file.  Their model is a sequence of log files.

The exclusive lock model described in the spec is just a model; it isn't 
intended to actually require an exclusive lock. If an implementation 
can get the same result using some other mechanism, that's fine.


On Fri, 24 Jul 2009, Nikunj R. Mehta wrote:
 
 Database developers (whether experienced DBAs or newcomer WebApp 
 programmers) identify the data set they are using through statements 
 they execute (within or outside transactions). It is the database's job 
 to find out which records are being used.

Sure.


 The concepts of transaction processing apply no matter the granularity 
 of a data item, whether it is a record or a disk block, or a whole file. 
 There are many kinds of failures (and yes, failures are always 
 unpredictable) [1]. Let's focus on failures arising from concurrency 
 control enforcement, which is probably the one most people worry about 
 from a programming perspective. In the following discussion, I use the 
 term locking , even though other protocols have been developed and are 
 in use, to guarantee serializability, i.e., correct interleaving of 
 concurrent transactions.
 
 A knowledgeable database programmer would read the smallest set of data 
 in a transaction so as to avoid locking the entire database for 
 concurrent operations. Moreover, this approach also minimizes 
 starvation, i.e., the amount of time a program would need to wait to 
 obtain permission to exclusively access data.
 
 Transactions can fail even if locking occurs at the whole database 
 level. As example, consider the situation:
 
 1. A read-only transaction is timed out because some read-write transaction
 went on for too long.
 2. A read-write transaction is timed out because some read-only transaction
 went on for too long.

These are the only failure modes possible currently, I believe.


 3. A read-only transaction includes inside it a read-write transaction. 

This isn't possible with the current asynchronous API as far as I can 
tell. With the synchronous API, it would hang trying to open the 
read-write transaction for however long it takes the UA to realise that 
the script that is trying to get the read-write transaction is the same 
one as the one that has an open read-only transaction, and then it would 
fail with error code 7.
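
The synchronous case can be modelled roughly as follows. This is a simplified sketch under stated assumptions: SyncDatabaseModel and its methods are invented names, the constant name TIMEOUT_ERR is assumed to correspond to the "error code 7" mentioned above, and because a single-threaded model cannot actually block, every lock that cannot be granted immediately is reported as that error instead of waiting.

```javascript
// Hypothetical synchronous model; class and method names are illustrative.
const TIMEOUT_ERR = 7; // assumed to be the "error code 7" referred to above

class SyncDatabaseModel {
  constructor() {
    this.readers = new Set(); // ids of scripts holding a shared read lock
    this.writer = null;       // id of the script holding the exclusive lock
  }

  beginRead(scriptId) {
    if (this.writer !== null) throw this._lockError("a writer is active");
    this.readers.add(scriptId);
  }

  beginWrite(scriptId) {
    // The caller's own read lock blocks the write lock, so waiting could
    // never succeed. Report it instead of hanging.
    if (this.readers.has(scriptId)) {
      throw this._lockError("caller already holds a read lock");
    }
    if (this.writer !== null || this.readers.size > 0) {
      throw this._lockError("database is busy");
    }
    this.writer = scriptId;
  }

  end(scriptId) {
    this.readers.delete(scriptId);
    if (this.writer === scriptId) this.writer = null;
  }

  _lockError(reason) {
    const err = new Error(reason);
    err.code = TIMEOUT_ERR;
    return err;
  }
}

const db = new SyncDatabaseModel();
db.beginRead("script-1");
let nestedCode = null;
try {
  db.beginWrite("script-1"); // read-write nested inside read-only
} catch (err) {
  nestedCode = err.code;
}
db.end("script-1");
db.beginWrite("script-1"); // succeeds once the read transaction has ended
```

The point of the sketch is only that the nested request fails deterministically every time, which is what makes it straightforward to reproduce and debug.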


 Experience has shown that there is no easy way out when dealing with 
 transactions, and locking at the whole database level is no solution to 
 failures.

It's not supposed to be a solution to failures, it's supposed to be, and 
is, as far as I can tell, a way to make unpredictable, transient, 
intermittent, and hard-to-debug concurrency errors into guaranteed, 
easy-to-debug errors.


On Fri, 24 Jul 2009, Nikunj R. Mehta wrote:
 
  There's never going to be a forced rollback; once you have got a 
  transaction lock, you are not going to ever have it fail on you 
  unexpectedly.
 
 Even if you have a transaction lock,
 
 1. the application logic could cause an exception
 2. the application finds an unacceptable data condition and needs to rollback
 the transaction

Sure, but both of those are under the control of the author.


 3. face a disk failure

This is an exceptional situation from which there is no good recovery. It 
isn't an expected situation resulting from a complicated API.


 4. encounter a bug in the underlying software

We can't do anything to prevent these in the spec.


 In either of these cases, how would the application code be expected to 
 recover?

In the first two and the last one, the author can debug the problem and 
fix or work around the bug. In the case of hardware failure, there is no 
sane recovery model.

These are very different from concurrency bugs.


  I think this is an important invariant, because otherwise script 
  

Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Nikunj R. Mehta


On Jul 24, 2009, at 2:53 PM, Ian Hickson wrote:


These are very different from concurrency bugs.



There are only three concurrency bugs

1. The Lost Update Problem
2. The Temporary Update (or Dirty Read) Problem
3. The Incorrect Summary Problem.

None of these is related to the granularity of locking. All of  
these are solved through the use of transactions.


If an application uses transactions correctly, then it is free from  
concurrency bugs.
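
As an illustration of the first item in that list, here is a minimal sketch of a lost update, with the interleaving written out by hand. The store object and variable names are made up for the example; no real storage API is involved.

```javascript
// Two clients each add 5 to the same value using read-modify-write steps.
const store = { balance: 10 };

// Interleaved without isolation: both read before either writes.
const readByA = store.balance; // client A reads 10
const readByB = store.balance; // client B reads 10
store.balance = readByA + 5;   // client A writes 15
store.balance = readByB + 5;   // client B writes 15, overwriting A's update
const lostUpdateResult = store.balance; // 15: one of the two updates is lost

// Serialized: each read-modify-write runs as one indivisible unit.
store.balance = 10;
function addFive(s) {
  const current = s.balance;
  s.balance = current + 5;
}
addFive(store); // client A
addFive(store); // client B
const serializedResult = store.balance; // 20: both updates are kept
```

The second half stands in for what a transaction provides: the read and the write of each client are not interleaved with those of the other.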


Nikunj
http://o-micron.blogspot.com






Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Ian Hickson
On Fri, 24 Jul 2009, Nikunj R. Mehta wrote:
 On Jul 24, 2009, at 2:53 PM, Ian Hickson wrote:
  
  These are very different from concurrency bugs.
 
 There are only three concurrency bugs
 
 1. The Lost Update Problem
 2. The Temporary Update (or Dirty Read) Problem
 3. The Incorrect Summary Problem.
 
 Neither of these is related to the granularity of locking. All of these 
 are solved through the use of transactions.
 
 If an application uses transactions correctly, then it is free from 
 concurrency bugs.

If you have two applications in two tabs, and they both need to read row 
A, then write to row B, and they start doing these two tasks 
simultaneously, how do you prevent either from failing if you don't have 
database-wide locking?

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'



Re: [WebStorage] Solution proposed (was: Concerns on spec section 'Processing Model')

2009-07-24 Thread Nikunj R. Mehta
If you want to provide an application programmer with a limited degree  
of freedom from a certain class of errors, then there is a different  
solution. It is called isolation level [1]. When opening a  
transaction, just provide the required isolation level. Heck, if you'd  
like, make SERIALIZABLE the default value.


But don't disallow other possibilities or create the illusion of  
silver bullets.


On Jul 24, 2009, at 2:53 PM, Ian Hickson wrote:


On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:


Let me probe this further to get clarity.


As I understand it, with what is specced now, if you try to get a
write transaction lock, it will only fail if it times out, which would
probably be a symptom of a more serious bug anyway. There's never
going to be a forced rollback; once you have got a transaction lock,
you are not going to ever have it fail on you unexpectedly.


My understanding of your requirement is Database should allow only one
active writer transaction.  How the database systems achieve this need
not be explained.


Sure, so long as the implementation is black-box indistinguishable from
what the spec says, it can do whatever it wants.


Note that, this need not be achieved only by acquiring an exclusive lock
on the database file.  Think about a database implementation which is
not a single file based (Log + Checkpoint design model) where there is
one data file and a couple of log files.  Spec-ing that they have to
hold exclusive lock on database file is ambiguous between data file and
log file.  If you take BDB JE as an example, they don't even have data
file.  Their model is a sequence of log files.


The exclusive lock model described in the spec is just a model, it isn't
intended to be actually require an exclusive lock. If an implementation
can get the same result using some other mechanism, that's fine.


The spec says:
[[
If the mode is read/write, the transaction must have an exclusive
write lock over the entire database
]]

Therefore, correct me if I am wrong, but the spec prohibits the  
following:


If an implementation of the Database object allows a transaction to  
write to a database while another transaction has a write lock on the  
same database, it is a failure.


If so, then I want to formally object to that spec text because it is  
overly restrictive on implementers as well as on application  
programmers.




[snip]

3. A read-only transaction includes inside it a read-write transaction.


This isn't possible with the current asynchronous API as far as I can
tell. With the synchronous API, it would hang trying to open the
read-write transaction for however long it takes the UA to realise that
the script that is trying to get the read-write transaction is the same
one as the one that has an open read-only transaction, and then it would
fail with error code 7.


Then again the spec is too restrictive because application programmers  
need the ability to upgrade their lock from read-only to read-write  
and an application should never deadlock itself. We would have failed  
the same dumb programmer if we didn't allow this.


Therefore, I formally object to the spec disallowing an application to  
upgrade its database lock.






Experience has shown that there is no easy way out when dealing with
transactions, and locking at the whole database level is no solution to
failures.


It's not supposed to be a solution to failures, it's supposed to be, and
is, as far as I can tell, a way to make unpredictable, transient,
intermittent, and hard-to-debug concurrency errors into guaranteed,
easy-to-debug errors.


How is a timeout an easy-to-debug error? What is the meaning of a  
guaranteed error? How is a guaranteed error better than its opposite?  
Do you have any facts to back this up? If not, I would like to avoid  
using that judgement.



I think this is an important invariant, because otherwise script
writers _will_ shoot themselves in the foot.


Even if the transaction lock doesn't fail, how would one deal with other
transaction failures?


I don't understand the relevance. If there's a hardware error, retrying
isn't going to help. If there's a concurrency error, the only solution
will be to design complex locking semantics outside the API, which would
be a terrible burden to place on Web authors.


As I explained in my simple example of updating a spreadsheet cell,  
users cannot avoid complex semantics when dealing with concurrency and  
sharing in the face of consistency needs. It is an end-to-end  
reliability requirement (in the same sense as that used by Saltzer,  
Reed and Clark), and unavoidable for all but the unreliable systems.






These aren't professional database developers; Web authors span the
gamut of developer experience from the novice who is writing code more
by luck than by knowledge all the way to the UI designer who wound up
stuck with the task for writing the UI logic 

Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Aaron Boodman
On Fri, Jul 24, 2009 at 2:54 PM, Nikunj R. Mehta nikunj.me...@oracle.com wrote:

 On Jul 24, 2009, at 2:06 PM, Aaron Boodman wrote:

 On Fri, Jul 24, 2009 at 1:54 PM, Nikunj R. Mehta nikunj.me...@oracle.com
 wrote:

 Experience has shown that there is no easy way out when dealing with
 transactions, and locking at the whole database level is no solution to
 failures.

 The thing that makes the web browser environment different an
 interesting is that multiple independent applications can end up
 having access to the same database if the run in the same origin.

 Applications have the ability to specify which database they want to use. So
 I don't see problems in apps sharing an origin.

Right, but say two Gmail tabs are opened independently. They both say
they want to access the messages database. They have no way to know
about each other except through shared storage (postMessage does not
work across multiple independent tabs).

Now they can conflict with each other. There is no way for the
developer to deal with this problem other than retrying or
implementing another concurrency system on top of shared storage. This
seems bad.
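
The "retrying" option mentioned above can be sketched roughly as follows. This is only an illustration: `withRetry` and `flakyTransaction` are hypothetical names, nothing here is part of the spec, and the loop is written synchronously for brevity even though the Database interface under discussion is asynchronous.

```javascript
// Hypothetical helper, not part of any spec: retry a unit of work that
// may throw when another tab holds a conflicting lock.
function withRetry(attempt, maxTries) {
  let lastError;
  for (let i = 1; i <= maxTries; i++) {
    try {
      return attempt(i); // success: hand back the result
    } catch (err) {
      lastError = err;   // remember the failure and try again
    }
  }
  throw lastError;       // every try failed
}

// Simulated transaction that fails twice, as if another tab held the lock.
let calls = 0;
function flakyTransaction() {
  calls += 1;
  if (calls < 3) throw new Error("database is locked");
  return "committed";
}

const result = withRetry(flakyTransaction, 5);
console.log(result, "after", calls, "tries");
```

Every caller has to wrap every transaction this way, which is the burden on authors that the paragraph above objects to.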

 This could be multiple instances of the same app (eg multiple gmail
 windows) or just different apps that happen to be on the same origin
 (many Google apps run on www.google.com).

 When running multiple instances of the same application, or when different
 applications share the same data, you are beginning to deal with multi-user
 applications (even though it may be the same security principal). In
 multi-user applications, database transactions are the same as what they are
 on the server. Applications have no choice but to be careful in performing
 transactions. Let me illustrate this with an example.

 Say that I had a spreadsheet app. The value of a cell was displayed to the
 user as X. Now, I go in to one tab A and say add five to X. I also go in
 to B and say add five to X. One of those operations will have to fail
 because it finds that the version of X is not what it was when the
 transaction started out. Even if you put a lock on the entire database, you
 can't avoid that problem.

The issue of the data changing between the time when it was displayed
to the user and the time when an update is started is different than
the problem of the data changing while a multi-step update (a
transaction) is in progress.
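
The first of these two problems (data changing between display and update, as in the spreadsheet example) is usually guarded against with a version check. A minimal sketch, assuming a toy in-memory store; `store`, `read` and `writeIfUnchanged` are made-up names and not any proposed API:

```javascript
// Toy in-memory store; each cell carries a version number.
// All names here are hypothetical and only illustrate the idea.
const store = { X: { value: 10, version: 1 } };

// A tab reads the cell and remembers which version it displayed.
function read(key) {
  const cell = store[key];
  return { value: cell.value, version: cell.version };
}

// The write succeeds only if nobody changed the cell since `seenVersion`.
function writeIfUnchanged(key, newValue, seenVersion) {
  const cell = store[key];
  if (cell.version !== seenVersion) return false; // stale: caller must re-read
  cell.value = newValue;
  cell.version += 1;
  return true;
}

// Tabs A and B both display X, then both try to add five to it.
const tabA = read("X");
const tabB = read("X");
const okA = writeIfUnchanged("X", tabA.value + 5, tabA.version);
const okB = writeIfUnchanged("X", tabB.value + 5, tabB.version);
console.log(okA, okB, store.X.value);
```

Tab B's write is rejected because the version moved on after tab A's update. No locking granularity changes that outcome, since both reads happened before either update began.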

The first problem is well known and understood by client-side web
developers because the web is stateless and the same can occur between
contacts with the server. It's also pretty self-evident that if you
copy data out of storage and into the UI that the two can change
independently.

The second problem is not well known by the same people and would be
surprising. Up until recently there was no local storage except
cookies and some proprietary things, and both were synchronous.
Because all browsers until recently were single-threaded, this
effectively meant that clients had storage-wide locks (there were
actually bugs with this in Firefox+cookies, I am told, but cookies
were not frequently used in a way that exposed it).

 It seems that the way the spec is written, novice programmers would be led
 to either

 1. face lost updates because they assume the browser locks the entire
 database, and so they won't bother to do their own analysis of whether data
 has changed since the last time they saw it.

Some very novice users will not realize that their UI and local store
can change independently, or will, but won't realize that there can be
multiple copies of their apps.

I think the current design is a good trade-off because addressing that
problem would essentially mean binding the UI to the datastore, which
would introduce gigantic API complexity making it basically not
workable. And many developers will understand the problem from
experience with the web.

 2. create single-instance-only apps, i.e., hold a write lock on the
 database forever since they don't want to deal with version checks.

I don't think you understand the spec - it isn't actually possible to
hold the lock forever. Locks aren't an explicit part of the API, but are
implicit and released automatically when functions return.

Take a look at the transaction method again:

db.transaction(function() {
  tx.executeSql(strSql, function() {

  });
});

The transaction is implicitly released when the last sql statement is
completed (or fails). The only way you can keep this transaction open
is to execute more SQL.

 Because these apps are isolated from each other, they have no way to
 cooperate to reduce conflicts. They also have no control over whether
 there are multiple copies of themselves (the user controls this).

 Sorry, but there is postMessage, localStorage, and the database itself. What
 do you mean these apps are isolated and have no way to cooperate?

postMessage can't be used across independent tabs. Even if it could,
it is asynchronous and most vendors would be reluctant to put more in
the spec 

Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Nikunj R. Mehta

On Jul 24, 2009, at 3:11 PM, Ian Hickson wrote:


On Fri, 24 Jul 2009, Nikunj R. Mehta wrote:

On Jul 24, 2009, at 2:53 PM, Ian Hickson wrote:


These are very different from concurrency bugs.


There are only three concurrency bugs

1. The Lost Update Problem
2. The Temporary Update (or Dirty Read) Problem
3. The Incorrect Summary Problem.

None of these is related to the granularity of locking. All of these
are solved through the use of transactions.

If an application uses transactions correctly, then it is free from
concurrency bugs.


If you have two applications in two tabs, and they both need to read row
A, then write to row B, and they start doing these two tasks
simultaneously, how do you prevent either from failing if you don't have
database-wide locking?



First of all, the value of row A never changes in this example. So it  
is immaterial whether the transaction locked the whole database or  
just row B. Your application that wrote this kind of a query/update  
has a concurrency bug, namely lost update. IOW, it is losing the first  
update because it did not check the value of B before modifying it and  
didn't modify row A when it modified row B.


Therefore, your question itself has a concurrency bug. This is why I  
said that locking is not a silver bullet and multi-user concurrency  
should not be taken lightly. For a primer on isolation levels,  
transactions, and locks, please see [1].


This discussion is an indicator of both the complexity involved in  
designing standards such as these and the amount of background  
knowledge required to design a good standard. Proponents of existing  
spec language have chosen to never explicitly back up their statements  
with the body of knowledge that exists in the database sciences. Taken  
together, all this makes it unlikely that a good SQL standard can be  
developed by this WG in a short period of time that some might be  
expecting.


Nikunj
http://o-micron.blogspot.com

[1] http://www.oracle.com/technology/oramag/oracle/05-nov/o65asktom.html




Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Nikunj R. Mehta


On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote:


So you are
reduced to very awkward ways of cooperating -- using the database
itself as a queue or for master election, or designing a separate
transaction system between tabs which might be on separate threads,
using an asynchronous API. Or you just accept that any statement can
fail and retry everything. Or your app is just buggy if multiple
instances are open.



Did you consider for a moment that all this is merely a result of the  
SQLite feature to lock the entire database?


Nikunj
http://o-micron.blogspot.com






Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Aaron Boodman
On Fri, Jul 24, 2009 at 4:12 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:

 On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote:

 2. create single-instance-only apps, i.e., hold a write lock on the
 database forever since they don't want to deal with version checks.

 I don't think you understand the spec - it isn't actually possible to
 hold the lock forever.

 It is a little insulting for you to say that, but I will not take offense to
 it.

I didn't mean any offense, I really don't think you understand the
spec completely :).

 Locks aren't an explicit part of the API, but are
 implicit and released automatically when functions return.

 Take a look at the transaction method again:

 db.transaction(function(tx) {
  tx.executeSql(strSql, function() {

  });
 });

 The transaction is implicitly released when the last sql statement is
 completed (or fails). The only way you can keep this transaction open
 is to execute more SQL.


 (corrected a slight typo in the example - it was missing the parameter
 definition for tx)

Thanks for the correction. Code is for conversational purposes only. I
also may be forgetting some API details since I haven't looked at this
in a while.

 If I put in a timer or another asynchronous call inside the block and that
 block used the variable tx, wouldn't it force the implementation to continue
 holding the database lock? If so, there is no limit to how long I can hold
 on to variables, and hence I could hold on to the database as an exclusive
 reader/writer for as long as I wanted to. A novice programmer would probably
 not even understand what a transaction means, except that they need a
 handle to change stuff. That programmer could hold on to this handle for
 the duration of the session.

No. The transaction is not closed on GC, it is closed when the last
statement that is part of the transaction completes. So holding a
reference to the tx variable does nothing one way or the other. The
only way to hang the transaction open would be to execute statements
over and over.
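
The behaviour described here can be modelled with a small toy. This is only a sketch of the description in this message, not the spec's processing model or any browser's implementation: `ToyTransaction` and `runTransaction` are invented names, execution is synchronous, and throwing on late use of the handle is an assumption made for the illustration.

```javascript
// Toy model only: it mimics the behaviour described in the thread.
class ToyTransaction {
  constructor() {
    this.queue = [];     // statements waiting to run
    this.open = true;    // false once the queue has drained
    this.executed = [];  // record of statements that "ran"
  }
  executeSql(sql, callback) {
    // Assumption: a closed transaction rejects further statements.
    if (!this.open) throw new Error("transaction already closed");
    this.queue.push({ sql, callback });
  }
}

function runTransaction(body) {
  const tx = new ToyTransaction();
  body(tx); // the transaction callback queues the first statements
  while (tx.queue.length > 0) {
    const { sql, callback } = tx.queue.shift();
    tx.executed.push(sql);       // stand-in for running the statement
    if (callback) callback(tx);  // a statement callback may queue more
  }
  tx.open = false;               // queue drained: implicit commit
  return tx;
}

let leaked; // keep a reference to tx, as a timer callback would
const tx = runTransaction((t) => {
  leaked = t;
  t.executeSql("UPDATE a", (t2) => t2.executeSql("UPDATE b"));
});

let lateError = null;
try {
  leaked.executeSql("UPDATE c"); // too late: the queue already drained
} catch (err) {
  lateError = err.message;
}
console.log(tx.executed, tx.open, lateError);
```

In this model the nested statement keeps the transaction open for one more step, while the retained reference does not: the commit is driven by the queue emptying, not by the lifetime of the `tx` variable.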


On Fri, Jul 24, 2009 at 4:13 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:

 On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote:

 So you are
 reduced to very awkward ways of cooperating -- using the database
 itself as a queue or for master election, or designing a separate
 transaction system between tabs which might be on separate threads,
 using an asynchronous API. Or you just accept that any statement can
 fail and retry everything. Or your app is just buggy if multiple
 instances are open.

 Did you consider for a moment that all this is merely a result of the SQLite
 feature to lock the entire database?

No, having the database not be able to change out from under a
multi-step update was a design goal of the API. Implementing a complex
application without exclusive transactions would be very difficult.
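
To make the design goal concrete, here is a rough simulation of a two-step update run with and without exclusive access. It is only a sketch: the generator-based scheduler and the names `addFive`, `runInterleaved` and `runExclusive` are invented for illustration and say nothing about how a real user agent schedules transactions.

```javascript
// Each "transaction" is a generator that yields between its steps so a
// scheduler can interleave it with others. Purely illustrative names.
function* addFive(db) {
  const seen = db.x;   // step 1: read
  yield;               // another transaction may run here
  db.x = seen + 5;     // step 2: write based on the earlier read
}

// No isolation: advance every running transaction one step per round.
function runInterleaved(db, txns) {
  const running = txns.map((t) => t(db));
  while (running.length > 0) {
    for (let i = running.length - 1; i >= 0; i--) {
      if (running[i].next().done) running.splice(i, 1);
    }
  }
  return db.x;
}

// Exclusive database-wide lock: each transaction runs to completion.
function runExclusive(db, txns) {
  for (const t of txns) {
    for (const _ of t(db)) { /* drain all steps before the next starts */ }
  }
  return db.x;
}

const interleaved = runInterleaved({ x: 0 }, [addFive, addFive]);
const exclusive = runExclusive({ x: 0 }, [addFive, addFive]);
console.log(interleaved, exclusive);
```

With interleaving both transactions read 0 and one update is lost (the result is 5); with exclusive execution the second transaction sees the first one's write (the result is 10).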

I do understand your position that there is a tradeoff: you give
up some performance because a skilled developer could do finer grained
locking and get better concurrency. And I think you are arguing that
there should be an option for non-exclusive write transactions, or at
least it should be up to the UA.

I still feel that with the number of clients a typical HTML5 database
will have, this is a non-issue and the spec makes the correct
tradeoff.

- a



Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Nikunj R. Mehta


On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote:


I do not agree that database-level locking is a big problem for web
applications.


Our problem is not with databases doing database-level locking. Our problem
is that such behavior is a MUST.


I think it is very desirable for it to appear to the developer that
writes to the local datastore are atomic. Lots of complexity falls out
if this is not true.


It is implicit that transactions give atomicity (that's what A in ACID  
stands for). It would be mischaracterizing this discussion to say that  
we are arguing about atomicity. We are, however, talking about  
isolation (the I in ACID), or more precisely the degree of isolation.



In some models (non-SQL) it may be easier to
arrange a large update in the application layer and commit it all at
once. In SQL, this is less true so it is important to provide API that
makes conflicts impossible while a multi-step update is in progress.


This problem exists in the WebStorage model [1]. More specifically,  
there is no way to perform multiple updates atomically in it.


The proposal that I have sketched about B-trees [2] does not have this  
problem since it is possible to work with transactions to get the  
atomicity as well as a desired isolation level. I take it that there  
are no issues with that proposal since I have not heard anyone say so.




Perhaps your real issue is that the current API does not work well for
non SQL data stores.


Not at all! It would be disingenuous to find an ulterior motive in my  
arguments.


Nikunj
http://o-micron.blogspot.com

[1] http://dev.w3.org/html5/webstorage/
[2] http://www.w3.org/mid/f480f60a-5dae-4b73-922a-93ed401cf...@oracle.com



Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Aaron Boodman
On Fri, Jul 24, 2009 at 4:30 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:

 On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote:

 In some models (non-SQL) it may be easier to
 arrange a large update in the application layer and commit it all at
 once. In SQL, this is less true so it is important to provide API that
 makes conflicts impossible while a multi-step update is in progress.

 This problem exists in the WebStorage model [1]. More specifically, there is
 no way to perform multiple updates atomically in it.

I agree.

 The proposal that I have sketched about B-trees [2] does not have this
 problem since it is possible to work with transactions to get the atomicity
 as well as a desired isolation level. I take it that there are no issues
 with that proposal since I have not heard anyone say so.

I haven't reviewed that. I only chimed into this conversation because
it looked like there were some misunderstandings and I worked on it
early-on.

- a



Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Nikunj R. Mehta


On Jul 24, 2009, at 4:25 PM, Aaron Boodman wrote:

On Fri, Jul 24, 2009 at 4:12 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:


On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote:

2. create single-instance-only apps, i.e., hold a write lock on the
database forever since they don't want to deal with version checks.


I don't think you understand the spec - it isn't actually possible to
hold the lock forever.


It is a little insulting for you to say that, but I will not take offense to
it.


I didn't mean any offense, I really don't think you understand the
spec completely :).


I beg to differ. Au contraire, I really don't think you understand  
databases at all.





Locks aren't an explicit part of the API, but are
implicit and released automatically when functions return.


This is completely incorrect. Read below for more details.



Take a look at the transaction method again:

db.transaction(function(tx) {
 tx.executeSql(strSql, function() {

 });
});

The transaction is implicitly released when the last sql statement is
completed (or fails). The only way you can keep this transaction open
is to execute more SQL.



(corrected a slight typo in the example - it was missing the parameter
definition for tx)


Thanks for the correction. Code is for conversational purposes only. I
also may be forgetting some API details since I haven't looked at this
in a while.

If I put in a timer or another asynchronous call inside the block and that
block used the variable tx, wouldn't it force the implementation to continue
holding the database lock? If so, there is no limit to how long I can hold
on to variables, and hence I could hold on to the database as an exclusive
reader/writer for as long as I wanted to. A novice programmer would probably
not even understand what a transaction means, except that they need a
handle to change stuff. That programmer could hold on to this handle for
the duration of the session.


No. The transaction is not closed on GC, it is closed when the last
statement that is part of the transaction completes. So holding a
reference to the tx variable does nothing one way or the other. The
only way to hang the transaction open would be to execute statements
over and over.


A transaction is not complete until I either commit or rollback the  
transaction, which I can choose to do as late as I want to, e.g., at  
window.onclose. Therefore locks on the database will not be released  
for as long as the application wants to hold on to the transaction.





On Fri, Jul 24, 2009 at 4:13 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:


On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote:


So you are
reduced to very awkward ways of cooperating -- using the database
itself as a queue or for master election, or designing a separate
transaction system between tabs which might be on separate threads,
using an asynchronous API. Or you just accept that any statement can
fail and retry everything. Or your app is just buggy if multiple
instances are open.


Did you consider for a moment that all this is merely a result of the SQLite
feature to lock the entire database?


No, having the database not be able to change out from under a
multi-step update was a design goal of the API. Implementing a complex
application without exclusive transactions would be very difficult.


I am not proposing to take away your choice. But please don't take  
away mine.


It would be useful to see an explanation as to why the proposal I made
[[
add an isolation level parameter with a default value of SERIALIZABLE  
and remove the exclusive database-level write lock requirement

]]

is worse than the current spec text. You can refer to SQL92 to explain
the meaning of SERIALIZABLE. AFAIK, there are no interoperability  
problems with transaction isolation levels.


Nikunj
http://o-micron.blogspot.com






Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Aaron Boodman
On Fri, Jul 24, 2009 at 4:45 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:
 No. The transaction is not closed on GC, it is closed when the last
 statement that is part of the transaction completes. So holding a
 reference to the tx variable does nothing one way or the other. The
 only way to hang the transaction open would be to execute statements
 over and over.

 A transaction is not complete until I either commit or rollback the
 transaction, which I can choose to do as late as I want to, e.g., at
 window.onclose. Therefore locks on the database will not be released for as
 long as the application wants to hold on to the transaction.

I don't think that this is true, at least in the Database interface:
http://dev.w3.org/html5/webdatabase/#asynchronous-database-api

There is no explicit commit() method. The commit happens implicitly
after all queued statements have been executed successfully.

It does appear that it is possible to hold a transaction open all day
with the DatabaseSync interface
(http://dev.w3.org/html5/webdatabase/#databasesync). Specifically the
SQLTransactionSync interface has commit/rollback methods. The
DatabaseSync interface was added after I worked on this, so I can't
say why it doesn't use callbacks.

In any case, I was talking about the async flavor which is what my
example code referred to. Do you agree it is not possible to hang
transactions open from Database
(http://dev.w3.org/html5/webdatabase/#database)? If not, what am I
missing?

- a



Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Nikunj R. Mehta


On Jul 24, 2009, at 4:58 PM, Aaron Boodman wrote:

On Fri, Jul 24, 2009 at 4:45 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:

No. The transaction is not closed on GC, it is closed when the last
statement that is part of the transaction completes. So holding a
reference to the tx variable does nothing one way or the other. The
only way to hang the transaction open would be to execute statements
over and over.


A transaction is not complete until I either commit or rollback the
transaction, which I can choose to do as late as I want to, e.g., at
window.onclose. Therefore locks on the database will not be released for as
long as the application wants to hold on to the transaction.


I don't think that this is true, at least in the Database interface:
http://dev.w3.org/html5/webdatabase/#asynchronous-database-api

There is no explicit commit() method. The commit happens implicitly
after all queued statements have been executed successfully.


The spec is also silent about what happens if I put a wait by making  
another asynchronous call inside my transaction callback logic. By  
inference, this would be allowed since all statements are executed  
inside callbacks, so why distinguish between transaction and other  
(non-SQLTransactionErrorCallback) types of callbacks.


The processing model in 4.3.2 simply says that the SQL statements are  
queued up. It is unclear what if anything happens if the database runs  
out of statements to execute if the transaction logic takes time to  
add another statement to the queue before the database decides to  
commit. Am I wrong or is this an ambiguous, but correct interpretation?


Those who are worried about throwing complexity of transaction  
recovery on Web programmers should perhaps also be worried about the  
insane complexity of asynchronous transaction programming, that no one  
in the world should have to learn. The mainstream database developers  
don't have to deal with that. Why should poor Web programmers have to  
suffer this?


Moreover, with an asynchronous database the spec doesn't allow an  
application to rollback a transaction, should certain application  
logic require that. This is yet another case of creating a storage API  
that differs from what traditional database developers are used to.


There seems to be a pattern of ignoring good API practices when  
interacting with a database and it appears intentional. Am I wrong in  
my interpretation?




It does appear that it is possible to hold a transaction open all day
with the DatabaseSync interface
(http://dev.w3.org/html5/webdatabase/#databasesync). Specifically the
SQLTransactionSync method has commit/rollback methods. The
DatabaseSync interface was added after I worked on this, so I can't
say why it doesn't use callbacks.

In any case, I was talking about the async flavor which is what my
example code referred to. Do you agree it is not possible to hang
transactions open from Database
(http://dev.w3.org/html5/webdatabase/#database)? If not, what am I
missing?


I can't agree simply because the spec says nothing about it. In fact,  
if anything the rest of the spec text around asynchronous processing  
suggests that it is possible to hang transactions indefinitely.


Nikunj
http://o-micron.blogspot.com






Re: [WebStorage] Concerns on spec section 'Processing Model'

2009-07-24 Thread Aaron Boodman
On Fri, Jul 24, 2009 at 4:45 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:
 I am not proposing to take away your choice. But please don't take away
 mine.

 It would be useful to see an explanation as to why the proposal I made
 [[
 add an isolation level parameter with a default value of SERIALIZABLE and
 remove the exclusive database-level write lock requirement
 ]]

 is worse than the current spec text. You can refer to SQL92 to explain the
 meaning of SERIALIZABLE. AFAIK, there are no interoperability problems with
 transaction isolation levels.

I'm personally not opposed to adding more isolation levels in addition
to the current single option of SERIALIZABLE. It could be added as an
argument to transaction().
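
As a rough idea of what such an argument might look like: the sketch below is entirely hypothetical. Neither an `isolation` option nor `resolveIsolation` exists in the Web Database draft; the level names are the four defined by SQL-92, and the default mirrors the proposal quoted above.

```javascript
// Hypothetical sketch: neither this option nor these names exist in the
// Web Database draft. Level names are the four defined by SQL-92.
const ISOLATION_LEVELS = [
  "READ UNCOMMITTED",
  "READ COMMITTED",
  "REPEATABLE READ",
  "SERIALIZABLE",
];

// Resolve the level a transaction() call would run at. The default keeps
// the current behaviour for callers that pass nothing.
function resolveIsolation(options = {}) {
  const level =
    options.isolation === undefined ? "SERIALIZABLE" : options.isolation;
  if (!ISOLATION_LEVELS.includes(level)) {
    throw new RangeError("unknown isolation level: " + level);
  }
  return level;
}

const defaultLevel = resolveIsolation();
const relaxedLevel = resolveIsolation({ isolation: "READ COMMITTED" });
console.log(defaultLevel, relaxedLevel);
```

Existing callers would be unaffected under this shape, since omitting the argument still yields SERIALIZABLE.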

I don't think it is a particularly high value feature, but I also
don't see a big problem with it. And I can imagine that some
particularly ambitious developers might want to take advantage of it.

- a



[WebDatabase] Database interface (vs. DatabaseSync interface)

2009-07-24 Thread Nikunj R. Mehta
It appears that Database, SQLTransactionCallback,  
SQLTransactionErrorCallback, SQLVoidCallback, SQLTransaction,  
SQLStatementCallback, and SQLStatementErrorCallback interfaces can all  
be eliminated from WebDatabase completely.


Given WebWorkers and DatabaseSync, why do we need the Database object  
at all? Are there use cases that cannot be satisfied by the  
combination of the two?


There is a brand new programming model being promoted by the Database  
object, it is as complex as it gets and seriously I cannot get it. I  
don't know about you, but I doubt it will work for an average Web page  
author. Given its programming model, Oracle is not supportive of the  
asynchronous Database and related interfaces.


Nikunj
http://o-micron.blogspot.com






Re: [selectors-api] Test Suite Progress

2009-07-24 Thread Boris Zbarsky

Lachlan Hunt wrote:
Both Minefield and Webkit trunk are failing those tests for me.  I have 
all but one commented out just so it would make my debugging easier (see 
lines 287 to 292 in 002.html).


Oh, I'd just missed the one failing test.  Showing only failing tests 
helped!


I just checked in a new copy that outputs the exception message for 
debugging.  So based on that, the problem is that when it tries to check 
the style of the found element, it fails because it's dealing with 
elements in namespaces other than the HTML namespace, so they don't have 
a .style property.


Yep.


 I'm not sure what we can do to work around that easily.


Hmm.  For the any-namespace test you could use a non-HTML node that 
happens to have .style (like SVG, say).


For the no-namespace test, you'd want the harness to be somewhat 
different here...  Maybe ask John Resig if he has any ideas, since he 
presumably knows this code pretty well?


-Boris