RE: [WebStorage] Concerns on spec section 'Processing Model'
That is all the responsibility of the database system. We don't need to tell database systems how to do it; we just need to tell them what to do. Database systems today have a lock manager that takes care of these responsibilities.

On the question of transactions failing unpredictably: even with the current specification, transactions do fail. For example, if a writer transaction is already holding an exclusive lock, a new thread would fail to acquire the lock. The failures would still be there.

The next question people would ask is how we make sure that partial changes don't cause problems when a failure occurs in the middle of a sequence of operations. That is the responsibility of the transaction manager. Note that the transaction manager treats the whole sequence as a single atomic unit.

Are we missing something?

Thanks,
Laxmi

-----Original Message-----
From: Ian Hickson [mailto:i...@hixie.ch]
Sent: Friday, July 24, 2009 6:55 AM
To: Nikunj R. Mehta
Cc: public-webapps WG; Laxmi Narsimha Rao Oruganti
Subject: Re: [WebStorage] Concerns on spec section 'Processing Model'

On Thu, 16 Jul 2009, Nikunj R. Mehta wrote:
> The spec should not restrict implementations to any one level of
> concurrency unless there are specific undesirable effects. Restricting
> the database to a single writer means that if there are separate
> workers or background threads working to update non-overlapping
> portions, then they have to wait for the lone current writer.
> Implementations can certainly compete to produce the level of
> concurrency that developers need.
>
> Specifically, I propose that the following text
>
> [[
> If the mode is read/write, the transaction must have an exclusive
> write lock over the entire database. If the mode is read-only, the
> transaction must have a shared read lock over the entire database. The
> user agent should wait for an appropriate lock to be available.
> ]]
>
> be replaced with the following text
>
> [[
> Multiple read-only transactions may share the same data as long as
> there is no transaction attempting to write the data being read. The
> user agent must wait for transactions that are reading some data
> before allowing a read/write transaction on the same data to continue.
> ]]

Since there's no way for the author to say ahead of time which rows or cells the transactions are going to use, how can you do the above without ending up with some transactions failing unpredictably?

--
Ian Hickson
http://ln.hixie.ch/
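The locking model quoted above (shared read lock, exclusive write lock, waiters queue until a lock is available) can be sketched as a tiny readers-writer queue. This is a minimal illustration, not the spec's algorithm; the class and method names are mine.

```javascript
// Sketch of the spec's locking model: read-only transactions share the
// lock; a read/write transaction needs exclusive access to the whole
// database; waiters queue until an appropriate lock is available.
// (DatabaseLock and its API are hypothetical, for illustration only.)
class DatabaseLock {
  constructor() { this.readers = 0; this.writer = false; this.queue = []; }
  acquire(mode, cb) { this.queue.push({ mode, cb }); this._drain(); }
  release(mode) {
    if (mode === 'read') this.readers--; else this.writer = false;
    this._drain();
  }
  _drain() {
    while (this.queue.length) {
      const next = this.queue[0];
      if (next.mode === 'read' && !this.writer) {
        this.queue.shift(); this.readers++; next.cb();       // shared
      } else if (next.mode === 'readwrite' && !this.writer && this.readers === 0) {
        this.queue.shift(); this.writer = true; next.cb();   // exclusive
      } else {
        break; // head of queue must wait; nothing behind it may jump ahead
      }
    }
  }
}

const log = [];
const lock = new DatabaseLock();
lock.acquire('read', () => log.push('reader 1 starts'));
lock.acquire('read', () => log.push('reader 2 starts'));    // shared: runs immediately
lock.acquire('readwrite', () => log.push('writer starts')); // queued behind the readers
lock.release('read');
lock.release('read'); // last reader finishes, writer gets the exclusive lock
```

Note that in this model the writer is never told "lock unavailable"; it simply waits, which is the behaviour Ian's question is probing.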
RE: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:
> That is all the responsibility of the database system. We don't need to
> tell database systems how to do it; we just need to tell them what to
> do. Database systems today have a lock manager that takes care of these
> responsibilities. On the question of transactions failing
> unpredictably: even with the current specification, transactions do
> fail. For example, if a writer transaction is already holding an
> exclusive lock, a new thread would fail to acquire the lock. The
> failures would still be there. The next question people would ask is
> how we make sure that partial changes don't cause problems when a
> failure occurs in the middle of a sequence of operations. That is the
> responsibility of the transaction manager, which treats the whole
> sequence as a single atomic unit.

As I understand it, with what is specced now, if you try to get a write transaction lock, it will only fail if it times out, which would probably be a symptom of a more serious bug anyway. There's never going to be a forced rollback; once you have got a transaction lock, it is never going to fail on you unexpectedly.

I think this is an important invariant, because otherwise script writers _will_ shoot themselves in the foot. These aren't professional database developers; Web authors span the gamut of developer experience, from the novice who is writing code more by luck than by knowledge, all the way to the UI designer who wound up stuck with the task of writing the UI logic but has no professional background in programming, let alone concurrency in databases. We can't be firing unexpected exceptions when their users happen to open two tabs to the same application at the same time, leaving data unsaved.

--
Ian Hickson
http://ln.hixie.ch/
RE: [WebStorage] Concerns on spec section 'Processing Model'
Let me probe this further to get clarity.

[Ian] As I understand it, with what is specced now, if you try to get a write transaction lock, it will only fail if it times out, which would probably be a symptom of a more serious bug anyway. There's never going to be a forced rollback; once you have got a transaction lock, you are not going to ever have it fail on you unexpectedly.

My understanding of your requirement is: the database should allow only one active writer transaction, and how database systems achieve this need not be explained. Note that this need not be achieved only by acquiring an exclusive lock on the database file. Think of a database implementation that is not based on a single file (a log + checkpoint design), where there is one data file and a couple of log files. Spec-ing that implementations have to hold an exclusive lock on "the database file" is ambiguous between the data file and the log files. If you take BDB JE as an example, it doesn't even have a data file; its model is a sequence of log files.

I have a question. Many database systems today don't ask the user to specify the purpose of use (read/write) when opening a transaction, yet the spec expects the use of a transaction to be specified as read/write. If we really need this to fly with multiple database systems, we should offer that choice at connection opening time, not at transaction creation time; a connection typically accepts read/write or read-only. Nikunj or others should comment on whether taking the read/write vs. read-only choice at connection time is practiced in their products.

Thanks,
Laxmi
[selectors-api] Test Suite Progress
Hi,

I've made some progress with the test suite. I have now split the test suite up into 3 files, similar to how I previously described [1]:

1. Baseline Tests: HTML with CSS Level 2.1 Selectors.
2. Additional Tests: HTML with Selectors Level 3.
3. Additional Tests: XHTML+SVG with Selectors Level 3.

http://dev.w3.org/2006/webapi/selectors-api-testsuite/

The baseline tests in the first file are a subset of all the tests in the second. To create it, I basically removed any test using a selector introduced in Selectors 3, without modifying other tests.

I've also begun to add tests for the namespace selector syntax [2] to the second set, but they are currently a work in progress and are not functioning properly. If anyone can figure out what I've done wrong, please let me know. Once these are written, they will also be incorporated into the third set.

The third set is an XHTML version of the tests which incorporates the tests that Erik had submitted [3]. I haven't verified that all of these tests are functioning properly; I simply applied the patch from Erik without modification.

Finally, I still need to merge in the tests from Hixie [4]. If anyone has the time and motivation and would like to assist with incorporating/fixing these tests, please let me know.

[1] http://lists.w3.org/Archives/Public/public-webapps/2009AprJun/1221.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2009JanMar/0713.html
[3] http://lists.w3.org/Archives/Public/public-webapps/2009JanMar/0788.html
[4] http://www.hixie.ch/tests/adhoc/dom/selectors/001.html

--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Re: [selectors-api] Test Suite Progress
Lachlan Hunt wrote:
> I've also begun to add tests for the namespace selector syntax [2] to
> the second set, but they are currently a work in progress and are not
> functioning properly. If anyone can figure out what I've done wrong,
> please let me know.

I'm glad to try to figure that out, if you give me some idea of what "not functioning properly" means here... As far as I can tell, Gecko and WebKit are both passing the namespace syntax tests you have here. Is the problem that they're passing when they shouldn't be? Or something else?

-Boris
Re: [selectors-api] Test Suite Progress
Boris Zbarsky wrote:
> Lachlan Hunt wrote:
>> I've also begun to add tests for the namespace selector syntax [2] to
>> the second set, but they are currently a work in progress and are not
>> functioning properly. If anyone can figure out what I've done wrong,
>> please let me know.
>
> I'm glad to try to figure that out, if you give me some idea of what
> "not functioning properly" means here... As far as I can tell, Gecko
> and WebKit are both passing the namespace syntax tests you have here.
> Is the problem that they're passing when they shouldn't be? Or
> something else?

Both Minefield and WebKit trunk are failing those tests for me. I have all but one commented out, just to make my debugging easier (see lines 287 to 292 in 002.html). But the one that's still uncommented is failing. In the current version, the output is now:

FAIL Element.querySelectorAll: .any-namespace *|div, e.code = TypeError: Result of expression 'found[f].style' [null] is not an object.

I just checked in a new copy that outputs the exception message for debugging. So based on that, the problem is that when the test tries to check the style of the found element, it fails because it's dealing with elements in namespaces other than the HTML namespace, which don't have a .style property. I'm not sure what we can do to work around that easily.

--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
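One possible workaround for the failure above is to stop reading `.style` directly and fall back to a computed-style lookup for elements that don't expose a CSSStyleDeclaration. This is only a sketch of the idea, not code from the test suite; `inlineColor` and the mock objects below are my own names, and the mocks stand in for real DOM nodes.

```javascript
// Hedged workaround sketch: read an element's colour without assuming
// every matched node has a .style property (foreign-namespace elements
// in the failing engines do not). All names here are hypothetical.
function inlineColor(el, view) {
  if (el.style) return el.style.color;                 // HTML-namespace path
  if (view && view.getComputedStyle) {
    return view.getComputedStyle(el, null).color;      // namespace-agnostic path
  }
  return null; // element exposes no style information at all
}

// Mock objects standing in for HTML vs. foreign-namespace elements:
const htmlEl = { style: { color: 'green' } };
const svgEl  = {};                                     // no .style, as in the bug report
const view   = { getComputedStyle: () => ({ color: 'green' }) };
```

In a real test page, `view` would be `document.defaultView`, so the same check works for both HTML and non-HTML matches.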
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, Jul 24, 2009 at 1:54 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:
> Experience has shown that there is no easy way out when dealing with
> transactions, and locking at the whole database level is no solution to
> failures.

The thing that makes the web browser environment different and interesting is that multiple independent applications can end up having access to the same database if they run in the same origin. This could be multiple instances of the same app (e.g. multiple Gmail windows) or just different apps that happen to be on the same origin (many Google apps run on www.google.com).

Because these apps are isolated from each other, they have no way to cooperate to reduce conflicts. They also have no control over whether there are multiple copies of themselves (the user controls this). Therefore, if the platform does not protect against this, basically any statement can fail due to conflict. This was a big problem with Gears, and led to applications having to go through crazy contortions to do things like master election. When we designed the HTML5 version of the database API, we specifically tried to avoid it.

I do not agree that database-level locking is a big problem for web applications. They are typically serving at most a handful of clients. As long as you can specify read vs. read/write transactions, I think the current design is the correct trade-off in terms of complexity and correctness.

- a
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, Jul 24, 2009 at 2:06 PM, Aaron Boodman <a...@google.com> wrote:
> I do not agree that database-level locking is a big problem for web
> applications.

Preemptive correction: I mean for the client side of web applications. There are usually at most a handful of clients accessing an HTML5 database instance.

- a
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, Jul 24, 2009 at 2:17 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote:
> On Jul 24, 2009, at 1:36 AM, Ian Hickson wrote:
>> On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:
>> [snip]
>>
>> As I understand it, with what is specced now, if you try to get a
>> write transaction lock, it will only fail if it times out, which would
>> probably be a symptom of a more serious bug anyway.
>
> Can you explain "a more serious bug"? The write lock may actually
> happen in the middle of a read-only transaction, can't it? I don't see
> spec text prohibiting that.

It's not clear to me what you're asking about in this paragraph. Write transactions should be exclusive; read transactions can be shared. Write transactions queue until all open read transactions complete. I can't find the actual spec right now (where is its canonical home now?) so I can't point at the exact text, but that is the goal, I believe.

>> There's never going to be a forced rollback; once you have got a
>> transaction lock, you are not going to ever have it fail on you
>> unexpectedly.
>
> Even if you have a transaction lock:
> 1. the application logic could cause an exception
> 2. the application finds an unacceptable data condition and needs to
>    roll back the transaction

In these cases, the developer caused the exception himself, and it only affects his own application. So it's not unexpected in the same sense locking exceptions are.

> 3. face a disk failure

Yes, but there is nothing that the platform can do about this case. It is an exception in the truest sense of the word.

> 4. encounter a bug in the underlying software

The platform shouldn't have API "just in case" the underlying software has bugs. Throwing an exception and unwinding the transaction is the most sensible thing to do, and that is what it does now.

> In either of these cases, how would the application code be expected to
> recover?

It can't, but these are exceptional circumstances that are generally unexpected and indicate bigger problems. This is different from access conflicts, which are expected and do not indicate bigger problems. The platform can and should make these easier.

>> These aren't professional database developers; Web authors span the
>> gamut of developer experience from the novice who is writing code more
>> by luck than by knowledge all the way to the UI designer who wound up
>> stuck with the task of writing the UI logic but has no professional
>> background in programming, let alone concurrency in databases.
>
> This is a strong reason to avoid SQL in the front-end.

I am also interested in a non-SQL storage API, but I think this is a separate issue.

>> We can't be firing unexpected exceptions when their users happen to
>> open two tabs to the same application at the same time, leaving data
>> unsaved.
>
> So you'd much rather tell an application user that they should close
> one of the two tabs, since they can't obtain a read-write lock in both.
> I still don't understand how the exclusive database lock helps. Would
> you please elaborate?

I think you are either misunderstanding the spec or it has a bug, because this is not the intent. Requests to obtain read/write transactions are asynchronous and are queued until they can be granted:

myDatabase.transaction(function(tx) {
  // this callback doesn't occur until the caller has exclusive access
});

- a
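The queuing behaviour Aaron describes can be sketched with a mock database object. This is an illustration only, not the Web SQL API: `MockDatabase` and the explicit `commit` call are my inventions (in the real API the commit is implicit when the callback's statements finish); the point shown is that `transaction()` never fails for lack of a lock, it simply defers the callback.

```javascript
// Sketch of queued transaction requests: a second transaction's
// callback is held back until the first has committed, rather than
// failing with a lock error. (MockDatabase is hypothetical.)
class MockDatabase {
  constructor() { this.busy = false; this.pending = []; }
  transaction(cb) {
    this.pending.push(cb);  // never throws: requests queue up
    this._next();
  }
  _next() {
    if (this.busy || this.pending.length === 0) return;
    this.busy = true;       // exclusive access granted
    const cb = this.pending.shift();
    cb({ commit: () => { this.busy = false; this._next(); } });
  }
}

const order = [];
const db = new MockDatabase();
let firstTx;
db.transaction(tx => { order.push('tx1 starts'); firstTx = tx; }); // holds access
db.transaction(tx => { order.push('tx2 starts'); tx.commit(); });  // queued, not failed
order.push('tx1 still open');
firstTx.commit(); // releasing tx1 lets tx2 run
```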
RE: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:
> Let me probe this further to get clarity.
>
> [Ian] As I understand it, with what is specced now, if you try to get a
> write transaction lock, it will only fail if it times out, which would
> probably be a symptom of a more serious bug anyway. There's never going
> to be a forced rollback; once you have got a transaction lock, you are
> not going to ever have it fail on you unexpectedly.
>
> My understanding of your requirement is: the database should allow only
> one active writer transaction, and how database systems achieve this
> need not be explained.

Sure; so long as the implementation is black-box indistinguishable from what the spec says, it can do whatever it wants.

> Note that this need not be achieved only by acquiring an exclusive lock
> on the database file. Think of a database implementation that is not
> based on a single file (a log + checkpoint design), where there is one
> data file and a couple of log files. Spec-ing that implementations have
> to hold an exclusive lock on "the database file" is ambiguous between
> the data file and the log files. If you take BDB JE as an example, it
> doesn't even have a data file; its model is a sequence of log files.

The exclusive lock model described in the spec is just a model; it isn't intended to actually require an exclusive lock. If an implementation can get the same result using some other mechanism, that's fine.

On Fri, 24 Jul 2009, Nikunj R. Mehta wrote:
> Database developers (whether experienced DBAs or newcomer WebApp
> programmers) identify the data set they are using through statements
> they execute (within or outside transactions). It is the database's job
> to find out which records are being used.

Sure.

> The concepts of transaction processing apply no matter the granularity
> of a data item, whether it is a record, a disk block, or a whole file.
> There are many kinds of failures (and yes, failures are always
> unpredictable) [1]. Let's focus on failures arising from concurrency
> control enforcement, which is probably the one most people worry about
> from a programming perspective. In the following discussion, I use the
> term "locking", even though other protocols have been developed and are
> in use, to guarantee serializability, i.e., correct interleaving of
> concurrent transactions.
>
> A knowledgeable database programmer would read the smallest set of data
> in a transaction so as to avoid locking the entire database for
> concurrent operations. Moreover, this approach also minimizes
> starvation, i.e., the amount of time a program would need to wait to
> obtain permission to exclusively access data.
>
> Transactions can fail even if locking occurs at the whole database
> level. As an example, consider these situations:
>
> 1. A read-only transaction is timed out because some read-write
>    transaction went on for too long.
> 2. A read-write transaction is timed out because some read-only
>    transaction went on for too long.

These are the only failure modes possible currently, I believe.

> 3. A read-only transaction includes inside it a read-write transaction.

This isn't possible with the current asynchronous API as far as I can tell. With the synchronous API, it would hang trying to open the read-write transaction for however long it takes the UA to realise that the script that is trying to get the read-write transaction is the same one that has an open read-only transaction, and then it would fail with error code 7.

> Experience has shown that there is no easy way out when dealing with
> transactions, and locking at the whole database level is no solution to
> failures.

It's not supposed to be a solution to failures; it's supposed to be, and as far as I can tell is, a way to turn unpredictable, transient, intermittent, and hard-to-debug concurrency errors into guaranteed, easy-to-debug errors.

On Fri, 24 Jul 2009, Nikunj R. Mehta wrote:
>> There's never going to be a forced rollback; once you have got a
>> transaction lock, you are not going to ever have it fail on you
>> unexpectedly.
>
> Even if you have a transaction lock:
> 1. the application logic could cause an exception
> 2. the application finds an unacceptable data condition and needs to
>    roll back the transaction

Sure, but both of those are under the control of the author.

> 3. face a disk failure

This is an exceptional situation from which there is no good recovery. It isn't an expected situation resulting from a complicated API.

> 4. encounter a bug in the underlying software

We can't do anything to prevent these in the spec.

> In either of these cases, how would the application code be expected to
> recover?

In the first two and the last one, the author can debug the problem and fix or work around the bug. In the case of hardware failure, there is no sane recovery model. These are very different from concurrency bugs.

I think this is an important invariant, because otherwise script writers _will_ shoot themselves in the foot.
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Jul 24, 2009, at 2:53 PM, Ian Hickson wrote:
> These are very different from concurrency bugs.

There are only three concurrency bugs:

1. The Lost Update Problem
2. The Temporary Update (or Dirty Read) Problem
3. The Incorrect Summary Problem

None of these is related to the granularity of locking. All of them are solved through the use of transactions. If an application uses transactions correctly, then it is free from concurrency bugs.

Nikunj
http://o-micron.blogspot.com
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, 24 Jul 2009, Nikunj R. Mehta wrote:
> On Jul 24, 2009, at 2:53 PM, Ian Hickson wrote:
>> These are very different from concurrency bugs.
>
> There are only three concurrency bugs:
> 1. The Lost Update Problem
> 2. The Temporary Update (or Dirty Read) Problem
> 3. The Incorrect Summary Problem
> None of these is related to the granularity of locking. All of them are
> solved through the use of transactions. If an application uses
> transactions correctly, then it is free from concurrency bugs.

If you have two applications in two tabs, and they both need to read row A and then write to row B, and they start doing these two tasks simultaneously, how do you prevent either from failing if you don't have database-wide locking?

--
Ian Hickson
http://ln.hixie.ch/
Re: [WebStorage] Solution proposed (was: Concerns on spec section 'Processing Model')
If you want to provide an application programmer with a limited degree of freedom from a certain class of errors, then there is a different solution. It is called an isolation level [1]. When opening a transaction, just provide the required isolation level. Heck, if you'd like, make SERIALIZABLE the default value. But don't disallow other possibilities or create the illusion of silver bullets.

On Jul 24, 2009, at 2:53 PM, Ian Hickson wrote:
> On Fri, 24 Jul 2009, Laxmi Narsimha Rao Oruganti wrote:
>> My understanding of your requirement is: the database should allow
>> only one active writer transaction, and how database systems achieve
>> this need not be explained.
>
> Sure; so long as the implementation is black-box indistinguishable from
> what the spec says, it can do whatever it wants.
>
> [snip]
>
> The exclusive lock model described in the spec is just a model; it
> isn't intended to actually require an exclusive lock. If an
> implementation can get the same result using some other mechanism,
> that's fine.

The spec says:

[[
If the mode is read/write, the transaction must have an exclusive write lock over the entire database
]]

Therefore, correct me if I am wrong, but the spec prohibits the following: an implementation of the Database object allowing another transaction to write in a database while some transaction holds a write lock on that database. If so, then I want to formally object to that spec text, because it is overly restrictive on implementers as well as on application programmers.

[snip]

>> 3. A read-only transaction includes inside it a read-write
>>    transaction.
>
> This isn't possible with the current asynchronous API as far as I can
> tell. With the synchronous API, it would hang trying to open the
> read-write transaction for however long it takes the UA to realise that
> the script that is trying to get the read-write transaction is the same
> one that has an open read-only transaction, and then it would fail with
> error code 7.

Then again, the spec is too restrictive, because application programmers need the ability to upgrade their lock from read-only to read-write, and an application should never deadlock itself. We would have failed the same "dumb" programmer if we didn't allow this. Therefore, I formally object to the spec disallowing an application to upgrade its database lock.

>> Experience has shown that there is no easy way out when dealing with
>> transactions, and locking at the whole database level is no solution
>> to failures.
>
> It's not supposed to be a solution to failures; it's supposed to be,
> and as far as I can tell is, a way to turn unpredictable, transient,
> intermittent, and hard-to-debug concurrency errors into guaranteed,
> easy-to-debug errors.

How is a timeout an easy-to-debug error? What is the meaning of a "guaranteed" error? How is a guaranteed error better than its opposite? Do you have any facts to back this up? If not, I would like to avoid using that judgement.

> I think this is an important invariant, because otherwise script
> writers _will_ shoot themselves in the foot.

>> Even if the transaction lock doesn't fail, how would one deal with
>> other transaction failures?
>
> I don't understand the relevance. If there's a hardware error, retrying
> isn't going to help. If there's a concurrency error, the only solution
> will be to design complex locking semantics outside the API, which
> would be a terrible burden to place on Web authors.

As I explained in my simple example of updating a spreadsheet cell, users cannot avoid complex semantics when dealing with concurrency and sharing in the face of consistency needs. It is an end-to-end reliability requirement (in the same sense as that used by Saltzer, Reed and Clark), and unavoidable for all but unreliable systems.

> These aren't professional database developers; Web authors span the
> gamut of developer experience from the novice who is writing code more
> by luck than by knowledge all the way to the UI designer who wound up
> stuck with the task of writing the UI logic but has no professional
> background in programming, let alone concurrency in databases.
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, Jul 24, 2009 at 2:54 PM, Nikunj R. Mehtanikunj.me...@oracle.com wrote: On Jul 24, 2009, at 2:06 PM, Aaron Boodman wrote: On Fri, Jul 24, 2009 at 1:54 PM, Nikunj R. Mehtanikunj.me...@oracle.com wrote: Experience has shown that there is no easy way out when dealing with transactions, and locking at the whole database level is no solution to failures. The thing that makes the web browser environment different an interesting is that multiple independent applications can end up having access to the same database if the run in the same origin. Applications have the ability to specify which database they want to use. So I don't see problems in apps sharing an origin. Right, but say two Gmail tabs are opened independently. They both say they want to access the messages database. They have no way to know about each other except through shared storage (postMessage does not work across multiple independent tabs). Now they can conflict with each other. There is no way for the developer to deal with this problem other than retrying or implementing another concurrency system on top of shared storage. This seems bad. This could be multiple instances of the same app (eg multiple gmail windows) or just different apps that happen to be on the same origin (many Google apps run on www.google.com). When running multiple instances of the same application, or when different applications share the same data, you are beginning to deal with multi-user applications (even though it may be the same security principal). In multi-user applications, database transactions are the same as what they are on the server. Applications have no choice but to be careful in performing transactions. Let me illustrate this with an example. Say that I had a spreadsheet app. The value of a cell was displayed to the user as X. Now, I go in to one tab A and say add five to X. I also go in to B and say add five to X. 
One of those operations will have to fail because it finds that the version of X is not what it was when the transaction started out. Even if you put a lock on the entire database, you can't avoid that problem. The issue of the data changing between the time when it was displayed to the user and the time when an update is started is different than the problem of the data changing while a multi-step update (a transaction) is in progress. The first problem is well known and understood by client-side web developers because the web is stateless and the same can occur between contacts with the server. It's also pretty self-evident that if you copy data out of storage and into the UI that the two can change independently. The second problem is not well known by the same people and would be surprising. Up until recently there was no local storage except cookies and some proprietary things, and both were synchronous. Because all browsers until recently were single-threaded, this effectively meant that clients had storage-wide locks (there were actually bugs with this in Firefox+cookies, I am told, but cookies were not frequently used in a way that exposed it). It seems that the way the spec is written, novice programmers would be led to either 1. face lost updates because they assume the browser locks the entire database, and so they won't bother to do their own analysis of whether data has changed since the last time they saw it. Some very novice users will not realize that their UI and local store can change independently, or will, but won't realize that there can be multiple copies of their apps. I think the current design is a good trade-off because addressing that problem would essentially mean binding the UI to the datastore, which would introduce gigantic API complexity making it basically not workable. And many developers will understand the problem from experience with the web. 2. 
create single-instance-only apps, i.e., hold a write lock on the database forever since they don't want to deal with version checks. I don't think you understand the spec - it isn't actually possible to hold the lock forever. Locks aren't an explicit part of the API, but are implicit and released automatically when functions return. Take a look at the transaction method again: db.transaction(function() { tx.executeSql(strSql, function() { }); }); The transaction is implicitly released when the last sql statement is completed (or fails). The only way you can keep this transaction open is to execute more SQL. Because these apps are isolated from each other, they have no way to cooperate to reduce conflicts. They also have no control over whether there are multiple copies of themselves (the user controls this). Sorry, but there is postMessage, localStorage, and the database itself. What do you mean these apps are isolated and have no way to cooperate? postMessage can't be used across independent tabs. Even if it could, it is asynchronous and most vendors would be reluctant to put more in the spec
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Jul 24, 2009, at 3:11 PM, Ian Hickson wrote: On Fri, 24 Jul 2009, Nikunj R. Mehta wrote: On Jul 24, 2009, at 2:53 PM, Ian Hickson wrote: These are very different from concurrency bugs. There are only three concurrency bugs: 1. The Lost Update Problem 2. The Temporary Update (or Dirty Read) Problem 3. The Incorrect Summary Problem. None of these is related to the granularity of locking. All of these are solved through the use of transactions. If an application uses transactions correctly, then it is free from concurrency bugs. If you have two applications in two tabs, and they both need to read row A, then write to row B, and they start doing these two tasks simultaneously, how do you prevent either from failing if you don't have database-wide locking? First of all, the value of row A never changes in this example. So it is immaterial whether the transaction locked the whole database or just row B. Your application that wrote this kind of a query/update has a concurrency bug, namely lost update. IOW, it is losing the first update because it did not check the value of B before modifying it and didn't modify row A when it modified row B. Therefore, your question itself has a concurrency bug. This is why I said that locking is not a silver bullet and multi-user concurrency should not be taken lightly. For a primer on isolation levels, transactions, and locks, please see [1]. This discussion is an indicator of both the complexity involved in designing standards such as these and the amount of background knowledge required to design a good standard. Proponents of existing spec language have chosen to never explicitly back up their statements with the body of knowledge that exists in the database sciences. Taken together, all this makes it unlikely that a good SQL standard can be developed by this WG in the short period of time that some might be expecting. Nikunj http://o-micron.blogspot.com [1] http://www.oracle.com/technology/oramag/oracle/05-nov/o65asktom.html
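The third item on Nikunj's list is less widely known than the other two, so a sketch may help (plain JavaScript; the rows object and the interleaving are invented for illustration):

```javascript
// "Incorrect summary" anomaly: a reader sums rows a and b while a writer
// moves 50 between them. Without isolation, the reader can report a total
// that never existed at any single point in time.
const rows = { a: 100, b: 200 }; // the true total is always 300

const readA = rows.a;            // reader sees a = 100
rows.a -= 50;                    // writer's transfer commits in between
rows.b += 50;
const readB = rows.b;            // reader sees b = 250
const summary = readA + readB;   // 350: an incorrect summary
```

A transaction running at an adequate isolation level would prevent the writer's transfer from becoming visible between the reader's two reads.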
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote: So you are reduced to very awkward ways of cooperating -- using the database itself as a queue or for master election, or designing a separate transaction system between tabs which might be on separate threads, using an asynchronous API. Or you just accept that any statement can fail and retry everything. Or your app is just buggy if multiple instances are open. Did you consider for a moment that all this is merely a result of the SQLite feature to lock the entire database? Nikunj http://o-micron.blogspot.com
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, Jul 24, 2009 at 4:12 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote: On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote: 2. create single-instance-only apps, i.e., hold a write lock on the database forever since they don't want to deal with version checks. I don't think you understand the spec - it isn't actually possible to hold the lock forever. It is a little insulting for you to say that, but I will not take offense to it. I didn't mean any offense, I really don't think you understand the spec completely :). Locks aren't an explicit part of the API, but are implicit and released automatically when functions return. Take a look at the transaction method again: db.transaction(function(tx) { tx.executeSql(strSql, function() { }); }); The transaction is implicitly released when the last sql statement is completed (or fails). The only way you can keep this transaction open is to execute more SQL. (corrected a slight typo in the example - it was missing the parameter definition for tx) Thanks for the correction. Code is for conversational purposes only. I also may be forgetting some API details since I haven't looked at this in a while. If I put in a timer or another asynchronous call inside the block and that block used the variable tx, wouldn't it force the implementation to continue holding the database lock? If so, there is no limit to how long I can hold on to variables, and hence I could hold on to the database as an exclusive reader/writer for as long as I wanted to. A novice programmer would probably not even understand what a transaction means, except that they need a handle to change stuff. That programmer could hold on to this handle for the duration of the session. No. The transaction is not closed on GC, it is closed when the last statement that is part of the transaction completes. So holding a reference to the tx variable does nothing one way or the other. The only way to hang the transaction open would be to execute statements over and over.
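Aaron's description of the implicit commit can be modeled as a toy statement queue (a simulation of the behavior he describes, not the spec's actual processing model; runTransaction and its shape are invented here):

```javascript
// Statements are queued; statement callbacks may queue more statements;
// the transaction commits the moment the queue drains. Merely keeping a
// reference to tx afterwards holds nothing open.
function runTransaction(body) {
  const tx = {
    queue: [],
    committed: false,
    executeSql: function (sql, onResult) {
      this.queue.push({ sql: sql, onResult: onResult });
    },
  };
  body(tx); // the transaction callback queues the initial statements
  while (tx.queue.length > 0) {
    const stmt = tx.queue.shift();
    if (stmt.onResult) stmt.onResult(tx); // may call tx.executeSql again
  }
  tx.committed = true; // implicit commit: no statements left to run
  return tx;
}

const tx = runTransaction(function (t) {
  t.executeSql("SELECT 1", function (t2) {
    t2.executeSql("SELECT 2"); // chaining is what keeps a transaction alive
  });
});
// tx.committed is already true here, even though we still hold the reference
```

In this model a timer that fires later and touches tx would simply find an already-committed transaction, which matches Aaron's claim that only executing more SQL from within a statement callback extends the transaction.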
On Fri, Jul 24, 2009 at 4:13 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote: On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote: So you are reduced to very awkward ways of cooperating -- using the database itself as a queue or for master election, or designing a separate transaction system between tabs which might be on separate threads, using an asynchronous API. Or you just accept that any statement can fail and retry everything. Or your app is just buggy if multiple instances are open. Did you consider for a moment that all this is merely a result of the SQLite feature to lock the entire database? No, having the database not be able to change out from under a multi-step update was a design goal of the API. Implementing a complex application without exclusive transactions would be very difficult. I do understand your position that there is a tradeoff: you give up some performance because a skilled developer could do finer-grained locking and get better concurrency. And I think you are arguing that there should be an option for non-exclusive write transactions, or at least it should be up to the UA. I still feel that with the number of clients a typical HTML5 database will have, this is a non-issue and the spec makes the correct tradeoff. - a
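The tradeoff Aaron concedes can be made concrete with a toy lock table (illustration only; the API shown is invented and ignores waiting, lock upgrades, and deadlock):

```javascript
// One lock table, used two ways: a single key for the whole database
// (the current spec text) vs. one key per row (the finer granularity
// being proposed).
function makeLockTable() {
  const locks = new Map(); // key -> { mode: "shared"|"exclusive", holders }
  return {
    acquire: function (key, mode) {
      const cur = locks.get(key);
      if (!cur) {
        locks.set(key, { mode: mode, holders: 1 });
        return true;
      }
      if (cur.mode === "shared" && mode === "shared") {
        cur.holders += 1; // readers can share a lock
        return true;
      }
      return false; // conflict: the caller must wait or fail
    },
  };
}

// Database-wide locking: two writers touching different rows still conflict.
const whole = makeLockTable();
const w1 = whole.acquire("db", "exclusive"); // granted
const w2 = whole.acquire("db", "exclusive"); // denied: must wait

// Row-level locking: the same two writers proceed concurrently.
const perRow = makeLockTable();
const r1 = perRow.acquire("rowA", "exclusive"); // granted
const r2 = perRow.acquire("rowB", "exclusive"); // granted

// Multiple readers share, as with the spec's shared read lock.
const readers = makeLockTable();
const s1 = readers.acquire("db", "shared"); // granted
const s2 = readers.acquire("db", "shared"); // granted: shared with s1
```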
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote: I do not agree that database-level locking is a big problem for web applications. Our problem is not with databases doing database-level locking. Our problem is that such behavior is a MUST. I think it is very desirable for it to appear to the developer that writes to the local datastore are atomic. Lots of complexity falls out if this is not true. It is implicit that transactions give atomicity (that's what the A in ACID stands for). It would be mischaracterizing this discussion to say that we are arguing about atomicity. We are, however, talking about isolation (the I in ACID), or more precisely the degree of isolation. In some models (non-SQL) it may be easier to arrange a large update in the application layer and commit it all at once. In SQL, this is less true, so it is important to provide an API that makes conflicts impossible while a multi-step update is in progress. This problem exists in the WebStorage model [1]. More specifically, there is no way to perform multiple updates atomically in it. The proposal that I have sketched about B-trees [2] does not have this problem since it is possible to work with transactions to get the atomicity as well as a desired isolation level. I take it that there are no issues with that proposal since I have not heard anyone say so. Perhaps your real issue is that the current API does not work well for non-SQL data stores. Not at all! It would be disingenuous to find an ulterior motive in my arguments. Nikunj http://o-micron.blogspot.com [1] http://dev.w3.org/html5/webstorage/ [2] http://www.w3.org/mid/f480f60a-5dae-4b73-922a-93ed401cf...@oracle.com
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, Jul 24, 2009 at 4:30 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote: On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote: In some models (non-SQL) it may be easier to arrange a large update in the application layer and commit it all at once. In SQL, this is less true, so it is important to provide an API that makes conflicts impossible while a multi-step update is in progress. This problem exists in the WebStorage model [1]. More specifically, there is no way to perform multiple updates atomically in it. I agree. The proposal that I have sketched about B-trees [2] does not have this problem since it is possible to work with transactions to get the atomicity as well as a desired isolation level. I take it that there are no issues with that proposal since I have not heard anyone say so. I haven't reviewed that. I only chimed into this conversation because it looked like there were some misunderstandings and I worked on it early on. - a
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Jul 24, 2009, at 4:25 PM, Aaron Boodman wrote: On Fri, Jul 24, 2009 at 4:12 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote: On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote: 2. create single-instance-only apps, i.e., hold a write lock on the database forever since they don't want to deal with version checks. I don't think you understand the spec - it isn't actually possible to hold the lock forever. It is a little insulting for you to say that, but I will not take offense to it. I didn't mean any offense, I really don't think you understand the spec completely :). I beg to differ. Au contraire, I really don't think you understand databases at all. Locks aren't an explicit part of the API, but are implicit and released automatically when functions return. This is completely incorrect. Read below for more details. Take a look at the transaction method again: db.transaction(function(tx) { tx.executeSql(strSql, function() { }); }); The transaction is implicitly released when the last sql statement is completed (or fails). The only way you can keep this transaction open is to execute more SQL. (corrected a slight typo in the example - it was missing the parameter definition for tx) Thanks for the correction. Code is for conversational purposes only. I also may be forgetting some API details since I haven't looked at this in a while. If I put in a timer or another asynchronous call inside the block and that block used the variable tx, wouldn't it force the implementation to continue holding the database lock? If so, there is no limit to how long I can hold on to variables, and hence I could hold on to the database as an exclusive reader/writer for as long as I wanted to. A novice programmer would probably not even understand what a transaction means, except that they need a handle to change stuff. That programmer could hold on to this handle for the duration of the session. No.
The transaction is not closed on GC, it is closed when the last statement that is part of the transaction completes. So holding a reference to the tx variable does nothing one way or the other. The only way to hang the transaction open would be to execute statements over and over. A transaction is not complete until I either commit or roll back the transaction, which I can choose to do as late as I want to, e.g., at window.onclose. Therefore locks on the database will not be released for as long as the application wants to hold on to the transaction. On Fri, Jul 24, 2009 at 4:13 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote: On Jul 24, 2009, at 3:57 PM, Aaron Boodman wrote: So you are reduced to very awkward ways of cooperating -- using the database itself as a queue or for master election, or designing a separate transaction system between tabs which might be on separate threads, using an asynchronous API. Or you just accept that any statement can fail and retry everything. Or your app is just buggy if multiple instances are open. Did you consider for a moment that all this is merely a result of the SQLite feature to lock the entire database? No, having the database not be able to change out from under a multi-step update was a design goal of the API. Implementing a complex application without exclusive transactions would be very difficult. I am not proposing to take away your choice. But please don't take away mine. It would be useful to see an explanation as to why the proposal I made [[ add an isolation level parameter with a default value of SERIALIZABLE and remove the exclusive database-level write lock requirement ]] is worse than the current spec text. You can refer to SQL92 to explain the meaning of SERIALIZABLE. AFAIK, there are no interoperability problems with transaction isolation levels. Nikunj http://o-micron.blogspot.com
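Nikunj's proposal, as quoted, might look something like the following in API terms (purely hypothetical: the options argument, the constant names, and beginTransaction are invented here and do not appear in any draft of the spec):

```javascript
// Hypothetical sketch of an isolation level parameter with a SERIALIZABLE
// default, per the proposal quoted above. The level names come from SQL92.
const IsolationLevel = Object.freeze({
  READ_UNCOMMITTED: "READ UNCOMMITTED",
  READ_COMMITTED: "READ COMMITTED",
  REPEATABLE_READ: "REPEATABLE READ",
  SERIALIZABLE: "SERIALIZABLE", // the proposed default
});

function beginTransaction(options) {
  const isolation =
    (options && options.isolation) || IsolationLevel.SERIALIZABLE;
  return { isolation: isolation };
}

const dflt = beginTransaction(); // no argument: SERIALIZABLE, matching
                                 // what the current spec text mandates
const weak = beginTransaction({ isolation: IsolationLevel.READ_COMMITTED });
```

The appeal of this shape is that code written against the default behaves exactly as it does under the current spec text, while an implementation is free to grant more concurrency when the caller explicitly asks for a weaker level.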
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, Jul 24, 2009 at 4:45 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote: No. The transaction is not closed on GC, it is closed when the last statement that is part of the transaction completes. So holding a reference to the tx variable does nothing one way or the other. The only way to hang the transaction open would be to execute statements over and over. A transaction is not complete until I either commit or rollback the transaction, which I can choose to do as late as I want to, e.g., at window.onclose. Therefore locks on the database will not be released for as long as the application wants to hold on to the transaction. I don't think that this is true, at least in the Database interface: http://dev.w3.org/html5/webdatabase/#asynchronous-database-api There is no explicit commit() method. The commit happens implicitly after all queued statements have been executed successfully. It does appear that it is possible to hold a transaction open all day with the DatabaseSync interface (http://dev.w3.org/html5/webdatabase/#databasesync). Specifically the SQLTransactionSync method has commit/rollback methods. The DatabaseSync interface was added after I worked on this, so I can't say why it doesn't use callbacks. In any case, I was talking about the async flavor, which is what my example code referred to. Do you agree it is not possible to hang transactions open from Database (http://dev.w3.org/html5/webdatabase/#database)? If not, what am I missing? - a
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Jul 24, 2009, at 4:58 PM, Aaron Boodman wrote: On Fri, Jul 24, 2009 at 4:45 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote: No. The transaction is not closed on GC, it is closed when the last statement that is part of the transaction completes. So holding a reference to the tx variable does nothing one way or the other. The only way to hang the transaction open would be to execute statements over and over. A transaction is not complete until I either commit or roll back the transaction, which I can choose to do as late as I want to, e.g., at window.onclose. Therefore locks on the database will not be released for as long as the application wants to hold on to the transaction. I don't think that this is true, at least in the Database interface: http://dev.w3.org/html5/webdatabase/#asynchronous-database-api There is no explicit commit() method. The commit happens implicitly after all queued statements have been executed successfully. The spec is also silent about what happens if I put a wait by making another asynchronous call inside my transaction callback logic. By inference, this would be allowed since all statements are executed inside callbacks, so why distinguish between transaction and other (non-SQLTransactionErrorCallback) types of callbacks? The processing model in 4.3.2 simply says that the SQL statements are queued up. It is unclear what, if anything, happens if the database runs out of statements to execute because the transaction logic takes time to add another statement to the queue before the database decides to commit. Am I wrong or is this an ambiguous, but correct interpretation? Those who are worried about throwing the complexity of transaction recovery on Web programmers should perhaps also be worried about the insane complexity of asynchronous transaction programming, which no one in the world should have to learn. The mainstream database developers don't have to deal with that. Why should poor Web programmers have to suffer this?
Moreover, with an asynchronous database the spec doesn't allow an application to rollback a transaction, should certain application logic require that. This is yet another case of creating a storage API that is different from what traditional database developers are used to. There seems to be a pattern of ignoring good API practices when interacting with a database and it appears intentional. Am I wrong in my interpretation? It does appear that it is possible to hold a transaction open all day with the DatabaseSync interface (http://dev.w3.org/html5/webdatabase/#databasesync). Specifically the SQLTransactionSync method has commit/rollback methods. The DatabaseSync interface was added after I worked on this, so I can't say why it doesn't use callbacks. In any case, I was talking about the async flavor, which is what my example code referred to. Do you agree it is not possible to hang transactions open from Database (http://dev.w3.org/html5/webdatabase/#database)? If not, what am I missing? I can't agree simply because the spec says nothing about it. In fact, if anything the rest of the spec text around asynchronous processing suggests that it is possible to hang transactions indefinitely. Nikunj http://o-micron.blogspot.com
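For comparison, the rollback semantics being asked for can be modeled with a copy-and-publish sketch (a simulation, not the spec's algorithm; applyAtomically is an invented helper):

```javascript
// Queued updates are applied against a private copy of the store; if any
// statement fails, the copy is discarded (rollback) and the original store
// is untouched. Only if all statements succeed is the copy published.
function applyAtomically(store, statements) {
  const staged = Object.assign({}, store); // work on a private copy
  for (const stmt of statements) {
    try {
      stmt(staged);
    } catch (e) {
      return { ok: false, store: store }; // rollback: discard the copy
    }
  }
  return { ok: true, store: staged };     // commit: publish the copy
}

const before = { balance: 100 };
const failed = applyAtomically(before, [
  function (s) { s.balance -= 30; },
  function (s) { throw new Error("constraint violated"); },
]);
// failed.ok is false and failed.store.balance is still 100

const done = applyAtomically(before, [
  function (s) { s.balance -= 30; },
]);
// done.ok is true and done.store.balance is 70
```

An application-triggered rollback in this model is just a statement that deliberately fails, which is exactly the kind of explicit control the async Database interface is being criticized for omitting.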
Re: [WebStorage] Concerns on spec section 'Processing Model'
On Fri, Jul 24, 2009 at 4:45 PM, Nikunj R. Mehta <nikunj.me...@oracle.com> wrote: I am not proposing to take away your choice. But please don't take away mine. It would be useful to see an explanation as to why the proposal I made [[ add an isolation level parameter with a default value of SERIALIZABLE and remove the exclusive database-level write lock requirement ]] is worse than the current spec text. You can refer to SQL92 to explain the meaning of SERIALIZABLE. AFAIK, there are no interoperability problems with transaction isolation levels. I'm personally not opposed to adding more isolation levels in addition to the current single option of SERIALIZABLE. It could be added as an argument to transaction(). I don't think it is a particularly high-value feature, but I also don't see a big problem with it. And I can imagine that some particularly ambitious developers might want to take advantage of it. - a
[WebDatabase] Database interface (vs. DatabaseSync interface)
It appears that the Database, SQLTransactionCallback, SQLTransactionErrorCallback, SQLVoidCallback, SQLTransaction, SQLStatementCallback, and SQLStatementErrorCallback interfaces can all be eliminated from WebDatabase completely. Given WebWorkers and DatabaseSync, why do we need the Database object at all? Are there use cases that cannot be satisfied by the combination of the two? There is a brand new programming model being promoted by the Database object; it is as complex as it gets, and seriously, I cannot get it. I don't know about you, but I doubt it will work for an average Web page author. Given its programming model, Oracle is not supportive of the asynchronous Database and related interfaces. Nikunj http://o-micron.blogspot.com
Re: [selectors-api] Test Suite Progress
Lachlan Hunt wrote: Both Minefield and Webkit trunk are failing those tests for me. I have all but one commented out just so it would make my debugging easier (see lines 287 to 292 in 002.html). Oh, I'd just missed the one failing test. Showing only failing tests helped! I just checked in a new copy that outputs the exception message for debugging. So based on that, the problem is that when it tries to check the style of the found element, it fails because it's dealing with elements in namespaces other than the HTML namespace, so they don't have a .style property. Yep. I'm not sure what we can do to work around that easily. Hmm. For the any-namespace test you could use a non-HTML node that happens to have .style (like SVG, say). For the no-namespace test, you'd want the harness to be somewhat different here... Maybe ask John Resig if he has any ideas, since he presumably knows this code pretty well? -Boris