Re: [IndexedDB] IDBRequest.abort on writing requests
On Tue, Jul 13, 2010 at 11:12 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Jul 13, 2010 at 9:41 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 1:17 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Jul 13, 2010 at 8:25 PM, Jonas Sicking jo...@sicking.cc wrote: Hi All, Sorry if this is something that I've brought up before. I know I meant to bring this up in the past, but I couldn't find any actual emails. One thing that we discussed while implementing IndexedDB was what to do for IDBRequest.abort() on writing requests, for example on the request object returned from IDBObjectStore.remove() or IDBCursor.update(). Ideally it would cancel the write operation, however this isn't always possible, for example if the call to .abort() comes after the write operation has already executed in the database, but before the 'success' event has had a chance to fire. What's worse is that other write operations might already have been performed on top of the aborted request. Consider for example the following code: req1 = myObjectStore.remove(12); req2 = myObjectStore.add({ id: 12, name: "Benny Andersson" }); /* do other stuff */ req1.abort(); In this case, even if the database supported aborting a specific operation, it's very hard to say what the correct thing to do with operations performed after it is. As far as I know, databases generally don't support rolling back a given operation, only rolling back to a specific point, i.e. rolling back a given operation and all operations performed after it. We could say that abort() signals some sort of error if the operation has already been performed in the database, however that makes abort() very racy. Instead we concluded that the best thing to do was to specify that IDBRequest.abort() should throw if called on a modifying request. If this sounds good I'll make this change to the spec. I'd be fine with that. Or we could remove abort altogether. I can't really think of what types of operations you'd really want to abort until (at least) we have some sort of join language or other mechanism to do really expensive read-only calls. I think there are expensive-ish read-only calls. Indexes are effectively a join mechanism since you'll hit one b-tree to do the index lookup, and then a second b-tree to look up the full object in the objectStore. But each individual call (the scope of canceling an IDBRequest) is pretty short. I don't really feel strongly either way. I think abort() isn't too hard to implement, but also doesn't provide a ton of value. At least not, like you say, until we add expensive calls like getAll or multi-step joins. I agree that when we look at adding such calls we may want to add an abort on just IDBRequest, but until then I don't think it's a very useful feature. And being easy to add is not a good reason to lock ourselves into a particular design in the future. I think we should remove it until there's a good reason for it to exist. Or we could take abort off IDBRequest and instead put a rollback on transactions (and not do the modify limitation). I definitely think we should have IDBTransaction.abort() no matter what. And that should allow rolling back write operations. Agreed. In which case it seems as though being able to abort individual operations isn't that important...especially given what we just talked about above. So can we just get rid of abort() on IDBRequest? I don't feel strongly either way.
We'll probably keep them in the mozilla implementation since we have experimental objectStore.getAll(key) and index.getAllObjects(key) implementations, which both probably count as long-running. / Jonas
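To make the conclusion above concrete, here is a minimal sketch of the alternative the thread converges on: rolling back at the transaction level with IDBTransaction.abort() rather than aborting an individual write request. It follows the draft API used elsewhere in this thread (db.transaction(), objectStore(), a bare READ_WRITE mode constant); the store name is made up, and the snippet is illustrative rather than normative.

    // Assumes an already-opened IDBDatabase in `db` and an object store named
    // "people" (hypothetical). READ_WRITE is written as in the thread's own examples.
    var trans = db.transaction(["people"], READ_WRITE);
    var store = trans.objectStore("people");
    store.remove(12);
    store.add({ id: 12, name: "Benny Andersson" });
    // ... do other stuff ...
    // If the whole unit of work turns out to be unwanted, roll everything back at once.
    // Unlike IDBRequest.abort(), this is well defined even after individual writes
    // have already executed, because the transaction has not committed yet.
    trans.abort();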
Re: [IndexedDB] IDBRequest.abort on writing requests
On Wed, Jul 14, 2010 at 7:28 AM, Jonas Sicking jo...@sicking.cc wrote: [...] We'll probably keep them in the mozilla implementation since we have experimental objectStore.getAll(key) and index.getAllObjects(key) implementations, which both probably count as long-running.
Sounds good. I'll file a bug to remove them from the spec then (and put a note that if we do have getAll* we should re-add it). J
Re: [IndexedDB] Current editor's draft
Hi Pablo, First off, thanks for your comments! (Probably too much) details below. On Tue, Jul 13, 2010 at 7:52 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Andrei Popescu Sent: Monday, July 12, 2010 5:23 AM Sorry I disappeared for a while. Catching up with this discussion was an interesting exercise...there is no particular message in this thread I can respond to, so I thought I'd just reply to the last one. Overall I think the new proposal is shaping up well and is being effective in simplifying scenarios. I do have a few suggestions and questions for things I'm not sure I see all the way. READ_ONLY vs READ_WRITE as defaults for transactions: To be perfectly honest, I think this discussion went really deep over an issue that won't be a huge deal for most people. My perspective, trying to avoid performance or usage frequency speculation, is around what's easier to detect. Concurrency issues are hard to see. On the other hand, whenever we can throw an exception and give explicit guidance that unblocks people right away. For this case I suspect it's best to default to READ_ONLY, because if someone doesn't read or think about it and just uses the stuff and tries to change something they'll get a clear error message saying if you want to change stuff, use READ_WRITE please. The error is not data- or context-dependent, so it'll fail on first try at most once per developer and once they fix it they'll know for all future cases. Yup, this was exactly my thinking. Dynamic transactions: I see that most folks would like to see these going away. While I like the predictability and simplifications that we're able to make by using static scopes for transactions, I worry that we'll close the door for two scenarios: background tasks and query processors. Background tasks such as synchronization and post-processing of content would seem to be almost impossible with the static scope approach, mostly due to the granularity of the scope specification (whole stores). Are we okay with saying that you can't for example sync something in the background (e.g. in a worker) while your app is still working? Am I missing something that would enable this class of scenarios? Query processors are also tricky because you usually take the query specification in some form after the transaction started (especially if you want to execute multiple queries with later queries depending on the outcome of the previous ones). The background tasks issue in particular looks pretty painful to me if we don't have a way to achieve it without freezing the application while it happens. I don't understand enough of the details here to be able to make a decision. The use cases you are bringing up I definitely agree are important, but I would love to look at even a rough draft of what code you are expecting people will need to write. What I suggest is that we keep dynamic transactions in the spec for now, but separate the API from static transactions, start a separate thread and try to hammer out the details and see what we arrive at. I do want to clarify that I don't think dynamic transactions are particularly hard to implement, I just suspect they are hard to use correctly. Implicit commit: Does this really work? I need to play with sample app code more, it may just be that I'm old-fashioned. 
For example, if I'm downloading a bunch of data from somewhere and pushing rows into the store, wouldn't it be reasonable to do the whole thing within a transaction? In that case I'm likely to have to unwind while I wait for the next callback from XmlHttpRequest with the next chunk of data. You definitely want to do it in a transaction. In our proposal there is no way to even call .get or .put if you aren't inside a transaction. For the case you are describing, you'd download the data using XMLHttpRequest first. Once the data has been downloaded you start a transaction, parse the data, and make the desired modifications. Once that is done the transaction is automatically committed. The idea here is to avoid keeping transactions open for long periods of time, while at the same time making the API easier to work with. I'm very concerned that with any API that requires people to do: startOperation(); ... do lots of stuff here ... endOperation(); people will forget to make the endOperation() call. This is especially true if the startOperation/endOperation calls are spread out over multiple different asynchronously called functions, which seems to be the use case you're concerned about above. One very easy way to forget to call endOperation() is if something in between the two function calls throws an exception. This will likely be extra bad for transactions where no write operations are done: in this case, failure to call a 'commit()' function won't result in any visibly broken behavior, so the mistake can easily go unnoticed.
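A rough sketch of the pattern described above: download with XMLHttpRequest first, then open a short-lived transaction that commits implicitly once its requests finish. Store names, the URL, and the data shape are invented for illustration, and the transaction API follows the draft being discussed here rather than any final spec.

    // Hypothetical: fetch the data outside of any transaction...
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "/export/records.json");
    xhr.onload = function() {
      var records = JSON.parse(xhr.responseText);
      // ...then do all the writes inside one transaction. There is no explicit
      // commit() call: once the last queued request has completed and no new
      // requests have been issued, the transaction commits automatically.
      var trans = db.transaction(["records"], READ_WRITE);
      var store = trans.objectStore("records");
      for (var i = 0; i < records.length; i++) {
        store.put(records[i]);
      }
    };
    xhr.send();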
Re: [IndexedDB] Current editor's draft
On Wed, Jul 14, 2010 at 3:52 AM, Pablo Castro pablo.cas...@microsoft.comwrote: From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Andrei Popescu Sent: Monday, July 12, 2010 5:23 AM Sorry I disappeared for a while. Catching up with this discussion was an interesting exercise... Yes, Indeed. :-) there is no particular message in this thread I can respond to, so I thought I'd just reply to the last one. Probably a good idea. I was trying to respond hixie style--which is harder than it looks on stuff like this. Overall I think the new proposal is shaping up well and is being effective in simplifying scenarios. I do have a few suggestions and questions for things I'm not sure I see all the way. READ_ONLY vs READ_WRITE as defaults for transactions: To be perfectly honest, I think this discussion went really deep over an issue that won't be a huge deal for most people. My perspective, trying to avoid performance or usage frequency speculation, is around what's easier to detect. Concurrency issues are hard to see. On the other hand, whenever we can throw an exception and give explicit guidance that unblocks people right away. For this case I suspect it's best to default to READ_ONLY, because if someone doesn't read or think about it and just uses the stuff and tries to change something they'll get a clear error message saying if you want to change stuff, use READ_WRITE please. The error is not data- or context-dependent, so it'll fail on first try at most once per developer and once they fix it they'll know for all future cases. Couldn't have said it better myself. Dynamic transactions: I see that most folks would like to see these going away. While I like the predictability and simplifications that we're able to make by using static scopes for transactions, I worry that we'll close the door for two scenarios: background tasks and query processors. Background tasks such as synchronization and post-processing of content would seem to be almost impossible with the static scope approach, mostly due to the granularity of the scope specification (whole stores). Are we okay with saying that you can't for example sync something in the background (e.g. in a worker) while your app is still working? Am I missing something that would enable this class of scenarios? Query processors are also tricky because you usually take the query specification in some form after the transaction started (especially if you want to execute multiple queries with later queries depending on the outcome of the previous ones). The background tasks issue in particular looks pretty painful to me if we don't have a way to achieve it without freezing the application while it happens. Well, the application should never freeze in terms of the UI locking up, but in what you described I could see it taking a while for data to show up on the screen. This is something that can be fixed by doing smaller updates on the background thread, sending a message to the background thread that it should abort for now, doing all database access on the background thread, etc. One point that I never saw made in the thread that I think is really important is that dynamic transactions can make concurrency worse in some cases. For example, with dynamic transactions you can get into live-lock situations. Also, using Pablo's example, you could easily get into a situation where the long running transaction on the worker keeps hitting serialization issues and thus it's never able to make progress. 
I do see that there are use cases where having dynamic transactions would be much nicer, but the amount of non-determinism they add (including to performance) has me pretty worried. I pretty firmly believe we should look into adding them in v2 and remove them for now. If we do leave them in, it should definitely be in its own method to make it quite clear that the semantics are more complex. Implicit commit: Does this really work? I need to play with sample app code more, it may just be that I'm old-fashioned. For example, if I'm downloading a bunch of data from somewhere and pushing rows into the store within a transaction, wouldn't it be reasonable to do the whole thing in a transaction? In that case I'm likely to have to unwind while I wait for the next callback from XmlHttpRequest with the next chunk of data. I understand that avoiding it results in nicer patterns (e.g. db.objectStores("foo").get(123).onsuccess = ...), but in practice I'm not sure if that will hold given that you still need error callbacks and such. I believe your example of doing XHRs in the middle of a transaction is something we were explicitly trying to avoid making possible. In this case, you should do all of your XHRs first and then do your transaction. If you need to read from the ObjectStore, do an XHR, and then write to the ObjectStore, you can implement it with two transactions and have the
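A rough sketch of the two-transaction approach mentioned at the end: read in one transaction, do the XHR outside any transaction, then write in a second transaction. All names and the URL are hypothetical, and the API shape follows the draft under discussion, so treat this as illustrative only.

    // Transaction 1 (hypothetical): read what we need, then let it auto-commit.
    var readTrans = db.transaction(["records"], READ_ONLY);
    readTrans.objectStore("records").get(12).onsuccess = function(e) {
      var record = e.result;
      // Do the network round-trip outside of any transaction.
      var xhr = new XMLHttpRequest();
      xhr.open("POST", "/process");
      xhr.onload = function() {
        // Transaction 2: write the result back. Another transaction could have
        // modified the record in the meantime; that is the price of not holding
        // a transaction open across the network round-trip.
        var writeTrans = db.transaction(["records"], READ_WRITE);
        record.processed = xhr.responseText;
        writeTrans.objectStore("records").put(record);
      };
      xhr.send(JSON.stringify(record));
    };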
Re: [IndexedDB] IDBRequest.abort on writing requests
On Tue, Jul 13, 2010 at 11:33 PM, Jeremy Orlow jor...@chromium.org wrote: [...] Sounds good. I'll file a bug to remove them from the spec then (and put a note that if we do have getAll* we should re-add it).
Actually, I thought of another situation where we currently could have long-running read requests. One implementation strategy for transactions is to not wait for all objectStores to become available before starting to execute requests. Instead you start locking objectStores in some implementation defined, but consistent, order and as soon as you've locked the objectStore at which the next pending request is placed, you perform that request. As long as the order in which the implementation locks the objectStores is the same for all transactions, there is no risk of deadlocks. Consider for example the following code: trans = db.transaction([foo, bar]); trans.objectStore(bar).get(12).onsuccess = function(e) { trans.objectStore(foo).get(e.result.parentId).onsuccess = ...; } If the implementation decided to
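A loose sketch of the locking strategy described above, written purely as implementation-side pseudocode rather than web-facing API: object stores are always acquired in one canonical order, so no two transactions can hold locks in opposite orders, and each request runs as soon as the store it targets has been locked. The helpers acquireLock and runRequest are assumed internals of a hypothetical engine.

    // Implementation-side sketch only; `acquireLock` and `runRequest` are assumed
    // helpers of the hypothetical engine, not part of any API.
    function startTransaction(storeNames, pendingRequests) {
      // Always lock stores in a canonical (here: sorted) order so every
      // transaction acquires them in the same sequence and deadlock is impossible.
      var ordered = storeNames.slice().sort();
      ordered.forEach(function(name) {
        acquireLock(name, function() {
          // As soon as this store is locked, queued requests against it can run,
          // even though stores later in the list may not be locked yet.
          pendingRequests
            .filter(function(req) { return req.storeName === name; })
            .forEach(runRequest);
        });
      });
    }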
[Bug 10165] New: IDBRequest.abort() should throw on non-read-only requests or simply be removed
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10165 Summary: IDBRequest.abort() should throw on non-read-only requests or simply be removed Product: WebAppsWG Version: unspecified Platform: PC OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Indexed Database API AssignedTo: nikunj.me...@oracle.com ReportedBy: jor...@chromium.org QAContact: member-webapi-...@w3.org CC: m...@w3.org, public-webapps@w3.org As discussed in the thread [IndexedDB] IDBRequest.abort on writing requests [1], it's dangerous to allow IDBRequest.abort() to be called on any request that has side effects. The best solution brought up in the thread was to have it throw if you call it on any request that isn't read only. But later on in the thread, I think it became fairly questionable that IDBRequest.abort() provides any value in the current spec since none of the read only operations should be particularly long running and in the few cases where you might truly need to do an abort, you can simply roll back the transaction. [1] http://lists.w3.org/Archives/Public/public-webapps/2010JulSep/0190.html
Re: CfC: to publish new WD of CORS; deadline July 20
On Tue, 13 Jul 2010 17:50:26 +0200, Mark S. Miller erig...@google.com wrote: Has anyone been working towards a revised Security Considerations section? Your Google colleague Dirk has volunteered but I believe has not yet had the time unfortunately. -- Anne van Kesteren http://annevankesteren.nl/
Re: [IndexedDB] Callback order
On Wed, Jul 7, 2010 at 11:54 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Jun 24, 2010 at 4:40 AM, Jeremy Orlow jor...@chromium.org wrote: On Sat, Jun 19, 2010 at 9:12 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 7:46 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Jun 18, 2010 at 7:24 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 7:01 PM, Jeremy Orlow jor...@chromium.org wrote: I think determinism is most important for the reasons you cited. I think advanced, performance concerned apps could deal with either semantics you mentioned, so the key would be to pick whatever is best for the normal case. I'm leaning towards thinking firing in order is the best way to go because it's the most intuitive/easiest to understand, but I don't feel strongly about anything other than being deterministic. I definitely agree that firing in request order is the simplest, both from an implementation and usage point of view. However my concern is that we'd lose most of the performance benefits that cursors provide if we use that solution. What do you mean with apps could deal with either semantics? You mean that they could deal with the cursor case by simply being slower, or do you mean that they could work around the performance hit somehow? Hm. I was thinking they could save the value, call continue, then do work on it, but that'd of course only defer the slowdown for one iteration. So I guess they'd have to store up a bunch of data and then make calls on it. Indeed which could be bad for memory footprint. Of course, they'll run into all of these same issues with the sync API since things are of course done in order. So maybe trying to optimize this specific case for just the async API is silly? I honestly haven't looked at the sync API. But yes, I assume that it will in general have to serialize all calls into the database and thus generally not be as performant. I don't think that is a good reason to make the async API slower too though. But it's entirely possible that I'm overly concerned about cursor performance in general though. I won't argue too strongly that we need to prioritize cursor callback events until I've seen some numbers. If we want to simply define that callbacks fire in request order for now then that is fine with me. Yeah, I think we should get some hard numbers and think carefully about this before we make things even more complicated/nuanced. I ran some tests. Note that the test implementation is an approximation. It's both somewhat optimistic in that it doesn't make the extra effort to ensure that cursor callbacks always run before other callbacks. But it's also somewhat pessimistic in that it always returns to the main event loop, even though that is often not needed. My guess is that in the end it's a pretty close approximation performance wise. I've attached the testcase I used in case anyone want to play around with it. It contains a fair amount of mozilla specific features (generators are awesome for asynchronous callbacks) as well as is written to the IndexedDB API that we currently have implemented, but it should be portable to other browsers. 
The currently proposed solution, of always running requests in the order they are made, including requests coming from cursor.continue(), gives the following results:
- Plain iteration over 1 entries using a cursor: 2400ms
- Iteration over 1 entries using a cursor, performing a join by calling getAll on an index for each iteration: 5400ms
The proposed solution of prioritizing cursor.continue() callbacks over other callbacks gives:
- Plain iteration over 1 entries using a cursor: 1050ms
- Iteration over 1 entries using a cursor, performing a join by calling getAll on an index for each iteration: 1280ms
The reason that just plain iteration got faster is that we implemented the strict ordering by sending all requests to the thread the database runs on, and then having the database thread process all requests in order and send them back to the requesting thread. So for plain iteration it basically just means a roundtrip to the indexedDB thread and back. Based on these numbers, I think we should prioritize IDBCursor.continue() callbacks, as for the join example this results in an over 4x speedup. I would like to note that this speedup is on one particular implementation which isn't particularly optimized. Nevertheless, that is a pretty substantial difference in run times. And yet it just pains me to think of special-casing the order of execution for just cursors, especially when we're still trying to nail down the very basics of the async API. I would prefer to open a bug and leave this on the back burner for a while (like other features such as nested transactions). When we do look at this, we may want to
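For reference, the "join" being measured above has roughly this shape: walk one object store with a cursor and, for each entry, issue a getAll against an index on another store. Store, index, and property names below are invented, and getAll/getAllObjects were experimental Mozilla extensions at the time rather than spec'd API, so this is a sketch of the access pattern only.

    // Hypothetical join: for every order, fetch its line items via an index.
    var trans = db.transaction(["orders", "lineItems"]);
    var results = [];
    trans.objectStore("orders").openCursor().onsuccess = function(e) {
      var cursor = e.result;
      if (!cursor) { return; }  // iteration finished
      // Experimental getAll-style lookup on an index (assumed to be keyed on orderId).
      trans.objectStore("lineItems").index("orderId").getAll(cursor.value.id)
        .onsuccess = function(ev) {
          results.push({ order: cursor.value, items: ev.result });
        };
      // Whether this continue()'s callback may fire before the getAll callback
      // above is exactly the ordering question being benchmarked.
      cursor.continue();
    };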
Re: [IndexedDB] Cursors and modifications
On Thu, Jul 8, 2010 at 8:42 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Jul 5, 2010 at 9:45 AM, Andrei Popescu andr...@google.com wrote: On Sat, Jul 3, 2010 at 2:09 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 2, 2010 at 5:44 PM, Andrei Popescu andr...@google.com wrote: On Sat, Jul 3, 2010 at 1:14 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 2, 2010 at 4:40 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto: public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Friday, July 02, 2010 4:00 PM We ran into an complicated issue while implementing IndexedDB. In short, what should happen if an object store is modified while a cursor is iterating it? Note that the modification can be done within the same transaction, so the read/write locks preventing several transactions from accessing the same table isn't helping here. Detailed problem description (this assumes the API proposed by mozilla): Consider a objectStore words containing the following objects: { name: alpha } { name: bravo } { name: charlie } { name: delta } and the following program (db is a previously opened IDBDatabase): var trans = db.transaction([words], READ_WRITE); var cursor; var result = []; trans.objectStore(words).openCursor().onsuccess = function(e) { cursor = e.result; result.push(cursor.value); cursor.continue(); } trans.objectStore(words).get(delta).onsuccess = function(e) { trans.objectStore(words).put({ name: delta, myModifiedValue: 17 }); } When the cursor reads the delta entry, will it see the 'myModifiedValue' property? Since we so far has defined that the callback order is defined to be the request order, that means that put request will be finished before the delta entry is iterated by the cursor. The problem is even more serious with cursors that iterate indexes. Here a modification can even affect the position of the currently iterated object in the index, and the modification can (if i'm reading the spec correctly) come from the cursor itself. Consider the following objectStore people with keyPath name containing the following objects: { name: Adam, count: 30 } { name: Bertil, count: 31 } { name: Cesar, count: 32 } { name: David, count: 33 } { name: Erik, count: 35 } and an index countIndex with keyPath count. What would the following code do? results = []; db.objectStore(people, READ_WRITE).index(countIndex).openObjectCursor().onsuccess = function (e) { cursor = e.result; if (!cursor) { alert(results); return; } if (cursor.value.name == Bertil) { cursor.update({name: Bertil, count: 34 }); } results.push(cursor.value.name); cursor.continue(); }; What does this alert? Would it alert Adam,Bertil,Erik as the cursor would stay on the Bertil object as it is moved in the index? Or would it alert Adam,Bertil,Cesar,David,Bertil,Erik as we would iterate Bertil again at its new position in the index? My first reaction is that both from the expected behavior of perspective (transaction is the scope of isolation) and from the implementation perspective it would be better to see live changes if they happened in the same transaction as the cursor (over a store or index). So in your example you would iterate one of the rows twice. 
Keeping order and membership stable would mean creating another scope of isolation within the transaction, which to me would be unusual, and it would probably be quite painful to implement without spilling a copy of the records to disk (at least a copy of the keys/order if you don't care about protecting from changes that don't affect membership/order; some databases call these keyset cursors). We could say that cursors always iterate snapshots, however this introduces MVCC. Though it seems to me that SNAPSHOT_READ already does that. Actually, even with MVCC you'd see your own changes, because they happen in the same transaction so the buffer pool will use the same version of the page. While it may be possible to reuse the MVCC infrastructure, it would still require the introduction of a second scope for stability. It's quite implementable using append-only b-trees. Though it might be too much to ask that implementations are forced to use that. An alternative to what I suggested earlier is that all read operations use read committed, i.e. they always see the data as it looked at the beginning of the transaction. Would this be more compatible with existing MVCC implementations? Hmm, so if you modified the object store and then, later in the same transaction, used a cursor to iterate the object store, the cursor would not see the earlier modifications? That's not very intuitive to me... or did I misunderstand? If we go with read committed then yes, your understanding is correct. Out of curiosity, how
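A small sketch of the question in that last exchange, using the same draft API as the "words" example earlier in the thread; whether cursor.value reflects the earlier put() is precisely what is being debated, so the comments mark both possible outcomes rather than asserting one.

    var trans = db.transaction(["words"], READ_WRITE);
    // Modify first...
    trans.objectStore("words").put({ name: "delta", myModifiedValue: 17 });
    // ...then iterate the same store later in the same transaction.
    trans.objectStore("words").openCursor().onsuccess = function(e) {
      var cursor = e.result;
      if (!cursor) { return; }
      // Under "you see your own changes", the "delta" entry carries myModifiedValue.
      // Under read committed (data as of the start of the transaction), it does not.
      console.log(cursor.value.name, cursor.value.myModifiedValue);
      cursor.continue();
    };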
Re: [IndexedDB] Current editor's draft
Hi, I would like to propose that we update the current spec to reflect all the changes we have agreement on. We can then iteratively review and make edits as soon as the remaining issues are solved. Concretely, I would like to check in a fix for http://www.w3.org/Bugs/Public/show_bug.cgi?id=9975 with the following exceptions which, based on the feedback in this thread, require more discussion:
- leave in support for dynamic transactions but add a separate API for it, as suggested by Jonas earlier in this thread
- leave in the explicit transaction commit
- leave in nested transactions
The changes in 9975 have been debated for more than two months now, so I feel it's about time to update the specification so that it's in line with what we're actually discussing. Thanks, Andrei
On Wed, Jul 14, 2010 at 8:10 AM, Jeremy Orlow jor...@chromium.org wrote: [...]
Re: [IndexedDB] Current editor's draft
On Wed, Jul 14, 2010 at 1:20 PM, Andrei Popescu andr...@google.com wrote: [...] The changes in 9975 have been debated for more than two months now, so I feel it's about time to update the specification so that it's in line with what we're actually discussing.
Agreed. In the future I think we should never let things stay this out of sync for this long, but I understand how this was a bit of a special case because of the scope of the changes. But yeah, let's make these changes and then iterate. And hopefully we can resolve the dynamic transaction, explicit commit, and nested transaction issues in the near future.
Re: [Web Storage] A couple questions about the storage spec
I'm not sure if discussion on this normally happens on WebApps. whatwg might be the better place. On Thu, Jul 8, 2010 at 5:33 PM, David John Burrowes s...@davidjohnburrowes.com wrote: Hello all, I have a couple questions about the storage spec (I'm reading the June 15th version at http://dev.w3.org/html5/webstorage/). (1) The spec says: "The object's indices of the supported indexed properties are the numbers in the range zero to one less than the number of key/value pairs currently present in the list associated with the object. If the list is empty, then there are no supported indexed properties." As far as I can tell, this seems to say I should be able to say something like: window.localStorage[3] and get something back (not clear if the key or the value). Am I right in my interpretation of that paragraph? I saw some discussion earlier about whether something like localStorage[3] was meaningful, but I didn't find the resolution. It does seem undesirable/confusing to me. And none of the browsers I've tried this with do this. So, I'm just confused, and probably misunderstanding "indices of the supported indexed properties". Thanks for any clarification. All the browsers I know of handle localStorage[3] as localStorage.get/setItem('3', ...). My impression is that this behavior is pretty firmly rooted at this point. It seems as though the spec may need to change. (2) The spec also says: "The names of the supported named properties on a Storage (http://dev.w3.org/html5/webstorage/#storage-0) object are the keys of each key/value pair currently present in the list associated with the object." I read that (possibly/probably wrongly) as saying I should be able to say window.localStorage.setItem("foo", "bar"); myVariable = window.localStorage["foo"]; and now myVariable will have "bar". Named properties means localStorage.foo, I believe. If my reading is right (and it is the behavior I see in a couple browsers) this makes me very nervous, because I can do something like: window.localStorage.setItem("length", "a value we computed"); window.localStorage.setItem("clear", "something that is transparent"); which of course allows: window.localStorage["length"]; window.localStorage["clear"]; but in the browsers I've looked at, this (of course) also kinda messes up things like: for (index = 0; index < window.localStorage.length; index++) { // whatever } window.localStorage.clear(); since length is now not a number, and clear isn't a function. Why is this a desirable feature? This doesn't seem very desirable to me either. IIRC, I brought this up a long time ago on whatwg though. Unfortunately, I don't remember the resolution or rationale. Maybe look at the archives? (3) Real nitpicking here: The IDL for the Storage interface (http://dev.w3.org/html5/webstorage/#dom-storage-setitem) says setter creator void setItem(in DOMString key, in any data); but the text says "The setItem(key, value) method". Note the name of the second parameter is different between these. I'd agree. Thank you. Despite my nitpicking above, I really appreciate the presence of this spec! :-) david p.s. I'm still coming up to speed on these specs, so if I'm just misunderstanding something basic, direct me to TFM that I should R.
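A short sketch of the collision David describes and of the item-based calls that sidestep it. How a given 2010-era browser resolves the clash between stored keys and Storage's own members varied, so the values read in the first part are not guaranteed, but the second part behaves the same everywhere.

  // Stored keys that collide with Storage's own members.
  window.localStorage.setItem("length", "a value we computed");
  window.localStorage.setItem("clear", "something that is transparent");

  // Ambiguous: depending on how named properties are resolved, these may be
  // the stored strings or the interface's length property and clear() method.
  var a = window.localStorage["length"];
  var b = window.localStorage["clear"];

  // Unambiguous: key(), getItem() and length always go through the interface.
  for (var i = 0; i < window.localStorage.length; i++) {
    var key = window.localStorage.key(i);
    var value = window.localStorage.getItem(key);
  }
  window.localStorage.removeItem("length");
  window.localStorage.removeItem("clear");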
Re: [IndexedDB] Current editor's draft
On Wed, Jul 14, 2010 at 3:10 AM, Jeremy Orlow jor...@chromium.org wrote: For example, with dynamic transactions you can get into live-lock situations. I'm particularly opposed to dynamic transactions for just this reason. We would clearly have to throw an exception or call the error callback if we detect livelock. I doubt that most web authors would recognize the potential hazard, and even if they did I think it would be extremely difficult for a web author to test such a scenario or write code to handle it properly. The hardware running the web app and the browser's transaction scheduling algorithm would of course affect the frequency of these collisions making proper tests even more difficult. If we do leave them in, it should definitely be in its own method to make it quite clear that the semantics are more complex. I completely agree. So, as I've said, I'm very opposed to leaving dynamic transactions in the spec. However, one thing we could do if everyone really wanted this feature I guess is to set a limit of only a single dynamic transaction per database at a time. That would remove the livelock hazard but it may diminish the utility of the feature enough to be useless.
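For what it's worth, the hazard is easy to state but very hard for an author to reproduce on demand. A purely illustrative sketch, assuming a hypothetical dynamic-transaction entry point that acquires locks lazily as stores are touched; nothing like db.dynamicTransaction() exists in the draft.

  // Two dynamic transactions touching the same stores in opposite order.
  var t1 = db.dynamicTransaction(); // hypothetical API
  var t2 = db.dynamicTransaction();

  t1.objectStore("orders").put({ id: 1, state: "shipped" });    // t1 locks "orders"
  t2.objectStore("customers").put({ id: 7, state: "active" });  // t2 locks "customers"

  t1.objectStore("customers").put({ id: 7, lastOrder: 1 });     // t1 now waits on t2
  t2.objectStore("orders").put({ id: 1, customer: 7 });         // t2 now waits on t1

  // If the implementation resolves the conflict by aborting one transaction
  // and the page immediately retries, the two sides can keep killing each
  // other indefinitely: the livelock being discussed, and exactly the kind
  // of timing-dependent failure that is nearly impossible to cover in tests.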
Re: [IndexedDB] Callback order
On Wed, Jul 14, 2010 at 4:16 AM, Jeremy Orlow jor...@chromium.org wrote: On Wed, Jul 7, 2010 at 11:54 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Jun 24, 2010 at 4:40 AM, Jeremy Orlow jor...@chromium.org wrote: On Sat, Jun 19, 2010 at 9:12 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 7:46 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Jun 18, 2010 at 7:24 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 7:01 PM, Jeremy Orlow jor...@chromium.org wrote: I think determinism is most important for the reasons you cited. I think advanced, performance concerned apps could deal with either semantics you mentioned, so the key would be to pick whatever is best for the normal case. I'm leaning towards thinking firing in order is the best way to go because it's the most intuitive/easiest to understand, but I don't feel strongly about anything other than being deterministic. I definitely agree that firing in request order is the simplest, both from an implementation and usage point of view. However my concern is that we'd lose most of the performance benefits that cursors provide if we use that solution. What do you mean with apps could deal with either semantics? You mean that they could deal with the cursor case by simply being slower, or do you mean that they could work around the performance hit somehow? Hm. I was thinking they could save the value, call continue, then do work on it, but that'd of course only defer the slowdown for one iteration. So I guess they'd have to store up a bunch of data and then make calls on it. Indeed which could be bad for memory footprint. Of course, they'll run into all of these same issues with the sync API since things are of course done in order. So maybe trying to optimize this specific case for just the async API is silly? I honestly haven't looked at the sync API. But yes, I assume that it will in general have to serialize all calls into the database and thus generally not be as performant. I don't think that is a good reason to make the async API slower too though. But it's entirely possible that I'm overly concerned about cursor performance in general though. I won't argue too strongly that we need to prioritize cursor callback events until I've seen some numbers. If we want to simply define that callbacks fire in request order for now then that is fine with me. Yeah, I think we should get some hard numbers and think carefully about this before we make things even more complicated/nuanced. I ran some tests. Note that the test implementation is an approximation. It's both somewhat optimistic in that it doesn't make the extra effort to ensure that cursor callbacks always run before other callbacks. But it's also somewhat pessimistic in that it always returns to the main event loop, even though that is often not needed. My guess is that in the end it's a pretty close approximation performance wise. I've attached the testcase I used in case anyone want to play around with it. It contains a fair amount of mozilla specific features (generators are awesome for asynchronous callbacks) as well as is written to the IndexedDB API that we currently have implemented, but it should be portable to other browsers. 
The currently proposed solution, of always running requests in the order they are made, including requests coming from cursor.continue(), gives the following results: Plain iteration over 1 entries using cursor: 2400ms. Iteration over 1 entries using cursor, performing a join by calling getAll on an index for each iteration: 5400ms. For the proposed solution of prioritizing cursor.continue() callbacks over other callbacks: Plain iteration over 1 entries using cursor: 1050ms. Iteration over 1 entries using cursor, performing a join by calling getAll on an index for each iteration: 1280ms. The reason that just plain iteration got faster is that we implemented the strict ordering by sending all requests to the thread the database runs on, and then having the database thread process all requests in order and send them back to the requesting thread. So for plain iteration it basically just means a roundtrip to the indexedDB thread and back. Based on these numbers, I think we should prioritize IDBCursor.continue() callbacks, as for the join example this results in an over 4x speedup. I would like to note that this speedup is on one particular implementation which isn't particularly optimized. Nevertheless, that is a pretty substantial difference in run times. And yet it just pains me to think of special-casing the order of execution for just cursors. Especially when we're still trying to nail down the very basics of the async API. I would prefer to open a bug and leave this on the
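For reference, the "join" case being timed looks roughly like the sketch below: walk one store with a cursor and, for each record, issue a getAll() against an index on a second store. getAll() is the experimental Mozilla extension mentioned earlier in the thread; event and field names (onsuccess, e.result) follow the API as used in this thread, and the store, index, and handleJoinedRow names are placeholders.

  var trans = db.transaction(["orders", "customers"], READ_ONLY);
  trans.objectStore("orders").openCursor().onsuccess = function(e) {
    var cursor = e.result;
    if (!cursor) return; // iteration finished
    // Fetch all customer records matching this order.
    trans.objectStore("customers").index("byCustomerId")
        .getAll(cursor.value.customerId).onsuccess = function(e2) {
      handleJoinedRow(cursor.value, e2.result); // placeholder
    };
    // Whether this continue() callback may jump ahead of the getAll()
    // callback is exactly the ordering question the numbers above measure.
    cursor.continue();
  };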
Re: [IndexedDB] Cursors and modifications
On Wed, Jul 14, 2010 at 5:12 AM, Jeremy Orlow jor...@chromium.org wrote: On Thu, Jul 8, 2010 at 8:42 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Jul 5, 2010 at 9:45 AM, Andrei Popescu andr...@google.com wrote: On Sat, Jul 3, 2010 at 2:09 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 2, 2010 at 5:44 PM, Andrei Popescu andr...@google.com wrote: On Sat, Jul 3, 2010 at 1:14 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 2, 2010 at 4:40 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Friday, July 02, 2010 4:00 PM We ran into a complicated issue while implementing IndexedDB. In short, what should happen if an object store is modified while a cursor is iterating it? Note that the modification can be done within the same transaction, so the read/write locks preventing several transactions from accessing the same table aren't helping here. Detailed problem description (this assumes the API proposed by mozilla): Consider an objectStore "words" containing the following objects: { name: "alpha" } { name: "bravo" } { name: "charlie" } { name: "delta" } and the following program (db is a previously opened IDBDatabase): var trans = db.transaction(["words"], READ_WRITE); var cursor; var result = []; trans.objectStore("words").openCursor().onsuccess = function(e) { cursor = e.result; result.push(cursor.value); cursor.continue(); } trans.objectStore("words").get("delta").onsuccess = function(e) { trans.objectStore("words").put({ name: "delta", myModifiedValue: 17 }); } When the cursor reads the "delta" entry, will it see the 'myModifiedValue' property? Since we have so far defined that the callback order is defined to be the request order, that means that the put request will be finished before the "delta" entry is iterated by the cursor. The problem is even more serious with cursors that iterate indexes. Here a modification can even affect the position of the currently iterated object in the index, and the modification can (if I'm reading the spec correctly) come from the cursor itself. Consider the following objectStore "people" with keyPath "name" containing the following objects: { name: "Adam", count: 30 } { name: "Bertil", count: 31 } { name: "Cesar", count: 32 } { name: "David", count: 33 } { name: "Erik", count: 35 } and an index "countIndex" with keyPath "count". What would the following code do? results = []; db.objectStore("people", READ_WRITE).index("countIndex").openObjectCursor().onsuccess = function (e) { cursor = e.result; if (!cursor) { alert(results); return; } if (cursor.value.name == "Bertil") { cursor.update({ name: "Bertil", count: 34 }); } results.push(cursor.value.name); cursor.continue(); }; What does this alert? Would it alert "Adam,Bertil,Erik" as the cursor would stay on the Bertil object as it is moved in the index? Or would it alert "Adam,Bertil,Cesar,David,Bertil,Erik" as we would iterate Bertil again at its new position in the index? My first reaction is that both from the expected-behavior perspective (transaction is the scope of isolation) and from the implementation perspective it would be better to see live changes if they happened in the same transaction as the cursor (over a store or index). So in your example you would iterate one of the rows twice.
Maintaining order and membership stable would mean creating another scope of isolation within the transaction, which to me would be unusual and it would be probably quite painful to implement without spilling a copy of the records to disk (at least a copy of the keys/order if you don't care about protecting from changes that don't affect membership/order; some databases call these keyset cursors). We could say that cursors always iterate snapshots, however this introduces MVCC. Though it seems to me that SNAPSHOT_READ already does that. Actually, even with MVCC you'd see your own changes, because they happen in the same transaction so the buffer pool will use the same version of the page. While it may be possible to reuse the MVCC infrastructure, it would still require the introduction of a second scope for stability. It's quite implementable using append-only b-trees. Though it might be much to ask that implementations are forced to use that. An alternative to what I suggested earlier is that all read operations use read committed. I.e. they always see the data as it looked at the beginning of the transaction. Would this be more compatible with existing MVCC implementations? Hmm, so if you modified the object store and then, later in the same transaction, used a cursor to iterate the object store, the cursor would not see the earlier
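A tiny sketch of what the alternative Jonas floats at the end would mean in practice, using the API shape as written elsewhere in this thread; the commented result is what that rule would imply, not what any implementation did at the time.

  var trans = db.transaction(["words"], READ_WRITE);
  trans.objectStore("words").put({ name: "delta", myModifiedValue: 17 });
  trans.objectStore("words").get("delta").onsuccess = function(e) {
    // Under a "reads see the data as of the start of the transaction" rule,
    // e.result is still the original { name: "delta" } record, without
    // myModifiedValue, even though the put() above was issued first.
  };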
Re: [IndexedDB] Callback order
On Wed, Jul 14, 2010 at 5:15 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 14, 2010 at 4:16 AM, Jeremy Orlow jor...@chromium.org wrote: On Wed, Jul 7, 2010 at 11:54 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Jun 24, 2010 at 4:40 AM, Jeremy Orlow jor...@chromium.org wrote: On Sat, Jun 19, 2010 at 9:12 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 7:46 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Jun 18, 2010 at 7:24 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 18, 2010 at 7:01 PM, Jeremy Orlow jor...@chromium.org wrote: I think determinism is most important for the reasons you cited. I think advanced, performance concerned apps could deal with either semantics you mentioned, so the key would be to pick whatever is best for the normal case. I'm leaning towards thinking firing in order is the best way to go because it's the most intuitive/easiest to understand, but I don't feel strongly about anything other than being deterministic. I definitely agree that firing in request order is the simplest, both from an implementation and usage point of view. However my concern is that we'd lose most of the performance benefits that cursors provide if we use that solution. What do you mean with apps could deal with either semantics? You mean that they could deal with the cursor case by simply being slower, or do you mean that they could work around the performance hit somehow? Hm. I was thinking they could save the value, call continue, then do work on it, but that'd of course only defer the slowdown for one iteration. So I guess they'd have to store up a bunch of data and then make calls on it. Indeed which could be bad for memory footprint. Of course, they'll run into all of these same issues with the sync API since things are of course done in order. So maybe trying to optimize this specific case for just the async API is silly? I honestly haven't looked at the sync API. But yes, I assume that it will in general have to serialize all calls into the database and thus generally not be as performant. I don't think that is a good reason to make the async API slower too though. But it's entirely possible that I'm overly concerned about cursor performance in general though. I won't argue too strongly that we need to prioritize cursor callback events until I've seen some numbers. If we want to simply define that callbacks fire in request order for now then that is fine with me. Yeah, I think we should get some hard numbers and think carefully about this before we make things even more complicated/nuanced. I ran some tests. Note that the test implementation is an approximation. It's both somewhat optimistic in that it doesn't make the extra effort to ensure that cursor callbacks always run before other callbacks. But it's also somewhat pessimistic in that it always returns to the main event loop, even though that is often not needed. My guess is that in the end it's a pretty close approximation performance wise. I've attached the testcase I used in case anyone want to play around with it. It contains a fair amount of mozilla specific features (generators are awesome for asynchronous callbacks) as well as is written to the IndexedDB API that we currently have implemented, but it should be portable to other browsers. 
The currently proposed solution, of always running requests in the order they are made, including requests coming from cursor.continue(), gives the following results: Plain iteration over 1 entries using cursor: 2400ms. Iteration over 1 entries using cursor, performing a join by calling getAll on an index for each iteration: 5400ms. For the proposed solution of prioritizing cursor.continue() callbacks over other callbacks: Plain iteration over 1 entries using cursor: 1050ms. Iteration over 1 entries using cursor, performing a join by calling getAll on an index for each iteration: 1280ms. The reason that just plain iteration got faster is that we implemented the strict ordering by sending all requests to the thread the database runs on, and then having the database thread process all requests in order and send them back to the requesting thread. So for plain iteration it basically just means a roundtrip to the indexedDB thread and back. Based on these numbers, I think we should prioritize IDBCursor.continue() callbacks, as for the join example this results in an over 4x speedup. I would like to note that this speedup is on one particular implementation which isn't particularly optimized. Nevertheless, that is a pretty substantial difference in run times. And yet it just pains me to
Re: [IndexedDB] Current editor's draft
On Wed, Jul 14, 2010 at 5:21 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 14, 2010 at 5:20 AM, Andrei Popescu andr...@google.com wrote: Hi, I would like to propose that we update the current spec to reflect all the changes we have agreement on. We can then iteratively review and make edits as soon as the remaining issues are solved. Concretely, I would like to check in a fix for http://www.w3.org/Bugs/Public/show_bug.cgi?id=9975 with the following exceptions which, based on the feedback in this thread, require more discussion: - leave in support for dynamic transactions but add a separate API for it, as suggested by Jonas earlier in this thread. - leave in the explicit transaction commit - leave in nested transactions The changes in 9975 have been debated for more than two months now, so I feel it's about time to update the specification so that it's in line with what we're actually discussing. When you say "leave in the explicit transaction commit", do you mean in addition to the implicit commit once there are no more requests on a transaction, or instead of it? In addition. In the current editor's draft we have both: Implicit commit is described at: http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#dfn-transaction Explicit commit is defined at http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#widl-IDBTransaction-commit I was saying I would not remove the explicit one pending further discussion. Thanks, Andrei
Re: [IndexedDB] Current editor's draft
On Wed, Jul 14, 2010 at 9:28 AM, Andrei Popescu andr...@google.com wrote: On Wed, Jul 14, 2010 at 5:21 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 14, 2010 at 5:20 AM, Andrei Popescu andr...@google.com wrote: Hi, I would like to propose that we update the current spec to reflect all the changes we have agreement on. We can then iteratively review and make edits as soon as the remaining issues are solved. Concretely, I would like to check in a fix for http://www.w3.org/Bugs/Public/show_bug.cgi?id=9975 with the following exceptions which, based on the feedback in this thread, require more discussion: - leave in support for dynamic transactions but add a separate API for it, as suggested by Jonas earlier in this thread. - leave in the explicit transaction commit - leave in nested transactions The changes in 9975 have been debated for more than two months now, so I feel it's about time to update the specification so that it's in line with what we're actually discussing. When you say "leave in the explicit transaction commit", do you mean in addition to the implicit commit once there are no more requests on a transaction, or instead of it? In addition. In the current editor's draft we have both: Implicit commit is described at: http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#dfn-transaction Explicit commit is defined at http://dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#widl-IDBTransaction-commit I was saying I would not remove the explicit one pending further discussion. Makes sense, thanks for clarifying. / Jonas
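For readers skimming the thread, the two commit styles under discussion look roughly like this. The method and event shapes follow the editor's draft links above but were still in flux at this point, and the store and record names are placeholders.

  // Implicit commit: the transaction commits on its own once the last
  // request issued against it has fired and no new requests were queued.
  var t1 = db.transaction(["items"], READ_WRITE);
  t1.objectStore("items").put({ id: 1, name: "implicit" });
  // nothing more to do; t1 commits after the put() callback returns

  // Explicit commit: the page states that it is done, via the
  // IDBTransaction.commit() entry point Andrei refers to.
  var t2 = db.transaction(["items"], READ_WRITE);
  t2.objectStore("items").put({ id: 2, name: "explicit" });
  t2.commit();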
[IndexedDB]: typo in section 3.1.4
Just a minor nit: in the 2nd sentence of 3.1.4, the spec uses MAY in red where I believe you mean just an ordinary non-normative may. David Flanagan
Re: [cors] Unrestricted access
On Tue, Jul 13, 2010 at 8:12 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 3:47 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 13 Jul 2010 12:35:02 +0200, Jaka Jančar j...@kubje.org wrote: What I'd like is a global (per-host) way to disable these limitations all at once, giving XHR unrestricted access to the host, just like native apps have it. It used to be a mostly global per-resource switch, but the security folks at Mozilla thought that was too dangerous and we decided to go with the granular approach they proposed. This happened during a meeting in the summer of 2008 at Microsoft. I do not believe anything has changed meanwhile so this will probably not happen. This does not match my recollection of our requirements. The most important requirements that we had was that it was possible to opt in on a very granular basis, and that it was possible to opt in without getting cookies. Also note that the latter wasn't possible before we requested it and so this users requirements would not have been fulfilled if it wasn't for the changes we requested. Anyhow if we want to reopen discussions about syntax for the various headers that cors uses, for example to allow '*' as value, then I'm ok with that. Though personally I'd prefer to just ship this thing as it's a long time coming. Unless IE is soon to indicate support for all of the extra CORS headers, pre-flight requests and configuration caching, the decision should be to drop these unsupported features from the specification and come up with a solution that can achieve consensus among widely deployed browsers. I thought that was the declared policy for HTML5. As you know, I also think that is the right decision for many technical and security reasons. Jaka's request is reasonable and what the WG is offering in response is unreasonable. I expect many other web application developers will have needs similar to Jaka's. Meeting those needs with a simple solution is technically feasible. The politics seem to be much more difficult. --Tyler -- Waterken News: Capability security on the Web http://waterken.sourceforge.net/recent.html
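To make the contrast concrete: under the granular model, the page just issues the cross-origin request and each resource opts in through response headers, roughly as sketched below. The URL, header values, and handler body are placeholders; the header names are the ones defined by CORS.

  var xhr = new XMLHttpRequest();
  xhr.open("PUT", "https://api.example.org/items/42", true);
  xhr.setRequestHeader("Content-Type", "application/json");
  // PUT is not a "simple" method, so the browser first sends a preflight
  // OPTIONS request; the resource must answer with something like
  //   Access-Control-Allow-Origin: *
  //   Access-Control-Allow-Methods: PUT
  //   Access-Control-Allow-Headers: Content-Type
  //   Access-Control-Max-Age: 86400
  // and repeat Access-Control-Allow-Origin on the PUT response itself.
  // A per-host switch, as Jaka asks for, would replace all of that with
  // a single opt-in.
  xhr.onreadystatechange = function() {
    if (xhr.readyState === 4) {
      // handle xhr.status / xhr.responseText here
    }
  };
  xhr.send(JSON.stringify({ name: "example" }));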
Re: [cors] Unrestricted access
Tyler Close wrote: On Tue, Jul 13, 2010 at 8:12 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 3:47 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 13 Jul 2010 12:35:02 +0200, Jaka Jančar j...@kubje.org wrote: What I'd like is a global (per-host) way to disable these limitations all at once, giving XHR unrestricted access to the host, just like native apps have it. It used to be a mostly global per-resource switch, but the security folks at Mozilla thought that was too dangerous and we decided to go with the granular approach they proposed. This happened during a meeting in the summer of 2008 at Microsoft. I do not believe anything has changed meanwhile so this will probably not happen. This does not match my recollection of our requirements. The most important requirements that we had was that it was possible to opt in on a very granular basis, and that it was possible to opt in without getting cookies. Also note that the latter wasn't possible before we requested it and so this users requirements would not have been fulfilled if it wasn't for the changes we requested. Anyhow if we want to reopen discussions about syntax for the various headers that cors uses, for example to allow '*' as value, then I'm ok with that. Though personally I'd prefer to just ship this thing as it's a long time coming. Unless IE is soon to indicate support for all of the extra CORS headers, pre-flight requests and configuration caching, the decision should be to drop these unsupported features from the specification and come up with a solution that can achieve consensus among widely deployed browsers. I thought that was the declared policy for HTML5. As you know, I also think that is the right decision for many technical and security reasons. Jaka's request is reasonable and what the WG is offering in response is unreasonable. I expect many other web application developers will have needs similar to Jaka's. Meeting those needs with a simple solution is technically feasible. The politics seem to be much more difficult. well said Tyler, a big fat +1 from me and every other developer I know. Best, Nathan
Re: [cors] Unrestricted access
On Wed, Jul 14, 2010 at 10:39 AM, Tyler Close tyler.cl...@gmail.com wrote: On Tue, Jul 13, 2010 at 8:12 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 3:47 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 13 Jul 2010 12:35:02 +0200, Jaka Jančar j...@kubje.org wrote: What I'd like is a global (per-host) way to disable these limitations all at once, giving XHR unrestricted access to the host, just like native apps have it. It used to be a mostly global per-resource switch, but the security folks at Mozilla thought that was too dangerous and we decided to go with the granular approach they proposed. This happened during a meeting in the summer of 2008 at Microsoft. I do not believe anything has changed meanwhile so this will probably not happen. This does not match my recollection of our requirements. The most important requirements that we had was that it was possible to opt in on a very granular basis, and that it was possible to opt in without getting cookies. Also note that the latter wasn't possible before we requested it and so this users requirements would not have been fulfilled if it wasn't for the changes we requested. Anyhow if we want to reopen discussions about syntax for the various headers that cors uses, for example to allow '*' as value, then I'm ok with that. Though personally I'd prefer to just ship this thing as it's a long time coming. Unless IE is soon to indicate support for all of the extra CORS headers, pre-flight requests and configuration caching, the decision should be to drop these unsupported features from the specification and come up with a solution that can achieve consensus among widely deployed browsers. I thought that was the declared policy for HTML5. As you know, I also think that is the right decision for many technical and security reasons. Jaka's request is reasonable and what the WG is offering in response is unreasonable. I expect many other web application developers will have needs similar to Jaka's. Meeting those needs with a simple solution is technically feasible. The politics seem to be much more difficult. As far as I understand, UMP requires the exact same sever script, no? / Jonas
Re: CfC: to publish new WD of CORS; deadline July 20
That is correct (both that I volunteered and that I have not had time). I find myself home-bound for a couple days so I should be able to get something out to Anne for feedback by the end of the week. Apologies to all for the delay, -- Dirk On Wed, Jul 14, 2010 at 3:48 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 13 Jul 2010 17:50:26 +0200, Mark S. Miller erig...@google.com wrote: Has anyone been working towards a revised Security Considerations section? Your Google colleague Dirk has volunteered but I believe has not yet had the time unfortunately. -- Anne van Kesteren http://annevankesteren.nl/
RE: [IndexedDB] Current editor's draft
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, July 14, 2010 12:07 AM Dynamic transactions: I see that most folks would like to see these going away. While I like the predictability and simplifications that we're able to make by using static scopes for transactions, I worry that we'll close the door for two scenarios: background tasks and query processors. Background tasks such as synchronization and post-processing of content would seem to be almost impossible with the static scope approach, mostly due to the granularity of the scope specification (whole stores). Are we okay with saying that you can't for example sync something in the background (e.g. in a worker) while your app is still working? Am I missing something that would enable this class of scenarios? Query processors are also tricky because you usually take the query specification in some form after the transaction started (especially if you want to execute multiple queries with later queries depending on the outcome of the previous ones). The background tasks issue in particular looks pretty painful to me if we don't have a way to achieve it without freezing the application while it happens. I don't understand enough of the details here to be able to make a decision. The use cases you are bringing up I definitely agree are important, but I would love to look at even a rough draft of what code you are expecting people will need to write. I'll try and hack up an example. In general any scenario that has a worker and the UI thread working on the same database will be quite a challenge, because the worker will have to a) split the work in small pieces, even if it was naturally a bigger chunk, and b) consider interleaving implications with the UI thread, otherwise even when split in chunks you're not guaranteed that one of the two won't starve the other (the worker running in a tight loop will effectively always have an active transaction; it'll just be changing the actual transaction from time to time). This can certainly happen with dynamic transactions as well; the only difference is that since the locking granularity is different, it may be that what you're working on in the worker and in the UI thread is independent enough that they don't interfere too much, allowing for some more concurrency. What I suggest is that we keep dynamic transactions in the spec for now, but separate the API from static transactions, start a separate thread and try to hammer out the details and see what we arrive at. I do want to clarify that I don't think dynamic transactions are particularly hard to implement; I just suspect they are hard to use correctly. Sounds reasonable. Implicit commit: Does this really work? I need to play with sample app code more; it may just be that I'm old-fashioned. For example, if I'm downloading a bunch of data from somewhere and pushing rows into the store within a transaction, wouldn't it be reasonable to do the whole thing in a transaction? In that case I'm likely to have to unwind while I wait for the next callback from XmlHttpRequest with the next chunk of data. You definitely want to do it in a transaction. In our proposal there is no way to even call .get or .put if you aren't inside a transaction. For the case you are describing, you'd download the data using XMLHttpRequest first. Once the data has been downloaded you start a transaction, parse the data, and make the desired modifications. Once that is done the transaction is automatically committed.
The idea here is to avoid keeping transactions open for long periods of time, while at the same time making the API easier to work with. I'm very concerned that any API that requires people to do: startOperation(); ... do lots of stuff here ... endOperation(); people will forget to do the endOperation call. This is especially true if the startOperation/endOperation calls are spread out over multiple different asynchronously called functions, which seems to be the use case you're concerned about above. One very easy way to forget to call endOperation is if something inbetween the two function calls throw an exception. Fair enough, maybe I need to think of this scenario differently, and if someone needs to download a bunch of data and then put it in the database atomically the right way is to download to work tables first over a long time and independent transactions, and then use a transaction only to move the data around into its final spot. This will likely be extra bad for transactions where no write operations are done. In this case failure to call a 'commit()' function won't result in any broken behavior. The transaction will just sit open for a long time and eventually rolled back, though since no changes were done, the rollback is transparent, and the only noticeable effect is that the application halts for a while while the
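A minimal sketch of the shape Jonas is recommending: do the network work with no transaction open, then write everything inside one short-lived transaction and let it commit implicitly. The URL, store name, and record layout are placeholders; the API shape is the one used in this thread.

  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/sync/words.json", true);
  xhr.onreadystatechange = function() {
    if (xhr.readyState !== 4) return;
    var rows = JSON.parse(xhr.responseText); // parse before touching the db
    var trans = db.transaction(["words"], READ_WRITE);
    var store = trans.objectStore("words");
    for (var i = 0; i < rows.length; i++) {
      store.put(rows[i]);
    }
    // No explicit commit: once the last put() has fired its callback and no
    // further requests are queued, the transaction commits on its own.
  };
  xhr.send();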
RE: [IndexedDB] Current editor's draft
From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Wednesday, July 14, 2010 12:10 AM On Wed, Jul 14, 2010 at 3:52 AM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Andrei Popescu Sent: Monday, July 12, 2010 5:23 AM Dynamic transactions: I see that most folks would like to see these going away. While I like the predictability and simplifications that we're able to make by using static scopes for transactions, I worry that we'll close the door for two scenarios: background tasks and query processors. Background tasks such as synchronization and post-processing of content would seem to be almost impossible with the static scope approach, mostly due to the granularity of the scope specification (whole stores). Are we okay with saying that you can't for example sync something in the background (e.g. in a worker) while your app is still working? Am I missing something that would enable this class of scenarios? Query processors are also tricky because you usually take the query specification in some form after the transaction started (especially if you want to execute multiple queries with later queries depending on the outcome of the previous ones). The background tasks issue in particular looks pretty painful to me if we don't have a way to achieve it without freezing the application while it happens. Well, the application should never freeze in terms of the UI locking up, but in what you described I could see it taking a while for data to show up on the screen. This is something that can be fixed by doing smaller updates on the background thread, sending a message to the background thread that it should abort for now, doing all database access on the background thread, etc. This is an issue regardless, isn't it? Let's say you have a worker churning on the database somehow. The worker has no UI or user to wait for, so it'll run in a tight loop at full speed. If it splits the work in small transactions, in cases where it doesn't have to wait for something external there will still be a small gap between transactions. That could easily starve the UI thread that needs to find an opportunity to get in and do a quick thing against the database. As you say the difference between freezing and locking up at this point is not that critical, as the end user in the end is just waiting. One point that I never saw made in the thread that I think is really important is that dynamic transactions can make concurrency worse in some cases. For example, with dynamic transactions you can get into live-lock situations. Also, using Pablo's example, you could easily get into a situation where the long running transaction on the worker keeps hitting serialization issues and thus it's never able to make progress. While it could certainly happen, I don't remember seeing something like a live-lock in a long, long time. Deadlocks are common, but a simple timeout will kill one of the transactions and let the other make progress. A bit violent, but always effective. I do see that there are use cases where having dynamic transactions would be much nicer, but the amount of non-determinism they add (including to performance) has me pretty worried. I pretty firmly believe we should look into adding them in v2 and remove them for now. If we do leave them in, it should definitely be in its own method to make it quite clear that the semantics are more complex. Let's explore a bit more and see where we land. 
I'm not pushing for dynamic transactions themselves, but more for the scenarios they enable (background processing and such). If we find other ways of doing that, then all the better. Having different entry points is reasonable. Nested transactions: Not sure why we're considering this an advanced scenario. To be clear about what the feature means to me: make it legal to start a transaction when one is already in progress, and the nested one is effectively a no-op, just refcounts the transaction, so you need equal amounts of commit()'s, implicit or explicit, and an abort() cancels all nested transactions. The purpose of this is to allow composition, where a piece of code that needs a transaction can start one locally, independently of whether the caller had already one going. I believe it's actually a bit more tricky than what you said. For example, if we only support static transactions, will we require that any nested transaction only request a subset of the locks the outer one took? What if we try to start a dynamic transaction inside of a static one? Etc. But I agree it's not _that_ tricky and I'm also not convinced it's an advanced feature. I'd suggest we take it out for now and look at re-adding it when the basics of the async API are more solidified. I hope we can
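As a sketch only of the composition Pablo has in mind, none of which is in the draft: a helper opens a transaction locally; if the caller already has one open, the nested begin is just a refcount bump, commits are counted, and an abort unwinds everything.

  // Hypothetical semantics: db.transaction() inside an open transaction
  // returns the same transaction with its refcount incremented.
  function addWord(db, word) {
    var trans = db.transaction(["words"], READ_WRITE); // joins the outer one if present
    trans.objectStore("words").put(word);
    trans.commit(); // decrements the refcount; only the outermost commit is real
  }

  function importWords(db) {
    var outer = db.transaction(["words"], READ_WRITE);
    addWord(db, { name: "echo" });    // composes with the outer transaction
    addWord(db, { name: "foxtrot" }); // likewise
    outer.commit();                   // only now do both helpers' writes commit
    // calling outer.abort() here instead would roll back both helpers' writes
  }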
Re: [cors] Unrestricted access
On Wed, Jul 14, 2010 at 12:02 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 14, 2010 at 10:39 AM, Tyler Close tyler.cl...@gmail.com wrote: On Tue, Jul 13, 2010 at 8:12 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 3:47 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 13 Jul 2010 12:35:02 +0200, Jaka Jančar j...@kubje.org wrote: What I'd like is a global (per-host) way to disable these limitations all at once, giving XHR unrestricted access to the host, just like native apps have it. It used to be a mostly global per-resource switch, but the security folks at Mozilla thought that was too dangerous and we decided to go with the granular approach they proposed. This happened during a meeting in the summer of 2008 at Microsoft. I do not believe anything has changed meanwhile so this will probably not happen. This does not match my recollection of our requirements. The most important requirements that we had was that it was possible to opt in on a very granular basis, and that it was possible to opt in without getting cookies. Also note that the latter wasn't possible before we requested it and so this users requirements would not have been fulfilled if it wasn't for the changes we requested. Anyhow if we want to reopen discussions about syntax for the various headers that cors uses, for example to allow '*' as value, then I'm ok with that. Though personally I'd prefer to just ship this thing as it's a long time coming. Unless IE is soon to indicate support for all of the extra CORS headers, pre-flight requests and configuration caching, the decision should be to drop these unsupported features from the specification and come up with a solution that can achieve consensus among widely deployed browsers. I thought that was the declared policy for HTML5. As you know, I also think that is the right decision for many technical and security reasons. Jaka's request is reasonable and what the WG is offering in response is unreasonable. I expect many other web application developers will have needs similar to Jaka's. Meeting those needs with a simple solution is technically feasible. The politics seem to be much more difficult. As far as I understand, UMP requires the exact same sever script, no? UMP Level One doesn't use pre-flight requests so doesn't have this complexity, but also doesn't enable arbitrary HTTP methods and headers. Instead, the plan was to have UMP Level Two introduce a well-known URL per host that could be consulted to turn on this functionality for all resources. Level One and Level Two are split since Level One is meant to cover only things that are currently deployed. --Tyler -- Waterken News: Capability security on the Web http://waterken.sourceforge.net/recent.html
RE: [IndexedDB] IDBRequest.abort on writing requests
From my perspective cancelling is not something that happens that often, and when it happens it's probably ok to cancel the whole transaction. If we can spec abort() in the transaction object such that it try to cancel all pending operations and then rollback any work that has been done so far, then we probably don't need abort on individual operations (with the added value that it's uniform across read and write operations). -pablo From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Wednesday, July 14, 2010 1:57 AM On Wed, Jul 14, 2010 at 9:14 AM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 14, 2010 at 1:02 AM, Jeremy Orlow jor...@chromium.org wrote: On Wed, Jul 14, 2010 at 8:53 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 11:33 PM, Jeremy Orlow jor...@chromium.org wrote: On Wed, Jul 14, 2010 at 7:28 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 11:12 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Jul 13, 2010 at 9:41 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 1:17 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Jul 13, 2010 at 8:25 PM, Jonas Sicking jo...@sicking.cc wrote: Hi All, Sorry if this is something that I've brought up before. I know I meant to bring this up in the past, but I couldn't find any actual emails. One thing that we discussed while implementing IndexedDB was what to do for IDBRequest.abort() or writing requests. For example on the request object returned from IDBObjectStore.remove() or IDBCursor.update(). Ideal would of course be if it would cancel the write operation, however this isn't always possible. If the call to .abort() comes after the write operation has already executed in the database, but before the 'success' event has had a chance to fire. What's worse is that other write operations might already have been performed on top of the aborted request. Consider for example the following code: req1 = myObjectStore.remove(12); req2 = myObjectStore.add({ id: 12, name: Benny Andersson }); do other stuff req1.abort(); In this case, even if the database supported aborting a specific operation, it's very hard to say what the correct thing to do with operations performed after it. As far as I know, databases generally don't support rolling back a given operation, only rolling back to a specific point, i.e. rolling back a given operation and all operations performed after it. We could say that abort() signals some sort of error if the operation has already been performed in the database, however that makes abort() very racy. Instead we concluded that the best thing to do was to specify that IDBRequest.abort() should throw if called on a modifying request. If this sounds good I'll make this change to the spec. I'd be fine with that. Or we could remove abort all together. I can't really think of what types of operations you'd really want to abort until (at least) we have some sort of join language or other mechanism to do really expensive read-only calls. I think there are expensive-ish read-only calls. Indexes are effectively a join mechanism since you'll hit one b-tree to do the index lookup, and then a second b-tree to look up the full object in the objectStore. But each individual call (the scope of canceling an IDBRequest) is pretty short. I don't really feel strongly either way. I think abort() isn't too hard to implement, but also doesn't provide a ton of value. 
At least not, like you say, until we add expensive calls like getAll or multi-step joins. I agree that when we look at adding such calls we may want to add an abort on just IDBRequest, but until then I don't think it's a very useful feature. And being easy to add is not a good reason to lock ourselves into a particular design in the future. I think we should remove it until there's a good reason for it to exist. Or we could take abort off IDBRequest and instead put a rollback on transactions (and not do the modify limitation). I definitely think we should have IDBTransaction.abort() no matter what. And that should allow rolling back write operations. Agreed. In which case it seems as though being able to abort individual operations isn't that important...especially given what we just talked about above. So can we just get rid of abort() on IDBRequest? I don't feel strongly either way. We'll probably keep them in the mozilla implementation since we have experimental objectStore.getAll(key) and index.getAllObjects(key) implementations, which both probably count
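A short sketch of the transaction-level alternative Pablo describes, using the API as written in this thread: abort() on the transaction tries to cancel what is still pending and rolls back what has already run, so a per-request abort is unnecessary. The record contents are placeholders.

  var trans = db.transaction(["items"], READ_WRITE);
  var store = trans.objectStore("items");
  var req1 = store.remove(12);
  var req2 = store.add({ id: 12, name: "replacement" });
  // ...the application later decides the whole unit of work is unwanted...
  trans.abort();
  // Both the remove() and the add() are undone together, whether or not
  // their individual success events had already fired, and read and write
  // requests are treated uniformly.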
Re: [IndexedDB] Current editor's draft
On Wed, Jul 14, 2010 at 5:03 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, July 14, 2010 12:07 AM Dynamic transactions: I see that most folks would like to see these going away. While I like the predictability and simplifications that we're able to make by using static scopes for transactions, I worry that we'll close the door for two scenarios: background tasks and query processors. Background tasks such as synchronization and post-processing of content would seem to be almost impossible with the static scope approach, mostly due to the granularity of the scope specification (whole stores). Are we okay with saying that you can't for example sync something in the background (e.g. in a worker) while your app is still working? Am I missing something that would enable this class of scenarios? Query processors are also tricky because you usually take the query specification in some form after the transaction started (especially if you want to execute multiple queries with later queries depending on the outcome of the previous ones). The background tasks issue in particular looks pretty painful to me if we don't have a way to achieve it without freezing the application while it happens. I don't understand enough of the details here to be able to make a decision. The use cases you are bringing up I definitely agree are important, but I would love to look at even a rough draft of what code you are expecting people will need to write. I'll try and hack up and example. In general any scenario that has a worker and the UI thread working on the same database will be quite a challenge, because the worker will have to a) split the work in small pieces, even if it was naturally a bigger chunk and b) consider interleaving implications with the UI thread, otherwise even when split in chunks you're not guaranteed that one of the two will starve the other one (the worker running on a tight loop will effectively always have an active transaction, it'll be just changing the actual transaction from time to time). This can certainly happen with dynamic transactions as well, the only difference is that since the locking granularity is different, it may be that what you're working on in the worker and in the UI threads is independent enough that they don't interfere too much, allowing for some more concurrency. I think what I'm struggling with is how dynamic transactions will help since they are still doing whole-objectStore locking. I'm also curious how you envision people dealing with deadlock hazards. Nikunjs examples in the beginning of this thread simply throw up their hands and report an error if there was a deadlock. That is obviously not good enough for an actual application. So in short, looking forward to an example :) Implicit commit: Does this really work? I need to play with sample app code more, it may just be that I'm old-fashioned. For example, if I'm downloading a bunch of data form somewhere and pushing rows into the store within a transaction, wouldn't it be reasonable to do the whole thing in a transaction? In that case I'm likely to have to unwind while I wait for the next callback from XmlHttpRequest with the next chunk of data. You definitely want to do it in a transaction. In our proposal there is no way to even call .get or .put if you aren't inside a transaction. For the case you are describing, you'd download the data using XMLHttpRequest first. 
Once the data has been downloaded you start a transaction, parse the data, and make the desired modifications. Once that is done the transaction is automatically committed. The idea here is to avoid keeping transactions open for long periods of time, while at the same time making the API easier to work with. I'm very concerned that any API that requires people to do: startOperation(); ... do lots of stuff here ... endOperation(); people will forget to do the endOperation call. This is especially true if the startOperation/endOperation calls are spread out over multiple different asynchronously called functions, which seems to be the use case you're concerned about above. One very easy way to forget to call endOperation is if something inbetween the two function calls throw an exception. Fair enough, maybe I need to think of this scenario differently, and if someone needs to download a bunch of data and then put it in the database atomically the right way is to download to work tables first over a long time and independent transactions, and then use a transaction only to move the data around into its final spot. Yeah, I think that this is what we want to encourage. This will likely be extra bad for transactions where no write operations are done. In this case failure to call a 'commit()' function won't result in any broken behavior. The transaction will just
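A rough sketch, under the same assumptions as above, of the staging approach Pablo and Jonas converge on: accumulate downloaded chunks in a scratch store across many short transactions, then move the data into its final store in one atomic transaction. The "staging" store and both function names are placeholders.

  function stageChunk(db, chunk) {
    var trans = db.transaction(["staging"], READ_WRITE); // short-lived
    var store = trans.objectStore("staging");
    for (var i = 0; i < chunk.length; i++) {
      store.put(chunk[i]);
    }
    // commits implicitly after the last put() callback
  }

  function moveStagedData(db) {
    var trans = db.transaction(["staging", "words"], READ_WRITE);
    trans.objectStore("staging").openCursor().onsuccess = function(e) {
      var cursor = e.result;
      if (!cursor) return; // done: the whole move commits as one unit
      trans.objectStore("words").put(cursor.value);
      cursor.continue();
    };
  }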
Re: [cors] Unrestricted access
On Wed, Jul 14, 2010 at 5:25 PM, Tyler Close tyler.cl...@gmail.com wrote: On Wed, Jul 14, 2010 at 12:02 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 14, 2010 at 10:39 AM, Tyler Close tyler.cl...@gmail.com wrote: On Tue, Jul 13, 2010 at 8:12 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 3:47 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 13 Jul 2010 12:35:02 +0200, Jaka Jančar j...@kubje.org wrote: What I'd like is a global (per-host) way to disable these limitations all at once, giving XHR unrestricted access to the host, just like native apps have it. It used to be a mostly global per-resource switch, but the security folks at Mozilla thought that was too dangerous and we decided to go with the granular approach they proposed. This happened during a meeting in the summer of 2008 at Microsoft. I do not believe anything has changed meanwhile so this will probably not happen. This does not match my recollection of our requirements. The most important requirements that we had was that it was possible to opt in on a very granular basis, and that it was possible to opt in without getting cookies. Also note that the latter wasn't possible before we requested it and so this users requirements would not have been fulfilled if it wasn't for the changes we requested. Anyhow if we want to reopen discussions about syntax for the various headers that cors uses, for example to allow '*' as value, then I'm ok with that. Though personally I'd prefer to just ship this thing as it's a long time coming. Unless IE is soon to indicate support for all of the extra CORS headers, pre-flight requests and configuration caching, the decision should be to drop these unsupported features from the specification and come up with a solution that can achieve consensus among widely deployed browsers. I thought that was the declared policy for HTML5. As you know, I also think that is the right decision for many technical and security reasons. Jaka's request is reasonable and what the WG is offering in response is unreasonable. I expect many other web application developers will have needs similar to Jaka's. Meeting those needs with a simple solution is technically feasible. The politics seem to be much more difficult. As far as I understand, UMP requires the exact same sever script, no? UMP Level One doesn't use pre-flight requests so doesn't have this complexity, but also doesn't enable arbitrary HTTP methods and headers. Instead, the plan was to have UMP Level Two introduce a well-known URL per host that could be consulted to turn on this functionality for all resources. Level One and Level Two are split since Level One is meant to cover only things that are currently deployed. So has IE, or any other browser, indicated support for UMP Level Two? / Jonas
Re: [IndexedDB] IDBRequest.abort on writing requests
Ok, I'll bow to majority vote then :) / Jonas On Wed, Jul 14, 2010 at 5:32 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From my perspective cancelling is not something that happens that often, and when it happens it's probably ok to cancel the whole transaction. If we can spec abort() in the transaction object such that it try to cancel all pending operations and then rollback any work that has been done so far, then we probably don't need abort on individual operations (with the added value that it's uniform across read and write operations). -pablo From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jeremy Orlow Sent: Wednesday, July 14, 2010 1:57 AM On Wed, Jul 14, 2010 at 9:14 AM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 14, 2010 at 1:02 AM, Jeremy Orlow jor...@chromium.org wrote: On Wed, Jul 14, 2010 at 8:53 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 11:33 PM, Jeremy Orlow jor...@chromium.org wrote: On Wed, Jul 14, 2010 at 7:28 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 11:12 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Jul 13, 2010 at 9:41 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jul 13, 2010 at 1:17 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Jul 13, 2010 at 8:25 PM, Jonas Sicking jo...@sicking.cc wrote: Hi All, Sorry if this is something that I've brought up before. I know I meant to bring this up in the past, but I couldn't find any actual emails. One thing that we discussed while implementing IndexedDB was what to do for IDBRequest.abort() or writing requests. For example on the request object returned from IDBObjectStore.remove() or IDBCursor.update(). Ideal would of course be if it would cancel the write operation, however this isn't always possible. If the call to .abort() comes after the write operation has already executed in the database, but before the 'success' event has had a chance to fire. What's worse is that other write operations might already have been performed on top of the aborted request. Consider for example the following code: req1 = myObjectStore.remove(12); req2 = myObjectStore.add({ id: 12, name: Benny Andersson }); do other stuff req1.abort(); In this case, even if the database supported aborting a specific operation, it's very hard to say what the correct thing to do with operations performed after it. As far as I know, databases generally don't support rolling back a given operation, only rolling back to a specific point, i.e. rolling back a given operation and all operations performed after it. We could say that abort() signals some sort of error if the operation has already been performed in the database, however that makes abort() very racy. Instead we concluded that the best thing to do was to specify that IDBRequest.abort() should throw if called on a modifying request. If this sounds good I'll make this change to the spec. I'd be fine with that. Or we could remove abort all together. I can't really think of what types of operations you'd really want to abort until (at least) we have some sort of join language or other mechanism to do really expensive read-only calls. I think there are expensive-ish read-only calls. Indexes are effectively a join mechanism since you'll hit one b-tree to do the index lookup, and then a second b-tree to look up the full object in the objectStore. But each individual call (the scope of canceling an IDBRequest) is pretty short. I don't really feel strongly either way. 
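As a rough sketch of what Pablo's transaction-level abort would look like in use (the API names below, including trans.abort() and the READ_WRITE mode constant, follow the draft shapes used elsewhere in this thread and are assumptions rather than settled spec):

  var trans = db.transaction(["inventory"], READ_WRITE);
  var store = trans.objectStore("inventory");

  store.remove(12);
  store.add({ id: 12, name: "Benny Andersson" });

  // If the application later decides none of this should stick, it rolls
  // back everything done in the transaction instead of trying to cancel
  // one individual request.
  if (somethingWentWrong) {   // hypothetical application-level condition
    trans.abort();
  }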
RE: [IndexedDB] Current editor's draft
From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, July 14, 2010 5:43 PM On Wed, Jul 14, 2010 at 5:03 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, July 14, 2010 12:07 AM I think what I'm struggling with is how dynamic transactions will help since they are still doing whole-objectStore locking. I'm also curious how you envision people dealing with deadlock hazards. Nikunj's examples in the beginning of this thread simply throw up their hands and report an error if there was a deadlock. That is obviously not good enough for an actual application. So in short, looking forward to an example :) I'll try to come up with one, although I doubt the code itself will be very interesting in this particular case. Not sure what you mean by "they are still doing whole-objectStore locking". The point of dynamic transactions is that they *don't* lock the whole store, but instead have the freedom to choose the granularity (e.g. you could do row-level locking). As for deadlocks, whenever you're doing an operation you need to be ready to handle errors (out of disk, timeout, etc.). I'm not sure why deadlocks are different. If the underlying implementation has deadlock detection then you may get a specific error, otherwise you'll just get a timeout. This will likely be extra bad for transactions where no write operations are done. In this case failure to call a 'commit()' function won't result in any broken behavior. The transaction will just sit open for a long time and eventually be rolled back, though since no changes were done, the rollback is transparent, and the only noticeable effect is that the application stalls while the transaction waits to time out. I should add that WebSQLDatabase uses automatically committing transactions very similar to what we're proposing, and it seems to have worked fine there. I find this a bit scary, although it could be that I'm permanently tainted with traditional database stuff. Typical databases follow a presumed-abort protocol, where if your code is interrupted by an exception, a process crash or whatever, you can always assume transactions will be rolled back if you didn't reach an explicit call to commit. The implicit commit here takes that away, and I'm not sure how safe that is. For example, if I don't have proper exception handling in place, an illegal call to some other non-IndexedDB-related API may throw an exception causing the whole thing to unwind, at which point nothing will be pending to do in the database and thus the currently active transaction will be committed. Using the same line of thought we used for READ_ONLY, forgetting to call commit() is easy to detect the first time you try out your code. Your changes will simply not stick. It's not as clear as the READ_ONLY example because there is no opportunity to throw an explicit exception with an explanation, but the data not being around will certainly prompt developers to look for the issue :) Ah, I see where we are differing in thinking. My main concern has been that of rollbacks, and the associated dataloss, in the non-error case. For example, people forget to call commit() in some branch of their code, thus causing dataloss when the transaction is rolled back. Your concern seems to be that of lack of rollback in the error case, for example when an exception is thrown and not caught somewhere in the code. In this case you'd want to have the transaction rolled back.
One way to handle this is to try to detect unhandled errors and implicitly roll back the transaction. Two situations where we could do this are: 1. When an 'error' event is fired, but .preventDefault() is not called by any handler. The result is that if an error is ever fired, but no one explicitly handles it, we roll back the transaction. See also below. 2. When a success handler is called, but the handler throws an exception. The second is a bit of a problem from a spec point of view. I'm not sure it is allowed by the DOM Events spec, or by all existing DOM Events implementations. I do still think we can pull it off though. This is something I've been thinking about raising for a while, but I wanted to nail down the raised issues first. Would you feel more comfortable with implicit commits if we did the above? It does make it better, although this seems to introduce quite a few moving parts to the process. I still think an explicit commit() would be better, but I'm open to exploring more options. And as you say, you still usually need error callbacks. In fact, we have found, while writing examples using our implementation, that you almost always want to add a generic error handler. It's very easy to make a mistake, and if you don't add error handlers then these just go by silently, offering no help as to why your program isn't working.
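A rough sketch of the error-handling pattern being discussed, assuming draft-style request and transaction objects with 'error'/'success' events; the transaction-level onerror hook and the implicit-commit behavior shown here are the proposal under discussion, not settled spec:

  var trans = db.transaction(["words"], READ_WRITE);

  // Generic transaction-level error handler (assumed hook): without
  // something like this, failures go by silently.
  trans.onerror = function (e) {
    // Reached when a request's 'error' event went unhandled; under the
    // proposal above, the transaction is then rolled back.
  };

  var req = trans.objectStore("words").put({ name: "delta", myModifiedValue: 17 });
  req.onerror = function (e) {
    // Calling e.preventDefault() here would mean "handled": the error
    // would not propagate and the transaction could still commit.
  };

  // No explicit commit() in this model: once no requests are pending and
  // no error went unhandled, the transaction commits implicitly.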
RE: [IndexedDB] Cursors and modifications
Making sure I get the essence of this thread: we're saying that cursors see live changes as they happen on objects that are after the object you're currently standing on; and of course, any other activity within a transaction sees all the changes that happened before that activity took place. Is that accurate? If it's accurate, as a side note, for the async API it seems that this makes it more interesting to enforce callback order, so we can more easily explain what we mean by "before". Thanks -pablo From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Wednesday, July 14, 2010 9:27 AM On Wed, Jul 14, 2010 at 5:17 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 14, 2010 at 5:12 AM, Jeremy Orlow jor...@chromium.org wrote: On Thu, Jul 8, 2010 at 8:42 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Jul 5, 2010 at 9:45 AM, Andrei Popescu andr...@google.com wrote: On Sat, Jul 3, 2010 at 2:09 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 2, 2010 at 5:44 PM, Andrei Popescu andr...@google.com wrote: On Sat, Jul 3, 2010 at 1:14 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 2, 2010 at 4:40 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Jonas Sicking Sent: Friday, July 02, 2010 4:00 PM We ran into a complicated issue while implementing IndexedDB. In short, what should happen if an object store is modified while a cursor is iterating it? Note that the modification can be done within the same transaction, so the read/write locks preventing several transactions from accessing the same table aren't helping here. Detailed problem description (this assumes the API proposed by mozilla): Consider an objectStore "words" containing the following objects: { name: "alpha" } { name: "bravo" } { name: "charlie" } { name: "delta" } and the following program (db is a previously opened IDBDatabase): var trans = db.transaction(["words"], READ_WRITE); var cursor; var result = []; trans.objectStore("words").openCursor().onsuccess = function(e) { cursor = e.result; result.push(cursor.value); cursor.continue(); } trans.objectStore("words").get("delta").onsuccess = function(e) { trans.objectStore("words").put({ name: "delta", myModifiedValue: 17 }); } When the cursor reads the "delta" entry, will it see the 'myModifiedValue' property? Since we have so far defined the callback order to be the request order, that means that the put request will be finished before the "delta" entry is iterated by the cursor. The problem is even more serious with cursors that iterate indexes. Here a modification can even affect the position of the currently iterated object in the index, and the modification can (if I'm reading the spec correctly) come from the cursor itself. Consider the following objectStore "people" with keyPath "name" containing the following objects: { name: "Adam", count: 30 } { name: "Bertil", count: 31 } { name: "Cesar", count: 32 } { name: "David", count: 33 } { name: "Erik", count: 35 } and an index "countIndex" with keyPath "count". What would the following code do? results = []; db.objectStore("people", READ_WRITE).index("countIndex").openObjectCursor().onsuccess = function (e) { cursor = e.result; if (!cursor) { alert(results); return; } if (cursor.value.name == "Bertil") { cursor.update({name: "Bertil", count: 34 }); } results.push(cursor.value.name); cursor.continue(); }; What does this alert? Would it alert "Adam,Bertil,Erik" as the cursor would stay on the "Bertil" object as it is moved in the index?
Or would it alert "Adam,Bertil,Cesar,David,Bertil,Erik" as we would iterate "Bertil" again at its new position in the index? My first reaction is that both from the expected-behavior perspective (the transaction is the scope of isolation) and from the implementation perspective it would be better to see live changes if they happened in the same transaction as the cursor (over a store or index). So in your example you would iterate one of the rows twice. Keeping order and membership stable would mean creating another scope of isolation within the transaction, which to me would be unusual and it would probably be quite painful to implement without spilling a copy of the records to disk (at least a copy of the keys/order if you don't care about protecting from changes that don't affect membership/order; some databases call these "keyset cursors"). We could say that cursors always iterate snapshots, however this introduces MVCC. Though it seems to me that SNAPSHOT_READ already does that. Actually, even with MVCC you'd see your own changes, because they happen in the same transaction so the buffer pool will use the same version of the page.
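For illustration, a minimal sketch of the "live changes within the same transaction" reading, reusing the mozilla-style API from the example above (names and event shapes are as assumed in this thread, not settled spec):

  var trans = db.transaction(["words"], READ_WRITE);
  var store = trans.objectStore("words");

  // The put is requested first, and callbacks fire in request order, so
  // the get below runs after the put has been applied...
  store.put({ name: "delta", myModifiedValue: 17 });
  store.get("delta").onsuccess = function (e) {
    // ...and therefore sees the modified record under the live-changes
    // reading: e.result.myModifiedValue === 17 here.
  };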
Re: [IndexedDB] Current editor's draft
On Wed, Jul 14, 2010 at 6:05 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, July 14, 2010 5:43 PM On Wed, Jul 14, 2010 at 5:03 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Wednesday, July 14, 2010 12:07 AM I think what I'm struggling with is how dynamic transactions will help since they are still doing whole-objectStore locking. I'm also curious how you envision people dealing with deadlock hazards. Nikunj's examples in the beginning of this thread simply throw up their hands and report an error if there was a deadlock. That is obviously not good enough for an actual application. So in short, looking forward to an example :) I'll try to come up with one, although I doubt the code itself will be very interesting in this particular case. Not sure what you mean by "they are still doing whole-objectStore locking". The point of dynamic transactions is that they *don't* lock the whole store, but instead have the freedom to choose the granularity (e.g. you could do row-level locking). My understanding is that the currently specced dynamic transactions are still whole-objectStore. Once you call openObjectStore and successfully receive the objectStore through the 'success' event, a lock is held on the whole objectStore until the transaction is committed. No other transaction, dynamic or static, can open the objectStore in the meantime. I base this on the sentence "There MAY not be any overlap among the scopes of all open connections to a given database" from the spec. But I might be misunderstanding things entirely. Nikunj, could you clarify how locking works for the dynamic transactions proposal that is in the spec draft right now? As for deadlocks, whenever you're doing an operation you need to be ready to handle errors (out of disk, timeout, etc.). I'm not sure why deadlocks are different. If the underlying implementation has deadlock detection then you may get a specific error, otherwise you'll just get a timeout. Well, I agree that while you have to handle errors to prevent dataloss, I suspect that most authors won't. Thus the more error conditions we introduce, the more opportunities there are for dataloss. I think the difference is that deadlocks will happen often enough that they are a real concern. Running out of disk space makes most desktop applications freak out badly enough that they generally cause dataloss, which is why OSs tend to warn when you're running low on disk space. As for timeouts, I think the default should be to have no timeout. Only if authors specifically specify a timeout parameter should we use one. My main line of thinking is that authors are generally going to be very bad at even looking for errors, and even worse at handling those errors in a way that is satisfactory for the user. So I think the default behavior is that any time an error occurs, we'll end up rolling back the transaction and there will be dataloss. We should absolutely still provide good error handling opportunities so that authors can at least try to deal with it. However I'm not too optimistic that people will actually use them correctly.
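Purely as a hypothetical sketch of the "no timeout by default" idea; the optional timeout argument and the error-code name below are illustrative assumptions, not part of any agreed draft:

  // No timeout by default: the transaction stays open until it commits
  // implicitly or is aborted.
  var trans = db.transaction(["people"], READ_WRITE);

  // An author who wants a timeout opts in explicitly (hypothetical third
  // argument, here 5000 ms) and handles the resulting error.
  var timedTrans = db.transaction(["people"], READ_WRITE, 5000);
  timedTrans.onerror = function (e) {
    // e.g. a TIMEOUT_ERR-style code (illustrative name) when the locks
    // could not be acquired in time or the transaction sat idle too long.
  };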
Re: [IndexedDB] Cursors and modifications
On Wed, Jul 14, 2010 at 6:20 PM, Pablo Castro pablo.cas...@microsoft.com wrote: Making sure I get the essence of this thread: we're saying that cursors see live changes as they happen on objects that are after the object you're currently standing on; Yes. and of course, any other activity within a transaction sees all the changes that happened before that activity took place. Is that accurate? Yes. All other activity sees all changes as soon as they have happened. If it's accurate, as a side note, for the async API it seems that this makes it more interesting to enforce callback order, so we can more easily explain what we mean by "before". Indeed. / Jonas