Re: [whatwg] Worker feedback
On Sat, 28 Mar 2009, Robert O'Callahan wrote:
> On Sat, Mar 28, 2009 at 2:23 PM, Ian Hickson i...@hixie.ch wrote:
>> Robert O'Callahan wrote:
>>> Now, with the storage mutex, are there any cases you know of where
>>> serializability fails? If there are, it may be worth noting them in the
>>> spec. If there aren't, why not simply write serializability into the spec?
>>
>> Just writing that something must be true doesn't make it true. :-) I think
>> it's safer for us to make the design explicitly enforce this rather than
>> say that browser vendors must figure out where it might be broken and
>> enforce it themselves.
>
> If serializability is the goal then I think it can only help to say so in
> the spec (in addition to whatever explicit design you wish to include), so
> that any failure of serializability is clearly an inconsistency in the spec
> that must be fixed rather than a loophole that authors and browser vendors
> might think they can rely on.

Done.

> I also suggest that speccing just serializability should be fine.

The problem is that this is specifying an anti-requirement, which doesn't really help in defining what the behaviour _should_ be like. It doesn't tell us what the order of events should be, for instance, just that some order should exist.

> It seems to me the current spec is proposing one implementation of
> serializability while other implementations are possible, and relying on
> the black-box equivalence principle to enable other implementations. But
> specifying serializability is probably simpler and may allow
> implementations that are unintentionally ruled out by the explicit design
> in the spec, especially as things become more complicated in the future.
> It would probably also be clearer to authors what they can expect.

What kind of implementations are unintentionally ruled out that you think should not be ruled out? I think it's a lot like GC; we don't specify a GC algorithm, even though GC is hard; we just have an implicit specification that objects don't disappear arbitrarily.
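The failure mode serializability rules out is easiest to see as a lost update. Here is a minimal sketch in plain Node-style JavaScript (illustrative only, no browser API involved; the `jar` object is a stand-in for a shared store such as cookies or localStorage): two tasks each do a read-modify-write with a yield point in between, and one of the updates is lost.

```javascript
// Two async tasks do a read-modify-write against a shared store with a
// yield point in the middle. "jar" is a stand-in for document.cookie /
// localStorage; no real browser API is used here.
const jar = { count: 0 };

async function increment() {
  const v = jar.count;       // read
  await Promise.resolve();   // yield: the other task runs here
  jar.count = v + 1;         // write back a now-stale value
}

async function race() {
  await Promise.all([increment(), increment()]);
  return jar.count;          // 1, not 2: one update was lost
}
```

Under a serializable model (e.g. enforced by a storage mutex), any observable outcome must match *some* sequential order of the two increments, so the result would always be 2.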
It's explicit now, actually (see 2.9.8 Garbage collection, 5.3.5 Garbage collection and browsing contexts, 7.3.3.1 Ports and garbage collection, and similar sections in the Event Source, Workers, and Web Sockets specs). On Sat, 28 Mar 2009, Alexey Proskuryakov wrote: On 28.03.2009, at 4:23, Ian Hickson wrote: I think, given text/css, text/html, and text/xml all have character encoding declarations inline, transcoding is not going to work in practice. I think the better solution would be to remove the rules that make text/* an issue in the standards world (it's not an issue in the real world). In fact, transcoding did work in practice - that's because HTTP headers override inline character declarations. It worked for as long as the HTTP override was around to override, but as soon as the user saves the file to disk, or some such, it fails. For new formats, though, I think just supporting UTF-8 is a big win. Could you please clarify what the win is? It's massively simpler to not have to deal with the horrors of character encodings. Disregarding charset from HTTP headers is just a weird special case for a few text resource types. If we were going to deprecate HTML, XML and CSS, but keep appcache manifest going forward, it could maybe make sense. What's the advantage of introducing all the pain and suffering that encodings will inevitably bring with them to the cache manifest format? On Sat, 28 Mar 2009, Kristof Zelechovski wrote: Scripts, and worker scripts in particular, should use application media type; using text/javascript is obsolete. [RFC4329#3]. IMHO RFC4329 is silly. On Mon, 30 Mar 2009, Drew Wilson wrote: In the past we've discussed having synchronous APIs for structured storage that only workers can use - it's a much more convenient API, particularly for applications porting to HTML5 structured storage from gears. 
It sounds like if we want to support these APIs in workers, we'd need to enforce the same kind of serializability guarantees that we have for localStorage in browser windows (i.e. add some kind of structured storage mutex similar to the localStorage mutex). This API now exists. I don't think it causes any particular serialization problems; the only issue seems to be what happens if a worker grabs the write lock to a database and then doesn't release it, but then all it will do is cause the browsing contexts that are waiting for that lock to just never call the relevant callback (and the sync workers from that domain to block), so it doesn't seem like a huge deal. (It's still serialisable, it's just there's a big wait in there!) Re: cookies: I suppose that network activity should also wait for the lock. I've made that happen. Seems like that would restrict parallelism between network loads and executing JavaScript, which seems like the wrong direction to go. I agree
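The guarantee the storage mutex provides can be sketched with a toy promise-queue mutex (a hypothetical `Mutex` class for illustration; the real storage mutex is a browser-internal mechanism, not script-visible): routing both read-modify-writes through the mutex serializes them, so no update is lost even with the same yield point as before.

```javascript
// Toy promise-queue mutex: run() chains each critical section after the
// previous one, so sections never interleave. Illustrative only.
class Mutex {
  constructor() { this.tail = Promise.resolve(); }
  run(fn) {
    const result = this.tail.then(fn);
    this.tail = result.catch(() => {}); // keep the chain alive on errors
    return result;
  }
}

const mutex = new Mutex();
const jar = { count: 0 };

function incrementLocked() {
  return mutex.run(async () => {
    const v = jar.count;       // read
    await Promise.resolve();   // a yield point, now harmless
    jar.count = v + 1;         // write
  });
}

async function serialized() {
  await Promise.all([incrementLocked(), incrementLocked()]);
  return jar.count;            // 2: both updates survive
}
```

A worker holding the lock and never releasing it corresponds here to an `fn` whose promise never settles: later sections simply queue forever, which is the "big wait" described above rather than a correctness failure.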
Re: [whatwg] Worker feedback
On 29.04.2009, at 6:05, Ian Hickson wrote: Disregarding charset from HTTP headers is just a weird special case for a few text resource types. If we were going to deprecate HTML, XML and CSS, but keep appcache manifest going forward, it could maybe make sense. What's the advantage of introducing all the pain and suffering that encodings will inevitably bring with them to the cache manifest format? Just what I said before - the ability to use the same code path for decoding manifests as for decoding other types of resources. It's a minor benefit, admittedly, but it's a potential issue at all stages - from generating content to checking it with automated tools to consuming it. For authors and admins, it may be a nuisance to maintain a UTF-8 text file if the rest of the site is in a different encoding. - WBR, Alexey Proskuryakov
Re: [whatwg] Worker feedback
On Wed, 29 Apr 2009, Alexey Proskuryakov wrote: On 29.04.2009, at 6:05, Ian Hickson wrote: Disregarding charset from HTTP headers is just a weird special case for a few text resource types. If we were going to deprecate HTML, XML and CSS, but keep appcache manifest going forward, it could maybe make sense. What's the advantage of introducing all the pain and suffering that encodings will inevitably bring with them to the cache manifest format? Just what I said before - the ability to use the same code path for decoding manifests as for decoding other types of resources. It's a minor benefit, admittedly, but it's a potential issue at all stages - from generating content to checking it with automated tools to consuming it. For authors and admins, it may be a nuisance to maintain a UTF-8 text file if the rest of the site is in a different encoding. I believe the long-term benefit of not having to deal with encodings, ever, for manifests, outweighs the medium-term benefit of people using non-UTF-8 elsewhere. Non-UTF-8 encodings are dropping in usage. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Worker feedback
On Mon, Apr 6, 2009 at 8:57 PM, timeless timel...@gmail.com wrote: FWIW, IIRC multiple processes in IE date back to at least IE4. The best URL I can find on the subject atm is http://aroundcny.com/technofile/texts/bit092098.html. Michael Nordman micha...@google.com wrote: There are additional constraints that haven't been mentioned yet... Plugins. The current model for plugins is that they execute in a single-threaded world. Chrome maintains that model by hosting each plugin in its own process and RPC'ing method invocations back and forth between calling pages and the plugin instances. All plugin instances (of a given plugin) reside on the same thread. Robert O'Callahan rob...@ocallahan.org wrote: Why can't instances of a plugin in different browser contexts be hosted in separate processes? Michael Nordman micha...@google.com wrote: It would be expensive, and I think this would have some correctness issues too, depending on the plugin. Some plugins depend on instances knowing about each other and interoperating with each other out of band of DOM-based means of doing so. Michael Nordman micha...@google.com wrote: And others probably assume they have exclusive access to mutable plugin resources on disk. This seems unlikely. I can run Firefox, Safari, Chrome, IE, Opera, and other browsers at the same time; heck, I can run multiple profiles of a couple of these (I can't find the option in the current version of Chrome, but I used it before). chrome.exe --user-data-dir=c:\foo
Re: [whatwg] Worker feedback
On Fri, Apr 3, 2009 at 2:49 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote: If I understood the discussion correctly, the spec for document.cookie never stated anything about it being immutable while a script is running. Well, there never was a decent spec for document.cookie for most of its life, and even if there had been, no implementations allowed asynchronous changes to cookies while a script was running (except for maybe during alert()) and no-one really thought about it. Was this even identified as a possible issue during Chrome development? In addition to alert(), don't forget about all the great state-changing things that can happen to the cookie database (and other data stores) during a synchronous XMLHttpRequest (or synchronous document.load) in Firefox. Maybe those are just bugs? What if a Firefox extension wants to muck around with the cookie database while a web page is blocked on a synchronous XMLHttpRequest? Maybe that should fail to avoid dead-locking? Sounds like a recipe for flaky extensions since it is unlikely that the extension author would have been prepared for being called at this time when access to the cookie database would have to be denied. (In Firefox, a new event loop is run to continue processing events while that synchronous XMLHttpRequest is active. That event loop helps keep the application alive and responsive to user action.) When deciding how to handle cookies in Chrome, we did not worry about the problem being debated here. Our concerns were allayed by recognizing that IE does not try to solve it (and IE6 is multi-process just like Chrome with a shared network stack), so clearly web developers must already have to cope. We flirted with the idea of letting each renderer maintain a local copy of its cookies, but that turned out to be more complicated than necessary.
In the end, we ended up synchronizing with the main process on each call to document.cookie to fetch a snapshot. I think it would be best to specify that document.cookie returns a snapshot. I think that is consistent with existing implementations including IE, Firefox, and Chrome. I don't know about Safari and Opera, but it seems plausible that they could have similar behavior thanks to nested event queues which are typically used to support synchronous XHR and window.alert(). You would be surprised by the number of times it comes up that web developers at Google think Firefox has multi-threaded JS thanks to this behavior of synchronous XHR ;-) -Darin People are now talking about specifying this, but there's been push back. Also, there's no way to guarantee serializability for the network traffic portion, so I'm guessing (hoping!) that this wouldn't be required on the JavaScript side, even if it went through. What exactly do you mean by that? It's easy to guarantee that reading the cookies to send with an HTTP request is an atomic operation, and writing them as a result of an HTTP response is an atomic operation. The spec is written in such a way that you can't have more than one event loop per browser window/worker, and everything is essentially tied to this one event loop. In other words, each window/worker can't run on more than one CPU core at a time. Thus, the only way for a web application to scale in today's world is going to be through additional windows and/or workers. Depending on exactly what you mean by a Web application, that's not really true. There are a variety of ways to exploit multicore parallelism within a window with the current set of specs, at least in principle. Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed.
We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
Re: [whatwg] Worker feedback
On Mon, 6 Apr 2009, Darin Fisher wrote: In addition to alert(), don't forget about all the great state changing things that can happen to the cookie database (and other data stores) during a synchronous XMLHttpRequest (or synchronous document.load) in Firefox. Maybe those are just bugs? The HTML5 spec says the storage mutex is released when alert() is called. I've asked Anne (editor of the XHR spec) to say that it is released when a sync XHR is started, too. Per the HTML5 spec, setting the cookies from the network grabs the storage mutex briefly. (Reading them is implicitly atomic, but might happen while someone else holds the mutex, so per spec there is still a chance of the cookies sent to the server being in an inconsistent state if they are read while a script is in the middle of a multi-stage cookie update.) I don't really mind if the spec says whether cookies should be protected by the storage mutex or not (the spec says they should be because that seems to be the majority opinion). I'm pretty sure localStorage should be so protected, though. I don't really see how to get away from that. -- Ian Hickson
Re: [whatwg] Worker feedback
On Mon, Apr 6, 2009 at 7:03 PM, Darin Fisher da...@chromium.org wrote: On Fri, Apr 3, 2009 at 2:49 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote: If I understood the discussion correctly, the spec for document.cookie never stated anything about it being immutable while a script is running. Well, there never was a decent spec for document.cookie for most of its life, and even if there had been, no implementations allowed asynchronous changes to cookies while a script was running (except for maybe during alert()) and no-one really thought about it. Was this even identified as a possible issue during Chrome development? In addition to alert(), don't forget about all the great state changing things that can happen to the cookie database (and other data stores) during a synchronous XMLHttpRequest (or synchronous document.load) in Firefox. Maybe those are just bugs? What if a Firefox extension wants to muck around with the cookie database while a web page is blocked on a synchronous XMLHttpRequest? Maybe that should fail to avoid dead-locking? Sounds like a recipe for flaky extensions since it is unlikely that the extension author would have been prepared for being called at this time when access to the cookie database would have to be denied. According to the spec the storage mutex is dropped for blocking operations like alert() and sync XHR, and as you know, that's effectively what we do. But the general rule of DOM API design is that operations do not block and we offer asynchronous APIs instead. alert() and sync XHR are exceptions to this rule, but they're ugly stepchildren of DOM APIs and we don't want to treat them as norms. When deciding how to handle cookies in Chrome, we did not worry about the problem being debated here.
Our concerns were allayed by recognizing that IE does not try to solve it (and IE6 is multi-process just like Chrome with a shared network stack), so clearly web developers must already have to cope. You mean IE8. How would Web developers cope? There's no way to synchronize. I doubt more than a handful of Web developers even know this problem could exist. I think it would be best to specify that document.cookie returns a snapshot. I think that is consistent with existing implementations including IE, Firefox, and Chrome. Not at all. In Firefox, cookies don't change while a script is running, as long as it doesn't call the handful of blocking DOM APIs (such as alert() or sync XHR); we satisfy the current spec. The insidious part is that almost all the time, IE and Chrome will also be observed to obey the spec; when a quick cookie-read-modify-write script runs, it is very unlikely cookies will change underneath it. (Is it possible people don't write such scripts?) Maybe we need dynamic race detection for Web browsers. After a script reads document.cookie, stall for a while to give network transactions or scripts running in other threads a chance to change the cookies so the original script carries on with wrong data.

Rob
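For concreteness, a "quick cookie-read-modify-write script" of the kind at issue looks something like the sketch below. The cookie name and helper functions are made up for illustration; in a real page the string would come from, and be written back to, document.cookie, which a plain string stands in for here so the snippet is self-contained.

```javascript
// A typical read-modify-write over a cookie string. In a browser the
// source would be document.cookie; a plain string stands in for it here.
// If the cookie store changes between the read and the write, the write
// clobbers the concurrent change -- the race debated in this thread.
function getCookie(cookieString, name) {
  for (const part of cookieString.split('; ')) {
    const eq = part.indexOf('=');
    if (part.slice(0, eq) === name) {
      return decodeURIComponent(part.slice(eq + 1));
    }
  }
  return null;
}

function bumpVisitCount(cookieString) {
  const n = parseInt(getCookie(cookieString, 'visits') || '0', 10); // read
  return 'visits=' + encodeURIComponent(String(n + 1));             // write
}
```

If two tabs (or a tab and a network response) run this concurrently against the same store, one increment can be silently lost, yet each individual run looks perfectly correct.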
Re: [whatwg] Worker feedback
On Mon, Apr 6, 2009 at 4:20 AM, Robert O'Callahan rob...@ocallahan.org wrote: On Mon, Apr 6, 2009 at 7:03 PM, Darin Fisher da...@chromium.org wrote: On Fri, Apr 3, 2009 at 2:49 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote: If I understood the discussion correctly, the spec for document.cookie never stated anything about it being immutable while a script is running. Well, there never was a decent spec for document.cookie for most of its life, and even if there had been, no implementations allowed asynchronous changes to cookies while a script was running (except for maybe during alert()) and no-one really thought about it. Was this even identified as a possible issue during Chrome development? In addition to alert(), don't forget about all the great state changing things that can happen to the cookie database (and other data stores) during a synchronous XMLHttpRequest (or synchronous document.load) in Firefox. Maybe those are just bugs? What if a Firefox extension wants to muck around with the cookie database while a web page is blocked on a synchronous XMLHttpRequest? Maybe that should fail to avoid dead-locking? Sounds like a recipe for flaky extensions since it is unlikely that the extension author would have been prepared for being called at this time when access to the cookie database would have to be denied. According to the spec the storage mutex is dropped for blocking operations like alert() and sync XHR, and as you know, that's effectively what we do. But the general rule of DOM API design is that operations do not block and we offer asynchronous APIs instead. alert() and sync XHR are exceptions to this rule, but they're ugly stepchildren of DOM APIs and we don't want to treat them as norms. OK... so if I am building an API, the consumer of my API might not realize that I have stuck a sync XHR in the middle of it. (People often do that so that their API can work during unload.)
So the consumer of such an API now has to deal with the cookie lock being released? When deciding how to handle cookies in Chrome, we did not worry about the problem being debated here. Our concerns were allayed by recognizing that IE does not try to solve it (and IE6 is multi-process just like Chrome with a shared network stack), so clearly web developers must already have to cope. You mean IE8. No, IE6,7,8 (maybe older versions too?) ... you can launch multiple IE6 processes, and those share cookies. You can also programmatically access the same cookies via WinInet from any application. It is not uncommon for a separate application to be mucking around with cookies for intranet.com. How would Web developers cope? There's no way to synchronize. I doubt more than a handful of Web developers even know this problem could exist. You can synchronize through the origin server... What I meant was that they cope by not expecting document.cookie to return the same results each time it is called. I'd imagine it is not uncommon for users to login to a site in multiple windows and perform similar operations in each browser window. That scenario seems like it could trigger what we have here. I think it would be best to specify that document.cookie returns a snapshot. I think that is consistent with existing implementations including IE, Firefox, and Chrome. Not at all. In Firefox, cookies don't change while a script is running, as long as it doesn't call the handful of blocking DOM APIs (such as alert() or sync XHR); we satisfy the current spec. I don't understand why the sync XHR exception is taken so lightly. As I mention above, that is most frequently used as a transparent-to-the-rest-of-the-application way of communicating with the server (usually because some APIs cannot be easily changed or need to be available during unload). Yet, here we are saying that that cannot be transparent because of this locking issue. 
The insidious part is that almost all the time, IE and Chrome will also be observed to obey the spec; when a quick cookie-read-modify-write script runs, it is very unlikely cookies will change underneath it. (Is it possible people don't write such scripts?) I'm sure people write cookie-read-modify-write scripts and don't realize the potential problems. But I suspect the incidence of problems related to two scripts doing so is so low as to not matter enough to application developers. They can just say: opening our webmail program in two browser tabs at the same time is not supported. Maybe we need dynamic race detection for Web browsers. After a script reads document.cookie, stall for a while to give network transactions or scripts running in other threads a chance to change the cookies so the original script carries on with wrong data. Sounds interesting, but what happens when the script writes cookies? Now there is a merging problem :(
Re: [whatwg] Worker feedback
There are additional constraints that haven't been mentioned yet... Plugins. The current model for plugins is that they execute in a single-threaded world. Chrome maintains that model by hosting each plugin in its own process and RPC'ing method invocations back and forth between calling pages and the plugin instances. All plugin instances (of a given plugin) reside on the same thread.

Consider three threads: PageA, PageB, PluginC.

PageA
- grabs storage lock

PluginC
- calls out to PageB (everything in NPAPI is synchronous)
- now waiting for PageB to return

PageB
- while handling the plugin's callback, attempts to grab the storage lock
- BLOCKED waiting for PageA to release it

PageA
- calls plugin (sync method call)
- BLOCKED waiting indirectly for PageB

== DEADLOCK
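The same circular wait can be simulated in plain JavaScript with two promise-based locks. This is only a sketch of the wait-for cycle, not how any browser implements locking; the 50 ms timeout exists purely to detect that neither task can ever proceed.

```javascript
// Simulates the PageA / PageB / PluginC circular wait with promises.
// pageA holds the storage lock and waits for the plugin thread to be
// free; pageB (called from the plugin) waits for the storage lock.
// Neither ever releases, so only the watchdog timeout fires.
function makeLock() {
  let release;
  const released = new Promise(resolve => { release = resolve; });
  return { released, release };
}

async function demo() {
  const storage = makeLock(); // conceptually held by PageA
  const plugin = makeLock();  // conceptually held by PluginC

  const pageA = (async () => {
    await plugin.released;    // sync plugin call: wait for PluginC
    storage.release();        // never reached
  })();

  const pageB = (async () => {
    await storage.released;   // plugin callback wants the storage lock
    plugin.release();         // never reached
  })();

  const watchdog = new Promise(resolve =>
    setTimeout(() => resolve('deadlock'), 50));
  return Promise.race([
    Promise.all([pageA, pageB]).then(() => 'completed'),
    watchdog,
  ]);
}
```

Breaking any one edge of the cycle (e.g. releasing the storage lock before the synchronous plugin call, which is roughly what the spec's "release the storage mutex on blocking operations" rule does) lets both tasks complete.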
Re: [whatwg] Worker feedback
On Tue, Apr 7, 2009 at 1:53 AM, Darin Fisher da...@chromium.org wrote: OK... so if I am building an API, the consumer of my API might not realize that I have stuck a sync XHR in the middle of it. (People often do that so that their API can work during unload.) So the consumer of such an API now has to deal with the cookie lock being released? Yes. If sync XHR spins up a subsidiary event loop, the cookie lock is the least of your worries, because event handlers may run and mutate arbitrary script/DOM state. (We're actually tightening up what is allowed to run during sync XHR in Gecko, but I don't know the details and I don't know what other browsers do.) APIs that can cause reentrancy, or block, or yield, need to be carefully documented. That's why we want to minimize them... When deciding how to handle cookies in Chrome, we did not worry about the problem being debated here. Our concerns were allayed by recognizing that IE does not try to solve it (and IE6 is multi-process just like Chrome with a shared network stack), so clearly web developers must already have to cope. You mean IE8. No, IE6,7,8 (maybe older versions too?) ... you can launch multiple IE6 processes, and those share cookies. You can also programmatically access the same cookies via WinInet from any application. It is not uncommon for a separate application to be mucking around with cookies for intranet.com. OK, that's interesting. How would Web developers cope? There's no way to synchronize. I doubt more than a handful of Web developers even know this problem could exist. You can synchronize through the origin server... What I meant was that they cope by not expecting document.cookie to return the same results each time it is called. I'd imagine it is not uncommon for users to login to a site in multiple windows and perform similar operations in each browser window. That scenario seems like it could trigger what we have here. 
Many sites, such as my bank, detect that and attempt to prohibit it by refusing to let more than one window work. I wonder if they use a race-vulnerable cookie protocol to detect it... I think it would be best to specify that document.cookie returns a snapshot. I think that is consistent with existing implementations including IE, Firefox, and Chrome. Not at all. In Firefox, cookies don't change while a script is running, as long as it doesn't call the handful of blocking DOM APIs (such as alert() or sync XHR); we satisfy the current spec. I don't understand why the sync XHR exception is taken so lightly. As I mention above, that is most frequently used as a transparent-to-the-rest-of-the-application way of communicating with the server (usually because some APIs cannot be easily changed or need to be available during unload). Yet, here we are saying that that cannot be transparent because of this locking issue. Yes. Making sync-XHR transparent by reducing all consistency guarantees to what we can provide around sync-XHR is the wrong direction to go IMHO. The insidious part is that almost all the time, IE and Chrome will also be observed to obey the spec; when a quick cookie-read-modify-write script runs, it is very unlikely cookies will change underneath it. (Is it possible people don't write such scripts?) I'm sure people write cookie-read-modify-write scripts and don't realize the potential problems. But I suspect the incidents of problems related to two scripts doing so are extremely low as to not matter enough to application developers. They can just say: opening our webmail program in two browser tabs at the same time is not supported. If they're not aware of the problem, why would they say that? Maybe we need dynamic race detection for Web browsers. After a script reads document.cookie, stall for a while to give network transactions or scripts running in other threads a chance to change the cookies so the original script carries on with wrong data. 
Sounds interesting, but what happens when the script writes cookies? Now there is a merging problem :( Oh, dynamic race detection is only good for finding bugs more easily, not fixing them :-).

Rob
Re: [whatwg] Worker feedback
On Tue, Apr 7, 2009 at 5:04 AM, Michael Nordman micha...@google.com wrote: There are additional constraints that haven't been mentioned yet... Plugins. The current model for plugins is that they execute in a single-threaded world. Chrome maintains that model by hosting each plugin in its own process and RPC'ing method invocations back and forth between calling pages and the plugin instances. All plugin instances (of a given plugin) reside on the same thread. Why can't instances of a plugin in different browser contexts be hosted in separate processes?

Rob
Re: [whatwg] Worker feedback
On Mon, Apr 6, 2009 at 7:17 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Tue, Apr 7, 2009 at 5:04 AM, Michael Nordman micha...@google.com wrote: There are additional constraints that haven't been mentioned yet... Plugins. The current model for plugins is that they execute in a single-threaded world. Chrome maintains that model by hosting each plugin in its own process and RPC'ing method invocations back and forth between calling pages and the plugin instances. All plugin instances (of a given plugin) reside on the same thread. Why can't instances of a plugin in different browser contexts be hosted in separate processes? It would be expensive, and I think this would have some correctness issues too, depending on the plugin. Some plugins depend on instances knowing about each other and interoperating with each other out of band of DOM-based means of doing so.
Re: [whatwg] Worker feedback
On Mon, Apr 6, 2009 at 7:28 PM, Michael Nordman micha...@google.com wrote: On Mon, Apr 6, 2009 at 7:17 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Tue, Apr 7, 2009 at 5:04 AM, Michael Nordman micha...@google.com wrote: There are additional constraints that haven't been mentioned yet... Plugins. The current model for plugins is that they execute in a single-threaded world. Chrome maintains that model by hosting each plugin in its own process and RPC'ing method invocations back and forth between calling pages and the plugin instances. All plugin instances (of a given plugin) reside on the same thread. Why can't instances of a plugin in different browser contexts be hosted in separate processes? It would be expensive, and I think this would have some correctness issues too, depending on the plugin. Some plugins depend on instances knowing about each other and interoperating with each other out of band of DOM-based means of doing so. And others probably assume they have exclusive access to mutable plugin resources on disk.
Re: [whatwg] Worker feedback
On Tue, Apr 7, 2009 at 5:04 AM, Michael Nordman micha...@google.com wrote:

Consider three threads: PageA, PageB, PluginC.

PageA
- grabs storage lock

PluginC
- calls out to PageB (everything in NPAPI is synchronous)
- now waiting for PageB to return

PageB
- while handling the plugin's callback, attempts to grab the storage lock
- BLOCKED waiting for PageA to release it

PageA
- calls plugin (sync method call)
- BLOCKED waiting indirectly for PageB

== DEADLOCK

What happens if we don't have storage locks but PageB does a sync XHR or alert() inside the callout from the plugin? All the other pages containing plugins of that type lock up?

Rob
Re: [whatwg] Worker feedback
FWIW, IIRC multiple processes in IE date back to at least IE4. The best URL I can find on the subject atm is http://aroundcny.com/technofile/texts/bit092098.html. Michael Nordman micha...@google.com wrote: There are additional constraints that haven't been mentioned yet... Plugins. The current model for plugins is that they execute in a single-threaded world. Chrome maintains that model by hosting each plugin in its own process and RPC'ing method invocations back and forth between calling pages and the plugin instances. All plugin instances (of a given plugin) reside on the same thread. Robert O'Callahan rob...@ocallahan.org wrote: Why can't instances of a plugin in different browser contexts be hosted in separate processes? Michael Nordman micha...@google.com wrote: It would be expensive, and I think this would have some correctness issues too, depending on the plugin. Some plugins depend on instances knowing about each other and interoperating with each other out of band of DOM-based means of doing so. Michael Nordman micha...@google.com wrote: And others probably assume they have exclusive access to mutable plugin resources on disk. This seems unlikely. I can run Firefox, Safari, Chrome, IE, Opera, and other browsers at the same time; heck, I can run multiple profiles of a couple of these (I can't find the option in the current version of Chrome, but I used it before).
Re: [whatwg] Worker feedback
On Sat, Apr 4, 2009 at 11:17 AM, Jeremy Orlow jor...@google.com wrote: True serializability would imply that the HTTP request read and write are atomic. In other words, you'd have to keep a lock for the entirety of each HTTP request and couldn't do multiple in parallel. When I said there's no way to guarantee serializability, I guess I meant to qualify it with in practice. OK, I don't think anyone expects, wants, or has ever had that :-). After thinking about it for a bit, your suggestion of reading the cookies to send with an HTTP request is an atomic operation, and writing them as a result of an HTTP response is an atomic operation does seem like a pretty sensible compromise. It's what the spec says (the spec doesn't say anything about reading cookies when constructing an HTTP request, but that's probably just an oversight) and it's what I expected, so not really a compromise :-). The one thing I'd still be concerned about: localStorage separates storage space by origins. In other words, www.google.com cannot access localStorage values from google.com and vice versa. Cookies, on the other hand, have a much more complex scheme of access control. Coming up with an efficient and deadlock-proof locking scheme might take some careful thought. I hope browser implementors can solve this internally. I think the main thing we have to watch out for in the spec is situations where a script can *synchronously* entangle browsing contexts that previously could not interfere with each other (i.e., that a browser could have assigned independent locks). (Setting document.domain might be a problem, for example, although I don't know enough about cookies to be sure.) Depending on exactly what you mean by a Web application, that's not really true. There are a variety of ways to exploit multicore parallelism within a window with the current set of specs, at least in principle. What else is there? (I believe you, I'm just interested in knowing what's out there.)
In Gecko we're working on making HTML parsing happen in parallel with other activities (including script execution), and video decoding already does. I can imagine doing all graphics rendering in parallel with other tasks and being parallel internally too. Some aspects of layout can be parallelized internally and overlapped with script execution. Expensive Javascript compiler optimizations can be run in parallel with actual application work. Canvas3D can run GPU programs which are another form of parallelism (OK that's not exactly multicore parallelism unless you believe Intel). Rob
Re: [whatwg] Worker feedback
On Fri, 03 Apr 2009 06:26:43 +0200, Robert O'Callahan rob...@ocallahan.org wrote: Mozilla could probably get behind that, but I don't know who else is willing to bite the bullet. The problem already exists for document.cookie, no? And the current API is by far the most convenient to use. -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Worker feedback
On Fri, Apr 3, 2009 at 2:18 AM, Anne van Kesteren ann...@opera.com wrote: On Fri, 03 Apr 2009 06:26:43 +0200, Robert O'Callahan rob...@ocallahan.org wrote: Mozilla could probably get behind that, but I don't know who else is willing to bite the bullet. The problem already exists for document.cookie, no? And the current API is by far the most convenient to use. If I understood the discussion correctly, the spec for document.cookie never stated anything about it being immutable while a script is running. People are now talking about specifying this, but there's been push back. Also, there's no way to guarantee serializability for the network traffic portion so I'm guessing (hoping!) that this wouldn't be required in the JavaScript side, even if it went through. localStorage, on the other hand, does have language in the draft spec stating that changes to localStorage must be serialized as if only one event loop is running at a time. That's the problem. In other words, the strictness of the concurrency control for localStorage is what makes this different from document.cookie. As for convenience: The spec is written in such a way that you can't have more than one event loop per browser window/worker, and everything is essentially tied to this one event loop. In other words, each window/worker can't run on more than one CPU core at a time. Thus, the only way for a web application to scale in today's world is going to be through additional windows and/or workers. I agree that the current API is quite convenient, but it worries me a great deal that it's synchronous. Now that navigator.unlockStorage() has been added to the spec and you can't access localStorage from workers, I'm less worried. But I still feel like we're going to regret this in the next couple years and/or people will simply avoid localStorage. J
Re: [whatwg] Worker feedback
On Thu, Apr 2, 2009 at 8:37 PM, Robert O'Callahan rob...@ocallahan.org wrote: I agree it would make sense for new APIs to impose much greater constraints on consumers, such as requiring them to factor code into transactions, declare up-front the entire scope of resources that will be accessed, and enforce those restrictions, preferably syntactically --- Jonas' asynchronous multi-resource-acquisition callback, for example. Speaking as a novice javascript developer, this feels like the cleanest, simplest, most easily comprehensible way to solve this problem. We define what needs to be locked all at once, provide a callback, and within the dynamic context of the callback no further locks are acquirable. You have to completely exit the callback and start a new lock block if you need more resources. This prevents deadlocks, while still giving us developers a simple way to express what we need. As well, callbacks are at this point a familiar concept even to relative novices, as every major javascript library makes heavy use of them. ~TJ
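A rough sketch of how the declare-up-front callback pattern described above might look. This is purely illustrative: the broker and its names (makeStorageBroker, transact) are assumptions, not anything from the spec or from Jonas' actual proposal. Acquiring all declared resources in a fixed global order before running the callback is what makes deadlock impossible.

```javascript
// Hypothetical sketch of an asynchronous multi-resource-acquisition
// callback API. Scripts declare every resource up front; the callback
// runs only once all of them are held, and they are released when it
// completes. Each resource is guarded by a promise-chained mutex.
function makeStorageBroker() {
  const queues = new Map(); // resource name -> tail of its wait queue

  function acquire(name) {
    const tail = queues.get(name) || Promise.resolve();
    let release;
    const held = new Promise((r) => { release = r; });
    queues.set(name, tail.then(() => held)); // next waiter runs after us
    return tail.then(() => release);         // resolves once we hold it
  }

  // Acquire all declared resources in sorted (global) order -- a fixed
  // acquisition order rules out deadlock -- then run the callback.
  async function transact(resources, callback) {
    const releases = [];
    for (const name of [...resources].sort()) {
      releases.push(await acquire(name));
    }
    try {
      return await callback();
    } finally {
      releases.forEach((release) => release());
    }
  }

  return { transact };
}
```

Usage would look like `broker.transact(['cookies', 'localStorage'], async () => { ... })`; needing another resource mid-callback means exiting and starting a new block, exactly as Tab describes.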
Re: [whatwg] Worker feedback
I know I said I would stay out of this conversation, but I feel obliged to share a data point that's pertinent to our API design. The structured storage spec has an asynchronous API currently. There is no shortage of experienced javascript programmers at Google, and yet the single biggest piece of feedback I've gotten from the internal app community has been (essentially): The asynchronous APIs are too cumbersome. We are going to delay porting over to use the HTML5 APIs until we have synchronous APIs, like the ones in Gears. So, we should all take the whining of pampered Google engineers with a grain of salt :), but the point remains that even though callbacks are conceptually familiar and easy to use, it's not always convenient (or possible!) for an application to stop an operation in the middle and resume it via an asynchronous callback. Imagine if you're a library author that exposes a synchronous API for your clients - now you'd like to use localStorage within your library, but there's no way to do it while maintaining your existing synchronous APIs. If we try to force everyone to use asynchronous APIs to access local storage, the first thing everyone is going to do is build their own write-through caching wrapper objects around local storage to give them synchronous read access and lazy writes, which generates precisely the type of racy behavior we're trying to avoid. If we can capture the correct behavior using synchronous APIs, we should. -atw On Fri, Apr 3, 2009 at 11:44 AM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Thu, Apr 2, 2009 at 8:37 PM, Robert O'Callahan rob...@ocallahan.org wrote: I agree it would make sense for new APIs to impose much greater constraints on consumers, such as requiring them to factor code into transactions, declare up-front the entire scope of resources that will be accessed, and enforce those restrictions, preferably syntactically --- Jonas' asynchronous multi-resource-acquisition callback, for example.
Speaking as a novice javascript developer, this feels like the cleanest, simplest, most easily comprehensible way to solve this problem. We define what needs to be locked all at once, provide a callback, and within the dynamic context of the callback no further locks are acquirable. You have to completely exit the callback and start a new lock block if you need more resources. This prevents deadlocks, while still giving us developers a simple way to express what we need. As well, callbacks are at this point a familiar concept even to relative novices, as every major javascript library makes heavy use of them. ~TJ
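The write-through caching wrapper Drew predicts authors would build is easy to imagine. A minimal sketch, assuming a hypothetical async backend with an `asyncPut` method (not a real API): reads become synchronous against a local cache and writes are flushed lazily, which is exactly the racy, lost-update-prone behavior the serializability discussion is trying to prevent.

```javascript
// Hypothetical write-through cache over an async-only storage API.
// backend.asyncPut(key, value) is an assumed async primitive.
function makeCachedStorage(backend) {
  const cache = new Map();
  const dirty = new Set();

  return {
    getItem(key) {        // synchronous read, served from the cache
      return cache.get(key);
    },
    setItem(key, value) { // synchronous write, flushed later
      cache.set(key, value);
      dirty.add(key);
    },
    flush() {             // lazy write-back: writes made by another
      for (const key of dirty) {   // page in the meantime get clobbered
        backend.asyncPut(key, cache.get(key));
      }
      dirty.clear();
    },
  };
}
```

Between `setItem` and `flush` the cache and the real store silently disagree, and two pages each holding such a wrapper will happily overwrite each other's data -- the race simply moves from the platform into every application.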
Re: [whatwg] Worker feedback
On Fri, Apr 3, 2009 at 2:25 PM, Drew Wilson atwil...@google.com wrote: If we can capture the correct behavior using synchronous APIs, we should. I think we already have a good, correct, synchronous API. My concern is the implications to the internals of the implementation. Anyway, given that no one is chiming in to my defense, either no one really cares enough to have read this far or no one agrees with me. Either way, I guess I'll quiet down. :-)
Re: [whatwg] Worker feedback
On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote: If I understood the discussion correctly, the spec for document.cookie never stated anything about it being immutable while a script is running. Well, there never was a decent spec for document.cookie for most of its life, and even if there had been, no implementations allowed asynchronous changes to cookies while a script was running (except for maybe during alert()) and no-one really thought about it. Was this even identified as a possible issue during Chrome development? People are now talking about specifying this, but there's been push back. Also, there's no way to guarantee serializability for the network traffic portion so I'm guessing (hoping!) that this wouldn't be required in the JavaScript side, even if it went through. What exactly do you mean by that? It's easy to guarantee that reading the cookies to send with an HTTP request is an atomic operation, and writing them as a result of an HTTP response is an atomic operation. The spec is written in such a way that you can't have more than one event loop per browser window/worker, and everything is essentially tied to this one event loop. In other words, each window/worker can't run on more than one CPU core at a time. Thus, the only way for a web application to scale in today's world is going to be through additional windows and/or workers. Depending on exactly what you mean by a Web application, that's not really true. There are a variety of ways to exploit multicore parallelism within a window with the current set of specs, at least in principle. Rob
Re: [whatwg] Worker feedback
On Fri, Apr 3, 2009 at 2:49 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote: People are now talking about specifying this, but there's been push back. Also, there's no way to guarantee serializability for the network traffic portion so I'm guessing (hoping!) that this wouldn't be required in the JavaScript side, even if it went through. What exactly do you mean by that? It's easy to guarantee that reading the cookies to send with an HTTP request is an atomic operation, and writing them as a result of an HTTP response is an atomic operation. True serializability would imply that the HTTP request read and write are atomic. In other words, you'd have to keep a lock for the entirety of each HTTP request and couldn't do multiple in parallel. When I said there's no way to guarantee serializability, I guess I meant to qualify it with in practice. After thinking about it for a bit, your suggestion of reading the cookies to send with an HTTP request is an atomic operation, and writing them as a result of an HTTP response is an atomic operation does seem like a pretty sensible compromise. The one thing I'd still be concerned about: localStorage separates storage space by origins. In other words, www.google.com cannot access localStorage values from google.com and vice versa. Cookies, on the other hand, have a much more complex scheme of access control. Coming up with an efficient and deadlock-proof locking scheme might take some careful thought. The spec is written in such a way that you can't have more than one event loop per browser window/worker, and everything is essentially tied to this one event loop. In other words, each window/worker can't run on more than one CPU core at a time. Thus, the only way for a web application to scale in today's world is going to be through additional windows and/or workers. Depending on exactly what you mean by a Web application, that's not really true.
There are a variety of ways to exploit multicore parallelism within a window with the current set of specs, at least in principle. What else is there? (I believe you, I'm just interested in knowing what's out there.) Jeremy P.S. Please don't mistake me for an expert on document.cookie or even window.localStorage. I try to fact check myself as I go, but if I say something that seems stupid, please do let me know. :-)
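The compromise discussed above -- read the cookie jar atomically when a request is built, write it atomically when a response arrives, and hold no lock for the duration of the request -- can be sketched with a toy cookie jar (the jar and its method names are illustrative, not a browser API):

```javascript
// Toy cookie jar illustrating per-request atomic reads and per-response
// atomic writes. No lock spans the network round trip itself.
function makeCookieJar() {
  const jar = new Map();
  return {
    // Atomic read: snapshot the whole jar into a Cookie header string
    // at the moment the request is constructed.
    requestHeader() {
      return [...jar].map(([k, v]) => `${k}=${v}`).join('; ');
    },
    // Atomic write: apply all Set-Cookie headers from one response in
    // a single step. Attributes (Path, Expires, ...) are ignored here.
    applyResponse(setCookies) {
      for (const line of setCookies) {
        const [pair] = line.split(';');
        const [k, v] = pair.split('=');
        jar.set(k.trim(), v.trim());
      }
    },
  };
}
```

Requests in flight can freely interleave with script execution; each individual read or write of the jar is still indivisible, which is all the compromise asks for.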
Re: [whatwg] Worker feedback
On Wed, Apr 1, 2009 at 3:17 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Thu, Apr 2, 2009 at 11:02 AM, Robert O'Callahan rob...@ocallahan.org wrote: (Note that you can provide hen read-only scripts are easy to optimize for full parallelism using ) Oops! I was going to point out that you can use a reader/writer lock to implement serializability while allowing read-only scripts to run in parallel, so if the argument is that most scripts are read-only then that means it shouldn't be hard to get pretty good parallelism. The problem is escalating the lock. If your script does a read and then a write, and you do this in 2 workers/windows/etc you can get a deadlock unless you have the ability to roll back one of the two scripts to before the read which took a shared lock. If both scripts have an 'alert("hi!");' then you're totally screwed, though. There's been a LOT of CS research done on automatically handling the details of concurrency. The problem has to become pretty constrained (especially in terms of stuff you can't roll back, like user input) before you can create something halfway efficient. On Wed, Apr 1, 2009 at 3:02 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Thu, Apr 2, 2009 at 7:18 AM, Michael Nordman micha...@google.com wrote: I suggest that we can come up with a design that makes both of these camps happy and that should be our goal here. To that end... what if... interface Store { void putItem(string name, string value); string getItem(string name); // calling getItem multiple times prior to script completion with the same name is guaranteed to return the same value // (unless the current script had called putItem; if a different script had called putItem concurrently, the current script won't see that) void transact(func transactCallback); // is not guaranteed to execute if the page is unloaded prior to the lock being acquired // is guaranteed to NOT execute if called from within onunload // but... really...
if you need transactional semantics, maybe you should be using a Database? attribute int length; // may only be accessed within a transactCallback, otherwise throws an exception string getItemByIndex(int i); // may only be accessed within a transactCallback, otherwise throws an exception }; document.cookie; // has the same safe-to-read-multiple-times semantics as store.getItem() So there are no locking semantics (outside of the transact method)... and multiple reads are not error prone. WDYT? getItem stability is helpful for read-only scripts but no help for read-write scripts. For example, outside a transaction, two scripts doing putItem('x', getItem('x') + 1) can race and lose an increment. Totally agree that it doesn't quite work yet. But what if setItem were to watch for unserializable behavior and throw a transactCallback when it happens? This solves the silent data corruption problem, though reproducing the circumstances that'd cause this is obviously racy. Of course, reproducing the deadlocks or very slow script execution behavior is also racy. Addressing the larger context ... More than anything else, I'm channeling my experiences at IBM Research writing race detection tools for Java programs ( http://portal.acm.org/citation.cfm?id=781528 and others), and what I learned there about programmers with a range of skill levels grappling with shared memory (or in our case, shared storage) concurrency. I passionately, violently believe that Web programmers cannot and should not have to deal with it. It's simply a matter of implementing what programmers expect: that by default, a chunk of sequential code will do what it says without (occasional, random) interference from outside. I definitely see pros and cons to providing a single-threaded version of the world to all developers (both advanced and beginner), but this really isn't what we should be debating right now.
What we should be debating is whether advanced, cross-event-loop APIs should be kept simple enough that any beginner web developer can use it (at the expense of performance and simplicity within the browser) or if we should be finding a compromise that can be kept fast, simple (causing fewer bugs!), and somewhat harder to program for. If someone wants to cross the event loop (except in the document.cookie case, which is a pretty special one), they should have to deal with more complexity in some form. Personally, I'd like to see a solution that does not involve locks of any sort (software transactional memory?). I realize that this creates major implementation difficulties for parallel browsers, which I believe will be all browsers. 'Evil', 'troubling' and 'onerous' are perhaps understatements... But it will be far better in the long run to put those burdens on browser developers than to kick them upstairs to Web developers. If it turns out that there is a compelling performance boost that can *only* be achieved by relaxing serializability, then I could be convinced ... but we are very far from proving that.
Re: [whatwg] Worker feedback
On Tue, Mar 31, 2009 at 9:57 PM, Drew Wilson atwil...@google.com wrote: On Tue, Mar 31, 2009 at 6:25 PM, Robert O'Callahan rob...@ocallahan.org wrote: We don't know how much (if any) performance must be sacrificed, because no-one's tried to implement parallel cookie access with serializability guarantees. So I don't think we can say what the correct tradeoff is. The spec as proposed states that script that accesses cookies cannot operate in parallel with network access on those same domains. The performance impact of something like this is pretty clear, IMO - we don't need to implement it and measure it to know it exists and in some situations could be significant. I agree with everything Drew said, but I think this one point really needs to be singled out. Cookies go across the wire. Serializable semantics are not possible in today's (latent) world. Period.
Re: [whatwg] Worker feedback
On Fri, Apr 3, 2009 at 9:00 AM, Jeremy Orlow jor...@google.com wrote: The problem is escalating the lock. If your script does a read and then a write, and you do this in 2 workers/windows/etc you can get a deadlock unless you have the ability to roll back one of the two scripts to before the read which took a shared lock. If both scripts have an 'alert(hi!);' then you're totally screwed, though. Double oops! Yes. On Wed, Apr 1, 2009 at 3:02 PM, Robert O'Callahan rob...@ocallahan.org wrote: getItem stability is helpful for read-only scripts but no help for read-write scripts. For example, outside a transaction, two scripts doing putItem('x', getItem('x') + 1) can race and lose an increment. Totally agree that it doesn't quite work yet. But what if setItem were to watch for unserializable behavior and throw a transactCallback when it happens? This solves the silent data corruption problem, though reproducing the circumstances that'd cause this are obviously racy. Of course, reproducing the deadlocks or very slow script execution behavior is also racy. You mean throw an exception when it happens? Yeah, that doesn't really help, you just replace one kind of random failure with another. A half-completed read-write script is very likely to have corrupted data. Addressing the larger context ... More than anything else, I'm channeling my experiences at IBM Research writing race detection tools for Java programs ( http://portal.acm.org/citation.cfm?id=781528 and others), and what I learned there about programmers with a range of skill levels grappling with shared memory (or in our case, shared storage) concurrency. I passionately, violently believe that Web programmers cannot and should not have to deal with it. It's simply a matter of implementing what programmers expect: that by default, a chunk of sequential code will do what it says without (occasional, random) interference from outside. 
I definitely see pros and cons to providing a single-threaded version of the world to all developers (both advanced and beginner), but this really isn't what we should be debating right now. Why not? I know of no better forum for debating the semantics of the Web platform, and it's clearly a matter of some urgency. What we should be debating is whether advanced, cross-event-loop APIs should be kept simple enough that any beginner web developer can use it (at the expense of performance and simplicity within the browser) or if we should be finding a compromise that can be kept fast, simple (causing fewer bugs!), and somewhat harder to program for. If someone wants to cross the event loop (except in the document.cookie case, which is a pretty special one), they should have to deal with more complexity in some form. Personally, I'd like to see a solution that does not involve locks of any sort (software transactional memory?). I agree it would make sense for new APIs to impose much greater constraints on consumers, such as requiring them to factor code into transactions, declare up-front the entire scope of resources that will be accessed, and enforce those restrictions, preferably syntactically --- Jonas' asynchronous multi-resource-acquisition callback, for example. That is entirely consistent with what I said above; I'm not saying all concurrency abstractions are intractable. But the abstraction which takes sequential code and adds races on shared storage everywhere certainly is. Unfortunately we have to deal with cookies and localStorage, where the API is already set. I realize that this creates major implementation difficulties for parallel browsers, which I believe will be all browsers. 'Evil', 'troubling' and 'onerous' are perhaps understatements... But it will be far better in the long run to put those burdens on browser developers than to kick them upstairs to Web developers.
If it turns out that there is a compelling performance boost that can *only* be achieved by relaxing serializability, then I could be convinced ... but we are very far from proving that. Like I said, a LOT of research has been done on concurrency. Basically, if you're not really careful about how you construct your language and the abstractions you have for concurrency, you can really easily back yourself into a corner that you semantically can't get out of (no matter how good of a programmer you are). I know this, but I'm not sure exactly what point you're trying to make. Rob
Re: [whatwg] Worker feedback
On Fri, Apr 3, 2009 at 9:02 AM, Jeremy Orlow jor...@google.com wrote: I agree with everything Drew said, but I think this one point really needs to be singled out. Cookies go across the wire. Serializable semantics are not possible in today's (latent) world. Period. The unit of serializability is a single script (typically an event handler) running to completion. There's no problem interleaving network cookie reads and writes with those. Rob
Re: [whatwg] Worker feedback
On Thu, Apr 2, 2009 at 6:37 PM, Robert O'Callahan rob...@ocallahan.orgwrote: Unfortunately we have to deal with cookies and localStorage, where the API is already set. Is it set? I understand that localStorage has been around for a while, but as far as I can tell virtually no one uses it. I thought the reason for calling this spec a draft was so that such fairly major issues could be corrected? I agree that changing something this late in the game is less than ideal, but I think we're both agreeing that any synchronous APIs that cross the event-loop are going to be long term problems.
Re: [whatwg] Worker feedback
On Fri, Apr 3, 2009 at 5:11 PM, Jeremy Orlow jor...@google.com wrote: On Thu, Apr 2, 2009 at 6:37 PM, Robert O'Callahan rob...@ocallahan.org wrote: Unfortunately we have to deal with cookies and localStorage, where the API is already set. Is it set? I understand that localStorage has been around for a while, but as far as I can tell virtually no one uses it. I thought the reason for calling this spec a draft was so that such fairly major issues could be corrected? I agree that changing something this late in the game is less than ideal, but I think we're both agreeing that any synchronous APIs that cross the event-loop are going to be long term problems. AFAIK every major browser has an implementation of localStorage close to shipping. The only way I can imagine having a chance to put the brakes on the feature now is for everyone who hasn't actually shipped it --- which I think is currently everyone but IE, since we shipped the old globalStorage which we're planning to rip out anyway --- to unite and disable it immediately until we have a better API. Maybe we could even get IE to disable it in an update. Mozilla could probably get behind that, but I don't know who else is willing to bite the bullet. I suppose sessionStorage can stay? Rob
Re: [whatwg] Worker feedback
I'd like to propose a way forward. Please have an open mind. The objections you're hearing from the Chrome world are around the locking semantics being proposed. In various discussions the terms 'evil', 'troubling', and 'onerous' have been used to describe what we think about aspects of those semantics. There are obvious difficulties in providing the semantics being discussed in a multi-threaded, multi-process browser. There are obvious performance implications. There are limitations imposed on workers that would otherwise not be an issue. And with the introduction of these locks today, there would be challenges going forward when trying to add new features such that deadlocks would not be incurred... our hands would be getting tied up. So we don't like it... evil, troubling, onerous. The objections I'm hearing from the Firefox world are around providing an API that is less error prone. I suggest that we can come up with a design that makes both of these camps happy and that should be our goal here. To that end... what if... interface Store { void putItem(string name, string value); string getItem(string name); // calling getItem multiple times prior to script completion with the same name is guaranteed to return the same value // (unless the current script had called putItem; if a different script had called putItem concurrently, the current script won't see that) void transact(func transactCallback); // is not guaranteed to execute if the page is unloaded prior to the lock being acquired // is guaranteed to NOT execute if called from within onunload // but... really... if you need transactional semantics, maybe you should be using a Database?
attribute int length; // may only be accessed within a transactCallback, otherwise throws an exception string getItemByIndex(int i); // may only be accessed within a transactCallback, otherwise throws an exception }; document.cookie; // has the same safe-to-read-multiple-times semantics as store.getItem() So there are no locking semantics (outside of the transact method)... and multiple reads are not error prone. WDYT?
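To make the proposed semantics concrete, here is a toy in-memory model of the Store interface sketched above. It is only an illustration of the intended usage (the makeStore factory is invented; a real implementation would live inside the browser and actually take the storage lock inside transact):

```javascript
// Toy model of the proposed Store interface: getItem/putItem work
// anywhere, while length and getItemByIndex are restricted to a
// transact() callback, matching the comments in the proposal.
function makeStore() {
  const data = new Map();
  let inTransaction = false;
  return {
    putItem(name, value) { data.set(name, value); },
    getItem(name) { return data.get(name); }, // stable within one script
    transact(callback) {
      inTransaction = true; // real impl: acquire the storage lock here
      try { callback(this); } finally { inTransaction = false; }
    },
    get length() {
      if (!inTransaction) throw new Error('only inside transact()');
      return data.size;
    },
    getItemByIndex(i) {
      if (!inTransaction) throw new Error('only inside transact()');
      return [...data.values()][i];
    },
  };
}
```

Outside a transaction, iteration throws; inside one, the script holds the lock and can enumerate a consistent snapshot.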
Re: [whatwg] Worker feedback
On Thu, Apr 2, 2009 at 7:18 AM, Michael Nordman micha...@google.com wrote: I suggest that we can come up with a design that makes both of these camps happy and that should be our goal here. To that end... what if... interface Store { void putItem(string name, string value); string getItem(string name); // calling getItem multiple times prior to script completion with the same name is guaranteed to return the same value // (unless the current script had called putItem; if a different script had called putItem concurrently, the current script won't see that) void transact(func transactCallback); // is not guaranteed to execute if the page is unloaded prior to the lock being acquired // is guaranteed to NOT execute if called from within onunload // but... really... if you need transactional semantics, maybe you should be using a Database? attribute int length; // may only be accessed within a transactCallback, otherwise throws an exception string getItemByIndex(int i); // may only be accessed within a transactCallback, otherwise throws an exception }; document.cookie; // has the same safe-to-read-multiple-times semantics as store.getItem() So there are no locking semantics (outside of the transact method)... and multiple reads are not error prone. WDYT? getItem stability is helpful for read-only scripts but no help for read-write scripts. For example, outside a transaction, two scripts doing putItem('x', getItem('x') + 1) can race and lose an increment. Even for read-only scripts, you have the problem that reading multiple values isn't guaranteed to give you a consistent state. So this isn't much better than doing nothing for the default case. (Note that you can provide hen read-only scripts are easy to optimize for full parallelism using ) Forcing iteration to be inside a transaction isn't compatible with existing localStorage either. Addressing the larger context ...
More than anything else, I'm channeling my experiences at IBM Research writing race detection tools for Java programs ( http://portal.acm.org/citation.cfm?id=781528 and others), and what I learned there about programmers with a range of skill levels grappling with shared memory (or in our case, shared storage) concurrency. I passionately, violently believe that Web programmers cannot and should not have to deal with it. It's simply a matter of implementing what programmers expect: that by default, a chunk of sequential code will do what it says without (occasional, random) interference from outside. I realize that this creates major implementation difficulties for parallel browsers, which I believe will be all browsers. 'Evil', 'troubling' and 'onerous' are perhaps understatements... But it will be far better in the long run to put those burdens on browser developers than to kick them upstairs to Web developers. If it turns out that there is a compelling performance boost that can *only* be achieved by relaxing serializability, then I could be convinced ... but we are very far from proving that. Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
Re: [whatwg] Worker feedback
On Thu, Apr 2, 2009 at 11:02 AM, Robert O'Callahan rob...@ocallahan.org wrote: (Note that you can provide hen read-only scripts are easy to optimize for full parallelism using ) Oops! I was going to point out that you can use a reader/writer lock to implement serializability while allowing read-only scripts to run in parallel, so if the argument is that most scripts are read-only then that means it shouldn't be hard to get pretty good parallelism. Rob
Re: [whatwg] Worker feedback
On Mon, Mar 30, 2009 at 6:45 PM, Robert O'Callahan rob...@ocallahan.org wrote: We have no way of knowing how much trouble this has caused so far; non-reproducibility means you probably won't get a good bug report for any given incident. It's even plausible that people are getting lucky with cookie races almost all the time, or maybe cookies are usually used in a way that makes them a non-issue. That doesn't mean designing cookie races in is a good idea. So, the first argument against cookie races was "this is the way the web works now - if we introduce cookie races, we'll break the web." When this was proven to be incorrect (IE does not enforce exclusive access to cookies), the argument has now morphed to "the web is breaking right now and nobody notices," which is more an article of faith than anything else. I agree that designing cookie races is not a good idea. If we could go back in time, we might design a better API for cookies that didn't introduce race conditions. However, given where we are today, I'd say that sacrificing performance in the form of preventing parallel network calls/script execution in order to provide theoretical correctness for an API that is already quite happily race-y is not a good tradeoff. In this case, I think the spec should describe the current implementation of cookies, warts and all. -atw
Re: [whatwg] Worker feedback
On Wed, Apr 1, 2009 at 7:27 AM, Drew Wilson atwil...@google.com wrote: So, the first argument against cookie races was "this is the way the web works now - if we introduce cookie races, we'll break the web." When this was proven to be incorrect (IE does not enforce exclusive access to cookies), the argument has now morphed to "the web is breaking right now and nobody notices," which is more an article of faith than anything else. We know for sure it's possible to write scripts with racy behaviour, so the question is whether this ever occurs in the wild. You're claiming it does not, and I'm questioning whether you really have that data. I agree that designing cookie races is not a good idea. If we could go back in time, we might design a better API for cookies that didn't introduce race conditions. However, given where we are today, I'd say that sacrificing performance in the form of preventing parallel network calls/script execution in order to provide theoretical correctness for an API that is already quite happily race-y is not a good tradeoff. We don't know how much (if any) performance must be sacrificed, because no-one's tried to implement parallel cookie access with serializability guarantees. So I don't think we can say what the correct tradeoff is. In this case, I think the spec should describe the current implementation of cookies, warts and all. You mean IE and Chrome's implementation, I presume, since Firefox and Safari do not allow cookies to be modified during script execution AFAIK. Do we know exactly what IE7, IE8 and Chrome guarantee around parallel cookie access? Rob
Re: [whatwg] Worker feedback
On Tue, Mar 31, 2009 at 6:25 PM, Robert O'Callahan rob...@ocallahan.org wrote: We know for sure it's possible to write scripts with racy behaviour, so the question is whether this ever occurs in the wild. You're claiming it does not, and I'm questioning whether you really have that data. I'm not claiming it *never* occurs, because in the vasty depths of the internet I suspect *anything* can be found. Also, my rhetorical powers aren't up to the task of constructing a negative proof :) We don't know how much (if any) performance must be sacrificed, because no-one's tried to implement parallel cookie access with serializability guarantees. So I don't think we can say what the correct tradeoff is. The spec as proposed states that script that accesses cookies cannot operate in parallel with network access on those same domains. The performance impact of something like this is pretty clear, IMO - we don't need to implement it and measure it to know it exists and in some situations could be significant. You mean IE and Chrome's implementation, I presume, since Firefox and Safari do not allow cookies to be modified during script execution AFAIK. I think the old spec language captured the intent quite well - document.cookie is a snapshot of an inherently racy state, which is the set of cookies that would be sent with a network call at that precise instant. Due to varying browser implementations, that state may be less racy on some browsers than on others, but the general model was one without guarantees. I understand the philosophy behind serializing access to shared state, and I agree with it in general. But I think we need to make an exception in the case of document.cookie based on current usage and expected performance impact (since it impacts our ability to parallelize network access and script execution). 
In this case, the burden of proof has to fall on those trying to change the spec - I think we need a compelling real-world argument why we should be making our browsers slower. The pragmatic part of my brain suggests that we're trying to solve a problem that exists in theory, but which doesn't actually happen in practice. Anyhow, at this point I think we're just going around in circles about this - I'm not sure that either of us are going to convince the other, so I'll shut up now and let others have the last word :) -atw
Re: [whatwg] Worker feedback
On Fri, Mar 27, 2009 at 6:23 PM, Ian Hickson i...@hixie.ch wrote: Another use case would be keeping track of what has been done so far; for this I guess it would make sense to have a localStorage API for shared workers (scoped to their name). I haven't added this yet, though. On a related note, I totally understand the desire to protect developers from race conditions, so I understand why we've removed localStorage access from dedicated workers. In the past we've discussed having synchronous APIs for structured storage that only workers can use - it's a much more convenient API, particularly for applications porting to HTML5 structured storage from Gears. It sounds like if we want to support these APIs in workers, we'd need to enforce the same kind of serializability guarantees that we have for localStorage in browser windows (i.e. add some kind of structured storage mutex similar to the localStorage mutex). Gears had an explicit permissions variable applications could check, which seems valuable - do we do anything similar elsewhere in HTML5 that we could use as a model here? HTML5 so far has avoided anything that requires explicit permission grants, because they are generally a bad idea from a security perspective (users will grant any permissions the system asks them for). The Database spec has a strong implication that applications can request a larger DB quota, which will result in the user being prompted for permission either immediately, or at the point that the default quota is exceeded. So it's not without precedent, I think. Or maybe I'm just misreading this: User agents are expected to use the display name and the estimated database size to optimize the user experience. For example, a user agent could use the estimated size to suggest an initial quota to the user. 
This allows a site that is aware that it will try to use hundreds of megabytes to declare this upfront, instead of the user agent prompting the user for permission to increase the quota every five megabytes. There are many ways to expose this, e.g. asynchronously as a drop-down infobar, or as a pie chart showing the disk usage that the user can click on to increase the allocation whenever they want, etc. Certainly. I actually think we're in agreement here - my point is not that you need a synchronous permission grant (since starting up a worker is an inherently asynchronous operation anyway) - just that there's precedent in the spec for applications to request access to resources (storage space, persistent workers) that are not necessarily granted to all sites by default. It sounds like the specifics of how the UA chooses to expose this access control (pie charts, async dropdowns, domain whitelists, trusted zones with security levels) are left to the individual implementation. Re: cookies I suppose that network activity should also wait for the lock. I've made that happen. Seems like that would restrict parallelism between network loads and executing javascript, which seems like the wrong direction to go. It feels like we are jumping through hoops to protect running script from having document.cookie modified out from underneath it, and now some of the ramifications may have real performance impacts. From a pragmatic point of view, I just want to remind people that many current browsers do not make these types of guarantees about document.cookie, and yet the tubes have not imploded. Cookies have a cross-domain aspect (multiple subdomains can share cookie state at the top domain) - does this impact the specification of the storage mutex since we need to lockout multiple domains? There's only one lock, so that should work fine. OK, I was assuming a single per-domain lock (ala localStorage) but it sounds like there's a group lock, cross-domain. 
This makes it even more onerous if network activity across all related domains has to serialize on a single lock. -atw
Re: [whatwg] Worker feedback
On Tue, Mar 31, 2009 at 7:22 AM, Drew Wilson atwil...@google.com wrote: Re: cookies I suppose that network activity should also wait for the lock. I've made that happen. Seems like that would restrict parallelism between network loads and executing javascript, which seems like the wrong direction to go. It feels like we are jumping through hoops to protect running script from having document.cookie modified out from underneath it, and now some of the ramifications may have real performance impacts. From a pragmatic point of view, I just want to remind people that many current browsers do not make these types of guarantees about document.cookie, and yet the tubes have not imploded. We have no way of knowing how much trouble this has caused so far; non-reproducibility means you probably won't get a good bug report for any given incident. It's even plausible that people are getting lucky with cookie races almost all the time, or maybe cookies are usually used in a way that makes them a non-issue. That doesn't mean designing cookie races in is a good idea. Cookies have a cross-domain aspect (multiple subdomains can share cookie state at the top domain) - does this impact the specification of the storage mutex since we need to lockout multiple domains? There's only one lock, so that should work fine. OK, I was assuming a single per-domain lock (ala localStorage) but it sounds like there's a group lock, cross-domain. This makes it even more onerous if network activity across all related domains has to serialize on a single lock. It doesn't have to. There are lots of ways to optimize here. Rob
Re: [whatwg] Worker feedback
I think it makes sense to treat dedicated workers as simple subresources, not separate browsing contexts, and that they should thus just use the application cache of their parent browsing contexts. This is what WebKit does, according to ap. I've now done this in the spec. Sounds good. I'd phrase it a little differently though. Dedicated workers do have a browsing context that is distinct from their parent's, but the appcache selected for a dedicated worker context is identical to the appcache selected for the parent's context. For shared workers, I see these options:
- Not allow app caches, so shared workers don't work when offline. That seems bad.
- Same as suggested for dedicated workers above -- use the creator's cache, so at least one client will get the version they expect. Other clients will have no idea what version they're talking to, the creator would have an unusual relationship with the worker (it would be able to call swapCache() but nobody else would), and once the creator goes away, there will be a zombie relationship.
- Pick an appcache more or less at random, like when you view an image in a top-level browsing context. Clients will have no idea which version they're talking to.
- Allow workers to specify a manifest using some sort of comment syntax. Nobody knows what version they'll get, but at least it's always the same version, and it's always up to date.
Using the creator's cache is the one that minimises the number of clients that are confused, but it also makes the debugging experience differ most from the case where there are two apps using the worker. Using an appcache selected the same way we would pick one for images has the minor benefit of being somewhat consistent with how window.open() works, and we could say that window.open() and new SharedWorker are somewhat similar. I have picked this route for now. Implementation feedback is welcome in determining if this is a good idea. Sounds good for now. 
Ultimately, I suspect that additionally allowing workers to specify a manifest using some sort of syntax may be the right answer. That would put cache selection for shared workers on par with how cache selection works for pages (not just images) opened via window.open. As 'page' cache selection is refined due to experience with this system, those same refinements would also apply to 'shared worker' cache selection.
Re: [whatwg] Worker feedback
On Sat, Mar 28, 2009 at 2:23 PM, Ian Hickson i...@hixie.ch wrote: Robert O'Callahan wrote: Now, with the storage mutex, are there any cases you know of where serializability fails? If there are, it may be worth noting them in the spec. If there aren't, why not simply write serializability into the spec? Just writing that something must be true doesn't make it true. :-) I think it's safer for us to make the design explicitly enforce this rather than say that browser vendors must figure out where it might be broken and enforce it themselves. If serializability is the goal then I think it can only help to say so in the spec (in addition to whatever explicit design you wish to include), so that any failure of serializability is clearly an inconsistency in the spec that must be fixed rather than a loophole that authors and browser vendors might think they can rely on. I also suggest that speccing just serializability should be fine. It seems to me the current spec is proposing one implementation of serializability while other implementations are possible, and relying on the black-box equivalence principle to enable other implementations. But specifying serializability is probably simpler and may allow implementations that are unintentionally ruled out by the explicit design in the spec, especially as things become more complicated in the future. It would probably also be clearer to authors what they can expect. I think it's a lot like GC; we don't specify a GC algorithm, even though GC is hard; we just have an implicit specification that objects don't disappear arbitrarily. Rob
Re: [whatwg] Worker feedback
On 28.03.2009, at 4:23, Ian Hickson wrote: I think, given text/css, text/html, and text/xml all have character encoding declarations inline, transcoding is not going to work in practice. I think the better solution would be to remove the rules that make text/* an issue in the standards world (it's not an issue in the real world). In fact, transcoding did work in practice - that's because HTTP headers override inline character declarations. For new formats, though, I think just supporting UTF-8 is a big win. Could you please clarify what the win is? Disregarding charset from HTTP headers is just a weird special case for a few text resource types. If we were going to deprecate HTML, XML and CSS, but keep appcache manifest going forward, it could maybe make sense. - WBR, Alexey Proskuryakov
Re: [whatwg] Worker feedback
Scripts, and worker scripts in particular, should use an application media type (application/javascript); using text/javascript is obsolete [RFC 4329, section 3]. Chris
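Concretely, under RFC 4329 a server delivering a worker script would send a response header along these lines (illustrative fragment):

```http
Content-Type: application/javascript; charset=utf-8
```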