Re: [whatwg] Worker feedback

2009-04-28 Thread Ian Hickson
On Sat, 28 Mar 2009, Robert O'Callahan wrote:
 On Sat, Mar 28, 2009 at 2:23 PM, Ian Hickson i...@hixie.ch wrote:
  Robert O'Callahan wrote:
   Now, with the storage mutex, are there any cases you know of where 
   serializability fails? If there are, it may be worth noting them in 
   the spec. If there aren't, why not simply write serializability into 
   the spec?
 
  Just writing that something must be true doesn't make it true. :-) I 
  think it's safer for us to make the design explicitly enforce this 
  rather than say that browser vendors must figure out where it might be 
  broken and enforce it themselves.
 
 If serializability is the goal then I think it can only help to say so 
 in the spec (in addition to whatever explicit design you wish to 
 include), so that any failure of serializability is clearly an 
 inconsistency in the spec that must be fixed rather than a loophole that 
 authors and browser vendors might think they can rely on.

Done.


 I also suggest that speccing just serializability should be fine.

The problem is that this is specifying an anti-requirement, which 
doesn't really help in defining what the behaviour _should_ be like. It 
doesn't tell us what the order of events should be, for instance, just 
that some order should exist.


 It seems to me the current spec is proposing one implementation of 
 serializability while other implementations are possible, and relying on 
 the black-box equivalence principle to enable other implementations. But 
 specifying serializability is probably simpler and may allow 
 implementations that are unintentionally ruled out by the explicit 
 design in the spec, especially as things become more complicated in the 
 future. It would probably also be clearer to authors what they can 
 expect.

What kind of implementations are unintentionally ruled out that you think 
should not be ruled out?


 I think it's a lot like GC; we don't specify a GC algorithm, even though 
 GC is hard; we just have an implicit specification that objects don't 
 disappear arbitrarily.

It's explicit now, actually (see 2.9.8 Garbage collection, 5.3.5 Garbage 
collection and browsing contexts, 7.3.3.1 Ports and garbage collection, 
and similar sections in the Event Source, Workers, and Web Sockets specs).


On Sat, 28 Mar 2009, Alexey Proskuryakov wrote:
 On 28.03.2009, at 4:23, Ian Hickson wrote:
  
  I think, given text/css, text/html, and text/xml all have character 
  encoding declarations inline, transcoding is not going to work in 
  practice. I think the better solution would be to remove the rules 
  that make text/* an issue in the standards world (it's not an issue in 
  the real world).
 
 In fact, transcoding did work in practice - that's because HTTP headers 
 override inline character declarations.

It worked for as long as the HTTP override was around to override, but as 
soon as the user saves the file to disk, or some such, it fails.


  For new formats, though, I think just supporting UTF-8 is a big win.
 
 Could you please clarify what the win is?

It's massively simpler to not have to deal with the horrors of character 
encodings.


 Disregarding charset from HTTP headers is just a weird special case for 
 a few text resource types. If we were going to deprecate HTML, XML and 
 CSS, but keep appcache manifest going forward, it could maybe make 
 sense.

What's the advantage of introducing all the pain and suffering that 
encodings will inevitably bring with them to the cache manifest format?



On Sat, 28 Mar 2009, Kristof Zelechovski wrote:

 Scripts, and worker scripts in particular, should use application media 
 type; using text/javascript is obsolete. [RFC4329#3].

IMHO RFC4329 is silly.


On Mon, 30 Mar 2009, Drew Wilson wrote:

 In the past we've discussed having synchronous APIs for structured 
 storage that only workers can use - it's a much more convenient API, 
 particularly for applications porting to HTML5 structured storage from 
 Gears. It sounds like if we want to support these APIs in workers, we'd 
 need to enforce the same kind of serializability guarantees that we have 
 for localStorage in browser windows (i.e. add some kind of structured 
 storage mutex similar to the localStorage mutex).

This API now exists.

I don't think it causes any particular serialization problems. The only 
issue seems to be what happens if a worker grabs the write lock to a 
database and then doesn't release it; but then all it will do is cause the 
browsing contexts that are waiting for that lock to never call the 
relevant callback (and the sync workers from that domain to block), so it 
doesn't seem like a huge deal. (It's still serialisable; there's just a 
big wait in there!)
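
For context, a minimal sketch of what such a synchronous call looks like
from inside a worker, assuming the Web Database openDatabaseSync() API as
drafted at the time (the database name and schema are invented for
illustration):

  var db = openDatabaseSync('notes', '1.0', 'Example note store', 1024 * 1024);
  db.transaction(function (tx) {
    // The write lock is held for the duration of this callback; a callback
    // that never returned would block other contexts, as described above.
    tx.executeSql('CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)');
    tx.executeSql('INSERT INTO notes (body) VALUES (?)', ['hello']);
  });
  // Lock released here.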



  Re: cookies I suppose that network activity should also wait for the 
  lock. I've made that happen.
 
 Seems like that would restrict parallelism between network loads and 
 executing javascript, which seems like the wrong direction to go.

I agree 

Re: [whatwg] Worker feedback

2009-04-28 Thread Alexey Proskuryakov


On 29.04.2009, at 6:05, Ian Hickson wrote:

Disregarding charset from HTTP headers is just a weird special case for
a few text resource types. If we were going to deprecate HTML, XML and
CSS, but keep appcache manifest going forward, it could maybe make
sense.


What's the advantage of introducing all the pain and suffering that
encodings will inevitably bring with them to the cache manifest format?



Just what I said before - the ability to use the same code path for  
decoding manifests as for decoding other types of resources. It's a  
minor benefit, admittedly, but it's a potential issue at all stages -  
from generating content to checking it with automated tools to  
consuming it.


For authors and admins, it may be a nuisance to maintain a UTF-8 text
file if the rest of the site is in a different encoding.


- WBR, Alexey Proskuryakov




Re: [whatwg] Worker feedback

2009-04-28 Thread Ian Hickson
On Wed, 29 Apr 2009, Alexey Proskuryakov wrote:
 On 29.04.2009, at 6:05, Ian Hickson wrote:
 
   Disregarding charset from HTTP headers is just a weird special case 
   for a few text resource types. If we were going to deprecate HTML, 
   XML and CSS, but keep appcache manifest going forward, it could 
   maybe make sense.
  
  What's the advantage of introducing all the pain and suffering that 
  encodings will inevitably bring with them to the cache manifest 
  format?
 
 Just what I said before - the ability to use the same code path for 
 decoding manifests as for decoding other types of resources. It's a 
 minor benefit, admittedly, but it's a potential issue at all stages - 
 from generating content to checking it with automated tools to consuming 
 it.
 
 For authors and admins, it may be a nuisance to maintain a UTF-8 text 
 file if the rest of the site is in a different encoding.

I believe the long-term benefit of not having to deal with encodings, 
ever, for manifests, outweighs the medium-term benefit of people using 
non-UTF-8 elsewhere. Non-UTF-8 encodings are dropping in usage.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Worker feedback

2009-04-07 Thread Darin Fisher
On Mon, Apr 6, 2009 at 8:57 PM, timeless timel...@gmail.com wrote:

 FWIW, IIRC multiple processes in IE date back to at least IE4

 The best url I can find on the subject atm is
 http://aroundcny.com/technofile/texts/bit092098.html.

 Michael Nordman micha...@google.com wrote:
  There are additional constraints that haven't been mentioned yet...
 Plugins.
  The current model for plugins is that they execute in a single-threaded
  world. Chrome maintains that model by hosting each plugin in its own
 process
  and RPC'ing method invocations back and forth between calling pages and
 the
  plugin instances. All plugin instances (of a given plugin) reside on the
  same thread.

 Robert O'Callahan rob...@ocallahan.org wrote:
  Why can't instances of a plugin in different browser contexts be hosted
  in separate processes?

 Michael Nordman micha...@google.com wrote:
  It would be expensive, and I think this would have some correctness
  issues too, depending on the plugin. Some plugins depend on instances
  knowing about each other and interoperating with each other out of band
  of any DOM-based means of doing so.

 Michael Nordman micha...@google.com wrote:
  And others probably assume they have exclusive access to mutable plugin
  resources on disk.

 This seems unlikely. I can run Firefox, Safari, Chrome, IE, Opera, and
 other browsers at the same time; heck, I can run multiple profiles of
 a couple of these (I can't find the option in the current version of
 Chrome, but I used it before).


chrome.exe --user-data-dir=c:\foo


Re: [whatwg] Worker feedback

2009-04-06 Thread Darin Fisher
On Fri, Apr 3, 2009 at 2:49 PM, Robert O'Callahan rob...@ocallahan.orgwrote:

 On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote:

 If I understood the discussion correctly, the spec for document.cookie
 never stated anything about it being immutable while a script is running.


 Well, there never was a decent spec for document.cookie for most of its
 life, and even if there had been, no implementations allowed asynchronous
 changes to cookies while a script was running (except for maybe during
 alert()) and no-one really thought about it. Was this even identified as a
 possible issue during Chrome development?



In addition to alert(), don't forget about all the great state-changing
things that can happen to the cookie database (and other data stores) during
a synchronous XMLHttpRequest (or synchronous document.load) in Firefox.
 Maybe those are just bugs?  What if a Firefox extension wants to muck
around with the cookie database while a web page is blocked on a synchronous
XMLHttpRequest?  Maybe that should fail to avoid deadlocking?  Sounds like
a recipe for flaky extensions, since it is unlikely that the extension author
would have been prepared to be called at this time, when access to the
cookie database would have to be denied.

(In Firefox, a new event loop is run to continue processing events while
that synchronous XMLHttpRequest is active.  That event loop helps keep the
application alive and responsive to user action.)

When deciding how to handle cookies in Chrome, we did not worry about the
problem being debated here.  Our concerns were allayed by recognizing that
IE does not try to solve it (and IE6 is multi-process just like Chrome with
a shared network stack), so clearly web developers must already have to
cope.  We flirted with the idea of letting each renderer maintain a local
copy of its cookies, but that turned out to be more complicated than was
necessary. In the end, we ended up synchronizing with the main process on
each call to document.cookie to fetch a snapshot.

I think it would be best to specify that document.cookie returns a snapshot.
 I think that is consistent with existing implementations including IE,
Firefox, and Chrome.  I don't know about Safari and Opera, but it seems
plausible that they could have similar behavior thanks to nested event
queues which are typically used to support synchronous XHR and
window.alert().
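
A hedged sketch of what "returns a snapshot" means for scripts: consistency
is per access rather than per script, so even two adjacent reads may
disagree if another process writes in between.

  // Under snapshot-per-access semantics, each read fetches a fresh copy
  // from the browser process.
  var first = document.cookie;
  var second = document.cookie;  // may differ from `first` if another
                                 // process wrote cookies in the meantime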

You would be surprised by the number of times it comes up that web
developers at Google think Firefox has multi-threaded JS thanks to this
behavior of synchronous XHR ;-)

-Darin





 People are now talking about specifying this, but there's been push back.
 Also, there's no way to guarantee serializability for the network traffic
 portion so I'm guessing (hoping!) that this wouldn't be required in the
 JavaScript side, even if it went through.


 What exactly do you mean by that? It's easy to guarantee that reading the
 cookies to send with an HTTP request is an atomic operation, and writing
 them as a result of an HTTP response is an atomic operation.

 The spec is written in such a way that you can't have more than one event
 loop per browser window/worker, and everything is essentially tied to this
 one event loop.  In other words, each window/worker can't run on more than
 one CPU core at a time.  Thus, the only way for a web application to scale
 in today's world is going to be through additional windows and/or workers.


 Depending on exactly what you mean by a Web application, that's not
 really true. There are a variety of ways to exploit multicore parallelism
 within a window with the current set of specs, at least in principle.




Re: [whatwg] Worker feedback

2009-04-06 Thread Ian Hickson
On Mon, 6 Apr 2009, Darin Fisher wrote:
 
 In addition to alert(), don't forget about all the great state-changing 
 things that can happen to the cookie database (and other data stores) 
 during a synchronous XMLHttpRequest (or synchronous document.load) in 
 Firefox. Maybe those are just bugs?

The HTML5 spec says the storage mutex is released when alert() is called. 
I've asked Anne (editor of the XHR spec) to say that it is released when a 
sync XHR is started, too. Per the HTML5 spec, setting the cookies from the 
network grabs the storage mutex briefly. (Reading them is implicitly 
atomic, but might happen while someone else holds the mutex, so per spec 
there is still a chance of the cookies sent to the server being in an 
inconsistent state if they are read while a script is in the middle of a 
multi-stage cookie update.)
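
To make that window concrete, a sketch of the kind of multi-stage update
described above (cookie names invented for illustration); an HTTP request
whose cookie read is serialized between the two writes sees a mixed state:

  // The pair (part1, part2) is only meaningful when updated together.
  document.cookie = 'part1=new; path=/';
  // An outgoing request here would send the new part1 with the old part2.
  document.cookie = 'part2=new; path=/';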

I don't really mind whether the spec says cookies should be protected 
by the storage mutex or not (the spec currently says they should be, 
because that seems to be the majority opinion). I'm pretty sure 
localStorage should be so protected, though. I don't really see how to get 
away from that.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Worker feedback

2009-04-06 Thread Robert O'Callahan
On Mon, Apr 6, 2009 at 7:03 PM, Darin Fisher da...@chromium.org wrote:

 On Fri, Apr 3, 2009 at 2:49 PM, Robert O'Callahan rob...@ocallahan.orgwrote:

 On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote:

 If I understood the discussion correctly, the spec for document.cookie
 never stated anything about it being immutable while a script is running.


 Well, there never was a decent spec for document.cookie for most of its
 life, and even if there had been, no implementations allowed asynchronous
 changes to cookies while a script was running (except for maybe during
 alert()) and no-one really thought about it. Was this even identified as a
 possible issue during Chrome development?


 In addition to alert(), don't forget about all the great state-changing
 things that can happen to the cookie database (and other data stores) during
 a synchronous XMLHttpRequest (or synchronous document.load) in Firefox.
  Maybe those are just bugs?  What if a Firefox extension wants to muck
 around with the cookie database while a web page is blocked on a synchronous
 XMLHttpRequest?  Maybe that should fail to avoid deadlocking?  Sounds like
 a recipe for flaky extensions, since it is unlikely that the extension author
 would have been prepared to be called at this time, when access to the
 cookie database would have to be denied.


According to the spec the storage mutex is dropped for blocking operations
like alert() and sync XHR, and as you know, that's effectively what we do.

But the general rule of DOM API design is that operations do not block and
we offer asynchronous APIs instead. alert() and sync XHR are exceptions to
this rule, but they're ugly stepchildren of DOM APIs and we don't want to
treat them as norms.
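
A hedged illustration of that rule: because the storage mutex is released
around blocking calls, the two reads below are not guaranteed to match even
though no other statement sits between them.

  var before = document.cookie;
  alert('The storage mutex is released while this dialog is up.');
  // Another page, worker, or network response may have changed the cookies.
  var after = document.cookie;
  // before !== after is permitted here; alert() and sync XHR are the
  // blocking exceptions discussed above.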

When deciding how to handle cookies in Chrome, we did not worry about the
 problem being debated here.  Our concerns were allayed by recognizing that
 IE does not try to solve it (and IE6 is multi-process just like Chrome with
 a shared network stack), so clearly web developers must already have to
 cope.


You mean IE8.

How would Web developers cope? There's no way to synchronize. I doubt more
than a handful of Web developers even know this problem could exist.

I think it would be best to specify that document.cookie returns a snapshot.
  I think that is consistent with existing implementations including IE,
 Firefox, and Chrome.


Not at all. In Firefox, cookies don't change while a script is running, as
long as it doesn't call the handful of blocking DOM APIs (such as alert() or
sync XHR); we satisfy the current spec.

The insidious part is that almost all the time, IE and Chrome will also be
observed to obey the spec; when a quick cookie-read-modify-write script
runs, it is very unlikely cookies will change underneath it. (Is it possible
people don't write such scripts?)
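
For concreteness, a quick cookie-read-modify-write script of the kind being
discussed might look like this (the visits cookie is invented for
illustration); run concurrently in two tabs, one increment can be lost:

  // Read-modify-write on a counter cookie: not atomic under IE/Chrome-style
  // snapshot semantics.
  var match = /(?:^|; )visits=(\d+)/.exec(document.cookie);
  var visits = match ? parseInt(match[1], 10) : 0;
  // If another tab runs the same code between the read above and the write
  // below, both tabs store the same value and one increment disappears.
  document.cookie = 'visits=' + (visits + 1) + '; path=/';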

Maybe we need dynamic race detection for Web browsers. After a script reads
document.cookie, stall for a while to give network transactions or scripts
running in other threads a chance to change the cookies so the original
script carries on with wrong data.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-06 Thread Darin Fisher
On Mon, Apr 6, 2009 at 4:20 AM, Robert O'Callahan rob...@ocallahan.orgwrote:

 On Mon, Apr 6, 2009 at 7:03 PM, Darin Fisher da...@chromium.org wrote:

 On Fri, Apr 3, 2009 at 2:49 PM, Robert O'Callahan 
 rob...@ocallahan.orgwrote:

 On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote:

 If I understood the discussion correctly, the spec for document.cookie
 never stated anything about it being immutable while a script is running.


 Well, there never was a decent spec for document.cookie for most of its
 life, and even if there had been, no implementations allowed asynchronous
 changes to cookies while a script was running (except for maybe during
 alert()) and no-one really thought about it. Was this even identified as a
 possible issue during Chrome development?


 In addition to alert(), don't forget about all the great state-changing
 things that can happen to the cookie database (and other data stores) during
 a synchronous XMLHttpRequest (or synchronous document.load) in Firefox.
  Maybe those are just bugs?  What if a Firefox extension wants to muck
 around with the cookie database while a web page is blocked on a synchronous
 XMLHttpRequest?  Maybe that should fail to avoid deadlocking?  Sounds like
 a recipe for flaky extensions, since it is unlikely that the extension author
 would have been prepared to be called at this time, when access to the
 cookie database would have to be denied.


 According to the spec the storage mutex is dropped for blocking operations
 like alert() and sync XHR, and as you know, that's effectively what we do.

 But the general rule of DOM API design is that operations do not block and
 we offer asynchronous APIs instead. alert() and sync XHR are exceptions to
 this rule, but they're ugly stepchildren of DOM APIs and we don't want to
 treat them as norms.


OK... so if I am building an API, the consumer of my API might not realize
that I have stuck a sync XHR in the middle of it.  (People often do that so
that their API can work during unload.)  So the consumer of such an API now
has to deal with the cookie lock being released?





 When deciding how to handle cookies in Chrome, we did not worry about the
 problem being debated here.  Our concerns were allayed by recognizing that
 IE does not try to solve it (and IE6 is multi-process just like Chrome with
 a shared network stack), so clearly web developers must already have to
 cope.


 You mean IE8.


No, IE6, 7, and 8 (maybe older versions too?) ... you can launch multiple IE6
processes, and those share cookies.  You can also programmatically access
the same cookies via WinInet from any application.  It is not uncommon for a
separate application to be mucking around with cookies for intranet.com.




 How would Web developers cope? There's no way to synchronize. I doubt more
 than a handful of Web developers even know this problem could exist.


You can synchronize through the origin server...

What I meant was that they cope by not expecting document.cookie to return
the same results each time it is called.  I'd imagine it is not uncommon for
users to log in to a site in multiple windows and perform similar operations
in each browser window.  That scenario seems like it could trigger what we
have here.





 I think it would be best to specify that document.cookie returns a
 snapshot.  I think that is consistent with existing implementations
 including IE, Firefox, and Chrome.


 Not at all. In Firefox, cookies don't change while a script is running, as
 long as it doesn't call the handful of blocking DOM APIs (such as alert() or
 sync XHR); we satisfy the current spec.


I don't understand why the sync XHR exception is taken so lightly.  As I
mention above, that is most frequently used as a
transparent-to-the-rest-of-the-application way of communicating with the
server (usually because some APIs cannot be easily changed or need to be
available during unload).  Yet, here we are saying that that cannot be
transparent because of this locking issue.





 The insidious part is that almost all the time, IE and Chrome will also be
 observed to obey the spec; when a quick cookie-read-modify-write script
 runs, it is very unlikely cookies will change underneath it. (Is it possible
 people don't write such scripts?)


I'm sure people write cookie-read-modify-write scripts and don't realize the
potential problems.  But I suspect the incidence of problems related to two
scripts doing so is so low as not to matter to application
developers.  They can just say: "opening our webmail program in two browser
tabs at the same time is not supported."





 Maybe we need dynamic race detection for Web browsers. After a script reads
 document.cookie, stall for a while to give network transactions or scripts
 running in other threads a chance to change the cookies so the original
 script carries on with wrong data.


Sounds interesting, but what happens when the script writes cookies?  Now
there is a merging problem :(

Re: [whatwg] Worker feedback

2009-04-06 Thread Michael Nordman
There are additional constraints that haven't been mentioned yet...
Plugins.

The current model for plugins is that they execute in a single-threaded
world. Chrome maintains that model by hosting each plugin in its own process
and RPC'ing method invocations back and forth between calling pages and the
plugin instances. All plugin instances (of a given plugin) reside on the
same thread.

Consider three threads:

PageA
PageB
PluginC

PageA
-grabs storage lock

PluginC
-calls out to PageB (everything in NPAPI is synchronous)
-now waiting for PageB to return

PageB
-while handling the plugin's callback, attempts to grab the storage lock
-BLOCKED waiting for PageA to release it

PageA
-calls plugin (sync method call)
-BLOCKED waiting indirectly for PageB

== DEADLOCK


Re: [whatwg] Worker feedback

2009-04-06 Thread Robert O'Callahan
On Tue, Apr 7, 2009 at 1:53 AM, Darin Fisher da...@chromium.org wrote:

 OK... so if I am building an API, the consumer of my API might not realize
 that I have stuck a sync XHR in the middle of it.  (People often do that so
 that their API can work during unload.)  So the consumer of such an API now
 has to deal with the cookie lock being released?


Yes. If sync XHR spins up a subsidiary event loop, the cookie lock is the
least of your worries, because event handlers may run and mutate arbitrary
script/DOM state. (We're actually tightening up what is allowed to run
during sync XHR in Gecko, but I don't know the details and I don't know what
other browsers do.)

APIs that can cause reentrancy, or block, or yield, need to be carefully
documented. That's why we want to minimize them...


 When deciding how to handle cookies in Chrome, we did not worry about the
 problem being debated here.  Our concerns were allayed by recognizing that
 IE does not try to solve it (and IE6 is multi-process just like Chrome with
 a shared network stack), so clearly web developers must already have to
 cope.


 You mean IE8.


 No, IE6, 7, and 8 (maybe older versions too?) ... you can launch multiple IE6
 processes, and those share cookies.  You can also programmatically access
 the same cookies via WinInet from any application.  It is not uncommon for a
 separate application to be mucking around with cookies for intranet.com.


OK, that's interesting.


 How would Web developers cope? There's no way to synchronize. I doubt more
 than a handful of Web developers even know this problem could exist.


 You can synchronize through the origin server...

 What I meant was that they cope by not expecting document.cookie to
 return the same results each time it is called.  I'd imagine it is not
 uncommon for users to log in to a site in multiple windows and perform
 similar operations in each browser window.  That scenario seems like it
 could trigger what we have here.


Many sites, such as my bank, detect that and attempt to prohibit it by
refusing to let more than one window work. I wonder if they use a
race-vulnerable cookie protocol to detect it...



 I think it would be best to specify that document.cookie returns a
 snapshot.  I think that is consistent with existing implementations
 including IE, Firefox, and Chrome.


 Not at all. In Firefox, cookies don't change while a script is running, as
 long as it doesn't call the handful of blocking DOM APIs (such as alert() or
 sync XHR); we satisfy the current spec.


 I don't understand why the sync XHR exception is taken so lightly.  As I
 mention above, that is most frequently used as a
 transparent-to-the-rest-of-the-application way of communicating with the
 server (usually because some APIs cannot be easily changed or need to be
 available during unload).  Yet, here we are saying that that cannot be
 transparent because of this locking issue.


Yes. Making sync-XHR transparent by reducing all consistency guarantees to
what we can provide around sync-XHR is the wrong direction to go IMHO.



 The insidious part is that almost all the time, IE and Chrome will also be
 observed to obey the spec; when a quick cookie-read-modify-write script
 runs, it is very unlikely cookies will change underneath it. (Is it possible
 people don't write such scripts?)


 I'm sure people write cookie-read-modify-write scripts and don't realize
 the potential problems.  But I suspect the incidence of problems related to
 two scripts doing so is so low as not to matter to
 application developers.  They can just say: "opening our webmail program in
 two browser tabs at the same time is not supported."


If they're not aware of the problem, why would they say that?


 Maybe we need dynamic race detection for Web browsers. After a script reads
 document.cookie, stall for a while to give network transactions or scripts
 running in other threads a chance to change the cookies so the original
 script carries on with wrong data.


 Sounds interesting, but what happens when the script writes cookies?  Now
 there is a merging problem :(


Oh, dynamic race detection is only good for finding bugs more easily, not
fixing them :-).

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-06 Thread Robert O'Callahan
On Tue, Apr 7, 2009 at 5:04 AM, Michael Nordman micha...@google.com wrote:

 There are additional constraints that haven't been mentioned yet...
 Plugins.

 The current model for plugins is that they execute in a single-threaded
 world. Chrome maintains that model by hosting each plugin in its own process
 and RPC'ing method invocations back and forth between calling pages and the
 plugin instances. All plugin instances (of a given plugin) reside on the
 same thread.


Why can't instances of a plugin in different browser contexts be hosted in
separate processes?

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-06 Thread Michael Nordman
On Mon, Apr 6, 2009 at 7:17 PM, Robert O'Callahan rob...@ocallahan.orgwrote:

 On Tue, Apr 7, 2009 at 5:04 AM, Michael Nordman micha...@google.comwrote:

 There are additional constraints that haven't been mentioned yet...
 Plugins.

 The current model for plugins is that they execute in a single-threaded
 world. Chrome maintains that model by hosting each plugin in its own process
 and RPC'ing method invocations back and forth between calling pages and the
 plugin instances. All plugin instances (of a given plugin) reside on the
 same thread.


 Why can't instances of a plugin in different browser contexts be hosted in
 separate processes?


It would be expensive, and I think this would have some correctness
issues too, depending on the plugin. Some plugins depend on instances knowing
about each other and interoperating with each other out of band of any
DOM-based means of doing so.







Re: [whatwg] Worker feedback

2009-04-06 Thread Michael Nordman
On Mon, Apr 6, 2009 at 7:28 PM, Michael Nordman micha...@google.com wrote:



 On Mon, Apr 6, 2009 at 7:17 PM, Robert O'Callahan rob...@ocallahan.orgwrote:

 On Tue, Apr 7, 2009 at 5:04 AM, Michael Nordman micha...@google.comwrote:

 There are additional constraints that haven't been mentioned yet...
 Plugins.

 The current model for plugins is that they execute in a single-threaded
 world. Chrome maintains that model by hosting each plugin in its own process
 and RPC'ing method invocations back and forth between calling pages and the
 plugin instances. All plugin instances (of a given plugin) reside on the
 same thread.


 Why can't instances of a plugin in different browser contexts be hosted in
 separate processes?


 It would be expensive, and I think this would have some correctness
 issues too, depending on the plugin. Some plugins depend on instances knowing
 about each other and interoperating with each other out of band of any
 DOM-based means of doing so.


And others probably assume they have exclusive access to mutable plugin
resources on disk.











Re: [whatwg] Worker feedback

2009-04-06 Thread Robert O'Callahan
On Tue, Apr 7, 2009 at 5:04 AM, Michael Nordman micha...@google.com wrote:

 Consider three threads:
 PageA
 PageB
 PluginC

 PageA
 -grabs storage lock

 PluginC
 -calls out to PageB (everything in NPAPI is synchronous)
 -now waiting for PageB to return

 PageB
 -while handling the plugin's callback, attempts to grab the storage lock
 -BLOCKED waiting for PageA to release it

 PageA
 -calls plugin (sync method call)
 -BLOCKED waiting indirectly for PageB

 == DEADLOCK


What happens if we don't have storage locks but PageB does a sync XHR or
alert() inside the callout from the plugin? All the other pages containing
plugins of that type lock up?

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-06 Thread timeless
FWIW, IIRC multiple processes in IE date back to at least IE4

The best url I can find on the subject atm is
http://aroundcny.com/technofile/texts/bit092098.html.

Michael Nordman micha...@google.com wrote:
 There are additional constraints that haven't been mentioned yet... Plugins.
 The current model for plugins is that they execute in a single-threaded
 world. Chrome maintains that model by hosting each plugin in its own process
 and RPC'ing method invocations back and forth between calling pages and the
 plugin instances. All plugin instances (of a given plugin) reside on the
 same thread.

Robert O'Callahan rob...@ocallahan.org wrote:
 Why can't instances of a plugin in different browser contexts be hosted
 in separate processes?

Michael Nordman micha...@google.com wrote:
 It would be expensive, and I think this would have some correctness
 issues too, depending on the plugin. Some plugins depend on instances knowing
 about each other and interoperating with each other out of band of any
 DOM-based means of doing so.

Michael Nordman micha...@google.com wrote:
 And others probably assume they have exclusive access to mutable plugin
 resources on disk.

This seems unlikely. I can run Firefox, Safari, Chrome, IE, Opera, and
other browsers at the same time; heck, I can run multiple profiles of
a couple of these (I can't find the option in the current version of
Chrome, but I used it before).


Re: [whatwg] Worker feedback

2009-04-04 Thread Robert O'Callahan
On Sat, Apr 4, 2009 at 11:17 AM, Jeremy Orlow jor...@google.com wrote:

 True serializability would imply that the HTTP request read and write are
 atomic.  In other words, you'd have to keep a lock for the entirety of each
 HTTP request and couldn't do multiple in parallel.  When I said there's no
 way to guarantee serializability, I guess I meant to qualify it with "in
 practice".


OK, I don't think anyone expects, wants, or has ever had that :-).

After thinking about it for a bit, your suggestion of reading the cookies
 to send with an HTTP request is an atomic operation, and writing them as a
 result of an HTTP response is an atomic operation does seem like a pretty
 sensible compromise.


It's what the spec says (the spec doesn't say anything about reading cookies
when constructing an HTTP request, but that's probably just an oversight)
and it's what I expected, so not really a compromise :-).

The one thing I'd still be concerned about:  localStorage separates storage
 space by origins.  In other words, www.google.com cannot access
 localStorage values from google.com and vice versa.  Cookies, on the other
 hand, have a much more complex scheme of access control.  Coming up with an
 efficient and dead-lock-proof locking scheme might take some careful
 thought.


I hope browser implementors can solve this internally. I think the main
thing we have to watch out for in the spec is situations where a script can
*synchronously* entangle browsing contexts that previously could not
interfere with each other (i.e., that a browser could have assigned
independent locks). (Setting document.domain might be a problem, for
example, although I don't know enough about cookies to be sure.)



 Depending on exactly what you mean by a Web application, that's not
 really true. There are a variety of ways to exploit multicore parallelism
 within a window with the current set of specs, at least in principle.


 What else is there?  (I believe you, I'm just interested in knowing what's
 out there.)


In Gecko we're working on making HTML parsing happen in parallel with other
activities (including script execution), and video decoding already does. I
can imagine doing all graphics rendering in parallel with other tasks and
being parallel internally too. Some aspects of layout can be parallelized
internally and overlapped with script execution. Expensive Javascript
compiler optimizations can be run in parallel with actual application work.
Canvas3D can run GPU programs, which are another form of parallelism (OK
that's not exactly multicore parallelism unless you believe Intel).

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-03 Thread Anne van Kesteren
On Fri, 03 Apr 2009 06:26:43 +0200, Robert O'Callahan  
rob...@ocallahan.org wrote:
Mozilla could probably get behind that, but I don't know who else is  
willing to bite the bullet.


The problem already exists for document.cookie, no? And the current API is
by far the most convenient to use.



--
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Worker feedback

2009-04-03 Thread Jeremy Orlow
On Fri, Apr 3, 2009 at 2:18 AM, Anne van Kesteren ann...@opera.com wrote:

 On Fri, 03 Apr 2009 06:26:43 +0200, Robert O'Callahan 
 rob...@ocallahan.org wrote:

 Mozilla could probably get behind that, but I don't know who else is
 willing to bite the bullet.


 The problem already exists for document.cookie, no? And the current API is
 by far the most convenient to use.


If I understood the discussion correctly, the spec for document.cookie never
stated anything about it being immutable while a script is running.  People
are now talking about specifying this, but there's been push back.  Also,
there's no way to guarantee serializability for the network traffic portion
so I'm guessing (hoping!) that this wouldn't be required in the JavaScript
side, even if it went through.

localStorage, on the other hand, does have language in the draft spec
stating that changes to localStorage must be serialized as if only one event
loop is running at a time.  That's the problem.  In other words, the
strictness of the concurrency control for localStorage is what makes this
different from document.cookie.


As for convenience:

The spec is written in such a way that you can't have more than one event
loop per browser window/worker, and everything is essentially tied to this
one event loop.  In other words, each window/worker can't run on more than
one CPU core at a time.  Thus, the only way for a web application to scale
in today's world is going to be through additional windows and/or workers.

I agree that the current API is quite convenient, but it worries me a great
deal that it's synchronous.  Now that navigator.unlockStorage() has been
added to the spec and you can't access localStorage from workers, I'm less
worried.  But I still feel like we're going to regret this in the next
couple years and/or people will simply avoid localStorage.

J


Re: [whatwg] Worker feedback

2009-04-03 Thread Tab Atkins Jr.
On Thu, Apr 2, 2009 at 8:37 PM, Robert O'Callahan rob...@ocallahan.org wrote:
 I agree it would make sense for new APIs to impose much greater constraints
 on consumers, such as requiring them to factor code into transactions,
 declare up-front the entire scope of resources that will be accessed, and
 enforce those restrictions, preferably syntactically --- Jonas' asynchronous
 multi-resource-acquisition callback, for example.

Speaking as a novice javascript developer, this feels like the
cleanest, simplest, most easily comprehensible way to solve this
problem.  We define what needs to be locked all at once, provide a
callback, and within the dynamic context of the callback no further
locks are acquirable.  You have to completely exit the callback and
start a new lock block if you need more resources.

This prevents deadlocks, while still giving us developers a simple way
to express what we need.  As well, callbacks are at this point a familiar
concept even to relative novices, as every major javascript library makes
heavy use of them.
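
A hedged sketch of what such an API might look like; the acquireLocks name
and signature are invented here for illustration, and Jonas' actual proposal
may differ:

  // Declare every needed resource up front; the callback runs once all the
  // locks are held, and no further locks can be acquired inside it.
  acquireLocks(['localStorage', 'cookies'], function (storage, cookies) {
    var n = parseInt(storage.getItem('counter') || '0', 10);
    storage.setItem('counter', String(n + 1));  // serialized with other holders
  });  // all locks are released when the callback returns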

~TJ


Re: [whatwg] Worker feedback

2009-04-03 Thread Drew Wilson
I know I said I would stay out of this conversation, but I feel obliged to
share a data point that's pertinent to our API design.
The structured storage spec has an asynchronous API currently. There is no
shortage of experienced javascript programmers at Google, and yet the single
biggest piece of feedback I've gotten from the internal app community has
been (essentially): "The asynchronous APIs are too cumbersome. We are going
to delay porting over to use the HTML5 APIs until we have synchronous APIs,
like the ones in Gears."

So, we should all take the whining of pampered Google engineers with a grain
of salt :), but the point remains that even though callbacks are
conceptually familiar and easy to use, it's not always convenient (or
possible!) for an application to stop an operation in the middle and resume
it via an asynchronous callback. Imagine if you're a library author that
exposes a synchronous API for your clients - now you'd like to use
localStorage within your library, but there's no way to do it while
maintaining your existing synchronous APIs.

If we try to force everyone to use asynchronous APIs to access local
storage, the first thing everyone is going to do is build their own
write-through caching wrapper objects around local storage to give them
synchronous read access and lazy writes, which generates precisely the type
of racy behavior we're trying to avoid.
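
For instance, a minimal sketch of the wrapper pattern being predicted here
(asyncStorage and its getAll/set methods are hypothetical; this is exactly
the racy pattern being warned against):

  // Preload everything once, then serve reads synchronously and queue
  // writes lazily, reintroducing the races the async API was avoiding.
  function CachedStorage(asyncStorage, onReady) {
    var self = this;
    this.cache = {};
    this.backing = asyncStorage;
    asyncStorage.getAll(function (items) {  // hypothetical bulk read
      self.cache = items;
      onReady(self);
    });
  }
  CachedStorage.prototype.get = function (key) {
    return this.cache[key];  // synchronous, but possibly stale
  };
  CachedStorage.prototype.set = function (key, value) {
    this.cache[key] = value;  // synchronous to the caller
    this.backing.set(key, value, function () {});  // lazy write-behind
  };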

If we can capture the correct behavior using synchronous APIs, we should.

-atw

On Fri, Apr 3, 2009 at 11:44 AM, Tab Atkins Jr. jackalm...@gmail.comwrote:

 On Thu, Apr 2, 2009 at 8:37 PM, Robert O'Callahan rob...@ocallahan.org
 wrote:
  I agree it would make sense for new APIs to impose much greater
 constraints
  on consumers, such as requiring them to factor code into transactions,
  declare up-front the entire scope of resources that will be accessed, and
  enforce those restrictions, preferably syntactically --- Jonas'
 asynchronous
  multi-resource-acquisition callback, for example.

 Speaking as a novice javascript developer, this feels like the
 cleanest, simplest, most easily comprehensible way to solve this
 problem.  We define what needs to be locked all at once, provide a
 callback, and within the dynamic context of the callback no further
 locks are acquirable.  You have to completely exit the callback and
 start a new lock block if you need more resources.

 This prevents deadlocks, while still giving us developers a simple way
 to express what we need.  As well, callbacks are at this point a familiar
 concept even to relative novices, as every major javascript library makes
 heavy use of them.

 ~TJ



Re: [whatwg] Worker feedback

2009-04-03 Thread Jeremy Orlow
On Fri, Apr 3, 2009 at 2:25 PM, Drew Wilson atwil...@google.com wrote:


 If we can capture the correct behavior using synchronous APIs, we should.


I think we already have a good, correct, synchronous API.  My concern is the
implications for the internals of the implementation.

Anyway, given that no one is chiming in to my defense, either no one really
cares enough to have read this far or no one agrees with me.  Either way, I
guess I'll quiet down.  :-)


Re: [whatwg] Worker feedback

2009-04-03 Thread Robert O'Callahan
On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote:

 If I understood the discussion correctly, the spec for document.cookie
 never stated anything about it being immutable while a script is running.


Well, there never was a decent spec for document.cookie for most of its
life, and even if there had been, no implementations allowed asynchronous
changes to cookies while a script was running (except for maybe during
alert()) and no-one really thought about it. Was this even identified as a
possible issue during Chrome development?

People are now talking about specifying this, but there's been push back.
 Also, there's no way to guarantee serializability for the network traffic
 portion so I'm guessing (hoping!) that this wouldn't be required in the
 JavaScript side, even if it went through.


What exactly do you mean by that? It's easy to guarantee that reading the
cookies to send with an HTTP request is an atomic operation, and writing
them as a result of an HTTP response is an atomic operation.

The spec is written in such a way that you can't have more than one event
 loop per browser window/worker, and everything is essentially tied to this
 one event loop.  In other words, each window/worker can't run on more than
 one CPU core at a time.  Thus, the only way for a web application to scale
 in today's world is going to be through additional windows and/or workers.


Depending on exactly what you mean by a Web application, that's not really
true. There are a variety of ways to exploit multicore parallelism within a
window with the current set of specs, at least in principle.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-03 Thread Jeremy Orlow
On Fri, Apr 3, 2009 at 2:49 PM, Robert O'Callahan rob...@ocallahan.orgwrote:

 On Sat, Apr 4, 2009 at 6:35 AM, Jeremy Orlow jor...@google.com wrote:

 People are now talking about specifying this, but there's been push back.
 Also, there's no way to guarantee serializability for the network traffic
 portion so I'm guessing (hoping!) that this wouldn't be required in the
 JavaScript side, even if it went through.


 What exactly do you mean by that? It's easy to guarantee that reading the
 cookies to send with an HTTP request is an atomic operation, and writing
 them as a result of an HTTP response is an atomic operation.


True serializability would imply that the HTTP request read and write are
atomic.  In other words, you'd have to keep a lock for the entirety of each
HTTP request and couldn't do multiple in parallel.  When I said there's no
way to guarantee serializability, I guess I meant to qualify it with "in
practice".

After thinking about it for a bit, your suggestion of reading the cookies
to send with an HTTP request is an atomic operation, and writing them as a
result of an HTTP response is an atomic operation does seem like a pretty
sensible compromise.

The one thing I'd still be concerned about:  localStorage separates storage
space by origins.  In other words, www.google.com cannot access localStorage
values from google.com and vice versa.  Cookies, on the other hand, have a
much more complex scheme of access control.  Coming up with an efficient and
dead-lock-proof locking scheme might take some careful thought.


 The spec is written in such a way that you can't have more than one event
 loop per browser window/worker, and everything is essentially tied to this
 one event loop.  In other words, each window/worker can't run on more than
 one CPU core at a time.  Thus, the only way for a web application to scale
 in today's world is going to be through additional windows and/or workers.


 Depending on exactly what you mean by a Web application, that's not
 really true. There are a variety of ways to exploit multicore parallelism
 within a window with the current set of specs, at least in principle.


What else is there?  (I believe you, I'm just interested in knowing what's
out there.)

Jeremy

P.S. Please don't mistake me for an expert on document.cookie or even
window.localStorage.  I try to fact check myself as I go, but if I say
something that seems stupid, please do let me know.  :-)


Re: [whatwg] Worker feedback

2009-04-02 Thread Jeremy Orlow
On Wed, Apr 1, 2009 at 3:17 PM, Robert O'Callahan rob...@ocallahan.orgwrote:

 On Thu, Apr 2, 2009 at 11:02 AM, Robert O'Callahan 
 rob...@ocallahan.orgwrote:

  (Note that you can provide hen read-only scripts are easy to optimize for
 full parallelism using )


 Oops!

 I was going to point out that you can use a reader/writer lock to implement
 serializability while allowing read-only scripts to run in parallel, so if
 the argument is that most scripts are read-only then that means it shouldn't
 be hard to get pretty good parallelism.


The problem is escalating the lock.  If your script does a read and then a
write, and you do this in two workers/windows/etc., you can get a deadlock
unless you have the ability to roll back one of the two scripts to before
the read which took a shared lock.  If both scripts have an alert('hi!');
then you're totally screwed, though.
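
A hedged sketch of that escalation deadlock, assuming a hypothetical
reader/writer lock where getItem takes a shared lock and setItem must
upgrade it to an exclusive one; the same script runs in two windows:

  var n = parseInt(localStorage.getItem('n') || '0', 10);  // both take read locks
  alert('hi!');  // a side effect that cannot be rolled back
  localStorage.setItem('n', String(n + 1));  // both wait for the other reader
  // A holds a read lock and wants the write lock, so it waits for B;
  // B holds a read lock and wants the write lock, so it waits for A.
  // Deadlock, unless one script can be aborted and replayed from before
  // its read, which the alert() makes impossible.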

There's been a LOT of CS research done on automatically handling the details
of concurrency.  The problem has to become pretty constrained (especially in
terms of stuff you can't roll back, like user input) before you can create
something halfway efficient.


On Wed, Apr 1, 2009 at 3:02 PM, Robert O'Callahan rob...@ocallahan.org
 wrote:

 On Thu, Apr 2, 2009 at 7:18 AM, Michael Nordman micha...@google.com
  wrote:

 I suggest that we can come up with a design that makes both of these camps
 happy and that should be our goal here.
 To that end... what if...

 interface Store {
   void putItem(string name, string value);

   string getItem(string name);
   // calling getItem multiple times prior to script completion with the
   // same name is guaranteed to return the same value
   // (unless the current script had called putItem; if a different script
   // had called putItem concurrently, the current script won't see that)

   void transact(func transactCallback);
   // is not guaranteed to execute if the page is unloaded prior to the
   // lock being acquired
   // is guaranteed to NOT execute if called from within onunload
   // but... really... if you need transactional semantics, maybe you
   // should be using a Database?

   attribute int length;
   // may only be accessed within a transactCallback, otherwise throws an
   // exception

   string getItemByIndex(int i);
   // may only be accessed within a transactCallback, otherwise throws an
   // exception
 };



 document.cookie;
 // has the same safe-to-read-multiple-times semantics as store.getItem()


 So there are no locking semantics (outside of the transact method)... and
 multiple reads are not error prone.

 WDYT?


 getItem stability is helpful for read-only scripts but no help for
 read-write scripts. For example, outside a transaction, two scripts doing
 putItem('x', getItem('x') + 1) can race and lose an increment.


Totally agree that it doesn't quite work yet.

But what if setItem were to watch for unserializable behavior and throw a
transactCallback when it happens?  This solves the silent data corruption
problem, though reproducing the circumstances that'd cause this are
obviously racy.  Of course, reproducing the deadlocks or very slow script
execution behavior is also racy.
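
In terms of Michael's proposed Store interface from earlier in the thread (a
proposal under discussion, not a shipped API; store is assumed to be such an
object), the lost increment and its transactional fix look roughly like this:

  // Racy outside a transaction: two scripts can interleave between the read
  // and the write, and both store the same incremented value.
  store.putItem('x', String(parseInt(store.getItem('x'), 10) + 1));

  // Serialized under the proposal: the callback runs with exclusive access.
  store.transact(function () {
    store.putItem('x', String(parseInt(store.getItem('x'), 10) + 1));
  });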



 Addressing the larger context ... More than anything else, I'm channeling
 my experiences at IBM Research writing race detection tools for Java
 programs ( http://portal.acm.org/citation.cfm?id=781528 and others), and
 what I learned there about programmers with a range of skill levels
 grappling with shared memory (or in our case, shared storage) concurrency. I
 passionately, violently believe that Web programmers cannot and should not
 have to deal with it. It's simply a matter of implementing what programmers
 expect: that by default, a chunk of sequential code will do what it says
 without (occasional, random) interference from outside.


I definitely see pros and cons to providing a single-threaded version of
the world to all developers (both advanced and beginner), but this really
isn't what we should be debating right now.

What we should be debating is whether advanced, cross-event-loop APIs should
be kept simple enough that any beginner web developer can use them (at the
expense of performance and simplicity within the browser) or if we should be
finding a compromise that can be kept fast, simple (causing fewer bugs!), and
somewhat harder to program for.

If someone wants to cross the event loop (except in the document.cookie
case, which is a pretty special one), they should have to deal with more
complexity in some form.  Personally, I'd like to see a solution that does
not involve locks of any sort (software transactional memory?).



 I realize that this creates major implementation difficulties for parallel
 browsers, which I believe will be all browsers. 'Evil', 'troubling' and
 'onerous' are perhaps understatements... But it will be far better in the
 long run to put those burdens on browser developers than to kick them
 upstairs to Web developers. If it turns out that there is a compelling
 performance boost that can *only* be achieved by relaxing serializability,
 then I could be convinced ... but we are very far from proving that.

Re: [whatwg] Worker feedback

2009-04-02 Thread Jeremy Orlow
On Tue, Mar 31, 2009 at 9:57 PM, Drew Wilson atwil...@google.com wrote:


 On Tue, Mar 31, 2009 at 6:25 PM, Robert O'Callahan 
 rob...@ocallahan.orgwrote:

  We don't know how much (if any) performance must be sacrificed, because
 no-one's tried to implement parallel cookie access with serializability
 guarantees. So I don't think we can say what the correct tradeoff is.


 The spec as proposed states that script that accesses cookies cannot
 operate in parallel with network access on those same domains. The
 performance impact of something like this is pretty clear, IMO - we don't
 need to implement it and measure it to know it exists and in some situations
 could be significant.


I agree with everything Drew said, but I think this one point really
needs to be singled out.  Cookies go across the wire.  Serializable
semantics are not possible in today's (latent) world.  Period.


Re: [whatwg] Worker feedback

2009-04-02 Thread Robert O'Callahan
On Fri, Apr 3, 2009 at 9:00 AM, Jeremy Orlow jor...@google.com wrote:

 The problem is escalating the lock.  If your script does a read and then a
 write, and you do this in 2 workers/windows/etc you can get a deadlock
 unless you have the ability to roll back one of the two scripts to before
 the read which took a shared lock.  If both scripts have an 'alert(hi!);'
 then you're totally screwed, though.


Double oops! Yes.

On Wed, Apr 1, 2009 at 3:02 PM, Robert O'Callahan rob...@ocallahan.org
  wrote:

 getItem stability is helpful for read-only scripts but no help for
 read-write scripts. For example, outside a transaction, two scripts doing
 putItem('x', getItem('x') + 1) can race and lose an increment.


 Totally agree that it doesn't quite work yet.

 But what if setItem were to watch for unserializable behavior and throw a
 transactCallback when it happens?  This solves the silent data corruption
 problem, though reproducing the circumstances that'd cause this are
 obviously racy.  Of course, reproducing the deadlocks or very slow script
 execution behavior is also racy.


You mean throw an exception when it happens? Yeah, that doesn't really help;
you just replace one kind of random failure with another. A half-completed
read-write script is very likely to have corrupted data.



 Addressing the larger context ... More than anything else, I'm channeling
 my experiences at IBM Research writing race detection tools for Java
 programs ( http://portal.acm.org/citation.cfm?id=781528 and others), and
 what I learned there about programmers with a range of skill levels
 grappling with shared memory (or in our case, shared storage) concurrency. I
 passionately, violently believe that Web programmers cannot and should not
 have to deal with it. It's simply a matter of implementing what programmers
 expect: that by default, a chunk of sequential code will do what it says
 without (occasional, random) interference from outside.


 I definitely see pros and cons to providing a single-threaded version of
 the world to all developers (both advanced and beginner), but this really
 isn't what we should be debating right now.


Why not? I know of no better forum for debating the semantics of the Web
platform, and it's clearly a matter of some urgency.

 What we should be debating is whether advanced, cross-event-loop APIs should
 be kept simple enough that any beginner web developer can use them (at the
 expense of performance and simplicity within the browser) or if we should be
 finding a compromise that can be kept fast, simple (causing fewer bugs!), and
 somewhat harder to program for.

 If someone wants to cross the event loop (except in the document.cookie
 case, which is a pretty special one), they should have to deal with more
 complexity in some form.  Personally, I'd like to see a solution that does
 not involve locks of any sort (software transactional memory?).


I agree it would make sense for new APIs to impose much greater constraints
on consumers, such as requiring them to factor code into transactions,
declare up-front the entire scope of resources that will be accessed, and
enforce those restrictions, preferably syntactically --- Jonas' asynchronous
multi-resource-acquisition callback, for example. That is entirely
consistent with what I said above; I'm not saying all concurrency
abstractions are intractable. But the abstraction which takes sequential
code and adds races on shared storage everywhere certainly is.
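
A rough sketch of what such an API might look like; the function name and
callback shape here are invented purely for illustration, not taken from any
spec or proposal text:

// Hypothetical asynchronous multi-resource-acquisition callback: the
// callback runs only once every declared resource is held, as a single
// serializable transaction.
acquireResources(['localStorage', 'cookies'], function (tx) {
  var hits = Number(tx.localStorage.getItem('hits') || '0');
  tx.localStorage.setItem('hits', String(hits + 1));
  tx.cookies.set('lastVisit', new Date().toUTCString());
});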

Unfortunately we have to deal with cookies and localStorage, where the API
is already set.



 I realize that this creates major implementation difficulties for parallel
 browsers, which I believe will be all browsers. 'Evil', 'troubling' and
 'onerous' are perhaps understatements... But it will be far better in the
 long run to put those burdens on browser developers than to kick them
 upstairs to Web developers. If it turns out that there is a compelling
 performance boost that can *only* be achieved by relaxing serializability,
 then I could be convinced ... but we are very far from proving that.


 Like I said, a LOT of research has been done on concurrency.  Basically, if
 you're not really careful about how you construct your language and the
 abstractions you have for concurrency, you can really easily back yourself
 into a corner that you semantically can't get out of (no matter how good a
 programmer you are).


I know this, but I'm not sure exactly what point you're trying to make.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-02 Thread Robert O'Callahan
On Fri, Apr 3, 2009 at 9:02 AM, Jeremy Orlow jor...@google.com wrote:

 I agree with everything Drew said, but I think this one point really
 needs to be singled out.  Cookies go across the wire.  Serializable
 semantics are not possible in today's (latent) world.  Period.


The unit of serializability is a single script (typically an event handler)
running to completion. There's no problem interleaving network cookie reads
and writes with those.
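
A sketch of what that unit means in practice (doSomeWork here is just a
stand-in for any long-running computation):

function doSomeWork() { /* stand-in for a long-running computation */ }

document.onclick = function () {
  var before = document.cookie;
  doSomeWork();
  var after = document.cookie;
  // serializability goal: before === after within this handler, even if
  // a network response carrying Set-Cookie arrived while it was running;
  // the new cookies are applied between scripts, not mid-script
};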

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-02 Thread Jeremy Orlow
On Thu, Apr 2, 2009 at 6:37 PM, Robert O'Callahan rob...@ocallahan.org wrote:


 Unfortunately we have to deal with cookies and localStorage, where the API
 is already set.


Is it set?

I understand that localStorage has been around for a while, but as far as I
can tell virtually no one uses it.  I thought the reason for calling this
spec a draft was so that such fairly major issues could be corrected?  I
agree that changing something this late in the game is less than ideal, but
I think we're both agreeing that any synchronous APIs that cross the
event-loop are going to be long term problems.


Re: [whatwg] Worker feedback

2009-04-02 Thread Robert O'Callahan
On Fri, Apr 3, 2009 at 5:11 PM, Jeremy Orlow jor...@google.com wrote:

 On Thu, Apr 2, 2009 at 6:37 PM, Robert O'Callahan rob...@ocallahan.org wrote:


 Unfortunately we have to deal with cookies and localStorage, where the API
 is already set.


 Is it set?

 I understand that localStorage has been around for a while, but as far as I
 can tell virtually no one uses it.  I thought the reason for calling this
 spec a draft was so that such fairly major issues could be corrected?  I
 agree that changing something this late in the game is less than ideal, but
 I think we're both agreeing that any synchronous APIs that cross the
 event-loop are going to be long term problems.


AFAIK every major browser has an implementation of localStorage close to
shipping. The only way I can imagine having a chance to put the brakes on
the feature now is for everyone who hasn't actually shipped it --- which I
think is currently everyone but IE, since we shipped the old globalStorage
which we're planning to rip out anyway --- to unite and disable it
immediately until we have a better API. Maybe we could even get IE to
disable it in an update.

Mozilla could probably get behind that, but I don't know who else is willing
to bite the bullet.

I suppose sessionStorage can stay?

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-01 Thread Michael Nordman
I'd like to propose a way forward. Please have an open mind.
The objections you're hearing from the Chrome world are around the locking
semantics being proposed. In various discussions the terms 'evil',
'troubling', and 'onerous' have been used to describe what we think about
aspects of those semantics. There are obvious difficulties in providing the
semantics being discussed in a multi-threaded, multi-process browser. There
are obvious performance implications. There are limitations imposed on
workers that would otherwise not be an issue. And with the introduction of
these locks today, there would be challenges going forward when trying to
add new features without incurring deadlocks... our hands would be tied. So
we don't like it... evil, troubling, onerous.

The objections I'm hearing from the Firefox world are around providing an
API that is less error-prone.

I suggest that we can come up with a design that makes both of these camps
happy and that should be our goal here.

To that end... what if...

interface Store {
  void putItem(string name, string value);

  string getItem(string name);
  // calling getItem multiple times prior to script completion with the same
  // name is guaranteed to return the same value
  // (unless the current script has called putItem; if a different script
  // has called putItem concurrently, the current script won't see that)

  void transact(func transactCallback);
  // is not guaranteed to execute if the page is unloaded prior to the lock
  // being acquired
  // is guaranteed to NOT execute if called from within onunload
  // but... really... if you need transactional semantics, maybe you should
  // be using a Database?

  attribute int length;
  // may only be accessed within a transactCallback, otherwise throws an
  // exception

  string getItemByIndex(int i);
  // may only be accessed within a transactCallback, otherwise throws an
  // exception
};


document.cookie;
// has the same safe-to-read-multiple-times semantics as store.getItem()


So there are no locking semantics (outside of the transact method)... and
multiple reads are not error prone.

WDYT?
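
To illustrate how this would read at a call site (a sketch assuming store is
an instance of the proposed Store interface above):

store.putItem('theme', 'dark');
var theme = store.getItem('theme'); // stable across repeated reads

store.transact(function () {
  // length and index-based access are only legal inside the callback
  for (var i = 0; i < store.length; i++) {
    console.log(store.getItemByIndex(i));
  }
});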


Re: [whatwg] Worker feedback

2009-04-01 Thread Robert O'Callahan
On Thu, Apr 2, 2009 at 7:18 AM, Michael Nordman micha...@google.com wrote:

 I suggest that we can come up with a design that makes both of these camps
 happy and that should be our goal here.
 To that end... what if...

 interface Store {
   void putItem(string name, string value);

   string getItem(string name);
   // calling getItem multiple times prior to script completion with the
   // same name is guaranteed to return the same value
   // (unless the current script has called putItem; if a different script
   // has called putItem concurrently, the current script won't see that)

   void transact(func transactCallback);
   // is not guaranteed to execute if the page is unloaded prior to the lock
   // being acquired
   // is guaranteed to NOT execute if called from within onunload
   // but... really... if you need transactional semantics, maybe you should
   // be using a Database?

   attribute int length;
   // may only be accessed within a transactCallback, otherwise throws an
   // exception

   string getItemByIndex(int i);
   // may only be accessed within a transactCallback, otherwise throws an
   // exception
 };



 document.cookie;
  // has the same safe-to-read-multiple-times semantics as store.getItem()


 So there are no locking semantics (outside of the transact method)... and
 multiple reads are not error prone.

 WDYT?


getItem stability is helpful for read-only scripts but no help for
read-write scripts. For example, outside a transaction, two scripts doing
putItem('x', getItem('x') + 1) can race and lose an increment. Even for
read-only scripts, you have the problem that reading multiple values isn't
guaranteed to give you a consistent state. So this isn't much better than
doing nothing for the default case. (Note that you can provide hen read-only
scripts are easy to optimize for full parallelism using ) Forcing iteration
to be inside a transaction isn't compatible with existing localStorage
either.

Addressing the larger context ... More than anything else, I'm channeling my
experiences at IBM Research writing race detection tools for Java programs (
http://portal.acm.org/citation.cfm?id=781528 and others), and what I learned
there about programmers with a range of skill levels grappling with shared
memory (or in our case, shared storage) concurrency. I passionately,
violently believe that Web programmers cannot and should not have to deal
with it. It's simply a matter of implementing what programmers expect: that
by default, a chunk of sequential code will do what it says without
(occasional, random) interference from outside.

I realize that this creates major implementation difficulties for parallel
browsers, which I believe will be all browsers. 'Evil', 'troubling' and
'onerous' are perhaps understatements... But it will be far better in the
long run to put those burdens on browser developers than to kick them
upstairs to Web developers. If it turns out that there is a compelling
performance boost that can *only* be achieved by relaxing serializability,
then I could be convinced ... but we are very far from proving that.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-04-01 Thread Robert O'Callahan
On Thu, Apr 2, 2009 at 11:02 AM, Robert O'Callahan rob...@ocallahan.org wrote:

  (Note that you can provide hen read-only scripts are easy to optimize for
 full parallelism using )


Oops!

I was going to point out that you can use a reader/writer lock to implement
serializability while allowing read-only scripts to run in parallel, so if
the argument is that most scripts are read-only then that means it shouldn't
be hard to get pretty good parallelism.
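
For exposition, the policy I have in mind, sketched in JavaScript (a browser
would of course implement this natively; try-lock form only, queueing
omitted):

function RWLock() {
  this.readers = 0;     // read-only scripts currently running
  this.writing = false; // a read-write script holds exclusive access
}
RWLock.prototype.tryRead = function () {
  if (this.writing) return false; // a writer excludes readers
  this.readers++;                 // readers run in parallel
  return true;
};
RWLock.prototype.tryWrite = function () {
  if (this.writing || this.readers > 0) return false;
  this.writing = true;            // a writer runs alone
  return true;
};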

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-03-31 Thread Drew Wilson
On Mon, Mar 30, 2009 at 6:45 PM, Robert O'Callahan rob...@ocallahan.org wrote:


 We have no way of knowing how much trouble this has caused so far;
 non-reproducibility means you probably won't get a good bug report for any
 given incident.

 It's even plausible that people are getting lucky with cookie races almost
 all the time, or maybe cookies are usually used in a way that makes them a
 non-issue. That doesn't mean designing cookie races in is a good idea.


So, the first argument against cookie races was "this is the way the web
works now - if we introduce cookie races, we'll break the web". When this
was proven to be incorrect (IE does not enforce exclusive access to
cookies), the argument has now morphed to "the web is breaking right now and
nobody notices", which is more an article of faith than anything else.

I agree that designing cookie races is not a good idea. If we could go back
in time, we might design a better API for cookies that didn't introduce race
conditions. However, given where we are today, I'd say that sacrificing
performance in the form of preventing parallel network calls/script
execution in order to provide theoretical correctness for an API that is
already quite happily race-y is not a good tradeoff.

In this case, I think the spec should describe the current implementation of
cookies, warts and all.

-atw


Re: [whatwg] Worker feedback

2009-03-31 Thread Robert O'Callahan
On Wed, Apr 1, 2009 at 7:27 AM, Drew Wilson atwil...@google.com wrote:

 So, the first argument against cookie races was "this is the way the web
 works now - if we introduce cookie races, we'll break the web". When this
 was proven to be incorrect (IE does not enforce exclusive access to
 cookies), the argument has now morphed to "the web is breaking right now and
 nobody notices", which is more an article of faith than anything else.


We know for sure it's possible to write scripts with racy behaviour, so the
question is whether this ever occurs in the wild. You're claiming it does
not, and I'm questioning whether you really have that data.

I agree that designing cookie races is not a good idea. If we could go back
 in time, we might design a better API for cookies that didn't introduce race
 conditions. However, given where we are today, I'd say that sacrificing
 performance in the form of preventing parallel network calls/script
 execution in order to provide theoretical correctness for an API that is
 already quite happily race-y is not a good tradeoff.


We don't know how much (if any) performance must be sacrificed, because
no-one's tried to implement parallel cookie access with serializability
guarantees. So I don't think we can say what the correct tradeoff is.

In this case, I think the spec should describe the current implementation of
 cookies, warts and all.


You mean IE and Chrome's implementation, I presume, since Firefox and Safari
do not allow cookies to be modified during script execution AFAIK. Do we
know exactly what IE7, IE8 and Chrome guarantee around parallel cookie
access?

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-03-31 Thread Drew Wilson
On Tue, Mar 31, 2009 at 6:25 PM, Robert O'Callahan rob...@ocallahan.org wrote:


 We know for sure it's possible to write scripts with racy behaviour, so the
 question is whether this ever occurs in the wild. You're claiming it does
 not, and I'm questioning whether you really have that data.


I'm not claiming it *never* occurs, because in the vasty depths of the
internet I suspect *anything* can be found. Also, my rhetorical powers
aren't up to the task of constructing a negative proof :)


 We don't know how much (if any) performance must be sacrificed, because
 no-one's tried to implement parallel cookie access with serializability
 guarantees. So I don't think we can say what the correct tradeoff is.


The spec as proposed states that script that accesses cookies cannot operate
in parallel with network access on those same domains. The performance
impact of something like this is pretty clear, IMO - we don't need to
implement it and measure it to know it exists and in some situations could
be significant.


 You mean IE and Chrome's implementation, I presume, since Firefox and
 Safari do not allow cookies to be modified during script execution AFAIK.


I think the old spec language captured the intent quite well -
document.cookie is a snapshot of an inherently racy state, which is the set
of cookies that would be sent with a network call at that precise instant.
Due to varying browser implementations, that state may be less racy on some
browsers than on others, but the general model was one without guarantees.

I understand the philosophy behind serializing access to shared state, and I
agree with it in general. But I think we need to make an exception in the
case of document.cookie based on current usage and expected performance
impact (since it impacts our ability to parallelize network access and
script execution).

In this case, the burden of proof has to fall on those trying to change the
spec - I think we need a compelling real-world argument why we should be
making our browsers slower. The pragmatic part of my brain suggests that
we're trying to solve a problem that exists in theory, but which doesn't
actually happen in practice.

Anyhow, at this point I think we're just going around in circles about this
- I'm not sure that either of us are going to convince the other, so I'll
shut up now and let others have the last word :)

-atw


Re: [whatwg] Worker feedback

2009-03-30 Thread Drew Wilson
On Fri, Mar 27, 2009 at 6:23 PM, Ian Hickson i...@hixie.ch wrote:


 Another use case would be keeping track of what has been done so far, for
 this I guess it would make sense to have a localStorage API for shared
 workers (scoped to their name). I haven't added this yet, though.


On a related note, I totally understand the desire to protect developers
from race conditions, so I understand why we've removed localStorage access
from dedicated workers. In the past we've discussed having synchronous APIs
for structured storage that only workers can use - it's a much more
convenient API, particularly for applications porting to HTML5 structured
storage from Gears. It sounds like if we want to support these APIs in
workers, we'd need to enforce the same kind of serializability guarantees
that we have for localStorage in browser windows (i.e. add some kind of
structured storage mutex similar to the localStorage mutex).





Gears had an explicit permissions variable applications could check,
which seems valuable - do we do anything similar elsewhere in HTML5
that we could use as a model here?
  
   HTML5 so far has avoided anything that requires explicit permission
   grants, because they are generally a bad idea from a security
   perspective (users will grant any permissions the system asks them
   for).
 
  The Database spec has a strong implication that applications can request
  a larger DB quota, which will result in the user being prompted for
  permission either immediately, or at the point that the default quota is
  exceeded. So it's not without precedent, I think. Or maybe I'm just
  misreading this:
 
  User agents are expected to use the display name and the estimated
  database size to optimize the user experience. For example, a user agent
  could use the estimated size to suggest an initial quota to the user.
  This allows a site that is aware that it will try to use hundreds of
  megabytes to declare this upfront, instead of the user agent prompting
  the user for permission to increase the quota every five megabytes.

 There are many ways to expose this, e.g. asynchronously as a drop-down
 infobar, or as a pie chart showing the disk usage that the user can click
 on to increase the allocaton whenever they want, etc.


Certainly. I actually think we're in agreement here - my point is not that
you need a synchronous permission grant (since starting up a worker is an
inherently asynchronous operation anyway) - just that there's precedent in
the spec for applications to request access to resources (storage space,
persistent workers) that are not necessarily granted to all sites by
default. It sounds like the specifics of how the UA chooses to expose this
access control (pie charts, async dropdowns, domain whitelists, trusted
zones with security levels) are left to the individual implementation.
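
For reference, the up-front declaration the quoted spec text describes is
the estimated-size argument to openDatabase in the Web Database draft (the
database name and size below are invented for illustration):

// Declares up front that the site expects roughly 200 MB, so the UA can
// suggest an appropriate quota once rather than prompting every 5 MB.
var db = openDatabase('mail', '1.0', 'Offline Mail Store',
                      200 * 1024 * 1024);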


 Re: cookies
 I suppose that network activity should also wait for the lock. I've made
 that happen.


Seems like that would restrict parallelism between network loads and
executing javascript, which seems like the wrong direction to go.

It feels like we are jumping through hoops to protect running script from
having document.cookie modified out from underneath it, and now some of the
ramifications may have real performance impacts. From a pragmatic point of
view, I just want to remind people that many current browsers do not make
these types of guarantees about document.cookie, and yet the tubes have not
imploded.




  Cookies have a cross-domain aspect (multiple subdomains can share cookie
  state at the top domain) - does this impact the specification of the
  storage mutex since we need to lockout multiple domains?

 There's only one lock, so that should work fine.


OK, I was assuming a single per-domain lock (ala localStorage) but it sounds
like there's a group lock, cross-domain. This makes it even more onerous if
network activity across all related domains has to serialize on a single
lock.

-atw


Re: [whatwg] Worker feedback

2009-03-30 Thread Robert O'Callahan
On Tue, Mar 31, 2009 at 7:22 AM, Drew Wilson atwil...@google.com wrote:



 Re: cookies
 I suppose that network activity should also wait for the lock. I've made
 that happen.


 Seems like that would restrict parallelism between network loads and
 executing javascript, which seems like the wrong direction to go.

 It feels like we are jumping through hoops to protect running script from
 having document.cookie modified out from underneath it, and now some of the
 ramifications may have real performance impacts. From a pragmatic point of
 view, I just want to remind people that many current browsers do not make
 these types of guarantees about document.cookie, and yet the tubes have not
 imploded.


We have no way of knowing how much trouble this has caused so far;
non-reproducibility means you probably won't get a good bug report for any
given incident.

It's even plausible that people are getting lucky with cookie races almost
all the time, or maybe cookies are usually used in a way that makes them a
non-issue. That doesn't mean designing cookie races in is a good idea.



  Cookies have a cross-domain aspect (multiple subdomains can share cookie
  state at the top domain) - does this impact the specification of the
  storage mutex since we need to lockout multiple domains?

 There's only one lock, so that should work fine.


 OK, I was assuming a single per-domain lock (ala localStorage) but it
 sounds like there's a group lock, cross-domain. This makes it even more
 onerous if network activity across all related domains has to serialize on a
 single lock.


It doesn't have to. There are lots of ways to optimize here.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-03-30 Thread Michael Nordman

 I think it makes sense to treat dedicated workers as simple subresources,
 not separate browsing contexts, and that they should thus just use the
 application cache of their parent browsing contexts. This is what WebKit
 does, according to ap.

 I've now done this in the spec.

Sounds good. I'd phrase it a little differently though. Dedicated workers do
have a browsing context that is distinct from their parents', but the
appcache selected for a dedicated worker context is identical to the
appcache selected for the parent's context.

 For shared workers, I see these options:

  - Not allow app caches, so shared workers don't work when offline. That
  seems bad.

  - Same as suggested for dedicated workers above -- use the creator's
  cache, so at least one client will get the version they expect. Other
  clients will have no idea what version they're talking to, the creator
  would have an unusual relationship with the worker (it would be able
  to call swapCache() but nobody else would), and once the creator goes
  away, there will be a zombie relationship.

  - Pick an appcache more or less at random, like when you view an image in
  a top-level browsing context. Clients will have no idea which version
  they're talking to.

  - Allow workers to specify a manifest using some sort of comment syntax.
  Nobody knows what version they'll get, but at least it's always the
  same version, and it's always up to date.

 Using the creator's cache is the one that minimises the number of clients
 that are confused, but it also makes the debugging experience differ most
 from the case where there are two apps using the worker.

 Using an appcache selected the same way we would pick one for images has
 the minor benefit of being somewhat consistent with how window.open()
 works, and we could say that window.open() and new SharedWorker are
 somewhat similar.

 I have picked this route for now. Implementation feedback is welcome in
 determining if this is a good idea.

Sounds good for now.

Ultimately, I suspect that additionally allowing workers to specify a
manifest using some sort of syntax may be the right answer. That would put
cache selection for shared workers on par with how cache selection works for
pages (not just images) opened via window.open. As 'page' cache selection is
refined due to experience with this system, those same refinements would
also apply to 'shared worker' cache selection.


Re: [whatwg] Worker feedback

2009-03-28 Thread Robert O'Callahan
On Sat, Mar 28, 2009 at 2:23 PM, Ian Hickson i...@hixie.ch wrote:

 Robert O'Callahan wrote:
  Now, with the storage mutex, are there any cases you know of where
  serializability fails? If there are, it may be worth noting them in the
  spec. If there aren't, why not simply write serializability into the
  spec?

 Just writing that something must be true doesn't make it true. :-) I think
 it's safer for us to make the design explicitly enforce this rather than
 say that browser vendors must figure out where it might be broken and
 enforce it themselves.


If serializability is the goal then I think it can only help to say so in
the spec (in addition to whatever explicit design you wish to include), so
that any failure of serializability is clearly an inconsistency in the spec
that must be fixed rather than a loophole that authors and browser vendors
might think they can rely on.

I also suggest that speccing just serializability should be fine. It seems
to me the current spec is proposing one implementation of serializability
while other implementations are possible, and relying on the black-box
equivalence principle to enable other implementations. But specifying
serializability is probably simpler and may allow implementations that are
unintentionally ruled out by the explicit design in the spec, especially
as things become more complicated in the future. It would probably also be
clearer to authors what they can expect.

I think it's a lot like GC; we don't specify a GC algorithm, even though GC
is hard; we just have an implicit specification that objects don't disappear
arbitrarily.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Worker feedback

2009-03-28 Thread Alexey Proskuryakov


On 28.03.2009, at 4:23, Ian Hickson wrote:


 I think, given text/css, text/html, and text/xml all have character
 encoding declarations inline, transcoding is not going to work in
 practice. I think the better solution would be to remove the rules that
 make text/* an issue in the standards world (it's not an issue in the
 real world).

In fact, transcoding did work in practice - that's because HTTP headers
override inline character declarations.

 For new formats, though, I think just supporting UTF-8 is a big win.

Could you please clarify what the win is? Disregarding charset from HTTP
headers is just a weird special case for a few text resource types. If we
were going to deprecate HTML, XML and CSS, but keep appcache manifest going
forward, it could maybe make sense.


- WBR, Alexey Proskuryakov




Re: [whatwg] Worker feedback

2009-03-28 Thread Kristof Zelechovski
Scripts, and worker scripts in particular, should use an application media
type; using text/javascript is obsolete [RFC 4329, section 3].
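
Concretely, that means serving worker scripts with a header along these
lines (illustrative):

Content-Type: application/javascript;charset=utf-8
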
Chris