Re: [whatwg] Workers feedback
On Fri, 4 Feb 2011, Glenn Maynard wrote:
On Fri, Feb 4, 2011 at 6:43 PM, Ian Hickson i...@hixie.ch wrote:

All workers should run soon, not maybe in the future. Not running a worker should be an unusual circumstance. Errors that occur in unusual circumstances aren't errors that authors will check for. This discussion comes from Chrome having a very small limit currently. It is my understanding that this limit is temporary and not representative of what the API will be like once widely implemented.

There will probably always be some limit, even if it's very high (e.g. 1024).

Well, certainly there comes a point where there are simply too many workers for it to make sense.

I think reasonable behavior when you exceed it is to throw an exception, but the spec seems to disallow this. Maybe that's why Chrome has the queueing behavior in the first place.

Implementations are essentially allowed to do whatever they want in the face of hardware limitations like this; we refer to this as the hardware limitations clause: http://www.whatwg.org/specs/web-apps/current-work/complete.html#hardwareLimitations

Non-looping 0ms timers are common, to run code when the current call finishes.

Yeah, spec allows those fine.

I should have said non-recursive. That is, you can run a 0ms timer from another timer without causing recursion, but you'll receive the 4ms clamping unnecessarily. It might be possible to avoid this while still preventing 0ms looping timers from busy looping, but the spec prohibits that. (I could give an example of how this would happen, but I don't think it's important enough to go into further for now.)

What's the use case?

On Sat, 5 Feb 2011, Samuel Ytterbrink wrote:
2011/2/5 Ian Hickson i...@hixie.ch
On Sat, 16 Oct 2010, Samuel Ytterbrink wrote:

*What is the problem you are trying to solve?*

To create sophisticated single file webpages.

That's maybe a bit vaguer than I was hoping for when asking the question. :-) Why does it have to be a single file? Would multipart MIME be acceptable? A single file is a solution, not a problem. What is the problem?

Okay, I see the (implementation of the) web standards as the ultimate framework. This makes it a great tool to create OS-independent software (if a browser is implementing the specs the same on both platforms). Therefore it's great if it supports as much of the usual behavior of programs. I understand that this is a long process but... almost everything is possible to inline with data-urls (I wrote a simple script to do this for me), but not web workers. And if you want to hand over a program to a customer you want it to be 1 file, in many cases.

These standards are intended for the Web. You don't have to hand anything over other than a URL.

The use case for my program is a user who is a student sitting at his/her school's computer, having a DAISY book and needing to read it with this app. The problem is:
- that the user is not allowed to install new software and therefore can't install a 'real' player;
- that the user can't play it online because a book is too large;
- that with the File API right now you can't get hold of a whole folder structure.

Ah. This isn't really a use case I think we should try to solve at the WHATWG. The WHATWG is specifically about improving the Web.

[...] trying to build a more optimal standalone DAISY player (would be nice if i could rewrite it with web workers).

Now that's a problem. :-) It seems like what you need is a package mechanism, not necessarily a way to run workers without an external script.

If I understand you correctly, you suggest a module system for JavaScript? That would be nice, but web workers are still needed for more than that. And you can do module systems with a macro-compiler without changing the specs.

I just mean zip up all the files and give that to the user.

-- Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Workers feedback
On Fri, 11 Feb 2011, Gregg Tavares (wrk) wrote:
On Fri, Feb 11, 2011 at 5:45 PM, Ian Hickson i...@hixie.ch wrote:
On Fri, 11 Feb 2011, Gregg Tavares (wrk) wrote:
On Fri, 7 Jan 2011, Berend-Jan Wever wrote:

1) To give WebWorkers access to the DOM API so they can create their own elements such as img, canvas, etc...?

It's the API itself that isn't thread-safe, unfortunately.

I didn't see the original thread but how is a WebWorker any different from another webpage? Those run just fine in other threads and use the DOM API.

Web pages do not run in a different thread.

Oh, sorry. I meant they run in a different process. At least in some browsers.

The goal here is interoperability with all browsers, not just some.

-- Ian Hickson
Re: [whatwg] Workers feedback
On Mon, Feb 14, 2011 at 1:37 AM, Ian Hickson i...@hixie.ch wrote:
On Fri, 11 Feb 2011, Gregg Tavares (wrk) wrote:
On Fri, Feb 11, 2011 at 5:45 PM, Ian Hickson i...@hixie.ch wrote:
On Fri, 11 Feb 2011, Gregg Tavares (wrk) wrote:
On Fri, 7 Jan 2011, Berend-Jan Wever wrote:

1) To give WebWorkers access to the DOM API so they can create their own elements such as img, canvas, etc...?

It's the API itself that isn't thread-safe, unfortunately.

I didn't see the original thread but how is a WebWorker any different from another webpage? Those run just fine in other threads and use the DOM API.

Web pages do not run in a different thread.

Oh, sorry. I meant they run in a different process. At least in some browsers.

The goal here is interoperability with all browsers, not just some.

I guess I don't understand. There are lots of things all browsers didn't do at some point in the past. The video tag. CSS animation. The canvas tag. Etc. We don't say because it hasn't been done yet therefore we can't try or can't spec it.
Re: [whatwg] Workers feedback
On Mon, 14 Feb 2011, Gregg Tavares (wrk) wrote:

Web pages do not run in a different thread.

Oh, sorry. I meant they run in a different process. At least in some browsers.

The goal here is interoperability with all browsers, not just some.

I guess I don't understand. There are lots of things all browsers didn't do at some point in the past. The video tag. CSS animation. The canvas tag. Etc. We don't say because it hasn't been done yet therefore we can't try or can't spec it.

We do when one or more browser vendors say "we will not implement this", which is what happened here.

-- Ian Hickson
Re: [whatwg] Workers feedback
On Mon, Feb 14, 2011 at 5:02 AM, Ian Hickson i...@hixie.ch wrote:
On Mon, 14 Feb 2011, Gregg Tavares (wrk) wrote:

Web pages do not run in a different thread.

Oh, sorry. I meant they run in a different process. At least in some browsers.

The goal here is interoperability with all browsers, not just some.

I guess I don't understand. There are lots of things all browsers didn't do at some point in the past. The video tag. CSS animation. The canvas tag. Etc. We don't say because it hasn't been done yet therefore we can't try or can't spec it.

We do when one or more browser vendors say "we will not implement this", which is what happened here.

And having every *thread* running in a separate process wouldn't exactly be a sane model, anyway, and that's what this would actually require.

-- Glenn Maynard
Re: [whatwg] Workers feedback
On Fri, 11 Feb 2011, Gregg Tavares (wrk) wrote:
On Fri, 7 Jan 2011, Berend-Jan Wever wrote:

1) To give WebWorkers access to the DOM API so they can create their own elements such as img, canvas, etc...?

It's the API itself that isn't thread-safe, unfortunately.

I didn't see the original thread but how is a WebWorker any different from another webpage? Those run just fine in other threads and use the DOM API.

Web pages do not run in a different thread.

-- Ian Hickson
Re: [whatwg] Workers feedback
On Fri, Feb 11, 2011 at 5:45 PM, Ian Hickson i...@hixie.ch wrote:
On Fri, 11 Feb 2011, Gregg Tavares (wrk) wrote:
On Fri, 7 Jan 2011, Berend-Jan Wever wrote:

1) To give WebWorkers access to the DOM API so they can create their own elements such as img, canvas, etc...?

It's the API itself that isn't thread-safe, unfortunately.

I didn't see the original thread but how is a WebWorker any different from another webpage? Those run just fine in other threads and use the DOM API.

Web pages do not run in a different thread.

Oh, sorry. I meant they run in a different process. At least in some browsers.
Re: [whatwg] Workers feedback
I'll mention that the Chrome team is experimenting with something like this (as a Chrome extensions API) - certain extensions will be able to do:

  window.open("my_bg_page.html", "name", "background");

...and the associated window will be opened offscreen. They share a process with other pages under that domain which means they can't be used as a worker (doing long-lived operations). But I agree, there's some value in having the full set of page APIs available.

-atw

On Fri, Feb 11, 2011 at 5:58 PM, Gregg Tavares (wrk) g...@google.com wrote:
On Fri, Feb 11, 2011 at 5:45 PM, Ian Hickson i...@hixie.ch wrote:
On Fri, 11 Feb 2011, Gregg Tavares (wrk) wrote:
On Fri, 7 Jan 2011, Berend-Jan Wever wrote:

1) To give WebWorkers access to the DOM API so they can create their own elements such as img, canvas, etc...?

It's the API itself that isn't thread-safe, unfortunately.

I didn't see the original thread but how is a WebWorker any different from another webpage? Those run just fine in other threads and use the DOM API.

Web pages do not run in a different thread.

Oh, sorry. I meant they run in a different process. At least in some browsers.
[whatwg] Workers feedback
On Sat, 16 Oct 2010, Samuel Ytterbrink wrote: *What is the problem you are trying to solve?* To create sophisticated single file webpages. That's maybe a bit vaguer than I was hoping for when asking the question. :-) Why does it have to be a single file? Would multipart MIME be acceptable? A single file is a solution, not a problem. What is the problem? [...] trying to build a more optimal standalone DAISY player (would be nice if i could rewrite it with web workers). Now that's a problem. :-) It seems like what you need is a package mechanism, not necessarily a way to run workers without an external script. On Fri, 15 Oct 2010, Jonas Sicking wrote: Allowing both blob URLs and data URLs for workers sounds like a great idea. I expect we'll add these in due course, probably around the same time we add cross-origin workers. (We didn't add them before because exactly how we do them depends on how we determine origins.) On Sat, 16 Oct 2010, Samuel Ytterbrink wrote: But then i got another problem, why is not file:///some_directory_where_the_html_are/ not the same domain as file:///some_directory_where_the_html_are/child_directory_with_ajax_stuff/. I understand if it was not okay to go closer to root when ajax, file:///where_all_secrete_stuff_are/ or /../../. That's not a Web problem. I recommend contacting your browser vendor about it. (It's probably security-related.) On Thu, 30 Dec 2010, Glenn Maynard wrote: On Thu, Dec 30, 2010 at 7:11 PM, Ian Hickson i...@hixie.ch wrote: Unfortunately we can't really require immediate failure, since there'd be no way to test it or to prove that it wasn't implemented -- a user agent could always just say oh, it's just that we take a long time to launch the worker sometimes. (Performance can be another hardware limitation.) Preferably, if a Worker is successfully created, the worker thread starting must not block on user code taking certain actions, like closing other threads. 
How can you tell the difference between the thread takes 3 seconds to start and the thread waits for the user to close a thread, if it takes 3 seconds for the user to close a thread? My point is from a black-box perspective, one can never firmly say that it's not just the browser being slow to start the thread. And we can't disallow the browser from being slow. That doesn't mean it needs to start immediately, but if I start a thread and then do nothing, it's very bad for the thread to sit in limbo forever because the browser expects me to take some action, without anything to tell me so. I don't disagree that it's bad. Hopefully browser vendors will agree and this problem will go away. If queuing is really necessary, please at least give us a way to query whether a worker is queued. It's queued if you asked it to start and it hasn't yet started. On Fri, 31 Dec 2010, Aryeh Gregor wrote: I've long thought that HTML5 should specify hardware limitations more precisely. We can't, because it depends on the hardware. For example, we can't say you must be able to allocate a 1GB string because the system might only have 500MB of storage. Clearly it can't cover all cases, and some sort of general escape clause will always be needed -- but in cases where limits are likely to be low enough that authors might run into them, the limit should really be standardized. It's not much of a standardised limit if there's still an escape clause. I'm happy to put recommendations in if we have data showing certain specific limits are needed for interop with real content. Unfortunately we can't really require immediate failure, since there'd be no way to test it or to prove that it wasn't implemented -- a user agent could always just say oh, it's just that we take a long time to launch the worker sometimes. (Performance can be another hardware limitation.) In principle this is so, but in practice it's not. 
In real life, you can easily tell an algorithm that runs the first sixteen workers and then stalls any further ones until one of the early ones exits, from an algorithm that just takes a while to launch workers sometimes. I think it would be entirely reasonable and would help interoperability in practice if HTML5 were to require that the UA must run all pending workers in some manner that doesn't allow starvation, and that if it can't do so, it must return an error rather than accepting a new worker. Failure to return an error should mean that the worker can be run soon, in a predictable timeframe, not maybe at some indefinite point in the future. All workers should run soon, not maybe in the future. Not running a worker should be an unusual circumstance. Errors that occur in unusual circumstances aren't errors that authors will check for. This discussion comes from Chrome having a very small limit currently. It is my understanding that this limit is temporary and not
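
As a rough illustration of the queueing behavior discussed in this message, the sketch below simulates a user agent that starts workers up to a small limit and queues the rest in FIFO order, so that no pending worker is starved. Everything here is hypothetical: SimulatedWorkerHost, the limit of 2, and the isQueued() query are invented for illustration and are not part of any spec or browser API.

```javascript
// Hypothetical simulation; no real Worker API is used.
class SimulatedWorkerHost {
  constructor(limit) {
    this.limit = limit;       // maximum number of concurrently running workers
    this.running = new Set(); // workers that have actually started
    this.pending = [];        // workers accepted but not yet started
  }

  // Accepts a worker; starts it now if below the limit, otherwise queues it.
  start(name) {
    if (this.running.size < this.limit) {
      this.running.add(name);
    } else {
      this.pending.push(name);
    }
  }

  // A worker is queued if it was asked to start and has not yet started.
  isQueued(name) {
    return this.pending.includes(name);
  }

  // When a worker exits, the oldest pending worker starts (FIFO order).
  exit(name) {
    this.running.delete(name);
    if (this.pending.length > 0 && this.running.size < this.limit) {
      this.running.add(this.pending.shift());
    }
  }
}

const host = new SimulatedWorkerHost(2);
["a", "b", "c"].forEach((name) => host.start(name));
const queuedBefore = host.isQueued("c"); // "c" waits because the limit is 2
host.exit("a");
const queuedAfter = host.isQueued("c");  // "c" starts once "a" has exited
```

In this model a script can at least ask whether a worker is still waiting, which is the kind of query requested in the thread; whether a real user agent could expose that reliably is a separate question.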
Re: [whatwg] Workers feedback
On Fri, Feb 4, 2011 at 6:43 PM, Ian Hickson i...@hixie.ch wrote:

My point is from a black-box perspective, one can never firmly say that it's not just the browser being slow to start the thread. And we can't disallow the browser from being slow.

I don't think a black-box test suite can ever generally prove compliance against a dishonest vendor.

All workers should run soon, not maybe in the future. Not running a worker should be an unusual circumstance. Errors that occur in unusual circumstances aren't errors that authors will check for. This discussion comes from Chrome having a very small limit currently. It is my understanding that this limit is temporary and not representative of what the API will be like once widely implemented.

There will probably always be some limit, even if it's very high (e.g. 1024). I think reasonable behavior when you exceed it is to throw an exception, but the spec seems to disallow this. Maybe that's why Chrome has the queueing behavior in the first place. Should there be a step early in 4.7.2/4.7.3 to permit browsers to throw an exception if the thread creation isn't allowed for any reason (without having any requirement to do so)?

Non-looping 0ms timers are common, to run code when the current call finishes.

Yeah, spec allows those fine.

I should have said non-recursive. That is, you can run a 0ms timer from another timer without causing recursion, but you'll receive the 4ms clamping unnecessarily. It might be possible to avoid this while still preventing 0ms looping timers from busy looping, but the spec prohibits that. (I could give an example of how this would happen, but I don't think it's important enough to go into further for now.)

-- Glenn Maynard
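
To make the exception proposal above concrete, here is a minimal sketch of how author code might react if worker creation were allowed to throw when a limit is hit. It is only an assumption about what such behavior could look like: makeLimitedFactory stands in for a user agent with a limit, and the "WorkerLimitError" name is invented, not a real DOM exception.

```javascript
// Hypothetical factory standing in for a user agent that refuses new workers past a limit.
function makeLimitedFactory(limit) {
  let created = 0;
  return function createWorker(scriptUrl) {
    if (created >= limit) {
      const err = new Error("worker limit reached");
      err.name = "WorkerLimitError"; // invented name, not a real DOM exception
      throw err;
    }
    created += 1;
    return { scriptUrl }; // placeholder object instead of a real Worker
  };
}

// Author-side handling: fall back to running the task inline when creation is refused.
function startOrFallback(createWorker, scriptUrl) {
  try {
    return { mode: "worker", worker: createWorker(scriptUrl) };
  } catch (err) {
    return { mode: "inline", reason: err.name };
  }
}

const limitedFactory = makeLimitedFactory(1);
const first = startOrFallback(limitedFactory, "task.js");
const second = startOrFallback(limitedFactory, "task.js");
```

The point of the sketch is that a synchronous failure gives the page a definite signal to act on, whereas a silently queued worker gives it none.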
Re: [whatwg] Workers feedback
On Tue, 18 Nov 2008, Alexey Proskuryakov wrote:

[...] If you implement the actual IPC using, say, a Unix socket, then you can just pass the actual socket along and do the same thing without blocking.

This is an interesting point. I do not know enough about how Unix domain sockets are passed around, but since the laws of nature are the same for them, it's either that:
- my FUD is unfounded, and it is in fact possible to implement the behavior;
- or semantics are very different for sockets. Some guesses are that queues may be strictly limited in size, message delivery may not be guaranteed, or that it is possible for client code to irreparably deadlock processes with them - something that JS developers obviously shouldn't be able to do.
I do not know which of the options is correct, but if the spec talked in terms of message passing, it would have been more easily verifiable.

I've revamped several parts of the spec today, so it may be that this is better now. I imagine that your original problem didn't go away, since I still talk about entangling things synchronously, but as far as I can tell there's not any way to actually distinguish what the spec describes from a more asynchronous message-passing mechanism, so long as you have an implementation strategy that handles the queuing of messages in the message channels separately from the passing of ports to other threads.

If you mean that two ports in two threads are posted to each other's threads at the same time,

Yes, this is what I'm talking about.

then deadlock would only happen in a naive implementation that didn't keep pumping its message queue while waiting for a response from the other thread. Instead what you would want to do is to ask for a semaphore to communicate with the other thread, and if you don't get it, see if anyone wants to communicate with you, and if they do, let them do whatever they want, and then try again.

Designs like this are quite prone to all sorts of crazy problems, too. As a simple example, the port waiting to be entangled may be sent further on, if you let them do whatever they want.

If you're using one of the mechanisms I outlined in my e-mail to Jonas earlier today, as far as I can tell you end up sidestepping these issues.

I'm certainly open to changing the algorithms around if a better solution exists in a manner that gets the same behavior. I'm certainly no expert on the topic (as I'm sure the above responses have shown).

Since the spec is written in form of algorithms, and relies on a number of arguable implicit assumptions on the implementation of their steps, it is hard to process or verify the algorithms. In my opinion (I'm not claiming expertise either!), a message passing design would be much clearer.

It's not clear to me how the current design _isn't_ a message-passing design. The only way to get a port into another thread is to post it over a previously created channel, which is message passing.

-- Ian Hickson
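
One way to picture the strategy mentioned in this message, in which message queuing is handled separately from handing a port to another thread, is the single-threaded simulation below. It is a sketch under assumptions, not the spec's algorithm: SimPort, createChannel and transferPort are invented names, and real MessagePort objects do not expose their queues like this.

```javascript
// Hypothetical single-threaded model of entangled ports; not the real MessagePort API.
class SimPort {
  constructor(owner) {
    this.owner = owner; // name of the simulated thread currently holding this port
    this.peer = null;   // the entangled port at the other end of the channel
    this.queue = [];    // messages delivered to this port but not yet read
  }

  // Posting only appends to the peer's queue; it never waits on the peer's owner.
  postMessage(message) {
    this.peer.queue.push(message);
  }

  // The current owner drains whatever has accumulated so far.
  receiveAll() {
    const messages = this.queue;
    this.queue = [];
    return messages;
  }
}

function createChannel(ownerA, ownerB) {
  const a = new SimPort(ownerA);
  const b = new SimPort(ownerB);
  a.peer = b;
  b.peer = a;
  return [a, b];
}

// Handing a port to another thread changes only its owner; pending messages travel with it.
function transferPort(port, newOwner) {
  port.owner = newOwner;
  return port;
}

const [portA, portB] = createChannel("main", "worker1");
portA.postMessage("sent before transfer");
transferPort(portB, "worker2");
portA.postMessage("sent after transfer");
const delivered = portB.receiveAll();
```

Because the sender never blocks on the receiving thread in this model, a port that is in the middle of being handed over still accumulates its messages in order, which is the property the thread is arguing about.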
Re: [whatwg] Workers feedback
on 18.11.2008 06:43, Ian Hickson at [EMAIL PROTECTED] wrote:

I'd be more than happy with a separate interface if the objects actually behaved differently. One example of a good reason to have separate interfaces was recently proposed here: shared workers should outlive their creators. This is the sort of difference that would make having a separate API reasonable, in my opinion.

You don't think that the way that a handle to a shared worker can be obtained dynamically without contact with the original creator is enough of a difference?

I think that this is a difference in a single function (namely, constructor) behavior. One constructor can create named workers, and another can create unnamed (null-named) workers, which doesn't mean that they need to create different kinds of objects. But I've already said it before, so this is not new feedback.

The complication here seems to be in the way you are implementing this. Port entangling should be atomic across threads -- when you are sending a port over another channel, you should block both threads, create the new object,

Sorry if this looks like I'm just trying to be difficult, but you already have a chance to deadlock here. If the blocked thread was inside malloc(), then attempting to allocate memory in the main thread will freeze the application. This is very much an implementation concern, and in this particular case, it is easily resolvable (you could allocate memory before locking). Unfortunately, implementation bugs like this are notoriously hard to find with testing, as they may be triggered by very specific usage scenarios. So, even having a working implementation doesn't really mean that a spec written in this manner is implementable, paradoxically.

update the information,

I'm not sure what you mean here - certainly not hunting down all references to the old entangled port that may be anywhere, or fixing all results of calculations that involved its address? Yet, this is necessary if you are blocking threads at arbitrary moments. Again, an implementation concern, but the spec as it is talks about algorithms, and not observable effects, and it is not clear to me what the observable effects should be in cases where synchronous communication is specced.

shunt all pending messages over, and then resume the threads. If you implement the actual IPC using, say, a Unix socket, then you can just pass the actual socket along and do the same thing without blocking.

This is an interesting point. I do not know enough about how Unix domain sockets are passed around, but since the laws of nature are the same for them, it's either that:
- my FUD is unfounded, and it is in fact possible to implement the behavior;
- or semantics are very different for sockets. Some guesses are that queues may be strictly limited in size, message delivery may not be guaranteed, or that it is possible for client code to irreparably deadlock processes with them - something that JS developers obviously shouldn't be able to do.
I do not know which of the options is correct, but if the spec talked in terms of message passing, it would have been more easily verifiable.

For example, any method that entangles two ports blocks until both threads are synchronised and entangled. This will cause deadlocks - if portB' is sent to the first thread as portB'' in the above scheme, the lock will not let synchronization ever finish.

Could you elaborate on this? I'm not sure I follow what you mean. If you mean that two ports in two threads are posted to each other's threads at the same time,

Yes, this is what I'm talking about.

then deadlock would only happen in a naive implementation that didn't keep pumping its message queue while waiting for a response from the other thread. Instead what you would want to do is to ask for a semaphore to communicate with the other thread, and if you don't get it, see if anyone wants to communicate with you, and if they do, let them do whatever they want, and then try again.

Designs like this are quite prone to all sorts of crazy problems, too. As a simple example, the port waiting to be entangled may be sent further on, if you let them do whatever they want.

I'm certainly open to changing the algorithms around if a better solution exists in a manner that gets the same behavior. I'm certainly no expert on the topic (as I'm sure the above responses have shown).

Since the spec is written in form of algorithms, and relies on a number of arguable implicit assumptions on the implementation of their steps, it is hard to process or verify the algorithms. In my opinion (I'm not claiming expertise either!), a message passing design would be much clearer.

There are lots of discussions about designing multi-threaded algorithms on the net, one I liked quite a bit recently is http://codemines.blogspot.com/2006/09/another-thread-on-threads.html - it presents the do's and don'ts very well.

- WBR, Alexey Proskuryakov.
[whatwg] Workers feedback
Summary:
* added window.location.resolveURL(). I haven't added it to the Location in Workers -- should I?
* no other normative changes.

On Thu, 13 Nov 2008, Jonas Sicking wrote:

Actually, i think we should remove the location accessor as well. I can't think of a common enough use case that warrants an explicit API. You can always transfer the data through postMessage. What are the use cases? Also note that we can't use it with shared workers since they can be connected to several pages from different uris.

On Fri, 14 Nov 2008, Aaron Boodman wrote:

In Gears, authors asked us for a location object because they had applications that could be served from different origins. Also in Gears, workers can be accessed across origins, so incoming messages need to be validated that they are from the correct origin. It's more convenient for a worker to access its own origin through an API the UA provides than to bake it into the script or send it in the first postMessage(). Although HTML5 workers don't have the ability to be accessed across origins, you do have the ability to receive ports that are connected to other origins, right? So this might still be an issue. Other than that, I can't think of any specific use cases. It seemed like a generally useful API to have access to.

It seems somewhat useful to me too. Is the implementation cost high?

On Thu, 13 Nov 2008, Jonas Sicking wrote:

It returns the script's URL, not the page's.

Oh?! Then I understand even less what the use case is. This is something that doesn't exist for <script> and i've never heard anyone ask for it (granted, that is not proof that no one wants it).

On Fri, 14 Nov 2008, Alexey Proskuryakov wrote:

Actually, this exists for <script> if you say that it returns the URL that the script execution context uses for resolving relative URLs. It just so happens that for <script>, it's document URL, and for workers, it's script URL.

On Fri, 14 Nov 2008, Jonas Sicking wrote:

Actually, this exists for <script> if you say that it returns the URL that the script execution context uses for resolving relative URLs. It just so happens that for <script>, it's document URL, and for workers, it's script URL.

That works the same in Workers iirc, without the need for a .location API.

How is a script supposed to know what base URL is used to resolve URLs if we remove .location?

On Fri, 14 Nov 2008, Jonas Sicking wrote:

Actually, come to think of it, what is the BaseURI for workers? I.e. what URI is importScripts and XHR resolved against?

http://www.whatwg.org/specs/web-workers/current-work/#base-urls

On Fri, 14 Nov 2008, Jonas Sicking wrote:

The base URL of a URL passed to an API in a worker is the absolute URL given that the worker's location attribute represents. Both the origin and effective script origin of scripts running in workers are the origin of the absolute URL given that the worker's location attribute represents.

Hmm.. this makes a lot of sense for importScripts, but for XHR you probably want the baseURI to be that of the opening page, since it's quite likely that the opening page gave you a URI to open and process. Of course that would be quite confusing (different baseURIs for different APIs), as well as impossible for shared workers as they don't have a single document as opener. What we need is an API for resolving relative URIs, that way scripts can at least do the resolving manually. We could also add an API for getting the baseURI of the document on the other side of a port (should possibly live on the message event).

On Fri, 14 Nov 2008, Aaron Boodman wrote:

My expectation was that the base URI would always be the URI of the worker. I think of opening a worker a lot like starting a new process, or opening a new window. I would expect that the new process has its own base URI which is the same URI as the script it is running.
On Fri, 14 Nov 2008, Michael Nordman wrote: My mental model for workers is almost a GUI-less document. A dedicated worker is akin to a nested iframe. A shared worker is akin to a top window or tab. In that model, base is src of the resource loaded into the context. On Fri, 14 Nov 2008, Jonas Sicking wrote: I think in many cases for XHR that is probably not what is the most useful. Consider for example:

example.html:

  w = new Worker("http://docs.google.com/MSWordParser.js");
  w.onmessage = function (e) { doneLoadingDoc(JSON.parse(e.data)); }
  w.postMessage("/documents/report.doc");

MSWordParser.js:

  onmessage = function(e) {
    xhr = new XMLHttpRequest();
    xhr.open("GET", e.data, false);
    xhr.send();
    res = mainParse(xhr.responseText);
    postMessage(JSON.stringify(res));
  }

The above won't work since the URL is relative to the worker JS file, not relative to example.html. So an absolute URI always needs to be used which sort of sucks as is
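To make the base-URL problem in Jonas's example concrete, here is a minimal sketch that resolves the posted path against the two candidate bases with the standard URL constructor. The page address http://example.com/example.html is an assumption made for illustration (the message only names the file example.html), and this is ordinary relative-URL resolution, not a quotation of the spec's algorithm.

```javascript
// Resolve the path posted to the worker against two candidate base URLs.
// Assumption: the page is served from http://example.com/ (hypothetical).
const pageURL = "http://example.com/example.html";
const workerScriptURL = "http://docs.google.com/MSWordParser.js";
const relativePath = "/documents/report.doc";

// Base = the worker script's URL, as the quoted spec text describes.
const resolvedInWorker = new URL(relativePath, workerScriptURL).href;
console.log(resolvedInWorker); // http://docs.google.com/documents/report.doc

// Base = the opening page's URL, which is what the page author meant.
const resolvedInPage = new URL(relativePath, pageURL).href;
console.log(resolvedInPage); // http://example.com/documents/report.doc

// Workaround from the thread: the page resolves the URL itself and posts
// the absolute form, so the worker's base URL no longer affects the result.
const absoluteForWorker = new URL(relativePath, pageURL).href;
console.log(new URL(absoluteForWorker, workerScriptURL).href === resolvedInPage); // true
```

The last line shows why posting an absolute URL sidesteps the disagreement: an absolute URL resolves to itself whatever base the worker uses.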
Re: [whatwg] Workers feedback
Ian Hickson wrote: On Fri, 14 Nov 2008, Shannon wrote: I don't see any value in the user-agent specified amount of time delay in stopping scripts. How can you write cleanup code when you have no consistency in how long it gets to run (or if it runs at all)? The user-agent specified amount of time delay is implemented by every major browser for every script they run today. How can people write clean code when they have no consistency in how long their scripts will run (or if they will run at all)? Why is this any different? Why does that matter? I think you're asking the wrong question. As designers of a new spec the question should be how can we fix this?. If the answer is to include a mandatory cleanup window for ALL scripts then that should be considered (even if that window is 0 seconds). If you can't rely on a cleanup then it becomes necessary to have some kind of repair/validation sequence run on the data next time it is accessed to check if it's valid. You need to do that anyway to handle powerouts and crashes. That was the point of my concern. Given that the only 100% reliable cleanup window is 0 seconds it would be more consistent (and honest) to make that the spec. Offering a cleanup window of uncertain length is somewhat pointless and bound to cause incompatibilities across UAs. Is there a strong argument against making 0 seconds mandatory, given that anything else is inconsistent across UA, architecture and circumstance? It's not clear which document the cookie would be for. localStorage is as light-weight than cookies (lighter-weight, arguably), though, so that should be enough, no? Fair enough. Shannon
Re: [whatwg] Workers feedback
On Nov 14, 2008, at 2:30 AM, Ian Hickson wrote: I believe that the idea that the API for shared and dedicated workers should be the same is misguided. The spec used to make the two cases identical. The result was confusion, and the dedicated case was much more complex than necessary. This is the strongest argument for having separate interfaces of all presented, but in my opinion, it misses some important data points:
- Was introducing a new interface the only way to resolve the confusion? Maybe renaming a method or two would have had a positive effect?
- Did it help resolve the confusion?
- What kind of danger did the confusion pose? Would authors write under-performing and unreliable code, or refuse to use such a cumbersome API at all? Or was it just a momentary delay for some, resolved quickly and harmlessly?
I don't remember any of these being discussed here. Shared workers and dedicated workers are fundamentally different and have different needs, and we should expose these needs in ways optimised for the two cases. The basic need is that dedicated workers be able to have a two-way communication channel with their creators, and shared workers be able to have a two-way communication with each user of the worker. I think that this argument is false. It is normal for a single API to support multiple use cases. Replace Worker with XMLHttpRequest, and we end up with separate interfaces for GET and POST. [skipped code samples] For dedicated workers though that's way more complexity than we want to require of authors -- why do they have to listen for a port when there will always be exactly one? So it makes sense to use It is not true that there will always be one - additional ports can be passed in via postMessage(). Now, we've at this point made the two different already, so as to simplify the dedicated worker, so we could (and the spec does) make the dedicated worker even simpler while we're at it. 
Returning to the XMLHttpRequest example, we really can combine open() and send() for XMLHttpRequestGET, but not for XMLHttpRequestPOST. Generally, the interfaces can be very different if we try to. One way to do that is to bury the ports into the Worker and global scope objects. If we do one side, though, we have to do the other, because it would be really weird to have a message channel that half-acts like a two-port channel and half doesn't. I agree. For example, if we bury it, we shouldn't expose .close(), since it's better for the worker to be closed using the actual .close() or .terminate() API, but if we only bury one side, then one end could close the pipe and not the other, and we'd have to make sure we expose .onclose on the buried end, and so forth. I agree that if we have a direct interface for messaging on one side, we should have it on both sides. So, we end up with what the spec has now. I think what we have now is better than making dedicated and shared workers superficially the same (as the spec used to be, and as the people involved in this thread argued was bad) is more confusing for authors. I'd be more that happy with a separate interface if the objects actually behaved differently. One example of a good reason to have separate interfaces was recently proposed here: shared workers should outlive their creators. This is the sort of difference that would make having a separate API reasonable, in my opinion. At this point, if the only arguments for changing the API are it's confusing for authors, then I'd rather not change the API. We got to where we are today by carefully considering what would be better for authors. We could continue going back-and-forth and reverting earlier decisions until the cows come home, but I see no benefit to doing so. I don't think it's inappropriate to continue back-and-forth until there is at least one reasonably complete implementation validating the spec. 
Currently, the Mozilla implementation is very different in spirit, not supporting MessagePorts at all. * I still don't buy the utility of passing around MessagePorts, so I suggest we table that for v2. It can always be added back later. Since they so drastically affect the API design, I think putting them off is a mistake. We might end up constraining ourselves in unobvious ways. I agree. - WBR, Alexey Proskuryakov
Re: [whatwg] Workers feedback
On Fri, Nov 14, 2008 at 9:00 AM, Jonas Sicking [EMAIL PROTECTED] wrote: Oh?! Then I understand even less what the use case is. This is something that doesn't exist for script and i've never heard anyone ask for it (granted, that is not proof that no one wants it). a script can figure out where it is by using try { null.null } catch (e) { location=e.fileName } and i do have scripts which do stuff like that. that said i don't see a reason to add such a feature, just let people pass it if they need to.
Re: [whatwg] Workers feedback
On Nov 14, 2008, at 10:00 AM, Jonas Sicking wrote: What are the use cases? Also note that we can't use it with shared workers since they can be connected to several pages from different uris. It returns the script's URL, not the page's. Oh?! Then I understand even less what the use case is. This is something that doesn't exist for script and i've never heard anyone ask for it (granted, that is not proof that no one wants it). Actually, this exists for script if you say that it returns the URL that the script execution context uses for resolving relative URLs. It just so happens that for script, it's the document URL, and for workers, it's the script URL. - WBR, Alexey Proskuryakov
Re: [whatwg] Workers feedback
Ian, Thanks for taking the time to read and understand all the feedback. Although this is not my most preferred design for the API, I can live with it. I'm happy that we removed startConversation(). I think that was just extra complexity on top of an already large API. As for putting forward contradictory suggestions, I apologize for that. In the future, I will try to form final opinions before opining. Thanks, - a
Re: [whatwg] Workers feedback
On Thu, Nov 13, 2008 at 10:17 PM, Jonas Sicking [EMAIL PROTECTED] wrote: Aaron Boodman wrote: On Thu, Nov 13, 2008 at 8:45 PM, Ian Hickson [EMAIL PROTECTED] wrote: On Thu, 13 Nov 2008, Jonas Sicking wrote: Actually, i think we should remove the location accessor as well. I can't think of a common enough use case that warrants an explicit API. You can always transfer the data through postMessage. I added that one because Aaron asked for it. Aaron? I think it's useful. Obviously it's not totally necessary. What are the use cases? Also note that we can't use it with shared workers since they can be connected to several pages from different uris. It represents the URI of the worker itself, not the URI of the calling page. In Gears, authors asked us for a location object because they had applications that could be served from different origins. Also in Gears, workers can be accessed across origins, so incoming messages need to be validated that they are from the correct origin. It's more convenient for a worker to access its own origin through an API the UA provides than to bake it into the script or send it in the first postMessage(). Although HTML5 workers don't have the ability to be accessed across origins, you do have the ability to receive ports that are connected to other origins, right? So this might still be an issue. Other than that, I can't think of any specific use cases. It seemed like a generally useful API to have access to. - a
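The origin check Aaron describes can be sketched roughly as follows. This is only an illustration: workerLocation stands in for whatever the worker's location accessor would report, senderOrigin for an origin string that arrives with a message, and both example URLs are made up.

```javascript
// Compare the origin of a message sender with the worker's own origin.
// Assumption: workerLocation plays the role of the worker's location URL,
// and senderOrigin is an origin string supplied alongside a message.
function originOf(url) {
  // URL.origin is scheme + host + port, e.g. "https://app.example.com".
  return new URL(url).origin;
}

function isFromOwnOrigin(workerLocation, senderOrigin) {
  return originOf(workerLocation) === senderOrigin;
}

// Hypothetical worker script URL.
const workerLocation = "https://app.example.com/js/worker.js";

console.log(isFromOwnOrigin(workerLocation, "https://app.example.com")); // true
console.log(isFromOwnOrigin(workerLocation, "https://other.example.net")); // false
```

Without a location accessor, the same check needs the origin baked into the script or sent in the first postMessage(), which is the inconvenience Aaron mentions.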
Re: [whatwg] Workers feedback
Alexey Proskuryakov wrote: On Nov 14, 2008, at 10:00 AM, Jonas Sicking wrote: What are the use cases? Also note that we can't use it with shared workers since they can be connected to several pages from different uris. It returns the script's URL, not the page's. Oh?! Then I understand even less what the use case is. This is something that doesn't exist for script and i've never heard anyone ask for it (granted, that is not proof that no one wants it). Actually, this exists for script if you say that it returns the URL that the script execution context uses for resolving relative URLs. It just so happens that for script, it's the document URL, and for workers, it's the script URL. That works the same in Workers iirc, without the need for a .location API. Actually, come to think of it, what is the BaseURI for workers? I.e. what URI is importScripts and XHR resolved against? / Jonas
[whatwg] Workers feedback
I haven't written a summary of changes because this is a rather involved issue and I'd like everyone who has an opinion to actually read this. I missed a few e-mails sent in the last few hours in this reply, as I started this yesterday. I'll read and respond to those in a bit.

On Thu, 28 Aug 2008, Jonas Sicking wrote: The spec currently says: Once the WorkerGlobalScope's closing flag is set to true, the queue must discard anything else that would be added to it. Effectively, once the closing flag is true, timers stop firing, notifications for all pending asynchronous operations are dropped, etc. Does this mean that anything already on the queue will remain there? Or will it be dropped? It sounds like it will remain, but it's somewhat unclear.

I've added a parenthetical clarifying this.

In general I think the three shutdown mechanisms that exist are somewhat messy:

* Kill a worker
* Terminate a worker
* WorkerGlobalScope.close()

It seems excessive with 3 possible shutdown sequences, but more importantly the differences between them seems unnecessarily big. Mostly for users, but to a small extent also for implementors. Currently the situation is as follows:

                          | Abort    | Processes | Fires    | Fires
                          | current  | more      | close on | close on
                          | script   | events    | scope    | tangled
                          |          |           |          | ports
--------------------------+----------+-----------+----------+---------
WorkerGlobalScope.close() | No       | Maybe[2]  | Yes      | Yes
Kill a worker             | Maybe[1] | Maybe[1]  | Maybe[1] | No
Terminate a worker        | Yes      | No        | Yes      | No

[1] Implementation dependent. Presumably depends on how much patience that the implementation thinks its users has.
[2] Depends on if the event has been placed in the queue yet or not, somewhat racy.

There are other ways to kill the worker:

The worker is orphaned    | Maybe[1] | Yes       | Yes      | No
The browser dies          | Yes      | No        | No       | No

Also, your No in the top-left cell is really a Maybe[1], since if the script doesn't stop then it'll trigger the Kill algorithm.

This seems excessively messy.
The number of differences in the columns and the number of maybes seems bad. I propose the following: * Remove the Kill a worker algorithm and use Terminate a worker everywhere it is used. I strongly disagree with that. The whole point of having a distinction is that we don't want scripts just being killed willy-nilly when the user navigates away from the page. Scripts in the page itself aren't terminated, why would we want such drastic behaviour in the threads? * Make WorkerGlobalScope.close() not process any more events. I.e. make setting the 'closing flag' to true always clear out all events except a single close event. Again, this seems bad as it would mean that if you navigated away from a page that happened to use a worker, you could get data loss. * Always fire close on tangled ports. In many cases this will be a no-op since we're doing it in workers that are being closed. However if the port is in another window or a shared worker this might not be the case. I thought we weren't doing this because it exposed the details of garbage collection? On Fri, 12 Sep 2008, Aaron Boodman wrote: * We have discussed having onerror expose runtime script errors that happened inside the worker. I don't think this makes sense for shared workers, so I propose that it be spec'd to only expose load errors. Script errors can still be exposed via a global onerror property inside the worker, and they can still be reported to the error console. I don't think having script errors that happened inside a worker be exposed outside it is that useful (load errors are useful, though). Right now only load errors are reported. I'll wait til the API is more stable before exposing script errors and the like at all (whether on a global onerror or whatever). It is noted as an XXX issues in the spec source. On Thu, 28 Aug 2008, Jonas Sicking wrote: Why is importScripts imposing a same origin restriction? 
This won't increase security in any way since cross-origin scripts can always be loaded from the main thread. I think cross-site loading is fairly common exactly for the case that importScripts, which is loading libraries. I don't recall the precise reason, but I seem to recall concern over specific attack vectors are what caused us to restrict this. Also, the spec doesn't seem clear on what to do if compiling a script fails. I think some sort of exception should be thrown, probably the same one that is thrown if eval() is given a non-compiling script. Done. On Tue, 30 Sep 2008, Alexey Proskuryakov wrote: I've been trying to
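The closing-flag rule quoted near the top of this message can be modelled with a small queue. This is a sketch only: WorkerEventQueue is a made-up class rather than anything in the spec, and it assumes Jonas's reading that events already in the queue remain while later additions are discarded.

```javascript
// Hypothetical model of a worker's event queue with a closing flag.
// Assumption: events queued before the flag is set stay in the queue;
// anything that would be added afterwards is discarded.
class WorkerEventQueue {
  constructor() {
    this.events = [];
    this.closing = false;
  }

  // Returns true if the event was queued, false if it was discarded.
  enqueue(event) {
    if (this.closing) return false;
    this.events.push(event);
    return true;
  }

  // Models setting the closing flag to true.
  close() {
    this.closing = true;
  }

  // Hands back everything still pending and empties the queue.
  drain() {
    const pending = this.events;
    this.events = [];
    return pending;
  }
}

const queue = new WorkerEventQueue();
queue.enqueue("message-1");
queue.close();
queue.enqueue("timer-fired"); // discarded: the closing flag is already set
console.log(queue.drain());   // [ 'message-1' ]
```

Under this reading, the Maybe[2] entry in the table corresponds to whether enqueue() happened to run before or after close().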
Re: [whatwg] Workers feedback
I don't see any value in the user-agent specified amount of time delay in stopping scripts. How can you write cleanup code when you have no consistency in how long it gets to run (or if it runs at all)? If you can't rely on a cleanup then it becomes necessary to have some kind of repair/validation sequence run on the data next time it is accessed to check if it's valid. If you can do that then you didn't really need a cleanup anyway. As far as I can tell the user-agent specified amount of time is going to be a major source of hard-to-spot, hard-to-test bugs (since full testing probably involves closing and killing browsing contexts in different ways followed by a login sequence and several page navigations to get back to the page). I can see authors maybe performing these tests in IE but not across a range of browsers and computer specifications. The spec really needs to make a decision here. Either consistently provide no cleanup window or make it a requirement to provide a fixed number of seconds, which is still unreliable but at least within a smaller margin. Failure to do so will impact heavily on users of less popular browsers. The specification for message ports is still limited to strings. If no effort is going to be made to allow numbers, arrays, structs and binary data then I'd suggest Workers be given functions to serialise/deserialise these objects. Since the whole point of workers is presumably the processing of large datasets then a reliable and low-overhead means of passing these sets between workers and main threads (without resorting to SQL, XMLHttpRequest or other indirection) is an essential function. WorkerUtils does not implement document.cookie. I imagine this would be very useful in conjunction with cleanup code to flag if a cleanup operation failed to complete. Storage and Database interfaces are too heavy for the purpose of simple data like this. Shannon
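The repair/validation sequence mentioned above could look something like this sketch. A Map stands in for a persistent key/value store (the assumption being that the real store survives the worker being stopped), and writeRecord, readRecord and the "dirty" key are names invented for the illustration.

```javascript
// Mark an update as in progress before touching the data and clear the mark
// afterwards, so the next reader can tell whether the update ever finished.
// Assumption: `store` behaves like a persistent key/value store.
function writeRecord(store, value, interruptBeforeCommit = false) {
  store.set("dirty", "1");                     // update in progress
  store.set("record", JSON.stringify(value));
  if (interruptBeforeCommit) return;           // simulate the script being stopped here
  store.delete("dirty");                       // update completed
}

function readRecord(store, fallback) {
  if (store.has("dirty")) {
    // The previous update never finished: repair instead of trusting the data.
    store.set("record", JSON.stringify(fallback));
    store.delete("dirty");
  }
  const raw = store.get("record");
  return raw === undefined ? fallback : JSON.parse(raw);
}

const store = new Map();
writeRecord(store, { count: 1 });
console.log(readRecord(store, { count: 0 }));  // { count: 1 }

writeRecord(store, { count: 2 }, true);        // interrupted before commit
console.log(readRecord(store, { count: 0 }));  // { count: 0 }
```

The check runs on every read and does not depend on any cleanup code having had time to run, which is the situation being argued about here.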
Re: [whatwg] Workers feedback
On Fri, 14 Nov 2008, Shannon wrote: I don't see any value in the "user-agent specified amount of time" delay in stopping scripts. How can you write cleanup code when you have no consistency in how long it gets to run (or if it runs at all)? The "user-agent specified amount of time" delay is implemented by every major browser for every script they run today. How can people write clean code when they have no consistency in how long their scripts will run (or if they will run at all)? Why is this any different? If you can't rely on a cleanup then it becomes necessary to have some kind of repair/validation sequence run on the data next time it is accessed to check if it's valid. You need to do that anyway to handle powerouts and crashes. The spec really needs to make a decision here. Either consistently provide no cleanup window or make it a requirement to provide a fixed number of seconds, which is still unreliable but at least within a smaller margin. Failure to do so will impact heavily on users of less popular browsers. I don't see how this is any different than the current script abort timeout feature in browsers. The specification for message ports is still limited to strings. If no effort is going to be made to allow numbers, arrays, structs and binary data then I'd suggest Workers be given functions to serialise/deserialise these objects. We're going to add JSON-serialisable data support in due course. I'd rather get the rest of the API nailed down first though. WorkerUtils does not implement document.cookie. I imagine this would be very useful in conjunction with cleanup code to flag if a cleanup operation failed to complete. Storage and Database interfaces are too heavy for the purpose of simple data like this. It's not clear which document the cookie would be for. localStorage is as light-weight as cookies (lighter-weight, arguably), though, so that should be enough, no? -- Ian Hickson, http://ln.hixie.ch/
Re: [whatwg] Workers feedback
Ian Hickson wrote: On Fri, 14 Nov 2008, Shannon wrote: I don't see any value in the user-agent specified amount of time delay in stopping scripts. How can you write cleanup code when you have no consistency in how long it gets to run (or if it runs at all)? The user-agent specified amount of time delay is implemented by every major browser for every script they run today. How can people write clean code when they have no consistency in how long their scripts will run (or if they will run at all)? Why is this any different? For what it's worth we set all user-agent specified amount of time times as close to 0 as we can in our implementation. There's always a tiny amount of delay in cross-thread communication, in the order of fractions of a second. The specification for message ports is still limited to strings. If no effort is going to be made to allow numbers, arrays, structs and binary data then I'd suggest Workers be given functions to serialise/deserialise these objects. We're going to add JSON-serialisable data support in due course. I'd rather get the rest of the API nailed down first though. For what it's worth we support the ECMAScript JSON object inside the worker (along with things like Math and Date) which means that you can manually JSON-ify you objects when communicating. WorkerUtils does not implement document.cookie. I imagine this would be very useful in conjunction with cleanup code to flag if a cleanup operation failed to complete. Storage and Database interfaces are too heavy for the purpose of simple data like this. It's not clear which document the cookie would be for. localStorage is as light-weight than cookies (lighter-weight, arguably), though, so that should be enough, no? If you really need cookie information you can always call postMessage and ask for it. Actually, i think we should remove the location accessor as well. I can't think of a common enough use case that warrants an explicit API. 
You can always transfer the data through postMessage. / Jonas
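Manually JSON-ifying objects over a string-only port, as suggested above, might look like the following sketch. StringOnlyPort is a made-up stand-in for the string-limited message port under discussion, not a browser API, and it delivers messages synchronously purely to keep the example short.

```javascript
// Hypothetical stand-in for a message port that only accepts strings.
// Assumption: real ports deliver asynchronously; this one calls the
// handler immediately so the example stays self-contained.
class StringOnlyPort {
  constructor() {
    this.onmessage = null;
  }

  postMessage(data) {
    if (typeof data !== "string") {
      throw new TypeError("only strings are supported");
    }
    if (this.onmessage) this.onmessage({ data });
  }
}

// Sender side: serialise the object before posting it.
function sendObject(port, value) {
  port.postMessage(JSON.stringify(value));
}

// Receiver side: turn the string back into an object.
function receiveObject(event) {
  return JSON.parse(event.data);
}

const port = new StringOnlyPort();
let received = null;
port.onmessage = (event) => {
  received = receiveObject(event);
};

sendObject(port, { rows: [1, 2, 3], label: "report" });
console.log(received.rows.length); // 3
console.log(received.label);       // report
```

Only JSON-representable data survives the round trip, so binary data or objects with methods would still need some other encoding.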
Re: [whatwg] Workers feedback
On Thu, 13 Nov 2008, Jonas Sicking wrote: Actually, i think we should remove the location accessor as well. I can't think of a common enough use case that warrants an explicit API. You can always transfer the data through postMessage. I added that one because Aaron asked for it. Aaron? -- Ian Hickson, http://ln.hixie.ch/
Re: [whatwg] Workers feedback
On Thu, Nov 13, 2008 at 8:45 PM, Ian Hickson [EMAIL PROTECTED] wrote: On Thu, 13 Nov 2008, Jonas Sicking wrote: Actually, i think we should remove the location accessor as well. I can't think of a common enough use case that warrants an explicit API. You can always transfer the data through postMessage. I added that one because Aaron asked for it. Aaron? I think it's useful. Obviously it's not totally necessary. - a
Re: [whatwg] Workers feedback
On Thu, 13 Nov 2008, Jonas Sicking wrote: Aaron Boodman wrote: On Thu, Nov 13, 2008 at 8:45 PM, Ian Hickson [EMAIL PROTECTED] wrote: On Thu, 13 Nov 2008, Jonas Sicking wrote: Actually, i think we should remove the location accessor as well. I can't think of a common enough use case that warrants an explicit API. You can always transfer the data through postMessage. I added that one because Aaron asked for it. Aaron? I think it's useful. Obviously it's not totally necessary. What are the use cases? Also note that we can't use it with shared workers since they can be connected to several pages from different uris. It returns the script's URL, not the page's. -- Ian Hickson, http://ln.hixie.ch/
Re: [whatwg] Workers feedback
Aaron Boodman wrote: On Thu, Nov 13, 2008 at 8:45 PM, Ian Hickson [EMAIL PROTECTED] wrote: On Thu, 13 Nov 2008, Jonas Sicking wrote: Actually, i think we should remove the location accessor as well. I can't think of a common enough use case that warrants an explicit API. You can always transfer the data through postMessage. I added that one because Aaron asked for it. Aaron? I think it's useful. Obviously it's not totally necessary. What are the use cases? Also note that we can't use it with shared workers since they can be connected to several pages from different uris. / Jonas
Re: [whatwg] Workers feedback
Ian Hickson wrote: On Thu, 13 Nov 2008, Jonas Sicking wrote: Aaron Boodman wrote: On Thu, Nov 13, 2008 at 8:45 PM, Ian Hickson [EMAIL PROTECTED] wrote: On Thu, 13 Nov 2008, Jonas Sicking wrote: Actually, i think we should remove the location accessor as well. I can't think of a common enough use case that warrants an explicit API. You can always transfer the data through postMessage. I added that one because Aaron asked for it. Aaron? I think it's useful. Obviously it's not totally necessary. What are the use cases? Also note that we can't use it with shared workers since they can be connected to several pages from different uris. It returns the script's URL, not the page's. Oh?! Then I understand even less what the use case is. This is something that doesn't exist for script and i've never heard anyone ask for it (granted, that is not proof that no one wants it). / Jonas
Re: [whatwg] Workers feedback
I've made createWorker() and createNamedWorker() return Worker objects with a 'port' attribute that represents the original message port. I've removed the 'utils' object and put all the APIs back onto the global scope. I've changed importScript() to importScripts() and made it take any number of URLs. I've made unload be fired in a worker when the worker's lifetime expires. On Wed, 6 Aug 2008, Chris Prince wrote: I'll try to write up some more detailed comments later, but for the question above about a 'name' parameter and overoading: My current thinking is that the best API design for createWorker() is: MessagePort createWorker(worker_body, [WorkerOptions]) The reason: workers are a powerful concept, and it's very likely we'll want to extend them over time. I agree in general, but I think named workers are an important enough distinction that it's worth a second constructor. I'd say other options are likely to be just as 'important' as name, so I wouldn't special-case that parameter. A 'WorkerOptions' parameter supports naming, but future expansion as well. name is pretty important, I don't know of anything other than the URL of the script that I would say is more important. On Thu, 7 Aug 2008, Chris Prince wrote: It is short-sighted to expect you can fully spec something as large as workers. This is a significant new concept, and we are only scratching the surface. So why back ourselves into a corner? Let's be smart about the API design, and allow for future flexibility. I don't see any downsides to the approach outlined above. If you have something specific in mind, please let us know. There's no downside per se, it's just not neccessary at this point. We can always add more arguments later, including one for a lot of optional parameters should we decide we need an object to do that. On Thu, 7 Aug 2008, Jonas Sicking wrote: [utils] I don't feel very strongly about this right now. 
It's something we started debating at mozilla and I think we'll debate it a bit more before coming to a conclusion. I'm fine with putting it in the global scope for now. Sorry, i didn't mean to ask for an immediate action on this yet. You said things were urgent, I assumed I should act on all your requests. :-) One solution I thought about is to have a base interface such as: interface MessagePort { void postMessage(...); attribute EventListener onmessage; ... } Then have interface Worker : MessagePort { bool isShared(); worker specific stuff } interface PipePort : MessagePort { attribute Window ownerWindow; Pipe specific stuff } And then make the APIs that we want to allow passing around pipe-ends take a PipePort object. The result is basically that workers are separate objects from what's returned for (new MessagePipe()).port1, but they share some API. The problem there though is that when you receive a port, you have no idea if it's a port into another frame, or a port into a worker that happened to be created as a pipe, or a port into a worker that happened to be created when the worker was created. I don't see why you would ever need to have that distinction, either. The whole point of ports as an architectural concept is that they provide an opaque interface, and who exactly is on the other side is not something that you should need to have any information about. In the design in the spec now, there's a Worker object that has 'load', 'error', and 'unload' events on, which fire at the appropriate times in the lifetime of the worker, and there's a .pipe attribute that provides a pipe into the worker. [importScripts()] Yes. Another thing is that this function should probably return void and always throw if something goes wrong. I doubt that having the server return a 404 is expected enough that just returning 'false' will keep the program executing fine. Ok. 
On Thu, 7 Aug 2008, Jonas Sicking wrote: To add to the above point, while the MessagePort API currently aligns with the proposed Worker API, this seems likely to change in the future, for example to test if a worker is shared between multiple frames. I don't see why we'd ever do this, but I do see other things we might want to do to control a worker, e.g. close it or throttle it. I in general am not a big fan of the MessagePort API, the whole cloning and dying thing is really ugly. I don't think there is much we can do about that, but because of it I think we should only use the API when it's strictly needed, which seems to be only in fairly complex usecases. I don't really understand this concern. Why is it complex? Then again, I have the same reaction to your proposal for a Worker object. :-) Exposing a MessagePort as a permanent property, like the global 'port' property, has the downside that that object can potentially die if the MessagePort is ever passed through postMessage
Re: [whatwg] Workers feedback
One solution I thought about is to have a base interface such as: interface MessagePort { void postMessage(...); attribute EventListener onmessage; ... } Then have interface Worker : MessagePort { bool isShared(); worker specific stuff } interface PipePort : MessagePort { attribute Window ownerWindow; Pipe specific stuff } And then make the APIs that we want to allow passing around pipe-ends take a PipePort object. The result is basically that workers are separate objects from what's returned for (new MessagePipe()).port1, but they share some API. The problem there though is that when you receive a port, you have no idea if it's a port into another frame, or a port into a worker that happened to be created as a pipe, or a port into a worker that happened to be created when the worker was created. I don't see why you would ever need to have that distinction, either. Sorry, I might have been unclear. My suggested inheritance might be better explained as interface CommunicationPort { void postMessage(...) attribute EventListener onmessage; ... } interface Worker : CommuncationPort { attribute EventListener onload; attribute EventListener onerror; } interface MessagePort : CommunicationPort { } I.e. we wouldn't allow a Worker to be passed to postMessage, but the object returned from myPipe.port1 would be allowed. The whole point of ports as an architectural concept is that they provide an opaque interface, and who exactly is on the other side is not something that you should need to have any information about. Why do we need this feature? I.e. why is it useful to have an abstracted MessagePort where you don't know who you are communicating with? The one useful thing that I see that MessagePorts do is that they allow objects that usually can't directly reach each other send messages to each other. I.e. two iframes that live next to each other can't usually get a reference to each other, but using MessagePorts a communcation channel can be negotiated between them. 
Similarly, two sibling workers can't in the current proposal reach each other, but using MessagePorts they can communicate with each other anyway. I in general am not a big fan of the MessagePort API, the whole cloning and dying thing is really ugly. I don't think there is much we can do about that, but because of it I think we should only use the API when it's strictly needed, which seems to be only in fairly complex usecases. I don't really understand this concern. Why is it complex? Then again, I have the same reaction to your proposal for a Worker object. :-) My proposal makes Workers behave the same way as Windows when it comes to sending messages. I think postMessage on Windows can generally be considered a success, I haven't heard a lot of people complaining about it being complex. Exposing a MessagePort as a permanent property, like the global 'port' property, has the downside that that object can potentially die if the MessagePort is ever passed through postMessage somewhere. Do you mean that: var w = createWorker('worker.js'); otherWindow.postMessage('here is the worker you asked for', w.port); w.port.postMessage('oh i wanted to talk to you after all'); ...would fail? (It would return false from the last call.) Yes. Further, the fact that a clone is created on the other end rather than the same object I think can be confusing. I.e. if I set an expando on a port the receiver of the port won't see the expando. This is required since otherwise we'd have synchronous communication between threads, but I think it's confusing to authors. This is why I generally don't like MessagePorts and think that they should be used as little as possible. Also, I would have expected the above to throw an exception. Having it silently fail (which is what'll happen if you don't check the return value) seems likely to cause hidden bugs. I don't think this is a big problem. 
I mean, it's like being worried that references into a window fail to have the right effect after the window is closed or navigated. I think for windows we are usually saved by the fact that generally when a window is navigated, all the code that uses that window is killed. This leaves the user with a permanent property containing a dead useless object. Not exposing it as a permanent property forces things like the onconnect event and returning a MessagePort from createWorker. Do you mean on the Worker (outside) or the WorkerGlobalScope (inside)? Yes The current spec doesn't expose 'port' as a permanent attribute on the WorkerGlobalScope (inside), it's just a property added to the global object, it's not NoDelete or anything. Hmm.. pretty much all other properties that are created by a browser is permanent. So I don't expect that this will change as far as user expected behavior goes. I have yet to actually see any advantages to demanding the use of
Re: [whatwg] Workers feedback
On Fri, 8 Aug 2008, Jonas Sicking wrote: I.e. we wouldn't allow a Worker to be passed to postMessage, but the object returned from myPipe.port1 would be allowed. I strongly disagree with the idea of making communication channels with workers be a second class citizen in terms of being able to send communication channels about. The whole point of ports as an architectural concept is that they provide an opaque interface, and who exactly is on the other side is not something that you should need to have any information about. Why do we need this feature? I.e. why is it useful to have an abstracted MessagePort where you don't know who you are communicating with? It is a critical component of any capabilities granting mechanism. My proposal makes Workers behave the same way as Windows when it comes to sending messages. That's the problem. The Window communication mechanism is a pretty crappy one -- it's a single channel, there's no delegation, if you want to connect two windows who don't know about each other you have to proxy, etc. If it wasn't for the fact that everyone is implementing it, I'd really be pushing for changing to a more capable (and more secure) system, something much more akin to message channels. (I originally came up with postMessage() years ago, I have learnt much in that time about how message passing mechanisms should work.) Exposing a MessagePort as a permanent property, like the global 'port' property, has the downside that that object can potentially die if the MessagePort is ever passed through postMessage somewhere. Do you mean that: var w = createWorker('worker.js'); otherWindow.postMessage('here is the worker you asked for', w.port); w.port.postMessage('oh i wanted to talk to you after all'); ...would fail? (It would return false from the last call.) Yes. Further, the fact that a clone is created on the other end rather than the same object I think can be confusing. I.e. 
if I set an expando on a port the receiver of the port won't see the expando. This is required since otherwise we'd have synchronous communication between threads, but I think it's confusing to authors. This is why I generally don't like MessagePorts and think that they should be used as little as possible. I disagree, but I don't know what I can say to convince you. All I can say is that the original impetus for the message channel mechanism came from authors who wanted a more capable messaging mechanism. Also, I would have expected the above to throw an exception. Having it silently fail (which is what'll happen if you don't check the return value) seems likely to cause hidden bugs. Throwing an exception seems a little drastic, but I could be convinced to change that -- the problem is that there's no way to know if it's going to throw (or return false) before the call. Which is better?: if (!p.postMessage(msg)) { // it went away } ...or: try { p.postMessage(msg); } catch (e) { if (e.code == 20) { // it went away } } ...? Consider also that the postMessage() might not be critical, e.g.: // this code runs when the user asks to save his work for each (var p in registeredNotifiers) { // registeredNotifiers is a list of ports to parts of // the codebase that want to be notified just before // something is saved p.postMessage(msg); } doSave(); If the author doesn't check for the potential exceptions (which at the time of writing he might not be expecting, since he doesn't know if anyone is ever going to be doing something with the ports that would cause an exception to be possible here), then the saving doesn't work. If we just return false, then the error is ignored, which is likely fine here. I don't think this is a big problem. I mean, it's like being worried that references into a window fail to have the right effect after the window is closed or navigated. 
I think for windows we are usually saved by the fact that generally when a window is navigated, all the code that uses that window is killed. Not if it's in another window. I think it's very much the same problem. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
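The trade-off between returning false and throwing can be sketched with a mock port. This is only an illustration of the save-notifier example above, not the spec's MessagePort: `makePort`, `notifyAndSave`, and the `alive` flag are hypothetical names, and the final assignment inside `notifyAndSave` stands in for doSave().

```javascript
// Hypothetical stand-in for a port; `alive` models whether the other side still exists.
function makePort(alive, style) {
  return {
    postMessage(msg) {
      if (alive) return true;
      if (style === 'throw') {
        const err = new Error('port went away');
        err.code = 20; // the code value checked in the try/catch example above
        throw err;
      }
      return false; // boolean style: failure is reported, not raised
    },
  };
}

// Notify every registered port, then "save". Returns true once the save has run.
function notifyAndSave(ports, msg) {
  for (const p of ports) {
    p.postMessage(msg); // return value deliberately ignored, as in the example
  }
  return true; // stands in for doSave()
}

const booleanPorts = [makePort(true, 'boolean'), makePort(false, 'boolean')];
const throwingPorts = [makePort(true, 'throw'), makePort(false, 'throw')];

// With boolean results, the dead port is silently skipped and the save runs.
const savedWithBoolean = notifyAndSave(booleanPorts, 'about to save');

// With exceptions, an author who did not anticipate the throw never reaches the save.
let savedWithThrow = false;
try {
  savedWithThrow = notifyAndSave(throwingPorts, 'about to save');
} catch (e) {
  // the unchecked exception aborted the loop before the save
}
```

Under these assumptions the boolean variant still saves while the throwing variant does not, which is the case for returning false when the notification is not critical.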
Re: [whatwg] Workers feedback
So the API I'm proposing is the following: [NoInterfaceObject] interface WorkerFactory { Worker createWorker(in DOMString scriptURL); Worker createSharedWorker(in DOMString name, in DOMString scriptURL); }; interface Worker { boolean postMessage(in DOMString message); boolean postMessage(in DOMString message, in MessagePort messagePort); // event handler attributes attribute EventListener onmessage; attribute EventListener onload; attribute EventListener onerror; attribute EventListener onunload; }; interface WorkerParent { boolean postMessage(in DOMString message); boolean postMessage(in DOMString message, in MessagePort messagePort); }; [NoInterfaceObject] interface WorkerGlobalScope { // core worker features readonly attribute WorkerGlobalScope self; readonly attribute WorkerLocation location; readonly attribute DOMString name; readonly attribute boolean closing; readonly attribute WorkerParent parent; void close(); // event handler attributes attribute EventListener onunload; }; (We might want to add an onconnect property to WorkerGlobalScope, but it doesn't seem strictly needed) I think that it has the following advantages over the current draft spec: * We don't expose users to MessagePort objects in the majority of scenarios. * There is no permanent .port properties anywhere that would go dead if the port is passed somewhere. * There is no need for pseudo properties anywhere (the port variable inside the WorkerGlobalScript) * The current draft duplicates the onunload property on both the worker and its port. Not sure if this is needed or just a bug. * All onX objects live on the same object rather than some living on the worker, some living on worker.port. I'd be interested to hear what others think of this proposal. / Jonas
Re: [whatwg] Workers feedback
On Fri, 8 Aug 2008, Jonas Sicking wrote: So the API I'm proposing is the following: This seems to be a strict subset of what the spec has now; the only difference being that there's no easy way to create a worker and then pass it to someone else to take care of, and there seems to be no easy way for a worker to hear about a new parent. (We might want to add an onconnect property to WorkerGlobalScope, but it doesn't seem strictly needed) How else would you connect to a shared worker? I think that it has the following advantages over the current draft spec: * We don't expose users to MessagePort objects in the majority of scenarios. I do not consider this an advantage. All we're doing is moving the complexity to a later point -- instead of learning part of the API and then more of the API if they want to, authors have to learn two APIs, one of which is just as complicated as today's, and another that is not quite the same as either Window.postMessage() or port.postMessage() but is similar enough to get them confused. * There is no permanent .port properties anywhere that would go dead if the port is passed somewhere. The .parent property is as likely to go dead, e.g. if the parent goes away after the worker has been provided a port to another window. * There is no need for pseudo properties anywhere (the port variable inside the WorkerGlobalScript) Replacing 'port' with 'parent' doesn't really change much. :-) * The current draft duplicates the onunload property on both the worker and its port. Not sure if this is needed or just a bug. It's not strictly needed but it's useful to distinguish the death of the port from the death of the worker. * All onX objects live on the same object rather than some living on the worker, some living on worker.port. They still also live on .port, it's just that you're not exposing this explicitly now. This is a false simplification IMHO. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. 
Re: [whatwg] Workers feedback
Ian Hickson wrote: On Fri, 8 Aug 2008, Jonas Sicking wrote: So the API I'm proposing is the following: This seems to be a strict subset of what the spec has now; the only difference being that there's no easy way to create a worker and then pass it to someone else to take care of, and there seems to be no easy way for a worker to hear about a new parent. Don't really think it's a subset or a superset. The same feature set exists in both proposals, the syntax is just different. The idea is that it's a simpler syntax for the common cases. However I think we'll have to agree to disagree what is simple at this point. (We might want to add an onconnect property to WorkerGlobalScope, but it doesn't seem strictly needed) How else would you connect to a shared worker? That is done at an application level. For example: worker = createSharedWorker(foo, bar.js); worker.addEventListener(message, handler, false); worker.postMessage(wassup dude, i just connected); Actually, it seems like onconnect as defined in the current spec has a race condition. The shared worker example does the following: var worker = createSharedWorker('worker.js', 'core'); function configure(event) { if (event.message.substr(0, 4) != 'cfg ') return; var name = event.message.substr(4).split(' ', 1); // update display to mention our name is name document.getElementsByTagName('h1')[0].textContent += ' ' + name; // no longer need this listener worker.port.removeEventListener('message', configure, false); } worker.port.addEventListener('message', configure, false); However what's to say that the 'connect' event hasn't fired inside the worker before the 'worker.port.addEventListener' line executes? Note that there can already be other listeners to the port, so the port has been activated. Also, what MessagePort object is handed to the connect event if the inner or outer port has been handed through postMessage somewhere? I.e. 
if someone does: var worker = createSharedWorker('worker.js', 'core'); someIframe.postMessage("here's your worker", worker.port); Does that mean that no one can ever share that worker again? And that anyone else currently sharing that worker is going to break? I would have expected sharing workers would always set up new message pipes. So here's my revised proposal: [NoInterfaceObject] interface WorkerFactory { Worker createWorker(in DOMString scriptURL); Worker createSharedWorker(in DOMString name, in DOMString scriptURL); }; interface Worker { boolean postMessage(in DOMString message); boolean postMessage(in DOMString message, in MessagePort messagePort); MessagePort connectNewPipe(); // event handler attributes attribute EventListener onmessage; attribute EventListener onload; attribute EventListener onerror; attribute EventListener onunload; }; interface WorkerParent { boolean postMessage(in DOMString message); boolean postMessage(in DOMString message, in MessagePort messagePort); }; [NoInterfaceObject] interface WorkerGlobalScope { // core worker features readonly attribute WorkerGlobalScope self; readonly attribute WorkerLocation location; readonly attribute DOMString name; readonly attribute boolean closing; readonly attribute WorkerParent parent; void close(); // event handler attributes attribute EventListener onunload; attribute EventListener onconnect; }; The change from the previous version is the Worker.connectNewPipe function. When that function is called, two entangled MessagePorts are created. One is returned from the function, and one is provided to the code inside the worker by firing a 'connect' event which contains the port. Note that calling createSharedWorker does not fire a 'connect' event. / Jonas
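The connectNewPipe behaviour described above might look roughly like the following mock. It is a sketch under stated assumptions, not an implementation of the proposal: `createEntangledPorts` and `MockWorker` are hypothetical names, the worker's global scope is a plain object, and delivery is synchronous here whereas real ports deliver asynchronously.

```javascript
// Create two entangled mock ports: posting on one calls the other's onmessage.
function createEntangledPorts() {
  const a = { onmessage: null, peer: null };
  const b = { onmessage: null, peer: null };
  a.peer = b;
  b.peer = a;
  for (const port of [a, b]) {
    port.postMessage = function (message) {
      const target = port.peer;
      if (!target.onmessage) return false; // nobody listening on the other end
      target.onmessage({ message });
      return true;
    };
  }
  return [a, b];
}

class MockWorker {
  constructor(scope) {
    this.scope = scope; // stand-in for the WorkerGlobalScope
  }
  // One port is returned to the caller, the other is handed to the
  // worker by firing a 'connect' event that carries it.
  connectNewPipe() {
    const [outer, inner] = createEntangledPorts();
    if (this.scope.onconnect) this.scope.onconnect({ port: inner });
    return outer;
  }
}

const received = [];
const scope = {
  onconnect(event) {
    event.port.onmessage = (e) => received.push(e.message);
  },
};

const worker = new MockWorker(scope);
const pipeA = worker.connectNewPipe();
const pipeB = worker.connectNewPipe();
pipeA.postMessage('hello from A');
pipeB.postMessage('hello from B');
```

Each call produces an independent pipe, so handing `pipeA` to another party leaves `pipeB` usable.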
Re: [whatwg] Workers feedback
On Fri, 8 Aug 2008, Jonas Sicking wrote: (We might want to add an onconnect property to WorkerGlobalScope, but it doesn't seem strictly needed) How else would you connect to a shared worker? That is done at an application level. For example: worker = createSharedWorker("foo", "bar.js"); worker.addEventListener("message", handler, false); worker.postMessage("wassup dude, i just connected"); How would the worker distinguish that from the original parent sending the same message? Actually, it seems like onconnect as defined in the current spec has a race condition. The shared worker example does the following: var worker = createSharedWorker('worker.js', 'core'); function configure(event) { if (event.message.substr(0, 4) != 'cfg ') return; var name = event.message.substr(4).split(' ', 1); // update display to mention our name is name document.getElementsByTagName('h1')[0].textContent += ' ' + name; // no longer need this listener worker.port.removeEventListener('message', configure, false); } worker.port.addEventListener('message', configure, false); However what's to say that the 'connect' event hasn't fired inside the worker before the 'worker.port.addEventListener' line executes? Doesn't matter. MessagePorts queue up messages until the receiver either sets onmessage or calls start(). (This is explained just below the example.) Note that there can already be other listeners to the port, so the port has been activated. The port only activates if you set onmessage or call start(). Calling addEventListener() doesn't activate it. Also, what MessagePort object is handed to the connect event if the inner or outer port has been handed through postMessage somewhere? I.e. if someone does: var worker = createSharedWorker('worker.js', 'core'); someIframe.postMessage("here's your worker", worker.port); Does that mean that no one can ever share that worker again? The createSharedWorker() call always creates a new pipe to hand to the 'connect' event.
And that anyone else currently sharing that worker is going to break? Why would it break anything? I'm confused. I would have expected sharing workers would always set up new message pipes. It does. So here's my revised proposal: Now it's even more complicated, while still not doing everything that the current proposal does. I'm not at all convinced this is better. Is the only problem you have with the current design that it is too complicated? -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
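The queueing rule that answers the race-condition question can be sketched as a small mock. This is an illustration of the behaviour as described in this message, not the spec's MessagePort: `QueuedPort` and its `deliver` method (called by an imaginary other side) are hypothetical names.

```javascript
class QueuedPort {
  constructor() {
    this.queue = [];
    this.started = false;
    this.listeners = [];
    this._onmessage = null;
  }
  get onmessage() {
    return this._onmessage;
  }
  set onmessage(handler) {
    this._onmessage = handler;
    this.start(); // setting onmessage activates the port
  }
  addEventListener(type, listener) {
    // Registering a listener does NOT activate the port.
    if (type === 'message') this.listeners.push(listener);
  }
  start() {
    this.started = true;
    while (this.queue.length > 0) this.dispatch(this.queue.shift());
  }
  // Called by the (imaginary) other side when a message arrives.
  deliver(message) {
    if (this.started) this.dispatch(message);
    else this.queue.push(message); // held until the port is activated
  }
  dispatch(message) {
    const event = { message };
    if (this._onmessage) this._onmessage(event);
    for (const listener of this.listeners) listener(event);
  }
}

const port = new QueuedPort();
const seen = [];

port.deliver('cfg alice'); // arrives before any listener exists
port.addEventListener('message', (e) => seen.push(e.message));
const seenBeforeStart = seen.length; // nothing delivered yet

port.start(); // queued message is flushed to the listener now
const seenAfterStart = seen.slice();
```

Because the early message waits in the queue, a listener added after the 'connect' event has fired inside the worker still receives it once start() is called.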
Re: [whatwg] Workers feedback
Ian Hickson wrote: On Fri, 8 Aug 2008, Jonas Sicking wrote: (We might want to add an onconnect property to WorkerGlobalScope, but it doesn't seem strictly needed) How else would you connect to a shared worker? That is done at an application level. For example: worker = createSharedWorker("foo", "bar.js"); worker.addEventListener("message", handler, false); worker.postMessage("wassup dude, i just connected"); How would the worker distinguish that from the original parent sending the same message? Why would the original parent send the message twice? Of course applications following their own application level protocols is going to break themselves. Actually, it seems like onconnect as defined in the current spec has a race condition. The shared worker example does the following: var worker = createSharedWorker('worker.js', 'core'); function configure(event) { if (event.message.substr(0, 4) != 'cfg ') return; var name = event.message.substr(4).split(' ', 1); // update display to mention our name is name document.getElementsByTagName('h1')[0].textContent += ' ' + name; // no longer need this listener worker.port.removeEventListener('message', configure, false); } worker.port.addEventListener('message', configure, false); However what's to say that the 'connect' event hasn't fired inside the worker before the 'worker.port.addEventListener' line executes? Doesn't matter. MessagePorts queue up messages until the receiver either sets onmessage or calls start(). (This is explained just below the example.) Note that there can already be other listeners to the port, so the port has been activated. The port only activates if you set onmessage or call start(). Calling addEventListener() doesn't activate it. Ah, I missed the fact that calling createSharedWorker always created a new Worker object, even if one already existed. / Jonas
Re: [whatwg] Workers feedback
Jonas Sicking wrote: Ian Hickson wrote: On Fri, 8 Aug 2008, Jonas Sicking wrote: (We might want to add an onconnect property to WorkerGlobalScope, but it doesn't seem strictly needed) How else would you connect to a shared worker? That is done at an application level. For example: worker = createSharedWorker("foo", "bar.js"); worker.addEventListener("message", handler, false); worker.postMessage("wassup dude, i just connected"); How would the worker distinguish that from the original parent sending the same message? Why would the original parent send the message twice? Of course applications following their own application level protocols is going to break themselves. Sorry, that should say: Of course applications *not* following their own application level protocols are going to break themselves. / Jonas
Re: [whatwg] Workers feedback
On Fri, 8 Aug 2008, Jonas Sicking wrote: worker = createSharedWorker("foo", "bar.js"); worker.addEventListener("message", handler, false); worker.postMessage("wassup dude, i just connected"); How would the worker distinguish that from the original parent sending the same message? Why would the original parent send the message twice? Of course applications following their own application level protocols is going to break themselves. The whole point of capabilities-based systems is that you can pass these communication ports over to unknown entities and don't have to trust that they won't screw up your protocol. For example, you could create a shared worker to handle all the requests from all the gadgets hosted on a user's home page and just pass the worker off to them each time: // a new gadget has been created var worker = createSharedWorker('gadget-api.js', 'gadgets'); gadget.postMessage('here is the gadget API', worker.port); What you're proposing would be way more complex -- now you'd have to create the pipe separately, you'd have to have the worker know how to handle both a new connection from its parent as well as its parent saying it wants a new pipe for a gadget, you'd have lifetime issues as you now have extra communication mechanisms, etc. (This brings up another point, which is that by having Worker objects also be communication end points, we double the complexity of the definitions for worker lifetime, since now they have to deal with both types of channels, not just the one generic type.) -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Workers feedback
On Wed, 6 Aug 2008, Aaron Boodman wrote: I am opposed to the utils object. I don't see any precedent for this anywhere, and it just feels ugly to me. I liked it the way you had it before, with these APIs in a shared base interface. Ok. I don't have an opinion on this. Jonas? In the absence of any arguments either way, my default would be to put it all on the global object; clashes are manageable, the Window object does it that way, and there are enough things that we kinda want to put on the global scope anyway (the core worker stuff) that it's not clear that the gain is huge. I've tried to simplify the MessagePort interface as follows: * messages are now queued, and won't be delivered until either the 'start()' method on the port is called, or the 'onmessage' attribute is set to some value. * messages are now queued, instead of a port becoming inactive when its other side is suspended. Can you explain the rationale for these two changes? It makes it a lot simpler to use. Before, you ran the risk of the port becoming unavailable as you posted to it, now it'll just get sent later. Before, you had to wait for the onload on the port before sending messages, now it'll just work. * I've made the worker receive its first port as a property of the global object (port) instead of having to listen to the 'connect' event (though the connect event still fires, so you can do shared workers). I liked it the way you had it, before. I'd rather the first connection to a worker wasn't a special case, either for the worker or for the worker's creator. If it's not a special case, in the simple case you have to listen to an event just to get the port, which seems excessive. For example this in the old model: var port; onconnect = function (event) { port = event.port; } ...becomes this in the current model: // don't have to do anything! (It also means that workers by default have a lifetime of the creator even if they don't do anything.
I guess we don't need this anymore given the new lifetime stuff, though.) That's also one reason why I like having a separate Worker object and having the two-step process of creating the worker, then sending it a message. It means that creating a new channel to a worker is always the same. It seems that designing the API to add extra steps is a bad thing generally speaking. :-) I think that 'load', 'error', and 'unload' could go on the worker. As far as I can tell, the only thing 'load' and 'error' are used for is telling the creator of a worker that the worker loaded or failed to load. In that case, it seems wrong to throw them on MessagePort, since MessagePorts are also used for many other things. I also still think that Workers could have their own sendMessage. The messages sent to this would be delivered to the worker as 'message' events targeted at WorkerGlobalObject (eliminating the need for onconnect?). This would make Workers and postMessage very similar to Window and postMessage, which seems nice to me. How's this for a compromise: We make the createWorker() methods return a Worker object, but the Worker object _inherits_ from MessagePort. So effectively it is a port, but we can move the onload and onerror stuff to the Worker object instead of having them on all MessagePorts. Would that work for you? The main reason I used 'unload' and 'close' is consistency with how the rest of the platform works. (With a Window, you call window.close() to invoke window.onunload.) I can change that if people want, though I do think consistency is worth keeping here. I think the concept of a port becoming inactive is interesting in all the cases MessagePorts are used, so this should stay. In fact, should it be called 'oninactive'? Well at this point a port only becomes inactive (and fires unload) when the close() method is called or when the other side is owned by a document that gets discarded. 
Incidentally, I noticed that unload isn't called when it stops being an active needed worker (i.e. when its users are all discarded). Should we fire unload in that case too? I'm thinking the Closing orphan workers step should fire unload at the worker. On Mon, 4 Aug 2008, Aaron Boodman wrote: So for example, I would be for moving over a subset of the navigator and location objects as-is (these seem to work well), but against moving over the document.cookie interface (it works poorly). I agree with porting some subset of 'navigator' over, though since the relevant parts of 'navigator' aren't defined even for HTML5 yet, I haven't yet done this. There's an issue marker in the spec about this. What bits would you like defined? The ones that are most often used for browser detection are most important, so: - appName - appCodeName - appVersion - platform - userAgent I know the whole business of browser detection
Re: [whatwg] Workers feedback
On Wed, Aug 6, 2008 at 11:53 AM, Chris Prince [EMAIL PROTECTED] wrote: My current thinking is that the best API design for createWorker() is: MessagePort createWorker(worker_body, [WorkerOptions]) The reason: workers are a powerful concept, and it's very likely we'll want to extend them over time. The 'name' option is just one such case. Here are a few others: - 'language' for non-JS workers (e.g. 'text/python' or 'application/llvm') - 'isContent' to pass a string or Blob instead of a url - 'lifetime' for running beyond the lifetime of a page - etc. I'd say other options are likely to be just as 'important' as name, so I wouldn't special-case that parameter. A 'WorkerOptions' parameter supports naming, but future expansion as well. FWIW, Chris's suggestion is also fine with me. In general, I like these options objects since they are easily extensible. The language should be done via an HTTP Content-Type. isContent can be done using data: URLs instead, if this is really needed. If we want to do something with a different lifetime I think we want to do it with a much clearer API entry point than an option buried in an argument. So I'm not really convinced about this. I would be interested in other viewpoints, though. I think you are missing the point. I'll try to rephrase it: It is short-sighted to expect you can fully spec something as large as workers. This is a significant new concept, and we are only scratching the surface. So why back ourselves into a corner? Let's be smart about the API design, and allow for future flexibility. I don't see any downsides to the approach outlined above. If you have something specific in mind, please let us know.
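The extensibility argument for an options object can be sketched as follows. The option names ('name', 'language', 'isContent', 'lifetime') come from the message above, but the default values and the `resolveWorkerOptions` helper are assumptions made up for this illustration; no such defaults were ever proposed in the thread.

```javascript
// Hypothetical defaults; only the option names come from the thread.
const DEFAULT_WORKER_OPTIONS = {
  name: null,                  // unnamed worker unless the caller says otherwise
  language: 'text/javascript', // assumed default
  isContent: false,            // worker_body is treated as a URL by default
  lifetime: 'page',            // assumed default: worker lives as long as its page
};

// Merge caller-supplied options over the defaults. Options added later get a
// default here without changing the signature or any existing call site.
function resolveWorkerOptions(options = {}) {
  return { ...DEFAULT_WORKER_OPTIONS, ...options };
}

const plain = resolveWorkerOptions();
const named = resolveWorkerOptions({ name: 'core' });
const inline = resolveWorkerOptions({ isContent: true, language: 'text/python' });
```

Callers that pass nothing, or only 'name', keep working unchanged when new options appear, which is the flexibility being argued for; the counter-argument in the thread is that behaviour as significant as a different lifetime deserves its own entry point rather than a field in this object.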
Re: [whatwg] Workers feedback
Aaron Boodman wrote: That's also one reason why I like having a separate Worker object and having the two-step process of creating the worker, then sending it a message. It means that creating a new channel to a worker is always the same. Hixie asked me on IRC why I didn't like the MessagePort solution. So here is a list of a few reasons: I prefer that the createWorker function returns an actual Worker object. I think that is what you would expect from an API with such a name. Otherwise we should call it something like createAWorkerAndReturnAMessagePort. We shouldn't trick people into thinking that they have a worker when they really have a MessagePort, even if the two APIs happen to mostly align. One way to fix this and still keep MessagePorts would be to return a Worker object that has a .port property, but that has other problems, see below. To add to the above point, while the MessagePort API currently aligns with the proposed Worker API, this seems likely to change in the future, for example to test if a worker is shared between multiple frames. I in general am not a big fan of the MessagePort API, the whole cloning and dying thing is really ugly. I don't think there is much we can do about that, but because of it I think we should only use the API when it's strictly needed, which seems to be only in fairly complex usecases. I am aware that returning a MessagePort basically means that you write your code the same way in the trivial cases, but I dislike designing a complex API and telling the users don't pay attention to the full API of the object you are using, just think of it as something else and it'll work fine. Exposing a MessagePort as a permanent property, like the global 'port' property, has the downside that that object can potentially die if the MessagePort is ever passed through postMessage somewhere. This leaves the user with a permanent property containing a dead useless object. 
Not exposing it as a permanent property forces things like the onconnect event and returning a MessagePort from createWorker. On Wed, Aug 6, 2008 at 11:53 AM, Chris Prince [EMAIL PROTECTED] wrote: My current thinking is that the best API design for createWorker() is: MessagePort createWorker(worker_body, [WorkerOptions]) The reason: workers are a powerful concept, and it's very likely we'll want to extend them over time. The 'name' option is just one such case. Here are a few others: - 'language' for non-JS workers (e.g. 'text/python' or 'application/llvm') - 'isContent' to pass a string or Blob instead of a url - 'lifetime' for running beyond the lifetime of a page - etc. I'd say other options are likely to be just as 'important' as name, so I wouldn't special-case that parameter. A 'WorkerOptions' parameter supports naming, but future expansion as well. FWIW, Chris's suggestion is also fine with me. In general, I like these options objects since they are easily extensible. I do sort of prefer the idea of keeping the give me a worker that is potentially shared with other windows API separate. In fact I think we should call it createSharedWorker or some such. But allowing optional arguments at the end seems like a good idea. Not sure if that requires specific action right now or not though. / Jonas
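The "cloning and dying" concern can be illustrated with a mock. This is a sketch of the behaviour as described in this thread, not of any real MessagePort implementation: `MockPort` and its `transfer` method are hypothetical names, with `transfer` standing in for passing the port through someone else's postMessage.

```javascript
class MockPort {
  constructor() {
    this.dead = false;
    this.sent = [];
  }
  postMessage(message) {
    if (this.dead) return false; // fails silently unless the caller checks
    this.sent.push(message);
    return true;
  }
  // Models handing the port to another party: the receiver gets a fresh
  // clone and the original object is left dead.
  transfer() {
    const clone = new MockPort();
    this.dead = true;
    return clone;
  }
}

// A worker-like object exposing its port as a permanent property.
const w = { port: new MockPort() };
w.port.expando = 'set by the sender';

// Equivalent in spirit to: otherWindow.postMessage('here is the worker', w.port)
const receivedPort = w.port.transfer();

// The permanent property still exists, but the object behind it is dead.
const result = w.port.postMessage('oh i wanted to talk to you after all');

// The receiver holds a clone, so the sender's expando is not visible to it.
const expandoOnClone = receivedPort.expando;
```

In this mock the last postMessage call returns false and the clone has no expando, which are the two behaviours objected to above.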
Re: [whatwg] Workers feedback
Ian Hickson wrote: On Wed, 6 Aug 2008, Aaron Boodman wrote: I am opposed to the utils object. I don't see any precedent for this anywhere, and it just feels ugly to me. I liked it the way you had it before, with these APIs in a shared base interface. Ok. I don't have an opinion on this. Jonas? In the absence of any arguments either way, my default would be put it all on the global object; clashes are manageable, the Window object does it that way, and there are enough things that we kinda want to put on the global scope anyway (the core worker stuff) that it's not a clear that the gain is huge. I don't feel very strongly about this right now. It's something we started debating at mozilla and I think we'll debate it a bit more before coming to a conclusion. I'm fine with putting it in the global scope for now. Sorry, i didn't mean to ask for an immediate action on this yet. That's also one reason why I like having a separate Worker object and having the two-step process of creating the worker, then sending it a message. It means that creating a new channel to a worker is always the same. It seems that designing the API to add extra steps is a bad thing generally speaking. :-) Though in the vast majority of cases only the first step is needed. A second step is only needed in the very complex use cases of: * Having sibling workers talk directly to each other. * Having a worker talk directly to a frame from a different origin. * Having a worker shared across different instances of the app. I think that 'load', 'error', and 'unload' could go on the worker. As far as I can tell, the only thing 'load' and 'error' are used for is telling the creator of a worker that the worker loaded or failed to load. In that case, it seems wrong to throw them on MessagePort, since MessagePorts are also used for many other things. I also still think that Workers could have their own sendMessage. 
The messages sent to this would be delivered to the worker as 'message' events targeted at WorkerGlobalObject (eliminating the need for onconnect?). This would make Workers and postMessage very similar to Window and postMessage, which seems nice to me. How's this for a compromise: We make the createWorker() methods return a Worker object, but the Worker object _inherits_ from MessagePort. So effectively it is a port, but we can move the onload and onerror stuff to the Worker object instead of having them on all MessagePorts. Would that work for you? I thought about that, but what happens if you pass such an object to postMessage? Throws an exception? Only the part of the API that is a MessagePort dies? One solution I thought about is to have a base interface such as: interface MessagePort { void postMessage(...); attribute EventListener onmessage; ... } Then have interface Worker : MessagePort { bool isShared(); worker specific stuff } interface PipePort : MessagePort { attribute Window ownerWindow; Pipe specific stuff } And then make the APIs that we want to allow passing around pipe-ends take a PipePort object. The result is basically that workers are separate objects from what's returned for (new MessagePipe()).port1, but they share some API. - Should import() accept an array of URLs, so that the UA can fetch them in parallel if it has the ability to do that? We could do that if you like. Is it needed? With the connection limits being upped in all the browsers, I think this would be a good thing to have from the beginning. Fair enough. Should they be run in whatever order they load in or should they be run in the order given in the arguments? Yes. Another thing is that this function should probably return void and always throw if something goes wrong. I doubt that having the server return a 404 is expected enough that just returning 'false' will keep the program executing fine. / Jonas
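One reading of the import() suggestion is: fetch in any order, run in argument order, and throw on any failure. The sketch below assumes that reading, since the thread leaves the ordering question open. `importScriptsInOrder` is a hypothetical name, the `fetched` map stands in for the results of parallel network loads, and the choice to run nothing at all when any load failed is a further assumption.

```javascript
// `fetched` maps URL -> source text; a missing entry models a failed load (e.g. a 404).
// `run` is whatever executes a script's source.
function importScriptsInOrder(urls, fetched, run) {
  // Resolve every URL first, so a failure throws before anything has run.
  const sources = urls.map((url) => {
    const source = fetched.get(url);
    if (source == null) throw new Error('Failed to load ' + url);
    return source;
  });
  // Execute in the order the URLs were given, not the order they arrived.
  for (const source of sources) run(source);
}

// Loads completed out of order: b.js arrived before a.js.
const fetched = new Map([
  ['b.js', 'B'],
  ['a.js', 'A'],
]);

const executed = [];
importScriptsInOrder(['a.js', 'b.js'], fetched, (src) => executed.push(src));

// A missing script raises instead of returning false.
let failure = null;
try {
  importScriptsInOrder(['a.js', 'missing.js'], fetched, (src) => executed.push(src));
} catch (e) {
  failure = e.message;
}
```

Throwing means a caller that forgets to check cannot continue with half of its dependencies missing, which is the motivation given above for not returning 'false'.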
Re: [whatwg] Workers feedback
On Thu, 7 Aug 2008, Jonas Sicking wrote:

We make the createWorker() methods return a Worker object, but the Worker object _inherits_ from MessagePort. So effectively it is a port, but we can move the onload and onerror stuff to the Worker object instead of having them on all MessagePorts.

I thought about that, but what happens if you pass such an object to postMessage?

It would just get cloned as usual. The only things that a Worker object would have that a regular MessagePort wouldn't are the onload and onerror things, and they're only relevant until the point where you have a connection, so losing them when you clone the port is fine.

One solution I thought about is to have a base interface such as:

  interface MessagePort { ... }

Then have

  interface Worker : MessagePort {
    bool isShared();
    worker specific stuff
  }

  interface PipePort : MessagePort {
    attribute Window ownerWindow;
    Pipe specific stuff
  }

ownerWindow is gone. There's no pipe-specific stuff that wouldn't also apply to a worker. There's no worker-specific stuff once the channel has been established. What's the use case for isShared()? What does it do?

I really don't like this idea of making Workers less generic. There's no need for it as far as I can tell, and all it does is make things less powerful while actually increasing implementation complexity.

Would it be better if instead of createWorker() we called the method connectToWorker(), and it creates it as well if the worker is unnamed or doesn't yet exist? That would resolve the issue of what looks like a constructor not returning an object representing what it constructs?

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
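To make the "it would just get cloned as usual" answer concrete, here is a minimal model of the compromise: a worker object that inherits from a port, where passing it along yields a plain port without the load/error hooks. SketchPort, SketchWorker and clone() are hypothetical stand-ins (named to avoid clashing with real globals), and clone() only imitates what the message above describes; it is not how any implementation transfers ports.

```javascript
// Hypothetical stand-in for MessagePort.
class SketchPort {
  constructor() {
    this.onmessage = null;
  }

  // Imitates passing the port through postMessage(): the receiver gets a
  // plain port, whatever subclass the original object was.
  clone() {
    return new SketchPort();
  }
}

// Hypothetical stand-in for a Worker object that inherits from the port.
class SketchWorker extends SketchPort {
  constructor() {
    super();
    this.onload = null;   // only meaningful until the connection exists
    this.onerror = null;
  }
}

const worker = new SketchWorker();
worker.onload = () => {};

const sent = worker.clone();
console.log(sent instanceof SketchPort);   // true
console.log(sent instanceof SketchWorker); // false
console.log("onload" in sent);             // false
```

Under this model nothing is lost that the receiver could use, which is the argument made above; the counter-argument in the thread is about what happens once worker-specific members are added later.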
Re: [whatwg] Workers feedback
Would it be better if instead of createWorker() we called the method connectToWorker(), and it creates it as well if the worker is unnamed or doesn't yet exist? That sounds pretty good to me.
Re: [whatwg] Workers feedback
Ian Hickson wrote:

On Thu, 7 Aug 2008, Jonas Sicking wrote:

We make the createWorker() methods return a Worker object, but the Worker object _inherits_ from MessagePort. So effectively it is a port, but we can move the onload and onerror stuff to the Worker object instead of having them on all MessagePorts.

I thought about that, but what happens if you pass such an object to postMessage?

It would just get cloned as usual.

This seems to be exactly the same thing as what the spec says now, just with a different name.

The only things that a Worker object would have that a regular MessagePort wouldn't are the onload and onerror things, and they're only relevant until the point where you have a connection, so losing them when you clone the port is fine.

One solution I thought about is to have a base interface such as:

  interface MessagePort { ... }

Then have

  interface Worker : MessagePort {
    bool isShared();
    worker specific stuff
  }

  interface PipePort : MessagePort {
    attribute Window ownerWindow;
    Pipe specific stuff
  }

ownerWindow is gone. There's no pipe-specific stuff that wouldn't also apply to a worker. There's no worker-specific stuff once the channel has been established.

I think it's overly optimistic to think that we will never want to add port specific stuff to the port object, or worker specific stuff to the worker object. I just don't have that much confidence that we can design these interfaces perfectly in the first version, nor fathom the features that people will want in future versions.

I really don't like this idea of making Workers less generic. There's no need for it as far as I can tell, and all it does is make things less powerful while actually increasing implementation complexity.

Why is it making them less generic? One need listed is the ability to add, in the future, properties that make sense on one interface but not the other. You are already listing onerror and onload as not really making sense on a generic MessagePort.
Why does it make things less powerful? So far it seems like implementors are commenting on wanting this change, so implementation complexity doesn't seem like a real concern. Would it be better if instead of createWorker() we called the method connectToWorker(), and it creates it as well if the worker is unnamed or doesn't yet exist? That would resolve the issue of what looks like a constructor not returning an object representing what it constructs? That only addresses one of the comments listed in my last mail. / Jonas
[whatwg] Workers feedback
Summary:

* I've written an intro section which shows how the API is expected to be used. I've tried to illustrate each use case that people raised. I will add more tomorrow.
* I've completely decoupled workers and Window objects.
* I've moved APIs to a utils object, so that we rarely, if ever, have to add members to the global scope (reduces chances of future collisions).
* I've simplified the way message channels and ports work.
* I've replaced the URL string with a Location object.

On Mon, 4 Aug 2008, Jonas Sicking wrote:

So the first comment is the 'window' and 'self' properties. I don't see a reason for these.

We need some self-reference so that people can check for the presence of members on the global scope. 'window' was there to allow library re-use. I've now removed it, leaving only 'self'. I have also simplified the spec to remove the Window concepts from the workers. I also moved all the APIs to a utils object, leaving the global scope with only:

  self - self reference for checking presence of APIs
  location - the address of the script
  name - the name of the worker, if it is shared
  closing - whether the worker is shutting down
  close() - to shut the worker down
  utils - all the APIs
  onconnect - to receive new connections
  onunload - to run any shut down code
  port - the first connection

I can move more of this to 'utils' if people want. Opinions?

The fact that the only way to communicate between workers and the main browser context is through MessagePorts seems unnecessarily complex as well as differing from how windows communicate using postMessage.

We can't use the Window object postMessage() communication method, because it relies on the objects being able to have references to each other. I've tried to simplify the MessagePort interface as follows:

* messages are now queued, and won't be delivered until either the 'start()' method on the port is called, or the 'onmessage' attribute is set to some value.
* messages are now queued, instead of a port becoming inactive when its other side is suspended.
* I've made the worker receive its first port as a property of the global object (port) instead of having to listen to the 'connect' event (though the connect event still fires, so you can do shared workers).

On Mon, 4 Aug 2008, Aaron Boodman wrote:

So for example, I would be for moving over a subset of the navigator and location objects as-is (these seem to work well), but against moving over the document.cookie interface (it works poorly).

I agree with porting some subset of 'navigator' over, though since the relevant parts of 'navigator' aren't defined even for HTML5 yet, I haven't yet done this. There's an issue marker in the spec about this. What bits would you like defined?

On Tue, 5 Aug 2008, Aaron Boodman wrote:

The protocol, host, hostname, port, pathname, and search properties are all very useful. An application might want to compare the origin of a message it receives with its own host and port, for example.

Ok, I've provided a castrated Location interface.

On Tue, 5 Aug 2008, Aaron Boodman wrote:

- It seems like we might want an object that represents workers. This would allow us to put the 'onload' and 'onerror' events from MessagePort there, instead of on MessagePort, which makes more sense to me (I don't know what it means for a MessagePort to 'load' or 'error' outside of the context of a worker).

The main reason for not having a separate Worker object is that I couldn't find anything that would go on it other than the port. You'd still want the unload messages going to whoever owns the port, not whoever created the worker, if you passed the port around. Basically, adding a Worker object just seemed like it would double the number of objects, and potentially the complexity if we also allow Worker objects to be sent along channels, without really providing any new features.

MessagePort.onunload could then change to 'onclose' to go with the close() method.
The main reason I used 'unload' and 'close' is consistency with how the rest of the platform works. (With a Window, you call window.close() to invoke window.onunload.) I can change that if people want, though I do think consistency is worth keeping here. - It's odd to me that the way to establish a channel to a worker depends on whether you are the creator of the worker or not. The creator gets a MessagePort to a new channel back from createWorker(), but any other function must pass a new MessagePort over the original one, and the worker must know to use that secondary port to talk back. In the old mechanism, from the worker's point of view there was only one way to get a new connection, onconnect. The changes to simplify the mechanism actually introduced a new mechanism, so it is true that we now have two mechanisms, one for the initial creation and one for others. Is that
Re: [whatwg] Workers feedback
- Would it be too weird to have createWorker overloaded to take an optional name parameter? This would make the behavior similar to window.open(), which either opens a new window or reuses an existing window with the same name.

People seem to dislike overloading in general, but I don't mind. Anyone against this?

First, let me apologize for not jumping into this thread earlier. As I was the one who designed workers, and have been iterating on the design for nearly two years now (!), I really need to start sharing my thoughts on this list, instead of just the Gears lists. I'll try to write up some more detailed comments later, but for the question above about a 'name' parameter and overloading: My current thinking is that the best API design for createWorker() is:

  MessagePort createWorker(worker_body, [WorkerOptions])

The reason: workers are a powerful concept, and it's very likely we'll want to extend them over time. The 'name' option is just one such case. Here are a few others:

- 'language' for non-JS workers (e.g. 'text/python' or 'application/llvm')
- 'isContent' to pass a string or Blob instead of a url
- 'lifetime' for running beyond the lifetime of a page
- etc.

I'd say other options are likely to be just as 'important' as name, so I wouldn't special-case that parameter. A 'WorkerOptions' parameter supports naming, but future expansion as well.
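A small sketch of why an options object extends more gracefully than an overloaded positional 'name' argument: callers name only the options they care about, and everything else falls back to a default. The option names come from the message above; the default values and the resolveWorkerOptions helper are assumptions made up for this illustration, since the thread does not specify them.

```javascript
// Hypothetical defaults; the thread lists the options but not their defaults.
const DEFAULT_WORKER_OPTIONS = Object.freeze({
  name: null,                  // no name given
  language: "text/javascript", // assumed default for JS workers
  isContent: false,            // false: worker_body is a URL; true: it is content
  lifetime: "page",            // assumed default: tied to the creating page
});

// Hypothetical helper: merge caller-supplied options over the defaults.
function resolveWorkerOptions(options = {}) {
  return { ...DEFAULT_WORKER_OPTIONS, ...options };
}

const named = resolveWorkerOptions({ name: "cache" });
const inline = resolveWorkerOptions({ isContent: true });

console.log(named.name, named.language);    // cache text/javascript
console.log(inline.name, inline.isContent); // null true
```

Adding another option later means adding one more default entry; existing call sites that pass only { name: ... } keep working unchanged.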
Re: [whatwg] Workers feedback
On Wed, Aug 6, 2008 at 4:24 AM, Ian Hickson [EMAIL PROTECTED] wrote: * I've written an intro section which shows how the API is expected to be used. I've tried to illustrate each use case that people raised. I will add more tomorrow. Thanks, that helps a lot. * I've moved APIs to a utils object, so that we rarely, if ever, have to add members to the global scope (reduces chances of future collisions). I am opposed to the utils object. I don't see any precedent for this anywhere, and it just feels ugly to me. I liked it the way you had it before, with these APIs in a shared base interface. * I've replaced the URL string with a Location object. Thanks :). I've tried to simplify the MessagePort interface as follows: * messages are now queued, and won't be delivered until either the 'start()' method on the port is called, or the 'onmessage' attribute is set to some value. * messages are now queued, instead of a port becoming inactive when its other side is suspended. Can you explain the rationale for these two changes? * I've made the worker receive its first port as a property of the global object (port) instead of having to listen to the 'connect' event (though the connect event still fires, so you can do shared workers). I liked it the way you had it, before. I'd rather the first connection to a worker wasn't a special case, either for the worker or for the worker's creator. That's also one reason why I like having a separate Worker object and having the two-step process of creating the worker, then sending it a message. It means that creating a new channel to a worker is always the same. On Mon, 4 Aug 2008, Aaron Boodman wrote: So for example, I would be for moving over a subset of the navigator and location objects as-is (these seem to work well), but against moving over the document.cookie interface (it works poorly). 
I agree with porting some subset of 'navigator' over, though since the relevant parts of 'navigator' aren't defined even for HTML5 yet, I haven't yet done this. There's an issue marker in the spec about this. What bits would you like defined?

The ones that are most often used for browser detection are most important, so:

- appName
- appCodeName
- appVersion
- platform
- userAgent

I know the whole business of browser detection is a big mess right now, so if you're working on defining something better, I'd be open to having some combination of the old navigator object and that new thing in workers. But there is a lot of code that is very carefully crafted to analyze the navigator object, so maybe it's best not to mess with that too much.

On Tue, 5 Aug 2008, Aaron Boodman wrote:

- It seems like we might want an object that represents workers. This would allow us to put the 'onload' and 'onerror' events from MessagePort there, instead of on MessagePort, which makes more sense to me (I don't know what it means for a MessagePort to 'load' or 'error' outside of the context of a worker).

The main reason for not having a separate Worker object is that I couldn't find anything that would go on it other than the port. You'd still want the unload messages going to whoever owns the port, not whoever created the worker, if you passed the port around. Basically, adding a Worker object just seemed like it would double the number of objects, and potentially the complexity if we also allow Worker objects to be sent along channels, without really providing any new features.

I think that 'load', 'error', and 'unload' could go on the worker. As far as I can tell, the only thing 'load' and 'error' are used for is telling the creator of a worker that the worker loaded or failed to load. In that case, it seems wrong to throw them on MessagePort, since MessagePorts are also used for many other things. I also still think that Workers could have their own sendMessage.
The messages sent to this would be delivered to the worker as 'message' events targeted at WorkerGlobalObject (eliminating the need for onconnect?). This would make Workers and postMessage very similar to Window and postMessage, which seems nice to me.

MessagePort.onunload could then change to 'onclose' to go with the close() method.

The main reason I used 'unload' and 'close' is consistency with how the rest of the platform works. (With a Window, you call window.close() to invoke window.onunload.) I can change that if people want, though I do think consistency is worth keeping here.

I think the concept of a port becoming inactive is interesting in all the cases MessagePorts are used, so this should stay. In fact, should it be called 'oninactive'?

I would prefer to see something like:

  void Worker.postMessage(DOMString message)
  void Worker.postMessage(DOMString message, MessagePort port)

That way the way to establish a new channel is the same for all callers. It also has the advantage of looking similar to a window's postMessage API. With the exception