Re: Sync API for workers
Look at the IndexedDB API: since the asynchronous one was enough for everybody, the synchronous one was not implemented by the browsers, and now it has become deprecated... :-) Well, regarding my position, I would not smile ;-) I was considering a server-side implementation of IndexedDB. There is currently the indexeddb-js project for Node, which uses sqlite-3. I have it on my background to-do list to develop one of them ;-) In this case, using LevelDB as the base database instead of SQLite (I'm a fan of it, but this is not the correct use case for it). and should be considered by clients as remote workers (the server lets you debug those contexts via Web Inspector). Interesting concept; it seems we both see WebSockets and WebWorkers as cousins (the only difference in the API is that one uses send() and the other postMessage()). Have you tried to propose these RemoteWorkers as a standard? I like the JavaScript event loop's strength for async coding, and I appreciate the upcoming promises, but still, like some people on this list, I think that synchronous code is more user-friendly. For general purposes and as a general statement, yes, it's more user-friendly, but for some use cases asynchronous code is more efficient and, up to a point, more user-friendly once you understand it correctly. -- If you want to travel around the world and be invited to speak at a lot of different places, just write a Unix operating system. – Linus Torvalds, creator of the Linux operating system
Re: Sync API for workers
Let me introduce the first sketch of a variant. The general idea is to add a |postSyncMessage| method.

We extend DedicatedWorkerGlobalScope and MessageEvent as follows:

  interface DedicatedWorkerGlobalScope : WorkerGlobalScope {
    void postMessage(any message, optional sequence<Transferable> transfer);
    any postSyncMessage(any message, optional sequence<Transferable> transfer);
  };

  interface SyncMessageEvent : MessageEvent {
    void reply(optional any message, optional sequence<Transferable> transfer);
  };

The behavior of |postSyncMessage| is the following:
1. the sender worker sleeps and does not handle any |postMessage| messages until it is awakened;
2. instead of the usual |MessageEvent|, the target's |onmessage| receives as argument a |SyncMessageEvent| (call it |s|);
3. if |s.reply(x)| is called, the sender's |postSyncMessage| method returns a copy of |x|, obtained with the usual algorithm;
5. if |s.reply()| has not been called by the time the worker is either garbage-collected or |terminate()| is called on its |MessagePort|, the worker is killed as usual.

I have not attempted to detail the inner workings of the underlying MessagePort, but I suspect that this is close to Jonas Sicking's proposal.

Cheers, David
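The handshake described above can be modelled in a few lines. This is only a single-threaded sketch of the semantics, not an implementation: `FakeParent` is a hypothetical stand-in for the target, the handler runs inline so nothing really sleeps, the JSON fallback is a crude substitute for the structured clone algorithm, and ignoring a second `reply()` is an assumption, since the proposal does not say what should happen in that case.

```javascript
// Single-threaded model of the proposed postSyncMessage()/reply() handshake.
// Not a real Worker: the "parent" handler runs inline, so nothing actually sleeps.
const clone = typeof structuredClone === "function"
  ? structuredClone
  : (value) => JSON.parse(JSON.stringify(value)); // crude stand-in for structured clone

class SyncMessageEvent {
  constructor(data) {
    this.data = data;
    this.replied = false;
    this.response = undefined;
  }
  // Step 3: the first reply() fixes the value handed back to the sender.
  reply(message) {
    if (this.replied) return; // assumption: later replies are ignored
    this.replied = true;
    this.response = clone(message);
  }
}

// Hypothetical stand-in for the object that receives the message.
class FakeParent {
  constructor() {
    this.onmessage = null;
  }
}

// What the worker side would call. In the proposal the worker sleeps here (step 1).
function postSyncMessage(parent, message) {
  const event = new SyncMessageEvent(clone(message)); // step 2
  if (parent.onmessage) parent.onmessage(event);
  if (!event.replied) {
    throw new Error("no reply: a real worker would stay asleep until it is killed");
  }
  return event.response;
}

const parent = new FakeParent();
parent.onmessage = (s) => s.reply({ echoed: s.data, length: s.data.length });

const answer = postSyncMessage(parent, "ping");
console.log(answer);
```

In a real implementation the sender and the target would run on different threads, and the return from `postSyncMessage` would be driven by the browser rather than by a direct function call.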
Re: Sync API for workers
On 10/13/13 4:21 PM, James Greene wrote: a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted In other words, your clarification is completely true but my initial statement was written with regard to client-side JavaScript, which cannot sleep. As such, I believe my original assertions are still correct with regard to writing a sync wrapper in JS. My apologies, I had obviously misunderstood your initial statement. I was thinking of an extension of the Worker API and how to implement it at little CPU/battery cost. Cheers, David -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
Re: Sync API for workers
- Original Message - From: David Bruant bruan...@gmail.com To: Jonas Sicking jo...@sicking.cc Cc: public-webapps public-webapps@w3.org, aza...@mozilla.com Sent: Sunday, October 13, 2013 1:36:22 PM Subject: Re: Sync API for workers * You could solve the use case of compile-to-JS for code that uses sync APIs using yield. However it requires changing all functions into generators, and all function calls into yield* statements. all? as is all function in the application? that sounds like a too violent constraint, especially if a small proportion of the code uses sync functions. Maybe only the functions that may call a sync function need to be changed to generators... oh... hmm... I don't know. Taking the liberty to cc Alon Zakai to ask for his expert opinion on this topics. Not sure about all the context here. In general, the idea of using yield or CPS to handle synchronous code has come up in emscripten, but no one has done work to implement it, so we don't have a concrete answer for how practical it would be. My guess however is that it would be not very practical, because large codebases can have sync code anywhere, and relying on static analysis to simplify that so it is mostly not a factor is very optimistic. CPS all the time would likely be too slow; yield all the time I am less clear on because I am not sure the implementations are mature enough to benchmark yet (and no implementation at all in IE and Safari last I heard) - we would need to ask JS engine devs on that. - Alon
Re: Sync API for workers
On Mon, Oct 14, 2013 at 2:33 AM, David Rajchenbach-Teller dtel...@mozilla.com wrote: Let me introduce the first sketch of a variant. The general idea is to add a |postSyncMessage| We extend DedicatedWorkerGlobalScope and MessageEvent as follows: interface DedicatedWorkerGlobalScope : WorkerGlobalScope { void postMessage(any message, optional sequence<Transferable> transfer); any postSyncMessage(any message, optional sequence<Transferable> transfer); }; interface SyncMessageEvent : MessageEvent { void reply(optional any message, optional sequence<Transferable> transfer); };

This API was suggested by Olli way up in this thread. It has a few downsides:
1. It only allows a single synchronous message channel. That means that if you have several libraries that all need synchronous communication with the parent, they have to coordinate on some way to distinguish each other's messages. The fact that Gecko hasn't had MessageChannel support has resulted in the same problem for asynchronous communication, and that has been a big headache for developers.
2. It doesn't support streaming return values. I.e. you can't send multiple return values from a single postSyncMessage call.
3. It doesn't allow direct synchronous communication between a worker and the worker's grandparent. Every single message has to be individually routed through the parent.
4. What happens if you have multiple event listeners in the parent and several of them call .reply()?

I wouldn't say that all of these are killer issues. I do think the first one is though. And the other three are clearly downsides. All in all I think the added complexity in the later proposal is worth it. / Jonas
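To make the first downside concrete, here is a rough sketch of the coordination that several libraries would need when they share one channel. Everything in it is hypothetical: the `{ lib, payload }` envelope, the `registerLibrary` helper and the two example libraries are illustrations, not part of any proposal in this thread.

```javascript
// Hypothetical routing layer for a single shared message channel.
// Every library has to agree to wrap its payload as { lib, payload }.
const handlers = new Map();

function registerLibrary(name, handler) {
  if (handlers.has(name)) throw new Error("tag already taken: " + name);
  handlers.set(name, handler);
}

// Stands in for the one onmessage handler available on the parent side.
function onSharedMessage(event) {
  const { lib, payload } = event.data;
  const handler = handlers.get(lib);
  if (!handler) throw new Error("no library registered for tag: " + lib);
  return handler(payload);
}

registerLibrary("storage", (key) => "value-for-" + key);
registerLibrary("clock", () => 1234);

const fromStorage = onSharedMessage({ data: { lib: "storage", payload: "user" } });
const fromClock = onSharedMessage({ data: { lib: "clock", payload: null } });
console.log(fromStorage, fromClock);
```

With one MessageChannel per library this envelope and the shared registry would not be needed, because each library would own its port.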
Re: Sync API for workers
You snipped the comment about waitForMessage(). I think it should return an Event, as if the message had been received from onmessage, not just the received data. On Sun, Oct 13, 2013 at 10:37 PM, Jonas Sicking jo...@sicking.cc wrote: This is certainly an improvement over the previous proposal. However given that synchronous APIs of any type are quite controversial, I'd rather stick to a basic approach for now. There's nothing controversial about synchronous APIs in workers. Doing work synchronously is the whole point. The nice thing about your proposal is that it's strictly additive, so it's something we can add later if there's agreement that the problems it aims to solve are problems that need solving, and there's agreement that the proposal is the right way to solve them. This will cause people to learn to structure their workers poorly, and to create worker libraries based on that structure, with extra message relaying infrastructure to work around this, and pollute people's still-immature understanding of message ports. We should do it right in the first place. On Mon, Oct 14, 2013 at 4:33 AM, David Rajchenbach-Teller dtel...@mozilla.com wrote: Let me introduce the first sketch of a variant. The general idea is to add a |postSyncMessage| (I'm not sure what problems with the existing proposals this is trying to solve.) -- Glenn Maynard
Re: Sync API for workers
This was meant to be a more limited and well-behaved variant. However, as pointed out by Jonas, a very similar proposal was submitted and discussed long before I joined this list. So, please disregard my proposal; it is an artifact of me not searching the archives well enough. Best regards, David On 10/14/13 4:14 PM, Glenn Maynard wrote: On Mon, Oct 14, 2013 at 4:33 AM, David Rajchenbach-Teller dtel...@mozilla.com wrote: Let me introduce the first sketch of a variant. The general idea is to add a |postSyncMessage| (I'm not sure what problems with the existing proposals this is trying to solve.) -- Glenn Maynard -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
Re: Sync API for workers
Could we change the method name under discussion to `postMessageSync` instead of `postSyncMessage`? I know they're not grammatically equivalent, but I've always found the *Sync suffixes used on pertinent Node.js APIs to be much more intuitive than trying to guess which position within a string of words it should take. Not sure about prior art within the web platform. On Oct 14, 2013 4:59 AM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Oct 14, 2013 at 2:33 AM, David Rajchenbach-Teller dtel...@mozilla.com wrote: Let me introduce the first sketch of a variant. The general idea is to add a |postSyncMessage| We extend DedicatedWorkerGlobalScope and MessageEvent as follows: interface DedicatedWorkerGlobalScope : WorkerGlobalScope { void postMessage(any message, optional sequence<Transferable> transfer); any postSyncMessage(any message, optional sequence<Transferable> transfer); }; interface SyncMessageEvent : MessageEvent { void reply(optional any message, optional sequence<Transferable> transfer); }; This API was suggested by Olli way up in this thread. It has a few downsides: 1. It only allows a single synchronous message channel. That means that if you have several libraries that all need synchronous communication with the parent, they have to coordinate on some way to distinguish each other's messages. The fact that Gecko hasn't had MessageChannel support has resulted in the same problem for asynchronous communication, and that has been a big headache for developers. 2. It doesn't support streaming return values. I.e. you can't send multiple return values from a single postSyncMessage call. 3. It doesn't allow direct synchronous communication between a worker and the worker's grandparent. Every single message has to be individually routed through the parent. 4. What happens if you have multiple event listeners in the parent and several of them call .reply()? I wouldn't say that all of these are killer issues. I do think the first one is though.
And the other three are clearly downsides. All in all I think the added complexity in the later proposal is worth it. / Jonas
Re: Sync API for workers
On 10/12/13 3:48 PM, James Greene wrote: You can only build a synchronous API on top of an asynchronous API if they are (a) running in separate threads/processes AND (b) the sync thread can synchronously poll (busy loop) for the progress/completion of the async thread. a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
Re: Sync API for workers
Thanks for adding clarification. That CAN be true but it depends on the environment [so far as I can see]. For example, such an API wrapper couldn't be built in today's client-side JavaScript because the UI thread can't do a synchronous yielding sleep but rather can only do a synchronous blocking wait, which means it wouldn't yield to allow for the Worker thread to asynchronously respond and toggle such a condition/mutex/etc. unless such can be synchronously requested by the blocking thread from within the busy wait loop (e.g. `processEvents();`) as browsers won't interrupt the synchronous flow of the JS busy loop to trigger `onmessage` handlers for async messages sent from the Worker. If I'm mistaken, please consider providing a code snippet, gist, etc. to get me back on track. Thanks! On Oct 13, 2013 5:06 AM, David Rajchenbach-Teller dtel...@mozilla.com wrote: On 10/12/13 3:48 PM, James Greene wrote: You can only build a synchronous API on top of an asynchronous API if they are (a) running in separate threads/processes AND (b) the sync thread can synchronously poll (busy loop) for the progress/completion of the async thread. a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
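The busy-wait limitation described above can be observed directly. In this small sketch a promise callback stands in for the worker's reply (an assumption made so the example is self-contained; a real `onmessage` event is queued differently, but it is subject to the same run-to-completion rule), and the spin count is bounded so the script cannot hang.

```javascript
// A pending callback cannot run while synchronous code is still spinning.
let responded = false;

// Stand-in for a worker's reply arriving: queued, but not yet delivered.
Promise.resolve().then(() => {
  responded = true;
});

// Busy-wait with a bounded spin count so the script cannot hang forever.
let spins = 0;
while (!responded && spins < 1e6) {
  spins++;
}

// Still false: the callback only runs after this script has finished.
const sawReplyWhileSpinning = responded;
console.log(sawReplyWhileSpinning, spins); // false 1000000
```

The flag only flips once the current script returns to the event loop, which is why a polling loop in plain client-side JavaScript never sees the response it is waiting for.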
Re: Sync API for workers
Actually, only IDBRequest needs to be sync; in its async form it is error-prone and complicates the workflow. An async workflow for opening the database and requesting transactions is fine. Kyaw
Re: Sync API for workers
a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted In other words, your clarification is completely true but my initial statement was written with regard to client-side JavaScript, which cannot sleep. As such, I believe my original assertions are still correct with regard to writing a sync wrapper in JS. On Oct 13, 2013 9:09 AM, James Greene james.m.gre...@gmail.com wrote: Thanks for adding clarification. That CAN be true but it depends on the environment [so far as I can see]. For example, such an API wrapper couldn't be built in today's client-side JavaScript because the UI thread can't do a synchronous yielding sleep but rather can only do a synchronous blocking wait, which means it wouldn't yield to allow for the Worker thread to asynchronously respond and toggle such a condition/mutex/etc. unless such can be synchronously requested by the blocking thread from within the busy wait loop (e.g. `processEvents();`) as browsers won't interrupt the synchronous flow of the JS busy loop to trigger `onmessage` handlers for async messages sent from the Worker. If I'm mistaken, please consider providing a code snippet, gist, etc. to get me back on track. Thanks!
Re: Sync API for workers
JavaScript now has support for yield statements the same way Python does; that's a way to stop (i.e. sleep) the execution of a script to allow another to work, and to resume from there. It's not their main function, but they allow creating what are called greenlets (green threads), and that's how I have seen sync APIs built on top of async ones... On 13/10/2013 16:21, James Greene james.m.gre...@gmail.com wrote: a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted In other words, your clarification is completely true but my initial statement was written with regard to client-side JavaScript, which cannot sleep. As such, I believe my original assertions are still correct with regard to writing a sync wrapper in JS.
Re: Sync API for workers
Oh, does `yield` work anywhere? I thought it was only for use within generators. Admittedly, I haven't been keeping up with the latest ES6 changes. On Oct 13, 2013 9:38 AM, pira...@gmail.com pira...@gmail.com wrote: JavaScript now has support for yield statements the same way Python does; that's a way to stop (i.e. sleep) the execution of a script to allow another to work, and to resume from there. It's not their main function, but they allow creating what are called greenlets (green threads), and that's how I have seen sync APIs built on top of async ones...
Re: Sync API for workers
I don't know; I only know the behavior of Python's yield statement, but the JavaScript one was developed following it, and I'm 90% sure it follows the same behaviour (almost all new JavaScript functionality is being borrowed from Python, since it seems Mozilla's JavaScript implementors are, on purpose, former Python programmers), so yes, I believe it should work this way :-) On 13/10/2013 18:27, James Greene james.m.gre...@gmail.com wrote: Oh, does `yield` work anywhere? I thought it was only for use within generators. Admittedly, I haven't been keeping up with the latest ES6 changes.
Re: Sync API for workers
On 10/13/13 6:33 PM, pira...@gmail.com wrote: I don't know; I only know the behavior of Python's yield statement, but the JavaScript one was developed following it, and I'm 90% sure it follows the same behaviour (almost all new JavaScript functionality is being borrowed from Python, since it seems Mozilla's JavaScript implementors are, on purpose, former Python programmers), so yes, I believe it should work this way :-)

It's slightly more complex than [my understanding of] your original phrasing, but in a word, yes, it behaves essentially as in Python. E.g., using Task.js [1], with a proper (and trivial) definition of wait(), the following implements a polling loop that does not block the event loop:

  Tasks.spawn(function* () {
    while (true) {
      yield wait();
      poll();
    }
  });

[1] http://taskjs.org/

-- David Rajchenbach-Teller, PhD Performance Team, Mozilla
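For readers without Task.js at hand, the cooperative idea behind that loop can be sketched with plain generators. This is not Task.js's API: `runAll` and `poller` are hypothetical names, and the scheduler below runs synchronously to completion instead of resuming tasks from the event loop, which is an assumption made only to keep the example self-contained.

```javascript
// Round-robin scheduler: each task is a generator, and each `yield` hands control back.
function runAll(tasks) {
  const queue = tasks.map((makeTask) => makeTask());
  while (queue.length > 0) {
    const task = queue.shift();
    const { done } = task.next();
    if (!done) queue.push(task); // not finished: go to the back of the line
  }
}

const log = [];

function* poller(name, rounds) {
  for (let i = 0; i < rounds; i++) {
    log.push(name + i); // stands in for poll()
    yield;              // stands in for `yield wait()`
  }
}

runAll([() => poller("a", 2), () => poller("b", 3)]);
console.log(log.join(" ")); // a0 b0 a1 b1 b2
```

The interleaved log shows that each task pauses at `yield` and later resumes where it left off, which is the property the polling loop relies on.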
Re: Sync API for workers
Demonstration by example, thanks :-) 2013/10/13 David Rajchenbach-Teller dtel...@mozilla.com: On 10/13/13 6:33 PM, pira...@gmail.com wrote: I don't know; I only know the behavior of Python's yield statement, but the JavaScript one was developed following it, and I'm 90% sure it follows the same behaviour (almost all new JavaScript functionality is being borrowed from Python, since it seems Mozilla's JavaScript implementors are, on purpose, former Python programmers), so yes, I believe it should work this way :-) It's slightly more complex than [my understanding of] your original phrasing, but in a word, yes, it behaves essentially as in Python. E.g., using Task.js [1], with a proper (and trivial) definition of wait(), the following implements a polling loop that does not block the event loop: Tasks.spawn(function* () { while (true) { yield wait(); poll(); } }); [1] http://taskjs.org/ -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
Re: Sync API for workers
Ok, this thread is clearly heading off the deep end. Let me clear up a few points of confusion:

* You cannot wrap a truly synchronous library around an asynchronous API. Spinning the event loop gets you close, but breaks run-to-completion. Furthermore, spinning the event loop is irrelevant as we don't have an API to do that, nor are we planning to introduce one.
* yield only works within generators in JS.
* You could solve the use case of compile-to-JS for code that uses sync APIs using yield. However, it requires changing all functions into generators, and all function calls into yield* statements. That comes at a performance overhead that is significant enough to make it an unacceptable solution (several times slower in current implementations).
* You could likewise solve the compile-to-JS use case by generating, instead of plain JS, JS that implements a virtual machine that runs the compiled code. This would allow pausing the virtual machine whenever an async call is happening. The performance overhead here is simply too large (in fact, that's essentially what using generators and yield* does).
* yield would not solve the use case of allowing libraries that use features from the main thread as it, again, would require a rewrite of all code that directly or indirectly uses that library to change all functions into generators and all function calls into yield*.

/ Jonas

On Sun, Oct 13, 2013 at 9:33 AM, pira...@gmail.com pira...@gmail.com wrote: I don't know; I only know the behavior of Python's yield statement, but the JavaScript one was developed following it, and I'm 90% sure it follows the same behaviour (almost all new JavaScript functionality is being borrowed from Python, since it seems Mozilla's JavaScript implementors are, on purpose, former Python programmers), so yes, I believe it should work this way :-)
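The virtual-machine option mentioned in Jonas's list can be illustrated with a toy interpreter. This is only a sketch under heavy assumptions: the three-instruction stack machine, the `request` opcode and the `createVm`/`run` helpers are invented for illustration and say nothing about how a real compile-to-JS toolchain would do it. The point is merely that when all execution state lives in plain data, the program can stop at a would-be synchronous call and continue once the value arrives.

```javascript
// Toy stack machine whose state lives in plain data, so it can stop at any instruction.
function createVm(program) {
  return { program, pc: 0, stack: [], waitingFor: null, done: false };
}

// Runs until the program ends or reaches an instruction that needs an async result.
function run(vm, resumeValue) {
  if (vm.waitingFor !== null) {
    vm.stack.push(resumeValue); // deliver the result of the pending request
    vm.waitingFor = null;
  }
  while (vm.pc < vm.program.length) {
    const [op, arg] = vm.program[vm.pc++];
    if (op === "push") {
      vm.stack.push(arg);
    } else if (op === "add") {
      vm.stack.push(vm.stack.pop() + vm.stack.pop());
    } else if (op === "request") {
      vm.waitingFor = arg; // pause here; the caller resumes us later
      return vm;
    } else {
      throw new Error("unknown op: " + op);
    }
  }
  vm.done = true;
  return vm;
}

// "Compiled" form of: result = 1 + syncRead("config")
const vm = createVm([["push", 1], ["request", "config"], ["add"]]);

run(vm);                        // stops at the request
const pausedOn = vm.waitingFor; // "config"
run(vm, 41);                    // the value arrives later; execution continues
const result = vm.stack[0];
console.log(pausedOn, result);  // config 42
```

Every instruction goes through the interpreter loop instead of a native call, which hints at why the overhead of this approach is considered too large.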
Re: Sync API for workers
On 13/10/2013 21:39, Jonas Sicking wrote: Ok, this thread is clearly heading off the deep end. Let me clear up a few points of confusion: * You can not wrap a truly synchronous library around an asynchronous API. Spinning the event loop gets you close, but breaks run-to-completion. Furthermore, spinning the event loop is irrelevant as we don't have an API to do that, nor are we planning to introduce one. * yield only works within generators in JS. To be honest, I feel generators will be interoperably deployed cross-browser long before sync APIs in workers. V8 and SpiderMonkey already have generators. I'm not entirely sure it's 100% compliant in SpiderMonkey yet, but for sure it's being actively worked on and should be soon if not yet. * You could solve the use case of compile-to-JS for code that uses sync APIs using yield. However it requires changing all functions into generators, and all function calls into yield* statements. All? As in all functions in the application? That sounds like too violent a constraint, especially if a small proportion of the code uses sync functions. Maybe only the functions that may call a sync function need to be changed to generators... oh... hmm... I don't know. Taking the liberty to cc Alon Zakai to ask for his expert opinion on this topic. That comes at a performance overhead that is significant enough as to make it an unacceptable solution (several times slower in current implementations). I guess this point depends on the previous one. Given that compile-to-JS has the wind behind it these days, generators may benefit from optimizations. * yield would not solve the use-case of allowing libraries that use features from the main thread as it, again, would require a rewrite of all code that directly or indirectly uses that library to change all functions into generators and all function calls into yield*. I think this point is about interoperability between main thread and worker.
I don't remember this point being discussed too much yet. What about exposing both async and sync APIs to workers? David
Re: Sync API for workers
On Sun, Oct 13, 2013 at 1:36 PM, David Bruant bruan...@gmail.com wrote: On 13/10/2013 21:39, Jonas Sicking wrote: Ok, this thread is clearly heading off the deep end. Let me clear up a few points of confusion: * You can not wrap a truly synchronous library around an asynchronous API. Spinning the event loop gets you close, but breaks run-to-completion. Furthermore, spinning the event loop is irrelevant as we don't have an API to do that, nor are we planning to introduce one. * yield only works within generators in JS. To be honest, I feel generators will be interoperably deployed cross-browser long before sync APIs in workers. V8 and SpiderMonkey already have generators. I'm not entirely sure it's 100% compliant in SpiderMonkey yet, but for sure it's being actively worked on and should be soon if not yet. I would expect that too, but that's beside the point given that yield doesn't actually solve the problem. See below. * You could solve the use case of compile-to-JS for code that uses sync APIs using yield. However it requires changing all functions into generators, and all function calls into yield* statements. All? As in all functions in the application? That sounds like too violent a constraint, especially if a small proportion of the code uses sync functions. Maybe only the functions that may call a sync function need to be changed to generators... oh... hmm... I don't know. All as in all that directly *or indirectly* call the sync API. That comes at a performance overhead that is significant enough as to make it an unacceptable solution (several times slower in current implementations). I guess this point depends on the previous one. Given that compile-to-JS has the wind behind it these days, generators may benefit from optimizations. Generators certainly can be made faster. But given that they always mean giving up the CPU's mechanisms for function calls, I'd be very surprised if the performance can be made good enough.
* yield would not solve the use-case of allowing libraries that use features from the main thread as it, again, would require a rewrite of all code that directly or indirectly uses that library to change all functions into generators and all function calls into yield*. I think this point is about interoperability between main thread and worker. I don't remember this point being discussed too much yet. See my original email. Workers are likely to always lag behind main thread. And some APIs aren't currently slated for workers at all. In particular ones that involve UI or the DOM. What about exposing both async and sync APIs to workers? What about it? We could certainly add lots of sync APIs to workers. But if one of the main use cases is compile-to-JS of existing pre-web codebases, which presumably over the years will become less of an issue as people target the web directly, then finding a smaller surface like my proposed API seems beneficial to lots of specialized sync APIs. / Jonas
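To illustrate the point about generators, here is a minimal sketch of how the requirement spreads through a call chain. Everything in it is an assumption made for illustration: asyncDouble is a hypothetical stand-in for an asynchronous API, and run is a hand-written driver, not anything proposed in this thread. The leaf that waits has to be a generator, and so does every function that directly or indirectly calls it, with each call turned into yield*.

```javascript
// Hypothetical stand-in for an asynchronous API (illustrative only).
function asyncDouble(x) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(x * 2); }, 0);
  });
}

// The leaf that "waits" must be a generator so that it can suspend.
function* readValue(x) {
  // Suspends here; the driver resumes us with the settled value.
  const result = yield asyncDouble(x);
  return result;
}

// Every caller on the path to the leaf must also become a generator,
// and every call on that path must become a yield* delegation.
function* compute(x) {
  const a = yield* readValue(x);
  const b = yield* readValue(a);
  return a + b;
}

// Minimal driver (an assumption of this sketch): resumes the generator
// each time the promise it yielded settles.
function run(genFn, arg) {
  const gen = genFn(arg);
  return new Promise(function (resolve, reject) {
    function step(method, value) {
      let r;
      try {
        r = gen[method](value);
      } catch (err) {
        reject(err);
        return;
      }
      if (r.done) {
        resolve(r.value);
        return;
      }
      Promise.resolve(r.value).then(
        function (v) { step("next", v); },
        function (e) { step("throw", e); }
      );
    }
    step("next", undefined);
  });
}

run(compute, 3).then(function (v) { console.log(v); }); // logs 18
```

Note that compute could not stay an ordinary function: a plain call to readValue(x) would only return a generator object, not the value. That is the rewrite cost being discussed.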
Re: Sync API for workers
On Sunday, October 13, 2013, James Greene wrote: Oh, does `yield` work anywhere? I thought it was only for use within generators. Admittedly, I haven't been keeping up with the latest ES6 changes. yield may only appear in the body of a generator function, denoted by star syntax: function* g(){} Rick On Oct 13, 2013 9:38 AM, pira...@gmail.com wrote: JavaScript now supports yield statements the same way Python does; that's a way to stop (i.e. sleep) the execution of a script to allow another to work, and to restart from there. It's not their main purpose, but they allow creating what are called greenlets (green threads), and that's how I have seen sync APIs built on top of async ones... On 13/10/2013 16:21, James Greene james.m.gre...@gmail.com wrote: a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted In other words, your clarification is completely true but my initial statement was written with regard to client-side JavaScript, which cannot sleep. As such, I believe my original assertions are still correct with regard to writing a sync wrapper in JS. On Oct 13, 2013 9:09 AM, James Greene james.m.gre...@gmail.com wrote: Thanks for adding clarification. That CAN be true but it depends on the environment [so far as I can see]. For example, such an API wrapper couldn't be built in today's client-side JavaScript because the UI thread can't do a synchronous yielding sleep but rather can only do a synchronous blocking wait, which means it wouldn't yield to allow for the Worker thread to asynchronously respond and toggle such a condition/mutex/etc. unless such can be synchronously requested by the blocking thread from within the busy wait loop (e.g. 
`processEvents();`) as browsers won't interrupt the synchronous flow of the JS busy loop to trigger `onmessage` handlers for async messages sent from the Worker. If I'm mistaken, please consider providing a code snippet, gist, etc. to get me back on track. Thanks! On Oct 13, 2013 5:06 AM, David Rajchenbach-Teller dtel...@mozilla.com javascript:_e({}, 'cvml', 'dtel...@mozilla.com'); wrote: On 10/12/13 3:48 PM, James Greene wrote: You can only build a synchronous API on top of an asynchronous API if they are (a) running in separate threads/processes AND (b) the sync thread can synchronously poll (busy loop) for the progress/completion of the async thread. a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
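A minimal sketch of the busy-wait problem described above, under stated assumptions: a zero-delay timer stands in for the Worker's asynchronous reply, and busyWaitForReply is a hypothetical name, not an existing API. Timer callbacks and `onmessage` handlers are both dispatched from the event loop, so neither can run while a synchronous loop holds the thread.

```javascript
// Hypothetical helper: tries to "wait synchronously" for an async reply.
function busyWaitForReply(timeoutMs) {
  let replied = false;

  // Stand-in for the Worker's reply arriving asynchronously.
  setTimeout(function () { replied = true; }, 0);

  const deadline = Date.now() + timeoutMs;
  while (!replied && Date.now() < deadline) {
    // Busy loop: run-to-completion means no queued task is dispatched
    // here, so the callback above cannot flip the flag.
  }
  return replied;
}

console.log(busyWaitForReply(20)); // false: the reply was never observed
```

The flag only changes after the function has already returned, which is why the wrapper needs either a real blocking primitive or a way to pump events from inside the loop.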
Re: Sync API for workers
Rick: Thanks for confirming that. Being more familiar with generators (and other ES6 goodies), can you envision any setup where a generator (or perhaps multiple yielding to each other) would enable us to build synchronous API wrappers around async APIs in JS? On Oct 13, 2013 6:44 PM, Rick Waldron waldron.r...@gmail.com wrote: On Sunday, October 13, 2013, James Greene wrote: Oh, does `yield` work anywhere? I thought it was only for use within generators. Admittedly, I haven't been keeping up with the latest ES6 changes. yield may only appear in the body of a generator function, denoted by star syntax: function* g(){} Rick On Oct 13, 2013 9:38 AM, pira...@gmail.com pira...@gmail.com wrote: JavaScript now supports yield statements the same way Python does; that's a way to stop (i.e. sleep) the execution of a script to allow another to work, and to restart from there. It's not their main purpose, but they allow creating what are called greenlets (green threads), and that's how I have seen sync APIs built on top of async ones... On 13/10/2013 16:21, James Greene james.m.gre...@gmail.com wrote: a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted In other words, your clarification is completely true but my initial statement was written with regard to client-side JavaScript, which cannot sleep. As such, I believe my original assertions are still correct with regard to writing a sync wrapper in JS. On Oct 13, 2013 9:09 AM, James Greene james.m.gre...@gmail.com wrote: Thanks for adding clarification. That CAN be true but it depends on the environment [so far as I can see]. For example, such an API wrapper couldn't be built in today's client-side JavaScript because the UI thread can't do a synchronous yielding sleep but rather can only do a synchronous blocking wait, which means it wouldn't yield to allow for the Worker thread to asynchronously respond and toggle such a condition/mutex/etc. 
unless such can be synchronously requested by the blocking thread from within the busy wait loop (e.g. `processEvents();`) as browsers won't interrupt the synchronous flow of the JS busy loop to trigger `onmessage` handlers for async messages sent from the Worker. If I'm mistaken, please consider providing a code snippet, gist, etc. to get me back on track. Thanks! On Oct 13, 2013 5:06 AM, David Rajchenbach-Teller dtel...@mozilla.com wrote: On 10/12/13 3:48 PM, James Greene wrote: You can only build a synchronous API on top of an asynchronous API if they are (a) running in separate threads/processes AND (b) the sync thread can synchronously poll (busy loop) for the progress/completion of the async thread. a) is necessary, but for b) it is sufficient for the sync thread to be able to sleep until a condition/mutex/... is lifted -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
Re: Sync API for workers
What I really dislike about this is that the worker can't send the port directly to a UI thread if it's a nested worker; it has to send it to its parent, who has to forward it to its parent, and so on. That seems like it'll make it hard to implement libraries, since a library needs to have its fingers in every one of your workers' main message ports, or the author needs to invent a new message routing mechanism (which is something message ports are supposed to do for us). For example, the example below can't be easily modularized and made into a library. If regular ports could be used, you could wrap both sides in initAlertHandling(messagePort), handing it a port to communicate over. The deadlock prevention algorithm I proposed earlier attempted to address this, but it was too complex. It attempted to allow as close to arbitrary message passing as possible without allowing deadlocks; maybe there's a less permissive approach that would be simple enough. Here's an alternate proposal. It attempts to allow transferring these ports in any way, across regular MessagePorts, but it no longer tries to implement blocking with regular MessagePorts. It keeps the SyncMessageChannel, MessagePortSyncSide and MessagePortAsyncSide interfaces of your proposal to simplify things: - When a SyncMessageChannel is created, store an identifier for the current thread in both ports, called the port's initial thread. This property stays the same as the port is transferred around. - Add a property to both ports called transferred first, initially null. If MessagePortSyncSide is transferred and its transferred first property is null, set its transferred first to true and the transferred first property of its corresponding MessagePortAsyncSide to false. The same is true in reverse. (This is possible because, the first time either port is transferred, they're obviously still in the same thread.) 
- If either type of port is transferred to an illegal thread, the recipient thread automatically calls close() on the port. - Descendants of a MessagePortSyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is true, the initial thread itself is also a legal thread. - Ancestors of a MessagePortSyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is false, the initial thread itself is also a legal thread. Like the previous proposal, this requires that the browser can get a view of the thread tree. Since all of the checking happens in the thread that receives the port, and not in the sending thread, there are no race conditions due to not knowing the target of a port in advance. The transferred first field allows creating a SyncMessageChannel, and then either 1: transferring the MessagePortSyncSide to a child worker, or 2: transferring the MessagePortAsyncSide upwards to a parent worker or the UI thread. You can't do both; once you transfer a port, that property is immutable. Simply put, if you transfer either port across the boundary defined by the initial thread, the ports are shut down to prevent the possibility of deadlocks. any waitForMessage(); With regular messaging you have an Event containing a .data property with the posted message. Here, you just have the message. That'll make adding metadata difficult. Are we sure that's what we want? (One thing we'd lose is the message's origin; I'm not sure on a quick reading if this is meant to be supported for message channels, but it looks like it is.) An aside: one thing you'll want to be able to do is block on multiple sources, possibly with a timeout, like select(). I think the way this API would allow that is for the user to create a multiplexer thread that does all of the listening asynchronously, and then a single blocking port is used between the mux thread and the real thread. 
On Sun, Oct 13, 2013 at 2:39 PM, Jonas Sicking jo...@sicking.cc wrote: * yield only works within generators in JS. And if that's like Python, yield only goes up one level, and then the caller has to yield too. Generators are useless for wrapping async APIs unless you structure your entire codebase around them. The feature is very useful (its incarnation in Python, at least), but has no relevance to the problems that the API discussed in this thread is meant to solve. To give the simplest example of what you can do with this API: implement confirm() in workers. --- worker --- (function() { var channel = new SyncMessageChannel(); postMessage({ port: channel.asyncPort }); var _confirmPort = channel.syncPort; confirm = function(message) { _confirmPort.postMessage(message); return _confirmPort.waitForMessage().result; } })(); if(confirm("Delete everything?")) ... --- main thread --- var worker = createWorker(); // create the above worker worker.onmessage = function(e) { var confirmPort = e.data.port; confirmPort.onmessage = function(e) { var answer =
Re: Sync API for workers
On Sun, Oct 13, 2013 at 8:11 PM, Glenn Maynard gl...@zewt.org wrote: - Descendants of a MessagePortSyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is true, the initial thread itself is also a legal thread. - Ancestors of a MessagePortSyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is false, the initial thread itself is also a legal thread. Correction: - Descendants of a MessagePortSyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is *false*, the initial thread itself is also a legal thread. - Ancestors of a *MessagePortAsyncSide*'s initial thread are always legal threads. Additionally, if the port's transferred first value is false, the initial thread itself is also a legal thread. The initial thread is only a valid thread for the port that was *not* transferred first. When the first port is transferred for the first time, the remaining port, which is still in the thread it was created in, is always in a valid thread. -- Glenn Maynard
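The corrected rules can be sketched as a small check run by the thread that receives a port. This is only an illustration under stated assumptions: threads are modelled as plain objects with a parent link, and the names isAncestor, isLegalThread, side, initialThread and transferredFirst are hypothetical, chosen here to mirror the wording of the proposal rather than taken from any specification.

```javascript
// Returns true if `a` is a strict ancestor of `b` in the thread tree.
function isAncestor(a, b) {
  for (let t = b.parent; t !== null; t = t.parent) {
    if (t === a) return true;
  }
  return false;
}

// Check performed by the receiving thread (hypothetical model).
function isLegalThread(port, thread) {
  if (thread === port.initialThread) {
    // The initial thread is legal only for the port that was
    // NOT transferred first.
    return port.transferredFirst === false;
  }
  if (port.side === "sync") {
    // Sync side: descendants of the initial thread are legal.
    return isAncestor(port.initialThread, thread);
  }
  // Async side: ancestors of the initial thread are legal.
  return isAncestor(thread, port.initialThread);
}

// Example tree: UI thread -> w1 -> w2, with the channel created in w1
// and the sync port transferred first (down to a child worker).
const ui = { name: "ui", parent: null };
const w1 = { name: "w1", parent: ui };
const w2 = { name: "w2", parent: w1 };
const syncPort = { side: "sync", initialThread: w1, transferredFirst: true };
const asyncPort = { side: "async", initialThread: w1, transferredFirst: false };

console.log(isLegalThread(syncPort, w2));  // true
console.log(isLegalThread(syncPort, ui));  // false
console.log(isLegalThread(asyncPort, w1)); // true
```

In the proposal, a port that arrives in a thread for which this check fails would have close() called on it automatically by the recipient.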
Re: Sync API for workers
On Sun, Oct 13, 2013 at 8:19 PM, Glenn Maynard gl...@zewt.org wrote: On Sun, Oct 13, 2013 at 8:11 PM, Glenn Maynard gl...@zewt.org wrote: - Descendants of a MessagePortSyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is true, the initial thread itself is also a legal thread. - Ancestors of a MessagePortSyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is false, the initial thread itself is also a legal thread. Correction: - Descendants of a MessagePortSyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is false, the initial thread itself is also a legal thread. - Ancestors of a MessagePortAsyncSide's initial thread are always legal threads. Additionally, if the port's transferred first value is false, the initial thread itself is also a legal thread. The initial thread is only a valid thread for the port that was *not* transferred first. When the first port is transferred for the first time, the remaining port, which is still in the thread it was created in, is always in a valid thread. This is certainly an improvement over the previous proposal. However given that synchronous APIs of any type are quite controversial, I'd rather stick to a basic approach for now. The nice thing about your proposal is that it's strictly additive, so it's something we can add later if there's agreement that the problems it aims to solve are problems that need solving, and there's agreement that the proposal is the right way to solve them. / Jonas
Re: Sync API for workers
When I see discussion of any new/recent synchronous APIs for the Web platform these days I pretty much take it they're implicitly intended just for use with Workers. So I assume that's the context Jonas intended. It's a safe assumption, but I think it's better to be asynchronous in workers too, not only for efficiency, but also to have a single programming model, so you can easily move your code between workers and the main thread, and maybe Node.js as well. Look at the IndexedDB API: since the asynchronous one was enough for everybody, the synchronous one was not implemented by the browsers and has now become deprecated... :-) --Mike P.S. Of course there's room for disagreement about whether synchronous APIs are even a good idea even for the Workers case - http://infrequently.org/2013/05/the-case-against-synchronous-worker-apis-2/ Good to know I'm not the only one... :-)
Re: Sync API for workers
Synchronous APIs are easier to use since it's how things have been done since decades ago, No, they're easier to use because they fit the model of linear human thought more naturally. The idea that asynchronous APIs are just as good and easy as synchronous APIs, and that people only disagree because of lack of experience with asynchronous APIs, is mistaken. APIs must be designed around how programmers' minds actually work, not how you'd like them to work. I agree that APIs should have the least surprise factor, but are you sure that's the reason why we usually program in a synchronous, linear way? Because I've always thought it was a legacy of the batch-processing machines of the '60s... I don't believe event-oriented programming fits badly with how humans think, because nature itself is event-oriented (if this then that), and I don't believe I'm an alien and that's the reason I think in a different way... but the required POSIX-like APIs would be better developed as external libraries on top of the asynchronous ones. You can't build synchronous APIs on top of asynchronous APIs without the mechanism this thread is specifically about. I've always been taught that you can implement one on top of the other... :-/ Obviously, asynchronous on top of synchronous is considerably easier than the other way around...
Re: Sync API for workers
You can only build a synchronous API on top of an asynchronous API if they are (a) running in separate threads/processes AND (b) the sync thread can synchronously poll (busy loop) for the progress/completion of the async thread. On Oct 12, 2013 1:23 AM, pira...@gmail.com pira...@gmail.com wrote: Synchronous APIs are easier to use since it's how things have been done since decades ago, No, they're easier to use because they fit the model of linear human thought more naturally. The idea that asynchronous APIs are just as good and easy as synchronous APIs, and that people only disagree because of lack of experience with asynchronous APIs, is mistaken. APIs must be designed around how programmers' minds actually work, not how you'd like them to work. I agree that APIs should have the least surprise factor, but are you sure that's the reason why we usually program in a synchronous, linear way? Because I've always thought it was a legacy of the batch-processing machines of the '60s... I don't believe event-oriented programming fits badly with how humans think, because nature itself is event-oriented (if this then that), and I don't believe I'm an alien and that's the reason I think in a different way... but the required POSIX-like APIs would be better developed as external libraries on top of the asynchronous ones. You can't build synchronous APIs on top of asynchronous APIs without the mechanism this thread is specifically about. I've always been taught that you can implement one on top of the other... :-/ Obviously, asynchronous on top of synchronous is considerably easier than the other way around...
Re: Sync API for workers
On Wed, Sep 5, 2012 at 7:03 PM, Jonas Sicking jo...@sicking.cc wrote: Hence I think something like the following would work: [Constructor] interface SyncMessageChannel { readonly attribute MessagePortSyncSide syncPort; readonly attribute MessagePortAsyncSide asyncPort; }; interface MessagePortSyncSide { void postMessage(any message, optional sequence<Transferable> transfer); any waitForMessage(); void close(); }; MessagePortSyncSide implements Transferable; interface MessagePortAsyncSide : EventTarget { void postMessage(any message, optional sequence<Transferable> transfer); void start(); void close(); // event handlers attribute EventHandler onmessage; }; MessagePortAsyncSide implements Transferable; Where there's the additional limitation that MessagePortSyncSide can only be transferred through MessagePortAsyncSide.postMessage() or Worker.postMessage(), and MessagePortAsyncSide can only be transferred through MessagePortSyncSide.postMessage() or DedicatedWorkerGlobalScope.postMessage(). We're planning on implementing this API soon in Gecko. The main use cases that we're looking to solve are: * Enable libraries to implement APIs that require functionality that is not yet available in workers but that is available on the main thread. There will likely be such functionality for a long time to come given that we're constantly adding new functionality to the web platform and implementations tend to add APIs to the main thread before they do so to workers. * Enable compiling code that was written for other platforms to the web. Specifically where such code uses synchronous APIs, but where we for good reasons have chosen not to expose synchronous counterparts in the web platform. The most obvious example here is synchronous filesystem access which is very commonly used in other platforms like POSIX and Windows. If people have other ideas for how to solve those use-cases, we're of course always open to other proposals. 
But please look through this thread first as there has been lots of good discussion. If people are not interested in solving these use-cases I'm always interested in that input too. / Jonas
Re: Sync API for workers
* Enable compiling code that was written for other platforms to the web. Specifically where such code uses synchronous APIs, but where we for good reasons have chosen not to expose synchronous counterparts in the web platform. The most obvious example here is synchronous filesystem access which is very commonly used in other platforms like posix and windows. Synchronous APIs are easier to use since it's how things have been done since decades ago, but I don't think they fit in an event-oriented environment like JavaScript, especially for something as time-consuming as filesystem access and I/O. I find it better to develop only asynchronous APIs for these use cases. It would make sense to use synchronous APIs to help port existing code, for example from C/C++ to JavaScript, but the required POSIX-like APIs would be better developed as external libraries on top of the asynchronous ones.
Re: Sync API for workers
pira...@gmail.com pira...@gmail.com, 2013-10-11 21:24 +0200: [Jonas said]: * Enable compiling code that was written for other platforms to the web. Specifically where such code uses synchronous APIs, but where we for good reasons have chosen not to expose synchronous counterparts in the web platform. The most obvious example here is synchronous filesystem access which is very commonly used in other platforms like posix and windows. Synchronous APIs are easier to use since it's how things have been done since decades ago, but I don't think they fit in an event-oriented environment like JavaScript, especially for something as time-consuming as filesystem access and I/O. I find it better to develop only asynchronous APIs for these use cases. It would make sense to use synchronous APIs to help port existing code, for example from C/C++ to JavaScript, but the required POSIX-like APIs would be better developed as external libraries on top of the asynchronous ones. When I see discussion of any new/recent synchronous APIs for the Web platform these days I pretty much take it they're implicitly intended just for use with Workers. So I assume that's the context Jonas intended. --Mike P.S. Of course there's room for disagreement about whether synchronous APIs are even a good idea even for the Workers case - http://infrequently.org/2013/05/the-case-against-synchronous-worker-apis-2/ -- Michael[tm] Smith http://people.w3.org/mike
Re: Sync API for workers
On Fri, Oct 11, 2013 at 2:24 PM, pira...@gmail.com pira...@gmail.com wrote: Synchronous APIs are easier to use since it's how things have been done since decades ago, No, they're easier to use because they fit the model of linear human thought more naturally. The idea that asynchronous APIs are just as good and easy as synchronous APIs, and that people only disagree because of lack of experience with asynchronous APIs, is mistaken. APIs must be designed around how programmers' minds actually work, not how you'd like them to work. but the required POSIX-like APIs would be better developed as external libraries on top of the asynchronous ones. You can't build synchronous APIs on top of asynchronous APIs without the mechanism this thread is specifically about. -- Glenn Maynard
Re: Sync API for workers
On Thu, Sep 6, 2012 at 7:18 PM, Glenn Maynard gl...@zewt.org wrote: On Thu, Sep 6, 2012 at 12:31 AM, Jonas Sicking jo...@sicking.cc wrote: That is certainly an interesting use case. I think another interesting use case is being able to write synchronous APIs in workers whose implementation uses APIs that are only available on the main thread. I understand the concept, but I'm having trouble coming up with useful examples. Can you give one? I have one: virtualizing the API of the main thread in a worker. In Treehouse [1], we sandbox untrusted JS in a worker, where we provide a virtual browser interface - including the DOM (using a hacked up fork of jsdom). This allows us to (1) restrict the guest code to a subset of the DOM, and (2) interpose on privileged operations. At present, we present a synchronous interface to the virtual DOM and then replicate changes back to the actual DOM in the main thread asynchronously. This works surprisingly well, but has some limitations. First, concurrent access to a given DOM node from more than one worker is a difficult problem, so we punt and require that a node may appear in at most one virtual DOM. Second, we do not know how to virtualize some synchronous methods and properties, such as window.prompt. I believe that a synchronous messaging API would allow us to overcome both of these issues. [1] Treehouse PDF: https://www.usenix.org/system/files/conference/atc12/atc12-final159.pdf -- Lon Ingram @lawnsea
Re: Sync API for workers
On Wed, Sep 5, 2012 at 11:02 PM, b...@pettay.fi b...@pettay.fi wrote: On 09/06/2012 08:31 AM, Jonas Sicking wrote: On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote: The problem with a Only allow blocking on children, except that window can't block on its children is that you can never block on a computation which is implemented in the main thread. I think that cuts out some major use cases since todays browsers have many APIs which are only implemented in the main thread. You can't have both--you have to choose one of 1: allow blocking upwards, 2: allow blocking downwards, or 3: allow deadlocks. (I believe #1 is more useful than #2, but each proposal can go both ways. I'm ignoring more complex deadlock detection algorithms that can allow both #1 and #2, of course, since that's a lot harder.) Indeed. But I believe #2 is more useful than #1. I wasn't proposing having both, I was proposing only doing #2. It's actually technically possible to allow both #1 and #2 without deadlock detection algorithms, but to keep things sane I'll leave that as out of scope for this thread. [snip] I think that's by far the most interesting category of use cases raised for this feature so far, the ability to implement sync APIs from async APIs (or several async APIs). That is certainly an interesting use case. I think another interesting use case is being able to write synchronous APIs in workers whose implementation uses APIs that are only available on the main thread. That's why I'm not interested in only blocking on children, but rather only blocking on parents. The fact that all the examples that people have used while we have been discussing synchronous messaging have spun event loops in attempts to deal with messages that couldn't be handled by the synchronous poller makes me very much think that so will web developers. getMessage doesn't spin the event loop. 
Spinning the event loop means that tasks are run from task queues (such as asynchronous callbacks) which might not be expecting to run, and that tasks might be run recursively; none of that happens here. All this does is block until a message is available on a specified port (or ports), and then returns it--it's just a blocking call, like sync XHR or FileReaderSync. The example from Olli's proposal 3 does what effectively amounts to spinning an event loop. It pulls out a bunch of events from the normal event loop and then manually dispatches them in a while loop. The behavior is exactly the same as spinning the event loop (except that non-message tasks don't get dispatched). It is just dispatching events. The problems we (Gecko) have had with event loop spinning in main thread relate mainly to the problems where unexpected events are dispatched while running the loop, as an example user input events or events coming from network. getMessage/waitForMessage does not have that problem. I'm not sure what you mean by just dispatching events. That's exactly what event loop spinning is. Why are the Gecko events any more unexpected than the message events that the example dispatches? The network events or user input events that we had are all events created by Gecko code. The messages that might get dispatched by the worker code can easily also be network events or user events which are sent to the worker for processing. Web pages can have just as much inconsistent state while deep in call stacks as we do. If they at that point call into a library which starts pulling messages off of the task queue and dispatches them, they'll run into the same problems as we've had. / Jonas
Re: Sync API for workers
On 09/06/2012 09:12 AM, Jonas Sicking wrote: On Wed, Sep 5, 2012 at 11:02 PM, b...@pettay.fi b...@pettay.fi wrote: On 09/06/2012 08:31 AM, Jonas Sicking wrote: On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote: The problem with a Only allow blocking on children, except that window can't block on its children is that you can never block on a computation which is implemented in the main thread. I think that cuts out some major use cases since todays browsers have many APIs which are only implemented in the main thread. You can't have both--you have to choose one of 1: allow blocking upwards, 2: allow blocking downwards, or 3: allow deadlocks. (I believe #1 is more useful than #2, but each proposal can go both ways. I'm ignoring more complex deadlock detection algorithms that can allow both #1 and #2, of course, since that's a lot harder.) Indeed. But I believe #2 is more useful than #1. I wasn't proposing having both, I was proposing only doing #2. It's actually technically possible to allow both #1 and #2 without deadlock detection algorithms, but to keep things sane I'll leave that as out of scope for this thread. [snip] I think that's by far the most interesting category of use cases raised for this feature so far, the ability to implement sync APIs from async APIs (or several async APIs). That is certainly an interesting use case. I think another interesting use case is being able to write synchronous APIs in workers whose implementation uses APIs that are only available on the main thread. That's why I'm not interested in only blocking on children, but rather only blocking on parents. The fact that all the examples that people have used while we have been discussing synchronous messaging have spun event loops in attempts to deal with messages that couldn't be handled by the synchronous poller makes me very much think that so will web developers. 
getMessage doesn't spin the event loop. Spinning the event loop means that tasks are run from task queues (such as asynchronous callbacks) which might not be expecting to run, and that tasks might be run recursively; none of that happens here. All this does is block until a message is available on a specified port (or ports), and then returns it--it's just a blocking call, like sync XHR or FileReaderSync. The example from Olli's proposal 3 does what effectively amounts to spinning an event loop. It pulls out a bunch of events from the normal event loop and then manually dispatches them in a while loop. The behavior is exactly the same as spinning the event loop (except that non-message tasks don't get dispatched). It is just dispatching events. The problems we (Gecko) have had with event loop spinning in main thread relate mainly to the problems where unexpected events are dispatched while running the loop, as an example user input events or events coming from network. getMessage/waitForMessage does not have that problem. I'm not sure what you mean by just dispatching events. That's exactly what event loop spinning is. No. The waitForMessage example I wrote down just dispatches DOM events in a loop. That is a synchronous operation and you know exactly which events you're about to dispatch. If you run the generic event loop, you also end up running timers and getting input from network and user etc. and you can't control those. Why are the Gecko events any more unexpected than the message events that the example dispatches? We don't want to block certain events in Gecko (like user input to chrome). Blocking events in worker code is ok. The network events or user input events that we had are all events created by Gecko code. The messages that might get dispatched by the worker code can easily also be network events or user events which are sent to the worker for processing. Web pages can have just as much inconsistent state while deep in call stacks as we do. 
That is true... If they at that point call into a library which starts pulling messages off of the task queue and dispatches them, they'll run into the same problems as we've had. ...but then it is up to the library to handle the case properly and dispatch events async. .Olli / Jonas
Re: Sync API for workers
On 09/06/2012 09:30 AM, Olli Pettay wrote: On 09/06/2012 09:12 AM, Jonas Sicking wrote: On Wed, Sep 5, 2012 at 11:02 PM, b...@pettay.fi b...@pettay.fi wrote: On 09/06/2012 08:31 AM, Jonas Sicking wrote: On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote: The problem with a Only allow blocking on children, except that window can't block on its children is that you can never block on a computation which is implemented in the main thread. I think that cuts out some major use cases since todays browsers have many APIs which are only implemented in the main thread. You can't have both--you have to choose one of 1: allow blocking upwards, 2: allow blocking downwards, or 3: allow deadlocks. (I believe #1 is more useful than #2, but each proposal can go both ways. I'm ignoring more complex deadlock detection algorithms that can allow both #1 and #2, of course, since that's a lot harder.) Indeed. But I believe #2 is more useful than #1. I wasn't proposing having both, I was proposing only doing #2. It's actually technically possible to allow both #1 and #2 without deadlock detection algorithms, but to keep things sane I'll leave that as out of scope for this thread. [snip] I think that's by far the most interesting category of use cases raised for this feature so far, the ability to implement sync APIs from async APIs (or several async APIs). That is certainly an interesting use case. I think another interesting use case is being able to write synchronous APIs in workers whose implementation uses APIs that are only available on the main thread. That's why I'm not interested in only blocking on children, but rather only blocking on parents. 
The fact that all the examples that people have used while we have been discussing synchronous messaging have spun event loops in attempts to deal with messages that couldn't be handled by the synchronous poller makes me very much think that so will web developers. getMessage doesn't spin the event loop. Spinning the event loop means that tasks are run from task queues (such as asynchronous callbacks) which might not be expecting to run, and that tasks might be run recursively; none of that happens here. All this does is block until a message is available on a specified port (or ports), and then returns it--it's just a blocking call, like sync XHR or FileReaderSync. The example from Olli's proposal 3 does what effectively amounts to spinning an event loop. It pulls out a bunch of events from the normal event loop and then manually dispatches them in a while loop. The behavior is exactly the same as spinning the event loop (except that non-message tasks don't get dispatched). It is just dispatching events. The problems we (Gecko) have had with event loop spinning in the main thread relate mainly to cases where unexpected events are dispatched while running the loop, for example user input events or events coming from the network. getMessage/waitForMessage does not have that problem. I'm not sure what you mean by "just dispatching events". That's exactly what event loop spinning is. No. The waitForMessage example I wrote down just dispatches DOM events in a loop. That is a synchronous operation and you know exactly which events you're about to dispatch. If you run the generic event loop, you also end up running timers and getting input from network and user etc., and you can't control those. Why are the Gecko events any more unexpected than the message events that the example dispatches? We don't want to block certain events in Gecko (like user input to chrome). Blocking events in worker code is ok. 
The network events or user input events that we had are all events created by gecko code. The messages that might get dispatches by the worker code can easily also be network events or user events which are sent to the worker for processing. Web pages can have just as much inconsistent state while deep in call stacks as we do. That is true... If they at that point call into a library which starts pulling messages off of the task queue and dispatches them, they'll run into the same problems as we've had. ...but then it is up to the library to handle the case properly and dispatch events async. Though, dispatching events async so that other new message events don't get handled before them would require some new API. .Olli / Jonas
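The ordering concern raised above can be illustrated with a toy queue. This is only a sketch: the array stands in for a worker's incoming message queue, and `waitForReply`, `redispatchAtBack` and `redispatchAtFront` are hypothetical helpers invented for the illustration, not part of any proposal or browser API. It shows why re-posting skipped messages "async" changes their order relative to messages that arrived later, and what an order-preserving alternative would have to do.

```javascript
// Toy model of a worker's incoming message queue (not a real browser API).
// waitForReply() is a hypothetical helper that drains the queue until it
// finds the message it is waiting for, collecting everything it skips.
function waitForReply(queue, isReply) {
  const skipped = [];
  while (queue.length > 0) {
    const msg = queue.shift();
    if (isReply(msg)) {
      return { reply: msg, skipped };
    }
    skipped.push(msg); // not the message we were waiting for
  }
  throw new Error("no reply in queue");
}

// Strategy 1: re-post skipped messages at the back of the queue
// (what "dispatch them async" amounts to with today's primitives).
function redispatchAtBack(queue, skipped) {
  queue.push(...skipped);
}

// Strategy 2: put skipped messages back at the front, preserving order.
// This is the behaviour that would need some new API.
function redispatchAtFront(queue, skipped) {
  queue.unshift(...skipped);
}

function demo(redispatch) {
  const queue = ["A", "B", "REPLY", "C"]; // arrival order
  const { reply, skipped } = waitForReply(queue, (m) => m === "REPLY");
  redispatch(queue, skipped);
  return { reply, order: queue.slice() };
}

const atBack = demo(redispatchAtBack);   // A and B now come after C
const atFront = demo(redispatchAtFront); // original order A, B, C kept
console.log(atBack.order.join(","), atFront.order.join(","));
```

With the first strategy, application code that assumed A and B would be handled before C sees them in a different order, which is the kind of breakage being discussed.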
Re: Sync API for workers
On Wed, Sep 5, 2012 at 11:56 PM, Olli Pettay olli.pet...@helsinki.fi wrote: On 09/06/2012 09:49 AM, Jonas Sicking wrote: On Wed, Sep 5, 2012 at 11:30 PM, Olli Pettay olli.pet...@helsinki.fi wrote: On 09/06/2012 09:12 AM, Jonas Sicking wrote: On Wed, Sep 5, 2012 at 11:02 PM, b...@pettay.fi b...@pettay.fi wrote: On 09/06/2012 08:31 AM, Jonas Sicking wrote: On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote: The problem with a Only allow blocking on children, except that window can't block on its children is that you can never block on a computation which is implemented in the main thread. I think that cuts out some major use cases since todays browsers have many APIs which are only implemented in the main thread. You can't have both--you have to choose one of 1: allow blocking upwards, 2: allow blocking downwards, or 3: allow deadlocks. (I believe #1 is more useful than #2, but each proposal can go both ways. I'm ignoring more complex deadlock detection algorithms that can allow both #1 and #2, of course, since that's a lot harder.) Indeed. But I believe #2 is more useful than #1. I wasn't proposing having both, I was proposing only doing #2. It's actually technically possible to allow both #1 and #2 without deadlock detection algorithms, but to keep things sane I'll leave that as out of scope for this thread. [snip] I think that's by far the most interesting category of use cases raised for this feature so far, the ability to implement sync APIs from async APIs (or several async APIs). That is certainly an interesting use case. I think another interesting use case is being able to write synchronous APIs in workers whose implementation uses APIs that are only available on the main thread. That's why I'm not interested in only blocking on children, but rather only blocking on parents. 
The fact that all the examples that people have used while we have been discussing synchronous messaging have spun event loops in attempts to deal with messages that couldn't be handled by the synchronous poller makes me very much think that so will web developers. getMessage doesn't spin the event loop. Spinning the event loop means that tasks are run from task queues (such as asynchronous callbacks) which might not be expecting to run, and that tasks might be run recursively; none of that happens here. All this does is block until a message is available on a specified port (or ports), and then returns it--it's just a blocking call, like sync XHR or FileReaderSync. The example from Olli's proposal 3 does what effectively amounts to spinning an event loop. It pulls out a bunch of events from the normal event loop and then manually dispatches them in a while loop. The behavior is exactly the same as spinning the event loop (except that non-message tasks don't get dispatched). It is just dispatching events. The problems we (Gecko) have had with event loop spinning in the main thread relate mainly to cases where unexpected events are dispatched while running the loop, for example user input events or events coming from the network. getMessage/waitForMessage does not have that problem. I'm not sure what you mean by "just dispatching events". That's exactly what event loop spinning is. No. The waitForMessage example I wrote down just dispatches DOM events in a loop. That is a synchronous operation and you know exactly which events you're about to dispatch. If you run the generic event loop, you also end up running timers and getting input from network and user etc., and you can't control those. Just because they are message events doesn't mean that you know exactly which events you're about to dispatch. 
That's basically equivalent to saying that it's safe to spin the event loop in Gecko as long as you only dispatch nsIRunnables that were dispatched from Gecko code, as opposed to native events from the native event loop. Note that messages can be sent to the worker in response to network and UI events on the main thread. Why are the gecko events any more unexpected than the message events that the example dispatches. We don't want to block certain events in Gecko (like user input to chrome). Blocking events in worker code is ok. I don't understand what you are saying here. If they at that point call into a library which starts pulling messages off of the task queue and dispatches them, they'll run into the same problems as we've had. ...but then it is up to the library to handle the case properly and dispatch events async. But if it dispatches them asynchronously, they have lost their place in the message queue. I.e. now they are placed after all other incoming message events. Such event reordering is likely to break application level logic. And like I said in my original email,
Re: Sync API for workers
Just to ping a detail, so it's not lost in history: it should also be possible to peek at a On Thu, Sep 6, 2012 at 12:31 AM, Jonas Sicking jo...@sicking.cc wrote: 1: allow blocking upwards, 2: allow blocking downwards, Indeed. But I believe #2 is more useful than #1. I wasn't proposing having both, I was proposing only doing #2. OK. I think I disagree, but this is orthogonal to which API approach is used, since it's easy to take every proposal and flip it either way. That is certainly an interesting use case. I think another interesting use case is being able to write synchronous APIs in workers whose implementation uses APIs that are only available on the main thread. I understand the concept, but I'm having trouble coming up with useful examples. Can you give one? The only case that comes to mind is blocking for user input; for example, requesting the UI thread to ask the user his name, and then waiting for the response. (That might be useful, but I think it's far less useful than generic sync worker APIs.) It sounds like you're talking about something like "construct a DOM tree, do some stuff to it and return a result", but I can't think of a useful example for that. The only DOM API I've *really* wanted in workers is e.g. HTMLImageElement, as part of getting WebGL into workers, but this wouldn't help there. The example from Olli's proposal 3 does what effectively amounts to spinning an event loop. It pulls out a bunch of events from the normal event loop and then manually dispatches them in a while loop. The behavior is exactly the same as spinning the event loop (except that non-message tasks don't get dispatched). I don't like it either, but it was only needed in his proposal because it didn't support MessagePorts, so he had to do his own ad hoc filtering on a single port. 
Claiming that we don't need to explain all edge cases to authors and just give them a simplified version would, I think, be ignoring the complexity of software that people write using the web platform. Most of the complexity of the algorithm is in the mechanics of implementation, rather than its effects. Users don't have to know that it's the receiving thread handling the flag, or that an internal message is sent to clear the flag on the other side of the channel; these are algorithmic details. So it sounds like you are ok with not permitting the using both synchronous and asynchronous messages to the same port then? As long as some ports allow synchronous messages and others allow asynchronous messages. Leaving aside the issue of how and when it is determined that a port is sync vs. async. I don't really like it or think it's necessary, but it doesn't seem crippling and I'd live with it if needed to come to a resolution. It does lead to making people jump some extra hoops, though. The original use case that led me to this in the first place was an autocomplete worker. The worker receives an ordinary message with the user's typed text, eg. {text: intern}, and starts searching. The search may take longer than it takes for the user to type the next letter, and I wanted to be able to immediately stop the search if another message comes along, so I can restart with the new text. The simplest way to do that is to occasionally poll for a new message (zero timeout), and when {text: interne} comes along, restart the search. This isn't impossible with this restriction, but you do have to jump a few extra hoops. It'd need a separate cancel port which can be accessed synchronously; when a message shows up on it, cancel the search so the next {text} message (on a regular port) can be received. Actually, I don't think that would work, since the order of messages across ports is unspecified... 
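The single-port autocomplete case described above might look roughly like the following. This is a minimal sketch under stated assumptions: `pollMessage` is a hypothetical stand-in for a zero-timeout, non-blocking receive (no such method exists on MessagePort), a plain array plays the role of the port, and the "search" is a simple prefix match over a word list. `runSearch` and the `chunk` parameter are likewise invented for the illustration.

```javascript
// Hypothetical non-blocking poll (a zero-timeout getMessage).
// A plain array plays the port; returns the next message or null.
function pollMessage(port) {
  return port.length > 0 ? port.shift() : null;
}

// Search `words` for entries starting with the query text, polling the
// port every `chunk` items. If a newer {text} message has arrived, the
// partial results are dropped and the search restarts with the new text.
function runSearch(words, firstText, port, chunk = 2) {
  let text = firstText;
  let restarts = 0;
  restart: while (true) {
    const matches = [];
    for (let i = 0; i < words.length; i++) {
      if (i > 0 && i % chunk === 0) {
        const msg = pollMessage(port);
        if (msg !== null) {
          text = msg.text; // user typed another letter
          restarts++;
          continue restart; // abandon this pass, start over
        }
      }
      if (words[i].startsWith(text)) matches.push(words[i]);
    }
    return { text, matches, restarts };
  }
}

// The worker starts on "intern"; "interne" is already waiting in the port.
const words = ["internal", "internet", "interval", "internship"];
const port = [{ text: "interne" }];
const result = runSearch(words, "intern", port);
console.log(result.text, result.matches, result.restarts);
```

Because the poll reads from the same port that carries the ordinary {text} messages, there is no cross-port ordering question in this variant.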
An aside: it should always be possible to poll a message port, regardless of whether you're a parent or child. (It doesn't cause deadlocks since it's nonblocking, and it allows the above scenario even if blocking is only allowed from the parent side.) Unless you structure your code such that it's the responsibility of the consumer of the API to create the channel. That way the consumer of the API can choose if it wants to use blocking or non-blocking. But then you still end up with a port which is heavily restricted in where and how it can be passed around. If you want to pass it to your great-great-grandfather thread, you have to post it to your parent, who posts it to his parent, who posts it to his parent. Posting it directly there isn't possible. Passing it to siblings or uncles isn't possible at all. The same is true on the worker side; if it wants to pass its port to its grandchild, it has to pass it to its child, who passes it again to its child. You have to carefully structure your messaging to always do this. And do note that even your proposal requires the async side of a message channel to be aware that the other side might be using synchronous polling. If it wants to support the
Re: Sync API for workers
On Mon, Sep 3, 2012 at 8:55 PM, Glenn Maynard gl...@zewt.org wrote: On Mon, Sep 3, 2012 at 9:30 PM, Jonas Sicking jo...@sicking.cc wrote: We can't generically block on children since we can't let the main window block on a child. That would effectively permit synchronous IO from the main thread which is not something that we want to allow. The UI thread would never be allowed to block, of course. The getMessage API itself would never even be exposed in the UI thread, regardless of the state of this flag. The problem with a Only allow blocking on children, except that window can't block on its children is that you can never block on a computation which is implemented in the main thread. I think that cuts out some major use cases since todays browsers have many APIs which are only implemented in the main thread. APIs only existing on the main thread is likely going to always be the case, even once browsers get better at implementing APIs in the main thread and worker threads at the same time, since there will always be some APIs that are main-thread only, like the DOM. Your proposal makes it possible for pages to avoid the problems described in my email by setting up a separate channel used for synchronous messages. But some of the problems still remain. As soon as a message channel is used for both synchronous and asynchronous messages you can easily get into trouble. If someone calls the blocking waitForMessage() function and receive a message which was intended to be delivered asynchronously there is no good recourse. Basically any time that happens there are only bad options available, many of which have subtle problems that only happen intermittently like the ones I described in my initial email. Since that is the case, I think the best solution is to always force separate channels to be used for synchronous and asynchronous messages. 
If you have messages that must be received synchronously, and other messages that must be received asynchronously, then that's precisely a time when you'd want to use MessagePorts to separate them. That's what they're for. It's the same as using separate MessagePorts when you have two unrelated libraries receiving their own messages, so each library only sees messages intended for it. I agree that APIs that encourage people to write brittle code should be squinted at carefully, and we should definitely examine all APIs for that problem, but really I don't think it's the case here. You are more optimistic than I am. The fact that all the examples that people have used while we have been discussing synchronous messaging have spun event loops in attempts to deal with messages that couldn't be handled by the synchronous poller makes me very much think that so will web developers. And the fact that using the existing communication channel is much simpler than manually setting up a new one using |new MessageChannel| and transferring ports using postMessage further adds to that. So I would say that there's a high degree of risk that people will get this wrong. Importantly, the sending side doesn't have to know whether the receiving side is using a sync API to receive it or not--in other words, that information doesn't have to be part of the user's messaging protocol. I agree that this is a desirable trait. But so far I think the risks in encouraging event loop spinning outweigh the benefits. I also wonder if what you are describing doesn't make more sense when communicating with a child worker and blocking on receiving a response from it. Another trait that this loses is the ability to terminate a worker as soon as we know that a synchronous response can't be sent. I.e. in proposals 1 and 2 the implementation can terminate the worker as soon as the object with the .reply() function is GCed. 
Note that this doesn't expose any GC behavior since a forever blocked worker behaves exactly the same as a terminated worker. I.e. neither will ever execute any code. All in all this is a much more complicated setup though. I think it'd be worth keeping the simpler API like the 1 or 2 proposals even if we do introduce SyncMessageChannel since that likely covers the majority of use cases. Those proposals seem much more complex to me. You can't send a message that will be received synchronously unless the other side prompts you for one first; you have to care whether the other side is acting synchronously or asynchronously. It's a bunch of new concepts (synchronous messages, message replies), instead of a simple (to users, at least) addition to MessagePorts. Fewer APIs isn't the same thing as a simpler API. On the contrary, I think trying to fit too much functionality into the same set of functions can easily result in more complexity. I think this is fairly well illustrated by the set of rules that you ended up having to set up in order to make the blocking permitted flags work out correctly. And your algorithm produces
Re: Sync API for workers
On Wed, Sep 5, 2012 at 7:03 PM, Jonas Sicking jo...@sicking.cc wrote: [Constructor] interface MessageChannel { readonly attribute MessagePortSyncSide syncPort; readonly attribute MessagePortAsyncSide asyncPort; }; This should of course say SyncMessageChannel. / Jonas
Re: Sync API for workers
On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote: The problem with an "Only allow blocking on children, except that window can't block on its children" rule is that you can never block on a computation which is implemented in the main thread. I think that cuts out some major use cases since today's browsers have many APIs which are only implemented in the main thread. You can't have both--you have to choose one of 1: allow blocking upwards, 2: allow blocking downwards, or 3: allow deadlocks. (I believe #1 is more useful than #2, but each proposal can go both ways. I'm ignoring more complex deadlock detection algorithms that can allow both #1 and #2, of course, since that's a lot harder.) The fact that all the examples that people have used while we have been discussing synchronous messaging have spun event loops in attempts to deal with messages that couldn't be handled by the synchronous poller makes me very much think that so will web developers. getMessage doesn't spin the event loop. Spinning the event loop means that tasks are run from task queues (such as asynchronous callbacks) which might not be expecting to run, and that tasks might be run recursively; none of that happens here. All this does is block until a message is available on a specified port (or ports), and then returns it--it's just a blocking call, like sync XHR or FileReaderSync. I also wonder if what you are describing doesn't make more sense when communicating with a child worker and blocking on receiving a response from it. That's what I meant, based on a use case you brought up: a library which implements the synchronous IndexedDB API. I think that's by far the most interesting category of use cases raised for this feature so far, the ability to implement sync APIs from async APIs (or several async APIs). Another trait that this loses is the ability to terminate a worker as soon as we know that a synchronous response can't be sent. I.e. 
in proposal 1 and 2 the implementation can terminate the worker as soon as the object with the .reply() function is GCed. Note that this doesn't expose any GC behavior since a forever blocked worker behaves exactly the same as a terminated worker. I.e. neither will ever execute any code. It's the same: terminate when the other MessagePort is GC'd or its port.close is called. (The MessagePort cross-process GC issues might sometimes prevent that, but that's just another instance of the issue that already exists. By the way, do you happen to remember where that issue was last discussed in detail? I'd like to refresh my memory on the details of this problem.) Fewer APIs isn't the same thing as a simpler API. On the contrary, I think trying to fit too much functionality into the same set of functions can easily result in more complexity. Sure, but I do think they're the same in this case. I think this is fairly well illustrated by the set of rules that you ended up having to set up in order to make the blocking permitted flags work out correctly. Explaining this to users is simple: if you want to block on a port, it needs to only ever be transferred above its other side, not below. And your algorithm produces weird edge cases, such as that it matters if someone sets up a message proxy which forwards all messages from one channel to another, rather than just passes on one end of a channel. With such a proxy your ports end up touching more threads and so are more likely to clear the blocking permitted flag. In all other cases such a proxy is transparent. The previous proposals allow nothing *but* blocking on your parent (or child), so if you have threads A - B - C and you want to pass messages from C to A, you *have* to proxy messages across (and keep thread B alive forever as a result). That's a big part of why we have MessagePorts to begin with. That aside, the iteration below removes most of the cases where you can't block, eg. 
passing a port up to your parent and then down to a sibling. This is a bit wordier, but I think it's easier to understand, since all it cares about is where the ports was originally created. - Add a direction flag to ports, which may have the values up, down, disallowed and initial, and is initially initial. - Add an original thread value to ports, which is set to the current thread at MessageChannel creation time. This value is preserved across structured clone. - When a thread receives a port, compare its original thread with the current thread. - If the root thread of original thread is not on the same as that of the current thread, mark the port as disallowed. - Otherwise, if original thread is the current thread, mark the port as initial. - Otherwise, if original thread is a descendant of the current thread, mark the port as up. - Otherwise, mark the port as down. The UI thread and shared workers are root threads. The root thread of a dedicated worker is its one ancestor thread which
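The classification steps listed above can be sketched as a small function. This is only an illustration of the rules as written, under two assumptions: threads are modelled as plain `{ name, parent }` records, and, because the definition of a dedicated worker's root thread is cut off in this archive, a root is taken to be any thread whose `parent` is `null` (so the UI thread and shared workers are created with `parent: null`). `rootOf`, `isDescendant` and `classifyPort` are hypothetical names.

```javascript
// Threads are modelled as { name, parent } records; parent is null for
// root threads (assumption: UI thread and shared workers have no parent).
function rootOf(thread) {
  let t = thread;
  while (t.parent !== null) t = t.parent;
  return t;
}

// True if `candidate` sits somewhere below `ancestor` in the thread tree.
function isDescendant(candidate, ancestor) {
  for (let t = candidate.parent; t !== null; t = t.parent) {
    if (t === ancestor) return true;
  }
  return false;
}

// Direction flag a port gets when thread `current` receives it, given the
// thread on which its MessageChannel was originally created.
function classifyPort(originalThread, current) {
  if (rootOf(originalThread) !== rootOf(current)) return "disallowed";
  if (originalThread === current) return "initial";
  if (isDescendant(originalThread, current)) return "up";
  return "down";
}

// Example tree: ui -> workerA -> workerB, ui -> sibling, plus a shared worker.
const ui = { name: "ui", parent: null };
const workerA = { name: "workerA", parent: ui };
const workerB = { name: "workerB", parent: workerA };
const sibling = { name: "sibling", parent: ui };
const shared = { name: "shared", parent: null };

console.log(classifyPort(workerB, workerA)); // port created in B, received by A
console.log(classifyPort(workerA, workerB)); // port created in A, received by B
console.log(classifyPort(workerA, sibling)); // passed across to a sibling
console.log(classifyPort(workerA, shared));  // different root thread
```

Note that the flag depends only on where the port was created and where it is now, not on the route it took, which is the point of this iteration of the rules.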
Re: Sync API for workers
On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote: The problem with a Only allow blocking on children, except that window can't block on its children is that you can never block on a computation which is implemented in the main thread. I think that cuts out some major use cases since todays browsers have many APIs which are only implemented in the main thread. You can't have both--you have to choose one of 1: allow blocking upwards, 2: allow blocking downwards, or 3: allow deadlocks. (I believe #1 is more useful than #2, but each proposal can go both ways. I'm ignoring more complex deadlock detection algorithms that can allow both #1 and #2, of course, since that's a lot harder.) Indeed. But I believe #2 is more useful than #1. I wasn't proposing having both, I was proposing only doing #2. It's actually technically possible to allow both #1 and #2 without deadlock detection algorithms, but to keep things sane I'll leave that as out of scope for this thread. [snip] I think that's by far the most interesting category of use cases raised for this feature so far, the ability to implement sync APIs from async APIs (or several async APIs). That is certainly an interesting use case. I think another interesting use case is being able to write synchronous APIs in workers whose implementation uses APIs that are only available on the main thread. That's why I'm not interested in only blocking on children, but rather only blocking on parents. The fact that all the examples that people have used while we have been discussing synchronous messaging have spun event loops in attempts to deal with messages that couldn't be handled by the synchronous poller makes me very much think that so will web developers. getMessage doesn't spin the event loop. 
Spinning the event loop means that tasks are run from task queues (such as asynchronous callbacks) which might not be expecting to run, and that tasks might be run recursively; none of that happens here. All this does is block until a message is available on a specified port (or ports), and then returns it--it's just a blocking call, like sync XHR or FileReaderSync. The example from Olli's proposal 3 does what effectively amounts to spinning an event loop. It pulls out a bunch of events from the normal event loop and then manually dispatches them in a while loop. The behavior is exactly the same as spinning the event loop (except that non-message tasks don't get dispatched). I think this is fairly well illustrated by the set of rules that you ended up having to set up in order to make the blocking permitted flags work out correctly. Explaining this to users is simple: if you want to block on a port, it needs to only ever be transferred above its other side, not below. Claiming that we don't need to explain all edge cases to authors and just give them a simplified version would, I think, be ignoring the complexity of software that people write using the web platform. On Wed, Sep 5, 2012 at 9:03 PM, Jonas Sicking jo...@sicking.cc wrote: The part that I dislike about having a single channel used for both sync and async messaging is that you end up with one or more async listeners which expect to get notified about all incoming messages, but then you have an API which steals a message away from those listeners. On top of that it has to do that stealing without any way of ensuring that it actually steals the right message. That's exactly the reason to use MessagePorts: to categorize messages. So it sounds like you are ok with not permitting the use of both synchronous and asynchronous messages on the same port then? As long as some ports allow synchronous messages and others allow asynchronous messages. 
Leaving aside the issue of how and when it is determined that a port is sync vs. async. But being (mostly) agnostic as to whether the other side is using sync messages or not doesn't mean that the other side uses both sync and async messaging! But you still have to do extra work on the sending side to support both sync and async receiving, since you have to hand it the right type of channel. Unless you structure your code such that it's the responsibility of the consumer of the API to create the channel. That way the consumer of the API can choose if it wants to use blocking or non-blocking. Couldn't we just make calling getMessage permanently disable .onmessage dispatching (perhaps until the port is posted again)? That would make it very hard to accidentally use both, while encapsulating knowledge about which way it's being used to the receiver, so the sender doesn't need to carefully send the receiver the right type of MessageChannel. (I really don't feel this is necessary, but I'd prefer it to multiple MessagePort interfaces.) We could define that calling .start() sets the port in async mode, at which point
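The suggestion that calling getMessage could permanently disable .onmessage dispatching can be modelled with a toy port. This is a sketch only: `ToyPort`, `deliver` and `syncMode` are invented for the illustration, the real MessagePort has no `getMessage`, and a real synchronous receive would block rather than return `null` on an empty queue.

```javascript
// Toy port, not the real MessagePort. getMessage() stands in for the
// hypothetical synchronous receive discussed in this thread.
class ToyPort {
  constructor() {
    this.queue = [];
    this.onmessage = null;
    this.syncMode = false;
  }

  // Called by the sending side of the toy channel.
  deliver(data) {
    if (!this.syncMode && typeof this.onmessage === "function") {
      this.onmessage({ data }); // normal async-style dispatch
    } else {
      this.queue.push(data); // held for a later getMessage()
    }
  }

  // The first call flips the port into sync mode for good, so .onmessage
  // never fires again on this port. Returns null if nothing is queued
  // (a real blocking call would wait instead).
  getMessage() {
    this.syncMode = true;
    return this.queue.length > 0 ? this.queue.shift() : null;
  }
}

const seen = [];
const toyPort = new ToyPort();
toyPort.onmessage = (event) => seen.push(event.data);

toyPort.deliver("a");                 // dispatched to onmessage
const first = toyPort.getMessage();   // nothing queued; port is now sync-only
toyPort.deliver("b");                 // queued, handler not called
const second = toyPort.getMessage();  // receives "b"
console.log(seen, first, second);
```

In this model the receiver alone decides how the port is used, and the sender calls `deliver` the same way in both cases.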
Re: Sync API for workers
Hi, Before anything else, thanks for this detailed and quite complete explanation. On 03/09/2012 23:32, Jonas Sicking wrote: The other thing that I wanted to talk about is use cases. It has been claimed in this thread that synchronous message passing isn't needed and that people can just write code using async patterns. While this is absolutely true, I would absolutely say that writing asynchronous code is dramatically more complicated than writing synchronous code. I acknowledge that writing async code may be hard for some with JavaScript-as-it-is. The solution I have found personally to this issue is using promises, which make async code look sync (+some noise due to JS-the-language). Others think promises are a good solution [1]. Dave Herman has created task.js [2] to solve the same problem differently (promises+generators). Some have created compile-to-JS languages (like Roy as I said in a previous message) to solve the problem in yet another way. Others find other solutions. All these solutions have in common that they reduce the complexity of writing/reading async code while keeping its benefits, at the cost of a small layer of code. The proposed solution here throws away all benefits of async code to reduce the complexity of writing async code by... writing sync code. I wish we'd explore more solutions to make async more workable rather than throwing away async. The problem with blocking workers is that it may create a culture of creating ever more blocking workers (like Apache creates more and more threads to handle more blocking connections). I understand the benefit that may come with a sync API, but we all know that APIs aren't always used the way they were primarily intended, sometimes with very bad consequences. I'm afraid the bad consequences of misusing a sync API haven't been explored. Maybe soon you'll have people filing bugs complaining about poor Firefox memory usage when using a lot of workers. This is one of the big reasons that we have workers at all. 
I had never heard this argument before the topic of a sync messaging API for workers. Where does it come from? When I read the worker API, I see a way to create a new computation unit and to send messages back and forth, nothing about writing sync code. Regardless of goal, do people actually write more sync code with workers? Taras Glek seems to think that the local storage API (which is sync) is not a good fit for workers [3]: "We could expose this API to workers, but then we run into an ethical question of bringing crappy APIs to new environments." (The article mentions that the localStorage API being synchronous is part of what makes it crappy.) There is also another use-case which has been brought up. As the web platform is becoming more powerful, people have started converting code written for other platforms to javascript+html. By html, here, do you mean something other than canvas? Is there something that compiles any Windows/Mac/Linux UI framework into HTML5? For example Emscripten[1] and Mandreel[2] allow recompiling C++ code to javascript which is then run in a web browser. I've been loosely following this topic and all examples I've seen were either about pure computation (like turning a C GZIP library into JS) or graphics stuff (using canvas, hence my above question). Many times such code relies on APIs or libraries which contain blocking calls. Do you have an example of that? I haven't seen one so far, but that's an interesting point. Specifically, I can't recall having seen any C/C++-to-JS examples that were doing IO (maybe files, but not network). Last I heard, Emscripten compiles to JS from LLVM bytecode. I'm not sure they rely on any library containing blocking calls. But as I said, I have been following that loosely. Technically it might be possible to automatically rewrite such code to use asynchronous coding patterns, but so far I don't think anyone has managed to do that. 
Naively, I would say that once we've paid the price to compile code from one language to another, you're not that far off from compiling to a given coding pattern. Especially compiling from LLVM bytecode. Regardless, has anyone tried at all? Has anyone tried to compile to JS+Promises or JS+task.js? All of the compile-to-the-web movement started recently. Only now do we start seeing what it can do. It's just the beginning. Also, I wish the demand came from the people who do work on Emscripten or Mandreel, that they came to a standards mailing list saying "it took us a billion hours to compile I/O to JS correctly, we tried promises, task.js and it didn't help. It would have taken 10 seconds if we had a sync API." But I can't recall having read such a message yet. Compile-to-the-web is a complicated field and I wish we didn't try to guess for them what they need before they ask. David [1] http://jeditoolkit.com/2012/04/26/code-logic-not-mechanics.html#post [2] http://taskjs.org/ [3] (see before-last
Re: Sync API for workers
On 9/4/12 5:23 AM, David Bruant wrote: Also, I wish the demand came from the people who do work on Emscripten or Mandreel As far as I know, what Jonas is saying about Emscripten did come from the Emscripten folks. I've certainly seen them say it in bugs in the recent past. No guessing involved on our part there. But I can't recall having read such a message yet. Maybe it's because the barrier to posting on standards list is pretty high? Starting with the requirement to subscribe to a high-traffic list. Some people may decide they don't have time to deal with that. -Boris
Re: Sync API for workers
On 04/09/2012 14:34, Boris Zbarsky wrote: On 9/4/12 5:23 AM, David Bruant wrote: Also, I wish the demand came from the people who do work on Emscripten or Mandreel As far as I know, what Jonas is saying about Emscripten did come from the Emscripten folks. I've certainly seen them say it in bugs in the recent past. No guessing involved on our part there. Ok, I wasn't aware of that. Do you have bug numbers in mind by any chance? That doesn't change the fact, however, that one could try using promises or task.js to generate sync-like code more easily. David
Re: Sync API for workers
On 9/4/12 8:54 AM, David Bruant wrote: Ok I wasn't aware of that. Do you have bug numbers in mind by any chance? I don't offhand, unfortunately. Would have to search. -Boris
Re: Sync API for workers
On Tue, Sep 4, 2012 at 4:23 AM, David Bruant bruan...@gmail.com wrote: The proposed solution here throws away all benefits of async code to reduce the complexity of writing async code by... writing sync code. I wish we'd explore more solutions to make async more workable rather than throwing away async. It seems like you're thinking of asynchronous code as fundamentally better than synchronous code. It's not; it has a set of advantages--ones that the Web needs badly for the UI thread, in order for scripts and the browser to coexist. It also has a set of serious disadvantages. We're not throwing away async; we're bringing sync back into the game where it's appropriate. The problem with blocking workers is that it may create a culture of creating always more and more blocking workers (like Apache creates more and more threads to handle more blocking connections). You're not talking about this particular API here, you're talking about every sync API in workers. Having sync APIs in workers and performing blocking tasks in workers isn't something new. This is one of the big reasons that we have workers at all. I had never heard this argument before the topic of sync messaging API for workers. Where does it come from? When I read the worker API, I see a way to create a new computation unit and to send messages back and forth, nothing about writing sync code. Regardless of goal, do people actually write more sync code with workers? The very first example in the spec is doing work synchronously. http://www.whatwg.org/specs/web-apps/current-work/#a-background-number-crunching-worker Taras Glek seems to think that the local storage API (which is sync) is not a good fit for workers [3]: We could expose this API to workers, but then we run into an ethical question of bringing crappy APIs to new environments. 
(the article mentions that the localStorage API is synchronous as part of the crappy aspect of it) (The synchronous part is bad for the UI thread, but not a problem in workers, so this isn't a very good argument, at least as summarized here.) Many times such code relies on APIs or libraries which contain blocking calls. Do you have an example of that? I haven't seen one so far, but that's an interesting point. Specifically, I can't recall having seen any C/C++-JS example that were doing IO (maybe files, but not network), I believe that was his point--it's very hard to programmatically convert synchronous code to asynchronous code. Technically it might be possible to automatically rewrite such code to use asynchronous coding patterns, but so far I don't think anyone has managed to do that. Naively, I would say, that once we've paid the price to compile code from one language to another, you're not that far off from compiling to a given coding pattern. Especially compiling from LLVM bytecode. Regardless, has anyone tried at all? Has anyone tried to compile to JS+Promises or JS+task.js? All of the compile-to-the-web movement started recently. Only now do we start seeing what it can do. it's just the beginning. Also, I wish the demand came from the people who do work on Emscripten or Mandreel, that they came to standards mailing-list saying it took us a billion hours to compile I/O to JS correctly, we tried promises, task.js and it didn't help. It would have taken 10 seconds if we had a sync API. But I can't recall having read such a message yet. Compile-to-the-web is a complicated field and I wish we didn't try to guess for them what they need before they ask. I don't want to use a heavily layered environment that compiles to JavaScript in order to write linear code. I want the Web platform to be robust on its own, without complicated systems piled up on top of it. Expecting that to be a solution is just throwing in the hat and giving up. 
It's the "you don't need a native API for that, just use a library" argument notched up several orders of magnitude. -- Glenn Maynard
Re: Sync API for workers
On 04/09/2012 17:03, Glenn Maynard wrote: On Tue, Sep 4, 2012 at 4:23 AM, David Bruant bruan...@gmail.com wrote: The proposed solution here throws away all benefits of async code to reduce the complexity of writing async code by... writing sync code. I wish we'd explore more solutions to make async more workable rather than throwing away async. It seems like you're thinking of asynchronous code as fundamentally better than synchronous code. It's not; it has a set of advantages--ones that the Web needs badly for the UI thread, in order for scripts and the browser to coexist. It also has a set of serious disadvantages. Cognitive load is the only one mentioned so far. It is a serious issue since for the foreseeable future, only human beings will be writing code. However, as said, there are solutions to reduce this load. I wish to share an experience. Back in April, I gave a JavaScript/jQuery training to people who knew programming, but didn't know JavaScript. I made the decision to teach promises right away (jQuery has them built-in, so that's easy). It seems that it helped a lot in understanding async programming. The cognitive load has its solutions. We're not throwing away async; we're bringing sync back into the game where it's appropriate. True. I was exaggerating a bit here :-) The problem with blocking workers is that it may create a culture of creating always more and more blocking workers (like Apache creates more and more threads to handle more blocking connections). You're not talking about this particular API here, you're talking about every sync API in workers. Having sync APIs in workers and performing blocking tasks in workers isn't something new. This is one of the big reasons that we have workers at all. I had never heard this argument before the topic of sync messaging API for workers. Where does it come from? 
When I read the worker API, I see a way to create a new computation unit and to send messages back and forth, nothing about writing sync code. Regardless of goal, do people actually write more sync code with workers? The very first example in the spec is doing work synchronously. http://www.whatwg.org/specs/web-apps/current-work/#a-background-number-crunching-worker This is a very interesting example and I realize that I have used blocking and sync interchangeably by mistake. I'm against blocking, but not sync. What I'm fundamentally (to answer what you said above) against is the idea of blocking a computation unit (like a worker) that does nothing but idly waits (for IO or a message for instance). It seems that proposals so far make the worker wait for a message and do nothing meanwhile and that's a pure waste of resources. A worker has been paid for (memory, init time...) and it's waiting while it could be doing other things. The current JS event loop run-to-completion model prevents that waste by design. Taras Glek seems to think that the local storage API (which is sync) is not a good fit for workers [3]: We could expose this API to workers, but then we run into an ethical question of bringing crappy APIs to new environments. (the article mentions that the localStorage API is synchronous as part of the crappy aspect of it) (The synchronous part is bad for the UI thread, but not a problem in workers, so this isn't a very good argument, at least as summarized here.) Many times such code relies on APIs or libraries which contain blocking calls. Do you have an example of that? I haven't seen one so far, but that's an interesting point. Specifically, I can't recall having seen any C/C++-JS example that were doing IO (maybe files, but not network), I believe that was his point--it's very hard to programmatically convert synchronous code to asynchronous code. 
True, but I mean, we could have read intentions or blog posts of people saying it's way too hard Technically it might be possible to automatically rewrite such code to use asynchronous coding patterns, but so far I don't think anyone has managed to do that. Naively, I would say, that once we've paid the price to compile code from one language to another, you're not that far off from compiling to a given coding pattern. Especially compiling from LLVM bytecode. Regardless, has anyone tried at all? Has anyone tried to compile to JS+Promises or JS+task.js? All of the compile-to-the-web movement started recently. Only now do we start seeing what it can do. it's just the beginning. Also, I wish the demand came from the people who do work on Emscripten or Mandreel, that they came to standards mailing-list saying it took us a billion hours to compile I/O to JS correctly, we tried promises, task.js and it didn't help. It would have taken 10 seconds if we had a sync API.
Re: Sync API for workers
On Tue, Sep 4, 2012 at 10:32 AM, David Bruant bruan...@gmail.com wrote: Cognitive load is the only one mentioned so far. It is a serious issue since for the foreseeable future, only human beings will be writing code. However, as said, there are solutions to reduce this load. I wish to share an experience. Back in April, I gave a JavaScript/jQuery training to people who knew programming, but didn't know JavaScript. I made the decision to teach promises right away (jQuery has them built-in, so that's easy). It seems that it helped a lot in understanding async programming. The cognitive load has its solutions. (Understanding asynchronous programming isn't really the issue. I'm sure everyone in this discussion has an intuitive grasp of that.) Those are attempts at making asynchronous code easier to write; they're not substitutes for synchronous code. They still result in code with less understandable, well-scoped state. This is a very interesting example and I realize that I have used blocking and sync interchangeably by mistake. I'm against blocking, but not sync. What I'm fundamentally (to answer what you said above) against is the idea of blocking a computation unit (like a worker) that does nothing but idly waits (for IO or a message for instance). It seems that proposals so far make the worker wait for a message and do nothing meanwhile and that's a pure waste of resources. A worker has been paid for (memory, init time...) and it's waiting while it could be doing other things. The current JS event loop run-to-completion model prevents that waste by design. Workers broke away from requiring the "do a bit of work, then keep returning to the event loop" model of the UI thread from the start. This is no different than the APIs we already have. 
To take an earlier example:

    var worker = createDictionaryWorker();
    worker.postMessage("elephant");
    var definition = getMessage(worker); // wait for the answer

This is no different than a sync XHR or IndexedDB call to do the same thing:

    var xhr = new XMLHttpRequest();
    xhr.open("GET", "/dictionary?elephant", false); // sync
    xhr.send();
    var definition = xhr.responseText;

It simply allows workers, not just native code, to implement these APIs. That's a natural step. -- Glenn Maynard
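For comparison, the asynchronous counterpart of the dictionary lookup above could be sketched roughly as follows. This is only an illustration under stated assumptions: makeFakeDictionaryWorker, lookUp and the hand-rolled task queue are hypothetical stand-ins so the snippet runs without a real Worker, and the proposed getMessage is deliberately not used since no engine implements it.

```javascript
// Hypothetical, hand-rolled task queue standing in for the event loop.
const tasks = [];
function runEventLoop() {
  while (tasks.length > 0) tasks.shift()();
}

// Hypothetical stand-in for a dictionary worker: replies are queued as tasks
// and only delivered once the (simulated) event loop runs them.
function makeFakeDictionaryWorker(dictionary) {
  const worker = {
    onmessage: null,
    postMessage(word) {
      tasks.push(() => {
        if (worker.onmessage) worker.onmessage({ data: dictionary[word] });
      });
    },
  };
  return worker;
}

// Asynchronous style: the caller returns immediately and the rest of the
// logic has to live in a callback.
function lookUp(worker, word, callback) {
  worker.onmessage = (event) => callback(event.data);
  worker.postMessage(word);
}

const worker = makeFakeDictionaryWorker({ elephant: "a large grey mammal" });
let definition = null;
lookUp(worker, "elephant", (result) => { definition = result; });

const beforeLoop = definition; // nothing has been delivered yet
runEventLoop();
const afterLoop = definition;  // the reply has now arrived
```

The point of contention in the thread is visible in the last lines: the result is not available where the request is made, only after control has returned to the event loop.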
Re: Sync API for workers
On 04/09/2012 18:46, Glenn Maynard wrote: On Tue, Sep 4, 2012 at 10:32 AM, David Bruant bruan...@gmail.com wrote: Cognitive load is the only one mentioned so far. It is a serious issue since for the foreseeable future, only human beings will be writing code. However, as said, there are solutions to reduce this load. I wish to share an experience. Back in April, I gave a JavaScript/jQuery training to people who knew programming, but didn't know JavaScript. I made the decision to teach promises right away (jQuery has them built-in, so that's easy). It seems that it helped a lot in understanding async programming. The cognitive load has its solutions. (Understanding asynchronous programming isn't really the issue. I'm sure everyone in this discussion has an intuitive grasp of that.) Those are attempts at making asynchronous code easier to write; they're not substitutes for synchronous code. They still result in code with less understandable, well-scoped state. I'm sorry, but I have to disagree. Have you ever used promises in a large-scale project? I've been amazed to discover that promise-based APIs are ridiculously easier to refactor than callback-based APIs. Obviously, refactoring necessitates well-scoped state. I can't show the commit I have in mind, because it's in closed-source software, but really, a promise-based API isn't less understandable and less well-scoped. That statement is the opposite of my experience these last 8 months. This is a very interesting example and I realize that I have used blocking and sync interchangeably by mistake. I'm against blocking, but not sync. What I'm fundamentally (to answer what you said above) against is the idea of blocking a computation unit (like a worker) that does nothing but idly waits (for IO or a message for instance). It seems that proposals so far make the worker wait for a message and do nothing meanwhile and that's a pure waste of resources. 
A worker has been paid for (memory, init time...) and it's waiting while it could be doing other things. The current JS event loop run-to-completion model prevents that waste by design. Workers broke away from requiring the "do a bit of work, then keep returning to the event loop" model of the UI thread from the start. This is no different than the APIs we already have. To take an earlier example:

    var worker = createDictionaryWorker();
    worker.postMessage("elephant");
    var definition = getMessage(worker); // wait for the answer

This is no different than a sync XHR or IndexedDB call to do the same thing:

    var xhr = new XMLHttpRequest();
    xhr.open("GET", "/dictionary?elephant", false); // sync
    xhr.send();
    var definition = xhr.responseText;

It simply allows workers, not just native code, to implement these APIs. That's a natural step. I understand and agree, but you're not addressing the problem of the resource waste I've mentioned above. Even if you're doing sync xhr in a worker, you're wasting the worker's time, because it could be computing other things while waiting for the network to respond. That problem was obvious in the main thread because it was resulting in poor user experience, but the problem still holds with workers. What do you do if your worker is busy idling while waiting for the network, but you still need some other work to be done? Open another worker? And when this one is idling and you need work done? Open another worker? To set the two against each other: is the readability worth the waste of resources? That's a genuine question. My experience with Node.js (which also provides sync methods for IO) is that for small scripts, sync methods are more convenient than callbacks or even promises. But arguably, for small scripts, readability isn't that big of a concern, by the nature of a small script. David
Re: Sync API for workers
On Tue, Sep 4, 2012 at 12:49 PM, David Bruant bruan...@gmail.com wrote: I'm sorry, but I have to disagree. Have you ever used promises in a large-scale project? I've been amazed to discover that promise-based APIs are ridiculously easier to refactor than callback-based APIs. Obviously, refactoring necessitates well-scoped state. I can't show the commit I have in mind, because it's in closed-source software, but really, a promise-based API isn't less understandable and less well-scoped. That statement is the opposite of my experience these last 8 months. You have to choose between scoping state to a class (poor scoping) or in closures (hard to debug) instead of using locals in a call stack (tightly scoped and easy to debug); the overall current state of execution is much harder to see compared to a stack trace; the basic idea of stepping through code in a debugger scarcely translates at all. I understand and agree, but you're not addressing the problem of the resource waste I've mentioned above. I don't feel like I need to, because I expect this question was explored before workers were introduced in the first place. You apparently want to argue against *all* sync APIs, but you should do that separately, rather than singling out one sync API at random. -- Glenn Maynard
Re: Sync API for workers
On 04/09/2012 20:47, Glenn Maynard wrote: On Tue, Sep 4, 2012 at 12:49 PM, David Bruant bruan...@gmail.com wrote: I'm sorry, but I have to disagree. Have you ever used promises in a large-scale project? I've been amazed to discover that promise-based APIs are ridiculously easier to refactor than callback-based APIs. Obviously, refactoring necessitates well-scoped state. I can't show the commit I have in mind, because it's in closed-source software, but really, a promise-based API isn't less understandable and less well-scoped. That statement is the opposite of my experience these last 8 months. You have to choose between scoping state to a class (poor scoping) or in closures (hard to debug) instead of using locals in a call stack (tightly scoped and easy to debug); the overall current state of execution is much harder to see compared to a stack trace; the basic idea of stepping through code in a debugger scarcely translates at all. Tooling isn't perfect for async debugging. It's being worked on. Yet it hasn't prevented web devs from building (and debugging) event-based code. As someone else said in another message, async isn't going away. There won't be new blocking APIs for the main thread, so all the costs of learning async programming will have to be paid. Debugging included. I'm less and less convinced there is really something substantial to win from the JS developer perspective. For small scripts, it will be possible to use blocking APIs, but the cost of async in small scripts is bearable. For big scripts, blocking APIs induce a performance cost that soon makes people move to async. I understand and agree, but you're not addressing the problem of the resource waste I've mentioned above. I don't feel like I need to, because I expect this question was explored before workers were introduced in the first place. It likely hasn't because workers do not have access to blocking APIs except sync xhr. 
Are there examples in the wild of people creating new workers when one is doing a sync xhr or do people just turn their code into async when performance becomes an issue? If the question has been explored before, can anyone point to the answer? Otherwise, the debate won't move forward on that point. You apparently want to argue against *all* sync APIs, but you should do that separately, rather than singling out one sync API at random. As I said in a previous message, I'm arguing against the waste of resources due to blocking APIs. If a sync API makes an actual use of the worker and CPU, that's excellent. If it's blocking on IO, it's wasting resources that could be doing other computations. David
Re: Sync API for workers
On Tue, Sep 4, 2012 at 2:23 PM, David Bruant bruan...@gmail.com wrote: Tooling isn't perfect for async debugging. It's being worked on. Yet it hasn't prevented web devs from building (and debugging) event-based code. Developers work in lots of bad environments and get stuff done anyway. That's no argument. As someone else said in another message, async isn't going away. There won't be new blocking APIs for the main thread, so all the costs of learning async programming will have to be paid. Debugging included. I can only repeat what I already said: Understanding asynchronous programming isn't really the issue. I'm sure everyone in this discussion has an intuitive grasp of that. You apparently want to argue against *all* sync APIs, but you should do that separately, rather than singling out one sync API at random. As I said in a previous message, I'm arguing against the waste of resources due to blocking APIs. That's what I said: you're arguing against all sync APIs, not *this* API. I don't really want to spend more time on this tangent, since it's not about this API at all but a higher-level concept, and one we already have an answer to: synchronous APIs in workers are OK. Again, if you want to debate a basic premise of Web Workers, I recommend starting a separate thread. -- Glenn Maynard
Re: Sync API for workers
On Mon, 3 Sep 2012, Glenn Maynard wrote:

- Add an internal flag to MessagePort, "blocking permitted", which is initially set.
- When a MessagePort port is transferred from source to dest,
  - If source is an ancestor of dest, the blocking permitted flag of port is cleared. (This is a down transfer.)

You basically can't do this, because by the time you've received the message saying that the port is in a permitted scope, the other side of the port could have been shunted three times and now be who knows where. Basically as soon as a port leaves the scope in which it was created, you can no longer make any stable statements about where the other side is. This is why the ports used in dedicated Workers are hidden (so you can't send them anywhere). -- Ian Hickson, http://ln.hixie.ch/ "Things that are impossible just take longer."
Re: Sync API for workers
[Forwarding a response from Alon Zakai, who is behind Emscripten and CC'ing him] There is also another use-case which has been brought up. As the web platform is becoming more powerful, people have started converting code written for other platforms to javascript+html. By html, here, do you mean something else than canvas? Is there something that compiles any Windows/Mac/Linux UI framework into HTML5? There is pyjamas that compiles Python into JS and GWT that compiles Java into JS. Both are UI frameworks, and being able to use sync calls in both would be more natural since the original languages have lots of sync stuff I believe. I don't know how important this use case is though. But we would like to compile open source UI frameworks like Qt and GTK into JS using emscripten, and sync would help a lot there. For example the Emscripten[1] and Mandreel[2] allow recompiling C++ code to javascript which is then run in a web browser. I've been loosely following this topic and all examples I've seen were either about pure computation (like turning a C GZIP library into JS) or graphics (using canvas, hence my above question). Many times such code relies on APIs or libraries which contain blocking calls. Do you have an example of that? I haven't seen one so far, but that's an interesting point. Specifically, I can't recall having seen any C/C++-JS example that were doing IO (maybe files, but not network). Last I heard, Emscripten compiles to JS from LLVM bytecode. I'm not sure they rely on any library containing blocking calls. But as I said, I have been following that loosely. The normal C/C++ IO calls are all synchronous - fopen, fread, etc. As you said above, this is indeed less of a problem for pure computation. But when compiling a complete game engine for example (like in BananaBread), you need to handle everything a complete app needs, including synchronous IO. Both file IO and network IO (for multiplayer, downloading assets, etc. 
- we haven't gotten around to that yet though in BananaBread) are relevant, as well as synchronous GL operations using WebGL. Technically it might be possible to automatically rewrite such code to use asynchronous coding patterns, but so far I don't think anyone has managed to do that. Naively, I would say, that once we've paid the price to compile code from one language to another, you're not that far off from compiling to a given coding pattern. Especially compiling from LLVM bytecode. Regardless, has anyone tried at all? Has anyone tried to compile to JS+Promises or JS+task.js? Consider even the simple and very common case of .. while (!feof(f)) { fread(buf, 100, 1, f); ..process data in buf.. } .. I don't have any good ideas for something asynchronous to compile this into that does not currently substantially harm performance. Compiling synchronous code into continuation passing style, generators, control flow emulation, or some other async style would greatly reduce performance. In theory JS engines could optimize those styles, but it would be hard and I don't think this is at the top of anyone's list of priorities for any JS engine. Now, we could say that sync code, mainly IO, will run slowly but that is ok because it's mainly just done during startup. That's true to some extent, but startup is very important too, in BananaBread we already have 5-10 seconds or so to load the entire game engine, a lot of which is file IO and processing, and we have gotten requests to improve that as much as possible because it is very significant for the user's initial impression. Best, Alon Zakai
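To make the shape of the problem concrete, here is a rough sketch of the fread loop above written in JavaScript, once against a blocking-style read and once in continuation-passing style. Everything named here (FakeFile, readSync, readAsync, the manual task queue) is a hypothetical stand-in chosen so the snippet is self-contained; it is not what Emscripten emits, and it says nothing about the actual performance cost discussed in the message.

```javascript
// Hypothetical, hand-rolled task queue standing in for the event loop.
const tasks = [];
function runEventLoop() {
  while (tasks.length > 0) tasks.shift()();
}

// Hypothetical in-memory "file"; stands in for the FILE* of the C loop.
class FakeFile {
  constructor(text) { this.text = text; this.pos = 0; }
  eof() { return this.pos >= this.text.length; }
  // Blocking-style read: returns the next chunk directly.
  readSync(size) {
    const chunk = this.text.slice(this.pos, this.pos + size);
    this.pos += chunk.length;
    return chunk;
  }
  // Async-style read: hands the chunk to a callback on a later task.
  readAsync(size, callback) {
    tasks.push(() => callback(this.readSync(size)));
  }
}

// Direct translation of: while (!feof(f)) { fread(...); ..process.. }
function countSync(file) {
  let total = 0;
  while (!file.eof()) {
    total += file.readSync(100).length;
  }
  return total;
}

// Same logic in continuation-passing style: the loop becomes a recursive
// step function and the local `total` has to live in a closure.
function countAsync(file, done) {
  let total = 0;
  function step() {
    if (file.eof()) { done(total); return; }
    file.readAsync(100, (chunk) => {
      total += chunk.length;
      step();
    });
  }
  step();
}

const data = "x".repeat(250);
const syncTotal = countSync(new FakeFile(data));
let asyncTotal = null;
countAsync(new FakeFile(data), (n) => { asyncTotal = n; });
runEventLoop();
```

Even in this toy case the while loop has to be turned inside out; a compiler would need to do the same for every caller up the stack of any function that may block.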
Re: Sync API for workers
Alon Zakai wrote: Technically it might be possible to automatically rewrite such code to use asynchronous coding patterns, but so far I don't think anyone has managed to do that. Naively, I would say, that once we've paid the price to compile code from one language to another, you're not that far off from compiling to a given coding pattern. Especially compiling from LLVM bytecode. Regardless, has anyone tried at all? Has anyone tried to compile to JS+Promises or JS+task.js? Consider even the simple and very common case of .. while (!feof(f)) { fread(buf, 100, 1, f); ..process data in buf.. } .. I don't have any good ideas for something asynchronous to compile this into that does not currently substantially harm performance. Compiling synchronous code into continuation passing style, generators, control flow emulation, or some other async style would greatly reduce performance. I can imagine, it sounds hard indeed. Do you have numbers on how it affects performance? Or an intuition on these numbers? I don't need to be convinced that it affects performance significantly, but just to get an idea. I remember that at some point (your JSConf.eu talk last October), in order to be able to compile through Emscripten, the source codebase (in C/C++) had to be manually tweaked sometimes. Is it still the case? If it's an acceptable thing to ask to authors, then would there be easy ways for authors to make their IO blocking code more easily translated to async JS code? I'm pessimistic, but it seems like an interesting question to explore. David
Re: Sync API for workers
David Bruant wrote: I can imagine, it sounds hard indeed. Do you have numbers on how it affects performance? Or an intuition on these numbers? I don't need to be convinced that it affects performance significantly, but just to get an idea. This is not going to be easy to estimate, but you might benchmark generator vs. non-generator code in the latest SpiderMonkey. I don't think we need quantification, though. Alon's right, the optimizing VMs are not focused on uncommon code other than what's in the dopey industry-standard benchmarks. I remember that at some point (your JSConf.eu talk last October), in order to be able to compile through Emscripten, the source codebase (in C/C++) had to be manually tweaked sometimes. Is it still the case? If it's an acceptable thing to ask to authors, then would there be easy ways for authors to make their IO blocking code more easily translated to async JS code? I'm pessimistic, but it seems like an interesting question to explore. BananaBread required zero Cube 2 changes, IIRC. Other Emscripten examples are also pure compilation. Forget it. Inversion of control flow is hard enough and error prone that developers won't do it. It's the #1 reason Mozilla's Electrolysis project is paused indefinitely. The SuperSnappy work (threads, not processes) preserves most execution model compatibility, and avoids requiring programmers writing Firefox XUL front-end and add-on code from having to manually callback-CPS their code (on every DOM access!). /be
Re: Sync API for workers
On Tue, Sep 4, 2012 at 3:11 PM, Ian Hickson i...@hixie.ch wrote: On Mon, 3 Sep 2012, Glenn Maynard wrote:

- Add an internal flag to MessagePort, "blocking permitted", which is initially set.
- When a MessagePort port is transferred from source to dest,
  - If source is an ancestor of dest, the blocking permitted flag of port is cleared. (This is a down transfer.)

You basically can't do this, because by the time you've received the message saying that the port is in a permitted scope, the other side of the port could have been shunted three times and now be who knows where. There's no message saying that it's permitted; the only possible message is that it's *no longer* permitted. Once that flag is cleared, it's cleared permanently. Also see my later message to Jonas, which reformulates this a bit to put responsibility for triggering the "clear the flag" behavior on the receiver, rather than the sender of the port, since while sending is asynchronous (you don't know where a message is going when you send it), receiving is not (you know where a message is going when you receive it--to you!--and you can know where it originally came from, since the sender can tuck that information in the message). -- Glenn Maynard
Re: Sync API for workers
On Mon, Sep 3, 2012 at 10:55 PM, Glenn Maynard gl...@zewt.org wrote: I suspect there's a way to make the general-case version work, though. To restate this by itself instead of as a delta:

- Add an internal flag to MessagePort, "blocking permitted", which is initially set. Add a value "previous owner", which is initially null.
- During postMessage, when a MessagePort port is to be transferred, set the MessagePort's previous owner to the current thread.
- When a MessagePort is transferred to the current thread, the current thread must compare itself to the previous owner value of the received MessagePort:
  - If the current thread is a descendant of previous owner, the blocking permitted flag of port must be cleared. (This is a down transfer.)
  - Otherwise, if previous owner is a descendant of the current thread, a "clear blocking permitted" message must be sent over port. (This is an up transfer.)
  - Otherwise, if previous owner is the current thread, do nothing.
  - Otherwise, the blocking permitted flag of port must be cleared and a "clear blocking permitted" message must be sent over port.
- When a "clear blocking permitted" message is received on a port's message queue, it must be discarded and the blocking permitted flag of the port must be cleared.
- When the blocking permitted flag of any MessagePort is cleared, any getMessage calls blocking on that port throw an exception.
- Calling getMessage on a port (with a nonzero timeout) whose blocking permitted flag is cleared throws the same exception.
- Additionally, calling getMessage on a port (with a nonzero timeout) when neither it nor its entangled port has ever been transferred to another thread throws an exception. (Blocking for data when the current thread holds both sides of the port guarantees a deadlock.)
The "compare itself to previous owner" step must be allowed to happen asynchronously, as soon as the port appears in a message queue held by the current thread (and before the message containing the port is taken from the queue for delivery to onmessage or getMessage). That ensures that a long-running script doesn't prevent the "clear blocking permitted" message from being sent. That's a bit annoying, but it seems doable, since it has no script-visible effects in the thread it's happening in. (This isn't necessary for the "clear blocking permitted" message itself; that can simply be processed in port message queue order.)

-- Glenn Maynard
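The receiver-side comparison restated above can be sketched in a few lines of JavaScript. This is only an illustration of the decision table, not anything from a spec or an implementation: the thread ids, the parentOf map and the function names isDescendant and onPortReceived are all hypothetical, and the sketch assumes dedicated workers form a tree in which every worker has exactly one parent.

```javascript
// Hypothetical model: threads are identified by string ids, and parentOf
// maps a dedicated worker's id to the id of the thread that created it.
function isDescendant(parentOf, candidate, ancestor) {
  // Walk up from candidate; true if ancestor is found strictly above it.
  let current = parentOf.get(candidate);
  while (current !== undefined) {
    if (current === ancestor) return true;
    current = parentOf.get(current);
  }
  return false;
}

// Decide what the receiving thread does when a port arrives, following the
// four cases in the message above (down, up, same thread, everything else).
function onPortReceived(parentOf, previousOwner, currentThread) {
  if (isDescendant(parentOf, currentThread, previousOwner)) {
    // Down transfer: this side may never block on the port.
    return { kind: 'down', clearLocalFlag: true, sendClearMessage: false };
  }
  if (isDescendant(parentOf, previousOwner, currentThread)) {
    // Up transfer: tell the other side to clear its flag.
    return { kind: 'up', clearLocalFlag: false, sendClearMessage: true };
  }
  if (previousOwner === currentThread) {
    return { kind: 'same', clearLocalFlag: false, sendClearMessage: false };
  }
  // Neither ancestor nor descendant (a sibling or a shared worker, say).
  return { kind: 'unrelated', clearLocalFlag: true, sendClearMessage: true };
}
```

Because the flag only ever goes from set to cleared, the result of this check never needs to be revisited, even if the other end of the port changes hands afterwards.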
Re: Sync API for workers
On Tue, Sep 4, 2012 at 1:59 PM, Brendan Eich bren...@mozilla.org wrote: David Bruant wrote: I can imagine, it sounds hard indeed. Do you have numbers on how it affects performance? Or an intuition on these numbers? I don't need to be convinced that it affects performance significantly, but just to get an idea. This is not going to be easy to estimate, but you might benchmark generator vs. non-generator code in the latest SpiderMonkey. I don't think we need quantification, though. Alon's right, the optimizing VMs are not focused on uncommon code other than what's in the dopey industry-standard benchmarks. Yes, last I checked generator code is not even JITed. This kind of problem isn't on the radar of JS engine people - for understandable reasons, of course. I remember that at some point (your JSConf.eu talk last October), in order to be able to compile through Emscripten, the source codebase (in C/C++) had to be manually tweaked sometimes. Is it still the case? If it's an acceptable thing to ask to authors, then would there be easy ways for authors to make their IO blocking code more easily translated to async JS code? I'm pessimistic, but it seems like an interesting question to explore. BananaBread required zero Cube 2 changes, IIRC. Other Emscripten examples are also pure compilation. Yes, we aim at 0 code changes when porting. This is usually the case, although sometimes something absolutely must be changed. In BananaBread we changed a few dozen lines of code out of 120,000 for example. - azakai
Re: Sync API for workers
On Sat, Sep 1, 2012 at 11:49 AM, David Bruant bruan...@gmail.com wrote: Also, I don't think I have seen mentioned use cases of things that are not possible without a Sync API. Everything presented is already possible (sometimes at arguably high costs like Glenn Maynard's use case in discussion [1]).

I have a use case: at USENIX I presented Treehouse [1, 2], a system that sandboxes (mostly) unmodified JavaScript by running it in a worker with a virtual DOM and browser API. Treehouse presents guest code a synchronous interface to the virtual DOM, and then asynchronously updates the real DOM in the parent page.

This works surprisingly well, but has some limitations (Section 3.4 of [1]). First, Treehouse cannot virtualize synchronous API calls such as window.prompt. Second, sharing resources like cookies and DOM nodes between workers is difficult. We punted on this. For example, we require that a given DOM node in the parent page appear in the virtual DOM of at most one worker.

I can't weigh in on the implementation debate, but I can say that a blocking message API would make Treehouse more powerful and simplify its implementation. Whether this is a *real world* use case is another matter entirely...

[1] paper: https://www.usenix.org/system/files/conference/atc12/atc12-final159.pdf
[2] video, audio: https://www.usenix.org/conference/usenixfederatedconferencesweek/treehouse-javascript-sandboxes-help-web-developers-help

-- Lon Ingram @lawnsea
Re: Sync API for workers
Hi All, I'd like to start by clearing up some confusion here. That's why I'm responding to the first email in this thread. We at mozilla have no interest in creating an API which runs the risk of causing dead-locks. I would expect this to be true of other browser vendors too, though obviously I can't speak for them. This is why all the proposals that we have been discussing have been to allow *dedicated* workers block on receiving messages *only* from their parent. Dedicated workers always create tree-like structures, and so if children can only block on their parents, you can't end up with a situation where two actors are blocked on each other. It seems hard to ensure that deadlocks can't happen if we try to allow blocking calls on generic MessagePorts, this is why we haven't been interested in doing that. I'm not saying it's impossible, but if someone wants to propose this, please keep in mind that we're not interested in proposals which allow deadlocks, so you'll need to prove that your proposal can't cause deadlocks. It's been mentioned in this thread, and elsewhere, that we need to take caution to not allow deadlocks to happen. That's exactly what we are doing by only allowing blocking calls from dedicated workers to their parents. The other thing that I wanted to talk about is use cases. It has been claimed in this thread that synchronous message passing isn't needed and that people can just write code using async patterns. While this is absolutely true, I would absolutely say that writing asynchronous code is dramatically more complicated than writing synchronous code. This is one of the big reasons that we have workers at all. If writing asynchronous callbacks was almost as easy as writing blocking code, then we could simply ask people to return to the event loop and asynchronously continue their computation through a callback. There are other reasons for workers to exist, but this is one of them. 
So yes, it's definitely the case that synchronous blocking code doesn't allow any new use cases that were impossible before. But it makes certain code dramatically easier to write, which is of big value to authors.

There is also another use case which has been brought up. As the web platform is becoming more powerful, people have started converting code written for other platforms to javascript+html. For example, Emscripten[1] and Mandreel[2] allow recompiling C++ code to javascript which is then run in a web browser. Many times such code relies on APIs or libraries which contain blocking calls. Technically it might be possible to automatically rewrite such code to use asynchronous coding patterns, but so far I don't think anyone has managed to do that.

One of the big use cases I am interested in solving (though I can't speak for other people at mozilla) is to allow libraries to be written and imported into workers which expose easy-to-use synchronous APIs, and whose implementation makes blocking calls to the parent in order to implement the API. Such a library would of course require part of the library to also be running in the parent so that it could handle the incoming messages. For example, you could imagine a library which implements the synchronous IndexedDB API, since browsers so far have not implemented it. Or a library which implements a DOM which allows a worker to modify part of a document rendered by the parent window.

So with that in mind, let me express some opinions on the three proposals Olli mentioned in [3].

The third proposal, i.e. a blocking waitForMessage() function which returns the next message event, is something which has come up several times in the past. There's certainly a lot of logic to it; however, there are some pretty important problems with it. Consider the following scenario:

1. Worker starts running a task, say a message handler in response to a websocket message.
2. Main thread sends async messages A and B to the worker. These messages are added to the worker's message queue.
3. While still inside of the task started in step 1, the worker decides that it needs to send a synchronous message to the main thread. So it sends an asynchronous message, X, and starts polling messages using waitForMessage().
4. It first receives messages A and B, but since they aren't the reply to the message X sent in step 3, it keeps polling. Messages A and B end up in the local events array.
5. Message X arrives in the main thread and the main thread performs the calculation and responds with message X'. X' is added to the worker's event queue.
6. Main thread sends async message C to the worker. This message is added to the worker's message queue.
7. The worker keeps polling and now gets message X', so it stops polling and uses the data in X' as result.
8. The worker keeps running the task and eventually gets to the while-loop which processes the events array.
9. The first message in the array is A which the worker dispatches, causing the handler for A to start running.
10. The handler
Re: Sync API for workers
On Mon, Sep 3, 2012 at 4:47 PM, Glenn Maynard gl...@zewt.org wrote: On Mon, Sep 3, 2012 at 4:32 PM, Jonas Sicking jo...@sicking.cc wrote: It seems hard to ensure that deadlocks can't happen if we try to allow blocking calls on generic MessagePorts, this is why we haven't been interested in doing that. I'm not saying it's impossible, but if someone wants to propose this, please keep in mind that we're not interested in proposals which allow deadlocks, so you'll need to prove that your proposal can't cause deadlocks. (See below.) Another problem you have is that the A, B and C events aren't run from the event loop like normal events. They are instead run from whatever callstack existed when someone decided to make synchronous call to the parent. This will give web developers exactly the same problem as we've had with Gecko code spinning the event loop. When doing something like that, you have to be absolutely sure that all code which exists up your call stack can deal with all of these messages getting dispatched. And all of those messages have to be able to deal with being dispatched under the existing callstack. I think all of the problems you're describing only happen if there's just one channel that you can post messages to, eg. if you can't block on MessagePort but only the global port. I think we can find a solution for the MessagePort problem. Once you can block on specific MessagePorts, you no longer have the confusion of getMessage() returning messages meant for other APIs. (After all, isn't that what MessageChannels are for?) Conceptually, I think this is possible. You should only be able to perform a blocking getMessage if the other side of the port is in a dedicated worker who is a descendant of the current thread. Here's an attempt: - Add an internal flag to MessagePort, blocking permitted, which is initially set. 
- When a MessagePort port is transferred from source to dest,
  - If source is an ancestor of dest, the blocking permitted flag of port is cleared. (This is a down transfer.)
  - Otherwise, if source is a descendant of dest, the blocking permitted flag of port's entangled port is cleared. (This is an up transfer.)
  - Otherwise, if source == dest, do nothing.
  - Otherwise, the blocking permitted flags of both port and its entangled port are cleared. (For example, a port was transferred to a shared worker.)
- When the blocking permitted flag of any MessagePort is cleared, any getMessage calls blocking on that port throw an exception.
- Calling getMessage on a port (with a nonzero timeout) whose blocking permitted flag is cleared throws the same exception.
- Additionally, calling getMessage on a port (with a nonzero timeout) when neither it nor its entangled port has ever been transferred to another thread throws an exception. (Blocking for data when the current thread holds both sides of the port guarantees a deadlock.)

In other words, if a port is transferred up the thread tree, then it's allowed to block downwards, but any port that's ever been transferred down cannot. If you transfer a port down and then back up, then neither side can ever block on the port (the flag has been cleared on both sides). (The "clear the entangled port's flag" step would presumably actually mean sending a control message over the pipe, telling the other side to clear the flag.)

This works for dedicated workers, where the ancestor/descendant concepts make sense. This wouldn't work for shared workers, which would never be able to block. (That's hard, since shared workers create cycles. I don't think any current proposal can support shared workers while also disallowing deadlocks.)

Now, this approach can go one of two ways: we can either allow blocking up the tree or down the tree, but we'd have to pick one or the other.
I'm inclined to recommend blocking *down* the tree, since that allows use cases like the ones you mentioned, eg. starting a thread to do IndexedDB calls, which you (the parent) can then block on. We can't generically block on children since we can't let the main window block on a child. That would effectively permit synchronous IO from the main thread which is not something that we want to allow. So if we're only choosing one direction (which is definitely the simpler thing to do), then it has to be that you can only block up the tree. Also, the last Otherwise, the blocking permitted flag of both port and its entangled port are cleared. has to apply any time when sending a port through a generic port rather than through a dedicated worker parent/child? When communicating with a generic port we never have any idea what is on the receiving end. And what is on the receiving end can change between the time when a message is sent, and when it is received. 1.1 is nifty in that it allows us to use events while dealing with replies from multiple handlers. But it seems like it adds a feature that doesn't
Re: Sync API for workers
On Mon, Sep 3, 2012 at 9:30 PM, Jonas Sicking jo...@sicking.cc wrote: We can't generically block on children since we can't let the main window block on a child. That would effectively permit synchronous IO from the main thread which is not something that we want to allow. The UI thread would never be allowed to block, of course. The getMessage API itself would never even be exposed in the UI thread, regardless of the state of this flag. I picture this being methods on DedicatedWorkerGlobalScope and SharedWorkerGlobalScope. (SharedWorkerGlobalScope's version would only get the zero-timeout polling version.) Also, the last Otherwise, the blocking permitted flag of both port and its entangled port are cleared. has to apply any time when sending a port through a generic port rather than through a dedicated worker parent/child? When communicating with a generic port we never have any idea what is on the receiving end. And what is on the receiving end can change between the time when a message is sent, and when it is received. Well, you can simplify the algorithm by always clearing the flag when posting through anything but a dedicated worker's port. That's straightforward since you always know at send time what the relationship to the receiver is--if it's through worker.postMessage it's to a child, and if it's through postMessage on DedicatedWorkerGlobalScope it's to the parent. It would be nice to do this in the more generic way, but this would be enough for a lot of cases. I suspect there's a way to make the general-case version work, though. For example, when a worker is transferred to another thread, include the thread ID sending the port, as part of the metadata of the transfer. The receiver then knows where the port came from, and it knows itself, so it can see where it lies relative to the sender to determine whether it was an up, down or a transfer that always invalidates both sides (sent to a worker that is neither an ancestor nor a descendant). 
If it determines that the transfer invalidated the other side, then it sends a message across the pipe saying whoever you are, you need to clear your blocking-permitted flag. This would apply even if the other side changes hands in the meantime, since once that flag is set, it's set permanently. All that aside, do any implementations actually put dedicated workers in a different process than their creator (if so, I'm curious as to why)? This should all be very simple if you don't do that, and I can't think of any reason to--and good reasons not to, eg. fast ArrayBuffer transfers become much harder. Shared workers may cross processes, but dedicated workers? Your proposal makes it possible for pages to avoid the problems described in my email by setting up a separate channel used for synchronous messages. But some of the problems still remain. As soon as a message channel is used for both synchronous and asynchronous messages you can easily get into trouble. If someone calls the blocking waitForMessage() function and receive a message which was intended to be delivered asynchronously there is no good recourse. Basically any time that happens there are only bad options available, many of which have subtle problems that only happen intermittently like the ones I described in my initial email. Since that is the case, I think the best solution is to always force separate channels to be used for synchronous and asynchronous messages. If you have messages that must be received synchronously, and other messages that must be received asynchronously, then that's precisely a time when you'd want to use MessagePorts to separate them. That's what they're for. It's the same as using separate MessagePorts when you have two unrelated libraries receiving their own messages, so each library only sees messages intended for it. 
I agree that APIs that encourage people to write brittle code should be squinted at carefully, and we should definitely examine all APIs for that problem, but really I don't think it's the case here. It seems much simpler to me to have only one kind of MessagePort, each representing only one message channel. Importantly, the sending side doesn't have to know whether the receiving side is using a sync API to receive it or not--in other words, that information doesn't have to be part of the user's messaging protocol. As a simple example, you can have a worker thread whose protocol is simply: - Send a message to the worker's port with a word. - The worker sends a message to its parent with the word's definition. (The mechanism of this lookup is black-boxed--it might be IndexedDB, or a network request, or a complex combination.) This means that the caller can use this worker's API synchronously or asynchronously, without needing to define two interfaces and without the child knowing the difference. You can use it synchronously (if you're in a worker yourself): var worker = createDictionaryWorker();
Re: Sync API for workers
On Sat, Sep 1, 2012 at 2:32 PM, Glenn Maynard gl...@zewt.org wrote: On Sat, Sep 1, 2012 at 3:19 PM, Rick Waldron waldron.r...@gmail.com wrote: I can seriously dispute this, as someone who is involved in research and development of JavaScript programming for hardware. Processing high volume serialport IO is relatively simple with streams and data events. It's just a matter of thinking differently about the program.

We'll have to professionally disagree, then. I've used both models extensively, and for some tasks, such as complex algorithms, I find linear code much easier to write, understand and debug.

On Sat, Sep 1, 2012 at 3:28 PM, Olli Pettay olli.pet...@helsinki.fi wrote: Proposal 3

Worker:
    postMessage('I want reply to this');
    var events = [];
    var m;
    while ((m = waitForMessage())) {
      if (m.data != /* the reply */) {
        events.push(m);
      } else {
        // do something with message
      }
    }
    while (events.length) {
      dispatchEvent(events.shift());
    }

The intent is much simpler:

    postMessage(foo);
    var response = getMessage();

You're correct that this wouldn't integrate all that well with libraries, since you may end up receiving an unrelated message. (Trying to deal with that is where the bulk of the above comes from--and you probably really wouldn't want to dispatch unknown events, since that's effectively making *all* events async.) That's why I originally proposed this as a method on MessagePort. You'd create a dedicated MessagePort to block on, so you wouldn't collide with (and possibly be confused by, or discard by accident) unrelated messages.

Just wanted to point out that all of the arguments for a wait-for-reply API in workers also apply to SharedWorkers. It's trickier for SharedWorkers since they use MessagePorts, and we probably don't want to expose this kind of API to pages (which also use MessagePorts). But I would strongly prefer a solution that would be applicable to all kinds of workers, not just dedicated workers.
FWIW, I find the getMessage(timeout) API (sol'n 3 in the thread) preferable to adding new send sync message/reply + related events to the API. Perhaps exposing this on both Worker and on MessagePort (with appropriate errors if getMessage(timeout) is called from page scope with timeout != 0) would be acceptable? I'm not entirely certain what the semantics of getMessage() are, though - if you grab a message via getMessage(), does this imply that normal onmessage event handlers are not run (or perhaps are run after we re-enter the event loop)? I am not optimistic that we can do deadlock prevention in the general case with MessagePorts, for the same reason that it's prohibitively difficult to reliably garbage collect MessagePorts when they can be passed between processes. Of course, that means that while you're waiting for messages on the port, no other messages are being received, since you aren't in the event loop. That's just inherent--nothing else that happens during the event loop would take place either, like async XHR. I'm not sure how to do the implicit deadlock prevention if this is exposed on MessagePort, though. (I personally don't find it a problem, as I said earlier, but I know that a proposal that has that as an option will have a much better chance of being implemented than one that doesn't.) - The message must be read in order to reply I'm not quite following here. You could getMessage at any time, and the other thread could postMessage at any time--there doesn't have to be a hey, send me a message message in the first place at all. For example, a site may have a button that sends a message when clicked. You don't have to jump hoops in order to wait for the query message to reply to, as it seems you'd have to with the reply proposals. -- Glenn Maynard
Re: Sync API for workers
On Sun, Sep 2, 2012 at 12:24 PM, Andrew Wilson atwil...@google.com wrote: Just wanted to point out that all of the arguments for a wait-for-reply API in workers also apply to SharedWorkers. It's trickier for SharedWorkers since they use MessagePorts, and we probably don't want to expose this kind of API to pages (which also use MessagePorts). But I would strongly prefer a solution that would be applicable to all kinds of workers, not just dedicated workers. You can do that by giving MessagePort another interface in workers, eg. MessagePortSync or MessagePortWorkers, which inherits from MessagePort and adds eg. getMessage(). It might need a bit of finessing to switch interfaces during structured clone. Alternatively--and as I type this I like it better--add a getMessage(port) method to WorkerGlobalScope. Simplicity aside, I like that it can naturally support getMessage([port1, port2, port3], 100). With port.getMessage(), there's no way to wait for a message from multiple ports (think select()/poll()). This doesn't give any way to specify the worker's implicit port, though; I guess that could be a special case, eg. pass in null. I'm not entirely certain what the semantics of getMessage() are, though - if you grab a message via getMessage(), does this imply that normal onmessage event handlers are not run (or perhaps are run after we re-enter the event loop)? It shouldn't still dispatch onmessage asynchronously. That's confusing, and also, it means the messages would build up in the queue until the script returns. Due to the nature of the feature, the script may not return for a long time, or it may receive lots of messages before it does. Also, if you may handle the message during processing (via getMessage), and also when idle (via onmessage), this means it's hard to ensure you don't process messages twice. The two options that have come up are: 1: don't dispatch onmessage at all if getMessage returns a message. getMessage() consumes the message from the queue. 
2: dispatch onmessage synchronously, before returning from getMessage (which can also return the message or not). They're mostly equivalent; you can build either on top of the other. (createEvent isn't actually exposed to WorkerGlobalScope in order to implement #2 from #1, but that's a separate issue.) I just noticed a strong argument against #2: it's recursive. Without any strong benefit, that seems like a good thing to avoid. I am not optimistic that we can do deadlock prevention in the general case with MessagePorts, for the same reason that it's prohibitively difficult to reliably garbage collect MessagePorts when they can be passed between processes. Would you consider this an implementation-blocking problem? By the way, another option is to remove the ability to block, so it always behaves as getMessage(0)--return a waiting message, but don't wait for one. That would also make it impossible to deadlock. Being able to wait for a message would be a nice plus, but I don't think I've seen any use cases that really require it. (If this is done, there's no need to be able to give multiple ports to getMessage, as I mentioned at the top, since you can just call getMessage separately for each port.) -- Glenn Maynard
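To make option 1 above concrete, here is a minimal sketch of the "consume on getMessage" behaviour in its non-blocking, getMessage(0) form. Nothing here is a real browser API: SimulatedPort, deliver, runEventLoopTurn and pollPorts are hypothetical names, the queue is an ordinary array, and true blocking with a nonzero timeout is deliberately left out because it cannot be emulated in plain script.

```javascript
// Hypothetical stand-in for a port's message queue; not a real MessagePort.
class SimulatedPort {
  constructor() {
    this.queue = [];        // messages waiting to be delivered
    this.onmessage = null;  // asynchronous handler, as on a real port
  }

  // Called by the "other side" to enqueue a message.
  deliver(data) {
    this.queue.push({ data });
  }

  // Option 1, zero-timeout form: take the next message if one is waiting,
  // otherwise return null. A consumed message never reaches onmessage.
  getMessage() {
    return this.queue.length > 0 ? this.queue.shift() : null;
  }

  // Stand-in for returning to the event loop: dispatch whatever is left.
  runEventLoopTurn() {
    while (this.queue.length > 0) {
      const event = this.queue.shift();
      if (this.onmessage) this.onmessage(event);
    }
  }
}

// Hypothetical multi-port poll in the spirit of getMessage([port1, port2], 0):
// returns the first waiting message together with the port it came from.
function pollPorts(ports) {
  for (const port of ports) {
    const event = port.getMessage();
    if (event !== null) return { port, event };
  }
  return null;
}
```

With these semantics a message is handled either by the code that called getMessage or by onmessage, never by both, which is the double-processing concern raised above. As the message notes, once blocking is removed the multi-port form is only a convenience, since the same result comes from calling getMessage on each port in turn.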
Re: Sync API for workers
On Sun, 2 Sep 2012, Andrew Wilson wrote: Just wanted to point out that all of the arguments for a wait-for-reply API in workers also apply to SharedWorkers. It's trickier for SharedWorkers since they use MessagePorts

Dedicated Workers use MessagePorts too, they're just embedded in the WorkerGlobalScope and the API exposed through that interface, so that you can't send the port's endpoint around. But it's literally defined in terms of a MessagePort.

-- Ian Hickson, http://ln.hixie.ch/
Things that are impossible just take longer.
Re: Sync API for workers
On Sun, Sep 2, 2012 at 12:16 PM, Glenn Maynard gl...@zewt.org wrote: On Sun, Sep 2, 2012 at 12:24 PM, Andrew Wilson atwil...@google.com wrote: I am not optimistic that we can do deadlock prevention in the general case with MessagePorts, for the same reason that it's prohibitively difficult to reliably garbage collect MessagePorts when they can be passed between processes. Would you consider this an implementation-blocking problem?

No. To the contrary, I was (poorly) arguing in favor of not making deadlock prevention a required part of the spec.
Re: Sync API for workers
On Sat, Sep 1, 2012 at 11:49 AM, David Bruant bruan...@gmail.com wrote: A Sync API for workers is being implemented in Firefox [1]. I'd like to come back to the discussions mentioned in comment 4 of the bug. A summary of points I find important and my comments, questions and concerns

# Discussion 1

## Glenn Maynard [2]

Use case exposed: Ability to cancel long-running synchronous worker task

Terminating the whole worker thread is the blunt way to do it; that's no good since it requires starting a new thread for every keystroke, and there may be significant startup costs (eg. loading search data).

= It's a legitimate use case that has no good solution today other than cutting the task in smaller tasks between which a cancellation message can be interleaved.

The solution proposed in 783190 seems more complex and less useful than the one Sicking and I discussed. To summarize that one: add a getMessage(timeout) method, which consumes and returns the next message (causing onmessage to not be called[1]). If timeout is nonzero, wait for a message for up to that duration; if zero, the function never blocks (eg. peek for a waiting message). If the timeout expires, it returns null.

This turns the first example in 783190 into:

worker.js:
    var res = getMessage(timeout);

page.html:
    worker = new Worker(...);
    setTimeout(function() { worker.postMessage(data, transferrable); }, 1000);

I think this has several advantages.

- Mozilla's proposal effectively creates a separate, parallel messaging channel on the MessagePort; synchronous vs. asynchronous messages. This is simpler: messages are just messages, and no new API is exposed outside of workers.
- User messaging protocols are much simpler. For example, take a long-running processing task in a worker which wants to be able to receive a "stop what you're doing, I have new information that affects your processing task" message. With this proposal, the UI thread (or whatever) simply sends a message with the new information.
With Mozilla's proposal, it would have to wait for the thread to periodically send a do you have anything to tell me? message, in order to be able to send a response that the thread can receive synchronously. - Polling is much cheaper. With Mozilla's proposal, you have to send a message to another thread, then sit and wait until you get a response. If it's the UI thread, that may take many milliseconds, since it may be busy doing other things. With this proposal, polling for new messages in a processing loop should never block due to activity in the other thread. - The resulting message protocols are more robust. With the query/response approach, if someone fails to send a response, the worker will wait forever or time out. [1] We didn't come to agreement on whether it's better to return the message or to call onmessage synchronously, but that's a detail; whichever approach is used, it's possible to implement the other in script. # Discussion 2 ## Joshua Bell [5] This can be done today using bidirectional postMessage, but of course this requires the Worker to then be coded in now common asynchronous JavaScript fashion, with either a tangled mess of callbacks or some sort of Promises/Futures library, which removes some of the benefits of introducing sync APIs to Workers in the first place. = What are these benefits? The benefit of being able to write linear code. I don't think anyone who's written complex algorithms in JavaScript can seriously dispute this as anything but a huge win. ## Glenn Maynard [7] I think this is a fundamental missing piece to worker communication. A basic reason for having Workers in the first place is so you can write linear code, instead of having to structure code to be able to return regularly (often awkward and inconvenient), but currently in order to receive messages in workers you still have to do that. 
= A basic reason for having workers is to move computation away from window to a concurrent and parallel computation unit so that the UI is not blocked by computation. End of story. Nothing to do with writing linear code.

That's another good reason; it doesn't in any way reduce the importance of being able to write linear code, which *is* an important use case of workers. It's precisely why we have APIs like FileReaderSync.

If JavaScript as it is doesn't allow people to write code as they wish, once again, it's a language issue. Either ask for a change in the language or create a language that looks the way you want and compiles down to JavaScript.

This has nothing to do with JavaScript/ECMAScript as a language. The ugliness of having to implement algorithms in an event-based way is caused by the way the Web uses the language, not the language itself.

I wish to add that adding a sync API (even if the sync aspect is asymmetrical as proposed in [1]) breaks the event-loop run-to-completion model of in-browser-JavaScript which is intended to be formalized at [concurr].
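The "new information arrives mid-task" case discussed in this message can be sketched as linear code that polls between chunks of work. This is a simulation under stated assumptions, not the proposed API itself: makeInbox stands in for a worker's message queue, its getMessage models only the zero-timeout getMessage(0) form, and the { type: 'restart', items } message shape, the sumWithRestart name and the onChunk hook are all made up for the example.

```javascript
// Hypothetical inbox: getMessage() models the proposed getMessage(0),
// returning the next waiting message or null without ever blocking.
function makeInbox() {
  const pending = [];
  return {
    post(data) { pending.push(data); },
    getMessage() { return pending.length > 0 ? pending.shift() : null; },
  };
}

// Sum the items in fixed-size chunks, polling the inbox between chunks.
// A { type: 'restart', items } message discards the partial result and
// starts over with the new input; any other message is ignored here.
function sumWithRestart(inbox, items, chunkSize, onChunk) {
  let total = 0;
  let index = 0;
  while (index < items.length) {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) total += items[index];
    if (onChunk) onChunk(index);         // hook so a caller can inject messages
    const message = inbox.getMessage();  // poll; never waits
    if (message !== null && message.type === 'restart') {
      items = message.items;
      total = 0;
      index = 0;
    }
  }
  return total;
}
```

The loop stays linear: there is no need to split the task into callbacks, and the sender does not have to wait for a "do you have anything to tell me?" query before posting its update.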
Re: Sync API for workers
On 09/01/2012 11:19 PM, Rick Waldron wrote: David, Thanks for preparing this summary--I just wanted to note that I still stand behind my original, reality based arguments. One comment inline..

On Saturday, September 1, 2012 at 12:49 PM, David Bruant wrote: Hi, A Sync API for workers is being implemented in Firefox [1]. I'd like to come back to the discussions mentioned in comment 4 of the bug.

The original post actually describes an async API--putting the word sync in the middle of a method or event name doesn't make it sync. As the proposed API developed, it still retains the event handler-esque design (https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c12). All of the terminology being used is async:
- event
- callback
- onfoo

Even Olli's proposal example is async. https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c9 (setTimeout)

If the argument is callback hell, save it--because if that's the problem with your program, then you're doing it wrong (see: node.js ecosystem). If this API introduces any renderer process blocking, the result will be catastrophic in the hands of inexperienced web developers.

I haven't seen any proposal which would block rendering/main/dom thread.

We've been thinking about the following approaches:

Proposal 1

Parent Thread:
    var w = new Worker('foo.js');
    w.onsyncmessage = function(event) { event.reply('bar'); }

Worker:
    var r = postSyncMessage('foobar', null, 1000 /* timeout */);
    if (r == 'bar') ..

PRO:
- It's already implemented :)
CON:
- Multiple event listeners - Multiple reply() calls. How to deal with it?
- Multiple event listeners - is this your message?
- Wrong order of the messages in worker if parent sends async message just before receiving sync message
- The message must be read in order to reply

Proposal 1.1

Parent Thread:
    var w = new Worker('foo.js');
    w.onsyncmessage = function(event) {
      var r = new Reply(event);
      r.reply('bar'); // Can be called after event dispatch.
    }

Worker:
    var replies = postSyncMessage('foobar', null, 1000 /* timeout */);
    for (var i = 0; i < replies.length; i++) { handleEachReply(replies[i]); }

PRO:
- Can handle multiple replies.
- No awkward limitations on main thread because of reply handling
CON:
- A bit ugly.
- Reply on the worker thread becomes an array - unintuitive
- Wrong order of the messages in worker if parent sends async message just before receiving sync message
- The Reply object must be created during event dispatch.

Proposal 2

Parent Thread:
    var w = new Worker('foo.js');
    w.setSyncHandler('typeFoobar', function(message) { return 'bar'; });

Worker:
    var r = postSyncMessage('typeFoobar', 'foobar', null, 1000 /* timeout */);
    if (r == 'bar') ..

PRO:
- no multiple replies are possible
- types for sync messages
CON:
- Just a single listener
- It's not based on event - it's something different compared with any other worker/parent communication.
- Wrong order of the messages in worker if parent sends async message just before receiving sync message

Proposal 3

Worker:
    postMessage('I want reply to this');
    var events = [];
    var m;
    while ((m = waitForMessage())) {
      if (m.data != /* the reply */) {
        events.push(m);
      } else {
        // do something with message
      }
    }
    while (events.length) {
      dispatchEvent(events.shift());
    }

PRO:
- Flexible
- the order of the events is changed by the developer
- since there isn't any special sync messaging, multiple event listeners don't cause problems.
CON:
- complex for web developers(?)
- The message must be read in order to reply
- Means that you can't use libraries that use sync messages. Only frameworks are possible as all message handling needs to be aware of the new syncmessages.

Atm, I personally prefer the proposal 3.
-Olli Rick A summary of points I find important and my comments, questions and concerns # Discussion 1 ## Glenn Maynard [2] Use case exposed: Ability to cancel long-running synchronous worker task Terminating the whole worker thread is the blunt way to do it; that's no good since it requires starting a new thread for every keystroke, and there may be significant startup costs (eg. loading search data). = It's a legitimate use case that has no good solution today other than cutting the task in smaller tasks between which a cancellation message can be interleaved. ## Tab Atkins [3] If we were to fix this, it needs to be done at the language level, because there are language-level issues to be solved that can't be hacked around by a specialized solution. = I agree a lot with that point. This is a discussion that should be had on es-discuss since JavaScript is the underlying language. ECMAScript per se doesn't define a concurrency model and it's not even on the table for ES.next, but might be in ES.next.next (7?). See [concurr] ## Jonas Sicking [4] Ideas of providing control (read-only) over pending messages in workers. (not part of the current Sync API, but interesting nonetheless) # Discussion 2 ## Joshua Bell [5] This can be done today using bidirectional
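For readers who want to try the control flow of Proposal 3 outside a browser, here is a minimal simulation. It is only a sketch: `makeInbox` and `waitForReply` are hypothetical names, `waitForMessage` is modelled as a read from an in-memory array rather than a blocking call, and unlike the pseudocode in the proposal this version stops reading as soon as the reply arrives.

```javascript
// Simulation only: `inbox` stands in for the worker's message queue and
// `waitForMessage` for the blocking call proposed in the thread.
function makeInbox(messages) {
  const queue = messages.slice();
  return {
    // Returns the next queued message, or null when the queue is empty.
    // A real blocking call would wait here instead of returning null.
    waitForMessage() {
      return queue.length > 0 ? queue.shift() : null;
    },
  };
}

// Reads messages until one whose data equals `expected` arrives, deferring
// every other message, then hands the deferred ones to `dispatch` in
// arrival order.
function waitForReply(inbox, expected, dispatch) {
  const deferred = [];
  let reply = null;
  let m;
  while ((m = inbox.waitForMessage()) !== null) {
    if (m.data === expected) {
      reply = m;
      break; // stop reading as soon as the reply arrives
    }
    deferred.push(m);
  }
  while (deferred.length > 0) {
    dispatch(deferred.shift());
  }
  return reply;
}

const dispatched = [];
const inbox = makeInbox([{ data: 'tick' }, { data: 'bar' }, { data: 'late' }]);
const reply = waitForReply(inbox, 'bar', (e) => dispatched.push(e.data));
```

In this run the unrelated 'tick' message is deferred and dispatched after the reply is found, which is the reordering listed under PRO and the source of the "complex for web developers" concern under CON.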
Re: Sync API for workers
On Saturday, September 1, 2012 at 4:02 PM, Glenn Maynard wrote:

On Sat, Sep 1, 2012 at 11:49 AM, David Bruant bruan...@gmail.com wrote:

A Sync API for workers is being implemented in Firefox [1]. I'd like to come back to the discussions mentioned in comment 4 of the bug. A summary of points I find important and my comments, questions and concerns

# Discussion 1
## Glenn Maynard [2]
Use case exposed: Ability to cancel long-running synchronous worker task
Terminating the whole worker thread is the blunt way to do it; that's no good since it requires starting a new thread for every keystroke, and there may be significant startup costs (eg. loading search data).
= It's a legitimate use case that has no good solution today other than cutting the task into smaller tasks between which a cancellation message can be interleaved.

The solution proposed in 783190 seems more complex and less useful than the one Sicking and I discussed. To summarize that one: add a getMessage(timeout) method, which consumes and returns the next message (causing onmessage to not be called[1]). If timeout is nonzero, wait for a message for up to that duration; if zero the function never blocks (eg. peek for a waiting message). If the timeout expires, returns null. This turns the first example in 783190 into:

worker.js:
    var res = getMessage(timeout);
page.html:
    worker = new Worker(...);
    setTimeout(function() { worker.postMessage(data, transferrable); }, 1000);

I think this has several advantages.
- Mozilla's proposal effectively creates a separate, parallel messaging channel on the MessagePort; synchronous vs. asynchronous messages. This is simpler: messages are just messages, and no new API is exposed outside of workers.
- User messaging protocols are much simpler. For example, take a long-running processing task in a worker which wants to be able to receive a "stop what you're doing, I have new information that affects your processing task" message.
With this proposal, the UI thread (or whatever) simply sends a message with the new information. With Mozilla's proposal, it would have to wait for the thread to periodically send a "do you have anything to tell me?" message, in order to be able to send a response that the thread can receive synchronously.
- Polling is much cheaper. With Mozilla's proposal, you have to send a message to another thread, then sit and wait until you get a response. If it's the UI thread, that may take many milliseconds, since it may be busy doing other things. With this proposal, polling for new messages in a processing loop should never block due to activity in the other thread.
- The resulting message protocols are more robust. With the query/response approach, if someone fails to send a response, the worker will wait forever or time out.

[1] We didn't come to agreement on whether it's better to return the message or to call onmessage synchronously, but that's a detail; whichever approach is used, it's possible to implement the other in script.

# Discussion 2
## Joshua Bell [5]
This can be done today using bidirectional postMessage, but of course this requires the Worker to then be coded in now common asynchronous JavaScript fashion, with either a tangled mess of callbacks or some sort of Promises/Futures library, which removes some of the benefits of introducing sync APIs to Workers in the first place.
= What are these benefits?

The benefit of being able to write linear code. I don't think anyone who's written complex algorithms in JavaScript can seriously dispute this as anything but a huge win.

I can seriously dispute this, as someone who is involved in research and development of JavaScript programming for hardware. Processing high volume serialport IO is relatively simple with streams and data events. It's just a matter of thinking differently about the program.

Rick

## Glenn Maynard [7]
I think this is a fundamental missing piece to worker communication.
A basic reason for having Workers in the first place is so you can write linear code, instead of having to structure code to be able to return regularly (often awkward and inconvenient), but currently in order to receive messages in workers you still have to do that.
= A basic reason for having workers is to move computation away from the window to a concurrent and parallel computation unit so that the UI is not blocked by computation. End of story. Nothing to do with writing linear code.

That's another good reason; it doesn't in any way reduce the importance of being able to write linear code, which *is* an important use case of workers. It's precisely why we have APIs like FileReaderSync.

If JavaScript as it is doesn't allow people to write code as they wish, once again, it's a language issue. Either ask for a change in the language or create a language
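The cancellation use case and the non-blocking `getMessage(0)` poll described in this message can be tried out with a small simulation. This is a sketch under stated assumptions, not the proposed API: `makeQueue` and `sumUntilStopped` are hypothetical names, the queue is an in-memory array, only the zero-timeout case is modelled, and the `onItem` callback merely imitates another thread posting a message partway through.

```javascript
// Simulation only: getMessage(0) is modelled as a non-blocking poll of an
// in-memory queue; the real proposal would read the worker's message queue.
function makeQueue() {
  const pending = [];
  return {
    postMessage(data) { pending.push({ data }); },
    // With timeout 0 the call never blocks and returns null when nothing is
    // waiting. Non-zero timeouts are not modelled in this sketch.
    getMessage(timeout) {
      return pending.length > 0 ? pending.shift() : null;
    },
  };
}

// Sums `items` one at a time, polling for a 'stop' message before each item.
function sumUntilStopped(items, queue, onItem) {
  let total = 0;
  for (let i = 0; i < items.length; i++) {
    const m = queue.getMessage(0);
    if (m !== null && m.data === 'stop') {
      return { total, done: false, processed: i };
    }
    total += items[i];
    if (onItem) onItem(i);
  }
  return { total, done: true, processed: items.length };
}

const queue = makeQueue();
// The simulated "UI thread" posts 'stop' after the second item is processed.
const result = sumUntilStopped([1, 2, 3, 4], queue, (i) => {
  if (i === 1) queue.postMessage('stop');
});
```

The loop stays linear and keeps its local state, and the sender never has to wait for the worker to ask whether there is anything new.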
Re: Sync API for workers
On Saturday, September 1, 2012 at 4:28 PM, Olli Pettay wrote:

On 09/01/2012 11:19 PM, Rick Waldron wrote:

David, Thanks for preparing this summary; I just wanted to note that I still stand behind my original, reality-based arguments. One comment inline..

On Saturday, September 1, 2012 at 12:49 PM, David Bruant wrote:

Hi, A Sync API for workers is being implemented in Firefox [1]. I'd like to come back to the discussions mentioned in comment 4 of the bug.

The original post actually describes an async API; putting the word sync in the middle of a method or event name doesn't make it sync. As the proposed API developed, it still retains the event handler-esque design (https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c12). All of the terminology being used is async:
- event
- callback
- onfoo
Even Olli's proposal example is async. https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c9 (setTimeout) If the argument is callback hell, save it, because if that's the problem with your program, then you're doing it wrong (see: node.js ecosystem). If this API introduces any renderer process blocking, the result will be catastrophic in the hands of inexperienced web developers.

I haven't seen any proposal which would block rendering/main/dom thread

So far, they all look async. Just calling them sync doesn't make them sync. A sync worker API:

    // really sync behaviour means blocking until this returns:
    var result = rendererBlockingAPI( data );

Unless you specifically design that to stop everything and wait for a response (renderer blocking), JavaScript will run to completion.

Rick
Re: Sync API for workers
My reading (from the proposed APIs) is that these are only synchronous from the Worker's PoV. If that's correct I have no real objections to such an API - the render thread simply sees a regular message. It doesn't even need a special API on the receiving side. If the Worker has

    ... var response = postSynchronousMessage(message) ...

and the renderer has

    worker.onmessage = function() { ; return foo }

There are no UI blocking hazards (other than the usual slow event handler problem, which is already present anyway).

The only real problem I can see is along the lines of:

    // Worker1
    onmessage = function () { ...worker2.postSynchronousMessage(...)... }
    // Worker2
    onmessage = function () { ...worker1.postSynchronousMessage(...)... }

In the current implementations I imagine that this may cause difficulty, but I don't think that there is an actual technical argument against it.

--Oliver
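The mutual synchronous send between Worker1 and Worker2 described in this message is a cycle of blocked senders. One way an implementation might notice it, offered purely as an illustration and not as anything proposed in the thread, is to track who is blocked on whom and refuse a send that would close the loop. The `wouldDeadlock` helper and the `waitsFor` map below are hypothetical.

```javascript
// Hypothetical helper, not part of any proposal: `waitsFor` maps a worker
// name to the worker it is currently blocked on in a synchronous send.
function wouldDeadlock(waitsFor, sender, target) {
  // Follow the chain of blocked workers starting at the target.
  let current = target;
  const seen = new Set();
  while (current !== undefined && !seen.has(current)) {
    if (current === sender) return true; // the chain leads back to the sender
    seen.add(current);
    current = waitsFor.get(current);
  }
  return false;
}

const waitsFor = new Map();
// Worker1 sends synchronously to Worker2; nothing is blocked yet.
const first = wouldDeadlock(waitsFor, 'worker1', 'worker2');
waitsFor.set('worker1', 'worker2'); // Worker1 is now blocked on Worker2
// Worker2's handler now tries to send synchronously back to Worker1.
const second = wouldDeadlock(waitsFor, 'worker2', 'worker1');
```

The first send is safe, while the second would leave both workers waiting on each other.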
Re: Sync API for workers
On Sat, Sep 1, 2012 at 4:51 PM, Oliver Hunt oli...@apple.com wrote:

My reading (from the proposed APIs) is that these are only synchronous from the Worker's PoV. If that's correct I have no real objections to such an API - the render thread simply sees a regular message. It doesn't even need a special API on the receiving side. If the Worker has

    ... var response = postSynchronousMessage(message) ...

and the renderer has

    worker.onmessage = function() { ; return foo }

There are no UI blocking hazards (other than the usual slow event handler problem, which is already present anyway).

The only real problem I can see is along the lines of:

    // Worker1
    onmessage = function () { ...worker2.postSynchronousMessage(...)... }
    // Worker2
    onmessage = function () { ...worker1.postSynchronousMessage(...)... }

In the current implementations I imagine that this may cause difficulty, but I don't think that there is an actual technical argument against it.

Perhaps I misread or misinterpreted the goals of the proposal, and while I appreciate the clarification here, I'm still left wondering: what is the benefit of a synchronous in-worker-only messaging API if the alleged pain point is the desire to write linear code?

That aside, I agree with Oliver: if no renderer process blocking can occur and no existing APIs are broken, then there is no harm (aside from polluting the global object with more verbosely named APIs, but that's just a nit).

Rick
Re: Sync API for workers
On 09/01/2012 11:38 PM, Rick Waldron wrote: So far, they all look async. Just calling them sync doesn't make them sync. Sure they are sync. They are sync inside worker. We all know that we must not introduce new sync APIs in the main thread.
Re: Sync API for workers
On 01/09/2012 22:30, Rick Waldron wrote:

On Saturday, September 1, 2012 at 4:02 PM, Glenn Maynard wrote:

On Sat, Sep 1, 2012 at 11:49 AM, David Bruant bruan...@gmail.com wrote:

# Discussion 2
## Joshua Bell [5]
This can be done today using bidirectional postMessage, but of course this requires the Worker to then be coded in now common asynchronous JavaScript fashion, with either a tangled mess of callbacks or some sort of Promises/Futures library, which removes some of the benefits of introducing sync APIs to Workers in the first place.
= What are these benefits?

The benefit of being able to write linear code. I don't think anyone who's written complex algorithms in JavaScript can seriously dispute this as anything but a huge win.

I can seriously dispute this, as someone who is involved in research and development of JavaScript programming for hardware. Processing high volume serialport IO is relatively simple with streams and data events. It's just a matter of thinking differently about the program.

I dispute it too. I have been working with Node.js for 8 months and have written algorithms with IO. I've used promises and I'm very happy with them when it comes to readability. The code is very close to being linear. JavaScript, because of its syntax, imposes some noise (especially the function keyword, and the Q library imposes a .then/.fail), but I'm confident a language that compiles to JS could have sugar to eliminate this issue. I recommend taking a look at Roy for inspiration on that topic (start at ~8'00'') http://blip.tv/jsconf/jsconf2012-brian-mckenna-6145371 (it doesn't use promises, but adds sugar to get out of the callback hell)

David
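To make the "very close to being linear" claim concrete, here is a minimal sketch of request/reply over an asynchronous, postMessage-style channel using promises. It uses the standard `Promise` constructor rather than the Q library mentioned above, and `createRequester`, `handleReply` and the `{ id, payload }` message shape are all hypothetical names chosen for the illustration.

```javascript
// Hypothetical helper: wraps a postMessage-style `send` function so that
// each request returns a promise resolved by the matching reply.
function createRequester(send) {
  let nextId = 0;
  const pending = new Map();
  return {
    request(payload) {
      const id = nextId++;
      return new Promise((resolve) => {
        pending.set(id, resolve);
        send({ id, payload });
      });
    },
    // Call this from the onmessage handler with each incoming reply.
    handleReply(reply) {
      const resolve = pending.get(reply.id);
      if (resolve === undefined) return false; // unrelated message
      pending.delete(reply.id);
      resolve(reply.result);
      return true;
    },
    pendingCount() { return pending.size; },
  };
}

// Loopback "other side" that only records what was sent.
const sent = [];
const requester = createRequester((msg) => sent.push(msg));

// Two dependent requests, written top to bottom.
const log = [];
requester.request('first')
  .then((a) => { log.push(a); return requester.request('second'); })
  .then((b) => { log.push(b); });
```

The steps read in order, but each `.then` callback still runs in a later turn of the event loop, so the worker keeps returning to the event loop between steps, which is the behaviour the sync proposals are trying to avoid.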
Re: Sync API for workers
On Sat, Sep 1, 2012 at 5:12 PM, Olli Pettay olli.pet...@helsinki.fi wrote: On 09/01/2012 11:38 PM, Rick Waldron wrote: So far, they all look async. Just calling them sync doesn't make them sync. Sure they are sync. They are sync inside worker. We all know that we must not introduce new sync APIs in the main thread. See my response to Oliver Hunt's message Rick
Re: Sync API for workers
On Sat, Sep 1, 2012 at 3:19 PM, Rick Waldron waldron.r...@gmail.com wrote:

I can seriously dispute this, as someone who is involved in research and development of JavaScript programming for hardware. Processing high volume serialport IO is relatively simple with streams and data events. It's just a matter of thinking differently about the program.

We'll have to professionally disagree, then. I've used both models extensively, and for some tasks, such as complex algorithms, I find linear code much easier to write, understand and debug.

On Sat, Sep 1, 2012 at 3:28 PM, Olli Pettay olli.pet...@helsinki.fi wrote:

Proposal 3

Worker:
    postMessage('I want reply to this');
    var events = [];
    var m;
    while ((m = waitForMessage())) {
      if (m.data != /* the reply */) {
        events.push(m);
      } else {
        // do something with message
      }
    }
    while (events.length) {
      dispatchEvent(events.shift());
    }

The intent is much simpler:

    postMessage(foo);
    var response = getMessage();

You're correct that this wouldn't integrate all that well with libraries, since you may end up receiving an unrelated message. (Trying to deal with that is where the bulk of the above comes from--and you probably really wouldn't want to dispatch unknown events, since that's effectively making *all* events async.)

That's why I originally proposed this as a method on MessagePort. You'd create a dedicated MessagePort to block on, so you wouldn't collide with (and possibly be confused by, or discard by accident) unrelated messages. Of course, that means that while you're waiting for messages on the port, no other messages are being received, since you aren't in the event loop. That's just inherent--nothing else that happens during the event loop would take place either, like async XHR.

I'm not sure how to do the implicit deadlock prevention if this is exposed on MessagePort, though.
(I personally don't find it a problem, as I said earlier, but I know that a proposal that has that as an option will have a much better chance of being implemented than one that doesn't.)

- The message must be read in order to reply

I'm not quite following here. You could getMessage at any time, and the other thread could postMessage at any time--there doesn't have to be a "hey, send me a message" message in the first place at all. For example, a site may have a button that sends a message when clicked. You don't have to jump through hoops in order to wait for the query message to reply to, as it seems you'd have to with the reply proposals.

-- Glenn Maynard
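The dedicated-port idea in this message can be sketched with two in-memory queues. This is a simulation under stated assumptions: `SimPort` is a hypothetical stand-in for one end of a MessagePort, and its `getMessage()` returns immediately instead of blocking as the proposed method would.

```javascript
// Simulation only: each SimPort is an in-memory queue standing in for one
// end of a MessagePort; getMessage() models the proposed non-event read.
class SimPort {
  constructor() { this.queue = []; }
  postMessage(data) { this.queue.push({ data }); }
  // Returns the next message on this port only, or null if none is waiting.
  getMessage() {
    return this.queue.length > 0 ? this.queue.shift() : null;
  }
}

const mainPort = new SimPort();  // general traffic, normally seen by onmessage
const replyPort = new SimPort(); // dedicated to one request/response exchange

mainPort.postMessage('button clicked'); // an unrelated message arrives first
replyPort.postMessage('bar');           // the response being waited for

// Reading from the dedicated port cannot consume the unrelated message.
const response = replyPort.getMessage();
```

Because the read is scoped to `replyPort`, the 'button clicked' message stays queued on `mainPort` for the normal event loop, so nothing has to be deferred and re-dispatched by hand.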