Re: Web workers: synchronously handling events
On Sun, Dec 26, 2010 at 4:29 PM, Glenn Maynard gl...@zewt.org wrote:
> Haven't been able to find this in the spec: is there a way to allow processing messages synchronously during a number-crunching worker thread?
>
> A typical use case is a CPU-intensive task that needs to be aborted due to user action. For example, it might take 250ms to run a search on a local database. This can be triggered on each keystroke, updating as the user types. If the user enters another letter into the text field before the previous search completes, that search may no longer be needed; the running task should be cancelled so it can start on the new text.
>
> However, if a cancel message is sent to the thread, it won't receive it until it returns from the work it's doing. Creating a cancellation message port to run periodically would deal with this nicely, where a port.dispatch() function would dispatch the first waiting event that came from that port, if any:
>
>     var cancelling = false;
>     cancellation_port.onmessage = function(event) { cancelling = true; }
>     while(!finished && !cancelling) {
>         if(cancellation_port.dispatch())
>             continue; /* if a message was received, cancelling may have been modified */
>         /* do work */
>     }
>
> Terminating the whole worker thread is the blunt way to do it; that's no good since it requires starting a new thread for every keystroke, and there may be significant startup costs (eg. loading search data).
>
> The only way I know to do this currently is to return periodically to allow events to run, and to resume the work with setTimeout(f, 1). That's ugly and requires writing algorithms in a specific, inconvenient way that shouldn't be required in a thread. Browsers clamping timeouts to a minimum of 5-10ms also breaks this approach; hopefully they won't do that from worker threads, but I have a feeling they will.
>
> I'd hate to be stuck with ugly messaging hacks to achieve this, eg. having to use other mechanisms not meant for cross-thread messaging, like a database.
>
> Am I missing something in the API?

I definitely agree that workers need more features to take advantage of the fact that they are running on their own event loop. The one you are asking for is one of them.

We could add something like:

    boolean checkPendingMessages();

which would return true if there are pending messages. The script running in the worker could use this information to return to the event loop to process these messages only when needed.

One downside with this API is that there is a risk that people could write:

    if (checkPendingMessages()) {
      doStuff();
    }
    ... code here ...
    if (checkPendingMessages()) {
      continueToDoStuff();
    }

and expect that either both if statements are entered or neither is, not realizing that the return value from checkPendingMessages could change at any point. I'm not terribly worried about this risk, but it's definitely there.

An alternative solution would be something like:

    MessageInfo getMessageIfExists();

which would return an object containing the message data if a message was pending, and remove that message from the queue of pending messages. If there are no pending messages, null is returned and the message queue remains empty.

This makes it significantly harder to write code like the above. However, it might make coding somewhat more awkward, since you will likely have to deal with messages arriving in two ways: through the normal event loop and through getMessageIfExists.

Unrelated to the above question, I do think we should add an API like

    MessageInfo getMessage();

which would return information about the next pending message and remove said message from the queue. If no message is pending when the function is called, the function waits until a message arrives and only then returns.

This would allow code that is considerably cleaner and easier to write, since you don't have to return to the main event loop to retrieve additional data. It doesn't, however, solve the use case described in the initial email in this thread. But since it's related I thought it was worth bringing up here.

/ Jonas
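To make the difference between the first two proposals concrete, here is a minimal sketch that simulates them on a mock queue. Neither checkPendingMessages() nor getMessageIfExists() exists in the Web Workers API; the MockMessageQueue class and its post() method are hypothetical stand-ins, and the "message arriving between two checks" is simulated by an explicit call. The blocking getMessage() proposal is left out because it cannot be simulated on a single thread.

```javascript
// Hypothetical stand-in for a worker's incoming message queue.
// None of these method names exist in the Web Workers API; they mirror
// the proposals in the message above.
class MockMessageQueue {
  constructor() {
    this.pending = [];
  }

  // Simulates another thread posting a message to this worker.
  post(data) {
    this.pending.push({ data: data });
  }

  // Proposal 1: report whether anything is waiting, without consuming it.
  checkPendingMessages() {
    return this.pending.length > 0;
  }

  // Proposal 2: remove and return the oldest message, or null if none.
  getMessageIfExists() {
    return this.pending.length > 0 ? this.pending.shift() : null;
  }
}

const queue = new MockMessageQueue();

// The pitfall with proposal 1: two checks can disagree, because a message
// may arrive between them.
const firstCheck = queue.checkPendingMessages();
queue.post("cancel"); // arrives while "... code here ..." is running
const secondCheck = queue.checkPendingMessages();

// Proposal 2 hands back the message itself, so the caller acts on one
// answer instead of re-asking a question whose answer may have changed.
const message = queue.getMessageIfExists();
const afterDrain = queue.getMessageIfExists();

console.log(firstCheck, secondCheck, message.data, afterDrain);
```

In this simulation the two checkPendingMessages() calls return different values, which is exactly the mismatch the message warns about.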
Re: Web workers: synchronously handling events
On Wed, Dec 29, 2010 at 4:56 AM, Jonas Sicking jo...@sicking.cc wrote:
> I definitely agree that workers need more features to take advantage of the fact that they are running on their own event loop. One of which is the one you are asking for.
>
> We could add something like:
>
>     boolean checkPendingMessages();
>
> which would return true if there are pending messages. The script running in the worker could use this information to return to the event loop to process these messages only when needed.
>
> One downside with this API is that there is a risk that people could write:

Is there a problem with synchronously delivering the next pending message event, rather than having to return all the way out to allow it to run? Code shouldn't need to be engineered to allow returning and resuming in order to receive messages, as if we're still in a UI thread.

Note that it's critical that this run only events from a specific message port, so only targeted messages are run and not unrelated ones.

I've used this approach with C++ worker threads and it works very well; it allows thread work to be cancelled at specific, predetermined points, without exposing significant thread-safety issues to the thread itself, and it lets deeply nested algorithm code handle cancellation in a consistent way:

    function checkForCancellation() {
        // run all waiting messages from this port
        while(cancellationPort.runPendingMessage())
            ;
        // if a message set cancellation, stop working
        if(cancel)
            throw "Operation cancelled";
    }

    try {
        checkForCancellation();
        doStuff();
        for(var i = 0; i < longRunningLoop; ++i) {
            checkForCancellation();
            doMoreWork();
        }
    } catch...

> An alternative solution would be something like:
>
>     MessageInfo getMessageIfExists();
>
> which would return an object containing the message data if a message was pending and remove the message from the queue of pending messages. If there are no pending messages null is returned and the message queue remains empty.
>
> This makes it significantly harder to write code like the above. However it might make coding somewhat more awkward since you likely will have to deal with messages arriving two ways, through the normal event loop and through getMessageIfExists.

I avoided that in my suggestion, since it seems likely to cause confusing bugs and it's hard to think of when this behavior would really be wanted. If it was, you could do something like this, I think:

    function getPendingMessageWithoutDelivery(port) {
        var onmessage = port.onmessage;
        try {
            port.onmessage = null;
            return port.runPendingMessage(); // returns the handled message or null if none
        } finally {
            port.onmessage = onmessage;
        }
    }

-- 
Glenn Maynard
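The cancellation pattern above can be tried out against a mock port. This is only a sketch under stated assumptions: runPendingMessage() is the method proposed in this thread and is not part of the real MessagePort interface, and MockPort, deliver() and crunch() are hypothetical names invented for the illustration. The cancel request is injected by hand at a chosen step, since a single-threaded simulation has no second thread to send it.

```javascript
// Hypothetical port: runPendingMessage() is the method proposed in this
// thread, not part of the real MessagePort interface.
class MockPort {
  constructor() {
    this.pending = [];
    this.onmessage = null;
  }

  // Simulates the other side posting a message.
  deliver(data) {
    this.pending.push({ data: data });
  }

  // Synchronously dispatches the oldest pending message to onmessage.
  // Returns the handled message, or null if the queue was empty.
  runPendingMessage() {
    if (this.pending.length === 0) return null;
    const event = this.pending.shift();
    if (typeof this.onmessage === "function") this.onmessage(event);
    return event;
  }
}

const cancellationPort = new MockPort();
let cancel = false;
cancellationPort.onmessage = function () { cancel = true; };

function checkForCancellation() {
  // Run all waiting messages from this port only.
  while (cancellationPort.runPendingMessage() !== null) { /* drain */ }
  // If a message set the flag, unwind out of the computation.
  if (cancel) throw new Error("Operation cancelled");
}

// Work code only needs to call checkForCancellation() at safe points.
function crunch(steps, cancelAtStep) {
  let completed = 0;
  try {
    for (let i = 0; i < steps; ++i) {
      // Simulate the user's cancel request arriving mid-computation.
      if (i === cancelAtStep) cancellationPort.deliver("cancel");
      checkForCancellation();
      completed++;
    }
    return { cancelled: false, completed: completed };
  } catch (e) {
    if (e.message !== "Operation cancelled") throw e; // unrelated error
    return { cancelled: true, completed: completed };
  }
}

const result = crunch(1000, 3);
console.log(result);
```

Because the exception unwinds the stack, the loop body does not need to thread a "was I cancelled?" return value through every nested call, which is the consistency argument made in the message.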
Re: Web workers: synchronously handling events
On Wed, Dec 29, 2010 at 10:33 AM, Glenn Maynard gl...@zewt.org wrote:
> On Wed, Dec 29, 2010 at 4:56 AM, Jonas Sicking jo...@sicking.cc wrote:
>> I definitely agree that workers need more features to take advantage of the fact that they are running on their own event loop. One of which is the one you are asking for.
>>
>> We could add something like:
>>
>>     boolean checkPendingMessages();
>>
>> which would return true if there are pending messages. The script running in the worker could use this information to return to the event loop to process these messages only when needed.
>>
>> One downside with this API is that there is a risk that people could write:
>
> Is there a problem with synchronously delivering the next pending message event, rather than having to return all the way out to allow it to run? Code shouldn't need to be engineered to allow returning and resuming in order to receive messages, as if we're still in a UI thread.
>
> Note that it's critical that this run only events from a specific message port, so only targeted messages are run and not unrelated ones.
>
> I've used this approach with C++ worker threads and it works very well; it allows thread work to be cancelled at specific, predetermined points, without exposing significant thread-safety issues to the thread itself, and it lets deeply nested algorithm code handle cancellation in a consistent way:
>
>     function checkForCancellation() {
>         // run all waiting messages from this port
>         while(cancellationPort.runPendingMessage())
>             ;
>         // if a message set cancellation, stop working
>         if(cancel)
>             throw "Operation cancelled";
>     }
>
>     try {
>         checkForCancellation();
>         doStuff();
>         for(var i = 0; i < longRunningLoop; ++i) {
>             checkForCancellation();
>             doMoreWork();
>         }
>     } catch...
>
>> An alternative solution would be something like:
>>
>>     MessageInfo getMessageIfExists();
>>
>> which would return an object containing the message data if a message was pending and remove the message from the queue of pending messages. If there are no pending messages null is returned and the message queue remains empty.
>>
>> This makes it significantly harder to write code like the above. However it might make coding somewhat more awkward since you likely will have to deal with messages arriving two ways, through the normal event loop and through getMessageIfExists.
>
> I avoided that in my suggestion, since it seems likely to cause confusing bugs and it's hard to think of when this behavior would really be wanted. If it was, you could do something like this, I think:
>
>     function getPendingMessageWithoutDelivery(port) {
>         var onmessage = port.onmessage;
>         try {
>             port.onmessage = null;
>             return port.runPendingMessage(); // returns the handled message or null if none
>         } finally {
>             port.onmessage = onmessage;
>         }
>     }

Yeah, this might be a better design.

I don't have a strong preference between your port.runPendingMessage() and my getMessageIfExists() proposals. They both allow the same use cases to be solved. Your example code above shows that port.runPendingMessage() can be used to implement getMessageIfExists(); it can equally be shown that getMessageIfExists() can be used to implement port.runPendingMessage().

/ Jonas
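The reverse direction mentioned here is not spelled out in the thread, so the following is one possible sketch of it rather than anything either author wrote. It assumes a hypothetical port that offers only getMessageIfExists(); PollingPort and deliver() are invented names for the mock, and neither proposed method exists in the real API.

```javascript
// Hypothetical port exposing only the getMessageIfExists() proposal.
class PollingPort {
  constructor() {
    this.pending = [];
    this.onmessage = null;
  }

  // Simulates the other side posting a message.
  deliver(data) {
    this.pending.push({ data: data });
  }

  // Removes and returns the oldest pending message, or null if none.
  getMessageIfExists() {
    return this.pending.length > 0 ? this.pending.shift() : null;
  }
}

// runPendingMessage() built on top of getMessageIfExists(): take one
// message off the queue and hand it to the port's handler, if there is one.
// Returns the handled message, or null if nothing was pending.
function runPendingMessage(port) {
  const message = port.getMessageIfExists();
  if (message !== null && typeof port.onmessage === "function") {
    port.onmessage(message);
  }
  return message;
}

const port = new PollingPort();
const seen = [];
port.onmessage = function (event) { seen.push(event.data); };
port.deliver("a");
port.deliver("b");

const first = runPendingMessage(port);
const second = runPendingMessage(port);
const third = runPendingMessage(port); // queue is empty by now

console.log(seen, third);
```

Together with getPendingMessageWithoutDelivery() in the quoted message, this suggests the two proposals differ in ergonomics rather than in what they make possible.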
Re: Web workers: synchronously handling events
On Sun, Dec 26, 2010 at 4:29 PM, Glenn Maynard gl...@zewt.org wrote:
> Haven't been able to find this in the spec: is there a way to allow processing messages synchronously during a number-crunching worker thread?

Yes, by pausing every once in a while with setTimeout and letting the event loop spin. Doing anything else would break JavaScript's appearance of single-threadedness.

I agree that it's not particularly nice to write your algorithms like this, but it's already familiar to any JS dev who uses any algorithm with significant running time. If we were to fix this, it would need to be done at the language level, because there are language-level issues to be solved that can't be hacked around by a specialized solution.

~TJ
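For reference, the setTimeout approach described here might look roughly like the sketch below. The helper name runInSlices and its parameters are hypothetical. The schedule argument defaults to a real setTimeout call and is injectable only so the example can be exercised without timers; in a worker, the isCancelled callback would read a flag set by an onmessage handler.

```javascript
// Runs `step` repeatedly in slices, returning to the event loop between
// slices so that queued message events (such as a cancel request) can fire.
// `step` returns true once the work is complete.
// `schedule` defaults to setTimeout; it is a parameter only so the sketch
// can be driven without real timers.
function runInSlices(step, isCancelled, onDone, sliceSize, schedule) {
  schedule = schedule || function (fn) { setTimeout(fn, 0); };

  function slice() {
    if (isCancelled()) {
      onDone("cancelled");
      return;
    }
    for (let i = 0; i < sliceSize; ++i) {
      if (step()) {
        onDone("finished");
        return;
      }
    }
    schedule(slice); // yield, then resume where we left off
  }

  slice();
}

// Usage: add up 1..10 three numbers at a time, with a manual scheduler
// standing in for the event loop.
const tasks = [];
let total = 0;
let next = 1;
let outcome = null;
let cancelled = false;

runInSlices(
  function () { total += next++; return next > 10; },
  function () { return cancelled; },
  function (status) { outcome = status; },
  3,
  function (fn) { tasks.push(fn); }
);

// The first slice has run. A cancel "message" arrives before the next one.
cancelled = true;
while (tasks.length > 0) tasks.shift()();

console.log(outcome, total);
```

The cost raised elsewhere in the thread is visible even in this small sketch: the algorithm has to keep its state outside the loop so it can be resumed, and cancellation is only noticed at slice boundaries, whose spacing depends on how the browser clamps timeouts.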
Re: Web workers: synchronously handling events
On Tue, Dec 28, 2010 at 3:06 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
> Yes, by pausing every once in a while with setTimeout and letting the event loop spin. Doing anything else would break javascript's appearance of single-threadedness.

I'm not suggesting that events should run on their own in the middle of script execution. I'm suggesting that code should be able to explicitly, synchronously run the event loop, only for messages from a specific message port. I don't think this makes JavaScript less single-threaded-looking; it's analogous to doing a nonblocking read() from a TTY in Unix, to retrieve the user's next keystroke.

> I agree that it's not particularly nice to write your algorithms like this, but it's already familiar to any js dev who uses any algorithm with significant running time. If we were to fix this, it needs to be done at the language level, because there are language-level issues to be solved that can't be hacked around by a specialized solution.

That's a workaround for doing long-running computations in a UI thread. The very reason for using worker threads to do computation in the first place is so this sort of hack isn't necessary.

-- 
Glenn Maynard
Web workers: synchronously handling events
Haven't been able to find this in the spec: is there a way to allow processing messages synchronously during a number-crunching worker thread?

A typical use case is a CPU-intensive task that needs to be aborted due to user action. For example, it might take 250ms to run a search on a local database. This can be triggered on each keystroke, updating as the user types. If the user enters another letter into the text field before the previous search completes, that search may no longer be needed; the running task should be cancelled so it can start on the new text.

However, if a cancel message is sent to the thread, it won't receive it until it returns from the work it's doing. Creating a cancellation message port to run periodically would deal with this nicely, where a port.dispatch() function would dispatch the first waiting event that came from that port, if any:

    var cancelling = false;
    cancellation_port.onmessage = function(event) { cancelling = true; }
    while(!finished && !cancelling) {
        if(cancellation_port.dispatch())
            continue; /* if a message was received, cancelling may have been modified */
        /* do work */
    }

Terminating the whole worker thread is the blunt way to do it; that's no good since it requires starting a new thread for every keystroke, and there may be significant startup costs (eg. loading search data).

The only way I know to do this currently is to return periodically to allow events to run, and to resume the work with setTimeout(f, 1). That's ugly and requires writing algorithms in a specific, inconvenient way that shouldn't be required in a thread. Browsers clamping timeouts to a minimum of 5-10ms also breaks this approach; hopefully they won't do that from worker threads, but I have a feeling they will.

I'd hate to be stuck with ugly messaging hacks to achieve this, eg. having to use other mechanisms not meant for cross-thread messaging, like a database.

Am I missing something in the API?

-- 
Glenn Maynard