Re: Web workers: synchronously handling events

2010-12-29 Thread Jonas Sicking
On Sun, Dec 26, 2010 at 4:29 PM, Glenn Maynard gl...@zewt.org wrote:
 Haven't been able to find this in the spec: is there a way to allow
 processing messages synchronously during a number-crunching worker
 thread?

 A typical use case is a CPU-intensive task that needs to be aborted
 due to user action.  For example, it might take 250ms to run a search
 on a local database.  This can be triggered on each keystroke,
 updating as the user types.  If the user enters another letter into
 the text field before the previous search completes, that search may
 no longer be needed; the running task should be cancelled so it can
 start on the new text.  However, if a cancel message is sent to the
 thread, it won't receive it until it returns from the work it's doing.

 Creating a cancellation message port to run periodically would deal
 with this nicely, where a port.dispatch() function would dispatch the
 first waiting event that came from that port, if any:

    var cancelling = false;
    cancellation_port.onmessage = function(event) { cancelling = true; }
    while(!finished && !cancelling)
    {
        if(cancellation_port.dispatch())
            continue; /* if a message was received, cancelling may have been modified */
        /* do work */
    }

 Terminating the whole worker thread is the blunt way to do it; that's
 no good since it requires starting a new thread for every keystroke,
 and there may be significant startup costs (eg. loading search data).

 The only way I know to do this currently is to return periodically to
 allow events to run, and to resume the work with setTimeout(f, 1).
 That's ugly and requires writing algorithms in a specific,
 inconvenient way that shouldn't be required in a thread.  Browsers
 clamping timeouts to a minimum of 5-10ms also breaks this approach;
 hopefully they won't do that from worker threads, but I have a feeling
 they will.

 I'd hate to be stuck with ugly messaging hacks to achieve this, eg.
 having to use other mechanisms not meant for cross-thread messaging,
 like a database.  Am I missing something in the API?

I definitely agree that workers need more features to take advantage
of the fact that they are running on their own event loop. One of
which is the one you are asking for.

We could add something like:

boolean checkPendingMessages();

which would return true if there are pending messages. The script
running in the worker could use this information to return to the
event loop to process these messages only when needed. One downside
with this API is that there is a risk that people could write:

if (checkPendingMessages()) {
  doStuff();
}
... code here ...
if (checkPendingMessages()) {
  continueToDoStuff();
}

And expect that either none of the if statements are entered, or both
are. Not realizing that the return value from checkPendingMessages
could change at any point. I'm not terribly worried about this risk,
but it's definitely there.
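
One way to sidestep that risk, sketched under assumptions: read the flag
once per decision point and keep the answer in a local variable.
checkPendingMessages() is only the proposal above, not an existing API, so
it is simulated here with a plain array; pendingQueue and processBatch are
hypothetical names used for illustration.

```javascript
// Hypothetical stand-in for the proposed API. `pendingQueue` simulates the
// worker's queue of messages that have arrived but not yet been delivered.
var pendingQueue = [];
function checkPendingMessages() {
  return pendingQueue.length > 0;
}

// Processes items until they run out or a message is pending.
// The flag is sampled into a local, and that single snapshot decides both
// when the loop stops and what is reported back to the caller.
function processBatch(items, onItem) {
  var messagesPending = checkPendingMessages();
  var done = 0;
  for (var i = 0; i < items.length && !messagesPending; ++i) {
    onItem(items[i]);
    ++done;
    messagesPending = checkPendingMessages();
  }
  return { done: done, messagesPending: messagesPending };
}
```

The caller would return to the event loop when messagesPending is true and
resume from index `done` afterwards.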

An alternative solution would be something like:

MessageInfo getMessageIfExists();

which would return an object containing the message data if a message
was pending and remove the message from the queue of pending messages.
If there are no pending messages null is returned and the message
queue remains empty. This makes it significantly harder to write code
like the above. However it might make coding somewhat more awkward
since you likely will have to deal with messages arriving two ways,
through the normal event loop and through getMessageIfExists.
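
A rough sketch of how the two arrival paths could be funnelled through one
handler, so the awkwardness stays in one place. getMessageIfExists() is
only the proposal above and is simulated with an array; incomingQueue,
handleMessage and drainPendingMessages are hypothetical names.

```javascript
// Hypothetical stand-in for the proposed getMessageIfExists():
// removes and returns the next pending message, or null if there is none.
var incomingQueue = [];
function getMessageIfExists() {
  return incomingQueue.length > 0 ? { data: incomingQueue.shift() } : null;
}

var cancelRequested = false;

// Single handler shared by both delivery paths.
function handleMessage(message) {
  if (message.data === "cancel") {
    cancelRequested = true;
  }
}

// Path 1 (normal event loop): in a real worker this would be
//   self.onmessage = handleMessage;
// Path 2 (polling from inside long-running work):
function drainPendingMessages() {
  var message;
  var count = 0;
  while ((message = getMessageIfExists()) !== null) {
    handleMessage(message);
    ++count;
  }
  return count; // number of messages handled by this call
}
```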


Unrelated to the above question, I do think we should add an API like

MessageInfo getMessage();

which would return information about the next pending message and
remove said message from the queue. If no message is pending when the
function is called, the function waits until a message arrives and
only then returns. This allows writing code which could be
considerably cleaner and easier to write since you don't have to
return to the main event loop to retrieve additional data. It doesn't
however let you solve the use case described in the initial email in
this thread. But since it's related I thought it was worth bringing up
here.

/ Jonas



Re: Web workers: synchronously handling events

2010-12-29 Thread Glenn Maynard
On Wed, Dec 29, 2010 at 4:56 AM, Jonas Sicking jo...@sicking.cc wrote:
 I definitely agree that workers need more features to take advantage
 of the fact that they are running on their own event loop. One of
 which is the one you are asking for.

 We could add something like:

 boolean checkPendingMessages();

 which would return true if there are pending messages. The script
 running in the worker could use this information to return to the
 event loop to process these messages only when needed. One downside
 with this API is that there is a risk that people could write:

Is there a problem with synchronously delivering the next pending
message event, rather than having to return all the way out to allow
it to run?  Code shouldn't need to be engineered to allow returning
and resuming in order to receive messages, as if we're still in a UI
thread.

Note that it's critical that this run only events from a specific
message port, so only targeted messages are run and not unrelated
ones.

I've used this approach with C++ worker threads and it works very
well; it allows thread work to be cancelled at specific, predetermined
points, without exposing significant threadsafety issues to the thread
itself, and allowing deeply nested algorithm code to handle
cancellation in a consistent way:

function checkForCancellation()
{
    // run all waiting messages from this port
    while(cancellationPort.runPendingMessage())
        ;
    // if a message set cancellation, stop working
    if(cancel)
        throw "Operation cancelled";
}

try {
    checkForCancellation();
    doStuff();
    for(var i = 0; i < longRunningLoop; ++i)
    {
        checkForCancellation();
        doMoreWork();
    }
} catch...
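
For anyone who wants to try the pattern today, here is a self-contained
sketch that can be run outside a browser. runPendingMessage() does not
exist on real ports; makeMockPort simulates it with an array, and
runSearch and the doWork callback are hypothetical names standing in for
the real algorithm.

```javascript
// Hypothetical port: simulates the proposed runPendingMessage(), which
// delivers one queued message to onmessage and returns it, or null if none.
function makeMockPort() {
  var port = {
    queue: [],
    onmessage: null,
    runPendingMessage: function () {
      if (port.queue.length === 0) return null;
      var event = { data: port.queue.shift() };
      if (port.onmessage) port.onmessage(event);
      return event;
    }
  };
  return port;
}

var cancellationPort = makeMockPort();
var cancel = false;
cancellationPort.onmessage = function (event) { cancel = true; };

function checkForCancellation() {
  // run all waiting messages from this port
  while (cancellationPort.runPendingMessage() !== null) {}
  // if a message set cancellation, stop working
  if (cancel) throw new Error("Operation cancelled");
}

// Runs up to `steps` units of work, checking for cancellation before each.
function runSearch(steps, doWork) {
  var completed = 0;
  try {
    for (var i = 0; i < steps; ++i) {
      checkForCancellation();
      doWork(i);
      ++completed;
    }
  } catch (e) {
    if (e.message !== "Operation cancelled") throw e; // unrelated error
    return { completed: completed, cancelled: true };
  }
  return { completed: completed, cancelled: false };
}
```

The work itself never looks at the port; only checkForCancellation() does,
which is what lets deeply nested code bail out through a single path.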

 An alternative solution would be something like:

 MessageInfo getMessageIfExists();

 which would return an object containing the message data if a message
 was pending and remove the message from the queue of pending messages.
 If there are no pending messages null is returned and the message
 queue remains empty. This makes it significantly harder to write code
 like the above. However it might make coding somewhat more awkward
 since you likely will have to deal with messages arriving two ways,
 through the normal event loop and through getMessageIfExists.

I avoided that in my suggestion, since it seems likely to cause
confusing bugs and it's hard to think of when this behavior would
really be wanted.  If it was, you could do something like this, I
think:

function getPendingMessageWithoutDelivery(port)
{
    var onmessage = port.onmessage;
    try {
        port.onmessage = null;
        // returns the handled message or null if none
        return port.runPendingMessage();
    } finally {
        port.onmessage = onmessage;
    }
}

-- 
Glenn Maynard



Re: Web workers: synchronously handling events

2010-12-29 Thread Jonas Sicking
On Wed, Dec 29, 2010 at 10:33 AM, Glenn Maynard gl...@zewt.org wrote:
 On Wed, Dec 29, 2010 at 4:56 AM, Jonas Sicking jo...@sicking.cc wrote:
 I definitely agree that workers need more features to take advantage
 of the fact that they are running on their own event loop. One of
 which is the one you are asking for.

 We could add something like:

 boolean checkPendingMessages();

 which would return true if there are pending messages. The script
 running in the worker could use this information to return to the
 event loop to process these messages only when needed. One downside
 with this API is that there is a risk that people could write:

 Is there a problem with synchronously delivering the next pending
 message event, rather than having to return all the way out to allow
 it to run?  Code shouldn't need to be engineered to allow returning
 and resuming in order to receive messages, as if we're still in a UI
 thread.

 Note that it's critical that this run only events from a specific
 message port, so only targeted messages are run and not unrelated
 ones.

 I've used this approach with C++ worker threads and it works very
 well; it allows thread work to be cancelled at specific, predetermined
 points, without exposing significant threadsafety issues to the thread
 itself, and allowing deeply nested algorithm code to handle
 cancellation in a consistent way:

 function checkForCancellation()
 {
    // run all waiting messages from this port
    while(cancellationPort.runPendingMessage())
        ;
    // if a message set cancellation, stop working
    if(cancel)
        throw "Operation cancelled";
 }

 try {
    checkForCancellation();
    doStuff();
    for(var i = 0; i < longRunningLoop; ++i)
    {
        checkForCancellation();
        doMoreWork();
    }
 } catch...

 An alternative solution would be something like:

 MessageInfo getMessageIfExists();

 which would return an object containing the message data if a message
 was pending and remove the message from the queue of pending messages.
 If there are no pending messages null is returned and the message
 queue remains empty. This makes it significantly harder to write code
 like the above. However it might make coding somewhat more awkward
 since you likely will have to deal with messages arriving two ways,
 through the normal event loop and through getMessageIfExists.

 I avoided that in my suggestion, since it seems likely to cause
 confusing bugs and it's hard to think of when this behavior would
 really be wanted.  If it was, you could do something like this, I
 think:

 function getPendingMessageWithoutDelivery(port)
 {
    var onmessage = port.onmessage;
    try {
        port.onmessage = null;
        // returns the handled message or null if none
        return port.runPendingMessage();
    } finally {
        port.onmessage = onmessage;
    }
 }

Yeah, this might be a better design. I don't have a strong preference
between your port.runPendingMessage() and my getMessageIfExists()
proposals. They both allow the same use cases to be solved. Your example
code above shows that port.runPendingMessage can be used to implement
getMessageIfExists, it can equally be shown that getMessageIfExists
can be used to implement port.runPendingMessage().
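
The reverse direction might look roughly like this. Both functions are
only proposals from this thread, so the port below is a mock: it exposes
getMessageIfExists() over a plain array, and makePollingPort is a
hypothetical helper.

```javascript
// Hypothetical port exposing only the proposed getMessageIfExists().
function makePollingPort() {
  var port = {
    queue: [],
    onmessage: null,
    getMessageIfExists: function () {
      return port.queue.length > 0 ? { data: port.queue.shift() } : null;
    }
  };
  return port;
}

// runPendingMessage built on top of getMessageIfExists: take the next
// pending message, deliver it to the port's handler if one is set, and
// return the message (or null if nothing was pending).
function runPendingMessage(port) {
  var message = port.getMessageIfExists();
  if (message !== null && typeof port.onmessage === "function") {
    port.onmessage(message);
  }
  return message;
}
```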

/ Jonas



Re: Web workers: synchronously handling events

2010-12-28 Thread Tab Atkins Jr.
On Sun, Dec 26, 2010 at 4:29 PM, Glenn Maynard gl...@zewt.org wrote:
 Haven't been able to find this in the spec: is there a way to allow
 processing messages synchronously during a number-crunching worker
 thread?

Yes, by pausing every once in a while with setTimeout and letting the
event loop spin.

Doing anything else would break javascript's appearance of single-threadedness.

I agree that it's not particularly nice to write your algorithms like
this, but it's already familiar to any js dev who uses any algorithm
with significant running time.  If we were to fix this, it needs to be
done at the language level, because there are language-level issues to
be solved that can't be hacked around by a specialized solution.

~TJ



Re: Web workers: synchronously handling events

2010-12-28 Thread Glenn Maynard
On Tue, Dec 28, 2010 at 3:06 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 Yes, by pausing every once in a while with setTimeout and letting the
 event loop spin.

 Doing anything else would break javascript's appearance of 
 single-threadedness.

I'm not suggesting that events should run on their own in the middle
of script execution.  I'm suggesting that code should be able to
explicitly, synchronously run the event loop, only for messages from a
specific message port.  I don't think this makes Javascript less
single-threaded-looking; it's analogous to doing a nonblocking read()
from a TTY in Unix, to retrieve the user's next keystroke.

 I agree that it's not particularly nice to write your algorithms like
 this, but it's already familiar to any js dev who uses any algorithm
 with significant running time.  If we were to fix this, it needs to be
 done at the language level, because there are language-level issues to
 be solved that can't be hacked around by a specialized solution.

That's a workaround for doing long-running computations in a UI
thread.  The very reason for using worker threads to do computation in
the first place is so this sort of hack isn't necessary.

-- 
Glenn Maynard



Web workers: synchronously handling events

2010-12-26 Thread Glenn Maynard
Haven't been able to find this in the spec: is there a way to allow
processing messages synchronously during a number-crunching worker
thread?

A typical use case is a CPU-intensive task that needs to be aborted
due to user action.  For example, it might take 250ms to run a search
on a local database.  This can be triggered on each keystroke,
updating as the user types.  If the user enters another letter into
the text field before the previous search completes, that search may
no longer be needed; the running task should be cancelled so it can
start on the new text.  However, if a cancel message is sent to the
thread, it won't receive it until it returns from the work it's doing.

Creating a cancellation message port to run periodically would deal
with this nicely, where a port.dispatch() function would dispatch the
first waiting event that came from that port, if any:

    var cancelling = false;
    cancellation_port.onmessage = function(event) { cancelling = true; }
    while(!finished && !cancelling)
    {
        if(cancellation_port.dispatch())
            continue; /* if a message was received, cancelling may have been modified */
        /* do work */
    }

Terminating the whole worker thread is the blunt way to do it; that's
no good since it requires starting a new thread for every keystroke,
and there may be significant startup costs (eg. loading search data).

The only way I know to do this currently is to return periodically to
allow events to run, and to resume the work with setTimeout(f, 1).
That's ugly and requires writing algorithms in a specific,
inconvenient way that shouldn't be required in a thread.  Browsers
clamping timeouts to a minimum of 5-10ms also breaks this approach;
hopefully they won't do that from worker threads, but I have a feeling
they will.
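
For reference, the workaround looks roughly like the sketch below. It uses
only setTimeout, which workers already have; runSlice, runInSlices and the
shape of the state object are hypothetical names chosen for illustration.
The algorithm has to keep all of its progress in `state` so it can be
resumed, which is exactly the inconvenience described above.

```javascript
// state: { next: index of the next item, total: item count, cancelled: flag }
// Runs at most sliceSize items, then reports whether the job is over
// (either finished or cancelled).
function runSlice(state, sliceSize, doWork) {
  var end = Math.min(state.next + sliceSize, state.total);
  while (state.next < end && !state.cancelled) {
    doWork(state.next);
    state.next++;
  }
  return state.cancelled || state.next >= state.total;
}

// Runs the whole job one slice at a time, returning to the event loop
// between slices so that a queued message handler can set state.cancelled.
function runInSlices(state, sliceSize, doWork, onDone) {
  if (runSlice(state, sliceSize, doWork)) {
    onDone(state);
    return;
  }
  setTimeout(function () {
    runInSlices(state, sliceSize, doWork, onDone);
  }, 1);
}
```

In a worker, the onmessage handler would set state.cancelled = true for
the job in flight; the next slice then stops before doing any more work.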

I'd hate to be stuck with ugly messaging hacks to achieve this, eg.
having to use other mechanisms not meant for cross-thread messaging,
like a database.  Am I missing something in the API?

-- 
Glenn Maynard