Re: Sync API for workers

2013-10-20 Thread pira...@gmail.com


  Look at the IndexedDB API: since the asynchronous one was enough for everybody,
 the synchronous one was not implemented by the browsers and now it has become
 deprecated... :-)

 Well regarding my position I would not smile ;-)
 I was considering a server-side implementation of indexedDB. There is
 currently the indexeddb-js project for Node using sqlite-3


I have in my background todo list to develop one of them ;-) In this case,
using LevelDB as the base database instead of SQLite (I'm a fan of it, but this
is not the correct use case for it).


and should be considered by clients as remote workers (the server let debug
 those contexts via Web Inspector).

Interesting concept, it seems we both see WebSockets and WebWorkers as
cousins (the only difference in the API is that one uses send() and the other
postMessage()). Have you tried to propose these RemoteWorkers as a
standard?



 I like the JavaScript EventLoop strength for async coding, and I
 appreciate the upcoming promises, but still, like some people on this list, I
 think that synchronous code is more user friendly.


For general purposes and as a general statement, yes, it's more user
friendly, but for some use cases asynchronous code is more efficient and, up
to a point, more user friendly once you understand it correctly.



-- 
If you want to travel around the world and be invited to speak at a lot
of different places, just write a Unix operating system.
– Linus Torvalds, creator of the Linux operating system


Re: Sync API for workers

2013-10-14 Thread David Rajchenbach-Teller
Let me introduce the first sketch of a variant. The general idea is to
add a |postSyncMessage|

We extend DedicatedWorkerGlobalScope and MessageEvent as follows:

interface DedicatedWorkerGlobalScope : WorkerGlobalScope {
  void postMessage(any message, optional sequence<Transferable> transfer);
  any postSyncMessage(any message, optional sequence<Transferable> transfer);
};

interface SyncMessageEvent : MessageEvent {
  void reply(optional any message, optional sequence<Transferable> transfer);
};

The behavior of |postSyncMessage| is the following:
1. the sender worker sleeps and does not handle any |postMessage|
messages until it is awakened;
2. instead of the usual |MessageEvent|, the target's |onmessage|
receives as argument a |SyncMessageEvent| (call it |s|);
3. if |s.reply(x)| is called, the sender's |postSyncMessage| method
returns a copy of |x|, obtained with the usual algorithm;
5. if |s.reply()| has not been called by the time the worker is either
garbage-collected or |terminate()| is called on its |MessagePort|, the
worker is killed as usual.
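
The reply semantics of steps 2 and 3 can be modelled in a single thread.
This is only a sketch under stated assumptions: no browser implements
|postSyncMessage|, the names makeTarget, addListener and copy are
hypothetical, a JSON round-trip stands in for "the usual algorithm", and
the "first reply wins" rule is an assumption the proposal does not state.
The real proposal would put the sender to sleep across threads, which this
model does not attempt.

```javascript
// Hypothetical single-threaded model of the proposed semantics.
// Stand-in for structured cloning: a JSON round-trip copy.
const copy = (v) => (v === undefined ? undefined : JSON.parse(JSON.stringify(v)));

function makeTarget() {
  const listeners = [];
  return {
    addListener(fn) { listeners.push(fn); },
    // Models postSyncMessage: returns only what a listener passed to reply().
    postSyncMessage(message) {
      let replied = false;
      let result;
      const event = {
        data: copy(message),      // the target sees a copy of the message
        reply(value) {
          if (replied) return;    // assumption: the first reply wins
          replied = true;
          result = copy(value);   // the sender sees a copy of the reply
        },
      };
      for (const fn of listeners) fn(event);
      if (!replied) throw new Error("no listener called reply()");
      return result;
    },
  };
}

const target = makeTarget();
target.addListener((s) => s.reply({ sum: s.data.a + s.data.b }));

const answer = target.postSyncMessage({ a: 2, b: 3 });
console.log(answer.sum); // 5
```

The early return in reply() is one possible answer to what should happen
when several listeners reply; throwing on the second call would be another.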

I have not attempted to detail the inner workings of the underlying
MessagePort, but I suspect that this is close to Jonas Sicking's proposal.

Cheers,
 David



Re: Sync API for workers

2013-10-14 Thread David Rajchenbach-Teller
On 10/13/13 4:21 PM, James Greene wrote:
 a) is necessary, but for b) it is sufficient for the sync thread to be
 able to sleep until a condition/mutex/... is lifted
 
 In other words, your clarification is completely true but my initial
 statement was written with regard to client-side JavaScript, which
 cannot sleep. As such, I believe my original assertions are still
 correct with regard to writing a sync wrapper in JS.

My apologies, I had obviously misunderstood your initial statement. I
was thinking of an extension of the Worker API and how to implement it
at little CPU/battery cost.

Cheers,
 David

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla



Re: Sync API for workers

2013-10-14 Thread Alon Zakai

- Original Message -
 From: David Bruant bruan...@gmail.com
 To: Jonas Sicking jo...@sicking.cc
 Cc: public-webapps public-webapps@w3.org, aza...@mozilla.com
 Sent: Sunday, October 13, 2013 1:36:22 PM
 Subject: Re: Sync API for workers
 
  * You could solve the use case of compile-to-JS for code that uses
  sync APIs using yield. However it requires changing all functions into
  generators, and all function calls into yield* statements.

 All? As in all functions in the application? That sounds like too
 violent a constraint, especially if a small proportion of the code uses
 sync functions. Maybe only the functions that may call a sync function
 need to be changed to generators... oh... hmm... I don't know.
 Taking the liberty to cc Alon Zakai to ask for his expert opinion on
 this topic.
 

Not sure about all the context here. In general, the idea of using yield or CPS 
to handle synchronous code has come up in emscripten, but no one has done work 
to implement it, so we don't have a concrete answer for how practical it would 
be. My guess, however, is that it would not be very practical, because large
codebases can have sync code anywhere, and relying on static analysis to 
simplify that so it is mostly not a factor is very optimistic. CPS all the time 
would likely be too slow; yield all the time I am less clear on because I am 
not sure the implementations are mature enough to benchmark yet (and no 
implementation at all in IE and Safari last I heard) - we would need to ask JS 
engine devs on that.

- Alon




Re: Sync API for workers

2013-10-14 Thread Jonas Sicking
On Mon, Oct 14, 2013 at 2:33 AM, David Rajchenbach-Teller
dtel...@mozilla.com wrote:
 Let me introduce the first sketch of a variant. The general idea is to
 add a |postSyncMessage|

 We extend DedicatedWorkerGlobalScope and MessageEvent as follows:

 interface DedicatedWorkerGlobalScope : WorkerGlobalScope {
   void postMessage(any message, optional sequence<Transferable> transfer);
   any postSyncMessage(any message, optional sequence<Transferable> transfer);
 };

 interface SyncMessageEvent : MessageEvent {
   void reply(optional any message, optional sequence<Transferable> transfer);
 };

This API was suggested by Olli way up in this thread. It has a few downsides:

1. It only allows a single synchronous message channel. That means
that if you have several libraries that all need synchronous
communication with the parent they have to coordinate on some way to
distinguish each other's messages. The fact that Gecko hasn't had
MessageChannel support has resulted in the same problem for
asynchronous communication and that has been a big headache for
developers.
2. It doesn't support streaming return values. I.e. you can't send
multiple return values from a single postSyncMessage call.
3. It doesn't allow direct synchronous communication between a worker
and the worker's grandparent. Every single message has to be
individually routed through the parent.
4. What happens if you have multiple event listeners in the parent and
several of them call .reply()?
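
To make the first point concrete, here is a rough sketch of the
coordination that a single shared channel forces on libraries. Everything
in it is hypothetical (makeRouter, the `channel` tag, the two example
libraries); it only illustrates that every library has to agree on one
envelope format and one registry.

```javascript
// Hypothetical router: every library must agree on the `channel` tag.
function makeRouter() {
  const handlers = new Map();
  return {
    register(channel, handler) {
      if (handlers.has(channel)) {
        throw new Error("channel already taken: " + channel);
      }
      handlers.set(channel, handler);
    },
    // Would be the body of the parent's single message listener.
    dispatch(envelope) {
      const handler = handlers.get(envelope.channel);
      if (!handler) {
        throw new Error("no library registered for " + envelope.channel);
      }
      return handler(envelope.payload);
    },
  };
}

const router = makeRouter();
router.register("storage", (key) => "value-for-" + key);
router.register("crypto", (text) => text.length);

const fromStorage = router.dispatch({ channel: "storage", payload: "user" });
const fromCrypto = router.dispatch({ channel: "crypto", payload: "abcd" });
console.log(fromStorage, fromCrypto); // value-for-user 4
```

With one MessageChannel per library none of this shared bookkeeping would
be needed, since each library would own its own port.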

I wouldn't say that all of these are killer issues. I do think the
first one is though. And the other three are clearly downsides.

All in all I think the added complexity in the later proposal is worth it.

/ Jonas



Re: Sync API for workers

2013-10-14 Thread Glenn Maynard
You snipped the comment about waitForMessage().  I think it should return
an Event, as if the message had been received from onmessage, not just the
received data.

On Sun, Oct 13, 2013 at 10:37 PM, Jonas Sicking jo...@sicking.cc wrote:

 This is certainly an improvement over the previous proposal. However
 given that synchronous APIs of any type are quite controversial, I'd
 rather stick to a basic approach for now.


There's nothing controversial about synchronous APIs in workers.  Doing
work synchronously is the whole point.

The nice thing about your proposal is that it's strictly additive, so
 it's something we can add later if there's agreement that the problems
 it aims to solve are problems that need solving, and there's agreement
 that the proposal is the right way to solve them.


This will cause people to learn to structure their workers poorly, and to
create worker libraries based on that structure, with extra message
relaying infrastructure to work around this, and pollute people's
still-immature understanding of message ports.  We should do it right in
the first place.


On Mon, Oct 14, 2013 at 4:33 AM, David Rajchenbach-Teller 
dtel...@mozilla.com wrote:

 Let me introduce the first sketch of a variant. The general idea is to
 add a |postSyncMessage|


(I'm not sure what problems with the existing proposals this is trying to
solve.)

-- 
Glenn Maynard


Re: Sync API for workers

2013-10-14 Thread David Rajchenbach-Teller
This was meant to be a more limited and well-behaved variant.
However, as pointed out by Jonas, a very similar proposal has been
submitted and discussed long before I joined this list. So, please
disregard my proposal, it is an artifact of me not searching the
archives well enough.

Best regards,
 David



-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla



Re: Sync API for workers

2013-10-14 Thread James Greene
Could we change the method name under discussion to `postMessageSync`
instead of `postSyncMessage`? I know they're not grammatically equivalent
but I've always found the *Sync suffixes used on pertinent Node.js APIs to
be much more intuitive than trying to guess which position within a string
of words it should take.

Not sure on prior art within the web platform.




Re: Sync API for workers

2013-10-13 Thread David Rajchenbach-Teller
On 10/12/13 3:48 PM, James Greene wrote:
 You can only build a synchronous API on top of an asynchronous API if
 they are (a) running in separate threads/processes AND (b) the sync
 thread can synchronously poll (busy loop) for the progress/completion of
 the async thread.

a) is necessary, but for b) it is sufficient for the sync thread to be
able to sleep until a condition/mutex/... is lifted


-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla



Re: Sync API for workers

2013-10-13 Thread James Greene
Thanks for adding clarification. That CAN be true but it depends on the
environment [so far as I can see].

For example, such an API wrapper couldn't be built in today's client-side
JavaScript because the UI thread can't do a synchronous yielding sleep
but rather can only do a synchronous blocking wait, which means it wouldn't
yield to allow for the Worker thread to asynchronously respond and toggle
such a condition/mutex/etc. unless such can be synchronously requested by
the blocking thread from within the busy wait loop (e.g.
`processEvents();`) as browsers won't interrupt the synchronous flow of the
JS busy loop to trigger `onmessage` handlers for async messages sent from
the Worker.
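
A small sketch of the behaviour described above. It uses a timer callback
as a stand-in for a Worker's `onmessage` delivery (an assumption made so
the snippet is self-contained); the variable names are hypothetical. The
busy loop gives up after 50 ms so that it terminates.

```javascript
// Stand-in for a Worker reply: a callback queued on the event loop.
let conditionLifted = false;
setTimeout(() => { conditionLifted = true; }, 0);

// Busy-wait for up to 50 ms hoping the callback flips the flag.
const deadline = Date.now() + 50;
let spins = 0;
while (!conditionLifted && Date.now() < deadline) {
  spins++;
}

// The loop ended because of the deadline: the queued callback cannot run
// until the current synchronous code has finished.
console.log(conditionLifted); // false
```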

If I'm mistaken, please consider providing a code snippet, gist, etc. to
get me back on track. Thanks!



Re: Sync API for workers

2013-10-13 Thread Kyaw Tun
Actually, only IDBRequest needs to be sync; in its async form it is prone to
error and complicates the workflow. The async workflow for opening the
database and requesting transactions is fine.

Kyaw


Re: Sync API for workers

2013-10-13 Thread James Greene
 a) is necessary, but for b) it is sufficient for the sync thread to be
 able to sleep until a condition/mutex/... is lifted

In other words, your clarification is completely true but my initial
statement was written with regard to client-side JavaScript, which cannot
sleep. As such, I believe my original assertions are still correct with
regard to writing a sync wrapper in JS.




Re: Sync API for workers

2013-10-13 Thread pira...@gmail.com
JavaScript now has support for yield statements the same way Python does;
that's a way to stop (i.e. sleep) the execution of a script to allow another
to work, and restart from there. It's not their main function, but they allow
creating what are called greenlets (green threads), and that's how I have seen
sync APIs built on top of async ones...
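
As a rough illustration of the green-thread idea, assuming ES6 generator
syntax: each task below is a generator, and a hypothetical round-robin
scheduler (runAll) resumes them in turn. The names worker and runAll are
made up for this sketch. Note that the pausing only happens inside
generator functions.

```javascript
// Each "green thread" is a generator; yield hands control back to the scheduler.
function* worker(name, steps, log) {
  for (let i = 1; i <= steps; i++) {
    log.push(name + i);
    yield; // pause here; resumed by the scheduler's next() call
  }
}

// Hypothetical round-robin scheduler: resume each task in turn until all finish.
function runAll(tasks) {
  const queue = tasks.slice();
  while (queue.length > 0) {
    const task = queue.shift();
    const { done } = task.next();
    if (!done) queue.push(task);
  }
}

const log = [];
runAll([worker("a", 2, log), worker("b", 3, log)]);
console.log(log.join(" ")); // a1 b1 a2 b2 b3
```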




Re: Sync API for workers

2013-10-13 Thread James Greene
Oh, does `yield` work anywhere? I thought it was only for use within
generators. Admittedly, I haven't been keeping up with the latest ES6
changes.




Re: Sync API for workers

2013-10-13 Thread pira...@gmail.com
I don't know, I only know the behavior of the Python yield statement, but the
JavaScript one was developed following it and I'm 90% sure it follows the same
behaviour (almost all new JavaScript features are being borrowed from Python,
since it seems Mozilla's JavaScript implementors are ex-Python programmers on
purpose) so yes, I believe it should work this way :-)




Re: Sync API for workers

2013-10-13 Thread David Rajchenbach-Teller
On 10/13/13 6:33 PM, pira...@gmail.com wrote:
 I don't know, I only know the behavior of the Python yield statement, but
 the JavaScript one was developed following it and I'm 90% sure it follows
 the same behaviour (almost all new JavaScript features are
 being borrowed from Python, since it seems Mozilla's JavaScript implementors
 are ex-Python programmers on purpose) so yes, I believe it should work
 this way :-)

It's slightly more complex than [my understanding of] your original
phrasing, but in a word, yes, it behaves essentially as in Python.

e.g., using Task.js [1], with a proper (and trivial) definition of
wait(), the following implements a polling loop that does not block the
event loop:

Tasks.spawn(function* () {
  while (true) {
    yield wait();
    poll();
  }
});

[1] http://taskjs.org/

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla



Re: Sync API for workers

2013-10-13 Thread pira...@gmail.com
Demonstration by example, thanks :-)




-- 
If you want to travel around the world and be invited to speak at a
lot of different places, just write a Unix operating system.
– Linus Torvalds, creator of the Linux operating system



Re: Sync API for workers

2013-10-13 Thread Jonas Sicking
Ok, this thread is clearly heading off the deep end. Let me clear up a
few points of confusion:

* You can not wrap a truly synchronous library around an asynchronous
API. Spinning the event loop gets you close, but breaks
run-to-completion. Furthermore, spinning the event loop is irrelevant
as we don't have an API to do that, nor are we planning to introduce
one.
* yield only works within generators in JS.
* You could solve the use case of compile-to-JS for code that uses
sync APIs using yield. However it requires changing all functions into
generators, and all function calls into yield* statements. That comes
at a performance overhead that is significant enough as to make it an
unacceptable solution (several times slower in current
implementations).
* You could likewise solve the compile-to-JS use case by generating,
instead of plain JS, JS that implements a virtual machine that
runs the compiled code. This would allow pausing the virtual machine
whenever an async call is happening. The performance overhead here is
simply too large (in fact, that's essentially what using generators
and yield* does).
* yield would not solve the use-case of allowing libraries that use
features from the main thread as it, again, would require a rewrite of
all code that directly or indirectly uses that library to change all
functions into generators and all function calls into yield*.
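
A short sketch of why the rewrite spreads through the whole call chain.
All names are hypothetical (readValue stands in for a sync API, drive for
whatever would resume the code once an async reply arrives); the driver
here answers requests immediately from a plain object so the example
stays self-contained.

```javascript
// Stand-in for a sync API call: the generator yields a request and is
// resumed with the result once the driver has it.
function* readValue(key) {
  const value = yield { type: "read", key };
  return value;
}

// Calls the "sync" API directly, so it must be a generator.
function* loadConfig() {
  const port = yield* readValue("port");
  return { port: Number(port) };
}

// Only calls it indirectly, yet it must be a generator too,
// and the call site must use yield*.
function* startServer() {
  const config = yield* loadConfig();
  return "listening on " + config.port;
}

// Hypothetical driver; a real one would resume after an async reply.
function drive(gen, store) {
  let step = gen.next();
  while (!step.done) {
    step = gen.next(store[step.value.key]);
  }
  return step.value;
}

const result = drive(startServer(), { port: "8080" });
console.log(result); // listening on 8080
```

If startServer were left as a plain function, calling loadConfig() would
just return an unstarted generator object instead of the configuration.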

/ Jonas




Re: Sync API for workers

2013-10-13 Thread David Bruant

Le 13/10/2013 21:39, Jonas Sicking a écrit :

 Ok, this thread is clearly heading off the deep end. Let me clear up a
 few points of confusion:

 * You can not wrap a truly synchronous library around an asynchronous
 API. Spinning the event loop gets you close, but breaks
 run-to-completion. Furthermore, spinning the event loop is irrelevant
 as we don't have an API to do that, nor are we planning to introduce
 one.
 * yield only works within generators in JS.

To be honest, I feel generators will be interoperably deployed
cross-browser long before sync APIs in workers. V8 and SpiderMonkey
already have generators. I'm not entirely sure the SpiderMonkey
implementation is 100% compliant yet, but it is certainly being actively
worked on and should be soon, if it isn't already.

 * You could solve the use case of compile-to-JS for code that uses
 sync APIs using yield. However it requires changing all functions into
 generators, and all function calls into yield* statements.

All? As in all functions in the application? That sounds like too
violent a constraint, especially if a small proportion of the code uses
sync functions. Maybe only the functions that may call a sync function
need to be changed to generators... oh... hmm... I don't know.
Taking the liberty to cc Alon Zakai to ask for his expert opinion on
this topic.

 That comes
 at a performance overhead that is significant enough as to make it an
 unacceptable solution (several times slower in current
 implementations).

I guess this point depends on the previous one. Given that compile-to-JS
has the wind behind it these days, generators may benefit from optimizations.

 * yield would not solve the use-case of allowing libraries that use
 features from the main thread as it, again, would require a rewrite of
 all code that directly or indirectly uses that library to change all
 functions into generators and all function calls into yield*.

I think this point is about interoperability between main thread and
worker. I don't remember this point being discussed too much yet.

What about exposing both async and sync APIs to workers?

David



Re: Sync API for workers

2013-10-13 Thread Jonas Sicking
On Sun, Oct 13, 2013 at 1:36 PM, David Bruant bruan...@gmail.com wrote:
 On 13/10/2013 21:39, Jonas Sicking wrote:

 Ok, this thread is clearly heading off the deep end. Let me clear up a
 few points of confusion:

 * You can not wrap a truly synchronous library around an asynchronous
 API. Spinning the event loop gets you close, but breaks
 run-to-completion. Furthermore, spinning the event loop is irrelevant
 as we don't have an API to do that, nor are we planning to introduce
 one.
 * yield only works within generators in JS.

 To be honest, I feel generators will be interoperably deployed cross-browser
 long before sync APIs in workers. V8 and SpiderMonkey already have
 generators. I'm not entirely sure it's 100% compliant in SpiderMonkey yet,
 but for sure it's being actively worked on and should be soon if not yet.

I would expect that too, but that's beside the point given that yield
doesn't actually solve the problem. See below.

 * You could solve the use case of compile-to-JS for code that uses
 sync APIs using yield. However it requires changing all functions into
 generators, and all function calls into yield* statements.

 All? As in all functions in the application? That sounds like too harsh a
 constraint, especially if only a small proportion of the code uses sync
 functions. Maybe only the functions that may call a sync function need to be
 changed to generators... oh... hmm... I don't know.

All as in all that directly *or indirectly* call the sync API.
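
To make "directly or indirectly" concrete, here is a minimal sketch. All names in it (readFile, loadConfig, main, plainCaller) are hypothetical and not part of any proposal; readFile merely stands in for an asynchronous API, and the generator is stepped by hand where a real driver would resume it once the promise settles.

```javascript
// Stand-in for an asynchronous API (assumption: any promise-returning call).
function readFile(name) {
  return Promise.resolve("contents of " + name);
}

// Uses the async API directly, so it has to be a generator.
function* loadConfig() {
  const text = yield readFile("config.txt"); // suspends the whole call chain
  return text.toUpperCase();
}

// Only calls loadConfig, yet it must also become a generator and the
// call must become yield*, otherwise the suspension cannot pass through it.
function* main() {
  const config = yield* loadConfig();
  return "loaded: " + config;
}

// An ordinary function cannot wait for the result: it only gets the
// iterator object back, not the text.
function plainCaller() {
  return loadConfig();
}

const it = main();
const suspended = it.next();            // runs until the yield inside loadConfig
const finished = it.next("port=8080");  // a driver would pass the resolved value here
const notText = plainCaller();
```

Here suspended.value is the promise from readFile and finished.value is "loaded: PORT=8080", while notText is just an unstarted iterator. Every function between the top of the stack and the async call has to be rewritten this way.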

 That comes
 at a performance overhead that is significant enough as to make it an
 unacceptable solution (several times slower in current
 implementations).

 I guess this point depends on the previous one. Given that compile-to-JS has
 the wind behind it these days, generators may benefit from optimizations.

Generators certainly can be made faster. But given that they always
end up bypassing the CPU's native mechanisms for function calls, I'd be
very surprised if the performance can be made good enough.

 * yield would not solve the use-case of allowing libraries that use
 features from the main thread as it, again, would require a rewrite of
 all code that directly or indirectly uses that library to change all
 functions into generators and all function calls into yield*.

 I think this point is about interoperability between main thread and worker.
 I don't remember this point being discussed too much yet.

See my original email. Workers are likely to always lag behind main
thread. And some APIs aren't currently slated for workers at all. In
particular ones that involve UI or the DOM.

 What about exposing both async and sync APIs to workers?

What about it? We could certainly add lots of sync APIs to workers.
But if one of the main use cases is compile-to-JS of existing pre-web
codebases, which presumably will become less of an issue over the years
as people target the web directly, then finding a smaller
surface like my proposed API seems preferable to adding lots of specialized
sync APIs.

/ Jonas



Re: Sync API for workers

2013-10-13 Thread Rick Waldron
On Sunday, October 13, 2013, James Greene wrote:

 Oh, does `yield` work anywhere? I thought it was only for use within
 generators. Admittedly, I haven't been keeping up with the latest ES6
 changes.


yield may only appear in the body of a generator function, denoted by star
syntax: function* g(){}
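
A small sketch of that rule, for readers who have not used generators yet. The helper name parses is hypothetical; it compiles source text with the Function constructor only so that the syntax error can be observed without breaking the script itself.

```javascript
// A generator function is declared with the star syntax.
function* g() {
  const received = yield 1; // pauses here; resumes with the value passed to next()
  yield received * 2;
}

const iter = g();
const first = iter.next();    // value 1, not done
const second = iter.next(21); // value 42, not done
const third = iter.next();    // done

// Hypothetical helper: reports whether a function body parses.
function parses(body) {
  try {
    new Function(body);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// yield in an ordinary function body does not parse...
const yieldInPlainFunction = parses("yield 1;");
// ...and neither does yield in an ordinary callback nested inside a generator.
const yieldInNestedCallback = parses(
  "return function* outer() { [1].forEach(function (x) { yield x; }); };"
);
```

Both yieldInPlainFunction and yieldInNestedCallback come out false: yield belongs to the innermost function only, and only when that function is a generator.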

Rick



 On Oct 13, 2013 9:38 AM, pira...@gmail.com pira...@gmail.com wrote:

 JavaScript now has support for yield statements the same way Python does.
 That's a way to stop (i.e. sleep) the execution of a script to allow another
 to work and then restart from there. It's not their main function, but they
 allow creating what are called greenlets (green threads), and that's how I've
 seen sync APIs built on top of async ones...
 On 13/10/2013 16:21, James Greene james.m.gre...@gmail.com wrote:

  a) is necessary, but for b) it is sufficient for the sync thread to be
  able to sleep until a condition/mutex/... is lifted

 In other words, your clarification is completely true but my initial
 statement was written with regard to client-side JavaScript, which cannot
 sleep. As such, I believe my original assertions are still correct with
 regard to writing a sync wrapper in JS.
 On Oct 13, 2013 9:09 AM, James Greene james.m.gre...@gmail.com wrote:

 Thanks for adding clarification. That CAN be true but it depends on the
 environment [so far as I can see].

 For example, such an API wrapper couldn't be built in today's
 client-side JavaScript because the UI thread can't do a synchronous
 yielding sleep but rather can only do a synchronous blocking wait, which
 means it wouldn't yield to allow for the Worker thread to asynchronously
 respond and toggle such a condition/mutex/etc. unless such can be
 synchronously requested by the blocking thread from within the busy wait
 loop (e.g. `processEvents();`) as browsers won't interrupt the synchronous
 flow of the JS busy loop to trigger `onmessage` handlers for async messages
 sent from the Worker.

 If I'm mistaken, please consider providing a code snippet, gist, etc.
 to get me back on track. Thanks!
 On Oct 13, 2013 5:06 AM, David Rajchenbach-Teller dtel...@mozilla.com wrote:

 On 10/12/13 3:48 PM, James Greene wrote:
  You can only build a synchronous API on top of an asynchronous API if
  they are (a) running in separate threads/processes AND (b) the sync
  thread can synchronously poll (busy loop) for the progress/completion
  of the async thread.

 a) is necessary, but for b) it is sufficient for the sync thread to be
 able to sleep until a condition/mutex/... is lifted


 --
 David Rajchenbach-Teller, PhD
  Performance Team, Mozilla




Re: Sync API for workers

2013-10-13 Thread James Greene
Rick:
Thanks for confirming that.

Being more familiar with generators (and other ES6 goodies), can you
envision any setup where a generator (or perhaps multiple yielding to each
other) would enable us to build synchronous API wrappers around async APIs
in JS?
On Oct 13, 2013 6:44 PM, Rick Waldron waldron.r...@gmail.com wrote:



 On Sunday, October 13, 2013, James Greene wrote:

 Oh, does `yield` work anywhere? I thought it was only for use within
 generators. Admittedly, I haven't been keeping up with the latest ES6
 changes.


 yield may only appear in the body of a generator function, denoted by star
 syntax: function* g(){}

 Rick



 On Oct 13, 2013 9:38 AM, pira...@gmail.com pira...@gmail.com wrote:

 JavaScript now has support for yield statements the same way Python
 does. That's a way to stop (i.e. sleep) the execution of a script to allow
 another to work and then restart from there. It's not their main function,
 but they allow creating what are called greenlets (green threads), and that's
 how I've seen sync APIs built on top of async ones...
 On 13/10/2013 16:21, James Greene james.m.gre...@gmail.com wrote:

  a) is necessary, but for b) it is sufficient for the sync thread to be
  able to sleep until a condition/mutex/... is lifted

 In other words, your clarification is completely true but my initial
 statement was written with regard to client-side JavaScript, which cannot
 sleep. As such, I believe my original assertions are still correct with
 regard to writing a sync wrapper in JS.
 On Oct 13, 2013 9:09 AM, James Greene james.m.gre...@gmail.com
 wrote:

 Thanks for adding clarification. That CAN be true but it depends on
 the environment [so far as I can see].

 For example, such an API wrapper couldn't be built in today's
 client-side JavaScript because the UI thread can't do a synchronous
 yielding sleep but rather can only do a synchronous blocking wait, which
 means it wouldn't yield to allow for the Worker thread to asynchronously
 respond and toggle such a condition/mutex/etc. unless such can be
 synchronously requested by the blocking thread from within the busy wait
 loop (e.g. `processEvents();`) as browsers won't interrupt the synchronous
 flow of the JS busy loop to trigger `onmessage` handlers for async
 messages sent from the Worker.

 If I'm mistaken, please consider providing a code snippet, gist, etc.
 to get me back on track. Thanks!
 On Oct 13, 2013 5:06 AM, David Rajchenbach-Teller 
 dtel...@mozilla.com wrote:

 On 10/12/13 3:48 PM, James Greene wrote:
  You can only build a synchronous API on top of an asynchronous API if
  they are (a) running in separate threads/processes AND (b) the sync
  thread can synchronously poll (busy loop) for the progress/completion
  of the async thread.

 a) is necessary, but for b) it is sufficient for the sync thread to be
 able to sleep until a condition/mutex/... is lifted


 --
 David Rajchenbach-Teller, PhD
  Performance Team, Mozilla




Re: Sync API for workers

2013-10-13 Thread Glenn Maynard
What I really dislike about this is that the worker can't send the port
directly to a UI thread if it's a nested worker; it has to send it to its
parent, who has to forward it to its parent, and so on.  That seems like
it'll make it hard to implement libraries, since a library needs to have
its fingers in every one of your workers' main message ports, or the author
needs to invent a new message routing mechanism (which is something message
ports are supposed to do for us).

For example, the below example can't be easily modularized and made into a
library.  If regular ports could be used, you could wrap both sides in
initAlertHandling(messagePort), handing it a port to communicate over.

The deadlock prevention algorithm I proposed earlier attempted to address
this, but it was too complex.  It attempted to allow as close to arbitrary
message passing as possible without allowing deadlocks; maybe there's a
less permissive approach that would be simple enough.

Here's an alternate proposal.  It attempts to allow transferring these
ports in any way, across regular MessagePorts, but it no longer tries to
implement blocking with regular MessagePorts.  It keeps the
SyncMessageChannel, MessagePortSyncSide and MessagePortAsyncSide interfaces
of your proposal to simplify things:

- When a SyncMessageChannel is created, store an identifier for the current
thread in both ports, called the port's "initial thread".  This property
stays the same as the port is transferred around.
- Add a property to both ports called "transferred first", initially null.
If MessagePortSyncSide is transferred and its "transferred first" property
is null, set its "transferred first" to true and the "transferred first"
property of its corresponding MessagePortAsyncSide to false.  The same is
true in reverse.  (This is possible because, the first time either port is
transferred, they're obviously still in the same thread.)
- If either type of port is transferred to an illegal thread, the recipient
thread automatically calls close() on the port.
- Descendants of a MessagePortSyncSide's initial thread are always legal
threads.  Additionally, if the port's transferred first value is true, the
initial thread itself is also a legal thread.
- Ancestors of a MessagePortSyncSide's initial thread are always legal
threads.  Additionally, if the port's transferred first value is false, the
initial thread itself is also a legal thread.

Like the previous proposal, this requires that the browser can get a view
of the thread tree.  Since all of the checking happens in the thread that
receives the port, and not the sending port, there are no race conditions
due to not knowing the target of a port in advance.

The "transferred first" field allows creating a SyncMessageChannel, and
then either 1: transferring the MessagePortSyncSide to a child worker, or
2: transferring the MessagePortAsyncSide upwards to a parent worker or the
UI thread.  You can't do both; once you transfer a port once, that property
is immutable.

Simply put, if you transfer either port across the boundary defined by the
initial thread, the ports are shut down to prevent the possibility of
deadlocks.

   any waitForMessage();

With regular messaging you have an Event containing a .data property with
the posted message.  Here, you just have the message.  That'll make adding
metadata difficult.  Are we sure that's what we want?  (One thing we'd lose
is the message's origin; I'm not sure on a quick reading if this is meant
to be supported for message channels, but it looks like it is.)

An aside: one thing you'll want to be able to do is block on multiple
sources, possibly with a timeout, like select().  I think the way this API
would allow that is for the user to create a multiplexer thread that does
all of the listening asynchronously, and then a single blocking port is
used between the mux thread and the real thread.

On Sun, Oct 13, 2013 at 2:39 PM, Jonas Sicking jo...@sicking.cc wrote:

 * yield only works within generators in JS.


And if that's like Python, yield only goes up one level, and then the
caller has to yield too.  Generators are useless for wrapping async APIs
unless you structure your entire codebase around it.  It's very useful (its
incarnation in Python, at least), but has no relevance to the problems the
API this thread is discussing.

To give the simplest example of what you can do with this API: implement
confirm() in workers.

--- worker ---
(function() {
    // Create the channel and hand the async side to the parent.
    var channel = new SyncMessageChannel();
    postMessage({
        port: channel.asyncPort
    });
    var _confirmPort = channel.syncPort;
    confirm = function(message)
    {
        _confirmPort.postMessage(message);
        // Block until the main thread answers.
        return _confirmPort.waitForMessage().result;
    }
})();

if(confirm("Delete everything?")) ...

--- main thread ---
var worker = createWorker(); // create the above worker
worker.onmessage = function(e)
{
    var confirmPort = e.data.port;
    confirmPort.onmessage = function(e)
    {
        var answer = 

Re: Sync API for workers

2013-10-13 Thread Glenn Maynard
On Sun, Oct 13, 2013 at 8:11 PM, Glenn Maynard gl...@zewt.org wrote:

 - Descendants of a MessagePortSyncSide's initial thread are always legal
 threads.  Additionally, if the port's transferred first value is true, the
 initial thread itself is also a legal thread.
 - Ancestors of a MessagePortSyncSide's initial thread are always legal
 threads.  Additionally, if the port's transferred first value is false, the
 initial thread itself is also a legal thread.


Correction:

- Descendants of a MessagePortSyncSide's initial thread are always legal
threads.  Additionally, if the port's transferred first value is *false*,
the initial thread itself is also a legal thread.
- Ancestors of a *MessagePortAsyncSide*'s initial thread are always legal
threads.  Additionally, if the port's transferred first value is false, the
initial thread itself is also a legal thread.

The initial thread is only a valid thread for the port that was *not*
transferred first.  When the first port is transferred for the first time,
the remaining port, which is still in the thread it was created in, is
always in a valid thread.
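
The corrected rules can be written down as a small check. This is only a sketch of the rules as stated above, not part of the proposal: the thread tree is modelled as a hypothetical map from each thread id to its parent's id, and the names isAncestor, isLegalThread, side, initialThread and transferredFirst are all made up for illustration.

```javascript
// parents maps each thread id to its parent's id; the UI thread has no entry.
function isAncestor(parents, ancestor, thread) {
  for (let t = parents.get(thread); t !== undefined; t = parents.get(t)) {
    if (t === ancestor) return true;
  }
  return false;
}

// port: { side: "sync" | "async", initialThread, transferredFirst }
// Run by the receiving thread; an illegal result means the port is closed.
function isLegalThread(parents, port, thread) {
  if (thread === port.initialThread) {
    // Only the port that was NOT transferred first may live here.
    return port.transferredFirst === false;
  }
  return port.side === "sync"
    ? isAncestor(parents, port.initialThread, thread)  // descendant of the initial thread
    : isAncestor(parents, thread, port.initialThread); // ancestor of the initial thread
}

// Thread tree: ui -> worker -> nested. The channel is created in "worker"
// and the sync side is the one transferred first.
const parents = new Map([["worker", "ui"], ["nested", "worker"]]);
const syncPort = { side: "sync", initialThread: "worker", transferredFirst: true };
const asyncPort = { side: "async", initialThread: "worker", transferredFirst: false };

const checks = {
  syncToNested: isLegalThread(parents, syncPort, "nested"),
  syncToUi: isLegalThread(parents, syncPort, "ui"),
  syncBackToWorker: isLegalThread(parents, syncPort, "worker"),
  asyncStaysInWorker: isLegalThread(parents, asyncPort, "worker"),
  asyncToUi: isLegalThread(parents, asyncPort, "ui"),
  asyncToNested: isLegalThread(parents, asyncPort, "nested"),
};
```

With this tree, the sync side is legal only in "nested", and the async side is legal in "worker" and "ui". The blocking side therefore always sits below the side that answers it.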

-- 
Glenn Maynard


Re: Sync API for workers

2013-10-13 Thread Jonas Sicking
On Sun, Oct 13, 2013 at 8:19 PM, Glenn Maynard gl...@zewt.org wrote:
 On Sun, Oct 13, 2013 at 8:11 PM, Glenn Maynard gl...@zewt.org wrote:

 - Descendants of a MessagePortSyncSide's initial thread are always legal
 threads.  Additionally, if the port's transferred first value is true, the
 initial thread itself is also a legal thread.
 - Ancestors of a MessagePortSyncSide's initial thread are always legal
 threads.  Additionally, if the port's transferred first value is false, the
 initial thread itself is also a legal thread.


 Correction:

 - Descendants of a MessagePortSyncSide's initial thread are always legal
 threads.  Additionally, if the port's transferred first value is false, the
 initial thread itself is also a legal thread.
 - Ancestors of a MessagePortAsyncSide's initial thread are always legal
 threads.  Additionally, if the port's transferred first value is false, the
 initial thread itself is also a legal thread.

 The initial thread is only a valid thread for the port that was *not*
 transferred first.  When the first port is transferred for the first time,
 the remaining port, which is still in the thread it was created in, is
 always in a valid thread.

This is certainly an improvement over the previous proposal. However
given that synchronous APIs of any type are quite controversial, I'd
rather stick to a basic approach for now.

The nice thing about your proposal is that it's strictly additive, so
it's something we can add later if there's agreement that the problems
it aims to solve are problems that need solving, and there's agreement
that the proposal is the right way to solve them.

/ Jonas



Re: Sync API for workers

2013-10-12 Thread pira...@gmail.com
 When I see discussion of any new/recent synchronous APIs for the Web
 platform these days I pretty much take it they're implicitly intended just
 for use with Workers. So I assume that's the context Jonas intended.

It's a safe assumption, but I think it's better to be asynchronous in
workers too, not only for efficiency, but also to have only one programming
model, so you can easily move your code between workers, the main thread
and maybe also Node.js. Look at the IndexedDB API: since the asynchronous
one was enough for everybody, the synchronous one was not implemented by
the browsers and has now become deprecated... :-)

   --Mike

 P.S. Of course there's room for disagreement about whether synchronous
 APIs are even a good idea even for the Workers case -


http://infrequently.org/2013/05/the-case-against-synchronous-worker-apis-2/

Good to know I'm not the only one... :-)


Re: Sync API for workers

2013-10-12 Thread pira...@gmail.com
 Synchronous APIs are easier to use since it's how things have been done
 for decades,


 No, they're easier to use because they fit the model of linear human
 thought more naturally.  The idea that asynchronous APIs are just as good
 and easy as synchronous APIs, and that people only disagree because of lack
 of experience with asynchronous APIs, is mistaken.  APIs must be designed
 around how programmer's minds actually work, not how you'd like them to
 work.

I agree that APIs should have the least surprise factor, but are you sure
that's the reason why we usually program in a synchronous, linear way?
Because I've always thought it was a heritage of the batch-processing
machines of the '60s... I don't believe event-oriented programming fits
badly with how humans think, because nature is event-oriented (if this
then that), and I don't believe I'm an alien and that's the reason I think
in a different way...

 but the required POSIX-like APIs would be better developed as external
 libraries on top of the asynchronous ones.

 You can't build synchronous APIs on top of asynchronous APIs without the
 mechanism this thread is specifically about.

I've always been taught that you can implement one on top of the other...
:-/ Obviously, asynchronous on top of synchronous is much easier than
the other way around...


Re: Sync API for workers

2013-10-12 Thread James Greene
You can only build a synchronous API on top of an asynchronous API if they
are (a) running in separate threads/processes AND (b) the sync thread can
synchronously poll (busy loop) for the progress/completion of the async
thread.
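
A minimal sketch of why condition (a) matters, assuming a single JavaScript thread and using setTimeout as a stand-in for any asynchronous API. The variable names are made up, and the loop is bounded only so that the sketch terminates.

```javascript
// Stand-in for an async API: completes on a later turn of the event loop.
let done = false;
setTimeout(() => { done = true; }, 0);

// Attempt at a "synchronous wrapper": poll the flag in a busy loop.
const deadline = Date.now() + 50;
let polls = 0;
while (!done && Date.now() < deadline) {
  polls += 1;
}

// Run-to-completion: the timer callback cannot run until this script
// returns to the event loop, so the flag is still false here.
const completedWhileSpinning = done;
```

However long the loop spins, completedWhileSpinning stays false, because the callback that would set the flag is queued behind the code that is waiting for it. Polling only works when the completion is signalled from another thread.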
On Oct 12, 2013 1:23 AM, pira...@gmail.com pira...@gmail.com wrote:

  Synchronous APIs are easier to use since it's how things have been done
 since decades ago,
 
 
  No, they're easier to use because they fit the model of linear human
 thought more naturally.  The idea that asynchronous APIs are just as good
 and easy as synchronous APIs, and that people only disagree because of lack
 of experience with asynchronous APIs, is mistaken.  APIs must be designed
 around how programmer's minds actually work, not how you'd like them to
 work.
 
 I agree that APIs should have the least surprise factor, but are you sure
 that's the reason why we usually program in a synchronous, linear way?
 Because I've always thought it was a heritage of the batch-processing
 machines of the '60s... I don't believe event-oriented programming fits
 badly with how humans think, because nature is event-oriented (if this
 then that), and I don't believe I'm an alien and that's the reason I think
 in a different way...

  but the required POSIX-like APIs would be better developed as external
 libraries on top of the asynchronous ones.
 
  You can't build synchronous APIs on top of asynchronous APIs without the
 mechanism this thread is specifically about.
 
 I've always been taught that you can implement one on top of the other...
 :-/ Obviously, asynchronous on top of synchronous is much easier than
 the other way around...



Re: Sync API for workers

2013-10-11 Thread Jonas Sicking
On Wed, Sep 5, 2012 at 7:03 PM, Jonas Sicking jo...@sicking.cc wrote:
 Hence I think something like the following would work:

 [Constructor]
 interface SyncMessageChannel {
   readonly attribute MessagePortSyncSide syncPort;
   readonly attribute MessagePortAsyncSide asyncPort;
 };

 interface MessagePortSyncSide {
   void postMessage(any message, optional sequence<Transferable> transfer);
   any waitForMessage();
   void close();
 };
 MessagePortSyncSide implements Transferable;

 interface MessagePortAsyncSide : EventTarget {
   void postMessage(any message, optional sequence<Transferable> transfer);
   void start();
   void close();

   // event handlers
   attribute EventHandler onmessage;
 };
 MessagePortAsyncSide implements Transferable;


 Where there's the additional limitation that MessagePortSyncSide can
 only be transferred through MessagePortAsyncSide.postMessage() or
 Worker.postMessage(), and MessagePortAsyncSide can only be transferred
 through MessagePortSyncSide.postMessage() or
 DedicatedWorkerGlobalScope.postMessage().

We're planning on implementing this API soon in Gecko.

The main use cases that we're looking to solve are:

* Enable libraries to implement APIs that require functionality that
is not yet available in workers but that is available on the main
thread. There will likely be such functionality for a long time to
come given that we're constantly adding new functionality to the web
platform and implementations tend to add APIs to the main thread
before they do so to workers.
* Enable compiling code that was written for other platforms to the
web. Specifically where such code uses synchronous APIs, but where we
for good reasons have chosen not to expose synchronous counterparts in
the web platform. The most obvious example here is synchronous
filesystem access which is very commonly used in other platforms like
posix and windows.

If people have other ideas for how to solve those use-cases, we're of
course always open to other proposals. But please look through this
thread first as there has been lots of good discussion.

If people are not interested in solving these use-cases I'm always
interested in that input too.

/ Jonas



Re: Sync API for workers

2013-10-11 Thread pira...@gmail.com
 * Enable compiling code that was written for other platforms to the
 web. Specifically where such code uses synchronous APIs, but where we
 for good reasons have chosen not to expose synchronous counterparts in
 the web platform. The most obvious example here is synchronous
 filesystem access which is very commonly used in other platforms like
 posix and windows.

Synchronous APIs are easier to use since that's how things have been done
for decades, but I don't think they fit in an event-oriented
environment like JavaScript, especially for something as time-consuming
as filesystem access and I/O. I find it better to develop only asynchronous
APIs for these use cases. It would make sense to use synchronous APIs to help
port existing code, for example from C/C++ to JavaScript, but the required
POSIX-like APIs would be better developed as external libraries on top of
the asynchronous ones.


Re: Sync API for workers

2013-10-11 Thread Michael[tm] Smith
pira...@gmail.com pira...@gmail.com, 2013-10-11 21:24 +0200:

 [Jonas said]:
  * Enable compiling code that was written for other platforms to the
  web. Specifically where such code uses synchronous APIs, but where we
  for good reasons have chosen not to expose synchronous counterparts in
  the web platform. The most obvious example here is synchronous
  filesystem access which is very commonly used in other platforms like
  posix and windows.
 
 Synchronous APIs are easier to use since that's how things have been done
 for decades, but I don't think they fit in an event-oriented
 environment like JavaScript, especially for something as time-consuming
 as filesystem access and I/O. I find it better to develop only asynchronous
 APIs for these use cases. It would make sense to use synchronous APIs to help
 port existing code, for example from C/C++ to JavaScript, but the required
 POSIX-like APIs would be better developed as external libraries on top of
 the asynchronous ones.

When I see discussion of any new/recent synchronous APIs for the Web
platform these days I pretty much take it they're implicitly intended just
for use with Workers. So I assume that's the context Jonas intended.

  --Mike

P.S. Of course there's room for disagreement about whether synchronous APIs
are even a good idea even for the Workers case -

  http://infrequently.org/2013/05/the-case-against-synchronous-worker-apis-2/

-- 
Michael[tm] Smith http://people.w3.org/mike



Re: Sync API for workers

2013-10-11 Thread Glenn Maynard
On Fri, Oct 11, 2013 at 2:24 PM, pira...@gmail.com pira...@gmail.com wrote:

 Synchronous APIs are easier to use since it's how things have been done
 since decades ago,


No, they're easier to use because they fit the model of linear human
thought more naturally.  The idea that asynchronous APIs are just as good
and easy as synchronous APIs, and that people only disagree because of lack
of experience with asynchronous APIs, is mistaken.  APIs must be designed
around how programmer's minds actually work, not how you'd like them to
work.

 but the required POSIX-like APIs would be better developed as external
 libraries on top of the asynchronous ones.

You can't build synchronous APIs on top of asynchronous APIs without the
mechanism this thread is specifically about.

-- 
Glenn Maynard


Re: Sync API for workers

2012-09-07 Thread Lon Ingram
 On Thu, Sep 6, 2012 at 7:18 PM, Glenn Maynard gl...@zewt.org wrote:
 On Thu, Sep 6, 2012 at 12:31 AM, Jonas Sicking jo...@sicking.cc wrote:
 That is certainly an interesting use case. I think another interesting
 use case is being able to write synchronous APIs in workers whose
 implementation uses APIs that are only available on the main thread.


 I understand the concept, but I'm having trouble coming up with useful
 examples.  Can you give one?

I have one: virtualizing the API of the main thread in a worker. In Treehouse
[1], we sandbox untrusted JS in a worker, where we provide a virtual browser
interface - including the DOM (using a hacked up fork of jsdom). This allows us
to (1) restrict the guest code to a subset of the DOM, and (2) interpose on
privileged operations.

At present, we present a synchronous interface to the virtual DOM and then
replicate changes back to the actual DOM in the main thread asynchronously. This
works surprisingly well, but has some limitations.

First, concurrent access to a given DOM node from more than one worker is a
difficult problem, so we punt and require that a node may appear in at most one
virtual DOM. Second, we do not know how to virtualize some synchronous methods
and properties, such as window.prompt.

I believe that a synchronous messaging API would allow us to overcome both of
these issues.

[1] Treehouse PDF:
https://www.usenix.org/system/files/conference/atc12/atc12-final159.pdf

--
Lon Ingram
@lawnsea



Re: Sync API for workers

2012-09-06 Thread Jonas Sicking
On Wed, Sep 5, 2012 at 11:02 PM, b...@pettay.fi b...@pettay.fi wrote:
 On 09/06/2012 08:31 AM, Jonas Sicking wrote:

 On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote:

 On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote:


 The problem with an "Only allow blocking on children, except that
 window can't block on its children" rule is that you can never block on a
 computation which is implemented in the main thread. I think that cuts
 out some major use cases since todays browsers have many APIs which
 are only implemented in the main thread.


 You can't have both--you have to choose one of 1: allow blocking upwards,
 2: allow blocking downwards, or 3: allow deadlocks.  (I believe #1 is more
 useful than #2, but each proposal can go both ways.  I'm ignoring more
 complex deadlock detection algorithms that can allow both #1 and #2, of
 course, since that's a lot harder.)


 Indeed. But I believe #2 is more useful than #1. I wasn't proposing
 having both, I was proposing only doing #2.

 It's actually technically possible to allow both #1 and #2 without
 deadlock detection algorithms, but to keep things sane I'll leave that
 as out of scope for this thread.

 [snip]

 I think that's by far the most
 interesting category of use cases raised for this feature so far, the
 ability to implement sync APIs from async APIs (or several async APIs).


 That is certainly an interesting use case. I think another interesting
 use case is being able to write synchronous APIs in workers whose
 implementation uses APIs that are only available on the main thread.

 That's why I'm not interested in only blocking on children, but rather
 only blocking on parents.

 The fact that all the examples that people have used while we have
 been discussing synchronous messaging have spun event loops in
 attempts to deal with messages that couldn't be handled by the
 synchronous poller makes me very much think that so will web
 developers.


 getMessage doesn't spin the event loop.  Spinning the event loop means
 that tasks are run from task queues (such as asynchronous callbacks)
 which might not be expecting to run, and that tasks might be run
 recursively; none of that happens here.  All this does is block until a
 message is available on a specified port (or ports), and then returns
 it--it's just a blocking call, like sync XHR or FileReaderSync.


 The example from Olli's proposal 3 does what effectively amounts to
 spinning an event loop. It pulls out a bunch of events from the
 normal event loop and then manually dispatches them in a while loop.
 The behavior is exactly the same as spinning the event loop (except
 that non-message tasks don't get dispatched).

 It is just dispatching events.
 The problems we (Gecko) have had with event loop spinning in main thread
 relate mainly to the problems where unexpected events are dispatched while
 running the loop, as an example user input events or events coming from
 network.
 getMessage/waitForMessage does not have that problem.

I'm not sure what you mean by "just dispatching events". That's
exactly what event loop spinning is.

Why are the Gecko events any more unexpected than the message events
that the example dispatches? The network events or user input events
that we had are all events created by Gecko code. The messages that
might get dispatched by the worker code can just as easily be network
events or user events which are sent to the worker for processing.

Web pages can have just as much inconsistent state while deep in call
stacks as we do. If they at that point call into a library which
starts pulling messages off of the task queue and dispatches them,
they'll run into the same problems as we've had.
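
A minimal sketch of that hazard, with everything in it hypothetical: the queue array stands in for the task queue, spinPendingMessages for a library call that pulls messages off it and dispatches them, and the account object for any page state that is briefly inconsistent in the middle of an update.

```javascript
const queue = [];                              // pending "messages"
const account = { checking: 100, savings: 0 }; // invariant: the total is always 100
const observedTotals = [];

// Message handler: assumes it only ever runs between tasks.
function onMessage() {
  observedTotals.push(account.checking + account.savings);
}

// Library helper that dispatches queued messages from inside the caller's stack.
function spinPendingMessages() {
  while (queue.length > 0) {
    onMessage(queue.shift());
  }
}

function transfer(amount) {
  account.checking -= amount;
  spinPendingMessages(); // handlers run while the invariant is broken
  account.savings += amount;
}

queue.push({ type: "audit" });
transfer(40);            // the audit handler sees a total of 60
queue.push({ type: "audit" });
spinPendingMessages();   // dispatched between tasks: sees 100
```

The first handler observes a total of 60 because it ran halfway through transfer. A call that merely blocks until one reply arrives, without dispatching other handlers, does not open this window.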

/ Jonas



Re: Sync API for workers

2012-09-06 Thread Olli Pettay

On 09/06/2012 09:12 AM, Jonas Sicking wrote:

On Wed, Sep 5, 2012 at 11:02 PM, b...@pettay.fi b...@pettay.fi wrote:

On 09/06/2012 08:31 AM, Jonas Sicking wrote:


On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote:


On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote:



The problem with an "only allow blocking on children, except that
window can't block on its children" rule is that you can never block on a
computation which is implemented in the main thread. I think that cuts
out some major use cases since today's browsers have many APIs which
are only implemented in the main thread.



You can't have both--you have to choose one of 1: allow blocking upwards,
2:
allow blocking downwards, or 3: allow deadlocks.  (I believe #1 is more
useful than #2, but each proposal can go both ways.  I'm ignoring more
complex deadlock detection algorithms that can allow both #1 and #2, of
course, since that's a lot harder.)



Indeed. But I believe #2 is more useful than #1. I wasn't proposing
having both, I was proposing only doing #2.

It's actually technically possible to allow both #1 and #2 without
deadlock detection algorithms, but to keep things sane I'll leave that
as out of scope for this thread.

[snip]


I think that's by far the most
interesting category of use cases raised for this feature so far, the
ability to implement sync APIs from async APIs (or several async APIs).



That is certainly an interesting use case. I think another interesting
use case is being able to write synchronous APIs in workers whose
implementation uses APIs that are only available on the main thread.

That's why I'm not interested in only blocking on children, but rather
only blocking on parents.


The fact that all the examples that people have used while we have
been discussing synchronous messaging have spun event loops in
attempts to deal with messages that couldn't be handled by the
synchronous poller makes me very much think that so will web
developers.



getMessage doesn't spin the event loop.  Spinning the event loop means
that tasks are run from task queues (such as asynchronous callbacks)
which
might not be expecting to run, and that tasks might be run recursively;
none
of that happens here.  All this does is block until a message is
available on a specified port (or ports), and then returns it--it's just
a
blocking call, like sync XHR or FileReaderSync.



The example from Olli's proposal 3 does what effectively amounts to
spinning an event loop. It pulls out a bunch of events from the
normal event loop and then manually dispatches them in a while loop.
The behavior is exactly the same as spinning the event loop (except
that non-message tasks don't get dispatched).


It is just dispatching events.
The problems we (Gecko) have had with event loop spinning in main thread
relate mainly to the problems where unexpected events are dispatched while
running the loop, as an example user input events or events coming from
network.
getMessage/waitForMessage does not have that problem.


I'm not sure what you mean by just dispatching events. That's
exactly what event loop spinning is.


No. waitForMessage example I wrote down just dispatches DOM events in a loop.
That is a synchronous operation and you know exactly which events you're
about to dispatch.
If you run the generic event loop, you also end up running
timers and getting input from network and user etc. and you can't
control those.



Why are the gecko events any more unexpected than the message events
that the example dispatches.

We don't want to block certain events in Gecko (like user input to chrome).
Blocking events in worker code is ok.


The network events or user input events
that we had are all events created by gecko code. The messages that
might get dispatched by the worker code can easily also be network
events or user events which are sent to the worker for processing.

Web pages can have just as much inconsistent state while deep in call
stacks as we do.

That is true...


If they at that point call into a library which
starts pulling messages off of the task queue and dispatches them,
they'll run into the same problems as we've had.


...but then it is up to the library to handle the case properly and
dispatch events async.


.Olli




/ Jonas






Re: Sync API for workers

2012-09-06 Thread Olli Pettay

On 09/06/2012 09:30 AM, Olli Pettay wrote:

On 09/06/2012 09:12 AM, Jonas Sicking wrote:

On Wed, Sep 5, 2012 at 11:02 PM, b...@pettay.fi b...@pettay.fi wrote:

On 09/06/2012 08:31 AM, Jonas Sicking wrote:


On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote:


On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote:



The problem with an "only allow blocking on children, except that
window can't block on its children" rule is that you can never block on a
computation which is implemented in the main thread. I think that cuts
out some major use cases since today's browsers have many APIs which
are only implemented in the main thread.



You can't have both--you have to choose one of 1: allow blocking upwards,
2:
allow blocking downwards, or 3: allow deadlocks.  (I believe #1 is more
useful than #2, but each proposal can go both ways.  I'm ignoring more
complex deadlock detection algorithms that can allow both #1 and #2, of
course, since that's a lot harder.)



Indeed. But I believe #2 is more useful than #1. I wasn't proposing
having both, I was proposing only doing #2.

It's actually technically possible to allow both #1 and #2 without
deadlock detection algorithms, but to keep things sane I'll leave that
as out of scope for this thread.

[snip]


I think that's by far the most
interesting category of use cases raised for this feature so far, the
ability to implement sync APIs from async APIs (or several async APIs).



That is certainly an interesting use case. I think another interesting
use case is being able to write synchronous APIs in workers whose
implementation uses APIs that are only available on the main thread.

That's why I'm not interested in only blocking on children, but rather
only blocking on parents.


The fact that all the examples that people have used while we have
been discussing synchronous messaging have spun event loops in
attempts to deal with messages that couldn't be handled by the
synchronous poller makes me very much think that so will web
developers.



getMessage doesn't spin the event loop.  Spinning the event loop means
that tasks are run from task queues (such as asynchronous callbacks)
which
might not be expecting to run, and that tasks might be run recursively;
none
of that happens here.  All this does is block until a message is
available on a specified port (or ports), and then returns it--it's just
a
blocking call, like sync XHR or FileReaderSync.



The example from Olli's proposal 3 does what effectively amounts to
spinning an event loop. It pulls out a bunch of events from the
normal event loop and then manually dispatches them in a while loop.
The behavior is exactly the same as spinning the event loop (except
that non-message tasks don't get dispatched).


It is just dispatching events.
The problems we (Gecko) have had with event loop spinning in main thread
relate mainly to the problems where unexpected events are dispatched while
running the loop, as an example user input events or events coming from
network.
getMessage/waitForMessage does not have that problem.


I'm not sure what you mean by just dispatching events. That's
exactly what event loop spinning is.


No. waitForMessage example I wrote down just dispatches DOM events in a loop.
That is a synchronous operation and you know exactly which events you're
about to dispatch.
If you run the generic event loop, you also end up running
timers and getting input from network and user etc. and you can't
control those.



Why are the gecko events any more unexpected than the message events
that the example dispatches.

We don't want to block certain events in Gecko (like user input to chrome).
Blocking events in worker code is ok.


The network events or user input events
that we had are all events created by gecko code. The messages that
might get dispatched by the worker code can easily also be network
events or user events which are sent to the worker for processing.

Web pages can have just as much inconsistent state while deep in call
stacks as we do.

That is true...


If they at that point call into a library which
starts pulling messages off of the task queue and dispatches them,
they'll run into the same problems as we've had.


...but then it is up to the library to handle the case properly and
dispatch events async.



Though, dispatching events async so that other new message events don't get 
handled before them
would require some new API.



.Olli




/ Jonas









Re: Sync API for workers

2012-09-06 Thread Jonas Sicking
On Wed, Sep 5, 2012 at 11:56 PM, Olli Pettay olli.pet...@helsinki.fi wrote:
 On 09/06/2012 09:49 AM, Jonas Sicking wrote:

 On Wed, Sep 5, 2012 at 11:30 PM, Olli Pettay olli.pet...@helsinki.fi
 wrote:

 On 09/06/2012 09:12 AM, Jonas Sicking wrote:


 On Wed, Sep 5, 2012 at 11:02 PM, b...@pettay.fi b...@pettay.fi wrote:


 On 09/06/2012 08:31 AM, Jonas Sicking wrote:



 On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote:



 On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc
 wrote:




 The problem with an "only allow blocking on children, except that
 window can't block on its children" rule is that you can never block on a
 computation which is implemented in the main thread. I think that
 cuts
 out some major use cases since today's browsers have many APIs which
 are only implemented in the main thread.




 You can't have both--you have to choose one of 1: allow blocking
 upwards,
 2:
 allow blocking downwards, or 3: allow deadlocks.  (I believe #1 is
 more
 useful than #2, but each proposal can go both ways.  I'm ignoring
 more
 complex deadlock detection algorithms that can allow both #1 and #2,
 of
 course, since that's a lot harder.)




 Indeed. But I believe #2 is more useful than #1. I wasn't proposing
 having both, I was proposing only doing #2.

 It's actually technically possible to allow both #1 and #2 without
 deadlock detection algorithms, but to keep things sane I'll leave that
 as out of scope for this thread.

 [snip]



 I think that's by far the most
 interesting category of use cases raised for this feature so far, the
 ability to implement sync APIs from async APIs (or several async
 APIs).




 That is certainly an interesting use case. I think another interesting
 use case is being able to write synchronous APIs in workers whose
 implementation uses APIs that are only available on the main thread.

 That's why I'm not interested in only blocking on children, but rather
 only blocking on parents.

 The fact that all the examples that people have used while we have
 been discussing synchronous messaging have spun event loops in
 attempts to deal with messages that couldn't be handled by the
 synchronous poller makes me very much think that so will web
 developers.




 getMessage doesn't spin the event loop.  Spinning the event loop
 means
 that tasks are run from task queues (such as asynchronous callbacks)
 which
 might not be expecting to run, and that tasks might be run
 recursively;
 none
 of that happens here.  All this does is block until a message is
 available on a specified port (or ports), and then returns it--it's
 just
 a
 blocking call, like sync XHR or FileReaderSync.




 The example from Olli's proposal 3 does what effectively amounts to
 spinning an event loop. It pulls out a bunch of events from the
 normal event loop and then manually dispatches them in a while loop.
 The behavior is exactly the same as spinning the event loop (except
 that non-message tasks don't get dispatched).



 It is just dispatching events.
 The problems we (Gecko) have had with event loop spinning in main
 thread
 relate mainly to the problems where unexpected events are dispatched
 while
 running the loop, as an example user input events or events coming from
 network.
 getMessage/waitForMessage does not have that problem.



 I'm not sure what you mean by just dispatching events. That's
 exactly what event loop spinning is.



 No. waitForMessage example I wrote down just dispatches DOM events in a
 loop.
 That is a synchronous operation and you know exactly which events you're
 about to dispatch.
 If you run the generic event loop, you also end up running
 timers and getting input from network and user etc. and you can't
 control those.


 Just because they are message events doesn't mean that you know
 exactly which events you're about to dispatch. That's basically
 equivalent to saying that it's safe to spin the event loop in Gecko as
 long as you only dispatch nsIRunnables that were dispatched from Gecko
 code, as opposed to native events from the native event loop.

 Note that messages can be sent to the worker in response to network
 and UI events on the main thread.

 Why are the gecko events any more unexpected than the message events
 that the example dispatches.


 We don't want to block certain events in Gecko (like user input to
 chrome).
 Blocking events in worker code is ok.


 I don't understand what you are saying here.

 If they at that point call into a library which
 starts pulling messages off of the task queue and dispatches them,
 they'll run into the same problems as we've had.


 ...but then it is up to the library to handle the case properly and
 dispatch events async.


 But if it dispatches them asynchronously, they have lost their place
 in the message queue. I.e. now they are placed after all other
 incoming message events. Such event reordering is likely to break
 application level logic.
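
 To make the reordering concern concrete, here is a minimal sketch. It
 models a port's pending messages as a plain FIFO array; waitForReply and
 the message names are hypothetical and not part of any proposal in this
 thread. A synchronous waiter pulls messages until it finds the reply it
 wants, and re-posts everything else for later asynchronous delivery.

```javascript
// Hypothetical model: a port's pending messages as a FIFO array.
function waitForReply(queue, isReply) {
  // Pull messages until the wanted reply shows up. Anything else is set
  // aside and re-posted afterwards for asynchronous delivery.
  const skipped = [];
  while (queue.length > 0) {
    const msg = queue.shift();
    if (isReply(msg)) {
      // Re-posting appends, so skipped messages land behind newer ones.
      queue.push(...skipped);
      return msg;
    }
    skipped.push(msg);
  }
  queue.push(...skipped);
  return undefined;
}

const queue = ["update-1", "reply", "update-2"];
const reply = waitForReply(queue, (m) => m === "reply");
console.log(reply);            // the wanted reply
console.log(queue.join(", ")); // update-2 now precedes update-1
```

 In this model "update-1" was sent before "update-2" but is delivered
 after it, which is the kind of reordering described above.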

 And like I said in my original email, 

Re: Sync API for workers

2012-09-06 Thread Glenn Maynard
Just to ping a detail, so it's not lost in history: it should also be
possible to peek at a

On Thu, Sep 6, 2012 at 12:31 AM, Jonas Sicking jo...@sicking.cc wrote:

  1: allow blocking upwards, 2: allow blocking downwards,

 Indeed. But I believe #2 is more useful than #1. I wasn't proposing
 having both, I was proposing only doing #2.


OK.  I think I disagree, but this is orthogonal to which API approach is
used, since it's easy to take every proposal and flip it either way.

That is certainly an interesting use case. I think another interesting
 use case is being able to write synchronous APIs in workers whose
 implementation uses APIs that are only available on the main thread.


I understand the concept, but I'm having trouble coming up with useful
examples.  Can you give one?

The only case that comes to mind is blocking for user input; for example,
requesting the UI thread to ask the user his name, and then waiting for the
response.  (That might be useful, but I think it's far less useful than generic
sync worker APIs.)  It sounds like you're talking about something like
"construct a DOM tree, do some stuff to it and return a result", but I
can't think of a useful example for that.

The only DOM APIs I've *really* wanted in workers is eg. HTMLImageElement,
as part of getting WebGL into workers, but this wouldn't help there.

The example from Olli's proposal 3 does what effectively amounts to
 spinning an event loop. It pulls out a bunch of events from the
 normal event loop and then manually dispatches them in a while loop.
 The behavior is exactly the same as spinning the event loop (except
 that non-message tasks don't get dispatched).


I don't like it either, but it was only needed in his proposal because it
didn't support MessagePorts, so he had to do his own ad hoc filtering on a
single port.

Claiming that we don't need to explain all edge cases to authors and
 just give them a simplified version would, I think, be ignoring the
 complexity of software that people write using the web platform.


Most of the complexity of the algorithm is in the mechanics of
implementation, rather than its effects.  Users don't have to know that
it's the receiving thread handling the flag, or that an internal message is
sent to clear the flag on the other side of the channel; these are
algorithmic details.

So it sounds like you are ok with not permitting the use of both
 synchronous and asynchronous messages to the same port then? As long
 as some ports allow synchronous messages and others allow asynchronous
 messages. Leaving aside the issue of how and when it is determined
 that a port is sync vs. async.


I don't really like it or think it's necessary, but it doesn't seem
crippling and I'd live with it if needed to come to a resolution.  It does
lead to making people jump some extra hoops, though.

The original use case that led me to this in the first place was an
autocomplete worker.  The worker receives an ordinary message with the
user's typed text, eg. {text: "intern"}, and starts searching.  The search
may take longer than it takes for the user to type the next letter, and I
wanted to be able to immediately stop the search if another message comes
along, so I can restart with the new text.  The simplest way to do that is
to occasionally poll for a new message (zero timeout), and when {text:
"interne"} comes along, restart the search.
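
A rough sketch of that polling pattern, under stated assumptions: the
port's pending messages are modeled as a plain array, and pollMessage,
search and autocomplete are hypothetical names standing in for a
zero-timeout poll and the worker's search loop. None of these are
existing or proposed APIs.

```javascript
// Hypothetical stand-in for a zero-timeout poll on a port: returns the
// next pending message, or undefined when nothing is waiting.
function pollMessage(inbox) {
  return inbox.shift();
}

// Search `words` for entries starting with `text`, polling every
// `chunkSize` items. Returns { restartWith } if a newer query arrived,
// otherwise { results }.
function search(words, text, inbox, chunkSize = 2) {
  const results = [];
  for (let i = 0; i < words.length; i++) {
    if (i % chunkSize === 0) {
      const newer = pollMessage(inbox);
      if (newer !== undefined) return { restartWith: newer.text };
    }
    if (words[i].startsWith(text)) results.push(words[i]);
  }
  return { results };
}

// Keep restarting the search until one run finishes without interruption.
function autocomplete(words, firstText, inbox) {
  let text = firstText;
  for (;;) {
    const outcome = search(words, text, inbox);
    if (outcome.results) return { text, results: outcome.results };
    text = outcome.restartWith;
  }
}

const words = ["intern", "internal", "internet", "interval", "into"];
const inbox = [{ text: "interne" }]; // a newer query is already pending
const answer = autocomplete(words, "intern", inbox);
console.log(answer.text, answer.results);
```

The search for "intern" is abandoned as soon as the poll sees the newer
message, and only the "interne" search runs to completion.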

This isn't impossible with this restriction, but you do have to jump a few
extra hoops.  It'd need a separate cancel port which can be accessed
synchronously; when a message shows up on it, cancel the search so the next
{text} message (on a regular port) can be received.

Actually, I don't think that would work, since the order of messages across
ports is unspecified...

An aside: it should always be possible to poll a message port, regardless
of whether you're a parent or child.  (It doesn't cause deadlocks since
it's nonblocking, and it allows the above scenario even if blocking is only
allowed from the parent side.)

 Unless you structure your code such that it's the responsibility of the
 consumer of the API to create the channel. That way the consumer of
 the API can choose if it wants to use blocking or non-blocking.


But then you still end up with a port which is heavily restricted in where
and how it can be passed around.  If you want to pass it to your
great-great-grandfather thread, you have to post it to your parent, who
posts it to his parent, who posts it to his parent.  Posting it directly
there isn't possible.  Passing it to siblings or uncles isn't possible at
all.

The same is true on the worker side; if it wants to pass its port to its
grandchild, it has to pass it to its child, who passes it again to its
child.  You have to carefully structure your messaging to always do this.

And do note that even your proposal requires the async side of a
 message channel to be aware that the other side might be using
 synchronous polling. If it wants to support the 

Re: Sync API for workers

2012-09-05 Thread Jonas Sicking
On Mon, Sep 3, 2012 at 8:55 PM, Glenn Maynard gl...@zewt.org wrote:
 On Mon, Sep 3, 2012 at 9:30 PM, Jonas Sicking jo...@sicking.cc wrote:

 We can't generically block on children since we can't let the main
 window block on a child. That would effectively permit synchronous IO
 from the main thread which is not something that we want to allow.


 The UI thread would never be allowed to block, of course.  The getMessage
 API itself would never even be exposed in the UI thread, regardless of the
 state of this flag.

The problem with an "only allow blocking on children, except that
window can't block on its children" rule is that you can never block on a
computation which is implemented in the main thread. I think that cuts
out some major use cases since today's browsers have many APIs which
are only implemented in the main thread.

APIs only existing on the main thread is likely going to always be the
case, even once browsers get better at implementing APIs in the main
thread and worker threads at the same time, since there will always be
some APIs that are main-thread only, like the DOM.

 Your proposal makes it possible for pages to avoid the problems
 described in my email by setting up a separate channel used for
 synchronous messages. But some of the problems still remain. As soon
 as a message channel is used for both synchronous and asynchronous
 messages you can easily get into trouble. If someone calls the
 blocking waitForMessage() function and receive a message which was
 intended to be delivered asynchronously there is no good recourse.
 Basically any time that happens there are only bad options available,
 many of which have subtle problems that only happen intermittently
 like the ones I described in my initial email.

 Since that is the case, I think the best solution is to always force
 separate channels to be used for synchronous and asynchronous
 messages.

 If you have messages that must be received synchronously, and other messages
 that must be received asynchronously, then that's precisely a time when
 you'd want to use MessagePorts to separate them.  That's what they're for.
 It's the same as using separate MessagePorts when you have two unrelated
 libraries receiving their own messages, so each library only sees messages
 intended for it.
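
 A small sketch of the difference being discussed, assuming each port
 is modeled as a FIFO array and blockingReceive is a hypothetical
 stand-in for a blocking wait (not a real API). On a shared port the
 blocking receiver can take a message meant for the asynchronous
 listener; with separate ports it only ever sees replies.

```javascript
// Hypothetical model: each port is a FIFO array of pending messages.
function blockingReceive(port) {
  // Stand-in for a blocking wait: returns whatever message is next.
  return port.shift();
}

// Shared port: the blocking receiver takes a message that was meant to
// be delivered asynchronously.
const sharedPort = [{ kind: "progress", pct: 10 }, { kind: "reply", value: 42 }];
const stolen = blockingReceive(sharedPort);

// Separate ports: asynchronous messages stay on their own port.
const asyncPort = [{ kind: "progress", pct: 10 }];
const syncPort = [{ kind: "reply", value: 42 }];
const reply = blockingReceive(syncPort);

console.log(stolen.kind); // progress
console.log(reply.kind);  // reply
```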

 I agree that APIs that encourage people to write brittle code should be
 squinted at carefully, and we should definitely examine all APIs for that
 problem, but really I don't think it's the case here.

You are more optimistic than I am.

The fact that all the examples that people have used while we have
been discussing synchronous messaging have spun event loops in
attempts to deal with messages that couldn't be handled by the
synchronous poller makes me very much think that so will web
developers.

And the fact that using the existing communication channel is much
simpler than manually setting up a new one using |new MessageChannel|
and transferring ports using postMessage further adds to that.

So I would say that there's a high degree of risk that people will get
this wrong.

 Importantly, the sending side
 doesn't have to know whether the receiving side is using a sync API to
 receive it or not--in other words, that information doesn't have to be part
 of the user's messaging protocol.

I agree that this is a desirable trait. But so far I think the risks in
encouraging event loop spinning outweigh the benefits.

I also wonder if what you are describing doesn't make more sense when
communicating with a child worker and blocking on receiving a response
from it.

Another trait that this loses is the ability to terminate a worker as
soon as we know that a synchronous response can't be sent. I.e. in
proposal 1 and 2 the implementation can terminate the worker as soon
as the object with the .reply() function is GCed. Note that this
doesn't expose any GC behavior since a forever blocked worker
behaves exactly the same as a terminated worker. I.e. neither will
ever execute any code.

 All in all this is a much more complicated setup though. I think it'd
 be worth keeping the simpler API like the 1 or 2 proposals even if we
 do introduce SyncMessageChannel since that likely covers the majority
 of use cases.


 Those proposals seem much more complex to me.  You can't send a message that
 will be received synchronously unless the other side prompts you for one
 first; you have to care whether the other side is acting synchronously or
 asynchronously.  It's a bunch of new concepts (synchronous messages,
 message replies), instead of a simple (to users, at least) addition to
 MessagePorts.

Fewer APIs isn't the same thing as a simpler API. On the contrary, I
think trying to fit too much functionality into the same set of
functions can easily result in more complexity.

I think this is fairly well illustrated by the set of rules that you
ended up having to set up in order to make the blocking permitted
flags work out correctly. And your algorithm produces 

Re: Sync API for workers

2012-09-05 Thread Jonas Sicking
On Wed, Sep 5, 2012 at 7:03 PM, Jonas Sicking jo...@sicking.cc wrote:
 [Constructor]
 interface MessageChannel {
   readonly attribute MessagePortSyncSide syncPort;
   readonly attribute MessagePortAsyncSide asyncPort;
 };

This should of course say SyncMessageChannel.

/ Jonas



Re: Sync API for workers

2012-09-05 Thread Glenn Maynard
On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote:

  The problem with an "only allow blocking on children, except that
 window can't block on its children" rule is that you can never block on a
 computation which is implemented in the main thread. I think that cuts
 out some major use cases since today's browsers have many APIs which
 are only implemented in the main thread.


You can't have both--you have to choose one of 1: allow blocking upwards,
2: allow blocking downwards, or 3: allow deadlocks.  (I believe #1 is more
useful than #2, but each proposal can go both ways.  I'm ignoring more
complex deadlock detection algorithms that can allow both #1 and #2, of
course, since that's a lot harder.)
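
As an illustration of why allowing both directions invites deadlock,
here is a minimal sketch. It assumes a simplified model in which each
thread blocks on at most one other thread; hasDeadlock and the thread
names are hypothetical.

```javascript
// Hypothetical model: waitsOn maps a thread name to the thread it is
// currently blocked on (at most one per thread).
function hasDeadlock(waitsOn) {
  for (const start of Object.keys(waitsOn)) {
    const seen = new Set([start]);
    let t = waitsOn[start];
    while (t !== undefined) {
      if (seen.has(t)) return true; // the wait chain loops back on itself
      seen.add(t);
      t = waitsOn[t];
    }
  }
  return false;
}

// Blocking in one direction only: every chain ends at a thread that is
// not blocked, so there is no cycle.
const oneDirection = { grandchild: "child", child: "parent" };

// Both directions allowed: parent and child can block on each other.
const bothDirections = { child: "parent", parent: "child" };

console.log(hasDeadlock(oneDirection));
console.log(hasDeadlock(bothDirections));
```

With a single permitted direction the wait chain can only move one way
through the thread tree, so it can never return to its starting point.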

 The fact that all the examples that people have used while we have
 been discussing synchronous messaging have spun event loops in
 attempts to deal with messages that couldn't be handled by the
 synchronous poller makes me very much think that so will web
 developers.


getMessage doesn't spin the event loop.  Spinning the event loop means
that tasks are run from task queues (such as asynchronous callbacks) which
might not be expecting to run, and that tasks might be run recursively;
none of that happens here.  All this does is block until a message is
available on a specified port (or ports), and then returns it--it's just a
blocking call, like sync XHR or FileReaderSync.
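
A sketch of the distinction, under a deliberately simplified model:
one thread has a queue of pending tasks (timers, callbacks) and a
separate queue of messages pending on a port. The names makeThread,
spinUntilMessage and run are hypothetical, and getMessage here only
imitates the proposed call.

```javascript
// Hypothetical model of one thread's state.
function makeThread() {
  const log = [];
  return {
    log,
    tasks: [() => log.push("timer callback")], // queued asynchronous work
    portMessages: ["reply"],                   // messages pending on a port
  };
}

// Spinning: queued tasks are run while waiting for the message, so
// unrelated code executes in the middle of the caller.
function spinUntilMessage(t) {
  while (t.tasks.length > 0) t.tasks.shift()();
  return t.portMessages.shift();
}

// getMessage-style receive: takes the message and leaves tasks alone.
function getMessage(t) {
  return t.portMessages.shift();
}

function run(receive) {
  const t = makeThread();
  t.log.push("begin");
  const msg = receive(t);
  t.log.push("end:" + msg);
  return t.log;
}

const spinLog = run(spinUntilMessage);
const blockLog = run(getMessage);
console.log(spinLog.join(" | "));
console.log(blockLog.join(" | "));
```

In the spinning variant the timer callback runs between "begin" and
"end"; in the getMessage variant nothing runs in between.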

I also wonder if what you are describing doesn't make more sense when
 communicating with a child worker and blocking on receiving a response
 from it.


That's what I meant, based on a use case you brought up: a library which
implements the synchronous IndexedDB API.  I think that's by far the most
interesting category of use cases raised for this feature so far, the
ability to implement sync APIs from async APIs (or several async APIs).

Another trait that this loses is the ability to terminate a worker as
 soon as we know that a synchronous response can't be sent. I.e. in
 proposal 1 and 2 the implementation can terminate the worker as soon
 as the object with the .reply() function is GCed. Note that this
 doesn't expose any GC behavior since a forever blocked worker
 behaves exactly the same as a terminated worker. I.e. neither will
 ever execute any code.


It's the same: terminate when the other MessagePort is GC'd or its
port.close is called.

(The MessagePort cross-process GC issues might sometimes prevent that, but
that's just another instance of the issue that already exists.  By the way,
do you happen to remember where that issue was last discussed in detail?
I'd like to refresh my memory on the details of this problem.)

 Fewer APIs isn't the same thing as a simpler API. On the contrary, I
 think trying to fit too much functionality into the same set of
 functions can easily result in more complexity.


Sure, but I do think they're the same in this case.

I think this is fairly well illustrated by the set of rules that you
 ended up having to set up in order to make the blocking permitted
 flags work out correctly.


Explaining this to users is simple: if you want to block on a port, it
needs to only ever be transferred above its other side, not below.

And your algorithm produces weird edge
 cases, such as that it matters if someone sets up a message proxy
 which forwards all messages from one channel to another, rather than
 just passes on one end of a channel. With such a proxy your ports
 end up touching more threads and so are more likely to clear the
 blocking permitted flag. In all other cases such a proxy is
 transparent.


The previous proposals allow nothing *but* blocking on your parent (or
child), so if you have threads A - B - C and you want to pass messages
from C to A, you *have* to proxy messages across (and keep thread B alive
forever as a result).  That's a big part of why we have MessagePorts to
begin with.  That aside, the iteration below removes most of the cases
where you can't block, eg. passing a port up to your parent and then down
to a sibling.



This is a bit wordier, but I think it's easier to understand, since all it
cares about is where the port was originally created.

- Add a direction flag to ports, which may have the values "up", "down",
"disallowed" and "initial", and is initially "initial".
- Add an "original thread" value to ports, which is set to the current
thread at MessageChannel creation time.  This value is preserved across
structured clone.
- When a thread receives a port, compare its original thread with the
current thread.
  - If the root thread of the original thread is not the same as that of
the current thread, mark the port as "disallowed".
  - Otherwise, if the original thread is the current thread, mark the port as
"initial".
  - Otherwise, if the original thread is a descendant of the current thread,
mark the port as "up".
  - Otherwise, mark the port as "down".
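
The rules above can be sketched roughly as follows. This is only an
illustration: makeThread, rootOf, isDescendantOf and classifyPort are
hypothetical names, threads are modeled as objects with a parent link,
and the root thread is assumed to be the topmost ancestor.

```javascript
// Hypothetical thread model: a name and a parent (null for root threads
// such as the UI thread or a shared worker).
function makeThread(name, parent = null) {
  return { name, parent };
}

// Assumption: the root thread is the topmost ancestor.
function rootOf(thread) {
  while (thread.parent !== null) thread = thread.parent;
  return thread;
}

function isDescendantOf(thread, ancestor) {
  for (let t = thread.parent; t !== null; t = t.parent) {
    if (t === ancestor) return true;
  }
  return false;
}

// Direction flag for a port created on `original`, received on `current`.
function classifyPort(original, current) {
  if (rootOf(original) !== rootOf(current)) return "disallowed";
  if (original === current) return "initial";
  if (isDescendantOf(original, current)) return "up";
  return "down";
}

const ui = makeThread("ui");
const worker = makeThread("worker", ui);
const grandchild = makeThread("grandchild", worker);
const shared = makeThread("shared-worker");

console.log(classifyPort(grandchild, ui));  // created below, received above
console.log(classifyPort(ui, grandchild));  // created above, received below
console.log(classifyPort(worker, worker));  // never left its thread
console.log(classifyPort(worker, shared));  // different root threads
```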

The UI thread and shared workers are root threads.  The root thread of a
dedicated worker is its one ancestor thread which 

Re: Sync API for workers

2012-09-05 Thread Jonas Sicking
On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard gl...@zewt.org wrote:
 On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking jo...@sicking.cc wrote:

 The problem with an "only allow blocking on children, except that
 window can't block on its children" rule is that you can never block on a
 computation which is implemented in the main thread. I think that cuts
 out some major use cases since today's browsers have many APIs which
 are only implemented in the main thread.

 You can't have both--you have to choose one of 1: allow blocking upwards, 2:
 allow blocking downwards, or 3: allow deadlocks.  (I believe #1 is more
 useful than #2, but each proposal can go both ways.  I'm ignoring more
 complex deadlock detection algorithms that can allow both #1 and #2, of
 course, since that's a lot harder.)

Indeed. But I believe #2 is more useful than #1. I wasn't proposing
having both, I was proposing only doing #2.

It's actually technically possible to allow both #1 and #2 without
deadlock detection algorithms, but to keep things sane I'll leave that
as out of scope for this thread.

[snip]
 I think that's by far the most
 interesting category of use cases raised for this feature so far, the
 ability to implement sync APIs from async APIs (or several async APIs).

That is certainly an interesting use case. I think another interesting
use case is being able to write synchronous APIs in workers whose
implementation uses APIs that are only available on the main thread.

That's why I'm not interested in only blocking on children, but rather
only blocking on parents.

 The fact that all the examples that people have used while we have
 been discussing synchronous messaging have spun event loops in
 attempts to deal with messages that couldn't be handled by the
 synchronous poller makes me very much think that so will web
 developers.

 getMessage doesn't spin the event loop.  Spinning the event loop means
 that tasks are run from task queues (such as asynchronous callbacks) which
 might not be expecting to run, and that tasks might be run recursively; none
 of that happens here.  All this does is block until a message is
 available on a specified port (or ports), and then returns it--it's just a
 blocking call, like sync XHR or FileReaderSync.

The example from Olli's proposal 3 does what effectively amounts to
spinning an event loop. It pulls out a bunch of events from the
normal event loop and then manually dispatches them in a while loop.
The behavior is exactly the same as spinning the event loop (except
that non-message tasks don't get dispatched).

 I think this is fairly well illustrated by the set of rules that you
 ended up having to set up in order to make the blocking permitted
 flags work out correctly.

 Explaining this to users is simple: if you want to block on a port, it
 needs to only ever be transferred above its other side, not below.

Claiming that we don't need to explain all edge cases to authors and
just give them a simplified version would, I think, be ignoring the
complexity of software that people write using the web platform.

 On Wed, Sep 5, 2012 at 9:03 PM, Jonas Sicking jo...@sicking.cc wrote:

 The part that I dislike about having single channel used for both sync
 and async messaging is that you end up with one or more async
 listeners which expect to get notified about all incoming messages,
 but then you have an API which steals a message away from those
 listeners. On top of that it has to do that stealing without any way
 of ensuring that it actually steals the right message.

 That's exactly the reason to use MessagePorts: to categorize messages.

So it sounds like you are ok with not permitting the use of both
synchronous and asynchronous messages on the same port then? As long
as some ports allow synchronous messages and others allow asynchronous
messages. Leaving aside the issue of how and when it is determined
that a port is sync vs. async.

 But being (mostly) agnostic to if the other side is using sync
 messages or not doesn't mean that the other side uses both sync and
 async messaging!

 But you still have to do extra work on the sending side to support both sync
 and async receiving, since you have to hand it the right type of channel.

Unless you structure your code such that it's the responsibility of the
consumer of the API to create the channel. That way the consumer of
the API can choose if it wants to use blocking or non-blocking.

 Couldn't we just make calling getMessage permanently disable .onmessage
 dispatching (perhaps until the port is posted again)?  That would make it
 very hard to accidentally use both, while encapsulating knowledge about
 which way it's being used to the receiver, so the sender doesn't need to
 carefully send the receiver the right type of MessageChannel.  (I really
 don't feel this is necessary, but I'd prefer it to multiple MessagePort
 interfaces.)

We could define that calling .start() sets the port in async mode,
at which point 

Re: Sync API for workers

2012-09-04 Thread David Bruant

Hi,

Before anything else, thanks for this detailed and quite complete 
explanation.


On 03/09/2012 23:32, Jonas Sicking wrote:

The other thing that I wanted to talk about is use cases. It has been
claimed in this thread that synchronous message passing isn't needed
and that people can just write code using async patterns. While this
is absolutely true, I would absolutely say that writing asynchronous
code is dramatically more complicated than writing synchronous code.
I acknowledge that writing async code may be hard for some with 
JavaScript-as-it-is.
The solution I have found personally to this issue is using promises,
which make async code look sync (+some noise due to JS-the-language).
Others think promises are a good solution [1].
Dave Herman has created task.js [2] to solve the same problem 
differently (promises+generators).
Some have created compile-to-JS languages (like Roy as I said in a 
previous message) to solve the problem in yet another way.

Others find other solutions.

All these solutions have in common that they reduce the complexity of
writing/reading async code while keeping its benefits, at the cost of a
small layer of code.


The proposed solution here throws away all benefits of async code to 
reduce the complexity of writing async code by... writing sync code.


I wish we'd explore more solutions to make async more workable rather 
than throwing away async.
The problem with blocking workers is that it may create a culture of 
creating always more and more blocking workers (like Apache creates more 
and more threads to handle more blocking connections).
I understand the benefit that may come with a sync API, but we all know 
that APIs aren't used the way they're primarily intended and sometimes 
with very bad consequences. I'm afraid the bad consequences of a sync 
API misuse haven't been explored. Maybe soon you'll have people filing
bugs about poor Firefox memory usage when using a lot of workers.



This is one of the big reasons that we have workers at all.
I had never heard this argument before the topic of sync messaging API 
for workers. Where does it come from?
When I read the worker API, I see a way to create a new computation unit 
and to send messages back and forth, nothing about writing sync code.

Regardless of goal, do people actually write more sync code with workers?
Taras Glek seems to think that the local storage API (which is sync) is 
not a good fit for workers [3]:
We could expose this API to workers, but then we run into an ethical 
question of bringing crappy APIs to new environments. (the article 
mentions that the localStorage API is synchronous as part of the 
crappy aspect of it)



There is also another use-case which has been brought up. As the web
platform is becoming more powerful, people have started converting
code written for other platforms to javascript+html.

By html, here, do you mean something else than canvas?
Is there something that compiles any Windows/Mac/Linux UI framework into
HTML5?



For example the
emscripten[1] and mandreel[2] allow recompiling C++ code to javascript
which is then run in a web browser.
I've been following loosely this topic and all examples I've seen were 
either about pure computation (like turning a C GZIP library in JS) or 
graphics stuff (using canvas, hence my above question).



Many times such code relies on APIs or libraries which contain blocking calls.
Do you have an example of that? I haven't seen one so far, but that's an 
interesting point.
Specifically, I can't recall having seen any C/C++-to-JS example that was
doing IO (maybe files, but not network).
Last I heard, Emscripten compiles to JS from LLVM bytecode. I'm not sure 
they rely on any library containing blocking calls.

But as I said, I have been following that loosely.


Technically it might
be possible to automatically rewrite such code to use asynchronous
coding patterns, but so far I don't think anyone has managed to do
that.
Naively, I would say, that once we've paid the price to compile code 
from one language to another, you're not that far off from compiling to 
a given coding pattern. Especially compiling from LLVM bytecode.
Regardless, has anyone tried at all? Has anyone tried to compile to 
JS+Promises or JS+task.js?
All of the compile-to-the-web movement started recently. Only now do we 
start seeing what it can do. It's just the beginning.
Also, I wish the demand came from the people who do work on Emscripten
or Mandreel, that they came to standards mailing-list saying "it took us
a billion hours to compile I/O to JS correctly, we tried promises,
task.js and it didn't help. It would have taken 10 seconds if we had a
sync API". But I can't recall having read such a message yet.
Compile-to-the-web is a complicated field and I wish we didn't try to 
guess for them what they need before they ask.


David

[1] http://jeditoolkit.com/2012/04/26/code-logic-not-mechanics.html#post
[2] http://taskjs.org/
[3] (see before-last 

Re: Sync API for workers

2012-09-04 Thread Boris Zbarsky

On 9/4/12 5:23 AM, David Bruant wrote:

Also, I wish the demand came from the people who do work on Emscripten
or Mandreel


As far as I know, what Jonas is saying about Emscripten did come from 
the Emscripten folks.  I've certainly seen them say it in bugs in the 
recent past.  No guessing involved on our part there.



But I can't recall having read such a message yet.


Maybe it's because the barrier to posting on standards list is pretty 
high?  Starting with the requirement to subscribe to a high-traffic 
list.  Some people may decide they don't have time to deal with that.


-Boris



Re: Sync API for workers

2012-09-04 Thread David Bruant

On 04/09/2012 14:34, Boris Zbarsky wrote:

On 9/4/12 5:23 AM, David Bruant wrote:

Also, I wish the demand came from the people who do work on Emscripten
or Mandreel


As far as I know, what Jonas is saying about Emscripten did come from 
the Emscripten folks.  I've certainly seen them say it in bugs in the 
recent past.  No guessing involved on our part there.

Ok I wasn't aware of that. Do you have bug numbers in mind by any chance?
That doesn't change the fact that one could try using promises or
task.js to generate sync-like code more easily.


David



Re: Sync API for workers

2012-09-04 Thread Boris Zbarsky

On 9/4/12 8:54 AM, David Bruant wrote:

Ok I wasn't aware of that. Do you have bug numbers in mind by any chance?


I don't offhand, unfortunately.  Would have to search.

-Boris



Re: Sync API for workers

2012-09-04 Thread Glenn Maynard
On Tue, Sep 4, 2012 at 4:23 AM, David Bruant bruan...@gmail.com wrote:

 The proposed solution here throws away all benefits of async code to
 reduce the complexity of writing async code by... writing sync code.

 I wish we'd explore more solutions to make async more workable rather than
 throwing away async.


It seems like you're thinking of asynchronous code as fundamentally better
than synchronous code.  It's not; it has a set of advantages--ones that the
Web needs badly for the UI thread, in order for scripts and the browser to
coexist.  It also has a set of serious disadvantages.  We're not throwing
away async; we're bringing sync back into the game where it's appropriate.

 The problem with blocking workers is that it may create a culture of
 creating always more and more blocking workers (like Apache creates more
 and more threads to handle more blocking connections).


You're not talking about this particular API here, you're talking about
every sync API in workers.  Having sync APIs in workers and performing
blocking tasks in workers isn't something new.

This is one of the big reasons that we have workers at all.

 I had never heard this argument before the topic of sync messaging API for
 workers. Where does it come from?
 When I read the worker API, I see a way to create a new computation unit
 and to send messages back and forth, nothing about writing sync code.
 Regardless of goal, do people actually write more sync code with workers?


The very first example in the spec is doing work synchronously.
http://www.whatwg.org/specs/web-apps/current-work/#a-background-number-crunching-worker

Taras Glek seems to think that the local storage API (which is sync) is not
 a good fit for workers [3]:
 We could expose this API to workers, but then we run into an ethical
 question of bringing crappy APIs to new environments. (the article
 mentions that the localStorage API is synchronous as part of the crappy
 aspect of it)


(The synchronous part is bad for the UI thread, but not a problem in
workers, so this isn't a very good argument, at least as summarized here.)

 Many times such code relies on APIs or libraries which contain blocking
 calls.


Do you have an example of that? I haven't seen one so far, but that's an
 interesting point.
 Specifically, I can't recall having seen any C/C++-to-JS example that was
 doing IO (maybe files, but not network),


I believe that was his point--it's very hard to programmatically convert
synchronous code to asynchronous code.

 Technically it might be possible to automatically rewrite such code to use
 asynchronous
 coding patterns, but so far I don't think anyone has managed to do
 that.

Naively, I would say, that once we've paid the price to compile code from
 one language to another, you're not that far off from compiling to a given
 coding pattern. Especially compiling from LLVM bytecode.
 Regardless, has anyone tried at all? Has anyone tried to compile to
 JS+Promises or JS+task.js?
 All of the compile-to-the-web movement started recently. Only now do we
 start seeing what it can do. It's just the beginning.
 Also, I wish the demand came from the people who do work on Emscripten or
 Mandreel, that they came to standards mailing-list saying "it took us a
 billion hours to compile I/O to JS correctly, we tried promises, task.js
 and it didn't help. It would have taken 10 seconds if we had a sync API".
 But I can't recall having read such a message yet. Compile-to-the-web is a
 complicated field and I wish we didn't try to guess for them what they need
 before they ask.


I don't want to use a heavily layered environment that compiles to
JavaScript in order to write linear code.  I want the Web platform to be
robust on its own, without complicated systems piled up on top of it.
Expecting that to be a solution is just throwing in the hat and giving up.
It's the "you don't need a native API for that, just use a library"
argument notched up several orders of magnitude.

-- 
Glenn Maynard


Re: Sync API for workers

2012-09-04 Thread David Bruant

On 04/09/2012 17:03, Glenn Maynard wrote:
On Tue, Sep 4, 2012 at 4:23 AM, David Bruant bruan...@gmail.com wrote:


The proposed solution here throws away all benefits of async code
to reduce the complexity of writing async code by... writing sync
code.

I wish we'd explore more solutions to make async more workable
rather than throwing away async.


It seems like you're thinking of asynchronous code as fundamentally 
better than synchronous code.  It's not; it has a set of 
advantages--ones that the Web needs badly for the UI thread, in order 
for scripts and the browser to coexist.  It also has a set of serious 
disadvantages.
Cognitive load is the only one mentioned so far. It is a serious issue 
since for the foreseeable future, only human beings will be writing 
code. However, as said, there are solutions to reduce this load.

I wish to share an experience.
Back in April, I gave a JavaScript/jQuery training to people who knew 
programming, but didn't know JavaScript. I made the decision to teach 
promises right away (jQuery has them built-in, so that's easy). It seems 
that it helped a lot understanding async programming.

The cognitive load has its solutions.

We're not throwing away async; we're bringing sync back into the game 
where it's appropriate.

True. I was exaggerating a bit here :-)



The problem with blocking workers is that it may create a culture
of creating always more and more blocking workers (like Apache
creates more and more threads to handle more blocking connections).


You're not talking about this particular API here, you're talking 
about every sync API in workers.  Having sync APIs in workers and 
performing blocking tasks in workers isn't something new.


This is one of the big reasons that we have workers at all.

I had never heard this argument before the topic of sync messaging
API for workers. Where does it come from?
When I read the worker API, I see a way to create a new
computation unit and to send messages back and forth, nothing
about writing sync code.
Regardless of goal, do people actually write more sync code with
workers?


The very first example in the spec is doing work synchronously. 
http://www.whatwg.org/specs/web-apps/current-work/#a-background-number-crunching-worker
This is a very interesting example and I realize that I have used
"blocking" and "sync" interchangeably by mistake. I'm against blocking,
but not sync.
What I'm fundamentally (to answer what you said above) against is the
idea of blocking a computation unit (like a worker) that does nothing
but idly wait (for IO or a message for instance). It seems that
proposals so far make the worker wait for a message and do nothing 
meanwhile and that's a pure waste of resources. A worker has been paid 
for (memory, init time...) and it's waiting while it could be doing 
other things.
The current JS event loop run-to-completion model prevents that waste by 
design.




Taras Glek seems to think that the local storage API (which is
sync) is not a good fit for workers [3]:
We could expose this API to workers, but then we run into an
ethical question of bringing crappy APIs to new environments.
(the article mentions that the localStorage API is synchronous as
part of the crappy aspect of it)


(The synchronous part is bad for the UI thread, but not a problem in 
workers, so this isn't a very good argument, at least as summarized here.)


Many times such code relies on APIs or libraries which contain
blocking calls.


Do you have an example of that? I haven't seen one so far, but
that's an interesting point.
Specifically, I can't recall having seen any C/C++-to-JS example
that was doing IO (maybe files, but not network),


I believe that was his point--it's very hard to programmatically 
convert synchronous code to asynchronous code.
True, but I mean, we could have read intentions or blog posts of people
saying "it's way too hard".




Technically it might be possible to automatically rewrite such
code to use asynchronous
coding patterns, but so far I don't think anyone has managed to do
that.

Naively, I would say, that once we've paid the price to compile
code from one language to another, you're not that far off from
compiling to a given coding pattern. Especially compiling from
LLVM bytecode.
Regardless, has anyone tried at all? Has anyone tried to compile
to JS+Promises or JS+task.js?
All of the compile-to-the-web movement started recently. Only now
do we start seeing what it can do. It's just the beginning.
Also, I wish the demand came from the people who do work on
Emscripten or Mandreel, that they came to standards mailing-list
saying it took us a billion hours to compile I/O to JS correctly,
we tried promises, task.js and it didn't help. It would have taken
10 seconds if we had a sync API. 

Re: Sync API for workers

2012-09-04 Thread Glenn Maynard
On Tue, Sep 4, 2012 at 10:32 AM, David Bruant bruan...@gmail.com wrote:

 Cognitive load is the only one mentioned so far. It is a serious issue
 since for the foreseeable future, only human beings will be writing code.

However, as said, there are solutions to reduce this load.
 I wish to share an experience.
 Back in April, I gave a JavaScript/jQuery training to people who knew
 programming, but didn't know JavaScript. I made the decision to teach
 promises right away (jQuery has them built-in, so that's easy). It seems
 that it helped a lot understanding async programming.
 The cognitive load has its solutions.


(Understanding asynchronous programming isn't really the issue.  I'm sure
everyone in this discussion has an intuitive grasp of that.)

Those are attempts at making asynchronous code easier to write; they're not
substitutes for synchronous code.  They still result in code with less
understandable, well-scoped state.

This is a very interesting example and I realize that I have used
 "blocking" and "sync" interchangeably by mistake. I'm against blocking, but
 not sync.
 What I'm fundamentally (to answer what you said above) against is the idea
 of blocking a computation unit (like a worker) that does nothing but idly
 wait (for IO or a message for instance). It seems that proposals so far
 make the worker wait for a message and do nothing meanwhile and that's a
 pure waste of resources. A worker has been paid for (memory, init time...)
 and it's waiting while it could be doing other things.
 The current JS event loop run-to-completion model prevents that waste by
 design.


Workers broke away from requiring the "do a bit of work then keep returning
to the event loop" model of the UI thread from the start.  This is no
different than the APIs we already have.  To take an earlier example:

var worker = createDictionaryWorker();
worker.postMessage("elephant");
var definition = getMessage(worker); // wait for the answer

This is no different than a sync XHR or IndexedDB call to do the same thing:

var xhr = new XMLHttpRequest();
xhr.open("GET", "/dictionary?elephant", false); // sync
xhr.send();
var definition = xhr.responseText;

It simply allows workers, not just native code, to implement these APIs.
That's a natural step.
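
For contrast, here is a rough sketch of what the same lookup looks like with only the existing asynchronous postMessage/onmessage API. createDictionaryWorker is the hypothetical helper from the example above; it is stubbed here with a plain object (including an invented deliver() method that stands in for a later turn of the event loop) so the snippet runs outside a browser. lookup() is likewise an illustrative name, not a proposed API.

```javascript
// Hypothetical stand-in for the dictionary worker in the example above.
// A real Worker would answer from another thread; this stub queues the reply
// and hands it over only when deliver() is called, imitating a later turn
// of the event loop.
function createDictionaryWorker() {
  var definitions = { elephant: "a large mammal with a trunk" };
  var pending = [];
  var worker = {
    onmessage: null,
    postMessage: function (word) {
      pending.push({ data: definitions[word] });
    },
    deliver: function () {
      while (pending.length > 0) {
        var event = pending.shift();
        if (worker.onmessage) worker.onmessage(event);
      }
    }
  };
  return worker;
}

// Asynchronous lookup: the caller has to supply a continuation, because the
// answer cannot be returned from this function.
function lookup(worker, word, callback) {
  worker.onmessage = function (event) {
    worker.onmessage = null; // one-shot listener
    callback(event.data);
  };
  worker.postMessage(word);
}

var worker = createDictionaryWorker();
var definition;
lookup(worker, "elephant", function (answer) {
  definition = answer;
});
// Here `definition` is still undefined: the reply has not been dispatched yet.
worker.deliver();
// Only now has the callback run and `definition` been filled in.
```

The rest of the caller's logic has to live inside (or be reachable from) the callback, which is the restructuring a blocking getMessage would avoid.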

-- 
Glenn Maynard


Re: Sync API for workers

2012-09-04 Thread David Bruant
On 04/09/2012 18:46, Glenn Maynard wrote:
 On Tue, Sep 4, 2012 at 10:32 AM, David Bruant bruan...@gmail.com wrote:

 Cognitive load is the only one mentioned so far. It is a serious
 issue since for the foreseeable future, only human beings will be
 writing code.

 However, as said, there are solutions to reduce this load.
 I wish to share an experience.
 Back in April, I gave a JavaScript/jQuery training to people who
 knew programming, but didn't know JavaScript. I made the decision
 to teach promises right away (jQuery has them built-in, so that's
 easy). It seems that it helped a lot understanding async programming.
 The cognitive load has its solutions.


 (Understanding asynchronous programming isn't really the issue.  I'm
 sure everyone in this discussion has an intuitive grasp of that.)

 Those are attempts at making asynchronous code easier to write;
 they're not substitutes for synchronous code.  They still result in
 code with less understandable, well-scoped state.
I'm sorry, but I have to disagree. Have you ever used promises in a
large-scale project?
I've been amazed to discover that promise-based APIs are ridiculously
easier to refactor than callback-based APIs. Obviously, refactoring
necessitates well-scoped state. I can't show the commit I have in mind,
because it's in closed-source software, but really, a promise-based API
isn't less understandable and less well-scoped. That statement runs
contrary to my experience over these last 8 months.



 This is a very interesting example and I realize that I have used
 "blocking" and "sync" interchangeably by mistake. I'm against
 blocking, but not sync.
 What I'm fundamentally (to answer what you said above) against is
 the idea of blocking a computation unit (like a worker) that does
 nothing but idly wait (for IO or a message for instance). It
 seems that proposals so far make the worker wait for a message and
 do nothing meanwhile and that's a pure waste of resources. A
 worker has been paid for (memory, init time...) and it's waiting
 while it could be doing other things.
 The current JS event loop run-to-completion model prevents that
 waste by design.


 Workers broke away from requiring the "do a bit of work then keep
 returning to the event loop" model of the UI thread from the start.
 This is no different than the APIs we already have.  To take an
 earlier example:

 var worker = createDictionaryWorker();
 worker.postMessage("elephant");
 var definition = getMessage(worker); // wait for the answer

 This is no different than a sync XHR or IndexedDB call to do the same
 thing:

 var xhr = new XMLHttpRequest();
 xhr.open("GET", "/dictionary?elephant", false); // sync
 xhr.send();
 var definition = xhr.responseText;

 It simply allows workers, not just native code, to implement these
 APIs.  That's a natural step.
I understand and agree, but you're not addressing the problem of the
resource waste I've mentioned above.
Even if you're doing sync xhr in a worker, you're wasting the worker
time, because it could be computing other things while waiting for the
network to respond. That problem was obvious in the main thread because
it was resulting in poor user experience, but the problem still holds
with workers.
What do you do if your worker is busy idling while waiting for network,
but still need some other work to be done? Open another worker? And when
this one is idling and you need work done? Open another worker?

To put the two side by side: is the readability worth the
waste of resources?
That's a genuine question. My experience with Node.js (which also
provides sync methods for IO) is that for small scripts, sync methods
are more convenient than callbacks or even promises. But arguably, for
small scripts, readability isn't that big of a concern by nature of a
small script.

David


Re: Sync API for workers

2012-09-04 Thread Glenn Maynard
On Tue, Sep 4, 2012 at 12:49 PM, David Bruant bruan...@gmail.com wrote:

 I'm sorry, but I have to disagree. Have you ever used promises in a
 large-scale project?
 I've been amazed to discover that promise-based APIs are ridiculously
 easier to refactor than callback-based APIs. Obviously, refactoring
 necessitates well-scoped state. I can't show the commit I have in mind,
 because it's in closed-source software, but really, a promise-based API
 isn't less understandable and less well-scoped. That statement runs
 contrary to my experience over these last 8 months.


You have to choose between scoping state to a class (poor scoping) or in
closures (hard to debug) instead of using locals in a call stack (tightly
scoped and easy to debug); the overall current state of execution is much
harder to see compared to a stack trace; the basic idea of stepping through
code in a debugger scarcely translates at all.

 I understand and agree, but you're not addressing the problem of the
 resource waste I've mentioned above.


I don't feel like I need to, because I expect this question was explored
before workers were introduced in the first place.  You apparently want to
argue against *all* sync APIs, but you should do that separately, rather
than singling out one sync API at random.

-- 
Glenn Maynard


Re: Sync API for workers

2012-09-04 Thread David Bruant
On 04/09/2012 20:47, Glenn Maynard wrote:
 On Tue, Sep 4, 2012 at 12:49 PM, David Bruant bruan...@gmail.com wrote:

 I'm sorry, but I have to disagree. Have you ever used promises in
 a large-scale project?
 I've been amazed to discover that promise-based APIs are
 ridiculously easier to refactor than callback-based APIs.
 Obviously, refactoring necessitates well-scoped state. I can't
 show the commit I have in mind, because it's in closed-source
 software, but really, a promise-based API isn't less
 understandable and less well-scoped. That statement runs
 contrary to my experience over these last 8 months.


 You have to choose between scoping state to a class (poor scoping) or
 in closures (hard to debug) instead of using locals in a call stack
 (tightly scoped and easy to debug); the overall current state of
 execution is much harder to see compared to a stack trace; the basic
 idea of stepping through code in a debugger scarcely translates at all.
Tooling isn't perfect for async debugging. It's being worked on. Yet it
hasn't prevented web devs from building (and debugging) event-based code.

As someone else said in another message, async isn't going away. There
won't be new blocking API for the main thread, so all the costs of
learning async programming will have to be paid. Debugging included.
I'm less and less convinced there is really something substantial to win
from the JS developer perspective.

For small scripts, it will be possible to use blocking APIs, but the
cost of async in small scripts is bearable. For big scripts, blocking
APIs induce a performance cost that soon makes people move to async.


 I understand and agree, but you're not addressing the problem of
 the resource waste I've mentioned above.


 I don't feel like I need to, because I expect this question was
 explored before workers were introduced in the first place.
It likely hasn't because workers do not have access to blocking APIs
except sync xhr. Are there examples in the wild of people creating new
workers when one is doing a sync xhr or do people just turn their code
into async when performance becomes an issue?
If the question has been explored before, can anyone point to the
answer? Otherwise, the debate won't move forward on that point.

 You apparently want to argue against *all* sync APIs, but you should
 do that separately, rather than singling out one sync API at random.
As I said in a previous message, I'm arguing against the waste of
resources due to blocking APIs. If a sync API makes an actual use of the
worker and CPU, that's excellent. If it's blocking on IO, it's wasting
resources that could be doing other computations.

David


Re: Sync API for workers

2012-09-04 Thread Glenn Maynard
On Tue, Sep 4, 2012 at 2:23 PM, David Bruant bruan...@gmail.com wrote:

 Tooling isn't perfect for async debugging. It's being worked on. Yet it
 hasn't prevented web devs from building (and debugging) event-based code.


Developers work in lots of bad environments and get stuff done anyway.
That's no argument.

As someone else said in another message, async isn't going away. There
 won't be new blocking API for the main thread, so all the costs of learning
 async programming will have to be paid. Debugging included.


I can only repeat what I already said: Understanding asynchronous
programming isn't really the issue.  I'm sure everyone in this discussion
has an intuitive grasp of that.

  You apparently want to argue against *all* sync APIs, but you should do
 that separately, rather than singling out one sync API at random.

 As I said in a previous message, I'm arguing against the waste of
 resources due to blocking APIs.


That's what I said: you're arguing against all sync APIs, not *this* API.
I don't really want to spend more time on this tangent, since it's not
about this API at all but a higher-level concept, and one we already have
an answer to: synchronous APIs in workers are OK.  Again, if you want to
debate a basic premise of Web Workers, I recommend starting a separate
thread.

-- 
Glenn Maynard


Re: Sync API for workers

2012-09-04 Thread Ian Hickson
On Mon, 3 Sep 2012, Glenn Maynard wrote:
 
 - Add an internal flag to MessagePort, "blocking permitted", which is
 initially set.
 - When a MessagePort port is transferred from source to dest,
 - If source is an ancestor of dest, the "blocking permitted" flag of
 port is cleared.  (This is a down transfer.)

You basically can't do this, because by the time you've received the 
message saying that the port is in a permitted scope, the other side of 
the port could have been shunted three times and now be who knows where.

Basically as soon as a port leaves the scope in which it was created, you 
can no longer make any stable statements about where the other side is. 
This is why the ports used in dedicated Workers are hidden (so you can't 
send them anywhere).

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'



Re: Sync API for workers

2012-09-04 Thread David Bruant
[Forwarding a response from Alon Zakai, who is behind Emscripten and
CC'ing him]
  There is also another use-case which has been brought up. As the
  web
  platform is becoming more powerful, people have started converting
  code written for other platforms to javascript+html.
 By html, here, do you mean something else than canvas?
 Is there something that compiles any Windows/Mac/Linux UI framework
 into
 HTML5?
 There is pyjamas that compiles Python into JS and GWT that compiles Java into 
 JS. Both are UI frameworks, and being able to use sync calls in both would be 
 more natural since the original languages have lots of sync stuff I believe. 
 I don't know how important this use case is though. But we would like to 
 compile open source UI frameworks like Qt and GTK into JS using emscripten, 
 and sync would help a lot there.

  For example the
  emscripten[1] and mandreel[2] allow recompiling C++ code to
  javascript
  which is then run in a web browser.
 I've been following loosely this topic and all examples I've seen
 were
 either about pure computation (like turning a C GZIP library in JS)
 or
 graphics stuff (using canvas, hence my above question).

  Many times such code relies on APIs or libraries which contain
  blocking calls.
 Do you have an example of that? I haven't seen one so far, but that's
 an
 interesting point.
 Specifically, I can't recall having seen any C/C++-to-JS examples that
 were
 doing IO (maybe files, but not network),
 Last I heard, Emscripten compiles to JS from LLVM bytecode. I'm not
 sure
 they rely on any library containing blocking calls.
 But as I said, I have been following that loosely.
 The normal C/C++ IO calls are all synchronous - fopen, fread, etc.

 As you said above, this is indeed less of a problem for pure computation. But 
 when compiling a complete game engine for example (like in BananaBread), you 
 need to handle everything a complete app needs, including synchronous IO. 
 Both file IO and network IO (for multiplayer, downloading assets, etc. - 
 we haven't gotten around to that yet though in BananaBread) are relevant, as 
 well as synchronous GL operations using WebGL.

  Technically it might
  be possible to automatically rewrite such code to use asynchronous
  coding patterns, but so far I don't think anyone has managed to do
  that.
 Naively, I would say, that once we've paid the price to compile code
 from one language to another, you're not that far off from compiling
 to
 a given coding pattern. Especially compiling from LLVM bytecode.
 Regardless, has anyone tried at all? Has anyone tried to compile to
 JS+Promises or JS+task.js?
 Consider even the simple and very common case of

 ..
 while (!feof(f)) {
   fread(buf, 100, 1, f);
   ..process data in buf..
 }
 ..

 I don't have any good ideas for something asynchronous to compile this into 
 that does not currently substantially harm performance. Compiling synchronous 
 code into continuation passing style, generators, control flow emulation, or 
 some other async style would greatly reduce performance. In theory JS engines 
 could optimize those styles, but it would be hard and I don't think this is 
 at the top of anyone's list of priorities for any JS engine.
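To make the shape of that transformation concrete, here is a minimal
JavaScript sketch. It is not Emscripten output: FakeFile, readSync and
readAsync are hypothetical stand-ins for a compiled fread(), and the
"processing" is just a byte sum. It only illustrates how the plain while
loop turns into a self-rescheduling callback once the read is asynchronous.

```javascript
// Hypothetical in-memory "file"; stands in for whatever backs fread().
class FakeFile {
  constructor(bytes) { this.bytes = bytes; this.pos = 0; }
  eof() { return this.pos >= this.bytes.length; }
  // Synchronous read: returns up to n bytes immediately.
  readSync(n) {
    const chunk = this.bytes.slice(this.pos, this.pos + n);
    this.pos += chunk.length;
    return chunk;
  }
  // Asynchronous read: hands the chunk to a callback on a later turn.
  readAsync(n, callback) {
    const chunk = this.readSync(n);
    setTimeout(() => callback(chunk), 0);
  }
}

// Direct translation of the C loop: control flow stays a plain while loop.
function sumSync(file) {
  let total = 0;
  while (!file.eof()) {
    for (const b of file.readSync(100)) total += b;
  }
  return total;
}

// Continuation-passing version: the loop body becomes a function that
// re-schedules itself, and the result is only available in a callback.
function sumAsync(file, done) {
  let total = 0;
  function step() {
    if (file.eof()) { done(total); return; }
    file.readAsync(100, (chunk) => {
      for (const b of chunk) total += b;
      step();
    });
  }
  step();
}
```

Every caller of sumAsync has to be rewritten in the same way, which is
where the cost for a large compiled codebase comes from.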

 Now, we could say that sync code, mainly IO, will run slowly but that is ok 
 because it's mainly just done during startup. That's true to some extent, but 
 startup is very important too, in BananaBread we already have 5-10 seconds or 
 so to load the entire game engine, a lot of which is file IO and processing, 
 and we have gotten requests to improve that as much as possible because it is 
 very significant for the user's initial impression.

 Best,
   Alon Zakai




Re: Sync API for workers

2012-09-04 Thread David Bruant
Alon Zakai wrote:
  Technically it might
  be possible to automatically rewrite such code to use asynchronous
  coding patterns, but so far I don't think anyone has managed to do
  that.
 Naively, I would say, that once we've paid the price to compile code
 from one language to another, you're not that far off from compiling
 to
 a given coding pattern. Especially compiling from LLVM bytecode.
 Regardless, has anyone tried at all? Has anyone tried to compile to
 JS+Promises or JS+task.js?
 Consider even the simple and very common case of

 ..
 while (!feof(f)) {
   fread(buf, 100, 1, f);
   ..process data in buf..
 }
 ..

 I don't have any good ideas for something asynchronous to compile this into 
 that does not currently substantially harm performance. Compiling 
 synchronous code into continuation passing style, generators, control flow 
 emulation, or some other async style would greatly reduce performance.
I can imagine, it sounds hard indeed. Do you have numbers on how it
affects performance? Or an intuition on these numbers? I don't need to
be convinced that it affects performance significantly, but just to get
an idea.

I remember that at some point (your JSConf.eu talk last October), in
order to be able to compile through Emscripten, the source codebase (in
C/C++) had to be manually tweaked sometimes. Is it still the case? If
it's an acceptable thing to ask to authors, then would there be easy
ways for authors to make their IO blocking code more easily translated
to async JS code? I'm pessimistic, but it seems like an interesting
question to explore.

David



Re: Sync API for workers

2012-09-04 Thread Brendan Eich

David Bruant wrote:

I can imagine, it sounds hard indeed. Do you have numbers on how it
affects performance? Or an intuition on these numbers? I don't need to
be convinced that it affects performance significantly, but just to get
an idea.


This is not going to be easy to estimate, but you might benchmark 
generator vs. non-generator code in the latest SpiderMonkey.


I don't think we need quantification, though. Alon's right, the 
optimizing VMs are not focused on uncommon code other than what's in the 
dopey industry-standard benchmarks.
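For anyone who does want a rough number, a micro-benchmark along these
lines could be a starting point. This is only a sketch: it uses the
function* syntax that was standardized later rather than the legacy
generators SpiderMonkey shipped in 2012, the helper names are made up
here, and timings from a loop this simple say little about real compiled
code.

```javascript
// Sum 0..n-1 with an ordinary loop.
function sumPlain(n) {
  let total = 0;
  for (let i = 0; i < n; i++) total += i;
  return total;
}

// The same values produced through a generator.
function* range(n) {
  for (let i = 0; i < n; i++) yield i;
}
function sumGenerator(n) {
  let total = 0;
  for (const i of range(n)) total += i;
  return total;
}

// Time one call; returns elapsed milliseconds and the computed result.
function time(fn, n) {
  const start = Date.now();
  const result = fn(n);
  return { ms: Date.now() - start, result };
}

const N = 1e6;
console.log("plain loop:", time(sumPlain, N));
console.log("generator: ", time(sumGenerator, N));
```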



I remember that at some point (your JSConf.eu talk last October), in
order to be able to compile through Emscripten, the source codebase (in
C/C++) had to be manually tweaked sometimes. Is it still the case? If
it's an acceptable thing to ask to authors, then would there be easy
ways for authors to make their IO blocking code more easily translated
to async JS code? I'm pessimistic, but it seems like an interesting
question to explore.


BananaBread required zero Cube 2 changes, IIRC. Other Emscripten 
examples are also pure compilation.


Forget it. Inversion of control flow is hard and error-prone enough that 
developers won't do it. It's the #1 reason Mozilla's Electrolysis 
project is paused indefinitely. The SuperSnappy work (threads, not 
processes) preserves most execution model compatibility, and avoids 
requiring programmers writing Firefox XUL front-end and add-on code to 
manually callback-CPS their code (on every DOM access!).


/be



Re: Sync API for workers

2012-09-04 Thread Glenn Maynard
On Tue, Sep 4, 2012 at 3:11 PM, Ian Hickson i...@hixie.ch wrote:

 On Mon, 3 Sep 2012, Glenn Maynard wrote:
 
  - Add an internal flag to MessagePort, blocking permitted, which is
  initially set.
  - When a MessagePort port is transferred from source to dest,
  - If source is an ancestor of dest, the blocking permitted flag of
  port is cleared.  (This is a down transfer.)

 You basically can't do this, because by the time you've received the
 message saying that the port is in a permitted scope, the other side of
 the port could have been shunted three times and now be who knows where.


There's no message saying that it's permitted; the only possible message is
that it's *no longer* permitted.  Once that flag is cleared, it's
cleared permanently.  Also see my later message to Jonas, which
reformulates this a bit to put responsibility for triggering the "clear the
flag" behavior on the receiver, rather than the sender of the port, since
while sending is asynchronous (you don't know where a message is going when
you send it), receiving is not (you know where a message is going when you
receive it--to you!--and you can know where it originally came from, since
the sender can tuck that information in the message).

-- 
Glenn Maynard


Re: Sync API for workers

2012-09-04 Thread Glenn Maynard
On Mon, Sep 3, 2012 at 10:55 PM, Glenn Maynard gl...@zewt.org wrote:

 I suspect there's a way to make the general-case version work, though.


To restate this by itself instead of as a delta:

- Add an internal flag to MessagePort, "blocking permitted", which is
initially set.  Add a value "previous owner", which is initially null.
- During postMessage, when a MessagePort port is to be transferred, set
the MessagePort's "previous owner" to the current thread.
- When a MessagePort is transferred to the current thread, the current
thread must compare itself to the "previous owner" value of the received
MessagePort:
  - If the current thread is a descendant of "previous owner", the "blocking
permitted" flag of port must be cleared.  (This is a "down" transfer.)
  - Otherwise, if "previous owner" is a descendant of the current thread, a
"clear blocking permitted" message must be sent over port.  (This is an
"up" transfer.)
  - Otherwise, if "previous owner" is the current thread, do nothing.
  - Otherwise, the "blocking permitted" flag of port must be cleared and
a "clear blocking permitted" message must be sent over port.
- When a "clear blocking permitted" message is received on a port's
message queue, it must be discarded and the "blocking permitted" flag of
the port must be cleared.
- When the "blocking permitted" flag of any MessagePort is cleared, any
getMessage calls blocking on that port throw an exception.
- Calling getMessage on a port (with a nonzero timeout) whose "blocking
permitted" flag is cleared throws the same exception.
- Additionally, calling getMessage on a port (with a nonzero timeout) when
neither it nor its entangled port has ever been transferred to another
thread throws an exception.  (Blocking for data when the current thread
holds both sides of the port guarantees a deadlock.)

The "compare itself to" step must be allowed to happen asynchronously, as
soon as the port appears in a message queue held by the current thread (and
before the message containing the port is taken from the queue for delivery
to onmessage or getMessage).  That ensures that a long-running script
doesn't prevent the "clear blocking permitted" message from being sent.
That's a bit annoying, but it seems doable, since it has no script-visible
effects in the thread it's happening in.  (This isn't necessary for the
"clear blocking permitted" message; that can simply be processed in port
message queue order.)
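The receiver-side rules above can be modelled in a few lines of
JavaScript. This is a sketch under assumptions, not a proposed
implementation: Thread, Port, createChannel, sendPort and receivePort are
hypothetical names, the "clear blocking permitted" message is modelled as
an immediate write to the entangled port instead of a queued message, and
sendPort is assumed to run before receivePort.

```javascript
// Hypothetical model of a thread in a dedicated-worker tree.
class Thread {
  constructor(name, parent = null) { this.name = name; this.parent = parent; }
  // True if this thread is a strict descendant of `other`.
  isDescendantOf(other) {
    for (let t = this.parent; t !== null; t = t.parent) {
      if (t === other) return true;
    }
    return false;
  }
}

// Hypothetical model of one end of a message channel.
class Port {
  constructor() {
    this.blockingPermitted = true; // initially set
    this.previousOwner = null;     // initially null
    this.entangled = null;
  }
}

function createChannel() {
  const a = new Port(), b = new Port();
  a.entangled = b; b.entangled = a;
  return [a, b];
}

// Sender side: record who is giving the port away.
function sendPort(port, sender) { port.previousOwner = sender; }

// Receiver side: the "compare itself to previous owner" step.
function receivePort(port, receiver) {
  const prev = port.previousOwner;
  if (receiver.isDescendantOf(prev)) {
    port.blockingPermitted = false;           // "down" transfer
  } else if (prev.isDescendantOf(receiver)) {
    port.entangled.blockingPermitted = false; // "up" transfer
  } else if (prev !== receiver) {
    port.blockingPermitted = false;           // unrelated threads:
    port.entangled.blockingPermitted = false; // clear both ends
  }
}

// "Up" transfer: a worker creates a channel and hands one end to its parent.
const main = new Thread("main");
const worker = new Thread("worker", main);
const [kept, sent] = createChannel();
sendPort(sent, worker);
receivePort(sent, main);
console.log(sent.blockingPermitted, kept.blockingPermitted); // true false
```

After the "up" transfer the parent's end can still block while the end
left behind in the worker cannot, which matches the "block downwards"
reading of the rules.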

-- 
Glenn Maynard


Re: Sync API for workers

2012-09-04 Thread Alon Zakai
On Tue, Sep 4, 2012 at 1:59 PM, Brendan Eich bren...@mozilla.org wrote:
 David Bruant wrote:

 I can imagine, it sounds hard indeed. Do you have numbers on how it
 affects performance? Or an intuition on these numbers? I don't need to
 be convinced that it affects performance significantly, but just to get
 an idea.


 This is not going to be easy to estimate, but you might benchmark generator
 vs. non-generator code in the latest SpiderMonkey.

 I don't think we need quantification, though. Alon's right, the optimizing
 VMs are not focused on uncommon code other than what's in the dopey
 industry-standard benchmarks.


Yes, last I checked generator code is not even JITed. This kind of
problem isn't on the radar of JS engine people - for understandable
reasons, of course.


 I remember that at some point (your JSConf.eu talk last October), in
 order to be able to compile through Emscripten, the source codebase (in
 C/C++) had to be manually tweaked sometimes. Is it still the case? If
 it's an acceptable thing to ask to authors, then would there be easy
 ways for authors to make their IO blocking code more easily translated
 to async JS code? I'm pessimistic, but it seems like an interesting
 question to explore.


 BananaBread required zero Cube 2 changes, IIRC. Other Emscripten examples
 are also pure compilation.

Yes, we aim at 0 code changes when porting. This is usually the case,
although sometimes something absolutely must be changed. In
BananaBread we changed a few dozen lines of code out of 120,000 for
example.

- azakai




Re: Sync API for workers

2012-09-04 Thread Lon Ingram
On Sat, Sep 1, 2012 at 11:49 AM, David Bruant bruan...@gmail.com wrote:
 Also, I don't think I have seen mentioned use cases of things that are
 not possible without a Sync API. Everything presented is already
 possible (sometimes at arguably high costs like Glenn Maynard's use case
 in discussion [1]).

I have a use case: at USENIX I presented Treehouse [1, 2], a system that
sandboxes (mostly) unmodified JavaScript by running it in a worker with a
virtual DOM and browser API. Treehouse presents guest code a synchronous
interface to the virtual DOM, and then asynchronously updates the real DOM in
the parent page.

This works surprisingly well, but has some limitations (S3.4 of [1]). First,
Treehouse cannot virtualize synchronous API calls such as window.prompt.
Second, sharing resources like cookies and DOM nodes between workers is
difficult. We punted on this. For example, we require that a given DOM node in
the parent page appear in the virtual DOM of at most one worker.

I can't weigh in on the implementation debate, but I can say that a blocking
message API would make Treehouse more powerful and simplify its
implementation.

Whether this is a *real world* use case is another matter entirely...

[1] paper: 
https://www.usenix.org/system/files/conference/atc12/atc12-final159.pdf
[2] video, audio:
https://www.usenix.org/conference/usenixfederatedconferencesweek/treehouse-javascript-sandboxes-help-web-developers-help

--
Lon Ingram
@lawnsea




Re: Sync API for workers

2012-09-03 Thread Jonas Sicking
Hi All,

I'd like to start by clearing up some confusion here. That's why I'm
responding to the first email in this thread.

We at mozilla have no interest in creating an API which runs the risk
of causing dead-locks. I would expect this to be true of other browser
vendors too, though obviously I can't speak for them.

This is why all the proposals that we have been discussing have been
to allow *dedicated* workers to block on receiving messages *only* from
their parent. Dedicated workers always create tree-like structures,
and so if children can only block on their parents, you can't end up
with a situation where two actors are blocked on each other.
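The argument can be checked mechanically on a "wait-for" graph. The
sketch below is an illustration only: findDeadlock is a made-up helper,
and each actor is assumed to block on at most one other actor at a time.

```javascript
// waitsOn maps each blocked actor to the actor it is waiting for.
// Returns true if following those edits of "who waits on whom" ever loops.
function findDeadlock(waitsOn) {
  for (const start of waitsOn.keys()) {
    const seen = new Set([start]);
    let current = waitsOn.get(start);
    while (current !== undefined) {
      if (seen.has(current)) return true; // walked back onto our own path
      seen.add(current);
      current = waitsOn.get(current);
    }
  }
  return false;
}

// Dedicated workers: children block only on their parents, so every chain
// of waits ends at the root of the tree.
const tree = new Map([["grandchild", "child"], ["child", "main"]]);
console.log(findDeadlock(tree)); // false

// Generic ports: nothing stops two actors from blocking on each other.
const generic = new Map([["A", "B"], ["B", "A"]]);
console.log(findDeadlock(generic)); // true
```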

It seems hard to ensure that deadlocks can't happen if we try to allow
blocking calls on generic MessagePorts, this is why we haven't been
interested in doing that. I'm not saying it's impossible, but if
someone wants to propose this, please keep in mind that we're not
interested in proposals which allow deadlocks, so you'll need to prove
that your proposal can't cause deadlocks.

It's been mentioned in this thread, and elsewhere, that we need to
take caution to not allow deadlocks to happen. That's exactly what we
are doing by only allowing blocking calls from dedicated workers to
their parents.

The other thing that I wanted to talk about is use cases. It has been
claimed in this thread that synchronous message passing isn't needed
and that people can just write code using async patterns. While this
is absolutely true, I would absolutely say that writing asynchronous
code is dramatically more complicated than writing synchronous code.
This is one of the big reasons that we have workers at all. If writing
asynchronous callbacks was almost as easy as writing blocking code,
then we could simply ask people to return to the event loop and
asynchronously continue their computation through a callback. There
are other reasons for workers to exist, but this is one of them.

So yes, it's definitely the case that synchronous blocking code
doesn't allow any new use cases that were impossible before. But it
makes certain code dramatically easier to write, which is of big value
to authors.

There is also another use-case which has been brought up. As the web
platform is becoming more powerful, people have started converting
code written for other platforms to javascript+html. For example the
emscripten[1] and mandreel[2] allow recompiling C++ code to javascript
which is then run in a web browser. Many times such code relies on
APIs or libraries which contain blocking calls. Technically it might
be possible to automatically rewrite such code to use asynchronous
coding patterns, but so far I don't think anyone has managed to do
that.

One of the big use cases I am interested in solving (though I can't
speak for other people at mozilla) is to allow libraries to be written
and imported into workers which expose easy-to-use synchronous APIs,
and whose implementation makes blocking calls to the parent in order
to implement the API. Such a library would of course require part of
the library to also be running in the parent so that it could handle
the incoming messages.

For example you could imagine a library which implements the
synchronous IndexedDB API since browsers so far have not implemented
it. Or a library which implements a DOM which allows a worker to
modify part of a document rendered by the parent window.

So with that in mind, let me express some opinions on the three
proposals Olli mentioned in [3]

The third proposal, i.e. a blocking waitForMessage() function which
returns the next message event is something which has come up several
times in the past. There's certainly a lot of logic to it, however
there's some pretty important problems with it.

Consider the following scenario:

1. Worker starts running a task, say a messagehandler in response to a
websocket message.
2. Main thread sends async messages A and B to the worker. These
messages are added to the worker's message queue.
3. While still inside of the task started in step 1, the worker
decides that it needs to send a synchronous message to the main
thread. So it sends an asynchronous message, X, and starts polling
messages using waitForMessage().
4. It first receives messages A and B, but since they aren't the reply
to the message X sent in step 3, it keeps polling. Messages A and B
end up in the local events array.
5. Message X arrives in the main thread and the main thread performs
the calculation and responds with message X'. X' is added to the
worker's event queue.
6. Main thread sends async message C to the worker. This message is
added to the worker's message queue.
7. The worker keeps polling and now gets message X', so it stops
polling and uses the data in X' as result.
8. The worker keeps running the task and eventually gets to the
while-loop which processes the events array.
9. The first message in the array is A which the worker dispatches,
causing the handler for A to start running.
10. The handler 

Re: Sync API for workers

2012-09-03 Thread Jonas Sicking
On Mon, Sep 3, 2012 at 4:47 PM, Glenn Maynard gl...@zewt.org wrote:
 On Mon, Sep 3, 2012 at 4:32 PM, Jonas Sicking jo...@sicking.cc wrote:

 It seems hard to ensure that deadlocks can't happen if we try to allow
 blocking calls on generic MessagePorts, this is why we haven't been
 interested in doing that. I'm not saying it's impossible, but if
 someone wants to propose this, please keep in mind that we're not
 interested in proposals which allow deadlocks, so you'll need to prove
 that your proposal can't cause deadlocks.


 (See below.)

 Another problem you have is that the A, B and C events aren't run from
 the event loop like normal events. They are instead run from whatever
 callstack existed when someone decided to make synchronous call to the
 parent. This will give web developers exactly the same problem as
 we've had with Gecko code spinning the event loop. When doing
 something like that, you have to be absolutely sure that all code
 which exists up your call stack can deal with all of these messages
 getting dispatched. And all of those messages have to be able to deal
 with being dispatched under the existing callstack.


 I think all of the problems you're describing only happen if there's just
 one channel that you can post messages to, eg. if you can't block on
 MessagePort but only the global port.  I think we can find a solution for
 the MessagePort problem.  Once you can block on specific MessagePorts, you
 no longer have the confusion of getMessage() returning messages meant for
 other APIs.  (After all, isn't that what MessageChannels are for?)

 Conceptually, I think this is possible.  You should only be able to perform
 a blocking getMessage if the other side of the port is in a dedicated worker
 who is a descendant of the current thread.  Here's an attempt:

 - Add an internal flag to MessagePort, blocking permitted, which is
 initially set.
 - When a MessagePort port is transferred from source to dest,
 - If source is an ancestor of dest, the blocking permitted flag of
 port is cleared.  (This is a down transfer.)
 - Otherwise, if source is a descendent of dest, the blocking permitted
 flag of port's entangled port is cleared.  (This is an up transfer.)
 - Otherwise, if source == dest, do nothing.
 - Otherwise, the blocking permitted flag of both port and its
 entangled port are cleared.  (For example, a port was transferred to a
 shared worker.)
 - When the blocking permitted flag of any MessagePort is cleared, any
 getMessage calls blocking on that port throw an exception.
 - Calling getMessage on a port (with a nonzero timeout) whose blocking
 permitted flag is cleared throws the same exception.
 - Additionally, calling getMessage on a port (with a nonzero timeout) when
 neither it nor its entangled port has ever been transferred to another
 thread throws an exception.  (Blocking for data when the current thread
 holds both sides of the port guarantees a deadlock.)

 In other words, if a port is transferred up the thread tree, then it's
 allowed to block downwards, but any port that's ever been transferred down
 can not.  If you transfer a port down and then back up, then neither side
 can ever block on the port (the flag has been cleared on both sides).  (The
 clear the entangled port's flag would presumably actually mean sending a
 control message over the pipe, telling the other side to clear the flag.)

 This works for dedicated workers, where the ancestor/descendant concepts
 make sense.  This wouldn't work for shared workers, which would never be
 able to block.  (That's hard, since shared workers create cycles.  I don't
 think any current proposal can support shared workers while also disallowing
 deadlocks.)

 Now, this approach can go one of two ways: we can either allow blocking up
 the tree or down the tree, but we'd have to pick one or the other.  I'm
 inclined to recommend blocking *down* the tree, since that allows use cases
 like the ones you mentioned, eg. starting a thread to do IndexedDB calls,
 which you (the parent) can then block on.

We can't generically block on children since we can't let the main
window block on a child. That would effectively permit synchronous IO
from the main thread which is not something that we want to allow.

So if we're only choosing one direction (which is definitely the
simpler thing to do), then it has to be that you can only block up
the tree.

Also, the last Otherwise, the blocking permitted flag of both
port and its entangled port are cleared. has to apply any time when
sending a port through a generic port rather than through a dedicated
worker parent/child? When communicating with a generic port we never
have any idea what is on the receiving end. And what is on the
receiving end can change between the time when a message is sent, and
when it is received.

 1.1 is nifty in that it allows us to use events while dealing with
 replies from multiple handlers. But it seems like it adds a feature
 that doesn't 

Re: Sync API for workers

2012-09-03 Thread Glenn Maynard
On Mon, Sep 3, 2012 at 9:30 PM, Jonas Sicking jo...@sicking.cc wrote:

 We can't generically block on children since we can't let the main
 window block on a child. That would effectively permit synchronous IO
 from the main thread which is not something that we want to allow.


The UI thread would never be allowed to block, of course.  The getMessage
API itself would never even be exposed in the UI thread, regardless of the
state of this flag.  I picture this being methods on
DedicatedWorkerGlobalScope and SharedWorkerGlobalScope.
(SharedWorkerGlobalScope's version would only get the zero-timeout polling
version.)

Also, the last Otherwise, the blocking permitted flag of both
 port and its entangled port are cleared. has to apply any time when
 sending a port through a generic port rather than through a dedicated
 worker parent/child? When communicating with a generic port we never
 have any idea what is on the receiving end. And what is on the
 receiving end can change between the time when a message is sent, and
 when it is received.


Well, you can simplify the algorithm by always clearing the flag when
posting through anything but a dedicated worker's port.  That's
straightforward since you always know at send time what the relationship to
the receiver is--if it's through worker.postMessage it's to a child, and if
it's through postMessage on DedicatedWorkerGlobalScope it's to the parent.
It would be nice to do this in the more generic way, but this would be
enough for a lot of cases.

I suspect there's a way to make the general-case version work, though.  For
example, when a port is transferred to another thread, include the thread
ID sending the port, as part of the metadata of the transfer.  The receiver
then knows where the port came from, and it knows itself, so it can see
where it lies relative to the sender to determine whether it was an up,
down or a transfer that always invalidates both sides (sent to a worker
that is neither an ancestor nor a descendant).  If it determines that the
transfer invalidated the other side, then it sends a message across the
pipe saying whoever you are, you need to clear your blocking-permitted
flag.  This would apply even if the other side changes hands in the
meantime, since once that flag is cleared, it's cleared permanently.

All that aside, do any implementations actually put dedicated workers in a
different process than their creator (if so, I'm curious as to why)?  This
should all be very simple if you don't do that, and I can't think of any
reason to--and good reasons not to, eg. fast ArrayBuffer transfers become
much harder.  Shared workers may cross processes, but dedicated workers?

Your proposal makes it possible for pages to avoid the problems
 described in my email by setting up a separate channel used for
 synchronous messages. But some of the problems still remain. As soon
 as a message channel is used for both synchronous and asynchronous
 messages you can easily get into trouble. If someone calls the
 blocking waitForMessage() function and receive a message which was
 intended to be delivered asynchronously there is no good recourse.
 Basically any time that happens there are only bad options available,
 many of which have subtle problems that only happen intermittently
 like the ones I described in my initial email.

 Since that is the case, I think the best solution is to always force
 separate channels to be used for synchronous and asynchronous
 messages.


If you have messages that must be received synchronously, and other
messages that must be received asynchronously, then that's precisely a time
when you'd want to use MessagePorts to separate them.  That's what they're
for.  It's the same as using separate MessagePorts when you have two
unrelated libraries receiving their own messages, so each library only sees
messages intended for it.

I agree that APIs that encourage people to write brittle code should be
squinted at carefully, and we should definitely examine all APIs for that
problem, but really I don't think it's the case here.

It seems much simpler to me to have only one kind of MessagePort, each
representing only one message channel.  Importantly, the sending side
doesn't have to know whether the receiving side is using a sync API to
receive it or not--in other words, that information doesn't have to be part
of the user's messaging protocol.  As a simple example, you can have a
worker thread whose protocol is simply:

- Send a message to the worker's port with a word.
- The worker sends a message to its parent with the word's definition.
(The mechanism of this lookup is black-boxed--it might be IndexedDB, or a
network request, or a complex combination.)

This means that the caller can use this worker's API synchronously or
asynchronously, without needing to define two interfaces and without the
child knowing the difference.  You can use it synchronously (if you're in a
worker yourself):

var worker = createDictionaryWorker();

Re: Sync API for workers

2012-09-02 Thread Andrew Wilson
On Sat, Sep 1, 2012 at 2:32 PM, Glenn Maynard gl...@zewt.org wrote:

 On Sat, Sep 1, 2012 at 3:19 PM, Rick Waldron waldron.r...@gmail.com wrote:

 I can seriously dispute this, as someone who is involved in research and
 development of JavaScript programming for hardware. Processing high volume
 serialport IO is relatively simple with streams and data events. It's just
 a matter of thinking differently about the program.


 We'll have to professionally disagree, then.  I've used both models
 extensively, and for some tasks, such as complex algorithms, I find linear
 code much easier to write, understand and debug.


 On Sat, Sep 1, 2012 at 3:28 PM, Olli Pettay olli.pet...@helsinki.fi wrote:

 Proposal 3
 Worker:
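 postMessage("I want reply to this");
 var events = [];
 var m;
 // waitForMessage() is the proposed blocking call; expectedReply is a
 // placeholder for whatever identifies the reply.
 while ((m = waitForMessage())) {
   if (m.data != expectedReply) {
     events.push(m);  // unrelated message: keep it for later
   } else {
     // do something with message
     break;           // stop polling once the reply has arrived
   }
 }
 while (events.length) {
   dispatchEvent(events.shift());
 }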


 The intent is much simpler:

 postMessage(foo);
 var response = getMessage();

 You're correct that this wouldn't integrate all that well with libraries,
 since you may end up receiving an unrelated message.  (Trying to deal with
 that is where the bulk of the above comes from--and you probably really
 wouldn't want to dispatch unknown events, since that's effectively making
 *all* events async.)  That's why I originally proposed this as a method on
 MessagePort.  You'd create a dedicated MessagePort to block on, so you
 wouldn't collide with (and possibly be confused by, or discard by accident)
 unrelated messages.
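The difference between one shared queue and a dedicated port can be
sketched as follows. ModelPort is a hypothetical stand-in with its own
FIFO queue, and its getMessage() never actually blocks because the reply
is already queued; the point is only which messages the caller has to
step over.

```javascript
// Hypothetical model: each port has its own FIFO queue of messages.
class ModelPort {
  constructor() { this.queue = []; }
  post(data) { this.queue.push(data); }
  // Stand-in for a blocking getMessage(); here it just takes the next item.
  getMessage() { return this.queue.shift(); }
}

// One shared port: the reply is stuck behind unrelated messages A and B.
const shared = new ModelPort();
["A", "B", "reply-to-X", "C"].forEach((m) => shared.post(m));
const skipped = [];
let reply;
while ((reply = shared.getMessage()) !== "reply-to-X") skipped.push(reply);
// skipped now holds "A" and "B", which the caller must deal with somehow.

// A dedicated port for the synchronous exchange: nothing to skip.
const asyncPort = new ModelPort();
const syncPort = new ModelPort();
["A", "B", "C"].forEach((m) => asyncPort.post(m));
syncPort.post("reply-to-X");
const direct = syncPort.getMessage();
```

With the dedicated port, A, B and C stay queued on asyncPort and are
delivered through the normal event loop in their original order.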



Just wanted to point out that all of the arguments for a wait-for-reply API
in workers also apply to SharedWorkers. It's trickier for SharedWorkers
since they use MessagePorts, and we probably don't want to expose this kind
of API to pages (which also use MessagePorts). But I would strongly prefer
a solution that would be applicable to all kinds of workers, not just
dedicated workers.

FWIW, I find the getMessage(timeout) API (sol'n 3 in the thread) preferable
to adding new send sync message/reply + related events to the API.
Perhaps exposing this on both Worker and on MessagePort (with appropriate
errors if getMessage(timeout) is called from page scope with timeout != 0)
would be acceptable? I'm not entirely certain what the semantics of
getMessage() are, though - if you grab a message via getMessage(), does
this imply that normal onmessage event handlers are not run (or perhaps are
run after we re-enter the event loop)?

I am not optimistic that we can do deadlock prevention in the general case
with MessagePorts, for the same reason that it's prohibitively difficult to
reliably garbage collect MessagePorts when they can be passed between
processes.



 Of course, that means that while you're waiting for messages on the port,
 no other messages are being received, since you aren't in the event loop.
 That's just inherent--nothing else that happens during the event loop would
 take place either, like async XHR.

 I'm not sure how to do the implicit deadlock prevention if this is
 exposed on MessagePort, though.  (I personally don't find it a problem, as
 I said earlier, but I know that a proposal that has that as an option will
 have a much better chance of being implemented than one that doesn't.)

 - The message must be read in order to reply


 I'm not quite following here.  You could getMessage at any time, and the
 other thread could postMessage at any time--there doesn't have to be a
 hey, send me a message message in the first place at all.  For example, a
 site may have a button that sends a message when clicked.  You don't have
 to jump hoops in order to wait for the query message to reply to, as it
 seems you'd have to with the reply proposals.

 --
 Glenn Maynard




Re: Sync API for workers

2012-09-02 Thread Glenn Maynard
On Sun, Sep 2, 2012 at 12:24 PM, Andrew Wilson atwil...@google.com wrote:

 Just wanted to point out that all of the arguments for a wait-for-reply
 API in workers also apply to SharedWorkers. It's trickier for SharedWorkers
 since they use MessagePorts, and we probably don't want to expose this kind
 of API to pages (which also use MessagePorts). But I would strongly prefer
 a solution that would be applicable to all kinds of workers, not just
 dedicated workers.


You can do that by giving MessagePort another interface in workers, eg.
MessagePortSync or MessagePortWorkers, which inherits from MessagePort and
adds eg. getMessage().  It might need a bit of finessing to switch
interfaces during structured clone.

Alternatively--and as I type this I like it better--add a getMessage(port)
method to WorkerGlobalScope.  Simplicity aside, I like that it can
naturally support getMessage([port1, port2, port3], 100).  With
port.getMessage(), there's no way to wait for a message from multiple ports
(think select()/poll()).  This doesn't give any way to specify the worker's
implicit port, though; I guess that could be a special case, eg. pass in
null.


 I'm not entirely certain what the semantics of getMessage() are, though -
 if you grab a message via getMessage(), does this imply that normal
 onmessage event handlers are not run (or perhaps are run after we re-enter
 the event loop)?


It shouldn't still dispatch onmessage asynchronously.  That's confusing,
and also, it means the messages would build up in the queue until the
script returns.  Due to the nature of the feature, the script may not
return for a long time, or it may receive lots of messages before it does.
Also, if you may handle the message during processing (via getMessage), and
also when idle (via onmessage), this means it's hard to ensure you don't
process messages twice.

The two options that have come up are:

1: don't dispatch onmessage at all if getMessage returns a message.
getMessage() consumes the message from the queue.
2: dispatch onmessage synchronously, before returning from getMessage
(which can also return the message or not).

They're mostly equivalent; you can build either on top of the other.
(createEvent isn't actually exposed to WorkerGlobalScope in order to
implement #2 from #1, but that's a separate issue.)

I just noticed a strong argument against #2: it's recursive.  Without any
strong benefit, that seems like a good thing to avoid.
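
To make the two options concrete, here is a minimal sketch. It assumes a hypothetical in-memory SimulatedPort standing in for a worker's message queue; neither that class nor getMessageAndDispatch is a real or proposed API name. It shows option 1 (getMessage consumes the message and onmessage is not called) and one way option 2 could be layered on top of it in script.

```javascript
// Hypothetical stand-in for a worker's message queue; not a real API.
class SimulatedPort {
  constructor() {
    this.queue = [];
    this.onmessage = null;
  }
  // Sender side: enqueue a message.
  postMessage(data) {
    this.queue.push({ data: data });
  }
  // Option 1: consume and return the next message without calling onmessage.
  // Returns null when nothing is waiting (only the non-blocking case is modelled).
  getMessage() {
    return this.queue.length > 0 ? this.queue.shift() : null;
  }
}

// Option 2 built on option 1: call onmessage synchronously, then return the message.
function getMessageAndDispatch(port) {
  const message = port.getMessage();
  if (message !== null && typeof port.onmessage === "function") {
    port.onmessage(message);
  }
  return message;
}

const port = new SimulatedPort();
const seen = [];
port.onmessage = function (event) { seen.push(event.data); };
port.postMessage("first");
port.postMessage("second");

const consumed = port.getMessage();              // option 1: handler is not run
const dispatched = getMessageAndDispatch(port);  // option 2: handler runs once
```

In this sketch each message is handled exactly once in both variants, which is the property the paragraph above is concerned with.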

I am not optimistic that we can do deadlock prevention in the general case
 with MessagePorts, for the same reason that it's prohibitively difficult to
 reliably garbage collect MessagePorts when they can be passed between
 processes.


Would you consider this an implementation-blocking problem?

By the way, another option is to remove the ability to block, so it always
behaves as getMessage(0)--return a waiting message, but don't wait for
one.  That would also make it impossible to deadlock.  Being able to wait
for a message would be a nice plus, but I don't think I've seen any use
cases that really require it.  (If this is done, there's no need to be able
to give multiple ports to getMessage, as I mentioned at the top, since you
can just call getMessage separately for each port.)
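
As a rough illustration of that last point, the sketch below polls several ports without blocking by calling a getMessage stand-in on each in turn. The ports are plain arrays used as queues and pollPorts is a hypothetical helper name; both are assumptions made for the example, not part of any proposal.

```javascript
// Hypothetical stand-in: each "port" is just an array used as a FIFO queue.
function getMessage(port) {
  // Behaves like getMessage(0): return a waiting message or null, never wait.
  return port.length > 0 ? port.shift() : null;
}

// Check the ports in order and return the first waiting message, if any.
function pollPorts(ports) {
  for (let i = 0; i < ports.length; i++) {
    const message = getMessage(ports[i]);
    if (message !== null) {
      return { index: i, message: message };
    }
  }
  return null;
}

const port1 = [];
const port2 = ["hello from port2"];
const port3 = ["hello from port3"];

const first = pollPorts([port1, port2, port3]);
const second = pollPorts([port1, port2, port3]);
const third = pollPorts([port1, port2, port3]);  // nothing left to read
```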

-- 
Glenn Maynard


Re: Sync API for workers

2012-09-02 Thread Ian Hickson
On Sun, 2 Sep 2012, Andrew Wilson wrote:
 
 Just wanted to point out that all of the arguments for a wait-for-reply API
 in workers also apply to SharedWorkers. It's trickier for SharedWorkers
 since they use MessagePorts

Dedicated Workers use MessagePorts too, they're just embedded in the 
WorkerGlobalScope and the API exposed through that interface, so that you 
can't send the port's endpoint around. But it's literally defined in terms 
of a MessagePort.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'



Re: Sync API for workers

2012-09-02 Thread Andrew Wilson
On Sun, Sep 2, 2012 at 12:16 PM, Glenn Maynard gl...@zewt.org wrote:

 On Sun, Sep 2, 2012 at 12:24 PM, Andrew Wilson atwil...@google.com wrote:

 I am not optimistic that we can do deadlock prevention in the general
 case with MessagePorts, for the same reason that it's prohibitively
 difficult to reliably garbage collect MessagePorts when they can be passed
 between processes.


 Would you consider this an implementation-blocking problem?


No. To the contrary, I was (poorly) arguing in favor of not making deadlock
prevention a required part of the spec.


Re: Sync API for workers

2012-09-01 Thread Glenn Maynard
On Sat, Sep 1, 2012 at 11:49 AM, David Bruant bruan...@gmail.com wrote:

 A Sync API for workers is being implemented in Firefox [1].
 I'd like to come back to the discussions mentioned in comment 4 of the
 bug.

 A summary of points I find important and my comments, questions and
 concerns

 # Discussion 1
 ## Glenn Maynard [2] Use case exposed:
 Ability to cancel long-running synchronous worker task
 Terminating the whole worker thread is the blunt way to do it; that's
 no good since it requires starting a new thread for every keystroke, and
 there may be significant startup costs (eg. loading search data).
 = It's a legitimate use case that has no good solution today other than
 cutting the task in smaller tasks between which a cancellation message
 can be interleaved.


The solution proposed in 783190 seems more complex and less useful than the
one Sicking and I discussed.  To summarize that one: add a
getMessage(timeout) method, which consumes and returns the next message
(causing onmessage to not be called[1]).  If timeout is nonzero, wait for a
message for up to that duration; if zero the function never blocks (eg.
peek for a waiting message).  If the timeout expires, returns null.

This turns the first example in 783190 into:

worker.js:
var res = getMessage(timeout);

page.html:
worker = new Worker(...);
setTimeout(function() {
  worker.postMessage(data, transferrable);
}, 1000);

I think this has several advantages.

- Mozilla's proposal effectively creates a separate, parallel messaging
channel on the MessagePort; synchronous vs. asynchronous messages.  This is
simpler: messages are just messages, and no new API is exposed outside of
workers.
- User messaging protocols are much simpler.  For example, take a
long-running processing task in a worker which wants to be able to receive
a "stop what you're doing, I have new information that affects your
processing task" message.  With this proposal, the UI thread (or whatever)
simply sends a message with the new information.  With Mozilla's proposal,
it would have to wait for the thread to periodically send a "do you have
anything to tell me?" message, in order to be able to send a response that
the thread can receive synchronously.
- Polling is much cheaper.  With Mozilla's proposal, you have to send a
message to another thread, then sit and wait until you get a response.  If
it's the UI thread, that may take many milliseconds, since it may be busy
doing other things.  With this proposal, polling for new messages in a
processing loop should never block due to activity in the other thread.
- The resulting message protocols are more robust.  With the
query/response approach, if someone fails to send a response, the worker
will wait forever or time out.
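
The polling pattern described in these points might look roughly like the following. This is only a sketch under stated assumptions: getMessage here is a stand-in backed by a plain array, only the non-blocking getMessage(0) case is modelled, and the "cancel" message type and sumWithCancellation name are invented for the example.

```javascript
// Hypothetical stand-in for the worker's incoming message queue.
const incoming = [];

// Models only the non-blocking case, getMessage(0): return a message or null.
function getMessage(timeout) {
  return incoming.length > 0 ? incoming.shift() : null;
}

// A linear processing loop that checks for a cancellation message on each step.
function sumWithCancellation(values, onProgress) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    const message = getMessage(0);
    if (message !== null && message.type === "cancel") {
      return { cancelled: true, processed: i, total: total };
    }
    total += values[i];
    if (onProgress) {
      onProgress(i);
    }
  }
  return { cancelled: false, processed: values.length, total: total };
}

// Simulate the UI thread posting a cancellation while the third item is processed.
const result = sumWithCancellation([1, 2, 3, 4, 5], function (i) {
  if (i === 2) {
    incoming.push({ type: "cancel" });
  }
});
```

The loop never waits on the other thread; it only looks at what has already arrived, which is the cost argument made above.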


[1] We didn't come to agreement on whether it's better to return the
message or to call onmessage synchronously, but that's a detail; whichever
approach is used, it's possible to implement the other in script.

# Discussion 2
 ## Joshua Bell [5]
 This can be done today using bidirectional postMessage, but of course
 this requires the Worker to then be coded in now common asynchronous
 JavaScript fashion, with either a tangled mess of callbacks or some sort
 of Promises/Futures library, which removes some of the benefits of
 introducing sync APIs to Workers in the first place.
 = What are these benefits?


The benefit of being able to write linear code.  I don't think anyone who's
written complex algorithms in JavaScript can seriously dispute this as
anything but a huge win.

## Glenn Maynard [7]
 I think this is a fundamental missing piece to worker communication.  A
 basic reason for having Workers in the first place is so you can write
 linear code, instead of having to structure code to be able to return
 regularly (often awkward and inconvenient), but currently in order to
 receive messages in workers you still have to do that.
 = A basic reason for having workers is to move computation away from
 window to a concurrent and parallel computation unit so that the UI is
 not blocked by computation. End of story. Nothing to do with writing
 linear code.


That's another good reason; it doesn't in any way reduce the importance of
being able to write linear code, which *is* an important use case of
workers.  It's precisely why we have APIs like FileReaderSync.

If JavaScript as it is doesn't allow people to write code
 as they wish, once again, it's a language issue. Either ask a change in
 the language or create a language that looks the way you want and
 compiles down to JavaScript.


This has nothing to do with JavaScript/ECMAScript as a language.  The
ugliness of having to implement algorithms in an event-based way is caused
by the way the Web uses the language, not the language itself.

I wish to add that adding a sync API (even if the sync aspect is
 asymmetrical as proposed in [1]) breaks the event-loop run-to-completion
 model of in-browser-JavaScript which is intended to be formalized at
 [concurr]. 

Re: Sync API for workers

2012-09-01 Thread Olli Pettay

On 09/01/2012 11:19 PM, Rick Waldron wrote:


David,

Thanks for preparing this summary—I just wanted to note that I still stand 
behind my original, reality based arguments.

One comment inline..

On Saturday, September 1, 2012 at 12:49 PM, David Bruant wrote:


Hi,

A Sync API for workers is being implemented in Firefox [1].
I'd like to come back to the discussions mentioned in comment 4 of the bug.

The original post actually describes an async API—putting the word sync in the middle 
of a method or event name doesn't make it sync.

As the proposed API developed, it still retains the event handler-esque 
design (https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c12). All of the
terminology being used is async:
- event
- callback
- onfoo

Even Olli's proposal example is async. 
https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c9 (setTimeout)

If the argument is callback hell, save it—because if that's the problem with 
your program, then you're doing it wrong (see: node.js ecosystem).


If this API introduces any renderer process blocking, the result will be 
catastrophic in the hands of inexperienced web developers.



I haven't seen any proposal which would block rendering/main/dom thread


We've been thinking about the following approaches:

Proposal 1
Parent Thread:
var w = new Worker('foo.js');
w.onsyncmessage = function(event) {
  event.reply('bar');
}
Worker:
var r = postSyncMessage('foobar', null, 1000 /* timeout */);
if (r == 'bar') ..
PRO:
- It's already implemented :)
CON:
- Multiple event listeners - Multiple reply() calls. How to deal with it?
- Multiple event listeners - is this your message?
- Wrong order of the messages in worker if parent sends async message just 
before receiving sync message
- The message must be read in order to reply


Proposal 1.1
Parent Thread:
var w = new Worker('foo.js');
w.onsyncmessage = function(event) {
  var r = new Reply(event);
  r.reply('bar');  // Can be called after event dispatch.
}
Worker:
var replies = postSyncMessage('foobar', null, 1000 /* timeout */);
for (var i = 0; i < replies.length; i++) {
  handleEachReply(replies[i]);  // for...in would iterate indices, not replies
}
PRO:
- Can handle multiple replies.
- No awkward limitations on main thread because of reply handling
CON:
- A bit ugly.
- Reply on the worker thread becomes an array - unintuitive
- Wrong order of the messages in worker if parent sends async message just 
before receiving sync message
- The Reply object must be created during event dispatch.


Proposal 2
Parent Thread:
var w = new Worker('foo.js');
w.setSyncHandler('typeFoobar', function(message) {
  return 'bar';
});
Worker:
var r = postSyncMessage('typeFoobar', 'foobar', null, 1000 /* timeout */);
if (r == 'bar') ..
PRO:
- no multiple replies are possible
- types for sync messages
CON:
- Just a single listener
- It's not based on events - it's something different compared with any other
worker/parent communication.
- Wrong order of the messages in worker if parent sends async message just 
before receiving sync message
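
A minimal sketch of the "single listener per type" behaviour of Proposal 2, assuming a hypothetical SimulatedWorkerHandle class that runs entirely in one thread. The deliverSync method stands in for what the worker-side postSyncMessage call would trigger on the parent; it is an invented name, and the real proposal would of course cross a thread boundary.

```javascript
// Hypothetical single-threaded model of the parent-side handler registry.
class SimulatedWorkerHandle {
  constructor() {
    this.syncHandlers = new Map();
  }
  // One handler per type: registering again replaces the previous handler.
  setSyncHandler(type, handler) {
    this.syncHandlers.set(type, handler);
  }
  // Stand-in for the parent receiving a sync message of the given type.
  // The handler's return value is the reply; null models "no handler".
  deliverSync(type, message) {
    const handler = this.syncHandlers.get(type);
    return handler ? handler(message) : null;
  }
}

const w = new SimulatedWorkerHandle();
w.setSyncHandler("typeFoobar", function (message) { return "first handler"; });
w.setSyncHandler("typeFoobar", function (message) { return "bar"; });

const reply = w.deliverSync("typeFoobar", "foobar");
const unknown = w.deliverSync("typeOther", "foobar");
```

Because the reply is simply the handler's return value, there is exactly one reply per message, at the cost of allowing only one listener per type.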


Proposal 3
Worker:
postMessage('I want reply to this');
var events = [];
var m;
while ((m = waitForMessage())) {
  if (!isTheReply(m)) {  // isTheReply(): hypothetical check for the awaited reply
    events.push(m);
  } else {
    // do something with message
  }
}
while (events.length) {
  dispatchEvent(events.shift());
}
PRO:
- Flexible
- the order of the events is changed by the developer
- since there isn't any special sync messaging, multiple event listeners don't
  cause problems.
CON:
- complex for web developers(?)
- The message must be read in order to reply
- Means that you can't use libraries that use sync messages. Only frameworks
  are possible as all message handling needs to be aware of the new
  sync messages.
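
A runnable sketch of the Proposal 3 pattern, under stated assumptions: waitForMessage is simulated with a plain array and returns null when nothing is queued, waitForReply is a hypothetical helper name, and unlike the loop above this version stops waiting as soon as the reply arrives. Messages that arrived before the reply are set aside and handed to a dispatch callback afterwards.

```javascript
// Hypothetical stand-in for the worker's incoming queue.
const queue = [
  { data: "unrelated-1" },
  { data: "the reply" },
  { data: "unrelated-2" }
];

// Simulated waitForMessage: returns the next message, or null if none is queued.
function waitForMessage() {
  return queue.length > 0 ? queue.shift() : null;
}

// Wait for the message matching isReply, deferring everything that arrives first.
function waitForReply(isReply, dispatch) {
  const deferred = [];
  let reply = null;
  let m;
  while ((m = waitForMessage()) !== null) {
    if (isReply(m)) {
      reply = m;
      break;  // assumption: stop once the awaited reply has been read
    }
    deferred.push(m);
  }
  // Hand the deferred messages on in their original arrival order.
  while (deferred.length > 0) {
    dispatch(deferred.shift());
  }
  return reply;
}

const dispatched = [];
const reply = waitForReply(
  function (m) { return m.data === "the reply"; },
  function (m) { dispatched.push(m.data); }
);
```

Messages queued after the reply stay in the queue for the normal event loop, so only the relative order of the reply and earlier messages changes.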





Atm, I personally prefer the proposal 3.


-Olli





Rick


A summary of points I find important and my comments, questions and concerns

# Discussion 1
## Glenn Maynard [2] Use case exposed:
Ability to cancel long-running synchronous worker task
Terminating the whole worker thread is the blunt way to do it; that's
no good since it requires starting a new thread for every keystroke, and
there may be significant startup costs (eg. loading search data).
= It's a legitimate use case that has no good solution today other than
cutting the task in smaller tasks between which a cancellation message
can be interleaved.


## Tab Atkins [3]
If we were to fix this, it needs to be done at the language level,
because there are language-level issues to be solved that can't be
hacked around by a specialized solution.
= I agree a lot with that point. This is a discussion that should be
had on es-discuss since JavaScript is the underlying language.
ECMAScript per se doesn't define a concurrency model and it's not even
on the table for ES.next, but might be in ES.next.next (7?). See [concurr]

## Jonas Sicking [4]
Ideas of providing control (read-only) over pending messages in workers.
(not part of the current Sync API, but interesting nonetheless)



# Discussion 2
## Joshua Bell [5]
This can be done today using bidirectional 

Re: Sync API for workers

2012-09-01 Thread Rick Waldron



On Saturday, September 1, 2012 at 4:02 PM, Glenn Maynard wrote:

 On Sat, Sep 1, 2012 at 11:49 AM, David Bruant bruan...@gmail.com 
 (mailto:bruan...@gmail.com) wrote:
  A Sync API for workers is being implemented in Firefox [1].
  I'd like to come back to the discussions mentioned in comment 4 of the bug.
  
  A summary of points I find important and my comments, questions and concerns
  
  # Discussion 1
  ## Glenn Maynard [2] Use case exposed:
  Ability to cancel long-running synchronous worker task
  Terminating the whole worker thread is the blunt way to do it; that's
  no good since it requires starting a new thread for every keystroke, and
  there may be significant startup costs (eg. loading search data).
  = It's a legitimate use case that has no good solution today other than
  cutting the task in smaller tasks between which a cancellation message
  can be interleaved.
 
 The solution proposed in 783190 seems more complex and less useful than the 
 one Sicking and I discussed.  To summarize that one: add a 
 getMessage(timeout) method, which consumes and returns the next message 
 (causing onmessage to not be called[1]).  If timeout is nonzero, wait for a 
 message for up to that duration; if zero the function never blocks (eg. peek 
 for a waiting message).  If the timeout expires, returns null.
 
 This turns the first example in 783190 into:
 worker.js:
 var res = getMessage(timeout);
 
 page.html:
 worker = new Worker(...);
 setTimeout(function() {
   worker.postMessage(data, transferrable);
 }, 1000);
 
 I think this has several advantages.
 
 - Mozilla's proposal effectively creates a separate, parallel messaging channel 
 on the MessagePort; synchronous vs. asynchronous messages.  This is simpler: 
 messages are just messages, and no new API is exposed outside of workers.
 - User messaging protocols are much simpler.  For example, take a 
 long-running processing task in a worker which wants to be able to receive a 
 "stop what you're doing, I have new information that affects your processing 
 task" message.  With this proposal, the UI thread (or whatever) simply sends 
 a message with the new information.  With Mozilla's proposal, it would have 
 to wait for the thread to periodically send a "do you have anything to tell 
 me?" message, in order to be able to send a response that the thread can 
 receive synchronously.
 - Polling is much cheaper.  With Mozilla's proposal, you have to send a 
 message to another thread, then sit and wait until you get a response.  If 
 it's the UI thread, that may take many milliseconds, since it may be busy 
 doing other things.  With this proposal, polling for new messages in a 
 processing loop should never block due to activity in the other thread.
 - The resulting message protocols are more robust.  With the query/response 
 approach, if someone fails to send a response, the worker will wait forever 
 or time out.
 
 
 [1] We didn't come to agreement on whether it's better to return the message 
 or to call onmessage synchronously, but that's a detail; whichever approach 
 is used, it's possible to implement the other in script.
 
  # Discussion 2
  ## Joshua Bell [5]
  This can be done today using bidirectional postMessage, but of course
  this requires the Worker to then be coded in now common asynchronous
  JavaScript fashion, with either a tangled mess of callbacks or some sort
  of Promises/Futures library, which removes some of the benefits of
  introducing sync APIs to Workers in the first place.
  = What are these benefits?
 
 The benefit of being able to write linear code.  I don't think anyone who's 
 written complex algorithms in JavaScript can seriously dispute this as 
 anything but a huge win.

I can seriously dispute this, as someone involved in research and 
development of JavaScript programming for hardware. Processing high volume 
serialport IO is relatively simple with streams and data events. It's just a 
matter of thinking differently about the program. 


Rick

 
 
  ## Glenn Maynard [7]
  I think this is a fundamental missing piece to worker communication.  A
  basic reason for having Workers in the first place is so you can write
  linear code, instead of having to structure code to be able to return
  regularly (often awkward and inconvenient), but currently in order to
  receive messages in workers you still have to do that.
  = A basic reason for having workers is to move computation away from
  window to a concurrent and parallel computation unit so that the UI is
  not blocked by computation. End of story. Nothing to do with writing
  linear code.
 
 That's another good reason; it doesn't in any way reduce the importance of 
 being able to write linear code, which *is* an important use case of workers. 
  It's precisely why we have APIs like FileReaderSync.
 
  If JavaScript as it is doesn't allow people to write code
  as they wish, once again, it's a language issue. Either ask a change in
  the language or create a language 

Re: Sync API for workers

2012-09-01 Thread Rick Waldron



On Saturday, September 1, 2012 at 4:28 PM, Olli Pettay wrote:

 On 09/01/2012 11:19 PM, Rick Waldron wrote:
   
  David,
   
  Thanks for preparing this summary—I just wanted to note that I still stand 
  behind my original, reality based arguments.
   
  One comment inline..
   
  On Saturday, September 1, 2012 at 12:49 PM, David Bruant wrote:
   
   Hi,

   A Sync API for workers is being implemented in Firefox [1].
   I'd like to come back to the discussions mentioned in comment 4 of the 
   bug.

   
  The original post actually describes an async API—putting the word sync 
  in the middle of a method or event name doesn't make it sync.
   
  As the proposed API developed, it still retains the event handler-esque 
  design (https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c12). All of the
  terminology being used is async:
  - event
  - callback
  - onfoo
   
  Even Olli's proposal example is async. 
  https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c9 (setTimeout)
   
  If the argument is callback hell, save it—because if that's the problem 
  with your program, then you're doing it wrong (see: node.js ecosystem).
   
   
  If this API introduces any renderer process blocking, the result will be 
  catastrophic in the hands of inexperienced web developers.
  
  
 I haven't seen any proposal which would block rendering/main/dom thread
So far, they all look async. Just calling them sync doesn't make them sync. 
A sync worker API:

// really sync behaviour means blocking until this returns:  
var result = rendererBlockingAPI( data );

Unless you specifically design that to stop everything and wait for a response 
(renderer blocking), JavaScript will run to completion.

Rick
  
  
  
  
 We've been thinking about the following approaches:
  
 Proposal 1
 Parent Thread:
 var w = new Worker('foo.js');
 w.onsyncmessage = function(event) {
 event.reply('bar');
 }
 Worker:
 var r = postSyncMessage('foobar', null, 1000 /* timeout */);
 if (r == 'bar') ..
 PRO:
 - It's already implemented :)
 CON:
 - Multiple event listeners - Multiple reply() calls. How to deal with it?
 - Multiple event listeners - is this your message?
 - Wrong order of the messages in worker if parent sends async message just 
 before receiving sync message
 - The message must be read in order to reply
  
  
 Proposal 1.1
 Parent Thread:
 var w = new Worker('foo.js');
 w.onsyncmessage = function(event) {
 var r = new Reply(event);
 r.reply('bar'); // Can be called after event dispatch.
 }
 Worker:
 var replies = postSyncMessage('foobar', null, 1000 /* timeout */);
 for (var i = 0; i < replies.length; i++) {
 handleEachReply(replies[i]);
 }
 PRO:
 - Can handle multiple replies.
 - No awkward limitations on main thread because of reply handling
 CON:
 - A bit ugly.
 - Reply on the worker thread becomes an array - unintuitive
 - Wrong order of the messages in worker if parent sends async message just 
 before receiving sync message
 - The Reply object must be created during event dispatch.
  
  
 Proposal 2
 Parent Thread:
 var w = new Worker('foo.js');
 w.setSyncHandler('typeFoobar', function(message) {
 return 'bar';
 });
 Worker:
 var r = postSyncMessage('typeFoobar', 'foobar', null, 1000 /* timeout */);
 if (r == 'bar') ..
 PRO:
 - no multiple replies are possible
 - types for sync messages
 CON:
 - Just a single listener
 - It's not based on events - it's something different compared with any other 
 worker/parent communication.
 - Wrong order of the messages in worker if parent sends async message just 
 before receiving sync message
  
  
 Proposal 3
 Worker:
 postMessage('I want reply to this');
 var events = [];
 var m;
 while ((m = waitForMessage())) {
 if (!isTheReply(m)) { // isTheReply(): hypothetical check for the awaited reply
 events.push(m);
 } else {
 // do something with message
 }
 }
 while (events.length) {
 dispatchEvent(events.shift());
 }
 PRO:
 - Flexible
 - the order of the events is changed by the developer
 - since there isn't any special sync messaging, multiple event listeners don't
 cause problems.
 CON:
 - complex for web developers(?)
 - The message must be read in order to reply
 - Means that you can't use libraries that use sync messages. Only frameworks 
 are possible as all message handling needs to be aware of the new  
 syncmessages.
  
  
  
  
 Atm, I personally prefer the proposal 3.
  
  
 -Olli
  
  
   
   
  Rick

   A summary of points I find important and my comments, questions and 
   concerns

   # Discussion 1
   ## Glenn Maynard [2] Use case exposed:
   Ability to cancel long-running synchronous worker task
   Terminating the whole worker thread is the blunt way to do it; that's
   no good since it requires starting a new thread for every keystroke, and
   there may be significant startup costs (eg. loading search data).
   = It's a legitimate use case that has no good solution today other than
   cutting the task in smaller tasks between which a cancellation message
   can be interleaved.


   ## Tab Atkins [3]
   If we were to fix this, it needs 

Re: Sync API for workers

2012-09-01 Thread Oliver Hunt
My reading (from the proposed APIs) is that these are only synchronous from the 
Worker's PoV.  If that's correct I have no real objections to such an API - the 
render thread simply sees a regular message.  It doesn't even need a special 
API on the receiving side.  If the Worker has

...
var response = postSynchronousMessage(message)
...

and the renderer has

worker.onmessage =function() { ; return foo }

There are no UI blocking hazards (other than the usual slow event handler 
problem, which is already present anyway)

The only real problem I can see is along the lines of:

// Worker1
onmessage = function () { ...worker2.postSynchronousMessage(...)... }

// Worker2
onmessage = function () { ...worker1.postSynchronousMessage(...)... }

In the current implementations I imagine that this may cause difficulty, but I 
don't think that there is an actual technical argument against it.
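
The mutual-wait scenario above can be pictured as a cycle in a "who is blocked on whom" graph. The sketch below assumes hypothetical bookkeeping (a waitingOn map and a wouldDeadlock helper, both invented for this example) that is visible in one place; it says nothing about how, or whether, an implementation could maintain such a map when ports are passed between processes.

```javascript
// waitingOn maps a worker name to the worker it is currently blocked on.
// Returns true if `from` blocking on `to` would close a cycle of waiters.
function wouldDeadlock(waitingOn, from, to) {
  const visited = new Set();
  let current = to;
  // Follow the chain of waiters starting at `to`.
  while (current !== undefined && !visited.has(current)) {
    if (current === from) {
      return true;  // the chain leads back to the sender
    }
    visited.add(current);
    current = waitingOn.get(current);
  }
  return false;
}

const waitingOn = new Map();
// worker1 is already blocked in a synchronous send to worker2.
waitingOn.set("worker1", "worker2");

// worker2 now wants to send synchronously to worker1: that closes the cycle.
const cycle = wouldDeadlock(waitingOn, "worker2", "worker1");
// An unrelated worker3 sending to worker1 does not.
const noCycle = wouldDeadlock(waitingOn, "worker3", "worker1");
```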

--Oliver


On Sep 1, 2012, at 1:38 PM, Rick Waldron wrote:

 
 
 On Saturday, September 1, 2012 at 4:28 PM, Olli Pettay wrote:
 
 On 09/01/2012 11:19 PM, Rick Waldron wrote:
 
 David,
 
 Thanks for preparing this summary—I just wanted to note that I still stand 
 behind my original, reality based arguments.
 
 One comment inline..
 
 On Saturday, September 1, 2012 at 12:49 PM, David Bruant wrote:
 
 Hi,
 
 A Sync API for workers is being implemented in Firefox [1].
 I'd like to come back to the discussions mentioned in comment 4 of the 
 bug.
 The original post actually describes an async API—putting the word sync 
 in the middle of a method or event name doesn't make it sync.
 
 As the proposed API developed, it still retains the event handler-esque 
 design (https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c12). All of the
 terminology being used is async:
 - event
 - callback
 - onfoo
 
 Even Olli's proposal example is async. 
 https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c9 (setTimeout)
 
 If the argument is callback hell, save it—because if that's the problem 
 with your program, then you're doing it wrong (see: node.js ecosystem).
 
 
 If this API introduces any renderer process blocking, the result will be 
 catastrophic in the hands of inexperienced web developers.
 
 
 I haven't seen any proposal which would block rendering/main/dom thread
 So far, they all look async. Just calling them sync doesn't make them 
 sync. A sync worker API:
 
 // really sync behaviour means blocking until this returns: 
 var result = rendererBlockingAPI( data );
 
 Unless you specifically design that to stop everything and wait for a 
 response (renderer blocking), JavaScript will run to completion.
 
 Rick
  
  
 
 
 We've been thinking about the following approaches:
 
 Proposal 1
 Parent Thread:
 var w = new Worker('foo.js');
 w.onsyncmessage = function(event) {
 event.reply('bar');
 }
 Worker:
 var r = postSyncMessage('foobar', null, 1000 /* timeout */);
 if (r == 'bar') ..
 PRO:
 - It's already implemented :)
 CON:
 - Multiple event listeners - Multiple reply() calls. How to deal with it?
 - Multiple event listeners - is this your message?
 - Wrong order of the messages in worker if parent sends async message just 
 before receiving sync message
 - The message must be read in order to reply
 
 
 Proposal 1.1
 Parent Thread:
 var w = new Worker('foo.js');
 w.onsyncmessage = function(event) {
 var r = new Reply(event);
 r.reply('bar'); // Can be called after event dispatch.
 }
 Worker:
 var replies = postSyncMessage('foobar', null, 1000 /* timeout */);
 for (var i = 0; i < replies.length; i++) {
 handleEachReply(replies[i]);
 }
 PRO:
 - Can handle multiple replies.
 - No awkward limitations on main thread because of reply handling
 CON:
 - A bit ugly.
 - Reply on the worker thread becomes an array - unintuitive
 - Wrong order of the messages in worker if parent sends async message just 
 before receiving sync message
 - The Reply object must be created during event dispatch.
 
 
 Proposal 2
 Parent Thread:
 var w = new Worker('foo.js');
 w.setSyncHandler('typeFoobar', function(message) {
 return 'bar';
 });
 Worker:
 var r = postSyncMessage('typeFoobar', 'foobar', null, 1000 /* timeout */);
 if (r == 'bar') ..
 PRO:
 - no multiple replies are possible
 - types for sync messages
 CON:
 - Just a single listener
 - It's not based on events - it's something different compared with any other 
 worker/parent communication.
 - Wrong order of the messages in worker if parent sends async message just 
 before receiving sync message
 
 
 Proposal 3
 Worker:
 postMessage('I want reply to this');
 var events = [];
 var m;
 while ((m = waitForMessage())) {
 if (!isTheReply(m)) { // isTheReply(): hypothetical check for the awaited reply
 events.push(m);
 } else {
 // do something with message
 }
 }
 while (events.length) {
 dispatchEvent(events.shift());
 }
 PRO:
 - Flexible
 - the order of the events is changed by the developer
 - since there isn't any special sync messaging, multiple event listeners 
 don't
 cause problems.
 CON:
 - complex for web developers(?)
 - The message must be read in order to reply
 - Means that 

Re: Sync API for workers

2012-09-01 Thread Rick Waldron
On Sat, Sep 1, 2012 at 4:51 PM, Oliver Hunt oli...@apple.com wrote:

 My reading (from the proposed APIs) is that these are only synchronous
 from the Worker's PoV.  If that's correct I have no real objections to such
 an API - the render thread simply sees a regular message.  It doesn't even
 need a special API on the receiving side.  If the Worker has

 ...
 var response = postSynchronousMessage(message)
 ...

 and the renderer has

 worker.onmessage =function() { ; return foo }

 There are no UI blocking hazards (other than the usual slow event handler
 problem, which is already present anyway)

 The only real problem I can see is along the lines of:

 // Worker1
 onmessage = function () { ...worker2.postSynchronousMessage(...)... }

 // Worker2
 onmessage = function () { ...worker1.postSynchronousMessage(...)... }

 In the current implementations I imagine that this may cause difficulty,
 but I don't think that there is an actual technical argument against it.


Perhaps I misread or misinterpreted the goals of the proposal and while I
appreciate the clarification here, I'm still left wondering: what is the
benefit of a synchronous in-worker-only messaging API if the alleged pain
point is the desire to write linear code?

That aside, I agree with Oliver, if no renderer process blocking can occur
and no existing APIs are broken, then there is no harm (aside from
polluting the global object with more verbosely named APIs, but that's just
a nit).

Rick



 --Oliver


 On Sep 1, 2012, at 1:38 PM, Rick Waldron wrote:



 On Saturday, September 1, 2012 at 4:28 PM, Olli Pettay wrote:

 On 09/01/2012 11:19 PM, Rick Waldron wrote:


 David,

 Thanks for preparing this summary—I just wanted to note that I still stand
 behind my original, reality based arguments.

 One comment inline..

 On Saturday, September 1, 2012 at 12:49 PM, David Bruant wrote:

 Hi,

 A Sync API for workers is being implemented in Firefox [1].
 I'd like to come back to the discussions mentioned in comment 4 of the
 bug.

 The original post actually describes an async API—putting the word sync
 in the middle of a method or event name doesn't make it sync.

 As the proposed API developed, it still retains the event handler-esque
 design (https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c12). All of
 the
 terminology being used is async:
 - event
 - callback
 - onfoo

 Even Olli's proposal example is async.
 https://bugzilla.mozilla.org/show_bug.cgi?id=783190#c9 (setTimeout)

 If the argument is callback hell, save it—because if that's the problem
 with your program, then you're doing it wrong (see: node.js ecosystem).


 If this API introduces any renderer process blocking, the result will be
 catastrophic in the hands of inexperienced web developers.



 I haven't seen any proposal which would block rendering/main/dom thread

 So far, they all look async. Just calling them sync doesn't make them
 sync. A sync worker API:

 // really sync behaviour means blocking until this returns:
 var result = rendererBlockingAPI( data );

 Unless you specifically design that to stop everything and wait for a
 response (renderer blocking), JavaScript will run to completion.
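
A minimal sketch of the run-to-completion behaviour described above, using only standard timers. The variable names are illustrative; the point is that a queued callback cannot run until the current synchronous code has finished.

```javascript
const log = [];

// The callback is queued, but it cannot interrupt the code below.
setTimeout(() => log.push("timer callback"), 0);

for (let i = 0; i < 3; i++) {
  log.push("sync step " + i);
}

// Still inside the same run: the timer callback has not fired yet.
const ranEarly = log.includes("timer callback");
console.log(ranEarly); // false
```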

 Rick





 We've been considering the following approaches:

 Proposal 1
 Parent Thread:
 var w = new Worker('foo.js');
 w.onsyncmessage = function(event) {
 event.reply('bar');
 }
 Worker:
 var r = postSyncMessage('foobar', null, 1000 /* timeout */);
 if (r == 'bar') ..
 PRO:
 - It's already implemented :)
 CON:
 - Multiple event listeners - Multiple reply() calls. How to deal with it?
 - Multiple event listeners - is this your message?
 - Wrong order of the messages in worker if parent sends async message just
 before receiving sync message
 - The message must be read in order to reply
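
The "multiple reply() calls" concern in Proposal 1 can be sketched with an in-process model. This is not a Worker API: dispatchSyncMessage is a hypothetical stand-in for the browser dispatching onsyncmessage to every registered listener.

```javascript
// Hypothetical model of Proposal 1's dispatch; no real workers involved.
function dispatchSyncMessage(listeners, data) {
  const replies = [];
  const event = {
    data: data,
    reply: function (value) { replies.push(value); },
  };
  // Every listener sees the same event and is free to call reply().
  for (const listener of listeners) {
    listener(event);
  }
  return replies;
}

const replies = dispatchSyncMessage(
  [
    function (event) { event.reply("bar"); },
    function (event) { event.reply("baz"); },
  ],
  "foobar"
);

console.log(replies.length); // 2
```

One postSyncMessage call ends up with two candidate return values, so the API has to pick one, reject the extras, or return them all.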


 Proposal 1.1
 Parent Thread:
 var w = new Worker('foo.js');
 w.onsyncmessage = function(event) {
 var r = new Reply(event);
 r.reply('bar'); // Can be called after event dispatch.
 }
 Worker:
 var replies = postSyncMessage('foobar', null, 1000 /* timeout */);
 for (var i = 0; i < replies.length; i++) {
 handleEachReply(replies[i]);
 }
 PRO:
 - Can handle multiple replies.
 - No awkward limitations on main thread because of reply handling
 CON:
 - A bit ugly.
 - Reply on the worker thread becomes an array - unintuitive
 - Wrong order of the messages in worker if parent sends async message just
 before receiving sync message
 - The Reply object must be created during event dispatch.


 Proposal 2
 Parent Thread:
 var w = new Worker('foo.js');
 w.setSyncHandler('typeFoobar', function(message) {
  return 'bar';
 });
 Worker:
 var r = postSyncMessage('typeFoobar', 'foobar', null, 1000 /* timeout */);
 if (r == 'bar') ..
 PRO:
 - no multiple replies are possible
 - types for sync messages
 CON:
 - Just a single listener
 - It's not based on events - it's something different compared with any
 other worker/parent communication.
 - Wrong order of the messages in worker if parent sends async message just
 before receiving sync 

Re: Sync API for workers

2012-09-01 Thread Olli Pettay

On 09/01/2012 11:38 PM, Rick Waldron wrote:


So far, they all look async. Just calling them sync doesn't make them sync.


Sure they are sync. They are sync inside the worker. We all know that we must
not introduce new sync APIs in the main thread.





Re: Sync API for workers

2012-09-01 Thread David Bruant
Le 01/09/2012 22:30, Rick Waldron a écrit :
 On Saturday, September 1, 2012 at 4:02 PM, Glenn Maynard wrote:
 On Sat, Sep 1, 2012 at 11:49 AM, David Bruant bruan...@gmail.com wrote:
 # Discussion 2
 ## Joshua Bell [5]
 This can be done today using bidirectional postMessage, but of course
 this requires the Worker to then be coded in now common asynchronous
 JavaScript fashion, with either a tangled mess of callbacks or some sort
 of Promises/Futures library, which removes some of the benefits of
 introducing sync APIs to Workers in the first place.
 = What are these benefits?

 The benefit of being able to write linear code.  I don't think anyone
 who's written complex algorithms in JavaScript can seriously dispute
 this as anything but a huge win.

 I can seriously dispute this, as someone who is involved in research and
 development of JavaScript programming for hardware. Processing high
 volume serialport IO is relatively simple with streams and data
 events. It's just a matter of thinking differently about the program.
I dispute it too. I've been working with Node.js for 8 months and have
written algorithms with IO. I've used promises and I'm very happy with
them when it comes to readability. The code is very close to being linear.
JavaScript's syntax imposes some noise (especially the function keyword,
and the Q library imposes a .then/.fail), but I'm confident a language
that compiles to JS could have sugar to eliminate this issue.
I recommend taking a look at Roy for inspiration on that topic
(start at ~8'00'') http://blip.tv/jsconf/jsconf2012-brian-mckenna-6145371
(it doesn't use promises, but adds sugar to get out of the callback hell)
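
To make the "close to linear" claim concrete, here is a minimal sketch of a promise chain. It assumes native Promise rather than the Q library mentioned above (so .catch stands in for Q's .fail), and readConfig, fetchData and summarize are hypothetical steps standing in for real IO.

```javascript
// Hypothetical async steps; each resolves immediately instead of doing IO.
const readConfig = () => Promise.resolve({ limit: 2 });
const fetchData = (config) =>
  Promise.resolve(["a", "b", "c"].slice(0, config.limit));
const summarize = (items) => items.join(",");

// The steps read top to bottom, one per line, with a single error handler.
const result = readConfig()
  .then(fetchData)
  .then(summarize)
  .catch((error) => "failed: " + error.message);

result.then((summary) => console.log(summary));
```

The remaining noise is the .then wrapper around each step, which is the part a compile-to-JS language could hide.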

David


Re: Sync API for workers

2012-09-01 Thread Rick Waldron
On Sat, Sep 1, 2012 at 5:12 PM, Olli Pettay olli.pet...@helsinki.fi wrote:

 On 09/01/2012 11:38 PM, Rick Waldron wrote:

  So far, they all look async. Just calling them sync doesn't make them
 sync.


 Sure they are sync. They are sync inside the worker. We all know that we must
 not introduce new sync APIs in the main thread.



See my response to Oliver Hunt's message

Rick


Re: Sync API for workers

2012-09-01 Thread Glenn Maynard
On Sat, Sep 1, 2012 at 3:19 PM, Rick Waldron waldron.r...@gmail.com wrote:

 I can seriously dispute this, as someone who is involved in research and
 development of JavaScript programming for hardware. Processing high volume
 serialport IO is relatively simple with streams and data events. It's just
 a matter of thinking differently about the program.


We'll have to professionally disagree, then.  I've used both models
extensively, and for some tasks, such as complex algorithms, I find linear
code much easier to write, understand and debug.


On Sat, Sep 1, 2012 at 3:28 PM, Olli Pettay olli.pet...@helsinki.fi wrote:

 Proposal 3
 Worker:
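 postMessage("I want reply to this");
 var events = [];
 var m;
 // waitForMessage() is the proposed blocking call; isReply() is a
 // placeholder for however the worker recognizes the reply.
 while ((m = waitForMessage())) {
   if (!isReply(m)) {
     events.push(m); // unrelated message: buffer it for later
   } else {
     // do something with message
   }
 }
 // Re-dispatch the buffered messages in arrival order.
 while (events.length) {
   dispatchEvent(events.shift());
 }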


The intent is much simpler:

postMessage(foo);
var response = getMessage();

You're correct that this wouldn't integrate all that well with libraries,
since you may end up receiving an unrelated message.  (Trying to deal with
that is where the bulk of the above comes from--and you probably really
wouldn't want to dispatch unknown events, since that's effectively making
*all* events async.)  That's why I originally proposed this as a method on
MessagePort.  You'd create a dedicated MessagePort to block on, so you
wouldn't collide with (and possibly be confused by, or discard by accident)
unrelated messages.

Of course, that means that while you're waiting for messages on the port,
no other messages are being received, since you aren't in the event loop.
That's just inherent--nothing else that happens during the event loop would
take place either, like async XHR.
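
As a point of comparison, Node.js later shipped a synchronous receive on a dedicated port in its worker_threads module. The sketch below uses it on a single thread only to show the shape of the idea. Note the difference from the getMessage discussed here: receiveMessageOnPort is a non-blocking poll that returns undefined when nothing is queued, whereas the proposal would block until a message arrives.

```javascript
const { MessageChannel, receiveMessageOnPort } = require("worker_threads");

// A dedicated channel: only messages for this exchange ever arrive on it.
const { port1, port2 } = new MessageChannel();

port1.postMessage("foo");

// Synchronous receive, without returning to the event loop.
const first = receiveMessageOnPort(port2);  // { message: "foo" }
const second = receiveMessageOnPort(port2); // undefined, queue is empty

console.log(first.message);

port1.close();
port2.close();
```

Because the port is private to the exchange, the unrelated-message problem from Proposal 3 does not come up.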

I'm not sure how to do the implicit deadlock prevention if this is
exposed on MessagePort, though.  (I personally don't find it a problem, as
I said earlier, but I know that a proposal that has that as an option will
have a much better chance of being implemented than one that doesn't.)

- The message must be read in order to reply


I'm not quite following here.  You could getMessage at any time, and the
other thread could postMessage at any time--there doesn't have to be a
"hey, send me a message" message in the first place at all.  For example, a
site may have a button that sends a message when clicked.  You don't have
to jump through hoops in order to wait for the query message to reply to,
as it seems you'd have to with the reply proposals.

-- 
Glenn Maynard