Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness

2013-08-07 Thread Ian Hickson
On Tue, 6 Aug 2013, Boris Zbarsky wrote:
 On 8/6/13 5:58 PM, Ian Hickson wrote:
  
  Parsing is easy to do on a separate worker, because it has no 
  dependencies -- you can do it all in isolation.
 
 Sadly, that may not be the case.
 
 Actual JS implementations have various thread-local data that objects 
 depend on (starting with interned property names), such that it's not 
 actually possible to create an object on one thread and use it on 
 another in many of them.

Yeah, the final step of parsing a JSON string might require sync access to 
the target thread.


   For instance, how would you serialize something as simple as the 
   following?
   
   {
  name: "The One",
  hp: 1000,
  achievements: ["achiever", "overachiever", "extreme overachiever"]
   // Length of the list is unpredictable
   }
  
  Why serialise it? If you want to post this across a MessagePort to a 
  worker, or back from a worker, why not just post it?
  
  var a = { ... }; // from above
  port.postMessage(a);
 
 This in practice does some sort of serialization in UAs.

Indeed. My question was: why do it manually?


  why not just do this in C++?
 
 Let's start with because writing C++ code without memory errors is 
 harder than writing JS code without memory errors?

  I don't understand why you would constrain yourself to using Web APIs 
  in JavaScript to write a browser.
 
 Simplicity of implementation?  Sandboxing of the code?  Eating your own 
 dogfood?

I guess that's a design choice.

But fundamentally, the needs of programmers writing Web browsers aren't 
valid use cases for adding features to the Web platform. There's no need 
for internal APIs to be interoperable.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness

2013-08-06 Thread Ian Hickson
On Thu, 7 Mar 2013, j...@mailb.org wrote:

 right now JSON.parse blocks the main loop; this gets more and more of an 
 issue as JSON documents get bigger and are also used as a serialization 
 format to communicate with web workers.

I think it would make sense to have a Promise-based API for JSON parsing. 
This probably belongs either in the JS spec or the DOM spec; Anne, Ms2ger, 
and any JS people, is anyone interested in taking this?
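For concreteness, a sketch of what calling such an API might look like -- the 
name JSON.parseAsync is purely illustrative, not a proposal from any spec:

   // Hypothetical Promise-returning parse; today one could approximate it
   // by shipping the string to a worker and resolving on its reply.
   JSON.parseAsync(hugeString).then(function (obj) {
     console.log(obj);                 // parsed result, main thread not blocked
   }, function (err) {
     console.error('invalid JSON', err);
   });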


On Thu, 7 Mar 2013, David Rajchenbach-Teller wrote:
 
 Actually, communicating large JSON objects between threads may cause 
 performance issues. I do not have the means to measure reception speed 
 simply (which would be used to implement asynchronous JSON.parse), but 
 it is easy to measure main thread blocks caused by sending (which would 
 be used to implement asynchronous JSON.stringify).

I don't understand why there'd be any difficulty in sending large objects 
between workers or from a worker to the main thread. It's possible this is 
not well-implemented today, but isn't that just an implementation detail?

One could imagine an implementation strategy where the cloning is done on 
the sending side, or even on a third thread altogether, and just passed 
straight to the receiving side in one go.


On Thu, 7 Mar 2013, Tobie Langel wrote:
 
 Even if an async API for JSON existed, wouldn't the perf bottleneck then 
 simply fall on whatever processing needs to be done afterwards?

That was my initial reaction as well, I must admit.


On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:

 For the moment, the main use case I see for asynchronous 
 serialization of JSON is that of snapshotting the world without stopping 
 it, for backup purposes, e.g.:
 a. saving the state of the current region in an open world RPG;
 b. saving the state of an ongoing physics simulation;
 c. saving the state of the browser itself in case of crash/power loss
 (that's assuming a FirefoxOS-style browser implemented as a web
 application);
 d. backing up state and history of the browser itself to a server
 (again, assuming that the browser is a web application).

Serialising is hard to do async, since you fundamentally have to walk the 
data structure, and the actual serialisation at that point is not 
especially more expensive than a copy.


 The natural course of action would be to do the following:
 1. collect data to a JSON object (possibly a noop);

I'm not sure what you mean by "JSON object". JSON is a string format. Do you 
mean a JS object data structure?

 2. send the object to a worker;
 3. apply some post-treatment to the object (possibly a noop);
 4. write/upload the object.
 
 Having an asynchronous JSON serialization to some Transferable form 
 would considerably simplify the task of implementing step 2 without 
 janking if the data ends up very heavy.

I don't understand what JSON has to do with sending data to a worker. You 
can just send the actual JS object; MessagePorts and postMessage() support 
raw JS objects.
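For instance (a minimal sketch; worker.js stands for any worker script that 
consumes the message):

   var worker = new Worker('worker.js');
   var a = { name: 'The One', hp: 1000,
             achievements: ['achiever', 'overachiever'] };
   worker.postMessage(a);    // structured clone: no manual JSON step
   worker.onmessage = function (e) {
     console.log(e.data);    // whatever the worker posts back
   };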


 So far, I have discussed serializing JSON, not deserializing it, but I 
 believe that the symmetric scenarios also hold.

No, they are quite asymmetric. Serialising requires stalling the code that 
is interacting with the data structure, to guarantee integrity. Parsing is 
easy to do on a separate worker, because it has no dependencies -- you can 
do it all in isolation.


On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:
 
 If I am correct, this means that we need some mechanism to provide 
 efficient serialization of non-Transferable data into something 
 Transferable.

I don't understand what this means. Transferable is about neutering 
objects on one side and creating new versions on the other. It's the 
equivalent of a move. Your use cases were about making copies, as far as 
I can tell (saving and backing up).

As a general rule, JSON has nothing to do with Transferable objects, as 
far as I can tell.


On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:
 
 Intuitively, this sounds like:
 1. collect data to a JSON;
 2. serialize JSON (hopefully asynchronously) to a Transferable (or
 several Transferables).

I really don't understand this. Are you asking for a way to move a JS 
object from one thread to another, killing references to it in the first 
thread? What's the use case? (What would this have to do with JSON?)


On Fri, 8 Mar 2013, David Bruant wrote:

 Why not collect the data in a Transferable like an ArrayBuffer directly? 
 It skips the additional serialization part. Writing a byte stream 
 directly is a bit hardcore I admit, but an object full of setters can 
 give the impression of creating an object while actually filling an 
 ArrayBuffer as a backend. I feel that could work efficiently.

It's not clear to me what the use case is, but if the desire is to move a 
batch of data from one thread to another, then this is certainly one way 
to do it. Another would be to just copy the data in the first place, no 
need to move it -- since you have to pay the cost of reading […]

Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness

2013-08-06 Thread Boris Zbarsky

On 8/6/13 5:58 PM, Ian Hickson wrote:

One could imagine an implementation strategy where the cloning is done on
the sending side, or even on a third thread altogether


The cloning needs to run to completion (in the sense of capturing an 
immutable representation) before anyone can change the data structure 
being cloned.


That means either serializing the whole data structure in some way 
before returning control to JS, or doing something where you start 
serializing it async and block until it finishes as soon as someone tries 
to modify any of those objects in any way, right?


The latter is rather nontrivial to implement, so UAs do the former at 
the moment.



Serialising is hard to do async, since you fundamentally have to walk the
data structure, and the actual serialisation at that point is not
especially more expensive than a copy.


Right, that's what I said above...  ;)


Parsing is easy to do on a separate worker, because it has no dependencies -- 
you can
do it all in isolation.


Sadly, that may not be the case.

Actual JS implementations have various thread-local data that objects 
depend on (starting with interned property names), such that it's not 
actually possible to create an object on one thread and use it on 
another in many of them.



For instance, how would you serialize something as simple as the following?

{
   name: "The One",
   hp: 1000,
   achievements: ["achiever", "overachiever", "extreme overachiever"]
// Length of the list is unpredictable
}


Why serialise it? If you want to post this across a MessagePort to a
worker, or back from a worker, why not just post it?

var a = { ... }; // from above
port.postMessage(a);


This in practice does some sort of serialization in UAs.


Assuming by "Firefox Desktop" you mean the browser for desktop OSes called
Firefox, then why not just do this in C++?


Let's start with because writing C++ code without memory errors is 
harder than writing JS code without memory errors?



I don't understand why you
would constrain yourself to using Web APIs in JavaScript to write a browser.


Simplicity of implementation?  Sandboxing of the code?  Eating your own 
dogfood?


I can come up with some more reasons if you want.

-Boris


Re: [whatwg] asynchronous JSON.parse

2013-03-09 Thread David Bruant

On 08/03/2013 22:16, David Rajchenbach-Teller wrote:

On 3/8/13 5:35 PM, David Bruant wrote:

2. serialize JSON (hopefully asynchronously) to a Transferable (or
several Transferables).

Why not collect the data in a Transferable like an ArrayBuffer directly?
It skips the additional serialization part. Writing a byte stream
directly is a bit hardcore I admit, but an object full of setters can
give the impression of creating an object while actually filling an
ArrayBuffer as a backend. I feel that could work efficiently.

I suspect that this will quickly grow to either:
- an API for serializing an object to a Transferable or a stream of
Transferable; or
- a lower-level but equivalent API for doing the same, without having to
actually build the object.
Yes. The difference with JSON is that it can be transferred directly 
without an extra step.


Whether you put the info in an Object as properties (before being 
JSON.stringify()'ed) or directly in a Transferable, the snapshot info 
needs to be stored somewhere.




For instance, how would you serialize something as simple as the following?

{
   name: "The One",
   hp: 1000,
   achievements: ["achiever", "overachiever", "extreme overachiever"]
// Length of the list is unpredictable
}
If it's possible to serialize this as a string (like in JSON), it's 
possible to serialize it in an ArrayBuffer.
Depending on implementations, serializing a list will require defining 
separators or maybe a length field upfront, etc. But that's doable.
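For illustration, a sketch of one such ad hoc layout for the example object 
above -- the field order and the length-prefixed string encoding are assumed 
conventions that both sides would have to share:

   // Assumed layout: [u32 length][u16 chars...] per string; f64 for hp;
   // u32 element count before the achievements strings.
   function writeString(view, offset, s) {
     view.setUint32(offset, s.length); offset += 4;
     for (var i = 0; i < s.length; i++) {
       view.setUint16(offset, s.charCodeAt(i)); offset += 2;
     }
     return offset;
   }
   function encode(obj) {
     var buf = new ArrayBuffer(1024);   // assume 1 KiB is enough, for brevity
     var view = new DataView(buf);
     var off = writeString(view, 0, obj.name);
     view.setFloat64(off, obj.hp); off += 8;
     view.setUint32(off, obj.achievements.length); off += 4;
     for (var i = 0; i < obj.achievements.length; i++) {
       off = writeString(view, off, obj.achievements[i]);
     }
     return buf;   // an ArrayBuffer, hence Transferable
   }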


Taking a second for an aside.
I once met someone who told me that JSON was bullshit. Since the guy 
had blown my mind during a presentation, I decided to give him a 
chance after this sentence :-p He explained that in JSON, a lot of 
characters are double quotes and commas and brackets. Also, you have to 
name fields.
He said that if you want to share 2 numbers (like longitude and latitude), 
you probably have to send the following down the wire:

'{"long":12.986,"lat":-98.047}'
which is about 30 bytes... for 2 numbers. He suggested that a client and 
server could send only 2 floats (4 bytes each, so 8 bytes total) and 
have a convention as to which number is first and you'd just be done 
with it.
30 bytes isn't fully fair because it could be gzipped, but that takes 
additional processing time on both ends.
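His point, sketched with typed arrays -- the shared convention being that 
longitude comes first, and socket standing for an already-open binary 
WebSocket:

   var coords = new Float32Array(2);
   coords[0] = 12.986;           // longitude, by convention
   coords[1] = -98.047;          // latitude
   socket.send(coords.buffer);   // 8 bytes on the wire instead of ~30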


He talked about a technology he was working on that, based on a message 
description, would output both the client and server code (in different 
languages if necessary), so that whatever message you send, you just 
write your business code and play with well-abstracted objects, and the 
generated code takes care of the annoying "send/receive a 
well-compressed message" part.


That was an interesting idea.

Back to your case, it's always possible to represent structured 
information in a linear array (hence filesystems, hence databases).


David


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread Robin Berjon

On 07/03/2013 23:34, Tobie Langel wrote:

In which case, isn't part of the solution to paginate your data, and
parse those pages separately?


Assuming you can modify the backend. Also, data doesn't necessarily have 
to get all that bulky before you notice on a somewhat sluggish device.



Even if an async API for JSON existed, wouldn't the perf bottleneck
then simply fall on whatever processing needs to be done afterwards?


But for that part you're in control of whether your processing is 
blocking or not.



Wouldn't some form of event-based API be more indicated? E.g.:

var parser = JSON.parser();

 parser.parse(src);
 parser.onparse = function(e) { doSomething(e.data); };

I'm not sure how that snippet would be different from a single callback API.

There could possibly be value in an event-based API if you could set it 
up with a filter, e.g. JSON.filtered("$.*").then(function (item) {}); 
which would call you for every item in the root object. Getting an event 
for every information item that the parser processes would likely flood 
you with events.


Yet another option is a pull API. There's a lot of experience from the 
XML planet in APIs with specific performance characteristics. They would 
obviously be a lot simpler for JSON; I wonder how well that experience 
translates.


--
Robin Berjon - http://berjon.com/ - @robinberjon


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread Tobie Langel
On Friday, March 8, 2013 at 10:44 AM, Robin Berjon wrote:
 On 07/03/2013 23:34, Tobie Langel wrote:
  Wouldn't some form of event-based API be more indicated? E.g.:
  
  var parser = JSON.parser();
  parser.parse(src);
  parser.onparse = function(e) { doSomething(e.data); };
 
 
 I'm not sure how that snippet would be different from a single callback API.
 
 There could possibly be value in an event-based API if you could set it 
 up with a filter, e.g. JSON.filtered("$.*").then(function (item) {}); 
 which would call you for every item in the root object. Getting an event 
 for every information item that the parser processes would likely flood 
 you with events.

Agreed, you need something higher-level than just JSON tokens. Which is why 
this can be very much app-specific, unless most of the use cases are to parse 
data of a format similar to [Object, Object, Object, ..., Object]. This could 
be special-cased so as to send each object to the event handler as it's parsed.
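A rough userland sketch of that special case -- assuming the payload arrives 
as newline-delimited JSON (one object per line), which makes the splitting 
trivial:

   function parseInBatches(text, onitem, batchSize) {
     var lines = text.split('\n'), i = 0;
     (function step() {
       var stop = Math.min(i + batchSize, lines.length);
       for (; i < stop; i++) {
         if (lines[i]) onitem(JSON.parse(lines[i]));   // one object per line
       }
       if (i < lines.length) setTimeout(step, 0);      // yield to the event loop
     })();
   }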

--tobie


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Bruant

On 08/03/2013 02:01, Glenn Maynard wrote:
If you're dealing with lots of data, you should be loading or creating 
the data in the worker in the first place, not creating it in the UI 
thread and then shuffling it off to a worker.

Exactly. That would be the proper way to handle a large amount of data.

David


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Rajchenbach-Teller
Let me answer your question about the scenario, before entering the
specifics of an API.

For the moment, the main use case I see for asynchronous
serialization of JSON is that of snapshotting the world without stopping
it, for backup purposes, e.g.:
a. saving the state of the current region in an open world RPG;
b. saving the state of an ongoing physics simulation;
c. saving the state of the browser itself in case of crash/power loss
(that's assuming a FirefoxOS-style browser implemented as a web
application);
d. backing up state and history of the browser itself to a server
(again, assuming that the browser is a web application).

Cases a., b. and d. are hypothetical but, I believe, realistic. Case c.
is very close to a scenario I am currently facing.

The natural course of action would be to do the following:
1. collect data to a JSON object (possibly a noop);
2. send the object to a worker;
3. apply some post-treatment to the object (possibly a noop);
4. write/upload the object.

Having an asynchronous JSON serialization to some Transferable form
would considerably simplify the task of implementing step 2 without
janking if the data ends up very heavy.

Note that, in all the scenarios I have mentioned, it is generally
difficult for the author of the application to know ahead of time which
part of the JSON object will be heavy and should be transmitted through
an ad hoc protocol. In scenario c., for instance, it is quite frequent
that just one or two pages contain 90%+ of the data that needs to be
saved, in the form of form fields, or iframes, or Session Storage.

So far, I have discussed serializing JSON, not deserializing it, but I
believe that the symmetric scenarios also hold.

Best regards,
 David

On 3/7/13 11:34 PM, Tobie Langel wrote:
 I'd like to hear about the use cases a bit more. 
 
 Generally, structured data gets bulky because it contains more items, not 
 because items get bigger.
 
 In which case, isn't part of the solution to paginate your data, and parse 
 those pages separately?
 
 Even if an async API for JSON existed, wouldn't the perf bottleneck then 
 simply fall on whatever processing needs to be done afterwards?
 
 Wouldn't some form of event-based API be more indicated? E.g.:
 
 var parser = JSON.parser();
 parser.parse(src);
 parser.onparse = function(e) {
   doSomething(e.data);
 };
 
 And wouldn't this be highly dependent on how the data is structured, and thus 
 very much app-specific?
 
 --tobie 
 


-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Bruant

On 07/03/2013 23:18, David Rajchenbach-Teller wrote:

(Note: New on this list, please be gentle if I'm debating an
inappropriate issue in an inappropriate place.)

Actually, communicating large JSON objects between threads may cause
performance issues. I do not have the means to measure reception speed
simply (which would be used to implement asynchronous JSON.parse), but
it is easy to measure main thread blocks caused by sending (which would
be used to implement asynchronous JSON.stringify).

I have put together a small test here - warning, this may kill your browser:
http://yoric.github.com/Bugzilla-832664/

While there are considerable fluctuations, even inside one browser, on
my system, I witness janks that last 300ms to 3s.

Consequently, I am convinced that we need asynchronous variants of
JSON.{parse, stringify}.
I don't think this is necessary as all the processing can be done in a 
worker (starting in the worker even).
But if an async solution were to happen, I think it should be all the 
way, that is changing the JSON.parse method so that it accepts not only 
a string, but a stream of data.
Currently, one has to wait until the entire string has arrived before being 
able to parse it. That's a waste of time for big data, which is your use case 
(especially if waiting for data to come from the network), and probably a 
misuse of memory. With a stream, temporary strings can be thrown away.
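What that could look like from the caller's side -- entirely hypothetical, 
neither JSON.createStreamParser nor the chunked source below exists:

   var parser = JSON.createStreamParser();         // hypothetical API
   parser.ondone = function (obj) { console.log(obj); };
   source.onchunk = function (chunk) { parser.write(chunk); };  // feed as it arrives
   source.onend = function () { parser.end(); };   // temp strings discarded early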


David


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Rajchenbach-Teller
On 3/8/13 2:01 AM, Glenn Maynard wrote:
 (Not nitpicking, since I really wasn't sure what you meant at first, but
 I think you mean a JavaScript object.  There's no such thing as a "JSON
 object".)

I meant a pure data structure, i.e. a JavaScript object without methods.
It was my understanding that "JSON object" was a common name for
such objects, but I am willing to use something else.

I believe I have just addressed your other points in post
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2013-March/039090.html .

Best regards,
 David

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Rajchenbach-Teller
I fully agree that any asynchronous JSON [de]serialization should be
stream-based, not string-based.

Now, if the main heavy duty work is dealing with the large object, this
can certainly be kept on a worker thread. I suspect, however, that this
is not always feasible.

Consider, for instance, a browser implemented as a web application,
FirefoxOS-style. The data that needs to be collected to save its current
state is held in the DOM. For performance and consistency, it is not
practical to keep the DOM synchronized at all times with a worker
thread. Consequently, data needs to be collected on the main thread and
then sent to a worker thread.

Similarly, for a 3d game, until workers can perform some off-screen
WebGL, I suspect that a considerable amount of complex game data needs
to reside on the main thread, because sending the appropriate subsets
from a worker to the main thread on demand might not be reactive enough
for 60 fps. I have no experience with such complex games, though, so my
intuition could be wrong.

Best regards,
 David


On 3/8/13 11:53 AM, David Bruant wrote:
 I don't think this is necessary as all the processing can be done in a
 worker (starting in the worker even).
 But if an async solution were to happen, I think it should be all the
 way, that is changing the JSON.parse method so that it accepts not only
 a string, but a stream of data.
 Currently, one has to wait until the entire string has arrived before being
 able to parse it. That's a waste of time for big data, which is your use case
 (especially if waiting for data to come from the network), and probably a
 misuse of memory. With a stream, temporary strings can be thrown away.
 
 David


-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Bruant

On 08/03/2013 13:34, David Rajchenbach-Teller wrote:

I fully agree that any asynchronous JSON [de]serialization should be
stream-based, not string-based.

Now, if the main heavy duty work is dealing with the large object, this
can certainly be kept on a worker thread. I suspect, however, that this
is not always feasible.

Consider, for instance, a browser implemented as a web application,
FirefoxOS-style. The data that needs to be collected to save its current
state is held in the DOM. For performance and consistency, it is not
practical to keep the DOM synchronized at all times with a worker
thread. Consequently, data needs to be collected on the main thread and
then sent to a worker thread.
I feel the data can be collected on the main thread in a Transferable 
(probably awkward, yet doable). This way, when the data needs to be 
transferred, the transfer is fast and heavy processing can happen in the 
worker.



Similarly, for a 3d game, until workers can perform some off-screen
WebGL
What if a cross-origin or sandbox iframe was actually a worker with a 
DOM? [1]

Not for today, I admit.
Today, canvas contexts can be transferred [2]. There is no 
implementation of that to my knowledge, but that's happening.



I suspect that a considerable amount of complex game data needs
to reside on the main thread, because sending the appropriate subsets
from a worker to the main thread on demand might not be reactive enough
for 60 fps. I have no experience with such complex games, though, so my
intuition could be wrong.
I share your intuition, but lack the relevant expertise too. Let's wait 
until people complain :-) And let's see how far transferable CanvasProxy 
lets us go.


David

[1] 
https://groups.google.com/d/msg/mozilla.dev.servo/LQ46AtKp_t0/plqFfjLSER8J
[2] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#transferable


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Rajchenbach-Teller
On 3/8/13 1:59 PM, David Bruant wrote:
 Consider, for instance, a browser implemented as a web application,
 FirefoxOS-style. The data that needs to be collected to save its current
 state is held in the DOM. For performance and consistency, it is not
 practical to keep the DOM synchronized at all times with a worker
 thread. Consequently, data needs to be collected on the main thread and
 then sent to a worker thread.
 I feel the data can be collected on the main thread in a Transferable
 (probably awkward, yet doable). This way, when the data needs to be
 transferred, the transfer is fast and heavy processing can happen in the
 worker.

Intuitively, this sounds like:
1. collect data to a JSON;
2. serialize JSON (hopefully asynchronously) to a Transferable (or
several Transferables).

If so, we are back to the problem of serializing JSON asynchronously to
something transferable. Possibly an iterator (or an asynchronous
iterator == a stream) of ByteArray, for instance.

The alternative would be to serialize to a stream while we are still
building the object. This sounds possible, although I suspect that the
API would be much more complex.

 Similarly, for a 3d game, until workers can perform some off-screen
 WebGL
 What if a cross-origin or sandbox iframe was actually a worker with a
 DOM? [1]
 Not for today, I admit.
 Today, canvas contexts can be transferred [2]. There is no
 implementation of that to my knowledge, but that's happening.

Yes, I believe that, in time, this will solve many scenarios. Definitely
not the DOM-related scenario above, though.

 I suspect that a considerable amount of complex game data needs
 to reside on the main thread, because sending the appropriate subsets
 from a worker to the main thread on demand might not be reactive enough
 for 60 fps. I have no experience with such complex games, though, so my
 intuition could be wrong.
 I share your intuition, but lack the relevant expertise too. Let's wait
 until people complain :-) And let's see how far transferable CanvasProxy
 lets us go.

Ok, let's just say that I won't use games as a running example until
people start complaining :) However, the DOM situation remains.

Cheers,
 David

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread Glenn Maynard
On Fri, Mar 8, 2013 at 4:51 AM, David Rajchenbach-Teller 
dtel...@mozilla.com wrote:

 a. saving the state of the current region in an open world RPG;
 b. saving the state of an ongoing physics simulation;


These should live in a worker in the first place.

c. saving the state of the browser itself in case of crash/power loss
 (that's assuming a FirefoxOS-style browser implemented as a web
 application);


I don't understand this case.  Why would you implement a browser in a
browser?  That sounds like a weird novelty app, not a real use case.  Can
you explain this for people who don't know what FirefoxOS means?

d. backing up state and history of the browser itself to a server
 (again, assuming that the browser is a web application).


(This sounds identical to C.)

Similarly, for a 3d game, until workers can perform some off-screen
 WebGL, I suspect that a considerable amount of complex game data needs
 to reside on the main thread, because sending the appropriate subsets
 from a worker to the main thread on demand might not be reactive enough
 for 60 fps. I have no experience with such complex games, though, so my
 intuition could be wrong.


If so, we should be fixing the problems preventing workers from being used
fully, not adding workarounds to help people do computationally-expensive
work in the UI thread.

-- 
Glenn Maynard


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Bruant

On 08/03/2013 15:29, David Rajchenbach-Teller wrote:

On 3/8/13 1:59 PM, David Bruant wrote:

Consider, for instance, a browser implemented as a web application,
FirefoxOS-style. The data that needs to be collected to save its current
state is held in the DOM. For performance and consistency, it is not
practical to keep the DOM synchronized at all times with a worker
thread. Consequently, data needs to be collected on the main thread and
then sent to a worker thread.

I feel the data can be collected on the main thread in a Transferable
(probably awkward, yet doable). This way, when the data needs to be
transferred, the transfer is fast and heavy processing can happen in the
worker.

Intuitively, this sounds like:
1. collect data to a JSON;

I don't understand this sentence. Do you mean collect data in an object?
Just to be sure we use the same vocabulary:
When I say "object", I mean something described by ES5 - 8.6 [1], so 
basically a bag of properties (usually data properties) with an internal 
[[Prototype]], etc.
When I say "JSON", it's a shortcut for "JSON string", following the 
grammar defined at ES5 - 5.1.5 [2].
Given the vocabulary I use, one can collect data in an object (by adding 
own properties, most likely), then serialize it as a JSON string with a 
call to JSON.stringify, but one cannot collect data in/to a JSON.



2. serialize JSON (hopefully asynchronously) to a Transferable (or
several Transferables).
Why not collect the data in a Transferable like an ArrayBuffer directly? 
It skips the additional serialization part. Writing a byte stream 
directly is a bit hardcore I admit, but an object full of setters can 
give the impression of creating an object while actually filling an 
ArrayBuffer as a backend. I feel that could work efficiently.
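A sketch of that setter facade for a single fixed field -- the offset layout 
is an assumed convention, and worker an existing Worker:

   function BackedRecord(buffer) {
     var view = new DataView(buffer);
     // Looks like plain property assignment, but each access reads or
     // writes the backing ArrayBuffer directly at a fixed offset.
     Object.defineProperty(this, 'hp', {
       set: function (v) { view.setFloat64(0, v); },
       get: function () { return view.getFloat64(0); }
     });
   }
   var buf = new ArrayBuffer(8);
   var rec = new BackedRecord(buf);
   rec.hp = 1000;                   // fills the buffer as a side effect
   worker.postMessage(buf, [buf]);  // transfer, not copy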


What data do you want to collect? Do you have it all at once, or are you 
building the object little by little? For a backup, and for FirefoxOS 
specifically, could a FileHandle [3] work? It's an async API to write to 
a file.


David

[1] http://es5.github.com/#x8.6
[2] http://es5.github.com/#x5.1.5
[3] https://developer.mozilla.org/en-US/docs/WebAPI/FileHandle_API


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread David Rajchenbach-Teller
On 3/8/13 5:35 PM, David Bruant wrote:
 Intuitively, this sounds like:
 1. collect data to a JSON;
 I don't understand this sentence. Do you mean collect data in an object?

My bad. I sometimes write "JSON" for "object that may be stringified to
JSON format and parsed back without loss", i.e. a bag of [bags of]
non-function properties. So let's just say "object".

 2. serialize JSON (hopefully asynchronously) to a Transferable (or
 several Transferables).
 Why not collect the data in a Transferable like an ArrayBuffer directly?
 It skips the additional serialization part. Writing a byte stream
 directly is a bit hardcore I admit, but an object full of setters can
 give the impression of creating an object while actually filling an
 ArrayBuffer as a backend. I feel that could work efficiently.

I suspect that this will quickly grow to either:
- an API for serializing an object to a Transferable or a stream of
Transferable; or
- a lower-level but equivalent API for doing the same, without having to
actually build the object.

For instance, how would you serialize something as simple as the following?

{
  name: "The One",
  hp: 1000,
  achievements: ["achiever", "overachiever", "extreme overachiever"]
   // Length of the list is unpredictable
}

 What data do you want to collect? Do you have it all at once, or are you
 building the object little by little? For a backup, and for FirefoxOS
 specifically, could a FileHandle [3] work? It's an async API to write to
 a file.

Thanks for the suggestion. I am indeed working on refactoring how
browser session data is stored. Not for FirefoxOS, but for Firefox
Desktop, which gives me more architectural constraints but leaves me
free to extend the platform with additional non-web libraries.

Best regards,
 David



-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla


Re: [whatwg] asynchronous JSON.parse

2013-03-08 Thread Glenn Maynard
On Thu, Mar 7, 2013 at 4:18 PM, David Rajchenbach-Teller 
dtel...@mozilla.com wrote:

 I have put together a small test here - warning, this may kill your
 browser:
http://yoric.github.com/Bugzilla-832664/


By the way, I'd recommend keeping sample benchmarks as minimal and concise
as possible.  It's always tempting to make things configurable and dynamic
and output lots of stats, but everyone interested in the results of your
benchmark needs to read the code, to verify it's correct.


On Fri, Mar 8, 2013 at 9:12 AM, David Rajchenbach-Teller 
dtel...@mozilla.com wrote:

 Ideally, yes. The question is whether this is actually feasible.

Also, once we have a worker thread that needs to react fast enough to
 provide sufficient data to the UI thread for animating at 60fps, this
 worker thread ends up being nearly as critical as the UI thread, in
 terms of jank.


I don't think making a call asynchronous is really going to help much, at
least for serialization.  You'd have to make a copy of the data
synchronously, before returning to the caller, in order to guarantee that
changes made after the call returns won't affect the result.  This would
probably be more expensive than the JSON serialization itself, since it
means allocating lots of objects instead of just appending to a string.

If it's possible to make that copy quickly, then that should be done for
postMessage itself, to make postMessage return quickly, instead of doing it
for a bunch of individual computationally-expensive APIs.

(Also, remember that "returns quickly and does work asynchronously" doesn't
mean the work goes away; the CPU time still has to be spent.  Serializing
the complete state of a large system while it's running and trying to
maintain 60 FPS doesn't sound like a good approach in the first place.)

Seriously?
 FirefoxOS [1, 2] is a mobile operating system in which all applications
 are written in JavaScript, HTML, CSS. This includes the browser itself.
 Given the number of companies involved in the venture, all over the
 world, I believe that this qualifies as a real use case.


That doesn't sound like a good idea to me at all, but in any case that's a
system platform, not the Web.  APIs aren't typically added to the web to
support non-Web tasks.  For example, if there's something people want to do
in an iOS app using UIWebView, which doesn't come up on web pages, that
doesn't typically drive web APIs.  Platforms can add their own APIs for
their platform-specific needs.

-- 
Glenn Maynard


[whatwg] asynchronous JSON.parse

2013-03-07 Thread j
right now JSON.parse blocks the main loop; this gets more and more of an
issue as JSON documents get bigger and are also used as a serialization
format to communicate with web workers.
To handle large JSON documents there is a need for an async JSON.parse,
something like:

 JSON.parse(data, function(obj) { ... });

or more like FileReader:

 var json = new JSONReader();
 json.addEventListener('load', function(event) {
   //parsed JSON document in: this.result
 });
 json.parse(data);

While my major need is asynchronous parsing of JSON data, the same is
also true for serialization into JSON.

 var json = new JSONWriter();
 json.addEventListener('load', function(event) {
   // serialized JSON string in: this.result
 });
 json.serialize(obj);


Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread Glenn Maynard
(It's hard to talk to somebody called "j", by the way.  :)

On Thu, Mar 7, 2013 at 2:06 AM, j...@mailb.org wrote:

 right now JSON.parse blocks the main loop; this gets more and more of an
 issue as JSON documents get bigger


Just load the data you want to parse inside a worker, and perform the
parsing there.  Computationally-expensive work is exactly something Web
Workers are meant for.
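A minimal version of that -- the URL and the render function are illustrative:

   // worker.js: load and parse entirely off the main thread.
   var xhr = new XMLHttpRequest();
   xhr.open('GET', '/big.json');
   xhr.onload = function () {
     postMessage(JSON.parse(xhr.responseText));  // blocks only the worker
   };
   xhr.send();

   // main.js
   var w = new Worker('worker.js');
   w.onmessage = function (e) { render(e.data); };  // render: your handler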

and are also used as a serialization
 format to communicate with web workers.


There's no need to serialize to JSON before sending data to a worker;
there's nothing that JSON can represent that postMessage can't post
directly.  Just postMessage the object itself.

-- 
Glenn Maynard


Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread Rick Waldron
The JSON object and its API are part of the ECMAScript language
specification which is standardized by Ecma/TC39, not whatwg.


Rick

On Thursday, March 7, 2013, wrote:

 right now JSON.parse blocks the main loop; this gets more and more of an
 issue as JSON documents get bigger and are also used as a serialization
 format to communicate with web workers.
 To handle large JSON documents there is a need for an async JSON.parse,
 something like:

  JSON.parse(data, function(obj) { ... });

 or more like FileReader:

  var json = new JSONReader();
  json.addEventListener('load', function(event) {
//parsed JSON document in: this.result
  });
  json.parse(data);

 While my major need is asynchronous parsing of JSON data, the same is
 also true for serialization into JSON.

  var json = new JSONWriter();
  json.addEventListener('load', function(event) {
// serialized JSON string in: this.result
  });
  json.serialize(obj);



Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread Glenn Maynard
On Thu, Mar 7, 2013 at 9:29 AM, Rick Waldron waldron.r...@gmail.com wrote:

 The JSON object and its API are part of the ECMAScript language
 specification which is standardized by Ecma/TC39, not whatwg.


He's talking about an async interface to it, not the core parser.  It's a
higher level of abstraction than the core language, which doesn't know
anything about e.g. DOM Events and doesn't typically define asynchronous
interfaces.  If an API like this were to be exposed (which I believe is
unnecessary), it would belong here or in webapps, not at the language level.

-- 
Glenn Maynard


Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread Rick Waldron
On Thu, Mar 7, 2013 at 10:42 AM, Glenn Maynard gl...@zewt.org wrote:

 On Thu, Mar 7, 2013 at 9:29 AM, Rick Waldron waldron.r...@gmail.com wrote:

 The JSON object and its API are part of the ECMAScript language
 specification which is standardized by Ecma/TC39, not whatwg.


 He's talking about an async interface to it, not the core parser.  It's a
 higher level of abstraction than the core language, which doesn't know
 anything about eg. DOM Events and doesn't typically define asynchronous
 interfaces.  If an API like this was to be exposed (which I believe is
 unnecessary), it would belong here or webapps, not at the language level.


Yes, and as a member of ECMA/TC39 I felt that it was my responsibility
to clarify the specification ownership—but thanks for filling me in ;)

Rick



 --
 Glenn Maynard




Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread David Rajchenbach-Teller
(Note: New on this list, please be gentle if I'm debating an
inappropriate issue in an inappropriate place.)

Actually, communicating large JSON objects between threads may cause
performance issues. I do not have the means to measure reception speed
simply (which would be used to implement asynchronous JSON.parse), but
it is easy to measure main thread blocks caused by sending (which would
be used to implement asynchronous JSON.stringify).

I have put together a small test here - warning, this may kill your browser:
   http://yoric.github.com/Bugzilla-832664/

While there are considerable fluctuations, even inside one browser, on
my system, I witness janks that last 300ms to 3s.

Consequently, I am convinced that we need asynchronous variants of
JSON.{parse, stringify}.

Best regards,
 David

 Glenn Maynard wrote

 (It's hard to talk to somebody called "j", by the way.  :)
 
 On Thu, Mar 7, 2013 at 2:06 AM, j at mailb.org wrote:
 
 right now JSON.parse blocks the main loop; this gets more and more of an
 issue as JSON documents get bigger
 
 
 Just load the data you want to parse inside a worker, and perform the
 parsing there.  Computationally-expensive work is exactly something Web
 Workers are meant for.
 
 and are also used as a serialization
 format to communicate with web workers.

 
 There's no need to serialize to JSON before sending data to a worker;
 there's nothing that JSON can represent that postMessage can't post
 directly.  Just postMessage the object itself.


-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla






Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread Tobie Langel
I'd like to hear about the use cases a bit more. 

Generally, structured data gets bulky because it contains more items, not 
because items get bigger.

In which case, isn't part of the solution to paginate your data, and parse 
those pages separately?

Even if an async API for JSON existed, wouldn't the perf bottleneck then simply 
fall on whatever processing needs to be done afterwards?

Wouldn't some form of event-based API be more indicated? E.g.:

var parser = JSON.parser();
parser.parse(src);
parser.onparse = function(e) {
  doSomething(e.data);
};

And wouldn't this be highly dependent on how the data is structured, and thus 
very much app-specific?

--tobie 


On Thursday, March 7, 2013 at 11:18 PM, David Rajchenbach-Teller wrote:

 (Note: New on this list, please be gentle if I'm debating an
 inappropriate issue in an inappropriate place.)
 
 Actually, communicating large JSON objects between threads may cause
 performance issues. I do not have the means to measure reception speed
 simply (which would be used to implement asynchronous JSON.parse), but
 it is easy to measure main thread blocks caused by sending (which would
 be used to implement asynchronous JSON.stringify).
 
 I have put together a small test here - warning, this may kill your browser:
 http://yoric.github.com/Bugzilla-832664/
 
 While there are considerable fluctuations, even inside one browser, on
 my system, I witness janks that last 300ms to 3s.
 
 Consequently, I am convinced that we need asynchronous variants of
 JSON.{parse, stringify}.
 
 Best regards,
 David
 
  Glenn Maynard wrote
  
  (It's hard to talk to somebody called "j", by the way. :)
  
  On Thu, Mar 7, 2013 at 2:06 AM, j at mailb.org wrote:
  
   right now JSON.parse blocks the main loop; this gets more and more of an
   issue as JSON documents get bigger
  
  
  
  
  Just load the data you want to parse inside a worker, and perform the
  parsing there. Computationally-expensive work is exactly something Web
  Workers are meant for.
  
  and are also used as a serialization
   format to communicate with web workers.
  
  
  
  There's no need to serialize to JSON before sending data to a worker;
  there's nothing that JSON can represent that postMessage can't post
  directly. Just postMessage the object itself.
 
 
 
 
 -- 
 David Rajchenbach-Teller, PhD
 Performance Team, Mozilla





Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread Dan Beam
On Thu, Mar 7, 2013 at 2:18 PM, David Rajchenbach-Teller 
dtel...@mozilla.com wrote:

 (Note: New on this list, please be gentle if I'm debating an
 inappropriate issue in an inappropriate place.)

 Actually, communicating large JSON objects between threads may cause
 performance issues. I do not have the means to measure reception speed
 simply (which would be used to implement asynchronous JSON.parse), but
 it is easy to measure main thread blocks caused by sending (which would
 be used to implement asynchronous JSON.stringify).


Isn't this precisely what Transferable objects are for?
http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#transferable-objects

--
Dan Beam
db...@chromium.org


 I have put together a small test here - warning, this may kill your
 browser:
http://yoric.github.com/Bugzilla-832664/

 While there are considerable fluctuations, even inside one browser, on
 my system, I witness janks that last 300ms to 3s.

 Consequently, I am convinced that we need asynchronous variants of
 JSON.{parse, stringify}.

 Best regards,
  David

  Glenn Maynard wrote
 
  (It's hard to talk to somebody called "j", by the way.  :)
 
  On Thu, Mar 7, 2013 at 2:06 AM, j at mailb.org wrote:
 
  right now JSON.parse blocks the main loop; this gets more and more of an
  issue as JSON documents get bigger
 
 
  Just load the data you want to parse inside a worker, and perform the
  parsing there.  Computationally-expensive work is exactly something Web
  Workers are meant for.
 
  and are also used as a serialization
  format to communicate with web workers.
 
 
  There's no need to serialize to JSON before sending data to a worker;
  there's nothing that JSON can represent that postMessage can't post
  directly.  Just postMessage the object itself.


 --
 David Rajchenbach-Teller, PhD
  Performance Team, Mozilla






Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread David Rajchenbach-Teller
It is.

However, to use Transferable objects for the purpose of implementing
asynchronous parse/stringify, one needs conversions of JSON objects
from/to Transferable objects. As it turns out, these conversions are
just variants on JSON parse/stringify, so we have not simplified the issue.

Note that I would be quite satisfied with an efficient, asynchronous,
implementation of these [de]serializations of JSON from/to Transferable
objects.

Best regards,
 David

On Thu Mar  7 23:37:43 2013, Dan Beam wrote:
 Isn't this precisely what Transferable objects are for?
 http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#transferable-objects

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla


Re: [whatwg] asynchronous JSON.parse

2013-03-07 Thread Glenn Maynard
On Thu, Mar 7, 2013 at 4:18 PM, David Rajchenbach-Teller 
dtel...@mozilla.com wrote:

 (Note: New on this list, please be gentle if I'm debating an
 inappropriate issue in an inappropriate place.)


(To my understanding of this list, it's completely acceptable to discuss
this here.)

Actually, communicating large JSON objects between threads may cause
 performance issues. I do not have the means to measure reception speed
 simply (which would be used to implement asynchronous JSON.parse), but
 it is easy to measure main thread blocks caused by sending (which would
 be used to implement asynchronous JSON.stringify).


If you're dealing with lots of data, you should be loading or creating the
data in the worker in the first place, not creating it in the UI thread and
then shuffling it off to a worker.

For example, if you're reading a large file provided by the user, post the
File object (received e.g. from an <input>) to the worker, then do the heavy
lifting there in the first place.

Benchmarks are always good, but it'd be better to talk about a real-world
use case, since it gives us something concrete to talk about.  What's a
practical case where you would actually have to create the big object in
the UI thread?


On Thu, Mar 7, 2013 at 5:25 PM, David Rajchenbach-Teller 
dtel...@mozilla.com wrote:

 However, to use Transferable objects for purpose of implementing
 asynchronous parse/stringify, one needs conversions of JSON objects
 from/to Transferable objects. As it turns out, these conversions are
 just variants on JSON parse/stringify, so we have not simplified the issue.


(Not nitpicking, since I really wasn't sure what you meant at first, but I
think you mean a JavaScript object.  There's no such thing as a "JSON
object".)

-- 
Glenn Maynard