Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness
On Tue, 6 Aug 2013, Boris Zbarsky wrote:
> On 8/6/13 5:58 PM, Ian Hickson wrote:
> > Parsing is easy to do on a separate worker, because it has no
> > dependencies -- you can do it all in isolation.
>
> Sadly, that may not be the case. Actual JS implementations have various
> thread-local data that objects depend on (starting with interned
> property names), such that it's not actually possible to create an
> object on one thread and use it on another in many of them.

Yeah, the final step of parsing a JSON string might require sync access
to the target thread.

> > > For instance, how would you serialize something as simple as the
> > > following?
> > >
> > >   { name: "The One",
> > >     hp: 1000,
> > >     achievements: ["achiever", "overachiever", "extreme overachiever"]
> > >     // Length of the list is unpredictable
> > >   }
> >
> > Why serialise it? If you want to post this across a MessagePort to a
> > worker, or back from a worker, why not just post it?
> >
> >   var a = { ... }; // from above
> >   port.postMessage(a);
>
> This in practice does some sort of serialization in UAs.

Indeed. My question was "why do it manually".

> > why not just do this in C++?
>
> Let's start with "because writing C++ code without memory errors is
> harder than writing JS code without memory errors"?

> > I don't understand why you would constrain yourself to using Web APIs
> > in JavaScript to write a browser.
>
> Simplicity of implementation? Sandboxing of the code? Eating your own
> dogfood?

I guess that's a design choice. But fundamentally, the needs of
programmers writing Web browsers aren't valid use cases for adding
features to the Web platform. There's no need for internal APIs to be
interoperable.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
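[Editor's note: Hickson's "why not just post it?" relies on postMessage's structured clone making a deep copy. The later-standardized global `structuredClone()` (not available at the time of this thread) exposes the same copy operation directly, which makes the copy-not-share semantics easy to see in a minimal sketch:]

```javascript
// The object from the thread, posted as-is; postMessage deep-copies it
// via the structured clone algorithm. structuredClone() (a later
// addition to the platform) performs the same copy directly.
const a = {
  name: "The One",
  hp: 1000,
  achievements: ["achiever", "overachiever", "extreme overachiever"],
};

const copy = structuredClone(a); // what port.postMessage(a) would deliver

copy.hp = 0;                     // the receiver's copy is independent
console.log(a.hp);               // still 1000
```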
Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness
On Thu, 7 Mar 2013, j...@mailb.org wrote:
> right now JSON.parse blocks the mainloop, this gets more and more of an
> issue as JSON documents get bigger and are also used as serialization
> format to communicate with web workers.

I think it would make sense to have a Promise-based API for JSON parsing.
This probably belongs either in the JS spec or the DOM spec; Anne,
Ms2ger, and any JS people, is anyone interested in taking this?

On Thu, 7 Mar 2013, David Rajchenbach-Teller wrote:
> Actually, communicating large JSON objects between threads may cause
> performance issues. I do not have the means to measure reception speed
> simply (which would be used to implement asynchronous JSON.parse), but
> it is easy to measure main thread blocks caused by sending (which would
> be used to implement asynchronous JSON.stringify).

I don't understand why there'd be any difficulty in sending large objects
between workers or from a worker to the main thread. It's possible this
is not well-implemented today, but isn't that just an implementation
detail? One could imagine an implementation strategy where the cloning is
done on the sending side, or even on a third thread altogether, and just
passed straight to the receiving side in one go.

On Thu, 7 Mar 2013, Tobie Langel wrote:
> Even if an async API for JSON existed, wouldn't the perf bottleneck then
> simply fall on whatever processing needs to be done afterwards?

That was my initial reaction as well, I must admit.

On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:
> For the moment, the main use case I see for asynchronous serialization
> of JSON is that of snapshotting the world without stopping it, for
> backup purposes, e.g.:
> a. saving the state of the current region in an open world RPG;
> b. saving the state of an ongoing physics simulation;
> c. saving the state of the browser itself in case of crash/power loss
>    (that's assuming a FirefoxOS-style browser implemented as a web
>    application);
> d. backing up state and history of the browser itself to a server
>    (again, assuming that the browser is a web application).

Serialising is hard to do async, since you fundamentally have to walk the
data structure, and the actual serialisation at that point is not
especially more expensive than a copy.

> The natural course of action would be to do the following:
> 1. collect data to a JSON object (possibly a noop);

I'm not sure what you mean by "JSON object". JSON is a string format. Do
you mean a JS object data structure?

> 2. send the object to a worker;
> 3. apply some post-treatment to the object (possibly a noop);
> 4. write/upload the object.
>
> Having an asynchronous JSON serialization to some Transferable form
> would considerably simplify the task of implementing step 2 without
> janking if data ends up very heavy.

I don't understand what JSON has to do with sending data to a worker. You
can just send the actual JS object; MessagePorts and postMessage()
support raw JS objects.

> So far, I have discussed serializing JSON, not deserializing it, but I
> believe that the symmetric scenarios also hold.

No, they are quite asymmetric. Serialising requires stalling the code
that is interacting with the data structure, to guarantee integrity.
Parsing is easy to do on a separate worker, because it has no
dependencies -- you can do it all in isolation.

On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:
> If I am correct, this means that we need some mechanism to provide
> efficient serialization of non-Transferable data into something
> Transferable.

I don't understand what this means. Transferable is about neutering
objects on one side and creating new versions on the other. It's the
equivalent of a move. Your use cases were about making copies, as far as
I can tell (saving and backing up). As a general rule, JSON has nothing
to do with Transferable objects, as far as I can tell.

On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:
> Intuitively, this sounds like:
> 1. collect data to a JSON;
> 2. serialize JSON (hopefully asynchronously) to a Transferable (or
>    several Transferables).

I really don't understand this. Are you asking for a way to move a JS
object from one thread to another, killing references to it in the first
thread? What's the use case? (What would this have to do with JSON?)

On Fri, 8 Mar 2013, David Bruant wrote:
> Why not collect the data in a Transferable like an ArrayBuffer directly?
> It skips the additional serialization part. Writing a byte stream
> directly is a bit hardcore I admit, but an object full of setters can
> give the impression to create an object while actually filling an
> ArrayBuffer as a backend. I feel that could work efficiently.

It's not clear to me what the use case is, but if the desire is to move a
batch of data from one thread to another, then this is certainly one way
to do it. Another would be to just copy the data in the first place, no
need to move it -- since you have to pay the cost of reading
Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness
On 8/6/13 5:58 PM, Ian Hickson wrote:
> One could imagine an implementation strategy where the cloning is done
> on the sending side, or even on a third thread altogether

The cloning needs to run to completion (in the sense of capturing an
immutable representation) before anyone can change the data structure
being cloned. That means either serializing the whole data structure in
some way before returning control to JS, or doing something where you
start serializing it async and block until finished as soon as someone
tries to modify any of those objects in any way, right? The latter is
rather nontrivial to implement, so UAs do the former at the moment.

> Serialising is hard to do async, since you fundamentally have to walk
> the data structure, and the actual serialisation at that point is not
> especially more expensive than a copy.

Right, that's what I said above... ;)

> Parsing is easy to do on a separate worker, because it has no
> dependencies -- you can do it all in isolation.

Sadly, that may not be the case. Actual JS implementations have various
thread-local data that objects depend on (starting with interned property
names), such that it's not actually possible to create an object on one
thread and use it on another in many of them.

> > For instance, how would you serialize something as simple as the
> > following?
> >
> >   { name: "The One",
> >     hp: 1000,
> >     achievements: ["achiever", "overachiever", "extreme overachiever"]
> >     // Length of the list is unpredictable
> >   }
>
> Why serialise it? If you want to post this across a MessagePort to a
> worker, or back from a worker, why not just post it?
>
>   var a = { ... }; // from above
>   port.postMessage(a);

This in practice does some sort of serialization in UAs.

> Assuming by "Firefox Desktop" you mean the browser for desktop OSes
> called Firefox, then, why not just do this in C++?

Let's start with "because writing C++ code without memory errors is
harder than writing JS code without memory errors"?

> I don't understand why you would constrain yourself to using Web APIs in
> JavaScript to write a browser.

Simplicity of implementation? Sandboxing of the code? Eating your own
dogfood? I can come up with some more reasons if you want.

-Boris
Re: [whatwg] asynchronous JSON.parse
On 08/03/2013 22:16, David Rajchenbach-Teller wrote:
> On 3/8/13 5:35 PM, David Bruant wrote:
> > > 2. serialize JSON (hopefully asynchronously) to a Transferable (or
> > >    several Transferables).
> >
> > Why not collect the data in a Transferable like an ArrayBuffer
> > directly? It skips the additional serialization part. Writing a byte
> > stream directly is a bit hardcore I admit, but an object full of
> > setters can give the impression to create an object while actually
> > filling an ArrayBuffer as a backend. I feel that could work
> > efficiently.
>
> I suspect that this will quickly grow to either:
> - an API for serializing an object to a Transferable or a stream of
>   Transferable; or
> - a lower-level but equivalent API for doing the same, without having to
>   actually build the object.

Yes. The difference with JSON is that it can be transfered directly
without an extra step. Whether you put the info in an Object as
properties (before being JSON.stringify()'ed) or directly in a
Transferable, the snapshot info needs to be stored somewhere.

> For instance, how would you serialize something as simple as the
> following?
>
>   { name: "The One",
>     hp: 1000,
>     achievements: ["achiever", "overachiever", "extreme overachiever"]
>     // Length of the list is unpredictable
>   }

If it's possible to serialize this as a string (like in JSON), it's
possible to serialize it in an ArrayBuffer. Depending on implementations,
serializing a list will require defining separators or maybe a length
field upfront, etc. But that's doable.

Taking a second for an aside: I once met someone who told me that JSON
was bullshit. Since the guy had blown my mind during a presentation, I
decided to give him a chance after this sentence :-p He explained that in
JSON, a lot of characters are double quotes and commas and brackets.
Also, you have to name fields. He said that if you want to share 2 ints
(like longitude and latitude), you probably have to send the following
down the wire:

  '{"long":12.986,"lat":-98.047}'

which is about 30 bytes... for 2 numbers. He suggested that a client and
server could send only 2 floats (4 bytes each, so 8 bytes total) and have
a convention as to which number is first, and you'd just be done with it.
30 bytes isn't fully fair because it could be gzipped, but that takes
additional processing time on both ends. He talked about a technology he
was working on that, based on a message description, would output both
the client and server code (in different languages if necessary), so that
whatever message you send, you just write your business code and play
with well-abstracted objects, and the generated code takes care of the
annoying "send/receive a well-compressed message" part. That was an
interesting idea.

Back to your case: it's always possible to represent structured
information in a linear array (hence filesystems, hence databases).

David
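[Editor's note: the two-floats anecdote is easy to make concrete. A minimal sketch, where the field names and the "longitude first" byte layout are just the convention the anecdote assumes:]

```javascript
// ~29 bytes of JSON vs. 8 bytes of binary for the same two numbers.
const json = JSON.stringify({ long: 12.986, lat: -98.047 });

const buf = new ArrayBuffer(8);       // 2 x 32-bit floats
const view = new DataView(buf);
view.setFloat32(0, 12.986);           // slot 0: longitude, by convention
view.setFloat32(4, -98.047);          // slot 1: latitude

console.log(json.length, buf.byteLength); // 29 8
```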
Re: [whatwg] asynchronous JSON.parse
On 07/03/2013 23:34, Tobie Langel wrote:
> In which case, isn't part of the solution to paginate your data, and
> parse those pages separately?

Assuming you can modify the backend. Also, data doesn't necessarily have
to get all that bulky before you notice on a somewhat sluggish device.

> Even if an async API for JSON existed, wouldn't the perf bottleneck then
> simply fall on whatever processing needs to be done afterwards?

But for that part you're in control of whether your processing is
blocking or not.

> Wouldn't some form of event-based API be more indicated? E.g.:
>
>   var parser = JSON.parser();
>   parser.parse(src);
>   parser.onparse = function(e) {
>     doSomething(e.data);
>   };

I'm not sure how that snippet would be different from a single callback
API. There could possibly be value in an event-based API if you could set
it up with a filter, e.g.

  JSON.filtered("$.*").then(function (item) {});

which would call you for every item in the root object. Getting an event
for every information item that the parser processes would likely flood
you in events.

Yet another option is a pull API. There's a lot of experience from the
XML planet in APIs with specific performance characteristics. They would
obviously be a lot simpler for JSON; I wonder how well that experience
translates.

-- 
Robin Berjon - http://berjon.com/ - @robinberjon
Re: [whatwg] asynchronous JSON.parse
On Friday, March 8, 2013 at 10:44 AM, Robin Berjon wrote:
> On 07/03/2013 23:34, Tobie Langel wrote:
> > Wouldn't some form of event-based API be more indicated? E.g.:
> >
> >   var parser = JSON.parser();
> >   parser.parse(src);
> >   parser.onparse = function(e) {
> >     doSomething(e.data);
> >   };
>
> I'm not sure how that snippet would be different from a single callback
> API. There could possibly be value in an event-based API if you could
> set it up with a filter, e.g.
>
>   JSON.filtered("$.*").then(function (item) {});
>
> which would call you for every item in the root object. Getting an event
> for every information item that the parser processes would likely flood
> you in events.

Agreed, you need something higher-level than just JSON tokens. Which is
why this can be very much app-specific, unless most of the use cases are
to parse data of a format similar to [Object, Object, Object, ...,
Object]. This could be special-cased so as to send each object to the
event handler as it's parsed.

--tobie
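[Editor's note: the [Object, Object, ...] special case Tobie describes can be sketched with a tiny incremental reader. The names (arrayItemReader, feed, onItem) are invented for illustration, and for brevity the sketch does not handle brackets appearing inside string values:]

```javascript
// Hypothetical sketch: feed() accepts chunks of a top-level JSON array
// and invokes onItem for each completed element, without waiting for the
// whole document to arrive.
function arrayItemReader(onItem) {
  let depth = 0;   // bracket nesting depth; the outer array is depth 1
  let item = "";   // accumulated text of the current top-level element
  return function feed(chunk) {
    for (const c of chunk) {
      if (c === "{" || c === "[") depth++;
      if (depth >= 2) item += c;                 // inside an element
      if (c === "}" || c === "]") {
        depth--;
        if (depth === 1) {                       // element just closed
          onItem(JSON.parse(item));
          item = "";
        }
      }
    }
  };
}

// Usage: items arrive as soon as they are complete, even mid-chunk.
const items = [];
const feed = arrayItemReader((o) => items.push(o));
feed('[{"a":1},{"b"');
feed(':2}]');
console.log(items.length); // 2
```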
Re: [whatwg] asynchronous JSON.parse
On 08/03/2013 02:01, Glenn Maynard wrote:
> If you're dealing with lots of data, you should be loading or creating
> the data in the worker in the first place, not creating it in the UI
> thread and then shuffling it off to a worker.

Exactly. That would be the proper way to handle a big amount of data.

David
Re: [whatwg] asynchronous JSON.parse
Let me answer your question about the scenario, before entering the
specifics of an API.

For the moment, the main use case I see for asynchronous serialization of
JSON is that of snapshotting the world without stopping it, for backup
purposes, e.g.:
a. saving the state of the current region in an open world RPG;
b. saving the state of an ongoing physics simulation;
c. saving the state of the browser itself in case of crash/power loss
   (that's assuming a FirefoxOS-style browser implemented as a web
   application);
d. backing up state and history of the browser itself to a server (again,
   assuming that the browser is a web application).

Cases a., b. and d. are hypothetical but, I believe, realistic. Case c.
is very close to a scenario I am currently facing.

The natural course of action would be to do the following:
1. collect data to a JSON object (possibly a noop);
2. send the object to a worker;
3. apply some post-treatment to the object (possibly a noop);
4. write/upload the object.

Having an asynchronous JSON serialization to some Transferable form would
considerably simplify the task of implementing step 2 without janking if
data ends up very heavy.

Note that, in all the scenarios I have mentioned, it is generally
difficult for the author of the application to know ahead of time which
part of the JSON object will be heavy and should be transmitted through
an ad hoc protocol. In scenario c., for instance, it is quite frequent
that just one or two pages contain 90%+ of the data that needs to be
saved, in the form of form fields, or iframes, or Session Storage.

So far, I have discussed serializing JSON, not deserializing it, but I
believe that the symmetric scenarios also hold.

Best regards,
 David

On 3/7/13 11:34 PM, Tobie Langel wrote:
> I'd like to hear about the use cases a bit more. Generally, structured
> data gets bulky because it contains more items, not because items get
> bigger. In which case, isn't part of the solution to paginate your data,
> and parse those pages separately?
>
> Even if an async API for JSON existed, wouldn't the perf bottleneck then
> simply fall on whatever processing needs to be done afterwards?
>
> Wouldn't some form of event-based API be more indicated? E.g.:
>
>   var parser = JSON.parser();
>   parser.parse(src);
>   parser.onparse = function(e) {
>     doSomething(e.data);
>   };
>
> And wouldn't this be highly dependent on how the data is structured, and
> thus very much app-specific?
>
> --tobie

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla
Re: [whatwg] asynchronous JSON.parse
On 07/03/2013 23:18, David Rajchenbach-Teller wrote:
> (Note: New on this list, please be gentle if I'm debating an
> inappropriate issue in an inappropriate place.)
>
> Actually, communicating large JSON objects between threads may cause
> performance issues. I do not have the means to measure reception speed
> simply (which would be used to implement asynchronous JSON.parse), but
> it is easy to measure main thread blocks caused by sending (which would
> be used to implement asynchronous JSON.stringify).
>
> I have put together a small test here - warning, this may kill your
> browser: http://yoric.github.com/Bugzilla-832664/
>
> While there are considerable fluctuations, even inside one browser, on
> my system, I witness janks that last 300ms to 3s. Consequently, I am
> convinced that we need asynchronous variants of JSON.{parse, stringify}.

I don't think this is necessary, as all the processing can be done in a
worker (starting in the worker, even). But if an async solution were to
happen, I think it should go all the way, that is, changing the
JSON.parse method so that it accepts not only a string, but a stream of
data. Currently, one has to wait for the entire string before being able
to parse it. That's a waste of time for big data, which is your use case
(especially if waiting for data to come from the network), and probably a
misuse of memory. With a stream, temporary strings can be thrown away.

David
Re: [whatwg] asynchronous JSON.parse
On 3/8/13 2:01 AM, Glenn Maynard wrote:
> (Not nitpicking, since I really wasn't sure what you meant at first, but
> I think you mean a JavaScript object. There's no such thing as a "JSON
> object".)

I meant a pure data structure, i.e. a JavaScript object without methods.
It was my understanding that "JSON object" was a common denomination for
such objects, but I am willing to use something else.

I believe I have just addressed your other points in post
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2013-March/039090.html

Best regards,
 David

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla
Re: [whatwg] asynchronous JSON.parse
I fully agree that any asynchronous JSON [de]serialization should be
stream-based, not string-based.

Now, if the main heavy duty work is dealing with the large object, this
can certainly be kept on a worker thread. I suspect, however, that this
is not always feasible. Consider, for instance, a browser implemented as
a web application, FirefoxOS-style. The data that needs to be collected
to save its current state is held in the DOM. For performance and
consistency, it is not practical to keep the DOM synchronized at all
times with a worker thread. Consequently, data needs to be collected on
the main thread and then sent to a worker thread.

Similarly, for a 3d game, until workers can perform some off-screen
WebGL, I suspect that a considerable amount of complex game data needs to
reside on the main thread, because sending the appropriate subsets from a
worker to the main thread on demand might not be reactive enough for 60
fps. I have no experience with such complex games, though, so my
intuition could be wrong.

Best regards,
 David

On 3/8/13 11:53 AM, David Bruant wrote:
> I don't think this is necessary, as all the processing can be done in a
> worker (starting in the worker, even). But if an async solution were to
> happen, I think it should go all the way, that is, changing the
> JSON.parse method so that it accepts not only a string, but a stream of
> data. Currently, one has to wait for the entire string before being able
> to parse it. That's a waste of time for big data, which is your use case
> (especially if waiting for data to come from the network), and probably
> a misuse of memory. With a stream, temporary strings can be thrown away.
>
> David

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla
Re: [whatwg] asynchronous JSON.parse
On 08/03/2013 13:34, David Rajchenbach-Teller wrote:
> I fully agree that any asynchronous JSON [de]serialization should be
> stream-based, not string-based.
>
> Now, if the main heavy duty work is dealing with the large object, this
> can certainly be kept on a worker thread. I suspect, however, that this
> is not always feasible. Consider, for instance, a browser implemented as
> a web application, FirefoxOS-style. The data that needs to be collected
> to save its current state is held in the DOM. For performance and
> consistency, it is not practical to keep the DOM synchronized at all
> times with a worker thread. Consequently, data needs to be collected on
> the main thread and then sent to a worker thread.

I feel the data can be collected on the main thread in a Transferable
(probably awkward, yet doable). This way, when the data needs to be
transfered, the transfer is fast and heavy processing can happen in the
worker.

> Similarly, for a 3d game, until workers can perform some off-screen
> WebGL

What if a cross-origin or sandbox iframe was actually a worker with a
DOM? [1] Not for today, I admit. Today, canvas contexts can be
transferred [2]. There is no implementation of that to my knowledge, but
that's happening.

> I suspect that a considerable amount of complex game data needs to
> reside on the main thread, because sending the appropriate subsets from
> a worker to the main thread on demand might not be reactive enough for
> 60 fps. I have no experience with such complex games, though, so my
> intuition could be wrong.

I share your intuition, but miss the relevant expertise too. Let's wait
until people complain :-) And let's see how far transferable CanvasProxy
lets us go.

David

[1] https://groups.google.com/d/msg/mozilla.dev.servo/LQ46AtKp_t0/plqFfjLSER8J
[2] http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#transferable
Re: [whatwg] asynchronous JSON.parse
On 3/8/13 1:59 PM, David Bruant wrote:
> > Consider, for instance, a browser implemented as a web application,
> > FirefoxOS-style. The data that needs to be collected to save its
> > current state is held in the DOM. For performance and consistency, it
> > is not practical to keep the DOM synchronized at all times with a
> > worker thread. Consequently, data needs to be collected on the main
> > thread and then sent to a worker thread.
>
> I feel the data can be collected on the main thread in a Transferable
> (probably awkward, yet doable). This way, when the data needs to be
> transfered, the transfer is fast and heavy processing can happen in the
> worker.

Intuitively, this sounds like:
1. collect data to a JSON;
2. serialize JSON (hopefully asynchronously) to a Transferable (or
   several Transferables).

If so, we are back to the problem of serializing JSON asynchronously to
something transferable. Possibly an iterator (or an asynchronous iterator
== a stream) of ByteArray, for instance.

The alternative would be to serialize to a stream while we are still
building the object. This sounds possible, although I suspect that the
API would be much more complex.

> > Similarly, for a 3d game, until workers can perform some off-screen
> > WebGL
>
> What if a cross-origin or sandbox iframe was actually a worker with a
> DOM? [1] Not for today, I admit. Today, canvas contexts can be
> transferred [2]. There is no implementation of that to my knowledge, but
> that's happening.

Yes, I believe that, in time, this will solve many scenarios. Definitely
not the DOM-related scenario above, though.

> > I suspect that a considerable amount of complex game data needs to
> > reside on the main thread, because sending the appropriate subsets
> > from a worker to the main thread on demand might not be reactive
> > enough for 60 fps. I have no experience with such complex games,
> > though, so my intuition could be wrong.
>
> I share your intuition, but miss the relevant expertise too. Let's wait
> until people complain :-) And let's see how far transferable CanvasProxy
> lets us go.

Ok, let's just say that I won't use games as a running example until
people start complaining :) However, the DOM situation remains.

Cheers,
 David

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla
Re: [whatwg] asynchronous JSON.parse
On Fri, Mar 8, 2013 at 4:51 AM, David Rajchenbach-Teller
dtel...@mozilla.com wrote:
> a. saving the state of the current region in an open world RPG;
> b. saving the state of an ongoing physics simulation;

These should live in a worker in the first place.

> c. saving the state of the browser itself in case of crash/power loss
>    (that's assuming a FirefoxOS-style browser implemented as a web
>    application);

I don't understand this case. Why would you implement a browser in a
browser? That sounds like a weird novelty app, not a real use case. Can
you explain this for people who don't know what "FirefoxOS" means?

> d. backing up state and history of the browser itself to a server
>    (again, assuming that the browser is a web application).

(This sounds identical to c.)

> Similarly, for a 3d game, until workers can perform some off-screen
> WebGL, I suspect that a considerable amount of complex game data needs
> to reside on the main thread, because sending the appropriate subsets
> from a worker to the main thread on demand might not be reactive enough
> for 60 fps. I have no experience with such complex games, though, so my
> intuition could be wrong.

If so, we should be fixing the problems preventing workers from being
used fully, not adding workarounds to help people do
computationally-expensive work in the UI thread.

-- 
Glenn Maynard
Re: [whatwg] asynchronous JSON.parse
On 08/03/2013 15:29, David Rajchenbach-Teller wrote:
> > > Consider, for instance, a browser implemented as a web application,
> > > FirefoxOS-style. The data that needs to be collected to save its
> > > current state is held in the DOM. For performance and consistency,
> > > it is not practical to keep the DOM synchronized at all times with a
> > > worker thread. Consequently, data needs to be collected on the main
> > > thread and then sent to a worker thread.
> >
> > I feel the data can be collected on the main thread in a Transferable
> > (probably awkward, yet doable). This way, when the data needs to be
> > transfered, the transfer is fast and heavy processing can happen in
> > the worker.
>
> Intuitively, this sounds like:
> 1. collect data to a JSON;

I don't understand this sentence. Do you mean "collect data in an
object"?

Just to be sure we use the same vocabulary: when I say "object", I mean
something described by ES5 - 8.6 [1], so basically a bag of properties
(usually data properties) with an internal [[Prototype]], etc. When I say
"JSON", it's a shortcut for "JSON string", following the grammar defined
at ES5 - 5.1.5 [2].

Given the vocabulary I use, one can collect data in an object (by adding
own properties, most likely), then serialize it as a JSON string with a
call to JSON.stringify, but one cannot collect data in/to a JSON.

> 2. serialize JSON (hopefully asynchronously) to a Transferable (or
>    several Transferables).

Why not collect the data in a Transferable like an ArrayBuffer directly?
It skips the additional serialization part. Writing a byte stream
directly is a bit hardcore I admit, but an object full of setters can
give the impression to create an object while actually filling an
ArrayBuffer as a backend. I feel that could work efficiently.

What are the data you want to collect? Is it all at once or are you
building the object little by little? For a backup, and for FirefoxOS
specifically, could a FileHandle [3] work? It's an async API to write in
a file.

David

[1] http://es5.github.com/#x8.6
[2] http://es5.github.com/#x5.1.5
[3] https://developer.mozilla.org/en-US/docs/WebAPI/FileHandle_API
Re: [whatwg] asynchronous JSON.parse
On 3/8/13 5:35 PM, David Bruant wrote:
> > Intuitively, this sounds like:
> > 1. collect data to a JSON;
>
> I don't understand this sentence. Do you mean "collect data in an
> object"?

My bad. I sometimes write "JSON" for "object that may be stringified to
JSON format and parsed back without loss", i.e. a bag of [bags of]
non-function properties. So let's just say "object".

> > 2. serialize JSON (hopefully asynchronously) to a Transferable (or
> >    several Transferables).
>
> Why not collect the data in a Transferable like an ArrayBuffer directly?
> It skips the additional serialization part. Writing a byte stream
> directly is a bit hardcore I admit, but an object full of setters can
> give the impression to create an object while actually filling an
> ArrayBuffer as a backend. I feel that could work efficiently.

I suspect that this will quickly grow to either:
- an API for serializing an object to a Transferable or a stream of
  Transferable; or
- a lower-level but equivalent API for doing the same, without having to
  actually build the object.

For instance, how would you serialize something as simple as the
following?

  { name: "The One",
    hp: 1000,
    achievements: ["achiever", "overachiever", "extreme overachiever"]
    // Length of the list is unpredictable
  }

> What are the data you want to collect? Is it all at once or are you
> building the object little by little? For a backup, and for FirefoxOS
> specifically, could a FileHandle [3] work? It's an async API to write in
> a file.

Thanks for the suggestion. I am effectively working on refactoring the
storage of browser session data. Not for FirefoxOS, but for Firefox
Desktop, which gives me more architectural constraints but frees my hands
to extend the platform with additional non-web libraries.

Best regards,
 David

-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla
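[Editor's note: Bruant's claim that this record is serializable to an ArrayBuffer can be sketched with one possible layout: length-prefixed UTF-8 strings plus a count field for the variable-length list. The layout, and the encodeRecord name, are an illustrative convention invented here, not any standard format:]

```javascript
// One possible binary encoding of the record from the thread:
// [u32 nameLen][name bytes][u32 hp][u32 count]([u32 len][bytes])*
function encodeRecord(rec) {
  const enc = new TextEncoder();
  const parts = [enc.encode(rec.name),
                 ...rec.achievements.map((s) => enc.encode(s))];
  // 4 bytes of length prefix per string, plus 4 for hp and 4 for count
  const size = parts.reduce((n, p) => n + 4 + p.byteLength, 0) + 8;
  const buf = new ArrayBuffer(size);
  const view = new DataView(buf);
  let off = 0;
  const putBytes = (p) => {
    view.setUint32(off, p.byteLength); off += 4;       // length prefix
    new Uint8Array(buf, off, p.byteLength).set(p); off += p.byteLength;
  };
  putBytes(parts[0]);                          // name
  view.setUint32(off, rec.hp); off += 4;       // hp
  view.setUint32(off, rec.achievements.length); off += 4; // list count
  parts.slice(1).forEach(putBytes);            // each achievement
  return buf;                                  // Transferable: can be moved
}

const buf = encodeRecord({
  name: "The One",
  hp: 1000,
  achievements: ["achiever", "overachiever", "extreme overachiever"],
});
console.log(buf.byteLength); // 71
```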
Re: [whatwg] asynchronous JSON.parse
On Thu, Mar 7, 2013 at 4:18 PM, David Rajchenbach-Teller dtel...@mozilla.com wrote: I have put together a small test here - warning, this may kill your browser: http://yoric.github.com/Bugzilla-832664/ By the way, I'd recommend keeping sample benchmarks as minimal and concise as possible. It's always tempting to make things configurable and dynamic and output lots of stats, but everyone interested in the results of your benchmark needs to read the code, to verify it's correct. On Fri, Mar 8, 2013 at 9:12 AM, David Rajchenbach-Teller dtel...@mozilla.com wrote: Ideally, yes. The question is whether this is actually feasible. Also, once we have a worker thread that needs to react fast enough to provide sufficient data to the ui thread for animating at 60fps, this worker thread ends up being nearly as critical as the ui thread, in terms of jank. I don't think making a call asynchronous is really going to help much, at least for serialization. You'd have to make a copy of the data synchronously, before returning to the caller, in order to guarantee that changes made after the call returns won't affect the result. This would probably be more expensive than the JSON serialization itself, since it means allocating lots of objects instead of just appending to a string. If it's possible to make that copy quickly, then that should be done for postMessage itself, to make postMessage return quickly, instead of doing it for a bunch of individual computationally-expensive APIs. (Also, remember that returns quickly and does work asynchronously doesn't mean the work goes away; the CPU time still has to be spent. Serializing the complete state of a large system while it's running and trying to maintain 60 FPS doesn't sound like a good approach in the first place.) Seriously? FirefoxOS [1, 2] is a mobile operating system in which all applications are written in JavaScript, HTML, CSS. This includes the browser itself. 
Given the number of companies involved in the venture, all over the world, I believe that this qualifies as a real use case. That doesn't sound like a good idea to me at all, but in any case that's a system platform, not the Web. APIs aren't typically added to the Web to support non-Web tasks. For example, if there's something people want to do in an iOS app using UIWebView, which doesn't come up on web pages, that doesn't typically drive web APIs. Platforms can add their own APIs for their platform-specific needs. -- Glenn Maynard
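Maynard's point above about the synchronous copy can be made concrete: structured cloning takes a snapshot at call time, so mutations made after the call do not leak into the clone. A small sketch using structuredClone (standardized well after this 2013 thread; available in modern browsers and Node 17+), which performs the same copy that postMessage performs:

```javascript
// The clone is a snapshot taken synchronously, exactly as postMessage's
// structured clone is. Any "async stringify" would need the same
// synchronous snapshot to guarantee a consistent result.
const state = { hp: 1000, achievements: ['achiever'] };

const snapshot = structuredClone(state); // synchronous copy, like postMessage

// Mutate the original after the "send"...
state.hp = 0;
state.achievements.push('overachiever');

// ...and the snapshot is unaffected.
```

This is why simply making the call return early doesn't remove the copying cost: the copy has to happen before control returns to the caller.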
[whatwg] asynchronous JSON.parse
right now JSON.parse blocks the mainloop, this gets more and more of an issue as JSON documents get bigger and are also used as a serialization format to communicate with web workers. To handle large JSON documents there is a need for an async JSON.parse, something like: JSON.parse(data, function(obj) { ... }); or more like FileReader: var json = new JSONReader(); json.addEventListener('load', function(event) { //parsed JSON document in: this.result }); json.parse(data); While my major need is asynchronous parsing of JSON data, the same is also true for serialization into JSON. var json = new JSONWriter(); json.addEventListener('load', function(event) { // serialized JSON string in: this.result }); json.serialize(obj);
Re: [whatwg] asynchronous JSON.parse
(It's hard to talk to somebody called j, by the way. :) On Thu, Mar 7, 2013 at 2:06 AM, j...@mailb.org wrote: right now JSON.parse blocks the mainloop, this gets more and more of an issue as JSON documents get bigger Just load the data you want to parse inside a worker, and perform the parsing there. Computationally-expensive work is exactly something Web Workers are meant for. and are also used as serialization format to communicate with web workers. There's no need to serialize to JSON before sending data to a worker; there's nothing that JSON can represent that postMessage can't post directly. Just postMessage the object itself. -- Glenn Maynard
Re: [whatwg] asynchronous JSON.parse
The JSON object and its API are part of the ECMAScript language specification which is standardized by Ecma/TC39, not whatwg. Rick On Thursday, March 7, 2013, wrote: right now JSON.parse blocks the mainloop, this gets more and more of an issue as JSON documents get bigger and are also used as serialization format to communicate with web workers. To handle large JSON Documents there is a need for an async JSON.parse, something like: JSON.parse(data, function(obj) { ... }); or more like FileReader: var json = new JSONReader(); json.addEventListener('load', function(event) { //parsed JSON document in: this.result }); json.parse(data); While my major need is asynchronous parsing of JSON data, the same is also true for serialization into JSON. var json = new JSONWriter(); json.addEventListener('load', function(event) { // serialized JSON string in: this.result }); json.serialize(obj);
Re: [whatwg] asynchronous JSON.parse
On Thu, Mar 7, 2013 at 9:29 AM, Rick Waldron waldron.r...@gmail.com wrote: The JSON object and its API are part of the ECMAScript language specification which is standardized by Ecma/TC39, not whatwg. He's talking about an async interface to it, not the core parser. It's a higher level of abstraction than the core language, which doesn't know anything about e.g. DOM Events and doesn't typically define asynchronous interfaces. If an API like this were to be exposed (which I believe is unnecessary), it would belong here or in webapps, not at the language level. -- Glenn Maynard
Re: [whatwg] asynchronous JSON.parse
On Thu, Mar 7, 2013 at 10:42 AM, Glenn Maynard gl...@zewt.org wrote: On Thu, Mar 7, 2013 at 9:29 AM, Rick Waldron waldron.r...@gmail.com wrote: The JSON object and its API are part of the ECMAScript language specification which is standardized by Ecma/TC39, not whatwg. He's talking about an async interface to it, not the core parser. It's a higher level of abstraction than the core language, which doesn't know anything about e.g. DOM Events and doesn't typically define asynchronous interfaces. If an API like this were to be exposed (which I believe is unnecessary), it would belong here or in webapps, not at the language level. Yes, and as a member of ECMA/TC39 I felt that it was my responsibility to clarify the specification ownership—but thanks for filling me in ;) Rick
Re: [whatwg] asynchronous JSON.parse
(Note: New on this list, please be gentle if I'm debating an inappropriate issue in an inappropriate place.) Actually, communicating large JSON objects between threads may cause performance issues. I do not have the means to measure reception speed simply (which would be used to implement asynchronous JSON.parse), but it is easy to measure main thread blocks caused by sending (which would be used to implement asynchronous JSON.stringify). I have put together a small test here - warning, this may kill your browser: http://yoric.github.com/Bugzilla-832664/ While there are considerable fluctuations, even inside one browser, on my system, I witness janks that last 300ms to 3s. Consequently, I am convinced that we need asynchronous variants of JSON.{parse, stringify}. Best regards, David Glenn Maynard wrote (It's hard to talk to somebody called j, by the way. :) On Thu, Mar 7, 2013 at 2:06 AM, j at mailb.org wrote: right now JSON.parse blocks the mainloop, this gets more and more of an issue as JSON documents get bigger Just load the data you want to parse inside a worker, and perform the parsing there. Computationally-expensive work is exactly something Web Workers are meant for. and are also used as serialization format to communicate with web workers. There's no need to serialize to JSON before sending data to a worker; there's nothing that JSON can represent that postMessage can't post directly. Just postMessage the object itself. -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
Re: [whatwg] asynchronous JSON.parse
I'd like to hear about the use cases a bit more. Generally, structured data gets bulky because it contains more items, not because items get bigger. In which case, isn't part of the solution to paginate your data, and parse those pages separately? Even if an async API for JSON existed, wouldn't the perf bottleneck then simply fall on whatever processing needs to be done afterwards? Wouldn't some form of event-based API be more appropriate? E.g.: var parser = JSON.parser(); parser.parse(src); parser.onparse = function(e) { doSomething(e.data); }; And wouldn't this be highly dependent on how the data is structured, and thus very much app-specific? --tobie On Thursday, March 7, 2013 at 11:18 PM, David Rajchenbach-Teller wrote: (Note: New on this list, please be gentle if I'm debating an inappropriate issue in an inappropriate place.) Actually, communicating large JSON objects between threads may cause performance issues. I do not have the means to measure reception speed simply (which would be used to implement asynchronous JSON.parse), but it is easy to measure main thread blocks caused by sending (which would be used to implement asynchronous JSON.stringify). I have put together a small test here - warning, this may kill your browser: http://yoric.github.com/Bugzilla-832664/ While there are considerable fluctuations, even inside one browser, on my system, I witness janks that last 300ms to 3s. Consequently, I am convinced that we need asynchronous variants of JSON.{parse, stringify}. Best regards, David Glenn Maynard wrote (It's hard to talk to somebody called j, by the way. :) On Thu, Mar 7, 2013 at 2:06 AM, j at mailb.org wrote: right now JSON.parse blocks the mainloop, this gets more and more of an issue as JSON documents get bigger Just load the data you want to parse inside a worker, and perform the parsing there. Computationally-expensive work is exactly something Web Workers are meant for.
and are also used as serialization format to communicate with web workers. There's no need to serialize to JSON before sending data to a worker; there's nothing that JSON can represent that postMessage can't post directly. Just postMessage the object itself. -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
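Tobie's pagination suggestion can be approximated today without any new platform API, by serializing an array page by page so the main thread can yield between pages. A minimal sketch; stringifyInPages and the page size are hypothetical names chosen for illustration, not a proposed API:

```javascript
// Cooperative chunking: rather than one blocking JSON.stringify call,
// emit the JSON text in pages. A real caller would interleave the steps
// with setTimeout / requestIdleCallback to avoid jank.
function* stringifyInPages(items, pageSize = 2) {
  yield '[';
  for (let i = 0; i < items.length; i += pageSize) {
    const page = items.slice(i, i + pageSize)
      .map(item => JSON.stringify(item))
      .join(',');
    yield (i === 0 ? '' : ',') + page;
  }
  yield ']';
}

// Driving the generator to completion reproduces JSON.stringify's output.
const items = [{ hp: 1 }, { hp: 2 }, { hp: 3 }];
const chunks = [...stringifyInPages(items)];
const result = chunks.join('');
```

This only helps when the data has an obvious top-level partition (an array of records), which is Tobie's point: the right chunking is app-specific.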
Re: [whatwg] asynchronous JSON.parse
On Thu, Mar 7, 2013 at 2:18 PM, David Rajchenbach-Teller dtel...@mozilla.com wrote: (Note: New on this list, please be gentle if I'm debating an inappropriate issue in an inappropriate place.) Actually, communicating large JSON objects between threads may cause performance issues. I do not have the means to measure reception speed simply (which would be used to implement asynchronous JSON.parse), but it is easy to measure main thread blocks caused by sending (which would be used to implement asynchronous JSON.stringify). Isn't this precisely what Transferable objects are for? http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#transferable-objects -- Dan Beam db...@chromium.org I have put together a small test here - warning, this may kill your browser: http://yoric.github.com/Bugzilla-832664/ While there are considerable fluctuations, even inside one browser, on my system, I witness janks that last 300ms to 3s. Consequently, I am convinced that we need asynchronous variants of JSON.{parse, stringify}. Best regards, David Glenn Maynard wrote (It's hard to talk to somebody called j, by the way. :) On Thu, Mar 7, 2013 at 2:06 AM, j at mailb.org wrote: right now JSON.parse blocks the mainloop, this gets more and more of an issue as JSON documents get bigger Just load the data you want to parse inside a worker, and perform the parsing there. Computationally-expensive work is exactly something Web Workers are meant for. and are also used as serialization format to communicate with web workers. There's no need to serialize to JSON before sending data to a worker; there's nothing that JSON can represent that postMessage can't post directly. Just postMessage the object itself. -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
Re: [whatwg] asynchronous JSON.parse
It is. However, to use Transferable objects for the purpose of implementing asynchronous parse/stringify, one needs conversions of JSON objects from/to Transferable objects. As it turns out, these conversions are just variants on JSON parse/stringify, so we have not simplified the issue. Note that I would be quite satisfied with an efficient, asynchronous implementation of these [de]serializations of JSON from/to Transferable objects. Best regards, David On Thu Mar 7 23:37:43 2013, Dan Beam wrote: Isn't this precisely what Transferable objects are for? http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#transferable-objects -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
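The conversions David describes can be sketched with TextEncoder/TextDecoder (from the Encoding Standard; also available as Node globals). The helper names toTransferable/fromTransferable are hypothetical. The sketch also makes his point visible: both directions still contain a synchronous JSON.stringify or JSON.parse, which is exactly the part that hasn't been made asynchronous.

```javascript
// Object -> JSON string -> UTF-8 bytes; the underlying ArrayBuffer is
// Transferable and can go in postMessage's transfer list.
function toTransferable(obj) {
  const bytes = new TextEncoder().encode(JSON.stringify(obj));
  return bytes.buffer;
}

// The receiving side reverses the steps.
function fromTransferable(buffer) {
  return JSON.parse(new TextDecoder().decode(new Uint8Array(buffer)));
}

const original = { name: 'The One', hp: 1000 };
const buffer = toTransferable(original);
// e.g. port.postMessage(buffer, [buffer]) -- the transfer itself is cheap
const roundTripped = fromTransferable(buffer);
```

Transferring the buffer is cheap; producing and consuming it is not, which is why this doesn't simplify the original problem.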
Re: [whatwg] asynchronous JSON.parse
On Thu, Mar 7, 2013 at 4:18 PM, David Rajchenbach-Teller dtel...@mozilla.com wrote: (Note: New on this list, please be gentle if I'm debating an inappropriate issue in an inappropriate place.) (To my understanding of this list, it's completely acceptable to discuss this here.) Actually, communicating large JSON objects between threads may cause performance issues. I do not have the means to measure reception speed simply (which would be used to implement asynchronous JSON.parse), but it is easy to measure main thread blocks caused by sending (which would be used to implement asynchronous JSON.stringify). If you're dealing with lots of data, you should be loading or creating the data in the worker in the first place, not creating it in the UI thread and then shuffling it off to a worker. For example, if you're reading a large file provided by the user, post the File object (received, e.g., from an input element) to the worker, then do the heavy lifting there in the first place. Benchmarks are always good, but it'd be better to talk about a real-world use case, since it gives us something concrete to talk about. What's a practical case where you would actually have to create the big object in the UI thread? On Thu, Mar 7, 2013 at 5:25 PM, David Rajchenbach-Teller dtel...@mozilla.com wrote: However, to use Transferable objects for the purpose of implementing asynchronous parse/stringify, one needs conversions of JSON objects from/to Transferable objects. As it turns out, these conversions are just variants on JSON parse/stringify, so we have not simplified the issue. (Not nitpicking, since I really wasn't sure what you meant at first, but I think you mean a JavaScript object. There's no such thing as a JSON object.) -- Glenn Maynard