Re: The Structured Clone Wars
2011/7/15 Jason Orendorff jason.orendo...@gmail.com

Back to Mark S. Miller: And finally there's the issue raised by David on the es-discuss thread: What should the structured clone algorithm do when encountering a proxy? The algorithm as coded below will successfully clone proxies, for some meaning of clone. Is that the clone behavior we wish for proxies?

The structured cloning algorithm should be redefined in terms of the ES object protocol. This seems necessary anyway, for precision. The appropriate behavior regarding proxies would fall out of that; proxies would not have to be specifically mentioned in the algorithm's spec.

+1. This also matches the behavior of JSON.stringify(aProxy): serializing a proxy as data should simply query the object's own properties by calling the appropriate traps (in the case of JSON, this includes intercepting the call to 'toJSON'). (Every algorithm that mentions proxies, or really any other object type, by name is one broken piece of a Proxy.isProxy implementation.) Cheers, -j

___ es-discuss mailing list es-discuss@mozilla.org https://mail.mozilla.org/listinfo/es-discuss
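The trap-driven serialization described above can be observed directly. The sketch below uses the final ES2015 Proxy API rather than the 2011 Harmony draft API (Proxy.create) under discussion in this thread; the point it illustrates is that JSON.stringify discovers a proxy's own properties purely through the object protocol, never by special-casing proxies:

```javascript
// Sketch using the ES2015 Proxy API (not the 2011 Harmony draft API
// discussed in this thread). JSON.stringify never special-cases the
// proxy: it discovers the "own properties" through the ordinary object
// protocol, i.e. the ownKeys / getOwnPropertyDescriptor / get traps
// (including the lookup of 'toJSON' mentioned above).
const log = [];
const target = { a: 1 };
const proxy = new Proxy(target, {
  get(t, key, receiver) {
    log.push('get:' + String(key));           // also fired for 'toJSON'
    return Reflect.get(t, key, receiver);
  },
  ownKeys(t) {
    log.push('ownKeys');
    return Reflect.ownKeys(t);
  },
  getOwnPropertyDescriptor(t, key) {
    log.push('getOwnPropertyDescriptor:' + String(key));
    return Reflect.getOwnPropertyDescriptor(t, key);
  }
});
const json = JSON.stringify(proxy);           // serialized purely via traps
```

A serializer written this way needs no Proxy.isProxy test, which is exactly the design point Jason makes.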
Re: [Harmony Proxies] LazyReadCopy experiment and invariant checking for [[Extensible]]=false
2011/7/14 David Bruant david.bru...@labri.fr On 14/07/2011 16:23, Tom Van Cutsem wrote: Hi David, 2011/7/13 David Bruant david.bru...@labri.fr Hi, Recently, I've been thinking about the structured clone algorithm used in postMessage. The browser has to make a copy of an object and pass it to the message receiving context. In some cases, if the object has a lot of properties, it could be costly to make this copy. So I thought: hey, what about wrapping it with a proxy and just handing over the proxy?

This is a good use-case, and definitely one that Mark and I would like to explore with Proxies, as indicated by the Eventual reference proxy example on the harmony:proxies wiki page. Also, it's not just that copying objects is sometimes too costly, sometimes it's just not what the programmer wants. If the passed object is mutable, you don't want to create copies that can then diverge when e.g. distributed to separate web workers.

You need to create copies, don't you? Otherwise I think you introduce non-determinism since web workers can have parallel execution. That's the point of having a structured clone algorithm, I thought. Having mutable shared data across different contexts (potentially running in parallel) is prevented, isn't it?

Well, in the particular case of the eventual reference abstraction, any operations with side-effects that the RC performs on the original object are queued as separate events in the SC's event queue. Only the SC has direct (synchronous) access to the object. The RC doesn't have a local copy of the object, just a unique reference that it can use to send messages (in this particular case, asynchronous messages) to the target object. At no point can the RC gain synchronous access to the target. That avoids the shared-state concurrency quagmire. Also, if the object represents some local resource that is unavailable to the receiver endpoint, you want to hand it a reference to the service, not a copy of some data.
For the rest of this use case, I will consider myself in the role of the browser (implementing postMessage and trying to do so in JavaScript) and I may use the become primitive if needed since I'm in privileged code. Basically, the proxy has to forward reads to the target object, keep the writes internally and make sure future reads are consistent if some writes occurred. You can see my implementation of LazyReadCopy at: https://github.com/DavidBruant/HarmonyProxyLab/tree/master/LazyReadCopy (I discuss limitations below)

I'm having trouble relating the LazyReadCopy proxy to your above use case. If I were to create a proxy that represents a reference to an object in another frame/web worker, I wouldn't cache writes in the proxy. I'm not saying LazyReadCopy never makes sense, it just seems to make less sense in the context you describe.

My intention is to give the two contexts the impression they're manipulating two different objects (because with postMessage, the message in the sending context and the message in the receiving context are two different objects). So now we can send an object in the message Receiving Context (RC). This object is a copy of the one in the message Sending Context (SC). The RC can manipulate the object and this will have no noticeable effect in the SC. However, the opposite is not true. If a property is added in the SC, the RC can read this new property. So, for my implementation to work, I would need 2 lazy read copies. One for the SC, one for the RC. The one in the SC can replace the object thanks to the become privileged primitive.

Why the need for become? Why does the SC need to refer to a proxy, rather than to the real target object directly?

Consider:

// In sending context
var o = {};
receivingContext.postMessage(o);
o.a = 1; // after having sent the message

At no time should the RC be able to observe an 'a' property. The message was an empty object, so that's what the RC should observe (at least in the postMessage use case).
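For reference, the LazyReadCopy idea described above (forward reads to the target, buffer writes locally) can be sketched as follows. This is a hypothetical transliteration to the later ES2015 Proxy API, not David's original Harmony-handler code:

```javascript
// Hypothetical LazyReadCopy sketch in the ES2015 Proxy API: reads fall
// through to the target unless a property was locally written or
// deleted; writes and deletes never touch the target.
function lazyReadCopy(target) {
  const overrides = new Map();   // locally written values
  const deleted = new Set();     // locally deleted keys
  return new Proxy(target, {
    get(t, key, receiver) {
      if (deleted.has(key)) return undefined;
      if (overrides.has(key)) return overrides.get(key);
      return Reflect.get(t, key, receiver);
    },
    has(t, key) {
      if (deleted.has(key)) return false;
      return overrides.has(key) || Reflect.has(t, key);
    },
    set(t, key, value) {
      deleted.delete(key);
      overrides.set(key, value); // buffer the write; target is untouched
      return true;
    },
    deleteProperty(t, key) {
      overrides.delete(key);
      deleted.add(key);
      return true;
    }
  });
}

const original = { a: 1 };
const copy = lazyReadCopy(original);
copy.a = 2;   // visible on the copy only
copy.b = 3;   // never reaches the original
```

Note that this covers only one direction: properties added to the target afterwards still leak through reads, which is exactly why the design above needs a second lazy read copy on the sending side.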
Yet, I have no reason to prevent the SC from changing o after a postMessage, so in order to keep the reads consistent and still allow writes, I can turn my object into a LazyReadCopy object and do the become.

Ok, thanks for clarifying. I now understand your case better. When I think about passing o to RC, if o is mutable, what I really want is for RC to receive a remote reference to o, not a copy of o. But let's not digress too far into the details of parameter-passing semantics.

So far, so good, we have an object, two lazy read copies of it (one in each context), basic expectations are respected. Now, consider the following snippet:

var o = {};
Object.preventExtensions(o);
Object.isExtensible(o); // false
receivingContext.postMessage(o);
// The browser replaces o with a lazy read copy of o. That's now a proxy.
Object.isExtensible(o); // true, because it's a proxy :-s

Of course, that's my fault, I shouldn't play too much with
Re: [Harmony Proxies] LazyReadCopy experiment and invariant checking for [[Extensible]]=false
2011/7/14 David Bruant david.bru...@labri.fr + Mark to discuss ES5.1 invariants. I'm working on an implementation. It will be independent from the FixedHandler one. It should be possible to combine both easily though. I'll give a follow-up here when I'm done. So, here it is: https://github.com/DavidBruant/HarmonyProxyLab/tree/master/NonExtensibleProxies I have just committed but not tested. The reason is that I wondered if all this effort was worth it. I've re-read the invariants in ES5 (last one, on extensible not being allowed to go from false to true, omitted):

* If the value of the host object’s [[Extensible]] internal property has been observed by ECMAScript code to be false, then if a call to [[GetOwnProperty]] describes a property as non-existent all subsequent calls must also describe that property as non-existent. => To implement this, the handler just has to keep a record of properties already observed as non-existent and make sure [[GetOwnProperty]] respects the invariant by a lookup in this record.

Which is bad, since the set of properties already observed as non-existent is a potentially unbounded set determined by clients. We wouldn't want a proxy to have to remember all property names for which it ever replied undefined.

* The [[DefineOwnProperty]] internal method of a host object must not permit the addition of a new property to a host object if the [[Extensible]] internal property of that host object has been observed by ECMAScript code to be false. => This one is a bit more tricky. must not permit the addition of a new property seems to assume that an object has a set of properties. But that's not really true for proxies. The engine has no way to say whether a property is defined in a proxy or not. Consequently, it cannot tell if [[DefineOwnProperty]] is called on a new or old property. This could be compensated by explicitly returning a list of property names in the fix/preventExtensions trap.

Indeed.
I think it's still necessary that upon fixing a proxy, the proxy asks the handler for all of its own properties (either via the return value of fix() or by calling getOwnPropertyNames + getOwnPropertyDescriptor). That gives it a fixed set of properties, making it at least easy to check the previous invariant without requiring unbounded storage (which also seems to be the way you have implemented it in your prototype, I noticed). Not sure how to deal with inherited properties, though. A fixed proxy may still have a non-fixed [[Prototype]], so its [[GetProperty]] and [[Get]] traps may still report new properties. OTOH, if the proxy's entire [[Prototype]] chain is also fixed, the constraints become tighter. A compromise could be that a non-extensible proxy can no longer virtualize inherited properties, only own properties.

Also, none of the 3 invariants regarding [[Extensible]]=false are about property name enumeration. Is there just no invariant to hold on enumeration? If there is none, well, no need to do O(n)-ish enforcements! I'll update and test my code based on the conclusions of this thread. David
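For what it's worth, the design that eventually shipped in ES2015 resolved exactly this bookkeeping problem by making the proxy's target object play the role of the fixed property record: the engine checks every trap result against the target, so no unbounded handler-side storage is needed. A sketch of the observable behavior:

```javascript
// ES2015 behavior sketch: the engine validates every trap result
// against the proxy's target, so a handler cannot conjure a new
// property on a non-extensible target.
const targetObj = Object.preventExtensions({ x: 1 });
const lying = new Proxy(targetObj, {
  getOwnPropertyDescriptor(t, key) {
    // Claim that every property exists, even ones the target lacks.
    return { value: 42, writable: true, enumerable: true, configurable: true };
  }
});

let violationCaught = false;
try {
  // 'y' does not exist on the non-extensible target, so the engine
  // rejects the handler's fabricated descriptor.
  Object.getOwnPropertyDescriptor(lying, 'y');
} catch (e) {
  violationCaught = e instanceof TypeError;
}
```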
Re: [[Extensible]] and Proxies (Was: Proxy.isProxy)
Hi Allen, While proxies are definitely designed to emulate some of the peculiar behavior of host objects, that does not necessarily imply that proxies should be as powerful and free of restrictions as host objects. If any JS script can create a proxy, and a proxy can violate invariants that do hold for built-in Object or Function values, the logical implication is that Javascript code, in general, can no longer rely on those invariants. Code that does so can then more easily be exploited. Consider a non-configurable, non-writable data property. The binding of such a property for regular objects is guaranteed to be immutable. Immutability can be used for caching purposes, but is equally useful to not have to adopt a defensive coding style (as a case in point, take Mark's implementation of the structured clone algorithm in JS. Such defensive coding would be unnecessary when dealing with frozen objects.) In a security context, immutability may be used to determine whether or not to allow some third-party code to access an object. In short, if proxies could violate the semantics of e.g. frozen objects, then it seems to me that the usefulness of that concept is severely crippled. Where Object.isFrozen would have been a test for guaranteed immutability (of an object's structure), it merely becomes a test that hints at it, but cannot be relied upon. Cheers, Tom

2011/7/15 Allen Wirfs-Brock al...@wirfs-brock.com On Jul 14, 2011, at 4:59 PM, Mark S. Miller wrote: On Thu, Jul 14, 2011 at 3:00 PM, Allen Wirfs-Brock al...@wirfs-brock.com wrote: On Jul 14, 2011, at 1:19 PM, Mark S. Miller wrote: Do you really think it makes sense to allow new properties to appear on non-extensible objects? Really? Perhaps you do. Again, unless and until we get agreement to clean up these (irregularities, inconsistencies, features, whatever you want to call them) in 8.6.2, ES5.1 is, as you say, what it is. I hope we can do better. Prior to ES5, [[Extensible]] did not exist.
We added it in a manner that was consistent for native objects. We may have over-stepped in requiring that host objects have such an internal property when in the past they didn't. I don't think there is any fundamental necessity that such an internal property exists for host objects or that it is possible to be able to place a host object into a state where properties cannot be added. I'm not sure that DOMs violate this, but I'm also not sure that they don't. If an object (whether native or host) has such a state, then it clearly is a bug if new properties can be added when the object is in such a state. But is it a critical safety bug that must be proactively detected and prevented? I'm not yet convinced that it is, or that the benefit of proactively guaranteeing that such an invariant cannot be violated is worth the cost.

and the host object invariants are incomplete and hard to make impossible to violate. Host objects are part of the platform. A platform provider is free to violate any part of the spec they like, and there's nothing we can do about it other than to add tests to std test suites to make the violations obvious to the community.

We could provide a defined interface mechanism that validates constraints or limits behavior in a way that guarantees the desired invariants. That's what Proxies appear to be trying to do, so why not do it for host objects? If you can depend upon host objects actually supporting your invariants, why does it matter whether or not Proxy objects also do so?

I think this is key to the whole discussion, as you and I have discussed verbally. The main use case you seem to be thinking in terms of is Mozilla's own use, as a platform provider, of the proxy mechanism as a way to replace your current mechanisms for providing host objects. This is a laudable goal which I fully support.
However, for your use as a platform provider, it is legitimate for you to provide yourself dangerous shortcuts that we cannot expose to untrusted code, given that you take responsibility for not abusing these dangerous shortcuts. No, that is just one use case. However, in general I think the ability to use intercession to produce interesting object semantics is a valuable feature that has various uses, including hosting other languages. However, I am content to leave the maintenance of complex invariants up to the implementors of such object semantics and to tolerate any failure to do so as user bugs, as long as they can't violate memory safety. For clarity, let's take an extreme example: peek and poke. For those who missed the history, peek and poke http://en.wikipedia.org/wiki/PEEK_and_POKE were primitives in some BASICs for providing direct unchecked access to raw memory. If you, as platform providers, wanted to provide yourself with peek and poke operations in order to gain some efficiency or whatever, there's nothing fundamentally wrong with that. But clearly, no matter
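The immutability guarantee Tom defends earlier in this thread is, in the final ES2015 design, enforced by the engine itself: a proxy's get trap cannot report a different value for a non-configurable, non-writable data property of its target. A sketch of that eventual behavior:

```javascript
// ES2015 sketch of the frozen-object guarantee: once a data property
// is non-configurable and non-writable, even a proxy's get trap cannot
// report a different value for it -- the engine throws a TypeError
// instead of letting the forged value escape.
const frozen = Object.freeze({ secret: 'original' });
const tamper = new Proxy(frozen, {
  get() { return 'forged'; }   // try to lie about every property
});

let tamperBlocked = false;
try {
  tamper.secret;               // trap result is checked against the target
} catch (e) {
  tamperBlocked = e instanceof TypeError;
}
```

So Object.isFrozen remained a test for guaranteed structural immutability, proxies included.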
Re: The Structured Clone Wars
On Jul 14, 2011, at 9:30 PM, Jason Orendorff wrote: On Thu, Jul 14, 2011 at 2:46 PM, Mark S. Miller erig...@google.com wrote: Allen Wirfs-Brock wrote: Something that isn't clear to me is which primordials are used to set the [[Prototype]] of the generated objects. It isn't covered in the internal structured cloning algorithm. Perhaps it is, where structured clone is invoked. Consider that in IndexedDB, the copy is made at the time the object is put into the database, and then the copy is used in perhaps a completely different browser instance. And in the case of postMessage, the copy is in a totally separate heap that lives on a separate thread. Perhaps this makes it clearer what structured cloning really is. It's serialization. Or, it's a spec fiction to explain and codify the Web-visible effects of serialization and deserialization without specifying a serialization format.

As such, it seems like this may be a poor specification approach. Translation to/from a static serialization format would make clear that there is no sharing of any active object mechanisms such as prototype objects. This is not clear in the current specification. If the specification did use an explicit serialization format in this manner then certainly a valid optimization in appropriate situations would be for an implementation to eliminate the actual encoding/decoding of the serialized representation and to directly generate the target objects. However, by specifying it in terms of such a format you would precisely define the required transformation. If you didn't want to be bothered with inventing a serialization format solely for specification purposes you could accomplish the same thing by specifying structured clone as if it were a transformation to/from JSON format.

We implement a pair of functions, JS_WriteStructuredClone and JS_ReadStructuredClone. The latter requires the caller to specify the global object in which the clone is to be made. That global's primordial prototypes are used.
The actual serialization format is not exposed to the web. What happens when cloning an object that is an Object object whose [[Prototype]] is not Object.prototype? Allen
Re: The Structured Clone Wars
It's serialization. Or, it's a spec fiction to explain and codify the Web-visible effects of serialization and deserialization without specifying a serialization format. As such, it seems like this may be a poor specification approach. Translation to/from a static serialization format would make clear that there is no sharing of any active object mechanisms such as prototype objects. This is not clear in the current specification. If the specification did use an explicit serialization format in this manner then certainly a valid optimization in appropriate situations would be for an implementation to eliminate the actual encoding/decoding of the serialized representation and to directly generate the target objects. However, by specifying it in terms of such a format you would precisely define the required transformation. If you didn't want to be bothered with inventing a serialization format solely for specification purposes you could accomplish the same thing by specifying structured clone as if it were a transformation to/from JSON format.

JSON alone may not be enough, but it shouldn't be too troublesome to specify a slightly enhanced ES-specific JSON extension that includes serializations for undefined, NaN, Infinity, -Infinity, etc. And naturally, support for Date, RegExp and Function would be a huge boon. If a referencing technique were addressed this could even include a <| equivalent to address the [[Prototype]] issue Allen mentioned. In this context some of the limitations intentionally imposed in JSON are unnecessary, so why saddle the web platform with them? A more expressive standardized serialization would be useful across the board.
Re: Should Decode accept U+FFFE or U+FFFF (and other Unicode non-characters)?
On Jul 14, 2011, at 10:38 PM, Jeff Walden wrote: Reraising this issue... To briefly repeat: Decode, called by decodeURI{,Component}, says to reject %ab%cd%ef sequences whose octets [do] not contain a valid UTF-8 encoding of a Unicode code point. It appears browsers interpret this requirement as: reject overlong UTF-8 sequences, and otherwise reject only unpaired or mispaired surrogate code points. Is this exactly what ES5 requires? And if it is, should it be? Firefox has also treated otherwise-valid-looking encodings of U+FFFE and U+FFFF as specifying that the replacement character U+FFFD be used. And the rationale for rejecting U+FFF{E,F} also seems to apply to the non-character range [U+FDD0, U+FDEF] and U+xyFF{E,F}. Table 21 seems to say only malformed encodings and bad surrogates should be rejected, but valid encoding of a code point is arguably unclear.

I haven't swapped my technical understanding of the subtleties of UTF-8 encodings back in yet today, so I'm not yet prepared to try to provide a technical response. But I think I can speak to the intent of the spec (or at least the ES5 version): 1) these are legacy functions that have been in browser JS implementations at least since ES3 days. We didn't want to change them in any incompatible way. 2) Like with RegExp and other similar issues, browser reality (well, legacy browser reality, maybe not newbies) is more important than what the spec. actually says. If browsers all do something different from the spec. then the spec. should be updated accordingly. However, for ES5 we didn't do any deep analysis of this browser reality so we might have missed something. 3) The intent is pretty clearly stated in the last paragraph of the note that includes table 21 (BTW, since the table is in a note it isn't normative). It essentially says: throw an exception when decoding anything that RFC 3629 says is not a valid UTF-8 encoding. I would prioritize #3 after #1 and #2.
If there is consistent behavior in all major browsers that date prior to ES5 then that is the behavior that should be followed (and the spec. updated if necessary). If there is disagreement among those legacy browsers then I would simply follow the ES5 spec. unless it does something that is contrary to RFC 3629. If it does, then we need to think about whether we have a spec. bug.

At least one person interested in Firefox's decoding implementation argues that not rejecting or replacing U+FFF{E,F} is a potential security vulnerability because those code points (particularly U+FFFE) might confuse code into interpreting a sequence of code points with the wrong endianness. I find the argument unpersuasive and the potential harm too speculative (particularly as no other browser replaces or rejects U+FFF{E,F}). But the point's been raised, and it's at least somewhat plausible, so I'd like to see it conclusively addressed.

It's just a transformation from one JS string to another. It can't do anything that hand-written JS code couldn't do. How would this be any more of a problem than simply providing the code points that the bogus sequence would be incorrectly interpreted as? That said, #3 above does say that the intent is to reject anything that is not valid UTF-8. Allen
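The behavior under discussion can be checked empirically. Modern engines ended up following the browser consensus Jeff describes: overlong sequences and encoded surrogates are rejected with a URIError, while the noncharacter U+FFFE decodes without complaint:

```javascript
// Empirical check of the decoding rules under discussion
// (run in any modern engine).
function rejected(s) {
  try { decodeURIComponent(s); return false; }
  catch (e) { return e instanceof URIError; }
}

const rejectsOverlong  = rejected('%C0%AF');     // overlong encoding of '/'
const rejectsSurrogate = rejected('%ED%A0%80');  // encoding of lone surrogate U+D800
const fffe = decodeURIComponent('%EF%BF%BE');    // U+FFFE: decoded, not rejected
```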
Re: The Structured Clone Wars
On Jul 15, 2011, at 8:51 AM, Dean Landolt wrote: It's serialization. Or, it's a spec fiction to explain and codify the Web-visible effects of serialization and deserialization without specifying a serialization format. As such, it seems like this may be a poor specification approach. Translation to/from a static serialization format would make clear that there is no sharing of any active object mechanisms such as prototype objects. This is not clear in the current specification. If the specification did use an explicit serialization format in this manner then certainly a valid optimization in appropriate situations would be for an implementation to eliminate the actual encoding/decoding of the serialized representation and to directly generate the target objects. However, by specifying it in terms of such a format you would precisely define the required transformation. If you didn't want to be bothered with inventing a serialization format solely for specification purposes you could accomplish the same thing by specifying structured clone as if it were a transformation to/from JSON format.

JSON alone may not be enough, but it shouldn't be too troublesome to specify a slightly enhanced ES-specific JSON extension that includes serializations for undefined, NaN, Infinity, -Infinity, etc. And naturally, support for Date, RegExp and Function would be a huge boon. If a referencing technique were addressed this could even include a <| equivalent to address the [[Prototype]] issue Allen mentioned. In this context some of the limitations intentionally imposed in JSON are unnecessary, so why saddle the web platform with them? A more expressive standardized serialization would be useful across the board.

JSON + an appropriate schema is enough. You can define a JSON encoded schema that deals with undefined, NaN, etc. as well as circular object references, property attributes, and other issues.
For example see https://github.com/allenwb/jsmirrors/blob/master/jsonObjSample.js for a sketch of such a schema. For structured clone usage cases that is all you need. I'm less convinced that one standardized universal JS object serialization format is such a good idea. There are lots of application specific issues involved in object serialization and to create a universal format/serializer/deserializer you have to make the policies that are applied highly parameterized. I think it might be better to leave that problem to library writers. Allen
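One minimal, hypothetical way to realize the "JSON + schema" idea is a replacer/reviver pair that encodes the values JSON cannot express as tagged objects. The '$nonJSON' tag name is invented for this sketch; a real schema would cover more cases (Date, references, attributes):

```javascript
// Hypothetical "JSON + schema" sketch: non-finite numbers are encoded
// as tagged objects so they survive a JSON round trip.
const TAG = '$nonJSON';

function encodeSpecial(key, value) {
  if (typeof value === 'number' && !isFinite(value)) {
    return { [TAG]: String(value) };   // 'NaN', 'Infinity', '-Infinity'
  }
  return value;
}

function decodeSpecial(key, value) {
  if (value && typeof value === 'object' && TAG in value) {
    return Number(value[TAG]);         // Number('NaN') is NaN, etc.
  }
  return value;
}

const wire = JSON.stringify({ a: NaN, b: -Infinity, c: 1 }, encodeSpecial);
const back = JSON.parse(wire, decodeSpecial);
```

Note that undefined needs extra care in such a scheme: a JSON.parse reviver that returns undefined deletes the property, so a real encoding would keep a tagged sentinel object rather than unwrapping it.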
Re: The Structured Clone Wars
On Fri, Jul 15, 2011 at 10:22 AM, Allen Wirfs-Brock al...@wirfs-brock.com wrote: What happens when cloning an object that is an Object object whose [[Prototype]] is not Object.prototype? The original object's [[Prototype]] is entirely ignored. During deserialization, the target global's initial Object.prototype object is used as the new object's [[Prototype]]. -j
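Jason's answer is observable today through the global structuredClone() function that the HTML spec eventually standardized (available in modern browsers and Node.js 17+):

```javascript
// The custom [[Prototype]] is dropped by structured cloning: the copy
// is a plain Object whose prototype is the target realm's initial
// Object.prototype.
const proto = { greet() { return 'hi'; } };
const original = Object.create(proto);
original.a = 1;

const copy = structuredClone(original);
```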
Re: The Structured Clone Wars
On Fri, Jul 15, 2011 at 1:26 AM, Tom Van Cutsem tomvc...@gmail.com wrote: 2011/7/15 Jason Orendorff jason.orendo...@gmail.com Back to Mark S. Miller: And finally there's the issue raised by David on the es-discuss thread: What should the structured clone algorithm do when encountering a proxy? The algorithm as coded below will successfully clone proxies, for some meaning of clone. Is that the clone behavior we wish for proxies? The structured cloning algorithm should be redefined in terms of the ES object protocol. This seems necessary anyway, for precision. The appropriate behavior regarding proxies would fall out of that; proxies would not have to be specifically mentioned in the algorithm's spec. +1. This also matches the behavior of JSON.stringify(aProxy): serializing a proxy as data should simply query the object's own properties by calling the appropriate traps (in the case of JSON, this includes intercepting the call to 'toJSON').

Except that you don't want to do that for host objects. Trying to clone a File object by cloning its properties is going to give you an object which is a whole lot less useful, as it wouldn't contain any of the file data. Once we define support for cloning ArrayBuffers the same thing will apply to it. This might in fact be a big hurdle to implementing structured cloning in javascript. How would a JS implementation of structured clone determine if an object is a host object which would lose all its useful semantics if cloned, vs. a plain JS object which can usefully be cloned? / Jonas
Re: The Structured Clone Wars
On Jul 15, 2011, at 10:00 AM, Jonas Sicking wrote: Except that you don't want to do that for host objects. Trying to clone a File object by cloning its properties is going to give you an object which is a whole lot less useful as it wouldn't contain any of the file data. Once we define support for cloning ArrayBuffers the same thing will apply to it. This might in fact be a big hurdle to implementing structured cloning in javascript. How would a JS implementation of structured clone determine if an object is a host object which would lose all its useful semantics if cloned, vs. a plain JS object which can usefully be cloned? / Jonas

And a cloned JS object is a lot less useful if it has lost its original [[Prototype]]. Generalizations about host objects are no more or less valid than generalizations about pure JS objects. This issue applies to pure JS object graphs or any serialization scheme. Sometimes language specific physical clones won't capture the desired semantics. (Consider for example, an object that references a resource by using a symbolic token to access a local resource registry). That is why the ES5 JSON encoder/decoder includes extension points such as the toJSON method: to enable semantic encodings that are different from the physical object structure. The structured clone algorithm, as currently written, allows the passing of strings, so it is possible to use it to transmit anything that can be encoded within a string. All it needs is an application specific encoder/decoder.

It seems to me the real complication is a desire for some structured clone use cases to avoid serialization and permit sharing via a copy-on-write of a real JS object graph. If you define this sharing in terms of serialization then you probably eliminate some of the language-specific low level sharing semantic issues. But you are still going to have higher level semantic issues such as what does it mean to serialize a File.
It isn't clear to me that there is a general solution to the latter. Allen
Re: The Structured Clone Wars
On Fri, Jul 15, 2011 at 1:00 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 15, 2011 at 1:26 AM, Tom Van Cutsem tomvc...@gmail.com wrote: 2011/7/15 Jason Orendorff jason.orendo...@gmail.com Back to Mark S. Miller: And finally there's the issue raised by David on the es-discuss thread: What should the structured clone algorithm do when encountering a proxy? The algorithm as coded below will successfully clone proxies, for some meaning of clone. Is that the clone behavior we wish for proxies? The structured cloning algorithm should be redefined in terms of the ES object protocol. This seems necessary anyway, for precision. The appropriate behavior regarding proxies would fall out of that; proxies would not have to be specifically mentioned in the algorithm's spec. +1. This also matches the behavior of JSON.stringify(aProxy): serializing a proxy as data should simply query the object's own properties by calling the appropriate traps (in the case of JSON, this includes intercepting the call to 'toJSON'). Except that you don't want to do that for host objects. Trying to clone a File object by cloning its properties is going to give you an object which is a whole lot less useful as it wouldn't contain any of the file data. Once we define support for cloning ArrayBuffers the same thing will apply to it. This might in fact be a big hurdle to implementing structured cloning in javascript. How would a JS implementation of structured clone determine if an object is a host object which would lose all its useful semantics if cloned, vs. a plain JS object which can usefully be cloned?

Through the use of a *serializable* predicate -- perhaps toJSON, as you recognized in your response to Dave's referenced post. Is it really a problem if host objects don't survive in full across serialization boundaries? As you say, All APIs that use structured cloning are pretty explicit. Things like Worker.postMessage and IDBObjectStore.put pretty explicitly create a new copy.
If you expect host objects to survive across that boundary you'll quickly learn otherwise, and it won't take long to grok the difference. Java draws a distinction between marshalling and serialization which might be useful to this discussion: http://tools.ietf.org/html/rfc2713#section-2.3 To marshal an object means to record its state and codebase(s) in such a way that when the marshalled object is unmarshalled, a copy of the original object is obtained, possibly by automatically loading the class definitions of the object. You can marshal any object that is serializable or remote (that is, implements the java.rmi.Remote interface). Marshalling is like serialization, except marshalling also records codebases. I agree with the conclusion in Dave's post: A more adaptable approach might be for ECMAScript to specify “transmittable” data structures. But the premise suggests the need for marshalling, where we can get by without preserving all this environment information. If you don't like the behavior, toJSON is already metaprogrammable, and as Allen suggests a schema could be used to capture deeper type information -- it could also communicate property descriptor config. I'd rather a fully self-describing format exist but I concede it's unnecessary.
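The toJSON hook mentioned above already lets an object choose its serialized representation instead of exposing its raw structure. FileStub below is a hypothetical stand-in for a host object such as File:

```javascript
// The toJSON hook: an object chooses its own wire representation.
// FileStub is a hypothetical stand-in for a host object like File.
class FileStub {
  constructor(name, size) { this.name = name; this.size = size; }
  toJSON() {
    // Serialize as a reference, not as raw properties.
    return { kind: 'file-ref', name: this.name };
  }
}

const serialized = JSON.stringify({ attachment: new FileStub('a.txt', 1024) });
```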
Re: The Structured Clone Wars
On Fri, Jul 15, 2011 at 10:22 AM, Allen Wirfs-Brock al...@wirfs-brock.com wrote: On Jul 14, 2011, at 9:30 PM, Jason Orendorff wrote: Or, it's a spec fiction to explain and codify the Web-visible effects of serialization and deserialization without specifying a serialization format. As such, it seems like this may be a poor specification approach. Perhaps. Certainly the current spec language isn't ideal. This algorithm is in the Here's a bunch of random stuff section of the HTML5 standard. Perhaps the ES spec is a better place for it. I'm not sure. On Jul 15, 2011, at 12:00 PM, Jonas Sicking wrote: 2011/7/15 Jason Orendorff jason.orendo...@gmail.com The structured cloning algorithm should be redefined in terms of the ES object protocol. This seems necessary anyway, for precision. Except that you don't want to do that for host objects. I only meant to say that the structured cloning algorithm should be specified in precise language, not that the meaning should be drastically changed. After all, this is a deployed standard, right? If it were to be done in the style of the ES standard, it would mean offering an extension point, such as a [[Clone]] internal method, which cloneable host objects such as File could implement. (I say [[Clone]], but there are other possibilities.) -j 
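A [[Clone]] internal method can't be written directly in user code, but its shape can be approximated with a well-known method name. Everything below (the CLONE symbol convention, the FileStub class, the structural fallback) is an invented illustration of the extension point Jason describes, not proposed spec text:

```javascript
// Hypothetical [[Clone]] extension point, modeled as a symbol-keyed method.
// Cloneable "host" objects implement it; plain objects get a structural copy.
const CLONE = Symbol("clone");

class FileStub {
  constructor(name, bytes) { this.name = name; this.bytes = bytes; }
  [CLONE]() {
    // A real File would copy its underlying data, not just its properties.
    return new FileStub(this.name, this.bytes.slice());
  }
}

function structuredCloneSketch(value) {
  if (value === null || typeof value !== "object") return value;
  if (typeof value[CLONE] === "function") return value[CLONE]();
  const copy = Array.isArray(value) ? [] : {};
  for (const key of Object.keys(value)) {
    copy[key] = structuredCloneSketch(value[key]);
  }
  return copy;
}

const f = new FileStub("a.bin", [1, 2, 3]);
const c = structuredCloneSketch({ file: f });
// c.file is an independent FileStub copy with its own bytes array
```

This is the key difference from a property-by-property clone: the object itself decides what a faithful copy means.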
Re: The Structured Clone Wars
On Fri, Jul 15, 2011 at 1:30 PM, Allen Wirfs-Brock al...@wirfs-brock.com wrote: On Jul 15, 2011, at 10:00 AM, Jonas Sicking wrote: Except that you don't want to do that for host objects. Trying to clone a File object by cloning its properties is going to give you an object which is a whole lot less useful, as it wouldn't contain any of the file data. Once we define support for cloning ArrayBuffers, the same thing will apply to them. This might in fact be a big hurdle to implementing structured cloning in javascript. How would a JS implementation of structured clone determine if an object is a host object which would lose all its useful semantics if cloned, vs. a plain JS object which can usefully be cloned? / Jonas And a cloned JS object is a lot less useful if it has lost its original [[Prototype]]. Didn't you just argue you could communicate this kind of information with a schema? You couldn't share the actual [[Prototype]] anyway. So you'd have to pass the expected behaviors along with the object (this is why a Function serialization would be wonderful, but this could be done in a schema too). Sure, it won't be terribly efficient since (without mutable __proto__ or a | like mechanism in JSON) your worker would have to make another pass over the keys to tack on the appropriate behaviors. There's no benefit to the branding info (again, no shared memory) so I don't really see the problem. Why would this JS object be substantially less useful? It just requires a slightly different paradigm -- but this is to be expected. The only alternatives I can imagine would require some kind of spec. assistance (e.g. a specified schema format or a JSON++), which I gather you were trying to avoid. Generalizations about host objects are no more or less valid than generalizations about pure JS objects. This issue applies to pure JS object graphs or any serialization scheme. Sometimes language specific physical clones won't capture the desired semantics. 
(Consider for example, an object that references a resource by using a symbolic token to access a local resource registry). That is why the ES5 JSON encoder/decoder includes extension points such as the toJSON method. To enable semantic encodings that are different from the physical object structure. The structured clone algorithm, as currently written, allows the passing of strings, so it is possible to use it to transmit anything that can be encoded within a string. All it needs is an application specific encoder/decoder. It seems to me the real complication is a desire for some structured clone use cases to avoid serialization and permit sharing via a copy-on-write of a real JS object graph. There are alternatives to CoW (dherman alluded to safely transferring ownership in his post, for instance). If you define this sharing in terms of serialization then you probably eliminate some of the language-specific low level sharing semantic issues. But you are still going to have higher level semantic issues, such as what it means to serialize a File. It isn't clear to me that there is a general solution to the latter. Why does it matter what it means to serialize a File? For the use cases in question (IndexedDB and WebWorkers) there are various paths an app could take; why would this have to be spec'ed? What does toJSON do? And does a file handle really need to make it across this serialization boundary and into your IDB store for later retrieval? I suspect not. 
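Allen's point that strings are enough for an application-specific encoder/decoder, including his symbolic-token example, can be sketched as follows. The $ref token convention, registry names, and "db" kind tag are all invented for illustration:

```javascript
// Sketch: a resource travels as a symbolic token in a string, and the
// receiver re-resolves the token against its own local resource registry.
const senderRegistry = { "db:main": { kind: "db", conn: "local-A" } };
const receiverRegistry = { "db:main": { kind: "db", conn: "local-B" } };

function encode(obj) {
  // Replace registry-backed resources with a token the receiver understands.
  return JSON.stringify(obj, (key, val) =>
    val && val.kind === "db" ? { $ref: "db:main" } : val);
}

function decode(str, registry) {
  // Resolve tokens against the *receiver's* registry.
  return JSON.parse(str, (key, val) =>
    val && val.$ref ? registry[val.$ref] : val);
}

// The string produced by encode() is exactly what structured clone (or
// postMessage) can already carry today.
const msg = encode({ task: "query", target: senderRegistry["db:main"] });
const received = decode(msg, receiverRegistry);
// received.target is the receiver's own local resource, not a copied handle
```

Note this sidesteps the what does it mean to serialize a File question entirely: the meaning is defined case by case by the application's encoder/decoder pair, which is exactly Allen's claim.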
Re: [[Extensible]] and Proxies (Was: Proxy.isProxy)
On Jul 15, 2011, at 3:51 AM, Tom Van Cutsem wrote: Hi Allen, While proxies are definitely designed to emulate some of the peculiar behavior of host objects, that does not necessarily imply that proxies should be as powerful and free of restrictions as host objects. If any JS script can create a proxy, and a proxy can violate invariants that do hold for built-in Object or Function values, the logical implication is that Javascript code, in general, can no longer rely on those invariants. Code that does can then more easily be exploited. My view is that intercession is useful in that it supports the implementation of object semantics that differ from the base language. It actually seems to me that the property attributes are part of those base object semantics that might be reasonably modified by intercession. But I guess that is really the base question here: What are the base invariants that must be maintained at all costs, and what are secondary invariants that may be useful but not essential? Consider a non-configurable, non-writable data property. The binding of such a property for regular objects is guaranteed to be immutable. Immutability can be used for caching purposes, but is equally useful to not have to adopt a defensive coding style (as a case in point, take Mark's implementation of the structured clone algorithm in JS. Such defensive coding would be unnecessary when dealing with frozen objects.) In a security context, immutability may be used to determine whether or not to allow some third-party code to access an object. The immediate issue was about [[Extensible]], not about the broader issue of Object.freeze. I think they are probably separable issues and that part of the disagreement is that they have been conflated. In short, if proxies could violate the semantics of e.g. frozen objects, then it seems to me that the usefulness of that concept is severely crippled. 
Where Object.isFrozen would have been a test for guaranteed immutability (of an object's structure), it merely becomes a test that hints at it, but cannot be relied upon. But can't a Proxy based object do all sorts of nasty back channel stuff even while it maintains the apparent object freeze invariants? More generally, it seems like what you really need to defend against is objects that are implemented via untrusted Proxy handlers. I have no problem with your defensive subsystem rejecting my non-extensible proxy based array because you don't trust my proxy handler. But to tell me that I can't even create such an object for my own purposes because you are afraid I might pass it to you seems to be putting your use case ahead of mine. Allen
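For context, the invariant question Tom raises was eventually settled in the Proxy API that shipped in ES6 (which post-dates this thread): the engine itself checks a handler's answers against the frozen target's properties, so Object.isFrozen remains a reliable test rather than a hint. A minimal demonstration using the modern API:

```javascript
// A frozen target: x is a non-configurable, non-writable data property.
const target = Object.freeze({ x: 1 });

// A handler that tries to misreport x.
const lying = new Proxy(target, {
  get(t, key) { return key === "x" ? 999 : t[key]; }
});

let threw = false;
try {
  lying.x; // the engine compares the trap result against the frozen target
} catch (e) {
  threw = e instanceof TypeError;
}
// threw === true: the invariant is enforced, the lie is rejected
```

So in modern terms, immutability guarantees survive intercession; the proxy may still do back-channel work in its traps, but it cannot make a frozen property appear to have a different value.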
Re: The Structured Clone Wars
On Jul 15, 2011, at 10:56 AM, Dean Landolt wrote: On Fri, Jul 15, 2011 at 1:30 PM, Allen Wirfs-Brock al...@wirfs-brock.com wrote: On Jul 15, 2011, at 10:00 AM, Jonas Sicking wrote: Except that you don't want to do that for host objects. ... And a cloned JS object is a lot less useful if it has lost its original [[Prototype]]. Didn't you just argue you could communicate this kind of information with a schema? You couldn't share the actual [[Prototype]] anyway. So you'd have to pass the expected behaviors along with the object (this is why a Function serialization would be wonderful, but this could be done in a schema too). Sure, it won't be terribly efficient since (without mutable __proto__ or a | like mechanism in JSON) your worker would have to make another pass over the keys to tack on the appropriate behaviors. There's no benefit to the branding info (again, no shared memory) so I don't really see the problem. Why would this JS object be substantially less useful? It just requires a slightly different paradigm -- but this is to be expected. The only alternatives I can imagine would require some kind of spec. assistance (e.g. a specified schema format or a JSON++), which I gather you were trying to avoid. I was only objecting to any argument that starts out by essentially saying host objects have unique requirements. Anything that is an issue for host objects is likely to be an issue for some pure JS application. Generalizations about host objects are no more or less valid than generalizations about pure JS objects. This issue applies to pure JS object graphs or any serialization scheme. Sometimes language specific physical clones won't capture the desired semantics. 
The structured clone algorithm, as currently written, allows the passing of strings, so it is possible to use it to transmit anything that can be encoded within a string. All it needs is an application specific encoder/decoder. It seems to me the real complication is a desire for some structured clone use cases to avoid serialization and permit sharing via a copy-on-write of a real JS object graph. There are alternatives to CoW (dherman alluded to safely transferring ownership in his post, for instance). In either case there are contextual issues such as the [[Prototype]] problem. More generally, if you have a behavioral based (methods+accessors are the only public interface) object model then you really can't CoW or transfer ownership meaningfully and maintain the no shared state illusion. If you define this sharing in terms of serialization then you probably eliminate some of the language-specific low level sharing semantic issues. But you are still going to have higher level semantic issues, such as what it means to serialize a File. It isn't clear to me that there is a general solution to the latter. Why does it matter what it means to serialize a File? For the use cases in question (IndexedDB and WebWorkers) there are various paths an app could take; why would this have to be spec'ed? What does toJSON do? And does a file handle really need to make it across this serialization boundary and into your IDB store for later retrieval? I suspect not. Again, just an example and something that structured clone does deal with, even if not in a very precise manner. I was trying to say that there probably isn't a general solution for communicating higher level semantic information. It needs to be designed on a case by case basis. Allen
Re: [[Extensible]] and Proxies (Was: Proxy.isProxy)
On Jul 15, 2011, at 11:16 AM, Allen Wirfs-Brock wrote: On Jul 15, 2011, at 3:51 AM, Tom Van Cutsem wrote: Hi Allen, While proxies are definitely designed to emulate some of the peculiar behavior of host objects, that does not necessarily imply that proxies should be as powerful and free of restrictions as host objects. If any JS script can create a proxy, and a proxy can violate invariants that do hold for built-in Object or Function values, the logical implication is that Javascript code, in general, can no longer rely on those invariants. Code that does can then more easily be exploited. My view is that intercession is useful in that it supports the implementation of object semantics that differ from the base language. It actually seems to me that the property attributes are part of those base object semantics that might be reasonably modified by intercession. But I guess that is really the base question here: What are the base invariants that must be maintained at all costs, and what are secondary invariants that may be useful but not essential? Consider a non-configurable, non-writable data property. The binding of such a property for regular objects is guaranteed to be immutable. Immutability can be used for caching purposes, but is equally useful to not have to adopt a defensive coding style (as a case in point, take Mark's implementation of the structured clone algorithm in JS. Such defensive coding would be unnecessary when dealing with frozen objects.) In a security context, immutability may be used to determine whether or not to allow some third-party code to access an object. The immediate issue was about [[Extensible]], not about the broader issue of Object.freeze. I think they are probably separable issues and that part of the disagreement is that they have been conflated. In short, if proxies could violate the semantics of e.g. frozen objects, then it seems to me that the usefulness of that concept is severely crippled. 
Where Object.isFrozen would have been a test for guaranteed immutability (of an object's structure), it merely becomes a test that hints at it, but cannot be relied upon. But can't a Proxy based object do all sorts of nasty back channel stuff even while it maintains the apparent object freeze invariants? Duh, of course a frozen object really isn't a Proxy any more as currently defined. But it has also lost all useful intercession derived semantics. BTW, if http://wiki.ecmascript.org/doku.php?id=strawman:fixed_properties was extended to support Object.preventExtensions we might not have anything to argue about except perhaps performance issues. More generally, it seems like what you really need to defend against is objects that are implemented via untrusted Proxy handlers. I have no problem with your defensive subsystem rejecting my non-extensible proxy based array because you don't trust my proxy handler. But to tell me that I can't even create such an object for my own purposes because you are afraid I might pass it to you seems to be putting your use case ahead of mine. Allen 
Re: [[Extensible]] and Proxies (Was: Proxy.isProxy)
On Jul 15, 2011, at 12:38 PM, Allen Wirfs-Brock wrote: On Jul 15, 2011, at 11:16 AM, Allen Wirfs-Brock wrote: But can't a Proxy based object do all sorts of nasty back channel stuff even while it maintains the apparent object freeze invariants? Duh, of course a frozen object really isn't a Proxy any more as currently defined. But it has also lost all useful intercession derived semantics. This may be ok, so long as we can distinguish preventExtensions from seal from freeze, *and* determine the class or constructor used for the newborn object that becomes the proxy. Yes, I'm thinking faithful Array emulation. BTW, if http://wiki.ecmascript.org/doku.php?id=strawman:fixed_properties was extended to support Object.preventExtensions we might not have anything to argue about except perhaps performance issues. Indeed, that helps quite a bit. Glad to hear it. But there may be more power needed in the fix trap, per the above Array-proxy test. /be
Re: [[Extensible]] and Proxies (Was: Proxy.isProxy)
On Jul 15, 2011, at 12:47 PM, Brendan Eich wrote: On Jul 15, 2011, at 12:38 PM, Allen Wirfs-Brock wrote: On Jul 15, 2011, at 11:16 AM, Allen Wirfs-Brock wrote: But can't a Proxy based object do all sorts of nasty back channel stuff even while it maintains the apparent object freeze invariants? Duh, of course a frozen object really isn't a Proxy any more as currently defined. But it has also lost all useful intercession derived semantics. This may be ok, so long as we can distinguish preventExtensions from seal from freeze, *and* determine the class or constructor used for the newborn object that becomes the proxy. Yes, I'm thinking faithful Array emulation. BTW, if http://wiki.ecmascript.org/doku.php?id=strawman:fixed_properties was extended to support Object.preventExtensions we might not have anything to argue about except perhaps performance issues. Indeed, that helps quite a bit. Glad to hear it. But there may be more power needed in the fix trap, per the above Array-proxy test. Perhaps the fix trap should return the constructor to use, or perhaps even the substitute object. Allen
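The effect Allen suggests, a fix trap that supplies both a constructor and the substitute object's properties, might look roughly like the following. Since the fix trap never shipped in the final Proxy design, the hand-rolled materialization below is purely illustrative of the idea, not of any real API:

```javascript
// Sketch: "fixing" a proxy means replacing it with an ordinary,
// non-extensible object whose shape the handler proposes. Here the
// handler proposes both property descriptors and the constructor to use
// (the fixProxy helper and its arguments are invented for illustration).
function fixProxy(proposeDescriptors, Ctor) {
  const fixed = Object.create(Ctor.prototype);
  Object.defineProperties(fixed, proposeDescriptors());
  return Object.preventExtensions(fixed);
}

// A handler "fixing" itself into a non-extensible Array-like.
const fixedArr = fixProxy(
  () => ({
    0: { value: "a", enumerable: true },
    length: { value: 1, writable: true }
  }),
  Array
);
// fixedArr walks and quacks like a frozen-ish array: it inherits from
// Array.prototype, has index 0 and length, and cannot grow new properties
```

Note the limits of this emulation: an object created with Object.create(Array.prototype) passes instanceof Array but is not an exotic array (Array.isArray returns false, length is not magical), which is exactly why faithful Array emulation kept coming up in this thread.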
Re: The Structured Clone Wars
On Fri, Jul 15, 2011 at 2:50 PM, Allen Wirfs-Brock al...@wirfs-brock.com wrote: On Jul 15, 2011, at 10:56 AM, Dean Landolt wrote: On Fri, Jul 15, 2011 at 1:30 PM, Allen Wirfs-Brock al...@wirfs-brock.com wrote: On Jul 15, 2011, at 10:00 AM, Jonas Sicking wrote: Except that you don't want to do that for host objects. ... And a cloned JS object is a lot less useful if it has lost its original [[Prototype]]. Didn't you just argue you could communicate this kind of information with a schema? You couldn't share the actual [[Prototype]] anyway. So you'd have to pass the expected behaviors along with the object (this is why a Function serialization would be wonderful, but this could be done in a schema too). Sure, it won't be terribly efficient since (without mutable __proto__ or a | like mechanism in JSON) your worker would have to make another pass over the keys to tack on the appropriate behaviors. There's no benefit to the branding info (again, no shared memory) so I don't really see the problem. Why would this JS object be substantially less useful? It just requires a slightly different paradigm -- but this is to be expected. The only alternatives I can imagine would require some kind of spec. assistance (e.g. a specified schema format or a JSON++), which I gather you were trying to avoid. I was only objecting to any argument that starts out by essentially saying host objects have unique requirements. Anything that is an issue for host objects is likely to be an issue for some pure JS application. Okay, but what of the assertion itself? Must a cloned JS object maintain its [[Prototype]]? I'm just curious as to why. Generalizations about host objects are no more or less valid than generalizations about pure JS objects. This issue applies to pure JS object graphs or any serialization scheme. Sometimes language specific physical clones won't capture the desired semantics. 
(Consider for example, an object that references a resource by using a symbolic token to access a local resource registry). That is why the ES5 JSON encoder/decoder includes extension points such as the toJSON method. To enable semantic encodings that are different from the physical object structure. The structured clone algorithm, as currently written, allows the passing of strings, so it is possible to use it to transmit anything that can be encoded within a string. All it needs is an application specific encoder/decoder. It seems to me the real complication is a desire for some structured clone use cases to avoid serialization and permit sharing via a copy-on-write of a real JS object graph. There are alternatives to CoW (dherman alluded to safely transferring ownership in his post, for instance). In either case there are contextual issues such as the [[Prototype]] problem. More generally, if you have a behavioral based (methods+accessors are the only public interface) object model then you really can't CoW or transfer ownership meaningfully and maintain the no shared state illusion. By whose definition of meaningful? IIUC you're asserting that directly sharing context like [[Prototype]] is both important and impossible. I contend that the behavior-based object model can be shared, if only indirectly, by completely detaching it from the main thread's deeply intertwined, deeply mutable object graph (to borrow dherman's colorful phrase). This would almost certainly require spec. support but I can think of at least a few ways to do it. If something like this were doable it could open the door for the most efficient structured clone I can think of: no clone at all. If you define this sharing in terms of serialization then you probably eliminate some of the language-specific low level sharing semantic issues. But you are still going to have higher level semantic issues, such as what it means to serialize a File. 
It isn't clear to me that there is a general solution to the latter. Why does it matter what it means to serialize a File? For the use cases in question (IndexedDB and WebWorkers) there are various paths an app could take; why would this have to be spec'ed? What does toJSON do? And does a file handle really need to make it across this serialization boundary and into your IDB store for later retrieval? I suspect not. Again, just an example and something that structured clone does deal with, even if not in a very precise manner. I was trying to say that there probably isn't a general solution for communicating higher level semantic information. It needs to be designed on a case by case basis. Indeed, certain applications may require custom handling. But it would be great if there were an easy and obvious default. It's good enough for JSON, which is a very similar use case (especially in the context of IDB). So I'm still curious just how important is it to transmit a faithful representation of an object, prototype and all, for the WebWorker use case?
Re: [[Extensible]] and Proxies (Was: Proxy.isProxy)
2011/7/15 Brendan Eich bren...@mozilla.com On Jul 15, 2011, at 12:38 PM, Allen Wirfs-Brock wrote: BTW, if http://wiki.ecmascript.org/doku.php?id=strawman:fixed_properties was extended to support Object.preventExtensions we might not have anything to argue about except perhaps performance issues. Indeed, that helps quite a bit. Glad to hear it. But there may be more power needed in the fix trap, per the above Array-proxy test. Glad to hear that too. I'll try to extend the FixedHandler with support for non-extensibility. David may beat me to it, though ;-) I'll be offline for the next couple of weeks, so this may take a while, but it's noted and I'll follow up on suggestions later. Cheers, Tom 
Re: The Structured Clone Wars
On Jul 15, 2011, at 1:45 PM, Dean Landolt wrote: On Fri, Jul 15, 2011 at 2:50 PM, Allen Wirfs-Brock al...@wirfs-brock.com wrote: Okay, but what of the assertion itself? Must a cloned JS object maintain its [[Prototype]]? I'm just curious as to why. If it is a local clone, yes. Essential parts of an object's behavior (and even state) may be defined by the objects along its [[Prototype]] chain. If you eliminate those it isn't behaviorally the same kind of object. If you are talking about cloning a non-local copy then it depends upon what you are really trying to accomplish. If you are trying to create a behaviorally equivalent clone in a remote but similar environment, and you are dealing with some sort of built-in object, then maybe you will be satisfied with just connecting to the equivalent built-in prototypes in the remote system (this is essentially what structured clone is doing for well known JS objects like RegExp and Date). If it is an application defined object with application defined prototypes, you may want to first force remote loading of your application so you can connect to it. Or maybe you want to also serialize the [[Prototype]] chain as part of the remote cloning operation, or something else. Maybe all you really want to do is just clone some static application data without any behavioral component at all (essentially what JSON does). ... In either case there are contextual issues such as the [[Prototype]] problem. More generally, if you have a behavioral based (methods+accessors are the only public interface) object model then you really can't CoW or transfer ownership meaningfully and maintain the no shared state illusion. By whose definition of meaningful? IIUC you're asserting that directly sharing context like [[Prototype]] is both important and impossible. 
I contend that the behavior-based object model can be shared, if only indirectly, by completely detaching it from the main thread's deeply intertwined, deeply mutable object graph (to borrow dherman's colorful phrase). This would almost certainly require spec. support but I can think of at least a few ways to do it. If something like this were doable it could open the door for the most efficient structured clone I can think of: no clone at all. Perhaps, if you had the concept of immutable (and identity free??) behavioral specifications, then they could be reified in multiple environments, perhaps sharing an underlying representation. But that isn't really how JavaScript programs are constructed today. ... Indeed, certain applications may require custom handling. But it would be great if there were an easy and obvious default. It's good enough for JSON, which is a very similar use case (especially in the context of IDB). So I'm still curious just how important is it to transmit a faithful representation of an object, prototype and all, for the WebWorker use case? I suspect a few compromises can be made to get to an efficient postMessage that could sidestep Structured Clone entirely. Wouldn't this be a more desirable outcome anyway? Here's how I'd put it: if JSON is good enough for http server/browser client communications, why isn't it good enough for communicating to a Web worker? It seems we would have better scalability if a task could be fairly transparently assigned to either a local worker or a remote compute server depending upon local capabilities, etc. My experience (and I've worked with a lot of different OO languages and environments) is that transparent marshalling of object models for either communications or storage seldom ends up being a good long term solution. It seems very attractive but leads to problems such as schema evolution issues (particularly when long term storage is involved). 
Attractive nuisance, don't do it :-)
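Allen's connect to equivalent prototypes in the remote system approach can be sketched as plain data plus a type tag that the receiver resolves against its own locally loaded code. The __type tag convention, the Point class, and the localTypes registry are all invented for illustration:

```javascript
// Behavior stays local on each side; only data and a type tag travel.
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  norm() { return Math.hypot(this.x, this.y); }
}

// Sender: strip behavior, keep data plus a tag naming the type.
function toWire(p) {
  return JSON.stringify({ __type: "Point", x: p.x, y: p.y });
}

// Receiver: look up its *own* constructor for the tag and rebuild.
const localTypes = { Point };
function fromWire(str) {
  const data = JSON.parse(str);
  const ctor = localTypes[data.__type];
  return Object.assign(Object.create(ctor.prototype), { x: data.x, y: data.y });
}

const p2 = fromWire(toWire(new Point(3, 4)));
// p2.norm() works because the behavior comes from the receiver's Point,
// not from anything carried across the wire
```

This is also where Allen's schema-evolution warning bites: if the sender's and receiver's Point definitions drift apart, the tag still resolves but the rebuilt object may no longer mean what the sender intended.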
Re: The Structured Clone Wars
On 7/15/11 1:37 PM, Dean Landolt wrote: Is it really a problem if host objects don't survive in full across serialization boundaries? Depending on what you mean by in full, yes. As you say, All APIs that use structured cloning are pretty explicit. Things like Worker.postMessage and IDBObjectStore.put pretty explicitly create a new copy. If you expect host objects to survive across that boundary you'll quickly learn otherwise, and it won't take long to grok the difference. The whole point of structured cloning is to pass across objects in a way that's pretty difficult to do via serialization using existing ES5 reflection facilities. Java draws a distinction between marshalling and serialization which might be useful to this discussion: Structured clone is closer to marshalling. -Boris 