Re: Non-extensibility of Typed Arrays

Filip Pizlo Wed, 04 Sep 2013 16:15:32 -0700

On Sep 4, 2013, at 3:09 PM, Brendan Eich <[email protected]> wrote:


>> Filip Pizlo <mailto:[email protected]>
>> September 4, 2013 12:34 PM
>> My point is that having custom properties, or not, doesn't change the 
>> overhead for the existing typed array spec and hence has no effect on small 
>> arrays.  The reasons for this include:
>> 
>> - Typed arrays already have to be objects, and hence have a well-defined 
>> behavior on '=='.
>> 
>> - Typed arrays already have to be able to tell you that they are in fact 
>> typed arrays, since JS doesn't have static typing.
>> 
>> - Typed arrays already have prototypes, and those are observable regardless 
>> of expandability.  A typed array from one global object will have a 
>> different prototype than a typed array from a different global object.  Or 
>> am I misunderstanding the spec?
>> 
>> - Typed arrays already have to know about their buffer.
>> 
>> - Typed arrays already have to know about their offset into the buffer.  Or, 
>> more likely, they have to have a second pointer that points directly at the 
>> base from which they are indexed.
>> 
>> - Typed arrays already have to know their length.
>> 
>> You're not proposing changing these aspects of typed arrays, right?
> 
> Of course not, but for very small fixed length arrays whose .buffer is never 
> accessed, an implementation might optimize harder.

As I said, of course you can do this, and one way you could "try harder" is to 
put the buffer pointer in a side table.  The side table maps array object 
pointers to their buffers, and you only make an entry in this table if .buffer 
is mentioned.

But if we believe that this is a sensible thing for a VM to do - and of course 
it is! - then the same thing can be done for the custom property storage 
pointer.
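
As a rough illustration of the side-table idea, here is a minimal sketch in plain JavaScript that models what a VM might do internally.  The names expandoTable, setCustomProperty and getCustomProperty are hypothetical and not part of any engine or spec; a WeakMap stands in for the VM's weak side table.

```javascript
// Hypothetical user-level model of a VM-internal side table.
// Objects that never receive a custom property never get an entry.
const expandoTable = new WeakMap();

function setCustomProperty(obj, key, value) {
  let props = expandoTable.get(obj);
  if (props === undefined) {
    // First custom property on this object: only now is storage allocated.
    props = new Map();
    expandoTable.set(obj, props);
  }
  props.set(key, value);
}

function getCustomProperty(obj, key) {
  const props = expandoTable.get(obj);
  return props === undefined ? undefined : props.get(key);
}

const plain = new Float32Array(4);   // never extended: no table entry
const tagged = new Float32Array(4);  // extended once: exactly one table entry
setCustomProperty(tagged, "label", "velocity");
```

In this model the array that is never extended costs nothing extra, while the extended one pays for a table lookup on each custom property access.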

> It's hard for me to say "no, Filip's analysis shows that's never worthwhile, 
> for all time."
> 
>> The super short message is this: so long as an object obeys object identity 
>> on '==' then you can have "free if unused, suboptimal if you use them" 
>> custom properties by using a weak map on the side.  This is true of typed 
>> arrays and it would be true of any other object that does object-style ==.  
>> If you allocate such an object and never add a custom property then the weak 
>> map will never have an entry for it; but if you put custom properties in the 
>> object then the map will have things in it.  But with typed arrays you can 
>> do even better as my previous message suggests: so long as an object has a 
>> seldom-touched field and you're willing to eat an extra indirection or an 
>> extra branch on that field, you can have "free if unused, still pretty good 
>> if you use them" custom properties by displacing that field.  Typed arrays 
>> have both of these properties right now and so expandability is a free lunch.
> 
> The last sentence makes a "for-all" assertion I don't think implementations 
> must be constrained by.

How so?  It is true that some VM implementations will be better than others.  
But ultimately every VM can implement every optimization that every other VM 
has; in fact my impression is that this is exactly what is happening as we 
speak.

So, it doesn't make much sense to make language design decisions because it 
might make some implementor's life easier right now.  If you could argue that 
something will never be efficient if we add feature X, then that might be an 
interesting argument.  But as soon as we identify one sensible optimization 
strategy for making something free, I would tend to think that this is 
sufficient to conclude that the feature is free and there is no need to 
constrain it.  If we don't do this then we risk adding cargo-cult performance 
features that rapidly become obsolete.

> Small fixed-length arrays whose .buffer is never accessed (which an 
> implementation might be able to prove by type inference) could be optimized 
> harder.

And my point is that if you do so, then the same technique can be trivially 
applied to the custom property storage pointer.
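
The field-displacement variant from my earlier message can be modeled the same way.  This is only a sketch under assumptions: ViewHeader, Expando, slot, setCustom and getCustom are made-up names, and a real VM would do this on raw object headers rather than on JavaScript classes.

```javascript
// Hypothetical out-of-line record that takes over the displaced slot.
class Expando {
  constructor(buffer) {
    this.buffer = buffer;    // the value that used to live in the slot
    this.props = new Map();  // custom property storage
  }
}

// Hypothetical model of a typed array header with one seldom-touched slot.
class ViewHeader {
  constructor(buffer, byteOffset, length) {
    this.slot = buffer;      // common case: the slot points straight at the buffer
    this.byteOffset = byteOffset;
    this.length = length;
  }
  get buffer() {
    // The extra branch (and indirection, if displaced) on the rare path.
    return this.slot instanceof Expando ? this.slot.buffer : this.slot;
  }
  setCustom(key, value) {
    if (!(this.slot instanceof Expando)) {
      // First custom property: displace the buffer pointer into an Expando.
      this.slot = new Expando(this.slot);
    }
    this.slot.props.set(key, value);
  }
  getCustom(key) {
    return this.slot instanceof Expando ? this.slot.props.get(key) : undefined;
  }
}

const buf = new ArrayBuffer(16);
const view = new ViewHeader(buf, 0, 4);
const slotBefore = view.slot;   // still the buffer itself
view.setCustom("foo", 42);
```

The header has the same number of fields before and after the property is added; only the meaning of the slot changes.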

> 
> The lack of static types in JS does not mean exactly one implementation 
> representation must serve for all instances of a given JS-level abstraction. 
> We already have strings optimized variously in the top VMs, including Chords 
> or Ropes, dependent strings, different character sets, etc.
>> 
>> Still find this discussion amusing?  Here's the long story: It is these 
>> things that I list above that lead to a 16 byte overhead on 32-bit, and a 
>> 32-byte overhead on 64-bit in the best "sane" case.  Giving typed array 
>> objects expandability doesn't add to this overhead, because two of the 
>> fields necessary to implement the above (the type, and the buffer) can be 
>> displaced for pointing to property storage.  Any imaginable attempt to 
>> reduce the overhead incurred by the information - using BBOP (big bag of 
>> pages) for the type, using an out-of-line weak map for the buffer or the 
>> type, encoding some of the bits inside the pointer to the typed array, etc. 
>> - can also be used to eradicate any space overhead you'd need for custom 
>> properties, so long as you're on board with the "free if unused, sub-optimal 
>> if you use them" philosophy.
> 
> For something like decimal, it matters whether there's an empty side table 
> and large-N decimal instances of total size N*S, vs. N*(S+K) for some 
> constant K we could eliminate by specializing harder. Even better if we agree 
> that decimal instances should be non-extensible (and have value not reference 
> semantics -- more below).

With a side table, the constant K = 0 even if you have custom properties.  The 
table will only have an entry for those instances that had custom properties.
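
To put numbers on that, here is a small accounting sketch.  The function names and the symbols M (number of instances that actually have custom properties) and E (bytes per side-table entry) are my own additions for illustration, and the figures are made up rather than measured.

```javascript
// Total bytes if every instance carries an inline slot of K bytes.
function inlineSlotBytes(n, s, k) {
  return n * (s + k);
}

// Total bytes with a side table: only the M extended instances pay E bytes each.
function sideTableBytes(n, s, m, e) {
  return n * s + m * e;
}

// Illustrative values only: one million 16-byte instances, none extended.
const withInlineSlot = inlineSlotBytes(1000000, 16, 8);
const withSideTable = sideTableBytes(1000000, 16, 0, 64);
```

With M = 0 the side-table total is exactly N*S, which is the K = 0 case above.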

> 
>> - If the VM wants to go further and create immediate representations of some 
>> or all Int64's, similarly to what VMs do for JS small integers today, then 
>> the main problem you run into is object identity: does 
>> Int64(1).add(Int64(1)) == Int64(1).add(Int64(1))?  A naive JS implementation 
>> of an Int64 class would say that this is false, since it's likely to 
>> allocate a new Int64 each time.  But an immediate representation would have 
>> no choice but to say true.  You can work around this if you say that the 
>> VM's implementation of Int64 operations behaves /as if/ the 
>> add()/sub()/whatever() methods used a singleton cache.  You can still then 
>> have custom properties; i.e. you could do Int64(2).foo = 42 and then 
>> Int64(1).add(Int64(1)).foo will return 42, since the VM can keep an 
>> immediate-int64-to-customproperties map on the side. That's kind of 
>> analogous to how you could put a setter on field '2' of Array.prototype and 
>> do some really hilarious things.
> 
> The value objects proposal for ES7 is live, I'm championing it. It does not 
> use (double-dispatch for dyadic) operators as methods. It does not use 
> extensible objects.
> 
> http://wiki.ecmascript.org/doku.php?id=strawman:value_objects
> http://www.slideshare.net/BrendanEich/value-objects
> 
> Warning: both are slightly out of date, I'll be updating the strawman over 
> the next week.

Thanks for the links!  To clarify, I'm not trying to make a counterproposal - 
the above was nothing more than a fun thought experiment and I shared it to 
motivate why I think that custom properties are free.
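
For the curious, the thought experiment can be approximated in user-level JavaScript.  This is a sketch under assumptions: Int64 and Int64Box here are hypothetical, BigInt stands in for a 64-bit immediate, and the cache holds its entries strongly, which a real VM would not need to do.

```javascript
// Side map from 64-bit value to its one canonical object.
const int64Cache = new Map();

class Int64Box {
  constructor(value) {
    this.value = value;
  }
  add(other) {
    // Behaves as if the result came from a singleton cache.
    return Int64(this.value + other.value);
  }
}

function Int64(n) {
  // Wrap to the signed 64-bit range so equal values share one key.
  const key = BigInt.asIntN(64, BigInt(n));
  let box = int64Cache.get(key);
  if (box === undefined) {
    box = new Int64Box(key);
    int64Cache.set(key, box);
  }
  return box;
}

Int64(2).foo = 42;
const sum = Int64(1).add(Int64(1));
const sameIdentity = sum === Int64(1).add(Int64(1));
const inheritedFoo = sum.foo;
```

Here the cache plays the role of the immediate-int64-to-customproperties map: equal values yield the same object, so identity comparison and custom properties agree.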

My understanding is that you are still arguing that custom properties are not 
free, and that they incur some tangible cost in terms of space and/or time.  
I'm just trying to show you why they don't if you do the same optimizations for 
them that have become acceptable for a lot of other JS corners.  Unless you 
think that ES should have an "ease of implementation" bar for features.  I 
wouldn't necessarily mind that, but my impression is that this is not the case.

> 
> With value objects, TC39 has definitely favored something that I think you 
> oppose, namely extending JS to have (more) objects with value not reference 
> semantics, which requires non-extensibility.

Indeed.

> 
> If I have followed your messages correctly, this is because you think 
> non-extensibility is a rare case that should not proliferate.

I have two points here:

- Typed arrays already have so much observable objectyness that making them 
non-extensible feels arbitrary; this is true regardless of the prevalence, or 
lack thereof, of non-extensibility.

- At the same time, I do think that non-extensibility is a rare case and I 
don't like it.

> But with ES5 Object.preventExtensions, etc., the horse is out of the barn.

It's there and we have to support it, and the fact that you can do 
preventExtensions() to an object is a good thing.  That doesn't mean it should 
become the cornerstone for every new feature.  If a user wants to 
preventExtensions() on their object, then that's totally cool - and I'm not 
arguing that it isn't.

The argument I'm making is a different one: should an object be non-expandable 
by default?
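
For reference, this is what the opt-in looks like on an ordinary object today.  The sketch uses only standard ES5/ES2015 calls (Object.isExtensible, Object.preventExtensions, Reflect.defineProperty); the variable names are illustrative.

```javascript
const point = { x: 1 };
point.y = 2;                                    // extensible by default

const extensibleBefore = Object.isExtensible(point);
Object.preventExtensions(point);                // explicit opt-in by the user
const extensibleAfter = Object.isExtensible(point);

// Adding a new property is now rejected; Reflect.defineProperty reports
// the failure as a boolean instead of throwing.
const added = Reflect.defineProperty(point, "z", {
  value: 3, writable: true, enumerable: true, configurable: true,
});

point.x = 10;                                   // existing properties stay writable
```

The user chose non-extensibility for this one object; nothing about the default changed.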

I keep hearing arguments that this somehow makes typed arrays more efficient.  
That's like arguing that there exists a C compiler, somewhere, that becomes 
more efficient if you label your variables as 'register'.  It's true that if 
you're missing the well-known optimization of register allocation then yes, 
'register' is an optimization.  Likewise, if you're missing the well-known 
object model optimizations like pointer displacement, BBOPs, or other kinds of 
side tables, then forcing objects to be non-extensible is also an optimization.  
That doesn't mean that we should bake it into the language.  VM hackers can 
just implement these well-known optimizations and just deal with it.

> 
> At a deeper level, the primitives wired into the language, boolean number 
> string -- in particular number when considering int64, bignum, etc. -- can be 
> rationalized as value objects provided we make typeof work as people want 
> (and work so as to uphold a == b && typeof a == typeof b <=> a === b).

I think making int64/bignum be primitives is fine.  My only point is that 
whether or not you make them expandable has got nothing to do with how much 
memory they use.

> 
> This seems more winning in how it unifies concepts and empowers users to make 
> more value objects, than the alternative of saying "the primitives are 
> legacy, everything else has reference semantics" and turning a blind eye, or 
> directing harsh and probably ineffective deprecating words, to 
> Object.preventExtensions.

Well, this is all subjective.  Objects being expandable by default is a unifying 
concept.  The only thing that expandability of typed arrays appears to change 
is the interaction with binary data - but that isn't exactly a value object 
system as much as it is a playing-with-bits system.  I'm not sure that having 
oddities there changes much.

> 
> /be

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
