Re: Non-extensibility of Typed Arrays

Filip Pizlo Fri, 30 Aug 2013 11:32:42 -0700

On Aug 30, 2013, at 9:28 AM, Brendan Eich <[email protected]> wrote:


> Hi,
>> Filip Pizlo <mailto:[email protected]>
>> August 28, 2013 11:01 PM
>> Here's the part that gets me, though: what is the value of disallowing named 
>> properties on typed arrays?  Who does this help?
> 
> You've heard about symmetry with struct types (ES6), right? Those do not want 
> expandos. We could break symmetry but at some cost. Too small to worry about? 
> Outweighed by benefits?

It's a fair point.  I don't see where it would break semantics but I'll try to 
do a thought experiment to see if it makes things confusing or inconvenient to 
the programmer.  Whether or not I care depends on the answers to the following 
questions:

1) Is the purpose to simplify programming by allowing you to add static typing?
2) Are we trying to help JITs?
3) Do we just want a sensible way of mapping to binary data?  (For both DOM and 
C-to-JS compilers)

It appears that (1) is a non-goal; if it was a goal then we'd have a different 
aliasing story, we wouldn't have the byteOffset/byteLength/buffer properties, 
and there would be zero discussion of binary layout.  We'd also bake the types 
deeper into the language.  This doesn't simplify programming if you have to 
write code in a bifurcated world with both traditional JS objects (all dynamic, 
objects can point at each other, but the backing stores of objects don't alias 
each other) and binary objects (have some types to describe layout, but can't 
have arbitrary object graphs, and backing stores of distinct objects may alias 
each other).

(2) appears to be a bit more of a pie-in-the-sky dream than a goal.  A decent 
JIT will already recognize idioms where the programmer created an object with a 
clear sequence of fields and then uses that object in a monomorphic way.  Both 
'function Constructor() { this.a = ...; this.b = ...; }' and '{a:..., b:...}' 
will get recognized, through some combination of run-time and compile-time 
analysis, as indicating that the user intends to have a type that has 'a' and 
'b' as fields.  It's true that binary data makes this explicit, but the JIT can 
fall apart in the same way as it does already for normal objects: the 
references to these objects tend to be untyped so the programmer can 
inadvertently introduce polymorphism and lose some (most?) of the benefits.  
Because binary data objects will have potentially aliased backing stores, you 
get the additional problem that you can't do any field-based aliasing analysis: 
for a normal JS object if I know that 'o.a' accesses own-property 'a' and it's 
not a getter/setter; and 'o.b' accesses own-property 'b' and it's not a 
getter/setter - then I know that these two accesses don't alias.  For binary 
data, I don't quite have such a guarantee: 'a' can overlap 'b' in some other 
object.  Also, the fact that a struct type instance might have to know about a 
buffer along with an offset into that buffer introduces a greater object size 
overhead than plain JS objects.  A plain JS object needs roughly two pieces of 
overhead: something to identify the type and a pointer reserved for when you 
store more things into it.  A struct type instance will need roughly three 
pieces of overhead: something to identify the type, a pointer to the buffer, 
and some indication of the offset within that buffer.  The only performance win 
from struct types is probably that it gives you an explicit tuple flattening.  
That's kind of cool but I remember that C# had struct types while Java didn't 
and yet JVMs still killed .NET on any meaningful measure of performance.
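
A minimal sketch of the aliasing point above, assuming nothing beyond the standard typed array constructors. The variable names are illustrative only. Two views over one ArrayBuffer can overlap, whereas two distinct own data properties of a plain object cannot:

```javascript
// Two typed array views over the same ArrayBuffer: their backing stores overlap.
const sharedBuffer = new ArrayBuffer(8);
const whole = new Int32Array(sharedBuffer, 0, 2); // covers bytes 0..7
const tail = new Int32Array(sharedBuffer, 4, 1);  // covers bytes 4..7

// A store through one view is visible through the other.
tail[0] = 42;
const seenThroughWhole = whole[1]; // same four bytes as tail[0]

// Plain object: 'a' and 'b' are distinct own data properties (no getters or
// setters), so a store to one can never change the other.
const o = { a: 1, b: 2 };
o.a = 99;
const bUnchanged = o.b;

console.log(seenThroughWhole, bUnchanged);
```

A compiler looking only at `tail[0] = 42` cannot conclude that `whole[1]` is unaffected unless it also proves the two views do not overlap.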

So it appears that the most realistic goal is (3).  In that case, I can't 
imagine a case where arrays being expandos but struct types being totally 
frozen will make the task of struct mapping to native code any harder.  If 
you're a programmer who doesn't want a typed array to have custom properties, 
then you won't give it custom properties - simple as that.  No need to enforce 
the invariant.
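
As a rough illustration of "simply don't add them", here is a sketch that assumes an ES2015-or-later engine in which typed arrays are extensible by default. The names `samples`, `locked` and `tryAddProperty` are made up for the example. A programmer who does want the invariant enforced on one particular array can opt in explicitly:

```javascript
// A typed array with a named expando holding metadata.
const samples = new Float32Array(4);
samples.unit = "volts";
const extensible = Object.isExtensible(samples);

// Opting in to non-extensibility for one specific (fixed-length) array.
const locked = new Float32Array(4);
Object.preventExtensions(locked);

// In strict mode, adding a property to a non-extensible object throws.
function tryAddProperty(target) {
  "use strict";
  try {
    target.unit = "volts";
    return "added";
  } catch (e) {
    return e instanceof TypeError ? "TypeError" : "other error";
  }
}

const lockedResult = tryAddProperty(locked);
console.log(extensible, samples.unit, lockedResult);
```

Indexed element access on `locked` keeps working as before; only the addition of new properties is rejected.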

> 
> Sfink's point about structured clone is good, except he wrote "structured 
> clone" and then angels cried... tears of blood.
>> 
>> I don't quite buy that this helps users; most of the objects in your program 
>> are going to allow custom properties to be added at any point.  That's kind 
>> of the whole point of programming in a dynamic language.  So having one type 
>> where it's disallowed doesn't help to clarify thinking.
> 
> There are other such types a-coming :-).

And I'll be grumpy about some of those, too. ;-)

>> 
>> I also don't buy that it makes anything more efficient.  We only incur 
>> overhead from named properties if you actually add named properties to a 
>> typed array, and in that case we incur roughly the overhead you'd expect 
>> (those named properties are a touch slower than named properties on normal 
>> objects, and you obviously need to allocate some extra space to store those 
>> named properties).
>> 
> 
> Honest q: couldn't you squeeze one more word out if JSC typed arrays were 
> non-extensible?

I'd love to hear about this from the SM and V8 peeps.  Here's my take.  A typed 
array *must* know about the following bits of information:

T: Its own type.
B: A base pointer (not the buffer but the thing you index off of).
L: Its length.

But that only works if it owns its buffer - that is, it was allocated using, for 
example, "new Int8Array(100)" and you never used the .buffer property.  So in 
practice you also need:

R: Reserved space for a pointer to a buffer.

Now observe that 'R' can be reused for either a buffer pointer or a pointer to 
overflow storage for named properties.  If you have both a buffer and overflow 
storage, you can save room in the overflow storage for the buffer pointer (i.e. 
displace the buffer pointer into the property storage).  We play a slightly 
less ambitious trick, where R either points to overflow storage or NULL.  Most 
typed arrays don't have a .buffer, but once they get one, we allocate overflow 
storage and reserve a slot in there for the buffer pointer.  So you pay *one 
more* word of overhead for typed arrays with buffers even if they don't have 
named properties.  I think that's probably good enough - I mean, in that case, 
you have a freaking buffer object as well so you're not exactly conserving 
memory.
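
The "less ambitious trick" can be pictured with a toy model. This is only a sketch in plain JavaScript, not JSC's actual object model: the class `ToyTypedArrayCell`, its method names, and the shape of the overflow storage are all hypothetical. It shows a single R slot that starts out null and, once a buffer or a named property is requested, points at overflow storage that also has a slot for the buffer:

```javascript
// Toy model of a typed array cell with the four fields T, B, L and R.
class ToyTypedArrayCell {
  constructor(type, length) {
    this.T = type;                       // type identifier
    this.B = new Array(length).fill(0);  // stands in for the base pointer
    this.L = length;                     // length
    this.R = null;                       // overflow storage pointer, or null
  }

  // Lazily allocate overflow storage with a reserved slot for the buffer.
  overflow() {
    if (this.R === null) {
      this.R = { buffer: null, named: new Map() };
    }
    return this.R;
  }

  // Asking for the buffer forces overflow storage and fills the buffer slot.
  get buffer() {
    const storage = this.overflow();
    if (storage.buffer === null) {
      storage.buffer = { bytes: this.B };
    }
    return storage.buffer;
  }

  // Named properties live in the same overflow storage.
  setNamed(name, value) {
    this.overflow().named.set(name, value);
  }

  getNamed(name) {
    return this.R === null ? undefined : this.R.named.get(name);
  }
}

const cell = new ToyTypedArrayCell("Int8", 100);
const rBefore = cell.R;            // still null: no buffer, no named properties
const firstBuffer = cell.buffer;   // allocates overflow storage
cell.setNamed("label", "pixels");  // reuses the same overflow storage
console.log(rBefore, cell.R !== null, cell.getNamed("label"));
```

In this model a cell that never exposes its buffer and never gains a named property carries only the four fields.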

But, using R as a direct pointer to the buffer would be a simple hack if we 
really felt like saving one word when you also already have a separate buffer 
object.

I could sort of imagine going further and using T as a displaced pointer and 
saving an extra word, but that might make type checks more expensive, sometimes.

So let's do the math, on both 32-bit and 64-bit (where 64-bit implies 64-bit 
pointers), to see how big this would be.

32-bit:

T = 4 bytes, B = 4 bytes, L = 4 bytes, R = 4 bytes.  So, you get 16 bytes of 
overhead for most typed arrays, and 20 if you need to use R as an overflow 
storage pointer and displace the buffer pointer into the overflow storage.

64-bit:

T = 8 bytes, B = 8 bytes, L = 4 bytes, R = 8 bytes.  This implies you have 4 
bytes to spare if you want objects 8-byte aligned (we do); we use this for some 
extra bookkeeping.  So you get 32 bytes of overhead for most typed arrays, and 
40 if you need to use R as an overflow storage pointer and displace the buffer 
pointer into the overflow storage.
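
The arithmetic above can be checked with a small sketch. The function name `headerBytes` and the assumption of 4-byte alignment on 32-bit are not from the original message; the field sizes and the 8-byte alignment on 64-bit are:

```javascript
// Per-object overhead in bytes for the T/B/L/R layout described above.
// pointerSize is 4 (32-bit) or 8 (64-bit); L is 4 bytes in both cases.
// displacedBuffer adds one pointer-sized word held in overflow storage.
function headerBytes(pointerSize, displacedBuffer) {
  const T = pointerSize;
  const B = pointerSize;
  const L = 4;
  const R = pointerSize;
  const raw = T + B + L + R;

  // Assumption: objects are aligned to the pointer size (8 bytes on 64-bit,
  // as stated in the text; 4 bytes on 32-bit is a guess that does not
  // change the result).
  const padded = Math.ceil(raw / pointerSize) * pointerSize;

  return padded + (displacedBuffer ? pointerSize : 0);
}

console.log(headerBytes(4, false), headerBytes(4, true)); // 16 20
console.log(headerBytes(8, false), headerBytes(8, true)); // 32 40
```

On 64-bit the raw field total is 28 bytes, and rounding up to 32 is what leaves the 4 spare bytes mentioned above.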

As far as I can tell, this object model compresses typed arrays about as much 
as they could be compressed while also allowing them to be extensible.  The 
downside is that you pay a small penalty for typed arrays that have an "active" 
buffer, in the sense that you either accessed the .buffer property or you 
constructed the typed array using a constructor that takes a buffer as an 
argument.

-Filip


> 
> /be
> 
>> -Filip
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> es-discuss mailing list
>> [email protected]
>> https://mail.mozilla.org/listinfo/es-discuss
>> Oliver Hunt <mailto:[email protected]>
>> August 27, 2013 9:35 AM
>> Existing types with magic index properties (other than Array) just drop 
>> numeric expandos on the floor so it's logically a no-op. Unless there was a 
>> numeric accessor on the prototype (which non-extensibility does not save you 
>> from).
>> 
>> My complaint is that this appears to be removing functionality that has been 
>> present in the majority of shipping TA implementations, assuming from LH's 
>> comment that Chakra supports expandos.
>> 
>> --Oliver
>> 
>> 
>> 
>> Domenic Denicola <mailto:[email protected]>
>> August 27, 2013 9:26 AM
>> I am not aware of all the nuances of the discussion, but as a developer I 
>> would find the behavior for numeric expandos confusing. For a typed array of 
>> length 1024, setting `ta[1023]` would do something completely different from 
>> setting `ta[1024]`. Unlike normal arrays, setting `ta[1024]` would not 
>> change `ta.length`, and presumably `ta[1024]` would not be exposed by the 
>> various iteration facilities.
>> 
>> I would much rather receive a loud error (in strict mode), which will 
>> either alert me to my code being weird, or possibly to my code committing an 
>> off-by-one error.
>> 
>> Oliver Hunt <mailto:[email protected]>
>> August 27, 2013 9:18 AM
>> The current argument for non-extensibility seems to be that Mozilla doesn't 
>> support them.  It sounds like all other engines do.
>> 
>> There are plenty of reasons developers may want expandos - they're generally 
>> useful for holding different kinds of metadata.  By requiring a separate 
>> object to hold that information we're merely making a developer's life 
>> harder.  This is also inconsistent with all other magically-indexable types 
>> in ES and the DOM.
>> 
>> I'm also not sure what the performance gains of inextensibility are; if DH 
>> could expand on this, it would be greatly appreciated.
>> 
>> --Oliver
>> 
>> 
>> 
>> Allen Wirfs-Brock <mailto:[email protected]>
>> August 27, 2013 9:04 AM
>> see meeting notes 
>> https://github.com/rwaldron/tc39-notes/blob/master/es6/2013-07/july-24.md#54-are-typedarray-insances-born-non-extensible
>>  
>> 
>> 
>> 

