WeakMap constructor genericity

Allen Wirfs-Brock Fri, 02 Nov 2012 10:47:50 -0700

On Nov 2, 2012, at 8:54 AM, Jason Orendorff wrote:

> In the draft spec, you can basically turn any arbitrary object into a Map.
> 
>     var obj = new Date;
>     Map.call(obj);
>     Map.prototype.set.call(obj, "x", "y");
>     assert(Map.prototype.get.call(obj, "x") === "y");
> 
> The same object can be a Set too. Why not?
> 
>     Set.call(obj, ["z"]);
>     assert(Set.prototype.has.call(obj, "z"), true);
> 
> This is intended to make Map/Set/WeakMap subclassable, which is fine.  But 
> can we specify that without exposing Map initialization as a primitive that 
> users can apply to arbitrary objects?
> 
> As specified, a single object can have [[MapData]] and [[WeakMapData]] and 
> [[SetData]]. This is a pain to implement, and I don't see the benefit to web 
> developers.


Yes, indeed, although I believe the implementation can be less painful than you 
think.

Before I plunge into this, it may be helpful to review 
http://wiki.ecmascript.org/doku.php?id=strawman:subclassable-builtins 

I think you agree that subclassability of built-ins is a valuable feature and 
that we want to avoid introducing any more non-subclassable built-ins (and, if 
possible, fix the existing ones so they are subclassable). So, it is a matter 
of how we can accomplish that.

Another design principle I've generally applied is that built-in "classes" 
should absolutely minimize their specialness.  We may have a built-in for perf 
or security reasons, to access an external resource that would not otherwise be 
accessible, to bridge to the implementation layer, to optimize runtime 
representations, etc.  But, wherever possible, the standard ES library should 
not be "magic".  It should be "self-hostable" in ECMAScript code.

So, if we follow that design principle in making Map and friends subclassable 
we should do it in a manner that is consistent with what would be done in a 
self-hosted implementation.

One of the characteristics of creating class abstractions in JavaScript 
(whether manually or via a class declaration) is that object allocation is 
separated from object initialization.  The expression, new Foo, first allocates 
an ordinary object and then calls Foo to initialize it. Any specialness of the 
object derives not from its allocation but from the manner in which it is 
initialized.  For example, Foo might place a private symbol keyed property on 
the object to brand it as being a special Foo object.  Or, in the case of a Map 
object the Map constructor might associate special "MapData" internal state 
with the object via a private symbol keyed property. 
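
As a rough sketch of that separation, assuming an ordinary function constructor 
and a unique symbol standing in for a private symbol (fooBrand, Foo and isFoo 
are hypothetical names used only for illustration):

```javascript
// Hypothetical brand key; a unique symbol stands in for a private symbol.
const fooBrand = Symbol("fooBrand");

// Foo only initializes. It never allocates: `new` (or the caller) supplies `this`.
function Foo() {
  Object.defineProperty(this, fooBrand, { value: true });
}

// The brand check looks only at how the object was initialized.
function isFoo(obj) {
  return Object(obj) === obj &&
    Object.prototype.hasOwnProperty.call(obj, fooBrand);
}

// `new Foo` allocates an ordinary object, then calls Foo to initialize it.
const viaNew = new Foo();

// The same two steps performed separately by the caller.
const viaCall = Object.create(Foo.prototype);
Foo.call(viaCall);

console.log(isFoo(viaNew), isFoo(viaCall), isFoo({}));
```

Both objects end up branded in the same way, because the brand comes from the 
call to Foo and not from how the object was allocated.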

If we are going to subclass such a Foo (or Map) object (let's talk about a Bar 
subclass) the Bar constructor function needs to be able to call Foo to 
initialize the instance.  We do this by making a super call to Foo:

class Bar extends Foo {
   constructor() {
        // maybe do some initialization on this here
        super.constructor();  // or just super(); either really means pretty
                              // much the same as Foo.call(this)
        // maybe do some other initialization on this here
   }
}

So, Foo has to be prepared to deal with an arbitrary this object that may have 
already had some initialization performed upon it.  It also means that anybody 
who has visibility of Foo (via either a name binding or via the constructor 
property of a Foo instance) can call Foo to initialize any arbitrary object. 
And there is nothing that prevents someone from calling Foo, Bar, Map, and any 
other constructors all on the same object.  If the initialization actions 
performed by all of the constructors are disjoint then everything should work 
just fine.  If the different constructors interfere with each other you will 
have a buggy object, but this is one of an infinite number of ways to define a 
buggy object.  Also note that calling multiple constructors is not necessarily 
an unreasonable thing to do:  consider, for example, a package that was 
supporting a multiple inheritance layer for JavaScript.
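
To illustrate the disjoint case, here is a minimal sketch with two unrelated 
function constructors applied to one object.  Point, Label and the two symbol 
keys are hypothetical names invented for this example:

```javascript
// Hypothetical keys; each "class" keeps its state under its own symbol.
const pointState = Symbol("pointState");
const labelState = Symbol("labelState");

function Point(x, y) {
  this[pointState] = { x, y };
}
Point.prototype.norm = function () {
  const { x, y } = this[pointState];
  return Math.hypot(x, y);
};

function Label(text) {
  this[labelState] = { text };
}
Label.prototype.describe = function () {
  return "label: " + this[labelState].text;
};

// One object initialized by both constructors. The two initializers touch
// disjoint state, so neither disturbs the other.
const both = {};
Point.call(both, 3, 4);
Label.call(both, "offset");

console.log(Point.prototype.norm.call(both));
console.log(Label.prototype.describe.call(both));
```

If both constructors had stored their state under the same key, the second 
call would have overwritten the first, which is the "buggy object" case.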

So, the possibility that an object may be initialized by multiple constructors 
is an inherent part of JavaScript, and if we are following the 
builtins-aren't-magic principle we should expect this to apply to them as much 
as to any other objects.

There is one way to code classes so that allocation is coupled to 
initialization:  move allocation into the constructor.

class Foo {
   constructor () {
       let self = Object.create(Object.getPrototypeOf(this));
       // initialize self
       return self;
   }
}

However, then a subclass has to be written as:
class Bar extends Foo {
   constructor() {
      let self = super();
      //initialize Bar state using self to reference the instance
      return self;
    }
}

This seems error prone in many ways:  the subclass has to remember to capture 
the result of the super.constructor call;  constructors always need to end with 
an explicit return of self;  you have to avoid using this within constructors;  
you can't make super calls to any methods other than the constructor, etc.  And 
it breaks many multiple inheritance scenarios.  The cure seems worse than the 
disease.  Very few people are going to get screwed up by unintentionally 
initializing some object multiple times.  A lot of people will make mistakes if 
they have to remember to apply the above patterns to their superclass/subclass 
constructors.

I think the bottom line is that the possibility of an object having multiple 
initializers is inherent to JS, whether we are talking about built-in or user 
defined objects.

Now let's talk about the implementation pain.

In JS code, we would avoid conflict among multiple initializers by using a 
unique and/or private symbol to access state that is specific to some "class".  
The same thing can be done at the built-in implementation level, as I discuss 
in the strawman referenced above.  When I specify an internal data property 
such as [[MapData]] in the spec, what I'm saying is that here is private 
internal state that you might internally represent using an own property keyed 
by a private symbol.  You can do something else, but it needs to have the 
dynamic extensibility characteristics of object properties.  In other words, 
you have to be able to add it after you allocate the object.

That probably precludes you from directly using a unique C/C++ data structure 
for such objects.  But there is nothing that stops you from having such a data 
structure and using it as the value of a private symbol keyed property known 
only to you.  This adds the overhead of a property indirection to get at your 
native data structure.  I can think of various optimizations to avoid that 
indirection in most cases, but I'm not sure it would be worth it.
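
A self-hosted sketch of that arrangement might look like the following.  It is 
only an illustration under stated assumptions:  SketchMap, NativeTable, tableOf 
and the mapData symbol are hypothetical names, a unique symbol stands in for a 
private symbol, and NativeTable stands in for whatever native structure an 
engine would actually use.

```javascript
// Hypothetical key standing in for the [[MapData]] internal data property.
const mapData = Symbol("MapData");

// Stand-in for a native data structure known only to the implementation.
// Lookup uses indexOf, so unlike a real Map it cannot find a NaN key.
class NativeTable {
  constructor() { this.keys = []; this.values = []; }
  lookup(key) { return this.keys.indexOf(key); }
}

// The constructor attaches the state after allocation, to whatever `this` is.
function SketchMap() {
  Object.defineProperty(this, mapData, { value: new NativeTable() });
}

// One property indirection to reach the data, plus a check that it is there.
function tableOf(obj) {
  const table = Object(obj) === obj ? obj[mapData] : undefined;
  if (!(table instanceof NativeTable)) {
    throw new TypeError("object was not initialized by SketchMap");
  }
  return table;
}

SketchMap.prototype.set = function (key, value) {
  const table = tableOf(this);
  const i = table.lookup(key);
  if (i < 0) {
    table.keys.push(key);
    table.values.push(value);
  } else {
    table.values[i] = value;
  }
  return this;
};

SketchMap.prototype.get = function (key) {
  const table = tableOf(this);
  const i = table.lookup(key);
  return i < 0 ? undefined : table.values[i];
};

// Mirrors the example from the quoted message: a Date initialized as a map.
const obj = new Date();
SketchMap.call(obj);
SketchMap.prototype.set.call(obj, "x", "y");
console.log(SketchMap.prototype.get.call(obj, "x") === "y");
```

Every method pays for the tableOf lookup, which is the property indirection 
mentioned above.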

So, yes, subclassable built-ins add (or change) some things for built-in 
implementations. There may be a little pain at first, but I think it is worth 
it in the long run.

Allen

> 
> -j
> 
> _______________________________________________
> es-discuss mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/es-discuss


Re: Map/Set/WeakMap constructor genericity
