On May 23, 2020, at 9:07 AM, Remi Forax <fo...@univ-mlv.fr> wrote:
>
>> 3. Record components could also be normalized in the constructor. E.g.
>> assume record Fraction(int numerator, int denominator) that normalizes
>> the components using GCD in the constructor. Using withers would
>> produce a weird result based on the previous value.
>
> A record is transparent (same API externally and internally) so as a user i
> expect that if i set a record component to 3 using a constructor and then ask
> its value, the result will be 3,
This is not the expectation we set for record constructors, so it is debatable that it would be an expectation for record “withers”. This is why I like to use a special term for “withers”: If an object can be built from scratch by calling its class’s constructor, then it follows that an API point which takes an existing record and builds a new one (with modifications) is a sort of constructor also; I call it a “reconstructor”.

Some context: I’ve been thinking about this for a long time. When we added non-static final variables, with their funny DU/DA rules, I did the design and implementation, and one thing that irritated me was the lack of a way to change just one or two final fields of an object full of finals. At the time final fields were rare, so we thought the user would always prefer to build another object “from scratch”. And it would be expensive enough, since fast GC was not yet a thing, that the user would not want a slick way to do the job. Now with inline types the balance of concerns has shifted, towards objects with *only* final fields, and very low allocation costs, yet we still have a “hole” in our language, corresponding to that old and now-important technical debt.

Back in the ‘90s I didn’t have a crisp idea about the shape of this technical debt, but now I think I do. Which is not to say that I have a crisp idea how to pay it off, but I have some insights I want to share.

The debt shows up when we transform mutable iterators into immutable value-based “cursors”. For those we need a way to say “offset++” inside an active cursor, yielding a new cursor value which is the logical “next state” of the iteration. Physically, that’s a constructor for the cursor class which takes the old cursor and does “offset++” (or the equivalent) in the body of that constructor.
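The cursor shape above can be sketched with today's records (a minimal illustration; ListCursor and its method names are mine, not a proposed API). The point is that next() is "offset++" expressed as a call back into the canonical constructor, yielding the logical next state as a fresh value while the old cursor stays intact:

```java
import java.util.List;

// Sketch of an immutable value-based cursor; next() is a hand-written
// stand-in for the "reconstructor" idea discussed in the text.
record ListCursor<T>(List<T> source, int offset) {
    boolean hasElement() { return offset < source.size(); }
    T element() { return source.get(offset); }
    // Logically "offset++": produces the next cursor state as a new value.
    ListCursor<T> next() { return new ListCursor<>(source, offset + 1); }
}
```

Iterating is then `for (var c = new ListCursor<>(list, 0); c.hasElement(); c = c.next())`, with no hidden mutable state anywhere.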
Because it takes the old cursor and preserves all other fields untouched, it is a very special kind of constructor, which should be called a “reconstructor”, because it reconstructs (a copy of) the old object to match some new modeling condition (an incremented offset).

As a side note, it is telling that the rules of constructors (but of no other methods) allow you to perform local side effects on variables which correspond to fields; of course you commit “this.offset = offset” at some point and shouldn’t dream of modifying “this.offset”, but the constructor is free to change the values before committing them. Compact record constructors make this more seamless. In such settings (which are natural to reconstructors), saying something like “offset++” or “offset -= adjust” before the commit is totally normal.

A reconstructor makes the most sense for an inline type, but it also makes sense for identity types (as long as users are willing to eat the cost of making a new version of the object instead of side-effecting the old version).

> so i don't think normalizing values of a record in the constructor is a good
> idea.
> This issue is independent of with/copy, calling a constructor with the
> results of some accessors of an already constructed gcd will produce weird
> results.

Calling it a reconstructor sets the expectation that those results are not weird at all: you are just getting the usual constructor logic for that particular class, which enforces all of its invariants. Having a “wither” that can “poke” any new value unchecked into an already-checked configuration, without allowing the class to validate the new data, would be the “weird result” in this case. Let’s not do that, and let’s not set that kind of expectation.

Encapsulation means never having anyone else tell you the exact values of your fields. Constructors are the gatekeepers of that encapsulation.
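Concretely, the Fraction record from the quoted mail behaves like this (a sketch; withNumerator is a hand-written stand-in for the proposed feature, and the normalization details are my assumptions). Since the reconstructor routes back through the canonical constructor, the GCD invariant is re-established on every path:

```java
// Compact constructor normalizes the components, so any construction
// route (including a reconstructor) restores the invariant.
record Fraction(int numerator, int denominator) {
    Fraction {
        if (denominator == 0) throw new ArithmeticException("zero denominator");
        int g = gcd(Math.abs(numerator), Math.abs(denominator));
        if (g != 0) { numerator /= g; denominator /= g; }
        if (denominator < 0) { numerator = -numerator; denominator = -denominator; }
    }
    private static int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

    // Hand-written "reconstructor": just the canonical constructor again.
    Fraction withNumerator(int numerator) {
        return new Fraction(numerator, denominator);
    }
}
```

So `new Fraction(2, 4)` stores (1, 2), and `withNumerator` on it yields another fully normalized value; there is no unchecked "poke".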
Even record classes (transparent as they are) are allowed to have opinions about valid and invalid field values, and to reject or modify requests to create instances which are invalid according to the contract of the record class.

>
>> 4. Points 2 and 3 may lead to the conclusion that not every record
>> actually needs copying. In fact, I believe, only a few of them would
>> need it. Adding them automatically would pollute the API and people
>> may accidentally use them. I believe, if any automatic copying
>> mechanism will be added, it should be explicitly enabled for specific
>> records.

An explicitly declared reconstructor would fulfill this goal. A reconstructor, as opposed to a “wither” feature, would also scale from one argument to any number of arguments. This leads to the issue of defining API points which are polymorphic across collections of (statically determined) fields, which is the present point. Note, however, that it has surprisingly deep roots. As soon as we added non-static final fields to Java, we incurred a debt to eventually examine this problem. Time’s up; here we are.

> with/copy calls the canonical constructor at the end, it's not something that
> provide a new behavior, but more a syntactic sugar you provide because
> updating few fields of a record declaring a dozen of components by calling
> the canonical constructor explicitly involve a lot of boilerplate code that
> may hide stupid bugs like the values of two components can be swapped because
> the code called the accessors in the wrong order.

Well, this is an argument for keyword-based constructors also. And for reconstructors as well. (See the connection? It’s 1.25 things here, not 2 things.)

Setting all of the above aside for now, I have one old idea and one new idea about how to smooth out keyword-based calling sequences. These are offered in the spirit of brainstorming.
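The swapped-components hazard in the quoted mail is easy to reproduce (names here are mine, purely for illustration). With no reconstructor, "change just the label" restates every component by hand, and two same-typed components can be silently transposed while everything still compiles:

```java
// Illustration of the boilerplate hazard: restating all components by
// hand lets start/end be swapped without any compile-time complaint.
record Span(int start, int end, String label) {
    Span relabeledBuggy(String label) {
        return new Span(end, start, label);   // oops: start and end swapped
    }
    Span relabeled(String label) {
        return new Span(start, end, label);   // the intended update
    }
}
```

A keyword-based reconstructor would name the one component being changed and leave no room for this transposition.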
The old idea is that, while good old Object… is a fine way to pass stuff around, we could (not now, but later) choose to expand the set of available varargs calls by adding new carrier types as possible varargs bundles. So a key/val/key/val/... sequence could be passed with keys strongly typed as strings (or enum members, for extra checking!) and the vals typed as… well, Object, still. The move needed for such a thing is, I think, simple though somewhat disruptive.

Sketch of design:
- have some way of marking a class A as varargs-capable
- allow a method m to be marked as A-varargs instead of Object[]-varargs (m(…A a)?)
- transform any call to m(a,b,c…) as m(new A(a,b,…))
- use the standard rules for constructor resolution in A
- note that at least one A constructor is probably A-varargs (recursive)
- this allows A’s constructors to do a L-to-R parse of m’s arguments
- A(T,U) and A(T,U,…A) give you Map<T,U> key/val/key/val lists

I’m just putting that out there. We can use Object… for the foreseeable future. An enhanced varargs feature would let us do better type checking, though. It would also allow the varargs carrier (A, not Object…) to be (drum roll, please) an inline type, getting rid of several kinds of technical debt associated with array-based varargs.

Second, here’s a new idea: During the JSR 292 design, we talked about building BSMs which could somehow capture constant (or presumed-constant) argument values and fold them into the target of the call site. Remi, your proposed design for record <strikeout>withers</strikeout> reconstructors could use such a thing. We were (IIRC) uncertain how to do this well, although you may have prototyped something slick, like you often do.
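Absent language support, the A(T,U) / A(T,U,…A) carrier idea can be approximated today with a hand-rolled right-nested carrier (KV and its methods are my names; in the real proposal the compiler would build A from the flat argument list via constructor resolution):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hand-rolled approximation of a typed key/val/key/val varargs carrier:
// keys are strongly typed as T, vals as U, and a method can accept a KV
// chain instead of a loosely-typed Object[].
record KV<T, U>(T key, U val, KV<T, U> rest) {
    KV(T key, U val) { this(key, val, null); }   // terminal pair

    static <T, U> Map<T, U> toMap(KV<T, U> kv) {
        var m = new LinkedHashMap<T, U>();
        for (var p = kv; p != null; p = p.rest()) m.put(p.key(), p.val());
        return m;
    }
}
```

A call like `new KV<>("a", 1, new KV<>("b", 2))` plays the role of the compiler-generated `new A(a, b, …)`; the nesting is the left-to-right parse done by hand.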
This conversation made me revisit the question, and I have a proposal: a new general-purpose BSM combinator which sets a “trap” for the first call to a call site, samples the arguments which are purported to be constant, and then spins a subsidiary call site which “sees” the constants, and patches the latter call site into the former. Various configurations of mutable and constant call sites are possible and useful. A new kind of call site might be desirable, the StableCallSite, which is one that computes its true target on the first call (not at linkage) and thereafter does not allow target changes.

/** Arrange a call site which samples selected arguments
 * on the first call to the call site and calls the indicated bsm
 * to hand-craft a sub-call site based on those arguments.
 * The bsm is called as CallSite subcs = bsm(L,S,MT,ca…,arg…)
 * where the ca values are sampled from the initial dynamic
 * list according to caspec. */
StableCallSite bootstrapWithConstantArguments(L,S,MT,caspec,bsm,arg…)

/** Same as bootstrapWithConstantArguments, but the bsm
 * is called not only the first time, but every time a new argument
 * value is encountered. Arguments are compared with == not equals. */
CallSite bootstrapWithSpeculatedArguments(L,S,MT,caspec,bsm,arg…)

Other variations are possible, using other comparators and also key extractors. Object::getClass is a great key extractor; this gives us the pattern of monomorphic inline caches. For the speculating version, the existing and new targets could be recombined into a decision tree; that requires an extra hook, perhaps a SwitchingCallSite or an MH switch combinator.

Anyway, I’m brainstorming here, but it seems like we might have some MH API work to do that would give us leverage on the wither problem.

It’s probably obvious, but I’ll say it anyway: The “caspec” thingy (a String? “0,2,4”?) would point out the places where the key arguments are placed in the field-polymorphic reconstructor call.
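The trap-and-patch shape can be emulated today on top of the existing MutableCallSite (a sketch with invented names; the proposed bootstrapWithConstantArguments would package this pattern up generically, and a real StableCallSite would additionally forbid retargeting). The first call samples the "presumed constant" argument, asks a secondary bootstrap for a specialized target, and patches it in:

```java
import java.lang.invoke.*;

class TrapDemo {
    static final MethodHandles.Lookup L = MethodHandles.lookup();
    static MutableCallSite cs;
    static int bsmRuns;   // counts how often the secondary bootstrap ran

    // The real operation behind the call site.
    static String concat(String prefix, String suffix) { return prefix + suffix; }

    // Secondary "bsm": given the sampled constant, build a specialized target.
    static MethodHandle subBsm(String constant) throws Exception {
        bsmRuns++;
        MethodHandle m = L.findStatic(TrapDemo.class, "concat",
                MethodType.methodType(String.class, String.class, String.class));
        MethodHandle bound = MethodHandles.insertArguments(m, 0, constant);
        // Keep the call-site type (String,String)String: ignore the sampled arg.
        return MethodHandles.dropArguments(bound, 0, String.class);
    }

    // Trap target: runs only on the first call, samples arg 0, patches the site.
    static String trap(String constant, String rest) throws Throwable {
        cs.setTarget(subBsm(constant));   // later calls bypass the trap
        return (String) cs.getTarget().invokeExact(constant, rest);
    }

    static MethodHandle link() throws Exception {
        cs = new MutableCallSite(L.findStatic(TrapDemo.class, "trap",
                MethodType.methodType(String.class, String.class, String.class)));
        return cs.dynamicInvoker();
    }
}
```

After the first call, the sampled argument is folded into the target and subsequent values of that argument position are ignored, which is exactly the "presumed constant" behavior being described.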
The secondary BSM would take responsibility for building a custom reconstructor MH that takes the non-key (val) arguments and builds the requested record. The primary BSM (bootstrapWithCA) would recede into the background; it’s just a bit of colorless plumbing, having no linkage at all to the record type translation strategy, other than the fact that it’s useful. Maybe there’s a record-specific BSM that wraps the whole magic trick, but it’s a simple combo on top.

The hard part is building the reconstructor factory. I think that should be done in such a way that the record class itself has complete autonomy over the reconstruction. Probably the reconstructor factory should just wire up arguments “foo” where they exist in the reconstructor call, and pass argument “this.bar” where they are not mentioned via keys. This is easy. It’s clunky too, but until the JVM gives us a real way to say the thing directly, it will work. Note that the clunkiness is hidden deep inside the runtime, and can be swapped out (or optimized) when a better technique is available, *without changing the translation strategy*. For records, everything goes through the canonical constructor, including synthesized reconstructors. In the case of *inline* records, the runtime would create a suitable constructor call, and might (if it could prove it correct) use bare “withfield” opcodes to make optimization easier.

I think if we expose the API point as MyRecord::with(Object…) it should be possible to call the thing reflectively, or with variable keywords, or whatever. But javac should detect the common case of non-variable keywords, do some checks, and replace the call site with an indy, for those cases. That way we can have our cake and eat it too. (There are other things we can do beyond that, by slicing up the per-variable concerns from the per-instance concerns in the constructor, leading to better optimizations. This requires as-yet-unknown translation strategy hooks. But this is enough brainstorming for one email.)
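The "wire up arguments where mentioned, pass this.bar otherwise" plan can be sketched reflectively (Reconstruct.with is a hypothetical helper of mine, not a proposed API; a real runtime would spin method handles instead of using core reflection). Note that it always routes through the canonical constructor, so the record class keeps full autonomy over validation and normalization:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.RecordComponent;
import java.util.Map;

class Reconstruct {
    record Pt(int x, int y) { }   // sample record for demonstration

    @SuppressWarnings("unchecked")
    static <R extends Record> R with(R rec, Map<String, Object> updates) throws Exception {
        RecordComponent[] rcs = rec.getClass().getRecordComponents();
        Class<?>[] sig = new Class<?>[rcs.length];
        Object[] args = new Object[rcs.length];
        for (int i = 0; i < rcs.length; i++) {
            sig[i] = rcs[i].getType();
            String name = rcs[i].getName();
            args[i] = updates.containsKey(name)
                    ? updates.get(name)                   // "foo" where a key is given
                    : rcs[i].getAccessor().invoke(rec);   // "this.bar" otherwise
        }
        // Everything goes through the canonical constructor.
        Constructor<?> canonical = rec.getClass().getDeclaredConstructor(sig);
        return (R) canonical.newInstance(args);
    }
}
```

This is the clunky-but-swappable version: the per-record specialization (the indy/BSM machinery above) would cache the equivalent of this loop as a composed method handle.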
— John