On Aug 13, 2020, at 3:01 PM, Guy Steele <guy.ste...@oracle.com> wrote:
>
>>
>> On Aug 13, 2020, at 5:37 PM, John Rose <john.r.r...@oracle.com> wrote:
>>
>> On Aug 13, 2020, at 12:39 PM, Guy Steele <guy.ste...@oracle.com> wrote:
>>>
>>> Whereas I can more easily understand that the job of
>>>
>>>     public deconstructor Point(int x, int y) {
>>>         x = this.x;
>>>         y = this.y;
>>>     }
>>>
>>> is to take values out of the object “this” and put them into separate
>>> _variables_, not a new object. (Granted, these variables have a somewhat
>>> new and mysterious existence and character.)
>>
>> And if this mysterious character were something completely unrelated
>> to any other part of the Java language, I’d be more inclined to admit
>> that maybe the missing primitive is some sort of tuple. It might have
>> to be given a status like that of Java arrays to avoid the infinite regress
>> problem you point out.
>>
>> BUT, when I stare at a block of code that is setting some named/typed
>> variables, where those variables must be DA at the end of the block,
>> and then they are to be available in various other scopes (not the
>> current scope, but other API points), THEN I say, “where have I
>> seen that pattern before…?” There is ALREADY a well-developed
>> part of the Java language which covers this sort of workflow
>> (of putting values into a set of required named/typed variables).
>>
>> Of course, it’s a constructor,
>
> Actually, a constructor _body_.

Yep. And it is distinguished from a tuple-based notation in its
reference to (live) named/typed values on exit.

We *could* have used tuples there, by requiring that every (normal)
exit from a constructor must “return multiple values” by specifying
a positional argument package (a tuple) corresponding to all required
(final) field settings. We *could* have observed that something like
`this(a,b,c)`, where the argument list is exactly the required fields,
is a perfectly universal way to commit all required field values to
an object, at the end of its constructor.

Why didn’t we? It would have been more symmetric in some way, to have
the outputs of the constructor leave the block in the same format as
the inputs. One reason is the entities which are already present: The
fields are there, ready and waiting for assignment. Another reason is
surely that tuples would have been the wrong notation for that job.

In a nutshell, positional notations only work well when there are only
a few positions, and named notations, though more verbose, are more
robustly expressive regardless of the number of positions; they also
degrade gracefully when items may be omitted (optional
initialization/binding/assignment).

I think we should (continue to) design for object arities which are
larger than (comfortable) parameter list arities.

> Let us also recall that there is a second well-developed part of the
> Java language that puts values into a set of required named/typed
> variables: method invocation. And its structure and behavior are
> rather different from that of a constructor body.
>
> (more below)
>
> ...
> All of which would seem to suggest Rémi’s multi-value-return minmax
> example as the dual to method invocation:
>
>>> . . . a method minMax that returns both the minimum and the maximum
>>> of an array
>>>     public static (int min, int max) minMax(int[] array) {
>>>
>> Nope. Not going there. I went down this road too, but multiple-return
>> is another one of those “tease” features that looks cool but very
>> quickly looks like glass 80% empty.
>
> Part of the job of method invocation is to take a set of values and
> definitely assign them to a set of variables (the method parameters).
> This could be done with a block that is charged with the task of
> definitely assigning to those variables:
>
>     Math.atan{ x = 2.0; y = 3.0 }
>     myString.substring{ if (weird) { beginIndex = 3; endIndex = 5; }
>         else { beginIndex = 0; endIndex = myString.length(); } }
>
> but for convenience (or for compatibility with C) we provide a
> different mechanism, with different syntax, that in effect uses
> positional tuples. A block-with-assignment mechanism is possible,
> but that’s not Java.
>
> Therefore we will keep re-encountering the question of why positional
> tuples are good Java style for passing several arguments to a method
> but not for returning several values from a method.

That’s a good argument; your code example looks plenty ugly. Surely
positional notation is better for those simple use cases, of well-known
APIs where programmers have committed the order of arguments firmly to
memory.

But there are two reasons “that’s not Java” is not the whole story here.

1. At high arities, positional notations falter, and people ask for
   keyword-based argument notations, because it’s hard to commit to
   memory the order of arguments for every API. Java might answer those
   demands at some point. What we are discussing here could do the job.

2. Java already has a “block of assignments” notation, the constructor
   body. Using that notation elsewhere, rewarding programmers for
   learning that notation by giving them more ways to use it, is a
   legitimate tactic. (Yeah, maybe putting it in an external block,
   outside its class, is “Not Java”; but lambdas were similarly “Not
   Java” at one point; now they are.)
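
As a minimal sketch of point 2, here is the "block of assignments" notation as it exists in Java today. The class `Span` is hypothetical and invented only for illustration; it borrows the `beginIndex`/`endIndex` names and the `weird` flag from the substring example quoted above. Every normal exit from the constructor body must leave both final fields definitely assigned, and the case analysis is written as ordinary control flow over named variables.

```java
// Hypothetical class for illustration only; it does not appear in the thread.
final class Span {
    final int beginIndex;
    final int endIndex;

    Span(String text, boolean weird) {
        // Both branches assign both final fields by name. If either branch
        // omitted one, the compiler would reject the constructor because the
        // field would not be definitely assigned on exit.
        if (weird) {
            beginIndex = 3;
            endIndex = 5;
        } else {
            beginIndex = 0;
            endIndex = text.length();
        }
    }
}
```

For example, `new Span("hello world", false)` leaves `beginIndex` at 0 and `endIndex` at the length of the string, while passing `true` selects the fixed pair 3 and 5.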

The imperative constructor body, with its named assignments, can be
more expressive and compact than a tuple expression. It can be read
piecewise, and the names help the reading (and writing) process.
Conditional control flow can visually reify case analysis for setting
up the field values to be output from the constructor body, without
introducing extra temps.

All this is even more true when we connect up record parameters to
record fields, and allow elision of assignments of the form
`this.x = x`. That amounts to an optionality feature where the
(positional) argument list of a record provides defaults and then the
compact constructor body provides a named argument set (not an ordered
list) of additionally processed values. Tuples are not the right
notation here; it would be less clear code if changing one record
component (say, doing a range clip) required the coder to specify the
adjusted record components as a new argument list. Tuple notations
work OK for two or three items but don’t scale nearly as well as
name-based notations when you have a larger collection of columns to
wrangle.

You could say, well, tuples are better if you are going to specify all
the names in some well-determined order (as is the case with argument
lists, I suppose), because you can drop the noise of the names (they
don’t add anything). Yes, in that case tuples are better. But even for
argument lists there is a place where you really want by-name
arguments, because remembering the order of names is just too hard.
That’s what I mean by positional notation not scaling well to high
arities.

When we are talking about objects, I think we need to design for field
sets that are more numerous than comfortable argument list arities.
The constructor body notation is therefore a better precedent to build
on, for deconstructors and reconstructors, and anything else that has
a transaction on an object-sized scope (bigger than an arg-list sized
scope).
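
The remark about compact record constructors and a range clip can be sketched as follows. The record `Reading` and its component names are hypothetical, chosen only to have more components than a comfortable tuple, and the sketch assumes Java 16 or later, where records are a final feature. The body names only the one component it adjusts; the other four are assigned to their fields implicitly.

```java
// Hypothetical record for illustration only; requires Java 16 or later.
record Reading(String sensor, long timestampMillis, double min, double max, double value) {
    // Compact canonical constructor: the header supplies every component
    // positionally, and the body mentions only what it changes.
    Reading {
        // Range clip on a single component. The parameter is reassigned here;
        // all five parameters are copied into their fields when the body ends.
        value = Math.max(min, Math.min(max, value));
    }
}
```

A tuple-style alternative would have to restate all five components in order just to change one of them, which is the cost the paragraph above describes.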

ADTs like Box and Rational and Point3D don’t support my case very
well, because they amount (at most) to pairs or triples. But if you
get anywhere close to database rows (and I do think we want to scale
out that way), then tuples won’t take us where we want to go, but
transactional blocks on names (that is, constructor bodies suitably
generalized) will take us places, and will make use of mindshare
already present in Java programmers.

Back to the point about “the fields are already there”: While this
may be why constructor bodies are the way they are, I think we could
reconsider the source of the names that are present in what I call a
“transactional block” (with named values falling out the bottom, and
perhaps also falling in the top), starting with deconstructors. These
names could be specified by an argument list for an ad hoc API point,
not the (final non-static) field set of a class.

So an arrow-reversed constructor body is not just a fine way to unpack
the pre-existing fields of a class (that wants to cooperate with
pattern matching). It is a direction in which Java can, maybe, move to
add some benefits of keyword-based calling sequences, without
importing something completely new.

— John