Ok, teeing. Webrev updated:
http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r6/
CSR updated accordingly:
https://bugs.openjdk.java.net/browse/JDK-8209685
With best regards,
Tagir Valeev.

On Fri, Sep 21, 2018 at 8:26 PM Brian Goetz <brian.go...@oracle.com> wrote:
>
> The example of ISS is a good one. It is analogous to the question of
> "when is it right to write a class, and when is it right to write a
> function?" And the answer is, of course, "it depends." ISS was an
> obvious grouping, but even there, there was significant disagreement
> during its design about what it should support and not (especially with
> regard to sum-of-squares calculations), and extra work done to make it
> extensible. If you're writing from scratch, you might well consider
> writing something like ISS.
>
> But ... the whole motivation for having "teeing" _at all_ is that you
> have some existing collectors you want to reuse! It seems a little
> silly to claim "I definitely will want to reuse two collectors, so much
> so that we need a new method, but can't imagine ever wanting to reuse
> three."
>
> So, while I am not saying we have to solve the N-way problem now, I
> think we'd be silly to pick a naming scheme that falls apart when we try
> to go past two. So I'm still at "teeing". It works for two, and it
> works for larger numbers as well.
>
> On 9/16/2018 5:23 AM, Tagir Valeev wrote:
> > Hello, Brian!
> >
> > Regarding more than two collectors. Some libraries definitely have
> > analogs (e.g. [1]) which combine more than two collectors. In my
> > opinion, combining two collectors this way is an upper limit for
> > readable code. Especially if you are going to collect to a list, you
> > will have a list of untyped and unnamed results which positionally
> > correspond to the collectors. If you have more than two collectors to
> > combine, writing a separate accumulator class with accept/combine
> > methods and creating a collector from scratch would be much easier
> > to read and support. A good example is IntSummaryStatistics and the
> > corresponding summarizingInt collector.
> > It could be emulated by combining
> > four collectors (maxBy, minBy, summingInt, counting), but having a
> > dedicated class IntSummaryStatistics which does all four things
> > explicitly is much better. It can be easily reused outside of the Stream
> > API context, it has well-named and well-typed accessor methods and it
> > may contain other domain-specific methods like getAverage(). Imagine if
> > it were a List of four elements and you had to call summary.get(1) to
> > get a maximum. So I think that supporting more than two collectors
> > would encourage obscure programming.
> >
> > With best regards,
> > Tagir Valeev
> >
> > [1]
> > https://github.com/jOOQ/jOOL/blob/889d87c85ca57bafd4eddd78e0f7ae2804d2ee86/jOOL/src/main/java/org/jooq/lambda/tuple/Tuple.java#L1282
> > (don't ask me why!)
> >
> > On Sat, Sep 15, 2018 at 10:36 PM Brian Goetz <brian.go...@oracle.com> wrote:
> >> tl;dr: "Duplexing" is an OK name, though I think `teeing` is less likely
> >> to be a name we regret, for reasons outlined below.
> >>
> >>
> >> The behavior of this Collector is:
> >>   - duplicate the stream into two identical streams
> >>   - collect the two streams with two collectors, yielding two results
> >>   - merge the two results into a single result
> >>
> >> Obviously, a name like `duplexingAndCollectingAndThenMerging`, while
> >> entirely accurate and explanatory, is "a bit" unwieldy. So the
> >> questions are:
> >>   - how much can we drop and still be accurate
> >>   - which parts are best to drop.
> >>
> >> When we pick names, we are not just trying to pick the best name for
> >> now, but we should imagine all the possible operations one might ever
> >> want to do in the future (names in the JDK are forever) and make a
> >> reasonable attempt to imagine whether this could cause confusion or
> >> regret in the future.
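
For readers following the thread, the three steps listed above (duplicate, collect with two collectors, merge) can be sketched as follows. This is only an illustration, assuming a JDK (12 or later) in which the method was added as `Collectors.teeing` with two downstream collectors and a merger function; the class name `TeeingSketch` and the helper `average` are hypothetical and do not come from the webrev.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TeeingSketch {

    // Hypothetical helper: computes an average by reusing two existing
    // collectors and merging their results.
    static double average(List<Integer> values) {
        return values.stream().collect(
            Collectors.teeing(
                Collectors.summingInt(Integer::intValue), // first result: the sum
                Collectors.counting(),                    // second result: the count
                // merger: combine the two results into a single result
                (sum, count) -> count == 0 ? 0.0 : (double) sum / count));
    }

    public static void main(String[] args) {
        System.out.println(average(List.of(1, 2, 3, 4))); // prints 2.5
    }
}
```

Each element is passed to both downstream collectors in a single pass, so nothing is actually streamed twice; "duplicate the stream" describes the observable behavior rather than the mechanics.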
> >>
> >> To evaluate "duplexing" here (which seems the most important thing to
> >> keep), I'd ask: is there any other reasonable way to imagine a
> >> `duplexing` collect operation, now or in the future?
> >>
> >> One could imagine wanting an operation that takes a stream and produces
> >> two streams whose contents are those of the original stream. And
> >> "duplex" is a good name for that. But, it is not a Collector; it would
> >> be a stream transform, like concat. So that doesn't seem a conflict; a
> >> duplexing collector and a duplexing stream transform are sort of from
> >> "different namespaces."
> >>
> >> Can one imagine a "duplexing" Collector that doesn't do any collection?
> >> I cannot. Something that returns a pair of streams would not be a
> >> Collector, but something else. So dropping AndCollecting seems justified.
> >>
> >> What about "AndThenMerging"? The purpose of collect is to reduce the
> >> stream into a summary description. Can we imagine a duplexing operation
> >> that doesn't merge the two results, but instead just returns a tuple of
> >> the results? Yes, I can totally imagine this, especially once we have
> >> value types and records, which make returning ad-hoc tuples cheaper
> >> (syntactically, heap-wise, CPU-wise.) So I think this is quite a
> >> reasonable possibility. But, I would have no problem with an overload
> >> that didn't take a merger and returned a tuple of the results, and was
> >> still called `duplexing`.
> >>
> >> So I'm fine with dropping all the extra AndThisAndThat.
> >>
> >> Finally, there's one other obvious direction we might extend this --
> >> more than two collectors. There's no reason why we can only do two; we
> >> could take a (likely homogeneous) varargs of Collectors, and return a
> >> List of results -- which itself could then be streamed into another
> >> collector. This actually sounds pretty useful (though I'm not
> >> suggesting doing this right now.)
> >> And, I think it would be silly if this
> >> were not called the same thing as the two-collector version (just as it
> >> would be silly to have separate names for "concat two" and "concat n".)
> >>
> >> And, this is where I think "duplexing" runs out of gas -- duplex implies
> >> "two". Pedantic argue-for-the-sake-of-argument folks might observe that
> >> "tee" also has bilateral symmetry, but I don't think you could
> >> reasonably argue that a four-way "tee" is not less of an arity abuse
> >> than a four-way "duplex", and the plumbing industry would agree:
> >>
> >> https://www.amazon.com/Way-Tee-PVC-Fitting-Furniture/dp/B017AO2WCM
> >>
> >> So, for these reasons, I still think "teeing" has a better balance of
> >> being both evocative of what it does and likely to stand the test of time.
> >>
> >>
> >>
> >>
> >> On 9/14/2018 1:09 PM, Stuart Marks wrote:
> >>> First, naming. I think "duplex" as the root word wins! Using
> >>> "duplexing" to conform to many of the other collectors is fine; so,
> >>> "duplexing" is good.
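
The N-way direction discussed in the thread (a varargs of homogeneous collectors returning a List of results) might look roughly like the sketch below. No such method is proposed in the webrev; `teeingAll`, `Slot`, and `demo` are hypothetical names used only for illustration, and the design is one possible approach built on the standard `Collector.of` factory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TeeingAllSketch {

    // Pairs one downstream collector with its own accumulation container.
    private static final class Slot<T, A, R> {
        final Collector<T, A, R> collector;
        A container;

        Slot(Collector<T, A, R> collector) {
            this.collector = collector;
            this.container = collector.supplier().get();
        }

        void accept(T t) { collector.accumulator().accept(container, t); }

        void mergeFrom(Slot<T, ?, R> other) {
            // Safe as long as both slots were created from the same collector.
            @SuppressWarnings("unchecked")
            A otherContainer = (A) other.container;
            container = collector.combiner().apply(container, otherContainer);
        }

        R finish() { return collector.finisher().apply(container); }
    }

    // Captures the wildcard accumulation type of a collector.
    private static <T, A, R> Slot<T, A, R> slot(Collector<T, A, R> c) {
        return new Slot<>(c);
    }

    // Hypothetical N-way variant: results come back positionally in a List.
    @SafeVarargs
    static <T, R> Collector<T, ?, List<R>> teeingAll(Collector<T, ?, R>... collectors) {
        List<Collector<T, ?, R>> downstreams = List.of(collectors);
        return Collector.<T, List<Slot<T, ?, R>>, List<R>>of(
            () -> {
                List<Slot<T, ?, R>> slots = new ArrayList<>();
                for (Collector<T, ?, R> c : downstreams) slots.add(slot(c));
                return slots;
            },
            (slots, t) -> slots.forEach(s -> s.accept(t)), // feed every downstream
            (left, right) -> {
                for (int i = 0; i < left.size(); i++) left.get(i).mergeFrom(right.get(i));
                return left;
            },
            slots -> {
                List<R> results = new ArrayList<>();
                for (Slot<T, ?, R> s : slots) results.add(s.finish());
                return results;
            });
    }

    static List<Integer> demo() {
        Collector<Integer, ?, Integer> sum = Collectors.summingInt(Integer::intValue);
        Collector<Integer, ?, Integer> min = Collectors.reducing(Integer.MAX_VALUE, Integer::min);
        Collector<Integer, ?, Integer> max = Collectors.reducing(Integer.MIN_VALUE, Integer::max);
        return Stream.of(3, 1, 4, 1, 5).collect(teeingAll(sum, min, max));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [14, 1, 5]
    }
}
```

The caller has to remember that index 0 is the sum, 1 the minimum, and 2 the maximum, which is the readability concern raised earlier in the thread; the same List could, as suggested, be streamed into a further collector.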