I agree with Tagir that supporting more than two Collectors sounds risky. I especially agree that well-typed and well-named accessors are important.
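To illustrate what well-typed and well-named accessors buy you, here is a minimal sketch. It assumes a JDK in which the two-collector combinator is available as Collectors.teeing (JDK 12 and later); CountAndSum and countAndSum are hypothetical names made up for this example.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NamedResultSketch {

    /** Hypothetical result holder: both accessors are typed and named. */
    static final class CountAndSum {
        private final long count;
        private final int sum;

        CountAndSum(long count, int sum) {
            this.count = count;
            this.sum = sum;
        }

        long count() { return count; }
        int sum() { return sum; }
    }

    /** Runs two collectors over the same stream and merges both results into the holder. */
    static CountAndSum countAndSum(Stream<Integer> numbers) {
        // Assumption: Collectors.teeing is available (JDK 12+).
        return numbers.collect(Collectors.teeing(
                Collectors.counting(),                     // first result: Long
                Collectors.summingInt(Integer::intValue),  // second result: Integer
                (count, sum) -> new CountAndSum(count, sum)));
    }

    public static void main(String[] args) {
        CountAndSum result = countAndSum(Stream.of(3, 5, 8));
        System.out.println(result.count() + " elements, sum " + result.sum());
    }
}
```

The caller reads result.count() and result.sum() instead of list.get(0) and list.get(1), so neither a cast nor a positional convention is needed.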
I use the quoted library (jOOL), but I:
- either avoid all those tuple-based functions,
- or I use only Tuple2/Tuple3 and I map the tuple to a dedicated result
  type immediately (with Collectors.collectingAndThen) so that I get
  well-named accessors.

Note that if you need to combine more than two (generally, N)
collectors, you can just call duplexing() N-1 times and use intermediate
result holders, like I did for N=3 in [1]. It may be a bit of
boilerplate, but the only *other* way to do it without tuples in a
well-typed manner for N=3 would be to introduce a new functional
interface like TriFunction<T,U,V,R> as a merger.

That said, I found Brian's line of reasoning about dropping name parts
very convincing, and I really liked the analogy to a 4-way tee in
plumbing.

Finally, here's a summary of the characteristics of the possible result
types for n-ary *heterogeneous* Collector composition:
- List<?> => well-typed: NO, well-named: NO
- n-ary tuple => well-typed: YES, well-named: NO
- custom result holder => well-typed: YES, well-named: YES

Personally, I don't find n-ary *homogeneous* Collector composition all
that useful, but if it were to be added, I agree List<T> would be the
best result type.

Regards,
Tomasz Linkowski

[1] https://stackoverflow.com/a/52211175/2032415

On Sun, Sep 16, 2018 at 11:23 AM, Tagir Valeev <amae...@gmail.com> wrote:

> Hello, Brian!
>
> Regarding more than two collectors. Some libraries definitely have
> analogs (e.g. [1]) which combine more than two collectors. In my
> opinion combining two collectors this way is an upper limit for
> readable code. Especially if you are going to collect to the list, you
> will have a list of untyped and unnamed results which positionally
> correspond to the collectors. If you have more than two collectors to
> combine, writing a separate accumulator class with accept/combine
> methods and creating a collector from scratch would be much easier
> to read and support.
> A good example is IntSummaryStatistics and the
> corresponding summarizingInt collector. It could be emulated combining
> four collectors (maxBy, minBy, summingInt, counting), but having a
> dedicated class IntSummaryStatistics which does all four things
> explicitly is much better. It could be easily reused outside of Stream
> API context, it has well-named and well-typed accessor methods and it
> may contain other domain-specific methods like average(). Imagine if
> it were a List of four elements and you had to call summary.get(1) to
> get a maximum. So I think that supporting more than two collectors
> would encourage obscure programming.
>
> With best regards,
> Tagir Valeev
>
> [1] https://github.com/jOOQ/jOOL/blob/889d87c85ca57bafd4eddd78e0f7ae2804d2ee86/jOOL/src/main/java/org/jooq/lambda/tuple/Tuple.java#L1282
> (don't ask me why!)
>
> On Sat, Sep 15, 2018 at 10:36 PM Brian Goetz <brian.go...@oracle.com> wrote:
> >
> > tl;dr: "Duplexing" is an OK name, though I think `teeing` is less likely
> > to be a name we regret, for reasons outlined below.
> >
> > The behavior of this Collector is:
> >   - duplicate the stream into two identical streams
> >   - collect the two streams with two collectors, yielding two results
> >   - merge the two results into a single result
> >
> > Obviously, a name like `duplexingAndCollectingAndThenMerging`, while
> > entirely accurate and explanatory, is "a bit" unwieldy. So the
> > questions are:
> >   - how much can we drop and still be accurate
> >   - which parts are best to drop.
> >
> > When we pick names, we are not just trying to pick the best name for
> > now, but we should imagine all the possible operations one might ever
> > want to do in the future (names in the JDK are forever) and make a
> > reasonable attempt to imagine whether this could cause confusion or
> > regret in the future.
> >
> > To evaluate "duplexing" here (which seems the most important thing to
> > keep), I'd ask: is there any other reasonable way to imagine a
> > `duplexing` collect operation, now or in the future?
> >
> > One could imagine wanting an operation that takes a stream and produces
> > two streams whose contents are that of the original stream. And
> > "duplex" is a good name for that. But, it is not a Collector; it would
> > be a stream transform, like concat. So that doesn't seem a conflict; a
> > duplexing collector and a duplexing stream transform are sort of from
> > "different namespaces."
> >
> > Can one imagine a "duplexing" Collector that doesn't do any collection?
> > I cannot. Something that returns a pair of streams would not be a
> > Collector, but something else. So dropping AndCollecting seems justified.
> >
> > What about "AndThenMerging"? The purpose of collect is to reduce the
> > stream into a summary description. Can we imagine a duplexing operation
> > that doesn't merge the two results, but instead just returns a tuple of
> > the results? Yes, I can totally imagine this, especially once we have
> > value types and records, which makes returning ad-hoc tuples cheaper
> > (syntactically, heap-wise, CPU-wise.) So I think this is quite a
> > reasonable possibility. But, I would have no problem with an overload
> > that didn't take a merger and returned a tuple of the result, and was
> > still called `duplexing`.
> >
> > So I'm fine with dropping all the extra AndThisAndThat.
> >
> > Finally, there's one other obvious direction we might extend this --
> > more than two collectors. There's no reason why we can only do two; we
> > could take a (likely homogeneous) varargs of Collectors, and return a
> > List of results -- which itself could then be streamed into another
> > collector. This actually sounds pretty useful (though I'm not
> > suggesting doing this right now.)
> > And, I think it would be silly if this
> > were not called the same thing as the two-collector version (just as it
> > would be silly to have separate names for "concat two" and "concat n".)
> >
> > And, this is where I think "duplexing" runs out of gas -- duplex implies
> > "two". Pedantic argue-for-the-sake-of-argument folks might observe that
> > "tee" also has bilateral symmetry, but I don't think you could
> > reasonably argue that a four-way "tee" is not less of an arity abuse
> > than a four-way "duplex", and the plumbing industry would agree:
> >
> > https://www.amazon.com/Way-Tee-PVC-Fitting-Furniture/dp/B017AO2WCM
> >
> > So, for these reasons, I still think "teeing" has a better balance of
> > being both evocative of what it does and likely to stand the test of time.
> >
> > On 9/14/2018 1:09 PM, Stuart Marks wrote:
> > >
> > > First, naming. I think "duplex" as the root word wins! Using
> > > "duplexing" to conform to many of other collectors is fine; so,
> > > "duplexing" is good.
> > >

--
Regards,
Tomasz Linkowski
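P.S. To make the N=3 composition from the top of this message concrete, here is a minimal sketch of applying the two-collector combinator N-1 = 2 times with an intermediate result holder. It assumes the combinator is available as Collectors.teeing (JDK 12 and later); CountAndLength, WordStats and wordStats are hypothetical names for illustration only, and this is not the code from [1].

```java
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ThreeWayTeeSketch {

    /** Hypothetical intermediate holder for the first two results. */
    static final class CountAndLength {
        final long count;
        final int totalLength;

        CountAndLength(long count, int totalLength) {
            this.count = count;
            this.totalLength = totalLength;
        }
    }

    /** Hypothetical final result type with well-named accessors. */
    static final class WordStats {
        private final long count;
        private final int totalLength;
        private final String joined;

        WordStats(long count, int totalLength, String joined) {
            this.count = count;
            this.totalLength = totalLength;
            this.joined = joined;
        }

        long count() { return count; }
        int totalLength() { return totalLength; }
        String joined() { return joined; }
    }

    /** Combines three collectors by nesting the two-collector combinator twice. */
    static Collector<String, ?, WordStats> wordStats() {
        // First application: collectors 1 and 2 merge into the intermediate holder.
        Collector<String, ?, CountAndLength> countAndLength = Collectors.teeing(
                Collectors.counting(),
                Collectors.summingInt(String::length),
                (count, length) -> new CountAndLength(count, length));

        // Second application: the intermediate holder plus collector 3.
        return Collectors.teeing(
                countAndLength,
                Collectors.joining(","),
                (first, joined) -> new WordStats(first.count, first.totalLength, joined));
    }

    public static void main(String[] args) {
        WordStats stats = Stream.of("a", "bb", "ccc").collect(wordStats());
        System.out.println(stats.count() + " words, "
                + stats.totalLength() + " chars: " + stats.joined());
    }
}
```

The intermediate holder is the boilerplate mentioned above; the alternative would be a single three-argument merger, which needs a TriFunction-style interface that the JDK does not provide.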