This is an automated email from the ASF dual-hosted git repository. paulk pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/groovy-website.git
commit 480e04f9cbeaad418e25da5862d831d77a440d16
Author: Paul King <[email protected]>
AuthorDate: Mon Dec 1 18:32:17 2025 +1000

    add groupByMany blog post
---
 site/src/site/blog/fruity-eclipse-grouping.adoc | 228 ++++++++++++++++++++++++
 1 file changed, 228 insertions(+)

diff --git a/site/src/site/blog/fruity-eclipse-grouping.adoc b/site/src/site/blog/fruity-eclipse-grouping.adoc
new file mode 100644
index 0000000..a29e922
--- /dev/null
+++ b/site/src/site/blog/fruity-eclipse-grouping.adoc
@@ -0,0 +1,228 @@
+= Grouping Fruity Collections
+Paul King
+:revdate: 2025-12-01T10:45:00+00:00
+:keywords: eclipse collections, groovy, emoji, many-to-many, grouping
+:description: This post looks at using grouping in the context of many-to-many relationships.
+
+Two previous blog posts have been inspired by the use of emojis in combination
+with collections of fruit or pets as per a related https://github.com/eclipse/eclipse-collections-kata/tree/master/top-methods-kata-solutions[Eclipse Collections kata].
+Those posts had a bit of fun
+https://groovy.apache.org/blog/deep-learning-and-eclipse-collections[using deep learning],
+and https://groovy.apache.org/blog/fruity-eclipse-collections[clustering using k-means]
+to do image recognition and color guessing based on emojis.
+
+This post doesn't look at machine learning topics, but instead looks at
+what might seem like a more mundane task, but one that is very common: grouping.
+Grouping occurs naturally in any dataset where relationships exist.
+
+The `groupBy` method does the job perfectly for one-to-many relationships.
+It appears as one of the "top 25" methods in the previously mentioned Eclipse Collections kata,
+and is one of the examples carried over into the aforementioned Groovy blog posts:
+
+[source,groovy]
+----
+assert Fruit.ALL.groupBy(Fruit::getColor) ==
+    Multimaps.mutable.list.empty()
+        .withKeyMultiValues(RED, Fruit.of('🍎'), Fruit.of('🍒'))
+        .withKeyMultiValues(ORANGE, Fruit.of('🍑'), Fruit.of('🍊'))
+        .withKeyMultiValues(YELLOW, Fruit.of('🍌'))
+        .withKeyMultiValues(MAGENTA, Fruit.of('🍇'))
+----
+
+We have a collection of fruit and we have colors.
+The fruit is an enum. The colors are `java.awt.Color` values.
+We won't show the details of those but see the previous blog posts or the
+associated GitHub repo if you want all the details.
+
+As originally presented,
+there is a one-to-many relationship between fruit and their color.
+A fruit has one color, but there can be many fruit of a particular color.
+That's what `groupBy` allows us to explore.
+
+In this post, we want to explore grouping in the context of many-to-many relationships.
+
+As an example, suppose now that, rather than just coming in one _typical_ color
+(in our case the predominant color of the supplied emoji), multiple colors might be possible for
+any given fruit: red and green apples, a green unripe banana, and so forth.
+So, let's expand the previous example so that rather than calling `getColor` to find _the_ color,
+we'll call `getColors` to find a list of potential colors.
+
+We'll do that first using Eclipse Collections and then look at various possibilities
+for JDK collections when using Groovy.
+
+== Grouping Eclipse Collections Fruit Salad
+
+We saw earlier that we could do some grouping using the `groupBy` method offered
+by Eclipse Collections classes implementing the `RichIterable` interface.
+In fact, if we combine `groupBy` with some other methods like `flatCollect` (_flatMap_) or the inject
+family of methods, we'd be able to represent the many-fruit-to-many-color relationship that we now
+want to explore. However, the `groupByEach` method combines several steps into just one
+and that's what we'll use here.
+
+The `groupByEach` method didn't make the cut for the "top 25" methods in the Eclipse
+Collections kata but is exactly what we need for a many-to-many relationship
+(many fruit and many potential colors).
+
+For our example, we'll add `GREEN` as a possible color for apples, bananas (unripe), and grapes.
+Here is how we can explore the relationship between fruit and colors with these additions:
+
+[source,groovy]
+----
+assert Fruit.ALL.groupByEach(Fruit::getColors) ==
+    Multimaps.mutable.list.empty()
+        .withKeyMultiValues(GREEN, Fruit.of('🍎'), Fruit.of('🍌'), Fruit.of('🍇'))
+        .withKeyMultiValues(RED, Fruit.of('🍎'), Fruit.of('🍒'))
+        .withKeyMultiValues(ORANGE, Fruit.of('🍑'), Fruit.of('🍊'))
+        .withKeyMultiValues(YELLOW, Fruit.of('🍌'))
+        .withKeyMultiValues(MAGENTA, Fruit.of('🍇'))
+----
+
+Grape colors are sometimes known by the juice or wine they produce (red and white)
+and sometimes by their skin color, green and purple or magenta. We'll stick with the
+latter for the purposes of this post.
+
+== Grouping JDK Collections Fruit Salad
+
+If we want to achieve the same thing for JDK collections, we have a few options.
+We could consider stream functionality but, like with Eclipse Collections, there
+are ways to achieve what we want building on the fairly widely known `groupBy`
+functionality offered by Groovy. We'll explore that next, but first let's
+look at our expected result.
+It will be similar to what we had for Eclipse
+Collections but just using normal JDK lists and maps:
+
+[source,groovy]
+----
+var expected = [
+    (GREEN)   : [Fruit.of('🍎'), Fruit.of('🍌'), Fruit.of('🍇')],
+    (RED)     : [Fruit.of('🍎'), Fruit.of('🍒')],
+    (ORANGE)  : [Fruit.of('🍑'), Fruit.of('🍊')],
+    (YELLOW)  : [Fruit.of('🍌')],
+    (MAGENTA) : [Fruit.of('🍇')]
+]
+----
+
+Note that Groovy defaults to keys being String values in its literal map notation,
+so we use round brackets around key values so that Groovy will use keys with `java.awt.Color`
+values like we have been using in earlier examples.
+
+Now, we can find fruits by color _simply_ by using `groupBy`
+in combination with `collectMany` (_flatMap_) and `collectEntries` (or we could use `inject`):
+
+[source,groovy]
+----
+assert expected == Fruit.values()
+    .collectMany(f -> f.colors.collect{ c -> [c, f] })
+    .groupBy{ c, f -> c }
+    .collectEntries{ k, v -> [k, v*.get(1)] }
+----
+
+This works well but isn't necessarily obvious at first glance.
+
+== Grouping JDK Collections Fruit Salad with GQuery
+
+Dealing with many-to-many relationships is very common in database systems.
+Query languages like SQL have special support for querying such relationships.
+It should come as no surprise then, that Groovy's integrated query technology,
+GQuery (groovy-ginq), would also support such relationships.
+
+Here is the same example again using GQuery:
+
+[source,groovy]
+----
+assert expected == GQL {
+    from f in Fruit.values()
+    crossjoin c in Fruit.values()*.colors.sum().toSet()
+    where c in f.colors
+    groupby c
+    select c, list(f)
+}.collectEntries()
+----
+
+The `crossjoin` gives us the cross-product and the `where` and `groupby` clauses
+select the desired elements. The `collectEntries` at the end converts from GQuery's
+tabular format to the map used for our expectation.
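+
+To make those clauses a little more concrete, here is a rough sketch of the same
+steps using only plain Groovy collection methods. It is not how GQuery works internally,
+and the `colorsOf` map is hypothetical stand-in data (fruit and color names as strings)
+so that the snippet runs on its own without the `Fruit` enum:
+
+[source,groovy]
+----
+// Hypothetical stand-in data: fruit name -> possible color names
+def colorsOf = [
+    apple : ['red', 'green'],
+    banana: ['yellow', 'green'],
+    cherry: ['red']
+]
+
+// The set of all known colors, like the right-hand side of the crossjoin
+def allColors = colorsOf.values().flatten().toSet()
+
+// crossjoin: every [fruit, color] pair
+def pairs = colorsOf.keySet().collectMany { f -> allColors.collect { c -> [f, c] } }
+
+// where: keep only the pairs where the color is valid for the fruit
+def valid = pairs.findAll { f, c -> c in colorsOf[f] }
+
+// groupby c, then select c, list(f): color -> list of fruit
+def byColor = valid.groupBy { f, c -> c }
+                   .collectEntries { c, fcs -> [c, fcs*.first()] }
+
+assert byColor == [
+    red   : ['apple', 'cherry'],
+    green : ['apple', 'banana'],
+    yellow: ['banana']
+]
+----
+
+The cross-product is built eagerly here, which is fine for a handful of fruit
+but would be wasteful for large collections.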
+
+== Grouping JDK Collections Fruit Salad in Groovy 6
+
+Inspired by the `groupByEach` method from
+https://www.eclipse.org/collections/[Eclipse Collections] and the examples in the
+https://www.amazon.com/Eclipse-Collections-Categorically-Level-programming/dp/B0DZVK69D3[Eclipse Collections Categorically book],
+the Groovy team has recently added a `groupByMany` method. This is in Groovy 6,
+which is still in the alpha/snapshot stage of evolution, so it is subject to change.
+
+Using `groupByMany` our example becomes:
+
+[source,groovy]
+----
+assert expected == Fruit.values().groupByMany(Fruit::getColors)
+----
+
+That was easy! And that's the appeal of adding this method to Groovy.
+
+Let's look at some other variations. One variant takes a second closure which
+allows the value to be transformed (mapped). In our case, we'll just get the emoji
+representation for our fruit rather than the enum used in previous examples:
+
+[source,groovy]
+----
+assert Fruit.values().groupByMany(Fruit::getEmoji, Fruit::getColors) == [
+    (GREEN)   : ['🍎', '🍌', '🍇'],
+    (RED)     : ['🍎', '🍒'],
+    (ORANGE)  : ['🍑', '🍊'],
+    (YELLOW)  : ['🍌'],
+    (MAGENTA) : ['🍇']
+]
+----
+
+In more typical cases, you might have domain classes and map from some
+domain class, e.g. `Person`, to some desired value, e.g. a String `name` for the person.
+
+As another example, let's group the fruit by the vowels they contain:
+
+[source,groovy]
+----
+var vowels = 'AEIOU'.toSet()
+var vowelsOf = { String word -> word.toSet().intersect(vowels) }
+assert Fruit.values().groupByMany(Fruit::getEmoji, fruit -> vowelsOf(fruit.name())) == [
+    A: ['🍎', '🍑', '🍌', '🍊', '🍇'],
+    E: ['🍎', '🍑', '🍒', '🍊', '🍇'],
+    O: ['🍊']
+]
+----
+
+Our `ORANGE` fruit makes all three lists. `BANANA` and `CHERRY` are just
+in one list. The other fruit all have both `A` and `E`.
+
+There is also a variant of `groupByMany` taking no parameters.
+It caters for Maps where the value is already a list of the appropriate keys.
+As an example, suppose we want to buy fruit locally.
+I'll roughly base this on subtropical Brisbane, but you could modify as appropriate
+if you are interested.
+We might now be interested in knowing when seasonal fruit will be available:
+
+[source,groovy]
+----
+var availability = [
+    '🍎': ['Spring'],
+    '🍌': ['Spring', 'Summer', 'Autumn', 'Winter'],
+    '🍇': ['Spring', 'Autumn'],
+    '🍒': ['Autumn'],
+    '🍑': ['Spring']
+]
+
+assert availability.groupByMany() == [
+    Winter: ['🍌'],
+    Autumn: ['🍌', '🍇', '🍒'],
+    Summer: ['🍌'],
+    Spring: ['🍎', '🍌', '🍇', '🍑']
+]
+----
+
+[Sorry U.S. folks, _Autumn_ has the same number of letters as the other season names
+and makes the last map look prettier - _Fall_ just didn't cut it this time!]
+
+== Further information
+
+* Repo with example code: https://github.com/paulk-asert/fruity-eclipse-collections
+* Eclipse Collections homepage: https://www.eclipse.org/collections/
+* Eclipse Collections Categorically: https://www.amazon.com/Eclipse-Collections-Categorically-Level-programming/dp/B0DZVK69D3
