[ https://issues.apache.org/jira/browse/GROOVY-11649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17949224#comment-17949224 ]
Paul King edited comment on GROOVY-11649 at 5/5/25 9:58 AM: ------------------------------------------------------------ When working with sorted aggregates (lists, arrays, etc.) many languages provide methods to calculate an index to insert into the aggregate and maintain the sort order. For example, suppose we want to insert the number 1 into the following list: {code:groovy} def s = [0, 1, 1, 1, 1, 1, 1, 1, 2, 3, 5, 8] {code} We can insert the 1 anywhere from index 1 to index 8 and the sort order will be maintained. In Python, bisect_left(1) returns 1, and bisect_right(1) returns 8. In C++, lower_bound(1) returns 1, and upper_bound(1) returns 8. Rust provides a binary_search which will return some number between 1 and 8. Java has Arrays.binarySearch which returns 5 (and can return a negative number to indicate that the actual result wasn't found but can then never-the-less be used to work out where to insert). The {{partitionPoint}} method is a generalisation of the above mentioned methods. So, for our example: {code:groovy} def low = s.partitionPoint{ it < 1 } // equiv to bisect_left/lower_bound assert low == 1 def high = s.partitionPoint{ it <= 1 } // equiv to bisect_right/upper_bound assert high == 8 assert Arrays.binarySearch(s, 1) in low..high // anything above all existing values will always be inserted at the end assert s.partitionPoint{ it <= 20 } == s.size() assert (Arrays.binarySearch(s, 20) + 1).abs() == s.size() // low == high when the element isn't found (there is exactly one spot we should insert) low = s.partitionPoint{ it < 4 } high = s.partitionPoint{ it <= 4 } assert low == high {code} But partitionPoint is even more general than it might first appear. It doesn't require the elements in the aggregate to be sorted, but just partitioned in some way. Groovy's sort variants that take a closure allow such partitioning. As an example, let's sort the numbers in a list into even and odd (using the mutating variant of sort): {code:groovy} def even = { it % 2 == 0 } def odd = { it % 2 != 0 } def list = [4, 7, 15, 12, 6, 3, 5] list.sort(even) assert list == [7, 15, 3, 5, 4, 12, 6] {code} Now supposed we want to add more numbers into this list. We can add them and re-sort, but {{partitionPoint}} let's us know where to add to retain the sorted properties: {code:groovy} def additions = [1, 6, 8, 7] additions.each { def idx = list.partitionPoint(odd) list.add(idx, it) } assert list == [7, 15, 3, 5, 1, 7, 8, 6, 4, 12, 6] {code} The partitionPoint method is also useful when processing partitioned data: {code:groovy} def idx = list.partitionPoint(odd) assert list[0..<idx].every(odd) assert list[idx..-1].every(even) {code} Does Java offer some better way to do this? It does offer the {{PriorityQueue}} which can take a Comparator and this means there is no need for {{partitionPoint}} to insert into such a collection, but it is a little cumbersome to use: {code:groovy} def evenComparator = { a, b -> (a % 2 == 0) <=> (b % 2 == 0) } def ordered = new PriorityQueue(evenComparator) ordered.addAll([4, 7, 15, 12, 6, 3, 5]) ordered.addAll(additions) println ordered.toList() // [7, 1, 15, 4, 7, 3, 5, 12, 6, 8, 6] while(!ordered.empty) { println ordered.poll() // a drain() extension method has been proposed to avoid this loop (see GROOVY-11650) } // 7 1 7 15 3 5 6 6 4 12 8 assert ordered.empty {code} The standard iterator ordered doesn't follow the priority. The "poll" method can be used but the queue will be exhausted of elements after using it. What about arrays? We can sort non-primitive arrays (and processing can make use of {{partitionPoint}}): {code:groovy} Integer[] array = [4, 7, 15, 12, 6, 3, 5] array.sort(true, evenComparator) assert array == [7, 15, 3, 5, 4, 12, 6] idx = array.partitionPoint(odd) assert array[0..<idx].every(odd) assert array[idx..-1].every(even) {code} But, we currently don't have sort variants for primitive arrays. Arrays.sort is available without the comparator, but no variant with comparators exist for primitive arrays. This alone doesn't make partitionPoint less useful, but since arrays are of fixed size, we also don't have the ability to insert an extra element into an array at a given index. I think this makes {{partitionPoint}} less useful for arrays in general. was (Author: paulk): When working with sorted aggregates (lists, arrays, etc.) many languages provide methods to calculate an index to insert into the aggregate and maintain the sort order. For example, suppose we want to insert the number 1 into the following list: {code:groovy} def s = [0, 1, 1, 1, 1, 1, 1, 1, 2, 3, 5, 8] {code} We can insert the 1 anywhere from index 1 to index 8 and the sort order will be maintained. In Python, bisect_left(1) returns 1, and bisect_right(1) returns 8. In C++, lower_bound(1) returns 1, and upper_bound(1) returns 8. Rust provides a binary_search which will return some number between 1 and 8. Java has Arrays.binarySearch which returns 5 (and can return a negative number to indicate that the actual result wasn't found but can then never-the-less be used to work out where to insert). The {{partitionPoint}} method is a generalisation of the above mentioned methods. So, for our example, {code:groovy} def low = s.partitionPoint{ it < 1 } // equiv to bisect_left/lower_bound assert low == 1 def high = s.partitionPoint{ it <= 1 } // equiv to bisect_right/upper_bound assert high == 8 assert Arrays.binarySearch(s, 1) in low..high // anything above all existing values will always be inserted at the end assert s.partitionPoint{ it <= 20 } == s.size() assert (Arrays.binarySearch(s, 20) + 1).abs() == s.size() // low == high when the element isn't found (there is exactly one spot we should insert) low = s.partitionPoint{ it < 4 } high = s.partitionPoint{ it <= 4 } assert low == high {code} But partitionPoint is even more general than it might first appear. It doesn't require the elements in the aggregate to be sorted, but just partitioned in some way. Groovy's sort variants that take a closure allow such partitioning. As an example, let's sort the numbers in a list into even and odd (using the mutating variant of sort): {code:groovy} def even = { it % 2 == 0 } def odd = { it % 2 != 0 } def list = [4, 7, 15, 12, 6, 3, 5] list.sort(even) assert list == [7, 15, 3, 5, 4, 12, 6] {code} Now supposed we want to add more numbers into this list. We can add them and re-sort, but {{partitionPoint}} let's us know where to add to retain the sorted properties: {code:groovy} def additions = [1, 6, 8, 7] additions.each { def idx = list.partitionPoint(odd) list.add(idx, it) } assert list == [7, 15, 3, 5, 1, 7, 8, 6, 4, 12, 6] {code} The partitionPoint method is also useful when processing partitioned data: {code:groovy} def idx = list.partitionPoint(odd) assert list[0..<idx].every(odd) assert list[idx..-1].every(even) {code} Does Java offer some better way to do this? It does offer the {{PriorityQueue}} and this means there is no need for {{partitionPoint}} to insert into such a collection, but it is a little cumbersome to use: {code:groovy} def evenComparator = { a, b -> (a % 2 == 0) <=> (b % 2 == 0) } def ordered = new PriorityQueue(evenComparator) ordered.addAll([4, 7, 15, 12, 6, 3, 5]) ordered.addAll(additions) println ordered.toList() // [7, 1, 15, 4, 7, 3, 5, 12, 6, 8, 6] while(!ordered.empty) { println ordered.poll() } // 7 1 7 15 3 5 6 6 4 12 8 assert ordered.empty {code} The standard iterator ordered doesn't follow the priority. The "poll" method can be used but the queue will be exhausted of elements after using it. What about arrays? We can sort non-primitive arrays (and processing can make use of {{partitionPoint}}): {code:groovy} Integer[] array = [4, 7, 15, 12, 6, 3, 5] array.sort(true, evenComparator) assert array == [7, 15, 3, 5, 4, 12, 6] idx = array.partitionPoint(odd) assert array[0..<idx].every(odd) assert array[idx..-1].every(even) {code} But, we currently don't have sort variants for primitive arrays. Arrays.sort is available without the comparator, but no variant with comparators exist. This alone doesn't make partitionPoint less useful, but since arrays are of fixed size, we also don't have the ability to insert an extra into an array at a given index. I think this makes {{partitionPoint}} less useful for arrays in general. > Create partitionPoint extension method variants > ----------------------------------------------- > > Key: GROOVY-11649 > URL: https://issues.apache.org/jira/browse/GROOVY-11649 > Project: Groovy > Issue Type: New Feature > Reporter: Paul King > Assignee: Paul King > Priority: Major > Fix For: 5.x > > > See: https://github.com/apache/groovy/pull/2210 > * rust: > https://doc.rust-lang.org/std/primitive.slice.html#method.partition_point > * c++: https://en.cppreference.com/w/cpp/algorithm/ranges/partition_point > * python: https://docs.python.org/3/library/bisect.html#module-bisect -- This message was sent by Atlassian Jira (v8.20.10#820010)