[ 
https://issues.apache.org/jira/browse/GROOVY-11649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17949224#comment-17949224
 ] 

Paul King edited comment on GROOVY-11649 at 5/5/25 9:58 AM:
------------------------------------------------------------

When working with sorted aggregates (lists, arrays, etc.) many languages 
provide methods to calculate an index to insert into the aggregate and maintain 
the sort order.

For example, suppose we want to insert the number 1 into the following list:
{code:groovy}
def s = [0, 1, 1, 1, 1, 1, 1, 1, 2, 3, 5, 8]
{code}
We can insert the 1 anywhere from index 1 to index 8 and the sort order will be 
maintained.
In Python, bisect_left(1) returns 1, and bisect_right(1) returns 8.
In C++, lower_bound(1) returns 1, and upper_bound(1) returns 8. Rust provides a 
binary_search which will return some number between 1 and 8. Java has 
Arrays.binarySearch which returns 5 (and can return a negative number to 
indicate that the actual result wasn't found but can then never-the-less be 
used to work out where to insert).

The {{partitionPoint}} method is a generalisation of the above mentioned 
methods. So, for our example:
{code:groovy}
def low = s.partitionPoint{ it < 1 } // equiv to bisect_left/lower_bound
assert low == 1
def high = s.partitionPoint{ it <= 1 } // equiv to bisect_right/upper_bound
assert high == 8
assert Arrays.binarySearch(s, 1) in low..high

// anything above all existing values will always be inserted at the end
assert s.partitionPoint{ it <= 20 } == s.size()
assert (Arrays.binarySearch(s, 20) + 1).abs() == s.size()

// low == high when the element isn't found (there is exactly one spot we 
should insert)
low = s.partitionPoint{ it < 4 }
high = s.partitionPoint{ it <= 4 }
assert low == high
{code}

But partitionPoint is even more general than it might first appear. It doesn't 
require the elements in the aggregate to be sorted, but just partitioned in 
some way. Groovy's sort variants that take a closure allow such partitioning.

As an example, let's sort the numbers in a list into even and odd (using the 
mutating variant of sort):
{code:groovy}
def even = { it % 2 == 0 }
def odd = { it % 2 != 0 }

def list = [4, 7, 15, 12, 6, 3, 5]
list.sort(even)
assert list == [7, 15, 3, 5, 4, 12, 6]
{code}
Now supposed we want to add more numbers into this list. We can add them and 
re-sort, but {{partitionPoint}} let's us know where to add to retain the sorted 
properties:
{code:groovy}
def additions = [1, 6, 8, 7]
additions.each {
    def idx = list.partitionPoint(odd)
    list.add(idx, it)
}
assert list == [7, 15, 3, 5, 1, 7, 8, 6, 4, 12, 6]
{code}
The partitionPoint method is also useful when processing partitioned data:
{code:groovy}
def idx = list.partitionPoint(odd)
assert list[0..<idx].every(odd)
assert list[idx..-1].every(even)
{code}
Does Java offer some better way to do this? It does offer the {{PriorityQueue}} 
which can take a Comparator and this means there is no need for 
{{partitionPoint}} to insert into such a collection, but it is a little 
cumbersome to use:
{code:groovy}
def evenComparator = { a, b -> (a % 2 == 0) <=> (b % 2 == 0) }
def ordered = new PriorityQueue(evenComparator)
ordered.addAll([4, 7, 15, 12, 6, 3, 5])
ordered.addAll(additions)
println ordered.toList() // [7, 1, 15, 4, 7, 3, 5, 12, 6, 8, 6]
while(!ordered.empty) {
    println ordered.poll() // a drain() extension method has been proposed to 
avoid this loop (see GROOVY-11650)
} // 7 1 7 15 3 5 6 6 4 12 8
assert ordered.empty
{code}
The standard iterator ordered doesn't follow the priority. The "poll" method 
can be used but the queue will be exhausted of elements after using it.

What about arrays? We can sort non-primitive arrays (and processing can make 
use of {{partitionPoint}}):
{code:groovy}
Integer[] array = [4, 7, 15, 12, 6, 3, 5]
array.sort(true, evenComparator)
assert array == [7, 15, 3, 5, 4, 12, 6]
idx = array.partitionPoint(odd)
assert array[0..<idx].every(odd)
assert array[idx..-1].every(even)
{code}
But, we currently don't have sort variants for primitive arrays. Arrays.sort is 
available without the comparator, but no variant with comparators exist for 
primitive arrays. This alone doesn't make partitionPoint less useful, but since 
arrays are of fixed size, we also don't have the ability to insert an extra 
element into an array at a given index. I think this makes {{partitionPoint}} 
less useful for arrays in general.


was (Author: paulk):
When working with sorted aggregates (lists, arrays, etc.) many languages 
provide methods to calculate an index to insert into the aggregate and maintain 
the sort order.

For example, suppose we want to insert the number 1 into the following list:
{code:groovy}
def s = [0, 1, 1, 1, 1, 1, 1, 1, 2, 3, 5, 8]
{code}
We can insert the 1 anywhere from index 1 to index 8 and the sort order will be 
maintained.
In Python, bisect_left(1) returns 1, and bisect_right(1) returns 8.
In C++, lower_bound(1) returns 1, and upper_bound(1) returns 8. Rust provides a 
binary_search which will return some number between 1 and 8. Java has 
Arrays.binarySearch which returns 5 (and can return a negative number to 
indicate that the actual result wasn't found but can then never-the-less be 
used to work out where to insert).

The {{partitionPoint}} method is a generalisation of the above mentioned 
methods.
So, for our example, 
{code:groovy}
def low = s.partitionPoint{ it < 1 } // equiv to bisect_left/lower_bound
assert low == 1
def high = s.partitionPoint{ it <= 1 } // equiv to bisect_right/upper_bound
assert high == 8
assert Arrays.binarySearch(s, 1) in low..high

// anything above all existing values will always be inserted at the end
assert s.partitionPoint{ it <= 20 } == s.size()
assert (Arrays.binarySearch(s, 20) + 1).abs() == s.size()

// low == high when the element isn't found (there is exactly one spot we 
should insert)
low = s.partitionPoint{ it < 4 }
high = s.partitionPoint{ it <= 4 }
assert low == high
{code}

But partitionPoint is even more general than it might first appear. It doesn't 
require the elements in the aggregate to be sorted, but just partitioned in 
some way.
Groovy's sort variants that take a closure allow such partitioning.

As an example, let's sort the numbers in a list into even and odd (using the 
mutating variant of sort):
{code:groovy}
def even = { it % 2 == 0 }
def odd = { it % 2 != 0 }

def list = [4, 7, 15, 12, 6, 3, 5]
list.sort(even)
assert list == [7, 15, 3, 5, 4, 12, 6]
{code}
Now supposed we want to add more numbers into this list. We can add them and 
re-sort, but {{partitionPoint}} let's us know where to add to retain the sorted 
properties:
{code:groovy}
def additions = [1, 6, 8, 7]
additions.each {
    def idx = list.partitionPoint(odd)
    list.add(idx, it)
}
assert list == [7, 15, 3, 5, 1, 7, 8, 6, 4, 12, 6]
{code}
The partitionPoint method is also useful when processing partitioned data:
{code:groovy}
def idx = list.partitionPoint(odd)
assert list[0..<idx].every(odd)
assert list[idx..-1].every(even)
{code}
Does Java offer some better way to do this? It does offer the {{PriorityQueue}} 
and this means there is no need for {{partitionPoint}} to insert into such a 
collection, but it is a little cumbersome to use:
{code:groovy}
def evenComparator = { a, b -> (a % 2 == 0) <=> (b % 2 == 0) }
def ordered = new PriorityQueue(evenComparator)
ordered.addAll([4, 7, 15, 12, 6, 3, 5])
ordered.addAll(additions)
println ordered.toList() // [7, 1, 15, 4, 7, 3, 5, 12, 6, 8, 6]
while(!ordered.empty) {
    println ordered.poll()
} // 7 1 7 15 3 5 6 6 4 12 8
assert ordered.empty
{code}
The standard iterator ordered doesn't follow the priority. The "poll" method 
can be used but the queue will be exhausted of elements after using it.

What about arrays? We can sort non-primitive arrays (and processing can make 
use of {{partitionPoint}}):
{code:groovy}
Integer[] array = [4, 7, 15, 12, 6, 3, 5]
array.sort(true, evenComparator)
assert array == [7, 15, 3, 5, 4, 12, 6]
idx = array.partitionPoint(odd)
assert array[0..<idx].every(odd)
assert array[idx..-1].every(even)
{code}
But, we currently don't have sort variants for primitive arrays. Arrays.sort is 
available without the comparator, but no variant with comparators exist. This 
alone doesn't make partitionPoint less useful, but since arrays are of fixed 
size, we also don't have the ability to insert an extra into an array at a 
given index. I think this makes {{partitionPoint}} less useful for arrays in 
general.

> Create partitionPoint extension method variants
> -----------------------------------------------
>
>                 Key: GROOVY-11649
>                 URL: https://issues.apache.org/jira/browse/GROOVY-11649
>             Project: Groovy
>          Issue Type: New Feature
>            Reporter: Paul King
>            Assignee: Paul King
>            Priority: Major
>             Fix For: 5.x
>
>
> See: https://github.com/apache/groovy/pull/2210
> * rust: 
> https://doc.rust-lang.org/std/primitive.slice.html#method.partition_point
> * c++: https://en.cppreference.com/w/cpp/algorithm/ranges/partition_point
> * python: https://docs.python.org/3/library/bisect.html#module-bisect



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to