On Tue, 13 Feb 2024 03:55:37 GMT, Stuart Marks <sma...@openjdk.org> wrote:

>> Somewhat surprisingly, `ArrayList$Sublist.sort()` is not specialized and 
>> will thus fall back to slower default method of `List.sort()` instead of 
>> sorting a range of the array in-place in its backing root `ArrayList`. 
>> 
>> This doesn't change observable behavior, so haven't added tests, and `tier1` 
>> tests still all pass except for 
>> `test/jdk/java/util/Locale/LocaleProvidersFormat.java` which also currently 
>> fails on master too on the machine I tested on.
>
> @szegedi Oh interesting catch. Looks like a reasonable optimization. Is this 
> covered by a test, or should a new test be added?
> 
> @JimLaskey The `List.sort` method was added in JDK 8, so not really day one. 
> Prior to that one had to call `Collections.sort` which copied to a temp 
> array, sorted, then copied back; this applied equally to full lists and 
> sublists. Since JDK 8, `ArrayList.sort` on the full list did sorting in place 
> but sorting a subList just inherited the default method (which uses the old 
> copy-to-temp-and-sort technique). My guess is that nobody noticed this 
> because sublists aren't used that much, and sorting of subarrays, while 
> arguably necessary for completeness, probably aren't used all that much 
> either.

@stuart-marks you're right that sorting of sublists is rare. I did run into a 
need for it a few months ago in my day job though, that's how I stumbled upon 
this. Granted, it was a performance optimization and not a correctness 
necessity. See, I have a large list of data I read from an external location. 
Some elements – in at most one identifiable but arbitrarily large stride in the 
large list – need some extra processing. The algorithm for this processing can 
be written to be more efficient if it can presume those elements are sorted.

I could still sort the entire list (as the ordering doesn't matter to the final 
receiver), using a sort that just compares irrelevant elements as equal, so 
they'd remain as large "already sorted" strides, and only apply the real sort 
on only the relevant elements:

static int fullListCompare(Data d1, Data d2) {
  if (isRelevant(d1)) {
    if (isRelevant(d2)) {
      return sublistCompare(d1, d2);
    } else {
      return -1;
    }
  } else if (isRelevant(d2)) {
    return 1;
  }
  return 0;    
}

This'd also have the side effect of moving all the relevant entries to the 
beginning of the list, but while that's not a problem, it's not necessary 
either. For this reason, I prefer to scan the list for the stride, reify it as 
a sublist, and only run a sort with `sublistCompare` on it, and then also do 
the extra processing of the elements on the sublist too.

So yeah, it is rare this pops up, but it does happen :-)

I'm not sure if there's a test that already exercises sublist sorts. I did find 
what seems like a setup for a [TCK test for 
ArrayList](https://github.com/openjdk/jdk/blob/master/test/jdk/java/util/concurrent/tck/ArrayListTest.java)
 that seems like it is specifically creating a test suite for sublists too, but 
I haven't looked more closely into how is that supposed to work. I'm happy to 
write some tests specifically for ArrayList sublist sort if this TCK one 
doesn't exercise it.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17818#issuecomment-1941426905

Reply via email to