> On 2 Dec 2024, at 08:39, John Naylor <[email protected]> wrote: > > On Mon, Dec 2, 2024 at 1:12 AM Andrey M. Borodin <[email protected]> wrote: >> >>> On 25 Nov 2024, at 17:50, John Naylor <[email protected]> wrote: >>> >>> I'd like to see the two sort specializations combined >>> into one, using a local comparator that knows when to reverse the >>> comparison result (hope that makes sense). >> >> Sure, please find attached. >> The prototype looks somewhat ugly (we pass bool* ascending instead of bool >> ascending) and it cost us ~2% of performance (on my MacBook Air M3). > > I haven't tried to reproduce, but the comparison function has a > different style for DESC than the tuplesort comparators, and the style > here has worse code density, at least in isolation. You can see the > difference in the link below. I also found a way to make the cmp > reversal branch-free (last example). That may not survive once it's > inlined, of course, or could make things worse, but you can try these > if you like: > > https://godbolt.org/z/nfPMT7Enr
On my machine this test
\timing on
SELECT (sort(arr))[1] FROM arrays_to_sort;\watch 0 c=5
produces
sort_int32_cmp Time: 543.690 ms
sort_int32_cmp_2 Time: 609.019 ms
sort_int32_cmp_4 Time: 612.219 ms
So, I'd stick with sort_int32_cmp. But, perhaps, on Intel we might have
different results.
>
>> But is not more generic.
>
> A lot of the churn is from v1, and made worse by v2, and that seems to
> be from getting rid of the QSORT macro:
>
> @@ -227,9 +228,10 @@ Datum
> sort_asc(PG_FUNCTION_ARGS)
> {
> ArrayType *a = PG_GETARG_ARRAYTYPE_P_COPY(0);
> + bool ascending = true;
>
> CHECKARRVALID(a);
> - QSORT(a, 1);
> + sort_int32(ARRPTR(a), ARRNELEMS(a), &ascending);
> PG_RETURN_POINTER(a);
> }
>
> The macro hides some details -- can we put "ascending" inside there?
Done.
Best regards, Andrey Borodin.
v3-0001-Use-specialized-sort-facilities.patch
Description: Binary data
