Hi,

On 2024-02-07 16:21:24 -0600, Nathan Bossart wrote:
> On Wed, Feb 07, 2024 at 01:48:57PM -0800, Andres Freund wrote:
> > Now, in most cases this won't matter, the sorting isn't performance
> > critical. But I don't think it's a good idea to standardize on a generally
> > slower pattern.
> >
> > Not that that's a good test, but I did quickly benchmark [1] this with
> > intarray. There's about a 10% difference in performance between using the
> > existing compASC() and one using
> > return (int64) *(const int32 *) a - (int64) *(const int32 *) b;
> >
> >
> > Perhaps we could have a central helper for this somewhere?
>
> Maybe said helper could use __builtin_sub_overflow() and fall back to the
> slow "if" version only if absolutely necessary.

I suspect that'll be worse code in the common case, given the cmov generated
by gcc & clang for the typical branch-y formulation. But it's worth testing.

> The assembly for that looks encouraging, but I still need to actually test
> it...

Possible. For 16bit upcasting to 32bit is clearly the best way. For 32 bit
that doesn't work, given the 32bit return, so we need something more.

Greetings,

Andres Freund
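
For reference, a rough sketch of the comparator shapes discussed above. This
is not code from any patch in this thread: the function names are made up for
illustration, int16_t/int32_t stand in for int16/int32, and the
__builtin_sub_overflow() variant assumes gcc or clang. The 16-bit version
relies on int being at least 32 bits wide. For 32-bit inputs a plain
subtraction cannot be returned as an int: INT32_MAX minus INT32_MIN is
2^32 - 1, which does not fit, hence the other three variants.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* 16-bit: widen to int and subtract; the difference always fits
 * (assuming int is at least 32 bits wide). */
int
cmp_int16(const void *a, const void *b)
{
    return (int) *(const int16_t *) a - (int) *(const int16_t *) b;
}

/* 32-bit, branch-y formulation; compilers often emit cmov/setcc here. */
int
cmp_int32_branchy(const void *a, const void *b)
{
    int32_t x = *(const int32_t *) a;
    int32_t y = *(const int32_t *) b;

    if (x < y)
        return -1;
    if (x > y)
        return 1;
    return 0;
}

/* 32-bit, written without explicit branches. */
int
cmp_int32_branchless(const void *a, const void *b)
{
    int32_t x = *(const int32_t *) a;
    int32_t y = *(const int32_t *) b;

    return (x > y) - (x < y);
}

/* 32-bit, subtract and fall back only if the subtraction overflowed.
 * __builtin_sub_overflow() returns true on overflow (gcc/clang builtin). */
int
cmp_int32_subovf(const void *a, const void *b)
{
    int32_t x = *(const int32_t *) a;
    int32_t y = *(const int32_t *) b;
    int32_t diff;

    if (!__builtin_sub_overflow(x, y, &diff))
        return diff;
    return (x > y) - (x < y);
}

/* Usage: sort an int32 array in ascending order with qsort(). */
void
sort_int32(int32_t *v, size_t n)
{
    assert(v != NULL || n == 0);
    qsort(v, n, sizeof(int32_t), cmp_int32_branchless);
}
```

Which of the 32-bit variants generates the best code is exactly the open
question above, so this only shows the shapes, not a recommendation.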