Re: use a non-locking initial test in TAS_SPIN on AArch64

Nathan Bossart Wed, 08 Jan 2025 14:38:27 -0800

On Wed, Jan 08, 2025 at 05:25:24PM -0500, Andres Freund wrote:
> On 2025-01-08 16:01:19 -0600, Nathan Bossart wrote:
>> But I did retry my test from upthread without pg_stat_statements and was
>> surprised to find a reproducible 4-6% regression.
> 
> Uh, huh. I assume this was readonly pgbench with 256 clients just as you had
> tested upthread? I don't think there's any hot spinlock meaningfully involved
> in that workload?  A r/w workload is a different story, but upthread you
> mentioned select-only.
> 
> Do you see any spinlock in profiles?

Yes, this was using 256 clients.  Looking closer, I don't see anything
spinlock related anywhere near the top of perf.

>> I'm not seeing any obvious differences in perf, but I do see that the thread
>> for adding TAS_SPIN() for PPC mentions a regression at lower contention
>> levels [0].  Perhaps the non-locked test is failing often enough to hurt
>> performance in this case...  Whatever it is, it'll be mighty frustrating to
>> miss out on a >7x gain because of a 4% regression.
> 
> I don't think the explanation can be that simple - even with TAS_SPIN defined,
> we do try to acquire the lock once without using TAS_SPIN:
> 
> #if !defined(S_LOCK)
> #define S_LOCK(lock) \
>       (TAS(lock) ? s_lock((lock), __FILE__, __LINE__, __func__) : 0)
> #endif         /* S_LOCK */
> 
> Only s_lock() then uses TAS_SPIN(lock).

Ah, right.  FWIW I tried setting a cap on the number of times we do a
non-locked test, and the results still showed the regression, which seems
to match your intuition here.
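
For readers following along, here is a rough, standalone sketch of the
idea being discussed: a plain TAS on the first attempt (as S_LOCK does),
then a spin path that does a non-locked test before the atomic exchange,
with a cap on how many non-locked tests run back to back.  This is not
the patch and not PostgreSQL's s_lock.h; it uses C11 atomics instead of
the real per-architecture TAS, and every demo_* name and the cap value
are made up for illustration.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for slock_t: 0 = free, 1 = held. */
typedef atomic_int demo_slock_t;

/* Hypothetical cap on consecutive non-locked tests (value is arbitrary). */
#define DEMO_MAX_UNLOCKED_TESTS 4

/* Plain test-and-set: returns nonzero if the lock was already held. */
static inline int
demo_tas(demo_slock_t *lock)
{
	return atomic_exchange_explicit(lock, 1, memory_order_acquire);
}

/*
 * Spin-path variant: peek at the lock with a plain load first and skip
 * the atomic exchange while it looks held.  After the cap is reached,
 * fall through to a real TAS and reset the counter.
 * Returns nonzero if the lock was NOT acquired.
 */
static inline int
demo_tas_spin(demo_slock_t *lock, int *unlocked_tests)
{
	if (*unlocked_tests < DEMO_MAX_UNLOCKED_TESTS &&
		atomic_load_explicit(lock, memory_order_relaxed) != 0)
	{
		(*unlocked_tests)++;
		return 1;
	}
	*unlocked_tests = 0;
	return demo_tas(lock);
}

/* First attempt is always a plain TAS; only the retry loop peeks. */
static inline void
demo_lock(demo_slock_t *lock)
{
	int			unlocked_tests = 0;

	if (demo_tas(lock) == 0)
		return;
	while (demo_tas_spin(lock, &unlocked_tests))
		;						/* real code would delay/back off here */
}

static inline void
demo_unlock(demo_slock_t *lock)
{
	atomic_store_explicit(lock, 0, memory_order_release);
}
```

Note that in the uncontended case demo_lock() never reaches
demo_tas_spin() at all, which is the point made above about S_LOCK.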

> I wonder if you're hitting an extreme case of binary-layout related effects?
> I've never seen them at this magnitude though.  I'd suggest using either lld
> or mold as linker and comparing the numbers for a few
> -Wl,--shuffle-sections=$seed seed values.

Will do.

-- 
nathan
