On 06/11/2020 10:24, Kyotaro Horiguchi wrote:
> Thank you for the comment!
>
> First off, I thought that I managed to eliminate the degradation
> observed on the previous versions, but significant degradation (1.1%
> slower) is still seen in one case.
One thing to keep in mind with micro-benchmarks like this is that even
completely unrelated code changes can change the layout of the code in
memory, which in turn can affect CPU caching effects in surprising ways.
If you're lucky, you can see 1-5% differences just by adding a function
that's never called, for example, if it happens to move other code in
memory so that some hot codepath or struct gets split across CPU cache
lines. It can be infuriating when benchmarking.
> At Thu, 5 Nov 2020 11:09:09 +0200, Heikki Linnakangas <hlinn...@iki.fi> wrote in
>
> (A) original test patch
> I naively thought that the code path is too short to absorb the cost
> of even a few additional instructions. Indeed, I measured performance
> again with the same patch set on the current master and got more or
> less the same result.
>
> master 8195.58 ms, patched 8817.40 ms: +10.75%
>
> However, I noticed that the additional call was a recursive call, and
> the jmp inserted for the recursive call seems to take significant
> time. After avoiding the recursive call, the difference was reduced
> to +0.96% (master 8268.71 ms, patched 8348.30 ms).
>
> Just the two instructions below are inserted in this case, which looks
> reasonable.
>
>   8720ff <+31>: cmpl   $0xffffffff,0x4ba942(%rip)   # 0xd2ca48
>                                      <catalog_cache_prune_min_age>
>   872106 <+38>: jl     0x872240 <SearchCatCache1+352>   (call to a function)
That's interesting. I think a 1% degradation would be acceptable.
I think we'd like to enable this feature by default though, so the
performance when it's enabled is also very important.
> (C) inserting bare counter-update code without a branch
>
>> Do we actually need a branch there? If I understand correctly, the
>> point is to bump up a usage counter on the catcache entry. You could
>> increment the counter unconditionally, even if the feature is not
>> used, and avoid the branch that way.
> That change causes a 4.9% degradation, which is worse than having a
> branch.
>
> master 8364.54 ms, patched 8666.86 ms (+4.9%)
>
> The additional instructions follow.
>
> + 8721ab <+203>: mov    0x30(%rbx),%eax      # %eax = ct->naccess
> + 8721ae <+206>: mov    $0x2,%edx
> + 8721b3 <+211>: add    $0x1,%eax            # %eax++
> + 8721b6 <+214>: cmove  %edx,%eax            # if %eax == 0 then %eax = 2
>   <original code>
> + 8721bf <+223>: mov    %eax,0x30(%rbx)      # ct->naccess = %eax
> + 8721c2 <+226>: mov    0x4cfe9f(%rip),%rax  # 0xd42068 <catcacheclock>
> + 8721c9 <+233>: mov    %rax,0x38(%rbx)      # ct->lastaccess = %rax
Do you need the "naccess == 2" test? You could always increment the
counter, and in the code that uses naccess to decide what to evict,
treat all values >= 2 the same.

Need to handle integer overflow somehow. Or maybe not: integer overflow
is so infrequent that even if a hot syscache entry gets evicted
prematurely because its naccess count wrapped around to 0, it will
happen so rarely that it won't make any difference in practice.
- Heikki