On 05.05.2018 09:16, Andrey Borodin wrote:
> Hi!
> 
> >> On 4 May 2018, at 16:05, Yury Sokolov <funny.fal...@gmail.com>
> >> wrote:
>> 
> >> I didn't suggest a log scale of usages, but rather a
> >> "replacement-period based" increment: a usage count could be
> >> incremented at most once per NBlocks/32 replaced items. Once it is
> >> incremented, its "replacement time" is remembered, and for the next
> >> NBlocks/32 replacements the usage count of this buffer is not
> >> incremented. This way, increments are synchronized with replacement
> >> activity.
> 
> But you lose the difference between "touched once" and "actively
> used". A log scale of usage solves this: the usage count grows
> logarithmically, but drains linearly.
No, I don't lose the difference. But instead of an absolute value (log
scale or linear), I count how often in time a block is used:
- if a buffer was touched 1000 times right after being placed into
shared_buffers, should it live 500 times longer than a neighbor that
was touched only 2 times? Or 10 times longer? Or 5 times?
- and what if that "1000 times" buffer is not touched during the next
period of time, while the neighbor is touched 2 times again?
All effective algorithms answer: the "1000 times" buffer should be
evicted first, while its neighbor is a really hot buffer that should be
kept for a longer period.

A log scale doesn't solve this. But incrementing "once per period" does,
especially if a block is placed with a usage count of zero at first
(instead of 1, as currently).

> >> Digging further, I suggest the following improvements to the GClock
> >> algorithm:
> >> - place a new buffer with usage count = 0 (and during the next
> >> NBlocks/32 replacements its usage count is not incremented);
> >> - increment not by 1 but by 8 (this simulates the "hot queue" of
> >> popular algorithms), with a limit of 32;
> >> - scan at most 25 buffers for eviction; if no buffer with a zero
> >> usage count is found, the least-used buffer (among the 25 scanned)
> >> is evicted (new buffers are not evicted during their first
> >> NBlocks/32 replacements).
>> 
> 
> I do not understand where these numbers come from...

I found these numbers by testing with several artificial traces found on
the web. I don't claim these numbers are the best. Even on those traces
the best values may vary with cache size: for a small cache the
increment and limit tend to be higher, for a huge cache smaller. But
these were the most balanced.

And I don't claim those traces are representative of PostgreSQL
workloads; that is why I'm pushing this discussion: to collect more
real-world PostgreSQL traces and make them public.

And I believe my algorithm is not the best. Clock-Pro and ARC show
better results on those traces. TinyLFU, a cache admission algorithm,
may be even more efficient (in terms of evictions). But the results
should be rechecked with PostgreSQL traces.

My algorithm would just be the least invasive for the current code, imho.

With regards,
Sokolov Yura    
