Tue, 24 Apr 2018, 8:04 Andrey Borodin <x4...@yandex-team.ru>:

> Hi, Thomas!
>
> > On 24 Apr 2018, at 2:41, Thomas Munro <thomas.mu...@enterprisedb.com>
> > wrote:
> >
> > On Fri, Feb 12, 2016 at 10:02 AM, Konstantin Knizhnik
> > <k.knizh...@postgrespro.ru> wrote:
> >> Are there some well-known drawbacks of this approach, or would it be
> >> interesting to adapt this algorithm to PostgreSQL and measure its impact
> >> on performance under different workloads?
> >
> > I'm not currently planning to work in this area and have done no real
> > investigation, so please consider the following to be "water cooler
> > talk".
>
> I intend to make some prototypes in this area, but I still haven't
> allocated enough time to make anything useful.
>
> I think that replacing the current CS5 will:
> 1. allow the use of big shared buffers
> 2. make DIRECT_IO a realistic possibility
> 3. improve BgWriter
> 4. unify different buffering strategies into a single buffer manager (there
> will be no need to place VACUUM into a special buffer ring)
> 5. finally allow AIO and more efficient prefetching like [0]
>
> Here's what we know about the effect of shared buffer size now [1] (taken
> from [2]). I believe the right hill must be big enough to reduce the central
> pit to zero and make the function monotonic: the OS page cache knows less
> about data blocks and is expected to be less efficient.
>
>
> I'm not sure CART is the best option, though.
> I think that the right way is to implement many prototypes with LRU, ARC,
> CAR, CART, and 2Q. Then, benchmark them well. Or even make this algorithm
> pluggable? But currently we have a lot of dependent parts in the system. I
> do not even know where to start.
>
>
> Best regards, Andrey Borodin.
>
>
> [0]
> http://diku.dk/forskning/Publikationer/tekniske_rapporter/2004/04-03.pdf
> [1]
> https://4.bp.blogspot.com/-_Zz6X-e9_ok/WlaIvpStBmI/AAAAAAAAAA4/E1NwV-_82-oS5KfmyjoOff_IxUXiO96WwCLcBGAs/s1600/20180110-PTI.png
> [2] http://blog.dataegret.com/2018/01/postgresql-performance-meltdown.html


Before implementing algorithms within PostgreSQL, it would be great to test
them outside it with real traces.

I think there should be a mechanism to collect traces from real-world
PostgreSQL installations: every read and write access. It has to be
extremely efficient so that it can be enabled in production. Something like
a circular buffer in shared memory, with a separate worker to dump it to disk.
Instead of the full block address, a 64-bit hash could be used. Or even
63 bits of hash + 1 bit to designate read/write access.
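
To make the record format concrete, here is a rough Python sketch of the
63-bit hash + 1-bit read/write encoding and of a fixed-size circular buffer
that a dumper could drain. Everything in it is an assumption made for
illustration: the names encode_access, decode_access and TraceRing are made
up, the (rel_id, block_no) key is a simplification of a real block address,
BLAKE2b is just a convenient hash, and a real implementation would live in
shared memory with proper locking.

```python
import hashlib
import struct

MASK63 = (1 << 63) - 1


def encode_access(rel_id: int, block_no: int, is_write: bool) -> int:
    """Pack one access into 64 bits: 63-bit address hash, 1-bit read/write flag."""
    # Hypothetical block address: relation id plus block number.
    key = struct.pack("<QQ", rel_id, block_no)
    digest = hashlib.blake2b(key, digest_size=8).digest()
    block_hash = int.from_bytes(digest, "little") & MASK63
    return (block_hash << 1) | int(is_write)


def decode_access(record: int):
    """Return (63-bit block hash, is_write) for a packed record."""
    return record >> 1, bool(record & 1)


class TraceRing:
    """Fixed-size circular buffer; the oldest records are lost if the dumper lags."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.slots = [0] * capacity
        self.head = 0  # total number of records written so far
        self.tail = 0  # index of the oldest record not yet drained

    def push(self, record: int) -> None:
        self.slots[self.head % self.capacity] = record
        self.head += 1
        if self.head - self.tail > self.capacity:
            # The writer never blocks: it overwrites the oldest record instead.
            self.tail = self.head - self.capacity

    def drain(self):
        """Return all pending records in order, as a dump worker would."""
        out = [self.slots[i % self.capacity] for i in range(self.tail, self.head)]
        self.tail = self.head
        return out
```

Dropping the oldest records when the dumper falls behind keeps the hot path
cheap; whether losing part of a trace is acceptable is a separate question.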

Using these traces, it will be easy to choose a couple of "theoretically"
best algorithms, and then attempt to implement them.
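
As an illustration of such offline comparison, below is a minimal Python
sketch that replays a trace (any sequence of block identifiers or hashes)
against two replacement policies and reports the hit ratio. It is only a toy:
lru_hit_ratio and clock_hit_ratio are made-up names, the clock sweep is a
simplified model with a usage counter capped at 5 rather than a copy of the
real buffer manager, and pinning, dirty pages and concurrency are ignored.

```python
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Replay a trace through a plain LRU cache and return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits / len(trace) if trace else 0.0


def clock_hit_ratio(trace, capacity, max_usage=5):
    """Replay a trace through a simplified clock sweep with usage counts."""
    slots = [None] * capacity  # block held by each buffer
    usage = [0] * capacity     # usage counter of each buffer
    where = {}                 # block -> buffer index
    hand = 0
    hits = 0
    for block in trace:
        slot = where.get(block)
        if slot is not None:
            hits += 1
            usage[slot] = min(usage[slot] + 1, max_usage)
            continue
        # Sweep, decrementing counters, until a buffer with usage 0 is found.
        while usage[hand] > 0:
            usage[hand] -= 1
            hand = (hand + 1) % capacity
        if slots[hand] is not None:
            del where[slots[hand]]
        slots[hand] = block
        usage[hand] = 1
        where[block] = hand
        hand = (hand + 1) % capacity
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    # Synthetic trace for demonstration only: a small hot set plus a long scan.
    trace = [i % 8 for i in range(200)] + list(range(100, 400)) + [i % 8 for i in range(200)]
    for name, policy in (("LRU", lru_hit_ratio), ("clock", clock_hit_ratio)):
        print(name, round(policy(trace, 16), 3))
```

With real traces in place of the synthetic one, the same loop could be
extended with ARC, CAR, CART or 2Q to see which ones are worth prototyping.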

With regards,
Yura
