Re: [HACKERS] Optimize kernel readahead using buffer access strategy

KONDO Mitsumasa Sun, 17 Nov 2013 17:58:11 -0800

(2013/11/15 13:48), Claudio Freire wrote:

On Thu, Nov 14, 2013 at 11:13 PM, KONDO Mitsumasa

I used CentOS 6.4, whose kernel version is 2.6.32-358.23.2.el6.x86_64, in
this test.


That's close to the kernel version I was using, so you should see the
same effect.

OK. You proposed a patch that maximizes readahead. I think it benefits performance, and that part of your argument is certainly true.

Your patch maximizes readahead when a SQL statement is executed as an index
range scan. Is that right?


Ehm... sorta.

I think that your patch assumes that pages are ordered by
index-data.


No. It just knows which pages will be needed, and fadvises them. No
guessing involved, except the guess that the scan will not be aborted.
There's a heuristic to stop limited scans from attempting to fadvise,
and that's that prefetch strategy is applied only from the Nth+ page
walk.
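
As a rough illustration of the heuristic described above (not the actual patch), the sketch below advises the kernel about blocks only after a scan has already walked a number of pages. The threshold value, the function names, and the use of Python's `os.posix_fadvise` are all assumptions made for this example; the thread does not state the value of N.

```python
import os

BLCKSZ = 8192              # PostgreSQL's default block size
PREFETCH_AFTER_PAGES = 4   # hypothetical "N"; the thread gives no value


def blocks_to_prefetch(pages_walked, upcoming_blocks,
                       threshold=PREFETCH_AFTER_PAGES):
    """Return the blocks worth advising, or [] while the scan is still short.

    Short (limited) scans never reach the threshold, so they issue no hints.
    """
    if pages_walked < threshold:
        return []
    return list(upcoming_blocks)


def prefetch(fd, blocks):
    """Send POSIX_FADV_WILLNEED for each block; return the number of hints."""
    if not hasattr(os, "posix_fadvise"):
        return 0  # platform without posix_fadvise: do nothing
    for blk in blocks:
        os.posix_fadvise(fd, blk * BLCKSZ, BLCKSZ, os.POSIX_FADV_WILLNEED)
    return len(blocks)
```

The point of keeping the decision separate from the system call is that the only guess involved is whether the scan will run long enough to benefit.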

We may be able to optimize kernel readahead completely in PostgreSQL in the future; however, it is very difficult, and it would take a long time to achieve that completely from the beginning. So I propose a GUC switch that users can set in their transactions. (I will create this patch in this CF.) If someone wants to turn readahead off in order to use the file cache more efficiently in his transactions, he can run "SET readahead = off". PostgreSQL is open source, and I think it will become clear in which cases this is effective once many people use it.

It improves index-only scans the most, but I also attempted to handle
heap prefetches. That's where the kernel started conspiring against
me, because I used many naturally-clustered indexes, and THERE
performance was adversely affected because of that kernel bug.

I am also creating a Gaussian-distributed pgbench now and will submit it to this CF. It should clarify, at least partially, in which situations this is effective.
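
To make the idea of Gaussian-distributed access concrete, here is a minimal sketch of drawing row keys so that accesses concentrate around the middle of the key range instead of being uniform. It is not the proposed pgbench change; the function name, the `stddev_fraction` parameter, and the choice to redraw out-of-range values are assumptions for illustration only.

```python
import random


def gaussian_key(rng, n_rows, stddev_fraction=0.15):
    """Pick a key in [1, n_rows], concentrated around the middle of the range.

    Values that fall outside the range are redrawn, so the result is a
    truncated Gaussian rather than a clamped one.
    """
    mean = (n_rows + 1) / 2.0
    stddev = n_rows * stddev_fraction
    while True:
        key = int(round(rng.gauss(mean, stddev)))
        if 1 <= key <= n_rows:
            return key


# Usage: a seeded generator keeps a benchmark run reproducible.
rng = random.Random(0)
sample = [gaussian_key(rng, 100000) for _ in range(2000)]
```

With a skewed distribution like this, a small set of pages is touched far more often than the rest, which is the kind of workload where file cache efficiency matters.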

You may want to try your patch with more
real workloads, and maybe you'll confirm what I found out last time I
messed with posix_fadvise. If my experience is still relevant, those
patterns will have suffered a severe performance penalty with this
patch, because it will disable kernel read-ahead on sequential index
access. It may still work for sequential heap scans, because the
access strategy will tell the kernel to do read-ahead, but many other
access methods will suffer.


The decisive difference from your patch is that my patch uses a buffer hint
control architecture, so it can control readahead more smartly in some cases.


Indeed, but it's not enough. See my above comment about naturally
clustered indexes. The planner expects that, and plans accordingly. It
will notice correlation between a PK and physical location, and will
treat an index scan over PK to be almost sequential. With your patch,
that assumption will be broken I believe.

However, my patch is still in progress and needs more improvement. I am going
to add a method of controlling readahead by GUC, so that users can freely
select the readahead parameter in their transactions.


Rather, I'd try to avoid fadvising consecutive or almost-consecutive
blocks. Detecting that is hard at the block level, but maybe you can
tie that detection into the planner, and specify a sequential strategy
when the planner expects index-heap correlation?
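
A naive block-level version of "avoid fadvising consecutive or almost-consecutive blocks" could look like the sketch below. It is only an illustration under stated assumptions: the function name and the `max_gap` parameter are hypothetical, and it ignores the planner-level correlation information suggested above, which is the harder and probably more robust route.

```python
def blocks_needing_fadvise(blocks, max_gap=1):
    """Return the blocks that start a new, non-adjacent run.

    A block that follows the previous one by at most max_gap (moving
    forward) is treated as sequential and left to kernel readahead.
    """
    advised = []
    prev = None
    for blk in blocks:
        if prev is None or not (0 <= blk - prev <= max_gap):
            advised.append(blk)
        prev = blk
    return advised
```

For a naturally clustered index, most heap blocks fall into the "sequential" case, so few hints are issued and the kernel's own readahead is left undisturbed.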

I think we had better develop these patches step by step, one patch at a time, because it is difficult to achieve complete readahead optimization from the beginning in a single patch. We need a framework in these patches first.

Try OLAP-style queries.


I have the DBT-3 (TPC-H) benchmark tools. If you don't like TPC-H, could you
suggest a good OLAP benchmark tool?


I don't really know. Skimming the specs, I'm not sure if those queries
generate large index range queries. You could try, maybe with
autoexplain?

OK, I will. I will also run simple large index range queries with the EXPLAIN command.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

