Re: [HACKERS] Optimize kernel readahead using buffer access strategy

KONDO Mitsumasa Thu, 14 Nov 2013 18:09:10 -0800

Hi Claudio,

(2013/11/14 22:53), Claudio Freire wrote:

On Thu, Nov 14, 2013 at 9:09 AM, KONDO Mitsumasa
<[email protected]> wrote:

I created a patch that improves disk reads and the use of OS file caches. It can
optimize the kernel readahead parameter using the buffer access strategy and
posix_fadvise() in various disk-read situations.


In a typical OS, the readahead parameter is decided dynamically from the
disk-read situation. If disk reads continue for a long time, the readahead
parameter becomes large.
However, because it is based on an empirical or heuristic algorithm, it causes
wasteful disk reads and throws out useful OS file caches in some cases. This
hurts disk-read performance considerably.


It would be relevant to know which kernel you used for those tests.

I used CentOS 6.4, whose kernel version is 2.6.32-358.23.2.el6.x86_64, in this
test.

A while back, I tried to use posix_fadvise to prefetch index pages.

I searched for your past work. Are you talking about this ML thread? Or is there another, more recent discussion? Your patch looks interesting, but it wasn't submitted to a CF and the discussion stopped.

http://www.postgresql.org/message-id/CAGTBQpZzf70n0PYJ=VQLd+jb3wJGo=2txmy+skjd6g_vjc5...@mail.gmail.com

I ended up finding out that interleaving posix_fadvise with I/O like
that severely hinders (ie: completely disables) the kernel's read-ahead
algorithm.
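
For readers unfamiliar with the pattern being described, here is a minimal sketch of interleaving a prefetch hint with each read. It is not code from either patch: the function names, the 8 kB page size, and the demo file are assumptions for illustration. The sketch only shows the call pattern; it does not measure whether kernel read-ahead is affected.

```python
import os
import tempfile

PAGE_SIZE = 8192  # assumed page size (PostgreSQL's default block size)


def read_pages_with_prefetch(fd, page_numbers):
    """Read pages in the given order, hinting the next page before each read."""
    pages = []
    for i, page_no in enumerate(page_numbers):
        if i + 1 < len(page_numbers):
            # Tell the kernel the next page will be needed soon.
            next_offset = page_numbers[i + 1] * PAGE_SIZE
            os.posix_fadvise(fd, next_offset, PAGE_SIZE, os.POSIX_FADV_WILLNEED)
        # Read the current page; this I/O is interleaved with the hint above.
        pages.append(os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE))
    return pages


def make_demo_file(n_pages):
    """Create a temporary file whose page i is filled with byte value i."""
    f = tempfile.NamedTemporaryFile(delete=False)
    for i in range(n_pages):
        f.write(bytes([i % 256]) * PAGE_SIZE)
    f.close()
    return f.name
```

Calling read_pages_with_prefetch(fd, [2, 0, 3]) on a file descriptor opened read-only returns the three pages in that order, with one WILLNEED hint issued before each of the first two reads.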

Your patch sets readahead to the maximum when a query uses an index range scan. Is that right? I think your patch assumes that heap pages are ordered by index data. This assumption is only partially right. If it were always true, we would not need the CLUSTER command. In actuality, running CLUSTER gives better performance than not running it.
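
The point about page ordering can be illustrated with a small sketch. The block numbers below are made-up examples, not measurements, and sequential_fraction is a hypothetical helper. It counts how often an index-ordered scan steps to the same heap block or to the next one, which is the case where readahead helps.

```python
def sequential_fraction(heap_blocks):
    """Fraction of steps that stay on the same block or move to the next one."""
    if len(heap_blocks) < 2:
        return 1.0
    steps = list(zip(heap_blocks, heap_blocks[1:]))
    sequential = sum(1 for prev, cur in steps if cur in (prev, prev + 1))
    return sequential / len(steps)


# Heap block visited for each index entry, in index order (invented data).
clustered = [10, 10, 11, 11, 12, 13]    # table physically ordered by the index
unclustered = [10, 57, 3, 58, 11, 90]   # same index order, scattered heap pages

print(sequential_fraction(clustered))    # 1.0
print(sequential_fraction(unclustered))  # 0.0
```

In the scattered case, a large readahead window would fetch neighboring blocks that the scan never visits.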

How exactly did you set up those benchmarks? pg_bench defaults?

My detailed test settings are as follows:
* Server info
  CPU: Intel(R) Xeon(R) CPU E5645  @ 2.40GHz (2U/12C)
  RAM: 6GB
    -> I reduced it intentionally with an OS parameter, because large-memory
       tests take a long time.
  HDD: SEAGATE  Model: ST2000NM0001 @ 7200rpm * 1
  RAID: none.

* postgresql.conf(summarized)
  shared_buffers = 600MB (10% of RAM = 6GB)
  work_mem = 1MB
  maintenance_work_mem = 64MB
  wal_level = archive
  fsync = on
  archive_mode = on
  checkpoint_segments = 300
  checkpoint_timeout = 15min
  checkpoint_completion_target = 0.7

* pgbench settings
pgbench -j 4 -c 32 -T 600 pgbench

pg_bench does not exercise heavy sequential access patterns, or long
index scans. It performs many single-page index lookups per
transaction and that's it.

Yes, your argument is right. But it is also a fact that performance becomes better in these situations.

You may want to try your patch with more
real workloads, and maybe you'll confirm what I found out last time I
messed with posix_fadvise. If my experience is still relevant, those
patterns will have suffered a severe performance penalty with this
patch, because it will disable kernel read-ahead on sequential index
access. It may still work for sequential heap scans, because the
access strategy will tell the kernel to do read-ahead, but many other
access methods will suffer.

The decisive difference from your patch is that my patch uses a buffer hint control architecture, so it can control readahead more smartly in some cases. However, my patch is still in progress and needs more improvement. I am going to add a method of controlling readahead by GUC, so that users can freely select the readahead parameter in their transactions.
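
As a rough sketch of choosing readahead advice from an access hint, the following maps a pattern label to a posix_fadvise() advice value and applies it to a whole file. The pattern names and the mapping are assumptions for illustration only; they are not taken from the patch, which works inside the PostgreSQL buffer manager.

```python
import os

# Hypothetical access-pattern labels, loosely modeled on buffer access
# strategies. The mapping itself is an assumption, not the patch's logic.
ADVICE_BY_PATTERN = {
    "bulk_read": os.POSIX_FADV_SEQUENTIAL,   # large sequential scan
    "index_lookup": os.POSIX_FADV_RANDOM,    # scattered single-page reads
    "default": os.POSIX_FADV_NORMAL,         # leave the kernel heuristic alone
}


def advise_for_pattern(fd, pattern):
    """Apply the advice for `pattern` to the whole file and return it."""
    advice = ADVICE_BY_PATTERN.get(pattern, os.POSIX_FADV_NORMAL)
    # offset=0 and len=0 mean "from the start to the end of the file".
    os.posix_fadvise(fd, 0, 0, advice)
    return advice
```

A user-selectable setting, such as the GUC mentioned above, could supply the pattern label instead of the access strategy.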

Try OLAP-style queries.

I have the DBT-3 (TPC-H) benchmark tools. If you don't like TPC-H, could you tell me about a good OLAP benchmark tool?


Regards,
--
Mitsumasa KONDO
NTT Open Source Software



