(2013/07/05 0:35), Joshua D. Drake wrote:
On 07/04/2013 06:05 AM, Andres Freund wrote:
Presumably the smaller segsize is better because we don't
completely stall the system by submitting up to 1GB of io at once. So,
if we were to do it in 32MB chunks and then do a final fsync()
afterwards we might get most of the benefits.
Yes, I will try testing with './configure --with-segsize=0.03125' tonight.
I will send you the test results tomorrow.

I did testing on this a few years ago, I tried with 2MB segments over 16MB
thinking similarly to you. It failed miserably, performance completely tanked.
Just as you say, the test result was miserable: too small a segsize is bad for performance. It might be improved by using separate directories, but too many file descriptors with open() and close() seem to be bad. However, I think that this implementation has the potential to improve I/O performance, so we need to keep testing with several methods.

* Performance result in DBT-2 (WH340)
                                 | NOTPM    90%tile    Average  Maximum
 original_0.7 (baseline)         | 3474.62  18.348328  5.739    36.977713
 fsync + write                   | 3586.85  14.459486  4.960    27.266958
 fsync + write + segsize=0.25    | 3661.17  8.28816    4.117    17.23191
 fsync + write + segsize=0.03125 | 3309.99  10.851245  6.759    19.500598
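
To make the table easier to compare, here is a minimal sketch that expresses each row as a percentage change against the baseline row. The numbers are copied from the table above; the function names are only for illustration, and treating the last three columns as response times (lower is better) is an assumption, since the table does not state units.

```python
# Rows copied from the DBT-2 (WH340) table above:
# (label, NOTPM, 90th percentile, average, maximum)
RESULTS = [
    ("original_0.7 (baseline)", 3474.62, 18.348328, 5.739, 36.977713),
    ("fsync + write", 3586.85, 14.459486, 4.960, 27.266958),
    ("fsync + write + segsize=0.25", 3661.17, 8.28816, 4.117, 17.23191),
    ("fsync + write + segsize=0.03125", 3309.99, 10.851245, 6.759, 19.500598),
]


def percent_change(value, baseline):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def relative_to_baseline(rows):
    """Map each non-baseline label to its per-column percent change.

    The first row is taken as the baseline.
    """
    baseline_metrics = rows[0][1:]
    changes = {}
    for label, *metrics in rows[1:]:
        changes[label] = tuple(
            round(percent_change(m, b), 1)
            for m, b in zip(metrics, baseline_metrics)
        )
    return changes


if __name__ == "__main__":
    header = ("NOTPM", "90%tile", "Average", "Maximum")
    for label, changes in relative_to_baseline(RESULTS).items():
        cells = ", ".join(f"{h} {c:+.1f}%" for h, c in zip(header, changes))
        print(f"{label}: {cells}")
```

Read this way, NOTPM rises by roughly 3.2% with fsync + write and by roughly 5.4% with segsize=0.25, but falls by roughly 4.7% with segsize=0.03125, where the average column also gets worse than the baseline.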

(2013/07/04 22:05), Andres Freund wrote:
> 1) it breaks pg_upgrade. Which means many of the bigger users won't be
>     able to migrate to this and most packagers would carry the old
>     segsize around forever.
>     Even if we could get pg_upgrade to split files accordingly link mode
>     would still be broken.
I think that pg_upgrade is a contrib module, not part of the main implementation of Postgres, so contrib should not stand in the way of improving the main implementation. Pg_upgrade users might share this opinion.

> 2) It drastically increases the amount of file handles necessary and by
>     extension increases the amount of open/close calls. Those aren't all
>     that cheap. And it increases metadata traffic since mtime/atime are
>     kept for more files. Also, file creation is rather expensive since it
>     requires metadata transaction on the filesystem level.
My test result seemed to show this problem. But my test did not use separate directories in base/. I'm not sure which way is best. If you have time to create a patch, please send it to us, and I will try to test it with DBT-2.
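
To give a rough feeling for the file-handle concern in point 2, here is a small sketch that counts how many segment files one relation would need at the segment sizes tested in this thread. The 10 GiB relation size is a hypothetical example, the function names are only for illustration, and the count covers the main data files only (no other forks or indexes).

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def segment_bytes(segsize_gb):
    """Convert a segsize given in GB (as passed to configure) to bytes."""
    return int(segsize_gb * GIB)


def segment_count(relation_bytes, segsize_gb):
    """Number of segment files needed to hold relation_bytes.

    Uses integer ceiling division; an empty relation still has one file.
    """
    seg = segment_bytes(segsize_gb)
    return max(1, -(-relation_bytes // seg))


if __name__ == "__main__":
    relation = 10 * GIB  # hypothetical relation size, not from the tests
    for segsize in (1.0, 0.25, 0.03125):
        print(
            f"segsize={segsize}: "
            f"{segment_bytes(segsize) // MIB} MB per segment, "
            f"{segment_count(relation, segsize)} files"
        )
```

Under these assumptions, going from 1GB segments to segsize=0.03125 (32MB) multiplies the number of files per relation by 32, which is where the extra open()/close() calls and metadata traffic would come from.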

Best regards,
Mitsumasa KONDO
NTT Open Source Software Center

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)