Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-14 Thread Scott Marlowe
On Sun, Aug 14, 2011 at 6:51 AM,  t...@fuzzy.cz wrote:

 I've increased the test duration to 10 minutes, decreased the
 checkpoint timeout to 4 minutes and a checkpoint is issued just before
 the pgbench. That way the starting position should be more or less the
 same for all runs.

Also look at increasing checkpoint completion target to something
closer to 1. 0.8 is a nice starting place.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-14 Thread tv
On Sun, 14 Aug 2011 07:15:00 -0600, Scott Marlowe
scott.marl...@gmail.com wrote:
 On Sun, Aug 14, 2011 at 6:51 AM,  t...@fuzzy.cz wrote:

 I've increased the test duration to 10 minutes, decreased the
 checkpoint timeout to 4 minutes and a checkpoint is issued just before
 the pgbench. That way the starting position should be more or less the
 same for all runs.
 
 Also look at increasing checkpoint completion target to something
 closer to 1. 0.8 is a nice starting place.

Yes, I've increased that already:

checkpoint_segments=64
checkpoint_completion_target=0.9

Tomas



Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-14 Thread Tom Lane
t...@fuzzy.cz writes:
 On 13 August 2011, 5:09, Greg Smith wrote:
 And I keep seeing too many data corruption issues on ext4 to recommend
 anyone use it yet for PostgreSQL, that's why I focused on XFS.  ext4
 still needs at least a few more months before all the bug fixes it's
 gotten in later kernels are backported to the 2.6.32 versions deployed
 in RHEL6 and Debian Squeeze, the newest Linux distributions my customers
 care about right now.  On RHEL6 for example, go read
 http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.1_Technical_Notes/kernel.html
 , specifically BZ#635199, and you tell me if that sounds like it's
 considered stable code yet or not.  "The block layer will be updated in
 future kernels to provide this more efficient mechanism of ensuring
 ordering...these future block layer improvements will change some kernel
 interfaces..."  Yikes, that does not inspire confidence to me.

 XFS is naturally much more mature / stable than EXT4, but I'm not quite
 sure I want to judge the stability of code based on a comment in release
 notes. As I understand it, the comment says something like "things are
 not working as efficiently as they should, we'll improve that in the
 future", and it relates to the block layer as a whole, not just specific
 file systems. But I don't have access to bug #635199, so maybe I
 missed something.

I do ;-).  The reason for the tech note was to point out that RHEL6.1
would incorporate backports of upstream kernel changes that broke the
ABI for loadable kernel modules, compared to what it had been in
RHEL6.0.  That's of great interest to third-party software developers
who add device or filesystem drivers to RHEL, but I don't think it
speaks at all to whether the code is unstable from a user's standpoint.
(The changes in question were purely for performance, and involved a
conversion from write barriers in the block layer to flush+fua, whatever
that is.)  Furthermore, this affected every filesystem, not only ext4,
so it really entirely fails to support Greg's argument.

regards, tom lane



Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-14 Thread Greg Smith

On 08/14/2011 08:51 AM, t...@fuzzy.cz wrote:

 I've increased the test duration to 10 minutes, decreased the
 checkpoint timeout to 4 minutes and a checkpoint is issued just before
 the pgbench. That way the starting position should be more or less the
 same for all runs.
   


That's basically what I settled on for pgbench-tools.  Force a 
checkpoint just before the test, so the beginning of each run is aligned 
more consistently, then run for long enough that you're guaranteed at 
least one checkpoint finishes[1] (and you might see more than one if you 
fill checkpoint_segments fast enough).  I never bothered trying to 
compress that test cycle down by decreasing checkpoint_timeout.  There are 
already too many things you need to do in order to get this test working 
well, and I didn't want to include a change I'd never recommend people 
make on a production server in the mix.
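
A minimal sketch of the run procedure described above, not the actual pgbench-tools code: it only builds the two commands in order (a forced CHECKPOINT through psql, then a time-based pgbench run). The function name build_run_plan, the database name "pgbench" and the client count of 8 are assumptions made for illustration; CHECKPOINT also needs a superuser connection.

```python
import shlex


def build_run_plan(dbname, duration_s=600, clients=8):
    """Return the commands for one benchmark run, in execution order.

    Hypothetical helper: the defaults (600 s, 8 clients) are illustrative
    values, not settings taken from pgbench-tools.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return [
        # Force a checkpoint so every run starts from a similar state.
        ["psql", "-d", dbname, "-c", "CHECKPOINT"],
        # Time-based run (-T takes seconds), long enough to span a checkpoint.
        ["pgbench", "-c", str(clients), "-T", str(duration_s), dbname],
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch works
    # without a PostgreSQL installation.
    for cmd in build_run_plan("pgbench"):
        print(shlex.join(cmd))
```

Each command list can be passed to subprocess.run(cmd, check=True) to execute the plan against a real server.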


[1] If your checkpoint behavior goes pathological, for example the 
extended checkpoints possible when the background writer fsync queue 
fills, it's not actually guaranteed that the checkpoint will finish 
within 5 minutes after it starts.  So a 10 minute run doesn't assure 
you'll see a checkpoint begin and end in all circumstances, but it is the 
expected case.
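
The timing in the footnote can be checked with a rough model. This is a sketch under stated assumptions, not a description of how the server schedules work: the forced checkpoint at t=0 restarts the timer, each timed checkpoint starts checkpoint_timeout after the previous start, and its writes are spread over checkpoint_timeout * checkpoint_completion_target seconds. Checkpoints triggered by checkpoint_segments, the sync phase, and the pathological cases mentioned above are ignored. The function name timed_checkpoints is hypothetical, and the 5 minute / 0.5 pair is assumed to be the stock configuration.

```python
def timed_checkpoints(run_s, timeout_s, completion_target):
    """List (start, expected_end, ends_within_run) for timed checkpoints.

    Simplified model: only timeout-driven checkpoints are considered, and
    each one is assumed to finish exactly at its completion target.
    """
    result = []
    start = timeout_s  # first timed checkpoint after the forced one at t=0
    while start < run_s:
        end = start + timeout_s * completion_target
        result.append((start, end, end <= run_s))
        start += timeout_s
    return result


if __name__ == "__main__":
    # Assumed stock settings (5 min, 0.5) and the settings from this
    # thread (4 min, 0.9), both for a 10 minute run.
    for timeout_s, target in ((300, 0.5), (240, 0.9)):
        print(timeout_s, target, timed_checkpoints(600, timeout_s, target))
```

Under this model a 10 minute run contains one complete timed checkpoint with either pair of settings; with the 4 minute timeout a second one starts but is still in progress when the run ends.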


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

