Re: [PERFORM] strange pgbench results (as if blocked at the end)
On Sun, Aug 14, 2011 at 6:51 AM, t...@fuzzy.cz wrote:
> I've increased the test duration to 10 minutes, decreased the checkpoint
> timeout to 4 minutes and a checkpoint is issued just before the pgbench.
> That way the starting position should be more or less the same for all runs.

Also look at increasing checkpoint completion target to something closer to 1. 0.8 is a nice starting place.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] strange pgbench results (as if blocked at the end)
On Sun, 14 Aug 2011 07:15:00 -0600, Scott Marlowe scott.marl...@gmail.com wrote:
> On Sun, Aug 14, 2011 at 6:51 AM, t...@fuzzy.cz wrote:
>> I've increased the test duration to 10 minutes, decreased the checkpoint
>> timeout to 4 minutes and a checkpoint is issued just before the pgbench.
>> That way the starting position should be more or less the same for all runs.
>
> Also look at increasing checkpoint completion target to something closer
> to 1. 0.8 is a nice starting place.

Yes, I've increased that already:

checkpoint_segments=64
checkpoint_completion_target=0.9

Tomas
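For anyone following the numbers, here is a rough sketch of how the settings above (10 minute run, 4 minute checkpoint timeout, completion target 0.9) line up in time. It rests on a simplified model that is my assumption, not something stated in the thread: a checkpoint is forced just before the test, each later checkpoint starts one timeout after the previous start, and its writes are spread over completion_target * timeout. Checkpoints triggered by filling checkpoint_segments and the time spent in the fsync phase are ignored. The function names are made up for illustration.

```python
def timed_checkpoint_windows(run_s, timeout_s, completion_target):
    """Return (start, expected_end) pairs in seconds from test start.

    Only time-triggered checkpoints that begin during the run are listed.
    Assumed model: a checkpoint was forced at t=0, the next ones start every
    timeout_s seconds, and each spreads its writes over
    completion_target * timeout_s seconds.
    """
    windows = []
    start = timeout_s
    while start < run_s:
        windows.append((start, start + completion_target * timeout_s))
        start += timeout_s
    return windows


def completed_within(run_s, windows):
    """Keep only the checkpoints expected to finish before the run ends."""
    return [w for w in windows if w[1] <= run_s]


# Values from the message above: 600 s run, 240 s timeout, target 0.9.
windows = timed_checkpoint_windows(600, 240, 0.9)
print(windows)
print(len(completed_within(600, windows)))
```

Under that model the first timed checkpoint would start at 240 s and be expected to finish around 456 s, while the second starts at 480 s and would still be writing when the run ends at 600 s. So the run should contain one complete timed checkpoint, provided nothing goes wrong with checkpoint behavior.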
Re: [PERFORM] strange pgbench results (as if blocked at the end)
t...@fuzzy.cz writes:
> On 13 August 2011, 5:09, Greg Smith wrote:
>> And I keep seeing too many data corruption issues on ext4 to recommend
>> anyone use it yet for PostgreSQL, that's why I focused on XFS. ext4 still
>> needs at least a few more months before all the bug fixes it's gotten in
>> later kernels are backported to the 2.6.32 versions deployed in RHEL6 and
>> Debian Squeeze, the newest Linux distributions my customers care about
>> right now. On RHEL6 for example, go read
>> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.1_Technical_Notes/kernel.html ,
>> specifically BZ#635199, and you tell me if that sounds like it's
>> considered stable code yet or not. "The block layer will be updated in
>> future kernels to provide this more efficient mechanism of ensuring
>> ordering...these future block layer improvements will change some kernel
>> interfaces..." Yikes, that does not inspire confidence in me.
>
> XFS is naturally much more mature / stable than EXT4, but I'm not quite
> sure I want to judge the stability of code based on a comment in release
> notes. As I understand it, the comment says something like "things are not
> working as efficiently as they should, we'll improve that in the future",
> and it relates to the block layer as a whole, not just specific file
> systems. But I don't have access to the bug #635199, so maybe I missed
> something.

I do ;-). The reason for the tech note was to point out that RHEL6.1 would incorporate backports of upstream kernel changes that broke the ABI for loadable kernel modules, compared to what it had been in RHEL6.0. That's of great interest to third-party software developers who add device or filesystem drivers to RHEL, but I don't think it speaks at all to whether the code is unstable from a user's standpoint. (The changes in question were purely for performance, and involved a conversion from write barriers in the block layer to flush+fua, whatever that is.)

Furthermore, this affected every filesystem, not only ext4, so it really entirely fails to support Greg's argument.

regards, tom lane
Re: [PERFORM] strange pgbench results (as if blocked at the end)
On 08/14/2011 08:51 AM, t...@fuzzy.cz wrote:
> I've increased the test duration to 10 minutes, decreased the checkpoint
> timeout to 4 minutes and a checkpoint is issued just before the pgbench.
> That way the starting position should be more or less the same for all runs.

That's basically what I settled on for pgbench-tools. Force a checkpoint just before the test, so the beginning of each run is aligned more consistently, then run for long enough that you're guaranteed at least one checkpoint finishes[1] (and you might see more than one if you fill checkpoint_segments fast enough). I never bothered trying to compress that test cycle down by decreasing checkpoint_timeout. There are already too many things you need to do in order to get this test working well, and I didn't want to include a change I'd never recommend people make on a production server in the mix.

[1] If your checkpoint behavior goes pathological, for example the extended checkpoints possible when the background writer fsync queue fills, it's not actually guaranteed that the checkpoint will finish within 5 minutes after it starts. So a 10 minute run doesn't assure you'll see a checkpoint begin and end in all circumstances, but it is the expected case.

-- 
Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
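The procedure described above (force a checkpoint, then run pgbench for a fixed time) could be sketched roughly like this. This is not the pgbench-tools code, only an illustration: the function name, the database name "pgbench" and the client count of 8 are placeholders I picked. It relies only on the standard `psql -d/-c` and `pgbench -c/-T` options, and it just builds and prints the commands instead of running them.

```python
import shlex


def build_run_plan(dbname, duration_s=600, clients=8):
    """Return the commands for one aligned benchmark run, as argv lists.

    Step 1 forces a checkpoint so every run starts from a similar state.
    Step 2 runs pgbench for duration_s seconds (-T) with `clients`
    concurrent clients (-c). The defaults are illustrative assumptions.
    """
    checkpoint_cmd = ["psql", "-d", dbname, "-c", "CHECKPOINT"]
    pgbench_cmd = ["pgbench", "-c", str(clients), "-T", str(duration_s), dbname]
    return [checkpoint_cmd, pgbench_cmd]


# Print the plan as shell-quoted command lines; "pgbench" is a placeholder
# database name.
for argv in build_run_plan("pgbench"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

To actually execute the plan, each argv list could be passed to `subprocess.run(argv, check=True)` in order, so the benchmark only starts once the checkpoint command has returned.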