On Tue, May 28, 2013 at 10:36 AM, Greg Smith <g...@2ndquadrant.com> wrote:
> On 5/28/13 11:12 AM, Jon Nelson wrote:
>> It opens a new file, fallocates 16MB, calls fdatasync.
> Outside of the run for performance testing, I think it would be good at this
> point to validate that there is really a 16MB file full of zeroes resulting
> from these operations.  I am not really concerned that posix_fallocate might
> be slower in some cases; that seems unlikely.  I am concerned that it might
> result in a file that isn't structurally the same as the 16MB of zero writes
> implementation used now.

util-linux ships a fallocate utility, which might suffice for testing
in that respect, no?
If that is a real concern, the check could be made part of the autoconf
testing, perhaps.
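
A minimal sketch of the check Greg describes, assuming it is enough to read the file back and compare its size and contents. The function name file_is_zero_filled is hypothetical and is not part of PostgreSQL or of the attached program. Reading back zeroes alone cannot tell an allocated file from a sparse one, so the sketch also reports st_blocks for comparison between the two methods.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if the file at path is exactly
 * expected_size bytes long and every byte reads back as zero, 0 if not,
 * and -1 on an I/O error.  If blocks_out is non-NULL it receives
 * st_blocks (typically 512-byte units), which hints at how much space
 * is actually allocated on disk.
 */
int file_is_zero_filled(const char *path, off_t expected_size,
                        long long *blocks_out)
{
	char buf[8192];
	struct stat st;
	off_t total = 0;
	ssize_t n, k;
	int fd;

	assert(path != NULL && expected_size >= 0);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) != 0) {
		close(fd);
		return -1;
	}
	if (blocks_out)
		*blocks_out = (long long) st.st_blocks;
	if (st.st_size != expected_size) {
		close(fd);
		return 0;
	}
	/* scan the whole file; any non-zero byte fails the check */
	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		for (k = 0; k < n; ++k) {
			if (buf[k] != 0) {
				close(fd);
				return 0;
			}
		}
		total += n;
	}
	close(fd);
	if (n < 0)
		return -1;
	return total == expected_size;
}
```

Running this against one file produced by posix_fallocate and one produced by the 16MB of zero writes, and comparing both the return value and the reported block count, would be one way to see whether the two results differ.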

> The timing program you're writing has some aspects that are similar to the
> contrib/pg_test_fsync program.  You might borrow some code from there
> usefully.

Thanks! If it looks like what I'm attaching will not do, then I'll
look at that as a possible next step.

> To clarify the suggestion I was making before about including performance
> test results:  that doesn't necessarily mean the testing code must run using
> only the database.  That's better if possible, but as Robert says it may not
> be for some optimizations.  The important thing is to have something
> measuring the improvement that a reviewer can duplicate, and if that's a
> standalone benchmark program that's still very useful.  The main thing I'm
> wary of is any "this should be faster" claims that don't come with any
> repeatable measurements at all.  Very often theories about the fastest way
> to do something don't match what's actually seen in testing.

A note: The attached test program uses *fsync* instead of *fdatasync*
after calling fallocate (or writing out 16MB of zeroes), per an
earlier suggestion.

#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>

/* Parenthesized so that SIXTEENMB / EIGHTKB evaluates to 2048 chunks. */
#define SIXTEENMB (1024*1024*16)
#define EIGHTKB (1024*8)

/* Write 16MB to fd in 8KB chunks, reusing the first 8KB of buf each time. */
void writeout(int fd, char *buf)
{
	int i;
	for (i = 0; i < SIXTEENMB / EIGHTKB; ++i) {
		if (write(fd, buf, EIGHTKB) != EIGHTKB) {
			fprintf(stderr, "Error in write: %m!\n");
			exit(1);
		}
	}
}

int main(int argc, char *argv[])
{
	int with_fallocate, open_close_iterations, rewrite_iterations;
	int fd, i, j;
	double tt;
	struct timeval tv1, tv2;
	char *buf0, *buf1;
	const char *filename; /* for convenience */

	if (argc != 4) {
		fprintf(stderr, "Usage: %s <filename> <open/close iterations> <rewrite iterations>\n", argv[0]);
		return 1;
	}
	filename = argv[1];

	open_close_iterations = atoi(argv[2]);
	if (open_close_iterations < 0) {
		fprintf(stderr, "Error parsing 'open_close_iterations'!\n");
		return 1;
	}

	rewrite_iterations = atoi(argv[3]);
	if (rewrite_iterations < 0) {
		fprintf(stderr, "Error parsing 'rewrite_iterations'!\n");
		return 1;
	}

	/* buf0 is all zeroes (initial fill), buf1 is all ones (rewrites) */
	buf0 = malloc(SIXTEENMB);
	if (!buf0) {
		fprintf(stderr, "Unable to allocate memory!\n");
		return 1;
	}
	memset(buf0, 0, SIXTEENMB);

	buf1 = malloc(SIXTEENMB);
	if (!buf1) {
		fprintf(stderr, "Unable to allocate memory!\n");
		return 1;
	}
	memset(buf1, 1, SIXTEENMB);

	/* pass 0: fill with zero writes; pass 1: use posix_fallocate */
	for (with_fallocate = 0; with_fallocate < 2; ++with_fallocate) {
		/* start the clock before the loop so every iteration is timed */
		gettimeofday(&tv1, NULL);
		for (i = 0; i < open_close_iterations; ++i) {
			/* O_CREAT requires an explicit mode argument */
			fd = open(filename, O_CREAT | O_EXCL | O_WRONLY, 0600);
			if (fd < 0) {
				fprintf(stderr, "Error opening file: %m\n");
				return 1;
			}
			if (with_fallocate) {
				if (posix_fallocate(fd, 0, SIXTEENMB) != 0) {
					fprintf(stderr, "Error in posix_fallocate!\n");
					return 1;
				}
			} else {
				writeout(fd, buf0);
			}
			if (fsync(fd)) {
				fprintf(stderr, "Error in fsync: %m!\n");
				return 1;
			}
			for (j = 0; j < rewrite_iterations; ++j) {
				lseek(fd, 0, SEEK_SET);
				writeout(fd, buf1);
				if (fdatasync(fd)) {
					fprintf(stderr, "Error in fdatasync: %m!\n");
					return 1;
				}
			}
			if (close(fd)) {
				fprintf(stderr, "Error in close: %m!\n");
				return 1;
			}
			unlink(filename);		/* don't check for error */
		}
		gettimeofday(&tv2, NULL);
		/* elapsed seconds, computed in floating point to avoid overflow */
		tt = (tv2.tv_sec - tv1.tv_sec) + (tv2.tv_usec - tv1.tv_usec) / 1000000.0;
		printf(
			"with%s posix_fallocate: %d open/close iterations, %d rewrite in %0.4fs\n",
			(with_fallocate ? "" : "out"), open_close_iterations, rewrite_iterations, tt);
	}
	/* cleanup */
	free(buf0);
	free(buf1);
	return 0;
}

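
One detail worth checking in a program like this is the loop bound SIXTEENMB / EIGHTKB, which depends on how the two size macros are parenthesized. The sketch below uses hypothetical macro and function names (BARE_*, PAREN_*, chunks_bare, chunks_paren) purely to illustrate the precedence issue: with bare products the quotient is 131072, while with parenthesized macros it is the intended 2048.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
#define BARE_16MB  1024*1024*16
#define BARE_8KB   1024*8
#define PAREN_16MB (1024 * 1024 * 16)
#define PAREN_8KB  (1024 * 8)

/* Expands to 1024*1024*16 / 1024*8, i.e. ((1024*1024*16) / 1024) * 8. */
int chunks_bare(void)
{
	return BARE_16MB / BARE_8KB;
}

/* Expands to (1024 * 1024 * 16) / (1024 * 8), the intended quotient. */
int chunks_paren(void)
{
	int n = PAREN_16MB / PAREN_8KB;

	/* the chunk count times the chunk size must give back 16MB */
	assert(n * PAREN_8KB == PAREN_16MB);
	return n;
}
```

With 8KB writes, 131072 iterations amount to 1GB per pass and not 16MB, which would skew any comparison against a single 16MB posix_fallocate call.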
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)