On 12 April 2017 at 13:27, Craig Ringer <craig.rin...@2ndquadrant.com> wrote:
> On 12 April 2017 at 08:22, Michael Paquier <michael.paqu...@gmail.com> wrote:
>> On Wed, Apr 12, 2017 at 9:12 AM, Craig Ringer
>> <craig.rin...@2ndquadrant.com> wrote:
>>> Well, you can get a lot of information on timings in crake's latest
>>> builds for example, or nightjar's.
>>>
>>> Here's an interesting fact: almost all the time taken up in the scripts
>>> test set seems to be consumed in running initdb over and over.
>>>
>>> Is it worth adding an init(cache => 1) option to PostgresNode, where we
>>> stash an initdb'd db in tmp_check/  and just do a simple fs copy of it ?
>>
>> This looks like a good idea to me, but I don't like much the idea of
>> an option in the init() routine as that's hard to maintain.
>
> How so?
>
>> It would
>> make sense to me to be able to override the initialization with an
>> environment variable instead
>
> Yeah, that's reasonable.
>
> Patch attached. It supports setting CACHE_INITDB=0 to disable the cache.
>
> With cache:    3m55.063s (user 0m7.7s, sys 0m11.833s)
>
> Without cache: 4m11.229s (user 0m16.963s, sys 0m12.384s)
>
> so in a serial run it doesn't make a ton of difference.
>
> Adding some timing to the initdb runs shows that initdb takes about 1s,
> while copying a cached result takes about 0.1s with our Perl-based
> recursive copy code. So it's hardly an overwhelming benefit, but it
> might be more beneficial with prove -j.

Of course, it'll need locking for prove -j. Added. Patch attached; it
applies on top of the prior patch.

With

make PROVE_FLAGS="--timer -j 9" check

I don't see much difference with or without caching initdb results - it
saves about 4 seconds, from 74 down to 70, but that's hard to tell apart
from noise given the error margins.
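
For anyone wanting to compare, the cache can be disabled with the env var
from the previous patch, e.g.:

CACHE_INITDB=0 make PROVE_FLAGS="--timer -j 9" check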

So if we're going to do anything, defaulting to parallel prove seems a
good start, and I'm not sure caching initdb results is worth it.
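
For anyone reading just this mail: the caching itself lives in the earlier
patch, not the one below. The gist of its init() hook is roughly the
following sketch (not the actual hunk - _initdb_cached is the routine the
diff below modifies, and the exact CACHE_INITDB check may differ):

	# Sketch only: choose between a cached and a plain initdb inside
	# PostgresNode::init(), controlled by the CACHE_INITDB env var.
	my $use_cache = (!defined $ENV{CACHE_INITDB} || $ENV{CACHE_INITDB});
	if ($use_cache)
	{
		# Default: populate the data directory from a cached initdb run.
		$self->_initdb_cached(%params);
	}
	else
	{
		# CACHE_INITDB=0: run initdb directly into the data directory.
		$self->_initdb($self->data_dir, %params);
	}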

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From db1aadf450a15cbd4041c5f9bf5a0b16f66c2ad5 Mon Sep 17 00:00:00 2001
From: Craig Ringer <cr...@2ndquadrant.com>
Date: Wed, 12 Apr 2017 14:03:25 +0800
Subject: [PATCH] Add locking for initdb caching to allow parallel prove

---
 src/test/perl/PostgresNode.pm | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index eef9f66..0eacd41 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -94,6 +94,7 @@ use Socket;
 use Test::More;
 use TestLib ();
 use Scalar::Util qw(blessed);
+use Fcntl qw(:flock);
 
 our @EXPORT = qw(
   get_new_node
@@ -477,15 +478,27 @@ sub _initdb_cached
 	my $pgdata = $self->data_dir;
 
 	my $cachedir = TestLib::tmp_check_dir() . "/initdb_cache";
+	my $lockfile = TestLib::tmp_check_dir() . "/.initdb_cache.lock";
 
 	if (! -d $cachedir)
 	{
 		print(STDERR "initializing initdb cache\n");
-		# initdb into a tempdir, then rename the result so we don't risk
-		# leaving failed initdb results around
-		my $temp_dbdir = TestLib::tempdir("initdb_cache_");
-		$self->_initdb($temp_dbdir, %params);
-		rename($temp_dbdir, $cachedir);
+		open my $lockfh, ">", $lockfile
+			or die "cannot open $lockfile for writing: $!\n";
+		flock($lockfh, LOCK_EX)
+			or die "cannot flock($lockfile) in exclusive mode: $!\n";
+		# If running in parallel with other tests, someone else might've got
+		# the lock first and have swapped an initdb'd directory in already.
+		if (! -d $cachedir)
+		{
+			# initdb into a tempdir, then rename the result so we don't risk
+			# leaving failed initdb results around
+			my $temp_dbdir = TestLib::tempdir("initdb_cache_");
+			$self->_initdb($temp_dbdir, %params);
+			rename($temp_dbdir, $cachedir);
+		}
+		flock($lockfh, LOCK_UN)
+			or die "cannot unlock $lockfile: $!";
 	}
 
 	RecursiveCopy::copypath($cachedir, $pgdata);
-- 
2.5.5
