Re: [HACKERS] Bogus cleanup code in PostgresNode.pm

Michael Paquier Mon, 25 Apr 2016 21:21:48 -0700

On Mon, Apr 25, 2016 at 11:51 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I noticed that even when they are successful, buildfarm members bowerbird
> and jacana tend to spew a lot of messages like this in their bin-check
> steps:
>
> Can't remove directory 
> /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata/global:
>  Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory 
> /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata/pg_xlog:
>  Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory 
> /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata:
>  Permission denied at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory 
> /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf:
>  Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> ### Signalling QUIT to 9156 for node "main"
> # Running: pg_ctl kill QUIT 9156
>
> What is happening here is that the test script is not bothering to do an
> explicit $node->stop operation, and if it doesn't, the automatic cleanup
> steps happen in the wrong order: the File::Temp destructor for the temp
> data directory runs before PostgresNode.pm's DESTROY function, which is
> what's issuing the "pg_ctl kill" command.  On Unix that's just messy,
> but on Windows it fails because you can't delete a process's working
> directory.  I am not sure whether this is guaranteed wrong or just
> sometimes wrong; the Perl docs I can find say that destructors are run in
> unspecified order once interpreter shutdown begins.  But by adding some
> debug printout I was able to verify on my own machine that the data
> directory was already gone when DESTROY runs.


The docs say regarding File::Temp that he object is removed once the
object goes out of scope in the parent:
http://search.cpan.org/~dagolden/File-Temp-0.2304/lib/File/Temp.pm
So basically it means that when we enter in PostgresNode's DESTROY the
temporary folder just "went out of scope" and has been removed?

DESTROY is run once per object, END is a global destructor, and END is
called really at the end of the execution. And actually one reason why
a DESTROY block instead of END is given by Alvaro here:
http://www.postgresql.org/message-id/20151201231121.GI2763@alvherre.pgsql
"
- I changed start/stop/restart so that they keep track of the postmaster
  PID; also added a DESTROY sub to PostgresNode that sends SIGQUIT.
  This means that when the test finishes, the server gets an immediate
  stop signal.  We were getting a lot of errors in the server log about
  failing to write to the stats file otherwise, until the node noticed
  that the datadir was gone.
"

> I believe we can fix this by forcing postmaster shutdown in an END
> routine instead of a DESTROY routine, and hence propose the attached
> patch, which does things in the right order for me.  I'm a pretty
> poor Perl programmer, so I'd appreciate somebody vetting this.

Another, perhaps more solid approach, would be put the DESTROY method
in charge of removing PGDATA and extend TestLib::tempdir with an
argument to be able to switch to CLEANUP => 0 at will. Then we use
this argument for PGDATA after sending SIGQUIT.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bogus cleanup code in PostgresNode.pm

Reply via email to