Re: Fwd: Re: [HACKERS] MSVC odd TAP test problem

2017-05-10 Thread Michael Paquier
On Thu, May 11, 2017 at 7:29 AM, Andrew Dunstan wrote:
> This isn't going to work. If you look at the code in IPC/Run.pm you see
> that the coup_d_grace signal is only used after it has first sent the
> hardcoded SIGTERM. It might be tempting to play with using Sysinternals'
> pskill utility, but we can hardly expect buildfarm owners and others to
> hack their copies of IPC/Run.pm, so I'm going to go ahead and commit the
> changes I proposed.

OK, thanks for checking. That was worth a try.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Fwd: Re: [HACKERS] MSVC odd TAP test problem

2017-05-10 Thread Andrew Dunstan


On 05/10/2017 01:53 AM, Andrew Dunstan wrote:
>
>> Does it make a difference if you use for example coup_d_grace =>
>> "QUIT"? Per the docs of IPC::Run SIGTERM is used for kills on Windows.
>
> No idea. I'll try.
>
>
>


This isn't going to work. If you look at the code in IPC/Run.pm you see
that the coup_d_grace signal is only used after it has first sent the
hardcoded SIGTERM. It might be tempting to play with using Sysinternals'
pskill utility, but we can hardly expect buildfarm owners and others to
hack their copies of IPC/Run.pm, so I'm going to go ahead and commit the
changes I proposed.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Fwd: Re: [HACKERS] MSVC odd TAP test problem

2017-05-09 Thread Andrew Dunstan

On 05/09/2017 09:37 PM, Michael Paquier wrote:

> On Wed, May 10, 2017 at 2:11 AM, Andrew Dunstan wrote:
>> (After extensive trial and error) Turns out it's not quite that, it's
>> the kill_kill stuff. I think for now we should just disable it on the
>> platform. That means not running tests 7 and 8 of the logical_decoding
>> tests and all of the crash_recovery test. Test::More has nice
>> facilities for skipping tests cleanly. See attached patch.
> +SKIP:
> +{
> +# some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
> +skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';
> So this basically works with msys but not with MSWin32? Interesting...


On Msys we use the Msys DTK perl to run prove, and it executes the Msys
shell to run commands, with Msys signal emulation. The buildfarm client
goes to some trouble to arrange this. So it's very different.



>
> Does it make a difference if you use for example coup_d_grace =>
> "QUIT"? Per the docs of IPC::Run SIGTERM is used for kills on Windows.


No idea. I'll try.


>
> +if  ($Config{osname} eq 'MSWin32')
> +{
> +# some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
> +plan skip_all => "Test fails on Windows perl";
> +}
> Indentation is weird here, with a mix of spaces and tabs.


I will indent it before I commit anything.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] MSVC odd TAP test problem

2017-05-09 Thread Michael Paquier
On Wed, May 10, 2017 at 2:11 AM, Andrew Dunstan wrote:
> (After extensive trial and error) Turns out it's not quite that, it's
> the kill_kill stuff. I think for now we should just disable it on the
> platform. That means not running tests 7 and 8 of the logical_decoding
> tests and all of the crash_recovery test. Test::More has nice
> facilities for skipping tests cleanly. See attached patch.

+SKIP:
+{
+# some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
+skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';
So this basically works with msys but not with MSWin32? Interesting...

Does it make a difference if you use for example coup_d_grace =>
"QUIT"? Per the docs of IPC::Run SIGTERM is used for kills on Windows.

+if  ($Config{osname} eq 'MSWin32')
+{
+# some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
+plan skip_all => "Test fails on Windows perl";
+}
Indentation is weird here, with a mix of spaces and tabs.
-- 
Michael




Re: [HACKERS] MSVC odd TAP test problem

2017-05-09 Thread Andrew Dunstan


On 05/06/2017 08:54 PM, Andrew Dunstan wrote:
>
> On 05/06/2017 07:41 PM, Craig Ringer wrote:
>>
>> On 7 May 2017 4:24 am, "Andrew Dunstan" wrote:
>>
>>
>> I have been working on enabling the remaining TAP tests on MSVC build in
>> the buildfarm client, but I have come across an odd problem. The bin
>> tests all run fine, but the recovery tests crash and in such a way as to
>> crash the buildfarm client itself and require some manual cleanup. This
>> happens at some stage after the tests have run (the final "ok" is
>> output) but before the END handler in PostgresNode.pm (I put some traces
>> in there to see if I could narrow down where there were problems).
>>
>> The symptom is that this appears at the end of the output when the
>> client calls "vcregress.pl taptest src/test/recovery":
>>
>> Terminating on signal SIGBREAK(21)
>> Terminating on signal SIGBREAK(21)
>> Terminate batch job (Y/N)?
>>
>> And at that point there is nothing at all apparently running, according
>> to Sysinternals Process Explorer, including the buildfarm client.
>>
>> It's 100% repeatable on bowerbird, and I'm a bit puzzled about how to
>> fix it.
>>
>>
>> Anyone have any clues?
>>
>>
>> That looks like we've upset CMD.exe itself. I'm not sure how ...
>> leaking a signal to the parent proc?
>>
>> I suspect this could be something to do with console process groups.
>>
>> Bowerbird is win8. So this isn't going to be related to the support
>> for ANSI escapes added in win10.
>>
>> A search for the error turns up a complaint about IPC::Run as the
>> first hit. Probably not coincidence.
>>
>>
>> http://stackoverflow.com/q/40924750
>>
>> See this bug
>>
>> https://rt.cpan.org/Public/Bug/Display.html?id=101093
>>
>>
>>
>
>
> Actually, it's Win10, looks like I forgot to update the personality, my bad.
>
> I had a feeling it was probably something to do with timeout. That RT
> ticket looks like it's on the money.
>



(After extensive trial and error) Turns out it's not quite that, it's
the kill_kill stuff. I think for now we should just disable it on the
platform. That means not running tests 7 and 8 of the logical_decoding
tests and all of the crash_recovery test. Test::More has nice
facilities for skipping tests cleanly. See attached patch.
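
For readers unfamiliar with the Test::More facilities mentioned above, here
is a minimal standalone sketch of the skipping pattern. It is not part of the
attached patch: the helper name is_native_windows_perl and the placeholder
ok() checks are assumptions made for illustration, standing in for the real
start/kill_kill tests.

```perl
use strict;
use warnings;
use Config;
use Test::More;

# Hypothetical helper (not in the patch): true when running under a native
# Windows Perl, which reports its osname as 'MSWin32'. An explicit osname
# can be passed in; otherwise the value from %Config is used.
sub is_native_windows_perl
{
    my ($osname) = @_;
    $osname = $Config{osname} unless defined $osname;
    return $osname eq 'MSWin32';
}

# The plan counts skipped tests too, so it is the same on every platform.
plan tests => 3;

ok(1, 'placeholder check that is safe on every platform');

SKIP:
{
    # skip() records the next two tests as skipped, with the given reason,
    # and then leaves the block labelled SKIP without running its body.
    skip "Test fails on Windows perl", 2 if is_native_windows_perl();

    ok(1, 'placeholder for the first start/kill_kill check');
    ok(1, 'placeholder for the second start/kill_kill check');
}
```

When every test in a file is affected, calling plan skip_all => "reason"
before any test has run skips the whole script instead, which is the
approach the patch takes for the crash_recovery test.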

cheers

andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From f3ffdad568e9fbce6b8cc3c6ffc4490842b1b5fb Mon Sep 17 00:00:00 2001
From: Andrew Dunstan 
Date: Tue, 9 May 2017 13:03:41 -0400
Subject: [PATCH] Avoid tests which crash the calling process on Windows

Certain recovery tests use the Perl IPC::Run module's start/kill_kill
method of processing. On at least some versions of perl this causes the
whole process and its caller to crash. If we ever find a better way of
doing these tests they can be re-enabled.
---
 src/test/recovery/t/006_logical_decoding.pl | 22 +++---
 src/test/recovery/t/011_crash_recovery.pl   | 12 +++-
 2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/src/test/recovery/t/006_logical_decoding.pl b/src/test/recovery/t/006_logical_decoding.pl
index bf9b50a..095cfa8 100644
--- a/src/test/recovery/t/006_logical_decoding.pl
+++ b/src/test/recovery/t/006_logical_decoding.pl
@@ -8,6 +8,7 @@ use warnings;
 use PostgresNode;
 use TestLib;
 use Test::More tests => 16;
+use Config;
 
 # Initialize master node
 my $node_master = get_new_node('master');
@@ -72,13 +73,20 @@ is($node_master->psql('otherdb', "SELECT location FROM pg_logical_slot_peek_chan
 $node_master->safe_psql('otherdb', qq[SELECT pg_create_logical_replication_slot('otherdb_slot', 'test_decoding');]);
 
 # make sure you can't drop a slot while active
-my $pg_recvlogical = IPC::Run::start(['pg_recvlogical', '-d', $node_master->connstr('otherdb'), '-S', 'otherdb_slot', '-f', '-', '--start']);
-$node_master->poll_query_until('otherdb', "SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = 'otherdb_slot' AND active_pid IS NOT NULL)");
-is($node_master->psql('postgres', 'DROP DATABASE otherdb'), 3,
-	'dropping a DB with inactive logical slots fails');
-$pg_recvlogical->kill_kill;
-is($node_master->slot('otherdb_slot')->{'slot_name'}, undef,
-	'logical slot still exists');
+#
+SKIP:
+{
+	# some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
+	skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';
+
+	my $pg_recvlogical = IPC::Run::start(['pg_recvlogical', '-d', $node_master->connstr('otherdb'), '-S', 'otherdb_slot', '-f', '-', '--start']);
+	$node_master->poll_query_until('otherdb', "SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = 'otherdb_slot' AND active_pid IS NOT NULL)");
+	is($node_master->psql('postgres', 'DROP DATABASE otherdb'), 3,

Re: [HACKERS] MSVC odd TAP test problem

2017-05-06 Thread Andrew Dunstan


On 05/06/2017 07:41 PM, Craig Ringer wrote:
>
>
> On 7 May 2017 4:24 am, "Andrew Dunstan" wrote:
>
>
> I have been working on enabling the remaining TAP tests on MSVC build in
> the buildfarm client, but I have come across an odd problem. The bin
> tests all run fine, but the recovery tests crash and in such a way as to
> crash the buildfarm client itself and require some manual cleanup. This
> happens at some stage after the tests have run (the final "ok" is
> output) but before the END handler in PostgresNode.pm (I put some traces
> in there to see if I could narrow down where there were problems).
>
> The symptom is that this appears at the end of the output when the
> client calls "vcregress.pl taptest src/test/recovery":
>
> Terminating on signal SIGBREAK(21)
> Terminating on signal SIGBREAK(21)
> Terminate batch job (Y/N)?
>
> And at that point there is nothing at all apparently running, according
> to Sysinternals Process Explorer, including the buildfarm client.
>
> It's 100% repeatable on bowerbird, and I'm a bit puzzled about how to
> fix it.
>
>
> Anyone have any clues?
>
>
> That looks like we've upset CMD.exe itself. I'm not sure how ...
> leaking a signal to the parent proc?
>
> I suspect this could be something to do with console process groups.
>
> Bowerbird is win8. So this isn't going to be related to the support
> for ANSI escapes added in win10.
>
> A search for the error turns up a complaint about IPC::Run as the
> first hit. Probably not coincidence.
>
>
> http://stackoverflow.com/q/40924750
>
> See this bug
>
> https://rt.cpan.org/Public/Bug/Display.html?id=101093
>
>
>



Actually, it's Win10, looks like I forgot to update the personality, my bad.

I had a feeling it was probably something to do with timeout. That RT
ticket looks like it's on the money.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] MSVC odd TAP test problem

2017-05-06 Thread Craig Ringer
On 7 May 2017 4:24 am, "Andrew Dunstan" wrote:

> I have been working on enabling the remaining TAP tests on MSVC build in
> the buildfarm client, but I have come across an odd problem. The bin
> tests all run fine, but the recovery tests crash and in such a way as to
> crash the buildfarm client itself and require some manual cleanup. This
> happens at some stage after the tests have run (the final "ok" is
> output) but before the END handler in PostgresNode.pm (I put some traces
> in there to see if I could narrow down where there were problems).
>
> The symptom is that this appears at the end of the output when the
> client calls "vcregress.pl taptest src/test/recovery":
>
> Terminating on signal SIGBREAK(21)
> Terminating on signal SIGBREAK(21)
> Terminate batch job (Y/N)?
>
> And at that point there is nothing at all apparently running, according
> to Sysinternals Process Explorer, including the buildfarm client.
>
> It's 100% repeatable on bowerbird, and I'm a bit puzzled about how to
> fix it.
>
> Anyone have any clues?


That looks like we've upset CMD.exe itself. I'm not sure how ... leaking
a signal to the parent proc?

I suspect this could be something to do with console process groups.

Bowerbird is win8. So this isn't going to be related to the support for
ANSI escapes added in win10.

A search for the error turns up a complaint about IPC::Run as the first
hit. Probably not coincidence.


http://stackoverflow.com/q/40924750

See this bug

https://rt.cpan.org/Public/Bug/Display.html?id=101093