Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-24 Thread Christoph Berg
Re: Andres Freund
> > I'd be good with tweaking this test case as you suggest, and maybe
> > revisiting the topic later.
> 
> Pushed the change of the test. Christoph, just to make sure, can you confirm
> that this fixes the test instability for you?

Unfortunately I could not reproduce the problem on the mipsel porter
box I have access to (which is the same box used as mips64el porter
box for the other thread). Running the "stats" test 30 times in a loop
always made it pass.

On the Debian buildds, the build has succeeded now in the 3rd try:

https://buildd.debian.org/status/logs.php?pkg=postgresql-15=15%7Ebeta1-1=mipsel

Christoph




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-22 Thread Andres Freund
Hi,

On 2022-05-20 01:25:10 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-05-20 00:22:14 -0400, Tom Lane wrote:
> >> There's some fallout in the expected-file, of course, but this
> >> does seem to fix it (20 consecutive successful runs now at
> >> 100/2).  Don't see why though ...
> 
> > I think what might be happening is that the transactional stats updates get
> > reported by s2 *before* the non-transactional stats updates come in from
> > s1. I.e. the pgstat_report_stat() at the end of s2_commit_prepared_a does a
> > report, because the machine is slow enough for it to be "time to reports 
> > stats
> > again". Then s1 reports its non-transactional stats.
> 
> Sounds plausible.  And I left the test loop running, and it's now past
> 100 consecutive successes, so I think this change definitely "fixes" it.

FWIW, the problem can be reliably reproduced by sticking a
pgstat_force_next_flush() into pgstat_twophase_postcommit(). This is the only
failure when doing so.


> > It looks like our stats maintenance around truncation isn't quite 
> > "concurrency
> > safe". That code hasn't meaningfully changed, but it'd not be surprising if
> > it's not 100% precise...
> 
> Yeah.  Probably not something to try to improve post-beta, especially
> since it's not completely clear how transactional and non-transactional
> cases *should* interact.

Yea. It's also not normally particularly crucial to be accurate down to that
degree.


> Maybe non-transactional updates should be
> pushed immediately?  But I'm not sure if that's fully correct, and
> it definitely sounds expensive.

I think that'd be far too expensive - the majority of stats are
non-transactional...

I think what we could do is to model truncates as subtracting the number of
live/dead rows the truncating backend knows about, rather than setting them to
0. But that of course could incur other inaccuracies.


> I'd be good with tweaking this test case as you suggest, and maybe
> revisiting the topic later.

Pushed the change of the test. Christoph, just to make sure, can you confirm
that this fixes the test instability for you?


> Kyotaro-san worried about whether any other places in stats.spec
> have the same issue.  I've not seen any evidence of that in my
> tests, but perhaps some other machine with different timing
> could find it.

I tried to find some by putting in forced flushes in a bunch of places before,
and now some more, without finding further cases.

Greetings,

Andres Freund




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-20 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> I think what might be happening is that the transactional stats updates get
>> reported by s2 *before* the non-transactional stats updates come in from
>> s1. I.e. the pgstat_report_stat() at the end of s2_commit_prepared_a does a
>> report, because the machine is slow enough for it to be "time to reports 
>> stats
>> again". Then s1 reports its non-transactional stats.

> Sounds plausible.  And I left the test loop running, and it's now past
> 100 consecutive successes, so I think this change definitely "fixes" it.

In the light of morning, at least half of the parameter dependency is
now obvious: the problematic test case involves a prepared transaction,
so it fails completely with max_prepared_transactions = 0.  The isolation
test harness masks that by matching against stats_1.out, but it's not
really a "success".

My numbers do still suggest that there's a weak dependence on
max_connections, but it's possible that that's a mirage.  I did not run
enough test cycles to be able to say positively that that's a real effect
(and the machine's slow enough that I'm not excited about doing so).

regards, tom lane




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Tom Lane
Andres Freund  writes:
> On 2022-05-20 00:22:14 -0400, Tom Lane wrote:
>> There's some fallout in the expected-file, of course, but this
>> does seem to fix it (20 consecutive successful runs now at
>> 100/2).  Don't see why though ...

> I think what might be happening is that the transactional stats updates get
> reported by s2 *before* the non-transactional stats updates come in from
> s1. I.e. the pgstat_report_stat() at the end of s2_commit_prepared_a does a
> report, because the machine is slow enough for it to be "time to reports stats
> again". Then s1 reports its non-transactional stats.

Sounds plausible.  And I left the test loop running, and it's now past
100 consecutive successes, so I think this change definitely "fixes" it.

> It looks like our stats maintenance around truncation isn't quite "concurrency
> safe". That code hasn't meaningfully changed, but it'd not be surprising if
> it's not 100% precise...

Yeah.  Probably not something to try to improve post-beta, especially
since it's not completely clear how transactional and non-transactional
cases *should* interact.  Maybe non-transactional updates should be
pushed immediately?  But I'm not sure if that's fully correct, and
it definitely sounds expensive.

I'd be good with tweaking this test case as you suggest, and maybe
revisiting the topic later.

Kyotaro-san worried about whether any other places in stats.spec
have the same issue.  I've not seen any evidence of that in my
tests, but perhaps some other machine with different timing
could find it.

regards, tom lane




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Andres Freund
Hi,

On 2022-05-20 00:22:14 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Hm. Does the instability vanish if you switch s2_commit_prepared_a and 
> > s1_ff?
> 
> Like this?

Yea.


> diff --git a/src/test/isolation/specs/stats.spec 
> b/src/test/isolation/specs/stats.spec
> index be4ae1f4ff..70be29b207 100644
> --- a/src/test/isolation/specs/stats.spec
> +++ b/src/test/isolation/specs/stats.spec
> @@ -562,8 +562,9 @@ permutation
>s1_table_insert_k1 # should be counted
>s1_table_update_k1 # dito
>s1_prepare_a
> +  s1_ff
>s2_commit_prepared_a
> -  s1_ff s2_ff
> +  s2_ff
>s1_table_stats
>  
>  # S1 prepares, S1 aborts prepared
> 
> There's some fallout in the expected-file, of course, but this
> does seem to fix it (20 consecutive successful runs now at
> 100/2).  Don't see why though ...

I think what might be happening is that the transactional stats updates get
reported by s2 *before* the non-transactional stats updates come in from
s1. I.e. the pgstat_report_stat() at the end of s2_commit_prepared_a does a
report, because the machine is slow enough for it to be "time to reports stats
again". Then s1 reports its non-transactional stats.

It looks like our stats maintenance around truncation isn't quite "concurrency
safe". That code hasn't meaningfully changed, but it'd not be surprising if
it's not 100% precise...

Greetings,

Andres Freund




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Tom Lane
Andres Freund  writes:
> Hm. Does the instability vanish if you switch s2_commit_prepared_a and s1_ff?

Like this?

diff --git a/src/test/isolation/specs/stats.spec 
b/src/test/isolation/specs/stats.spec
index be4ae1f4ff..70be29b207 100644
--- a/src/test/isolation/specs/stats.spec
+++ b/src/test/isolation/specs/stats.spec
@@ -562,8 +562,9 @@ permutation
   s1_table_insert_k1 # should be counted
   s1_table_update_k1 # dito
   s1_prepare_a
+  s1_ff
   s2_commit_prepared_a
-  s1_ff s2_ff
+  s2_ff
   s1_table_stats
 
 # S1 prepares, S1 aborts prepared

There's some fallout in the expected-file, of course, but this
does seem to fix it (20 consecutive successful runs now at
100/2).  Don't see why though ...

regards, tom lane




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Kyotaro Horiguchi
At Thu, 19 May 2022 22:58:14 -0400, Tom Lane  wrote in 
> Kyotaro Horiguchi  writes:
> Notice that these markers can only delay reporting of the completion
> of a step, not the launch of a step.  The isolationtester will launch
> the next step in a permutation as soon as (A) all prior steps of the
> same session are done, and (B) the immediately preceding step in the
> permutation is done or deemed blocked.  For this purpose, "deemed
> blocked" means that it has been seen to be waiting on a database lock,
> or that it is complete but the report of its completion is delayed by
> one of these markers.
> 
> There's no "waiting..." reports in the test output, nor do we have any
> condition markers in stats.spec right now, so I don't think any steps
> have been "deemed blocked".

Mmm... Thanks. I miunderstood the effect of it..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Andres Freund
Hi,

On 2022-05-19 22:58:14 -0400, Tom Lane wrote:
> What I am wondering about at this point is whether the effects of
> pg_stat_force_next_flush() could somehow be delayed until after we
> have told the client the command is complete.

It shouldn't - it just forces pg_stat_report_stat() to flush (rather than
doing so based on the time of the last report). And pg_stat_report_stat()
happens before ReadyForQuery().

Hm. Does the instability vanish if you switch s2_commit_prepared_a and s1_ff?

Greetings,

Andres Freund




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Tom Lane
Andres Freund  writes:
> On 2022-05-19 21:42:31 -0400, Tom Lane wrote:
>> Even more interesting, the repeatability varies with the settings
>> of max_connections and max_prepared_transactions.

> That is indeed quite odd.

>> At low values (resp. 20 and 0) I've not been able to make it happen at all,
>> but at 100 and 2 it happens circa three times out of four.

> Is this the only permutation that fails?

No, but those values definitely seem to affect the probability of
failure.  I just finished a more extensive series of runs, and got:

successes/tries with max_connections/max_prepared_transactions:
5/5 OK with 20/2
5/5 OK with 100/0
3/5 OK with 100/2
4/5 OK with 200/2
5/6 OK with 100/10
5/6 OK with 50/2
6/10 OK with 100/2

It seems like the probability decreases again if I raise either
number further.  So I'm now guessing that this is purely a timing
issue and that somehow 100/2 is near the sweet spot for hitting
the timing window.  Why those settings would affect pgstats at
all is unclear, though.

regards, tom lane




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Andres Freund
Hi,

On 2022-05-19 21:42:31 -0400, Tom Lane wrote:
> >  
> > seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
> >  
> > ++-+-+-+--+--+
> > -   3|   9|5|1|0| 1| 1| 
> >   0
> > +   3|   9|5|1|0| 4| 1| 
> >   0
> >  (1 row)

> Even more interesting, the repeatability varies with the settings
> of max_connections and max_prepared_transactions.

That is indeed quite odd.


> At low values (resp. 20 and 0) I've not been able to make it happen at all,
> but at 100 and 2 it happens circa three times out of four.

Is this the only permutation that fails?

Greetings,

Andres Freund




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Tom Lane
Kyotaro Horiguchi  writes:
> At Fri, 20 May 2022 11:02:21 +0900 (JST), Kyotaro Horiguchi 
>  wrote in 
>> Doesn't the step s1_table_stats needs a blocking condition (s2_ff)?
> s/needs/need/;
> If I removed the step s2_ff, I see the same difference. So I think it
> is that.  Some other permutations seem to need the same.

Hmm ... it does seem like the answer might be somewhere around there,
but it's not there exactly.  This won't fix it:

-  s1_table_stats
+  s1_table_stats(s2_ff)

That sort of marker only stabilizes cases where two different steps might
be reported as completing in either order.  In this case, that's clearly
not the problem.  What we do need is to ensure that s1_table_stats doesn't
*launch* before s2_ff is done.  However, it doesn't look to me like that's
what's happening.  isolation/README explains that

Notice that these markers can only delay reporting of the completion
of a step, not the launch of a step.  The isolationtester will launch
the next step in a permutation as soon as (A) all prior steps of the
same session are done, and (B) the immediately preceding step in the
permutation is done or deemed blocked.  For this purpose, "deemed
blocked" means that it has been seen to be waiting on a database lock,
or that it is complete but the report of its completion is delayed by
one of these markers.

There's no "waiting..." reports in the test output, nor do we have any
condition markers in stats.spec right now, so I don't think any steps
have been "deemed blocked".

What I am wondering about at this point is whether the effects of
pg_stat_force_next_flush() could somehow be delayed until after we
have told the client the command is complete.  I've not poked into
the code in that area, but if that could happen it'd explain this
behavior.

regards, tom lane




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Kyotaro Horiguchi
At Fri, 20 May 2022 11:02:21 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> Doesn't the step s1_table_stats needs a blocking condition (s2_ff)?

s/needs/need/;

If I removed the step s2_ff, I see the same difference. So I think it
is that.  Some other permutations seem to need the same.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/test/isolation/specs/stats.spec b/src/test/isolation/specs/stats.spec
index be4ae1f4ff..441dd7d3b8 100644
--- a/src/test/isolation/specs/stats.spec
+++ b/src/test/isolation/specs/stats.spec
@@ -417,7 +417,7 @@ permutation
   s2_table_update_k1
   s1_table_drop
   s2_ff
-  s1_table_stats
+  s1_table_stats(s2_ff)
 
 permutation
   s1_table_select
@@ -461,7 +461,7 @@ permutation
   s2_table_select
   s1_track_counts_on
   s1_ff s2_ff
-  s1_table_stats
+  s1_table_stats(s2_ff)
   # but can count again after
   s1_table_select
   s1_table_update_k1
@@ -478,7 +478,7 @@ permutation
   s2_table_select
   s1_track_counts_on
   s1_ff s2_ff
-  s1_table_stats
+  s1_table_stats(s2_ff)
   s1_table_select
   s1_table_update_k1
   s1_ff
@@ -509,7 +509,7 @@ permutation
   s2_commit_prepared_a
   s1_table_select
   s1_ff s2_ff
-  s1_table_stats
+  s1_table_stats(s2_ff)
 
 # S1 prepares, S2 commits prepared
 permutation
@@ -533,7 +533,7 @@ permutation
   s2_rollback_prepared_a
   s1_table_select
   s1_ff s2_ff
-  s1_table_stats
+  s1_table_stats(s2_ff)
 
 
 ### 2PC: truncate handling
@@ -564,7 +564,7 @@ permutation
   s1_prepare_a
   s2_commit_prepared_a
   s1_ff s2_ff
-  s1_table_stats
+  s1_table_stats(s2_ff)
 
 # S1 prepares, S1 aborts prepared
 permutation
@@ -592,7 +592,7 @@ permutation
   s1_prepare_a
   s2_rollback_prepared_a
   s1_ff s2_ff
-  s1_table_stats
+  s1_table_stats(s2_ff)
 
 
 ### 2PC: rolled back drop maintains live / dead counters
@@ -627,7 +627,7 @@ permutation
   s1_prepare_a
   s2_rollback_prepared_a
   s1_ff s2_ff
-  s1_table_stats
+  s1_table_stats(s2_ff)
 
 
 ##


Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Kyotaro Horiguchi
At Thu, 19 May 2022 21:42:31 -0400, Tom Lane  wrote in 
> Christoph Berg  writes:
> > Debian unstable mips (the old 32-bit one):
> 
> > --- /<>/src/test/isolation/expected/stats.out  2022-05-16 
> > 21:10:42.0 +
> > +++ /<>/build/src/test/isolation/output_iso/results/stats.out  
> > 2022-05-18 23:26:56.573000536 +
> > @@ -2854,7 +2854,7 @@
> 
> >  
> > seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
> >  
> > ++-+-+-+--+--+
> > -   3|   9|5|1|0| 1| 1| 
> >   0
> > +   3|   9|5|1|0| 4| 1| 
> >   0
> >  (1 row)
> 
> I have just discovered that I can reproduce this identical symptom
> fairly repeatably on an experimental lashup that I've been running
> with bleeding-edge NetBSD on my ancient HPPA box.  (You didn't think
> I was just going to walk away from that hardware, did you?)
> 
> Even more interesting, the repeatability varies with the settings
> of max_connections and max_prepared_transactions.  At low values
> (resp. 20 and 0) I've not been able to make it happen at all, but
> at 100 and 2 it happens circa three times out of four.
> 
> I have no idea where to start looking, but this is clearly an issue
> in the new stats code ... or else the hoped-for goal of removing
> flakiness from the stats tests is just as far away as ever.

Doesn't the step s1_table_stats needs a blocking condition (s2_ff)?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Tom Lane
Christoph Berg  writes:
> Debian unstable mips (the old 32-bit one):

> --- /<>/src/test/isolation/expected/stats.out2022-05-16 
> 21:10:42.0 +
> +++ /<>/build/src/test/isolation/output_iso/results/stats.out
> 2022-05-18 23:26:56.573000536 +
> @@ -2854,7 +2854,7 @@

>  
> seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
>  
> ++-+-+-+--+--+
> -   3|   9|5|1|0| 1| 1|   
> 0
> +   3|   9|5|1|0| 4| 1|   
> 0
>  (1 row)

I have just discovered that I can reproduce this identical symptom
fairly repeatably on an experimental lashup that I've been running
with bleeding-edge NetBSD on my ancient HPPA box.  (You didn't think
I was just going to walk away from that hardware, did you?)

Even more interesting, the repeatability varies with the settings
of max_connections and max_prepared_transactions.  At low values
(resp. 20 and 0) I've not been able to make it happen at all, but
at 100 and 2 it happens circa three times out of four.

I have no idea where to start looking, but this is clearly an issue
in the new stats code ... or else the hoped-for goal of removing
flakiness from the stats tests is just as far away as ever.

regards, tom lane




15beta1 test failure on mips in isolation/expected/stats

2022-05-19 Thread Christoph Berg
Debian unstable mips (the old 32-bit one):

test vacuum-conflict  ... ok 2170 ms
test vacuum-skip-locked   ... ok 2445 ms
test stats... FAILED38898 ms
test horizons ... ok 4543 ms
test predicate-hash   ... ok22419 ms

 build/src/test/isolation/output_iso/regression.diffs 
diff -U3 /<>/src/test/isolation/expected/stats.out 
/<>/build/src/test/isolation/output_iso/results/stats.out
--- /<>/src/test/isolation/expected/stats.out  2022-05-16 
21:10:42.0 +
+++ /<>/build/src/test/isolation/output_iso/results/stats.out  
2022-05-18 23:26:56.573000536 +
@@ -2854,7 +2854,7 @@

 
seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
 
++-+-+-+--+--+
-   3|   9|5|1|0| 1| 1| 
  0
+   3|   9|5|1|0| 4| 1| 
  0
 (1 row)

Full build log:
https://buildd.debian.org/status/fetch.php?pkg=postgresql-15=mipsel=15%7Ebeta1-1=1652916588=0

(I'll try rescheduling this build as well, the last one took 4h before
it failed.)

Christoph