Re: pgsql: Support partition pruning at execution time

2018-04-17 Thread David Rowley
On 18 April 2018 at 07:26, Alvaro Herrera wrote:
> David Rowley wrote:
>
>> I've made another pass over the nodeAppend.c code and I'm unable to
>> see what might cause this, although I did discover a bug where
>> first_partial_plan is set without taking into account that some
>> subplans may have been pruned away during executor init. The only
>> consequence I can see is that parallel workers would not properly
>> help out with some partial plans if earlier subplans were pruned. I
>> see no reason for this to have caused this particular issue, since
>> first_partial_plan would be 0 both with and without the attached fix.
>
> Pushed this.

Thanks!

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-17 Thread Alvaro Herrera
David Rowley wrote:

> I've made another pass over the nodeAppend.c code and I'm unable to
> see what might cause this, although I did discover a bug where
> first_partial_plan is set without taking into account that some
> subplans may have been pruned away during executor init. The only
> consequence I can see is that parallel workers would not properly
> help out with some partial plans if earlier subplans were pruned. I
> see no reason for this to have caused this particular issue, since
> first_partial_plan would be 0 both with and without the attached fix.

Pushed this.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-12 Thread David Rowley
On 11 April 2018 at 18:58, David Rowley wrote:
> On 10 April 2018 at 08:55, Tom Lane wrote:
>> Alvaro Herrera writes:
>>> David Rowley wrote:
>>>> Okay, I've written and attached a fix for this.  I'm not 100% certain
>>>> that this is the cause of the problem on pademelon, but the code does
>>>> look wrong, so needs to be fixed. Hopefully, it'll make pademelon
>>>> happy, if not I'll think a bit harder about what might be causing that
>>>> instability.
>>
>>> Pushed it just now.  Let's see what happens with pademelon now.
>>
>> I've had pademelon's host running a "make installcheck" loop all day
>> trying to reproduce the problem.  I haven't gotten a bite yet (although
>> at 15+ minutes per cycle, this isn't a huge number of tests).  I think
>> we were remarkably (un)lucky to see the problem so quickly after the
>> initial commit, and I'm afraid pademelon isn't going to help us prove
>> much about whether this was the same issue.
>>
>> This does remind me quite a bit though of the ongoing saga with the
>> postgres_fdw test instability.  Given the frequency with which that's
>> failing in the buildfarm, you would not think it's impossible to
>> reproduce outside the buildfarm, and yet I'm here to tell you that
>> it's pretty damn hard.  I haven't succeeded yet, and that's not for
>> lack of trying.  Could there be something about the buildfarm
>> environment that makes these sorts of things more likely?
>
> coypu just demonstrated that this was not the cause of the problem [1]
>
> I'll study the code a bit more and see if I can think why this might
> be happening.
>
> [1] 
> https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=coypu=2018-04-11%2004%3A17%3A38=install-check-C

I've spent a bit of time tonight trying to dig into this problem to
see if I can figure out what's going on.

I ended up running the following script on both a Linux x86_64 machine
and also a power8 machine.

#!/bin/bash
# Run the test query repeatedly and diff each run's output against the
# expected output; any diff output indicates an unstable plan.
for x in {1..1000}
do
    echo "$x"
    for i in {1..1000}
    do
        psql -d postgres -f test.sql -o test.out
        diff -u test.out test.expect
    done
done

I was unable to recreate this problem after about 700k loops on the
Linux machine and 130k loops on the power8.

I've emailed the owner of coypu to ask if it would be possible to get
access to the machine, or have him run the script to see if it does
actually fail. Currently waiting to hear back.

I've made another pass over the nodeAppend.c code and I'm unable to
see what might cause this, although I did discover a bug where
first_partial_plan is set without taking into account that some
subplans may have been pruned away during executor init. The only
consequence I can see is that parallel workers would not properly
help out with some partial plans if earlier subplans were pruned. I
see no reason for this to have caused this particular issue, since
first_partial_plan would be 0 both with and without the attached fix.
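To make that concrete, here's a toy model in C (an illustration only;
the helper name is made up and this is not the attached fix).
first_partial_plan indexes the original subplan list, so after
init-time pruning it must be remapped onto the compacted list of
surviving subplans:

#include <stdio.h>

/*
 * Toy model, not the attached fix: "valid" holds the original indexes
 * of the subplans that survived init-time pruning, in ascending order.
 * The new first-partial index is the number of surviving subplans that
 * precede the original first_partial_plan.
 */
static int
remap_first_partial(const int *valid, int nvalid, int first_partial_plan)
{
    int     newpos = 0;

    for (int i = 0; i < nvalid; i++)
    {
        if (valid[i] >= first_partial_plan)
            break;
        newpos++;
    }
    return newpos;
}

int
main(void)
{
    /* subplans 0..5 originally; 3..5 are partial; pruning kept 1, 3, 5 */
    int     valid[] = {1, 3, 5};

    /* in the compacted list {1,3,5} the first partial plan is at index 1 */
    printf("first_partial_plan = %d\n", remap_first_partial(valid, 3, 3));
    return 0;
}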

Tom, would there be any chance you could run the above script for a
while on pademelon to see if it can in fact reproduce the problem?
coypu showed this problem during installcheck, so I don't think other
tests need to be running concurrently for it to fail.  If you can
recreate it, after adjusting the expected output, does the problem
still exist in 5c0675215?

I also checked whether any other tests perform an EXPLAIN ANALYZE of a
plan with a Parallel Append, and I see there are none, so I've not
ruled out that this is an existing bug.  git grep "explain.*analyze"
also does not show much outside of the partition_prune tests.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
-- Keep the plan shape stable: avoid index-only scans (whose "Heap
-- Fetches" output is unstable) and make a parallel plan as cheap as
-- possible.
set enable_indexonlyscan = off;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set min_parallel_table_scan_size = 0;
set max_parallel_workers_per_gather = 2;

prepare ab_q5 (int, int, int) as
select avg(a) from ab where a in($1,$2,$3) and b < 4;

-- Execute 5 times so the plancache switches to a generic plan; the
-- EXPLAIN below then exercises run-time pruning of the $1..$3 params.
execute ab_q5 (1, 2, 3);
execute ab_q5 (1, 2, 3);
execute ab_q5 (1, 2, 3);
execute ab_q5 (1, 2, 3);
execute ab_q5 (1, 2, 3);

explain (analyze, costs off, summary off, timing off) execute ab_q5 (2, 3, 3);



test.expect
Description: Binary data


first_partial_plan_fix.patch
Description: Binary data
create table ab (a int not null, b int not null) partition by list (a);
create table ab_a2 partition of ab for values in(2) partition by list (b);
create table ab_a2_b1 partition of ab_a2 for values in (1);
create table ab_a2_b2 partition of ab_a2 for values in (2);
create table ab_a2_b3 partition of ab_a2 for values in (3);
create table ab_a1 partition of ab for values in(1) partition by list (b);
create table ab_a1_b1 partition of ab_a1 for values in (1);
create table ab_a1_b2 partition of ab_a1 for values in (2);
create table ab_a1_b3 partition of ab_a1 for values in (3);
create table ab_a3 partition of ab for values in(3) partition by list (b);
create table 

Re: pgsql: Support partition pruning at execution time

2018-04-11 Thread David Rowley
On 10 April 2018 at 08:55, Tom Lane wrote:
> Alvaro Herrera writes:
>> David Rowley wrote:
>>> Okay, I've written and attached a fix for this.  I'm not 100% certain
>>> that this is the cause of the problem on pademelon, but the code does
>>> look wrong, so needs to be fixed. Hopefully, it'll make pademelon
>>> happy, if not I'll think a bit harder about what might be causing that
>>> instability.
>
>> Pushed it just now.  Let's see what happens with pademelon now.
>
> I've had pademelon's host running a "make installcheck" loop all day
> trying to reproduce the problem.  I haven't gotten a bite yet (although
> at 15+ minutes per cycle, this isn't a huge number of tests).  I think
> we were remarkably (un)lucky to see the problem so quickly after the
> initial commit, and I'm afraid pademelon isn't going to help us prove
> much about whether this was the same issue.
>
> This does remind me quite a bit though of the ongoing saga with the
> postgres_fdw test instability.  Given the frequency with which that's
> failing in the buildfarm, you would not think it's impossible to
> reproduce outside the buildfarm, and yet I'm here to tell you that
> it's pretty damn hard.  I haven't succeeded yet, and that's not for
> lack of trying.  Could there be something about the buildfarm
> environment that makes these sorts of things more likely?

coypu just demonstrated that this was not the cause of the problem [1]

I'll study the code a bit more and see if I can think why this might
be happening.

[1] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=coypu&dt=2018-04-11%2004%3A17%3A38&stg=install-check-C

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-09 Thread David Rowley
On 10 April 2018 at 09:58, Alvaro Herrera wrote:
> I then noticed that support for nfiltered3 was incomplete; hence 0001.
> (I then noticed that nfiltered3 was added for MERGE.  It looks wrong to
> me.)
>
> Frankly, I don't like this.  I would rather have an instrument->ntuples2
> rather than these "divide this by nloops, sometimes" schizoid counters.
> This is already being misused by ON CONFLICT (see "other_path" in
> show_modifytable_info).  But it seems like a correct fix would require
> more code.

+1 for a new field for this and making ON CONFLICT use it.

ntuples2 seems fine. If we make it too specific then we'll end up with
lots more than we need.

I don't think re-using the filter counters is a good idea when they're
not counting filtered rows.

MERGE was probably just following the example set by ON CONFLICT.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-09 Thread Alvaro Herrera
Andrew Gierth wrote:
> > "Alvaro" == Alvaro Herrera  writes:
> 
>  Alvaro> Thanks for cleaning that up. I'll look into why the test
>  Alvaro> (without this commit) fails with force_parallel_mode=regress
>  Alvaro> next week.
> 
> Seems clear enough to me - the "Heap Fetches" statistic is kept in the
> IndexOnlyScanState node in its own field, not part of ss.ps.instrument,
> and is therefore not reported from workers to leader.

Right, thanks for the pointer.

So here's a patch that makes things behave as expected.  I noticed that
instrument->nfiltered3 was available, so I used that to keep the
counter.  I wanted to print it using show_instrumentation_count (which
has the nice property that you don't even have to test for es->analyze),
but it doesn't work, because it divides the number by nloops, which is
not what we want in this case.  (It also doesn't print if the counter is
zero, which may be desirable for the other counters but probably not
for this one.)

I then noticed that support for nfiltered3 was incomplete; hence 0001.
(I then noticed that nfiltered3 was added for MERGE.  It looks wrong to
me.)

Frankly, I don't like this.  I would rather have an instrument->ntuples2
rather than these "divide this by nloops, sometimes" schizoid counters.
This is already being misused by ON CONFLICT (see "other_path" in
show_modifytable_info).  But it seems like a correct fix would require
more code.
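As a sketch of that alternative (a minimal C model, not PostgreSQL
source; the counter name follows the proposal above), a dedicated
counter would be summed across workers like the others but reported
raw, with no nloops division:

#include <stdio.h>

/*
 * Minimal model, not PostgreSQL code: the real Instrumentation struct
 * has many more fields.  "ntuples2" (the name proposed here) would be
 * summed across workers like everything else but reported raw, whereas
 * the nfiltered counters are divided by nloops at display time, which
 * is what makes them awkward to reuse for Heap Fetches.
 */
typedef struct Instrumentation
{
    double      nloops;
    double      nfiltered1;     /* displayed as nfiltered1 / nloops */
    double      ntuples2;       /* displayed as-is, e.g. Heap Fetches */
} Instrumentation;

static void
InstrAggNode(Instrumentation *dst, const Instrumentation *add)
{
    dst->nloops += add->nloops;
    dst->nfiltered1 += add->nfiltered1;
    dst->ntuples2 += add->ntuples2; /* summed like any other counter */
}

int
main(void)
{
    Instrumentation leader = {0, 0, 0};
    Instrumentation w1 = {1, 10, 3};
    Instrumentation w2 = {2, 20, 4};

    InstrAggNode(&leader, &w1);
    InstrAggNode(&leader, &w2);

    /* filter counts are averaged per loop; ntuples2 is a raw total */
    printf("Rows Removed by Filter: %.0f\n",
           leader.nfiltered1 / leader.nloops);
    printf("Heap Fetches: %.0f\n", leader.ntuples2);
    return 0;
}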

Anyway, the partition_prune test works correctly now (after reverting
AndrewSN's b47a86f5008f26) in both force_parallel_mode settings.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 10b29c7706efd00279182164de14592643ff7f40 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 9 Apr 2018 18:42:04 -0300
Subject: [PATCH 1/2] print nfiltered3

---
 src/backend/commands/explain.c    | 4 +++-
 src/backend/executor/instrument.c | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 989b6aad67..347fbbe1cf 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -2592,7 +2592,9 @@ show_instrumentation_count(const char *qlabel, int which,
 	if (!es->analyze || !planstate->instrument)
 		return;
 
-	if (which == 2)
+	if (which == 3)
+		nfiltered = planstate->instrument->nfiltered3;
+	else if (which == 2)
 		nfiltered = planstate->instrument->nfiltered2;
 	else
 		nfiltered = planstate->instrument->nfiltered1;
diff --git a/src/backend/executor/instrument.c b/src/backend/executor/instrument.c
index 86252cee1f..d3045f57ac 100644
--- a/src/backend/executor/instrument.c
+++ b/src/backend/executor/instrument.c
@@ -159,6 +159,7 @@ InstrAggNode(Instrumentation *dst, Instrumentation *add)
 	dst->nloops += add->nloops;
 	dst->nfiltered1 += add->nfiltered1;
 	dst->nfiltered2 += add->nfiltered2;
+	dst->nfiltered3 += add->nfiltered3;
 
/* Add delta of buffer usage since entry to node's totals */
if (dst->need_bufusage)
-- 
2.11.0

From acaf8e8af643f05605eab21ad3e5a04a8714f06a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 9 Apr 2018 18:41:20 -0300
Subject: [PATCH 2/2] use nfiltered3 instead of ad-hoc counter

---
 src/backend/commands/explain.c           | 8 ++--
 src/backend/executor/nodeIndexonlyscan.c | 3 +--
 src/include/nodes/execnodes.h            | 1 -
 3 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 347fbbe1cf..434051e7df 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1459,12 +1459,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					show_instrumentation_count("Rows Removed by Filter", 1,
 											   planstate, es);
 				if (es->analyze)
-				{
-					long		heapFetches =
-					((IndexOnlyScanState *) planstate)->ioss_HeapFetches;
-
-					ExplainPropertyInteger("Heap Fetches", NULL, heapFetches, es);
-				}
+					ExplainPropertyInteger("Heap Fetches", NULL,
+										   planstate->instrument->nfiltered3, es);
 				break;
 			case T_BitmapIndexScan:
 				show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index ddc0ae9061..ef835f13c2 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c

Re: pgsql: Support partition pruning at execution time

2018-04-09 Thread Tom Lane
Alvaro Herrera writes:
> David Rowley wrote:
>> Okay, I've written and attached a fix for this.  I'm not 100% certain
>> that this is the cause of the problem on pademelon, but the code does
>> look wrong, so needs to be fixed. Hopefully, it'll make pademelon
>> happy, if not I'll think a bit harder about what might be causing that
>> instability.

> Pushed it just now.  Let's see what happens with pademelon now.

I've had pademelon's host running a "make installcheck" loop all day
trying to reproduce the problem.  I haven't gotten a bite yet (although
at 15+ minutes per cycle, this isn't a huge number of tests).  I think
we were remarkably (un)lucky to see the problem so quickly after the
initial commit, and I'm afraid pademelon isn't going to help us prove
much about whether this was the same issue.

This does remind me quite a bit though of the ongoing saga with the
postgres_fdw test instability.  Given the frequency with which that's
failing in the buildfarm, you would not think it's impossible to
reproduce outside the buildfarm, and yet I'm here to tell you that
it's pretty damn hard.  I haven't succeeded yet, and that's not for
lack of trying.  Could there be something about the buildfarm
environment that makes these sorts of things more likely?

regards, tom lane



Re: pgsql: Support partition pruning at execution time

2018-04-09 Thread Alvaro Herrera
David Rowley wrote:

> Okay, I've written and attached a fix for this.  I'm not 100% certain
> that this is the cause of the problem on pademelon, but the code does
> look wrong, so needs to be fixed. Hopefully, it'll make pademelon
> happy, if not I'll think a bit harder about what might be causing that
> instability.

Pushed it just now.  Let's see what happens with pademelon now.

> The misleading comment claimed we unset the extern params so that we
> didn't perform pruning again using them. I'd failed to update this
> comment after realising that we still need to attempt to prune again
> with the external params, since quals against the partition key may
> contain a mix of exec and external params, in which case partitions
> can only be pruned using both types of params together. No actual
> code needs to be updated since the 2nd pass of pruning uses
> "allparams" anyway. We could actually get away without the bms_free()
> and the set-to-NULL in the lines below the comment, but I wanted some
> way to ensure that we never write any code which calls the function
> twice on the same PartitionPruneState; maybe I'm just overly paranoid
> there.

Pushed this earlier today, thanks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-09 Thread David Rowley
On 9 April 2018 at 15:03, David Rowley wrote:
> On 9 April 2018 at 13:03, David Rowley wrote:
> Okay, I've written and attached a fix for this.  I'm not 100% certain
> that this is the cause of the problem on pademelon, but the code does
> look wrong, so needs to be fixed. Hopefully, it'll make pademelon
> happy, if not I'll think a bit harder about what might be causing that
> instability.

Added to PG11 open items list [1].

[1] https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items#Open_Issues

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-08 Thread David Rowley
On 9 April 2018 at 13:03, David Rowley wrote:
> On 9 April 2018 at 09:54, Tom Lane wrote:
>> BTW, pademelon just exhibited a different instability in this test:
>>
>> *** /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/expected/partition_prune.out	Sun Apr  8 01:56:04 2018
>> --- /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/results/partition_prune.out	Sun Apr  8 17:48:14 2018
>> ***************
>> *** 1606,1612 ****
>>    ->  Partial Aggregate (actual rows=1 loops=3)
>>      ->  Parallel Append (actual rows=0 loops=3)
>>        Subplans Removed: 6
>> !      ->  Parallel Seq Scan on ab_a2_b1 (actual rows=0 loops=1)
>>          Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
>>        ->  Parallel Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
>>          Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
>> --- 1606,1612 ----
>>    ->  Partial Aggregate (actual rows=1 loops=3)
>>      ->  Parallel Append (actual rows=0 loops=3)
>>        Subplans Removed: 6
>> !      ->  Parallel Seq Scan on ab_a2_b1 (actual rows=0 loops=2)
>>          Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
>>        ->  Parallel Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
>>          Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
>>
>
> Reading code it looks like a bug in choose_next_subplan_for_worker():
>
> The following needs to be changed for this patch:
>
> /* Advance to next plan. */
> pstate->pa_next_plan++;
>
> I'll think a bit harder about the best way to fix and submit a patch
> for it later.

Okay, I've written and attached a fix for this.  I'm not 100% certain
that this is the cause of the problem on pademelon, but the code does
look wrong, so needs to be fixed. Hopefully, it'll make pademelon
happy, if not I'll think a bit harder about what might be causing that
instability.

I've also attached a 2nd patch to fix a spelling mistake and a
misleading comment in the code.

The misleading comment claimed we unset the extern params so that we
didn't perform pruning again using them. I'd failed to update this
comment after realising that we still need to attempt to prune again
with the external params, since quals against the partition key may
contain a mix of exec and external params, in which case partitions
can only be pruned using both types of params together. No actual
code needs to be updated since the 2nd pass of pruning uses
"allparams" anyway. We could actually get away without the bms_free()
and the set-to-NULL in the lines below the comment, but I wanted some
way to ensure that we never write any code which calls the function
twice on the same PartitionPruneState; maybe I'm just overly paranoid
there.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


0001-Fix-incorrect-logic-for-choosing-the-next-Parallel-A.patch
Description: Binary data


0002-Fix-misleading-comment-and-typo.patch
Description: Binary data


Re: pgsql: Support partition pruning at execution time

2018-04-08 Thread David Rowley
On 9 April 2018 at 09:54, Tom Lane wrote:
> BTW, pademelon just exhibited a different instability in this test:
>
> *** /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/expected/partition_prune.out	Sun Apr  8 01:56:04 2018
> --- /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/results/partition_prune.out	Sun Apr  8 17:48:14 2018
> ***************
> *** 1606,1612 ****
>    ->  Partial Aggregate (actual rows=1 loops=3)
>      ->  Parallel Append (actual rows=0 loops=3)
>        Subplans Removed: 6
> !      ->  Parallel Seq Scan on ab_a2_b1 (actual rows=0 loops=1)
>          Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
>        ->  Parallel Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
>          Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
> --- 1606,1612 ----
>    ->  Partial Aggregate (actual rows=1 loops=3)
>      ->  Parallel Append (actual rows=0 loops=3)
>        Subplans Removed: 6
> !      ->  Parallel Seq Scan on ab_a2_b1 (actual rows=0 loops=2)
>          Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
>        ->  Parallel Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
>          Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
>

Reading code it looks like a bug in choose_next_subplan_for_worker():

The following needs to be changed for this patch:

/* Advance to next plan. */
pstate->pa_next_plan++;
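To illustrate the suspicion (a toy C sketch with a made-up helper
name, not the eventual patch): once run-time pruning can remove
subplans, advancing the shared cursor has to skip invalid entries
rather than assume every index is live.

#include <stdio.h>

/*
 * Toy model, not the attached patch: with run-time pruning the set of
 * valid subplans can be sparse, so pa_next_plan++ may land on a plan
 * that no longer exists.  A correct advance has to move to the next
 * valid subplan and wrap back to the first partial plan at the end.
 */
static int
next_valid_plan(const int *valid, int nvalid, int cur, int first_partial)
{
    for (int i = 0; i < nvalid; i++)
    {
        if (valid[i] > cur)
            return valid[i];    /* next surviving subplan */
    }
    return first_partial;       /* ran off the end: wrap around */
}

int
main(void)
{
    int     valid[] = {2, 4, 5};    /* subplans surviving pruning */

    /* a bare pa_next_plan++ would pick 3, which was pruned */
    printf("next plan = %d\n", next_valid_plan(valid, 3, 2, 2));
    return 0;
}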

I'll think a bit harder about the best way to fix and submit a patch
for it later.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-08 Thread Tom Lane
Andrew Gierth writes:
> "Alvaro" == Alvaro Herrera writes:
>  Alvaro> Thanks for cleaning that up. I'll look into why the test
>  Alvaro> (without this commit) fails with force_parallel_mode=regress
>  Alvaro> next week.

> Seems clear enough to me - the "Heap Fetches" statistic is kept in the
> IndexOnlyScanState node in its own field, not part of ss.ps.instrument,
> and is therefore not reported from workers to leader.

BTW, pademelon just exhibited a different instability in this test:

*** /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/expected/partition_prune.out	Sun Apr  8 01:56:04 2018
--- /home/bfarm/bf-data/HEAD/pgsql.build/src/test/regress/results/partition_prune.out	Sun Apr  8 17:48:14 2018
***************
*** 1606,1612 ****
   ->  Partial Aggregate (actual rows=1 loops=3)
 ->  Parallel Append (actual rows=0 loops=3)
   Subplans Removed: 6
!  ->  Parallel Seq Scan on ab_a2_b1 (actual rows=0 loops=1)
 Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
   ->  Parallel Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
 Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
--- 1606,1612 ----
   ->  Partial Aggregate (actual rows=1 loops=3)
 ->  Parallel Append (actual rows=0 loops=3)
   Subplans Removed: 6
!  ->  Parallel Seq Scan on ab_a2_b1 (actual rows=0 loops=2)
 Filter: ((a >= $1) AND (a <= $2) AND (b < 4))
   ->  Parallel Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
 Filter: ((a >= $1) AND (a <= $2) AND (b < 4))

==

Dunno quite what to make of that, but this animal previously passed at commit
b47a86f (Sun Apr 8 05:35:42 2018 UTC, "Attempt to stabilize partition_prune test output."),
so it's not a consistent failure.

regards, tom lane



Re: pgsql: Support partition pruning at execution time

2018-04-08 Thread Andrew Gierth
> "Alvaro" == Alvaro Herrera  writes:

 Alvaro> Thanks for cleaning that up. I'll look into why the test
 Alvaro> (without this commit) fails with force_parallel_mode=regress
 Alvaro> next week.

Seems clear enough to me - the "Heap Fetches" statistic is kept in the
IndexOnlyScanState node in its own field, not part of ss.ps.instrument,
and is therefore not reported from workers to leader.
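A reduced C model of that (an illustration only; the real structs are
much larger) shows why the private counter never reaches the leader:

#include <stdio.h>

/*
 * Reduced model, not PostgreSQL code: the parallel machinery copies
 * only the shared Instrumentation data from worker to leader, so a
 * counter kept in the node's private executor state (like
 * ioss_HeapFetches) never makes the trip.
 */
typedef struct Instrumentation
{
    double      nfiltered1;     /* shared: reaches the leader */
} Instrumentation;

typedef struct IndexOnlyScanState
{
    Instrumentation instrument;     /* aggregated across workers */
    long        ioss_HeapFetches;   /* private: silently dropped */
} IndexOnlyScanState;

int
main(void)
{
    IndexOnlyScanState worker = {{0}, 0};
    IndexOnlyScanState leader = {{0}, 0};

    /* the worker records both kinds of counter */
    worker.instrument.nfiltered1 = 7;
    worker.ioss_HeapFetches = 42;

    /* the leader aggregates only the Instrumentation part */
    leader.instrument.nfiltered1 += worker.instrument.nfiltered1;

    printf("leader sees nfiltered1 = %.0f, Heap Fetches = %ld\n",
           leader.instrument.nfiltered1, leader.ioss_HeapFetches);
    return 0;
}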

-- 
Andrew (irc:RhodiumToad)



Re: pgsql: Support partition pruning at execution time

2018-04-08 Thread Alvaro Herrera
Andrew Gierth wrote:
> > "David" == David Rowley  writes:
> 
 David> I've attached my proposed fix for the unstable regression tests.
 David> I removed the vacuums I'd added in the last version and
 David> commented on why we're doing set enable_indexonlyscan = off;
> 
> Looks basically sane - I'll try it out and commit it shortly.

Thanks for cleaning that up.  I'll look into why the test (without this
commit) fails with force_parallel_mode=regress next week.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-07 Thread Andrew Gierth
> "David" == David Rowley  writes:

 David> I've attached my proposed fix for the unstable regression tests.
 David> I removed the vacuums I'd added in the last version and
 David> commented on why we're doing set enable_indexonlyscan = off;

Looks basically sane - I'll try it out and commit it shortly.

-- 
Andrew (irc:RhodiumToad)



Re: pgsql: Support partition pruning at execution time

2018-04-07 Thread David Rowley
On 8 April 2018 at 15:34, Andrew Gierth wrote:
> You can't ever assume that data you just inserted will become
> all-visible just because you just vacuumed the table, unless you know
> that there is NO concurrent activity that might have a snapshot (and no
> other possible reason why OldestXmin might be older than your insert).

Thanks. I got it. It just slipped my slightly paranoid and sleep-deprived mind.

I've attached my proposed fix for the unstable regression tests. I
removed the vacuums I'd added in the last version and commented on why
we're doing set enable_indexonlyscan = off;

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


runtime_pruning_make_tests_stable_v2.patch
Description: Binary data


Re: pgsql: Support partition pruning at execution time

2018-04-07 Thread Andrew Gierth
> "David" == David Rowley  writes:

 >> Setting autovacuum_naptime to 10 seconds makes it occur in 10 second
 >> intervals...

 David> Ok, I thought it might have been some concurrent vacuum on the
 David> table but the only tables I see being vacuumed are system
 David> tables.

It's not vacuum that tends to be the problem, but analyze (on any
table). Lazy-vacuum's snapshots are mostly ignored for computing global
xmin horizons by other vacuums, but analyze's snapshots are not.

 David> I tried performing a manual vacuum of each of these and could
 David> not get it to trigger, but then I did:

 David> select * from pg_class;

 David> from another session and then the script starts spitting out
 David> some errors.

Obviously, because the select holds a snapshot and therefore also holds
back OldestXmin.

You can't ever assume that data you just inserted will become
all-visible just because you just vacuumed the table, unless you know
that there is NO concurrent activity that might have a snapshot (and no
other possible reason why OldestXmin might be older than your insert).
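A toy C model of that horizon rule (an illustration only; wraparound
arithmetic and everything else is omitted):

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model, heavily simplified: VACUUM may mark a page all-visible
 * only if every tuple's inserting transaction precedes OldestXmin,
 * and OldestXmin is held back by the xmin of any concurrent snapshot,
 * including one taken by (auto-)ANALYZE or a SELECT elsewhere.
 */
typedef unsigned int TransactionId;

static TransactionId
compute_oldest_xmin(TransactionId next_xid,
                    const TransactionId *snapshot_xmins, int n)
{
    TransactionId oldest = next_xid;

    for (int i = 0; i < n; i++)
    {
        if (snapshot_xmins[i] < oldest)
            oldest = snapshot_xmins[i];
    }
    return oldest;
}

int
main(void)
{
    TransactionId insert_xid = 100;     /* our freshly inserted rows */
    TransactionId snaps[] = {90};       /* e.g. a concurrent analyze */
    TransactionId oldest = compute_oldest_xmin(101, snaps, 1);
    bool        all_visible = insert_xid < oldest;

    printf("OldestXmin = %u, all-visible after VACUUM: %s\n",
           oldest, all_visible ? "yes" : "no");
    return 0;
}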

-- 
Andrew (irc:RhodiumToad)



Re: pgsql: Support partition pruning at execution time

2018-04-07 Thread David Rowley
On 8 April 2018 at 15:02, David Rowley wrote:
> On 8 April 2018 at 14:56, David Rowley wrote:
>> It happens 12 or 13 times on my machine, then does not happen again
>> for 60 seconds, then happens again.
>
> Setting autovacuum_naptime to 10 seconds makes it occur in 10 second
> intervals...

Ok, I thought it might have been some concurrent vacuum on the table
but the only tables I see being vacuumed are system tables.

I tried performing a manual vacuum of each of these and could not get
it to trigger, but then I did:

select * from pg_class;

from another session and then the script starts spitting out some errors.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-07 Thread Andrew Gierth
> "David" == David Rowley  writes:

 >> It happens 12 or 13 times on my machine, then does not happen again
 >> for 60 seconds, then happens again.

 David> Setting autovacuum_naptime to 10 seconds makes it occur in 10
 David> second intervals...

Analyze (including auto-analyze on a different table entirely) has a
snapshot, which can hold back OldestXmin, hence preventing the
all-visible flag from being set.

-- 
Andrew (irc:RhodiumToad)



Re: pgsql: Support partition pruning at execution time

2018-04-07 Thread David Rowley
On 8 April 2018 at 14:56, David Rowley wrote:
> It happens 12 or 13 times on my machine, then does not happen again
> for 60 seconds, then happens again.

Setting autovacuum_naptime to 10 seconds makes it occur in 10 second
intervals...



-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-07 Thread Alvaro Herrera
Yeah, I don't quite understand this problem, and I tend to agree that
it likely isn't this patch's fault.  However, for the moment I'm going
to avoid pushing the patch you propose because maybe there's a bug
elsewhere and it'd be good to understand it.  I'm looking at it now.

If others would prefer me to push David's patch (or do so themselves),
I'm not dead set against that.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Support partition pruning at execution time

2018-04-07 Thread Tom Lane
Alvaro Herrera writes:
> Support partition pruning at execution time

Buildfarm member lapwing doesn't like this.  I can reproduce the
failures here by setting force_parallel_mode = regress.  Kind
of looks like instrumentation counts aren't getting propagated
from workers back to the leader?

regards, tom lane