Re: [HACKERS] Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

2016-06-19 Thread Masahiko Sawada
On Mon, Jun 20, 2016 at 3:42 PM, Amit Kapila  wrote:
> On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada 
> wrote:
>>
>> Hi all,
>>
>> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
>> fine for parallel seq scan.
>>
>> postgres(1)=# explain analyze verbose select count(*) from
>> pgbench_accounts ;
>> QUERY PLAN
>>
>> -
>>  Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
>> (actual time=2640.015..2640.015 rows=1 loops=1)
>>Output: count(*)
>>->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
>> time=2639.064..2640.002 rows=3 loops=1)
>>  Output: (PARTIAL count(*))
>>  Workers Planned: 2
>>  Workers Launched: 2
>>  ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
>> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>>Output: PARTIAL count(*)
>>Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>>Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>>->  Parallel Seq Scan on public.pgbench_accounts
>> (cost=0.00..205601.67 rows=417 width=0) (actual
>> time=0.042..1685.542 rows=333 loops=3)
>>  Worker 0: actual time=0.033..1657.486 rows=3457968
>> loops=1
>>  Worker 1: actual time=0.039..1702.979 rows=3741069
>> loops=1
>>  Planning time: 1.026 ms
>>  Execution time: 2640.225 ms
>> (15 rows)
>>
>> For example, the above result shows,
>> Parallel Seq Scan : actual rows = 333
>> worker 0   : actual rows = 3457968
>> worker 1   : actual rows = 3741069
>> Summation of these is 10532370, but actual total rows is 1000.
>> I think that Parallel Seq Scan should show actual rows =
>> 1000(total rows) or actual rows = 2800963(rows collected by
>> itself). (1000 maybe better)
>>
>
> You have to read the rows at Parallel Seq Scan nodes as total count of rows,
> but you have to consider the loops parameter as well.
>

In following case, it look to me that no one collect the tuple.
But it's obviously incorrect, this query collects a tuple(aid = 10) actually.

postgres(1)=# explain analyze verbose select * from pgbench_accounts
where aid = 10;
 QUERY PLAN

 Gather  (cost=1000.00..217018.43 rows=1 width=97) (actual
time=0.541..2094.773 rows=1 loops=1)
   Output: aid, bid, abalance, filler
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..216018.34 rows=0 width=97) (actual time=1390.109..2088.103
rows=0 loops=3)
 Output: aid, bid, abalance, filler
 Filter: (pgbench_accounts.aid = 10)
 Rows Removed by Filter: 333
 Worker 0: actual time=2082.681..2082.681 rows=0 loops=1
 Worker 1: actual time=2087.532..2087.532 rows=0 loops=1
 Planning time: 0.126 ms
 Execution time: 2095.564 ms
(12 rows)

How can we consider actual rows and nloops?

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

2016-06-19 Thread Amit Kapila
On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada 
wrote:
>
> Hi all,
>
> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
> fine for parallel seq scan.
>
> postgres(1)=# explain analyze verbose select count(*) from
pgbench_accounts ;
> QUERY PLAN
>
-
>  Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
> (actual time=2640.015..2640.015 rows=1 loops=1)
>Output: count(*)
>->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
> time=2639.064..2640.002 rows=3 loops=1)
>  Output: (PARTIAL count(*))
>  Workers Planned: 2
>  Workers Launched: 2
>  ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>Output: PARTIAL count(*)
>Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>->  Parallel Seq Scan on public.pgbench_accounts
> (cost=0.00..205601.67 rows=417 width=0) (actual
> time=0.042..1685.542 rows=333 loops=3)
>  Worker 0: actual time=0.033..1657.486 rows=3457968
loops=1
>  Worker 1: actual time=0.039..1702.979 rows=3741069
loops=1
>  Planning time: 1.026 ms
>  Execution time: 2640.225 ms
> (15 rows)
>
> For example, the above result shows,
> Parallel Seq Scan : actual rows = 333
> worker 0   : actual rows = 3457968
> worker 1   : actual rows = 3741069
> Summation of these is 10532370, but actual total rows is 1000.
> I think that Parallel Seq Scan should show actual rows =
> 1000(total rows) or actual rows = 2800963(rows collected by
> itself). (1000 maybe better)
>

You have to read the rows at Parallel Seq Scan nodes as total count of
rows, but you have to consider the loops parameter as well.

>
> After spent time to investigate this behaviour, ISTM that the problem
> is nloops of Parallel Seq Scan.
> Parallel Seq Scan is done only once, but nloops is incremented to 3.
>

nloops here indicates, that it is done for 2 workers and a master backend.

> So its "actual rows" is calculated 333(1000 / 3) at
explain.c:L1223.
>

Thats how it should be considered.  You might want to compare the behaviour
with other cases where value of nloops is used.

> Is it a bug?
>

I don't think so.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

2016-06-19 Thread Masahiko Sawada
Hi all,

My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
fine for parallel seq scan.

postgres(1)=# explain analyze verbose select count(*) from pgbench_accounts ;
QUERY PLAN
-
 Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
(actual time=2640.015..2640.015 rows=1 loops=1)
   Output: count(*)
   ->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
time=2639.064..2640.002 rows=3 loops=1)
 Output: (PARTIAL count(*))
 Workers Planned: 2
 Workers Launched: 2
 ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
   Output: PARTIAL count(*)
   Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
   Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
   ->  Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..205601.67 rows=417 width=0) (actual
time=0.042..1685.542 rows=333 loops=3)
 Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1
 Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1
 Planning time: 1.026 ms
 Execution time: 2640.225 ms
(15 rows)

For example, the above result shows,
Parallel Seq Scan : actual rows = 333
worker 0   : actual rows = 3457968
worker 1   : actual rows = 3741069
Summation of these is 10532370, but actual total rows is 1000.
I think that Parallel Seq Scan should show actual rows =
1000(total rows) or actual rows = 2800963(rows collected by
itself). (1000 maybe better)

After spent time to investigate this behaviour, ISTM that the problem
is nloops of Parallel Seq Scan.
Parallel Seq Scan is done only once, but nloops is incremented to 3.
So its "actual rows" is calculated 333(1000 / 3) at explain.c:L1223.
Is it a bug?

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE))

2016-06-19 Thread Thomas Munro
On Mon, Jun 20, 2016 at 3:43 PM, Craig Ringer  wrote:
> On 18 June 2016 at 11:28, Thomas Munro 
> wrote:
>> Several times now when reading, debugging and writing code I've wished
>> that LWLockHeldByMe assertions specified the expected mode, especially
>> where exclusive locking is required.
>>
>> What do you think about something like the attached?  See also an
>> example of use.  I will add this to the next commitfest.
>
> I've wanted this before too, and was surprised it wasn't present. TBH I
> assumed there was a technical reason it wasn't and didn't investigate
> further because I just assumed it'd have been added with the original
> LWLockHeldByMe if it were simple.

Before ab5194e6f (25 December 2014) held_lwlocks didn't record the mode.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory

2016-06-19 Thread Michael Paquier
On Mon, Jun 20, 2016 at 12:40 PM, Craig Ringer  wrote:
> On 18 June 2016 at 02:42, Robert Haas  wrote:
>>
>> On Fri, Jun 17, 2016 at 2:23 PM, Aleksey Demakov 
>> wrote:
>> > Essentially this is pessimizing for the lowest common denominator
>> > among OSes.
>>
>> I totally agree.  That's how we make the server portable.
>>
>> > Having a contiguous address space makes things so
>> > much simpler that considering this case, IMHO, is well worth of it.
>>
>> I think that would be great if you could make it work, but it has to
>> support Linux, Windows (all supported versions), MacOS X, all the
>> various BSD flavors for which we have buildfarm animals, and other
>> platforms that we currently run on like HP-UX.   If you come up with a
>> solution that works for this on all of those platforms, I will shake
>> your hand.  But I think that's probably impossible, or at least
>> really, really hard.
>
>
> Indeed. In particular, ASLR on Windows or anywhere we EXEC_BACKEND will
> cause difficulties attaching to those segments.

ASLR that we currently disable in the build because Win8/2k12 and
newer versions behaves differently than past OSes in the address
mapping, making the problem even harder if we'd want to have both
working.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE))

2016-06-19 Thread Craig Ringer
On 18 June 2016 at 11:28, Thomas Munro 
wrote:

> Hi hackers,
>
> Several times now when reading, debugging and writing code I've wished
> that LWLockHeldByMe assertions specified the expected mode, especially
> where exclusive locking is required.
>
> What do you think about something like the attached?  See also an
> example of use.  I will add this to the next commitfest.


I've wanted this before too, and was surprised it wasn't present. TBH I
assumed there was a technical reason it wasn't and didn't investigate
further because I just assumed it'd have been added with the original
LWLockHeldByMe if it were simple.




-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory

2016-06-19 Thread Craig Ringer
On 18 June 2016 at 02:42, Robert Haas  wrote:

> On Fri, Jun 17, 2016 at 2:23 PM, Aleksey Demakov 
> wrote:
> > Essentially this is pessimizing for the lowest common denominator
> > among OSes.
>
> I totally agree.  That's how we make the server portable.
>
> > Having a contiguous address space makes things so
> > much simpler that considering this case, IMHO, is well worth of it.
>
> I think that would be great if you could make it work, but it has to
> support Linux, Windows (all supported versions), MacOS X, all the
> various BSD flavors for which we have buildfarm animals, and other
> platforms that we currently run on like HP-UX.   If you come up with a
> solution that works for this on all of those platforms, I will shake
> your hand.  But I think that's probably impossible, or at least
> really, really hard.
>

Indeed. In particular, ASLR on Windows or anywhere we EXEC_BACKEND will
cause difficuties attaching to those segments.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-19 Thread Tom Lane
Robert Haas  writes:
> On Sun, Jun 19, 2016 at 5:22 PM, Peter Eisentraut
>  wrote:
>> Well, the purpose of the test is to check the error passing between worker
>> and leader.  If we just silently revert to not doing that, then we can't
>> really be sure that we're testing the right thing.

> I've been thinking about renaming the "single_copy" flag for Gather
> nodes to something like "pipeline" or some other term that might hope
> to convey that the leader won't participate unless no workers can be
> launched at all, and then adding an option to force that to be used
> always.  That would allow better testing here,

How so?  You still have the fact that functions will run in the leader
when no workers are available, and the gold-plated certainty that that
case will arise routinely in the buildfarm.  Perhaps this would move
the probability of running in a worker from 90% to 95%, but I don't
think that'll make much difference from a testing standpoint.  Having
said that ...

> although I fear we
> might be getting to a level of tinkering with parallel query that
> starts to look more like feature development.

Personally, I'm +1 for such tinkering if it makes the feature either more
controllable or more understandable.  After reading the comments at the
head of nodeGather.c, though, I don't think that single_copy is either
understandable or useful, and merely renaming it won't help.  Apparently,
it runs code in the worker, except when it doesn't, and even when it does,
it's absolutely guaranteed to be a performance loss because the leader is
doing nothing.  What in the world is the point?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety tagging of extension functions

2016-06-19 Thread Tom Lane
Robert Haas  writes:
> On Sun, Jun 19, 2016 at 4:21 PM, Andreas Karlsson  wrote:
>> Here they are. Shelve them if you like. They are some of the least important
>> extensions that still should be fixed so they can end up in 10 if you do not
>> find the time.

> Thanks.  I think it's too late to try to push anything else into
> beta2, with less than 24 hours to go before wrap.  I'll just leave
> these for 10.0 unless I hear several votes for pushing them into 9.6
> post-beta2.

I agree with not pushing them right now, but I think it would be
sensible to push them post-beta2.  There's already one certain
reason for a forced initdb after beta2 [1], so leaving these for
next cycle seems kinda pointless.

regards, tom lane

[1] https://www.postgresql.org/message-id/30448.1466190...@sss.pgh.pa.us


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-19 Thread Robert Haas
On Sun, Jun 19, 2016 at 5:22 PM, Peter Eisentraut
 wrote:
> Well, the purpose of the test is to check the error passing between worker
> and leader.  If we just silently revert to not doing that, then we can't
> really be sure that we're testing the right thing.  We've already seen some
> instances in this thread where we figured out after some debugging that some
> construct is not actually going through the parallel infrastructure, so I'm
> afraid if we leave it like this it might silently change behavior at some
> point in the future.

I've been thinking about renaming the "single_copy" flag for Gather
nodes to something like "pipeline" or some other term that might hope
to convey that the leader won't participate unless no workers can be
launched at all, and then adding an option to force that to be used
always.  That would allow better testing here, although I fear we
might be getting to a level of tinkering with parallel query that
starts to look more like feature development.

> Independent of that, it would help testing things like this if we allowed
> setting max_worker_processes to 0, instead of the current minimum 1.  If
> there a reason for the minimum of 1?

I believe that's pure brain fade on my part.  I think the minimum should be 0.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety tagging of extension functions

2016-06-19 Thread Robert Haas
On Sun, Jun 19, 2016 at 4:21 PM, Andreas Karlsson  wrote:
> On 06/17/2016 09:11 PM, Robert Haas wrote:
>> I was kind of hoping you'd have a new version of this posted already.
>> beta2 is wrapping on Monday, and I'm inclined to shelve anything that
>> isn't done by then for the next release.  And I don't really plan to
>> work much this weekend.
>
> Here they are. Shelve them if you like. They are some of the least important
> extensions that still should be fixed so they can end up in 10 if you do not
> find the time.

Thanks.  I think it's too late to try to push anything else into
beta2, with less than 24 hours to go before wrap.  I'll just leave
these for 10.0 unless I hear several votes for pushing them into 9.6
post-beta2.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-19 Thread Tom Lane
Peter Eisentraut  writes:
> On 6/19/16 3:09 PM, Robert Haas wrote:
>> On Sun, Jun 19, 2016 at 11:36 AM, Tom Lane  wrote:
>>> No, it *might* execute in a worker.  If you can get one.

>> Right.

> Well, the purpose of the test is to check the error passing between 
> worker and leader.  If we just silently revert to not doing that, then 
> we can't really be sure that we're testing the right thing.

I would guess that 90 to 95 times out of 100, that test *will* exercise
what it's supposed to, and that's enough to make it useful.  But we can't
assume that it'll happen 100% of the time.

> We've 
> already seen some instances in this thread where we figured out after 
> some debugging that some construct is not actually going through the 
> parallel infrastructure, so I'm afraid if we leave it like this it might 
> silently change behavior at some point in the future.

Agreed, if the percentage suddenly fell to 0 we wouldn't know it, and
that's concerning.  But wishing it were 100% doesn't make it so.

Depending on what the percentage actually is, maybe we could treat
this like the "random" test, and allow a failure to be disregarded
overall?  But that doesn't seem very nice either, in view of our
increasing reliance on automated testing.  If "random" were failing
90% of the time on some buildfarm critters, that would probably
indicate a real problem, but we'd likely not realize it for a long time.

> Independent of that, it would help testing things like this if we 
> allowed setting max_worker_processes to 0, instead of the current 
> minimum 1.  If there a reason for the minimum of 1?

Good question.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-19 Thread Peter Eisentraut

On 6/19/16 3:09 PM, Robert Haas wrote:

On Sun, Jun 19, 2016 at 11:36 AM, Tom Lane  wrote:

Amit Kapila  writes:

On Sun, Jun 19, 2016 at 10:10 AM, Tom Lane  wrote:

Would this not result in unstable test output depending on whether the
code executes in the leader or a worker?



Before doing that test, we set force_parallel_mode=1, so it should always
execute in worker which will ensure a stable output.


No, it *might* execute in a worker.  If you can get one.


Right.


Well, the purpose of the test is to check the error passing between 
worker and leader.  If we just silently revert to not doing that, then 
we can't really be sure that we're testing the right thing.  We've 
already seen some instances in this thread where we figured out after 
some debugging that some construct is not actually going through the 
parallel infrastructure, so I'm afraid if we leave it like this it might 
silently change behavior at some point in the future.


Independent of that, it would help testing things like this if we 
allowed setting max_worker_processes to 0, instead of the current 
minimum 1.  If there a reason for the minimum of 1?


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety tagging of extension functions

2016-06-19 Thread Andreas Karlsson

On 06/17/2016 09:11 PM, Robert Haas wrote:

I was kind of hoping you'd have a new version of this posted already.
beta2 is wrapping on Monday, and I'm inclined to shelve anything that
isn't done by then for the next release.  And I don't really plan to
work much this weekend.


Here they are. Shelve them if you like. They are some of the least 
important extensions that still should be fixed so they can end up in 10 
if you do not find the time.


Andreas



parallel-contrib-v6-pg_visibility.patch.gz
Description: application/gzip


parallel-contrib-v6-tablefunc.patch.gz
Description: application/gzip


parallel-contrib-v6-tsearch2.patch.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-19 Thread Robert Haas
On Sun, Jun 19, 2016 at 11:36 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>> On Sun, Jun 19, 2016 at 10:10 AM, Tom Lane  wrote:
>>> Would this not result in unstable test output depending on whether the
>>> code executes in the leader or a worker?
>
>> Before doing that test, we set force_parallel_mode=1, so it should always
>> execute in worker which will ensure a stable output.
>
> No, it *might* execute in a worker.  If you can get one.

Right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #14199: The pg_ctl status check on server start is not compatible with the silent_mode=on

2016-06-19 Thread Tom Lane
Michael Paquier  writes:
> On Sun, Jun 19, 2016 at 12:51 AM, Tom Lane  wrote:
>> What seems like a more conservative answer to me is to revert c869a7d5b
>> in 9.1 only, and address the buildfarm stability issue it sought to
>> resolve by increasing the fixed timeout from 5 seconds to, say, 10.

> +1 for doing that. Knowing that silent_mode will be out of community
> support scope in a couple of months, that's the right answer to this
> bug call.

Done that way.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-19 Thread Tom Lane
Amit Kapila  writes:
> On Sun, Jun 19, 2016 at 10:10 AM, Tom Lane  wrote:
>> Would this not result in unstable test output depending on whether the
>> code executes in the leader or a worker?

> Before doing that test, we set force_parallel_mode=1, so it should always
> execute in worker which will ensure a stable output.

No, it *might* execute in a worker.  If you can get one.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-19 Thread Michael Paquier
On Sun, Jun 19, 2016 at 7:38 PM, Vik Fearing  wrote:
> On 19/06/16 12:28, Michael Paquier wrote:
>> On Sun, Jun 19, 2016 at 5:56 PM, Michael Paquier
>> Or in short the attached.
>
> The code looks good to me but why no documentation?

Because that's a long day... Thanks! Now you have it.
-- 
Michael


wal-receiver-conninfo-v2.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Whether to back-patch fix for aggregate transtype width estimates

2016-06-19 Thread Tomas Vondra

On 06/18/2016 06:14 PM, Tom Lane wrote:

Robert Haas  writes:

On Fri, Jun 17, 2016 at 10:41 PM, Tom Lane  wrote:

Ordinarily I'd just summarily back-patch a fix, but that commit shipped
in 9.0, which means it's been broken a long time.  I'm worried that
back-patching a change might be more likely to destabilize plan choices
than to do anything anybody's happy about.



I suspect the consequences here aren't too bad, or someone would have
noticed by now.  So I would be tempted to leave it alone in
back-branches.  But I might change my mind if it's actually awful...


Well, you can construct scenarios where it would cause failures.
Consider "SELECT max(varchar_col) FROM tab GROUP BY foo".  The planner
will need to estimate the size of the hash table to decide whether
hash-style aggregation is OK.  In all 8.x releases, it would use the
varchar_col's typmod (max width) to determine the per-aggregate trans
value space requirement.  In 9.x, that's broken and it falls back to
get_typavgwidth's default guess of 32 bytes.  If what you've actually
got is, say, varchar(255) and most of the entries actually approach
that length, this could result in a drastic underestimate, possibly
leading to OOM from hash table growth.

However, I can't recall many field reports that seem to match that
theory, so in practice it's probably pretty rare.  It's certainly not
going to help people who declare their wide columns as "text"
not "varchar(n)".


All the HashAgg + OOM reports I can recall (both from community or 
through support) were caused by poor cardinality estimates, i.e. not 
related to this at all. The only exception was the array_agg() thing we 
fixed a while ago, and that was primarily due to using per-group memory 
contexts. So also unrelated to this.


So while I'm a fan of improving our planning, I'd lean towards not 
back-patching this particular bit.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] BUG #14199: The pg_ctl status check on server start is not compatible with the silent_mode=on

2016-06-19 Thread Michael Paquier
On Sun, Jun 19, 2016 at 12:51 AM, Tom Lane  wrote:
> What seems like a more conservative answer to me is to revert c869a7d5b
> in 9.1 only, and address the buildfarm stability issue it sought to
> resolve by increasing the fixed timeout from 5 seconds to, say, 10.

+1 for doing that. Knowing that silent_mode will be out of community
support scope in a couple of months, that's the right answer to this
bug call.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New design for FK-based join selectivity estimation

2016-06-19 Thread Tomas Vondra

Hi

On 06/18/2016 06:52 PM, Tom Lane wrote:

Tomas Vondra  writes:

A few more comments, about re-reading the patch more thoroughly. I
wouldn't say any of those qualify as bugs, but rather as discussion
about some of the design choices:



1) NULL handling



I'd argue that we should do something about this, although I agree it's
non-trivial to estimate - at least until we get some sort of correlation
stats (e.g. my patch already provides most of the pieces, I believe).


I concur, actually, but I feel that it's out of scope for this
particular patch, which is only trying to replace the functionality
that was committed previously. If you want to come up with a patch on
top of this that adds some accounting for NULLs, I'd be willing to
consider it as a post-beta2 improvement.


Sure, fair enough. By post-beta2 you mean for 9.7, or still for 9.6?




But I'd argue that in the case of multi-column foreign keys we can do
better even without it - my experience is that in such cases either all
values are NULL or none of them, and a single NULL value breaks the FK
of course. So I think max(null_frac) would work.


Yeah, I was thinking along the same lines: max of the per-column null
fractions is probably an OK estimate.


OK




3) ForeignKeyOptInfo->rinfos as a List
Can we actually get a list of matching RestrictInfos for a single
foreign key? I've been unable to construct such query.


I think you'd actually have to write redundant outer join quals,
along the lines of
  select ... a left join b on (a.x = b.y and a.x = b.y)
I don't believe we take the trouble to eliminate such duplicates
unless they get absorbed by an EC, which outer-join quals would
not be.  (Haven't tried this, though, as I don't have the patch
installed right now.)


OK. Let's look into this post-beta2 then.



The beta2 deadline is just about upon us; I feel that if we're going
to get this into this release at all, we need to push it today so
that we get a full buildfarm cycle on it before the wrap.

I plan to spend an hour or two adjusting the qual match logic as
discussed above, and re-reading the whole patch another time for
sanity.  If I've not heard objections by the time I'm done,
I will push it.



Thanks!

If I could wish one more thing - could you briefly explain why you 
rewrote the patch the way you did? I'd like to learn from this and while 
I think I kinda understand most of the changes, I'm not really sure I 
got it right.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-19 Thread Vik Fearing
On 19/06/16 12:28, Michael Paquier wrote:
> On Sun, Jun 19, 2016 at 5:56 PM, Michael Paquier
>  wrote:
>> The new pg_stat_wal_receiver does not include primary_conninfo.
>> Looking at that now, it looks almost stupid not to include it...
>> Adding it now would require a catalog bump, so I am not sure if this
>> is acceptable at this stage for 9.6...

This definitely needs to go in, we get regular requests for it.  The
last one I've seen being
https://www.postgresql.org/message-id/CAN1xZseiqNRh1ZE0seVk7UuHeWvFBEWG%2BFcW7Xfm_Nv%3Dd%2BfPGw%40mail.gmail.com

Whether it goes into 9.6 or 10 is not for me to say.

> Or in short the attached.

The code looks good to me but why no documentation?
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-19 Thread Michael Paquier
On Sun, Jun 19, 2016 at 5:56 PM, Michael Paquier
 wrote:
> The new pg_stat_wal_receiver does not include primary_conninfo.
> Looking at that now, it looks almost stupid not to include it...
> Adding it now would require a catalog bump, so I am not sure if this
> is acceptable at this stage for 9.6...

Or in short the attached.
-- 
Michael


wal-receiver-conninfo.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-19 Thread Michael Paquier
Hi all,

The new pg_stat_wal_receiver does not include primary_conninfo.
Looking at that now, it looks almost stupid not to include it...
Adding it now would require a catalog bump, so I am not sure if this
is acceptable at this stage for 9.6...

Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers