Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Amit Kapila
On Tue, Jun 21, 2016 at 9:21 AM, Andres Freund  wrote:
>
> On 2016-06-21 08:59:13 +0530, Amit Kapila wrote:
> > Can we consider using some strategy to avoid deadlocks without
> > releasing the lock on the old page?  Consider a mechanism such that
> > RelationGetBufferForTuple() always returns a new buffer whose target
> > block is greater than the old block (on which we already hold a
> > lock).  I think the tricky part is whether we can get anything like
> > that from the FSM.  Also, there could be cases where we need to
> > extend the heap even though pages with free space were available,
> > because their block numbers were smaller than the block on which we
> > hold the lock.
>
> I can't see that being acceptable, from a space-usage POV.
>
> > > So far the best idea I have - and it's really not a good one - is to
> > > invent a new hint-bit that tells concurrent updates to acquire a
> > > heavyweight tuple lock, while releasing the page-level lock. If that
> > > hint bit does not require any other modifications - and we don't
> > > need an xid in xmax for this use case - that'll avoid doing all the
> > > other `already_marked` stuff early, which should address the
> > > correctness issue.
> > >
> >
> > Don't we need to clear such a flag in case of error?  Also, don't we
> > need to reset it later, such as when modifying the old page before
> > WAL-logging?
>
> If the flag just says "acquire a heavyweight lock", then there's no need
> for that. That's cheap enough to just do if it's erroneously set.  At
> least I can't see any reason.
>

I think it will just increase the chances that other backends have to
acquire a heavyweight lock.

> > >  It's kinda invasive though, and probably has performance
> > > implications.
> > >
> >
> > Do you see a performance implication due to requiring a heavyweight
> > tuple lock in more cases than now, or something else?
>
> Because of that, yes.
>
>
> > Some other ways could be:
> >
> > Before releasing the lock on the buffer containing the old tuple,
> > clear the VM and visibility info from the page and WAL-log it.  I
> > think this could impact performance depending on how frequently we
> > need to perform this action.
>
> Doubling the number of xlog inserts in heap_update would certainly be
> measurable :(. My guess is that the heavyweight tuple lock approach will
> be less expensive.
>

Probably, but I think the heavyweight tuple lock is more invasive.
Increasing the number of xlog inserts could certainly impact performance,
depending on how frequently we need to do it.  We might want to combine
it with the idea of having RelationGetBufferForTuple() return a
higher-numbered block, such that if we don't find a higher block number
in the FSM, we release the lock on the old page and then take the locks
on the old and new buffers as we do now.  This would further reduce the
number of extra xlog insert calls and address the space-wastage issue.
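To make the deadlock-avoidance rule concrete, here is a minimal,
self-contained C sketch (not PostgreSQL code; lock_order is a
hypothetical helper) of the invariant both variants rely on: page-level
locks are always taken in ascending block-number order, so if the new
buffer's block number is already greater than the old one's, the old
lock never has to be released.

```c
#include <assert.h>

typedef unsigned int BlockNumber;

/*
 * Decide the order in which two heap pages must be locked.  Taking
 * page locks in ascending block-number order means two concurrent
 * updaters can never each hold the lock the other wants first, which
 * is the deadlock-avoidance rule discussed above.
 */
static void
lock_order(BlockNumber oldblk, BlockNumber newblk,
           BlockNumber *first, BlockNumber *second)
{
    if (oldblk < newblk)
    {
        *first = oldblk;        /* old lock can simply be kept */
        *second = newblk;
    }
    else
    {
        *first = newblk;        /* must release and re-lock ascending */
        *second = oldblk;
    }
}
```

Only the else branch forces releasing the old page's lock, which is why
an FSM that could hand back a higher-numbered block would let us keep it.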


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Amit Kapila
On Tue, Jun 21, 2016 at 9:16 AM, Thomas Munro wrote:
>
> On Tue, Jun 21, 2016 at 3:29 PM, Amit Kapila wrote:
> > Some other ways could be:
> >
> > Before releasing the lock on the buffer containing the old tuple,
> > clear the VM and visibility info from the page and WAL-log it.  I
> > think this could impact performance depending on how frequently we
> > need to perform this action.
> >
> > Have a new flag like HEAP_XMAX_UNLOGGED (as existed when this logic
> > was introduced in commit f2bfe8a24c46133f81e188653a127f939eb33c4a),
> > set it in the old tuple header before releasing the lock on the
> > buffer, and teach tqual.c to honor the flag.  I think tqual.c should
> > consider HEAP_XMAX_UNLOGGED as an indication of an aborted
> > transaction unless it is currently in progress.  Also, I think we
> > need to clear this flag before WAL-logging in heap_update.
>
> I also noticed that and wondered whether it was a mistake to take that
> out.  It appears to have been removed as part of the logic to clear
> away UNDO log support in 11919160, but it may have been an important
> part of the heap_update protocol.  Though (as I mentioned nearby in a
> reply to Noah) I'm not sure whether the tqual.c side, which would
> ignore the unlogged xmax in the event of a badly timed crash, was ever
> implemented.
>

Right, my observation is similar to yours, and that's what I am
suggesting as one alternative to fix this issue.  I think making this
approach work (even if it doesn't have any problems) might turn out to
be tricky.  However, the plus point of this approach seems to be that it
shouldn't impact performance in most cases.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Amit Kapila
On Tue, Jun 21, 2016 at 9:08 AM, Thomas Munro wrote:
>
> On Tue, Jun 21, 2016 at 3:29 PM, Amit Kapila wrote:
> > On Tue, Jun 21, 2016 at 1:03 AM, Andres Freund wrote:
> >> Well, I think generally nobody seriously looked at actually
> >> refactoring heap_update(), even though that'd be a good idea.  But in
> >> this instance, the problem seems relatively fundamental:
> >>
> >> We need to lock the origin page, to do visibility checks, etc. Then we
> >> need to figure out the target page. Even disregarding toasting - which
> >> we could be doing earlier with some refactoring - we're going to have
> >> to release the page level lock, to lock them in ascending order.
> >> Otherwise we'll risk kinda likely deadlocks.
> >
> > Can we consider using some strategy to avoid deadlocks without
> > releasing the lock on the old page?  Consider a mechanism such that
> > RelationGetBufferForTuple() always returns a new buffer whose target
> > block is greater than the old block (on which we already hold a
> > lock).  I think the tricky part is whether we can get anything like
> > that from the FSM.  Also, there could be cases where we need to
> > extend the heap even though pages with free space were available,
> > because their block numbers were smaller than the block on which we
> > hold the lock.
>
> Doesn't that mean that over time, given a workload that does only or
> mostly updates, your records tend to migrate further and further away
> from the start of the file, leaving a growing unusable space at the
> beginning, until you eventually need to CLUSTER/VACUUM FULL?
>

For update-mostly loads, an updated tuple should ideally fit in the same
page as the old tuple in many cases if fillfactor is properly
configured.  Why would records always migrate further away?  They should
reuse the space freed by other updates in intermediate pages.  I think
there could be some space-wise impact, but freed-up space will
eventually be used.

> I was wondering about speculatively asking for a free page with a
> lower block number than the origin page, if one is available, before
> locking the origin page.

Do you want to lock it as well?  In any case, I think acquiring a page
before deciding whether the update can be accommodated in the current
page could prove costly.

>  Then after locking the origin page, if it
> turns out you need a page but didn't get it earlier, asking for a free
> page with a higher block number than the origin page.
>

Something like that might work out, if it is feasible and people agree
on pursuing such an approach.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Missing checks when malloc returns NULL...

2016-06-20 Thread Michael Paquier
Hi all,

While auditing the code, I was surprised to find a couple of code paths
that do nothing to handle malloc returning NULL:
- pg_regress and isolationtester use malloc extensively; on failure they
would simply crash. I think it matters for buildfarm members under
memory pressure not to do so, so those should use pg_malloc instead.
- refint.c makes use of malloc to store plans in the top memory context.
That's clearly a buggy concept... This code would need to be reworked
more extensively than in the patch I attach.
- pg_dlsym for darwin uses malloc, but would crash on failure.
- ps_status.c does nothing when malloc() fails.
- sprompt.c uses malloc once, and would crash on failure.
- mcxt.c does this, which is surprising:
@@ -704,7 +704,8 @@ MemoryContextCreate(NodeTag tag, Size size,
{
/* Special case for startup: use good ol' malloc */
node = (MemoryContext) malloc(needed);
-   Assert(node != NULL);
+   if (node == NULL)
+   elog(PANIC, "out of memory");
}
I think that a PANIC is cleaner here instead of a simple crash.
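For the frontend programs, the fix amounts to a checked allocator.  A
minimal sketch in the spirit of pg_malloc (this is an illustration, not
the actual pg_malloc implementation):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

/*
 * Minimal sketch of a checked allocator in the style of pg_malloc:
 * on allocation failure, report the error and exit cleanly instead of
 * letting the caller dereference NULL and crash.
 */
static void *
xmalloc(size_t size)
{
    void *p;

    /* malloc(0) may legitimately return NULL; normalize the request */
    if (size == 0)
        size = 1;

    p = malloc(size);
    if (p == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}
```

Callers then never need a NULL check, which is exactly what makes the
current bare-malloc call sites silently unsafe under memory pressure.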

So attached is a patch aimed at improving things. Thoughts?
-- 
Michael


malloc-nulls.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Amit Kapila
On Tue, Jun 21, 2016 at 1:24 AM, Robert Haas  wrote:
>
> On Mon, Jun 20, 2016 at 1:13 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Sun, Jun 19, 2016 at 10:23 PM, Tom Lane  wrote:
> >>> Personally, I'm +1 for such tinkering if it makes the feature either
> >>> more controllable or more understandable.  After reading the comments
> >>> at the head of nodeGather.c, though, I don't think that single_copy is
> >>> either understandable or useful, and merely renaming it won't help.
> >>> Apparently, it runs code in the worker, except when it doesn't, and
> >>> even when it does, it's absolutely guaranteed to be a performance loss
> >>> because the leader is doing nothing.  What in the world is the point?
> >
> >> The single_copy flag allows a Gather node to have a child plan which
> >> is not intrinsically parallel.
> >
> > OK, but I'm very dubious of your claim that this has any use except for
> > testing purposes.  It certainly has no such use today.  Therefore, the
> > behavior of falling back to running in the leader seems like an
> > anti-feature to me.  If we want to test running in a worker, then we
> > want to test that, not maybe test it.
> >
> > I propose that the behavior we actually want here is to execute in
> > a worker, full stop.  If we can't get one, wait till we can.  If we
> > can't get one because no slots have been allocated at all, fail.
> > That would make the behavior deterministic enough for the regression
> > tests to rely on it.
>

+1.

> I agree that for force_parallel_mode testing, that behavior would be
> useful.
>
> I am also pretty confident we're going to want the behavior where the
> leader runs the plan if, and only if, no workers can be obtained for
> other purposes.  However, I agree that's not essential today.
>
> > And while I don't know what this mode should be called, I'm pretty sure
> > that neither "single_copy" nor "pipeline" are useful descriptions.
>
> Maybe we should make this an enum rather than a boolean, since there
> seem to be more than two useful behaviors.
>

How about calling it control_parallel/control_parallelism or
override_parallel/override_parallelism?
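If the boolean were widened into an enum as Robert suggests, one
possible shape covering the three behaviors discussed in this thread
might look like the following (the names here are invented for
illustration only, not a committed design):

```c
#include <assert.h>

/*
 * Hypothetical replacement for Gather's single_copy boolean.  Each
 * value corresponds to one of the behaviors discussed in the thread;
 * the identifiers are illustrative, not from the PostgreSQL source.
 */
typedef enum GatherMode
{
    GATHER_PARALLEL,        /* normal: leader plus any workers obtained */
    GATHER_WORKER_ONLY,     /* testing: require a worker; wait or fail */
    GATHER_LEADER_FALLBACK  /* run in leader only if no worker available */
} GatherMode;
```

An enum keeps the door open for Robert's "leader runs the plan only if
no workers can be obtained" behavior without another boolean later.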

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Andres Freund
On 2016-06-21 08:59:13 +0530, Amit Kapila wrote:
> Can we consider using some strategy to avoid deadlocks without
> releasing the lock on the old page?  Consider a mechanism such that
> RelationGetBufferForTuple() always returns a new buffer whose target
> block is greater than the old block (on which we already hold a lock).
> I think the tricky part is whether we can get anything like that from
> the FSM.  Also, there could be cases where we need to extend the heap
> even though pages with free space were available, because their block
> numbers were smaller than the block on which we hold the lock.

I can't see that being acceptable, from a space-usage POV.

> > So far the best idea I have - and it's really not a good one - is to
> > invent a new hint-bit that tells concurrent updates to acquire a
> > heavyweight tuple lock, while releasing the page-level lock. If that
> > hint bit does not require any other modifications - and we don't need an
> > xid in xmax for this use case - that'll avoid doing all the other
> > `already_marked` stuff early, which should address the correctness
> > issue.
> >
> 
> Don't we need to clear such a flag in case of error?  Also, don't we
> need to reset it later, such as when modifying the old page before
> WAL-logging?

If the flag just says "acquire a heavyweight lock", then there's no need
for that. That's cheap enough to just do if it's erroneously set.  At
least I can't see any reason.

> >  It's kinda invasive though, and probably has performance
> > implications.
> >
> 
> Do you see a performance implication due to requiring a heavyweight
> tuple lock in more cases than now, or something else?

Because of that, yes.


> Some other ways could be:
>
> Before releasing the lock on the buffer containing the old tuple, clear
> the VM and visibility info from the page and WAL-log it.  I think this
> could impact performance depending on how frequently we need to perform
> this action.

Doubling the number of xlog inserts in heap_update would certainly be
measurable :(. My guess is that the heavyweight tuple lock approach will
be less expensive.

Andres




Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Thomas Munro
On Tue, Jun 21, 2016 at 3:29 PM, Amit Kapila  wrote:
> Some other ways could be:
>
> Before releasing the lock on the buffer containing the old tuple, clear
> the VM and visibility info from the page and WAL-log it.  I think this
> could impact performance depending on how frequently we need to perform
> this action.
>
> Have a new flag like HEAP_XMAX_UNLOGGED (as existed when this logic was
> introduced in commit f2bfe8a24c46133f81e188653a127f939eb33c4a), set it
> in the old tuple header before releasing the lock on the buffer, and
> teach tqual.c to honor the flag.  I think tqual.c should consider
> HEAP_XMAX_UNLOGGED as an indication of an aborted transaction unless it
> is currently in progress.  Also, I think we need to clear this flag
> before WAL-logging in heap_update.

I also noticed that and wondered whether it was a mistake to take that
out.  It appears to have been removed as part of the logic to clear
away UNDO log support in 11919160, but it may have been an important
part of the heap_update protocol.  Though (as I mentioned nearby in a
reply to Noah) I'm not sure whether the tqual.c side, which would
ignore the unlogged xmax in the event of a badly timed crash, was ever
implemented.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Thomas Munro
On Tue, Jun 21, 2016 at 3:29 PM, Amit Kapila  wrote:
> On Tue, Jun 21, 2016 at 1:03 AM, Andres Freund  wrote:
>> Well, I think generally nobody seriously looked at actually refactoring
>> heap_update(), even though that'd be a good idea.  But in this instance,
>> the problem seems relatively fundamental:
>>
>> We need to lock the origin page, to do visibility checks, etc. Then we
>> need to figure out the target page. Even disregarding toasting - which
>> we could be doing earlier with some refactoring - we're going to have to
>> release the page level lock, to lock them in ascending order. Otherwise
>> we'll risk kinda likely deadlocks.
>
> Can we consider using some strategy to avoid deadlocks without
> releasing the lock on the old page?  Consider a mechanism such that
> RelationGetBufferForTuple() always returns a new buffer whose target
> block is greater than the old block (on which we already hold a lock).
> I think the tricky part is whether we can get anything like that from
> the FSM.  Also, there could be cases where we need to extend the heap
> even though pages with free space were available, because their block
> numbers were smaller than the block on which we hold the lock.

Doesn't that mean that over time, given a workload that does only or
mostly updates, your records tend to migrate further and further away
from the start of the file, leaving a growing unusable space at the
beginning, until you eventually need to CLUSTER/VACUUM FULL?

I was wondering about speculatively asking for a free page with a
lower block number than the origin page, if one is available, before
locking the origin page.  Then after locking the origin page, if it
turns out you need a page but didn't get it earlier, asking for a free
page with a higher block number than the origin page.
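A self-contained sketch of that two-phase idea follows.  The FSM is
faked with a simple array, and get_free_page_below/get_free_page_above
are hypothetical helpers invented here, not real PostgreSQL functions:

```c
#include <assert.h>

typedef unsigned int BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Toy stand-in for the FSM: free_space[blk] is bytes free on that page */
static int free_space[8] = {0, 120, 0, 0, 200, 0, 64, 0};

/* Hypothetical helper: lowest page strictly below `origin` with room */
static BlockNumber
get_free_page_below(BlockNumber origin, int needed)
{
    for (BlockNumber b = 0; b < origin; b++)
        if (free_space[b] >= needed)
            return b;
    return InvalidBlockNumber;
}

/* Hypothetical helper: lowest page strictly above `origin` with room */
static BlockNumber
get_free_page_above(BlockNumber origin, BlockNumber nblocks, int needed)
{
    for (BlockNumber b = origin + 1; b < nblocks; b++)
        if (free_space[b] >= needed)
            return b;
    return InvalidBlockNumber;
}

/*
 * Phase 1 (before locking the origin page): speculatively reserve a
 * lower-numbered page.  Phase 2 (after locking the origin page, if a
 * target is still needed): fall back to a higher-numbered page, which
 * can be locked without releasing the origin lock because the locking
 * order stays ascending.
 */
static BlockNumber
choose_target(BlockNumber origin, BlockNumber nblocks, int needed)
{
    BlockNumber blk = get_free_page_below(origin, needed);

    if (blk == InvalidBlockNumber)
        blk = get_free_page_above(origin, nblocks, needed);
    return blk;
}
```

Because lower-numbered pages are tried first, this variant also avoids
the migrate-toward-the-end-of-file behavior described above.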

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Michael Paquier
On Tue, Jun 21, 2016 at 12:06 PM, Tom Lane  wrote:
> Peter Eisentraut  writes:
>> On 6/20/16 10:29 PM, Tom Lane wrote:
>>> What I would want to know is whether this specific change is actually a
>>> good idea.  In particular, I'm concerned about the possible security
>>> implications of exposing primary_conninfo --- might it not contain a
>>> password, for example?
>
>> That would have been my objection.  This was also mentioned in the
>> context of moving recovery.conf settings to postgresql.conf, because
>> then the password would become visible in SHOW commands and the like.
>
>> Alternatively or additionally, implement a way to strip passwords out of
>> conninfo information.  libpq already has information about which
>> connection items are sensitive.
>
> Yeah, I'd been wondering whether we could parse the conninfo string into
> individual fields and then drop the password field.  It's hard to see a
> reason why this view needs to show passwords, since presumably everything
> in it corresponds to successful connections --- if your password is wrong,
> you aren't in it.

walreceiver.c does not have a direct dependency on libpq yet; everything
goes through libpqwalreceiver. So a correct move would be to pull the
low-level routines behind PQconninfoParse out into something in
src/common/, where both the backend and frontend could use them, and
then use them to rebuild a connection string. Thoughts?
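To illustrate the "rebuild a conninfo string minus sensitive fields"
idea, here is a self-contained sketch that emulates it on a simple
space-separated key=value string.  (Real libpq parses with
PQconninfoParse() and marks sensitive options via the dispchar field of
PQconninfoOption, where "*" means the value should not be displayed;
this toy version hard-codes the password keyword and ignores quoting.)

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy version of "rebuild a conninfo string, dropping sensitive
 * fields": copy every space-separated token except those starting
 * with "password=" into `out`.
 */
static void
conninfo_strip_password(const char *conninfo, char *out, size_t outlen)
{
    const char *p = conninfo;

    out[0] = '\0';
    while (*p != '\0')
    {
        const char *end = strchr(p, ' ');
        size_t      len = end ? (size_t) (end - p) : strlen(p);

        if (strncmp(p, "password=", 9) != 0 &&
            strlen(out) + len + 2 < outlen)
        {
            if (out[0] != '\0')
                strcat(out, " ");   /* separator between kept tokens */
            strncat(out, p, len);
        }
        p += len;
        while (*p == ' ')
            p++;                    /* skip runs of spaces */
    }
}
```

A src/common/ version would of course reuse libpq's real parsing rules
(quoting, URI syntax) rather than this naive tokenizer.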
-- 
Michael




Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Amit Kapila
On Tue, Jun 21, 2016 at 1:03 AM, Andres Freund  wrote:
>
> On 2016-06-15 08:56:52 -0400, Robert Haas wrote:
> > Yikes: heap_update() sets the tuple's XMAX, CMAX, infomask, infomask2,
> > and CTID without logging anything or clearing the all-visible flag and
> > then releases the lock on the heap page to go do some more work that
> > might even ERROR out.  Only if that other work all goes OK do we
> > relock the page and perform the WAL-logged actions.
> >
> > That doesn't seem like a good idea even in existing releases, because
> > you've taken a tuple on an all-visible page and made it not
> > all-visible, and you've made a page modification that is not
> > necessarily atomic without logging it.
>
> Right, that's broken.
>
>
> > I'm not sure what to do about this: this part of the heap_update()
> > logic has been like this forever, and I assume that if it were easy to
> > refactor this away, somebody would have done it by now.
>
> Well, I think generally nobody seriously looked at actually refactoring
> heap_update(), even though that'd be a good idea.  But in this instance,
> the problem seems relatively fundamental:
>
> We need to lock the origin page, to do visibility checks, etc. Then we
> need to figure out the target page. Even disregarding toasting - which
> we could be doing earlier with some refactoring - we're going to have to
> release the page level lock, to lock them in ascending order. Otherwise
> we'll risk kinda likely deadlocks.
>

Can we consider using some strategy to avoid deadlocks without releasing
the lock on the old page?  Consider a mechanism such that
RelationGetBufferForTuple() always returns a new buffer whose target
block is greater than the old block (on which we already hold a lock).
I think the tricky part is whether we can get anything like that from
the FSM.  Also, there could be cases where we need to extend the heap
even though pages with free space were available, because their block
numbers were smaller than the block on which we hold the lock.

>  We also certainly don't want to nest
> the lwlocks for the toast stuff, inside a content lock for the old
> tuple's page.
>
> So far the best idea I have - and it's really not a good one - is to
> invent a new hint-bit that tells concurrent updates to acquire a
> heavyweight tuple lock, while releasing the page-level lock. If that
> hint bit does not require any other modifications - and we don't need an
> xid in xmax for this use case - that'll avoid doing all the other
> `already_marked` stuff early, which should address the correctness
> issue.
>

Don't we need to clear such a flag in case of error?  Also, don't we
need to reset it later, such as when modifying the old page before
WAL-logging?

>  It's kinda invasive though, and probably has performance
> implications.
>

Do you see a performance implication due to requiring a heavyweight
tuple lock in more cases than now, or something else?

Some other ways could be:

Before releasing the lock on the buffer containing the old tuple, clear
the VM and visibility info from the page and WAL-log it.  I think this
could impact performance depending on how frequently we need to perform
this action.

Have a new flag like HEAP_XMAX_UNLOGGED (as existed when this logic was
introduced in commit f2bfe8a24c46133f81e188653a127f939eb33c4a), set it
in the old tuple header before releasing the lock on the buffer, and
teach tqual.c to honor the flag.  I think tqual.c should consider
HEAP_XMAX_UNLOGGED as an indication of an aborted transaction unless it
is currently in progress.  Also, I think we need to clear this flag
before WAL-logging in heap_update.
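The proposed tqual.c rule can be sketched as a small decision function.
This is a hypothetical illustration: HEAP_XMAX_UNLOGGED and the
transaction-status check are modeled with plain parameters, not real
PostgreSQL APIs, and snapshot details are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum XactStatus
{
    XACT_IN_PROGRESS,
    XACT_COMMITTED,
    XACT_ABORTED
} XactStatus;

/*
 * Hypothetical visibility rule for an xmax carrying the proposed
 * HEAP_XMAX_UNLOGGED flag: if the deleter is no longer in progress,
 * its unlogged change may never have reached WAL, so the xmax is
 * treated exactly like an aborted deletion and does not hide the
 * tuple.  (For simplicity, an in-progress deleter is treated as
 * hiding the tuple; real tqual.c would consult the snapshot.)
 */
static bool
xmax_hides_tuple(bool xmax_unlogged, XactStatus deleter)
{
    if (xmax_unlogged && deleter != XACT_IN_PROGRESS)
        return false;       /* treat as aborted: tuple stays visible */
    return (deleter == XACT_COMMITTED || deleter == XACT_IN_PROGRESS);
}
```

The interesting case is the first branch: a set flag with a
no-longer-running deleter is precisely the badly-timed-crash scenario
discussed above.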


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Tom Lane
Peter Eisentraut  writes:
> On 6/19/16 5:55 PM, Tom Lane wrote:
>> Depending on what the percentage actually is, maybe we could treat
>> this like the "random" test, and allow a failure to be disregarded
>> overall?  But that doesn't seem very nice either, in view of our
>> increasing reliance on automated testing.  If "random" were failing
>> 90% of the time on some buildfarm critters, that would probably
>> indicate a real problem, but we'd likely not realize it for a long time.

> I think this test would only fail if it runs out of workers, and that 
> would only happen in an installcheck run against a server configured in 
> a nonstandard way or that is doing something else -- which doesn't 
> happen on the buildfarm.

Um, if you're speaking of select_parallel, that already runs in parallel
with two other regression tests, and there is no annotation in the
parallel_schedule file suggesting that adding more scripts to that group
would be bad.  But yes, perhaps putting this test into its own standalone
group would be enough of a fix.
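If isolating the test is the fix, the change would be along these lines
in src/test/regress/parallel_schedule (a sketch of the idea, not a
committed patch):

```
# ----------
# Run select_parallel in a group by itself, so that concurrently
# running test scripts cannot eat the parallel workers it counts on.
# ----------
test: select_parallel
```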

regards, tom lane




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Peter Eisentraut

On 6/19/16 5:55 PM, Tom Lane wrote:
> Depending on what the percentage actually is, maybe we could treat
> this like the "random" test, and allow a failure to be disregarded
> overall?  But that doesn't seem very nice either, in view of our
> increasing reliance on automated testing.  If "random" were failing
> 90% of the time on some buildfarm critters, that would probably
> indicate a real problem, but we'd likely not realize it for a long time.

I think this test would only fail if it runs out of workers, and that 
would only happen in an installcheck run against a server configured in 
a nonstandard way or that is doing something else -- which doesn't 
happen on the buildfarm.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Tom Lane
Peter Eisentraut  writes:
> On 6/20/16 10:29 PM, Tom Lane wrote:
>> What I would want to know is whether this specific change is actually a
>> good idea.  In particular, I'm concerned about the possible security
>> implications of exposing primary_conninfo --- might it not contain a
>> password, for example?

> That would have been my objection.  This was also mentioned in the 
> context of moving recovery.conf settings to postgresql.conf, because 
> then the password would become visible in SHOW commands and the like.

> Alternatively or additionally, implement a way to strip passwords out of 
> conninfo information.  libpq already has information about which 
> connection items are sensitive.

Yeah, I'd been wondering whether we could parse the conninfo string into
individual fields and then drop the password field.  It's hard to see a
reason why this view needs to show passwords, since presumably everything
in it corresponds to successful connections --- if your password is wrong,
you aren't in it.

regards, tom lane




Re: [HACKERS] Parallel query and temp_file_limit

2016-06-20 Thread Tom Lane
Peter Geoghegan  writes:
> On Wed, May 18, 2016 at 3:40 AM, Robert Haas  wrote:
>> What I'm tempted to do is trying to document that, as a point of
>> policy, parallel query in 9.6 uses up to (workers + 1) times the
>> resources that a single session might use.  That includes not only CPU
>> but also things like work_mem and temp file space.  This obviously
>> isn't ideal, but it's what could be done by the ship date.

> Where would that be documented, though? Would it need to be noted in
> the case of each such GUC?

Why can't we just note this in the number-of-workers GUCs?  It's not like
there even *is* a GUC for many of our per-process resource consumption
behaviors.

regards, tom lane




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Peter Eisentraut

On 6/20/16 10:29 PM, Tom Lane wrote:
> What I would want to know is whether this specific change is actually a
> good idea.  In particular, I'm concerned about the possible security
> implications of exposing primary_conninfo --- might it not contain a
> password, for example?

That would have been my objection.  This was also mentioned in the 
context of moving recovery.conf settings to postgresql.conf, because 
then the password would become visible in SHOW commands and the like.


We would need a way to put the password in a separate place, like a 
primary_conn_password setting.  Yes, you can already use .pgpass for 
that, but since pg_basebackup -R will happily copy a password out of 
.pgpass into recovery.conf, this makes accidentally leaking passwords 
way too easy.


Alternatively or additionally, implement a way to strip passwords out of 
conninfo information.  libpq already has information about which 
connection items are sensitive.


Needs more thought, in any case.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Jun 21, 2016 at 11:29 AM, Tom Lane  wrote:
>> What I would want to know is whether this specific change is actually a
>> good idea.  In particular, I'm concerned about the possible security
>> implications of exposing primary_conninfo --- might it not contain a
>> password, for example?

> Yes, it could, since a connection string may contain one, but we make
> this view's information visible only to superusers. For others, it's
> just NULL.

Well, that's okay for now, but I'm curious to hear Stephen Frost's
opinion on this.  He's been on the warpath to decrease our dependence
on superuser-ness for protection purposes.  Seems to me that having
one column in this view that is a lot more security-sensitive than
the others is likely to be an issue someday.

regards, tom lane




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Michael Paquier
On Tue, Jun 21, 2016 at 11:29 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Tue, Jun 21, 2016 at 9:58 AM, Tatsuo Ishii  wrote:
>>> There even seem to be ongoing discussions about changing the version
>>> number during the beta period (which definitely requires initdb).
>>> Why not change the system catalog during beta? :-)
>
>> I am not directly against that, to be honest, but I'd expect Tom's
>> wrath to show up on this thread soon just for saying so. In the last
>> two releases there were catalog bumps before beta2 because there was
>> no other choice. This issue is not really critical, just a careless
>> miss on my part, and we can live with this mistake as well.
>
> Since pg_stat_wal_receiver is new in 9.6, it seems to me that it'd be
> wise to try to get it right the first time.  And it's not like we are
> going to get to beta3 without another initdb --- we already know the
> partial-aggregate design is broken and needs some more catalog changes.

Amen. That's a sufficient argument to slip this one into 9.6 then.

> What I would want to know is whether this specific change is actually a
> good idea.  In particular, I'm concerned about the possible security
> implications of exposing primary_conninfo --- might it not contain a
> password, for example?

Yes, it could, since a connection string may contain one, but we make
this view's information visible only to superusers. For others, it's
just NULL.
-- 
Michael




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Jun 21, 2016 at 9:58 AM, Tatsuo Ishii  wrote:
>> There even seem to be ongoing discussions about changing the version
>> number during the beta period (which definitely requires initdb).
>> Why not change the system catalog during beta? :-)

> I am not directly against that to be honest, but I'd expect Tom's
> wrath to show up on this thread soon just for saying that. In the
> last two releases, catalog bumps happened before beta2 because there
> was no other choice. This issue is not really critical, just a stupid
> miss on my part, and we can live with this mistake as well.

Since pg_stat_wal_receiver is new in 9.6, it seems to me that it'd be
wise to try to get it right the first time.  And it's not like we are
going to get to beta3 without another initdb --- we already know the
partial-aggregate design is broken and needs some more catalog changes.

What I would want to know is whether this specific change is actually a
good idea.  In particular, I'm concerned about the possible security
implications of exposing primary_conninfo --- might it not contain a
password, for example?

regards, tom lane




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> This seems like pretty good evidence that we should remove the "ignored"
>> marking for the random test, and maybe remove that functionality from
>> pg_regress altogether.  We could probably adjust the test to decrease
>> its risk-of-failure by another factor of ten or so, if anyone feels like
>> 0.005% failure probability is too high.

> I suppose that as far as the buildfarm goes it's okay that the test
> fails from time to time, but it may be worse from packagers' points of
> view, where a randomly failing test can wreck the whole building
> process.  Is a 0.005% failure probability low enough that nobody will be
> bothered by that?

As an ex-packager, I think that's a couple orders of magnitude below where
anybody will notice it, let alone feel pain.  There are other causes of
failure that will dwarf this one.

(You may recall that I used to bitch regularly about the failure
probabilities for mysql's regression tests --- but that was because
the probability of failure was on the order of 50%, when building
in Red Hat's buildfarm.)

regards, tom lane




Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0

2016-06-20 Thread Bruce Momjian
On Tue, May 24, 2016 at 09:23:27AM -0500, Jim Nasby wrote:
> On 5/16/16 2:36 AM, Bruce Momjian wrote:
> >Right.  I am thinking of writing some docs about how to avoid downtime
> >for upgrades of various types.
> 
> If there's some magic sauce to shrink pg_upgrade downtime to near 0 I think
> folks would be very interested in that.
> 
> Outside of that scenario, I think what would be far more useful is
> information on how to do seamless master/replica switchovers using tools
> like pgBouncer or pgPool. That ability is useful *all* the time, not just
> when upgrading. It makes it trivial to do OS-level maintenance, and if
> you're using a form of logical replication it also makes it trivial to do
> expensive database maintenance, such as cluster/vacuum full/reindex. I've
> worked with a few clients that had that ability and it was a huge stress
> reducer. As a bonus, an unplanned outage of the master becomes far less
> stressful, because you already know exactly how to fail over.

pg_upgrade's runtime can't be decreased dramatically --- I wanted to
document other methods like you described.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0

2016-06-20 Thread Bruce Momjian
On Fri, May 20, 2016 at 07:40:53PM -0400, Robert Haas wrote:
> On Mon, May 16, 2016 at 3:36 AM, Bruce Momjian  wrote:
> > On Sun, May 15, 2016 at 03:23:52PM -0500, Jim Nasby wrote:
> >> 2) There's no ability at all to revert, other than restore a backup. That
> >> means if you pull the trigger and discover some major performance problem,
> >> you have no choice but to deal with it (you can't switch back to the old
> >> version without losing data).
> >
> > In --link mode only
> 
> No, not really.  Once you let write transactions into the new cluster,
> there's no way to get back to the old server version no matter which
> option you used.

Yes, there is, and it is documented:

If you ran pg_upgrade without
--link or did not start the new server, the
old cluster was not modified except that, if linking
started, a .old suffix was appended to
$PGDATA/global/pg_control.  To reuse the old
cluster, possibly remove the .old suffix from
$PGDATA/global/pg_control; you can then restart the
old cluster.

What is confusing you?
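
A toy simulation of that documented revert step (paths are made up and
this only mimics the rename, not pg_upgrade itself):

```python
# When pg_upgrade --link has started linking, it renames
# $PGDATA/global/pg_control to pg_control.old; reusing the old cluster
# is then just a matter of renaming it back.
import os
import tempfile

pgdata = tempfile.mkdtemp()
os.makedirs(os.path.join(pgdata, "global"))
ctl = os.path.join(pgdata, "global", "pg_control")
with open(ctl, "wb") as f:
    f.write(b"control data")

os.rename(ctl, ctl + ".old")   # what pg_upgrade does when linking starts
assert not os.path.exists(ctl)

os.rename(ctl + ".old", ctl)   # the revert step from the docs
print(os.path.exists(ctl))     # True: the old cluster can be restarted
```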

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Michael Paquier
On Tue, Jun 21, 2016 at 9:58 AM, Tatsuo Ishii  wrote:
>>> Thanks, this looks good.  Could you please add it to the next commitfest
>>> so that it doesn't get lost and also so I can do an official review of it?
>>
>> Yes, I just did that. That's too late for 9.6 anyway with beta2 close by.
>
> Even so, there seem to be ongoing discussions about changing the
> version number during the beta period (which definitely requires
> initdb). Why not change the system catalog during beta? :-)

I am not directly against that to be honest, but I'd expect Tom's
wrath to show up on this thread soon just for saying that. In the
last two releases, catalog bumps happened before beta2 because there
was no other choice. This issue is not really critical, just a stupid
miss on my part, and we can live with this mistake as well.
-- 
Michael




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Tatsuo Ishii
>> Thanks, this looks good.  Could you please add it to the next commitfest
>> so that it doesn't get lost and also so I can do an official review of it?
> 
> Yes, I just did that. That's too late for 9.6 anyway with beta2 close by.

Even so, there seem to be ongoing discussions about changing the
version number during the beta period (which definitely requires
initdb). Why not change the system catalog during beta? :-)

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Michael Paquier
On Mon, Jun 20, 2016 at 11:26 PM, Vik Fearing  wrote:
> On 19/06/16 13:02, Michael Paquier wrote:
>> On Sun, Jun 19, 2016 at 7:38 PM, Vik Fearing  wrote:
>>> On 19/06/16 12:28, Michael Paquier wrote:
 On Sun, Jun 19, 2016 at 5:56 PM, Michael Paquier
 Or in short the attached.
>>>
>>> The code looks good to me but why no documentation?
>>
>> Because that's a long day... Thanks! Now you have it.
>
> Thanks, this looks good.  Could you please add it to the next commitfest
> so that it doesn't get lost and also so I can do an official review of it?

Yes, I just did that. That's too late for 9.6 anyway with beta2 close by.
-- 
Michael




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 5:47 PM, Robert Haas  wrote:

> On Mon, Jun 20, 2016 at 4:52 PM, David G. Johnston
>  wrote:
> > Internal or external I do think the number and type of flags described
> here,
> > for the purposes described, seems undesirable from an architectural
> > standpoint.
>
> Well, that seems like a bold statement to me, because I think that one
> really has to understand why flags got added before deciding whether
> there are too many of them, and it's not clear to me that you do.
>

I don't.  And I totally accept a response of: you are operating under a
false premise here.  My response above was more defensive than
constructive.

> What we're talking about here is ONE flag, which currently controls
> whether the leader will participate in running Gather's child plan.
> What happens when the flag is set to "leader should not participate"
> and no workers are available?  The current behavior is that the leader
> runs the plan in lieu of the workers, and Tom is proposing to change
> it so that we wait for a worker to become available instead.


I think my biggest (again, I'm not digging deep here) misunderstanding is
testing mode versus production mode and what is or is not visible to the
test framework compared to SQL/libpq.

> I am usually quite happy when people other than senior hackers weigh
> in on threads here, especially about user-visible behavior, because
> senior hackers can be quite myopic.  We spend most of our time
> developing the system rather than using it, and sometimes our notion
> of what is useful or acceptable is distorted because of it.  Moreover,
> none of us are so smart that we can't be wrong, so a gut check from
> somebody else is often helpful.



> But, of course, an opinion that
> proceeds from a false premise doesn't provide useful guidance.

This is true - though I suspect not absolutely.


> Many
> people realize this, which is why we tend to get entirely too many
> opinions on questions like "should we change the version number?" and
> far too few on questions like "how should we refactor heap_update?".
>
>
This is a non sequitur - or I'm just too worn to follow it.

I did not intentionally set out to spend an hour of my own time to waste 20
minutes of yours.  I won't apologize for an earnest attempt at
self-education and contribution; no matter how badly it turned out.  I will
endeavor to learn from it and avoid similar errors in the future.

David J.


Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Alvaro Herrera
Tom Lane wrote:

> This seems like pretty good evidence that we should remove the "ignored"
> marking for the random test, and maybe remove that functionality from
> pg_regress altogether.  We could probably adjust the test to decrease
> its risk-of-failure by another factor of ten or so, if anyone feels like
> 0.005% failure probability is too high.

I suppose that as far as the buildfarm goes it's okay that the test
fails from time to time, but it may be worse from packagers' points of
view, where a randomly failing test can wreck the whole building
process.  Is a 0.005% failure probability low enough that nobody will be
bothered by that?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Thomas Munro
On Fri, Jun 17, 2016 at 3:36 PM, Noah Misch  wrote:
> I agree the non-atomic, unlogged change is a problem.  A related threat
> doesn't require a torn page:
>
>   AssignTransactionId() xid=123
>   heapam.c:3931 HeapTupleHeaderSetXmax(oldtup.t_data, 123);
>   some ERROR before heap_update() finishes
>   rollback;  -- xid=123
>   some backend flushes the modified page
>   immediate shutdown
>   AssignTransactionId() xid=123
>   commit;  -- xid=123
>
> If nothing wrote an xlog record that witnesses xid 123, the cluster can
> reassign it after recovery.  The failed update is now considered a successful
> update, and the row improperly becomes dead.  That's important.

I wonder if that was originally supposed to be handled with the
HEAP_XMAX_UNLOGGED flag which was removed in 11919160.  A comment in
the heap WAL logging commit f2bfe8a2 said that tqual routines would
see the HEAP_XMAX_UNLOGGED flag in the event of a crash before logging
(though I'm not sure if the tqual routines ever actually did that).
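
Noah's failure sequence can be modeled as a toy simulation (illustrative
only, not PostgreSQL code; the clog and the on-disk tuple are stand-in
dictionaries):

```python
# An xmax written to disk without any WAL record witnessing that xid:
# after crash recovery the xid counter rewinds, an unrelated transaction
# reuses the same xid and commits, and the stale on-disk xmax suddenly
# looks like a committed delete.
clog = {}                      # xid -> 'committed' / 'aborted'
tuple_on_disk = {"xmax": None}

xid = 123
tuple_on_disk["xmax"] = xid    # heap_update sets xmax; nothing logged yet
clog[xid] = "aborted"          # the update ERRORs out and rolls back
# ... a backend flushes the page with xmax=123, then immediate shutdown ...

clog.clear()                   # recovery: no WAL record mentions xid 123,
next_xid = 123                 # so the xid counter is rewound past it

new_xid = next_xid             # an unrelated transaction reuses xid 123
clog[new_xid] = "committed"

row_is_dead = clog.get(tuple_on_disk["xmax"]) == "committed"
print(row_is_dead)  # True: the failed update now looks like a delete
```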

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Tom Lane
I wrote:
> Depending on what the percentage actually is, maybe we could treat
> this like the "random" test, and allow a failure to be disregarded
> overall?  But that doesn't seem very nice either, in view of our
> increasing reliance on automated testing.  If "random" were failing
> 90% of the time on some buildfarm critters, that would probably
> indicate a real problem, but we'd likely not realize it for a long time.

BTW, this is getting a bit off-topic for this thread, but: I got
curious about whether any such effect might actually exist, and decided
to grep the buildfarm logs to see how many times a failure of the
"random" test had been ignored.  The answer is that there are 35 such
occurrences in otherwise-successful buildfarm runs, out of something
like 650,000 successful runs in the buildfarm database.  This is in the
noise, really.  There are more occurrences than that of cases where
"random ... failed (ignored)" appeared in a failed
run, which more than likely means that some other regression test script
caused a crash with collateral damage to this test.

This seems like pretty good evidence that we should remove the "ignored"
marking for the random test, and maybe remove that functionality from
pg_regress altogether.  We could probably adjust the test to decrease
its risk-of-failure by another factor of ten or so, if anyone feels like
0.005% failure probability is too high.
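
For a rough sanity check of that figure, the chance of seeing at least
one spurious failure across many runs can be computed directly (the run
counts below are arbitrary examples):

```python
# Back-of-the-envelope: how often would a packager doing n builds see
# at least one spurious "random" failure at a 0.005% per-run rate?
p = 0.00005  # 0.005% per-run failure probability quoted above

def p_any_failure(n_runs, p=p):
    # probability that at least one of n independent runs fails
    return 1 - (1 - p) ** n_runs

for n in (100, 1_000, 10_000):
    print(n, round(p_any_failure(n), 4))
# 100 -> 0.005, 1000 -> 0.0488, 10000 -> 0.3935
```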

Raw data attached for amusement's sake.

regards, tom lane

    sysname    |      snapshot       |      stage      | logline
---------------+---------------------+-----------------+--------------------------------------------------------------------
 orangutan     | 2015-01-02 08:00:09 | OK              | random ... failed (ignored)
 dingo         | 2015-01-22 17:29:03 | Check           | random ... failed (ignored)
 fulmar        | 2015-01-26 03:03:17 | OK              | test random ... failed (ignored)
 nightjar      | 2015-02-03 22:27:13 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 dromedary     | 2015-03-09 18:57:28 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 olinguito     | 2015-03-09 18:59:08 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 rover_firefly | 2015-03-09 19:04:00 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 lapwing       | 2015-03-09 19:15:01 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 prairiedog    | 2015-03-09 19:20:35 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 dingo         | 2015-03-09 19:29:00 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 binturong     | 2015-03-09 19:42:28 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 shearwater    | 2015-03-09 19:58:15 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 magpie        | 2015-04-25 20:59:48 | OK              | test random ... failed (ignored)
 anole         | 2015-04-28 21:25:27 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 binturong     | 2015-05-07 16:55:15 | pg_upgradeCheck | test random ... failed (ignored) (test process exited with exit code 2)
 dingo         | 2015-05-07 16:58:01 | pg_upgradeCheck | test random ... failed (ignored) (test process exited with exit code 2)
 binturong     | 2015-05-08 18:56:02 | pg_upgradeCheck | test random ... failed (ignored) (test process exited with exit code 2)
 reindeer      | 2015-05-15 06:00:01 | OK              | random ... failed (ignored)
 anole         | 2015-05-28 17:58:54 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 anole         | 2015-06-01 00:30:15 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 crake         | 2015-06-01 18:53:02 | OK              | test random ... failed (ignored)
 sittella      | 2015-06-01 22:50:36 | OK              | test random ... failed (ignored)
 anole         | 2015-06-01 22:58:24 | Check           | random ... failed (ignored) (test process exited with exit code 2)
 blesbok       | 2015-06-27 18:49:43 | Check           |

Re: [HACKERS] Parallel query and temp_file_limit

2016-06-20 Thread Peter Geoghegan
On Wed, May 18, 2016 at 3:40 AM, Robert Haas  wrote:
> What I'm tempted to do is trying to document that, as a point of
> policy, parallel query in 9.6 uses up to (workers + 1) times the
> resources that a single session might use.  That includes not only CPU
> but also things like work_mem and temp file space.  This obviously
> isn't ideal, but it's what could be done by the ship date.

Where would that be documented, though? Would it need to be noted in
the case of each such GUC?
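
The arithmetic behind that policy is simple; a sketch with made-up GUC
settings:

```python
# Under the "(workers + 1) times the resources" policy, each
# participant (leader plus workers) may use up to a per-process limit
# such as work_mem per sort/hash node or temp_file_limit of temp space,
# so the session-wide ceiling scales with the worker count.
def parallel_ceiling(per_process_limit, workers):
    return per_process_limit * (workers + 1)

work_mem_mb = 64          # example setting
temp_file_limit_mb = 1024 # example setting
workers = 4

print(parallel_ceiling(work_mem_mb, workers))         # 320 MB of work_mem
print(parallel_ceiling(temp_file_limit_mb, workers))  # 5120 MB of temp space
```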

-- 
Peter Geoghegan




Re: [HACKERS] 10.0

2016-06-20 Thread Bruce Momjian
On Mon, Jun 20, 2016 at 05:11:17PM -0400, Robert Haas wrote:
> On Mon, Jun 20, 2016 at 4:55 PM, Tom Lane  wrote:
> > No, the argument for it was that we'd no longer have to have the annual
> > discussions about "is it 10.0 yet?".
> 
> WHAT annual argument?  Did anyone even argue that any 9.x release
> prior to 9.6 deserved to be called 10.0?  Maybe somebody suggested
> that for 9.2 and it generated, like, four emails?  I certainly don't
> remember any discussion that remotely approached the amount of time
> we've spent litigating both the version number and the version
> numbering scheme in the last few months.

I do think Robert is 100% accurate on this.  Personally, I have never
understood the "fewer arguments" rationale, and the jumps to 8.0 and 9.0
were done in a positive way that I think provided value to our community.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] 10.0

2016-06-20 Thread Bruce Momjian
On Mon, May 16, 2016 at 10:16:48AM -0400, Tom Lane wrote:
> Peter Eisentraut  writes:
> > On 5/16/16 9:53 AM, Greg Stark wrote:
> >> I thought the idea was that Berkeley tossed an source tree over the
> >> wall with no version number and then the first five releases were
> >> Postgres95 0.x, Postgres95 1.0, Postgres95 1.0.1, Postgres95 1.0.2,
> >> Postgres95 1.0.9. Then the idea was that PostgreSQL 6.0 was the sixth
> >> major release counting those as the first five releases.
> 
> > The last release out of Berkeley was 4.2.
> 
> Correct --- I have a copy of that tarball.
> 
> > Then Postgres95 was "5", and then PostgreSQL started at 6.
> 
> I wasn't actually around at the time, but our commit history starts
> with this:
> 
> Author: Marc G. Fournier 
> Branch: master Release: REL6_1 [d31084e9d] 1996-07-09 06:22:35 +
> 
> Postgres95 1.01 Distribution - Virgin Sources
> 
> The first mention of 6.anything is here:
> 
> Author: Bruce Momjian 
> Branch: master Release: REL6_1 [a2b7f6297] 1996-12-28 02:01:58 +
> 
> Updated changes for 6.0.
> 
> I see no references in the commit history to 5.anything, but there
> are some references like this:

The sole reason we jumped from Postgres 1.09 to 6.0 was that in Postgres
1.0.X, $PGDATA/PG_VERSION contained '5', meaning when Berkeley went from
University Postgres 4.2 to Postgres95 1.0, they didn't reset PG_VERSION.

We really had no way of going to Postgres 2.0 unless we were prepared to
have data/PG_VERSION never match the major version number.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Andres Freund
On 2016-06-20 17:55:19 -0400, Robert Haas wrote:
> On Mon, Jun 20, 2016 at 4:24 PM, Andres Freund  wrote:
> > On 2016-06-20 16:10:23 -0400, Robert Haas wrote:
> >> What exactly is the point of all of that already_marked stuff?
> >
> > Preventing the old tuple from being locked/updated by another backend,
> > while unlocking the buffer.
> >
> >> I
> >> mean, suppose we just don't do any of that before we go off to do
> >> toast_insert_or_update and RelationGetBufferForTuple.  Eventually,
> >> when we reacquire the page lock, we might find that somebody else has
> >> already updated the tuple, but couldn't that be handled by
> >> (approximately) looping back up to l2 just as we do in several other
> >> cases?
> >
> > We'd potentially have to undo a fair amount more work: the toasted data
> > would have to be deleted and such, just to retry. Which isn't going to
> > be super easy, because all of it will be happening with the current cid
> > (we can't just do a CommandCounterIncrement() for correctness reasons).
> 
> Why would we have to delete the TOAST data?  AFAIUI, the tuple points
> to the TOAST data, but not the other way around.  So if we change our
> mind about where to put the tuple, I don't think that requires
> re-TOASTing.

Consider what happens if we, after restarting at l2, notice that we
can't actually insert, but return in the !HeapTupleMayBeUpdated
branch. If the caller doesn't error out - and there are certainly callers
doing that - we'd "leak" a toasted datum. Unless the transaction aborts,
the toasted datum would never be cleaned up, because there's no datum
pointing to it, so no heap_delete will ever recurse into the toast
datum (via toast_delete()).
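
The leak Andres describes can be modeled with a toy heap/toast store
(illustrative only; the names loosely mirror the real functions, and the
dictionaries stand in for on-disk storage):

```python
# Toast chunks are reclaimed only by following the pointer in a heap
# tuple (heap_delete recursing into toast_delete). If we toast the new
# datum but then never insert a heap tuple pointing at it, nothing can
# ever find and delete the chunk.
toast_store = {}
heap = []

def toast_insert(datum):
    chunk_id = len(toast_store)
    toast_store[chunk_id] = datum
    return chunk_id

def heap_delete(tuple_):
    if "toast_ptr" in tuple_:          # recurse into the toast datum
        del toast_store[tuple_["toast_ptr"]]
    heap.remove(tuple_)

ptr = toast_insert("big datum")        # the update toasts the new value...
# ...but the retry path returns without inserting a tuple holding `ptr`

for t in list(heap):                   # later cleanup can only reach
    heap_delete(t)                     # chunks referenced from the heap

print(toast_store)  # {0: 'big datum'} -- the orphaned, unreachable chunk
```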

Andres




Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 4:24 PM, Andres Freund  wrote:
> On 2016-06-20 16:10:23 -0400, Robert Haas wrote:
>> What exactly is the point of all of that already_marked stuff?
>
> Preventing the old tuple from being locked/updated by another backend,
> while unlocking the buffer.
>
>> I
>> mean, suppose we just don't do any of that before we go off to do
>> toast_insert_or_update and RelationGetBufferForTuple.  Eventually,
>> when we reacquire the page lock, we might find that somebody else has
>> already updated the tuple, but couldn't that be handled by
>> (approximately) looping back up to l2 just as we do in several other
>> cases?
>
> We'd potentially have to undo a fair amount more work: the toasted data
> would have to be deleted and such, just to retry. Which isn't going to
> super easy, because all of it will be happening with the current cid (we
> can't just increase CommandCounterIncrement() for correctness reasons).

Why would we have to delete the TOAST data?  AFAIUI, the tuple points
to the TOAST data, but not the other way around.  So if we change our
mind about where to put the tuple, I don't think that requires
re-TOASTing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 5:36 PM, David G. Johnston
 wrote:
>> Yeah, no kidding.  We had a perfectly good consensus to keep this at
>> 9.6 on pgsql-advocacy, and then later we had a revised consensus to
>> retitle it to 10.0,
>
> If -advocacy had changed their mind before beta1 was tagged this may have
> played out a bit differently...maybe.  In any case 9.6 was a foregone
> conclusion given -advocacy's timeline and because of its independence from
> -hackers (Tom, the tagger, specifically).

One thing to keep in mind is that I did not realize there was any
urgency to make the decision before beta1, because Tom had previously
made reference to renumbering this version to 10.0 *over the summer*
depending on how awesome we then thought parallel query was.  I
assumed that he would not have made this comment if he had an
objection to renumbering the release after beta1.  Surely he didn't
think we were going to do beta1 in the fall.  It was only after beta1
had been released that anyone said post-beta1 was too late.  Had that
been brought up earlier, the discussion might have gone differently,
too.

Also, it is not the case that because Tom applies the tag, he also
gets veto power over the version numbering scheme.  That's not how
decision-making in this community works.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 4:52 PM, David G. Johnston
 wrote:
> Internal or external I do think the number and type of flags described here,
> for the purposes described, seems undesirable from an architectural
> standpoint.

Well, that seems like a bold statement to me, because I think that one
really has to understand why flags got added before deciding whether
there are too many of them, and it's not clear to me that you do.
What we're talking about here is ONE flag, which currently controls
whether the leader will participate in running Gather's child plan.
What happens when the flag is set to "leader should not participate"
and no workers are available?  The current behavior is that the leader
runs the plan in lieu of the workers, and Tom is proposing to change
it so that we wait for a worker to become available instead.  That has
the advantage of being better for testing, but the disadvantage of
being possibly less useful for other purposes.  Neither of us are
proposing to eliminate the flag, which would only serve to cripple
test coverage, and I will venture to say that neither Tom nor anyone
else who has spent much time looking at the way any part of the system
works would argue that a Node type with ONE flag has too many flags.

I am usually quite happy when people other than senior hackers weigh
in on threads here, especially about user-visible behavior, because
senior hackers can be quite myopic.  We spend most of our time
developing the system rather than using it, and sometimes our notion
of what is useful or acceptable is distorted because of it.  Moreover,
none of us are so smart that we can't be wrong, so a gut check from
somebody else is often helpful.  But, of course, an opinion that
proceeds from a false premise doesn't provide useful guidance.  Many
people realize this, which is why we tend to get entirely too many
opinions on questions like "should we change the version number?" and
far too few on questions like "how should we refactor heap_update?".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 5:08 PM, Robert Haas  wrote:

> On Mon, Jun 20, 2016 at 4:53 PM, Joshua D. Drake 
> wrote:
> > Or we could adopt the very reasonable and practical policy of:
> >
> > The current versioning scheme isn't broke, so we aren't going to fix it.
>
> Yeah, no kidding.  We had a perfectly good consensus to keep this at
> 9.6 on pgsql-advocacy, and then later we had a revised consensus to
> retitle it to 10.0,


If -advocacy had changed their mind before beta1 was tagged this may have
played out a bit differently...maybe.  In any case 9.6 was a foregone
conclusion given -advocacy's timeline and because of its independence from
-hackers (Tom, the tagger, specifically).

but as soon as the discussion came over to
> pgsql-hackers nothing would do but that we relitigate the whole thing
> ignoring the previous discussion because it wasn't on pgsql-hackers.
> Why -hackers is the right place to decide on the marketing version
> number rather than -advocacy went unexplained, of course.


I had the same thought.  Yet no one even tried to move the discussion back
there.  And at this point PGCon was so close that I personally figured it
would be discussed.  It was, and an announcement was made that a decision
had been reached.  No one voiced an objection, so those not at PGCon
(myself included) figured that, for better or worse, we're going to
version 10 (expected) and that, to address the expressed desire to reduce
the public confusion surrounding our major-major-minor version scheme, we
would instead switch to a more traditional major-minor scheme (acceptable).

> Now we have
> a new consensus, at least the third if not the fourth or fifth, about
> what to do on this topic, and since Tom likes this outcome better he'd
> like to stop discussion right here.


This portion of the thread was mostly a technical concern - how do we
display the version in human-readable and machine-readable formats.  I'd
agree with your earlier idea that if you really want to open up the
decision between 10.n.x and 10.x, start a new thread on -advocacy.  Tally
up those at PGCon as a starting point and let others toss in their lot from
there.  At this point I'm only seeing rehashing of the same arguments.  Let
other people make their, informed or otherwise, opinions known if you feel
that the group at PGCon is not a fair sampling.


> A two-part version numbering
> scheme may or may not be for the best, but the idea that we're making
> that decision in any principled way, or that the consensus on this new
> system is any more valid than any of the previous consensus, doesn't
> ring true to me.  The idea that this discussion is not fixing any real
> problem, though -- that rings true.
>
>
You've hijacked the meaning of "this" here.  Most of "this" thread ended up
explaining the difference between version() and
current_setting('server_version_num').  That needs to be addressed unless
we revert back to the status quo.  That isn't this thread, though.  This
thread assumes we are going to 10.0 and semantically losing the middle digit.

I'd find it odd to argue against 10.0 on the basis that we'd be displaying
10.0.x in the output of version().  That seems like such an inconsequential
detail in the larger scheme of things.  Either you accept 10.0 and we pick
a human-readable output or you don't and the decision doesn't come into
play.  I'm happy with how things are (+0.5 for "we should output 10.0.x for
version()") so I'm stopping here.

David J.


Re: [HACKERS] 10.0

2016-06-20 Thread Joshua D. Drake

On 06/20/2016 02:14 PM, Merlin Moncure wrote:

On Mon, Jun 20, 2016 at 4:08 PM, Robert Haas  wrote:

On Mon, Jun 20, 2016 at 4:53 PM, Joshua D. Drake  wrote:

Or we could adopt the very reasonable and practical policy of:

The current versioning scheme isn't broke, so we aren't going to fix it.


The idea that this discussion is not fixing any real
problem, though -- that rings true.


sure -- it's my fault for starting the conversation back up.  I was
wondering about supporting older version checks, but only because I
was unaware of the 'machine' variant of the version check
(server_version_num), which properly supports numerical ordering for
historical versions.  If there's anything to do here, maybe we ought
to document that server_version_num should be used for checking
versions a little more strongly.  Judging by google searching, this is
not as widely known as it should be.


I certainly had no idea it even existed until you displayed the query. I 
have always used version() but then, I am not a -hacker.


Sincerely,

JD



merlin




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] forcing a rebuild of the visibility map

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 4:57 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> I also don't see why it's a good idea to have knowledge about how to
>> truncate the visibility map outside of visibilitymap.c. Having that in a
>> contrib module just seems like a modularity violation.
>
> That seems like a pretty good argument.

I agree that's a good argument but it passes over one or two points
which motivated my approach.  Most of the work of truncating the
visibility map is, in fact, encapsulated by visibilitymap_truncate, so
the only knowledge we're exporting to the contrib module is what WAL
record to emit.  I put that in the caller of visibilitymap_truncate
because the only existing caller of visibilitymap_truncate is
RelationTruncate, which also knows what WAL record to emit on behalf
of visibilitymap.c.  So I don't think I've exported any more knowledge
from visibilitymap.c than was the case previously.

That having been said, if somebody wants to whack this around, I am
not going to put up a big fight.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-06-20 Thread Merlin Moncure
On Mon, Jun 20, 2016 at 4:08 PM, Robert Haas  wrote:
> On Mon, Jun 20, 2016 at 4:53 PM, Joshua D. Drake  
> wrote:
>> Or we could adopt the very reasonable and practical policy of:
>>
>> The current versioning scheme isn't broke, so we aren't going to fix it.
>
> The idea that this discussion is not fixing any real
> problem, though -- that rings true.

sure -- it's my fault for starting the conversation back up.  I was
wondering about supporting older version checks, but only because I
was unaware of the 'machine' variant of the version check
(server_version_num), which properly supports numerical ordering for
historical versions.  If there's anything to do here, maybe we ought
to document that server_version_num should be used for checking
versions a little more strongly.  Judging by google searching, this is
not as widely known as it should be.
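To sketch why the numeric form orders correctly: under the current
three-part convention, server_version_num packs major.minor.patch into a
single integer (e.g. 90605 for 9.6.5), so plain integer comparison works
where comparing the version() string does not.  The decoding helper below
is illustrative only, not an existing API:

```python
def decode_version_num(num: int) -> str:
    """Decode a three-part-style server_version_num (MMmmPP)
    into its human-readable form, e.g. 90605 -> "9.6.5"."""
    major = num // 10000        # e.g. 9
    minor = (num // 100) % 100  # e.g. 6
    patch = num % 100           # e.g. 5
    return f"{major}.{minor}.{patch}"

# Integer comparison gives the correct historical ordering:
assert 90605 > 90110  # 9.6.5 is newer than 9.1.10
assert decode_version_num(90605) == "9.6.5"
assert decode_version_num(90110) == "9.1.10"
```

At the SQL level the equivalent check is simply
`SELECT current_setting('server_version_num')::int >= 90600;`.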

merlin




Re: [HACKERS] 10.0

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 4:55 PM, Tom Lane  wrote:
> No, the argument for it was that we'd no longer have to have the annual
> discussions about "is it 10.0 yet?".

WHAT annual argument?  Did anyone even argue that any 9.x release
prior to 9.6 deserved to be called 10.0?  Maybe somebody suggested
that for 9.2 and it generated, like, four emails?  I certainly don't
remember any discussion that remotely approached the amount of time
we've spent litigating both the version number and the version
numbering scheme in the last few months.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 4:53 PM, Joshua D. Drake  wrote:
> Or we could adopt the very reasonable and practical policy of:
>
> The current versioning scheme isn't broke, so we aren't going to fix it.

Yeah, no kidding.  We had a perfectly good consensus to keep this at
9.6 on pgsql-advocacy, and then later we had a revised consensus to
retitle it to 10.0, but as soon as the discussion came over to
pgsql-hackers nothing would do but that we relitigate the whole thing
ignoring the previous discussion because it wasn't on pgsql-hackers.
Why -hackers is the right place to decide on the marketing version
number rather than -advocacy went unexplained, of course.  Now we have
a new consensus, at least the third if not the fourth or fifth, about
what to do on this topic, and since Tom likes this outcome better he'd
like to stop discussion right here.  A two-part version numbering
scheme may or may not be for the best, but the idea that we're making
that decision in any principled way, or that the consensus on this new
system is any more valid than any of the previous consensus, doesn't
ring true to me.  The idea that this discussion is not fixing any real
problem, though -- that rings true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] forcing a rebuild of the visibility map

2016-06-20 Thread Tom Lane
Andres Freund  writes:
> I also don't see why it's a good idea to have knowledge about how to
> truncate the visibility map outside of visibilitymap.c. Having that in a
> contrib module just seems like a modularity violation.

That seems like a pretty good argument.

regards, tom lane




Re: [HACKERS] 10.0

2016-06-20 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> If we were going to do it like that, I would argue for "every ten years
>> like clockwork", e.g. 10.0.x is next after 9.9.x.  But in point of fact,
>> Robert, you already made your case for that approach and nobody else
>> cared for it.

> I voted for this approach initially too, and I think it has merit --
> notably, that it would stop this discussion.  It was said that moving
> to two-part numbers would stop all discussion, but it seems to have had
> exactly the opposite effect.

No, the argument for it was that we'd no longer have to have the annual
discussions about "is it 10.0 yet?".  There was no claim that getting
there would be uncontroversial, only that things would be quieter once
we did get there.

Considering that same long-term viewpoint, I'm not very happy about the
idea of "let's reserve the middle digit for compatible feature backports",
because the minute we set up a numbering system like that, everybody and
his brother will be arguing for making a branch on which to backport their
favorite more-or-less-compatible new feature.  You as well as others
pointed out that we don't have the manpower to actually support any such
thing ... so let's not open the door to it.

If we do arrive at a consensus that going to simply "10.0, 10.1, etc"
would be too much change, then I'd be in favor of adopting the
every-ten-year rule instead, in hopes that we can at least shut down
future "is it 10.0 yet" threads more quickly.  But that really does
nothing at all to fix the confusion so often shown by outsiders about
the significance of our version numbers.

regards, tom lane




Re: [HACKERS] 10.0

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 4:20 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jun 20, 2016 at 4:00 PM, David G. Johnston
>>  wrote:
>>> 10.x is the desired output.
>
>> 10.x is the output that some people desire.  A significant number of
>> people, including me, would prefer to stick with the current
>> three-part versioning scheme, possibly with some change to the
>> algorithm for bumping the first digit (e.g. every 5 years like
>> clockwork).
>
> If we were going to do it like that, I would argue for "every ten years
> like clockwork", e.g. 10.0.x is next after 9.9.x.  But in point of fact,
> Robert, you already made your case for that approach and nobody else
> cared for it.  Either there's a meaningful difference between the first
> and second parts of the number, or there is not.  If there is not, why
> have separate parts?  It can only cause confusion ... as this whole
> thread, and its many many predecessors, amply illustrate.

That's not how I remember it.  At the Ottawa developer meeting, there
were more votes for changing to a two-part versioning scheme than
there were for retaining a three-part versioning scheme, but by my
recollection not overwhelmingly so.  I'm pretty sure it was less than
a 2/3 majority in favor of changing.

Furthermore, essentially everyone in the room, including people who
wanted to stick with a three-part scheme, was in favor of the next
version's first component being 10.  I do not recall there being a
strong consensus among the people who wanted the next version to be
10.0.0 on how to decide when to go to 11.0.0, though.  Various
proposals were offered and most of them got no more than one vote.

But saying that nobody other than me thought we should stick with a
three-part scheme is revisionist history.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-06-20 Thread Joshua D. Drake

On 06/20/2016 01:41 PM, Alvaro Herrera wrote:

Tom Lane wrote:

Robert Haas  writes:

On Mon, Jun 20, 2016 at 4:00 PM, David G. Johnston
 wrote:



If we were going to do it like that, I would argue for "every ten years
like clockwork", e.g. 10.0.x is next after 9.9.x.  But in point of fact,
Robert, you already made your case for that approach and nobody else
cared for it.


I voted for this approach initially too, and I think it has merit --
notably, that it would stop this discussion.  It was said that moving
to two-part numbers would stop all discussion, but it seems to have had
exactly the opposite effect.



Or we could adopt the very reasonable and practical policy of:

The current versioning scheme isn't broke, so we aren't going to fix it.

Put that in the FAQ and wave at it like we do with hints ala Oracle.

It is obvious from this thread alone that there is really no consensus.

Sincerely,

JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 4:03 PM, Robert Haas  wrote:

> On Mon, Jun 20, 2016 at 3:29 PM, David G. Johnston
>  wrote:
> > The entire theory here looks whacked - and seems to fall into the "GUCs
> > controlling results" bucket of undesirable things.
>
> As far as I can see, this entire email is totally wrong and off-base,
> because the whole thing seems to be written on the presumption that
> single_copy is a GUC, when it's actually a structure member.  If there
> was some confusion about that, you could have spent 5 seconds running
> "git grep" before writing this email, or you could have tried "SET
> single_copy" and discovered, hey, there's no such GUC.
>
> Furthermore, I think that describing something that you obviously
> haven't taken any time to understand as "whacked" is not very nice.
> For that matter, I think that describing something you *have* taken
> time to understand as "whacked" is not very nice.
>
>
Point taken.

I don't think my entire post depends solely upon this being a GUC, though.

I've burned too many brain cells on this already to dive much deeper.

Internal or external I do think the number and type of flags described
here, for the purposes described, seems undesirable from an architectural
standpoint.  I do not and cannot offer up more than that generally due to
knowledge and resource constraints.  I tried to frame things up relative to
my understanding of existing, non-parallel, idioms, both to understand it
better myself and to throw out another POV from a fresh perspective.  I'll
admit it's one with some drawbacks, but it's offered in good faith.

Please do with it as you will and accept my apology for the poor choice
of colloquialism.

David J.


Re: [HACKERS] forcing a rebuild of the visibility map

2016-06-20 Thread Andres Freund
On 2016-06-18 11:56:51 -0400, Tom Lane wrote:
> Michael Paquier  writes:
> > On Sat, Jun 18, 2016 at 6:53 AM, Robert Haas  wrote:
> >> Andres, do you want to explain the nature of your concern further?
> 
> > I am not in his mind, but my guess is that contrib modules are
> > sometimes used as template examples by other people, and encouraging
> > users to use those routines in modules would increase the risk to
> > misuse them, aka badly-formed records that could corrupt the system.

That's not it, no.


> If Andres' concern is that XLogInsert isn't a very stable API, maybe
> we could address that by providing some wrapper function that knows
> how to emit the specific kind of record that pg_visibility needs.

That's part of the concern I have, yes. It's pretty common that during
minor version updates contrib modules are updated before the main server
is restarted. Increasing the coupling on something as critical as WAL
logging doesn't strike me as a good idea.

I also don't see why it's a good idea to have knowledge about how to
truncate the visibility map outside of visibilitymap.c. Having that in a
contrib module just seems like a modularity violation.  To me this
should be a wal_log parameter to visibilitymap_truncate().


Andres




Re: [HACKERS] 10.0

2016-06-20 Thread Alvaro Herrera
Tom Lane wrote:
> Robert Haas  writes:
> > On Mon, Jun 20, 2016 at 4:00 PM, David G. Johnston
> >  wrote:
> >> 10.x is the desired output.
> 
> > 10.x is the output that some people desire.  A significant number of
> > people, including me, would prefer to stick with the current
> > three-part versioning scheme, possibly with some change to the
> > algorithm for bumping the first digit (e.g. every 5 years like
> > clockwork).
> 
> If we were going to do it like that, I would argue for "every ten years
> like clockwork", e.g. 10.0.x is next after 9.9.x.  But in point of fact,
> Robert, you already made your case for that approach and nobody else
> cared for it.

I voted for this approach initially too, and I think it has merit --
notably, that it would stop this discussion.  It was said that moving
to two-part numbers would stop all discussion, but it seems to have had
exactly the opposite effect.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] 10.0

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 4:14 PM, Robert Haas  wrote:

> On Mon, Jun 20, 2016 at 4:00 PM, David G. Johnston
>  wrote:
> > 10.x is the desired output.
>
> 10.x is the output that some people desire.  A significant number of
> people, including me, would prefer to stick with the current
> three-part versioning scheme, possibly with some change to the
> algorithm for bumping the first digit (e.g. every 5 years like
> clockwork).
>
>
I was speaking for the project/community as a distinct entity and not
about any individual contributor.  I'm acting as if we're past the point
of individual opinions and votes on the decision to go to a two-part
versioning scheme.

We will still welcome any major revelations that may have gone unconsidered
during the decision making but I find that to be unlikely.

David J.


Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Andres Freund
On 2016-06-20 16:10:23 -0400, Robert Haas wrote:
> What exactly is the point of all of that already_marked stuff?

Preventing the old tuple from being locked/updated by another backend,
while unlocking the buffer.


> I
> mean, suppose we just don't do any of that before we go off to do
> toast_insert_or_update and RelationGetBufferForTuple.  Eventually,
> when we reacquire the page lock, we might find that somebody else has
> already updated the tuple, but couldn't that be handled by
> (approximately) looping back up to l2 just as we do in several other
> cases?

We'd potentially have to undo a fair amount more work: the toasted data
would have to be deleted and such, just to retry. Which isn't going to
be super easy, because all of it will be happening with the current cid
(we can't just call CommandCounterIncrement(), for correctness reasons).

Andres




Re: [HACKERS] 10.0

2016-06-20 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 20, 2016 at 4:00 PM, David G. Johnston
>  wrote:
>> 10.x is the desired output.

> 10.x is the output that some people desire.  A significant number of
> people, including me, would prefer to stick with the current
> three-part versioning scheme, possibly with some change to the
> algorithm for bumping the first digit (e.g. every 5 years like
> clockwork).

If we were going to do it like that, I would argue for "every ten years
like clockwork", e.g. 10.0.x is next after 9.9.x.  But in point of fact,
Robert, you already made your case for that approach and nobody else
cared for it.  Either there's a meaningful difference between the first
and second parts of the number, or there is not.  If there is not, why
have separate parts?  It can only cause confusion ... as this whole
thread, and its many many predecessors, amply illustrate.

regards, tom lane




Re: [HACKERS] 10.0

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 4:00 PM, David G. Johnston
 wrote:
> 10.x is the desired output.

10.x is the output that some people desire.  A significant number of
people, including me, would prefer to stick with the current
three-part versioning scheme, possibly with some change to the
algorithm for bumping the first digit (e.g. every 5 years like
clockwork).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 3:33 PM, Andres Freund  wrote:
>> I'm not sure what to do about this: this part of the heap_update()
>> logic has been like this forever, and I assume that if it were easy to
>> refactor this away, somebody would have done it by now.
>
> Well, I think generally nobody seriously looked at actually refactoring
> heap_update(), even though that'd be a good idea.  But in this instance,
> the problem seems relatively fundamental:
>
> We need to lock the origin page, to do visibility checks, etc. Then we
> need to figure out the target page. Even disregarding toasting - which
> we could be doing earlier with some refactoring - we're going to have to
> release the page level lock, to lock them in ascending order. Otherwise
> we'll risk kinda likely deadlocks.  We also certainly don't want to nest
> the lwlocks for the toast stuff, inside a content lock for the old
> tuple's page.
>
> So far the best idea I have - and it's really not a good one - is to
> invent a new hint-bit that tells concurrent updates to acquire a
> heavyweight tuple lock, while releasing the page-level lock. If that
> hint bit does not require any other modifications - and we don't need an
> xid in xmax for this use case - that'll avoid doing all the other
> `already_marked` stuff early, which should address the correctness
> issue.  It's kinda invasive though, and probably has performance
> implications.
>
> Does anybody have a better idea?

What exactly is the point of all of that already_marked stuff?  I
mean, suppose we just don't do any of that before we go off to do
toast_insert_or_update and RelationGetBufferForTuple.  Eventually,
when we reacquire the page lock, we might find that somebody else has
already updated the tuple, but couldn't that be handled by
(approximately) looping back up to l2 just as we do in several other
cases?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-06-20 Thread Mark Dilger

> On Jun 20, 2016, at 1:00 PM, David G. Johnston  
> wrote:
> 
> On Mon, Jun 20, 2016 at 3:08 PM, Mark Dilger  wrote:
> 
> > Do you have a problem with the human form and machine forms of the version 
> > number being different in this respect?  I don't - for me the decision of a 
> > choice for the human form is not influenced by the fact the machine form 
> > has 6 digits (with leading zeros which the human form elides...).
> 
> I don't have a problem with it if humans always use a two part number.  I 
> don't read
> the number 14 as being three parts, nor as being two parts, so it doesn't 
> matter.
> What got me to respond this morning was Josh's comment:
> 
> "Realistically, though, we're more likely to end up with 10.0.1 than 10.1."
> 
> He didn't say "11 than 10.1", he said "10.0.1 than 10.1", which showed 
> that we
> already have a confusion waiting to happen.
> 
> Now, you can try to avoid the confusion by saying that we'll always use all 
> three
> digits of the number rather than just two, or always use two digits rather 
> than three.
> But how do you enforce that? 
> 
> You do realize he was referring to machine generated output here?

No I don't, nor will anyone who finds that via a google search.  That's my 
point.
You core hackers feel perfectly comfortable with that because you understand
what you are talking about.  Hardly anybody else will.

As you suggest, that's my $0.02, and I'm moving on.

mark



Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 3:29 PM, David G. Johnston
 wrote:
> The entire theory here looks whacked - and seems to fall into the "GUCs
> controlling results" bucket of undesirable things.

As far as I can see, this entire email is totally wrong and off-base,
because the whole thing seems to be written on the presumption that
single_copy is a GUC, when it's actually a structure member.  If there
was some confusion about that, you could have spent 5 seconds running
"git grep" before writing this email, or you could have tried "SET
single_copy" and discovered, hey, there's no such GUC.

Furthermore, I think that describing something that you obviously
haven't taken any time to understand as "whacked" is not very nice.
For that matter, I think that describing something you *have* taken
time to understand as "whacked" is not very nice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 3:08 PM, Mark Dilger 
wrote:

>
> > Do you have a problem with the human form and machine forms of the
> version number being different in this respect?  I don't - for me the
> decision of a choice for the human form is not influenced by the fact the
> machine form has 6 digits (with leading zeros which the human form
> elides...).
>
> I don't have a problem with it if humans always use a two part number.  I
> don't read
> the number 14 as being three parts, nor as being two parts, so it
> doesn't matter.
> What got me to respond this morning was Josh's comment:
>
> "Realistically, though, we're more likely to end up with 10.0.1 than 10.1."
>
> He didn't say "11 than 10.1", he said "10.0.1 than 10.1", which showed
> that we
> already have a confusion waiting to happen.
>
> Now, you can try to avoid the confusion by saying that we'll always use
> all three
> digits of the number rather than just two, or always use two digits rather
> than three.
> But how do you enforce that?


You do realize he was referring to machine generated output here?  That
means, unless we get careless, we have full control over what is output.
Enforcement is easy.

> If in some sense the middle number exists, but is
> always zero, then some people will mention it and some won't, with the
> result that
> 10.0.x and 10.x will be used by different people at different times to
> refer to the same
> release.
>

Back to human generated output again...


>
> The problem that Tom and others mentioned upthread (rather a while ago) is
> that
> up until now, we've had three digits in the version numbers, but the first
> two numbers
> really just mean one thing, "major", and the last number means "minor".
> IIUC, he
> wanted to stop using two separate digits where clearly just one would
> suffice.  Hence
> the proposal to go to things like 10.0, 10.1.  But others chimed in saying
> that we need
> to keep it three parts for compatibility reasons, so how about we just
> have the middle
> number always be zero.  My response to that is what I've just said.  You
> can't do that
> without creating ambiguity in future version numbers, because of how
> people use
> language.  If you want to avoid the ambiguity and confusion going forward,
> all the
> numbers in the scheme must have meaning and not be mere placeholders.  I
> gave a
> suggestion about what the middle number *could* mean, which I don't deny
> is what
> I'd like best, but it's just a suggestion to avoid the confusion inherent
> in people eliding
> the middle number sometimes but not other times.
>
>
Given everything else you wrote I find it hard to believe you actually,
deep down, think that "definitional meaning" is sufficient here.  The
reason informed people say "9.2.x" and "9.3.x" is that those are in fact
two different things, where the middle number changes every year.  If we
go right from 10.0.7 to 11.0.0 there will be no added value in
distinguishing between 10.0.x and 10.x, and so people will not do it -
and, as you said, there is no way to force them.  That the middle number
could, in well defined circumstances, change is immaterial when it will
not do so in practice.

So, we can ensure the machine output is consistent.  That is easy.  The
only question here is whether we make the machine-generated
human-readable output better conform to expected human usage (i.e.,
10.x), or keep it compatible with the existing format so that machine
processing of the string is unaffected.

The only way to address your concern is to continue with the status quo:
make it 10.1.x and 10.2.x, so people cannot meaningfully drop the second
digit.  Since it has been accepted that the status quo has its own
problems - namely, that the less informed population is already treating
our version number as if it were 9.x (instead of 9.5.x) - a change to
better conform to societal expectations is desirable.

If you're going to vote for the status quo I'd say add your $0.02 and
move on; I don't think that is still on the table (unless some real bad
reality smacks us in the face).  The human dynamic is thus a foregone
conclusion - the majority are going to call and write our next release
as 10.x.  The machine readable output will be 10.  The human readable
output seems to be up for a vote: 10.0.x or 10.x.
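A quick sketch of the choice being voted on, assuming (purely for
illustration - the encoding was not settled in this thread) that the
six-digit machine form under a two-part scheme would be major * 10000 +
patch: the same integer can be rendered either way, so the vote really
only concerns the display string.

```python
def human_forms(num: int) -> tuple[str, str]:
    """Render an assumed two-part-scheme machine version number
    (encoding assumption: major * 10000 + patch) in both
    candidate human-readable styles."""
    major = num // 10000
    patch = num % 10000
    two_part = f"{major}.{patch}"      # e.g. "10.4"
    three_part = f"{major}.0.{patch}"  # e.g. "10.0.4", middle digit fixed at 0
    return two_part, three_part

assert human_forms(100004) == ("10.4", "10.0.4")
```

Either rendering is trivially machine-derivable; the compatibility
question is only about consumers that parse the string itself.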

10.x is the desired output.

Having a list of actual problems likely to face our users if we do this
would be helpful.

Identifying exactly where this human-readable output string appears would
be helpful.  Obviously the "version()" command but where else?

I do think Josh has the right of it, though.  Humans seeing 10.0.4 in the
version() output will easily adapt once they understand why we left it that
way.  Machines cannot adapt as readily.  The fact that we advertise 10.4
while our version number is 10.0.4 in one small corner of our system is a
minor irritant compared to the potential for

Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 1:13 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sun, Jun 19, 2016 at 10:23 PM, Tom Lane  wrote:
>>> Personally, I'm +1 for such tinkering if it makes the feature either more
>>> controllable or more understandable.  After reading the comments at the
>>> head of nodeGather.c, though, I don't think that single_copy is either
>>> understandable or useful, and merely renaming it won't help.  Apparently,
>>> it runs code in the worker, except when it doesn't, and even when it does,
>>> it's absolutely guaranteed to be a performance loss because the leader is
>>> doing nothing.  What in the world is the point?
>
>> The single_copy flag allows a Gather node to have a child plan which
>> is not intrinsically parallel.
>
> OK, but I'm very dubious of your claim that this has any use except for
> testing purposes.  It certainly has no such use today.  Therefore, the
> behavior of falling back to running in the leader seems like an
> anti-feature to me.  If we want to test running in a worker, then we
> want to test that, not maybe test it.
>
> I propose that the behavior we actually want here is to execute in
> a worker, full stop.  If we can't get one, wait till we can.  If we
> can't get one because no slots have been allocated at all, fail.
> That would make the behavior deterministic enough for the regression
> tests to rely on it.

I agree that for force_parallel_mode testing, that behavior would be useful.

I am also pretty confident we're going to want the behavior where the
leader runs the plan if, and only if, no workers can be obtained for
other purposes.  However, I agree that's not essential today.

> And while I don't know what this mode should be called, I'm pretty sure
> that neither "single_copy" nor "pipeline" are useful descriptions.

Maybe we should make this an enum rather than a boolean, since there
seem to be more than two useful behaviors.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 10.0

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 3:07 PM, Gražvydas Valeika 
wrote:

> Hi,
>
> I recently bumped into http://semver.org/
>
> It can be interesting to participants of this discussion.
>
> Especially motivation for minor version (middle number).
>
>
While we appreciate the comment, this is the third (maybe fourth) time this
has come up during this discussion - the last being 20 emails and 3 days ago.
In short, we aren't and never will be semver compliant, so if anything its
numbering protocol is something to avoid, not embrace.

David J.


Re: [HACKERS] Reviewing freeze map code

2016-06-20 Thread Andres Freund
On 2016-06-15 08:56:52 -0400, Robert Haas wrote:
> Yikes: heap_update() sets the tuple's XMAX, CMAX, infomask, infomask2,
> and CTID without logging anything or clearing the all-visible flag and
> then releases the lock on the heap page to go do some more work that
> might even ERROR out.  Only if that other work all goes OK do we
> relock the page and perform the WAL-logged actions.
> 
> That doesn't seem like a good idea even in existing releases, because
> you've taken a tuple on an all-visible page and made it not
> all-visible, and you've made a page modification that is not
> necessarily atomic without logging it.

Right, that's broken.


> I'm not sure what to do about this: this part of the heap_update()
> logic has been like this forever, and I assume that if it were easy to
> refactor this away, somebody would have done it by now.

Well, I think generally nobody seriously looked at actually refactoring
heap_update(), even though that'd be a good idea.  But in this instance,
the problem seems relatively fundamental:

We need to lock the origin page, to do visibility checks, etc. Then we
need to figure out the target page. Even disregarding toasting - which
we could be doing earlier with some refactoring - we're going to have to
release the page-level lock, to lock the pages in ascending order. Otherwise
we'll risk fairly likely deadlocks.  We also certainly don't want to nest
the lwlocks for the toast stuff inside a content lock for the old
tuple's page.
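The ascending-order constraint is the usual total-ordering trick for deadlock avoidance; as a minimal sketch of what "lock them in ascending order" buys (plain C with hypothetical names, not the server's actual LockBuffer() machinery - the point is only that every backend agrees on acquisition order, so no wait cycle can form):

```c
#include <assert.h>

/*
 * Illustration only: if every backend that needs both the old and the
 * new heap page always takes the lower-numbered block's lock first,
 * two backends can never each hold one page while waiting for the
 * other's, so no deadlock cycle over page locks is possible.
 */
void
order_blocks(unsigned oldblk, unsigned newblk,
             unsigned *lock_first, unsigned *lock_second)
{
    if (oldblk <= newblk)
    {
        *lock_first = oldblk;
        *lock_second = newblk;
    }
    else
    {
        *lock_first = newblk;
        *lock_second = oldblk;
    }
}
```

The cost, as noted above, is that the origin page's lock must be released and retaken, which opens the window the rest of this thread is about.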

So far the best idea I have - and it's really not a good one - is to
invent a new hint-bit that tells concurrent updates to acquire a
heavyweight tuple lock, while releasing the page-level lock. If that
hint bit does not require any other modifications - and we don't need an
xid in xmax for this use case - that'll avoid doing all the other
`already_marked` stuff early, which should address the correctness
issue.  It's kinda invasive though, and probably has performance
implications.

Does anybody have a better idea?

Regards,

Andres




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 12:06 PM, Robert Haas  wrote:

> On Sun, Jun 19, 2016 at 10:23 PM, Tom Lane  wrote:
> >> although I fear we
> >> might be getting to a level of tinkering with parallel query that
> >> starts to look more like feature development.
> >
> > Personally, I'm +1 for such tinkering if it makes the feature either more
> > controllable or more understandable.  After reading the comments at the
> > head of nodeGather.c, though, I don't think that single_copy is either
> > understandable or useful, and merely renaming it won't help.  Apparently,
> > it runs code in the worker, except when it doesn't, and even when it
> does,
> > it's absolutely guaranteed to be a performance loss because the leader is
> > doing nothing.  What in the world is the point?
>
> The single_copy flag allows a Gather node to have a child plan which
> is not intrinsically parallel.  For example, consider these two plans:
>
> Gather
> -> Parallel Seq Scan
>
> Gather
> -> Seq Scan
>
> The first plan is safe regardless of the setting of the single-copy
> flag.  If the plan is executed in every worker, the results in
> aggregate across all workers will add up to the results of a
> non-parallel sequential scan of the table.  The second plan is safe
> only if the # of workers is 1 and the single-copy flag is set.  If
> either of those things is not true, then more than one process might
> try to execute the sequential scan, and the result will be that you'll
> get N copies of the output, where N = (# of parallel workers) +
> (leader also participates ? 1 : 0).
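In arithmetic terms, the duplication described above can be modeled with a tiny helper (hypothetical, not executor code); it also shows why single_copy is safe exactly when it forces the count to 1:

```c
#include <assert.h>

/*
 * Model of the rule quoted above: a non-parallel-aware plan run under
 * Gather is executed in full by every participating process, so the
 * combined output is N copies of the right answer.  single_copy with
 * exactly one worker pins N to 1, which is the only correct case.
 */
int
output_copies(int nworkers, int leader_participates, int single_copy)
{
    if (single_copy && nworkers == 1)
        return 1;               /* one process runs the plan: correct */
    return nworkers + (leader_participates ? 1 : 0);
}
```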
>
> For force_parallel_mode = {on, regress}, the single-copy behavior is
> essential.  We can run all of those plans inside a worker, but only
> because we know that the leader won't also try to run those same
> plans.
>
>
The entire theory here looks whacked - and seems to fall into the "GUCs
controlling results" bucket of undesirable things.

Is this GUC enabled by a compile-time directive, or otherwise protected
from misuse in production?

I'm having trouble sounding smart here about what is bothering me, but
basically the parallel infrastructure (i.e., additional workers) shouldn't
even be used for "Seq Scan", and a "Seq Scan" under a Gather should behave
no differently than a "Parallel Seq Scan" under a Gather where all work is
done by the leader because no workers were available to help.

At worst this behavior should be an implementation artifact of
force_parallel_mode={on,regress}; at best the Gather node would simply have
this intelligence on, always, so as not to silently generate bogus results
in a misconfigured or buggy setup.



> ​[...]
>
>
> Actually, though, the behavior I really want the single_copy flag to
> embody is not so much "only one process runs this" but "leader does
> not participate unless there are no workers", which is the same thing
> only when the budgeted number of workers is one.


This sounds an awful lot like a planner hint, especially since it defaults
to off.

  This is useful
> because of plans like this:
>
> Finalize HashAggregate
> -> Gather
>   -> Partial HashAggregate
> -> Hash Join
>-> Parallel Seq Scan on large_table
>-> Hash
>  -> Seq Scan on another_large_table
>
> Unless the # of groups is very small, the leader actually won't
> perform very much of the parallel-seq-scan on large_table, because
> it'll be too busy aggregating the results from the other workers.
> However, if it ever reaches a point where the Gather can't read a
> tuple from one of the workers immediately, which is almost certain to
> occur right at the beginning of execution, it's going to go build a
> copy of the hash table so that it can "help" with the hash join.  By
> the time it finishes, the workers will have done the same and be
> feeding it results, and it will likely get little use out of the copy
> that it built itself.  But it will still have gone to the effort of
> building it.
>
> For 10.0, Thomas Munro has already done a bunch of work, and will be
> doing more work, so that we can build a shared hash table, rather than
> one copy per worker.  That's going to be better when the table is
> large anyway, so maybe this particular case won't matter so much.  But
> in general when a partial path has a substantial startup cost, it may
> be better for the leader not to get involved.


​So have the Gather node understand this and act accordingly.​

This is also quite different from the "we'll get wrong results" problem
described above, which this GUC also attempts to solve.

​I'm inclined to believe three things:

1) We need a test mode whereby we guarantee at least one worker is used for
processing.
2) Gather needs to be inherently smart enough to accept data from a
non-parallel source.
3) Gather needs to use its knowledge (hopefully it has some) of partial
plan startup costs and worker availability to decide whether it wants to
participate in both hunting and gathering.  It should make this decision
once at startup a

Re: [HACKERS] [sqlsmith] OOM crash in plpgsql_extra_checks_check_hook

2016-06-20 Thread Tom Lane
Andreas Seltenreich  writes:
> Just had a parallel worker of a memory-starved instance of sqlsmith
> crash.  plpgsql_extra_checks_check_hook forgot to check the result of
> its malloc call here:

Good catch, will fix.  Thanks!

regards, tom lane




Re: [HACKERS] 10.0

2016-06-20 Thread Mark Dilger

> On Jun 20, 2016, at 11:30 AM, David G. Johnston  
> wrote:
> 
> On Mon, Jun 20, 2016 at 2:14 PM, Mark Dilger  wrote:
> 
> > On Jun 20, 2016, at 11:06 AM, Joshua D. Drake  
> > wrote:
> >
> > On 06/20/2016 10:45 AM, Mark Dilger wrote:
> >
> >>> Now maybe you have some other idea in mind, but I don't quite
> >>> understand what it is.
> >>
> >> I do, indeed, and here it is:
> >>
> >> When considering whether to *back port* a change, we typically do so
> >> on the basis that bug fixes are back ported, features are not.  In my
> >> proposed versioning scheme, you could back port any feature that is
> >> compatible with the old version, and bump the middle number to alert
> >> users that you've not just back ported a bug fix, but a feature.  Any new
> >> features that are not compatible don't get back ported.
> >>
> >> If you fix a bug on master during development, you can back port that
> >> as well.  But instead of bumping the middle number, you bump the last
> >> number.
> >>
> >> Somebody running a version that is three major versions back could
> >> still get the advantage of new features, so long as those new features
> >> are not incompatible.  It's frequently not nearly so easy to run pg_upgrade
> >> as it is to run rpm -U.  You get downtime either way, but the elapsed
> >> time of that downtime is orders of magnitude different.
> >>
> >> Someone else running that same version from three major versions ago
> >> can take a more conservative policy, and only upgrade bug fixes, and
> >> forego the back ported features.
> >>
> >> You still have one major version release per year.  At that time, you can
> >> also release back-port versions of prior major versions.
> >
> > OFFLIST:
> >
> > You are fighting a losing if noble battle.
> 
> I think all my emails on this subject have been seriously misunderstood.
> Perhaps the problem is that I don't understand some critical issue.  Can
> you please help me understand this part:
> 
> It seems like people want releases, starting with 10.0, to be named things
> like 10.0, 10.1, 10.2,..., but for the purposes of communicating with programs
> like nagios refer to them as 10.0.0, 10.0.1, 10.0.2
> 
> Is that right?
> 
> That's the part that really annoys me, and I keep trying to argue for not 
> doing
> that, and people keep replying to other parts of my messages rather than
> replying to the core part of what I am saying.
> 
> Why would any sensible person want a release to sometimes be called
> "10.4", but the exact same release sometimes referred to as "10.0.4"?
> This is just going to confuse average software users, as they would never
> expect that 10.4 and 10.0.4 are the same thing.
> 
> Have I misunderstood what is being proposed?
> 
> ​The software is only ever going to report 10.0.4 OR 10.4.  The area of 
> concern is that people are going to get annoyed at saying: "that was fixed in 
> 10.0.4​" and so mailing lists and other forums where people write the numbers 
> instead of a computer are going to be littered with: "that was fixed in 10.4".
> 
> So now human speak and machine speak are disjointed.
>  
> The machine form output for that release is going to be 100004 regardless of 
> the decision to make the human form 10.4 or 10.0.4
> 
> Do you have a problem with the human form and machine forms of the version 
> number being different in this respect?  I don't - for me the decision of a 
> choice for the human form is not influenced by the fact the machine form has 
> 6 digits (with leading zeros which the human form elides...).

I don't have a problem with it if humans always use a two part number.  I
don't read the number 100004 as being three parts, nor as being two parts,
so it doesn't matter.  What got me to respond this morning was Josh's
comment:

"Realistically, though, we're more likely to end up with 10.0.1 than 10.1."

He didn't say "11 than 10.1", he said "10.0.1 than 10.1", which showed that
we already have a confusion waiting to happen.

Now, you can try to avoid the confusion by saying that we'll always use all
three digits of the number rather than just two, or always use two digits
rather than three.  But how do you enforce that?  If in some sense the
middle number exists, but is always zero, then some people will mention it
and some won't, with the result that 10.0.x and 10.x will be used by
different people at different times to refer to the same release.

The problem that Tom and others mentioned upthread (rather a while ago) is
that up until now, we've had three digits in the version numbers, but the
first two numbers really just mean one thing, "major", and the last number
means "minor".  IIUC, he wanted to stop using two separate digits where
clearly just one would suffice.  Hence the proposal to go to things like
10.0, 10.1.  But others chimed in saying that we need to keep it three
parts for compatibility reasons, so how about we just have the middle
number always be zero.  My response to that is what I've just said.

[HACKERS] [sqlsmith] OOM crash in plpgsql_extra_checks_check_hook

2016-06-20 Thread Andreas Seltenreich
Just had a parallel worker of a memory-starved instance of sqlsmith
crash.  plpgsql_extra_checks_check_hook forgot to check the result of
its malloc call here:

Core was generated by `postgres: bgworker: parallel worker for PID 5905 
   '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  plpgsql_extra_checks_check_hook (newvalue=, 
extra=0x7fff7fe31a58, source=) at pl_handler.c:113
113 *myextra = extrachecks;
(gdb) bt
#0  plpgsql_extra_checks_check_hook (newvalue=, 
extra=0x7fff7fe31a58, source=) at pl_handler.c:113
#1  0x0080173f in call_string_check_hook (newval=0x7fff7fe31a50, 
extra=, source=, elevel=15, conf=, 
conf=) at guc.c:9779
#2  0x008029b8 in InitializeOneGUCOption (gconf=0x4) at guc.c:4546
#3  0x00804dbc in define_custom_variable (variable=0x2cb6ef0) at 
guc.c:7466
#4  0x00805862 in DefineCustomStringVariable 
(name=name@entry=0x7f803cbfe011 "plpgsql.extra_warnings", 
short_desc=short_desc@entry=0x7f803cbfe1f8 "List of programming constructs that 
should produce a warning.", long_desc=long_desc@entry=0x0, 
valueAddr=valueAddr@entry=0x7f803ce070d8 , 
bootValue=bootValue@entry=0x7f803cbfdf78 "none", 
context=context@entry=PGC_USERSET, flags=1, check_hook=0x7f803cbe9700 
, assign_hook=0x7f803cbe96e0 
, show_hook=0x0) at guc.c:7733
#5  0x7f803cbe99ea in _PG_init () at pl_handler.c:173
#6  0x007f1bcb in internal_load_library 
(libname=libname@entry=0x7f8040cee14d ) at dfmgr.c:276
#7  0x007f2738 in RestoreLibraryState (start_address=0x7f8040cee14d 
) at dfmgr.c:741
#8  0x004e61c0 in ParallelWorkerMain (main_arg=) at 
parallel.c:985
#9  0x00684072 in StartBackgroundWorker () at bgworker.c:726
#10 0x0068f142 in do_start_bgworker (rw=0x2cb5230) at postmaster.c:5535
#11 maybe_start_bgworker () at postmaster.c:5709
#12 0x0068fb96 in sigusr1_handler (postgres_signal_arg=) 
at postmaster.c:4971
#13 
#14 0x7f8040091ac3 in __select_nocancel () at 
../sysdeps/unix/syscall-template.S:81
#15 0x0046c31f in ServerLoop () at postmaster.c:1657
#16 0x00690fc7 in PostmasterMain (argc=argc@entry=4, 
argv=argv@entry=0x2c8c620) at postmaster.c:1301
#17 0x0046d96d in main (argc=4, argv=0x2c8c620) at main.c:228
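The crash site is the unchecked store into the pointer returned by malloc. As a sketch of the fix pattern only (hypothetical names, not the actual pl_handler.c code, which stores parsed flag bits in its GUC "extra" struct): a check hook that allocates must treat a NULL return as check failure rather than dereference it, since under memory pressure malloc can and does fail.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sketch of the defect class: a GUC-style check hook that allocates
 * its "extra" state with malloc() must fail the check on a NULL
 * return instead of dereferencing it.
 */
int
example_check_hook(void **extra)
{
    int *myextra = malloc(sizeof(int));

    if (myextra == NULL)
        return 0;               /* report failure; do not crash */
    *myextra = 42;              /* stand-in for the parsed flags */
    *extra = myextra;
    return 1;
}
```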




Re: [HACKERS] 10.0

2016-06-20 Thread Gražvydas Valeika
Hi,

I recently bumped into http://semver.org/

It can be interesting to participants of this discussion.

Especially motivation for minor version (middle number).


Best,

Grazvydas


On Mon, Jun 20, 2016 at 9:30 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Mon, Jun 20, 2016 at 2:14 PM, Mark Dilger 
> wrote:
>
>>
>> > On Jun 20, 2016, at 11:06 AM, Joshua D. Drake 
>> wrote:
>> >
>> > On 06/20/2016 10:45 AM, Mark Dilger wrote:
>> >
>> >>> Now maybe you have some other idea in mind, but I don't quite
>> >>> understand what it is.
>> >>
>> >> I do, indeed, and here it is:
>> >>
>> >> When considering whether to *back port* a change, we typically do so
>> >> on the basis that bug fixes are back ported, features are not.  In my
>> >> proposed versioning scheme, you could back port any feature that is
>> >> compatible with the old version, and bump the middle number to alert
>> >> users that you've not just back ported a bug fix, but a feature.  Any
>> new
>> >> features that are not compatible don't get back ported.
>> >>
>> >> If you fix a bug on master during development, you can back port that
>> >> as well.  But instead of bumping the middle number, you bump the last
>> >> number.
>> >>
>> >> Somebody running a version that is three major versions back could
>> >> still get the advantage of new features, so long as those new features
>> >> are not incompatible.  It's frequently not nearly so easy to run
>> pg_upgrade
>> >> as it is to run rpm -U.  You get downtime either way, but the elapsed
>> >> time of that downtime is orders of magnitude different.
>> >>
>> >> Someone else running that same version from three major versions ago
>> >> can take a more conservative policy, and only upgrade bug fixes, and
>> >> forego the back ported features.
>> >>
>> >> You still have one major version release per year.  At that time, you
>> can
>> >> also release back-port versions of prior major versions.
>> >
>> > OFFLIST:
>> >
>> > You are fighting a losing if noble battle.
>>
>> I think all my emails on this subject have been seriously misunderstood.
>> Perhaps the problem is that I don't understand some critical issue.  Can
>> you please help me understand this part:
>>
>> It seems like people want releases, starting with 10.0, to be named things
>> like 10.0, 10.1, 10.2,..., but for the purposes of communicating with
>> programs
>> like nagios refer to them as 10.0.0, 10.0.1, 10.0.2
>>
>> Is that right?
>>
>> That's the part that really annoys me, and I keep trying to argue for not
>> doing
>> that, and people keep replying to other parts of my messages rather than
>> replying to the core part of what I am saying.
>>
>> Why would any sensible person want a release to sometimes be called
>> "10.4", but the exact same release sometimes referred to as "10.0.4"?
>> This is just going to confuse average software users, as they would never
>> expect that 10.4 and 10.0.4 are the same thing.
>>
>> Have I misunderstood what is being proposed?
>>
>
> ​The software is only ever going to report 10.0.4 OR 10.4.  The area of
> concern is that people are going to get annoyed at saying: "that was fixed
> in 10.0.4​" and so mailing lists and other forums where people write the
> numbers instead of a computer are going to be littered with: "that was
> fixed in 10.4".
>
> So now human speak and machine speak are disjointed.
>
> The machine form output for that release is going to be 100004 regardless
> of the decision to make the human form 10.4 or 10.0.4
>
> Do you have a problem with the human form and machine forms of the version
> number being different in this respect?  I don't - for me the decision of a
> choice for the human form is not influenced by the fact the machine form
> has 6 digits (with leading zeros which the human form elides...).
>
> This thread started with a complaint that people are parsing our human form
> output instead of the machine form.  The OP later admitted that he didn't
> realize that a machine form was so readily available.  That is worrisome,
> since I doubt that is an isolated incident, and leads to the discussion of
> what form the human intended version should take.
>
> David J.
>
>


Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Vik Fearing
On 19/06/16 13:02, Michael Paquier wrote:
> On Sun, Jun 19, 2016 at 7:38 PM, Vik Fearing  wrote:
>> On 19/06/16 12:28, Michael Paquier wrote:
>>> On Sun, Jun 19, 2016 at 5:56 PM, Michael Paquier
>>> Or in short the attached.
>>
>> The code looks good to me but why no documentation?
> 
> Because that's a long day... Thanks! Now you have it.

Quick bikeshed: I think the column should be called conninfo and not
conn_info to match other places it's used.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] 10.0

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 2:14 PM, Mark Dilger 
wrote:

>
> > On Jun 20, 2016, at 11:06 AM, Joshua D. Drake 
> wrote:
> >
> > On 06/20/2016 10:45 AM, Mark Dilger wrote:
> >
> >>> Now maybe you have some other idea in mind, but I don't quite
> >>> understand what it is.
> >>
> >> I do, indeed, and here it is:
> >>
> >> When considering whether to *back port* a change, we typically do so
> >> on the basis that bug fixes are back ported, features are not.  In my
> >> proposed versioning scheme, you could back port any feature that is
> >> compatible with the old version, and bump the middle number to alert
> >> users that you've not just back ported a bug fix, but a feature.  Any
> new
> >> features that are not compatible don't get back ported.
> >>
> >> If you fix a bug on master during development, you can back port that
> >> as well.  But instead of bumping the middle number, you bump the last
> >> number.
> >>
> >> Somebody running a version that is three major versions back could
> >> still get the advantage of new features, so long as those new features
> >> are not incompatible.  It's frequently not nearly so easy to run
> pg_upgrade
> >> as it is to run rpm -U.  You get downtime either way, but the elapsed
> >> time of that downtime is orders of magnitude different.
> >>
> >> Someone else running that same version from three major versions ago
> >> can take a more conservative policy, and only upgrade bug fixes, and
> >> forego the back ported features.
> >>
> >> You still have one major version release per year.  At that time, you
> can
> >> also release back-port versions of prior major versions.
> >
> > OFFLIST:
> >
> > You are fighting a losing if noble battle.
>
> I think all my emails on this subject have been seriously misunderstood.
> Perhaps the problem is that I don't understand some critical issue.  Can
> you please help me understand this part:
>
> It seems like people want releases, starting with 10.0, to be named things
> like 10.0, 10.1, 10.2,..., but for the purposes of communicating with
> programs
> like nagios refer to them as 10.0.0, 10.0.1, 10.0.2
>
> Is that right?
>
> That's the part that really annoys me, and I keep trying to argue for not
> doing
> that, and people keep replying to other parts of my messages rather than
> replying to the core part of what I am saying.
>
> Why would any sensible person want a release to sometimes be called
> "10.4", but the exact same release sometimes referred to as "10.0.4"?
> This is just going to confuse average software users, as they would never
> expect that 10.4 and 10.0.4 are the same thing.
>
> Have I misunderstood what is being proposed?
>

The software is only ever going to report 10.0.4 OR 10.4.  The area of
concern is that people are going to get annoyed at saying: "that was fixed
in 10.0.4" and so mailing lists and other forums where people write the
numbers instead of a computer are going to be littered with: "that was
fixed in 10.4".

So now human speak and machine speak are disjointed.

The machine form output for that release is going to be 100004 regardless
of the decision to make the human form 10.4 or 10.0.4.

Do you have a problem with the human form and machine forms of the version
number being different in this respect?  I don't - for me the decision of a
choice for the human form is not influenced by the fact the machine form
has 6 digits (with leading zeros which the human form elides...).
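Under that convention the two forms are mechanically interchangeable; a sketch with hypothetical helper names, assuming the six-digit server_version_num layout (major * 10000 + minor) carries over to v10, so human 10.4 / 10.0.4 both correspond to machine 100004:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Machine form: six digits, no punctuation, e.g. 10.4 -> 100004. */
int
machine_form(int major, int minor)
{
    return major * 10000 + minor;
}

/* Human form: two integers separated by a period, e.g. 100004 -> "10.4". */
void
human_form(int version_num, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%d.%d", version_num / 10000, version_num % 10000);
}
```

Code that compares versions numerically against the machine form never has to care whether a human wrote 10.4 or 10.0.4.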

This thread started with a complaint that people are parsing our human form
output instead of the machine form.  The OP later admitted that he didn't
realize that a machine form was so readily available.  That is worrisome,
since I doubt that is an isolated incident, and leads to the discussion of
what form the human intended version should take.

David J.


Re: [HACKERS] 10.0

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 1:48 PM, Mark Dilger 
wrote:

>
> > On Jun 20, 2016, at 9:38 AM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
> >
> > On Mon, Jun 20, 2016 at 12:28 PM, Mark Dilger 
> wrote:
> >
> > > On Jun 20, 2016, at 8:53 AM, Mark Dilger 
> wrote:
> > >
> > >
> > > This is not a plea for keeping the three part versioning system.  It's
> just
> > > a plea not to have a 2 part versioning system masquerading as a three
> > > part versioning system, or vice versa.
> >
> > To clarify my concern, I never want to have to write code like this:
> >
> > CASE WHEN pg_version eq '11.1' OR pg_version eq '11.0.1' THEN
> foo()
> >WHEN pg_version eq '11.2' OR pg_version eq '11.0.2'
> THEN bar()
> > 
> > or
> >
> > if (0 == strcmp(pg_version_string, "11.1") || 0 ==
> strcmp(pg_version_string, "11.0.1"))
> > foo();
> > else if (0 == strcmp(pg_version_string, "11.2") || 0 ==
> strcmp(pg_version_string, "11.0.2"))
> > bar();
> >
> > either in sql, perl, c, java, or anywhere else.  As soon as you have two
> different
> > formats for the version string, you get into this hell.  Yeah, ok, you
> may have
> > a sql level function for this, but I'm thinking about applications
> somewhat removed
> > from a direct connection to the database, where you can't be sure which
> format
> > you'll be handed.
> >
> > Now you argue for keeping the middle number on pure compatibility
> grounds.​..
>
> I am not arguing for that.  I'm arguing against having two different
> versions of the
> same thing.  If you go with a two part versioning scheme that is sometimes
> written as a
> three part versioning scheme, you make every bit of code that deals with
> it from now on
> have to deal with both possible versions in order to be robust.
>
> My preference is to have a three part versioning scheme where all three
> parts have
> different purposes.  But since the community seems to have no interest in
> that, and
> have largely (if not universally) rejected that idea, I'm falling back to
> merely arguing
> that if we're going to have a two part versioning system, we should do
> that everywhere,
> and if we can't do that everywhere, then we should do that nowhere.
>
> You appear to be arguing that we should have a versioning scheme that is
> sometimes
> two parts, and sometimes three parts, but when three parts, always make
> the middle
> number zero.  The part of that which bothers me most is not the "always
> zero" part.   It's
> the "sometimes two parts, sometimes three parts" part
>

​The machine representation of the version number is 6 digits without any
punctuation.

The human representation for version numbers >= 10 is two integers
separated using a full-stop/period/decimal "."

You would never write 10.0.2 instead of 10.2 just like you'd never write
1002 instead of 12

You only have to do more than one thing if you are using the wrong
representation and are trying to be robust.  While I understand that some
people may indeed be doing that, or are otherwise stuck in that unfortunate
circumstance, accommodating their needs amounts to nothing other than
maintaining compatibility for unsupported usage.  I'll admit we probably
could have been more explicit about what we consider proper usage - which
is why I have at least some sympathy for the position.

David J.


Re: [HACKERS] 10.0

2016-06-20 Thread Mark Dilger

> On Jun 20, 2016, at 11:06 AM, Joshua D. Drake  wrote:
> 
> On 06/20/2016 10:45 AM, Mark Dilger wrote:
> 
>>> Now maybe you have some other idea in mind, but I don't quite
>>> understand what it is.
>> 
>> I do, indeed, and here it is:
>> 
>> When considering whether to *back port* a change, we typically do so
>> on the basis that bug fixes are back ported, features are not.  In my
>> proposed versioning scheme, you could back port any feature that is
>> compatible with the old version, and bump the middle number to alert
>> users that you've not just back ported a bug fix, but a feature.  Any new
>> features that are not compatible don't get back ported.
>> 
>> If you fix a bug on master during development, you can back port that
>> as well.  But instead of bumping the middle number, you bump the last
>> number.
>> 
>> Somebody running a version that is three major versions back could
>> still get the advantage of new features, so long as those new features
>> are not incompatible.  It's frequently not nearly so easy to run pg_upgrade
>> as it is to run rpm -U.  You get downtime either way, but the elapsed
>> time of that downtime is orders of magnitude different.
>> 
>> Someone else running that same version from three major versions ago
>> can take a more conservative policy, and only upgrade bug fixes, and
>> forego the back ported features.
>> 
>> You still have one major version release per year.  At that time, you can
>> also release back-port versions of prior major versions.
> 
> OFFLIST:
> 
> You are fighting a losing if noble battle.

I think all my emails on this subject have been seriously misunderstood.
Perhaps the problem is that I don't understand some critical issue.  Can 
you please help me understand this part:

It seems like people want releases, starting with 10.0, to be named things
like 10.0, 10.1, 10.2,..., but for the purposes of communicating with programs
like nagios refer to them as 10.0.0, 10.0.1, 10.0.2

Is that right?

That's the part that really annoys me, and I keep trying to argue for not doing
that, and people keep replying to other parts of my messages rather than
replying to the core part of what I am saying.

Why would any sensible person want a release to sometimes be called
"10.4", but the exact same release sometimes referred to as "10.0.4"?
This is just going to confuse average software users, as they would never
expect that 10.4 and 10.0.4 are the same thing.

Have I misunderstood what is being proposed?

The only reason I talk about using the middle number for this other purpose
is to forestall people assuming that they can elide the middle number of a
three number system, such that it gets elided sometimes but not other times.
I was quite clear in my email this morning that I'm not arguing for a three
number system.  I'm perfectly ok with a two number system.  I just don't want
a "let's confuse everybody by making each and every release have two
different names" system.

mark



Re: [HACKERS] 10.0

2016-06-20 Thread Alvaro Herrera
Mark Dilger wrote:

> When considering whether to *back port* a change, we typically do so
> on the basis that bug fixes are back ported, features are not.  In my
> proposed versioning scheme, you could back port any feature that is
> compatible with the old version, and bump the middle number to alert
> users that you've not just back ported a bug fix, but a feature.  Any new
> features that are not compatible don't get back ported.

So instead of having about 5 branches to maintain, we will end up with
5N branches, because we will still have to support the initial .0
branches of each major, plus .1, .2 ... of each major branch?

Granted, we could have major releases not every year but perhaps every 2
or 3 years; non-catversion-bumping patches would still be released
quicker than that, in middle-number-increasing releases.  But like
Robert, I don't think that the number of features we could add this way
would be numerous enough to support this idea in the long run.

I think this increases the load on committers many times.  Count me out.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] 10.0

2016-06-20 Thread Mark Dilger

> On Jun 20, 2016, at 9:38 AM, David G. Johnston  
> wrote:
> 
> On Mon, Jun 20, 2016 at 12:28 PM, Mark Dilger  wrote:
> 
> > On Jun 20, 2016, at 8:53 AM, Mark Dilger  wrote:
> >
> >
> > This is not a plea for keeping the three part versioning system.  It's just
> > a plea not to have a 2 part versioning system masquerading as a three
> > part versioning system, or vice versa.
> 
> To clarify my concern, I never want to have to write code like this:
> 
> CASE WHEN pg_version eq '11.1' OR pg_version eq '11.0.1' THEN foo()
>WHEN pg_version eq '11.2' OR pg_version eq '11.0.2' THEN 
> bar()
> 
> or
> 
> if (0 == strcmp(pg_version_string, "11.1") || 0 == 
> strcmp(pg_version_string, "11.0.1"))
> foo();
> else if (0 == strcmp(pg_version_string, "11.2") || 0 == 
> strcmp(pg_version_string, "11.0.2"))
> bar();
> 
> either in sql, perl, c, java, or anywhere else.  As soon as you have two
> different formats for the version string, you get into this hell.  Yeah,
> ok, you may have a sql level function for this, but I'm thinking about
> applications somewhat removed from a direct connection to the database,
> where you can't be sure which format you'll be handed.
> 
> Now you argue for keeping the middle number on pure compatibility grounds...

I am not arguing for that.  I'm arguing against having two different
versions of the same thing.  If you go with a two part versioning scheme
that is sometimes written as a three part versioning scheme, you make every
bit of code that deals with it from now on have to deal with both possible
versions in order to be robust.

My preference is to have a three part versioning scheme where all three
parts have different purposes.  But since the community seems to have no
interest in that, and has largely (if not universally) rejected that idea,
I'm falling back to merely arguing that if we're going to have a two part
versioning system, we should do that everywhere, and if we can't do that
everywhere, then we should do that nowhere.

You appear to be arguing that we should have a versioning scheme that is
sometimes two parts, and sometimes three parts, but when three parts,
always make the middle number zero.  The part of that which bothers me
most is not the "always zero" part.  It's the "sometimes two parts,
sometimes three parts" part.

mark



Re: [HACKERS] 10.0

2016-06-20 Thread Mark Dilger

> On Jun 20, 2016, at 9:43 AM, Robert Haas  wrote:
> 
> On Mon, Jun 20, 2016 at 12:26 PM, Mark Dilger  wrote:
>>> In practical effect that is exactly what your proposal does.  You just feel 
>>> better because you defined when B is allowed to change even though it never 
>>> should happen based upon our project policy.  And any rare exception can 
>>> justifiably be called a bug fix because, face it, it would only happen if 
>>> someone reports a bug.
>> 
>> Why are you refusing to acknowledge the difference between features that 
>> require a pg_upgrade and features that don't?
> 
> The amount of additional committer work that would be created by
> making that distinction would be large.  Currently, we're on an annual
> release cycle.  Every commit that's not a bug fix gets committed to
> exactly one branch: master.  Inevitably, there are multiple changes
> per cycle - dozens, probably - that change initial catalog contents
> and would therefore require pg_upgrade.
> 
> Suppose we switched to a semi-annual release cycle where every other
> release required a pg_upgrade, so once per year same as now, and the
> other ones did not.  The only way to do that would be to have two
> active development branches, one of which accepts only changes that
> don't bump catversion or xlog_page_magic and the other of which
> accepts changes of all sorts.  Every patch that qualifies for the
> no-pg-upgrade-required branch would have to be committed twice,
> resolving conflicts as necessary.
> 
> Also, over time, the number of supported branches would approximately
> double.  With a five year support window, it's currently about six.
> If you had another set of semi-major releases in between the main set
> of releases, you'd end up with 11 or 12 active branches, which would
> make back-patching significantly more burdensome than currently.
> 
> Now maybe you have some other idea in mind, but I don't quite
> understand what it is. 

I do, indeed, and here it is:

When considering whether to *back port* a change, we typically do so
on the basis that bug fixes are back ported, features are not.  In my
proposed versioning scheme, you could back port any feature that is
compatible with the old version, and bump the middle number to alert
users that you've not just back ported a bug fix, but a feature.  Any new
features that are not compatible don't get back ported.

If you fix a bug on master during development, you can back port that
as well.  But instead of bumping the middle number, you bump the last
number.

Somebody running a version that is three major versions back could
still get the advantage of new features, so long as those new features
are not incompatible.  It's frequently not nearly so easy to run pg_upgrade
as it is to run rpm -U.  You get downtime either way, but the elapsed
time of that downtime is orders of magnitude different.

Someone else running that same version from three major versions ago
can take a more conservative policy, and only upgrade bug fixes, and
forego the back ported features.

You still have one major version release per year.  At that time, you can
also release back-port versions of prior major versions.





Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Tom Lane
Robert Haas  writes:
> On Sun, Jun 19, 2016 at 10:23 PM, Tom Lane  wrote:
>> Personally, I'm +1 for such tinkering if it makes the feature either more
>> controllable or more understandable.  After reading the comments at the
>> head of nodeGather.c, though, I don't think that single_copy is either
>> understandable or useful, and merely renaming it won't help.  Apparently,
>> it runs code in the worker, except when it doesn't, and even when it does,
>> it's absolutely guaranteed to be a performance loss because the leader is
>> doing nothing.  What in the world is the point?

> The single_copy flag allows a Gather node to have a child plan which
> is not intrinsically parallel.

OK, but I'm very dubious of your claim that this has any use except for
testing purposes.  It certainly has no such use today.  Therefore, the
behavior of falling back to running in the leader seems like an
anti-feature to me.  If we want to test running in a worker, then we
want to test that, not maybe test it.

I propose that the behavior we actually want here is to execute in
a worker, full stop.  If we can't get one, wait till we can.  If we
can't get one because no slots have been allocated at all, fail.
That would make the behavior deterministic enough for the regression
tests to rely on it.

And while I don't know what this mode should be called, I'm pretty sure
that neither "single_copy" nor "pipeline" are useful descriptions.

regards, tom lane




[HACKERS] pager_min_lines tab completion

2016-06-20 Thread Andrew Dunstan
I have just noticed that when I added the pager_min_lines psetting to 
psql in 9.5, I omitted to add it to the relevant piece of 
tab-complete.c. If no-one objects I propose to remedy that shortly and 
backpatch it.


cheers

andrew




Re: [HACKERS] New design for FK-based join selectivity estimation

2016-06-20 Thread Tom Lane
Tomas Vondra  writes:
> On 06/18/2016 06:52 PM, Tom Lane wrote:
>> I concur, actually, but I feel that it's out of scope for this
>> particular patch, which is only trying to replace the functionality
>> that was committed previously. If you want to come up with a patch on
>> top of this that adds some accounting for NULLs, I'd be willing to
>> consider it as a post-beta2 improvement.

> Sure, fair enough. By post-beta2 you mean for 9.7, or still for 9.6?

I think it'd be legitimate to address the NULLs question for 9.6, as long
as the patch is not very large.  If it does turn out to be invasive or
otherwise hard to review, waiting for 9.7 might be more prudent.  But
I argued upthread that failing to consider nulls was a bug in the original
patch, so dealing with them could be considered a bug fix.

> If I could wish one more thing - could you briefly explain why you 
> rewrote the patch the way you did? I'd like to learn from this and while 
> I think I kinda understand most of the changes, I'm not really sure I 
> got it right.

I don't at the moment recall everything I changed, but I'm happy to
answer questions that are more specific than that one ...

regards, tom lane




Re: [HACKERS] 10.0

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 12:26 PM, Mark Dilger  wrote:
>> In practical effect that is exactly what your proposal does.  You just feel 
>> better because you defined when B is allowed to change even though it never 
>> should happen based upon our project policy.  And any rare exception can 
>> justifiably be called a bug fix because, face it, it would only happen if 
>> someone reports a bug.
>
> Why are you refusing to acknowledge the difference between features that 
> require a pg_upgrade and features that don't?

The amount of additional committer work that would be created by
making that distinction would be large.  Currently, we're on an annual
release cycle.  Every commit that's not a bug fix gets committed to
exactly one branch: master.  Inevitably, there are multiple changes
per cycle - dozens, probably - that change initial catalog contents
and would therefore require pg_upgrade.

Suppose we switched to a semi-annual release cycle where every other
release required a pg_upgrade, so once per year same as now, and the
other ones did not.  The only way to do that would be to have two
active development branches, one of which accepts only changes that
don't bump catversion or xlog_page_magic and the other of which
accepts changes of all sorts.  Every patch that qualifies for the
no-pg-upgrade-required branch would have to be committed twice,
resolving conflicts as necessary.

Also, over time, the number of supported branches would approximately
double.  With a five year support window, it's currently about six.
If you had another set of semi-major releases in between the main set
of releases, you'd end up with 11 or 12 active branches, which would
make back-patching significantly more burdensome than currently.

Now maybe you have some other idea in mind, but I don't quite
understand what it is.  It's not likely to ever happen that we go
through a whole 12 month release cycle and then get to the end of it
and say "huh, I guess we never bumped catversion or xlog_page_magic".
If that ever does happen, it's probably a sign that nobody is doing
any meaningful feature development any more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] New design for FK-based join selectivity estimation

2016-06-20 Thread Tom Lane
Tomas Vondra  writes:
> OK, thanks. The one thing we haven't done is testing the performance, to 
> see how this fares. So I've repeated the tests I've done on the original 
> version of the patch here

Hmm.  I'm not that excited about these results, for a couple of reasons:

* AFAICS, all the numbers are collected from the first execution of a
query within a session, meaning caches aren't populated and everything
has to be loaded from disk (or at least shared buffers).

* I do not credit hundreds of completely redundant FKs between the same
two tables as being representative of plausible real-world cases.

I modified your new script as attached to get rid of the first problem.
Comparing HEAD with HEAD minus commit 100340e2d, in non-assert builds,
I get results like this for the 100-foreign-key case (with repeat
count 1000 for the data collection script):

select code, test, avg(time), stddev(time) from data group by 1,2 order by 1,2;
  code  | test  |        avg         |       stddev
--------+-------+--------------------+---------------------
 head   | t1/t2 |  0.065045045045045 | 0.00312962651081508
 head   | t3/t4 |  0.168561561561562 | 0.00379087132124092
 head   | t5/t6 |  0.127671671671672 | 0.00326275949269809
 head   | t7/t8 |  0.391057057057056 | 0.00590249325300915
 revert | t1/t2 | 0.0613933933933937 |  0.0032082678131875
 revert | t3/t4 | 0.0737507507507501 | 0.00221692725859567
 revert | t5/t6 |  0.123759759759759 | 0.00431225386651805
 revert | t7/t8 |  0.154082082082081 | 0.00405118420422266
(8 rows)

So for the somewhat-credible cases, ie 100 unrelated foreign keys,
I get about 3% - 6% slowdown.  The 100-duplicate-foreign-keys case
does indeed look like about a 2X slowdown, but as I said, I do not
think that has anything to do with interesting usage.

In any case, the situation I was worried about making better was
queries joining many tables, which none of this exercises at all.

regards, tom lane

#!/bin/sh
# Usage: <script> <dbname> <repeat-count>
# Runs each EXPLAIN ANALYZE repeatedly, keeps only the "Planning" timing
# lines, drops the first (cold-cache) measurement with tail -n +2, and
# tags each timing with the table pair being joined.
DB=$1
COUNT=$2

for i in `seq 1 $COUNT`; do
echo "EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 USING (a);"
done | psql $DB | grep 'Planning' | tail -n +2 | awk '{print "t1/t2\t"$3}'

for i in `seq 1 $COUNT`; do
echo "EXPLAIN ANALYZE SELECT * FROM t3 JOIN t4 USING (a);"
done | psql $DB | grep 'Planning' | tail -n +2 | awk '{print "t3/t4\t"$3}'

for i in `seq 1 $COUNT`; do
echo "EXPLAIN ANALYZE SELECT * FROM t5 JOIN t6 USING (a,b,c,d);"
done | psql $DB | grep 'Planning' | tail -n +2 | awk '{print "t5/t6\t"$3}'

for i in `seq 1 $COUNT`; do
echo "EXPLAIN ANALYZE SELECT * FROM t7 JOIN t8 USING (a,b,c,d);"
done | psql $DB | grep 'Planning' | tail -n +2 | awk '{print "t7/t8\t"$3}'



Re: [HACKERS] 10.0

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 12:28 PM, Mark Dilger 
wrote:

>
> > On Jun 20, 2016, at 8:53 AM, Mark Dilger 
> wrote:
> >
> >
> > This is not a plea for keeping the three part versioning system.  It's
> just
> > a plea not to have a 2 part versioning system masquerading as a three
> > part versioning system, or vice versa.
>
> To clarify my concern, I never want to have to write code like this:
>
> CASE WHEN pg_version eq '11.1' OR pg_version eq '11.0.1' THEN foo()
>WHEN pg_version eq '11.2' OR pg_version eq '11.0.2'
> THEN bar()
> 
> or
>
> if (0 == strcmp(pg_version_string, "11.1") || 0 ==
> strcmp(pg_version_string, "11.0.1"))
> foo();
> else if (0 == strcmp(pg_version_string, "11.2") || 0 ==
> strcmp(pg_version_string, "11.0.2"))
> bar();
>
> either in sql, perl, c, java, or anywhere else.  As soon as you have two
> different formats for the version string, you get into this hell.  Yeah,
> ok, you may have a sql level function for this, but I'm thinking about
> applications somewhat removed from a direct connection to the database,
> where you can't be sure which format you'll be handed.
>

Now you argue for keeping the middle number on pure compatibility
grounds...

The correct format is:  110001 and 110002

Which pretty much boils down to "we're keeping the middle number but it
will always be zero".

So, I suppose you are giving a +1 to keeping the human-readable display
10.0.x - and will let others interpret your reasons as they will.

David J.


Re: [HACKERS] 10.0

2016-06-20 Thread Mark Dilger

> On Jun 20, 2016, at 8:53 AM, Mark Dilger  wrote:
> 
> 
> This is not a plea for keeping the three part versioning system.  It's just 
> a plea not to have a 2 part versioning system masquerading as a three
> part versioning system, or vice versa.

To clarify my concern, I never want to have to write code like this:

CASE WHEN pg_version eq '11.1' OR pg_version eq '11.0.1' THEN foo()
   WHEN pg_version eq '11.2' OR pg_version eq '11.0.2' THEN 
bar()

or

if (0 == strcmp(pg_version_string, "11.1") || 0 == 
strcmp(pg_version_string, "11.0.1"))
foo();
else if (0 == strcmp(pg_version_string, "11.2") || 0 == 
strcmp(pg_version_string, "11.0.2"))
bar();

either in sql, perl, c, java, or anywhere else.  As soon as you have two
different formats for the version string, you get into this hell.  Yeah, ok,
you may have a sql level function for this, but I'm thinking about
applications somewhat removed from a direct connection to the database,
where you can't be sure which format you'll be handed.


mark



Re: [HACKERS] 10.0

2016-06-20 Thread Mark Dilger
> 
> In practical effect that is exactly what your proposal does.  You just feel 
> better because you defined when B is allowed to change even though it never 
> should happen based upon our project policy.  And any rare exception can 
> justifiably be called a bug fix because, face it, it would only happen if 
> someone reports a bug.

Why are you refusing to acknowledge the difference between features that 
require a pg_upgrade and features that don't?





Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Robert Haas
On Mon, Jun 20, 2016 at 1:26 AM, Amit Kapila  wrote:
> I have done analysis on this and didn't find any use case where passing
> CURSOR_OPT_PARALLEL_OK in exec_stmt_execsql() can help in parallelizing the
> queries.  Basically, there seems to be three ways in which function
> exec_stmt_execsql can be used inside plpgsql.
>
> a. In execution of statement used in open cursor (via exec_stmt_open())
> b. In execution of statement when used in for loop (via exec_stmt_forc())
> c. In execution of SQL statement (via direct call to exec_stmt_execsql())
>
> For the cases (a) and (b), one can construct a case where execution needs to
> be suspended, so using parallel mode for those cases will not be meaningful.
> For case (c), the Select statement seems to execute successfully only when
> there is a *into* target, else it will give an error after execution as
> below:
> ERROR:  query has no destination for result data
> HINT:  If you want to discard the results of a SELECT, use PERFORM instead.
>
> Now, if one has used an *into* clause, the number of rows will be restricted,
> in which case parallelism won't be enabled.

OK, sounds like I got that right, then.  Thanks for verifying.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Robert Haas
On Sun, Jun 19, 2016 at 10:23 PM, Tom Lane  wrote:
>> although I fear we
>> might be getting to a level of tinkering with parallel query that
>> starts to look more like feature development.
>
> Personally, I'm +1 for such tinkering if it makes the feature either more
> controllable or more understandable.  After reading the comments at the
> head of nodeGather.c, though, I don't think that single_copy is either
> understandable or useful, and merely renaming it won't help.  Apparently,
> it runs code in the worker, except when it doesn't, and even when it does,
> it's absolutely guaranteed to be a performance loss because the leader is
> doing nothing.  What in the world is the point?

The single_copy flag allows a Gather node to have a child plan which
is not intrinsically parallel.  For example, consider these two plans:

Gather
-> Parallel Seq Scan

Gather
-> Seq Scan

The first plan is safe regardless of the setting of the single-copy
flag.  If the plan is executed in every worker, the results in
aggregate across all workers will add up to the results of a
non-parallel sequential scan of the table.  The second plan is safe
only if the # of workers is 1 and the single-copy flag is set.  If
either of those things is not true, then more than one process might
try to execute the sequential scan, and the result will be that you'll
get N copies of the output, where N = (# of parallel workers) +
(leader also participates ? 1 : 0).
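As a sanity check on the arithmetic above, here is a toy model of the duplication (illustrative only, not PostgreSQL code):

```python
def output_copies(nworkers, leader_participates, single_copy):
    """Toy model: how many copies of a non-parallel-aware child plan's
    output a Gather node would emit.  With single_copy set, exactly one
    process runs the child; otherwise every participating process runs
    it and the outputs are concatenated."""
    if single_copy:
        return 1
    return nworkers + (1 if leader_participates else 0)

# A plain (not Parallel) Seq Scan under a Gather with 2 workers plus a
# participating leader yields 3 copies of every row:
print(output_copies(2, True, False))   # 3
# The single-copy case is safe regardless of worker count:
print(output_copies(1, False, True))   # 1
```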

For force_parallel_mode = {on, regress}, the single-copy behavior is
essential.  We can run all of those plans inside a worker, but only
because we know that the leader won't also try to run those same
plans.

But it might be useful in other cases too.  For example, imagine a
plan like this:

Join
-> Join
   -> Join
      -> Join
         -> Gather (single copy)
            -> Join
               -> Join
                  -> Join
                     -> Join
                        -> Scan (not parallel aware)

This is pipeline parallelism.  Instead of having one process do all of
the joins, you can have a worker do some subset of them and then send
the outputs back to the leader, which can do the rest and return the
results to the client.  This is actually kind of hard to get right -
according to the literature I've read on parallel query - because you
can get pipeline stalls that erase most or all of the benefit, but
it's a possible area to explore.

Actually, though, the behavior I really want the single_copy flag to
embody is not so much "only one process runs this" but "leader does
not participate unless there are no workers", which is the same thing
only when the budgeted number of workers is one.  This is useful
because of plans like this:

Finalize HashAggregate
-> Gather
   -> Partial HashAggregate
      -> Hash Join
         -> Parallel Seq Scan on large_table
         -> Hash
            -> Seq Scan on another_large_table

Unless the # of groups is very small, the leader actually won't
perform very much of the parallel-seq-scan on large_table, because
it'll be too busy aggregating the results from the other workers.
However, if it ever reaches a point where the Gather can't read a
tuple from one of the workers immediately, which is almost certain to
occur right at the beginning of execution, it's going to go build a
copy of the hash table so that it can "help" with the hash join.  By
the time it finishes, the workers will have done the same and be
feeding it results, and it will likely get little use out of the copy
that it built itself.  But it will still have gone to the effort of
building it.

For 10.0, Thomas Munro has already done a bunch of work, and will be
doing more work, so that we can build a shared hash table, rather than
one copy per worker.  That's going to be better when the table is
large anyway, so maybe this particular case won't matter so much.  But
in general when a partial path has a substantial startup cost, it may
be better for the leader not to get involved.  In a case like this,
it's hard to see how the leader's involvement can ever hurt:

Finalize HashAggregate
-> Gather
   -> Partial HashAggregate
      -> Nested Loop
         -> Parallel Seq Scan on large_table
         -> Index Scan on some_other_table

Even if the leader processes only one or two pages of
large_table, there's no real harm done unless, I suppose, the combine
function is fabulously expensive, which seems unlikely.  The lack of
harm stems directly from the fact that there's no startup cost here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-06-20 Thread David G. Johnston
On Monday, June 20, 2016, Mark Dilger  wrote:

>
> > On Jun 18, 2016, at 5:48 PM, Josh Berkus wrote:
> >
> > On 06/16/2016 11:01 PM, Craig Ringer wrote:
> >>
> >> I thought about raising this, but I think in the end it's replacing one
> >> confusing and weird versioning scheme for another confusing and weird
> >> versioning scheme.
> >>
> >> It does have the advantage that tools that compare a two-part major like
> >> 090401 vs 090402 won't be confused when they compare 100100 and 100200,
> >> since it'll be 11 and 12. So it's more backward-compatible. But
> >> ugly.
> >>
> >
> > Realistically, though, we're more likely to end up with 10.0.1 than
> > 10.1.  I don't think we're anywhere near plumbing the depths of the
> > stuff which will break because folks are parsing our version numbers
> > with regexes.  In more major software, this will break nagios
> > check_postgres.
>
> Having a 3 part versioning scheme where the middle portion is always
> zero makes the least sense to me of any of the proposals.  If we're going
> to have the pain and hurting of moving to a 2 part versioning scheme,
> and breaking nagios and what not, then let's just get on with it.  If we're
> going to keep all three parts, can we please use my proposal earlier in
> this thread where A.B.C are allocated for:
>
> C++:  bug fixes only
> B++:  new features added, but still backward compatible with prior version
> A++:  new features added, not backward compatible, pg_upgrade required
>
> If every new feature release ends up requiring pg_upgrade, then B will
> always be zero, which is no worse than your proposal.  But at least users
> can
> refer to part B to learn something useful, namely whether they will need to
> run pg_upgrade as part of upgrading their existing cluster.
>
> This has the advantage that new minor features, like adding a new guc, can
> be released sooner than the next major release, but does not have to be
> released in disguise as if it were a bug fix.
>
> This is not a plea for keeping the three part versioning system.  It's just
> a plea not to have a 2 part versioning system masquerading as a three
> part versioning system, or vice versa.
>
>
In practical effect that is exactly what your proposal does.  You just feel
better because you defined when B is allowed to change even though it never
should happen based upon our project policy.  And any rare exception can
justifiably be called a bug fix because, face it, it would only happen if
someone reports a bug.

If we keep the middle digit it will be for compatibility reasons.  Let's
not cause people to infer something that isn't true by saying it could
change.

David J.


Re: [HACKERS] 10.0

2016-06-20 Thread Mark Dilger

> On Jun 18, 2016, at 5:48 PM, Josh Berkus  wrote:
> 
> On 06/16/2016 11:01 PM, Craig Ringer wrote:
>> 
>> I thought about raising this, but I think in the end it's replacing one
>> confusing and weird versioning scheme for another confusing and weird
>> versioning scheme.
>> 
>> It does have the advantage that tools that compare a two-part major like
>> 090401 vs 090402 won't be confused when they compare 100100 and 100200,
>> since it'll be 11 and 12. So it's more backward-compatible. But
>> ugly.
>> 
> 
> Realistically, though, we're more likely to end up with 10.0.1 than
> 10.1.  I don't think we're anywhere near plumbing the depths of the
> stuff which will break because folks are parsing our version numbers
> with regexes.  In more major software, this will break nagios
> check_postgres.

Having a 3 part versioning scheme where the middle portion is always
zero makes the least sense to me of any of the proposals.  If we're going
to have the pain and hurting of moving to a 2 part versioning scheme,
and breaking nagios and what not, then let's just get on with it.  If we're
going to keep all three parts, can we please use my proposal earlier in
this thread where A.B.C are allocated for:

C++:  bug fixes only
B++:  new features added, but still backward compatible with prior version
A++:  new features added, not backward compatible, pg_upgrade required

If every new feature release ends up requiring pg_upgrade, then B will
always be zero, which is no worse than your proposal.  But at least users can
refer to part B to learn something useful, namely whether they will need to
run pg_upgrade as part of upgrading their existing cluster.

This has the advantage that new minor features, like adding a new guc, can
be released sooner than the next major release, but does not have to be
released in disguise as if it were a bug fix.

This is not a plea for keeping the three part versioning system.  It's just 
a plea not to have a 2 part versioning system masquerading as a three
part versioning system, or vice versa.

mark
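Under this hypothetical A.B.C scheme (which, to be clear, was not adopted), an upgrade tool could decide whether pg_upgrade is needed from the version bump alone. A sketch:

```python
def upgrade_action(old, new):
    """Given (A, B, C) version tuples under the proposed scheme, say
    what upgrading from old to new would require.  Hypothetical: this
    encodes the A/B/C semantics from the proposal above, nothing more.
    """
    if new[0] != old[0]:
        return "pg_upgrade required"             # A bump: incompatible
    if new[1] != old[1]:
        return "package upgrade, new features"   # B bump: compatible features
    if new[2] != old[2]:
        return "package upgrade, bug fixes only" # C bump: bug fixes
    return "no change"

print(upgrade_action((10, 0, 3), (11, 0, 0)))  # pg_upgrade required
print(upgrade_action((10, 0, 3), (10, 1, 0)))  # package upgrade, new features
print(upgrade_action((10, 0, 3), (10, 0, 4)))  # package upgrade, bug fixes only
```

The point of the middle number is exactly this: the tuple alone tells an administrator whether the upgrade is an `rpm -U` or a full dump/restore-class operation.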




Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?

2016-06-20 Thread Tom Lane
Teodor Sigaev  writes:
>> We're really quickly running out of time to get this done before
>> beta2.  Please don't commit anything that's going to break the tree
>> because we only have about 72 hours before the wrap, but if it's
>> correct then it should go in.

> Isn't late now? Or wait to beta2 is out?

Let's wait till after beta2.

regards, tom lane




Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?

2016-06-20 Thread Teodor Sigaev

We're really quickly running out of time to get this done before
beta2.  Please don't commit anything that's going to break the tree
because we only have about 72 hours before the wrap, but if it's
correct then it should go in.


Isn't late now? Or wait to beta2 is out?

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Alex Ignatov


On 13.06.2016 18:52, amul sul wrote:

Hi,

It looks like a bug in the to_timestamp() function when the format string
has more whitespace than the input string; see below:

Ex.1: Two whitespaces before HH24 in the format, but only one before the time in the input string

postgres=# SELECT TO_TIMESTAMP('2016-06-13 15:43:36', 'YYYY/MM/DD  HH24:MI:SS');
 to_timestamp
------------------------
 2016-06-13 05:43:36-07   <— incorrect time
(1 row)



Ex.2: One whitespace before YYYY in the format string

postgres=# SELECT TO_TIMESTAMP('2016/06/13 15:43:36', ' YYYY/MM/DD HH24:MI:SS');
to_timestamp
--
0016-06-13 15:43:36-07:52:58  <— incorrect year
(1 row)



If there are one or more consecutive whitespace characters in the format, we
should skip them as long as we can still get an actual field.
Thoughts?
Thanks & Regards,
Amul Sul




From the docs about to_timestamp()
(https://www.postgresql.org/docs/9.5/static/functions-formatting.html):
"These functions interpret input liberally, with minimal error checking. 
While they produce valid output, the conversion can yield unexpected 
results. For example, input to these functions is not restricted by 
normal ranges, thus to_date('20096040','YYYYMMDD') returns 2014-01-17 
rather than causing an error. Casting does not have this behavior."


And it won't stop at some simple whitespace. By using to_timestamp you
can get arbitrary output by providing illegal input parameter values:


postgres=# SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'YYYYMMDD HH24:MI:SS');
  to_timestamp
------------------------
 2016-01-06 14:40:39+03
(1 row)


Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
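To see the mechanism behind these surprising results, here is a deliberately simplified sketch of a to_timestamp-style matcher in which every literal format character, including a space, consumes exactly one input character. This is a toy model of the behavior under discussion, not PostgreSQL's actual implementation; the field table and widths are assumptions of the sketch.

```python
import re

# Toy model: each literal format character (space included) eats exactly one
# input character; each field pattern eats up to its maximum digit width.
# Not PostgreSQL's real parser -- just a sketch of the reported behavior.
FIELDS = {"YYYY": 4, "HH24": 2, "MM": 2, "DD": 2, "MI": 2, "SS": 2}

def naive_parse(value, fmt):
    out, pos, i = {}, 0, 0
    while i < len(fmt):
        for name, width in FIELDS.items():
            if fmt.startswith(name, i):
                m = re.match(r"\d{1,%d}" % width, value[pos:])
                out[name] = int(m.group())
                pos, i = pos + m.end(), i + len(name)
                break
        else:
            pos += 1  # literal: consume one input character, whatever it is
            i += 1
    return out

# One space in the input but two in the format: the second format space
# eats the '1' of '15', so the hour comes out as 5 (as in Ex.1 above).
shifted = naive_parse("2016-06-13 15:43:36", "YYYY/MM/DD  HH24:MI:SS")
correct = naive_parse("2016-06-13 15:43:36", "YYYY/MM/DD HH24:MI:SS")
```

Under this model the doubled space silently shifts every later field, which is exactly the complaint in the report above.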






Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Alex Ignatov


On 20.06.2016 16:36, Tom Lane wrote:

Robert Haas  writes:

On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas  wrote:

I think a space in the format string should skip a whitespace
character in the input string, but not a non-whitespace character.
It's my understanding that these functions exist in no small part for
compatibility with Oracle, and Oracle declines to skip the digit '1'
on the basis of an extra space in the format string, which IMHO is the
behavior any reasonable user would expect.

So Amul and I are of one opinion and Tom is of another.  Anyone else
have an opinion?

I don't necessarily have an opinion yet.  I would like to see more than
just an unsupported assertion about what Oracle's behavior is.  Also,
how should FM mode affect this?

regards, tom lane




Oracle:
SQL> SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'YYYYMMDD HH24:MI:SS') 
from dual;

SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'YYYYMMDD HH24:MI:SS') from dual
*
ERROR at line 1:
ORA-01843: not a valid month

PG:

postgres=# SELECT TO_TIMESTAMP('2016-06-13 99:99:99', 'YYYYMMDD HH24:MI:SS');
  to_timestamp
------------------------
 2016-01-06 14:40:39+03
(1 row)


I know about:
"These functions interpret input liberally, with minimal error checking. 
While they produce valid output, the conversion can yield unexpected 
results" from the docs, but when we provide illegal input parameters we
get no exception or error about it.
I think that to_timestamp() needs to have more format checking than it
has now.


Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
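For comparison, Python's datetime.strptime is an example of the stricter, Oracle-like validation being asked for here: out-of-range field values are rejected with an error rather than silently normalized. This is only an analogy for the desired behavior, not a proposal to adopt Python's semantics.

```python
from datetime import datetime

# strptime rejects out-of-range fields outright, much as Oracle raises
# ORA-01843 above, instead of liberally "normalizing" them.
def strict_parse(value, fmt):
    try:
        return datetime.strptime(value, fmt)
    except ValueError as e:
        return "error: %s" % e

ok = strict_parse("2016-06-13 15:43:36", "%Y-%m-%d %H:%M:%S")   # a datetime
bad = strict_parse("2016-06-13 99:99:99", "%Y-%m-%d %H:%M:%S")  # an error string
```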







Re: [HACKERS] primary_conninfo missing from pg_stat_wal_receiver

2016-06-20 Thread Vik Fearing
On 19/06/16 13:02, Michael Paquier wrote:
> On Sun, Jun 19, 2016 at 7:38 PM, Vik Fearing  wrote:
>> On 19/06/16 12:28, Michael Paquier wrote:
>>> On Sun, Jun 19, 2016 at 5:56 PM, Michael Paquier
>>> Or in short the attached.
>>
>> The code looks good to me but why no documentation?
> 
> Because that's a long day... Thanks! Now you have it.

Thanks, this looks good.  Could you please add it to the next commitfest
so that it doesn't get lost and also so I can do an official review of it?
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Albe Laurenz
Tom Lane wrote:
> I don't necessarily have an opinion yet.  I would like to see more than
> just an unsupported assertion about what Oracle's behavior is.  Also,
> how should FM mode affect this?

I can supply what Oracle 12.1 does:

SQL> SELECT to_timestamp('2016-06-13 15:43:36', ' YYYY/MM/DD HH24:MI:SS') AS ts 
FROM dual;

TS

2016-06-13 15:43:36.0 AD

SQL> SELECT to_timestamp('2016-06-13 15:43:36', 'YYYY/MM/DD  HH24:MI:SS') AS ts 
FROM dual;

TS

2016-06-13 15:43:36.0 AD

SQL> SELECT to_timestamp('2016-06-1315:43:36', 'YYYY/MM/DD  HH24:MI:SS') AS 
ts FROM dual;

TS

2016-06-13 15:43:36.0 AD

(to_timestamp_tz behaves the same way.)

So Oracle seems to make no distinction between one space and several.

Yours,
Laurenz Albe



Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas  wrote:
>> I think a space in the format string should skip a whitespace
>> character in the input string, but not a non-whitespace character.
>> It's my understanding that these functions exist in no small part for
>> compatibility with Oracle, and Oracle declines to skip the digit '1'
>> on the basis of an extra space in the format string, which IMHO is the
>> behavior any reasonable user would expect.

> So Amul and I are of one opinion and Tom is of another.  Anyone else
> have an opinion?

I don't necessarily have an opinion yet.  I would like to see more than
just an unsupported assertion about what Oracle's behavior is.  Also,
how should FM mode affect this?

regards, tom lane




Re: [HACKERS] parallel.c is not marked as test covered

2016-06-20 Thread Amit Kapila
On Thu, Jun 16, 2016 at 8:20 AM, Robert Haas  wrote:
>
> On Wed, Jun 15, 2016 at 10:48 PM, Amit Kapila  wrote:
> > exec_stmt_execsql() is used to execute SQL statements inside plpgsql,
> > which includes DML statements as well, so probably you wanted to play
> > safe by not allowing the parallel option from that place.  However, I
> > think there shouldn't be a problem in using CURSOR_OPT_PARALLEL_OK from
> > this place, as we have a check in standard_planner which will take care
> > of whether to choose parallel mode or not for a particular statement.
> > If you want, I can do more detailed analysis and prepare a patch.
>
> That would be great.  I have a vague recollection that that function
> might be called in some contexts where execution can be suspended,
> which would be no good for parallel query.  But that might be wrong.
>

I have done an analysis of this and didn't find any use case where passing
CURSOR_OPT_PARALLEL_OK in exec_stmt_execsql() can help in parallelizing
queries.  Basically, there seem to be three ways in which the function
exec_stmt_execsql can be used inside plpgsql.

a. In execution of statement used in open cursor (via exec_stmt_open())
b. In execution of statement when used in for loop (via exec_stmt_forc())
c. In execution of SQL statement (via direct call to exec_stmt_execsql())

For the cases (a) and (b), one can construct a case where execution needs
to be suspended, so using parallel mode for those cases will not be
meaningful.  For case (c), a SELECT statement seems to execute
successfully only when there is an *into* target; otherwise it will give an
error after execution, as below:
ERROR:  query has no destination for result data
HINT:  If you want to discard the results of a SELECT, use PERFORM instead.

Now, if one has used an INTO clause, the number of rows will be restricted,
in which case parallelism won't be enabled.

So, let me know if you see any case where generating a parallel path is
meaningful via this path.  I am of the opinion that we should leave this
code as is, or maybe add a comment explaining why we have decided not to
enable parallelism in this path.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread David G. Johnston
On Mon, Jun 20, 2016 at 8:19 AM, Robert Haas  wrote:

> On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas 
> wrote:
> > On Mon, Jun 13, 2016 at 12:12 PM, Tom Lane  wrote:
> >> amul sul  writes:
> >>> It looks like a bug in the to_timestamp() function when the format
> >>> string has more whitespace than the input string; see below:
> >>
> >> No, I think this is a case of "input doesn't match the format string".
> >>
> >> As a rule of thumb, using to_timestamp() for input that could be parsed
> >> just fine by the standard timestamp input function is not a particularly
> >> good idea.  to_timestamp() is meant to deal with input that is in a
> >> well-defined format that happens to not be parsable by timestamp_in.
> >> This example doesn't meet either of those preconditions.
> >
> > I think a space in the format string should skip a whitespace
> > character in the input string, but not a non-whitespace character.
> > It's my understanding that these functions exist in no small part for
> > compatibility with Oracle, and Oracle declines to skip the digit '1'
> > on the basis of an extra space in the format string, which IMHO is the
> > behavior any reasonable user would expect.
>
> So Amul and I are of one opinion and Tom is of another.  Anyone else
> have an opinion?
>
>
At least Tom's position has the benefit of being consistent with current
behavior.  The current implementation doesn't actually care what literal
value you specify - any non-special character consumes a single token from
the input, period.

SELECT TO_TIMESTAMP('2016-06-13 15:43:36', 'YYYY/MM/DD--HH24:MI:SS');
SELECT TO_TIMESTAMP('2016-06-13 15:43:36', 'YYYY/MM/DD-HH24:MI:SS');

Both the above exhibit the same behavior as if you used a space instead of
the hyphen as the group separator.

The documentation should be updated to make this particular dynamic more
clear.

I don't see changing the general behavior of these "date formatting"
functions as a worthwhile endeavor.  Adding a less-liberal "parse_timestamp"
function I could get behind.

IOW, the user seems to be happy with the fact that the "/" in the date
matches his "-" but then complains that the space matches the digit "1".
You don't get to have it both ways.

[re-reads the third usage note]

Or maybe you do.  We already define space as being a greedy operator
(when not used in conjunction with FX).  A greedy space-space sequence
makes little sense on its face, and if we are going to be helpful here we
should treat it as a single greedy space matcher.

Note that "returns an error because to_timestamp expects one space only" is
wrong - it errors because only a single space is captured and then the
attempt to parse 'JUN' using "MON" fails.  The following query doesn't
fail even though it exhibits the same space discrepancy (it just gives the
same "wrong" result).

SELECT TO_TIMESTAMP('2016-06-13 15:43:36', 'FXYYYY/MM/DD  HH24:MI:SS');

Given that we already partially special-case the space expression I'd be
inclined to consider Robert's and Amul's position on the matter.  I think
I'd redefine our treatment of space to be "zero or more" instead of "one or
more" and require that it only match a literal space in the input.

Having considered that, I'm not convinced it's worth a compatibility break.
I'd much rather deprecate these  versions and write
slightly-less-liberal versions named .

In any case I'd call the present wording a bug. Instead:

A single space consumes a single token of input and then, unless the FX
modifier is present, consumes zero or more subsequent literal spaces.
Thus, using two spaces in a row without the FX modifier, while allowed, is
unlikely to give you a satisfactory result.  The first space will consume
all available consecutive spaces so that the second space will be
guaranteed to consume a non-space token from the input.
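The proposed semantics can be sketched as code. The toy matcher below is an assumption-laden model, not PostgreSQL's implementation: it treats a run of format spaces as one greedy matcher that consumes all consecutive input spaces, while other literals still consume exactly one character each.

```python
import re

# Field table and widths are assumptions of this sketch.
FIELDS = {"YYYY": 4, "HH24": 2, "MM": 2, "DD": 2, "MI": 2, "SS": 2}

def greedy_space_parse(value, fmt):
    out, pos, i = {}, 0, 0
    while i < len(fmt):
        matched = False
        for name, width in FIELDS.items():
            if fmt.startswith(name, i):
                m = re.match(r"\d{1,%d}" % width, value[pos:])
                out[name] = int(m.group())
                pos, i = pos + m.end(), i + len(name)
                matched = True
                break
        if matched:
            continue
        if fmt[i] == " ":
            # A run of format spaces acts as a single greedy space matcher:
            # it consumes every consecutive space in the input.
            while i < len(fmt) and fmt[i] == " ":
                i += 1
            while pos < len(value) and value[pos] == " ":
                pos += 1
        else:
            pos += 1  # other literals still consume one input character
            i += 1
    return out

# The doubled format space no longer swallows the '1' of '15':
r = greedy_space_parse("2016-06-13 15:43:36", "YYYY/MM/DD  HH24:MI:SS")
```

Under these semantics the Ex.1 query from earlier in the thread parses the hour as 15, as a user would expect.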

David J.


Re: [HACKERS] Bug in to_timestamp().

2016-06-20 Thread Robert Haas
On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas  wrote:
> On Mon, Jun 13, 2016 at 12:12 PM, Tom Lane  wrote:
>> amul sul  writes:
>>> It looks like a bug in the to_timestamp() function when the format
>>> string has more whitespace than the input string; see below:
>>
>> No, I think this is a case of "input doesn't match the format string".
>>
>> As a rule of thumb, using to_timestamp() for input that could be parsed
>> just fine by the standard timestamp input function is not a particularly
>> good idea.  to_timestamp() is meant to deal with input that is in a
>> well-defined format that happens to not be parsable by timestamp_in.
>> This example doesn't meet either of those preconditions.
>
> I think a space in the format string should skip a whitespace
> character in the input string, but not a non-whitespace character.
> It's my understanding that these functions exist in no small part for
> compatibility with Oracle, and Oracle declines to skip the digit '1'
> on the basis of an extra space in the format string, which IMHO is the
> behavior any reasonable user would expect.

So Amul and I are of one opinion and Tom is of another.  Anyone else
have an opinion?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Truncating/vacuuming relations on full tablespaces

2016-06-20 Thread Asif Naeem
Thank you for useful suggestions. PFA patch, I have tried to cover all the
points mentioned.

Regards,
Muhammad Asif Naeem

On Wed, Apr 6, 2016 at 9:06 PM, Robert Haas  wrote:

> On Wed, Apr 6, 2016 at 3:32 AM, Asif Naeem  wrote:
> >> Oh, I see.  I think it's probably not a good idea to skip truncating
> >> those maps, but perhaps the option could be defined as making no new
> >> entries, rather than ignoring them altogether, so that they never
> >> grow.  It seems that both of those are defined in such a way that if
> >> page Y follows page X in the heap, the entries for page Y in the
> >> relation fork will follow the one for page X.  So truncating them
> >> should be OK; it's just growing them that we need to avoid.
> >
> > Thank you Robert. PFA basic patch; it introduces an EMERGENCY option
> > to VACUUM that forces it to avoid extending any entries in the VM or
> > FSM. It seems to work fine in simple test scenarios, e.g.
> >
> >> postgres=# create table test1 as (select generate_series(1,10));
> >> SELECT 10
> >> postgres=# vacuum  EMERGENCY test1;
> >> VACUUM
> >> postgres=# select pg_relation_filepath('test1');
> >>  pg_relation_filepath
> >> --
> >>  base/13250/16384
> >> (1 row)
> >> [asif@centos66 inst_96]$ find . | grep base/13250/16384
> >> ./data/base/13250/16384
> >> postgres=# vacuum test1;
> >> VACUUM
> >> [asif@centos66 inst_96]$ find . | grep base/13250/16384
> >> ./data/base/13250/16384
> >> ./data/base/13250/16384_fsm
> >> ./data/base/13250/16384_vm
> >
> >
> > Please do let me know if I missed something or more information is
> required.
> > Thanks.
>
> This is too late for 9.6 at this point and certainly requires
> discussion anyway, so please add it to the next CommitFest.  But in
> the meantime, here are a few quick comments:
>
> You should only support EMERGENCY using the new-style, parenthesized
> options syntax.  That way, you won't need to make EMERGENCY a keyword.
> We don't want to do that, and we especially don't want it to be
> partially reserved, as you have done here.
>
> Passing the EMERGENCY flag around in a global variable is probably not
> a good idea; use parameter lists.  That's what they are for.  Also,
> calling the variable that decides whether EMERGENCY was set
> Extend_VM_FSM is confusing.
>
> I see why you changed the calling convention for visibilitymap_pin()
> and RecordPageWithFreeSpace(), but that's awfully invasive.  I wonder
> if there's a better way to do that, like maybe having vacuumlazy.c ask
> the VM and FSM for their length in pages and then not trying to use
> those functions for block numbers that are too large.
>
> Don't forget to update the documentation.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


VACUUM_EMERGENCY_Option_v2.patch
Description: Binary data



Re: [HACKERS] Parallelized polymorphic aggs, and aggtype vs aggoutputtype

2016-06-20 Thread David Rowley
On 20 June 2016 at 19:06, David Rowley  wrote:
> On 18 June 2016 at 05:45, Tom Lane  wrote:
>> A possible solution is to give deserialize an extra dummy argument, along
>> the lines of "deserialize(bytea, internal) returns internal", thereby
>> ensuring it can't be called in any non-system-originated contexts.  This
>> is still rather dangerous if the other argument is variable, as somebody
>> might be able to abuse an internal-taking function by naming it as the
>> deserialize function for a maliciously-designed aggregate.  What I'm
>> inclined to do to lock it down further is to drop the "serialtype"
>> argument to CREATE AGGREGATE, which seems rather pointless (what else
>> would you ever use besides bytea?).  Instead, insist that
>> serialize/deserialize apply *only* when the transtype is INTERNAL, and
>> their signatures are exactly "serialize(internal) returns bytea" and
>> "deserialize(bytea, internal) returns internal", never anything else.
>
> This is also the only way that I can think of to fix this issue. If we
> can agree that the fix should be to insist that the deserialisation
> function take an additional 2nd parameter of INTERNAL, then I can
> write a patch to fix this, and include a patch for documentation
> section 35.10 to better explain parallelising user-defined
> aggregates.

I've gone and implemented the dummy argument approach for
deserialization functions.

If we go with this, I can then write the docs for 35.10 which'll serve
to explain parallel user defined aggregates in detail.

Some notes about the patch;

I didn't remove the comments at the top of each deserial function
which mention something like:

 * numeric_avg_serialize(numeric_avg_deserialize(bytea)) must result in a value
 * which matches the original bytea value.

I'm thinking that perhaps these now make a little less sense, given
that numeric_avg_deserialize is now numeric_avg_deserialize(bytea,
internal).

Perhaps these should be updated or removed.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


deserialization_function_fix.patch
Description: Binary data



Re: [HACKERS] Should XLogInsert() be done only inside a critical section?

2016-06-20 Thread Heikki Linnakangas

On 20/04/16 23:44, Tom Lane wrote:

Over in <17456.1460832...@sss.pgh.pa.us> I speculated about whether
we should be enforcing that WAL insertion happen only inside critical
sections.  We don't currently, and a survey of the backend says that
there are quite a few calls that aren't inside critical sections.
But there are at least two good reasons why we should, IMO:

1. It's not very clear that XLogInsert will recover cleanly if it's
invoked outside a critical section and hits a failure.  Certainly,
if we allow such usage, then every potential error inside that code
has to be analyzed under both critical-section and normal rules.


It was certainly designed to recover from errors gracefully. 
XLogInsertRecord(), which does the low-level work of inserting the 
record into the WAL buffer, has a critical section of its own inside it. 
The code in xloginsert.c, for constructing the record to insert, 
operates on backend-private buffers only.



2. With no such check, it's quite easy for calling code to forget to
create a critical section around code stanzas where one is *necessary*
(because you're changing shared-buffer contents).


Yeah, that is very true.


Both of these points represent pretty clear hazards for introduction
of future bugs, whether or not there are any such bugs today.

As against this, it could be argued that adding critical sections where
they're not absolutely necessary must make crashing failures more probable
than they need to be.  But first you'd have to prove that they're not
absolutely necessary, which I'm unsure about because of point #1.


One option would be to put the must-be-in-critical-section check in 
XLogRegisterBlock(). A WAL record that is associated with a shared 
buffer almost certainly needs a critical section, but many of the others 
are safe without it.
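The invariant being proposed can be sketched in miniature. The names below loosely mirror PostgreSQL's CritSectionCount and XLogRegisterBlock, but this is an illustration of the check only, not backend code.

```python
# Toy model: registering a shared buffer for a WAL record must happen
# inside a critical section; the check fails fast otherwise.
crit_section_count = 0

def start_crit_section():
    global crit_section_count
    crit_section_count += 1

def end_crit_section():
    global crit_section_count
    assert crit_section_count > 0
    crit_section_count -= 1

def xlog_register_block(block_id):
    # The proposed must-be-in-critical-section check.
    assert crit_section_count > 0, "register_block outside critical section"
    return "registered block %d" % block_id

start_crit_section()
ok = xlog_register_block(1)   # fine: a critical section is open
end_crit_section()

try:
    xlog_register_block(2)    # caller forgot to open one: caught early
    caught = False
except AssertionError:
    caught = True
```

The point of putting the assertion at registration time, rather than at insert time, is that it fires exactly where a shared buffer gets involved, which is where the critical section is actually required.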


- Heikki




[HACKERS] Cleanup in contrib/postgres_fdw/deparse.c

2016-06-20 Thread Etsuro Fujita

Hi,

Here is a small patch to remove an unnecessary return from
deparseFromExprForRel in contrib/postgres_fdw/deparse.c.


Best regards,
Etsuro Fujita


pg-fdw-deparse-cleanup.patch
Description: application/download



Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-20 Thread Tatsuro Yamada

Hi,

I ran the TPC-H queries on this version and they succeeded.
The error is fixed.

commit 705ad7f3b523acae0ddfdebd270b7048b2bb8029
Author: Tom Lane 
Date:   Sun Jun 19 13:11:40 2016 -0400

Regards,
Tatsuro Yamada
NTT OSS Center


On 2016/06/18 1:26, Robert Haas wrote:

On Fri, Jun 17, 2016 at 9:29 AM, Robert Haas  wrote:

On Fri, Jun 17, 2016 at 9:04 AM, Amit Kapila  wrote:

Attached, please find updated patch based on above lines.  Due to addition
of projection path on top of partial paths, some small additional cost is
added for parallel paths. In one of the tests in select_parallel.sql the
plan was getting converted to serial plan from parallel plan due to this
cost, so I have changed it to make it more restrictive.  Before changing, I
have debugged the test to confirm that it was due to this new projection
path cost.  I have added two new tests as well to cover the new code added
by patch.


I'm going to go work on this patch now.


Here is a revised version, which I plan to commit in a few hours
before the workday expires.  I kept it basically as Amit had it, but I
whacked the comments around some and, more substantively, avoided
inserting and/or charging for a Result node when that's not necessary.











Push down join with PHVs (WAS: Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116)

2016-06-20 Thread Etsuro Fujita

On 2016/06/17 21:45, Robert Haas wrote:

On Thu, Jun 16, 2016 at 10:44 PM, Etsuro Fujita
 wrote:

On 2016/06/16 22:00, Robert Haas wrote:

On Thu, Jun 16, 2016 at 7:05 AM, Etsuro Fujita
 wrote:


ISTM that a more robust solution to this is to push down the ft1-ft2-ft3
join with the PHV by extending deparseExplicitTargetList() and/or
something else so that we can send a remote query like:

select ft1.a, (ft3.a IS NOT NULL) from (ft1 inner join ft2 on ft1.a =
ft2.a)
left join ft3 on ft1.a = ft3.a



I completely agree we should have that, but not for 9.6.



OK, I'll work on that for 10.0.



Great, that will be a nice improvement.


Here is a patch for that.

* Modified deparseExplicitTargetList as well as is_foreign_expr and 
deparseExpr so that a join with PHVs is pushed down if not only the join 
but also the PHVs are safe to execute remotely.  The existing deparsing logic 
currently gives up pushing down any upper joins involving a join (or 
foreign table!) with PHVs in the tlist.  But I think the patch would be 
easily extended to support that, once we support deparsing subqueries.


* Besides that, simplified foreign_join_ok a bit, by adding a new flag, 
allow_upper_foreign_join, to PgFdwRelationInfo, replacing the existing 
pushdown_safe flag with it.  See the comments in postgres_fdw.h.


Comments welcome.

Best regards,
Etsuro Fujita


pg-fdw-phv-handling-v1.patch
Description: application/download



Re: [HACKERS] Parallelized polymorphic aggs, and aggtype vs aggoutputtype

2016-06-20 Thread David Rowley
On 18 June 2016 at 09:29, Tom Lane  wrote:
> So at this point my proposal is:
>
> 1. Add an OID-list field to Aggref holding the data types of the
> user-supplied arguments.  This can be filled in parse analysis since it
> won't change thereafter.  Replace calls to get_aggregate_argtypes() with
> use of the OID-list field, or maybe keep the function but have it look
> at the OID list not the physical args list.
>
> 2. Add an OID field to Aggref holding the resolved (non polymorphic)
> transition data type's OID.  For the moment we could just set this
> during parse analysis, since we do not support changing the transtype
> of an existing aggregate.  If we ever decide we want to allow that,
> the lookup could be postponed into the planner.  Replace calls to
> resolve_aggregate_transtype with use of this field.
> (resolve_aggregate_transtype() wouldn't disappear, but it would be
> invoked only once during parse analysis, not multiple times per query.)
>
> #2 isn't necessary to fix the bug, but as long as we are doing #1
> we might as well do #2 too, to buy back some of the planner overhead
> added by parallel query.
>
> Barring objections I'll make this happen by tomorrow.

Thanks for committing this change.

From reading over the commit I see you've left:

+ * XXX need more documentation about partial aggregation here

Would the attached cover off what you had imagined might go here?

> I still don't like anything about the way that the matching logic in
> fix_combine_agg_expr works, but at the moment I can't point to any
> observable bug from that, so I'll not try to change it before beta2.

Right, I see more work is needed in that area as there's now an
out-of-date comment at the top of
search_indexed_tlist_for_partial_aggref. The comment claims we only
ignore aggoutputtype, but you added aggtranstype and aggargtypes to
that list.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


aggref_comments.patch
Description: Binary data


