Re: Speeding up INSERTs and UPDATEs to partitioned tables

2018-08-02 Thread Amit Langote
(looking at the v5 patch but replying to an older email)

On 2018/07/31 16:03, David Rowley wrote:
> I've attached a complete v4 patch.
> 
>> By the way, when going over the updated code, I noticed that the code
>> around child_parent_tupconv_maps could use some refactoring too.
>> Especially, I noticed that ExecSetupChildParentMapForLeaf() allocates
>> child-to-parent map array needed for transition tuple capture even if not
>> needed by any of the leaf partitions.  I'm attaching here a patch that
>> applies on top of your v3 to show what I'm thinking we could do.
> 
> Maybe we can do that as a follow-on patch.

We probably could, but I think it would be a good idea to get rid of *all*
redundant allocations due to tuple routing in one patch, if that's the
mission of this thread and the patch anyway.

> I think what we have so far
> has already ended up quite complex to review. What do you think?

Yeah, it's kind of complex, but at least it seems that we're clear on the
point that what we're trying to do here is to try to get rid of redundant
allocations.

The parts of the patch that appear complex seem to be around the allocation
of the various maps.  Especially the child-to-parent maps, which, as things
stand today, come from two arrays -- a per-update-subplan array that's
needed by update tuple routing proper and a per-leaf-partition array (the
one in PartitionTupleRouting) that's needed by the transition capture
machinery.  The original coding was such that the update tuple routing code
would try to avoid allocating the per-update-subplan array if it saw that
the per-leaf-partition array was already set up in PartitionTupleRouting
because transition capture was active in the query.  For the
update-tuple-routing code to be able to use maps from the per-leaf array,
it would have to know which update subplans mapped to which
tuple-routing-initialized partitions.  That was maintained in the
subplan_partition_offset array, which is now gone with this patch, because
we no longer want to fix the tuple-routing-initialized-partition offsets in
advance.  So, it's better to dissociate the per-subplan maps, which are
initialized during ExecInitModifyTable, from the per-leaf maps, which are
initialized lazily when tuple routing initializes a partition, and that
is what my portion of the patch did.
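
To make that concrete, the layout I have in mind is roughly the following
(a paraphrase for illustration, not the verbatim struct; the definitions
in execPartition.h in the patch are authoritative):

/*
 * Simplified sketch of the two map arrays discussed above.  A stand-in
 * forward declaration is used for TupleConversionMap so the sketch is
 * self-contained.
 */
typedef struct TupleConversionMap TupleConversionMap;

typedef struct PartitionTupleRoutingSketch
{
	/*
	 * Both arrays are indexed by leaf-partition offset and filled lazily
	 * when tuple routing initializes a partition; a NULL slot means no
	 * conversion is needed for that partition.
	 */
	TupleConversionMap **parent_child_tupconv_maps;
	TupleConversionMap **child_parent_tupconv_maps;
} PartitionTupleRoutingSketch;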

As mentioned in my last email, I still think it would be a good idea to
simplify the handling of child-to-parent maps in PartitionTupleRouting
even further while we're improving the code in this area.  I revised
the patch so that it makes the handling of maps in PartitionTupleRouting
even more uniform.  With that patch, we no longer have two completely
unrelated places in the code managing the parent-to-child and
child-to-parent maps, even though both arrays are in the same
PartitionTupleRouting.
Please find the updated patch attached with this email.

Thanks,
Amit
From 1a814f5a40774a51bf702757ec91e02f418a5aba Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 3 Aug 2018 14:09:51 +0900
Subject: [PATCH v2 2/2] Refactor handling of child_parent_tupconv_maps

They're currently created and handed out by TupConvMapForLeaf, which
makes them look somewhat different from parent_to_child_tupconv_maps.
In fact, both contain conversion maps possibly needed between a
partition initialized by tuple routing and the root parent in one or
the other direction, so it seems odd that parent-to-child ones are
created in ExecInitRoutingInfo, whereas child-to-parent ones in
TupConvMapForLeaf.

The child-to-parent ones are only needed if transition capture is
active, but we can already check that in ExecInitRoutingInfo via
the incoming ModifyTableState (sure, we need to teach CopyFrom to
add the necessary information into its dummy ModifyTableState, but
that doesn't seem too bad).

This way, we can manage both parent-to-child and child-to-parent maps
in similar ways, and more importantly, use the same criterion -- whether
a partition's slot in the respective array is NULL -- to decide whether
tuple conversion is necessary.
---
 src/backend/commands/copy.c            |  37 +---
 src/backend/executor/execPartition.c   | 102 +
 src/backend/executor/nodeModifyTable.c |  11 ++--
 src/include/executor/execPartition.h   |  33 ++-
 4 files changed, 64 insertions(+), 119 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 752ba3d767..6f4069d321 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2510,8 +2510,12 @@ CopyFrom(CopyState cstate)
/*
 * If there are any triggers with transition tables on the named relation,
 * we need to be prepared to capture transition tuples.
+*
+* Because partition tuple routing would like to know about whether
+* transition capture is active, we also set it in mtstate, which is
+* passed to ExecFindPartition below.
 */
-   cstate->transition_capture =
+   

Re: [HACKERS] Restricting maximum keep segments by repslots

2018-08-02 Thread Kyotaro HORIGUCHI
At Thu, 2 Aug 2018 09:05:33 -0400, Robert Haas  wrote in 

> On Tue, Jul 31, 2018 at 9:52 PM, Kyotaro HORIGUCHI
>  wrote:
> > I thought it's to be deprecated for some reason so I'm leaving
> > wal_keep_segments in '# of segments' even though the new GUC is
> > in MB. I'm a bit uneasy that the two similar settings are in
> > different units. Couldn't we turn it into MB taking this
> > opportunity if we will keep wal_keep_segments, changing its name
> > to min_wal_keep_size?  max_slot_wal_keep_size could be changed to
> > just max_wal_keep_size along with it.
> 
> This seems like it's a little bit of a separate topic from what this
> thread about, but FWIW, +1 for standardizing on MB.

Thanks. OK, I'll raise this separately after this thread.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [PATCH] Improve geometric types

2018-08-02 Thread Kyotaro HORIGUCHI
At Fri, 03 Aug 2018 13:38:40 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20180803.133840.180843182.horiguchi.kyot...@lab.ntt.co.jp>
> At Thu, 2 Aug 2018 11:50:55 +0200, Tomas Vondra 
>  wrote in 
> 
> > 
> > 
> > On 08/01/2018 01:40 PM, Emre Hasegeli wrote:
> > >> Ah, so there's an assumption that NaNs are handled earlier and never
> > >> reach
> > >> this place? That's probably a safe assumption. I haven't thought about
> > >> that,
> > >> it simply seemed suspicious that the code mixes direct comparisons and
> > >> float8_mi() calls.
> > > The comparison functions handle NaNs.  The arithmetic functions handle
> > > returning error on underflow, overflow and division by zero.  I
> > > assumed we want to return error on those in any case, but we don't
> > > want to handle NaNs at every place.
> > > 
> > >> Not sure, I'll leave that up to you. I don't mind doing it in a
> > >> separate
> > >> patch (I'd probably prefer that over mixing it into unrelated patch).
> > > It is attached separately.
> > > 
> > 
> > OK, thanks.
> > 
> > So, have we reached conclusion about all the bits I mentioned on 7/31?
> > The delta and float8/double cast are fixed, and for computeDistance
> > (i.e. doing comparisons directly or using float8_lt), the code may
> > seem a bit inconsistent, but it is in fact correct as the NaNs are
> > handled elsewhere. That seems reasonable, but perhaps a comment
> > pointing that out would be nice.
> 
> I'm not confident about partially replacing double with float8 in the
> gist code. After applying the 0002 patch, I see that most of the
> problematic usage of double or bare arithmetic on dimensional values
> is in gistproc.c.
> 
> > static inline float
> > non_negative(float val)
> > {
> > if (val >= 0.0f)
> > return val;
> > else
> > return 0.0f;
> > }
> 
> This takes float4 and it is used as "non_negative(overlap)",
> where overlap is float4, which is calculated using float8_mi.
> Float4 makes sense if we store a large number of values somewhere,
> but these are just working variables. Couldn't we eliminate
> float4 where there is no specific requirement for it?

I'm fine with the 0002-geo-float-v14 except the above.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [PATCH] Improve geometric types

2018-08-02 Thread Kyotaro HORIGUCHI
At Thu, 2 Aug 2018 11:50:55 +0200, Tomas Vondra  
wrote in 
> 
> 
> On 08/01/2018 01:40 PM, Emre Hasegeli wrote:
> >> Ah, so there's an assumption that NaNs are handled earlier and never
> >> reach
> >> this place? That's probably a safe assumption. I haven't thought about
> >> that,
> >> it simply seemed suspicious that the code mixes direct comparisons and
> >> float8_mi() calls.
> > The comparison functions handle NaNs.  The arithmetic functions handle
> > returning error on underflow, overflow and division by zero.  I
> > assumed we want to return error on those in any case, but we don't
> > want to handle NaNs at every place.
> > 
> >> Not sure, I'll leave that up to you. I don't mind doing it in a
> >> separate
> >> patch (I'd probably prefer that over mixing it into unrelated patch).
> > It is attached separately.
> > 
> 
> OK, thanks.
> 
> So, have we reached conclusion about all the bits I mentioned on 7/31?
> The delta and float8/double cast are fixed, and for computeDistance
> (i.e. doing comparisons directly or using float8_lt), the code may
> seem a bit inconsistent, but it is in fact correct as the NaNs are
> handled elsewhere. That seems reasonable, but perhaps a comment
> pointing that out would be nice.

I'm not confident about partially replacing double with float8 in the
gist code. After applying the 0002 patch, I see that most of the
problematic usage of double or bare arithmetic on dimensional values
is in gistproc.c.

> static inline float
> non_negative(float val)
> {
>   if (val >= 0.0f)
>   return val;
>   else
>   return 0.0f;
> }

It is used as "non_negative(overlap)", where overlap is float4,
which is calculated using float8_mi.  Float4 makes sense only if
we need to store a large number of values somewhere, but these are
just working variables. Couldn't we eliminate float4 where there is
no specific requirement for it?
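
Concretely, the float8 variant I have in mind would be a trivial change,
something like this (just a sketch; float8 is Postgres's typedef for
double):

static inline float8
non_negative(float8 val)
{
	return (val >= 0.0) ? val : 0.0;
}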

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

2018-08-02 Thread Kyotaro HORIGUCHI
Hello.

At Tue, 26 Jun 2018 10:19:45 +0530, Ashutosh Bapat 
 wrote in 

> On Tue, Jun 26, 2018 at 9:59 AM, Kyotaro HORIGUCHI
>  wrote:
> >
> >> But good thing is because of join and aggregate
> >> push-down we already have ability to push arbitrary kinds of nodes
> >> down to the foreign server through the targetlist. We should be able
> >> to leverage that capability. It looks like a lot of change, which
> >> again doesn't seem to be back-portable.
> >
> > After some struggles as you know, I agree to the opinion. As my
> > first impression, giving (physical) base relations (*1) an
> > ability to have junk attribute is rather straightforward.
> 
> By giving base relations an ability to have junk attribute you mean to
> add junk attribute in the targetlist of DML, something like
> postgresAddForeignUpdateTargets(). You seem to be fine with the new

Maybe.

> node approach described above. Just confirm.

A something-like-but-other-than-Var node? I'm not sure it is needed,
because whatever node we add to the relation tlist, we must add
the correspondence to the relation descriptor. And if we do that,
a Var works to point to it. (Am I understanding correctly?)

> >
> > Well, is our conclusion here like this?
> >
> > - For existing versions, check the erroneous situation and ERROR out.
> >   (documentation will be needed.)
> >
> > - For 12, we try the above thing.
> 
> I think we have to see how invasive the fix is, and whether it's
> back-portable. If it's back-portable, we back-port it and the problem
> is fixed in previous versions as well. If not, we fix previous
> versions to ERROR out instead of corrupting the database.

Mmm. OK, I'll try to make a patch. Please wait for a while.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

2018-08-02 Thread Kyotaro HORIGUCHI
Hello, thank you for the comment.

At Wed, 01 Aug 2018 21:21:57 +0900, Etsuro Fujita  
wrote in <5b61a5e5.6010...@lab.ntt.co.jp>
> (2018/06/12 12:19), Kyotaro HORIGUCHI wrote:
> > I have demonstrated and actually shown a problem of the
> > PARAM_EXEC case.
> 
> > A. Just detecting and reporting/erroring the problematic case.
> >
> > B. Giving to Sort-like nodes an ability to convert PARAMS into
> > junk columns.
> >
> > C. Adding a space for 64bit tuple identifier in a tuple header.
> >
> > D. Somehow inhibiting tuple-storing node like Sort between. (This
> >should break something working.)
> >
> >
> > B seems to have possibility to fix this but I haven't have a
> > concrete design of it.
> 
> I'm just wondering whether we could modify the planner (or executor)
> so that Params can propagate up to the ModifyTable node through all
> joins like Vars/PHVs.

Yeah, it's mentioned somewhere upthread. The largest obstacle
in my view is the fact that the tuple descriptor for an
RTE_RELATION baserel is tied to the relation definition. So we
need to separate the two to use "(junk) Vars/PHVs" for that
purpose. The four options above are based on the premise of PARAM_EXEC.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [report] memory leaks in COPY FROM on partitioned table

2018-08-02 Thread Alvaro Herrera
On 2018-Aug-03, Kohei KaiGai wrote:

> 2018-08-02 5:38 GMT+09:00 Alvaro Herrera :
> > On 2018-Aug-01, Alvaro Herrera wrote:
> >
> >> Right, makes sense.  Pushed that way.
> >
> > KaiGai, if you can please confirm that the pushed change fixes your test
> > case, I'd appreciate it.
>
> Can you wait for a few days? I can drop the test dataset and reuse the storage
> once the benchmark test is over.

Of course -- take your time.

Thanks,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



RE: Recovery performance of standby for multiple concurrent truncates on large tables

2018-08-02 Thread Jamison, Kirk
Hi, 
I appreciate the feedback and suggestions.

On Tue, Jul 31, 2018 at 8:01 AM, Robert Haas  wrote:
>> How would this work if a relfilenode number that belonged to an old 
>> relation got recycled for a new relation?
>> ..
>> I think something like this could be made to work -- both on the 
>> master and the standby, and not just while waiting for a failover --
>> ...
>> It's not clear to me whether it would be worth the overhead of doing 
>> something like this.  Making relation drops faster at the cost of 
>> making buffer cleaning slower could be a loser.

The deferred list has information such as the relfilenode and the head/top
page number of the invalid pages. The standby in promote mode only registers
info in the deferred list when redoing the COMMIT record. However, it scans
the shared buffer cache only once before the end of the promote, and discards
the pages belonging to the tables included in the deferred list. After
removing them, it increments the head page number of the abovementioned
invalid pages. In this way, if an INSERT ever progresses while promoting
(I'm not sure if that's possible), the discard progresses too, and the
application can refer to it.
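
For discussion, one entry of the deferred list described above could look
roughly like this (a sketch only; all names are hypothetical, and stand-in
typedefs replace the usual Postgres storage types to keep it
self-contained):

typedef unsigned int BlockNumberSketch;	/* stand-in for BlockNumber */

typedef struct RelFileNodeSketch	/* stand-in for RelFileNode */
{
	unsigned int spcNode;	/* tablespace */
	unsigned int dbNode;	/* database */
	unsigned int relNode;	/* relation file node */
} RelFileNodeSketch;

typedef struct DeferredDiscardEntry
{
	RelFileNodeSketch rnode;	/* relation whose pages to discard */
	BlockNumberSketch head_blkno;	/* head of the not-yet-discarded
					 * invalid pages; incremented as the
					 * discard progresses */
} DeferredDiscardEntry;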

As mentioned by Tsunakawa-san above, the purpose of the design is to make
the failover process as short as possible, and not really to make the
drop of relations faster. My initial thought was not to touch the regular
buffer invalidation process, but I am also not sure of the level of
complexity the proposed design would add, nor how critical the
requirement (shorter failover) is, so I asked for the community's feedback.


On Tue, July 31, 2018 8:27 AM, Thomas Munro wrote:
> Anyway, it's a lot of complexity, and it falls back to a worst cases 
> like today, and can also transfer work to innocent foreground processes.
> I see why Andres says we should just get a better data structure so we 
> can make the guy doing the dropping pay for it up front, but more 
> efficiently.  I suspect we may also want an ordered data structure for 
> write clustering purposes one day.

I also understand the value of solving the root cause of the problem; that's
why I asked Andres whether we could expect development from the community for
V12 on the radix tree approach for buffer management, or even just a WIP
patch from anyone, regardless of whether it's a radix tree or not.
And perhaps we'd like to get involved as well, as this is also our
customer's problem.

So I am just curious about the radix tree approach's design. Maybe we should
start discussing what kind of data structures, processing, etc. are involved?

I also read other design solutions from another thread [1].
a. Fujii-san's proposal
Add $SUBJECT for performance improvement; reloption to prevent vacuum 
from truncating empty pages
b. Pavan's comment
> What if we remember the buffers as seen by count_nondeletable_pages() and
> then just discard those specific buffers instead of scanning the entire
> shared_buffers again? Surely we revisit all to-be-truncated blocks before
> actual truncation. So we already know which buffers to discard. And we're
> holding exclusive lock at that point, so nothing can change underneath. Of
> course, we can't really remember a large number of buffers, so we can do
> this in small chunks. Scan last K blocks, remember those K buffers, discard
> those K buffers, truncate the relation and then try for next K blocks. If
> another backend requests lock on the table, we give up or retry after a
> while.

[1] 
https://www.postgresql.org/message-id/flat/20180419203802.hqrb4o2wexjnb2ft%40alvherre.pgsql#7d4a8c56a01392a3909b2150371e6495
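
To make the shape of Pavan's idea concrete, it can be restated as a loop
like the one below (a sketch only; Relation and BlockNumber are the usual
Postgres types, and the three helper functions are hypothetical
placeholders, not existing PostgreSQL functions):

#define CHUNK_K 128		/* blocks remembered/discarded per round */

static void remember_buffers_for_blocks(Relation rel,
					BlockNumber start, BlockNumber end);
static void discard_remembered_buffers(void);
static void truncate_rel_to(Relation rel, BlockNumber nblocks);

static void
chunked_truncate(Relation rel, BlockNumber cur_nblocks,
		 BlockNumber new_nblocks)
{
	while (cur_nblocks > new_nblocks)
	{
		BlockNumber chunk_start;

		chunk_start = (cur_nblocks - new_nblocks > CHUNK_K) ?
			cur_nblocks - CHUNK_K : new_nblocks;

		/* scan the last K blocks and remember their buffers ... */
		remember_buffers_for_blocks(rel, chunk_start, cur_nblocks);
		/* ... discard exactly those buffers, instead of scanning
		 * all of shared_buffers ... */
		discard_remembered_buffers();
		/* ... then truncate this chunk away */
		truncate_rel_to(rel, chunk_start);

		cur_nblocks = chunk_start;

		/* if another backend requests a lock on the table, give up
		 * or retry after a while (not shown) */
	}
}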

Now, how do we move forward?


Thank you everyone.

Regards,
Kirk Jamison


Re: Pluggable Storage - Andres's take

2018-08-02 Thread Haribabu Kommi
On Tue, Jul 24, 2018 at 11:31 PM Haribabu Kommi 
wrote:

> On Tue, Jul 17, 2018 at 11:01 PM Haribabu Kommi 
> wrote:
>
>>
>> I added a new API in tableam.h to get all the visible tuples on a page,
>> abstracting the bitgetpage() function.
>
> >>- Merge tableam.h and tableamapi.h and make most tableam.c functions
> >> small inline functions. Having one-line tableam.c wrappers makes this
> >> more expensive than necessary. We'll have a big enough trouble not
> >> regressing performancewise.
>
> I merged tableam.h and tableamapi.h into tableam.h and changed all the
> functions to inline. This change may have added some additional headers;
> I will check whether their need can be removed.
>
> Attached are the updated patches on top your github tree.
>
> Currently I am working on the following.
> - I observed that there is a crash when running isolation tests.
>

While investigating the crash, I observed that it is due to the many
FIXMEs in the code. So I made only minimal fixes and am looking into
correcting the FIXMEs first.

One thing I observed is that a missing relation pointer leads to a crash
in the flow of the EvalPlan* functions, because not all ROW_MARK types
contain a relation pointer.

I will continue to check all the FIXME fixes.


> - COPY's multi_insert path should probably deal with a bunch of slots,
>   rather than forming HeapTuples
>

Implemented support for slots in the COPY multi-insert path.

Regards,
Haribabu Kommi
Fujitsu Australia


0002-Isolation-test-fixes-1.patch
Description: Binary data


0001-COPY-s-multi_insert-path-deal-with-bunch-of-slots.patch
Description: Binary data


Re: [report] memory leaks in COPY FROM on partitioned table

2018-08-02 Thread Kohei KaiGai
2018-08-02 5:38 GMT+09:00 Alvaro Herrera :
> On 2018-Aug-01, Alvaro Herrera wrote:
>
>> Right, makes sense.  Pushed that way.
>
> KaiGai, if you can please confirm that the pushed change fixes your test
> case, I'd appreciate it.
>
Can you wait for a few days? I can drop the test dataset and reuse the storage
once the benchmark test is over.
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: Ideas for a relcache test mode about missing invalidations

2018-08-02 Thread Kyotaro HORIGUCHI
At Thu, 2 Aug 2018 14:40:50 -0700, Peter Geoghegan  wrote in 

pg> On Thu, Aug 2, 2018 at 3:18 AM, Kyotaro HORIGUCHI
pg>  wrote:
pg> >> [1] 
http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com
pg> >> [2] https://www.postgresql.org/message-id/12259.1532117...@sss.pgh.pa.us
pg> >
pg> > As for [1], it is not an issue of invalidation. It also happens if
pg> > the relation has any index, and even a drop is not needed. The
pg> > following steps are sufficient.
pg> >
pg> > create table t( pk serial, t text );
pg> > insert into t( t ) values( 'hello' ), ('world');
pg> > create index idx_fake0 on t (pk);
pg> > create index idx_fake on t ( f_fake( pk ) ); -- ERROR
pg> 
pg> The fact that there was a weird error wasn't really what we cared
pg> about there. If the user is doing something that's clearly
pg> unreasonable or nonsensical, then they cannot expect much from the
pg> error message (maybe we could do better, but it's not a priority).

Hmm. I don't think it's unreasonable or nonsensical, but I won't
argue it here.

pg> What we really cared about was the fact that it was possible to make a
pg> backend's relcache irrecoverably corrupt. That should never be allowed
pg> to happen, even when the user is determined to do something
pg> unreasonable.

I reread through the thread and IIUC, drop_index() is sending
inval on the owning relation and invalidation happens (that is,
RelationCacheInvalidateEntry is called for the owner
relation).  The relcache for the owner is actually being rebuilt
during the following CREATE INDEX. At least the problem mentioned
in [1] happens using a fresh relcache created after invoking
the index creation. The cause is that RelationGetIndexList collects
all "indislive" indexes, including idx_fake before population,
which can be fixed by the attached patch.

Am I still misunderstanding something?

# Anyway, I'll make another thread for the other issue.

[1]: 
https://www.postgresql.org/message-id/20180628150209.n2qch5jtn3vt2xaa%40alap3.anarazel.de

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian

2018-08-02 Thread Alvaro Herrera
On 2018-Aug-01, Tom Lane wrote:

> David Rowley  writes:
> > On 20 July 2018 at 01:03, David Rowley  wrote:
> >> I've attached a patch intended for master which is just v2 based on
> >> post 5220bb7533.
> 
> I've pushed the v3 patch with a lot of editorial work (e.g. cleaning
> up comments you hadn't).

Thanks Tom, much appreciated.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Page freezing, FSM, and WAL replay

2018-08-02 Thread Alvaro Herrera
We recently had a customer report a very strange problem, involving a
very large insert-only table: without explanation, insertions would
stall for several seconds, causing application timeout and process
accumulation and other nastiness.

After some investigation, we narrowed this down to happening immediately
after the first VACUUM on the table right after a standby got promoted.
It wasn't at first obvious what the connection between these factors
was, but eventually we realized that VACUUM must have been skipping a
bunch of pages because they had been marked all-frozen previously, so
the FSM was not updated with the correct freespace figures for those
pages.  The FSM pages had been transmitted as full-page images on WAL
before the promotion (because wal_log_hints), so they contained
optimistic numbers on amount of free space coming from the previous
master.  (Because this only happens on the first change to that FSM page
after a checkpoint, it's quite likely that one page every few thousand
or so contains optimistic figures while the others remain all zeroes, or
something like that.)

Before VACUUM, nothing too bad would happen, because the upper layers of
the FSM would not know about those optimistic numbers.  But when VACUUM
does FreeSpaceMapVacuum, it propagates those numbers upwards; as soon as
that happens, inserters looking for pages would be told about those
pages (wrongly catalogued to contain sufficient free space), go to
insert there, and fail because there isn't actually any freespace; ask
FSM for another page, lather, rinse, repeat until all those pages are
all catalogued correctly by FSM, at which point things continue
normally.  (There are many processes doing this chase-up concurrently
and it seems a pretty contentious process, about which see last
paragraph; it can be seen in pg_xlogdump that it takes several seconds
for things to settle).

After considering several possible solutions, I propose to have
heap_xlog_visible compute free space for any page being marked frozen;
Pavan adds to that to have heap_xlog_clean compute free space for all
pages also.  This means that if we later promote this standby and VACUUM
skips all-frozen pages, their FSM numbers are going to be up-to-date
anyway.  Patch attached.
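
For heap_xlog_visible, the shape of the change is roughly the following
(a sketch only, assuming the redo-local buffer/rnode/blkno variables that
heap_xlog_visible already has in scope; the attached patch is what
counts):

	/*
	 * After replaying the all-visible/all-frozen flag on the heap page,
	 * recompute its free space and record it, so the standby's FSM is
	 * not left with the optimistic numbers that arrived in a full-page
	 * image from the old master.
	 */
	if (BufferIsValid(buffer))
	{
		Size		freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));

		UnlockReleaseBuffer(buffer);
		XLogRecordPageWithFreeSpace(rnode, blkno, freespace);
	}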


Now, it's possible that the problem occurs for all-visible pages not
just all-frozen.  I haven't seen that one, maybe there's some reason why
it cannot.  But fixing both things together is an easy change in the
proposed patch: just do it on xlrec->flags != 0 rather than checking for
the specific all-frozen flag.

(This problem seems to be made worse by the fact that
RecordAndGetPageWithFreeSpace (or rather fsm_set_and_search) holds
exclusive lock on the FSM page for the whole duration of update plus
search.  So when there are many inserters, they all race to the update
process.  Maybe it'd be less terrible if we would release exclusive
after the update and grab shared lock for the search in
fsm_set_and_search, but we still have to have the exclusive for the
update, so the contention point remains.  Maybe there's not sufficient
improvement to make a practical difference, so I'm not proposing
changing this.)

-- 
Álvaro Herrera
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 5016181fd7..d024b4fa59 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8056,6 +8056,7 @@ heap_xlog_clean(XLogReaderState *record)
 	xl_heap_clean *xlrec = (xl_heap_clean *) XLogRecGetData(record);
 	Buffer		buffer;
 	Size		freespace = 0;
+	bool		know_freespace = false;
 	RelFileNode rnode;
 	BlockNumber blkno;
 	XLogRedoAction action;
@@ -8107,8 +8108,6 @@ heap_xlog_clean(XLogReaderState *record)
 										nowdead, ndead,
 										nowunused, nunused);
 
-		freespace = PageGetHeapFreeSpace(page); /* needed to update FSM below */
-
 		/*
 		 * Note: we don't worry about updating the page's prunability hints.
 		 * At worst this will cause an extra prune cycle to occur soon.
@@ -8118,16 +8117,16 @@ heap_xlog_clean(XLogReaderState *record)
 		MarkBufferDirty(buffer);
 	}
 	if (BufferIsValid(buffer))
+	{
+		freespace = PageGetHeapFreeSpace(page); /* needed to update FSM below */
+		know_freespace = true;
 		UnlockReleaseBuffer(buffer);
+	}
 
 	/*
-	 * Update the FSM as well.
-	 *
-	 * XXX: Don't do this if the page was restored from full page image. We
-	 * don't bother to update the FSM in that case, it doesn't need to be
-	 * totally accurate anyway.
+	 * Update the FSM as well, if we can.
 	 */
-	if (action == BLK_NEEDS_REDO)
+	if (know_freespace)

Re: FailedAssertion on partprune

2018-08-02 Thread David Rowley
On 3 August 2018 at 07:53, Robert Haas  wrote:
> I don't really understand the issue at hand, but let me just make a
> comment about accumulate_append_subpath().  If we have a regular
> Append on top of another regular Append or on top of a MergeAppend, we
> can flatten the lower Merge(Append) into the upper one.  No problem.
> If, however, the lower path is a Parallel Append, we can't.  Consider
> this plan:
>
> Gather
> -> Append
>   -> Partial Subpath #1
>   -> Parallel Append
> -> Partial Subpath #2
> -> Non-partial Subpath
>
> Making Partial Subpath #2 a child of the Append rather than the
> Parallel Append would be correct; the only downside is that we'd lose
> the nice worker-balancing stuff that the Parallel Append does.
> However, making Non-partial Subpath a child of the Append rather than
> the Parallel Append would be dead wrong, because the Parallel Append
> will make sure that it only gets executed by one of the cooperating
> processes, and the regular Append won't.  If the upper Append were
> changed to a Parallel Append, then we could flatten the whole thing
> just fine, as long as we're careful to keep partial paths and
> non-partial paths separated.  Also, if the situation is reversed, with
> an upper Parallel Append and a regular Append beneath it, that's
> always flatten-able.

Wouldn't that code have more flexibility to flatten the Append if it
were to, instead of looking at the Append's subpaths, look at the
subpaths' parent RelOptInfo's paths, and just use the Append and
MergeAppend as containers to identify the relations that must be
included, rather than the paths that should be?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: FailedAssertion on partprune

2018-08-02 Thread David Rowley
On 3 August 2018 at 05:54, Tom Lane  wrote:
> Yeah, looking at the explain posted upthread, the issue is specifically
> that some of the child paths might be for Gathers over subsets of the
> partitioning hierarchy.  It's not real clear to me whether such a subset
> would necessarily correspond to a subtree of the hierarchy, but if it
> does, you'd want to be able to apply run-time pruning at the parent
> Append, in hopes of not having to execute the Gather at all.  Separately
> from that, if you do execute the Gather, applying runtime pruning within
> its child Append is also a good thing.

Yeah, that "good thing" was the reason I changed my mind from aborting
run-time pruning in my first patch idea to allowing it because if
pruning says so, it's fine to remove an entire sub-partitioned
hierarchy.

If we look at the Assert in question:

/* Same rel cannot be both leaf and non-leaf */
Assert(relid_subplan_map[rti] == 0);

If this fails then the subplan path that was found must have a parent
that's a partitioned table. If we're worried about those containing
subplans which are not partitions then we can probably add some
Asserts somewhere else.

One place where we could do that with a bit of cover would be inside
the bms_num_members(allmatchedsubplans) < list_length(subpaths)
condition, where we could do:

Assert(root->simple_rte_array[parentrel->relid]->relkind !=
   RELKIND_PARTITIONED_TABLE);

There should never be any other_subplans when the parent is a partitioned table.

I've attached a patch to demonstrate what I mean here.

> So what's unclear to me is whether removing that Assert is sufficient
> to make all of that work nicely.  I'm concerned in particular that the
> planner might get confused about which pruning tests to apply at each
> level.  (EXPLAIN isn't a very adequate tool for testing this, since
> it fails to show anything about what pruning tests are attached to an
> Append.  I wonder if we need to create something that does show that.)

Pruning is only ever applied at the top-level when the RelOptInfo is a
RELOPT_BASEREL.  It might well be that we want to reconsider that now
that we've seen some plans where the subpaths are not always pulled
into the top-level [Merge]Append path, but as of now I don't quite see
how this could get broken. We're just pruning with a coarser
granularity.

> I think it'd be a real good idea to have a test case to exercise this.
> Jaime's presented test case is too expensive to consider as a regression
> test, but I bet that a skinnied-down version could be made to work
> once you'd set all the parallelization parameters to liberal values,
> like in select_parallel.sql.

Yeah, I did spend a bit of time trying to cut down that test and
failed at it. The fact that it seemed so hard to do didn't give
me much confidence that the plan produced by the test would be very
stable, but perhaps we can just deal with that if we see it's unstable.

I'll try again with the test.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


run-time_prune_asserts_for_discussion.patch
Description: Binary data


Re: Should contrib modules install .h files?

2018-08-02 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> Something that copes with different modules installing headers
 Tom> with the same base name. Allowing for that was the driving force
 Tom> for going with subdirectory-per-extension, but if we really want
 Tom> that to work, there seems to be no alternative but for extensions
 Tom> to write qualified header names (#include "hstore/hstore.h" not
 Tom> #include "hstore.h"). Andres, for one, seemed to think that
 Tom> wouldn't play nicely with PGXS,

I think that was me, not Andres?

But I think I was partially wrong and that it's possible that this can
be made to work at least in most cases, as long as we can rely on the
same-directory rule for #include "foo.h". (i.e. the first place to look
is always the same directory as the file containing the #include
statement).
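
For illustration, the arrangement I'm assuming looks like this (the
install paths and the nested header name are hypothetical):

/*
 * Installed layout (hypothetical):
 *   $(includedir_server)/extension/hstore/hstore.h
 *
 * An out-of-tree transform module then writes the qualified name:
 */
#include "hstore/hstore.h"

/*
 * If hstore.h itself did
 *   #include "hstore_fn.h"
 * (a hypothetical sibling header), the same-directory rule would find it
 * next to hstore.h, regardless of which -I path the qualified include
 * came through.
 */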

I'm going to test this now, trying to do an out-of-both-trees build
of a transform function for an out-of-tree PL that uses multiple .h
files.

-- 
Andrew (irc:RhodiumToad)



Re: Should contrib modules install .h files?

2018-08-02 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> It's particularly bad for these cases, since what they demonstrate
 Tom> is that it's impossible to build transform modules for plperl or
 Tom> plpython out-of-tree at the moment.

Right. Both plperl and plpython install _some_ header files (behavior
that was added in the commit for the transforms feature), but they
don't install all of the header files that the existing transform
modules actually use. Is this supposed to be a principled choice, with
an out-of-tree transform expected to provide its own code instead of
using those headers, or is it just an oversight?

 Tom> That doesn't seem to me to be something we should just ignore; it
 Tom> goes against not only our larger commitment to extensibility, but
 Tom> also the entire point of the transforms feature.

Yeah, if an out-of-tree data type can't provide a plperl or plpython
transform for itself, something's broken. And that does seem to be the
case at present.

-- 
Andrew (irc:RhodiumToad)



Re: [HACKERS] Cached plans and statement generalization

2018-08-02 Thread Konstantin Knizhnik



On 02.08.2018 08:25, Yamaji, Ryo wrote:

-Original Message-
From: Konstantin Knizhnik [mailto:k.knizh...@postgrespro.ru]
Sent: Wednesday, August 1, 2018 4:53 PM
To: Yamaji, Ryo/山地 亮 
Cc: PostgreSQL mailing lists 
Subject: Re: [HACKERS] Cached plans and statement generalization

I failed to reproduce the problem.
I used the following non-default configuration parameters:

autoprepare_limit=1
autoprepare_threshold=1

create dummy database:

create table foo(x integer primary key, y integer);
insert into foo values (generate_series(1,1), 0);

and run different queries, like:

postgres=# select * from foo where x=1;
postgres=# select * from foo where x+x=1;
postgres=# select * from foo where x+x+x=1;
postgres=# select * from foo where x+x+x+x=1;
...

and check size of CacheMemoryContext using gdb - it is not increased.

Can you please send me your test?


I checked not CacheMemoryContext but "plan cache context", because I think
that the memory context autoprepare mainly uses is "plan cache context".

The same non-default configuration parameters as Konstantin's were used:
autoprepare_limit=1
autoprepare_threshold=1

The procedure of the test that I did is shown below.

create dummy database
create table test (key1 integer, key2 integer, ... , key100 integer);
insert into test values (1, 2, ... , 100);

And, different queries are executed.
select key1 from test where key1=1 and key2=2 and ... and key100=100;
select key2 from test where key1=1 and key2=2 and ... and key100=100;
select key3 from test where key1=1 and key2=2 and ... and key100=100;...

And, "plan cache context" was confirmed by using gdb.





Thank you.
Unfortunately expression_tree_walker is not consistent with copyObject:
I tried to use this walker to destroy the raw parse tree of an
autoprepared statement, but it looks like some nodes are not visited by
expression_tree_walker. I had to create a separate memory context for
each autoprepared statement.

A new version of the patch is attached.
diff --git a/doc/src/sgml/autoprepare.sgml b/doc/src/sgml/autoprepare.sgml
new file mode 100644
index 000..b3309bd
--- /dev/null
+++ b/doc/src/sgml/autoprepare.sgml
@@ -0,0 +1,62 @@
+
+
+ 
+  Autoprepared statements
+
+  
+   autoprepared statements
+  
+
+  
+PostgreSQL makes it possible to prepare
+frequently used statements to eliminate the cost of their compilation
+and optimization on each execution of the query. On simple queries
+(like the ones in pgbench -S), using prepared statements
+increases performance by more than two times.
+  
+
+  
+Unfortunately, not all database applications use prepared statements
+and, moreover, it is not always possible. For example, when using
+pgbouncer or any other session pooler,
+there is no session state (transactions of one client may be executed at
+different backends) and so prepared statements cannot be used.
+  
+
+  
+Autoprepare mode makes it possible to overcome this limitation.
+In this mode Postgres tries to generalize executed statements
+and build a parameterized plan for them. The speed of execution of
+autoprepared statements is almost the same as that of explicitly
+prepared statements.
+  
+
+  
+By default autoprepare mode is switched off. To enable it, assign a non-zero
+value to the GUC variable autoprepare_threshold.
+This variable specifies the minimal number of times the statement should be
+executed before it is autoprepared. Please notice that, regardless of the
+value of this parameter, Postgres makes a decision about using the
+generalized plan vs. customized execution plans based on the results
+of comparing the average time of five customized plans with the
+time of the generalized plan.
+  
+
+  
+If the number of different statements issued by an application is large
+enough, then autopreparing all of them can cause memory overflow
+(especially if there are many active clients, because the prepared statement
+cache is local to the backend). To prevent growth of the backend's memory
+because of the autoprepare cache, it is possible to limit the number of
+autoprepared statements by setting the autoprepare_limit GUC variable. An
+LRU strategy will be used to keep the most frequently used queries in memory.
+  
+
+  
+It is possible to inspect autoprepared queries in the backend using
+the pg_autoprepared_statements view. It shows the original text of the
+query, the types of the extracted parameters (replacing literals) and
+the query execution counter.
+  
+
+ 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index fffb79f..26f561d 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8230,6 +8230,11 @@ SCRAM-SHA-256$iteration count:
  
 
  
+  pg_autoprepared_statements
+  autoprepared statements
+ 
+
+ 
   pg_prepared_xacts
   prepared transactions
  
@@ -9541,6 +9546,68 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
  
 
+
+ 
+  

Re: Ideas for a relcache test mode about missing invalidations

2018-08-02 Thread Peter Geoghegan
On Thu, Aug 2, 2018 at 3:18 AM, Kyotaro HORIGUCHI
 wrote:
>> [1] 
>> http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com
>> [2] https://www.postgresql.org/message-id/12259.1532117...@sss.pgh.pa.us
>
> As for [1], it is not an issue of invalidation. It also happens if
> the relation has any index, and even a drop is not needed. The
> following steps are sufficient.
>
> create table t( pk serial, t text );
> insert into t( t ) values( 'hello' ), ('world');
> create index idx_fake0 on t (pk);
> create index idx_fake on t ( f_fake( pk ) ); -- ERROR

The fact that there was a weird error wasn't really what we cared
about there. If the user is doing something that's clearly
unreasonable or nonsensical, then they cannot expect much from the
error message (maybe we could do better, but it's not a priority).
What we really cared about was the fact that it was possible to make a
backend's relcache irrecoverably corrupt. That should never be allowed
to happen, even when the user is determined to do something
unreasonable.

-- 
Peter Geoghegan



Re: Alter index rename concurrently to

2018-08-02 Thread Andres Freund
On 2018-08-02 16:51:10 -0400, Tom Lane wrote:
> Robert Haas  writes:
> > [ reasons why DDL under less than AEL sucks ]
> 
> Unfortunately, none of these problems are made to go away with an
> AcceptInvalidationMessages at statement start.  That just makes the
> window smaller.  But DDL effects could still be seen - or not -
> partway through a statement, with just as much ensuing hilarity
> as in your example.  Maybe more.

I think it's a lot easier to explain that every newly issued statement
sees the effect of DDL, and already running statements may see it, than
something else.  I don't agree that parse analysis is a good hook to
force that, as I've written.


> The real issue here, and the reason why I'm very unhappy with the mad rush
> in certain quarters to try to reduce locking levels for DDL, is exactly
> that it generally results in uncertainty about when the effects will be
> seen.  I do not think your proposal does much more than put a fig leaf
> over that problem.

I think it's a significant issue operationally, which is why that
pressure exists. I don't know what makes it a "mad rush", given these
discussions have been going on for *years*?

Greetings,

Andres Freund



Re: Alter index rename concurrently to

2018-08-02 Thread Tom Lane
Robert Haas  writes:
> [ reasons why DDL under less than AEL sucks ]

Unfortunately, none of these problems are made to go away with an
AcceptInvalidationMessages at statement start.  That just makes the
window smaller.  But DDL effects could still be seen - or not -
partway through a statement, with just as much ensuing hilarity
as in your example.  Maybe more.

The real issue here, and the reason why I'm very unhappy with the mad rush
in certain quarters to try to reduce locking levels for DDL, is exactly
that it generally results in uncertainty about when the effects will be
seen.  I do not think your proposal does much more than put a fig leaf
over that problem.

Barring somebody having a great idea about resolving that, I think we
just need to be very clear that any DDL done with less than AEL has
exactly this issue.  In the case at hand, couldn't we just document
that "the effects of RENAME CONCURRENTLY may not be seen in other
sessions right away; at worst, not till they begin a new transaction"?
If you don't like that, don't use CONCURRENTLY.

regards, tom lane



Re: Alter index rename concurrently to

2018-08-02 Thread Andres Freund
Hi,

On 2018-08-02 16:30:42 -0400, Robert Haas wrote:
> With this change, we'd be able to say that new statements will
> definitely see the results of concurrent DDL.

I don't think it really gets you that far. Because you're suggesting to
do it at the parse stage, you suddenly have issues with prepared
statements. Some client libraries, and a lot more frameworks,
automatically use prepared statements when they see the same query text
multiple times.  And this'll not improve anything on that.

ISTM, if you want to increase consistency in this area, you've to go
further. E.g. processing invalidations in StartTransactionCommand() in
all states, which'd give you a lot more consistency.

Greetings,

Andres Freund



Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters)

2018-08-02 Thread Peter Geoghegan
On Tue, Jul 17, 2018 at 10:42 PM, Simon Riggs  wrote:
> If we knew that we were never going to do deletes/non-HOT updates from
> the table we could continue to use the existing mechanism, otherwise
> we will be better off to use sorted index entries. However, it does
> appear as if keeping entries in sorted order would be a win on
> concurrency from reduced block contention on the first few blocks of
> the index key, so it may also be a win in cases where there are heavy
> concurrent inserts but no deletes.

I think so too. I also expect a big reduction in the number of FPIs in
the event of many duplicates.

> I hope we can see a patch that just adds the sorting-by-TID property
> so we can evaluate that aspect before we try to add other more
> advanced index ideas.

I can certainly see why that's desirable. Unfortunately, it isn't that simple.

If I want to sort on heap TID as a tie-breaker, I cannot cut any
corners. That is, it has to be just another column, at least as far as
the implementation is concerned (heap TID will have a different
representation in internal pages and leaf high keys, but nonetheless
it's essentially just another column in the keyspace). This means that
if I don't have suffix truncation, I'll regress performance in many
important cases that have no compensating benefit (e.g. pgbench).
There is no point in trying to assess that.

It is true that I could opt to only "logically truncate" the heap TID
attribute during a leaf page split (i.e. there'd only be "logical
truncation", which is to say there'd only be the avoidance of adding a
heap TID to the new high key, and never true physical truncation of
user attributes). But doing only that much saves very little code,
since the logic for assessing whether or not it's safe to avoid adding
a new heap attribute (whether or not we logically truncate) still has
to involve an insertion scankey. It seems much more natural to do
everything at once. Again, the heap TID attribute is more or less just
another attribute. Also, the code for doing physical suffix truncation
already exists from the covering/INCLUDE index commit.

I'm currently improving the logic for picking a page split in light of
suffix truncation, which I've been working on for weeks now. I had
something that did quite well with the initial index sizes for TPC-C
and TPC-H, but then realized I'd totally regressed the motivating
example with many duplicates that I started this thread with. I now
have something that does both things well, which I'm trying to
simplify. Another thing to bear in mind is that page split logic for
suffix truncation also helps space utilization on the leaf level. I
can get the main TPC-C order_line pkey about 7% smaller with true
suffix truncation, even though the internal page index tuples can
never be any smaller due to alignment, and even though there are no
duplicates that would otherwise make the implementation "get tired".

Can I really fix space utilization in a piecemeal fashion? I strongly doubt it.

-- 
Peter Geoghegan



Re: Alter index rename concurrently to

2018-08-02 Thread Robert Haas
On Thu, Aug 2, 2018 at 4:02 PM, Andres Freund  wrote:
>> Inserting AcceptInvalidationMessages() in some location that
>> guarantees it will be executed at least once per SQL statement.  I
>> tentatively propose the beginning of parse_analyze(), but I am open to
>> suggestions.
>
> I'm inclined to think that that doesn't really actually solve anything,
> but makes locking issues harder to find, because the window is smaller,
> but decidedly non-zero.  Can you describe why this'd make things more
> "predictable" precisely?

Sure.  I'd like to be able to explain in the documentation in simple
words when a given DDL change is likely to take effect.  For changes
made under AccessExclusiveLock, there's really nothing to say: the
change will definitely take effect immediately.  If you're able to do
anything at all with the relevant table, you must have got some kind
of lock on it, and that means you must have done
AcceptInvalidationMessages(), and so you will definitely see the
change.  With DDL changes made under less than AccessExclusiveLock,
the situation is more complicated.  Right now, we can say that a new
transaction will definitely see the changes, because it will have to
acquire a lock on that relation which it doesn't already have and will
thus have to do AcceptInvalidationMessages().  A new statement within
the same transaction may see the changes, or it may not.  If it
mentions any relations not previously mentioned or if it does
something like UPDATE a relation where we previously only did a
SELECT, thus triggering a new lock acquisition, it will see the
changes.  If a catchup interrupt gets sent to it, it will see the
changes.  Otherwise, it won't.  It's even possible that we'll start to
see the changes in the middle of a statement, because of a sinval
reset or something.  To summarize, at present, all we can really say
is that changes made by concurrent DDL which doesn't take AEL may or
may not affect already-running queries, may or may not affect new
queries, and will affect new transactions.

With this change, we'd be able to say that new statements will
definitely see the results of concurrent DDL.  That's a clear, easy to
understand rule which I think users will like.  It would be even
better if we could say something stronger, e.g. "Concurrent DDL will
affect new SQL statements, but not those already in progress."  Or
"Concurrent DDL will affect new SQL statements, but SQL statements
that are already running may take up to 10 seconds to react to the
changes".  Or whatever.  I'm not sure there's really a way to get to a
really solid guarantee, but being able to tell users that, at a
minimum, the next SQL statement will notice that things have changed
would be good.  Otherwise, we'll have conversations like this:

User: I have a usage pattern where I run a DDL command to rename an
object, and then in another session I tried to refer to it by the new
name, and it sometimes it works and sometimes it doesn't.  Why does
that happen?

Postgres Expert: Well, there are several possible reasons.  If you
already had a transaction in progress in the second window and it
already had a lock on the object strong enough for the operation you
attempted to perform and no sinval resets occurred and nothing else
triggered invalidation processing, then it would still know that
object under the old name.  Otherwise it would know about it under the
new name.

User: Well, I did have a transaction open and it may have had some
kind of lock on that object already, but how do I know whether
invalidation processing happened?

Postgres Expert: There's really no way to know.  If something else
on the system were doing a lot of DDL operations, then it might fill
up the invalidation queue enough to trigger catchup interrupts or
sinval resets, but if not, it could be deferred for an arbitrarily
long period of time.

User: So you're saying that if I have two PostgreSQL sessions, and I
execute the same commands in those sessions in the same order, just
like the isolationtester does, I can get different answers depending
on whether some third session creates a lot of unrelated temporary
tables in a different database while that's happening?

Postgres Expert: Yes.

I leave it to your imagination to fill in what this imaginary user
will say next, but I bet it will be snarky.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Alter index rename concurrently to

2018-08-02 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 1, 2018 at 3:36 PM, Andres Freund  wrote:
>> What precisely are you proposing?

> Inserting AcceptInvalidationMessages() in some location that
> guarantees it will be executed at least once per SQL statement.  I
> tentatively propose the beginning of parse_analyze(), but I am open to
> suggestions.

What will that accomplish that the existing call in transaction start
doesn't?  It might make the window for concurrency issues a bit narrower,
but it certainly doesn't close it.

regards, tom lane



Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

2018-08-02 Thread Tom Lane
Robert Haas  writes:
> Does anyone else want to weigh in on this?  It seems to me that there
> are several people who are quite willing to complain about the fact
> that this hasn't been fixed, but not willing to express an opinion
> about the shape of the fix.  Either the RMT needs to take executive
> action, or we need some more votes.

[ reads thread ... ]

Well, my vote is that I've got zero faith in either of these patches.

I do not trust Ashutosh's patch because of the point you noted that it
will kick in on ConvertRowtypeExprs that are not related to partitioning.
It's also got a rather fundamental conceptual (or at least documentation)
error in that it says it's making pull_var_clause do certain things to
all ConvertRowtypeExprs, but that's not what the code actually does.
I think the tension around that has a lot to do with the fact that the
patch went through so many revisions, and I don't have any faith that
it's right even yet.  As you mentioned upthread, this might be insoluble
without some representational change for converted whole-row Vars.

As for Etsuro-san's patch, I don't like that for the same reasons you
gave.  Also, I note that back towards the beginning of July it was
said that that patch depends on the assumption of no Gather below
an Append.  That's broken *now*, not in some hypothetical future.
(See the nearby "FailedAssertion on partprune" thread, wherein we're
trying to fix the planner's handling of exactly such plans.)

What I'm thinking might be a more appropriate thing, at least for
getting v11 out the door, is to refuse to generate partitionwise
joins when any whole-row vars are involved.  (Perhaps that's not
enough to dodge all the problems, though?)

Now, that's a bit of a problem for postgres_fdw, because it seems to
insist on injecting WRVs even when the query text does not require any.
Why is that, and can't we get rid of it?  It's horrid for performance
even without any of these other considerations.  But if we fail to get
rid of that in time for v11, it would mean that postgres_fdw can't
participate in PWJ, which isn't great but I think we could live with it
for now.

BTW, what exactly do we think the production status of PWJ is, anyway?
I notice that upthread it was stated that enable_partitionwise_join
defaults to off, as indeed it still does, and we'd turn it on later
when we'd gotten rid of some memory-hogging problems.  If that hasn't
happened yet (and I don't see any open item about considering enabling
PWJ by default for v11), then I have exactly no hesitation about
lobotomizing PWJ as hard as we need to to mask this problem for v11.
I'd certainly prefer a solution along those lines to either of these
patches.

regards, tom lane



Re: Standby trying "restore_command" before local WAL

2018-08-02 Thread Robert Haas
On Wed, Aug 1, 2018 at 7:14 AM, Emre Hasegeli  wrote:
>> There's still a question here, at least from my perspective, as to which
>> is actually going to be faster to perform recovery based off of.  A good
>> restore command, which pre-fetches the WAL in parallel and gets it local
>> and on the same filesystem, meaning that the restore_command only has to
>> execute essentially a 'mv' and return back to PG for the next WAL file,
>> is really rather fast, compared to streaming that same data over the
>> network with a single TCP connection to the primary.  Of course, there's
>> a lot of variables there and it depends on the network speed between the
>> various pieces, but I've certainly had cases where a replica catches up
>> much faster using restore command than streaming from the primary.
>
> Trying "restore_command" before streaming replication is totally fine.
> It is not likely that the same WAL would be on both places anyway.
>
> My problem is trying "restore_command" before the local WAL.  I
> understand the historic reason of this design, but I don't think it is
> expected behavior to anybody who is using "restore_command" together
> with streaming replication.

Right.  I don't really understand the argument that this should be
controlled by a GUC.  I could see having a GUC to choose between
archiving-first and streaming-first, but if it's safe to use the WAL
we've already got in pg_xlog, it seems like that should take priority
over every other approach.  The comments lend themselves to a certain
amount of doubt over whether we can actually trust the contents of
pg_xlog, but if we can't, it seems like we just shouldn't use it - at
all - ever.  It doesn't make much sense to me to say "hey, pg_xlog
might have evil bad WAL in it that we shouldn't replay, so let's look
for the same WAL elsewhere first, but then if we don't find it, we'll
just replay the bad stuff."  I might be missing something, but that
sounds a lot like "hey, this mysterious gloop I found might be rat
poison, so let me go check if there's some fruit in the fruit bowl,
but if I don't find any, I'm just going to eat the mysterious gloop."
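For concreteness, a minimal sketch of the kind of standby configuration
being discussed; the archive path and connection values are hypothetical,
and on the versions discussed these settings live in recovery.conf:

# recovery.conf -- hypothetical values
standby_mode = 'on'
restore_command = 'cp /mnt/wal_archive/%f %p'                 # archive source
primary_conninfo = 'host=primary.example.com user=replicator' # streaming source
# The question in this thread is where WAL already present in the local
# pg_xlog directory should rank relative to these two sources.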

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: doc - add missing documentation for "acldefault"

2018-08-02 Thread Arthur Zakirov
On Wed, Aug 01, 2018 at 04:41:44PM +0300, Pavel Luzanov wrote:
> postgres=# \df acl*
>                                          List of functions
>    Schema   |    Name     | Result data type |                  Argument data types
> ------------+-------------+------------------+------------------------------------------------------
>  pg_catalog | aclcontains | boolean          | aclitem[], aclitem
>  pg_catalog | acldefault  | aclitem[]        | "char", oid
>  pg_catalog | aclexplode  | SETOF record     | acl aclitem[], OUT grantor oid, OUT grantee oid, OUT privilege_type text, OUT is_grantable boolean
>  pg_catalog | aclinsert   | aclitem[]        | aclitem[], aclitem
>  pg_catalog | aclitemeq   | boolean          | aclitem, aclitem
>  pg_catalog | aclitemin   | aclitem          | cstring
>  pg_catalog | aclitemout  | cstring          | aclitem
>  pg_catalog | aclremove   | aclitem[]        | aclitem[], aclitem

Some of them are definitely internal. For example, aclitemin and
aclitemout are the input and output functions, respectively, of the
aclitem data type. I think they don't need to be documented and shouldn't
have a "pg_" prefix, as they were created to maintain the aclitem data type.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Alter index rename concurrently to

2018-08-02 Thread Andres Freund
Hi,

On 2018-08-02 15:57:13 -0400, Robert Haas wrote:
> On Wed, Aug 1, 2018 at 3:36 PM, Andres Freund  wrote:
> >> Right.  If nobody sees a reason not to change that, I think we should.
> >> It would make the behavior more predictable with, I hope, no real
> >> loss.
> >
> > What precisely are you proposing?
> 
> Inserting AcceptInvalidationMessages() in some location that
> guarantees it will be executed at least once per SQL statement.  I
> tentatively propose the beginning of parse_analyze(), but I am open to
> suggestions.

I'm inclined to think that that doesn't actually solve anything, but
makes locking issues harder to find, because the window gets smaller
but stays decidedly non-zero.  Can you describe precisely why this
would make things more "predictable"?

Greetings,

Andres Freund



Re: Alter index rename concurrently to

2018-08-02 Thread Robert Haas
On Wed, Aug 1, 2018 at 3:36 PM, Andres Freund  wrote:
>> Right.  If nobody sees a reason not to change that, I think we should.
>> It would make the behavior more predictable with, I hope, no real
>> loss.
>
> What precisely are you proposing?

Inserting AcceptInvalidationMessages() in some location that
guarantees it will be executed at least once per SQL statement.  I
tentatively propose the beginning of parse_analyze(), but I am open to
suggestions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: FailedAssertion on partprune

2018-08-02 Thread Robert Haas
David Rowley  writes:
>> It's probably best discussed on the other thread, but it seems to be
>> by design in accumulate_append_subpath(). Parallel Append nodes don't
>> get flattened if they contain a mix of parallel aware and non-parallel
>> aware subplans.

I don't really understand the issue at hand, but let me just make a
comment about accumulate_append_subpath().  If we have a regular
Append on top of another regular Append or on top of a MergeAppend, we
can flatten the lower (Merge)Append into the upper one.  No problem.
If, however, the lower path is a Parallel Append, we can't.  Consider
this plan:

Gather
-> Append
   -> Partial Subpath #1
   -> Parallel Append
      -> Partial Subpath #2
      -> Non-partial Subpath

Making Partial Subpath #2 a child of the Append rather than the
Parallel Append would be correct; the only downside is that we'd lose
the nice worker-balancing stuff that the Parallel Append does.
However, making Non-partial Subpath a child of the Append rather than
the Parallel Append would be dead wrong, because the Parallel Append
will make sure that it only gets executed by one of the cooperating
processes, and the regular Append won't.  If the upper Append were
changed to a Parallel Append, then we could flatten the whole thing
just fine, as long as we're careful to keep partial paths and
non-partial paths separated.  Also, if the situation is reversed, with
an upper Parallel Append and a regular Append beneath it, that's
always flatten-able.

> Yeah, looking at the explain posted upthread, the issue is specifically
> that some of the child paths might be for Gathers over subsets of the
> partitioning hierarchy.  It's not real clear to me whether such a subset
> would necessarily correspond to a subtree of the hierarchy, but if it
> does, you'd want to be able to apply run-time pruning at the parent
> Append, in hopes of not having to execute the Gather at all.  Separately
> from that, if you do execute the Gather, applying runtime pruning within
> its child Append is also a good thing.

That sounds right to me.  If I'm not mistaken, in that EXPLAIN, the
radicado2012 and radicado2017 partitions are ineligible for parallel
query for some reason, maybe a reloption setting, but the other
partitions are all eligible, or rather, their subpartitions are
eligible.  If they were all OK for parallel query, the plan would
probably just end up with Gather over Parallel Append over individual
sample scans.
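If the reloption guess is right, that shape could be reproduced with
something like the following; the partition names are taken from the
EXPLAIN upthread, and whether this is what actually happened there is
an assumption:

-- Mark individual partitions as ineligible for parallel scans
ALTER TABLE radicado2012 SET (parallel_workers = 0);
ALTER TABLE radicado2017 SET (parallel_workers = 0);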

In general, I think the only ways to get an Append are inheritance,
partitioning, and set operations.  If an upper Append is due to
partitioning or inheritance, then I think that any (Merge)Append nodes
beneath it must be a subtree of the partitioning or inheritance
hierarchy, respectively.  Partitioning and inheritance can't be mixed,
and set operations can't occur below partition expansion.  However, if
the upper Append was created by a set operation, then its child Append
nodes could be from any of those causes.  For instance, it seems
possible to imagine an upper Append that is implementing EXCEPT and a
lower Append that is implementing both partition expansion and UNION
ALL.
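A minimal sketch of that last shape, using hypothetical tables with
"parts" being partitioned:

-- The upper Append implements EXCEPT; the lower Append implements both
-- partition expansion of "parts" and the UNION ALL.
SELECT id FROM parts
UNION ALL
SELECT id FROM plain_tab
EXCEPT
SELECT id FROM excluded_ids;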

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Bug in to_timestamp().

2018-08-02 Thread Arthur Zakirov
On Thu, Aug 02, 2018 at 06:17:01PM +0300, Alexander Korotkov wrote:
> So, if the number of spaces/separators between fields in the input
> string is more than in the format string and the last space/separator
> skipped is a minus sign, then it decides to glue that minus sign to
> TZH.  I think this
> behavior is nice to follow, because it allows us to skip spaces after
> fields.

Good approach! With this change the following queries work now:

=# SELECT to_timestamp('2000 + JUN', 'YYYY MON');
=# SELECT to_timestamp('2000 +JUN', 'YYYY MON');

I think it is worth adding them to the regression tests.

> 4) Spaces and separators in format string seem to be handled in the
> same way in non-FX mode.  But strange things happen when you mix
> spaces and separators there.
> ...
> 1+2+3 are implemented in the attached patch, but not 4.  I think that
> it's only worth following Oracle when it does reasonable things.

I agree with you. I think it isn't worth copying this behaviour.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [PATCH] Add regress test for pg_read_all_stats role

2018-08-02 Thread Michael Paquier
On Thu, Aug 02, 2018 at 06:25:14PM +0100, Alexandra Ryzhevich wrote:
> I have noticed that there is no regression tests for default monitoring
> roles such as pg_read_all_stats.

A bug has been recently fixed in this area, see 0c8910a0, so having more
coverage would be welcome.  25fff40 is the original commit which
introduced pg_read_all_stats.

Your patch has a couple of problems, though:
- database creation is costly, those should not be part of the main
regression test suite.
- Any roles created should use "regress_" as prefix.
- You should also look at negative tests which trigger "must be
superuser or a member of pg_read_all_settings", like say a "SHOW
stats_temp_directory" with a non-privileged user; see the sketch below.
--
Michael


signature.asc
Description: PGP signature


Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

2018-08-02 Thread Andres Freund
Hi,

On 2018-08-02 15:19:49 -0400, Robert Haas wrote:
> Does anyone else want to weigh in on this?  It seems to me that there
> are several people who are quite willing to complain about the fact
> that this hasn't been fixed, but not willing to express an opinion
> about the shape of the fix.  Either the RMT needs to take executive
> action, or we need some more votes.

(taking my RMT hat off, as it's not been coordinated)

Well, I think it's enough in the weeds that not too many people are
going to have an informed opinion on this, and I'm not sure uninformed
opinions help. But I do think the fact that one of the authors and the
committer are on one side, and Tom seems to tentatively agree, weighs
quite a bit here.  As far as I can tell, it's "just" Etsuro on the other
side - obviously his opinion counts, but it doesn't outweigh others.  I
think if you feel confident, you should be OK just committing the
approach you favor, and I read you as being confident that that's the
better approach.

Greetings,

Andres Freund



Re: Allow COPY's 'text' format to output a header

2018-08-02 Thread Simon Muller
On 2 August 2018 at 17:07, Cynthia Shang 
wrote:

>
> > On Aug 2, 2018, at 8:11 AM, Daniel Verite 
> wrote:
> >
> > That makes sense, thanks for elaborating, although there are also
> > a fair number of ERRCODE_FEATURE_NOT_SUPPORTED in copy.c
> > that are raised on forbidden/nonsensical combination of features,
> > so the consistency argument could work both ways.
> >
>
> If there is not a strong reason to change the error code, then I believe
> we should not. The error is the same as it was before, just narrower in
> scope.
>
> Best,
> -Cynthia


Sure, thanks both for the feedback. Attached is a patch with the error kept
as ERRCODE_FEATURE_NOT_SUPPORTED.

--
Simon Muller


text_header_v5.patch
Description: Binary data


Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

2018-08-02 Thread Robert Haas
On Thu, Aug 2, 2018 at 7:20 AM, Etsuro Fujita
 wrote:
> The new approach seems to me more localized than what Ashutosh had proposed.
> One obvious advantage of the new approach is that we no longer need changes
> to postgres_fdw for it to work for partitionwise joins, which would apply
> for any other FDWs, making the FDW authors' life easy.

Well, I don't know what to say here except that I don't agree.  I
think Ashutosh's approach is a pretty natural extension of the system
that we have now.  It involves needing to handle converted whole-row
vars in some new places, most of which were handled in the original
patch, but postgres_fdw was missed.  Your approach involves changing
the meaning of the target list, but only in narrow corner cases.  I
completely disagree that we can say it's less invasive.  It may indeed
be less work for extension authors, though, perhaps at the expense of
moving the conversion from the remote server to the local one.

>> But in general, with your approach, any code that
>> looks at the tlist for a child-join has to know that the tlist is to
>> be used as-is *except that* ConvertRowtypeExpr may need to be
>> inserted.  I think that special case could be easy to overlook, and it
>> seems pretty arbitrary.
>
> I think we can check to see whether the conversion is needed, from the flags
> added to RelOptInfo.

Sure, I don't dispute that it can be made to work.  I just think it's
an ugly kludge.

> I have to admit that it's weird to adjust the child's targetlist in that
> case when putting the child under the parent's Append/MergeAppend, but IMHO
> I think that would be much localized.

Yeah, I just don't agree at all.

Does anyone else want to weigh in on this?  It seems to me that there
are several people who are quite willing to complain about the fact
that this hasn't been fixed, but not willing to express an opinion
about the shape of the fix.  Either the RMT needs to take executive
action, or we need some more votes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Documentaion fix.

2018-08-02 Thread Michael Paquier
On Wed, Aug 01, 2018 at 01:04:37PM +0900, Kyotaro HORIGUCHI wrote:
> The query and the result with four columns fit the current width.

Just wondering, what is your reason behind the addition of restart_lsn?
This part of the documentation focuses on slot creation, so slot_name,
slot_type and active would be representative enough, no?
--
Michael


signature.asc
Description: PGP signature


Re: BUG #15182: Canceling authentication due to timeout aka Denial of Service Attack

2018-08-02 Thread Michael Paquier
On Wed, Aug 01, 2018 at 10:55:11AM +, Ahsan Hadi wrote:
> Can you rebase the reindex-priv-93.patch, it is getting some hunk failures.

Patch reindex-priv-93.patch is for REL9_3_STABLE, where it applies
cleanly for me.  reindex-priv-94.patch is for REL9_4_STABLE and
reindex-priv-95-master.patch is for 9.5~master.
--
Michael


signature.asc
Description: PGP signature


Re: Should contrib modules install .h files?

2018-08-02 Thread Andrew Gierth
> "Andres" == Andres Freund  writes:

 Tom> There's also a question of whether we need to change anything in
 Tom> contrib/ so that it plays by whatever rules we set.  There's an
 Tom> expectation that contrib modules should be buildable with PGXS,
 Tom> so they need to follow the rules.

 Andrew> ... that at least all of the *_plperl transform modules in
 Andrew> contrib/ fail to build with USE_PGXS already (i.e. for as long
 Andrew> as they have ever existed), because they rely on
 Andrew> plperl_helpers.h which is never installed anywhere, and trying
 Andrew> to get it via $(top_srcdir) obviously can't work in PGXS.

 Andrew> Haven't tried the python ones yet.

 >> And none of the plpython transforms can even parse their makefiles
 >> with USE_PGXS, let alone build, because they have an "include"
 >> directive pointing into src/pl/plpython.

 Andres> FWIW, I'd be perfectly on board with just burying this policy.
 Andres> Designating one contrib module (or something in
 Andres> src/test/examples or such) as a PGXS example, and always
 Andres> building it with pgxs seems like it'd do a much better job than
 Andres> the current policy.

I suggest that modules that actually _can't_ build with PGXS should have
the PGXS logic removed from their makefiles (perhaps replaced with an
error). But for the rest, it's a convenience to be able to build single
modules using USE_PGXS without having to configure the whole source tree.

-- 
Andrew (irc:RhodiumToad)



Re: Should contrib modules install .h files?

2018-08-02 Thread Tom Lane
Andres Freund  writes:
> On 2018-08-02 19:13:05 +0100, Andrew Gierth wrote:
>> And none of the plpython transforms can even parse their makefiles with
>> USE_PGXS, let alone build, because they have an "include" directive
>> pointing into src/pl/plpython.

> FWIW, I'd be perfectly on board with just burying this policy. Designating
> one contrib module (or something in src/test/examples or such) as a PGXS
> example, and always building it with pgxs seems like it'd do a much
> better job than the current policy.

No, I think that'd be pretty wrongheaded.  One of the big reasons for
contrib to exist is to serve as a testbed proving that our extension
features actually work.  What you suggest would reduce the scope of
that testing, which seems like the wrong direction to be going in.

It's particularly bad for these cases, since what they demonstrate is
that it's impossible to build transform modules for plperl or plpython
out-of-tree at the moment.  That doesn't seem to me to be something
we should just ignore; it goes against not only our larger commitment
to extensibility, but also the entire point of the transforms feature.

regards, tom lane



Re: Should contrib modules install .h files?

2018-08-02 Thread Andres Freund
On 2018-08-02 19:13:05 +0100, Andrew Gierth wrote:
> > "Andrew" == Andrew Gierth  writes:
> 
>  Tom> There's also a question of whether we need to change anything in
>  Tom> contrib/ so that it plays by whatever rules we set.  There's an
>  Tom> expectation that contrib modules should be buildable with PGXS,
>  Tom> so they need to follow the rules.
> 
>  Andrew> ... that at least all of the *_plperl transform modules in
>  Andrew> contrib/ fail to build with USE_PGXS already (i.e. for as long
>  Andrew> as they have ever existed), because they rely on
>  Andrew> plperl_helpers.h which is never installed anywhere, and trying
>  Andrew> to get it via $(top_srcdir) obviously can't work in PGXS.
> 
>  Andrew> Haven't tried the python ones yet.
> 
> And none of the plpython transforms can even parse their makefiles with
> USE_PGXS, let alone build, because they have an "include" directive
> pointing into src/pl/plpython.

FWIW, I'd be perfectly on board with just burying this policy. Designating
one contrib module (or something in src/test/examples or such) as a PGXS
example, and always building it with pgxs seems like it'd do a much
better job than the current policy.

Greetings,

Andres Freund



Re: Should contrib modules install .h files?

2018-08-02 Thread Andrew Gierth
> "Andrew" == Andrew Gierth  writes:

 Tom> There's also a question of whether we need to change anything in
 Tom> contrib/ so that it plays by whatever rules we set.  There's an
 Tom> expectation that contrib modules should be buildable with PGXS,
 Tom> so they need to follow the rules.

 Andrew> ... that at least all of the *_plperl transform modules in
 Andrew> contrib/ fail to build with USE_PGXS already (i.e. for as long
 Andrew> as they have ever existed), because they rely on
 Andrew> plperl_helpers.h which is never installed anywhere, and trying
 Andrew> to get it via $(top_srcdir) obviously can't work in PGXS.

 Andrew> Haven't tried the python ones yet.

And none of the plpython transforms can even parse their makefiles with
USE_PGXS, let alone build, because they have an "include" directive
pointing into src/pl/plpython.

-- 
Andrew (irc:RhodiumToad)



Re: [HACKERS] Bug in to_timestamp().

2018-08-02 Thread Alexander Korotkov
On Thu, Aug 2, 2018 at 6:17 PM Alexander Korotkov
 wrote:
> After some experiments I found that when you mix spaces and separators
> between two fields, then Oracle takes into account only length of last
> group of spaces/separators.
>
> # SELECT to_timestamp('2018- -01 02', 'YYYY   --- --MM-DD') FROM dual
> 2018-01-01 00:00:00 -10:00
> (length of last spaces/separators group is 2)
>
> # SELECT to_timestamp('2018- -01 02', 'YYYY   --- --MM-DD') FROM dual
> 2018-01-01 00:00:00 -10:00
> (length of last spaces/separators group is 3)
>
> # SELECT to_timestamp('2018- -01 02', 'YYYY   -- ---MM-DD') FROM dual
> 02.01.2018 00:00:00
> (length of last spaces/separators group is 2)

Ooops... I'm sorry, but I've posted wrong results here.  Correct
version is here.

# SELECT to_timestamp('2018- -01 02', 'YYYY   --- --MM-DD') FROM dual
ORA-01843: not a valid month
(length of last spaces/separators group is 2)

# SELECT to_timestamp('2018- -01 02', 'YYYY   -- ---MM-DD') FROM dual
02.01.2018 00:00:00
(length of last spaces/separators group is 3)

So the length of the last group of spaces/separators in the pattern
should be greater than or equal to the length of the spaces/separators
in the input string; the previous groups are ignored in Oracle.  And
that seems ridiculous to me.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Should contrib modules install .h files?

2018-08-02 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> Maybe this all just works without much thought, but given that
 Tom> smart people like Peter E. seem to be unsure of that, I'd sure
 Tom> like to see a concrete set of rules that extensions should follow
 Tom> for this.

I'll comment on the more substantive stuff later since I just noticed a
few relevant points that I need to investigate. But while investigating,
I found...

 Tom> There's also a question of whether we need to change anything in
 Tom> contrib/ so that it plays by whatever rules we set.  There's an
 Tom> expectation that contrib modules should be buildable with PGXS,
 Tom> so they need to follow the rules.

... that at least all of the *_plperl transform modules in contrib/ fail
to build with USE_PGXS already (i.e. for as long as they have ever
existed), because they rely on plperl_helpers.h which is never installed
anywhere, and trying to get it via $(top_srcdir) obviously can't work in
PGXS.

Haven't tried the python ones yet.

-- 
Andrew.



Re: FailedAssertion on partprune

2018-08-02 Thread Robert Haas
On Thu, Aug 2, 2018 at 1:54 PM, Tom Lane  wrote:
> (EXPLAIN isn't a very adequate tool for testing this, since
> it fails to show anything about what pruning tests are attached to an
> Append.  I wonder if we need to create something that does show that.)

I've asked for that more than once and still think it's a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: FailedAssertion on partprune

2018-08-02 Thread Tom Lane
[ bringing this discussion back to the original thread ]

David Rowley  writes:
> The attached patch removes the Assert, but I think it should be
> probably be done as part of [1]'s patch since that's also adding the
> code to handle subplans for tables that don't belong in the partition
> hierarchy.

Over in that thread, we had

David Rowley  writes:
> On 2 August 2018 at 11:48, Tom Lane  wrote:
>> I didn't include this change because (a) it's late, (b) no test
>> case was included, and (c) I don't entirely believe it anyway.
>> How would a rel be both leaf and nonleaf?  Isn't this indicative
>> of a problem somewhere upstream in the planner?

> It's probably best discussed on the other thread, but it seems to be
> by design in accumulate_append_subpath(). Parallel Append nodes don't
> get flattened if they contain a mix of parallel aware and non-parallel
> aware subplans.

Yeah, looking at the explain posted upthread, the issue is specifically
that some of the child paths might be for Gathers over subsets of the
partitioning hierarchy.  It's not real clear to me whether such a subset
would necessarily correspond to a subtree of the hierarchy, but if it
does, you'd want to be able to apply run-time pruning at the parent
Append, in hopes of not having to execute the Gather at all.  Separately
from that, if you do execute the Gather, applying runtime pruning within
its child Append is also a good thing.

So what's unclear to me is whether removing that Assert is sufficient
to make all of that work nicely.  I'm concerned in particular that the
planner might get confused about which pruning tests to apply at each
level.  (EXPLAIN isn't a very adequate tool for testing this, since
it fails to show anything about what pruning tests are attached to an
Append.  I wonder if we need to create something that does show that.)

I think it'd be a real good idea to have a test case to exercise this.
Jaime's presented test case is too expensive to consider as a regression
test, but I bet that a skinnied-down version could be made to work
once you'd set all the parallelization parameters to liberal values,
like in select_parallel.sql.
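For reference, the liberal settings in select_parallel.sql look roughly
like this, so a skinnied-down test could start from the same values:

SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;
SET max_parallel_workers_per_gather = 4;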

regards, tom lane



Re: Should contrib modules install .h files?

2018-08-02 Thread Robert Haas
On Thu, Aug 2, 2018 at 12:56 PM, Andres Freund  wrote:
>> On the other hand, _I'm_ getting pressure from at least one packager to
>> nail down a final release of pllua-ng so they can build it along with
>> beta3 (in place of the old broken pllua), which obviously I can't do if
>> I keep having to fiddle with workarounds for hstore.h.
>
> I just don't have a lot of sympathy for that, given the
> months-after-freeze-window timing. It's not like *any* of this just
> started to be problematic in v11.

Yeah, I agree.  Andrew, it seems to me that you brought this problem
on yourself.  You rammed a patch through four months after feature
freeze with a marginal consensus and are now complaining about having
to spend more time on it.  Well, that's a self-inflicted injury.  If
you don't commit features with a marginal consensus and/or don't do it
four months after feature freeze, you won't get nearly as much
pushback.

Don't get me wrong -- I think you've done a lot of great work over the
years and I'm glad to have you involved both as a contributor and as a
committer of your own patches and those of others.  It's just that
your email here reads as if you think that your commit privileges are
for your benefit rather than that of the project, and that's not how
it works.  Everybody here is free to pursue the things they think are
interesting and the diversity of such things is part of the strength
of the project, but nobody is free to ram through changes that make
life better for them and worse for other people, even if they don't
agree with the complaints those other people make.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Explain buffers wrong counter with parallel plans

2018-08-02 Thread Robert Haas
On Thu, Aug 2, 2018 at 5:41 AM, Amit Kapila  wrote:
> I have created three patches (a) move InstrStartParallelQuery from its
> original location so that we perform it just before ExecutorRun (b)
> patch to fix the gather stats by calling shutdown at appropriate place
> and allow stats collection in ExecShutdownNode (c) Prohibit calling
> ExecShutdownNode if there is a possibility of backward scans (I have
> done some basic tests with this patch, if we decide to proceed with
> it, then some more verification and testing would be required).
>
> I think we should commit first two patches as that fixes the problem
> being discussed in this thread and then do some additional
> verification for the third patch (mentioned in option c).  I can
> understand if people want to commit the third patch before the second
> patch, so let me know what you guys think.

I'm happy with the first two patches.  In the third one, I don't think
"See ExecLimit" is a good thing to put a comment like this, because
it's too hard to find the comment to which it refers, and because
future commits are likely to edit or remove that comment without
noticing the references to it from elsewhere.  Instead I would just
write, in all three places, /* If we know we won't need to back up, we
can release resources at this point. */ or something like that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: doc - add missing documentation for "acldefault"

2018-08-02 Thread Pavel Luzanov

Hello Fabien,

> However, the point of having hidden and/or undocumented functions
> escapes me: they are hard/impossible to find if you do not know they
> exist from the start, and if you ever find one you do not know what
> they do without reading the source code in detail, eg to know what
> arguments to give to get an answer.

First, we must decide in which cases users would use them, and I don't
see such cases.  I need to know how to grant privileges, how to revoke
them, and how to check existing privileges.  All these tasks are
documented under the GRANT and REVOKE commands and in the system
catalog descriptions.


Your patch from another thread closes the last hole: describing
default privileges in various psql commands.


-
Pavel Luzanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Should contrib modules install .h files?

2018-08-02 Thread Tom Lane
Andrew Gierth  writes:
> "Tom" == Tom Lane  writes:
>  Tom> My impression is that there was consensus for per-extension
>  Tom> subdirectories, but the usage scenario isn't totally designed yet.
>  Tom> In principle, only the layout question has to be resolved to make
>  Tom> it OK to ship this in v11.

> Currently, everything is agnostic about the usage scenario - the
> existing extension include files will work with either
> -I$(includedir_server)/extension and #include "hstore/hstore.h", or with
> -I$(includedir_server)/extension/hstore and #include "hstore.h".

Well, the point is we don't want agnosticism --- we want a clear
usage model for people to follow.

>  Tom> On the other hand, if there's no very practical way to use the
>  Tom> per-extension subdirectory layout,

> What constitutes "practical"?

Something that copes with selecting the right headers if you're rebuilding
a module whose headers are already installed (as you mentioned upthread).
Something that copes with different modules installing headers with the
same base name.  Allowing for that was the driving force for going with
subdirectory-per-extension, but if we really want that to work, there
seems to be no alternative but for extensions to write qualified header
names (#include "hstore/hstore.h" not #include "hstore.h").  Andres,
for one, seemed to think that wouldn't play nicely with PGXS, though
I'm not sure why not.  Maybe he only meant that an extension's *own*
headers shouldn't be referenced that way.

Maybe this all just works without much thought, but given that smart
people like Peter E. seem to be unsure of that, I'd sure like to see
a concrete set of rules that extensions should follow for this.

There's also a question of whether we need to change anything in
contrib/ so that it plays by whatever rules we set.  There's an
expectation that contrib modules should be buildable with PGXS,
so they need to follow the rules.

regards, tom lane



Re: doc - add missing documentation for "acldefault"

2018-08-02 Thread Fabien COELHO


Hello Pavel,


>> I couldn't find the documentation. Attached patch adds one.
>>
>> Probably this function should have been named pg_*. Too late.

> I think this is an intentionally hidden function, like the others
> starting with acl*.


Yep, I thought of that.

However, the point of having hidden and/or undocumented functions
escapes me: they are hard/impossible to find if you do not know they
exist from the start, and if you ever find one you do not know what
they do without reading the source code in detail, eg to know what
arguments to give to get an answer.


So I assumed that it was more laziness, and could be remedied with a doc
patch for the one function I read. Maybe it would make sense to document
the others as well, which are more straightforward.


Also, that does not explain why they do not use a "pg_" prefix, which is
already a way of saying "this is an internal-purpose function". So I would
be in favor of prefixing them with pg_: as it is an undocumented feature,
there would not be a compatibility break, somehow :-)



> postgres=# \df acl*
>                                          List of functions
>    Schema   |    Name     | Result data type |                  Argument data types
> ------------+-------------+------------------+------------------------------------------------------
>  pg_catalog | aclcontains | boolean          | aclitem[], aclitem
>  pg_catalog | acldefault  | aclitem[]        | "char", oid
>  pg_catalog | aclexplode  | SETOF record     | acl aclitem[], OUT grantor oid, OUT grantee oid, OUT privilege_type text, OUT is_grantable boolean
>  pg_catalog | aclinsert   | aclitem[]        | aclitem[], aclitem
>  pg_catalog | aclitemeq   | boolean          | aclitem, aclitem
>  pg_catalog | aclitemin   | aclitem          | cstring
>  pg_catalog | aclitemout  | cstring          | aclitem
>  pg_catalog | aclremove   | aclitem[]        | aclitem[], aclitem


--
Fabien.

Re: Should contrib modules install .h files?

2018-08-02 Thread Andres Freund
On 2018-08-02 17:53:17 +0100, Andrew Gierth wrote:
> > "Andres" == Andres Freund  writes:
> 
>  >> Also, "near future" means "before Monday". I don't want to ship
>  >> beta3 with this in place if we end up reverting later, because it'd
>  >> mean thrashing packagers' file manifests, which they won't
>  >> appreciate. It might be best to revert in v11 for now, and then we
>  >> can put it back after beta3 if there's agreement that the questions
>  >> are satisfactorily resolved.
> 
>  Andres> +1
> 
> On the other hand, _I'm_ getting pressure from at least one packager to
> nail down a final release of pllua-ng so they can build it along with
> beta3 (in place of the old broken pllua), which obviously I can't do if
> I keep having to fiddle with workarounds for hstore.h.

I just don't have a lot of sympathy for that, given the
months-after-freeze-window timing. It's not like *any* of this just
started to be problematic in v11.

Greetings,

Andres Freund



Re: Should contrib modules install .h files?

2018-08-02 Thread Andrew Gierth
> "Andres" == Andres Freund  writes:

 >> Also, "near future" means "before Monday". I don't want to ship
 >> beta3 with this in place if we end up reverting later, because it'd
 >> mean thrashing packagers' file manifests, which they won't
 >> appreciate. It might be best to revert in v11 for now, and then we
 >> can put it back after beta3 if there's agreement that the questions
 >> are satisfactorily resolved.

 Andres> +1

On the other hand, _I'm_ getting pressure from at least one packager to
nail down a final release of pllua-ng so they can build it along with
beta3 (in place of the old broken pllua), which obviously I can't do if
I keep having to fiddle with workarounds for hstore.h.

-- 
Andrew (irc:RhodiumToad)



Re: Should contrib modules install .h files?

2018-08-02 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> It seems like there were two separate areas of
 Tom> disagreement/questioning, one being the file layout (whether to
 Tom> add per-extension subdirectories) and then one about how one would
 Tom> actually use this, ie what would the -I switch(es) look like and
 Tom> where would they get injected.

 Tom> My impression is that there was consensus for per-extension
 Tom> subdirectories, but the usage scenario isn't totally designed yet.
 Tom> In principle, only the layout question has to be resolved to make
 Tom> it OK to ship this in v11.

Currently, everything is agnostic about the usage scenario - the
existing extension include files will work with either
-I$(includedir_server)/extension and #include "hstore/hstore.h", or with
-I$(includedir_server)/extension/hstore and #include "hstore.h".

 Tom> On the other hand, if there's no very practical way to use the
 Tom> per-extension subdirectory layout,

What constitutes "practical"?

Right now it seems unlikely that there's much of a use case for
referring to more than two different extensions at a time (a pl and a
data type, for building a transform module outside either the pl's or
the type's source tree). Referring to one is more likely (in my case,
hstore_pllua is written to build inside pllua-ng's source tree, so all
it needs is to get at hstore.h).

-- 
Andrew (irc:RhodiumToad)



Re: Making "COPY partitioned_table FROM" faster

2018-08-02 Thread Peter Eisentraut
On 02/08/2018 01:36, David Rowley wrote:
> On 31 July 2018 at 11:51, David Rowley  wrote:
>> The attached v6 delta replaces the v5 delta and should be applied on
>> top of the full v4 patch.
> 
> (now committed)
> 
> Many thanks for committing this Peter and many thanks to Melanie and
> Karen for reviewing it!

I had problems with my email as I was committing this, which is why I
didn't let you know. ;-)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Should contrib modules install .h files?

2018-08-02 Thread Andres Freund
On 2018-08-02 10:54:00 -0400, Tom Lane wrote:
> Also, "near future" means "before Monday".  I don't want to ship beta3
> with this in place if we end up reverting later, because it'd mean
> thrashing packagers' file manifests, which they won't appreciate.
> It might be best to revert in v11 for now, and then we can put it back
> after beta3 if there's agreement that the questions are satisfactorily
> resolved.

+1



Fallout from PQhost() semantics changes

2018-08-02 Thread Tom Lane
Traditionally (prior to v10), PQhost() returned the "host" connection
parameter if that was nonempty, otherwise the default host name
(DEFAULT_PGSOCKET_DIR or "localhost" depending on platform).

That got whacked around to a state of brokenness in v10 (which I'll return
to in a bit), and then commit 1944cdc98 fixed it to return the active
host's connhost[].host string if nonempty, else the connhost[].hostaddr
string if nonempty, else an empty string.  Together with the fact that the
default host name gets inserted into connhost[].host if neither option is
supplied, that's compatible with the traditional behavior when host is
supplied or when both options are omitted.  It's not the same when only
hostaddr is supplied.  This change is generally a good thing: returning
the default host name is pretty misleading if hostaddr actually points at
some remote server.  However, it seems that insufficient attention was
paid to whether *every* call site is OK with it.

In particular, libpq has several internal calls to PQhost() to get the
host name to be compared to a server SSL certificate, or for comparable
usages in GSS and SSPI authentication.  These changes mean that sometimes
we will be comparing the server's numeric address, not its hostname,
to the server auth information.  I do not think that was the intention;
it's certainly in direct contradiction to our documentation, which clearly
says that the host name parameter and nothing else is used for this
purpose.  It's not clear to me if this could amount to a security problem,
but at the least it's wrongly documented.

What I think we should do about it is change those internal calls to
fetch connhost[].host directly instead of going through PQhost(), as
in the attached libpq-internal-PQhost-usage-1.patch.  This will restore
the semantics to what they were pre-v10, including erroring out when
hostaddr is supplied without host.

I also noted that psql's \conninfo code takes it upon itself to substitute
the value of the hostaddr parameter, if used, for the result of PQhost().
This is entirely wrong/unhelpful if multiple host targets were specified;
moreover, that patch failed to account for the very similar connection
info printout in do_connect().  Given the change in PQhost's behavior
I think it'd be fine to just drop that complexity and print PQhost's
result without any editorialization, as in the attached
psql-conninfo-PQhost-usage-1.patch.

I would also like to make the case for back-patching 1944cdc98 into v10.
I'm not sure why that wasn't done to begin with, because v10's PQhost()
is just completely broken for cases involving a hostaddr specification:

if (!conn)
return NULL;
if (conn->connhost != NULL &&
conn->connhost[conn->whichhost].type != CHT_HOST_ADDRESS)
return conn->connhost[conn->whichhost].host;
else if (conn->pghost != NULL && conn->pghost[0] != '\0')
return conn->pghost;
else
{
#ifdef HAVE_UNIX_SOCKETS
return DEFAULT_PGSOCKET_DIR;
#else
return DefaultHost;
#endif
}

In the CHT_HOST_ADDRESS case, it will either give back the raw host
parameter (again, wrong if multiple hosts are targeted) or give back
DEFAULT_PGSOCKET_DIR/DefaultHost if the host wasn't specified.
Ignoring the brokenness for multiple target hosts, you could argue
that that's compatible with pre-v10 behavior ... but it's still pretty
misleading to give back DefaultHost, much less DEFAULT_PGSOCKET_DIR,
for a remote connection.  (There's at least some chance that the
hostaddr is actually 127.0.0.1 or ::1.  There is no chance that
DEFAULT_PGSOCKET_DIR is an appropriate description.)

Given that we whacked around v10 libpq's behavior for some related corner
cases earlier this week, I think it'd be OK to change this in v10.
If we do, it'd make sense to back-patch psql-conninfo-PQhost-usage-1.patch
into v10 as well.  I think that libpq-internal-PQhost-usage-1.patch should
be back-patched to v10 in any case, since whether or not you want to live
with the existing behavior of PQhost() in v10, it's surely not appropriate
for comparing to server SSL certificates.

In fact, I think there's probably a good case for doing something
comparable to libpq-internal-PQhost-usage-1.patch all the way back.
In exactly what scenario is it sane to be comparing "/tmp" or
"localhost" to a server's SSL certificate?

regards, tom lane

diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index 3b2073a..09012c5 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -199,7 +199,7 @@ pg_GSS_startup(PGconn *conn, int payloadlen)
 min_stat;
 	int			maxlen;
 	gss_buffer_desc temp_gbuf;
-	char	   *host = PQhost(conn);
+	char	   *host = conn->connhost[conn->whichhost].host;
 
 	if (!(host && host[0] != '\0'))
 	{
@@ -414,7 +414,7 @@ pg_SSPI_startup(PGconn *conn, int use_negotiate, int payloadlen)
 {
 	SECURITY_STATUS r;
 	TimeStamp	expire;
-	char	   *host = 

Re: Stored procedures and out parameters

2018-08-02 Thread Vladimir Sitnikov
Shay>Npgsql currently always sends a describe as part of statement
execution (for server-prepared messages the describe is done only once, at
preparation-time). Vladimir, are you doing things differently here?

The same is true for pgjdbc: it does use Describe to identify the result
row format.
However, "CALL my_proc()" works just fine with current git master for both
simple and extended protocol.

The missing part is "invoke functions via CALL statement".

Vladimir


Re: Problem during Windows service start

2018-08-02 Thread Laurenz Albe
Sakai, Teppei wrote:
> This is my first posting to the mailing list.
> 
> Currently our customer uses PostgreSQL 9.5 and hits a problem during
> Windows service start.
> The Windows service status of the instance is different from actual status.
> 
> We got the following situation.
> 1. Register service with 'pg_ctl register -N "PostgreSQL" -U postgres -P  
> -D D:\data\inst1 -w'
> 2. Start the instance from the Windows service screen.
> 3. After 60 seconds, the startup process fails with a timeout,
>    because crash recovery takes a lot of time.
> 
> Then, the service status of the instance become "STOPPED",
> but the instance was running.
> It cannot be stopped from the Windows service screen (it can be stopped only 
> with pg_ctl).
> 
> PostgreSQL version : 9.5.12
> Operating system : Windows Server 2012 R2
> 
> I think this is a bug.
> I think it has not been fixed in the latest version, is my understanding 
> correct?
> If it is correct, I will fix it.

I agree that this is not nice.

How do you propose to fix it?

Yours,
Laurenz Albe



Re: Ideas for a relcache test mode about missing invalidations

2018-08-02 Thread Andres Freund
Hi,

On 2018-08-02 19:18:11 +0900, Kyotaro HORIGUCHI wrote:
> At Wed, 1 Aug 2018 09:25:18 -0700, Andres Freund  wrote 
> in <20180801162518.jnb2ql5dfmgwp...@alap3.anarazel.de>
> > Hi,
> > 
> > The issue at [1] is caused by missing invalidations, and [2] seems like
> > a likely candidate too. I wonder if it'd be good to have a relcache test
> > mode akin to CLOBBER_CACHE_ALWAYS and RELCACHE_FORCE_RELEASE, that tries
> > to ensure that we've done sufficiently to ensure the right invalidations
> > are sent.
> > 
> > I think what we'd kind of want is to ensure that relcache entries are
> > rebuilt at the earliest possible time, but *not* later. That'd mean
> > they're out of date if there's missing invalidations. Unfortunately I'm
> > not clear on how that'd be achievable?  Ideas?
> > 
> > The best I can come up with is to code some additional dependencies into
> > CacheInvalidateHeapTuple(), and add tracking ensuring we've sent the
> > right messages. But that seems somewhat painful and filled with holes.
> > 
> > [1] 
> > http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com
> > [2] https://www.postgresql.org/message-id/12259.1532117...@sss.pgh.pa.us
> 
> As for [1], it is not an issue of invalidation. It also happens if
> the relation has any index and even drop is not needed. The
> following steps are sufficient.

Huh? I don't think this is a proper fix. But please let's argue about
that in the other thread.


Greetings,

Andres Freund



Re: [HACKERS] Bug in to_timestamp().

2018-08-02 Thread Alexander Korotkov
On Wed, Aug 1, 2018 at 3:17 PM Arthur Zakirov  wrote:
> On Mon, Jul 23, 2018 at 05:21:43PM +0300, Alexander Korotkov wrote:
> > Thank you, Arthur.  These examples shows downside of this patch, where
> > users may be faced with incompatibility.  But it's good that this
> > situation can be handled by altering format string.  I think these
> > examples should be added to the documentation and highlighted in
> > release notes.
>
> I updated the documentation. I added a tip text which explains
> how to_timestamp() and to_date() handled ordinary text prior to
> PostgreSQL 11 and how they should handle ordinary text now.
>
> There is now changes in the code and tests.

Thank you, Arthur!

I made the following observations on Oracle behavior.
1) Oracle also supports parsing TZH in the to_timestamp_tz() function.
Previously, in order to handle TZH (which could be negative) we
decided to skip spaces before fields, but not after fields [1][2].
That leads to behavior which is both incompatible with Oracle and
illogical (at least in my opinion).  But after exploring
to_timestamp_tz()'s behavior I found that Oracle can distinguish a
minus sign used as a separator from a minus sign in the TZH field.

# select to_char(to_timestamp_tz('2018-01-01 -10', 'YYYY MM DD  TZH'),
'YYYY-MM-DD HH24:MI:SS TZH:TZM') from dual
2018-01-01 00:00:00 +10:00

# select to_char(to_timestamp_tz('2018-01-01--10', 'YYYY MM DD  TZH'),
'YYYY-MM-DD HH24:MI:SS TZH:TZM') from dual
2018-01-01 00:00:00 +10:00

# select to_char(to_timestamp_tz('2018-01-01  -10', 'YYYY MM DD  TZH'),
'YYYY-MM-DD HH24:MI:SS TZH:TZM') from dual
2018-01-01 00:00:00 -10:00

# select to_char(to_timestamp_tz('2018-01-01---10', 'YYYY MM DD  TZH'),
'YYYY-MM-DD HH24:MI:SS TZH:TZM') from dual
2018-01-01 00:00:00 -10:00
2018-01-01 00:00:00 -10:00

So, if the number of spaces/separators between fields in the input
string is more than in the format string, and the last space/separator
skipped is a minus sign, then it decides to glue that minus sign to
TZH.  I think this behavior is nice to follow, because it allows us to
skip spaces after fields.

2) It appears that Oracle skips spaces not only before fields, but
also in the beginning of the input string.

# SELECT to_timestamp(' -2018', ' YYYY') FROM dual
01.08.2018 00:00:00

In the example given it seems that first space is skipped, while minus
sign is matched to space.

3) When multiple separators are specified in the format string, Oracle
doesn't allow skipping an arbitrary number of spaces before each
separator, as we did.

# SELECT to_timestamp('2018- -01 02', 'YYYY--MM-DD') FROM dual
ORA-01843: not a valid month

4) Spaces and separators in format string seem to be handled in the
same way in non-FX mode.  But strange things happen when you mix
spaces and separators there.

# SELECT to_timestamp('2018- -01 02', 'YYYY---MM-DD') FROM dual
02.01.2018 00:00:00

# SELECT to_timestamp('2018- -01 02', 'YYYY   MM-DD') FROM dual
02.01.2018 00:00:00

# SELECT to_timestamp('2018- -01 02', 'YYYY- -MM-DD') FROM dual
ORA-01843: not a valid month

After some experiments I found that when you mix spaces and separators
between two fields, then Oracle takes into account only length of last
group of spaces/separators.

# SELECT to_timestamp('2018- -01 02', 'YYYY   --- --MM-DD') FROM dual
2018-01-01 00:00:00 -10:00
(length of last spaces/separators group is 2)

# SELECT to_timestamp('2018- -01 02', 'YYYY   --- --MM-DD') FROM dual
2018-01-01 00:00:00 -10:00
(length of last spaces/separators group is 3)

# SELECT to_timestamp('2018- -01 02', 'YYYY   -- ---MM-DD') FROM dual
02.01.2018 00:00:00
(length of last spaces/separators group is 2)

1+2+3 are implemented in the attached patch, but not 4.  I think that
it's only worth following Oracle when it does reasonable things.

I also plan to revise documentation and regression tests in the patch.
But I wanted to share my results so that everybody can give a feedback
if any.

1. https://www.postgresql.org/message-id/CAEepm%3D1Y7z1VSisBKxa6x3E-jP-%2B%3DrWfw25q_PH2nGjqVcX-rw%40mail.gmail.com
2. https://www.postgresql.org/message-id/20180112125848.GA32559%40zakirov.localdomain

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-to-timestamp-format-checking-v14.patch
Description: Binary data


Re: [HACKERS] logical decoding of two-phase transactions

2018-08-02 Thread Nikhil Sontakke
>> They can be, but currently they might not be. So this requires at least
>> big fat warning in docs and description on how to access user catalogs
>> from plugins correctly (ie to always use systable_* API on them). It
>> would be nice if we could check for it in Assert builds at least.
>

Ok, modified the sgml documentation for the above.

> Yea, I agree. I think we should just consider putting similar checks in
> the general scan APIs. With an unlikely() and the easy predictability of
> these checks, I think we should be fine, overhead-wise.
>

Ok, added unlikely() checks in the heap_* scan APIs.

Revised patchset attached.

Regards,
Nikhils

> Greetings,
>
> Andres Freund



-- 
 Nikhil Sontakke   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


0001-Cleaning-up-of-flags-in-ReorderBufferTXN-structure.patch
Description: Binary data


0002-Support-decoding-of-two-phase-transactions-at-PREPAR.patch
Description: Binary data


0003-Gracefully-handle-concurrent-aborts-of-uncommitted-t.patch
Description: Binary data


0004-Teach-test_decoding-plugin-to-work-with-2PC.patch
Description: Binary data


Re: Allow COPY's 'text' format to output a header

2018-08-02 Thread Cynthia Shang


> On Aug 2, 2018, at 8:11 AM, Daniel Verite  wrote:
> 
> That makes sense, thanks for elaborating, although there are also
> a fair number of ERRCODE_FEATURE_NOT_SUPPORTED in copy.c
> that are raised on forbidden/nonsensical combination of features,
> so the consistency argument could work both ways.
> 

If there is not a strong reason to change the error code, then I believe we 
should not. The error is the same as it was before, just narrower in scope.

Best,
-Cynthia


Re: Should contrib modules install .h files?

2018-08-02 Thread Tom Lane
Robert Haas  writes:
> Yeah, I would have voted -1 if I'd realized that it was close.  Now
> we're in a situation where we have patch not everyone likes not only
> in master (which is OK, because we've got a year to decide whether to
> change anything) but also in v11 (where we have a lot less time).
> That's not so great.

It seems like there were two separate areas of disagreement/questioning,
one being the file layout (whether to add per-extension subdirectories)
and then one about how one would actually use this, ie what would the
-I switch(es) look like and where would they get injected.

My impression is that there was consensus for per-extension
subdirectories, but the usage scenario isn't totally designed yet.
In principle, only the layout question has to be resolved to make
it OK to ship this in v11.  On the other hand, if there's no
very practical way to use the per-extension subdirectory layout,
we might have to rethink that layout.  So I'd be happier about this
if that question were answered promptly.  Failing a satisfactory
answer in the near future, I think we need to revert in v11.

Also, "near future" means "before Monday".  I don't want to ship beta3
with this in place if we end up reverting later, because it'd mean
thrashing packagers' file manifests, which they won't appreciate.
It might be best to revert in v11 for now, and then we can put it back
after beta3 if there's agreement that the questions are satisfactorily
resolved.

regards, tom lane



Re: Problems with plan estimates in postgres_fdw

2018-08-02 Thread Tom Lane
Andrew Gierth  writes:
> [ postgres_fdw is not smart about exploiting fast-start plans ]

Yeah, that's basically not accounted for at all in the current design.

> One possibility: would it be worth adding an option to EXPLAIN that
> makes it assume cursor_tuple_fraction?

[ handwaving ahead ]

I wonder whether it could be done without destroying postgres_fdw's
support for old servers, by instead including a LIMIT in the query sent
for explaining.  The trick would be to know what value to put as the
limit, though.  It'd be easy to do if we were willing to explain the query
twice (the second time with a limit chosen as a fraction of the rowcount
seen the first time), but man that's an expensive solution.

Another component of any real fix here would be to issue "SET
cursor_tuple_fraction" before opening the execution cursor, so as to
ensure that we actually get an appropriate plan on the remote side.

If we could tell whether there's going to be any use in fast-start plans,
it might make sense to build two scan paths for a foreign table, one based
on a full-table scan and one based on EXPLAIN ... LIMIT 1.  This still
means two explain requests, which is why I'm not thrilled about doing it
unless there's a high probability of the extra explain being useful.
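Put together, the remote-side conversation being sketched here might look
like the following; the query, the fraction, and the guessed limit are
all hypothetical:

EXPLAIN SELECT x FROM remote_tab WHERE y = 42;            -- full-scan estimate
EXPLAIN SELECT x FROM remote_tab WHERE y = 42 LIMIT 100;  -- fast-start estimate
BEGIN;
SET cursor_tuple_fraction = 0.1;  -- get a matching fast-start plan remotely
DECLARE c1 CURSOR FOR SELECT x FROM remote_tab WHERE y = 42;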

regards, tom lane



Re: [Patch] Create a new session in postmaster by calling setsid()

2018-08-02 Thread Tom Lane
Paul Guo  writes:
> [ make the postmaster execute setsid() too ]

I'm a bit skeptical of this proposal.  Forcing the postmaster to
dissociate from its controlling terminal is a good thing in some
scenarios, but probably less good in others, and manual postmaster
starts are probably mostly in the latter class.

I wonder whether having "pg_ctl start" do a setsid() would accomplish
the same result with less chance of breaking debugging usages.

regards, tom lane



Re: Stored procedures and out parameters

2018-08-02 Thread Shay Rojansky
Apologies for disappearing from this conversation for a week.

First off, on the .NET side I have the exact same issue that Vladimir
Sitnikov described for the Java side. The .NET database API (ADO.NET) has a
standard, portable way for calling "server-side code". Since stored
procedures are in PostgreSQL, this portable API was implemented years ago
to invoke functions, which were the only thing in existence (so Npgsql
issues SELECT * FROM function()). Now that stored procedures have been
introduced, it's impossible to change what the portable API means without
massive backwards compatibility issues for all programs which already rely
on the API calling *functions*.

In other words, I really do hope that on the PostgreSQL side you consider
allowing both functions and procedures to be invoked via CALL. Npgsql (and
likely pgjdbc) would then be able to change the portable API to send CALL
instead of SELECT, avoiding all backwards compatibility issues (they would
do that only for PostgreSQL 11 and above). For now I'm telling users on the
beta version to avoid the API altogether (writing CALL SQL manually), which
as Vladimir wrote above is bad for portability.
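To make the compatibility problem concrete, a sketch with hypothetical
routine names:

-- What the portable driver API translates to today:
SELECT * FROM get_customer(42);   -- works only for functions

-- What PostgreSQL 11 requires for the new procedures:
CALL update_customer(42);         -- works only for procedures

-- The request: let CALL accept both, so a driver can always emit CALL.
-- As of PG 11, CALL get_customer(42) fails because it is not a procedure.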

> If you "DESCRIBE CALL my_func()" you get back a NoData response; it
doesn't try to inspect the RETURNS clause of the function even though in
theory it could.  The client is using CALL, so that is what it should expect to
receive.  That said I'm not entirely clear whether the NoData response is a
fixed bug or not...

Uh, this sounds like something we really need to understand... How is a
driver supposed to know what data types are being returned if it can't use
Describe? DataRow messages contain only field lengths and values, so having
a type OID is critical for knowing how to interpret the data, and that
currently is only available by sending a Describe on a statement... Npgsql
currently always sends a describe as part of statement execution (for
server-prepared messages the describe is done only once, at
preparation-time). Vladimir, are you doing things differently here?


On Tue, Jul 24, 2018 at 7:57 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Jul 24, 2018 at 11:31 AM, Daniel Verite 
> wrote:
>
>> David G. Johnston wrote:
>>
>> > > 2. A stored procedure can decide dynamically of
>> > > the structure of the resultset(s) it returns,
>> > > and the caller will discover it as they're returned, not
>> > > before.
>> > >
>> >
>> > The function itself doesn't care - this concern is about SELECT vs CALL
>> > invocation only, not the script definition.
>>
>> It does care, because CREATE FUNCTION has a RETURNS clause and
>> matching RETURN statements in the body, whereas CREATE PROCEDURE
>> doesn't and (will?) have a different syntax for producing resultsets.
>>
>
> But why does whatever code that implements CALL have to care?
>
> In Object Oriented terms why can not both procedures and functions
> implement an "EXECUTE_VIA_CALL" interface, while functions additionally
> implement an "EXECUTE_VIA_SELECT" interface - the one that they do today.
>
> ISTM that any (most) function could be trivially rewritten into a
> procedure (or wrapped by one) in a mechanical fashion which could then be
> executed via CALL.  I'm proposing that instead of having people write their
> own wrappers we figure out what the mechanical wrapper looks like, ideally
> based upon the public API of the function, and create it on-the-fly
> whenever said function is executed via a CALL statement.
>
> As for the invocation, that's just the starting point.  At this point
>> the driver doesn't know from '{call x}' whether x is a procedure or a
>> function in Postgres, hence the request for a syntax that would work
>> for both.  Okay, but aside from that, if there are results, the driver
>> needs to get them in a way that works without knowing whether it's a
>> function or procedure. How would that happen?
>>
>
> I'm saying that the driver needs to rewrite {call x} as "CALL x()" and
> expect optional resultsets and optional output arguments.  For functions
> invoked as procedures this would be a single resultset with zero output
> arguments.  Which is exactly the same end-user result that is received
> today when "SELECT * FROM x()" is used.
>
>
>> Back to the first message of the thread, Shay Rojansky was saying:
>>
>>   "However, connecting via Npgsql, which uses the extended protocol, I
>>   see something quite different. As a response to a Describe PostgreSQL
>>   message, I get back a NoData response rather than a RowDescription
>>   message"
>>
>> Why would a Describe on a "CALL myproc()" even work if we
>> accept the premise that myproc() does not advertise what it may return,
>> contrary to a "SELECT * FROM function()"?
>> This issue goes beyond a "syntactic bridge" between CALL and SELECT.
>
>
> If you "DESCRIBE CALL my_func()" you get back a NoData response; it
> doesn't try to inspect the RETURNS clause of the function even though in
> theory it could.  The client is using CALL so that is what it should
> expect to receive.  That said I'm not entirely clear whether the NoData
> response is a fixed bug or not...

Re: Should contrib modules install .h files?

2018-08-02 Thread Robert Haas
On Tue, Jul 31, 2018 at 5:53 PM, Tom Lane  wrote:
> By my count of people expressing opinions, we had Michael -1, Stephen -1,
> me -0.1 or so, Alvaro +1, Peter -1, presumably +1 from Andrew; and Andres
> made a comment about not waiting, which perhaps Andrew read as a +1 for
> backpatching.  So it's not unreasonable for Andrew to have decided that
> it was about tied.  Nonetheless, it does seem like a feature and we're
> past feature freeze, so the default assumption ought to be "no backpatch"
> IMO.

Yeah, I would have voted -1 if I'd realized that it was close.  Now
we're in a situation where we have a patch not everyone likes not only
in master (which is OK, because we've got a year to decide whether to
change anything) but also in v11 (where we have a lot less time).
That's not so great.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Restricting maximum keep segments by repslots

2018-08-02 Thread Robert Haas
On Tue, Jul 31, 2018 at 9:52 PM, Kyotaro HORIGUCHI
 wrote:
> I thought it was to be deprecated for some reason, so I'm leaving
> wal_keep_segments in '# of segments' even though the new GUC is
> in MB. I'm a bit uneasy that the two similar settings are in
> different units. Couldn't we turn it into MB, taking this
> opportunity, if we keep wal_keep_segments, changing its name
> to min_wal_keep_size?  max_slot_wal_keep_size could be changed to
> just max_wal_keep_size along with it.

This seems like it's a little bit of a separate topic from what this
thread is about, but FWIW, +1 for standardizing on MB.
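
(Concretely, the unit mismatch being discussed -- values illustrative only:

    wal_keep_segments      = 64        # unit: segments (64 x 16MB = 1GB)
    max_slot_wal_keep_size = 1024MB    # proposed GUC, sized in MB

versus renaming to a min_wal_keep_size/max_wal_keep_size pair, both in MB.)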

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Allow COPY's 'text' format to output a header

2018-08-02 Thread Daniel Verite
Simon Muller wrote:

> I changed the error type and message for consistency with other similar
> errors in that file. Whenever options are combined that are incompatible,
> it looks like the convention is for an ERRCODE_SYNTAX_ERROR to be thrown.

That makes sense, thanks for elaborating, although there are also
a fair number of ERRCODE_FEATURE_NOT_SUPPORTED in copy.c
that are raised on forbidden/nonsensical combinations of features,
so the consistency argument could work both ways.
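
(For concreteness, the two conventions side by side -- sketches modelled on
copy.c's style rather than its exact messages:

	ereport(ERROR,
			(errcode(ERRCODE_SYNTAX_ERROR),
			 errmsg("cannot specify HEADER in BINARY mode")));

	ereport(ERROR,
			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
			 errmsg("COPY force quote available only in CSV mode")));
)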

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Problem during Windows service start

2018-08-02 Thread Sakai, Teppei
Hi,

This is my first posting to the mailing list.

Currently our customer uses PostgreSQL 9.5 and hits a problem during Windows
service start.
The Windows service status of the instance differs from the actual status.

We got the following situation.
1. Register service with 'pg_ctl register -N "PostgreSQL" -U postgres -P  
-D D:\data\inst1 -w'
2. Start the instance from the Windows service screen.
3. After 60 seconds, the startup process fails with a timeout,
   because crash recovery takes a long time.

Then, the service status of the instance becomes "STOPPED",
but the instance is still running.
It cannot be stopped from the Windows service screen (it can be stopped only 
with pg_ctl).

PostgreSQL version : 9.5.12
Operating system : Windows Server 2012 R2

I think this is a bug.
I think it has not been fixed in the latest version; is my understanding
correct?
If it is, I will fix it.

Regards,
SAKAI Teppei




Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

2018-08-02 Thread Etsuro Fujita

(2018/08/02 4:30), Robert Haas wrote:

> On Wed, Aug 1, 2018 at 7:44 AM, Etsuro Fujita  wrote:
>
>> I updated the patch that way.  Updated patch attached.  I fixed a bug and
>> did a bit of cleanups as well.


> Looking this over from a technical point of view, I think it's better
> than what you proposed before but I still don't agree with the
> approach.  I don't think there is any getting around the fact that
> converted whole-row vars are going to result in some warts.  Ashutosh
> inserted most of the necessary warts in the original partition-wise
> join patch, but missed some cases in postgres_fdw; his approach is,
> basically, to add the matching warts there.  Your approach is to rip
> out all of those warts and insert different ones.  You've simplified
> build_tlist_index_other_vars() and add_placeholders_to_joinrel() and
> build_joinrel_tlist() to basically what they were before
> partition-wise join went in.  On the other hand, RelOptInfo grows
> three new fields,


Sorry, I forgot to remove one of the fields that isn't needed anymore.
RelOptInfo needs two new fields in the new approach.



> set_append_rel_size() ends up more complicated than
> it was before when you include the new code added in the
> build_childrel_tlist function it calls, build_child_joinrel() has a
> different set of complications, and most importantly
> create_append_path() and create_merge_append_path() now do surgery on
> their input paths that knows specifically about the
> converted-whole-row var case.  I would be glad to be rid of the stuff
> that you're ripping out, but in terms of complexity, I don't think we
> really come out ahead with the stuff you're adding.


The new approach seems to me more localized than what Ashutosh had
proposed.  One obvious advantage of the new approach is that we no
longer need changes to postgres_fdw for it to work for partitionwise
joins; the same would apply to any other FDW, making FDW authors'
lives easier.



> I'm also still concerned about the design.  In your old approach, you
> were creating the paths with the wrong target list and then
> fixing it when you turned the paths into a plan.  In your new
> approach, you're still creating the paths with the wrong target list,
> but now you're fixing it when you put an Append or MergeAppend path on
> top of them.  I still think it's a bad idea to have anything other
> than the correct target list in the paths from the beginning.


I don't think so; actually, the child's targetlist would not have to be
the same as the parent's until the child is put under the parent's
Append or MergeAppend.  So, I modified the child's targetlist so as to
make it easy to join the child relations.
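
(The kind of query at stake, as a sketch -- table names hypothetical; t1 and
t2 are identically partitioned, possibly with postgres_fdw partitions:

SET enable_partitionwise_join = on;
SELECT t1, t2                      -- whole-row references
  FROM t1 JOIN t2 ON t1.a = t2.a
  FOR UPDATE OF t1;

Each child join must emit child rows converted to the parents' row types,
which is where ConvertRowtypeExpr enters the child targetlists.)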



> For one
> thing, what happens if no Append or MergeAppend is ever put on top of
> them?  One way that can happen today is partition-wise aggregate,
> although I haven't been able to demonstrate a bug, so maybe that's
> covered somehow.


Right.  In the new approach, the targetlist for the topmost child 
scan/join is guaranteed to be the same as that for the topmost parent 
scan/join by the planner work that I added.



> But in general, with your approach, any code that
> looks at the tlist for a child-join has to know that the tlist is to
> be used as-is *except that* ConvertRowtypeExpr may need to be
> inserted.  I think that special case could be easy to overlook, and it
> seems pretty arbitrary.


I think we can check to see whether the conversion is needed, from the 
flags added to RelOptInfo.



> A tlist is a list of arbitrary expressions to
> be produced; with this patch, we'd end up with the tlist being the
> list of expressions to be produced in every case *except for*
> child-joins containing whole-row-vars.  I just can't get behind a
> strange exception like that.


I have to admit that it's weird to adjust the child's targetlist in that
case when putting the child under the parent's Append/MergeAppend, but
IMHO that adjustment would be well localized.


Thanks for the comments!

Best regards,
Etsuro Fujita



Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

2018-08-02 Thread Etsuro Fujita

(2018/08/02 2:44), Robert Haas wrote:

> Tom, Ashutosh, and I all seem to agree that we shouldn't try to
> re-jigger things at create-plan time.  I think that a 3-1 consensus
> against your proposal is sufficient to say we shouldn't go that way.


Agreed.


> Now, here you've got a new approach which avoids that, which I have
> not yet reviewed.  I'll try to spend some time on it this afternoon,
> but really, I think it's too late for this.  This bug was reported in
> February, and we're supposed to be pushing 11 final out the door in
> not much more than a month.  Proposing a new approach in August is not
> good.


Agreed.


> Can't we just do what Ashutosh proposed for now and revisit
> this for v12?


I think that may be possible.

Best regards,
Etsuro Fujita



Re: Ideas for a relcache test mode about missing invalidations

2018-08-02 Thread Kyotaro HORIGUCHI
Hello.

At Wed, 1 Aug 2018 09:25:18 -0700, Andres Freund  wrote in 
<20180801162518.jnb2ql5dfmgwp...@alap3.anarazel.de>
> Hi,
> 
> The issue at [1] is caused by missing invalidations, and [2] seems like
> a likely candidate too. I wonder if it'd be good to have a relcache test
> mode akin to CLOBBER_CACHE_ALWAYS and RELCACHE_FORCE_RELEASE, that tries
> to ensure that we've done sufficiently to ensure the right invalidations
> are sent.
> 
> I think what we'd kind of want is to ensure that relcache entries are
> rebuilt at the earliest possible time, but *not* later. That'd mean
> they're out of date if there's missing invalidations. Unfortunately I'm
> not clear on how that'd be achievable?  Ideas?
> 
> The best I can come up with is to code some additional dependencies into
> CacheInvalidateHeapTuple(), and add tracking ensuring we've sent the
> right messages. But that seems somewhat painful and filled with holes.
> 
> [1] 
> http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com
> [2] https://www.postgresql.org/message-id/12259.1532117...@sss.pgh.pa.us

As for [1], it is not an issue of invalidation.  It also happens if
the relation has any index, and even the drop is not needed.  The
following steps are sufficient.

create table t( pk serial, t text );
insert into t( t ) values( 'hello' ), ('world');
create index idx_fake0 on t (pk);
create index idx_fake on t ( f_fake( pk ) ); -- ERROR
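
(f_fake is not defined in this message; per the referenced report, presumably
any function that queries the table being indexed will do, e.g.:

create function f_fake(i int) returns int as
  $$ select count(*)::int + i from t $$ language sql immutable;

The immutable marking is needed to use it in an index expression; the body
queries t itself, which is what trips the planner during the index build.)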

index_create() creates a new pg_index entry with indislive = true
before building it. So the planner (RelationGetIndexList) sees
the not-yet-populated index while planning the query in f_fake().

The attached patch fixes the issue, but I'm not sure this story
is applicable to [2].

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From eaa5de68ff27bd43a643089f8c5963e5cc3d20cc Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 2 Aug 2018 18:52:24 +0900
Subject: [PATCH] Don't use currently-building index to populate it

index_create() creates a pg_index entry with indislive = true before
populating it. If it is a function index where the function runs a
query on the parent relation, the planner sees the just-created entry and
tries to access a heap page that has not been created yet.  This patch
makes index_create() set indislive = true only after population.
---
 src/backend/catalog/index.c | 55 -
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8b276bc430..b561c8696d 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -124,6 +124,7 @@ static void UpdateIndexRelation(Oid indexoid, Oid heapoid,
 	bool immediate,
 	bool isvalid,
 	bool isready);
+static void ActivateIndexRelation(Oid indexoid);
 static void index_update_stats(Relation rel,
    bool hasindex,
    double reltuples);
@@ -666,7 +667,7 @@ UpdateIndexRelation(Oid indexoid,
 	values[Anum_pg_index_indisvalid - 1] = BoolGetDatum(isvalid);
 	values[Anum_pg_index_indcheckxmin - 1] = BoolGetDatum(false);
 	values[Anum_pg_index_indisready - 1] = BoolGetDatum(isready);
-	values[Anum_pg_index_indislive - 1] = BoolGetDatum(true);
+	values[Anum_pg_index_indislive - 1] = BoolGetDatum(false);
 	values[Anum_pg_index_indisreplident - 1] = BoolGetDatum(false);
 	values[Anum_pg_index_indkey - 1] = PointerGetDatum(indkey);
 	values[Anum_pg_index_indcollation - 1] = PointerGetDatum(indcollation);
@@ -693,6 +694,55 @@ UpdateIndexRelation(Oid indexoid,
 	heap_freetuple(tuple);
 }
 
+/* ----------------
+ *		ActivateIndexRelation
+ *
+ * Publish the index by marking it "indislive".
+ *
+ * UpdateIndexRelation builds the index's pg_index entry with indislive =
+ * false so as not to be used by the queries used to build the index.  This
+ * function marks the index as "live" so that it can be used hereafter.
+ * ----------------
+ */
+static void
+ActivateIndexRelation(Oid indexoid)
+{
+	Relation	indrel;
+	SysScanDesc	sdesc;
+	ScanKeyData	skey;
+	HeapTuple	tup;
+
+	ScanKeyInit(&skey,
+				Anum_pg_index_indexrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(indexoid));
+	indrel = heap_open(IndexRelationId, RowExclusiveLock);
+	sdesc = systable_beginscan(indrel, InvalidOid, true, NULL, 1, &skey);
+
+	/*
+	 * We must see one and only one entry for the key. But don't bother
+	 * checking that.
+	 */
+	while (HeapTupleIsValid(tup = systable_getnext(sdesc)))
+	{
+		Datum values[Natts_pg_index];
+		bool nulls[Natts_pg_index];
+		bool replaces[Natts_pg_index];
+
+		MemSet(values, 0, sizeof(values));
+		MemSet(nulls, 0, sizeof(nulls));
+		MemSet(replaces, 0, sizeof(replaces));
+		values[Anum_pg_index_indislive - 1] = BoolGetDatum(true);
+		replaces[Anum_pg_index_indislive - 1] = true;
+		tup = heap_modify_tuple(tup, RelationGetDescr(indrel),
+								values, nulls, replaces);
+		

Re: [PATCH] Improve geometric types

2018-08-02 Thread Tomas Vondra

On 08/01/2018 01:40 PM, Emre Hasegeli wrote:

>> Ah, so there's an assumption that NaNs are handled earlier and never reach
>> this place? That's probably a safe assumption. I haven't thought about that,
>> it simply seemed suspicious that the code mixes direct comparisons and
>> float8_mi() calls.

> The comparison functions handle NaNs.  The arithmetic functions handle
> returning error on underflow, overflow and division by zero.  I
> assumed we want to return error on those in any case, but we don't
> want to handle NaNs at every place.

>> Not sure, I'll leave that up to you. I don't mind doing it in a separate
>> patch (I'd probably prefer that over mixing it into an unrelated patch).

> It is attached separately.



OK, thanks.

So, have we reached a conclusion about all the bits I mentioned on 7/31?
The delta and float8/double cast are fixed, and for computeDistance
(i.e. doing comparisons directly or using float8_lt), the code may seem
a bit inconsistent, but it is in fact correct because the NaNs are handled
elsewhere. That seems reasonable, but perhaps a comment pointing that
out would be nice.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Have an encrypted pgpass file

2018-08-02 Thread Geoff Winkless
On Thu, 2 Aug 2018 at 10:41, I wrote:

> Perhaps you could make your auditors happier by restricting that user's
> permissions to only run a defined function, and make that function do the
> work that the automation script wants? So even if the attacker can access
> the password he will still only be able to run that function? (You could
> even add DOS protection into the function to ensure it's only run so often,
> if you were worried about that.)
>
I realise (of course, after I sent this) that I misunderstood the thrust
of your requirement, and that you want the ability to log in as *your own
user* without entering your own password. Apologies. Ignore me.

Geoff


Re: New Defects reported by Coverity Scan for PostgreSQL

2018-08-02 Thread Tomas Vondra

On 08/01/2018 04:07 PM, Emre Hasegeli wrote:

>> I think there are three different things that need to be addressed:
>>
>> * Underspecified comments.

> I agree.  My patch added comments, the next one with actual fixes adds
> more.  I just didn't want to invest even more time on them while the
> code is in its current shape.

>> * The function names and argument names are badly chosen IMO, because even
>> granted a convention such as the above, it's not very obvious what roles
>> "l1" and "l2" play.  I'm not exactly sure what would be better, but if you
>> used names like "ofseg" and "otherseg" you'd at least be trying.  I'd go
>> with an asymmetrical function name too, to make miswriting of calls less
>> likely.

> Good idea.  I hadn't thought about that, but now such names make more
> sense to me.  I will prepare another patch to improve the naming and
> comments, to be applied on top of my current patches.

>> * And lastly, are we sure there aren't actual *bugs* here?  I'd initially
>> supposed that lseg_closept_lseg acted as Emre says above, but reading the
>> code makes me think it's the other way around.  Its first two potential
>> assignments to *result are definitely assigning points on l2 not l1.

> Yes, it is wrong beyond understanding.  The committed patch intended
> to keep the answers as they were.  The next one actually fixes those.



OK, so I think we should not do anything about the issues reported by
Coverity until we commit all the patches, which may resolve some of
those (pre-existing) issues anyway.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Have an encrypted pgpass file

2018-08-02 Thread Geoff Winkless
On Tue, 24 Jul 2018 at 11:25, Marco van Eck  wrote:

> Indeed having unencrypted passwords lying (.pgpass or PGPASSWORD or -W)
> around is making my auditors unhappy,
>

With the greatest of respect, perhaps you need to get auditors who
understand crypto better.

Having a user that has the minimal permissions to perform the required
tasks, with a stored password that only the automation user can read, is
perfectly valid. Encrypting it with a key that must (perforce) be
accessible using the same permissions that the user would need in order
to read the unencrypted password file is no more valid (look up "security
through obscurity").

Perhaps you could make your auditors happier by restricting that user's
permissions to only run a defined function, and make that function do the
work that the automation script wants? So even if the attacker can access
the password he will still only be able to run that function? (You could
even add DOS protection into the function to ensure it's only run so often,
if you were worried about that.)
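
A sketch of that arrangement -- all object names hypothetical:

CREATE ROLE automation LOGIN PASSWORD 'swordfish';
CREATE TABLE report_log (run_at timestamptz);
REVOKE ALL ON report_log FROM PUBLIC;

CREATE FUNCTION nightly_report() RETURNS void
    LANGUAGE sql SECURITY DEFINER
    AS $$ INSERT INTO report_log VALUES (now()) $$;
REVOKE EXECUTE ON FUNCTION nightly_report() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION nightly_report() TO automation;

A stolen password then buys an attacker nothing beyond the ability to run
nightly_report().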

Geoff


Re: Explain buffers wrong counter with parallel plans

2018-08-02 Thread Amit Kapila
On Thu, Aug 2, 2018 at 10:26 AM, Amit Kapila  wrote:
> On Thu, Aug 2, 2018 at 8:38 AM, Andres Freund  wrote:
>> Hi,
>>
>> On 2018-08-02 08:21:58 +0530, Amit Kapila wrote:
>>> I think something on the lines what Tom and you are suggesting can be
>>> done with the help of EXEC_FLAG_BACKWARD, but I don't see the need to
>>> do anything for this patch.  The change in nodeLimit.c is any way for
>>> forward scans, so there shouldn't be any need for any other check.
>>
>> I think this is almost a guarantee to introduce bugs in the future. And
>> besides that, as Robert points out, it's essentially an exiting bug for
>> custom scans.  Given that EXEC_FLAG_BACKWARD already exists, why not do
>> the right thing here?
>>
>
> Sure, if we want we can do it in this patch, but this is not the
> problem of this patch.  It is also related to existing usage of
> shutdown callbacks.  I think we can use es_top_eflags in Estate to
> detect it and then call shutdown only if EXEC_FLAG_BACKWARD is not
> set.  I think we should do that as a separate patch either before or
> after this patch rather than conflating that change into this patch.
> IIUC, then Robert also said that we should fix that separately.  I
> will do as per whatever consensus we reach here.
>

I have created three patches: (a) move InstrStartParallelQuery from its
original location so that we perform it just before ExecutorRun, (b) a
patch to fix the gather stats by calling shutdown at the appropriate place
and allowing stats collection in ExecShutdownNode, and (c) prohibit calling
ExecShutdownNode if there is a possibility of backward scans (I have
done some basic tests with this patch; if we decide to proceed with
it, then some more verification and testing would be required).

I think we should commit the first two patches, as that fixes the problem
being discussed in this thread, and then do some additional
verification for the third patch (mentioned in option c).  I can
understand if people want to commit the third patch before the second
patch, so let me know what you guys think.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


postpone_buffer_usage_tracking_v1.patch
Description: Binary data


fix_gather_stats_v2.patch
Description: Binary data


prohibit_shutdown_backward_scans_v1.patch
Description: Binary data


Re: [HACKERS] Parallel Append implementation

2018-08-02 Thread Adrien NAYRAT

On 08/01/2018 03:14 PM, Robert Haas wrote:

> Committed to master and v11.  Thanks for the review.


Thanks!



FW: [Todo item] Add entry creation timestamp column to pg_stat_replication

2018-08-02 Thread MyungKyu LIM
I changed the field name from 'reply_time' to 'last_msg_send_time',
because 'last_msg_send_time' is used in the
pg_stat_wal_receiver/pg_stat_subscription views.
I think the field has the same meaning.

test example>
postgres=# select pid, last_msg_send_time from pg_stat_replication;
-[ RECORD 1 ]--+--
pid| 12015
last_msg_send_time | 2018-08-02 18:02:49.233049+09
-[ RECORD 2 ]--+--
pid| 12084
last_msg_send_time | 2018-08-02 18:02:48.583256+09

I attached a new patch file: 0001-Implement-following-TODO-list-item-v2.patch

Feedback and suggestion will be very welcome.
Thanks!
 
Best regards,
Myungkyu, Lim
 
- Original Message -
Date   : 2018-07-31 17:56 (GMT+9)
Title  : [Todo item] Add entry creation timestamp column to pg_stat_replication
 
Hello hackers,
 
I have worked on following todo list item.
 
  - Add entry creation timestamp column to pg_stat_replication
http://archives.postgresql.org/pgsql-hackers/2011-08/msg00694.php
 
This item looks simple because the necessary data already exists.
So, I wrote a prototype patch.
 
test example>
postgres=# select pid, reply_time from pg_stat_replication;
-[ RECORD 1 ]-
pid| 4817
reply_time | 2018-07-31 12:00:53.911198+09
-[ RECORD 2 ]-
pid| 4819
reply_time | 2018-07-31 12:00:53.911154+09
 
 
Several candidates exist for the field name.
- reply_timestamp
- info_gen_timestamp
- stats_reset
- last_msg_send_time
 
Feedback and suggestion will be very welcome.
Thanks!
 
Best regards,
Myungkyu, Lim

0001-Implement-following-TODO-list-item-v1.patch
Description: Binary data


0001-Implement-following-TODO-list-item-v2.patch
Description: Binary data


Re: patch to ensure logical decoding errors early

2018-08-02 Thread Simon Riggs
On 1 August 2018 at 23:11, Alvaro Herrera  wrote:
> On 2018-Aug-01, Dave Cramer wrote:
>
>> See attached patch which fixes it, and adds a test for it.
>
> Pushed, thanks.

Thanks

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Have an encrypted pgpass file

2018-08-02 Thread Kyotaro HORIGUCHI
Hello.

I have had complaints several times about the lack of this kind of
feature.

At Wed, 1 Aug 2018 17:33:39 +0200, Marco van Eck  wrote 
in 
> After explaining the patch to a colleague we identified potential execution
> as another user when it is defined as a command parameter. To protect
> against it I've removed the possibility to pass the 'passcommand'. As a
> result libpq only allows the PGPASSCOMMAND environment variable, which can
> only be defined by the executing user, and will be executed by the same
> user. It only reduces the need for unencrypted passwords in a file.

Maybe we don't need the new environment variable if we just allow
.pgpass to be a script. If we put faith in the security of .pgpass,
this won't be a problem, putting aside what the script actually
does.
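
For instance (the command and file names are mine, and I'm assuming the
patch takes the command's standard output in place of the .pgpass file's
contents):

    export PGPASSCOMMAND='gpg --quiet --decrypt ~/.pgpass.gpg'
    psql -h db.example.com -U app mydb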

> I think this solution is secure enough, shall we solve this feature-request?
> 
> 
> Regards, Marco
> 
> On Tue, Jul 24, 2018 at 4:00 PM Tom Lane  wrote:
> 
> > Marco van Eck  writes:
> > > Indeed having unencrypted passwords lying (.pgpass or PGPASSWORD or -W)
> > > around is making my auditors unhappy, and forcing me to enter the
> > password
> > > over and over again. With a simple test it seems the password entered by
> > > the user also stays in memory, since it is able to reset a broken
> > > connection. Finding the password in memory is not trivial, but prevention
> > > is always preferred.
> >
> > > It might be an idea to wipe the password after the login, and
> > decrypt/read
> > > it again if it needs to reconnect. Would this make the solution more
> > > secure? I had a quick look at the code and the patch would stay compact.
> > > Please let me know of doing this would make sense.
> >
> > We're basically not going to accept any added complication that's designed
> > to prevent memory-inspection attacks, because in general that's a waste
> > of effort.  All you're doing is (slightly) reducing the attack window.
> >
> > regards, tom lane

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center