Re: let's kill AtSubStart_Notify

2019-09-11 Thread Dilip Kumar
On Wed, Sep 11, 2019 at 6:22 PM Robert Haas  wrote:
>
> There are only four subsystems which require a callback at the
> beginning of each subtransaction: the relevant functions are
> AtSubStart_Memory, AtSubStart_ResourceOwner, AtSubStart_Notify, and
> AfterTriggerBeginSubXact. The AtSubStart_Memory and
> AtSubStart_ResourceOwner callbacks seem relatively unobjectionable,
> because almost every subtransaction is going to allocate memory and
> acquire some resource managed by a resource owner, but the others
> represent initialization that has to be done whether or not the
> corresponding feature is used.
>
> Generally, a subsystem can avoid needing a callback at subtransaction
> start (or transaction start) by detecting new levels of
> subtransactions at time of use. A typical practice is to maintain a
> stack which has entries only for those transaction nesting levels
> where the functionality was used. The attached patch implements this
> method for async.c. I was a little surprised to find that it makes a
> pretty noticeable performance difference when starting and ending
> trivial subtransactions.  I used this test case:
>
> \timing
> do $$begin for i in 1 .. 1000 loop begin null; exception when
> others then null; end; end loop; end;$$;
>
> I ran the test four times with and without the patch and took the
> median of the last three.  This was an attempt to exclude effects due
> to starting up the database cluster. With the patch, the result was
> 3127.377 ms; without the patch, it was 3527.285 ms. That's a big
> enough difference that I'm wondering whether I did something wrong
> while testing this, so feel free to check my work and tell me whether
> I'm all wet. Still, I don't find it wholly unbelievable, because I've
> observed in the past that these code paths are lean enough that a few
> palloc() calls can make a noticeable difference, and the effect of
> this patch is to remove a few palloc() calls.

I did not read the patch, but I ran the same test case you gave, and
I can see a similar improvement with the patch.
With the patch: 8832.988 ms; without the patch: 10252.701 ms (median of three readings).

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: doc: update PL/pgSQL sample loop function

2019-09-11 Thread Ian Barwick

On 2019/09/11 14:44, Amit Kapila wrote:

On Sun, Sep 1, 2019 at 9:09 AM Amit Kapila  wrote:


The current example shows the usage of looping in plpgsql, so as such
there is no correctness issue, but OTOH there is no harm in updating
the example as proposed by Ian Barwick.  Does anyone else see any
problem with this idea?  If we agree to proceed with this update, it
might be better to backpatch it for the sake of consistency though I
am not sure about that.



While checking the patch in back-branches, I noticed that it doesn't
get applied to 9.4 due to the way the example forms the string.  I
have done the required changes for 9.4 as well and attached is the
result.


Aha, I had it in my head that 9.4 was being deprecated soon and didn't
check that far back, but it turns out it's around until Feb. 2020.


Ian, if possible, can you once check the patch for 9.4?


Looks good, thanks for catching that!


Regards

Ian Barwick

--
 Ian Barwick   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: POC: Cleaning up orphaned files using undo logs

2019-09-11 Thread Alvaro Herrera
On 2019-Sep-06, vignesh C wrote:

> Hi Thomas,
> 
> While testing one of the recovery scenarios I found one issue:
> FailedAssertion("!(logno == context->recovery_logno)

I marked this patch Waiting on Author.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Do not check unlogged indexes on standby

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 7:10 PM Peter Geoghegan  wrote:
> The patch has been committed already.

Oh, wait. It hasn't. Andrey didn't create a new thread for his largely
independent patch, so I incorrectly assumed he created a CF entry for
his original bugfix.


-- 
Peter Geoghegan




Re: Do not check unlogged indexes on standby

2019-09-11 Thread Peter Geoghegan
The patch has been committed already.

Peter Geoghegan
(Sent from my phone)


Re: Pulling up direct-correlated ANY_SUBLINK

2019-09-11 Thread Richard Guo
Hi Tom,

On Tue, Sep 10, 2019 at 9:48 PM Tom Lane  wrote:

>
> > Can we try to pull up direct-correlated ANY SubLink with the help of
> > LATERAL?
>
> Perhaps.  But what's the argument that you'd end up with a better
> plan?  LATERAL pretty much constrains things to use a nestloop,
> so I'm not sure there's anything fundamentally different.
>

This is a point I didn't think of. In that case, if the pull-up mostly
results in a nestloop, then we cannot be sure we will get a better
plan. Thank you for pointing it out.

Thanks
Richard


Re: base backup client as auxiliary backend process

2019-09-11 Thread Kyotaro Horiguchi
Hello, thanks for pinging.

At Wed, 11 Sep 2019 19:15:24 -0300, Alvaro Herrera  
wrote in <20190911221524.GA16563@alvherre.pgsql>
> On 2019-Aug-30, Peter Eisentraut wrote:
> 
> > Attached is an updated patch for this.  I have changed the initdb option
> > name per suggestion.  The WAL receiver is now started concurrently with
> > the base backup.  There is progress reporting (ps display), fsyncing.
> > Configuration files are not copied anymore.  There is a simple test
> > suite.  Tablespace support is still missing, but it would be
> > straightforward.
> 
> This is an amazing feature.  How come we don't have people cramming to
> review this?

I love it, too. As for me, the reason for hesitating to review this
is that the patch is said to be experimental. That means 'the details
don't matter, let's discuss its design/outline'. So I wanted to
see what the past reviewers would say about the revised shape before
stirring up the discussion with maybe-pointless comments. (Then I
forgot about it..)

I'll take another look at this.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] [PATCH] Generic type subscripting

2019-09-11 Thread Alvaro Herrera
On 2019-Jul-11, Dmitry Dolgov wrote:

> > On Thu, Jul 11, 2019 at 9:47 AM David Fetter  wrote:
> >
> > Looks great!
> >
> > The tutorial piece has bit-rotted slightly. Please find attached a
> > patch atop yours that fixes it.
> 
> Indeed, I missed this change, thank you! Although it is supposed to be
> upperindex, not numupper (since the latter is just the number of upper indexes).

Can you please send an updated version?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Do not check unlogged indexes on standby

2019-09-11 Thread Alvaro Herrera
On 2019-Aug-15, Andrey Borodin wrote:

> PFA V1 of this check retry.

CFbot complains that this doesn't apply; can you please rebase?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: psql - improve test coverage from 41% to 88%

2019-09-11 Thread Michael Paquier
On Wed, Sep 11, 2019 at 10:52:01PM +0200, Fabien COELHO wrote:
> AFAICR this is because the coverage was not the same:-) Some backslash
> commands just skip silently to the end of the line, so that intermediate
> \commands on the same line are not recognized/processed the same, so I moved
> everything on one line to avoid this.

I see.  So basically this tests for more code paths to ignore
backslash commands and it improves the coverage of \elif.  Applied
after fixing two nits:
- Indentation was incorrect.
- Moved the \elif test closer to the existing one for the false
branch (you can grep #2 to find it).
--
Michael


signature.asc
Description: PGP signature


Re: Create collation reporting the ICU locale display name

2019-09-11 Thread Michael Paquier
On Wed, Sep 11, 2019 at 04:53:16PM +0200, Daniel Verite wrote:
> I think it would be nice to have CREATE COLLATION report this
> information as feedback in the form of a NOTICE message.
> PFA a simple patch implementing that.

Why is that better than the descriptions provided with \dO[S]+ in
psql?
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 3:09 PM Peter Geoghegan  wrote:
> Hmm. So v12 seems to have some problems with the WAL logging for
> posting list splits. With wal_debug = on and
> wal_consistency_checking='all', I can get a replica to fail
> consistency checking very quickly when "make installcheck" is run on
> the primary

I see the bug here. The problem is that we WAL-log a version of the
new item that already has its heap TID changed. On the primary, the
call to _bt_form_newposting() has a new item with the original heap
TID, which is then rewritten before being inserted -- that's correct.
But during recovery, we *start out with* a version of the new item
that *already* had its heap TID swapped. So we have nowhere to get the
original heap TID from during recovery.

Attached patch fixes the problem in a hacky way -- it WAL-logs the
original heap TID, just in case. Obviously this fix isn't usable, but
it should make the problem clearer.

Can you come up with a proper fix, please? I can think of one way of
doing it, but I'll leave the details to you.

The same issue exists in _bt_split(), so the tests will still fail
with wal_consistency_checking -- it just takes a lot longer to reach a
point where an inconsistent page is found, because posting list splits
that occur at the same point that we need to split a page are much
rarer than posting list splits that occur when we simply need to
insert, without splitting the page. I suggest using
wal_consistency_checking to test the fix that you come up with. As I
mentioned, I regularly use it. Also note that there are further
subtleties to doing this within _bt_split() -- see the FIXME comments
there.

Thanks
-- 
Peter Geoghegan


0001-Save-original-new-heap-TID-in-insert-WAL-record.patch
Description: Binary data


Re: [PATCH] Opclass parameters

2019-09-11 Thread Tomas Vondra

On Wed, Sep 11, 2019 at 01:44:28AM +0300, Nikita Glukhov wrote:

On 11.09.2019 1:03, Tomas Vondra wrote:


On Tue, Sep 10, 2019 at 04:30:41AM +0300, Nikita Glukhov wrote:


2. New AM method amattoptions().

  amattoptions() is used to specify per-column AM-specific options.
  The example is signature length for bloom indexes (patch #3).



I'm somewhat confused about how I'm supposed to use this, considering the
patch set only defines this for the contrib/bloom index AM. So let's say I want
to create a custom BRIN opclass with per-attribute options (like the two
BRIN opclasses I work on in the other thread). Clearly, I can't tweak the
IndexAmRoutine from the extension. ISTM the patch series should modify all
existing index AMs to have a valid amattoptions() implementation, calling
the new amproc if defined.

Or what is the correct way to define a custom opclass for an existing index AM
(e.g. BRIN) with attribute options?


Per-attribute opclass options are implemented independently from per-attribute
AM options.  amattoptions() is optional and needs to be defined only if the AM
has per-attribute options.


OK, thanks for the explanation - so the per-attribute opclass options will
work even when the AM does not have amattoptions() defined. Is there any
practical reason not to just define everything as opclass options and
get rid of amattoptions() entirely?


amproc #0 is called regardless of whether amattoptions
is defined or not.  That was the main reason why uniform procnum 0 was
picked.



I still think using procnum 0 and passing the data through fn_expr are not
the right solution. Firstly, traditionally the amprocs are either required
or optional, with required procs having low procnums and optional starting
at 11 or so. The 0 breaks this, because it's optional but it contradicts
the procnum rule. Also, what happens if we need to add another optional
amproc defined for all AMs? Surely we won't use -1.

IMHO we should keep AM-specific procnum and pass it somehow to the AM
machinery.

FWIW there seems to be a bug in identify_opfamily_groups(), which does
this:

    /* Ignore strategy numbers outside supported range */
    if (oprform->amopstrategy > 0 && oprform->amopstrategy < 64)
        thisgroup->operatorset |= ((uint64) 1) << oprform->amopstrategy;

but then identify_opfamily_groups() computes allfuncs without any such
restriction, i.e. it includes procnum 0. And then it fails on this check

   if (thisgroup->functionset != allfuncs) {...}

None of the built-in brin opclasses defines the new amproc, so the code
does not hit this issue. I only noticed this with the opclasses added in
the other thread.

As for the fn_expr, I still think this seems like a misuse of a field
which was intended for something else. I wonder if it might be breaking
some exising code - either in code or in some extension. It seems quite
possible.

It just seems like we're inventing a novel way to pass data
into a function while we already have parameters for that purpose. Adding
parameters may require care so as not to break existing opclasses, but it
seems like the right approach.


You should simply define a function like this and use it as amproc #0:

Datum
brin_bloom_options(PG_FUNCTION_ARGS)
{
   local_relopts *relopts = (local_relopts *) PG_GETARG_POINTER(0);
   BloomOptions *blopts = NULL;

   extend_local_reloptions(relopts, blopts, sizeof(*blopts));

   add_local_real_reloption(relopts, "n_distinct_per_range", "desc",
                            -0.1, -1.0, INT_MAX, &blopts->nDistinctPerRange);

   add_local_real_reloption(relopts, "false_positive_rate", "desc",
                            0.01, 0.001, 1.0, &blopts->falsePositiveRate);

   PG_RETURN_VOID();
}



OK, this did the trick. Thanks. I don't have a clear opinion on the API,
but it certainly looks like an improvement.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: amcheck verification for GiST

2019-09-11 Thread Peter Geoghegan
On Sun, Sep 8, 2019 at 1:21 AM Andrey Borodin  wrote:
> Maybe we should PageGetItemIdCareful() to amcheck.c too?
> I think it can be reused later by GIN entry tree and to some extent by 
> SP-GiST.
> SP-GiST uses redirect tuples, but do not store this information in line 
> pointer.

Well, the details are slightly different in each case in at least one
way -- we use the size of the special area to determine the exact
bounds that it is safe for a tuple to appear in. The size of the
special area varies based on the access method. (Actually, pg_filedump
relies on this difference when inferring which access method a
particular page image is based on -- it starts out by looking at the
standard pd_special field that appears in page headers. So clearly
they're often different.)

> > My main concern now is the heavyweight lock strength needed by the new
> > function. I don't feel particularly qualified to sign off on the
> > concurrency aspects of the patch. Heikki's v6 used a ShareLock, like
> > bt_index_parent_check(), but you went back to an AccessShareLock,
> > Andrey. Why is this safe? I see that you do gist_refind_parent() in
> > your v9 a little differently to Heikki in his v6, which you seemed to
> > suggest made this safe in your e-mail on March 28, but I don't
> > understand that at all.
>
> Heikki's version was reading childblkno instead of parentblkno, thus never
> refinding the parent tuple.

> When we suspect a key consistency violation, we hold a lock on a page with
> some tuple. Then we take a pin and lock on the page that was the parent of
> the current page some time before.
> For example of validity see gistfinishsplit(). Comments state "On entry, the 
> caller must hold a lock on stack->buffer", line 1330 acquires 
> LockBuffer(stack->parent->buffer, GIST_EXCLUSIVE);
> This function is used during inserts, but we are not going to modify data or
> place row locks, thus neither ROW EXCLUSIVE nor ROW SHARE is necessary.

But gistfinishsplit() is called when finishing a page split -- the
F_FOLLOW_RIGHT bit must be set on the leaf. Are you sure that you can
generalize from that, and safely relocate the parent?

I would be a lot more comfortable with this if Heikki weighed in. I am
also concerned about the correctness of this because of this paragraph
from the GiST README file:

"""
This page deletion algorithm works on a best-effort basis. It might fail to
find a downlink, if a concurrent page split moved it after the first stage.
In that case, we won't be able to remove all empty pages. That's OK, it's
not expected to happen very often, and hopefully the next VACUUM will clean
it up.
"""

Why is this not a problem for the new amcheck checks?  Maybe this is a
very naive question. I don't claim to be a GiST expert.

--
Peter Geoghegan




Re: base backup client as auxiliary backend process

2019-09-11 Thread Alvaro Herrera
On 2019-Aug-30, Peter Eisentraut wrote:

> Attached is an updated patch for this.  I have changed the initdb option
> name per suggestion.  The WAL receiver is now started concurrently with
> the base backup.  There is progress reporting (ps display), fsyncing.
> Configuration files are not copied anymore.  There is a simple test
> suite.  Tablespace support is still missing, but it would be
> straightforward.

This is an amazing feature.  How come we don't have people cramming to
review this?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 5:38 AM Anastasia Lubennikova
 wrote:
> Attached is v12, which contains WAL optimizations for posting split and
> page deduplication.

Hmm. So v12 seems to have some problems with the WAL logging for
posting list splits. With wal_debug = on and
wal_consistency_checking='all', I can get a replica to fail
consistency checking very quickly when "make installcheck" is run on
the primary:

4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/30423A0; LSN 0/30425A0:
prev 0/3041C78; xid 506; len 3; blkref #0: rel 1663/16385/2608, blk 56
FPW - Heap/INSERT: off 20 flags 0x00
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/30425A0; LSN 0/3042F78:
prev 0/30423A0; xid 506; len 4; blkref #0: rel 1663/16385/2673, blk 13
FPW - Btree/INSERT_LEAF: off 138; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3042F78; LSN 0/3043788:
prev 0/30425A0; xid 506; len 4; blkref #0: rel 1663/16385/2674, blk 37
FPW - Btree/INSERT_LEAF: off 68; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3043788; LSN 0/30437C0:
prev 0/3042F78; xid 506; len 28 - Transaction/ABORT: 2019-09-11
15:01:06.291717-07; rels: pg_tblspc/16388/PG_13_201909071/16385/16399
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/30437C0; LSN 0/3043A30:
prev 0/3043788; xid 507; len 3; blkref #0: rel 1663/16385/1247, blk 9
FPW - Heap/INSERT: off 9 flags 0x00
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3043A30; LSN 0/3043D08:
prev 0/30437C0; xid 507; len 4; blkref #0: rel 1663/16385/2703, blk 2
FPW - Btree/INSERT_LEAF: off 51; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3043D08; LSN 0/3044948:
prev 0/3043A30; xid 507; len 4; blkref #0: rel 1663/16385/2704, blk 1
FPW - Btree/INSERT_LEAF: off 169; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3044948; LSN 0/3044B58:
prev 0/3043D08; xid 507; len 3; blkref #0: rel 1663/16385/2608, blk 56
FPW - Heap/INSERT: off 21 flags 0x00
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3044B58; LSN 0/30454A0:
prev 0/3044948; xid 507; len 4; blkref #0: rel 1663/16385/2673, blk 8
FPW - Btree/INSERT_LEAF: off 156; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/30454A0; LSN 0/3045CC0:
prev 0/3044B58; xid 507; len 4; blkref #0: rel 1663/16385/2674, blk 37
FPW - Btree/INSERT_LEAF: off 71; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3045CC0; LSN 0/3045F48:
prev 0/30454A0; xid 507; len 3; blkref #0: rel 1663/16385/1247, blk 9
FPW - Heap/INSERT: off 10 flags 0x00
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3045F48; LSN 0/3046240:
prev 0/3045CC0; xid 507; len 4; blkref #0: rel 1663/16385/2703, blk 2
FPW - Btree/INSERT_LEAF: off 51; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3046240; LSN 0/3046E70:
prev 0/3045F48; xid 507; len 4; blkref #0: rel 1663/16385/2704, blk 1
FPW - Btree/INSERT_LEAF: off 44; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3046E70; LSN 0/3047090:
prev 0/3046240; xid 507; len 3; blkref #0: rel 1663/16385/2608, blk 56
FPW - Heap/INSERT: off 22 flags 0x00
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3047090; LSN 0/30479E0:
prev 0/3046E70; xid 507; len 4; blkref #0: rel 1663/16385/2673, blk 8
FPW - Btree/INSERT_LEAF: off 156; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/30479E0; LSN 0/3048420:
prev 0/3047090; xid 507; len 4; blkref #0: rel 1663/16385/2674, blk 38
FPW - Btree/INSERT_LEAF: off 10; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3048420; LSN 0/30486B0:
prev 0/30479E0; xid 507; len 3; blkref #0: rel 1663/16385/1259, blk 0
FPW - Heap/INSERT: off 6 flags 0x00
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/30486B0; LSN 0/3048C30:
prev 0/3048420; xid 507; len 4; blkref #0: rel 1663/16385/2662, blk 2
FPW - Btree/INSERT_LEAF: off 119; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3048C30; LSN 0/3049668:
prev 0/30486B0; xid 507; len 4; blkref #0: rel 1663/16385/2663, blk 1
FPW - Btree/INSERT_LEAF: off 42; in_posting_offset 0
4448/2019-09-11 15:01:06 PDT LOG:  REDO @ 0/3049668; LSN 0/304A550:
prev 0/3048C30; xid 507; len 4; blkref #0: rel 1663/16385/3455, blk 1
FPW - Btree/INSERT_LEAF: off 2; in_posting_offset 1
4448/2019-09-11 15:01:06 PDT FATAL:  inconsistent page found, rel
1663/16385/3455, forknum 0, blkno 1
4448/2019-09-11 15:01:06 PDT CONTEXT:  WAL redo at 0/3049668 for
Btree/INSERT_LEAF: off 2; in_posting_offset 1
4447/2019-09-11 15:01:06 PDT LOG:  startup process (PID 4448) exited
with exit code 1
4447/2019-09-11 15:01:06 PDT LOG:  terminating any other active server processes
4447/2019-09-11 15:01:06 PDT LOG:  database system is shut down

I regularly use this test case for the patch -- I think that I fixed a
similar problem in v11, when I changed the same WAL logging, but I
didn't mention it until now. I will debug this myself in a few days,
though you may prefer to do it before then.

-- 
Peter Geoghegan




Re: Run-time pruning for ModifyTable

2019-09-11 Thread Alvaro Herrera
Here's a rebased version of this patch (it had a trivial conflict).
No further changes.


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 82d8140ba2..677b9cdd58 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1951,7 +1951,7 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
 	if (plan && plan->operation == CMD_UPDATE &&
 		(resultRelInfo->ri_usesFdwDirectModify ||
 		 resultRelInfo->ri_FdwState) &&
-		resultRelInfo > mtstate->resultRelInfo + mtstate->mt_whichplan)
+		resultRelInfo > mtstate->resultRelInfos[mtstate->mt_whichplan])
 		ereport(ERROR,
 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
  errmsg("cannot route tuples into foreign table to be updated \"%s\"",
@@ -2005,7 +2005,7 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
 		 */
 		if (plan && plan->operation == CMD_UPDATE &&
 			resultRelation == plan->rootRelation)
-			resultRelation = mtstate->resultRelInfo[0].ri_RangeTableIndex;
+			resultRelation = mtstate->resultRelInfos[0]->ri_RangeTableIndex;
 	}
 
 	/* Construct the SQL command string. */
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3aeef30b28..2e9954ddda 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2856,7 +2856,7 @@ CopyFrom(CopyState cstate)
 	mtstate->ps.plan = NULL;
 	mtstate->ps.state = estate;
 	mtstate->operation = CMD_INSERT;
-	mtstate->resultRelInfo = estate->es_result_relations;
+	mtstate->resultRelInfos = &estate->es_result_relations;
 
 	if (resultRelInfo->ri_FdwRoutine != NULL &&
 		resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 62fb3434a3..682bf615db 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -3190,14 +3190,14 @@ show_modifytable_info(ModifyTableState *mtstate, List *ancestors,
 	/* Should we explicitly label target relations? */
 	labeltargets = (mtstate->mt_nplans > 1 ||
 	(mtstate->mt_nplans == 1 &&
-	 mtstate->resultRelInfo->ri_RangeTableIndex != node->nominalRelation));
+	 mtstate->resultRelInfos[0]->ri_RangeTableIndex != node->nominalRelation));
 
 	if (labeltargets)
 		ExplainOpenGroup("Target Tables", "Target Tables", false, es);
 
 	for (j = 0; j < mtstate->mt_nplans; j++)
 	{
-		ResultRelInfo *resultRelInfo = mtstate->resultRelInfo + j;
+		ResultRelInfo *resultRelInfo = mtstate->resultRelInfos[j];
 		FdwRoutine *fdwroutine = resultRelInfo->ri_FdwRoutine;
 
 		if (labeltargets)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d23f292cb0..8d1d961809 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -471,7 +471,7 @@ ExecHashSubPlanResultRelsByOid(ModifyTableState *mtstate,
 	/* Hash all subplans by their Oid */
 	for (i = 0; i < mtstate->mt_nplans; i++)
 	{
-		ResultRelInfo *rri = &mtstate->resultRelInfo[i];
+		ResultRelInfo *rri = mtstate->resultRelInfos[i];
 		bool		found;
 		Oid			partoid = RelationGetRelid(rri->ri_RelationDesc);
 		SubplanResultRelHashElem *elem;
@@ -508,7 +508,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 	ModifyTable *node = (ModifyTable *) mtstate->ps.plan;
 	Relation	rootrel = rootResultRelInfo->ri_RelationDesc,
 partrel;
-	Relation	firstResultRel = mtstate->resultRelInfo[0].ri_RelationDesc;
+	Relation	firstResultRel = mtstate->resultRelInfos[0]->ri_RelationDesc;
 	ResultRelInfo *leaf_part_rri;
 	MemoryContext oldcxt;
 	AttrNumber *part_attnos = NULL;
@@ -556,7 +556,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 		List	   *wcoList;
 		List	   *wcoExprs = NIL;
 		ListCell   *ll;
-		int			firstVarno = mtstate->resultRelInfo[0].ri_RangeTableIndex;
+		int			firstVarno = mtstate->resultRelInfos[0]->ri_RangeTableIndex;
 
 		/*
 		 * In the case of INSERT on a partitioned table, there is only one
@@ -621,7 +621,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 		TupleTableSlot *slot;
 		ExprContext *econtext;
 		List	   *returningList;
-		int			firstVarno = mtstate->resultRelInfo[0].ri_RangeTableIndex;
+		int			firstVarno = mtstate->resultRelInfos[0]->ri_RangeTableIndex;
 
 		/* See the comment above for WCO lists. */
 		Assert((node->operation == CMD_INSERT &&
@@ -681,7 +681,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 	 */
 	if (node && node->onConflictAction != ONCONFLICT_NONE)
 	{
-		int			firstVarno = mtstate->resultRelInfo[0].ri_RangeTableIndex;
+		int			firstVarno = mtstate->resultRelInfos[0]->ri_RangeTableIndex;
 		TupleDesc	partrelDesc = RelationGetDescr(partrel);
 		ExprContext *econtext = mtstate->ps.ps_ExprContext;
 		ListCell   *lc;
diff --git a/src/backend/executor/nodeModifyTable.c 

Re: proposal - patch: psql - sort_by_size

2019-09-11 Thread Alvaro Herrera from 2ndQuadrant
On 2019-Jul-31, Rafia Sabih wrote:

> I had a look at this patch, seems like a useful thing to have.

So the two initial questions for this patch are

1. Is this a feature we want?
2. Is the user interface correct?

I think the feature is useful, and Rafia also stated as much.  Therefore
ISTM we're okay on that front.

As for the UI, Fabien thinks the patch adopts one that's far too
simplistic, and I agree.  Fabien has proposed a number of different UIs,
but doesn't seem convinced of any of them.  One of them was to have
"options" in the command,
 \dt+ [-o 1d,2a]

Another idea is to use variables in a more general form.  So instead of
Pavel's proposal of SORT_BY_SIZE=on we could do something like
SORT_BY=[list]
where the list after the equal sign consists of predetermined elements
(say SIZE, NAME, SCHEMA and so on) and indicates a specific column to
sort by.  This is less succinct than Fabien's idea, and in particular you
can't specify it in the command itself but have to set the variable
beforehand instead.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_upgrade fails with non-standard ACL

2019-09-11 Thread Alvaro Herrera from 2ndQuadrant
On 2019-Aug-21, Bruce Momjian wrote:

> > 1) How exactly should we report this incompatibility to a user?
> > I think it's fine to leave the warnings and also write some hint for the
> > user by analogy with other checks.
> > "Reset ACL on the problem functions to default in the old cluster to
> > continue"
>
> Yes, I think it is good to at least throw an error during --check so
> they don't have to find out during a live upgrade.  Odds are it will
> require manual repair.

I'm not sure what you're proposing here ... are you saying that the user
would have to modify the source cluster before pg_upgrade accepts to
run?  That sounds pretty catastrophic.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_upgrade fails with non-standard ACL

2019-09-11 Thread Alvaro Herrera from 2ndQuadrant
Stephen,

On 2019-Aug-20, Stephen Frost wrote:

> Will try to take a look at the actual patch later today.

Any word on that review?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: add a MAC check for TRUNCATE

2019-09-11 Thread Alvaro Herrera from 2ndQuadrant
Hello,

I moved this patch from "Bug Fixes" to Security.

Thanks,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 5:38 AM Anastasia Lubennikova
 wrote:
> I reviewed them and everything looks good except the idea of not
> splitting dead posting tuples.
> According to the comments on scan->ignore_killed_tuples in genam.c:107,
> it may lead to incorrect tuple order on a replica.
> I'm not sure whether it leads to any real problem, though, or whether it
> will be resolved by subsequent visibility checks.

Fair enough, but I didn't do that because it's compelling on its own
-- it isn't. I did it because it seemed like the best way to handle
posting list splits in a version of the patch where LP_DEAD bits can
be set on posting list tuples. I think that we have 3 high level
options here:

1. We don't support kill_prior_tuple/LP_DEAD bit setting with posting
lists at all. This is clearly the easiest approach.

2. We do what I did in v11 of the patch -- we make it so that
_bt_insertonpg() and _bt_split() never have to deal with LP_DEAD
posting lists that they must split in passing.

3. We add additional code to _bt_insertonpg() and _bt_split() to deal
with the rare case where they must split an LP_DEAD posting list,
probably by unsetting the bit or something like that. Obviously it
would be wrong to leave the LP_DEAD bit set for the newly inserted
heap tuples TID that must go in a posting list that had its LP_DEAD
bit set -- that would make it dead to index scans even after its xact
successfully committed.

I think that you already agree that we want to have the
kill_prior_tuple optimizations with posting lists, so #1 isn't really
an option. That just leaves #2 and #3. Since posting list splits are
already assumed to be quite rare, it seemed far simpler to take the
conservative approach of forcing clean-up that removes LP_DEAD bits so
that _bt_insertonpg() and _bt_split() don't have to think about it.
Obviously I think it's important that we make as few changes as
possible to _bt_insertonpg() and _bt_split(), in general.

I don't understand what you mean about visibility checks. There is
nothing truly special about the way in which _bt_findinsertloc() will
sometimes have to kill LP_DEAD items so that _bt_insertonpg() and
_bt_split() don't have to think about LP_DEAD posting lists. As far as
recovery is concerned, it is just another XLOG_BTREE_DELETE record,
like any other. Note that there is a second call to
_bt_binsrch_insert() within _bt_findinsertloc() when it has to
generate a new XLOG_BTREE_DELETE record (by calling
_bt_dedup_one_page(), which calls _bt_delitems_delete() in a way that
isn't dependent on the BTP_HAS_GARBAGE status bit being set).

> Anyway, it's worth adding more comments in
> _bt_killitems() explaining why it's safe.

There is no question that the little snippet of code I added to
_bt_killitems() in v11 is still too complicated. We also have to
consider cases where the array overflows because the scan direction
was changed (see the kill_prior_tuple comment block in btgetuple()).
Yeah, it's messy.

> Attached is v12, which contains WAL optimizations for posting split and
> page
> deduplication.

Cool.

> * xl_btree_split record doesn't contain posting tuple anymore, instead
> it keeps
> 'in_posting offset'  and repeats the logic of _bt_insertonpg() as you
> proposed
> upthread.

That looks good.

> * I introduced new xlog record XLOG_BTREE_DEDUP_PAGE, which contains
> info about
> groups of tuples deduplicated into posting tuples. In principle, it is
> possible
> to fit it into some existing record, but I preferred to keep things clear.

I definitely think that inventing a new WAL record was the right thing to do.

> I haven't measured how these changes affect WAL size yet.
> Do you have any suggestions on how to automate testing of new WAL records?
> Is there any suitable place in regression tests?

I don't know about the regression tests (I doubt that there is a
natural place for such a test), but I came up with a rough test case.
I more or less copied the approach that you took with the index build
WAL reduction patches, though I also figured out a way of subtracting
heapam WAL overhead to get a real figure. I attach the test case --
note that you'll need to use the "land" database with this. (This test
case might need to be improved, but it's a good start.)

> * I also noticed that _bt_dedup_one_page() can be optimized to return early
> when no tuples were deduplicated. I wonder if we can introduce internal
> statistics to tune deduplication? That is returning to the idea of
> BT_COMPRESS_THRESHOLD, which can help to avoid extra work for pages that
> have very few duplicates or pages that are already full of posting lists.

I think that the BT_COMPRESS_THRESHOLD idea is closely related to
making _bt_dedup_one_page() behave incrementally.

On my machine, v12 of the patch actually uses slightly more WAL than
v11 did with the nbtree_wal_test.sql test case -- it's 6510 MB of
nbtree WAL in v12 vs. 6502 MB in v11 (note that v11 benefits from WAL
compression, so if I turned that off v12 would 

Re: psql - improve test coverage from 41% to 88%

2019-09-11 Thread Fabien COELHO


Bonjour Michaël,


+=item $node->icommand_checks(cmd, ...)
+
+=cut
+
+sub icommand_checks

Surely this can have a better description, like say
PostgresNode::command_checks_all.


Ok.


Is Expect compatible down to perl 5.8.0 which is the minimum required
for the TAP tests (see src/test/perl/README)?


I think so. It looks like this has existed for a very long time (22 
years?), but I cannot easily test it against perl 5.8.



There are cases where we don't support tab completion, aka no
USE_READLINE.  So tests would need to be skipped.


Good catch. I added a skip if it detects that history/readline is 
disabled.



 -   \a \C arg1 \c arg1 arg2 arg3 arg4 \cd arg1 \conninfo
 +   \a
 +\C arg1
Why are you changing that?


AFAICR this is because the coverage was not the same:-) Some backslash 
commands just skip silently to the end of the line, so that intermediate 
\commands on the same line are not recognized/processed the same, so I 
moved everything onto one line to avoid this.


Your patch does not touch the logic of psql.  Could it make sense as a 
separate patch to shape better the tests?


Nope, this is not just reshaping, it is really about improving coverage.


--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -307,6 +307,7 @@ describeTablespaces(const char *pattern, bool
verbose)
 * a for aggregates
 * n for normal
+ * p for procedure
This is a separate issue, fixed.


Ok.

Attached v3 which fixes the outlined issues.

--
Fabien.

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b7d36b65dd..aabc27ab6f 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -506,6 +506,7 @@ system_or_bail 'pg_ctl', '-D', $pgdata, 'start';
 $node->command_checks_all(
 	[ 'pg_basebackup', '-D', "$tempdir/backup_corrupt" ],
 	1,
+	'',
 	[qr{^$}],
 	[qr/^WARNING.*checksum verification failed/s],
 	'pg_basebackup reports checksum mismatch');
@@ -526,6 +527,7 @@ system_or_bail 'pg_ctl', '-D', $pgdata, 'start';
 $node->command_checks_all(
 	[ 'pg_basebackup', '-D', "$tempdir/backup_corrupt2" ],
 	1,
+	'',
 	[qr{^$}],
 	[qr/^WARNING.*further.*failures.*will.not.be.reported/s],
 	'pg_basebackup does not report more than 5 checksum mismatches');
@@ -542,6 +544,7 @@ system_or_bail 'pg_ctl', '-D', $pgdata, 'start';
 $node->command_checks_all(
 	[ 'pg_basebackup', '-D', "$tempdir/backup_corrupt3" ],
 	1,
+	'',
 	[qr{^$}],
 	[qr/^WARNING.*7 total checksum verification failures/s],
 	'pg_basebackup correctly report the total number of checksum mismatches');
diff --git a/src/bin/pg_checksums/t/002_actions.pl b/src/bin/pg_checksums/t/002_actions.pl
index 59228b916c..cf4811d382 100644
--- a/src/bin/pg_checksums/t/002_actions.pl
+++ b/src/bin/pg_checksums/t/002_actions.pl
@@ -62,6 +62,7 @@ sub check_relation_corruption
 			'--filenode',   $relfilenode_corrupted
 		],
 		1,
+		'',
 		[qr/Bad checksums:.*1/],
 		[qr/checksum verification failed/],
 		"fails with corrupted data for single relfilenode on tablespace $tablespace"
@@ -71,6 +72,7 @@ sub check_relation_corruption
 	$node->command_checks_all(
 		[ 'pg_checksums', '--check', '-D', $pgdata ],
 		1,
+		'',
 		[qr/Bad checksums:.*1/],
 		[qr/checksum verification failed/],
 		"fails with corrupted data on tablespace $tablespace");
@@ -203,6 +205,7 @@ sub fail_corrupt
 	$node->command_checks_all(
 		[ 'pg_checksums', '--check', '-D', $pgdata ],
 		1,
+		'',
 		[qr/^$/],
 		[qr/could not read block 0 in file.*$file\":/],
 		"fails for corrupted data in $file");
diff --git a/src/bin/pg_controldata/t/001_pg_controldata.pl b/src/bin/pg_controldata/t/001_pg_controldata.pl
index 3b63ad230f..ebe9b80a52 100644
--- a/src/bin/pg_controldata/t/001_pg_controldata.pl
+++ b/src/bin/pg_controldata/t/001_pg_controldata.pl
@@ -33,6 +33,7 @@ close $fh;
 command_checks_all(
 	[ 'pg_controldata', $node->data_dir ],
 	0,
+	'',
 	[
 		qr/WARNING: Calculated CRC checksum does not match value stored in file/,
 		qr/WARNING: invalid WAL segment size/
diff --git a/src/bin/pg_resetwal/t/002_corrupted.pl b/src/bin/pg_resetwal/t/002_corrupted.pl
index f9940d7fc5..1990669d26 100644
--- a/src/bin/pg_resetwal/t/002_corrupted.pl
+++ b/src/bin/pg_resetwal/t/002_corrupted.pl
@@ -30,6 +30,7 @@ close $fh;
 command_checks_all(
 	[ 'pg_resetwal', '-n', $node->data_dir ],
 	0,
+	'',
 	[qr/pg_control version number/],
 	[
 		qr/pg_resetwal: warning: pg_control exists but is broken or wrong version; ignoring it/
@@ -46,6 +47,7 @@ close $fh;
 command_checks_all(
 	[ 'pg_resetwal', '-n', $node->data_dir ],
 	0,
+	'',
 	[qr/pg_control version number/],
 	[
 		qr/\Qpg_resetwal: warning: pg_control specifies invalid WAL segment size (0 bytes); proceed with caution\E/
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index b82d3f65c4..01010914fe 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ 

Re: [PATCH] Race condition in logical walsender causes long postgresql shutdown delay

2019-09-11 Thread Alvaro Herrera
On 2019-Sep-03, Alvaro Herrera wrote:

> On 2019-Jul-25, Craig Ringer wrote:
> 
> > Patch attached.
> 
> Here's a non-broken version of this patch.  I have not done anything
> other than reflowing the new comment.

Reading over this code, I noticed that the detection of the catch-up
state ends up being duplicate code, so I would rework that function as
in the attached patch.

The naming of those flags (got_SIGUSR2, got_STOPPING) is terrible, but
I'm not going to change that in a backpatchable bug fix.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 6721ed3559da2b3781b7a0820f165444e573eaee Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 3 Sep 2019 18:17:11 -0400
Subject: [PATCH v3] Fix a delay in PostgreSQL shutdown caused by logical
 replication

Due to a race with WAL writing during shutdown, if logical walsenders were
running then PostgreSQL's shutdown could be delayed by up to
wal_sender_timeout/2 while it waits for the walsenders to shut down. The
walsenders wait for new WAL or end-of-wal which won't come until shutdown so
there's a deadlock. The walsender timeout eventually breaks the deadlock.

The issue was introduced by PostgreSQL 10 commit c6c3334364
"Prevent possibility of panics during shutdown checkpoint".

A logical walsender never enters WALSNDSTATE_STOPPING and allows the
checkpointer to continue shutdown. Instead it exits when it reads
end-of-WAL.  But if it reads the last WAL record written before shutdown
and that record doesn't generate a client network write, it can mark
itself caught up and go to sleep without checking to see if it's been
asked to shut down.

Fix by making sure the logical walsender always checks if it's been asked
to shut down before it allows the walsender main loop to go to sleep.

When this issue happens the walsender(s) can be seen to be sleeping in
WaitLatchOrSocket in WalSndLoop until woken by the keepalive timeout. The
checkpointer will be waiting in WalSndWaitStopping() for the walsenders to
enter WALSNDSTATE_STOPPING or exit, whichever happens first. The postmaster
will be waiting in ServerLoop for the checkpointer to finish the shutdown
checkpoint.
---
 src/backend/replication/walsender.c | 49 +
 1 file changed, 21 insertions(+), 28 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 23870a25a5..1c5380fa7d 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1296,7 +1296,6 @@ WalSndWaitForWal(XLogRecPtr loc)
 	int			wakeEvents;
 	static XLogRecPtr RecentFlushPtr = InvalidXLogRecPtr;
 
-
 	/*
 	 * Fast path to avoid acquiring the spinlock in case we already know we
 	 * have enough WAL available. This is particularly interesting if we're
@@ -2814,6 +2813,7 @@ XLogSendLogical(void)
 {
 	XLogRecord *record;
 	char	   *errm;
+	XLogRecPtr	flushPtr;
 
 	/*
 	 * Don't know whether we've caught up yet. We'll set WalSndCaughtUp to
@@ -2830,11 +2830,15 @@ XLogSendLogical(void)
 	if (errm != NULL)
 		elog(ERROR, "%s", errm);
 
+	/*
+	 * Read current flush point; we'll use it to determine whether we've
+	 * caught up.  Note that logical decoding cannot be used while in
+	 * recovery, so ... XXX what?
+	 */
+	flushPtr = GetFlushRecPtr();
+
 	if (record != NULL)
 	{
-		/* XXX: Note that logical decoding cannot be used while in recovery */
-		XLogRecPtr	flushPtr = GetFlushRecPtr();
-
 		/*
 		 * Note the lack of any call to LagTrackerWrite() which is handled by
 		 * WalSndUpdateProgress which is called by output plugin through
@@ -2843,32 +2847,21 @@ XLogSendLogical(void)
 		LogicalDecodingProcessRecord(logical_decoding_ctx, logical_decoding_ctx->reader);
 
 		sentPtr = logical_decoding_ctx->reader->EndRecPtr;
-
-		/*
-		 * If we have sent a record that is at or beyond the flushed point, we
-		 * have caught up.
-		 */
-		if (sentPtr >= flushPtr)
-			WalSndCaughtUp = true;
 	}
-	else
-	{
-		/*
-		 * If the record we just wanted read is at or beyond the flushed
-		 * point, then we're caught up.
-		 */
-		if (logical_decoding_ctx->reader->EndRecPtr >= GetFlushRecPtr())
-		{
-			WalSndCaughtUp = true;
 
-			/*
-			 * Have WalSndLoop() terminate the connection in an orderly
-			 * manner, after writing out all the pending data.
-			 */
-			if (got_STOPPING)
-				got_SIGUSR2 = true;
-		}
-	}
+	/*
+	 * We might have caught up; set flag if so.
+	 */
+	if (logical_decoding_ctx->reader->EndRecPtr >= flushPtr)
+		WalSndCaughtUp = true;
+
+	/*
+	 * If we're caught up and have been requested to stop, have WalSndLoop()
+	 * terminate the connection in an orderly manner, after writing out all
+	 * the pending data.
+	 */
+	if (WalSndCaughtUp && got_STOPPING)
+		got_SIGUSR2 = true;
 
 	/* Update shared memory status */
 	{
-- 
2.17.1



Re: [PATCH] ltree, lquery, and ltxtquery binary protocol support

2019-09-11 Thread Nino Floris
Hi Alexander,

Thanks for the initial review!

> * Your patch doesn't follow coding convention.  In particular, it
> uses spaces for indentation instead of tabs.  Moreover, it changes
> tags to spaces in places it doesn't touch.  As the result, patch is
> "dirty".  Looks like your IDE isn't properly setup.  Please check your
> patch doesn't contain unintended changes and follow coding convention.
> It's also good practice to run pgindent over changed files before
> sending patch.

Apologies, will fix; I indeed hadn't seen that the requirement is tabs.
Not to sound pedantic, but I would expect an .editorconfig for such
(industry) non-standard indenting, as it's broadly supported and very
little effort to add.
I will run pgindent, thanks for the pointer; that looks super useful.

 > * We currently don't add new extension SQL-script for new extension
> version (i.e. ltree--1.2.sql).  Instead, we provide just a migration
> script (i.e. ltree--1.1--1.2.sql).  This simplifies testing of
> extension migration – plain extension creation implies migration.

I wasn't sure how to add new methods to the type without doing a full
CREATE TYPE, as ALTER TYPE doesn't allow updates to functions. At that
point wouldn't it be better as a new version?
I checked some other extensions like hstore to find reference material
how to do a new CREATE TYPE, all did a full version bump.
Should I just do a DROP and CREATE instead in a migration?

>  * What is motivation for binary format for lquery and ltxtquery?  One
> could transfer large datasets of ltree's.  But is it so for lquery and
> ltxtquery, which are used just for querying.

Completeness, Npgsql (and other drivers like Ecto from Elixir and
probably many others as well) can't do any text fallback in the binary
protocol without manual configuration.
This is because these drivers don't know much (or anything) about the
SQL they send so it wouldn't know to apply it for which columns.
I believe there has been a proposal at some point to enhance the
binary protocol to additionally allow clients to specify text/binary
per data type instead of per column.
That would allow all these drivers to automate this, but I think it
never went anywhere.
As it stands it's not ergonomic to ask developers to set up this
metadata for every query they write.

>  * Just send binary representation of datatype is not OK.  PostgreSQL
> supports a lot of platforms with different byte ordering, alignment
> and so on.  You basically need to send each particular field using
> pq_send*() function.

Oh my, I don't think I did? I copied what jsonb is doing,
specifically, sending the textual representation of the data type with
a version field prefixed.
(That is why I introduced deparse_ltree/lquery: to take the respective
structure and create a null-terminated string of its textual form.)
So there are no other fields besides the version and the string
representation of the structure.
ltree, lquery, and ltxtquery all seem to have a lossless and compact
textual interpretation.
My motivation here has been "if it's good enough for jsonb it should
be good enough for a niche extension like ltree" especially as having
any binary support is better than not having it at all.
I can change it to anything you'd like to see instead but I would need
some pointers as to what that would be.

Again, thank you for taking up this patch to review.

Best regards,
Nino Floris

On Mon, Sep 9, 2019 at 11:38 PM Alexander Korotkov
 wrote:
>
> Hi!
>
> On Sun, Aug 18, 2019 at 6:53 PM Nino Floris  wrote:
> > Attached is a patch to support send/recv on ltree, lquery and ltxtquery.
> > I'm a contributor to the Npgsql .NET PostgreSQL driver and we'll be
> > the first to have official support once those ltree changes have been
> > released.
> > You can find the driver support work here, the tests verify a
> > roundtrip for each of the types is succesful.
> > https://github.com/NinoFloris/npgsql/tree/label-tree-support
>
> Thank you for the patch.
>
> My notes are following:
>  * Your patch doesn't follow coding convention.  In particular, it
> uses spaces for indentation instead of tabs.  Moreover, it changes
> tags to spaces in places it doesn't touch.  As the result, patch is
> "dirty".  Looks like your IDE isn't properly setup.  Please check your
> patch doesn't contain unintended changes and follow coding convention.
> It's also good practice to run pgindent over changed files before
> sending patch.
>  * We currently don't add new extension SQL-script for new extension
> version (i.e. ltree--1.2.sql).  Instead, we provide just a migration
> script (i.e. ltree--1.1--1.2.sql).  This simplifies testing of
> extension migration – plain extension creation implies migration.
>  * What is motivation for binary format for lquery and ltxtquery?  One
> could transfer large datasets of ltree's.  But is it so for lquery and
> ltxtquery, which are used just for querying.
>  * Just send binary representation of datatype is not OK.  PostgreSQL
> supports a lot of platforms 

Re: propagating replica identity to partitions

2019-09-11 Thread Alvaro Herrera from 2ndQuadrant
I've not had time to work on this issue, and there are a few remaining
problems in the patch; mostly that I will need to change the way pg_dump
represents the replica identity so that it is only defined when the
index is dumped in all partitions rather than immediately after creating
the index.  Also, the bug reported by Dmitry Dolgov.

So I'm marking this as Returned with Feedback, with the intention to
resubmit for November.

Thanks,

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-09-11 Thread Julien Rouhaud
Thanks for looking at it!

On Wed, Sep 11, 2019 at 6:45 AM Michael Paquier  wrote:
>
> An invalid query ID is assumed to be 0 in the patch, per the way it is
> defined in pg_stat_statements.  However this also maps with the case
> where we have a utility statement.

Oh indeed.  Which means that if a utility statement later calls
parse_analyze or friends, this patch would report an unexpected
queryid.  That's at least possible for something like

COPY (SELECT * FROM tbl) TO ...

The thing is that pg_stat_statements assigns a 0 queryid in the
post_parse_analyze_hook to recognize utility statements and avoid
tracking instrumentation twice for them, and then computes a queryid
based on a hash of the query text.  Maybe we could instead fully
reserve queryid "2" for utility statements (forcing queryid "1" for
standard queries if jumbling returns 0 *or* 2 instead of only 0), and
use "2" as the identifier for utility statements instead of "0"?

> +   /*
> +* We only report the top-level query identifiers.  The stored queryid is
> +* reset when a backend calls pgstat_report_activity(STATE_RUNNING), or 
> with
> +* an explicit call to this function.  If the saved query identifier is 
> not
> +* zero it means that it's not a top-level command, so ignore the one
> +* provided unless it's an explicit call to reset the identifier.
> +*/
> +   if (queryId != 0 && beentry->st_queryid != 0)
> +   return;
> Hmm.  I am wondering if we shouldn't have an API dedicated to the
> reset of the query ID.  That logic looks rather brittle..

How about adding a "bool force" parameter to allow resetting the queryid to 0?

> Wouldn't it be better (and more consistent) to update the query ID in
> parse_analyze_varparams() and parse_analyze() as well after going
> through the post_parse_analyze hook instead of pg_analyze_and_rewrite?

I thought about it without knowing what would be best.  I'll change to
report the queryid right after calling post_parse_analyze_hook then.

> +   /*
> +* If a new query is started, we reset the query identifier as it'll only
> +* be known after parse analysis, to avoid reporting last query's
> +* identifier.
> +*/
> +   if (state == STATE_RUNNING)
> +   beentry->st_queryid = 0
> I don't quite get why you don't reset the counter in other cases as
> well.  If the backend entry is idle in transaction or in an idle
> state, it seems to me that we should not report the query ID of the
> last query run in the transaction.  And that would make the reset in
> exec_simple_query() unnecessary, no?

I'm reproducing the same behavior as for the query text, i.e. showing
the information about the last executed query if the state is idle:

+ queryid
+ bigint
+ Identifier of this backend's most recent query. If
+  state is active this field
+  shows the identifier of the currently executing query. In all other
+  states, it shows the identifier of the last query that was executed.

I think that showing the last executed query's queryid is as useful as
the query text.  Also, if we avoided the reset in exec_simple_query(),
we would need to reset it on error during query execution instead, so
that wouldn't really make things simpler.




Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2019-09-11 Thread Masahiko Sawada
On Wed, Sep 11, 2019 at 1:46 PM Michael Paquier  wrote:
>
> On Tue, Sep 10, 2019 at 08:29:43AM +0530, Amit Kapila wrote:
> > Good thought, but I think even if we want to change the name of
> > tuple_data_split, it might be better done separately.
>
> Yes, that's not the problem of this patch.  Not sure if it actually
> makes sense either to change it.

Hmm, it would be more consistent with the other functions, but I think
we would need to bump the pageinspect version to 1.8 and add a new SQL
file to rename the function. And it would be for PG12, not PG13. If we
have to do it someday, I think it's better to do it in PG12, where the
table AM was introduced. Anyway, I've attached a separate patch for it.

>
> The regression tests added are rather unreadable when it comes to
> print a lot of infomask flags.  Could you add at least some unnest()
> calls to the queries using heap_infomask_flags()?  It would make the
> diff lookup much more straight-forward to understand.
>

Seems like a good idea.

> It would be good to comment as well what 2816 and 1080 stand for.  The
> current code makes it hard to understand for which purpose this is
> used in the tests.

I've reconsidered and updated the regression tests.

>
> +  If decode_combined is set, combination flags like
> Missing a markup here.
>

Fixed.

I've attached the updated patch that incorporated all comments. I kept
the function as superuser-restricted.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index e5a581f141..cfe01297fb 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o hashfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.6--1.7.sql \
+DATA =  pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
 	pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
 	pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
 	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
diff --git a/contrib/pageinspect/pageinspect--1.7--1.8.sql b/contrib/pageinspect/pageinspect--1.7--1.8.sql
new file mode 100644
index 00..39421e5699
--- /dev/null
+++ b/contrib/pageinspect/pageinspect--1.7--1.8.sql
@@ -0,0 +1,6 @@
+/* contrib/pageinspect/pageinspect--1.7--1.8.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pageinspect UPDATE TO '1.8'" to load this file. \quit
+
+ALTER FUNCTION tuple_data_split(oid, bytea, integer, integer, text) RENAME TO heap_tuple_data_split;
diff --git a/contrib/pageinspect/pageinspect.control b/contrib/pageinspect/pageinspect.control
index dcfc61f22d..f8cdf526c6 100644
--- a/contrib/pageinspect/pageinspect.control
+++ b/contrib/pageinspect/pageinspect.control
@@ -1,5 +1,5 @@
 # pageinspect extension
 comment = 'inspect the contents of database pages at a low level'
-default_version = '1.7'
+default_version = '1.8'
 module_pathname = '$libdir/pageinspect'
 relocatable = true
diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql
index 8ac9991837..bfc3a3fdc1 100644
--- a/contrib/pageinspect/sql/page.sql
+++ b/contrib/pageinspect/sql/page.sql
@@ -26,7 +26,7 @@ SELECT pagesize, version FROM page_header(get_raw_page('test1', 0));
 
 SELECT page_checksum(get_raw_page('test1', 0), 0) IS NOT NULL AS silly_checksum_test;
 
-SELECT tuple_data_split('test1'::regclass, t_data, t_infomask, t_infomask2, t_bits)
+SELECT heap_tuple_data_split('test1'::regclass, t_data, t_infomask, t_infomask2, t_bits)
 FROM heap_page_items(get_raw_page('test1', 0));
 
 SELECT * FROM fsm_page_contents(get_raw_page('test1', 'fsm', 0));
@@ -49,6 +49,6 @@ drop table test_partitioned;
 create table test8 (f1 int, f2 int, f3 int, f4 int, f5 int, f6 int, f7 int, f8 int);
 insert into test8(f1, f8) values (x'7f7f'::int, 0);
 select t_bits, t_data from heap_page_items(get_raw_page('test8', 0));
-select tuple_data_split('test8'::regclass, t_data, t_infomask, t_infomask2, t_bits)
+select heap_tuple_data_split('test8'::regclass, t_data, t_infomask, t_infomask2, t_bits)
 from heap_page_items(get_raw_page('test8', 0));
 drop table test8;
diff --git a/doc/src/sgml/pageinspect.sgml b/doc/src/sgml/pageinspect.sgml
index 7a767b25ea..29fc32ab95 100644
--- a/doc/src/sgml/pageinspect.sgml
+++ b/doc/src/sgml/pageinspect.sgml
@@ -189,17 +189,17 @@ test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
 

 
- tuple_data_split(rel_oid oid, t_data bytea, t_infomask integer, t_infomask2 integer, t_bits text [, do_detoast bool]) returns bytea[]
+ heap_tuple_data_split(rel_oid oid, t_data bytea, t_infomask integer, t_infomask2 integer, t_bits text [, do_detoast bool]) returns bytea[]
  
-  tuple_data_split
+  heap_tuple_data_split
  
 
 
  
-  tuple_data_split splits tuple 

Re: A problem presentaion about ECPG, DECLARE STATEMENT

2019-09-11 Thread Michael Meskes
> > > Is it acceptable for PG12?
> > In general absolutely.
> 
> It seems far too late to be considering any major rewrite for v12;
> even assuming that there was consensus on the rewrite being an
> improvement, which I bet there won't be.

Oops, I read 13. Yes, it's obviously way too late for 12. Sorry for the
noise.

In this case I'd like to hear details about what is wrong with the
implementation.

Thanks.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL





Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2019-09-11 Thread Masahiko Sawada
On Wed, Sep 11, 2019 at 11:30 PM Alvaro Herrera from 2ndQuadrant
 wrote:
>
> On 2019-Sep-11, Masahiko Sawada wrote:
>
> > On Wed, Sep 11, 2019 at 1:46 PM Michael Paquier  wrote:
> > >
> > > On Tue, Sep 10, 2019 at 08:29:43AM +0530, Amit Kapila wrote:
> > > > Good thought, but I think even if we want to change the name of
> > > > tuple_data_split, it might be better done separately.
> > >
> > > Yes, that's not the problem of this patch.  Not sure if it actually
> > > makes sense either to change it.
> >
> > Hmm, it would be more consistent with the other functions, but I think
> > we would need to bump the pageinspect version to 1.8 and add a new SQL
> > file to rename the function. And it would be for PG12, not PG13. If we
> > have to do it someday, I think it's better to do it in PG12, where the
> > table AM was introduced. Anyway, I've attached a separate patch for it.
>
> I'd rather not change the name of the existing function ... that
> function is pretty old (it was introduced in 9.6, commit d6061f83a166).
> I think we can regard that name as an historical accident, and use a
> modern name convention for the new function (and any hypothetical future
> ones) that will, sadly, collide with the historical name for the old
> function.

Okay, that makes sense.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: Does PostgreSQL support debian Linux on Arm CPU Platform?

2019-09-11 Thread gc_11
Thank you very much.
----- Original Message -----
From: ilm...@ilmari.org (Dagfinn Ilmari Mannsåker)
To: 
Cc: "pgsql-hackers" 
Subject: Re: Does PostgreSQL support debian Linux on Arm CPU Platform?
Date: 2019-09-10 18:08


 writes:
> Hi, I just want to know whether PostgreSQL supports Debian Linux with an ARM
> CPU platform. Thank you!
The PostgreSQL community provided packages (https://apt.postgresql.org/)
are only built for amd64, i386 and ppc64el, but Debian itself ships
PostgreSQL on every architecture it supports.
Each Debian release only ships one major version of PostgreSQL (the
current stable release has PostgreSQL 11), but if you need other
versions you could build them from the apt.postgresql.org source
packages.
- ilmari
-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.   - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.  - Calle Dybedahl


Re: Does PostgreSQL support debian Linux on Arm CPU Platform?

2019-09-11 Thread gc_11
As far as I can see, the PostgreSQL BuildFarm is used to detect build failures
on a large collection of platforms and configurations. Is it necessary to do
functional and performance testing on a new platform? Are there any tools to
do this? Thank you very much.
----- Original Message -----
From: Andrew Dunstan 
To: gc...@sina.cn, pgsql-hackers 
Subject: Re: Does PostgreSQL support debian Linux on Arm CPU Platform?
Date: 2019-09-10 01:32


On 9/9/19 11:07 AM, gc...@sina.cn wrote:
> Hi, I just want to know whether PostgreSQL supports Debian Linux with an ARM
> CPU platform. Thank you!
See

cheers
andrew
-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2019-09-11 Thread Alvaro Herrera from 2ndQuadrant
On 2019-Sep-11, Masahiko Sawada wrote:

> On Wed, Sep 11, 2019 at 1:46 PM Michael Paquier  wrote:
> >
> > On Tue, Sep 10, 2019 at 08:29:43AM +0530, Amit Kapila wrote:
> > > Good thought, but I think even if we want to change the name of
> > > tuple_data_split, it might be better done separately.
> >
> > Yes, that's not the problem of this patch.  Not sure if it actually
> > makes sense either to change it.
> 
> Hmm, it will be more consistent with other functions, but I think we
> would need to bump the pageinspect version to 1.8 and add a new
> SQL file to rename the function. And it will be for PG12, not
> PG13. If we have to do it someday I think it's better to do it in PG12,
> in which the table AM API was introduced. Anyway I've attached a
> separate patch for it.

I'd rather not change the name of the existing function ... that
function is pretty old (it was introduced in 9.6, commit d6061f83a166).
I think we can regard that name as an historical accident, and use a
modern name convention for the new function (and any hypothetical future
ones) that will, sadly, collide with the historical name for the old
function.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Built-in connection pooler

2019-09-11 Thread Konstantin Knizhnik



On 09.09.2019 18:12, Konstantin Knizhnik wrote:



On 06.09.2019 19:41, Konstantin Knizhnik wrote:



On 06.09.2019 1:01, Jaime Casanova wrote:


Sadly I got a lot of FAILED tests; I'm attaching the diffs on
regression with installcheck and installcheck-parallel.
BTW, after make installcheck-parallel I wanted to do a new test but
wasn't able to drop the regression database because there is still a
subscription, so I tried to drop it and got a core file (I was
connected through the pool_worker); I'm attaching the backtrace of the
crash too.



Sorry, I failed to reproduce the crash.
If you are able to find some scenario to reproduce it, I
will be very pleased to receive it.


I was able to reproduce the crash.
A patch is attached. I also added proxying of the RESET command.
Unfortunately it is still not enough to pass the regression tests with
"proxying_gucs=on",
mostly because error messages don't match after prepending "set
local" commands.





I have implemented passing startup options to the pooler backend.
Now "make installcheck" passes without manually setting
datestyle/timezone/intervalstyle in postgresql.conf.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index adf0490..5c2095f 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -11,6 +11,7 @@
 
 #include "commands/trigger.h"
 #include "executor/spi.h"
+#include "storage/proc.h"
 #include "utils/builtins.h"
 #include "utils/rel.h"
 
@@ -93,6 +94,8 @@ check_primary_key(PG_FUNCTION_ARGS)
 	else
 		tuple = trigdata->tg_newtuple;
 
+	MyProc->is_tainted = true;
+
 	trigger = trigdata->tg_trigger;
 	nargs = trigger->tgnargs;
 	args = trigger->tgargs;
@@ -284,6 +287,8 @@ check_foreign_key(PG_FUNCTION_ARGS)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: cannot process INSERT events");
 
+	MyProc->is_tainted = true;
+
 	/* Have to check tg_trigtuple - tuple being deleted */
 	trigtuple = trigdata->tg_trigtuple;
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c91e3e1..df0bcaf 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -719,6 +719,169 @@ include_dir 'conf.d'
   
  
 
+ 
+  max_sessions (integer)
+  
+   max_sessions configuration parameter
+  
+  
+  
+
+  The maximum number of client sessions that can be handled by
+  one connection proxy when session pooling is enabled.
+  This parameter does not add any memory or CPU overhead, so
+  specifying a large max_sessions value
+  does not affect performance.
+  If the max_sessions limit is reached, new connections are not accepted.
+
+
+  The default value is 1000. This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  session_pool_size (integer)
+  
+   session_pool_size configuration parameter
+  
+  
+  
+
+  Enables session pooling and defines the maximum number of
+  backends that can be used by client sessions for each database/user combination.
+  Launched non-tainted backends are never terminated, even if there are no active sessions.
+  A backend is considered tainted if the client updates GUCs, creates a temporary table, or creates prepared statements.
+  A tainted backend can serve only one client.
+
+
+  The default value is 10, so up to 10 backends will serve each database/user combination.
+
+  
+ 
+
+ 
+  proxy_port (integer)
+  
+   proxy_port configuration parameter
+  
+  
+  
+
+  Sets the TCP port for the connection pooler.
+  Clients connected to the main "port" will be assigned dedicated backends,
+  while clients connected to the proxy port will be connected to backends through a proxy which
+  performs transaction-level scheduling.
+   
+
+  The default value is 6543.
+
+  
+ 
+
+ 
+  connection_proxies (integer)
+  
+   connection_proxies configuration parameter
+  
+  
+  
+
+  Sets the number of connection proxies.
+  The postmaster spawns a separate worker process for each proxy and scatters connections between proxies using one of several scheduling policies (round-robin, random, load-balancing).
+  Each proxy launches its own subset of backends,
+  so the maximal number of non-tainted backends is session_pool_size * connection_proxies * databases * roles.
+   
+
+  The default value is 0, so session pooling is disabled.
+
+  
+ 
+
+ 
+  session_schedule (enum)
+  
+   session_schedule configuration parameter
+  
+  
+  
+
+  Specifies the scheduling policy for assigning sessions to proxies when
+  connection pooling is enabled. The default policy is 

Create collation reporting the ICU locale display name

2019-09-11 Thread Daniel Verite
 Hi,

The 'locale' or 'lc_collate/lc_ctype' argument of an ICU collation may
have a complicated syntax, especially with non-deterministic
collations, and input mistakes in these names will not necessarily be
detected as such by ICU.

The "display name" of a locale is a simple way to get human-readable
feedback about the characteristics of that locale.
pg_import_system_collations() already pushes these as comments when
creating collations en masse.

I think it would be nice to have CREATE COLLATION report this
information as feedback in the form of a NOTICE message.
PFA a simple patch implementing that.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/backend/commands/collationcmds.c 
b/src/backend/commands/collationcmds.c
index 919e092483..ad011d149c 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -36,6 +36,10 @@
 #include "utils/syscache.h"
 
 
+#ifdef USE_ICU
+static char *get_icu_locale_comment(const char *localename);
+#endif
+
 typedef struct
 {
char   *localename; /* name of locale, as per "locale -a" */
@@ -232,6 +236,16 @@ DefineCollation(ParseState *pstate, List *names, List 
*parameters, bool if_not_e
if (!OidIsValid(newoid))
return InvalidObjectAddress;
 
+#ifdef USE_ICU
+   if (collprovider == COLLPROVIDER_ICU)
+   {
+   char *display_name = get_icu_locale_comment(collcollate);
+   if (display_name != NULL)
+   ereport(NOTICE,
+   (errmsg("ICU locale: \"%s\"", 
display_name)));
+   }
+#endif
+
/*
 * Check that the locales can be loaded.  NB: 
pg_newlocale_from_collation
 * is only supposed to be called on non-C-equivalent locales.


Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2019-09-11 Thread Anastasia Lubennikova

10.09.2019 14:42, Asim R P wrote:

Hi Anastasia

On Thu, Aug 22, 2019 at 9:43 PM Anastasia Lubennikova
<a.lubennik...@postgrespro.ru> wrote:

>
> But during the review, I found a bug in the current implementation.
> New behavior must apply to crash-recovery only, now it applies to 
archiveRecovery too.
> That can cause a silent loss of a tablespace during regular standby 
operation

> since it never calls CheckRecoveryConsistency().
>
> Steps to reproduce:
> 1) run master and replica
> 2) create dir for tablespace:
> mkdir  /tmp/tblspc1
>
> 3) create tablespace and database on the master:
> create tablespace tblspc1 location '/tmp/tblspc1';
> create database db1 tablespace tblspc1 ;
>
> 4) wait for replica to receive this changes and pause replication:
> select pg_wal_replay_pause();
>
> 5) move replica's tablespace symlink to some empty directory, i.e. 
/tmp/tblspc2

> mkdir  /tmp/tblspc2
> ln -sfn /tmp/tblspc2 postgresql_data_replica/pg_tblspc/16384
>

By changing the tablespace symlink target, we are silently nullifying the 
effects of a committed transaction from the standby data directory - 
the directory structure created by the standby for create tablespace 
transaction.  This step, therefore, does not look like a valid test 
case to me.  Can you share a sequence of steps that does not involve 
changing data directory manually?


Hi, the whole idea of the test is to reproduce data loss. For example, 
if the disk containing this tablespace failed.
Probably, simply deleting the directory 
'postgresql_data_replica/pg_tblspc/16384'
would work as well, though I was afraid that it could be caught by some 
earlier checks and my example wouldn't be so illustrative.


Thank you for the review feedback.  I agree with all the points.  Let 
me incorporate them (I plan to pick this work up and drive it to 
completion as Paul got busy with other things).


But before that I'm revisiting another solution upthread, that of 
creating restart points when replaying create/drop database commands 
before making filesystem changes such as removing a directory.  The 
restart points should align with checkpoints on master.  The concern 
against this solution was that creating restart points will slow down 
recovery.  I don't think crash recovery is affected by this solution 
because of the already existing enforcement of checkpoints.  WAL 
records prior to a create/drop database will not be seen by crash 
recovery due to the checkpoint enforced during the command's normal 
execution.




I haven't measured the impact of generating extra restart points in the 
previous solution, so I cannot tell whether the concerns upthread are 
justified. Still, I like the latest design more, since it is clearer and 
similar to the code checking for unexpected uninitialized pages. In 
principle it works. And the issue I described in the previous review can be 
easily fixed by several additional checks of the InHotStandby macro.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [patch]socket_timeout in interfaces/libpq

2019-09-11 Thread Fabien COELHO



By the way, Fabien, you are marked as a reviewer of this patch since the 
end of June.  Are you planning to review it?


Not this round.

--
Fabien.




Re: [PATCH] Improve performance of NOTIFY over many databases (v2)

2019-09-11 Thread Martijn van Oosterhout
Hoi Tom,


On Wed, 11 Sep 2019 at 00:18, Tom Lane  wrote:

>
> I pushed 0001 after doing some hacking on it --- it was sloppy about
> datatypes, and about whether the invalid-entry value is 0 or -1,
> and it was just wrong about keeping the list in backendid order.
> (You can't conditionally skip looking for where to put the new
> entry, if you want to maintain the order.  I thought about just
> defining the list as unordered, which would simplify joining the
> list initially, but that could get pretty cache-unfriendly when
> there are lots of entries.)
>
> 0002 is now going to need a rebase, so please do that.
>
>
Thanks for this, and good catch. Looks like I didn't test the first patch
by itself very well.

Here is the rebased second patch.

Thanks in advance,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
From bc4b1b458564f758b7fa1c1f7b0397aade71db06 Mon Sep 17 00:00:00 2001
From: Martijn van Oosterhout 
Date: Mon, 3 Jun 2019 17:13:31 +0200
Subject: [PATCH 1/2] Improve performance of async notifications

Advancing the tail pointer requires an exclusive lock which can block
backends from other databases, so it's worth keeping these attempts to a
minimum.

Instead of tracking the slowest backend exactly we update the queue more
lazily, only checking when we switch to a new SLRU page.  Additionally,
instead of waking up every slow backend at once, we do them one at a time.
---
 src/backend/commands/async.c | 142 +--
 1 file changed, 101 insertions(+), 41 deletions(-)

diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index f26269b5ea..b9dd0ca139 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -73,10 +73,11 @@
  *	  Finally, after we are out of the transaction altogether, we check if
  *	  we need to signal listening backends.  In SignalBackends() we scan the
  *	  list of listening backends and send a PROCSIG_NOTIFY_INTERRUPT signal
- *	  to every listening backend (we don't know which backend is listening on
- *	  which channel so we must signal them all). We can exclude backends that
- *	  are already up to date, though.  We don't bother with a self-signal
- *	  either, but just process the queue directly.
+ *	  to every listening backend for the relevant database (we don't know
+ *	  which backend is listening on which channel so we must signal them
+ *	  all).  We can exclude backends that are already up to date, though.
+ *	  We don't bother with a self-signal either, but just process the queue
+ *	  directly.
  *
  * 5. Upon receipt of a PROCSIG_NOTIFY_INTERRUPT signal, the signal handler
  *	  sets the process's latch, which triggers the event to be processed
@@ -89,13 +90,25 @@
  *	  Inbound-notify processing consists of reading all of the notifications
  *	  that have arrived since scanning last time. We read every notification
  *	  until we reach either a notification from an uncommitted transaction or
- *	  the head pointer's position. Then we check if we were the laziest
- *	  backend: if our pointer is set to the same position as the global tail
- *	  pointer is set, then we move the global tail pointer ahead to where the
- *	  second-laziest backend is (in general, we take the MIN of the current
- *	  head position and all active backends' new tail pointers). Whenever we
- *	  move the global tail pointer we also truncate now-unused pages (i.e.,
- *	  delete files in pg_notify/ that are no longer used).
+ *	  the head pointer's position.
+ *
+ * 6. To avoid SLRU wraparound and minimize disk space the tail pointer
+ *	  needs to be advanced so that old pages can be truncated.  This
+ *	  however requires an exclusive lock and as such should be done
+ *	  infrequently.
+ *
+ *	  When a new notification is added, the writer checks to see if the
+ *	  tail pointer is more than QUEUE_CLEANUP_DELAY pages behind.  If
+ *	  so, it attempts to advance the tail, and if there are slow
+ *	  backends (perhaps because all the notifications were for other
+ *	  databases), wake one of them up by sending a signal.
+ *
+ *	  When the slow backend processes the queue it notes it was behind
+ *	  and so also tries to advance the tail, possibly waking up another
+ *	  slow backend.  Eventually all backends will have processed the
+ *	  queue and the global tail pointer is moved to a new page and we
+ *	  also truncate now-unused pages (i.e., delete files in pg_notify/
+ *	  that are no longer used).
  *
  * An application that listens on the same channel it notifies will get
  * NOTIFY messages for its own NOTIFYs.  These can be ignored, if not useful,
@@ -211,6 +224,12 @@ typedef struct QueuePosition
 	 (x).page != (y).page ? (x) : \
 	 (x).offset > (y).offset ? (x) : (y))
 
+/* how many pages does a backend need to be behind before it needs to be signalled */
+#define QUEUE_CLEANUP_DELAY 4
+
+/* is a backend so far behind it needs to be signalled? */
+#define QUEUE_SLOW_BACKEND(i) \
+	

Re: A problem presentation about ECPG, DECLARE STATEMENT

2019-09-11 Thread Tom Lane
Michael Meskes  writes:
> Hi Kuroda-san,
>> In order to modify bugs, I think many designs, implementations, 
>> and specifications should be changed.

> I hope the authors of said patch speak up and explain why they
> implemented it as is.

>> Is it acceptable for PG12?

> In general absolutely.

It seems far too late to be considering any major rewrite for v12;
even assuming that there was consensus on the rewrite being an
improvement, which I bet there won't be.

"Two or three weeks from now" we'll be thinking about pushing 12.0
out the door.  We can't be accepting major definitional changes
at that point.

If a solid case can be made that ECPG's DECLARE STATEMENT was done
wrong, we'd be better off to just revert the feature out of v12
and try again, under less time pressure, for v13.

regards, tom lane




Re: Change atoi to strtol in same place

2019-09-11 Thread Robert Haas
On Tue, Sep 10, 2019 at 11:54 PM Michael Paquier  wrote:
> On Tue, Sep 10, 2019 at 08:03:32AM -0400, Robert Haas wrote:
> > -1. I think it's very useful to have routines for this sort of thing
> > that return an error message rather than emitting an error report
> > directly.  That gives the caller a lot more control.
>
> Please let me counter-argue here.

OK, but on the other hand, Joe's example of a custom message "invalid
compression level: 10 is outside range 0..9" is a world better than
"invalid compression level: %s".  We might even be able to do better:
"argument to -Z must be a compression level between 0 and 9".  In
backend error-reporting, it's often important to show the misguided
value, because it may be coming from a complex query where it's hard
to find the source of the problematic value. But if the user types
-Z42 or -Zborked, I'm not sure it's important to tell them that the
problem is with "42" or "borked". It's more important to explain the
concept, or such would be my judgement.

Also, consider an option where the value must be an integer between 1
and 100 or one of several fixed strings (e.g. think of
recovery_target_timeline).  The user clearly can't use the generic
error message in that case.  Perhaps the answer is to say that such
users shouldn't use the provided range-checking function but rather
implement the logic from scratch. But that seems a bit limiting.

Also, suppose the user doesn't want to pg_log_error(). Maybe it's a
warning. Maybe it doesn't even need to be logged.

What this boils down to in the end is that putting more of the policy
decisions into the function helps ensure consistency and save code
when the function is used, but it also results in the function being
used less often. Reasonable people can differ on the merits of
different approaches, but for me the down side of returning the error
message appears minor at most, and the up sides seem significant.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Re: libpq host/hostaddr/conninfo inconsistencies

2019-09-11 Thread Alvaro Herrera from 2ndQuadrant
On 2019-Sep-10, Alvaro Herrera from 2ndQuadrant wrote:

> I propose to settle this issue by applying "Robert's changes two and
> three", which has been +1'd by two people already and I also accept
> myself as improvement.  I don't think any further changes are required.

Applied, marked item committed.

Thanks,

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




let's kill AtSubStart_Notify

2019-09-11 Thread Robert Haas
There are only four subsystems which require a callback at the
beginning of each subtransaction: the relevant functions are
AtSubStart_Memory, AtSubStart_ResourceOwner, AtSubStart_Notify, and
AfterTriggerBeginSubXact. The AtSubStart_Memory and
AtSubStart_ResourceOwner callbacks seem relatively unobjectionable,
because almost every subtransaction is going to allocate memory and
acquire some resource managed by a resource owner, but the others
represent initialization that has to be done whether or not the
corresponding feature is used.

Generally, a subsystem can avoid needing a callback at subtransaction
start (or transaction start) by detecting new levels of
subtransactions at time of use. A typical practice is to maintain a
stack which has entries only for those transaction nesting levels
where the functionality was used. The attached patch implements this
method for async.c. I was a little surprised to find that it makes a
pretty noticeable performance difference when starting and ending
trivial subtransactions.  I used this test case:

\timing
do $$begin for i in 1 .. 1000 loop begin null; exception when
others then null; end; end loop; end;$$;

I ran the test four times with and without the patch and took the
median of the last three.  This was an attempt to exclude effects due
to starting up the database cluster. With the patch, the result was
3127.377 ms; without the patch, it was 3527.285 ms. That's a big
enough difference that I'm wondering whether I did something wrong
while testing this, so feel free to check my work and tell me whether
I'm all wet. Still, I don't find it wholly unbelievable, because I've
observed in the past that these code paths are lean enough that a few
palloc() calls can make a noticeable difference, and the effect of
this patch is to remove a few palloc() calls.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


0001-Remove-AtSubStart_Notify.patch
Description: Binary data


Re: pgbench - allow to create partitioned tables

2019-09-11 Thread Fabien COELHO


Hello Amit,

Thanks for the feedback.


+ case 11: /* partitions */
+ initialization_option_set = true;
+ partitions = atoi(optarg);
+ if (partitions < 0)
+ {
+ fprintf(stderr, "invalid number of partitions: \"%s\"\n",
+ optarg);
+ exit(1);
+ }
+ break;

Is there a reason why we treat "partitions = 0" as a valid value?


Yes. It is an explicit "do not create partitioned tables", which differs 
from 1, which says "create a partitioned table with just one partition".



Also, shouldn't we keep some max limit for this parameter as well?


I do not think so. If someone wants to test how terrible it is to use 
10 partitions, we should not prevent it.



For example, how realistic will it be if the user gives a value of
partitions the same as or greater than the number of rows in the
pgbench_accounts table?


Although I agree that it does not make much sense, for testing purposes 
why not, to test overheads in critical cases for instance.


I understand it is not sensible to give such a value, but I guess the 
API should behave sanely in such cases as well.


Yep, it should work.


I am not sure what will be the good max value for it, but I
think there should be one.


I disagree. Pgbench is a tool for testing performance with given 
parameters. If Postgres accepts a parameter, there is no reason why pgbench 
should reject it.



@@ -3625,6 +3644,7 @@ initCreateTables(PGconn *con)
 const char *bigcols; /* column decls if accountIDs are 64 bits */
 int declare_fillfactor;
 };
+
 static const struct ddlinfo DDLs[] = {

Spurious line change.


Indeed.


*
+"  --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"
+"  --partition-method=(range|hash)\n"
+"   partition account table with this
method (default: range)\n"

Refer to the complete table name, like pgbench_accounts, instead of just account. 
It will be clearer and in sync with what we display in some other options 
like --skip-some-updates.


Ok.


*
+"  --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"

/defaults/default.


Ok.


I think we should print the information about partitions in
printResults.  It can help users while analyzing results.


Hmmm. Why not, with some hocus-pocus to get the information out of 
pg_catalog, trying to fail gracefully if pgbench is run 
against a version without partitioning support.



*
+enum { PART_NONE, PART_RANGE, PART_HASH }
+ partition_method = PART_NONE;
+

I think it is better to follow the style of the QueryMode enum by using a
typedef here; that will make the code look in sync with nearby code.


Hmmm. Why not. This means inventing a used-once type name for 
partition_method. My great creativity led to partition_method_t.



*
- int i;

 fprintf(stderr, "creating tables...\n");

- for (i = 0; i < lengthof(DDLs); i++)
+ for (int i = 0; i < lengthof(DDLs); i++)

This is unnecessary change as far as this patch is concerned.  I
understand there is no problem in writing either way, but let's not
change the coding pattern here as part of this patch.


The reason I did that is that I had a stupid bug in a development version 
which was due to an accidental reuse of this index, which would have been 
prevented by this declaration style. Removed anyway.



+ if (partitions >= 1)
+ {
+ int64 part_size = (naccounts * (int64) scale + partitions - 1) / partitions;
+ char ff[64];
+ ff[0] = '\0';
+ append_fillfactor(ff, sizeof(ff));
+
+ fprintf(stderr, "creating %d partitions...\n", partitions);
+
+ for (int p = 1; p <= partitions; p++)
+ {
+ char query[256];
+
+ if (partition_method == PART_RANGE)
+ {

part_size can be defined inside "if (partition_method == PART_RANGE)"
as it is used here.


I just wanted to avoid recomputing the value in the loop, but indeed it 
may be computed needlessly. Moved.



In general, this part of the code can use some comments.


Ok.

Attached an updated version.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index c857aa3cba..e3a0abb4c7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -306,6 +306,31 @@ pgbench  options  d
   
  
 
+ 
+  --partitions=NUM
+  
+   
+Create a partitioned pgbench_accounts table with
+NUM partitions of nearly equal size for
+the scaled number of accounts.
+Default is 0, meaning no partitioning.
+   
+  
+ 
+
+ 
+  --partition-method=NAME
+  
+   
+Create a partitioned pgbench_accounts table with
+NAME method.
+Expected values are range or hash.
+This option requires that --partitions is set to non-zero.
+If unspecified, default is range.
+   
+  
+ 
+
  
   --tablespace=tablespace
   
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..673b175522 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -186,6 +186,14 @@ int64		

Re: Write visibility map during CLUSTER/VACUUM FULL

2019-09-11 Thread Amit Kapila
On Sun, Sep 1, 2019 at 1:37 PM Alexander Korotkov
 wrote:
>
> Hi!
>
> I found it weird that CLUSTER/VACUUM FULL don't write the visibility map.
> Attached patch implements writing the visibility map in
> heapam_relation_copy_for_cluster().
>
> I've studied previous attempt to implement this [1].  The main problem
> of that attempt was usage of existing heap_page_is_all_visible() and
> > visibilitymap_set() functions.  These functions work through the buffer
> > manager, while heap rewriting bypasses the buffer manager.
>
> In my patch visibility map pages are handled in the same way as heap
> pages are.
>

I haven't studied this patch in detail, but while glancing I observed
that this doesn't try to sync the vm pages as we do for heap pages in
the end (during end_heap_rewrite).  Am I missing something?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: A problem presentation about ECPG, DECLARE STATEMENT

2019-09-11 Thread Michael Meskes
Hi Kuroda-san,

> This feature, committed last February, has some bugs.
> ...
> * This syntax does not have oracle compatibility.

This in itself is not a bug. If the syntax is not standard compliant,
then it's a bug. That of course does not mean we would not like to be
Oracle compatible where possible.

> In order to modify bugs, I think many designs, implementations, 
> and specifications should be changed.

I hope the authors of said patch speak up and explain why they
implemented it as is.

> Is it acceptable for PG12?

In general absolutely.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL





logical decoding : exceeded maxAllocatedDescs for .spill files

2019-09-11 Thread Amit Khandekar
Hi,

I reproduced the error "exceeded maxAllocatedDescs (492) while trying
to open file ...", which was also discussed in thread [1].
This issue is similar to, but not exactly the same as, [1]. In [1], the
file for which this error showed up was
"pg_logical/mappings/map" , while here it's the .spill file. And
here the issue, in short, seems to be: the .spill file does not get
closed promptly, unlike in [1] where there was a file descriptor
leak.

I could reproduce it using a transaction containing a long series of
sub-transactions (possibly could be reproduced without
sub-transactions, but looking at the code I could come up with this
script using sub-transactions easily) :

create table tab(id int);

-- Function that does huge changes in a single transaction
create or replace function f(id int) returns int as
$$
begin
-- Iterate this more than 492 times (max transient file
descriptors PG would typically allow)
-- This will create that many sub-transactions due to presence of
exception block.
FOR i IN 1..600 LOOP

BEGIN
-- Iterate more than 4096 times (so that changes spill to
disk: max_changes_in_memory)
FOR j IN 1..5000 LOOP
   insert into tab values (1);
END LOOP;
EXCEPTION
when division_by_zero then perform 'dummy';
END;

END LOOP;

return id;
end $$ language plpgsql;

SELECT * FROM pg_create_logical_replication_slot('logical', 'test_decoding');

begin;
select f(1); -- Do huge changes in a single transaction
commit;

\! pg_recvlogical -d postgres  --slot=logical --verbose --start -f -

pg_recvlogical: starting log streaming at 0/0 (slot logical)
pg_recvlogical: streaming initiated
pg_recvlogical: confirming write up to 0/0, flush to 0/0 (slot logical)
BEGIN 1869
pg_recvlogical: confirming write up to 0/1B6D6E38, flush to 0/1B6D6E38
(slot logical)
pg_recvlogical: error: unexpected termination of replication stream:
ERROR:  exceeded maxAllocatedDescs (492) while trying to open file
"pg_replslot/logical/xid-2362-lsn-0-2400.spill"
pg_recvlogical: disconnected; waiting 5 seconds to try again

Looking at the code, what might be happening is:
ReorderBufferIterTXNInit() => ReorderBufferRestoreChanges() opens the
files, but leaves them open if end of file is not reached. Eventually,
if end of file is reached, the file gets closed. The function returns
without closing the file descriptor if the reorder buffer changes being
restored exceed max_changes_in_memory. Probably later on, the
rest of the changes get restored in another
ReorderBufferRestoreChanges() call. But meanwhile, if there are a lot
of such files opened, we can run out of the maximum number of files PG
decides to keep open (its logic takes into account the system files
allowed to be open, and those already opened).

Offhand, what I am thinking is, we need to close the file descriptor
before returning from ReorderBufferRestoreChanges(), and keep track of
the file offset and file path, so that next time we can resume reading
from there.

Comments ?

[1] 
https://www.postgresql.org/message-id/flat/738a590a-2ce5-9394-2bef-7b1caad89b37%402ndquadrant.com

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company




Re: don't see materialized views in information_schema

2019-09-11 Thread Juan José Santamaría Flecha
On Wed, Sep 11, 2019 at 10:03 AM Pavel Stehule  wrote:
>
> On Wed, Sep 11, 2019 at 9:49 AM Erik Rijkers  wrote:
>>
>> On 2019-09-11 08:14, Pavel Stehule wrote:
>> > Hi
>> >
>> > [matviews not showing up in information_schema.tables]
>> >
>> > Is it expected behave? Tested on master branch.
>>
>> I think it is; it has been like this all along.
>>
>> ( matviews are in pg_matviews. )
>
>
> At minimum I miss an entry in information_schema.views
>
> Until today I expected that any object would be listed somewhere in 
> information_schema.
>

There has been previous discussion about this topic:

https://www.postgresql.org/message-id/3794.1412980...@sss.pgh.pa.us


Regards,

Juan José Santamaría Flecha




Re: doc: update PL/pgSQL sample loop function

2019-09-11 Thread Pavel Stehule
On Wed, Sep 11, 2019 at 11:51 AM, Amit Kapila wrote:

> On Wed, Sep 11, 2019 at 11:40 AM Pavel Stehule 
> wrote:
> >
> > On Wed, Sep 11, 2019 at 7:45 AM, Amit Kapila wrote:
> >>
> >> On Sun, Sep 1, 2019 at 9:09 AM Amit Kapila 
> wrote:
> >> >
> >> > The current example shows the usage of looping in plpgsql, so as such
> >> > there is no correctness issue, but OTOH there is no harm in updating
> >> > the example as proposed by Ian Barwick.  Does anyone else see any
> >> > problem with this idea?  If we agree to proceed with this update, it
> >> > might be better to backpatch it for the sake of consistency though I
> >> > am not sure about that.
> >> >
> >>
> >> While checking the patch in back-branches, I noticed that it doesn't
> >> get applied to 9.4 due to the way the example forms the string.  I
> >> have done the required changes for 9.4 as well and attached is the
> >> result.
> >>
> >
> > The question is whether these queries should use something from
> > information_schema instead of accessing pg_catalog directly.
> >
> > But I looked now, and we don't see materialized views in
> > information_schema - which is probably a bug.
> >
>
> I think you got the answer of this on a related thread.  Do you see
> any other problems or have any concerns about this?
>

no

Pavel


> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


Re: doc: update PL/pgSQL sample loop function

2019-09-11 Thread Amit Kapila
On Wed, Sep 11, 2019 at 11:40 AM Pavel Stehule  wrote:
>
> On Wed, Sep 11, 2019 at 7:45 AM, Amit Kapila wrote:
>>
>> On Sun, Sep 1, 2019 at 9:09 AM Amit Kapila  wrote:
>> >
>> > The current example shows the usage of looping in plpgsql, so as such
>> > there is no correctness issue, but OTOH there is no harm in updating
>> > the example as proposed by Ian Barwick.  Does anyone else see any
>> > problem with this idea?  If we agree to proceed with this update, it
>> > might be better to backpatch it for the sake of consistency though I
>> > am not sure about that.
>> >
>>
>> While checking the patch in back-branches, I noticed that it doesn't
>> get applied to 9.4 due to the way the example forms the string.  I
>> have done the required changes for 9.4 as well and attached is the
>> result.
>>
>
> The question is whether these queries should use something from
> information_schema instead of accessing pg_catalog directly.
>
> But I looked now, and we don't see materialized views in
> information_schema - which is probably a bug.
>

I think you got the answer of this on a related thread.  Do you see
any other problems or have any concerns about this?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




A problem presentation about ECPG, DECLARE STATEMENT

2019-09-11 Thread kuroda.hay...@fujitsu.com
Dear ALL,

I want to report and consult about DECLARE STATEMENT.
This feature, committed last February, has some bugs.

* It is not thread-safe.
* If several cursors are declared for the same SQL identifier,
  only the most recently declared cursor is enabled.
* The syntax is not Oracle-compatible.

In order to fix these bugs, I think many designs, implementations,
and specifications would have to be changed.
All operations would be done in the preprocessor, like a #define
macro in C.

It will take about 2 or 3 weeks to make a patch.
Is it acceptable for PG12?

Best Regards,

Hayato Kuroda
FUJITSU LIMITED
E-Mail:kuroda.hay...@fujitsu.com





Re: Pulling up direct-correlated ANY_SUBLINK

2019-09-11 Thread Richard Guo
Hi Antonin,

On Wed, Sep 11, 2019 at 3:25 PM Antonin Houska  wrote:

>
> Nevertheless, I don't know how to overcome the problems that I mentioned
> upthread.
>

Do you mean the problem that "the WHERE clause of the subquery didn't
participate in the SEMI JOIN evaluation"? The good news is that it has
been fixed by commit 043f6ff0, as I mentioned upthread.

Thanks
Richard


Re: don't see materialized views in information_schema

2019-09-11 Thread Pavel Stehule
On Wed, Sep 11, 2019 at 9:49 AM, Erik Rijkers wrote:

> On 2019-09-11 08:14, Pavel Stehule wrote:
> > Hi
> >
> > [matviews not showing up in information_schema.tables]
> >
> > Is this the expected behavior? Tested on master branch.
>
> I think it is; it has been like this all along.
>
> ( matviews are in pg_matviews. )
>

At a minimum, I miss an entry in information_schema.views.

Until today, I expected that any object would be listed somewhere in
information_schema.

Pavel


Re: Pulling up direct-correlated ANY_SUBLINK

2019-09-11 Thread Antonin Houska
Tom Lane  wrote:

> > Can we try to pull up direct-correlated ANY SubLink with the help of
> > LATERAL?
> 
> Perhaps.  But what's the argument that you'd end up with a better
> plan?  LATERAL pretty much constrains things to use a nestloop,
> so I'm not sure there's anything fundamentally different.

I think that subquery pull-up is most beneficial when the queries (both the
subquery and the upper query) contain more than a few tables. In such a case,
if only a few tables reference the upper query (or if just a single one does),
the constraints imposed by LATERAL might be less significant.

Nevertheless, I don't know how to overcome the problems that I mentioned
upthread.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com




Re: don't see materialized views in information_schema

2019-09-11 Thread Erik Rijkers

On 2019-09-11 08:14, Pavel Stehule wrote:

Hi

[matviews not showing up in information_schema.tables]

Is this the expected behavior? Tested on master branch.


I think it is; it has been like this all along.

( matviews are in pg_matviews. )





Re: Pulling up direct-correlated ANY_SUBLINK

2019-09-11 Thread Richard Guo
Hi Antonin,

On Tue, Sep 10, 2019 at 4:31 PM Antonin Houska  wrote:

> Richard Guo  wrote:
>
> > Can we try to pull up direct-correlated ANY SubLink with the help of
> > LATERAL?
>
> > By this way, we can convert the query:
> >
> > select * from a where a.i = ANY(select i from b where a.j > b.j);
> >
> > To:
> >
> > select * from a SEMI JOIN lateral(select * from b where a.j > b.j)
> > sub on a.i = sub.i;
> >
>
> I tried this a few years ago. This is where the problems started:
>
>
> https://www.postgresql.org/message-id/1386716782.5203.YahooMailNeo%40web162905.mail.bf1.yahoo.com


Thank you for this link. Good to know about the discussions from years ago.


> I'm not sure I remember enough, but the problem has something to do
> with one possible strategy to plan SEMI JOIN: unique-ify the inner
> path and then perform a plain INNER JOIN instead.
>
> I think the problem was that the WHERE clause of the subquery didn't
> participate in the SEMI JOIN evaluation and was used as a filter
> instead. Thus the clause's Vars were not used in the unique keys of
> the inner path and so the SEMI JOIN didn't work well.
>

This used to be a problem until it was fixed by commit 043f6ff0, which
includes the postponed qual from a LATERAL subquery into the quals seen
by make_outerjoininfo().

Thanks
Richard


Re: Change atoi to strtol in same place

2019-09-11 Thread Joe Nelson
Robert Haas wrote:
> > -1. I think it's very useful to have routines for this sort of thing
> > that return an error message rather than emitting an error report
> > directly.  That gives the caller a lot more control.

Michael Paquier wrote:
> 1) Consistency with the error messages makes less work for translators,
> who already have a lot to deal with.

Agreed that the messages can be slightly inconsistent. I tried to make
the new messages match the styles of other messages in their respective
utilities. Maybe the bigger issue here is inconsistent output styles
across the utilities in general:

pg_standby.c includes flag names:
    %s: -r maxretries %s
pg_basebackup.c writes the settings out in words:
    invalid compression level: %s

Note that the final %s in those examples will expand to a more detailed
message.  For example passing "-Z 10" to pg_dump in the current patch will
output:

pg_dump: error: invalid compression level: 10 is outside range 0..9

> 2) A centralized error message can provide the same level of details.

Even assuming we standardize the message format, different callers have
different means to handle the messages. The front-end utilities affected in my
patch use calls as varied as fprintf, pg_log_error, write_stderr and pg_fatal.
Thus pg_strtoint64_range needs more flexibility than calling pg_log_error
internally.
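To make the interface concrete, here is a minimal sketch of the
parse-and-range-check pattern being discussed — return the error text
to the caller rather than reporting it. This is an illustrative Python
sketch, not PostgreSQL's pg_strtoint64_range itself; the name
strtoint_range is made up:

```python
def strtoint_range(s, minval, maxval):
    """Parse s as a base-10 integer within [minval, maxval].
    Returns (value, None) on success or (None, message) on failure;
    the caller decides how to report the message (fprintf,
    pg_log_error, write_stderr, pg_fatal, ...)."""
    try:
        value = int(s.strip(), 10)
    except ValueError:
        return None, '"%s" is not a valid integer' % s
    if not (minval <= value <= maxval):
        # detail message in the style quoted above
        return None, "%d is outside range %d..%d" % (value, minval, maxval)
    return value, None
```

With this shape, passing "-Z 10" would leave the caller holding the
detail string "10 is outside range 0..9", which it can prefix and emit
however its utility's conventions require.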

> 3) I think that we should not expose directly the status values of
> pg_strtoint_status in pg_strtoint64_range(), that's less for module
> authors to worry about, and that would be the same approach as we are
> using for the wrappers of pg_strto[u]intXX() in the patch of the other
> thread (see pg_strto[u]intXX_check for example in [1]).

The pg_strto[u]intXX_check functions can return the integer directly only
because they handle errors with ereport(ERROR, ...). However, as I mentioned
earlier, this is not always what the front-end utilities need to do.

-- 
Joe Nelson  https://begriffs.com




Re: doc: update PL/pgSQL sample loop function

2019-09-11 Thread Pavel Stehule
On Wed, Sep 11, 2019 at 7:45 AM, Amit Kapila wrote:

> On Sun, Sep 1, 2019 at 9:09 AM Amit Kapila 
> wrote:
> >
> > The current example shows the usage of looping in plpgsql, so as such
> > there is no correctness issue, but OTOH there is no harm in updating
> > the example as proposed by Ian Barwick.  Does anyone else see any
> > problem with this idea?  If we agree to proceed with this update, it
> > might be better to backpatch it for the sake of consistency though I
> > am not sure about that.
> >
>
> While checking the patch in back-branches, I noticed that it doesn't
> get applied to 9.4 due to the way the example forms the string.  I
> have done the required changes for 9.4 as well and attached is the
> result.
>
>
The question is whether these queries should use something from
information_schema instead of accessing pg_catalog directly.

But I looked now, and we don't see materialized views in
information_schema - which is probably a bug.

Pavel


> Ian, if possible, can you once check the patch for 9.4?
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


Re: psql - improve test coverage from 41% to 88%

2019-09-11 Thread Michael Paquier
On Tue, Sep 03, 2019 at 08:06:43AM +0200, Fabien COELHO wrote:
> Attached is a rebase after TestLib.pm got a documentation in
> 6fcc40b1.

I am not completely sure what to think about this patch, but here are
some high-level comments.

+=item $node->icommand_checks(cmd, ...)
+
+=cut
+
+sub icommand_checks

Surely this can have a better description, like say
PostgresNode::command_checks_all.

Is Expect compatible down to perl 5.8.0 which is the minimum required
for the TAP tests (see src/test/perl/README)?

There are cases where we don't support tab completion, aka no
USE_READLINE.  So tests would need to be skipped.

-   \a \C arg1 \c arg1 arg2 arg3 arg4 \cd arg1 \conninfo
+   \a
+\C arg1
Why are you changing that?  Your patch does not touch the logic of
psql.  Could it make sense as a separate patch to shape better the
tests?

--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -307,6 +307,7 @@ describeTablespaces(const char *pattern, bool
verbose)
  * a for aggregates
  * n for normal
+ * p for procedure
This is a separate issue, fixed.
--
Michael




Re: Change atoi to strtol in same place

2019-09-11 Thread Joe Nelson
Michael Paquier wrote:
> Using a wrapper in src/fe_utils makes sense.  I would have used
> options.c for the new file, or would that be too much generic?

Sure, options.c sounds fine -- it doesn't seem any more generic than
"arg_utils" and is a little simpler. The only other question I have is
if the function inside -- with some adjustment in its interface -- might
be useful in other contexts beyond front-end args checking.

> The code indentation is weird, particularly the switch/case in
> pg_strtoint64_range().

I did run pgindent... Do I need to tell it about the existence of the
new file?

> The error handling is awkward.

Let's continue this point in your follow-up
<20190911035356.ge1...@paquier.xyz>.

> Why do you need to update common/Makefile?

I thought I needed to do this so that other parts would link properly,
but I just removed the changes there and everything still links OK, so
I'll remove that change.

> The builds on Windows are broken.  You are adding one file into
> fe_utils, but forgot to update @pgfeutilsfiles in Mkvcbuild.pm.  --

Thanks for the tip, I'm still learning about the build process. Is there
a service I can use to test my patches across multiple platforms? I'd
rather not bother reviewers with build problems that I can catch in a
more automated way.

-- 
Joe Nelson  https://begriffs.com




don't see materialized views in information_schema

2019-09-11 Thread Pavel Stehule
Hi

create table omega(a int);
create view omega_view as select * from omega;
insert into omega values(10);

postgres=# select table_type, table_name from information_schema.tables
where table_name like 'omega%';
┌────────────┬────────────┐
│ table_type │ table_name │
╞════════════╪════════════╡
│ BASE TABLE │ omega      │
│ VIEW       │ omega_view │
└────────────┴────────────┘
(2 rows)

postgres=# create materialized view omega_m_view as select * from omega;
SELECT 1
postgres=# select table_type, table_name from information_schema.tables
where table_name like 'omega%';
┌────────────┬────────────┐
│ table_type │ table_name │
╞════════════╪════════════╡
│ BASE TABLE │ omega      │
│ VIEW       │ omega_view │
└────────────┴────────────┘
(2 rows)

postgres=# refresh materialized view omega_m_view ;
REFRESH MATERIALIZED VIEW
postgres=# select table_type, table_name from information_schema.tables
where table_name like 'omega%';
┌────────────┬────────────┐
│ table_type │ table_name │
╞════════════╪════════════╡
│ BASE TABLE │ omega      │
│ VIEW       │ omega_view │
└────────────┴────────────┘
(2 rows)

Is this the expected behavior? Tested on master branch.

Pavel