[HACKERS] Getting warning message during PostgreSQL-9.5 installation on Windows

2017-08-01 Thread Ashutosh Sharma
Hi,

I am getting the following warning message when trying to install
PostgreSQL v9.5 on Windows with Perl 5.22 and above:

Unescaped left brace in regex is deprecated, passed through in regex;
marked by <-- HERE in m/Project\("{ <-- HERE 8BC9CEB8-8B4A-11D0-8D11
00A0C91BC942}"\) = "([^"]+)"/ at Install.pm line 220.

Installing version 9.5 for debug in
C:\Users\ashu\git-clone-postgresql\postgresql\TMP\test

Attached is the patch that fixes this issue.

Please note that from Perl 5.26 onwards, this is treated as a syntax
error instead of a warning.

Also, I can see that there is already a git commit in the PostgreSQL 9.6
branch that fixes this, but it was not back-patched to 9.5 and possibly
earlier branches.

commit 76a1c97bf21c301f61bb41345e0cdd0d365b2086
Author: Andrew Dunstan 
Date:   Fri Apr 8 12:50:30 2016 -0400

Silence warning from modern perl about unescaped braces


--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


Silence_warning_unescaped_braces.patch
Description: Binary data



Re: [HACKERS] On Complex Source Code Reading Strategy

2017-08-01 Thread Zeray Kalayu
On Fri, Jul 28, 2017 at 2:45 AM, Tom Lane  wrote:
> Peter Geoghegan  writes:
>> 2. Start somewhere. I have no idea where that should be, but it has to
>> be some particular place that seems interesting to you.
>
> Don't forget to start with the available documentation, ie
> https://www.postgresql.org/docs/devel/static/internals.html
> You should certainly read
> https://www.postgresql.org/docs/devel/static/overview.html
> and depending on what your interests are, there are probably other
> chapters of Part VII that are worth your time.
>
> Also keep an eye out for README files in the part of the source
> tree you're browsing in.

There is at least one book of more than 500 pages on "Code Reading",
and I can see and sense that code-reading competency is not a trivial
problem nowadays
(https://www.amazon.com/Code-Reading-Open-Source-Perspective/dp/0201799405).

Dear Tom Lane, I know that you are one of the most competent PG hackers
we have ever had. You could write a book entitled "The Art of Code
Reading: The Case of PostgreSQL", so that the many beginners struggling
to find their place in PG could shorten the effort required to become
capable PG hackers. I feel and think, though, that you are not able to
write a book on your art/style of writing and reading code simply
because you are so busy taking PG forward.

To reiterate: code reading has become more important than ever, and
thus the problem of code reading needs to be addressed comprehensively,
as is attempted in
https://www.amazon.com/Code-Reading-Open-Source-Perspective/dp/0201799405.

Lastly, I strongly believe that Code is the ultimate truth and being
able to understand complex and high quality code effectively and
strategically is of paramount importance.

BTW, I have taken all of the PG hackers' advice and am studying the
code base accordingly.

Regards,
Zeray




Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-01 Thread Rushabh Lathia
On Wed, Aug 2, 2017 at 3:55 AM, Robert Haas  wrote:

> On Tue, Aug 1, 2017 at 5:34 AM, Rushabh Lathia 
> wrote:
> > My colleague Robert and I had a doubt about the ordering of TABLE
> > and TABLE_DATA. We thought earlier that reload-through-root might
> > not solve the purpose which has been discussed in the above
> > mentioned thread.  But later, looking into the code, I realized the
> > sort order for DO_TABLE and DO_TABLE_DATA are different, so we don't
> > need to worry about that issue.
>
> Hmm.  Does that mean that table data restoration will *absolutely
> always* follow all CREATE TABLE commands, or is that something that's
> *normally* true but potentially false if dependency sorting switches
> things around?
>
>
Looking at the dbObjectTypePriority comments, it seems that data
restoration will *absolutely always* follow all CREATE TABLE commands.

--
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



Thanks,
Rushabh Lathia
www.EnterpriseDB.com


[HACKERS] INSERT ON CONFLICT and partitioned tables

2017-08-01 Thread Amit Langote
Starting a new thread for a patch I posted earlier [1] to handle ON
CONFLICT DO NOTHING when inserting into a partitioned table.  It's
intended for PG 11 and so registered in the upcoming CF.

Summary of the previous discussion and the patch for anyone interested:

Currently, if an INSERT statement for a partitioned table mentions the ON
CONFLICT clause, we error out immediately.  It was implemented that way
because it was thought that it could not be handled with zero support for
defining indexes on partitioned tables.  Peter Geoghegan pointed out [2]
that that's too restrictive a view.

He pointed out that the planner doesn't *always* expect indexes to be
present on the table when ON CONFLICT is specified.  They must be present,
though, if the DO UPDATE action is requested, because one would need to
also specify the exact columns on which the conflict will be checked, and
those must be covered by the appropriate indexes.  So, if the table is
partitioned and DO UPDATE is specified, the lack of indexes will result in
an error saying that a suitable index is absent.  The DO UPDATE action
cannot be supported until we implement the feature to define indexes on
partitioned tables.

OTOH, the DO NOTHING case should go through the planner without error,
because no columns need to be specified and no indexes need to be present
to cover them.  So, DO NOTHING on partitioned tables might work after all.
But a conflict can only be detected using indexes, which partitioned
tables don't allow, so how?  The leaf partitions in which tuples are
ultimately stored can have indexes defined on them, and those can be used
to check for the conflict.

The patch's job is simple:

- Remove the check in the parser that causes an error the moment the
  ON CONFLICT clause is found.

- Fix leaf partition ResultRelInfo initialization code so that the call
  to ExecOpenIndices() specifies 'true' for speculative, so that the
  information necessary for conflict checking will be initialized in the
  leaf partition's ResultRelInfo.
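
To make the intended behavior concrete, here is a minimal SQL sketch
(table and index names are hypothetical, and it assumes the patch is
applied, since unpatched servers reject ON CONFLICT on partitioned
tables outright):

CREATE TABLE measurement (city_id int, logdate date)
    PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2017 PARTITION OF measurement
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
-- The unique index lives on the leaf partition; partitioned tables
-- themselves cannot have indexes yet.
CREATE UNIQUE INDEX ON measurement_y2017 (city_id, logdate);

INSERT INTO measurement VALUES (1, '2017-08-01');

-- With the patch: no error, and the duplicate is silently skipped,
-- using the leaf partition's unique index to detect the conflict.
INSERT INTO measurement VALUES (1, '2017-08-01') ON CONFLICT DO NOTHING;

-- Still an error, with or without the patch: the conflict target
-- requires a suitable index on the parent, and indexes cannot yet be
-- defined on partitioned tables.
INSERT INTO measurement VALUES (1, '2017-08-01')
    ON CONFLICT (city_id, logdate) DO UPDATE SET logdate = EXCLUDED.logdate;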

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/62be3d7a-08f6-5dcb-f5c8-a5b764ca96df%40lab.ntt.co.jp

[2]
https://www.postgresql.org/message-id/CAH2-Wzm10T%2B_PWVM4XO5zaknVbAXkOH9-JW3gRVPm1njLHck_w%40mail.gmail.com
From 5029a9c6cbb9549024e701d56025c0de1f2b393f Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 3 Apr 2017 19:13:38 +0900
Subject: [PATCH] Allow ON CONFLICT DO NOTHING on partitioned tables

ON CONFLICT .. DO UPDATE still doesn't work, because it requires
specifying the conflict target.  DO NOTHING doesn't require it,
but the executor will check for conflicts only within a given
leaf partition, if relevant constraints exist.

Specifying the conflict target makes the planner look for the
required indexes on the parent table, which are not allowed, so an
error will always be reported in that case.
---
 doc/src/sgml/ddl.sgml | 12 
 src/backend/executor/execMain.c   | 10 ++
 src/backend/parser/analyze.c  |  8 
 src/test/regress/expected/insert_conflict.out | 10 ++
 src/test/regress/sql/insert_conflict.sql  | 10 ++
 5 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b05a9c2150..2c8a58aa09 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3276,10 +3276,14 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
  
   
Using the ON CONFLICT clause with partitioned tables
-   will cause an error, because unique or exclusion constraints can only be
-   created on individual partitions.  There is no support for enforcing
-   uniqueness (or an exclusion constraint) across an entire partitioning
-   hierarchy.
+   will cause an error if the conflict target is specified (see
+for more details).  That means it's not
+   possible to specify DO UPDATE as the alternative
+   action, because it requires the conflict target to be specified.
+   On the other hand, specifying DO NOTHING as the
+   alternative action works fine.  Note that in the latter case, a unique
+   constraint (or an exclusion constraint) of the individual leaf
+   partitions is considered, not across the whole partitioning hierarchy.
   
  
 
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index c11aa4fe21..b313ad1efa 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -3300,13 +3300,15 @@ ExecSetupPartitionTupleRouting(Relation rel,
  0);
 
	/*
-	 * Open partition indices (remember we do not support ON CONFLICT in
-	 * case of partitioned tables, so we do not need support information
-	 * for speculative insertion)
+	 * Open partition indices.  The user may have asked to check for
+	 * 

Re: [HACKERS] Faster methods for getting SPI results

2017-08-01 Thread Chapman Flack
On 12/20/16 23:14, Jim Nasby wrote:
> I'm guessing one issue might be that
> we don't want to call an external interpreter while potentially holding page
> pins, but even then couldn't we just copy a single tuple at a time and save
> a huge amount of palloc overhead?

On 04/06/17 03:38, Craig Ringer wrote:
> Also, what rules apply in terms of what you can/cannot do from within
> a callback? Presumably it's unsafe to perform additional SPI calls,
> perform transactions, call into the executor, change the current
> snapshot, etc, but I would consider that reasonably obvious. Are there
> any specific things to avoid?


Confessing, right up front, that I'm not very familiar with the executor
or DestReceiver code, but thinking of issues that might be expected with
PLs, I wonder if there could be a design where the per-tuple callback
could sometimes return a status HAVE_SLOW_STUFF_TO_DO.

If it does, the executor could release some pins or locks, stack some
state, whatever allows it to (as far as practicable) relax restrictions
on what the callback would be allowed to do, then reinvoke the callback,
not with another tuple, but with OK_GO_DO_YOUR_SLOW_STUFF.

On return from that call, the executor could reacquire its stacked
state/locks/pins and resume generating tuples.

That way, a callback could, say, return normally 9 out of 10 times, just
quickly buffering up 10 tuples, and every 10th time return SLOW_STUFF_TO_DO
and get a chance to jump into the PL interpreter and deal with those 10 ...
(a) minimizing the restrictions on what the PL routine may do, and (b)
allowing any costs of state-stacking/lock-releasing-reacquiring, and control
transfer to the interpreter, to be amortized over some number of tuples.
How many tuples that should be might be an empirical question for any given
PL, but with a protocol like this, the callback has an easy way to control
it.

Or would that be overcomplicated?

-Chap




Re: [HACKERS] Proposal for CSN based snapshots

2017-08-01 Thread Amit Kapila
On Tue, Aug 1, 2017 at 7:41 PM, Alexander Kuzmenkov
 wrote:
> Hi all,
>
> So I did some more experiments on this patch.
>
> * I fixed the bug with duplicate tuples I mentioned in the previous letter.
> Indeed, the oldestActiveXid could be advanced past the transaction's xid
> before it set the clog status. This happened because the oldestActiveXid is
> calculated based on the CSN log contents, and we wrote to CSN log before
> writing to clog. The fix is to write to clog before CSN log
> (TransactionIdAsyncCommitTree)
>
> * We can remove the exclusive locking on CSNLogControlLock when setting the
> CSN for a transaction (CSNLogSetPageStatus). When we assign a CSN to a
> transaction and its children, the atomicity is guaranteed by using an
> intermediate state (COMMITSEQNO_COMMITTING), so it doesn't matter if this
> function is not atomic in itself. The shared lock should suffice here.
>
> * On throughputs of about 100k TPS, we allocate ~1k CSN log pages per
> second. This is done with exclusive locking on CSN control lock, and
> noticeably increases contention. To alleviate this, I allocate new pages in
> batches (ExtendCSNLOG).
>
> * When advancing oldestActiveXid, we scan CSN log to find an xid that is
> still in progress. To do that, we increment the xid and query its CSN using
> the high level function, acquiring and releasing the lock and looking up the
> log page for each xid. I wrote some code to acquire the lock only once and
> then scan the pages (CSNLogSetPageStatus).
>
> * On bigger buffers the linear page lookup code that the SLRU uses now
> becomes slow. I added a shared dynahash table to speed up this lookup.
>
> * I merged in recent changes from master (up to 7e1fb4). Unfortunately I
> didn't have enough time to fix the logical replication and snapshot import,
> so now it's completely broken.
>
> I ran some pgbench with these tweaks (tpcb-like, 72 cores, scale 500). The
> throughput is good at lower numbers of clients (at 50 clients it's 35% higher
> than on the master), but then it degrades steadily. After 250 clients it's
> already lower than master; see the attached graph. In perf reports the
> CSN-related things have almost vanished, and I see lots of time spent
> working with clog. This is probably the situation where by making some parts
> faster, the contention in other parts becomes worse and overall we have a
> performance loss.

Yeah, this happens sometimes and I have also observed this behavior.

> Hilariously, at some point I saw a big performance
> increase after adding some debug printfs. I wanted to try some things with
> the clog next, but for now I'm out of time.
>

What problem exactly are you seeing in the clog: is it the contention
around CLOGControlLock, or is accessing the CLOG generally slower?  If
the former, then we already have a patch [1] to address it.


[1] - https://commitfest.postgresql.org/14/358/

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




[HACKERS] Confusing error message in pgbench

2017-08-01 Thread Tatsuo Ishii
I found that an error message in pgbench is quite confusing.

pgbench -S -M extended -c 1 -T 30 test
query mode (-M) should be specified before any transaction scripts (-f or -b)

Since no -f or -b option is specified, users will be
confused.  Actually, the error occurs because pgbench implicitly
introduces a built-in script for -S.  To eliminate the confusion, I
think the error message should be fixed like this:

query mode (-M) should be specified before transaction type (-S or -N) or any 
transaction scripts (-f or -b)

Patch attached.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4d364a1..f7ef0ed 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3898,7 +3898,7 @@ main(int argc, char **argv)
 benchmarking_option_set = true;
 if (num_scripts > 0)
 {
-	fprintf(stderr, "query mode (-M) should be specified before any transaction scripts (-f or -b)\n");
+	fprintf(stderr, "query mode (-M) should be specified before transaction type (-S or -N) or any transaction scripts (-f or -b)\n");
 	exit(1);
 }
 for (querymode = 0; querymode < NUM_QUERYMODE; querymode++)



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

2017-08-01 Thread Noah Misch
On Tue, Jul 25, 2017 at 07:02:28PM +0200, Tomas Vondra wrote:
> On 7/25/17 5:04 PM, Tom Lane wrote:
> >Tomas Vondra  writes:
> >>Attached is a patch that (I think) does just that. The disagreement
> >>was caused by VACUUM treating recently dead tuples as live, while
> >>ANALYZE treats both of those as dead.
> >
> >>At first I was worried that this will negatively affect plans in
> >>the long-running transaction, as it will get underestimates (due
> >>to reltuples not including rows it can see). But that's a problem
> >>we already have anyway, you just need to run ANALYZE in the other
> >>session.
> >
> >This definitely will have some impact on plans, at least in cases
> >where there's a significant number of unvacuumable dead tuples. So I
> >think it's a bit late for v10, and I wouldn't want to back-patch at
> >all. Please add to the next commitfest.
> >
> 
> I dare to disagree here, for two reasons.
> 
> Firstly, the impact *is* already there, it only takes running ANALYZE. Or
> VACUUM ANALYZE. In both those cases we already end up with
> reltuples=n_live_tup.
> 
> Secondly, I personally strongly prefer stable predictable behavior over
> intermittent oscillations between two values. That's a major PITA on
> production, both to investigate and fix.

> FWIW I personally see this as a fairly annoying bug, and would vote to
> backpatch it, although I understand people might object.

I tend to agree.  If you have a setup that somehow never uses ANALYZE or never
uses VACUUM on a particular table, you might prefer today's (accidental)
behavior.  However, the discrepancy arises only on a table that gets dead
tuples, and a table that gets dead tuples will see both VACUUM and ANALYZE
soon enough.  It does seem like quite a stretch to imagine someone wanting
plans to depend on which of VACUUM or ANALYZE ran most recently.
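
For illustration, one way to observe the discrepancy is to hold back the
xmin horizon from a second session so that deleted tuples stay "recently
dead" (a sketch only; table name hypothetical):

CREATE TABLE t AS SELECT g AS i FROM generate_series(1, 100000) g;

-- Session 2: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT count(*) FROM t;
-- and keep it open, so OldestXmin cannot advance past the DELETE below.

DELETE FROM t WHERE i % 2 = 0;

VACUUM t;    -- treats the recently dead tuples as live
SELECT reltuples FROM pg_class WHERE relname = 't';   -- ~100000

ANALYZE t;   -- treats them as dead
SELECT reltuples FROM pg_class WHERE relname = 't';   -- ~50000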




Re: [HACKERS] Partitioning vs ON CONFLICT

2017-08-01 Thread Amit Langote
On 2017/08/02 9:31, Amit Langote wrote:
> On 2017/08/02 4:02, Robert Haas wrote:
>> On Tue, Aug 1, 2017 at 12:26 AM, Amit Langote
>>  wrote:
>>> So is the latest patch posted upthread to process ON CONFLICT DO NOTHING
>>> using locally-defined unique indexes on leaf partitions something to 
>>> consider?
>>
>> Yeah, for v11.
> 
> OK.

Will stick this into the next CF.

Thanks,
Amit





Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-08-01 Thread Amit Langote
On 2017/08/02 10:27, Robert Haas wrote:
> On Tue, Aug 1, 2017 at 9:23 PM, Amit Langote
>  wrote:
>> Since ATExecAttachPartition() deals with the possibility that the table
>> being attached itself might be partitioned, someone reading the code might
>> find it helpful to get some clue about whose partitions/children a
>> particular piece of code is dealing with -  AT's target table's (rel's) or
>> those of the table being attached (attachRel's)? IMHO, attachRel_children
>> makes it abundantly clear that it is in fact the partitions of the table
>> being attached that are being manipulated.
> 
> True, but it's also long and oddly capitalized and punctuated.  Seems
> like a judgement call which way is better, but I'm allergic to
> fooBar_baz style names.

I too dislike the shape of attachRel.  How about we rename attachRel to
attachrel?  So, attachrel_children, attachrel_constr, etc.  It's still
long though... :)

>>> -if (part_rel != attachRel &&
>>> -part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>>> +if (part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>>>  {
>>> -heap_close(part_rel, NoLock);
>>> +if (part_rel != attachRel)
>>> +heap_close(part_rel, NoLock);
>>>
>>> This works out to a cosmetic change, I guess, but it makes it worse...
>>
>> Not sure what you mean by "makes it worse".  The comment above says that
>> we should skip partitioned tables from being scheduled for heap scan.  The
>> new code still does that.  We should close part_rel before continuing to
>> consider the next partition, but mustn't do that if part_rel is really
>> attachRel.  The new code does that too.  Stylistically worse?
> 
> Yeah.  I mean, do you write:
> 
> if (a)
>if (b)
>c();
> 
> rather than
> 
> if (a && b)
> c();
> 
> ?
Hmm, the latter is better.  I guess I just get confused by long &&, ||,
() chains.

If you're fine with the s/attachRel/attachrel/g suggestion above, I will
update the patch along with reverting to if (a && b) c().

Thanks,
Amit





[HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-01 Thread Masahiko Sawada
Hi all,

I'd like to propose a new option -I for the pgbench command which skips
creating primary keys after initializing the tables.  This option is
useful for users who want to do benchmarking with no indexes, or with
indexes other than the btree primary key index.  If we initialize
pgbench tables at a large scale factor, the primary key index creation
takes a long time even if we're going to use other types of indexes.
With this option, the initialization time is reduced and you can then
create indexes as you want.
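
For example, after initializing with the proposed option (along the
lines of "pgbench -i -I -s 1000 bench"; see the attached patch for the
exact behavior), one could then build an alternative index by hand:

-- The standard pgbench tables exist but have no primary keys yet, so
-- build, say, a hash index instead of the default btree primary key.
CREATE INDEX pgbench_accounts_aid_hash
    ON pgbench_accounts USING hash (aid);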

Feedback is very welcome. I'll add this patch to the next CF.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


skip_building_pkeys_after_initialized.patch
Description: Binary data



Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 9:23 PM, Amit Langote
 wrote:
> Since ATExecAttachPartition() deals with the possibility that the table
> being attached itself might be partitioned, someone reading the code might
> find it helpful to get some clue about whose partitions/children a
> particular piece of code is dealing with -  AT's target table's (rel's) or
> those of the table being attached (attachRel's)? IMHO, attachRel_children
> makes it abundantly clear that it is in fact the partitions of the table
> being attached that are being manipulated.

True, but it's also long and oddly capitalized and punctuated.  Seems
like a judgement call which way is better, but I'm allergic to
fooBar_baz style names.

>> -if (part_rel != attachRel &&
>> -part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>> +if (part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>>  {
>> -heap_close(part_rel, NoLock);
>> +if (part_rel != attachRel)
>> +heap_close(part_rel, NoLock);
>>
>> This works out to a cosmetic change, I guess, but it makes it worse...
>
> Not sure what you mean by "makes it worse".  The comment above says that
> we should skip partitioned tables from being scheduled for heap scan.  The
> new code still does that.  We should close part_rel before continuing to
> consider the next partition, but mustn't do that if part_rel is really
> attachRel.  The new code does that too.  Stylistically worse?

Yeah.  I mean, do you write:

if (a)
   if (b)
   c();

rather than

if (a && b)
c();

?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-08-01 Thread Amit Langote
Thanks for reviewing.

On 2017/08/02 2:54, Robert Haas wrote:
> On Mon, Jul 31, 2017 at 11:10 PM, Amit Langote
>  wrote:
>> OK, these cosmetic changes are now in attached patch 0001.
> 
> Regarding 0001:
> 
> -List   *childrels;
> +List   *attachRel_children;
> 
> I sorta don't see why this is necessary, or better.

Since ATExecAttachPartition() deals with the possibility that the table
being attached itself might be partitioned, someone reading the code might
find it helpful to get some clue about whose partitions/children a
particular piece of code is dealing with -  AT's target table's (rel's) or
those of the table being attached (attachRel's)? IMHO, attachRel_children
makes it abundantly clear that it is in fact the partitions of the table
being attached that are being manipulated.

>  /* It's safe to skip the validation scan after all */
>  if (skip_validate)
> +{
> +/* No need to scan the table after all. */
> 
> The existing comment should be removed along with adding the new one, I think.

Oops, fixed.

> -if (part_rel != attachRel &&
> -part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> +if (part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>  {
> -heap_close(part_rel, NoLock);
> +if (part_rel != attachRel)
> +heap_close(part_rel, NoLock);
> 
> This works out to a cosmetic change, I guess, but it makes it worse...

Not sure what you mean by "makes it worse".  The comment above says that
we should skip partitioned tables from being scheduled for heap scan.  The
new code still does that.  We should close part_rel before continuing to
consider the next partition, but mustn't do that if part_rel is really
attachRel.  The new code does that too.  Stylistically worse?

Thanks,
Amit
From d67fac574abb085db33e00711ded0515d71d99eb Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 1 Aug 2017 10:12:39 +0900
Subject: [PATCH 1/4] Cosmetic fixes for code in ATExecAttachPartition

---
 src/backend/commands/tablecmds.c | 33 +++--
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index bb00858ad1..1307dc5893 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13421,7 +13421,7 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
 {
 	Relation	attachRel,
 				catalog;
-	List	   *childrels;
+	List	   *attachRel_children;
 	TupleConstr *attachRel_constr;
 	List	   *partConstraint,
 			   *existConstraint;
@@ -13490,9 +13490,9 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
 	 * Prevent circularity by seeing if rel is a partition of attachRel. (In
 	 * particular, this disallows making a rel a partition of itself.)
 	 */
-	childrels = find_all_inheritors(RelationGetRelid(attachRel),
-									AccessShareLock, NULL);
-	if (list_member_oid(childrels, RelationGetRelid(rel)))
+	attachRel_children = find_all_inheritors(RelationGetRelid(attachRel),
+											 AccessShareLock, NULL);
+	if (list_member_oid(attachRel_children, RelationGetRelid(rel)))
 		ereport(ERROR,
 				(errcode(ERRCODE_DUPLICATE_TABLE),
 				 errmsg("circular inheritance not allowed"),
@@ -13684,19 +13684,16 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
 		skip_validate = true;
 	}
 
-	/* It's safe to skip the validation scan after all */
 	if (skip_validate)
+	{
+		/* No need to scan the table after all. */
 		ereport(INFO,
 				(errmsg("partition constraint for table \"%s\" is implied by existing constraints",
 						RelationGetRelationName(attachRel))));
-
-	/*
-	 * Set up to have the table be scanned to validate the partition
-	 * constraint (see partConstraint above).  If it's a partitioned table, we
-	 * instead schedule its leaf partitions to be scanned.
-	 */
-	if (!skip_validate)
+	}
+	else
 	{
+		/* Constraints proved insufficient, so we need to scan the table. */
 		List	   *all_parts;
 		ListCell   *lc;
 
@@ -13721,17 +13718,17 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd)
 			part_rel = attachRel;
 
 		/*
-		 * Skip if it's a partitioned table.  Only 

Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-01 Thread Noah Misch
On Thu, Jul 27, 2017 at 10:27:36AM -0400, Stephen Frost wrote:
> Noah, all,
> 
> * Noah Misch (n...@leadboat.com) wrote:
> > This PostgreSQL 10 open item is past due for your status update.  Kindly 
> > send
> > a status update within 24 hours, and include a date for your subsequent 
> > status
> > update.  Refer to the policy on open item ownership:
> 
> Based on the ongoing discussion, this is really looking like it's
> actually a fix that needs to be back-patched to 9.6 rather than a PG10
> open item.  I don't have any issue with keeping it as an open item
> though, just mentioning it.  I'll provide another status update on or
> before Monday, July 31st.

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Why does logical replication launcher exit with exit code 1?

2017-08-01 Thread Andres Freund
On 2017-08-01 20:37:07 -0400, Robert Haas wrote:
> On Tue, Aug 1, 2017 at 7:03 PM, Andres Freund  wrote:
> > On 2017-08-02 10:58:32 +1200, Thomas Munro wrote:
> >> When I shut down a cluster that isn't using logical replication, it
> >> always logs a line like the following.  So do the build farm members I
> >> looked at.  I didn't see anything about this in the open items list --
> >> isn't it a bug?
> >>
> >> 2017-08-02 10:39:25.007 NZST [34781] LOG:  worker process: logical
> >> replication launcher (PID 34788) exited with exit code 1
> >
> > Exit code 0 signals that a worker should be restarted. Therefore
> > graceful exit can't really use that.  I think a) we really need to
> > improve bgworker infrastructure around that b) shows the limit of using
> > bgworkers for this kinda thing - we should probably have a more bgworker
> > like infrastructure for internal workers.
> 
> You might've missed commit be7558162acc5578d0b2cf0c8d4c76b6076ce352.

Not really, just thinko-ing it. We don't want to unregister, so we can't
return 0, IOW, I just * -1'd my comment ;)

We intentionally return 1, so we *do* get restarted:
		else if (IsLogicalLauncher())
		{
			ereport(DEBUG1,
					(errmsg("logical replication launcher shutting down")));

			/*
			 * The logical replication launcher can be stopped at any time.
			 * Use exit status 1 so the background worker is restarted.
			 */
			proc_exit(1);
		}

- Andres




Re: [HACKERS] Why does logical replication launcher exit with exit code 1?

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 7:03 PM, Andres Freund  wrote:
> On 2017-08-02 10:58:32 +1200, Thomas Munro wrote:
>> When I shut down a cluster that isn't using logical replication, it
>> always logs a line like the following.  So do the build farm members I
>> looked at.  I didn't see anything about this in the open items list --
>> isn't it a bug?
>>
>> 2017-08-02 10:39:25.007 NZST [34781] LOG:  worker process: logical
>> replication launcher (PID 34788) exited with exit code 1
>
> Exit code 0 signals that a worker should be restarted. Therefore
> graceful exit can't really use that.  I think a) we really need to
> improve bgworker infrastructure around that b) shows the limit of using
> bgworkers for this kinda thing - we should probably have a more bgworker
> like infrastructure for internal workers.

You might've missed commit be7558162acc5578d0b2cf0c8d4c76b6076ce352.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Partitioning vs ON CONFLICT

2017-08-01 Thread Amit Langote
On 2017/08/02 4:02, Robert Haas wrote:
> On Tue, Aug 1, 2017 at 12:26 AM, Amit Langote
>  wrote:
>> So is the latest patch posted upthread to process ON CONFLICT DO NOTHING
>> using locally-defined unique indexes on leaf partitions something to 
>> consider?
> 
> Yeah, for v11.

OK.

>> Maybe, not until we have cascading index definition working [1]?
> 
> Not sure what that has to do with it.

Hmm, scratch that.  I was thinking that if all partitions had uniformly
defined (unique) indexes, the behavior on specifying on conflict do
nothing would be consistent across all partitions, but I guess that's not
a really big win or anything.

Thanks,
Amit





Re: [HACKERS] Why does logical replication launcher exit with exit code 1?

2017-08-01 Thread Andres Freund
On 2017-08-02 12:14:18 +1200, Thomas Munro wrote:
> On Wed, Aug 2, 2017 at 11:03 AM, Andres Freund  wrote:
> > On 2017-08-02 10:58:32 +1200, Thomas Munro wrote:
> >> When I shut down a cluster that isn't using logical replication, it
> >> always logs a line like the following.  So do the build farm members I
> >> looked at.  I didn't see anything about this in the open items list --
> >> isn't it a bug?
> >>
> >> 2017-08-02 10:39:25.007 NZST [34781] LOG:  worker process: logical
> >> replication launcher (PID 34788) exited with exit code 1
> >
> > Exit code 0 signals that a worker should be restarted. Therefore
> > graceful exit can't really use that.  I think a) we really need to
> > improve bgworker infrastructure around that b) shows the limit of using
> > bgworkers for this kinda thing - we should probably have a more bgworker
> > like infrastructure for internal workers.
> 
> I see.  In the meantime, IMHO we should try to find a way to
> avoid printing out this message -- it looks like something is wrong to
> the uninitiated.

Well, that's how it is for all bgworkers - maybe a better solution is to
adjust that message in the postmaster rather than fiddle with the worker
exit code?  Seems like we could easily take pmStatus into account
inside LogChildExit() and set the log level to DEBUG1 even for
EXIT_STATUS_1 in that case?  Additionally we probably should always log
a better message for bgworkers exiting with exit 1, something about
unregistering the worker or such.


> Possibly stupid question: why do we restart workers when we know we're
> shutting down anyway?  Hmm, I suppose there might conceivably be
> workers that need to do something during shutdown and they might not
> have done it yet.

The launcher doesn't really know the reason for the shutdown.

- Andres




Re: [HACKERS] Why does logical replication launcher exit with exit code 1?

2017-08-01 Thread Thomas Munro
On Wed, Aug 2, 2017 at 11:03 AM, Andres Freund  wrote:
> On 2017-08-02 10:58:32 +1200, Thomas Munro wrote:
>> When I shut down a cluster that isn't using logical replication, it
>> always logs a line like the following.  So do the build farm members I
>> looked at.  I didn't see anything about this in the open items list --
>> isn't it a bug?
>>
>> 2017-08-02 10:39:25.007 NZST [34781] LOG:  worker process: logical
>> replication launcher (PID 34788) exited with exit code 1
>
> Exit code 0 signals that a worker should be restarted. Therefore
> graceful exit can't really use that.  I think a) we really need to
> improve bgworker infrastructure around that b) shows the limit of using
> bgworkers for this kinda thing - we should probably have a more bgworker
> like infrastructure for internal workers.

I see.  In the meantime, IMHO we should try to find a way to
avoid printing out this message -- it looks like something is wrong to
the uninitiated.

Possibly stupid question: why do we restart workers when we know we're
shutting down anyway?  Hmm, I suppose there might conceivably be
workers that need to do something during shutdown and they might not
have done it yet.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-01 Thread Tom Lane
Shay Rojansky  writes:
> Once again, I manged to make the error go away simply by setting the
> session id context, which seems to be a mandatory server-side step for
> properly support session tickets.

The fact that you made the error go away doesn't make this a good
solution.  In particular, using a simple constant session ID is completely
insecure according to the TLS spec.  RFC 5246, F.1.4, doesn't even care
for the idea of ever writing session IDs to stable storage; although
Apache seems to be content with a session ID that is unique per-server
(it looks like they just use a hash of the server's host name).

More generally, PG as currently configured can't do anything with a
session cache since each new backend would start with an empty cache.
So the question here is whether it's safe or worthwhile to allow use
of session tickets.  I agree with Heikki's opinion that it's unlikely
to provide any meaningful performance gain for database sessions that
are of reasonable length.  I'm also pretty concerned about the possibility
for security problems, eg a client being able to force a server into some
low-security SSL mode.  Both RFC 5077 and the Apache people say that if
you use session tickets you'd better rotate the keys for them regularly,
eg in Apache's changelog we find

 Session ticket creation uses a random key created during web
 server startup and recreated during restarts. No other key
 recreation mechanism is available currently. Therefore using session
 tickets without restarting the web server with an appropriate frequency
 (e.g. daily) compromises perfect forward secrecy. [Rainer Jung]

Since we have no mechanism for that, I think that we need to err on
the side of security.

Accordingly, what I think we should do is something more like the
attached.  Could you see whether it fixes your problem?

regards, tom lane

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index dc307c1..fc6d0f7 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -290,6 +290,14 @@ be_tls_init(bool isServerStart)
 		SSL_OP_SINGLE_DH_USE |
 		SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
 
+	/* disallow SSL session tickets */
+#ifdef SSL_OP_NO_TICKET
+	SSL_CTX_set_options(context, SSL_OP_NO_TICKET);
+#endif
+
+	/* disallow SSL session caching, too */
+	SSL_CTX_set_session_cache_mode(context, SSL_SESS_CACHE_OFF);
+
 	/* set up ephemeral DH and ECDH keys */
 	if (!initialize_dh(context, isServerStart))
 		goto error;



Re: [HACKERS] Why does logical replication launcher exit with exit code 1?

2017-08-01 Thread Andres Freund
On 2017-08-02 11:19:39 +1200, Gavin Flower wrote:
> Returning zero to indicate success is a holdover from the time computers could
> only run one program at a time.  At the end of the code there was a jump
> table of 4-byte entries.  The first entry, with a displacement of zero, was
> the location to jump to for a normal exit; subsequent entries were for
> various error conditions.  This is why return codes were often positive
> multiples of 4.  Since we don't use jump tables now, more & more people are
> using any integers they want.
> 
> So apart from convention, returning zero is no longer held to be sacred as
> an indication that something exited okay.  In fact, since zero could simply
> mean a value was not explicitly assigned, it is actually a very dangerous
> value to be used to indicate success!

This has nothing to do with this thread.

- Andres




Re: [HACKERS] Why does logical replication launcher exit with exit code 1?

2017-08-01 Thread Gavin Flower

On 02/08/17 11:03, Andres Freund wrote:

> Hi,
> 
> On 2017-08-02 10:58:32 +1200, Thomas Munro wrote:
> > When I shut down a cluster that isn't using logical replication, it
> > always logs a line like the following.  So do the build farm members I
> > looked at.  I didn't see anything about this in the open items list --
> > isn't it a bug?
> > 
> > 2017-08-02 10:39:25.007 NZST [34781] LOG:  worker process: logical
> > replication launcher (PID 34788) exited with exit code 1
> 
> Exit code 0 signals that a worker should be restarted. Therefore
> graceful exit can't really use that.  I think a) we really need to
> improve bgworker infrastructure around that b) shows the limit of using
> bgworkers for this kinda thing - we should probably have a more bgworker
> like infrastructure for internal workers.
> 
> - Andres


Returning zero to indicate success is a holdover from the time computers
could only run one program at a time.  At the end of the code there was
a jump table of 4-byte entries.  The first entry, with a displacement of
zero, was the location to jump to for a normal exit; subsequent entries
were for various error conditions.  This is why return codes were often
positive multiples of 4.  Since we don't use jump tables now, more &
more people are using any integers they want.

So apart from convention, returning zero is no longer held to be sacred
as an indication that something exited okay.  In fact, since zero could
simply mean a value was not explicitly assigned, it is actually a very
dangerous value to be used to indicate success!



Cheers,
Gavin





Re: [HACKERS] Why does logical replication launcher exit with exit code 1?

2017-08-01 Thread Andres Freund
Hi,

On 2017-08-02 10:58:32 +1200, Thomas Munro wrote:
> When I shut down a cluster that isn't using logical replication, it
> always logs a line like the following.  So do the build farm members I
> looked at.  I didn't see anything about this in the open items list --
> isn't it a bug?
> 
> 2017-08-02 10:39:25.007 NZST [34781] LOG:  worker process: logical
> replication launcher (PID 34788) exited with exit code 1

Exit code 0 signals that a worker should be restarted. Therefore
graceful exit can't really use that.  I think a) we really need to
improve bgworker infrastructure around that b) shows the limit of using
bgworkers for this kinda thing - we should probably have a more bgworker
like infrastructure for internal workers.

- Andres




[HACKERS] Why does logical replication launcher exit with exit code 1?

2017-08-01 Thread Thomas Munro
Hi,

When I shut down a cluster that isn't using logical replication, it
always logs a line like the following.  So do the build farm members I
looked at.  I didn't see anything about this in the open items list --
isn't it a bug?

2017-08-02 10:39:25.007 NZST [34781] LOG:  worker process: logical
replication launcher (PID 34788) exited with exit code 1

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 5:34 AM, Rushabh Lathia  wrote:
> My colleague Robert and I had a doubt about the ordering of TABLE
> and TABLE_DATA. We thought earlier that reload-through-root might
> not solve the purpose which has been discussed in the above
> mentioned thread.  But later, looking into the code, I realized the
> sort order for DO_TABLE and DO_TABLE_DATA are different, so we don't
> need to worry about that issue.

Hmm.  Does that mean that table data restoration will *absolutely
always* follow all CREATE TABLE commands, or is that something that's
*normally* true but potentially false if dependency sorting switches
things around?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Adding support for Default partition in partitioning

2017-08-01 Thread Robert Haas
On Wed, Jul 12, 2017 at 3:31 PM, Jeevan Ladhe
 wrote:
> 0001:
> Refactoring existing ATExecAttachPartition  code so that it can be used for
> default partitioning as well

Boring refactoring.  Seems fine.

> 0002:
> This patch teaches the partitioning code to handle the NIL returned by
> get_qual_for_list().
> This is needed because a default partition will not have any constraints in
> case
> it is the only partition of its parent.

Perhaps it would be better to make validatePartConstraint() a no-op
when the constraint is empty rather than putting the logic in the
caller.  Otherwise, every place that calls validatePartConstraint()
has to think about whether or not the constraint-is-NULL case needs to
be handled.

> 0003:
> Support for default partition with the restriction of preventing addition of
> any
> new partition after default partition.

This looks generally reasonable, but can't really be committed without
the later patches, because it might break pg_dump, which won't know
that the DEFAULT partition must be dumped last and might therefore get
the dump ordering wrong, and of course also because it reverts commit
c1e0e7e1d790bf18c913e6a452dea811e858b554.

> 0004:
> Store the default partition OID in pg_partition_table, this will help us to
> retrieve the OID of default relation when we don't have the relation cache
> available. This was also suggested by Amit Langote here[1].

I looked this over and I think this is the right approach.  An
alternative way to avoid needing a relcache entry in
heap_drop_with_catalog() would be for get_default_partition_oid() to
call find_inheritance_children() here and then use a syscache lookup
to get the partition bound for each one, but that's still going to
cause some syscache bloat.

> 0005:
> Extend default partitioning support to allow addition of new partitions.

+		if (spec->is_default)
+		{
+			/* Default partition cannot be added if there already exists one. */
+			if (partdesc->nparts > 0 && partition_bound_has_default(boundinfo))
+			{
+				with = boundinfo->default_index;
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+						 errmsg("partition \"%s\" conflicts with existing default partition \"%s\"",
+								relname, get_rel_name(partdesc->oids[with])),
+						 parser_errposition(pstate, spec->location)));
+			}
+
+			return;
+		}

I generally think it's good to structure the code so as to minimize
the indentation level.  In this case, if you did if (partdesc->nparts
== 0 || !partition_bound_has_default(boundinfo)) return; first, then
the rest of it could be one level less indented.  Also, perhaps it
would be clearer to test boundinfo == NULL rather than
partdesc->nparts == 0, assuming they are equivalent.

-	 * We must also lock the default partition, for the same reasons explained
-	 * in heap_drop_with_catalog().
+	 * We must lock the default partition, for the same reasons explained in
+	 * DefineRelation().

I don't really see the point of this change.  Whichever earlier patch
adds this code could include or omit the word "also" as appropriate,
and then this patch wouldn't need to change it.

> 0006:
> Extend default partitioning validation code to reuse the refactored code in
> patch 0001.

I'm having a very hard time understanding what's going on with this
patch.  It certainly doesn't seem to be just refactoring things to use
the code from 0001.  For example:

-   if (ExecCheck(partqualstate, econtext))
+   if (!ExecCheck(partqualstate, econtext))

It seems hard to believe that refactoring things to use the code from
0001 would involve inverting the value of this test.

+	 * derived from the bounds (the partition constraint never evaluates to
+	 * NULL, so negating it like this is safe).

I don't see it being negated.

I think this patch needs a better explanation of what it's trying to
do, and better comments.  I gather that at least part of the point
here is to skip validation scans on default partitions if the default
partition has been constrained not to contain any values that would
fall in the new partition, but neither the commit message for 0006 nor
your description here make that very clear.

> 0007:
> This patch introduces code to check if the scanning of default partition
> child
> can be skipped if it's constraints are proven.

If I understand correctly, this is actually a completely separate
feature not intrinsically related to default partitioning.

> [0008 documentation]

-  attached is marked NO INHERIT, the command will fail;
-  such a constraint must be recreated without the NO INHERIT
-  clause.
+  target table.
+ 

I don't favor inserting a 

Re: [HACKERS] More flexible LDAP auth search filters?

2017-08-01 Thread Thomas Munro
On Wed, Aug 2, 2017 at 5:36 AM, Peter Eisentraut
 wrote:
> On 7/16/17 19:09, Thomas Munro wrote:
>> On Mon, Jul 17, 2017 at 10:26 AM, Thomas Munro
>>  wrote:
>>> ldap-search-filters-v2.patch
>>
>> Gah, it would help if I could spell "occurrences" correctly.  Fixed in
>> the attached.
>
> Please also add the corresponding support for specifying search filters
> in LDAP URLs.  See RFC 4516 for the format and
> https://linux.die.net/man/3/ldap_url_parse for the API.  You might just
> need to grab LDAPURLDesc.lud_filter and use it.

Good idea.  Yes, it seems to be that simple.  Here's a version like
that, and an example of how it looks in pg_hba.conf:

host   all all  127.0.0.1/32ldap
ldapurl="ldap://localhost/ou=people1,dc=my-domain,dc=com??sub?(cn=%25u)"

Maybe we could choose a better token than %u for user name, since it
has to be escaped when included in a URL like that, but on the other
hand there seems to be wide precedent for %u in other software.

-- 
Thomas Munro
http://www.enterprisedb.com


ldap-search-filters-v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #14758: Segfault with logical replication on a function index

2017-08-01 Thread Peter Eisentraut
On 8/1/17 00:21, Noah Misch wrote:
> On Mon, Jul 31, 2017 at 09:40:34AM +0900, Masahiko Sawada wrote:
>> On Sat, Jul 29, 2017 at 4:35 AM, Scott Milliken  wrote:
>>> Thank you Masahiko! I've tested and confirmed that this patch fixes the
>>> problem.
>>>
>>
>> Thank you for the testing. This issue should be added to the open item
>> since this cause of the server crash. I'll add it.
> 
> [Action required within three days.  This is a generic notification.]

I'm looking into this now and will report back on Thursday.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] asynchronous execution

2017-08-01 Thread Robert Haas
On Mon, Jul 31, 2017 at 5:42 AM, Kyotaro HORIGUCHI
 wrote:
> Another is getting rid of recursive call to run an execution
> tree.

That happens to be exactly what Andres did for expression evaluation
in commit b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755, and I think
generalizing that to include the plan tree as well as expression trees
is likely to be the long-term way forward here.  Unfortunately, that's
probably another gigantic patch (that should probably be written by
Andres).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH v3] pg_progress() SQL function to monitor progression of long running SQL queries/utilities

2017-08-01 Thread Pavel Stehule
2017-08-01 18:35 GMT+02:00 Remi Colinet :

> I did it in version 2 of the patch.
> The patch could yield TEXT, JSON, and XML ouput.
>
> For below query, it gives:
>
> => Terminal 1
> test=# select * from t_10m, t_1m where t_10m.md5 like '%cb%';
>
> => Terminal 2
> test=# \watch PROGRESS 9546;
>Wed 10 May 2017 06:29:59 PM CEST (every 1s)
>
>   PLAN
> PROGRESS
> 
> -
>  status: 
>  query: select * from t_10m, t_1m where t_10m.md5 like '%cb%';
>  time used (s): 10
>  Nested Loop
>->  Seq Scan on t_1m => rows 7/100 0% => blks 8334/8334 100%
>->  Materialize => file read readptrcount=1 rows write=1189285
> read=6584854 disk use (bytes) 53842965
>  ->  Seq Scan on t_10m => rows 1145596/738172 155%
>Filter: (md5 ~~ '%cb%'::text)
>  total disk space used (MB) 51
> (9 rows)
>
> => Terminal 2
> test=# PROGRESS (FORMAT JSON) 9546;
> PLAN PROGRESS
> --
>  [   +
>"status": "",  +
>"query": "select * from t_10m, t_1m where t_10m.md5 like '%cb%';",+
>"time used (s)": 0,   +
>"single worker": {+
>  "Node Type": "Nested Loop", +
>  "Partial Mode": "", +
>  "Operation": "single worker",   +
>  "Parent Relationship": "single worker", +
>  "Custom Plan Provider": "(@\u0004\u0001",   +
>  "Parallel Aware": false,+
>  "Outer": {  +
>"Node Type": "Seq Scan",  +
>"Strategy": "",   +
>"Partial Mode": "single worker",  +
>"Operation": "Outer", +
>"Parent Relationship": "Outer",   +
>"Custom Plan Provider": "(@\u0004\u0001", +
>"Parallel Aware": false,  +
>"relation": "t_1m",   +
>"rows fetched": 1,+
>"rows total": 100,+
>"rows percent": 0,+
>"blocks fetched": 8334,   +
>"blocks total": 8334, +
>"blocks percent": 100 +
>  },  +
>  "Inner": {  +
>"Node Type": "Materialize",   +
>"Strategy": "",   +
>"Partial Mode": "single worker",  +
>"Operation": "Inner", +
>"Parent Relationship": "Inner",   +
>"Custom Plan Provider": "(@\u0004\u0001", +
>"Parallel Aware": false,  +
>"file store": "write",+
>"readptrcount": 1,+
>"rows write": 297256, +
>"rows read": 0,   +
>"disk use (bytes)": 11911168, +
>"Outer": {+
>  "Node Type": "Seq Scan",+
>  "Strategy": "", +
>  "Partial Mode": "Inner",+
>  "Operation": "Outer",   +
>  "Parent Relationship": "Outer", +
>  "Custom Plan Provider": "HtFH\b[]\u000f\u001f", +
>  "Parallel Aware": false,+
>  "relation": "t_10m",+
>  "rows fetched": 253566, +
>  "rows total": 738172,   +
>  "rows percent": 34, +
>  "blocks fetched": 

Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Ildus K
On Tue, 1 Aug 2017 15:33:08 -0400
Robert Haas  wrote:

> On Tue, Aug 1, 2017 at 3:10 PM, Ildus K
>  wrote:
> >> So this would break pg_upgrade for tsvector columns?  
> >
> > I added a function that will convert old tsvectors on the fly. It's
> > the approach used in hstore before.  
> 
> Does that mean the answer to the question that I asked is "yes, but I
> have a workaround" or does it mean that the answer is "no"?
> 

It's a workaround.  DatumGetTSVector and
DatumGetTSVectorCopy will upgrade a tsvector on the fly if it
has the old format.

Regards,
Ildus Kurbangaliev




Re: [HACKERS] Better way to handle suppression of CASCADE detail messages

2017-08-01 Thread Alvaro Herrera
Andres Freund wrote:
> On 2017-08-01 13:48:34 -0400, Robert Haas wrote:
> > On Tue, Aug 1, 2017 at 1:39 PM, Andres Freund  wrote:
> > > Oid is probably not good enough - with parallel tests and such it's not
> > > necessarily predicable. Even less so when the tests are run against an
> > > existing cluster.  Sorting by name would probably be better...
> > 
> > It's arguably more user-friendly, too, although part of me feels like
> > it would be better to try to preserve the topological ordering in some
> > way.  If something cascades to foo and from there to bar and from
> > there to baz and from there to quux, emitting the messages as
> > 
> > drop cascades to bar
> > drop cascades to baz
> > drop cascades to foo
> > drop cascades to quux
> > 
> > is arguably not going to be too helpful to the user in understanding
> > the chain of events, however nice it may be for regression testing
> > purposes.
> 
> I'm not sure that's going to easily be better - won't the oid order in
> turn determine the topological order? Which then again isn't very easy
> for users to understand.

I'm not sure I buy the idea of ordering by name.  Not all objects are
going to be of the same type, so ordering by name is going to look
strange.

OID order would only have a problem if you run the tests just before OID
wraparound, and the counter wraps in the middle of the test.  To me that
seems a fringe enough situation that we shouldn't worry about it ...  in
a freshly initdb'd cluster it will never behave strangely.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 3:37 PM, Peter Eisentraut
 wrote:
> On 7/21/17 12:59, Robert Haas wrote:
>> That's an exceedingly-weak argument for rejecting this patch.  The
>> fact that you can probably hack around the lack of a hook for most
>> reasonable use cases is not an argument for having a hook that does
>> what people actually want to do.
>
> Still nobody has presented a concrete use case so far.

I've been asked for this at EDB, too.  Inserting a row into some table
on each logon, for example.
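
To make that concrete, here is a rough sketch of such a logon-auditing
extension; the hook name, type, and calling convention are assumptions
based on the proposed patch, not a committed API, and the audit table
is hypothetical:

/* Sketch only: session_start_hook is assumed from the proposed patch. */
#include "postgres.h"
#include "fmgr.h"
#include "executor/spi.h"

PG_MODULE_MAGIC;

typedef void (*session_start_hook_type) (void);
extern session_start_hook_type session_start_hook;

static session_start_hook_type prev_session_start_hook = NULL;

static void
audit_session_start(void)
{
	if (prev_session_start_hook)
		prev_session_start_hook();

	/* one audit row per logon; transaction/error handling omitted */
	SPI_connect();
	SPI_exec("INSERT INTO logon_audit (username, logged_in_at) "
			 "VALUES (current_user, now())", 0);
	SPI_finish();
}

void
_PG_init(void)
{
	prev_session_start_hook = session_start_hook;
	session_start_hook = audit_session_start;
}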

A quick Google search found 6 previous requests for this feature, some
of which describe intended use cases:

https://www.postgresql.org/message-id/4ebc6852.5030...@fuzzy.cz (2011)
https://www.postgresql.org/message-id/CAHyXU0wrsYShxmwBxZSGYoiBJa%3DgzEJ17iAeRvaf_vA%2BcoH_qA%40mail.gmail.com
(2011)
https://www.postgresql.org/message-id/bay104-w513cf26c0046c9d28747b8d1...@phx.gbl
(2009, in Spanish)
https://www.postgresql.org/message-id/758d5e7f0803130227m558d32cdl7159bed00d21f084%40mail.gmail.com
(2008)
https://www.postgresql.org/message-id/001a01c48077%240b118e60%240200030a%40gendron.ca
(2004)
https://www.postgresql.org/message-id/f96sgcorblsaqv6updv0...@hotmail.com
(2000)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-08-01 Thread Andres Freund
On 2017-08-01 15:37:40 -0400, Peter Eisentraut wrote:
> On 7/21/17 12:59, Robert Haas wrote:
> > That's an exceedingly-weak argument for rejecting this patch.  The
> > fact that you can probably hack around the lack of a hook for most
> > reasonable use cases is not an argument for having a hook that does
> > what people actually want to do.
> 
> Still nobody has presented a concrete use case so far.

Citus for example starts a background worker (performing
e.g. distributed deadlock detection) if the citus extension exists. We
atm need annoying hacks to do so when the first query is executed.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-08-01 Thread Peter Eisentraut
On 7/21/17 12:59, Robert Haas wrote:
> That's an exceedingly-weak argument for rejecting this patch.  The
> fact that you can probably hack around the lack of a hook for most
> reasonable use cases is not an argument for having a hook that does
> what people actually want to do.

Still nobody has presented a concrete use case so far.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-08-01 Thread Peter Eisentraut
On 7/20/17 07:47, Yugo Nagata wrote:
> Another patch, session_start_sample.patch, is a very simple
> example of this hook that changes work_mem values for sessions
> of a specific database. 

I think test modules should go into src/test/modules/ instead of contrib.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-08-01 Thread Peter Eisentraut
On 7/21/17 13:14, Jim Mlodgenski wrote:
> When I first saw this thread, my initial thought of a use case is to
> prepare some key application queries so they are there and ready to go.
> That would need to be before the ExecutorStart_hook or
> ProcessUtility_hook if an app would just want to execute the prepared
> statement.

Isn't that what the preprepare extension does already?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 3:10 PM, Ildus K  wrote:
>> So this would break pg_upgrade for tsvector columns?
>
> I added a function that will convert old tsvectors on the fly. It's the
> approach used in hstore before.

Does that mean the answer to the question that I asked is "yes, but I
have a workaround" or does it mean that the answer is "no"?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel documentation improvements

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 3:15 PM, Erik Rijkers  wrote:
> On 2017-08-01 20:43, Robert Haas wrote:
>> In commit 054637d2e08cda6a096f48cc99696136a06f4ef5, I updated the
>> parallel query documentation to reflect recently-committed parallel
>>
>> Barring objections, I'd like to commit this in the next couple of days
>
> I think that in this bit:
>
>  occurrence is frequent, considering increasing
>  max_worker_processes and max_parallel_workers
>  so that more workers can be run simultaneously or alternatively
> reducing
> - so that the
> planner
> +max_parallel_workers_per_gather so that the planner
>  requests fewer workers.
>
>
> 'considering increasing'  should be
> 'consider increasing'

Thanks, you are right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel documentation improvements

2017-08-01 Thread Erik Rijkers

On 2017-08-01 20:43, Robert Haas wrote:


In commit 054637d2e08cda6a096f48cc99696136a06f4ef5, I updated the
parallel query documentation to reflect recently-committed parallel

Barring objections, I'd like to commit this in the next couple of days


I think that in this bit:

 occurrence is frequent, considering increasing
 max_worker_processes and 
max_parallel_workers
 so that more workers can be run simultaneously or alternatively 
reducing
- so that the 
planner
+max_parallel_workers_per_gather so that the 
planner

 requests fewer workers.


'considering increasing'  should be
'consider increasing'




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Ildus K
On Tue, 1 Aug 2017 14:56:54 -0400
Robert Haas  wrote:

> On Tue, Aug 1, 2017 at 10:08 AM, Ildus Kurbangaliev
>  wrote:
> > Historically the tsvector type can't hold more than 1MB of data.
> > I want to propose a patch that removes that limit.
> >
> > That limit comes from the 'pos' field of WordEntry, which has only
> > 20 bits of storage.
> >
> > In the proposed patch I removed this field; instead, I keep offsets
> > only at every Nth item in the WordEntry array.  
> 
> So this would break pg_upgrade for tsvector columns?
> 

I added a function that will convert old tsvectors on the fly. It's the
approach used in hstore before.

Regards,
Ildus Kurbangaliev


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update comments in nodeModifyTable.c

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 12:31 AM, Etsuro Fujita
 wrote:
> Maybe I'm missing something, but I'm not sure that's a good idea because the
> change makes it sound as though we might have 'wholerow' only for the FDW
> case, but that isn't correct because we would have 'wholerow' for a view as
> well. ISTM that
> views should be one of the typical cases, so I'd like to propose to modify
> the sentence starting with 'Typically' to something like this: "Typically,
> this will be a 'ctid' or 'wholerow' attribute, but in the case of a foreign
> data wrapper it might be a set of junk attributes sufficient to identify the
> remote row."  What do you think about that?

Works for me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning vs ON CONFLICT

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 12:26 AM, Amit Langote
 wrote:
> So is the latest patch posted upthread to process ON CONFLICT DO NOTHING
> using locally-defined unique indexes on leaf partitions something to consider?

Yeah, for v11.

> Maybe, not until we have cascading index definition working [1]?

Not sure what that has to do with it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] JDBC 42.1.4 released

2017-08-01 Thread Dave Cramer
*Notable changes*

   - Statements with non-zero fetchSize no longer require a server-side
   named handle. This might cause issues when combining old PostgreSQL
   versions (pre-8.4), fetchSize, and interleaved ResultSet processing;
   see issue 869.


Thanks,

Dave Cramer


Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 10:08 AM, Ildus Kurbangaliev
 wrote:
> Historically the tsvector type can't hold more than 1MB of data.
> I want to propose a patch that removes that limit.
>
> That limit comes from the 'pos' field of WordEntry, which has only
> 20 bits of storage.
>
> In the proposed patch I removed this field; instead, I keep offsets
> only at every Nth item in the WordEntry array.

So this would break pg_upgrade for tsvector columns?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench minor doc typo

2017-08-01 Thread Fabien COELHO



Alik Khilazhev is submitting a patch about a zipfian random function
for pgbench, and noticed a typo in the documentation about
random_exponential.

Attached is a fix extracted from his patch submission, which could be
applied to head/10/9.6.


done


Ok, thanks.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] parallel documentation improvements

2017-08-01 Thread Robert Haas
Hi,

In commit 054637d2e08cda6a096f48cc99696136a06f4ef5, I updated the
parallel query documentation to reflect recently-committed parallel
query features.  However, a few more things got committed after that.
Most of the attached patch consists of generalizing references to
Gather to also include Gather Merge, but I also included a tweak to
note that uncorrelated subplans are no longer parallel-restricted and
made a few other minor improvements and clarifications.

Barring objections, I'd like to commit this in the next couple of days
so that it is included in beta3.  I probably should have gotten to do
this sooner; apologies for any inconvenience.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


parallel-doc-for-v10.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench minor doc typo

2017-08-01 Thread Peter Eisentraut
On 7/19/17 16:42, Fabien COELHO wrote:
> Alik Khilazhev is submitting a patch about a zipfian random function
> for pgbench, and noticed a typo in the documentation about 
> random_exponential.
> 
> Attached is a fix extracted from his patch submission, which could be 
> applied to head/10/9.6.

done

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v3] pg_progress() SQL function to monitor progression of long running SQL queries/utilities

2017-08-01 Thread Michael Paquier
On Tue, Aug 1, 2017 at 7:17 PM, Andres Freund  wrote:
> On 2017-08-01 19:11:55 +0200, Michael Paquier wrote:
>> I think that Depesz makes use of a non-default format for its
>> explain.depesz.com, or he would have a hard time maintaining a
>> deparsing API for its application.
>
> Hm? e.d.c accepts the text explain format, so I'm unclear on what you're
> saying here.

Ah, right. I thought it did...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better way to handle suppression of CASCADE detail messages

2017-08-01 Thread Andres Freund
On 2017-08-01 13:48:34 -0400, Robert Haas wrote:
> On Tue, Aug 1, 2017 at 1:39 PM, Andres Freund  wrote:
> > Oid is probably not good enough - with parallel tests and such it's not
> > necessarily predictable. Even less so when the tests are run against an
> > existing cluster.  Sorting by name would probably be better...
> 
> It's arguably more user-friendly, too, although part of me feels like
> it would be better to try to preserve the topological ordering in some
> way.  If something cascades to foo and from there to bar and from
> there to baz and from there to quux, emitting the messages as
> 
> drop cascades to bar
> drop cascades to baz
> drop cascades to foo
> drop cascades to quux
> 
> is arguably not going to be too helpful to the user in understanding
> the chain of events, however nice it may be for regression testing
> purposes.

I'm not sure that's going to easily be better - won't the oid order in
turn determine the topological order? Which then again isn't very easy
for users to understand.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better way to handle suppression of CASCADE detail messages

2017-08-01 Thread Tom Lane
Robert Haas  writes:
> On Tue, Aug 1, 2017 at 1:39 PM, Andres Freund  wrote:
>> Oid is probably not good enough - with parallel tests and such it's not
>> necessarily predictable. Even less so when the tests are run against an
>> existing cluster.  Sorting by name would probably be better...

> It's arguably more user-friendly, too, although part of me feels like
> it would be better to try to preserve the topological ordering in some
> way.

Yeah, loss of the causality relationship is the main thing that's bugging
me too.

If we sorted by OID then in most cases the objects would be listed in
creation order, which would likely also have something to do with the
dependency order; but it would be different in the same cases that are
most likely to be confusing :-(

I do not buy Andres' concern about parallelism breaking the test results.
We only ever drop objects created in the same test, so their OID ordering
would be the same (ie creation order) in every case unless an OID
wraparound occurred mid-test, which isn't a situation I feel a need to
worry about for this purpose.  However, possible loss of user friendliness
*is* a valid concern here.

Anyway, we don't need a design for this today.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v3] pg_progress() SQL function to monitor progression of long running SQL queries/utilities

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 12:35 PM, Remi Colinet  wrote:
> I did it in version 2 of the patch.
> I'am skeptical about the use of JSON, XML, and others in such output.
>
> Does anyone use these formats (XML, JSON, YAML) for EXPLAIN output?
> I suspect only TEXT format is being used.

Do you have any reason to suspect that others aren't being used?  The
default format for anything is likely to be the most commonly-used
one, but I don't think that proves the others are unused.

Even if it were true, it wouldn't be a good justification for
inventing an entirely new machine-readable format, at least not IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-08-01 Thread Robert Haas
On Mon, Jul 31, 2017 at 11:10 PM, Amit Langote
 wrote:
> OK, these cosmetic changes are now in attached patch 0001.

Regarding 0001:

-List   *childrels;
+List   *attachRel_children;

I sorta don't see why this is necessary, or better.

 /* It's safe to skip the validation scan after all */
 if (skip_validate)
+{
+/* No need to scan the table after all. */

The existing comment should be removed along with adding the new one, I think.

-if (part_rel != attachRel &&
-part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+if (part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
 {
-heap_close(part_rel, NoLock);
+if (part_rel != attachRel)
+heap_close(part_rel, NoLock);

This works out to a cosmetic change, I guess, but it makes it worse...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better way to handle suppression of CASCADE detail messages

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 1:39 PM, Andres Freund  wrote:
> Oid is probably not good enough - with parallel tests and such it's not
> necessarily predictable. Even less so when the tests are run against an
> existing cluster.  Sorting by name would probably be better...

It's arguably more user-friendly, too, although part of me feels like
it would be better to try to preserve the topological ordering in some
way.  If something cascades to foo and from there to bar and from
there to baz and from there to quux, emitting the messages as

drop cascades to bar
drop cascades to baz
drop cascades to foo
drop cascades to quux

is arguably not going to be too helpful to the user in understanding
the chain of events, however nice it may be for regression testing
purposes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better way to handle suppression of CASCADE detail messages

2017-08-01 Thread Andres Freund
Hi,

On 2017-08-01 13:34:45 -0400, Tom Lane wrote:
> BTW, in the long run maybe we should instead make the CASCADE message
> ordering more predictable, perhaps by sorting the objects by OID.
> But that's not a job for beta time.

Oid is probably not good enough - with parallel tests and such it's not
necessarily predicable. Even less so when the tests are run against an
existing cluster.  Sorting by name would probably be better...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More flexible LDAP auth search filters?

2017-08-01 Thread Peter Eisentraut
On 7/16/17 19:09, Thomas Munro wrote:
> On Mon, Jul 17, 2017 at 10:26 AM, Thomas Munro
>  wrote:
>> ldap-search-filters-v2.patch
> 
> Gah, it would help if I could spell "occurrences" correctly.  Fixed in
> the attached.

Please also add the corresponding support for specifying search filters
in LDAP URLs.  See RFC 4516 for the format and
https://linux.die.net/man/3/ldap_url_parse for the API.  You might just
need to grab LDAPURLDesc.lud_filter and use it.
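
For reference, pulling the filter out of a parsed URL should indeed be
just a couple of lines with that API; a minimal sketch against the
OpenLDAP client library:

/* Minimal sketch: parse an RFC 4516 LDAP URL and grab its filter. */
#include <stdio.h>
#include <ldap.h>

int
main(void)
{
	const char *url = "ldap://ldap.example.net/dc=example,dc=net"
		"?uid?sub?(objectClass=inetOrgPerson)";
	LDAPURLDesc *lud;

	if (ldap_url_parse(url, &lud) != LDAP_URL_SUCCESS)
		return 1;

	/* lud_filter is NULL when the URL carries no filter part */
	printf("filter: %s\n", lud->lud_filter ? lud->lud_filter : "(none)");

	ldap_free_urldesc(lud);
	return 0;
}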

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Better way to handle suppression of CASCADE detail messages

2017-08-01 Thread Tom Lane
In various places in the regression tests, we want to suppress DROP
CASCADE's detail messages because of the fact that they don't always
come out in the same order.  I noticed that some places deal with that
by adjusting client_min_messages while others use "\set VERBOSITY terse".
I think that the latter approach is superior, because with it you still
get to see the basic notice message, something like
NOTICE:  drop cascades to 17 other objects
In this way, the test at least verifies that the expected number of
objects get dropped, even if we aren't checking their precise identities.

So, barring objection, I'm going to run around and change all the tests
that use client_min_messages for this purpose to use VERBOSITY instead.

BTW, in the long run maybe we should instead make the CASCADE message
ordering more predictable, perhaps by sorting the objects by OID.
But that's not a job for beta time.
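
Concretely, an illustrative test fragment (the object count is the one
from the example above):

-- suppresses the notice entirely:
SET client_min_messages = warning;
DROP SCHEMA testschema CASCADE;

-- hides only the unstable detail lines, keeping the count:
\set VERBOSITY terse
DROP SCHEMA testschema CASCADE;
NOTICE:  drop cascades to 17 other objects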

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v3] pg_progress() SQL function to monitor progression of long running SQL queries/utilities

2017-08-01 Thread Andres Freund
On 2017-08-01 19:11:55 +0200, Michael Paquier wrote:
> I think that Depesz makes use of a non-default format for its
> explain.depesz.com, or he would have a hard time maintaining a
> deparsing API for its application.

Hm? e.d.c accepts the text explain format, so I'm unclear on what you're
saying here.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v3] pg_progress() SQL function to monitor progression of long running SQL queries/utilities

2017-08-01 Thread Michael Paquier
On Tue, Aug 1, 2017 at 6:35 PM, Remi Colinet  wrote:
> I'm skeptical about the use of JSON, XML, and others in such output.

You should not.

> Does anyone use these formats (XML, JSON, YAML) for EXPLAIN output?
> I suspect only TEXT format is being used.

I think that Depesz makes use of a non-default format for its
explain.depesz.com, or he would have a hard time maintaining a
deparsing API for its application. JSON is for example easy to extract
and reformat when doing analysis of the inner planner nodes, and
Postgres has json and xml data types, which makes analysis with SQL
even easier sometimes.
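
As a quick illustration of that last point, the top node type can be
pulled out of a JSON plan with the json operators; a hypothetical psql
snippet:

test=# DO $$
DECLARE plan json;
BEGIN
  EXECUTE 'EXPLAIN (FORMAT JSON) SELECT 1' INTO plan;
  RAISE NOTICE 'top node: %', plan -> 0 -> 'Plan' ->> 'Node Type';
END $$;
NOTICE:  top node: Result
DO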
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Not able to create collation on Windows

2017-08-01 Thread Tom Lane
Murtuza Zabuawala  writes:
> Yes, I was able to create collation using "C" instead of "POSIX" on Windows,
> CREATE COLLATION public.test from pg_catalog."C";

Yeah, I thought that might happen.  So the point basically is that in
almost all of the collations code, the "C" and "POSIX" names are handled
by dedicated code paths that don't care what the system's locale support
thinks.  But we missed that for CREATE COLLATION.  Aside from the case
you ran into, this means you can't do CREATE COLLATION ... FROM "C"
at all on platforms that lack HAVE_LOCALE_T.  There's no good reason
for that IMO; not if we're one line of code away from allowing it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?

2017-08-01 Thread Tom Lane
"Tels"  writes:
> So, is the goal you are trying to achieve here to be able to say "You need
> Perl 5.8.3; plus Module XYZ in vABC if you want point 2, otherwise skip
> this step" instead of saying "You need Perl 5.10.1?"?

I mainly want to be sure that if we say "it runs on 5.8.3", that's not a
lie.  The fact that some test scripts need a newer version of Test::More
seems like a detail that can be left out.  In the (quite improbable) case
that someone runs into that situation in the field, the error message that
they'll get is clear enough, and they should be able to figure out how to
fix it without help from our docs.  Anyone who's running a 15-year-old
Perl installation has probably had to upgrade some of the modules before.

But in reality, only developers are ever going to use --enable-tap-tests
in the first place, so it's largely moot.  What I was really annoyed by
was that PL/Perl failed to build and/or pass regression tests on allegedly
supported Perl versions, and that's sorted now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AlterUserStmt anmd RoleSpec rules in grammar.y

2017-08-01 Thread Robert Haas
On Tue, Aug 1, 2017 at 8:42 AM, Pavel Golub  wrote:
> Sorry if I was rough. My English is not so excellent. My point is
> that I was trying to distinguish the behavior of the EDB installer
> from that of PG built from source.
>
> And the result is that EDB executes ALTER USER and I don't know why.

I don't know why, either.  Are you sure you haven't made some mistake
in the testing?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v3] pg_progress() SQL function to monitor progression of long running SQL queries/utilities

2017-08-01 Thread Remi Colinet
I did it in version 2 of the patch.
The patch could yield TEXT, JSON, and XML output.

For below query, it gives:

=> Terminal 1
test=# select * from t_10m, t_1m where t_10m.md5 like '%cb%';

=> Terminal 2
test=# \watch PROGRESS 9546;
   Wed 10 May 2017 06:29:59 PM CEST (every 1s)

  PLAN
PROGRESS
-
 status: 
 query: select * from t_10m, t_1m where t_10m.md5 like '%cb%';
 time used (s): 10
 Nested Loop
   ->  Seq Scan on t_1m => rows 7/100 0% => blks 8334/8334 100%
   ->  Materialize => file read readptrcount=1 rows write=1189285
read=6584854 disk use (bytes) 53842965
 ->  Seq Scan on t_10m => rows 1145596/738172 155%
   Filter: (md5 ~~ '%cb%'::text)
 total disk space used (MB) 51
(9 rows)

=> Terminal 2
test=# PROGRESS (FORMAT JSON) 9546;
PLAN PROGRESS
--
 [   +
   "status": "",  +
   "query": "select * from t_10m, t_1m where t_10m.md5 like '%cb%';",+
   "time used (s)": 0,   +
   "single worker": {+
 "Node Type": "Nested Loop", +
 "Partial Mode": "", +
 "Operation": "single worker",   +
 "Parent Relationship": "single worker", +
 "Custom Plan Provider": "(@\u0004\u0001",   +
 "Parallel Aware": false,+
 "Outer": {  +
   "Node Type": "Seq Scan",  +
   "Strategy": "",   +
   "Partial Mode": "single worker",  +
   "Operation": "Outer", +
   "Parent Relationship": "Outer",   +
   "Custom Plan Provider": "(@\u0004\u0001", +
   "Parallel Aware": false,  +
   "relation": "t_1m",   +
   "rows fetched": 1,+
   "rows total": 100,+
   "rows percent": 0,+
   "blocks fetched": 8334,   +
   "blocks total": 8334, +
   "blocks percent": 100 +
 },  +
 "Inner": {  +
   "Node Type": "Materialize",   +
   "Strategy": "",   +
   "Partial Mode": "single worker",  +
   "Operation": "Inner", +
   "Parent Relationship": "Inner",   +
   "Custom Plan Provider": "(@\u0004\u0001", +
   "Parallel Aware": false,  +
   "file store": "write",+
   "readptrcount": 1,+
   "rows write": 297256, +
   "rows read": 0,   +
   "disk use (bytes)": 11911168, +
   "Outer": {+
 "Node Type": "Seq Scan",+
 "Strategy": "", +
 "Partial Mode": "Inner",+
 "Operation": "Outer",   +
 "Parent Relationship": "Outer", +
 "Custom Plan Provider": "HtFH\b[]\u000f\u001f", +
 "Parallel Aware": false,+
 "relation": "t_10m",+
 "rows fetched": 253566, +
 "rows total": 738172,   +
 "rows percent": 34, +
 "blocks fetched": 18436,+
 "blocks total": 83334,  +
 "blocks percent": 22,   +
 "Filter": "(md5 ~~ '%cb%'::text)" 

Re: [HACKERS] map_partition_varattnos() and whole-row vars

2017-08-01 Thread Amit Khandekar
On 1 August 2017 at 15:11, Etsuro Fujita  wrote:
> On 2017/07/31 18:56, Amit Langote wrote:
>> Yes, that's what's needed here.  So we need to teach
>> map_variable_attnos_mutator() to convert whole-row vars just like it's
>> done in adjust_appendrel_attrs_mutator().
>
>
> Seems reasonable.  (Though I think it might be better to do this kind of
> conversion in the planner, not the executor, because that would increase the
> efficiency of cached plans.)

I think the work of shifting this to the planner should be taken up as
a separate task, when we move the preparation of DispatchInfo to the
planner.

>
>>> I suspect that the other nodes that adjust_appendrel_attrs_mutator
>>> handles (like JoinExpr, CurrentOfExpr, etc ) may also require similar
>>> adjustments for our case, because WithCheckOptions (and I think even
>>> RETURNING) can have a subquery.
>>
>>
>> Actually, WITH CHECK and RETURNING expressions that are processed in the
>> executor (i.e., in ExecInitModifyTable) are finished plan tree expressions
>> (not parse or rewritten parse tree expressions), so we need not have to
>> worry about having to convert those things in this case.
>>
>> Remember that adjust_appendrel_attrs_mutator() has to handle raw Query
>> trees, because we plan the whole query separately for each partition in
>> the UPDATE and DELETE (inheritance_planner).  Since we don't need to plan
>> an INSERT query for each partition separately (at least without the
>> foreign tuple-routing support), we need not worry about converting
>> anything beside Vars (with proper support for converting whole-row ones
>> which you discovered has been missing), which we can do within the
>> executor on the finished plan tree expressions.
>
>
> Yeah, sublinks in WITH CHECK OPTION and RETURNING would already have been
> converted to subplans by the planner, so we don't need to worry about
> recursing into subqueries in sublinks at the execution time.

Yes, that seems true. It seems none of the nodes handled by
adjust_appendrel_attrs_mutator() other than Var nodes exist in planned
expressions. So it looks like just additionally handling whole-row
Vars should be sufficient.

>
>> Attached 2 patches:
>>
>> 0001: Addresses the bug that Rajkumar reported (handling whole-row vars in
>>WITH CHECK and RETURNING expressions at all)
>>
>> 0002: Addressed the bug that Amit reported (converting whole-row vars
>>that could occur in WITH CHECK and RETURNING expressions)
>
>
> I took a quick look at the patches.  One thing I noticed is this:
>
>  map_variable_attnos(Node *node,
> int target_varno, int sublevels_up,
> const AttrNumber *attno_map, int
> map_length,
> +   Oid from_rowtype, Oid to_rowtype,
> bool *found_whole_row)
>  {
> map_variable_attnos_context context;
> @@ -1291,6 +1318,8 @@ map_variable_attnos(Node *node,
> context.sublevels_up = sublevels_up;
> context.attno_map = attno_map;
> context.map_length = map_length;
> +   context.from_rowtype = from_rowtype;
> +   context.to_rowtype = to_rowtype;
> context.found_whole_row = found_whole_row;
>
> You added two arguments to pass to map_variable_attnos(): from_rowtype and
> to_rowtype.  But I'm not sure that we really need the from_rowtype argument
> because it's possible to get the Oid from the vartype of target whole-row
> Vars.  And in this:
>
> +   /*
> +* If the callers expects us to convert the
> same, do so if
> +* necessary.
> +*/
> +   if (OidIsValid(context->to_rowtype) &&
> +   OidIsValid(context->from_rowtype) &&
> +   context->to_rowtype !=
> context->from_rowtype)
> +   {
> +   ConvertRowtypeExpr *r =
> makeNode(ConvertRowtypeExpr);
> +
> +   r->arg = (Expr *) newvar;
> +   r->resulttype =
> context->from_rowtype;
> +   r->convertformat =
> COERCE_IMPLICIT_CAST;
> +   r->location = -1;
> +   /* Make sure the Var node has the
> right type ID, too */
> +   newvar->vartype =
> context->to_rowtype;
> +   return (Node *) r;
> +   }
>
> I think we could set r->resulttype to the vartype (ie, "r->resulttype =
> newvar->vartype" instead of "r->resulttype = context->from_rowtype").

I agree.

---

Few more comments :

@@ -1240,7 +1247,7 @@ map_variable_attnos_mutator(Node *node,
 var->varlevelsup == 

Re: [HACKERS] [GENERAL] Not able to create collation on Windows

2017-08-01 Thread Murtuza Zabuawala
Hi Tom,

Yes, I was able to create collation using "C" instead of "POSIX" on Windows,

CREATE COLLATION public.test from pg_catalog."C";

--
Regards,
Murtuza Zabuawala

On Tue, Aug 1, 2017 at 9:09 PM, Tom Lane  wrote:

> Peter Eisentraut  writes:
> > On 8/1/17 10:53, Tom Lane wrote:
> >> I think this is actually a bug, because the collations code clearly
> >> means to allow clones of the C/POSIX locales --- see eg lc_collate_is_c,
>
> > You seem to say that we should support a "POSIX" locale even on systems
> > where the C library does not support that.  I'm not convinced about that.
>
> Uh, we already do.  Note all the regression tests that unconditionally
> assume that the POSIX collation works.  Also, I am confused by your
> apparent belief that there might somewhere be a version of libc that
> fails to provide C-locale-compliant behavior.  Surely nobody would
> tolerate a version of strcmp() that fails to act per C spec.
>
> regards, tom lane
>


Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?

2017-08-01 Thread Tels
On Sun, July 30, 2017 4:35 pm, Tom Lane wrote:
> "Tels"  writes:
>> On Sun, July 30, 2017 12:22 pm, Tom Lane wrote:
>>> Yeah, I looked into that.  The closest candidate I can find is that
>>> perl 5.10.1 contains Test::More 0.92.  However, it's not real clear
>>> to me exactly which files I'd need to pull out of 5.10.1 and inject
>>> into
>>> an older tarball --- the layout seems a lot different from a standalone
>>> package.
>
>> So basically the two files:
>> http://search.cpan.org/src/EXODIST/Test-Simple-1.302086/lib/Test/More.pm
>> http://search.cpan.org/src/EXODIST/Test-Simple-1.302086/lib/Test/Builder/Module.pm
>> might do the trick.
>
> Thanks for the hint.  I transplanted these files out of a 5.10.1
> tarball into 5.8.3, then built as usual:
> lib/Test/Simple.pm
> lib/Test/More.pm
> lib/Test/Builder.pm
> lib/Test/Builder/Module.pm
> The result seems to work, although it fails a few of 5.8.3's tests,
> probably because I didn't copy over the relevant test scripts.
> It's good enough to run PG's tests though.

Ah, cool. One more question:

> 1. Can it run the build-related Perl scripts needed to build PG from
> a bare git checkout, on non-Windows platforms?
> 2. Can it run our MSVC build scripts?
> 3. Does pl/perl compile (and pass its regression tests) against it?
> 4. Can it run our TAP tests?

> 5.8.3 does appear to succeed at points 1,3,4.  Now, to get through
> the TAP tests you need to install IPC::Run, and you also need to
> update Test::More because the version shipped with 5.8.3 is too old.
> But at least the failures that you get from lacking these are pretty
> clear.

So, for point 2, one would need Perl 5.8.3 + a newer Test::More and
IPC::Run, or a newer Perl. I've toyed around with cpanminus:

 apt-get install cpanminus
 cpanm Test::More
 cpanm IPC::Run

and it successfully built (in a local path because I was not root) the
latest Test::More and IPC::Run. I'm not sure if this would be enough to
run the MSVC build scripts, though.

So, is the goal you are trying to achieve here to be able to say "You need
Perl 5.8.3; plus Module XYZ in vABC if you want point 2, otherwise skip
this step" instead of saying "You need Perl 5.10.1?"?

In that case it might also be possible to write instructions on how to
obtain and use these modules, or how to just use a newer Perl in a temp.
path.

Best wishes,

Tels


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in 9.3.17 using operator <>

2017-08-01 Thread David G. Johnston
On Tue, Aug 1, 2017 at 8:39 AM, Nick Dro  wrote:

> The operator <> does not seem to work properly when comparing citext
> types in trigger functions.
>
> https://stackoverflow.com/questions/45441840/posgresql-
> 9-3-operator-doesnt-give-logical-result
>
> Can someone figure out what is the problem? This seems like a bug.
>

Um, 'jack' and 'Jack' are indeed equal when compared case-insensitively.
You've asked whether they are unequal.  Since they are equal the answer to
your question is "no" (false).  That is the answer you were given.

David J.


[HACKERS] Possible bug in 9.3.17 using operator <>

2017-08-01 Thread Nick Dro
The operator <> does not seem to work properly when comparing citext types in trigger functions.
 
https://stackoverflow.com/questions/45441840/posgresql-9-3-operator-doesnt-give-logical-result
 
Can someone figure out what is the problem? This seems like a bug.

Re: [HACKERS] [GENERAL] Not able to create collation on Windows

2017-08-01 Thread Tom Lane
Peter Eisentraut  writes:
> On 8/1/17 10:53, Tom Lane wrote:
>> I think this is actually a bug, because the collations code clearly
>> means to allow clones of the C/POSIX locales --- see eg lc_collate_is_c,

> You seem to say that we should support a "POSIX" locale even on systems
> where the C library does not support that.  I'm not convinced about that.

Uh, we already do.  Note all the regression tests that unconditionally
assume that the POSIX collation works.  Also, I am confused by your
apparent belief that there might somewhere be a version of libc that
fails to provide C-locale-compliant behavior.  Surely nobody would
tolerate a version of strcmp() that fails to act per C spec.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Not able to create collation on Windows

2017-08-01 Thread Peter Eisentraut
On 8/1/17 10:53, Tom Lane wrote:
> Murtuza Zabuawala  writes:
>> I am trying to create collation on windows using default POSIX collation
>> with pgAdmin3 but I am getting error as shown in screenshot, Can someone
>> suggest how to fix this?
> 
>> *Syntax:*
>> CREATE COLLATION public.test from pg_catalog."POSIX";
> 
>> *Error:*
>> ERROR: could not create locale "POSIX". No error
> 
> Hmm.  Evidently Windows' _create_locale() doesn't accept "POSIX".
> You might find that "C" works instead, don't know for sure.
> 
> I think this is actually a bug, because the collations code clearly
> means to allow clones of the C/POSIX locales --- see eg lc_collate_is_c,

You seem to say that we should support a "POSIX" locale even on systems
where the C library does not support that.  I'm not convinced about that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 (latest beta) and older ICU

2017-08-01 Thread Peter Eisentraut
On 8/1/17 08:28, Victor Wagner wrote:
> On Tue, 1 Aug 2017 08:16:54 -0400
> Peter Eisentraut  wrote:
> 
>> On 8/1/17 02:12, Victor Wagner wrote:
 We are only calling uloc_toLanguageTag() with keyword/value
 combinations that ICU itself previously told us were supported.  So
 just ignoring errors doesn't seem proper in this case.
  
>>> We know that this version of ICU is broken. But what choice we
>>> have?  
>> I don't know that we can already reach that conclusion.  Maybe the
> Because it was fixed in subsequent versions.

I propose the attached patch to address this.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From a9c89b7b1b98f0a0e3dc6d7571dfc7e7f146ed8d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 1 Aug 2017 10:49:55 -0400
Subject: [PATCH] Add support for ICU 4.2

Supporting ICU 4.2 seems useful because it ships with CentOS 6.

Versions before ICU 4.6 don't support pkg-config, so document an
installation method without using pkg-config.

In ICU 4.2, ucol_getKeywordsForLocale() sometimes returns values that
will not be accepted by uloc_toLanguageTag().  Skip loading keyword
variants in that version.

Reported-by: Victor Wagner 
---
 doc/src/sgml/installation.sgml   | 22 +++---
 src/backend/commands/collationcmds.c | 11 +++
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index fa0d05efe6..12866b4bf7 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -774,10 +774,26 @@ Configuration
  Build with support for
  the ICUICU
  library.  This requires the ICU4C package
- as well
- as 
pkg-configpkg-config
  to be installed.  The minimum required version
- of ICU4C is currently 4.6.
+ of ICU4C is currently 4.2.
+
+
+
+ By default,
+ 
pkg-configpkg-config
+ will be used to find the required compilation options.  This is
+ supported for ICU4C version 4.6 and later.
+ For older versions, or if pkg-config is
+ not available, the variables ICU_CFLAGS
+ and ICU_LIBS can be specified
+ to configure, like in this example:
+
+./configure ... --with-icu ICU_CFLAGS='-I/some/where/include' 
ICU_LIBS='-L/some/where/lib -licui18n -licuuc -licudata'
+
+ (If ICU4C is in the default search path
+ for the compiler, then you still need to specify a nonempty string in
+ order to avoid use of pkg-config, for
+ example, ICU_CFLAGS=' '.)
 

   
diff --git a/src/backend/commands/collationcmds.c 
b/src/backend/commands/collationcmds.c
index b6c14c920d..a7b5871466 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -718,7 +718,17 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 
/*
 * Add keyword variants
+*
+* In ICU 4.2, ucol_getKeywordsForLocale() sometimes 
returns
+* values that will not be accepted by 
uloc_toLanguageTag().  Skip
+* loading keyword variants in that version.  (Both
+* ucol_getKeywordValuesForLocale() and 
uloc_toLanguageTag() are
+* new in ICU 4.2, so older versions are not supported 
at all.)
+*
+* XXX We have no information about ICU 4.3 through 
4.7, but we
+* know the below works with 4.8.
 */
+#if U_ICU_VERSION_MAJOR_NUM > 4 || (U_ICU_VERSION_MAJOR_NUM == 4 && 
U_ICU_VERSION_MINOR_NUM > 2)
status = U_ZERO_ERROR;
en = ucol_getKeywordValuesForLocale("collation", name, 
TRUE, );
if (U_FAILURE(status))
@@ -765,6 +775,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
(errmsg("could not get keyword 
values for locale \"%s\": %s",
name, 
u_errorName(status;
uenum_close(en);
+#endif /* ICU >4.2 */
}
}
 #endif /* USE_ICU */
-- 
2.13.3


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Not able to create collation on Windows

2017-08-01 Thread Tom Lane
Murtuza Zabuawala  writes:
> I am trying to create collation on windows using default POSIX collation
> with pgAdmin3 but I am getting error as shown in screenshot, Can someone
> suggest how to fix this?

> *Syntax:*
> CREATE COLLATION public.test from pg_catalog."POSIX";

> *Error:*
> ERROR: could not create locale "POSIX". No error

Hmm.  Evidently Windows' _create_locale() doesn't accept "POSIX".
You might find that "C" works instead, don't know for sure.

I think this is actually a bug, because the collations code clearly
means to allow clones of the C/POSIX locales --- see eg lc_collate_is_c,
which could be noticeably simpler if that case weren't contemplated.
However, DefineCollation checks validity of the new collation by
unconditionally calling pg_newlocale_from_collation().  That violates
the advice in pg_newlocale_from_collation's header comment:

 * Also, callers should avoid calling this before going down a C/POSIX
 * fastpath, because such a fastpath should work even on platforms without
 * locale_t support in the C library.

Every other call site honors that.

So I think what we ought to do is change DefineCollation more or
less like this:

-   (void) pg_newlocale_from_collation(newoid);
+   if (!lc_collate_is_c(newoid) || !lc_ctype_is_c(newoid))
+   (void) pg_newlocale_from_collation(newoid);

Another issue exposed by this report is that we aren't reporting
_create_locale() failures in a useful way.  It's possible this
could be improved by inserting

#ifdef WIN32
_dosmaperr(GetLastError());
#endif

into report_newlocale_failure in pg_locale.c, but I'm not really
sure.  Microsoft's man page for _create_locale() fails to say much
of anything about its error-case behavior, and definitely does not
say that it sets the GetLastError indicator.  Still, there's certainly
no chance that printing errno without doing this will be useful.
I would suggest that we do that for starters, and if we hear that
we're still getting silly errors, just hot-wire the code to assume
ENOENT on Windows.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to run PG TAP tests on windows?

2017-08-01 Thread Michael Paquier
On Tue, Aug 1, 2017 at 10:24 AM, Abbas Butt  wrote:
> Can anyone point to a tutorial or a list of steps required to run PG TAP
> tests on Windows?

Only MSVC has a special handling:
https://www.postgresql.org/docs/devel/static/install-windows-full.html#idm46046082578368
Using vcregress.bat, you are looking mainly for the subcommands
bincheck and recoverycheck.
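
For example, after building, something like this from the build
directory (per the MSVC build docs):

vcregress bincheck
vcregress recoverycheck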
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Remove 1MB size limit in tsvector

2017-08-01 Thread Ildus Kurbangaliev
Hello, hackers!

Historically the tsvector type can't hold more than 1MB of data.
I want to propose a patch that removes that limit.

That limit comes from the 'pos' field of WordEntry, which has only
20 bits of storage.

In the proposed patch I removed this field; instead, I keep offsets
only at every Nth item in the WordEntry array. Currently N is 4,
because that gave the best results in my benchmarks. It can be
increased in the future without affecting data already stored in the
database. Removing the field also improves compression of tsvectors.
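
For context, the current on-disk entry is a packed bitfield, and the
limit falls out of the 20-bit offset (2^20 bytes = 1MB):

/* existing layout, from src/include/tsearch/ts_type.h */
typedef struct
{
	uint32
		haspos:1,
		len:11,			/* lexeme length, max 2KB */
		pos:20;			/* offset into the string area, max 1MB */
} WordEntry;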

I simplified the code by creating functions that can be used to
build tsvectors; there were duplicated code fragments in the places
where tsvectors were built.

The new patch also frees some space in WordEntry that can be used to
store additional information about the saved words.

-- 
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/tsearch/Makefile b/src/backend/tsearch/Makefile
index 34fe4c5b3c..9585a25003 100644
--- a/src/backend/tsearch/Makefile
+++ b/src/backend/tsearch/Makefile
@@ -26,7 +26,7 @@ DICTFILES_PATH=$(addprefix dicts/,$(DICTFILES))
 OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
 	dict_simple.o dict_synonym.o dict_thesaurus.o \
 	dict_ispell.o regis.o spell.o \
-	to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o
+	to_tsany.o ts_selfuncs.o ts_typanalyze.o ts_utils.o ts_compat.o
 
 include $(top_srcdir)/src/backend/common.mk
 
diff --git a/src/backend/tsearch/to_tsany.c b/src/backend/tsearch/to_tsany.c
index 35d9ab276c..d66b69baea 100644
--- a/src/backend/tsearch/to_tsany.c
+++ b/src/backend/tsearch/to_tsany.c
@@ -156,13 +156,10 @@ TSVector
 make_tsvector(ParsedText *prs)
 {
 	int			i,
-j,
 lenstr = 0,
-totallen;
+totallen,
+stroff = 0;
 	TSVector	in;
-	WordEntry  *ptr;
-	char	   *str;
-	int			stroff;
 
 	/* Merge duplicate words */
 	if (prs->curwords > 0)
@@ -171,12 +168,8 @@ make_tsvector(ParsedText *prs)
 	/* Determine space needed */
 	for (i = 0; i < prs->curwords; i++)
 	{
-		lenstr += prs->words[i].len;
-		if (prs->words[i].alen)
-		{
-			lenstr = SHORTALIGN(lenstr);
-			lenstr += sizeof(uint16) + prs->words[i].pos.apos[0] * sizeof(WordEntryPos);
-		}
+		int npos = prs->words[i].alen ? prs->words[i].pos.apos[0] : 0;
+		INCRSIZE(lenstr, i, prs->words[i].len, npos);
 	}
 
 	if (lenstr > MAXSTRPOS)
@@ -187,41 +180,21 @@ make_tsvector(ParsedText *prs)
 	totallen = CALCDATASIZE(prs->curwords, lenstr);
 	in = (TSVector) palloc0(totallen);
 	SET_VARSIZE(in, totallen);
-	in->size = prs->curwords;
+	TS_SETCOUNT(in, prs->curwords);
 
-	ptr = ARRPTR(in);
-	str = STRPTR(in);
-	stroff = 0;
 	for (i = 0; i < prs->curwords; i++)
 	{
-		ptr->len = prs->words[i].len;
-		ptr->pos = stroff;
-		memcpy(str + stroff, prs->words[i].word, prs->words[i].len);
-		stroff += prs->words[i].len;
-		pfree(prs->words[i].word);
+		int npos = 0;
 		if (prs->words[i].alen)
-		{
-			int			k = prs->words[i].pos.apos[0];
-			WordEntryPos *wptr;
+			npos = prs->words[i].pos.apos[0];
 
-			if (k > 0x)
-elog(ERROR, "positions array too long");
+		tsvector_addlexeme(in, i, , prs->words[i].word, prs->words[i].len,
+			prs->words[i].pos.apos + 1, npos);
 
-			ptr->haspos = 1;
-			stroff = SHORTALIGN(stroff);
-			*(uint16 *) (str + stroff) = (uint16) k;
-			wptr = POSDATAPTR(in, ptr);
-			for (j = 0; j < k; j++)
-			{
-WEP_SETWEIGHT(wptr[j], 0);
-WEP_SETPOS(wptr[j], prs->words[i].pos.apos[j + 1]);
-			}
-			stroff += sizeof(uint16) + k * sizeof(WordEntryPos);
+		pfree(prs->words[i].word);
+		if (prs->words[i].alen)
 			pfree(prs->words[i].pos.apos);
-		}
-		else
-			ptr->haspos = 0;
-		ptr++;
+
 	}
 
 	if (prs->words)
@@ -251,7 +224,6 @@ to_tsvector_byid(PG_FUNCTION_ARGS)
 	PG_FREE_IF_COPY(in, 1);
 
 	out = make_tsvector();
-
 	PG_RETURN_TSVECTOR(out);
 }
 
diff --git a/src/backend/tsearch/ts_compat.c b/src/backend/tsearch/ts_compat.c
new file mode 100644
index 00..bb2a62eaf7
--- /dev/null
+++ b/src/backend/tsearch/ts_compat.c
@@ -0,0 +1,83 @@
+#include "postgres.h"
+#include "tsearch/ts_type.h"
+
+/*
+ * Definition of old WordEntry struct in TSVector. Because of limitations
+ * in size (max 1MB for lexemes), the format has changed
+ */
+typedef struct
+{
+	uint32
+haspos:1,
+len:11,
+pos:20;
+} OldWordEntry;
+
+typedef struct
+{
+	uint16		npos;
+	WordEntryPos pos[FLEXIBLE_ARRAY_MEMBER];
+} OldWordEntryPosVector;
+
+#define OLDSTRPTR(x)	( (char *) &(x)->entries[x->size_] )
+#define _OLDPOSVECPTR(x, e)	\
+	((OldWordEntryPosVector *)(STRPTR(x) + SHORTALIGN((e)->pos + (e)->len)))
+#define OLDPOSDATALEN(x,e) ( ( (e)->haspos ) ? (_OLDPOSVECPTR(x,e)->npos) : 0 )
+#define OLDPOSDATAPTR(x,e) (_OLDPOSVECPTR(x,e)->pos)
+
+/*
+ * Converts tsvector with the old structure to current.
+ * @orig - tsvector to convert,
+ * @copy - return copy of tsvector, it has a meaning when tsvector doesn't
+ * need to be converted.
+ */
+TSVector
+tsvector_upgrade(Datum orig, bool 

Re: [HACKERS] PostgreSQL 10 (latest beta) and older ICU

2017-08-01 Thread Tom Lane
Victor Wagner  writes:
> Peter Eisentraut  wrote:
>> I don't know that we can already reach that conclusion.  Maybe the

> Because it was fixed in subsequent versions.

> And 4.2 is first version where this function appeared.
> So, we still have problems with SLES11 which use even older version and
> have to be supported at least two more years.

I think the answer is very clear: we do not need to support old broken
versions of ICU, especially not in our first release.  We'll have enough
to do making the code stable and performant with good versions of ICU.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-08-01 Thread Stas Kelvich

> On 31 Jul 2017, at 20:03, Robert Haas  wrote:
> 
> Regardless of whether we share XIDs or DXIDs, we need a more complex
> concept of transaction state than we have now.

It seems the discussion has shifted from 2PC itself to general issues
with distributed transactions, so it is probably appropriate to share a
summary of the things we have done in the area of distributed
visibility. During the last two years we tried three quite different
approaches and finally settled on Clock-SI.

First, to test the different approaches we wrote a small patch that
wraps calls to the visibility-related functions (SetTransactionStatus,
GetSnapshot, etc.; described in detail at the wiki [1]) so that an
extension can override them. Such an approach allows implementing
almost anything related to distributed visibility, since you have full
control over how local visibility is done. That API isn't a hard
prerequisite, and if one wants to create some concrete implementation
it can be done in place. However, I think it is good to have such an
API in some form.
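
Schematically, the API is a table of function pointers that the core
consults and an extension can swap out; the member names below are
illustrative, the real set is documented on the wiki page [1]:

/* Illustrative sketch of a pluggable transaction-manager API. */
typedef struct TransactionManager
{
	Snapshot	(*GetSnapshot) (Snapshot snapshot);
	void		(*SetTransactionStatus) (TransactionId xid, XidStatus status);
	bool		(*IsInProgress) (TransactionId xid);
} TransactionManager;

extern TransactionManager *TM;	/* core routes visibility calls here */
extern Snapshot MyDistributedGetSnapshot(Snapshot snapshot);

/* An extension installs its own callbacks, keeping the built-in
 * behaviour for whatever it does not override. */
static TransactionManager my_tm;

void
MyExtensionInit(void)
{
	my_tm = *TM;				/* start from the default callbacks */
	my_tm.GetSnapshot = MyDistributedGetSnapshot;
	TM = &my_tm;
}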

The three approaches we tried:

1) Postgres-XL-like:

That is the most straightforward way. Basically we need a separate
network service (GTM/DTM) that is responsible for xid generation and
for managing the running list of transactions, so acquiring an xid or
a snapshot is done by network calls. Because of the shared xid space
it is possible to compare xids in the ordinary way and get the right
order. The gap between the non-simultaneous commits of a 2PC is
covered by the fact that we get our snapshots from the GTM, and it
removes an xid from the running list only when the transaction has
committed on both nodes.

Such an approach is okay for OLAP-style transactions where the tps
isn't high, but under an OLTP-grade transaction rate the GTM
immediately becomes a bottleneck, since even write transactions need
to get their snapshot from the GTM, even if they access only one node.


2) Incremental SI [2]

An approach with a central coordinator that can allow local reads
without network communication by slightly altering the visibility
rules.

Besides the fact that it is apparently patented, we also failed to
achieve proper visibility by implementing the algorithms from that
paper: it always showed some inconsistencies. Maybe because of bugs in
our implementation, maybe because of typos/mistakes in the algorithm
description itself. The reasoning in the paper wasn't very clear to
us, and there are the patent issues, so we just dropped it.


3) Clock-SI [3]

It is an MS Research paper that describes an algorithm similar to the ones
used in Spanner and CockroachDB, without a central GTM and with reads that
do not require a network roundtrip.

There are two ideas behind it:

* Assuming snapshot isolation where visibility on a node is based on CSNs,
use the local time as the CSN; then, when doing 2PC, collect the prepare
time from all participating nodes and commit the transaction everywhere
with the maximum of those times. If during a read a node sees tuples
committed by a transaction with a CSN greater than its snapshot CSN (which
can happen due to time desynchronisation between nodes), it just waits
until that time comes. So time desynchronisation can affect performance,
but it can't affect correctness.

* During a distributed commit the transaction is neither running (if it
commits, the tuples should already be visible) nor committed/aborted (it can
still be aborted, so it is illegal to read). So an IN-DOUBT transaction
state appears, during which readers must wait for writers.

We managed to implement that using the XTM API mentioned above. The
XID<->CSN mapping is maintained by the extension itself. Speed/scalability
are also good.
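
For illustration, the read-side rule from the first idea can be sketched
like this (pseudo-C; the CSN typedef and GetLocalClockAsCSN are hypothetical
names, only a sketch, not our actual code):

typedef uint64 CSN;             /* derived from the local physical clock */

static void
ClockSIBeginRead(CSN snapshot_csn)
{
    /*
     * If this node's clock lags behind the snapshot timestamp (clock
     * skew), delay the read until the local clock catches up: skew costs
     * latency, never correctness.
     */
    while (GetLocalClockAsCSN() < snapshot_csn)
        pg_usleep(1000L);
}

static bool
ClockSITupleVisible(CSN tuple_commit_csn, CSN snapshot_csn)
{
    /* the plain snapshot-isolation rule, once any wait has happened */
    return tuple_commit_csn <= snapshot_csn;
}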

I want to resubmit the implementation of that algorithm for FDWs later in
August, along with some isolation tests based on the set of queries in [4].


[1] https://wiki.postgresql.org/wiki/DTM#eXtensible_Transaction_Manager_API
[2] http://pi3.informatik.uni-mannheim.de/~norman/dsi_jour_2014.pdf
[3] https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/samehe-clocksi.srds2013.pdf
[4] https://github.com/ept/hermitage


Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AlterUserStmt anmd RoleSpec rules in grammar.y

2017-08-01 Thread Pavel Golub
Hello, Robert.

Sorry if I was rough; my English is not excellent. My point is that I was
trying to distinguish the behavior of the EDB installer from that of PG
built from source.

And the result is that EDB executes ALTER USER, and I don't know why.


You wrote:

RH> On Thu, Jul 27, 2017 at 2:52 AM, Pavel Golub  wrote:
>> One  more  notice.  ALTER  USER  ALL  works  in  EnterpriseDB 10beta2
>> installer. That's weird. I thought EnterpriseDB uses official sources.

RH> I find it really hard to believe that we're doing anything else.  It
RH> wouldn't make any sense to patch the PostgreSQL source code and then
RH> release the installers as PostgreSQL installers.  And if we *were*
RH> going to do that, wouldn't we patch something more interesting than
RH> the ALTER USER command?  I don't know what's going on here but I have
RH> a feeling that EnterpriseDB secretly maintaining patch sets that we
RH> inject into our PostgreSQL installers is not that thing.

RH> Adding a few EDB people.




-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 (latest beta) and older ICU

2017-08-01 Thread Victor Wagner
On Tue, 1 Aug 2017 08:16:54 -0400
Peter Eisentraut  wrote:

> On 8/1/17 02:12, Victor Wagner wrote:
> >> We are only calling uloc_toLanguageTag() with keyword/value
> >> combinations that ICU itself previously told us were supported.  So
> >> just ignoring errors doesn't seem proper in this case.
> >>  
> > We know that this version of ICU is broken. But what choice we
> > have?  
> 
> I don't know that we can already reach that conclusion.  Maybe the

Because it was fixed in subsequent versions.

And 4.2 is the first version where this function appeared.
So, we still have problems with SLES11, which uses an even older version
and has to be supported for at least two more years.


> APIs have changed or we are not using them correctly.  Are there any
> bug reports or changelog entries related to this?  I don't think we
> should just give up and ignore errors.

We could also provide a configuration option or a command-line option for
initdb that would restrict the list of languages for which collations are
created. This would help for all but two languages.
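
Something along these lines, say (the option name here is purely
hypothetical; nothing like it exists today):

initdb -D /path/to/data --icu-collation-languages='en,de,ru'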





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 (latest beta) and older ICU

2017-08-01 Thread Peter Eisentraut
On 8/1/17 02:12, Victor Wagner wrote:
>> We are only calling uloc_toLanguageTag() with keyword/value
>> combinations that ICU itself previously told us were supported.  So
>> just ignoring errors doesn't seem proper in this case.
>>
> We know that this version of ICU is broken. But what choice we have?

I don't know that we can already reach that conclusion.  Maybe the APIs
have changed or we are not using them correctly.  Are there any bug
reports or changelog entries related to this?  I don't think we should
just give up and ignore errors.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-08-01 Thread Masahiko Sawada
On Tue, Aug 1, 2017 at 1:40 AM, Robert Haas  wrote:
> On Thu, Jul 27, 2017 at 8:25 AM, Ashutosh Bapat
>  wrote:
>> The remote transaction can be committed/aborted only after the fate of
>> the local transaction is decided. If we commit remote transaction and
>> abort local transaction, that's not good. AtEOXact* functions are
>> called immediately after that decision in post-commit/abort phase. So,
>> if we want to commit/abort the remote transaction immediately it has
>> to be done in post-commit/abort processing. Instead if we delegate
>> that to the remote transaction resolved backend (introduced by the
>> patches) the delay between local commit and remote commits depends
>> upon when the resolve gets a chance to run and process those
>> transactions. One could argue that that delay would anyway exist when
>> post-commit/abort processing fails to resolve remote transaction. But
>> given the real high availability these days, in most of the cases
>> remote transaction will be resolved in the post-commit/abort phase. I
>> think we should optimize for most common case. Your concern is still
>> valid, that we shouldn't raise an error or do anything critical in
>> post-commit/abort phase. So we should device a way to send
>> COMMIT/ABORT prepared messages to the remote server in asynchronous
>> fashion carefully avoiding errors. Recent changes to 2PC have improved
>> performance in that area to a great extent. Relying on resolver
>> backend to resolve remote transactions would erode that performance
>> gain.
>
> I think there are two separate but interconnected issues here.  One is
> that if we give the user a new command prompt without resolving the
> remote transaction, then they might run a new query that sees their
> own work as committed, which would be bad.  Or, they might commit,
> wait for the acknowledgement, and then tell some other session to go
> look at the data, and find it not there.  That would also be bad.  I
> think the solution is likely to do something like what we did for
> synchronous replication in commit
> 9a56dc3389b9470031e9ef8e45c95a680982e01a -- wait for the remove
> transaction to be resolved (by the background process) but allow an
> interrupt to escape the wait-loop.
>
> The second issue is that having the resolver resolve transactions
> might be slower than doing it in the foreground.  I don't necessarily
> see a reason why that should be a big problem.  I mean, the resolver
> might need to establish a separate connection, but if it keeps that
> connection open for a while (say, 5 minutes) in case further
> transactions arrive then it won't be an issue except on really
> low-volume system which isn't really a case I think we need to worry
> about very much.  Also, the hand-off to the resolver might take some
> time, but that's equally true for sync rep and we're living with it
> there.  Anything else is presumably just the resolver itself being
> inefficient which seems like something that can simply be fixed.

I think using a solution similar to sync rep to wait for the transaction to
be resolved is a good way. One concern I have is that if we have one
resolver process per backend process, switching connections between
participant nodes would be an overhead. In the current implementation the
backend process uses cached connections to the remote server. On the other
hand, if we have one resolver process per database on the remote server,
the backend process has to communicate with multiple resolver processes.

> FWIW, I don't think the present resolver implementation is likely to
> be what we want.  IIRC, it's just calling an SQL function which
> doesn't seem like a good approach.  Ideally we should stick an entry
> into a shared memory queue and then ping the resolver via SetLatch,
> and it can directly invoke an FDW method on the data from the shared
> memory queue.  It should be possible to set things up so that a user
> who wishes to do so can run multiple copies of the resolver thread at
> the same time, which would be a good way to keep latency down if the
> system is very busy with distributed transactions.
>

In the current implementation the resolver process exists for resolving
in-doubt transactions. That process periodically checks whether there is an
unresolved transaction in shared memory and tries to resolve it according
to the commit log. If we change it so that the backend process can
communicate with the resolver process via SetLatch, the resolver process
had better be implemented in core rather than as a contrib module.
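
As a strawman, the enqueue-then-ping shape Robert describes might look
roughly like this (every name below is hypothetical; it is only a sketch,
not the patch's actual code):

#include "postgres.h"
#include "storage/latch.h"
#include "storage/spin.h"

#define RESOLVE_QUEUE_SIZE 64

typedef struct FdwXactQueueEntry
{
    TransactionId xid;          /* local transaction to resolve */
    Oid         serverid;       /* foreign server involved */
} FdwXactQueueEntry;

typedef struct FdwXactQueue
{
    slock_t     mutex;
    int         head;           /* next free slot */
    FdwXactQueueEntry entries[RESOLVE_QUEUE_SIZE];
    Latch      *resolver_latch; /* owned by the resolver process */
} FdwXactQueue;

static void
EnqueueResolveRequest(FdwXactQueue *q, TransactionId xid, Oid serverid)
{
    SpinLockAcquire(&q->mutex);
    q->entries[q->head % RESOLVE_QUEUE_SIZE].xid = xid;
    q->entries[q->head % RESOLVE_QUEUE_SIZE].serverid = serverid;
    q->head++;
    SpinLockRelease(&q->mutex);

    SetLatch(q->resolver_latch);    /* wake the resolver immediately */
}

The resolver side would then WaitLatch(), ResetLatch(), drain the queue,
and invoke the FDW resolution callback for each entry.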

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Red-Black tree traversal tests

2017-08-01 Thread Aleksander Alekseev
Hi Victor,

> If it's not too much trouble, perhaps you could write a few more tests so
> we would have 100% test coverage for rbtree, could modify it safely, and
> be sure that it actually works when someone needs the rest of its
> functionality?

Also, I would recommend adding your patch to the nearest commitfest [1].
Otherwise there is a good chance that everyone will forget about it
quite soon.

[1]: https://commitfest.postgresql.org/14/

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] Red-Black tree traversal tests

2017-08-01 Thread Aleksander Alekseev
Hi Victor,

> Postgres now has its own red-black tree implementation. This tree has 4
> types of traversals. In the attachment, you can find a module test that
> checks the correctness of the tree traversal strategies.
> 
> I hope that someone can find it useful.

Great job! However, according to the lcov report, some procedures declared
in rbtree.c are still not covered by tests even with your patch,
particularly:

* rb_find
* rb_leftmost
* rb_delete + dependencies (rb_delete_fixup, etc)

You can generate a corresponding report using this script [1].
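
For completeness, the in-tree way to get the same kind of report without
any external script is roughly:

./configure --enable-coverage
make
make check
make coverage-html    # the HTML report ends up in the coverage/ directory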

I must say, I was a little surprised that rb_find is not used anywhere in
the PostgreSQL code. It turned out that rbtree is currently used only by
GIN, which uses a small subset of all the procedures.

If it's not too much trouble, perhaps you could write a few more tests so
we would have 100% test coverage for rbtree, could modify it safely, and
be sure that it actually works when someone needs the rest of its
functionality?

[1]: https://github.com/afiskon/pgscripts/blob/master/code-coverage.sh


-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] map_partition_varattnos() and whole-row vars

2017-08-01 Thread Etsuro Fujita

On 2017/07/31 18:56, Amit Langote wrote:

On 2017/07/28 20:46, Amit Khandekar wrote:

create table foo (a int, b text) partition by list (a);
create table foo1 partition of foo for values in (1);
create table foo2(b text, a int) ;
alter table foo attach partition foo2 for values in (2);

postgres=# insert into foo values (1, 'hi there');
INSERT 0 1
postgres=# insert into foo values (2, 'bi there');
INSERT 0 1
postgres=# insert into foo values (2, 'error there') returning foo;
ERROR:  table row type and query-specified row type do not match
DETAIL:  Table has type text at ordinal position 1, but query expects integer.



This is because the composite-type tuple descriptor of the leaf partition
doesn't match the RETURNING slot's tuple descriptor.

We probably need to do what
inheritance_planner() => adjust_appendrel_attrs() does for adjusting the
attrs in the child rel's parse tree; i.e., handle some specific nodes other
than just simple Var nodes. In adjust_appendrel_attrs_mutator(), for a
whole-row node, it updates var->vartype to the child rel's.


Yes, that's what's needed here.  So we need to teach
map_variable_attnos_mutator() to convert whole-row vars just like it's
done in adjust_appendrel_attrs_mutator().


Seems reasonable.  (Though I think it might be better to do this kind of 
conversion in the planner, not the executor, because that would increase 
the efficiency of cached plans.)



I suspect that the other nodes that adjust_appendrel_attrs_mutator
handles (like JoinExpr, CurrentOfExpr, etc ) may also require similar
adjustments for our case, because WithCheckOptions (and I think even
RETURNING) can have a subquery.


Actually, the WITH CHECK and RETURNING expressions that are processed in
the executor (i.e., in ExecInitModifyTable) are finished plan tree
expressions (not parse or rewritten parse tree expressions), so we do not
have to worry about converting those things in this case.

Remember that adjust_appendrel_attrs_mutator() has to handle raw Query
trees, because we plan the whole query separately for each partition in
the UPDATE and DELETE (inheritance_planner).  Since we don't need to plan
an INSERT query for each partition separately (at least without the
foreign tuple-routing support), we need not worry about converting
anything beside Vars (with proper support for converting whole-row ones
which you discovered has been missing), which we can do within the
executor on the finished plan tree expressions.


Yeah, sublinks in WITH CHECK OPTION and RETURNING would already have 
been converted to subplans by the planner, so we don't need to worry 
about recursing into subqueries in sublinks at the execution time.



Attached 2 patches:

0001: Addresses the bug that Rajkumar reported (handling whole-row vars in
   WITH CHECK and RETURNING expressions at all)

0002: Addressed the bug that Amit reported (converting whole-row vars
   that could occur in WITH CHECK and RETURNING expressions)


I took a quick look at the patches.  One thing I noticed is this:

 map_variable_attnos(Node *node,
					 int target_varno, int sublevels_up,
					 const AttrNumber *attno_map, int map_length,
+					 Oid from_rowtype, Oid to_rowtype,
					 bool *found_whole_row)
 {
	map_variable_attnos_context context;
@@ -1291,6 +1318,8 @@ map_variable_attnos(Node *node,
	context.sublevels_up = sublevels_up;
	context.attno_map = attno_map;
	context.map_length = map_length;
+	context.from_rowtype = from_rowtype;
+	context.to_rowtype = to_rowtype;
	context.found_whole_row = found_whole_row;
You added two arguments to pass to map_variable_attnos(): from_rowtype 
and to_rowtype.  But I'm not sure that we really need the from_rowtype 
argument because it's possible to get the Oid from the vartype of target 
whole-row Vars.  And in this:


+					/*
+					 * If the caller expects us to convert it, do so if
+					 * necessary.
+					 */
+					if (OidIsValid(context->to_rowtype) &&
+						OidIsValid(context->from_rowtype) &&
+						context->to_rowtype != context->from_rowtype)
+					{
+						ConvertRowtypeExpr *r = makeNode(ConvertRowtypeExpr);
+
+						r->arg = (Expr *) newvar;
+						r->resulttype = context->from_rowtype;
+						r->convertformat = COERCE_IMPLICIT_CAST;
+						r->location = -1;
+						/* Make sure the Var node has the right type ID, too */
+

[HACKERS] reload-through-the-top-parent switch the partition table

2017-08-01 Thread Rushabh Lathia
https://www.postgresql.org/message-id/CA%2BTgmoZFn7TJ7QBsFatnuEE%3DGYGdZSNXqr9489n5JBsdy5rFfA%40mail.gmail.com

In the above thread, it was pointed out that a reload-through-the-top-parent
switch for partitioned tables is an important consideration. One small step
toward making use of hash functions was adding a switch to pg_dump that
reloads data through the top parent of a partitioned table.

The attached patch implements this:

- Added the switch reload-through-root (open to suggestions for the switch
  name)
- Fixed dumpTableData to COPY to the root table when the reload-through-root
  switch is given.
- Fixed dumpTableData_insert to generate the proper INSERT statement with
  the reload-through-root switch.
- Added the reload-through-root switch to pg_dumpall.

Testing:

- Performed tests with multi-level partitioning plus different schema
  combinations.
- Tested a case with parent and child attributes in a different order.


Attaching the pg_dump output for the test below with --reload-through-root,
for both COPY and INSERTs.

Testcase:

create schema a;
create schema b;
create schema c;

create table a.t1 ( a int, b int ) partition by list(a);
create table b.t1_p1 partition of a.t1 FOR VALUES in ( 1, 2, 3) partition
by list(b);
create table c.t1_p1_p1 partition of b.t1_p1 FOR VALUES in (10, 20 );
insert into a.t1 values ( 2 , 10 );
insert into a.t1 values ( 1 , 20 );
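
For reference, the attached dumps were produced along these lines (switch
spelling as proposed in this patch; --inserts is the existing pg_dump flag):

pg_dump --reload-through-root postgres > dump_copy.sql
pg_dump --reload-through-root --inserts postgres > dump_inserts.sql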

My colleague Robert and I had doubts about the ordering of TABLE and
TABLE_DATA entries. We thought earlier that reload-through-root might not
serve the purpose discussed in the above-mentioned thread. But later,
looking into the code, I realized that the sort orders for DO_TABLE and
DO_TABLE_DATA are different, so we don't need to worry about that issue.

TODOs:
- Update documentation for pg_dump & pg_dumpall

Thanks,
Rushabh Lathia
www.EnterpriseDB.com
--
-- PostgreSQL database dump
--

-- Dumped from database version 10beta2
-- Dumped by pg_dump version 10beta2

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: postgres; Type: COMMENT; Schema: -; Owner: rushabh
--

COMMENT ON DATABASE postgres IS 'default administrative connection database';


--
-- Name: a; Type: SCHEMA; Schema: -; Owner: rushabh
--

CREATE SCHEMA a;


ALTER SCHEMA a OWNER TO rushabh;

--
-- Name: b; Type: SCHEMA; Schema: -; Owner: rushabh
--

CREATE SCHEMA b;


ALTER SCHEMA b OWNER TO rushabh;

--
-- Name: c; Type: SCHEMA; Schema: -; Owner: rushabh
--

CREATE SCHEMA c;


ALTER SCHEMA c OWNER TO rushabh;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner: 
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner: 
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


SET search_path = a, pg_catalog;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: t1; Type: TABLE; Schema: a; Owner: rushabh
--

CREATE TABLE t1 (
a integer,
b integer
)
PARTITION BY LIST (a);


ALTER TABLE t1 OWNER TO rushabh;

SET search_path = b, pg_catalog;

--
-- Name: t1_p1; Type: TABLE; Schema: b; Owner: rushabh
--

CREATE TABLE t1_p1 PARTITION OF a.t1
FOR VALUES IN (1, 2, 3)
PARTITION BY LIST (b);


ALTER TABLE t1_p1 OWNER TO rushabh;

SET search_path = c, pg_catalog;

--
-- Name: t1_p1_p1; Type: TABLE; Schema: c; Owner: rushabh
--

CREATE TABLE t1_p1_p1 PARTITION OF b.t1_p1
FOR VALUES IN (10, 20);


ALTER TABLE t1_p1_p1 OWNER TO rushabh;

--
-- Data for Name: t1_p1_p1; Type: TABLE DATA; Schema: c; Owner: rushabh
--

INSERT INTO a.t1 VALUES (1, 20);
INSERT INTO a.t1 VALUES (2, 10);


--
-- PostgreSQL database dump complete
--

--
-- PostgreSQL database dump
--

-- Dumped from database version 10beta2
-- Dumped by pg_dump version 10beta2

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: postgres; Type: COMMENT; Schema: -; Owner: rushabh
--

COMMENT ON DATABASE postgres IS 'default administrative connection database';


--
-- Name: a; Type: SCHEMA; Schema: -; Owner: rushabh
--

CREATE SCHEMA a;


ALTER SCHEMA a OWNER TO rushabh;

--
-- Name: b; Type: SCHEMA; Schema: -; Owner: rushabh
--

CREATE SCHEMA b;


ALTER SCHEMA b OWNER TO rushabh;

--
-- Name: c; Type: SCHEMA; Schema: -; Owner: rushabh
--

CREATE SCHEMA c;


ALTER SCHEMA c OWNER TO rushabh;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner: 
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner: 
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


SET search_path = a, pg_catalog;

SET default_tablespace = '';


Re: [HACKERS] [PATCH] Pattern based listeners for asynchronous messaging (LISTEN/NOTIFY)

2017-08-01 Thread Markus Sintonen
The following does not work:
LISTEN 'foo%'
Neither this:
LISTEN SIMILAR TO "foo%"
This works:
LISTEN "foo%"
But it does not act as a pattern.

We could change the SIMILAR TO syntax to something like the following (also
accepting the type of the pattern), for example:
LISTEN PATTERN 'foo%' TYPE 'similar'
LISTEN PATTERN 'foo*' TYPE 'ltree'
Or
LISTEN PATTERN 'foo%' USING 'similar'
Or
LISTEN MATCH 'foo%' USING 'similar'
Or
LISTEN MATCH 'foo%' TYPE 'similar'

Documentation is coming up.

2017-07-31 23:30 GMT+03:00 Peter Eisentraut <
peter.eisentr...@2ndquadrant.com>:

> On 7/31/17 16:13, Markus Sintonen wrote:
> > This patch has no known backward-compatibility issues with the existing
> > LISTEN/UNLISTEN features. This is because the patch extends the existing
> > syntax by accepting quoted strings which define the patterns, as opposed
> > to the existing SQL literals.
>
> I don't see that in the patch.  Your patch changes the syntax LISTEN
> ColId to mean a regular expression.
>
> Even then, having LISTEN "foo" and LISTEN 'foo' mean different things
> would probably be confusing.
>
> I would think about specifying an operator somewhere in the syntax, like
> you have with LISTEN SIMILAR TO.  It would even be nice if a
> non-built-in operator could be used for matching names.
>
> Documentation is missing in the patch.
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] Incorrect comment of XLByteToSeg() and XLByteToPrevSeg()

2017-08-01 Thread Yugo Nagata
On Tue, 01 Aug 2017 08:11:23 +0900 (JST)
Tatsuo Ishii  wrote:

> > Thanks for the patch. Looks good to me. I will commit/push into all
> > supported branches if there's no objection.
> 
> Done.

Thanks!
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Yugo Nagata 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign table creation and NOT VALID check constraints

2017-08-01 Thread Amit Langote
On 2017/08/01 17:54, Simon Riggs wrote:
> On 1 August 2017 at 08:37, Amit Langote  wrote:
>> On 2017/08/01 15:22, Simon Riggs wrote:
>>> On 1 August 2017 at 07:16, Amit Langote  
>>> wrote:
 In f27a6b15e656 (9.6 & later), we decided to "Mark CHECK constraints
 declared NOT VALID valid if created with table."  In retrospect,
 constraints on foreign tables should have been excluded from consideration
 in that commit, because the thinking behind the aforementioned commit
 (that the constraint is trivially validated because the newly created
 table contains no data) does not equally apply to the foreign tables case.

 Should we do something about that?
>>>
>>> In what way does it not apply? Do you have a failure case?
>>
>> Sorry for not mentioning the details.
>>
>> I was thinking that a foreign table starts containing the data of the
>> remote object it points to the moment it's created (unlike local tables
>> which contain no data to begin with).  If a user is not sure whether a
>> particular constraint being created in the same command holds for the
>> remote data, they may mark it as NOT VALID and hope that the system treats
>> the constraint as such until such a time that they mark it valid by
>> running ALTER TABLE VALIDATE CONSTRAINT.  Since the planner is the only
>> consumer of pg_constraint.convalidated, that means the user expects the
>> planner to ignore such a constraint.  Since f27a6b15e656, users are no
>> longer able to expect so.
> 
> For Foreign Tables, it sounds like an issue. Don't think it exists for
> normal tables.

Yes.  I was saying in my first email that we should not disregard a user's
request to mark a constraint NOT VALID if the table in question is a
*foreign table*.

So, it's OK that commit f27a6b15e656 changed things to ignore NOT VALID if
it's in the following command:

create table foo (a int, constraint check_a check (a > 0) not valid);

But, not OK in the following command:

create foreign table ffoo (
  a int,
  constraint check_a check (a > 0) not valid
) server loopback options (table_name 'foo');

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-08-01 Thread Masahiko Sawada
On Tue, Aug 1, 2017 at 3:43 AM, Robert Haas  wrote:
> On Mon, Jul 31, 2017 at 1:27 PM, Alvaro Herrera
>  wrote:
>> Postgres-XL seems to manage this problem by using a transaction manager
>> node, which is in charge of assigning snapshots.  I don't know how that
>> works, but perhaps adding that concept here could be useful too.  One
>> critical point to that design is that the app connects not directly to
>> the underlying Postgres server but instead to some other node which is
>> or connects to the node that manages the snapshots.
>>
>> Maybe Michael can explain in better detail how it works, and/or how (and
>> if) it could be applied here.
>
> I suspect that if you've got a central coordinator server that is the
> jumping-off point for all distributed transactions, the Postgres-XL
> approach is hard to beat (at least in concept, not sure about the
> implementation).  That server is brokering all of the connections to
> the data nodes anyway, so it might as well tell them all what
> snapshots to use while it's there.  When you scale to multiple
> coordinators, though, it's less clear that it's the best approach.
> Now one coordinator has to be the GTM master, and that server is
> likely to become a bottleneck -- plus talking to it involves extra
> network hops for all the other coordinators. When you then move the
> needle a bit further and imagine a system where the idea of a
> coordinator doesn't even exist, and you've just got a loosely couple
> distributed system where distributed transactions might arrive on any
> node, all of which are also servicing local transactions, then it
> seems pretty likely that the Postgres-XL approach is not the best fit.
>
> We might want to support multiple models.  Which one to support first
> is a harder question.  The thing I like least about the Postgres-XC
> approach is it seems inevitable that, as Michael says, the central
> server handing out XIDs and snapshots is bound to become a bottleneck.
> That type of system implicitly constructs a total order of all
> distributed transactions, but we don't really need a total order.  If
> two transactions don't touch the same data and there's no overlapping
> transaction that can notice the commit order, then we could make those
> commit decisions independently on different nodes without caring which
> one "happens first".  The problem is that it might take so much
> bookkeeping to figure out whether that is in fact the case in a
> particular instance that it's even more expensive than having a
> central server that functions as a global bottleneck.
>
> It might be worth some study not only of Postgres-XL but also of other
> databases that claim to provide distributed transactional consistency
> across nodes.  I've found literature on this topic from time to time
> over the years, but I'm not sure what the best practices in this area
> actually are.

Yeah, it's worth studying other databases and considering an approach that
goes well with the PostgreSQL architecture. I've read some papers related
to distributed transaction management, but I'm also not sure what the best
practices in this area are. However, one trend I've seen is that some
cloud-native databases such as Google Spanner[1] and CockroachDB employ
techniques that use timestamps to determine visibility without centralized
coordination. Google Spanner uses GPS clocks and atomic clocks, but since
these are not common hardware, CockroachDB uses local timestamps with NTP
instead. Other transaction techniques using local timestamps have also been
discussed. For example, Clock-SI[2] derives snapshots and commit timestamps
from loosely synchronized physical clocks, though it doesn't support the
serializable isolation level. IIUC the postgrespro multi-master cluster
employs a technique based on that. I haven't read it deeply yet, but I
found a new paper[3] last week which introduces a new SI mechanism that
allows transactions to determine their timestamps autonomously, without
relying on centralized coordination. PostgreSQL uses XIDs to determine
visibility now, but mapping an XID to its timestamp using the commit
timestamp feature might allow PostgreSQL to use timestamps for that
purpose.
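
For what it's worth, that XID-to-timestamp mapping is already queryable
today when track_commit_timestamp is enabled; for example (the table name
is just a placeholder):

-- requires track_commit_timestamp = on (takes effect after a restart)
SELECT xmin, pg_xact_commit_timestamp(xmin) FROM some_table LIMIT 1;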

> https://en.wikipedia.org/wiki/Global_serializability
> claims that a technique called Commitment Ordering (CO) is teh
> awesome, but I've got my doubts about whether that's really an
> objective description of the state of the art.  One clue is that the
> global serialiazability article says three separate times that the
> technique has been widely misunderstood.  I'm not sure exactly which
> Wikipedia guideline that violates, but I think Wikipedia is supposed
> to summarize the views that exist on a topic in accordance with their
> prevalence, not take a position on which view is correct.
> https://en.wikipedia.org/wiki/Commitment_ordering contains citations
> from the papers only of one guy, Yoav Raz, which is another hint that
> this 

Re: [HACKERS] foreign table creation and NOT VALID check constraints

2017-08-01 Thread Simon Riggs
On 1 August 2017 at 08:37, Amit Langote  wrote:
> On 2017/08/01 15:22, Simon Riggs wrote:
>> On 1 August 2017 at 07:16, Amit Langote  
>> wrote:
>>> In f27a6b15e656 (9.6 & later), we decided to "Mark CHECK constraints
>>> declared NOT VALID valid if created with table."  In retrospect,
>>> constraints on foreign tables should have been excluded from consideration
>>> in that commit, because the thinking behind the aforementioned commit
>>> (that the constraint is trivially validated because the newly created
>>> table contains no data) does not equally apply to the foreign tables case.
>>>
>>> Should we do something about that?
>>
>> In what way does it not apply? Do you have a failure case?
>
> Sorry for not mentioning the details.
>
> I was thinking that a foreign table starts containing the data of the
> remote object it points to the moment it's created (unlike local tables
> which contain no data to begin with).  If a user is not sure whether a
> particular constraint being created in the same command holds for the
> remote data, they may mark it as NOT VALID and hope that the system treats
> the constraint as such until such a time that they mark it valid by
> running ALTER TABLE VALIDATE CONSTRAINT.  Since the planner is the only
> consumer of pg_constraint.convalidated, that means the user expects the
> planner to ignore such a constraint.  Since f27a6b15e656, users are no
> longer able to expect so.

For Foreign Tables, it sounds like an issue. Don't think it exists for
normal tables.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor comment update in partition.c

2017-08-01 Thread Dean Rasheed
On 31 July 2017 at 12:53, Beena Emerson  wrote:
> The commit d363d42bb9a4399a0207bd3b371c966e22e06bd3 changed
> RangeDatumContent *content to PartitionRangeDatumKind *kind but a
> comment on function partition_rbound_cmp was left unedited and it
> still mentions content1 instead of kind1.
>
> PFA the patch to fix this.
>

Yes you're right, thanks. Patch committed.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable storage

2017-08-01 Thread Haribabu Kommi
On Sun, Jul 23, 2017 at 4:10 PM, Amit Kapila 
wrote:

> On Wed, Jul 19, 2017 at 11:33 AM, Haribabu Kommi
>  wrote:
> >
> > I am finding out that eliminating the HeapTuple usage in the upper layers
> > needs some major changes. How about not changing anything in the upper
> > layers of storage currently, and just supporting pluggable tuples with
> > one of the following approaches for the first version?
> >
>
> It is not very clear to me how any of the below alternatives are
> better or worse as compare to the approach you are already working.
> Do you mean to say that we will get away without changing all the
> places which take HeapTuple as input or return it as output with some
> of the below approaches?
>

Yes. With the following approaches, my intention is to reduce the need to
change HeapTuple everywhere, supporting the Pluggable Storage API with
minimal changes first and improving further later. But unless we change
HeapTuple everywhere properly, we may not see the benefits of pluggable
storage.



> > 1. Design an API that returns values/nulls array and convert that into a
> > HeapTuple whenever it is required in the upper layers. All the existing
> > heap form/deform tuples are used for every tuple with some adjustments.
> >
>
> So, this would have the additional cost of form/deform.  Also, how
> would it have lesser changes as compare to what you have described
> earlier?
>

Yes, it has the additional cost of form/deform. It is the same approach as
described earlier, but with the intention of eventually modifying every
place where HeapTuple is accessed. While prototyping the removal of
HeapTuple usage from triggers, though, I realized that this needs a clear
design of its own, rather than being combined with the pluggable storage
API changes.

- The heap_getnext function is kept as-is, and it is used only for system
  table access.
- A heap_getnext_slot function is introduced that returns a slot whenever
  data is found, and NULL otherwise. This function is used everywhere else,
  e.g., in the executor.

- The TupleTableSlot structure is modified to contain a void * tuple
  instead of a HeapTuple, and it also contains the storage handler
  functions.
- heap_insert etc. can take a slot as an argument and perform the insert
  operation.

In the cases where a TupleTableSlot cannot be passed, form a HeapTuple from
the data and pass that, also noting down that it is HeapTuple data, not
the tuple from the storage.
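
A rough sketch of the signatures being described, purely to make the
proposal concrete (these are hypothetical, not actual code):

/* scan API: returns the slot holding the next tuple, or NULL at the end */
extern TupleTableSlot *heap_getnext_slot(HeapScanDesc scan,
                                         ScanDirection direction,
                                         TupleTableSlot *slot);

/* insert API: inserts whatever tuple the slot currently stores */
extern Oid heap_insert_slot(Relation relation, TupleTableSlot *slot,
                            CommandId cid, int options);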

> 2. Design an API that returns StorageTuple(void *) with first member
> > represents the TYPE of the storage, so that corresponding registered
> > function calls can be called to deform/form the tuple whenever there is
> > a need of tuple.
> >
>
> Do you intend to say that we store such information in disk tuple or
> only in the in-memory version of same?


Only in the in-memory version.


>   Also, what makes you think
> that we would need hooks only for form and deform?  Right now, in many
> cases tuple will directly point to disk page and we deal with it by
> retaining the pin on the corresponding buffer, what if some kinds of
> tuple don't follow that rule?  For ex. to support in-place updates, we
> might always need a separate copy of tuple rather than the one
> pointing to disk page.
>

In all of the approaches, except for system tables, we are going to remove
direct disk access to the tuple, whether by replacing the tuple with a slot
or by some other means. Otherwise it will be difficult to support, and it
also doesn't provide much performance benefit.

All the storage handler functions need to be maintained separately and
accessed using the handler ID.

The heap_getnext function returns a StorageTuple and not a HeapTuple. The
StorageTuple can be mapped to a HeapTuple or another tuple type based on
the type of its first member.

heap_insert etc. take a StorageTuple as input and internally decide, based
on the type of the tuple, how to perform the insert operation.

Instead of storing the handler functions inside a relation/slot, the
function pointers can be accessed directly based on the storage type data.

This works in every case where the tuple is accessed, but the problem is
that it may need changes wherever the tuple is accessed.
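
Purely as illustration, a tagged in-memory representation of the kind
described might look like this (all names hypothetical):

typedef enum StorageTupleType
{
    ST_HEAP,                    /* classic HeapTuple layout */
    ST_OTHER                    /* some pluggable format */
} StorageTupleType;

typedef struct StorageTupleHeader
{
    StorageTupleType type;      /* first member: identifies the format */
    /* format-specific fields follow */
} StorageTupleHeader;

typedef struct StorageHandler StorageHandler;   /* per-format callbacks */
extern const StorageHandler *storage_handlers[];

/* dispatch on the tag to find the right set of handler functions */
static const StorageHandler *
GetStorageHandler(void *tuple)
{
    StorageTupleType t = ((StorageTupleHeader *) tuple)->type;

    return storage_handlers[t];
}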



> > 3. Design an API that returns StorageTuple(void *) but the necessary
> > format information of that tuple can be get from the tupledesc. wherever
> > the tuple is present, there exists a tupledesc in most of the cases. How
> > about adding some kind of information in tupledesc to find out the tuple
> > format and call the necessary functions
> >
>

The heap_getnext function returns a StorageTuple instead of a HeapTuple.
The tuple type information is available in the TupleDesc structure.

heap_insert etc. accept a TupleTableSlot as input and perform the insert
operation. This approach is almost the same as the first approach, except
that the storage handler functions are stored in 

[HACKERS] How to run PG TAP tests on windows?

2017-08-01 Thread Abbas Butt
Hi,

Can anyone point me to a tutorial or a list of steps required to run PG
TAP tests on Windows?

Regards

-- 
Abbas
Architect

Ph: 92.334.5100153
Skype ID: gabbasb
www.enterprisedb.com

Follow us on Twitter: @EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more



Re: [HACKERS] foreign table creation and NOT VALID check constraints

2017-08-01 Thread Amit Langote
On 2017/08/01 15:22, Simon Riggs wrote:
> On 1 August 2017 at 07:16, Amit Langote  wrote:
>> In f27a6b15e656 (9.6 & later), we decided to "Mark CHECK constraints
>> declared NOT VALID valid if created with table."  In retrospect,
>> constraints on foreign tables should have been excluded from consideration
>> in that commit, because the thinking behind the aforementioned commit
>> (that the constraint is trivially validated because the newly created
>> table contains no data) does not equally apply to the foreign tables case.
>>
>> Should we do something about that?
> 
> In what way does it not apply? Do you have a failure case?

Sorry for not mentioning the details.

I was thinking that a foreign table starts containing the data of the
remote object it points to the moment it's created (unlike local tables
which contain no data to begin with).  If a user is not sure whether a
particular constraint being created in the same command holds for the
remote data, they may mark it as NOT VALID and hope that the system treats
the constraint as such until such a time that they mark it valid by
running ALTER TABLE VALIDATE CONSTRAINT.  Since the planner is the only
consumer of pg_constraint.convalidated, that means the user expects the
planner to ignore such a constraint.  Since f27a6b15e656, users are no
longer able to expect so.

For example:

create extension postgres_fdw;

create server loopback
  foreign data wrapper postgres_fdw
  options (dbname 'postgres');

create table foo (a) as select 1;

create foreign table ffoo (
  a int,
  constraint check_a check (a = 0) not valid
) server loopback options (table_name 'foo');

set constraint_exclusion to on;

-- constraint check_a will exclude the table
select * from ffoo where a = 1;
 a
---
(0 rows)

explain select * from ffoo where a = 1;
QUERY PLAN
--
 Result  (cost=0.00..0.00 rows=0 width=4)
   One-Time Filter: false
(2 rows)

The above "issue" doesn't occur if the constraint is created
after-the-fact, whereby it's possible to specify NOT VALID and have the
system actually store it in the catalog as such.

alter foreign table ffoo drop constraint check_a;
alter foreign table ffoo add constraint check_a check (a = 0) not valid;

select * from ffoo where a = 1;
 a
---
 1
(1 row)

explain select * from ffoo where a = 1;
 QUERY PLAN
-
 Foreign Scan on ffoo  (cost=100.00..146.86 rows=15 width=4)


Maybe this is not such a big matter though, because the core system doesn't
actually enforce any constraints of foreign tables locally anyway (which is
a documented fact), but my concern is the possibility of someone
considering this a POLA violation.  The lack of complaints so far perhaps
means nobody has.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve the performance of the standby server when dropping tables on the primary server

2017-08-01 Thread Simon Riggs
On 1 August 2017 at 05:45, Tokuda, Takashi
 wrote:
> Hi,
>
> The attached patch changes data structure storing unowned SMgrRelation objects
> from list structure to hash structure.
> The reason why I change it is that list structure very slowly removes a node.
> And list structure takes longer time to remove a node than hash structure.
>
> The problem was reported in BUG #14575.
> https://www.postgresql.org/message-id/20170303023246.25054.66...@wrigleys.postgresql.org
>
> In my customer's case, the standby server was delayed more than 20 minutes
> when dropping many tables at once.
>
>
>  - Performance check

Interesting, thanks for the patch.

Couple of points regarding performance...

* The previous coding allowed for a fast path to access the last
unowned relation, which this patch removes. It seems a good idea to
cache the last unowned relation, or if not explain why the comment
that says why it worked that way is no longer true.

* We should only create the hash table when needed, i.e. on or after
when we add an unowned relation, since that is not a typical case.

* The hash table is sized at 400 elements and will grow from there.
The comments in dynahash say "An overly large nelem will penalize
hash_seq_search speed without buying much." so this makes your patch
suitable for the bulk case but likely to perform worse for fewer
elements. So I'm guessing that you picked 400 because that's what the
parameter is set to for the smgr relation table rather than because
this has had good consideration.
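
To illustrate the second point, lazy creation with dynahash could look
something like this (a sketch only; the entry layout and names are mine,
not the patch's):

#include "postgres.h"
#include "storage/smgr.h"
#include "utils/hsearch.h"

typedef struct UnownedRelEntry
{
    RelFileNodeBackend rnode;   /* hash key */
    SMgrRelation reln;
} UnownedRelEntry;

static HTAB *UnownedRelHash = NULL;

static void
AddUnownedRelation(SMgrRelation reln)
{
    UnownedRelEntry *entry;
    bool        found;

    if (UnownedRelHash == NULL)
    {
        HASHCTL     ctl;

        memset(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(RelFileNodeBackend);
        ctl.entrysize = sizeof(UnownedRelEntry);
        /* start small; a bulk DROP will grow the table as needed */
        UnownedRelHash = hash_create("unowned SMgrRelation hash", 16,
                                     &ctl, HASH_ELEM | HASH_BLOBS);
    }

    entry = hash_search(UnownedRelHash, &reln->smgr_rnode,
                        HASH_ENTER, &found);
    entry->reln = reln;
}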

I'll take your word for now that it improves the main case but I'd
suggest you consider the performance effects of the patch on other
cases that use this code.

Without looking deeper: does this code also run for temp objects? Can
it be optimized for that case a little better?

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update description of \d[S+] in \?

2017-08-01 Thread Amit Langote
On 2017/08/01 11:44, David G. Johnston wrote:
> On Mon, Jul 31, 2017 at 7:06 PM, Robert Haas  wrote:
> 
>> On Thu, Jul 13, 2017 at 8:40 PM, Amit Langote
>>  wrote:
>>> On 2017/07/13 19:57, Ashutosh Bapat wrote:
 On Thu, Jul 13, 2017 at 12:01 PM, Amit Langote
  wrote:
> The description of \d[S+] currently does not mention that it will list
> materialized views and foreign tables.  Attached fixes that.
>

 I guess the same change is applicable to the description of \d[S+] NAME
>> as well.
>>>
>>> Thanks for the review.  Fixed in the attached.
>>
>> The problem with this, IMV, is that it makes those lines more than 80
>> characters, whereas right now they are not.
> 
> 
> 84:   \\d[S+] list (foreign) tables, (materialized)
> views, and sequences\n
> 76:   \\d[S+] list (foreign) tables, (mat.) views, and
> sequences\n
> 
>   And that line seems
>> doomed to get even longer in the future.
>>
> 
> Cross that bridge when we come to it?
> 
> Lumping the tables and views into a single label (I'd go with "relations"
> since these are all - albeit non-exclusively - things that can appear in a
> FROM clause) would greatly aid things here.  Indexes and sequences would
> retain their own identities.  But I seem to recall that elsewhere we call
> indexes relations - and I'm not sure about sequences.
> 
> I'm partial to calling it "relations and sequences" and letting the reader
> check the documentation for what "relations" means in this context.

Hmm, that makes it short.

\d[S+] list relations and sequences
\d[S+]  NAME   describe relation, index, or sequence

But, quite a few error messages generated by the backend will still list
them with the current names that are based on relkind.  For example, here
is one:

alter table foo_a_seq rename last_value to what;

ERROR:  "foo_a_seq" is not a table, view, materialized view, composite
type, index, or foreign table

Any terminology change we introduce will have to preserve consistency
across the board.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] A little improvementof ApplyLauncherMain loop code

2017-08-01 Thread Yugo Nagata
Hi,

When reading the logical replication code, I found that the following part
could be improved a bit. In the foreach loop, LWLockAcquire and
logicalrep_worker_find are called on each iteration, but they are needed
only when sub->enabled is true.

 846 /* Start the missing workers for enabled subscriptions. */
 847 foreach(lc, sublist)
 848 {
 849 Subscription *sub = (Subscription *) lfirst(lc);
 850 LogicalRepWorker *w;
 851 
 852 LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
 853 w = logicalrep_worker_find(sub->oid, InvalidOid, false);
 854 LWLockRelease(LogicalRepWorkerLock);
 855 
 856 if (sub->enabled && w == NULL)
 857 {
 858 last_start_time = now;
 859 wait_time = wal_retrieve_retry_interval;
 860 
 861 logicalrep_worker_launch(sub->dbid, sub->oid, 
sub->name,
 862  sub->owner, InvalidOid);
 863 }
 864 }

We could fix this to call them only when they are needed, but I'm not sure
whether these calls are expensive enough to matter. Is it worth fixing?
A patch is attached.

Regards,

-- 
Yugo Nagata 
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index d165d51..4816a5b 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -849,11 +849,14 @@ ApplyLauncherMain(Datum main_arg)
 			Subscription *sub = (Subscription *) lfirst(lc);
 			LogicalRepWorker *w;
 
+			if (!sub->enabled)
+				continue;
+
 			LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
 			w = logicalrep_worker_find(sub->oid, InvalidOid, false);
 			LWLockRelease(LogicalRepWorkerLock);
 
-			if (sub->enabled && w == NULL)
+			if (w == NULL)
 			{
 				last_start_time = now;
 				wait_time = wal_retrieve_retry_interval;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign table creation and NOT VALID check constraints

2017-08-01 Thread Simon Riggs
On 1 August 2017 at 07:16, Amit Langote  wrote:
> In f27a6b15e656 (9.6 & later), we decided to "Mark CHECK constraints
> declared NOT VALID valid if created with table."  In retrospect,
> constraints on foreign tables should have been excluded from consideration
> in that commit, because the thinking behind the aforementioned commit
> (that the constraint is trivially validated because the newly created
> table contains no data) does not equally apply to the foreign tables case.
>
> Should we do something about that?

In what way does it not apply? Do you have a failure case?

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] foreign table creation and NOT VALID check constraints

2017-08-01 Thread Amit Langote
In f27a6b15e656 (9.6 & later), we decided to "Mark CHECK constraints
declared NOT VALID valid if created with table."  In retrospect,
constraints on foreign tables should have been excluded from consideration
in that commit, because the thinking behind the aforementioned commit
(that the constraint is trivially validated because the newly created
table contains no data) does not equally apply to the foreign tables case.

Should we do something about that?

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 (latest beta) and older ICU

2017-08-01 Thread Victor Wagner
On Mon, 31 Jul 2017 19:42:30 -0400
Peter Eisentraut  wrote:

> On 7/25/17 15:20, Victor Wagner wrote:
> > It turns out, that PostgreSQL enumerates collations for all ICU
> > locales and passes it into uloc_toLanguageTag function with strict
> > argument of this function set to TRUE. But for some locales
> > (es*@collation=tradtiional, si*@collation=dictionary) only call with
> > strict=FALSE succeeds in ICU 4.2. In newer versions of ICU all
> > locales can be resolved with strict=TRUE.  
> 
> We are only calling uloc_toLanguageTag() with keyword/value
> combinations that ICU itself previously told us were supported.  So
> just ignoring errors doesn't seem proper in this case.
> 

We know that this version of ICU is broken. But what choice do we have?
The locale code in glibc is much more broken, so we have to work around
bugs in the underlying libraries anyway. In the case of ICU we just need
to omit some collations.

It might cause incompatibility with newer systems where these collations
are used, but if we fall back to the glibc locale, there would be many more
incompatibilities. And there is also a bug in strxfrm which prevents the
use of abbreviated keys.

-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers