Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-22 Thread Michael Paquier
On Mon, Jul 22, 2019 at 11:18:06AM -0400, Alvaro Herrera wrote:
> BTW "progname" is a global variable in logging.c, and it's initialized
> by pg_logging_init(), so there's no point in having a local variable in
> main() that's called the same and initialized the same way.  You could
> just remove it from the signature of all those functions
> (connectDatabase and callers), and there would be no visible change.

Sure, and I was really tempted to do that until I noticed that we pass
down progname for fallback_application_name in the connection string
and that we would basically need to externalize progname in logging.h,
as well as switch all the callers of pg_logging_init to now include
their own definition of progname, which was much more invasive than
the initial refactoring intended.  I am also under the impression that
we had better keep get_progname() and pg_logging_init() as rather
independent things.
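
For reference, the pattern under discussion looks roughly like this (a
minimal sketch of the usual frontend startup convention, not code taken
from the patch itself):

    const char *progname;

    pg_logging_init(argv[0]);           /* src/common/logging.c */
    progname = get_progname(argv[0]);   /* src/port */

    /*
     * progname is later handed down to connectDatabase() so it can be
     * supplied as fallback_application_name among the connection keywords:
     */
    keywords[i] = "fallback_application_name";
    values[i] = progname;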

> Also: [see attached]

Missed those in the initial cleanup.  Applied, thanks!
--
Michael


signature.asc
Description: PGP signature


Re: Introduce timeout capability for ConditionVariableSleep

2019-07-22 Thread Thomas Munro
On Tue, Jul 16, 2019 at 1:11 AM Robert Haas  wrote:
> On Fri, Jul 12, 2019 at 11:03 PM Thomas Munro  wrote:
> > I pushed this too.  It's a separate commit, because I think there is
> > at least a theoretical argument that it should be back-patched.  I'm
> > not going to do that today though, because I doubt anyone is relying
> > on ConditionVariableSignal() working that reliably yet, and it's
> > really with timeouts that it becomes a likely problem.
>
> To make it work reliably, you'd need to be sure that a process can't
> ERROR or FATAL after getting signaled and before doing whatever the
> associated work is (or that if it does, it will first pass on the
> signal). Since that seems impossible, I'm not sure I see the point of
> trying to do anything at all.

I agree that that on its own doesn't fix the problems you describe, but that doesn't mean we
shouldn't try to make this API as reliable as possible.  Unlike
typical CV implementations, our wait primitive is not atomic.  When we
invented two-step wait, we created a way for ConditionVariableSignal()
to have no effect due to bad timing.  Surely that's a bug.
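
For context, our two-step wait primitive has the shape below (a minimal
sketch following condition_variable.h; the condition test and wait-event
value are placeholders):

    /* Waiter: */
    ConditionVariablePrepareToSleep(&cv);      /* step 1: join the wait list */
    while (!condition_is_satisfied())          /* caller-specific test */
        ConditionVariableSleep(&cv, wait_event_info);   /* step 2: wait */
    ConditionVariableCancelSleep();

    /* Waker: */
    ConditionVariableSignal(&cv);              /* wakes at most one waiter */

Because the two steps are separate, a badly timed ConditionVariableSignal()
can land on a process that never acts on it.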

-- 
Thomas Munro
https://enterprisedb.com




Re: progress report for ANALYZE

2019-07-22 Thread Tatsuro Yamada

Hi Horiguchi-san, Alvaro, Anthony, Julien and Robert,


On 2019/07/22 17:30, Kyotaro Horiguchi wrote:

Hello.

# It's very good timing, as you came in while I have time after
# finishing a quite nerve-wracking task..

At Mon, 22 Jul 2019 15:02:16 +0900, Tatsuro Yamada  
wrote in <0876b4fe-26fb-ca32-f179-c696fa3dd...@nttcom.co.jp>

3785|13599|postgres|16384|f|16384|analyzing complete|0|0 <-- Added and
fixed. :)
-


I have some comments.
+   const int   index[] = {
+   PROGRESS_ANALYZE_PHASE,
+   PROGRESS_ANALYZE_TOTAL_BLOCKS,
+   PROGRESS_ANALYZE_BLOCKS_DONE
+   };
+   const int64 val[] = {
+   PROGRESS_ANALYZE_PHASE_ANALYSIS,
+   0, 0
Do you object to leaving the total/done blocks alone here :p?



Thanks for your comments!
I created a new patch based on your comments because Alvaro allowed me
to work on revising the patch. :)


Ah, I revised it to remove "0, 0".


Thanks. For a second I thought that
PROGRESS_ANALYZE_PHASE_ANALYSIS was lost, but it lives on,
renamed to PROGRESS_ANALYZE_PHASE_COMPUTING.



"PROGRESS_ANALYZE_PHASE_ANALYSIS" was replaced with
"PROGRESS_ANALYZE_PHASE_COMPUTING" because I changed
the phase-name on v3.patch like this:

./src/include/commands/progress.h

+/* Phases of analyze (as advertised via PROGRESS_ANALYZE_PHASE) */
+#define PROGRESS_ANALYZE_PHASE_SCAN_TABLE  1
+#define PROGRESS_ANALYZE_PHASE_COMPUTING   2
+#define PROGRESS_ANALYZE_PHASE_COMPUTING_EXTENDED 3
+#define PROGRESS_ANALYZE_PHASE_FINALIZE4

Is it okay?

 

+  BlockNumber  nblocks;
+  doubleblksdone = 0;
Why is it a double even though blksdone is of the same domain
as nblocks? And finally:
+pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE,
+   ++blksdone);
It is converted into int64.



Fixed.
But is it suitable to use BlockNumber instead of int64?


Yeah, I didn't mean that we should use int64 there. Sorry for
the confusing comment. I meant that blksdone should be of type
BlockNumber.



Fixed. Thanks for the clarification. :)
Attached v4 patch file only includes this fix.
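
(For concreteness, the revised bookkeeping amounts to something like this
sketch, using the names from the quoted hunks:)

    BlockNumber nblocks;
    BlockNumber blksdone = 0;   /* same domain as nblocks */
    ...
    pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE,
                                 (int64) ++blksdone);  /* widened only here */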
 


+  WHEN 2 THEN 'analyzing sample'
+  WHEN 3 THEN 'analyzing sample (extended stats)'
I think we should avoid parenthesized phrases where they are not
necessary; they make the column unnecessarily long. The
phase is internally called "compute stats", so *I* would prefer
something like the following:
+  WHEN 2 THEN 'computing stats'
+  WHEN 3 THEN 'computing extended stats'
+  WHEN 4 THEN 'analyzing complete'
And if what you intend by this is (as mentioned in the
documentation) "working to complete this analyze", this would be:
+  WHEN 4 THEN 'completing analyze'
+  WHEN 4 THEN 'finalizing analyze'



I have no strong opinion, so I changed the phase names based on
your suggestions, as follows:

   WHEN 2 THEN 'computing stats'
   WHEN 3 THEN 'computing extended stats'
   WHEN 4 THEN 'finalizing analyze'

However, I'd like to get any comments from hackers to get a consensus
about the names.


Agreed. Choosing the right words is especially hard for me..



To Alvaro, Julien, Anthony and Robert,
Do you have any ideas? :)


 

+ Process ID of backend.
"of the backend"? Also, elsewhere a period is not attached to
descriptions consisting of a single sentence.

+ OID of the database to which this backend is
connected.
+ Name of the database to which this backend is
connected.
"database to which .. is connected" is phrased as "database this
backend is connected to" in pg_stat_acttivity. (Just for consistency)



I checked the sentences in the other progress-monitor views (VACUUM,
CREATE INDEX and CLUSTER), and they use the same wording. Therefore,
I'd like to keep it as is. :)


Oh, I see where the wording came from. But no period appears after a
sentence when it is the only one in a description in the other system
views' tables. I think the progress views' tables should be
corrected to follow that convention.



Sounds reasonable. However, I'd like to create another patch for that
after this feature is committed, since that documentation fix would
affect the other progress monitors' descriptions as well.

 

+ Whether the current scan includes legacy inheritance
children.
This apparently excludes partition tables, but actually they are
included.

"Whether scanning through child tables" ?
I'm not sure "child tables" is established as the word to mean
both "partition tables and inheritance children"..



Hmm... I'm also not sure, but I fixed it as you suggested.


Yeah, I'm also not sure the suggestion is good enough as is..


+   Total number of heap blocks to scan in the current table.
 Number of heap blocks on scanning_table to scan?
It might be better that 

RE: Speed up transaction completion faster after many relations are accessed in a transaction

2019-07-22 Thread Tsunakawa, Takayuki
From: David Rowley [mailto:david.row...@2ndquadrant.com]
> Another counter-argument to this is that there's already an
> unexplainable slowdown after you run a query which obtains a large
> number of locks in a session or use prepared statements and a
> partitioned table with the default plan_cache_mode setting. Are we not
> just righting a wrong here? Albeit, possibly 1000 queries later.
> 
> I am, of course, open to other ideas which solve the problem that v5
> has, but failing that, I don't see v6 as all that bad.  At least all
> the logic is contained in one function.  I know Tom wanted to steer
> clear of heuristics to reinitialise the table, but most of the stuff
> that was in the patch back when he complained was trying to track the
> average number of locks over the previous N transactions, and his
> concerns were voiced before I showed the 7% performance regression
> with unconditionally rebuilding the table.

I think I understand what you mean.  Sorry, I don't have a better idea.  This 
unexplainability is probably what we should accept to avoid the shocking 7% 
slowdown.

OTOH, how about my original patch that is based on the local lock list?  I 
expect that it won't cause such a significant slowdown in the same test case.  If it's 
not satisfactory, then v6 is the best to commit.


Regards
Takayuki Tsunakawa




Re: SegFault on 9.6.14

2019-07-22 Thread Thomas Munro
On Fri, Jul 19, 2019 at 3:00 PM Amit Kapila  wrote:
> On Thu, Jul 18, 2019 at 7:15 PM Tom Lane  wrote:
> > Thomas Munro  writes:
> > > Hmm, so something like a new argument "bool final" added to the
> > > ExecXXXShutdown() functions, which receives false in this case to tell
> > > it that there could be a rescan so keep the parallel context around.
> >
> > I think this is going in the wrong direction.  Nodes should *always*
> > assume that a rescan is possible until ExecEndNode is called.
>
> I am wondering why we don't remove the part of destroying the
> parallel context (and shared memory) from ExecShutdownGather (and
> ExecShutdownGatherMerge) and then do it at the time of ExecEndGather
> (and ExecEndGatherMerge)?   This should fix the bug at hand and seems
> to be more consistent with our overall design principles.  I have not
> tried to code it to see if there are any other repercussions of the
> same but seems worth investigating.  What do you think?

I tried moving ExecParallelCleanup() into ExecEndGather().  The first
problem is that ExecutePlan() wraps execution in
EnterParallelMode()/ExitParallelMode(), but ExitParallelMode() fails
an assertion that no parallel context is active because
ExecEndGather() hasn't run yet.  The enclosing
ExecutorStart()/ExecutorEnd() calls are further down the call stack,
in ProcessQuery().  So some more restructuring might be needed to exit
parallel mode later, but then I feel like you might be getting way out
of back-patchable territory, especially if it involves moving code to
the other side of the executor hook boundary.  Is there an easier way?

Another idea from the band-aid-solutions-that-are-easy-to-back-patch
department: in ExecutePlan() where we call ExecShutdownNode(), we
could write EXEC_FLAG_DONE into estate->es_top_eflags, and then have
ExecGatherShutdown() only run ExecParallelCleanup() if it sees that
flag.  That's not beautiful, but it's less churn than the 'bool final'
argument we discussed before, and could be removed in master when we
have a better way.
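
Roughly, that band-aid would look like this (sketch only; EXEC_FLAG_DONE
is the hypothetical new flag, the other names are from the existing
executor code):

    /* In ExecutePlan(), once no rescan can happen: */
    estate->es_top_eflags |= EXEC_FLAG_DONE;
    ExecShutdownNode(planstate);

    /* In ExecShutdownGather(): */
    ExecShutdownGatherWorkers(node);
    if (node->pei != NULL &&
        (node->ps.state->es_top_eflags & EXEC_FLAG_DONE) != 0)
    {
        ExecParallelCleanup(node->pei);  /* destroy context only when done */
        node->pei = NULL;
    }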

Stepping back a bit, it seems like we need two separate tree-walking
calls: one to free resources not needed anymore by the current rescan
(workers), and another to free resources not needed ever again
(parallel context).  That could be spelled ExecShutdownNode(false) and
ExecShutdownNode(true), or controlled with the EXEC_FLAG_DONE kluge,
or a new additional ExecSomethingSomethingNode() function, or as you
say, perhaps the second thing could be incorporated into
ExecEndNode().  I suspect that the Shutdown callbacks for Hash, Hash
Join, Custom Scan and Foreign Scan might not be needed anymore if we
could keep the parallel context around until after ExecEndNode()
has run.

-- 
Thomas Munro
https://enterprisedb.com




"Attachment not found" by CF-app

2019-07-22 Thread Kyotaro Horiguchi
Hello. I happened to find that I cannot obtain a patch from the
Emails field of a patch on the CF-App.

https://commitfest.postgresql.org/23/1525/

Emails:
> Latest attachment (crash_dump_before_main_v2.patch) at 2018-03-01 05:13:34 
> from "Tsunakawa, Takayuki"  

The link shown directly in the Emails field (as crash_dump..patch) is:

https://www.postgresql.org/message-id/attachment/59143/crash_dump_before_main_v2.patch

And it results in "Attachment not found".


I can get the patch with the same name from the message (shown as
"2018-03..13:34") page. The link shown there is:

https://www.postgresql.org/message-id/attachment/68293/crash_dump_before_main_v2.patch

Seems like something is going wrong in the CF-App.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: POC: Cleaning up orphaned files using undo logs

2019-07-22 Thread Amit Kapila
On Mon, Jul 22, 2019 at 8:39 PM Amit Khandekar  wrote:
>
> On Mon, 22 Jul 2019 at 14:21, Amit Kapila  wrote:
>
> -
>
> +UndoWorkerGetSlotInfo(int slot, UndoRequestInfo *urinfo)
> +{
> + /* Block concurrent access. */
> + LWLockAcquire(UndoWorkerLock, LW_EXCLUSIVE);
> +
> + MyUndoWorker = >workers[slot];
> Not sure why MyUndoWorker is used here. Can't we use a local variable
> ? Or do we intentionally attach to the slot as a side-operation ?
>
> -
>

I think here, we can use a local variable as well.  Do you see any
problem with the current code or do you think it is better to use a
local variable here?

> --
>
> + if (!InsertRequestIntoErrorUndoQueue(urinfo))
> I was thinking what happens if for some reason
> InsertRequestIntoErrorUndoQueue() itself errors out. In that case, the
> entry will not be marked invalid, and so there will be no undo action
> carried out because I think the undo worker will exit. What happens
> next with this entry ?

The same entry is present in two queues (xid and size), so next time it
will be executed from the second queue based on its priority in that
queue.  However, if it fails a second time in the same way, then
we will be in trouble, because now the hash table has the entry but
neither of the queues does, so none of the workers will attempt to
execute it again.  Also, when the discard worker tries to register it
again, we won't allow adding the entry to a queue, thinking either some
backend is executing the same request or it must already be part of some
queue.

One possibility to deal with this could be to somehow allow the
discard worker to register it again in the queue, or we could do this in
a critical section so that an error forces a system restart.  However,
the main question is: is it even possible for InsertRequestIntoErrorUndoQueue
to fail unless there is some bug in the code?  If not, we might want
to have an Assert for this rather than handling the condition.
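
In code, that last suggestion would be roughly (sketch;
InsertRequestIntoErrorUndoQueue is the function from the patch under
review):

    bool        inserted PG_USED_FOR_ASSERTS_ONLY;

    inserted = InsertRequestIntoErrorUndoQueue(urinfo);
    Assert(inserted);   /* "can only fail due to a bug" made explicit */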

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Change ereport level for QueuePartitionConstraintValidation

2019-07-22 Thread David Rowley
On Thu, 18 Jul 2019 at 07:01, Tom Lane  wrote:
> Seems like maybe what we need is to transpose the tests at issue into
> a TAP test?  That could grep for the messages we care about and disregard
> other ones.

That seems like a good idea.   I guess that's a vote in favour of
having DEBUG1 for ATTACH PARTITION and SET NOT NULL too?

I don't know my way around the tap tests that well, but I started to
look at this and ended up a bit stuck on where the test should be
located.  I see src/test/modules/brin has some brin related tests, so
I thought that src/test/modules/alter_table might be the spot, but
after looking at src/test/README I see it mentions that only tests
that are themselves an extension should be located within:

modules/
  Extensions used only or mainly for test purposes, generally not suitable
  for installing in production databases

There are a few others in the same situation as brin: commit_ts,
snapshot_too_old, unsafe_tests.   I see unsafe_tests does mention the
lack of a module in its README file.

Is there a better place to do the alter_table ones?  Or are the above
ones in there because there's no better place?

Also, if I'm not wrong, the votes so far appear to be:

NOTICE:  Robert, Amit
DEBUG1: Tom, Alvaro (I'm entirely basing this on the fact that they
mentioned possible ways to test with DEBUG1)

I'll be happy with DEBUG1 if we can get tests to test it.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




RE: [Patch] PQconnectPoll() is blocked if target_session_attrs is read-write

2019-07-22 Thread Matsumura, Ryo
Kyotaro-san

Thank you for your review.

> First, this patch looks broken.

I made a serious mistake.

> You also need to update the corresponding documentation.

I attach a new patch that includes the documentation update.

Regards
Ryo Matsumura


libpq_state_change_bugfix.ver1.1.patch
Description: libpq_state_change_bugfix.ver1.1.patch


Re: pg_receivewal documentation

2019-07-22 Thread Michael Paquier
On Mon, Jul 22, 2019 at 01:25:41PM -0400, Jesper Pedersen wrote:
> Hi,
> 
> On 7/21/19 9:48 PM, Michael Paquier wrote:
> > > Since pg_receivewal does not apply WAL, you should not allow it to
> > > become a synchronous standby when synchronous_commit = remote_apply.
> > > If it does, it will appear to be a standby which never catches up,
> > > which may cause commits to block.  To avoid this, you should either
> > > configure an appropriate value for synchronous_standby_names, or
> > > specify an application_name for pg_receivewal that does not match it,
> > > or change the value of synchronous_commit to something other than
> > > remote_apply.
> > > 
> > > I think that'd be a lot more useful than enumerating the total-failure
> > > scenarios.
> > 
> > +1.  Thanks for the suggestions!  Your wording looks good to me.
> 
> +1
> 
> Here is the patch for it, with Robert as the author.

> +to something other than

Looks fine to me.  Just a tiny nit.  For the second reference to
synchronous_commit, I would change the link to a  markup. 
--
Michael


signature.asc
Description: PGP signature


Re: errbacktrace

2019-07-22 Thread Thomas Munro
On Tue, Jul 23, 2019 at 6:19 AM Peter Eisentraut
 wrote:
> On 2019-07-09 11:43, Peter Eisentraut wrote:
> > After further research I'm thinking about dropping the libunwind
> > support.  The backtrace()/backtrace_symbols() API is more widely
> > available: darwin, freebsd, linux, netbsd, openbsd (via port), solaris,
> > and of course it's built-in, whereas libunwind is only available for
> > linux, freebsd, hpux, solaris, and requires an external dependency.
>
> Here is an updated patch without the libunwind support, some minor
> cleanups, documentation, and automatic back traces from assertion failures.

Now works out of the box on FreeBSD.  The assertion thing is a nice touch.
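
For anyone who hasn't used it, the API in question is tiny (minimal
sketch; see execinfo.h):

    #include <execinfo.h>

    void   *frames[100];
    int     nframes;

    nframes = backtrace(frames, 100);
    /* writes directly to the file descriptor, no malloc, so it is
     * usable even in an assertion-failure path: */
    backtrace_symbols_fd(frames, nframes, fileno(stderr));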

I wonder if it'd make sense to have a log_min_backtrace GUC that you
could set to error/fatal/panic/whatever (perhaps in a later patch).

-- 
Thomas Munro
https://enterprisedb.com




Re: Psql patch to show access methods info

2019-07-22 Thread Alexander Korotkov
On Mon, Jul 22, 2019 at 11:25 PM Nikita Glukhov  wrote:
> Columns "Handler" and "Description" were added to \dA+.
>
> \dA [NAME] now shows only amname and amtype.

Cool!

> Also added support for pre-9.6 server versions to both \dA and \dA+.

I was going to ask about that.  You got ahead of me :-)

In general, the patchset is very cool.  It was always scary that there
is no way in psql to see am/opclass/opfamily information other than by
querying the catalogs directly.  The shape of the patches also looks good.

I'm going to push it.  Probably someone will find that the command syntax
and output formats are not well discussed yet.  But we're pretty early
in the 13 release cycle.  So we will have time to work out any criticism.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: initdb recommendations

2019-07-22 Thread Jonathan S. Katz
On 7/22/19 3:20 PM, Andrew Dunstan wrote:
> 
> On 7/22/19 3:15 PM, Tom Lane wrote:
>>
>> Frankly, this episode makes me wonder whether changing the default is
>> even a good idea at this point.  People who care about security have
>> already set up their processes to select a useful-to-them auth option,
>> while people who do not care are unlikely to be happy about having
>> security rammed down their throats, especially if it results in the
>> sort of push-ups we're looking at having to do in the buildfarm.
>> I think this has effectively destroyed the argument that only
>> trivial adjustments will be required.
> 
> There's a strong tendency these days to be secure by default, so I
> understand the motivation.

So perhaps, to bring back the idea that spawned this thread[1], as an
interim step we could provide some documented recommendations on how to set
things up. The original patch has a warning box (and arguably defaulting
to "trust" deserves a warning) but could be revised to be inline with
the text.

Jonathan

[1]
https://www.postgresql.org/message-id/bec17f0a-ddb1-8b95-5e69-368d9d0a3390%40postgresql.org



signature.asc
Description: OpenPGP digital signature


Re: getting ERROR "relation 16401 has no triggers" with partition foreign key alter

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-17, Alvaro Herrera wrote:

> On 2019-Jul-17, Alvaro Herrera wrote:

> > I wonder if there are other AT subcommands that are similarly broken,
> > because many of them skip the CheckTableNotInUse for the partitions.
> 
> I suppose the question here is where else do we need to call the new
> ATRecurseCheckNotInUse function (which needs a comment).

I decided to rename the new function to ATCheckPartitionsNotInUse, and
make it a no-op for legacy inheritance.  This seems quite specific to
partitioned tables (as opposed to legacy inheritance behavior).

After looking at the code some more, I think calling the new function in
the Prep phase is correct.  The attached patch is pretty much final form
for this bugfix.  I decided to unwrap a couple of error messages (I did
get bitten while grepping because of this), and reordered one of the new
Identity command cases in ATPrepCmd since it appeared in inconsistent
order in that one place out of four.


I looked at all the other AT subcommand cases that might require the
same treatment, and didn't find anything -- either the recursion is done
at Prep time, which checks already, or contains the proper check at Exec
time right after opening the partition rel.  (I think it would be better
to do the check during the Prep phase, to avoid wasting work in case a
partition happens to be used.  However, that's not critical and not for
this commit to fix IMO.)

Separately from that, there's AT_SetLogged / AT_SetUnlogged which look
pretty dubious ... I'm not sure that recursion is handled correctly
there.  Maybe it's considered okay to have a partitioned table with
unlogged partitions, and vice versa?

I also noticed that AT_AlterConstraint does not handle recursion at all,
and it also has this comment:

 * Currently only works for Foreign Key constraints.
 * Foreign keys do not inherit, so we purposely ignore the
 * recursion bit here, but we keep the API the same for when
 * other constraint types are supported.

which seems to contradict reality.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From ca7841a24c27012e351d08dc36039233ae0d80e1 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 22 Jul 2019 11:09:06 -0400
Subject: [PATCH v2] Check partitions not in use

---
 src/backend/commands/tablecmds.c | 56 +---
 1 file changed, 45 insertions(+), 11 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f39f993e01..fd17aa4b5a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -354,6 +354,7 @@ static void ATSimplePermissions(Relation rel, int allowed_targets);
 static void ATWrongRelkindError(Relation rel, int allowed_targets);
 static void ATSimpleRecursion(List **wqueue, Relation rel,
 			  AlterTableCmd *cmd, bool recurse, LOCKMODE lockmode);
+static void ATCheckPartitionsNotInUse(Relation rel, LOCKMODE lockmode);
 static void ATTypedTableRecursion(List **wqueue, Relation rel, AlterTableCmd *cmd,
   LOCKMODE lockmode);
 static List *find_typed_table_dependencies(Oid typeOid, const char *typeName,
@@ -3421,8 +3422,7 @@ CheckTableNotInUse(Relation rel, const char *stmt)
 		ereport(ERROR,
 (errcode(ERRCODE_OBJECT_IN_USE),
 		/* translator: first %s is a SQL command, eg ALTER TABLE */
- errmsg("cannot %s \"%s\" because "
-		"it is being used by active queries in this session",
+ errmsg("cannot %s \"%s\" because it is being used by active queries in this session",
stmt, RelationGetRelationName(rel))));
 
 	if (rel->rd_rel->relkind != RELKIND_INDEX &&
@@ -3431,8 +3431,7 @@ CheckTableNotInUse(Relation rel, const char *stmt)
 		ereport(ERROR,
 (errcode(ERRCODE_OBJECT_IN_USE),
 		/* translator: first %s is a SQL command, eg ALTER TABLE */
- errmsg("cannot %s \"%s\" because "
-		"it has pending trigger events",
+ errmsg("cannot %s \"%s\" because it has pending trigger events",
stmt, RelationGetRelationName(rel))));
 }
 
@@ -3910,16 +3909,19 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 			break;
 		case AT_AddIdentity:
 			ATSimplePermissions(rel, ATT_TABLE | ATT_VIEW | ATT_FOREIGN_TABLE);
+			/* This command never recurses */
 			pass = AT_PASS_ADD_CONSTR;
 			break;
-		case AT_DropIdentity:
-			ATSimplePermissions(rel, ATT_TABLE | ATT_VIEW | ATT_FOREIGN_TABLE);
-			pass = AT_PASS_DROP;
-			break;
 		case AT_SetIdentity:
 			ATSimplePermissions(rel, ATT_TABLE | ATT_VIEW | ATT_FOREIGN_TABLE);
+			/* This command never recurses */
 			pass = AT_PASS_COL_ATTRS;
 			break;
+		case AT_DropIdentity:
+			ATSimplePermissions(rel, ATT_TABLE | ATT_VIEW | ATT_FOREIGN_TABLE);
+			/* This command never recurses */
+			pass = AT_PASS_DROP;
+			break;
 		case AT_DropNotNull:	/* ALTER COLUMN DROP NOT NULL */
 			ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
 			ATPrepDropNotNull(rel, recurse, recursing);
@@ 

Re: Add CREATE DATABASE LOCALE option

2019-07-22 Thread Fabien COELHO



Hello Peter,


About the pg_dump code, I'm wondering whether it is worth generating
LOCALE as it breaks backward compatibility (eg dumping a new db to load it
into a older version).


We don't really care about backward compatibility here.  Moving forward,
with ICU and such, we don't want to have to drag around old syntax forever.


We will drag it anyway because LOCALE is just a shortcut for the other two 
LC_* when they have the same value.



How about this patch?


It applies cleanly, compiles, global & pg_dump make check ok, doc gen ok.

I'm still unconvinced of the interest of breaking backward compatibility, 
but this is no big deal.


I do not much like calling strlen() to check whether a string is empty,
but this is pre-existing.
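
(For illustration, the usual alternative to the strlen() emptiness test
is a one-character check; a sketch using the condition quoted elsewhere
in this thread:)

    /* patch style */
    if (strlen(collate) > 0 && strcmp(collate, ctype) == 0)
        ...
    /* equivalent without walking the string */
    if (collate[0] != '\0' && strcmp(collate, ctype) == 0)
        ...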


I switched the patch to READY.

--
Fabien.




Re: initdb recommendations

2019-07-22 Thread Tom Lane
I wrote:
> I tried doing a run on gaur (old HPUX, so no "peer" auth) before the
> revert happened.  It got as far as initdb-check [1], which failed quite
> thoroughly with lots of the same error as above.
> ...
> Presumably Noah's AIX menagerie would have failed in about the
> same way if it had run.

Oh --- actually, Noah's machines *did* report in on that commit,
and they got past initdb-check, only to fail at install-check-C
much the same as most of the rest of the world.

Studying their configure output, the reason is that they have
getpeereid(), so that AIX *does* support peer auth.  At least
on that version of AIX.  That makes it only HPUX and Windows
that can't do it.

BTW, after looking at the patch a bit more, I'm pretty distressed
by this:

--- a/src/include/port.h
+++ b/src/include/port.h
@@ -361,6 +361,11 @@ extern int fls(int mask);
 extern int getpeereid(int sock, uid_t *uid, gid_t *gid);
 #endif
 
+/* must match src/port/getpeereid.c */
#if defined(HAVE_GETPEEREID) || defined(SO_PEERCRED) || defined(LOCAL_PEERCRED) || defined(HAVE_GETPEERUCRED)
+#define HAVE_AUTH_PEER 1
+#endif
+
 #ifndef HAVE_ISINF
 extern int isinf(double x);
 #else

I seriously doubt that port.h includes, or should be made to include,
whatever headers provide SO_PEERCRED and/or LOCAL_PEERCRED.  That means
that the result of this test is going to be different in different .c
files depending on what was or wasn't included.  It could also get
silently broken on specific platforms by an ill-advised #include removal
(and, once we fix the buildfarm script to not fail on PEER-less platforms,
the buildfarm wouldn't detect the breakage either).

Another objection to this is that it's entirely unclear from the
buildfarm logs whether HAVE_AUTH_PEER got set on a particular system.

I think that when/if we try again, configure itself ought to be
responsible for setting HAVE_AUTH_PEER after probing for these
various antecedent symbols.

regards, tom lane




Re: Broken defenses against dropping a partitioning column

2019-07-22 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Jul-22, Tom Lane wrote:
>> I nearly missed the need for that because of all the noise that
>> check-world emits in pre-v12 branches.  We'd discussed back-patching
>> eb9812f27 at the time, and I think now it's tested enough that doing
>> so is low risk (or at least, lower risk than the risk of not seeing
>> a failure).  So I think I'll go do that now.

> I'd like that, as it bites me too, thanks.

Done.  The "make check-world >/dev/null" approach now emits the
same amount of noise on all branches, i.e. just

NOTICE:  database "regression" does not exist, skipping


The amount of parallelism you can apply is still pretty
branch-dependent, unfortunately.

regards, tom lane




Re: Broken defenses against dropping a partitioning column

2019-07-22 Thread Manuel Rigger
Thanks a lot for the fix!

Best,
Manuel

On Mon, Jul 22, 2019 at 9:35 PM Alvaro Herrera  wrote:
>
> On 2019-Jul-22, Tom Lane wrote:
>
> > I nearly missed the need for that because of all the noise that
> > check-world emits in pre-v12 branches.  We'd discussed back-patching
> > eb9812f27 at the time, and I think now it's tested enough that doing
> > so is low risk (or at least, lower risk than the risk of not seeing
> > a failure).  So I think I'll go do that now.
>
> I'd like that, as it bites me too, thanks.
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Psql patch to show access methods info

2019-07-22 Thread Nikita Glukhov

Attached 8th version of the patches.


On 22.07.2019 15:58, Alexander Korotkov wrote:

On Mon, Jul 22, 2019 at 6:29 AM Alvaro Herrera  wrote:

On 2019-Jul-21, Alexander Korotkov wrote:

I have one note.  The behavior of "\dA" and "\dA pattern" looks
counter-intuitive to me.  I would rather expect that "\dA pattern"
would just filter the results of "\dA", but it displays different
information.  I suggest renaming the command that displays access method
properties from "\dA pattern" to something different.

\dA+ maybe?  Then ...


And leave "\dA pattern" just filter results of "\dA".

"\dA+ pattern" works intuitively, I think.

Sounds good for me.

We already have some functionality for \dA+.

# \dA+
                         List of access methods
  Name  | Type  |       Handler        |              Description
--------+-------+----------------------+-----------------------------------------
 brin   | index | brinhandler          | block range index (BRIN) access method
 btree  | index | bthandler            | b-tree index access method
 gin    | index | ginhandler           | GIN index access method
 gist   | index | gisthandler          | GiST index access method
 hash   | index | hashhandler          | hash index access method
 heap   | table | heap_tableam_handler | heap table access method
 spgist | index | spghandler           | SP-GiST index access method
(7 rows)

What we need is for the new \dA+ functionality to cover the existing one.
That is, we should add the Handler and Description columns to the output.

# \dA+ *
                                Index access method properties
   AM   | Ordering | Unique indexes | Multicol indexes | Exclusion constraints | Include non-key columns
--------+----------+----------------+------------------+-----------------------+-------------------------
 brin   | no       | no             | yes              | no                    | no
 btree  | yes      | yes            | yes              | yes                   | yes
 gin    | no       | no             | yes              | no                    | no
 gist   | no       | no             | yes              | yes                   | yes
 hash   | no       | no             | no               | yes                   | no
 spgist | no       | no             | no               | yes                   | no
(6 rows)

         Table access method properties
 Name | Type  |       Handler        |       Description
------+-------+----------------------+--------------------------
 heap | table | heap_tableam_handler | heap table access method
(1 row)


Columns "Handler" and "Description" were added to \dA+.

\dA [NAME] now shows only amname and amtype.


Also added support for pre-9.6 server versions to both \dA and \dA+.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
>From f2eadc2ca4e9593292748890fd62273c59e05b5f Mon Sep 17 00:00:00 2001
From: Nikita Glukhov 
Date: Mon, 15 Jul 2019 15:52:49 +0300
Subject: [PATCH 1/2] Add psql AM info commands

---
 doc/src/sgml/catalogs.sgml |   8 +-
 doc/src/sgml/ref/psql-ref.sgml |  81 ++-
 src/bin/psql/command.c |  20 +-
 src/bin/psql/describe.c| 430 ++---
 src/bin/psql/describe.h|  19 +-
 src/bin/psql/help.c|   6 +-
 src/bin/psql/tab-complete.c|  16 +-
 src/test/regress/expected/psql.out | 141 
 src/test/regress/sql/psql.sql  |  15 ++
 9 files changed, 696 insertions(+), 40 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 68ad507..ec79c11 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -681,7 +681,7 @@
search and ordering purposes.)
   
 
-  
+  
pg_amop Columns
 

@@ -824,7 +824,7 @@
is one row for each support function belonging to an operator family.
   
 
-  
+  
pg_amproc Columns
 

@@ -4467,7 +4467,7 @@ SCRAM-SHA-256$iteration count:
Operator classes are described at length in .
   
 
-  
+  
pg_opclass Columns
 

@@ -4729,7 +4729,7 @@ SCRAM-SHA-256$iteration count:
Operator families are described at length in .
   
 
-  
+  
pg_opfamily Columns
 

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 7789fc6..e690c4d 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1222,11 +1222,82 @@ testdb=
 
 
 
-Lists access methods. If pattern is specified, only access
-methods whose names match the pattern are shown. If
-+ is appended to the command name, each access
-method is listed with its associated handler function and description.
+Lists access methods with their associated handler function. If
++ is appended to the command name, additional
+description is provided.
+If pattern is specified,
+the command displays the properties of the access methods whose names
+match the search pattern.
+
+
+   

Re: errbacktrace

2019-07-22 Thread Tom Lane
I wrote:
> Just noticing that ExceptionalCondition has an "fflush(stderr);"
> in front of what you added --- perhaps you should also add one
> after the backtrace_symbols_fd call?  It's not clear to me that
> that function guarantees to fflush, nor do I want to assume that
> abort() does.

Oh, wait, it's writing to fileno(stderr) so it couldn't be
buffering anything.  Disregard ...

regards, tom lane




Re: errbacktrace

2019-07-22 Thread Tom Lane
Peter Eisentraut  writes:
> Here is an updated patch without the libunwind support, some minor
> cleanups, documentation, and automatic back traces from assertion failures.

Just noticing that ExceptionalCondition has an "fflush(stderr);"
in front of what you added --- perhaps you should also add one
after the backtrace_symbols_fd call?  It's not clear to me that
that function guarantees to fflush, nor do I want to assume that
abort() does.

regards, tom lane




Re: errbacktrace

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-22, Peter Eisentraut wrote:

> On 2019-07-09 11:43, Peter Eisentraut wrote:
> > After further research I'm thinking about dropping the libunwind
> > support.  The backtrace()/backtrace_symbols() API is more widely
> > available: darwin, freebsd, linux, netbsd, openbsd (via port), solaris,
> > and of course it's built-in, whereas libunwind is only available for
> > linux, freebsd, hpux, solaris, and requires an external dependency.
> 
> Here is an updated patch without the libunwind support, some minor
> cleanups, documentation, and automatic back traces from assertion failures.

The only possible complaint I see is that the backtrace support in
ExceptionalCondition does not work for Windows eventlog/console ... but
that seems moot since Windows does not have backtrace support anyway.

+1 to get this patch in at this stage; we can further refine (esp. add
Windows support) later, if need be.

https://stackoverflow.com/questions/26398064/counterpart-to-glibcs-backtrace-and-backtrace-symbols-on-windows
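
Per that link, a Windows implementation would be built on
CaptureStackBackTrace() plus dbghelp symbol lookup, roughly like this
(untested sketch; link with dbghelp.lib):

    #include <windows.h>
    #include <dbghelp.h>
    #include <stdio.h>

    static void
    print_backtrace(void)
    {
        void       *frames[32];
        USHORT      nframes = CaptureStackBackTrace(0, 32, frames, NULL);
        HANDLE      proc = GetCurrentProcess();
        char        symbuf[sizeof(SYMBOL_INFO) + 256];
        SYMBOL_INFO *sym = (SYMBOL_INFO *) symbuf;
        USHORT      i;

        SymInitialize(proc, NULL, TRUE);
        sym->SizeOfStruct = sizeof(SYMBOL_INFO);
        sym->MaxNameLen = 255;
        for (i = 0; i < nframes; i++)
        {
            if (SymFromAddr(proc, (DWORD64) frames[i], 0, sym))
                fprintf(stderr, "%u: %s\n", (unsigned) i, sym->Name);
        }
    }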

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Broken defenses against dropping a partitioning column

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-22, Tom Lane wrote:

> I nearly missed the need for that because of all the noise that
> check-world emits in pre-v12 branches.  We'd discussed back-patching
> eb9812f27 at the time, and I think now it's tested enough that doing
> so is low risk (or at least, lower risk than the risk of not seeing
> a failure).  So I think I'll go do that now.

I'd like that, as it bites me too, thanks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: initdb recommendations

2019-07-22 Thread Andrew Dunstan


On 7/22/19 3:15 PM, Tom Lane wrote:
> I wrote:
>> BTW, it looks like the Windows buildfarm critters have a
>> separate problem: they're failing with
>> initdb: error: must specify a password for the superuser to enable md5 
>> authentication
> I tried doing a run on gaur (old HPUX, so no "peer" auth) before the
> revert happened.  It got as far as initdb-check [1], which failed quite
> thoroughly with lots of the same error as above.  Depressingly, a lot of
> the test cases that expected some type of error "succeeded", indicating
> they're not actually checking to see which error they got.  Boo hiss.
>
> Presumably Noah's AIX menagerie would have failed in about the
> same way if it had run.
>
> So we've got a *lot* of buildfarm work to do before we can think about
> changing this.



Ouch. I'll test more on Windows.



>
> Frankly, this episode makes me wonder whether changing the default is
> even a good idea at this point.  People who care about security have
> already set up their processes to select a useful-to-them auth option,
> while people who do not care are unlikely to be happy about having
> security rammed down their throats, especially if it results in the
> sort of push-ups we're looking at having to do in the buildfarm.
> I think this has effectively destroyed the argument that only
> trivial adjustments will be required.
>
>   regards, tom lane
>
> [1] 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gaur&dt=2019-07-22%2017%3A08%3A27
>


There's a strong tendency these days to be secure by default, so I
understand the motivation.


cheers


andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: initdb recommendations

2019-07-22 Thread Andrew Dunstan


On 7/22/19 12:39 PM, Tom Lane wrote:
> I wrote:
>> I'm afraid we're going to have to revert this, at least till
>> such time as a fixed buildfarm client is in universal use.
>> As for the nature of that fix, I don't quite understand why
>> the forced -U is there --- maybe we could just remove it?
>> But there are multiple places in the buildfarm client that
>> have hard-wired references to "buildfarm".
> BTW, it looks like the Windows buildfarm critters have a
> separate problem: they're failing with
>
> initdb: error: must specify a password for the superuser to enable md5 
> authentication
>
> One would imagine that even if we'd given a password to initdb,
> subsequent connection attempts would fail for lack of a password.
> There might not be any practical fix except forcing trust auth
> for the Windows critters.
>
>   



Yeah.


Modulo this issue, experimentation shows that adding '-A trust' to the
line in run_build.pl where initdb is called fixes the issue. If we're
going to rely on a buildfarm client fix, that one seems simplest. There
are a couple of not very widely used modules that need similar treatment:
TestSepgsql and TestUpgradeXVersion.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: initdb recommendations

2019-07-22 Thread Tom Lane
I wrote:
> BTW, it looks like the Windows buildfarm critters have a
> separate problem: they're failing with
> initdb: error: must specify a password for the superuser to enable md5 
> authentication

I tried doing a run on gaur (old HPUX, so no "peer" auth) before the
revert happened.  It got as far as initdb-check [1], which failed quite
thoroughly with lots of the same error as above.  Depressingly, a lot of
the test cases that expected some type of error "succeeded", indicating
they're not actually checking to see which error they got.  Boo hiss.

Presumably Noah's AIX menagerie would have failed in about the
same way if it had run.

So we've got a *lot* of buildfarm work to do before we can think about
changing this.

Frankly, this episode makes me wonder whether changing the default is
even a good idea at this point.  People who care about security have
already set up their processes to select a useful-to-them auth option,
while people who do not care are unlikely to be happy about having
security rammed down their throats, especially if it results in the
sort of push-ups we're looking at having to do in the buildfarm.
I think this has effectively destroyed the argument that only
trivial adjustments will be required.

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gaur&dt=2019-07-22%2017%3A08%3A27




Re: Broken defenses against dropping a partitioning column

2019-07-22 Thread Tom Lane
I wrote:
> Here's a proposed patch for that.  It's mostly pretty straightforward,
> except I had to add some recursion defenses in findDependentObjects that
> weren't there before.  But those seem like a good idea anyway to prevent
> infinite recursion in case of bogus entries in pg_depend.
> Per above, I'm envisioning applying this to HEAD and v12 with a catversion
> bump, and to v11 and v10 with no catversion bump.

Pushed.  Back-patching turned up one thing I hadn't expected: pre-v12
pg_dump bleated about circular dependencies.  It turned out that Peter
had already installed a hack in pg_dump to suppress that complaint in
connection with generated columns, so I improved the comment and
back-patched that too.

I nearly missed the need for that because of all the noise that
check-world emits in pre-v12 branches.  We'd discussed back-patching
eb9812f27 at the time, and I think now it's tested enough that doing
so is low risk (or at least, lower risk than the risk of not seeing
a failure).  So I think I'll go do that now.

regards, tom lane




Re: Add CREATE DATABASE LOCALE option

2019-07-22 Thread Peter Eisentraut
On 2019-07-13 19:20, Fabien COELHO wrote:
> The second error message about the conflicting option could be more explicit than 
> a terse "conflicting or redundant options"?  The user may expect later 
> options to supersede earlier options, for instance.

done

> About the pg_dump code, I'm wondering whether it is worth generating 
> LOCALE as it breaks backward compatibility (eg dumping a new db to load it 
> into a older version).

We don't really care about backward compatibility here.  Moving forward,
with ICU and such, we don't want to have to drag around old syntax forever.

> If it is to be generated, I'd merge the two conditions instead of 
> nesting them.
> 
>if (strlen(collate) > 0 && strcmp(collate, ctype) == 0)
>  // generate LOCALE

done

How about this patch?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From df26107a550b9e1d0933a04dc31f43c43a17b0f7 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 22 Jul 2019 20:35:43 +0200
Subject: [PATCH v3] Add CREATE DATABASE LOCALE option

This sets both LC_COLLATE and LC_CTYPE with one option.  Similar
behavior is already supported in initdb, CREATE COLLATION, and
createdb.

Discussion: 
https://www.postgresql.org/message-id/flat/d9d5043a-dc70-da8a-0166-1e218e6e34d4%402ndquadrant.com
---
 doc/src/sgml/ref/create_database.sgml | 25 +++--
 src/backend/commands/dbcommands.c | 21 +
 src/bin/pg_dump/pg_dump.c | 18 +-
 src/bin/pg_dump/t/002_pg_dump.pl  |  9 +
 4 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/create_database.sgml 
b/doc/src/sgml/ref/create_database.sgml
index b2c9e241c2..4014f6703b 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -25,6 +25,7 @@
 [ [ WITH ] [ OWNER [=] user_name ]
[ TEMPLATE [=] template ]
[ ENCODING [=] encoding ]
+   [ LOCALE [=] locale ]
[ LC_COLLATE [=] lc_collate ]
[ LC_CTYPE [=] lc_ctype ]
[ TABLESPACE [=] tablespace_name ]
@@ -111,6 +112,26 @@ Parameters

   
  
+ 
+  locale
+  
+   
+This is a shortcut for setting LC_COLLATE
+and LC_CTYPE at once.  If you specify this,
+you cannot specify either of those parameters.
+   
+   
+
+ The other locale settings , , , and
+  are not fixed per database and are not
+ set by this command.  If you want to make them the default for a
+ specific database, you can use ALTER DATABASE
+ ... SET.
+
+   
+  
+ 
  
   lc_collate
   
@@ -287,7 +308,7 @@ Examples
To create a database music with a different locale:
 
 CREATE DATABASE music
-LC_COLLATE 'sv_SE.utf8' LC_CTYPE 'sv_SE.utf8'
+LOCALE 'sv_SE.utf8'
 TEMPLATE template0;
 
 In this example, the TEMPLATE template0 clause is 
required if
@@ -300,7 +321,7 @@ Examples
different character set encoding:
 
 CREATE DATABASE music2
-LC_COLLATE 'sv_SE.iso885915' LC_CTYPE 'sv_SE.iso885915'
+LOCALE 'sv_SE.iso885915'
 ENCODING LATIN9
 TEMPLATE template0;
 
diff --git a/src/backend/commands/dbcommands.c 
b/src/backend/commands/dbcommands.c
index 863f89f19d..fc1e1564a6 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -124,6 +124,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
DefElem*downer = NULL;
DefElem*dtemplate = NULL;
DefElem*dencoding = NULL;
+   DefElem*dlocale = NULL;
DefElem*dcollate = NULL;
DefElem*dctype = NULL;
DefElem*distemplate = NULL;
@@ -184,6 +185,15 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 parser_errposition(pstate, 
defel->location)));
dencoding = defel;
}
+   else if (strcmp(defel->defname, "locale") == 0)
+   {
+   if (dlocale)
+   ereport(ERROR,
+   (errcode(ERRCODE_SYNTAX_ERROR),
+errmsg("conflicting or 
redundant options"),
+parser_errposition(pstate, 
defel->location)));
+   dlocale = defel;
+   }
else if (strcmp(defel->defname, "lc_collate") == 0)
{
if (dcollate)
@@ -244,6 +254,12 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 parser_errposition(pstate, 
defel->location)));
}
 
+   if (dlocale && (dcollate || dctype))
+   ereport(ERROR,
+   (errcode(ERRCODE_SYNTAX_ERROR),
+  

Re: using explicit_bzero

2019-07-22 Thread Tom Lane
Peter Eisentraut  writes:
> On 2019-07-18 00:45, Tom Lane wrote:
>> +1 for using the C11-standard name, even if that's not anywhere
>> in the real world yet.

> ISTM that a problem is that you cannot implement a replacement
> memset_s() as a wrapper around explicit_bzero(), unless you also want to
> implement the bound checking stuff.  (The "s"/safe in this family of
> functions refers to the bound checking, not the cannot-be-optimized-away
> property.)  The other way around it is easier.

Oh, hm.

> Also, the "s" family of functions appears to be a quagmire of
> controversy and incompatibility, so it's perhaps better to stay away
> from it for the time being.

Fair enough.

regards, tom lane




Re: errbacktrace

2019-07-22 Thread Peter Eisentraut
On 2019-07-09 11:43, Peter Eisentraut wrote:
> After further research I'm thinking about dropping the libunwind
> support.  The backtrace()/backtrace_symbols() API is more widely
> available: darwin, freebsd, linux, netbsd, openbsd (via port), solaris,
> and of course it's built-in, whereas libunwind is only available for
> linux, freebsd, hpux, solaris, and requires an external dependency.

Here is an updated patch without the libunwind support, some minor
cleanups, documentation, and automatic back traces from assertion failures.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 779231dfe56cf8f138a94ba0106ca72171b63350 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 22 Jul 2019 20:08:37 +0200
Subject: [PATCH v3] Add backtrace support

Add some support for automatically showing backtraces in certain error
situations in the server.  Backtraces are shown on assertion failure.
A new setting backtrace_function can be set to the name of a C
function, and all ereport()s and elog()s from the function will have
backtraces generated.  Finally, the function errbacktrace() can be
manually added to an ereport() call to generate a backtrace for that
call.

Discussion: 
https://www.postgresql.org/message-id/flat/5f48cb47-bf1e-05b6-7aae-3bf2cd015...@2ndquadrant.com
---
 configure| 61 +-
 configure.in |  4 ++
 doc/src/sgml/config.sgml | 25 +
 src/backend/utils/error/assert.c | 13 +
 src/backend/utils/error/elog.c   | 90 
 src/backend/utils/misc/guc.c | 12 +
 src/include/pg_config.h.in   |  6 +++
 src/include/utils/elog.h |  3 ++
 src/include/utils/guc.h  |  1 +
 9 files changed, 213 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 7a6bfc2339..fbd8224f50 100755
--- a/configure
+++ b/configure
@@ -11662,6 +11662,63 @@ if test "$ac_res" != no; then :
 
 fi
 
+# *BSD:
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing 
backtrace_symbols" >&5
+$as_echo_n "checking for library containing backtrace_symbols... " >&6; }
+if ${ac_cv_search_backtrace_symbols+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern "C"
+#endif
+char backtrace_symbols ();
+int
+main ()
+{
+return backtrace_symbols ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' execinfo; do
+  if test -z "$ac_lib"; then
+ac_res="none required"
+  else
+ac_res=-l$ac_lib
+LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_search_backtrace_symbols=$ac_res
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext
+  if ${ac_cv_search_backtrace_symbols+:} false; then :
+  break
+fi
+done
+if ${ac_cv_search_backtrace_symbols+:} false; then :
+
+else
+  ac_cv_search_backtrace_symbols=no
+fi
+rm conftest.$ac_ext
+LIBS=$ac_func_search_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$ac_cv_search_backtrace_symbols" >&5
+$as_echo "$ac_cv_search_backtrace_symbols" >&6; }
+ac_res=$ac_cv_search_backtrace_symbols
+if test "$ac_res" != no; then :
+  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+fi
+
 
 if test "$with_readline" = yes; then
 
@@ -12760,7 +12817,7 @@ $as_echo "#define HAVE_STDBOOL_H 1" >>confdefs.h
 fi
 
 
-for ac_header in atomic.h copyfile.h crypt.h fp_class.h getopt.h ieeefp.h 
ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/ipc.h sys/prctl.h 
sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h 
sys/sockio.h sys/tas.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h
+for ac_header in atomic.h copyfile.h crypt.h execinfo.h fp_class.h getopt.h 
ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/ipc.h 
sys/prctl.h sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h 
sys/shm.h sys/sockio.h sys/tas.h sys/un.h termios.h ucred.h utime.h wchar.h 
wctype.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" 
"$ac_includes_default"
@@ -15143,7 +15200,7 @@ fi
 LIBS_including_readline="$LIBS"
 LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 
-for ac_func in cbrt clock_gettime copyfile fdatasync getifaddrs getpeerucred 
getrlimit mbstowcs_l memmove poll posix_fallocate ppoll pstat 
pthread_is_threaded_np readlink setproctitle setproctitle_fast setsid shm_open 
strchrnul strsignal symlink sync_file_range uselocale utime utimes wcstombs_l
+for ac_func in backtrace_symbols cbrt clock_gettime copyfile fdatasync 

Re: using explicit_bzero

2019-07-22 Thread Peter Eisentraut
On 2019-07-18 00:45, Tom Lane wrote:
> Alvaro Herrera  writes:
>> On 2019-Jul-11, Thomas Munro wrote:
>>> Following a trail of crumbs beginning at OpenSSH's fallback
>>> implementation of this[1], I learned that C11 has standardised
>>> memset_s[2] for this purpose.  Macs have memset_s but no
>>> explicit_bzero.  FreeBSD has both.  I wonder if it'd be better to make
>>> memset_s the function we use in our code, considering its standard
>>> blessing and therefore likelihood of being available on every system
>>> eventually.
> 
>> Sounds like a future-proof way would be to implement memset_s in
>> src/port if absent from the OS (using explicit_bzero and other tricks),
>> and use that.
> 
> +1 for using the C11-standard name, even if that's not anywhere
> in the real world yet.

ISTM that a problem is that you cannot implement a replacement
memset_s() as a wrapper around explicit_bzero(), unless you also want to
implement the bound checking stuff.  (The "s"/safe in this family of
functions refers to the bound checking, not the cannot-be-optimized-away
property.)  The other way around it is easier.
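
Concretely, the easy direction is trivial (sketch; memset_s() is the C11
Annex K function, so the feature-test macro must be set before string.h):

    #define __STDC_WANT_LIB_EXT1__ 1
    #include <string.h>

    void
    explicit_bzero(void *buf, size_t len)
    {
        (void) memset_s(buf, len, 0, len);
    }

    /*
     * The reverse -- memset_s() over explicit_bzero() -- would also have
     * to supply Annex K's runtime-constraint (bound) checking, which
     * explicit_bzero() knows nothing about.
     */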

Also, the "s" family of functions appears to be a quagmire of
controversy and incompatibility, so it's perhaps better to stay away
from it for the time being.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: using explicit_bzero

2019-07-22 Thread Peter Eisentraut
On 2019-07-11 03:11, Michael Paquier wrote:
> On Mon, Jun 24, 2019 at 02:08:50PM +0900, Michael Paquier wrote:
>> CreateRole() and AlterRole() can manipulate a password in plain format
>> in memory.  The cleanup could be done just after calling
>> encrypt_password() in user.c.
>>
>> Could it be possible to add the new flag in pg_config.h.win32?
> 
> While remembering about it...  Shouldn't the memset(0) now happening in
> base64.c for the encoding and decoding routines when facing a failure
> use explicit_bzero()?

base64.c doesn't know what the data it is dealing with is used for.
That should be the responsibility of the caller, no?
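
That is, the zeroing would sit at the call site, along these lines (a
hypothetical sketch, assuming the current pg_b64_decode() signature):

    int     declen;

    dstlen = pg_b64_dec_len(srclen);
    dst = palloc(dstlen);
    declen = pg_b64_decode(src, srclen, dst, dstlen);
    if (declen < 0)
    {
        explicit_bzero(dst, dstlen);  /* only the caller knows this held a secret */
        pfree(dst);
        /* ... report the error ... */
    }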

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: POC: converting Lists into arrays

2019-07-22 Thread Tom Lane
Alvaro Herrera  writes:
> So with this patch we end up with:

> list_union (copies list1, appends list2 element not already in list1)
> list_concat_unique (appends list2 elements not already in list)
> list_concat (appends all list2 elements)
> list_concat_copy (copies list1, appends all list2 elements)

> This seems a little random -- for example we end up with "union" being
> the same as "concat_copy" except for the copy; and the analogy between
> those two seems to exactly correspond to that between "concat_unique"
> and "concat".

Yeah, list_concat_unique is kind of weird here.  Its header comment
even points out that it's much like list_union:

 * This is almost the same functionality as list_union(), but list1 is
 * modified in-place rather than being copied. However, callers of this
 * function may have strict ordering expectations -- i.e. that the relative
 * order of those list2 elements that are not duplicates is preserved.

I think that last sentence is bogus --- does anybody really think
people have been careful not to assume anything about the ordering
of list_union results?

> I would propose to use the name list_union, with flags
> being "unique" (or "uniquify" if that's a word, or even just "all" which
> seems obvious to people with a SQL background), and something that
> suggests "copy_first".

I really dislike using "union" for something that doesn't have the
same semantics as SQL's UNION (ie guaranteed duplicate elimination);
so I've never been that happy with "list_union" and "list_difference".
Propagating that into things that aren't doing any dup-elimination
at all seems very wrong.

Also, a big -1 for replacing these calls with something with
extra parameter(s).  That's going to be verbose, and not any
more readable, and probably slower because the called code
will have to figure out what to do.

Perhaps there's an argument for doing something to change the behavior
of list_union and list_difference and friends.  Not sure --- it could
be a foot-gun for back-patching.  I'm already worried about the risk
of back-patching code that assumes the new semantics of list_concat.
(Which might be a good argument for renaming it to something else?
Just not list_union, please.)

regards, tom lane




Re: POC: converting Lists into arrays

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-22, Tom Lane wrote:

> David Rowley  writes:

> > If we do end up with another function, it might be nice to stay away
> > from using "concat" in the name. I think we might struggle if there
> > are too many variations on concat and there's a risk we'll use the
> > wrong one.  If we need this then perhaps something like
> > list_append_all() might be a better choice... I'm struggling to build
> > a strong opinion on this though. (I know that because I've deleted
> > this paragraph 3 times and started again, each time with a different
> > opinion.)
> 
> Yeah, the name is really the sticking point here; if we could think
> of a name that was easy to understand then the whole thing would be
> much easier to accept.  The best I've been able to come up with is
> "list_join", by analogy to bms_join for bitmapsets ... but that's
> not great.

So with this patch we end up with:

list_union (copies list1, appends list2 element not already in list1)
list_concat_unique (appends list2 elements not already in list)
list_concat (appends all list2 elements)
list_concat_copy (copies list1, appends all list2 elements)

This seems a little random -- for example we end up with "union" being
the same as "concat_copy" except for the copy; and the analogy between
those two seems to exactly correspond to that between "concat_unique"
and "concat".  I would propose to use the name list_union, with flags
being "unique" (or "uniquify" if that's a word, or even just "all" which
seems obvious to people with a SQL background), and something that
suggests "copy_first".

Maybe we can offer a single name that does the four things, selecting
the exact semantics with boolean flags?  (We can provide the old names
as macros, to avoid unnecessarily breaking other code).  Also, perhaps
it would make sense to put them all closer in the source file.
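
To make the current semantics concrete, a small sketch using the integer
variants (the result comments assume the new array-based List, with
list_concat_copy() as proposed in this thread):

	List	   *a = list_make2_int(1, 2);
	List	   *b = list_make2_int(2, 3);

	/* each line below assumes fresh a = (1 2) and b = (2 3) */
	(void) list_union_int(a, b);			/* new list (1 2 3); a untouched */
	(void) list_concat_unique_int(a, b);	/* a modified in place to (1 2 3) */
	(void) list_concat(a, b);				/* a modified in place to (1 2 2 3) */
	(void) list_concat_copy(a, b);			/* new list (1 2 2 3); a untouched */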

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_receivewal documentation

2019-07-22 Thread Jesper Pedersen

Hi,

On 7/21/19 9:48 PM, Michael Paquier wrote:

Since pg_receivewal does not apply WAL, you should not allow it to
become a synchronous standby when synchronous_commit = remote_apply.
If it does, it will appear to be a standby which never catches up,
which may cause commits to block.  To avoid this, you should either
configure an appropriate value for synchronous_standby_names, or
specify an application_name for pg_receivewal that does not match it,
or change the value of synchronous_commit to something other than
remote_apply.

I think that'd be a lot more useful than enumerating the total-failure
scenarios.


+1.  Thanks for the suggestions!  Your wording looks good to me.


+1

Here is the patch for it, with Robert as the author.

Best regards,
 Jesper

>From a6512e79e9fd054b188489757ee622dbf98ddf2b Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Mon, 22 Jul 2019 13:19:56 -0400
Subject: [PATCH] Highlight that pg_receivewal doesn't apply WAL, and as such
 synchronous-commit needs to be remote_write or lower.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Robert Haas
Review-by: Michael Paquier, Álvaro Herrera, Laurenz Albe and Jesper Pedersen
---
 doc/src/sgml/ref/pg_receivewal.sgml | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/pg_receivewal.sgml b/doc/src/sgml/ref/pg_receivewal.sgml
index e96d753955..beab6f0391 100644
--- a/doc/src/sgml/ref/pg_receivewal.sgml
+++ b/doc/src/sgml/ref/pg_receivewal.sgml
@@ -52,7 +52,16 @@ PostgreSQL documentation
Unlike the WAL receiver of a PostgreSQL standby server, pg_receivewal
by default flushes WAL data only when a WAL file is closed.
The option --synchronous must be specified to flush WAL data
-   in real time.
+   in real time. Since pg_receivewal does not apply WAL,
+   you should not allow it to become a synchronous standby when
+   <xref linkend="guc-synchronous-commit"/> equals remote_apply.
+   If it does, it will appear to be a standby which never catches up,
+   which may cause commits to block.  To avoid this, you should either
+   configure an appropriate value for <xref linkend="guc-synchronous-standby-names"/>, or
+   specify an application_name for
+   pg_receivewal that does not match it, or change the value of
+   <xref linkend="guc-synchronous-commit"/> to something other than
+   remote_apply.
   
 
   
@@ -207,16 +216,6 @@ PostgreSQL documentation
 server as a synchronous standby, to ensure that timely feedback is
 sent to the server.

-
-   
-Note that while WAL will be flushed with this setting,
-pg_receivewal never applies it,
-so <xref linkend="guc-synchronous-commit"/> must not be set
-to remote_apply or on if
-pg_receivewal is a synchronous standby, be it a
-member of a priority-based (FIRST) or a
-quorum-based (ANY) synchronous replication setup.
-   
   
  
 
-- 
2.21.0
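
As a concrete illustration of the advice in the new paragraph (hypothetical
names throughout):

# postgresql.conf on the primary: only the real standby is synchronous
synchronous_standby_names = 'FIRST 1 (standby1)'

# run pg_receivewal under a non-matching application_name, set here via
# libpq's PGAPPNAME environment variable:
PGAPPNAME=wal_archiver pg_receivewal -D /archive/wal --synchronous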



Re: Index Skip Scan

2019-07-22 Thread Jesper Pedersen

Hi,

On 7/22/19 1:44 AM, David Rowley wrote:

Here are the comments I noted down during the review:

cost_index:

I know you've not finished here, but I think it'll need to adjust
tuples_fetched somehow to account for estimate_num_groups() on the
Path's unique keys. Any Eclass with an ec_has_const = true does not
need to be part of the estimate there as there can only be at most one
value for these.

For example, in a query such as:

SELECT x,y FROM t WHERE x = 1 GROUP BY x,y;

you only need to perform estimate_num_groups() on "y".

I'm really not quite sure on what exactly will be required from
amcostestimate() here.  The cost of the skip scan is not the same as
the normal scan. So other that API needs adjusted to allow the caller
to mention that we want skip scans estimated, or there needs to be
another callback.



I think this part will become more clear once the index skip scan patch 
is rebased, as we got the uniquekeys field on the Path, and the 
indexskipprefix info on the IndexPath node.




build_index_paths:

I don't quite see where you're checking if the query's unique_keys
match what unique keys can be produced by the index.  This is done for
pathkeys with:

pathkeys_possibly_useful = (scantype != ST_BITMAPSCAN &&
!found_lower_saop_clause &&
has_useful_pathkeys(root, rel));
index_is_ordered = (index->sortopfamily != NULL);
if (index_is_ordered && pathkeys_possibly_useful)
{
index_pathkeys = build_index_pathkeys(root, index,
   ForwardScanDirection);
useful_pathkeys = truncate_useless_pathkeys(root, rel,
index_pathkeys);
orderbyclauses = NIL;
orderbyclausecols = NIL;
}

Here has_useful_pathkeys() checks if the query requires some ordering.
For unique keys you'll want to do the same. You'll have set the
query's unique key requirements in standard_qp_callback().

I think basically build_index_paths() should be building index paths
with unique keys, for all indexes that can support the query's unique
keys. I'm just a bit uncertain if we need to create both a normal
index path and another path for the same index with unique keys.
Perhaps we can figure that out down the line somewhere. Maybe it's
best to build path types for now, when possible, and we can consider
later if we can skip the non-uniquekey paths.  Likely that would
require a big XXX comment to explain we need to review that before the
code makes it into core.

get_uniquekeys_for_index:

I think this needs to follow more the lead from build_index_pathkeys.
Basically, ask the index what it's pathkeys are.

standard_qp_callback:

build_group_uniquekeys & build_distinct_uniquekeys could likely be one
function that takes a list of SortGroupClause. You just either pass
the groupClause or distinctClause in.  Pretty much the UniqueKey
version of make_pathkeys_for_sortclauses().

Yeah, I'll move this part of the index skip scan patch to the unique key 
patch.


Thanks for your feedback !

Best regards,
 Jesper




Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-22, Tom Lane wrote:

> Alvaro Herrera  writes:
> > On 2019-Jul-22, Julien Rouhaud wrote:
> >> On Mon, Jul 22, 2019 at 5:11 PM Alvaro Herrera  
> >> wrote:
> >>> Can we use List for this instead?
> 
> >> Isn't that for backend code only?
> 
> > Well, we already have palloc() on the frontend side, and list.c doesn't
> > have any elog()/ereport(), so it should be possible to use it ... I do
> > see that it uses MemoryContextAlloc() in a few places.  Maybe we can
> > just #define that to palloc()?
> 
> I'm not happy about either the idea of pulling all of list.c into
> frontend programs, or restricting it to be frontend-safe.  That's
> very fundamental infrastructure and I don't want it laboring under
> such a restriction.  Furthermore, List usage generally leaks memory
> like mad (cf nearby list_concat discussion) which doesn't seem like
> something we want for frontend code.

Fair enough.  List has gotten quite sophisticated now, so I understand
the concern.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: initdb recommendations

2019-07-22 Thread Andrew Dunstan


On 7/22/19 12:25 PM, Tom Lane wrote:
> I wrote:
>> Peter Eisentraut  writes:
>>> Pushed with that note.  Thanks.
>> This has completely broken the buildfarm.
> On inspection, it seems the reason for that is that the buildfarm
> script runs initdb with '-U buildfarm', so that peer-auth connections
> will only work if the buildfarm is being run by an OS user named
> exactly "buildfarm".  That happens to be true on my macOS animals,
> which is why they're not broken ... but apparently, nobody else
> does it that way.
>
> I'm afraid we're going to have to revert this, at least till
> such time as a fixed buildfarm client is in universal use.
>
> As for the nature of that fix, I don't quite understand why
> the forced -U is there --- maybe we could just remove it?
> But there are multiple places in the buildfarm client that
> have hard-wired references to "buildfarm".



This goes back quite a way:


commit 7528701abb88ab84f6775448c59b392ca7f33a07
Author: Andrew Dunstan 
Date:   Tue Nov 27 13:47:38 2012 -0500

    Run everything as buildfarm rather than local user name.
   
    This will help if we ever want to do things like comparing dump
diffs.
    Done by setting PGUSER and using initdb's -U option.


The pg_upgrade test (not the cross-version one) doesn't use this - it
explicitly unsets PGUSER.

There are a few things we could do. We could force trust auth, or we
could add an ident map that allowed $USER to login as buildfarm. Finding
all the places we would need to fix that could be a fun project ...
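
For instance, the ident-map variant might look roughly like this (untested
sketch, map name invented):

# pg_ident.conf: map any OS user name to the "buildfarm" role
# MAPNAME       SYSTEM-USERNAME         PG-USERNAME
bfmap           /^.*$                   buildfarm

# pg_hba.conf: use that map for local peer connections
local   all     buildfarm       peer    map=bfmap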

We could also maybe teach initdb to honor an environment setting
INITDB_DEFAULT_AUTH or some such.


I agree this should be reverted for now until we work out what we want
to do.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services






Re: [PATCH] minor bugfix for pg_basebackup (9.6 ~ )

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-22, Michael Paquier wrote:

> On Sat, Jul 20, 2019 at 10:04:19AM +0900, Michael Paquier wrote:
> > This restriction is unlikely going to be removed, still I would rather
> > keep the escaped logic in pg_basebackup.  This is the usual,
> > recommended coding pattern, and there is a risk that folks refer to
> > this code block for their own fancy stuff, spreading the problem.  The
> > intention behind the code is to use an escaped name as well.  For 
> > those reasons your patch is fine by me.
> 
> Attempting to use a slot with an unsupported set of characters will
> lead beforehand to a failure when trying to fetch the WAL segments
> with START_REPLICATION, meaning that this spot will never be reached
> and that there is no active bug, but for the sake of consistency I see
> no problems with applying the fix on HEAD.  So, are there any
> objections with that?

Maybe it's just me, but it seems weird to try to forestall a problem
that cannot occur by definition.  I would rather remove the escaping,
and add a one-line comment explaining why we don't do it?

if (replication_slot)
/* needn't escape because slot name must comprise [a-zA-Z0-9_] only */
appendPQExpBuffer(recoveryconfcontents, "primary_slot_name = '%s'\n",
  replication_slot);

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: initdb recommendations

2019-07-22 Thread Tom Lane
I wrote:
> I'm afraid we're going to have to revert this, at least till
> such time as a fixed buildfarm client is in universal use.

> As for the nature of that fix, I don't quite understand why
> the forced -U is there --- maybe we could just remove it?
> But there are multiple places in the buildfarm client that
> have hard-wired references to "buildfarm".

BTW, it looks like the Windows buildfarm critters have a
separate problem: they're failing with

initdb: error: must specify a password for the superuser to enable md5 
authentication

One would imagine that even if we'd given a password to initdb,
subsequent connection attempts would fail for lack of a password.
There might not be any practical fix except forcing trust auth
for the Windows critters.

regards, tom lane




Re: Memory Accounting

2019-07-22 Thread Jeff Davis
On Mon, 2019-07-22 at 18:16 +0200, Tomas Vondra wrote:
> > > * I changed it to only update mem_allocated for the current
> > > context, not recursively for all parent contexts. It's now up to
> > > the function that reports memory usage to recurse or not (as
> > > needed).
> >
> > Is that OK for memory bounded hash aggregation? Might there be a
> > lot of sub-contexts during hash aggregation?
>
> There shouldn't be, at least not since b419865a814abbc. There might
> be cases where custom aggregates still do that, but I think that's
> simply a design we should discourage.

Right, I don't think memory-context-per-group is something we should
optimize for.

Discussion:
https://www.postgresql.org/message-id/3839201.Nfa2RvcheX%40techfox.foxi
https://www.postgresql.org/message-id/5334D7A5.2000907%40fuzzy.cz

Commit link:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b419865a814abbca12bdd6eef6a3d5ed67f432e1

Regards,
Jeff Davis






Re: initdb recommendations

2019-07-22 Thread Tom Lane
I wrote:
> Peter Eisentraut  writes:
>> Pushed with that note.  Thanks.

> This has completely broken the buildfarm.

On inspection, it seems the reason for that is that the buildfarm
script runs initdb with '-U buildfarm', so that peer-auth connections
will only work if the buildfarm is being run by an OS user named
exactly "buildfarm".  That happens to be true on my macOS animals,
which is why they're not broken ... but apparently, nobody else
does it that way.

I'm afraid we're going to have to revert this, at least till
such time as a fixed buildfarm client is in universal use.

As for the nature of that fix, I don't quite understand why
the forced -U is there --- maybe we could just remove it?
But there are multiple places in the buildfarm client that
have hard-wired references to "buildfarm".

regards, tom lane




Re: Memory Accounting

2019-07-22 Thread Tomas Vondra

On Mon, Jul 22, 2019 at 11:30:37AM +0300, Heikki Linnakangas wrote:

On 18/07/2019 21:24, Jeff Davis wrote:

Previous discussion:
https://postgr.es/m/1407012053.15301.53.camel%40jeff-desktop

This patch introduces a way to ask a memory context how much memory it
currently has allocated. Each time a new block (not an individual
chunk, but a new malloc'd block) is allocated, it increments a struct
member; and each time a block is free'd, it decrements the struct
member. So it tracks blocks allocated by malloc, not what is actually
used for chunks allocated by palloc.

The purpose is for Memory Bounded Hash Aggregation, but it may be
useful in more cases as we try to better adapt to and control memory
usage at execution time.


Seems handy.



Indeed.


* I changed it to only update mem_allocated for the current context,
not recursively for all parent contexts. It's now up to the function
that reports memory usage to recurse or not (as needed).


Is that OK for memory bounded hash aggregation? Might there be a lot 
of sub-contexts during hash aggregation?




There shouldn't be, at least not since b419865a814abbc. There might be
cases where custom aggregates still do that, but I think that's simply a
design we should discourage.
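
To sketch the reporting side (recursion left to the caller, per the change
discussed above; the function name and details here are assumptions rather
than the patch text):

static Size
MemoryContextMemAllocatedSketch(MemoryContext context, bool recurse)
{
	/* mem_allocated is maintained as malloc'd blocks come and go */
	Size		total = context->mem_allocated;

	if (recurse)
	{
		MemoryContext child;

		for (child = context->firstchild;
			 child != NULL;
			 child = child->nextchild)
			total += MemoryContextMemAllocatedSketch(child, true);
	}

	return total;
}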

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: Cleaning up and speeding up string functions

2019-07-22 Thread Dagfinn Ilmari Mannsåker
Alvaro Herrera  writes:

> On 2019-Jul-22, Dagfinn Ilmari Mannsåker wrote:
>
>> ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes:
>> 
>> > I noticed a lot of these are appending one StringInfo onto another;
>> > would it make sense to introduce a helper function
>> > appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
>> > `str.data, str2.len` repetition?
>> 
>> A bit of grepping only turned up 18 uses, but I was bored and whipped up
>> the attached anyway, in case we decide it's worth it.
>
> David had already submitted the same thing upthread, and it was rejected
> on the grounds that it increases the code space.

Oops, sorry, I missed that. Never mind then.

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law




Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-22 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Jul-22, Julien Rouhaud wrote:
>> On Mon, Jul 22, 2019 at 5:11 PM Alvaro Herrera  
>> wrote:
>>> Can we use List for this instead?

>> Isn't that for backend code only?

> Well, we already have palloc() on the frontend side, and list.c doesn't
> have any elog()/ereport(), so it should be possible to use it ... I do
> see that it uses MemoryContextAlloc() in a few places.  Maybe we can
> just #define that to palloc()?

I'm not happy about either the idea of pulling all of list.c into
frontend programs, or restricting it to be frontend-safe.  That's
very fundamental infrastructure and I don't want it laboring under
such a restriction.  Furthermore, List usage generally leaks memory
like mad (cf nearby list_concat discussion) which doesn't seem like
something we want for frontend code.

regards, tom lane




Re: Cleaning up and speeding up string functions

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-22, Dagfinn Ilmari Mannsåker wrote:

> ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes:
> 
> > I noticed a lot of these are appending one StringInfo onto another;
> > would it make sense to introduce a helper function
> > appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
> > `str.data, str2.len` repetition?
> 
> A bit of grepping only turned up 18 uses, but I was bored and whipped up
> the attached anyway, in case we decide it's worth it.

David had already submitted the same thing upthread, and it was rejected
on the grounds that it increases the code space.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Cleaning up and speeding up string functions

2019-07-22 Thread Dagfinn Ilmari Mannsåker
ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

> David Rowley  writes:
>
>> On Thu, 4 Jul 2019 at 13:51, David Rowley  
>> wrote:
>>> Instead of having 0004, how about the attached?
>>>
>>> Most of the calls won't improve much performance-wise since they're so
>>> cheap anyway, but there is xmlconcat(), I imagine that should see some
>>> speedup.
>>
>> I've pushed this after having found a couple more places where the
>> length is known.
>
> I noticed a lot of these are appending one StringInfo onto another;
> would it make sense to introduce a helper function
> appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
> `str.data, str2.len` repetition?

A bit of grepping only turned up 18 uses, but I was bored and whipped up
the attached anyway, in case we decide it's worth it.

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law

>From 1e68adae513425470cad10cd2a44f66bca61a5fd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 22 Jul 2019 15:01:03 +0100
Subject: [PATCH] Add appendStringInfoStringInfo() function

This simplifies appending one StringInfo to another
---
 contrib/hstore/hstore_io.c  |  2 +-
 contrib/postgres_fdw/deparse.c  |  4 ++--
 src/backend/access/transam/xlog.c   |  2 +-
 src/backend/executor/execMain.c |  2 +-
 src/backend/lib/stringinfo.c| 12 
 src/backend/storage/lmgr/deadlock.c |  2 +-
 src/backend/utils/adt/ri_triggers.c |  4 ++--
 src/backend/utils/adt/ruleutils.c   | 10 +-
 src/backend/utils/adt/xml.c | 12 +---
 src/include/lib/stringinfo.h|  7 +++
 10 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c
index 745497c76f..58415ccaec 100644
--- a/contrib/hstore/hstore_io.c
+++ b/contrib/hstore/hstore_io.c
@@ -1309,7 +1309,7 @@ hstore_to_json_loose(PG_FUNCTION_ARGS)
 			appendBinaryStringInfo(&dst, HSTORE_VAL(entries, base, i),
 								   HSTORE_VALLEN(entries, i));
 			if (IsValidJsonNumber(tmp.data, tmp.len))
-				appendBinaryStringInfo(&dst, tmp.data, tmp.len);
+				appendStringInfoStringInfo(&dst, &tmp);
 			else
 				escape_json(&dst, tmp.data);
 		}
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 19f928ec59..510e034e8e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1531,7 +1531,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
 			{
 Assert(fpinfo->jointype == JOIN_INNER);
 Assert(fpinfo->joinclauses == NIL);
-appendBinaryStringInfo(buf, join_sql_o.data, join_sql_o.len);
+				appendStringInfoStringInfo(buf, &join_sql_o);
 return;
 			}
 		}
@@ -1552,7 +1552,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
 			{
 Assert(fpinfo->jointype == JOIN_INNER);
 Assert(fpinfo->joinclauses == NIL);
-appendBinaryStringInfo(buf, join_sql_i.data, join_sql_i.len);
+				appendStringInfoStringInfo(buf, &join_sql_i);
 return;
 			}
 		}
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index da3d250986..e674c63056 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1192,7 +1192,7 @@ XLogInsertRecord(XLogRecData *rdata,
 		 */
 		initStringInfo(&recordBuf);
 		for (; rdata != NULL; rdata = rdata->next)
-			appendBinaryStringInfo(&recordBuf, rdata->data, rdata->len);
+			appendStringInfoStringInfo(&recordBuf, rdata);
 
 		if (!debug_reader)
 			debug_reader = XLogReaderAllocate(wal_segment_size, NULL, NULL);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dbd7dd9bcd..1969b36891 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2309,7 +2309,7 @@ ExecBuildSlotValueDescription(Oid reloid,
 	if (!table_perm)
 	{
 		appendStringInfoString(&collist, ") = ");
-		appendBinaryStringInfo(&collist, buf.data, buf.len);
+		appendStringInfoStringInfo(&collist, &buf);
 
 		return collist.data;
 	}
diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index 99c83c1549..d25d122289 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -165,6 +165,18 @@ appendStringInfoString(StringInfo str, const char *s)
 	appendBinaryStringInfo(str, s, strlen(s));
 }
 
+/*
+ * appendStringInfoStringInfo
+ *
+ * Append another StringInfo to str.
+ * Like appendBinaryStringInfo(str, str2->data, str2->len)
+ */
+void
+appendStringInfoStringInfo(StringInfo str, const StringInfo str2)
+{
+	appendBinaryStringInfo(str, str2->data, str2->len);
+}
+
 /*
  * appendStringInfoChar
  *
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 990d48377d..14a47b9e66 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -1115,7 

Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-22, Julien Rouhaud wrote:

> On Mon, Jul 22, 2019 at 5:11 PM Alvaro Herrera  
> wrote:
> >
> > > I considered this, but it would require to adapt all code that declare
> > > SimpleStringList stack variable (vacuumdb.c, clusterdb.c,
> > > createuser.c, pg_dumpall.c and pg_dump.c), so it looked like too much
> > > trouble to avoid 2 local variables here and 1 in vacuumdb.c.  I don't
> > > have a strong opinion here, so I can go for it if you prefer.
> >
> > Can we use List for this instead?
> 
> Isn't that for backend code only?

Well, we already have palloc() on the frontend side, and list.c doesn't
have any elog()/ereport(), so it should be possible to use it ... I do
see that it uses MemoryContextAlloc() in a few places.  Maybe we can
just #define that to palloc()?

(Maybe we can use the impulse to get rid of these "simple lists"
altogether?)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-22 Thread Julien Rouhaud
On Mon, Jul 22, 2019 at 5:11 PM Alvaro Herrera  wrote:
>
> On 2019-Jul-22, Julien Rouhaud wrote:
>
> > On Mon, Jul 22, 2019 at 6:11 AM Michael Paquier  wrote:
>
> > >simple_string_list_append(, optarg);
> > > +  tbl_count++;
> > >break;
> > > The number of items in a simple list is not counted, and vacuumdb does
> > > the same thing to count objects.  What do you think about extending
> > > simple lists to track the number of items stored?
> >
> > I considered this, but it would require to adapt all code that declare
> > SimpleStringList stack variable (vacuumdb.c, clusterdb.c,
> > createuser.c, pg_dumpall.c and pg_dump.c), so it looked like too much
> > trouble to avoid 2 local variables here and 1 in vacuumdb.c.  I don't
> > have a strong opinion here, so I can go for it if you prefer.
>
> Can we use List for this instead?

Isn't that for backend code only?




Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-19, Julien Rouhaud wrote:

> > For the second patch, could you send a rebase with a fix for the
> > connection slot when processing the reindex commands?
> 
> Attached, I also hopefully removed all the now unneeded progname usage.

BTW "progname" is a global variable in logging.c, and it's initialized
by pg_logging_init(), so there's no point in having a local variable in
main() that's called the same and initialized the same way.  You could
just remove it from the signature of all those functions
(connectDatabase and callers), and there would be no visible change.

Also: [see attached]

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
commit 957e56dee9a8b32f8f409a516a0195ceb3bc6a75
Author: Alvaro Herrera 
AuthorDate: Mon Jul 22 11:09:18 2019 -0400
CommitDate: Mon Jul 22 11:09:30 2019 -0400

remove useless progname

diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index d3ee0da917..d81bfa3a6b 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -57,7 +57,7 @@ static void prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
    vacuumingOptions *vacopts, const char *table);
 
 static void run_vacuum_command(PGconn *conn, const char *sql, bool echo,
-			   const char *table, const char *progname);
+			   const char *table);
 
 static void help(const char *progname);
 
@@ -646,7 +646,7 @@ vacuum_one_database(const char *dbname, vacuumingOptions *vacopts,
 		 * through ParallelSlotsGetIdle.
 		 */
 		run_vacuum_command(free_slot->connection, sql.data,
-		   echo, tabname, progname);
+		   echo, tabname);
 
 		cell = cell->next;
 	} while (cell != NULL);
@@ -855,7 +855,7 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
  */
 static void
 run_vacuum_command(PGconn *conn, const char *sql, bool echo,
-   const char *table, const char *progname)
+   const char *table)
 {
 	bool		status;
 


Re: POC: Cleaning up orphaned files using undo logs

2019-07-22 Thread Amit Khandekar
On Fri, 19 Jul 2019 at 17:24, Amit Khandekar  wrote:
>
> On Thu, 9 May 2019 at 12:04, Dilip Kumar  wrote:
>
> > Patches can be applied on top of undo branch [1] commit:
> > (cb777466d008e656f03771cf16ec7ef9d6f2778b)
> >
> > [1] https://github.com/EnterpriseDB/zheap/tree/undo
>
> Below are some review points for 0009-undo-page-consistency-checker.patch :

Another point that I missed :

+* Process the undo record of the page and mask their cid filed.
+*/
+   while (next_record < page_end)
+   {
+   UndoRecordHeader *header = (UndoRecordHeader *) next_record;
+
+   /* If this undo record has cid present, then mask it */
+   if ((header->urec_info & UREC_INFO_CID) != 0)

Here, even though next record starts in the current page, the
urec_info itself may or may not lie on this page.

I hope this possibility is also considered when populating the
partial-record-specific details in the page header.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company




Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-22 Thread Alvaro Herrera
On 2019-Jul-22, Julien Rouhaud wrote:

> On Mon, Jul 22, 2019 at 6:11 AM Michael Paquier  wrote:

> >simple_string_list_append(, optarg);
> > +  tbl_count++;
> >break;
> > The number of items in a simple list is not counted, and vacuumdb does
> > the same thing to count objects.  What do you think about extending
> > simple lists to track the number of items stored?
> 
> I considered this, but it would require to adapt all code that declare
> SimpleStringList stack variable (vacuumdb.c, clusterdb.c,
> createuser.c, pg_dumpall.c and pg_dump.c), so it looked like too much
> trouble to avoid 2 local variables here and 1 in vacuumdb.c.  I don't
> have a strong opinion here, so I can go for it if you prefer.

Can we use List for this instead?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: POC: Cleaning up orphaned files using undo logs

2019-07-22 Thread Amit Khandekar
On Mon, 22 Jul 2019 at 14:21, Amit Kapila  wrote:

I have started review of
0014-Allow-execution-and-discard-of-undo-by-background-wo.patch. Below
are some quick comments to start with:

+++ b/src/backend/access/undo/undoworker.c

+#include "access/xact.h"
+#include "access/undorequest.h"
Order is not alphabetical

+ * Each undo worker then start reading from one of the queue the requests for
start=>starts
queue=>queues

-

+ rc = WaitLatch(MyLatch,
+WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+10L, WAIT_EVENT_BGWORKER_STARTUP);
+
+ /* emergency bailout if postmaster has died */
+ if (rc & WL_POSTMASTER_DEATH)
+ proc_exit(1);
I think now, thanks to commit cfdf4dc4fc9635a, you don't have to
explicitly handle postmaster death; instead you can use
WL_EXIT_ON_PM_DEATH. Please check at all such places where this is
done in this patch.
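
That is, roughly (a sketch):

rc = WaitLatch(MyLatch,
			   WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
			   10L, WAIT_EVENT_BGWORKER_STARTUP);
/* no WL_POSTMASTER_DEATH branch needed; the wait itself exits */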

-

+UndoWorkerGetSlotInfo(int slot, UndoRequestInfo *urinfo)
+{
+ /* Block concurrent access. */
+ LWLockAcquire(UndoWorkerLock, LW_EXCLUSIVE);
+
+ MyUndoWorker = >workers[slot];
Not sure why MyUndoWorker is used here. Can't we use a local variable
? Or do we intentionally attach to the slot as a side-operation ?

-

+ * Get the dbid where the wroker should connect to and get the worker
wroker=>worker

-

+ BackgroundWorkerInitializeConnectionByOid(urinfo.dbid, 0, 0);
0, 0 => InvalidOid, 0

+ * Set the undo worker request queue from which the undo worker start
+ * looking for a work.
start => should start
a work => work

--

+ if (!InsertRequestIntoErrorUndoQueue(urinfo))
I was thinking what happens if for some reason
InsertRequestIntoErrorUndoQueue() itself errors out. In that case, the
entry will not be marked invalid, and so there will be no undo action
carried out because I think the undo worker will exit. What happens
next with this entry ?




Re: POC: converting Lists into arrays

2019-07-22 Thread Tom Lane
David Rowley  writes:
> I looked over this and only noted down one thing:

> In estimate_path_cost_size, can you explain why list_concat_copy() is
> needed here? I don't see remote_param_join_conds being used after
> this, so might it be better to just get rid of remote_param_join_conds
> and pass remote_conds to classifyConditions(), then just
> list_concat()?

Hm, you're right, remote_param_join_conds is not used after that,
so we could just drop the existing list_copy() and make it

remote_conds = list_concat(remote_param_join_conds,
   fpinfo->remote_conds);

I'm disinclined to change the API of classifyConditions(),
if that's what you were suggesting.

>> It turns out there are a *lot* of places where list_concat() callers
>> are now leaking the second input list (where before they just leaked
>> that list's header).  So I've got mixed emotions about the choice not
>> to add a variant function that list_free's the second input.

> In some of these places, for example, the calls to
> generate_join_implied_equalities_normal() and
> generate_join_implied_equalities_broken(), I wonder, since these are
> static functions if we could just change the function signature to
> accept a List to append to.

I'm pretty disinclined to do that, too.  Complicating function APIs
for marginal performance gains isn't something that leads to
understandable or maintainable code.

> If we do end up with another function, it might be nice to stay away
> from using "concat" in the name. I think we might struggle if there
> are too many variations on concat and there's a risk we'll use the
> wrong one.  If we need this then perhaps something like
> list_append_all() might be a better choice... I'm struggling to build
> a strong opinion on this though. (I know that because I've deleted
> this paragraph 3 times and started again, each time with a different
> opinion.)

Yeah, the name is really the sticking point here; if we could think
of a name that was easy to understand then the whole thing would be
much easier to accept.  The best I've been able to come up with is
"list_join", by analogy to bms_join for bitmapsets ... but that's
not great.

regards, tom lane




Re: initdb recommendations

2019-07-22 Thread Tom Lane
Peter Eisentraut  writes:
> Pushed with that note.  Thanks.

This has completely broken the buildfarm.

regards, tom lane




Re: psql - add SHOW_ALL_RESULTS option

2019-07-22 Thread Fabien COELHO



Hello Peter,

>> I'd go further and suggest that there shouldn't be a variable
>> controlling this. All results that come in should be processed, period.
>
> I agree with that.

I kind of agree as well, but I was pretty sure that someone would complain
if the current behavior was changed.

Should I produce a patch where the behavior is not an option, or turn the
option on by default, or just keep it like that for the time being?

>> It's not just about \; If the ability of CALL to produce multiple
>> resultsets gets implemented (it was posted as a POC during v11
>> development), this will be needed too.
>
> See previous patch here:
> https://www.postgresql.org/message-id/flat/4580ff7b-d610-eaeb-e06f-4d686896b93b%402ndquadrant.com
>
> In that patch, I discussed the specific ways in which \timing works in
> psql and how that conflicts with multiple result sets.  What is the
> solution to that in this patch?

\timing was kind of an ugly feature to work around. The good intention
behind \timing is that it should reflect the time to perform the query
from the client perspective, but should not include processing the
results.

However, if a message results in several queries, they are processed as
they arrive, so that \timing reports the time to perform all queries and
the time to process all but the last result.

Although on paper we could try to get all results first, take the time,
then process them, this does not work in the general case, because COPY
takes over the connection, so you have to process its result before
switching to the next result.

There is also some stuff to handle notices, which are basically sent as
events when they occur, so that the notices shown are related to the
result under processing.
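
A rough sketch of the loop shape this implies (simplified; handle_result()
is a stand-in for psql's per-result processing):

	instr_time	before,
				after;
	PGresult   *res;

	INSTR_TIME_SET_CURRENT(before);
	PQsendQuery(conn, query);

	while ((res = PQgetResult(conn)) != NULL)
	{
		/*
		 * COPY results take over the connection and must be drained here,
		 * so all but the last result are processed inside the timer.
		 */
		handle_result(res);		/* hypothetical per-result handler */
		PQclear(res);
	}
	INSTR_TIME_SET_CURRENT(after);
	INSTR_TIME_SUBTRACT(after, before);	/* what \timing ends up reporting */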


--
Fabien.




Re: Cleaning up and speeding up string functions

2019-07-22 Thread Dagfinn Ilmari Mannsåker
David Rowley  writes:

> On Thu, 4 Jul 2019 at 13:51, David Rowley  
> wrote:
>> Instead of having 0004, how about the attached?
>>
>> Most of the calls won't improve much performance-wise since they're so
>> cheap anyway, but there is xmlconcat(), I imagine that should see some
>> speedup.
>
> I've pushed this after having found a couple more places where the
> length is known.

I noticed a lot of these are appending one StringInfo onto another;
would it make sense to introduce a helper function
appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
`str.data, str2.len` repetition?

- ilmari
-- 
"A disappointingly low fraction of the human race is,
 at any given time, on fire." - Stig Sandbeck Mathisen




Re: initdb recommendations

2019-07-22 Thread Peter Eisentraut
On 2019-07-13 18:58, Julien Rouhaud wrote:
>>> The default client authentication setup is such that users can connect
>>> over the Unix-domain socket to the same database user name as their
>>> operating system user names (on operating systems that support this,
>>> which are most modern Unix-like systems, but not Windows) and
>>> otherwise with a password. To assign a password to the initial
>>> database superuser, use one of initdb's -W, --pwprompt or -- pwfile
>>> options.
>>
>> Do you have a suggestion for where to put this and exactly how to phrase
>> this?
>>
>> I think the initdb reference page would be more appropriate than
>> runtime.sgml.
> 
> Yes initdb.sgml seems more suitable.  I was thinking something very
> similar to your note, maybe like (also attached if my MUA ruins it):

Pushed with that note.  Thanks.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: GiST VACUUM

2019-07-22 Thread Heikki Linnakangas

On 28/06/2019 01:02, Thomas Munro wrote:

> On Thu, Jun 27, 2019 at 11:38 PM Heikki Linnakangas  wrote:
>> * I moved the logic to extend a 32-bit XID to 64-bits to a new helper
>> function in varsup.c.
>
> I'm a bit uneasy about making this into a generally available function
> for reuse.  I think we should instead come up with a very small number
> of sources of fxids that are known to be free of races because of some
> well explained interlocking.
>
> For example, I believe it is safe to convert an xid obtained from a
> WAL record during recovery, because (for now) recovery is a single
> thread of execution and the next fxid can't advance underneath you.
> So I think XLogRecGetFullXid(XLogReaderState *record)[1] as I'm about
> to propose in another thread (though I've forgotten who wrote it,
> maybe Andres, Amit or me, but if it wasn't me then it's exactly what I
> would have written) is a safe blessed source of fxids.  The rationale
> for writing this function instead of an earlier code that looked much
> like your proposed helper function, during EDB-internal review of
> zheap stuff, was this: if we provide a general purpose xid->fxid
> facility, it's virtually guaranteed that someone will eventually use
> it to shoot footwards, by acquiring an xid from somewhere, and then
> some arbitrary time later extending it to a fxid when no interlocking
> guarantees that the later conversion has the correct epoch.

Fair point.

> I'd like to figure out how to maintain full versions of the
> procarray.c-managed xid horizons, without widening the shared memory
> representation.  I was planning to think harder about it for 13.
> Obviously that doesn't help you now.  So I'm wondering if you should
> just open-code this for now, and put in a comment about why it's safe
> and a note that there'll hopefully be a fxid horizon available in a
> later release?


I came up with the attached. Instead of having a general purpose "widen" 
function, it adds GetFullRecentGlobalXmin(), to return a 64-bit version 
of RecentGlobalXmin. That's enough for this Gist vacuum patch.
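
The underlying epoch arithmetic looks roughly like this (a sketch; the
committed version may differ in detail):

FullTransactionId
GetFullRecentGlobalXminSketch(void)
{
	FullTransactionId nextfxid = ReadNextFullTransactionId();
	uint32		epoch = EpochFromFullTransactionId(nextfxid);

	/*
	 * RecentGlobalXmin lags the next XID by less than one epoch, so if its
	 * 32-bit value is numerically ahead of nextfxid's low bits, it must
	 * have been generated in the previous epoch.
	 */
	if (RecentGlobalXmin > XidFromFullTransactionId(nextfxid))
		epoch--;

	return FullTransactionIdFromEpochAndXid(epoch, RecentGlobalXmin);
}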


In addition to that change, I modified the GistPageSetDeleted(), 
GistPageSetDeleteXid(), GistPageGetDeleteXid() inline functions a bit. I 
merged GistPageSetDeleted() and GistPageSetDeleteXid() into a single 
function, to make sure that the delete-XID is always set when a page is 
marked as deleted. And I modified GistPageGetDeleteXid() to return 
NormalTransactionId (or a FullTransactionId version of it, to be 
precise), for Gist pages that were deleted with older PostgreSQL v12 
beta versions. The previous patch tripped an assertion in that case, 
which is not nice for people binary-upgrading from earlier beta versions.


I did some testing of this with XID wraparound, and it seems to work. I 
used the attached bash script for the testing, with a helper contrib
module to consume XIDs faster. It's not very well commented and probably 
not too useful for anyone, but I'm attaching it here mainly as a note to 
future-self, if we need to revisit this.


Unless something comes up, I'll commit this tomorrow.

- Heikki
>From bdeb2467211a1ab9e733e070d54dce97c05cf18c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Mon, 22 Jul 2019 15:57:01 +0300
Subject: [PATCH 1/2] Refactor checks for deleted GiST pages.

The explicit check in gistScanPage() isn't currently really necessary, as
a deleted page is always empty, so the loop would fall through without
doing anything, anyway. But it's a marginal optimization, and it gives a
nice place to attach a comment to explain how it works.
---
 src/backend/access/gist/gist.c| 40 ---
 src/backend/access/gist/gistget.c | 14 +++
 2 files changed, 29 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 169bf6fcfed..e9ca4b82527 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -709,14 +709,15 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
 			continue;
 		}
 
-		if (stack->blkno != GIST_ROOT_BLKNO &&
-			stack->parent->lsn < GistPageGetNSN(stack->page))
+		if ((stack->blkno != GIST_ROOT_BLKNO &&
+			 stack->parent->lsn < GistPageGetNSN(stack->page)) ||
+			GistPageIsDeleted(stack->page))
 		{
 			/*
-			 * Concurrent split detected. There's no guarantee that the
-			 * downlink for this page is consistent with the tuple we're
-			 * inserting anymore, so go back to parent and rechoose the best
-			 * child.
+			 * Concurrent split or page deletion detected. There's no
+			 * guarantee that the downlink for this page is consistent with
+			 * the tuple we're inserting anymore, so go back to parent and
+			 * rechoose the best child.
 			 */
 			UnlockReleaseBuffer(stack->buffer);
 			xlocked = false;
@@ -735,9 +736,6 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
 			GISTInsertStack *item;
 			OffsetNumber downlinkoffnum;
 
-			/* currently, internal pages are never deleted */
-			

Re: Psql patch to show access methods info

2019-07-22 Thread Alexander Korotkov
On Mon, Jul 22, 2019 at 6:29 AM Alvaro Herrera  wrote:
> On 2019-Jul-21, Alexander Korotkov wrote:
> > I've one note.  Behavior of "\dA" and "\dA pattern" look
> > counter-intuitive to me.  I would rather expect that "\dA pattern"
> > would just filter results of "\dA", but it displays different
> > information.  I suggest rename displaying access method properties
> > from "\dA pattern" to different.
>
> \dA+ maybe?  Then ...
>
> > And leave "\dA pattern" just filter results of "\dA".
>
> "\dA+ pattern" works intuitively, I think.

Sounds good for me.

We already have some functionality for \dA+.

# \dA+
                             List of access methods
  Name  | Type  |       Handler        |              Description
--------+-------+----------------------+-----------------------------------------
 brin   | index | brinhandler          | block range index (BRIN) access method
 btree  | index | bthandler            | b-tree index access method
 gin    | index | ginhandler           | GIN index access method
 gist   | index | gisthandler          | GiST index access method
 hash   | index | hashhandler          | hash index access method
 heap   | table | heap_tableam_handler | heap table access method
 spgist | index | spghandler           | SP-GiST index access method
(7 rows)

What we need is for the new \dA+ functionality to cover the existing one.
That is, we should add Handler and Description columns to the output.

# \dA+ *
                             Index access method properties
   AM   | Ordering | Unique indexes | Multicol indexes | Exclusion constraints | Include non-key columns
--------+----------+----------------+------------------+-----------------------+-------------------------
 brin   | no       | no             | yes              | no                    | no
 btree  | yes      | yes            | yes              | yes                   | yes
 gin    | no       | no             | yes              | no                    | no
 gist   | no       | no             | yes              | yes                   | yes
 hash   | no       | no             | no               | yes                   | no
 spgist | no       | no             | no               | yes                   | no
(6 rows)

                 Table access method properties
 Name | Type  |       Handler        |       Description
------+-------+----------------------+--------------------------
 heap | table | heap_tableam_handler | heap table access method
(1 row)




--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-22 Thread Julien Rouhaud
On Mon, Jul 22, 2019 at 6:11 AM Michael Paquier  wrote:
>
> On Fri, Jul 19, 2019 at 08:29:27AM +0200, Julien Rouhaud wrote:
> > On Fri, Jul 19, 2019 at 2:35 AM Michael Paquier  wrote:
> >> For the second patch, could you send a rebase with a fix for the
> >> connection slot when processing the reindex commands?
> >
> > Attached, I also hopefully removed all the now unneeded progname usage.
>
> +Note that this mode is not compatible the -i / 
> --index
> +or the -s / --system options.
> Nits: this is not a style consistent with the documentation.  When
> referring to both the long and short options the formulation "-i or
> --index" gets used.  Here we could just use the long option.  This
> sentence is missing a "with".

Right, so I kept the long option.  Also this comment was outdated, as
the --jobs is now just ignored with a list of indexes, so I fixed that
too.

>
>simple_string_list_append(, optarg);
> +  tbl_count++;
>break;
> The number of items in a simple list is not counted, and vacuumdb does
> the same thing to count objects.  What do you think about extending
> simple lists to track the number of items stored?

I considered this, but it would require to adapt all code that declare
SimpleStringList stack variable (vacuumdb.c, clusterdb.c,
createuser.c, pg_dumpall.c and pg_dump.c), so it looked like too much
trouble to avoid 2 local variables here and 1 in vacuumdb.c.  I don't
have a strong opinion here, so I can go for it if you prefer.
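
For reference, the extension would amount to something like this (a sketch;
the n_items field is the assumed addition):

typedef struct SimpleStringList
{
	SimpleStringListCell *head;
	SimpleStringListCell *tail;
	int			n_items;		/* assumed new field */
} SimpleStringList;

void
simple_string_list_append(SimpleStringList *list, const char *val)
{
	SimpleStringListCell *cell;

	cell = pg_malloc(offsetof(SimpleStringListCell, val) + strlen(val) + 1);
	cell->next = NULL;
	cell->touched = false;
	strcpy(cell->val, val);

	if (list->tail)
		list->tail->next = cell;
	else
		list->head = cell;
	list->tail = cell;
	list->n_items++;			/* the one new line of bookkeeping */
}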

>
> +$node->issues_sql_like([qw(reindexdb -j2)],
> +   qr/statement: REINDEX TABLE public.test1/,
> +   'Global and parallel reindex will issue per-table REINDEX');
> Would it make sense to have some tests for schemas here?
>
> One of my comments in [1] has not been answered.  What about
> the decomposition of a list of schemas into a list of tables when
> using the parallel mode?

I did that in attached v6, and added suitable regression tests.


reindex_parallel_v6.diff
Description: Binary data


Re: Cleaning up and speeding up string functions

2019-07-22 Thread David Rowley
On Thu, 4 Jul 2019 at 13:51, David Rowley  wrote:
> Instead of having 0004, how about the attached?
>
> Most of the calls won't improve much performance-wise since they're so
> cheap anyway, but there is xmlconcat(), I imagine that should see some
> speedup.

I've pushed this after having found a couple more places where the
length is known.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: make libpq documentation navigable between functions

2019-07-22 Thread Peter Eisentraut
On 2019-07-10 09:51, Fabien COELHO wrote:
>> One approach for making the currently non-monospaced ones into monospace 
>> would be to make the xref targets point to <function> elements
>> but *don't* put xreflabels on those.
> 
> I understand that you mean turning function usages:
> 
>PQbla
> 
> into:
> 
>
> 
> so that it points to function definitions that would look like:
> 
>PQbla...
> 
> (note: "libpq-pqbla" ids are already taken).

What I really meant was that you determine the best link target in each
case.  If there already is an id on a <varlistentry>, then use that.  If
not, then make an id on something else, most likely the <function> element.

What you have now puts ids on both the <varlistentry> and the
<function>, which seems unnecessary and confusing.

For some weird reason this setup with link targets in both
<varlistentry> and enclosed <function> breaks the PDF build, but if you
change it the way I suggest then those errors go away.

>> This will currently produce a warning Don't know what gentext to create 
>> for xref to: "function"
> 
> Indeed.
> 
>> but we can write a template
>>
>> 
>>
>> and then we can control the output format of that.
> 
> This step is (well) beyond my current XSLT proficiency, which is null 
> beyond knowing that it transforms XML into whatever. Also I'm unsure into 
> which of the 11 xsl file the definition should be included and what should 
> be written precisely.

See attached patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 74509f28778c69bc2168925f0d0ca6fa935c9013 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 22 Jul 2019 14:04:48 +0200
Subject: [PATCH] doc: Add support for xref to command and function elements

---
 doc/src/sgml/stylesheet-common.xsl | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/doc/src/sgml/stylesheet-common.xsl 
b/doc/src/sgml/stylesheet-common.xsl
index 6d26e7e5c9..e148c9057f 100644
--- a/doc/src/sgml/stylesheet-common.xsl
+++ b/doc/src/sgml/stylesheet-common.xsl
@@ -86,4 +86,15 @@
   ?
 
 
+
+
+
+
+  
+
+
+
+  
+
+
 
-- 
2.22.0



RE: extension patch of CREATE OR REPLACE TRIGGER

2019-07-22 Thread osumi.takami...@fujitsu.com
Dear Surafel

Thank you for checking my patch.
I've made version 03 to
fix what you mentioned about my patch.

I corrected my wrong bracing style and
also reduced the amount of my regression test.
Of course, I erased the unnecessary
whitespace and the C++ style comment.

Regards,
  Takamichi Osumi

I haven't tested your patch fully yet, but here are some comments.
There is a whitespace issue like here:
-  bool is_internal)
+  bool is_internal,
+  Oid existing_constraint_oid)
in a few places

+ // trigoid = HeapTupleGetOid(tuple); // raw code
please remove this line if you don't use it.

+ if(!existing_constraint_oid){
+ conOid = GetNewOidWithIndex(conDesc, ConstraintOidIndexId,
+ Anum_pg_constraint_oid);
+ values[Anum_pg_constraint_oid - 1] = ObjectIdGetDatum(conOid);
+ }
incorrect bracing style here, and it appears in a few other places too.
Also, it seems to me that the change in the regression test is
huge; can you reduce it?

regards
Surafel


CREATE_OR_REPLACE_TRIGGER_v03.patch
Description: CREATE_OR_REPLACE_TRIGGER_v03.patch


Re: psql - add SHOW_ALL_RESULTS option

2019-07-22 Thread Peter Eisentraut
On 2019-05-15 18:41, Daniel Verite wrote:
> I'd go further and suggest that there shouldn't be a variable
> controlling this. All results that come in should be processed, period.

I agree with that.

> It's not just about \; If the ability of CALL to produce multiple
> resultsets gets implemented (it was posted as a POC during v11
> development), this will be needed too.

See previous patch here:
https://www.postgresql.org/message-id/flat/4580ff7b-d610-eaeb-e06f-4d686896b93b%402ndquadrant.com

In that patch, I discussed the specific ways in which \timing works in
psql and how that conflicts with multiple result sets.  What is the
solution to that in this patch?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Commitfest 2019-07, the first of five* for PostgreSQL 13

2019-07-22 Thread Thomas Munro
 Hello hackers,

Here are the stats at the end of week 3 of the CF:

 status | w1  | w2  | w3
+-+-+-
 Committed  |  32 |  41 |  49
 Moved to next CF   |   5 |   6 |   6
 Needs review   | 146 | 128 | 114
 Ready for Committer|   7 |   9 |  10
 Rejected   |   2 |   2 |   2
 Returned with feedback |   2 |   2 |   2
 Waiting on Author  |  29 |  35 |  39
 Withdrawn  |   8 |   8 |   9

Here is the last batch of submissions that I want to highlight.  These
13 are all marked as "Needs review", but haven't yet seen any email
traffic since the CF began:

 2119 | Use memcpy in pglz decompression | {"Andrey
Borodin","Владимир Лесков"}
 2169 | Remove HeapTuple and Buffer dependency f | {"Ashwin Agrawal"}
 2172 | fsync error handling in pg_receivewal, p | {"Peter Eisentraut"}
 1695 | Global shared meta cache | {"Takeshi Ideriha"}
 2175 | socket_timeout in interfaces/libpq   | {"Ryohei Nagaura"}
 2096 | psql - add SHOW_ALL_RESULTS option   | {"Fabien Coelho"}
 2023 | NOT IN to ANTI JOIN transformation   | {"James Finnerty","Zheng Li"}
 2064 | src/test/modules/dummy_index -- way to t | {"Nikolay Shaplov"}
 1712 | Remove self join on a unique column  | {"Alexander Kuzmenkov"}
 2180 | Optimize pglz compression| {"Andrey
Borodin","Владимир Лесков"}
 2179 | Fix support for hypothetical indexes usi | {"Julien Rouhaud"}
 2025 | SimpleLruTruncate() mutual exclusion (da | {"Noah Misch"}
 2069 | Expose queryid in pg_stat_activity in lo | {"Julien Rouhaud"}

-- 
Thomas Munro
https://enterprisedb.com




Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-22 Thread Thomas Munro
On Sat, Jul 20, 2019 at 11:28 AM Peter Geoghegan  wrote:
> On Fri, Jul 19, 2019 at 4:14 PM Robert Haas  wrote:
> > I don't think this matters here at all. As long as there's only DML
> > involved, there won't be any lock conflicts anyway - everybody's
> > taking RowExclusiveLock or less, and it's all fine. If you update a
> > row in zheap, abort, and then try to update again before the rollback
> > happens, we'll do a page-at-a-time rollback in the foreground, and
> > proceed with the update; when we get around to applying the undo,
> > we'll notice that page has already been handled and skip the undo
> > records that pertain to it.  To get the kinds of problems I'm on about
> > here, somebody's got to be taking some more serious locks.
>
> If I'm not mistaken, you're tacitly assuming that you'll always be
> using zheap, or something sufficiently similar to zheap. It'll
> probably never be possible to UNDO changes to something like a GIN
> index on a zheap table, because you can never do that with sensible
> concurrency/deadlock behavior.
>
> I don't necessarily have a problem with that. I don't pretend to
> understand how much of a problem it is. Obviously it partially depends
> on what your ambitions are for this infrastructure. Still, assuming
> that I have it right, ISTM that UNDO/zheap/whatever should explicitly
> own this restriction.

I had a similar thought: you might regret that choice if you were
wanting to implement an AM with lock table-based concurrency control
(meaning that there are lock ordering concerns for row and page locks,
for DML statements, not just DDL).  That seemed a bit too far fetched
to mention before, but are you saying the same sort of concerns might
come up with indexes that support true undo (as opposed to indexes
that still need VACUUM)?

For comparison, ARIES[1] has no-deadlock rollbacks as a basic property
and reacquires locks during restart before new transactions are allowed
to execute.  In its model, the locks in question can be on things like
rows and pages.  We don't even use our lock table for those (except
for non-blocking SIREAD locks, irrelevant here).  After crash
recovery, if zheap encounters a row with pending rollback from an
aborted transaction, as usual it either needs to read an older version
from an undo log (for reads) or help execute the rollback before
updating (for writes).  That only requires page-at-a-time LWLocks
("latching"), so it's deadlock-free.  The only deadlock risk comes
from the need to acquire heavyweight locks on relations which
typically only conflict when you run DDL, so yeah, it's tempting to
worry a lot less about those than the fine grained lock traffic from
DML statements that DB2 and others have to deal with.

So spell out the two options again:

A.  Rollback can't deadlock.  You have to make sure you reliably hold
locks until rollback is completed (including some tricky new lock
transfer magic), and then reacquire them after recovery before new
transactions are allowed.  You could trivially achieve the restart
part by simply waiting until all rollback is executed before you allow
new transactions, but other systems including DB2 first acquire all
the locks in an earlier scan through the log, then allow new
connections, and then execute the rollback.  Acquiring them before new
transactions are allowed means that they must fit in the lock table and
there must be no conflicts among them if they were all granted as at
the moment you crashed or shut down.

B.  Rollback can deadlock or exhaust the lock table because we release
and reacquire some arbitrary time later.  No choice but to keep
retrying if anything goes wrong, and rollback is theoretically not
guaranteed to complete and you can contrive a workload that will never
make progress.  This amounts to betting that these problems will be
rare enough not to matter and that rollback will eventually make
progress, and that it should be fairly clear what's happening and why.

I might as well put the quote marks on now:  "Perhaps we could
implement A later."

[1] https://cs.stanford.edu/people/chrismre/cs345/rl/aries.pdf


--
Thomas Munro
https://enterprisedb.com




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-07-22 Thread David Rowley
On Mon, 22 Jul 2019 at 12:48, Tsunakawa, Takayuki
 wrote:
>
> From: David Rowley [mailto:david.row...@2ndquadrant.com]
> > I went back to the drawing board on this and I've added some code that 
> > counts
> > the number of times we've seen the table to be oversized and just shrinks
> > the table back down on the 1000th time.  6.93% / 1000 is not all that much.
>
> I'm afraid this kind of hidden behavior would appear mysterious to users.  
> They may wonder "Why is the same query fast at first in the session (5 or 6 
> times of execution), then gets slower for a while, and gets faster again?  Is 
> there something to tune?  Am I missing something wrong with my app (e.g. how 
> to use prepared statements)?"  So I prefer v5.

Another counter-argument to this is that there's already an
unexplained slowdown after you run a query which obtains a large
number of locks in a session, or when you use prepared statements and
a partitioned table with the default plan_cache_mode setting. Are we
not just righting a wrong here? Albeit, possibly 1000 queries later.

I am, of course, open to other ideas which solve the problem that v5
has, but failing that, I don't see v6 as all that bad.  At least all
the logic is contained in one function.  I know Tom wanted to steer
clear of heuristics to reinitialise the table, but most of the stuff
that was in the patch back when he complained was trying to track the
average number of locks over the previous N transactions, and his
concerns were voiced before I showed the 7% performance regression
with unconditionally rebuilding the table.
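
For anyone skimming, the counting heuristic described above could look
roughly like this -- a sketch only, where the threshold, the size
bound, and the rebuild helper are illustrative assumptions, not the v6
patch:

  #define LOCALLOCK_SHRINK_THRESHOLD 1000 /* shrink on the 1000th sighting */
  #define LOCALLOCK_EXPECTED_MAX     16   /* assumed "not oversized" bound */

  static int oversized_count = 0;

  /* assumed to be called at transaction end */
  static void
  MaybeShrinkLocalLockTable(void)
  {
      if (hash_get_num_entries(LockMethodLocalHash) <= LOCALLOCK_EXPECTED_MAX)
      {
          oversized_count = 0;    /* table is small again; reset the count */
          return;
      }

      if (++oversized_count >= LOCALLOCK_SHRINK_THRESHOLD)
      {
          /* Pay the rebuild cost only once per threshold-many sightings. */
          hash_destroy(LockMethodLocalHash);
          CreateLocalLockHash();  /* hypothetical re-initialization helper */
          oversized_count = 0;
      }
  }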

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: POC: Cleaning up orphaned files using undo logs

2019-07-22 Thread Dilip Kumar
On Mon, Jul 22, 2019 at 2:21 PM Amit Kapila  wrote:
>

I have reviewed 0012-Infrastructure-to-execute-pending-undo-actions,
Please find my comment so far.

1.
+ /* It shouldn't be discarded. */
+ Assert(!UndoRecPtrIsDiscarded(xact_urp));

I think comments can be added to explain why it shouldn't be discarded.

2.
+ /* Compute the offset of the uur_next in the undo record. */
+ offset = SizeOfUndoRecordHeader +
+ offsetof(UndoRecordTransaction, urec_progress);
+
in comment /uur_next/uur_progress

3.
+/*
+ * undo_record_comparator
+ *
+ * qsort comparator to handle undo record for applying undo actions of the
+ * transaction.
+ */
Function header formating is not in sync with other functions.

4.
+void
+undoaction_redo(XLogReaderState *record)
+{
+ uint8 info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+
+ switch (info)
+ {
+ case XLOG_UNDO_APPLY_PROGRESS:
+ undo_xlog_apply_progress(record);
+ break;

For HotStandby it doesn't make sense to apply this WAL, as this
progress is only required when we try to apply the undo actions after
a restart, but in HotStandby we never apply undo actions.

5.
+ Assert(from_urecptr != InvalidUndoRecPtr);
+ Assert(to_urecptr != InvalidUndoRecPtr);

we can use macros UndoRecPtrIsValid instead of checking like this.

6.
+ if ((slot == NULL) || (UndoRecPtrGetLogNo(urecptr) != slot->logno))
+ slot = UndoLogGetSlot(UndoRecPtrGetLogNo(urecptr), false);
+
+ Assert(slot != NULL);
We are passing missing_ok as false in UndoLogGetSlot.  But I'm not
sure why we expect that the undo log cannot be dropped.  In a
multi-log transaction, isn't it possible that the tablespace
containing the next undo log has already been dropped?

7.
+ */
+ do
+ {
+ BlockNumber progress_block_num = InvalidBlockNumber;
+ int i;
+ int nrecords;
  .
+ */
+ if (!UndoRecPtrIsValid(urec_ptr))
+ break;
+ } while (true);

I think we can convert the above loop to while(true) instead of
do..while, because there is no need for a do..while loop here.
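That is, roughly (a sketch of the suggested shape, body elided as in
the quoted diff):

  while (true)
  {
      /* ... process one batch of undo records, as in the patch ... */

      if (!UndoRecPtrIsValid(urec_ptr))
          break;
  }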

8.
+ if (last_urecinfo->uur->uur_info & UREC_INFO_LOGSWITCH)
+ {
+ UndoRecordLogSwitch *logswitch = last_urecinfo->uur->uur_logswitch;

IMHO, the caller of UndoFetchRecord should directly check
uur->uur_logswitch instead of uur_info & UREC_INFO_LOGSWITCH.
uur_info is set internally for inserting the tuple, and is checked
there to know what to insert and fetch, but I think the caller of
UndoFetchRecord should rely directly on the field, because ideally
all the fields in UnpackUndoRecord must be set, and uur_txt or
uur_logswitch will be allocated when those headers are present.  I
think this needs to be improved in the undo interface patch as well
(in UndoBulkFetchRecord).


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Identity columns should own only one sequence

2019-07-22 Thread Peter Eisentraut
On 2019-05-08 16:49, Laurenz Albe wrote:
> I believe we should have both:
> 
> - Identity columns should only use sequences with an INTERNAL dependency,
>   as in Peter's patch.

I have committed this.

> - When a column default is dropped, remove all dependencies between the
>   column and sequences.

There is no proposed patch for this, AFAICT.

So I have closed this commit fest item for now.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [Patch] PQconnectPoll() is blocked if target_session_attrs is read-write

2019-07-22 Thread Kyotaro Horiguchi
Hello.

At Mon, 22 Jul 2019 02:28:22 +, "Matsumura, Ryo" 
 wrote in 
<03040DFF97E6E54E88D3BFEE5F5480F74AC15BBD@G01JPEXMBYT04>
> Hi
> 
> # I rewrote my previous mail.
> 
> PQconnectPoll() is used as a method for asynchronous connection, both
> externally and internally.
> If a caller confirms that the socket is ready for writing or reading,
> as requested by the return value of the previous PQconnectPoll(), the
> next PQconnectPoll() must not block. But if the caller sets
> target_session_attrs to 'read-write', PQconnectPoll() may block.
> 
> Detail:
> If target_session_attrs is set to read-write, PQconnectPoll() calls
> PQsendQuery("SHOW transaction_read_only") although the previous return
> value was PGRES_POLLING_READING, not WRITING.
> As a result, PQsendQuery() may block in pqsecure_raw_write().
> 
> I attach a patch.
> 
> Regards
> Ryo Matsumura

First, this patch looks broken.

patched> if (conn->sversion >= 70400 &&
patched>   conn->target_session_attrs != NULL &&
patched>   strcmp(conn->target_session_attrs, "read-write") == 0)
patched> {
patched> }

Perhaps you did cut-n-paste instead of copy-n-paste.

I'm not sure such a small write just after reading can actually block,
but handling it this way makes things tidy.
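
For context, the contract at issue is the polling loop documented for
PQconnectPoll(); a minimal sketch (failure checks elided, conninfo
assumed to be a valid connection string):

  PGconn *conn = PQconnectStart(conninfo);
  PostgresPollingStatusType st = PGRES_POLLING_WRITING; /* initial state */

  while (st != PGRES_POLLING_OK && st != PGRES_POLLING_FAILED)
  {
      int    sock = PQsocket(conn);
      fd_set rfds, wfds;

      FD_ZERO(&rfds);
      FD_ZERO(&wfds);
      if (st == PGRES_POLLING_READING)
          FD_SET(sock, &rfds);
      else
          FD_SET(sock, &wfds);

      /* wait for exactly the readiness the previous call asked for */
      if (select(sock + 1, &rfds, &wfds, NULL, NULL) < 0)
          break;

      st = PQconnectPoll(conn);   /* must not block at this point */
  }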

You also need to update the corresponding documentation.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: POC: Cleaning up orphaned files using undo logs

2019-07-22 Thread Amit Kapila
On Thu, Jul 18, 2019 at 4:41 PM Thomas Munro  wrote:
>
> On Tue, Jul 16, 2019 at 8:39 AM Robert Haas  wrote:
> > Thomas has already objected to another proposal to add functions that
> > turn 32-bit XIDs into 64-bit XIDs.  Therefore, I feel confident in
> > predicting that he will likewise object to GetEpochForXid. I think
> > this needs to be changed somehow, maybe by doing what the XXX comment
> > you added suggests.
>
> Perhaps we should figure out how to write GetOldestFullXmin() and friends.
>
> For FinishPreparedTransaction(), the XXX comment sounds about right
> (TwoPhaseFileHeader should hold an fxid).
>

I think we can do that, but what about subxids in TwoPhaseFileHeader?
Shall we store them as fxids as well?  If we don't, it will appear
inconsistent; and if we do store subxids as fxids, then we need to
track them as fxids in TransactionStateData.  That might not be a
very big change, but it is certainly more work than just storing the
top-level fxid, or using GetEpochForXid as the patch currently does.
Another consideration is that changing subxids to fxids can increase
the size of the two-phase file for a xact having many
sub-transactions, which again might be okay, but I am not completely sure.
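
To make the size question concrete, a rough sketch of the two layouts
(field names and arrangement are illustrative; the real
TwoPhaseFileHeader carries more members):

  typedef struct TwoPhaseFileHeaderSketch
  {
      FullTransactionId fxid;      /* top-level fxid: 8 bytes either way */
      int32             nsubxacts; /* number of subtransactions */
      /* serialized after the header, one of: */
      /*   TransactionId     subxids[];  4 bytes per subxact (today)      */
      /*   FullTransactionId subxids[];  8 bytes per subxact (consistent) */
  } TwoPhaseFileHeaderSketch;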

Thoughts?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: pg_upgrade version checking questions

2019-07-22 Thread Peter Eisentraut
On 2019-04-04 15:40, Daniel Gustafsson wrote:
> On Wednesday, March 27, 2019 1:43 PM, Christoph Berg  wrote:
> 
>> Re: Daniel Gustafsson 2019-03-26 
>> pC-NMmh4vQLQP76YTwY4AuoD4OdNw9egikekyQpXFpgqmTlGjIZXOTd2W5RDZPpRski5N3ADRrLYgLk6QUuvmuT5fWC9acPAYyDU1AVxJcU=@yesql.se
>>
>>> 0003 - Make -B default to CWD and remove the exec_path check
>>> Defaulting to CWD for the new bindir has the side effect that the default
>>> sockdir is in the bin/ directory which may be less optimal.
>>
>> Hmm, I would have thought that the default for the new bindir is the
>> directory where pg_upgrade is located, not the CWD, which is likely to
>> be ~postgres or the like?
> 
> Yes, thinking on it that's obviously better.  The attached v2 repurposes the
> find_my_exec() check to make the current directory of pg_upgrade the default
> for new_cluster.bindir (the other two patches are left as they were).

0001-Only-allow-upgrades-by-the-same-exact-version-new-v2.patch

I don't understand what this does.  Please explain.


0002-Check-all-used-executables-v2.patch

I think we'd also need a check for pg_controldata.

Perhaps this comment could be improved:

/* these are only needed in the new cluster */

to

/* these are only needed for the target version */

(pg_dump runs on the old cluster but has to be of the new version.)


0003-Default-new-bindir-to-exec_path-v2.patch

I don't like how the find_my_exec() code has been moved around.  That
makes the modularity of the code worse.  Let's keep it where it was and
then structure it like this:

if -B was given:
    new_cluster.bindir = what was given for -B
else:
    # existing block

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: POC: converting Lists into arrays

2019-07-22 Thread David Rowley
On Mon, 22 Jul 2019 at 08:01, Tom Lane  wrote:
>
> I wrote:
> >> * Rationalize places that are using combinations of list_copy and
> >> list_concat, probably by inventing an additional list-concatenation
> >> primitive that modifies neither input.
>
> > I poked around to see what we have in this department.  There seem to
> > be several identifiable use-cases:
> > [ ... analysis ... ]
>
> Here's a proposed patch based on that.  I added list_concat_copy()
> and then simplified callers as appropriate.

I looked over this and only noted down one thing:

In estimate_path_cost_size, can you explain why list_concat_copy() is
needed here? I don't see remote_param_join_conds being used after
this, so might it be better to just get rid of remote_param_join_conds
and pass remote_conds to classifyConditions(), then just
list_concat()?

/*
* The complete list of remote conditions includes everything from
* baserestrictinfo plus any extra join_conds relevant to this
* particular path.
*/
remote_conds = list_concat_copy(remote_param_join_conds,
fpinfo->remote_conds);

classifyConditions() seems to create new lists, so it does not appear
that you have to worry about modifying the existing lists.
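
For anyone skimming, the semantics being weighed can be sketched like
this (my reading of the proposal, assuming the array-based List
representation; x, y, z and w stand for arbitrary Node pointers):

  List *a = list_make2(x, y);
  List *b = list_make2(z, w);

  List *ab1 = list_concat(a, b);      /* enlarges and returns a; b is
                                       * left untouched, so its storage
                                       * leaks unless freed -- see the
                                       * quote below */
  List *ab2 = list_concat_copy(a, b); /* fresh list; neither input is
                                       * modified */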


> It turns out there are a *lot* of places where list_concat() callers
> are now leaking the second input list (where before they just leaked
> that list's header).  So I've got mixed emotions about the choice not
> to add a variant function that list_free's the second input.  On the
> other hand, the leakage probably amounts to nothing significant in
> all or nearly all of these places, and I'm concerned about the
> readability/understandability loss of having an extra version of
> list_concat.  Anybody have an opinion about that?

In some of these places, for example, the calls to
generate_join_implied_equalities_normal() and
generate_join_implied_equalities_broken(), I wonder, since these are
static functions, if we could just change the function signatures to
accept a List to append to.  This could save us from having to perform
any additional pallocs at all, so there'd be no need to free anything
or worry about any leaks.  The performance of the code would be
improved too.  There may be other cases where we can do something
similar, but I wouldn't vote to change signatures of non-static
functions for that.

If we do end up with another function, it might be nice to stay away
from using "concat" in the name. I think we might struggle if there
are too many variations on concat and there's a risk we'll use the
wrong one.  If we need this then perhaps something like
list_append_all() might be a better choice... I'm struggling to build
a strong opinion on this though. (I know that because I've deleted
this paragraph 3 times and started again, each time with a different
opinion.)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: progress report for ANALYZE

2019-07-22 Thread Kyotaro Horiguchi
Hello.

# It's very good timing, as you came in just when I have some time
# after finishing a quite nerve-wracking task.

At Mon, 22 Jul 2019 15:02:16 +0900, Tatsuro Yamada 
 wrote in 
<0876b4fe-26fb-ca32-f179-c696fa3dd...@nttcom.co.jp>
> >> 3785|13599|postgres|16384|f|16384|analyzing complete|0|0 <-- Added and
> >> fixed. :)
> >> -
> >
> > I have some comments.
> > +   const int   index[] = {
> > +   PROGRESS_ANALYZE_PHASE,
> > +   PROGRESS_ANALYZE_TOTAL_BLOCKS,
> > +   PROGRESS_ANALYZE_BLOCKS_DONE
> > +   };
> > +   const int64 val[] = {
> > +   PROGRESS_ANALYZE_PHASE_ANALYSIS,
> > +   0, 0
> > Do you oppose to leaving the total/done blocks alone here:p?
> 
> 
> Thanks for your comments!
> I created a new patch based on your comments because Alvaro allowed me
> to work on revising the patch. :)
> 
> 
> Ah, I revised it to remove "0, 0".

Thanks. For a second I thought that
PROGRESS_ANALYZE_PHASE_ANALYSIS was lost, but it lives on under a
new name.

> > +  BlockNumber  nblocks;
> > +  double       blksdone = 0;
> > Why is it a double even though blksdone is of the same domain
> > as nblocks? And finally:
> > +pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE,
> > +   ++blksdone);
> > It is converted into int64.
> 
> 
> Fixed.
> But is it suitable to use BlockNumber instead of int64?

Yeah, I didn't mean that we should use int64 there. Sorry for
the confusing comment. I meant that blksdone should be of type
BlockNumber.

> > +  WHEN 2 THEN 'analyzing sample'
> > +  WHEN 3 THEN 'analyzing sample (extended stats)'
> > I think we should avoid parenthesized phrases as far as
> > not-necessary. That makes the column unnecessarily long. The
> > phase is internally called as "compute stats" so *I* would prefer
> > something like the followings:
> > +  WHEN 2 THEN 'computing stats'
> > +  WHEN 3 THEN 'computing extended stats'
> > +  WHEN 4 THEN 'analyzing complete'
> > And if you are intending by this that (as mentioned in the
> > documentation) "working to complete this analyze", this would be:
> > +  WHEN 4 THEN 'completing analyze'
> > +  WHEN 4 THEN 'finalizing analyze'
> 
> 
> I have no strong opinion, so I changed the phase names based on
> your suggestions, as follows:
> 
>   WHEN 2 THEN 'computing stats'
>   WHEN 3 THEN 'computing extended stats'
>   WHEN 4 THEN 'finalizing analyze'
> 
> However, I'd like to get comments from hackers so we can reach a
> consensus about the names.

Agreed. Choosing the right words, especially, is not my strong suit.

> > + Process ID of backend.
> > of "the" backend. ? And period is not attached on all
> > descriptions consisting of a single sentence.
> >
> > + OID of the database to which this backend is
> > connected.
> > + Name of the database to which this backend is
> > connected.
> > "database to which .. is connected" is phrased as "database this
> > backend is connected to" in pg_stat_acttivity. (Just for consistency)
> 
> 
> I checked the sentences on other views of progress monitor (VACUUM,
> CREATE INDEX and CLUSTER), and they are same sentence. Therefore,
> I'd like to keep it as is. :)

Oh, I see where the wordings came from. But no period appears after
the sentence when it is the only one in a description in the other
system view tables. I think the progress view tables should be
corrected to follow that convention.

> > + Whether the current scan includes legacy inheritance
> > children.
> > This apparently excludes partition tables but actually it is
> > included.
> >
> >"Whether scanning through child tables" ?
> > I'm not sure "child tables" is established as the word to mean
> > both "partition tables and inheritance children"..
> 
> 
> Hmm... I'm also not sure but I fixed as you suggested.

Yeah, I'm also not sure the suggestion is good enough as is.


> > +   The table being scanned (differs from relid
> > +   only when processing partitions or inheritance children).
> > Is  needed? And I think the parentheses are not needed.
> >OID of the table currently being scanned. Can differ from relid
> >when analyzing tables that have child tables.
> 
> 
> How about:
>   OID of the table currently being scanned.
>   It might be different from relid when analyzing tables that have child
>   tables.
> 
> 
> 
> > +   Total number of heap blocks to scan in the current table.
> > Number of heap blocks on scanning_table to scan?
> > It might be better that this description describes that this
> > and the next column is meaningful only while the phase
> > "scanning table".
> 
> 
> How about:
>   Total number of heap blocks in the scanning_table.

(To me,) that seems like it shows blocks including all
descendants.

Re: Memory Accounting

2019-07-22 Thread Heikki Linnakangas

On 18/07/2019 21:24, Jeff Davis wrote:

Previous discussion:
https://postgr.es/m/1407012053.15301.53.camel%40jeff-desktop

This patch introduces a way to ask a memory context how much memory it
currently has allocated. Each time a new block (not an individual
chunk, but a new malloc'd block) is allocated, it increments a struct
member; and each time a block is free'd, it decrements the struct
member. So it tracks blocks allocated by malloc, not what is actually
used for chunks allocated by palloc.

The purpose is for Memory Bounded Hash Aggregation, but it may be
useful in more cases as we try to better adapt to and control memory
usage at execution time.


Seems handy.


* I changed it to only update mem_allocated for the current context,
not recursively for all parent contexts. It's now up to the function
that reports memory usage to recurse or not (as needed).


Is that OK for memory bounded hash aggregation? Might there be a lot of 
sub-contexts during hash aggregation?
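
For reference, the scheme under discussion can be sketched roughly as
follows -- mem_allocated is the patch's field, while the surrounding
code and the shape of the reporting function are my assumptions:

  /* Block-level accounting: counts malloc'd blocks, not palloc'd chunks. */
  static void *
  ContextBlockAlloc(MemoryContext context, Size size)
  {
      void *block = malloc(size);

      if (block != NULL)
          context->mem_allocated += size;
      return block;
  }

  /* The reporting side decides whether to recurse into children. */
  Size
  MemoryContextMemAllocated(MemoryContext context, bool recurse)
  {
      Size total = context->mem_allocated;

      if (recurse)
      {
          MemoryContext child;

          for (child = context->firstchild; child; child = child->nextchild)
              total += MemoryContextMemAllocated(child, true);
      }
      return total;
  }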


- Heikki




Re: Compile from source using latest Microsoft Windows SDK

2019-07-22 Thread Michael Paquier
On Mon, Jul 22, 2019 at 04:01:46PM +0800, Peifeng Qiu wrote:
>> but it's really only a major issue for VS2019
>
> VS2019 will use the latest v10 SDK by default. So no need to install 8.1
> for VS2019.

Yes, FWIW, I have tested with VS2019 when committing 2b1394f, and in
this case only the v10 SDK got installed, with no actual issues
related to the dependency of the SDK reported.  In this case I have
installed VS using the community installer provided by Microsoft.

>> I guess we might need a test for what SDK is available?
> 
> We can just use the WindowsSDKVersion environment variable to
> determine the SDK for current cmd session. It's set when you start
> the Visual Studio Prompt or call one bat script.  Developers can
> choose the right version best suit their need. Detecting all
> installed SDK version can be done with some registry magic but I
> think that's not necessary in this case.

This looks more sensible to do if the environment variable is
available.  Looking around, this variable is available when using the
command prompt for native tools.  So using it sounds like a good idea
to me if it exists.
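
As a concrete illustration, the check could take roughly this shape
(a sketch in C for readability -- the MSVC build scripts themselves
are Perl, and the exact formatting of the variable, often including a
trailing backslash, is an assumption):

  #include <stdio.h>
  #include <stdlib.h>

  int
  main(void)
  {
      const char *sdkver = getenv("WindowsSDKVersion");

      if (sdkver != NULL && sdkver[0] != '\0')
          printf("Building against Windows SDK %s\n", sdkver);
      else
          printf("WindowsSDKVersion not set; using the toolchain default\n");
      return 0;
  }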
--
Michael


signature.asc
Description: PGP signature


Re: Compile from source using latest Microsoft Windows SDK

2019-07-22 Thread Peifeng Qiu
> For VS2017, the 8.1 SDK is part of the optional package set
Yeah, if you install the 8.1 SDK, VS2017 can compile. I installed
VS2017 using the GUI installer. The main page has big checkboxes for
package sets like C++, .NET, Azure, etc. Checking C++ will only
install the IDE and the v10 SDK; the 8.1 SDK is in the detailed list
on the side panel.

>but it's really only a major issue for VS2019
VS2019 will use the latest v10 SDK by default. So no need to install 8.1
for VS2019.

> I guess we might need a test for what SDK is available?
We can just use the WindowsSDKVersion environment variable to
determine the SDK for the current cmd session. It's set when you start
the Visual Studio Prompt or call the corresponding bat script.
Developers can choose the version that best suits their needs.
Detecting all installed SDK versions can be done with some registry
magic, but I think that's not necessary in this case.

We should change the title of the patch to "compile from source with
VS2017 and SDK v10", since that's the only problematic combination.
Our need is to compile our own tools that link to libpq and the latest
VC runtime. So libpq must also be linked with the same VC runtime, and
thus use the same SDK version.

Best regards,
Peifeng Qiu


Re: Tid scan improvements

2019-07-22 Thread Edmund Horner
On Mon, 22 Jul 2019 at 19:25, David Rowley 
> On Sat, 20 Jul 2019 at 06:21, Andres Freund  wrote:
> > Yea, I was thinking of something like 2.  We already have a few extra
> > types of scan nodes (bitmap heap and sample scan), it'd not be bad to
> > add another. And as you say, they can just share most of the guts: For
> > heap I'd just implement pagemode, and perhaps split heapgettup_pagemode
> > into two parts (one to do the page processing, the other to determine
> > the relevant page).
> >
> > You say that we'd need new fields in HeapScanDescData - not so sure
> > about that, it seems feasible to just provide the boundaries in the
> > call? But I think it'd also just be fine to have the additional fields.
>
> Thanks for explaining.
>
> I've set the CF entry for the patch back to waiting on author.
>
> I think if we get this part the way Andres would like it, then we're
> pretty close.

I've moved the code in question into heapam, with:

  * a new scan type SO_TYPE_TIDRANGE (renumbering some of the other
enums).

  * a wrapper table_beginscan_tidrange(Relation rel, Snapshot snapshot);
I'm not sure whether we need scankeys and the other flags.

  * a new optional callback scan_settidrange(TableScanDesc scan,
ItemPointer startTid, ItemPointer endTid) with wrapper
table_scan_settidrange.
I thought about combining it with table_beginscan_tidrange but we're not
really doing that with any of the other beginscan methods.

  * another optional callback scan_getnexttidrangeslot.  The presence of
these two callbacks indicates that TidRangeScan is supported for
this relation.

Let me know if you can think of better names.
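
For concreteness, my reading of how that pair might sit in
TableAmRoutine -- the member names come from the mail above, while the
exact signatures are guesses rather than settled API:

  typedef struct TableAmRoutine
  {
      /* ... existing callbacks elided ... */

      /* optional: set the TID bounds for an SO_TYPE_TIDRANGE scan */
      void        (*scan_settidrange) (TableScanDesc scan,
                                       ItemPointer startTid,
                                       ItemPointer endTid);

      /* optional: next tuple within the configured TID range */
      bool        (*scan_getnexttidrangeslot) (TableScanDesc scan,
                                               ScanDirection direction,
                                               TupleTableSlot *slot);
  } TableAmRoutine;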

However, the heap_getnexttidrangeslot function is just the same
iterative code calling heap_getnextslot and checking the tuples
against the tid range (two fields added to the ScanDesc).

I'll have to spend a bit of time looking at heapgettup_pagemode to
figure how to split it as Andres suggests.

Thanks,
Edmund




Re: [PATCH] minor bugfix for pg_basebackup (9.6 ~ )

2019-07-22 Thread Michael Paquier
On Sat, Jul 20, 2019 at 10:04:19AM +0900, Michael Paquier wrote:
> This restriction is unlikely going to be removed, still I would rather
> keep the escaped logic in pg_basebackup.  This is the usual,
> recommended coding pattern, and there is a risk that folks refer to
> this code block for their own fancy stuff, spreading the problem.  The
> intention behind the code is to use an escaped name as well.  For 
> those reasons your patch is fine by me.

Attempting to use a slot with an unsupported set of characters will
lead beforehand to a failure when trying to fetch the WAL segments
with START_REPLICATION, meaning that this spot will never be reached
and that there is no active bug, but for the sake of consistency I see
no problems with applying the fix on HEAD.  So, are there any
objections with that?
--
Michael


signature.asc
Description: PGP signature


Re: Tid scan improvements

2019-07-22 Thread David Rowley
On Sat, 20 Jul 2019 at 06:21, Andres Freund  wrote:
>
> Hi,
>
> On 2019-07-19 13:54:59 +1200, David Rowley wrote:
> > Did you imagined two additional callbacks, 1 to set the TID range,
> > then one to scan it?  Duplicating the logic in heapgettup_pagemode()
> > and heapgettup() looks pretty horrible, but I guess we could add a
> > wrapper around it that loops until it gets the first tuple and bails
> > once it scans beyond the final tuple.
> >
> > Is that what you had in mind?
>
> Yea, I was thinking of something like 2.  We already have a few extra
> types of scan nodes (bitmap heap and sample scan), it'd not be bad to
> add another. And as you say, they can just share most of the guts: For
> heap I'd just implement pagemode, and perhaps split heapgettup_pagemode
> into two parts (one to do the page processing, the other to determine
> the relevant page).
>
> You say that we'd need new fields in HeapScanDescData - not so sure
> about that, it seems feasible to just provide the boundaries in the
> call? But I think it'd also just be fine to have the additional fields.

Thanks for explaining.

I've set the CF entry for the patch back to waiting on author.

I think if we get this part the way Andres would like it, then we're
pretty close.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: progress report for ANALYZE

2019-07-22 Thread Tatsuro Yamada

Hi Horiguchi-san!


On 2019/07/11 19:56, Kyotaro Horiguchi wrote:

Hello.

At Tue, 9 Jul 2019 17:38:44 +0900, Tatsuro Yamada  
wrote in <244cb241-168b-d6a9-c45f-a80c34cdc...@nttcom.co.jp>

Hi Alvaro, Anthony, Julien and Robert,

On 2019/07/09 3:47, Julien Rouhaud wrote:

On Mon, Jul 8, 2019 at 8:44 PM Robert Haas 
wrote:


On Mon, Jul 8, 2019 at 2:18 PM Alvaro Herrera
 wrote:

Yeah, I got the impression that that was determined to be the
desirable
behavior, so I made it do that, but I'm not really happy about it
either.  We're not too late to change the CREATE INDEX behavior, but
let's discuss what is it that we want.


I don't think I intended to make any such determination -- which
commit do you think established this as the canonical behavior?

I propose that once a field is set, we should leave it set until the
end.

+1
Note that this patch is already behaving like that if the table only
contains dead rows.


+1 from me.


3785|13599|postgres|16384|f|16384|analyzing complete|0|0 <-- Added and
fixed. :)
-


I have some comments.

+   const int   index[] = {
+   PROGRESS_ANALYZE_PHASE,
+   PROGRESS_ANALYZE_TOTAL_BLOCKS,
+   PROGRESS_ANALYZE_BLOCKS_DONE
+   };
+   const int64 val[] = {
+   PROGRESS_ANALYZE_PHASE_ANALYSIS,
+   0, 0

Do you oppose to leaving the total/done blocks alone here:p?



Thanks for your comments!
I created a new patch based on your comments because Alvaro allowed me
to work on revising the patch. :)


Ah, I revised it to remove "0, 0".




+  BlockNumber  nblocks;
+  double       blksdone = 0;

Why is it a double even though blksdone is of the same domain
as nblocks? And finally:

+pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE,
+   ++blksdone);

It is converted into int64.



Fixed.
But is it suitable to use BlockNumber instead of int64?


 

+  WHEN 2 THEN 'analyzing sample'
+  WHEN 3 THEN 'analyzing sample (extended stats)'

I think we should avoid parenthesized phrases as far as
not-necessary. That makes the column unnecessarily long. The
phase is internally called as "compute stats" so *I* would prefer
something like the followings:

+  WHEN 2 THEN 'computing stats'
+  WHEN 3 THEN 'computing extended stats'



+  WHEN 4 THEN 'analyzing complete'

And if you are intending by this that (as mentioned in the
documentation) "working to complete this analyze", this would be:

+  WHEN 4 THEN 'completing analyze'
+  WHEN 4 THEN 'finalizing analyze'



I have no strong opinion, so I changed the phase names based on
your suggestions, as follows:

  WHEN 2 THEN 'computing stats'
  WHEN 3 THEN 'computing extended stats'
  WHEN 4 THEN 'finalizing analyze'

However, I'd like to get comments from hackers so we can reach a
consensus about the names.
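
A consolidated sketch of the resulting CASE expression (phases 0 and 1
are my assumptions for the earlier phases; 2-4 use the names proposed
above; none of this is committed):

  CASE s.param1 WHEN 0 THEN 'initializing'
                WHEN 1 THEN 'scanning table'
                WHEN 2 THEN 'computing stats'
                WHEN 3 THEN 'computing extended stats'
                WHEN 4 THEN 'finalizing analyze'
  END AS phase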




+ Process ID of backend.

of "the" backend. ? And period is not attached on all
descriptions consisting of a single sentence.

+  OID of the database to which this backend is connected.
+  Name of the database to which this backend is connected.

"database to which .. is connected" is phrased as "database this
backend is connected to" in pg_stat_acttivity. (Just for consistency)



I checked the sentences on other views of progress monitor (VACUUM,
CREATE INDEX and CLUSTER), and they are same sentence. Therefore,
I'd like to keep it as is. :)




+  Whether the current scan includes legacy inheritance 
children.

This apparently excludes partition tables but actually it is
included.

   "Whether scanning through child tables" ?

I'm not sure "child tables" is established as the word to mean
both "partition tables and inheritance children"..



Hmm... I'm also not sure but I fixed as you suggested.




+   The table being scanned (differs from relid
+   only when processing partitions or inheritance children).

Is  needed? And I think the parentheses are not needed.

   OID of the table currently being scanned. Can differ from relid
   when analyzing tables that have child tables.



How about:
  OID of the table currently being scanned.
  It might be different from relid when analyzing tables that have child tables.




+   Total number of heap blocks to scan in the current table.

Number of heap blocks on scanning_table to scan?

It might be better that this description describes that this
and the next column is meaningful only while the phase
"scanning table".



How about:
  Total number of heap blocks in the scanning_table.





+   The command is currently scanning the table.

"sample(s)" comes somewhat abruptly in the next item. Something
like "The command is currently scanning the table
scanning_table to obtain samples"
might be better.



Fixed.