Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation

2021-03-09 Thread Jeevan Ladhe
On Wed, Mar 10, 2021 at 10:44 AM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> Hi,
>
> While providing thoughts on [1], I observed that the error messages
> that are emitted while adding foreign, temporary and unlogged tables
> can be improved a bit from the existing [2] to [3].
>

+1 for improving the error messages here.


> Attaching a small patch. Thoughts?
>

I had a look at the patch and it looks good to me. However, now that the
specific kind of table is named in the error message itself, the error detail
seems to give redundant information, but others might have different thoughts.

The patch itself looks good otherwise. Also, make check and the postgres_fdw
tests are passing.

Regards,
Jeevan Ladhe


Re: a misbehavior of partition row movement (?)

2021-03-09 Thread Masahiko Sawada
On Fri, Feb 26, 2021 at 4:30 PM Amit Langote  wrote:
>
> Hi Rahila,
>
> On Wed, Feb 24, 2021 at 3:07 PM Rahila Syed  wrote:
> >> > I think the documentation update is missing from the patches.
> >>
> >> Hmm, I don't think we document the behavior that is improved by the v3
> >> patches as a limitation of any existing feature, neither of foreign
> >> keys referencing partitioned tables nor of the update row movement
> >> feature.  So maybe there's nothing in the existing documentation that
> >> is to be updated.
> >>
> >> However, the patch does add a new error message for a case that the
> >> patch doesn't handle, so maybe we could document that as a limitation.
> >> Not sure if in the Notes section of the UPDATE reference page which
> >> has some notes on row movement or somewhere else.  Do you have
> >> suggestions?
> >>
> > You are right, I could not find any direct explanation of the impact of
> > row movement during UPDATE on a referencing table in the PostgreSQL docs.
> >
> > The two documents that come close are either:
>
> Thanks for looking those up.
>
> > 1. https://www.postgresql.org/docs/13/trigger-definition.html .
> > The para starting with "If an UPDATE on a partitioned table causes a row to 
> > move to another partition"
> > However, this does not describe the behaviour of  internal triggers which 
> > is the focus of this patch.
>
> The paragraph does talk about a very related topic, but, like you, I
> am not very excited about adding a line here about what we're doing
> with internal triggers.
>
> > 2. Another one like you mentioned,  
> > https://www.postgresql.org/docs/11/sql-update.html
> > This has explanation for row movement behaviour for partitioned table but 
> > does not explain
> > any impact of such behaviour on a referencing table.
> > I think it is worth adding some explanation in this document, i.e.,
> > explaining the impact on referencing tables here, as it already describes
> > the behaviour of UPDATE on a partitioned table.
>
> ISTM the description of the case that will now be prevented seems too
> obscure to make into a documentation line, but I tried.  Please check.
>

I looked at the 0001 patch and here are random comments. Please ignore
a comment if it is already discussed.

---
@@ -9077,7 +9102,8 @@ addFkRecurseReferenced(List **wqueue, Constraint
*fkconstraint, Relation rel,
   partIndexId, constrOid, numfks,
   mapped_pkattnum, fkattnum,
   pfeqoperators, ppeqoperators, ffeqoperators,
-  old_check_ok);
+  old_check_ok,
+  deleteTriggerOid, updateTriggerOid);

/* Done -- clean up (but keep the lock) */
table_close(partRel, NoLock);
@@ -9126,8 +9152,12 @@ addFkRecurseReferencing(List **wqueue,
Constraint *fkconstraint, Relation rel,
Relation pkrel, Oid indexOid, Oid parentConstr,
int numfks, int16 *pkattnum, int16 *fkattnum,
Oid *pfeqoperators, Oid *ppeqoperators, Oid
*ffeqoperators,
-   bool old_check_ok, LOCKMODE lockmode)
+   bool old_check_ok, LOCKMODE lockmode,
+   Oid parentInsTrigger, Oid parentUpdTrigger)
 {

We need to update the function comments as well.

---
I think it's better to add comments for the newly added functions such as
GetForeignKeyActionTriggers() and GetForeignKeyCheckTriggers().
Those functions have no comments at all.

BTW, two of the four newly added functions, AttachForeignKeyCheckTriggers()
and DetachForeignKeyCheckTriggers(), have only one caller each. Can we paste
each function's body at its call site instead?

---
/*
 * If the referenced table is a plain relation, create the action triggers
 * that enforce the constraint.
 */
-   if (pkrel->rd_rel->relkind == RELKIND_RELATION)
-   {
-   createForeignKeyActionTriggers(rel, RelationGetRelid(pkrel),
-  fkconstraint,
-  constrOid, indexOid);
-   }
+   createForeignKeyActionTriggers(rel, RelationGetRelid(pkrel),
+  fkconstraint,
+  constrOid, indexOid,
+  parentDelTrigger, parentUpdTrigger,
+  &deleteTriggerOid, &updateTriggerOid);

The comment needs to be updated.

---
 /*
  * If the referencing relation is a plain table, add the check triggers to
  * it and, if necessary, schedule it to be checked in Phase 3.
  *
  * If the relation is partitioned, drill down to do it to its partitions.
  */
+createForeignKeyCheckTriggers(RelationGetRelid(rel),
+  RelationGetRelid(pkrel),
+  fkconstraint,
+  parentConstr,
+  

Re: Freeze the inserted tuples during CTAS?

2021-03-09 Thread Paul Guo
> On Mar 3, 2021, at 1:35 PM, Masahiko Sawada  wrote:
>> On Sun, Feb 21, 2021 at 4:46 PM Paul Guo  wrote:
>> Attached is the v2 version that fixes a test failure due to plan change 
>> (bitmap index scan -> index only scan).

> I think this is a good idea.

> BTW, how much does this patch affect the CTAS performance? I expect
> it's negligible but If there is much performance degradation due to
> populating visibility map, it might be better to provide a way to
> disable it.

Yes,  this is a good suggestion. I did a quick test yesterday.

Configuration: shared_buffers = 1280M and the test system memory is 7G.

Test queries:
  checkpoint;
  \timing
  create table t1 (a, b, c, d) as select i,i,i,i from 
generate_series(1,2000) i;
  \timing
  select pg_size_pretty(pg_relation_size('t1'));

Here are the running time:

HEAD   : Time: 10299.268 ms (00:10.299)  + 1537.876 ms (00:01.538)  
   
Patch  : Time: 12257.044 ms (00:12.257)  + 14.247 ms
 

The table size is 800+ MB, so the table should fit entirely in shared buffers.
I was surprised to see that the patch increases the CTAS time by 19.x%, and
also that it is no better than "CTAS + VACUUM" on HEAD. In theory the
visibility map buffer changes should not have that much impact. I looked at
the related code again (heap_insert()). I believe the overhead could decrease
along with some of the discussed CTAS optimizations (multi-insert, raw-insert,
etc.).

I tested COPY also. COPY FREEZE does not involve much more overhead than plain
COPY according to the experiment results below. COPY uses multi-insert; other
than that, there seems to be no relevant difference from CTAS when writing a
new table.

COPY TO + VACUUM
Time: 8826.995 ms (00:08.827) + 1599.260 ms (00:01.599)
COPY TO FREEZE + VACUUM
Time: 8836.107 ms (00:08.836) + 13.581 ms

So maybe we should think about doing freeze in CTAS after the CTAS performance
optimizations land?

By the way, REFRESH MATERIALIZED VIEW does freeze by default, and a matview is
quite similar to CTAS. I tested it also and the conclusion is similar to that
of CTAS. I'm not sure why FREEZE was enabled there, though; maybe I missed
something?



Re: Occasional tablespace.sql failures in check-world -jnn

2021-03-09 Thread Michael Paquier
On Mon, Mar 08, 2021 at 11:53:57AM +0100, Peter Eisentraut wrote:
> On 09.12.20 08:55, Michael Paquier wrote:
>> ...  Because we may still introduce this problem again if some new
>> stuff uses src/test/pg_regress in a way similar to pg_upgrade,
>> triggering again tablespace-setup.  Something like the attached may be
>> enough, though I have not spent much time checking the surroundings,
>> Windows included.
> 
> This patch looks alright to me.

So, I have spent more time checking the surroundings of this patch,
and finally applied it.  Thanks for the review, Peter.
--
Michael




Re: shared-memory based stats collector

2021-03-09 Thread Fujii Masao




On 2021/03/10 12:10, Kyotaro Horiguchi wrote:

At Tue, 9 Mar 2021 23:24:10 +0900, Fujii Masao  
wrote in



On 2021/03/09 16:51, Kyotaro Horiguchi wrote:

At Sat, 6 Mar 2021 00:32:07 +0900, Fujii Masao 
wrote in

I don't think that we should treat a non-zero exit status as a crash,
as before. Otherwise, when archive_command fails on a signal,
the archiver emits a FATAL error, which leads to a server restart.

Sounds reasonable. Now the archiver is treated the same way as the wal
receiver.  Specifically, exit(1) doesn't cause a server restart.


Thanks!

-   if (PgArchStartupAllowed())
-   PgArchPID = pgarch_start();

In the latest patch, why did you remove the code to restart a new archiver
in reaper()? When the archiver dies, I think a new archiver should be restarted
like the current reaper() does. Otherwise, the restart of the archiver can be
delayed until the next cycle of ServerLoop, which may take time.


Agreed. I moved the code back to its original place and added the crash
handling code. I also added a phrase to the comment.

+* Was it the archiver?  If exit status is zero (normal) or one 
(FATAL
+* exit), we assume everything is all right just like normal 
backends
+* and just try to restart a new one so that we immediately 
retry
  
+* archiving of remaining files. (If fail, we'll try again in 
future




"of" of "archiving of remaining" should be replaced with "the", or removed?


Just for the record: previously, LogChildExit() was called and the following
LOG message was output when the archiver reported a FATAL error. OTOH, the
patch prevents that, and the following LOG message is not output at a FATAL
exit of the archiver. But I don't think that message is required in that case,
because a FATAL message indicating much the same thing is already output.
Therefore, I'm OK with the patch.

LOG:  archiver process (PID 46418) exited with exit code 1



I read v50_003 patch.

When the archiver dies, shouldn't ProcGlobal->archiverLatch be reset to NULL,
like walreceiver does in WalRcvDie()?


Unlike walwriter and checkpointer, the archiver (like walreceiver) may die
while the server is running. Leaving the latch pointer alone may lead to
nudging the wrong process if another one takes over the same procarray slot.
Added pgarch_die() to do that.


Thanks!

+   if (IsUnderPostmaster && ProcGlobal->archiverLatch)
+   SetLatch(ProcGlobal->archiverLatch);

The latch can be reset to NULL in pgarch_die() between the if-condition and
SetLatch(), which would be problematic. Probably we should protect the access
to the latch with a spinlock, like we do for walreceiver's latch?
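
For illustration, the walreceiver-style pattern applied to the archiver latch
might look roughly like this (the "archiverMutex" field is only a placeholder
for this sketch; it doesn't exist today):

Latch  *latch;

/*
 * Fetch the latch pointer while holding the spinlock, so that pgarch_die()
 * cannot change it underneath us; it is set to NULL when the archiver exits.
 */
SpinLockAcquire(&ProcGlobal->archiverMutex);	/* placeholder mutex */
latch = ProcGlobal->archiverLatch;
SpinLockRelease(&ProcGlobal->archiverMutex);

if (latch)
	SetLatch(latch);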




(I moved the archiverLatch to just after checkpointerLatch in this version.)


In pgarch.c, #include "postmaster/fork_process.h" seems no longer necessary.


Right. That's not due to this patch; postmaster.h, dsm.h and pg_shmem.h
are not used. (fd.h would not otherwise be necessary, but pgarch.c uses
AllocateDir().)


+   if (strcmp(argv[1], "--forkarch") == 0)
+   {

Why is this necessary? I was thinking that "--forkboot" handles archiver
in SubPostmasterMain().


Yeah, the corresponding code is removed in the same patch at the same
time.

The attached is v51 patchset.


Thanks a lot!

Regards,


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [PATCH] Identify LWLocks in tracepoints

2021-03-09 Thread Craig Ringer
On Wed, 3 Mar 2021 at 20:50, David Steele  wrote:

> On 1/22/21 6:02 AM, Peter Eisentraut wrote:
>
> This patch set no longer applies:
> http://cfbot.cputube.org/patch_32_2927.log.
>
> Can we get a rebase? Also marked Waiting on Author.
>

Rebased as requested.

I'm still interested in whether Andres will be able to do anything about
identifying LWLocks in a cross-backend manner. But this work doesn't really
depend on that; it'd benefit from it, but would be easily adapted to it
later if needed.
From 36c7ddcbca2dbbcb2967f01cb92aa1f61620c838 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 19 Nov 2020 17:38:45 +0800
Subject: [PATCH 1/4] Pass the target LWLock* and tranche ID to LWLock
 tracepoints

Previously the TRACE_POSTGRESQL_LWLOCK_ tracepoints only received a
pointer to the LWLock tranche name. This made it impossible to identify
individual locks.

Passing the lock pointer itself isn't perfect. If the lock is allocated inside
a DSM segment then it might be mapped at a different address in different
backends. It's safe to compare lock pointers between backends (assuming
!EXEC_BACKEND) if they're in the individual lock tranches or an
extension-requested named tranche, but not necessarily for tranches in
BuiltinTrancheIds or tranches >= LWTRANCHE_FIRST_USER_DEFINED that were
directly assigned with LWLockNewTrancheId(). Still, it's better than nothing;
the pointer is stable within a backend, and usually between backends.
---
 src/backend/storage/lmgr/lwlock.c | 35 +++
 src/backend/utils/probes.d| 18 +---
 2 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 8cb6a6f042..5c8744d316 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -1321,7 +1321,8 @@ LWLockAcquire(LWLock *lock, LWLockMode mode)
 #endif
 
 		LWLockReportWaitStart(lock);
-		TRACE_POSTGRESQL_LWLOCK_WAIT_START(T_NAME(lock), mode);
+		TRACE_POSTGRESQL_LWLOCK_WAIT_START(T_NAME(lock), mode, lock,
+lock->tranche);
 
 		for (;;)
 		{
@@ -1343,7 +1344,8 @@ LWLockAcquire(LWLock *lock, LWLockMode mode)
 		}
 #endif
 
-		TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(T_NAME(lock), mode);
+		TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(T_NAME(lock), mode, lock,
+lock->tranche);
 		LWLockReportWaitEnd();
 
 		LOG_LWDEBUG("LWLockAcquire", lock, "awakened");
@@ -1352,7 +1354,7 @@ LWLockAcquire(LWLock *lock, LWLockMode mode)
 		result = false;
 	}
 
-	TRACE_POSTGRESQL_LWLOCK_ACQUIRE(T_NAME(lock), mode);
+	TRACE_POSTGRESQL_LWLOCK_ACQUIRE(T_NAME(lock), mode, lock, lock->tranche);
 
 	/* Add lock to list of locks held by this backend */
 	held_lwlocks[num_held_lwlocks].lock = lock;
@@ -1403,14 +1405,16 @@ LWLockConditionalAcquire(LWLock *lock, LWLockMode mode)
 		RESUME_INTERRUPTS();
 
 		LOG_LWDEBUG("LWLockConditionalAcquire", lock, "failed");
-		TRACE_POSTGRESQL_LWLOCK_CONDACQUIRE_FAIL(T_NAME(lock), mode);
+		TRACE_POSTGRESQL_LWLOCK_CONDACQUIRE_FAIL(T_NAME(lock), mode, lock,
+lock->tranche);
 	}
 	else
 	{
 		/* Add lock to list of locks held by this backend */
 		held_lwlocks[num_held_lwlocks].lock = lock;
 		held_lwlocks[num_held_lwlocks++].mode = mode;
-		TRACE_POSTGRESQL_LWLOCK_CONDACQUIRE(T_NAME(lock), mode);
+		TRACE_POSTGRESQL_LWLOCK_CONDACQUIRE(T_NAME(lock), mode, lock,
+lock->tranche);
 	}
 	return !mustwait;
 }
@@ -1482,7 +1486,8 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 #endif
 
 			LWLockReportWaitStart(lock);
-			TRACE_POSTGRESQL_LWLOCK_WAIT_START(T_NAME(lock), mode);
+			TRACE_POSTGRESQL_LWLOCK_WAIT_START(T_NAME(lock), mode, lock,
+	lock->tranche);
 
 			for (;;)
 			{
@@ -1500,7 +1505,8 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 Assert(nwaiters < MAX_BACKENDS);
 			}
 #endif
-			TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(T_NAME(lock), mode);
+			TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(T_NAME(lock), mode, lock,
+	lock->tranche);
 			LWLockReportWaitEnd();
 
 			LOG_LWDEBUG("LWLockAcquireOrWait", lock, "awakened");
@@ -1530,7 +1536,8 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 		/* Failed to get lock, so release interrupt holdoff */
 		RESUME_INTERRUPTS();
 		LOG_LWDEBUG("LWLockAcquireOrWait", lock, "failed");
-		TRACE_POSTGRESQL_LWLOCK_ACQUIRE_OR_WAIT_FAIL(T_NAME(lock), mode);
+		TRACE_POSTGRESQL_LWLOCK_ACQUIRE_OR_WAIT_FAIL(T_NAME(lock), mode, lock,
+lock->tranche);
 	}
 	else
 	{
@@ -1538,7 +1545,8 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
 		/* Add lock to list of locks held by this backend */
 		held_lwlocks[num_held_lwlocks].lock = lock;
 		held_lwlocks[num_held_lwlocks++].mode = mode;
-		TRACE_POSTGRESQL_LWLOCK_ACQUIRE_OR_WAIT(T_NAME(lock), mode);
+		TRACE_POSTGRESQL_LWLOCK_ACQUIRE_OR_WAIT(T_NAME(lock), mode, lock,
+lock->tranche);
 	}
 
 	return !mustwait;
@@ -1698,7 +1706,8 @@ LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
 #endif
 
 		LWLockReportWaitStart(lock);
-		

Re: [HACKERS] logical decoding of two-phase transactions

2021-03-09 Thread Peter Smith
On Tue, Mar 9, 2021 at 9:55 PM Amit Kapila  wrote:
>
> On Tue, Mar 9, 2021 at 3:22 PM Ajin Cherian  wrote:
> >
>
> Few comments:
> ==
>
> 3. In prepare_spoolfile_replay_messages(), it is better to free the
> memory allocated for temporary strings buffer and s2.

I guess this was suggested because it is what the
apply_handle_stream_commit() function does for very similar code.
But the same approach cannot work for the *_replay_messages() function,
because those buffers are allocated in TopTransactionContext and they are
already freed as a side-effect when the last psf message (the
LOGICAL_REP_MSG_PREPARE) is replayed/dispatched, ending the transaction. So
attempting to free them again causes a segmentation violation (I already fixed
this exact problem last week, when the pfree code was still in the code).

> 5. I think prepare_spoolfile_close can be extended to take PsfFile as
> input and then it can be also used from
> prepare_spoolfile_replay_messages.

No, the *_close() is intended only for ending the "current" psf (the
global psf_cur) which was being spooled. The function comment says the
same. The *_close() is paired with the *_create() which created psf_cur.

Whereas the replay fd closed at commit time is just a locally opened fd
unrelated to psf_cur. This close is deliberately self-contained in the
*_replay_messages() function, which is not dissimilar to what the other
streaming spool file code does; e.g., notice that the
apply_handle_stream_commit function simply closes its own fd using
BufFileClose rather than delegating to stream_close_file().

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Confusing behavior of psql's \e

2021-03-09 Thread Tom Lane
Laurenz Albe  writes:
> On Thu, 2021-03-04 at 16:51 +, Jacob Champion wrote:
>> You could backdate the temporary file, so that any save is guaranteed
>> to move the timestamp forward. That should work even if the filesystem
>> has extremely poor precision.

> Ah, of course, that is the way to go.

I took a quick look at this.  I don't have an opinion yet about the
question of changing the when-to-discard-the-buffer rules, but I agree
that trying to get rid of the race condition inherent in the existing
file mtime test would be a good idea.  However, I've got some
portability-related gripes about how you are doing the latter:

1. There is no principled reason to assume that the epoch date is in the
past.  IIRC, Postgres' timestamp epoch of 2000-01-01 was in the future
at the time we set it.  More relevant to the immediate issue, I clearly
recall a discussion at Red Hat in which one of the principal glibc
maintainers (likely Ulrich Drepper, though I'm not quite sure) argued
that 32-bit time_t could be used indefinitely by redefining the epoch
forward by 2^32 seconds every often; which would require intervals of
circa 68 years in which time_t was seen as a negative offset from a
future epoch date, rather than an unsigned offset from a past date.
Now, I thought he was nuts then and I still think that 32-bit hardware
will be ancient history by 2038 ... but there may be systems that do it
like that.  glibc hates ABI breakage.

2. Putting an enormously old date on a file that was just created will
greatly confuse onlookers, some of whom (such as backup or antivirus
daemons) might not react pleasantly.

Between #1 and #2, it's clearly worth the extra one or two lines of
code to set the file dates to, say, "time(NULL) - 1", rather than
assuming that zero is good enough.

3. I wonder about the portability of utime(2).  I see that we are using
it to cause updates of socket and lock file times, but our expectations
for it there are rock-bottom low.  I think that failing the edit command
if utime() fails is an overreaction even with optimistic assumptions about
its reliability.  Doing that makes things strictly worse than simply not
doing anything, because 99% of the time this refinement is unnecessary.

In short, I think the relevant code ought to be more like

else
{
struct utimbuf ut;

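/* Backdate the file so that any subsequent save is sure to advance its mtime */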
ut.modtime = ut.actime = time(NULL) - 1;
(void) utime(fname, &ut);
}

(plus some comments of course)

regards, tom lane

PS: I seem to recall that some Microsoft filesystems have 2-second
resolution on file mod times, so maybe it needs to be "time(NULL) - 2"?




Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation

2021-03-09 Thread Bharath Rupireddy
Hi,

While providing thoughts on [1], I observed that the error messages
that are emitted while adding foreign, temporary and unlogged tables
can be improved a bit from the existing [2] to [3]. For instance, the
existing message emitted when one tries to add a foreign table to a
publication ("f1" is not a table) looks odd, because it says that the
foreign table is not a table at all.

Attaching a small patch. Thoughts?

[1] - 
https://www.postgresql.org/message-id/CALj2ACWAxO3vSToT0o5nXL%3Drz5cNx90zaV-at%3DcvM14Tag4%3DcQ%40mail.gmail.com
[2] - t1 is a temporary table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  table "t1" cannot be replicated
DETAIL:  Temporary and unlogged relations cannot be replicated.

t1 is an unlogged table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  table "t1" cannot be replicated
DETAIL:  Temporary and unlogged relations cannot be replicated.

f1 is a foreign table:
postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
ERROR:  "f1" is not a table
DETAIL:  Only tables can be added to publications.

[3] - t1 is a temporary table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  temporary table "t1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.

t1 is an unlogged table:
postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
ERROR:  unlogged table "t1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.

f1 is a foreign table:
postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
ERROR:  foreign table "f1" cannot be replicated
DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
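
For clarity, the change amounts to picking a relkind-specific message inside
check_publication_add_relation(). A rough sketch of the foreign-table branch
(illustrative only, not the exact patch text):

if (targetrel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
	ereport(ERROR,
			(errcode(ERRCODE_WRONG_OBJECT_TYPE),
			 errmsg("foreign table \"%s\" cannot be replicated",
					RelationGetRelationName(targetrel)),
			 errdetail("Temporary, unlogged and foreign relations cannot be replicated.")));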

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v1-0001-Improve-error-message-while-adding-tables-to-publ.patch
Description: Binary data


Re: About to add WAL write/fsync statistics to pg_stat_wal view

2021-03-09 Thread Masahiro Ikeda

On 2021-03-09 17:51, Fujii Masao wrote:

On 2021/03/05 8:38, Masahiro Ikeda wrote:

On 2021-03-05 01:02, Fujii Masao wrote:

On 2021/03/04 16:14, Masahiro Ikeda wrote:

On 2021-03-03 20:27, Masahiro Ikeda wrote:

On 2021-03-03 16:30, Fujii Masao wrote:

On 2021/03/03 14:33, Masahiro Ikeda wrote:

On 2021-02-24 16:14, Fujii Masao wrote:

On 2021/02/15 11:59, Masahiro Ikeda wrote:

On 2021-02-10 00:51, David G. Johnston wrote:

On Thu, Feb 4, 2021 at 4:45 PM Masahiro Ikeda
 wrote:


I pgindented the patches.


... XLogWrite, which is invoked during an
XLogFlush request (see ...).  This is 
also

incremented by the WAL receiver during replication.

("which normally called" should be "which is normally called" 
or
"which normally is called" if you want to keep true to the 
original)
You missed the adding the space before an opening parenthesis 
here and

elsewhere (probably copy-paste)

is ether -> is either
"This parameter is off by default as it will repeatedly query 
the

operating system..."
", because" -> "as"


Thanks, I fixed them.

wal_write_time and the sync items also need the note: "This is 
also

incremented by the WAL receiver during replication."


I skipped changing it since I separated the stats for the WAL 
receiver

in pg_stat_wal_receiver.

"The number of times it happened..." -> " (the tally of this 
event is
reported in wal_buffers_full in) This is undesirable 
because ..."


Thanks, I fixed it.

I notice that the patch for WAL receiver doesn't require 
explicitly
computing the sync statistics but does require computing the 
write
statistics.  This is because of the presence of 
issue_xlog_fsync but
absence of an equivalent pg_xlog_pwrite.  Additionally, I 
observe that
the XLogWrite code path calls pgstat_report_wait_*() while the 
WAL
receiver path does not.  It seems technically straight-forward 
to
refactor here to avoid the almost-duplicated logic in the two 
places,
though I suspect there may be a trade-off for not adding 
another
function call to the stack given the importance of WAL 
processing
(though that seems marginalized compared to the cost of 
actually
writing the WAL).  Or, as Fujii noted, go the other way and 
don't have
any shared code between the two but instead implement the WAL 
receiver

one to use pg_stat_wal_receiver instead.  In either case, this
half-and-half implementation seems undesirable.


OK, as Fujii-san mentioned, I separated the WAL receiver stats.
(v10-0002-Makes-the-wal-receiver-report-WAL-statistics.patch)


Thanks for updating the patches!


I added the infrastructure code to communicate the WAL receiver 
stats messages between the WAL receiver and the stats 
collector, and

the stats for WAL receiver is counted in pg_stat_wal_receiver.
What do you think?


On second thought, this idea seems not good, because those stats are
collected across multiple walreceivers, while the other values in
pg_stat_wal_receiver are only related to the walreceiver process running
at that moment. IOW, it seems strange that some values show dynamic
stats and the others show collected stats, even though they are in
the same view pg_stat_wal_receiver. Thoughts?


OK, I fixed it.
The stats collected in the WAL receiver is exposed in pg_stat_wal 
view in v11 patch.


Thanks for updating the patches! I'm now reading 001 patch.

+    /* Check whether the WAL file was synced to disk right now */
+    if (enableFsync &&
+    (sync_method == SYNC_METHOD_FSYNC ||
+ sync_method == SYNC_METHOD_FSYNC_WRITETHROUGH ||
+ sync_method == SYNC_METHOD_FDATASYNC))
+    {

Isn't it better to make issue_xlog_fsync() return immediately
if enableFsync is off or sync_method is open_sync or open_datasync,
to simplify the code more?


Thanks for the comments.
I added the above code in v12 patch.



+    /*
+ * Send WAL statistics only if WalWriterDelay has elapsed 
to minimize

+ * the overhead in WAL-writing.
+ */
+    if (rc & WL_TIMEOUT)
+    pgstat_send_wal();

On second thought, this change means that it always takes wal_writer_delay
before the walwriter's WAL stats are sent after XLogBackgroundFlush()
is called. For example, if wal_writer_delay is set to several seconds, some
values in pg_stat_wal would be meaninglessly out of date for those seconds.
So I'm thinking of withdrawing my previous comment; it's OK to send
the stats every time XLogBackgroundFlush() is called. Thoughts?


Thanks, I didn't notice that.

Although PGSTAT_STAT_INTERVAL is 500 msec, wal_writer_delay's
default value is 200 msec, and it may be set to a shorter time.


Yeah, if wal_writer_delay is set to very small value, there is a risk
that the WAL stats are sent too frequently. I agree that's a problem.



Why don't we check the timestamp in another way?

+   /*
+    * Don't send a message unless it's been at least
PGSTAT_STAT_INTERVAL
+    * msec since we last sent one
+    */
+   now = 

RE: libpq debug log

2021-03-09 Thread iwata....@fujitsu.com
Hi all,

Following all the reviewers' advice, I have created a new patch.

In this patch, I add only two tracing entry points; I call
pqTraceOutputMsg(PGconn *conn, int msgCursor, PGCommSource commsource) in
pqParseInput3() and pqPutMsgEnd() to output the log.
The argument contains the message's first-byte offset, called msgCursor,
because it is simpler to specify the buffer pointer in the caller.

In pqTraceOutputMsg(), the content common to all protocol messages (the
timestamp, < or >, the first byte, and the length) is output first, and after
that each protocol message's contents are output. I referred to
pqParseInput3(), fe-exec.c and the documentation page for that part.

This fix almost eliminates the if (conn->Pfdebug) checks that were built into
the code for other features.

Regards,
Aya Iwata

> -Original Message-
> From: alvhe...@alvh.no-ip.org 
> Sent: Friday, March 5, 2021 10:41 PM
> To: Tsunakawa, Takayuki/綱川 貴之 
> Cc: 'Tom Lane' ; Iwata, Aya/岩田 彩
> ; Jamison, Kirk/ジャミソン カーク
> ; 'Kyotaro Horiguchi' ;
> pgsql-hackers@lists.postgresql.org
> Subject: Re: libpq debug log
> 
> On 2021-Mar-05, tsunakawa.ta...@fujitsu.com wrote:
> 
> > From: Tom Lane 
> > > But I think passing the message start address explicitly might be
> > > better than having it understand the buffering behavior in enough
> > > detail to know where to find the message.  Part of the point here
> > > (IMO) is to decouple the tracing logic from the core libpq logic, in
> > > hopes of not having common-mode bugs.
> >
> > Ouch, you're perfectly right.  Then let's make the signature:
> >
> > void pqLogMessage(PGconn *conn, const char *message, bool
> > toServer);
> 
> Yeah, looks good!  I agree that going this route will result in more 
> trustworthy
> trace output.
> 
> --
> Álvaro Herrera39°49'30"S 73°17'W


v24-libpq-trace-log.patch
Description: v24-libpq-trace-log.patch


Re: pgbench: option delaying queries till connections establishment?

2021-03-09 Thread Thomas Munro
On Mon, Mar 8, 2021 at 3:18 PM Thomas Munro  wrote:
> David Rowley kindly tested this for me on Windows and told me how to
> fix one of the macros that had incorrect error checking on that OS.
> So here's a new version.  I'm planning to commit 0001 and 0002 soon,
> if there are no objections.  0003 needs some more review.

I made a few mostly cosmetic changes, pgindented and pushed all these patches.




Re: New Table Access Methods for Multi and Single Inserts

2021-03-09 Thread Bharath Rupireddy
On Tue, Mar 9, 2021 at 1:45 PM Bharath Rupireddy
 wrote:
> On Mon, Mar 8, 2021 at 6:37 PM Dilip Kumar  wrote:
>>
> > Why do we need to invent a new version table_insert_v2?  And also why
> > it is named table_insert* instead of table_tuple_insert*?
>
> New version, because we changed the input parameters, now passing the
> params via TableInsertState but existing table_tuple_insert doesn't do
> that. If okay, I can change table_insert_v2  to table_tuple_insert_v2?
> Thoughts?

Changed table_insert_v2 to table_tuple_insert_v2. And also, rebased
the patches on to the latest master.

Attaching the v4 patch set. Please review it further.
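
To illustrate the intended call pattern for the single-insert path, here is a
rough caller-side sketch (the table_insert_begin()/table_insert_end() wrapper
names below are just shorthand for this sketch; only table_tuple_insert_v2 is
the name settled on above):

TableInsertState *istate;

/* set up the insert state once for the target relation */
istate = table_insert_begin(rel, GetCurrentCommandId(true),
							0 /* options */ , false /* is_multi */ );

/* insert a single slot; the insertion parameters travel inside the state */
table_tuple_insert_v2(istate, slot);

/* clean up the state (for multi inserts this is also where buffered
 * tuples would be flushed) */
table_insert_end(istate);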

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From 6518212583e24b017375512701d9fefa6de20e42 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 10 Mar 2021 09:53:48 +0530
Subject: [PATCH v4 1/3] New Table AMs for Multi and Single Inserts

This patch introduces new table access methods for multi and
single inserts. Also implements/rearranges the outside code for
heap am into these new APIs.

The main design goal of these new APIs is to give table AM developers
the flexibility to implement multi-insert logic suited to the
underlying storage engine. Currently, for all the underlying storage
engines, we follow the same multi-insert logic (when and how to flush
the buffered tuples, tuple size calculation, and so on), and this
logic doesn't take the underlying storage engine's capabilities into
account.

We can also avoid duplicating multi insert code (for existing COPY,
and upcoming CTAS, CREATE/REFRESH MAT VIEW and INSERT SELECTs). We
can also move bulk insert state allocation and deallocation inside
these APIs.
---
 src/backend/access/heap/heapam.c | 212 +++
 src/backend/access/heap/heapam_handler.c |   5 +
 src/backend/access/table/tableamapi.c|   7 +
 src/backend/executor/execTuples.c|  83 -
 src/include/access/heapam.h  |  49 +-
 src/include/access/tableam.h |  87 ++
 src/include/executor/tuptable.h  |   1 +
 7 files changed, 438 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3b435c107d..d8bfe17f22 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -67,6 +67,7 @@
 #include "utils/datum.h"
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
+#include "utils/memutils.h"
 #include "utils/relcache.h"
 #include "utils/snapmgr.h"
 #include "utils/spccache.h"
@@ -2669,6 +2670,217 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	pgstat_count_heap_insert(relation, ntuples);
 }
 
+/*
+ * heap_insert_begin - allocate and initialize TableInsertState
+ *
+ * For single inserts:
+ *  1) Specify is_multi as false, then multi insert state will be NULL.
+ *
+ * For multi inserts:
+ *  1) Specify is_multi as true, then multi insert state will be allocated and
+ * 	   initialized.
+ *
+ *  Other input parameters i.e. relation, command id, options are common for
+ *  both single and multi inserts.
+ */
+TableInsertState*
+heap_insert_begin(Relation rel, CommandId cid, int options, bool is_multi)
+{
+	TableInsertState *state;
+
+	state = palloc(sizeof(TableInsertState));
+	state->rel = rel;
+	state->cid = cid;
+	state->options = options;
+	/* Below parameters are not used for single inserts. */
+	state->mi_slots = NULL;
+	state->mistate = NULL;
+	state->mi_cur_slots = 0;
+	state->flushed = false;
+
+	if (is_multi)
+	{
+		HeapMultiInsertState *mistate;
+
+		mistate = palloc(sizeof(HeapMultiInsertState));
+		state->mi_slots =
+palloc0(sizeof(TupleTableSlot *) * MAX_BUFFERED_TUPLES);
+		mistate->max_slots = MAX_BUFFERED_TUPLES;
+		mistate->max_size = MAX_BUFFERED_BYTES;
+		mistate->cur_size = 0;
+		/*
+		 * Create a temporary memory context so that we can reset once per
+		 * multi insert batch.
+		 */
+		mistate->context = AllocSetContextCreate(CurrentMemoryContext,
+ "heap_multi_insert",
+ ALLOCSET_DEFAULT_SIZES);
+		state->mistate = mistate;
+	}
+
+	return state;
+}
+
+/*
+ * heap_insert_v2 - insert single tuple into a heap
+ *
+ * Insert tuple from slot into table. This is like heap_insert(), the only
+ * difference is that the parameters for insertion are inside table insert
+ * state structure.
+ */
+void
+heap_insert_v2(TableInsertState *state, TupleTableSlot *slot)
+{
+	bool		shouldFree = true;
+	HeapTuple	tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
+
+	Assert(state);
+
+	/* Update tuple with table oid */
+	slot->tts_tableOid = RelationGetRelid(state->rel);
+	tuple->t_tableOid = slot->tts_tableOid;
+
+	/* Perform insertion, and copy the resulting ItemPointer */
+	heap_insert(state->rel, tuple, state->cid, state->options, state->bistate);
+	ItemPointerCopy(&tuple->t_self, &slot->tts_tid);
+
+	if (shouldFree)
+		pfree(tuple);
+}
+
+/*
+ * heap_multi_insert_v2 - insert multiple tuples into a heap
+ *
+ * Compute size 

Re: a verbose option for autovacuum

2021-03-09 Thread Masahiko Sawada
On Tue, Mar 9, 2021 at 12:58 AM Euler Taveira  wrote:
>
> On Mon, Mar 8, 2021, at 2:32 AM, Masahiko Sawada wrote:
>
> * Proposed idea
> LOG:  automatic vacuum of table "postgres.public.test": index scans: 1
> pages: 0 removed, 443 remain, 0 skipped due to pins, 0 skipped frozen
> tuples: 1000 removed, 99000 remain, 0 are dead but not yet removable,
> oldest xmin: 545
> indexes: "postgres.public.test_idx1" 276 pages, 0 newly deleted, 0
> currently deleted, 0 reusable.
> "postgres.public.test_idx2" 300 pages, 10 newly deleted, 0 currently
> deleted, 3 reusable.
> "postgres.public.test_idx2" 310 pages, 4 newly deleted, 0 currently
> deleted, 0 reusable.
>
> Instead of using "indexes:" and add a list of indexes (one on each line), it
> would be more parse-friendly if it prints one index per line using 'index
> "postgres.public.idxname" 123 pages, 45 newly deleted, 67 currently deleted, 8
> reusable.'.

Agreed.

Attached a patch. I've slightly modified the format for consistency
with heap statistics.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


index_stats_log.patch
Description: Binary data


Re: Removing vacuum_cleanup_index_scale_factor

2021-03-09 Thread Peter Geoghegan
On Mon, Mar 8, 2021 at 10:21 PM Masahiko Sawada  wrote:
> Thank you for the patches. I looked at 0001 patch and have a comment:
>
> +* We don't report to the stats collector here because the stats collector
> +* only tracks per-table stats.  Reset the changes_since_analyze counter
> +* only if we analyzed all columns; otherwise, there is still work for
> +* auto-analyze to do.
>
> I think the comment becomes clearer if we add "if doing inherited
> stats" at top of the above paragraph since we actually report to the
> stats collector in !inh case.

I messed the comment up. Oops. Fixed now.

> > Also included is a patch that removes the
> > vacuum_cleanup_index_scale_factor mechanism for triggering an index
> > scan during VACUUM -- that's what the second patch does (this depends
> > on the first patch, really).
>
> 0002 patch looks good to me.

Great.

The attached revision has a bit more polish. It includes new commit
messages which explain what we're really trying to fix here. I also
included backpatchable versions for Postgres 13 -- that's the other
significant change compared to the last version.

My current plan is to commit everything within the next day or two.
This includes backpatching to Postgres 13 only. I am now leaning
against doing anything in Postgres 11 and 12, for the closely related
btm_last_cleanup_num_heap_tuples VACUUM accuracy issue. There have
been no complaints from users using Postgres 11 or 12, so I'll leave
them alone. (Sorry for changing my mind again and again.)

To be clear: I plan on disabling (though not removing) the
vacuum_cleanup_index_scale_factor GUC and storage parameter on
Postgres 13, even though that is a stable release. This approach is
unorthodox, but it has a kind of a precedent -- the recheck_on_update
storage param was disabled on the Postgres 11 branch by commit
5d28c9bd. More importantly, it just happens to make sense, given the
specifics here.

--
Peter Geoghegan


v2-0002-VACUUM-ANALYZE-Always-update-pg_class.reltuples.patch
Description: Binary data


REL_13_STABLE-v2-0002-VACUUM-ANALYZE-Always-update-pg_class.reltuples.patch
Description: Binary data


v2-0001-Don-t-consider-newly-inserted-tuples-in-nbtree-VA.patch
Description: Binary data


REL_13_STABLE-v2-0001-Don-t-consider-newly-inserted-tuples-in-nbtree-VA.patch
Description: Binary data


Re: Add some tests for pg_stat_statements compatibility verification under contrib

2021-03-09 Thread Julien Rouhaud
Hi Erica,

On Wed, Mar 10, 2021 at 11:14:52AM +0800, Erica Zhang wrote:
> Hi Julien,
> Thanks a lot for the quick review. Please see my answer below in blue. 
> Attached is the new patch.

Thanks!

>> The upgrade scripts are already tested as postgres will install 1.4 and 
>> perform
>> all upgrades to reach the default version.
> Thanks for pointing out that the upgrade paths are covered by the upgrade
> script tests. Since I don't need to test the upgrade, I will test the
> installation of different versions directly; any concern?

I think you should keep your previous approach.  The result will be the same
but it will consume less resources for that which is always good.

>> +SELECT * FROM pg_available_extensions WHERE name = 'pg_stat_statements' and 
>> installed_version = '1.4';
>> 
>> 
>> What is this supposed to test? All those tests will break every time 
>> we change
>> the default version, which will add maintenance efforts. It could be 
>> good to
>> have one test breaking when changing the version to remind us to add a test 
>> for
>> the new version, but not more.
> Here I just want to verify that the installed version is the expected
> version. But, as you mentioned, that approach will add maintenance effort.
>
> So I prefer to keep one such test, which can remind us to add a new version.
> As for the others, just check count(*) to make sure the installation
> succeeded, such as: SELECT count(*) FROM pg_available_extensions WHERE name =
> 'pg_stat_statements' and installed_version = '1.4'; What do you think?

How about tweaking your previous query so only the last execution fails when
pg_stat_statements default version is updated?  Something like:

SELECT installed_version = default_version, installed_version
FROM pg_available_extensions
WHERE name = 'pg_stat_statements';

This way the same query can be reused for both older versions and current
version.

Also, can you register your patch for the next commitfest at
https://commitfest.postgresql.org/33/, to make sure it won't be forgotten?




RE: Avoid CommandCounterIncrement in RI trigger when INSERT INTO referencing table

2021-03-09 Thread houzj.f...@fujitsu.com
> Attaching the first version of the patch, which avoids CCI in the RI trigger
> when inserting into the referencing table.

After thinking some more about how to support parallel insert into an FK
relation, it seems we do not have a cheap way to implement this feature.
Please see the explanation below:

In RI_FKey_check(), PostgreSQL currently executes a "SELECT xx FOR KEY SHARE"
query to check that the foreign key exists in the PK table.
However, "SELECT FOR UPDATE/SHARE" is considered parallel unsafe. It may be
dangerous to do this in parallel mode, so we may want to change this.

Also, "SELECT FOR UPDATE/SHARE" is considered "not read only", which will
force readonly = false in _SPI_execute_plan().
So, it seems we would have to completely change the implementation of
RI_FKey_check().

At the same time, the "simplifying foreign key/RI checks" thread is trying to
replace "SELECT xx FOR KEY SHARE" with index_beginscan() + table_tuple_lock()
(which I think is parallel safe).
Maybe we can try to support parallel insert into FK relations after the
"simplifying foreign key/RI checks" patch is applied?

Best regards,
houzj





Re: Columns correlation and adaptive query optimization

2021-03-09 Thread Tomas Vondra
Hello Konstantin,


Sorry for not responding to this thread earlier. I definitely agree the
features proposed here are very interesting and useful, and I appreciate
you kept rebasing the patch.

I think the patch improving join estimates can be treated as separate,
and I see it already has a separate CF entry - it however still points
to this thread, which will be confusing. I suggest we start a different
thread for it, to keep the discussions separate.

I'll focus on the auto_explain part here.


I did have some ideas about adaptive query optimization too, although
maybe in a slightly different form. My plan was to collect information
about estimated / actual cardinalities, and then use this knowledge to
directly tweak the estimates. Directly, without creating extended stats,
but treat the collected info about estimates / row counts as a kind of
ad hoc statistics. (Not sure if this is what the AQE extension does.)


What is being proposed here - an extension suggesting which statistics
to create (and possibly creating them automatically) is certainly
useful, but I'm not sure I'd call it "adaptive query optimization". I
think "adaptive" means the extension directly modifies the estimates
based on past executions. So I propose calling it maybe "statistics
advisor" or something like that.


A couple additional points:

1) I think we should create a new extension for this.

auto_explain has a fairly well defined purpose, I don't think this is
consistent with it. It's quite likely it'll require stuff like shared
memory, etc. which auto_explain does not (and should not) need.

Let's call it statistics_advisor, or something like that. It will use
about the same planner/executor callbacks as auto_explain, but that's
fine I think.


2) I'm not sure creating statistics automatically based on a single
query execution is a good idea. I think we'll need to collect data from
multiple runs (in shared memory), and do suggestions based on that.


3) I wonder if it should also consider duration of the query (who cares
about estimates if it still executed in 10ms)? Similarly, it probably
should require some minimal number of rows (1 vs. 10 rows is likely
different from 1M vs. 10M rows, but both is 10x difference).


4) Ideally it'd evaluate impact of the improved estimates on the whole
query plan (you may fix one node, but the cost difference for the whole
query may be negligible). But that seems very hard/expensive :-(


5) I think AddMultiColumnStatisticsForQual() needs refactoring - it
mixes stuff at many different levels of abstraction (generating names,
deciding which statistics to build, ...). I think it'll also need some
improvements to better identify which Vars to consider for statistics,
and once we get support for statistics on expressions committed (which
seems to be fairly close now) also to handle expressions.

BTW Why is "qual" in

  static void
  AddMultiColumnStatisticsForQual(void* qual, ExplainState *es)

declared as "void *"? Shouldn't that be "List *"?


6) I'm not sure about automatically creating the stats. I can't imagine
anyone actually enabling that on production, TBH (I myself probably
would not do that). I suggest we instead provide an easy way to show
which statistics are suggested.

For one execution that might be integrated into EXPLAIN ANALYZE, I guess
(through some callback, which seems fairly easy to do).

For many executions (you can leave it running for a couple of days, then
see what the suggestion is based on X runs) we could have a view or
something. This would also work for read-only replicas, where just
creating the statistics is impossible.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Boundary value check in lazy_tid_reaped()

2021-03-09 Thread Masahiko Sawada
On Tue, Mar 9, 2021 at 9:57 AM Masahiko Sawada  wrote:
>
> On Mon, Mar 8, 2021 at 7:16 PM Peter Eisentraut
>  wrote:
> >
> > On 21.01.21 14:11, Masahiko Sawada wrote:
> > > Agreed. bsearch with bound check showed a reasonable improvement in my
> > > evaluation in terms of performance. Regarding memory efficiency, we
> > > can experiment with other methods later.
> > >
> > > I've attached the patch that adds a bound check for encoded
> > > itempointers before bsearch() in lazy_tid_reaped() and inlines the
> > > function.
> >
> > Do you have any data showing the effect of inlining lazy_tid_reaped()?
> > I mean, it probably won't hurt, but it wasn't part of the original patch
> > that you tested, so I wonder whether it has any noticeable effect.
>
> I've done some benchmarks while changing the distribution of where
> dead tuples exist within the table. The table size is 4GB and 20% of
> total tuples are dirty. Here are the results of index vacuum execution
> time:
>
> 1. Updated evenly the table (every block has at least one dead tuple).
> master  : 8.15
> inlining  : 4.84
> not-inlining  : 5.01
>
> 2. Updated the middle of the table.
> master  : 8.71
> inlining  : 3.51
> not-inlining  : 3.58
>
> 3. Updated both the beginning and the tail of the table.
> master  : 8.44
> inlining  : 3.46
> not-inlining  : 3.50
>
> There is no noticeable effect of inlining lazy_tid_reaped(). So it
> would be better to not do that.

Attached the patch that doesn't inline lazy_tid_reaped().
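
For anyone skimming the thread, the idea is simply to reject keys that fall
outside the min/max of the sorted dead-tuple array before paying for a
bsearch(). A small standalone illustration (not the patch itself):

#include <stdio.h>
#include <stdlib.h>

/* comparator for bsearch() over an array of long long keys */
static int
cmp_int64(const void *a, const void *b)
{
	long long	x = *(const long long *) a;
	long long	y = *(const long long *) b;

	return (x > y) - (x < y);
}

/*
 * Return 1 if key is present in the sorted array.  The cheap range check
 * skips the bsearch() entirely for keys outside [arr[0], arr[n-1]]; in the
 * patch the array holds the encoded dead-tuple TIDs collected by VACUUM.
 */
static int
key_is_present(long long key, const long long *arr, size_t n)
{
	if (n == 0 || key < arr[0] || key > arr[n - 1])
		return 0;
	return bsearch(&key, arr, n, sizeof(long long), cmp_int64) != NULL;
}

int
main(void)
{
	long long	dead[] = {3, 7, 9, 15, 42};

	printf("%d %d\n", key_is_present(9, dead, 5), key_is_present(100, dead, 5));
	return 0;
}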

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


bound_check_lazy_vacuum_noinline.patch
Description: Binary data


Re: [HACKERS] logical decoding of two-phase transactions

2021-03-09 Thread Ajin Cherian
I ran a 5-instance cascaded pub/sub setup on the latest patch set, which starts
pgbench on the first server and waits till the data on the fifth server
matches the first.
This is based on a test script created by Erik Rijkers. The tests run fine
and the 5th server achieves data consistency in around a minute.
Attaching the test script and the test run log.

regards,
Ajin Cherian
Fujitsu Australia
#!/bin/bash

#  I compile postgres server versions into dir:
#  $HOME/pg_stuff/pg_installations/pgsql.$project   ( where project is a name )
#
#  This script assumes that projects HEAD and head0 are present
#  $HOME/pg_stuff/pg_installations/pgsql.HEAD   --> git master as of today - friday 12 febr 2021
#  $HOME/pg_stuff/pg_installations/pgsql.head0  --> 3063eb17593c  so that's 11 febr, before the replication changes
#
#  variables 'project' and 'BIN' reflect my set up (so should probably be changed)
#

#  my instance HEAD has the bug: it keeps looping with 'NOK' (=Not OK): primary not identical to all replicas
#  my instance head0 is ok - finishes in 20 s - with the replicas identical to primary
#
#  Erik Rijkers


unset PGPORT PGSERVICE
export PGDATABASE=postgres

#if   [[ "$1" == "HEAD"  ]] ; then project=HEAD   # new HEAD = 12 febr
#elif [[ "$1" == "head0" ]] ; then project=head0  # older: 3063eb17593c = 11 febr
#else
# echo "arg1 missing -  "
# exit
#fi


#root_dir=/tmp/cascade/$project
project=two_phase
root_dir=/home/ajin/twophasetest/mar09

mkdir -p $root_dir

#BIN=$HOME/pg_stuff/pg_installations/pgsql.$project/bin
BIN=/home/ajin/install-oss/bin

export PATH=$BIN:$PATH

  initdb=$BIN/initdb
postgres=$BIN/postgres
  pg_ctl=$BIN/pg_ctl

baseport=6525
   port1=$(( $baseport + 0 )) 
   port2=$(( $baseport + 1 ))
   port3=$(( $baseport + 2 ))
   port4=$(( $baseport + 3 ))
   port5=$(( $baseport + 4 ))
 appname=$project

num_instances=5   # or 2

scale=1   #  pgbench -s
  #clients=64  #  pgbench -c
  clients=1  #  pgbench -c
 duration=1   #  how long to run: pgbench -T $duration 
 wait=1

if [[ -d $root_dir/instance1 ]]; then rm -rf $root_dir/instance1; fi
if [[ -d $root_dir/instance2 ]]; then rm -rf $root_dir/instance2; fi
if [[ -d $root_dir/instance3 ]]; then rm -rf $root_dir/instance3; fi
if [[ -d $root_dir/instance4 ]]; then rm -rf $root_dir/instance4; fi
if [[ -d $root_dir/instance5 ]]; then rm -rf $root_dir/instance5; fi
if [[ -d $root_dir/instance1 ]]; then exit ; fi
if [[ -d $root_dir/instance2 ]]; then exit ; fi
if [[ -d $root_dir/instance3 ]]; then exit ; fi
if [[ -d $root_dir/instance4 ]]; then exit ; fi
if [[ -d $root_dir/instance5 ]]; then exit ; fi

# devel_file=/tmp/bugs
# echo filterbug>$devel_file

for n in `seq 1 $num_instances`
do
  instance=instance$n
  server_dir=$root_dir/$instance
  data_dir=$server_dir/data
  port=$(( $baseport + $n -1 ))
  logfile=$server_dir/logfile.$port
# echo "-- $initdb --pgdata=$data_dir --encoding=UTF8 --pwfile=$devel_file  #  $port"
#  $initdb --pgdata=$data_dir --encoding=UTF8 --pwfile=$devel_file  &> /dev/null
  echo "-- $initdb --pgdata=$data_dir --encoding=UTF8   #  $port"
   $initdb --pgdata=$data_dir --encoding=UTF8   &> /dev/null

  ( $postgres  -D $data_dir -p $port \
--wal_level=logical --logging_collector=on \
--client_min_messages=warning \
--log_directory=$server_dir --log_filename=logfile.${port} \
--log_replication_commands=on & ) &> /dev/null
done 

# dbsize=$(echo " $scale * 10 " | bc )

# sleep 1

echo "
  drop table if exists pgbench_accounts;
  drop table if exists pgbench_branches;
  drop table if exists pgbench_tellers;
  drop table if exists pgbench_history;" | psql -qXp $port1 \
&& echo "
  drop table if exists pgbench_accounts;
  drop table if exists pgbench_branches;
  drop table if exists pgbench_tellers;
  drop table if exists pgbench_history;" | psql -qXp $port2 \
&& echo "
  drop table if exists pgbench_accounts;
  drop table if exists pgbench_branches;
  drop table if exists pgbench_tellers;
  drop table if exists pgbench_history;" | psql -qXp $port3 \
&& echo "
  drop table if exists pgbench_accounts;
  drop table if exists pgbench_branches;
  drop table if exists pgbench_tellers;
  drop table if exists pgbench_history;" | psql -qXp $port4 \
&& pgbench -p $port1 -qis $scale \
&& echo "alter table pgbench_history add column hid serial primary key;" \
  | psql -q1Xp $port1 && pg_dump -F c -p $port1 \
 --exclude-table-data=pgbench_history  \
 --exclude-table-data=pgbench_accounts \
 --exclude-table-data=pgbench_branches \
 --exclude-table-data=pgbench_tellers  \
   -t pgbench_history -t pgbench_accounts \
   -t pgbench_branches -t pgbench_tellers \
  | pg_restore -1 -p $port2 -d postgres

if [[ $num_instances -eq 5 ]]
then
  pg_dump -F c -p $port1 \
 --exclude-table-data=pgbench_history  \
 --exclude-table-data=pgbench_accounts \
 --exclude-table-data=pgbench_branches \
 

Re: [patch] [doc] Minor variable related cleanup and rewording of plpgsql docs

2021-03-09 Thread David Steele

On 3/9/21 1:08 PM, Tom Lane wrote:

David Steele  writes:

1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is
meant. This was pre-existing but now seems like a good opportunity to
fix it, unless I am misunderstanding.


PL/SQL is Oracle's function language, which PL/pgSQL is modeled on.
At least some of the mentions of PL/SQL are probably intentional,
so you'll have to look closely not just search-and-replace.


Ah, yes. That's what I get for just reading the patch and not looking at 
the larger context.


Regards,
--
-David
da...@pgmasters.net




Re: partial heap only tuples

2021-03-09 Thread Bruce Momjian
On Tue, Mar  9, 2021 at 09:33:31PM +, Bossart, Nathan wrote:
> I'm cautiously optimistic that index creation and deletion will not
> require too much extra work.  For example, if a new index needs to
> point to a partial heap only tuple, it can do so (unlike HOT, which
> would require that the new index point to the root of the chain).  The
> modified-columns bitmaps could include the entire set of modified
> columns (not just the indexed ones), so no additional changes would
> need to be made there.  Furthermore, I'm anticipating that the
> modified-columns bitmaps will end up only being used with the
> redirected LPs to help reduce heap bloat after single-page vacuuming.
> In that case, new indexes would probably avoid the existing bitmaps
> anyway.

Yes, that would probably work, sure.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Huge memory consumption on partitioned table with FKs

2021-03-09 Thread Tom Lane
Amit Langote  writes:
> On Fri, Mar 5, 2021 at 6:00 AM Tom Lane  wrote:
>> This claim seems false on its face:
>>> All child constraints of a given foreign key constraint can use the
>>> same RI query and the resulting plan, that is, no need to create as
>>> many copies of the query and the plan as there are partitions, as
>>> happens now due to the child constraint OID being used in the key
>>> for ri_query_cache.

> The quoted comment could have been written to be clearer about this,
> but it's not talking about the table that is to be queried, but the
> table whose RI trigger is being executed.  In all the cases except one
> (mentioned below), the table that is queried is the same irrespective
> of which partition's trigger is being executed, so we're basically
> creating the same plan separately for each partition.

Hmm.  So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child.  We just have to
be sure we pass the right parameter values.  But ... doesn't the code
use riinfo->fk_attnums[] to pull out the values to be passed?

IOW, I now get the point about being able to share the SPI plans,
but I'm still dubious about sharing the RI_ConstraintInfo cache entries.

It looks to me like the v4 patch is now actually not sharing the
cache entries, ie their hash key is just the child constraint OID
same as before; but the comments are pretty confused about this.

It might be simpler if you add just one new field which is the OID of
the constraint that we're building the SPI query from, which might be
either equal to constraint_id, or the OID of some parent constraint.
In particular it's not clear to me why we need both constraint_parent
and constraint_root_id.
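
Roughly what I have in mind (the second field name is invented here, just for
illustration):

typedef struct RI_ConstraintInfo
{
	Oid			constraint_id;	/* hash key: this (child) constraint's OID */
	Oid			query_constraint_id;	/* constraint the SPI query is built
										 * from: either constraint_id itself,
										 * or the OID of an ancestor */
	/* ... rest of the existing fields ... */
} RI_ConstraintInfo;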

regards, tom lane




Re: POC: GROUP BY optimization

2021-03-09 Thread Tomas Vondra
Hi,

I take a look at the patch today. Attached is the v12 and a separate
patch with some comment tweaks and review comments.


1) I see the patch results in some plan changes in postgres_fdw. I
assume it's somehow related to the sort costing changes, but I find it a
bit suspicious. Why should the plan change from this:

  Foreign Scan
Remote SQL: ... ORDER BY ... OFFSET 100 LIMIT 10;

to

  Limit
Foreign Scan
  Remote SQL: ... ORDER BY ...

But even if this is due to changes to order by costing, postponing the
limit seems like a net loss - the sort happens before the limit, and by
postponing the limit after foreign scan we need to transfer 100 more
rows. So what's causing this?

The difference in plan cost seems fairly substantial (old: 196.52, new:
241.94). But maybe that's OK and expected.


2) I wonder how much more expensive (in terms of CPU) is the new sort
costing code. It's iterative, and it's calling functions to calculate
number of groups etc. I wonder what's the impact on simple queries?


3) It's not clear to me why we need "fake var" protection? Why would the
estimate_num_groups() fail for that?


4) In one of the places a comment says DEFAULT_NUM_DISTINCT is not used
because we're dealing with multiple columns. That makes sense, but maybe
we should use DEFAULT_NUM_DISTINCT at least to limit the range. That is,
with N columns we should restrict the nGroups estimate by:

min = DEFAULT_NUM_DISTINCT
max = Min(pow(DEFAULT_NUM_DISTINCT, ncolumns), tuples)

Also, it's not clear what's the logic behind the formula:

nGroups = ceil(2.0 + sqrt(tuples) *
   list_length(pathkeyExprs) / list_length(pathkeys));

A comment explaining that would be helpful.
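
To be concrete, something along these lines is what I mean (untested, and
the variable names are simply reused from the snippet above, so treat it
as pseudo-code):

nGroups = ceil(2.0 + sqrt(tuples) *
               list_length(pathkeyExprs) / list_length(pathkeys));

/* never assume fewer groups than the default ndistinct estimate ... */
nGroups = Max(nGroups, DEFAULT_NUM_DISTINCT);

/* ... and never more than the multi-column cap or the number of rows */
nGroups = Min(nGroups,
              Min(pow(DEFAULT_NUM_DISTINCT, list_length(pathkeyExprs)),
                  tuples));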


5) The compute_cpu_sort_cost comment includes two references (in a quite
mixed-up way), and then there's another reference in the code. I suggest
we list all of them in the function comment.


6) There's a bunch of magic values (various multipliers etc.). It's not
clear which values are meant to be equal, etc. Let's introduce nicer
constants with reasonable names.


7) The comment at get_cheapest_group_keys_order is a bit misleading,
because it talks about "uniqueness" - but that's not what we're looking
at here (I mean, is the column unique or not). We're looking at number
of distinct values in the column, which is a different thing. Also, it'd
be good to roughly explain what get_cheapest_group_keys_order does - how
it calculates the order (permutations up to 4 values, etc.).


8) The comment at compute_cpu_sort_cost should also explain basics of
the algorithm. I tried doing something like that, but I'm not sure I got
all the details right and it probably needs further improvements.


9) The main concern I have is still about the changes in planner.c, and
I think I've already shared it before. The code takes the grouping
pathkeys, and just reshuffles them to what it believes is a better order
(cheaper, ...). That is, there's one input pathkey list, and it gets
transformed into another pathkey list. The problem is that even if this
optimizes the sort, it may work against some optimization in the upper
part of the plan.

Imagine a query that does something like this:

   SELECT a, b, count(*) FROM (
  SELECT DISTINCT a, b, c, d FROM x
   ) GROUP BY a, b;

Now, from the costing perspective, it may be cheaper to do the inner
grouping by "c, d, a, b" for example. But that works directly against
the second grouping, which would benefit from having the input sorted by
"a, b". How exactly would this behaves depends on the number of distinct
values in the columns, how expensive the comparisons are for each
column, and so on. But you get the idea.

I admit I haven't managed to construct a reasonable query that'd be
significantly harmed by this, but I haven't spent a lot of time on it.

I'm pretty sure I used this trick in the past, when doing aggregations
on large data sets, where it was much better to "pipe" the data through
multiple GroupAggregate nodes.

For this reason I think the code generating paths should work a bit like
get_useful_pathkeys_for_relation and generate_useful_gather_paths.

* group_keys_reorder_by_pathkeys should not produce just one list of
  pathkeys, but multiple interesting options. At the moment it should
  produce at least the input grouping pathkeys (as specified in the
  query), and the "optimal" pathkeys (by lowest cost).

* the places calling group_keys_reorder_by_pathkeys should loop on the
  result, and generate separate path for each option.

I'd guess in the future we'll "peek forward" in the plan and see if
there are other interesting pathkeys (that's the expectation for
get_useful_pathkeys_for_relation).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>From 0d8a7ddc441a3b26a498f6d3d4805fe5d305ebfc Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Tue, 9 Mar 2021 18:44:23 +0100
Subject: [PATCH 1/2] v12

---
 

Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Peter Geoghegan
On Tue, Mar 9, 2021 at 1:54 PM Peter Geoghegan  wrote:
> It occurs to me that we should also mark the hole in the middle of the
> page (which includes the would-be LP_UNUSED line pointers at the end
> of the original line pointer array space) as undefined to Valgrind
> within PageRepairFragmentation().

It would probably also make sense to memset() the space in question to
a sequence of 0x7F bytes in CLOBBER_FREED_MEMORY builds. That isn't
quite as good as what Valgrind will do in some ways, but it has a
major advantage: it will usually visibly break code where the
PageRepairFragmentation() calls made by VACUUM happen to take place
inside another backend.

Valgrind instrumentation of PageRepairFragmentation() along the lines
I've described won't recognize the "hole in the middle of the page"
area as undefined when it was marked undefined in another backend.
It's as if shared memory is private to each process as far as the
memory poisoning/undefined-to-Valgrind stuff is concerned. In other
words, it deals with memory mappings, not memory.
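
To make this concrete, I'm imagining something like the following at the
end of PageRepairFragmentation(), once the tuples have been compacted (a
rough sketch only -- I haven't checked the exact bounds against the code,
and the variable names are made up):

PageHeader ph = (PageHeader) page;
char      *gapstart = (char *) page + ph->pd_lower;
Size       gaplen = ph->pd_upper - ph->pd_lower;

#ifdef CLOBBER_FREED_MEMORY
/* make stale line pointers and tuple data visibly bogus */
memset(gapstart, 0x7F, gaplen);
#endif

/* tell Valgrind the hole is allocated but holds no meaningful values */
VALGRIND_MAKE_MEM_UNDEFINED(gapstart, gaplen);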

-- 
Peter Geoghegan




Re: Optimising latch signals

2021-03-09 Thread Thomas Munro
On Tue, Mar 9, 2021 at 1:09 PM Thomas Munro  wrote:
> On Tue, Mar 9, 2021 at 12:20 PM Alvaro Herrera  
> wrote:
> > Hi, I don't know if you realized but we have two new Illumos members
> > now (haddock and hake), and they're both failing initdb on signalfd()
> > problems.

> I'll wait a short time while he tries to see if that can be fixed (I

They're green now.  For the record: https://www.illumos.org/issues/13613




Re: WIP: BRIN multi-range indexes

2021-03-09 Thread Tomas Vondra



On 3/9/21 9:51 PM, John Naylor wrote:
> 
> On Sun, Mar 7, 2021 at 8:53 PM Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
> [v20210308b]
> 
> I managed to trap an assertion that somehow doesn't happen during the
> regression tests. The callers of fill_expanded_ranges() do math like this:
> 
> /* both ranges and points are expanded into a separate element */
> neranges = ranges->nranges + ranges->nvalues;
> 
> but inside fill_expanded_ranges() we have this assertion:
> 
> /* Check that the output array has the right size. */
> Assert(neranges == (2 * ranges->nranges + ranges->nvalues));
> 

Yeah, that assert is bogus. It's calculating the number of boundary
values (and ranges have two), but in ExpandedRanges we still
represent each range as a single element. So the Assert should be just

Assert(neranges == (ranges->nranges + ranges->nvalues));

But maybe it's just an overkill and we don't really need it.

> In the regression test data, a debugging elog() shows that nranges is
> most often zero, so in that case, the math happens to be right either
> way. I can reliably get nranges above zero by running
> 
> update brintest_multi set int8col = int8col - 1;
> 
> a few times, at which point I get the crash.
>

Hmm, so maybe we should do something like this in regression tests too?
It's not good that we don't trigger the "nranges > 0" case at all.

> 
> Aside from that, the new changes look good. Just a couple small things:
> 
> +    allowed parameters.  Only the bloom operator class
> +    allows specifying parameters:
> 
> minmax-multi aren't mentioned here, but are mentioned further down.
> 

I forgot to add this bit. Will fix.

> + * Addresses from different families are consider to be in maximum
> 
> (comment above brin_minmax_multi_distance_inet)
> s/consider/considered/
> 

Will fix.

>> > 2) moving minmax/inclusion changes from 0002 to a separate patch 0003
>> >
>> > I think we should either ditch the 0003 (i.e. keep the existing
>> > opclasses unchanged) or commit 0003 (in which case I'd vote to just stop
>> > supporting the old signature of the consistent function).
>> >
>>
>> Still not sure what to do about this. I'm leaning towards keeping 0003
>> and just removing the "old" signature entirely, to keep the API cleaner.
>> It might cause some breakage in out-of-core BRIN opclasses, but that
>> seems like a reasonable price. Moreover, the opclasses may need some
>> updating anyway, because of the changes in handling NULL scan keys (0004
>> moves that from the opclass to the bringetbitmap function).
> 
> Keeping 0003 seems reasonable, given the above.
> 

And do you agree with removing the old signature entirely? That might
break some out-of-core opclasses, but we're not aware of any, and
they'll be broken anyway. Seems fine to me.

>> > The remaining part that didn't get much review is the very last patch,
>> > adding an option to ignore correlation for some BRIN opclases. This is
>> > needed as the regular BRIN costing is quite sensitive to correlation,
>> > and the cost gets way too high for poorly correlated data, making it
>> > unlikely the index will be used. But handling such data sets efficiently
>> > is the main point of those new opclasses. Any opinions on this?
>> >
>>
>> Not sure about this.
> 
> I hadn't given it much thought (nor tested), but I just took a look at
> brincostestimate(). If the table is badly correlated, I'm thinking the
> number of ranges that need to be scanned will increase regardless. Maybe
> rather than ignoring correlation, we could clamp it or otherwise tweak
> it. Not sure about the details, though, that would require some testing.
> 

Well, maybe. In any case we need to do something about this, otherwise
the new opclasses won't be used even in cases where it's perfectly OK.
And it needs to be opclass-dependent, in some way.

I'm pretty sure even the simple examples you've used to test
minmax-multi (with updating a fraction of tuples to low/high value)
would be affected by this.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Tom Lane
Peter Geoghegan  writes:
> It occurs to me that we should also mark the hole in the middle of the
> page (which includes the would-be LP_UNUSED line pointers at the end
> of the original line pointer array space) as undefined to Valgrind
> within PageRepairFragmentation().

+1

regards, tom lane




Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Tom Lane
Mark Dilger  writes:
>> On Mar 9, 2021, at 1:35 PM, Tom Lane  wrote:
>> So, to accept a patch that shortens the line pointer array, what we need
>> to do is verify that every such code path checks for an out-of-range
>> offset before trying to fetch the target line pointer.

> Much as Pavan asked [1], I'm curious how we wouldn't already be in trouble if 
> such code exists?  In such a scenario, what stops a dead line pointer from 
> being reused (rather than garbage collected by this patch) prior to such 
> hypothetical code using an outdated TID?

The line pointer very well *could* be re-used before the in-flight
reference gets to it.  That's okay though, because whatever tuple now
occupies the TID would have to have xmin too new to match the snapshot
that such a reference is scanning with.

(Back when we had non-MVCC snapshots to contend with, a bunch of
additional arm-waving was needed to argue that such situations were
safe.  Possibly the proposed change wouldn't have flown back then.)

regards, tom lane




Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Peter Geoghegan
On Tue, Mar 9, 2021 at 1:36 PM Tom Lane  wrote:
>
> Matthias van de Meent  writes:
> > The only two existing mechanisms that I could find (in the access/heap
> > directory) that possibly could fail on shrunken line pointer arrays;
> > being xlog recovery (I do not have enough knowledge on recovery to
> > determine if that may touch pages that have shrunken line pointer
> > arrays, or if those situations won't exist due to never using dirtied
> > pages in recovery) and backwards table scans on non-MVCC snapshots
> > (which would be fixed in the attached patch).
>
> I think you're not visualizing the problem properly.
>
> The case I was concerned about back when is that there are various bits of
> code that may visit a page with a predetermined TID in mind to look at.
> An index lookup is an obvious example, and another one is chasing an
> update chain's t_ctid link.  You might argue that if the tuple was dead
> enough to be removed, there should be no such in-flight references to
> worry about, but I think such an assumption is unsafe.  There is not, for
> example, any interlock that ensures that a process that has obtained a TID
> from an index will have completed its heap visit before a VACUUM that
> subsequently removed that index entry also removes the heap tuple.
>
> So, to accept a patch that shortens the line pointer array, what we need
> to do is verify that every such code path checks for an out-of-range
> offset before trying to fetch the target line pointer.  I believed
> back in 2007 that there were, or once had been, code paths that omitted
> such a range check, assuming that they could trust the TID they had
> gotten from $wherever to point at an extant line pointer array entry.
> Maybe things are all good now, but I think you should run around and
> examine every place that checks for tuple deadness to see if the offset
> it used is known to be within the current page bounds.

It occurs to me that we should also mark the hole in the middle of the
page (which includes the would-be LP_UNUSED line pointers at the end
of the original line pointer array space) as undefined to Valgrind
within PageRepairFragmentation(). This is not to be confused with
marking them inaccessible to Valgrind, which just poisons the bytes.
Rather, it represents that the bytes in question are considered safe
to copy around but not safe to rely on being any particular value. So
Valgrind will complain if the bytes in question influence control
flow, directly or indirectly.

Obviously the code should also be audited. Even then, there may still
be bugs. I think that we need to bite the bullet here -- line pointer
bloat is a significant problem.

-- 
Peter Geoghegan




Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Mark Dilger



> On Mar 9, 2021, at 1:35 PM, Tom Lane  wrote:
> 
> So, to accept a patch that shortens the line pointer array, what we need
> to do is verify that every such code path checks for an out-of-range
> offset before trying to fetch the target line pointer.  I believed
> back in 2007 that there were, or once had been, code paths that omitted
> such a range check, assuming that they could trust the TID they had
> gotten from $wherever to point at an extant line pointer array entry.
> Maybe things are all good now, but I think you should run around and
> examine every place that checks for tuple deadness to see if the offset
> it used is known to be within the current page bounds.

Much as Pavan asked [1], I'm curious how we wouldn't already be in trouble if 
such code exists?  In such a scenario, what stops a dead line pointer from 
being reused (rather than garbage collected by this patch) prior to such 
hypothetical code using an outdated TID?

I'm not expressing a view here, just asking questions.

[1] 
https://www.postgresql.org/message-id/2e78013d0709130832t31244e79k9488a3e4eb00d64c%40mail.gmail.com

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Tom Lane
Matthias van de Meent  writes:
> The only two existing mechanisms that I could find (in the access/heap
> directory) that possibly could fail on shrunken line pointer arrays;
> being xlog recovery (I do not have enough knowledge on recovery to
> determine if that may touch pages that have shrunken line pointer
> arrays, or if those situations won't exist due to never using dirtied
> pages in recovery) and backwards table scans on non-MVCC snapshots
> (which would be fixed in the attached patch).

I think you're not visualizing the problem properly.

The case I was concerned about back when is that there are various bits of
code that may visit a page with a predetermined TID in mind to look at.
An index lookup is an obvious example, and another one is chasing an
update chain's t_ctid link.  You might argue that if the tuple was dead
enough to be removed, there should be no such in-flight references to
worry about, but I think such an assumption is unsafe.  There is not, for
example, any interlock that ensures that a process that has obtained a TID
from an index will have completed its heap visit before a VACUUM that
subsequently removed that index entry also removes the heap tuple.

So, to accept a patch that shortens the line pointer array, what we need
to do is verify that every such code path checks for an out-of-range
offset before trying to fetch the target line pointer.  I believed
back in 2007 that there were, or once had been, code paths that omitted
such a range check, assuming that they could trust the TID they had
gotten from $wherever to point at an extant line pointer array entry.
Maybe things are all good now, but I think you should run around and
examine every place that checks for tuple deadness to see if the offset
it used is known to be within the current page bounds.
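
To be explicit about what I mean by a range check: any code arriving at a
heap page with a TID it got from somewhere else (an index entry, a t_ctid
link, etc) ought to be doing something like this before touching the item
(illustrative only; the variable names are whatever the caller has at hand):

offnum = ItemPointerGetOffsetNumber(tid);
if (offnum < FirstOffsetNumber || offnum > PageGetMaxOffsetNumber(page))
    return false;   /* line pointer array may have shrunk since we got the TID */

lp = PageGetItemId(page, offnum);
if (!ItemIdIsNormal(lp))
    return false;   /* slot is unused, dead, or redirected */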

regards, tom lane




Re: partial heap only tuples

2021-03-09 Thread Bossart, Nathan
On 3/9/21, 8:24 AM, "Bruce Momjian"  wrote:
> On Mon, Feb 15, 2021 at 08:19:40PM +, Bossart, Nathan wrote:
>> Yeah, this is something I'm concerned about.  I think adding a bitmap
>> of modified columns to the header of PHOT-updated tuples improves
>> matters quite a bit, even for single-page vacuuming.  Following is a
>> strategy I've been developing (there may still be some gaps).  Here's
>> a basic PHOT chain where all tuples are visible and the last one has
>> not been deleted or updated:
>>
>> idx1    0   1   2   3
>> idx2    0   1   2
>> idx3    0   2   3
>> lp  1   2   3   4   5
>> tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
>> bitmap  -xx xx- --x x-x
>
> First, I want to continue encouraging you to work on this because I
> think it can yield big improvements.  Second, I like the wiki you
> created.  Third, the diagram above seems to be more meaningful if read
> from the bottom-up.  I suggest you reorder it on the wiki so it can be
> read top-down, maybe:
>
>> lp  1   2   3   4   5
>> tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
>> bitmap  -xx xx- --x x-x
>> idx1    0   1   2   3
>> idx2    0   1   2
>> idx3    0   2   3

I appreciate the feedback and the words of encouragement.  I'll go
ahead and flip the diagrams like you suggested.  I'm planning on
publishing a larger round of edits to the wiki once the patch set is
ready to share.  There are a few changes to the design that I've
picked up along the way.

> Fourth, I know in the wiki you said create/drop index needs more
> research, but I suggest you avoid any design that will be overly complex
> for create/drop index.  For example, a per-row bitmap that is based on
> what indexes exist at time of row creation might cause unacceptable
> problems in handling create/drop index.  Would you number indexes?  I am
> not saying you have to solve all the problems now, but you have to keep
> your eye on obstacles that might block your progress later.

I am agreed on avoiding an overly complex design.  This project
introduces a certain amount of inherent complexity, so one of my main
goals is ensuring that it's easy to reason about each piece.

I'm cautiously optimistic that index creation and deletion will not
require too much extra work.  For example, if a new index needs to
point to a partial heap only tuple, it can do so (unlike HOT, which
would require that the new index point to the root of the chain).  The
modified-columns bitmaps could include the entire set of modified
columns (not just the indexed ones), so no additional changes would
need to be made there.  Furthermore, I'm anticipating that the
modified-columns bitmaps will end up only being used with the
redirected LPs to help reduce heap bloat after single-page vacuuming.
In that case, new indexes would probably avoid the existing bitmaps
anyway.

Nathan



Re: WIP: BRIN multi-range indexes

2021-03-09 Thread John Naylor
On Sun, Mar 7, 2021 at 8:53 PM Tomas Vondra 
wrote:
[v20210308b]

I managed to trap an assertion that somehow doesn't happen during the
regression tests. The callers of fill_expanded_ranges() do math like this:

/* both ranges and points are expanded into a separate element */
neranges = ranges->nranges + ranges->nvalues;

but inside fill_expanded_ranges() we have this assertion:

/* Check that the output array has the right size. */
Assert(neranges == (2 * ranges->nranges + ranges->nvalues));

In the regression test data, a debugging elog() shows that nranges is most
often zero, so in that case, the math happens to be right either way. I can
reliably get nranges above zero by running

update brintest_multi set int8col = int8col - 1;

a few times, at which point I get the crash.


Aside from that, the new changes look good. Just a couple small things:

+allowed parameters.  Only the bloom operator class
+allows specifying parameters:

minmax-multi aren't mentioned here, but are mentioned further down.

+ * Addresses from different families are consider to be in maximum

(comment above brin_minmax_multi_distance_inet)
s/consider/considered/

> > 2) moving minmax/inclusion changes from 0002 to a separate patch 0003
> >
> > I think we should either ditch the 0003 (i.e. keep the existing
> > opclasses unchanged) or commit 0003 (in which case I'd vote to just stop
> > supporting the old signature of the consistent function).
> >
>
> Still not sure what to do about this. I'm leaning towards keeping 0003
> and just removing the "old" signature entirely, to keep the API cleaner.
> It might cause some breakage in out-of-core BRIN opclasses, but that
> seems like a reasonable price. Moreover, the opclasses may need some
> updating anyway, because of the changes in handling NULL scan keys (0004
> moves that from the opclass to the bringetbitmap function).

Keeping 0003 seems reasonable, given the above.

> > The remaining part that didn't get much review is the very last patch,
> > adding an option to ignore correlation for some BRIN opclases. This is
> > needed as the regular BRIN costing is quite sensitive to correlation,
> > and the cost gets way too high for poorly correlated data, making it
> > unlikely the index will be used. But handling such data sets efficiently
> > is the main point of those new opclasses. Any opinions on this?
> >
>
> Not sure about this.

I hadn't given it much thought (nor tested), but I just took a look at
brincostestimate(). If the table is badly correlated, I'm thinking the
number of ranges that need to be scanned will increase regardless. Maybe
rather than ignoring correlation, we could clamp it or otherwise tweak it.
Not sure about the details, though, that would require some testing.
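
For illustration, the sort of clamp I have in mind is no more than this
(a toy helper -- the names, the opclass flag, and the 0.5 floor are all
invented, and I haven't tested how it interacts with the rest of
brincostestimate()):

static double
brin_effective_correlation(double varCorrelation, bool ignore_correlation)
{
    if (ignore_correlation)
        return 1.0;             /* behave as if perfectly correlated */

    /* otherwise just keep the estimate away from zero */
    return Max(fabs(varCorrelation), 0.5);
}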

--
John Naylor
EDB: http://www.enterprisedb.com


Re: New IndexAM API controlling index vacuum strategies

2021-03-09 Thread Peter Geoghegan
On Mon, Mar 8, 2021 at 7:34 PM Peter Geoghegan  wrote:
> > One possible
> > consequence that I'm concerned about is sequential scan performance.
> > For an index scan, you just jump to the line pointer you want and then
> > go get the tuple, but a sequential scan has to loop over all the line
> > pointers on the page, and skipping a lot of dead ones can't be
> > completely free. A small increase in MaxHeapTuplesPerPage probably
> > wouldn't matter, but the proposed increase of almost 10x (291 -> 2042)
> > is a bit scary.
>
> I agree. Maybe the real problem here is that MaxHeapTuplesPerPage is a
> generic constant. Perhaps it should be something that can vary by
> table, according to practical table-level considerations such as
> projected tuple width given the "shape" of tuples for that table, etc.

Speaking of line pointer bloat (and "irreversible" bloat), I came
across something relevant today. I believe that this recent patch from
Matthias van de Meent is a relatively easy way to improve the
situation:

https://www.postgresql.org/message-id/flat/CAEze2WjgaQc55Y5f5CQd3L%3DeS5CZcff2Obxp%3DO6pto8-f0hC4w%40mail.gmail.com

-- 
Peter Geoghegan




Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Peter Geoghegan
On Tue, Mar 9, 2021 at 7:13 AM Matthias van de Meent
 wrote:
> The shrinking of the line pointer array is already common practice in
> indexes (in which all LP_UNUSED items are removed), but this specific
> implementation cannot be used for heap pages due to ItemId
> invalidation. One available implementation, however, is that we
> truncate the end of this array, as mentioned in [1]. There was a
> warning at the top of PageRepairFragmentation about not removing
> unused line pointers, but I believe that was about not removing
> _intermediate_ unused line pointers (which would imply moving in-use
> line pointers); as far as I know there is nothing that relies on only
> growing page->pd_lower, and nothing keeping us from shrinking it
> whilst holding a pin on the page.

Sounds like a good idea to me.

-- 
Peter Geoghegan




Re: pg_upgrade failing for 200+ million Large Objects

2021-03-09 Thread Justin Pryzby
On Wed, Mar 03, 2021 at 11:36:26AM +, Tharakan, Robins wrote:
> While reviewing a failed upgrade from Postgres v9.5 (to v9.6) I saw that the
> instance had ~200 million (in-use) Large Objects. I was able to reproduce
> this on a test instance which too fails with a similar error.

If pg_upgrade can't handle millions of objects/transactions/XIDs, that seems
like a legitimate complaint, since apparently the system is working okay
otherwise.

But it also seems like you're using it outside the range of its intended use
(See also [1]).  I'm guessing that not many people are going to spend time
running tests of pg_upgrade, each of which takes 25hr, not to mention some
multiple of 128GB RAM+swap.

Creating millions of large objects was too slow for me to test like this:
| time { echo 'begin;'; for a in `seq 1 9`; do echo '\lo_import /dev/null'; done; echo 'commit;'; } |psql -qh /tmp postgres&

This seems to be enough for what's needed:
| ALTER SYSTEM SET fsync=no; ALTER SYSTEM SET full_page_writes=no; SELECT pg_reload_conf();
| INSERT INTO pg_largeobject_metadata SELECT a, 0 FROM generate_series(10, 200111222)a;

Now, testing the pg_upgrade was killed after runnning 100min and using 60GB
RAM, so you might say that's a problem too.  I converted getBlobs() to use a
cursor, like dumpBlobs(), but it was still killed.  I think a test case and a
way to exercizes this failure with a more reasonable amount of time and
resources might be a prerequisite for a patch to fix it.

pg_upgrade is meant for "immediate" upgrades, frequently allowing upgrade in
minutes, where pg_dump |pg_restore might take hours or days.  There's two
components to consider: the catalog/metadata part, and the data part.  If the
data is large (let's say more than 100GB), then pg_upgrade is expected to be an
improvement over the "dump and restore" process, which is usually infeasible
for large DBs measured in TB.  There's two

But the *catalog* part is large, and pg_upgrade still has to run pg_dump, and
pg_restore.  The time to do this might dominate over the data part.  Our own
customers DBs are 100s of GB to 10TB.  For large customers, pg_upgrade takes
45min.  In the past, we had tables with many column defaults, which caused the
dump+restore to be slow at a larger fraction of customers.

If it were me, in an EOL situation, I would look at either: 1) find a way to do
dump+restore rather than pg_upgrade; and/or, 2) separately pg_dump the large
objects, drop as many as you can, then pg_upgrade the DB, then restore the
large objects.  (And find a better way to store them in the future).

I was able to hack pg_upgrade to call pg_restore --single (with a separate
invocation to handle --create).  That passes tests...but I can't say much
beyond that.

Regarding your existing patch: "make check" only tests SQL features.
For development, you'll want to configure like:
|./configure --enable-debug --enable-cassert --enable-tap-tests
And then use "make check-world", and in particular:
time make check -C src/bin/pg_resetwal
time make check -C src/bin/pg_upgrade

I don't think pg_restore needs a user-facing option for XIDs.  I think it
should "just work", since a user might be as likely to shoot themselves in the
foot with a commandline option as they are to make an upgrade succeed that
would otherwise fail.  pg_upgrade has a --check mode, and if that passes, the
upgrade is intended to work, and not fail halfway through between the schema
dump and restore, with the expectation that the user know to rerun with some
commandline flags.  If you pursue the patch with setting a different XID
threshold, maybe you could count the number of objects to be created, or
transactions to be used, and use that as the argument to resetxlog ?  I'm not
sure, but pg_restore -l might be a good place to start looking.

I think a goal for this patch should be to allow an increased number of
objects to be handled by pg_upgrade.  Large objects may be a special case, and
increasing the number of other objects to be restored to the 100s of millions
might be unimportant.

-- 
Justin

[1] https://www.postgresql.org/message-id/502641.1606334432%40sss.pgh.pa.us
| Does pg_dump really have sane performance for that situation, or
| are we soon going to be fielding requests to make it not be O(N^2)
| in the number of listed tables?
>From cfc7400bb021659d49170e8b17d067c8e1b9fa33 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 9 Mar 2021 14:06:17 -0600
Subject: [PATCH] pg_dump: use a cursor in getBlobs..

..to mitigate huge memory use in the case of millions of large objects
---
 src/bin/pg_dump/pg_dump.c | 96 +--
 1 file changed, 52 insertions(+), 44 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index aa02ada079..3fd7f48605 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -,7 +,7 @@ getBlobs(Archive *fout)
 	BlobInfo   *binfo;
 	DumpableObject *bdata;
 	PGresult   *res;
-	int			ntups;

Re: [patch] [doc] Minor variable related cleanup and rewording of plpgsql docs

2021-03-09 Thread David G. Johnston
On Tue, Mar 9, 2021 at 10:45 AM Pavel Stehule 
wrote:

>
>
> On Tue, Mar 9, 2021 at 18:03, David Steele wrote:
>
>> On 11/30/20 10:37 AM, Pavel Stehule wrote:
>> > po 30. 11. 2020 v 16:06 odesílatel David G. Johnston
>> >
>> > ok
>> This patch looks reasonable to me overall.
>>
>> A few comments:
>>
>> 1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is
>> meant. This was pre-existing but now seems like a good opportunity to
>> fix it, unless I am misunderstanding.
>>
>
> +1
>

I vaguely recall looking for this back in October and not finding anything
that needed fixing in the area I was working in.  The ready-for-commit can
stand without further investigation.  Feel free to look for and fix
oversights of this nature if you feel they exist.


>
>> 2) I think:
>>
>> + makes the command behave like SELECT, which is
>> described
>>
>> flows a little better as:
>>
>> + makes the command behave like SELECT, as
>> described
>>
>
> I am not a native speaker, so I am not able to evaluate it.
>

"which is described" is perfectly valid.  I don't know that "as described"
is materially better from a flow perspective (I agree it reads a tiny bit
better) but either seems to adequately communicate the intended point so I
wouldn't gripe if it was changed during commit.

I intend to leave the patch as-is though, since as written it is
committable; this second comment is just style and the first is scope creep.

David J.


Re: Allow batched insert during cross-partition updates

2021-03-09 Thread Georgios Kokolatos
Hi,

thanks for the patch. I had a first look and played around with the code.

The code seems clean, complete, and does what it says on the tin. I will
need a bit more time to acclimatise with all the use cases for a more
thorough review.

A small question, though: why expose PartitionTupleRouting rather than add
a couple of functions to get the necessary info? If I have read the code
correctly, the only members that actually need to be exposed are num_partitions
and partitions. Not a critique, I am just curious.
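
For example, something along these lines (just a sketch; the function names
and exact return types are invented) would seem to be enough, without moving
the struct definition out of execPartition.c:

int
ExecPartitionRoutingNumPartitions(PartitionTupleRouting *proute)
{
    return proute->num_partitions;
}

ResultRelInfo **
ExecPartitionRoutingPartitions(PartitionTupleRouting *proute)
{
    return proute->partitions;
}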

Cheers,
//Georgios

Re: Procedures versus the "fastpath" API

2021-03-09 Thread Joe Conway
On 3/9/21 2:15 PM, Tom Lane wrote:
> So the question on the table is what to do about this.  As far as
> window functions go, it seems clear that fastpath.c should just reject
> any attempt to call a window function that way (or an aggregate for
> that matter; aggregates fail already, but with relatively obscure
> error messages).  Perhaps there's also an argument that window
> functions should have run-time tests, not just assertions, that
> they're called in a valid way.
> 
> As for procedures, I'm of the opinion that we should just reject those
> too, but some other security team members were not happy with that
> idea.  Conceivably we could attempt to make the case actually work,
> but is it worth the trouble?  Given the lack of field complaints
> about the "invalid transaction termination" failure, it seems unlikely
> that it's occurred to anyone to try to call procedures this way.
> We'd need special infrastructure to test the case, too, since psql
> offers no access to fastpath calls.

+1

> A compromise suggestion was to prohibit calling procedures via
> fastpath as of HEAD, but leave existing releases as they are,
> in case anyone is using a case that happens to work.
> 
> Thoughts?

My vote would be to reject using fastpath for procedures in all relevant branches.
If someday someone cares enough to make it work, it is a new feature for a new
major release.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-09 Thread Joel Jacobson
On Tue, Mar 9, 2021, at 17:42, Tom Lane wrote:
> "Joel Jacobson"  writes:
> > Tom - can you please give details on your unpleasant experiences with 
> > parallel arrays?
> 
> The problems I can recall running into were basically down to not having
> an easy way to iterate through parallel arrays.  There are ways to do
> that in SQL, certainly, but they all constrain how you write the query,
> and usually force ugly stuff like splitting it into sub-selects.

I see now what you mean, many thanks for explaining.

> 
> As an example, presuming that regexp_positions is defined along the
> lines of
> 
> regexp_positions(str text, pat text, out starts int[], out lengths int[])
> returns setof record

+1

I think this is the most feasible best option so far.

Attached is a patch implementing it this way.

I changed the start to begin at 1, since this is how position ( substring text 
IN string text ) → integer works.

SELECT * FROM regexp_positions('foobarbequebaz', '^', 'g');
 starts | lengths
--------+---------
 {1}    | {0}
(1 row)

SELECT * FROM regexp_positions('foobarbequebaz', 'ba.', 'g');
 starts | lengths
--------+---------
 {4}    | {3}
 {12}   | {3}
(2 rows)

Mark's examples:

SELECT * FROM regexp_positions('foObARbEqUEbAz', $re$(?=beque)$re$, 'i');
 starts | lengths
--------+---------
 {7}    | {0}
(1 row)


SELECT * FROM regexp_positions('foobarbequebaz', '(?<=z)', 'g');
 starts | lengths
--------+---------
 {15}   | {0}
(1 row)

I've also tested your template queries:

> 
> then to actually get the identified substrings you'd have to do something
> like
> 
> select
>   substring([input string] from starts[i] for lengths[i])
> from
>   regexp_positions([input string], [pattern]) r,
>   lateral
> generate_series(1, array_length(starts, 1)) i;

select
  substring('foobarbequebaz' from starts[i] for lengths[i])
from
  regexp_positions('foobarbequebaz', 'ba.', 'g') r,
  lateral
generate_series(1, array_length(starts, 1)) i;

 substring
-----------
 bar
 baz
(2 rows)

> I think the last time I confronted this, we didn't have multi-array
> UNNEST.  Now that we do, we can get rid of the generate_series(),
> but it's still not beautiful:
> 
> select
>   substring([input string] from s for l)
> from
>   regexp_positions([input string], [pattern]) r,
>   lateral
> unnest(starts, lengths) u(s,l);

select
  substring('foobarbequebaz' from s for l)
from
  regexp_positions('foobarbequebaz', 'ba.', 'g') r,
  lateral
unnest(starts, lengths) u(s,l);

 substring
-----------
 bar
 baz
(2 rows)

> Having said that, the other alternative with a 2-D array:
> 
> regexp_positions(str text, pat text) returns setof int[]
> 
> seems to still need UNNEST, though now it's not the magic multi-array
> UNNEST but this slicing version:
> 
> select
>   substring([input string] from u[1] for u[2])
> from
>   regexp_positions([input string], [pattern]) r,
>   lateral
> unnest_slice(r, 1) u;

Unable to test this one since there is no unnest_slice() (yet)

> 
> Anyway, I'd counsel trying to write out SQL implementations
> of regexp_matches() and other useful things based on any
> particular regexp_positions() API you might be thinking about.
> Can we do anything useful without a LATERAL UNNEST thingie?
> Are some of them more legible than others?

Hmm, I cannot think of a way.


/Joel

0004-regexp-positions.patch
Description: Binary data


Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Matthias van de Meent
On Tue, 9 Mar 2021 at 17:21, Mark Dilger  wrote:
>
> For a prior discussion on this topic:
>
> https://www.postgresql.org/message-id/2e78013d0709130606l56539755wb9dbe17225ffe90a%40mail.gmail.com

Thanks for the reference! I note that that thread mentions the
old-style VACUUM FULL as a reason why this would be unsafe, but
old-style VACUUM FULL was removed quite a few versions ago (in v9.0).

The only two existing mechanisms that I could find (in the access/heap
directory) that could possibly fail on shrunken line pointer arrays
are xlog recovery (I do not have enough knowledge of recovery to
determine whether it may touch pages that have shrunken line pointer
arrays, or whether those situations won't exist due to never using
dirtied pages in recovery) and backwards table scans on non-MVCC
snapshots (which would be fixed in the attached patch).

With regards,

Matthias van de Meent
From ef3db93a172ceb2bcab5516336462117d3c99fd3 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent 
Date: Tue, 9 Mar 2021 14:42:52 +0100
Subject: [PATCH v2] Truncate a pages' line pointer array when it has trailing
 unused ItemIds.

This will allow reuse of what is effectively free space for data as well as
new line pointers, instead of keeping it reserved for line pointers only.

An additional benefit is that the HasFreeLinePointers hint-bit optimization
now doesn't hint for free line pointers at the end of the array, slightly
increasing the specificity of where the free lines are; and saving us from
needing to search to the end of the array if all other entries are already
filled.
---
 src/backend/access/heap/heapam.c   |  9 -
 src/backend/storage/page/bufpage.c | 13 -
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3b435c107d..ae218cfac6 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -635,8 +635,15 @@ heapgettup(HeapScanDesc scan,
 		}
 		else
 		{
+			/* 
+			 * The previous returned tuple may have been vacuumed since the
+			 * previous scan when we use a non-MVCC snapshot, so we must
+			 * re-establish the lineoff <= PageGetMaxOffsetNumber(dp)
+			 * invariant
+			 */
 			lineoff =			/* previous offnum */
-OffsetNumberPrev(ItemPointerGetOffsetNumber(&(tuple->t_self)));
+Min(lines,
+	OffsetNumberPrev(ItemPointerGetOffsetNumber(&(tuple->t_self))));
 		}
 		/* page and lineoff now reference the physically previous tid */
 
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 9ac556b4ae..10d8f26ad0 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -672,7 +672,11 @@ compactify_tuples(itemIdCompact itemidbase, int nitems, Page page, bool presorte
  * PageRepairFragmentation
  *
  * Frees fragmented space on a page.
- * It doesn't remove unused line pointers! Please don't change this.
+ * It doesn't remove intermediate unused line pointers (that would mean
+ * moving ItemIds, and that would imply invalidating indexed values), but it
+ * does truncate the page->pd_linp array to the last unused line pointer, so
+ * that this space may also be reused for data, instead of only for line
+ * pointers.
  *
  * This routine is usable for heap pages only, but see PageIndexMultiDelete.
  *
@@ -691,6 +695,7 @@ PageRepairFragmentation(Page page)
 	int			nline,
 nstorage,
 nunused;
+	OffsetNumber lastUsed = InvalidOffsetNumber;
 	int			i;
 	Size		totallen;
 	bool		presorted = true;	/* For now */
@@ -724,6 +729,7 @@ PageRepairFragmentation(Page page)
 		lp = PageGetItemId(page, i);
 		if (ItemIdIsUsed(lp))
 		{
+			lastUsed = i;
 			if (ItemIdHasStorage(lp))
 			{
 itemidptr->offsetindex = i - 1;
@@ -771,6 +777,11 @@ PageRepairFragmentation(Page page)
 		compactify_tuples(itemidbase, nstorage, page, presorted);
 	}
 
+	if (lastUsed != nline) {
+		((PageHeader) page)->pd_lower = SizeOfPageHeaderData + (sizeof(ItemIdData) * lastUsed);
+		nunused = nunused - (nline - lastUsed);
+	}
+
 	/* Set hint bit for PageAddItem */
 	if (nunused > 0)
 		PageSetHasFreeLinePointers(page);
-- 
2.20.1



Procedures versus the "fastpath" API

2021-03-09 Thread Tom Lane
The security team received a report from Theodor-Arsenij
Larionov-Trichkin of PostgresPro that it's possible to crash the
backend with an assertion or null-pointer dereference by trying to
call a window function via the "fast path function call" protocol
message.  fastpath.c doesn't set up any WindowObject function context,
of course, but most of the built-in window functions just assume there
will be one.  We concluded that there's no possibility of anything
more interesting than an immediate core dump, so per our usual rules
this isn't a CVE-grade bug.  However, we poked around to see if there
were any related problems, and soon found that fastpath.c will happily
attempt to call procedures as well as functions.  That seems to work,
accidentally IMO, for simple procedures --- but if the procedure tries
to COMMIT or ROLLBACK then you get "ERROR: invalid transaction
termination".  (There might be other edge-case problems; I've not
tried subtransactions or OUT parameters for example.)

So the question on the table is what to do about this.  As far as
window functions go, it seems clear that fastpath.c should just reject
any attempt to call a window function that way (or an aggregate for
that matter; aggregates fail already, but with relatively obscure
error messages).  Perhaps there's also an argument that window
functions should have run-time tests, not just assertions, that
they're called in a valid way.

As for procedures, I'm of the opinion that we should just reject those
too, but some other security team members were not happy with that
idea.  Conceivably we could attempt to make the case actually work,
but is it worth the trouble?  Given the lack of field complaints
about the "invalid transaction termination" failure, it seems unlikely
that it's occurred to anyone to try to call procedures this way.
We'd need special infrastructure to test the case, too, since psql
offers no access to fastpath calls.
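
If we do decide to reject these cases, I'm imagining nothing more than a
check in fastpath.c right after the pg_proc lookup, along these lines
("pform" here stands for whatever Form_pg_proc pointer that code has in
hand, and the error wording is only a placeholder):

if (pform->prokind != PROKIND_FUNCTION)
    ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
             errmsg("cannot call %s via the fastpath interface",
                    pform->prokind == PROKIND_PROCEDURE ? "a procedure" :
                    pform->prokind == PROKIND_AGGREGATE ? "an aggregate function" :
                    "a window function")));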

A compromise suggestion was to prohibit calling procedures via
fastpath as of HEAD, but leave existing releases as they are,
in case anyone is using a case that happens to work.

Thoughts?

regards, tom lane




Re: Proposal: Save user's original authenticated identity for logging

2021-03-09 Thread Jacob Champion
On Tue, 2021-03-09 at 18:03 +, Jacob Champion wrote:
> v4 removes the TODO and the extra allocation for peer_user. I'll hold
> off on the other two suggestions pending that conversation.

And v5 is rebased over this morning's SSL test changes.

--Jacob
From 1159ef7f8fb846649c1c36bb1ecd17bd4b687668 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Wed, 3 Feb 2021 11:04:42 -0800
Subject: [PATCH v5 1/3] test/kerberos: only search forward in logs

The log content tests searched through the entire log file, from the
beginning, for every match. If two tests shared the same expected log
content, it was possible for the second test to get a false positive by
matching against the first test's output. (This could be seen by
modifying one of the expected-failure tests to expect the same output as
a previous happy-path test.)

Store the offset of the previous match, and search forward from there
during the next match, resetting the offset every time the log file
changes. This could still result in false positives if one test spits
out two or more matching log lines, but it should be an improvement over
what's there now.
---
 src/test/kerberos/t/001_auth.pl | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index 079321bbfc..c721d24dbd 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -182,6 +182,9 @@ $node->safe_psql('postgres', 'CREATE USER test1;');
 
 note "running tests";
 
+my $current_log_name = '';
+my $current_log_position;
+
 # Test connection success or failure, and if success, that query returns true.
 sub test_access
 {
@@ -221,18 +224,37 @@ sub test_access
 		$lfname =~ s/^stderr //;
 		chomp $lfname;
 
+		if ($lfname ne $current_log_name)
+		{
+			$current_log_name = $lfname;
+
+			# Search from the beginning of this new file.
+			$current_log_position = 0;
+			note "current_log_position = $current_log_position";
+		}
+
 		# might need to retry if logging collector process is slow...
 		my $max_attempts = 180 * 10;
 		my $first_logfile;
 		for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
 		{
 			$first_logfile = slurp_file($node->data_dir . '/' . $lfname);
-			last if $first_logfile =~ m/\Q$expect_log_msg\E/;
+
+			# Don't include previously matched text in the search.
+			$first_logfile = substr $first_logfile, $current_log_position;
+			if ($first_logfile =~ m/\Q$expect_log_msg\E/g)
+			{
+$current_log_position += pos($first_logfile);
+last;
+			}
+
 			usleep(100_000);
 		}
 
 		like($first_logfile, qr/\Q$expect_log_msg\E/,
 			 'found expected log file content');
+
+		note "current_log_position = $current_log_position";
 	}
 
 	return;
-- 
2.25.1

From 5ac20ba6cac339bc4ffc0d765c67bfbf6583425c Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Mon, 8 Feb 2021 10:53:20 -0800
Subject: [PATCH v5 2/3] ssl: store client's DN in port->peer_dn

Original patch by Andrew Dunstan:

https://www.postgresql.org/message-id/fd96ae76-a8e3-ef8e-a642-a592f5b76771%40dunslane.net

but I've taken out the clientname=DN functionality; all that will be
needed for the next patch is the DN string.
---
 src/backend/libpq/be-secure-openssl.c | 53 +++
 src/include/libpq/libpq-be.h  |  1 +
 2 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 8c37381add..a7b09534a8 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -546,22 +546,25 @@ aloop:
 	/* Get client certificate, if available. */
 	port->peer = SSL_get_peer_certificate(port->ssl);
 
-	/* and extract the Common Name from it. */
+	/* and extract the Common Name / Distinguished Name from it. */
 	port->peer_cn = NULL;
+	port->peer_dn = NULL;
 	port->peer_cert_valid = false;
 	if (port->peer != NULL)
 	{
 		int			len;
+		X509_NAME  *x509name = X509_get_subject_name(port->peer);
+		char	   *peer_cn;
+		char	   *peer_dn;
+		BIO		   *bio = NULL;
+		BUF_MEM*bio_buf = NULL;
 
-		len = X509_NAME_get_text_by_NID(X509_get_subject_name(port->peer),
-		NID_commonName, NULL, 0);
+		len = X509_NAME_get_text_by_NID(x509name, NID_commonName, NULL, 0);
 		if (len != -1)
 		{
-			char	   *peer_cn;
-
 			peer_cn = MemoryContextAlloc(TopMemoryContext, len + 1);
-			r = X509_NAME_get_text_by_NID(X509_get_subject_name(port->peer),
-		  NID_commonName, peer_cn, len + 1);
+			r = X509_NAME_get_text_by_NID(x509name, NID_commonName, peer_cn,
+		  len + 1);
 			peer_cn[len] = '\0';
 			if (r != len)
 			{
@@ -585,6 +588,36 @@ aloop:
 
 			port->peer_cn = peer_cn;
 		}
+
+		bio = BIO_new(BIO_s_mem());
+		if (!bio)
+		{
+			pfree(port->peer_cn);
+			port->peer_cn = NULL;
+			return -1;
+		}
+		/* use commas instead of slashes */
+		X509_NAME_print_ex(bio, x509name, 0, XN_FLAG_RFC2253);
+		BIO_get_mem_ptr(bio, &bio_buf);
+		peer_dn = 

Re: Online checksums patch - once again

2021-03-09 Thread Daniel Gustafsson
> On 9 Mar 2021, at 19:12, Bruce Momjian  wrote:

Since this patch is de-facto rejected I'll mark it withdrawn in the CF app to
save on cfbot bandwidth.

> Do we support or document the ability to create a standby with checksums
> from a primary without it, and is that a better approach?

Michael Banck started a new thread for that forking off of this one on message
id 8f193f949b39817b9c642623e1fe7ccb94137ce4.ca...@credativ.de so it's probably
better to continue the discussion of that over there.

--
Daniel Gustafsson   https://vmware.com/





Re: Fwd: Row description Metadata information

2021-03-09 Thread Bruce Momjian
On Mon, Feb 15, 2021 at 05:25:55PM -0800, Aleksei Ivanov wrote:
> Not sure that previous email was sent correctly. If it was duplicated, sorry
> for the inconvenience.
> 
> Hi, hackers,
> 
> I have one question related to returned information in the row description for
> prepared statement.
> 
> For example Select $1 * 2 and then Bind 1.6 to it.
> The returned result is correct and equal to 3.2, but type modifier in the row
> description is equal to -1, which is not correct.
> 
> Does someone know where this modifier is calculated? Is this a bug or 
> intention
> behavior?

Postgres can't always propagate the type modifier for all expressions, so
it basically doesn't even try.  For example, the modifier for || would
be very complex.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Online checksums patch - once again

2021-03-09 Thread Bruce Momjian
On Mon, Feb 15, 2021 at 02:02:02PM +0100, Daniel Gustafsson wrote:
> > On 11 Feb 2021, at 14:10, Bruce Momjian  wrote:
> > I don't think anyone has done anything wrong --- rather, it is what we
> > are _trying_ to do that is complex.
> 
> Global state changes in a cluster are complicated, and are unlikely to ever
> be otherwise.  By writing patches that attempt such state changes we can see which
> pieces of infrastructure we're lacking to reduce complexity.  A good example 
> is
> the ProcSignalBarrier work that Andres and Robert did, inspired in part by 
> this
> patch IIUC.  The more we do, the more we learn.

Do we support or document the ability to create a standby with checksums
from a primary without it, and is that a better approach?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: [patch] [doc] Minor variable related cleanup and rewording of plpgsql docs

2021-03-09 Thread Tom Lane
David Steele  writes:
> 1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is 
> meant. This was pre-existing but now seems like a good opportunity to 
> fix it, unless I am misunderstanding.

PL/SQL is Oracle's function language, which PL/pgSQL is modeled on.
At least some of the mentions of PL/SQL are probably intentional,
so you'll have to look closely not just search-and-replace.

regards, tom lane




Re: default result formats setting

2021-03-09 Thread Tom Lane
David Steele  writes:
> Andrew, Tom, does the latest patch address your concerns?

[ reads patch quickly... ]  I think the definition is fine now,
modulo possible bikeshedding on the GUC name.  (I have no
great suggestion on that right now, but the current proposal
seems mighty verbose.)

The implementation feels weird though, mainly in that I don't like
Peter's choices for where to put the code.  pquery.c is not where
I would have expected to find the support for this, and I do not
have any confidence that applying the format conversion while
filling portal->formats[] is enough to cover all cases.  I'd have
thought that access/common/printtup.c or somewhere near there
would be where to do it.

Likewise, the code associated with caching the results of the type
OID lookups seems like it should be someplace where you'd be more
likely to find (a) type name lookup and (b) caching logic.  I'm
not quite sure about the best place for that, but we could do
worse than put it in parse_type.c.  (As I recall, the parser
already has some caching related to operator lookup, so doing
part (b) there isn't too much of a stretch.)

Also, if we need YA string-splitting function, please put it
beside the ones that already exist (SplitIdentifierString etc in
varlena.c).  That way (a) it's available if some other code needs
it, and (b) when somebody gets around to refactoring all the
splitters, they won't have to dig into nooks and crannies to find
them.

Having said that, I wonder if we should define the parameter's
contents this way, i.e. as things that parseTypeString will
accept.  At best that's overspecification, e.g. should people
expect that varchar(7) and varchar(9) are different things
(and, perhaps, that such entries *don't* match varchars of other
lengths?)  I think a case could be made for requiring the entries
to be identifiers exactly matching pg_type.typname, possibly with
schema qualification.  This'd allow tighter verification of the
GUC value's format in the GUC check hook.

Or we could drop all of that and go back to having it be a list
of type OIDs, which would remove a *whole lot* of the complexity,
and I'm not sure that it's materially less friendly.  Applications
have had to deal with type OIDs in the protocol since forever.

BTW, I wonder whether we still need to restrict the GUC to not
be settable from postgresql.conf.  The fact that the client has
to explicitly pass -1 seems to reduce any security issues quite
a bit.

regards, tom lane




Re: Proposal: Save user's original authenticated identity for logging

2021-03-09 Thread Jacob Champion
On Mon, 2021-03-08 at 22:16 +, Jacob Champion wrote:
> On Sat, 2021-03-06 at 18:33 +0100, Magnus Hagander wrote:
> > As for log escaping, we report port->user_name already unescaped --
> > surely this shouldn't be a worse case than that?
> 
> Ah, that's a fair point. I'll remove the TODO.

v4 removes the TODO and the extra allocation for peer_user. I'll hold
off on the other two suggestions pending that conversation.

--Jacob
commit c323ba88d5a9823765c68cd4c2c169493e4c269a
Author: Jacob Champion 
Date:   Tue Mar 9 09:53:03 2021 -0800

fixup! Log authenticated identity from all auth backends

diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 65d10a5ad2..bfcffbdb3b 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -354,8 +354,6 @@ set_authn_id(Port *port, const char *id)
 {
Assert(id);
 
-   /* TODO: should the identity strings be escaped before being logged? */
-
if (port->authn_id)
{
/*
@@ -2002,7 +2000,6 @@ auth_peer(hbaPort *port)
gid_t   gid;
 #ifndef WIN32
struct passwd *pw;
-   char   *peer_user;
int ret;
 #endif
 
@@ -2038,12 +2035,9 @@ auth_peer(hbaPort *port)
 * Make a copy of static getpw*() result area. This is our authenticated
 * identity.
 */
-   peer_user = pstrdup(pw->pw_name);
-   set_authn_id(port, peer_user);
-
-   ret = check_usermap(port->hba->usermap, port->user_name, peer_user, 
false);
+   set_authn_id(port, pw->pw_name);
 
-   pfree(peer_user);
+   ret = check_usermap(port->hba->usermap, port->user_name, 
port->authn_id, false);
 
return ret;
 #else
From 22518e1ca0be717ce001f0c79214924fc62eb75c Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Wed, 3 Feb 2021 11:04:42 -0800
Subject: [PATCH v4 1/3] test/kerberos: only search forward in logs

The log content tests searched through the entire log file, from the
beginning, for every match. If two tests shared the same expected log
content, it was possible for the second test to get a false positive by
matching against the first test's output. (This could be seen by
modifying one of the expected-failure tests to expect the same output as
a previous happy-path test.)

Store the offset of the previous match, and search forward from there
during the next match, resetting the offset every time the log file
changes. This could still result in false positives if one test spits
out two or more matching log lines, but it should be an improvement over
what's there now.
---
 src/test/kerberos/t/001_auth.pl | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index 079321bbfc..c721d24dbd 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -182,6 +182,9 @@ $node->safe_psql('postgres', 'CREATE USER test1;');
 
 note "running tests";
 
+my $current_log_name = '';
+my $current_log_position;
+
 # Test connection success or failure, and if success, that query returns true.
 sub test_access
 {
@@ -221,18 +224,37 @@ sub test_access
 		$lfname =~ s/^stderr //;
 		chomp $lfname;
 
+		if ($lfname ne $current_log_name)
+		{
+			$current_log_name = $lfname;
+
+			# Search from the beginning of this new file.
+			$current_log_position = 0;
+			note "current_log_position = $current_log_position";
+		}
+
 		# might need to retry if logging collector process is slow...
 		my $max_attempts = 180 * 10;
 		my $first_logfile;
 		for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
 		{
 			$first_logfile = slurp_file($node->data_dir . '/' . $lfname);
-			last if $first_logfile =~ m/\Q$expect_log_msg\E/;
+
+			# Don't include previously matched text in the search.
+			$first_logfile = substr $first_logfile, $current_log_position;
+			if ($first_logfile =~ m/\Q$expect_log_msg\E/g)
+			{
+				$current_log_position += pos($first_logfile);
+				last;
+			}
+
 			usleep(100_000);
 		}
 
 		like($first_logfile, qr/\Q$expect_log_msg\E/,
 			 'found expected log file content');
+
+		note "current_log_position = $current_log_position";
 	}
 
 	return;
-- 
2.25.1

From f3e5040fd29c0143b7d9991d90c94d4151f5622a Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Mon, 8 Feb 2021 10:53:20 -0800
Subject: [PATCH v4 2/3] ssl: store client's DN in port->peer_dn

Original patch by Andrew Dunstan:

https://www.postgresql.org/message-id/fd96ae76-a8e3-ef8e-a642-a592f5b76771%40dunslane.net

but I've taken out the clientname=DN functionality; all that will be
needed for the next patch is the DN string.
---
 src/backend/libpq/be-secure-openssl.c | 53 +++
 src/include/libpq/libpq-be.h  |  1 +
 2 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 4c4f025eb1..d0184a2ce2 100644
--- 

Re: [PATCH] pg_permissions

2021-03-09 Thread Joel Jacobson
On Tue, Mar 9, 2021, at 04:01, Chapman Flack wrote:
> On Sat, Mar 06, 2021 at 08:03:17PM +0100, Joel Jacobson wrote:
> >    regclass   |  obj_desc   | grantor | grantee | privilege_type | is_grantable
> > --------------+-------------+---------+---------+----------------+--------------
> 
> 1. Is there a reason not to make 'grantor' and 'grantee' of type regrole?
>In other words, s/rolname/oid::regrole/ throughout the view definition.
>It looks the same visually, but should be easier to build on in a larger
>query.
> 
>Hmm, ok, a grantee of 'public' can't be expressed as a regrole. This
>seems an annoying little corner.[1] It can be represented by 0::regrole,
>but that displays as '-'. Hmm again, you can even '-'::regrole and get 0.
> 
> 
> 2. Also to facilitate use in a larger query, how about columns for the
>objid and objsubid, in addition to the human-friendly obj_desc?
>And I'm not sure about using pg_attribute as the regclass for
>attributes; it's nice to look at, but could also plant the wrong idea
>that attributes have pg_attribute as their classid, when it's really
>pg_class with an objsubid. Anyway, there's the human-friendly obj_desc
>to tell you it's a column.

Thanks for coming up with these two good ideas. I was wrong, they are great.

Both have now been implemented.

New patch attached.

Example usage:

CREATE ROLE test_user;
CREATE ROLE test_group;
CREATE ROLE test_owner;
CREATE SCHEMA test AUTHORIZATION test_owner;
GRANT ALL ON SCHEMA test TO test_group;
GRANT test_group TO test_user;

SELECT * FROM pg_permissions WHERE grantor = 'test_owner'::regrole;
   classid    | objid | objsubid |   objdesc   |  grantor   |  grantee   | privilege_type | is_grantable
--------------+-------+----------+-------------+------------+------------+----------------+--------------
 pg_namespace | 16390 |        0 | schema test | test_owner | test_owner | USAGE          | f
 pg_namespace | 16390 |        0 | schema test | test_owner | test_owner | CREATE         | f
 pg_namespace | 16390 |        0 | schema test | test_owner | test_group | USAGE          | f
 pg_namespace | 16390 |        0 | schema test | test_owner | test_group | CREATE         | f
(4 rows)

SET ROLE TO test_user;
CREATE TABLE test.a ();
RESET ROLE;

SELECT * FROM pg_ownerships WHERE owner = 'test_owner'::regrole;
   classid    | objid | objsubid |   objdesc   |   owner
--------------+-------+----------+-------------+------------
 pg_namespace | 16390 |        0 | schema test | test_owner
(1 row)

ALTER TABLE test.a OWNER TO test_owner;

SELECT * FROM pg_ownerships WHERE owner = 'test_owner'::regrole;
   classid    | objid | objsubid |   objdesc   |   owner
--------------+-------+----------+-------------+------------
 pg_class     | 16391 |        0 | table a     | test_owner
 pg_namespace | 16390 |        0 | schema test | test_owner
 pg_type      | 16393 |        0 | type a      | test_owner
 pg_type      | 16392 |        0 | type a[]    | test_owner
(4 rows)

GRANT INSERT ON test.a TO test_group;

SELECT * FROM pg_permissions WHERE grantee = 'test_group'::regrole;
   classid    | objid | objsubid |   objdesc   |  grantor   |  grantee   | privilege_type | is_grantable
--------------+-------+----------+-------------+------------+------------+----------------+--------------
 pg_class     | 16391 |        0 | table a     | test_owner | test_group | INSERT         | f
 pg_namespace | 16390 |        0 | schema test | test_owner | test_group | USAGE          | f
 pg_namespace | 16390 |        0 | schema test | test_owner | test_group | CREATE         | f
(3 rows)

/Joel

0003-pg_permissions-and-pg_ownerships.patch
Description: Binary data


Re: [patch] [doc] Minor variable related cleanup and rewording of plpgsql docs

2021-03-09 Thread Pavel Stehule
On Tue, Mar 9, 2021 at 18:03, David Steele wrote:

> On 11/30/20 10:37 AM, Pavel Stehule wrote:
> > On Mon, Nov 30, 2020 at 16:06, David G. Johnston
> >
> > ok
> This patch looks reasonable to me overall.
>
> A few comments:
>
> 1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is
> meant. This was pre-existing but now seems like a good opportunity to
> fix it, unless I am misunderstanding.
>

+1


> 2) I think:
>
> + makes the command behave like SELECT, which is described
>
> flows a little better as:
>
> + makes the command behave like SELECT, as described
>

I am not a native speaker, so I am not able to evaluate it.

Regards

Pavel


> Regards,
> --
> -David
> da...@pgmasters.net
>


Re: Online verification of checksums

2021-03-09 Thread David Steele

On 11/30/20 6:38 PM, David Steele wrote:

On 11/30/20 9:27 AM, Stephen Frost wrote:

* Michael Paquier (mich...@paquier.xyz) wrote:

On Fri, Nov 27, 2020 at 11:15:27AM -0500, Stephen Frost wrote:

* Magnus Hagander (mag...@hagander.net) wrote:
On Thu, Nov 26, 2020 at 8:42 AM Michael Paquier 
 wrote:

But here the checksum is broken, so while the offset is something we
can rely on how do you make sure that the LSN is fine?  A broken
checksum could perfectly mean that the LSN is actually *not* fine if
the page header got corrupted.


Of course that could be the case, but it gets to be a smaller and
smaller chance by checking that the LSN read falls within reasonable
bounds.


FWIW, I find that scary.


There's ultimately different levels of 'scary' and the risk here that
something is actually wrong following these checks strikes me as being
on the same order as random bits being flipped in the page and still
getting a valid checksum (which is entirely possible with our current
checksum...), or maybe even less. 


I would say a lot less. First you'd need to corrupt one of the eight 
bytes that make up the LSN (pretty likely since corruption will probably 
affect the entire block) and then it would need to be updated to a value 
that falls within the current backup range, a 1 in 16 million chance if 
a terabyte of WAL is generated during the backup. Plus, the corruption 
needs to happen during the backup since we are going to check for that, 
and the corrupted LSN needs to be ascending, and the LSN originally read 
needs to be within the backup range (another 1 in 16 million chance) 
since pages written before the start backup checkpoint should not be torn.


So as far as I can see there are more likely to be false negatives from 
the checksum itself.


It would also be easy to add a few rounds of checks, i.e. test if the 
LSN ascends but stays in the backup LSN range N times.
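
As a sketch only (not code from any of the posted patches; reread_page and the
LSN bounds are assumed inputs, and a real implementation would also re-verify
the checksum on each read), that retry idea could look roughly like this:

/*
 * Re-read a page whose checksum failed and decide whether it merely looks
 * torn because it is being written concurrently.  Returns true if every
 * re-read shows an ascending LSN that stays inside the backup's LSN window.
 */
static bool
page_looks_concurrently_written(Page page, XLogRecPtr backup_start_lsn,
								XLogRecPtr backup_stop_lsn, int nretries,
								bool (*reread_page) (Page page))
{
	XLogRecPtr	prev_lsn = PageGetLSN(page);

	for (int i = 0; i < nretries; i++)
	{
		XLogRecPtr	lsn;

		if (!reread_page(page))
			return false;		/* cannot re-read: report as corruption */

		lsn = PageGetLSN(page);

		/* LSN must keep ascending and stay within the backup window */
		if (lsn <= prev_lsn ||
			lsn < backup_start_lsn ||
			lsn > backup_stop_lsn)
			return false;

		prev_lsn = lsn;
	}
	return true;
}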


Honestly, I'm much more worried about corruption zeroing the entire 
page. I don't know how likely that is, but I know none of our proposed 
solutions would catch it.


Andres, since you brought this issue up originally perhaps you'd like to 
weigh in?


I had another look at this patch and though I think my suggestions above 
would improve the patch, I have no objections to going forward as is (if 
that is the consensus) since this seems an improvement over what we have 
now.


It comes down to whether you prefer false negatives or false positives. 
With the LSN checking Stephen and I advocate it is theoretically 
possible to have a false negative but the chances of the LSN ascending N 
times but staying within the backup LSN range due to corruption seems to 
be approaching zero.


I think Michael's method is unlikely to throw false positives, but it 
seems at least possible that a block would be hot enough to appear 
torn N times in a row. Torn pages themselves are really easy to reproduce.


If we do go forward with this method I would likely propose the 
LSN-based approach as a future improvement.


Regards,
--
-David
da...@pgmasters.net




Re: non-HOT update not looking at FSM for large tuple update

2021-03-09 Thread John Naylor
I wrote:

> That seems like the proper fix, and I see you've started a thread for
> that. I don't think that change in behavior would be backpatchable, but
> the patch here might have a chance at that.

I remembered after the fact that truncating line pointers would only allow
for omitting the 2% slack logic (and has other benefits), but the rest of
this patch would be needed regardless.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: non-HOT update not looking at FSM for large tuple update

2021-03-09 Thread John Naylor
On Tue, Mar 9, 2021 at 9:40 AM Floris Van Nee 
wrote:
>
> Hi,
>
> >
> > This patch fails to consider that len may be bigger than
> > MaxHeapTupleSize * 0.98, which in this case triggers a reproducible
> > PANIC:
>
> Good catch! I've adapted the patch with your suggested fix.

Thank you both for testing and for the updated patch. It seems we should
add a regression test, but it's not clear which file it belongs in.
Possibly insert.sql?

> > One different question I have, though, is why we can't "just" teach vacuum
> > to clean up trailing unused line pointers. As in, can't we trim the line pointer
> > array when vacuum detects that the trailing line pointers on the page are all
> > unused?

That seems like the proper fix, and I see you've started a thread for that.
I don't think that change in behavior would be backpatchable, but the patch
here might have a chance at that.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: WIP: document the hook system

2021-03-09 Thread Bruce Momjian
On Sat, Mar  6, 2021 at 08:32:43PM -0500, Tom Lane wrote:
> I think that the best you should hope for here is that people are
> willing to add a short, not-too-detailed para to a markup-free
> plain-text README file that lists all the hooks.  As soon as it
> gets any more complex than that, either the doco aspect will be
> ignored, or there simply won't be any more hooks.
> 
> (I'm afraid I likewise don't believe in the idea of carrying a test
> module for each hook.  Again, requiring that is a good way to
> ensure that new hooks just won't happen.)

Agreed.  If you document the hooks too much, it allows them to drift
away from matching the code, which makes the hook documentation actually
worse than having no hook documentation at all.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: [patch] [doc] Minor variable related cleanup and rewording of plpgsql docs

2021-03-09 Thread David Steele

On 11/30/20 10:37 AM, Pavel Stehule wrote:
On Mon, Nov 30, 2020 at 16:06, David G. Johnston


ok

This patch looks reasonable to me overall.

A few comments:

1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is 
meant. This was pre-existing but now seems like a good opportunity to 
fix it, unless I am misunderstanding.


2) I think:

+ makes the command behave like SELECT, which is described


flows a little better as:

+ makes the command behave like SELECT, as described

Regards,
--
-David
da...@pgmasters.net




Re: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-09 Thread Tom Lane
"Joel Jacobson"  writes:
> Tom - can you please give details on your unpleasant experiences with 
> parallel arrays?

The problems I can recall running into were basically down to not having
an easy way to iterate through parallel arrays.  There are ways to do
that in SQL, certainly, but they all constrain how you write the query,
and usually force ugly stuff like splitting it into sub-selects.

As an example, presuming that regexp_positions is defined along the
lines of

regexp_positions(str text, pat text, out starts int[], out lengths int[])
returns setof record

then to actually get the identified substrings you'd have to do something
like

select
  substring([input string] from starts[i] for lengths[i])
from
  regexp_positions([input string], [pattern]) r,
  lateral
generate_series(1, array_length(starts, 1)) i;

I think the last time I confronted this, we didn't have multi-array
UNNEST.  Now that we do, we can get rid of the generate_series(),
but it's still not beautiful:

select
  substring([input string] from s for l)
from
  regexp_positions([input string], [pattern]) r,
  lateral
unnest(starts, lengths) u(s,l);

Having said that, the other alternative with a 2-D array:

regexp_positions(str text, pat text) returns setof int[]

seems to still need UNNEST, though now it's not the magic multi-array
UNNEST but this slicing version:

select
  substring([input string] from u[1] for u[2])
from
  regexp_positions([input string], [pattern]) r,
  lateral
unnest_slice(r, 1) u;

Anyway, I'd counsel trying to write out SQL implementations
of regexp_matches() and other useful things based on any
particular regexp_positions() API you might be thinking about.
Can we do anything useful without a LATERAL UNNEST thingie?
Are some of them more legible than others?

regards, tom lane




Re: [PATCH] pg_permissions

2021-03-09 Thread Chapman Flack
On 03/09/21 11:11, Joel Jacobson wrote:
> On Tue, Mar 9, 2021, at 07:34, Joel Jacobson wrote:
>> On Tue, Mar 9, 2021, at 04:01, Chapman Flack wrote:
>>> 1. Is there a reason not to make 'grantor' and 'grantee' of type regrole?
> 
> Having digested your idea, I actually agree with you.
> 
> Since we have the regrole-type, I agree we should use it,
> even though we need to cast, no biggie.

This does highlight [topicshift] one sort of
inconvenience I've observed before in other settings: how fussy
it may be to write WHERE grantee = 'bob' when there is no user 'bob'.

A simple cast 'bob'::regrole raises undefined_object (in class
"Syntax Error or Access Rule Violation") rather than just returning
no rows because no grantee is bob.

It's a more general issue: I first noticed it when I had proudly
implemented my first PostgreSQL type foo that would only accept
valid foos as values, and the next thing that happened was my
colleague in frontend development wrote mean Python comments about me
because he couldn't simply search for a foo in a table without either
first duplicating the validation of the value or trapping the error
if the user had entered a non-foo to search for.

We could solve that, of course, by implementing = and <> (foo,text)
to simply return false (resp. true) if the text arg isn't castable
to foo.

But the naïve way of writing such an operator repeats the castability
test for every row compared. If I were to build such an operator now,
I might explore whether a planner support function could be used
to check the castability once, and replace the whole comparison with
constant false if that fails.
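
A rough sketch of that idea follows (the type foo, its operator function, and
the foo_is_valid() helper are hypothetical; only the SupportRequestSimplify
machinery is real):

Datum
foo_text_eq_support(PG_FUNCTION_ARGS)
{
	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
	Node	   *ret = NULL;

	if (IsA(rawreq, SupportRequestSimplify))
	{
		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
		Node	   *arg2 = (Node *) lsecond(req->fcall->args);

		/* Only a constant, non-null text argument can be checked at plan time */
		if (IsA(arg2, Const) && !((Const *) arg2)->constisnull)
		{
			char	   *txt = TextDatumGetCString(((Const *) arg2)->constvalue);

			/* foo_is_valid() is the hypothetical "is this castable?" test */
			if (!foo_is_valid(txt))
				ret = (Node *) makeBoolConst(false, false);
		}
	}

	PG_RETURN_POINTER(ret);
}

Returning NULL tells the planner that no simplification applies, so the
run-time comparison still has to tolerate invalid input gracefully.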

And this strikes me as a situation that might be faced often enough
to wonder if some kind of meta-support-function would be worth supplying
that could do that for any type foo.
[/topicshift]

Regards,
-Chap




Re: partial heap only tuples

2021-03-09 Thread Bruce Momjian
On Mon, Feb 15, 2021 at 08:19:40PM +, Bossart, Nathan wrote:
> Yeah, this is something I'm concerned about.  I think adding a bitmap
> of modified columns to the header of PHOT-updated tuples improves
> matters quite a bit, even for single-page vacuuming.  Following is a
> strategy I've been developing (there may still be some gaps).  Here's
> a basic PHOT chain where all tuples are visible and the last one has
> not been deleted or updated:
> 
> idx1    0   1   2   3
> idx2    0   1   2
> idx3    0   2   3
> lp  1   2   3   4   5
> tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
> bitmap  -xx xx- --x x-x

First, I want to continue encouraging you to work on this because I
think it can yield big improvements.  Second, I like the wiki you
created.  Third, the diagram above seems to be more meaningful if read
from the bottom-up.  I suggest you reorder it on the wiki so it can be
read top-down, maybe:

> lp  1   2   3   4   5
> tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
> bitmap  -xx xx- --x x-x
> idx1    0   1   2   3
> idx2    0   1   2
> idx3    0   2   3

Fourth, I know in the wiki you said create/drop index needs more
research, but I suggest you avoid any design that will be overly complex
for create/drop index.  For example, a per-row bitmap that is based on
what indexes exist at time of row creation might cause unacceptable
problems in handling create/drop index.  Would you number indexes?  I am
not saying you have to solve all the problems now, but you have to keep
your eye on obstacles that might block your progress later.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: A problem about partitionwise join

2021-03-09 Thread David Steele

On 11/27/20 7:05 AM, Ashutosh Bapat wrote:

On Tue, Nov 10, 2020 at 2:43 PM Richard Guo  wrote:


To recap, the problem we are fixing here is when generating join clauses
from equivalence classes, we only select the joinclause with the 'best
score', or the first joinclause with a score of 3. This may cause us to
miss some joinclause on partition keys and thus fail to generate
partitionwise join.

The initial idea for the fix is to create all the RestrictInfos from ECs
in order to check whether there exist equi-join conditions involving
pairs of matching partition keys of the relations being joined for all
partition keys. And then Tom proposed a much better idea which leverages
function exprs_known_equal() to tell whether the partkeys can be found
in the same eclass, which is the current implementation in the latest
patch.


In the example you gave earlier, the equi join on partition key was
there but it was replaced by individual constant assignment clauses.
So if we keep the original restrictclause in there with a new flag
indicating that it's redundant, have_partkey_equi_join will still be
able to use it without much change. Depending upon where all we need
to use avoid restrictclauses with the redundant flag, this might be an
easier approach. However, with Tom's idea partition-wise join may be
used even when there is no equi-join between partition keys but there
are clauses like pk = const for all tables involved and const is the
same for all such tables.

In the spirit of small improvement made to the performance of
have_partkey_equi_join(), pk_has_clause should be renamed as
pk_known_equal and pks_known_equal as num_equal_pks.

The loop traversing the partition keys at a given position, may be
optimized further if we pass lists to exprs_known_equal() which in
turn checks whether one expression from each list is a member of a
given EC. This will avoid traversing all equivalence classes for each
partition key expression, which can be a huge improvement when there
are many ECs. But I think if one of the partition key expressions at a
given position is a member of an equivalence class, all the other
partition key expressions at that position should be part of that
equivalence class, since there should be an equi-join between those. So
the loop in loop may not be required to start with.


Richard, any thoughts on Ashutosh's comments?

Regards,
--
-David
da...@pgmasters.net




Re: Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Mark Dilger



> On Mar 9, 2021, at 7:13 AM, Matthias van de Meent 
>  wrote:
> 
> Hi,
> 
> The heap AMs' pages only grow their pd_linp array, and never shrink
> when trailing entries are marked unused. This means that up to 14% of
> free space (=291 unused line pointers) on a page could be unusable for
> data storage, which I think is a shame. With a patch in the works that
> allows the line pointer array to grow up to one third of the size of
> the page [0], it would be quite catastrophic for the available data
> space on old-and-often-used pages if this could not ever be reused for
> data.
> 
> The shrinking of the line pointer array is already common practice in
> indexes (in which all LP_UNUSED items are removed), but this specific
> implementation cannot be used for heap pages due to ItemId
> invalidation. One available implementation, however, is that we
> truncate the end of this array, as mentioned in [1]. There was a
> warning at the top of PageRepairFragmentation about not removing
> unused line pointers, but I believe that was about not removing
> _intermediate_ unused line pointers (which would imply moving in-use
> line pointers); as far as I know there is nothing that relies on only
> growing page->pd_lower, and nothing keeping us from shrinking it
> whilst holding a pin on the page.
> 
> Please find attached a fairly trivial patch which detects the last
> unused entry on a page, and truncates the pd_linp array to that entry,
> effectively freeing 4 bytes per line pointer truncated away (up to
> 1164 bytes for pages with MaxHeapTuplesPerPage unused lp_unused
> lines).
> 
> One unexpected benefit from this patch is that the PD_HAS_FREE_LINES
> hint bit optimization can now be false more often, increasing the
> chances of not having to check the whole array to find an empty spot.
> 
> Note: This does _not_ move valid ItemIds, it only removes invalid
> (unused) ItemIds from the end of the space reserved for ItemIds on a
> page, keeping valid linepointers intact.
> 
> 
> Enjoy,
> 
> Matthias van de Meent
> 
> [0] 
> https://www.postgresql.org/message-id/flat/cad21aod0ske11fmw4jd4renawbmcw1wasvnwpjvw3tvqpoq...@mail.gmail.com
> [1] 
> https://www.postgresql.org/message-id/CAEze2Wjf42g8Ho%3DYsC_OvyNE_ziM0ZkXg6wd9u5KVc2nTbbYXw%40mail.gmail.com
> 

For a prior discussion on this topic:

https://www.postgresql.org/message-id/2e78013d0709130606l56539755wb9dbe17225ffe90a%40mail.gmail.com

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-09 Thread Tom Lane
"Joel Jacobson"  writes:
> On Tue, Mar 9, 2021, at 10:18, Pavel Stehule wrote:
>> you can do unnest(array1, array2, ...)

> Right, I had forgotten about that variant.
> But isn't this a bit surprising then:
> ...
> Should there be an entry there showing the VARIADIC anyelement version as 
> well?

No, because there's no such pg_proc entry.  Multi-argument UNNEST is
special-cased by the parser, cf transformRangeFunction().

(Which is something I'd momentarily forgotten.  Forget my suggestion
that we could define unnest(anyarray, int) ... it has to be another
name.)

regards, tom lane




Re: [PATCH] pg_permissions

2021-03-09 Thread Joel Jacobson
On Tue, Mar 9, 2021, at 07:34, Joel Jacobson wrote:
> On Tue, Mar 9, 2021, at 04:01, Chapman Flack wrote:
>> 1. Is there a reason not to make 'grantor' and 'grantee' of type regrole?

Having digested your idea, I actually agree with you.

Since we have the regrole-type, I agree we should use it,
even though we need to cast, no biggie.

I realized my arguments were silly since I already exposed the class as 
regclass,
which has the same problem.

I'll send a new patch soon.

/Joel

Re: DROP relation IF EXISTS Docs and Tests - Bug Fix

2021-03-09 Thread David Steele

On 3/9/21 10:08 AM, David G. Johnston wrote:


On Tuesday, March 9, 2021, David Steele wrote:


Further, I think we should close this entry at the end of the CF if
it does not attract committer interest. Tom is not in favor of the
patch and it appears Alexander decided not to commit it.

Pavel re-reviewed it and was fine with ready-to-commit so that status 
seems fine.


Ah yes, that was my mistake.

Regards,
--
-David
da...@pgmasters.net




Lowering the ever-growing heap->pd_lower

2021-03-09 Thread Matthias van de Meent
Hi,

The heap AMs' pages only grow their pd_linp array, and never shrink
when trailing entries are marked unused. This means that up to 14% of
free space (=291 unused line pointers) on a page could be unusable for
data storage, which I think is a shame. With a patch in the works that
allows the line pointer array to grow up to one third of the size of
the page [0], it would be quite catastrophic for the available data
space on old-and-often-used pages if this could not ever be reused for
data.

The shrinking of the line pointer array is already common practice in
indexes (in which all LP_UNUSED items are removed), but this specific
implementation cannot be used for heap pages due to ItemId
invalidation. One available implementation, however, is that we
truncate the end of this array, as mentioned in [1]. There was a
warning at the top of PageRepairFragmentation about not removing
unused line pointers, but I believe that was about not removing
_intermediate_ unused line pointers (which would imply moving in-use
line pointers); as far as I know there is nothing that relies on only
growing page->pd_lower, and nothing keeping us from shrinking it
whilst holding a pin on the page.

Please find attached a fairly trivial patch which detects the last
unused entry on a page, and truncates the pd_linp array to that entry,
effectively freeing 4 bytes per line pointer truncated away (up to
1164 bytes for pages with MaxHeapTuplesPerPage unused lp_unused
lines).

One unexpected benefit from this patch is that the PD_HAS_FREE_LINES
hint bit optimization can now be false more often, increasing the
chances of not having to check the whole array to find an empty spot.

Note: This does _not_ move valid ItemIds, it only removes invalid
(unused) ItemIds from the end of the space reserved for ItemIds on a
page, keeping valid linepointers intact.


Enjoy,

Matthias van de Meent

[0] 
https://www.postgresql.org/message-id/flat/cad21aod0ske11fmw4jd4renawbmcw1wasvnwpjvw3tvqpoq...@mail.gmail.com
[1] 
https://www.postgresql.org/message-id/CAEze2Wjf42g8Ho%3DYsC_OvyNE_ziM0ZkXg6wd9u5KVc2nTbbYXw%40mail.gmail.com
From f9be3079cf0ff26b8ef603a9b0c8bc5d27561499 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent 
Date: Tue, 9 Mar 2021 14:42:52 +0100
Subject: [PATCH v1] Truncate a pages' line pointer array when it has trailing
 unused ItemIds.

This will allow reuse of what is effectively free space for data as well as
new line pointers, instead of keeping it reserved for line pointers only.

An additional benefit is that the HasFreeLinePointers hint-bit optimization
now doesn't hint for free line pointers at the end of the array, slightly
increasing the specificity of where the free lines are; and saving us from
needing to search to the end of the array if all other entries are already
filled.
---
 src/backend/storage/page/bufpage.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 9ac556b4ae..10d8f26ad0 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -672,7 +672,11 @@ compactify_tuples(itemIdCompact itemidbase, int nitems, Page page, bool presorte
  * PageRepairFragmentation
  *
  * Frees fragmented space on a page.
- * It doesn't remove unused line pointers! Please don't change this.
+ * It doesn't remove intermediate unused line pointers (that would mean
+ * moving ItemIds, and that would imply invalidating indexed values), but it
+ * does truncate the page->pd_linp array to the last unused line pointer, so
+ * that this space may also be reused for data, instead of only for line
+ * pointers.
  *
  * This routine is usable for heap pages only, but see PageIndexMultiDelete.
  *
@@ -691,6 +695,7 @@ PageRepairFragmentation(Page page)
 	int			nline,
 nstorage,
 nunused;
+	OffsetNumber lastUsed = InvalidOffsetNumber;
 	int			i;
 	Size		totallen;
 	bool		presorted = true;	/* For now */
@@ -724,6 +729,7 @@ PageRepairFragmentation(Page page)
 		lp = PageGetItemId(page, i);
 		if (ItemIdIsUsed(lp))
 		{
+			lastUsed = i;
 			if (ItemIdHasStorage(lp))
 			{
 itemidptr->offsetindex = i - 1;
@@ -771,6 +777,11 @@ PageRepairFragmentation(Page page)
 		compactify_tuples(itemidbase, nstorage, page, presorted);
 	}
 
+	if (lastUsed != nline) {
+		((PageHeader) page)->pd_lower = SizeOfPageHeaderData + (sizeof(ItemIdData) * lastUsed);
+		nunused = nunused - (nline - lastUsed);
+	}
+
 	/* Set hint bit for PageAddItem */
 	if (nunused > 0)
 		PageSetHasFreeLinePointers(page);
-- 
2.20.1



Re: DROP relation IF EXISTS Docs and Tests - Bug Fix

2021-03-09 Thread David G. Johnston
On Tuesday, March 9, 2021, David Steele  wrote:

>
> Further, I think we should close this entry at the end of the CF if it
> does not attract committer interest. Tom is not in favor of the patch and
> it appears Alexander decided not to commit it.
>

Pavel re-reviewed it and was fine with ready-to-commit so that status seems
fine.

Frankly, I am hoping for a bit more constructive feedback and even
collaboration from a committer, specifically Tom, on this one given the
outstanding user complaints received on the topic, our disagreement
regarding fixing it (which motivates the patch to better document and add
tests), and professional courtesy given to a fellow consistent community
contributor.

So, no, making it just go away because one of the dozens of committers
can’t make time to try and make it work doesn’t sit well with me.  If a
committer wants to actively reject the patch with an explanation then so be
it.

David J.


Re: row filtering for logical replication

2021-03-09 Thread Rahila Syed
Hi Euler,

Please find some comments below:

1. If the WHERE clause contains non-replica-identity columns, a delete
performed on a replicated row using
    DELETE FROM pub_tab WHERE repl_ident_col = n;
is not replicated, as logical replication has no information about whether
the column has to be filtered or not.
Shouldn't a warning be thrown in this case to notify the user that the
delete is not replicated?

2. Same for UPDATE: even if I update a row to match the quals on the
publisher, it is still not replicated to the subscriber (if the quals
contain non-replica-identity columns). I think for UPDATE at least, the
new value of the non-replica-identity column is available and can be used
to filter and replicate the update.

3. 0001.patch,
Why is the name of the existing ExclusionWhereClause node being changed, if
the exact same definition is being used?

For 0002.patch,
4.   +
 +   memset(lrel, 0, sizeof(LogicalRepRelation));

Is this needed? Apart from the above, the patch does not use or update lrel
at all in that function.

5.  PublicationRelationQual and PublicationTable have similar fields, can
PublicationTable
be used in place of PublicationRelationQual instead of defining a new
struct?

Thank you,
Rahila Syed


Re: authtype parameter in libpq

2021-03-09 Thread Peter Eisentraut

On 08.03.21 10:57, Peter Eisentraut wrote:

On 04.03.21 16:06, Daniel Gustafsson wrote:
authtype is completely dead in terms of reading back the value, to the
point of it being a memleak if it indeed was found in an environment
variable.

But I tend to think we should remove them both altogether (modulo ABI 
and API preservation).


No disagreement from me, the attached takes a stab at that to get an
idea what it would look like.  PQtty is left to maintain API stability
but the parameters are removed from the conn object as that's internal
to libpq.


This looks like the right idea to me.


committed, with some tweaks




Re: default result formats setting

2021-03-09 Thread David Steele

On 11/25/20 2:06 AM, Peter Eisentraut wrote:

On 2020-11-16 16:15, Andrew Dunstan wrote:

I think this is conceptually OK, although it feels a bit odd.

Might it be better to have the values as typename={binary,text} pairs
instead of oid={0,1} pairs, which are fairly opaque? That might make
things easier for things like UDTs where the oid might not be known or
constant.


Yes, type names would be better.  I was hesitant because of all the 
parsing work involved, but I bit the bullet and did it in the new patch.


To simplify the format, I changed the parameter so it's just a list of 
types that you want in binary, rather than type=value pairs.  If we ever 
want to add another format, we would revisit this, but it seems unlikely 
in the near future.


Also, I have changed the naming of the parameter since this is no longer 
the "default" but something you choose explicitly.  I'm thinking in the 
direction of "auto" mode for the naming.  Obviously, the name is easy to 
tweak in any case.


Andrew, Tom, does the latest patch address your concerns?

Regards,
--
-David
da...@pgmasters.net




Re: Any objections to implementing LogicalDecodeMessageCB for pgoutput?

2021-03-09 Thread David Steele

Hi David,

On 11/24/20 10:28 PM, Euler Taveira wrote:


I also reviewed your patch. This feature would be really useful for
replication scenarios. Supporting this feature means that you don't need
to use a table to pass messages from one node to another one. Here are a
few comments/ideas.


Do you know when you'll have a chance to look at Euler's suggestions? 
Also, have Andres' suggestions/concerns upthread been addressed?


Marked Waiting on Author.

Regards,
--
-David
da...@pgmasters.net




Re: Nicer error when connecting to standby with hot_standby=off

2021-03-09 Thread Fujii Masao




On 2021/03/09 23:19, James Coleman wrote:

On Tue, Mar 9, 2021 at 9:17 AM Alvaro Herrera  wrote:


On 2021-Mar-09, James Coleman wrote:


Yes, I think they both agreed on the "DETAIL:  Hot standby mode is
disabled." message, but that alternative meant not needing to add any
new signals and pm states, correct?


Ah, I see!  I was thinking that you still needed the state and signal in
order to print the correct message in hot-standby mode, but that's
(obviously!) wrong.  So you're right that no signal/state are needed.


Cool. And yes, I'm planning to update the patch soon.


+1. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: shared-memory based stats collector

2021-03-09 Thread Fujii Masao




On 2021/03/09 16:51, Kyotaro Horiguchi wrote:

At Sat, 6 Mar 2021 00:32:07 +0900, Fujii Masao  
wrote in



On 2021/03/05 17:18, Kyotaro Horiguchi wrote:

At Thu, 21 Jan 2021 12:03:48 +0900 (JST), Kyotaro Horiguchi
 wrote in

Commit 960869da08 (database statistics) conflicted with this. Rebased.

I'm concerned about the behavior that pgstat_update_connstats calls
GetCurrentTimestamp() every time stats update happens (with intervals
of 10s-60s in this patch). But I didn't change that design since that
happens with about 0.5s intervals in master and the rate is largely
reduced in this patch, to make this patch simpler.

I stepped on my foot, and another commit coflicted. Just rebased.


Thanks for rebasing the patches!

I think that the 0003 patch is self-contained and useful; for example,
it enables us to monitor the archiver process in pg_stat_activity. So IMO
it's worth pushing the 0003 patch first.


I'm not sure the archiver process is worth real-time monitoring, but I
agree that the patch makes it possible.  Anyway it is required in this
patchset and I'm happy to see it committed beforehand.

Thanks for the review.


Here are the review comments for 0003 patch.

+   /* Archiver process's latch */
+   Latch  *archiverLatch;
+   /* Current shared estimate of appropriate spins_per_delay value */

The last line in the above seems not necessary.


Oops. It seems like a garbage after a past rebasing. Removed.


In proc.h, NUM_AUXILIARY_PROCS needs to be incremented.


Right.  Increased to 5 and rewrote the comment.


 /* --
  * Functions called from postmaster
  * --
  */
 extern int pgarch_start(void);

In pgarch.h, the above is not necessary.


Removed.


+extern void XLogArchiveWakeup(void);

This seems no longer necessary.

+extern void XLogArchiveWakeupStart(void);
+extern void XLogArchiveWakeupEnd(void);
+extern void XLogArchiveWakeup(void);

These seem also no longer necessary.


Sorry for many garbages. Removed all of them.


PgArchPID = 0;
if (!EXIT_STATUS_0(exitstatus))
-   LogChildExit(LOG, _("archiver process"),
-pid, exitstatus);
-   if (PgArchStartupAllowed())
-   PgArchPID = pgarch_start();
+   HandleChildCrash(pid, exitstatus,
+_("archiver 
process"));

I don't think that we should treat a non-zero exit condition as a crash;
it should be handled as before. Otherwise, when archive_command fails on a
signal, the archiver emits a FATAL error, which leads to a server restart.


Sounds reasonable. Now the archiver is treated the same way as the wal
receiver.  Specifically, exit(1) doesn't cause a server restart.


Thanks!

-   if (PgArchStartupAllowed())
-   PgArchPID = pgarch_start();

In the latest patch, why did you remove the code to restart a new archiver
in reaper()? When the archiver dies, I think a new archiver should be restarted
like the current reaper() does. Otherwise, the restart of the archiver can be
delayed until the next cycle of ServerLoop, which may take time.





- * walwriter, autovacuum, or background worker.
+ * walwriter, autovacuum, archiver or background worker.
   *
   * The objectives here are to clean up our local state about the child
   * process, and to signal all other remaining children to quickdie.
@@ -3609,6 +3606,18 @@ HandleChildCrash(int pid, int exitstatus, const
char *procname)
signal_child(AutoVacPID, (SendStop ? SIGSTOP : SIGQUIT));
}
  + /* Take care of the archiver too */
+   if (pid == PgArchPID)
+   PgArchPID = 0;
+   else if (PgArchPID != 0 && take_action)
+   {
+   ereport(DEBUG2,
+   (errmsg_internal("sending %s to process %d",
+(SendStop ? "SIGSTOP" : 
"SIGQUIT"),
+(int) 
PgArchPID)));
+   signal_child(PgArchPID, (SendStop ? SIGSTOP : SIGQUIT));
+   }
+

Same as above.


Mmm.  In the first place, I found that I forgot to remove existing
code to handle archiver... Removed it instead of the above, which is
added by this patch.  Since the process becomes an auxiliary process,
no reason to differentiate from other auxiliary processes in handling?


Yes, ok.





In xlogarchive.c, "#include "storage/pmsignal.h"" is no longer
necessary.


Removed.


pgarch_forkexec() should be removed from pgarch.c because it's no
longer used.


Right. Removed. EXEC_BACKEND still fails for another reason, a
prototype mismatch of PgArchiverMain. Fixed it together.


 /* 
  * Public functions called from postmaster follow
  * 

Re: Nicer error when connecting to standby with hot_standby=off

2021-03-09 Thread James Coleman
On Tue, Mar 9, 2021 at 9:17 AM Alvaro Herrera  wrote:
>
> On 2021-Mar-09, James Coleman wrote:
>
> > Yes, I think they both agreed on the "DETAIL:  Hot standby mode is
> > disabled." message, but that alternative meant not needing to add any
> > new signals and pm states, correct?
>
> Ah, I see!  I was thinking that you still needed the state and signal in
> order to print the correct message in hot-standby mode, but that's
> (obviously!) wrong.  So you're right that no signal/state are needed.

Cool. And yes, I'm planning to update the patch soon.

Thanks,
James




Re: Nicer error when connecting to standby with hot_standby=off

2021-03-09 Thread Alvaro Herrera
On 2021-Mar-09, James Coleman wrote:

> Yes, I think they both agreed on the "DETAIL:  Hot standby mode is
> disabled." message, but that alternative meant not needing to add any
> new signals and pm states, correct?

Ah, I see!  I was thinking that you still needed the state and signal in
order to print the correct message in hot-standby mode, but that's
(obviously!) wrong.  So you're right that no signal/state are needed.

Thanks

-- 
Álvaro Herrera   Valdivia, Chile
Si no sabes adonde vas, es muy probable que acabes en otra parte.




Re: Enhance traceability of wal_level changes for backup management

2021-03-09 Thread David Steele

On 3/7/21 9:45 PM, osumi.takami...@fujitsu.com wrote:

On Sun, Mar 7, 2021 3:48 AM Peter Eisentraut 
 wrote:

On 28.01.21 01:44, osumi.takami...@fujitsu.com wrote:

(1) writing the time or LSN in the control file to indicate
when/where wal_level is changed to 'minimal'
from upper level to invalidate the old backups or make alerts to users.

I attached the first patch which implementes this idea.
It was aligned by pgindent and shows no regression.


It's not clear to me what this is supposed to accomplish.  I read the thread,
but it's still not clear.
What is one supposed to do with this information?

OK. The basic idea is to enable backup management
tools to recognize a wal_level drop between *snapshots*.
When you have a snapshot of the cluster at one time and another one
at a different time, this new parameter lets you see whether
anything that causes a discontinuity due to the drop happened
between the two snapshots, without having to look at the WAL
in between.


As a backup software author, I don't see this feature as very useful.

The problem is that there are lots of ways for WAL to go missing so  
monitoring the WAL archive for gaps is essential and this feature would  
not replace that requirement. The only extra information you'd get is  
the ability to classify the most recent gap as "intentional", maybe.


So, -1 from me.

Regards,
--
-David
da...@pgmasters.net




Re: Nicer error when connecting to standby with hot_standby=off

2021-03-09 Thread Magnus Hagander
On Tue, Mar 9, 2021 at 3:07 PM Alvaro Herrera  wrote:
>
> On 2021-Mar-09, James Coleman wrote:
>
> > On Tue, Mar 9, 2021 at 8:47 AM Alvaro Herrera  
> > wrote:
> > >
> > > On 2021-Mar-07, Magnus Hagander wrote:
> > >
> > > > On Sun, Mar 7, 2021 at 3:39 PM Fujii Masao 
> > > >  wrote:
> > >
> > > Great, so we're agreed on the messages to emit.  James, are you updating
> > > your patch, considering Fujii's note about the new signal and pmstate
> > > that need to be added?
> >
> > Perhaps I'm missing something, but I was under the impression the
> > "prefer the former message" meant we were not adding a new signal and
> > pmstate?
>
> Eh, I read that differently.  I was proposing two options for the DETAIL
> line in that case:
>
> >  DETAIL:  Hot standby mode is disabled.
> >  or maybe
> >  DETAIL:  Consistent state has not yet been reached, and hot standby mode 
> > is disabled.
>
> and both Fujii and Magnus said they prefer the first option over the
> second option.  I don't read any of them as saying that they would like
> to do something else (including not doing anything).
>
> Maybe I misinterpreted them?

That is indeed what I meant as well.

The reference to "the former" as being the "first or the two new
options", not the "old option". That is, "DETAIL:  Hot standby mode is
disabled.".

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Nicer error when connecting to standby with hot_standby=off

2021-03-09 Thread James Coleman
On Tue, Mar 9, 2021 at 9:07 AM Alvaro Herrera  wrote:
>
> On 2021-Mar-09, James Coleman wrote:
>
> > On Tue, Mar 9, 2021 at 8:47 AM Alvaro Herrera  
> > wrote:
> > >
> > > On 2021-Mar-07, Magnus Hagander wrote:
> > >
> > > > On Sun, Mar 7, 2021 at 3:39 PM Fujii Masao 
> > > >  wrote:
> > >
> > > Great, so we're agreed on the messages to emit.  James, are you updating
> > > your patch, considering Fujii's note about the new signal and pmstate
> > > that need to be added?
> >
> > Perhaps I'm missing something, but I was under the impression the
> > "prefer the former message" meant we were not adding a new signal and
> > pmstate?
>
> Eh, I read that differently.  I was proposing two options for the DETAIL
> line in that case:
>
> >  DETAIL:  Hot standby mode is disabled.
> >  or maybe
> >  DETAIL:  Consistent state has not yet been reached, and hot standby mode 
> > is disabled.
>
> and both Fujii and Magnus said they prefer the first option over the
> second option.  I don't read any of them as saying that they would like
> to do something else (including not doing anything).
>
> Maybe I misinterpreted them?

Yes, I think they both agreed on the "DETAIL:  Hot standby mode is
disabled." message, but that alternative meant not needing to add any
new signals and pm states, correct?

Thanks,
James




Re: Nicer error when connecting to standby with hot_standby=off

2021-03-09 Thread Alvaro Herrera
On 2021-Mar-09, James Coleman wrote:

> On Tue, Mar 9, 2021 at 8:47 AM Alvaro Herrera  wrote:
> >
> > On 2021-Mar-07, Magnus Hagander wrote:
> >
> > > On Sun, Mar 7, 2021 at 3:39 PM Fujii Masao  
> > > wrote:
> >
> > Great, so we're agreed on the messages to emit.  James, are you updating
> > your patch, considering Fujii's note about the new signal and pmstate
> > that need to be added?
> 
> Perhaps I'm missing something, but I was under the impression the
> "prefer the former message" meant we were not adding a new signal and
> pmstate?

Eh, I read that differently.  I was proposing two options for the DETAIL
line in that case:

>  DETAIL:  Hot standby mode is disabled.
>  or maybe
>  DETAIL:  Consistent state has not yet been reached, and hot standby mode is 
> disabled.

and both Fujii and Magnus said they prefer the first option over the
second option.  I don't read any of them as saying that they would like
to do something else (including not doing anything).

Maybe I misinterpreted them?

-- 
Álvaro Herrera   Valdivia, Chile




Re: Nicer error when connecting to standby with hot_standby=off

2021-03-09 Thread James Coleman
On Tue, Mar 9, 2021 at 8:47 AM Alvaro Herrera  wrote:
>
> On 2021-Mar-07, Magnus Hagander wrote:
>
> > On Sun, Mar 7, 2021 at 3:39 PM Fujii Masao  
> > wrote:
> >
> > > > Here's an idea:
> > > >
> > > > * hot_standby=on, before reaching consistent state
> > > >FATAL:  database is not accepting connections
> > > >DETAIL:  Consistent state has not yet been reached.
> > > >
> > > > * hot_standby=off, past consistent state
> > > >FATAL:  database is not accepting connections
> > > >DETAIL:  Hot standby mode is disabled.
> > > >
> > > > * hot_standby=off, before reaching consistent state
> > > >FATAL:  database is not accepting connections
> [...]
> > > >DETAIL:  Hot standby mode is disabled.
>
> > > I prefer the former message. Because the latter message meams that
> > > we need to output the different messages based on whether the consistent
> > > state is reached or not, and the followings would be necessary to 
> > > implement
> > > that. This looks a bit overkill to me against the purpose, at least for 
> > > me.
> >
> > Agreed. If hot standby is off, why would the admin care about whether
> > it's consistent yet or not?
>
> Great, so we're agreed on the messages to emit.  James, are you updating
> your patch, considering Fujii's note about the new signal and pmstate
> that need to be added?

Perhaps I'm missing something, but I was under the impression the
"prefer the former message" meant we were not adding a new signal and
pmstate?

James




Re: Nicer error when connecting to standby with hot_standby=off

2021-03-09 Thread Alvaro Herrera
On 2021-Mar-07, Magnus Hagander wrote:

> On Sun, Mar 7, 2021 at 3:39 PM Fujii Masao  
> wrote:
>
> > > Here's an idea:
> > >
> > > * hot_standby=on, before reaching consistent state
> > >FATAL:  database is not accepting connections
> > >DETAIL:  Consistent state has not yet been reached.
> > >
> > > * hot_standby=off, past consistent state
> > >FATAL:  database is not accepting connections
> > >DETAIL:  Hot standby mode is disabled.
> > >
> > > * hot_standby=off, before reaching consistent state
> > >FATAL:  database is not accepting connections
[...]
> > >DETAIL:  Hot standby mode is disabled.

> > I prefer the former message. Because the latter message meams that
> > we need to output the different messages based on whether the consistent
> > state is reached or not, and the followings would be necessary to implement
> > that. This looks a bit overkill to me against the purpose, at least for me.
> 
> Agreed. If hot standby is off, why would the admin care about whether
> it's consistent yet or not?

Great, so we're agreed on the messages to emit.  James, are you updating
your patch, considering Fujii's note about the new signal and pmstate
that need to be added?
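
(For reference, a minimal sketch of how the agreed-upon message could be
raised on the backend side; the actual patch may place and word this
differently:)

	ereport(FATAL,
			(errcode(ERRCODE_CANNOT_CONNECT_NOW),
			 errmsg("database is not accepting connections"),
			 errdetail("Hot standby mode is disabled.")));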

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: DROP relation IF EXISTS Docs and Tests - Bug Fix

2021-03-09 Thread David Steele

Hi David,

On 11/23/20 3:31 PM, Anastasia Lubennikova wrote:

On 30.09.2020 05:00, David G. Johnston wrote:


v5 attached, looking at this fresh and with some comments to consider.

I ended up just combining both patches into one.

I did away with the glossary changes altogether, and the invention of 
the new term.  I ended up limiting "type's type" to just domain usage 
but did a couple of additional tweaks that tried to treat domains as 
not being actual types even though, at least in PostgreSQL, they are 
(at least as far as DROP TYPE is concerned - and since I don't have 
any understanding of the SQL Standard's decision to separate out 
create domain and create type I'll just stick to the implementation in 
front of me.)


Reminder from a CF manager, as this thread was inactive for a while.
Alexander, I see you signed up as a committer for this entry. Are you 
going to continue this work?


This patch was marked Ready for Committer on July 14 but received a 
significant update on August 30. So, I have marked it Needs Review.


Further, I think we should close this entry at the end of the CF if it 
does not attract committer interest. Tom is not in favor of the patch 
and it appears Alexander decided not to commit it.


Regards,
--
-David
da...@pgmasters.net




RE: non-HOT update not looking at FSM for large tuple update

2021-03-09 Thread Floris Van Nee
Hi,

> 
> This patch fails to consider that len may be bigger than MaxHeapTupleSize *
> 0.98, which in this case triggers a reproducible
> PANIC:

Good catch! I've adapted the patch with your suggested fix.

> 
> One different question I have, though, is why we can't "just" teach vacuum
> to clean up trailing unused line pointers. As in, can't we trim the line 
> pointer
> array when vacuum detects that the trailing line pointers on the page are all
> unused?
> 
> The only documentation that I could find that this doesn't happen is in the
> comment on PageIndexTupleDelete and PageRepairFragmentation, both not
> very descriptive on why we can't shrink the page->pd_linp array. One is
> "Unlike heap pages, we compact out the line pointer for the removed tuple."
> (Jan. 2002), and the other is "It doesn't remove unused line pointers! Please
> don't change this." (Oct. 2000), but I can't seem to find the documentation /
> conversations on the implications that such shrinking would have.
> 

This is an interesting alternative indeed. I also can't find any 
documentation/conversation about this and the message is rather cryptic.
Hopefully someone on the list still remembers the reasoning behind this rather 
cryptic comment in PageRepairFragmentation.

-Floris


v3-Allow-inserting-tuples-into-almost-empty-pages.patch
Description: v3-Allow-inserting-tuples-into-almost-empty-pages.patch


Re: cleanup temporary files after crash

2021-03-09 Thread Euler Taveira
On Tue, Mar 9, 2021, at 9:31 AM, Michael Paquier wrote:
> On Tue, Mar 09, 2021 at 02:28:43AM +0100, Tomas Vondra wrote:
> > Let's move this patch forward. Based on the responses, I agree the
> > default behavior should be to remove the temp files, and I think we
> > should have the GUC (on the off chance that someone wants to preserve
> > the temporary files for debugging or whatever other reason).
> 
> Thanks for taking care of this.  I am having some second-thoughts
> about changing this behavior by default, still that's much more useful
> this way.
> 
> > I propose to rename the GUC to remove_temp_files_after_crash, I think
> > "remove" is a bit clearer than "cleanup". I've also reworded the sgml
> > docs a little bit.
> 
> "remove" sounds fine to me.
+1.

> > Attached is a patch with those changes. Barring objections, I'll get
> > this committed in the next couple days.
> 
> +When set to on, PostgreSQL will automatically
> Nit: using a  markup for the "on" value.
> 
> +#remove_temp_files_after_crash = on# remove temporary files after
> +#  # backend crash?
> The indentation of the second line is incorrect here (Incorrect number
> of spaces in tabs perhaps?), and there is no need for the '#' at the
> beginning of the line.
> --
> Michael
That was my fault. Editor automatically added #.

I'm not sure Tomas will include the tests. If so, the terminology should be 
adjusted too.

+++ b/src/test/recovery/t/022_crash_temp_files.pl
@@ -0,0 +1,194 @@
+#
+# Test cleanup of temporary files after a crash.

s/cleanup/remove/


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-09 Thread Joel Jacobson
On Tue, Mar 9, 2021, at 13:16, Pavel Stehule wrote:
> On Tue, Mar 9, 2021 at 11:32, Joel Jacobson wrote:
>> __On Thu, Mar 4, 2021, at 16:40, Tom Lane wrote:
>>> My experience with working with parallel arrays in SQL has been unpleasant.
>> 
>> Could you please give an example on such an unpleasant experience?
> 
> It was a more complex application with 3D data for some points in a 2D array.
> Everywhere there was a[d, 0], a[d, 1], a[d, 2] instead of a[d] or a[d].x, ...

Not sure I understand, but my question was directed to Tom, who wrote about his 
experiences in a previous message up thread.

Tom - can you please give details on your unpleasant experiences with parallel 
arrays?

/Joel

Re: ResourceOwner refactoring

2021-03-09 Thread Heikki Linnakangas

On 08/03/2021 18:47, Ibrar Ahmed wrote:

The patchset does not apply successfully, there are some hunk failures.

http://cfbot.cputube.org/patch_32_2834.log 



v6-0002-Make-resowners-more-easily-extensible.patch

1 out of 6 hunks FAILED -- saving rejects to file 
src/backend/utils/cache/plancache.c.rej
2 out of 15 hunks FAILED -- saving rejects to file 
src/backend/utils/resowner/resowner.c.rej



Can we get a rebase?


Here you go.

- Heikki
>From 8b852723212c51d3329267fcbf8e7a28f49dbbdd Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Wed, 13 Jan 2021 12:21:28 +0200
Subject: [PATCH v7 1/3] Move a few ResourceOwnerEnlarge() calls for safety and
 clarity.

These are functions where quite a lot of things happen between the
ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important that
there are no unrelated ResourceOwnerRemember() calls in the code
inbetween, otherwise the entry reserved by the ResourceOwnerEnlarge() call
might be used up by the intervening ResourceOwnerRemember() and not be
available at the intended ResourceOwnerRemember() call anymore. The longer
the code path between them is, the harder it is to verify that.

In bufmgr.c, there is a function similar to ResourceOwnerEnlarge(),
to ensure that the private refcount array has enough space. The
ReservePrivateRefCountEntry() calls, analogous to ResourceOwnerEnlarge(),
were made at different places than the ResourceOwnerEnlarge() calls.
Move the ResourceOwnerEnlarge() calls together with the
ReservePrivateRefCountEntry() calls for consistency.
---
 src/backend/storage/buffer/bufmgr.c   | 39 +++
 src/backend/storage/buffer/localbuf.c |  3 +++
 src/backend/utils/cache/catcache.c| 13 ++---
 3 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 561c212092f..cde869e7d64 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -738,9 +738,6 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 
 	*hit = false;
 
-	/* Make sure we will have room to remember the buffer pin */
-	ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
-
 	isExtend = (blockNum == P_NEW);
 
 	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
@@ -1093,9 +1090,11 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	{
 		/*
 		 * Ensure, while the spinlock's not yet held, that there's a free
-		 * refcount entry.
+		 * refcount entry and that the resource owner has room to remember the
+		 * pin.
 		 */
 		ReservePrivateRefCountEntry();
+		ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
 
 		/*
 		 * Select a victim buffer.  The buffer is returned with its header
@@ -1595,8 +1594,6 @@ ReleaseAndReadBuffer(Buffer buffer,
  * taking the buffer header lock; instead update the state variable in loop of
  * CAS operations. Hopefully it's just a single CAS.
  *
- * Note that ResourceOwnerEnlargeBuffers must have been done already.
- *
  * Returns true if buffer is BM_VALID, else false.  This provision allows
  * some callers to avoid an extra spinlock cycle.
  */
@@ -1607,6 +1604,8 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 	bool		result;
 	PrivateRefCountEntry *ref;
 
+	ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
+
 	ref = GetPrivateRefCountEntry(b, true);
 
 	if (ref == NULL)
@@ -1687,7 +1686,8 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
  * The spinlock is released before return.
  *
  * As this function is called with the spinlock held, the caller has to
- * previously call ReservePrivateRefCountEntry().
+ * previously call ReservePrivateRefCountEntry() and
+ * ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
  *
  * Currently, no callers of this function want to modify the buffer's
  * usage_count at all, so there's no need for a strategy parameter.
@@ -1857,9 +1857,6 @@ BufferSync(int flags)
 	int			mask = BM_DIRTY;
 	WritebackContext wb_context;
 
-	/* Make sure we can handle the pin inside SyncOneBuffer */
-	ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
-
 	/*
 	 * Unless this is a shutdown checkpoint or we have been explicitly told,
 	 * we write only permanent, dirty buffers.  But at shutdown or end of
@@ -2334,9 +2331,6 @@ BgBufferSync(WritebackContext *wb_context)
 	 * requirements, or hit the bgwriter_lru_maxpages limit.
 	 */
 
-	/* Make sure we can handle the pin inside SyncOneBuffer */
-	ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
-
 	num_to_scan = bufs_to_lap;
 	num_written = 0;
 	reusable_buffers = reusable_buffers_est;
@@ -2418,8 +2412,6 @@ BgBufferSync(WritebackContext *wb_context)
  *
  * (BUF_WRITTEN could be set in error if FlushBuffer finds the buffer clean
  * after locking it, but we don't care all that much.)
- *
- * Note: caller must have done ResourceOwnerEnlargeBuffers.
  */
 static int
 SyncOneBuffer(int buf_id, bool 

Re: cleanup temporary files after crash

2021-03-09 Thread Michael Paquier
On Tue, Mar 09, 2021 at 02:28:43AM +0100, Tomas Vondra wrote:
> Let's move this patch forward. Based on the responses, I agree the
> default behavior should be to remove the temp files, and I think we
> should have the GUC (on the off chance that someone wants to preserve
> the temporary files for debugging or whatever other reason).

Thanks for taking care of this.  I am having some second-thoughts
about changing this behavior by default, still that's much more useful
this way.

> I propose to rename the GUC to remove_temp_files_after_crash, I think
> "remove" is a bit clearer than "cleanup". I've also reworded the sgml
> docs a little bit.

"remove" sounds fine to me.

> Attached is a patch with those changes. Barring objections, I'll get
> this committed in the next couple days.

+When set to on, PostgreSQL will 
automatically
Nit: using a <literal> markup for the "on" value.

+#remove_temp_files_after_crash = on# remove temporary files after
+#  # backend crash?
The indentation of the second line is incorrect here (Incorrect number
of spaces in tabs perhaps?), and there is no need for the '#' at the
beginning of the line.
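
Concretely, following the usual postgresql.conf.sample convention, the sample
entry would look something like this (continuation line indented with tabs and
without a leading '#'):

#remove_temp_files_after_crash = on	# remove temporary files after
					# backend crash?
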
--
Michael




Re: SQL-standard function body

2021-03-09 Thread Peter Eisentraut

On 05.03.21 06:58, Jaime Casanova wrote:

I was making some tests with this patch and found this problem:

"""
CREATE OR REPLACE FUNCTION public.make_table()
  RETURNS void
  LANGUAGE sql
BEGIN ATOMIC
   CREATE TABLE created_table AS SELECT * FROM int8_tbl;
END;
ERROR:  unrecognized token: "?"
CONTEXT:  SQL function "make_table"
"""


I see.  The problem is that we don't have serialization and 
deserialization support for most utility statements.  I think I'll need 
to add that eventually.  For now, I have added code to prevent utility 
statements.  I think it's still useful that way for now.


From 29de4ec1ae12d3f9c1f6a31cf626a19b2421ae7a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 9 Mar 2021 13:25:39 +0100
Subject: [PATCH v9] SQL-standard function body

This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
statements for language SQL with a function body that conforms to the
SQL standard and is portable to other implementations.

Instead of the PostgreSQL-specific AS $$ string literal $$ syntax,
this allows writing out the SQL statements making up the body
unquoted, either as a single statement:

CREATE FUNCTION add(a integer, b integer) RETURNS integer
LANGUAGE SQL
RETURN a + b;

or as a block

CREATE PROCEDURE insert_data(a integer, b integer)
LANGUAGE SQL
BEGIN ATOMIC
  INSERT INTO tbl VALUES (a);
  INSERT INTO tbl VALUES (b);
END;

The function body is parsed at function definition time and stored as
expression nodes in a new pg_proc column prosqlbody.  So at run time,
no further parsing is required.

However, this form does not support polymorphic arguments, because
there is no more parse analysis done at call time.

Dependencies between the function and the objects it uses are fully
tracked.

A new RETURN statement is introduced.  This can only be used inside
function bodies.  Internally, it is treated much like a SELECT
statement.

psql needs some new intelligence to keep track of function body
boundaries so that it doesn't send off statements when it sees
semicolons that are inside a function body.

Also, per SQL standard, LANGUAGE SQL is the default, so it does not
need to be specified anymore.

Discussion: 
https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c0...@2ndquadrant.com
---
 doc/src/sgml/catalogs.sgml|  10 +
 doc/src/sgml/ref/create_function.sgml | 126 +-
 doc/src/sgml/ref/create_procedure.sgml|  62 -
 src/backend/catalog/pg_aggregate.c|   1 +
 src/backend/catalog/pg_proc.c | 116 +
 src/backend/commands/aggregatecmds.c  |   2 +
 src/backend/commands/functioncmds.c   | 130 --
 src/backend/commands/typecmds.c   |   4 +
 src/backend/executor/functions.c  |  79 +++---
 src/backend/nodes/copyfuncs.c |  15 ++
 src/backend/nodes/equalfuncs.c|  13 +
 src/backend/nodes/outfuncs.c  |  12 +
 src/backend/nodes/readfuncs.c |   1 +
 src/backend/optimizer/util/clauses.c  | 126 +++---
 src/backend/parser/analyze.c  |  35 +++
 src/backend/parser/gram.y | 129 +++---
 src/backend/tcop/postgres.c   |   3 +-
 src/backend/utils/adt/ruleutils.c | 132 +-
 src/bin/pg_dump/pg_dump.c |  45 +++-
 src/bin/psql/describe.c   |  15 +-
 src/fe_utils/psqlscan.l   |  23 +-
 src/include/catalog/pg_proc.dat   |   4 +
 src/include/catalog/pg_proc.h |   6 +-
 src/include/commands/defrem.h |   2 +
 src/include/executor/functions.h  |  15 ++
 src/include/fe_utils/psqlscan_int.h   |   2 +
 src/include/nodes/nodes.h |   1 +
 src/include/nodes/parsenodes.h|  13 +
 src/include/parser/kwlist.h   |   2 +
 src/include/tcop/tcopprot.h   |   1 +
 src/interfaces/ecpg/preproc/ecpg.addons   |   6 +
 src/interfaces/ecpg/preproc/ecpg.trailer  |   4 +-
 .../regress/expected/create_function_3.out| 231 +-
 .../regress/expected/create_procedure.out |  65 +
 src/test/regress/sql/create_function_3.sql|  96 +++-
 src/test/regress/sql/create_procedure.sql |  33 +++
 36 files changed, 1346 insertions(+), 214 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index b1de6d0674..146bf0f0b0 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5980,6 +5980,16 @@ pg_proc Columns
   
  
 
+ 
+  
+   prosqlbody pg_node_tree
+  
+  
+   Pre-parsed SQL function body.  This will be used for language SQL
+   functions if the body is not specified as a string constant.
+   
+ 
+
  
   
proconfig text[]
diff --git 

Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-09 Thread Pavel Stehule
On Tue, Mar 9, 2021 at 11:32 AM Joel Jacobson  wrote:

> On Thu, Mar 4, 2021, at 16:40, Tom Lane wrote:
>
> My experience with working with parallel arrays in SQL has been unpleasant.
>
>
> Could you please give an example of such an unpleasant experience?
>

It was a more complex application with 3D data for a set of points stored in a
2D array. Everywhere there was a[d, 0], a[d, 1], a[d, 2] instead of a[d] or
a[d].x, ...



> I can see a problem if the arrays could possibly have different
> dimensionality/cardinality,
> but regexp_positions() could guarantee they won't, so I don't see a
> problem here,
> but there is probably something I'm missing here?
>

I think the functions based on arrays can work, why not. But the
semantics are lost.
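
To illustrate what gets lost: with parallel arrays the pairing between
elements lives only in the shared subscript, so a consumer ends up zipping
the arrays by position, for example (a made-up query, not from the thread):

SELECT starts[i] AS start, lengths[i] AS length
FROM (SELECT ARRAY[2,7] AS starts, ARRAY[3,4] AS lengths) AS m,
     generate_subscripts(m.starts, 1) AS i;

whereas a composite or range value keeps each start/length pair together.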


> /Joel
>


Outdated comments about proc->sem in lwlock.c

2021-03-09 Thread Thomas Munro
Hi,

In passing I noticed that lwlock.c contains 3 comments about bogus
wakeups due to sharing proc->sem with the heavyweight lock manager and
ProcWaitForSignal.  Commit 675f55e (9.5) switched those things
from proc->sem to proc->procLatch.  ProcArrayGroupClearXid() and
TransactionGroupUpdateXidStatus() also use proc->sem though, and I
haven't studied how those might overlap with LWLockWait(), so I'm
not sure what change to suggest.




Questions about CommandIsReadOnly

2021-03-09 Thread houzj.f...@fujitsu.com
Hi hackers,

When reading the code, I found that in the function CommandIsReadOnly [1],
"select for update/share" is treated as "not read only".
[1]-
if (pstmt->rowMarks != NIL)
return false;   /* SELECT FOR [KEY] UPDATE/SHARE */
-

And from the comment [2], I think it means we need to do a
CommandCounterIncrement (CCI) for "select for update/share". I am not very
familiar with this; is there some reason that we have to do CCI for
"select for update/share"? Or did I misunderstand?

[2]-
* the query must be *in truth* read-only, because the caller wishes
* not to do CommandCounterIncrement for it.
-

Best regards,
houzj






Re: [POC] verifying UTF-8 using SIMD instructions

2021-03-09 Thread John Naylor
On Tue, Mar 9, 2021 at 5:00 AM Amit Khandekar 
wrote:
>
> Hi,
>
> Just a quick question before I move on to review the patch ... The
> improvement looks like it is only meant for x86 platforms.

Actually it's meant to be faster for all platforms, since the C fallback is
quite a bit different from HEAD. I've found it to be faster on ppc64le. An
earlier version of the patch was a loser on 32-bit Arm because of alignment
issues, but if you could run the test script attached to [1] on 64-bit Arm,
I'd be curious to see how it does on 0002, and whether 0003 and 0004 make
things better or worse. If there is trouble building on non-x86 platforms,
I'd want to fix that also.

(Note: 0001 is not my patch, and I just include it for the tests)
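
For anyone reading along without the patches applied, the usual shape of the
ASCII fast path in such a C fallback is to check a machine word at a time
before dropping back to the byte-wise UTF-8 validator. A generic sketch, not
the actual code in 0002:

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* length of the leading pure-ASCII prefix of s */
static size_t
ascii_prefix_len(const unsigned char *s, size_t len)
{
	size_t		i = 0;

	while (i + sizeof(uint64_t) <= len)
	{
		uint64_t	chunk;

		memcpy(&chunk, s + i, sizeof(chunk));
		if (chunk & UINT64_C(0x8080808080808080))
			break;				/* some byte has its high bit set */
		i += sizeof(chunk);
	}
	while (i < len && s[i] < 0x80)
		i++;
	return i;
}

Everything past that prefix then has to go through the full multibyte checks.
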

> Can this be
> done in a portable way by arranging for auto-vectorization ? Something
> like commit 88709176236caf. This way it would benefit other platforms
> as well.

I'm fairly certain that the author of a compiler capable of doing that in
this case would be eligible for some kind of AI prize. :-)

[1]
https://www.postgresql.org/message-id/06d45421-61b8-86dd-e765-f1ce527a5...@iki.fi
--
John Naylor
EDB: http://www.enterprisedb.com


Enlarge IOS vm cache

2021-03-09 Thread Arseny Sher
Hi,

Our customer experienced a significant slowdown on queries involving
Index Only Scan. As it turned out, the problem was constant pin-unpin of
the visibility map page. IOS caches only one vm page; with 2 bits per heap
block, one vm page covers 8192 * 8 / 2 = 32768 heap blocks, i.e.
32768 * 8192 bytes = 256 MB of data. If the table is larger and the order of
access (index order) doesn't match the order of the data, the vm page will be
replaced for each tuple processed. That's costly. The attached ios.sql script
emulates this worst-case behaviour. On current master, the select takes

[local]:5432 ars@postgres:21052=# explain analyse select * from test order by 
id;
QUERY PLAN
---
 Index Only Scan using test_idx on test  (cost=0.44..1159381.24 rows=59013120 
width=8) (actual time=0.015..9094.532 rows=59013120 loops=1)
   Heap Fetches: 0
 Planning Time: 0.043 ms
 Execution Time: 10508.576 ms


The attached straightforward patch increases the cache to 64 pages (16 GB of
data). With it, we get

[local]:5432 ars@postgres:17427=# explain analyse select * from test order by 
id;
QUERY PLAN
---
 Index Only Scan using test_idx on test  (cost=0.44..1159381.24 rows=59013120 
width=8) (actual time=0.040..3469.299 rows=59013120 loops=1)
   Heap Fetches: 0
 Planning Time: 0.118 ms
 Execution Time: 4871.124 ms

(I believe the whole index is cached in these tests)

You might say 16 GB is also a somewhat arbitrary border. Well, it is. We
could make it GUC-able, but I'm not sure about that, as the setting is rather
low-level; at the same time, having several dozen additionally
pinned buffers doesn't sound too criminal, i.e. I doubt there is a real
risk of "no unpinned buffers available" or the like (e.g. even the default
32MB of shared_buffers contains 4096 pages). However, forcing IOS to be
inefficient whenever the table is larger is also silly. Any thoughts?

(the code is by K. Knizhnik, testing by M. Zhilin and R. Zharkov; I've
only polished things up)
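
In case the attachment gets lost, the same worst case can be reproduced with
something along these lines (not the attached ios.sql, just the same idea: a
table whose heap order is uncorrelated with the index order, so consecutive
index entries fall under different visibility map pages; scale the row count
down if needed):

CREATE TABLE test (id int8);
INSERT INTO test
  SELECT i FROM generate_series(1, 59013120) AS g(i) ORDER BY random();
CREATE INDEX test_idx ON test (id);
VACUUM (ANALYZE) test;      -- set the all-visible bits so IOS can skip heap fetches
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM test ORDER BY id;
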



ios.sql
Description: application/sql
commit 3726b6ffdba5461dcbf82cce5de13760a2642468
Author: Konstantin Knizhnik 
Date:   Fri Jan 15 12:57:46 2021 +0300

Increase IOS vm cache.

diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index e198df65d82..98e249b0c2a 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -99,24 +99,6 @@
 
 /*#define TRACE_VISIBILITYMAP */
 
-/*
- * Size of the bitmap on each visibility map page, in bytes. There's no
- * extra headers, so the whole page minus the standard page header is
- * used for the bitmap.
- */
-#define MAPSIZE (BLCKSZ - MAXALIGN(SizeOfPageHeaderData))
-
-/* Number of heap blocks we can represent in one byte */
-#define HEAPBLOCKS_PER_BYTE (BITS_PER_BYTE / BITS_PER_HEAPBLOCK)
-
-/* Number of heap blocks we can represent in one visibility map page. */
-#define HEAPBLOCKS_PER_PAGE (MAPSIZE * HEAPBLOCKS_PER_BYTE)
-
-/* Mapping from heap block number to the right bit in the visibility map */
-#define HEAPBLK_TO_MAPBLOCK(x) ((x) / HEAPBLOCKS_PER_PAGE)
-#define HEAPBLK_TO_MAPBYTE(x) (((x) % HEAPBLOCKS_PER_PAGE) / HEAPBLOCKS_PER_BYTE)
-#define HEAPBLK_TO_OFFSET(x) (((x) % HEAPBLOCKS_PER_BYTE) * BITS_PER_HEAPBLOCK)
-
 /* Masks for counting subsets of bits in the visibility map. */
 #define VISIBLE_MASK64	UINT64CONST(0x5555555555555555) /* The lower bit of each
 		 * bit pair */
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 0754e28a9aa..47c6e8929a3 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -101,7 +101,9 @@ IndexOnlyNext(IndexOnlyScanState *node)
 
 		/* Set it up for index-only scan */
 		node->ioss_ScanDesc->xs_want_itup = true;
-		node->ioss_VMBuffer = InvalidBuffer;
+
+		for (int i = 0; i < VMBUF_SIZE; i++)
+			node->ioss_VMBuffer[i] = InvalidBuffer;
 
 		/*
 		 * If no run-time keys to calculate or they are ready, go ahead and
@@ -121,6 +123,7 @@ IndexOnlyNext(IndexOnlyScanState *node)
 	while ((tid = index_getnext_tid(scandesc, direction)) != NULL)
 	{
 		bool		tuple_from_heap = false;
+		Buffer		*vm_buf;
 
 		CHECK_FOR_INTERRUPTS();
 
@@ -158,9 +161,11 @@ IndexOnlyNext(IndexOnlyScanState *node)
 		 * It's worth going through this complexity to avoid needing to lock
 		 * the VM buffer, which could cause significant contention.
 		 */
+		vm_buf = &node->ioss_VMBuffer[HEAPBLK_TO_MAPBLOCK(
+			ItemPointerGetBlockNumber(tid)) % VMBUF_SIZE];
 		if (!VM_ALL_VISIBLE(scandesc->heapRelation,
 			ItemPointerGetBlockNumber(tid),
-			&node->ioss_VMBuffer))
+			vm_buf))
 		{
 			/*
 			 * Rats, we have to visit 

Re: [HACKERS] logical decoding of two-phase transactions

2021-03-09 Thread Amit Kapila
On Tue, Mar 9, 2021 at 3:22 PM Ajin Cherian  wrote:
>

Few comments:
==

1.
+/*
+ * Handle the PREPARE spoolfile (if any)
+ *
+ * It can be necessary to redirect the PREPARE messages to a spoolfile (see
+ * apply_handle_begin_prepare) and then replay them back at the COMMIT PREPARED
+ * time. If needed, this is the common function to do that file redirection.
+ *

I think the last sentence ("If needed, this is the ...") in the above
comment is not required.

2.
+prepare_spoolfile_exists(char *path)
+{
+ bool found;
+
+ File fd = PathNameOpenFile(path, O_RDONLY | PG_BINARY);
+
+ found = fd >= 0;
+ if (fd >= 0)
+ FileClose(fd);

Can we avoid using the bool variable in the above code, with something like below?

File fd = PathNameOpenFile(path, O_RDONLY | PG_BINARY);

if (fd >= 0)
{
FileClose(fd);
return true;
}

return false;

3. In prepare_spoolfile_replay_messages(), it is better to free the
memory allocated for temporary strings buffer and s2.

4.
+ /* check if the file already exists. */
+ file_found = prepare_spoolfile_exists(path);
+
+ if (!file_found)
+ {
+ elog(DEBUG1, "Not found file \"%s\". Create it.", path);
+ psf_cur.vfd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_TRUNC | PG_BINARY);
+ if (psf_cur.vfd < 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m", path)));
+ }
+ else
+ {
+ /*
+ * Open the file and seek to the beginning because we always want to
+ * create/overwrite this file.
+ */
+ elog(DEBUG1, "Found file \"%s\". Overwrite it.", path);
+ psf_cur.vfd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_TRUNC | PG_BINARY);
+ if (psf_cur.vfd < 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m", path)));
+ }

Here, whether the file exists or not, you are using the same flags to
open it, which seems correct to me, but the code looks a bit odd. Why do
we even bother to check whether it exists in this case? Is it just for the
DEBUG message? If so, I am not sure that is worth it. I am also wondering
why not have a function prepare_spoolfile_open, similar to *_close, and call
it from all the places, with a mode indicating whether you want to create
or open the file.

5. I think prepare_spoolfile_close can be extended to take PsfFile as
input and then it can be also used from
prepare_spoolfile_replay_messages.

6. I think we should also write some commentary about prepared
transactions atop worker.c, as we have done for streamed
transactions.

-- 
With Regards,
Amit Kapila.




Re: Improvements and additions to COPY progress reporting

2021-03-09 Thread Josef Šimánek
On Tue, Mar 9, 2021 at 6:34 AM Michael Paquier  wrote:
>
> On Mon, Mar 08, 2021 at 05:33:40PM +0100, Matthias van de Meent wrote:
> > Seems reasonable. PFA updated patches. I've renamed the previous 0003
> > to 0002 to keep git-format-patch easy.
>
> Thanks for updating the patch.  0001 has been applied, after tweaking
> a bit comments, indentation and the docs.
>
> > This is keeping current behaviour of the implementation as committed
> > with 8a4f618e, with the rationale of that patch being that this number
> > should mirror the number returned by the copy command.
> >
> > I am not opposed to adding another column for `tuples_inserted` and
> > changing the logic accordingly (see prototype 0003), but that was not
> > in the intended scope of this patchset. Unless you think that this
> > should be included in this current patchset, I'll spin that patch out
> > into a different thread, but I'm not sure that would make it into
> > pg14.
>
> Okay, point taken.  If there is demand for it in the future, we could
> extend the existing set of columns.  After thinking more about it, the
> use case is not completely clear to me from a monitoring point of
> view.
>
> I have not looked at 0002 in details yet, but I am wondering first if
> the size estimations in the expected output are actually portable.
> Second, I doubt a bit that the extra cycles spent on that are actually
> worth the coverage, even if the trick with an AFTER INSERT trigger is
> interesting.

Those extra cycles are there to keep at least parts of the COPY progress
reporting from being accidentally broken. I have seen various patches
modifying the COPY command that are currently in progress. It would be nice
to ensure at least the basic functionality works well in an automated way.
On my machine there is no noticeable overhead added by those tests
(they finish almost instantly).

> --
> Michael




Re: Make stream_prepare an optional callback

2021-03-09 Thread Amit Kapila
On Tue, Mar 9, 2021 at 3:41 PM Markus Wanner
 wrote:
>
> On 09.03.21 10:37, Amit Kapila wrote:
> I guess I don't quite understand the initial motivation for the patch.
> It states: "This allows plugins to not allow the enabling of streaming
> and two_phase at the same time in logical replication."   That's beyond
> me ... "allows [..] to not allow"?  Why not, an output plugin can still
> reasonably request both.  And that's a good thing, IMO.  What problem
> does the patch try to solve?
>

AFAIU, Ajin doesn't want to mandate streaming with the two_pc option. But
maybe you are right that it doesn't make sense for the user to provide
both options without providing the stream_prepare callback, and giving
an error in such a case should be fine. I think if we have to follow
Ajin's idea then we need to skip 2PC in such a case (both prepare and
commit prepared) and make this a regular transaction.

-- 
With Regards,
Amit Kapila.




Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-09 Thread Joel Jacobson
On Thu, Mar 4, 2021, at 16:40, Tom Lane wrote:
> My experience with working with parallel arrays in SQL has been unpleasant.

Could you please give an example of such an unpleasant experience?

I can see a problem if the arrays could possibly have different 
dimensionality/cardinality,
but regexp_positions() could guarantee they won't, so I don't see a problem 
here,
but there is probably something I'm missing here?

/Joel

Re: PROXY protocol support

2021-03-09 Thread Magnus Hagander
On Sat, Mar 6, 2021 at 5:30 PM Magnus Hagander  wrote:
>
> On Sat, Mar 6, 2021 at 4:17 PM Magnus Hagander  wrote:
> >
> > On Fri, Mar 5, 2021 at 8:11 PM Jacob Champion  wrote:
> > >
> > > On Fri, 2021-03-05 at 10:22 +0100, Magnus Hagander wrote:
> > > > On Fri, Mar 5, 2021 at 12:21 AM Jacob Champion  
> > > > wrote:
> > > > > The original-host logging isn't working for me:
> > > > >
> > > > > [...]
> > > >
> > > > That's interesting -- it works perfectly fine here. What platform are
> > > > you testing on?
> > >
> > > Ubuntu 20.04.
> >
> > Curious. It doesn't show up on my debian.
> >
> > But either way -- it was clearly wrong :)
> >
> >
> > > > (I sent for sizeof(SockAddr) to make it
> > > > easier to read without having to look things up, but the net result is
> > > > the same)
> > >
> > > Cool. Did you mean to attach a patch?
> >
> > I didn't, I had some other hacks that were broken :) I've attached one
> > now which includes those changes.
> >
> >
> > > == More Notes ==
> > >
> > > (Stop me if I'm digging too far into a proof of concept patch.)
> >
> > Definitely not -- much appreciated, and just what was needed to take
> > it from poc to a proper one!
> >
> >
> > > > + proxyaddrlen = pg_ntoh16(proxyheader.len);
> > > > +
> > > > + if (proxyaddrlen > sizeof(proxyaddr))
> > > > + {
> > > > + ereport(COMMERROR,
> > > > + (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > > +  errmsg("oversized proxy packet")));
> > > > + return STATUS_ERROR;
> > > > + }
> > >
> > > I think this is not quite right -- if there's additional data beyond
> > > the IPv6 header size, that just means there are TLVs tacked onto the
> > > header that we should ignore. (Or, eventually, use.)
> >
> > Yeah, you're right. Fallout of too much moving around. I think in the
> > end that code should just be removed, in favor of the discard path as
> > you mentioned below.
> >
> >
> > > Additionally, we need to check for underflow as well. A misbehaving
> > > proxy might not send enough data to fill up the address block for the
> > > address family in use.
> >
> > I used to have that check. I seem to have lost it in restructuring. Added 
> > back!
> >
> >
> > > > + /* If there is any more header data present, skip past it */
> > > > + if (proxyaddrlen > sizeof(proxyaddr))
> > > > + pq_discardbytes(proxyaddrlen - sizeof(proxyaddr));
> > >
> > > This looks like dead code, given that we'll error out for the same
> > > check above -- but once it's no longer dead code, the return value of
> > > pq_discardbytes should be checked for EOF.
> >
> > Yup.
> >
> >
> > > > + else if (proxyheader.fam == 0x11)
> > > > + {
> > > > + /* TCPv4 */
> > > > + port->raddr.addr.ss_family = AF_INET;
> > > > + port->raddr.salen = sizeof(struct sockaddr_in);
> > > > + ((struct sockaddr_in *) &port->raddr.addr)->sin_addr.s_addr = 
> > > > proxyaddr.ip4.src_addr;
> > > > + ((struct sockaddr_in *) &port->raddr.addr)->sin_port = 
> > > > proxyaddr.ip4.src_port;
> > > > + }
> > >
> > > I'm trying to reason through the fallout of setting raddr and not
> > > laddr. I understand why we're not setting laddr -- several places in
> > > the code rely on the laddr to actually refer to a machine-local address
> > > -- but the fact that there is no actual connection from raddr to laddr
> > > could cause shenanigans. For example, the ident auth protocol will just
> > > break (and it might be nice to explicitly disable it for PROXY
> > > connections). Are there any other situations where a "faked" raddr
> > > could throw off Postgres internals?
> >
> > That's a good point to discuss. I thought about it initially and
> > figured it'd be even worse to actually copy over laddr since that
> > would then suddenly have the IP address belonging to a different
> > machine. And then I forgot to enumerate the other cases.
> >
> > For ident, disabling the method seems reasonable.
> >
> > Another thing that shows up with added support for running the proxy
> > protocol over Unix sockets, is that PostgreSQL refuses to do SSL over
> > Unix sockets. So that check has to be updated to allow it over proxy
> > connections. Same for GSSAPI.
> >
> > An interesting thing is what to do about
> > inet_server_addr/inet_server_port. That sort of loops back up to the
> > original question of where/how to expose the information about the
> > proxy in general (since right now it just logs). Right now you can
> > actually use inet_server_port() to see if the connection was proxied
> > (as long as it was over tcp).
> >
> > Attached is an updated, which covers your comments, as well as adds
> > unix socket support (per your question and Alvaros confirmed usecase).
> > It allows proxy connections over unix sockets, but I saw no need to
> > get into unix sockets over the proxy protocol (dealing with paths
> > between machines etc).
> >
> > I 

Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-09 Thread Joel Jacobson
On Tue, Mar 9, 2021, at 10:18, Pavel Stehule wrote:
> What do you mean?
>> More than one unnest() in the same query, e.g. SELECT unnest(..), unnest(..)?
> 
> you can do unnest(array1, array2, ...)

Right, I had forgotten about that variant.
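
For reference, a minimal example of that form (it is accepted only in the
FROM clause):

SELECT * FROM unnest(ARRAY[1,2,3], ARRAY['a','b']) AS t(n, s);
 n | s
---+---
 1 | a
 2 | b
 3 |
(3 rows)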

But isn't this a bit surprising then:

\df unnest
List of functions
   Schema   |  Name  | Result data type |                                Argument data types                                | Type
------------+--------+------------------+-----------------------------------------------------------------------------------+------
 pg_catalog | unnest | SETOF anyelement | anyarray                                                                            | func
 pg_catalog | unnest | SETOF record     | tsvector tsvector, OUT lexeme text, OUT positions smallint[], OUT weights text[]   | func
(2 rows)

Should there be an entry there showing the VARIADIC anyelement version as well?

I know it's a documented feature, but \df seems out-of-sync with the docs.

/Joel


Re: Make stream_prepare an optional callback

2021-03-09 Thread Markus Wanner

On 09.03.21 10:37, Amit Kapila wrote:

AFAICS, the error is removed by the patch as per below change:


Ah, well, that does not seem right, then.  We cannot just silently 
ignore the callback but not skip the prepare, IMO.  That would lead to 
the output plugin missing the PREPARE, but still seeing a COMMIT 
PREPARED for the transaction, potentially missing changes that went out 
with the prepare, no?



oh, right, in that case, it will skip the stream_prepare even though
that is required. I guess in FilterPrepare, we should check whether
rbtxn_is_streamed is set and stream_prepare_cb is not provided, and in
that case return true.


Except that FilterPrepare doesn't (yet) have access to a 
ReorderBufferTXN struct (see the other thread I just started).


Maybe we need to do a ReorderBufferTXNByXid lookup already prior to (or 
as part of) FilterPrepare, then also skip (rather than silently ignore) 
the prepare if no stream_prepare_cb callback is given (without even 
calling filter_prepare_cb, because the output plugin has already stated 
it cannot handle that by not providing the corresponding callback).


However, I also wonder what's the use case for an output plugin enabling 
streaming and two-phase commit, but not providing a stream_prepare_cb. 
Maybe the original ERROR is the simpler approach?  I.e. making the 
stream_prepare_cb mandatory, if and only if both are enabled (and 
filter_prepare doesn't skip).  (As in the original comment that says: 
"in streaming mode with two-phase commits, stream_prepare_cb is required").


I guess I don't quite understand the initial motivation for the patch. 
It states: "This allows plugins to not allow the enabling of streaming 
and two_phase at the same time in logical replication."   That's beyond 
me ... "allows [..] to not allow"?  Why not, an output plugin can still 
reasonably request both.  And that's a good thing, IMO.  What problem 
does the patch try to solve?


Regards

Markus



