Re: segmentation fault when cassert enabled

2019-12-19 Thread Jehan-Guillaume de Rorthais
On Wed, 18 Dec 2019 08:46:01 +0530
Amit Kapila  wrote:

> On Tue, Dec 17, 2019 at 6:01 PM vignesh C  wrote:
> >
> > On Tue, Dec 17, 2019 at 10:09 AM Amit Kapila 
> > wrote:  
> > >
> > > Attached patch with updated commit message based on suggestions.  I am
> > > planning to commit this tomorrow unless there are more comments.
> > >  
> >
> > While testing the patch on back versions, I found that the patch does
> > not apply on PG 11 & PG 10 branch. Attached patch has the changes for
> > PG 11 & PG 10 branch. Only difference in the patch was that table_open
> > needed to be changed to heap_open. I have verified the patch on back
> > branches and found it to be working. For PG 12 & current the previous
> > patch that Amit need to be used, I'm not reattaching here.
> >  
> 
> Pushed.

Thanks!




Re: segmentation fault when cassert enabled

2019-12-17 Thread Amit Kapila
On Tue, Dec 17, 2019 at 6:01 PM vignesh C  wrote:
>
> On Tue, Dec 17, 2019 at 10:09 AM Amit Kapila  wrote:
> >
> > Attached patch with updated commit message based on suggestions.  I am
> > planning to commit this tomorrow unless there are more comments.
> >
>
> While testing the patch on back versions, I found that the patch does
> not apply on PG 11 & PG 10 branch. Attached patch has the changes for
> PG 11 & PG 10 branch. Only difference in the patch was that table_open
> needed to be changed to heap_open. I have verified the patch on back
> branches and found it to be working. For PG 12 & current the previous
> patch that Amit need to be used, I'm not reattaching here.
>

Pushed.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: segmentation fault when cassert enabled

2019-12-17 Thread vignesh C
On Tue, Dec 17, 2019 at 10:09 AM Amit Kapila  wrote:
>
> On Mon, Dec 16, 2019 at 9:16 PM Jehan-Guillaume de Rorthais
>  wrote:
> >
> > On Mon, 16 Dec 2019 13:27:43 +0100
> > Peter Eisentraut  wrote:
> >
> > > On 2019-12-16 11:11, Amit Kapila wrote:
> > > > I agree that this is a timing issue.  I also don't see a way to write
> > > > a reproducible test for this.  However, I could reproduce it via
> > > > debugger consistently by following the below steps.  I have updated a
> > > > few comments and commit messages in the attached patch.
> > > >
> > > > Peter E., Petr J or anyone else, do you have comments or objections on
> > > > this patch?  If none, then I am planning to commit (and backpatch)
> > > > this patch in a few days time.
> > >
> > > The patch seems fine to me.  Writing a test seems hard.  Let's skip it.
> > >
>
> Okay.
>
> > > The commit message has a duplicate "building"/"built" in the first 
> > > sentence.
> >
> > I think the sentence is quite long. I had to re-read it to get it.
> >
> > What about:
> >
> >   This patch allows building the local relmap cache for a subscribed 
> > relation
> >   after processing pending invalidation messages and potential relcache
> >   updates.
> >
>
> Attached patch with updated commit message based on suggestions.  I am
> planning to commit this tomorrow unless there are more comments.
>

While testing the patch on back versions, I found that the patch does
not apply on PG 11 & PG 10 branch. Attached patch has the changes for
PG 11 & PG 10 branch. Only difference in the patch was that table_open
needed to be changed to heap_open. I have verified the patch on back
branches and found it to be working. For PG 12 & current the previous
patch that Amit need to be used, I'm not reattaching here.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
From 2c4123d514473a9203c3617655229286a84487e6 Mon Sep 17 00:00:00 2001
From: vignesh 
Date: Tue, 17 Dec 2019 17:05:35 +0530
Subject: [PATCH] Fix subscriber invalid memory access on DDL.

This patch allows building the local relmap cache for a subscribed
relation after processing pending invalidation messages and potential
relcache updates.  Without this, the attributes in the local cache don't
tally with the updated relcache entry leading to invalid memory access.

Reported-by Jehan-Guillaume de Rorthais
Author: Jehan-Guillaume de Rorthais and Vignesh C
Reviewed-by: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/20191025175929.7e90dbf5@firost
---
 src/backend/replication/logical/relation.c | 40 +++---
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index cf6f0a7..e1bbabd 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -220,6 +220,8 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 {
 	LogicalRepRelMapEntry *entry;
 	bool		found;
+	Oid			relid = InvalidOid;
+	LogicalRepRelation *remoterel;
 
 	if (LogicalRepRelMap == NULL)
 		logicalrep_relmap_init();
@@ -232,20 +234,17 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 		elog(ERROR, "no relation map entry for remote relation ID %u",
 			 remoteid);
 
-	/* Need to update the local cache? */
+	remoterel = >remoterel;
+
+	/*
+	 * When opening and locking a relation, pending invalidation messages are
+	 * processed which can invalidate the relation.  We need to update the
+	 * local cache both when we are first time accessing the relation and when
+	 * the relation is invalidated (aka entry->localreloid is set InvalidOid).
+	 */
 	if (!OidIsValid(entry->localreloid))
 	{
-		Oid			relid;
-		int			i;
-		int			found;
-		Bitmapset  *idkey;
-		TupleDesc	desc;
-		LogicalRepRelation *remoterel;
-		MemoryContext oldctx;
-
-		remoterel = >remoterel;
-
-		/* Try to find and lock the relation by name. */
+		/* Try to find and lock the relation by name */
 		relid = RangeVarGetRelid(makeRangeVar(remoterel->nspname,
 			  remoterel->relname, -1),
  lockmode, true);
@@ -256,6 +255,21 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 			remoterel->nspname, remoterel->relname)));
 		entry->localrel = heap_open(relid, NoLock);
 
+	}
+	else
+	{
+		relid = entry->localreloid;
+		entry->localrel = heap_open(entry->localreloid, lockmode);
+	}
+
+	if (!OidIsValid(entry->localreloid))
+	{
+		int			found;
+		Bitmapset  *idkey;
+		TupleDesc	desc;
+		MemoryContext oldctx;
+		int			i;
+
 		/* Check for supported relkind. */
 		CheckSubscriptionRelkind(entry->localrel->rd_rel->relkind,
  remoterel->nspname, remoterel->relname);
@@ -350,8 +364,6 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 
 		entry->localreloid = relid;
 	}
-	else
-		entry->localrel = heap_open(entry->localreloid, lockmode);
 
 	if (entry->state != SUBREL_STATE_READY)
 		entry->state = 

Re: segmentation fault when cassert enabled

2019-12-16 Thread Amit Kapila
On Mon, Dec 16, 2019 at 9:16 PM Jehan-Guillaume de Rorthais
 wrote:
>
> On Mon, 16 Dec 2019 13:27:43 +0100
> Peter Eisentraut  wrote:
>
> > On 2019-12-16 11:11, Amit Kapila wrote:
> > > I agree that this is a timing issue.  I also don't see a way to write
> > > a reproducible test for this.  However, I could reproduce it via
> > > debugger consistently by following the below steps.  I have updated a
> > > few comments and commit messages in the attached patch.
> > >
> > > Peter E., Petr J or anyone else, do you have comments or objections on
> > > this patch?  If none, then I am planning to commit (and backpatch)
> > > this patch in a few days time.
> >
> > The patch seems fine to me.  Writing a test seems hard.  Let's skip it.
> >

Okay.

> > The commit message has a duplicate "building"/"built" in the first sentence.
>
> I think the sentence is quite long. I had to re-read it to get it.
>
> What about:
>
>   This patch allows building the local relmap cache for a subscribed relation
>   after processing pending invalidation messages and potential relcache
>   updates.
>

Attached patch with updated commit message based on suggestions.  I am
planning to commit this tomorrow unless there are more comments.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


0001-Fix-subscriber-invalid-memory-access-on-DDL.amit.2.patch
Description: Binary data


Re: segmentation fault when cassert enabled

2019-12-16 Thread Jehan-Guillaume de Rorthais
On Mon, 16 Dec 2019 13:27:43 +0100
Peter Eisentraut  wrote:

> On 2019-12-16 11:11, Amit Kapila wrote:
> > I agree that this is a timing issue.  I also don't see a way to write
> > a reproducible test for this.  However, I could reproduce it via
> > debugger consistently by following the below steps.  I have updated a
> > few comments and commit messages in the attached patch.
> > 
> > Peter E., Petr J or anyone else, do you have comments or objections on
> > this patch?  If none, then I am planning to commit (and backpatch)
> > this patch in a few days time.  
> 
> The patch seems fine to me.  Writing a test seems hard.  Let's skip it.
> 
> The commit message has a duplicate "building"/"built" in the first sentence.

I think the sentence is quite long. I had to re-read it to get it.

What about:

  This patch allows building the local relmap cache for a subscribed relation
  after processing pending invalidation messages and potential relcache
  updates.

Regards,




Re: segmentation fault when cassert enabled

2019-12-16 Thread Jehan-Guillaume de Rorthais
On Fri, 13 Dec 2019 12:10:07 +0530
vignesh C  wrote:

> On Fri, Dec 6, 2019 at 5:30 PM Amit Kapila  wrote:
> >
> > On Mon, Nov 25, 2019 at 8:25 PM Jehan-Guillaume de Rorthais
> >  wrote:  
> > >
> > > On Wed, 6 Nov 2019 14:34:38 +0100
> > > Peter Eisentraut  wrote:
> > >  
> > > > On 2019-11-05 17:29, Jehan-Guillaume de Rorthais wrote:  
> > > > > My best bet so far is that logicalrep_relmap_invalidate_cb is not
> > > > > called after the DDL on the subscriber so the relmap cache is not
> > > > > invalidated. So we end up with slot->tts_tupleDescriptor->natts
> > > > > superior than rel->remoterel->natts in slot_store_cstrings, leading
> > > > > to the overflow on attrmap and the sigsev.  
> > > >
> > > > It looks like something like that is happening.  But it shouldn't.
> > > > Different table schemas on publisher and subscriber are well supported,
> > > > so this must be an edge case of some kind.  Please continue
> > > > investigating.  
> > >
> > > I've been able to find the origin of the crash, but it was a long journey.
> > >
> > > 
> > >
> > >   I was unable to debug using gdb record because of this famous error:
> > >
> > > Process record does not support instruction 0xc5 at address
> > > 0x1482758a4b30.
> > >
> > > Program stopped.
> > > __memset_avx2_unaligned_erms ()
> > > at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:168
> > > 168 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such
> > >   file or directory.
> > >
> > >   Trying with rr, I had constant "stack depth limit exceeded", even with
> > >   unlimited one. Does it worth opening a discussion or a wiki page about
> > > these tools? Peter, it looks like you have some experience with rr, any
> > > feedback?
> > >
> > >   Finally, Julien Rouhaud spend some time with me after work hours,
> > > a,swering my questions about some parts of the code and pointed me to the
> > > excellent backtrace_functions GUC addition few days ago. This finally did
> > > the trick to find out what was happening. Many thanks Julien!
> > >
> > > 
> > >
> > > Back to the bug itself. Consider a working logical replication with
> > > constant update/insert activity, eg. pgbench running against provider.
> > >
> > > Now, on the subscriber side, a session issue an "ALTER TABLE ADD
> > > COLUMN" on a subscribed table, eg. pgbench_branches. A cache invalidation
> > > message is then pending for this table.
> > >
> > > In the meantime, the logical replication worker receive an UPDATE to
> > > apply. It opens the local relation using "logicalrep_rel_open". It finds
> > > the related entry in LogicalRepRelMap is valid, so it does not update its
> > > attrmap and directly opens and locks the local relation:
> > >  
> >
> > - /* Try to find and lock the relation by name. */
> > + /* Try to find the relation by name */
> >   relid = RangeVarGetRelid(makeRangeVar(remoterel->nspname,\
> > remoterel->relname, -1),
> > - lockmode, true);
> > + NoLock, true);
> >
> > I think we can't do this because it could lead to locking the wrong
> > reloid.  See RangeVarGetRelidExtended.  It ensures that after locking
> > the relation (which includes accepting invalidation messages), that
> > the reloid is correct.  I think changing the code in the way you are
> > suggesting can lead to locking incorrect reloid.

Sorry for the delay, I couldn't answer earlier.

To be honest, I was wondering about that. Since we keep in cache the relid and
use it as cache invalidation, I thought it might be fragile. But then - as far
as I could find - the only way to change the relid is to drop and create a new
table. I wasn't sure it could really cause a race condition there because of
the impact of such commands on logical replication.

But now, I realize I should have go all the way through and close this
potential bug as well. Thank you.

> I have made changes to fix the comment provided. The patch for the
> same is attached. Could not add a test case for this scenario is based
> on timing issue.

Thank you for this fix Vignesh!





Re: segmentation fault when cassert enabled

2019-12-16 Thread Peter Eisentraut

On 2019-12-16 11:11, Amit Kapila wrote:

I agree that this is a timing issue.  I also don't see a way to write
a reproducible test for this.  However, I could reproduce it via
debugger consistently by following the below steps.  I have updated a
few comments and commit messages in the attached patch.

Peter E., Petr J or anyone else, do you have comments or objections on
this patch?  If none, then I am planning to commit (and backpatch)
this patch in a few days time.


The patch seems fine to me.  Writing a test seems hard.  Let's skip it.

The commit message has a duplicate "building"/"built" in the first sentence.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: segmentation fault when cassert enabled

2019-12-16 Thread Amit Kapila
On Fri, Dec 13, 2019 at 12:10 PM vignesh C  wrote:
>
> I have made changes to fix the comment provided. The patch for the
> same is attached. Could not add a test case for this scenario is based
> on timing issue.
> Thoughts?
>

I agree that this is a timing issue.  I also don't see a way to write
a reproducible test for this.  However, I could reproduce it via
debugger consistently by following the below steps.  I have updated a
few comments and commit messages in the attached patch.

Peter E., Petr J or anyone else, do you have comments or objections on
this patch?  If none, then I am planning to commit (and backpatch)
this patch in a few days time.

Test steps to reproduce the issue.
Set up
-
set up master and subscriber nodes.
In code, add a while(true) in apply_handle_update() before a call to
logicalrep_rel_open().  This is to ensure that we can debug the replay
of Update
operation on subscriber.

Master
---
Create table t1(c1 int);
Create publication pub_t1 for table t1;
Alter table t1 replica identity full;


Subscriber
-
Create table t1(c1 int);
CREATE SUBSCRIPTION sub_t1 CONNECTION 'host=localhost port=5432
dbname=postgres' PUBLICATION pub_t1;

Master
--
Insert into t1 values(1);  --this will create LogicalRepRelMap entry
for t1 on subscriber.

Subscriber
--
Select * from t1; -- This should display the data inserted in master.

Master
--
Update t1 set c1 = 2 where c1=1;

Now on the subscriber, attach a debugger and debug logicalrep_rel_open
and stop debugger just before table_open call.

Subscriber
---
Alter table t1 add c2 int;

Now, continue in debugger, it should crash in slot_store_cstrings()
because the rel->attrmap is not updated.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


0001-Fix-subscriber-invalid-memory-access-on-DDL.amit.1.patch
Description: Binary data


Re: segmentation fault when cassert enabled

2019-12-12 Thread vignesh C
On Fri, Dec 6, 2019 at 5:30 PM Amit Kapila  wrote:
>
> On Mon, Nov 25, 2019 at 8:25 PM Jehan-Guillaume de Rorthais
>  wrote:
> >
> > On Wed, 6 Nov 2019 14:34:38 +0100
> > Peter Eisentraut  wrote:
> >
> > > On 2019-11-05 17:29, Jehan-Guillaume de Rorthais wrote:
> > > > My best bet so far is that logicalrep_relmap_invalidate_cb is not called
> > > > after the DDL on the subscriber so the relmap cache is not invalidated. 
> > > > So
> > > > we end up with slot->tts_tupleDescriptor->natts superior than
> > > > rel->remoterel->natts in slot_store_cstrings, leading to the overflow on
> > > > attrmap and the sigsev.
> > >
> > > It looks like something like that is happening.  But it shouldn't.
> > > Different table schemas on publisher and subscriber are well supported,
> > > so this must be an edge case of some kind.  Please continue investigating.
> >
> > I've been able to find the origin of the crash, but it was a long journey.
> >
> > 
> >
> >   I was unable to debug using gdb record because of this famous error:
> >
> > Process record does not support instruction 0xc5 at address 
> > 0x1482758a4b30.
> >
> > Program stopped.
> > __memset_avx2_unaligned_erms ()
> > at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:168
> > 168 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such
> >   file or directory.
> >
> >   Trying with rr, I had constant "stack depth limit exceeded", even with
> >   unlimited one. Does it worth opening a discussion or a wiki page about 
> > these
> >   tools? Peter, it looks like you have some experience with rr, any 
> > feedback?
> >
> >   Finally, Julien Rouhaud spend some time with me after work hours, 
> > a,swering
> >   my questions about some parts of the code and pointed me to the excellent
> >   backtrace_functions GUC addition few days ago. This finally did the trick 
> > to
> >   find out what was happening. Many thanks Julien!
> >
> > 
> >
> > Back to the bug itself. Consider a working logical replication with constant
> > update/insert activity, eg. pgbench running against provider.
> >
> > Now, on the subscriber side, a session issue an "ALTER TABLE ADD
> > COLUMN" on a subscribed table, eg. pgbench_branches. A cache invalidation
> > message is then pending for this table.
> >
> > In the meantime, the logical replication worker receive an UPDATE to apply. 
> > It
> > opens the local relation using "logicalrep_rel_open". It finds the related
> > entry in LogicalRepRelMap is valid, so it does not update its attrmap
> > and directly opens and locks the local relation:
> >
>
> - /* Try to find and lock the relation by name. */
> + /* Try to find the relation by name */
>   relid = RangeVarGetRelid(makeRangeVar(remoterel->nspname,\
> remoterel->relname, -1),
> - lockmode, true);
> + NoLock, true);
>
> I think we can't do this because it could lead to locking the wrong
> reloid.  See RangeVarGetRelidExtended.  It ensures that after locking
> the relation (which includes accepting invalidation messages), that
> the reloid is correct.  I think changing the code in the way you are
> suggesting can lead to locking incorrect reloid.
>

I have made changes to fix the comment provided. The patch for the
same is attached. Could not add a test case for this scenario is based
on timing issue.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
From 9bec5e5250c5af034096ed9ab966c2c237518976 Mon Sep 17 00:00:00 2001
From: Jehan-Guillaume de Rorthais , vignesh.c 
Date: Fri, 13 Dec 2019 11:59:13 +0530
Subject: [PATCH] Fix subscriber invalid memory access on DDL

Before this patch, the attrmap structure mapping the local fields
to remote ones for a subscribed relation was rebuild before handling
pending invalidation messages and potential relcache updates. This
was leading to an invalid memory access by overflowing the attrmap
when building the related tuple slot received from the provider.
---
 src/backend/replication/logical/relation.c | 53 ++
 1 file changed, 39 insertions(+), 14 deletions(-)

diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index b386f84..9009ebc 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -220,6 +220,8 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 {
 	LogicalRepRelMapEntry *entry;
 	bool		found;
+	Oid			relid = InvalidOid;
+	LogicalRepRelation	*remoterel;
 
 	if (LogicalRepRelMap == NULL)
 		logicalrep_relmap_init();
@@ -232,20 +234,24 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 		elog(ERROR, "no relation map entry for remote relation ID %u",
 			 remoteid);
 
-	/* Need to update the local cache? */
-	if (!OidIsValid(entry->localreloid))
-	{
-		Oid			relid;
-		int			i;
-		int			found;
-		Bitmapset  *idkey;
-		TupleDesc	desc;
-		LogicalRepRelation *remoterel;
-		MemoryContext oldctx;
+	remoterel = >remoterel;
 
-		

Re: segmentation fault when cassert enabled

2019-12-06 Thread Amit Kapila
On Mon, Nov 25, 2019 at 8:25 PM Jehan-Guillaume de Rorthais
 wrote:
>
> On Wed, 6 Nov 2019 14:34:38 +0100
> Peter Eisentraut  wrote:
>
> > On 2019-11-05 17:29, Jehan-Guillaume de Rorthais wrote:
> > > My best bet so far is that logicalrep_relmap_invalidate_cb is not called
> > > after the DDL on the subscriber so the relmap cache is not invalidated. So
> > > we end up with slot->tts_tupleDescriptor->natts superior than
> > > rel->remoterel->natts in slot_store_cstrings, leading to the overflow on
> > > attrmap and the sigsev.
> >
> > It looks like something like that is happening.  But it shouldn't.
> > Different table schemas on publisher and subscriber are well supported,
> > so this must be an edge case of some kind.  Please continue investigating.
>
> I've been able to find the origin of the crash, but it was a long journey.
>
> 
>
>   I was unable to debug using gdb record because of this famous error:
>
> Process record does not support instruction 0xc5 at address 
> 0x1482758a4b30.
>
> Program stopped.
> __memset_avx2_unaligned_erms ()
> at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:168
> 168 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such
>   file or directory.
>
>   Trying with rr, I had constant "stack depth limit exceeded", even with
>   unlimited one. Does it worth opening a discussion or a wiki page about these
>   tools? Peter, it looks like you have some experience with rr, any feedback?
>
>   Finally, Julien Rouhaud spend some time with me after work hours, a,swering
>   my questions about some parts of the code and pointed me to the excellent
>   backtrace_functions GUC addition few days ago. This finally did the trick to
>   find out what was happening. Many thanks Julien!
>
> 
>
> Back to the bug itself. Consider a working logical replication with constant
> update/insert activity, eg. pgbench running against provider.
>
> Now, on the subscriber side, a session issue an "ALTER TABLE ADD
> COLUMN" on a subscribed table, eg. pgbench_branches. A cache invalidation
> message is then pending for this table.
>
> In the meantime, the logical replication worker receive an UPDATE to apply. It
> opens the local relation using "logicalrep_rel_open". It finds the related
> entry in LogicalRepRelMap is valid, so it does not update its attrmap
> and directly opens and locks the local relation:
>

- /* Try to find and lock the relation by name. */
+ /* Try to find the relation by name */
  relid = RangeVarGetRelid(makeRangeVar(remoterel->nspname,\
remoterel->relname, -1),
- lockmode, true);
+ NoLock, true);

I think we can't do this because it could lead to locking the wrong
reloid.  See RangeVarGetRelidExtended.  It ensures that after locking
the relation (which includes accepting invalidation messages), that
the reloid is correct.  I think changing the code in the way you are
suggesting can lead to locking incorrect reloid.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: segmentation fault when cassert enabled

2019-11-25 Thread Jehan-Guillaume de Rorthais
On Wed, 6 Nov 2019 14:34:38 +0100
Peter Eisentraut  wrote:

> On 2019-11-05 17:29, Jehan-Guillaume de Rorthais wrote:
> > My best bet so far is that logicalrep_relmap_invalidate_cb is not called
> > after the DDL on the subscriber so the relmap cache is not invalidated. So
> > we end up with slot->tts_tupleDescriptor->natts superior than
> > rel->remoterel->natts in slot_store_cstrings, leading to the overflow on
> > attrmap and the sigsev.  
> 
> It looks like something like that is happening.  But it shouldn't. 
> Different table schemas on publisher and subscriber are well supported, 
> so this must be an edge case of some kind.  Please continue investigating.

I've been able to find the origin of the crash, but it was a long journey.



  I was unable to debug using gdb record because of this famous error:

Process record does not support instruction 0xc5 at address 0x1482758a4b30.

Program stopped.
__memset_avx2_unaligned_erms ()
at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:168
168 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such
  file or directory.

  Trying with rr, I had constant "stack depth limit exceeded", even with
  unlimited one. Does it worth opening a discussion or a wiki page about these
  tools? Peter, it looks like you have some experience with rr, any feedback?

  Finally, Julien Rouhaud spend some time with me after work hours, a,swering
  my questions about some parts of the code and pointed me to the excellent
  backtrace_functions GUC addition few days ago. This finally did the trick to
  find out what was happening. Many thanks Julien!



Back to the bug itself. Consider a working logical replication with constant
update/insert activity, eg. pgbench running against provider.

Now, on the subscriber side, a session issue an "ALTER TABLE ADD
COLUMN" on a subscribed table, eg. pgbench_branches. A cache invalidation
message is then pending for this table.

In the meantime, the logical replication worker receive an UPDATE to apply. It
opens the local relation using "logicalrep_rel_open". It finds the related
entry in LogicalRepRelMap is valid, so it does not update its attrmap
and directly opens and locks the local relation:

  /* Need to update the local cache? */
  if (!OidIsValid(entry->localreloid))
  {
[...updates attrmap here...]
  }
  else
entry->localrel = table_open(entry->localreloid, lockmode);

However, when locking the table, the code in LockRelationOid() actually process
any pending invalidation messages:

  LockRelationOid(Oid relid, LOCKMODE lockmode)
  {
[...]
/*
 * Now that we have the lock, check for invalidation messages, so that we
 * will update or flush any stale relcache entry before we try to use it.
 * RangeVarGetRelid() specifically relies on us for this.  We can skip
 * this in the not-uncommon case that we already had the same type of lock
 * being requested, since then no one else could have modified the
 * relcache entry in an undesirable way.  (In the case where our own xact
 * modifies the rel, the relcache update happens via
 * CommandCounterIncrement, not here.)
 *
 * However, in corner cases where code acts on tables (usually catalogs)
 * recursively, we might get here while still processing invalidation
 * messages in some outer execution of this function or a sibling.  The
 * "cleared" status of the lock tells us whether we really are done
 * absorbing relevant inval messages.
 */
if (res != LOCKACQUIRE_ALREADY_CLEAR)
{
  AcceptInvalidationMessages();
  MarkLockClear(locallock);
}
  }

We end up with attrmap referencing N columns and the relcache referencing N+1
columns. Later, in apply_handle_update(), we build a TupleTableSlot based on
the relcache representation and we crash by overflowing attrmap while trying to
feed this larger slot in slot_store_cstrings().

Please find in attachment a bugfix proposal. It just moves the attrmap update
after the table_open() call.

Last, I was wondering if entry->attrmap should be pfree'd before palloc'ing it
again during its rebuild to avoid some memory leak?

Regards,
>From 4295b277952a46378f01211bd37075f20223aadc Mon Sep 17 00:00:00 2001
From: Jehan-Guillaume de Rorthais 
Date: Mon, 25 Nov 2019 15:21:38 +0100
Subject: [PATCH] Fix subscriber invalid memory access on DDL

Before this patch, the attrmap structure mapping the local fields
to remote ones for a subscribed relation was rebuild before handling
pending invalidation messages and potential relcache updates. This
was leading to an invalid memory access by overflowing the attrmap
when building the related tuple slot received from the provider.
---
 src/backend/replication/logical/relation.c | 57 --
 1 file changed, 42 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index b386f8460d..cfc34d49e0 100644

Re: segmentation fault when cassert enabled

2019-11-12 Thread Jehan-Guillaume de Rorthais
On Mon, 28 Oct 2019 16:47:02 +0900 (JST)
Kyotaro Horiguchi  wrote:

> At Fri, 25 Oct 2019 12:28:38 -0400, Tom Lane  wrote in 
> > Jehan-Guillaume de Rorthais  writes:  
> > > When investigating for the bug reported in thread "logical replication -
> > > negative bitmapset member not allowed", I found a way to seg fault
> > > postgresql only when cassert is enabled.
> > > ...
> > > I hadn't time to digg further yet. However, I don't understand why this
> > > crash is triggered when cassert is enabled.  
> > 
> > Most likely, it's not so much assertions that provoke the crash as
> > CLOBBER_FREED_MEMORY, ie the actual problem here is use of already-freed
> > memory.  
> 
> Agreed.
> 
> By the way I didn't get a crash by Jehan's script with the
> --enable-cassert build of the master HEAD of a few days ago.

I am now working with HEAD and I can confirm I am able to make it crash 99% of
the time using my script.
It feels like a race condition between cache invalidation and record
processing from worker.c. Make sure you have enough write activity during the
test.

> FWIW I sometimes got SEGVish crashes or mysterious misbehavor when
> some structs were changed and I didn't do "make clean". Rarely I
> needed "make distclean". (Yeah, I didn't ususally turn on
> --enable-depend..)

I'm paranoid, I always do:

* make distclean
* git reset; git clean -df
* ./configure && make install

Regards,




Re: segmentation fault when cassert enabled

2019-11-06 Thread Peter Eisentraut

On 2019-11-05 17:29, Jehan-Guillaume de Rorthais wrote:

My best bet so far is that logicalrep_relmap_invalidate_cb is not called after
the DDL on the subscriber so the relmap cache is not invalidated. So we end up
with slot->tts_tupleDescriptor->natts superior than rel->remoterel->natts in
slot_store_cstrings, leading to the overflow on attrmap and the sigsev.


It looks like something like that is happening.  But it shouldn't. 
Different table schemas on publisher and subscriber are well supported, 
so this must be an edge case of some kind.  Please continue investigating.



By the way, I noticed attrmap is declared as AttrNumber * in struct
LogicalRepRelMapEntry, AttrNumber being typedef'd as an int16. However, attrmap
is allocated based on sizeof(int) in logicalrep_rel_open:

   entry->attrmap = palloc(desc->natts * sizeof(int));

It doesn't look like a major problem, it just allocates more memory than
needed.


Right.  I have committed a fix for this.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: segmentation fault when cassert enabled

2019-11-05 Thread Jehan-Guillaume de Rorthais
On Fri, 25 Oct 2019 12:28:38 -0400
Tom Lane  wrote:

> Jehan-Guillaume de Rorthais  writes:
> > When investigating for the bug reported in thread "logical replication -
> > negative bitmapset member not allowed", I found a way to seg fault
> > postgresql only when cassert is enabled.
> > ...
> > I hadn't time to digg further yet. However, I don't understand why this
> > crash is triggered when cassert is enabled.  
> 
> Most likely, it's not so much assertions that provoke the crash as
> CLOBBER_FREED_MEMORY, ie the actual problem here is use of already-freed
> memory.

Thank you. Indeed, enabling CLOBBER_FREED_MEMORY on its own is enough to
trigger the segfault.

In fact, valgrind detect it as an uninitialised value, no matter
CLOBBER_FREED_MEMORY is defined or not:

 Conditional jump or move depends on uninitialised value(s)
at 0x43F410: slot_modify_cstrings (worker.c:398)
by 0x43FBE9: apply_handle_update (worker.c:744)
by 0x440088: apply_dispatch (worker.c:968)
by 0x4405D7: LogicalRepApplyLoop (worker.c:1175)
by 0x440CD0: ApplyWorkerMain (worker.c:1733)
by 0x411C34: StartBackgroundWorker (bgworker.c:834)
by 0x41EA24: do_start_bgworker (postmaster.c:5763)
by 0x41EB6F: maybe_start_bgworkers (postmaster.c:5976)
by 0x41F562: sigusr1_handler (postmaster.c:5161)
by 0x48A072F: ??? (in /lib/x86_64-linux-gnu/libpthread-2.28.so)
by 0x4B31FF6: select (select.c:41)
by 0x41FDDE: ServerLoop (postmaster.c:1668)
  Uninitialised value was created by a heap allocation
at 0x5C579B: palloc (mcxt.c:949)
by 0x437116: logicalrep_rel_open (relation.c:270)
by 0x43FA8F: apply_handle_update (worker.c:684)
by 0x440088: apply_dispatch (worker.c:968)
by 0x4405D7: LogicalRepApplyLoop (worker.c:1175)
by 0x440CD0: ApplyWorkerMain (worker.c:1733)
by 0x411C34: StartBackgroundWorker (bgworker.c:834)
by 0x41EA24: do_start_bgworker (postmaster.c:5763)
by 0x41EB6F: maybe_start_bgworkers (postmaster.c:5976)
by 0x41F562: sigusr1_handler (postmaster.c:5161)
by 0x48A072F: ??? (in /lib/x86_64-linux-gnu/libpthread-2.28.so)
by 0x4B31FF6: select (select.c:41)

My best bet so far is that logicalrep_relmap_invalidate_cb is not called after
the DDL on the subscriber so the relmap cache is not invalidated. So we end up
with slot->tts_tupleDescriptor->natts superior than rel->remoterel->natts in
slot_store_cstrings, leading to the overflow on attrmap and the sigsev.

I hadn't follow this path yet.

By the way, I noticed attrmap is declared as AttrNumber * in struct
LogicalRepRelMapEntry, AttrNumber being typedef'd as an int16. However, attrmap
is allocated based on sizeof(int) in logicalrep_rel_open:

  entry->attrmap = palloc(desc->natts * sizeof(int));

It doesn't look like a major problem, it just allocates more memory than
needed.

Regards,




Re: segmentation fault when cassert enabled

2019-10-28 Thread Kyotaro Horiguchi
At Fri, 25 Oct 2019 12:28:38 -0400, Tom Lane  wrote in 
> Jehan-Guillaume de Rorthais  writes:
> > When investigating for the bug reported in thread "logical replication -
> > negative bitmapset member not allowed", I found a way to seg fault 
> > postgresql
> > only when cassert is enabled.
> > ...
> > I hadn't time to digg further yet. However, I don't understand why this 
> > crash
> > is triggered when cassert is enabled.
> 
> Most likely, it's not so much assertions that provoke the crash as
> CLOBBER_FREED_MEMORY, ie the actual problem here is use of already-freed
> memory.

Agreed.

By the way I didn't get a crash by Jehan's script with the
--enable-cassert build of the master HEAD of a few days ago.

FWIW I sometimes got SEGVish crashes or mysterious misbehavor when
some structs were changed and I didn't do "make clean". Rarely I
needed "make distclean". (Yeah, I didn't ususally turn on
--enable-depend..)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: segmentation fault when cassert enabled

2019-10-25 Thread Tom Lane
Jehan-Guillaume de Rorthais  writes:
> When investigating for the bug reported in thread "logical replication -
> negative bitmapset member not allowed", I found a way to seg fault postgresql
> only when cassert is enabled.
> ...
> I hadn't time to digg further yet. However, I don't understand why this crash
> is triggered when cassert is enabled.

Most likely, it's not so much assertions that provoke the crash as
CLOBBER_FREED_MEMORY, ie the actual problem here is use of already-freed
memory.

regards, tom lane




segmentation fault when cassert enabled

2019-10-25 Thread Jehan-Guillaume de Rorthais
Hi list,

When investigating for the bug reported in thread "logical replication -
negative bitmapset member not allowed", I found a way to seg fault postgresql
only when cassert is enabled.

See the scenario in attachment.

When executed against binaries compiled with --enable-cassert, I have the
following error in logs:

  LOG:  0: background worker "logical replication worker" (PID 761) was
  terminated by signal 11: Segmentation fault

Here is the stack trace:

#0  in slot_store_cstrings (slot=0x55a3c6973b48, rel=0x55a3c6989468,
values=0x7ffe08ae67b0) at worker.c:330
#1  in apply_handle_update (s=0x7ffe08aeddb0) at worker.c:712
#2  in apply_dispatch (s=0x7ffe08aeddb0) at worker.c:968
#3  in LogicalRepApplyLoop (last_received=87957952) at worker.c:1175
#4  in ApplyWorkerMain (main_arg=0) at worker.c:1733
#5  in StartBackgroundWorker () at bgworker.c:834
#6  in do_start_bgworker (rw=0x55a3c68c16d0) at postmaster.c:5763
#7  in maybe_start_bgworkers () at postmaster.c:5976
#8  in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5161
#9  
#10 in __GI___select (nfds=6, readfds=0x7ffe08aee680, writefds=0x0,
exceptfds=0x0, timeout=0x7ffe08aee700)
at ../sysdeps/unix/sysv/linux/select.c:41
#11 in ServerLoop () at postmaster.c:1668
#12 in PostmasterMain (argc=3, argv=0x55a3c6899820) at postmaster.c:1377
#13 in main (argc=3, argv=0x55a3c6899820) at main.c:228


It leads to this conditional test in worker.c:slot_store_cstrings

for (i = 0; i < natts; i++)
{ [...]
if (!att->attisdropped && remoteattnum >= 0 &&
values[remoteattnum] != NULL)

In gdb, I found remoteattnum seems to be not correctly initialized for the
latest column the scenario adds in pgbench_branches:

  (gdb) p remoteattnum
  $1 = 32639
  (gdb) p i
  $2 = 3

I hadn't time to digg further yet. However, I don't understand why this crash
is triggered when cassert is enabled.

Regards,


logrep-cassert-crash.bash
Description: Binary data