Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Thu, Jun 23, 2022 at 2:09 AM Robert Haas wrote: > On Wed, Jun 22, 2022 at 12:34 AM Thomas Munro wrote: > > > For the record, the third idea proposed was to use 1 for the first > > > byte, so that 0 is reserved for NULL and works with memset(0). Here's > > > an attempt at that. > > > > ... erm, though, duh, I forgot to adjust Assert(val > base). One more time. > > I like this idea and think this might have the side benefit of making > it harder to get away with accessing relptr_off directly. Thanks. Pushed, and back-patched to 14, where min_dynamic_shared_memory arrived. I wondered in passing if the stuff about relptr_declare() was still needed to avoid confusing pgindent, since we tweaked the indent code a bit for macros that take a typename, but it seems that it still mangles "relptr(FooBar) some_struct_member;", putting extra whitespace in front of it. Hmmph.
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Wed, Jun 22, 2022 at 12:34 AM Thomas Munro wrote: > > For the record, the third idea proposed was to use 1 for the first > > byte, so that 0 is reserved for NULL and works with memset(0). Here's > > an attempt at that. > > ... erm, though, duh, I forgot to adjust Assert(val > base). One more time. I like this idea and think this might have the side benefit of making it harder to get away with accessing relptr_off directly. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Wed, Jun 22, 2022 at 4:24 PM Thomas Munro wrote: > On Wed, Jun 22, 2022 at 2:54 PM Tom Lane wrote: > > John Naylor writes: > > > On Wed, Jun 1, 2022 at 2:57 AM Robert Haas wrote: > > >> ... So we can fix this by: > > >> 1. Using a relative pointer value other than 0 to represent a null > > >> pointer. Andres suggested (Size) -1. > > >> 2. Not storing the free page manager for the DSM in the main shared > > >> memory segment at byte offset 0. > > For the record, the third idea proposed was to use 1 for the first > byte, so that 0 is reserved for NULL and works with memset(0). Here's > an attempt at that. ... erm, though, duh, I forgot to adjust Assert(val > base). One more time. From ba1661745ad57c69d0d9f881913d337504b5e870 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Wed, 22 Jun 2022 16:18:53 +1200 Subject: [PATCH v2] Fix relptr's encoding of NULL. Use 0 for NULL, and use 1-based offset for non-NULL values. Also fix macro hygiene in passing (ie missing/misplaced parentheses), and remove open-coded access to the raw value from freepage.c/.h. --- src/backend/utils/mmgr/freepage.c | 6 +++--- src/include/utils/freepage.h | 4 ++-- src/include/utils/relptr.h| 15 +-- 3 files changed, 14 insertions(+), 11 deletions(-) diff --git a/src/backend/utils/mmgr/freepage.c b/src/backend/utils/mmgr/freepage.c index b26a295a4e..dcf246faf1 100644 --- a/src/backend/utils/mmgr/freepage.c +++ b/src/backend/utils/mmgr/freepage.c @@ -434,7 +434,7 @@ FreePageManagerDump(FreePageManager *fpm) /* Dump general stuff. */ appendStringInfo(, "metadata: self %zu max contiguous pages = %zu\n", - fpm->self.relptr_off, fpm->contiguous_pages); + relptr_offset(fpm->self), fpm->contiguous_pages); /* Dump btree. */ if (fpm->btree_depth > 0) @@ -1269,7 +1269,7 @@ FreePageManagerDumpBtree(FreePageManager *fpm, FreePageBtree *btp, if (btp->hdr.magic == FREE_PAGE_INTERNAL_MAGIC) appendStringInfo(buf, " %zu->%zu", btp->u.internal_key[index].first_page, - btp->u.internal_key[index].child.relptr_off / FPM_PAGE_SIZE); + relptr_offset(btp->u.internal_key[index].child) / FPM_PAGE_SIZE); else appendStringInfo(buf, " %zu(%zu)", btp->u.leaf_key[index].first_page, @@ -1859,7 +1859,7 @@ FreePagePopSpanLeader(FreePageManager *fpm, Size pageno) { Size f = Min(span->npages, FPM_NUM_FREELISTS) - 1; - Assert(fpm->freelist[f].relptr_off == pageno * FPM_PAGE_SIZE); + Assert(relptr_offset(fpm->freelist[f]) == pageno * FPM_PAGE_SIZE); relptr_copy(fpm->freelist[f], span->next); } } diff --git a/src/include/utils/freepage.h b/src/include/utils/freepage.h index 08064b10f9..e69b2804ec 100644 --- a/src/include/utils/freepage.h +++ b/src/include/utils/freepage.h @@ -78,11 +78,11 @@ struct FreePageManager #define fpm_pointer_is_page_aligned(base, ptr) \ (((Size) (((char *) (ptr)) - (base))) % FPM_PAGE_SIZE == 0) #define fpm_relptr_is_page_aligned(base, relptr) \ - ((relptr).relptr_off % FPM_PAGE_SIZE == 0) + (relptr_offset(relptr) % FPM_PAGE_SIZE == 0) /* Macro to find base address of the segment containing a FreePageManager. */ #define fpm_segment_base(fpm) \ - (((char *) fpm) - fpm->self.relptr_off) + (((char *) fpm) - relptr_offset(fpm->self)) /* Macro to access a FreePageManager's largest consecutive run of pages. */ #define fpm_largest(fpm) \ diff --git a/src/include/utils/relptr.h b/src/include/utils/relptr.h index cc80a7200d..9364dd6694 100644 --- a/src/include/utils/relptr.h +++ b/src/include/utils/relptr.h @@ -42,7 +42,7 @@ #define relptr_access(base, rp) \ (AssertVariableIsOfTypeMacro(base, char *), \ (__typeof__((rp).relptr_type)) ((rp).relptr_off == 0 ? NULL : \ - (base + (rp).relptr_off))) + (base) + (rp).relptr_off - 1)) #else /* * If we don't have __builtin_types_compatible_p, assume we might not have @@ -50,12 +50,15 @@ */ #define relptr_access(base, rp) \ (AssertVariableIsOfTypeMacro(base, char *), \ - (void *) ((rp).relptr_off == 0 ? NULL : (base + (rp).relptr_off))) + (void *) ((rp).relptr_off == 0 ? NULL : (base) + (rp).relptr_off - 1)) #endif #define relptr_is_null(rp) \ ((rp).relptr_off == 0) +#define relptr_offset(rp) \ + ((rp).relptr_off - 1) + /* We use this inline to avoid double eval of "val" in relptr_store */ static inline Size relptr_store_eval(char *base, char *val) @@ -64,8 +67,8 @@ relptr_store_eval(char *base, char *val) return 0; else { - Assert(val > base); - return val - base; + Assert(val >= base); + return val - base + 1; } } @@ -73,7 +76,7 @@ relptr_store_eval(char *base, char *val) #define relptr_store(base, rp, val) \ (AssertVariableIsOfTypeMacro(base, char *), \ AssertVariableIsOfTypeMacro(val, __typeof__((rp).relptr_type)), \ - (rp).relptr_off = relptr_store_eval(base, (char *) (val))) + (rp).relptr_off = relptr_store_eval((base), (char *) (val))) #else /* * If we don't have __builtin_types_compatible_p, assume we might not have
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Wed, Jun 22, 2022 at 2:54 PM Tom Lane wrote: > John Naylor writes: > > On Wed, Jun 1, 2022 at 2:57 AM Robert Haas wrote: > >> ... So we can fix this by: > >> 1. Using a relative pointer value other than 0 to represent a null > >> pointer. Andres suggested (Size) -1. > >> 2. Not storing the free page manager for the DSM in the main shared > >> memory segment at byte offset 0. For the record, the third idea proposed was to use 1 for the first byte, so that 0 is reserved for NULL and works with memset(0). Here's an attempt at that. From 8b6b1e765b4dd23b968563aab1a01fa8b7c5a463 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Wed, 22 Jun 2022 16:18:53 +1200 Subject: [PATCH] Fix relptr's encoding of NULL. Use 0 for NULL, and use 1-based offset for non-NULL values. Also fix macro hygiene in passing (ie missing/misplaced parentheses), and remove open-coded access to the raw value from freepage.c/.h. --- src/backend/utils/mmgr/freepage.c | 6 +++--- src/include/utils/freepage.h | 4 ++-- src/include/utils/relptr.h| 13 - 3 files changed, 13 insertions(+), 10 deletions(-) diff --git a/src/backend/utils/mmgr/freepage.c b/src/backend/utils/mmgr/freepage.c index b26a295a4e..dcf246faf1 100644 --- a/src/backend/utils/mmgr/freepage.c +++ b/src/backend/utils/mmgr/freepage.c @@ -434,7 +434,7 @@ FreePageManagerDump(FreePageManager *fpm) /* Dump general stuff. */ appendStringInfo(, "metadata: self %zu max contiguous pages = %zu\n", - fpm->self.relptr_off, fpm->contiguous_pages); + relptr_offset(fpm->self), fpm->contiguous_pages); /* Dump btree. */ if (fpm->btree_depth > 0) @@ -1269,7 +1269,7 @@ FreePageManagerDumpBtree(FreePageManager *fpm, FreePageBtree *btp, if (btp->hdr.magic == FREE_PAGE_INTERNAL_MAGIC) appendStringInfo(buf, " %zu->%zu", btp->u.internal_key[index].first_page, - btp->u.internal_key[index].child.relptr_off / FPM_PAGE_SIZE); + relptr_offset(btp->u.internal_key[index].child) / FPM_PAGE_SIZE); else appendStringInfo(buf, " %zu(%zu)", btp->u.leaf_key[index].first_page, @@ -1859,7 +1859,7 @@ FreePagePopSpanLeader(FreePageManager *fpm, Size pageno) { Size f = Min(span->npages, FPM_NUM_FREELISTS) - 1; - Assert(fpm->freelist[f].relptr_off == pageno * FPM_PAGE_SIZE); + Assert(relptr_offset(fpm->freelist[f]) == pageno * FPM_PAGE_SIZE); relptr_copy(fpm->freelist[f], span->next); } } diff --git a/src/include/utils/freepage.h b/src/include/utils/freepage.h index 08064b10f9..e69b2804ec 100644 --- a/src/include/utils/freepage.h +++ b/src/include/utils/freepage.h @@ -78,11 +78,11 @@ struct FreePageManager #define fpm_pointer_is_page_aligned(base, ptr) \ (((Size) (((char *) (ptr)) - (base))) % FPM_PAGE_SIZE == 0) #define fpm_relptr_is_page_aligned(base, relptr) \ - ((relptr).relptr_off % FPM_PAGE_SIZE == 0) + (relptr_offset(relptr) % FPM_PAGE_SIZE == 0) /* Macro to find base address of the segment containing a FreePageManager. */ #define fpm_segment_base(fpm) \ - (((char *) fpm) - fpm->self.relptr_off) + (((char *) fpm) - relptr_offset(fpm->self)) /* Macro to access a FreePageManager's largest consecutive run of pages. */ #define fpm_largest(fpm) \ diff --git a/src/include/utils/relptr.h b/src/include/utils/relptr.h index cc80a7200d..79f4f0095d 100644 --- a/src/include/utils/relptr.h +++ b/src/include/utils/relptr.h @@ -42,7 +42,7 @@ #define relptr_access(base, rp) \ (AssertVariableIsOfTypeMacro(base, char *), \ (__typeof__((rp).relptr_type)) ((rp).relptr_off == 0 ? NULL : \ - (base + (rp).relptr_off))) + (base) + (rp).relptr_off - 1)) #else /* * If we don't have __builtin_types_compatible_p, assume we might not have @@ -50,12 +50,15 @@ */ #define relptr_access(base, rp) \ (AssertVariableIsOfTypeMacro(base, char *), \ - (void *) ((rp).relptr_off == 0 ? NULL : (base + (rp).relptr_off))) + (void *) ((rp).relptr_off == 0 ? NULL : (base) + (rp).relptr_off - 1)) #endif #define relptr_is_null(rp) \ ((rp).relptr_off == 0) +#define relptr_offset(rp) \ + ((rp).relptr_off - 1) + /* We use this inline to avoid double eval of "val" in relptr_store */ static inline Size relptr_store_eval(char *base, char *val) @@ -65,7 +68,7 @@ relptr_store_eval(char *base, char *val) else { Assert(val > base); - return val - base; + return val - base + 1; } } @@ -73,7 +76,7 @@ relptr_store_eval(char *base, char *val) #define relptr_store(base, rp, val) \ (AssertVariableIsOfTypeMacro(base, char *), \ AssertVariableIsOfTypeMacro(val, __typeof__((rp).relptr_type)), \ - (rp).relptr_off = relptr_store_eval(base, (char *) (val))) + (rp).relptr_off = relptr_store_eval((base), (char *) (val))) #else /* * If we don't have __builtin_types_compatible_p, assume we might not have @@ -81,7 +84,7 @@ relptr_store_eval(char *base, char *val) */ #define relptr_store(base, rp, val) \ (AssertVariableIsOfTypeMacro(base, char *), \ - (rp).relptr_off = relptr_store_eval(base,
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
John Naylor writes: > On Wed, Jun 1, 2022 at 2:57 AM Robert Haas wrote: >> ... So we can fix this by: >> 1. Using a relative pointer value other than 0 to represent a null >> pointer. Andres suggested (Size) -1. >> 2. Not storing the free page manager for the DSM in the main shared >> memory segment at byte offset 0. > For this open item, the above two ideas were discussed as a short-term > fix, and my reading of the thread is that the other proposals are too > invasive at this point in the cycle. Both of them have a draft patch > in the thread. #2, i.e. wasting MAXALIGN of space, seems the simplest > and most localized. Any thoughts on pulling the trigger on either of > these two approaches? I'm still of the opinion that 0 == NULL is a good property to have, so I vote for #2. regards, tom lane
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Wed, Jun 1, 2022 at 2:57 AM Robert Haas wrote: > We do use fpm_segment_base(), but that accidentally fails > to break, because instead of using relptr_access() it drills right > through the abstraction and doesn't have any kind of special case for > 0. So we can fix this by: > > 1. Using a relative pointer value other than 0 to represent a null > pointer. Andres suggested (Size) -1. > 2. Not storing the free page manager for the DSM in the main shared > memory segment at byte offset 0. Hi all, For this open item, the above two ideas were discussed as a short-term fix, and my reading of the thread is that the other proposals are too invasive at this point in the cycle. Both of them have a draft patch in the thread. #2, i.e. wasting MAXALIGN of space, seems the simplest and most localized. Any thoughts on pulling the trigger on either of these two approaches? -- John Naylor EDB: http://www.enterprisedb.com
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
At Wed, 01 Jun 2022 11:42:01 +0900 (JST), Kyotaro Horiguchi wrote in me> At Tue, 31 May 2022 16:10:05 -0400, Tom Lane wrote in me> tgl> Robert Haas writes: me> tgl> > Yeah, so when I created this stuff in the first place, I figured that me> tgl> > it wasn't a problem if we reserved relptr == 0 to mean a NULL pointer, Mmm. Sorry. It's just an accidental shooting. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
At Tue, 31 May 2022 15:57:14 -0400, Robert Haas wrote in > 1. Using a relative pointer value other than 0 to represent a null > pointer. Andres suggested (Size) -1. I thought that relptr as a part of DSM so the use of offset=0 is somewhat illegal. But I like this. We can fix this by this modification. I think ((Size) -1) is natural to signal something special. (I see glibc uses "(size_t) -1".) > 2. Not storing the free page manager for the DSM in the main shared > memory segment at byte offset 0. > 3. Dropping the assertion while loudly singing "la la la la la la". reagards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/include/utils/relptr.h b/src/include/utils/relptr.h index cc80a7200d..c6d39a1360 100644 --- a/src/include/utils/relptr.h +++ b/src/include/utils/relptr.h @@ -41,7 +41,7 @@ #ifdef HAVE__BUILTIN_TYPES_COMPATIBLE_P #define relptr_access(base, rp) \ (AssertVariableIsOfTypeMacro(base, char *), \ - (__typeof__((rp).relptr_type)) ((rp).relptr_off == 0 ? NULL : \ + (__typeof__((rp).relptr_type)) ((rp).relptr_off == ((Size) -1) ? NULL : \ (base + (rp).relptr_off))) #else /* @@ -50,21 +50,21 @@ */ #define relptr_access(base, rp) \ (AssertVariableIsOfTypeMacro(base, char *), \ - (void *) ((rp).relptr_off == 0 ? NULL : (base + (rp).relptr_off))) + (void *) ((rp).relptr_off == ((Size) -1) ? NULL : (base + (rp).relptr_off))) #endif #define relptr_is_null(rp) \ - ((rp).relptr_off == 0) + ((rp).relptr_off == ((Size) -1)) /* We use this inline to avoid double eval of "val" in relptr_store */ static inline Size relptr_store_eval(char *base, char *val) { if (val == NULL) - return 0; + return ((Size) -1); else { - Assert(val > base); + Assert(val >= base); return val - base; } }
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
At Tue, 31 May 2022 16:10:05 -0400, Tom Lane wrote in tgl> Robert Haas writes: tgl> > Yeah, so when I created this stuff in the first place, I figured that tgl> > it wasn't a problem if we reserved relptr == 0 to mean a NULL pointer, tgl> > because you would never have a relative pointer pointing to the tgl> > beginning of a DSM, because it would probably always start with a tgl> > dsm_toc. But when Thomas made it so that DSM allocations could happen tgl> > in the main shared memory segment, that ceased to be true. This tgl> > example happened not to break because we never use relptr_access() on tgl> > fpm->self. We do use fpm_segment_base(), but that accidentally fails tgl> > to break, because instead of using relptr_access() it drills right tgl> > through the abstraction and doesn't have any kind of special case for tgl> > 0. tgl> tgl> Seems like that in itself is a a lousy idea. Either the code should tgl> respect the abstraction, or it shouldn't be declaring the variable tgl> as a relptr in the first place. tgl> tgl> > So we can fix this by: tgl> > 1. Using a relative pointer value other than 0 to represent a null tgl> > pointer. Andres suggested (Size) -1. tgl> > 2. Not storing the free page manager for the DSM in the main shared tgl> > memory segment at byte offset 0. tgl> > 3. Dropping the assertion while loudly singing "la la la la la la". tgl> tgl> I'm definitely down on #3, because that just leaves the ambiguity tgl> in place to bite somewhere else in future. #1 would work as long tgl> as nobody expects memset-to-zero to produce null relptrs, but that tgl> doesn't seem very nice either. tgl> tgl> On the whole, wasting MAXALIGN worth of memory seems like the least bad tgl> alternative, but I wonder if we ought to do it right here as opposed tgl> to somewhere in the DSM code proper. Why is this DSM space not like tgl> other DSM spaces in starting with a TOC? tgl> tgl>regards, tom lane
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
Robert Haas writes: > On Tue, May 31, 2022 at 6:14 PM Tom Lane wrote: >> However, now that I've corrected that mistaken image ... I wonder if >> it could make sense to redefine relptr as self-relative? That ought >> to provide some notational savings since you'd only need to carry >> around the relptr's own address not that plus a base address. >> Probably not something to consider for v15 though. > I think that would be pretty hard to make work, since copying around a > relative pointer would change its meaning. Code like "relptr_foo x = > *y" would be broken, for example, but the compiler would not complain. Sure, but the current definition is far from error-proof as well: nothing stops you from using the wrong base address with a relptr's value. Anyway, it's just idle speculation at this point. regards, tom lane
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Tue, May 31, 2022 at 6:14 PM Tom Lane wrote: > However, now that I've corrected that mistaken image ... I wonder if > it could make sense to redefine relptr as self-relative? That ought > to provide some notational savings since you'd only need to carry > around the relptr's own address not that plus a base address. > Probably not something to consider for v15 though. I think that would be pretty hard to make work, since copying around a relative pointer would change its meaning. Code like "relptr_foo x = *y" would be broken, for example, but the compiler would not complain. Or maybe I misunderstand your idea? Also keep in mind that the major use case here is DSM segments, which can be mapped at different addresses in different processes. Mainly, we expect to store relative pointers in the segment to other things in the same segment. Sometimes, we might read the values from there into local variables - or maybe global variables - in code that is accessing those DSM segments for some purpose. There is little use for a relative pointer that can access all of the address space that exists. For that, it is better to just as a regular pointer. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
Robert Haas writes: > Seems backwards to me. A relative pointer is supposed to point to > something inside some range of memory, like a DSM gment -- it can > never be legally used to point to anything outside that segment. So it > seems to me that you could perfectly legally point to the second byte > of the segment, but never to the -1'th byte. Okay, I was thinking about it slightly wrong: relptr is defined as an offset relative to some base address, not to its own address. As long as you're prepared to assume that the base address really is the start of the addressable area, then yeah the above argument works. However, now that I've corrected that mistaken image ... I wonder if it could make sense to redefine relptr as self-relative? That ought to provide some notational savings since you'd only need to carry around the relptr's own address not that plus a base address. Probably not something to consider for v15 though. regards, tom lane
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Tue, May 31, 2022 at 5:52 PM Tom Lane wrote: > Thomas Munro writes: > > Count we make the relptrs 1-based, so that 0 is reserved as a sentinel > > that has the nice memset(0) property? > > Hm ... almost. A +1 offset would mean that zero is ambiguous with a > pointer to the byte just before the relptr. Maybe that case never > arises in practice, but now that we've seen this problem I'm not real > comfortable with such an assumption. But how about a -1 offset? > Then zero would be ambiguous with a pointer to the second byte of the > relptr, and I think I *am* prepared to assume that that has no use-cases. > > The other advantage of such a definition is that it'd help flush out > anybody breaking the relptr abstraction ;-) Seems backwards to me. A relative pointer is supposed to point to something inside some range of memory, like a DSM gment -- it can never be legally used to point to anything outside that segment. So it seems to me that you could perfectly legally point to the second byte of the segment, but never to the -1'th byte. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
Thomas Munro writes: > Count we make the relptrs 1-based, so that 0 is reserved as a sentinel > that has the nice memset(0) property? Hm ... almost. A +1 offset would mean that zero is ambiguous with a pointer to the byte just before the relptr. Maybe that case never arises in practice, but now that we've seen this problem I'm not real comfortable with such an assumption. But how about a -1 offset? Then zero would be ambiguous with a pointer to the second byte of the relptr, and I think I *am* prepared to assume that that has no use-cases. The other advantage of such a definition is that it'd help flush out anybody breaking the relptr abstraction ;-) regards, tom lane
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Wed, Jun 1, 2022 at 9:09 AM Tom Lane wrote: > Robert Haas writes: > > Could it use something other than its own address as the base address? > > Hmm, maybe we could make something of that idea ... > > > One way to do this would be to put it at the *end* of the > > "Preallocated DSM" space, rather than the beginning. > > ... but that way doesn't sound good. Doesn't it just move the > problem to the first object allocated inside the FPM? Count we make the relptrs 1-based, so that 0 is reserved as a sentinel that has the nice memset(0) property?
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
Robert Haas writes: > Could it use something other than its own address as the base address? Hmm, maybe we could make something of that idea ... > One way to do this would be to put it at the *end* of the > "Preallocated DSM" space, rather than the beginning. ... but that way doesn't sound good. Doesn't it just move the problem to the first object allocated inside the FPM? regards, tom lane
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Tue, May 31, 2022 at 4:32 PM Thomas Munro wrote: > This FPM isn't in a DSM. (It happens to have DSMs *inside it*, > because I'm using it as a separate DSM allocator: instead of making > them with dsm_impl.c mechanisms, this one recycles space from the main > shmem area). I view FPM as a reusable 4kb page-based memory allocator > that could have many potential uses, not as a thing that must live > inside another thing with a TOC. The fact that it uses the relptr > thing makes it possible to use FPM inside DSMs too, but that doesn't > mean it has to be used inside a DSM. Could it use something other than its own address as the base address? One way to do this would be to put it at the *end* of the "Preallocated DSM" space, rather than the beginning. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Tue, May 31, 2022 at 4:10 PM Tom Lane wrote: > Seems like that in itself is a a lousy idea. Either the code should > respect the abstraction, or it shouldn't be declaring the variable > as a relptr in the first place. Yep. I think it should be respecting the abstraction, but the 2016 version of me failed to realize the issue when committing 13e14a78ea1. Hindsight is 20-20, perhaps. > > So we can fix this by: > > 1. Using a relative pointer value other than 0 to represent a null > > pointer. Andres suggested (Size) -1. > > 2. Not storing the free page manager for the DSM in the main shared > > memory segment at byte offset 0. > > 3. Dropping the assertion while loudly singing "la la la la la la". > > I'm definitely down on #3, because that just leaves the ambiguity > in place to bite somewhere else in future. #1 would work as long > as nobody expects memset-to-zero to produce null relptrs, but that > doesn't seem very nice either. Well, that's a good point that I hadn't considered, actually. I was thinking I'd only picked 0 as the value out of adherence to convention, but I might have had this in mind too, at the time. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Wed, Jun 1, 2022 at 8:10 AM Tom Lane wrote: > Robert Haas writes: > > So we can fix this by: > > 1. Using a relative pointer value other than 0 to represent a null > > pointer. Andres suggested (Size) -1. > > 2. Not storing the free page manager for the DSM in the main shared > > memory segment at byte offset 0. > > 3. Dropping the assertion while loudly singing "la la la la la la". > > I'm definitely down on #3, because that just leaves the ambiguity > in place to bite somewhere else in future. #1 would work as long > as nobody expects memset-to-zero to produce null relptrs, but that > doesn't seem very nice either. > > On the whole, wasting MAXALIGN worth of memory seems like the least bad > alternative, but I wonder if we ought to do it right here as opposed > to somewhere in the DSM code proper. Why is this DSM space not like > other DSM spaces in starting with a TOC? This FPM isn't in a DSM. (It happens to have DSMs *inside it*, because I'm using it as a separate DSM allocator: instead of making them with dsm_impl.c mechanisms, this one recycles space from the main shmem area). I view FPM as a reusable 4kb page-based memory allocator that could have many potential uses, not as a thing that must live inside another thing with a TOC. The fact that it uses the relptr thing makes it possible to use FPM inside DSMs too, but that doesn't mean it has to be used inside a DSM. I vote for #1.
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
Robert Haas writes: > Yeah, so when I created this stuff in the first place, I figured that > it wasn't a problem if we reserved relptr == 0 to mean a NULL pointer, > because you would never have a relative pointer pointing to the > beginning of a DSM, because it would probably always start with a > dsm_toc. But when Thomas made it so that DSM allocations could happen > in the main shared memory segment, that ceased to be true. This > example happened not to break because we never use relptr_access() on > fpm->self. We do use fpm_segment_base(), but that accidentally fails > to break, because instead of using relptr_access() it drills right > through the abstraction and doesn't have any kind of special case for > 0. Seems like that in itself is a a lousy idea. Either the code should respect the abstraction, or it shouldn't be declaring the variable as a relptr in the first place. > So we can fix this by: > 1. Using a relative pointer value other than 0 to represent a null > pointer. Andres suggested (Size) -1. > 2. Not storing the free page manager for the DSM in the main shared > memory segment at byte offset 0. > 3. Dropping the assertion while loudly singing "la la la la la la". I'm definitely down on #3, because that just leaves the ambiguity in place to bite somewhere else in future. #1 would work as long as nobody expects memset-to-zero to produce null relptrs, but that doesn't seem very nice either. On the whole, wasting MAXALIGN worth of memory seems like the least bad alternative, but I wonder if we ought to do it right here as opposed to somewhere in the DSM code proper. Why is this DSM space not like other DSM spaces in starting with a TOC? regards, tom lane
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
On Thu, May 19, 2022 at 11:00 PM Kyotaro Horiguchi wrote: > The path is taken only when a valid value is given to the > parameter. If we don't use preallocated dsm, it is initialized > elsewhere. In those cases the first bytes of the base address (the > second parameter of FreePageManagerInitialize) are used for > dsa_segment_header so the relptr won't be zero (!= NULL). > > It can be silenced by wasting the first MAXALIGN bytes of > dsm_main_space_begin.. Yeah, so when I created this stuff in the first place, I figured that it wasn't a problem if we reserved relptr == 0 to mean a NULL pointer, because you would never have a relative pointer pointing to the beginning of a DSM, because it would probably always start with a dsm_toc. But when Thomas made it so that DSM allocations could happen in the main shared memory segment, that ceased to be true. This example happened not to break because we never use relptr_access() on fpm->self. We do use fpm_segment_base(), but that accidentally fails to break, because instead of using relptr_access() it drills right through the abstraction and doesn't have any kind of special case for 0. So we can fix this by: 1. Using a relative pointer value other than 0 to represent a null pointer. Andres suggested (Size) -1. 2. Not storing the free page manager for the DSM in the main shared memory segment at byte offset 0. 3. Dropping the assertion while loudly singing "la la la la la la". -- Robert Haas EDB: http://www.enterprisedb.com
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
At Fri, 20 May 2022 12:00:14 +0900 (JST), Kyotaro Horiguchi wrote in > At Thu, 19 May 2022 17:16:03 -0400, Tom Lane wrote in > > Justin Pryzby writes: > > > ./tmp_install/usr/local/pgsql/bin/postgres -D > > > ./src/test/regress/tmp_check/data -c min_dynamic_shared_memory=1MB > > > TRAP: FailedAssertion("val > base", File: > > > "../../../../src/include/utils/relptr.h", Line: 67, PID: 21912) > > > > Yeah, I see it too. > > > > > It looks like this may be pre-existing problem exposed by > > > commit e07d4ddc55fdcf82082950b3eb0cd8f728284c9d > > > > Agreed. Here I see > > > > #5 FreePageManagerInitialize (fpm=fpm@entry=0x7f34b3ddd300, > > base=base@entry=0x7f34b3ddd300 "") at freepage.c:187 > > #6 0x0082423e in dsm_shmem_init () at dsm.c:473 > > > > so that where we do > > > > relptr_store(base, fpm->self, fpm); > > > > the "relative" pointer value would have to be zero, making the case > > indistinguishable from a NULL pointer. We can either change the > > caller so that these addresses aren't the same, or give up the > > ability to store NULL in relptrs ... doesn't seem like a hard call. > > > > One interesting question I didn't look into is why it takes a nondefault > > value of min_dynamic_shared_memory to expose this bug. > > The path is taken only when a valid value is given to the > parameter. If we don't use preallocated dsm, it is initialized > elsewhere. In those cases the first bytes of the base address (the > second parameter of FreePageManagerInitialize) are used for > dsa_segment_header so the relptr won't be zero (!= NULL). > > It can be silenced by wasting the first MAXALIGN bytes of > dsm_main_space_begin.. Actually, that change doesn't result in wasting of usable memory size since the change doesn't move the first effective page. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c index 9d86bbe872..c67134ec27 100644 --- a/src/backend/storage/ipc/dsm.c +++ b/src/backend/storage/ipc/dsm.c @@ -461,12 +461,19 @@ dsm_shmem_init(void) dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, ); if (!found) { - FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin; + /* + * fpm cannot be the same with dsm_main_space_begin due to the + * restriction imposed by FreePageManagerInitialize() due to + * relptr. Add MAXALIGN(1) to fpm to live with that restriction. + */ + int gap = MAXALIGN(1); + FreePageManager *fpm = + (FreePageManager *) ((char *)dsm_main_space_begin + gap); size_t first_page = 0; size_t pages; /* Reserve space for the FreePageManager. */ - while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager)) + while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager) + gap) ++first_page; /* Initialize it and give it all the rest of the space. */ diff --git a/src/test/modules/test_misc/t/004_check_min_dsm_size.pl b/src/test/modules/test_misc/t/004_check_min_dsm_size.pl new file mode 100644 index 00..025dca33e4 --- /dev/null +++ b/src/test/modules/test_misc/t/004_check_min_dsm_size.pl @@ -0,0 +1,14 @@ +# Tests of min_dynamic_shared_memory + +use strict; +use warnings; +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; + +my $node = PostgreSQL::Test::Cluster->new('main'); +$node->init; +$node->append_conf('postgresql.conf', 'min_dynamic_shared_memory = 1MB'); +$node->start; +pass('check if server can start with min_dynamic_shared_memory'); +done_testing();
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
At Thu, 19 May 2022 17:16:03 -0400, Tom Lane wrote in > Justin Pryzby writes: > > ./tmp_install/usr/local/pgsql/bin/postgres -D > > ./src/test/regress/tmp_check/data -c min_dynamic_shared_memory=1MB > > TRAP: FailedAssertion("val > base", File: > > "../../../../src/include/utils/relptr.h", Line: 67, PID: 21912) > > Yeah, I see it too. > > > It looks like this may be pre-existing problem exposed by > > commit e07d4ddc55fdcf82082950b3eb0cd8f728284c9d > > Agreed. Here I see > > #5 FreePageManagerInitialize (fpm=fpm@entry=0x7f34b3ddd300, > base=base@entry=0x7f34b3ddd300 "") at freepage.c:187 > #6 0x0082423e in dsm_shmem_init () at dsm.c:473 > > so that where we do > > relptr_store(base, fpm->self, fpm); > > the "relative" pointer value would have to be zero, making the case > indistinguishable from a NULL pointer. We can either change the > caller so that these addresses aren't the same, or give up the > ability to store NULL in relptrs ... doesn't seem like a hard call. > > One interesting question I didn't look into is why it takes a nondefault > value of min_dynamic_shared_memory to expose this bug. The path is taken only when a valid value is given to the parameter. If we don't use preallocated dsm, it is initialized elsewhere. In those cases the first bytes of the base address (the second parameter of FreePageManagerInitialize) are used for dsa_segment_header so the relptr won't be zero (!= NULL). It can be silenced by wasting the first MAXALIGN bytes of dsm_main_space_begin.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
Justin Pryzby writes: > ./tmp_install/usr/local/pgsql/bin/postgres -D > ./src/test/regress/tmp_check/data -c min_dynamic_shared_memory=1MB > TRAP: FailedAssertion("val > base", File: > "../../../../src/include/utils/relptr.h", Line: 67, PID: 21912) Yeah, I see it too. > It looks like this may be pre-existing problem exposed by > commit e07d4ddc55fdcf82082950b3eb0cd8f728284c9d Agreed. Here I see #5 FreePageManagerInitialize (fpm=fpm@entry=0x7f34b3ddd300, base=base@entry=0x7f34b3ddd300 "") at freepage.c:187 #6 0x0082423e in dsm_shmem_init () at dsm.c:473 so that where we do relptr_store(base, fpm->self, fpm); the "relative" pointer value would have to be zero, making the case indistinguishable from a NULL pointer. We can either change the caller so that these addresses aren't the same, or give up the ability to store NULL in relptrs ... doesn't seem like a hard call. One interesting question I didn't look into is why it takes a nondefault value of min_dynamic_shared_memory to expose this bug. regards, tom lane
pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
./tmp_install/usr/local/pgsql/bin/postgres -D ./src/test/regress/tmp_check/data -c min_dynamic_shared_memory=1MB TRAP: FailedAssertion("val > base", File: "../../../../src/include/utils/relptr.h", Line: 67, PID: 21912) ./tmp_install/usr/local/pgsql/bin/postgres(ExceptionalCondition+0xa0)[0x55af5c9c463e] ./tmp_install/usr/local/pgsql/bin/postgres(FreePageManagerInitialize+0x94)[0x55af5c9f4478] ./tmp_install/usr/local/pgsql/bin/postgres(dsm_shmem_init+0x87)[0x55af5c841532] ./tmp_install/usr/local/pgsql/bin/postgres(CreateSharedMemoryAndSemaphores+0x8d)[0x55af5c843f30] ./tmp_install/usr/local/pgsql/bin/postgres(+0x41805c)[0x55af5c7c605c] ./tmp_install/usr/local/pgsql/bin/postgres(PostmasterMain+0x959)[0x55af5c7ca8e7] ./tmp_install/usr/local/pgsql/bin/postgres(main+0x229)[0x55af5c70af7f] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f736d6e1b97] ./tmp_install/usr/local/pgsql/bin/postgres(_start+0x2a)[0x55af5c4794fa] It looks like this may be pre-existing problem exposed by commit e07d4ddc55fdcf82082950b3eb0cd8f728284c9d Author: Tom Lane Date: Sat Mar 26 14:29:29 2022 -0400 Suppress compiler warning in relptr_store().