On Tue, 10 Oct 2023 at 06:38, Tom Lane <t...@sss.pgh.pa.us> wrote: > Hm. I'd be inclined to use maxlen == 0 as the indicator of a read-only > buffer, just because that would not create a problem if we ever want > to change it to an unsigned type. Other than that, I agree with the > idea of using a special maxlen value to indicate that the buffer is > read-only and not owned by the StringInfo. We need to nail down the > exact semantics though.
I've attached a slightly more worked on patch that makes maxlen == 0 mean read-only. Unsure if a macro is worthwhile there or not. The patch still fails during 023_twophase_stream.pl for the reasons mentioned upthread. Getting rid of the Assert in initStringInfoFromStringWithLen() allows it to pass. One thought I had about this is that the memory context behaviour might catch someone out at some point. Right now if you do initStringInfo() the memory context of the "data" field will be CurrentMemoryContext, but if someone does initStringInfoFromStringWithLen() and then changes to some other memory context before doing an appendStringInfo on that string, then we'll allocate "data" in whatever that memory context is. Maybe that's ok if we document it. Fixing it would mean adding a MemoryContext field to StringInfoData which would be set to CurrentMemoryContext during initStringInfo() and initStringInfoFromStringWithLen(). I'm not fully happy with the extra code added in enlargeStringInfo(). It's a little repetitive. Fixing it up would mean having to have a boolean variable to mark if the string was readonly so at the end we'd know to repalloc or palloc/memcpy. For now, I just marked that code as unlikely() since there's no place in the code base that uses it. David
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c index 82f48a488e..fdfbd45e2c 100644 --- a/src/backend/replication/logical/applyparallelworker.c +++ b/src/backend/replication/logical/applyparallelworker.c @@ -774,10 +774,7 @@ LogicalParallelApplyLoop(shm_mq_handle *mqh) if (len == 0) elog(ERROR, "invalid message length"); - s.cursor = 0; - s.maxlen = -1; - s.data = (char *) data; - s.len = len; + initStringInfoFromStringWithLen(&s, data, len); /* * The first byte of messages sent from leader apply worker to diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c index d52c8963eb..e51a2b7ce1 100644 --- a/src/backend/replication/logical/proto.c +++ b/src/backend/replication/logical/proto.c @@ -879,6 +879,7 @@ logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple) /* Read the data */ for (i = 0; i < natts; i++) { + char *buff; char kind; int len; StringInfo value = &tuple->colvalues[i]; @@ -899,19 +900,16 @@ logicalrep_read_tuple(StringInfo in, LogicalRepTupleData *tuple) len = pq_getmsgint(in, 4); /* read length */ /* and data */ - value->data = palloc(len + 1); - pq_copymsgbytes(in, value->data, len); + buff = palloc(len + 1); + pq_copymsgbytes(in, buff, len); /* * Not strictly necessary for LOGICALREP_COLUMN_BINARY, but * per StringInfo practice. */ - value->data[len] = '\0'; + buff[len] = '\0'; - /* make StringInfo fully valid */ - value->len = len; - value->cursor = 0; - value->maxlen = len; + initStringInfoFromStringWithLen(value, buff, len); break; default: elog(ERROR, "unrecognized data representation type '%c'", kind); diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index 597947410f..b4bf3111cf 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -3582,10 +3582,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received) /* Ensure we are reading the data into our memory context. */ MemoryContextSwitchTo(ApplyMessageContext); - s.data = buf; - s.len = len; - s.cursor = 0; - s.maxlen = -1; + initStringInfoFromStringWithLen(&s, buf, len); c = pq_getmsgbyte(&s); diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 21b9763183..d48029038b 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -1816,23 +1816,19 @@ exec_bind_message(StringInfo input_message) if (!isNull) { - const char *pvalue = pq_getmsgbytes(input_message, plength); + char *pvalue; /* - * Rather than copying data around, we just set up a phony - * StringInfo pointing to the correct portion of the message - * buffer. We assume we can scribble on the message buffer so - * as to maintain the convention that StringInfos have a - * trailing null. This is grotty but is a big win when - * dealing with very large parameter strings. + * Rather than copying data around, we just initialize a StringInfo + * pointing to the correct portion of the message buffer. We assume we + * can scribble on the message buffer so as to maintain the convention + * that StringInfos have a trailing null. This is grotty but is a big win + * when dealing with very large parameter strings. */ - pbuf.data = unconstify(char *, pvalue); - pbuf.maxlen = plength + 1; - pbuf.len = plength; - pbuf.cursor = 0; - - csave = pbuf.data[plength]; - pbuf.data[plength] = '\0'; + pvalue = unconstify(char *, pq_getmsgbytes(input_message, plength)); + csave = pvalue[plength]; + pvalue[plength] = '\0'; + initStringInfoFromStringWithLen(&pbuf, pvalue, plength); } else { diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c index ad176651d8..e76c723f49 100644 --- a/src/backend/utils/adt/rowtypes.c +++ b/src/backend/utils/adt/rowtypes.c @@ -623,21 +623,18 @@ record_recv(PG_FUNCTION_ARGS) } else { + char *strbuff; + /* - * Rather than copying data around, we just set up a phony - * StringInfo pointing to the correct portion of the input buffer. - * We assume we can scribble on the input buffer so as to maintain + * Initalize a new StringInfo using the correct portion of the input + * buffer. We assume we can scribble on the input buffer so as to maintain * the convention that StringInfos have a trailing null. */ - item_buf.data = &buf->data[buf->cursor]; - item_buf.maxlen = itemlen + 1; - item_buf.len = itemlen; - item_buf.cursor = 0; - + strbuff = &buf->data[buf->cursor]; buf->cursor += itemlen; - csave = buf->data[buf->cursor]; buf->data[buf->cursor] = '\0'; + initStringInfoFromStringWithLen(&item_buf, strbuff, itemlen); bufptr = &item_buf; nulls[i] = false; diff --git a/src/common/stringinfo.c b/src/common/stringinfo.c index 05b22b5c53..2cf57a804f 100644 --- a/src/common/stringinfo.c +++ b/src/common/stringinfo.c @@ -30,6 +30,7 @@ #endif #include "lib/stringinfo.h" +#include "port/pg_bitutils.h" /* @@ -320,24 +321,50 @@ enlargeStringInfo(StringInfo str, int needed) if (needed <= str->maxlen) return; /* got enough space already */ - /* - * We don't want to allocate just a little more space with each append; - * for efficiency, double the buffer size each time it overflows. - * Actually, we might need to more than double it if 'needed' is big... - */ - newlen = 2 * str->maxlen; - while (needed > newlen) - newlen = 2 * newlen; - - /* - * Clamp to MaxAllocSize in case we went past it. Note we are assuming - * here that MaxAllocSize <= INT_MAX/2, else the above loop could - * overflow. We will still have newlen >= needed. - */ - if (newlen > (int) MaxAllocSize) - newlen = (int) MaxAllocSize; - - str->data = (char *) repalloc(str->data, newlen); - - str->maxlen = newlen; + if (likely(str->maxlen != 0)) + { + /* + * We don't want to allocate just a little more space with each append; + * for efficiency, double the buffer size each time it overflows. + * Actually, we might need to more than double it if 'needed' is big... + */ + newlen = 2 * str->maxlen; + while (needed > newlen) + newlen = 2 * newlen; + + /* + * Clamp to MaxAllocSize in case we went past it. Note we are assuming + * here that MaxAllocSize <= INT_MAX/2, else the above loop could + * overflow. We will still have newlen >= needed. + */ + if (newlen > (int) MaxAllocSize) + newlen = (int) MaxAllocSize; + + str->data = (char *) repalloc(str->data, newlen); + + str->maxlen = newlen; + } + else + { + char *newdata; + + /* + * The StringInfo may be read-only if initialized with + * initStringInfoFromStringWithLen(). This means we need to copy the + * read-only string out into a newly palloc'd buffer. + */ + newlen = pg_nextpower2_32(str->len) * 2; + while (needed > newlen) + newlen = 2 * newlen; + + if (newlen > (int) MaxAllocSize) + newlen = (int) MaxAllocSize; + + newdata = (char *) palloc(newlen); + memcpy(newdata, str->data, str->len + 1); + + str->data = newdata; + str->maxlen = newlen; + return; + } } diff --git a/src/include/lib/stringinfo.h b/src/include/lib/stringinfo.h index 36a416f8e0..f3739790a7 100644 --- a/src/include/lib/stringinfo.h +++ b/src/include/lib/stringinfo.h @@ -27,7 +27,10 @@ * maxlen is the allocated size in bytes of 'data', i.e. the maximum * string size (including the terminating '\0' char) that we can * currently store in 'data' without having to reallocate - * more space. We must always have maxlen > len. + * more space. We must always have maxlen > len except in the + * case when maxlen is set to 0, in which case 'data' is classed + * "read-only". Before appending to a read-only StringInfoData we + * must always copy the 'data' out into a new allocation. * cursor is initialized to zero by makeStringInfo or initStringInfo, * but is not otherwise touched by the stringinfo.c routines. * Some routines use it to scan through a StringInfo. @@ -79,6 +82,26 @@ extern StringInfo makeStringInfo(void); */ extern void initStringInfo(StringInfo str); +/*------------------------ + * initStringInfoFromStringWithLen + * Initialize a StringInfoData struct from an existing string without copying + * the string. The caller is responsible for ensuring the given string + * remains valid as long as the StringInfo does. The given string must be + * NUL terminated at 'len' bytes. Calls to this are used in performance + * critical locations where allocating a new buffer and copying would be too + * costly. + */ +static inline void +initStringInfoFromStringWithLen(StringInfo str, char *data, int len) +{ + Assert(data[len] == '\0'); + + str->data = data; + str->maxlen = 0; + str->len = len; + str->cursor = 0; +} + /*------------------------ * resetStringInfo * Clears the current content of the StringInfo, if any. The