Hi,

When running workloads that include fast queries with a lot of columns,
SendRowDescriptionMessage(), and the routines it calls, becomes a
bottleneck.  Besides syscache lookups (see [1] and [2]) a major cost of
that is manipulation of the StringBuffer and the version specific
branches in the per-attribute loop.  As so often, the performance
differential of this patch gets bigger when the other performance
patches are applied.

The issues in SendRowDescriptionMessage() are the following:

1) All the pq_sendint calls, and also the pq_sendstring, are
   expensive. The amount of calculations done for a single 2/4 byte
   addition to the stringbuffer (particularly enlargeStringInfo()) is
   problematic, as are the reallocations themselves.

   I addressed this by adding pq_send*_pre() wrappers, implemented as
   inline functions, that require that memory is pre-allocated.
   Combining that with doing a enlargeStringInfo() in
   SendRowDescriptionMessage() that pre-allocates the maximum required
   memory, yields pretty good speedup.

   I'm not yet super sure about the implementation. For one, I'm not
   sure this shouldn't instead be stringinfo.h functions, with very very
   tiny pqformat.h wrappers. But conversely I think it'd make a lot of
   sense for the pqformat integer functions to get rid of the
   continually maintained trailing null-byte - I was hoping the compiler
   could optimize that away, but alas, no luck.  As soon as a single
   integer is sent, you can't rely on 0 terminated strings anyway.

2) It creates a new StringInfo in every iteration. That causes
   noticeable memory management overhead, and restarts the size of the
   buffer every time. When the description is relatively large, that
   leads to a number of reallocations for every
   SendRowDescriptionMessage() call.

   My solution here was to create persistent StringInfo for
   SendRowDescriptionMessage() that never gets deallocated (just
   reset). That in combination with new versions of
   pq_beginmessage/endmessage that keep the buffer alive, yields a nice
   speedup.

   Currently I'm using a static variable to allocate a string buffer for
   the function. It'd probably better to manage that outside of a single
   function - I'm also wondering why we're allocating a good number of
   stringinfos in various places, rather than reuse them. Most of them
   can't be entered recursively, and even if that's a concern, we could
   have one reusable per portal or such.

3) The v2/v3 branches in the attribute loop are noticeable (others too,
   but well...).  I solved that by splitting out the v2 and v3
   per-attribute loops into separate functions.  Imo also a good chunk
   more readable.

Comments?

Greetings,

Andres Freund

[1] 
http://archives.postgresql.org/message-id/ca+tgmobj72e_tg6w98h0oubccumoc4urmjocypbnwc2rxya...@mail.gmail.com
[2] 
http://archives.postgresql.org/message-id/20170914061207.zxotvyopetm7lrrp%40alap3.anarazel.de
>From 672cbfd0660e4d2b2cc6980f3f5c2af27e692404 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Wed, 13 Sep 2017 18:39:24 -0700
Subject: [PATCH 2/8] Add more efficient functions to pqformat API.

New inline functions allow to add data to a stringbuf in a more
efficient manner by pre-allocating ahead of time, and
pq_beginmessage_pre/pq_endmessage_keep allow reuse of a stringbuffer.
---
 src/backend/libpq/pqformat.c   | 37 +++++++++++++++++++++++++++++++++
 src/backend/utils/mb/mbutils.c | 11 ----------
 src/include/libpq/pqformat.h   | 47 ++++++++++++++++++++++++++++++++++++++++++
 src/include/mb/pg_wchar.h      | 11 ++++++++++
 4 files changed, 95 insertions(+), 11 deletions(-)

diff --git a/src/backend/libpq/pqformat.c b/src/backend/libpq/pqformat.c
index c8cf67c041..6e40ee087c 100644
--- a/src/backend/libpq/pqformat.c
+++ b/src/backend/libpq/pqformat.c
@@ -97,6 +97,28 @@ pq_beginmessage(StringInfo buf, char msgtype)
 	buf->cursor = msgtype;
 }
 
+/* --------------------------------
+
+ *		pq_beginmessage_pre - initialize for sending a message, reuse buffer
+ *
+ * This requires the buffer to be allocated in an sufficiently long-lived
+ * memory context.
+ * --------------------------------
+ */
+void
+pq_beginmessage_pre(StringInfo buf, char msgtype)
+{
+	resetStringInfo(buf);
+
+	/*
+	 * We stash the message type into the buffer's cursor field, expecting
+	 * that the pq_sendXXX routines won't touch it.  We could alternatively
+	 * make it the first byte of the buffer contents, but this seems easier.
+	 */
+	buf->cursor = msgtype;
+}
+
+
 /* --------------------------------
  *		pq_sendbyte		- append a raw byte to a StringInfo buffer
  * --------------------------------
@@ -350,6 +372,21 @@ pq_endmessage(StringInfo buf)
 	buf->data = NULL;
 }
 
+/* --------------------------------
+ *		pq_endmessage_keep	- send the completed message to the frontend
+ *
+ * The data buffer is *not* freed, allowing to reuse the buffer with
+ * pg_beginmessage_pre.
+ --------------------------------
+ */
+
+void
+pq_endmessage_keep(StringInfo buf)
+{
+	/* msgtype was saved in cursor field */
+	(void) pq_putmessage(buf->cursor, buf->data, buf->len);
+}
+
 
 /* --------------------------------
  *		pq_begintypsend		- initialize for constructing a bytea result
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index c4fbe0903b..56f4dc1453 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -41,17 +41,6 @@
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 
-/*
- * When converting strings between different encodings, we assume that space
- * for converted result is 4-to-1 growth in the worst case. The rate for
- * currently supported encoding pairs are within 3 (SJIS JIS X0201 half width
- * kanna -> UTF8 is the worst case).  So "4" should be enough for the moment.
- *
- * Note that this is not the same as the maximum character width in any
- * particular encoding.
- */
-#define MAX_CONVERSION_GROWTH  4
-
 /*
  * We maintain a simple linked list caching the fmgr lookup info for the
  * currently selected conversion functions, as well as any that have been
diff --git a/src/include/libpq/pqformat.h b/src/include/libpq/pqformat.h
index 32112547a0..2e5b2b7685 100644
--- a/src/include/libpq/pqformat.h
+++ b/src/include/libpq/pqformat.h
@@ -14,8 +14,10 @@
 #define PQFORMAT_H
 
 #include "lib/stringinfo.h"
+#include "mb/pg_wchar.h"
 
 extern void pq_beginmessage(StringInfo buf, char msgtype);
+extern void pq_beginmessage_pre(StringInfo buf, char msgtype);
 extern void pq_sendbyte(StringInfo buf, int byt);
 extern void pq_sendbytes(StringInfo buf, const char *data, int datalen);
 extern void pq_sendcountedtext(StringInfo buf, const char *str, int slen,
@@ -28,6 +30,51 @@ extern void pq_sendint64(StringInfo buf, int64 i);
 extern void pq_sendfloat4(StringInfo buf, float4 f);
 extern void pq_sendfloat8(StringInfo buf, float8 f);
 extern void pq_endmessage(StringInfo buf);
+extern void pq_endmessage_keep(StringInfo buf);
+
+extern void pq_sendint64(StringInfo buf, int64 i);
+extern void pq_sendfloat4(StringInfo buf, float4 f);
+extern void pq_sendfloat8(StringInfo buf, float8 f);
+
+/* inline versions that require all space is pre-allocated */
+static inline void
+pq_sendint32_pre(StringInfo buf, int32 i)
+{
+	Assert(buf->len + sizeof(i) + 1 <= buf->maxlen);
+	*(int32* ) (buf->data + buf->len) = htonl(i);
+	buf->len += sizeof(i);
+	*(char *) (buf->data + buf->len) = '\0';
+}
+
+static inline void
+pq_sendint16_pre(StringInfo buf, int16 i)
+{
+	Assert(buf->len + sizeof(i) + 1 <= buf->maxlen);
+	*(int16* ) (buf->data + buf->len) = htons(i);
+	buf->len += sizeof(i);
+	*(char *) (buf->data + buf->len) = '\0';
+}
+
+static inline void
+pq_sendstring_pre(StringInfo buf, const char *str)
+{
+	int			slen = strlen(str);
+	char	   *p;
+
+	p = pg_server_to_client(str, slen);
+	if (p != str)				/* actual conversion has been done? */
+		slen = strlen(p);
+
+	Assert(buf->len + slen + 1 <= buf->maxlen);
+
+	memcpy(buf->data + buf->len, p, slen + 1);
+	buf->len += slen + 1;
+	*(char *) (buf->data + buf->len) = '\0';
+
+	if (p != str)
+		pfree(p);
+}
+
 
 extern void pq_begintypsend(StringInfo buf);
 extern bytea *pq_endtypsend(StringInfo buf);
diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index d57ef017cb..9227d634f6 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -304,6 +304,17 @@ typedef enum pg_enc
 /* On FE are possible all encodings */
 #define PG_VALID_FE_ENCODING(_enc)	PG_VALID_ENCODING(_enc)
 
+/*
+ * When converting strings between different encodings, we assume that space
+ * for converted result is 4-to-1 growth in the worst case. The rate for
+ * currently supported encoding pairs are within 3 (SJIS JIS X0201 half width
+ * kanna -> UTF8 is the worst case).  So "4" should be enough for the moment.
+ *
+ * Note that this is not the same as the maximum character width in any
+ * particular encoding.
+ */
+#define MAX_CONVERSION_GROWTH  4
+
 /*
  * Table for mapping an encoding number to official encoding name and
  * possibly other subsidiary data.  Be careful to check encoding number
-- 
2.14.1.536.g6867272d5b.dirty

>From 4b369d2aa88f0fed53e304932d39defc95b9bbfb Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Wed, 13 Sep 2017 18:41:50 -0700
Subject: [PATCH 3/8] Improve performance of SendRowDescriptionMessage.

Using the new pqformat functions yields performance for statements
with a noticeable number of rows. The function itself is more than
twice as fast.
---
 src/backend/access/common/printtup.c | 156 ++++++++++++++++++++++++++---------
 1 file changed, 118 insertions(+), 38 deletions(-)

diff --git a/src/backend/access/common/printtup.c b/src/backend/access/common/printtup.c
index 20d20e623e..894f89b63c 100644
--- a/src/backend/access/common/printtup.c
+++ b/src/backend/access/common/printtup.c
@@ -32,6 +32,8 @@ static bool printtup_internal_20(TupleTableSlot *slot, DestReceiver *self);
 static void printtup_shutdown(DestReceiver *self);
 static void printtup_destroy(DestReceiver *self);
 
+static void SendRowDescriptionCols_2(StringInfo buf, TupleDesc typeinfo, List *targetlist, int16 *formats);
+static void SendRowDescriptionCols_3(StringInfo buf, TupleDesc typeinfo, List *targetlist, int16 *formats);
 
 /* ----------------------------------------------------------------
  *		printtup / debugtup support
@@ -189,12 +191,118 @@ SendRowDescriptionMessage(TupleDesc typeinfo, List *targetlist, int16 *formats)
 {
 	int			natts = typeinfo->natts;
 	int			proto = PG_PROTOCOL_MAJOR(FrontendProtocol);
+	static StringInfo rowDescriptionBuf = NULL;
+
+	/*
+	 * As this routine is executed for every single query, it can be a
+	 * pq_sendstring_prebottleneck. To avoid unnecessary allocator overhead
+	 * reuse a single buffer. That means we'll never shrink below the largest
+	 * row-description sent, but that seems acceptable given the limited size.
+	 */
+	if (unlikely(!rowDescriptionBuf))
+	{
+		MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
+		rowDescriptionBuf = makeStringInfo();
+		MemoryContextSwitchTo(oldContext);
+	}
+
+	/* tuple descriptor message type */
+	pq_beginmessage_pre(rowDescriptionBuf, 'T');
+	/* # of attrs in tuples */
+	pq_sendint(rowDescriptionBuf, natts, 2);
+
+	if (proto >= 3)
+		SendRowDescriptionCols_3(rowDescriptionBuf, typeinfo, targetlist,
+								 formats);
+	else
+		SendRowDescriptionCols_2(rowDescriptionBuf, typeinfo, targetlist,
+								 formats);
+
+	pq_endmessage_keep(rowDescriptionBuf);
+}
+
+/*
+ * Send description for each column when using v3+ protocol
+ */
+static void
+SendRowDescriptionCols_3(StringInfo buf, TupleDesc typeinfo, List *targetlist, int16 *formats)
+{
+	int			natts = typeinfo->natts;
 	int			i;
-	StringInfoData buf;
 	ListCell   *tlist_item = list_head(targetlist);
 
-	pq_beginmessage(&buf, 'T'); /* tuple descriptor message type */
-	pq_sendint(&buf, natts, 2); /* # of attrs in tuples */
+	/*
+	 * Prealloc memory for the entire message to be sent. That allows to use
+	 * the significantly faster inline pqformat.h functions and to avoid
+	 * reallocations. Have to overestimate the size of the column-names, to
+	 * account for character set overhead.
+	 */
+	enlargeStringInfo(buf, (NAMEDATALEN * MAX_CONVERSION_GROWTH /* attname */
+							+ sizeof(int32) /* attlen */
+							+ sizeof(int32) /* resorigtbl */
+							+ sizeof(int16) /* resorigcol */
+							+ sizeof(Oid) /* atttypid */
+							+ sizeof(int16) /* attlen */
+							+ sizeof(int32) /* attypmod */
+						  ) * natts);
+
+	for (i = 0; i < natts; ++i)
+	{
+		Form_pg_attribute att = TupleDescAttr(typeinfo, i);
+		Oid			atttypid = att->atttypid;
+		int32		atttypmod = att->atttypmod;
+		int32		resorigtbl;
+		int32		resorigcol;
+		int16		format;
+
+		/*
+		 * If column is a domain, send the base type and typmod
+		 * instead. Lookup before sending any ints, for efficiency.
+		 */
+		atttypid = getBaseTypeAndTypmod(atttypid, &atttypmod);
+
+		/* Do we have a non-resjunk tlist item? */
+		while (tlist_item &&
+			   ((TargetEntry *) lfirst(tlist_item))->resjunk)
+			tlist_item = lnext(tlist_item);
+		if (tlist_item)
+		{
+			TargetEntry *tle = (TargetEntry *) lfirst(tlist_item);
+
+			resorigtbl = tle->resorigtbl;
+			resorigcol = tle->resorigcol;
+			tlist_item = lnext(tlist_item);
+		}
+		else
+		{
+			/* No info available, so send zeroes */
+			resorigtbl = 0;
+			resorigcol = 0;
+		}
+
+		if (formats)
+			format = formats[i];
+		else
+			format = 0;
+
+		pq_sendstring_pre(buf, NameStr(att->attname));
+		pq_sendint32_pre(buf, resorigtbl);
+		pq_sendint16_pre(buf, resorigcol);
+		pq_sendint32_pre(buf, atttypid);
+		pq_sendint16_pre(buf, att->attlen);
+		pq_sendint32_pre(buf, atttypmod);
+		pq_sendint16_pre(buf, format);
+	}
+}
+
+/*
+ * Send description for each column when using v2 protocol
+ */
+static void
+SendRowDescriptionCols_2(StringInfo buf, TupleDesc typeinfo, List *targetlist, int16 *formats)
+{
+	int			natts = typeinfo->natts;
+	int			i;
 
 	for (i = 0; i < natts; ++i)
 	{
@@ -202,44 +310,16 @@ SendRowDescriptionMessage(TupleDesc typeinfo, List *targetlist, int16 *formats)
 		Oid			atttypid = att->atttypid;
 		int32		atttypmod = att->atttypmod;
 
-		pq_sendstring(&buf, NameStr(att->attname));
-		/* column ID info appears in protocol 3.0 and up */
-		if (proto >= 3)
-		{
-			/* Do we have a non-resjunk tlist item? */
-			while (tlist_item &&
-				   ((TargetEntry *) lfirst(tlist_item))->resjunk)
-				tlist_item = lnext(tlist_item);
-			if (tlist_item)
-			{
-				TargetEntry *tle = (TargetEntry *) lfirst(tlist_item);
-
-				pq_sendint(&buf, tle->resorigtbl, 4);
-				pq_sendint(&buf, tle->resorigcol, 2);
-				tlist_item = lnext(tlist_item);
-			}
-			else
-			{
-				/* No info available, so send zeroes */
-				pq_sendint(&buf, 0, 4);
-				pq_sendint(&buf, 0, 2);
-			}
-		}
 		/* If column is a domain, send the base type and typmod instead */
 		atttypid = getBaseTypeAndTypmod(atttypid, &atttypmod);
-		pq_sendint(&buf, (int) atttypid, sizeof(atttypid));
-		pq_sendint(&buf, att->attlen, sizeof(att->attlen));
-		pq_sendint(&buf, atttypmod, sizeof(atttypmod));
-		/* format info appears in protocol 3.0 and up */
-		if (proto >= 3)
-		{
-			if (formats)
-				pq_sendint(&buf, formats[i], 2);
-			else
-				pq_sendint(&buf, 0, 2);
-		}
+
+		pq_sendstring(buf, NameStr(att->attname));
+		/* column ID only info appears in protocol 3.0 and up */
+		pq_sendint(buf, (int) atttypid, sizeof(atttypid));
+		pq_sendint(buf, att->attlen, sizeof(att->attlen));
+		pq_sendint(buf, atttypmod, sizeof(atttypmod));
+		/* format info only appears in protocol 3.0 and up */
 	}
-	pq_endmessage(&buf);
 }
 
 /*
-- 
2.14.1.536.g6867272d5b.dirty

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to