Hello,
On Tue, Mar 10, 2026 at 8:17 PM Nathan Bossart <[email protected]>
wrote:

> On Sat, Feb 14, 2026 at 04:02:21PM +0100, KAZAR Ayoub wrote:
> > On Thu, Feb 12, 2026 at 10:25 PM Andres Freund <[email protected]>
> wrote:
> >> I have a hard time believing that adding a strlen() to the handling of a
> >> short column won't be a measurable overhead with lots of short
> attributes.
> >> Particularly because the patch afaict will call it repeatedly if there
> are
> >> any to-be-escaped characters.
> >
> > [...]
> >
> > 1000 columns:
> > TEXT: 17% regression
> > CSV: 3.4% regression
> >
> > 500 columns:
> > TEXT: 17.7% regression
> > CSV: 3.1% regression
> >
> > 100 columns:
> > TEXT: 17.3% regression
> > CSV: 3% regression
> >
> > A bit unstable results, but yeah the overhead for worse cases like this
> is
> > really significant, I can't argue whether this is worth it or not, so
> > thoughts on this ?
>
> I seriously doubt we'd commit something that produces a 17% regression
> here.  Perhaps we should skip the SIMD paths whenever transcoding is
> required.
>
> --
> nathan
>
I've spent some time rethinking this, and here's what I've done in v3:
SIMD is only used for varlena attributes whose text representation is
longer than a single SIMD vector, and only when no transcoding is required.

Fixed-size types such as integers mostly produce short ASCII output,
for which SIMD provides no benefit.
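
To illustrate why short output gains nothing, here is a minimal portable
sketch of the chunked skip idea. It is not the patch code: CHUNK_BYTES,
chunk_has_special() and skip_clean_chunks() are hypothetical names, the
16-byte chunk size is an assumption standing in for sizeof(Vector8), and a
scalar loop stands in for the vector comparisons. Unlike the patch, it stops
at the start of the first chunk containing a special character rather than
at the character itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stand-in for sizeof(Vector8); the real width is platform-dependent. */
#define CHUNK_BYTES 16

/*
 * Hypothetical scalar stand-in for the vector comparisons: returns true if
 * any of the CHUNK_BYTES bytes at p is special in TEXT mode (a control
 * character below 0x20, a backslash, or the delimiter).
 */
static bool
chunk_has_special(const char *p, char delimc)
{
	for (size_t i = 0; i < CHUNK_BYTES; i++)
	{
		unsigned char c = (unsigned char) p[i];

		if (c < 0x20 || c == '\\' || c == (unsigned char) delimc)
			return true;
	}
	return false;
}

/*
 * Returns how many leading bytes of s can be skipped in whole chunks.
 * Only full chunks are examined, so a string shorter than CHUNK_BYTES
 * always yields 0 and is left entirely to the scalar loop.
 */
static size_t
skip_clean_chunks(const char *s, size_t len, char delimc)
{
	size_t		off = 0;

	while (off + CHUNK_BYTES <= len && !chunk_has_special(s + off, delimc))
		off += CHUNK_BYTES;

	return off;
}
```

With these assumptions, a 5-byte integer string never enters the loop, while
a clean 40-byte string skips 32 bytes and leaves 8 for the scalar path.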

For eligible attributes, the stored varlena size is used as a cheap
pre-filter to avoid an unnecessary strlen() call on short values.
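
The decision can be sketched in isolation as follows. This is a simplified
model rather than the patch itself: SimdPlan and plan_simd() are
hypothetical names, VECTOR_BYTES is an assumed 16-byte stand-in for
sizeof(Vector8), and the attlen and stored_size parameters stand in for the
values the patch reads via TupleDescAttr() and VARSIZE_ANY_EXHDR().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for sizeof(Vector8). */
#define VECTOR_BYTES 16

/* Hypothetical result type: whether to use SIMD, and the length if computed. */
typedef struct SimdPlan
{
	bool		use_simd;
	size_t		len;			/* 0 means "strlen() was not called" */
} SimdPlan;

/*
 * Hypothetical model of the per-attribute decision.
 *
 * attlen == -1 marks a varlena type.  stored_size is the binary payload
 * size, used only as a cheap pre-filter; the length that matters for SIMD
 * is the length of the text output.
 */
static SimdPlan
plan_simd(int attlen, size_t stored_size, bool need_transcoding,
		  const char *text)
{
	SimdPlan	plan = {false, 0};

	/* Fixed-size types and small varlenas: skip strlen() entirely. */
	if (attlen != -1 || stored_size <= VECTOR_BYTES)
		return plan;

	/* The length is computed once and can be reused for transcoding. */
	plan.len = strlen(text);
	plan.use_simd = !need_transcoding && plan.len > VECTOR_BYTES;
	return plan;
}
```

Under these assumptions, a value with a large binary size but a short text
form still pays for one strlen() call and then falls back to the scalar
path.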

Here are the benchmark results after many runs compared to master
(4deecb52aff):
TEXT clean: -34.0%
CSV clean: -39.3%
TEXT 1/3: +4.7%
CSV 1/3: -2.3%
The above numbers vary by 1% to 3% in either direction (improvement or
regression) across 20+ runs.

WIDE tables short attributes TEXT:
50 columns: -3.7%
100 columns: -1.7%
200 columns: +1.8%
500 columns: -0.5%
1000 columns: -0.3%

WIDE tables short attributes CSV:
50 columns: -2.5%
100 columns: +1.8%
200 columns: +1.4%
500 columns: -0.9%
1000 columns: -1.1%

The wide-table benchmarks were all within similar noise: across 20+ runs the
results always fall between about -2% and +4% for every number of columns.

One small concern: some varlenas have a binary size that is larger than
their text representation, for example:
SELECT pg_column_size(to_tsvector('SIMD is GOOD'));
 pg_column_size
----------------
             32

Its text representation is shorter than sizeof(Vector8), so currently v3 would
enter the SIMD path and exit right at the beginning (two extra branches)
because it does this:
+ if (TupleDescAttr(tup_desc, attnum - 1)->attlen == -1 &&
+ VARSIZE_ANY_EXHDR(DatumGetPointer(value)) > sizeof(Vector8))

I thought maybe we could use a factor of * 2 or * 4 against the binary size
in that check. The right factor really depends on the type, and this is only
a suggestion in case this case is considered a concern.

Thoughts?


Regards,
Ayoub
From a22258dfe42d9804cd6cc41c7a15151c4d30c8b9 Mon Sep 17 00:00:00 2001
From: AyoubKAZ <[email protected]>
Date: Sat, 14 Mar 2026 22:52:22 +0100
Subject: [PATCH] Speed up COPY TO (FORMAT {text,csv}) using SIMD. Presently,
 such commands scan each attribute's string representation one byte at a time
 looking for special characters.  This commit adds a new path that uses SIMD
 instructions to skip over chunks of data without any special characters. 
 This can be much faster.

SIMD processing is only used for varlena attributes whose text
representation is longer than a single SIMD vector, and only when
no encoding conversion is required.  Fixed-size types such as
integers and booleans always produce short ASCII output for which
SIMD provides no benefit, and when transcoding is needed the string
length may change after conversion.  For eligible attributes, the
stored varlena size is used as a cheap pre-filter to avoid an
unnecessary strlen() call on short values.  This version also avoids
calling strlen() twice when transcoding is necessary.

For TEXT mode, the SIMD path scans for ASCII control characters,
backslash, and the delimiter.  For CSV mode, two SIMD helpers are
used: one to determine whether a field requires quoting by scanning
for the delimiter, quote character, and end-of-line characters, and
one to scan for characters requiring escaping during the output pass.
In both modes, the scalar path handles any remaining characters after
the SIMD pre-pass.
---
 src/backend/commands/copyto.c | 254 +++++++++++++++++++++++++++++++---
 1 file changed, 236 insertions(+), 18 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index d6ef7275a64..fde19f9a6a4 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -31,6 +31,8 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "port/pg_bitutils.h"
+#include "port/simd.h"
 #include "storage/fd.h"
 #include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
@@ -117,11 +119,147 @@ static const char BinarySignature[11] = "PGCOPY\n\377\r\n\0";
 static void EndCopy(CopyToState cstate);
 static void ClosePipeToProgram(CopyToState cstate);
 static void CopyOneRowTo(CopyToState cstate, TupleTableSlot *slot);
-static void CopyAttributeOutText(CopyToState cstate, const char *string);
-static void CopyAttributeOutCSV(CopyToState cstate, const char *string,
-								bool use_quote);
+static pg_attribute_always_inline void CopyAttributeOutText(CopyToState cstate, const char *string,
+															bool use_simd, size_t len);
+static pg_attribute_always_inline void CopyAttributeOutCSV(CopyToState cstate, const char *string,
+														   bool use_quote, bool use_simd, size_t len);
 static void CopyRelationTo(CopyToState cstate, Relation rel, Relation root_rel,
 						   uint64 *processed);
+static void CopySkipTextSIMD(const char **ptr,
+							 size_t len, char delimc);
+static void CopyCheckCSVQuoteNeedSIMD(const char **ptr,
+									  size_t len, char delimc, char quotec);
+static void CopySkipCSVEscapeSIMD(const char **ptr,
+								  size_t len, char escapec, char quotec);
+
+/*
+ * CopySkipTextSIMD - Scan forward in TEXT mode using SIMD, stopping at the
+ * first special character.  The caller then continues processing any
+ * remaining characters in the scalar path.
+ *
+ * Special characters for TEXT mode are: ASCII control characters (< 0x20),
+ * backslash, and the delimiter.
+ */
+static void
+CopySkipTextSIMD(const char **ptr, size_t len, char delimc)
+{
+#ifndef USE_NO_SIMD
+	const char *p = *ptr;
+	const char *end = p + len;
+
+	const Vector8 backslash_mask = vector8_broadcast('\\');
+	const Vector8 delim_mask = vector8_broadcast(delimc);
+	const Vector8 control_mask = vector8_broadcast(0x20);
+
+	while (p + sizeof(Vector8) <= end)
+	{
+		Vector8		chunk;
+		Vector8		match;
+
+		vector8_load(&chunk, (const uint8 *) p);
+
+		match = vector8_or(vector8_gt(control_mask, chunk),
+						   vector8_eq(chunk, backslash_mask));
+		match = vector8_or(match, vector8_eq(chunk, delim_mask));
+
+		if (vector8_is_highbit_set(match))
+		{
+			uint32		mask;
+
+			mask = vector8_highbit_mask(match);
+			*ptr = p + pg_rightmost_one_pos32(mask);
+			return;
+		}
+
+		p += sizeof(Vector8);
+	}
+
+	*ptr = p;
+#endif
+}
+
+/*
+ * CopyCheckCSVQuoteNeedSIMD - Scan a CSV field using SIMD to determine whether
+ * it needs quoting, stopping at the first character that would require it:
+ * the delimiter, the quote character, newline, or carriage return.
+ */
+static void
+CopyCheckCSVQuoteNeedSIMD(const char **ptr, size_t len, char delimc, char quotec)
+{
+#ifndef USE_NO_SIMD
+	const char *p = *ptr;
+	const char *end = p + len;
+
+	const Vector8 delim_mask = vector8_broadcast(delimc);
+	const Vector8 quote_mask = vector8_broadcast(quotec);
+	const Vector8 nl_mask = vector8_broadcast('\n');
+	const Vector8 cr_mask = vector8_broadcast('\r');
+
+	while (p + sizeof(Vector8) <= end)
+	{
+		Vector8		chunk;
+		Vector8		match;
+
+		vector8_load(&chunk, (const uint8 *) p);
+
+		match = vector8_or(vector8_eq(chunk, nl_mask), vector8_eq(chunk, cr_mask));
+		match = vector8_or(match, vector8_or(vector8_eq(chunk, delim_mask),
+											 vector8_eq(chunk, quote_mask)));
+
+		if (vector8_is_highbit_set(match))
+		{
+			uint32		mask;
+
+			mask = vector8_highbit_mask(match);
+			*ptr = p + pg_rightmost_one_pos32(mask);
+			return;
+		}
+
+		p += sizeof(Vector8);
+	}
+
+	*ptr = p;
+#endif
+}
+
+/*
+ * CopySkipCSVEscapeSIMD - Like CopyCheckCSVQuoteNeedSIMD, but scans forward
+ * in CSV mode, stopping at the first character that requires escaping.
+ */
+static void
+CopySkipCSVEscapeSIMD(const char **ptr, size_t len, char escapec, char quotec)
+{
+#ifndef USE_NO_SIMD
+	const char *p = *ptr;
+	const char *end = p + len;
+
+	const Vector8 escape_mask = vector8_broadcast(escapec);
+	const Vector8 quote_mask = vector8_broadcast(quotec);
+
+	while (p + sizeof(Vector8) <= end)
+	{
+		Vector8		chunk;
+		Vector8		match;
+
+		vector8_load(&chunk, (const uint8 *) p);
+
+		match = vector8_or(vector8_eq(chunk, quote_mask), vector8_eq(chunk, escape_mask));
+
+		if (vector8_is_highbit_set(match))
+		{
+			uint32		mask;
+
+			mask = vector8_highbit_mask(match);
+			*ptr = p + pg_rightmost_one_pos32(mask);
+			return;
+		}
+
+		p += sizeof(Vector8);
+	}
+
+	*ptr = p;
+#endif
+}
 
 /* built-in format-specific routines */
 static void CopyToTextLikeStart(CopyToState cstate, TupleDesc tupDesc);
@@ -222,9 +360,9 @@ CopyToTextLikeStart(CopyToState cstate, TupleDesc tupDesc)
 			colname = NameStr(TupleDescAttr(tupDesc, attnum - 1)->attname);
 
 			if (cstate->opts.csv_mode)
-				CopyAttributeOutCSV(cstate, colname, false);
+				CopyAttributeOutCSV(cstate, colname, false, false, 0);
 			else
-				CopyAttributeOutText(cstate, colname);
+				CopyAttributeOutText(cstate, colname, false, 0);
 		}
 
 		CopySendTextLikeEndOfRow(cstate);
@@ -273,6 +411,7 @@ CopyToTextLikeOneRow(CopyToState cstate,
 {
 	bool		need_delim = false;
 	FmgrInfo   *out_functions = cstate->out_functions;
+	TupleDesc	tup_desc = slot->tts_tupleDescriptor;
 
 	foreach_int(attnum, cstate->attnumlist)
 	{
@@ -290,15 +429,48 @@ CopyToTextLikeOneRow(CopyToState cstate,
 		else
 		{
 			char	   *string;
+			bool		use_simd = false;
+			size_t		len = 0;
+
+			string = OutputFunctionCall(&out_functions[attnum - 1], value);
 
-			string = OutputFunctionCall(&out_functions[attnum - 1],
-										value);
+			/*
+			 * Only use SIMD for varlena types without transcoding.  Fixed-size
+			 * types (int4, bool, date, etc.) always produce short ASCII output
+			 * for which SIMD provides no benefit.  When transcoding is needed,
+			 * the string length may change after conversion, so we skip SIMD
+			 * entirely in that case too.
+			 *
+			 * We use VARSIZE_ANY_EXHDR as a cheap pre-filter to avoid calling
+			 * strlen() on short varlenas.  The actual length passed to the SIMD
+			 * helpers is always strlen(string), i.e. the text output length,
+			 * not the binary storage size.
+			 */
+			if (TupleDescAttr(tup_desc, attnum - 1)->attlen == -1 &&
+				VARSIZE_ANY_EXHDR(DatumGetPointer(value)) > sizeof(Vector8))
+			{
+				len = strlen(string);
+				use_simd = !cstate->need_transcoding && (len > sizeof(Vector8));
+			}
 
 			if (is_csv)
-				CopyAttributeOutCSV(cstate, string,
-									cstate->opts.force_quote_flags[attnum - 1]);
+			{
+				if (use_simd)
+					CopyAttributeOutCSV(cstate, string,
+										cstate->opts.force_quote_flags[attnum - 1],
+										true, len);
+				else
+					CopyAttributeOutCSV(cstate, string,
+										cstate->opts.force_quote_flags[attnum - 1],
+										false, len);
+			}
 			else
-				CopyAttributeOutText(cstate, string);
+			{
+				if (use_simd)
+					CopyAttributeOutText(cstate, string, true, len);
+				else
+					CopyAttributeOutText(cstate, string, false, len);
+			}
 		}
 	}
 
@@ -1239,8 +1411,24 @@ CopyOneRowTo(CopyToState cstate, TupleTableSlot *slot)
 			CopySendData(cstate, start, ptr - start); \
 	} while (0)
 
-static void
-CopyAttributeOutText(CopyToState cstate, const char *string)
+/*
+ * CopyAttributeOutText - Send text representation of one attribute,
+ * with conversion and escaping.
+ *
+ * For a little extra speed, if use_simd is true we first use SIMD
+ * instructions to skip over chunks of data that contain no special
+ * characters.  This pre-pass advances ptr as far as possible before
+ * handing off to the scalar loop below, which then processes any
+ * remaining characters.  use_simd is only set by the caller when the
+ * attribute is a varlena type whose text representation is longer than
+ * a single SIMD vector and no encoding conversion is required.  In all
+ * other cases we fall straight through to the scalar path.
+ *
+ * len, if nonzero, must be strlen(string); it is required when use_simd is true.
+ */
+static pg_attribute_always_inline void
+CopyAttributeOutText(CopyToState cstate, const char *string,
+					 bool use_simd, size_t len)
 {
 	const char *ptr;
 	const char *start;
@@ -1248,7 +1436,15 @@ CopyAttributeOutText(CopyToState cstate, const char *string)
 	char		delimc = cstate->opts.delim[0];
 
 	if (cstate->need_transcoding)
-		ptr = pg_server_to_any(string, strlen(string), cstate->file_encoding);
+	{
+		/*
+		 * len may already be set by the caller for long varlenas, avoiding an extra
+		 * strlen() call.  For all other cases it is 0 and we compute it here.
+		 */
+		if (len == 0)
+			len = strlen(string);
+		ptr = pg_server_to_any(string, len, cstate->file_encoding);
+	}
 	else
 		ptr = string;
 
@@ -1269,6 +1465,9 @@ CopyAttributeOutText(CopyToState cstate, const char *string)
 	if (cstate->encoding_embeds_ascii)
 	{
 		start = ptr;
+		if (use_simd)
+			CopySkipTextSIMD(&ptr, len, delimc);
+
 		while ((c = *ptr) != '\0')
 		{
 			if ((unsigned char) c < (unsigned char) 0x20)
@@ -1329,6 +1528,9 @@ CopyAttributeOutText(CopyToState cstate, const char *string)
 	else
 	{
 		start = ptr;
+		if (use_simd)
+			CopySkipTextSIMD(&ptr, len, delimc);
+
 		while ((c = *ptr) != '\0')
 		{
 			if ((unsigned char) c < (unsigned char) 0x20)
@@ -1389,12 +1591,14 @@ CopyAttributeOutText(CopyToState cstate, const char *string)
 }
 
 /*
- * Send text representation of one attribute, with conversion and
- * CSV-style escaping
+ * CopyAttributeOutCSV - Send text representation of one attribute,
+ * with conversion and CSV-style escaping.
+ *
+ * We use the same simd optimization idea, see CopyAttributeOutText comment.
  */
-static void
+static pg_attribute_always_inline void
 CopyAttributeOutCSV(CopyToState cstate, const char *string,
-					bool use_quote)
+					bool use_quote, bool use_simd, size_t len)
 {
 	const char *ptr;
 	const char *start;
@@ -1409,7 +1613,15 @@ CopyAttributeOutCSV(CopyToState cstate, const char *string,
 		use_quote = true;
 
 	if (cstate->need_transcoding)
-		ptr = pg_server_to_any(string, strlen(string), cstate->file_encoding);
+	{
+		/*
+		 * len may already be set by the caller for long varlenas, avoiding an extra
+		 * strlen() call.  For all other cases it is 0 and we compute it here.
+		 */
+		if (len == 0)
+			len = strlen(string);
+		ptr = pg_server_to_any(string, len, cstate->file_encoding);
+	}
 	else
 		ptr = string;
 
@@ -1431,6 +1643,9 @@ CopyAttributeOutCSV(CopyToState cstate, const char *string,
 		{
 			const char *tptr = ptr;
 
+			if (use_simd)
+				CopyCheckCSVQuoteNeedSIMD(&tptr, len, delimc, quotec);
+
 			while ((c = *tptr) != '\0')
 			{
 				if (c == delimc || c == quotec || c == '\n' || c == '\r')
@@ -1454,6 +1669,9 @@ CopyAttributeOutCSV(CopyToState cstate, const char *string,
 		 * We adopt the same optimization strategy as in CopyAttributeOutText
 		 */
 		start = ptr;
+		if (use_simd)
+			CopySkipCSVEscapeSIMD(&ptr, len, escapec, quotec);
+
 		while ((c = *ptr) != '\0')
 		{
 			if (c == quotec || c == escapec)
-- 
2.34.1
