On Thu, Sep 3, 2015 at 6:21 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> [ new patches ]

Still more review comments:

+                               /* Allow space for terminating zero-byte */
+                               size = add_size(size, 1);

This is pointless.  The length is already stored separately, and if it
weren't, this wouldn't be adequate anyway because a varlena can
contain NUL bytes.  It won't if it's text, but it could be bytea or
numeric or whatever.

RestoreBoundParams is broken, because it can do unaligned reads, which
will core dump on some architectures (and merely be slow on others).
If there are two or more parameters, and the first one is a varlena
with a length that is not a multiple of MAXIMUM_ALIGNOF, the second
SerializedParamExternData will be misaligned.

Also, it's pretty lame that we send the useless pointer even for a
pass-by-reference data type and then overwrite the bad pointer with a
good one a few lines later.  It would be better to design the
serialization format so that we don't send the bogus pointer over the
wire in the first place.

It's also problematic in my view that there is so much duplicated code
here. SerializedParamExternData and SerializedParamExecData are very
similar and there are large swaths of very similar code to handle each
case.  Both structures contain Datum value, Size length, bool isnull,
and Oid ptype, albeit not in the same order for some reason.  The only
difference is that SerializedParamExternData contains uint16 pflags
and SerializedParamExecData contains int paramid.  I think we need to
refactor this code to get rid of all this duplication.  I suggest that
we decide to represent a datum here in a uniform fashion, perhaps like
this:

First, store a 4-byte header word.  If this is -2, the value is NULL
and no data follows.  If it's -1, the value is pass-by-value and
sizeof(Datum) bytes follow.  If it's >0, the value is
pass-by-reference and the value gives the number of following bytes
that should be copied into a brand-new palloc'd chunk.

Using a format like this, we can serialize and restore datums in
various contexts - including bind and exec params - without having to
rewrite the code each time.  For example, for param extern data, you
can dump an array of all the ptypes and paramids and then follow it
with all of the params one after another.  For param exec data, you
can dump an array of all the ptypes and paramids and then follow it
with the values one after another.  The code that reads and writes the
datums in both cases can be the same.  If we need to send datums in
other contexts, we can use the same code for it.

The attached patch - which I even tested! - shows what I have in mind.
It can save and restore the ParamListInfo (bind parameters).  I
haven't tried to adapt it to the exec parameters because I don't quite
understand what you are doing there yet, but you can see that the
datum-serialization logic is separated from the stuff that knows about
the details of ParamListInfo, so datumSerialize() should be reusable
for other purposes.  This also doesn't have the other problems
mentioned above.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/nodes/params.c b/src/backend/nodes/params.c
index fb803f8..d093263 100644
--- a/src/backend/nodes/params.c
+++ b/src/backend/nodes/params.c
@@ -16,6 +16,7 @@
 #include "postgres.h"
 
 #include "nodes/params.h"
+#include "storage/shmem.h"
 #include "utils/datum.h"
 #include "utils/lsyscache.h"
 
@@ -73,3 +74,157 @@ copyParamList(ParamListInfo from)
 
 	return retval;
 }
+
+/*
+ * Estimate the amount of space required to serialize a ParamListInfo.
+ */
+Size
+EstimateParamListSpace(ParamListInfo paramLI)
+{
+	int		i;
+	Size	sz = sizeof(int);
+
+	if (paramLI == NULL || paramLI->numParams <= 0)
+		return sz;
+
+	for (i = 0; i < paramLI->numParams; i++)
+	{
+		ParamExternData *prm = &paramLI->params[i];
+		int16		typLen;
+		bool		typByVal;
+
+		/* give hook a chance in case parameter is dynamic */
+		if (!OidIsValid(prm->ptype) && paramLI->paramFetch != NULL)
+			(*paramLI->paramFetch) (paramLI, i + 1);
+
+		sz = add_size(sz, sizeof(Oid));			/* space for type OID */
+		sz = add_size(sz, sizeof(uint16));		/* space for pflags */
+
+		/* space for datum/isnull */
+		if (OidIsValid(prm->ptype))
+			get_typlenbyval(prm->ptype, &typLen, &typByVal);
+		else
+		{
+			/* If no type OID, assume by-value, like copyParamList does. */
+			typLen = sizeof(Datum);
+			typByVal = true;
+		}
+		sz = add_size(sz,
+			datumEstimateSpace(prm->value, prm->isnull, typByVal, typLen));
+	}
+
+	return sz;
+}
+
+/*
+ * Serialize a paramListInfo structure into caller-provided storage.
+ *
+ * We write the number of parameters first, as a 4-byte integer, and then
+ * write details for each parameter in turn.  The details for each parameter
+ * consist of a 4-byte type OID, 2 bytes of flags, and then the datum as
+ * serialized by datumSerialize().  The caller is responsible for ensuring
+ * that there is enough storage to store the number of bytes that will be
+ * written; use EstimateParamListSpace to find out how many will be needed.
+ * *start_address is updated to point to the byte immediately following those
+ * written.
+ *
+ * RestoreParamList can be used to recreate a ParamListInfo based on the
+ * serialized representation; this will be a static, self-contained copy
+ * just as copyParamList would create.
+ */
+void
+SerializeParamList(ParamListInfo paramLI, char **start_address)
+{
+	int			nparams;
+	int			i;
+
+	/* Write number of parameters. */
+	if (paramLI == NULL || paramLI->numParams <= 0)
+		nparams = 0;
+	else
+		nparams = paramLI->numParams;
+	memcpy(*start_address, &nparams, sizeof(int));
+	*start_address += sizeof(int);
+
+	/* Write each parameter in turn. */
+	for (i = 0; i < nparams; i++)
+	{
+		ParamExternData *prm = &paramLI->params[i];
+		int16		typLen;
+		bool		typByVal;
+
+		/* give hook a chance in case parameter is dynamic */
+		if (!OidIsValid(prm->ptype) && paramLI->paramFetch != NULL)
+			(*paramLI->paramFetch) (paramLI, i + 1);
+
+		/* Write type OID. */
+		memcpy(*start_address, &prm->ptype, sizeof(Oid));
+		*start_address += sizeof(Oid);
+
+		/* Write flags. */
+		memcpy(*start_address, &prm->pflags, sizeof(uint16));
+		*start_address += sizeof(uint16);
+
+		/* Write datum/isnull. */
+		if (OidIsValid(prm->ptype))
+			get_typlenbyval(prm->ptype, &typLen, &typByVal);
+		else
+		{
+			/* If no type OID, assume by-value, like copyParamList does. */
+			typLen = sizeof(Datum);
+			typByVal = true;
+		}
+		datumSerialize(prm->value, prm->isnull, typByVal, typLen,
+					   start_address);
+	}
+}
+
+/*
+ * Copy a ParamListInfo structure.
+ *
+ * The result is allocated in CurrentMemoryContext.
+ *
+ * Note: the intent of this function is to make a static, self-contained
+ * set of parameter values.  If dynamic parameter hooks are present, we
+ * intentionally do not copy them into the result.  Rather, we forcibly
+ * instantiate all available parameter values and copy the datum values.
+ */
+ParamListInfo
+RestoreParamList(char **start_address)
+{
+	ParamListInfo paramLI;
+	Size		size;
+	int			i;
+	int			nparams;
+
+	memcpy(&nparams, *start_address, sizeof(int));
+	*start_address += sizeof(int);
+
+	size = offsetof(ParamListInfoData, params) +
+		nparams * sizeof(ParamExternData);
+
+	paramLI = (ParamListInfo) palloc(size);
+	paramLI->paramFetch = NULL;
+	paramLI->paramFetchArg = NULL;
+	paramLI->parserSetup = NULL;
+	paramLI->parserSetupArg = NULL;
+	paramLI->numParams = nparams;
+
+	for (i = 0; i < nparams; i++)
+	{
+		ParamExternData *prm = &paramLI->params[i];
+
+		/* Read type OID. */
+		memcpy(&prm->ptype, *start_address, sizeof(Oid));
+		*start_address += sizeof(Oid);
+
+		/* Read flags. */
+		memcpy(&prm->pflags, *start_address, sizeof(uint16));
+		*start_address += sizeof(uint16);
+
+		/* Read datum/isnull. */
+		prm->value = datumRestore(start_address, &prm->isnull);
+	}
+
+	return paramLI;
+}
diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c
index e8af030..3d9e354 100644
--- a/src/backend/utils/adt/datum.c
+++ b/src/backend/utils/adt/datum.c
@@ -246,3 +246,121 @@ datumIsEqual(Datum value1, Datum value2, bool typByVal, int typLen)
 	}
 	return res;
 }
+
+/*-------------------------------------------------------------------------
+ * datumEstimateSpace
+ *
+ * Compute the amount of space that datumSerialize will require for a
+ * particular Datum.
+ *-------------------------------------------------------------------------
+ */
+Size
+datumEstimateSpace(Datum value, bool isnull, bool typByVal, int typLen)
+{
+	Size	sz = sizeof(int);
+
+	if (!isnull)
+	{
+		/* no need to use add_size, can't overflow */
+		if (typByVal)
+			sz += sizeof(Datum);
+		else
+			sz += datumGetSize(value, typByVal, typLen);
+	}
+
+	return sz;
+}
+
+/*-------------------------------------------------------------------------
+ * datumSerialize
+ *
+ * Serialize a possibly-NULL datum into caller-provided storage.
+ *
+ * The format is as follows: first, we write a 4-byte header word, which
+ * is either the length of a pass-by-reference datum, -1 for a
+ * pass-by-value datum, or -2 for a NULL.  If the value is NULL, nothing
+ * further is written.  If it is pass-by-value, sizeof(Datum) bytes
+ * follow.  Otherwise, the number of bytes indicated by the header word
+ * follow.  The caller is responsible for ensuring that there is enough
+ * storage to store the number of bytes that will be written; use
+ * datumEstimateSpace() to find out how many will be needed.
+ * *start_address is updated to point to the byte immediately following
+ * those written.
+ *-------------------------------------------------------------------------
+ */
+void
+datumSerialize(Datum value, bool isnull, bool typByVal, int typLen,
+			   char **start_address)
+{
+	int		header;
+
+	/* Write header word. */
+	if (isnull)
+		header = -2;
+	else if (typByVal)
+		header = -1;
+	else
+		header = datumGetSize(value, typByVal, typLen);
+	memcpy(*start_address, &header, sizeof(int));
+	*start_address += sizeof(int);
+
+	/* If not null, write payload bytes. */
+	if (!isnull)
+	{
+		if (typByVal)
+		{
+			memcpy(*start_address, &value, sizeof(Datum));
+			*start_address += sizeof(Datum);
+		}
+		else
+		{
+			memcpy(*start_address, DatumGetPointer(value), header);
+			*start_address += header;
+		}
+	}
+}
+
+/*-------------------------------------------------------------------------
+ * datumRestore
+ *
+ * Restore a possibly-NULL datum previously serialized by datumSerialize.
+ * *start_address is updated according to the number of bytes consumed.
+ *-------------------------------------------------------------------------
+ */
+Datum
+datumRestore(char **start_address, bool *isnull)
+{
+	int		header;
+	void   *d;
+
+	/* Read header word. */
+	memcpy(&header, *start_address, sizeof(int));
+	*start_address += sizeof(int);
+
+	/* If this datum is NULL, we can stop here. */
+	if (header == -2)
+	{
+		*isnull = true;
+		return (Datum) 0;
+	}
+
+	/* OK, datum is not null. */
+	*isnull = false;
+
+	/* If this datum is pass-by-value, sizeof(Datum) bytes follow. */
+	if (header == -1)
+	{
+		Datum		val;
+
+		memcpy(&val, *start_address, sizeof(Datum));
+		*start_address += sizeof(Datum);
+		return val;
+	}
+
+	/* Pass-by-reference case; copy indicated number of bytes. */
+	Assert(header > 0);
+	d = palloc(header);
+	memcpy(d, *start_address, header);
+	*start_address += header;
+	return PointerGetDatum(d);
+}
diff --git a/src/include/nodes/params.h b/src/include/nodes/params.h
index a0f7dd0..83bebde 100644
--- a/src/include/nodes/params.h
+++ b/src/include/nodes/params.h
@@ -102,5 +102,8 @@ typedef struct ParamExecData
 
 /* Functions found in src/backend/nodes/params.c */
 extern ParamListInfo copyParamList(ParamListInfo from);
+extern Size EstimateParamListSpace(ParamListInfo paramLI);
+extern void SerializeParamList(ParamListInfo paramLI, char **start_address);
+extern ParamListInfo RestoreParamList(char **start_address);
 
 #endif   /* PARAMS_H */
diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h
index c572f79..e9d4be5 100644
--- a/src/include/utils/datum.h
+++ b/src/include/utils/datum.h
@@ -46,4 +46,14 @@ extern Datum datumTransfer(Datum value, bool typByVal, int typLen);
 extern bool datumIsEqual(Datum value1, Datum value2,
 			 bool typByVal, int typLen);
 
+/*
+ * Serialize and restore datums so that we can transfer them to parallel
+ * workers.
+ */
+extern Size datumEstimateSpace(Datum value, bool isnull, bool typByVal,
+				   int typLen);
+extern void datumSerialize(Datum value, bool isnull, bool typByVal,
+			   int typLen, char **start_address);
+extern Datum datumRestore(char **start_address, bool *isnull);
+
 #endif   /* DATUM_H */
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to