Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-03-07 Thread Robert Haas
On Fri, Mar 4, 2016 at 2:37 PM, Andres Freund  wrote:
> On 2016-02-12 15:56:45 +0100, Andres Freund wrote:
>> Hi,
>>
>>
>> On 2016-02-10 23:26:20 -0500, Robert Haas wrote:
>> > I think the part about whacking around the FDW API is a little more
>> > potentially objectionable to others, so I want to hold off doing that
>> > unless a few more people chime in with +1.  Perhaps we could start a
>> > new thread to talk about that specific idea.  This is useful even
>> > without that, though.
>>
>> FWIW, I can delete a couple hundred lines of code from citusdb thanks to
>> this...
>
> And I'm now working on doing that.
>
>
>> why exactly did you expose read/writeBitmapset(), and nothing else?
>> Afaics a lot of the other read routines are also pretty necessary from
>> the outside?
>
> I'd like to also expose at least outDatum()/readDatum() - they're not
> entirely trivial, so it'd be sad to copy them.
>
>
> What I'm wondering about right now is how an extensible node should
> implement the equivalent of
> #define WRITE_NODE_FIELD(fldname) \
> (appendStringInfo(str, " :" CppAsString(fldname) " "), \
>  _outNode(str, node->fldname))
>
> given that _outNode isn't public, that seems to imply having to do
> something like
>
> #define WRITE_NODE_FIELD(fldname) \
> (appendStringInfo(str, " :" CppAsString(fldname) " "), \
>  appendStringInfo(str, nodeToString(node->fldname)))
>
> i.e. essentially doubling memory overhead. Istm we should make
> outNode() externally visible?

I suggest that you write a patch to do whatever you think best and
commit it, with the understanding that future API stability isn't
guaranteed if we decide what you picked is not actually for the best.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-03-04 Thread Andres Freund
On 2016-02-12 15:56:45 +0100, Andres Freund wrote:
> Hi,
> 
> 
> On 2016-02-10 23:26:20 -0500, Robert Haas wrote:
> > I think the part about whacking around the FDW API is a little more
> > potentially objectionable to others, so I want to hold off doing that
> > unless a few more people chime in with +1.  Perhaps we could start a
> > new thread to talk about that specific idea.  This is useful even
> > without that, though.
> 
> FWIW, I can delete a couple hundred lines of code from citusdb thanks to
> this...

And I'm now working on doing that.


> why exactly did you expose read/writeBitmapset(), and nothing else?
> Afaics a lot of the other read routines are also pretty necessary from
> the outside?

I'd like to also expose at least outDatum()/readDatum() - they're not
entirely trivial, so it'd be sad to copy them.


What I'm wondering about right now is how an extensible node should
implement the equivalent of
#define WRITE_NODE_FIELD(fldname) \
(appendStringInfo(str, " :" CppAsString(fldname) " "), \
 _outNode(str, node->fldname))

given that _outNode isn't public, that seems to imply having to do
something like

#define WRITE_NODE_FIELD(fldname) \
(appendStringInfo(str, " :" CppAsString(fldname) " "), \
 appendStringInfo(str, nodeToString(node->fldname)))

i.e. essentially doubling memory overhead. Istm we should make
outNode() externally visible?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-12 Thread Robert Haas
On Thu, Feb 11, 2016 at 7:06 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
>> -Original Message-
>> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
>> Sent: Thursday, February 11, 2016 1:26 PM
>> To: Kaigai Kouhei(海外 浩平)
>> Cc: Andres Freund; Amit Kapila; pgsql-hackers
>> Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan 
>> support
>> on readfuncs.c)
>>
>> On Wed, Feb 10, 2016 at 1:25 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
>> > It is pretty good!
>> >
>> > The attached patch (primary one) implements the above idea.
>> >
>> > Now ExtensibleNode works as a basis structure of data container,
>> > regardless of CustomScan and ForeignScan.
>> > Also, fdw_private and custom_private are de-defined to Node * type
>> > from List * type. It affected to a few FDW APIs.
>>
>> I extracted the subset of this that just creates the extensible node
>> framework and did some cleanup - the result is attached.  I will
>> commit this if nobody objects.
>>
> I have no objection of course.

Done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-12 Thread Andres Freund
Hi,


On 2016-02-10 23:26:20 -0500, Robert Haas wrote:
> I think the part about whacking around the FDW API is a little more
> potentially objectionable to others, so I want to hold off doing that
> unless a few more people chime in with +1.  Perhaps we could start a
> new thread to talk about that specific idea.  This is useful even
> without that, though.

FWIW, I can delete a couple hundred lines of code from citusdb thanks to
this...

A quick questions about what you committed:

> @@ -527,10 +532,17 @@ extern PGDLLIMPORT Node *newNodeMacroHolder;
>   */
>  extern char *nodeToString(const void *obj);
>  
> +struct Bitmapset;/* not to include bitmapset.h here */
> +struct StringInfoData;   /* not to include stringinfo.h here */
> +extern void outToken(struct StringInfoData *str, const char *s);
> +extern void outBitmapset(struct StringInfoData *str,
> +  const struct Bitmapset *bms);
> +
>  /*
>   * nodes/{readfuncs.c,read.c}
>   */
>  extern void *stringToNode(char *str);
> +extern struct Bitmapset *readBitmapset(void);

why exactly did you expose read/writeBitmapset(), and nothing else?
Afaics a lot of the other read routines are also pretty necessary from
the outside?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-12 Thread Kohei KaiGai
2016-02-13 0:11 GMT+09:00 Robert Haas :
> On Fri, Feb 12, 2016 at 9:56 AM, Andres Freund  wrote:
>> On 2016-02-10 23:26:20 -0500, Robert Haas wrote:
>>> I think the part about whacking around the FDW API is a little more
>>> potentially objectionable to others, so I want to hold off doing that
>>> unless a few more people chime in with +1.  Perhaps we could start a
>>> new thread to talk about that specific idea.  This is useful even
>>> without that, though.
>>
>> FWIW, I can delete a couple hundred lines of code from citusdb thanks to
>> this...
>>
>> A quick questions about what you committed:
>>
>>> @@ -527,10 +532,17 @@ extern PGDLLIMPORT Node *newNodeMacroHolder;
>>>   */
>>>  extern char *nodeToString(const void *obj);
>>>
>>> +struct Bitmapset;/* not to include bitmapset.h here */
>>> +struct StringInfoData;   /* not to include stringinfo.h here */
>>> +extern void outToken(struct StringInfoData *str, const char *s);
>>> +extern void outBitmapset(struct StringInfoData *str,
>>> +  const struct Bitmapset *bms);
>>> +
>>>  /*
>>>   * nodes/{readfuncs.c,read.c}
>>>   */
>>>  extern void *stringToNode(char *str);
>>> +extern struct Bitmapset *readBitmapset(void);
>>
>> why exactly did you expose read/writeBitmapset(), and nothing else?
>> Afaics a lot of the other read routines are also pretty necessary from
>> the outside?
>
> Because that's what KaiGai submitted and I had no reason to
> second-guess it.  I'm happy to expose other stuff along similar lines
> if somebody writes a patch for it.
>
Although we have various service macro in outfuncs.c and readfuncs.c,
it is not a big burden for extension authors to write equivalent code,
except for string and bitmap. On the other hands, string needs _outToken,
bitmap needs _outBitmap. It is not good idea to suggest extensions to
have copy & paste code with no clear reason.

Thanks,
-- 
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-11 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> Sent: Thursday, February 11, 2016 1:26 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Andres Freund; Amit Kapila; pgsql-hackers
> Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan 
> support
> on readfuncs.c)
> 
> On Wed, Feb 10, 2016 at 1:25 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> > It is pretty good!
> >
> > The attached patch (primary one) implements the above idea.
> >
> > Now ExtensibleNode works as a basis structure of data container,
> > regardless of CustomScan and ForeignScan.
> > Also, fdw_private and custom_private are de-defined to Node * type
> > from List * type. It affected to a few FDW APIs.
> 
> I extracted the subset of this that just creates the extensible node
> framework and did some cleanup - the result is attached.  I will
> commit this if nobody objects.
>
I have no objection of course.

> I think the part about whacking around the FDW API is a little more
> potentially objectionable to others, so I want to hold off doing that
> unless a few more people chime in with +1.  Perhaps we could start a
> new thread to talk about that specific idea.  This is useful even
> without that, though.
>

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-10 Thread Robert Haas
On Wed, Feb 10, 2016 at 1:25 AM, Kouhei Kaigai  wrote:
> It is pretty good!
>
> The attached patch (primary one) implements the above idea.
>
> Now ExtensibleNode works as a basis structure of data container,
> regardless of CustomScan and ForeignScan.
> Also, fdw_private and custom_private are de-defined to Node * type
> from List * type. It affected to a few FDW APIs.

I extracted the subset of this that just creates the extensible node
framework and did some cleanup - the result is attached.  I will
commit this if nobody objects.

I think the part about whacking around the FDW API is a little more
potentially objectionable to others, so I want to hold off doing that
unless a few more people chime in with +1.  Perhaps we could start a
new thread to talk about that specific idea.  This is useful even
without that, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index fe2e460..0b1e98c 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -13,7 +13,7 @@ top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
 OBJS = nodeFuncs.o nodes.o list.o bitmapset.o tidbitmap.o \
-   copyfuncs.o equalfuncs.o makefuncs.o \
+   copyfuncs.o equalfuncs.o extensible.o makefuncs.o \
outfuncs.o readfuncs.o print.o read.o params.o value.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e54d174..a9e9cc3 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -23,6 +23,7 @@
 #include "postgres.h"
 
 #include "miscadmin.h"
+#include "nodes/extensible.h"
 #include "nodes/plannodes.h"
 #include "nodes/relation.h"
 #include "utils/datum.h"
@@ -4166,6 +4167,27 @@ _copyList(const List *from)
 }
 
 /* 
+ *	extensible.h copy functions
+ * 
+ */
+static ExtensibleNode *
+_copyExtensibleNode(const ExtensibleNode *from)
+{
+	ExtensibleNode	   *newnode;
+	const ExtensibleNodeMethods *methods;
+
+	methods = GetExtensibleNodeMethods(from->extnodename, false);
+	newnode = (ExtensibleNode *) newNode(methods->node_size,
+		 T_ExtensibleNode);
+	COPY_STRING_FIELD(extnodename);
+
+	/* copy the private fields */
+	methods->nodeCopy(newnode, from);
+
+	return newnode;
+}
+
+/* 
  *	value.h copy functions
  * 
  */
@@ -4545,6 +4567,13 @@ copyObject(const void *from)
 			break;
 
 			/*
+			 * EXTENSIBLE NODES
+			 */
+		case T_ExtensibleNode:
+			retval = _copyExtensibleNode(from);
+			break;
+
+			/*
 			 * PARSE NODES
 			 */
 		case T_Query:
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 08ccc0d..b9c3959 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -29,6 +29,7 @@
 
 #include "postgres.h"
 
+#include "nodes/extensible.h"
 #include "nodes/relation.h"
 #include "utils/datum.h"
 
@@ -871,6 +872,25 @@ _equalPlaceHolderInfo(const PlaceHolderInfo *a, const PlaceHolderInfo *b)
 	return true;
 }
 
+/*
+ * Stuff from extensible.h
+ */
+static bool
+_equalExtensibleNode(const ExtensibleNode *a, const ExtensibleNode *b)
+{
+	const ExtensibleNodeMethods  *methods;
+
+	COMPARE_STRING_FIELD(extnodename);
+
+	/* At this point, we know extnodename is the same for both nodes. */
+	methods = GetExtensibleNodeMethods(a->extnodename, false);
+
+	/* compare the private fields */
+	if (!methods->nodeEqual(a, b))
+		return false;
+
+	return true;
+}
 
 /*
  * Stuff from parsenodes.h
@@ -2874,6 +2894,13 @@ equal(const void *a, const void *b)
 			break;
 
 			/*
+			 * EXTENSIBLE NODES
+			 */
+		case T_ExtensibleNode:
+			retval = _equalExtensibleNode(a, b);
+			break;
+
+			/*
 			 * PARSE NODES
 			 */
 		case T_Query:
diff --git a/src/backend/nodes/extensible.c b/src/backend/nodes/extensible.c
new file mode 100644
index 000..8fb4767
--- /dev/null
+++ b/src/backend/nodes/extensible.c
@@ -0,0 +1,92 @@
+/*-
+ *
+ * extensible.c
+ *	  Support for extensible node types
+ *
+ * Loadable modules can define what are in effect new types of nodes using
+ * the routines in this file.  All such nodes are flagged T_ExtensibleNode,
+ * with the extnodename field distinguishing the specific type.  Use
+ * RegisterExtensibleNodeMethods to register a new type of extensible node,
+ * and GetExtensibleNodeMethods to get information about a previously
+ * registered type of extensible node.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  

Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-09 Thread Kouhei Kaigai
> -Original Message-
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Monday, February 08, 2016 11:59 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Andres Freund; Amit Kapila; pgsql-hackers
> Subject: ##freemail## Re: CustomScan in a larger structure (RE: [HACKERS]
> CustomScan support on readfuncs.c)
> 
> On Sun, Feb 7, 2016 at 7:28 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> > The new callbacks of T_ExtensibleNode will replace the necessity to
> > form and deform process of private values, like as:
> >   https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L114
> 
> Yeah.
> 
> > It transforms a bunch of internal data of CustomScan (similar to the
> > extended fields in T_ExtensibleNode) to/from the node functions
> > understandable forms for copy, input and output support.
> > I think it implies you proposition is workable.
> >
> > I'd like to follow this proposition basically.
> > On the other hands, I have two points I want to pay attention.
> >
> > 1. At this moment, it is allowed to define a larger structure that
> > embeds CustomPath and CustomScanState by extension. How do we treat
> > this coding manner in this case? Especially, CustomScanState has no
> > private pointer dereference because it assumes an extension define
> > a larger structure. Of course, we don't need to care about node
> > operations on Path and PlanState nodes, but right now.
> 
> I see no real advantage in letting a CustomPath be larger.  If
> custom_private can include extension-defined node types, that seems
> good enough.  On the other hand, if CustomScanState can be larger,
> that seems fine.   We don't really need any special support for that,
> do we?
>
Yes. Right now, we have no code path that handles PlanState or its
inheritance using node operations. So, it is not a real problem.

> > 2. I intended to replace LibraryName and SymbolName fields from the
> > CustomScanMethods structure by integration of extensible node type.
> > We had to give a pair of these identifiers because custom scan provider
> > has no registration points at this moment. A little concern is extension
> > has to assume a particular filename of itself.
> > But, probably, it shall be a separated discussion. My preference is
> > preliminary registration of custom scan provider by its name, as well
> > as extensible node.
> 
> Seems like we could just leave the CustomScan stuff alone and worry
> about this as a separate facility.
>
OK

> > Towards the last question; whether *_private shall be void * or List *,
> > I want to keep fdw_private and custom_private as List * pointer, because
> > a new node type definition is a bit overdone job if this FDW or CSP will
> > take only a few private fields with primitive data types.
> > It is a preferable features when extension defines ten or more private
> > fields.
> 
> Well, I suggested Node *, not void *.  A Node can be a List, but not
> every Node is a List.
>
It is pretty good!

The attached patch (primary one) implements the above idea.

Now ExtensibleNode works as a basis structure of data container,
regardless of CustomScan and ForeignScan.
Also, fdw_private and custom_private are de-defined to Node * type
from List * type. It affected to a few FDW APIs.

The secondary patch is a demonstration of new ExtensibleNode using
postgres_fdw extension. Its private data are expected to be packed
in a list with a particular order. Self defined structure allows to
keep these variables without ugly pack/unpacking.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>


pgsql-v9.6-custom-private.v5.demo.patch
Description: pgsql-v9.6-custom-private.v5.demo.patch


pgsql-v9.6-custom-private.v5.patch
Description: pgsql-v9.6-custom-private.v5.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-08 Thread Robert Haas
On Sun, Feb 7, 2016 at 7:28 PM, Kouhei Kaigai  wrote:
> The new callbacks of T_ExtensibleNode will replace the necessity to
> form and deform process of private values, like as:
>   https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L114

Yeah.

> It transforms a bunch of internal data of CustomScan (similar to the
> extended fields in T_ExtensibleNode) to/from the node functions
> understandable forms for copy, input and output support.
> I think it implies you proposition is workable.
>
> I'd like to follow this proposition basically.
> On the other hands, I have two points I want to pay attention.
>
> 1. At this moment, it is allowed to define a larger structure that
> embeds CustomPath and CustomScanState by extension. How do we treat
> this coding manner in this case? Especially, CustomScanState has no
> private pointer dereference because it assumes an extension define
> a larger structure. Of course, we don't need to care about node
> operations on Path and PlanState nodes, but right now.

I see no real advantage in letting a CustomPath be larger.  If
custom_private can include extension-defined node types, that seems
good enough.  On the other hand, if CustomScanState can be larger,
that seems fine.   We don't really need any special support for that,
do we?

> 2. I intended to replace LibraryName and SymbolName fields from the
> CustomScanMethods structure by integration of extensible node type.
> We had to give a pair of these identifiers because custom scan provider
> has no registration points at this moment. A little concern is extension
> has to assume a particular filename of itself.
> But, probably, it shall be a separated discussion. My preference is
> preliminary registration of custom scan provider by its name, as well
> as extensible node.

Seems like we could just leave the CustomScan stuff alone and worry
about this as a separate facility.

> Towards the last question; whether *_private shall be void * or List *,
> I want to keep fdw_private and custom_private as List * pointer, because
> a new node type definition is a bit overdone job if this FDW or CSP will
> take only a few private fields with primitive data types.
> It is a preferable features when extension defines ten or more private
> fields.

Well, I suggested Node *, not void *.  A Node can be a List, but not
every Node is a List.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-07 Thread Kouhei Kaigai
> On Wed, Feb 3, 2016 at 11:57 PM, Kouhei Kaigai  wrote:
> > At this moment, I tried to write up description at nodes/nodes.h.
> > The amount of description is about 100lines. It is on a borderline
> > whether we split off this chunk into another header file, in my sense.
> >
> >
> > On the other hands, I noticed that we cannot omit checks for individual
> > callbacks on Custom node type, ExtensibleNodeMethods is embedded in
> > the CustomMethods structure, thus we may have Custom node with
> > no extensible feature.
> > This manner is beneficial because extension does not need to register
> > the library and symbol name for serialization. So, CustomScan related
> > code still checks existence of individual callbacks.
> 
> I was looking over this patch yesterday, and something was bugging me
> about it, but I couldn't quite put my finger on what it was.  But now
> I think I know.
> 
> I think of an extensible node type facility as something that ought to
> be defined to allow a user to create new types of nodes.  But this is
> not that.  What this does is allows you to have a CustomScan or
> ForeignScan node - that is, the existing node type - which is actually
> larger than a normal node of that type and has some extra data that is
> part of it.  I'm having a really hard time being comfortable with that
> concept.  Somehow, it seems like the node type should determine the
> size of the node.  I can stretch my brain to the point of being able
> to say, well, maybe if the node tag is T_ExtensibleNode, then you can
> look at char *extnodename to figure out what type of node it is
> really, and then from there get the size.  But what this does is:
> every time you see a T_CustomScan or T_ForeignScan node, it might not
> really be that kind of node but something else else, and tomorrow
> there might be another half-dozen node types with a similar property.
> And every one of those node types will have char *extnodename in a
> different place in the structure, so a hypothetical piece of code that
> wanted to find the extension methods for a node, or the size of a
> node, would need a switch that knows about all of those node types.
> It feels very awkward.
> 
> So I have a slightly different proposal.  Let's forget about letting
> T_CustomScan or T_ForeignScan or any other built-in node type vary in
> size.  Instead, let's add T_ExtensibleNode which acts as a completely
> user-defined node.  It has read/out/copy/equalfuncs support along the
> lines of what this patch implements, and that's it.  It's not a scan
> or a path or anything like that: it's just an opaque data container
> that the system can input, output, copy, and test for equality, and
> nothing else.  Isn't that useless?  Not at all.  If you're creating an
> FDW or custom scan provider and want to store some extra data, you can
> create a new type of extensible node and stick it in the fdw_private
> or custom_private field!  The data won't be part of the CustomScan or
> ForeignScan structure itself, as in this patch, but who cares? The
> only objection I can see is that you still need several pointer
> deferences to get to the data since fdw_private is a List, but if
> that's actually a real performance problem we could probably fix it by
> just changing fdw_private to a Node instead.  You'd still need one
> pointer dereference, but that doesn't seem too bad.
> 
> Thoughts?
>
The new callbacks of T_ExtensibleNode will replace the necessity to
form and deform process of private values, like as:
  https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L114

It transforms a bunch of internal data of CustomScan (similar to the
extended fields in T_ExtensibleNode) to/from the node functions
understandable forms for copy, input and output support.
I think it implies you proposition is workable.

I'd like to follow this proposition basically.
On the other hands, I have two points I want to pay attention.

1. At this moment, it is allowed to define a larger structure that
embeds CustomPath and CustomScanState by extension. How do we treat
this coding manner in this case? Especially, CustomScanState has no
private pointer dereference because it assumes an extension define
a larger structure. Of course, we don't need to care about node
operations on Path and PlanState nodes, but right now.

2. I intended to replace LibraryName and SymbolName fields from the
CustomScanMethods structure by integration of extensible node type.
We had to give a pair of these identifiers because custom scan provider
has no registration points at this moment. A little concern is extension
has to assume a particular filename of itself.
But, probably, it shall be a separated discussion. My preference is
preliminary registration of custom scan provider by its name, as well
as extensible node.


Towards the last question; whether *_private shall be void * or List *,
I want to keep fdw_private and custom_private as List * pointer, because
a new 

Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-06 Thread Robert Haas
On Wed, Feb 3, 2016 at 11:57 PM, Kouhei Kaigai  wrote:
> At this moment, I tried to write up description at nodes/nodes.h.
> The amount of description is about 100lines. It is on a borderline
> whether we split off this chunk into another header file, in my sense.
>
>
> On the other hands, I noticed that we cannot omit checks for individual
> callbacks on Custom node type, ExtensibleNodeMethods is embedded in
> the CustomMethods structure, thus we may have Custom node with
> no extensible feature.
> This manner is beneficial because extension does not need to register
> the library and symbol name for serialization. So, CustomScan related
> code still checks existence of individual callbacks.

I was looking over this patch yesterday, and something was bugging me
about it, but I couldn't quite put my finger on what it was.  But now
I think I know.

I think of an extensible node type facility as something that ought to
be defined to allow a user to create new types of nodes.  But this is
not that.  What this does is allows you to have a CustomScan or
ForeignScan node - that is, the existing node type - which is actually
larger than a normal node of that type and has some extra data that is
part of it.  I'm having a really hard time being comfortable with that
concept.  Somehow, it seems like the node type should determine the
size of the node.  I can stretch my brain to the point of being able
to say, well, maybe if the node tag is T_ExtensibleNode, then you can
look at char *extnodename to figure out what type of node it is
really, and then from there get the size.  But what this does is:
every time you see a T_CustomScan or T_ForeignScan node, it might not
really be that kind of node but something else else, and tomorrow
there might be another half-dozen node types with a similar property.
And every one of those node types will have char *extnodename in a
different place in the structure, so a hypothetical piece of code that
wanted to find the extension methods for a node, or the size of a
node, would need a switch that knows about all of those node types.
It feels very awkward.

So I have a slightly different proposal.  Let's forget about letting
T_CustomScan or T_ForeignScan or any other built-in node type vary in
size.  Instead, let's add T_ExtensibleNode which acts as a completely
user-defined node.  It has read/out/copy/equalfuncs support along the
lines of what this patch implements, and that's it.  It's not a scan
or a path or anything like that: it's just an opaque data container
that the system can input, output, copy, and test for equality, and
nothing else.  Isn't that useless?  Not at all.  If you're creating an
FDW or custom scan provider and want to store some extra data, you can
create a new type of extensible node and stick it in the fdw_private
or custom_private field!  The data won't be part of the CustomScan or
ForeignScan structure itself, as in this patch, but who cares?  The
only objection I can see is that you still need several pointer
deferences to get to the data since fdw_private is a List, but if
that's actually a real performance problem we could probably fix it by
just changing fdw_private to a Node instead.  You'd still need one
pointer dereference, but that doesn't seem too bad.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-03 Thread Kouhei Kaigai
> On Wed, Jan 27, 2016 at 9:36 PM, Robert Haas  wrote:
> > On Mon, Jan 25, 2016 at 8:06 PM, Kouhei Kaigai  wrote:
> >> Sorry for my late response. I've been unavailable to have enough
> >> time to touch code for the last 1.5 month.
> >>
> >> The attached patch is a revised one to handle private data of
> >> foregn/custom scan node more gracefully.
> >>
> >> The overall consensus upthread were:
> >> - A new ExtensibleNodeMethods structure defines a unique name
> >>   and a set of callbacks to handle node copy, serialization,
> >>   deserialization and equality checks.
> >> - (Foreign|Custom)(Path|Scan|ScanState) are first host of the
> >>   ExtensibleNodeMethods, to allow extension to define larger
> >>   structure to store its private fields.
> >> - ExtensibleNodeMethods does not support variable length
> >>   structure (like a structure with an array on the tail, use
> >>   separately allocated array).
> >> - ExtensibleNodeMethods shall be registered on _PG_init() of
> >>   extensions.
> >>
> >> The 'pgsql-v9.6-custom-private.v3.patch' is the main part of
> >> this feature. As I pointed out before, it uses dynhash instead
> >> of the self invented hash table.
> >
> > On a first read-through, I see nothing in this patch to which I would
> > want to object except for the fact that the comments and documentation
> > need some work from a native speaker of English.  It looks like what
> > we discussed, and I think it's an improvement over what we have now.
> 
> Well, looking at this a bit more, it seems like the documentation
> you've written here is really misplaced.  The patch is introducing a
> new facility that applies to both CustomScan and ForeignScan, but the
> documentation is only to do with CustomScan.  I think we need a whole
> new chapter on extensible nodes, or something.  I'm actually not
> really keen on the fact that we keep adding SGML documentation for
> this stuff; it seems like it belongs in a README in the source tree.
> We don't explain nodes in general, but now we're going to have to try
> to explain extensible nodes.  How's that going to work?
>
The detail of these callbacks are not for end-users, administrators and
so on except for core/extension developers. So, I loves idea not to have
such a detailed description in SGML.
How about an idea to have more detailed source code comments close to
the definition of ExtensibleNodeMethods?
I haven't seen the src/backend/nodes/README yet, and it has only 6 updates
history from Postgres95 era. I guess people may forget to update README
file if description is separately located from the implementation.

> I think you should avoid the call to GetExtensibleNodeMethods() in the
> case where extnodename is NULL.  On the other hand, I think that if
> extnodename is non-NULL, all four methods should be required, so that
> you don't have to check if (methods && methods->nodeRead) but just if
> (extnodename) { methods = GetExtensibleNodeMethods(extnodename);
> methods->nodeRead( ... ); }.  That seems like it would be a bit
> tidier.
>
OK, I'll fix up. No need to have 'missing_ok' argument here.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-03 Thread Robert Haas
On Wed, Feb 3, 2016 at 8:00 PM, Kouhei Kaigai  wrote:
>> Well, looking at this a bit more, it seems like the documentation
>> you've written here is really misplaced.  The patch is introducing a
>> new facility that applies to both CustomScan and ForeignScan, but the
>> documentation is only to do with CustomScan.  I think we need a whole
>> new chapter on extensible nodes, or something.  I'm actually not
>> really keen on the fact that we keep adding SGML documentation for
>> this stuff; it seems like it belongs in a README in the source tree.
>> We don't explain nodes in general, but now we're going to have to try
>> to explain extensible nodes.  How's that going to work?
>>
> The detail of these callbacks are not for end-users, administrators and
> so on except for core/extension developers. So, I loves idea not to have
> such a detailed description in SGML.
> How about an idea to have more detailed source code comments close to
> the definition of ExtensibleNodeMethods?
> I haven't seen the src/backend/nodes/README yet, and it has only 6 updates
> history from Postgres95 era. I guess people may forget to update README
> file if description is separately located from the implementation.

Hmm, that might work, although that file is so old that it may be
difficult to add to.  Another idea is: maybe we could have a header
file for the extensible node stuff and just give it a really long
header comment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-03 Thread Robert Haas
On Wed, Jan 27, 2016 at 9:36 PM, Robert Haas  wrote:
> On Mon, Jan 25, 2016 at 8:06 PM, Kouhei Kaigai  wrote:
>> Sorry for my late response. I've been unavailable to have enough
>> time to touch code for the last 1.5 month.
>>
>> The attached patch is a revised one to handle private data of
>> foregn/custom scan node more gracefully.
>>
>> The overall consensus upthread were:
>> - A new ExtensibleNodeMethods structure defines a unique name
>>   and a set of callbacks to handle node copy, serialization,
>>   deserialization and equality checks.
>> - (Foreign|Custom)(Path|Scan|ScanState) are first host of the
>>   ExtensibleNodeMethods, to allow extension to define larger
>>   structure to store its private fields.
>> - ExtensibleNodeMethods does not support variable length
>>   structure (like a structure with an array on the tail, use
>>   separately allocated array).
>> - ExtensibleNodeMethods shall be registered on _PG_init() of
>>   extensions.
>>
>> The 'pgsql-v9.6-custom-private.v3.patch' is the main part of
>> this feature. As I pointed out before, it uses dynhash instead
>> of the self invented hash table.
>
> On a first read-through, I see nothing in this patch to which I would
> want to object except for the fact that the comments and documentation
> need some work from a native speaker of English.  It looks like what
> we discussed, and I think it's an improvement over what we have now.

Well, looking at this a bit more, it seems like the documentation
you've written here is really misplaced.  The patch is introducing a
new facility that applies to both CustomScan and ForeignScan, but the
documentation is only to do with CustomScan.  I think we need a whole
new chapter on extensible nodes, or something.  I'm actually not
really keen on the fact that we keep adding SGML documentation for
this stuff; it seems like it belongs in a README in the source tree.
We don't explain nodes in general, but now we're going to have to try
to explain extensible nodes.  How's that going to work?

I think you should avoid the call to GetExtensibleNodeMethods() in the
case where extnodename is NULL.  On the other hand, I think that if
extnodename is non-NULL, all four methods should be required, so that
you don't have to check if (methods && methods->nodeRead) but just if
(extnodename) { methods = GetExtensibleNodeMethods(extnodename);
methods->nodeRead( ... ); }.  That seems like it would be a bit
tidier.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-03 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> Sent: Thursday, February 04, 2016 11:39 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Andres Freund; Amit Kapila; pgsql-hackers
> Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan 
> support
> on readfuncs.c)
> 
> On Wed, Feb 3, 2016 at 8:00 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> >> Well, looking at this a bit more, it seems like the documentation
> >> you've written here is really misplaced.  The patch is introducing a
> >> new facility that applies to both CustomScan and ForeignScan, but the
> >> documentation is only to do with CustomScan.  I think we need a whole
> >> new chapter on extensible nodes, or something.  I'm actually not
> >> really keen on the fact that we keep adding SGML documentation for
> >> this stuff; it seems like it belongs in a README in the source tree.
> >> We don't explain nodes in general, but now we're going to have to try
> >> to explain extensible nodes.  How's that going to work?
> >>
> > The detail of these callbacks are not for end-users, administrators and
> > so on except for core/extension developers. So, I loves idea not to have
> > such a detailed description in SGML.
> > How about an idea to have more detailed source code comments close to
> > the definition of ExtensibleNodeMethods?
> > I haven't seen the src/backend/nodes/README yet, and it has only 6 updates
> > history from Postgres95 era. I guess people may forget to update README
> > file if description is separately located from the implementation.
> 
> Hmm, that might work, although that file is so old that it may be
> difficult to add to.  Another idea is: maybe we could have a header
> file for the extensible node stuff and just give it a really long
> header comment.
>
At this moment, I tried to write up description at nodes/nodes.h.
The amount of description is about 100lines. It is on a borderline
whether we split off this chunk into another header file, in my sense.


On the other hands, I noticed that we cannot omit checks for individual
callbacks on Custom node type, ExtensibleNodeMethods is embedded in
the CustomMethods structure, thus we may have Custom node with
no extensible feature.
This manner is beneficial because extension does not need to register
the library and symbol name for serialization. So, CustomScan related
code still checks existence of individual callbacks.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


pgsql-v9.6-custom-private.v4.patch
Description: pgsql-v9.6-custom-private.v4.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-01-28 Thread Kouhei Kaigai
> On Mon, Jan 25, 2016 at 8:06 PM, Kouhei Kaigai  wrote:
> > Sorry for my late response. I've been unavailable to have enough
> > time to touch code for the last 1.5 month.
> >
> > The attached patch is a revised one to handle private data of
> > foregn/custom scan node more gracefully.
> >
> > The overall consensus upthread were:
> > - A new ExtensibleNodeMethods structure defines a unique name
> >   and a set of callbacks to handle node copy, serialization,
> >   deserialization and equality checks.
> > - (Foreign|Custom)(Path|Scan|ScanState) are first host of the
> >   ExtensibleNodeMethods, to allow extension to define larger
> >   structure to store its private fields.
> > - ExtensibleNodeMethods does not support variable length
> >   structure (like a structure with an array on the tail, use
> >   separately allocated array).
> > - ExtensibleNodeMethods shall be registered on _PG_init() of
> >   extensions.
> >
> > The 'pgsql-v9.6-custom-private.v3.patch' is the main part of
> > this feature. As I pointed out before, it uses dynhash instead
> > of the self invented hash table.
> 
> On a first read-through, I see nothing in this patch to which I would
> want to object except for the fact that the comments and documentation
> need some work from a native speaker of English.  It looks like what
> we discussed, and I think it's an improvement over what we have now.
>
Thanks,

Do you think we shall allow to register same extensible node name for
different node types? Like, "GpuJoin" for any of CustomPath, CustomScan
and CustomScanState. Or, do we avoid this using different name for each?

--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-01-28 Thread Robert Haas
On Thu, Jan 28, 2016 at 10:18 PM, Kouhei Kaigai  wrote:
> Do you think we shall allow to register same extensible node name for
> different node types? Like, "GpuJoin" for any of CustomPath, CustomScan
> and CustomScanState. Or, do we avoid this using different name for each?

I'd say a different name for each.  That's our current convention, and
I don't see much reason to change it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-01-28 Thread Kouhei Kaigai
> On Thu, Jan 28, 2016 at 10:18 PM, Kouhei Kaigai  wrote:
> > Do you think we shall allow to register same extensible node name for
> > different node types? Like, "GpuJoin" for any of CustomPath, CustomScan
> > and CustomScanState. Or, do we avoid this using different name for each?
> 
> I'd say a different name for each.  That's our current convention, and
> I don't see much reason to change it.
>
OK, it is not a serious problem, at least, for my use cases.
A convention like "GpuJoinPath", "GpuJoin" and "GpuJoinState" are sufficient.
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-01-27 Thread Robert Haas
On Mon, Jan 25, 2016 at 8:06 PM, Kouhei Kaigai  wrote:
> Sorry for my late response. I've been unavailable to have enough
> time to touch code for the last 1.5 month.
>
> The attached patch is a revised one to handle private data of
> foregn/custom scan node more gracefully.
>
> The overall consensus upthread were:
> - A new ExtensibleNodeMethods structure defines a unique name
>   and a set of callbacks to handle node copy, serialization,
>   deserialization and equality checks.
> - (Foreign|Custom)(Path|Scan|ScanState) are first host of the
>   ExtensibleNodeMethods, to allow extension to define larger
>   structure to store its private fields.
> - ExtensibleNodeMethods does not support variable length
>   structure (like a structure with an array on the tail, use
>   separately allocated array).
> - ExtensibleNodeMethods shall be registered on _PG_init() of
>   extensions.
>
> The 'pgsql-v9.6-custom-private.v3.patch' is the main part of
> this feature. As I pointed out before, it uses dynhash instead
> of the self invented hash table.

On a first read-through, I see nothing in this patch to which I would
want to object except for the fact that the comments and documentation
need some work from a native speaker of English.  It looks like what
we discussed, and I think it's an improvement over what we have now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-01-25 Thread Kouhei Kaigai
Sorry for my late response. I've been unavailable to have enough
time to touch code for the last 1.5 month.

The attached patch is a revised one to handle private data of
foregn/custom scan node more gracefully.

The overall consensus upthread were:
- A new ExtensibleNodeMethods structure defines a unique name
  and a set of callbacks to handle node copy, serialization,
  deserialization and equality checks.
- (Foreign|Custom)(Path|Scan|ScanState) are first host of the
  ExtensibleNodeMethods, to allow extension to define larger
  structure to store its private fields.
- ExtensibleNodeMethods does not support variable length
  structure (like a structure with an array on the tail, use
  separately allocated array).
- ExtensibleNodeMethods shall be registered on _PG_init() of
  extensions.

The 'pgsql-v9.6-custom-private.v3.patch' is the main part of
this feature. As I pointed out before, it uses dynhash instead
of the self invented hash table.

Interfaces are defined as follows (not changed from v2):

  typedef struct ExtensibleNodeMethods
  {
 const char *extnodename;
 Sizenode_size;
 void  (*nodeCopy)(Node *newnode, const Node *oldnode);
 bool  (*nodeEqual)(const Node *a, const Node *b);
 void  (*nodeOut)(struct StringInfoData *str, const Node *node);
 void  (*nodeRead)(Node *node);
  } ExtensibleNodeMethods;
  
  extern void
  RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods);
  
  extern const ExtensibleNodeMethods *
  GetExtensibleNodeMethods(const char *extnodename, bool missing_ok);


Also, 'extensible-node-example-on-pgstrom.patch' is a working
example on its "GpuScan" node.
The code below uses all of copy, serialization and deserialization.

gscan = (GpuScan *)stringToNode(nodeToString(copyObject(cscan)));
elog(INFO, "GpuScan: %s", nodeToString(gscan));

Then, I could confirm private fields are reproduced correctly.

In addition to this, I'd like to suggest two small improvement.

On nodeOut callback, extension will need _outToken() and _outBitmap(),
however, these two functions are static. Entrypoint for extensions
are needed. (Of course, extension can copy & paste these small functions...)

ExtensibleNodeMethods may be registered with a unique pair of its
name and node-tag which is associated with. The current code requires
the name is unique to others, however, it may make a bit inconvenience.
In case of CustomScan, extension need to define three nodes: CustomPath,
CustomScan and CustomScanState, thus, ExtensibleNodeMethods which is
associated with these node must have individually unique name, like
"GpuScanPath", "GpuScan" and "GpuScanState".
If extnodename would be unique within a particular node type, we can
apply same name for all of the three above.

How about your thought?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>

> -Original Message-
> From: Kaigai Kouhei(海外 浩平)
> Sent: Wednesday, December 02, 2015 5:52 PM
> To: 'Robert Haas'
> Cc: Andres Freund; Amit Kapila; pgsql-hackers
> Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan 
> support
> on readfuncs.c)
> 
> > On Thu, Nov 26, 2015 at 5:27 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> > > I'm now implementing. The above design perfectly works on ForeignScan.
> > > On the other hands, I'd like to have deeper consideration for CustomScan.
> > >
> > > My recent patch adds LibraryName and SymbolName on CustomScanMethods
> > > to lookup the method table even if library is not loaded yet.
> > > However, this ExtensibleNodeMethods relies custom scan provider shall
> > > be loaded, by parallel infrastructure, prior to the deserialization.
> > > It means extension has a chance to register itself as well.
> > >
> > > My idea is, redefine CustomScanMethod as follows:
> > >
> > > typedef struct ExtensibleNodeMethods
> > > {
> > > const char *extnodename;
> > > Sizenode_size;
> > > Node *(*nodeCopy)(const Node *from);
> > > bool  (*nodeEqual)(const Node *a, const Node *b);
> > > void  (*nodeOut)(struct StringInfoData *str, const Node *node);
> > > void  (*nodeRead)(Node *node);
> > > } ExtensibleNodeMethods;
> > >
> > > typedef struct CustomScanMethods
> > > {
> > > union {
> > > const char *CustomName;
> > > ExtensibleNodeMethods  xnode;
> > > };
> > > /* Create execution state (CustomScanState) from a CustomScan plan 
> > > node
> > */
> > > Node   *(*CreateCustomScanState) (struct CustomScan *cscan);
> > > } CustomScanMethods;

Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-12-06 Thread Kouhei Kaigai
> On Wed, Dec 2, 2015 at 4:06 AM, Andres Freund  wrote:
> > On 2015-12-02 08:52:20 +, Kouhei Kaigai wrote:
> >> diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
> >> index 26264cb..c4bb76e 100644
> >> --- a/src/backend/nodes/copyfuncs.c
> >> +++ b/src/backend/nodes/copyfuncs.c
> >> @@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
> >>  static ForeignScan *
> >>  _copyForeignScan(const ForeignScan *from)
> >>  {
> >> - ForeignScan *newnode = makeNode(ForeignScan);
> >> -
> >> + const ExtensibleNodeMethods *methods
> >> + = GetExtensibleNodeMethods(from->extnodename, true);
> >> + ForeignScan *newnode = (ForeignScan *) newNode(!methods
> >>
> +
>  ? sizeof(ForeignScan)
> >>
> +
>  : methods->node_size,
> >> +
> T_ForeignScan);
> >
> > Changing the FDW API seems to put this squarely into 9.6
> > territory. Agreed?
> 
> I don't think anybody thought this patch was 9.5 material.  I didn't,
> anyway.  The FDW changes for the join-pushdown-EPQ problem are another
> issue, but that can be discussed on the other thread.
>
I don't expect it is 9.5 feature.
The name of attached patch was "pgsql-v9.6-custom-private.v2.patch".

> >>   /*
> >>* copy node superclass fields
> >>*/
> >> @@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
> >>   /*
> >>* copy remainder of node
> >>*/
> >> + COPY_STRING_FIELD(extnodename);
> >>   COPY_SCALAR_FIELD(fs_server);
> >>   COPY_NODE_FIELD(fdw_exprs);
> >>   COPY_NODE_FIELD(fdw_private);
> >> @@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
> >>   COPY_NODE_FIELD(fdw_recheck_quals);
> >>   COPY_BITMAPSET_FIELD(fs_relids);
> >>   COPY_SCALAR_FIELD(fsSystemCol);
> >> + if (methods && methods->nodeCopy)
> >> + methods->nodeCopy((Node *) newnode, (const Node *) from);
> >
> > I wonder if we shouldn't instead "recurse" into a generic routine for
> > extensible nodes here.
> 
> I don't know what that means.
> 
> >> +void
> >> +RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
> >> +{
> >> + uint32  hash;
> >> + int index;
> >> + ListCell   *lc;
> >> + MemoryContext   oldcxt;
> >> +
> >> + if (!xnodes_methods_slots)
> >> + xnodes_methods_slots =
> MemoryContextAllocZero(TopMemoryContext,
> >> +
> sizeof(List *) * XNODES_NUM_SLOTS);
> >> +
> >> + hash = hash_any(methods->extnodename, strlen(methods->extnodename));
> >> + index = hash % XNODES_NUM_SLOTS;
> >> +
> >> + /* duplication check */
> >> + foreach (lc, xnodes_methods_slots[index])
> >> + {
> >> + const ExtensibleNodeMethods *curr = lfirst(lc);
> >> +
> >> + if (strcmp(methods->extnodename, curr->extnodename) == 0)
> >> + ereport(ERROR,
> >> + (errcode(ERRCODE_DUPLICATE_OBJECT),
> >> +  errmsg("ExtensibleNodeMethods
> \"%s\" already exists",
> >> +
> methods->extnodename)));
> >> + }
> >> + /* ok, register this entry */
> >> + oldcxt = MemoryContextSwitchTo(TopMemoryContext);
> >> + xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
> >> +
> (void *)methods);
> >> + MemoryContextSwitchTo(oldcxt);
> >> +}
> >
> > Why aren't you using dynahash here, and instead reimplement a hashtable?
> 
> I had thought we might assume that the number of extensible nodes was
> small enough that we could use an array for this and just use linear
> search.  But if we want to keep the door open to having enough
> extensible nodes that this would be inefficient, then I agree that
> dynahash is better than a hand-rolled hash table.
>
Hmm. I also expected the number of extensible nodes are not so much,
so self-defined hash table is sufficient. However, it is not strong
reason. Let me revise the hash implementation.

> >>  static ForeignScan *
> >>  _readForeignScan(void)
> >>  {
> >> + const ExtensibleNodeMethods *methods;
> >>   READ_LOCALS(ForeignScan);
> >>
> >>   ReadCommonScan(_node->scan);
> >>
> >> + READ_STRING_FIELD(extnodename);
> >>   READ_OID_FIELD(fs_server);
> >>   READ_NODE_FIELD(fdw_exprs);
> >>   READ_NODE_FIELD(fdw_private);
> >> @@ -1804,6 +1806,19 @@ _readForeignScan(void)
> >>   READ_BITMAPSET_FIELD(fs_relids);
> >>   READ_BOOL_FIELD(fsSystemCol);
> >>
> >> + methods = GetExtensibleNodeMethods(local_node->extnodename, true);
> >> + if (methods && methods->nodeRead)
> >> + {
> >> + ForeignScan*local_temp =
> palloc0(methods->node_size);
> >> +
> >> + Assert(methods->node_size >= sizeof(ForeignScan));
> >> +
> >> + memcpy(local_temp, local_node, sizeof(ForeignScan));
> >> + methods->nodeRead((Node *) local_temp);
> >> +
> >> +   

Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-12-04 Thread Robert Haas
On Wed, Dec 2, 2015 at 4:06 AM, Andres Freund  wrote:
> On 2015-12-02 08:52:20 +, Kouhei Kaigai wrote:
>> diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
>> index 26264cb..c4bb76e 100644
>> --- a/src/backend/nodes/copyfuncs.c
>> +++ b/src/backend/nodes/copyfuncs.c
>> @@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
>>  static ForeignScan *
>>  _copyForeignScan(const ForeignScan *from)
>>  {
>> - ForeignScan *newnode = makeNode(ForeignScan);
>> -
>> + const ExtensibleNodeMethods *methods
>> + = GetExtensibleNodeMethods(from->extnodename, true);
>> + ForeignScan *newnode = (ForeignScan *) newNode(!methods
>> +
>> ? sizeof(ForeignScan)
>> +
>> : methods->node_size,
>> +
>> T_ForeignScan);
>
> Changing the FDW API seems to put this squarely into 9.6
> territory. Agreed?

I don't think anybody thought this patch was 9.5 material.  I didn't,
anyway.  The FDW changes for the join-pushdown-EPQ problem are another
issue, but that can be discussed on the other thread.

>>   /*
>>* copy node superclass fields
>>*/
>> @@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
>>   /*
>>* copy remainder of node
>>*/
>> + COPY_STRING_FIELD(extnodename);
>>   COPY_SCALAR_FIELD(fs_server);
>>   COPY_NODE_FIELD(fdw_exprs);
>>   COPY_NODE_FIELD(fdw_private);
>> @@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
>>   COPY_NODE_FIELD(fdw_recheck_quals);
>>   COPY_BITMAPSET_FIELD(fs_relids);
>>   COPY_SCALAR_FIELD(fsSystemCol);
>> + if (methods && methods->nodeCopy)
>> + methods->nodeCopy((Node *) newnode, (const Node *) from);
>
> I wonder if we shouldn't instead "recurse" into a generic routine for
> extensible nodes here.

I don't know what that means.

>> +void
>> +RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
>> +{
>> + uint32  hash;
>> + int index;
>> + ListCell   *lc;
>> + MemoryContext   oldcxt;
>> +
>> + if (!xnodes_methods_slots)
>> + xnodes_methods_slots = MemoryContextAllocZero(TopMemoryContext,
>> +
>>sizeof(List *) * XNODES_NUM_SLOTS);
>> +
>> + hash = hash_any(methods->extnodename, strlen(methods->extnodename));
>> + index = hash % XNODES_NUM_SLOTS;
>> +
>> + /* duplication check */
>> + foreach (lc, xnodes_methods_slots[index])
>> + {
>> + const ExtensibleNodeMethods *curr = lfirst(lc);
>> +
>> + if (strcmp(methods->extnodename, curr->extnodename) == 0)
>> + ereport(ERROR,
>> + (errcode(ERRCODE_DUPLICATE_OBJECT),
>> +  errmsg("ExtensibleNodeMethods \"%s\" 
>> already exists",
>> + 
>> methods->extnodename)));
>> + }
>> + /* ok, register this entry */
>> + oldcxt = MemoryContextSwitchTo(TopMemoryContext);
>> + xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
>> +
>>(void *)methods);
>> + MemoryContextSwitchTo(oldcxt);
>> +}
>
> Why aren't you using dynahash here, and instead reimplement a hashtable?

I had thought we might assume that the number of extensible nodes was
small enough that we could use an array for this and just use linear
search.  But if we want to keep the door open to having enough
extensible nodes that this would be inefficient, then I agree that
dynahash is better than a hand-rolled hash table.

>>  static ForeignScan *
>>  _readForeignScan(void)
>>  {
>> + const ExtensibleNodeMethods *methods;
>>   READ_LOCALS(ForeignScan);
>>
>>   ReadCommonScan(_node->scan);
>>
>> + READ_STRING_FIELD(extnodename);
>>   READ_OID_FIELD(fs_server);
>>   READ_NODE_FIELD(fdw_exprs);
>>   READ_NODE_FIELD(fdw_private);
>> @@ -1804,6 +1806,19 @@ _readForeignScan(void)
>>   READ_BITMAPSET_FIELD(fs_relids);
>>   READ_BOOL_FIELD(fsSystemCol);
>>
>> + methods = GetExtensibleNodeMethods(local_node->extnodename, true);
>> + if (methods && methods->nodeRead)
>> + {
>> + ForeignScan*local_temp = palloc0(methods->node_size);
>> +
>> + Assert(methods->node_size >= sizeof(ForeignScan));
>> +
>> + memcpy(local_temp, local_node, sizeof(ForeignScan));
>> + methods->nodeRead((Node *) local_temp);
>> +
>> + pfree(local_node);
>> + local_node = 

Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-12-02 Thread Kouhei Kaigai
> On Thu, Nov 26, 2015 at 5:27 AM, Kouhei Kaigai  wrote:
> > I'm now implementing. The above design perfectly works on ForeignScan.
> > On the other hands, I'd like to have deeper consideration for CustomScan.
> >
> > My recent patch adds LibraryName and SymbolName on CustomScanMethods
> > to lookup the method table even if library is not loaded yet.
> > However, this ExtensibleNodeMethods relies custom scan provider shall
> > be loaded, by parallel infrastructure, prior to the deserialization.
> > It means extension has a chance to register itself as well.
> >
> > My idea is, redefine CustomScanMethod as follows:
> >
> > typedef struct ExtensibleNodeMethods
> > {
> > const char *extnodename;
> > Sizenode_size;
> > Node *(*nodeCopy)(const Node *from);
> > bool  (*nodeEqual)(const Node *a, const Node *b);
> > void  (*nodeOut)(struct StringInfoData *str, const Node *node);
> > void  (*nodeRead)(Node *node);
> > } ExtensibleNodeMethods;
> >
> > typedef struct CustomScanMethods
> > {
> > union {
> > const char *CustomName;
> > ExtensibleNodeMethods  xnode;
> > };
> > /* Create execution state (CustomScanState) from a CustomScan plan node
> */
> > Node   *(*CreateCustomScanState) (struct CustomScan *cscan);
> > } CustomScanMethods;
> >
> > It allows to pull CustomScanMethods using GetExtensibleNodeMethods
> > by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
> > Thus, we don't need to care about LibraryName and SymbolName.
> >
> > This kind of trick is not needed for ForeignScan because all the pointer
> > stuff are reproducible by catalog, however, method table of CustomScan
> > is not.
> >
> > How about your opinion?
> 
> Anonymous unions are not C89, so this is a no-go.
> 
> I think we should just drop CustomName and replace it with
> ExtensibleNodeMethods.  That will break things for third-party code
> that is using the existing CustomScan stuff, but there's a good chance
> that nobody other than you has written any such code.  And even if
> someone has, updating it for this change will not be very difficult.
>
Thanks, I have same opinion.
At this moment, CustomName is not utilized so much except for EXPLAIN
and debug output, so it is not a hard stuff to replace this field by
extnodename, even if custom scan provider does not have own structure
thus no callbacks of node operations are defined.

The attached patch adds 'extnodename' fields on ForeignPath and
ForeignScan, also embeds ExtensibleNodeMethods in CustomPathMethods,
CustomScanMethods and CustomExecMethods.

Its handlers in copyfuncs.c were enhanced to utilize the callback
to allocate a larger structure and copy private fields.
Handlers in outfuncs.c and readfuncs.c have to be symmetric. The
core backend writes out 'extnodename' prior to any private fields,
then we can identify the callback to process rest of tokens for
private fields.

RegisterExtensibleNodeMethods() tracks a pair of 'extnodename'
and ExtensibleNodeMethods itself. It saves the pointer of the
method table, but not duplicate, because _readCustomScan()
cast the method pulled by 'extnodename' thus registered table
has to be a static variable on extension; that means extension
never update ExtensibleNodeMethods once registered.

The one other patch is my test using PG-Strom, however, I didn't
have a test on FDW extensions yet.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


pgstrom-custom-private.test.patch
Description: pgstrom-custom-private.test.patch


pgsql-v9.6-custom-private.v2.patch
Description: pgsql-v9.6-custom-private.v2.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-12-02 Thread Andres Freund
On 2015-12-02 08:52:20 +, Kouhei Kaigai wrote:
> diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
> index 26264cb..c4bb76e 100644
> --- a/src/backend/nodes/copyfuncs.c
> +++ b/src/backend/nodes/copyfuncs.c
> @@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
>  static ForeignScan *
>  _copyForeignScan(const ForeignScan *from)
>  {
> - ForeignScan *newnode = makeNode(ForeignScan);
> -
> + const ExtensibleNodeMethods *methods
> + = GetExtensibleNodeMethods(from->extnodename, true);
> + ForeignScan *newnode = (ForeignScan *) newNode(!methods
> + 
>? sizeof(ForeignScan)
> + 
>: methods->node_size,
> + 
>T_ForeignScan);

Changing the FDW API seems to put this squarely into 9.6
territory. Agreed?

>   /*
>* copy node superclass fields
>*/
> @@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
>   /*
>* copy remainder of node
>*/
> + COPY_STRING_FIELD(extnodename);
>   COPY_SCALAR_FIELD(fs_server);
>   COPY_NODE_FIELD(fdw_exprs);
>   COPY_NODE_FIELD(fdw_private);
> @@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
>   COPY_NODE_FIELD(fdw_recheck_quals);
>   COPY_BITMAPSET_FIELD(fs_relids);
>   COPY_SCALAR_FIELD(fsSystemCol);
> + if (methods && methods->nodeCopy)
> + methods->nodeCopy((Node *) newnode, (const Node *) from);

I wonder if we shouldn't instead "recurse" into a generic routine for
extensible nodes here.
> +void
> +RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
> +{
> + uint32  hash;
> + int index;
> + ListCell   *lc;
> + MemoryContext   oldcxt;
> +
> + if (!xnodes_methods_slots)
> + xnodes_methods_slots = MemoryContextAllocZero(TopMemoryContext,
> + 
>   sizeof(List *) * XNODES_NUM_SLOTS);
> +
> + hash = hash_any(methods->extnodename, strlen(methods->extnodename));
> + index = hash % XNODES_NUM_SLOTS;
> +
> + /* duplication check */
> + foreach (lc, xnodes_methods_slots[index])
> + {
> + const ExtensibleNodeMethods *curr = lfirst(lc);
> +
> + if (strcmp(methods->extnodename, curr->extnodename) == 0)
> + ereport(ERROR,
> + (errcode(ERRCODE_DUPLICATE_OBJECT),
> +  errmsg("ExtensibleNodeMethods \"%s\" 
> already exists",
> + methods->extnodename)));
> + }
> + /* ok, register this entry */
> + oldcxt = MemoryContextSwitchTo(TopMemoryContext);
> + xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
> + 
>   (void *)methods);
> + MemoryContextSwitchTo(oldcxt);
> +}

Why aren't you using dynahash here, and instead reimplement a hashtable?


>  static ForeignScan *
>  _readForeignScan(void)
>  {
> + const ExtensibleNodeMethods *methods;
>   READ_LOCALS(ForeignScan);
>  
>   ReadCommonScan(_node->scan);
>  
> + READ_STRING_FIELD(extnodename);
>   READ_OID_FIELD(fs_server);
>   READ_NODE_FIELD(fdw_exprs);
>   READ_NODE_FIELD(fdw_private);
> @@ -1804,6 +1806,19 @@ _readForeignScan(void)
>   READ_BITMAPSET_FIELD(fs_relids);
>   READ_BOOL_FIELD(fsSystemCol);
>  
> + methods = GetExtensibleNodeMethods(local_node->extnodename, true);
> + if (methods && methods->nodeRead)
> + {
> + ForeignScan*local_temp = palloc0(methods->node_size);
> +
> + Assert(methods->node_size >= sizeof(ForeignScan));
> +
> + memcpy(local_temp, local_node, sizeof(ForeignScan));
> + methods->nodeRead((Node *) local_temp);
> +
> + pfree(local_node);
> + local_node = local_temp;
> + }
>   READ_DONE();
>  }

This imo pretty clearly shows why the 'extensible node' handling should
be delegated to separate routines.


I wonder if we shouldn't, before committing this, at least write a PoC
implementation of actual extensible nodes. I.e. new node types that are
registered at runtime by extensions. ISTM those are relatively closely
linked, and the above doesn't yet support them nicely afaics.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-30 Thread Robert Haas
On Thu, Nov 26, 2015 at 5:27 AM, Kouhei Kaigai  wrote:
> I'm now implementing. The above design perfectly works on ForeignScan.
> On the other hands, I'd like to have deeper consideration for CustomScan.
>
> My recent patch adds LibraryName and SymbolName on CustomScanMethods
> to lookup the method table even if library is not loaded yet.
> However, this ExtensibleNodeMethods relies custom scan provider shall
> be loaded, by parallel infrastructure, prior to the deserialization.
> It means extension has a chance to register itself as well.
>
> My idea is, redefine CustomScanMethod as follows:
>
> typedef struct ExtensibleNodeMethods
> {
> const char *extnodename;
> Sizenode_size;
> Node *(*nodeCopy)(const Node *from);
> bool  (*nodeEqual)(const Node *a, const Node *b);
> void  (*nodeOut)(struct StringInfoData *str, const Node *node);
> void  (*nodeRead)(Node *node);
> } ExtensibleNodeMethods;
>
> typedef struct CustomScanMethods
> {
> union {
> const char *CustomName;
> ExtensibleNodeMethods  xnode;
> };
> /* Create execution state (CustomScanState) from a CustomScan plan node */
> Node   *(*CreateCustomScanState) (struct CustomScan *cscan);
> } CustomScanMethods;
>
> It allows to pull CustomScanMethods using GetExtensibleNodeMethods
> by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
> Thus, we don't need to care about LibraryName and SymbolName.
>
> This kind of trick is not needed for ForeignScan because all the pointer
> stuff are reproducible by catalog, however, method table of CustomScan
> is not.
>
> How about your opinion?

Anonymous unions are not C89, so this is a no-go.

I think we should just drop CustomName and replace it with
ExtensibleNodeMethods.  That will break things for third-party code
that is using the existing CustomScan stuff, but there's a good chance
that nobody other than you has written any such code.  And even if
someone has, updating it for this change will not be very difficult.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-26 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
> Sent: Saturday, November 21, 2015 8:05 AM
> To: Robert Haas
> Cc: Andres Freund; Amit Kapila; pgsql-hackers
> Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan 
> support
> on readfuncs.c)
> 
> > On Thu, Nov 19, 2015 at 10:11 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> 
> > wrote:
> > > The above two points are for the case if and when extension want to use
> > > variable length fields for its private fields.
> > > So, nodeAlloc() callback is not a perfect answer for the use case because
> > > length of the variable length fields shall be (usually) determined by the
> > > value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
> > > we cannot determine the correct length before read.
> > >
> > > Let's assume an custom-scan extension that wants to have:
> > >
> > >   typedef struct {
> > >   CustomScancscan;
> > >   int   priv_value_1;
> > >   long  priv_value_2;
> > >   extra_info_t  extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
> > >   /* its length equal to number of sub-plans */
> > >   } ExampleScan;
> > >
> > > The "extnodename" within CustomScan allows to pull callback functions
> > > to handle read node from the serialized format.
> > > However, nodeAlloc() callback cannot determine the correct length
> > > (= number of sub-plans in this example) prior to read 'cscan' part.
> > >
> > > So, I'd like to suggest _readCustomScan (and other extendable nodes
> > > also) read tokens on local CustomScan variable once, then call
> > >   Node *(nodeRead)(Node *local_node)
> > > to allocate entire ExampleScan node and read other private fields.
> > >
> > > The local_node is already filled up by the core backend prior to
> > > the callback invocation, so extension can know how many sub-plans
> > > are expected thus how many private tokens shall appear.
> > > It also means extension can know exact length of the ExampleScan
> > > node, so it can allocate the node as expected then fill up
> > > remaining private tokens.
> >
> > On second thought, I think we should insist that nodes have to be
> > fixed-size.  This sounds like a mess.  If the node needs a variable
> > amount of storage for something, it can store a pointer to a
> > separately-allocated array someplace inside of it.  That's what
> > existing nodes do, and it works fine.
> >
> OK, let's have "node_size" instead of nodeAlloc callback and other
> stuffs.
> 
> Please wait for a patch.
>
I'm now implementing. The above design perfectly works on ForeignScan.
On the other hands, I'd like to have deeper consideration for CustomScan.

My recent patch adds LibraryName and SymbolName on CustomScanMethods
to lookup the method table even if library is not loaded yet.
However, this ExtensibleNodeMethods relies custom scan provider shall
be loaded, by parallel infrastructure, prior to the deserialization.
It means extension has a chance to register itself as well.

My idea is, redefine CustomScanMethod as follows:

typedef struct ExtensibleNodeMethods
{
const char *extnodename;
Sizenode_size;
Node *(*nodeCopy)(const Node *from);
bool  (*nodeEqual)(const Node *a, const Node *b);
void  (*nodeOut)(struct StringInfoData *str, const Node *node);
void  (*nodeRead)(Node *node);
} ExtensibleNodeMethods;

typedef struct CustomScanMethods
{
union {
const char *CustomName;
ExtensibleNodeMethods  xnode;
};
/* Create execution state (CustomScanState) from a CustomScan plan node */
Node   *(*CreateCustomScanState) (struct CustomScan *cscan);
} CustomScanMethods;

It allows to pull CustomScanMethods using GetExtensibleNodeMethods
by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
Thus, we don't need to care about LibraryName and SymbolName.

This kind of trick is not needed for ForeignScan because all the pointer
stuff are reproducible by catalog, however, method table of CustomScan
is not.

How about your opinion?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-20 Thread Robert Haas
On Thu, Nov 19, 2015 at 10:11 PM, Kouhei Kaigai  wrote:
> The above two points are for the case if and when extension want to use
> variable length fields for its private fields.
> So, nodeAlloc() callback is not a perfect answer for the use case because
> length of the variable length fields shall be (usually) determined by the
> value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
> we cannot determine the correct length before read.
>
> Let's assume an custom-scan extension that wants to have:
>
>   typedef struct {
>   CustomScancscan;
>   int   priv_value_1;
>   long  priv_value_2;
>   extra_info_t  extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
>   /* its length equal to number of sub-plans */
>   } ExampleScan;
>
> The "extnodename" within CustomScan allows to pull callback functions
> to handle read node from the serialized format.
> However, nodeAlloc() callback cannot determine the correct length
> (= number of sub-plans in this example) prior to read 'cscan' part.
>
> So, I'd like to suggest _readCustomScan (and other extendable nodes
> also) read tokens on local CustomScan variable once, then call
>   Node *(nodeRead)(Node *local_node)
> to allocate entire ExampleScan node and read other private fields.
>
> The local_node is already filled up by the core backend prior to
> the callback invocation, so extension can know how many sub-plans
> are expected thus how many private tokens shall appear.
> It also means extension can know exact length of the ExampleScan
> node, so it can allocate the node as expected then fill up
> remaining private tokens.

On second thought, I think we should insist that nodes have to be
fixed-size.  This sounds like a mess.  If the node needs a variable
amount of storage for something, it can store a pointer to a
separately-allocated array someplace inside of it.  That's what
existing nodes do, and it works fine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-20 Thread Kouhei Kaigai
> On Thu, Nov 19, 2015 at 10:11 PM, Kouhei Kaigai  wrote:
> > The above two points are for the case if and when extension want to use
> > variable length fields for its private fields.
> > So, nodeAlloc() callback is not a perfect answer for the use case because
> > length of the variable length fields shall be (usually) determined by the
> > value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
> > we cannot determine the correct length before read.
> >
> > Let's assume an custom-scan extension that wants to have:
> >
> >   typedef struct {
> >   CustomScancscan;
> >   int   priv_value_1;
> >   long  priv_value_2;
> >   extra_info_t  extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
> >   /* its length equal to number of sub-plans */
> >   } ExampleScan;
> >
> > The "extnodename" within CustomScan allows to pull callback functions
> > to handle read node from the serialized format.
> > However, nodeAlloc() callback cannot determine the correct length
> > (= number of sub-plans in this example) prior to read 'cscan' part.
> >
> > So, I'd like to suggest _readCustomScan (and other extendable nodes
> > also) read tokens on local CustomScan variable once, then call
> >   Node *(nodeRead)(Node *local_node)
> > to allocate entire ExampleScan node and read other private fields.
> >
> > The local_node is already filled up by the core backend prior to
> > the callback invocation, so extension can know how many sub-plans
> > are expected thus how many private tokens shall appear.
> > It also means extension can know exact length of the ExampleScan
> > node, so it can allocate the node as expected then fill up
> > remaining private tokens.
> 
> On second thought, I think we should insist that nodes have to be
> fixed-size.  This sounds like a mess.  If the node needs a variable
> amount of storage for something, it can store a pointer to a
> separately-allocated array someplace inside of it.  That's what
> existing nodes do, and it works fine.
>
OK, let's have "node_size" instead of nodeAlloc callback and other
stuffs.

Please wait for a patch.
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-19 Thread Kouhei Kaigai
> On Mon, Nov 16, 2015 at 9:03 PM, Kouhei Kaigai  wrote:
> > I guess we will put a pointer to static ExtensibleNodeMethods structure
> > on ForeignScan, CustomScan, CustomScanState and etc...
> 
> I think that makes it confusing.  What I'd prefer to do is have only
> the name stored in the node, and then people who need methods can look
> up those methods based on the name.
>
OK. It is equivalent.

> > Let me write down the steps for deserialization. Please correct me if it is
> > not what you are thinking.
> > First, stringToNode picks up a token that shall have node label, like
> > "SEQSCAN", "CUSTOMSCAN" and others. Once we can identify the following
> > tokens belong to CustomScan node, it can make advance the pg_strtok pointer
> > until "methods" field. The method field is consists of a pair of 
> > library_name
> > and symbol_name, then it tries to load the library and resolve the symbol.
> 
> No.  It reads the "extnodename" field (or whatever we decide to call
> it) and calls GetExtensibleNodeMethods() on that name.  That name
> returns the appropriate structure.  C libraries that get loaded into
> the server can register their extensible node types at _PG_init()
> time.
>
OK, I got.

> > Even if library was not loaded on the worker side, this step allows to 
> > ensure
> > the library gets loaded, thus, PG_init() can RegisterExtensibleNodeMethods.
> 
> A parallel worker will load the same loadable modules which the master
> has got loaded.  If you use some other kind of background worker, of
> course, you're on your own.
>
Sorry, I oversight it. And SerializeLibraryState() and RestoreLibraryState()
indeed allow background workers even if out of parallel context to restore
the library state easily.

> > I have two concerns on the above proposition.
> > 1) 'node_size' field implies user defined structure to have fixed length.
> >Extension may want to use variable length structure. The example below
> >shows GpuJoinState has multiple innerState structure according to the
> >number of inner relations joined at once.
> >https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L240
> 
> OK, we can replace the node_size field with an allocator callback:
> Node (*nodeAlloc)(void) or something like that.
> 
> > 2) It may be valuable if 'nodeRead(void)' can take an argument of the
> >knows/common portion of ForeignScan and etc... It allows extension
> >to adjust number of private fields according to the known part.
> >For example, I can expect flexible length private fields according
> >to the length of custom_subplans list.
> 
> I don't understand this bit.
>
The above two points are for the case if and when extension want to use
variable length fields for its private fields.
So, nodeAlloc() callback is not a perfect answer for the use case because
length of the variable length fields shall be (usually) determined by the
value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
we cannot determine the correct length before read.

Let's assume an custom-scan extension that wants to have:

  typedef struct {
  CustomScancscan;
  int   priv_value_1;
  long  priv_value_2;
  extra_info_t  extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
  /* its length equal to number of sub-plans */
  } ExampleScan;

The "extnodename" within CustomScan allows to pull callback functions
to handle read node from the serialized format.
However, nodeAlloc() callback cannot determine the correct length
(= number of sub-plans in this example) prior to read 'cscan' part.

So, I'd like to suggest _readCustomScan (and other extendable nodes
also) read tokens on local CustomScan variable once, then call
  Node *(nodeRead)(Node *local_node)
to allocate entire ExampleScan node and read other private fields.

The local_node is already filled up by the core backend prior to
the callback invocation, so extension can know how many sub-plans
are expected thus how many private tokens shall appear.
It also means extension can know exact length of the ExampleScan
node, so it can allocate the node as expected then fill up
remaining private tokens.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-18 Thread Robert Haas
On Mon, Nov 16, 2015 at 9:03 PM, Kouhei Kaigai  wrote:
> I guess we will put a pointer to static ExtensibleNodeMethods structure
> on ForeignScan, CustomScan, CustomScanState and etc...

I think that makes it confusing.  What I'd prefer to do is have only
the name stored in the node, and then people who need methods can look
up those methods based on the name.

> Let me write down the steps for deserialization. Please correct me if it is
> not what you are thinking.
> First, stringToNode picks up a token that shall have node label, like
> "SEQSCAN", "CUSTOMSCAN" and others. Once we can identify the following
> tokens belong to CustomScan node, it can make advance the pg_strtok pointer
> until "methods" field. The method field is consists of a pair of library_name
> and symbol_name, then it tries to load the library and resolve the symbol.

No.  It reads the "extnodename" field (or whatever we decide to call
it) and calls GetExtensibleNodeMethods() on that name.  That name
returns the appropriate structure.  C libraries that get loaded into
the server can register their extensible node types at _PG_init()
time.

> Even if library was not loaded on the worker side, this step allows to ensure
> the library gets loaded, thus, PG_init() can RegisterExtensibleNodeMethods.

A parallel worker will load the same loadable modules which the master
has got loaded.  If you use some other kind of background worker, of
course, you're on your own.

> I have two concerns on the above proposition.
> 1) 'node_size' field implies user defined structure to have fixed length.
>Extension may want to use variable length structure. The example below
>shows GpuJoinState has multiple innerState structure according to the
>number of inner relations joined at once.
>https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L240

OK, we can replace the node_size field with an allocator callback:
Node (*nodeAlloc)(void) or something like that.

> 2) It may be valuable if 'nodeRead(void)' can take an argument of the
>knows/common portion of ForeignScan and etc... It allows extension
>to adjust number of private fields according to the known part.
>For example, I can expect flexible length private fields according
>to the length of custom_subplans list.

I don't understand this bit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-16 Thread Robert Haas
On Wed, Nov 11, 2015 at 11:13 PM, Kouhei Kaigai  wrote:
> I agree with we have no reason why only custom-scan is allowed to have
> serialize/deserialize capability. I can implement an equivalent stuff
> for foreign-scan also, and it is helpful for extension authors, especially,
> who tries to implement remote join feature because FDW driver has to
> keep larger number of private fields than scan.
>
> Also, what is the reason why we allow extensions to define a larger
> structure which contains CustomPath or CustomScanState? It seems to
> me that these types are not (fully) supported by the current copyfuncs,
> outfuncs and readfuncs, aren't it?
> (Although outfuncs.c supports path-nodes, thus CustomPathMethods has
> TextOut callback but no copy/read handler at this moment.)

I feel like we're sort of plunging blindly down a path of trying to
make certain parts of the system extensible and pluggable without
really having a clear vision of where we want to end up.  Somehow I
feel like we ought to start by thinking about a framework for *node*
extensibility; then, within that, we can talk about what that means
for particular types of nodes.

For example, suppose do something like this:

typedef struct
{
   NodeTag tag;
   char *extnodename;
} ExtensibleNode;

typedef stuct
{
char *extnodename;
char *library_name;
char *function_name;
Size node_size;
ExtensibleNode *(*nodeCopy)(ExtensibleNode *);
bool (*nodeEqual)(ExtensibleNode *, ExtensibleNode *);
void (*nodeOut)(StringInfo str, ExtensibleNode *);
ExtensibleNode (*nodeRead)(void);
} ExtensibleNodeMethods;

extern void RegisterExtensibleNodeMethods(ExtensibleNodeMethods *);
extern ExtensibleNodeMethods *GetExtensibleNodeMethods(char *extnodename);

This provides a generic infrastructure so that we don't have to keep
inventing the same stuff over and over, rather than coming up with one
way to do it for ForeignScans and another way for CustomScan and
another way for CustomScanState.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-16 Thread Kouhei Kaigai
> On Wed, Nov 11, 2015 at 11:13 PM, Kouhei Kaigai  wrote:
> > I agree with we have no reason why only custom-scan is allowed to have
> > serialize/deserialize capability. I can implement an equivalent stuff
> > for foreign-scan also, and it is helpful for extension authors, especially,
> > who tries to implement remote join feature because FDW driver has to
> > keep larger number of private fields than scan.
> >
> > Also, what is the reason why we allow extensions to define a larger
> > structure which contains CustomPath or CustomScanState? It seems to
> > me that these types are not (fully) supported by the current copyfuncs,
> > outfuncs and readfuncs, aren't it?
> > (Although outfuncs.c supports path-nodes, thus CustomPathMethods has
> > TextOut callback but no copy/read handler at this moment.)
> 
> I feel like we're sort of plunging blindly down a path of trying to
> make certain parts of the system extensible and pluggable without
> really having a clear vision of where we want to end up.  Somehow I
> feel like we ought to start by thinking about a framework for *node*
> extensibility; then, within that, we can talk about what that means
> for particular types of nodes.
> 
> For example, suppose do something like this:
> 
> typedef struct
> {
>NodeTag tag;
>char *extnodename;
> } ExtensibleNode;
> 
> typedef stuct
> {
> char *extnodename;
> char *library_name;
> char *function_name;
> Size node_size;
> ExtensibleNode *(*nodeCopy)(ExtensibleNode *);
> bool (*nodeEqual)(ExtensibleNode *, ExtensibleNode *);
> void (*nodeOut)(StringInfo str, ExtensibleNode *);
> ExtensibleNode (*nodeRead)(void);
> } ExtensibleNodeMethods;
> 
> extern void RegisterExtensibleNodeMethods(ExtensibleNodeMethods *);
> extern ExtensibleNodeMethods *GetExtensibleNodeMethods(char *extnodename);
> 
> This provides a generic infrastructure so that we don't have to keep
> inventing the same stuff over and over, rather than coming up with one
> way to do it for ForeignScans and another way for CustomScan and
> another way for CustomScanState.
>
Wow! I love the idea.

I guess we will put a pointer to static ExtensibleNodeMethods structure
on ForeignScan, CustomScan, CustomScanState and etc...

Like:
  typedef struct ForeignScan
  {
  Scanscan;
  Oid fs_server;  /* OID of foreign server */
  List   *fdw_exprs;  /* expressions that FDW may evaluate */
  List   *fdw_private;/* private data for FDW */
  List   *fdw_scan_tlist; /* optional tlist describing scan tuple */
  List   *fdw_recheck_quals;  /* original quals not in scan.plan.qual */
  Bitmapset  *fs_relids;  /* RTIs generated by this scan */
  boolfsSystemCol;/* true if any "system column" is needed */
+ const ExtensibleNodeMethods *methods;
  } ForeignScan;

We may be able to put ExtensibleNodeMethods on the head of CustomScanMethods
structure in case of CustomScan.

Let me write down the steps for deserialization. Please correct me if it is
not what you are thinking.
First, stringToNode picks up a token that shall have node label, like
"SEQSCAN", "CUSTOMSCAN" and others. Once we can identify the following
tokens belong to CustomScan node, it can make advance the pg_strtok pointer
until "methods" field. The method field is consists of a pair of library_name
and symbol_name, then it tries to load the library and resolve the symbol.
Even if library was not loaded on the worker side, this step allows to ensure
the library gets loaded, thus, PG_init() can RegisterExtensibleNodeMethods.
Next, it looks up the table of extensible nodes by a pair of extnodename,
library_name and symbol_name, to get ExtensibleNodeMethods table in this
process address space.
Once it can get nodeRead() callback, we can continue deserialization of
private fields.

I have two concerns on the above proposition.
1) 'node_size' field implies user defined structure to have fixed length.
   Extension may want to use variable length structure. The example below
   shows GpuJoinState has multiple innerState structure according to the
   number of inner relations joined at once.
   https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L240

2) It may be valuable if 'nodeRead(void)' can take an argument of the
   knows/common portion of ForeignScan and etc... It allows extension
   to adjust number of private fields according to the known part.
   For example, I can expect flexible length private fields according
   to the length of custom_subplans list.

How about your thought?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-12 Thread Kouhei Kaigai
> > On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund  wrote:
> > > On 2015-11-11 14:59:33 -0500, Robert Haas wrote:
> > >> I don't see this as being a particularly good idea.  The same issue
> > >> exists for FDWs, and we're just living with it in that case.
> > >
> > > It's absolutely horrible there. I don't see why that's a justification
> > > for much.  To deal with the lack of extensible copy/out/readfuncs I've
> > > just had to copy the entirety of readfuncs.c into an extension. Or you
> > > build replacements for those (as e.g. postgres_fdw essentially has
> > > done).
> > >
> > >> If we do want to improve it, I'm not sure this is the way to go,
> > >> either.  I think there could be other designs where we focus on making
> > >> the serialization and deserialization options better, rather than
> > >> letting people tack stuff onto the struct.
> > >
> > > Just better serialization doesn't actually help all that much. Being
> > > able to conveniently access data directly, i.e. as fields in a struct,
> > > makes code rather more readable. And in many cases more
> > > efficient. Having to serialize internal datastructures unconditionally,
> > > just so copyfuncs.c works if actually used, makes for a fair amount of
> > > inefficiency (forced deserialization, even when not copying) and uglier
> > > code.
> >
> > Fair enough, but I'm still not very interested in having something for
> > CustomScan only.
> >
> I agree with we have no reason why only custom-scan is allowed to have
> serialize/deserialize capability. I can implement an equivalent stuff
> for foreign-scan also, and it is helpful for extension authors, especially,
> who tries to implement remote join feature because FDW driver has to
> keep larger number of private fields than scan.
>
I tried to make a patch to support serialization/deserialization on both
of ForeignScan and CustomScan, but have not tested yet.

One exceptional stuff in ForeignScan is ForeignPath in outfuncs.c, because
ForeignPath itself does not have identifier to get callback functions (it
is kept in RelOptInfo; that is sufficient in planning phase), thus, we have
no way to write out if ForeignPath is a part of larger structure.
We ought to ignore it at this point. How about your opinion?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


pgsql-v9.6-foreign-custom-serialization.v1.patch
Description: pgsql-v9.6-foreign-custom-serialization.v1.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-11 Thread Kouhei Kaigai
> On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund  wrote:
> > On 2015-11-11 14:59:33 -0500, Robert Haas wrote:
> >> I don't see this as being a particularly good idea.  The same issue
> >> exists for FDWs, and we're just living with it in that case.
> >
> > It's absolutely horrible there. I don't see why that's a justification
> > for much.  To deal with the lack of extensible copy/out/readfuncs I've
> > just had to copy the entirety of readfuncs.c into an extension. Or you
> > build replacements for those (as e.g. postgres_fdw essentially has
> > done).
> >
> >> If we do want to improve it, I'm not sure this is the way to go,
> >> either.  I think there could be other designs where we focus on making
> >> the serialization and deserialization options better, rather than
> >> letting people tack stuff onto the struct.
> >
> > Just better serialization doesn't actually help all that much. Being
> > able to conveniently access data directly, i.e. as fields in a struct,
> > makes code rather more readable. And in many cases more
> > efficient. Having to serialize internal datastructures unconditionally,
> > just so copyfuncs.c works if actually used, makes for a fair amount of
> > inefficiency (forced deserialization, even when not copying) and uglier
> > code.
> 
> Fair enough, but I'm still not very interested in having something for
> CustomScan only.
>
I agree with we have no reason why only custom-scan is allowed to have
serialize/deserialize capability. I can implement an equivalent stuff
for foreign-scan also, and it is helpful for extension authors, especially,
who tries to implement remote join feature because FDW driver has to
keep larger number of private fields than scan.

Also, what is the reason why we allow extensions to define a larger
structure which contains CustomPath or CustomScanState? It seems to
me that these types are not (fully) supported by the current copyfuncs,
outfuncs and readfuncs, aren't it?
(Although outfuncs.c supports path-nodes, thus CustomPathMethods has
TextOut callback but no copy/read handler at this moment.)

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-11 Thread Kouhei Kaigai
> Robert Haas  writes:
> > On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund  wrote:
> >> Just better serialization doesn't actually help all that much. Being
> >> able to conveniently access data directly, i.e. as fields in a struct,
> >> makes code rather more readable. And in many cases more
> >> efficient. Having to serialize internal datastructures unconditionally,
> >> just so copyfuncs.c works if actually used, makes for a fair amount of
> >> inefficiency (forced deserialization, even when not copying) and uglier
> >> code.
> 
> > Fair enough, but I'm still not very interested in having something for
> > CustomScan only.
> 
> I'm with Robert here.  The fact is that postgres_fdw and other FDWs are
> living just fine with the restrictions we put in to allow copyfuncs.c to
> copy ForeignScan, and there's been no sufficient justification offered
> as to why CustomScan can't play by those rules.
>
So, I'd like to propose copy/out/read callbacks for both of ForeignScan
and CustomScan.

> Yes, it would be nice to be able to use a more-tailored struct definition,
> but it's not *necessary*.  We should not contort the core system
> enormously in order to provide a bit more notational cleanliness to
> extensions.
> 
> I'd be fine with fixing this if a nice solution were to present itself,
> but so far nothing's been offered that wasn't horrid.
> 
> BTW, the "inefficiency" argument is junk, unless you provide some hard
> evidence of a case where it hurts a lot.  If we were forcing people to
> use serialized representations at execution time, it would be meaningful,
> but we're not; you can convert as needed at executor setup time.
>
Indeed, it is a "nice to have" feature. We can survive without luxury
goods but more expensive tax.

Once an extension tries to implement own join logic, it shall have
much larger number of private fields than usual scan logic.
Andres got a shock when he looked at GpuJoin code of PG-Strom before
because of its pack/unpack routine at:
  https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L112
The custom_private/fdw_private have to be safe to copyObject(), thus,
we have to pack/unpack private variables to/from a List structure.
When we design serialization/deserialization of Plan nodes, it means
ForeignScan and CustomScan has to pay double cost for this.

I never say it is a "necessary" feature, but "very nice to have".

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-11 Thread Robert Haas
On Mon, Nov 9, 2015 at 10:26 PM, Kouhei Kaigai  wrote:
> It is a relevant topic of readfuncs support for custom-scan.
>
> Unlike CustomPath and CustomScanState, we don't allow custom-scan
> provider to define own and larger structure that embeds CustomScan
> at head of the structure, to support copyObject().
> Thus, custom-scan provider needs to have own logic to pack and
> unpack private fields, like:
>   https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L112
>
> At present, only create_customscan_plan() and _copyCustomScan()
> can produce CustomScan node.
> The create_customscan_plan() invokes PlanCustomPath callback,
> thus, CSP can return a pointer of CustomScan field within
> a larger structure. So, we don't adjust this interface.
> On the other hands, _copyCustomScan() allocates a new node using
> makeNode(CustomScan), thus, here is no space for the larger
> structure if CSP wants to use to keep private values without
> packing and unpacking.
> Only CSP knows exact size of the larger structure, so all we can
> do is ask CSP to allocate a new node and copy its private fields.
> This patch newly adds NodeCopyCustomScan callback to support
> copyObject.
>
> Also, v9.6 will support nodeToString/stringToNode on plan nodes.
> So, we need to re-define the role of TextOutCustomScan callback
> that is also defined at v9.5.
> If CSP extends the CustomScan to save private values on the extra
> area, only CSP knows what values and which data types are stored,
> thus, all we can do is ask CSP to serialize and deserialize its
> private fields without inconsistency.
> Even though TextOutCustomScan callback shall be once ripped out
> to support readfuncs.c, a pair of TextOut and TextRead callback
> will re-define its responsibility again; when CSP defines
> a larger structure that embeds CustomScan, a pair of these
> callbacks are used to serialize/deserialize the entire custom
> defined objects.

I don't see this as being a particularly good idea.  The same issue
exists for FDWs, and we're just living with it in that case.  There
was talk previously of improving it, and maybe some day somebody will,
but I can't get excited about it.  Writing serialization and
deserialization code is annoying but completely insufficient, in my
mind, to justify the hassle of creating a system for this that will be
used by very, very little code.

If we do want to improve it, I'm not sure this is the way to go,
either.  I think there could be other designs where we focus on making
the serialization and deserialization options better, rather than
letting people tack stuff onto the struct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-11 Thread Andres Freund
On 2015-11-11 14:59:33 -0500, Robert Haas wrote:
> I don't see this as being a particularly good idea.  The same issue
> exists for FDWs, and we're just living with it in that case.

It's absolutely horrible there. I don't see why that's a justification
for much.  To deal with the lack of extensible copy/out/readfuncs I've
just had to copy the entirety of readfuncs.c into an extension. Or you
build replacements for those (as e.g. postgres_fdw essentially has
done).

> If we do want to improve it, I'm not sure this is the way to go,
> either.  I think there could be other designs where we focus on making
> the serialization and deserialization options better, rather than
> letting people tack stuff onto the struct.

Just better serialization doesn't actually help all that much. Being
able to conveniently access data directly, i.e. as fields in a struct,
makes code rather more readable. And in many cases more
efficient. Having to serialize internal datastructures unconditionally,
just so copyfuncs.c works if actually used, makes for a fair amount of
inefficiency (forced deserialization, even when not copying) and uglier
code.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-11 Thread Tom Lane
Robert Haas  writes:
> On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund  wrote:
>> Just better serialization doesn't actually help all that much. Being
>> able to conveniently access data directly, i.e. as fields in a struct,
>> makes code rather more readable. And in many cases more
>> efficient. Having to serialize internal datastructures unconditionally,
>> just so copyfuncs.c works if actually used, makes for a fair amount of
>> inefficiency (forced deserialization, even when not copying) and uglier
>> code.

> Fair enough, but I'm still not very interested in having something for
> CustomScan only.

I'm with Robert here.  The fact is that postgres_fdw and other FDWs are
living just fine with the restrictions we put in to allow copyfuncs.c to
copy ForeignScan, and there's been no sufficient justification offered
as to why CustomScan can't play by those rules.

Yes, it would be nice to be able to use a more-tailored struct definition,
but it's not *necessary*.  We should not contort the core system
enormously in order to provide a bit more notational cleanliness to
extensions.

I'd be fine with fixing this if a nice solution were to present itself,
but so far nothing's been offered that wasn't horrid.

BTW, the "inefficiency" argument is junk, unless you provide some hard
evidence of a case where it hurts a lot.  If we were forcing people to
use serialized representations at execution time, it would be meaningful,
but we're not; you can convert as needed at executor setup time.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-11 Thread Robert Haas
On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund  wrote:
> On 2015-11-11 14:59:33 -0500, Robert Haas wrote:
>> I don't see this as being a particularly good idea.  The same issue
>> exists for FDWs, and we're just living with it in that case.
>
> It's absolutely horrible there. I don't see why that's a justification
> for much.  To deal with the lack of extensible copy/out/readfuncs I've
> just had to copy the entirety of readfuncs.c into an extension. Or you
> build replacements for those (as e.g. postgres_fdw essentially has
> done).
>
>> If we do want to improve it, I'm not sure this is the way to go,
>> either.  I think there could be other designs where we focus on making
>> the serialization and deserialization options better, rather than
>> letting people tack stuff onto the struct.
>
> Just better serialization doesn't actually help all that much. Being
> able to conveniently access data directly, i.e. as fields in a struct,
> makes code rather more readable. And in many cases more
> efficient. Having to serialize internal datastructures unconditionally,
> just so copyfuncs.c works if actually used, makes for a fair amount of
> inefficiency (forced deserialization, even when not copying) and uglier
> code.

Fair enough, but I'm still not very interested in having something for
CustomScan only.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-09 Thread Kouhei Kaigai
It is a relevant topic of readfuncs support for custom-scan.

Unlike CustomPath and CustomScanState, we don't allow custom-scan
provider to define own and larger structure that embeds CustomScan
at head of the structure, to support copyObject().
Thus, custom-scan provider needs to have own logic to pack and
unpack private fields, like:
  https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L112

At present, only create_customscan_plan() and _copyCustomScan()
can produce CustomScan node.
The create_customscan_plan() invokes PlanCustomPath callback,
thus, CSP can return a pointer of CustomScan field within
a larger structure. So, we don't adjust this interface.
On the other hands, _copyCustomScan() allocates a new node using
makeNode(CustomScan), thus, here is no space for the larger
structure if CSP wants to use to keep private values without
packing and unpacking.
Only CSP knows exact size of the larger structure, so all we can
do is ask CSP to allocate a new node and copy its private fields.
This patch newly adds NodeCopyCustomScan callback to support
copyObject.

Also, v9.6 will support nodeToString/stringToNode on plan nodes.
So, we need to re-define the role of TextOutCustomScan callback
that is also defined at v9.5.
If CSP extends the CustomScan to save private values on the extra
area, only CSP knows what values and which data types are stored,
thus, all we can do is ask CSP to serialize and deserialize its
private fields without inconsistency.
Even though TextOutCustomScan callback shall be once ripped out
to support readfuncs.c, a pair of TextOut and TextRead callback
will re-define its responsibility again; when CSP defines
a larger structure that embeds CustomScan, a pair of these
callbacks are used to serialize/deserialize the entire custom
defined objects.

The attached patches are for v9.5 and v9.6. The v9.6 patch
assumes the patch for readfuncs support is already applied.
The v9.5 patch assumes the latest master.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
> Sent: Tuesday, November 10, 2015 11:47 AM
> To: Robert Haas
> Cc: Amit Kapila; Andres Freund; pgsql-hackers
> Subject: Re: [HACKERS] CustomScan support on readfuncs.c
> 
> > On Sat, Nov 7, 2015 at 6:38 AM, Kouhei Kaigai  wrote:
> > > Are you suggesting to pass the object name on software build time?
> >
> > Yes.  That's what test_shm_mq and worker_spi already do:
> >
> > sprintf(worker.bgw_library_name, "test_shm_mq");
> >
> OK, I ripped out the portion around dfmgr.c.
> 
> > > If so, it may load incorrect libraries when DBA renamed its filename.
> > > At least, PostgreSQL cannot prevent DBA to rename library filenames
> > > even if they try to deploy "pg_strom.so", "pg_strom.20151106.so" and
> > > "pg_strom.20151030_bug.so" on same directory.
> >
> > I agree.  But that's not a new problem that needs to be solved by this
> > patch.  It already exists - see above.
> >
> It still may be a potential problem, even though a corner case.
> I'll try to implement an internal API to know "what is my name".
> 
> The attached main patch (custom-scan-on-readfuncs.v3.patch) once
> removes TextOutCustomScan, thus all the serialized tokens become
> known to the core backend, and add _readCustomScan() at readfuncs.c.
> It deserializes CustomScan nodes from cstring tokens, especially,
> reloads the pointer of CustomScanMethods tables using a pair of
> library name and symbol name.
> Thus, it also requires custom scan providers to keep the methods
> table visible from external objects.
> 
> Thanks,
> --
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei 



custom-scan-on-copyfuncs-9.6devel.v3.patch
Description: custom-scan-on-copyfuncs-9.6devel.v3.patch


custom-scan-on-copyfuncs-9.5beta.v3.patch
Description: custom-scan-on-copyfuncs-9.5beta.v3.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers