Re: [HACKERS] plpgsql versus domains

2015-03-01 Thread Tom Lane
Andrew Gierth and...@tao11.riddles.org.uk writes:
 Tom == Tom Lane t...@sss.pgh.pa.us writes:
  Tom This is the first attempt at weaponizing the memory context
  Tom reset/delete feature, and I'm fairly happy with it, except for one
  Tom thing: I had to #include utils/memnodes.h into typcache.h in order
  Tom to preserve the intended property that the callback structs could
  Tom be included directly into structs using them.  Now, that's not
  Tom awful in itself, because typcache.h isn't used everywhere; but if
  Tom this feature gets popular we'll find memnodes.h being included
  Tom pretty much everywhere, which is not so great.  I'm thinking about
  Tom moving struct MemoryContextCallback and the extern for
  Tom MemoryContextRegisterResetCallback into palloc.h to avoid this.
  Tom Maybe that's too far in the other direction.  Thoughts?
  Tom Compromise ideas?

 This was pretty much my first thought on looking at the callback
 patch.  It may seem logical to put the reset callback stuff in
 memutils/memnodes alongside context creation and reset and so on, but in
 fact the places that are going to want to use callbacks are primarily
 _not_ the places that are doing their own context management - since
 those could do their own cleanup - but rather places that are allocating
 things in contexts supplied by others. So palloc.h is the place for it.

That argument sounds good to me ;-).  Moved.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql versus domains

2015-02-28 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2015-02-26 13:53:18 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 You're probably going to kill me because of the increased complexity,
 but how about making typecache.c smarter, more in the vein of
 relcache.c, and store the relevant info in there?

 I thought that's what I was proposing in point #1.

 Sure, but my second point was to avoid duplicating that information into
 -fcinfo and such and instead reference the typecache entry from
 there. So that if the typecache entry is being rebuilt (a new mechanism
 I'm afraid), whatever uses -fcinfo gets the updated
 functions/definition.

Here's a draft patch for that.  This fixes the delayed-domain-constraint
problem pretty thoroughly, in that any attempt to check a domain's
constraints will use fully up-to-date info.

This is the first attempt at weaponizing the memory context reset/delete
feature, and I'm fairly happy with it, except for one thing: I had to
#include utils/memnodes.h into typcache.h in order to preserve the
intended property that the callback structs could be included directly
into structs using them.  Now, that's not awful in itself, because
typcache.h isn't used everywhere; but if this feature gets popular we'll
find memnodes.h being included pretty much everywhere, which is not so
great.  I'm thinking about moving struct MemoryContextCallback and the
extern for MemoryContextRegisterResetCallback into palloc.h to avoid this.
Maybe that's too far in the other direction.  Thoughts?  Compromise ideas?

Also, instrumenting the code showed that TypeCacheConstrCallback gets
called quite a lot during the standard regression tests, which is why
I went out of my way to make it quick.  Almost all of those cache flushes
are from non-domain-related updates on pg_type or pg_constraint, so
they're not really necessary from a logical perspective, and they're
surely going to hurt performance for heavy users of domains.  I think to
fix this we'd have to split pg_constraint into two catalogs, one for table
constraints and one for domain constraints; which would be a good thing
anyway from a normal-form-theory perspective.  And we'd have to get rid of
pg_type.typnotnull and instead store domain NOT NULL constraints in this
hypothetical domain constraint catalog.  I don't plan to do anything about
that myself right now, because I'm not sure that production databases
would have the kind of traffic on pg_type and pg_constraint that the
regression tests exhibit.  But at some point we might have to fix it.

regards, tom lane

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 07ab4b4..7455020 100644
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
*** ATExecAddColumn(List **wqueue, AlteredTa
*** 4824,4830 
  	{
  		defval = (Expr *) build_column_default(rel, attribute.attnum);
  
! 		if (!defval  GetDomainConstraints(typeOid) != NIL)
  		{
  			Oid			baseTypeId;
  			int32		baseTypeMod;
--- 4824,4830 
  	{
  		defval = (Expr *) build_column_default(rel, attribute.attnum);
  
! 		if (!defval  DomainHasConstraints(typeOid))
  		{
  			Oid			baseTypeId;
  			int32		baseTypeMod;
*** ATColumnChangeRequiresRewrite(Node *expr
*** 7778,7784 
  		{
  			CoerceToDomain *d = (CoerceToDomain *) expr;
  
! 			if (GetDomainConstraints(d-resulttype) != NIL)
  return true;
  			expr = (Node *) d-arg;
  		}
--- 7778,7784 
  		{
  			CoerceToDomain *d = (CoerceToDomain *) expr;
  
! 			if (DomainHasConstraints(d-resulttype))
  return true;
  			expr = (Node *) d-arg;
  		}
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index b77e1b4..d800033 100644
*** a/src/backend/commands/typecmds.c
--- b/src/backend/commands/typecmds.c
***
*** 59,65 
  #include executor/executor.h
  #include miscadmin.h
  #include nodes/makefuncs.h
- #include optimizer/planner.h
  #include optimizer/var.h
  #include parser/parse_coerce.h
  #include parser/parse_collate.h
--- 59,64 
*** domainAddConstraint(Oid domainOid, Oid d
*** 3081,3206 
  	return ccbin;
  }
  
- /*
-  * GetDomainConstraints - get a list of the current constraints of domain
-  *
-  * Returns a possibly-empty list of DomainConstraintState nodes.
-  *
-  * This is called by the executor during plan startup for a CoerceToDomain
-  * expression node.  The given constraints will be checked for each value
-  * passed through the node.
-  *
-  * We allow this to be called for non-domain types, in which case the result
-  * is always NIL.
-  */
- List *
- GetDomainConstraints(Oid typeOid)
- {
- 	List	   *result = NIL;
- 	bool		notNull = false;
- 	Relation	conRel;
- 
- 	conRel = heap_open(ConstraintRelationId, AccessShareLock);
- 
- 	for (;;)
- 	{
- 		HeapTuple	tup;
- 		HeapTuple	conTup;
- 		Form_pg_type typTup;
- 		ScanKeyData key[1];
- 		SysScanDesc scan;
- 
- 		tup = 

Re: [HACKERS] plpgsql versus domains

2015-02-28 Thread Andrew Gierth
 Tom == Tom Lane t...@sss.pgh.pa.us writes:

 Tom This is the first attempt at weaponizing the memory context
 Tom reset/delete feature, and I'm fairly happy with it, except for one
 Tom thing: I had to #include utils/memnodes.h into typcache.h in order
 Tom to preserve the intended property that the callback structs could
 Tom be included directly into structs using them.  Now, that's not
 Tom awful in itself, because typcache.h isn't used everywhere; but if
 Tom this feature gets popular we'll find memnodes.h being included
 Tom pretty much everywhere, which is not so great.  I'm thinking about
 Tom moving struct MemoryContextCallback and the extern for
 Tom MemoryContextRegisterResetCallback into palloc.h to avoid this.
 Tom Maybe that's too far in the other direction.  Thoughts?
 Tom Compromise ideas?

This was pretty much my first thought on looking at the callback
patch.  It may seem logical to put the reset callback stuff in
memutils/memnodes alongside context creation and reset and so on, but in
fact the places that are going to want to use callbacks are primarily
_not_ the places that are doing their own context management - since
those could do their own cleanup - but rather places that are allocating
things in contexts supplied by others. So palloc.h is the place for it.

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql versus domains

2015-02-27 Thread Robert Haas
On Thu, Feb 26, 2015 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 To some extent this is a workaround for the fact that plpgsql does type
 conversions the way it does (ie, by I/O rather than by using the parser's
 cast mechanisms).  We've talked about changing that, but people seem to
 be afraid of the compatibility consequences, and I'm not planning to fight
 two compatibility battles concurrently ;-)

A sensible plan, but since we're here:

- I do agree that changing the way PL/pgsql does those type
conversions is scary.  I can't predict what will break, and there's no
getting around the scariness of that.

- On the other hand, I do think it would be good to change it, because
this sucks:

rhaas=# do $$ declare x int; begin x := (3.0::numeric)/(1.0::numeric); end $$;
ERROR:  invalid input syntax for integer: 3.
CONTEXT:  PL/pgSQL function inline_code_block line 1 at assignment

I didn't realize that this issue had even been discussed before...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql versus domains

2015-02-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Feb 26, 2015 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 To some extent this is a workaround for the fact that plpgsql does type
 conversions the way it does (ie, by I/O rather than by using the parser's
 cast mechanisms).  We've talked about changing that, but people seem to
 be afraid of the compatibility consequences, and I'm not planning to fight
 two compatibility battles concurrently ;-)

 A sensible plan, but since we're here:

 - I do agree that changing the way PL/pgsql does those type
 conversions is scary.  I can't predict what will break, and there's no
 getting around the scariness of that.

 - On the other hand, I do think it would be good to change it, because
 this sucks:

 rhaas=# do $$ declare x int; begin x := (3.0::numeric)/(1.0::numeric); end $$;
 ERROR:  invalid input syntax for integer: 3.
 CONTEXT:  PL/pgSQL function inline_code_block line 1 at assignment

If we did want to open that can of worms, my proposal would be to make
plpgsql type conversions work like so:

* If there is a cast pathway known to the core SQL parser, use that
  mechanism.

* Otherwise, attempt to convert via I/O as we do today.

This seems to minimize the risk of breaking things, although there would
probably be corner cases that work differently (for instance I bet boolean
to text might turn out differently).  In the very long run we could perhaps
deprecate and remove the second phase.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql versus domains

2015-02-27 Thread Pavel Stehule
2015-02-27 21:57 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:

 Robert Haas robertmh...@gmail.com writes:
  On Thu, Feb 26, 2015 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  To some extent this is a workaround for the fact that plpgsql does type
  conversions the way it does (ie, by I/O rather than by using the
 parser's
  cast mechanisms).  We've talked about changing that, but people seem to
  be afraid of the compatibility consequences, and I'm not planning to
 fight
  two compatibility battles concurrently ;-)

  A sensible plan, but since we're here:

  - I do agree that changing the way PL/pgsql does those type
  conversions is scary.  I can't predict what will break, and there's no
  getting around the scariness of that.

  - On the other hand, I do think it would be good to change it, because
  this sucks:

  rhaas=# do $$ declare x int; begin x := (3.0::numeric)/(1.0::numeric);
 end $$;
  ERROR:  invalid input syntax for integer: 3.
  CONTEXT:  PL/pgSQL function inline_code_block line 1 at assignment

 If we did want to open that can of worms, my proposal would be to make
 plpgsql type conversions work like so:

 * If there is a cast pathway known to the core SQL parser, use that
   mechanism.

 * Otherwise, attempt to convert via I/O as we do today.


 This seems to minimize the risk of breaking things, although there would
 probably be corner cases that work differently (for instance I bet boolean
 to text might turn out differently).  In the very long run we could perhaps
 deprecate and remove the second phase.


+1

There can be similar solution like plpgsql/sql identifiers priority
configuration. Some levels - and what you are proposing should be default.

It works perfectly - and from my view and what I know from my neighborhood,
there are no issues.




 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] plpgsql versus domains

2015-02-26 Thread Pavel Stehule
Hi

2015-02-26 18:31 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:

 At the behest of Salesforce, I've been looking into improving plpgsql's
 handling of variables of domain types, which is currently pretty awful.
 It's really slow, because any assignment to a domain variable goes through
 the slow double-I/O-conversion path in exec_cast_value, even if no actual
 work is needed.  And I noticed that the domain's constraints get looked up
 only once per function per session; for example this script gets
 unexpected results:

 create domain di as int;

 create function foo1(int) returns di as $$
 declare d di;
 begin
   d := $1;
   return d;
 end
 $$ language plpgsql immutable;

 select foo1(0); -- call once to set up cache

 alter domain di add constraint pos check (value  0);

 select 0::di;   -- fails, as expected

 select foo1(0); -- should fail, does not

 \c -

 select foo1(0); -- now it fails

 The reason for this is that domain_in looks up the domain's constraints
 and caches them under the calling FmgrInfo struct.  That behavior was
 designed to ensure we'd do just one constraint-lookup per query, which
 I think is reasonable.  But plpgsql sets up an FmgrInfo in the variable's
 PLpgSQL_type record, which persists for the whole session unless the
 function's parse tree is flushed for some reason.  So the constraints
 only get looked up once.

 The rough sketch I have in mind for fixing this is:

 1. Arrange to cache the constraints for domain types in typcache.c,
 so as to reduce the number of trips to the actual catalogs.  I think
 typcache.c will have to flush these cache items upon seeing any sinval
 change in pg_constraint, but that's still cheaper than searching
 pg_constraint for every query.

 2. In plpgsql, explicitly model a domain type as being its base type
 plus some constraints --- that is, stop depending on domain_in() to
 enforce the constraints, and do it ourselves.  This would mean that
 the FmgrInfo in a PLpgSQL_type struct would reference the base type's
 input function rather than domain_in, so we'd not have to be afraid
 to cache that.  The constraints would be represented as a separate list,
 which we'd arrange to fetch from the typcache once per transaction.
 (I think checking for new constraints once per transaction is sufficient
 for reasonable situations, though I suppose that could be argued about.)
 The advantage of this approach is first that we could avoid an I/O
 conversion when the incoming value to be assigned matches the domain's
 base type, which is the typical case; and second that a domain without
 any CHECK constraints would become essentially free, which is also a
 common case.


I like this variant

There can be some good optimization with  scalar types: text, varchars,
numbers and reduce IO cast.




 3. We could still have domains.c responsible for most of the details;
 the domain_check() API may be good enough as-is, though it seems to lack
 any simple way to force re-lookup of the domain constraints once per xact.

 Thoughts, better ideas?

 Given the lack of field complaints, I don't feel a need to try to
 back-patch a fix for this, but I do want to see it fixed for 9.5.


+1

Regards

Pavel



 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] plpgsql versus domains

2015-02-26 Thread Andres Freund
On 2015-02-26 13:53:18 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  Hm. A bit sad to open code that in plpgsql instead of making it
  available more generally.
 
 The actual checking wouldn't be open-coded, it would come from
 domain_check() (or possibly a modified version of that).  It is true
 that plpgsql would know more about domains than it does today, but

It'd still teach plpgsql more about types than it knows right now. But
on a second thought that ship has sailed long ago - and the amount of
knowledge seems relatively small. There's much more stuff about
composites there already.

 and I'm not planning to fight two compatibility battles concurrently ;-)

Heh ;)

  You're probably going to kill me because of the increased complexity,
  but how about making typecache.c smarter, more in the vein of
  relcache.c, and store the relevant info in there?
 
 I thought that's what I was proposing in point #1.

Sure, but my second point was to avoid duplicating that information into
-fcinfo and such and instead reference the typecache entry from
there. So that if the typecache entry is being rebuilt (a new mechanism
I'm afraid), whatever uses -fcinfo gets the updated
functions/definition.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql versus domains

2015-02-26 Thread Andres Freund
Hi,

On 2015-02-26 12:31:46 -0500, Tom Lane wrote:
 It's really slow, because any assignment to a domain variable goes through
 the slow double-I/O-conversion path in exec_cast_value, even if no actual
 work is needed.

Hm, that's surely not nice for some types. Doing syscache lookups every
time can't help either.

 And I noticed that the domain's constraints get looked up
 only once per function per session; for example this script gets
 unexpected results:
 The reason for this is that domain_in looks up the domain's constraints
 and caches them under the calling FmgrInfo struct.

That's probably far from the only such case we have :(

 1. Arrange to cache the constraints for domain types in typcache.c,
 so as to reduce the number of trips to the actual catalogs.  I think
 typcache.c will have to flush these cache items upon seeing any sinval
 change in pg_constraint, but that's still cheaper than searching
 pg_constraint for every query.

That sounds sane.

 2. In plpgsql, explicitly model a domain type as being its base type
 plus some constraints --- that is, stop depending on domain_in() to
 enforce the constraints, and do it ourselves.  This would mean that
 the FmgrInfo in a PLpgSQL_type struct would reference the base type's
 input function rather than domain_in, so we'd not have to be afraid
 to cache that.  The constraints would be represented as a separate list,
 which we'd arrange to fetch from the typcache once per transaction.

Hm. A bit sad to open code that in plpgsql instead of making it
available more generally.

 3. We could still have domains.c responsible for most of the details;
 the domain_check() API may be good enough as-is, though it seems to lack
 any simple way to force re-lookup of the domain constraints once per xact.
 
 Thoughts, better ideas?

You're probably going to kill me because of the increased complexity,
but how about making typecache.c smarter, more in the vein of
relcache.c, and store the relevant info in there? And then, to avoid
repeated lookups, store a reference to that in fcinfo? The lifetime of
objects wouldn't be entirely trivial, but it doesn't sound impossible.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql versus domains

2015-02-26 Thread Pavel Stehule
2015-02-26 19:53 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:

 Andres Freund and...@2ndquadrant.com writes:
  On 2015-02-26 12:31:46 -0500, Tom Lane wrote:
  2. In plpgsql, explicitly model a domain type as being its base type
  plus some constraints --- that is, stop depending on domain_in() to
  enforce the constraints, and do it ourselves.  This would mean that
  the FmgrInfo in a PLpgSQL_type struct would reference the base type's
  input function rather than domain_in, so we'd not have to be afraid
  to cache that.  The constraints would be represented as a separate list,
  which we'd arrange to fetch from the typcache once per transaction.

  Hm. A bit sad to open code that in plpgsql instead of making it
  available more generally.

 The actual checking wouldn't be open-coded, it would come from
 domain_check() (or possibly a modified version of that).  It is true
 that plpgsql would know more about domains than it does today, but
 I'm not sure I see another use-case for this type of logic.

 To some extent this is a workaround for the fact that plpgsql does type
 conversions the way it does (ie, by I/O rather than by using the parser's
 cast mechanisms).  We've talked about changing that, but people seem to
 be afraid of the compatibility consequences, and I'm not planning to fight
 two compatibility battles concurrently ;-)


I understand, but in this case, a compatibility can be enforced simply - we
can support a only know cast mode and  IO cast mode.

IO cast is simple for lot of people, but there is a lot of performance
issues and few errors related to this topic. But this is offtopic now.

But, what can be related - for plpgsql_check is necessary some simple check
of a assigning - that should to return error or warning

This part of plpgsql_check is too complex now.


Regards

Pavel


  You're probably going to kill me because of the increased complexity,
  but how about making typecache.c smarter, more in the vein of
  relcache.c, and store the relevant info in there?

 I thought that's what I was proposing in point #1.

 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] plpgsql versus domains

2015-02-26 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2015-02-26 12:31:46 -0500, Tom Lane wrote:
 2. In plpgsql, explicitly model a domain type as being its base type
 plus some constraints --- that is, stop depending on domain_in() to
 enforce the constraints, and do it ourselves.  This would mean that
 the FmgrInfo in a PLpgSQL_type struct would reference the base type's
 input function rather than domain_in, so we'd not have to be afraid
 to cache that.  The constraints would be represented as a separate list,
 which we'd arrange to fetch from the typcache once per transaction.

 Hm. A bit sad to open code that in plpgsql instead of making it
 available more generally.

The actual checking wouldn't be open-coded, it would come from
domain_check() (or possibly a modified version of that).  It is true
that plpgsql would know more about domains than it does today, but
I'm not sure I see another use-case for this type of logic.

To some extent this is a workaround for the fact that plpgsql does type
conversions the way it does (ie, by I/O rather than by using the parser's
cast mechanisms).  We've talked about changing that, but people seem to
be afraid of the compatibility consequences, and I'm not planning to fight
two compatibility battles concurrently ;-)

 You're probably going to kill me because of the increased complexity,
 but how about making typecache.c smarter, more in the vein of
 relcache.c, and store the relevant info in there?

I thought that's what I was proposing in point #1.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers