Re: [HACKERS] pl/perl and utf-8 in sql_ascii databases
> > Hello, I've been stuck in the mud trying to make plperl work in a Windows
> > environment. I saw many messages complaining that plperl couldn't be
> > built to work. For the convenience of those people and myself, I describe
> > below the process I used for building PostgreSQL with plperl on Windows
> > with cygwin and VC++.
>
> Hrm, I don't develop on Windows here, but out of curiosity, what were the
> messages like?

My memory of that has already grown faint. As far as I remember, I saw two patterns of crashes.

One is caused by gcc-4's stack protector in cygperl5_10.dll. It caused a crash on CREATE FUNCTION (or CREATE LANGUAGE). Building PostgreSQL with gcc-4 did not help for me. Finally, I gave up on using the pre-installed DLL and built everything, including Perl, with GCC-3 to make it work.

The other is 0xC0000005 (Access Violation) on 'create language plperl' for VC10(:-p) vs ActivePerl 5.14. This happened at ERRSV in plperl_(un)trusted_init(). Replacing ERRSV with get_sv("@", FALSE) suppressed that (but I also don't know if it really works), but finally I got an error "didn't get a CODE reference" from compiling the function on create function .. language plperl, which was the sign of a dead end for me. I settled on using ActivePerl 5.12 and VC8 at last.

I suppose this is a kind of so-called DLL HELL related to memory allocation. ActivePerl 5.12 links against the system's msvcrt.dll, but VC links its output with msvcrxx.dll. MS says memory allocation/deallocation across a DLL boundary can cause crashes. But I don't know why the pair of AP5.12 and VC8 succeeds.

http://msdn.microsoft.com/en-us/library/ms235460.aspx

badalex> The remainder of the patch which fixes the easily fixable leaks
badalex> of palloc'ed memory won't be ported into 9.1. This is only for
badalex> 9.3dev.
badalex>
badalex> What should I do for this?
badalex>
badalex> Just let the committer decide? :-)

Agreed.

-- Kyotaro Horiguchi NTT Open Source Software Center == My e-mail address has been changed since Apr. 1, 2012.
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06.07.2012 00:54, Jan Urbański wrote:
> On 05/07/12 23:30, Peter Eisentraut wrote:
>> On tor, 2012-07-05 at 22:53 +0200, Jan Urbański wrote:
>>> The problem is that PLyUnicode_Bytes is (via an ifdef) used as
>>> PyString_ToString on Python 3, which means that there are numerous
>>> call sites and new ones might appear at any moment. I'm not that keen
>>> on invoking the traceback machinery on low-level encoding errors.
>>
>> Why not?
>
> Because it can lead to recursion errors, like the one this patch was
> supposed to fix. The traceback machinery calls into the encoding
> functions, because it converts Python strings (like function names)
> into C strings.

In the backend elog routines, there is a global variable 'recursion_depth', which is incremented when an error-handling routine is entered, and decremented afterwards. Can we use a similar mechanism in PLy_elog() to detect and stop recursion?

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 10:05, Heikki Linnakangas wrote:
> On 06.07.2012 00:54, Jan Urbański wrote:
>> On 05/07/12 23:30, Peter Eisentraut wrote:
>>> On tor, 2012-07-05 at 22:53 +0200, Jan Urbański wrote:
>>>> The problem is that PLyUnicode_Bytes is (via an ifdef) used as
>>>> PyString_ToString on Python 3, which means that there are numerous
>>>> call sites and new ones might appear at any moment. I'm not that
>>>> keen on invoking the traceback machinery on low-level encoding
>>>> errors.
>>>
>>> Why not?
>>
>> Because it can lead to recursion errors, like the one this patch was
>> supposed to fix. The traceback machinery calls into the encoding
>> functions, because it converts Python strings (like function names)
>> into C strings.
>
> In the backend elog routines, there is a global variable
> 'recursion_depth', which is incremented when an error-handling routine
> is entered, and decremented afterwards. Can we use a similar mechanism
> in PLy_elog() to detect and stop recursion?

I guess we can. I'll try to do some tests in order to see if there's an easy user-triggerable way of causing PLy_elog to recurse, and if not then a guard like this should be enough as a safety measure against as yet unknown conditions (as opposed to something we expect to happen regularly). Cheers, Jan
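The recursion-depth guard being discussed can be illustrated in isolation. Below is a toy model in plain Python (not the actual elog.c or PLy_elog code; all names are stand-ins): an error reporter whose formatting step may itself fail stops recursing once a fixed depth is exceeded, emitting a bare message instead.

```python
# Toy model of the backend's 'recursion_depth' guard: the depth counter
# is incremented on entry to the error reporter and decremented on
# exit; past the limit, a bare message is emitted and no further
# formatting (which could fail again) is attempted.

RECURSION_LIMIT = 2
recursion_depth = 0
log = []

def report_error(msg, formatting_failures):
    global recursion_depth
    recursion_depth += 1
    try:
        if recursion_depth > RECURSION_LIMIT:
            # Too deep: fall back to a bare message and stop recursing.
            log.append("bare: " + msg)
            return
        log.append("error: " + msg)
        # Simulate traceback formatting that itself fails repeatedly,
        # re-entering the error reporter.
        if formatting_failures:
            report_error("failure while formatting", formatting_failures - 1)
    finally:
        recursion_depth -= 1
```

Calling `report_error("boom", 5)` produces three log entries instead of an unbounded chain, and the depth counter returns to zero even on the early-return path.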
Re: [HACKERS] obsolete copyright notice
On 07/06/2012 02:13 AM, Bruce Momjian wrote:
> On Thu, Jul 05, 2012 at 10:10:08PM +0200, Antonin Houska wrote:
>> I found out by chance that the \copyright command still contains 2011.
>> Perhaps the documentation of new year's day changes needs to be
>> updated, if such a thing exists.
>> What version of Postgres is this? 9.2beta1
>
> We don't update the copyright for minor releases,

I didn't know about this rule. I just saw the copyright updated in the source files, so I thought this had been forgotten. Sorry for the noise,

Tony H.
Re: [HACKERS] Covering Indexes
Hi all,

On Thu, Jun 28, 2012 at 5:16 AM, David E. Wheeler da...@justatheory.com wrote:
> I don't see the virtue of this in this case. Since the index is not
> unique, why not just put the index on (a,b,c,d) and be done with it? Is
> there some advantage to be had in inventing a way to store c and d in
> the index without having them usable for indexing?

Why not restrict it to UNIQUE indexes? For non-unique indexes it has no advantages (it could in fact be indexed on all columns anyway as an implementation detail). For the unique case the problem of many identical entries mentioned by Tom is not relevant, so the additional data will only bloat the index but not otherwise affect index performance.

Would this get close enough to an index-covered table? _That_ would be interesting - I have a really big table (table/index size: 370G/320G, ~8*10^9 rows) which is basically using double the space because its primary key covers all columns of the table.

Cheers, Csaba.
Re: [HACKERS] Incorrect behaviour when using a GiST index on points
Yes, it's a bug and it needs to be applied!

On Tue, Jul 3, 2012 at 7:44 PM, Robert Haas robertmh...@gmail.com wrote:
> On Tue, Jul 3, 2012 at 11:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Robert Haas robertmh...@gmail.com writes:
>>> On Thu, Jun 21, 2012 at 2:53 PM, Alexander Korotkov
>>> aekorot...@gmail.com wrote:
>>>> I think we definitely should apply this patch before the 9.2
>>>> release, because it is a bug fix. Otherwise people will continue to
>>>> produce incorrect GiST indexes with in-core geometrical opclasses
>>>> until 9.3. The patch is very simple and only changes a few lines of
>>>> code. Any thoughts?
>>>
>>> Do we need to apply this patch to 9.2?
>>
>> It's been like that all along, no?
>
> Yeah, but it seems an awful lot like a bug. In fact... it's hard to
> imagine how it could be any more of a bug than this.
>
> -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Oracle porting sample instr function
On Mon, Jul 2, 2012 at 8:51 PM, Greg Smith g...@2ndquadrant.com wrote:
> <snip>
> The INSTR implementation in the docs will instead search backwards
> from the end of the string if you tell it to start at 0, same as if
> you gave it a negative input. I think it's therefore possible to get
> the plpgsql version to return a value in cases where Oracle would
> instead return 0. Seems like a straightforward thing to confirm and
> change the sample to do differently; just have to add an explicit test
> for a 0 value of beg_index.

I wrote that sample eons ago with the plpgsql-porting doc. I probably overlooked the 0 behavior. Thanks for the report, Greg, and thanks Albe for providing a patch.

Roberto
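The fix being discussed, an explicit test for a 0 starting position, can be sketched compactly. This is a loose Python transcription of the semantics as described in the thread (1-based positions; 0 finds nothing; a negative position searches backwards), not the documentation's plpgsql sample itself:

```python
def instr(string, substring, beg_index=1):
    """Oracle-style INSTR sketch: 1-based result, 0 when not found."""
    if beg_index == 0:
        # The proposed fix: Oracle reportedly returns 0 for a starting
        # position of 0, whereas the sample treated 0 like a negative
        # index and searched backwards.
        return 0
    if beg_index > 0:
        pos = string.find(substring, beg_index - 1)
        return pos + 1 if pos >= 0 else 0
    # Negative: search backwards, starting |beg_index| characters from
    # the end; the match must begin at or before that position.
    start = len(string) + beg_index
    pos = string.rfind(substring, 0, start + len(substring))
    return pos + 1 if pos >= 0 else 0
```

For example, `instr('CORPORATE FLOOR', 'OR', 3)` and `instr('CORPORATE FLOOR', 'OR', -3)` both yield 5, while a starting position of 0 yields 0 rather than falling into the backwards-search branch.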
Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order
On Thu, Jul 5, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> You may in fact need a new field --- I'm just saying it should be in
> the object-type-specific struct, eg FuncInfo, not DumpableObject.

I suggest adding char *funcsig to FuncInfo, and moving the funcsig = format_function_arguments(finfo, funciargs) code from dumpFunc to getFuncs. Because dumpFunc is called after sortDumpableObjectsByTypeName, setting funcsig in the FuncInfo struct in dumpFunc wouldn't work, as it needs to be available when entering sortDumpableObjectsByTypeName. What do you think?
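The ordering argument can be modeled in a few lines. This is a toy Python model of the situation, not the real pg_dump code (the names mirror pg_dump's getFuncs/sort phases but the structures are simplified stand-ins): the tie-breaker key must be computed at collection time, because the sort runs before the dump phase that previously built the signature.

```python
def get_funcs(raw):
    # Collection phase (cf. getFuncs): precompute the signature for
    # each function, so it exists before any sorting happens.
    return [{"name": name, "funcsig": "%s(%s)" % (name, ", ".join(args))}
            for name, args in raw]

def sort_dumpable_objects(funcs):
    # Sort phase (cf. sortDumpableObjectsByTypeName): deterministic
    # order for overloaded functions by falling back to the
    # precomputed signature when names collide.
    return sorted(funcs, key=lambda f: (f["name"], f["funcsig"]))
```

Had the signature only been filled in during the dump phase, the comparator above would have nothing to break ties with for overloaded functions.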
Re: [HACKERS] Schema version management
Tom Lane t...@sss.pgh.pa.us writes:
> This argument seems a bit irrelevant to me. pg_dump doesn't get to
> pick and choose what will be in the database it's told to dump.

Sure.

> If we're going to do something like what Joel wants, we have to have
> file naming conventions for operator and cast objects. So we can't
> just leave them out of the conversation (or if we do, we shouldn't be
> surprised when the ensuing design sucks).

I guess what we're saying is that at this point we can pick non-user-friendly naming rules, like pg_operator/oid.sql or something like that, OID based. Impacted users might as well learn about extensions.

Regards,
-- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Schema version management
On Fri, Jul 6, 2012 at 8:23 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Tom Lane t...@sss.pgh.pa.us writes:
>> This argument seems a bit irrelevant to me. pg_dump doesn't get to
>> pick and choose what will be in the database it's told to dump.
>
> Sure.
>
>> If we're going to do something like what Joel wants, we have to have
>> file naming conventions for operator and cast objects. So we can't
>> just leave them out of the conversation (or if we do, we shouldn't be
>> surprised when the ensuing design sucks).
>
> I guess what we're saying is that at this point we can pick
> non-user-friendly naming rules, like pg_operator/oid.sql or something
> like that, OID based. Impacted users might as well learn about
> extensions.

I think that would defeat some of the human-readability goals that people have for this feature, not to mention that it would lose the ability to do diff -r between a dump produced on cluster A and a dump produced on cluster B.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Covering Indexes
On Fri, Jun 29, 2012 at 08:10:03AM +0200, Csaba Nagy wrote:
> Hi all,
>
> On Thu, Jun 28, 2012 at 5:16 AM, David E. Wheeler da...@justatheory.com wrote:
>> I don't see the virtue of this in this case. Since the index is not
>> unique, why not just put the index on (a,b,c,d) and be done with it?
>> Is there some advantage to be had in inventing a way to store c and d
>> in the index without having them usable for indexing?
>
> Why not restrict it to UNIQUE indexes? For non-unique indexes it has
> no advantages (it could in fact be indexed on all columns anyway as an
> implementation detail). For the unique case the problem of many
> identical entries mentioned by Tom is not relevant, so the additional
> data will only bloat the index but not otherwise affect index
> performance.
>
> Would this get close enough to an index-covered table? _That_ would be
> interesting - I have a really big table (table/index size: 370G/320G,
> ~8*10^9 rows) which is basically using double the space because its
> primary key covers all columns of the table.

I was wondering if there was some way to specify an expression index that did a unique index check on some columns but included more columns not part of the unique index.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Schema version management
Hello,

Is it imaginable to additionally generate an index file that maps the half-friendly file names to a cleartext object signature? This would allow users to postprocess the output, merging overloaded functions into single files or renaming the files according to their needs and preferences...

best regards,

Marc Mamin

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
Sent: Friday, 6 July 2012 15:02
To: Dimitri Fontaine
Cc: Tom Lane; Christopher Browne; Pg Hackers
Subject: Re: [HACKERS] Schema version management

On Fri, Jul 6, 2012 at 8:23 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Tom Lane t...@sss.pgh.pa.us writes:
>> This argument seems a bit irrelevant to me. pg_dump doesn't get to
>> pick and choose what will be in the database it's told to dump.
>
> Sure.
>
>> If we're going to do something like what Joel wants, we have to have
>> file naming conventions for operator and cast objects. So we can't
>> just leave them out of the conversation (or if we do, we shouldn't be
>> surprised when the ensuing design sucks).
>
> I guess what we're saying is that at this point we can pick
> non-user-friendly naming rules, like pg_operator/oid.sql or something
> like that, OID based. Impacted users might as well learn about
> extensions.

I think that would defeat some of the human-readability goals that people have for this feature, not to mention that it would lose the ability to do diff -r between a dump produced on cluster A and a dump produced on cluster B.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Covering Indexes
On Friday, 6 July 2012 at 15:41:01, Bruce Momjian wrote:
> On Fri, Jun 29, 2012 at 08:10:03AM +0200, Csaba Nagy wrote:
>> Hi all,
>>
>> On Thu, Jun 28, 2012 at 5:16 AM, David E. Wheeler da...@justatheory.com wrote:
>>> I don't see the virtue of this in this case. Since the index is not
>>> unique, why not just put the index on (a,b,c,d) and be done with it?
>>> Is there some advantage to be had in inventing a way to store c and
>>> d in the index without having them usable for indexing?
>>
>> Why not restrict it to UNIQUE indexes? For non-unique indexes it has
>> no advantages (it could in fact be indexed on all columns anyway as
>> an implementation detail). For the unique case the problem of many
>> identical entries mentioned by Tom is not relevant, so the additional
>> data will only bloat the index but not otherwise affect index
>> performance.
>>
>> Would this get close enough to an index-covered table? _That_ would
>> be interesting - I have a really big table (table/index size:
>> 370G/320G, ~8*10^9 rows) which is basically using double the space
>> because its primary key covers all columns of the table.
>
> I was wondering if there was some way to specify an expression index
> that did a unique index check on some columns but included more
> columns not part of the unique index.

I haven't tried it, but I suppose that an Exclusion Constraint should allow that.

-- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] CREATE FOREGIN TABLE LACUNA
On 24 June 2012 04:01, Alvaro Herrera alvhe...@commandprompt.com wrote:
> Excerpts from Dean Rasheed's message of sáb jun 23 18:08:31 -0400 2012:
>> I spotted a couple of other issues during testing:
>
> David, when you generate a new version of the patch, please also make
> sure to use RELKIND_RELATION and RELKIND_FOREIGN_TABLE instead of 'r'
> and 'f'.
>
>> * You're still allowing INCLUDING DEFAULTS and INCLUDING STORAGE,
>>   even though these options are not supported on foreign tables.
>
> Maybe the code should list the options allowed instead of the ones
> disallowed.
>
>> * If I do INCLUDING ALL, I get an error because of the unsupported
>>   options. I think that ALL in this context needs to be made to mean
>>   all the options that foreign tables support (just COMMENTS at the
>>   moment).

I agree. David, do you have an updated version of this patch?

Regards, Dean
Re: [HACKERS] Covering Indexes
Csaba Nagy ncsli...@googlemail.com writes:
> Why not restrict it to UNIQUE indexes?

What benefit would such a restriction provide? AFAICS it doesn't make implementation any easier.

regards, tom lane
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 10:14, Jan Urbański wrote:
> On 06/07/12 10:05, Heikki Linnakangas wrote:
>> In the backend elog routines, there is a global variable
>> 'recursion_depth', which is incremented when an error-handling
>> routine is entered, and decremented afterwards. Can we use a similar
>> mechanism in PLy_elog() to detect and stop recursion?
>
> I guess we can. I'll try to do some tests in order to see if there's
> an easy user-triggerable way of causing PLy_elog to recurse, and if
> not then a guard like this should be enough as a safety measure
> against as yet unknown conditions (as opposed to something we expect
> to happen regularly).

Attached is a patch that stores the recursion level of PLy_traceback and prevents it from running if it's too deep (PLy_traceback is the one doing the heavy lifting; that's why I chose to put the logic to skip running there).

I tried a few things and was not able to easily invoke the infinite recursion condition, but I did notice that there are two more encodings that have different names in Postgres and in Python (KOI8-R and KOI8-U) and added them to the switch.

There's still trouble with EUC_TW and MULE_INTERNAL, which don't have Python equivalents. EUC-TW has been discussed in http://bugs.python.org/issue2066 and rejected (see http://bugs.python.org/issue2066#msg113731). If you use any of these encodings, you *will* get into the recursion trouble described earlier, just as before the patch you'd get into it with CP1252 as your encoding. What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work?
Cheers, Jan

diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c
index c375ac0..c2b3cb8 100644
*** a/src/pl/plpython/plpy_elog.c
--- b/src/pl/plpython/plpy_elog.c
*** static char *get_source_line(const char
*** 28,33 ****
--- 28,41 ----
  
  /*
+  * Guard against re-entrant calls to PLy_traceback, which can happen if
+  * traceback formatting functions raise Python errors.
+  */
+ #define TRACEBACK_RECURSION_LIMIT 2
+ static int recursion_depth = 0;
+ 
+ 
+ /*
   * Emit a PG error or notice, together with any available info about
   * the current Python error, previously set by PLy_exception_set().
   * This should be used to propagate Python errors into PG. If fmt is
*** PLy_traceback(char **xmsg, char **tbmsg,
*** 147,166 ****
  	StringInfoData xstr;
  	StringInfoData tbstr;
  
  	/*
  	 * get the current exception
  	 */
  	PyErr_Fetch(&e, &v, &tb);
  
  	/*
! 	 * oops, no exception, return
  	 */
! 	if (e == NULL)
  	{
  		*xmsg = NULL;
  		*tbmsg = NULL;
  		*tb_depth = 0;
  		return;
  	}
--- 155,177 ----
  	StringInfoData xstr;
  	StringInfoData tbstr;
  
+ 	recursion_depth++;
+ 
  	/*
  	 * get the current exception
  	 */
  	PyErr_Fetch(&e, &v, &tb);
  
  	/*
! 	 * oops, no exception or recursion depth exceeded, return
  	 */
! 	if (e == NULL || recursion_depth > TRACEBACK_RECURSION_LIMIT)
  	{
  		*xmsg = NULL;
  		*tbmsg = NULL;
  		*tb_depth = 0;
+ 		recursion_depth--;
  		return;
  	}
*** PLy_traceback(char **xmsg, char **tbmsg,
*** 326,331 ****
--- 337,344 ----
  		(*tb_depth)++;
  	}
  
+ 	recursion_depth--;
+ 
  	/* Return the traceback. */
  	*tbmsg = tbstr.data;
diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c
index bf29532..ea4ecdf 100644
*** a/src/pl/plpython/plpy_util.c
--- b/src/pl/plpython/plpy_util.c
*** PLyUnicode_Bytes(PyObject *unicode)
*** 112,117 ****
--- 112,123 ----
  		case PG_WIN874:
  			serverenc = "cp874";
  			break;
+ 		case PG_KOI8R:
+ 			serverenc = "koi8-r";
+ 			break;
+ 		case PG_KOI8U:
+ 			serverenc = "koi8-u";
+ 			break;
  		default:
  			/* Other encodings have the same name in Python. */
  			serverenc = GetDatabaseEncodingName();
*** PLyUnicode_Bytes(PyObject *unicode)
*** 120,135 ****
  	rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict");
  	if (rv == NULL)
! 	{
! 		/*
! 		 * Use a plain ereport instead of PLy_elog to avoid recursion, if
! 		 * the traceback formatting functions try to do unicode to bytes
! 		 * conversion again.
! 		 */
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INTERNAL_ERROR),
! 				 errmsg("could not convert Python Unicode object to PostgreSQL server encoding")));
! 	}
  	return rv;
  }
--- 126,132 ----
  	rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict");
  	if (rv == NULL)
! 		PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding");
  	return rv;
  }
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06.07.2012 18:01, Jan Urbański wrote:
> There's still trouble with EUC_TW and MULE_INTERNAL, which don't have
> Python equivalents. EUC-TW has been discussed in
> http://bugs.python.org/issue2066 and rejected (see
> http://bugs.python.org/issue2066#msg113731). If you use any of these
> encodings, you *will* get into the recursion trouble described
> earlier, just as before the patch you'd get into it with CP1252 as
> your encoding. What shall we do about those? Ignore them? Document
> that if you're using one of these encodings then PL/Python with
> Python 2 will be crippled and with Python 3 just won't work?

We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
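The suggested fallback can be sketched in a few lines. This is a toy model in Python rather than the actual PL/Python C code: try the direct Python codec first; only when Python has no codec for the server encoding (as with EUC_TW or MULE_INTERNAL) encode to UTF-8 and hand the bytes to a server-side converter, represented here by a caller-supplied callback standing in for PostgreSQL's own UTF-8-to-server-encoding routines.

```python
def to_server_encoding(s, server_encoding, server_convert):
    """Encode s for the server, pivoting through UTF-8 when needed.

    server_convert is a stand-in for the server's UTF-8 -> server
    encoding conversion; it receives UTF-8 bytes.
    """
    try:
        # Fast path: Python knows a codec with this name.
        return s.encode(server_encoding)
    except LookupError:
        # No such Python codec: double conversion via UTF-8.
        return server_convert(s.encode("utf-8"))
```

The double conversion only runs for encodings Python cannot handle directly, so the common case pays no extra cost.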
Re: [HACKERS] obsolete copyright notice
On Fri, Jul 06, 2012 at 11:06:28AM +0200, Antonin Houska wrote:
> On 07/06/2012 02:13 AM, Bruce Momjian wrote:
>> On Thu, Jul 05, 2012 at 10:10:08PM +0200, Antonin Houska wrote:
>>> I found out by chance that the \copyright command still contains
>>> 2011. Perhaps the documentation of new year's day changes needs to
>>> be updated, if such a thing exists.
>>> What version of Postgres is this? 9.2beta1
>>
>> We don't update the copyright for minor releases,
>
> I didn't know about this rule. I just saw the copyright updated in the
> source files, so I thought this had been forgotten. Sorry for the
> noise,

That is not noise. :-O You are 100% correct that 9.2beta1 should _not_ show 2011! The 9.2 major release comes out in 2012 and should show 2012 everywhere.

We created copyright.pl for 9.2, and it seems it is lacking in some areas. I just made several commits to fix problems I found:

* Fix perltidy problem caused by a missing regex slash --- this was the only file to generate an error file from perltidy, so I am confident about the other files. I reverted the perltidy run and re-ran it to make sure things were good.

* Fix code that was supposed to skip the .git directory, but wasn't (added a basename() call).

* Modified the code so all lines are processed, not just the first match --- this fixes the help.c file that caused the original report above. I checked the impact of not stopping after the first match and could not find any problems.

The attached patch, applied to head and 9.2, addresses all these issues. Thanks for the report.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true.
+

diff --git a/src/tools/copyright.pl b/src/tools/copyright.pl
new file mode 100755
index a55228d..52a7cb1
*** a/src/tools/copyright.pl
--- b/src/tools/copyright.pl
*** use strict;
*** 11,16 ****
--- 11,17 ----
  use warnings;
  
  use File::Find;
+ use File::Basename;
  use Tie::File;
  
  my $pgdg = 'PostgreSQL Global Development Group';
*** find({ wanted => \&wanted, no_chdir => 1
*** 25,39 ****
  sub wanted
  {
- 
  	# prevent corruption of git indexes by ignoring any .git/
! 	if ($_ eq '.git')
  	{
  		$File::Find::prune = 1;
  		return;
  	}
  
! 	return if !-f $File::Find::name || -l $File::Find::name;
  
  	# skip file names with binary extensions
  	# How are these updated?  bjm 2012-01-02
--- 26,39 ----
  sub wanted
  {
  	# prevent corruption of git indexes by ignoring any .git/
! 	if (basename($_) eq '.git')
  	{
  		$File::Find::prune = 1;
  		return;
  	}
  
! 	return if !-f $File::Find::name || -l $File::Find::name;
  
  	# skip file names with binary extensions
  	# How are these updated?  bjm 2012-01-02
*** sub wanted
*** 46,59 ****
  	{
  		# We only care about lines with a copyright notice.
! 		next unless $line =~ m/$cc . *$pgdg /;
  
! 		# We stop when we've done one substitution. This is both for
! 		# efficiency and, at least in the case of this program, for
! 		# correctness.
! 		last if $line =~ m/$cc.*$year.*$pgdg/;
! 		last if $line =~ s/($cc\d{4})(, $pgdg)/$1-$year$2/;
! 		last if $line =~ s/($cc\d{4})-\d{4}(, $pgdg)/$1-$year$2/;
  	}
  	untie @lines;
  }
--- 46,61 ----
  	{
  		# We only care about lines with a copyright notice.
! 		next unless $line =~ m/$cc.*$pgdg/;
  
! 		# Skip line if already matches the current year; if not
! 		# we get $year-$year, e.g. 2012-2012
! 		next if $line =~ m/$cc$year, $pgdg/;
! 
! 		# We process all lines because some files have copyright
! 		# strings embedded in them, e.g. src/bin/psql/help.c
! 		$line =~ s/($cc\d{4})(, $pgdg)/$1-$year$2/;
! 		$line =~ s/($cc\d{4})-\d{4}(, $pgdg)/$1-$year$2/;
  	}
  	untie @lines;
  }
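The substitution logic in the fixed copyright.pl can be restated compactly. This is a rough Python transcription of that logic, not the script itself (the `CC` prefix string is an assumption based on the patch's `$cc` variable): process every matching line rather than stopping at the first, extend a bare "YYYY" to "YYYY-year", widen an existing "YYYY-YYYY" range, and skip lines already at the current year to avoid producing "2012-2012".

```python
import re

# Assumed to match copyright.pl's $cc and $pgdg variables.
CC = r'Copyright \(c\) '
PGDG = 'PostgreSQL Global Development Group'

def update_copyright(line, year):
    # Only touch lines carrying a copyright notice.
    if not re.search(CC + r'.*' + PGDG, line):
        return line
    # Already at the current year: leave alone, otherwise we would
    # produce "2012-2012".
    if re.search(CC + str(year) + ', ' + PGDG, line):
        return line
    # "Copyright (c) 1996, PGDG" -> "Copyright (c) 1996-2012, PGDG"
    line = re.sub(r'(' + CC + r'\d{4})(, ' + PGDG + r')',
                  r'\g<1>-%d\g<2>' % year, line)
    # "Copyright (c) 1996-2011, PGDG" -> "Copyright (c) 1996-2012, PGDG"
    line = re.sub(r'(' + CC + r'\d{4})-\d{4}(, ' + PGDG + r')',
                  r'\g<1>-%d\g<2>' % year, line)
    return line
```

Running both substitutions unconditionally is safe: the first leaves range-form notices alone, and the second rewrites a fresh "1996-2012" to itself.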
Re: [HACKERS] Bug tracker tool we need
On Wed, Apr 18, 2012 at 10:29:26AM -0400, Robert Haas wrote:
>>>>> IME, bug trackers typically work best when used by a tightly
>>>>> integrated team.
>>>>
>>>> Well, very many loosely distributed open-source projects use bug
>>>> trackers quite successfully.
>>>
>>> So I think Greg has exactly the right idea: we shouldn't try to
>>> incorporate one of these systems that aims to manage workflow;
>>
>> Um, isn't half of the commitfest app about workflow? Patch is waiting
>> for review, who is the reviewer; patch is waiting for author, who is
>> the author; patch is ready for committer, who is the committer? And
>> every week or so the commitfest manager (if any) produces a report on
>> patch progress. Isn't that exactly what these workflow management
>> systems provide?
>
> Yeah, but I thought we'd veered off into a digression about tracking
> bug reports. Here's our workflow for bugs:
>
> 1. If they seem interesting, Tom fixes them.
> 2. Or occasionally someone else fixes them.
> 3. Otherwise, they drift forever in the bleakness of space.
>
> I've been conducting the experiment for a year or two now of leaving
> unresolved bug reports unread in my mailbox. I've got over 100 such
> emails now... and some of them may not really be bugs, but nobody's
> put in the work to figure that out. It would be useful to have a

I saved this email from April and have given it some thought. I decided to approach it by looking at our current workflow, then deciding what the problems were, rather than approaching it with "we need a bug tracker." I started by drawing a diagram of our current development process:

http://momjian.us/main/writings/pgsql/work_flow.pdf

I did this so I could see the weaknesses. First, the left and right sides are what our users see, and the middle is controlled by developers. Looking at the chart, the three sections (left, middle, and right) seem to work fine in isolation. Our interaction with bug reporters is very good, our development flow seems fine, and people are certainly happy with the quality of our releases.

Yes, there are problems, like the ability of patches to get lost, but in general, things are good. I think our big gap is in integrating these sections. There is no easy way for a bug reporter to find out what happens to his report unless the patch is applied in the same email thread as the report. It is hard for users to see _all_ the changes made in a release because the release notes are filtered.

Our current system is designed to have very limited friction of action, and this gives us a high-quality user experience and release quality, but it does have limits in how well we deal with complex cases.

OK, now for the question about a bug tracker. A bug tracker would provide a trackable contact for everyone reporting a bug, and allow them to see exactly which release fixes the bug (in an ideal world). It also allows for more detailed reporting of what is in each release.

For me, the big problem with a bug tracker is that it adds so much friction to the development process, meaning fewer people are involved and less work gets done. If you have company-sponsored development, you can just hire more people to overcome that friction, but for volunteer development, I am not sure a bug tracker really works well, and given the chaotic content in almost every bug tracker database, I think that is true.

So, my question is, what can we do to better integrate these sections? Assign a bug number on email that gets stamped on the commit and release note item? Add email notification of commits somehow? Should we publish the entire git log for each release?

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Event Triggers reduced, v1
Robert Haas robertmh...@gmail.com writes: Here is an incremental documentation patch which I hope you will like. Definitely, it's better this way. I'm not thrilled with separating it into its own top level chapter, but I can see how it makes sense indeed. This part is strange though: + A trigger definition can also specify a literalWHEN/literal + condition so that, for example, a literalcommand_start/literal + tag can be fired only for particular commands which the user wishes + to intercept. A common use of such triggers is to restrict the range of + DDL operations which users may perform. I don't think of that as firing a command tag, so it's hard for me to parse that sentence. the matrix somewhat. I think as we add more firing points it will be clearer and easier to read if we have all the commands arranged in columns rather than listing a bunch of firing points on each line. I +1 also made a bunch of minor edits to improve readability and improve the English (which wasn't bad, but I touched it up a bit); and I tried to add some extra detail here and there (some of it recycled from previous patch versions). Assuming this all seems reasonably agreeable, can you merge it on your side? Done, thanks ! This took the last several hours, so I haven't looked at your latest code changes yet. However, in the course of editing the documentation, it occurred to me that we seem to be fairly arbitrarily excluding a large number of commands from the event trigger mechanism. As many as that? I'm surprised about the quantity. Yes I did not add all and any command we have, on purpose, and I agree that the new turn of things allow us to add a new set. For example, GRANT and REVOKE. In earlier patches, we needed specific changes for every command, so there was some reason not to try to support everything right out of the gate. 
But ISTM that the argument for this is much less now; presumably it's just a few extra lines of code per command, so maybe we ought to go ahead and try to make this as complete as possible. Will do soon™. I attempt to explain in the attached patch the reasons why we don't support certain classes of commands, but I can't come up with any explanation for not supporting GRANT and REVOKE that doesn't fall flat. I can't even really see a reason not to support things like LISTEN and NOTIFY, and it would certainly be more consistent with the notion of a command_start trigger to support as many commands as we can. I would think that NOTIFY is on a fast track not to be disturbed by calling into user-defined code, and that would explain why we don't support event triggers here. I had an interesting experience while testing this patch. I accidentally redefined my event trigger function to something which errored out. That of course precluded me from using CREATE OR REPLACE FUNCTION to fix it. This makes me feel rather glad that we decided to exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger mechanism, else recovery would have had to involve system catalog hackery. Yeah, we have some places where it's not very hard to shoot oneself in the foot; here the resulting hole is a little too big and offers no real benefits. Event triggers on create|alter|drop event triggers, really? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Event Triggers reduced, v1
On Fri, Jul 6, 2012 at 12:00 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jul 6, 2012 at 7:21 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: Attached is an incremental patch with a bunch of minor cleanups, including reverts of a few spurious white space changes. Could you merge this into your version? Thank you very much for that, yes it's included now. So you have 3 attachments here, the whole new patch revision (v1.7), the incremental patch to go from 1.6 to 1.7 and the incremental patch that should apply cleanly on top of your cleanups. Here is an incremental documentation patch which I hope you will like. And here is another incremental patch, this one doing some more cleanup. Some of this is cosmetic, but it also: - Fixes the new event_trigger type so that it passes the type sanity test, instead of adding the failure as expected output. - Fixes DROP EVENT TRIGGER IF EXISTS on a non-existent trigger. - Fleshes out the ownership handling so that it's more similar to what we do for other types of objects. I'm feeling pretty good about this at this point, although I think there is still some more work to do before we call it done and go home. I have a large remaining maintainability concern about the way we're mapping back and forth between node tags, event tags, and command tags. Right now we've got parse_event_tag, which parses something like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got command_to_string, which turns E_AlterAggregate back into 'ALTER AGGREGATE', and then we've got InitEventContext(), which turns T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into E_AlterAggregate. I can't easily verify that all three of these things are consistent with each other, and even if they are right now I estimate the chances of that remaining true as other people patch the code as near-zero.
You didn't like my last proposal for dealing with this, which is fine: it might not have been the best way of dealing with it. But I think we have to figure out something better than what we've got now, or this is almost guaranteed to get broken. If you don't have a brilliant idea I'll hack on it and see what I can come up with. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company event-trigger-morecleanup.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Event Triggers reduced, v1
Robert Haas robertmh...@gmail.com writes: And here is another incremental patch, this one doing some more cleanup. Some of this is cosmetic, but it also: Thanks, applied in my github repository! I'm feeling pretty good about this at this point, although I think there is still some more work to do before we call it done and go home. Nice reading that :) I have a large remaining maintainability concern about the way we're mapping back and forth between node tags, event tags, and command tags. Right now we've got parse_event_tag, which parses something […valid concern…] If you don't have a brilliant idea I'll hack on it and see what I can come up with. I think we might be able to install a static array for the setup where we would find the different elements, and then code up some procedures doing different kinds of lookups in that array. like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got command_to_string, which turns E_AlterAggregate back into 'ALTER AGGREGATE', and then we've got InitEventContext(), which turns T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into E_AlterAggregate. I can't easily verify that all three of these

  {
      E_AlterAggregate,         // TrigEventCommand
      "ALTER AGGREGATE",        // command tag
      T_RenameStmt,             // nodeTag
      -1                        // object type
  },
  {
      E_AlterAggregate,
      "ALTER AGGREGATE",
      T_AlterObjectSchemaStmt,
      OBJECT_AGGREGATE
  }

The problem is coming up with a way of writing the code that does not incur a full array scan for each step of parsing or rewriting. And I don't see that it merits yet another cache. Given the existing event trigger cache it might be that we don't care about having a full scan of this table twice per event-trigger-related command, as I don't think it would happen when executing other DDLs. Scratch that: we need to parse command tags when we build the event cache, so scanning the full array each time would make that O(n²) and we want to avoid that.
So we could install the contents of the array in another hash table in BuildEventTriggerCache(), then use that to look up the TrigEventCommand from the command tag… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] patch: inline code with params
Hello, I updated my two-year-old patch http://archives.postgresql.org/pgsql-hackers/2010-07/txtIakTCBA15Z.txt Syntax is based on Florian Pflug's proposal http://archives.postgresql.org/pgsql-hackers/2010-07/msg00110.php

  postgres=# do (a int, b int, text) $$
  begin
    raise notice '% % %', $1, $2, $3;
  end;
  $$ language plpgsql using 10+100, 20, :'USER';
  NOTICE:  110 20 pavel
  DO

This patch is not final - missing documentation, regression tests, and it doesn't support subselects as expressions. inline_code_with_params.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Event Triggers reduced, v1
On Fri, Jul 6, 2012 at 3:29 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Here is an incremental documentation patch which I hope you will like. Definitely, it's better this way. I'm not thrilled with separating it into its own top level chapter, but I can see how it makes sense indeed. Oh, really? I thought that was a huge readability improvement. There are some things that are the same between triggers and event triggers, but there's an awful lotta stuff that is completely different. This part is strange though: + A trigger definition can also specify a <literal>WHEN</literal> + condition so that, for example, a <literal>command_start</literal> + tag can be fired only for particular commands which the user wishes + to intercept. A common use of such triggers is to restrict the range of + DDL operations which users may perform. I don't think of that as firing a command tag, so it's hard for me to parse that sentence. Oh, that should say a command_start trigger rather than a command_start tag. Good catch. This took the last several hours, so I haven't looked at your latest code changes yet. However, in the course of editing the documentation, it occurred to me that we seem to be fairly arbitrarily excluding a large number of commands from the event trigger mechanism. As many as that? I'm surprised about the quantity. Yes, I did not add any and all commands we have, on purpose, and I agree that the new turn of things allows us to add a new set. I admit I didn't count them up. :-) Maybe there aren't that many. I think we might want to extend the support matrix to include every command that appears in sql-commands.html and have either an X if it's supported or blank if it's not. That would make it easier to judge how many commands are not supported, not just for us but for users who may be trying to answer the same questions we are.
I would think that NOTIFY is on a fast track not to be disturbed by calling into user-defined code, and that would explain why we don't support event triggers here. If the DBA is allowed to restrict CREATE FUNCTION, why not NOTIFY? I guess I don't see why that one's deserving of special treatment. Mind you, if I ran the world, this would probably be broken up differently: I'd have ddl_command_start covering all the CREATE/ALTER/DROP commands and nothing else; and separate firing points for anything else I wanted to support. It's not too late to make that change, hint, hint. But if we're not gonna do that then I think that we'd better try to cast the net as broadly as reasonably possible. It seems to me that our excuse for not including things like UPDATE and DELETE is a bit thin; surely there are people who would like a sort of universal trigger that applies to every relation in the system. Of course there are recursion problems there that need to be thought about long and hard, and no I don't really want to go there right now, but I'll bet you a nickel that someone is going to ask why it doesn't work that way. Another advantage to recasting this as ddl_command_start is that we could quite easily pass the operation (CREATE, ALTER, DROP) and the named object type (TABLE, FUNCTION, CAST) as separate arguments. I think that would be a usability improvement, since it would then be dead easy to write an event trigger that prohibits DROP (and only DROP) of any sort. Of course it's not that hard to do it right now, but you have to parse the command tag. It would likely simplify the code for mapping between node and command tags, too. One other thought: if we're NOT going to do what I suggested above, then how about renaming TG_WHEN to TG_EVENT? Seems like that would fit better. Also: now that the E_WhatEver constants don't get stored on disk, I don't think they should live in pg_event_trigger.h any more; can we move them to someplace more appropriate?
Possibly make them private to event_trigger.c? And, on a related note, I don't think it's a good idea to use E_ as a prefix for both the event types and the command tags. It's too short, and hard to grep for, and we don't want the same one for both, I think. How about things like EVT_CommandStart for the events and ECT_CreateAggregate for the command tags? I had an interesting experience while testing this patch. I accidentally redefined my event trigger function to something which errored out. That of course precluded me from using CREATE OR REPLACE FUNCTION to fix it. This makes me feel rather glad that we decided to exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger mechanism, else recovery would have had to involve system catalog hackery. Yeah, we have some places where it's not very hard to shoot oneself in the foot; here the resulting hole is a little too big and offers no real benefits. Event triggers on create|alter|drop event triggers, really? Indeed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Event Triggers reduced, v1
On Fri, Jul 6, 2012 at 4:00 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: And here is another incremental patch, this one doing some more cleanup. Some of this is cosmetic, but it also: Thanks, applied in my github repository! Thanks. I have a large remaining maintainability concern about the way we're mapping back and forth between node tags, event tags, and command tags. Right now we've got parse_event_tag, which parses something […valid concern…] If you don't have a brilliant idea I'll hack on it and see what I can come up with. I think we might be able to install a static array for the setup where we would find the different elements, and then code up some procedures doing different kinds of lookups in that array. +1. like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got command_to_string, which turns E_AlterAggregate back into 'ALTER AGGREGATE', and then we've got InitEventContext(), which turns T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into E_AlterAggregate. I can't easily verify that all three of these

  {
      E_AlterAggregate,         // TrigEventCommand
      "ALTER AGGREGATE",        // command tag
      T_RenameStmt,             // nodeTag
      -1                        // object type
  },
  {
      E_AlterAggregate,
      "ALTER AGGREGATE",
      T_AlterObjectSchemaStmt,
      OBJECT_AGGREGATE
  }

The problem is coming up with a way of writing the code that does not incur a full array scan for each step of parsing or rewriting. And I don't see that it merits yet another cache. Given the existing event trigger cache it might be that we don't care about having a full scan of this table twice per event-trigger-related command, as I don't think it would happen when executing other DDLs. Scratch that: we need to parse command tags when we build the event cache, so scanning the full array each time would make that O(n²) and we want to avoid that.
So we could install the contents of the array in another hash table in BuildEventTriggerCache() then use that to lookup the TrigEventCommand from the command tag… Ugh. Yeah, obviously the most important thing I think is that InitEventContext() needs to be lightning-fast, but we don't want BuildEventTriggerCache() to be pathologically slow either. I think the best thing to do with InitEventContext() might be to get rid of it. It's just a big switch over node tags, and we've already got one of those in standard_ProcessUtility. Maybe every case that already exists in that function should either (a) get a comment of the form /* does not support event triggers */ or (b) get a call of the form EventTriggerStartup(evt, parsetree, E_WhateverCommandThisIs). EventTriggerStartup() could call InitEventContext() and then if CommandFiresTriggersForEvent(..., E_CommandStart) it could also call ExecEventTriggers(). This might seem like it's just moving the wood around, but if someone adds a new case in standard_ProcessUtility, they're going to model it on one of the existing cases, which greatly decreases the likelihood that they're going to screw it up. And if they do screw it up it will be obviously non-parallel to the rest of what's there, so somebody can notice and fix it. As a side benefit, this would probably be faster than having two separate switches that are executed more or less consecutively. Now that leaves the question of how to translate between E_AlterAggregate and ALTER AGGREGATE; I think your idea of a hash table (or two?) might be the most practical option. We'd only need to build the hash table(s) if an index-scan of pg_event_trigger finds it non-empty, and then only once per session. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
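The static-array idea the thread is converging on can be sketched in plain C. Everything below is illustrative rather than actual PostgreSQL code: the enum values, the stand-in node tags, and the placeholder object-type constant are assumptions, and the linear scans merely stand in for the eventual hash-table lookups discussed above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of the tags discussed in the thread. */
typedef enum { E_Invalid = 0, E_AlterAggregate, E_CreateAggregate } TrigEventCommand;
typedef enum { T_Invalid = 0, T_RenameStmt, T_AlterObjectSchemaStmt, T_DefineStmt } NodeTag;

typedef struct
{
    TrigEventCommand command;   /* event-trigger command id */
    const char *tag;            /* command tag, e.g. "ALTER AGGREGATE" */
    NodeTag     node;           /* parse-tree node tag */
    int         objtype;        /* object type, or -1 if not applicable */
} EventCommandEntry;

static const EventCommandEntry event_command_map[] = {
    { E_AlterAggregate, "ALTER AGGREGATE", T_RenameStmt, -1 },
    { E_AlterAggregate, "ALTER AGGREGATE", T_AlterObjectSchemaStmt, 42 /* stand-in for OBJECT_AGGREGATE */ },
    { E_CreateAggregate, "CREATE AGGREGATE", T_DefineStmt, 42 },
};

/* parse_event_tag: command tag string -> TrigEventCommand (linear scan). */
TrigEventCommand parse_event_tag(const char *tag)
{
    size_t i;

    for (i = 0; i < sizeof(event_command_map) / sizeof(event_command_map[0]); i++)
        if (strcmp(event_command_map[i].tag, tag) == 0)
            return event_command_map[i].command;
    return E_Invalid;
}

/* command_to_string: TrigEventCommand -> command tag string. */
const char *command_to_string(TrigEventCommand cmd)
{
    size_t i;

    for (i = 0; i < sizeof(event_command_map) / sizeof(event_command_map[0]); i++)
        if (event_command_map[i].command == cmd)
            return event_command_map[i].tag;
    return NULL;
}
```

Because every direction of the mapping reads the same table, the three functions cannot drift apart the way three hand-maintained switches can; replacing the scans with hash lookups built once per session changes only the lookup cost, not this property.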
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote: What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work? We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing. Actually, we already do the other direction that way (PLyUnicode_FromStringAndSize), so maybe it would be more consistent to always use this. I would hesitate to use this as a kind of fallback, because then we would sometimes be using PostgreSQL's recoding tables and sometimes Python's recoding tables, which could become confusing. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema version management
On tor, 2012-07-05 at 23:52 +0200, Dimitri Fontaine wrote: I would argue like Álvaro that when dealing with operators and casts you're probably writing an extension already, and we're providing another way to deal with that. I have code in the wild that defines new operators and casts and has no C code and is not in an extension and has no business being in an extension. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq URI and regression testing
Excerpts from Alex's message of jue abr 19 17:06:05 -0300 2012: Peter Eisentraut pete...@gmx.net writes: On tor, 2012-04-19 at 00:13 +0300, Alex wrote: +#!/usr/bin/env perl Don't do that. Call the script using $(PERL) from the makefile. Thank you for the suggestion. Attached v2 does just this (while keeping a more commonly found shebang line in the perl script for running it w/o the makefile.) I've applied this to 9.3. Andrew, can we have the non-MSVC buildfarm members running on the master branch also run make installcheck in src/interfaces/libpq? If there are platform dependencies here, it would be good to know what they are; if we don't have the tests run automatically we will never know. We'll need to figure out some way for MSVC animals to run the tests as well. Any takers? I think this is necessary, because otherwise it is quite clear that the tests are going to be kept failing forever. They were already failing because of message style changes that didn't update the expected output. (I am unsure about pushing the sh to perl test harness conversion in 9.2. If anybody thinks it should be done, please discuss.) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL format changes
On fre, 2012-06-15 at 00:01 +0300, Heikki Linnakangas wrote: 1. Use a 64-bit segment number, instead of the log/seg combination. And don't waste the last segment on each logical 4 GB log file. The concept of a logical log file is now completely gone. XLogRecPtr is unchanged, but it should now be understood as a plain 64-bit value, just split into two 32-bit integers for historical reasons. On disk, this means that there will be log files ending in FF, those were skipped before. A thought on this. There were some concerns that this would silently break tools that pretend to have detailed knowledge of WAL file numbering and this previous behavior of skipping the FF files. We could address this by fixing the overall file naming from something like 000108D000FD 000108D000FE 000108D000FF 000108D1 to 000108D0FD00 000108D0FE00 000108D0FF00 000108D1 which represents the new true WAL stream numbering as opposed to the old two-part numbering. Thus, any tool that thinks it knows how the WAL files are sequenced will break very obviously, but any tool that just looks for 24 hexadecimal digits will be fine. I wonder if any tools in the former category would also break if one changes XLOG_SEG_SIZE. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
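For concreteness, the traditional tli/log/seg naming that such tools assume can be modeled as below. This is only a sketch of the pre-change scheme with the default 16 MB segment size; it does not reproduce the old behavior of skipping the FF segment, nor the reshuffled digit layout proposed above (the function name is made up for the example):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_SEGMENTS_PER_LOGFILE 256   /* 4 GB logical log file / 16 MB segments */

/* Format a 24-hex-digit WAL file name from a timeline and a linear
 * 64-bit segment number, using the traditional tli/log/seg split. */
const char *wal_file_name(uint32_t tli, uint64_t segno)
{
    static char buf[32];

    snprintf(buf, sizeof(buf), "%08X%08X%08X",
             tli,
             (uint32_t) (segno / WAL_SEGMENTS_PER_LOGFILE),
             (uint32_t) (segno % WAL_SEGMENTS_PER_LOGFILE));
    return buf;
}
```

A tool that only checks for 24 hexadecimal digits is insensitive to how the middle and low fields are arranged, which is exactly why the proposed renaming would break the former category of tools loudly and leave the latter alone.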
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 06/07/12 22:47, Peter Eisentraut wrote: On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote: What shall we do about those? Ignore them? Document that if you're using one of these encodings then PL/Python with Python 2 will be crippled and with Python 3 just won't work? We could convert to UTF-8, and use the PostgreSQL functions to convert from UTF-8 to the server encoding. Double conversion might be slow, but I think it would be better than failing. Actually, we already do the other direction that way (PLyUnicode_FromStringAndSize), so maybe it would be more consistent to always use this. I would hesitate to use this as a kind of fallback, because then we would sometimes be using PostgreSQL's recoding tables and sometimes Python's recoding tables, which could become confusing. So you're in favour of doing unicode -> bytes by encoding with UTF-8 and then using the server's encoding functions? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL format changes
On Thu, Jun 14, 2012 at 10:01 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: This has the advantage that you can calculate the CRC for all the other fields before acquiring WALInsertLock. For xl_prev, you need to know where exactly the record is inserted, so it's handy that it's the last field before CRC. It may be late to mention this but fwiw you don't need to reorder the fields to do this. CRC has the property that you can easily adjust it for any changes to the data covered by it. Regardless of where the xl_prev link is you can calculate the CRC as if xl_prev is 0 and then once you get the lock add in the correct xl_prev. This is an argument in favour of using CRC over other checksums for which that would be hard or impossible. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
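The property Heikki's field reordering exploits (a CRC can be accumulated over the early fields first, with the trailing field folded in under the lock) is easy to demonstrate with a small bitwise CRC-32. This uses the common reflected polynomial 0xEDB88320 purely for illustration, not necessarily the exact CRC the WAL code uses; Greg's stronger point, patching an already-finalized CRC for a changed field, needs a combine step in GF(2) that is not shown here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Streaming reflected CRC-32 (polynomial 0xEDB88320).  Start with
 * crc = 0xFFFFFFFF, feed data in any number of chunks, finish with ~crc. */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    size_t i;
    int bit;

    for (i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* CRC of two concatenated pieces: the bulk of the record hashed up front
 * (before taking WALInsertLock), the final field folded in afterwards. */
uint32_t crc32_two_part(const unsigned char *a, size_t alen,
                        const unsigned char *b, size_t blen)
{
    uint32_t crc = 0xFFFFFFFFu;

    crc = crc32_update(crc, a, alen);   /* everything except xl_prev */
    crc = crc32_update(crc, b, blen);   /* xl_prev, known only under the lock */
    return ~crc;
}
```

Splitting the input at any boundary yields the same final CRC as hashing it in one pass, which is why ordering xl_prev last makes the early work safe to do outside the lock.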
Re: [HACKERS] transforms
I haven't had the time to finish all the issues with this, but I want to keep the discussion going in the meantime and provide an updated patch. On mån, 2012-06-18 at 17:33 +0200, Andres Freund wrote: Cursory code review: * pstrndup already exists as pnstrdup (hstore_plperl.c) Fixed. * PyString_FromStringAndSize return value not decreffed? PyDict_SetItem doesn't steal references Fixed. * In plpython_to_hstore I would move the 'pairs' and some related variables in the PG_TRY block, so the reader doesn't need to check whether it should be volatile * In the same function items needs to be volatile to fit into longjmp semantics I'll recheck that later. * I don't think recording dependencies on transforms used when creating functions is a good idea as the transform might get created after the function already exists. That seems to be a pretty confusing behaviour. We need the dependencies, because otherwise dropping a transform would break or silently alter the behavior of functions that depend on it. That sounds like my worst nightmare, thinking of some applications that would be affected by that. But your point is a good one. I think this could be addressed by prohibiting the creation of a transform that affects functions that already exist. Because of the legacy behavior of PL implementations defaulting to a string representation conversion, we would technically need a dependency on the absence of a transform object to make this airtight. In the far future, I could imagine removing this default behavior, meaning you couldn't create the function if no suitable transforms exist for all argument and return types. * I foresee the need for multiple transforms for the same type/language pair to coexist. The user would need to manually choose/call the transform in that case. This currently isn't easily possible... I thought about this briefly at the beginning, but see under worst nightmare above.
Also, having a configuration setting for this or something would prevent any PL functions from being immutable. We don't allow multiple casts or multiple in/out functions either, which are related concepts. If you want different behavior, you should define a different type or different language. *) No psql backslash commands yet. Could be added. Doesn't really seem necessary to me. Not many people will need to look at this and the list of commands already is rather long. I'm going to leave this out for now. *) Permissions: Transforms don't have owners, a bit like casts. Currently, you are allowed to drop a transform if you own both the type and the language. That might be too strict; maybe owning the type and having privileges on the language would be enough. Seems sensible enough to me. I have made this change. *) There is currently some syntax schizophrenia. The grammar accepts

  CREATE TRANSFORM FOR hstore LANGUAGE plperl (
      FROM SQL WITH hstore_to_plperl,
      TO SQL WITH plperl_to_hstore
  );

but pg_dump produces

  CREATE TRANSFORM FOR hstore LANGUAGE plperl (
      FROM SQL WITH hstore_to_plperl(hstore),
      TO SQL WITH plperl_to_hstore(internal)
  );

The SQL standard allows both. (In the same way that it allows 'DROP FUNCTION foo' without arguments, if it is not ambiguous.) Precedent is that CREATE CAST requires arguments, but CREATE LANGUAGE does not. I don't find that problematic personally. I have fixed the syntax to include argument types, so the dump output and the input grammar are consistent. Other changes: - Fixed ecpg grammar to work again with this. - Changed extension naming to be more consistent. - Build additional contrib modules conditionally depending on whether --with-perl or --with-python were configured. (complaint from Jeff Janes) - Fixed Python 3. Things I still want to do: - Refactor the regression test framework for Python 3 so that contrib modules or external extensions don't have to repeat the magic in src/pl/plpython/Makefile.
(Python 3 with hstore_plpython and ltree_plpython works, but the tests don't run.) - Refactor pyobject_to_string(), which is currently kind of copied and pasted from plpython, but should instead be exported by plpython in some suitable way. - Refactor shared library building so that I can have, say, hstore, hstore_plperl, and hstore_plpython in one directory, rather than in three. The reason being, if someone has a new type in a repository on github or something, I don't want them to have to make three separate projects or some crazy subdirectory structure in order to add some PL support for their type. This will require some deep Makefile.shlib hacking, but I think it's worth trying to make this simple(r). So, it's quite likely that this patch won't get finished in this commit fest. transforms-20120707.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your
Re: [HACKERS] Support for XLogRecPtr in expand_fmt_string?
On tis, 2012-07-03 at 14:52 -0400, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On tis, 2012-07-03 at 19:35 +0200, Andres Freund wrote: I wonder if we just should add a format code like %R or something similar as a replacement for the %X/%X notion. Maybe just print it as a single 64-bit value from now on. That'd be problematic also, because of the lack of standardization of the format code for uint64. We could write things like message... UINT64_FORMAT ...more message but I wonder how well the translation tools would work with that; and anyway it would at least double the translation effort for messages containing such things. The existing uses of INT64_FORMAT and UINT64_FORMAT show how this is done: You print the value in a temporary buffer and use %s in the final string. It's not terribly pretty, but it's been done this way forever, including in xlog code, so there shouldn't be a reason to hesitate about the use for this particular case. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
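The temporary-buffer idiom being described looks like this in portable C; format_lsn_message and the message text are made-up examples, with C99's PRIu64 standing in for PostgreSQL's UINT64_FORMAT macro:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Print a uint64 into a scratch buffer first, so the (translatable)
 * message string only ever needs a plain %s instead of a
 * platform-dependent length modifier. */
const char *format_lsn_message(uint64_t value)
{
    static char msg[64];
    char numbuf[32];

    snprintf(numbuf, sizeof(numbuf), "%" PRIu64, value);
    snprintf(msg, sizeof(msg), "redo starts at %s", numbuf);
    return msg;
}
```

Because the translatable string contains only %s, translators see a single stable message regardless of how the platform spells its 64-bit format code.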
Re: [HACKERS] Bug tracker tool we need
On Fri, Jul 6, 2012 at 12:21 PM, Bruce Momjian br...@momjian.us wrote: I think our big gap is in integrating these sections. There is no easy way for a bug reporter to find out what happens to his report unless the patch is applied in the same email thread as the report. It is hard for users to see _all_ the changes made in a release because the release notes are filtered. Our current system is designed to have very limited friction of action, and this gives us a high-quality user experience and release quality, but it does have limits in how well we deal with complex cases. I do basically agree with this. I was reflecting on the bug tracker issue (or lack thereof) for unrelated reasons earlier today and I think there are some very nice things to recommend the current email-based system, which are the reasons you identify above. Perhaps the area where it falls down is structured searches (such as for closed or wontfix) and tracking progress of related, complex, or multi-part issues that span multiple root email messages. Maybe just using the message-ids to cross reference things (or at least morally: perhaps a point of indirection so as to collapse multiple bug reports about the same thing, or to provide a place to add more annotation would be good, not unlike the CommitFest web application in relation to emails) is enough. Basically, perhaps an overlay on top of email might be a more supple way to figure out what process improvements work well without committing to a whole new tool chain and workflow all at once. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug tracker tool we need
On Fri, Jul 06, 2012 at 03:41:41PM -0700, Daniel Farina wrote: On Fri, Jul 6, 2012 at 12:21 PM, Bruce Momjian br...@momjian.us wrote: I think our big gap is in integrating these sections. There is no easy way for a bug reporter to find out what happens to his report unless the patch is applied in the same email thread as the report. It is hard for users to see _all_ the changes made in a release because the release notes are filtered. Our current system is designed to have very limited friction of action, and this gives us a high-quality user experience and release quality, but it does have limits in how well we deal with complex cases. I do basically agree with this. I was reflecting on the bug tracker issue (or lack thereof) for unrelated reasons earlier today and I think there are some very nice things to recommend the current email-based system, which are the reasons you identify above. Perhaps the area where it falls down is structured searches (such as for closed or wontfix) and tracking progress of related, complex, or multi-part issues that span multiple root email messages. I normally assume friction is just something that slows you down from attaining a goal, but open source development is only _possible_ because of the low friction communication available via the Internet. It isn't that open source development would be slower --- it would probably not exist in its current form (think shareware diskettes for an alternative). So, while it is tempting to think of a bug tracker as just slowing us down, it might really alter our ability to develop software. Yes, I know most other projects use bug trackers, but I doubt their development and user interactions are the same quality as ours. On the flip side, for complex cases, some of our user interactions are terrible.
> Maybe just using the message-ids to cross reference things (or at least
> morally: perhaps a point of indirection so as to collapse multiple bug
> reports about the same thing, or to provide a place to add more
> annotation would be good, not unlike the CommitFest web application in
> relation to emails) is enough. Basically, perhaps an overlay on top of
> email might be a more supple way to figure out what process improvements
> work well without committing to a whole new tool chain and workflow all
> at once.

I know there is work underway to allow cross-month email archive threading, and that might help. I feel we have to be honest about what our current development process does poorly.

--
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] enhanced error fields
Attached is a revision of this patch, with a few clean-ups, mostly to the wording of certain things.

On 5 July 2012 17:41, Pavel Stehule pavel.steh...@gmail.com wrote:
> * renamed auxiliary functions and moved them to elog.c - header is new
> file relerror.h

In my revision, I've just added a pre-declaration and removed the dedicated header, which didn't make much sense to me:

+ /* Pre-declare Relation, in order to avoid a build dependency on rel.h. */
+ typedef struct RelationData *Relation;

This works fine, and has precedent. See src/include/storage/buffile.h, for example. If you think it's unreasonable that none of the functions now added to elog.h are callable without also including rel.h, consider that all call sites are naturally already doing that, for the errmsg() string itself. Besides, this is a perfectly valid way of reducing build dependencies, or at least a more valid way than creating a new header that does not really represent a natural new division for these new functions, IMHO. Opaque pointers are ordinarily used to encapsulate things in C, rather than to prevent build dependencies, but I believe that's only because, in general, that's what C programmers are more likely to want.

> * new fields constraint_table and trigger_table - constraints and
> triggers are related to a relation in pg, not just to a schema

I've added some remarks to that effect in the docs of my revision, for your consideration.

> * better coverage of enhancing errors in source code

Good. I think it's important that we nail down just where these are expected to be available. It would be nice if there were a quick and easy answer to the question "Just what ErrorResponse fields should this new sub-category of class 23 ereport() site have?". We clearly have yet to work those details out.

I have another concern with this patch. log_error_verbosity appears to be intended as an exact analogue of client verbosity (as set by PQsetErrorVerbosity()).
While this generally holds for this patch, there is one notable exception: you always log all of these new fields within write_csvlog(), even if (Log_error_verbosity < PGERROR_VERBOSE). Why? There will be a bunch of commas in most CSV logs once this happens, so that the schema of the log is consistent. That is kind of ugly, but I don't see a way around it. We already do this for location and application_name (though that's separately controlled by the application_name GUC). I haven't touched that in the attached revision, as I'd like to hear what you have to say.

Another problem that will need to be fixed concerns the following values:

+#define PG_DIAG_COLUMN_NAME       'c'
+#define PG_DIAG_TABLE_NAME        't'
+#define PG_DIAG_SCHEMA_NAME       's'
+#define PG_DIAG_CONSTRAINT_NAME   'n'
+#define PG_DIAG_CONSTRAINT_TABLE  'o'
+#define PG_DIAG_CONSTRAINT_SCHEMA 'm'
+#define PG_DIAG_ROUTINE_NAME      'r'
+#define PG_DIAG_ROUTINE_SCHEMA    'u'
+#define PG_DIAG_TRIGGER_NAME      'g'
+#define PG_DIAG_TRIGGER_TABLE     'i'
+#define PG_DIAG_TRIGGER_SCHEMA    'h'

Not all appear to have a way of setting the value within the ereport interface. For example, there is nothing like errrelation_column() (or errrelcol(), as I call it) to set PG_DIAG_ROUTINE_NAME. This is something I haven't touched.

> src/backend/utils/adt/domains.c
> 162: (errcode(ERRCODE_CHECK_VIOLATION),
>
> these exceptions are related to domains - we do not have adequate
> fields now - and these fields are not in the standard
>
> it needs something like DOMAIN_NAME and DOMAIN_SCHEMA ???

Hmm. I'm not sure that it's worth it. I took a look at recent JDBC documentation, because I'd expect it to offer the most complete support for exposing this in exception classes. It turns out that it does not expose things at as fine a granularity as you have here at all, which is disappointing. It seems to suppose that just a vendor code and cause will be sufficient. If you're using this one proprietary database, there is a command that'll let you get diagnostics.
The wisdom for users of another proprietary database seems to be that you just use stored procedures.

So I agree that CONSTRAINT NAME should be unset in the event of uncatalogued, unnamed not-null integrity constraint violations. The standard seems to be loose on this, and I think we'll have to be too. However, I'd find it pretty intolerable if we were inconsistent between ereport() call sites that raise *effectively the same error*; this could result in a user's application breaking based on the phase of the moon.

Do you suppose it's worth stashing the last set of these fields to one side, and exposing them through an SQL utility command, or even a bundled function? I don't imagine that it's essential to have that right away, but it's something to consider.

--
Peter Geoghegan  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

eelog.2012_07_07.patch
Re: [HACKERS] Bug tracker tool we need
I wonder if maybe the nearest step towards a better bug tracker is a more readily referenceable mail archive. Clearly, one of our frictions is searching for relevant messages, so improved mail archive == lowered friction, no?

There's a very particular use case: people keep ruing that indexes get cut off on a monthly basis. That's doubtless not the only pain, but it keeps getting mentioned, so solving it seems valuable.

Having a correlation between commits, commitfest entries, and associated email seems like another valuable addition. Perhaps there are more...

I'm not yet poking at anything that would suggest "email database", either. A lot of the analysis would be more network-oriented; putting more of a Prolog hat on, not so much tabular / relational...
Re: [HACKERS] Bug tracker tool we need
On Fri, Jul 06, 2012 at 08:44:13PM -0400, Christopher Browne wrote:
> I wonder if maybe the nearest step towards a better bug tracker is a
> more readily referenceable mail archive. Clearly, one of our frictions
> is searching for relevant messages, so improved mail archive == lowered
> friction, no?
>
> There's a very particular use case: people keep ruing that indexes get
> cut off on a monthly basis. That's doubtless not the only pain, but it
> keeps getting mentioned, so solving it seems valuable.

Agreed. I think Magnus is working on having the threads span months. The big question is what we are going to do with this ability once we get it.

> Having a correlation between commits, commitfest entries, and associated
> email seems like another valuable addition.

Yep.

> Perhaps there are more... I'm not yet poking at anything that would
> suggest "email database", either. A lot of the analysis would be more
> network-oriented; putting more of a Prolog hat on, not so much tabular /
> relational...

To put a finer point on this, I think projects that interact with users via a bug tracker have much poorer user/developer communication, and also less impetus to fix bugs quickly, because everything is already recorded in the tracker. And after not dealing with bugs immediately for a while, the bug database becomes huge, and developers can only triage it, fixing commonly-reported bugs and leaving the rest for later, which effectively means never.

--
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] Bug tracker tool we need
On Fri, Jul 6, 2012 at 6:41 PM, Daniel Farina dan...@heroku.com wrote:
> I do basically agree with this. I was reflecting on the bug tracker
> issue (or lack thereof) for unrelated reasons earlier today and I think
> there are some very nice things to recommend the current email-based
> system, which are the reasons you identify above. Perhaps the area where
> it falls down is structured searches (such as for "closed" or "wontfix")
> and tracking progress of related, complex, or multi-part issues that
> span multiple root email messages.
>
> Maybe just using the message-ids to cross reference things (or at least
> morally: perhaps a point of indirection so as to collapse multiple bug
> reports about the same thing, or to provide a place to add more
> annotation would be good, not unlike the CommitFest web application in
> relation to emails) is enough. Basically, perhaps an overlay on top of
> email might be a more supple way to figure out what process improvements
> work well without committing to a whole new tool chain and workflow all
> at once.

+1. This is almost word-for-word how I feel about it myself.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Allow replacement of bloated primary key indexes without foreign key rebuilds
Today I learnt [1,2,3] that the feature ALTER TABLE .. ADD CONSTRAINT ... USING INDEX we added back in the day is not so useful in the field. Having to drop foreign key constraints before this command and recreate them afterwards makes it useless to most database setups. I feel sorry that no one brought this up when we were implementing the feature; maybe we could've done something about it right then. I wish to correct it now, and did some research.

Long story short, I realized that the foreign key constraint depends on the index relation of the primary key, and does not depend on the constraint object as I had expected (please see rows 5 and 11 of the result set shown below). This behaviour is also seen when the FKey references a unique constraint. As much as that perplexes me, I think it makes our job a bit easier. All we need to do is allow swapping of pg_class.relfilenode between two indexes. This will let the dependency entries stand as they are, and allow us to drop the bloated primary key index structure without having to rebuild the foreign key constraints.

As for the syntactic sugar, this could be added to either ALTER TABLE or ALTER INDEX. Although under no normal circumstances would one need to use ALTER INDEX to swap two indexes' relfilenodes (because one can easily create a duplicate index and drop/rename-in-place the old one), I think it would make more sense here, since it is just an operation on two indexes and has nothing to do with the constraints, apart from the fact that we want to use this feature to meddle with the constraints.
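The dependency layout can be inspected directly; a query along these lines (my own sketch, not necessarily the query referenced as [4] below) resolves the pg_depend catalog OIDs for the two example tables:

```sql
-- Sketch: show dependency rows touching test/test2, resolving the
-- classid/refclassid OIDs to catalog names via the regclass cast.
-- deptype: 'n' = NORMAL, 'a' = AUTO, 'i' = INTERNAL.
SELECT classid::regclass    AS classid,
       objid,
       refclassid::regclass AS refclassid,
       refobjid,
       deptype
FROM   pg_depend
WHERE  refobjid IN ('test'::regclass, 'test2'::regclass)
    OR objid    IN ('test'::regclass, 'test2'::regclass);
```

The rows of interest are the two test2_b_fkey entries: one NORMAL dependency on the referenced table, and one on the index relation (not the pg_constraint entry), which is what makes the relfilenode-swap approach viable.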
Syntax options:

ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE} USING INDEX new_index;
ALTER INDEX ind REPLACE WITH new_index;

Note that in both syntaxes, it is assumed that all remnants of new_index will be gone after the command completes successfully; that is, the commands will behave as if they deleted the index structure of the index being replaced and placed the new structure in its place, while dropping the index that was used for the replacement. I don't think we need to ensure that new_index is completely flushed to disk before the operation, but we do need to issue the relevant cache invalidation messages after the operation is done. For the replacement to be successful, new_index should not be associated with any constraints, and new_index should be identical to the index being replaced, except for the index names.

The ALTER TABLE syntax closely emulates the existing syntax for replacing a constraint using an existing index, but looking at the grammar construction I feel it may be more complex to implement than the ALTER INDEX syntax. ALTER INDEX feels easier to do, since we won't have to jump through hoops like ALTER TABLE's multi-command support (the ATExec*() functions), and dropping new_index might be easier. Thoughts?
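To make the intended workflow concrete, here is how the proposed ALTER INDEX variant might be used against the example tables below. This is a sketch of the proposal only; the REPLACE WITH syntax does not exist, and the index names follow the example in this message:

```sql
-- Build an identical replacement index alongside the bloated one.
CREATE UNIQUE INDEX CONCURRENTLY temp_idx ON test (a);

-- Proposed syntax (not implemented): swap the two indexes'
-- pg_class.relfilenode values, so test_pkey keeps its OID (and hence
-- test2_b_fkey's dependency on it) while pointing at the freshly
-- built storage; temp_idx is then dropped as part of the command.
ALTER INDEX test_pkey REPLACE WITH temp_idx;
```

Because the foreign key's pg_depend entry names the index relation by OID, and the OID never changes, no FK drop/recreate is needed.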
postgres=# \d+ test
                         Table "public.test"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 a      | integer | not null  | plain   |              |
 b      | integer |           | plain   |              |
Indexes:
    "test_pkey" PRIMARY KEY, btree (a)
    "temp_idx" UNIQUE, btree (a)
Referenced by:
    TABLE "test2" CONSTRAINT "test2_b_fkey" FOREIGN KEY (b) REFERENCES test(a)
Has OIDs: no

postgres=# \d+ test2
                         Table "public.test2"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 a      | integer |           | plain   |              |
 b      | integer |           | plain   |              |
Foreign-key constraints:
    "test2_b_fkey" FOREIGN KEY (b) REFERENCES test(a)
Has OIDs: no

Relevant output of query [4] on pg_depend:

    classid    | objid |    objid     |  refclassid   | refobjid | refobjid  | deptype
---------------+-------+--------------+---------------+----------+-----------+---------
 pg_class      | 16413 | test         | pg_namespace  |     2200 | public    | n
 pg_type       | 16415 | test         | pg_class      |    16413 | test      | i
 pg_type       | 16414 | test[]       | pg_type       |    16415 | test      | i
 pg_constraint | 16417 | test_pkey    | pg_class      |    16413 | test      | a
 pg_class      | 16416 | test_pkey    | pg_constraint |    16417 | test_pkey | i
 pg_class      | 16418 | test2        | pg_namespace  |     2200 | public    | n
 pg_type       | 16420 | test2        | pg_class      |    16418 | test2     | i
 pg_type       | 16419 | test2[]      | pg_type       |    16420 | test2     | i
 pg_constraint | 16421 | test2_b_fkey | pg_class      |    16413 | test      | n
 pg_constraint | 16421 | test2_b_fkey | pg_class      |    16418 |