Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-19 Thread Robert Haas
On Tue, Jul 19, 2011 at 12:24 AM, Peter Eisentraut pete...@gmx.net wrote:
 Please review and fix this compiler warning:

 indexcmds.c: In function ‘CheckIndexCompatible’:
 indexcmds.c:126:15: warning: variable ‘amoptions’ set but not used 
 [-Wunused-but-set-variable]

I have removed the offending variable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-18 Thread Robert Haas
On Thu, Jul 7, 2011 at 3:25 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Jul 7, 2011 at 3:21 PM, Noah Misch n...@2ndquadrant.com wrote:
 On Thu, Jul 07, 2011 at 03:06:46PM -0400, Robert Haas wrote:
 On Thu, Jul 7, 2011 at 2:55 PM, Noah Misch n...@2ndquadrant.com wrote:
  CheckIndexCompatible() calls ComputeIndexAttrs() to resolve the new 
  operator
  classes, collations and exclusion operators for each index column.  It 
  then
  checks those against the existing values for the same.  I figured that was
  obvious enough, but do you want a new version noting that?

 I guess one question I had was... are we depending on the fact that
 ComputeIndexAttrs() performs a bunch of internal sanity checks?  Or
 are we just expecting those to always pass, and we're going to examine
 the outputs after the fact?

 Those checks can fail; consider an explicit operator class or collation that
 does not support the destination type.  At that stage, we neither rely on 
 those
 checks nor mind if they do fire.  If we somehow miss the problem at that 
 stage,
 DefineIndex() will detect it later.  Likewise, if we hit an error in
 CheckIndexCompatible(), we would also hit it later in DefineIndex().

 OK.

Committed with minor comment and documentation changes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-18 Thread Peter Eisentraut
On mån, 2011-07-18 at 11:05 -0400, Robert Haas wrote:
 On Thu, Jul 7, 2011 at 3:25 PM, Robert Haas robertmh...@gmail.com wrote:
  On Thu, Jul 7, 2011 at 3:21 PM, Noah Misch n...@2ndquadrant.com wrote:
  On Thu, Jul 07, 2011 at 03:06:46PM -0400, Robert Haas wrote:
  On Thu, Jul 7, 2011 at 2:55 PM, Noah Misch n...@2ndquadrant.com wrote:
   CheckIndexCompatible() calls ComputeIndexAttrs() to resolve the new 
   operator
   classes, collations and exclusion operators for each index column.  It 
   then
   checks those against the existing values for the same.  I figured that 
   was
   obvious enough, but do you want a new version noting that?
 
  I guess one question I had was... are we depending on the fact that
  ComputeIndexAttrs() performs a bunch of internal sanity checks?  Or
  are we just expecting those to always pass, and we're going to examine
  the outputs after the fact?
 
  Those checks can fail; consider an explicit operator class or collation 
  that
  does not support the destination type.  At that stage, we neither rely on 
  those
  checks nor mind if they do fire.  If we somehow miss the problem at that 
  stage,
  DefineIndex() will detect it later.  Likewise, if we hit an error in
  CheckIndexCompatible(), we would also hit it later in DefineIndex().
 
  OK.
 
 Committed with minor comment and documentation changes.

Please review and fix this compiler warning:

indexcmds.c: In function ‘CheckIndexCompatible’:
indexcmds.c:126:15: warning: variable ‘amoptions’ set but not used 
[-Wunused-but-set-variable]



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-07 Thread Robert Haas
On Wed, Jul 6, 2011 at 2:50 PM, Noah Misch n...@2ndquadrant.com wrote:
 Drat; fixed in this version.  My local branches contain a large test battery
 that I filter out of the diff before posting.  This time, that filter also
 removed an essential part of the patch.

OK, I'm pretty happy with this version, with the following minor caveats:

1. I still think the documentation changes could use some
word-smithing.  If I end up being the one who commits this, I'll take
a look at that as part of the commit.

2. I think it would be helpful to add a comment explaining why
CheckIndexCompatible() is calling

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-07 Thread Robert Haas
On Thu, Jul 7, 2011 at 2:43 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jul 6, 2011 at 2:50 PM, Noah Misch n...@2ndquadrant.com wrote:
 Drat; fixed in this version.  My local branches contain a large test battery
 that I filter out of the diff before posting.  This time, that filter also
 removed an essential part of the patch.

 OK, I'm pretty happy with this version, with the following minor caveats:

 1. I still think the documentation changes could use some
 word-smithing.  If I end up being the one who commits this, I'll take
 a look at that as part of the commit.

 2. I think it would be helpful to add a comment explaining why
 CheckIndexCompatible() is calling

Woops, sorry.  Hit send too soon.

...why CheckIndexCompatible() is calling ComputeIndexAttrs().

Rather than committing this immediately, I'm going to mark this Ready
for Committer, just in case Tom or someone else wants to look this
over and weigh in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-07 Thread Noah Misch
On Thu, Jul 07, 2011 at 02:44:49PM -0400, Robert Haas wrote:
 On Thu, Jul 7, 2011 at 2:43 PM, Robert Haas robertmh...@gmail.com wrote:
  On Wed, Jul 6, 2011 at 2:50 PM, Noah Misch n...@2ndquadrant.com wrote:
  Drat; fixed in this version.  My local branches contain a large test 
  battery
  that I filter out of the diff before posting.  This time, that filter also
  removed an essential part of the patch.
 
  OK, I'm pretty happy with this version, with the following minor caveats:
 
  1. I still think the documentation changes could use some
  word-smithing.  If I end up being the one who commits this, I'll take
  a look at that as part of the commit.
 
  2. I think it would be helpful to add a comment explaining why
  CheckIndexCompatible() is calling
 
 Woops, sorry.  Hit send too soon.
 
 ...why CheckIndexCompatible() is calling ComputeIndexAttrs().

CheckIndexCompatible() calls ComputeIndexAttrs() to resolve the new operator
classes, collations and exclusion operators for each index column.  It then
checks those against the existing values for the same.  I figured that was
obvious enough, but do you want a new version noting that?

 Rather than committing this immediately, I'm going to mark this Ready
 for Committer, just in case Tom or someone else wants to look this
 over and weigh in.

Great.  Thanks for reviewing.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-07 Thread Robert Haas
On Thu, Jul 7, 2011 at 2:55 PM, Noah Misch n...@2ndquadrant.com wrote:
 CheckIndexCompatible() calls ComputeIndexAttrs() to resolve the new operator
 classes, collations and exclusion operators for each index column.  It then
 checks those against the existing values for the same.  I figured that was
 obvious enough, but do you want a new version noting that?

I guess one question I had was... are we depending on the fact that
ComputeIndexAttrs() performs a bunch of internal sanity checks?  Or
are we just expecting those to always pass, and we're going to examine
the outputs after the fact?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-07 Thread Noah Misch
On Thu, Jul 07, 2011 at 03:06:46PM -0400, Robert Haas wrote:
 On Thu, Jul 7, 2011 at 2:55 PM, Noah Misch n...@2ndquadrant.com wrote:
  CheckIndexCompatible() calls ComputeIndexAttrs() to resolve the new operator
  classes, collations and exclusion operators for each index column.  It then
  checks those against the existing values for the same.  I figured that was
  obvious enough, but do you want a new version noting that?
 
 I guess one question I had was... are we depending on the fact that
 ComputeIndexAttrs() performs a bunch of internal sanity checks?  Or
 are we just expecting those to always pass, and we're going to examine
 the outputs after the fact?

Those checks can fail; consider an explicit operator class or collation that
does not support the destination type.  At that stage, we neither rely on those
checks nor mind if they do fire.  If we somehow miss the problem at that stage,
DefineIndex() will detect it later.  Likewise, if we hit an error in
CheckIndexCompatible(), we would also hit it later in DefineIndex().

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-07 Thread Robert Haas
On Thu, Jul 7, 2011 at 3:21 PM, Noah Misch n...@2ndquadrant.com wrote:
 On Thu, Jul 07, 2011 at 03:06:46PM -0400, Robert Haas wrote:
 On Thu, Jul 7, 2011 at 2:55 PM, Noah Misch n...@2ndquadrant.com wrote:
  CheckIndexCompatible() calls ComputeIndexAttrs() to resolve the new 
  operator
  classes, collations and exclusion operators for each index column.  It then
  checks those against the existing values for the same.  I figured that was
  obvious enough, but do you want a new version noting that?

 I guess one question I had was... are we depending on the fact that
 ComputeIndexAttrs() performs a bunch of internal sanity checks?  Or
 are we just expecting those to always pass, and we're going to examine
 the outputs after the fact?

 Those checks can fail; consider an explicit operator class or collation that
 does not support the destination type.  At that stage, we neither rely on 
 those
 checks nor mind if they do fire.  If we somehow miss the problem at that 
 stage,
 DefineIndex() will detect it later.  Likewise, if we hit an error in
 CheckIndexCompatible(), we would also hit it later in DefineIndex().

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-06 Thread Robert Haas
On Thu, Jun 30, 2011 at 1:02 PM, Robert Haas robertmh...@gmail.com wrote:
 On first blush, that looks a whole lot cleaner.  I'll try to find some
 time for a more detailed review soon.

This seems not to compile for me:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wformat-security
-fno-strict-aliasing -fwrapv -g -Werror -I../../../src/include
-I/opt/local/include  -c -o index.o index.c -MMD -MP -MF
.deps/index.Po
index.c:692: error: conflicting types for ‘index_create’
../../../src/include/catalog/index.h:53: error: previous declaration
of ‘index_create’ was here
cc1: warnings being treated as errors
index.c: In function ‘index_create’:
index.c:821: warning: passing argument 5 of ‘heap_create’ makes
integer from pointer without a cast
index.c:821: warning: passing argument 6 of ‘heap_create’ makes
pointer from integer without a cast
index.c:821: error: too few arguments to function ‘heap_create’

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-07-06 Thread Noah Misch
On Wed, Jul 06, 2011 at 09:55:01AM -0400, Robert Haas wrote:
 On Thu, Jun 30, 2011 at 1:02 PM, Robert Haas robertmh...@gmail.com wrote:
  On first blush, that looks a whole lot cleaner. ?I'll try to find some
  time for a more detailed review soon.
 
 This seems not to compile for me:
 
 gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
 -Wdeclaration-after-statement -Wendif-labels -Wformat-security
 -fno-strict-aliasing -fwrapv -g -Werror -I../../../src/include
 -I/opt/local/include  -c -o index.o index.c -MMD -MP -MF
 .deps/index.Po
 index.c:692: error: conflicting types for ?index_create?
 ../../../src/include/catalog/index.h:53: error: previous declaration
 of ?index_create? was here
 cc1: warnings being treated as errors
 index.c: In function ?index_create?:
 index.c:821: warning: passing argument 5 of ?heap_create? makes
 integer from pointer without a cast
 index.c:821: warning: passing argument 6 of ?heap_create? makes
 pointer from integer without a cast
 index.c:821: error: too few arguments to function ?heap_create?

Drat; fixed in this version.  My local branches contain a large test battery
that I filter out of the diff before posting.  This time, that filter also
removed an essential part of the patch.

Thanks,
nm
diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml
index a90b4e2..b6fe908 100644
*** a/doc/src/sgml/xindex.sgml
--- b/doc/src/sgml/xindex.sgml
***
*** 834,846  ALTER OPERATOR FAMILY integer_ops USING btree ADD
para
 In a B-tree operator family, all the operators in the family must sort
 compatibly, meaning that the transitive laws hold across all the data types
!supported by the family: quoteif A = B and B = C, then A =
!C/, and quoteif A lt; B and B lt; C, then A lt; C/.  For each
!operator in the family there must be a support function having the same
!two input data types as the operator.  It is recommended that a family be
!complete, i.e., for each combination of data types, all operators are
!included.  Each operator class should include just the non-cross-type
!operators and support function for its data type.
/para
  
para
--- 834,848 
para
 In a B-tree operator family, all the operators in the family must sort
 compatibly, meaning that the transitive laws hold across all the data types
!supported by the family: quoteif A = B and B = C, then A = C/,
!and quoteif A lt; B and B lt; C, then A lt; C/.  Subjecting operands
!to any number of implicit casts or binary coercion casts involving only
!types represented in the family must not change the result other than to
!require a similar cast thereon.  For each operator in the family there must
!be a support function having the same two input data types as the operator.
!It is recommended that a family be complete, i.e., for each combination of
!data types, all operators are included.  Each operator class should include
!just the non-cross-type operators and support function for its data type.
/para
  
para
***
*** 851,861  ALTER OPERATOR FAMILY integer_ops USING btree ADD
 by the family's equality operators, even when the values are of different
 types.  This is usually difficult to accomplish when the types have
 different physical representations, but it can be done in some cases.
!Notice that there is only one support function per data type, not one
!per equality operator.  It is recommended that a family be complete, i.e.,
!provide an equality operator for each combination of data types.
!Each operator class should include just the non-cross-type equality
!operator and the support function for its data type.
/para
  
para
--- 853,865 
 by the family's equality operators, even when the values are of different
 types.  This is usually difficult to accomplish when the types have
 different physical representations, but it can be done in some cases.
!Implicit casts and binary coercion casts among types represented in the
!family must preserve this invariant.  Notice that there is only one support
!function per data type, not one per equality operator.  It is recommended
!that a family be complete, i.e., provide an equality operator for each
!combination of data types.  Each operator class should include just the
!non-cross-type equality operator and the support function for its data
!type.
/para
  
para
diff --git a/src/backend/bootstindex f3e85aa..d0a0e92 100644
*** a/src/backend/bootstrap/bootparse.y
--- b/src/backend/bootstrap/bootparse.y
***
*** 217,222  Boot_CreateStmt:
--- 217,223 

   PG_CATALOG_NAMESPACE,

   shared_relation ? GLOBALTABLESPACE_OID : 0,
 

Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-06-30 Thread Noah Misch
On Wed, Jun 29, 2011 at 09:42:06AM -0400, Robert Haas wrote:
 On Tue, Jun 28, 2011 at 3:40 PM, Noah Misch n...@leadboat.com wrote:

  Here's the call stack in question:
 
  ? ? ? ?RelationBuildLocalRelation
  ? ? ? ?heap_create
  ? ? ? ?index_create
  ? ? ? ?DefineIndex
  ? ? ? ?ATExecAddIndex
 
  Looking at it again, it wouldn't bring the end of the world to add a 
  relfilenode
  argument to each. None of those have more than four callers.
 
 Yeah.  Those functions take an awful lot of arguments, which suggests
 that some refactoring might be in order, but I still think it's
 cleaner to add another argument than to change the state around
 after-the-fact.
 
  ATExecAddIndex()
  would then call RelationPreserveStorage() before calling DefineIndex(), 
  which
  would in turn put things in a correct state from the start. ?Does that seem
  appropriate? ?Offhand, I do like it better than what I had.
 
 I wish we could avoid the whole death-and-resurrection thing
 altogether, but off-hand I'm not seeing a real clean way to do that.
 At the very least we had better comment it to death.

I couldn't think of a massive amount to say about that, but see what you think
of this level of commentary.

Looking at this again turned up a live bug in the previous version: if the old
index file were created in the current transaction, we would wrongly remove its
delete-at-abort entry as well as its delete-at-commit entry.  This leaked the
disk file.  Fixed by adding an argument to RelationPreserveStorage() indicating
which kind to remove.  Test case:

BEGIN;
CREATE TABLE t AS SELECT * FROM generate_series(1,10) t(n);
CREATE INDEX ti ON t(n);
SELECT pg_relation_filepath('ti');
ALTER TABLE t ALTER n TYPE int;
ROLLBACK;
CHECKPOINT;
-- file named above should be gone

I also updated the ATPostAlterTypeCleanup() variable names per discussion and
moved IndexStmt.oldNode to a more-natural position in the structure.

Thanks,
nm
diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml
index a90b4e2..b6fe908 100644
*** a/doc/src/sgml/xindex.sgml
--- b/doc/src/sgml/xindex.sgml
***
*** 834,846  ALTER OPERATOR FAMILY integer_ops USING btree ADD
para
 In a B-tree operator family, all the operators in the family must sort
 compatibly, meaning that the transitive laws hold across all the data types
!supported by the family: quoteif A = B and B = C, then A =
!C/, and quoteif A lt; B and B lt; C, then A lt; C/.  For each
!operator in the family there must be a support function having the same
!two input data types as the operator.  It is recommended that a family be
!complete, i.e., for each combination of data types, all operators are
!included.  Each operator class should include just the non-cross-type
!operators and support function for its data type.
/para
  
para
--- 834,848 
para
 In a B-tree operator family, all the operators in the family must sort
 compatibly, meaning that the transitive laws hold across all the data types
!supported by the family: quoteif A = B and B = C, then A = C/,
!and quoteif A lt; B and B lt; C, then A lt; C/.  Subjecting operands
!to any number of implicit casts or binary coercion casts involving only
!types represented in the family must not change the result other than to
!require a similar cast thereon.  For each operator in the family there must
!be a support function having the same two input data types as the operator.
!It is recommended that a family be complete, i.e., for each combination of
!data types, all operators are included.  Each operator class should include
!just the non-cross-type operators and support function for its data type.
/para
  
para
***
*** 851,861  ALTER OPERATOR FAMILY integer_ops USING btree ADD
 by the family's equality operators, even when the values are of different
 types.  This is usually difficult to accomplish when the types have
 different physical representations, but it can be done in some cases.
!Notice that there is only one support function per data type, not one
!per equality operator.  It is recommended that a family be complete, i.e.,
!provide an equality operator for each combination of data types.
!Each operator class should include just the non-cross-type equality
!operator and the support function for its data type.
/para
  
para
--- 853,865 
 by the family's equality operators, even when the values are of different
 types.  This is usually difficult to accomplish when the types have
 different physical representations, but it can be done in some cases.
!Implicit casts and binary coercion casts among types represented in the
!family must preserve this invariant.  Notice that there is only one support
!function per data type, not one per equality operator.  It is recommended
! 

Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-06-30 Thread Robert Haas
On Thu, Jun 30, 2011 at 11:55 AM, Noah Misch n...@leadboat.com wrote:
 On Wed, Jun 29, 2011 at 09:42:06AM -0400, Robert Haas wrote:
 On Tue, Jun 28, 2011 at 3:40 PM, Noah Misch n...@leadboat.com wrote:

  Here's the call stack in question:
 
  ? ? ? ?RelationBuildLocalRelation
  ? ? ? ?heap_create
  ? ? ? ?index_create
  ? ? ? ?DefineIndex
  ? ? ? ?ATExecAddIndex
 
  Looking at it again, it wouldn't bring the end of the world to add a 
  relfilenode
  argument to each. None of those have more than four callers.

 Yeah.  Those functions take an awful lot of arguments, which suggests
 that some refactoring might be in order, but I still think it's
 cleaner to add another argument than to change the state around
 after-the-fact.

  ATExecAddIndex()
  would then call RelationPreserveStorage() before calling DefineIndex(), 
  which
  would in turn put things in a correct state from the start. ?Does that seem
  appropriate? ?Offhand, I do like it better than what I had.

 I wish we could avoid the whole death-and-resurrection thing
 altogether, but off-hand I'm not seeing a real clean way to do that.
 At the very least we had better comment it to death.

 I couldn't think of a massive amount to say about that, but see what you think
 of this level of commentary.

 Looking at this again turned up a live bug in the previous version: if the old
 index file were created in the current transaction, we would wrongly remove 
 its
 delete-at-abort entry as well as its delete-at-commit entry.  This leaked the
 disk file.  Fixed by adding an argument to RelationPreserveStorage() 
 indicating
 which kind to remove.  Test case:

        BEGIN;
        CREATE TABLE t AS SELECT * FROM generate_series(1,10) t(n);
        CREATE INDEX ti ON t(n);
        SELECT pg_relation_filepath('ti');
        ALTER TABLE t ALTER n TYPE int;
        ROLLBACK;
        CHECKPOINT;
        -- file named above should be gone

 I also updated the ATPostAlterTypeCleanup() variable names per discussion and
 moved IndexStmt.oldNode to a more-natural position in the structure.

On first blush, that looks a whole lot cleaner.  I'll try to find some
time for a more detailed review soon.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-06-29 Thread Robert Haas
On Tue, Jun 28, 2011 at 3:40 PM, Noah Misch n...@leadboat.com wrote:
 On Tue, Jun 28, 2011 at 02:11:11PM -0400, Robert Haas wrote:
 On Mon, Jun 27, 2011 at 10:43 PM, Noah Misch n...@leadboat.com wrote:
  On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote:
  On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch n...@leadboat.com wrote:
   [patch to avoid index rebuilds]
 
  With respect to the documentation hunks, it seems to me that the first
  hunk might be made clearer by leaving the paragraph of which it is a
  part as-is, and adding another paragraph afterwards beginning with the
  words In addition.
 
  The added restriction elaborates on the transitivity requirement, so I 
  wanted to
  keep the new language adjacent to that.

 That makes sense, but it reads a bit awkwardly to me.  Maybe it's just
 that the sentence itself is so complicated that I have difficulty
 understanding it.  I guess it's the same problem as with the text you
 added about hash indexes: without thinking about it, it's hard to
 understand what territory is covered by the new sentence that is not
 covered by what's already there.  In the case of the hash indexes, I
 think I have it figured out: we already know that we must get
 compatible hash values out of logically equal values of different
 datatypes.  But it's possible that the inter-type cast changes the
 value in some arbitrary way and then compensates for it by defining
 the hash function in such a way as to compensate.  Similarly, for
 btrees, we need the relative ordering of A and B to remain the same
 after casting within the opfamily, not to be rearranged somehow.
 Maybe the text you've got is OK to explain this, but I wonder if
 there's a way to do it more simply.

 That's basically right, I think.  Presently, there is no connection between
 casts and operator family notions of equality.  For example, a cast can change
 the hash value.  In general, that's not wrong.  However, I wish to forbid it
 when some hash operator family covers both the source and destination types of
 the cast.  Note that, I don't especially care whether the stored bits changed 
 or
 not.  I just want casts to preserve equality when an operator family defines
 equality across the types involved in the cast.  The specific way of
 articulating that is probably vulnerable to improvement.

  It would be valuable to avoid introducing a second chunk of code that knows
  everything about the catalog entries behind an index. ?That's what led me 
  to the
  put forward the most recent version as best. ?What do you find vile about 
  that
  approach? ?I wasn't comfortable with it at first, because I suspected the 
  checks
  in RelationPreserveStorage() might be important for correctness. ?Having 
  studied
  it some more, though, I think they just reflect the narrower needs of its
  current sole user.

 Maybe vile is a strong word, but it seems like a modularity violation:
 we're basically letting DefineIndex() do some stuff we don't really
 want done, and then backing it out parts of it that we don't really
 want done after all.  It seems like it would be better to provide
 DefineIndex with enough information not to do the wrong thing in the
 first place.  Could we maybe pass stmt-oldNode to DefineIndex(), and
 let it work out the details?

 True.  I initially shied away from that, because we assume somewhat deep into
 the stack that the new relation will have pg_class.oid = pg_class.relfilenode.
 Here's the call stack in question:

        RelationBuildLocalRelation
        heap_create
        index_create
        DefineIndex
        ATExecAddIndex

 Looking at it again, it wouldn't bring the end of the world to add a 
 relfilenode
 argument to each. None of those have more than four callers.

Yeah.  Those functions take an awful lot of arguments, which suggests
that some refactoring might be in order, but I still think it's
cleaner to add another argument than to change the state around
after-the-fact.

 ATExecAddIndex()
 would then call RelationPreserveStorage() before calling DefineIndex(), which
 would in turn put things in a correct state from the start.  Does that seem
 appropriate?  Offhand, I do like it better than what I had.

I wish we could avoid the whole death-and-resurrection thing
altogether, but off-hand I'm not seeing a real clean way to do that.
At the very least we had better comment it to death.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-06-28 Thread Robert Haas
On Mon, Jun 27, 2011 at 10:43 PM, Noah Misch n...@leadboat.com wrote:
 On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote:
 On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch n...@leadboat.com wrote:
  [patch to avoid index rebuilds]

 With respect to the documentation hunks, it seems to me that the first
 hunk might be made clearer by leaving the paragraph of which it is a
 part as-is, and adding another paragraph afterwards beginning with the
 words In addition.

 The added restriction elaborates on the transitivity requirement, so I wanted 
 to
 keep the new language adjacent to that.

That makes sense, but it reads a bit awkwardly to me.  Maybe it's just
that the sentence itself is so complicated that I have difficulty
understanding it.  I guess it's the same problem as with the text you
added about hash indexes: without thinking about it, it's hard to
understand what territory is covered by the new sentence that is not
covered by what's already there.  In the case of the hash indexes, I
think I have it figured out: we already know that we must get
compatible hash values out of logically equal values of different
datatypes.  But it's possible that the inter-type cast changes the
value in some arbitrary way and then compensates for it by defining
the hash function in such a way as to compensate.  Similarly, for
btrees, we need the relative ordering of A and B to remain the same
after casting within the opfamily, not to be rearranged somehow.
Maybe the text you've got is OK to explain this, but I wonder if
there's a way to do it more simply.

 As you no doubt expected, my eyes was immediately drawn to the
 index-resurrection hack.  Reviewing the thread, I see that you asked
 about that in January and never got feedback.  I have to say that what
 you've done here looks like a pretty vile hack, but it's hard to say
 for sure without knowing what to compare it against.  You made
 reference to this being smaller and simpler than updating the index
 definition in place - can you give a sketch of what would need to be
 done if we went that route instead?

 In at7-index-opfamily.patch attached to
 http://archives.postgresql.org/message-id/20110113230124.ga18...@tornado.gateway.2wire.net
 check out the code following the comment /* The old index is compatible.
 Update catalogs. */ until the end of the function.  That code would need
 updates for per-column collations, and it incorrectly reuses
 values/nulls/replace arrays.  It probably does not belong in tablecmds.c,
 either.  However, it gives the right general outline.

 It would be valuable to avoid introducing a second chunk of code that knows
 everything about the catalog entries behind an index.  That's what led me to 
 the
 put forward the most recent version as best.  What do you find vile about that
 approach?  I wasn't comfortable with it at first, because I suspected the 
 checks
 in RelationPreserveStorage() might be important for correctness.  Having 
 studied
 it some more, though, I think they just reflect the narrower needs of its
 current sole user.

Maybe vile is a strong word, but it seems like a modularity violation:
we're basically letting DefineIndex() do some stuff we don't really
want done, and then backing it out parts of it that we don't really
want done after all.  It seems like it would be better to provide
DefineIndex with enough information not to do the wrong thing in the
first place.  Could we maybe pass stmt-oldNode to DefineIndex(), and
let it work out the details?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-06-28 Thread Noah Misch
On Tue, Jun 28, 2011 at 02:11:11PM -0400, Robert Haas wrote:
 On Mon, Jun 27, 2011 at 10:43 PM, Noah Misch n...@leadboat.com wrote:
  On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote:
  On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch n...@leadboat.com wrote:
   [patch to avoid index rebuilds]
 
  With respect to the documentation hunks, it seems to me that the first
  hunk might be made clearer by leaving the paragraph of which it is a
  part as-is, and adding another paragraph afterwards beginning with the
  words In addition.
 
  The added restriction elaborates on the transitivity requirement, so I 
  wanted to
  keep the new language adjacent to that.
 
 That makes sense, but it reads a bit awkwardly to me.  Maybe it's just
 that the sentence itself is so complicated that I have difficulty
 understanding it.  I guess it's the same problem as with the text you
 added about hash indexes: without thinking about it, it's hard to
 understand what territory is covered by the new sentence that is not
 covered by what's already there.  In the case of the hash indexes, I
 think I have it figured out: we already know that we must get
 compatible hash values out of logically equal values of different
 datatypes.  But it's possible that the inter-type cast changes the
 value in some arbitrary way and then compensates for it by defining
 the hash function in such a way as to compensate.  Similarly, for
 btrees, we need the relative ordering of A and B to remain the same
 after casting within the opfamily, not to be rearranged somehow.
 Maybe the text you've got is OK to explain this, but I wonder if
 there's a way to do it more simply.

That's basically right, I think.  Presently, there is no connection between
casts and operator family notions of equality.  For example, a cast can change
the hash value.  In general, that's not wrong.  However, I wish to forbid it
when some hash operator family covers both the source and destination types of
the cast.  Note that, I don't especially care whether the stored bits changed or
not.  I just want casts to preserve equality when an operator family defines
equality across the types involved in the cast.  The specific way of
articulating that is probably vulnerable to improvement.

  It would be valuable to avoid introducing a second chunk of code that knows
  everything about the catalog entries behind an index. ?That's what led me 
  to the
  put forward the most recent version as best. ?What do you find vile about 
  that
  approach? ?I wasn't comfortable with it at first, because I suspected the 
  checks
  in RelationPreserveStorage() might be important for correctness. ?Having 
  studied
  it some more, though, I think they just reflect the narrower needs of its
  current sole user.
 
 Maybe vile is a strong word, but it seems like a modularity violation:
 we're basically letting DefineIndex() do some stuff we don't really
 want done, and then backing it out parts of it that we don't really
 want done after all.  It seems like it would be better to provide
 DefineIndex with enough information not to do the wrong thing in the
 first place.  Could we maybe pass stmt-oldNode to DefineIndex(), and
 let it work out the details?

True.  I initially shied away from that, because we assume somewhat deep into
the stack that the new relation will have pg_class.oid = pg_class.relfilenode.
Here's the call stack in question:

RelationBuildLocalRelation
heap_create
index_create
DefineIndex
ATExecAddIndex

Looking at it again, it wouldn't bring the end of the world to add a relfilenode
argument to each.  None of those have more than four callers.  ATExecAddIndex()
would then call RelationPreserveStorage() before calling DefineIndex(), which
would in turn put things in a correct state from the start.  Does that seem
appropriate?  Offhand, I do like it better than what I had.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-06-27 Thread Robert Haas
On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch n...@leadboat.com wrote:
 [patch to avoid index rebuilds]

With respect to the documentation hunks, it seems to me that the first
hunk might be made clearer by leaving the paragraph of which it is a
part as-is, and adding another paragraph afterwards beginning with the
words In addition.  I am not sure whether the second hunk is
necessary at all.  Doesn't the existing language cover the same
territory as what you've added?

I think that the variables in ATPostAlterTypeCleanup() could be better
named.  They appear to be values, when in fact they are ListCells.
Honestly I'd probably just use l1 and l2, but if you want to insist on
some more mnemonic naming it should probably be something that sounds
vaguely list-ish.

As you no doubt expected, my eyes was immediately drawn to the
index-resurrection hack.  Reviewing the thread, I see that you asked
about that in January and never got feedback.  I have to say that what
you've done here looks like a pretty vile hack, but it's hard to say
for sure without knowing what to compare it against.  You made
reference to this being smaller and simpler than updating the index
definition in place - can you give a sketch of what would need to be
done if we went that route instead?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

2011-06-27 Thread Noah Misch
On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote:
 On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch n...@leadboat.com wrote:
  [patch to avoid index rebuilds]
 
 With respect to the documentation hunks, it seems to me that the first
 hunk might be made clearer by leaving the paragraph of which it is a
 part as-is, and adding another paragraph afterwards beginning with the
 words In addition.

The added restriction elaborates on the transitivity requirement, so I wanted to
keep the new language adjacent to that.

 I am not sure whether the second hunk is
 necessary at all.  Doesn't the existing language cover the same
 territory as what you've added?

The first hunk updates the contract for btree families, and the second updates
the contract for hash families.  I kept the second instance a bit terse since it
follows soon after the similar text for B-tree.

 I think that the variables in ATPostAlterTypeCleanup() could be better
 named.  They appear to be values, when in fact they are ListCells.
 Honestly I'd probably just use l1 and l2, but if you want to insist on
 some more mnemonic naming it should probably be something that sounds
 vaguely list-ish.

Okay; I'll do that in the next version.  Either l1/l2 or maybe oid_item/def_item
like we use in postgres.c.

 As you no doubt expected, my eyes was immediately drawn to the
 index-resurrection hack.  Reviewing the thread, I see that you asked
 about that in January and never got feedback.  I have to say that what
 you've done here looks like a pretty vile hack, but it's hard to say
 for sure without knowing what to compare it against.  You made
 reference to this being smaller and simpler than updating the index
 definition in place - can you give a sketch of what would need to be
 done if we went that route instead?

In at7-index-opfamily.patch attached to
http://archives.postgresql.org/message-id/20110113230124.ga18...@tornado.gateway.2wire.net
check out the code following the comment /* The old index is compatible.
Update catalogs. */ until the end of the function.  That code would need
updates for per-column collations, and it incorrectly reuses
values/nulls/replace arrays.  It probably does not belong in tablecmds.c,
either.  However, it gives the right general outline.

It would be valuable to avoid introducing a second chunk of code that knows
everything about the catalog entries behind an index.  That's what led me to the
put forward the most recent version as best.  What do you find vile about that
approach?  I wasn't comfortable with it at first, because I suspected the checks
in RelationPreserveStorage() might be important for correctness.  Having studied
it some more, though, I think they just reflect the narrower needs of its
current sole user.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers