I'd like to add a new column to pg_attribute that specifies the
attribute's "logical position" within its relation. The idea here is
to separate the logical order of the columns in a relation from the
on-disk storage of the relation's tuples. This allows us to easily &
quickly change column order, add an additional column before or after
an existing column, etc.

At present, attnum basically does three things: identifies an column
within a relation, indicates which columns are system columns, and
defines the order of a relation's columns. I'd like to move this last
functionality into a separate pg_attribute column named "attpos" (or
"attlogicalpos"):

         - when the table is created, attnum == attpos. System columns
           have attpos < 0, as with attnum. At no point will two
           columns of the same relation have the same attpos.

         - when returning output to the client and no column ordering
           is implied by the query (e.g. "SELECT * ..."), we sort the
           columns in ascending attpos order.

         - when storing a tuple on disk, we don't consider attpos

         - if we want to change the order of the column's in a
           relation, we can do so merely by updating pg_attribute; no
           changes to the on-disk storage of the relation should be
           necessary

A few notes:

  (a) ISTM this should also apply to COPY TO and COPY FROM if the user
      didn't supply a column list. Is this reasonable? It would break
      dumps of the table's contents, but then again, dumps aren't
      guaranteed to remain valid over arbitrary changes to the table's
      meta-data.

  (b) Using the above scheme that attnum == attpos initially, there
      won't be any gaps in the sequence of attpos values. That means
      that if, for example, we want to move the column in position 50
      to position 1, we'll need to change the position's of all the
      columns in positions [1..49] (and suffer the resulting MVCC
      bloat in pg_attribute). Changing the column order is hardly a
      performance critical operation, so that might be acceptable.

      If we want to avoid this, one easy (but arguably unclean) way to
      do so would be to make the initial value of attpos == attnum *
      1000, and make attpos an int4 rather than an int2. Then, we can
      do most column reordering operations with only a single
      pg_attribute update -- in the worst-case that enough
      re-orderings are done that we overflow the 999 "padding"
      positions, we can just fall-back to doing multiple pg_attribute
      updates. Is this worth doing, and/or is there a better way to
      achieve the same effect?

  (c) Do I need to consider inheritance?

Comments are welcome.

-Neil


---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Reply via email to