Re: [HACKERS] logical column ordering

2015-04-14 Thread Robert Haas
On Tue, Apr 14, 2015 at 2:38 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 As I said, I'm still writing the first pieces of this so I'm not sure
 what other ramifications it will have.  If there are any thoughts, I
 would appreciate them.  (Particularly useful input on whether it is
 acceptable to omit lognums/physnums from _outRangeTblEntry.)

I think the general rule is that an RTE should not contain any
structure information about the underlying relation that can
potentially change: the OID is OK because it's immutable for any given
relation.  The relkind is not quite immutable because you can create a
_SELECT rule on a table and turn it into a view; I'm not sure how we
handle that, but it's a fairly minor exception anyhow.  Everything
else in the RTE, with the new and perhaps-unfortunate exception of
security quals, is stuff derived from what's in the query, not the
table.  I think it would be good for this to work the same way: the
structural information about the table should be found in the
relcache, not the RTE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-04-14 Thread Alvaro Herrera
I've been looking at this again.  It has become apparent to me that what
we're doing in parse analysis is wrong, and the underlying node
representation is wrong too.  Here's a different approach, which I hope
will give better fruits.  I'm still working on implementing the ideas
here (and figuring out what the fallout is).

Currently, the patch stores RangeTblEntry-eref-colnames in logical
order; and it also adds a map from logical colnums to attnum (called
lognums).  Now this is problematic for two reasons:

1. the lognums map becomes part of the stored representation of a rule;
any time you modified the logical ordering of a table underlying some
view, the view's _RETURN rule would have to be modified as well.  Not
good.

2. RTE-eref-colnames is in attlognum order and thus can only be sanely
interpreted if RTE-lognums is available, so not only lognums would have
to be modified, but colnames as well.

I think the solution to both these issues is to store colnames in attnum
ordering not logical, and *not* output RTE-lognums as part of
_outRangeTblEntry.  This means that every time we read the RTE for the
table we need to obtain lognums from its tupledesc.  RTE-eref-colnames
can then be sorted appropriately at plan time.

At RTE creation time (addRangeTableEntry and siblings) we can obtain
lognums and physnums.  Both these arrays are available for later
application in setrefs.c, avoiding the need of the obviously misplaced
relation_open() call we currently have there.

There is one gotcha, which is that expandTupleDesc (and, really,
everything from expandRTE downwards) will need to be split in two
somehow: one part needs to fill in the colnames array in attnum order,
and the other part needs to expand the attribute array into Var nodes in
logical order.

(If you recall, we need attphysnums at setrefs.c time so that we can
fix-up any TupleDesc created from a targetlist so that it contains the
proper attphysnum values.  The attphysnum values for each attribute do
not propagate properly there, and I believe this is the mechanism to do
so.)

As I said, I'm still writing the first pieces of this so I'm not sure
what other ramifications it will have.  If there are any thoughts, I
would appreciate them.  (Particularly useful input on whether it is
acceptable to omit lognums/physnums from _outRangeTblEntry.)

An alternative idea would be to add lognums and physnums to RelOptInfo
instead of RangeTblEntry (we would do so during get_relation_info).  I'm
not sure how this works for setrefs.c though, if at all; the advantage
is that RelOptInfo is not part of stored rules so we don't have to worry
about not saving them there.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-24 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 On Thu, Mar 12, 2015 at 9:57 AM, Alvaro Herrera alvhe...@2ndquadrant.com 
 wrote:
 However, there's a difference between making a query silently given
 different results, and breaking it completely forcing the user to
 re-study how to write it.  I think the latter is better.  In that light
 we should just drop attnum as a column name, and use something else:
 maybe (attidnum, attlognum, attphysnum).  So all queries in the wild
 would be forced to be updated, but we would not silently change
 semantics instead.

 +1 for that approach.  Much better to break all of the third-party
 code out there definitively than to bet on which attribute people are
 going to want to use most commonly.

+1

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Andres Freund
On 2015-03-23 13:01:48 -0400, Robert Haas wrote:
 I'm a little confused as to the status of this patch.  It's marked as
 Waiting on Author in the CommitFest application, and the last patch
 version was posted in December.

I think it fairly can be marked as returned with feedback for now?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Andres Freund
On 2015-03-23 14:19:50 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
  On 2015-03-23 13:01:48 -0400, Robert Haas wrote:
   I'm a little confused as to the status of this patch.  It's marked as
   Waiting on Author in the CommitFest application, and the last patch
   version was posted in December.
  
  I think it fairly can be marked as returned with feedback for now?
 
 ... which means that no useful feedback was received at all in this
 round for this patch.  (There was lots of feedback, mind you, but as far
 as I can see it was all on the subject of how the patch is going to be
 summarily rejected unless user-visible controls are offered -- and you
 already know my opinion on that matter.)

To me the actual blocker seems to be the implementation. Which doesn't
look like it's going to be ready for 9.5; there seems to be loads of
work left to do. It's hard to provide non flame-bait feedback if the
patch isn't ready. I'm not sure what review you'd like to see at this
stage?

I think your approach of concentrating on the technical parts is sane,
and I'd continue going that way.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Robert Haas
On Thu, Mar 12, 2015 at 9:57 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 However, there's a difference between making a query silently given
 different results, and breaking it completely forcing the user to
 re-study how to write it.  I think the latter is better.  In that light
 we should just drop attnum as a column name, and use something else:
 maybe (attidnum, attlognum, attphysnum).  So all queries in the wild
 would be forced to be updated, but we would not silently change
 semantics instead.

+1 for that approach.  Much better to break all of the third-party
code out there definitively than to bet on which attribute people are
going to want to use most commonly.

I'm a little confused as to the status of this patch.  It's marked as
Waiting on Author in the CommitFest application, and the last patch
version was posted in December.  The fact that the new CommitFest
application encourages people to blindly move things to the next CF
instead of forcing patch authors to reopen the record when they update
the patch is, IMHO, not good.  It's just going to lead to the CF
application filling up with things that the authors aren't really
working on.  We've got enough work to do with the patches that are
actually under active development.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Tomas Vondra
On 23.3.2015 19:52, Peter Geoghegan wrote:
 On Mon, Mar 23, 2015 at 11:50 AM, Tomas Vondra
 tomas.von...@2ndquadrant.com wrote:
 Not sure how that's supposed to improve the situation? Also, when you
 change the status to 'stalled', it only makes it more difficult to
 identify why it was stalled (was it waiting for author or a review?).

 What might be done is tracking time since last patch/review, but
 I really don't know how we're going to identify that considering
 the problems with identifying which messages are patches.
 
 
 Perhaps I explained myself poorly. I am proposing having a totally 
 automated/mechanical way of highlighting no actual change in status
 in the CF app. So I think we are in agreement here, or close enough.
 I was just talking about a somewhat arbitrary point at which patches
 are considered to have stalled within the CF app.

Oh, right. Yes, tracking time since the last status change like this
might be useful, although my experience is that many patches are stuck
at some status yet there was a long discussion on the list ...  Not sure
if that counts as 'stalled'.


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Josh Berkus
On 03/23/2015 02:32 PM, Tomas Vondra wrote:
 Oh, right. Yes, tracking time since the last status change like this
 might be useful, although my experience is that many patches are stuck
 at some status yet there was a long discussion on the list ...  Not sure
 if that counts as 'stalled'.

Time since last email maybe.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Tomas Vondra
On 23.3.2015 18:30, Andres Freund wrote:

 I think it fairly can be marked as returned with feedback for
 now?

That will eventually be the end result, yes. If it's time to do that
now, or leave the patch in the CF and only bounce it at the end, I don't
know.


 ... which means that no useful feedback was received at all in
 this round for this patch. (There was lots of feedback, mind you,
 but as far as I can see it was all on the subject of how the patch
 is going to be summarily rejected unless user-visible controls are
 offered -- and you already know my opinion on that matter.)
 
 To me the actual blocker seems to be the implementation. Which
 doesn't look like it's going to be ready for 9.5; there seems to be
 loads of work left to do. It's hard to provide non flame-bait
 feedback if the patch isn't ready. I'm not sure what review you'd
 like to see at this stage?

The version I posted at the end of February is certainly incomplete (and
some of the regression tests fail), but it seemed reasonably complete to
get some feedback. That is not to say parts of the patch are probably
wrong / need rework.

 I think your approach of concentrating on the technical parts is
 sane, and I'd continue going that way.

I do work in that direction. OTOH I think it's useful to provide some
sort of minimum usable API so that people can actually use it without
messing with catalogs directly. It certainly won't have all the bells
and whistles, though.

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Tomas Vondra
On 23.3.2015 22:53, Jeff Janes wrote:
 On Mon, Mar 23, 2015 at 11:52 AM, Tomas Vondra
 
 Sorry, the 23/2 one is the one I meant.  I got confused over which of
 the emails listed as having an attachment but no patch was the one that
 actually had a patch.  (If the commitfest app can't correctly deal with
 more than one attachment, it needs to at least give an indication that
 this condition may exist).
 
 But I am still getting a lot of errors during make check.
 
 60 of 153 tests failed
 
 Some of them look like maybe a change in the expected output file didn't
 get included in the patch, but at least one was a coredump.

Yes, there were two coredumps (as noted in the message with the patch).

Not sure of the other errors - it certainly is possible I forgot to
include something in the patch. Thanks for noticing this, will look into
that.


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Jeff Janes
On Mon, Mar 23, 2015 at 10:01 AM, Robert Haas robertmh...@gmail.com wrote:

 On Thu, Mar 12, 2015 at 9:57 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  However, there's a difference between making a query silently given
  different results, and breaking it completely forcing the user to
  re-study how to write it.  I think the latter is better.  In that light
  we should just drop attnum as a column name, and use something else:
  maybe (attidnum, attlognum, attphysnum).  So all queries in the wild
  would be forced to be updated, but we would not silently change
  semantics instead.

 +1 for that approach.  Much better to break all of the third-party
 code out there definitively than to bet on which attribute people are
 going to want to use most commonly.

 I'm a little confused as to the status of this patch.  It's marked as
 Waiting on Author in the CommitFest application, and the last patch
 version was posted in December.


There was a patch here, which in the commit fest is hidden behind other
non-attachments in the same email:

Attachment (randomize.sql
http://www.postgresql.org/message-id/attachment/37076/randomize.sql)
at 2015-02-27
19:10:21
http://www.postgresql.org/message-id/54f0c11d.7000...@2ndquadrant.com/ from
Tomas Vondra tomas.vondra at 2ndquadrant.com

But that patch failed the majority of make check checks in my hands.  So
I also don't know what the status is.

Cheers,

Jeff


Re: [HACKERS] logical column ordering

2015-03-23 Thread Peter Geoghegan
On Mon, Mar 23, 2015 at 10:01 AM, Robert Haas robertmh...@gmail.com wrote:
 I'm a little confused as to the status of this patch.  It's marked as
 Waiting on Author in the CommitFest application, and the last patch
 version was posted in December.  The fact that the new CommitFest
 application encourages people to blindly move things to the next CF
 instead of forcing patch authors to reopen the record when they update
 the patch is, IMHO, not good.  It's just going to lead to the CF
 application filling up with things that the authors aren't really
 working on.  We've got enough work to do with the patches that are
 actually under active development.

Maybe there should be a stalled patch status summary, that
highlights patches that have not had their status change in (say) 2
weeks. Although it wouldn't really be a status summary, since that
they're mutually exclusive with each other in the CF app (e.g. a patch
cannot be both Waiting on Author and Ready for Committer).


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Alvaro Herrera
Andres Freund wrote:
 On 2015-03-23 13:01:48 -0400, Robert Haas wrote:
  I'm a little confused as to the status of this patch.  It's marked as
  Waiting on Author in the CommitFest application, and the last patch
  version was posted in December.
 
 I think it fairly can be marked as returned with feedback for now?

... which means that no useful feedback was received at all in this
round for this patch.  (There was lots of feedback, mind you, but as far
as I can see it was all on the subject of how the patch is going to be
summarily rejected unless user-visible controls are offered -- and you
already know my opinion on that matter.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Tomas Vondra
Hi,

On 23.3.2015 18:01, Robert Haas wrote:
 On Thu, Mar 12, 2015 at 9:57 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 However, there's a difference between making a query silently given
 different results, and breaking it completely forcing the user to
 re-study how to write it.  I think the latter is better.  In that light
 we should just drop attnum as a column name, and use something else:
 maybe (attidnum, attlognum, attphysnum).  So all queries in the wild
 would be forced to be updated, but we would not silently change
 semantics instead.
 
 +1 for that approach.  Much better to break all of the third-party
 code out there definitively than to bet on which attribute people are
 going to want to use most commonly.
 
 I'm a little confused as to the status of this patch.  It's marked as
 Waiting on Author in the CommitFest application, and the last patch
 version was posted in December.  The fact that the new CommitFest
 application encourages people to blindly move things to the next CF
 instead of forcing patch authors to reopen the record when they update
 the patch is, IMHO, not good.  It's just going to lead to the CF
 application filling up with things that the authors aren't really
 working on.  We've got enough work to do with the patches that are
 actually under active development.

The last version of the patch was submitted on 24/2 by me. Not sure why
it's not listed in the CF app, but it's here:

http://www.postgresql.org/message-id/54ebb312.7090...@2ndquadrant.com

I'm working on a new version of the patch, based on the ideas that were
mentioned in this thread. I plan to post a new version within a few
days, hopefully.

Anyway, it's obvious this patch won't make it into 9.5 - it's a lot of
subtle changes on many places, so it's not suitable for the last
commitfest. But the feedback is welcome, of course.


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Tomas Vondra
Hi,

On 23.3.2015 18:07, Peter Geoghegan wrote:
 On Mon, Mar 23, 2015 at 10:01 AM, Robert Haas robertmh...@gmail.com wrote:
 I'm a little confused as to the status of this patch.  It's marked as
 Waiting on Author in the CommitFest application, and the last patch
 version was posted in December.  The fact that the new CommitFest
 application encourages people to blindly move things to the next CF
 instead of forcing patch authors to reopen the record when they update
 the patch is, IMHO, not good.  It's just going to lead to the CF
 application filling up with things that the authors aren't really
 working on.  We've got enough work to do with the patches that are
 actually under active development.
 
 Maybe there should be a stalled patch status summary, that
 highlights patches that have not had their status change in (say) 2
 weeks. Although it wouldn't really be a status summary, since that
 they're mutually exclusive with each other in the CF app (e.g. a patch
 cannot be both Waiting on Author and Ready for Committer).

Not sure how that's supposed to improve the situation? Also, when you
change the status to 'stalled', it only makes it more difficult to
identify why it was stalled (was it waiting for author or a review?).

What might be done is tracking time since last patch/review, but I
really don't know how we're going to identify that considering the
problems with identifying which messages are patches.

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Peter Geoghegan
On Mon, Mar 23, 2015 at 11:50 AM, Tomas Vondra
tomas.von...@2ndquadrant.com wrote:
 Not sure how that's supposed to improve the situation? Also, when you
 change the status to 'stalled', it only makes it more difficult to
 identify why it was stalled (was it waiting for author or a review?).

 What might be done is tracking time since last patch/review, but I
 really don't know how we're going to identify that considering the
 problems with identifying which messages are patches.


Perhaps I explained myself poorly. I am proposing having a totally
automated/mechanical way of highlighting no actual change in status in
the CF app. So I think we are in agreement here, or close enough. I
was just talking about a somewhat arbitrary point at which patches are
considered to have stalled within the CF app.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-23 Thread Tomas Vondra
On 23.3.2015 18:08, Jeff Janes wrote:
 On Mon, Mar 23, 2015 at 10:01 AM, Robert Haas robertmh...@gmail.com
 mailto:robertmh...@gmail.com wrote:
 
 There was a patch here, which in the commit fest is hidden behind
 other non-attachments in the same email:
 
 Attachment (randomize.sql
 http://www.postgresql.org/message-id/attachment/37076/randomize.sql)
 at 2015-02-27 19:10:21
 http://www.postgresql.org/message-id/54f0c11d.7000...@2ndquadrant.com/ from
 Tomas Vondra tomas.vondra at 2ndquadrant.com http://2ndquadrant.com
 
 But that patch failed the majority of make check checks in my hands. 
 So I also don't know what the status is.

Ummm, that's not a patch but a testing script ...

There was a patch submitted on 23/2, and I believe that passes most make
check tests, except for two IIRC. But it's not perfect - it was the
first version that mostly worked, and was somehow suitable for getting
feedback.

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-13 Thread Jeff Janes
On Mon, Feb 23, 2015 at 3:09 PM, Tomas Vondra tomas.von...@2ndquadrant.com
wrote:

 Hi,

 attached is the result of my first attempt to make the logical column
 ordering patch work. This touches a lot of code in the executor that is
 mostly new to me, so if you see something that looks like an obvious
 bug, it probably is (so let me know).


There is an apply conflict in src/backend/parser/parse_relation.c in the
comments for scanRTEForColumn.

It seems like it should be easy to ignore, but when I ignore it I get make
check failing all over the place.

(The first patch posted in this thread seems to work for me, I did some
testing on it before I realized it was obsolete.)

Cheers,

Jeff


Re: [HACKERS] logical column ordering

2015-03-12 Thread Andres Freund
Hi,

On 2015-03-11 22:16:52 -0400, Tom Lane wrote:
 I agree though that it's worth considering defining pg_attribute.attnum as
 the logical column position so as to minimize the effects on client-side
 code.

I actually wonder if it'd not make more sense to define it as the
physical column number. That'd reduce the invasiveness and risk of the
patch considerably. It means that most existing code doesn't have to be
changed and can just continue to refer to attnum like today. There's
much less risk of it being wrongly used to refer to the physical offset
instead of creation order.  Queries against attnum would still give a
somewhat sane response.

It would make some ALTER TABLE commands a bit more complex if we want to
allow reordering the physical order. But that seems like a much more
localized complexity than previous patches in this thread (although I've
not looked at the last version).

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-12 Thread Alvaro Herrera
Tomas Vondra wrote:
 On 12.3.2015 03:16, Tom Lane wrote:

  I agree though that it's worth considering defining 
  pg_attribute.attnum as the logical column position so as to minimize 
  the effects on client-side code. I doubt there is much stuff 
  client-side that cares about column creation order, but there is 
  plenty that cares about logical column order. OTOH this would 
  introduce confusion into the backend code, since Alvaro's definition 
  of attnum is what most of the backend should care about.
 
 IMHO reusing attnum for logical column order would actually make it more
 complex, especially if we allow users to modify the logical order using
 ALTER TABLE. Because if you change it, you have to walk through all the
 places where it might be referenced and update those too (say, columns
 referenced in indexes and such). Keeping attnum immutable makes this
 much easier and simpler.

I think you're misunderstanding.  The suggestion, as I understand it, is
to rename the attnum column to something else (maybe, say, attidnum),
and rename attlognum to attnum.  That preserves the existing property
that ORDER BY attnum gives you the correct view of the table from the
point of view of the user.  That's very useful because it means clients
looking at pg_attribute need less changes, or maybe none at all.

I think this wouldn't be too difficult to implement, because there
aren't that many places that refer to the column-identity attribute by
name; most of them just grab the TupleDesc-attrs array in whatever
order is appropriate and scan that in a loop.  Only a few of these use
att-attnum inside the loop --- that's what would need to be changed,
and it should be pretty mechanical.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-12 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 However, there's a difference between making a query silently given
 different results, and breaking it completely forcing the user to
 re-study how to write it.  I think the latter is better.  In that light
 we should just drop attnum as a column name, and use something else:
 maybe (attidnum, attlognum, attphysnum).  So all queries in the wild
 would be forced to be updated, but we would not silently change
 semantics instead.

Hm.  I'm on board with renaming like that inside the backend, but
I'm very much less excited about forcibly breaking client queries.
I think there is little if any client-side code that will care about
either attidnum or attphysnum, so forcing people to think about it
will just create make-work.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-12 Thread Tomas Vondra
On 12.3.2015 14:17, Alvaro Herrera wrote:
 Tomas Vondra wrote:
 On 12.3.2015 03:16, Tom Lane wrote:
 
 I agree though that it's worth considering defining 
 pg_attribute.attnum as the logical column position so as to minimize 
 the effects on client-side code. I doubt there is much stuff 
 client-side that cares about column creation order, but there is 
 plenty that cares about logical column order. OTOH this would 
 introduce confusion into the backend code, since Alvaro's definition 
 of attnum is what most of the backend should care about.

 IMHO reusing attnum for logical column order would actually make it more
 complex, especially if we allow users to modify the logical order using
 ALTER TABLE. Because if you change it, you have to walk through all the
 places where it might be referenced and update those too (say, columns
 referenced in indexes and such). Keeping attnum immutable makes this
 much easier and simpler.
 
 I think you're misunderstanding. The suggestion, as I understand it,
 is to rename the attnum column to something else (maybe, say,
 attidnum), and rename attlognum to attnum. That preserves the
 existing property that ORDER BY attnum gives you the correct view
 of the table from the point of view of the user. That's very useful
 because it means clients looking at pg_attribute need less changes,
 or maybe none at all.

Hmm ... I understood it as a suggestion to drop attlognum and just
define (attnum, attphysnum).

 I think this wouldn't be too difficult to implement, because there 
 aren't that many places that refer to the column-identity attribute
 by name; most of them just grab the TupleDesc-attrs array in
 whatever order is appropriate and scan that in a loop. Only a few of
 these use att-attnum inside the loop --- that's what would need to
 be changed, and it should be pretty mechanical.

I think it's way more complicated. We may fix all the pieces of the
code, but that's not all - attnum is referenced in various system views,
catalogs and such. For example pg_stats view does this:

  FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid)
JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum)
  LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
  WHERE NOT attisdropped
AND has_column_privilege(c.oid, a.attnum, 'select');

information_schema also uses attnum on many places too.

I see the catalogs as a kind of public API, and redefining the meaning
of an existing column this way seems tricky, especially when we
reference it from other catalogs - I'm pretty sure there's plenty of SQL
queries in various tools that rely on this. Just google for pg_indexes
indkeys unnest and you'll find posts like this one from Craig:


http://stackoverflow.com/questions/18121103/how-to-get-the-index-column-orderasc-desc-nulls-first-from-postgresql

specifically tell people to do this:

SELECT
...
FROM (
  SELECT
pg_class.relname,
...
unnest(pg_index.indkey) AS k
  FROM pg_index
  INNER JOIN pg_class ON pg_index.indexrelid = pg_class.oid
) i
...
INNER JOIN pg_attribute ON (pg_attribute.attrelid = i.indrelid
AND pg_attribute.attnum = k);

which specifically tells people to match attnum vs. indkeys. If we
redefine the meaning of attnum, and instead match indkeys against a
different column (say, attidnum), all those queries will be broken.

Which actually breaks the catalog definition as specified here:

  http://www.postgresql.org/docs/devel/static/catalog-pg-index.html

which explicitly says that indkey references pg_attribute.attnum.

But maybe we don't really care about breaking this API and it is a good
approach - I need to think about it and try it.

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-12 Thread Alvaro Herrera
Tomas Vondra wrote:
 On 12.3.2015 14:17, Alvaro Herrera wrote:
  Tomas Vondra wrote:
  On 12.3.2015 03:16, Tom Lane wrote:
  
  I agree though that it's worth considering defining 
  pg_attribute.attnum as the logical column position so as to minimize 
  the effects on client-side code. I doubt there is much stuff 
  client-side that cares about column creation order, but there is 
  plenty that cares about logical column order. OTOH this would 
  introduce confusion into the backend code, since Alvaro's definition 
  of attnum is what most of the backend should care about.
 
  IMHO reusing attnum for logical column order would actually make it more
  complex, especially if we allow users to modify the logical order using
  ALTER TABLE. Because if you change it, you have to walk through all the
  places where it might be referenced and update those too (say, columns
  referenced in indexes and such). Keeping attnum immutable makes this
  much easier and simpler.
  
  I think you're misunderstanding. The suggestion, as I understand it,
  is to rename the attnum column to something else (maybe, say,
  attidnum), and rename attlognum to attnum. That preserves the
  existing property that ORDER BY attnum gives you the correct view
  of the table from the point of view of the user. That's very useful
  because it means clients looking at pg_attribute need less changes,
  or maybe none at all.
 
 Hmm ... I understood it as a suggestion to drop attlognum and just
 define (attnum, attphysnum).

Pretty sure it wasn't that.

  I think this wouldn't be too difficult to implement, because there 
  aren't that many places that refer to the column-identity attribute
  by name; most of them just grab the TupleDesc-attrs array in
  whatever order is appropriate and scan that in a loop. Only a few of
  these use att-attnum inside the loop --- that's what would need to
  be changed, and it should be pretty mechanical.
 
 I think it's way more complicated. We may fix all the pieces of the
 code, but that's not all - attnum is referenced in various system views,
 catalogs and such. For example pg_stats view does this:
 
   FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid)
 JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum)
   LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
   WHERE NOT attisdropped
 AND has_column_privilege(c.oid, a.attnum, 'select');
 
 information_schema also uses attnum on many places too.

Those can be fixed with relative ease to refer to attidnum instead.

 I see the catalogs as a kind of public API, and redefining the meaning
 of an existing column this way seems tricky, especially when we
 reference it from other catalogs - I'm pretty sure there's plenty of SQL
 queries in various tools that rely on this.

That's true, but then we've never promised that system catalogs remain
unchanged forever.  That would essentially stop development.

However, there's a difference between making a query silently given
different results, and breaking it completely forcing the user to
re-study how to write it.  I think the latter is better.  In that light
we should just drop attnum as a column name, and use something else:
maybe (attidnum, attlognum, attphysnum).  So all queries in the wild
would be forced to be updated, but we would not silently change
semantics instead.

 Which actually breaks the catalog definition as specified here:
 
   http://www.postgresql.org/docs/devel/static/catalog-pg-index.html
 
 which explicitly says that indkey references pg_attribute.attnum.

That's a simple doc fix.

 But maybe we don't really care about breaking this API and it is a good
 approach - I need to think about it and try it.

Yeah, thanks.


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-12 Thread Peter Eisentraut
On 3/12/15 10:07 AM, Andres Freund wrote:
 I actually wonder if it'd not make more sense to define it as the
 physical column number. That'd reduce the invasiveness and risk of the
 patch considerably.

Clearly, the number of places where attnum has to be changed to
something else is not zero, and so it doesn't matter if a lot or a few
have to be changed.  They all have to be looked at and considered.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-12 Thread Peter Eisentraut
On 3/11/15 10:16 PM, Tom Lane wrote:

 I think using an OID would break more stuff than it fixes in dependency
 tracking; in particular you would now need an explicit dependency link
 from each column to its table, because the sub-object knowledge would
 no longer work.

That might not be a bad thing, but ...

 In any case this patch is going to be plenty big enough
 already without saddling it with a major rewrite of the dependency system.

... is true.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-12 Thread Tomas Vondra
On 12.3.2015 03:16, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
 Side idea: Let attnum be the logical number, introduce attphysnum 
 as the storage position, and add an oid to pg_attribute as the 
 eternal identifier.
 
 That way you avoid breaking pretty much all user code that looks at
 pg_attribute, which will probably do something like ORDER BY 
 attnum.
 
 Also, one could get rid of all sorts of ugly code that works around
 the lack of an oid in pg_attribute, such as in the dependency
 tracking.
 
 I think using an OID would break more stuff than it fixes in 
 dependency tracking; in particular you would now need an explicit 
 dependency link from each column to its table, because the 
 sub-object knowledge would no longer work. In any case this patch 
 is going to be plenty big enough already without saddling it with a 
 major rewrite of the dependency system.

Exactly. I believe Alvaro considered that option in the past.

 I agree though that it's worth considering defining 
 pg_attribute.attnum as the logical column position so as to minimize 
 the effects on client-side code. I doubt there is much stuff 
 client-side that cares about column creation order, but there is 
 plenty that cares about logical column order. OTOH this would 
 introduce confusion into the backend code, since Alvaro's definition 
 of attnum is what most of the backend should care about.

IMHO reusing attnum for logical column order would actually make it more
complex, especially if we allow users to modify the logical order using
ALTER TABLE. Because if you change it, you have to walk through all the
places where it might be referenced and update those too (say, columns
referenced in indexes and such). Keeping attnum immutable makes this
much easier and simpler.

regards

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-11 Thread Peter Eisentraut
On 12/9/14 12:41 PM, Alvaro Herrera wrote:
 To recap, this is based on the idea of having three numbers for each
 attribute rather than a single attnum; the first of these is attnum (a
 number that uniquely identifies an attribute since its inception and may
 or may not have any relationship to its storage position and the place
 it expands to through user interaction).  The second is attphysnum,
 which indicates where it is stored in the physical structure.  The third
 is attlognum, which indicates where it expands in *, where must its
 values be placed in COPY or VALUES lists, etc --- the logical position
 as the user sees it.

Side idea:  Let attnum be the logical number, introduce attphysnum as
the storage position, and add an oid to pg_attribute as the eternal
identifier.

That way you avoid breaking pretty much all user code that looks at
pg_attribute, which will probably do something like ORDER BY attnum.

Also, one could get rid of all sorts of ugly code that works around the
lack of an oid in pg_attribute, such as in the dependency tracking.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-11 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Side idea:  Let attnum be the logical number, introduce attphysnum as
 the storage position, and add an oid to pg_attribute as the eternal
 identifier.

 That way you avoid breaking pretty much all user code that looks at
 pg_attribute, which will probably do something like ORDER BY attnum.

 Also, one could get rid of all sorts of ugly code that works around the
 lack of an oid in pg_attribute, such as in the dependency tracking.

I think using an OID would break more stuff than it fixes in dependency
tracking; in particular you would now need an explicit dependency link
from each column to its table, because the sub-object knowledge would
no longer work.  In any case this patch is going to be plenty big enough
already without saddling it with a major rewrite of the dependency system.

I agree though that it's worth considering defining pg_attribute.attnum as
the logical column position so as to minimize the effects on client-side
code.  I doubt there is much stuff client-side that cares about column
creation order, but there is plenty that cares about logical column order.
OTOH this would introduce confusion into the backend code, since Alvaro's
definition of attnum is what most of the backend should care about.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-03 Thread Bruce Momjian
On Thu, Feb 26, 2015 at 01:55:44PM -0800, Josh Berkus wrote:
 On 02/26/2015 01:54 PM, Alvaro Herrera wrote:
  This patch decouples these three things so that they
  can changed freely -- but provides no user interface to do so.  I think
  that trying to only decouple the thing we currently have in two pieces,
  and then have a subsequent patch to decouple again, is additional
  conceptual complexity for no gain.
 
 Oh, I didn't realize there weren't commands to change the LCO.  Without
 at least SQL syntax for LCO, I don't see why we'd take it; this sounds
 more like a WIP patch.

FYI, pg_upgrade is going to need pg_dump --binary-upgrade to output the
columns in physical order with some logical ordering information, i.e.
pg_upgrade cannot be passed only logical ordering from pg_dump.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-03 Thread Jim Nasby

On 3/3/15 11:23 AM, Bruce Momjian wrote:

On Thu, Feb 26, 2015 at 01:55:44PM -0800, Josh Berkus wrote:

On 02/26/2015 01:54 PM, Alvaro Herrera wrote:

This patch decouples these three things so that they
can changed freely -- but provides no user interface to do so.  I think
that trying to only decouple the thing we currently have in two pieces,
and then have a subsequent patch to decouple again, is additional
conceptual complexity for no gain.


Oh, I didn't realize there weren't commands to change the LCO.  Without
at least SQL syntax for LCO, I don't see why we'd take it; this sounds
more like a WIP patch.


FYI, pg_upgrade is going to need pg_dump --binary-upgrade to output the
columns in physical order with some logical ordering information, i.e.
pg_upgrade cannot be passed only logical ordering from pg_dump.


Wouldn't it need attno info too, so all 3 orderings?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-03 Thread Bruce Momjian
On Tue, Mar  3, 2015 at 11:24:38AM -0600, Jim Nasby wrote:
 On 3/3/15 11:23 AM, Bruce Momjian wrote:
 On Thu, Feb 26, 2015 at 01:55:44PM -0800, Josh Berkus wrote:
 On 02/26/2015 01:54 PM, Alvaro Herrera wrote:
 This patch decouples these three things so that they
 can changed freely -- but provides no user interface to do so.  I think
 that trying to only decouple the thing we currently have in two pieces,
 and then have a subsequent patch to decouple again, is additional
 conceptual complexity for no gain.
 
 Oh, I didn't realize there weren't commands to change the LCO.  Without
 at least SQL syntax for LCO, I don't see why we'd take it; this sounds
 more like a WIP patch.
 
 FYI, pg_upgrade is going to need pg_dump --binary-upgrade to output the
 columns in physical order with some logical ordering information, i.e.
 pg_upgrade cannot be passed only logical ordering from pg_dump.
 
 Wouldn't it need attno info too, so all 3 orderings?

Uh, what is the third ordering?  Physical, logical, and ?  It already
gets information about dropped columns, if that is the third one.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-03 Thread Bruce Momjian
On Tue, Mar  3, 2015 at 11:41:20AM -0600, Jim Nasby wrote:
 Wouldn't it need attno info too, so all 3 orderings?
 
 Uh, what is the third ordering?  Physical, logical, and ?  It already
 gets information about dropped columns, if that is the third one.
 
 attnum; used in other catalogs to reference columns.
 
 If you're shuffling everything though pg_dump anyway then maybe it's
 not needed...

Yes, all those attno system table links are done with pg_dump SQL
commands.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-03 Thread Jim Nasby

On 3/3/15 11:26 AM, Bruce Momjian wrote:

On Tue, Mar  3, 2015 at 11:24:38AM -0600, Jim Nasby wrote:

On 3/3/15 11:23 AM, Bruce Momjian wrote:

On Thu, Feb 26, 2015 at 01:55:44PM -0800, Josh Berkus wrote:

On 02/26/2015 01:54 PM, Alvaro Herrera wrote:

This patch decouples these three things so that they
can changed freely -- but provides no user interface to do so.  I think
that trying to only decouple the thing we currently have in two pieces,
and then have a subsequent patch to decouple again, is additional
conceptual complexity for no gain.


Oh, I didn't realize there weren't commands to change the LCO.  Without
at least SQL syntax for LCO, I don't see why we'd take it; this sounds
more like a WIP patch.


FYI, pg_upgrade is going to need pg_dump --binary-upgrade to output the
columns in physical order with some logical ordering information, i.e.
pg_upgrade cannot be passed only logical ordering from pg_dump.


Wouldn't it need attno info too, so all 3 orderings?


Uh, what is the third ordering?  Physical, logical, and ?  It already
gets information about dropped columns, if that is the third one.


attnum; used in other catalogs to reference columns.

If you're shuffling everything though pg_dump anyway then maybe it's not 
needed...

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-01 Thread Alvaro Herrera
Jim Nasby wrote:
 On 2/27/15 2:37 PM, Gavin Flower wrote:
 Might be an idea to allow lists of columns and their starting position.
 
 ALTER TABLE customer ALTER COLUMN (job_id, type_id, account_num) SET
 ORDER 3;
 
 I would certainly want something along those lines because it would be *way*
 less verbose (and presumably a lot faster) than a slew of ALTER TABLE
 statements.

You know you can issue multiple subcommands in one ALTER TABLE
statement, right?  There's no reason to do more than one table rewrite.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-01 Thread Alvaro Herrera
Jim Nasby wrote:
 On 2/27/15 2:49 PM, Alvaro Herrera wrote:
 Tomas Vondra wrote:
 
 1) change the order of columns in SELECT *
 
 - display columns so that related ones grouped together
   (irrespectedly whether they were added later, etc.)
 
 FWIW, I find the ordering more important for things like \d than SELECT *.

Logical column ordering is (or should be) used in all places where
column lists are expanded user-visibly.  \d would be one of those.
Think column order in the output of a JOIN also, for instance -- they
must follow logical column order.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-03-01 Thread Alvaro Herrera
Tomas Vondra wrote:

  We need an API for physical column reordering, even if it's just pg_
  functions.  The reason is that we want to enable people writing their
  own physical column re-ordering tools, so that our users can figure out
  for us what the best reordering algorithm is.
 
 I doubt that. For example, do you realize you can only do that while the
 table is completely empty, and in that case you can just do a CREATE
 TABLE with the proper order?

Not if you have views or constraints depending on the table definition
-- it's not trivial to drop/recreate the table in that case, but you can
of course think about truncating it, then reorder columns, then
repopulate.

Even better you can cause a full table rewrite if needed.

 But if we want to allow users to define this, I'd say let's make that
 part of CREATE TABLE, i.e. the order of columns defines logical order,
 and you use something like 'AFTER' to specify physical order.
 
 CREATE TABLE test (
 a INT AFTER b,-- attlognum = 1, attphysnum = 2
 b INT -- attlognum = 2, attphysnum = 1
 );

Surely you want an ALTER command as a minimum; perhaps that is enough
and there is no need for options in CREATE.

 It might get tricky because of cycles, though.

If there's a cycle, just raise an error.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Jim Nasby

On 2/27/15 2:37 PM, Gavin Flower wrote:

Might be an idea to allow lists of columns and their starting position.

ALTER TABLE customer ALTER COLUMN (job_id, type_id, account_num) SET
ORDER 3;


I would certainly want something along those lines because it would be 
*way* less verbose (and presumably a lot faster) than a slew of ALTER 
TABLE statements.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Gavin Flower

On 28/02/15 18:34, Jim Nasby wrote:

On 2/27/15 2:49 PM, Alvaro Herrera wrote:

Tomas Vondra wrote:


1) change the order of columns in SELECT *

- display columns so that related ones grouped together
  (irrespectedly whether they were added later, etc.)


FWIW, I find the ordering more important for things like \d than 
SELECT *.


Hey, after we get this done the next step is allowing different 
logical ordering for different uses! ;P

[...]

How about CREATE COLUMN SELECTION my_col_sel (a, g, b, e) FROM TABLE 
my_table;


Notes:

1. The column names must match those of the table

2. The COLUMN SELECTION is associated with the specified table

3. If a column gets renamed, then the COLUMN SELECTION effectively gets
   updated to use the new column name
   (This can probably be done automatically, by virtue of storing
   references to the appropriate column definitions)

4. Allow fewer columns in the COLUMN SELECTION than the original table

5. Allow the the same column to be duplicated
   (trivial, simply don't raise an error for duplicates)

6. Allow the COLUMN SELECTION name to appear instead of the list of
   columns after the SELECT key word
   (SELECT COLUMN SELECTION my_col_sel FROM my_table WHERE ... - must
   match table in FROM clause)

If several tables are defined in the FROM clause, and 2 different tables 
have COLUMN SELECTION with identical names, then the COLUMN SELECTION 
names in the SELECT must be prefixed either the table name or its alias.



Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Gavin Flower

On 28/02/15 12:21, Josh Berkus wrote:

On 02/27/2015 03:06 PM, Tomas Vondra wrote:

On 27.2.2015 23:48, Josh Berkus wrote:

Actually, I'm going to go back on what I said.

We need an API for physical column reordering, even if it's just pg_
functions.  The reason is that we want to enable people writing their
own physical column re-ordering tools, so that our users can figure out
for us what the best reordering algorithm is.

I doubt that. For example, do you realize you can only do that while the
table is completely empty, and in that case you can just do a CREATE
TABLE with the proper order?

Well, you could recreate the table as the de-facto API, although as you
point out below that still requires new syntax.

But I was thinking of something which would re-write the table, just
like ADD COLUMN x DEFAULT '' does now.


I also doubt the users will be able to optimize the order better than
users, who usually have on idea of how this actually works internally.

We have a lot of power users, including a lot of the people on this
mailing list.

Among the things we don't know about ordering optimization:

* How important is it for index performance to keep key columns adjacent?

* How important is it to pack values  4 bytes, as opposed to packing
values which are non-nullable?

* How important is it to pack values of the same size, as opposed to
packing values which are non-nullable?


But if we want to allow users to define this, I'd say let's make that
part of CREATE TABLE, i.e. the order of columns defines logical order,
and you use something like 'AFTER' to specify physical order.

 CREATE TABLE test (
 a INT AFTER b,-- attlognum = 1, attphysnum = 2
 b INT -- attlognum = 2, attphysnum = 1
 );

It might get tricky because of cycles, though.

It would be a lot easier to allow the user to specific a scalar.

CREATE TABLE test (
a INT NOT NULL WITH ( lognum 1, physnum 2 )
b INT WITH ( lognum 2, physnum 1 )

... and just throw an error if the user creates duplicates or gaps.


I thinks gaps should be okay.

Remember BASIC?  Lines numbers tended to be in 10's so you could easily 
slot new lines in between the existing ones - essential when using the 
Teletype input/output device.


Though the difference here is that you would NOT want to remember the 
original order numbers (at least I don't think that would be worth the 
coding effort  space).  However, the key is the actual order, not the 
numbering.  However, that might require a WARNING message to say that 
the columns would be essentially renumbered from 1 onwards when the 
table was actually created - if gaps had been left.



Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Jim Nasby

On 2/27/15 2:49 PM, Alvaro Herrera wrote:

Tomas Vondra wrote:


1) change the order of columns in SELECT *

- display columns so that related ones grouped together
  (irrespectedly whether they were added later, etc.)


FWIW, I find the ordering more important for things like \d than SELECT *.

Hey, after we get this done the next step is allowing different logical 
ordering for different uses! ;P



- keep columns synced with COPY

- requires user interface (ALTER TABLE _ ALTER COLUMN _ SET ORDER _)


Not sure about the ORDER # syntax.  In ALTER ENUM we have AFTER
value and such .. I'd consider that instead.


+1. See also Gavin's suggestion of ALTER COLUMN (a, b, c) SET ORDER ...


2) optimization of physical order (efficient storage / tuple deforming)

- more efficient order for storage (deforming)

- may be done manually by reordering columns in CREATE TABLE

- should be done automatically (no user interface required)


A large part of it can be done automatically: for instance, not-nullable
fixed length types ought to come first, because that enables the


I would think that eliminating wasted space due to alignment would be 
more important than optimizing attcacheoff, at least for a database that 
doesn't fit in memory. Even if it does fit in memory I suspect memory 
bandwidth is more important than clock cycles.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Matt Kelly

 Even if it does fit in memory I suspect memory bandwidth is more important
 than clock cycles.


http://people.freebsd.org/~lstewart/articles/cpumemory.pdf

This paper is old but the ratios should still be pretty accurate.  Main
memory is 240 clock cycles away and L1d is only 3.  If the experiments in
this paper still hold true loading the 8K block into L1d is far more
expensive than the CPU processing done once the block is in cache.

When one adds in NUMA to the contention on this shared resource, its not
that hard for a 40 core machine to starve for memory bandwidth, and for
cores to sit idle waiting for main memory.  Eliminating wasted space seems
far more important even when everything could fit in memory already.


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
On 26.2.2015 23:36, Tom Lane wrote:
 Jim Nasby jim.na...@bluetreble.com writes:
 On 2/26/15 4:01 PM, Alvaro Herrera wrote:
 Josh Berkus wrote:
 Oh, I didn't realize there weren't commands to change the LCO.  Without
 at least SQL syntax for LCO, I don't see why we'd take it; this sounds
 more like a WIP patch.
 
 The reason for doing it this way is that changing the underlying
 architecture is really hard, without having to bear an endless hackers
 bike shed discussion about the best userland syntax to use.  It seems a
 much better approach to do the actually difficult part first, then let
 the rest to be argued to death by others and let those others do the
 easy part (and take all the credit along with that); that way, that
 discussion does not kill other possible uses that the new architecture
 allows.
 
 +1. This patch is already several years old; lets not delay it further.
 
 I tend to agree with that, but how are we going to test things if there's
 no mechanism to create a table in which the orderings are different?

Physical or logical orderings?

Physical ordering is still determined by the CREATE TABLE command, so
you can do either

CREATE TABLE order_1 (
a INT,
b INT
);

or (to get the reverse order)

CREATE TABLE order_2 (
b INT,
a INT
);

Logical ordering may be updated directly in pg_attribute catalog, by
tweaking the attlognum column

UPDATE pg_attribute SET attlognum = 10
 WHERE attrelid = 'order_1'::regclass AND attname = 'a';

Of course, this does not handle duplicities, ranges and such, so that
needs to be done manually. But I think inventing something like

ALTER TABLE order_1 ALTER COLUMN a SET lognum = 11;

should be straightforward. But IMHO getting the internals working
properly first is more important.

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
On 27.2.2015 19:23, Alvaro Herrera wrote:
 Tomas Vondra wrote:
 
 Physical ordering is still determined by the CREATE TABLE command,
 
 In theory, you should be able to UPDATE attphysnum in pg_attribute
 too when the table is empty, and new tuples inserted would follow
 the specified ordering.

Good idea - that'd give you descriptors with

(attnum != attphysnum)

which might trigger some bugs.

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Alvaro Herrera
Tomas Vondra wrote:
 On 26.2.2015 23:42, Kevin Grittner wrote:

  One use case is to be able to suppress default display of columns 
  that are used for internal purposes. For example, incremental 
  maintenance of materialized views will require storing a count(t) 
  column, and sometimes state information for aggregate columns, in 
  addition to what the users explicitly request. At the developers' 
  meeting there was discussion of whether and how to avoid displaying 
  these by default, and it was felt that when we have this logical 
  column ordering it would be good to have a way tosuppress default
  display. Perhaps this could be as simple as a special value for
  logical position.
 
 I don't see how hiding columns is related to this patch at all. That's
 completely unrelated thing, and it certainly is not part of this patch.

It's not directly related to the patch, but I think the intent is that
once we have this patch it will be possible to apply other
transformations, such as having columns that are effectively hidden --
consider for example the idea that attlognum be set to a negative
number.  (For instance, consider the idea that system columns may all
have -1 as attlognum, which would just be an indicator that they are
never present in logical column expansion.  That makes sense to me; what
reason do we have to keep them using the current attnums they have?)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
On 26.2.2015 23:42, Kevin Grittner wrote:
 Tomas Vondra tomas.von...@2ndquadrant.com wrote:
 
 Over the time I've heard various use cases for this patch, but in 
 most cases it was quite speculative. If you have an idea where
 this might be useful, can you explain it here, or maybe point me to
 a place where it's described?
 
 
 One use case is to be able to suppress default display of columns 
 that are used for internal purposes. For example, incremental 
 maintenance of materialized views will require storing a count(t) 
 column, and sometimes state information for aggregate columns, in 
 addition to what the users explicitly request. At the developers' 
 meeting there was discussion of whether and how to avoid displaying 
 these by default, and it was felt that when we have this logical 
 column ordering it would be good to have a way tosuppress default
 display. Perhaps this could be as simple as a special value for
 logical position.

I don't see how hiding columns is related to this patch at all. That's
completely unrelated thing, and it certainly is not part of this patch.


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Alvaro Herrera
Tomas Vondra wrote:

 Physical ordering is still determined by the CREATE TABLE command,

In theory, you should be able to UPDATE attphysnum in pg_attribute too
when the table is empty, and new tuples inserted would follow the
specified ordering.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Alvaro Herrera
Arthur Silva wrote:

 Sorry to intrude, I've been following this post and I was wondering if it
 would allow (in the currently planed form or in the future) a wider set of
 non-rewriting DDLs to Postgres. For example, drop a column without
 rewriting the table.

Perhaps.  But dropping a column already does not rewrite the table, only
marks the column as dropped in system catalogs, so do you have a better
example.

One obvious example is that you have 

CREATE TABLE t (
   t1 int,
   t3 int
);

and later want to add t2 in the middle, the only way currently is to
drop the table and start again (re-creating all dependant views, FKs,
etc).  With the patch you will be able to add the column at the right
place.  If no default value is supplied for the new column, no table
rewrite is necessary at all.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
On 27.2.2015 20:34, Alvaro Herrera wrote:
 Tomas Vondra wrote:
 
 I think we could calls to the randomization functions into some of the
 regression tests (say 'create_tables.sql'), but that makes regression
 tests ... well, random, and I'm not convinced that's a good thing.

 Also, this makes regression tests harder to think, because SELECT *
 does different things depending on the attlognum order.
 
 No, that approach doesn't seem very useful.  Rather, randomize the
 columns in the CREATE TABLE statement, and then fix up the attlognums so
 that the SELECT * expansion is the same as it would be with the
 not-randomized CREATE TABLE.

Yes, that's a possible approach too - possibly a better one for
regression tests as it fixes the 'SELECT *' but it effectively uses
fixed 'attlognum' and 'attnum' values (it's difficult to randomize
those, as they may be referenced in other catalogs).

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
OK, so let me summarize here (the other posts in this subthread are
discussing the user interface, not the use cases, so I'll respond here).


There are two main use cases:

1) change the order of columns in SELECT *

   - display columns so that related ones grouped together
 (irrespectedly whether they were added later, etc.)

   - keep columns synced with COPY

   - requires user interface (ALTER TABLE _ ALTER COLUMN _ SET ORDER _)


2) optimization of physical order (efficient storage / tuple deforming)

   - more efficient order for storage (deforming)

   - may be done manually by reordering columns in CREATE TABLE

   - should be done automatically (no user interface required)

   - seems useful both for on-disk physical tuples, and virtual tuples


Anything else?

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
On 27.2.2015 19:50, Alvaro Herrera wrote:
 Tomas Vondra wrote:
 On 26.2.2015 23:42, Kevin Grittner wrote:
 
 One use case is to be able to suppress default display of columns 
 that are used for internal purposes. For example, incremental 
 maintenance of materialized views will require storing a count(t) 
 column, and sometimes state information for aggregate columns, in 
 addition to what the users explicitly request. At the developers' 
 meeting there was discussion of whether and how to avoid displaying 
 these by default, and it was felt that when we have this logical 
 column ordering it would be good to have a way tosuppress default
 display. Perhaps this could be as simple as a special value for
 logical position.

 I don't see how hiding columns is related to this patch at all. That's
 completely unrelated thing, and it certainly is not part of this patch.
 
 It's not directly related to the patch, but I think the intent is that
 once we have this patch it will be possible to apply other
 transformations, such as having columns that are effectively hidden --
 consider for example the idea that attlognum be set to a negative
 number.  (For instance, consider the idea that system columns may all
 have -1 as attlognum, which would just be an indicator that they are
 never present in logical column expansion.  That makes sense to me; what
 reason do we have to keep them using the current attnums they have?)

My vote is against reusing attlognums in this way - I see that as a
misuse, making the code needlessly complex. A better way to achieve this
is to introduce a simple 'is hidden' flag into pg_attribute, and
something as simple as

   ALTER TABLE t ALTER COLUMN a SHOW;
   ALTER TABLE t ALTER COLUMN a HIDE;

or something along the lines. Not only that's cleaner, but it's also a
better interface for the users than 'you have to set attlognum to a
negative value.'


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
On 26.2.2015 23:34, Andres Freund wrote:
 On 2015-02-26 16:16:54 -0600, Jim Nasby wrote:
 On 2/26/15 4:01 PM, Alvaro Herrera wrote:
 The reason for doing it this way is that changing the underlying
 architecture is really hard, without having to bear an endless hackers
 bike shed discussion about the best userland syntax to use.  It seems a
 much better approach to do the actually difficult part first, then let
 the rest to be argued to death by others and let those others do the
 easy part (and take all the credit along with that); that way, that
 discussion does not kill other possible uses that the new architecture
 allows.
 
 I agree that it's a sane order of developing things. But: I don't
 think it makes sense to commit it without the capability to change
 the order. Not so much because it'll not serve a purpose at that
 point, but rather because it'll essentially untestable. And this is a
 patch that needs a fair amount of changes, both automated, and
 manual.

I agree that committing something without a code that actually uses it
in practice is not the best approach. That's one of the reasons why I
was asking for the use cases we expect from this.


 At least during development I'd even add a function that randomizes
 the physical order of columns, and keeps the logical one. Running
 the regression tests that way seems likely to catch a fair number of
 bugs.

That's not all that difficult, and can be done in 10 lines of PL/pgSQL -
see the attached file. Should be sufficient for development testing (and
in production there's not much use for that anyway).

 +1. This patch is already several years old; lets not delay it further.

 Plus, I suspect that you could hack the catalog directly if you
 really wanted to change LCO. Worst case you could create a C
 function to do it.
 
 I don't think that's sufficient for testing purposes. Not only for
 the patch itself, but also for getting it right in new code.

I think we could calls to the randomization functions into some of the
regression tests (say 'create_tables.sql'), but that makes regression
tests ... well, random, and I'm not convinced that's a good thing.

Also, this makes regression tests harder to think, because SELECT *
does different things depending on the attlognum order.


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


randomize.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Gavin Flower

On 28/02/15 09:09, Josh Berkus wrote:

Tomas,

So for an API, 100% of the use cases I have for this feature would be
satisfied by:

ALTER TABLE __ ALTER COLUMN _ SET ORDER #

and:

ALTER TABLE _ ADD COLUMN colname coltype ORDER #

If that's infeasible, a function would be less optimal, but would work:

SELECT pg_column_order(schemaname, tablename, colname, attnum)

If you set the order # to one where a column already exists, other
column attnums would get bumped down, closing up any gaps in the
process.  Obviously, this would require some kind of exclusive lock, but
then ALTER TABLE usually does, doesn't it?


Might be an idea to allow lists of columns and their starting position.

ALTER TABLE customer ALTER COLUMN (job_id, type_id, account_num) SET 
ORDER 3;


So in a table with fields:

1. id
2. *account_num*
3. dob
4. start_date
5. *type_id*
6. preferred_status
7. */job_id/*
8. comment


would end up like:

1. id
2. dob
3. *job_id*
4. *type_id*
5. *account_num*
6. start_date
7. preferred_status
8. comment

Am assuming positions are numbered from 1 onwards.


Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
On 27.2.2015 21:42, Josh Berkus wrote:
 On 02/27/2015 12:25 PM, Tomas Vondra wrote:
 On 27.2.2015 21:09, Josh Berkus wrote:
 Tomas,

 So for an API, 100% of the use cases I have for this feature would be
 satisfied by:

 ALTER TABLE __ ALTER COLUMN _ SET ORDER #

 and:

 ALTER TABLE _ ADD COLUMN colname coltype ORDER #

 Yes, I imagined an interface like that. Just to be clear, you're
 talking about logical order (and not a physical one), right?
 
 Correct. The only reason to rearrange the physical columns is in
 order to optimize, which presumably would be carried out by a utility
 command, e.g. VACUUM FULL OPTIMIZE.

I was thinking more about CREATE TABLE at this moment, but yeah, VACUUM
FULL OPTIMIZE might do the same thing.

 If we ignore the system columns, the current implementation assumes
 that the values in each of the three columns (attnum, attlognum
 and attphysnum) are distinct and within 1..natts. So there are no
 gaps and you'll always set the value to an existing one (so yes,
 shuffling is necessary).

 And yes, that certainly requires an exclusive lock on the
 pg_attribute (I don't think we need a lock on the table itself).
 
 If MVCC on pg_attribute is good enough to not lock against concurrent
 SELECT *, then that would be lovely.

Yeah, I think this will need a bit more thought. We certainly don't want
blocking queries on the table, but we need a consistent list of column
definitions from pg_attribute.


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Alvaro Herrera
Tomas Vondra wrote:

 1) change the order of columns in SELECT *
 
- display columns so that related ones grouped together
  (irrespectedly whether they were added later, etc.)
 
- keep columns synced with COPY
 
- requires user interface (ALTER TABLE _ ALTER COLUMN _ SET ORDER _)

Not sure about the ORDER # syntax.  In ALTER ENUM we have AFTER
value and such .. I'd consider that instead.

 2) optimization of physical order (efficient storage / tuple deforming)
 
- more efficient order for storage (deforming)
 
- may be done manually by reordering columns in CREATE TABLE
 
- should be done automatically (no user interface required)

A large part of it can be done automatically: for instance, not-nullable
fixed length types ought to come first, because that enables the
attcacheoff optimizations in heaptuple.c to fire for more columns.  But
what column comes next?  The offset of the column immediately after them
can also be cached, and so it would be faster to obtain than other
attributes.  Which one to choose here is going to be a coin toss in most
cases, but I suppose there are cases out there which can benefit from
having a particular column there.


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Gavin Flower

On 28/02/15 09:49, Alvaro Herrera wrote:

Tomas Vondra wrote:


1) change the order of columns in SELECT *

- display columns so that related ones grouped together
  (irrespectedly whether they were added later, etc.)

- keep columns synced with COPY

- requires user interface (ALTER TABLE _ ALTER COLUMN _ SET ORDER _)

Not sure about the ORDER # syntax.  In ALTER ENUM we have AFTER
value and such .. I'd consider that instead.


2) optimization of physical order (efficient storage / tuple deforming)

- more efficient order for storage (deforming)

- may be done manually by reordering columns in CREATE TABLE

- should be done automatically (no user interface required)

A large part of it can be done automatically: for instance, not-nullable
fixed length types ought to come first, because that enables the
attcacheoff optimizations in heaptuple.c to fire for more columns.  But
what column comes next?  The offset of the column immediately after them
can also be cached, and so it would be faster to obtain than other
attributes.  Which one to choose here is going to be a coin toss in most
cases, but I suppose there are cases out there which can benefit from
having a particular column there.


Possible, if there is no obvious (to the system) way of deciding the 
order of 2 columns, then the logical order should be used?


As either the order does not really matter, or an expert DBA might know 
which is more efficient.



Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Arthur Silva
On Fri, Feb 27, 2015 at 4:44 PM, Tomas Vondra tomas.von...@2ndquadrant.com
wrote:

 On 27.2.2015 20:34, Alvaro Herrera wrote:
  Tomas Vondra wrote:
 
  I think we could calls to the randomization functions into some of the
  regression tests (say 'create_tables.sql'), but that makes regression
  tests ... well, random, and I'm not convinced that's a good thing.
 
  Also, this makes regression tests harder to think, because SELECT *
  does different things depending on the attlognum order.
 
  No, that approach doesn't seem very useful.  Rather, randomize the
  columns in the CREATE TABLE statement, and then fix up the attlognums so
  that the SELECT * expansion is the same as it would be with the
  not-randomized CREATE TABLE.

 Yes, that's a possible approach too - possibly a better one for
 regression tests as it fixes the 'SELECT *' but it effectively uses
 fixed 'attlognum' and 'attnum' values (it's difficult to randomize
 those, as they may be referenced in other catalogs).

 --
 Tomas Vondrahttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


Sorry to intrude, I've been following this post and I was wondering if it
would allow (in the currently planed form or in the future) a wider set of
non-rewriting DDLs to Postgres. For example, drop a column without
rewriting the table.


Re: [HACKERS] logical column ordering

2015-02-27 Thread Josh Berkus
On 02/27/2015 12:25 PM, Tomas Vondra wrote:
 On 27.2.2015 21:09, Josh Berkus wrote:
 Tomas,

 So for an API, 100% of the use cases I have for this feature would be
 satisfied by:

 ALTER TABLE __ ALTER COLUMN _ SET ORDER #

 and:

 ALTER TABLE _ ADD COLUMN colname coltype ORDER #
 
 Yes, I imagined an interface like that. Just to be clear, you're talking
 about logical order (and not a physical one), right?

Correct.  The only reason to rearrange the physical columns is in order
to optimize, which presumably would be carried out by a utility command,
e.g. VACUUM FULL OPTIMIZE.

 If we ignore the system columns, the current implementation assumes that
 the values in each of the three columns (attnum, attlognum and
 attphysnum) are distinct and within 1..natts. So there are no gaps and
 you'll always set the value to an existing one (so yes, shuffling is
 necessary).
 
 And yes, that certainly requires an exclusive lock on the pg_attribute
 (I don't think we need a lock on the table itself).

If MVCC on pg_attribute is good enough to not lock against concurrent
SELECT *, then that would be lovely.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Alvaro Herrera
Tomas Vondra wrote:

 I think we could calls to the randomization functions into some of the
 regression tests (say 'create_tables.sql'), but that makes regression
 tests ... well, random, and I'm not convinced that's a good thing.
 
 Also, this makes regression tests harder to think, because SELECT *
 does different things depending on the attlognum order.

No, that approach doesn't seem very useful.  Rather, randomize the
columns in the CREATE TABLE statement, and then fix up the attlognums so
that the SELECT * expansion is the same as it would be with the
not-randomized CREATE TABLE.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Alvaro Herrera
Tomas Vondra wrote:
 On 27.2.2015 20:34, Alvaro Herrera wrote:
  Tomas Vondra wrote:
  
  I think we could calls to the randomization functions into some of the
  regression tests (say 'create_tables.sql'), but that makes regression
  tests ... well, random, and I'm not convinced that's a good thing.
 
  Also, this makes regression tests harder to think, because SELECT *
  does different things depending on the attlognum order.
  
  No, that approach doesn't seem very useful.  Rather, randomize the
  columns in the CREATE TABLE statement, and then fix up the attlognums so
  that the SELECT * expansion is the same as it would be with the
  not-randomized CREATE TABLE.
 
 Yes, that's a possible approach too - possibly a better one for
 regression tests as it fixes the 'SELECT *' but it effectively uses
 fixed 'attlognum' and 'attnum' values (it's difficult to randomize
 those, as they may be referenced in other catalogs).

Why would you care what values are used as attnum?  If anything
misbehaves, surely that would be a bug in the patch.  (Of course, you
can't just change the numbers too much later after the fact, because the
attnum values could have propagated into other tables via foreign keys
and such; it needs to be done during executing CREATE TABLE or
immediately thereafter.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Arthur Silva
On Feb 27, 2015 5:02 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Arthur Silva wrote:

  Sorry to intrude, I've been following this post and I was wondering if
it
  would allow (in the currently planed form or in the future) a wider set
of
  non-rewriting DDLs to Postgres. For example, drop a column without
  rewriting the table.

 Perhaps.  But dropping a column already does not rewrite the table, only
 marks the column as dropped in system catalogs, so do you have a better
 example.

 One obvious example is that you have

 CREATE TABLE t (
t1 int,
t3 int
 );

 and later want to add t2 in the middle, the only way currently is to
 drop the table and start again (re-creating all dependant views, FKs,
 etc).  With the patch you will be able to add the column at the right
 place.  If no default value is supplied for the new column, no table
 rewrite is necessary at all.

 --
 Álvaro Herrerahttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services

Cool! I didn't know I could drop stuff without rewriting.

Ya, that's another example, people do these from GUI tools. That's a nice
side effect. Cool (again)!


Re: [HACKERS] logical column ordering

2015-02-27 Thread Josh Berkus
On 02/27/2015 12:48 PM, Tomas Vondra wrote:
 On 27.2.2015 21:42, Josh Berkus wrote:
 On 02/27/2015 12:25 PM, Tomas Vondra wrote:
 On 27.2.2015 21:09, Josh Berkus wrote:
 Tomas,

 So for an API, 100% of the use cases I have for this feature would be
 satisfied by:

 ALTER TABLE __ ALTER COLUMN _ SET ORDER #

 and:

 ALTER TABLE _ ADD COLUMN colname coltype ORDER #

 Yes, I imagined an interface like that. Just to be clear, you're
 talking about logical order (and not a physical one), right?

 Correct. The only reason to rearrange the physical columns is in
 order to optimize, which presumably would be carried out by a utility
 command, e.g. VACUUM FULL OPTIMIZE.
 
 I was thinking more about CREATE TABLE at this moment, but yeah, VACUUM
 FULL OPTIMIZE might do the same thing.

Actually, I'm going to go back on what I said.

We need an API for physical column reordering, even if it's just pg_
functions.  The reason is that we want to enable people writing their
own physical column re-ordering tools, so that our users can figure out
for us what the best reordering algorithm is.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Tomas Vondra
On 27.2.2015 23:48, Josh Berkus wrote:
 On 02/27/2015 12:48 PM, Tomas Vondra wrote:
 On 27.2.2015 21:42, Josh Berkus wrote:
 On 02/27/2015 12:25 PM, Tomas Vondra wrote:
 On 27.2.2015 21:09, Josh Berkus wrote:
 Tomas,

 So for an API, 100% of the use cases I have for this feature would be
 satisfied by:

 ALTER TABLE __ ALTER COLUMN _ SET ORDER #

 and:

 ALTER TABLE _ ADD COLUMN colname coltype ORDER #

 Yes, I imagined an interface like that. Just to be clear, you're
 talking about logical order (and not a physical one), right?

 Correct. The only reason to rearrange the physical columns is in
 order to optimize, which presumably would be carried out by a utility
 command, e.g. VACUUM FULL OPTIMIZE.

 I was thinking more about CREATE TABLE at this moment, but yeah, VACUUM
 FULL OPTIMIZE might do the same thing.
 
 Actually, I'm going to go back on what I said.
 
 We need an API for physical column reordering, even if it's just pg_
 functions.  The reason is that we want to enable people writing their
 own physical column re-ordering tools, so that our users can figure out
 for us what the best reordering algorithm is.

I doubt that. For example, do you realize you can only do that while the
table is completely empty, and in that case you can just do a CREATE
TABLE with the proper order?

I also doubt the users will be able to optimize the order better than
users, who usually have on idea of how this actually works internally.

But if we want to allow users to define this, I'd say let's make that
part of CREATE TABLE, i.e. the order of columns defines logical order,
and you use something like 'AFTER' to specify physical order.

CREATE TABLE test (
a INT AFTER b,-- attlognum = 1, attphysnum = 2
b INT -- attlognum = 2, attphysnum = 1
);

It might get tricky because of cycles, though.


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread David G Johnston
Tomas Vondra-4 wrote
 But if we want to allow users to define this, I'd say let's make that
 part of CREATE TABLE, i.e. the order of columns defines logical order,
 and you use something like 'AFTER' to specify physical order.
 
 CREATE TABLE test (
 a INT AFTER b,-- attlognum = 1, attphysnum = 2
 b INT -- attlognum = 2, attphysnum = 1
 );

Why not memorialize this as a storage parameter?

CREATE TABLE test (
a INT, b INT
)
WITH (layout = 'b, a')
;

A canonical form should be defined and then ALTER TABLE can either directly
update the current value or require the user to specify a new layout before
it will perform the alteration.

David J.




--
View this message in context: 
http://postgresql.nabble.com/logical-column-ordering-tp5829775p5839825.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread Josh Berkus
On 02/27/2015 03:06 PM, Tomas Vondra wrote:
 On 27.2.2015 23:48, Josh Berkus wrote:
 Actually, I'm going to go back on what I said.

 We need an API for physical column reordering, even if it's just pg_
 functions.  The reason is that we want to enable people writing their
 own physical column re-ordering tools, so that our users can figure out
 for us what the best reordering algorithm is.
 
 I doubt that. For example, do you realize you can only do that while the
 table is completely empty, and in that case you can just do a CREATE
 TABLE with the proper order?

Well, you could recreate the table as the de-facto API, although as you
point out below that still requires new syntax.

But I was thinking of something which would re-write the table, just
like ADD COLUMN x DEFAULT '' does now.

 I also doubt the users will be able to optimize the order better than
 users, who usually have on idea of how this actually works internally.

We have a lot of power users, including a lot of the people on this
mailing list.

Among the things we don't know about ordering optimization:

* How important is it for index performance to keep key columns adjacent?

* How important is it to pack values  4 bytes, as opposed to packing
values which are non-nullable?

* How important is it to pack values of the same size, as opposed to
packing values which are non-nullable?

 But if we want to allow users to define this, I'd say let's make that
 part of CREATE TABLE, i.e. the order of columns defines logical order,
 and you use something like 'AFTER' to specify physical order.
 
 CREATE TABLE test (
 a INT AFTER b,-- attlognum = 1, attphysnum = 2
 b INT -- attlognum = 2, attphysnum = 1
 );
 
 It might get tricky because of cycles, though.

It would be a lot easier to allow the user to specific a scalar.

CREATE TABLE test (
a INT NOT NULL WITH ( lognum 1, physnum 2 )
b INT WITH ( lognum 2, physnum 1 )

... and just throw an error if the user creates duplicates or gaps.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-27 Thread David G Johnston
David G Johnston wrote
 
 Tomas Vondra-4 wrote
 But if we want to allow users to define this, I'd say let's make that
 part of CREATE TABLE, i.e. the order of columns defines logical order,
 and you use something like 'AFTER' to specify physical order.
 
 CREATE TABLE test (
 a INT AFTER b,-- attlognum = 1, attphysnum = 2
 b INT -- attlognum = 2, attphysnum = 1
 );
 Why not memorialize this as a storage parameter?
 
 CREATE TABLE test (
 a INT, b INT
 )
 WITH (layout = 'b, a')
 ;
 
 A canonical form should be defined and then ALTER TABLE can either
 directly update the current value or require the user to specify a new
 layout before it will perform the alteration.
 
 David J.

Extending the idea a bit further why not have CREATE TABLE be the API; or
at least something similar to it?

CREATE OR REARRANGE TABLE test (
[...]
)
WITH ();

The [...] part would be logical and the WITH() would define the physical. 
The PK would simply be whatever is required to make the command work.

David J.




--
View this message in context: 
http://postgresql.nabble.com/logical-column-ordering-tp5829775p5839828.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-26 Thread Jim Nasby

On 2/26/15 4:34 PM, Andres Freund wrote:

On 2015-02-26 16:16:54 -0600, Jim Nasby wrote:

On 2/26/15 4:01 PM, Alvaro Herrera wrote:

The reason for doing it this way is that changing the underlying
architecture is really hard, without having to bear an endless hackers
bike shed discussion about the best userland syntax to use.  It seems a
much better approach to do the actually difficult part first, then let
the rest to be argued to death by others and let those others do the
easy part (and take all the credit along with that); that way, that
discussion does not kill other possible uses that the new architecture
allows.


I agree that it's a sane order of developing things. But: I don't think
it makes sense to commit it without the capability to change the
order. Not so much because it'll not serve a purpose at that point, but
rather because it'll essentially untestable. And this is a patch that
needs a fair amount of changes, both automated, and manual.


It's targeted for 9.6 anyway, so we have a while to decide on what's 
committed when. It's possible that there's no huge debate on the syntax.



At least during development I'd even add a function that randomizes the
physical order of columns, and keeps the logical one. Running the
regression tests that way seems likely to catch a fair number of bugs.


Yeah, I've thought that would be a necessary part of testing. Not really 
sure how we'd go about it with existing framework though...



+1. This patch is already several years old; lets not delay it further.

Plus, I suspect that you could hack the catalog directly if you really
wanted to change LCO. Worst case you could create a C function to do it.


I don't think that's sufficient for testing purposes. Not only for the
patch itself, but also for getting it right in new code.


We'll want to do testing that it doesn't make sense for the API to 
support. For example, randomizing the storage ordering; that's not a 
sane use case. Ideally we wouldn't even expose physical ordering because 
the code would be smart enough to figure it out.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-26 Thread Jim Nasby

On 2/23/15 5:09 PM, Tomas Vondra wrote:

Over the time I've heard various use cases for this patch, but in most
cases it was quite speculative. If you have an idea where this might be
useful, can you explain it here, or maybe point me to a place where it's
described?


For better or worse, table structure is a form of documentation for a 
system. As such, it's very valuable to group related fields in a table 
together. When creating a table, that's easy, but as soon as you need to 
alter your careful ordering can easily end up out the window.


Perhaps to some that just sounds like pointless window dressing, but my 
experience is that on a complex system the less organized things are the 
more bugs you get due to overlooking something.


The other reason for this patch (which it maybe doesn't support 
anymore?) is to allow Postgres to use an optimal physical ordering of 
fields on a page to reduce space wasted on alignment, as well as taking 
nullability and varlena into account.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-26 Thread Tom Lane
Jim Nasby jim.na...@bluetreble.com writes:
 On 2/26/15 4:01 PM, Alvaro Herrera wrote:
 Josh Berkus wrote:
 Oh, I didn't realize there weren't commands to change the LCO.  Without
 at least SQL syntax for LCO, I don't see why we'd take it; this sounds
 more like a WIP patch.

 The reason for doing it this way is that changing the underlying
 architecture is really hard, without having to bear an endless hackers
 bike shed discussion about the best userland syntax to use.  It seems a
 much better approach to do the actually difficult part first, then let
 the rest to be argued to death by others and let those others do the
 easy part (and take all the credit along with that); that way, that
 discussion does not kill other possible uses that the new architecture
 allows.

 +1. This patch is already several years old; lets not delay it further.

I tend to agree with that, but how are we going to test things if there's
no mechanism to create a table in which the orderings are different?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-26 Thread Jim Nasby

On 2/26/15 4:01 PM, Alvaro Herrera wrote:

Josh Berkus wrote:

On 02/26/2015 01:54 PM, Alvaro Herrera wrote:

This patch decouples these three things so that they
can changed freely -- but provides no user interface to do so.  I think
that trying to only decouple the thing we currently have in two pieces,
and then have a subsequent patch to decouple again, is additional
conceptual complexity for no gain.


Oh, I didn't realize there weren't commands to change the LCO.  Without
at least SQL syntax for LCO, I don't see why we'd take it; this sounds
more like a WIP patch.


The reason for doing it this way is that changing the underlying
architecture is really hard, without having to bear an endless hackers
bike shed discussion about the best userland syntax to use.  It seems a
much better approach to do the actually difficult part first, then let
the rest to be argued to death by others and let those others do the
easy part (and take all the credit along with that); that way, that
discussion does not kill other possible uses that the new architecture
allows.


+1. This patch is already several years old; lets not delay it further.

Plus, I suspect that you could hack the catalog directly if you really 
wanted to change LCO. Worst case you could create a C function to do it.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-26 Thread Andres Freund
On 2015-02-26 16:16:54 -0600, Jim Nasby wrote:
 On 2/26/15 4:01 PM, Alvaro Herrera wrote:
 The reason for doing it this way is that changing the underlying
 architecture is really hard, without having to bear an endless hackers
 bike shed discussion about the best userland syntax to use.  It seems a
 much better approach to do the actually difficult part first, then let
 the rest to be argued to death by others and let those others do the
 easy part (and take all the credit along with that); that way, that
 discussion does not kill other possible uses that the new architecture
 allows.

I agree that it's a sane order of developing things. But: I don't think
it makes sense to commit it without the capability to change the
order. Not so much because it'll not serve a purpose at that point, but
rather because it'll essentially untestable. And this is a patch that
needs a fair amount of changes, both automated, and manual.

At least during development I'd even add a function that randomizes the
physical order of columns, and keeps the logical one. Running the
regression tests that way seems likely to catch a fair number of bugs.

 +1. This patch is already several years old; lets not delay it further.
 
 Plus, I suspect that you could hack the catalog directly if you really
 wanted to change LCO. Worst case you could create a C function to do it.

I don't think that's sufficient for testing purposes. Not only for the
patch itself, but also for getting it right in new code.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-26 Thread Alvaro Herrera
Josh Berkus wrote:
 On 02/26/2015 01:54 PM, Alvaro Herrera wrote:
  This patch decouples these three things so that they
  can changed freely -- but provides no user interface to do so.  I think
  that trying to only decouple the thing we currently have in two pieces,
  and then have a subsequent patch to decouple again, is additional
  conceptual complexity for no gain.
 
 Oh, I didn't realize there weren't commands to change the LCO.  Without
 at least SQL syntax for LCO, I don't see why we'd take it; this sounds
 more like a WIP patch.

The reason for doing it this way is that changing the underlying
architecture is really hard, without having to bear an endless hackers
bike shed discussion about the best userland syntax to use.  It seems a
much better approach to do the actually difficult part first, then let
the rest to be argued to death by others and let those others do the
easy part (and take all the credit along with that); that way, that
discussion does not kill other possible uses that the new architecture
allows.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-26 Thread Kevin Grittner
Tomas Vondra tomas.von...@2ndquadrant.com wrote:

 Over the time I've heard various use cases for this patch, but in most
 cases it was quite speculative. If you have an idea where this might be
 useful, can you explain it here, or maybe point me to a place where it's
 described?


One use case is to be able to suppress default display of columns that are
used for internal purposes.  For example, incremental maintenance of
materialized views will require storing a count(t) column, and sometimes
state information for aggregate columns, in addition to what the users
explicitly request.  At the developers' meeting there was discussion of
whether and how to avoid displaying these by default, and it was felt that
when we have this logical column ordering it would be good to have a way to
suppress default display.  Perhaps this could be as simple as a special
value for logical position.
-- 

Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-26 Thread David Steele
On 2/26/15 1:49 PM, Jim Nasby wrote:
 On 2/23/15 5:09 PM, Tomas Vondra wrote:
 Over the time I've heard various use cases for this patch, but in most
 cases it was quite speculative. If you have an idea where this might be
 useful, can you explain it here, or maybe point me to a place where it's
 described?
 
 For better or worse, table structure is a form of documentation for a
 system. As such, it's very valuable to group related fields in a table
 together. When creating a table, that's easy, but as soon as you need to
 alter your careful ordering can easily end up out the window.
 
 Perhaps to some that just sounds like pointless window dressing, but my
 experience is that on a complex system the less organized things are the
 more bugs you get due to overlooking something.

I agree with Jim's comments.  I've generally followed column ordering
that goes something like:

1) primary key
2) foreign keys
3) flags
4) other programmatic data fields (type, order, etc.)
5) non-programmatic data fields (name, description, etc.)

The immediate practical benefit of this is that users are more likely to
see fields that they need without scrolling right.  Documentation is
also clearer because fields tend to go from most to least important
(left to right, top to bottom).  Also, if you are consistent enough then
users just *know* where to look.

I wrote a function a while back that reorders columns in tables (it not
only deals with reordering, but triggers, constraints, indexes, etc.,
though there are some caveats).  It's painful and not very efficient,
but easy to use.

Most dimension tables that I've worked with are in the millions of rows
so reordering is not problem.  With fact tables, I assess on a
case-by-case basis. It would be nice to not have to do that triage.

The function is attached if anyone is interested.

-- 
- David Steele
da...@pgmasters.net
/
 CATALOG_TABLE_COLUMN_MOVE Function
create
 or replace function _utility.catalog_table_column_move
(
strSchemaName text, 
strTableName text, 
strColumnName text, 
strColumnNameBefore text
)
returns void as $$
declare
rIndex record;
rConstraint record;
rColumn record;
strSchemaTable text = strSchemaName || '.' || strTableName;
strDdl text;
strClusterIndex text;
begin
-- Raise notice that a reorder is in progress
raise notice 'Reorder columns in table %.% (% before %)', strSchemaName, 
strTableName, strColumnName, strColumnNameBefore;

-- Get the cluster index
select pg_index.relname
  into strClusterIndex
  from pg_namespace
   inner join pg_class
on pg_class.relnamespace = pg_namespace.oid
   and pg_class.relname = strTableName
   inner join pg_index pg_index_map
on pg_index_map.indrelid = pg_class.oid
   and pg_index_map.indisclustered = true
   inner join pg_class pg_index
on pg_index.oid = pg_index_map.indexrelid
 where pg_namespace.nspname = strSchemaName;

if strClusterIndex is null then
raise exception 'Table %.% must have a cluster index before 
reordering', strSchemaName, strTableName;
end if;

-- Disable all user triggers
strDdl = 'alter table ' || strSchemaTable || ' disable trigger user';
raise notice 'Disable triggers [%]', strDdl;
execute strDdl;

-- Create temp table to hold ddl
create temp table temp_catalogtablecolumnreorder
(
type text not null,
name text not null,
ddl text not null
);

-- Save index ddl in a temp table
raise notice 'Save indexes';

for rIndex in
with index as
(
select _utility.catalog_index_list_get(strSchemaName, strTableName) 
as name
),
index_ddl as
(
select index.name,
   
_utility.catalog_index_create_get(_utility.catalog_index_get(strSchemaName, 
index.name)) as ddl
  from index
)
select index.name,
   index_ddl.ddl
  from index
   left outer join index_ddl
on index_ddl.name = index.name
   and index_ddl.ddl not like '%[function]%'
loop
raise notice 'Save %', rIndex.name;
insert into temp_catalogtablecolumnreorder values ('index', 
rIndex.name, rIndex.ddl);
end loop;

-- Save constraint ddl in a temp table
raise notice 'Save constraints';

for rConstraint in
with constraint_list as
(
select _utility.catalog_constraint_list_get(strSchemaName, 
strTableName, '{p,u,f,c}') as name
),
constraint_ddl as
(
select constraint_list.name,
   

Re: [HACKERS] logical column ordering

2015-02-26 Thread Gavin Flower

On 27/02/15 14:08, David Steele wrote:
[...]

I agree with Jim's comments.  I've generally followed column ordering
that goes something like:

1) primary key
2) foreign keys
3) flags
4) other programmatic data fields (type, order, etc.)
5) non-programmatic data fields (name, description, etc.)

The immediate practical benefit of this is that users are more likely to
see fields that they need without scrolling right.  Documentation is
also clearer because fields tend to go from most to least important
(left to right, top to bottom).  Also, if you are consistent enough then
users just *know* where to look.

I wrote a function a while back that reorders columns in tables (it not
only deals with reordering, but triggers, constraints, indexes, etc.,
though there are some caveats).  It's painful and not very efficient,
but easy to use.

Most dimension tables that I've worked with are in the millions of rows
so reordering is not problem.  With fact tables, I assess on a
case-by-case basis. It would be nice to not have to do that triage.

The function is attached if anyone is interested.

I've never formally written down the order of how I define 
fields^H^H^H^H^H^H columns in a table, but David's list is the same 
order I use.


The only additional ordering I do: is to put fields that are likely to 
be long text fields, at the end of the record.


So I would certainly appreciate my logical ordering to be the natural 
order they appear.



Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-02-23 Thread Tomas Vondra
Hi,

attached is the result of my first attempt to make the logical column
ordering patch work. This touches a lot of code in the executor that is
mostly new to me, so if you see something that looks like an obvious
bug, it probably is (so let me know).   


improvements

The main improvements of this version are that:

* initdb actually works (while before it was crashing)

* regression tests work, with two exceptions

  (a) 'subselect' fails because EXPLAIN prints columns in physical order
  (but we expect logical)

  (b) col_order crashes works because of tuple descriptor mismatch in a
  function call (this actually causes a segfault)

The main change is this patch is that tlist_matches_tupdesc() now checks
target list vs. physical attribute order, which may result in doing a
projection (in cases when that would not be done previously).

I don not claim this is the best approach - maybe it would be better to
keep the physical tuple and reorder it lazily. That's why I kept a few
pieces of code (fix_physno_mutator) and a few unused fields in Var.

Over the time I've heard various use cases for this patch, but in most
cases it was quite speculative. If you have an idea where this might be
useful, can you explain it here, or maybe point me to a place where it's
described?

There's also a few FIXMEs, mostly from Alvaro's version of the patch.
Some of them are probably obsolete, but I wasn't 100% sure by that so
I've left them in place until I understand the code sufficiently.


randomized testing
--
I've also attached a python script for simple randomized testing. Just
execute it like this:

$ python randomize-attlognum.py -t test_1 test_2 \
   --init-script attlognum-init.sql \
   --test-script attlognum-test.sql

and it will do this over and over

$ dropdb test
$ createdb test
$ run init script
$ randomly set attlognums for the tables (test_1 and test_2)
$ run test script

It does not actually check the result, but my experience is that when
there's a bug in handling the descriptor, it results in segfault pretty
fast (just put some varlena columns into the table).


plans / future
--
After discussing this with Alvaro, we've both agreed that this is far
too high-risk change to commit in the very last CF (even if it was in a
better shape). So while it's added to 2015-02 CF, we're aiming for 9.6
if things go well.


regards

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index c5892d3..7528724 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2368,6 +2368,9 @@ get_attnum_pk_pos(int *pkattnums, int pknumatts, int key)
 	return -1;
 }
 
+/*
+ * FIXME this probably needs to be tweaked.
+ */
 static HeapTuple
 get_tuple_of_interest(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals)
 {
diff --git a/contrib/spi/timetravel.c b/contrib/spi/timetravel.c
index 0699438..30e496c 100644
--- a/contrib/spi/timetravel.c
+++ b/contrib/spi/timetravel.c
@@ -314,6 +314,7 @@ timetravel(PG_FUNCTION_ARGS)
 		Oid		   *ctypes;
 		char		sql[8192];
 		char		separ = ' ';
+		Form_pg_attribute *attrs;
 
 		/* allocate ctypes for preparation */
 		ctypes = (Oid *) palloc(natts * sizeof(Oid));
@@ -322,10 +323,11 @@ timetravel(PG_FUNCTION_ARGS)
 		 * Construct query: INSERT INTO _relation_ VALUES ($1, ...)
 		 */
 		snprintf(sql, sizeof(sql), INSERT INTO %s VALUES (, relname);
+		attrs = TupleDescGetLogSortedAttrs(tupdesc);
 		for (i = 1; i = natts; i++)
 		{
-			ctypes[i - 1] = SPI_gettypeid(tupdesc, i);
-			if (!(tupdesc-attrs[i - 1]-attisdropped)) /* skip dropped columns */
+			ctypes[i - 1] = SPI_gettypeid(tupdesc, attrs[i - 1]-attnum);
+			if (!(attrs[i - 1]-attisdropped)) /* skip dropped columns */
 			{
 snprintf(sql + strlen(sql), sizeof(sql) - strlen(sql), %c$%d, separ, i);
 separ = ',';
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 6cd4e8e..14787ce 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -79,6 +79,8 @@
 /*
  * heap_compute_data_size
  *		Determine size of the data area of a tuple to be constructed
+ *
+ * Note: input arrays must be in attnum order.
  */
 Size
 heap_compute_data_size(TupleDesc tupleDesc,
@@ -88,16 +90,23 @@ heap_compute_data_size(TupleDesc tupleDesc,
 	Size		data_length = 0;
 	int			i;
 	int			numberOfAttributes = tupleDesc-natts;
-	Form_pg_attribute *att = tupleDesc-attrs;
+	Form_pg_attribute *att = TupleDescGetPhysSortedAttrs(tupleDesc);
 
+	/*
+	 * We need to consider the attributes in physical order for storage, yet
+	 * our input arrays are in attnum order.  In this loop, i is an index
+	 * into the attphysnum-sorted attribute array, and idx is an index into the
+	 * input arrays.
+	 */
 	for (i = 0; i  numberOfAttributes; i++)
 	

Re: [HACKERS] logical column ordering

2015-01-24 Thread Tomas Vondra
On 20.1.2015 22:30, Alvaro Herrera wrote:
 I've decided to abandon this patch.  I have spent too much time looking
 at it now.
 
 If anyone is interested in trying to study, I can provide the patches I
 came up with, explanations, and references to prior discussion -- feel
 free to ask.

I'll take look. Can you share the patches etc. - either here, or maybe
send it to me directly?

regards
Tomas

-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-01-20 Thread Alvaro Herrera
I've decided to abandon this patch.  I have spent too much time looking
at it now.

If anyone is interested in trying to study, I can provide the patches I
came up with, explanations, and references to prior discussion -- feel
free to ask.

My main motivation for this work is to enable a later patch for column
stores.  Right now, since columns have monotonically increasing attnums,
it's rather difficult to have columns that are stored elsewhere.  My
plan for that now is much more modest, something like adding a constant
1 to attnums and that would let us identify columns that are outside
the heap -- or something like that.  I haven't fully worked it out yet.


Just a few quick notes about this patch: last thing I was doing was mess
with setrefs.c so that Var nodes carry varphysnum annotations, which
are set to 0 during initial planner phases, and are turned into the
correct attphysnum (the value extracted from catalogs) so that
TupleDescs constructed from targetlists by ExecTypeFromTL and friends
can have the correct attphysnum too.  I think this part works correctly,
with the horrible exception that I had to do a relation_open() in
setrefs.c to get hold of the right attphysnum from a tupledesc obtained
from catalogs.  That's not acceptable at all; I think the right way to
do this would be to store a list of numbers earlier (not sure when) and
store it in RelOptInfo or RangeTableEntry; that would be accesible
during setrefs.c.

The other bit I did was modify all the heaptuple.c code so that it could
deal correctly with tupledescs that have attphysnums and attlognum in an
order different from stock attnum.  That took some time to get right,
but I think it's also correct now.

One issue I had was finding places that use attnum as an index into
the tupledesc attrs array.  I had to examine all these places and
change them to use a physattrs array, which is an array that has been
sorted by physical number.  I don't think all the changes are correct,
and I'm not sure that I caught them all.


Anyway it seems to me that this is mostly there.  If somebody is
interested in learning executor code, this project would be damn cool to
get done.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-01-14 Thread Michael Paquier
On Sun, Jan 4, 2015 at 10:37 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 So I'm reworking my patch with that in mind.
Switching to returned with feedback. Alvaro, feel free to add an entry
to the next CF if you are planning to work on it again.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-01-03 Thread Andres Freund
On 2014-12-09 14:41:46 -0300, Alvaro Herrera wrote:
 So I've been updating my very old patch to allow logical and physical
 column reordering.  Here's a WIP first cut for examination.

Do you have a updated patch that has ripened further?

 The first thing where this matters is tuple descriptor expansion in
 parse analysis; at this stage, things such as * (in select *) are
 turned into a target list, which must be sorted according to attlognum.
 To achieve this I added a new routine to tupledescs,
 TupleDescGetSortedAttrs() which computes a new Attribute array and
 caches it in the TupleDesc for later uses; this array points to the
 same elements in the normal attribute list but is order by attlognum.

That sounds sane.

 Another place that needs tweaking is heapam.c, which must construct a
 physical tuple from Datum/nulls arrays (heap_form_tuple).  In some cases
 the input arrays are sorted in logical column order.

I'm not sure that changing heaptuple.c's API (you mean that, not
heapam.c, right?)  is a good level to tackle this at. I think some
function to reorder values/isnull arrays into logical order and reverse
might end up being less invasive and actually faster.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-01-03 Thread Alvaro Herrera
Andres Freund wrote:
 On 2014-12-09 14:41:46 -0300, Alvaro Herrera wrote:
  So I've been updating my very old patch to allow logical and physical
  column reordering.  Here's a WIP first cut for examination.
 
 Do you have a updated patch that has ripened further?

Not yet.  Phil was kind enough to send me his old patch for study; I
am stealing a few interesting ideas from there, in particular:

  Another place that needs tweaking is heapam.c, which must construct a
  physical tuple from Datum/nulls arrays (heap_form_tuple).  In some cases
  the input arrays are sorted in logical column order.
 
 I'm not sure that changing heaptuple.c's API (you mean that, not
 heapam.c, right?)  is a good level to tackle this at. I think some
 function to reorder values/isnull arrays into logical order and reverse
 might end up being less invasive and actually faster.

Phil took a different route here than I did, and I think his design is
better than mine.  The main idea is that the Datum/nulls arrays in a
TupleTableSlot always follows physical order (he calls it storage
order), rather than this very strange mixture of things I did by
hacking the heaptuple.c API.  So I'm reworking my patch with that in
mind.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-12 Thread Andres Freund
On 2014-12-10 19:06:28 -0800, Josh Berkus wrote:
 On 12/10/2014 05:14 PM, Stephen Frost wrote:
  * Andres Freund (and...@2ndquadrant.com) wrote:
   But the scheduling of commits with regard to the 9.5 schedule actually
   opens a relevant question: When are we planning to release 9.5? Because
   If we try ~ one year from now it's a whole different ballgame than if we
   try to go back to september. And I think there's pretty good arguments
   for both.
  This should really be on its own thread for discussion...  I'm leaning,
  at the moment at least, towards the September release schedule.  I agree
  that having a later release would allow us to get more into it, but
  there's a lot to be said for the consistency we've kept up over the past
  few years with a September (our last non-September release was 8.4).
 
 Can we please NOT discuss this in the thread about someone's patch?  Thanks.

Well, it's relevant for the arguments made about the patches future...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-10 Thread Robert Haas
On Wed, Dec 10, 2014 at 12:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Andrew Dunstan wrote:
 I seriously doubt it, although I could be wrong. Unless someone can show a
 significant performance gain from using physical order, which would be a bit
 of a surprise to me, I would just stick with logical ordering as the
 default.

 Well, we have an optimization that avoids a projection step IIRC by
 using the physical tlist instead of having to build a tailored one.  I
 guess the reason that's there is because somebody did measure an
 improvement.  Maybe it *is* worth having as an option for pg_dump ...

 The physical tlist thing is there because it's demonstrable that
 ExecProject() takes nonzero time.  COPY does not go through ExecProject
 though.  What's more, it already has code to deal with a user-specified
 column order, and nobody's ever claimed that that code imposes a
 measurable performance overhead.

Also, if we're adding options to use the physical rather than the
logical column ordering in too many places, that's probably a sign
that we need to rethink this whole concept.  The concept of a logical
column ordering doesn't have much meaning if you're constantly forced
to fall back to some other column ordering whenever you want good
performance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-10 Thread Alvaro Herrera
Robert Haas wrote:
 On Wed, Dec 10, 2014 at 12:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Andrew Dunstan wrote:
  I seriously doubt it, although I could be wrong. Unless someone can show a
  significant performance gain from using physical order, which would be a 
  bit
  of a surprise to me, I would just stick with logical ordering as the
  default.
 
  Well, we have an optimization that avoids a projection step IIRC by
  using the physical tlist instead of having to build a tailored one.  I
  guess the reason that's there is because somebody did measure an
  improvement.  Maybe it *is* worth having as an option for pg_dump ...
 
  The physical tlist thing is there because it's demonstrable that
  ExecProject() takes nonzero time.  COPY does not go through ExecProject
  though.  What's more, it already has code to deal with a user-specified
  column order, and nobody's ever claimed that that code imposes a
  measurable performance overhead.
 
 Also, if we're adding options to use the physical rather than the
 logical column ordering in too many places, that's probably a sign
 that we need to rethink this whole concept.  The concept of a logical
 column ordering doesn't have much meaning if you're constantly forced
 to fall back to some other column ordering whenever you want good
 performance.

FWIW I have no intention to add options for physical/logical ordering
anywhere.  All users will see is that tables will follow the same
(logical) order everywhere.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-10 Thread Robert Haas
On Wed, Dec 10, 2014 at 9:25 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 FWIW I have no intention to add options for physical/logical ordering
 anywhere.  All users will see is that tables will follow the same
 (logical) order everywhere.

Just to be clear, I wasn't in any way attending to say that the patch
had a problem in this area.  I was just expressing concern about the
apparent rush to judgement on whether converting between physical and
logical column ordering would be expensive.  I certainly think that's
something that we should test - for example, we might want to consider
whether there are cases where you could maybe convince the executor to
spend a lot of time pointlessly reorganizing tuples in ways that
wouldn't happen today.  But I have no particular reason to think that
any issues we hit there issues won't be solvable.

To the extent that I have any concern about the patch at this point,
it's around stability.  I would awfully rather see something like this
get committed at the beginning of a development cycle than the end.
It's quite possible that I'm being more nervous than is justified, but
given that we're *still* fixing bugs related to dropped-column
handling (cf. 9b35ddce93a2ef336498baa15581b9d10f01db9c from July of
this year) which was added in July 2002, maybe not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-10 Thread Stephen Frost
Robert, all,

* Robert Haas (robertmh...@gmail.com) wrote:
 To the extent that I have any concern about the patch at this point,
 it's around stability.  I would awfully rather see something like this
 get committed at the beginning of a development cycle than the end.

I tend to agree with this; we have a pretty bad habit of bouncing
around big patches until the end and then committing them.  That's not
as bad when the patch has been getting reviews and feedback over a few
months (or years) but this particular patch isn't even code-complete at
this point, aiui.

 It's quite possible that I'm being more nervous than is justified, but
 given that we're *still* fixing bugs related to dropped-column
 handling (cf. 9b35ddce93a2ef336498baa15581b9d10f01db9c from July of
 this year) which was added in July 2002, maybe not.

I'm not quite sure that I see how that's relevant.  Bugs will happen,
unfortunately, no matter how much review is done of a given patch.  That
isn't to say that we shouldn't do any review, but it's a trade-off.
This change, at least, strikes me as less likely to have subtle bugs
in it as compared to the dropped column case.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] logical column ordering

2014-12-10 Thread Robert Haas
On Wed, Dec 10, 2014 at 11:22 AM, Stephen Frost sfr...@snowman.net wrote:
 I'm not quite sure that I see how that's relevant.  Bugs will happen,
 unfortunately, no matter how much review is done of a given patch.  That
 isn't to say that we shouldn't do any review, but it's a trade-off.
 This change, at least, strikes me as less likely to have subtle bugs
 in it as compared to the dropped column case.

Yeah, that's possible.  They seem similar to me because they both
introduce new ways for the tuple as it is stored on disk to be
different from what must be shown to the user.  But I don't really
know how well that translates to what needs to be changed on a code
level.  If we're basically going back over all the same places that
needed special handling for attisdropped, then the likelihood of bugs
may be quite a bit lower than it was for that patch, because now we
know (mostly!) which places require attisdropped handling and we can
audit them all to make sure they handle this, too.  But if it's a
completely different set of places that need to be touched, then I
think there's a lively possibility for bugs of omission.

Ultimately, I think this is mostly going to be a question of what
Alvaro feels comfortable with; he's presumably going to have a better
sense of where and to what extent there might be bugs lurking than any
of the rest of us.  But the point is worth considering, because I
think we would probably all agree that having a release that is stable
and usable right out of the gate is more important than having any
single feature get into that release.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-10 Thread Andres Freund
On 2014-12-10 12:06:11 -0500, Robert Haas wrote:
 Ultimately, I think this is mostly going to be a question of what
 Alvaro feels comfortable with; he's presumably going to have a better
 sense of where and to what extent there might be bugs lurking than any
 of the rest of us.  But the point is worth considering, because I
 think we would probably all agree that having a release that is stable
 and usable right out of the gate is more important than having any
 single feature get into that release.

Sure, 9.4 needs to be out of the gate. I don't think anybody doubts
that.

But the scheduling of commits with regard to the 9.5 schedule actually
opens a relevant question: When are we planning to release 9.5? Because
If we try ~ one year from now it's a whole different ballgame than if we
try to go back to september. And I think there's pretty good arguments
for both.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-10 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 But the scheduling of commits with regard to the 9.5 schedule actually
 opens a relevant question: When are we planning to release 9.5? Because
 If we try ~ one year from now it's a whole different ballgame than if we
 try to go back to september. And I think there's pretty good arguments
 for both.

This should really be on its own thread for discussion...  I'm leaning,
at the moment at least, towards the September release schedule.  I agree
that having a later release would allow us to get more into it, but
there's a lot to be said for the consistency we've kept up over the past
few years with a September (our last non-September release was 8.4).

I'd certainly vote against planning for a mid-December release as, at
least in my experience, it's a bad idea to try and do December (or
January 1..) major releases.  October might work, but that's not much of
a change from September.  Late January or February would probably work
but that's quite a shift from September and don't think it'd be
particularly better.  Worse, I'm nervous we might get into a habit of
longer and longer releases.  Having yearly releases, imv at least, is
really good for the project and those who depend on it.  New features
are available pretty quickly to end-users and people can plan around our
schedule pretty easily (eg- plan to do DB upgrades in January/February).

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] logical column ordering

2014-12-10 Thread Josh Berkus
On 12/10/2014 05:14 PM, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  But the scheduling of commits with regard to the 9.5 schedule actually
  opens a relevant question: When are we planning to release 9.5? Because
  If we try ~ one year from now it's a whole different ballgame than if we
  try to go back to september. And I think there's pretty good arguments
  for both.
 This should really be on its own thread for discussion...  I'm leaning,
 at the moment at least, towards the September release schedule.  I agree
 that having a later release would allow us to get more into it, but
 there's a lot to be said for the consistency we've kept up over the past
 few years with a September (our last non-September release was 8.4).

Can we please NOT discuss this in the thread about someone's patch?  Thanks.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-10 Thread Josh Berkus
On 12/09/2014 09:11 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 Question on COPY, though: there's reasons why people would want COPY to
 dump in either physical or logical order.  If you're doing COPY to
 create CSV files for output, then you want the columns in logical order.
  If you're doing COPY for pg_dump, then you want them in physical order
 for faster dump/reload.  So we're almost certainly going to need to have
 an option for COPY.
 
 This is complete nonsense, Josh, or at least it is until you can provide
 some solid evidence to believe that column ordering would make any
 noticeable performance difference in COPY.  I know of no reason to believe
 that the existing user-defined-column-ordering option makes any difference.


Chill, dude, chill.  When we have a patch, I'll do some performance
testing, and we'll see if it's something we care about or not. There's
no reason to be belligerent about it.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-09 Thread Jim Nasby

On 12/9/14, 11:41 AM, Alvaro Herrera wrote:

I'm going to see about it while I get feedback on the rest of this patch; in
particular, extra test cases that fail to work when columns have been
moved around are welcome, so that I can add them to the regress test.
What I have now is the basics I'm building as I go along.  The
regression tests show examples of some logical column renumbering (which
can be done after the table already contains some data) but none of
physical column renumbering (which can only be done when the table is
completely empty.)  My hunch is that the sample foo, bar, baz, quux
tables should present plenty of opportunities to display brokenness in
the planner and executor.


The ideal case would be to do something like randomizing logical and physical 
ordering as tables are created throughout the entire test suite (presumably as 
an option). That should work for physical ordering, but presumably it would 
pointlessly blow up on logical ordering because the expected output is 
hard-coded.

Perhaps instead of randomizing logical ordering we could force that to be the 
same ordering in which fields were supplied and actually randomize attnum?

In particular, I'm thinking that in DefineRelation we can randomize 
stmt-tableElts before merging in inheritance attributes.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-09 Thread Josh Berkus
On 12/09/2014 09:41 AM, Alvaro Herrera wrote:
 The first thing where this matters is tuple descriptor expansion in
 parse analysis; at this stage, things such as * (in select *) are
 turned into a target list, which must be sorted according to attlognum.
 To achieve this I added a new routine to tupledescs,

The two other major cases are:

INSERT INTO table SELECT|VALUES ...

COPY table FROM|TO ...

... although copy should just be a subclass of SELECT.

Question on COPY, though: there's reasons why people would want COPY to
dump in either physical or logical order.  If you're doing COPY to
create CSV files for output, then you want the columns in logical order.
 If you're doing COPY for pg_dump, then you want them in physical order
for faster dump/reload.  So we're almost certainly going to need to have
an option for COPY.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-09 Thread Andrew Dunstan


On 12/09/2014 06:19 PM, Josh Berkus wrote:

On 12/09/2014 09:41 AM, Alvaro Herrera wrote:

The first thing where this matters is tuple descriptor expansion in
parse analysis; at this stage, things such as * (in select *) are
turned into a target list, which must be sorted according to attlognum.
To achieve this I added a new routine to tupledescs,

The two other major cases are:

INSERT INTO table SELECT|VALUES ...

COPY table FROM|TO ...

... although copy should just be a subclass of SELECT.

Question on COPY, though: there's reasons why people would want COPY to
dump in either physical or logical order.  If you're doing COPY to
create CSV files for output, then you want the columns in logical order.
  If you're doing COPY for pg_dump, then you want them in physical order
for faster dump/reload.  So we're almost certainly going to need to have
an option for COPY.





I seriously doubt it, although I could be wrong. Unless someone can show 
a significant performance gain from using physical order, which would be 
a bit of a surprise to me, I would just stick with logical ordering as 
the default.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-09 Thread Alvaro Herrera
Andrew Dunstan wrote:
 
 On 12/09/2014 06:19 PM, Josh Berkus wrote:
 On 12/09/2014 09:41 AM, Alvaro Herrera wrote:
 The first thing where this matters is tuple descriptor expansion in
 parse analysis; at this stage, things such as * (in select *) are
 turned into a target list, which must be sorted according to attlognum.
 To achieve this I added a new routine to tupledescs,
 The two other major cases are:
 
 INSERT INTO table SELECT|VALUES ...
 
 COPY table FROM|TO ...

Yes, both are covered.

 ... although copy should just be a subclass of SELECT.

It is not.  There's one part of COPY that goes through SELECT
processing, but only when the table being copied is a subselect.
Normal COPY does not use the same code path.

 Question on COPY, though: there's reasons why people would want COPY to
 dump in either physical or logical order.  If you're doing COPY to
 create CSV files for output, then you want the columns in logical order.
   If you're doing COPY for pg_dump, then you want them in physical order
 for faster dump/reload.  So we're almost certainly going to need to have
 an option for COPY.
 
 I seriously doubt it, although I could be wrong. Unless someone can show a
 significant performance gain from using physical order, which would be a bit
 of a surprise to me, I would just stick with logical ordering as the
 default.

Well, we have an optimization that avoids a projection step IIRC by
using the physical tlist instead of having to build a tailored one.  I
guess the reason that's there is because somebody did measure an
improvement.  Maybe it *is* worth having as an option for pg_dump ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-09 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Question on COPY, though: there's reasons why people would want COPY to
 dump in either physical or logical order.  If you're doing COPY to
 create CSV files for output, then you want the columns in logical order.
  If you're doing COPY for pg_dump, then you want them in physical order
 for faster dump/reload.  So we're almost certainly going to need to have
 an option for COPY.

This is complete nonsense, Josh, or at least it is until you can provide
some solid evidence to believe that column ordering would make any
noticeable performance difference in COPY.  I know of no reason to believe
that the existing user-defined-column-ordering option makes any difference.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2014-12-09 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Andrew Dunstan wrote:
 I seriously doubt it, although I could be wrong. Unless someone can show a
 significant performance gain from using physical order, which would be a bit
 of a surprise to me, I would just stick with logical ordering as the
 default.

 Well, we have an optimization that avoids a projection step IIRC by
 using the physical tlist instead of having to build a tailored one.  I
 guess the reason that's there is because somebody did measure an
 improvement.  Maybe it *is* worth having as an option for pg_dump ...

The physical tlist thing is there because it's demonstrable that
ExecProject() takes nonzero time.  COPY does not go through ExecProject
though.  What's more, it already has code to deal with a user-specified
column order, and nobody's ever claimed that that code imposes a
measurable performance overhead.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers