Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-06-12 Thread Bruce Momjian

Tom Lane wrote:
 I said:
  Sorry, you used up your chance at claiming that t_hoff is dispensable.
  If we apply your already-submitted patch, it isn't.
 
 Wait, I take that back.  t_hoff is important to distinguish how much
 bitmap padding there is on a particular tuple --- but that's really
 only interesting as long as we aren't forcing dump/initdb/reload.
 If we are changing anything else about tuple headers, then that
 argument becomes irrelevant anyway.
 
 However, I'm still concerned about losing safety margin by removing
 redundant fields.

I just wanted to comment that redundancy in the tuple header, while
adding a very marginal amount of stability, really comes at too high a cost.
If we can save 4 bytes on every row stored, I think that is a clear win.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-06 Thread Hiroshi Inoue
Neil Conway wrote:
 
 On Mon, 6 May 2002 08:44:27 +0900
 "Hiroshi Inoue" [EMAIL PROTECTED] wrote:
   -Original Message-
   From: Manfred Koizar
  
   If there is interest in reducing on-disk tuple header size and I have
   not missed any strong arguments against dropping t_natts, I'll
   investigate further.  Comments?
 
  If a dbms is proper, it prepares a mechanism from the first
  to handle ADD COLUMN without touching the tuples. If the
  mechanism is lost (I believe so) by removing t_natts, I would
  say good bye to PostgreSQL.
 
 IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring
 redundant on-disk data (t_natts), it isn't SQL compliant (because
 default values or NOT NULL can't be specified), and depends on
 a low-level kludge (that the storage system will return NULL for
 any attnums > the # of the attributes stored in the tuple).

I think it's neither a hack nor a kludge.
The value of data which are non-existent at the appearance
is basically unknown. So there could be an implementation
of ALTER TABLE ADD COLUMN .. DEFAULT which doesn't touch
existing tuples at all, as Oracle does.
Though I don't object to touching tuples to implement ADD COLUMN
.. DEFAULT, please don't change the existing behavior along with it.

regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/



Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-06 Thread Rod Taylor
I think the real trick is keeping track of the difference between:

begin;
ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 4;
commit;

and

begin;
ALTER TABLE tab ADD COLUMN col1 int4;
ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
commit;

The first should populate the column with the value of '4', the second
should populate the column with NULL and have new entries with default
of 4.

Not to mention
begin;
ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 5;
ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
commit;

New tuples get the default value of 4, but rows that existed when the
column was created should get 5.
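
The three cases above can be modelled by keeping two separate defaults per
column: one fixed when the column is added (what pre-existing rows appear to
hold) and one that later SET DEFAULT commands change (what new rows get).
The following is only a minimal sketch of that bookkeeping. SketchColumn and
the sketch_* functions are hypothetical names invented for illustration, not
PostgreSQL catalog structures, and values are plain ints for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-column metadata; not a real PostgreSQL structure. */
typedef struct SketchColumn {
    bool has_missing_default;  /* was DEFAULT given in ADD COLUMN itself? */
    int  missing_default;      /* value rows older than the column appear to hold */
    bool has_insert_default;   /* is there a default for new rows? */
    int  insert_default;       /* default applied to newly inserted rows */
} SketchColumn;

/* ADD COLUMN ... [DEFAULT d]: both defaults start out identical. */
static SketchColumn sketch_add_column(bool has_default, int d)
{
    SketchColumn c = {
        .has_missing_default = has_default,
        .missing_default     = d,
        .has_insert_default  = has_default,
        .insert_default      = d,
    };
    return c;
}

/* ALTER COLUMN ... SET DEFAULT d: only future inserts are affected. */
static void sketch_set_default(SketchColumn *c, int d)
{
    assert(c != NULL);
    c->has_insert_default = true;
    c->insert_default = d;
}

/* Value a pre-existing row shows for the column; *isnull is set when
 * the column was added without a DEFAULT clause. */
static int sketch_old_row_value(const SketchColumn *c, bool *isnull)
{
    assert(c != NULL && isnull != NULL);
    *isnull = !c->has_missing_default;
    return c->has_missing_default ? c->missing_default : 0;
}
```

Under this model, adding the column with DEFAULT 5 and then running SET
DEFAULT 4 leaves old rows reading 5 while new rows receive 4, which matches
the third case.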
--
Rod


Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-06 Thread Hiroshi Inoue
Rod Taylor wrote:
 
 I think the real trick is keeping track of the difference between:
 
 begin;
 ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 4;
 commit;
 
 and
 
 begin;
 ALTER TABLE tab ADD COLUMN col1 int4;
 ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
 commit;
 
 The first should populate the column with the value of '4', the second
 should populate the column with NULL and have new entries with default
 of 4.

I know the difference. Though I don't love the standard's
specification of the first, I don't object to introducing it.
My only worry is that implementing the first would also
replace the current implementation of ADD COLUMN
(without default) with one that touches tuples.

regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/



Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-05 Thread Neil Conway

On Sun, 05 May 2002 23:48:31 +0200
Manfred Koizar [EMAIL PROTECTED] wrote:
 Two years ago there have been thoughts about ADD COLUMN and whether it
 should touch all tuples or just change the metadata.  Could someone
 tell me, what eventually came out of this discussion and where I find
 the relevant pieces of source code, please.

See AlterTableAddColumn() in commands/tablecmds.c

 If there is interest in reducing on-disk tuple header size and I have
 not missed any strong arguments against dropping t_natts, I'll
 investigate further.  Comments?

I'd definitely be interested -- let me know if you'd like any help...

Cheers,

Neil

-- 
Neil Conway [EMAIL PROTECTED]
PGP Key ID: DB3C29FC




Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-05 Thread Manfred Koizar

On Sun, 5 May 2002 18:07:27 -0400, Neil Conway
[EMAIL PROTECTED] wrote:
See AlterTableAddColumn() in commands/tablecmds.c
Thanks.  Sounds obvious.  Should have looked before asking...
This doesn't look too promising:
 * Implementation restrictions: because we don't touch the table rows,
   ^^
 * the new column values will initially appear to be NULLs.  (This
 * happens because the heap tuple access routines always check for
 * attnum > # of attributes in tuple, and return NULL if so.)
   ^

Scratching my head and pondering on ...
I'll be back :-)
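
The check described in the quoted comment can be sketched as follows. This
is a simplified illustration only: SketchTuple and sketch_getattr are
hypothetical names, the tuple holds plain ints, and nothing here reflects the
real HeapTupleHeaderData layout or the actual access routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a stored tuple. */
typedef struct SketchTuple {
    int        natts;   /* number of attributes physically stored */
    const int *values;  /* the natts stored values (all int for simplicity) */
} SketchTuple;

/* Fetch attribute attnum (1-based).  When the tuple was written before
 * the column existed (attnum > natts), report NULL instead of reading
 * past the stored data. */
static int sketch_getattr(const SketchTuple *tup, int attnum, bool *isnull)
{
    assert(tup != NULL && isnull != NULL && attnum >= 1);
    if (attnum > tup->natts) {
        *isnull = true;
        return 0;
    }
    *isnull = false;
    return tup->values[attnum - 1];
}
```

The comparison needs a per-tuple attribute count, which is why dropping
t_natts collides with an ADD COLUMN that leaves existing rows untouched.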

I'd definitely be interested -- let me know if you'd like any help...
Well, currently I'm in the process of making myself familiar with the
code.  That mainly takes hours of reading and searching.  Anyway,
thanks;  I'll post here, if I have questions.

Servus
 Manfred




Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-05 Thread Tom Lane

Manfred Koizar [EMAIL PROTECTED] writes:
 Currently there's an int16 t_natts in HeapTupleHeaderData.  This
 number is stored on disk for every single tuple.  Assuming that the
 number of attributes is constant for all tuples of one relation we
 have a lot of redundancy here.

... but that's a false assumption.

No, I don't think removing 2 bytes from the header is worth making
ALTER TABLE ADD COLUMN orders of magnitude slower.  Especially since
the actual savings will be *zero*, unless you can find another 2 bytes
someplace.
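
One way to read the "savings will be zero" point, assuming the header is
padded up to an alignment boundary: trimming 2 bytes only helps if the
padded size crosses a boundary. The sketch below shows that arithmetic. The
32-byte header and 4-byte alignment used in the note after it are
illustrative assumptions, not the real header size, and the sketch_*
functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of align (a power of two). */
static size_t sketch_align_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/* Bytes actually saved per tuple when the unpadded header shrinks
 * from old_len to new_len under the given alignment. */
static size_t sketch_header_savings(size_t old_len, size_t new_len,
                                    size_t align)
{
    assert(new_len <= old_len);
    return sketch_align_up(old_len, align) - sketch_align_up(new_len, align);
}
```

With those assumed numbers, sketch_header_savings(32, 30, 4) is 0, while
sketch_header_savings(32, 28, 4) is 4: the first 2 bytes are swallowed by
padding, and only a second 2 bytes make the tuple smaller.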

 If this is doable, we arrive at 6 bytes.  And what works for t_natts,
 should also work for t_hoff; that's another byte.  Are we getting
 nearer?

Sorry, you used up your chance at claiming that t_hoff is dispensable.
If we apply your already-submitted patch, it isn't.

The bigger picture here is that the more redundancy we squeeze out
of tuple headers, the more fragile the table data structure becomes.
Even if we could remove t_natts at zero runtime cost, I'd be concerned
about the implications for reliability (ie, ability to detect
inconsistencies) and post-crash data reconstruction.  I've spent enough
time staring at tuple dumps to be fairly glad that we don't run the
data through a compressor ;-)

regards, tom lane




Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-05 Thread Hiroshi Inoue

 -Original Message-
 From: Manfred Koizar
 
 If there is interest in reducing on-disk tuple header size and I have
 not missed any strong arguments against dropping t_natts, I'll
 investigate further.  Comments?

If a dbms is proper, it prepares a mechanism from the first
to handle ADD COLUMN without touching the tuples. If the
mechanism is lost (I believe so) by removing t_natts, I would
say good bye to PostgreSQL.

regards,
Hiroshi Inoue




Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-05 Thread Neil Conway

On Mon, 6 May 2002 08:44:27 +0900
Hiroshi Inoue [EMAIL PROTECTED] wrote:
  -Original Message-
  From: Manfred Koizar
  
  If there is interest in reducing on-disk tuple header size and I have
  not missed any strong arguments against dropping t_natts, I'll
  investigate further.  Comments?
 
 If a dbms is proper, it prepares a mechanism from the first
 to handle ADD COLUMN without touching the tuples. If the
 mechanism is lost (I believe so) by removing t_natts, I would
 say good bye to PostgreSQL.

IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring
redundant on-disk data (t_natts), it isn't SQL compliant (because
default values or NOT NULL can't be specified), and depends on
a low-level kludge (that the storage system will return NULL for
any attnums > the # of the attributes stored in the tuple).

While instantaneous ADD COLUMN is nice, I think it's counter-
productive to not take advantage of a storage space optimization
just to preserve a feature that is already semi-broken.

Cheers,

Neil

-- 
Neil Conway [EMAIL PROTECTED]
PGP Key ID: DB3C29FC




Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-05 Thread Christopher Kings-Lynne

 IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring
 redundant on-disk data (t_natts), it isn't SQL compliant (because
 default values or NOT NULL can't be specified), and depends on
 a low-level kludge (that the storage system will return NULL for
 any attnums > the # of the attributes stored in the tuple).

 While instantaneous ADD COLUMN is nice, I think it's counter-
 productive to not take advantage of a storage space optimization
 just to preserve a feature that is already semi-broken.

I actually started working on modifying ADD COLUMN to allow NOT NULL and
DEFAULT clauses.  Tom's idea of having col > n_atts return the default
instead of NULL is cool - I didn't think of that.  My changes would have
basically made the plain add column we have at the moment work instantly,
but if they specified NOT NULL it would touch every row.  That way it's up
to the DBA which one they want (as good HCI should always do).

However, now that my SET/DROP NOT NULL patch is in there, it's easy to do
the whole add column process, just in a transaction:

BEGIN;
ALTER TABLE foo ADD bar int4;
UPDATE foo SET bar=3;
ALTER TABLE foo ALTER bar SET NOT NULL;
ALTER TABLE foo ALTER bar SET DEFAULT 3;
ALTER TABLE foo ADD FOREIGN KEY (bar) REFERENCES (noik);
COMMIT;

With the advantage that you have full control over every step...

Chris





Re: [HACKERS] Number of attributes in HeapTupleHeader

2002-05-05 Thread Tom Lane

I said:
 Sorry, you used up your chance at claiming that t_hoff is dispensable.
 If we apply your already-submitted patch, it isn't.

Wait, I take that back.  t_hoff is important to distinguish how much
bitmap padding there is on a particular tuple --- but that's really
only interesting as long as we aren't forcing dump/initdb/reload.
If we are changing anything else about tuple headers, then that
argument becomes irrelevant anyway.

However, I'm still concerned about losing safety margin by removing
redundant fields.
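
As a rough illustration of why t_hoff becomes redundant once every tuple on
disk follows a single padding rule: the data offset could then be recomputed
from the attribute count and a has-nulls flag. The fixed header size and
alignment below are assumed values, and sketch_bitmap_len and
sketch_compute_hoff are hypothetical helpers, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_FIXED_HEADER 23  /* assumed size of the fixed header part */
#define SKETCH_ALIGN         4  /* assumed alignment of the user data */

/* One null-bitmap bit per attribute, rounded up to whole bytes. */
static size_t sketch_bitmap_len(int natts)
{
    assert(natts >= 0);
    return ((size_t) natts + 7) / 8;
}

/* Offset of the user data if all tuples are padded by the same rule:
 * fixed header, optional null bitmap, then alignment padding. */
static size_t sketch_compute_hoff(int natts, bool hasnulls)
{
    size_t len = SKETCH_FIXED_HEADER;

    if (hasnulls)
        len += sketch_bitmap_len(natts);
    return (len + SKETCH_ALIGN - 1) & ~(size_t) (SKETCH_ALIGN - 1);
}
```

While tuples written under an older padding rule are still on disk, this
recomputation can disagree with what was stored, which is the case where a
per-tuple t_hoff still carries information.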

regards, tom lane
