Re: [HACKERS] Number of attributes in HeapTupleHeader
Tom Lane wrote:
> I said:
> > Sorry, you used up your chance at claiming that t_hoff is dispensable. If we apply your already-submitted patch, it isn't.
>
> Wait, I take that back. t_hoff is important to distinguish how much bitmap padding there is on a particular tuple --- but that's really only interesting as long as we aren't forcing dump/initdb/reload. If we are changing anything else about tuple headers, then that argument becomes irrelevant anyway.
>
> However, I'm still concerned about losing safety margin by removing redundant fields.

I just wanted to comment that redundancy in the tuple header, while adding a very marginal amount to stability, comes at too high a cost. If we can save 4 bytes on every row stored, I think that is a clear win.

--
  Bruce Momjian | http://candle.pha.pa.us

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Number of attributes in HeapTupleHeader
Neil Conway wrote:
> On Mon, 6 May 2002 08:44:27 +0900
> "Hiroshi Inoue" wrote:
> > -----Original Message-----
> > From: Manfred Koizar
> >
> > > If there is interest in reducing on-disk tuple header size and I have not missed any strong arguments against dropping t_natts, I'll investigate further. Comments?
> >
> > If a dbms is proper, it prepares a mechanism from the first to handle ADD COLUMN without touching the tuples. If the mechanism is lost (I believe so) by removing t_natts, I would say good bye to PostgreSQL.
>
> IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring redundant on-disk data (t_natts), it isn't SQL compliant (because default values or NOT NULL can't be specified), and depends on a low-level kludge (that the storage system will return NULL for any attnums > the # of the attributes stored in the tuple).

I think it's neither a hack nor a kludge. The value of data that did not exist when the column was added is basically unknown. So there could be an implementation of ALTER TABLE ADD COLUMN .. DEFAULT which doesn't touch existing tuples at all, as Oracle does. Though I don't object to touching tuples to implement ADD COLUMN .. DEFAULT, please don't change the existing mechanism along with it.

regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/
Re: [HACKERS] Number of attributes in HeapTupleHeader
I think the real trick is keeping track of the difference between:

    begin;
    ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 4;
    commit;

and

    begin;
    ALTER TABLE tab ADD COLUMN col1 int4;
    ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
    commit;

The first should populate the column with the value 4; the second should populate the column with NULL and give new entries a default of 4.

Not to mention:

    begin;
    ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 5;
    ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
    commit;

New tuples get a default of 4, but the rows that existed when the column was created should have 5.

--
Rod

----- Original Message -----
From: "Hiroshi Inoue"
To: "Neil Conway"
Sent: Monday, May 06, 2002 9:08 PM
Subject: Re: [HACKERS] Number of attributes in HeapTupleHeader

> Neil Conway wrote:
> > On Mon, 6 May 2002 08:44:27 +0900
> > "Hiroshi Inoue" wrote:
> > > -----Original Message-----
> > > From: Manfred Koizar
> > >
> > > > If there is interest in reducing on-disk tuple header size and I have not missed any strong arguments against dropping t_natts, I'll investigate further. Comments?
> > >
> > > If a dbms is proper, it prepares a mechanism from the first to handle ADD COLUMN without touching the tuples. If the mechanism is lost (I believe so) by removing t_natts, I would say good bye to PostgreSQL.
> >
> > IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring redundant on-disk data (t_natts), it isn't SQL compliant (because default values or NOT NULL can't be specified), and depends on a low-level kludge (that the storage system will return NULL for any attnums > the # of the attributes stored in the tuple).
>
> I think it's neither a hack nor a kludge. The value of data that did not exist when the column was added is basically unknown. So there could be an implementation of ALTER TABLE ADD COLUMN .. DEFAULT which doesn't touch existing tuples at all, as Oracle does. Though I don't object to touching tuples to implement ADD COLUMN .. DEFAULT, please don't change the existing mechanism along with it.
>
> regards,
> Hiroshi Inoue
> http://w2422.nsk.ne.jp/~inoue/
Re: [HACKERS] Number of attributes in HeapTupleHeader
Rod Taylor wrote:
> I think the real trick is keeping track of the difference between:
>
>     begin;
>     ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 4;
>     commit;
>
> and
>
>     begin;
>     ALTER TABLE tab ADD COLUMN col1 int4;
>     ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
>     commit;
>
> The first should populate the column with the value 4; the second should populate the column with NULL and give new entries a default of 4.

I know the difference. Though I don't love the standard's specification of the first, I don't object to introducing it. My only worry is that implementing the first would also replace the current implementation of ADD COLUMN (without a default) with one that touches tuples.

regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/
Re: [HACKERS] Number of attributes in HeapTupleHeader
On Sun, 05 May 2002 23:48:31 +0200
Manfred Koizar wrote:
> Two years ago there were thoughts about ADD COLUMN and whether it should touch all tuples or just change the metadata. Could someone tell me what eventually came out of this discussion and where I can find the relevant pieces of source code, please?

See AlterTableAddColumn() in commands/tablecmds.c

> If there is interest in reducing on-disk tuple header size and I have not missed any strong arguments against dropping t_natts, I'll investigate further. Comments?

I'd definitely be interested -- let me know if you'd like any help...

Cheers,

Neil

--
Neil Conway
PGP Key ID: DB3C29FC
Re: [HACKERS] Number of attributes in HeapTupleHeader
On Sun, 5 May 2002 18:07:27 -0400, Neil Conway wrote:
> See AlterTableAddColumn() in commands/tablecmds.c

Thanks. Sounds obvious. Should have looked before asking... This doesn't look too promising:

     * Implementation restrictions: because we don't touch the table rows,
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     * the new column values will initially appear to be NULLs.  (This
     * happens because the heap tuple access routines always check for
     * attnum > # of attributes in tuple, and return NULL if so.)
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Scratching my head and pondering on ... I'll be back :-)

> I'd definitely be interested -- let me know if you'd like any help...

Well, currently I'm in the process of familiarizing myself with the code. That mainly takes hours of reading and searching. Anyway, thanks; I'll post here if I have questions.

Servus
 Manfred
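A minimal sketch of the behavior described in the source comment quoted above: each stored tuple records how many attributes it was written with, and any attribute number beyond that count reads as NULL. ToyTuple and toy_getattr are hypothetical names made up for this illustration, not PostgreSQL's actual structures or access routines, and attribute values are plain ints for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a stored heap tuple. */
typedef struct {
    int natts;          /* attributes stored in this tuple (cf. t_natts) */
    const int *values;  /* natts attribute values */
} ToyTuple;

/*
 * Fetch attribute attnum (1-based).  A tuple written before a column was
 * added has attnum > natts for that column; report NULL in that case
 * instead of reading past the stored data.
 */
int toy_getattr(const ToyTuple *tup, int attnum, bool *isnull)
{
    assert(tup != NULL && isnull != NULL && attnum >= 1);

    if (attnum > tup->natts) {
        *isnull = true;   /* column did not exist when the row was stored */
        return 0;
    }
    *isnull = false;
    return tup->values[attnum - 1];
}
```

For a tuple stored with two attributes, asking for attribute 3 sets *isnull to true without the row ever being rewritten, which is what lets ADD COLUMN change only the metadata. It is also why the per-tuple attribute count cannot simply be dropped under this scheme.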
Re: [HACKERS] Number of attributes in HeapTupleHeader
Manfred Koizar writes:
> Currently there's an int16 t_natts in HeapTupleHeaderData. This number is stored on disk for every single tuple. Assuming that the number of attributes is constant for all tuples of one relation we have a lot of redundancy here.

... but that's a false assumption.

No, I don't think removing 2 bytes from the header is worth making ALTER TABLE ADD COLUMN orders of magnitude slower. Especially since the actual savings will be *zero*, unless you can find another 2 bytes someplace.

> If this is doable, we arrive at 6 bytes. And what works for t_natts, should also work for t_hoff; that's another byte. Are we getting nearer?

Sorry, you used up your chance at claiming that t_hoff is dispensable. If we apply your already-submitted patch, it isn't.

The bigger picture here is that the more redundancy we squeeze out of tuple headers, the more fragile the table data structure becomes. Even if we could remove t_natts at zero runtime cost, I'd be concerned about the implications for reliability (ie, ability to detect inconsistencies) and post-crash data reconstruction. I've spent enough time staring at tuple dumps to be fairly glad that we don't run the data through a compressor ;-)

			regards, tom lane
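The "savings will be *zero*" remark presumably reflects alignment padding: if the header is rounded up to an alignment boundary, shaving 2 bytes off it only helps once enough bytes are removed to cross a boundary. The sketch below shows the rounding arithmetic only. The helper name toy_align_up is hypothetical, and the 4-byte alignment and header sizes used in the note afterwards are made-up figures, not the real layout.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round len up to the next multiple of align.
 * align must be a power of two (assumption of this sketch).
 */
size_t toy_align_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}
```

With an assumed 4-byte alignment, a hypothetical 32-byte header that loses one 2-byte field becomes 30 bytes, and toy_align_up(30, 4) is 32 again, so nothing is saved. Only when a second 2-byte field goes as well does toy_align_up(28, 4) give 28, a saving of 4 bytes per row.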
Re: [HACKERS] Number of attributes in HeapTupleHeader
-----Original Message-----
From: Manfred Koizar

> If there is interest in reducing on-disk tuple header size and I have not missed any strong arguments against dropping t_natts, I'll investigate further. Comments?

If a dbms is proper, it prepares a mechanism from the first to handle ADD COLUMN without touching the tuples. If the mechanism is lost (I believe so) by removing t_natts, I would say good bye to PostgreSQL.

regards,
Hiroshi Inoue
Re: [HACKERS] Number of attributes in HeapTupleHeader
On Mon, 6 May 2002 08:44:27 +0900
Hiroshi Inoue wrote:
> -----Original Message-----
> From: Manfred Koizar
>
> > If there is interest in reducing on-disk tuple header size and I have not missed any strong arguments against dropping t_natts, I'll investigate further. Comments?
>
> If a dbms is proper, it prepares a mechanism from the first to handle ADD COLUMN without touching the tuples. If the mechanism is lost (I believe so) by removing t_natts, I would say good bye to PostgreSQL.

IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring redundant on-disk data (t_natts), it isn't SQL compliant (because default values or NOT NULL can't be specified), and depends on a low-level kludge (that the storage system will return NULL for any attnums > the # of the attributes stored in the tuple).

While instantaneous ADD COLUMN is nice, I think it's counterproductive to not take advantage of a storage space optimization just to preserve a feature that is already semi-broken.

Cheers,

Neil

--
Neil Conway
PGP Key ID: DB3C29FC
Re: [HACKERS] Number of attributes in HeapTupleHeader
> IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring redundant on-disk data (t_natts), it isn't SQL compliant (because default values or NOT NULL can't be specified), and depends on a low-level kludge (that the storage system will return NULL for any attnums > the # of the attributes stored in the tuple).
>
> While instantaneous ADD COLUMN is nice, I think it's counterproductive to not take advantage of a storage space optimization just to preserve a feature that is already semi-broken.

I actually started working on modifying ADD COLUMN to allow NOT NULL and DEFAULT clauses. Tom's idea of having col > n_atts return the default instead of NULL is cool - I didn't think of that. My changes would have basically made the plain add column we have at the moment work instantly, but if the DBA specified NOT NULL it would touch every row. That way it's up to the DBA which one they want (as good HCI should always do).

However, now that my SET/DROP NOT NULL patch is in there, it's easy to do the whole add column process, just in a transaction:

    BEGIN;
    ALTER TABLE foo ADD bar int4;
    UPDATE foo SET bar=3;
    ALTER TABLE foo ALTER bar SET NOT NULL;
    ALTER TABLE foo ALTER bar SET DEFAULT 3;
    ALTER TABLE foo ADD FOREIGN KEY (bar) REFERENCES (noik);
    COMMIT;

With the advantage that you have full control over every step...

Chris
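A rough sketch of the idea mentioned above (returning a default instead of NULL when the attribute number exceeds the count stored in the tuple), combined with the distinction raised earlier in the thread between the default given in ADD COLUMN itself and a default set afterwards. All names here (ToyColumn, ToyRow, toy_read) are hypothetical, values are plain ints, stored NULLs are ignored, and nothing below is taken from PostgreSQL's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-column metadata kept in the catalog, not in the rows. */
typedef struct {
    bool has_add_default;  /* was DEFAULT given in ADD COLUMN itself? */
    int  add_default;      /* frozen at ADD COLUMN time; used for old rows */
    bool has_cur_default;  /* is there a default for newly inserted rows? */
    int  cur_default;      /* may be changed later by SET DEFAULT */
} ToyColumn;

/* Hypothetical stored row: natts values, as written at insert time. */
typedef struct {
    int natts;
    const int *values;
} ToyRow;

/*
 * Read attribute attnum (1-based) without ever rewriting old rows.
 * Rows that predate the column get the ADD COLUMN default if one was
 * given, otherwise NULL.  The current default is deliberately not used.
 */
int toy_read(const ToyRow *row, const ToyColumn *cols, int attnum, bool *isnull)
{
    assert(row != NULL && cols != NULL && isnull != NULL && attnum >= 1);

    if (attnum <= row->natts) {
        *isnull = false;
        return row->values[attnum - 1];
    }

    const ToyColumn *col = &cols[attnum - 1];
    if (col->has_add_default) {
        *isnull = false;
        return col->add_default;
    }
    *isnull = true;
    return 0;
}
```

Under these assumptions, adding a column with DEFAULT 5 and then running SET DEFAULT 4 leaves add_default at 5 and cur_default at 4: old rows read as 5 and only new inserts pick up 4. A column added without a default and given one later keeps reading as NULL in old rows.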
Re: [HACKERS] Number of attributes in HeapTupleHeader
I said:
> Sorry, you used up your chance at claiming that t_hoff is dispensable. If we apply your already-submitted patch, it isn't.

Wait, I take that back. t_hoff is important to distinguish how much bitmap padding there is on a particular tuple --- but that's really only interesting as long as we aren't forcing dump/initdb/reload. If we are changing anything else about tuple headers, then that argument becomes irrelevant anyway.

However, I'm still concerned about losing safety margin by removing redundant fields.

			regards, tom lane
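One way to read the t_hoff remark: if every tuple on disk follows a single padding rule, the offset to the user data can be recomputed from the attribute count and a has-nulls flag, so a stored offset only matters while tuples written under different rules coexist. The sketch below illustrates that recomputation. The fixed header size, the alignment, and the function names are all assumptions for illustration, not PostgreSQL's actual layout or macros.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for this sketch only; not the real on-disk layout. */
enum { TOY_FIXED_HEADER = 23, TOY_MAXALIGN = 4 };

/* Bytes needed for a null bitmap with one bit per attribute. */
size_t toy_bitmap_len(int natts)
{
    assert(natts >= 0);
    return ((size_t) natts + 7) / 8;
}

/*
 * Offset from the start of the tuple to the first attribute, derived
 * rather than stored: fixed header, optional null bitmap, then padding
 * up to the alignment boundary.
 */
size_t toy_header_offset(int natts, int has_nulls)
{
    size_t len = TOY_FIXED_HEADER;

    if (has_nulls)
        len += toy_bitmap_len(natts);
    return (len + TOY_MAXALIGN - 1) / TOY_MAXALIGN * TOY_MAXALIGN;
}
```

Note that the derivation itself needs the per-tuple attribute count, so in this sketch dropping both t_natts and t_hoff would leave nothing in the tuple to compute the offset from. The derived value also cannot be cross-checked against a stored one, which is the kind of lost safety margin raised in the message above.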