Re: [PATCHES] [WIP] The relminxid addition, try 3
Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > Well, if a transaction modifies a table in some way, even without > > changing the data, should generate an unfreeze event, because it will > > need to lock the table; for example AlterTable locks the affected > > relation with AccessExclusiveLock. It's important for the > > non-transactional change to the pg_class tuple be the very first in the > > transaction, because otherwise the change could be lost; but other than > > this, I don't think there's any problem. > > You can't guarantee that. Consider for instance manual updates to > pg_class: > > BEGIN; > UPDATE pg_class SET reltriggers = 0 WHERE relname = ... > ... alter table contents ... > COMMIT or ROLLBACK; > > I believe there are actually patterns like this in some pg_dump output. > Will you hack every UPDATE operation to test whether it's changing > pg_class and if so force an "unfreeze" operation before changing any > row? No thanks :-( Oh, true, I hadn't thought of direct updates to pg_class. > >> I'm wondering if we need a second pg_class-derived catalog that carries > >> just the nontransactional columns. > > > I hope we don't need to do this because ISTM it will be a very big change. > > (Yawn...) We've made far bigger changes than that. The important > thing is to get it right. Yeah, I know -- I've been involved in some of them. I hereby volunteer to do it for 8.2 because I'd really like to see this patch in. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [WIP] The relminxid addition, try 3
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Well, if a transaction modifies a table in some way, even without > changing the data, should generate an unfreeze event, because it will > need to lock the table; for example AlterTable locks the affected > relation with AccessExclusiveLock. It's important for the > non-transactional change to the pg_class tuple be the very first in the > transaction, because otherwise the change could be lost; but other than > this, I don't think there's any problem. You can't guarantee that. Consider for instance manual updates to pg_class: BEGIN; UPDATE pg_class SET reltriggers = 0 WHERE relname = ... ... alter table contents ... COMMIT or ROLLBACK; I believe there are actually patterns like this in some pg_dump output. Will you hack every UPDATE operation to test whether it's changing pg_class and if so force an "unfreeze" operation before changing any row? No thanks :-( >> I'm wondering if we need a second pg_class-derived catalog that carries >> just the nontransactional columns. > I hope we don't need to do this because ISTM it will be a very big change. (Yawn...) We've made far bigger changes than that. The important thing is to get it right. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [WIP] The relminxid addition, try 3
Tom Lane wrote: > I wrote: > > Well, that needs rethinking. The unfreeze has to be a non-transactional > > update (if our transaction rolls back, the unfreeze still has to > > "stick", because we may have put dead tuples into the rel). > > Actually, this seems even messier than I thought. Consider a > transaction that does something transactional to a table's schema, > thereby generating a new pg_class row (but not touching any data within > the table), and then alters the table contents, requiring an unfreeze. > An update to the apparently-current pg_class tuple is not good because > that tuple might be rolled back. An update to the last committed > version doesn't work either. Well, if a transaction modifies a table in some way, even without changing the data, should generate an unfreeze event, because it will need to lock the table; for example AlterTable locks the affected relation with AccessExclusiveLock. It's important for the non-transactional change to the pg_class tuple be the very first in the transaction, because otherwise the change could be lost; but other than this, I don't think there's any problem. Not that I had actually considered this problem, to be frank; but it seems a nice side effect of how the unfreezing works. > This seems real close to the recent discussions about how to put > sequence data into a single one-row-per-sequence system catalog, > specifically about how there were some parts of the sequence catalog > data that should be transactional and some that should not be. > I'm wondering if we need a second pg_class-derived catalog that carries > just the nontransactional columns. I hope we don't need to do this because ISTM it will be a very big change. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [WIP] The relminxid addition, try 3
I wrote: > Well, that needs rethinking. The unfreeze has to be a non-transactional > update (if our transaction rolls back, the unfreeze still has to > "stick", because we may have put dead tuples into the rel). Actually, this seems even messier than I thought. Consider a transaction that does something transactional to a table's schema, thereby generating a new pg_class row (but not touching any data within the table), and then alters the table contents, requiring an unfreeze. An update to the apparently-current pg_class tuple is not good because that tuple might be rolled back. An update to the last committed version doesn't work either. This seems real close to the recent discussions about how to put sequence data into a single one-row-per-sequence system catalog, specifically about how there were some parts of the sequence catalog data that should be transactional and some that should not be. I'm wondering if we need a second pg_class-derived catalog that carries just the nontransactional columns. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [WIP] The relminxid addition, try 3
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Where did all these CommandCounterIncrement calls come from? > I added them in heap_unfreeze precisely because I want the relation to > be frozen exactly once, and this doesn't seem to happen if I don't CCI > there -- I was seeing multiple occurences of the NOTICE I put in > heap_unfreeze for the same relation for a single CREATE TABLE (multiple > unfreezes of pg_class and pg_attribute, for example). Well, that needs rethinking. The unfreeze has to be a non-transactional update (if our transaction rolls back, the unfreeze still has to "stick", because we may have put dead tuples into the rel). Therefore, a CCI is neither necessary nor useful. I didn't look at your patch in any detail ... didn't you modify it to use the non-transactional update code I put in heapam.c recently? It might be that we need to broadcast an sinval message for the tuple when we update it this way ... although sinval believes updates are transactional, so that's not going to work all that well. Maybe we have to legislate that catcache/syscache entries shouldn't be trusted to have up-to-date values of these fields. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [WIP] The relminxid addition, try 3
Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > CREATE TABLE foo (a int); > > > for some unknown reason, an inval message involving relation foo seems > > to be emitted. > > > heap_unfreeze(pg_class) > > CommandCounterIncrement() > > heap_unfreeze(pg_attribute) > > CommandCounterIncrement() > > ... insert the pg_attribute rows ... > > Where did all these CommandCounterIncrement calls come from? > I don't think it's going to work if you are throwing in CCIs > at random places; this problem with the relcache will be the > least of your worries. Why do you think you need that anyway? I added them in heap_unfreeze precisely because I want the relation to be frozen exactly once, and this doesn't seem to happen if I don't CCI there -- I was seeing multiple occurences of the NOTICE I put in heap_unfreeze for the same relation for a single CREATE TABLE (multiple unfreezes of pg_class and pg_attribute, for example). Maybe the problem is elsewhere. I'll investigate more. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [WIP] The relminxid addition, try 3
Alvaro Herrera <[EMAIL PROTECTED]> writes: > CREATE TABLE foo (a int); > for some unknown reason, an inval message involving relation foo seems > to be emitted. > heap_unfreeze(pg_class) > CommandCounterIncrement() > heap_unfreeze(pg_attribute) > CommandCounterIncrement() > ... insert the pg_attribute rows ... Where did all these CommandCounterIncrement calls come from? I don't think it's going to work if you are throwing in CCIs at random places; this problem with the relcache will be the least of your worries. Why do you think you need that anyway? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [WIP] The relminxid addition, try 3
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> (Now, if you're combining this with the very grotty relpages/reltuples >> update code, then I'm all for making that xlog properly --- we've gotten >> away without it so far but it really should xlog the change.) > I hadn't done that, but I'll see what I can do. Notice however that I'm > doing this in both pg_class _and_ pg_database. It strikes me that the cleanest way to deal with this is to invent a single new type of xlog record, something like HEAP_UPDATE_IN_PLACE, which just replaces tuple contents with a new tuple of the same size. This would serve for both stats updates and unfreezing in both pg_class and pg_database, and might have other uses in future for non-transactional updates. It's a little bit more xlog space than your unfreeze-specific operations, but I don't think we need to sweat a few bytes for those; they're not likely to be common. Barring objections, I'll add this when I refactor UpdateStats. Got some other work I need to get back to right now, though. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [WIP] The relminxid addition, try 3
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> We should reorganize things so this is done once at the outer level. >> It'd require some change of the ambuild() API, but considering we're >> hacking several other aspects of the AM API in this development cycle, >> that doesn't bother me. > I'm rather stuck in Mammoth Replicator currently :-( so I'm not sure > when I could devote some time to this task. OK, I'll try to get to it later this week. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [WIP] The relminxid addition, try 3
Tom Lane wrote: > BTW, one thing I was looking at last week was whether we couldn't > refactor the relpages/reltuples updates to be done in a cleaner place. > Right now UpdateStats is called (for indexes) directly from the index > AM, which sucks from a modularity point of view, and what's worse it > forces two successive updates of the same tuple during CREATE INDEX. > We should reorganize things so this is done once at the outer level. > It'd require some change of the ambuild() API, but considering we're > hacking several other aspects of the AM API in this development cycle, > that doesn't bother me. I'm rather stuck in Mammoth Replicator currently :-( so I'm not sure when I could devote some time to this task. If you want to do it, go ahead and I can adapt the relminxid patch to your changes. Or you can take the relminxid patch from where it currently is, if you feel so inclined. If you don't, I'll eventually come back to it, but I'm not sure when will that be. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [WIP] The relminxid addition, try 3
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> (Now, if you're combining this with the very grotty relpages/reltuples >> update code, then I'm all for making that xlog properly --- we've gotten >> away without it so far but it really should xlog the change.) > I hadn't done that, but I'll see what I can do. Notice however that I'm > doing this in both pg_class _and_ pg_database. BTW, one thing I was looking at last week was whether we couldn't refactor the relpages/reltuples updates to be done in a cleaner place. Right now UpdateStats is called (for indexes) directly from the index AM, which sucks from a modularity point of view, and what's worse it forces two successive updates of the same tuple during CREATE INDEX. We should reorganize things so this is done once at the outer level. It'd require some change of the ambuild() API, but considering we're hacking several other aspects of the AM API in this development cycle, that doesn't bother me. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [WIP] The relminxid addition, try 3
Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > Ah, there's another reason, and it's that I'm rewriting the tuple in > > place, not calling heap_update. > > Is that really a good idea, as compared to using heap_update? Not sure -- we would leave dead tuples after the VACUUM is finished if we used heap_update, no? ISTM it's undesirable. > (Now, if you're combining this with the very grotty relpages/reltuples > update code, then I'm all for making that xlog properly --- we've gotten > away without it so far but it really should xlog the change.) I hadn't done that, but I'll see what I can do. Notice however that I'm doing this in both pg_class _and_ pg_database. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [WIP] The relminxid addition, try 3
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Ah, there's another reason, and it's that I'm rewriting the tuple in > place, not calling heap_update. Is that really a good idea, as compared to using heap_update? (Now, if you're combining this with the very grotty relpages/reltuples update code, then I'm all for making that xlog properly --- we've gotten away without it so far but it really should xlog the change.) regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [WIP] The relminxid addition, try 3
Tom Lane wrote: > But why do you need your own xlogging at all? Shouldn't these actions > be perfectly ordinary updates of the relevant catalog tuples? The XLog entry can be smaller, AFAICT (we need to store the whole new tuple in a heap_update, right?). If that's not a worthy goal I'll gladly rewrite this piece of code. Ah, there's another reason, and it's that I'm rewriting the tuple in place, not calling heap_update. I'm not sure if I can reuse log_heap_update for this purpose -- I'll take a look. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [WIP] The relminxid addition, try 3
Alvaro Herrera <[EMAIL PROTECTED]> writes: > I'm not too sure about the XLOG routines -- I don't understand very well > the business about attaching the changes to a buffer; I thought at first > that since all the changes go to a tuple, they all belong to the buffer, > so I assigned a single XLogRecData struct with all the info and the > buffer containing the tuple; but then on replay, I got "PANIC: invalid > xlog record length 0" Generally you want an xlog record to consist of some fixed overhead plus attached data. The attached data is the part that should link to the buffer (normally it's data that is/will be actually stored into that buffer). The fixed overhead isn't in the buffer and doesn't link. But why do you need your own xlogging at all? Shouldn't these actions be perfectly ordinary updates of the relevant catalog tuples? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq