Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-25 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Well, if a transaction modifies a table in some way, even without
> > changing the data, should generate an unfreeze event, because it will
> > need to lock the table; for example AlterTable locks the affected
> > relation with AccessExclusiveLock.  It's important for the
> > non-transactional change to the pg_class tuple be the very first in the
> > transaction, because otherwise the change could be lost; but other than
> > this, I don't think there's any problem.
> 
> You can't guarantee that.  Consider for instance manual updates to
> pg_class:
> 
>   BEGIN;
>   UPDATE pg_class SET reltriggers = 0 WHERE relname = ...
>   ... alter table contents ...
>   COMMIT or ROLLBACK;
> 
> I believe there are actually patterns like this in some pg_dump output.
> Will you hack every UPDATE operation to test whether it's changing
> pg_class and if so force an "unfreeze" operation before changing any
> row?  No thanks :-(

Oh, true, I hadn't thought of direct updates to pg_class.

> >> I'm wondering if we need a second pg_class-derived catalog that carries
> >> just the nontransactional columns.
> 
> > I hope we don't need to do this because ISTM it will be a very big change.
> 
> (Yawn...)  We've made far bigger changes than that.  The important
> thing is to get it right.

Yeah, I know -- I've been involved in some of them.  I hereby volunteer
to do it for 8.2 because I'd really like to see this patch in.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-25 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Well, if a transaction modifies a table in some way, even without
> changing the data, should generate an unfreeze event, because it will
> need to lock the table; for example AlterTable locks the affected
> relation with AccessExclusiveLock.  It's important for the
> non-transactional change to the pg_class tuple be the very first in the
> transaction, because otherwise the change could be lost; but other than
> this, I don't think there's any problem.

You can't guarantee that.  Consider for instance manual updates to
pg_class:

BEGIN;
UPDATE pg_class SET reltriggers = 0 WHERE relname = ...
... alter table contents ...
COMMIT or ROLLBACK;

I believe there are actually patterns like this in some pg_dump output.
Will you hack every UPDATE operation to test whether it's changing
pg_class and if so force an "unfreeze" operation before changing any
row?  No thanks :-(

>> I'm wondering if we need a second pg_class-derived catalog that carries
>> just the nontransactional columns.

> I hope we don't need to do this because ISTM it will be a very big change.

(Yawn...)  We've made far bigger changes than that.  The important
thing is to get it right.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-25 Thread Alvaro Herrera
Tom Lane wrote:
> I wrote:
> > Well, that needs rethinking.  The unfreeze has to be a non-transactional
> > update (if our transaction rolls back, the unfreeze still has to
> > "stick", because we may have put dead tuples into the rel).
> 
> Actually, this seems even messier than I thought.  Consider a
> transaction that does something transactional to a table's schema,
> thereby generating a new pg_class row (but not touching any data within
> the table), and then alters the table contents, requiring an unfreeze.
> An update to the apparently-current pg_class tuple is not good because
> that tuple might be rolled back.  An update to the last committed
> version doesn't work either.

Well, if a transaction modifies a table in some way, even without
changing the data, should generate an unfreeze event, because it will
need to lock the table; for example AlterTable locks the affected
relation with AccessExclusiveLock.  It's important for the
non-transactional change to the pg_class tuple be the very first in the
transaction, because otherwise the change could be lost; but other than
this, I don't think there's any problem.

Not that I had actually considered this problem, to be frank; but it
seems a nice side effect of how the unfreezing works.

> This seems real close to the recent discussions about how to put
> sequence data into a single one-row-per-sequence system catalog,
> specifically about how there were some parts of the sequence catalog
> data that should be transactional and some that should not be.
> I'm wondering if we need a second pg_class-derived catalog that carries
> just the nontransactional columns.

I hope we don't need to do this because ISTM it will be a very big change.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-25 Thread Tom Lane
I wrote:
> Well, that needs rethinking.  The unfreeze has to be a non-transactional
> update (if our transaction rolls back, the unfreeze still has to
> "stick", because we may have put dead tuples into the rel).

Actually, this seems even messier than I thought.  Consider a
transaction that does something transactional to a table's schema,
thereby generating a new pg_class row (but not touching any data within
the table), and then alters the table contents, requiring an unfreeze.
An update to the apparently-current pg_class tuple is not good because
that tuple might be rolled back.  An update to the last committed
version doesn't work either.

This seems real close to the recent discussions about how to put
sequence data into a single one-row-per-sequence system catalog,
specifically about how there were some parts of the sequence catalog
data that should be transactional and some that should not be.
I'm wondering if we need a second pg_class-derived catalog that carries
just the nontransactional columns.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-25 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Where did all these CommandCounterIncrement calls come from?

> I added them in heap_unfreeze precisely because I want the relation to
> be frozen exactly once, and this doesn't seem to happen if I don't CCI
> there -- I was seeing multiple occurences of the NOTICE I put in
> heap_unfreeze for the same relation for a single CREATE TABLE (multiple
> unfreezes of pg_class and pg_attribute, for example).

Well, that needs rethinking.  The unfreeze has to be a non-transactional
update (if our transaction rolls back, the unfreeze still has to
"stick", because we may have put dead tuples into the rel).  Therefore,
a CCI is neither necessary nor useful.  I didn't look at your patch in
any detail ... didn't you modify it to use the non-transactional update
code I put in heapam.c recently?

It might be that we need to broadcast an sinval message for the tuple
when we update it this way ... although sinval believes updates are
transactional, so that's not going to work all that well.  Maybe we have
to legislate that catcache/syscache entries shouldn't be trusted to have
up-to-date values of these fields.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-25 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > CREATE TABLE foo (a int);
> 
> > for some unknown reason, an inval message involving relation foo seems
> > to be emitted.
> 
> > heap_unfreeze(pg_class)
> >   CommandCounterIncrement()
> > heap_unfreeze(pg_attribute)
> >   CommandCounterIncrement()
> > ... insert the pg_attribute rows ...
> 
> Where did all these CommandCounterIncrement calls come from?
> I don't think it's going to work if you are throwing in CCIs
> at random places; this problem with the relcache will be the
> least of your worries.  Why do you think you need that anyway?

I added them in heap_unfreeze precisely because I want the relation to
be frozen exactly once, and this doesn't seem to happen if I don't CCI
there -- I was seeing multiple occurences of the NOTICE I put in
heap_unfreeze for the same relation for a single CREATE TABLE (multiple
unfreezes of pg_class and pg_attribute, for example).

Maybe the problem is elsewhere.  I'll investigate more.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-25 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> CREATE TABLE foo (a int);

> for some unknown reason, an inval message involving relation foo seems
> to be emitted.

> heap_unfreeze(pg_class)
>   CommandCounterIncrement()
> heap_unfreeze(pg_attribute)
>   CommandCounterIncrement()
> ... insert the pg_attribute rows ...

Where did all these CommandCounterIncrement calls come from?
I don't think it's going to work if you are throwing in CCIs
at random places; this problem with the relcache will be the
least of your worries.  Why do you think you need that anyway?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-08 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> (Now, if you're combining this with the very grotty relpages/reltuples
>> update code, then I'm all for making that xlog properly --- we've gotten
>> away without it so far but it really should xlog the change.)

> I hadn't done that, but I'll see what I can do.  Notice however that I'm
> doing this in both pg_class _and_ pg_database.

It strikes me that the cleanest way to deal with this is to invent a
single new type of xlog record, something like HEAP_UPDATE_IN_PLACE,
which just replaces tuple contents with a new tuple of the same size.
This would serve for both stats updates and unfreezing in both pg_class
and pg_database, and might have other uses in future for
non-transactional updates.  It's a little bit more xlog space than your
unfreeze-specific operations, but I don't think we need to sweat a few
bytes for those; they're not likely to be common.

Barring objections, I'll add this when I refactor UpdateStats.  Got some
other work I need to get back to right now, though.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-08 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> We should reorganize things so this is done once at the outer level.
>> It'd require some change of the ambuild() API, but considering we're
>> hacking several other aspects of the AM API in this development cycle,
>> that doesn't bother me.

> I'm rather stuck in Mammoth Replicator currently :-( so I'm not sure
> when I could devote some time to this task.

OK, I'll try to get to it later this week.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-08 Thread Alvaro Herrera
Tom Lane wrote:

> BTW, one thing I was looking at last week was whether we couldn't
> refactor the relpages/reltuples updates to be done in a cleaner place.
> Right now UpdateStats is called (for indexes) directly from the index
> AM, which sucks from a modularity point of view, and what's worse it
> forces two successive updates of the same tuple during CREATE INDEX.
> We should reorganize things so this is done once at the outer level.
> It'd require some change of the ambuild() API, but considering we're
> hacking several other aspects of the AM API in this development cycle,
> that doesn't bother me.

I'm rather stuck in Mammoth Replicator currently :-( so I'm not sure
when I could devote some time to this task.  If you want to do it, go
ahead and I can adapt the relminxid patch to your changes.  Or you can
take the relminxid patch from where it currently is, if you feel so
inclined.

If you don't, I'll eventually come back to it, but I'm not sure when
will that be.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-08 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> (Now, if you're combining this with the very grotty relpages/reltuples
>> update code, then I'm all for making that xlog properly --- we've gotten
>> away without it so far but it really should xlog the change.)

> I hadn't done that, but I'll see what I can do.  Notice however that I'm
> doing this in both pg_class _and_ pg_database.

BTW, one thing I was looking at last week was whether we couldn't
refactor the relpages/reltuples updates to be done in a cleaner place.
Right now UpdateStats is called (for indexes) directly from the index
AM, which sucks from a modularity point of view, and what's worse it
forces two successive updates of the same tuple during CREATE INDEX.
We should reorganize things so this is done once at the outer level.
It'd require some change of the ambuild() API, but considering we're
hacking several other aspects of the AM API in this development cycle,
that doesn't bother me.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-08 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Ah, there's another reason, and it's that I'm rewriting the tuple in
> > place, not calling heap_update.
> 
> Is that really a good idea, as compared to using heap_update?

Not sure -- we would leave dead tuples after the VACUUM is finished if
we used heap_update, no?  ISTM it's undesirable.

> (Now, if you're combining this with the very grotty relpages/reltuples
> update code, then I'm all for making that xlog properly --- we've gotten
> away without it so far but it really should xlog the change.)

I hadn't done that, but I'll see what I can do.  Notice however that I'm
doing this in both pg_class _and_ pg_database.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-08 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Ah, there's another reason, and it's that I'm rewriting the tuple in
> place, not calling heap_update.

Is that really a good idea, as compared to using heap_update?

(Now, if you're combining this with the very grotty relpages/reltuples
update code, then I'm all for making that xlog properly --- we've gotten
away without it so far but it really should xlog the change.)

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-08 Thread Alvaro Herrera
Tom Lane wrote:

> But why do you need your own xlogging at all?  Shouldn't these actions
> be perfectly ordinary updates of the relevant catalog tuples?

The XLog entry can be smaller, AFAICT (we need to store the whole new
tuple in a heap_update, right?).  If that's not a worthy goal I'll
gladly rewrite this piece of code.

Ah, there's another reason, and it's that I'm rewriting the tuple in
place, not calling heap_update.  I'm not sure if I can reuse
log_heap_update for this purpose -- I'll take a look.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [WIP] The relminxid addition, try 3

2006-05-08 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I'm not too sure about the XLOG routines -- I don't understand very well
> the business about attaching the changes to a buffer; I thought at first
> that since all the changes go to a tuple, they all belong to the buffer,
> so I assigned a single XLogRecData struct with all the info and the
> buffer containing the tuple; but then on replay, I got "PANIC: invalid
> xlog record length 0"

Generally you want an xlog record to consist of some fixed overhead plus
attached data.  The attached data is the part that should link to the
buffer (normally it's data that is/will be actually stored into that buffer).
The fixed overhead isn't in the buffer and doesn't link.

But why do you need your own xlogging at all?  Shouldn't these actions
be perfectly ordinary updates of the relevant catalog tuples?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq