Re: [PATCHES] The corresponding relminxid patch; try 1

2006-06-11 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 This is the relminxid patch corresponding to the pg_ntclass patch I just
 posted.

That disable_heap_unfreeze thing seriously sucks.  How bad are the API
changes needed to pass that as a parameter instead of having a
very-dangerous global variable?

The comment at line 328ff in dbcommands.c seems misguided, which makes
me doubt the code too.  datfrozenxid and datvacuumxid should be
considered as indicating what XIDs appear inside the database, not what
is in its pg_database row.

The changes in vacuum.c are far too extensive to review meaningfully.
What did you do, and did it really need to touch so much code?

 The thing that bothers me most about this is that it turns LockRelation
 into an operation that needs to heap_fetch() from pg_ntclass in some
 cases, and possibly update it.

Have you done any profiling to see what that actually costs?

I think we could possibly dodge the work in the normal case if we are
willing to make VACUUM FREEZE take ExclusiveLock and send out a relation
cache inval on the relation.  Then, we can cache the pg_ntclass tuple in
relcache entries (discarding it on cache inval), and if the cached value
says it's not frozen then it's not frozen.  You couldn't trust the
cached value much further than that, but that would at least take the
fetch out of the normal path in LockRelation.  The trick here is the
problem that if VACUUM FREEZE fails after modifying pg_ntclass, its
relcache inval won't be sent out.

A bigger issue here is that I'm not sure what the locking protocol is
for pg_ntclass itself.  It looks like you're not consistently taking
a RowExclusiveLock when you update it.

BTW, I think you have the order of operations wrong in LockRelation;
should it not unfreeze only *after* obtaining lock?  Consider race
condition against relation drop for instance.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] The corresponding relminxid patch; try 1

2006-06-11 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 No, actually it's correct.  The point of that comment is that if the
 source database is frozen, then all XIDs appearing inside both databases
 (source and newly created) are frozen.  So it's possible that the row in
 pg_database is frozen as well.  But because we are creating a new row in
 pg_database, it's not really frozen any longer; so we change the
 pg_database fields in the new row to match.

No, this only says that pg_database has to be unfrozen.  If the source
DB is frozen then the clone is frozen too.

 The changes in vacuum.c are far too extensive to review meaningfully.
 What did you do, and did it really need to touch so much code?

 Yeah, they are extensive. ...

 Maybe I should take a stab at making incremental patches instead of
 doing everything in one patch.  This way it would be easier to review
 for correctness (and I'd be more confident that it is actually correct
 as well).

Please.  I've got no confidence that I see what's going on there.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] The corresponding relminxid patch; try 1

2006-06-11 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  This is the relminxid patch corresponding to the pg_ntclass patch I just
  posted.
 
 That disable_heap_unfreeze thing seriously sucks.  How bad are the API
 changes needed to pass that as a parameter instead of having a
 very-dangerous global variable?

Let's see -- I would need to fix all callers of LockRelation, and the
problem I found in an earlier version of the patch (before the invention
of the non-transaction stuff) was that some callers needed to pass that
information several levels down.  It's possible that this was an
artifact of the fact that it was using the relcache.  I'll experiment
with changing stuff so that the global variable is not needed.

 The comment at line 328ff in dbcommands.c seems misguided, which makes
 me doubt the code too.  datfrozenxid and datvacuumxid should be
 considered as indicating what XIDs appear inside the database, not what
 is in its pg_database row.

No, actually it's correct.  The point of that comment is that if the
source database is frozen, then all XIDs appearing inside both databases
(source and newly created) are frozen.  So it's possible that the row in
pg_database is frozen as well.  But because we are creating a new row in
pg_database, it's not really frozen any longer; so we change the
pg_database fields in the new row to match.

Actually, pg_database is going to be unfrozen right after that code,
because it's opened with RowExclusiveLock shortly after, precisely to
insert that new row we are inserting.  So maybe this is not an issue.


 The changes in vacuum.c are far too extensive to review meaningfully.
 What did you do, and did it really need to touch so much code?

Yeah, they are extensive.  I did several things there: get rid of a
couple of global variables that no longer need to be global; remove the
return value from vacuum_rel, since it's no longer needed (it's used to
determine whether we can truncate pg_clog, but now we can do it
regardless of whether this particular vacuuming took place or not); I
changed some variables from the old frozenXid name to minXid; I put
in a hack to make VACUUM FREEZE take a stronger lock; changed the API of
vacuum_rel so that instead of taking a specific acceptable relkind, it
receives whether TOAST is acceptable or not; and added the code needed
to keep track of the earliest Xid found in code.  But by far the most
extensive change is the melding of vac_update_dbstats into
vac_update_dbminxid, and the update of vac_update_relstats to cope with
pg_ntclass.

Maybe I should take a stab at making incremental patches instead of
doing everything in one patch.  This way it would be easier to review
for correctness (and I'd be more confident that it is actually correct
as well).

  The thing that bothers me most about this is that it turns LockRelation
  into an operation that needs to heap_fetch() from pg_ntclass in some
  cases, and possibly update it.
 
 Have you done any profiling to see what that actually costs?

No, but I guess it must be expensive.  While relminxid was still in the
relcache, it was cheap because we checked the value before having to
actually do anything else.  That's why I was suggesting having a
separate cache for non-transactional stuff.

 I think we could possibly dodge the work in the normal case if we are
 willing to make VACUUM FREEZE take ExclusiveLock and send out a relation
 cache inval on the relation.

Well, one problem is that if enough transactions pass after the last
update to a table, a normal VACUUM (i.e. not FREEZE) could mark a table
as frozen as well; marking frozen is not an exclusive property of VACUUM
FREEZE.

 BTW, I think you have the order of operations wrong in LockRelation;
 should it not unfreeze only *after* obtaining lock?  Consider race
 condition against relation drop for instance.

Hmm, good point.  I think it was OK (and actually, it was required)
while relminxid was still on pg_class; or rather, there was a race
condition the other way around, so it was required to unfreeze the table
_before_ obtaining the lock.  But it's certainly wrong now.

I'll work on pg_class_nt and I'll be back to this soon.  Thanks for the
review.

---(end of broadcast)---
TIP 6: explain analyze is your friend