Re: [PATCHES] Are you actively hacking on 2PC?

2005-06-11 Thread Heikki Linnakangas

On Wed, 8 Jun 2005, Alvaro Herrera wrote:


> Additionally I collected some to-do entries, maybe you find it useful.


Good list overall. Comments to some entries below:


> * Clean up the callback support.


Yeah. Possibly remove the callback abstraction altogether and wire
calls to the different functions directly in a switch statement etc. I
expected there to be more functions in the callback tables, but it didn't
turn out that way.



> * Is PREPARE TRANSACTION the best interface?  What about marking a transaction
>   for preparing when it begins?  Apparently the XA spec requires/prefers this.


I'd prefer not to mark the transaction on begin. I might not know it's
going to be a two-phase transaction until later on. At the very least it
must still be possible to do a regular commit even if you mark the
transaction as two-phase at begin.


We'll have to see what the XA spec says about it.

BTW: while googling for the XA spec, I bumped into RFC 2371 - Transaction
Internet Protocol, which seems to be a protocol for coordinating a
two-phase commit over TCP. It looks quite flexible, so it should work
either way. I don't know how widely adopted it is; this is the first time
I've heard of it.



> * Check whether the TwoPhaseStateLock usage per GetPreparedTransactionXidList
>   is safe.  (In particular in CheckPointTwoPhase)


I don't see a problem: no other locks are held at the same time. Can you
elaborate?



> * Rethink about using both WAL and a state file.  (If we declared the
>   intent to persist a transaction when it begins, maybe we don't need
>   the state file at all.)


The purpose of the state file is two-fold. First, the original backend 
uses it to send a message, in a sense, to the backend that's later going 
to finish the 2nd phase commit. It contains the instructions to do the 
finishing.


Secondly, the pg_twophase directory and the set of valid state files in it
allow prepared transactions to be picked up by recovery after a crash.


The first need could be covered with a shared memory structure. But 
shared memory is fixed-size.


I don't know any good alternatives for the second need. The information
needs to be retained over checkpoints, so the WAL alone is not enough.
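
To make the two roles concrete, here is a rough Python model of the state
file life cycle. It is an illustration only, not code from the patch: the
function names are hypothetical, and the file naming (the XID as 8
uppercase hex digits) is an assumption made for the sketch. One file is
written at prepare time, removed when the second phase finishes, and a
recovery pass simply lists whatever files are still there.

```python
import os
import tempfile


def state_file_path(twophase_dir, xid):
    # Assumed naming for this sketch: the XID as 8 uppercase hex digits.
    return os.path.join(twophase_dir, f"{xid:08X}")


def write_state_file(twophase_dir, xid, payload):
    """First phase: persist the instructions for whoever finishes the commit."""
    with open(state_file_path(twophase_dir, xid), "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # the file must survive a crash


def remove_state_file(twophase_dir, xid):
    """Second phase done: the prepared transaction no longer needs its file."""
    os.remove(state_file_path(twophase_dir, xid))


def scan_prepared_xids(twophase_dir):
    """Recovery: every well-formed file name is a still-prepared transaction."""
    xids = []
    for name in os.listdir(twophase_dir):
        if len(name) == 8 and all(c in "0123456789ABCDEF" for c in name):
            xids.append(int(name, 16))
    return sorted(xids)


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    write_state_file(d, 0x2A, b"instructions for finishing xid 42")
    print(scan_prepared_xids(d))
    remove_state_file(d, 0x2A)
    print(scan_prepared_xids(d))
```

The model also shows the cost raised in this message: every prepared
transaction means one file creation and one file deletion.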


I wouldn't worry about the overhead of writing the same data twice. The
state files are typically around 300 bytes in the current implementation,
and that could probably be cut down even more with some thought, for
example by making the gid variable-size.


I'm a bit bothered by the fact that we have to create and delete a file
for each transaction, though. That could be expensive on some file
systems.


I don't see how it would help to declare the intent to persist
a transaction at begin.



> * Why is the new code in GetOldestXmin dependent on allDbs?  That seems bogus.


It's a bit accidental. GetOldestXmin is used in three places:

1. It determines the safe truncation point for the subtrans files.
2. It determines the vacuum cut-off point; tuples older than that can be
removed.
3. Similar to 2, it's used in index building to determine which tuples
are dead.


Case 1 needs to take prepared transactions into account. Cases 2 & 3 
don't, because a prepared transaction can't read any tuples anymore.


Case 1 happens to call GetOldestXmin with allDbs set to true, so I just 
wired the logic to that parameter. It should be cleaned up.
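
One possible shape for that cleanup, sketched as a small Python model
rather than as PostgreSQL code: the caller says explicitly whether
prepared transactions count, instead of the function inferring it from
allDbs. The XactEntry type, the include_prepared parameter and the
fallback argument are all hypothetical names introduced for the sketch,
and XID wraparound is ignored, so plain integer comparison stands in for
the real XID ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XactEntry:
    """Hypothetical stand-in for one transaction entry (not a PostgreSQL struct)."""
    xmin: int
    prepared: bool  # True once the first phase (prepare) has completed


def oldest_xmin(entries, include_prepared, fallback):
    """Return the smallest xmin among the entries that should be considered.

    include_prepared=True models case 1 (subtrans truncation), where
    prepared transactions must be taken into account.
    include_prepared=False models cases 2 and 3 (vacuum, index build).
    fallback is returned when no entry qualifies.
    """
    candidates = [e.xmin for e in entries if include_prepared or not e.prepared]
    return min(candidates, default=fallback)


if __name__ == "__main__":
    entries = [XactEntry(xmin=100, prepared=True),
               XactEntry(xmin=150, prepared=False)]
    print(oldest_xmin(entries, include_prepared=True, fallback=200))
    print(oldest_xmin(entries, include_prepared=False, fallback=200))
```

With an explicit flag, each of the three callers states its own
requirement and the behaviour no longer depends on which databases are
being scanned.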


- Heikki

---(end of broadcast)---
TIP 6: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] Are you actively hacking on 2PC?

2005-06-08 Thread Alvaro Herrera
On Tue, Jun 07, 2005 at 07:00:48PM -0400, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Tom, please let me know if you want to hack on it, if that's the case
> > I'll stop my cleanup and start to work on supporting the things that
> > need further coding, which are AFAICS MultiXactId, large objects and
> > notifies.
> 
> Well, I am just finishing up the multixact XLOG rewrite, so I was about
> to pester you as to where you are on this.
> 
> I would like to start doing something with it --- how can we best avoid
> stepping on each other's toes?

Please use the patch I posted yesterday as starting point -- I'll work
on missing features.  Just don't touch notify.c nor large object support
files...

I reworked the iterator routines for procarray.c and disaster ensued --
but I'm not sure if that uncovered previous bugs or just introduced some
semantic change I'm not seeing.  If you rewrite the whole stuff my
changes will prove useless, but I don't want to slow you down if it's a
bug of mine.

-- 
Alvaro Herrera
"Those who use electric razors are infidels destined to burn in hell while
we drink from rivers of beer, download free vids and mingle with naked
well shaved babes." (http://slashdot.org/comments.pl?sid=44793&cid=4647152)



Re: [PATCHES] Are you actively hacking on 2PC?

2005-06-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom, please let me know if you want to hack on it, if that's the case
> I'll stop my cleanup and start to work on supporting the things that
> need further coding, which are AFAICS MultiXactId, large objects and
> notifies.

Well, I am just finishing up the multixact XLOG rewrite, so I was about
to pester you as to where you are on this.

I would like to start doing something with it --- how can we best avoid
stepping on each other's toes?

regards, tom lane
