Re: [HACKERS] Proposal: Commit timestamp

Jan Wieck Thu, 25 Jan 2007 21:42:46 -0800

On 1/25/2007 11:41 PM, Bruce Momjian wrote:

> Jan Wieck wrote:
>> On 1/25/2007 6:49 PM, Tom Lane wrote:
>> > Jan Wieck <[EMAIL PROTECTED]> writes:
>> >> To provide this data, I would like to add another "log" directory,
>> >> pg_tslog. The files in this directory will be similar to the clog, but
>> >> contain arrays of timestamptz values.
>> >
>> > Why should everybody be made to pay this overhead?
>>
>> It could be made an initdb time option. If you intend to use a product
>> that requires this feature, you will be willing to pay that price.
>
> That is going to cut your usage by like 80%.  There must be a better
> way.


I'd love to.

But it is a datum that needs to be collected at the moment where basically the clog entry is made ... I don't think any external module can do that ever.

You know how long I've been in and out and back into replication again. The one thing that pops up again and again in all the scenarios is "what the heck was the commit order?". Now the pure commit order for a single node could certainly be recorded from a sequence, but that doesn't cover the multi-node environment I am after. That's why I want it to be a timestamp with a few fudged bits at the end. If you look at what I've described, you will notice that as long as all node priorities are unique, this timestamp will be a globally unique ID in a somewhat ascending order along a timeline. That is what replication people are looking for.
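
To make the "timestamp with a few fudged bits" idea concrete, here is a minimal Python sketch. The 8-bit width, the microsecond resolution, the class name CommitIdGenerator, and the rule that a node bumps its ID past the previous one are all assumptions made for illustration; none of them are taken from the proposal itself.

```python
PRIORITY_BITS = 8  # assumed width of the low-order "fudged" bits
PRIORITY_MASK = (1 << PRIORITY_BITS) - 1


class CommitIdGenerator:
    """Hypothetical per-node generator of timestamp-based commit IDs."""

    def __init__(self, node_priority: int) -> None:
        if not 0 <= node_priority <= PRIORITY_MASK:
            raise ValueError("node priority does not fit in the reserved bits")
        self.node_priority = node_priority
        self.last_id = -1

    def next_id(self, now_us: int) -> int:
        # Overwrite the low bits of the microsecond timestamp with the priority.
        candidate = (now_us & ~PRIORITY_MASK) | self.node_priority
        # Assumption: bump past the previous ID so one node never repeats an ID.
        if candidate <= self.last_id:
            candidate = self.last_id + (1 << PRIORITY_BITS)
        self.last_id = candidate
        return candidate


node_a = CommitIdGenerator(node_priority=1)
node_b = CommitIdGenerator(node_priority=2)
now_us = 1_000_000  # both nodes commit at the same wall-clock instant
ids = [node_a.next_id(now_us), node_b.next_id(now_us), node_a.next_id(now_us)]
```

Because the low bits always carry the node's priority, two nodes with different priorities can never produce the same ID, and sorting the IDs yields one total order across nodes. That order is only "somewhat" ascending in real time, since it depends on how closely the nodes' clocks agree.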

Tom fears that the overhead is significant, which I do understand and frankly, wonder myself about (actually I don't even have a vague estimate). I really think we should make this thing an initdb option and decide later if it's on or off by default. Probably we can implement it even in a way that one can turn it on/off and a postmaster restart plus waiting the desired freeze-delay would do.
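
For the storage side of that overhead only, a rough back-of-the-envelope sketch in Python follows. It assumes pg_tslog would reuse the clog's page and segment geometry (8192-byte pages, 32 pages per segment) and store one 8-byte timestamptz per transaction ID; the function name tslog_location is hypothetical, and the sketch says nothing about the runtime cost of writing the entries.

```python
BLCKSZ = 8192                 # default PostgreSQL block size in bytes
SLRU_PAGES_PER_SEGMENT = 32   # pages per clog segment file
TS_SIZE = 8                   # bytes in one timestamptz value
CLOG_BITS_PER_XACT = 2        # the clog stores two status bits per transaction

TS_PER_PAGE = BLCKSZ // TS_SIZE  # timestamps that fit on one page


def tslog_location(xid: int) -> tuple[int, int, int]:
    """Map a transaction ID to (segment, page in segment, byte offset)."""
    page = xid // TS_PER_PAGE
    segment = page // SLRU_PAGES_PER_SEGMENT
    page_in_segment = page % SLRU_PAGES_PER_SEGMENT
    offset = (xid % TS_PER_PAGE) * TS_SIZE
    return segment, page_in_segment, offset


# Storage needed for one million transactions under these assumptions.
xacts = 1_000_000
tslog_bytes = xacts * TS_SIZE
clog_bytes = xacts * CLOG_BITS_PER_XACT // 8
ratio = tslog_bytes // clog_bytes
```

Under these assumptions the timestamp log takes 32 times the space of the clog for the same range of transaction IDs.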

What I know for certain is that no async replication system can ever do without the commit timestamp information. Using the transaction start time or even the single statement's timeofday will only lead to inconsistencies all over the place (I haven't been absent from the mailing lists for the past couple of months hiding in my closet ... I've been experimenting and trying to get around all these issues - in my closet). Slony-I can survive without that information because everything happens on one node and we record snapshot information for later abusal. But look at what cost we are dealing with this rather trivial issue. All we need to know is the serializable commit order. And we have to issue queries that eventually might exceed address space limits?
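
A toy Python example of the inconsistency that ordering by start time can cause. The two transactions, their logical times, and the replay helper are made up for illustration: T1 starts first but commits last, and both write the same row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    name: str
    start: int    # logical time the transaction began
    commit: int   # logical time the transaction committed
    value: str    # value it wrote to the one shared row


def replay(txns, key):
    """Apply the writes in the order given by key; return the final row value."""
    row = None
    for txn in sorted(txns, key=key):
        row = txn.value
    return row


history = [
    Txn("T1", start=1, commit=10, value="written by T1"),
    Txn("T2", start=2, commit=5, value="written by T2"),
]

origin_row = replay(history, key=lambda t: t.commit)   # what the origin ends up with
replica_row = replay(history, key=lambda t: t.start)   # replaying by start time
```

Replaying in commit order reproduces the origin's final row, while replaying in start order leaves T2's write on top and the replica diverges.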



Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #

