On 1/25/2007 11:41 PM, Bruce Momjian wrote:
Jan Wieck wrote:
On 1/25/2007 6:49 PM, Tom Lane wrote:
> Jan Wieck <[EMAIL PROTECTED]> writes:
>> To provide this data, I would like to add another "log" directory,
>> pg_tslog. The files in this directory will be similar to the clog, but
>> contain arrays of timestamptz values.
>
> Why should everybody be made to pay this overhead?

It could be made an initdb time option. If you intend to use a product that requires this feature, you will be willing to pay that price.

That is going to cut your usage by like 80%.  There must be a better
way.

I'd love to.

But it is a datum that needs to be collected at basically the moment the clog entry is made ... I don't think any external module can ever do that.

You know how long I've been in and out and back into replication again. The one thing that pops up again and again in all the scenarios is "what the heck was the commit order?". Now the pure commit order for a single node could certainly be recorded from a sequence, but that doesn't cover the multi-node environment I am after. That's why I want it to be a timestamp with a few fudged bits at the end. If you look at what I've described, you will notice that as long as all node priorities are unique, this timestamp will be a globally unique ID in a somewhat ascending order along a timeline. That is what replication people are looking for.
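
A minimal sketch of the "timestamp with a few fudged bits" idea, in Python for brevity. The 8-bit width, the microsecond unit, and the names make_commit_id and PRIORITY_BITS are assumptions made for illustration; they are not the format proposed for pg_tslog. The low bits of the commit time are overwritten with the node priority, and a node never hands out an ID that is not larger than its previous one.

```python
PRIORITY_BITS = 8  # assumed width of the "fudged" low bits


def make_commit_id(ts_us, node_priority, last_id=0, bits=PRIORITY_BITS):
    """Combine a commit time (microseconds) with a node priority.

    ts_us         -- commit time in microseconds (assumed unit)
    node_priority -- small integer, unique per node
    last_id       -- previous ID issued by this same node, 0 if none
    """
    mask = (1 << bits) - 1
    if not 0 <= node_priority <= mask:
        raise ValueError("node_priority does not fit in the low bits")

    # Replace the low bits of the timestamp with the node priority.
    candidate = (ts_us & ~mask) | node_priority

    # Two commits on one node can fall into the same time slot (or the
    # clock can step back); move to the next slot so IDs keep ascending.
    if candidate <= last_id:
        candidate = (((last_id >> bits) + 1) << bits) | node_priority
    return candidate


if __name__ == "__main__":
    a = make_commit_id(1_000_000, node_priority=1)
    b = make_commit_id(1_000_000, node_priority=2)
    c = make_commit_id(1_000_000, node_priority=1, last_id=a)
    print(a, b, c)
```

With unique priorities, two nodes committing in the same time slot still get different IDs, and sorting by ID gives one total order that roughly follows wall-clock time. How clock skew between nodes is handled is outside this sketch.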

Tom fears that the overhead is significant, which I do understand and, frankly, wonder about myself (actually, I don't even have a vague estimate). I really think we should make this thing an initdb option and decide later whether it's on or off by default. We can probably even implement it in a way that lets one turn it on or off, so that a postmaster restart plus waiting out the desired freeze delay would do.

What I know for certain is that no async replication system can ever do without the commit timestamp information. Using the transaction start time or even the single statement's timeofday will only lead to inconsistencies all over the place (I haven't been absent from the mailing lists for the past couple of months hiding in my closet ... I've been experimenting and trying to get around all these issues - in my closet). Slony-I can survive without that information because everything happens on one node and we record snapshot information for later abuse. But look at the cost at which we are dealing with this rather trivial issue. All we need to know is the serializable commit order. And we have to issue queries that eventually might exceed address space limits?


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #
