On 2/3/2007 4:58 PM, Theo Schlossnagle wrote:
On Feb 3, 2007, at 4:38 PM, Jan Wieck wrote:
On 2/3/2007 4:05 PM, Theo Schlossnagle wrote:
On Feb 3, 2007, at 3:52 PM, Jan Wieck wrote:
On 2/1/2007 11:23 PM, Jim Nasby wrote:
On Jan 25, 2007, at 6:16 PM, Jan Wieck wrote:
If a per database configurable tslog_priority is given, the
timestamp will be truncated to milliseconds and the increment
logic is done on milliseconds. The priority is added to the
timestamp. This guarantees that no two timestamps for commits
will ever be exactly identical, even across different servers.
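One possible reading of the priority scheme, as a rough sketch only: it assumes timestamps are integer microseconds and that the priority fills the sub-millisecond digits (0 to 999). Neither detail is stated in the proposal, and the function name is made up for illustration.

```python
def stamp_with_priority(ts_us: int, priority: int) -> int:
    """Truncate a microsecond timestamp to milliseconds, then add the priority.

    Assumption (not from the proposal): the priority is an integer in
    0..999 and lands in the microsecond digits freed by the truncation.
    """
    if not 0 <= priority < 1000:
        raise ValueError("priority must fit below one millisecond")
    truncated = (ts_us // 1000) * 1000  # drop the sub-millisecond part
    return truncated + priority


# Two servers committing within the same millisecond still get distinct
# timestamps, provided each has its own priority.
a = stamp_with_priority(1_170_540_000_123_456, 7)
b = stamp_with_priority(1_170_540_000_123_999, 8)
```

Under these assumptions, uniqueness across servers depends on every server being configured with a different priority value.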
Wouldn't it be better to just store that information
separately, rather than mucking with the timestamp?
Though, there's another issue here... I don't think NTP is good
for any better than a few milliseconds, even on a local network.
How exact does the conflict resolution need to be, anyway?
Would it really be a problem if transaction B committed 0.1
seconds after transaction A yet the cluster thought it was the
other way around?
Since the timestamp is basically a Lamport counter which is just
bumped by the clock as well, it doesn't need to be too precise.
Unless I'm missing something, you are _treating_ the counter as a
Lamport timestamp, when in fact it is not and thus does not
provide the semantics of a Lamport timestamp. As such, any
algorithms that use Lamport timestamps as a basis or assumption
for the proof of their correctness will not translate (provably)
to this system.
How is your counter semantically equivalent to a Lamport timestamp?
Yes, you must be missing something.
The last used timestamp is remembered. When a remote transaction is
replicated, the remembered timestamp is set to max(remembered,
remote). For a local transaction, the remembered timestamp is set
to max(remembered+1ms, systemclock) and that value is used as the
transaction commit timestamp.
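The rule described above can be sketched roughly as follows. This is only an illustration, assuming integer millisecond timestamps and an injectable clock; the names CommitClock, observe_remote and local_commit are hypothetical and do not come from the proposed patch.

```python
from typing import Callable


class CommitClock:
    """Sketch of the 'remembered timestamp' rule, in whole milliseconds."""

    def __init__(self, now_ms: Callable[[], int]) -> None:
        self._now_ms = now_ms   # system clock source (assumed interface)
        self.remembered = 0     # last used timestamp

    def observe_remote(self, remote_ms: int) -> None:
        # Replicating a remote transaction:
        # remembered = max(remembered, remote)
        self.remembered = max(self.remembered, remote_ms)

    def local_commit(self) -> int:
        # Local transaction:
        # remembered = max(remembered + 1ms, systemclock)
        self.remembered = max(self.remembered + 1, self._now_ms())
        return self.remembered


clock = CommitClock(lambda: 1000)  # a frozen system clock, for illustration
first = clock.local_commit()       # system clock wins
second = clock.local_commit()      # clock has not advanced, so bump by 1ms
clock.observe_remote(5000)         # a remote commit from a node running ahead
third = clock.local_commit()       # ordered after the remote commit
```

In this sketch a node whose system clock lags still stamps its next local commit later than any remote commit it has already replicated, which is the counter-like behaviour referred to earlier in the thread.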
A Lamport clock, IIRC, requires a cluster-wide tick. This seems based
only on activity and is thus an observational tick only, which means
various nodes can have various perspectives at different times.
Given that time skew is prevalent, why is the system clock involved
at all?
This question was already answered.
As is usual with distributed systems problems, they are very hard to
explain casually and also hard to review from a theoretical angle
without a proof. Are you basing this off a paper? If so which one?
If not, have you written a rigorous proof of correctness for this
approach?
I don't have any such paper and the proof of concept will be the
implementation of the system. I do however see enough resistance against
this proposal to withdraw the commit timestamp at this time. The new
replication system will therefore require the installation of a patched,
non-standard PostgreSQL version, compiled from sources cluster-wide, in
order to be used. I am aware that this will dramatically reduce its
popularity but it is impossible to develop this essential feature as an
external module.
I thank everyone for their attention.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== [EMAIL PROTECTED] #