On 02/27/2016 06:57 AM, Robert Haas wrote:
On Sat, Feb 27, 2016 at 1:49 AM, Konstantin Knizhnik wrote:
pg_tsdtm is based on a different approach: it uses system time as the CSN and
does not require an arbiter. In theory there is no limit on scalability, but
clock differences between nodes and the need for extra rounds of communication
have a negative impact on performance.
How do you prevent clock skew from causing serialization anomalies?
If a node receives a message from the "future", it just needs to wait until
that future time arrives. In practice we just "adjust" the system time in this
case, moving it forward (the system time itself is certainly not changed; we
just maintain a correction value which is added to the system time).
This approach was discussed in the article:
I hope the algorithm is explained in that article much better than I can do here.
1. I cannot prove that our pg_tsdtm implements the approach described in this
article absolutely correctly.
2. I didn't try to formally prove that our implementation cannot cause
serialization anomalies.
3. We just ran various synchronization tests (including the simplest
debit-credit test, which breaks the old version of Postgres-XL) for several
days, and we didn't get any inconsistencies.
4. We have tested pg_tsdtm on a single node, on a blade cluster, and on
geographically distributed nodes (separated by more than a thousand
kilometers: one server was in Vladivostok, another in Kaliningrad). Ping
between these two servers takes about 100 msec. The performance of our
benchmark dropped about 100 times, but there were no inconsistencies.
Also, I want to note once again that the primary idea of the proposed patch
was not these concrete DTM implementations. There are well-known limitations
of pg_tsdtm which we will try to address in the future.
What we want is to include the XTM API in PostgreSQL so that we can continue
our experiments with different transaction managers and implement multimaster
on top of it (our first practical goal) without affecting the PostgreSQL core.
If the XTM patch is included in 9.6, then we can offer our multimaster as a
PostgreSQL extension and everybody will be able to use it.
Otherwise we will have to offer our own fork of Postgres, which significantly
complicates using and maintaining it.
So there is no ideal solution which can work well for all clusters. This is
why it is not possible to develop just one GTM, propose it as a patch for
review, and then (hopefully) commit it to the Postgres core. IMHO that will
never happen, and I do not think it is actually needed. What we need is a way
to create our own transaction managers as Postgres extensions without
affecting the core.
This seems rather defeatist. If the code is good and reliable, why
should it not be committed to core?
1. There is no ideal implementation of a DTM which will fit all possible
needs and be efficient for all clusters.
2. Even if such an implementation existed, the right way to integrate it
would still be through some kind of TM API in Postgres.
I hope that everybody will agree that doing it in this way:

    #ifdef PGXC
        /* In Postgres-XC, stop timestamp has to follow the timeline of GTM */
        xlrec.xact_time = xactStopTimestamp + GTMdeltaTimestamp;
    #else
        xlrec.xact_time = xactStopTimestamp;
    #endif

or in this way:

    xlrec.xact_time = xactUseGTM ? xactStopTimestamp + GTMdeltaTimestamp
                                 : xactStopTimestamp;

is a very, very bad idea.
In OO programming we would have an abstract TM interface and several
implementations of this interface, for example
MVCC_TM, 2PL_TM, Distributed_TM...
This is essentially what can be done with our XTM API.
Since Postgres is implemented in C, not C++, we have to emulate interfaces
using structures with function pointers.
And please note that there is no need at all to include a DTM implementation
in core, since it is not needed by everybody.
It can easily be distributed as an extension.
I hope that quite soon we can offer a multimaster extension which should
provide functionality similar to MySQL Galera. But even right now we have
integrated pg_dtm and pg_tsdtm with pg_shard and postgres_fdw, allowing us to
provide distributed consistency for them.
All arguments against XTM can be applied to any other extension API in
Postgres, for example FDW.
Is it general enough? There are many useful operations which are currently
not handled by this API, for example performing aggregation and grouping on
the foreign server side. But it is still a very useful and flexible
mechanism, allowing many wonderful things to be implemented.
That is true. And everybody is entitled to an opinion on each new
proposed hook, as to whether that hook is general or not. We have
both accepted and rejected proposed hooks in the past.
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company