Re: [HACKERS] The plan for FDW-based sharding

Konstantin Knizhnik Tue, 01 Mar 2016 09:08:21 -0800

Thank you very much for your comments.

On 01.03.2016 18:19, Robert Haas wrote:

On Sat, Feb 27, 2016 at 2:29 AM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

How do you prevent clock skew from causing serialization anomalies?

If a node receives a message from the "future", it just needs to wait until
this future arrives.
In practice we just "adjust" the system time in this case, moving it forward
(of course the system time is not actually changed; we just set a correction
value which needs to be added to the system time).
This approach was discussed in the article:
http://research.microsoft.com/en-us/people/samehe/clocksi.srds2013.pdf
I hope the algorithm is explained much better in this article than I can do
here.

Hmm, the approach in that article is very interesting, but it sounds
different than what you are describing - they do not, AFAICT, have
anything like a "correction value"


In the article they use the notion of "wait":

if T.SnapshotTime>GetClockTime()
then wait until T.SnapshotTime<GetClockTime()

Originally we really did sleep here, but then we thought that instead of
sleeping we can just adjust the local time. Sorry, I do not have a formal
proof that it is equivalent, but... at least we have not encountered any
inconsistencies after this fix, and performance is improved.
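
A minimal sketch of the "correction value" idea, for illustration only. All
names here (clock_correction, observe_remote_snapshot, and so on) are
hypothetical and are not taken from pg_tsdtm; the physical clock is replaced
by a plain variable so the behaviour is easy to follow. Instead of sleeping
until the local clock passes the remote snapshot time, the node bumps an
offset that is added to every later clock reading:

```c
#include <stdint.h>

typedef int64_t timestamp_us;              /* microseconds, hypothetical unit */

/* Stand-in for the hardware clock (assumption: set by the caller). */
static timestamp_us physical_clock_us = 0;

/* Offset added to every local reading; the OS clock is never touched. */
static timestamp_us clock_correction = 0;

static timestamp_us read_physical_clock(void)
{
    return physical_clock_us;
}

/* Logical local time = physical time + accumulated correction. */
static timestamp_us get_clock_time(void)
{
    return read_physical_clock() + clock_correction;
}

/*
 * Called when a snapshot timestamp arrives from another node.
 * If the snapshot is from the "future", move the logical clock just past it
 * instead of waiting for the physical clock to catch up.
 */
static void observe_remote_snapshot(timestamp_us snapshot_time)
{
    timestamp_us now = get_clock_time();

    if (snapshot_time > now)
        clock_correction += snapshot_time - now + 1;
}
```

After observe_remote_snapshot() returns, get_clock_time() is strictly greater
than the snapshot time, which is the same condition the "wait" in the article
ends on. The correction only ever grows, so the logical clock never moves
backwards.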

There are well-known limitations of pg_tsdtm which we will try to
address in the future.

How well known are those limitations?  Are they documented somewhere?
Or are they only well-known to you?

Sorry, well known to us.
But they are described on the DTM wiki page.

Right now pg_tsdtm does not support correct distributed deadlock detection
(it does not build a global lock graph) and detects distributed deadlocks
based only on timeouts. It doesn't support explicit locks, but "select for
update" will work correctly.
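
To make the timeout-based approach concrete, here is a rough sketch. The
types and the function check_lock_wait are hypothetical and do not come from
pg_tsdtm. A waiter that has not been granted its lock within the timeout is
simply presumed to be deadlocked; no lock graph is consulted:

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    WAIT_CONTINUE,           /* keep waiting */
    WAIT_GRANTED,            /* lock acquired */
    WAIT_PRESUMED_DEADLOCK   /* timed out: abort and let the caller retry */
} wait_result;

typedef struct {
    int64_t wait_started_ms; /* when the waiter started to block */
    int64_t timeout_ms;      /* how long it may block before giving up */
} lock_wait;

/* Decide what a blocked transaction should do at time now_ms. */
wait_result check_lock_wait(const lock_wait *w, int64_t now_ms, bool granted)
{
    if (granted)
        return WAIT_GRANTED;
    if (now_ms - w->wait_started_ms >= w->timeout_ms)
        return WAIT_PRESUMED_DEADLOCK;
    return WAIT_CONTINUE;
}
```

The trade-off is that a timeout cannot tell a real deadlock from a lock
holder that is merely slow, so some transactions may be aborted
unnecessarily.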

What we want is to include the XTM API in PostgreSQL so that we can continue
our experiments with different transaction managers and implement multimaster
on top of it (our first practical goal) without affecting the PostgreSQL core.

If the XTM patch is included in 9.6, then we can propose our multimaster as
a PostgreSQL extension and everybody can use it.
Otherwise we have to propose our own fork of Postgres, which significantly
complicates using and maintaining it.

Well I still think what I said before is valid.  If the code is good,
let it be a core submission.  If it's not ready yet, submit it to core
when it is.  If it can't be made good, forget it.

I have nothing against committing DTM code to core. But the best way of
integrating it is still an OO-like approach, so we still need an API.
Inserting ifs or switches into existing code is IMHO an ugly idea.

Also, it is not enough for DTM code to be just "good". It should provide the
expected functionality. But which functionality is expected? From my
experience of developing different cluster solutions I can say that different
customers have very different requirements. It is very hard, if at all
possible, to satisfy them all.


Right now I do not feel that I can predict all possible requirements for a
DTM.

This is why we want to provide some API, propose some implementations of this
API, receive feedback, and get a better understanding of which functionality
is actually needed by customers.

This seems rather defeatist.  If the code is good and reliable, why
should it not be committed to core?

Two reasons:
1. There is no ideal implementation of a DTM which will fit all possible needs
and be efficient for all clusters.

Hmm, what is the reasoning behind that statement?  I mean, it is
certainly true that there are some places where we have decided that
one-size-fits-all is not the right approach.  Indexing, for example.
But there are many other places where we have not chosen to make
things pluggable, and I don't think it should be taken for
granted that pluggability is always an advantage.

I fear that building a DTM that is fully reliable and also
well-performing is going to be really hard, and I think it would be
far better to have one such DTM that is 100% reliable than two or more
implementations each of which are 99% reliable.

The question is not about its reliability, but mostly about its functionality
and flexibility.

2. Even if such an implementation exists, the right way to integrate it is
still for Postgres to use some kind of TM API.

Sure, APIs are generally good, but that doesn't mean *this* API is good.

Well, I do not want to say "better than nothing", but I find this API to be a
reasonable compromise between flexibility and minimization of changes in the
PostgreSQL core. If you have some suggestions on how to improve it, I will be
glad to receive them.

I hope that everybody will agree that doing it in this way:

#ifdef PGXC
         /* In Postgres-XC, stop timestamp has to follow the timeline of GTM */
         xlrec.xact_time = xactStopTimestamp + GTMdeltaTimestamp;
#else
         xlrec.xact_time = xactStopTimestamp;
#endif

PGXC chose that style in order to simplify merging.  I wouldn't have
picked the same thing, but I don't know why it deserves scorn.

or in this way:

         xlrec.xact_time = xactUseGTM ? xactStopTimestamp + GTMdeltaTimestamp
                                      : xactStopTimestamp;

is a very, very bad idea.

I don't know why that is such a bad idea.  It's a heck of a lot faster
than insisting on calling some out-of-line function.  It might be a
bad idea, but I think we need to decide that, not assume it.

It violates modularity, complicates the code, and makes it more error-prone.
I still prefer to extract all DTM code into a separate module.
It does not necessarily have to be an extension.
But on the other hand, it is not required to put it in core.

At least at this stage. As I already wrote, this is not just because the code
is not good enough or not reliable enough, but because I am not sure that it
fits all (or even most) use cases.
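
For illustration, the kind of separation argued for here can be sketched as a
table of function pointers. This is a hypothetical sketch and NOT the actual
XTM API: TransactionManagerSketch, adjust_stop_timestamp, gtm_delta and
commit_record_time are invented names, and TimestampTz is reduced to a plain
integer. The core code asks the current transaction manager for the
timestamp instead of branching on #ifdef PGXC or on a flag:

```c
#include <stdint.h>

typedef int64_t TimestampTz;    /* simplified stand-in for the real type */

/* Hypothetical transaction-manager interface (not the real XTM struct). */
typedef struct TransactionManagerSketch {
    TimestampTz (*adjust_stop_timestamp)(TimestampTz local_stop);
} TransactionManagerSketch;

/* Default manager: local time is used unchanged. */
static TimestampTz local_adjust(TimestampTz t)
{
    return t;
}

/* GTM-style manager: the stop timestamp follows the GTM timeline. */
static TimestampTz gtm_delta = 0;

static TimestampTz gtm_adjust(TimestampTz t)
{
    return t + gtm_delta;
}

static const TransactionManagerSketch local_tm = { local_adjust };
static const TransactionManagerSketch gtm_tm   = { gtm_adjust };

/* A separate module could repoint this when it is loaded. */
static const TransactionManagerSketch *current_tm = &local_tm;

/* Core code path: no #ifdef and no flag test, just one indirect call. */
TimestampTz commit_record_time(TimestampTz xactStopTimestamp)
{
    return current_tm->adjust_stop_timestamp(xactStopTimestamp);
}
```

The cost is the one raised earlier in the thread: every call goes through a
pointer and cannot be inlined, whereas the #ifdef and ternary versions compile
to straight-line code.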

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
