Re: [HACKERS] eXtensible Transaction Manager API

Konstantin Knizhnik Mon, 09 Nov 2015 00:39:54 -0800


On 09.11.2015 07:46, Amit Kapila wrote:

On Sat, Nov 7, 2015 at 10:23 PM, Konstantin Knizhnik<k.knizh...@postgrespro.ru <mailto:k.knizh...@postgrespro.ru>> wrote:



    Lock manager is one of the tasks we are currently working on.
    There are still a lot of open questions:
    1. Should distributed lock manager (DLM) do something else except
    detection of distributed deadlock?


I think so.  Basically DLM should be responsible for maintaining
all the lock information which inturn means that any backend process

that needs to acquire/release lock needs to interact with DLM, withoutthat

I don't think even global deadlock detection can work (to detect deadlocks
among all the nodes, it needs to know the lock info of all nodes).

I hope that it will not needed, otherwise it will add significantperformance penalty.Unless I missed something, locks can still managed locally, but we needDLM to detect global deadlocks.Deadlock detection in PostgreSQL is performed only after timeoutexpiration for lock waiting. Only then lock graph is analyzed forpresence of loops. At this moment we can send our local lock graph toDLM. If it is really distributed deadlock, then sooner or laterall nodes involved in this deadlock will send their local graphs to DLMand DLM will be able to build global graph and detect distributeddeadlock. In this scenario DLM will be accessed very rarely - only whenlock can not be granted within deadlock_timeout.

One of the possible problems is false global deadlock detection, becauselocal lock graphs corresponds to different moments of time, so we canfound "false" loop. I am still thinking about best solution of this problem.

This is somewhat inline with what currently we do during lock conflicts,
i.e if the backend incur a lock conflict and decides to sleep, before

sleeping it tries to detect an early deadlock, so if we make DLMresponsible

for managing locks the same could be even achieved in distributed system.

    2. Should DLM be part of XTM API or it should be separate API?


We might be able to do it either ways (make it part of XTM API or devise a

separate API). I think here more important point is to first get thehigh level

design for Distributed Transactions (which primarily includes consistent
Commit/Rollback, snapshots, distributed lock manager (DLM) and recovery).

    3. Should DLM be implemented by separate process or should it be
    part of arbiter (dtmd).


That's important decision. I think it will depend on which kind of design
we choose for distributed transaction manager (arbiter based solution or
non-arbiter based solution, something like tsDTM).  I think DLM should be
separate, else arbiter will become hot-spot with respect to contention.


There are pros and contras.
Pros for integrating DLM in DTM:

1. DTM arbiter has information about local to global transaction IDmapping which may be needed to DLM2. If my assumptions about DLM are correct, then it will be accessedrelatively rarely and should not cause significant

impact on performance.

Contras:
1. tsDTM doesn't need centralized arbiter but still needs DLM
2. Logically DLM and DTM are independent components


Can you please explain more about tsDTM approach, how timestamps
are used, what is exactly CSN (is it Commit Sequence Number) and
how it is used in prepare phase?  IS CSN used as timestamp?
Is the coordinator also one of the PostgreSQL instance?

In tsDTM approach system time (in microseconds) is used as CSN (commitsequence number).We also enforce that assigned CSN is unique and monotonic withinPostgreSQL instance.CSN are assigned locally and do not require interaction with some othercluster nodes.

This is why in theory tsDTM approach should provide good scalability.


    I think in this patch, it is important to see the completeness of
    all the
    API's that needs to be exposed for the implementation of distributed
    transactions and the same is difficult to visualize without
    having complete
    picture of all the components that has some interaction with the
    distributed
    transaction system.  On the other hand we can do it in
    incremental fashion
    as and when more parts of the design are clear.


    That is exactly what we are going to do - we are trying to
    integrate DTM with existed systems (pg_shard, postgres_fdw, BDR)
    and find out what is missed and should be added. In parallel we
    are trying to compare efficiency and scalability of different
    solutions.


One thing, I have noticed that in DTM approach, you seems to
be considering a centralized (Arbiter) transaction manager and
centralized Lock manager which seems to be workable approach,
but I think such an approach won't scale in write-heavy or mixed
read-write workload.  Have you thought about distributing responsibility
of global transaction and lock management?


From my point of view there are two different scenarios:

1. When most transactions are local and only few of them are global (forexample most operation in branch of the back areperformed with accounts of clients of this branch, but there are fewtransactions involved accounts from different branches).

2. When most or all transactions are global.

It seems to me that first approach is more popular in real life andactually good performance of distributed system can be achieved only incase when most transaction are local (involves only one node). There areseveral approaches allowing to optimize local transactions. For exampleonce used in SAP HANA(http://pi3.informatik.uni-mannheim.de/~norman/dsi_jour_2014.pdf)We have also DTM implementation based on this approach but it is not yetworking.

If most of transaction are global, them affect random subsets of clusternodes (so it is not possible to logically split cluster into groups oftightly coupled nodes) and number of nodes is not very large (<10) thenI do not think that there can be better alternative (from performancepoint of view) than centralized arbiter.

But it is only my speculations and it will be really very interestingfor me to know access patterns of real customers, using distributed systems.

I think such a system
might be somewhat difficult to design, but the scalability will be better.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com/>

Re: [HACKERS] eXtensible Transaction Manager API

Reply via email to