Re: [HACKERS] Proposal: Snapshot cloning

Hannu Krosing Thu, 25 Jan 2007 23:19:50 -0800

On Thu, 2007-01-25 at 22:19, Jan Wieck wrote:
> Granted this one has a few open ends so far and I'd like to receive some 
> constructive input on how to actually implement it.
> 
> The idea is to clone an existing serializable transaction's snapshot 
> visibility information from one backend to another. The semantics would 
> be like this:
> 
>      backend1: start transaction;
>      backend1: set transaction isolation level serializable;
>      backend1: select pg_backend_pid();
>      backend1: select publish_snapshot(); -- will block
> 
>      backend2: start transaction;
>      backend2: set transaction isolation level serializable;
>      backend2: select clone_snapshot(<pid>); -- will unblock backend1
> 
>      backend1: select publish_snapshot();
> 
>      backend3: start transaction;
>      backend3: set transaction isolation level serializable;
>      backend3: select clone_snapshot(<pid>);
> 
>      ...
> 
> This will allow a number of separate backends to assume the same MVCC 
> visibility, so that they can query independently but the overall result 
> will be according to one consistent snapshot of the database.


I see uses for this in implementing query parallelism in user level
code, like querying two child tables in two separate processes. 
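
As a rough sketch of that idea, here is a toy model in Python. It is not
PostgreSQL code: the Snapshot class, the sees() rule and the child_a/child_b
tables are all made up for illustration, and the visibility test ignores
aborted transactions and deletes. It only shows two workers scanning two
child tables under one shared (xmin, xmax, xip) snapshot, so that the
combined result is consistent.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Toy stand-in for MVCC snapshot data (hypothetical, simplified)."""
    xmin: int        # every xid below this had finished at snapshot time
    xmax: int        # first xid not yet assigned at snapshot time
    xip: frozenset   # xids still in progress at snapshot time

    def sees(self, xid: int) -> bool:
        # Simplified rule: ignores aborted transactions and row deletion.
        if xid >= self.xmax:
            return False
        if xid < self.xmin:
            return True
        return xid not in self.xip


# Hypothetical child tables: lists of (creating_xid, value) pairs.
child_a = [(90, "a1"), (101, "a2"), (105, "a3")]
child_b = [(95, "b1"), (103, "b2"), (110, "b3")]


def scan(table, snap):
    """Return the values of all rows visible under the given snapshot."""
    return [value for xid, value in table if snap.sees(xid)]


def parallel_scan(snap):
    """Scan both child tables concurrently under the same snapshot."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        rows_a = pool.submit(scan, child_a, snap)
        rows_b = pool.submit(scan, child_b, snap)
        return rows_a.result() + rows_b.result()


# The snapshot one worker would publish and the other would clone.
published = Snapshot(xmin=100, xmax=106, xip=frozenset({101, 103}))
rows = parallel_scan(published)
```

Threads stand in for the separate client processes here; the point is only
that both scans evaluate visibility against identical snapshot data.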

> What I am trying to accomplish with this is to widen a bottleneck that many 
> current Slony users are facing. The initial copy of a database is 
> currently limited to one single reader to copy a snapshot of the data 
> provider. With the above functionality, several tables could be copied 
> in parallel by different client threads, feeding separate backends on 
> the receiving side at the same time.

I'm afraid that for most configurations this would make the copy slower,
as there will be more random disk I/O.

It may be better to fix Slony so that it allows initial copies in different
parallel transactions, or just to do the initial copy in several sets and
merge the sets later.

> The feature could also be used by a parallel version of pg_dump as well 
> as data mining tools.
> 
> The cloning process needs to make sure that the clone_snapshot() call is 
> made from the same DB user in the same database as the corresponding 
> publish_snapshot() call was made. 

Why? A snapshot is universal and the same for the whole DB instance, so why
limit it to the same user/database?

> Since publish_snapshot() only 
> publishes information that it gained legally and that is visible in the 
> PGPROC shared memory (xmin, xmax being the crucial part here), there is 
> no risk of creating a snapshot for which data might have been removed by 
> vacuum already.
> 
> What I am not sure about yet is what IPC method would best suit the 
> transfer of the arbitrarily sized xip vector. Ideas?
> 
> 
> Jan
> 
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com

