Hi Thomas,

Barriers are a really simple and convenient mechanism for process 
synchronization.
But a barrier is actually a special case of semaphores: given a semaphore 
primitive, it is trivial to implement a barrier.
We have semaphores in Postgres, but they cannot be used by extensions: a 
fixed number of semaphores is allocated based on the maximum number of 
connections, and there is no mechanism for requesting additional semaphores. 
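
To illustrate the "barrier on top of semaphores" point, here is a minimal 
sketch using POSIX unnamed semaphores and threads. All names (SemBarrier, 
sem_barrier_init, sem_barrier_wait, sem_barrier_demo) are hypothetical and are 
not the API from the attached patch; real Postgres code would need 
process-shared semaphores placed in shared memory, plus error handling. It is 
the classic two-gate scheme, so the barrier can be reused for several phases, 
and the last arriver gets a distinct return value.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical type; not taken from barrier-v1.patch. */
typedef struct SemBarrier
{
    int   nworkers;   /* participants expected in every phase */
    int   arrived;    /* participants currently inside the barrier */
    sem_t mutex;      /* binary semaphore protecting 'arrived' */
    sem_t gate_in;    /* opened once everybody has arrived */
    sem_t gate_out;   /* opened once everybody has passed gate_in */
} SemBarrier;

static void
sem_barrier_init(SemBarrier *b, int nworkers)
{
    assert(nworkers > 0);
    b->nworkers = nworkers;
    b->arrived = 0;
    /* pshared = 0: threads of one process (an assumption of this sketch) */
    sem_init(&b->mutex, 0, 1);
    sem_init(&b->gate_in, 0, 0);
    sem_init(&b->gate_out, 0, 0);
}

/* Returns 1 in exactly one caller per phase (the last arriver), else 0. */
static int
sem_barrier_wait(SemBarrier *b)
{
    int elected = 0;
    int i;

    /* Step 1: count arrivals; the last arriver opens gate_in for all. */
    sem_wait(&b->mutex);
    if (++b->arrived == b->nworkers)
    {
        elected = 1;
        for (i = 0; i < b->nworkers; i++)
            sem_post(&b->gate_in);
    }
    sem_post(&b->mutex);
    sem_wait(&b->gate_in);

    /* Step 2: count departures so the barrier is safe to reuse. */
    sem_wait(&b->mutex);
    if (--b->arrived == 0)
    {
        for (i = 0; i < b->nworkers; i++)
            sem_post(&b->gate_out);
    }
    sem_post(&b->mutex);
    sem_wait(&b->gate_out);

    return elected;
}

typedef struct DemoArgs
{
    SemBarrier *barrier;
    int         rounds;
    int         elected;  /* times this worker was the elected one */
} DemoArgs;

static void *
demo_worker(void *p)
{
    DemoArgs *a = p;
    int       r;

    for (r = 0; r < a->rounds; r++)
        a->elected += sem_barrier_wait(a->barrier);
    return NULL;
}

/* Runs nworkers threads through 'rounds' phases; returns total elections. */
static int
sem_barrier_demo(int nworkers, int rounds)
{
    SemBarrier b;
    pthread_t  threads[16];
    DemoArgs   args[16];
    int        i, total = 0;

    assert(nworkers <= 16);
    sem_barrier_init(&b, nworkers);
    for (i = 0; i < nworkers; i++)
    {
        args[i].barrier = &b;
        args[i].rounds = rounds;
        args[i].elected = 0;
        pthread_create(&threads[i], NULL, demo_worker, &args[i]);
    }
    for (i = 0; i < nworkers; i++)
    {
        pthread_join(threads[i], NULL);
        total += args[i].elected;
    }
    return total;
}
```

Since exactly one worker is elected per phase, sem_barrier_demo(4, 3) should 
return 3 regardless of scheduling.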


Robert has recently proposed condition variables, which are also very useful. 
Right now we have spinlocks, LW-locks and latches.
From my point of view that is not enough. While creating various extensions for 
Postgres I always feel the lack of such synchronization primitives as events 
(condition variables) and semaphores. Events and semaphores are similar, and it 
is possible to implement either of them on top of the other. But from the 
user's point of view they have different semantics and use cases, so it is 
better to provide both of them.
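
As one direction of that equivalence, a counting semaphore can be sketched on 
top of a mutex and a condition variable. This uses plain pthreads purely for 
illustration; the names (CvSemaphore, cv_sem_*) are hypothetical and have 
nothing to do with Robert's proposed patch.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical counting semaphore built from a mutex + condition variable. */
typedef struct CvSemaphore
{
    pthread_mutex_t lock;     /* protects 'count' */
    pthread_cond_t  nonzero;  /* signalled whenever 'count' is incremented */
    int             count;    /* number of available permits */
} CvSemaphore;

static void
cv_sem_init(CvSemaphore *s, int initial)
{
    assert(initial >= 0);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
}

/* Blocks until a permit is available, then takes it. */
static void
cv_sem_wait(CvSemaphore *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)       /* loop guards against spurious wakeups */
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Takes a permit if one is available; returns 1 on success, 0 otherwise. */
static int
cv_sem_trywait(CvSemaphore *s)
{
    int acquired = 0;

    pthread_mutex_lock(&s->lock);
    if (s->count > 0)
    {
        s->count--;
        acquired = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return acquired;
}

/* Releases one permit and wakes one waiter, if any. */
static void
cv_sem_post(CvSemaphore *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}
```

The opposite direction (events on top of semaphores) is also possible but 
takes more care, which is part of why it is convenient to have both.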

I wonder if we should provide a system abstraction layer (SAL) for Postgres, 
where we can isolate all system-dependent code and which will be available to 
core developers as well as to developers of extensions. The obvious 
candidates for SAL are:
1. Synchronization primitives (locks, events, semaphores, mutexes, spinlocks, 
latches, barriers)
2. Shared memory
3. File access
4. Network sockets
5. Process control (fork, ...)

Certainly it requires a lot of refactoring, but it will make the Postgres code 
much more elegant and easier to read and maintain.
Also, it is not necessary to make all these changes in one step: here we do not 
need atomic transactions :)
We can start, for example, with synchronization primitives, since a lot of 
changes are being proposed there anyway.

Parallel execution is one of the most promising approaches to improving 
Postgres performance. I do not mean just parallel execution of a single query, 
but also parallel vacuum, parallel index creation, parallel sort, ... And to 
implement all this stuff we definitely need convenient and efficient 
synchronization primitives.
The set of such primitives can be discussed. IMHO it should include RW-locks 
(the current LW-locks), mutexes (spinlock + some mechanism to wait), events 
(condition variables), semaphores and barriers (based on semaphores). Latches 
can be left for backward compatibility or be replaced with events.
I wonder if somebody has measured how many times slower latches 
(signal+socket) are than POSIX semaphores or condition variables?
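
To make the "spinlock + some mechanism to wait" idea for mutexes more concrete, 
here is one possible sketch: try a bounded number of compare-and-swap attempts 
first, and only then fall back to sleeping on a semaphore. It uses C11 atomics 
and POSIX semaphores; the names (SpinWaitMutex, swm_*) and the SPIN_TRIES value 
are hypothetical, and this is only an illustration of the concept, not a 
proposal for the actual implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define SPIN_TRIES 100      /* arbitrary value chosen for this sketch */

/* Hypothetical mutex: spin briefly, then block on a semaphore. */
typedef struct SpinWaitMutex
{
    atomic_int users;       /* current holder plus blocked waiters */
    sem_t      wakeup;      /* waiters sleep here once spinning fails */
} SpinWaitMutex;

static void
swm_init(SpinWaitMutex *m)
{
    atomic_init(&m->users, 0);
    sem_init(&m->wakeup, 0, 0);
}

static void
swm_lock(SpinWaitMutex *m)
{
    int i;

    /* Fast path: try to grab a free mutex without sleeping. */
    for (i = 0; i < SPIN_TRIES; i++)
    {
        int expected = 0;

        if (atomic_compare_exchange_weak(&m->users, &expected, 1))
            return;
    }
    /* Slow path: register as a user; sleep if somebody else holds it. */
    if (atomic_fetch_add(&m->users, 1) > 0)
        sem_wait(&m->wakeup);
}

static void
swm_unlock(SpinWaitMutex *m)
{
    /* If others are registered, hand the mutex over to one waiter. */
    if (atomic_fetch_sub(&m->users, 1) > 1)
        sem_post(&m->wakeup);
}

typedef struct SwmDemo
{
    SpinWaitMutex mutex;
    long          counter;  /* protected by 'mutex' */
    int           iters;
} SwmDemo;

static void *
swm_worker(void *p)
{
    SwmDemo *d = p;
    int      i;

    for (i = 0; i < d->iters; i++)
    {
        swm_lock(&d->mutex);
        d->counter++;
        swm_unlock(&d->mutex);
    }
    return NULL;
}

/* Increments a shared counter from nthreads threads; returns its final value. */
static long
swm_demo(int nthreads, int iters)
{
    SwmDemo   d;
    pthread_t threads[16];
    int       i;

    assert(nthreads <= 16);
    swm_init(&d.mutex);
    d.counter = 0;
    d.iters = iters;
    for (i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, swm_worker, &d);
    for (i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return d.counter;
}
```

If the mutex provides mutual exclusion, swm_demo(4, 10000) returns 40000. 
Comparing such a primitive against latches would still need a real benchmark.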



On Aug 14, 2016, at 2:18 AM, Thomas Munro wrote:

> Hi hackers,
> 
> I would like to propose "barriers" for Postgres processes.  A barrier
> is a very simple mechanism for coordinating parallel computation, as
> found in many threading libraries.
> 
> First, you initialise a Barrier object somewhere in shared memory,
> most likely in the DSM segment used by parallel query, by calling
> BarrierInit(&barrier, nworkers).  Then workers can call
> BarrierWait(&barrier) when they want to block until all workers arrive
> at the barrier.  When the final worker arrives, BarrierWait returns in
> all workers, releasing them to continue their work.  One arbitrary
> worker receives a different return value as a way of "electing" it to
> perform serial phases of computation.  For parallel phases of
> computation, the return value can be ignored.  For example, there may
> be preparation, merging, or post-processing phases which must be done
> by just one worker, interspersed with phases where all workers do
> something.
> 
> My use case for this is coordinating the phases of parallel hash
> joins, but I strongly suspect there are other cases.  Parallel sort
> springs to mind, which is why I wanted to post this separately and
> earlier than my larger patch series, to get feedback from people
> working on other parallel features.
> 
> A problem that I'm still grappling with is how to deal with workers
> that fail to launch.  What I'm proposing so far is based on static
> worker sets, where you can only give the number of workers at
> initialisation time, just like pthread_barrier_init.  Some other
> libraries allow for adjustable worker sets, and I'm wondering if a
> parallel leader might need to be able to adjust the barrier when it
> hears of a worker not starting.  More on that soon.
> 
> Please see the attached WIP patch.  I had an earlier version with its
> own waitlists and signalling machinery etc, but I've now rebased it to
> depend on Robert Haas's proposed condition variables, making this code
> much shorter and sweeter.  So it depends on his
> condition-variable-vX.patch[1], which in turn depends on my
> lwlocks-in-dsm-vX.patch[2] (for proclist).
> 
> When Michaël Paquier's work on naming wait points[3] lands, I plan to
> include event IDs as an extra argument to BarrierWait which will be
> passed through so as to show up in pg_stat_activity.  Then you'll be
> able to see where workers are waiting for each other!  For now I
> didn't want to tangle this up with yet another patch.
> 
> I thought about using a different name to avoid colliding with
> barrier.h and overloading the term: there are of course also compiler
> barriers and memory barriers.  But then I realised that that header
> was basically vacant real estate, and 'barrier' is the super-well
> established standard term for this parallel computing primitive.
> 
> I'd be grateful for any thoughts, feedback, flames etc.
> 
> [1] 
> https://www.postgresql.org/message-id/flat/CA%2BTgmoaj2aPti0yho7FeEf2qt-JgQPRWb0gci_o1Hfr%3DC56Xng%40mail.gmail.com
> [2] 
> https://www.postgresql.org/message-id/flat/CAEepm%3D0Vvr9zgwHt67RwuTfwMEby1GiGptBk3xFPDbbgEtZgMg%40mail.gmail.com
> [3] 
> https://www.postgresql.org/message-id/flat/cab7npqtghfouhag1ejrvskn8-e5fpqvhm7al0tafsdzjqg_...@mail.gmail.com
> 
> -- 
> Thomas Munro
> http://www.enterprisedb.com
> <barrier-v1.patch>
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


