On Fri, Feb 3, 2017 at 12:01 PM, Alexander Korotkov
<a.korot...@postgrespro.ru> wrote:
> Hi everybody!
> During the FOSDEM/PGDay 2017 developer meeting I said that I have a special
> assembly optimization for multicore Power machines.  From the answers of
> other hackers I realized the following.
>  * There are some big Power machines with PostgreSQL in production use.
>    Not as many as Intel, but some of them.
>  * The community could be interested in a special assembly optimization
>    for Power machines despite the cost of maintaining it.
> Power processors use a specific implementation of atomic operations.  This
> implementation is a kind of optimistic locking.  The 'lwarx' instruction
> takes a reservation on the address, but that reservation can be lost by
> the time 'stwcx' executes, and then we have to retry.  So, for instance, a
> CAS operation on a Power processor is a loop, and a loop of CAS operations
> is a two-level nested loop.  Benchmarks showed that this becomes a real
> problem for LWLockAttemptLock().  However, one can actually put arbitrary
> logic between 'lwarx' and 'stwcx' and make it a single loop.  The downside
> is that this logic has to be implemented in assembly.  See [1] for
> experiment details.
> Results in [1] have a lot of junk which isn't relevant anymore.  This is why
> I drew a separate graph.
> power8-lwlock-asm-ro.png – results of a read-only pgbench test on an IBM E880
> which has 32 physical cores and 256 virtual threads via SMT.  The curves
> have the following meaning.
>  * 9.5: unpatched PostgreSQL 9.5
>  * pinunpin-cas: PostgreSQL 9.5 + earlier version of 48354581
>  * pinunpin-lwlock-asm: PostgreSQL 9.5 + earlier version of 48354581 +
> LWLock implementation in assembly.

Cool work.  Obviously there's some work to do before we can merge this
-- vetting the abstraction, performance testing -- but it seems pretty

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company