On Sun, Jun 14, 2015 at 9:52 AM, Jan Wieck wrote:
> The whole thing turns out to be based on wrong baseline data, taken with a
> pgbench client running from a remote machine. It all started out from an
> investigation against 9.3. Curiously enough, the s_lock() problem that
> existed in 9.3 has a very similar effect on throughput as a network
> bottleneck.
The whole thing turns out to be based on wrong baseline data, taken with
a pgbench client running from a remote machine. It all started out from
an investigation against 9.3. Curiously enough, the s_lock() problem
that existed in 9.3 has a very similar effect on throughput as a network
bottleneck.
On Wed, Jun 10, 2015 at 1:58 PM, Andres Freund wrote:
>> Now that we (EnterpriseDB) have this 8-socket machine, maybe we could
>> try your patch there, bound to varying numbers of sockets.
>
> It'd be a significant amount of work to rebase it on top of current HEAD. I
> guess the easiest thing would be …
On 2015-06-10 13:52:14 -0400, Robert Haas wrote:
> On Wed, Jun 10, 2015 at 1:39 PM, Andres Freund wrote:
> > Well, not necessarily. If you can write your algorithm in a way that
> > xadd etc are used, instead of a lock cmpxchg, you're actually never
> > spinning on x86 as it's guaranteed to succeed.
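A minimal C11 sketch of the distinction Andres is drawing (illustrative, not
code from the thread): atomic_fetch_add compiles to a single lock xadd on x86
and always succeeds in one shot, while a compare-and-swap loop can lose the
race and must retry, i.e. spin:

    #include <stdatomic.h>

    /* Wait-free on x86: a single "lock xadd"; every caller makes
     * progress, there is no retry loop. */
    static inline unsigned
    counter_inc_xadd(atomic_uint *v)
    {
        return atomic_fetch_add(v, 1);
    }

    /* Lock-free but not wait-free: the "lock cmpxchg" can fail under
     * contention and must retry, i.e. it spins. */
    static inline unsigned
    counter_inc_cas(atomic_uint *v)
    {
        unsigned old = atomic_load(v);

        while (!atomic_compare_exchange_weak(v, &old, old + 1))
            ;   /* old is refreshed on failure; loop until the CAS wins */
        return old;
    }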
On Wed, Jun 10, 2015 at 1:39 PM, Andres Freund wrote:
> In the uncontended case lwlocks are just as fast as spinlocks now, with
> the exception of the local tracking array. They're faster if there's
> differences with read/write lockers.
If nothing else, the spinlock calls are inline, while the lwlock calls are
not …
On 2015-06-10 13:19:14 -0400, Robert Haas wrote:
> On Wed, Jun 10, 2015 at 11:58 AM, Andres Freund wrote:
> > I think we should just gank spinlocks asap. The hard part is removing
> > them from lwlock.c's slow path and the buffer headers imo. After that we
> > should imo be fine replacing them with lwlocks.
On Wed, Jun 10, 2015 at 11:58 AM, Andres Freund wrote:
> I think we should just gank spinlocks asap. The hard part is removing
> them from lwlock.c's slow path and the buffer headers imo. After that we
> should imo be fine replacing them with lwlocks.
Mmmph. I'm not convinced there's any point in …
On 2015-06-10 11:51:06 -0400, Jan Wieck wrote:
> > ret = pg_atomic_fetch_sub_u32(&buf->state, 1);
> >
> > if (ret & BM_PIN_COUNT_WAITER)
> > {
> >     pg_atomic_fetch_sub_u32(&buf->state, BM_PIN_COUNT_WAITER);
> >     /* XXX: deal with race that another backend has set BM_PIN_COUNT_WAITER */
> > }
>
>>> As in 200%+ slower.
>> Have you tried PTHREAD_MUTEX_ADAPTIVE_NP ?
> Yes.
Ok, if this can be validated, we might have a new case now for which my
suggestion would not be helpful. Reviewed, optimized code with short critical
sections and no hotspots by design could indeed be an exception where …
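One way to close the race flagged in the XXX comment above is to fold the pin
release and the waiter-bit test into a single compare-and-swap loop over the
state word, so no other backend can change BM_PIN_COUNT_WAITER between the
read and the update. A hypothetical sketch against the pg_atomic API used in
the quoted fragment (function and variable names invented here):

    /* Drop one pin and, in the same atomic step, claim the
     * BM_PIN_COUNT_WAITER bit if it is set. Returns true if we
     * claimed it and therefore must wake the waiter. */
    static bool
    unpin_and_claim_waiter(BufferDesc *buf)
    {
        uint32 old = pg_atomic_read_u32(&buf->state);

        for (;;)
        {
            uint32 next = old - 1;              /* release our pin count */
            bool   waiter = (old & BM_PIN_COUNT_WAITER) != 0;

            if (waiter)
                next &= ~BM_PIN_COUNT_WAITER;   /* claim the waiter bit */

            if (pg_atomic_compare_exchange_u32(&buf->state, &old, next))
                return waiter;
            /* CAS failed: old was refreshed to the current value; retry */
        }
    }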
On 06/10/2015 11:34 AM, Andres Freund wrote:
If you check the section where the spinlock is held, there's nontrivial
code executed. Under contention you'll have the problem that if a backend
tries to acquire the spinlock while another backend holds the lock,
it'll "steal" the cacheline on which the lock resides …
On 2015-06-10 17:30:33 +0200, Nils Goroll wrote:
> On 10/06/15 17:17, Andres Freund wrote:
> > On 2015-06-10 16:07:50 +0200, Nils Goroll wrote:
> > Interesting. I've been able to reproduce quite massive slowdowns doing
> > this on a 4 socket linux machine (after applying the lwlock patch that's …
On 2015-06-10 11:12:46 -0400, Jan Wieck wrote:
> The test case is that 200 threads are running in a tight loop like this:
>
> for (...)
> {
>     s_lock();
>     // do something with a global variable
>     s_unlock();
> }
>
> That is the most contended case I can think of, yet the short and
> predictable …
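For reference, a self-contained harness in the shape of that loop could look
like the following sketch (not Jan's actual test program; a pthread mutex
stands in here for whichever s_lock()/s_unlock() implementation is under
test):

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 200
    #define ITERS    100000

    static long counter;            /* the shared "global variable" */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *
    worker(void *arg)
    {
        for (int i = 0; i < ITERS; i++)
        {
            pthread_mutex_lock(&lock);      /* stand-in for s_lock() */
            counter++;
            pthread_mutex_unlock(&lock);    /* stand-in for s_unlock() */
        }
        return NULL;
    }

    int
    main(void)
    {
        pthread_t threads[NTHREADS];

        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&threads[i], NULL, worker, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(threads[i], NULL);

        printf("counter = %ld (expected %ld)\n",
               counter, (long) NTHREADS * ITERS);
        return 0;
    }

Built with -pthread and timed as a whole, swapping the mutex calls for each
lock variant gives the kind of comparison Jan describes.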
On 10/06/15 17:17, Andres Freund wrote:
> On 2015-06-10 16:07:50 +0200, Nils Goroll wrote:
>> On larger Linux machines, we have been running with spin locks replaced by
>> generic posix mutexes for years now. I personally haven't looked at the code
>> for ages, but we maintain a patch which pretty much does the same thing still:
On 2015-06-10 16:07:50 +0200, Nils Goroll wrote:
> On larger Linux machines, we have been running with spin locks replaced by
> generic posix mutexes for years now. I personally haven't looked at the code for
> ages, but we maintain a patch which pretty much does the same thing still:
Interesting. I've been able to reproduce quite massive slowdowns doing this
on a 4 socket linux machine (after applying the lwlock patch that's …
On 10/06/15 17:12, Jan Wieck wrote:
> for (...)
> {
>     s_lock();
>     // do something with a global variable
>     s_unlock();
> }
OK, I understand now, thank you. I am not sure if this test case is appropriate
for the critical sections in postgres (if it was, we'd not have the problem we
are …
On 06/10/2015 11:06 AM, Nils Goroll wrote:
On 10/06/15 16:18, Jan Wieck wrote:
I have played with test code that isolates a stripped down version of s_lock()
and uses it with multiple threads. I then implemented multiple different
versions of that s_lock(). The results with 200 concurrent threads are that
using a __sync_val_compare_and_swap …
On 10/06/15 17:01, Andres Freund wrote:
>> > - The fact that well behaved mutexes have a higher initial cost could even
>> > motivate good use of them rather than optimize misuse.
> Well. There's many locks in a RDBMS that can't realistically be
> avoided. So optimizing for no and moderate contention …
On 10/06/15 16:18, Jan Wieck wrote:
>
> I have played with test code that isolates a stripped down version of s_lock()
> and uses it with multiple threads. I then implemented multiple different
> versions of that s_lock(). The results with 200 concurrent threads are that
> using a __sync_val_compare_and_swap …
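A guess at the shape of such a variant (a sketch only, not Jan's code): a
test-and-test-and-set lock built on the GCC __sync builtins, spinning on a
plain read so contending CPUs re-attempt the expensive atomic only when the
lock looks free:

    typedef volatile int slock_t;

    static inline void
    my_s_lock(slock_t *lock)
    {
        /* __sync_val_compare_and_swap returns the value seen before
         * the attempt; 0 means we took the lock. */
        while (__sync_val_compare_and_swap(lock, 0, 1) != 0)
        {
            /* Wait on a plain read; "pause" (x86) eases pipeline and
             * hyperthread pressure while spinning. */
            while (*lock != 0)
                __asm__ __volatile__("pause");
        }
    }

    static inline void
    my_s_unlock(slock_t *lock)
    {
        __sync_lock_release(lock);      /* release-ordered store of 0 */
    }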
On 06/10/2015 10:59 AM, Robert Haas wrote:
On Wed, Jun 10, 2015 at 10:20 AM, Tom Lane wrote:
Jan Wieck writes:
The attached patch demonstrates that less aggressive spinning and (much)
more often delaying improves the performance "on this type of machine".
Hm. One thing worth asking is why the code didn't converge to a good value
of spins_per_delay without help. …
On 2015-06-10 16:55:31 +0200, Nils Goroll wrote:
> But still I am convinced that on today's massively parallel NUMAs, spinlocks
> are plain wrong:
Sure. But a large number of installations are not using massive NUMA
systems, so we can't focus on optimizing for NUMA.
We definitely have quite some …
On Wed, Jun 10, 2015 at 10:20 AM, Tom Lane wrote:
> Jan Wieck writes:
>> The attached patch demonstrates that less aggressive spinning and (much)
>> more often delaying improves the performance "on this type of machine".
>
> Hm. One thing worth asking is why the code didn't converge to a good
> value of spins_per_delay without help. …
On 2015-06-10 10:25:32 -0400, Tom Lane wrote:
> Andres Freund writes:
> > Unfortunately there's no portable futex support. That's what stopped us
> > from adopting them so far. And even futexes can be significantly more
> > heavyweight under moderate contention than our spinlocks - It's rather
> > easy to reproduce scenarios where futexes cause significant …
On 10/06/15 16:20, Andres Freund wrote:
> That's precisely what I referred to in the bit you cut away...
I apologize, yes.
On 10/06/15 16:25, Tom Lane wrote:
> Optimizing for misuse of the mechanism is not the way.
I absolutely agree and I really appreciate all efforts towards lockless data
structures …
On 06/10/2015 10:20 AM, Tom Lane wrote:
Jan Wieck writes:
The attached patch demonstrates that less aggressive spinning and (much)
more often delaying improves the performance "on this type of machine".
Hm. One thing worth asking is why the code didn't converge to a good
value of spins_per_delay without help. …
Andres Freund writes:
> Unfortunately there's no portable futex support. That's what stopped us
> from adopting them so far. And even futexes can be significantly more
> heavyweight under moderate contention than our spinlocks - It's rather
> easy to reproduce scenarios where futexes cause significant …
On 2015-06-10 16:12:05 +0200, Nils Goroll wrote:
>
> On 10/06/15 16:05, Andres Freund wrote:
> > it'll nearly always be beneficial to spin
>
> Trouble is that postgres cannot know if the process holding the lock actually
> does run, so if it doesn't, all we're doing is burn cycles and make the
> problem worse.
Jan Wieck writes:
> The attached patch demonstrates that less aggressive spinning and (much)
> more often delaying improves the performance "on this type of machine".
Hm. One thing worth asking is why the code didn't converge to a good
value of spins_per_delay without help. The value should d…
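The feedback rule in question, roughly paraphrased from s_lock.c of that era
(constants from memory, treat as approximate): acquiring a lock without ever
sleeping bumps the spin budget up by a large step, while every acquisition
that had to sleep bleeds it off by one:

    #define MIN_SPINS_PER_DELAY   10
    #define MAX_SPINS_PER_DELAY   1000

    static int spins_per_delay = 100;   /* the value Tom is asking about */

    /* Called once a spinlock has finally been acquired; "delays" is
     * how many times we slept while waiting for it. */
    static void
    finish_spin_delay(int delays)
    {
        if (delays == 0)
        {
            /* Got it without sleeping: spinning pays off, raise fast. */
            spins_per_delay += 100;
            if (spins_per_delay > MAX_SPINS_PER_DELAY)
                spins_per_delay = MAX_SPINS_PER_DELAY;
        }
        else
        {
            /* Had to sleep: lower the budget, but only by one. */
            spins_per_delay -= 1;
            if (spins_per_delay < MIN_SPINS_PER_DELAY)
                spins_per_delay = MIN_SPINS_PER_DELAY;
        }
    }

With that asymmetry, one contended acquisition that sleeps cancels only
1/100th of one uncontended bump, which may be part of why the value stays
pinned near the maximum on a box like Jan's instead of converging down.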
On 06/10/2015 10:07 AM, Nils Goroll wrote:
On larger Linux machines, we have been running with spin locks replaced by
generic posix mutexes for years now. I personally haven't looked at the code for
ages, but we maintain a patch which pretty much does the same thing still:
Ref: http://www.postgresql.org/message-id/4fede0bf.7080...@schokola.de
On 10/06/15 16:05, Andres Freund wrote:
> it'll nearly always be beneficial to spin
Trouble is that postgres cannot know if the process holding the lock actually
does run, so if it doesn't, all we're doing is burn cycles and make the problem
worse.
Contrary to that, the kernel does know, so for …
On larger Linux machines, we have been running with spin locks replaced by
generic posix mutexes for years now. I personally haven't looked at the code for
ages, but we maintain a patch which pretty much does the same thing still:
Ref: http://www.postgresql.org/message-id/4fede0bf.7080...@schokola.de
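The kind of replacement described above could look like this sketch
(illustrative, not the actual patch): backing slock_t with a process-shared
posix mutex, where PTHREAD_MUTEX_ADAPTIVE_NP (glibc-specific) spins briefly
in userspace before sleeping in the kernel:

    #define _GNU_SOURCE             /* for PTHREAD_MUTEX_ADAPTIVE_NP */
    #include <pthread.h>

    typedef pthread_mutex_t slock_t;

    static void
    s_init_lock(slock_t *lock)
    {
        pthread_mutexattr_t attr;

        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
        /* PostgreSQL spinlocks live in shared memory and are taken by
         * multiple processes, so the mutex must be process-shared. */
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(lock, &attr);
        pthread_mutexattr_destroy(&attr);
    }

    /* s_lock()/s_unlock() then reduce to pthread_mutex_lock()/unlock(). */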
Hi,
On 2015-06-10 09:54:00 -0400, Jan Wieck wrote:
> model name : Intel(R) Xeon(R) CPU E7- 8830 @ 2.13GHz
> numactl --hardware shows the distance to the attached memory as 10, the
> distance to every other node as 21. I interpret that as the machine having
> one NUMA bus with all cpu packages …
On 06/10/2015 09:28 AM, Andres Freund wrote:
On 2015-06-10 09:18:56 -0400, Jan Wieck wrote:
On a machine with 8 sockets, 64 cores, Hyperthreaded 128 threads total, a
pgbench -S peaks with 50-60 clients around 85,000 TPS. The throughput then
takes a very sharp dive and reaches around 20,000 TPS at 120 clients. It
never recovers from there …
On Wed, Jun 10, 2015 at 09:18:56AM -0400, Jan Wieck wrote:
> The attached patch demonstrates that less aggressive spinning and
> (much) more often delaying improves the performance "on this type of
> machine". The 8 socket machine in question scales to over 350,000
> TPS.
>
> The patch is meant to …
On 2015-06-10 09:18:56 -0400, Jan Wieck wrote:
> On a machine with 8 sockets, 64 cores, Hyperthreaded 128 threads total, a
> pgbench -S peaks with 50-60 clients around 85,000 TPS. The throughput then
> takes a very sharp dive and reaches around 20,000 TPS at 120 clients. It
> never recovers from there …
Hi,
I think I may have found one of the problems PostgreSQL has on machines
with many NUMA nodes. I am not yet sure what exactly happens on the NUMA
bus, but there seems to be a tipping point at which the spinlock
concurrency wreaks havoc and the performance of the database collapses.
On a machine with 8 sockets, 64 cores, Hyperthreaded 128 threads total, a
pgbench -S peaks with 50-60 clients around 85,000 TPS. The throughput then
takes a very sharp dive and reaches around 20,000 TPS at 120 clients. It
never recovers from there …