Re: [PERFORM] MusicBrainz postgres performance issues
On Mon, Mar 16, 2015 at 6:59 AM, Robert Kaye <r...@musicbrainz.org> wrote:
> 4. Linux 3.2 apparently has some less than desirable swap behaviours. Once we started swapping, everything went nuts.

On older machines I used to just turn off swap altogether, especially if I wasn't running out of memory but swap was engaging anyway. swappiness = 0 didn't help; nothing did. I just kept seeing kswapd working its butt off doing nothing but hitting the swap partition. So glad to be off those old kernels.
Re: [PERFORM] MusicBrainz postgres performance issues
Robert Kaye wrote on 16.03.2015 at 13:59:
> However, I am glad to report that our problems are fixed and that our server is back to humming along nicely.
>
> And as I said to Josh earlier: "Postgres rocks our world. I'm immensely pleased that once again the problems were our own stupidity and not PG's fault. In over 10 years of us using PG, it has never been PG's fault. Not once."
>
> And thus we're one tiny bit smarter today. Thank you everyone!

I think it would be nice if you could amend your blog posting to include the solution that you found. Otherwise this will simply stick around as yet another unsolved performance problem.

Thomas
Re: [PERFORM] MusicBrainz postgres performance issues
On Mar 16, 2015, at 2:22 PM, Thomas Kellerer <spam_ea...@gmx.net> wrote:
> I think it would be nice if you could amend your blog posting to include the solution that you found. Otherwise this will simply stick around as yet another unsolved performance problem.

Good thinking: http://blog.musicbrainz.org/2015/03/16/postgres-troubles-resolved/

I've also updated the original post with the link to the above. Case closed. :)

--
--ruaok  Robert Kaye -- r...@musicbrainz.org -- http://musicbrainz.org
Re: [PERFORM] MusicBrainz postgres performance issues
Robert Kaye <r...@musicbrainz.org> wrote:
> However, I am glad to report that our problems are fixed and that our server is back to humming along nicely. What we changed:
>
> 1. As it was pointed out here, max_connections of 500 was in fact insanely high, especially in light of using PgBouncer. Before we used PgBouncer we needed a lot more connections, and when we started using PgBouncer, we never reduced this number.
>
> 2. Our server_lifetime was set far too high (1 hour). Josh Berkus suggested lowering that to 5 minutes.
>
> 3. We reduced the number of PgBouncer active connections to the DB.

Many thanks for the feedback!

Andreas
Re: [PERFORM] MusicBrainz postgres performance issues
On March 16, 2015 at 3:24:34 AM, Roxanne Reid-Bennett (r...@tara-lu.com) wrote:
> Robert, Wow - You've engaged the wizards indeed. I haven't heard or seen anything that would answer my *second* question if faced with this (my first would have been "what changed?")

Yes, indeed -- I feel honored to have so many people chime in on this issue.

The problem was that nothing abnormal was happening: just the normal queries were running that hadn't given us any problems for months. We undid everything that had been recently changed in an effort to address "what changed". Nothing helped, which is what had us so perplexed.

However, I am glad to report that our problems are fixed and that our server is back to humming along nicely. What we changed:

1. As it was pointed out here, max_connections of 500 was in fact insanely high, especially in light of using PgBouncer. Before we used PgBouncer we needed a lot more connections, and when we started using PgBouncer, we never reduced this number.

2. Our server_lifetime was set far too high (1 hour). Josh Berkus suggested lowering that to 5 minutes.

3. We reduced the number of PgBouncer active connections to the DB.

What we learned:

1. We had too many backends.

2. The backends were being kept around for too long by PgBouncer.

3. This caused too many idle backends to kick around. Once we exhausted physical RAM, we started swapping.

4. Linux 3.2 apparently has some less than desirable swap behaviours. Once we started swapping, everything went nuts.

Going forward we're going to upgrade our kernel the next time we have downtime for our site, and the rest should be sorted now.

I wanted to thank everyone who contributed their thoughts to this thread -- THANK YOU.

And as I said to Josh earlier: "Postgres rocks our world. I'm immensely pleased that once again the problems were our own stupidity and not PG's fault. In over 10 years of us using PG, it has never been PG's fault. Not once."

And thus we're one tiny bit smarter today. Thank you everyone!

P.S. If anyone would still like to get some more information about this problem for their own edification, please let me know. Given that we've fixed the issue, I don't want to spam this list by responding to all the questions that were posed.

--
--ruaok  Robert Kaye -- r...@musicbrainz.org -- http://musicbrainz.org
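[For reference, the three changes map onto configuration roughly as in the sketch below. The values are illustrative guesses, not MusicBrainz's actual settings:]

    ; pgbouncer.ini (sketch)
    [pgbouncer]
    server_lifetime = 300      ; close idle server connections after 5 minutes (was 3600)
    default_pool_size = 25     ; fewer active backend connections per database/user pair

    # postgresql.conf (sketch)
    max_connections = 100      # down from 500; PgBouncer queues the rest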
Re: [PERFORM] MusicBrainz postgres performance issues
On 03/16/2015 05:59 AM, Robert Kaye wrote:
> 4. Linux 3.2 apparently has some less than desirable swap behaviours. Once we started swapping, everything went nuts.

Relevant to this: http://www.databasesoup.com/2014/09/why-you-need-to-avoid-linux-kernel-32.html

Anybody who is on Linux kernels 3.0 to 3.8 really needs to upgrade soon.

--
Josh Berkus, PostgreSQL Experts Inc. -- http://pgexperts.com
Re: [PERFORM] MusicBrainz postgres performance issues
Robert, many thanks for the feedback! Could you post your new pgbouncer config file? How many PostgreSQL processes do you have now at the OS with this new conf? How many clients from the app server hit your pgbouncer?

Regards,

2015-03-16 11:32 GMT-03:00 Robert Kaye <r...@musicbrainz.org>:
> Good thinking: http://blog.musicbrainz.org/2015/03/16/postgres-troubles-resolved/
>
> I've also updated the original post with the link to the above. Case closed. :)
Re: [PERFORM] MusicBrainz postgres performance issues
On 3/15/15 7:17 PM, mich...@sqlexec.com wrote:
> I agree with your counter argument about how high max_connections can cause problems, but max_connections may not be part of the problem here. There's a bunch of "depends" stuff in there based on workload details, # CPUs, RAM, etc.

Please avoid top-posting.

Sure, but the big, huge danger with a very large max_connections is that you now have a large grenade with the pin pulled out. If *anything* happens to disturb the server and push the active connection count past the number of actual cores, the box is going to fall over and not recover.

In contrast, if max_connections is <= the number of cores this is far less likely to happen. Each connection will get a CPU to run on, and as long as they're not all clamoring for the same locks the server will be making forward progress. Clients may have to wait in the pool for a free connection for some time, but once they get one their work will get done.

> I'm still waiting to find out how many CPUs on this DB server. Did I miss it somewhere in the email thread below?

http://blog.musicbrainz.org/2015/03/15/postgres-troubles/ might show it somewhere...

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [PERFORM] MusicBrainz postgres performance issues
On Mar 15, 2015, at 13:45, Josh Krupka <jkru...@gmail.com> wrote:
> Hmm, that's definitely odd that it's swapping since it has plenty of free memory at the moment. Is it still under heavy load right now? Has the output of free consistently looked like that during your trouble times?

And it seems better to disable swappiness.
[PERFORM] MusicBrainz postgres performance issues
Hi!

We at MusicBrainz have been having trouble with our Postgres install for the past few days. I've collected all the relevant information here:

    http://blog.musicbrainz.org/2015/03/15/postgres-troubles/

If anyone could provide tips, suggestions or other relevant advice for what to poke at next, we would love it.

Thanks!

--
--ruaok  Robert Kaye -- r...@musicbrainz.org -- http://musicbrainz.org
Re: [PERFORM] MusicBrainz postgres performance issues
On Mar 15, 2015, at 12:13 PM, Josh Krupka <jkru...@gmail.com> wrote:
> It sounds like you've hit the postgres basics; what about some of the linux checklist items? What does free -m show on your db server?

                 total       used       free     shared    buffers     cached
    Mem:         48295      31673      16622          0          5      12670
    -/+ buffers/cache:      18997      29298
    Swap:        22852       2382      20470

> If the load problem really is being caused by swapping when things really shouldn't be swapping, it could be a matter of adjusting your swappiness - what does cat /proc/sys/vm/swappiness show on your server?

0. We adjusted that too, but no effect. (I've updated the blog post with these two comments.)

> There are other linux memory management things that can cause postgres and the server running it to throw fits, like THP and zone reclaim. I don't have enough info about your system to say they are the cause either, but check out the many postings here and other places on the detrimental effect that those settings *can* have. That would at least give you another angle to investigate.

If there are specific things you'd like to know, I'd be happy to be a human proxy. :)

Thanks!

--
--ruaok  Robert Kaye -- r...@musicbrainz.org -- http://musicbrainz.org
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 13:08:13 +0100, Robert Kaye wrote:
> On Mar 15, 2015, at 12:41 PM, Andreas Kretschmer <akretsch...@spamfence.net> wrote:
>> just a wild guess: raid-controller BBU faulty
>
> We don't have a BBU in this server, but at least we have redundant power supplies. In any case, how would a faulty battery possibly cause this?

Many controllers disable write-back caching when the battery is dead.

Greetings,

Andres Freund

--
Andres Freund -- http://www.2ndQuadrant.com/ -- PostgreSQL Development, 24x7 Support, Training & Services
Re: [PERFORM] MusicBrainz postgres performance issues
Please check this if it helps: http://ubuntuforums.org/showthread.php?t=2258734

On 2015/3/15 18:54, Robert Kaye wrote:
> Hi! We at MusicBrainz have been having trouble with our Postgres install for the past few days. I've collected all the relevant information here: http://blog.musicbrainz.org/2015/03/15/postgres-troubles/ If anyone could provide tips, suggestions or other relevant advice for what to poke at next, we would love it. Thanks!
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 8:07 AM, Robert Kaye <r...@musicbrainz.org> wrote:
>> What does free -m show on your db server?
>
>              total       used       free     shared    buffers     cached
> Mem:         48295      31673      16622          0          5      12670
> -/+ buffers/cache:      18997      29298
> Swap:        22852       2382      20470

Hmm, that's definitely odd that it's swapping since it has plenty of free memory at the moment. Is it still under heavy load right now? Has the output of free consistently looked like that during your trouble times?

>> If the load problem really is being caused by swapping when things really shouldn't be swapping, it could be a matter of adjusting your swappiness - what does cat /proc/sys/vm/swappiness show on your server?
>
> 0. We adjusted that too, but no effect. (I've updated the blog post with these two comments.)

Had that been updated a while ago, or just now?

>> There are other linux memory management things that can cause postgres and the server running it to throw fits, like THP and zone reclaim. [...]
>
> If there are specific things you'd like to know, I'd be happy to be a human proxy. :)

If zone reclaim is enabled (I think linux usually decides whether or not to enable it at boot time depending on the numa architecture), it sometimes avoids using memory on remote numa nodes if it thinks that memory access is too expensive. This can lead to way too much disk access (not sure if it would actually make linux swap or not...) and lots of ram sitting around doing nothing instead of being used for fs cache like it should be. Check to see if zone reclaim is enabled with this command: cat /proc/sys/vm/zone_reclaim_mode. If your server is a numa one, you can install the numactl utility and look at the numa layout with this: numactl --hardware

I'm not sure how THP would cause lots of swapping, but it's worth checking in general: cat /sys/kernel/mm/transparent_hugepage/enabled. If it's spending too much time trying to compact memory pages it can cause stalls in your processes. To get the THP metrics, do: egrep 'trans|thp' /proc/vmstat
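[For anyone following along, the checks named above condense into a short sequence; the paths are the standard Linux procfs/sysfs locations:]

    cat /proc/sys/vm/swappiness                        # how eagerly the kernel swaps
    cat /proc/sys/vm/zone_reclaim_mode                 # nonzero = NUMA zone reclaim on
    numactl --hardware                                 # NUMA node / memory layout
    cat /sys/kernel/mm/transparent_hugepage/enabled    # THP mode: [always]/madvise/never
    egrep 'trans|thp' /proc/vmstat                     # THP fault/compaction counters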
Re: [PERFORM] MusicBrainz postgres performance issues
On 15.3.2015 13:07, Robert Kaye wrote:
>> If the load problem really is being caused by swapping [...] what does cat /proc/sys/vm/swappiness show on your server?
>
> 0. We adjusted that too, but no effect. (I've updated the blog post with these two comments.)

IMHO setting swappiness to 0 is way too aggressive. Just set it to something like 10 - 20; that works better in my experience.

>> There are other linux memory management things that can cause postgres and the server running it to throw fits, like THP and zone reclaim. [...] That would at least give you another angle to investigate.
>
> If there are specific things you'd like to know, I'd be happy to be a human proxy. :)

I'd start with the vm.* configuration, so the output from this:

    # sysctl -a | grep '^vm.*'

and possibly /proc/meminfo. I'm especially interested in the overcommit settings, because per the free output you provided there's ~16GB of free RAM.

BTW, what amounts of data are we talking about? How large is the database and how large is the active set?

I also noticed you use kernel 3.2 - that's not the best kernel version for PostgreSQL - see [1] or [2] for example.

[1] https://medium.com/postgresql-talk/benchmarking-postgresql-with-different-linux-kernel-versions-on-ubuntu-lts-e61d57b70dd4
[2] http://www.databasesoup.com/2014/09/why-you-need-to-avoid-linux-kernel-32.html

--
Tomas Vondra -- http://www.2ndQuadrant.com/ -- PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
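[As a rough illustration of the vm.* settings being asked about, a commonly recommended baseline for a dedicated Postgres box looks something like the sketch below. These are generic numbers, not a recommendation tuned to this server:]

    # /etc/sysctl.conf (sketch)
    vm.swappiness = 10             # swap reluctantly, but don't forbid it outright
    vm.overcommit_memory = 2       # refuse allocations beyond the commit limit
    vm.overcommit_ratio = 80       # commit limit = swap + 80% of RAM
    vm.zone_reclaim_mode = 0       # don't stall to reclaim within a single NUMA zone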
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 13:07:25 +0100, Robert Kaye wrote:
> [free -m output, quoted above]

Could you post /proc/meminfo instead? That gives a fair bit more information.

Also:
* What hardware is this running on?
* Why do you need 500 connections (that are nearly all used) when you have a pgbouncer in front of the database? That's not going to be efficient.
* Do you have any data tracking the state connections are in, i.e. whether they're idle or not? The connections graph you linked doesn't give that information.
* You're apparently not graphing CPU usage. How busy are the CPUs? How much time is spent in the kernel (i.e. system)?
* Consider installing perf (linux-utils-$something) and doing a systemwide profile.

3.2 isn't the greatest kernel around, efficiency wise. At some point you might want to upgrade to something newer. I've seen remarkable differences around this.

You really should upgrade postgres to a newer major version one of these days. Especially 9.2 can give you a remarkable improvement in performance with many connections in a read mostly workload.

Greetings,

Andres Freund
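[A concrete way to take the perf suggestion: a short system-wide profile. Package names vary by distro; on Debian/Ubuntu the tool lives in linux-tools-*:]

    perf record -a -g -- sleep 30   # sample all CPUs, with call graphs, for 30 seconds
    perf report                     # interactive breakdown of where the time went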
Re: [PERFORM] MusicBrainz postgres performance issues
On Mar 15, 2015, at 12:41 PM, Andreas Kretschmer <akretsch...@spamfence.net> wrote:
> just a wild guess: raid-controller BBU faulty

We don't have a BBU in this server, but at least we have redundant power supplies. In any case, how would a faulty battery possibly cause this?

--
--ruaok  Robert Kaye -- r...@musicbrainz.org -- http://musicbrainz.org
Re: [PERFORM] MusicBrainz postgres performance issues
Hi!

What does your pg_stat_bgwriter show for one day?

On Mar 15, 2015, at 11:54, Robert Kaye <r...@musicbrainz.org> wrote:
> Hi! We at MusicBrainz have been having trouble with our Postgres install for the past few days. I've collected all the relevant information here: http://blog.musicbrainz.org/2015/03/15/postgres-troubles/ If anyone could provide tips, suggestions or other relevant advice for what to poke at next, we would love it. Thanks!
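[A minimal way to pull the numbers being asked for, using the standard pg_stat_bgwriter columns; diff two snapshots taken a day apart to get a per-day figure:]

    -- checkpoints_req much larger than checkpoints_timed usually means
    -- checkpoint_segments is too low for the write load
    SELECT checkpoints_timed, checkpoints_req,
           buffers_checkpoint, buffers_clean, buffers_backend
    FROM pg_stat_bgwriter;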
Re: [PERFORM] MusicBrainz postgres performance issues
Robert Kaye <r...@musicbrainz.org> wrote:
> Hi! We at MusicBrainz have been having trouble with our Postgres install for the past few days. I've collected all the relevant information here: http://blog.musicbrainz.org/2015/03/15/postgres-troubles/ If anyone could provide tips, suggestions or other relevant advice for what to poke at next, we would love it.

Just a wild guess: raid-controller BBU faulty?

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds)
If I was god, I would recompile penguin with --enable-fly. (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
Re: [PERFORM] MusicBrainz postgres performance issues
It sounds like you've hit the postgres basics; what about some of the linux checklist items? What does free -m show on your db server?

If the load problem really is being caused by swapping when things really shouldn't be swapping, it could be a matter of adjusting your swappiness - what does cat /proc/sys/vm/swappiness show on your server?

There are other linux memory management things that can cause postgres and the server running it to throw fits, like THP and zone reclaim. I don't have enough info about your system to say they are the cause either, but check out the many postings here and other places on the detrimental effect that those settings *can* have. That would at least give you another angle to investigate.
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 11:09 AM, Scott Marlowe <scott.marl...@gmail.com> wrote:
> Clarification: 64MB work_mem AND max_connections = 500 is a recipe for disaster. No db can actively process 500 queries at once without going kaboom, and having 64MB work_mem means it will go kaboom long before it reaches 500 active connections. Lower that and let pgbouncer handle the extra connections for you.

Lower max_connections. A work_mem of 64MB is fine as long as max_connections is something reasonable (reasonable is generally #CPU cores * 2 or so).

work_mem is per sort. A single query could easily use 2 or 4x work_mem all by itself. You can see how having hundreds of active connections, each using 64MB or more at the same time, can kill your server.
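[To make "per sort" concrete, here is a hypothetical query (table and column names invented for illustration) whose plan grabs several work_mem-sized allocations at once:]

    -- each of these plan nodes can claim up to work_mem on its own:
    EXPLAIN ANALYZE
    SELECT r.artist_id, count(*) AS n
    FROM recording r
    JOIN release rel ON rel.id = r.release_id   -- Hash Join: one allocation
    GROUP BY r.artist_id                        -- HashAggregate: a second
    ORDER BY n DESC;                            -- Sort: a third

[With work_mem = 64MB, that single query can legitimately use ~192MB; multiply by a few hundred active connections and RAM is gone.]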
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 11:09:34 -0600, Scott Marlowe wrote:
> shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so.

I think that's an outdated wisdom, i.e. not generally true. I've now seen a significant number of systems where a larger shared_buffers can help quite massively.

The primary case where it can, in my experience, go bad is write-mostly databases where every buffer acquisition has to write out dirty data while holding locks. Especially during relation extension that's bad.

A new enough kernel, a sane filesystem (i.e. not ext3) and a sane checkpoint configuration take care of most of the other disadvantages.

Greetings,

Andres Freund
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 20:42:51 +0300, Ilya Kosmodemiansky wrote:
> On Sun, Mar 15, 2015 at 8:20 PM, Andres Freund <and...@2ndquadrant.com> wrote:
>> On 2015-03-15 11:09:34 -0600, Scott Marlowe wrote:
>>> shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so.
>>
>> I think that's an outdated wisdom, i.e. not generally true.
>
> Quite agreed. With the note that a properly configured controller with a BBU is needed.

That imo doesn't really have anything to do with it. The primary benefit of a BBU with writeback caching is accelerating (near-)synchronous writes, like the WAL. But, besides influencing the default for wal_buffers, a larger shared_buffers doesn't change the amount of synchronous writes.

Greetings,

Andres Freund
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 8:46 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> That imo doesn't really have anything to do with it. The primary benefit of a BBU with writeback caching is accelerating (near-)synchronous writes. Like the WAL.

My point was that with no proper raid controller (today a BBU is surely needed for the controller to be a proper one) plus heavy writes of any kind, it is absolutely impossible to live with large shared_buffers and without IO problems.

--
Ilya Kosmodemiansky, PostgreSQL-Consulting.com
i...@postgresql-consulting.com
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 7:50 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> [Andres's checklist quoted above: /proc/meminfo, hardware, why 500 connections, connection states]
>
> * You're apparently not graphing CPU usage. How busy are the CPUs? How much time is spent in the kernel (i.e. system)?

htop is a great tool for watching the CPU cores live. Red == kernel, btw.

> * Consider installing perf (linux-utils-$something) and doing a systemwide profile. 3.2 isn't the greatest kernel around, efficiency wise. At some point you might want to upgrade to something newer. I've seen remarkable differences around this.

That is an understatement. Here's a nice article on why it's borked: http://www.databasesoup.com/2014/09/why-you-need-to-avoid-linux-kernel-32.html

Had a 32 core machine with big RAID BBU and 512GB memory that was dying using the 3.2 kernel. Went to 3.11, and it went from a load of 20 to 40 down to a load of 5.

> You really should upgrade postgres to a newer major version one of these days. Especially 9.2 can give you a remarkable improvement in performance with many connections in a read mostly workload.

Agreed. Ubuntu 12.04 with kernel 3.11/3.13 and pg 9.2 has been a great improvement over the debian squeeze and pg 8.4 that we were running at work until recently.

As for the OP: if you've got swap activity causing issues when there's plenty of free space, just TURN IT OFF:

    swapoff -a

I do this on all my big memory servers that don't really need swap, especially when I was using the 3.2 kernel, which seems broken as regards swap on bigger memory machines.
Re: [PERFORM] MusicBrainz postgres performance issues
On 03/15/2015 09:43 AM, Scott Marlowe wrote:
>> * Consider installing perf (linux-utils-$something) and doing a systemwide profile. 3.2 isn't the greatest kernel around, efficiency wise. At some point you might want to upgrade to something newer. I've seen remarkable differences around this.

Not "at some point" -- now. 3.2 - 3.8 are undeniably broken for PostgreSQL.

> That is an understatement. Here's a nice article on why it's borked: http://www.databasesoup.com/2014/09/why-you-need-to-avoid-linux-kernel-32.html
>
> Had a 32 core machine with big RAID BBU and 512GB memory that was dying using the 3.2 kernel. Went to 3.11, and it went from a load of 20 to 40 down to a load of 5.

Yep, I can confirm this behavior.

>> You really should upgrade postgres to a newer major version one of these days. Especially 9.2 can give you a remarkable improvement in performance with many connections in a read mostly workload.

Seconded.

JD
--
Command Prompt, Inc. - http://www.commandprompt.com/ - 503-667-4564
PostgreSQL Support, Training, Professional Services and Development
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 10:43 AM, Scott Marlowe <scott.marl...@gmail.com> wrote:
> [Scott's previous reply, quoted in full above]

OK, I've now read your blog post. A few pointers I'd make.

shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so.

64MB work_mem AND max_connections = 500 is a recipe for disaster. No db can actively process 500 queries at once without going kaboom, and having 64MB work_mem means it will go kaboom long before it reaches 500 active connections. Lower that and let pgbouncer handle the extra connections for you.

Get some monitoring installed if you don't already have it so you can track memory usage, cpu usage, disk usage etc. Zabbix or Nagios work well. Without some kind of system monitoring you're missing half the information you need to troubleshoot with.

Install iotop, sysstat, and htop. Configure sysstat to collect data so you can use sar to see what the machine's been doing in the past few days etc. Set it to 1 minute intervals in the /etc/cron.d/sysstat file.

Do whatever you have to to get kernel 3.11 or greater on that machine (or a new one). You don't have to upgrade pg just yet, but the upgrade of the kernel is essential.

Good luck. Let us know what you find and if you can get that machine back on its feet.
--
To understand recursion, one must first understand recursion.
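[For the sysstat tweak: the stock Debian/Ubuntu /etc/cron.d/sysstat runs the sa1 collector every 10 minutes, and bumping it to every minute is a one-line change. A sketch -- the exact stock contents vary by distro version:]

    # /etc/cron.d/sysstat -- collect activity snapshots every minute instead of every 10
    * * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1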
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 11:46 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> That imo doesn't really have anything to do with it. The primary benefit of a BBU with writeback caching is accelerating (near-)synchronous writes. Like the WAL. But, besides influencing the default for wal_buffers, a larger shared_buffers doesn't change the amount of synchronous writes.

Here's the problem with a large shared_buffers on a machine that's getting pushed into swap: it starts to swap BUFFERs. Once buffers start getting swapped you're not just losing performance; that huge shared_buffers is now working against you, because what you THINK are buffers in RAM to make things faster are in fact blocks on a hard drive being swapped in and out during reads. It's the exact opposite of fast. :)
Re: [PERFORM] MusicBrainz postgres performance issues
On 03/15/2015 05:08 AM, Robert Kaye wrote:
>> just a wild guess: raid-controller BBU faulty
>
> We don't have a BBU in this server, but at least we have redundant power supplies. In any case, how would a faulty battery possibly cause this?

The controller would turn off the cache.

JD
--
Command Prompt, Inc. - http://www.commandprompt.com/ - 503-667-4564
PostgreSQL Support, Training, Professional Services and Development
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 8:20 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2015-03-15 11:09:34 -0600, Scott Marlowe wrote:
>> shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so.
>
> I think that's an outdated wisdom, i.e. not generally true.

Quite agreed. With the note that a properly configured controller with a BBU is needed.

> A new enough kernel, a sane filesystem (i.e. not ext3) and a sane checkpoint configuration take care of most of the other disadvantages.

Most likely. And better to be sure that the filesystem is mounted without barriers.

And I agree with Scott - 64MB work_mem AND max_connections = 500 is a recipe for disaster. The problem could be the session mode of pgbouncer. If you can work with transaction mode - do it.

Best regards,
Ilya Kosmodemiansky, PostgreSQL-Consulting.com
i...@postgresql-consulting.com
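[The session-vs-transaction distinction is a single PgBouncer setting; a minimal sketch, other options omitted:]

    ; pgbouncer.ini
    [pgbouncer]
    pool_mode = transaction   ; release the server connection at end of transaction,
                              ; instead of holding it for the whole client session

[The caveat is that transaction pooling breaks session-level state -- prepared statements, advisory locks, SET without LOCAL -- so the application has to avoid those.]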
Re: [PERFORM] MusicBrainz postgres performance issues
Why is 500 connections insane? We got 32 CPU with 96GB and 3000 max connections, and we are doing fine, even when hitting our max concurrent connection peaks around 4500. At a previous site, we were using 2000 max connections on 24 CPU and 64GB RAM, with about 1500 max concurrent connections. So I wouldn't be too hasty in saying more than 500 is asking for trouble. Just as long as you got your kernel resources set high enough to sustain it (SHMMAX, SHMALL, SEMMNI, and ulimits), and RAM for work_mem.

[Replies from Tomas Vondra, Andres Freund and Scott Marlowe quoted in full; they appear elsewhere in this thread.]
Re: [PERFORM] MusicBrainz postgres performance issues
On 15.3.2015 23:47, Andres Freund wrote:
> On 2015-03-15 12:25:07 -0600, Scott Marlowe wrote:
>> Here's the problem with a large shared_buffers on a machine that's getting pushed into swap. It starts to swap BUFFERs. [...]
>
> IMNSHO that's tackling things from the wrong end. If 12GB of shared buffers drive your 48GB dedicated OLTP postgres server into swapping out actively used pages, the problem isn't the 12GB of shared buffers, but that you require so much memory for other things. That needs to be fixed.

I second this opinion.

As was already pointed out, the 500 connections is rather insane (assuming the machine does not have hundreds of cores). If there are memory pressure issues, it's likely because many queries are performing memory-expensive operations at the same time (might even be a bad estimate causing hashagg to use much more than work_mem).

> But! We haven't even established that swapping is an actual problem here. The ~2GB of swapped out memory could just as well be the java raid controller management monstrosity or something similar. Those pages won't ever be used and thus can better be used to buffer IO.
>
> You can check what's actually swapped out using:
>
>     grep ^VmSwap /proc/[0-9]*/status | grep -v '0 kB'
>
> For swapping to be actually harmful you need to have pages that are regularly swapped in. vmstat will tell.

I've already asked for vmstat logs, so let's wait.

> In a concurrent OLTP workload (~450 established connections do suggest that) with a fair amount of data, keeping the hot data set in shared_buffers can significantly reduce problems. Constantly searching for victim buffers isn't a nice thing, and that will happen if your most frequently used data doesn't fit into s_b. On the other hand, if your data set is so large that even the hottest part doesn't fit into memory (perhaps because there's no hottest part, as there's no locality at all), a smaller shared_buffers can make things more efficient, because the search for replacement buffers is cheaper with a smaller shared_buffers setting.

I've met many systems with max_connections values this high, and it was mostly idle connections because of separate connection pools on each application server. So mostly idle (90% of the time), but at peak time all the application servers want to do stuff at the same time. And it all goes KABOOOM! just like here.

--
Tomas Vondra -- http://www.2ndQuadrant.com/ -- PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
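[To act on the "vmstat will tell" point: run it with an interval and watch the si/so columns, which report KB/s swapped in and out. A sketch of what harmful swapping looks like, with made-up numbers and trailing columns trimmed:]

    $ vmstat 5
    procs -----------memory---------- ---swap-- ...
     r  b   swpd   free   buff  cache   si   so
     8  3 2439168 201312  5120 981220  812  640

[Sustained nonzero si means pages are being swapped back in under load -- the harmful case. Swapped-out-but-never-touched pages instead show a large swpd with si/so staying at 0, which is harmless.]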
Re: [PERFORM] MusicBrainz postgres performance issues
(Please quote properly.)

On 2015-03-15 19:55:23 -0400, mich...@sqlexec.com wrote:
> Why is 500 connections insane? We got 32 CPU with 96GB and 3000 max connections, and we are doing fine, even when hitting our max concurrent connection peaks around 4500. At a previous site, we were using 2000 max connections on 24 CPU and 64GB RAM, with about 1500 max concurrent connections. So I wouldn't be too hasty in saying more than 500 is asking for trouble. Just as long as you got your kernel resources set high enough to sustain it (SHMMAX, SHMALL, SEMMNI, and ulimits), and RAM for work_mem.

It may work acceptably in some scenarios, but it can lead to significant problems. Several things in postgres scale linearly with max_connections (from the algorithmic point of view; often CPU characteristics like cache sizes make it worse), most notably acquiring a snapshot. It usually works ok enough if you don't have a high number of queries per second, but if you do, you can run into horrible contention problems. Absurdly enough, that matters *more* on bigger machines with several sockets. It's especially bad on 4+ socket systems.

The other aspect is that such a high number of full connections usually just isn't helpful for throughput. Not even the most massive NUMA systems (~256 hardware threads is the realistic max atm, IIRC) can process 4.5k queries at the same time. It'll often be much more efficient if all connections above a certain number aren't allocated a full postgres backend, with all its overhead, but use a much more lightweight pooler connection.

Greetings,

Andres Freund
Re: [PERFORM] MusicBrainz postgres performance issues
On 16/03/15 13:07, Tomas Vondra wrote:
> On 16.3.2015 00:55, mich...@sqlexec.com wrote:
>> Why is 500 connections insane? [...]
>
> [...] Also, if part of the query required a certain amount of memory for part of the plan, it now holds that memory for much longer too. That only increases the chance of OOM issues. [...]

Also, you could get a situation where the data, relevant indexes, and working memory etc. for a small number of queries can all just fit into RAM, but the extra queries suddenly reduce the available RAM so that even those queries spill to disk, on top of the time required to process the extra queries. So a nicely behaved system could suddenly get a lot worse. Even before you consider additional lock contention and other nasty things!

It all depends...

Cheers,
Gavin
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 20:54:51 +0300, Ilya Kosmodemiansky wrote:
> My point was that with no proper raid controller (today a BBU is surely needed for the controller to be a proper one) plus heavy writes of any kind, it is absolutely impossible to live with large shared_buffers and without IO problems.

And my point is that you're mostly wrong. What a raid controller's writeback cache usefully accelerates is synchronous writes, i.e. writes that the application waits for. Usually raid controllers don't have much chance to reorder the queued writes (i.e. turning neighboring writes into one larger sequential write). What they do excel at is making synchronous writes to disk return faster, because the data is only written to the controller's memory, not to actual storage. They're also good at avoiding actual writes to disk when the *same* page is written multiple times in a short amount of time.

In postgres, writes for data that goes through shared_buffers are usually asynchronous. We write them to the OS's page cache when a page is needed for other contents, is undirtied by the bgwriter, or is written out during a checkpoint; but we do *not* wait for the write to hit the disk. The controller's writeback cache doesn't hugely help here, because it doesn't make *that* much of a difference whether the dirty data stays in the kernel's page cache or in the controller.

In contrast to that, writes to the WAL are often more or less synchronous. We actually wait (via fdatasync()/fsync() syscalls) for writes to hit disk in a bunch of scenarios, most commonly when committing a transaction. Unless synchronous_commit = off, every COMMIT in a transaction that wrote data implies an fdatasync() of a file in pg_xlog (well, we do optimize that in some conditions, but let's leave that out for now).

Additionally, if there are many smaller/interleaved transactions, we will write()/fdatasync() out the same 8kb WAL page repeatedly. Every time a transaction commits (and some other things), the page that commit record is on will be flushed. As the WAL records for insertions, updates, deletes and commits are frequently much smaller than 8kb, that will often happen 20-100 times for the same page in OLTP scenarios with narrow rows.

That's why synchronous_commit = off can be such a huge win for OLTP write workloads without a writeback cache - synchronous writes are turned into asynchronous writes, and repetitive writes to the same page are avoided. It also explains why synchronous_commit = off has much less of an effect for bulk write workloads: as there are no synchronous disk writes due to WAL flushes at commit time (there's only very few commits), synchronous commit doesn't have much of an effect.

That said, there are a couple of reasons why you're not completely wrong:

Historically, when using ext3 with data=ordered and some other filesystems, synchronous writes to one file forced *all* other previously dirtied data to also be flushed. That means that if you have pg_xlog and the normal relation files on the same filesystem, the synchronous writes to the WAL will not only have to write out the new WAL (often not that much data), but also all the other dirty data. The OS will often be unable to do efficient write combining in that case, because a) there's not yet that much data there, b) postgres doesn't order writes during checkpoints. That means that WAL writes will suddenly have to write out much more data => COMMITs are slow. That's where the suggestion to keep pg_xlog on a separate partition largely comes from.

Writes going through shared_buffers are sometimes indirectly turned into synchronous writes (from the OS's perspective at least, which means they'll be done at a higher priority). That happens when the checkpointer fsync()s all the files at the end of a checkpoint. When things are going well and checkpoints are executed infrequently and completely guided by time (i.e. triggered solely by checkpoint_timeout, and not checkpoint_segments), that's usually not too bad; you'll see a relatively small latency spike for transactions.

Unfortunately the ext* filesystems have an implementation problem here which can make this much worse: the way writes are prioritized during an fsync() can stall out concurrent synchronous reads/writes pretty badly. That's much less of a problem with e.g. xfs, which is why I'd atm not suggest ext4 for write intensive applications.

The other situation where this can lead to big problems is if your checkpoints aren't scheduled by time (you can recognize that by enabling log_checkpoints and checking a) that time is the trigger, b) that they're actually happening in an interval consistent with checkpoint_timeout). If the relation files are not written out in a smoothed-out fashion (configured by checkpoint_completion_target), the fsync()s at the end of a checkpoint have to push out large amounts of dirty data at once, with the latency consequences described above.
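[A sketch of the postgresql.conf knobs this discussion keeps touching, with illustrative values only -- the right settings depend on durability requirements and write volume:]

    synchronous_commit = off            # commits no longer wait for WAL fdatasync();
                                        # a crash can lose a small window of recent commits
    wal_buffers = 16MB
    checkpoint_segments = 64            # (pre-9.5 setting) avoid WAL-volume-triggered checkpoints
    checkpoint_timeout = 15min          # let time, not segments, trigger checkpoints
    checkpoint_completion_target = 0.9  # smooth checkpoint writes across the interval
    log_checkpoints = on                # verify what actually triggers each checkpoint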
Re: [PERFORM] MusicBrainz postgres performance issues
On 16.3.2015 00:55, mich...@sqlexec.com wrote:
> Why is 500 connections insane? We got 32 CPU with 96GB and 3000 max connections, and we are doing fine, even when hitting our max concurrent connection peaks around 4500. At a previous site, we were using 2000 max connections on 24 CPU and 64GB RAM, with about 1500 max concurrent connections. So I wouldn't be too hasty in saying more than 500 is asking for trouble. Just as long as you got your kernel resources set high enough to sustain it (SHMMAX, SHMALL, SEMMNI, and ulimits), and RAM for work_mem.

If all the connections are active at the same time (i.e. running queries), they have to share the 32 cores somehow. Or I/O, if that's the bottleneck. In other words, you're not improving the throughput of the system; you're merely increasing latencies. And it may easily happen that the latency increase is not linear but grows faster, because of locking, context switches and other process-related management.

Imagine you have a query taking 1 second of CPU time. If you have 64 such queries running concurrently on 32 cores, each gets only 1/2 a CPU and so takes >=2 seconds. With 500 queries, it's >=15 seconds per query, etc. If those queries are acquiring the same locks (e.g. updating the same rows, or so), you can imagine what happens...

Also, if part of the query required a certain amount of memory for part of the plan, it now holds that memory for much longer too. That only increases the chance of OOM issues.

It may work fine when most of the connections are idle, but it makes storms like this possible.

--
Tomas Vondra -- http://www.2ndQuadrant.com/ -- PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [PERFORM] MusicBrainz postgres performance issues
How many CPUs in play here on the PG Cluster Server, cat /proc/cpuinfo | grep processor | wc -l I see you got pg_stat_statements enabled, what are the SQL you experience during this heavy load time? And does explain on them show a lot of sorting activity that requires more work_mem. Please enable log_checkpoints, so we can see if your checkpoint_segments is adequate. Andres Freund mailto:and...@2ndquadrant.com Sunday, March 15, 2015 6:47 PM IMNSHO that's tackling things from the wrong end. If 12GB of shared buffers drive your 48GB dedicated OLTP postgres server into swapping out actively used pages, the problem isn't the 12GB of shared buffers, but that you require so much memory for other things. That needs to be fixed. But! We haven't even established that swapping is an actual problem here. The ~2GB of swapped out memory could just as well be the java raid controller management monstrosity or something similar. Those pages won't ever be used and thus can better be used to buffer IO. You can check what's actually swapped out using: grep ^VmSwap /proc/[0-9]*/status|grep -v '0 kB' For swapping to be actually harmful you need to have pages that are regularly swapped in. vmstat will tell. In a concurrent OLTP workload (~450 established connections do suggest that) with a fair amount of data keeping the hot data set in shared_buffers can significantly reduce problems. Constantly searching for victim buffers isn't a nice thing, and that will happen if your most frequently used data doesn't fit into s_b. On the other hand, if your data set is so large that even the hottest part doesn't fit into memory (perhaps because there's no hottest part as there's no locality at all), a smaller shared buffers can make things more efficient, because the search for replacement buffers is cheaper with a smaller shared buffers setting. Greetings, Andres Freund Scott Marlowe mailto:scott.marl...@gmail.com Sunday, March 15, 2015 2:25 PM Here's the problem with a large shared_buffers on a machine that's getting pushed into swap. It starts to swap BUFFERs. Once buffers start getting swapped you're not just losing performance, that huge shared_buffers is now working against you because what you THINK are buffers in RAM to make things faster are in fact blocks on a hard drive being swapped in and out during reads. It's the exact opposite of fast. :) Andres Freund mailto:and...@2ndquadrant.com Sunday, March 15, 2015 1:46 PM That imo doesn't really have anything to do with it. The primary benefit of a BBU with writeback caching is accelerating (near-)synchronous writes. Like the WAL. But, besides influencing the default for wal_buffers, a larger shared_buffers doesn't change the amount of synchronous writes. Greetings, Andres Freund Ilya Kosmodemiansky mailto:ilya.kosmodemian...@postgresql-consulting.com Sunday, March 15, 2015 1:42 PM On Sun, Mar 15, 2015 at 8:20 PM, Andres Freundand...@2ndquadrant.com wrote: On 2015-03-15 11:09:34 -0600, Scott Marlowe wrote: shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so. I think that's a outdated wisdom, i.e. not generally true. Quite agreed. With note, that proper configured controller with BBU is needed. A new enough kernel, a sane filesystem (i.e. not ext3) and sane checkpoint configuration takes care of most of the other disadvantages. Most likely. And better to be sure that filesystem mounted without barrier. And I agree with Scott - 64MB work mem AND max_connections = 500 is a recipe for disaster. The problem could be in session mode of pgbouncer. 
Best regards,
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
i...@postgresql-consulting.com

Andres Freund mailto:and...@2ndquadrant.com Sunday, March 15, 2015 1:20 PM

I think that's outdated wisdom, i.e. not generally true. I've now seen a significant number of systems where a larger shared_buffers can help quite massively. The primary case where it can, in my experience, go bad is write-mostly databases, where every buffer acquisition has to write out dirty data while holding locks. Especially during relation extension that's bad. A new enough kernel, a sane filesystem (i.e. not ext3) and a sane checkpoint configuration take care of most of the other disadvantages.

Greetings,

Andres Freund
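On the question of whether the hot set fits into s_b: the pg_buffercache contrib module shows what currently occupies shared buffers. A sketch, following the query pattern from the pg_buffercache documentation (the database name is a placeholder):

    $ psql -d yourdb -c "CREATE EXTENSION IF NOT EXISTS pg_buffercache;"
    $ psql -d yourdb -c "
        SELECT c.relname, count(*) AS buffers
          FROM pg_buffercache b
          JOIN pg_class c
            ON b.relfilenode = pg_relation_filenode(c.oid)
           AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                      WHERE datname = current_database()))
         GROUP BY c.relname
         ORDER BY buffers DESC
         LIMIT 10;"

Run it a few times under load; if the same handful of relations dominate each time, the hot set fits, and if the contents churn between runs, it doesn't.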
Re: [PERFORM] MusicBrainz postgres performance issues
On 15.3.2015 18:54, Ilya Kosmodemiansky wrote:

On Sun, Mar 15, 2015 at 8:46 PM, Andres Freund and...@2ndquadrant.com wrote: That imo doesn't really have anything to do with it. The primary benefit of a BBU with writeback caching is accelerating (near-)synchronous writes. Like the WAL.

My point was that with no proper RAID controller (today a BBU is surely needed for the controller to count as a proper one) plus heavy writes of any kind, it is absolutely impossible to live with large shared_buffers and without IO problems.

That is not really true, IMHO.

The benefit of the write cache is that it can absorb a certain amount of writes, equal to the size of the cache (nowadays usually 512MB or 1GB), without forcing them to the disks. It however still has to flush the dirty data to the drives later, and that side usually has much lower throughput -- e.g. while you can easily write several GB/s to the controller, the drives usually handle only ~1MB/s of random writes each (I assume rotational drives here).

But if you do a lot of random writes (which is likely the case for write-heavy databases), you'll fill the write cache pretty soon and will be bounded by the drives anyway. The controller really can't help much with sequential writes, because the drives already handle those quite well. And SSDs are a completely different story, of course.

That does not mean the write cache is useless -- it can absorb short bursts of random writes, fix the RAID5 write hole, offload the parity computation, etc. Whenever someone asks me whether they should buy a RAID controller with write cache for their database server, my answer is absolutely yes in 95.23% of cases ...

... but really it's not something that magically changes the limits for write-heavy databases -- the main limit is still the drives.

regards
Tomas

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
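To put a number on what the drives behind the cache can actually sustain, fio can measure random-write throughput directly. A sketch only (the test file path is a placeholder, and you don't want to point this at a busy production volume), using 8k blocks to match Postgres's page size:

    $ fio --name=randwrite --filename=/mnt/test/fio.dat --size=1G \
          --rw=randwrite --bs=8k --direct=1 --runtime=60 --time_based

On a single rotational drive the steady-state result typically lands in the low single-digit MB/s range -- the bound Tomas describes once the controller's cache has filled.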
Re: [PERFORM] MusicBrainz postgres performance issues
I agree with your counterargument about how a high max_connections can cause problems, but max_connections may not be part of the problem here. There's a bunch of "it depends" stuff in there based on workload details, # of CPUs, RAM, etc.

I'm still waiting to find out how many CPUs are on this DB server. Did I miss it somewhere in the email thread below?

Tomas Vondra mailto:tomas.von...@2ndquadrant.com Sunday, March 15, 2015 8:07 PM

If all the connections are active at the same time (i.e. running queries), they have to share the 32 cores somehow. Or I/O, if that's the bottleneck. In other words, you're not improving the throughput of the system, you're merely increasing latencies. And it may easily happen that the latency increase is not linear, but grows faster -- because of locking, context switches and other process-related management.

Imagine you have a query taking 1 second of CPU time. If you have 64 such queries running concurrently on 32 cores, each gets only 1/2 a CPU and so takes >= 2 seconds. With 500 queries, it's >= 15 seconds per query, etc. If those queries are acquiring the same locks (e.g. updating the same rows, or so), you can imagine what happens ...

Also, if a query requires a certain amount of memory for part of its plan, it now holds that memory for much longer too. That only increases the chance of OOM issues.

It may work fine when most of the connections are idle, but it makes storms like this possible.

mich...@sqlexec.com mailto:mich...@sqlexec.com Sunday, March 15, 2015 7:55 PM

Why is 500 connections insane? We've got 32 CPUs with 96GB and 3000 max connections, and we are doing fine, even when hitting our max concurrent connection peaks around 4500. At a previous site, we were using 2000 max connections on 24 CPUs and 64GB RAM, with about 1500 max concurrent connections. So I wouldn't be too hasty in saying more than 500 is asking for trouble. Just as long as you've got your kernel resources set high enough to sustain it (SHMMAX, SHMALL, SEMMNI, and ulimits), and RAM for work_mem.

Tomas Vondra mailto:tomas.von...@2ndquadrant.com Sunday, March 15, 2015 7:41 PM

On 15.3.2015 23:47, Andres Freund wrote:

On 2015-03-15 12:25:07 -0600, Scott Marlowe wrote: Here's the problem with a large shared_buffers on a machine that's getting pushed into swap. It starts to swap BUFFERs. Once buffers start getting swapped you're not just losing performance, that huge shared_buffers is now working against you because what you THINK are buffers in RAM to make things faster are in fact blocks on a hard drive being swapped in and out during reads. It's the exact opposite of fast. :)

IMNSHO that's tackling things from the wrong end. If 12GB of shared buffers drive your 48GB dedicated OLTP postgres server into swapping out actively used pages, the problem isn't the 12GB of shared buffers, but that you require so much memory for other things. That needs to be fixed.

I second this opinion. As was already pointed out, the 500 connections figure is rather insane (assuming the machine does not have hundreds of cores). If there are memory pressure issues, it's likely because many queries are performing memory-expensive operations at the same time (might even be a bad estimate causing a hashagg to use much more than work_mem).

But! We haven't even established that swapping is an actual problem here. The ~2GB of swapped out memory could just as well be the java raid controller management monstrosity or something similar. Those pages won't ever be used and thus can better be used to buffer IO.
You can check what's actually swapped out using:

    grep ^VmSwap /proc/[0-9]*/status|grep -v '0 kB'

For swapping to be actually harmful you need to have pages that are regularly swapped in. vmstat will tell.

I've already asked for vmstat logs, so let's wait.

In a concurrent OLTP workload (~450 established connections do suggest that) with a fair amount of data, keeping the hot data set in shared_buffers can significantly reduce problems. Constantly searching for victim buffers isn't a nice thing, and that will happen if your most frequently used data doesn't fit into s_b. On the other hand, if your data set is so large that even the hottest part doesn't fit into memory (perhaps because there's no hottest part, as there's no locality at all), a smaller shared_buffers can make things more efficient, because the search for replacement buffers is cheaper with a smaller shared_buffers setting.

I've met many systems with max_connections values this high, and it was mostly idle connections because of separate connection pools on each application server. So mostly idle (90% of the time), but at peak time all the application servers want to do stuff at the same time. And it all goes KABOOOM! just like here.
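For the vmstat check both Andres and Tomas refer to, a sketch of what to watch:

    $ vmstat 5
    # si / so = KB/s swapped in / out. Occasional 'so' bursts are usually
    # harmless; sustained non-zero 'si' under load means actively used
    # pages are being evicted and re-read -- the harmful case.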
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 12:25:07 -0600, Scott Marlowe wrote: Here's the problem with a large shared_buffers on a machine that's getting pushed into swap. It starts to swap BUFFERs. Once buffers start getting swapped you're not just losing performance, that huge shared_buffers is now working against you because what you THINK are buffers in RAM to make things faster are in fact blocks on a hard drive being swapped in and out during reads. It's the exact opposite of fast. :)

IMNSHO that's tackling things from the wrong end. If 12GB of shared buffers drive your 48GB dedicated OLTP postgres server into swapping out actively used pages, the problem isn't the 12GB of shared buffers, but that you require so much memory for other things. That needs to be fixed.

But! We haven't even established that swapping is an actual problem here. The ~2GB of swapped out memory could just as well be the java raid controller management monstrosity or something similar. Those pages won't ever be used and thus can better be used to buffer IO. You can check what's actually swapped out using:

    grep ^VmSwap /proc/[0-9]*/status|grep -v '0 kB'

For swapping to be actually harmful you need to have pages that are regularly swapped in. vmstat will tell.

In a concurrent OLTP workload (~450 established connections do suggest that) with a fair amount of data, keeping the hot data set in shared_buffers can significantly reduce problems. Constantly searching for victim buffers isn't a nice thing, and that will happen if your most frequently used data doesn't fit into s_b. On the other hand, if your data set is so large that even the hottest part doesn't fit into memory (perhaps because there's no hottest part, as there's no locality at all), a smaller shared_buffers can make things more efficient, because the search for replacement buffers is cheaper with a smaller shared_buffers setting.

Greetings,

Andres Freund

--
Andres Freund                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
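A small elaboration of that grep, as a sketch, to name the processes holding the most swap (assumes /proc/<pid>/comm, which is available on 3.x kernels):

    $ for p in /proc/[0-9]*; do
          s=$(awk '/^VmSwap/ {print $2}' "$p/status" 2>/dev/null)
          [ -n "$s" ] && [ "$s" -gt 0 ] && echo "$s kB  $(cat $p/comm 2>/dev/null)"
      done | sort -rn | head

If the top entries are management daemons rather than postgres backends, the swapped-out memory is likely harmless, as Andres suggests.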
Re: [PERFORM] MusicBrainz postgres performance issues
On 3/15/2015 6:54 AM, Robert Kaye wrote:

Hi!

We at MusicBrainz have been having trouble with our Postgres install for the past few days. I've collected all the relevant information here: http://blog.musicbrainz.org/2015/03/15/postgres-troubles/

If anyone could provide tips, suggestions or other relevant advice for what to poke at next, we would love it.

Robert,

Wow -- you've engaged the wizards indeed. I haven't heard or seen anything that would answer my *second* question if faced with this (my first would have been "what changed?").

What is the database actually trying to do when it spikes? E.g. what queries are running? Is there any pattern in the specific activity when it spikes -- exactly the same query, the same query with different data, or even just the same tables, and/or the same users, the same apps?

I know from experience that well-behaved queries can stop being well behaved if the underlying data changes. And for the experts... what would a corrupt index do to memory usage?

Roxanne
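For the "what queries are running" part, a snapshot of pg_stat_activity taken during a spike usually answers it. A sketch, assuming PostgreSQL 9.2 or later (older releases expose current_query instead of state/query):

    $ psql -c "SELECT pid, usename, now() - query_start AS runtime, waiting, query
                 FROM pg_stat_activity
                WHERE state <> 'idle'
                ORDER BY runtime DESC;"

Taking it a few times a minute apart makes the pattern Roxanne asks about -- same query, same tables, same users -- fairly obvious.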