Well, I have searched a bit more about zone_reclaim_mode, and evidently I am not the first one to have problems with it. It seems that some logic was introduced into the kernel which decided that our machine must be a NUMA architecture (which it is not) and, as a consequence, set zone_reclaim_mode = 1. Now it seems that some people have discovered that this makes no sense on some modern processors which look like NUMA to the software but are not really NUMA. So they want to go back to always setting zone_reclaim_mode = 0 by default.
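For anyone who wants to check what their own kernel picked, here is a minimal sketch. It assumes a Linux system that exposes /proc/sys/vm/zone_reclaim_mode; the function names are hypothetical and only for illustration. The bit meanings are the ones listed in the kernel's vm sysctl documentation.

```python
from pathlib import Path

# Bit meanings as listed in the kernel's vm sysctl documentation.
ZONE_RECLAIM_BITS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def describe_zone_reclaim_mode(value):
    """Return the list of behaviours enabled by a zone_reclaim_mode value."""
    if value == 0:
        return ["zone reclaim off"]
    return [text for bit, text in ZONE_RECLAIM_BITS.items() if value & bit]


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Read the current value, or return None if the file is not available."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("zone_reclaim_mode is not available on this system")
    else:
        print(mode, describe_zone_reclaim_mode(mode))
```

This only reads the value. Changing it needs root, and whether 0 is the right setting depends on whether the machine really is NUMA.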
Here is a lengthy discussion of the topic (so there is no need for me to start it again on LKML, and obviously it is not Ubuntu specific :-) ):

http://lkml.org/lkml/2009/5/12/586

(And yes, our hardware has Core i7 CPUs, which seem to make many kernels think it is NUMA; this is mentioned often in the LKML thread.)

Andras Fabian

-----Original Message-----
From: Craig Ringer [mailto:cr...@postnewspapers.com.au]
Sent: Wednesday, 14 July 2010 05:11
To: Andras Fabian
Cc: pgsql-general@postgresql.org
Subject: Re: AW: AW: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??

On 13/07/10 22:16, Andras Fabian wrote:
> I think I have found the solution. Yes, I can now get constantly high
> throughput with COPY-to-STDOUT, even if free -m only shows me 82 Mbytes (so
> no, this solution is not cleaning the cache). Always around 2 3/4 minutes.
>
> I have compared all the /proc/sys/vm settings on my new machines and the old
> machines (which never had problems), and of course found some differences,
> some new settings etc. (of course, a lot of changes can happen between 2.6.26
> and 2.6.32). And there was one which stood out from the mass, because its
> name reminded me of some functions which I have always seen in the kernel
> stack while having congestion_wait:
>
> - zone_reclaim_mode
> (yes, in the kernel stack there was always also a call to "zone_reclaim").
>
> Interestingly, on the old machine this was set to "0" and on the new machine
> - obviously per Ubuntu default - to "1" ... What all this means is briefly
> described here:
>
> http://www.linuxinsight.com/proc_sys_vm_zone_reclaim_mode.html
>
> Then I thought, let's give it a try. And I set it to "0" on the new server too
> ... and voila, it is running at high speed in COPY-to-STDOUT. I can even
> switch back and forth between 0 and 1 and see how congestion_wait comes back
> or disappears.
>
> Now, someone with big kernel know-how could try to describe to me in detail
> what exactly could be at odds here.
>
> But for me it is now obvious that I will put a change to
> "zone_reclaim_mode = 0" into my startup settings.
>
> And tomorrow I will see how my nightly backup runs with this setting.

It sounds like it's time for a post to the Linux Kernel Mailing List, and/or a Launchpad bug against the Ubuntu kernel. Make sure to have your asbestos underwear on if posting to LKML ;-)

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/
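
The comparison of /proc/sys/vm settings between the old and the new machine, as described in the quoted message, could be sketched roughly like this. It is only an illustration, assuming a Linux system with a readable /proc/sys/vm; the function names read_vm_settings and diff_settings are hypothetical, and this is not necessarily how the comparison was actually done.

```python
from pathlib import Path


def read_vm_settings(directory="/proc/sys/vm"):
    """Collect all readable settings in a directory as {name: value}."""
    settings = {}
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():
            continue
        try:
            settings[entry.name] = entry.read_text().strip()
        except (OSError, ValueError):
            # Some entries are write-only or need root; skip them.
            continue
    return settings


def diff_settings(old, new):
    """Return {name: (old_value, new_value)} for every setting that differs.

    A setting that exists on only one side shows up with None on the other.
    """
    diff = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            diff[name] = (old.get(name), new.get(name))
    return diff


if __name__ == "__main__":
    # Print this machine's settings; run on both machines and compare.
    for name, value in read_vm_settings().items():
        print(f"{name} = {value}")
```

Run it on both machines, save the output, and feed the two resulting dictionaries to diff_settings. With settings like those described above, zone_reclaim_mode would show up as ("0", "1"), alongside any settings that exist only in the newer kernel.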