Well, I have searched a bit more about zone_reclaim_mode, and obviously I am
not the first one to have problems with it. It seems that some logic was
introduced into the kernel which decided that our machine must be a NUMA
architecture (which it is not) and, as a consequence, switched
zone_reclaim_mode to 1. But now it seems that some people have discovered that
this makes no sense on some modern processors which, although they look like
NUMA to software, are not really NUMA. So they want to go back to always
setting zone_reclaim_mode = 0 by default.
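
For anyone who wants to check their own machine, here is a minimal sketch of
reading the setting and switching it to 0. It assumes the standard procfs path
/proc/sys/vm/zone_reclaim_mode; the function names are only illustrative, and
writing the value on a real system requires root.

```python
from pathlib import Path

# Standard procfs location of the setting on Linux (assumed here).
ZONE_RECLAIM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current zone_reclaim_mode value as an integer."""
    return int(Path(path).read_text().strip())


def disable_zone_reclaim(path=ZONE_RECLAIM_PATH):
    """Write 0 if the current value is nonzero and return the previous value.

    Hypothetical helper for illustration; on a real system the write
    needs root and lasts only until the next reboot.
    """
    previous = read_zone_reclaim_mode(path)
    if previous != 0:
        Path(path).write_text("0\n")
    return previous
```

A change made this way does not survive a reboot. To make it permanent, the
usual approach is a line "vm.zone_reclaim_mode = 0" in /etc/sysctl.conf.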

Here is a lengthy discussion of the topic (so there is no need for me to start
it again on LKML, and obviously it is not Ubuntu-specific :-)

        http://lkml.org/lkml/2009/5/12/586

(and yes, our hardware has Core i7 CPUs which seem to make many kernels think 
it is NUMA - it is often mentioned in the LKML thread).

Andras Fabian

-----Original Message-----
From: Craig Ringer [mailto:cr...@postnewspapers.com.au]
Sent: Wednesday, 14 July 2010 05:11
To: Andras Fabian
Cc: pgsql-general@postgresql.org
Subject: Re: AW: AW: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT
??

On 13/07/10 22:16, Andras Fabian wrote:
> I think I have found the solution. Yes, I can now get consistently high
> throughput with COPY-to-STDOUT, even if free -m only shows me 82 Mbytes (so
> no, this solution is not cleaning the cache). Always around 2 3/4 minutes.
> 
> I have compared all the /proc/sys/vm settings on my new machines and the old
> machines (which never had problems), and of course found some differences,
> some new settings etc. (of course, a lot of changes can happen between 2.6.26
> and 2.6.32). And there was one which stood out from the crowd, because its
> name reminded me of some functions which I have always seen in the kernel
> stack while having congestion_wait:
> 
> - zone_reclaim_mode
> (yes, in the kernel stack there was always also a call to "zone_reclaim").
> 
> Interestingly, on the old machine this was set to "0" and on the new machine
> - obviously per Ubuntu default - to "1" ... What this all means is briefly
> described here:
> 
> http://www.linuxinsight.com/proc_sys_vm_zone_reclaim_mode.html
> 
> Then I thought, let's give it a try. And I set it to "0" on the new server too
> ... and voila, it is running at high speed in COPY-to-STDOUT. I can even
> switch back and forth between 0 and 1 and see how congestion_wait comes back
> or disappears.
> 
> Now, someone with deep kernel know-how could try to describe to me in detail
> what exactly could be at odds here.
> 
> But for me it is now obvious that I will add a change of
> "zone_reclaim_mode = 0" to my startup settings.
> 
> And tomorrow I will see how my nightly backup runs with this setting.

It sounds like it's time for a post to the Linux Kernel Mailing List,
and/or a Launchpad bug against the Ubuntu kernel.

Make sure to have your asbestos underwear on if posting to LKML ;-)

-- 
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/
