Hi,

> Anton suggested that NUMA distances in powerpc mattered and hurt
> performance without this setting. We need to validate to see if this
> is still true. A simple way to start would be benchmarking

The original issue was that we never reclaimed local clean pagecache.

I just tried all settings for /proc/sys/vm/zone_reclaim_mode and none
of them caused me to reclaim local clean pagecache! We are very broken.

I would think we have test cases for this, but here is a dumb one.
First something to consume memory:

# cat alloc.c

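#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <assert.h>

int main(int argc, char *argv[])
{
        void *p;
        unsigned long size;

        if (argc < 2) {
                fprintf(stderr, "usage: %s <bytes>\n", argv[0]);
                return 1;
        }

        size = strtoul(argv[1], NULL, 0);

        p = malloc(size);
        assert(p);
        /* Touch every page so the memory is actually allocated. */
        memset(p, 0, size);
        printf("%p\n", p);

        /* Hold the memory while we inspect the node. */
        sleep(3600);

        return 0;
}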

Now create a file to consume pagecache. My nodes have 32GB each, so
I create 16GB, enough to consume half of the node:

dd if=/dev/zero of=/tmp/file bs=1G count=16

Clear out our pagecache:

sync
echo 3 > /proc/sys/vm/drop_caches

Bring it in on node 0:

taskset -c 0 cat /tmp/file > /dev/null

Consume 24GB of memory on node 0:

taskset -c 0 ./alloc 25769803776

In all zone reclaim modes, the pagecache never gets reclaimed:

# grep FilePages /sys/devices/system/node/node0/meminfo

Node 0 FilePages:      16757376 kB

And our alloc process shows lots of off node memory used:

3ff9a4630000 default anon=393217 dirty=393217 N0=112474 N1=220490 N16=60253 kernelpagesize_kB=64

Clearly nothing is working. Gavin, if your patch fixes this we should
get it into stable too.

Anton
