Re: per_cpu_pagesets degrades MPI performance

Nick Piggin Thu, 07 Apr 2005 17:41:48 -0700

Jack Steiner wrote:

Good idea. For the specific benchmark that I was running, batch sizes of 0 (pcp disabled), 1, 3, 5, 7, 9, 10, 11, 13 & 15 all produced good results. Batch sizes of 2, 4 and 8 produced horrible results.


Phew, I hope we won't have to make this a CONFIG_ option!

Surprisingly 7 was not quite as good as the other good values but I attribute
that to an anomaly of the reference pattern of the specific benchmark.

Even more surprising (again an anomaly I think) was that a size of 13 ran
10% faster than any of the other sizes. I reproduced this data point several
times - it is real.


Hmm. Yeah, sounds like you are getting close to some "resonance" behaviour -
where 7 and 13 are close to a multiple or divisor of some application
or cache property.

Our next step is to run the full benchmark suite. That should happen
within 2 weeks.
Tentatively, I'm planning to post a patch to change the batch size to 2**n-1 but I'll wait for the results of the full benchmark.


Cool. I would consider (maybe you are) posting the patch ASAP, so you
can get a wider range of testers, and Andrew can possibly put it in
-mm. Just to get things happening in parallel.
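
For reference, the 2**n-1 rounding discussed above could look something
like the sketch below. This is only an illustration, not the patch itself:
the helper name pcp_batch_round() is made up, and rounding down (rather
than up) is an assumption.

```c
/*
 * Hypothetical helper, not taken from any posted patch: round a requested
 * per-cpu pageset batch size down to the largest value of the form
 * 2**n - 1 that does not exceed it.
 */
static unsigned int pcp_batch_round(unsigned int batch)
{
	unsigned int r = 0;

	/*
	 * Grow r through 1, 3, 7, 15, ... while the next value still fits.
	 * The "r < batch" test also stops the loop once r has all bits set,
	 * so the shift cannot wrap around forever.
	 */
	while (r < batch && ((r << 1) | 1) <= batch)
		r = (r << 1) | 1;

	return r;
}
```

With this, a requested batch of 8 becomes 7 and 16 becomes 15, while 0
(pcp disabled) stays 0, so the power-of-two sizes that gave the bad
results (2, 4, 8) are never used.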

I also want to finish understanding the issue of excessive memory
being trapped in the per_cpu lists.


Nutty problem, that, on a 256 node, 512 CPU system :(

Thanks,
Nick

--
SUSE Labs, Novell Inc.


