On 15-Jul-08, at 1:01 AM, Bakul Shah wrote:

> I suspect a lot of this complexity will end up being dropped
> when you don't have to worry about efficiently using the last
> N% of cpu cycles.

Would that I weren't working on a multi-core graphics part... That N% is what the game is all about.

> When your bottleneck is memory bandwidth, using a core 100%
> is not going to happen in general.

But in most cases, that memory movement has to share the bus with increasingly remote cache accesses, which in turn take bandwidth. Affinity is a serious win for reducing on-chip bandwidth usage in cache-coherent many-core systems.
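The affinity win described here boils down to keeping a task on the core whose cache already holds its working set. A minimal Linux-only sketch of the underlying mechanism (not Paul's scheduler, just the OS primitive) using Python's `os.sched_setaffinity`:

```python
import os

# Pin the current process to hardware thread 0 so its working set
# stays warm in that core's private cache, rather than bouncing
# between caches and burning on-chip bandwidth on coherence traffic.
# Linux-only; a sketch of the mechanism, not a scheduling policy.
original = os.sched_getaffinity(0)   # save the current allowed-CPU set
os.sched_setaffinity(0, {0})         # pin this process to CPU 0
assert os.sched_getaffinity(0) == {0}
os.sched_setaffinity(0, original)    # undo the pin
```

The same call, applied per-thread from user space, is one way to get affinity without the kernel making placement decisions at all.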

> And I am
> not sure thread placement belongs in the kernel.  Why not let
> an application manage its allocation of h/w thread x cycle
> resources?  I am not even sure a full kernel belongs on every
> core.

I'm still looking for the right scheduler, in kernel or user space, that lets me affinitize three resources that run at different granularities: per-core caches, hardware threads within a core, and cross-chip caches. There's a rough hierarchy implied by these three resources. Perfect scheduling might be possible in a purely cooperative world, but reality imposes pre-emption and resource virtualization.
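The rough hierarchy of those three resources can be made concrete with a toy placement model. Everything below is hypothetical (the names `HwThread`, `build_topology`, and `place` are invented for illustration); it is a sketch of hierarchy-aware placement, not a real scheduler:

```python
from dataclasses import dataclass

# Toy model of the three granularities: chips (cross-chip caches),
# cores (per-core cache), and hardware threads within a core.
@dataclass(frozen=True)
class HwThread:
    chip: int
    core: int
    thread: int

def build_topology(chips, cores_per_chip, threads_per_core):
    """Enumerate every hardware thread in the machine."""
    return [HwThread(ch, co, t)
            for ch in range(chips)
            for co in range(cores_per_chip)
            for t in range(threads_per_core)]

def place(task_last: HwThread, idle: set) -> HwThread:
    """Pick an idle hw-thread for a task, preferring cache affinity
    with wherever the task last ran."""
    def score(hw):
        if hw.chip == task_last.chip and hw.core == task_last.core:
            return 0          # shares the per-core cache: best
        if hw.chip == task_last.chip:
            return 1          # shares the on-chip cache: next best
        return 2              # cross-chip: caches are cold
    return min(idle, key=score)
```

Pre-emption is exactly what breaks this picture: by the time a task runs again, `task_last` may describe a cache that has long since been evicted, which is why the cooperative case is so much easier.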

> Unlike you I think the kernel should do even less as more and
> more cores are added.  It should basically stay out of the
> way.  Less government, more privatization :-)  So maybe
> the plan9 kernel would be a better starting point than a Unix
> kernel.

Agreed, less and less in the kernel, but *enough*. I like resource virtualization, and as long as it gets affinity right, I win.

Paul
