On Mon, 17 Oct 2016, Luck, Tony wrote:
> On Tue, Oct 18, 2016 at 01:20:36AM +0200, Thomas Gleixner wrote:
> > It's certainly not perfect (missing L2 etc.), but clearly avoids exactly
> > the above issues. And it would allow you to utilize the 256 groups in an
> > understandable way.
> If you head down that path someone with a 4-socket system will try to
> make 16x16x16x16 = 65536 groups and "understandable" takes a bit of
> a beating. The eight socket system with 16^8 = 4G groups defies any
> rational hope. Best not to think about 16 sockets.
> The L2 + L3 configuration space gets unbelievably messy too.
> There's a reason why I ripped out the allocation code and went with
> a simple global allocator in this version. If we decide we need something
> fancier we can adapt later. Some solutions might be transparent to
> applications, others might add a "closid" file into each directory to
> give 2nd generation applications hooks to view (and maybe control)
> which closid is used by each group.
I'm not saying that we want something fancier. I fully agree with your
decision to make a simple global allocator.
I was just puzzled by the 16*16 comment and wondered what this is
about. Looking at Fenghua's explanation and the examples, there is nothing
which really looks like we would ever want it. In fact the fancy CLOSID
matrix does not make much sense at all.
So I would rather like to see a comment clearly explaining why the chosen
allocator (grouping) gives us the most straightforward way to utilize the
hardware. It surely restricts the theoretical choices, but it limits them to
the subset which makes technical sense.
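
To put the numbers above next to what a simple global allocator amounts
to, here is a rough sketch. It is not the code from the patch; all names
are made up for illustration, and it assumes 16 CLOSIDs per socket as in
the 16x16 example.

```c
#include <stdint.h>

#define NUM_CLOSIDS 16	/* assumption: CLOSIDs available per socket */

/*
 * Hypothetical helper: number of distinct per-socket CLOSID tuples a
 * "matrix" allocator would expose, i.e. NUM_CLOSIDS^sockets.
 */
static uint64_t closid_tuples(unsigned int sockets)
{
	uint64_t n = 1;

	while (sockets--)
		n *= NUM_CLOSIDS;
	return n;
}

/*
 * Hypothetical global allocator: a single bitmap, and a group uses the
 * same CLOSID on every socket. A set bit means the CLOSID is free.
 */
static uint32_t closid_free_map = (1u << NUM_CLOSIDS) - 1;

/* Return the lowest free CLOSID, or -1 if all are in use. */
static int closid_alloc(void)
{
	int id;

	for (id = 0; id < NUM_CLOSIDS; id++) {
		if (closid_free_map & (1u << id)) {
			closid_free_map &= ~(1u << id);
			return id;
		}
	}
	return -1;
}

/* Mark a previously allocated CLOSID as free again. */
static void closid_free(int id)
{
	closid_free_map |= 1u << id;
}
```

With these assumptions closid_tuples(4) is 65536 and closid_tuples(8) is
4G, while the global allocator never hands out more than NUM_CLOSIDS
groups no matter how many sockets there are.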