Re: [hwloc-devel] structure assumptions, duplication

Fawzi Mohamed Tue, 29 Sep 2009 12:55:32 -0400

Hi Samuel,

On 29-set-09, at 18:14, Samuel Thibault wrote:

Fawzi Mohamed, on Tue 29 Sep 2009 17:39:17 +0200, wrote:

so that in the future one could avoid storing it at least in the
deepest levels where it is easy and relatively cheap to generate (and
where one would have the largest savings).

Even the deepest levels would have an L1 cache level on top of maybe
just at most 4 threads. Here we only save the "children" pointers,
which is not so many compared to the siblings & cousins pointers. I'm
not sure it is really worth the pain of defining a long series of
functions.


ok those were two separate things, I was thinking

cpuset -> cpuset_ptr (or just a flag that says if the structure has it,
and thus two structures, a long one with it and a short one without,
differing only in the tail if you really want to be hacky). Then cpuset
is generated on the fly for the deepest level (like less than 4-8 proc
-> lots of memory savings on large machines).

(cost 1 function, and copying or building the cpuset)
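
A minimal sketch of that single function, to make the idea concrete. This
is not hwloc's real structure or API: demo_obj, demo_cpuset, has_cpuset
and demo_obj_cpuset are hypothetical names, and the cpuset is simplified
to a 64-bit mask. Objects that store a cpuset return it directly; objects
that do not store one get it rebuilt from their children.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU set: one bit per logical CPU (max 64). */
typedef uint64_t demo_cpuset;

/* Hypothetical object layout, NOT hwloc's real structure. */
typedef struct demo_obj {
    struct demo_obj *father;
    struct demo_obj **children;
    unsigned arity;       /* number of children, 0 for a leaf */
    unsigned os_index;    /* logical CPU number, meaningful for leaves */
    int has_cpuset;       /* flag: is the cpuset stored in this object? */
    demo_cpuset cpuset;   /* valid only when has_cpuset is non-zero */
} demo_obj;

/* Return the stored cpuset if there is one, otherwise build it on the
 * fly by OR-ing the cpusets of the children (a leaf yields its own bit). */
demo_cpuset demo_obj_cpuset(const demo_obj *obj)
{
    assert(obj != NULL);
    if (obj->has_cpuset)
        return obj->cpuset;
    if (obj->arity == 0) {
        assert(obj->os_index < 64);
        return (demo_cpuset)1 << obj->os_index;
    }
    demo_cpuset set = 0;
    for (unsigned i = 0; i < obj->arity; i++)
        set |= demo_obj_cpuset(obj->children[i]);
    return set;
}
```

Callers never look at the field, so whether a given level stores its
cpuset can change later without touching them.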

sibling/cousin -> only cousins (you can make them loop first on
siblings, then to the others if it really is a partition)

children -> only one representation (arity/children or first/last)
(cost many functions)
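
A rough sketch of what those functions could look like, assuming only a
cousin chain and a first-child pointer are stored. The names (demo_node,
demo_next_sibling, demo_child) are hypothetical, not hwloc API, and the
sketch assumes the children of one father are contiguous in the cousin
chain, which is the "partition" condition mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node, NOT hwloc's real structure: no sibling pointers and
 * no children array, only the cousin chain and the first child. */
typedef struct demo_node {
    struct demo_node *father;
    struct demo_node *next_cousin;  /* next object at the same depth */
    struct demo_node *first_child;
    unsigned arity;                 /* number of children */
} demo_node;

/* The next sibling is the next cousin when it shares our father.
 * Assumes a single root, so two objects with a NULL father never occur. */
demo_node *demo_next_sibling(const demo_node *n)
{
    assert(n != NULL);
    demo_node *c = n->next_cousin;
    return (c != NULL && c->father == n->father) ? c : NULL;
}

/* The i-th child is reached by walking the cousin chain from first_child;
 * this relies on siblings being contiguous in that chain. */
demo_node *demo_child(const demo_node *n, unsigned i)
{
    assert(n != NULL);
    if (i >= n->arity)
        return NULL;
    demo_node *c = n->first_child;
    while (i-- > 0)
        c = c->next_cousin;
    return c;
}
```

The trade-off is visible here: demo_child costs a walk of up to arity
steps where a children array would be a single index.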

the main point is that these changes/optimizations can be done even
later without breaking anything if you use functions.

I would say that for most operations (cpuset, next_sibling,...) using
functions that get a hwloc_obj_t (and if needed also a topology) and
return what requested is the way to go.

That means a long series of functions, I'm not sure it's really clearer
for the user. obj->father looks to me easier to read than
hwloc_obj_father(obj), particularly in complex expressions.

ok I can see that, so I guess you will have to evaluate if the
abstraction cost is worth the potential savings, maybe for cpuset it
is; for sibling,... you might be right that it isn't, for father it
sure isn't.

I suppose that most of these operations are not performance critical.


I wouldn't suppose this actually. Detection time is probably not
performance critical, but it could be useful to make browsing the
topology very efficient.

ok, I was thinking that maybe you did/would like to provide in the
future something akin to what opensolaris does with locality groups
http://opensolaris.org/os/community/performance/mpo_overview.pdf


Yes, we intend to provide something similar.

In fact what I "need" (or at least I think I need ;) is just the next
neighbors, basically I go up the hierarchy, and look which new
neighbors I have, so some hierarchy like the lgroups is close to what
I need, and simpler to handle than the full graph.
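
To illustrate the "go up and see which neighbors are new" walk, here is a
small sketch. It is only an assumption about how such a lookup could be
written: demo_lobj and demo_neighbor_rings are hypothetical names, not
hwloc or lgroup API, and cpusets are simplified to 64-bit masks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object, NOT hwloc's real structure. */
typedef struct demo_lobj {
    struct demo_lobj *father;
    uint64_t cpuset;  /* bit i set = logical CPU i is below this object */
} demo_lobj;

/* Walk from start towards the root. rings[k] receives the CPUs that are
 * first reached at the (k+1)-th ancestor, i.e. the neighbors that are new
 * at that level. Returns the number of entries written. */
size_t demo_neighbor_rings(const demo_lobj *start, uint64_t *rings,
                           size_t max_rings)
{
    assert(start != NULL && rings != NULL);
    size_t n = 0;
    uint64_t seen = start->cpuset;
    for (const demo_lobj *up = start->father;
         up != NULL && n < max_rings; up = up->father) {
        rings[n++] = up->cpuset & ~seen;  /* only what was not seen yet */
        seen |= up->cpuset;
    }
    return n;
}
```

For a thread on CPU 0 inside a 2-thread core, a 4-CPU socket and an 8-CPU
machine, the rings would be CPU 1, then CPUs 2-3, then CPUs 4-7.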


That's what future heuristics would build for you, yes.


that's great, I am really looking forward to it.

and sorry if I seem to be criticizing a lot, as I am mainly a user,
not a developer of hwloc, but I hope it is constructive, and maybe
helps making hwloc better...


ciao
Fawzi
