Re: [hwloc-devel] release status

Fawzi Mohamed Mon, 5 Oct 2009 09:23:30 -0400


On 5 Oct 2009, at 14:27, Jeff Squyres wrote:

On Oct 3, 2009, at 8:21 AM, Fawzi Mohamed wrote:
Ok you are right that storing in the struct might be overkill, and about performance I fully agree, space not so much, especially if you really want to cache all the cpuset for all objects, this still grows quadratically, and allocates a lot of objects.
I'm still not sure that I agree -- I still think we're just quibbling over a few bytes here. It's commonplace to have 2GB RAM per core these days; that number certainly isn't going to go down -- it's likely that it's even going to go up.
So yes, if every process running on every core has a cpuset, you multiply (for example) a 4k cpuset data structure times 1,000 processors (cores): 4MB. But consider that each of those 1,000 processors will have 2GB or more of RAM. That's 2 terabytes; who cares about 4MB when you have 2TB? That's 6 orders of magnitude difference; put differently, 4MB is 0.0002 percent of 2TB.
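
The back-of-envelope numbers in this exchange can be checked with a minimal C sketch. The function names are hypothetical and not part of hwloc, and the sketch assumes a dense cpuset with one bit per CPU and decimal units (4k = 4,000 bytes, 2GB = 2e9 bytes).

```c
#include <stddef.h>

/* Hypothetical helper, not an hwloc function.
 * Bytes needed to cache one dense cpuset (one bit per CPU) on every object. */
size_t demo_cached_cpuset_bytes(size_t n_objects, size_t n_cpus)
{
    size_t bytes_per_set = (n_cpus + 7) / 8;   /* round up to whole bytes */
    return n_objects * bytes_per_set;
}

/* Hypothetical helper, not an hwloc function.
 * Share of total RAM taken by one cpuset per core, in percent. */
double demo_cpuset_share_percent(double set_bytes, double ram_bytes_per_core,
                                 double n_cores)
{
    return 100.0 * (set_bytes * n_cores) / (ram_bytes_per_core * n_cores);
}
```

Under these assumptions, doubling both the object count and the CPU count from 1,000 to 2,000 takes the cached total from 125,000 to 500,000 bytes, which is the quadratic growth mentioned above, while a 4,000-byte set per core against 2e9 bytes of RAM per core comes out at 0.0002 percent.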

well you assume you have a single copy of the whole system structure, I am not sure that would be the case, and while the memory per core is growing, the memory per thread is not growing much,... but anyway that is not the important point...

I agree that we shouldn't be wasteful, but the difference we're talking about here is in the noise.

ok

That was the reason I was advocating having a function returning the cpuset from an object (sparse cpuset would also be a solution).
Anyway the real issue here is the API I think.
I would say that the best solution is
- keep cpuset a structure (not just void*), so it can be just a void* or something more complex in the future without API changes
I'm not sure I parsed the above sentence properly -- I read it as advocating 2 different things. Can you explain?

yes you are right, I was unclear, I meant that I would pass a cpu_set struct by value (not always pass a pointer). If one wants to later migrate to passing just a pointer, then internally this struct can have just a single pointer as its only field.
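
A minimal sketch of that idea in C, assuming a struct whose only field is a pointer to heap storage. All names here (demo_cpuset_t and the demo_cpuset_* functions) are hypothetical and are not the hwloc API; the fixed size of 128 CPUs is also just an assumption for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

enum {
    DEMO_NCPUS = 128,   /* assumed capacity, for illustration only */
    DEMO_BITS_PER_WORD = sizeof(unsigned long) * CHAR_BIT,
    DEMO_NWORDS = (DEMO_NCPUS + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD
};

/* Hypothetical type: callers pass it by value and never touch the field. */
typedef struct demo_cpuset {
    unsigned long *bits;   /* the single pointer field */
} demo_cpuset_t;

/* Returns a set with all bits cleared; bits is NULL if allocation failed. */
demo_cpuset_t demo_cpuset_alloc(void)
{
    demo_cpuset_t set;
    set.bits = calloc(DEMO_NWORDS, sizeof *set.bits);
    return set;
}

/* The struct is copied, but the copy points at the same storage. */
void demo_cpuset_set(demo_cpuset_t set, unsigned cpu)
{
    assert(set.bits != NULL && cpu < (unsigned)DEMO_NCPUS);
    set.bits[cpu / DEMO_BITS_PER_WORD] |= 1UL << (cpu % DEMO_BITS_PER_WORD);
}

int demo_cpuset_isset(demo_cpuset_t set, unsigned cpu)
{
    assert(set.bits != NULL && cpu < (unsigned)DEMO_NCPUS);
    return (int)((set.bits[cpu / DEMO_BITS_PER_WORD]
                  >> (cpu % DEMO_BITS_PER_WORD)) & 1UL);
}

void demo_cpuset_free(demo_cpuset_t set)
{
    free(set.bits);
}
```

Because callers only ever hold a demo_cpuset_t, the storage behind it could later become sparse or dynamically sized without changing any function signature.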

- add functions to allocate/deallocate/copy it, and make it clear that these should be called on the cpusets returned by other functions (i.e. clarify ownership transfers).
Such functions would be necessary only if there are non-public members of the struct or if you want to deep copy the struct, right? They would also apply if we return opaque handles, not public structures.

indeed, if you alloc, and it is fixed size (no sparse structure) then one can just call free, but in general having a structure-specific free function just gives a lot more flexibility for the future (and a structure-specific copy is needed to copy structs of unknown size).
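
To illustrate why type-specific functions matter once the size is not fixed, here is a hedged C sketch of an alloc/dup/free trio for a set whose length is only known at run time. The demo_set_* names and the layout are hypothetical, not taken from hwloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-size set: a plain free() on the handle would leak
 * the words array, and a plain struct assignment would share it. */
typedef struct demo_set {
    size_t nwords;
    unsigned long *words;
} demo_set_t;

/* Allocates a zeroed set of nwords words; returns NULL on failure. */
demo_set_t *demo_set_alloc(size_t nwords)
{
    demo_set_t *set = malloc(sizeof *set);
    if (set == NULL)
        return NULL;
    set->words = calloc(nwords ? nwords : 1, sizeof *set->words);
    if (set->words == NULL) {
        free(set);
        return NULL;
    }
    set->nwords = nwords;
    return set;
}

/* Deep copy: the caller owns the result and must release it with
 * demo_set_free(). */
demo_set_t *demo_set_dup(const demo_set_t *src)
{
    assert(src != NULL);
    demo_set_t *copy = demo_set_alloc(src->nwords);
    if (copy != NULL)
        memcpy(copy->words, src->words, src->nwords * sizeof *src->words);
    return copy;
}

/* Releases both the storage and the handle; NULL is accepted. */
void demo_set_free(demo_set_t *set)
{
    if (set != NULL) {
        free(set->words);
        free(set);
    }
}
```

The ownership rule is then easy to state: whatever a function returns from demo_set_alloc() or demo_set_dup() is released with demo_set_free(), regardless of how the set is stored internally.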

- functions that are possibly inlined are ok (obviously changing them breaks the binary compatibility), but recompilation fixes them, and other languages can still use the non inline function that is part of the lib
The usual reason for inlining is a need for performance -- and I honestly think that we don't need it. So if the usual question for inlining is "why not?", I turn that question around and ask "if not for performance, why?". :-)


ok with me :)

- macros I don't like, they make binding to other languages more difficult, as one has to write either a thin glue layer, or duplicate the macro, which will not stay in sync with lib changes automatically (cpuset has some macros, but the structure is so simple that I just used another bit compatible type when binding to D).
Agreed. Macros = evil; should only be used where absolutely necessary.
To make the release quickly I think that just adding the requested functions (alloc/dealloc would be noops at the moment) would be good. Then in the future one can switch to dynamic or sparse cpuset without user visible changes (apart from recompilation).
Agreed; that is a good goal (switch to a new back-end type without needing to change user code).
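
A hedged sketch of what "noops at the moment" could look like in C for a fixed-size set. The demo_fixed_* names, the four-word layout, and the choice of a pointer argument for the setter are all assumptions made for this example, not hwloc's API.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

enum { DEMO_FIXED_WORD_BITS = sizeof(unsigned long) * CHAR_BIT };

/* Hypothetical fixed-size back-end: the bits live inside the struct. */
typedef struct demo_fixed_cpuset {
    unsigned long words[4];
} demo_fixed_cpuset_t;

/* "Allocation" only has to hand back an empty set. */
demo_fixed_cpuset_t demo_fixed_alloc(void)
{
    demo_fixed_cpuset_t set;
    memset(&set, 0, sizeof set);
    return set;
}

/* Copy is a plain struct copy with this back-end. */
demo_fixed_cpuset_t demo_fixed_copy(demo_fixed_cpuset_t src)
{
    return src;
}

/* Nothing to release yet: a no-op kept so callers already pair alloc/free. */
void demo_fixed_free(demo_fixed_cpuset_t set)
{
    (void)set;
}

void demo_fixed_set(demo_fixed_cpuset_t *set, unsigned cpu)
{
    assert(set != NULL && cpu < 4u * (unsigned)DEMO_FIXED_WORD_BITS);
    set->words[cpu / DEMO_FIXED_WORD_BITS] |= 1UL << (cpu % DEMO_FIXED_WORD_BITS);
}

int demo_fixed_isset(demo_fixed_cpuset_t set, unsigned cpu)
{
    assert(cpu < 4u * (unsigned)DEMO_FIXED_WORD_BITS);
    return (int)((set.words[cpu / DEMO_FIXED_WORD_BITS]
                  >> (cpu % DEMO_FIXED_WORD_BITS)) & 1UL);
}

/* User code goes through the functions only, never the fields. */
int demo_user_code(void)
{
    demo_fixed_cpuset_t a = demo_fixed_alloc();
    demo_fixed_set(&a, 5);
    demo_fixed_cpuset_t b = demo_fixed_copy(a);
    int ok = demo_fixed_isset(b, 5) && !demo_fixed_isset(b, 6);
    demo_fixed_free(b);
    demo_fixed_free(a);
    return ok;
}
```

Since demo_user_code() never looks inside the struct, the storage could move to the heap later and only the demo_fixed_* bodies would change.
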

yes, and I think that was the reason behind the initial question by Samuel on dynamic cpuset_t


Fawzi
