On Oct 3, 2009, at 8:21 AM, Fawzi Mohamed wrote:
OK, you are right that storing it in the struct might be overkill.
About performance I fully agree; about space, not so much. If you
really want to cache the cpusets for all objects, this still grows
quadratically and allocates a lot of objects.
I'm still not sure that I agree -- I still think we're just quibbling
over a few bytes here. It's commonplace to have 2GB RAM per core
these days; that number certainly isn't going to go down -- it's
likely that it's even going to go up.
So yes, if every process running on every core has a cpuset, you
multiply (for example) a 4k cpuset data structure times 1,000
processors (cores): 4MB. But consider that each of those 1,000
processors will have 2GB or more of RAM. That's 2 terabytes; who
cares about 4MB when you have 2TB? That's a factor of 500,000, or
nearly 6 orders of magnitude; put differently, 4MB is 0.0002 percent
of 2TB.
I agree that we shouldn't be wasteful, but the difference we're
talking about here is in the noise.
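
For anyone who wants to redo the arithmetic with other numbers, here is a
minimal sketch in C. The function names are made up for illustration, and
it assumes decimal units (4k = 4,000 bytes, 2GB = 2,000,000,000 bytes) and
exactly one cpuset per core, as in the example above.

```c
#include <stdint.h>

/* Total bytes spent on cpusets if each of `ncores` cores holds one
 * cpuset of `cpuset_bytes` bytes (hypothetical helper). */
uint64_t cpuset_total_bytes(uint64_t cpuset_bytes, uint64_t ncores)
{
    return cpuset_bytes * ncores;
}

/* Total RAM in the machine if each core comes with `ram_per_core` bytes. */
uint64_t ram_total_bytes(uint64_t ram_per_core, uint64_t ncores)
{
    return ram_per_core * ncores;
}

/* Cpuset storage expressed as a percentage of total RAM. */
double cpuset_overhead_percent(uint64_t cpuset_bytes,
                               uint64_t ram_per_core,
                               uint64_t ncores)
{
    double cpusets = (double)cpuset_total_bytes(cpuset_bytes, ncores);
    double ram = (double)ram_total_bytes(ram_per_core, ncores);
    return 100.0 * cpusets / ram;
}
```

Calling cpuset_overhead_percent(4000, 2000000000, 1000) reproduces the
4MB versus 2TB comparison. Note that with one cpuset per core the core
count cancels out of the percentage; only the per-core sizes matter.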
That was the reason I was advocating having a function returning the
cpuset from an object (sparse cpuset would also be a solution).
Anyway the real issue here is the API I think.
I would say that the best solution is
- keep cpuset a structure (not just void*), so it can be just a
void* or something more complex in the future without API changes
I'm not sure I parsed the above sentence properly -- I read it as
advocating 2 different things. Can you explain?
- add functions to allocate/deallocate/copy it, and make it clear
that these should be called on the cpusets returned by other
functions (i.e. clarify ownership transfers).
Such functions would be necessary only if there are non-public members
of the struct or if you want to deep copy the struct, right? They
would also apply if we return opaque handles, not public structures.
- functions that are possibly inlined are ok (obviously changing
them breaks binary compatibility, but recompilation fixes that),
and other languages can still use the non-inline function that is
part of the lib
The usual reason for inlining is a need for performance -- and I
honestly think that we don't need it. So if the usual question for
inlining is "why not?", I turn that question around and ask "if not
for performance, why?". :-)
- macros I don't like, they make binding to other languages more
difficult, as one has to either write a thin glue layer or
duplicate the macro, which will not stay in sync with lib changes
automatically (cpuset has some macros, but the structure is so
simple that I just used another bit-compatible type when binding to
D).
Agreed. Macros = evil; should only be used where absolutely necessary.
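
To make the binding problem concrete, here is a rough sketch of the same
bit test written once as a macro and once as a real function. All names
and the 1024-bit fixed layout are hypothetical, chosen only for
illustration; they are not the library's actual cpuset macros or layout.

```c
#include <limits.h>

/* Hypothetical fixed-size cpuset; NOT the library's real layout. */
#define DEMO_CPUSET_NBITS  1024
#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_CPUSET_NWORDS (DEMO_CPUSET_NBITS / DEMO_BITS_PER_WORD)

typedef struct demo_cpuset {
    unsigned long words[DEMO_CPUSET_NWORDS];
} demo_cpuset_t;

/* Macro version: exists only in the C header, so a binding for another
 * language has to re-implement it by hand and keep it in sync. */
#define DEMO_CPUSET_ISSET(set, cpu) \
    ((((set)->words[(cpu) / DEMO_BITS_PER_WORD]) >> \
      ((cpu) % DEMO_BITS_PER_WORD)) & 1UL)

/* Function version: a real symbol in the library that any language
 * with a C FFI can call. Out-of-range indexes report "not set". */
int demo_cpuset_isset(const demo_cpuset_t *set, unsigned cpu)
{
    if (cpu >= DEMO_CPUSET_NBITS)
        return 0;
    return (int)((set->words[cpu / DEMO_BITS_PER_WORD] >>
                  (cpu % DEMO_BITS_PER_WORD)) & 1UL);
}

/* Sets bit `cpu`; out-of-range indexes are ignored. */
void demo_cpuset_set(demo_cpuset_t *set, unsigned cpu)
{
    if (cpu >= DEMO_CPUSET_NBITS)
        return;
    set->words[cpu / DEMO_BITS_PER_WORD] |= 1UL << (cpu % DEMO_BITS_PER_WORD);
}
```

Both versions give the same answer for in-range indexes, but only the
function survives a change to the struct layout without every binding
having to be rewritten.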
To make the release quickly I think that just adding the requested
functions (alloc/dealloc would be no-ops at the moment) would be good.
Then in the future one can switch to a dynamic or sparse cpuset
without user-visible changes (apart from recompilation).
Agreed; that is a good goal (switch to a new back-end type without
needing to change user code).
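
As a rough sketch of what such functions might look like (hypothetical
names and a made-up fixed-size layout, not a proposal for the actual
API): callers only ever obtain, copy, and release a cpuset through these
three functions, so the struct contents could later change without
touching user code. In this sketch alloc is a plain calloc rather than a
true no-op; that is an assumption made to keep the example
self-contained.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not the library's API. */
#define SKETCH_NWORDS 16

/* Today: a fixed array. Later this could hold a pointer to a dynamic
 * or sparse representation; callers that only use the functions below
 * would need nothing more than a recompile. */
typedef struct sketch_cpuset {
    unsigned long words[SKETCH_NWORDS];
} sketch_cpuset_t;

/* Returns a zeroed cpuset owned by the caller, or NULL on failure. */
sketch_cpuset_t *sketch_cpuset_alloc(void)
{
    return calloc(1, sizeof(sketch_cpuset_t));
}

/* Releases a cpuset obtained from sketch_cpuset_alloc() or
 * sketch_cpuset_dup(). Passing NULL is allowed. */
void sketch_cpuset_free(sketch_cpuset_t *set)
{
    free(set);
}

/* Returns a caller-owned copy of `src`, or NULL if `src` is NULL or
 * allocation fails. This would become a deep copy once the struct
 * holds pointers. */
sketch_cpuset_t *sketch_cpuset_dup(const sketch_cpuset_t *src)
{
    sketch_cpuset_t *copy;

    if (src == NULL)
        return NULL;
    copy = malloc(sizeof(*copy));
    if (copy != NULL)
        memcpy(copy, src, sizeof(*copy));
    return copy;
}
```

The ownership rule is then easy to state: anything returned by alloc or
dup belongs to the caller and must be handed back to free exactly once.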
--
Jeff Squyres
jsquy...@cisco.com