On Oct 3, 2009, at 8:21 AM, Fawzi Mohamed wrote:

OK, you are right that storing it in the struct might be overkill. About performance I fully agree; about space, not so much. Especially if you really want to cache the cpusets for all objects, this still grows quadratically and allocates a lot of objects.

I'm still not sure that I agree -- I still think we're just quibbling over a few bytes here. It's commonplace to have 2GB RAM per core these days; that number certainly isn't going to go down -- it's likely that it's even going to go up.

So yes, if every process running on every core has a cpuset, you multiply (for example) a 4k cpuset data structure times 1,000 processors (cores): 4MB. But consider that each of those 1,000 processors will have 2GB or more of RAM. That's 2 terabytes; who cares about 4MB when you have 2TB? That's roughly 6 orders of magnitude difference; put differently, 4MB is 0.0002 percent of 2TB.
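To make that arithmetic easy to re-run with other numbers, here is a minimal sketch in C. The function names are made up for this email (they are not part of any library), and the sizes are the same round, decimal example figures as above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Total bytes spent on cpusets if each of ncores processes keeps one. */
static uint64_t cpuset_total_bytes(uint64_t cpuset_bytes, uint64_t ncores)
{
    return cpuset_bytes * ncores;
}

/* Share of the machine's total RAM taken by those cpusets, in percent. */
static double cpuset_overhead_percent(uint64_t cpuset_bytes,
                                      uint64_t ncores,
                                      uint64_t ram_per_core_bytes)
{
    assert(ncores > 0 && ram_per_core_bytes > 0);
    double cpusets = (double)cpuset_total_bytes(cpuset_bytes, ncores);
    double ram = (double)ncores * (double)ram_per_core_bytes;
    return 100.0 * cpusets / ram;
}
```

With a 4,000-byte cpuset, 1,000 cores and 2,000,000,000 bytes of RAM per core, the first function gives 4,000,000 bytes and the second gives about 0.0002 percent. Note that the core count cancels out of the percentage: it is just the cpuset size over the per-core RAM.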

I agree that we shouldn't be wasteful, but the difference we're talking about here is in the noise.

That was the reason I was advocating having a function returning the cpuset from an object (sparse cpuset would also be a solution).

Anyway the real issue here is the API I think.
I would say that the best solution is
- keep cpuset a structure (not just void*), so it can be just a void* or something more complex in the future without API changes

I'm not sure I parsed the above sentence properly -- I read it as advocating 2 different things. Can you explain?

- add functions to allocate/deallocate/copy it, and make it clear that these should be called on the cpusets returned by other functions (i.e. clarify ownership transfers).

Such functions would be necessary only if there are non-public members of the struct or if you want to deep copy the struct, right? They would also apply if we return opaque handles, not public structures.

- functions that are possibly inlined are OK (obviously changing them breaks binary compatibility, but recompilation fixes that), and other languages can still use the non-inline function that is part of the lib

The usual reason for inlining is a need for performance -- and I honestly think that we don't need it. So if the usual question for inlining is "why not?", I turn that question around and ask "if not for performance, why?". :-)

- macros I don't like: they make binding to other languages more difficult, as one has to either write a thin glue layer or duplicate the macro, which will not stay in sync with lib changes automatically (cpuset has some macros, but the structure is so simple that I just used another bit-compatible type when binding to D).

Agreed.  Macros = evil; should only be used where absolutely necessary.

To make the release quickly, I think that just adding the requested functions (alloc/dealloc would be no-ops at the moment) would be good. Then in the future one can switch to a dynamic or sparse cpuset without user-visible changes (apart from recompilation).


Agreed; that is a good goal (switch to a new back-end type without needing to change user code).
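
For concreteness, here is a rough sketch of the kind of interface being discussed: an opaque cpuset type plus alloc/free/copy functions with explicit ownership. All names (`xcpuset_t`, `xcpuset_alloc`, and so on) are hypothetical and are not the library's actual API. The fixed 1024-bit layout is also just an assumption, and the sketch heap-allocates for clarity where a first release could make alloc/free no-ops.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not an existing library API. */

/* A public header would expose only this forward declaration. */
typedef struct xcpuset xcpuset_t;

/* Private to the implementation: free to become dynamic or sparse later. */
enum {
    XCPUSET_NBITS = 1024,                                /* assumed capacity */
    XCPUSET_WORD_BITS = CHAR_BIT * sizeof(unsigned long),
    XCPUSET_NWORDS = XCPUSET_NBITS / XCPUSET_WORD_BITS
};

struct xcpuset {
    unsigned long words[XCPUSET_NWORDS];
};

/* Returns a new empty set owned by the caller, or NULL on failure. */
xcpuset_t *xcpuset_alloc(void)
{
    return calloc(1, sizeof(xcpuset_t));
}

/* Releases a set obtained from xcpuset_alloc() or xcpuset_dup(). */
void xcpuset_free(xcpuset_t *set)
{
    free(set);
}

/* Returns an independent copy owned by the caller, or NULL on failure. */
xcpuset_t *xcpuset_dup(const xcpuset_t *src)
{
    assert(src != NULL);
    xcpuset_t *copy = malloc(sizeof(*copy));
    if (copy != NULL)
        memcpy(copy, src, sizeof(*copy));
    return copy;
}

/* Adds cpu to the set; returns 0 on success, -1 if cpu is out of range. */
int xcpuset_set(xcpuset_t *set, unsigned cpu)
{
    if (cpu >= (unsigned)XCPUSET_NBITS)
        return -1;
    set->words[cpu / XCPUSET_WORD_BITS] |= 1UL << (cpu % XCPUSET_WORD_BITS);
    return 0;
}

/* Returns 1 if cpu is in the set, 0 otherwise (including out of range). */
int xcpuset_isset(const xcpuset_t *set, unsigned cpu)
{
    if (cpu >= (unsigned)XCPUSET_NBITS)
        return 0;
    return (int)((set->words[cpu / XCPUSET_WORD_BITS]
                  >> (cpu % XCPUSET_WORD_BITS)) & 1UL);
}
```

Since callers only ever hold an `xcpuset_t *` and go through real (non-macro) functions, the struct body could change without touching user code, and bindings for other languages can call the same exported symbols.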

--
Jeff Squyres
jsquy...@cisco.com
