On 5-Oct-09, at 14:27, Jeff Squyres wrote:

On Oct 3, 2009, at 8:21 AM, Fawzi Mohamed wrote:

OK, you are right that storing it in the struct might be overkill. On performance I fully agree; on space, not so much: if you really want to cache the cpuset for every object, this still grows quadratically and allocates a lot of objects.

I'm still not sure that I agree -- I still think we're just quibbling over a few bytes here. It's commonplace to have 2GB RAM per core these days; that number certainly isn't going to go down -- it's likely that it's even going to go up.

So yes, if every process running on every core has a cpuset, you multiply (for example) a 4k cpuset data structure times 1,000 processors (cores): 4MB. But consider that each of those 1,000 processors will have 2GB or more of RAM. That's 2 terabytes; who cares about 4MB when you have 2TB? That's 6 orders of magnitude difference; put differently, 4MB is 0.0002 percent of 2TB.
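
As a quick check of the arithmetic above, here is a minimal sketch in C. The function names are made up for illustration, and the 4k and 2GB figures are the illustrative ones from this message, taken as binary units.

```c
#include <stdio.h>

/* Fraction of total RAM taken by one cpuset per core.
   The core count cancels out: (n * cpuset) / (n * ram_per_core). */
double cpuset_fraction(double cpuset_bytes, double ram_bytes_per_core)
{
    return cpuset_bytes / ram_bytes_per_core;
}

/* Prints the overhead as a percentage for the figures in the message. */
void print_cpuset_overhead(void)
{
    double f = cpuset_fraction(4096.0, 2.0 * 1024 * 1024 * 1024);
    printf("cpusets use %.5f%% of RAM\n", f * 100.0);
}
```

The ratio is about 2 parts per million whatever the number of cores, as long as each core brings its own RAM and its own cpuset.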

Well, you assume you have a single copy of the whole system structure; I am not sure that would be the case. And while the memory per core is growing, the memory per thread is not growing much... but anyway, that is not the important point.

I agree that we shouldn't be wasteful, but the difference we're talking about here is in the noise.

ok

That was the reason I was advocating having a function returning the cpuset from an object (sparse cpuset would also be a solution).

Anyway, I think the real issue here is the API.
I would say that the best solution is
- keep cpuset a structure (not just void*), so it can be just a void* or something more complex in the future without API changes

I'm not sure I parsed the above sentence properly -- I read it as advocating 2 different things. Can you explain?

Yes, you are right, I was unclear. I meant that I would pass a cpu_set struct by value (not always pass a pointer). If one later wants to migrate to passing just a pointer, then internally this struct can have a single pointer as its only field.
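
To make the by-value idea concrete, here is a rough sketch in C. The type and function names are hypothetical and are not the library's actual API; the fixed capacity of four words is also just an assumption for the example.

```c
#include <limits.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define NWORDS 4  /* assumed fixed capacity, for illustration only */

/* Variant 1 (hypothetical): the bits live inside the struct itself. */
typedef struct { unsigned long bits[NWORDS]; } cpuset_inline_t;

/* Variant 2 (hypothetical): still a struct passed by value, but its
   only field is a pointer to storage owned elsewhere. */
typedef struct { unsigned long *bits; } cpuset_handle_t;

/* Both accessors take the struct by value, so call sites look the same.
   The caller must pass a cpu index below NWORDS * WORD_BITS. */
int inline_isset(cpuset_inline_t s, unsigned cpu)
{
    return (int)((s.bits[cpu / WORD_BITS] >> (cpu % WORD_BITS)) & 1UL);
}

int handle_isset(cpuset_handle_t s, unsigned cpu)
{
    return (int)((s.bits[cpu / WORD_BITS] >> (cpu % WORD_BITS)) & 1UL);
}
```

One caveat with this design: assigning a variant 1 struct copies the bits, while assigning a variant 2 struct copies only the pointer, so the two copies share storage. That difference is one reason an explicit copy function is useful.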

- add functions to allocate/deallocate/copy it, and make it clear that these should be called on the cpusets returned by other functions (i.e. clarify ownership transfers).

Such functions would be necessary only if there are non-public members of the struct or if you want to deep copy the struct, right? They would also apply if we return opaque handles, not public structures.

Indeed: if you alloc and it is fixed size (no sparse structure), then one can just call free. But in general a structure-specific free function gives a lot more flexibility for the future, and a structure-specific function is needed to copy structs of unknown size.
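
A minimal sketch of what such structure-specific functions could look like for a variable-size set. The dyn_cpuset_* names and the layout are assumptions for illustration, not the library's API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-size cpuset: a plain free() on the struct
   would leak the word array, hence the dedicated functions. */
typedef struct {
    size_t nwords;
    unsigned long *words;
} dyn_cpuset_t;

/* Returns a zeroed set owned by the caller, or NULL on failure. */
dyn_cpuset_t *dyn_cpuset_alloc(size_t nwords)
{
    dyn_cpuset_t *s = malloc(sizeof *s);
    if (!s)
        return NULL;
    /* Allocate at least one word so a NULL result always means failure. */
    s->words = calloc(nwords ? nwords : 1, sizeof *s->words);
    if (!s->words) {
        free(s);
        return NULL;
    }
    s->nwords = nwords;
    return s;
}

/* Deep copy: the result is independent of src and owned by the caller. */
dyn_cpuset_t *dyn_cpuset_copy(const dyn_cpuset_t *src)
{
    dyn_cpuset_t *dst = dyn_cpuset_alloc(src->nwords);
    if (!dst)
        return NULL;
    memcpy(dst->words, src->words, src->nwords * sizeof *src->words);
    return dst;
}

/* Releases a set obtained from dyn_cpuset_alloc or dyn_cpuset_copy. */
void dyn_cpuset_free(dyn_cpuset_t *s)
{
    if (!s)
        return;
    free(s->words);
    free(s);
}
```

The ownership rule here is simply that whatever alloc or copy returns must eventually be passed to the matching free function exactly once.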

- functions that are possibly inlined are ok (obviously changing them breaks binary compatibility, but recompilation fixes that), and other languages can still use the non-inline function that is part of the lib

The usual reason for inlining is a need for performance -- and I honestly think that we don't need it. So if the usual question for inlining is "why not?", I turn that question around and ask "if not for performance, why?". :-)

ok with me :)
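
For reference, the arrangement from the bullet above (an inlined version for C callers plus a real exported symbol for other languages) might be sketched as follows. The ex_cpuset names and the single-word layout are hypothetical.

```c
#include <limits.h>

/* --- ex_cpuset.h (hypothetical header) --- */
typedef struct { unsigned long bits; } ex_cpuset_t;

/* Inlined version: compiled into every C caller, so changing it
   requires recompiling those callers. */
static inline int ex_cpuset_isset_inline(ex_cpuset_t s, unsigned cpu)
{
    /* Out-of-range indexes report "not set" instead of shifting too far. */
    if (cpu >= sizeof(unsigned long) * CHAR_BIT)
        return 0;
    return (int)((s.bits >> cpu) & 1UL);
}

/* Exported symbol, declared in the header and defined once in the lib. */
int ex_cpuset_isset(ex_cpuset_t s, unsigned cpu);

/* --- ex_cpuset.c (hypothetical library source) --- */
int ex_cpuset_isset(ex_cpuset_t s, unsigned cpu)
{
    return ex_cpuset_isset_inline(s, cpu);
}
```

A binding for another language would link against ex_cpuset_isset only, so it keeps working across layout changes as long as the library itself is rebuilt.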

- I don't like macros: they make binding to other languages more difficult, as one has to either write a thin glue layer or duplicate the macro, which will not automatically stay in sync with lib changes (cpuset has some macros, but the structure is so simple that I just used another bit-compatible type when binding to D).

Agreed. Macros = evil; should only be used where absolutely necessary.

To get the release out quickly, I think that just adding the requested functions (alloc/dealloc would be no-ops at the moment) would be good. Then in the future one can switch to a dynamic or sparse cpuset without user-visible changes (apart from recompilation).


Agreed; that is a good goal (switch to a new back-end type without needing to change user code).
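
A rough sketch of how the no-op versions could look while the cpuset is still a fixed-size struct. The fixed_cpuset_* names and the four-word size are assumptions for illustration only.

```c
#include <string.h>

/* Hypothetical current layout: fixed-size, no heap storage. */
typedef struct { unsigned long bits[4]; } fixed_cpuset_t;

/* "Allocation" only has to hand back an empty set today. */
fixed_cpuset_t fixed_cpuset_alloc(void)
{
    fixed_cpuset_t s;
    memset(&s, 0, sizeof s);
    return s;
}

/* A plain struct copy is a complete copy while the layout is fixed-size. */
fixed_cpuset_t fixed_cpuset_copy(fixed_cpuset_t src)
{
    return src;
}

/* Nothing to release yet; a dynamic or sparse back end would free here. */
void fixed_cpuset_free(fixed_cpuset_t s)
{
    (void)s;
}
```

User code that already pairs every alloc or copy with a free would then need only a recompile if the struct later gained a pointer to dynamic storage.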

Yes, and I think that was the reason behind Samuel's initial question about a dynamic cpuset_t.

Fawzi
