Re: working with all threads of a 'process'

Justin Cinkelj Sun, 03 Jul 2016 22:57:59 -0700


On 07/03/2016 08:52 AM, Nadav Har'El wrote:

On Thu, Jun 30, 2016 at 9:18 PM, Justin Cinkelj<[email protected] <mailto:[email protected]>> wrote:
    I tried to do that, and sent patch. I added comment that it might
    be possible that thread returned by find_by_id() get deleted while
    we are using it. It unlikely/rare etc, but just to mention.
This is true... There's a way to fix that, by holding thread_map_mutexwhile using the thread. We could replace find_by_id() by a methodwhich takes an std::function and runs this function with the mutex held.
HOWEVER, I'm not sure this is a real problem; Consider that we havedozens of functions which take a sched::thread pointer (not an integerid), and most of them will crash if the thread dies in the middle oftheir work. So I'm not sure it's very important to fix the specificproblem of this getaffinity function.
    Another little detail - std::unordered_map<unsigned long, thread
    *> thread_map used ULONG, but thread._id (and pid_t) are UINT - 8
    vs 4 B. Maybe should that be unified? Or is there a reason for
    ULONG usage?
I think this was for historic reasons - historically OSv had an"unsigned long" for thread ids, because we anticipated someone mightwant to go through more than 4 billlion short-lived threads during thelifetime of the VM.
However, at some point we wanted to be more Linux compatible, withLinux having 4-byte process ids, so we switched to having an "unsignedint" _id for threads. The constructor thread::thread() now needs towrap around after finishing 4 billion ids, and needs to check the nextid if not already in use (which we didn't need to do when having8-byte ids and never wrapping around).
So you're right, thread_map's key should be unsigned int, NOT unsignedlong. Thanks for noticing, I'll make a patch for that.

Thank you for explaining. I wasn't sure if long vs int has some purposeor is just an 'coincidence'.


    However, if you also need sched_setaffinity(), this is more
    difficult. Pinning the current thread, and pinning some other
    random thread, have conceptual differences (e.g., consider
    pinning a thread which is already running right now on the wrong
    CPU), and the latter hasn't been done yet. If you need it, please
    checkout https://github.com/cloudius-systems/osv/issues/520 and
    perhaps explain there why you need it.

    Yes, that's on whishlist too. I will read links, and hopefully
    understand some bits.

Doing sched_setaffinity() for other threads will be considerably moretricky than the other small changes you did here, so I wonder if it'sreally necessary: Can't each thread pin itself? That we already support.

It's a bit complicated I guess. The simplest Open MPI app doesn't startthreads on its own, and for those I can add code at start of each threadroutine. The problem is, at that time I don't know to which cpu threadshould be pinned (we start communication thread, and info about requiredpinning comes later over TCP communication channel).

More complex Open MPI apps start additional threads, often even beforetheir Open MPI part gets initialized. For that case, all old threads arefirst pinned. The one created later are probably pinned as whole processis pinned (I guess its so). The 'pin all old threads' seems to be very'intentional' - code loops over all thread a few times to recheck/re-pinthem, to check if new threads as created just at the current moment etc;I guess they had quite some problems with some threads randomly slippingunnoticed from pinning part.


The OpenFOAM so far goes under the simple app case;

--
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Re: working with all threads of a 'process'

Reply via email to