On 07/03/2016 08:52 AM, Nadav Har'El wrote:
On Thu, Jun 30, 2016 at 9:18 PM, Justin Cinkelj
<[email protected] <mailto:[email protected]>> wrote:
I tried to do that, and sent patch. I added comment that it might
be possible that thread returned by find_by_id() get deleted while
we are using it. It unlikely/rare etc, but just to mention.
This is true... There's a way to fix that, by holding thread_map_mutex
while using the thread. We could replace find_by_id() by a method
which takes an std::function and runs this function with the mutex held.
HOWEVER, I'm not sure this is a real problem; Consider that we have
dozens of functions which take a sched::thread pointer (not an integer
id), and most of them will crash if the thread dies in the middle of
their work. So I'm not sure it's very important to fix the specific
problem of this getaffinity function.
Another little detail - std::unordered_map<unsigned long, thread
*> thread_map used ULONG, but thread._id (and pid_t) are UINT - 8
vs 4 B. Maybe should that be unified? Or is there a reason for
ULONG usage?
I think this was for historic reasons - historically OSv had an
"unsigned long" for thread ids, because we anticipated someone might
want to go through more than 4 billlion short-lived threads during the
lifetime of the VM.
However, at some point we wanted to be more Linux compatible, with
Linux having 4-byte process ids, so we switched to having an "unsigned
int" _id for threads. The constructor thread::thread() now needs to
wrap around after finishing 4 billion ids, and needs to check the next
id if not already in use (which we didn't need to do when having
8-byte ids and never wrapping around).
So you're right, thread_map's key should be unsigned int, NOT unsigned
long. Thanks for noticing, I'll make a patch for that.
Thank you for explaining. I wasn't sure if long vs int has some purpose
or is just an 'coincidence'.
However, if you also need sched_setaffinity(), this is more
difficult. Pinning the current thread, and pinning some other
random thread, have conceptual differences (e.g., consider
pinning a thread which is already running right now on the wrong
CPU), and the latter hasn't been done yet. If you need it, please
checkout https://github.com/cloudius-systems/osv/issues/520 and
perhaps explain there why you need it.
Yes, that's on whishlist too. I will read links, and hopefully
understand some bits.
Doing sched_setaffinity() for other threads will be considerably more
tricky than the other small changes you did here, so I wonder if it's
really necessary: Can't each thread pin itself? That we already support.
It's a bit complicated I guess. The simplest Open MPI app doesn't start
threads on its own, and for those I can add code at start of each thread
routine. The problem is, at that time I don't know to which cpu thread
should be pinned (we start communication thread, and info about required
pinning comes later over TCP communication channel).
More complex Open MPI apps start additional threads, often even before
their Open MPI part gets initialized. For that case, all old threads are
first pinned. The one created later are probably pinned as whole process
is pinned (I guess its so). The 'pin all old threads' seems to be very
'intentional' - code loops over all thread a few times to recheck/re-pin
them, to check if new threads as created just at the current moment etc;
I guess they had quite some problems with some threads randomly slipping
unnoticed from pinning part.
The OpenFOAM so far goes under the simple app case;
--
You received this message because you are subscribed to the Google Groups "OSv
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.