donal djeo <[email protected]> added the comment:
I'm getting random segfaults with your patch (even with the last one), pretty
much everywhere malloc or free is called.
After skimming through the code, I think the problem is due to gil_last_holder:
In drop_gil and take_gil, you dereference gil_last_holder->cpu_bound, but it
might very well happen that gil_last_holder points to a thread that has been
deleted (through tstate_delete_common). Merely reading through the stale
pointer is not that risky, because there's a high chance that the address is
still valid, but in drop_gil, you do this:
/* Make the thread as CPU-bound or not depending on whether it was forced off */
gil_last_holder->cpu_bound = gil_drop_request;
Here, if the thread has been deleted in the meantime, you end up writing to a
random location on the heap, probably corrupting malloc's bookkeeping data,
which would explain why I sometimes get segfaults later on in unrelated
malloc() or free() calls.
I looked at it really quickly though, so please forgive me if I missed
something obvious ;-)
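To illustrate the stale-pointer problem described above, here is a minimal
sketch of one possible fix, not the patch's actual code. It assumes a
hypothetical thread_state struct standing in for the real thread state, a
hypothetical delete_thread_state() playing the role of tstate_delete_common,
and a single mutex guarding gil_last_holder. The idea is simply to clear the
pointer before the memory is freed, and to check it before writing:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <pthread.h>

/* Hypothetical stand-in for the real thread state; only the field
 * discussed in this comment is modelled. */
typedef struct thread_state {
    int cpu_bound;
} thread_state;

/* Assumption: one mutex protects every access to gil_last_holder. */
static pthread_mutex_t gil_mutex = PTHREAD_MUTEX_INITIALIZER;
static thread_state *gil_last_holder = NULL;

/* Remember which thread held the GIL last. */
static void set_last_holder(thread_state *ts) {
    pthread_mutex_lock(&gil_mutex);
    gil_last_holder = ts;
    pthread_mutex_unlock(&gil_mutex);
}

/* Mark the last holder as CPU-bound or not, but only if it still exists.
 * Returns 1 if the flag was written, 0 if there was no live holder. */
static int mark_last_holder(int forced_off) {
    int written = 0;
    pthread_mutex_lock(&gil_mutex);
    if (gil_last_holder != NULL) {
        gil_last_holder->cpu_bound = forced_off;
        written = 1;
    }
    pthread_mutex_unlock(&gil_mutex);
    return written;
}

/* Hypothetical deletion hook: forget the pointer before freeing the
 * memory, so that nobody can write through it afterwards. */
static void delete_thread_state(thread_state *ts) {
    assert(ts != NULL);
    pthread_mutex_lock(&gil_mutex);
    if (gil_last_holder == ts)
        gil_last_holder = NULL;
    pthread_mutex_unlock(&gil_mutex);
    free(ts);
}
```

With something along these lines, a thread that exits between two GIL
switches leaves gil_last_holder set to NULL instead of dangling, and
mark_last_holder() becomes a no-op rather than a write into freed memory.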
@nirai
I have some more remarks on your patch:
- /* Diff timestamp capping results to protect against clock differences
* between cores. */
_LOCAL(long double) _bfs_diff_ts(long double ts1, long double ts0) {
I'm not sure I understand. You can have problems with multiple cores when
reading the TSC register directly, but that doesn't affect gettimeofday.
gettimeofday should be reliable and accurate (unless the OS is broken of
course); the only issue is that since it's wall clock time, if a process like
ntpd is running, then you'll run into problems.
- pretty much all your variables are declared as volatile, but volatile was
never meant as a thread-synchronization primitive. Since your variables are
protected by mutexes, you already have all necessary memory barriers and
synchronization, so volatile just prevents optimization.
- you use some functions just to perform a comparison or subtraction, maybe it
would be better to just remove those functions and perform the
subtractions/comparisons inline (you declared the functions inline but there's
no guarantee that the compiler will honor it).
- did you experiment with the time slice? I tried some higher values and got
better results, without penalizing the latency. Maybe it could be interesting
to look at it in more detail (and on various platforms).
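
Regarding the timestamp remark above: one way to sidestep wall-clock
adjustments would be to read a monotonic clock instead of gettimeofday. The
following is only a sketch under the assumption that POSIX clock_gettime with
CLOCK_MONOTONIC is available on the target platform; monotonic_seconds() and
elapsed_seconds() are hypothetical helper names, not functions from the patch:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Current time in seconds from a clock that is not affected by
 * wall-clock steps. Returns a negative value if the clock cannot be read. */
static long double monotonic_seconds(void) {
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1.0L;
    return (long double)ts.tv_sec + (long double)ts.tv_nsec / 1e9L;
}

/* Difference between two monotonic timestamps. Since the clock never
 * goes backwards, a plain subtraction is enough and no capping is needed. */
static long double elapsed_seconds(long double ts1, long double ts0) {
    assert(ts1 >= ts0);
    return ts1 - ts0;
}
```

Whether this is usable depends on the platforms the patch has to support, so
gettimeofday would probably have to remain as a fallback.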
----------
components: +None -Interpreter Core
nosy: +donaldjeo
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue7946>
_______________________________________