2009/2/11 Neil Jerram <n...@ossau.uklinux.net>:
> Linas Vepstas <linasveps...@gmail.com> writes:
>
>> Err, sort of, yes, unless I misunderstand. Guile 1.8 makes
>> a certain basic assumption that is splattered throughout
>> the code; it rather intentionally re-orders the order in which
>> one of the locks is taken. If I remember correctly, its the
>> "in guile mode" lock. So if you just go looking for locks
>> that are released out-of-order, you'll find lots of these.
>
> Yes, I think I understand this now (having seen it myself). The
> pattern is
>
> - thread holding its heap_mutex - which is the normal state in guile

Yes, that's the one.

> That in itself doesn't actually cause an ordering problem, but then
> the thread releases the other mutex without releasing the heap mutex
> first - which is perceived (by helgrind at least) as a problem.
>
> (Is something like this actually _ever_ a problem? If locks are
> always _acquired_ in the right order, how can the order of _releasing_
> ever cause a problem?)

Yes, it can be a problem; I don't want to dream up some particular
scenario (this stuff destroys brain cells) ... but I do vaguely remember
skimming some Wikipedia article on locking that discussed this. I think
the scenario involves three locks, though. This is why helgrind checks
lock order -- it's one of the locking problems it can actually detect.

However, for the case of guile, the heap mutex is not visible to
anything that isn't guile, and thus it's safe in this particular case.
If there were outside users, things would be different.

> The other thing to bear in mind is that 99% of this will just
> evaporate if we move to BDW-GC for 2.0.x; so - assuming we do end up
> doing that - it makes sense to take a slightly more pragmatic approach
> than normal for 1.8.x.

Sure. As a reminder ... the only real problem that remained was a race
to update some hash table, when define was being used from several
threads. *That's* the bug that needs attention (but is hard to fix).

> [1] I am only running a basic startup test, though: "valgrind
> --tool=helgrind guile -q <<EOF". Were you running something a lot
> more complex than that?

I had written some simple test case, which I think sprouted a bunch of
threads, and then did simple scheme things in each .. e.g. just adding
numbers, or whatever. I'm attaching some kind of simple test case to
this email; however, it is very hacked, so I don't know if it actually
will find bugs, and it probably doesn't do what it claims to do. I
provide it only as a short-cut for creating a new test case.

...

--linas

/**
 * Guile threading bug/unexpected behaviour.
 *
 * Two posix threads are created. The first thread runs, and defines
 * something. The second thread tries to access the thing defined by
 * the first. But, due to some "unexpected" threading behaviour, the
 * things defined in the first thread are not visible to the second.
 * I'm pretty convinced this is a bug, and am hunting it down now.
 *
 * The printed output of this program, for me, under guile1.8.5, is
 *
 *   HEllo, thread one is running
 *   Goodbye, thread one is exiting
 *   HEllo, thread two is running
 *   ERROR: Unbound variable: x
 *   Goodbye, thread two is exiting
 *   Main is exiting
 *
 * To get a better idea of what's going on, it seems to have something
 * to do with the current module. By turning on print debugging, the
 * output, for me, becomes:
 *
 *   HEllo, thread one is running
 *   thread one had output:
 *   the root module is #<module (guile) f73f9e80>
 *   the current module is #<directory (guile-user) f73fc600>
 *
 *   Goodbye, thread one is exiting
 *   HEllo, thread two is running
 *   thread two had output:
 *   the root module is #<module (guile) f73f9e80>
 *   the current module is #<module (guile) f73f9e80>
 *
 *   ERROR: Unbound variable: x
 *   Goodbye, thread two is exiting
 *   Main is exiting
 */

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */
#include <libguile.h>

/* String port shared by all threads; created in scm_one. */
SCM outport;

/* Print the root module and the current module as seen by this thread. */
void prtdbg(const char * id)
{
#define SHOW_STUFF 1
#ifdef SHOW_STUFF
	char buff[100];
	/* snprintf so that a long id cannot overflow buff */
	snprintf(buff, sizeof(buff), "(display \"duuude id %s\\n\")\n", id);
	scm_c_eval_string (buff);
	scm_c_eval_string ("(display \"the root module is \")\n");
	scm_c_eval_string ("(display the-root-module)\n");
	scm_c_eval_string ("(newline)\n");
	scm_c_eval_string ("(display \"the current module is \")\n");
	scm_c_eval_string ("(display (current-module))\n");
	scm_c_eval_string ("(newline)\n");

	SCM out = scm_get_output_string(outport);
	char * str = scm_to_locale_string(out);
	scm_truncate_file(outport, scm_from_uint16(0));
	printf("%s had output:\n%s\n", id, str);
	free(str);   /* scm_to_locale_string returns malloc'ed memory */
#endif
}

/* Creates the shared output port and defines x. */
void * scm_one (void *p)
{
	outport = scm_open_output_string();
	scm_set_current_output_port(outport);
	prtdbg("thread one");
	// scm_c_eval_string ("(set-current-module the-root-module)\n");
	// prtdbg("thread one again");
	scm_c_eval_string ("(define x \"asddf\")\n");
	return NULL;
}

void * scm_two (void *p)
{
	scm_set_current_output_port(outport);
	prtdbg("thread two");
	// scm_c_eval_string ("(display x)\n");
	scm_c_eval_string ("(display \"duuuuuuuuuuuuauuuuuuuuuuuude\")\n");
	scm_c_eval_string ("(define (fact n) (if (= n 1) 1 (* n (fact (- n 1)))))\n"
	                   "(display \"wahazzzuppp\\n\")\n"
	                   "(display (fact 369))\n"
	                   "(display \"yeah\\n\")\n");
	return NULL;
}

/* Tries to read the x that was defined in scm_one. */
void * scm_three (void *p)
{
	scm_set_current_output_port(outport);
	prtdbg("thread three");
	scm_c_eval_string ("(display x)\n");
	return NULL;
}

/* Same as scm_three, but without setting the current output port. */
void * scm_four (void *p)
{
	prtdbg("thread four");
	scm_c_eval_string ("(display x)\n");
	return NULL;
}

static void * thread_zero (void * arg)
{
	printf("Thread zeroooooo \n");
	scm_with_guile(scm_one, NULL);
	printf("Thread exit \n");
	return NULL;
}

static void * thread_one (void * arg)
{
	printf("HEllo, thread one is running\n");
	pthread_attr_t attr;
	pthread_t t0;
	pthread_attr_init(&attr);
	// pthread_create(&t0, &attr, thread_zero, NULL);
	scm_with_guile(scm_one, NULL);
	printf("Goodbye, thread one is exiting\n");
	return NULL;
}

static void * thread_two (void * arg)
{
	printf("HEllo, thread two is running\n");
	scm_with_guile(scm_two, NULL);
	printf("Goodbye, thread two is exiting\n");
	return NULL;
}

static void * thread_three (void * arg)
{
	printf("HEllo, thread three is running\n");
	scm_with_guile(scm_three, NULL);
	printf("Goodbye, thread three is exiting\n");
	return NULL;
}

static void * thread_four (void * arg)
{
	printf("HEllo, thread four is running\n");
	scm_with_guile(scm_four, NULL);
	printf("Goodbye, thread four is exiting\n");
	return NULL;
}

int main(void)
{
	int rc;
	pthread_attr_t attr;
	pthread_t t1, t2, t3, t4;

	pthread_attr_init(&attr);

	/* The sleeps crudely sequence the threads, one after the other. */
	rc = pthread_create(&t1, &attr, thread_one, NULL);
	if(rc)
	{
		fprintf(stderr, "Fatal error: can't create thread one\n");
		exit(1);
	}
	sleep(1);

	rc = pthread_create(&t2, &attr, thread_two, NULL);
	if(rc)
	{
		fprintf(stderr, "Fatal error: can't create thread two\n");
		exit(1);
	}
	sleep(1);

	rc = pthread_create(&t3, &attr, thread_three, NULL);
	if(rc)
	{
		fprintf(stderr, "Fatal error: can't create thread three\n");
		exit(1);
	}
	sleep(1);

	rc = pthread_create(&t4, &attr, thread_four, NULL);
	if(rc)
	{
		fprintf(stderr, "Fatal error: can't create thread four\n");
		exit(1);
	}
	sleep(5);

	printf("Main is exiting\n");
	return 0;
}