2008/11/9 Ludovic Courtès <[EMAIL PROTECTED]>:
> Hi,
>
> "Linas Vepstas" <[EMAIL PROTECTED]> writes:
>
>> Anyway, I have an even simpler variant, with only *one*
>> thread deadlocked in gc. Here's the scenario:
>>
>> thread A:
>> scm_init_guile();
>> does some other stuff, then
>> goes to sleep in select, waiting on socket input
>> (as expected).
>>
>> thread B:
>> scm_init_guile() -- hangs here.
>
> Just to be sure: do you use raw select(2), `scm_std_select ()', or
> select(2) within `scm_without_guile ()'?

Raw select. I don't actually use it; it's in some other library that
the project uses. Wrapping it with a guile function is not an option;
it's not code that we (the project) can even control.

scm_without_guile is also clearly not an option, as it requires an
inversion of control. In my app, guile is a bit player, just another
small guy inside a much larger framework, and so anything invasive
is just not possible.

Similar remarks for scm_pthread_mutex_lock(), etc. It's rather
ludicrous to even suggest that such wrappers be used -- typically,
such things are not ever within reach of the developer.

> The first option would lead to the situation you're observing because
> Guile's GC requires thread to cooperate, which is why
> `scm_without_guile' exists (info "(guile) Blocking").

Yes, well, I eventually figured it out. The very first sentence of
http://www.gnu.org/software/guile/manual/html_node/Blocking.html
says it all:

"A thread must not block outside of a libguile function while it is
in guile mode."

Unfortunately, this sentence is completely missing from the section
called "5.3 Initializing guile", which offers no hint of this
behaviour. Similarly, there is no hint of this in section 4.3.5
"Multi-threading". Instead, I see sentences that say things like

"a thread promises to reach a safe point reasonably frequently
(see Asynchronous Signals)"

What the heck is "a safe point"? Who is making the "promise"? Guile
or the user app? What the heck do "safe points" have to do with
signals? So I think that section is trying to warn about the
deadlock, but fails utterly to actually come out and say it ....
the documentation on this topic needs an overhaul.

I found a simple solution that works for me: just wrap calls to
scm_c_eval_string() with calls to scm_with_guile(). That is, I
basically call

   scm_with_guile(scm_c_eval_string, expr)

and so am *never* in "guile mode", except while evaluating an
expression.

This solution is "obvious", once one understands it ... but again,
the documentation doesn't even hint at this: the entry for
scm_c_eval_string gives no hint that wrapping it with scm_with_guile
is going to be the "typical" use case. (Well, I actually wrap it in
scm_c_catch first ... which is also a whole under-documented area.)

I suppose I could volunteer to provide updated documentation,
if someone can hold my hand during its creation.

--linas
