Hi Mark, Mark H Weaver <[email protected]> skribis:
> [email protected] (Ludovic Courtès) writes: [...] >>> Indeed, we need something to ensure that the compiler won't reorder >>> these writes. My recommendation would be to use the 'volatile' >>> qualifier in the declarations of both the 'fp' field of 'struct scm_vm' >>> and the stack element modified by SCM_FRAME_SET_DYNAMIC_LINK. >>> >>> Maybe something like this: >>> >>> diff --git a/libguile/frames.h b/libguile/frames.h >>> index ef2db3df5..8554f886b 100644 >>> --- a/libguile/frames.h >>> +++ b/libguile/frames.h >>> @@ -89,6 +89,7 @@ >>> union scm_vm_stack_element >>> { >>> scm_t_uintptr as_uint; >>> + volatile scm_t_uintptr as_volatile_uint; >> >> [...] >> >>> +#define SCM_FRAME_SET_DYNAMIC_LINK(fp, dl) ((fp)[1].as_volatile_uint = >>> ((dl) - (fp))) >> >> I’m not sure this is exactly what we need. >> >> What I had in mind, to make sure the vp->fp assignment really happens >> after the SCM_FRAME_SET_DYNAMIC_LINK, was to do something like this: >> >> SCM_FRAME_SET_RETURN_ADDRESS (new_fp, …); >> SCM_FRAME_SET_DYNAMIC_LINK (new_fp, …); >> >> asm volatile ("" : : : "memory"); >> >> vp->fp = new_fp; >> >> I think that more accurately reflects what we want, WDYT? >> >> It’s GCC-specific though (I don’t think Clang implements ‘asm’ in a >> compatible way, does it?) and I suppose we’d have to ignore the non-GCC >> case. > > Right, the problem is that the "asm volatile" solution is not portable. > > To my mind, it's fine to optionally use GCC extensions for improved > performance, but I think we should ensure that Guile works reliably when > compiled with any conforming C compiler. I agree, of course (that said, most compilers implement most GCC extensions nowadays, but ‘asm’ is probably an exception). > What problem do you see with my proposed portable solution? If I > understand C99 section 5.1.2.3 paragraph 2 correctly, compilers are not > permitted to reorder accesses to volatile objects as long as there is a > sequence point between them. My understand is that what you propose is (almost*) equivalent to the asm trick, provided SCM_FRAME_SET_DYNAMIC_LINK is the last statement before the vp->fp assignment. So we’d have to make sure we keep things in this order, right? [*] The ‘volatile’ qualifier on the field does more than just an instruction reordering barrier: AIUI, it forces the compiler to emit a load and store instruction right at the assignment point, which is a stricter constraint than what we need, I think. > I should say that I'm not confident that _either_ of these proposed > solutions will adequately address all of the possible problems that > could occur when GC is performed on VM threads stopped at arbitrary > points in their execution. Yeah, as discussed on IRC with Andy, we’d be better off if we were sure that each stack is marked by the thread it belongs to, in terms of data locality, and thus also in terms of being sure that vp->fp is up-to-date when the marker reads it. It’s not something we can change now, though. Alternately, we could use atomics when accessing vp->fp to ensure memory consistency (I tried that during my debugging journey…). It could be costly. Anyway, I don’t think we’ll have the final word on all this before 2.2.4. The way I see it we should keep working on improving it, but there are difficult choices to make, so it will probably take a bit of time. Thanks, Ludo’.
