On Fri, May 8, 2009 at 10:47 PM, Charles Oliver Nutter
<charles.nut...@sun.com> wrote:
> Subramanya Sastry wrote:
>>
>> I may have been wrong.   I had a chance to think through this a little bit
>> more.
>>
>> Consider this ruby code:
>>
>> i = 5
>> v1 = i + 1
>> some_random_method_call()
>> v2 = i + 1
>>
>> In this code snippet, the second '+' might not get optimized because
>> 'some_random_method_call' could monkeypatch Fixnum.+ in the course of
>> whatever it does.  This means that calls form hard walls beyond which you
>> cannot hoist method and class modification guards (unless, of course, you can
>> precisely determine the set of methods any given call can modify -- for
>> example, that 'some_random_method_call' will not modify Fixnum.+).
>
> This is certainly a challenge for entirely eliminating guards, and I believe
> even the fastest Ruby implementations currently are unable to remove those
> guards completely.
>
> In this example, if we can at least prove that i, v1, and v2 will always be
> Fixnum when Fixnum#+ has not been replaced, we can emit guarded optimized
> versions alongside deoptimized versions. Because Fixnum#+ falls on the
> "rarely or never replaced" end of the spectrum, reducing + operations to a
> simple type guard plus "Fixnum has been modified" guard is probably the best
> we can do without code replacement. More on code replacement thoughts later.
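>
> To sketch the shape I mean (invented names; this is not our actual
> codegen, just the guard structure):
>
>   // Illustrative only: a type guard plus a "Fixnum#+ replaced" guard
>   // around a fast integer add, falling back to full dynamic dispatch.
>   final class GuardedPlus {
>       static volatile boolean fixnumPlusReplaced = false; // set on monkeypatch
>
>       static Object plusOne(Object i, CallSiteStub site) {
>           if (i instanceof Long && !fixnumPlusReplaced) {
>               return (Long) i + 1L;    // optimized path: raw long add
>           }
>           return site.call(i, 1L);     // deoptimized path: full dispatch
>       }
>   }
>
>   interface CallSiteStub { Object call(Object self, Object arg); }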
>
>> So, this is somewhat similar to pointer alias analysis in C programs.  In
>> C programs, the ability to pass around pointers and manipulate them in an
>> unrestricted fashion effectively restricts the kinds of optimizations a
>> compiler can do.  Similarly, in Ruby programs, it seems to me that the
>> ability for code to arbitrarily modify code restricts the kinds of
>> optimizations a compiler can do.  Open classes and eval and all of
>> that are to Ruby what pointers are to C.  Potentially powerful for the
>> programmer, but hell for the compiler.  More on this further below after
>> discussion of the multi-threaded scenario.
>
> It is true that we generally must assume code modification events will have
> a large impact on core methods. This means we need to always guard, even if
> such modifications are exceedingly rare. However, it may be pessimistic to
> assume such guards will be unacceptably slow in comparison to the rest of
> the system. Currently, all method
> calls in normal execution scenarios must read a volatile field for the call
> site invalidation guard. Although it's difficult to measure accurately, the
> volatile guard should in theory be a very large expense, and John Rose said
> as much when I described our current system to him. But it may be possible
> to make such guards non-volatile if we have other mechanisms for triggering
> memory synchronization that those non-volatile calls could see.
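>
> In code terms, every call site today does something shaped like this
> (names invented for illustration; the real structure tracks more):
>
>   // Illustrative cache check: a volatile serial number, bumped on any
>   // method table change, forces a re-link at the next call.
>   final class CachingCallSite {
>       static volatile long methodTableSerial = 0;   // bumped by redefinitions
>       private long cachedSerial = -1;
>       private Runnable cachedTarget;
>
>       void call(Runnable freshlyResolvedTarget) {
>           long serial = methodTableSerial;          // the volatile read in question
>           if (serial != cachedSerial) {
>               cachedTarget = freshlyResolvedTarget; // re-link the method
>               cachedSerial = serial;
>           }
>           cachedTarget.run();
>       }
>   }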
>
> One theory I had was that since we periodically ping another volatile field
> for cross-thread events (kill, raise), there may be a way we can rely on
> those events to trigger our memory sync. But I am unfamiliar enough with the
> Java memory model that I do not know if such a volatility-reducing mechanism
> would work or be reliable.
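>
> Very roughly, the idea is something like this (speculative; again, I
> don't know whether the JMM lets the plain read observe the write):
>
>   // Speculative sketch: guards read a plain (non-volatile) flag, and
>   // we hope the volatile poll we already do for kill/raise events also
>   // publishes invalidation writes across threads.
>   final class PiggybackSync {
>       static boolean invalidated = false;           // plain field, cheap to read
>       static volatile int crossThreadEvents = 0;    // already polled periodically
>
>       static void pollEvents() {
>           if (crossThreadEvents != 0) {
>               // handle kill/raise; the volatile read might also make a
>               // prior write to 'invalidated' visible to this thread
>           }
>       }
>   }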
>
>> If thread 2 modifies Fixnum.+, you might expect the modification to be
>> reflected in thread 1 at some point.  If you hoist method/class guards all
>> the way outside the loop and convert the i+1 to integer addition, code
>> modifications from thread 2 won't propagate to thread 1.
>> But, in this scenario, I would argue that any code that relies on behavior
>> like this is broken.  This is effectively a race condition and different
>> ruby implementations will yield different behavior.  In fact, code
>> modification by meta-programming is effectively modifying class meta-data.
>>  So, concurrent code modification is not very different from concurrent
>> writes to program data.  I am not sure if the meta-programming model treats
>> modifications to open classes as changes to program-visible meta-data.  If
>> it did, the programmer could then synchronize on that meta-data object
>> before patching code.  But I expect this is not the case, because that would
>> then force every method call to acquire a lock on the class meta-data.
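>>
>> Concretely, the model I have in mind is something like this (invented
>> names) -- and the problem is the lock in lookup():
>>
>>   // If code modification is a write to class meta-data, safe
>>   // publication needs both the patching side and every caller to
>>   // synchronize on that meta-data.
>>   final class ClassMetaData {
>>       private final java.util.Map<String, Runnable> methods =
>>           new java.util.HashMap<String, Runnable>();
>>
>>       synchronized void redefine(String name, Runnable body) {
>>           methods.put(name, body);    // the patching thread locks...
>>       }
>>
>>       synchronized Runnable lookup(String name) {
>>           return methods.get(name);   // ...so every method call locks too
>>       }
>>   }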
>
> Again, there are some unknowns here, but the likelihood of method table
> modifications happening across threads (or at least the likelihood of those
> changes producing runtime behavioral effects across threads) is probably
> very low. The majority of Ruby libraries and frameworks have been designed
> either with single threads in mind--i.e. the expectation is that a given
> thread would exit the library before a code modification happened, if it
> ever did--or with multi-threading in mind--making code modifications at
> boot time, or with the expectation that only one thread would ever see them
> (like modifying a thread-local object to add new methods or modules). Both
> cases leave few expectations of vigorous or even common cross-thread code
> modification events.
>
>> Where does this leave us?  Consider a Rails app where there is a
>> UpdateCodeController with a method called load_new_code(C, m) which
>> basically updates running code with a new version of a method 'm' in class
>> 'C' (let's say a bug fix).  This discussion of code modification then boils
>> down to asking the question: at what point does the new method become
>> available?  In the absence of code modification synchronization (on C's
>> metadata object), the ruby implementation can block on this request while
>> continuing to run the old version of 'm' till it is convenient to switch
>> over to new code!
>>
>> This is obviously a contrived example, but the point of this is: If there
>> is no mechanism for the programmer to force code modifications to be visible
>> in a ruby implementation, the ruby implementation has flexibility over how
>> long it turns a blind eye to code modifications in a thread other than the
>> currently executing thread.  So, what you need to worry about is figuring
>> out the hard boundaries for method and class guards assuming a
>> single-threaded program and optimizing for that scenario.
>>
>> Am I missing something?
>
> No, I think this is right on. And to cement this even further: we are
> blazing a very new trail when it comes to parallel-executing Ruby, which
> means we may be able to *set* expectations for cross-thread
> code-modification behavior. A user moving their library to JRuby will then
> (as now) have to pay a bit more attention to how that library behaves in a
> (probably) more formal and (definitely) better-specified threading
> environment.
>
>> Now, back to the single-threaded scenario and the earlier example:
>>
>> i = 5
>> v1 = i + 1
>> some_random_method_call()
>> v2 = i + 1
>>
>> As discussed earlier, the problem here is determining the set of methods
>> that a call can modify, and this is somewhat similar to the C pointer
>> aliasing problem.  That by itself is "not a big deal" because unlike C
>> pointers, code modification is much rarer.  So, in the ruby implementation,
>> you could optimistically / speculatively optimize a ton of code assuming NO
>> code modifications and move the burden of invalidation from call sites
>> (which are everywhere) to the centralized class meta-data structures i.e.
>> you install listeners/traps/callbacks (whatever you want to call them) on
>> the class and when a method is modified, you invalidate all optimized code
>> that assumes that the method is "static".  This is non-trivial (will require
>> on-stack replacement of code), but at least conceivable.  But, since you are
>> compiling for the JVM (or LLVM in the case of MacRuby), once you JIT, you
>> effectively cede control.  You don't have a mechanism to invalidate JIT-ted
>> code.  So, at this time, I am not sure if you can really get around this
>> (extremely pessimistic) constraint.  Only ruby implementations that can
>> control the compilation stack all the way to machine code (or have hooks to
>> invalidate / modify existing code) might be able to get around this.  I am
>> curious how MacRuby-LLVM combination is tackling this ...
>
> As early as last fall, JRuby's call site invalidation was "active" as you
> describe: the call sites attached to a method object were physically flushed
> when changes occurred in that method's "home" hierarchy that were likely to
> render such caching invalid. This approach was originally
> abandoned due to the complexity of ensuring multiple threads reaching the
> same call site at the same time would not accidentally miss an invalidation
> event and cache bad code forever. This is also, in fact, an open question
> about the invokedynamic active invalidation; when I and others brought it up
> to John Rose, he recognized that most dynamic calls would still need to have
> "passive" invalidation guards even after linking. The original call site
> could still be actively invalidated, but without introducing locks at call
> sites (sure performance death) there's no simple way to avoid "just in case"
> passive guards as well.
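>
> Roughly, the old "active" scheme had this shape (invented names; the
> real code tracked considerably more state):
>
>   // Active invalidation: a method keeps a list of the call sites that
>   // cached it and flushes them when it is redefined. A passive "just
>   // in case" guard remains at each site for racing threads.
>   final class InvalidatingMethod {
>       private final java.util.List<SiteRef> sites =
>           new java.util.concurrent.CopyOnWriteArrayList<SiteRef>();
>
>       void registerSite(SiteRef site) { sites.add(site); }
>
>       void redefined() {
>           for (SiteRef s : sites) s.flush();   // active flush
>           sites.clear();
>       }
>   }
>
>   final class SiteRef {
>       volatile boolean valid = true;           // passive guard, read per call
>       void flush() { valid = false; }
>   }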

  This makes me wonder whether we could actually measure the cost of
the various locking/volatile/active/passive scenarios.  Of course a
pet micro-bench of these may give an unrealistic answer, but it would
still be cool to get some understanding of the cost.
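
  Something like this toy harness, say (with all the usual micro-bench
caveats about warmup, dead code elimination, and so on), might at least
give rough relative numbers:

    // Toy comparison of guard costs; treat the results with suspicion.
    public class GuardBench {
        static boolean plainFlag = true;
        static volatile boolean volatileFlag = true;
        static final Object lock = new Object();

        public static void main(String[] args) {
            long n = 100000000L, sum = 0;
            long t0 = System.nanoTime();
            for (long i = 0; i < n; i++) if (plainFlag) sum++;
            long t1 = System.nanoTime();
            for (long i = 0; i < n; i++) if (volatileFlag) sum++;
            long t2 = System.nanoTime();
            for (long i = 0; i < n; i++) synchronized (lock) { sum++; }
            long t3 = System.nanoTime();
            System.out.printf("plain=%dms volatile=%dms sync=%dms (sum=%d)%n",
                (t1 - t0) / 1000000, (t2 - t1) / 1000000,
                (t3 - t2) / 1000000, sum);
        }
    }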

  It also makes me think of more exotic solutions, like generating a
call site class per thread and per call, then using actual
synchronization, since you would then get a monomorphic call and a
lightweight synch lock.  I wonder whether anyone has ever discussed
all the strategies that are realistic?  Like, is that realistic, or
does it suck? :)
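
  Roughly what I am picturing (invented names, very hand-wavy):

    // One call site instance per thread per call site: only its owning
    // thread ever locks it, so the lock should stay thin/uncontended,
    // and the cached target is monomorphic from that thread's view.
    final class PerThreadCallSite {
        static final ThreadLocal<PerThreadCallSite> SITE =
            new ThreadLocal<PerThreadCallSite>() {
                protected PerThreadCallSite initialValue() {
                    return new PerThreadCallSite();
                }
            };

        private Runnable target;

        synchronized void call(Runnable fallback) {
            if (target == null) target = fallback;  // link once
            target.run();
        }
    }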

-Tom

-- 
Blog: http://www.bloglines.com/blog/ThomasEEnebo
Email: en...@acm.org , tom.en...@gmail.com
