On 12/09/10 19:09, John Rose wrote: > I started a thread on Google Groups to get more advice on safepoint-based > invalidation, which the EG is naming MutableCallSite#sync. > > http://groups.google.com/group/jvm-languages/browse_thread/thread/9c9d3e84fc745676# >
TL;DR version: The scheme you laid out here seems sound. (To avoid relays/forks of discussions, I tried joining and CCing the group) My only concern about this is whether it precludes further useful support for the general problem of "phased computation", which is in turn either a variant of or a generalization of fork/join computation, depending on how you look at it. Since JDK7 includes integrated support for both ForkJoinTask and Phaser, that will both probably be used in JDK8 lambdas etc, it seems almost a sure thing that people will further explore underlying JVM support. Here's a synopsis of issues. (Please bear with the background before returning to questions at hand.) Phased and fork/join computations impose enough structure on parallel computations to allow cheaper low-level memory-level accesses. Well, at least when they are used in the intended ways; among the main difference between JDK7 support and explicitly parallel languages like X10 (supporting "clocks" (a form of Phasers) and async/finish (a form of fork/join)) is that Java compilers cannot enforce correct usage (although tools and IDE plugins could be improved/developed that could do so in most cases). As warmup, for fork/join, the main economies are due to the static DAG structure of computation. Among other impacts, accessing a variable modified by a joined task requires no memory-level sync beyond what was required anyway to determine the join. (We currently have no way to designate such variables or usages; better support probably awaits this.) For phased computations, the main economies apply to "phased" variables -- those that may take a new value on each phase but are constant within phases. (We also don't have a way of designating these.) Conceptually, a phased variable is a holder of two values/variables, call them "current" and "next", where reads are from "current" and writes are to "next". In existing JMM-ese, "current" is basically a non-volatile variable, and "next" is volatile. And conceptually, upon each phase advance, each party does "current = next". There are a few ways to implement this differently though. One common option is to use only a single location per phased variable, and to arrange an action taken upon phase advance to do in-place updates (see for example Phaser.onAdvance, http://gee.cs.oswego.edu/dl/jsr166/dist/docs/java/util/concurrent/Phaser.html which must by nature be performed in exclusion since all other parties are blocked until advance. Another option is to use a two-slot array per phased variable, indexed by the lowest bit of phase number: reads are from slot[phase & 1], writes are to slot[~phase & 1] (assuming that phase numbers are explicit and sequential). This avoids the need for copying. Also, across all of these, it is possible and sometimes desirable for parties to use their own writes early, before advance. Sadly enough, I don't know of a recipe determining which of these options is best in a given situation. (Although I do know one of the ingredients: whether you require atomicity of updates across all parties upon advance.) Maybe some further study and exploration could arrive at a good recipe. In the mean time, note that only one of these options (explicit current+next) even has a reasonable interpretation in existing JVM-ese. The others are mismatched in that some of the operations have JMM-volatile semantics and some don't, so it doesn't make sense to declare the variable as either volatile or non-volatile. John's proposal for MutableCallSite falls into this category, hence the controversy about how to express it. It is a specialization of the onAdvance (aka sync()) approach with in-place updates, and safepoint generations serving as phases. This seems reasonable given the likely need for atomicity during safepoints anyway. While ideally it would be nice to integrate this with other phased computation even now (I can imagine defining adjunct classes for use with j.u.c.Phaser), I'm not too tempted to do so right now, because it is not always the best available option. However, someday, it would be great to be able to allow use the same mechanics in each. As parallelism becomes more widespread, we must make it easiest and most attractive for people to use simpler constrained execution control patterns like phasers and FJ rather than the ad-hoc unstructured use of threads/volatiles/locks. One way to help reach this goal is to improve JVM-level support so that the most structured parallel code is also the fastest parallel code. -Doug _______________________________________________ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev