Hi,

Further to my previous email, here's a patch to JamVM 1.2.0 to include
memory barriers on Intel. Mark, as you've seen the problem with
Eclipse 3, could you give it a test? If there's anybody else who's seen
problems I'd be grateful if you could also give it a go.

Thanks,

Rob.

P.S. I believe the compare_and_swap implementation on Intel was
correct, as any locked instruction forms a memory barrier. However,
there are a couple of other places where ordering is important -- I've
added memory barriers here. In particular, bytecode rewriting in the
interpreter. I suspect this is the most likely cause of the problem, as
Mark said making static methods synchronised slows it down (i.e. only 1
thread can be in the method). The memory barrier itself is a locked
no-op; the sfence, lfence and mfence instructions exist on the P4 but
will not work on all processors.

On Sat, 13 Nov 2004 12:07:37 -0500, Chris Pickett <[EMAIL PROTECTED]> wrote:
> Robert Lougher wrote:
> > Hi all,
> >
> > On Sat, 13 Nov 2004 11:58:53 +0100, Mark Wielaard <[EMAIL PROTECTED]> wrote:
> >
> >>The Eclipse 3 (but not 2) startup problem seems to only happen on SMP
> >>machine (it disappears when I don't use a SMP kernel, this is on a Intel
> >>hyperthreading system) with jamvm [*]. It works fine with gcj/gij (it
> >>doesn't work anymore with kaffe though since they don't implement
> >>java.lang.ClassLoader.setSigners which we now call).
> >
> >
> > I'm not terribly surprised -- I've never tested JamVM on a real or
> > virtual SMP machine before. When writing the thin-locking
> > implementation I didn't include any SMP memory barriers, so it's
> > something I've been expecting to hear! I'll look at including them
> > for the next release. Mark, would you be willing to do the testing?
> >
> >
> >>Cheers,
> >>
> >>Mark
> >>
> >>[*] Hint for Robert. When inspecting with -verbose I can see that some
> >>classes are [loaded] multiple times. I can slow down crashing a bit by
> >>making various VMClass static methods synchronized, but that is not a
> >>full solution. I think this is a bug in the runtime that needs to guard
> >>against defining the same class from multiple threads and not completely
> >>fixable in our core libraries setup.
> >>
> >
> >
> > I don't think this is the cause. This can happen even on a
> > uni-processor machine. Two threads can see a class hasn't been loaded
> > and start to define it. However, the updating of the loaded class
> > hash table is locked. One thread will win the race and update the
> > table, the other will find it already there, and discard the one it's
> > just loaded. This keeps locking to a minimum, and should lead to
> > overall faster behaviour. It's a bug as to where the -verbose message
> > is printed -- it should only be done by the thread that wins the race.
>
> For what it's worth, we've had SMP problems in SableVM for a while now
> also. They too seem related to thread startup and thread death. It
> never occurred to me that this might be a Classpath problem since until
> now I thought we were the only ones, but then again it could just be
> that both JamVM and SableVM have equally bad internal locking :(. I
> tried putting in memory barriers as prescribed by the JSR133 cookbook
> [1], but it didn't make any difference. In fact, I tried putting a
> StoreLoad barrier in between every single bytecode instruction, and it
> still didn't help. I haven't tested Eclipse, but will try to (or some
> other SableVM person with a working Eclipse installation could try).
>
> [1] http://gee.cs.oswego.edu/dl/jmm/cookbook.html
>
> SableVM also doesn't have any handling of Java volatiles, which do
> indeed exist in the Classpath threading code. However, one would think
> that with a barrier in between every single bytecode that this wouldn't
> matter and that something else must be wrong. We did manage to squash a
> couple of threading bugs when somebody tried to build on NetBSD (I
> think...), and got compile-time pthread initialization warnings.
>
> Again my experience says that this isn't strictly limited to SMP
> machines, but that on UP's the time between context switches is so long
> that it's much harder to catch these heisenbugs.
>
> I think it would be interesting to hear from VM developers who _don't_
> have problems on SMP machines but had them in the past and somehow
> managed to eliminate them.
>
> Chris
>
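
[Editorial illustration, not part of the attached patch.] The P.S. above
describes the barrier as a "locked no-op". A minimal sketch of what that
could look like with GCC-style inline assembly follows. The attachment is
binary and is not reproduced here, so everything in the sketch is an
assumption: the macro name MBARRIER, the struct insn type, and the helpers
rewrite_insn and read_insn are hypothetical. They only show the ordering
idea behind bytecode rewriting: make the operand visible before the opcode
that tells other threads to use it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro name. Adding zero to the word at the top of the
   stack changes nothing, but the lock prefix makes the instruction act
   as a full memory barrier on x86. The "memory" clobber also stops the
   compiler from moving loads and stores across it. */
#if defined(__i386__)
#define MBARRIER() __asm__ __volatile__("lock; addl $0,0(%%esp)" : : : "memory")
#elif defined(__x86_64__)
#define MBARRIER() __asm__ __volatile__("lock; addl $0,0(%%rsp)" : : : "memory")
#else
/* Non-x86 fallback (GCC/Clang builtin) so the sketch still compiles. */
#define MBARRIER() __sync_synchronize()
#endif

/* Hypothetical stand-in for one rewritable instruction slot. */
struct insn {
    volatile int opcode;   /* 0 = original form, 1 = rewritten form */
    volatile int operand;  /* only meaningful once opcode == 1 */
};

/* Writer side: store the resolved operand first, then the new opcode,
   with a barrier in between so the two stores are ordered. */
static void rewrite_insn(struct insn *slot, int resolved_operand)
{
    assert(slot != NULL);
    slot->operand = resolved_operand;
    MBARRIER();
    slot->opcode = 1;
}

/* Reader side: returns 1 and fills *out if the rewritten form is seen,
   otherwise returns 0 and leaves *out untouched. */
static int read_insn(struct insn *slot, int *out)
{
    assert(slot != NULL && out != NULL);
    if (slot->opcode != 1)
        return 0;
    MBARRIER();            /* order the opcode load before the operand load */
    *out = slot->operand;
    return 1;
}
```

In a VM the writer and reader would run on different threads. A
single-threaded call sequence such as rewrite_insn(&slot, 42) followed by
read_insn(&slot, &v) only exercises the code path, not the ordering
guarantee itself.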
mb-patch
Description: Binary data
_______________________________________________ Classpath mailing list [EMAIL PROTECTED] http://lists.gnu.org/mailman/listinfo/classpath