On 1/4/07, Rana Dasgupta <[EMAIL PROTECTED]> wrote:
Hi Weldon, Since I have a RHEL4 2-way SMP box, I ran this for a while. Same thing: build.test hangs. When run in isolation, it occasionally throws OOME messages (when trying to spawn threads) and hangs. In fact, I could occasionally repro the hang with the RI as well.
Is this really true? The RI hangs intermittently as well?
I think that the hang is the flush (as you pointed out, it may not be very critical), but the RI does not run out of memory even when spawning threads intensely. Since the test is non-deterministic, each run could do something different, so it is not easy to compare. I changed the randomizer to mostly spawn threads, but I still did not see any OOME failures on the RI. DRLVM may be chewing up memory faster when creating threads. Is this a new test?
Good point. If the test never passed reliably on 2-way SMP, then it's hard to call the current state of affairs a regression. In my mind a new bug report is not as high a priority as a regression, simply because a timely rollback can actually fix a regression. In general this is much cheaper than fixing new bugs.
Just wanted to let you know since you are debugging it.
Thanks. This helps :)
Rana
On 1/4/07, Weldon Washburn <[EMAIL PROTECTED]> wrote:
>
> On 1/4/07, Naveen Neelakantam <[EMAIL PROTECTED]> wrote:
> >
> > When you say "fail", do you mean the test hangs? That was the
> > behavior I was seeing.
> >
>
> I see it hang consistently when running in automated mode (build test). I
> have seen it hang once when running manually from a linux terminal window.
> It actually printed out "PASSED" then hung. This leads me to suspect there
> might be problems with how System.out.flush() is working when there are
> multiple threads running on SMP. Are you running on an SMP box? Can you
> give me the exact command line you are using? I would like to try it on
> my box.
>
> > I tried running stress.Mix on the command line several times, and it
> > would still hang.
> >
> > Naveen
> >
> > On Jan 4, 2007, at 11:06 AM, Weldon Washburn wrote:
> >
> > > stress.Mix has been broken since December 19. From a quick
> > > investigation it looks like stress.Mix is having problems with thread
> > > synchronization on Red Hat release 4 on a 2-CPU server. As much as I
> > > would like to simply go back to committing patches to the threading
> > > system, it seems prudent to fix the stress.Mix threading problems first.
> > >
> > > Running stress.Mix on my one-CPU laptop shows real close to 60 threads
> > > are created, each running one of 10 different workloads. The stress.Mix
> > > test always passes on my laptop.
> > >
> > > Running "build test", which contains stress.Mix, on my 2-CPU Red Hat box
> > > always fails. Using "mpstat -P ALL" shows that both CPUs are idle.
> > > Another related data point is that when I run
> > > "......deploy/jdk/jre/bin/java -cp . stress.Mix" from a linux terminal
> > > window, the test always passes. Interestingly, real close to 60 threads
> > > are also created on the server. AFAICT the only difference between
> > > running "build test" and running the test manually is the std error
> > > output to the terminal screen. Std error println's appear when running
> > > manually but do not appear when running with "build test". My best guess
> > > at this time is that the println's to screen are forcing the threads to
> > > not aggravate a synch bug.
> > >
> > > If nobody else is working on the above, I will continue my
> > > investigation.
> > >
> > > Thoughts? Suggestions?
> > >
> > > --
> > > Weldon Washburn
> > > Intel Enterprise Solutions Software Division
> > >
> >
>
> --
> Weldon Washburn
> Intel Enterprise Solutions Software Division
>
--
Weldon Washburn
Intel Enterprise Solutions Software Division
