On 1/4/07, Rana Dasgupta <[EMAIL PROTECTED]> wrote:

Hi Weldon,
  Since I have a RHEL4 2-way SMP box, I ran this for a while. Same
thing: build.test hangs, and when run in isolation, it occasionally throws
OOME messages (when trying to spawn threads) and hangs. In fact, I could
occasionally repro the hang with RI as well.


Is this really true?  Does the RI hang intermittently as well?


I think that the hang is the
flush (as you pointed out, it may not be very critical), but the RI does
not run out of memory even when spawning threads intensely.
Since the test is non-deterministic, each run could do something
different, so it is not easy to compare. I changed the randomizer to mostly
spawn, but I still did not see any OOME failures on RI. DRLVM may be
chewing up memory faster when creating threads. Is this a new test?
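
For anyone who wants to compare the two VMs outside the randomized test, here
is a minimal sketch of a spawn-heavy probe. It is not part of stress.Mix; the
class name SpawnProbe, its Result type, and the timeout values are all
hypothetical. It assumes the two symptoms above can be looked at separately:
an OutOfMemoryError thrown by Thread.start(), and workers that never finish
after printing and flushing.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical probe, NOT the stress.Mix test: spawns threads that print and
// flush, then reports how many started, how many finished, and how many
// Thread.start() calls failed with OutOfMemoryError.
public class SpawnProbe {

    public static final class Result {
        public final int started;   // threads whose start() succeeded
        public final int finished;  // workers that got past println + flush
        public final int oomes;     // OutOfMemoryErrors caught around start()

        Result(int started, int finished, int oomes) {
            this.started = started;
            this.finished = finished;
            this.oomes = oomes;
        }
    }

    public static Result run(int threadCount, final PrintStream out,
                             long joinTimeoutMillis) throws InterruptedException {
        final AtomicInteger finished = new AtomicInteger();
        List<Thread> threads = new ArrayList<Thread>();
        int oomes = 0;

        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            Thread t = new Thread(new Runnable() {
                public void run() {
                    out.println("worker " + id);
                    out.flush();                 // the call suspected of hanging
                    finished.incrementAndGet();
                }
            });
            try {
                t.start();
                threads.add(t);
            } catch (OutOfMemoryError e) {
                oomes++;                         // VM could not create the thread
            }
        }

        // Bounded join: a stuck worker shows up as finished < started instead
        // of hanging the probe. Worst case waits joinTimeoutMillis per thread.
        for (Thread t : threads) {
            t.join(joinTimeoutMillis);
        }
        return new Result(threads.size(), finished.get(), oomes);
    }

    public static void main(String[] args) throws InterruptedException {
        Result r = run(60, System.out, 5000);
        System.out.println("started=" + r.started
                + " finished=" + r.finished + " oomes=" + r.oomes);
    }
}
```

Running the same class on both VMs with a larger thread count would at least
give comparable numbers, though it says nothing about which of the stress.Mix
workloads triggers the problem.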


Good point.  If the test never passed reliably on 2-way SMP, then it's hard
to call the current state of affairs a regression.  In my mind a new bug
report is not as high a priority as a regression, simply because a timely
rollback can actually fix a regression.  In general this is much cheaper
than fixing new bugs.

Just wanted to let you know since you are debugging it.


Thanks.  This helps   :)

Rana


On 1/4/07, Weldon Washburn <[EMAIL PROTECTED]> wrote:
>
> On 1/4/07, Naveen Neelakantam <[EMAIL PROTECTED]> wrote:
> >
> > When you say "fail", do you mean the test hangs?  That was the
> > behavior I was seeing.
>
>
>
> I see it hang consistently when running automated mode (build test).  I
> have seen it hang once when running manually from a Linux terminal
> window.  It actually printed out "PASSED" then hung.  This leads me to
> suspect there might be problems with how System.out.flush() is working
> when there are multiple threads running on SMP.  Are you running on an
> SMP box?  Can you give me the exact command line you are using?  I would
> like to try it on my box.
>
>
> > I tried running stress.Mix on the command line several times, and it
> > would still hang.
> >
> > Naveen
> >
> > On Jan 4, 2007, at 11:06 AM, Weldon Washburn wrote:
> >
> > > stress.Mix has been broken since December 19.  From a quick
> > > investigation it looks like stress.Mix is having problems with
> > > thread synchronization on Red Hat release 4 on a 2-CPU server.  As
> > > much as I would like to simply go back to committing patches to the
> > > threading system, it seems prudent to fix stress.Mix threading
> > > problems first.
> > >
> > > Running stress.Mix on my one-CPU laptop shows that close to 60
> > > threads are created, each running one of 10 different workloads.
> > > The stress.Mix test always passes on my laptop.
> > >
> > > Running "build test", which contains stress.Mix, on my 2-CPU Red Hat
> > > box always fails.  Using "mpstat -P ALL" shows that both CPUs are
> > > idle.  Another related data point is that when I run
> > > "......deploy/jdk/jre/bin/java -cp . stress.Mix" from a Linux
> > > terminal window, the test always passes.  Interestingly, close to 60
> > > threads are also created on the server.  AFAICT the only difference
> > > between running "build test" and running the test manually is the
> > > std error output to the terminal screen.  Std error println's appear
> > > when running manually but do not appear when running with
> > > "build test".  My best guess at this time is that the println's to
> > > screen are forcing the threads to not aggravate a synch bug.
> > > If nobody else is working on the above, I will continue my
> > > investigation.
> > > Thoughts?  Suggestions?
> > >
> > > --
> > > Weldon Washburn
> > > Intel Enterprise Solutions Software Division
> >
> >
>
>
> --
> Weldon Washburn
> Intel Enterprise Solutions Software Division
>
>




--
Weldon Washburn
Intel Enterprise Solutions Software Division
