Re: [drlvm] stress.Mix / MegaSpawn threading bug

Geir Magnusson Jr. Wed, 10 Jan 2007 05:12:16 -0800


On Jan 9, 2007, at 1:14 PM, Rana Dasgupta wrote:

On 1/9/07, Weldon Washburn <[EMAIL PROTECTED]> wrote:
On 1/9/07, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
>> I've tried to analyze the MegaSpawn test on Windows and here's what I
>> found out.
>>
>> OOME is thrown because process virtual size easily gets up to 2 GB. This
>> happens at about 1.5k simultaneously running threads. I think it
>> happens because all of virtual process memory is mapped for thread
>> stacks.
>>
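The arithmetic behind that observation can be sketched in a few lines. This is only a rough upper bound: it assumes every thread reserves its full stack up front, and it uses the 2 GB user address space mentioned above together with the 1M default and the 2K figure that come up later in the thread. The class and method names are hypothetical, chosen for illustration.

```java
// Hypothetical helper: rough ceiling on thread count when the address
// space is consumed by thread stacks alone.
class StackBudget {
    /** Upper bound on thread count if the whole address space held only stacks. */
    static long maxThreads(long addressSpaceBytes, long stackBytes) {
        if (stackBytes <= 0) {
            throw new IllegalArgumentException("stackBytes must be positive");
        }
        return addressSpaceBytes / stackBytes;
    }

    public static void main(String[] args) {
        long twoGb = 2L * 1024 * 1024 * 1024; // 32-bit user address space from the report
        long oneMb = 1024L * 1024;            // default initial stack size cited for Windows
        long twoKb = 2L * 1024;               // the "2K" figure floated in the thread

        System.out.println(maxThreads(twoGb, oneMb)); // 2048
        System.out.println(maxThreads(twoGb, twoKb)); // 1048576
    }
}
```

The observed limit of about 1.5k threads sits below the 2048 ceiling, which is plausible because the heap, loaded code, and other mappings share the same address space.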
> Good job! I got the same sort of hunch when I looked at the source code
> but did not have enough time to pin down specifics.  The only guidance I
> found regarding what happens when too many threads are spawned is the
> following in the java.lang.Thread reference manual, "...specifying a
> lower [stacksize] value may allow a greater number of threads to exist
> concurrently without throwing an OutOfMemoryError (or other internal
> error)."
> I think what the above implies is that it is OK for the JVM to error and
> exit if the app tries to create too many threads. If this is the case, it
> sort of looks like we need to clean up the handling of malloc() errors so
> that the JVM can exit gracefully.
I am not sure that we need to do something about this. The default initial
stack size on Windows is 1M,


Yikes!  There's our problem on windows...

and that is the recommended init size for real
applications. The fact that our threads start with a larger initial stack
mapped (default) than RI is a design issue, it is not a bug. We could start
with 2K and create many more threads!

That's right. The fact that the VM crashes and burns is the bug, and a serious one, IMO.

Exactly as Gregory points out,
ultimately we will hit virtual memory limits and fail. The reason the RI
seems to fail less is that the test ends before running out of virtual
memory. On my 32 bit RHEL Linux box, RI fails almost every time with
MegaSpawn, with an identical OOME error message and stack dump.
We can catch the exception in the test and print a message. But I am not
very sure what purpose that would serve. A resource exhaustion exception is
a fatal exception and the process is hosed,


No, it's not.

no real app would be able to do
anything more at this point.


That's not true.

We should not use this test (which is not a
real app) as guidance to tune the initial stack size. My suggestion is to
lower the test duration so that we can create about 1000 (or whatever
magic number) threads at least. That is the stress condition we should test
for.

The big thing for me is ensuring that we can drive the VM to the limit, and it maintains internal integrity, so applications that are designed to gracefully deal with resource exhaustion can do so w/ confidence that the VM isn't about to crumble out from underneath them.
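
To make "gracefully deal with resource exhaustion" concrete, here is a minimal sketch of an application that keeps creating threads until the VM refuses, then stops asking and carries on. It is an illustration, not the MegaSpawn test itself: BoundedSpawn and spawnUpTo are hypothetical names, and the stack size is passed through the java.lang.Thread constructor that takes a stackSize argument, which the VM is free to treat as a hint only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical example: an app that treats thread exhaustion as recoverable.
class BoundedSpawn {
    /**
     * Tries to run `requested` threads simultaneously, each with the given
     * stack size hint. Returns how many were actually started.
     */
    static int spawnUpTo(int requested, long stackBytes) {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> started = new ArrayList<>();
        try {
            for (int i = 0; i < requested; i++) {
                Thread t = new Thread(null, () -> {
                    try {
                        release.await(); // park so all threads are alive at once
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }, "spawn-" + i, stackBytes);
                t.start();               // may throw OutOfMemoryError
                started.add(t);
            }
        } catch (OutOfMemoryError e) {
            // Resource exhaustion: stop creating threads, but keep running.
            System.err.println("thread limit reached after " + started.size());
        } finally {
            release.countDown();         // let every started thread finish
        }
        for (Thread t : started) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return started.size();
    }

    public static void main(String[] args) {
        System.out.println("started " + spawnUpTo(100, 256 * 1024) + " threads");
    }
}
```

This only works if the VM keeps its internal state consistent when thread creation fails, which is exactly the property being asked for here.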


geir

Thanks,
Rana
