On Mon, Mar 15, 2010 at 1:28 PM, Robert Joly <[email protected]> wrote:
>> > On Mar 9, 2010, at 3:36 PM, Robert Joly wrote:
>> >
>> > >>
>> > >>
>> > >> On Mar 9, 2010, at 1:16 PM, Robert Joly wrote:
>> > >>
>> > >>> Mardy wrote:
>> > >>>>
>> > >>>>
>> > >>>> On Mar 9, 2010, at 10:07 AM, Robert Joly wrote:
>> > >>>>
>> > >>>>> Hi guys,
>> > >>>>> I have been investigating XX-7634 which reports high CPU
>> > >>>> utilization
>> > >>>>> by java processes after an ISO install of the commercial
>> > >>>> version (SCS)
>> > >>>>> which uses the IBM JVM.
>> > >>>>>
>> > >>>>> Basically, after an ISO install I'm seeing that *all* the
>> > >>>> java-based
>> > >>>>> processes chew up between 50% and 100% of one processor and
>> > >>>> remained
>> > >>>>> like that for as long as I kept the box up (few hours).
>> > >>>> The processes
>> > >>>>> in question are sipXpage, sipXivr, sipXrelay, sipXconfig,
>> > >>>> sipXrest and
>> > >>>>> sipXprovision.  Using jconsole I was able to find that the
>> > >>>> hot thread
>> > >>>>> for each of these services is called 'Attach Handler'
>> > >>>>> (com.ibm.tools.attach.javaSE.AttachHandler.run()) and I
>> > >> also found
>> > >>>>> that I can eliminate the high CPU condition completely
>> > on a fresh
>> > >>>>> install by hand-editing the launch command for each of
>> > these java
>> > >>>>> processes to add the following property:
>> > >>>>> -Dcom.ibm.tools.attach.enable=no
>> > >>>>>
>> > >>>>> If I add this property then all the processes are
>> > >>>> well-behaved but I
>> > >>>>> do not understand the fundamental reason why the hot thread
>> > >>>> is there
>> > >>>>> in the first place.  I'm therefore turning to the Java gods
>> > >>>> that are
>> > >>>>> tuned in to this list to see if they had previous
>> > >>>> encounters with this.
>> > >>>>>
>> > >>>>> Thanks in advance,
>> > >>>>> bob
>> > >>>>
>> > >>>> I would like to know why this issue has all of a sudden
>> > >> shown up on
>> > >>>> the radar.  The Attach API and supporting AttachHandler
>> > thread was
>> > >>>> introduced, as a result of an upgrade to the IBM JVM, back on
>> > >>>> 2009-11-14.  Is it possible that the high CPU utilization
>> > has been
>> > >>>> there since then but no one had noticed it until now?
>> > >>>
>> > >>> Refresh my memory, will you?  Why are we using IBM in the
>> > >> first place?
>> > >>
>> > >> Initially because it was the only option for supporting
>> one of our
>> > >> customers.  In addition, we have discovered that it offers
>> > some very
>> > >> attractive memory optimization features not offered by
>> other JVMs
>> > >> that we may need to employ as the number of Java-based services
>> > >> increases.
>> > >>
>> > >>>
>> > >>> I do not know when the high CPU behavior started showing up
>> > >> but it was
>> > >>> first reported on 2010-02-10.  Also, not every system
>> > exhibits the
>> > >>> behavior.  For example, our friends at Qantom and I can
>> > >> reproduce the
>> > >>> problem but Al Campbell and Chris Parfitt cannot on their
>> > systems.
>> > >>> I'm using a Dell R300 and Qantom is using Dell Optiplex and
>> > >> so are Al
>> > >>> and Chris.  I have not identified the ingredient that makes
>> > >> this hot
>> > >>> thread appear and as far as I can tell, IBM does not
>> publish the
>> > >>> source code for their attach implementation.
>> > >>>
>> > >>> The thread seems to be looping around waiting for a
>> > >> semaphore. Please
>> > >>> see the attached screenshot for a visual of the stack trace.
>> > >>>
>> > >>
>> > >> Is this actually causing a problem or is it just a red
>> > herring?  If
>> > >> it is in fact impacting the performance of the system,
>> > then disable
>> > >> it.
>> > >
>> > >
>> > > Here's a sample of top running on a bad system.  This is
>> > running on a
>> > > quad-core machine and 60% of it is spent in the kernel and
>> > 17% spent
>> > > in user space.  I'm not sure which way the priorities go but I'm
>> > > hoping that the processes with lower number have higher
>> priority...
>> > >
>> > > Would you agree that this is a problematic case?
>> >
>> > Is that a view of an idle system or one that is under heavy
>> load?  If
>> > the system is idle, then no conclusion can be drawn from that data.
>> >
>> > I suggest that you take the safe route and just disable the Attach
>> > API.
>>
>> I agree with this suggestion.  I could modify the startup
>> script for each java process to add a
>> -Dcom.ibm.tools.attach.enable=no argument but this would not
>> be optimal as new java processes that get introduced later may
>> forget to do this.
>>
>> Instead, I was toying with the idea of modifying the 'sipx-config'
>> script that is used to generate the string to use to launch
>> 'java' (i.e.
>> /usr/bin/java).  Every startup script for java processes
>> invokes it and it appears to me that it would be a good
>> central place to put my -Dcom.ibm.tools.attach.enable=no
>> argument so that it gets applied to all present and future
>> java processes.
>>
>> Comments?
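
For illustration, the central approach described above might look roughly like the sketch below. This is not the actual 'sipx-config' script: the function name java_launch_prefix and the JAVA_BIN variable are hypothetical, and only the /usr/bin/java path and the -D property come from this thread.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real sipx-config script.
# Idea: every Java start script asks one helper for its launch string,
# so the attach-API switch is added in exactly one place.

# Path to the java binary (the thread mentions /usr/bin/java).
JAVA_BIN="${JAVA_BIN:-/usr/bin/java}"

java_launch_prefix() {
    # The property below is the one quoted earlier in this thread.
    printf '%s %s\n' "$JAVA_BIN" "-Dcom.ibm.tools.attach.enable=no"
}

# A service start script could then do something like:
#   JAVA_CMD=$(java_launch_prefix)
#   exec $JAVA_CMD -jar some-service.jar   # jar name is illustrative only
```

Any Java process that does not go through this helper is not covered by it.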
>
> So, I went ahead and implemented that fix, and it did bring the CPU
> utilization of our idle Java processes down to 0%, which is where we
> want it.  However, that solution is not good enough: it only applies to
> the Java-based processes that the sipXecs team manages and does not
> reach the "other" external Java processes we carry, openfire being the
> leading (and possibly only) example.  I do not believe that adding a
> -Dcom.ibm.tools.attach.enable=no argument to the openfire-supplied
> launch script is a good idea because 1) it may bring licensing
> ambiguities and 2) it does not solve the problem across the board.
>
> Given that, I went back to the drawing board trying to understand the
> fundamental reason why the 'Attach' threads start spinning in the first
> place.  To make a long story short, after many gyrations I found that
> the spinning threads try to write to a temporary directory called
> /tmp/.com_ibm_tools_attach/ but they do not have the necessary
> permissions to do so.


There seems to be a lot of gyrating going on. :-)  First the threads
and now you... but I agree with the solution below although it seems
puzzling that java does not just exit, or dump core and exit as a
result of being unable to write to the file.

Good investigative work. Hats off!

Ranga

> More specifically, the
> /tmp/.com_ibm_tools_attach/ directory gets created by the first Java
> process to run on the box.  On a fresh install, the first Java processes
> are the setup ones: sipxkeystoregen, sipxconfig-setup and
> sipxopenfire-setup.sh.  These three Java setup programs are launched by
> the do_setup() function of the sipxecs launch script and get run as
> root.  As a result, the /tmp/.com_ibm_tools_attach/ directory created is
> owned by root:root.  Since the sipXecs Java-based processes are run as
> sipxchange, their attempts to write to /tmp/.com_ibm_tools_attach/ fail
> and they keep trying over and over again resulting in high CPU
> utilization.  To prove the theory, I added the following two lines to
> the do_setup() function of the sipxecs launch script before launching
> any process and the CPU problem goes away immediately:
>        mkdir /tmp/.com_ibm_tools_attach/
>        chown sipxchange:sipxchange /tmp/.com_ibm_tools_attach/
>
> Do these two lines seem like a decent approach to solving the high CPU
> problem or does anybody have a more clever approach to ensuring that
> /tmp/.com_ibm_tools_attach/ permissions allow our Java-based sipXecs
> processes to write to it?
>
> Comments?
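
A slightly more defensive variant of the two lines above is sketched here, assuming the setup step runs as root and may be re-run on a box where the directory already exists. The function name ensure_attach_dir is hypothetical. The sketch only changes the ownership of the directory itself and does nothing about files that may already be inside it.

```shell
#!/bin/sh
# Sketch of an idempotent version of the mkdir/chown pair.
# ensure_attach_dir DIR OWNER
#   DIR   - attach working directory (/tmp/.com_ibm_tools_attach in this thread)
#   OWNER - account the Java services run as (sipxchange:sipxchange in this thread)
ensure_attach_dir() {
    dir=$1
    owner=$2
    # -p: succeed even if the directory already exists (e.g. setup re-run).
    mkdir -p "$dir" || return 1
    # Hand the directory to the service account so its JVMs can write to it.
    chown "$owner" "$dir" || return 1
}

# Intended call, before any Java process is launched (needs root):
#   ensure_attach_dir /tmp/.com_ibm_tools_attach sipxchange:sipxchange
```

Returning non-zero on failure lets the caller decide whether to abort the setup or just log a warning.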
>
> [BTW, I'm still puzzled by the fact that some people cannot reproduce
> the problem.  With the failure mechanism I just highlighted, it seems to
> me that we should always get high CPU utilization on every fresh reclone
> of a sipXecs.  This part is still a mystery to me...]
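
One way to compare affected and unaffected systems would be to look at who owns the attach directory on each. The helper below is only a sketch; the name attach_dir_owner is hypothetical, and it assumes the usual `ls -ld` column layout where the owner is the third field.

```shell
#!/bin/sh
# attach_dir_owner DIR
#   Prints the owner of DIR, or "missing" if DIR does not exist.
attach_dir_owner() {
    if [ -d "$1" ]; then
        # ls -ld columns: mode, link count, owner, group, ...
        ls -ld "$1" | awk '{ print $3 }'
    else
        echo "missing"
    fi
}

# Example (path taken from this thread):
#   attach_dir_owner /tmp/.com_ibm_tools_attach
# If the analysis above is right, an affected box would report "root".
```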
> _______________________________________________
> sipx-dev mailing list [email protected]
> List Archive: http://list.sipfoundry.org/archive/sipx-dev
> Unsubscribe: http://list.sipfoundry.org/mailman/listinfo/sipx-dev
> sipXecs IP PBX -- http://www.sipfoundry.org/
>



-- 
M. Ranganathan