Re: NMS Failover transport pegging CPU

Ted C. Mon, 08 Mar 2010 10:24:12 -0800

Tim --

The issue number is AMQNET-241.


Thanks,

Ted C.


Ted C. wrote:
> 
> Tim --
> 
> Yes, it is at least one problem.  I've applied and tested a very small
> targeted fix that seems to work.  The problem is that this code is
> effectively a state machine, which I haven't had the time to fully
> understand.  Making changes to that type of code always makes me nervous
> because there are always more states than you realize at first.
> 
> I will include my patch with the bug but I would probably recommend not
> taking it as a long-term solution.
> 
> For the next few weeks, I'm swamped with trying to get a release out the
> door at work and finals at school.  After that, I may have a chance to
> look into the failover code in more detail.
> 
> Thanks,
> 
> Ted C.
> 
> 
> Timothy Bish wrote:
>> 
>> On Thu, 2010-03-04 at 17:03 -0800, Ted C. wrote:
>>> I'm happy to do so, will probably be tomorrow.  Just as a note, I think that
>>> I'm getting this because in FailoverTransport there's the following if:
>>> 
>>>                 if(ConnectedTransport != null || disposed || connectionFailure != null)
>>>                 {
>>>                     return false;
>>>                 } 
>>>                 else
>>> 
>>> 
>>> 
>>> It appears (as in I've seen this in a couple of iterations and haven't
>>> gotten back to it yet) that connectionFailure is not null, so there's an
>>> immediate return false and the loop spins.
>>> 
>>> Speaking of which, I'm not sure I see a way for connectionFailure to ever
>>> become null again.  It appears that it's only assigned in the else part of
>>> the if above.  Am I missing something?
>>> 
>>> Thanks,
>> 
>> It's quite possible that this is the problem.  I've not had time yet to
>> test this.  The FailoverTransport code is in need of a code review, so
>> it's not surprising there are some issues in there.
>> 
>> Regards
>> Tim.
>> 
>> 
>>> 
>>> Ted C.
>>> 
>>> 
>>> 
>>> Timothy Bish wrote:
>>> > 
>>> > On Tue, 2010-03-02 at 17:27 -0800, Ted C. wrote:
>>> >> It appears that NMS is pegging the CPU.  In my scenario, there's one broker
>>> >> running and the broker goes down.  When that happens, my CPU utilization
>>> >> goes to 100% and never recovers.
>>> >> 
>>> >> When I break into the program, I see that FailoverTask.Iterate is getting
>>> >> called frequently.  I ran it under dotTrace and got the following:
>>> >> 
>>> >> 32.70 % Thread #105762776 - 14308 ms - 0 calls
>>> >>   32.70 % System.Threading._ThreadPoolWaitCallback.PerformWaitCallback... - 14308* ms - 0 calls
>>> >>     32.70 % Run - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Threads.PooledTaskRunner.Run(Object)
>>> >>       32.70 % RunTask - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Threads.PooledTaskRunner.RunTask()
>>> >>         32.70 % Iterate - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.FailoverTask.Iterate()
>>> >>           23.59 % WaitOne - 10323* ms - 0 calls - System.Threading.WaitHandle.WaitOne()
>>> >>           8.22 % ReleaseMutex - 3597 ms - 0 calls - System.Threading.Mutex.ReleaseMutex()
>>> >>           0.67 % get_ConnectedTransport - 291 ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.get_ConnectedTransport()
>>> >>           0.22 % DoConnect - 97 ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.DoConnect()
>>> >> 
>>> >> Anybody seen similar issues?  This is ActiveMQ 5.3 and NMS 1.2.0.
>>> >> 
>>> >> Thanks,
>>> >> 
>>> >> Ted C.
>>> >> 
>>> > 
>>> > This isn't an issue that's been reported yet.  Could you raise a new
>>> > Jira issue regarding this?  I'd expect that the initial failure would
>>> > cause a spike in CPU but would expect that the reconnect delay would
>>> > cause that to settle down as it increases.
>>> > 
>>> > Regards
>>> > 
>>> > 
>>> > -- 
>>> > Tim Bish
>>> > 
>>> > Open Source Integration: http://fusesource.com
>>> > ActiveMQ in Action: http://www.manning.com/snyder/
>>> > 
>>> > Follow me on Twitter: http://twitter.com/tabish121
>>> > My Blog: http://timbish.blogspot.com/
>>> > 
>>> > 
>>> > 
>>> 
>> 
>> 
>> 
>> 
> 
> 
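
To make the suspected spin concrete, here is a minimal sketch of the condition quoted above. It is a simplified model, not the NMS FailoverTransport source and not the patch attached to AMQNET-241. Only the three names in the quoted if (ConnectedTransport, disposed, connectionFailure) come from the thread; ReconnectModel, TryReconnect, ResetFailure and SpinDemo are hypothetical names made up for illustration, and the broker is assumed to be permanently down.

```csharp
using System;

// Hypothetical, simplified model of the check quoted in the thread.
// NOT the real FailoverTransport; it only illustrates the sticky-failure idea.
public class ReconnectModel
{
    public object ConnectedTransport;     // stays null while disconnected
    public bool disposed;
    public Exception connectionFailure;   // recorded when a connect attempt fails

    public int ConnectAttempts { get; private set; }

    // Mirrors the quoted condition: bail out if connected, disposed, or a
    // failure has been recorded. Returns true only if a connect was attempted.
    public bool TryReconnect()
    {
        if (ConnectedTransport != null || disposed || connectionFailure != null)
        {
            return false;
        }

        ConnectAttempts++;
        // Assumption for this sketch: the broker is down, so the attempt fails
        // and the failure is recorded.
        connectionFailure = new Exception("broker unreachable");
        return true;
    }

    // One possible remedy (an assumption, not the AMQNET-241 patch):
    // clear the recorded failure so a later call can try again.
    public void ResetFailure()
    {
        connectionFailure = null;
    }
}

public static class SpinDemo
{
    // Calls TryReconnect `iterations` times and returns how many of those
    // calls bailed out without attempting a connect.
    public static int CountSkipped(ReconnectModel model, int iterations, bool resetBetweenCalls)
    {
        int skipped = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (!model.TryReconnect())
            {
                skipped++;
            }
            if (resetBetweenCalls)
            {
                model.ResetFailure();
            }
        }
        return skipped;
    }
}
```

In this model, CountSkipped(new ReconnectModel(), 1000, false) returns 999: after the first failed attempt the recorded failure is never cleared, so every later call returns false immediately, and a caller that simply re-invokes the check will spin. With resetBetweenCalls set to true it returns 0 and every call makes a real attempt. A real fix would presumably also need the increasing reconnect delay Tim mentions, so that repeated attempts do not themselves peg the CPU.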

-- 
View this message in context: 
http://old.nabble.com/NMS-Failover-transport-pegging-CPU-tp27763465p27825429.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
