On Fri, Dec 06, 2013 at 04:30:17PM +0900,
Maru Newby <ma...@redhat.com> wrote:

> On Dec 5, 2013, at 5:21 PM, Isaku Yamahata <isaku.yamah...@gmail.com> wrote:
> > On Wed, Dec 04, 2013 at 12:37:19PM +0900,
> > Maru Newby <ma...@redhat.com> wrote:
> > 
> >> In the current architecture, the Neutron service handles RPC and WSGI with 
> >> a single process and is prone to being overloaded such that agent 
> >> heartbeats can be delayed beyond the limit for the agent being declared 
> >> 'down'.  Even if we increased the agent timeout as Yongsheg suggests, 
> >> there is no guarantee that we can accurately detect whether an agent is 
> >> 'live' with the current architecture.  Given that AMQP can ensure eventual 
> >> delivery - it is a queue - is sending a notification blind such a bad 
> >> idea?  In the best case the agent isn't really down and can process the 
> >> notification.  In the worst case, the agent really is down but will be 
> >> brought up eventually by a deployment's monitoring solution and process 
> >> the notification when it returns.  What am I missing? 
> >> 
> > 
> > Do you mean overload of the neutron server, not the neutron agent?
> > So even if the agent sends periodic 'live' reports, the reports pile up
> > unprocessed by the server.
> > When the server sends a notification, it wrongly considers the agent dead,
> > not because the agent failed to send live reports due to its own overload.
> > Is this understanding correct?
> Your interpretation is likely correct.  The demands on the service are going 
> to be much higher by virtue of having to field RPC requests from all the 
> agents to interact with the database on their behalf.

Does this strongly indicate thread starvation, i.e. unfair thread
scheduling?
Given that eventlet uses cooperative threading, should the hogging thread
call sleep(0) to yield to other threads?
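
To illustrate the sleep(0) idea, here is a minimal sketch. It uses asyncio
from the Python standard library as a stand-in for eventlet so that it runs
without extra packages; in eventlet the corresponding yield would be
eventlet.sleep(0). The names heartbeat, busy_worker and demo are
hypothetical and are not taken from the Neutron code.

```python
import asyncio


async def heartbeat(log, beats):
    # Stands in for the server handling agent 'live' reports.
    for _ in range(beats):
        log.append("heartbeat")
        await asyncio.sleep(0)  # yield so other tasks can run


async def busy_worker(log, chunks, cooperative):
    # Stands in for a long-running handler that may hog the scheduler.
    for _ in range(chunks):
        log.append("work")
        if cooperative:
            # Explicit yield point, analogous to eventlet.sleep(0).
            await asyncio.sleep(0)


async def _run(cooperative):
    log = []
    # The worker is scheduled first, then the heartbeat handler.
    await asyncio.gather(
        busy_worker(log, 3, cooperative),
        heartbeat(log, 3),
    )
    return log


def demo(cooperative):
    # Returns the order in which the two tasks made progress.
    return asyncio.run(_run(cooperative))
```

With cooperative=False the worker finishes all of its chunks before the
first heartbeat is handled; with cooperative=True the two alternate, so
heartbeats are no longer delayed behind the worker.
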
Isaku Yamahata <isaku.yamah...@gmail.com>

OpenStack-dev mailing list
