Salvatore, 
Very good questions.
You raised your concerns about the OVS agent, but I think they apply to any 
other neutron agent that depends on an additional service to perform its 
actions. At least, I was dealing with similar issues in the Mellanox L2 agent. 
It makes sense to me that if the agent fails to bind a port, this should be 
indicated by the neutron port status.
Another issue I had (and am trying to solve with the following patch: 
https://review.openstack.org/#/c/48842/ ) is the situation where the agent 
fails to communicate with the external daemon that is responsible for the 
actual programming. After a number of retries, with an increasing back-off 
interval between them, the agent is terminated if it still fails to 
communicate. Does that make sense?
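
To make the retry behaviour concrete, here is a minimal sketch of the idea. It 
is not the code from the patch: DaemonUnreachable, call_with_backoff, 
run_or_exit and all parameter names and defaults are hypothetical, chosen only 
for illustration.

```python
import sys
import time


class DaemonUnreachable(Exception):
    """Hypothetical error raised when the external daemon does not answer."""


def call_with_backoff(request, max_retries=5, initial_delay=1.0,
                      factor=2.0, sleep=time.sleep):
    """Call request(); on failure wait, grow the delay, and try again.

    Re-raises DaemonUnreachable once max_retries attempts have failed.
    """
    delay = initial_delay
    for attempt in range(1, max_retries + 1):
        try:
            return request()
        except DaemonUnreachable:
            if attempt == max_retries:
                raise
            sleep(delay)
            delay *= factor  # increasing back-off between retries


def run_or_exit(request, **kwargs):
    """Terminate the agent process if the daemon stays unreachable."""
    try:
        return call_with_backoff(request, **kwargs)
    except DaemonUnreachable:
        sys.exit(1)
```

The sleep function is passed in as a parameter only so that the delays can be 
checked without actually waiting.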

Regards,
Irena 

-----Original Message-----
From: Kyle Mestery (kmestery) [mailto:kmest...@cisco.com] 
Sent: Monday, November 25, 2013 11:16 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Openstack-dev][Neutron] Handling of ovs command 
errors

On Nov 25, 2013, at 12:36 PM, Salvatore Orlando <sorla...@nicira.com> wrote:
> 
> Thanks Kyle,
> 
> More comments inline.
> 
> Salvatore
> 
> 
> On 25 November 2013 16:03, Kyle Mestery (kmestery) <kmest...@cisco.com> wrote:
> On Nov 25, 2013, at 8:28 AM, Salvatore Orlando <sorla...@nicira.com> wrote:
> >
> > Hi,
> >
> > I've been recently debugging some issues I've had with the OVS agent, and I 
> > found out that in many cases (possibly every case) the code just logs 
> > errors from ovs-vsctl and ovs-ofctl without taking any action in the 
> > control flow.
> >
> > For instance, the routine which should do the wiring for a port, port_bound 
> > [1], does not react in any way if it fails to configure the local vlan, 
> > which I guess means the port would not be able to send/receive any data.
> >
> > I'm pretty sure there's a good reason for this which I'm missing at the 
> > moment. I am asking because I see a pretty large number of ALARM_CLOCK 
> > errors returned by OVS commands in gate logs (see bug [2]), and I'm not 
> > sure whether it's ok to handle them as the OVS agent is doing nowadays.
> >
> Thanks for bringing this up, Salvatore. It looks like the underlying run_vsctl 
> [1] provides an ability to raise exceptions on errors, but this is not used 
> by most of the callers of run_vsctl. Do you think we should be returning the 
> exceptions back up the stack to callers to handle? I think that may be a good 
> first step.
> 
> I think it makes sense to start handling errors; as they often happen in the 
> agent's rpc loop, simply raising will probably just cause the agent to crash.
> I looked at the code again and it really seems to be silently ignoring errors 
> from ovs commands.
> This actually makes sense in some cases. For instance the l3 agent might 
> remove a qr-xxx or qg-xxx port while the l2 agent is in the middle of its 
> iteration.
> 
> There are however cases in which the exception must be handled.
> In cases like the ALARM_CLOCK error, either a retry mechanism or marking the 
> port for re-syncing at the next iteration might make sense.
> Other error cases might be unrecoverable; for instance when a port 
> disappears. In that case it seems reasonable to put the relevant neutron port 
> in ERROR state, so that the user is aware that the port is no longer usable.
> 
I think it makes sense to address these things. Want me to file a bug?

> Thanks,
> Kyle
> 
> [1] 
> https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ovs_lib.py#L52
> 
> > Regards,
> > Salvatore
> >
> > [1] 
> > https://github.com/openstack/neutron/blob/master/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py#L599
> > [2] https://bugs.launchpad.net/neutron/+bug/1254520
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 


