Hi Kris,

I'm adding Shi Han Zhang to the thread.

I was involved in some refactors during Kilo, and Han Zhang in some extra fixes during Liberty [1] [2] [3].

Could you get us some logs of such failures so we can see what was happening around the failure time? At a minimum,
we should file a bug at https://bugs.launchpad.net/neutron with the error traces from the logs.
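
In case it helps with collecting the logs, here is a rough sketch of how the lines around a failure could be cut out of an agent log. It assumes each log record starts with a timestamp like "2015-08-10 12:00:05.500" (the usual oslo.log layout, but please check your own logging format); the function name and the 60 second window are just placeholders I made up for illustration.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout at the start of each log record,
# e.g. "2015-08-10 12:00:05.500". Adjust if your format differs.
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TS_LENGTH = 23  # length of the example timestamp above


def lines_around(log_lines, failure_time, window_seconds=60):
    """Return the lines logged within +/- window_seconds of failure_time.

    Lines without a parsable timestamp (for example traceback
    continuation lines) inherit the timestamp of the preceding record.
    """
    window = timedelta(seconds=window_seconds)
    selected = []
    current_ts = None
    for line in log_lines:
        try:
            current_ts = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        except ValueError:
            pass  # continuation line: keep the previous timestamp
        if current_ts is not None and abs(current_ts - failure_time) <= window:
            selected.append(line)
    return selected
```

You would feed it the lines of the agent log file and the approximate time of the failure, then attach the result to the bug.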

We will be glad to use that information to make the ipset handling more fault tolerant, and to try to identify the
cause of the possible race conditions.
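
To give an idea of what "more fault tolerant" could look like for the second failure you describe (removing an element that is no longer in the set): the ipset command has an -exist option that ignores the error when a missing entry is deleted. The sketch below is only an illustration under that assumption, not the agent's actual code; the function names and the injectable "run" parameter are hypothetical.

```python
import subprocess


def ipset_del_cmd(set_name, member, tolerate_missing=True):
    """Build an 'ipset del' command line as a list of arguments.

    With tolerate_missing=True the -exist option is added, so ipset
    does not report an error if the member is already gone.
    """
    cmd = ["ipset", "del"]
    if tolerate_missing:
        cmd.append("-exist")
    cmd += [set_name, member]
    return cmd


def remove_member(set_name, member, run=subprocess.check_call):
    """Remove a member from a set; 'run' is injectable for testing."""
    run(ipset_del_cmd(set_name, member))
```

Running the real command needs root, so the command runner is passed in and can be replaced with a stub when trying this out.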


[1] https://review.openstack.org/#/c/187483/
[2] https://review.openstack.org/190991
[3] https://review.openstack.org/#/c/187433/



Kris G. Lindgren wrote:

We have been using ipsets since juno. Twice now since our kilo upgrade we have had issues with ipsets blowing up on a compute node.

The first time, iptables was referencing an ipset that was either no longer there or had not been added, and the agent was trying to apply the iptables config every second and, each time it failed, dumping the full iptables-restore output into the log at TRACE level. The second time, ipset was failing to remove an element that was no longer there.

I solved #1 by restarting the neutron-openvswitch-agent. For #2 we just added the entry that ipset was trying to remove. It seems like we are having some race conditions under kilo that were not present under juno (or we managed to run it for 6+ months without it biting us).

Is anyone else seeing the same problems? I am noticing some commits reverting/re-adding changes around ipsets in kilo and liberty, so I am trying to confirm whether I need to open a new bug on this.
____________________________________________

Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.

_______________________________________________
OpenStack-operators mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

