Hi, that could be a problem with the neutron metadata service; check its logs.
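
As a starting point for that log check, here is a minimal Python sketch that pulls the WARNING, ERROR and CRITICAL lines out of a log file. It assumes the usual oslo.log line layout (date, time, PID, level, then the message), and the function names are made up for illustration.

```python
# Levels worth a look; DEBUG and INFO lines are skipped.
LEVELS = {"WARNING", "ERROR", "CRITICAL"}


def suspicious_lines(lines, levels=LEVELS):
    """Yield lines whose level field (assumed to be the 4th token) is in `levels`."""
    for line in lines:
        # Assumed layout: date, time, pid, LEVEL, rest of the message.
        fields = line.split(None, 4)
        if len(fields) >= 4 and fields[3] in levels:
            yield line.rstrip("\n")


def scan_file(path):
    """Print the suspicious lines of one log file (the path is supplied by the caller)."""
    with open(path, errors="replace") as log:
        for hit in suspicious_lines(log):
            print(hit)
```

You could call `scan_file("/var/log/neutron/neutron-metadata-agent.log")` on the controllers; that path is only an assumption about your packaging, so adjust it to wherever your metadata agent logs.
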
Have you considered that the outage might have corrupted your databases (neutron, nova, etc.)?

BR

On Thu, Jul 5, 2018 at 9:07 PM Torin Woltjer <torin.wolt...@granddial.com> wrote:

> Are IP addresses set by cloud-init on boot? I noticed that cloud-init
> isn't working on my VMs. I created a new instance from an Ubuntu 18.04
> image to test with; the hostname was not set to the name of the instance,
> and I could not log in as the users I had specified in the configuration.
>
> *Torin Woltjer*
> *Grand Dial Communications - A ZK Tech Inc. Company*
> *616.776.1066 ext. 2006*
> *www.granddial.com <http://www.granddial.com>*
>
> ------------------------------
> *From*: George Mihaiescu <lmihaie...@gmail.com>
> *Sent*: 7/5/18 12:57 PM
> *To*: torin.wolt...@granddial.com
> *Cc*: "openst...@lists.openstack.org" <openst...@lists.openstack.org>,
> "openstack-operators@lists.openstack.org" <openstack-operators@lists.openstack.org>
> *Subject*: Re: [Openstack] Recovering from full outage
>
> You should tcpdump inside the qdhcp namespace to see if the requests make
> it there, and also check iptables rules on the compute nodes for the return
> traffic.
>
> On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer <torin.wolt...@granddial.com> wrote:
>
>> Yes, I've done this. The VMs hang for a while waiting for DHCP and
>> eventually come up with no addresses. neutron-dhcp-agent has been restarted
>> on both controllers. The qdhcp netns's were all present; I stopped the
>> service, removed the qdhcp netns's, noted that the dhcp agents show offline
>> in `neutron agent-list`, restarted all neutron services, and noted that the
>> qdhcp netns's were recreated. I restarted a VM again and it still fails to
>> pull an IP address.
>>
>> ------------------------------
>> *From*: George Mihaiescu <lmihaie...@gmail.com>
>> *Sent*: 7/5/18 10:38 AM
>> *To*: torin.wolt...@granddial.com
>> *Subject*: Re: [Openstack] Recovering from full outage
>>
>> Did you restart the neutron-dhcp-agent and reboot the VMs?
>>
>> On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer <torin.wolt...@granddial.com> wrote:
>>
>>> The qrouter netns appears once the lock_path is specified, and the neutron
>>> router is pingable as well. However, instances are not pingable. If I log
>>> in via console, the instances have not been given IP addresses; if I
>>> manually give them an address and route, they are pingable and seem to work.
>>> So the router is working correctly but dhcp is not working.
>>>
>>> No errors in any of the neutron or nova logs on controllers or compute
>>> nodes.
>>>
>>> ------------------------------
>>> *From*: "Torin Woltjer" <torin.wolt...@granddial.com>
>>> *Sent*: 7/5/18 8:53 AM
>>> *To*: <lmihaie...@gmail.com>
>>> *Cc*: openstack-operators@lists.openstack.org, openst...@lists.openstack.org
>>> *Subject*: Re: [Openstack] Recovering from full outage
>>>
>>> There is no lock path set in my neutron configuration. Does it
>>> ultimately matter what it is set to as long as it is consistent? Does it
>>> need to be set on compute nodes as well as controllers?
>>>
>>> ------------------------------
>>> *From*: George Mihaiescu <lmihaie...@gmail.com>
>>> *Sent*: 7/3/18 7:47 PM
>>> *To*: torin.wolt...@granddial.com
>>> *Cc*: openstack-operators@lists.openstack.org, openst...@lists.openstack.org
>>> *Subject*: Re: [Openstack] Recovering from full outage
>>>
>>> Did you set a lock_path in neutron's config?
>>>
>>> On Jul 3, 2018, at 17:34, Torin Woltjer <torin.wolt...@granddial.com> wrote:
>>>
>>> The following errors appear in the neutron-linuxbridge-agent.log on both
>>> controllers:
>>> http://paste.openstack.org/show/724930/
>>>
>>> No such errors are on the compute nodes themselves.
>>>
>>> ------------------------------
>>> *From*: "Torin Woltjer" <torin.wolt...@granddial.com>
>>> *Sent*: 7/3/18 5:14 PM
>>> *To*: <lmihaie...@gmail.com>
>>> *Cc*: "openstack-operators@lists.openstack.org" <openstack-operators@lists.openstack.org>,
>>> "openst...@lists.openstack.org" <openst...@lists.openstack.org>
>>> *Subject*: Re: [Openstack] Recovering from full outage
>>>
>>> Running `openstack server reboot` on an instance just causes the
>>> instance to be stuck in a rebooting status. Most notable of the logs is
>>> neutron-server.log, which shows the following:
>>> http://paste.openstack.org/show/724917/
>>>
>>> I realized that rabbitmq was in a failed state, so I bootstrapped it and
>>> rebooted the controllers, and all of the agents show online:
>>> http://paste.openstack.org/show/724921/
>>>
>>> All of the instances can be started properly; however, I cannot ping
>>> any of the instances' floating IPs or the neutron router. When logging
>>> into an instance with the console, there is no IP address on any interface.
>>>
>>> ------------------------------
>>> *From*: George Mihaiescu <lmihaie...@gmail.com>
>>> *Sent*: 7/3/18 11:50 AM
>>> *To*: torin.wolt...@granddial.com
>>> *Subject*: Re: [Openstack] Recovering from full outage
>>>
>>> Try restarting them using "openstack server reboot" and also check the
>>> nova-compute.log and neutron agents logs on the compute nodes.
>>>
>>> On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer <torin.wolt...@granddial.com> wrote:
>>>
>>>> We just suffered a power outage in our data center and I'm having
>>>> trouble recovering the OpenStack cluster. All of the nodes are back online
>>>> and every instance shows active, but `virsh list --all` on the compute
>>>> nodes shows that all of the VMs are actually shut down. Running `ip addr`
>>>> on any of the nodes shows that none of the bridges are present, and
>>>> `ip netns` shows that all of the network namespaces are missing as well.
>>>> So despite all of the neutron services running, none of the networking
>>>> appears to be active, which is concerning. How do I solve this without
>>>> recreating all of the networks?
>>>>
>>>> _______________________________________________
>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>> Post to     : openst...@lists.openstack.org
>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
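
The tcpdump suggestion quoted above could be scripted roughly as follows. This is only a sketch under assumptions: the DHCP namespace is named `qdhcp-<network UUID>` (the usual neutron convention), the script runs as root on the node hosting the DHCP agent, and the helper names are hypothetical.

```python
import shlex
import subprocess


def dhcp_capture_cmd(network_id, interface="any"):
    """Build a tcpdump command that runs inside the qdhcp namespace of a network."""
    return [
        "ip", "netns", "exec", f"qdhcp-{network_id}",
        # -n: no name resolution, -l: line-buffered output
        "tcpdump", "-n", "-l", "-i", interface,
        # DHCP server and client ports
        "port", "67", "or", "port", "68",
    ]


def run_capture(network_id):
    """Run the capture until tcpdump is interrupted; returns its exit status."""
    cmd = dhcp_capture_cmd(network_id)
    print("running:", shlex.join(cmd))
    return subprocess.call(cmd)
```

Call `run_capture()` with the UUID of the affected network while an instance is booting. If requests show up in the namespace but the instance never gets a lease, the return path (the iptables rules on the compute nodes mentioned above) is the next thing to look at; if nothing arrives at all, the requests are probably being lost before they reach the controller.
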
_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators