Re: [Openstack-operators] [Openstack] Recovering from full outage

2018-07-06 Thread Torin Woltjer
I explored creating a second "selfservice" vxlan to see if DHCP would work on it as it does on my external "provider" network. The new vxlan network shares the same problems as the old vxlan network. Am I having problems with VXLAN in particular? Torin Woltjer Grand Dial Communications - A ZK
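For reference, creating a second self-service VXLAN network like the one described above is typically done with the OpenStack CLI. This is a minimal sketch; the network name, subnet range, and the decision to let Neutron pick the segmentation ID are all assumptions, not details from this thread:

```shell
# Hypothetical names and subnet range -- adjust to the deployment.
# Omitting --provider-network-type lets Neutron use the default tenant
# network type (vxlan, if that is what ml2 is configured for).
openstack network create selfservice2
openstack subnet create selfservice2-subnet \
  --network selfservice2 \
  --subnet-range 172.16.2.0/24 \
  --dhcp
```

If instances on the new network show the same DHCP failure as the old one, that points at the VXLAN data path (MTU, VTEP reachability, multicast/l2pop) rather than at one broken network.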

[Openstack-operators] Deprecation notice: Cinder Driver for NetApp E-Series

2018-07-06 Thread Gavioli, Luiz
Developers and Operators, NetApp’s various Cinder drivers currently provide platform integration for ONTAP powered systems, SolidFire, and E/EF-Series systems. Per systems-provided telemetry and discussion amongst our user community, we’ve learned that when E/EF-series systems are deployed

Re: [Openstack-operators] [Openstack] Recovering from full outage

2018-07-06 Thread Torin Woltjer
Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be

Re: [Openstack-operators] [Openstack] Recovering from full outage

2018-07-06 Thread George Mihaiescu
Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. Also, on the controller node where the dhcp server for that network is, check the
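The test George suggests can be sketched as follows, run from the instance's console. The interface name and the spare address are assumptions; the router and DHCP addresses are the ones quoted elsewhere in this thread:

```shell
# Inside the VM, via the console (no working DHCP, so configure by hand).
# 172.16.1.10 is a hypothetical free address on the subnet.
ip addr add 172.16.1.10/24 dev eth0
ip link set eth0 up
ip route add default via 172.16.1.1   # the neutron router

# Now probe the DHCP agents' namespace ports:
ping -c 3 172.16.1.2
ping -c 3 172.16.1.3
```

If these pings succeed while DHCP still fails, L2/L3 connectivity is fine and the problem is likely in the DHCP exchange itself (dnsmasq state, port 67/68 traffic, or checksum/offload issues) rather than the overlay.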

Re: [Openstack-operators] [Openstack] Recovering from full outage

2018-07-06 Thread Torin Woltjer
I have done tcpdumps on both the controllers and on a compute node. Controller: `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` `tcpdump -vnes0 -i any port 67` Compute: `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` For the first command on the
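Spelled out, the captures quoted above look like this (`-vnes0` is the combined form of `-vne -s0`: verbose, no name resolution, print link-level headers, full snap length). The namespace and interface names are the ones quoted in the message:

```shell
# Controller: watch for DHCP requests arriving at the dnsmasq port
# inside the network's qdhcp namespace.
ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad \
  tcpdump -vne -s0 -i ns-83d68c76-b8 port 67

# Controller: same thing outside the namespace, on all interfaces.
tcpdump -vne -s0 -i any port 67

# Compute node: watch the Linux bridge for the client side (port 68)
# to confirm requests are at least leaving the instance.
tcpdump -vne -s0 -i brqd85c2a00-a6 port 68
```

Seeing requests on the compute bridge but nothing in the qdhcp namespace would localize the loss to the VXLAN fabric between the nodes.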

Re: [Openstack-operators] [ceph-users] After power outage, nearly all vm volumes corrupted and unmountable

2018-07-06 Thread Gary Molenkamp
Thank you Jason, Not sure how I missed that step. On 2018-07-06 08:34 AM, Jason Dillaman wrote: There have been several similar reports on the mailing list about this [1][2][3][4] that are always a result of skipping step 6 from the Luminous upgrade guide [5]. The new (starting Luminous)
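The upgrade step referenced here is the one that updates RBD client caps to the `profile rbd` form introduced in Luminous (needed so clients can blocklist dead exclusive-lock holders; without it, volumes can be left with stale locks after a power loss). A minimal sketch, assuming the common `client.cinder` user and the usual `volumes`/`vms`/`images` pools — check the Luminous upgrade notes for the exact caps for your deployment:

```shell
# Hypothetical client name and pool names -- adjust to the cluster.
ceph auth caps client.cinder \
  mon 'profile rbd' \
  osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images'

# Confirm the updated caps took effect:
ceph auth get client.cinder
```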

[Openstack-operators] After power outage, nearly all vm volumes corrupted and unmountable

2018-07-06 Thread Gary Molenkamp
Good morning all, After losing all power to our DC last night due to a storm, nearly all of the volumes in our Pike cluster are unmountable. Of the 30 VMs in use at the time, only one has been able to successfully mount and boot from its rootfs. We are using Ceph as the backend storage to

Re: [Openstack-operators] Storage concerns when switching from a single controller to a HA setup

2018-07-06 Thread Christian Zunker
Hi Jean-Philippe, we had the same issue with ceph as backend. This fixed the problem in our setup: https://ask.openstack.org/en/question/87545/cinder-high-availability/ Although the above link talks about an active-active setup, the official docs mention the hostname in the configuration also
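The usual shape of this fix is pinning the cinder-volume service identity so that all controllers in the HA pair present the same host to the scheduler, instead of each node claiming volumes under its own hostname. A sketch of the relevant `cinder.conf` fragment; the section name and value are assumptions, not taken from the linked answer:

```
# cinder.conf -- hypothetical backend section name
[rbd-backend]
backend_host = rbd:volumes
```

With a shared `backend_host` (or, on older releases, a shared `host =` in `[DEFAULT]`), volumes created by one controller remain manageable when the other controller takes over.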