Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-28 Thread Joe Topjian
I'm pretty sure I've resolved this issue. Since this seems to happen
randomly, it might just be a coincidence that this is by far the longest
streak that it hasn't happened. :)

I noticed that CentOS 7 and RHEL 7 are setting a `valid_lft` and
`preferred_lft` timeout on the IPv4 address. You can see this by doing an
"ip a" on CentOS7/RHEL7 and comparing with either CentOS6 or Ubuntu. This
is the first time I've seen this used on IPv4. It's usually used for IPv6
privacy addresses. The timeout is set to something larger than the lease
renewal time.
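
For anyone who wants to check their own images, here is a minimal Python sketch that pulls the lifetimes out of `ip -4 addr show` output. The sample text, interface name, addresses, and lifetimes are made up for illustration, and the function names are hypothetical; only the `valid_lft ... preferred_lft ...` line format comes from `ip` itself.

```python
import re

# Hypothetical sample of `ip -4 addr show` output (values are invented).
SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 10.0.0.5/24 brd 10.0.0.255 scope global dynamic eth0
       valid_lft 86313sec preferred_lft 86313sec
"""

INET_RE = re.compile(r"^\s*inet (\S+)")
LFT_RE = re.compile(r"valid_lft (\S+) preferred_lft (\S+)")


def address_lifetimes(ip_output):
    """Map each IPv4 address to its (valid_lft, preferred_lft) strings."""
    result = {}
    current = None
    for line in ip_output.splitlines():
        inet = INET_RE.match(line)
        if inet:
            # Remember the address; its lifetimes follow on the next line.
            current = inet.group(1)
            continue
        lft = LFT_RE.search(line)
        if lft and current is not None:
            result[current] = (lft.group(1), lft.group(2))
            current = None
    return result


def finite_lifetimes(ip_output):
    """Keep only addresses whose valid_lft is not 'forever'."""
    return {
        addr: lfts
        for addr, lfts in address_lifetimes(ip_output).items()
        if lfts[0] != "forever"
    }


if __name__ == "__main__":
    print(finite_lifetimes(SAMPLE))
```

On a real instance the text could come from `subprocess.check_output(["ip", "-4", "addr", "show"], text=True)`. An address that shows up in `finite_lifetimes` is one the kernel will remove by itself if nothing refreshes it in time.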

What happens, though, is that it is occasionally taking a little longer to
receive the DHCP renewal. Then the `valid_lft` hits zero and the IP is
removed from the interface. When this happens, the kernel will clean up any
routes used by the removed IP (in this case, the default gateway).

A few seconds later, the late DHCP renewal is finally received and the IP
is added back to the interface. But due to how CentOS/RHEL7 is handling the
renewal in /usr/sbin/dhclient-script, the gateway is never re-added.
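
To make the sequence above concrete, here is a toy Python model of it. It is only a sketch of the behaviour as I understand it, not a reproduction of the kernel or of dhclient-script; the function name, the flag, and the numbers in the usage note are all hypothetical.

```python
def simulate(valid_lft, renewal_at, renew_readds_gateway=False):
    """Return (has_address, has_default_route) once the renewal is processed.

    valid_lft            -- seconds until the kernel expires the address
    renewal_at           -- seconds until the DHCP renewal actually arrives
    renew_readds_gateway -- whether the renewal handler re-adds the default
                            route (assumed False for the behaviour described)
    """
    has_address = True
    has_route = True

    if renewal_at >= valid_lft:
        # The lifetime reached zero first: the address is removed and the
        # routes that depended on it (the default gateway) go with it.
        has_address = False
        has_route = False

    # The renewal arrives and the address is present again either way.
    has_address = True
    if renew_readds_gateway:
        has_route = True

    return has_address, has_route


if __name__ == "__main__":
    print(simulate(100, 90))         # renewal arrives in time
    print(simulate(100, 103))        # renewal arrives a few seconds late
    print(simulate(100, 103, True))  # late, but the handler restores the route
```

The second call is the failure case: the instance ends up with its IP but no default route, which matches what the affected instances look like.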

My guess as to why a newer version of dnsmasq does not exhibit this issue
is that it advertises renewals a little differently: enough to trigger the
part of dhclient-script that re-adds the gateway. I have not verified this
theory, though.

What I've done for now is modify dhclient-script and remove every portion
that sets `valid_lft` and `preferred_lft`, so now they are set to "forever"
just like on other distros.
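
If you want to find the relevant lines before editing, a rough Python sketch like the one below lists every non-comment line that mentions either lifetime. The sample fragment is hypothetical and is not the real contents of dhclient-script; the function name is mine as well.

```python
# Hypothetical fragment for illustration only, NOT the real dhclient-script.
SAMPLE = """\
# set up the interface
ip -4 addr add 10.0.0.5/24 dev eth0 valid_lft 120 preferred_lft 120
ip link set dev eth0 up
"""


def lifetime_lines(script_text):
    """Return (line_number, text) for lines that set an address lifetime."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore shell comments
        if "valid_lft" in stripped or "preferred_lft" in stripped:
            hits.append((number, stripped))
    return hits


if __name__ == "__main__":
    for number, text in lifetime_lines(SAMPLE):
        print(number, text)
```

Against a real image you would pass in the contents of /usr/sbin/dhclient-script instead of `SAMPLE`. Keep a backup of the script, since a package update may overwrite local changes.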

And so far, so good (crossing fingers).

Thanks,
Joe

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Joe Topjian
Hi George,

All instances have only a single interface.

Thanks,
Joe



Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread George Shuklin
How many network interfaces does your instance have? If more than one, check
the settings for the second network (subnet). It can have its own DHCP
settings, which may interfere with the routes for the main network.




Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Joe Topjian
Thanks, Kris. I'm going to see if there are any oddities between the version
of dnsmasq packaged with 12.04/Icehouse and systemd-dhcp.



Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Kris G. Lindgren
I can't help, as we use config-drive to set networking and are just starting to
roll out Cent7 VMs. However, a huge change from Cent6 to Cent7 was the switch
from upstart/dhclient to systemd/systemd-dhcp.


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.





Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Jesse Keating
My first guess would be that the client is trying to refresh its lease
and the lease is coming back without a gateway, due to a bug in dnsmasq.
Just a guess, though.


We are running 12.04 as well, but I don't recall running into this
situation. We're on Neutron (Havana for now, Juno very soon), though, if
that makes a difference.


--
-jlk



[Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Joe Topjian
Hello,

I have run into two different OpenStack clouds where instances running
either RHEL 7 or CentOS 7 images are randomly losing their network gateway.

There's nothing in the logs that shows any indication of why. There's no
DHCP hiccup or anything like that. The gateway has just disappeared.

If I log into the instance via another instance (so on the same subnet
since there's no gateway), I can manually re-add the gateway and everything
works... until it loses it again.
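
As a rough illustration of how the missing gateway could be spotted automatically, here is a small Python sketch that looks for a default route in `ip route` output. The sample routing tables and addresses are invented, and the function name is hypothetical.

```python
# Hypothetical `ip route` output from a healthy instance (addresses invented).
HEALTHY = """\
default via 10.0.0.1 dev eth0
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.5
"""

# The same instance after the default route has gone missing.
BROKEN = """\
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.5
"""


def default_gateway(ip_route_output):
    """Return the default gateway address, or None if there is no default route."""
    for line in ip_route_output.splitlines():
        fields = line.split()
        if fields[:1] == ["default"] and "via" in fields:
            via = fields.index("via")
            if via + 1 < len(fields):
                return fields[via + 1]
    return None


if __name__ == "__main__":
    print(default_gateway(HEALTHY))
    print(default_gateway(BROKEN))
```

When it returns `None`, something like `ip route add default via <gateway>` run from a host on the same subnet is one way to put the route back by hand.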

One cloud is running Havana and the other is running Icehouse. Both are
using nova-network and both are Ubuntu 12.04.

On the Havana cloud, we decided to install the dnsmasq package from Ubuntu
14.04. This looks to have resolved the issue as this was back in November
and I haven't heard an update since.

However, we don't want to do that just yet on the Icehouse cloud. We'd like
to understand exactly why this is happening and why updating dnsmasq
resolves an issue that only one specific type of image is having.

I can make my way around CentOS, but I'm not as familiar with it as I am
with Ubuntu (especially CentOS 7). Does anyone know what change in
RHEL7/CentOS7 might be causing this? Or does anyone have any other ideas on
how to troubleshoot the issue?

I currently have access to two instances in this state, so I'd be happy to
act as remote hands and eyes. :)

Thanks,
Joe