Thank you Ernest,
I have tried setting "agent_down_time" to 120, but nothing changed.
I think I am facing a different problem (ovs-vsctl, the neutron log and
mysql all show different output from yours).
I have a two node configuration:
Controller/Network node (node A, ip 10.0.0.11)
Compute node (node B, ip 10.0.0.31; the instance has ip 192.168.1.11)
I have used tcpdump on both nodes to see what's happening.
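For reference, the captures I run look roughly like this (interface and
namespace names below are placeholders, not my actual ones, and the GRE
filter assumes a GRE tunnel setup):
  # on node B (compute): DHCP leaving the instance's tap device
  tcpdump -n -i tapXXXXXXXX-XX 'port 67 or port 68'
  # on node A (network): tunnelled traffic arriving on the physical NIC (GRE is IP protocol 47)
  tcpdump -n -i eth0 'ip proto 47'
  # on node A (network): DHCP as seen inside the dhcp namespace
  ip netns exec qdhcp-<network-id> tcpdump -n -i tapXXXXXXXX-XX 'port 67 or port 68'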
When an instance gets an ip, both nodes capture the exchange, and
/var/log/syslog (on node A) shows this output:
Sep 18 11:35:00 controller dnsmasq-dhcp[19606]: DHCPDISCOVER(tap96bde1f8-a7) fa:16:3e:e0:72:28
Sep 18 11:35:00 controller dnsmasq-dhcp[19606]: DHCPOFFER(tap96bde1f8-a7) 192.168.1.11 fa:16:3e:e0:72:28
Sep 18 11:35:00 controller dnsmasq-dhcp[19606]: DHCPREQUEST(tap96bde1f8-a7) 192.168.1.11 fa:16:3e:e0:72:28
Sep 18 11:35:00 controller dnsmasq-dhcp[19606]: DHCPACK(tap96bde1f8-a7) 192.168.1.11 fa:16:3e:e0:72:28 host-192-168-1-11
When an instance doesn't get an ip, node B captures the packets going out,
while node A captures nothing.
The strange thing is that /var/log/syslog (on node A) still shows some lines
around the time of the exchange (even though tcpdump captures nothing), but
the only DHCP request there comes from node A itself:
Sep 18 11:37:41 controller dnsmasq[19606]: read /var/lib/neutron/dhcp/b319b093-09f2-46ed-8136-1454f0616147/addn_hosts - 4 addresses
Sep 18 11:37:41 controller dnsmasq-dhcp[19606]: read /var/lib/neutron/dhcp/b319b093-09f2-46ed-8136-1454f0616147/host
Sep 18 11:37:41 controller dnsmasq-dhcp[19606]: read /var/lib/neutron/dhcp/b319b093-09f2-46ed-8136-1454f0616147/opts
Sep 18 11:39:44 controller dhclient: DHCPREQUEST of 10.0.0.11 on eth0 to 10.0.0.1 port 67 (xid=0x3e9eab12)
Sep 18 11:39:44 controller dhclient: DHCPACK of 10.0.0.11 from 10.0.0.1
Sep 18 11:39:44 controller dhclient: bound to 10.0.0.11 -- renewal in 1787 seconds.
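Since node B sends the packets but node A never sees them, I am also going
to look at the tunnel and at the flows on br-tun on both nodes, with
standard OVS commands along these lines (the port name is a placeholder):
  ovs-vsctl show                  # check that the gre port exists and points at the right remote_ip
  ovs-ofctl dump-flows br-tun     # check which flows (and packet counters) the traffic actually hits
  ovs-vsctl list Interface <gre-port-name>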
thanks,
Claudio
On 17/09/2014 17:48, Ernest Bisson wrote:
Hi Claudio.
I was having this issue too, though maybe not as often as you. I believe I
finally fixed it by increasing "agent_down_time" in
/etc/neutron/neutron.conf. The default is 75 and I increased it to 120. The
following are a few things you can check to determine whether your problem is
the same as mine. (BTW, I'm running on RedHat 6.5.)
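For clarity, this is the only change I made (in the [DEFAULT] section of
neutron.conf, followed by a restart of neutron-server); 120 is just the
value that worked for me:
  # /etc/neutron/neutron.conf
  [DEFAULT]
  # seconds before the server marks an agent as dead (default 75)
  agent_down_time = 120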
1. On the NETWORK node, check the VLAN "tag" of the tap port on the
"br-int" bridge (4095 is the dead VLAN):
ovs-vsctl show
cbf90101-67bc-40d3-adc1-4f28aca87c85
    Bridge br-int
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port "tap81f710b4-84"
            tag: 4095
            Interface "tap81f710b4-84"
                type: internal
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port "qr-9ffcb84c-32"
            tag: 1
            Interface "qr-9ffcb84c-32"
                type: internal
    Bridge br-ex
        Port "qg-48380a3c-2a"
            Interface "qg-48380a3c-2a"
                type: internal
        Port "eth1"
            Interface "eth1"
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-tun
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "gre-0a000018"
            Interface "gre-0a000018"
                type: gre
                options: {in_key=flow, local_ip="10.0.0.14", out_key=flow, remote_ip="10.0.0.24"}
        Port br-tun
            Interface br-tun
                type: internal
    ovs_version: "1.11.0"
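If you only want the tag of a single port rather than the whole tree, this
also works (the port name is just the one from my output above):
  ovs-vsctl get Port tap81f710b4-84 tag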
2. On the NETWORK node, check for the following types of messages in
/var/log/neutron.log with DEBUG mode enabled:
DEBUG neutron.plugins.ml2.drivers.mech_agent [req-cfcf3460-c013-4259-9e6c-7b0125e97a8e None] Attempting to bind port
DEBUG neutron.plugins.ml2.drivers.mech_agent [req-cfcf3460-c013-4259-9e6c-7b0125e97a8e None] Checking agent:
WARNING neutron.plugins.ml2.drivers.mech_agent [req-cfcf3460-c013-4259-9e6c-7b0125e97a8e None] Attempting to bind with dead agent:
WARNING neutron.plugins.ml2.managers [req-cfcf3460-c013-4259-9e6c-7b0125e97a8e None] Failed to bind port DHCP-PORT-ID on host HOSTNAME
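If DEBUG is not already on, it can be enabled in neutron.conf and the
relevant lines pulled out with grep, e.g. (log path as on my system, adjust
if yours differs):
  # /etc/neutron/neutron.conf
  [DEFAULT]
  debug = True

  # after restarting neutron-server:
  grep -E "Attempting to bind|dead agent|Failed to bind port" /var/log/neutron.log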
3. On the CONTROLLER node, check the status of the "network:dhcp" port in
MySQL (it will be DOWN when failing):
mysql -u root -pPASSWORD
mysql> use neutron;
mysql> select * from ports;
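A narrower query makes the status easier to spot; something like this
should work (column names as in the neutron schema of this era, they may
differ slightly between releases):
  mysql> select id, network_id, mac_address, status from ports where device_owner = 'network:dhcp';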
Hope this helps,
Ernie
----
Ernie Bisson
System Administrator & Virtualization
IBM Software Group
Mass Lab Central Services
550 King St. Littleton, MA. 01460
Email: [email protected]
Phone: 978-899-3893
T/L : 276-3893
From: Claudio Pupparo <[email protected]>
To: [email protected],
Date: 09/17/2014 11:04 AM
Subject: [Openstack] A strange solution for instances not getting their ip
------------------------------------------------------------------------
Hi,
I have the common issue of instances not getting their ip.
When I start a cirros instance, the output looks like this:
udhcpc (v1.20.1) started
Sending discover...
Sending discover...
Sending discover...
No lease, failing
I've found that, during the boot process (while the instance is trying to
get an ip), if I remove and immediately re-add the router interface to the
subnet the instance is on, the instance succeeds in getting its ip!
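For the record, the remove/re-add step is just the usual router interface
commands (router and subnet names below are placeholders for my actual ones):
  neutron router-interface-delete <router> <subnet>
  neutron router-interface-add <router> <subnet>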
I've tested this many times, and the method keeps working.
Does anyone have a hint about this strange behaviour?
Thanks,
Claudio
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : [email protected]
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack