On 11/30/2012 04:18 PM, Simon Kelley wrote:
On 30/11/12 21:03, Gene Czarcinski wrote:
On 11/30/2012 12:45 PM, Simon Kelley wrote:
On 30/11/12 17:20, Gene Czarcinski wrote:
On 11/30/2012 11:32 AM, Simon Kelley wrote:
On 30/11/12 15:54, Gene Czarcinski wrote:
On 11/29/2012 04:18 PM, Simon Kelley wrote:
On 29/11/12 20:31, Gene Czarcinski wrote:
I spoke too quickly.
The cause of the problem is libvirt related, but I am not sure what just
yet.
I was running a libvirt that had a lot of "stuff" on it but seemed to
work OK. Then, earlier today, I updated to a point that appears to be
somewhat beyond the leading edge and, although I was not getting any
RTR-ADVERT messages, it turned out that there were big-time problems
running qemu-kvm. So I downgraded to the previous version.
Qemu-kvm now works but the RTR-ADVERT messages are back.
This may be a bit time-consuming to debug!
Are you seeing the new log message in netlink.c?
The good news is that libvirt is working again (I must have done a
git-pull in the middle of an update). Thus, I am not seeing the large
numbers of RTR-ADVERT.
Yes, I am seeing the new log message and I have a question about that.
Every time a new virtual network interface is started, something must be
doing some type of broadcast because all of the dnsmasq instances (the
new one and all the "old" ones) suddenly wake up and issue a flurry of
RA packets and related syslog messages. To kick the flurry off, there is
one of the new "unsolicited" syslog messages from each dnsmasq instance.
Is this something you would expect? Is this "normal"? The libvirt
folks say they are not doing it.
I'd expect it. The code you instrumented gets run whenever a "new
address" event happens, which is whenever an address is added to an
interface. "Every time a new virtual network interface is started" is a
good proxy for that.
The dnsmasq code isn't very discriminating: it updates its idea of
which interfaces have which addresses, and then does a minute of fast
advertisements on all of them. It might be possible to only do the fast
advertisements on new interfaces, but implementing that isn't totally
trivial.
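To make the difference between the two policies concrete, here is a toy
Python model. It is not dnsmasq's code, only a sketch of the behaviour
described above. The function name, the only_changed flag, and the
interface virbr10 with its address are hypothetical; only virbr11 and
fd00:beef:10:6::1 come from this thread.

```python
from typing import Dict, Set

AddressMap = Dict[str, Set[str]]


def interfaces_to_advertise(old: AddressMap, new: AddressMap,
                            only_changed: bool = False) -> Set[str]:
    """Pick the interfaces that restart fast RAs after a new-address event."""
    if not only_changed:
        # Behaviour described in the thread: every known interface gets
        # a burst of fast advertisements, whether or not it changed.
        return set(new)
    # Hypothetical refinement: only interfaces that gained an address.
    return {ifname for ifname, addrs in new.items()
            if addrs - old.get(ifname, set())}


# State before and after a new virtual network (virbr11) comes up.
before = {"virbr10": {"fd00:beef:10:8::1"}}          # hypothetical
after = {"virbr10": {"fd00:beef:10:8::1"},
         "virbr11": {"fd00:beef:10:6::1"}}

print(sorted(interfaces_to_advertise(before, after)))
print(sorted(interfaces_to_advertise(before, after, only_changed=True)))
```

With the first policy both interfaces are selected, which matches the
flurry seen from every instance; with the second only virbr11 is.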
Yes, I doubt very much that it would be trivial. However, I do not
believe that this is the basic problem.
When the problem occurs, one of the networks "suddenly" attempts to work
with the real NIC rather than the virtual one defined in its config
file. I slightly changed the IPv4 and IPv6 addresses defined for this
network and the problem went away. I have also "just" seen the problem
happen on another system which also had that virtual address defined.
BTW, these configurations all use interface= and bind-dynamic rather
than the "old" bind-interfaces with listen-address= specified for each
IPv4 and IPv6 address. I had not noticed the problem previously. Why it
occurs at all with just this specific address is puzzling.
The configuration which causes problems is:
------------------------------------------
# dnsmasq conf file created by libvirt
strict-order
domain-needed
domain=net6
expand-hosts
local=/net6/
pid-file=/var/run/libvirt/network/net6.pid
bind-dynamic
interface=virbr11
dhcp-range=192.168.6.128,192.168.6.254
dhcp-no-override
dhcp-leasefile=/var/lib/libvirt/dnsmasq/net6.leases
dhcp-lease-max=127
dhcp-hostsfile=/var/lib/libvirt/dnsmasq/net6.hostsfile
addn-hosts=/var/lib/libvirt/dnsmasq/net6.addnhosts
dhcp-range=fd00:beef:10:6::1,ra-only
-------------------------------------------------
When I changed all the "6" to "160", the problem disappeared. And
there is another network defined almost the same with "8" instead of "6"
and I have had no problems with it.
The real NIC is configured as a DHCP client for both IPv4 and IPv6. It
is assigned "nailed" addresses of 192.168.17.2/24 and
fd00:dead:beef:17::2.
And I just discovered why crazy stuff is happening (but I do not know
what causes it) ... the p33p1 NIC has:
inet6 fd00:beef:10:6:3285:a9ff:fe8f:e982/64 scope global dynamic
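A quick way to confirm that this address belongs to the prefix configured
for net6 is sketched below in Python. It is only an illustration: the
helper name ra_prefixes is made up, the parser handles just the ra-only
form of dhcp-range seen in the config above, and the /64 prefix length is
an assumption (no length is given in the config).

```python
import ipaddress

# Relevant lines copied from the net6 config quoted in this thread.
CONF = """\
interface=virbr11
dhcp-range=192.168.6.128,192.168.6.254
dhcp-range=fd00:beef:10:6::1,ra-only
"""


def ra_prefixes(conf_text, prefix_len=64):
    """Return IPv6 networks named by 'dhcp-range=<addr>,ra-only' lines.

    prefix_len is an assumption: the config gives no explicit length.
    """
    prefixes = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line.startswith("dhcp-range="):
            continue
        fields = line.split("=", 1)[1].split(",")
        if "ra-only" not in fields[1:]:
            continue  # other modes are out of scope for this sketch
        try:
            start = ipaddress.IPv6Address(fields[0])
        except ValueError:
            continue  # not an IPv6 address
        prefixes.append(
            ipaddress.IPv6Network(f"{start}/{prefix_len}", strict=False))
    return prefixes


nic_addr = ipaddress.IPv6Address("fd00:beef:10:6:3285:a9ff:fe8f:e982")
matches = [net for net in ra_prefixes(CONF) if nic_addr in net]
print(matches)
```

Under those assumptions the address on p33p1 falls inside the prefix that
the virbr11 instance is configured to advertise.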
Is that the "real NIC"?
Yes, p33p1 is the real NIC. This is going to be a real PITA to debug
because I believe part of the problem is a race condition.
NetworkManager has this really long dance it goes through to bring up
the IPv6 interface.
But I do not have any proof of that and, as I just proved to myself,
getting things to repeat is going to be difficult.
At this point I am not sure that bind-dynamic was related. I went
through the syslogs I still have and the first occurrence was on 8
November. That is well before bind-dynamic was integrated.
Attached are some limited copies of syslogs that I thought you might
find of interest. It seems like the "strangeness" happens right after I
update libvirt and libvirtd is restarted, which then gets dnsmasq
started.
If I cannot get this figured out and "fixed", I will need to disable use
of dnsmasq for RA service and fall back on radvd.
Frustrating .. so close and yet so far!
I wonder if the virbr* interfaces are bridged to the "real" NICs, such
that when a prefix is advertised on the virbr interface, it causes the
real interface to add an address for that prefix. Because dnsmasq is
configured to advertise the prefix, that then causes the
advertisements via the real NIC.
Just a thought.
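The address reported on p33p1 is at least consistent with this idea. Its
lower 64 bits look like a modified EUI-64 interface identifier built from
the MAC address of p33p1 (30:85:a9:8f:e9:82 in the ip addr output), which
is what a host typically generates when it autoconfigures from an RA.
The Python sketch below reproduces the derivation; the function name is
illustrative and the /64 prefix length is an assumption.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Combine a /64 prefix with a modified EUI-64 interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC.
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return net.network_address + int.from_bytes(iid, "big")


addr = slaac_address("fd00:beef:10:6::/64", "30:85:a9:8f:e9:82")
print(addr)  # fd00:beef:10:6:3285:a9ff:fe8f:e982
```

The result matches the "scope global dynamic" address seen on p33p1, so
the real NIC does appear to have autoconfigured itself from an
advertisement for the net6 prefix. How that advertisement reached p33p1
is still the open question.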
If I had not done the ip addr to get the above, I would still be
scratching my head.
Anyway, here is ip addr:
-----------------------------------------------
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: p33p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 30:85:a9:8f:e9:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.17.2/24 brd 192.168.17.255 scope global p33p1
    inet6 fd00:dead:beef:17:1::2/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::3285:a9ff:fe8f:e982/64 scope link
       valid_lft forever preferred_lft forever
10: virbr11: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 52:54:00:0b:84:5c brd ff:ff:ff:ff:ff:ff
    inet 192.168.6.1/24 brd 192.168.6.255 scope global virbr11
    inet6 fd00:beef:10:6::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe0b:845c/64 scope link
       valid_lft forever preferred_lft forever
11: virbr11-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr11 state DOWN qlen 500
    link/ether 52:54:00:0b:84:5c brd ff:ff:ff:ff:ff:ff
------------------------------------
And here is brctl show:
-----------------------------------------
bridge name     bridge id               STP enabled     interfaces
virbr11         8000.5254000b845c       yes             virbr11-nic
---------------------------------------
I think I will give it a rest until tomorrow!
Gene