On 11/30/2012 04:18 PM, Simon Kelley wrote:
On 30/11/12 21:03, Gene Czarcinski wrote:
On 11/30/2012 12:45 PM, Simon Kelley wrote:
On 30/11/12 17:20, Gene Czarcinski wrote:
On 11/30/2012 11:32 AM, Simon Kelley wrote:
On 30/11/12 15:54, Gene Czarcinski wrote:
On 11/29/2012 04:18 PM, Simon Kelley wrote:
On 29/11/12 20:31, Gene Czarcinski wrote:

I spoke too quickly.

The cause of the problem is libvirt-related, but I am not sure what just
yet.

I was running a libvirt that had a lot of "stuff" on it but seemed to
work OK. Then, earlier today, I updated to a point that appears to be
somewhat beyond the leading edge and, although I was not getting any
RTR-ADVERT messages, it turned out that there were/are big-time problems
running qemu-kvm. So I backed off/downgraded to the previous version.
Qemu-kvm now works but the RTR-ADVERT messages are back.

This may be a bit time-consuming to debug!

Are you seeing the new log message in netlink.c?


The good news is that libvirt is working again (I must have done a
git-pull in the middle of an update). Thus, I am not seeing the large
numbers of RTR-ADVERT.

Yes, I am seeing the new log message and I have a question about that.
Every time a new virtual network interface is started, something must be
doing some type of broadcast because all of the dnsmasq instances (the
new one and all the "old" ones) suddenly wake up and issue a flurry of RA
packets and related syslog messages. To kick the flurry off, there is
one of the new "unsolicited" syslog messages from each dnsmasq instance.

Is this something you would expect?  Is this "normal"? The libvirt
folks say they are not doing it.
I'd expect it. The code you instrumented gets run whenever a "new
address" event happens, which is whenever an address is added to an
interface. "Every time a new virtual network interface is started" is a
good proxy for that.

The dnsmasq code isn't very discriminating: it updates its idea of
which interfaces have which addresses, and then does a minute of fast
advertisements on all of them. It might be possible to only do the fast
advertisements on new interfaces, but implementing that isn't totally
trivial.
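
The "only do the fast advertisements on new interfaces" idea can be
sketched as a diff between two snapshots of interface addresses. This is
a minimal Python illustration, not dnsmasq's actual code: the function
name, the snapshot dictionaries, and the "virbr8" entry with its address
are hypothetical.

```python
def new_address_interfaces(before, after):
    """Return the interfaces that gained at least one address."""
    changed = []
    for ifname, addrs in after.items():
        gained = set(addrs) - set(before.get(ifname, ()))
        if gained:
            changed.append(ifname)
    return sorted(changed)


# Hypothetical snapshots taken before and after a "new address" event.
before = {"virbr8": {"fd00:beef:10:8::1"}}
after = {
    "virbr8": {"fd00:beef:10:8::1"},
    "virbr11": {"fd00:beef:10:6::1"},
}

# Only interfaces with a newly added address would get fast advertisements.
fast_ra_targets = new_address_interfaces(before, after)
print(fast_ra_targets)
```

Keeping the previous snapshot around is the extra state that a more
selective approach would need.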


Yes, I doubt very much if it would be trivial.  However, I do not
believe that this is the basic problem.

When the problem occurs, one of the networks "suddenly" attempts to work
with the real NIC rather than the virtual one defined in its config
file.  I slightly changed the IPv4 and IPv6 addresses defined for this
network and the problem went away. I have also "just" seen the problem
happen on another system which also had that virtual address defined.

BTW, these configurations all use interface= and bind-dynamic rather
than the "old" bind-interfaces with listen-address= specified for each
IPv4 and IPv6 address.  I had not noticed the problem
previously.  Why it occurs at all with just this specific address is
puzzling.

The configuration which causes problems is:
------------------------------------------
# dnsmasq conf file created by libvirt
strict-order
domain-needed
domain=net6
expand-hosts
local=/net6/
pid-file=/var/run/libvirt/network/net6.pid
bind-dynamic
interface=virbr11
dhcp-range=192.168.6.128,192.168.6.254
dhcp-no-override
dhcp-leasefile=/var/lib/libvirt/dnsmasq/net6.leases
dhcp-lease-max=127
dhcp-hostsfile=/var/lib/libvirt/dnsmasq/net6.hostsfile
addn-hosts=/var/lib/libvirt/dnsmasq/net6.addnhosts
dhcp-range=fd00:beef:10:6::1,ra-only
-------------------------------------------------

When I changed all the "6" to "160", the problem disappeared. And
there is another network defined almost the same with "8" instead of "6"
and I have had no problems with it.

The real NIC is configured as a DHCP client for both IPv4 and IPv6. It
is assigned "nailed" addresses of 192.168.17.2/24 and
fd00:dead:beef:17::2.

And I just discovered why crazy stuff is happening (but I do not know
what causes it) ... the p33p1 NIC has:
   inet6 fd00:beef:10:6:3285:a9ff:fe8f:e982/64 scope global dynamic

Is that the "real NIC"?

Yes, p33p1 is the real NIC.  This is going to be a real PITA to debug
because I believe part of the problem is a race condition.
NetworkManager has this really long dance it goes through to bring up
the IPv6 interface.

But I do not have any proof of that and, as I just proved to myself,
getting things to repeat is going to be difficult.

At this point I am not sure that bind-dynamic was related.  I went
through the syslogs I still have and the first occurrence was on 8
November.  That is well before bind-dynamic was integrated.

Attached are some limited copies of syslogs that I thought you might
find of interest.  It seems the "strangeness" happens right
after I update libvirt and libvirtd is restarted which then gets dnsmasq
started.

If I cannot get this figured out and "fixed", I will need to disable use
of dnsmasq for RA service and fall back on radvd.

Frustrating .. so close and yet so far!


I wonder if the virbr* interfaces are bridged to the "real" NICs, such
that when a prefix is advertised on the virbr interface, it causes the
real interface to add an address for that prefix. Because dnsmasq is
configured to advertise the prefix, that then causes the advertisements
via the real NIC.

Just a thought.
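
The hypothesis can be illustrated by checking which interfaces hold an
address inside the advertised prefix. This is a rough Python sketch using
addresses quoted in this thread. The /64 length is taken from the virbr11
address, the dictionary layout is an assumption, and this is not how
dnsmasq itself decides where to advertise.

```python
import ipaddress

# Prefix from the dhcp-range=fd00:beef:10:6::1,ra-only line, assumed /64.
advertised = ipaddress.ip_network("fd00:beef:10:6::/64")

# Addresses quoted in this thread; the mapping itself is illustrative.
addresses = {
    "virbr11": ["fd00:beef:10:6::1"],
    "p33p1": [
        "fd00:dead:beef:17:1::2",
        "fd00:beef:10:6:3285:a9ff:fe8f:e982",
    ],
}

# Interfaces that carry at least one address within the advertised prefix.
matching = sorted(
    ifname
    for ifname, addrs in addresses.items()
    if any(ipaddress.ip_address(a) in advertised for a in addrs)
)
print(matching)
```

Once p33p1 holds an address in that prefix, it looks the same as virbr11
to any check that goes by prefix alone.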

If I had not done the ip addr to get the above, I would still be scratching my head.

Anyway, here is ip addr:
-----------------------------------------------
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: p33p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 30:85:a9:8f:e9:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.17.2/24 brd 192.168.17.255 scope global p33p1
    inet6 fd00:dead:beef:17:1::2/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::3285:a9ff:fe8f:e982/64 scope link
       valid_lft forever preferred_lft forever
10: virbr11: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 52:54:00:0b:84:5c brd ff:ff:ff:ff:ff:ff
    inet 192.168.6.1/24 brd 192.168.6.255 scope global virbr11
    inet6 fd00:beef:10:6::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe0b:845c/64 scope link
       valid_lft forever preferred_lft forever
11: virbr11-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr11 state DOWN qlen 500
    link/ether 52:54:00:0b:84:5c brd ff:ff:ff:ff:ff:ff
------------------------------------
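
The link/ether value for p33p1 in the listing above, combined with the
prefix from the dhcp-range=fd00:beef:10:6::1,ra-only line, reproduces the
fd00:beef:10:6:... address reported on p33p1. That is consistent with
p33p1 having autoconfigured itself from a router advertisement for the
virbr11 prefix. A minimal Python sketch, assuming the standard modified
EUI-64 interface identifier (the function name is made up for this
example):

```python
import ipaddress


def slaac_address(prefix, mac):
    """Combine a /64 prefix with the modified EUI-64 ID built from a MAC."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("EUI-64 autoconfiguration needs a /64 prefix")
    octets = [int(part, 16) for part in mac.split(":")]
    octets[0] ^= 0x02  # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    iid = int.from_bytes(bytes(eui64), "big")
    return ipaddress.IPv6Address(int(net.network_address) | iid)


# Prefix of virbr11 and MAC address of p33p1, both taken from this thread.
addr = slaac_address("fd00:beef:10:6::/64", "30:85:a9:8f:e9:82")
print(addr)
```

The same interface identifier shows up in the fe80:: link-local address
of p33p1.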

And here is brctl show:
-----------------------------------------
bridge name    bridge id        STP enabled    interfaces
virbr11        8000.5254000b845c    yes        virbr11-nic
---------------------------------------
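
Going by the brctl output above, p33p1 is not a member of the virbr11
bridge. A small Python sketch of that check, assuming the usual
whitespace-separated "brctl show" layout (the parser is illustrative and
only handles the simple case shown here):

```python
def bridge_members(brctl_output):
    """Map each bridge name to its member interfaces from `brctl show` text."""
    bridges = {}
    current = None
    for line in brctl_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if not fields:
            continue
        if len(fields) >= 3:
            # Bridge row: name, bridge id, STP flag, optional first member.
            current = fields[0]
            bridges[current] = fields[3:]
        elif current is not None:
            # Continuation row: one more member of the current bridge.
            bridges[current].append(fields[0])
    return bridges


sample = """bridge name    bridge id        STP enabled    interfaces
virbr11        8000.5254000b845c    yes        virbr11-nic
"""

members = bridge_members(sample)
print("p33p1" in members["virbr11"])
```

This only covers layer-2 bridge membership as reported by brctl. It does
not rule out other paths by which advertisements could reach p33p1.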

I think I will give it a rest until tomorrow!

Gene

