Hello. My understanding is that the ARP caching mechanism works the
same way whether you use static MAC addresses or dynamically generated
ones. The reason is that ARP bridges the gap between the layer 2
network, i.e. the MAC addresses, and the layer 3 network, i.e. the IP
addresses those MAC addresses map to. You can demonstrate this
interaction by shutting down the vif interface to your domu, deleting
the domu's entry from the ARP cache with arp -d <domu IP address>, and
then trying to ping your domu from dom0. After about 20 seconds, you
should see the "host is down" message. Then use arp -a to look for your
domu's IP address. What you'll see in the MAC field is the word
"incomplete". If you then run brconfig on the bridge containing the
domu, you'll see the MAC address you assigned, or which was assigned
dynamically, alive and well.
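
For what it's worth, here is a rough sketch of that experiment as a
pair of shell functions. The function names are mine, the interface,
address, and bridge arguments are placeholders you would have to fill
in, and the parser assumes the usual BSD arp output of the form
"? (10.0.0.5) at (incomplete) on bridge0". Treat it as an illustration
rather than a tested procedure.

```shell
#!/bin/sh
# Sketch only: function names and all arguments are illustrative.

# Read `arp -an` output on stdin and print the IP address of every
# entry whose hardware address is still unresolved ("incomplete").
# Assumes the BSD layout: "? (IP) at (incomplete) on IFACE".
incomplete_arp_entries() {
    awk '/incomplete/ { gsub(/[()]/, "", $2); print $2 }'
}

# Run the experiment described above. Needs root on the dom0.
#   $1 = vif interface of the domu (placeholder)
#   $2 = IP address of the domu    (placeholder)
#   $3 = bridge containing the vif (placeholder)
arp_experiment() {
    vif=$1 ip=$2 bridge=$3
    ifconfig "$vif" down               # take the domu's vif down
    arp -d "$ip"                       # drop the cached IP-to-MAC mapping
    ping -c 30 "$ip"                   # long enough to see the failure
    arp -an | incomplete_arp_entries   # the domu's IP should be listed
    brconfig "$bridge"                 # bridge state is reported separately
    ifconfig "$vif" up                 # put the interface back afterwards
}
```

You would call it as something like "arp_experiment <vif> <domu IP>
<bridge>" from a root shell on the dom0, or just pipe arp -an through
incomplete_arp_entries on its own.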

My guess is that you're running into some sort of short-term memory
crunch inside the dom0's network stack. The long-term ping test should
provide more details about where this memory crunch might be. The
long-time favorite variable for this issue is the good ole nmbclusters
value, tunable in the kernel config and visible through:
/sbin/sysctl kern.mbuf.nmbclusters
Although it's a blunt instrument, the output from:
netstat -m
might be helpful as well, specifically the value listed as the number
of calls to protocol drain routines.
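
If you want to line that counter up against the ping test, something
along these lines might do. It is only a sketch: the function names
are made up, and it assumes netstat -m prints a line of the form
"N calls to protocol drain routines", so check the output on your
system first.

```shell
#!/bin/sh
# Sketch only: function names are illustrative.

# Read `netstat -m` output on stdin and print the drain-routine counter.
# Assumes a line of the form "N calls to protocol drain routines".
drain_calls() {
    awk '/calls to protocol drain routines/ { print $1 }'
}

# Log the counter and the nmbclusters limit every $1 seconds
# (default 10) until interrupted, so that a jump in the counter can be
# matched against failures seen in the ping test.
watch_drain_calls() {
    interval=${1:-10}
    while :; do
        printf '%s drain_calls=%s nmbclusters=%s\n' \
            "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(netstat -m | drain_calls)" \
            "$(/sbin/sysctl -n kern.mbuf.nmbclusters)"
        sleep "$interval"
    done
}
```

Running "watch_drain_calls 5 >> /tmp/drain.log" in one terminal while
the ping test runs in another would give you timestamps to compare. A
counter that climbs while the pings fail would point toward mbuf
exhaustion; a counter that stays flat suggests looking elsewhere.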

Yet another possibility is a firewall, either on the dom0 or on the
domu in question. If you're running into some rule that restricts
access or bandwidth on the path between the dom0 and the domu, you
might see this kind of behavior. Unfortunately, in my experience, when
one runs into a firewall issue of this nature, the error messaging
around it is very misleading. It's important to remember that the IP
stacks on the dom0 and the domu don't know that the machine at the
other end of the connection is actually running on the same hardware.
Consequently, if there are firewall rules set up on the dom0, the domu,
or both, be sure they provide full access between the dom0 and the
domu, just as you would if you were writing rules for remote machines.
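
I don't know which firewall, if any, you're running, so the following
is only a hypothetical helper for a first pass over whatever rule
listing your firewall can print. The function name is mine, and it
only finds rules that name the address literally; rules that match by
subnet, by interface, or through a default block policy won't show up,
so an empty result proves nothing.

```shell
#!/bin/sh
# Sketch only: the function name is illustrative.

# Read a firewall rule listing on stdin and print, with line numbers,
# every rule that mentions the given address as a whole word
# (so 10.0.0.5 does not match 10.0.0.50).
#   $1 = IP address to look for
rules_mentioning() {
    grep -n -w -F -- "$1"
}
```

With pf that would be "pfctl -sr | rules_mentioning <domu IP>", and
with IPFilter "ipfstat -io | rules_mentioning <domu IP>". Run it on
both the dom0 and the domu, swapping in the other machine's address.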

The fact that you're only seeing this problem when communicating
between the dom0 and the domu, and not between the domu and the rest of
the world, suggests to me that the problem is on the dom0, so I would
start by looking there.
Hope these notes help.
-Brian