VR not able to ping public gateway for almost 3 hours then it works.

We are running CloudStack 4.4.1 (new install) with XenServer 6.2. The public
and management networks are not tagged and both use VLAN 1. For some reason,
when a VR is created it is not able to communicate with its public gateway for
almost 2-3 hours, and then all of a sudden it starts pinging. Once it has
started pinging, restarting the VR is not a problem: it works as soon as it
comes up. But the problem happens again when the router is destroyed and
created again. It is then unable to ping the gateway for some time; it takes
30-45 minutes to start working again, sometimes 2-3 hours.

We have 3 HP switches. The first one is connected to the internet gateway. The
other 2 are connected to it through trunk ports, and the XenServer hosts are
connected to untagged ports on those 2 switches via active-passive bonds. The
networks are: iSCSI (primary storage, VLANs 30, 31, 32, 33), NFS (secondary
storage, VLAN 34), guest (VLANs 500-550), public (VLAN 1), management (VLAN 1).

When we log on to the virtual router and run iptables -L, the response is very
slow (once the router starts working, the response is very fast). We tried to
traceroute the gateway IP; the output comes back very quickly, but only
* * * is displayed for all 30 hops. ifconfig -a displays all the right
information for the network interfaces. We tried removing and re-adding the
egress rule (ALL), but that didn't help; we still had to wait a few hours for
the router to start pinging again.

We tried using the same IP on a physical machine connected to the same switch
on an untagged port, and it works as soon as we configure the IP. We can also
ping the VR from outside and it responds OK, so we know that the network
configuration is OK. We suspect that the firewall rules are not being
downloaded in a timely manner, but we checked /var/log/cloud.log on the router
and there is really no difference between before and after it starts pinging,
so we really don't know how to troubleshoot this problem any further.

If requested, I can upload the cloud.log file from the VR. We compared this
log file with one from a working VR and found no difference between them.

The template file and CS 4.4.1 were downloaded around Oct 6.

I know it is hard to troubleshoot this kind of issue, but if you can point me
to possible causes, that would give us somewhere to start.

When we ran tcpdump on the router, we noticed that before it starts working
much more traffic is displayed (conversations covering almost every network
activity on the switch). Once it starts working, there is almost a 60%
reduction in TCP conversations across all interfaces on the router.
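
A rough way to put numbers on this before/after comparison would be to save
the tcpdump text output from each state to a file and count packets per
source/destination pair. The sketch below is only an illustration: it assumes
the default one-line-per-packet text output of `tcpdump -nn` for IPv4, the
function names are made up for this example, and lines it cannot parse (ARP,
for instance) are simply skipped.

```python
import re
from collections import Counter

# Matches the "src > dst:" part of a default `tcpdump -nn` IPv4 text line, e.g.
# "12:00:01.000001 IP 10.0.0.5.22 > 10.0.0.9.51000: Flags [P.], length 36"
LINE_RE = re.compile(r"^\S+ (?:IP )?(\S+) > (\S+?):")


def count_conversations(lines):
    """Count packets per (source, destination) pair in tcpdump text output."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines without a "src > dst:" part are ignored
            counts[(match.group(1), match.group(2))] += 1
    return counts


def reduction_percent(before, after):
    """Percentage drop in total packet count between two captures."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    if total_before == 0:
        return 0.0
    return 100.0 * (total_before - total_after) / total_before
```

With two captures of the same duration saved as text (the file names here are
placeholders), `count_conversations(open("before.txt"))` and
`count_conversations(open("after.txt"))` could be compared with
`reduction_percent`, and `most_common()` on the first result would show which
conversations disappear once the router starts working.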

Thanks,

Sam
