This message slipped through the cracks, but it gives me a chance to post an update on the problem.

I worked with TAC to troubleshoot the issue last week. The TAC engineer also noticed the giants on the 7600's side. He tried sending large ICMPs through to the 7600 from the 7201. Nothing over 1508 bytes would pass even though the interface MTU was 9000 on both sides (and the IP MTU followed). Even sending ICMPs WITHOUT the DF bit set still resulted in a failure. We dropped the MTU to 1500 and suddenly we could send large ICMPs that needed to be fragmented. Very weird. It gets weirder though.
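
For anyone following along, here is a rough sketch of the fragmentation arithmetic behind the MTU 1500 test. It is only an illustration, not something TAC or I ran. It assumes a 20-byte IPv4 header with no options and that the ping size is the total IP datagram size; the function name fragment_sizes is made up for this example.

```python
IP_HEADER = 20  # bytes; assumes an IPv4 header with no options


def fragment_sizes(datagram_size, mtu):
    """Return the total IP size of each fragment of one IPv4 datagram."""
    if datagram_size <= mtu:
        # Fits in a single packet, so no fragmentation is needed.
        return [datagram_size]
    payload = datagram_size - IP_HEADER
    # Every fragment except the last carries a payload that is a multiple of 8.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    while payload > per_fragment:
        sizes.append(per_fragment + IP_HEADER)
        payload -= per_fragment
    sizes.append(payload + IP_HEADER)
    return sizes


# A 9000-byte ping over a 1500-byte IP MTU.
print(fragment_sizes(9000, 1500))
# -> [1500, 1500, 1500, 1500, 1500, 1500, 120]
```

So with the MTU at 1500, a 9000-byte ping goes out as seven packets, none larger than 1500 bytes, which would explain why nothing is counted as a giant in that case.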

Prior to calling TAC I upgraded the code on another 7201 that's dual-homed to both 7613s in the core. As soon as I reloaded that 7201, LDP on it also started flapping to BOTH 7600s (the original 7201 was only single-homed to one 7600). BGP appears to be unaffected on this 7201. So now I have two 7201s with constantly flapping LDP neighbors. The 2nd 7201 also can't ping either 7600 with large ICMPs. However, and this is weird, BOTH 7600s can ping the loopback on the 7201 with 9000 byte ICMPs.

When I wrote that last sentence it got me thinking. I was pinging from the 7201s to Lo0 on the 7600s. Large ICMPs weren't getting there and giants were logged on the incoming L3 interface on the 7600s. I can ping from the 2nd 7201 to the directly-connected interface on either 7600 with large ICMPs and they are not dropped and no giants are logged. So it can send large frames to the directly-connected interface but not to the loopback. I don't believe that's normal. From the 7600 I can turn around and ping the loopback on the 2nd 7201 with jumbo frames without any problems. It's like MTU is only being honored in one direction.
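
To compare each direction and target, one option is to search for the largest size that gets through rather than guessing sizes by hand. Below is a minimal sketch of that search. The probe function is a stand-in I made up for this example (in practice it would wrap a real ping to the target), and the search assumes that if one size fails, every larger size fails too.

```python
def largest_passing_size(probe, low, high):
    """Binary-search the largest size in [low, high] for which probe(size) is True.

    Assumes that once a size fails, all larger sizes fail as well.
    Returns None if even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


def fake_probe(size):
    """Hypothetical stand-in: mimics the 7201 -> 7600 loopback behaviour seen so far."""
    return size <= 1508


print(largest_passing_size(fake_probe, 64, 9000))
# -> 1508
```

Running the same search against the loopback and the directly-connected interface, from each side, would give one number per path to compare.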

This is a confusing one to me that smells like a bug. I'm running SRB1 on both 7600s and was running different 12.4(15)Tn releases on the 7201s. They are both now running 12.2(24)T. I'll drop one of them back to an early 12.4(15)Tn tonight to troubleshoot if I have to. The problem occurred on the 1st 7201 without a code change and didn't occur on the 2nd until after the code change and reboot.

Any thoughts?
 Justin


David Freedman wrote:
You appear to have a high number of input queue drops and input errors,
granted the counters have never been cleared, do you have any PPS
graphs of the link between these two boxes? I would suspect a traffic
spike or link fault causing control messages to be dropped being the
cause here.

_______________________________________________
cisco-nsp mailing list  [email protected]
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
