On Wed, Nov 30, 2011 at 03:54:18AM +0400, Alexander Polakov wrote:
> From time to time I get this panic when detaching run device, while
> traffic is flowing through it.
>
> run0 detached
> uvm_fault(0xffffffff80xfcc00, 0xffff800001008000, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at tcp_mss+0x129: testb 0x8, 0xa4(%r14)
> ddb{0}> trace
> tcp_mss() at tcp_mss+0x129
> tcp_output() at tcp_output+0xbb6
> tcp_timer_rexmt() at tcp_timer_rexmt+0x2f3
> softclock() at softclock+0x291
> softintr_dispatch() at softintr_dispatch+0x5d
> Xsoftclock() at Xsoftclock+0x2d
>
> I tried debugging it myself and found that it is tcp_input.c:3080 (if I
> didn't screw up).
>
> 3073 if (!ifp)
> 3074 /*
> 3075 * ifp may be null and rmx_mtu may be zero in certain
> 3076 * v6 cases (e.g., if ND wasn't able to resolve the
> 3077 * destination host.
> 3078 */
> 3079 goto out;
> 3080 else if (ifp->if_flags & IFF_LOOPBACK)
> 3081 mss = ifp->if_mtu - iphlen - sizeof(struct tcphdr);
>
Can you try the following diff?
It is scary that in_pcbrtentry() never checked whether the cached route
is still valid. It always relied on the check in ip_output().
--
:wq Claudio
Index: in_pcb.c
===================================================================
RCS file: /cvs/src/sys/netinet/in_pcb.c,v
retrieving revision 1.124
diff -u -p -r1.124 in_pcb.c
--- in_pcb.c 6 Jul 2011 01:57:37 -0000 1.124
+++ in_pcb.c 30 Nov 2011 00:16:20 -0000
@@ -746,6 +746,12 @@ in_pcbrtentry(struct inpcb *inp)
ro = &inp->inp_route;
+ /* check if route is still valid */
+ if (ro->ro_rt && (ro->ro_rt->rt_flags & RTF_UP) == 0) {
+ RTFREE(ro->ro_rt);
+ ro->ro_rt = NULL;
+ }
+
/*
* No route yet, so try to acquire one.
*/
@@ -765,6 +771,7 @@ in_pcbrtentry(struct inpcb *inp)
ro->ro_dst.sa_len = sizeof(struct sockaddr_in6);
((struct sockaddr_in6 *) &ro->ro_dst)->sin6_addr =
inp->inp_faddr6;
+ ro->ro_tableid = inp->inp_rtableid;
rtalloc_mpath(ro, &inp->inp_laddr6.s6_addr32[0]);
break;
#endif /* INET6 */
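
For readers who want to see the first hunk in isolation, here is a minimal userland sketch of the same idea: drop a cached route whose RTF_UP flag has been cleared before anyone dereferences it. Everything in it is a stand-in chosen for illustration and not the kernel's definitions: the `*_sketch` structs and functions are hypothetical, the RTF_UP value is assumed, and the reference count is a plain int instead of what RTFREE() really manages.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in flag value for illustration; the kernel defines its own. */
#define RTF_UP 0x1

/* Hypothetical, stripped-down stand-ins for struct rtentry / struct route. */
struct rtentry_sketch {
	int rt_flags;	/* RTF_UP is cleared once the route is torn down */
	int rt_refcnt;	/* simplified reference count */
};

struct route_sketch {
	struct rtentry_sketch *ro_rt;	/* cached route, may be NULL */
};

/* Simplified stand-in for RTFREE(): only drops one reference. */
static void
rtfree_sketch(struct rtentry_sketch *rt)
{
	assert(rt->rt_refcnt > 0);
	rt->rt_refcnt--;
}

/*
 * Return the cached route only if it is still up.  A stale entry is
 * released and the cache slot is cleared, so the caller sees NULL and
 * can look up a fresh route instead of following a dead one.
 */
static struct rtentry_sketch *
cached_route_sketch(struct route_sketch *ro)
{
	if (ro->ro_rt != NULL && (ro->ro_rt->rt_flags & RTF_UP) == 0) {
		rtfree_sketch(ro->ro_rt);
		ro->ro_rt = NULL;
	}
	return ro->ro_rt;
}
```

A caller that gets NULL back would fall through to the "No route yet, so try to acquire one" path, which is what the diff arranges in in_pcbrtentry().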