Re: KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks

2001-04-19 Thread Kurt Roeckx

On Sat, Apr 14, 2001 at 04:42:54PM +0200, Kurt Roeckx wrote:
> While running 2.4.3, I saw the following message a few times:
> 
> KERNEL: assertion (tp->lost_out == 0) failed at
> tcp_input.c(1202):tcp_remove_reno_sacks

I've been running tcpdump for some time, and got the message
twice more today.

Apr 19 19:05:17 thunderbird kernel: KERNEL: assertion (tp->lost_out == 0)
failed at tcp_input.c(1202):tcp_remove_reno_sacks
Apr 19 19:07:18 thunderbird kernel: KERNEL: assertion (tp->lost_out == 0)
failed at tcp_input.c(1202):tcp_remove_reno_sacks

I'm going to start with the second one, because there was a lot less traffic at that
time.

19:07:17.571150 3ffe:80c0:220::b.6667 > 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060: . 
1921:3141(1220) ack 1811 win 5680 (len 1240, hlim 64)
19:07:17.571163 3ffe:80c0:220::b.6667 > 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060: P 
3141:3341(200) ack 1811 win 5680 (len 220, hlim 64)
19:07:17.572431 3ffe:401:0:1::16:2 > 3ffe:80c0:220::b: icmp6: too big 1280
 (len 1240, hlim 63)
19:07:17.645807 3ffe:8010:91::26.2237 > 3ffe:80c0:220::b.6667: S [tcp sum ok] 
2268475160:2268475160(0) win 32660 <mss 1420,sackOK,timestamp 54007992 0,nop,wscale 0> 
(len 40, hlim 61)
19:07:17.816319 3ffe:1001:211:80:baba:beba:deca:ceca.33258 > 3ffe:80c0:220::b.6667: . 
[tcp sum ok] 290:290(0) ack 14134 win 34160 (len 20, hlim 60)
19:07:18.186433 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060 > 3ffe:80c0:220::b.6667: . 
[tcp sum ok] 1811:1811(0) ack 3341 win 15620 (len 20, hlim 59)
19:07:18.186465 3ffe:80c0:220::b.6667 > 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060: . 
3341:4561(1220) ack 1811 win 5680 (len 1240, hlim 64)
19:07:18.886979 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060 > 3ffe:80c0:220::b.6667: . 
[tcp sum ok] 1811:1811(0) ack 4561 win 17040 (len 20, hlim 59)
19:07:18.887047 3ffe:80c0:220::b.6667 > 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060: P 
4561:4761(200) ack 1811 win 5680 (len 220, hlim 64)
19:07:19.236653 3ffe:8010:14::1:dead:beef.3207 > 3ffe:80c0:220::b.6667: S [tcp sum ok] 
2702352776:2702352776(0) win 31680 <mss 1440,sackOK,timestamp 113753265 0,nop,wscale 0> (len 40, hlim 60)

As you can see, during that second there was traffic on only one connection.
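As a reading aid for the trace above: the "too big 1280" ICMPv6 message reports a path MTU of 1280 bytes, and the data segments on this connection carry 1220 bytes with a tcpdump "len" of 1240. A rough sketch of that arithmetic follows, assuming a 40-byte IPv6 header, a 20-byte TCP header with no options on the data segments, and that "len" here is the IPv6 payload length. The function names are made up for illustration.

```python
IPV6_HEADER = 40  # fixed IPv6 header size in bytes
TCP_HEADER = 20   # TCP header size, assuming no TCP options


def max_tcp_payload(path_mtu):
    """Largest TCP data size that fits in one packet for the given path MTU."""
    return path_mtu - IPV6_HEADER - TCP_HEADER


def tcpdump_len(tcp_payload):
    """IPv6 payload length (TCP header plus data), as in the "len" field above."""
    return tcp_payload + TCP_HEADER


payload = max_tcp_payload(1280)
print(payload, tcpdump_len(payload))  # 1220 1240
```

This matches the 1220-byte segments with "len 1240" that are sent after the "too big 1280" message.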

Part of the tcpdump output from around the time of the first:

19:05:16.783871 3ffe:8010:7:43:1000:dead:dead:2.3292 > 3ffe:80c0:220::b.6667: P
[tcp sum ok] 134:152(18) ack 1104 win 31520 (len 38, hlim 60)
19:05:16.783923 3ffe:80c0:220::b.6667 > 3ffe:8010:7:43:1000:dead:dead:2.3292: .
[tcp sum ok] 3321:3321(0) ack 152 win 5680 (len 20, hlim 64)
19:05:16.849145 3ffe:400:680::::15.1117 > 3ffe:80c0:220::b.6667: . [tcp
sum ok] 124:124(0) ack 38670 win 32660 (len 20, hlim 61)
19:05:16.921394 3ffe:8060:100::26:2 > 3ffe:80c0:220::b: icmp6: too big 1280
 (len 1240, hlim 63)
19:05:16.972044 3ffe:8191::2.1044 > 3ffe:80c0:220::b.6667: . [tcp sum ok] 73:73(0) ack 
8784 win 17040 (len 20, hlim 60)
19:05:16.972143 3ffe:80c0:220::b.6667 > 3ffe:8191::2.1044: P 8784:8984(200) ack
73 win 5680 (len 220, hlim 64)
19:05:17.030129 3ffe:80c0:220::b.6667 > 3ffe:b00:4011:a::3.1880: P 76:1163(1087) ack 
213 win 5680 (len 1107, hlim 64)
19:05:17.062691 3ffe:80c0:220::b.6667 > 3ffe:b00:4011:a::3.1880: P 1163:2383(1220) ack 
213 win 5680 (len 1240, hlim 64)
19:05:17.097973 3ffe:80c0:220::b. > 3ffe:1200:3028:82ca:4:4:4:6.2160: P 
205:819(614) ack 256 win 5680 (len 634, hlim 64)
19:05:17.098080 3ffe:80c0:220::b. > 3ffe:8114:2000:1d0::4.2856: P 3811:4198(387) 
ack 85 win 5680 (len 407, hlim 64)
19:05:17.098135 3ffe:80c0:220::b.6667 > 3ffe:400:680::::15.1117: . 
38670:40090(1420) ack 124 win 5680 (len 1440, hlim 64)
19:05:17.098151 3ffe:80c0:220::b.6667 > 3ffe:400:680::::15.1117: P 
40090:40197(107) ack 124 win 5680 (len 127, hlim 64)
19:05:17.098860 3ffe:80c0:220::b.6667 > 3ffe:80e8:140:200::1.3899: P 158:1049(891) ack 
85 win 5680 (len 911, hlim 64)
19:05:17.106040 3ffe:80c0:220::b.6667 > 3ffe:80c0:220::19.1998: P 5851:6543(692) ack 
475 win 5680 (len 712, hlim 64)
19:05:17.108239 3ffe:80c0:220::b.4126 > 3ffe:1001:340::6.113: S [tcp sum ok] 
2552352896:2552352896(0) win 5680 <mss 1420,sackOK,timestamp 11315112 0,nop,wscale 0> (len 40, hlim 64)
19:05:17.258572 3ffe:401:0:1::16:2 > 3ffe:80c0:220::b: icmp6: too big 1280
 (len 1240, hlim 63)
19:05:17.258633 3ffe:80c0:220::b.6667 > 3ffe:400:680::::15.1117: . 
38670:39890(1220) ack 124 win 5680 (len 1240, hlim 64)
19:05:17.321612 3ffe:8010:7:43:1000:dead:dead:2.3292 > 3ffe:80c0:220::b.6667: P
152:244(92) ack 1104 win 31520 (len 112, hlim 60)
19:05:17.321636 3ffe:80c0:220::b.6667 > 3ffe:8010:7:43:1000:dead:dead:2.3292: .
[tcp sum ok] 3321:3321(0) ack 244 win 5680 (len 20, hlim 64)
19:05:17.364448 3ffe:80c0:220::b.6667 > 3ffe:400:680:11:::aa15.3452: P [tcp
sum ok] 770:789(19) ack 67 win 5680 (len 39, hlim 64)
19:05:17.370740 3ffe:400:680:11:::aa15.3452 > 3ffe:80c0:220::b.6667: P [tcp
sum ok] 51:67(16) ack 770 win 48800 (len 36, hlim 60)
19:05:17.370761 3ffe:80c0:220::b.6667 > 3ffe:400:680:11:::aa15.3452: . [tcp
sum ok] 789:789(0) ack 67 win 5680  (len 32, hlim
64)
19:05:17.390719 3ffe:8114:2000:1d0::4.2856 > 3ffe:80c0:220::b.: . [tcp sum ok] 

Re: KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks

2001-04-15 Thread Kurt Roeckx

On Sat, Apr 14, 2001 at 04:42:54PM +0200, Kurt Roeckx wrote:
> While running 2.4.3, I saw the following message a few times:
> 
> KERNEL: assertion (tp->lost_out == 0) failed at
> tcp_input.c(1202):tcp_remove_reno_sacks

Nobody seems to be interested in fixing this bug?

Anyway, I was looking at some statistics of the box, which I
think might be related to this problem.

netstat -s shows this under TCP:

Tcp:
11681 active connections openings
0 passive connection openings
84689 failed connection attempts
0 connection resets received
94 connections established
10963047 segments received
11476087 segments send out
392891 segments retransmited
772 bad segments received.
24083 resets sent

It seems it has to retransmit 3.4% of the TCP segments, which is
rather high.

The box has only been up for 10 days, which means it has to retransmit
about .45 segments / second, and the rate seems to be going up.
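
The two figures above can be reproduced from the netstat counters. This is a minimal sketch, assuming the uptime is exactly 10 days (it is only approximate) and taking the ratio against "segments send out"; the variable names are just for illustration.

```python
# Counters copied from the netstat -s output above.
segments_sent = 11476087
segments_retransmitted = 392891
uptime_days = 10  # approximate uptime, assumed exact for this estimate

retransmit_ratio = segments_retransmitted / segments_sent
retransmits_per_second = segments_retransmitted / (uptime_days * 24 * 60 * 60)

print(f"retransmitted: {retransmit_ratio:.1%} of segments sent")
print(f"retransmit rate: {retransmits_per_second:.2f} segments/second")
```

Note that this gives an average over the whole uptime, so it cannot show whether the rate is rising; that would need two readings taken some time apart.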

I hope this helps.

If there is anything else I can do, please ask.


Kurt

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


