Hi,

I have the following problem: sometimes a TCP connection (e.g. HTTP using wget,
FTP, telnet or ssh) freezes for minutes. Other connections between the same
machines still work well. I have observed this with the 2.0.35, 2.0.36,
2.0.37 and 2.0.38 kernels.

One thing that seems to trigger this behaviour is the loss of a TCP packet. I
never see it between machines on the same segment. I see it sometimes when the
packets must pass a rather full 10 Mbit segment, and rather often over a
768 kbit link. tcpdumps on both machines show that no packet is sent or
received for this connection while the connection is frozen.

I don't know if it is really TCP or maybe a system call (e.g. select), but
squid, for example, continues to work for its other connections and for new
ones while one of its connections is frozen.

Some details:

AAA: ssh-client
BBB: ssh-server

netstat -t on client:

Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0    360 wuerli.h3.stusta.m:1021 s52.ext.studentenwer:22 ESTABLISHED

By the way, Send-Q is always either 360 or 180.

tcpdump after the freeze on client:
---
00:21:25.116767 AAA.1021 > BBB.22: P 2889612707:2889612727(20) ack 2753952570 win 32120 (DF) [tos 0x10]
00:23:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:25:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:27:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:29:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:31:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:32:56.866767 BBB.22 > AAA.1021: . ack 0 win 32736 [tos 0x10]
00:32:56.866767 AAA.1021 > BBB.22: . ack 1 win 32120 (DF) [tos 0x10]
00:33:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
---
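
The spacing of the retransmissions in the client dump above can be checked
with a few lines of Python. This is only a sketch: the helper name push_gaps
and the regular expression are illustrative and handle just the tcpdump line
format shown here, and the dump lines are copied from the listing (with the
wrapped first line joined).

```python
import re
from datetime import datetime

# Client-side dump lines copied from the listing above (first line unwrapped).
DUMP = """\
00:21:25.116767 AAA.1021 > BBB.22: P 2889612707:2889612727(20) ack 2753952570 win 32120 (DF) [tos 0x10]
00:23:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:25:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:27:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:29:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:31:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:32:56.866767 BBB.22 > AAA.1021: . ack 0 win 32736 [tos 0x10]
00:32:56.866767 AAA.1021 > BBB.22: . ack 1 win 32120 (DF) [tos 0x10]
00:33:25.116767 AAA.1021 > BBB.22: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
"""

# timestamp, source, destination, TCP flags
LINE_RE = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) (\S+) > (\S+): (\S+)")

def push_gaps(dump, src):
    """Return the gaps in seconds between successive P segments sent by src."""
    times = []
    for line in dump.splitlines():
        m = LINE_RE.match(line)
        # Keep only data segments (flag P) sent by the given endpoint.
        if m and m.group(2) == src and "P" in m.group(4):
            times.append(datetime.strptime(m.group(1), "%H:%M:%S.%f"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

GAPS = push_gaps(DUMP, "AAA.1021")
print(GAPS)
```

Every gap comes out as 120 seconds, so AAA resends the same 20 bytes at a
fixed two-minute interval. That would be consistent with a retransmission
timer that has backed off to an upper limit, but the limit itself is an
assumption here and is not taken from the kernel source.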


netstat -t on server:

Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0     20 eumel.studentenwerk:ssh wuerli.h3.stusta.m:1022 ESTABLISHED

Again, Send-Q always seems to be 20.


tcpdump after the freeze on server:
---
00:23:23.288556 AAA.1021 > BBB.ssh: P 2889612707:2889612727(20) ack 2753952570 win 32120 (DF) [tos 0x10]
00:25:23.278556 AAA.1021 > BBB.ssh: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:27:23.268556 AAA.1021 > BBB.ssh: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:29:23.258556 AAA.1021 > BBB.ssh: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:31:23.248556 AAA.1021 > BBB.ssh: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
00:32:54.998556 BBB.ssh > AAA.1021: . ack 0 win 32736 [tos 0x10]
00:32:54.998556 AAA.1021 > BBB.ssh: . ack 1 win 32120 (DF) [tos 0x10]
00:33:23.238556 AAA.1021 > BBB.ssh: P 0:20(20) ack 1 win 32120 (DF) [tos 0x10]
---
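
To see whether the P segments in the two dumps are the same packets, their
timestamps can be paired up. A small sketch in Python, assuming each P line
in the server dump corresponds to the client P line of the same minute (the
pairing and the name clock_offsets_us are illustrative, not tcpdump output):

```python
from datetime import datetime, timedelta

# (sent by AAA, seen on BBB) timestamps of the P segments,
# copied from the client and server dumps above.
PAIRS = [
    ("00:23:25.116767", "00:23:23.288556"),
    ("00:25:25.116767", "00:25:23.278556"),
    ("00:27:25.116767", "00:27:23.268556"),
    ("00:29:25.116767", "00:29:23.258556"),
    ("00:31:25.116767", "00:31:23.248556"),
    ("00:33:25.116767", "00:33:23.238556"),
]

def clock_offsets_us(pairs):
    """Server timestamp minus client timestamp for each pair, in microseconds."""
    fmt = "%H:%M:%S.%f"
    one_us = timedelta(microseconds=1)
    return [
        (datetime.strptime(srv, fmt) - datetime.strptime(cli, fmt)) // one_us
        for cli, srv in pairs
    ]

OFFSETS = clock_offsets_us(PAIRS)
print(OFFSETS)
```

The offset is negative and changes by only 10 ms from one pair to the next,
which looks like two unsynchronized, slowly drifting clocks rather than
network delay. Read that way, each retransmission from AAA does show up on
BBB, but this interpretation rests on the pairing assumed above.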


vi on the client is sleeping in sys_select and can be terminated by sending
it a KILL signal.

The wget-case is similar.

I have reports from other people with the same problem.

I don't know if this is a (known) bug in 2.0.35+ kernels, but I would appreciate
it if someone could tell me where to start looking for the cause.

Wolfgang Walter

-
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to [EMAIL PROTECTED]
