I have the same problem, and I'm still searching for an answer. :(
The "Client" is SBClient (a fancy GUI program for Telnet), and the
Server is Unidata DB server running on NT 4.0 with SP6a and the Unidata
server has its own Telnet Service that appears to be running on the
standard port.
Client connections are made across a Site-to-Site VPN (PIX to PIX/LAN to
LAN). Only 1 or 2 users have their sessions terminated without a request
from the Server Admin or the Client. All information gathered seems to
indicate that the connection is dropped after a period of idle time.
My problem is, I've placed a sniffer (Ethereal) on both ends of the
connection (but not simultaneously) in the hope of finding the root of
the disconnect, but have not been able to do so thus far. The weird thing
is, whenever one of the users who experiences these random disconnects
logs into the system, I see a Multicast Join advertisement appear. If
they log out of the system (telnet app) normally, I see a Multicast Leave
advertisement....
If anyone has a similar setup and can shed some light on where I can go
look to figure this out, I'd appreciate it.
For clarity, this ASCII art depicts the layout:

 Telnet Server (Unidata DB Server)
       |
     --PIX --
       |       |
   - RouterA   |
       |       | V
  --Internet-- | P
       |       | N
   - RouterB   |
       |       |
     -- PIX --
       |
   - SBClient (Win98 workstations)

Also, when the SBClient "gets suddenly disconnected", the Server still
thinks the user is connected. When the user re-connects to the server
with the SBClient telnet app, the server starts a NEW session for the
same user id, and therefore eats up another license connection. The now
"orphaned" old session has to be manually killed by the Admin on the
Server. These disconnects only occur with users across the VPN; local
users are not affected.
Short of coming up with a selective debug (if it's even possible) and
logging debug output for the specific users' telnet sessions over a
period of time, I'm at a loss as to how I can figure out and solve this
problem.
Note: ONLY 1 or 2 users at each of four remote sites experience this
issue... and it's always the same users. This whole setup worked
without a problem when the remote users were connected via Frame Relay
P-t-P connections... but it has exhibited this issue since the topology
changed to VPNs and the FR connections were dropped.
Also, I've opened a TAC case, but TAC pointed me back to the server, and
said "confirm whether or not the Server's Telnet service operates with
unicast, or broadcast/multicast and get back with us. Also run a
sniffer to capture session traffic and check that for errors and then
get back with us. The PIX does not pass broadcast or multicast traffic
by design of its ASA process. If the server is using anything other
than unicast for communications, reconfigure application server for
unicast."
So far my determination is that the Unidata Application Server's
implementation of Telnet is TCP unicast.
Am I right in understanding that telnet uses TCP-based unicast communications?
TIA for any advice or ideas on how to solve this problem.
-Mark
-----Original Message-----
From: sam sneed [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 25, 2002 11:17 AM
To: [EMAIL PROTECTED]
Subject: Re: TCP sequence numbers question [7:49535]
How does the other host know it's a keepalive? I do not see any keepalive
fields in the TCP packet; perhaps a TCP option?
I think I was more confused by how the sequence #'s are incremented and
ack'd. I read in Stevens' book:
"Since every byte that is exchanged is numbered, the acknowledgement
number contains the next sequence number that the sender of the
acknowledgement expects to receive. This is therefore the sequence number
plus 1 of the last successfully received byte of data."
So using the example below (host A 192.168.133.21, B 10.10.10.12), A
sends 1 byte of data, and the last successfully sent byte is 2653258021,
so shouldn't Host B ack (2653258021)+1?
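A small worked example may help with the arithmetic here. It assumes tcpdump's documented first:last(nbytes) notation, in which "last" is the sequence number following the final payload byte, not the number of that byte. Under that reading the single byte in 2653258020:2653258021(1) carries sequence number 2653258020, and an ACK of 2653258021 is exactly "last received byte plus 1". The function name below is made up for illustration.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def expected_ack(first_seq: int, payload_len: int) -> int:
    """ACK number a receiver returns after accepting payload_len bytes
    whose first byte carries sequence number first_seq."""
    return (first_seq + payload_len) % SEQ_SPACE


# Values taken from the capture: 2653258020:2653258021(1)
first, nbytes = 2653258020, 1
last_byte_seq = first + nbytes - 1  # sequence number of the only byte sent

print("last byte sent has seq", last_byte_seq)
print("receiver should ack   ", expected_ack(first, nbytes))
```

If that reading is right, Host B's "ack 2653258021" already matches the rule quoted from Stevens.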
The problem I'm trying to solve is a TCP connection that unexpectedly
terminates. Supposedly the client can detect this and reconnect to the
server, but there are problems. I started the keepalive thread last week
related to the same issue. I thought our firewall might have dropped the
connection from its state table after its timeout, but this is not the
case since it seems keepalives are sent every 30 seconds.
17:56:46.563514 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
17:56:46.604328 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
17:58:20.327090 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
17:58:20.368296 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
17:59:54.090651 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
17:59:54.132170 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
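As a quick sanity check on the probe spacing, the timestamps of the outbound 1-byte segments in the capture above can be differenced. This is only a sketch: the list is copied by hand from the three outbound lines shown, and the helper name is made up for illustration.

```python
from datetime import datetime

# Timestamps of the outbound ("O") 1-byte segments, copied from the capture.
probe_times = ["17:56:46.563514", "17:58:20.327090", "17:59:54.090651"]


def probe_intervals(stamps):
    """Return the gaps, in seconds, between consecutive timestamps."""
    parsed = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(parsed, parsed[1:])]


intervals = probe_intervals(probe_times)
for gap in intervals:
    print(f"{gap:.3f} s between probes")
```

For these three segments the gap works out to a little under 94 seconds each time, so the 30-second figure may describe a timer that does not show up in this excerpt.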
""Priscilla Oppenheimer"" wrote in message
news:[EMAIL PROTECTED]...
> sam sneed wrote:
> >
> > I have been troubleshooting a problem and have seen something I don't
> > understand. If host A sends data to host B and host B acks the data,
> > isn't host A supposed to increment its seq #? Here is an actual
> > tcpdump. Host A is 192.168.133.21 and B is 10.10.10.12.
> > You'll notice host A is pushing 1 byte of data and Host B is acking
> > it, yet host A's seq never increments. Is this normal?
>
> It sounds like Host A has gone into a keepalive mode. It doesn't have
> any actual data to send, so it just sits there sending one byte at a
> time.
>
> We had a long discussion about TCP keepalives last week sometime. You
> might want to check the archives. The TCP RFC (793) doesn't actually
> mention keepalives. With ordinary TCP, when there's no data to send,
> both sides are silent. But a lot of implementations send keepalives,
> and the host requirements RFC does say that's OK. (RFC 1122)
>
> Theoretically a host should just be able to send an empty TCP segment
> with no data to implement the keepalive function. In that case, there's
> no reason to increment the sequence number as sequence numbers count
> payload bytes. However, some older implementations based on 4.2 BSD
> UNIX do not respond if the keepalive contains no data, causing the
> sender to think its partner has died.
>
> Some systems instead send one garbage byte of data to elicit an ACK.
> They purposely keep the sequence number the same so that the garbage
> byte can't cause any harm. It's not the expected sequence number. It's
> a sequence number that the receiver already received and ACKed, so the
> byte is thrown away before being given to an application (although it
> is ACKed by TCP.)
>
> Some implementations send a keepalive with no data and if no response
> is received, switch over to the 4.2 BSD style and send a garbage byte.
>
> Anyway, I doubt this is related to the problem you are troubleshooting
> since it's normal behavior. What is the problem? Can you tell us more
> about it?
> Thanks.
> ________________________
>
> Priscilla Oppenheimer
> http://www.priscilla.com
>
>
> >
> > 17:56:46.563514 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
> > 17:56:46.604328 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
> > 17:58:20.327090 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
> > 17:58:20.368296 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
> > 17:59:54.090651 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
> > 17:59:54.132170 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
> > 18:01:27.854289 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
> > 18:01:27.895254 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
> > 18:03:01.618100 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
> > 18:03:01.658892 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
> > 18:04:35.381698 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
> > 18:04:35.422538 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
> > 18:06:09.145358 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
> > 18:06:09.186227 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
> > 18:07:42.909048 O 192.168.133.21.5055 > 10.10.10.12.1617: P 2653258020:2653258021(1) ack 808512610 win 8760 (DF)
> > 18:07:42.949850 I 10.10.10.12.1617 > 192.168.133.21.5055: . ack 2653258021 win 17520 (DF)
> >
> > thanks
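For anyone who controls the application code on one end, the keepalive behaviour described in the quoted reply is normally switched on per socket. Below is a minimal sketch using Python's standard socket module. The helper name and the 30/10/5 values are made up for illustration, and the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT tunables are platform-dependent, so the sketch only sets them where the running platform exposes them. Whether SBClient or the Unidata telnet service offers an equivalent setting is not something this sketch can answer.

```python
import socket


def enable_keepalive(sock, idle_s=30, interval_s=10, probes=5):
    """Ask the OS to send TCP keepalive probes on an otherwise idle socket.

    idle_s, interval_s and probes are illustrative values only; the idle
    time would need to be shorter than any firewall or VPN idle timeout
    sitting in the path.
    """
    # Portable part: turn keepalive probing on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Platform-dependent tunables: set each one only if it is available.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


# Demonstrate on an unconnected socket; no network traffic is generated.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print("keepalive enabled:", keepalive_on)
sock.close()
```

On hosts where the application cannot be changed, the same effect usually has to come from an OS-wide keepalive setting instead.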
Message Posted at:
http://www.groupstudy.com/form/read.php?f=7&i=50166&t=49535
--------------------------------------------------
FAQ, list archives, and subscription info: http://www.groupstudy.com/list/cisco.html
Report misconduct and Nondisclosure violations to [EMAIL PROTECTED]