Re: Odd TCP glitches in new currents

1999-12-25 Thread B. Scott Michel

We aren't doing mcast at this time. If there's anyone from Nortel
lurking behind this list, UCLA CS is pretty close to throwing out
the Accelars due to a lack of tech support response.

No, UCLA CS is not capable of doing department-wide mcast because
of a set of peculiar bugs in the Accelar's code. It will only do
DVMRP snooping on a limited number of mcast groups (~400 or so).
What we actually see is 3x that number. And so we're waiting for
some upgraded code that Nortel/Bay has claimed is coming for the
better part of a year now.


-scooter


On Fri, 24 Dec 1999, Glendon Gross wrote:

 
 Are you sure that this is a problem with the local interface dropping
 packets, or could it just be a multicast router
 that is suppressing packets?  I have noticed with my new FreeBSD box 
 running mrouted, exceptionally good routing performance.  But my linux
 boxes are more consistent in their response.  So I concluded that 
 my upstream neighbors are suppressing the broadcasts as a feature of the
 multicast routing protocol.  I don't think it's a problem with my local
 interface, just a feature of the DVMRP protocol.  
 
 Can anyone recommend a good reference on this?  I've been reading RFC-1075
 and don't really understand it.
 
 --Glen Gross
 
 On Tue, 30 Nov 1999, B. Scott Michel wrote:
 
  On Wed, 22 Dec 1999, Jonathan Lemon wrote:
  
   On Dec 12, 1999 at 11:37:42AM -0800, Matthew Dillon wrote:
   I had a Netgear FS509 switch here that would eat packets transmitted
   through the GigE port under certain conditions.  Netgear shipped me 
   a new one, and I've been happy with it, until the same problem started
   happening again this morning.
  
  There's some oddities in the 3.3 and 3.4 kernels as well -- I've actually
  nailed down the plexicity and speed on both the Accelar and my humble PC,
  and yet, I'm looking at weird TCP lockups from time to time.
  
  Mostly seems to be related to NFSv3, but will also happen when doing
  cvsup. There's no magic number of how many bytes are queued waiting to go
  out the interface. And it seems to be limited to specific connections,
  i.e. an NFS TCP connection can be jammed and yet I can be happily talking
  to cvsup3 doing an update.
  
  The interface in question is a NetGear:
  
  pn0: 82c169 PNIC 10/100BaseTX rev 0x20 int a irq 11 on pci0.9.0
  
  What is odd is that the output error metric from netstat -in monotonically
  increases.
  
  Yes, I could post my configuration, etc., and I could go back to running
  -current, but I have a PhD to make progress on. And I'm willing to wait to
  try out the consolidated 2x040/PNIC driver when 4.0 finally rolls out.
  
  
  -scooter
  
  
  
  To Unsubscribe: send mail to [EMAIL PROTECTED]
  with "unsubscribe freebsd-current" in the body of the message
  
  
 

Scott Michel            | No research ideal ever survives
UCLA Computer Science   | contact with implementation.
PhD Graduate Student    |






Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-25 Thread Matthew Dillon

:...
: I'm pretty sure that the box was getting receive interrupts because
: every time I sent a packet to it from the outside systat -vm showed
: a PCI interrupt for the network device.  However 'netstat -in 1' did
: not show the statistics for the received packets until 64 had 
: accumulated.  It could be that the statistics are not being accumulated
: on a per-reception basis and that the receive packets are actually
: getting through, and that it's the transmit side which is broken.  I don't
: know the code well enough yet to make the determination.
:
:If things are done in these drivers as they are in the if_de driver then
:what you are seeing is the fact that if_opackets and are only
:updated when the tx ring is reclaimed by an interrupt, not

Next time this bug rears its ugly head I'll get a tcpdump going to try
to figure out what is actually going on.  Ooh, and I just had a 
thought -- a profiled kernel might help track down the problem as well
by enabling it to see which routines get hit (and which don't).

I don't see anything specific in the code so far, other than there being
a lot of memory mapped (apparently shared with the device) objects that 
haven't been volatilized.  So far I can't tie that into anything though. 

-Matt






Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-24 Thread Harold Gutch

On Wed, Dec 22, 1999 at 10:18:56PM -0800, Matthew Dillon wrote:
 I'm adding Bill Paul to the list specifically.
 
 Hmm.  Now this is odd!  I think I may have found something!
 
 All of my 'rl' driver cards fail this test:
 
   apollo# linktest -m 0.1:0.2 -s 16 -f16 lander
   lander# linktest -m 0.1:0.2 -s 16 -f16 apollo
 
   They get about 1% packet loss with the test.  Always.  
   100BaseTX full or half duplex, or 10BaseT -- I still get
   failures.
 

I can't repeat this with a RealTek 8029 (that's an 'ed'-NIC) and
a RealTek 8139 (that's the 'rl'-one) running 10BaseT.
Note that I am _NOT_ running -CURRENT on any of these machines,
they both run 2.2-STABLE (rev. 1.17 of rl.c).

The packet loss when using small packets is exactly 0 - that is, no
packet loss occurred during the minute or so which I was running
linktest.
I just started it again and will leave it running for a couple of
hours, but I doubt that this will make a change.

Whoops, I in fact experienced packet loss now:

overdose(194.94.249.94)->foobar.franken.de  lost 1/1606
overdose(194.94.249.94)->foobar.franken.de  lost 2/2702
foobar.franken.de->overdose(194.94.249.94)  lost 1/3412
overdose(194.94.249.94)->foobar.franken.de  lost 3/3829


Note that I was playing PCM files via NFS at this time, so there
was additional network traffic of ~180 KByte/s.

These occurred although there was no additional network
traffic:

overdose(194.94.249.94)->foobar.franken.de  lost 4/5491
overdose(194.94.249.94)->foobar.franken.de  lost 5/5692
overdose(194.94.249.94)->foobar.franken.de  lost 6/7277
overdose(194.94.249.94)->foobar.franken.de  lost 7/8661
overdose(194.94.249.94)->foobar.franken.de  lost 8/9412
overdose(194.94.249.94)->foobar.franken.de  lost 9/11393
overdose(194.94.249.94)->foobar.franken.de  lost 10/13699
foobar.franken.de->overdose(194.94.249.94)  lost 2/13728
overdose(194.94.249.94)->foobar.franken.de  lost 11/16426


It seems as if this was roughly the same amount of packetloss as
you experienced.


   rl0: RealTek 8139 10/100BaseTX irq 11 at device 3.0 on pci0
   rl0: Ethernet address: 00:50:ba:d1:89:05
   miibus0: MII bus on rl0
 
 All of my 'fxp' driver cards succeed with the above test perfectly.
 If I test an fxp machine versus an 'rl' machine, linktest shows that
 the 'rl' cards can transmit small packets just fine but they lose
 out trying to receive them!

Nope, it's the other way round for me.  overdose has the
'rl'-NIC, foobar has the 'ed'-NIC.

I hope to be able to do a few additional tests soon.


 Methinks there is something going on with the 'rl' driver and/or
 the RealTek cards!

My experience with those cards isn't the best, so I'd place my
bets on the cards.

bye,
  Harold

-- 
Shabby Sleep is an abstinence syndrome wich occurs due to lack of caffein.
Wed Mar  4 04:53:33 CET 1998   #unix, ircnet





Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-24 Thread Rodney W. Grimes

...
 I'm pretty sure that the box was getting receive interrupts because
 every time I sent a packet to it from the outside systat -vm showed
 a PCI interrupt for the network device.  However 'netstat -in 1' did
 not show the statistics for the received packets until 64 had 
 accumulated.  It could be that the statistics are not being accumulated
 on a per-reception basis and that the receive packets are actually
 getting through, and that it's the transmit side which is broken.  I don't
 know the code well enough yet to make the determination.

If things are done in these drivers as they are in the if_de driver then
what you are seeing is the fact that if_opackets and are only
updated when the tx ring is reclaimed by an interrupt, not
when we actually queue the packet to the card.  This has been a source
of confusion for a long time, and IMNSHO we should move the if_ipackets+=
in the code.  Here is an idle box with a dc21143 in it, showing probably
what you are seeing (the only network traffic to this box is the output
of this running netstat -I de0 1 command):
            input          (de0)           output
   packets  errs      bytes    packets  errs      bytes colls
         1     0         60          0     0        138     0
         2     0        182          0     0        250     0
         2     0        158          0     0        138     0
... 100 + lines of output deleted...
         3     0        256          0     0        138     0
         1     0         60        122     0        138     0
         3     0        256          0     0        138     0
         1     0         60          0     0        138     0

Search for lines like this:
sc->tulip_if.if_opackets += xmits;

in the driver to see when we update the counter, then look at how
interrupt per packet drivers do it and propose a nice clean solution :-)
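
A minimal userland sketch of the effect described above, not code from
if_de: frames are counted only when the ring is reclaimed, so the
output-packet counter stays flat and then jumps. The struct, the function
names and the ring size are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TX_RING_SIZE 128   /* hypothetical ring size for this sketch */

struct toy_txring {
    unsigned queued;     /* frames handed to the card, not yet reclaimed */
    unsigned opackets;   /* what netstat would report as output packets */
};

/* Queue one frame; note that the counter is NOT touched here. */
static int toy_tx_queue(struct toy_txring *r)
{
    assert(r != NULL);
    if (r->queued == TX_RING_SIZE)
        return -1;       /* ring full */
    r->queued++;
    return 0;
}

/* Reclaim on a TX interrupt: only now do the statistics catch up. */
static unsigned toy_tx_reclaim(struct toy_txring *r)
{
    unsigned xmits;

    assert(r != NULL);
    xmits = r->queued;
    r->queued = 0;
    r->opackets += xmits;   /* mirrors the "if_opackets += xmits" pattern */
    return xmits;
}
```

Queueing 122 frames and then reclaiming once leaves the counter at 0 until
the reclaim, then moves it to 122 in one step, which is the shape of the
netstat output shown above.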



-- 
Rod Grimes - KD7CAX @ CN85sl - (RWG25)   [EMAIL PROTECTED]





Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Matthew Dillon

Ok, here's the current status:  The RealTek boards ('rl' driver, D-Link
brand, RealTek chip vendor) appear to have serious packet loss problems 
with small packets.  The cause is currently unknown.  I had two different 
machines (an older PPro 200 and a somewhat newer K6-2/233) with the 
boards in and both exhibited the problem.

The problem is fairly trivial to reproduce using linktest:

http://www.backplane.com/FreeSrc/linktest-1.1.c

host1# linktest -s 16 -f8 host2
host2# linktest -s 16 -f8 host1

These boards were the cause of my TCP problems.

The D-Link boards came with the D-Link switch I had purchased.  I removed
the boards and replaced them with the two LinkSys boards that came with
the LinkSys switch I had purchased.

The LinkSys boards ('dc' driver, LNE100TX+ fame, LC82C115 PNIC II vendor)
do not appear to have the packet loss problem.  I have not had a 
recurrence of my TCP glitches and my linktest tests have all come out
roses.

I'm hoping Bill will be able to find the problem with the D-Link boards,
just so everyone else using them doesn't hit the same hangup, but my
problem at least appears to be solved after replacing the boards.  I've
stuck my D-Link board into another diskless test machine and it's 
available for testing potential fixes, debugging, etc.

In regards to the switches themselves:  Both the LinkSys and the D-Link
5-Port switches appear to work well.  I've interchanged them with each
other and tested them pretty significantly with four machines attached.
The LinkSys seems to be limited to around 25 MBytes/sec in aggregate
throughput.  The D-Link maxed out my machines (35 MBytes/sec) so I do
not know what its ultimate limitation is.  The small-packet test maxed
out my machines at 35,000 packets per second.  So while I couldn't find
the limitations of the switches, they're plenty good enough for me!

The only problem I've come up against is that when I change the duplex
with ifconfig the ethernet port is not reset and the switches do not
recognize that the duplex has changed.  If I 'ifconfig XXX media auto',
however, the ports are reset and the switches negotiate full-duplex
properly.  If I ifconfig between 10 and 100BaseT the ports are reset and
the switches appear to figure out the mode properly as well.

So that's where I am.  There was never anything wrong with the switches
or the cabling - the entire problem was due to the D-Link ethernet cards.

-Matt






Re: Odd TCP glitches in new currents

1999-12-23 Thread Poul-Henning Kamp

In message [EMAIL PROTECTED], Matthew Dillon writes:
:
:make sure you test odd packet lengths. (as in "not even")
:there are occasional bugs that turn up with that sort of thing.

Yup.  Way ahead of you.  

Hmm. usleep() seems to have a coarse granularity - only about 150 Hz.
How annoying!

Increase your HZ.  I'm using 1000 as default these days.
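
A rough sketch of why raising HZ helps, under a deliberately simplified
model: a timed sleep can only end on a clock tick, so the request is rounded
up to a whole number of ticks. The function name is hypothetical, and a real
kernel's conversion may differ (for instance by adding an extra tick).

```c
#include <assert.h>

/* Round a requested sleep up to whole clock ticks and return the
 * resulting duration in microseconds.  Simplified model only. */
static long sleep_rounded_usec(long usec, long hz)
{
    long tick_usec, ticks;

    assert(usec >= 0 && hz > 0 && hz <= 1000000);
    tick_usec = 1000000 / hz;                     /* length of one tick */
    ticks = (usec + tick_usec - 1) / tick_usec;   /* round up */
    return ticks * tick_usec;
}
```

In this model a 1000 microsecond request takes a full 10000 microsecond tick
at HZ=100, but only 1000 microseconds at HZ=1000.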

--
Poul-Henning Kamp FreeBSD coreteam member
[EMAIL PROTECTED]   "Real hackers run -current on their laptop."
FreeBSD -- It will take a long time before progress goes too far!





Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Mikko T


Just a quick note, not entirely on-topic:

Bill Paul wrote:

[...]

Yes, I know there's a minimum frame length of 60 bytes. And the rl_encap()
routine has the following code:

/* Pad frames to at least 60 bytes. */
if (m_head->m_pkthdr.len < RL_MIN_FRAMELEN) {
        m_head->m_pkthdr.len +=
            (RL_MIN_FRAMELEN - m_head->m_pkthdr.len);
        m_head->m_len = m_head->m_pkthdr.len;
}

[...]

60 bytes, I just bump up m_pkthdr.len and m_len. This adjusted
length gets used later in rl_start() when transmission is triggered.

I haven't read through the code yet, so I don't know where the extra
memory in that buffer originated from, or rather if it has been zeroed
before reaching this point.  Otherwise you are leaking data from the
kernel out to the network.

Other OSes have done this before.  It can be used for "data fishing"
by just pinging the machine.  Eventually it turns up all sorts of
interesting information ([partial] passwords, for example).

How many other NICs are unable to auto-pad, and how many of the
drivers just add "random" data that happened to be lying around
inside the kernel...?
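
One way to avoid the leak described above is to zero the pad bytes when the
length is extended. The following is a userland sketch on a plain byte
buffer, not a patch to rl_encap(): the function name and the capacity
argument are hypothetical, and a real driver would have to deal with mbufs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_FRAMELEN 60   /* same minimum as the quoted comment */

/* Pad a frame in place to MIN_FRAMELEN, zeroing the added bytes.
 * Returns the new length, or 0 if the buffer cannot hold the padding. */
static size_t pad_frame_zeroed(unsigned char *buf, size_t len, size_t cap)
{
    assert(buf != NULL && len <= cap);
    if (len >= MIN_FRAMELEN)
        return len;                 /* already long enough */
    if (cap < MIN_FRAMELEN)
        return 0;                   /* no room to pad safely */
    memset(buf + len, 0, MIN_FRAMELEN - len);   /* stale bytes never reach the wire */
    return MIN_FRAMELEN;
}
```

The difference from simply bumping the length is the memset(): whatever was
previously in the tail of the buffer is overwritten before transmission.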

   Just curious,
   /Mikko

   (Off to make sure that if_ed in my home firewall isn't doing
anything like this...)






Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Matthew Dillon

:Okay, I patched if_rl.c in -current to fix the problem demonstrated by 
:Matt's linktest program. The bug was actually on the receive side of the 
:rl driver, not the transmit side. A packet can wrap from the end of the 
:RX buffer back to the beginning, and in some cases these packets would 
:get lost due to botched use of m_pullup(). I can run the linktest 
:program now without losing any frames.
:
:There's another way around this which is to allocate a whole mbuf
:cluster when you know the packet is wrapped and bcopy the data manually
:instead of using m_devget(), but I'm not sure I want to waste a whole
:cluster just for that case.
:
:-Bill
:
:-- 
:=
:-Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu
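
The manual-copy alternative mentioned in the quoted message can be sketched
in userland as a two-part copy out of a circular buffer. This is not the
actual if_rl.c change; the function name and parameters are hypothetical,
and a plain destination buffer stands in for an mbuf cluster.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy len bytes starting at offset off out of a circular buffer,
 * handling a packet that wraps past the end back to the start. */
static void ring_copy_out(const unsigned char *ring, size_t ring_size,
                          size_t off, unsigned char *dst, size_t len)
{
    size_t first;

    assert(ring != NULL && dst != NULL);
    assert(ring_size > 0 && off < ring_size && len <= ring_size);
    first = ring_size - off;          /* bytes available before the wrap */
    if (first > len)
        first = len;
    memcpy(dst, ring + off, first);
    memcpy(dst + first, ring, len - first);   /* wrapped remainder, may be 0 bytes */
}
```

With an 8-byte ring holding 0..7, copying 4 bytes from offset 6 yields
6, 7, 0, 1: the tail of the ring followed by its head.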

Great!  Thanks for your help, Bill!

-Matt






Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

 Heh heh.  Sorry about this, I believe I have further information on
 another older problem.  Bill, remember those ethernet lockups I was 
 having with the 'xl' driver all those months ago that we could never
 track down?

And remember how I kept telling you that I could never duplicate the
problem here?

 Well, they happen with the 'dc' driver too.  But this time I'm not getting
 a complete lockup.  The network actually continues to work well enough,
 well, just barely well enough, that I can still use it.  slowly.
 
 It appears that the 'dc' driver continues to take receive interrupts
 (see the systat -vm snapshot at the end), but winds up not processing 
 any of the packets.  Except when 64 packets accumulate then suddenly all
 64 get processed all at once!  Then nothing again until the next 64
 accumulate.

Uh. That's... strange. First of all, you haven't said if this is the
same machine that experienced the problems with the xl driver. Second,
the number 64 sticks out in this case. If you look at if_dc.c (uh...
you did actually look at the code, right?), you'll see that dc_encap()
will only ask for a "TX done" interrupt every 64 packets. Why? Well,
reclaiming transmit buffers is a fairly unimportant task and I wanted to 
cut down on the number of interrupts that were generated, and when the
tulip reaches the last descriptor in a transmit chain, it's supposed
to generate a "no more buffers in TX ring" interrupt, which will also
trigger a TX buffer reclamation (i.e. dc_txeof() will be called for
either interrupt).

This behavior is controlled by the DC_TX_USE_TX_INTR flag, which
is set for the PNIC II chip. I also use the DC_TX_POLL flag, which
means that the chip is programmed to poll the TX ring and start
transmission itself rather than having the driver write to the
TX DMA start register. This means no register accesses on transmit,
which is always nice. You can ask for a "TX done" interrupt to be
scheduled for each transmitted packet by using the DC_TX_INTR_ALWAYS
flag, which is currently only used for the PNIC I (82c168/82c169)
because it blows goats.
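
The interrupt policy described above can be sketched as follows. This is a
simplified illustration and not the code in dc_encap(): the struct and
function are hypothetical, and intr_always merely stands in for a flag in
the spirit of DC_TX_INTR_ALWAYS.

```c
#include <assert.h>
#include <stddef.h>

#define TX_INTR_EVERY 64   /* interval taken from the message above */

struct toy_tx {
    unsigned since_intr;   /* frames queued since the last request */
    int intr_always;       /* nonzero: request an interrupt for every frame */
};

/* Returns 1 if this frame's descriptor should request a "TX done"
 * interrupt, 0 otherwise. */
static int toy_want_tx_intr(struct toy_tx *t)
{
    assert(t != NULL);
    if (t->intr_always)
        return 1;
    if (++t->since_intr >= TX_INTR_EVERY) {
        t->since_intr = 0;
        return 1;
    }
    return 0;
}
```

Queueing 128 frames with intr_always clear requests two interrupts, on the
64th and 128th frame.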

Anyway. I *never* see this behavior on any of my test machines. I
have a LinkSys LNE100TX V2.0 card with the 82c115 chip, as well
as a couple of Macronix cards, a Davicom card, several Intel/DEC
21143 cards, ASIX cards and ADMtek cards, and PNIC I-based LinkSys
cards. None of them exhibit this behavior when I test them.

 This netstat is on the machine with the 'dc' driver that locked up, when
 I ping it from another machine.  The 'dc' driver still works--- barely.
 It doesn't process any packets until 64 have been received, then it
 processes them all at once.  The transmit side appears to work fine and
 the receive side appears to get interrupts but does not appear to process
 incoming packets.  Yet, obviously, the packets are being accumulated 
 somewhere because I don't have any packet loss, just incredibly long and
 odd ping times.

No no no. You can't say "the receive side appears to get interrupts."
That's speculation. You can stare at the machine and theorize about
what appears to be happening all you want: it won't do a damn bit of good 
until you actually test your theory. You know that an "RX done" interrupt
has been delivered if dc_rxeof() is called. So do something to verify
that it's being called: stick a printf() in dc_rxeof() that tells you
when it trips. Then duplicate the behavior and watch what happens.

 This occurs when I am running netscape on the same box over a remote X
 connection (read:  Lots of packets going over the network plus lots of
 local PCI activity talking to the graphics card).  Same problem occurs 
 with different graphics adapters but I believe this same problem also
 occurred with the 'xl' driver on the card I had in before I put this
 card in.

Yes, but the one vital fact you keep leaving out is: does this always
happen with the same machine? If so, then describe this machine. What
PCI chipset does it have? And more to the point, what cards have you
used in this machine that *didn't* exhibit this problem?

No wait, let me guess: Intel fxp. Right? <G>

I'm very puzzled by the fact that nobody else has *ever* reported
any problem even remotely like this. Of course, with the level of
feedback I get, it's possible that 50 people are having the same
problem and simply never bothered to tell me.

 And watch what happens after I managed to 'ifconfig dc0 media auto',
 it goes back to normal... suddenly everything is working properly
 again.

And what happens if instead of auto, you use "ifconfig dc0 media 100baseTX
mediaopt full-duplex" to lock the media setting down? Or what happens if
you shut down and restart the X server?

-Bill

-- 

Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Matthew Dillon

: It appears that the 'dc' driver continues to take receive interrupts
: (see the systat -vm snapshot at the end), but winds up not processing 
: any of the packets.  Except when 64 packets accumulate then suddenly all
: 64 get processed all at once!  Then nothing again until the next 64
: accumulate.
:
:Uh. That's... strange. First of all, you haven't said if this is the
:same machine that experienced the problems with the xl driver. Second,
:the number 64 sticks out in this case. If you look at if_dc.c (uh...
:you did actually look at the code, right?), you'll see that dc_encap()
:will only ask for a "TX done" interrupt every 64 packets. Why? Well,
:reclaiming transmit buffers is a fairly unimportant task and I wanted to 

I'm trying to narrow down the area enough that I can mess with the 
driver myself and hopefully locate the problem, since it can't be
reproduced easily.   I was hoping the magic number 64 could be
related to something - and you have apparently been able to do that,
which gives me a place to start anyway.   netstat shows the trigger
to be the reception of 64 packets rather than the transmission, though.
Is there anything at all about the number 64 that could be related to
the receiver?

I'm pretty sure that the box was getting receive interrupts because
every time I sent a packet to it from the outside systat -vm showed
a PCI interrupt for the network device.  However 'netstat -in 1' did
not show the statistics for the received packets until 64 had 
accumulated.  It could be that the statistics are not being accumulated
on a per-reception basis and that the receive packets are actually
getting through, and that it's the transmit side which is broken.  I don't
know the code well enough yet to make the determination.

Previously it was not possible to add debugging code due to the amount
of network traffic involved.  With the new card, though, it should be
possible to add conditional debugging code that could then be turned on
with the sysctl because the network does not lock up completely (so I can
still run 'sysctl' even if it takes it 5 minutes to load over NFS).
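
A userland sketch of that kind of conditional debugging: a counter that is
always maintained, plus output that is only produced when a flag is set. In
a kernel the flag would be exposed through sysctl; here it is an ordinary
variable, and all names are hypothetical.

```c
#include <stdio.h>

static int toy_debug = 0;                  /* a kernel would export this via sysctl */
static unsigned long toy_rxeof_calls = 0;  /* how often the handler has run */

/* Stand-in for the top of a receive handler such as dc_rxeof(). */
static void toy_rxeof(void)
{
    toy_rxeof_calls++;
    if (toy_debug)
        fprintf(stderr, "rxeof call %lu\n", toy_rxeof_calls);
}
```

Keeping the counter unconditional means the call count is available even
when the output is switched off, so the flag only has to be turned on once
the problem is actually occurring.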

:Yes, but the one vital fact you keep leaving out is: does this always
:happen with the same machine? If so, then describe this machine. What
:PCI chipset does it have? And more to the point, what cards have you
:used in this machine that *didn't* exhibit this problem?
:
:No wait, let me guess: Intel fxp. Right? <G>

I only have one machine with this configuration (diskless workstation,
everything running over NFS, plus X Display), so yes.  The problem only
occurs on one machine.  It started occurring mid-year, after I threw the
card in that used the xl driver.  The previous ethernet card used a 'de'
driver I believe and didn't have the problem.  The only 'fxp' ethernets
I have are in two of my test boxes - built into the motherboard.  I
don't think I have any PCI cards that use that driver.  The LinkSys
card in my server has never locked up, and the card using the 'xl' driver
in my other diskless test machine (which doesn't have an X display)
has never locked up either.

: And watch what happens after I managed to 'ifconfig dc0 media auto',
: it goes back to normal... suddenly everything is working properly
: again.
:
:And what happens if instead of auto, you use "ifconfig dc0 media 100baseTX
:mediaopt full-duplex" to lock the media setting down? Or what happens if
:you shut down and restart the X server?
:
:-Bill

I'll try that next time the problem occurs but I doubt it will have 
any effect.  Changing the duplex mode does not appear to reset the port 
whereas forcing the media to 'auto' does appear to reset the port.  This 
is actually another problem (switches don't appear to pick up the duplex
change if the port isn't reset), but not one I'm concerned with.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
 I'm trying to narrow down the area enough that I can mess with the 
 driver myself and hopefully locate the problem, since it can't be
 reproduced easily.   I was hoping the magic number 64 could be
 related to something - and you have apparently been able to do that,
 which gives me a place to start anyway.   netstat shows the trigger
 to be the reception of 64 packets rather than the transmission, though.
 Is there anything at all about the number 64 that could be related to
 the receiver?

64 is also the number of descriptors/buffers in the RX ring. When you
fill up the RX ring, the chip is supposed to generate a 'no RX buffer
available' interrupt. The driver will check the RX ring for packets
when either an 'RX OK' or 'no RX buffers available' interrupt is
delivered, but you should be getting an 'RX OK' interrupt on every
received packet.
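
A toy model of the two receive interrupts, to show how the reported symptom
would arise if the per-packet 'RX OK' interrupt went missing and only the
ring-full interrupt were serviced. It is a hypothesis illustrated in
userland, not a statement about what the hardware or if_dc.c does; all names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RX_RING_SIZE 64   /* ring size stated in the message above */

struct toy_rx {
    unsigned pending;    /* frames sitting in the ring, not yet processed */
    unsigned ipackets;   /* what netstat would report as input packets */
    int rx_ok_works;     /* 0 models the per-packet interrupt going missing */
};

/* Drain everything in the ring, as a receive handler would. */
static void toy_rx_drain(struct toy_rx *r)
{
    r->ipackets += r->pending;
    r->pending = 0;
}

/* One frame arrives from the wire. */
static void toy_rx_frame(struct toy_rx *r)
{
    assert(r != NULL && r->pending < RX_RING_SIZE);
    r->pending++;
    /* Either the per-packet interrupt or the ring-full interrupt fires. */
    if (r->rx_ok_works || r->pending == RX_RING_SIZE)
        toy_rx_drain(r);
}
```

With rx_ok_works set to 0, the input counter stays at 0 for 63 frames and
jumps to 64 on the next one; with it set to 1, every frame is counted as it
arrives.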

The datasheet for the PNIC II is at:

http://www.freebsd.org/~wpaul/Macronix/PNIC_II.PDF

This is the datasheet LinkSys gave me when they first came out with
the LNE100TX v2.0 board. It's very similar to the Macronix 98715A
datasheet.
 
 I'm pretty sure that the box was getting receive interrupts because
 every time I sent a packet to it from the outside systat -vm showed
 a PCI interrupt for the network device.  However 'netstat -in 1' did
 not show the statistics for the received packets until 64 had 
 accumulated.  It could be that the statistics are not being accumulated
 on a per-reception basis and that the receive packets are actually
 getting through, and that it's the transmit side which is broken.  I don't
 know the code well enough yet to make the determination.

The dc_rxeof() routine is what increments ifp->if_ipackets, so if
netstat -in doesn't show any change until after 64 packets have arrived,
then it isn't getting the 'RX OK' interrupts. But I promise you that I
have never seen a condition where 'RX OK' interrupts failed to arrive
even though 'no RX buffer available' interrupts did. The interrupt handler
re-enables interrupts just before it exits, so there should never be a
case where interrupts are turned off and never turned back on again.

-Bill

 I'll try that next time the problem occurs but I doubt it will have 
 any effect.  Changing the duplex mode does not appear to reset the port 
 whereas forcing the media to 'auto' does appear to reset the port.  This 
 is actually another problem (switches don't appear to pick up the duplex
 change if the port isn't reset), but not one I'm concerned with.

In general what you want to do is a) switch modes and b) reset the link
so that the guy on the other side re-senses the media. However both sides
can only agree on the duplex setting as the result of an NWAY autoneg
session: if you manually select 100baseTX full duplex, the link partner
can only sense the link speed (100mbs as opposed to 10) but not the
duplex mode. The rule is that if you don't have NWAY but can sense the
link speed, you default to half duplex and let the operator manually
fix things if necessary (that's what operators are for). Of course this
only works if the switch has a management interface that allows you
to configure things like that. Some don't, which can make your life tough.
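
The fallback rule above can be written down as a small decision function.
This is only a sketch of the rule as stated in this message; the enum and
function names are hypothetical.

```c
enum toy_duplex { TOY_HALF, TOY_FULL };

/* Duplex a link partner ends up with.
 *   nway_ok:   nonzero if an NWAY autonegotiation session completed
 *   both_full: nonzero if both ends offered full duplex in that session */
static enum toy_duplex toy_resolve_duplex(int nway_ok, int both_full)
{
    if (!nway_ok)
        return TOY_HALF;   /* speed sensed only: default to half duplex */
    return both_full ? TOY_FULL : TOY_HALF;
}
```

If one end is manually forced to full duplex and the other falls back
through the first branch, the two ends disagree, which is the case an
operator has to fix by hand.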

I'm pretty sure the speed and duplex setting don't really have anything
to do with this particular problem though. I was just wondering why
renegotiating the media would have any effect. It's possible that
dc_init() may be called in there somewhere, which could be resetting
all of the driver state.

-Bill

-- 
=
-Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu
Work: [EMAIL PROTECTED] | Center for Telecommunications Research
Home:  [EMAIL PROTECTED] | Columbia University, New York City
=
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=





Re: Odd TCP glitches in new currents

1999-12-22 Thread Brad Knowles

At 8:00 PM +1300 1999/12/22, Joe Abley wrote:

  Sorry if this is stating the obvious, but I've seen more than one
  clueful person bitten by this:

hard-wire your duplex setting on your machine and also on the switch

If you check http://www.backplane.com/diablo/hard.html and 
scroll down to the "Network:" section (from the looks of things, 
written sometime back in 1997 or perhaps 1998), you'll see that Matt 
has been well aware of this problem for some time.

That's not to say that he might not have forgotten his own advice on 
this issue, just that he is (or should be ;-) well aware of it.


That said, I have to admit that I have yet to be bitten by this 
problem, and I have not (yet) configured machines & switches at this 
site to forcibly select a particular media speed and plexicity.

-- 
Brad Knowles [EMAIL PROTECTED] http://www.shub-internet.org/brad/
 http://wwwkeys.pgp.net:11371/pks/lookup?op=get&search=0xE38CCEF1

Your mouse has moved.   Windows NT must be restarted for the change to
take effect.   Reboot now?  [ OK ]





Re: Odd TCP glitches in new currents

1999-12-22 Thread Matthew Dillon

:  clueful person bitten by this:
:
:hard-wire your duplex setting on your machine and also on the switch
:
:   If you check http://www.backplane.com/diablo/hard.html and 
:scroll down to the "Network:" section (from the looks of things, 
:written sometime back in 1997 or perhaps 1998), you'll see that Matt 
:has been well aware of this problem for some time.
:
:   That's not to say that he might not have forgotten his own advice on 
:this issue, just that he is (or should be ;-) well aware of it.
:
:
:   That said, I have to admit that I have yet to be bitten by this 
:problem, and I have not (yet) configured machines & switches at this 
:site to forcibly select a particular media speed and plexicity.
:
:-- 
:Brad Knowles [EMAIL PROTECTED] http://www.shub-internet.org/brad/
: http://wwwkeys.pgp.net:11371/pks/lookup?op=get&search=0xE38CCEF1

That's less of a problem as boards have started to conform better,
but I definitely checked it - full-duplex worked fine (I could push
over 15 MBits aggregate).  My packet loss occurred with both half and
full duplex.

I finally tracked it down.  The loss is occurring in the link between
two of my switches.  The link goes across my apartment - about 60 feet of
Cat-5 cable.  That should be well within spec (you are supposed to be
able to do 100 meters) but it causes packet loss.  The switches 
autonegotiate full-duplex for the link (and I verified that it's actually
running at full duplex), but that's where the packet loss occurs.  Very
weird.

I was finally able to fix it by dropping in a 10BaseT hub to force the
switches to negotiate 10BaseT across the link.

Maybe my cable is damaged or something.  I'll run a second cable to see
whether that's the problem.

The second switch is a LinkSys.  I have a D-Link near my servers and a
LinkSys near my workstation.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]






Re: Odd TCP glitches in new currents

1999-12-22 Thread Jonathan Lemon

On Dec 12, 1999 at 11:37:42AM -0800, Matthew Dillon wrote:
 I finally tracked it down.  The loss is occuring in the link between
 two of my switches.  The link goes across my apartment - about 60 feet of
 Cat-5 cable.  That should be well within spec (you are supposed to be
 able to do 100 meters) but it causes packet loss.  The switches 
 autonegotiate full-duplex for the link (and I verified that it's actually
 running at full duplex), but that's where the packet loss occurs.  Very
 weird.
 
 I was finally able to fix it by dropping in a 10BaseT hub to force the
 switches to negotiate 10BaseT across the link.
 
 Maybe my cable is damaged or something.  I'll run a second cable to see
 if that's the problem or whether.
 
 The second switch is a LinkSys.  I have a D-Link near my servers and a
 LinkSys near my workstation.

Another thing to keep in mind is that sometimes the switch is bad.
I had a Netgear FS509 switch here that would eat packets transmitted
through the GigE port under certain conditions.  Netgear shipped me 
a new one, and I've been happy with it, until the same problem started
happening again this morning.

Perhaps in this case it's a bad fiber cable; I'll have to do some 
more testing to track it down.
--
Jonathan





Re: Odd TCP glitches in new currents

1999-12-22 Thread B. Scott Michel

On Wed, 22 Dec 1999, Jonathan Lemon wrote:

 On Dec 12, 1999 at 11:37:42AM -0800, Matthew Dillon wrote:
 I had a Netgear FS509 switch here that would eat packets transmitted
 through the GigE port under certain conditions.  Netgear shipped me 
 a new one, and I've been happy with it, until the same problem started
 happening again this morning.

There are some oddities in the 3.3 and 3.4 kernels as well -- I've actually
nailed down the plexicity and speed on both the Accelar and my humble PC,
and yet, I'm looking at weird TCP lockups from time to time.

Mostly seems to be related to NFSv3, but will also happen when doing
cvsup. There's no magic number of how many bytes are queued waiting to go
out the interface. And it seems to be limited to specific connections,
i.e. an NFS TCP connection can be jammed and yet I can be happily talking
to cvsup3 doing an update.

The interface in question is a NetGear:

pn0: 82c169 PNIC 10/100BaseTX rev 0x20 int a irq 11 on pci0.9.0

What is odd is that the output error metric from netstat -in monotonically
increases.

Yes, I could post my configuration, etc., and I could go back to running
-current, but I have a PhD to make progress on. And I'm willing to wait to
try out the consolidated 2x040/PNIC driver when 4.0 finally rolls out.


-scooter






Re: Odd TCP glitches in new currents

1999-12-22 Thread Matthew Dillon

:
:There's some oddities in the 3.3 and 3.4 kernels as well -- I've actually
:nailed down the plexicity and speed on both the Accellar and my humble PC,
:and yet, I'm looking at weird TCP lockups from time to time.
:
:Mostly seems to be related to NFSv3, but will also happen when doing
:cvsup. There's no magic number of how many bytes are queued waiting to go
:out the interface. And it seems to be limited to specific connections,
:i.e. an NFS TCP connection can be jammed and yet I can be happily talking
:to cvsup3 doing an update.

If an NFS TCP connection is jammed you can easily determine whether
the problem is NFS or the TCP stack by looking at the netstat -tn output.
Run 'netstat -tn | fgrep tcp' on both the client and server, locate the
NFS tcp connection in question, and see if there is traffic built up on
it.

If there is input traffic built up on either the client or server then
NFS isn't reading the socket.  But if there is output traffic built up
(and no input traffic built up by the receiving end) then the problem is
somewhere in the TCP stack.
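
Read as a decision rule, the check described above could be sketched roughly
like this in Python. It assumes the usual netstat column order (Proto, Recv-Q,
Send-Q, local address, foreign address, state). The function names and the two
sample lines are invented for illustration and do not come from a real capture.

```python
def parse_queues(netstat_line):
    """Return (recv_q, send_q) from one tcp line of 'netstat -tn' output.

    Assumes the columns are: Proto Recv-Q Send-Q Local Foreign State.
    """
    fields = netstat_line.split()
    return int(fields[1]), int(fields[2])


def diagnose(client_line, server_line):
    """Apply the rule of thumb from the message above to both ends."""
    c_recv, c_send = parse_queues(client_line)
    s_recv, s_send = parse_queues(server_line)
    if c_recv > 0 or s_recv > 0:
        # Data has arrived but the application has not read it yet.
        return "NFS is not reading the socket"
    if c_send > 0 or s_send > 0:
        # Data is queued on a sender and never shows up as input elsewhere.
        return "suspect the TCP stack"
    return "no backlog on this connection"


# Hypothetical sample lines, made up for this example only.
client = "tcp4  0  33304  10.0.0.5.1022  10.0.0.1.2049  ESTABLISHED"
server = "tcp4  0  0  10.0.0.1.2049  10.0.0.5.1022  ESTABLISHED"
print(diagnose(client, server))
```

A single snapshot can mislead, since queues fill and drain all the time, so
it is probably worth sampling a few times before drawing a conclusion.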

---

Well, my problem still persists -- it wasn't the link between my two
switches.  I am having the same problem across just about every tcp 
connection I make, whether it's over a local switch or a hub and it
doesn't seem to matter what kind of ethernet cards I have either.

I am clueless as to what is going on.  It seems to only happen with TCP
connections.  I wrote a UDP-based packet loss test program that sends
UDP packets at varying rates and sizes in both directions and figures 
out where the loss is occurring, and I get nada.  In fact, while it's
running in the background I am *still* getting TCP stutters and tcpdump
still shows one machine sending a packet that the other machine never
gets!  I have no friggin clue as to why TCP packets fail when UDP packets
don't.

I am beginning to seriously suspect a software problem.

-Matt






Re: Odd TCP glitches in new currents

1999-12-22 Thread Julian Elischer


make sure you test odd packet lengths. (as in "not even")
there are occasional bugs that turn up with that sort of thing.


On Wed, 22 Dec 1999, Matthew Dillon wrote:
 I am clueless as to what is going on.  It seems to only happen with TCP
 connections.  I wrote a UDP-based packet loss test program that sends
 UDP packets at varying rates and sizes in both directions and figures 
 out where the loss is occuring, and I get nada.   In fact, while its
 running in the background I am *still* getting TCP stutters and tcpdump
 still shows one machine sending a packet that the other machine never
 gets!  I have no friggin clue as to why TCP packets fail when UDP packets
 don't.
 






Re: Odd TCP glitches in new currents

1999-12-22 Thread Matthew Dillon

:
:make sure you test odd packet lengths. (as in "not even")
:there are occasional bugs that turn up with that sort of thing.

Yup.  Way ahead of you.  

Hmm. usleep() seems to have a high granularity - only about 150 Hz.
How annoying!
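
For anyone who wants to check the sleep granularity on their own box, a rough
measurement could look like the Python sketch below. It is not how linktest
does it; the function name and the 100 microsecond request are arbitrary
choices for the example. A granularity of about 150 Hz would show up as
roughly 6.7 ms per sleep no matter how short the requested delay is.

```python
import time


def measure_sleep_granularity(requested_s=0.0001, samples=50):
    """Return the average wall-clock time one short sleep actually takes."""
    start = time.perf_counter()
    for _ in range(samples):
        time.sleep(requested_s)
    return (time.perf_counter() - start) / samples


avg = measure_sleep_granularity()
print(f"requested 100 us, got about {avg * 1e6:.0f} us per sleep")
```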

I've put the linktest program up on my web site.  This one adds a 
'-f' option that allows you to specify to run the test as quickly
as possible with up to N packets in transit to any given host
at any given moment, default 1 (i.e. -f == -f1.  Try -f2, -f3...).

http://apollo.backplane.com/FreeSrc/linktest-1.0.c

-Matt






Re: Odd TCP glitches in new currents

1999-12-22 Thread Peter Jeremy

On 1999-Dec-23 15:12:53 +1100, Matthew Dillon [EMAIL PROTECTED] wrote:
 In fact, while it's
 running in the background I am *still* getting TCP stutters and tcpdump
 still shows one machine sending a packet that the other machine never
 gets!  I have no friggin clue as to why TCP packets fail when UDP packets
 don't.

If the problem shows up at 10baseX speeds, you could try setting up a
10base2 network comprising the two test machines and a third machine
as a sniffer.  The thinwire will allow an independent sniffer without
introducing any other hardware (like hubs) that might affect the
results.  If you suspect a s/w problem, have the sniffer run different
s/w (a commercial LAN analyser if you have one available, otherwise
maybe something non-FreeBSD).

This should allow you to identify whether it's a transmit or receive
problem.

Peter





new linktest program avail (was Re: Odd TCP glitches in new currents)

1999-12-22 Thread Matthew Dillon

A new version of linktest is up, much enhanced:

* fixes cpu use problems due to calling random() too much
* fixes usleep (we now use a pipe and select())

This version can really stuff the network.

http://www.backplane.com/FreeSrc/linktest-1.1.c


Running the following tests with my LinkSys 5-port switch (I haven't 
done this with the D-Link yet):

test3-test4 only

test3# linktest -s 1200 -f8 test4
test4# linktest -s 1200 -f8 test3

11.9 MBytes/sec in both directions (23 MB/sec across the switch)

lander-apollo only

lander# linktest -s 1200 -f8 apollo
apollo# linktest -s 1200 -f8 lander

4.5 MBytes/sec in both directions (10 MBytes/sec across the switch)

Now both tests running in parallel.

test3# linktest -s 1200 -f8 test4
test4# linktest -s 1200 -f8 test3
lander# linktest -s 1200 -f8 apollo
apollo# linktest -s 1200 -f8 lander

7.9 MBytes/sec in both directions for test3-test4 (15 MB/sec)
(6300 pps in both directions - 12000 pps across the switch)

4.3 MBytes/sec in both directions for apollo-lander (8.6 MB/sec)
(3500 pps in both directions - 7000 pps across the switch)

Interesting, eh?  The test3-test4 test slowed down when I ran the
apollo-lander test.  Still, the switch performs plenty good enough
for a tiny little 5-port item.
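
As a sanity check on the figures above, the packet rates and byte rates can be
related with a little arithmetic. The Python sketch below assumes every packet
carries the full 1200 bytes given to -s, that the byte counts include the
Ethernet, IPv4 and UDP headers (42 bytes in total), and that a MByte means
10^6 bytes. None of that is confirmed by the message, so treat the result as
a rough cross-check only.

```python
ETH_HDR = 14   # Ethernet II header, no VLAN tag
IP_HDR = 20    # IPv4 header without options
UDP_HDR = 8    # UDP header


def mbytes_per_sec(pps, payload_bytes, count_headers=True):
    """Convert a packet rate into MBytes/sec (10^6 bytes) for UDP over IPv4."""
    overhead = ETH_HDR + IP_HDR + UDP_HDR if count_headers else 0
    return pps * (payload_bytes + overhead) / 1e6


print(round(mbytes_per_sec(6300, 1200), 2))  # test3-test4 packet rate
print(round(mbytes_per_sec(3500, 1200), 2))  # apollo-lander packet rate
```

Under those assumptions the two lines come out near 7.8 and 4.3 MBytes/sec,
which is in the same range as the reported 7.9 and 4.3.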

And... no packet loss in the test - except my TCP connection when I type
is still getting packet loss.  Weeeird.

-Matt






Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-22 Thread Matthew Dillon

I'm adding Bill Paul to the list specifically.

Hmm.  Now this is odd!  I think I may have found something!

All of my 'rl' driver cards fail this test:

apollo# linktest -m 0.1:0.2 -s 16 -f16 lander
lander# linktest -m 0.1:0.2 -s 16 -f16 apollo

They get about 1% packet loss with the test.  Always.  
100BaseTX full or half duplex, or 10BaseT -- I still get
failures.

rl0: RealTek 8139 10/100BaseTX irq 11 at device 3.0 on pci0
rl0: Ethernet address: 00:50:ba:d1:89:05
miibus0: MII bus on rl0

All of my 'fxp' driver cards succeed with the above test perfectly.
If I test an fxp machine versus an 'rl' machine, linktest shows that
the 'rl' cards can transmit small packets just fine but they lose
out trying to receive them!

(test3 has an 'fxp' driver, apollo has an 'rl' driver.  Both are
on the same switch!)

test3(216.240.41.13)-apollo.backplane.com  lost 79/89027
test3(216.240.41.13)-apollo.backplane.com  lost 80/89990
test3(216.240.41.13)-apollo.backplane.com  lost 81/90953
test3(216.240.41.13)-apollo.backplane.com  lost 82/92879
test3(216.240.41.13)-apollo.backplane.com  lost 83/93842
test3(216.240.41.13)-apollo.backplane.com  lost 84/94805
test3(216.240.41.13)-apollo.backplane.com  lost 85/96730
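
To turn those counters into a loss rate, something like the following Python
sketch would do. It assumes the "lost N/M" field means N packets lost out of
M sent, which is a guess from the output format rather than anything stated
in the message; the helper name is made up.

```python
import re

LOSS_RE = re.compile(r"lost (\d+)/(\d+)")


def loss_fraction(line):
    """Return lost/total for one linktest-style 'lost N/M' report line."""
    match = LOSS_RE.search(line)
    if match is None:
        raise ValueError("no 'lost N/M' field in line")
    lost, total = int(match.group(1)), int(match.group(2))
    return lost / total


first = "test3(216.240.41.13)-apollo.backplane.com  lost 79/89027"
last = "test3(216.240.41.13)-apollo.backplane.com  lost 85/96730"
for line in (first, last):
    print(f"{loss_fraction(line) * 100:.3f}% lost")
```

If that reading is right, the fxp-to-rl run above is losing a little under
0.1% of its packets.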

Methinks there is something going on with the 'rl' driver and/or
the RealTek cards!

-Matt
Matthew Dillon 
[EMAIL PROTECTED]







Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-22 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
 I'm adding Bill Paul to the list specifically.
 
 Hmm.  Now this is odd!  I think I may have found something!
 
 All of my 'rl' driver cards fail this test:

Oh sure. Bet the farm on the absolute worst NIC on the whole damn planet,
why don't you. Why spend a few bucks on some nice 3c905B or 3c905C cards
and beat up on them when you can buy ten RealTek cards for a dollar. About
as reliable as a pair of tin cans and a piece of string, but gosh they
sure are cheap.

You'll have to wait until at least tomorrow before I can look into this,
since I won't be able to do any debugging until I throw my one and only
RealTek 8139 sample adapter into a machine and run some tests with it.

   rl0: RealTek 8139 10/100BaseTX irq 11 at device 3.0 on pci0
   rl0: Ethernet address: 00:50:ba:d1:89:05
   miibus0: MII bus on rl0

pciconf -l would be nice here too (to see the PCI revision code).
 
 Methinks there is something going on with the 'rl' driver and/or
 the RealTek cards!

Gee, y'think? I don't suppose you ran any similar tests with, say,
one of those LinkSys cards you had the other day. Or maybe a 3Com card.
I mean, it's just a little anti-climactic, you know? I put all that
blood, sweat and tears into if_xl and if_dc, but do people do stress
tests with them to help me identify weaknesses? No, they pound on
the house of cards that is if_rl.

*sigh*

-Bill

-- 
=
-Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu
Work: [EMAIL PROTECTED] | Center for Telecommunications Research
Home:  [EMAIL PROTECTED] | Columbia University, New York City
=
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=





Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-22 Thread Jonathan Lemon

On Dec 12, 1999 at 01:41:04AM -0500, Bill Paul wrote:
 Of all the gin joints in all the towns in all the world, Matthew Dillon 
 had to walk into mine and say:
  
  I'm adding Bill Paul to the list specifically.
  
  Hmm.  Now this is odd!  I think I may have found something!
  
  All of my 'rl' driver cards fail this test:
 
 Oh sure. Bet the farm on the absolute worst NIC on the whole damn planet,
 why don't you.

Sorry, but I can't resist quoting this:

/*
 * The RealTek 8139 PCI NIC redefines the meaning of 'low end.' This is
 * probably the worst PCI ethernet controller ever made, with the possible
 * exception of the FEAST chip made by SMC.
 */

--
Jonathan





Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-22 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
 (taking this off -current)
 
 apollo# linktest -s 51 -f1 lander      1-51 byte payload - errors
 lander# linktest -s 51 -f1 apollo
 
 apollo# linktest -s 52 -f1 lander      52+ byte payload - no errors
 lander# linktest -s 52 -f1 apollo
 
 
 You know, this kinda sounds like a jabber lockup.
 
 Bill, are you following the *MINIMUM* ethernet frame size specification 
 for ethernet?

*sigh* No, I've been living on Mars since 1975 and we don't get IEEE spec
documents up here.

Yes, I know there's a minimum frame length of 60 bytes. And the rl_encap()
routine has the following code:

        /* Pad frames to at least 60 bytes. */
        if (m_head->m_pkthdr.len < RL_MIN_FRAMELEN) {
                m_head->m_pkthdr.len +=
                        (RL_MIN_FRAMELEN - m_head->m_pkthdr.len);
                m_head->m_len = m_head->m_pkthdr.len;
        }

The RealTek doesn't autopad, so you have to handle it manually. You're
only allowed one DMA buffer per transmission, so outbound packets are
coalesced into a single mbuf cluster buffer in rl_encap(). A cluster
buffer is always 2K, and frames can never be larger than 1514 bytes, so
we know there'll always be plenty of room. In the case of frames less
than 60 bytes, I just bump up m_pkthdr.len and m_len. This adjusted
length gets used later in rl_start() when transmission is triggered.
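
For readers who don't speak mbuf, the effect of that padding step can be
illustrated outside the kernel with a few lines of Python. This is only a
sketch of the idea, not the driver's code: it fills the pad with zero bytes,
whereas the snippet quoted above only adjusts the lengths. The names are
invented for the example.

```python
MIN_FRAMELEN = 60  # minimum Ethernet frame length, not counting the 4-byte FCS


def pad_frame(frame: bytes) -> bytes:
    """Pad a frame with zero bytes up to the 60-byte minimum."""
    if len(frame) < MIN_FRAMELEN:
        frame = frame + bytes(MIN_FRAMELEN - len(frame))
    return frame


short = b"\x01" * 43             # stand-in for a frame that is too short
print(len(pad_frame(short)))     # padded up to the minimum
print(len(pad_frame(b"\x01" * 100)))  # longer frames are left alone
```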

Incidentally, you should be using tcpdump -n -e -i rl0 to measure the
actual frame length of failing and succeeding transmissions: that's
usually a much better indicator of what might be going wrong. You could
calculate it from the data buffer length, but I suck at math; I find it's
easier just to monitor the offending frames.
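
For what it's worth, the calculation from the data buffer length is short.
The Python sketch below assumes linktest's payload travels as plain UDP over
IPv4 with no IP options on untagged Ethernet, which is an assumption about
the test program rather than something stated here. Lengths exclude the FCS.

```python
def frame_len(udp_payload_bytes):
    """Expected on-wire frame length (without FCS) for a UDP/IPv4 packet."""
    raw = 14 + 20 + 8 + udp_payload_bytes  # Ethernet + IPv4 + UDP headers
    return max(raw, 60)                    # short frames get padded to 60


for size in (1, 17, 18, 51, 52):
    print(size, frame_len(size))
```

Under those assumptions only payloads below 18 bytes would need padding, and
payloads of 51 and 52 bytes would give frames of 93 and 94 bytes. Watching
the real frames with tcpdump -e remains the safer way to confirm this.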

-Bill






Re: Odd TCP glitches in new currents

1999-12-21 Thread Poul-Henning Kamp

In message [EMAIL PROTECTED], Garrett Wollman writes:
On Tue, 21 Dec 1999 12:50:50 -0800 (PST), Matthew Dillon 
[EMAIL PROTECTED] said:

 I have NOT tested this fix yet, so I don't know if it works, but I
 believe the problem is that on high speed networks the millisecond round
 trip delay is short enough that you can get 1-tick timeouts.

Hmmm.  I thought we agreed that 200 msec was the minimum reasonable
RTO.  That code doesn't seem to have made it in.

I assume you mean 20 msec (= 2 ticks @ 100 Hz)? 200 msec is enough
to get halfway around the globe...

--
Poul-Henning Kamp FreeBSD coreteam member
[EMAIL PROTECTED]   "Real hackers run -current on their laptop."
FreeBSD -- It will take a long time before progress goes too far!





Re: Odd TCP glitches in new currents

1999-12-21 Thread Garrett Wollman

On Tue, 21 Dec 1999 22:13:51 +0100, Poul-Henning Kamp [EMAIL PROTECTED] said:

 Hmmm.  I thought we agreed that 200 msec was the minimum reasonable
 RTO.  That code doesn't seem to have made it in.

 I assume you mean 20 msec (= 2 tick @ 100 Hz ) ? 200 msec is enough
 to get halfway around the globe...

No, I mean 200 msec.  If you make the RTO be any shorter than that,
you'll slow-start every packet you send to a machine which is running
delayed-ACK (i.e., almost everyone).  The official standard RTO is I
think 500 msec, but this might be too high.
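
The argument can be made concrete with a small Python sketch. It uses the
familiar smoothed-RTT estimate (srtt + 4 * rttvar) with a floor applied, and
a simple check for whether a delayed ACK would arrive after the timer fires.
This is an illustration only, not the FreeBSD code; the function names, the
parameters and the 100 msec ACK delay in the example are all made up.

```python
def rto_ms(srtt_ms, rttvar_ms, min_rto_ms=200.0):
    """Retransmit timeout: smoothed estimate, clamped to a minimum."""
    return max(min_rto_ms, srtt_ms + 4.0 * rttvar_ms)


def spurious_retransmit(rto, rtt_ms, ack_delay_ms):
    """True if the peer's delayed ACK would arrive after the RTO expires."""
    return rtt_ms + ack_delay_ms > rto


# A fast LAN: 1 msec round trip, peer delays its ACK by 100 msec (assumed).
lan_rto_unclamped = rto_ms(1.0, 0.5, min_rto_ms=20.0)
lan_rto_clamped = rto_ms(1.0, 0.5, min_rto_ms=200.0)
print(lan_rto_unclamped, spurious_retransmit(lan_rto_unclamped, 1.0, 100.0))
print(lan_rto_clamped, spurious_retransmit(lan_rto_clamped, 1.0, 100.0))
```

With the 20 msec floor the timer fires before the delayed ACK can arrive;
with the 200 msec floor it does not.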

We have ``bad retransmit recovery'' which is supposed to detect some
instances of this and disable slow-start in that case.

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
[EMAIL PROTECTED]  | O Siem / The fires of freedom 
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA| - Susan Aglukark and Chad Irschick





Re: Odd TCP glitches in new currents

1999-12-21 Thread Matthew Dillon

:
:Hmmm.  I thought we agreed that 200 msec was the minimum reasonable
:RTO.  That code doesn't seem to have made it in.
:
:I assume you mean 20 msec (= 2 tick @ 100 Hz ) ? 200 msec is enough
:to get halfway around the globe...
:
:--
:Poul-Henning Kamp FreeBSD coreteam member

I just rebooted both machines and it didn't fix the problem.  I did a
packet trace on both boxes and there does indeed appear to be packet loss.

I may have thrown out a red herring, sorry about that folks!  Something 
odd is going on, that's for sure.  My packet trace shows that there was 
packet loss and that the retry did *NOT* occur immediately, so my 
premise goes out the window.

I don't understand why my tcp connection has this sort of packet loss when
all of my ping tests succeed 100%.  I am totally baffled.

-Matt

(make window wide to view.  Note: my xntpd's aren't synchronized well
enough this soon after reboot so the two machines' times do not match
very well).

machine #1  (did not receive packet sequence 20400)

13:12:28.730938 216.240.41.6.4006 > 216.240.41.2.22: P 20360:20380(20) ack 36645 win 17520 (DF) [tos 0x10]
13:12:28.756646 216.240.41.2.22 > 216.240.41.6.4006: P 36645:36665(20) ack 20380 win 17520 (DF)
13:12:28.794196 216.240.41.6.4006 > 216.240.41.2.22: P 20380:20400(20) ack 36665 win 17520 (DF) [tos 0x10]
13:12:28.816622 216.240.41.2.22 > 216.240.41.6.4006: P 36665:36685(20) ack 20400 win 17520 (DF)
13:12:28.962999 216.240.41.6.4006 > 216.240.41.2.22: P 20420:20440(20) ack 36685 win 17520 (DF) [tos 0x10]
13:12:28.963059 216.240.41.2.22 > 216.240.41.6.4006: . ack 20400 win 17520 (DF)
13:12:29.027297 216.240.41.6.4006 > 216.240.41.2.22: P 20440:20460(20) ack 36685 win 17520 (DF) [tos 0x10]

machine #2  (sent packet sequence 20400, then timed out later and resent)

13:12:27.743652 216.240.41.6.4006 > 216.240.41.2.22: . ack 36645 win 17520 (DF) [tos 0x10]
13:12:28.176252 216.240.41.6.4006 > 216.240.41.2.22: P 20360:20380(20) ack 36645 win 17520 (DF) [tos 0x10]
13:12:28.202078 216.240.41.2.22 > 216.240.41.6.4006: P 36645:36665(20) ack 20380 win 17520 (DF)
13:12:28.239533 216.240.41.6.4006 > 216.240.41.2.22: P 20380:20400(20) ack 36665 win 17520 (DF) [tos 0x10]
13:12:28.262069 216.240.41.2.22 > 216.240.41.6.4006: P 36665:36685(20) ack 20400 win 17520 (DF)
13:12:28.336525 216.240.41.6.4006 > 216.240.41.2.22: P 20400:20420(20) ack 36685 win 17520 (DF) [tos 0x10]
13:12:28.408355 216.240.41.6.4006 > 216.240.41.2.22: P 20420:20440(20) ack 36685 win 17520 (DF) [tos 0x10]
13:12:28.408512 216.240.41.2.22 > 216.240.41.6.4006: . ack 20400 win 17520 (DF)
13:12:28.472656 216.240.41.6.4006 > 216.240.41.2.22: P 20440:20460(20) ack 36685 win 17520 (DF) [tos 0x10]
13:12:28.472805 216.240.41.2.22 > 216.240.41.6.4006: . ack 20400 win 17520 (DF)
13:12:28.545556 216.240.41.6.4006 > 216.240.41.2.22: P 20460:20480(20) ack 36685 win 17520 (DF) [tos 0x10]
13:12:28.545703 216.240.41.2.22 > 216.240.41.6.4006: . ack 20400 win 17520 (DF)
13:12:28.545770 216.240.41.6.4006 > 216.240.41.2.22: P 20400:20480(80) ack 36685 win 17520 (DF) [tos 0x10]
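
Spotting the hole by eye gets tedious on longer traces. A small Python sketch
like the one below can do it, given the start:end sequence ranges of the data
segments one side actually saw. The ranges in the example are the segments
from 216.240.41.6 that appear in the machine #1 trace above; the function
name is made up.

```python
def find_gaps(segments):
    """Return the (start, end) sequence ranges missing between segments."""
    gaps = []
    expected = None
    for start, end in sorted(segments):
        if expected is not None and start > expected:
            gaps.append((expected, start))
        expected = end if expected is None else max(expected, end)
    return gaps


received = [(20360, 20380), (20380, 20400), (20420, 20440), (20440, 20460)]
print(find_gaps(received))  # [(20400, 20420)]
```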







Re: Odd TCP glitches in new currents

1999-12-21 Thread Joe Abley

On Tue, Dec 21, 1999 at 01:23:05PM -0800, Matthew Dillon wrote:
 I just rebooted both machines and it didn't fix the problem.  I did a
 packet trace on both boxes and there does indeed appear to be packet loss.

Sorry if this is stating the obvious, but I've seen more than one
clueful person bitten by this:

  hard-wire your duplex setting on your machine and also on the switch

Even if the switch and NIC appear to auto-negotiate a sensible duplex
setting, I have seen many cases where they will forget for no apparent
reason, usually in the middle of the night just after you have stepped
onto a plane to fly to a different country.

If one end thinks it is full-duplex and the other end thinks it is
half, then late collisions can occur which will not result in MAC-layer
retransmissions from the full-duplex-thinking station -- hence packet
loss.


Joe (possibly #2 in a series of red herrings :)

-- 
Ua lawa küpono ka hakahaka pä o këia pä malule

