Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Fri, Jul 14, 2000 at 08:46:40AM +0200, Wilko Bulte wrote:

> That theory is not correct, I have seen multiple Alpha machines reporting buffer underruns as well. No ATA disk in sight there..

I get the same thing on AS4000/AS4100 machines running Tru64. I'm inclined to believe it's a design flaw in the chip.

Peter

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: dc driver and underruns (was: Strangeness with 4.0-S)
...

> > As far as I can tell the fxp driver doesn't even use the tx_fifo in the 825xxx chips :-)
>
> The 82557-9 have a 2KB internal buffer for transmits. They don't start transmitting until a programmed threshold is reached - this is to ensure that PCI bus latency doesn't result in the transmitter getting stalled. The fxp driver starts out with this threshold set at 512 bytes, but will increase it (512 bytes at a time) when a DMA underrun occurs. Of course, once the threshold reaches 1536, an entire 1500 byte packet is DMA'd into the buffer before the transmit begins.

Can you point me to the part of if_fxp.c that does this? All I can find about any form of fifo in the code are these references:

    Guardian# grep -i fifo *fxp*
    if_fxp.c:    cbp->rx_fifo_limit = 8;  /* rx fifo threshold (32 bytes) */
    if_fxp.c:    cbp->tx_fifo_limit = 0;  /* tx fifo threshold (0 bytes) */
    if_fxpreg.h: volatile u_int rx_fifo_limit:4,
    if_fxpreg.h:                tx_fifo_limit:3,

Nowhere do I find anything that adjusts these values :-(.

--
Rod Grimes - KD7CAX @ CN85sl - (RWG25) [EMAIL PROTECTED]
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Friday, 14th July 2000, "Rodney W. Grimes" wrote:

> > I suspect an interaction between the ATA driver and VIA chipsets, because other than the network, that's all that is operating when I see the underruns. And my Celeron with a ZX chipset is immune.
>
> I've seen them on just about everything, chipset doesn't seem to matter, IDE or SCSI doesn't seem to matter.

Well, maybe they are just a fact of life. But using just my vague knowledge of how PCI works, it doesn't look inevitable to me. So I see bugs. :-)

> > Getting even more technical, it appears to me that the current driver instructs the 21143 to poll for transmit packets (ie a small DMA) every 80us even if there are none to be sent. I don't know what percentage of bus time this might be, or even how to calculate it (got some time Rod?)
>
> I'll have to look at that. If it is a simple 32 bit read every 80uS that's something like .1515% of the PCI bandwidth, something that shouldn't matter much. (I assumed a simple 4 cycle PCI operation). Just how big is this DMA operation every 80uS?

I believe it is just one 32 bit read. But I don't understand that aspect of the hardware very well yet. I also suspect that this polling adds to the latency, but again, I haven't got to the end of that either. Sometimes other things can distract you from even the most interesting technical matter. :-)

Stephen.
Re: dc driver and underruns (was: Strangeness with 4.0-S)
> On Friday, 14th July 2000, "Rodney W. Grimes" wrote:
>
> > > I suspect an interaction between the ATA driver and VIA chipsets, because other than the network, that's all that is operating when I see the underruns. And my Celeron with a ZX chipset is immune.
> >
> > I've seen them on just about everything, chipset doesn't seem to matter, IDE or SCSI doesn't seem to matter.
>
> Well, maybe they are just a fact of life. But using just my vague knowledge of how PCI works, it doesn't look inevitable to me. So I see bugs. :-)

Yes, there are bugs: they are in the poor specification of the PCI bus, and in the even poorer implementation of PCI in hardware. To quote from the PCI 2.0 spec, starting at the bottom of page 44, section 3.4.4.3 Latency Guidelines:

    In most PCI systems, typical access latency is both short (likely under 2us) and easily quantified. However, worst case latency (however rare) may not only be quite long, but in some cases quite difficult to predict. For example, latency to a standard expansion adapter (ISA/EISA/MC) through a bridge is often a function of adapter behavior, not PCI behavior. (This is especially problematic since some existing adapters are not compliant with latency parameters defined by the associated bus standard.)

    To compensate, masters that require guaranteed worst case access latency must provide adequate buffering for 30 microseconds. This implies a minimum of about 50 bytes of buffering for a 10Mbit/second LAN, and about 500 bytes for a 100Mbit/second LAN. (If the buffers are line organized [i.e., 16- or 32-bit aligned] to improve PCI and target memory utilization, minimum buffer size likely increases.) In spite of worst case uncertainty, 30 microseconds should provide sufficient margin for realizable system designs.

My calculations say that 30uS is long enough to transfer about 3960 bytes over PCI (132MB/s for 30uS). Now you see the problem???

I think the current driver behavior is near optimal: it backs down until it becomes latency proof (store and forward is latency proof). The only thing it might do better is deal with the fact that short term bus starvation should not affect long term performance, and as long as the underrun events have a tolerable frequency it should not downgrade to store and forward. Right now the code immediately steps the TXTHRESH every time we get an underrun. It should probably use a frequency counter and not do this unless we are seeing some intolerable rate of underruns, especially when making the transition to store and forward.

Ohh... and a final note: DEC blew the chip design by only including a 160 byte threshold point, given that the PCI 2.0 spec says it should have been 500 bytes!! (Well, they blew it when they did the DC2114x enhancement to the DC2104x chip by not increasing the fifo depth to compensate for the higher rate at which the fifo is emptied.)

> > > Getting even more technical, it appears to me that the current driver instructs the 21143 to poll for transmit packets (ie a small DMA) every 80us even if there are none to be sent. I don't know what percentage of bus time this might be, or even how to calculate it (got some time Rod?)
> >
> > I'll have to look at that. If it is a simple 32 bit read every 80uS that's something like .1515% of the PCI bandwidth, something that shouldn't matter much. (I assumed a simple 4 cycle PCI operation). Just how big is this DMA operation every 80uS?
>
> I believe it is just one 32 bit read. But I don't understand that aspect of the hardware very well yet. I also suspect that this polling adds to the latency, but again, I haven't got to the end of that either. Sometimes other things can distract you from even the most interesting technical matter. :-)

:-)

--
Rod Grimes - KD7CAX @ CN85sl - (RWG25) [EMAIL PROTECTED]
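The frequency-counter idea in the message above could look roughly like the following. This is only a minimal sketch, not code from the dc driver: `struct underrun_gate`, `underrun_should_step()`, the 10 second window, and the limit of 3 underruns are all hypothetical names and values chosen for illustration. The point is that an isolated underrun is tolerated, and the transmit threshold is stepped only when underruns cluster inside a short window.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tuning values; a real driver would pick these by measurement. */
#define UNDERRUN_WINDOW_SECS 10u  /* length of the counting window */
#define UNDERRUN_LIMIT        3u  /* underruns per window that trigger a step */

/* Hypothetical per-interface state for rate-limiting threshold changes. */
struct underrun_gate {
    uint32_t window_start;  /* time the current window opened, in seconds */
    uint32_t count;         /* underruns seen in the current window */
};

/*
 * Record one TX underrun at time now_secs.
 * Returns 1 if the caller should step the TX threshold, 0 if the
 * underrun should be tolerated.
 */
int underrun_should_step(struct underrun_gate *g, uint32_t now_secs)
{
    assert(g);

    /* Open a fresh window once the old one has expired. */
    if (now_secs - g->window_start >= UNDERRUN_WINDOW_SECS) {
        g->window_start = now_secs;
        g->count = 0;
    }

    g->count++;
    if (g->count >= UNDERRUN_LIMIT) {
        /* Too many underruns in one window: step, then start counting again. */
        g->window_start = now_secs;
        g->count = 0;
        return 1;
    }
    return 0;
}
```

With these values, three underruns within ten seconds cause one step, while underruns spaced further apart never do.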
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Sun, 16 Jul 2000 11:41:37 -0700 (PDT), "Rodney W. Grimes" [EMAIL PROTECTED] said: Ohh... and a finally note, DEC blew the chip design by only including a 160byte threshold point given that PCI 2.0 spec says it should have been 500bytes!! It wouldn't be the first thing DEC had screwed up in the design of these NICs. On the other hand, Intel has owned the silicon for a couple of years now, which is more than enough time to unscrew it if they really wanted to. Clearly, they'd rather be selling 82559s -GAWollman -- Garrett A. Wollman | O Siem / We are all family / O Siem / We're all the same [EMAIL PROTECTED] | O Siem / The fires of freedom Opinions not those of| Dance in the burning flame MIT, LCS, CRS, or NSA| - Susan Aglukark and Chad Irschick To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Sun, 16 Jul 2000 11:41:37 -0700 (PDT), "Rodney W. Grimes" [EMAIL PROTECTED] said: Ohh... and a finally note, DEC blew the chip design by only including a 160byte threshold point given that PCI 2.0 spec says it should have been 500bytes!! It wouldn't be the first thing DEC had screwed up in the design of these NICs. On the other hand, Intel has owned the silicon for a couple of years now, which is more than enough time to unscrew it if they really wanted to. Clearly, they'd rather be selling 82559s As far as I can tell the fxp driver doesn't even use the tx_fifo in the 825xxx chips :-) -- Rod Grimes - KD7CAX @ CN85sl - (RWG25) [EMAIL PROTECTED] To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Sun, 16 Jul 2000 11:41:37 -0700 (PDT), "Rodney W. Grimes" [EMAIL PROTECTED] said: Ohh... and a finally note, DEC blew the chip design by only including a 160byte threshold point given that PCI 2.0 spec says it should have been 500bytes!! It wouldn't be the first thing DEC had screwed up in the design of these NICs. On the other hand, Intel has owned the silicon for a couple of years now, which is more than enough time to unscrew it if they really wanted to. Clearly, they'd rather be selling 82559s You're going to barf when I tell you that the ethernet component in the new ICH2 (PIIX4 equivalent in the new low-cost 815 chipset) looks like an 82586... -- ... every activity meets with opposition, everyone who acts has his rivals and unfortunately opponents also. But not because people want to be opponents, rather because the tasks and relationships force people to take different points of view. [Dr. Fritz Todt] To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: dc driver and underruns (was: Strangeness with 4.0-S)
> > On Sun, 16 Jul 2000 11:41:37 -0700 (PDT), "Rodney W. Grimes" [EMAIL PROTECTED] said:
> >
> > > Ohh... and a finally note, DEC blew the chip design by only including a 160byte threshold point given that PCI 2.0 spec says it should have been 500bytes!!
> >
> > It wouldn't be the first thing DEC had screwed up in the design of these NICs. On the other hand, Intel has owned the silicon for a couple of years now, which is more than enough time to unscrew it if they really wanted to. Clearly, they'd rather be selling 82559s
>
> As far as I can tell the fxp driver doesn't even use the tx_fifo in the 825xxx chips :-)

The 82557-9 have a 2KB internal buffer for transmits. They don't start transmitting until a programmed threshold is reached - this is to ensure that PCI bus latency doesn't result in the transmitter getting stalled. The fxp driver starts out with this threshold set at 512 bytes, but will increase it (512 bytes at a time) when a DMA underrun occurs. Of course, once the threshold reaches 1536, an entire 1500 byte packet is DMA'd into the buffer before the transmit begins.

There is buffering on the receive side as well, but I don't recall offhand how large that is (although I think it's 2KB as well).

-DG

David Greenman
Co-founder, The FreeBSD Project - http://www.freebsd.org
Manufacturer of high-performance Internet servers - http://www.terasolutions.com
Pave the road of life with opportunities.
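The threshold escalation described in the message above can be modelled in a few lines. This is a sketch of the behavior as described (start at 512 bytes, add 512 per underrun, stop at 1536), not an excerpt from if_fxp.c; the macro and function names are hypothetical.

```c
#include <assert.h>

/* Values taken from the description above; the names are hypothetical. */
#define TX_THRESH_START  512u  /* initial transmit threshold, bytes */
#define TX_THRESH_STEP   512u  /* increase applied after each DMA underrun */
#define TX_THRESH_MAX   1536u  /* at this point a 1500 byte frame is fully buffered */

/* Return the threshold to use after one more DMA underrun. */
unsigned tx_thresh_after_underrun(unsigned thresh)
{
    assert(thresh <= TX_THRESH_MAX);

    if (thresh + TX_THRESH_STEP <= TX_THRESH_MAX)
        return thresh + TX_THRESH_STEP;
    return TX_THRESH_MAX;
}

/*
 * A frame no longer than the threshold is completely in the chip's
 * buffer before transmission starts, i.e. it is sent store-and-forward.
 */
int frame_is_store_and_forward(unsigned thresh, unsigned frame_len)
{
    return frame_len <= thresh;
}
```

Starting from `TX_THRESH_START`, two underruns are enough to reach 1536 bytes, after which a full-size 1500 byte frame can no longer underrun.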
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Fri, Jul 14, 2000 at 12:51:14PM +1000, Stephen McKay wrote: On Thursday, 13th July 2000, "Rodney W. Grimes" wrote: On Thu, 13 Jul 2000, Stephen McKay wrote: Does anyone here actually measure these latencies? I know for a fact that nothing I've ever done would or could be affected by extra latencies that are as small as the ones we are discussing. Does anybody at all depend on the start-transmitting-before-DMA-completed feature we are discussing? I don't like the idea of removing that feature. Perhaps it should be a sysctl or ifconfig option, but it should definitely remain available. Those minute latencies are critical to those of us who use MPI for complex parallel calculations. I have to agree here. The store and forward adds an approximate 11uS (by theory under ideal conditions 1500bytes@132MB/s = 11uS, practice actually makes this worse as typical PCI does something less than 100MB/s or 15uS) to a 120uS packet time on the wire (again, ideal, but here given that switches, and infact often cut-through switches, are used for these types of things, ideal and practice are very close.) I don't think these folks, nor myself, are wanting^H^H^H^H^H^H^Hilling to give up 12.5%. OK. It seems that repairing the feature, rather than disabling it is the most popular option. Still, I am quite interested in finding anyone who actually measures these things, and is affected by them. These very same people might be able to trace why we get the underruns in the first place. I suspect an interaction between the ATA driver and VIA chipsets, because other than the network, that's all that is operating when I see the underruns. And my Celeron with a ZX chipset is immune. That theory is not correct, I have seen multiple Alpha machines reporting buffer underruns as well. No ATA disk in sight there.. -- Wilko Bulte http://www.freebsd.org "Do, or do not. 
There is no try" [EMAIL PROTECTED] http://www.nlfug.nl Yoda - The Empire Strikes Back To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: dc driver and underruns (was: Strangeness with 4.0-S)
> That theory is not correct, I have seen multiple Alpha machines reporting buffer underruns as well. No ATA disk in sight there..

This has been a reported feature of the tulip chip and alphas (de driver usually) forever forever forever. It's not a bug, per se, IMO.
Re: dc driver and underruns (was: Strangeness with 4.0-S)
[cc: trimmed to -current] Does anyone here actually measure these latencies? I know for a fact that nothing I've ever done would or could be affected by extra latencies that are as small as the ones we are discussing. Does anybody at all depend on the start-transmitting-before-DMA-completed feature we are discussing? I don't like the idea of removing that feature. Perhaps it should be a sysctl or ifconfig option, but it should definitely remain available. Those minute latencies are critical to those of us who use MPI for complex parallel calculations. I have to agree here. The store and forward adds an approximate 11uS (by theory under ideal conditions 1500bytes@132MB/s = 11uS, practice actually makes this worse as typical PCI does something less than 100MB/s or 15uS) to a 120uS packet time on the wire (again, ideal, but here given that switches, and infact often cut-through switches, are used for these types of things, ideal and practice are very close.) I don't think these folks, nor myself, are wanting^H^H^H^H^H^H^Hilling to give up 12.5%. OK. It seems that repairing the feature, rather than disabling it is the most popular option. Still, I am quite interested in finding anyone who actually measures these things, and is affected by them. As already pointed out, anyone running computational code on a compute cluster that is passing data around is directly affected by this. I know of at least 3 sites that converting to store and forward would destroy as far as ``operational'' status goes. They have gone the extra mile to even use cut-through ethernet switches and I can assure you that an 11uS delay per packet would have a significant impact on cluster performance. They don't directly measure these values, but none the less they would have an impact. 
Also for those using dc21x4x cards in high load router and/or firewall situations would notice this, though it would be harder to measure (well, actually a pps test should show it quite clearly, as my above 12.5% was based on full size packets, this becomes a larger percentage as packet size is decreased). These very same people might be able to trace why we get the underruns in the first place. Of the sites I know of they don't get these messages :-). I have noticed that I see them more often with the dc driver than I do with the de driver, ie now that I am upgrading more and more of our systems to 4.x from 3.x I have started to see these on machines that have never reported them before. Now this may be the driver, or it could be some other part of system that has changed. I suspect an interaction between the ATA driver and VIA chipsets, because other than the network, that's all that is operating when I see the underruns. And my Celeron with a ZX chipset is immune. I've seen them on just about everything, chipset doesn't seem to matter, IDE or SCSI doesn't seem to matter. Back to the technical, for a moment. I have verified that stopping the transmitter on the 21143 is both sufficient and necessary to enable the thresholds to be set. I have code that works on my machine. I intend to commit it when I think it looks neat enough. Good. That should help the folks with the major complaint of 2 to 3 second network outages when one of the occur. It may also be possible to simply start out one step further down on the fifo level and eliminate the message for most people. (When I do see these it usually only happens once or maybe twice, then the box is silent about it from then on. I have never seen a box back off to store and forward mode that didn't have some other serious hardware related problem.) 
Getting even more technical, it appears to me that the current driver instructs the 21143 to poll for transmit packets (ie a small DMA) every 80us even if there are none to be sent. I don't know what percentage of bus time this might be, or even how to calculate it (got some time Rod?) I'll have to look at that. If it is a simple 32 bit read every 80uS thats something like .1515% of the PCI bandwidth, something that shouldn't matter much. (I assumed a simple 4 cycle PCI operation). Just how big is this DMA operation every 80uS? -- Rod Grimes - KD7CAX @ CN85sl - (RWG25) [EMAIL PROTECTED] To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
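The .1515% figure above can be reproduced with integer arithmetic. The sketch below assumes, as the message does, one 4-cycle PCI transaction per 80uS poll, plus a 33 MHz bus clock; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Share of bus cycles consumed by a periodic poll, in parts per million.
 *   cycles_per_poll: bus cycles used by one poll transaction
 *   bus_hz:          bus clock in Hz (33000000 for 33 MHz PCI)
 *   interval_us:     time between polls in microseconds
 */
uint64_t poll_bus_share_ppm(uint64_t cycles_per_poll, uint64_t bus_hz,
                            uint64_t interval_us)
{
    /* Bus cycles available between two consecutive polls. */
    uint64_t cycles_per_interval = bus_hz * interval_us / 1000000ULL;

    assert(cycles_per_interval > 0);
    return cycles_per_poll * 1000000ULL / cycles_per_interval;
}
```

For a 4-cycle read every 80uS on a 33 MHz bus there are 2640 cycles per interval, so the poll uses 4/2640 of them: about 1515 ppm, or .1515%.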
dc driver and underruns (was: Strangeness with 4.0-S)
On Monday, 10th July 2000, Stefan Esser wrote:

> On 2000-07-09 20:52 +1000, Stephen McKay [EMAIL PROTECTED] wrote:
>
> > On Saturday, 8th July 2000, Stefan Esser wrote:
> >
> > > Oh, there are renegotiations after each overrun ???
> >
> > The code at the point that an underrun is detected is:
> >
> >     printf("dc%d: TX underrun -- ", sc->dc_unit);
> >     if (DC_IS_DAVICOM(sc) || DC_IS_INTEL(sc))
> >         dc_init(sc);
> >
> > After that, it sets the new threshold, or store and forward mode. That conditional (which resets the DE-500 style cards I own) looks deliberate since it is so specific. Either that, or Bill was being conservative. When I get a chance, I will experiment with removing it.
>
> Well, the DE Driver (DEC 21x4x) has (relevant lines marked ***):
>
> [SNIP: code showing de driver does not reset chip]

I've now read the 21143 chip manual from Intel. What the de driver does is illegal (the transmitter must be idle when the threshold is changed). I don't know if it works in practice; the de driver didn't work well for me. What the dc driver does is overkill.

I will implement some changes, based on the documentation, and see what happens. Of course, Bill, if you have direct experience that contradicts the documentation (as if I've never seen incorrect doco...) then I'm all ears. I also have a very limited range of test hardware.

> I agree, that for chips that need to be completely re-initialized, the default might be store-and-forward ...

There are so many DEC 21x4x clones, all slightly different, and it seems that at least a few need the chip reset. There is already a convenient store-and-forward-only flag that is set for one of the supported chips. I propose that this flag be set on all hardware that cannot have the threshold changed without a reset.

> > It hides the problem very well for me. I really can't see the tiniest of performance loss with store and forward. Maybe it's something that only shows up on benchmarks.
>
> Guess it will show up if you measure latencies (or your application is doing lots of RPCs). But as soon as there is a cheap 100baseT switch in the path to the destination, there will be store-and-forward at work ;-)

Does anyone here actually measure these latencies? I know for a fact that nothing I've ever done would or could be affected by extra latencies that are as small as the ones we are discussing. Does anybody at all depend on the start-transmitting-before-DMA-completed feature we are discussing?

Lastly, some people really want to keep the messages. Is hiding them behind bootverbose enough? Or do I have to add a flag/hint? No, I haven't looked at the new hint system, so I don't know if I should be afraid or not. :-)

Stephen.
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Thu, 13 Jul 2000, Stephen McKay wrote: Guess it will show up if you measure latencies (or your application is doing lots of RPCs). But as soon as there is a cheap 100baseT switch in the path to the destination, there will be store-and-forward at work ;-) Does anyone here actually measure these latencies? I know for a fact that nothing I've ever done would or could be affected by extra latencies that are as small as the ones we are discussing. Does anybody at all depend on the start-transmitting-before-DMA-completed feature we are discussing? I don't like the idea of removing that feature. Perhaps it should be a sysctl or ifconfig option, but it should definitely remain available. Those minute latencies are critical to those of us who use MPI for complex parallel calculations. Brandon D. Valentine -- bandix at looksharp.net | bandix at structbio.vanderbilt.edu "Truth suffers from too much analysis." -- Ancient Fremen Saying To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Thu, 13 Jul 2000, Stephen McKay wrote: Guess it will show up if you measure latencies (or your application is doing lots of RPCs). But as soon as there is a cheap 100baseT switch in the path to the destination, there will be store-and-forward at work ;-) Does anyone here actually measure these latencies? I know for a fact that nothing I've ever done would or could be affected by extra latencies that are as small as the ones we are discussing. Does anybody at all depend on the start-transmitting-before-DMA-completed feature we are discussing? I don't like the idea of removing that feature. Perhaps it should be a sysctl or ifconfig option, but it should definitely remain available. Those minute latencies are critical to those of us who use MPI for complex parallel calculations. I have to agree here. The store and forward adds an approximate 11uS (by theory under ideal conditions 1500bytes@132MB/s = 11uS, practice actually makes this worse as typical PCI does something less than 100MB/s or 15uS) to a 120uS packet time on the wire (again, ideal, but here given that switches, and infact often cut-through switches, are used for these types of things, ideal and practice are very close.) I don't think these folks, nor myself, are wanting^H^H^H^H^H^H^Hilling to give up 12.5%. -- Rod Grimes - KD7CAX @ CN85sl - (RWG25) [EMAIL PROTECTED] To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
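The 11uS, 15uS, 120uS and 12.5% figures above follow from simple rate arithmetic, sketched here in integer form. The function names are hypothetical; the rates are the ones quoted in the message (132MB/s theoretical PCI, 100MB/s typical PCI, 100Mbit/s Ethernet, which is 12.5MB/s).

```c
#include <assert.h>
#include <stdint.h>

/* Time in nanoseconds to move `bytes` at `bytes_per_sec`, rounded down. */
uint64_t transfer_ns(uint64_t bytes, uint64_t bytes_per_sec)
{
    assert(bytes_per_sec > 0);
    return bytes * 1000000000ULL / bytes_per_sec;
}

/*
 * Extra delay of buffering the whole frame before transmitting,
 * expressed in tenths of a percent of the frame's time on the wire.
 */
uint64_t store_forward_overhead_permille(uint64_t dma_ns, uint64_t wire_ns)
{
    assert(wire_ns > 0);
    return dma_ns * 1000ULL / wire_ns;
}
```

A 1500 byte frame takes about 11.4uS to DMA at 132MB/s, 15uS at 100MB/s, and 120uS to send at 100Mbit/s, so the 15uS case adds 125 permille, i.e. 12.5%, to the per-packet time.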
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Thursday, 13th July 2000, "Rodney W. Grimes" wrote: On Thu, 13 Jul 2000, Stephen McKay wrote: Does anyone here actually measure these latencies? I know for a fact that nothing I've ever done would or could be affected by extra latencies that are as small as the ones we are discussing. Does anybody at all depend on the start-transmitting-before-DMA-completed feature we are discussing? I don't like the idea of removing that feature. Perhaps it should be a sysctl or ifconfig option, but it should definitely remain available. Those minute latencies are critical to those of us who use MPI for complex parallel calculations. I have to agree here. The store and forward adds an approximate 11uS (by theory under ideal conditions 1500bytes@132MB/s = 11uS, practice actually makes this worse as typical PCI does something less than 100MB/s or 15uS) to a 120uS packet time on the wire (again, ideal, but here given that switches, and infact often cut-through switches, are used for these types of things, ideal and practice are very close.) I don't think these folks, nor myself, are wanting^H^H^H^H^H^H^Hilling to give up 12.5%. OK. It seems that repairing the feature, rather than disabling it is the most popular option. Still, I am quite interested in finding anyone who actually measures these things, and is affected by them. These very same people might be able to trace why we get the underruns in the first place. I suspect an interaction between the ATA driver and VIA chipsets, because other than the network, that's all that is operating when I see the underruns. And my Celeron with a ZX chipset is immune. Back to the technical, for a moment. I have verified that stopping the transmitter on the 21143 is both sufficient and necessary to enable the thresholds to be set. I have code that works on my machine. I intend to commit it when I think it looks neat enough. 
Getting even more technical, it appears to me that the current driver instructs the 21143 to poll for transmit packets (ie a small DMA) every 80us even if there are none to be sent. I don't know what percentage of bus time this might be, or even how to calculate it (got some time Rod?) but it looks unnecessary to me. I think the transmitter could be turned off regularly. At the moment, the driver leaves it on all the time. And to the non technical: Do the messages go or stay? I've heard both sides. For most people they are just annoying fluff. For those who actually care about the latency, it might be informative, and thus too useful to be hidden behind bootverbose. Opinions? Stephen. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
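The sequence described above (stop the transmitter, change the threshold, restart) can be illustrated against a simulated configuration register. This is a sketch under stated assumptions, not the code referred to in the message: `struct fake_nic`, the `CFG_*` bit layout, and `nic_step_threshold()` are hypothetical stand-ins, and a real driver would poll the chip's status register, with a timeout, until the transmitter reports idle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout for a simulated network configuration register. */
#define CFG_TX_ON        0x00002000u  /* transmitter enabled */
#define CFG_TX_THRESH    0x0000C000u  /* 2-bit transmit threshold level */
#define CFG_STORENFWD    0x00200000u  /* store-and-forward mode */
#define CFG_THRESH_SHIFT 14
#define CFG_THRESH_MAX   3u

/* Simulated chip state; stands in for real register reads and writes. */
struct fake_nic {
    uint32_t cfg;      /* configuration register contents */
    int      tx_idle;  /* 1 while the transmitter is stopped */
};

/* Raise the TX threshold one level, or fall back to store-and-forward. */
void nic_step_threshold(struct fake_nic *nic)
{
    uint32_t was_on, level;

    assert(nic);
    was_on = nic->cfg & CFG_TX_ON;
    level = (nic->cfg & CFG_TX_THRESH) >> CFG_THRESH_SHIFT;

    /* 1. Stop the transmitter; the threshold may only change while idle. */
    nic->cfg &= ~CFG_TX_ON;
    nic->tx_idle = 1;  /* real hardware: wait here until idle is reported */

    /* 2. Change the threshold, or give up and buffer whole frames. */
    if (level < CFG_THRESH_MAX) {
        level++;
        nic->cfg = (nic->cfg & ~CFG_TX_THRESH) | (level << CFG_THRESH_SHIFT);
    } else {
        nic->cfg |= CFG_STORENFWD;
    }

    /* 3. Restart the transmitter if it was running before. */
    if (was_on) {
        nic->cfg |= CFG_TX_ON;
        nic->tx_idle = 0;
    }
}
```

Compared with a full chip re-initialization, this touches only the transmit side, which is why it should avoid the multi-second outage that a reset and renegotiation would cause.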
Re: dc driver and underruns (was: Strangeness with 4.0-S)
On Fri, 14 Jul 2000, Stephen McKay wrote:

> place. I suspect an interaction between the ATA driver and VIA chipsets, because other than the network, that's all that is operating when I see the underruns. And my Celeron with a ZX chipset is immune.

I've noticed this on a VIA chipset machine. It also has ATA drives. It's running 5.0-current from 7/10.

I have an HX chipset machine running -current from 7/10, same card, ATA drives, no error.

A BX chipset machine running -current cooked today, same card, SCSI drives, no error.

Just a few more data points.

Scott