Just an update for anyone who may care... We found that our troubles were not bee2 related at all; rather, the cable that connected our host to the switch was bad. We found this by connecting the host to other 10 gbe hosts through the switch and testing without the bee.
What made it seem like the switch was causing it was that we didn't use the "bad" cable when connecting the host directly to the bee, but we used it when connecting the host to the switch. The bad cable was a gore 10 gbe cable. One just like it worked fine. I guess there's something wrong inside it.

John

> I don't think the NIC-switch cable length makes much difference. I've
> used everything from 0.5m to 5m and they all worked fine. These ports
> have some form of auto-equalisation which seems to work very well.
>
> Jason
>
> On 25 Nov 2009, at 15:59, John Ford wrote:
>
>>> Matt and I have had problems with BEE2 - switch with short cables.
>>> The lanes lose sync, and then take a fixed time to resync. Data is
>>> lost during this process, hence the losses are higher at higher
>>> datarates. With lower rates, it is able to buffer the data and
>>> resume without significant losses.
>>>
>>> FWIW, with default PHY settings (max preemph and swing), my BEE2
>>> cable length preferences are as follows:
>>> BEE2 - IBOB: 1m
>>> BEE2 - switch: 3m
>>> BEE2 - Myricom NIC: anything from 1m to 3m.
>>
>> I have a variety of copper cables I can try with various lengths.
>> We'll see how that goes! It never occurred to me that the cable
>> could be too short! We also have a really short cable on the host
>> to switch ports, so I'll try adding to that one as well.
>>
>> On another front, at the SC09 show last week we tried to use a zarlink
>> fiber-optic cable, and found that the myricom card would not drive
>> it. It worked fine between 2 ports on a ROACH. The network guy there
>> said he's seen that before, where a NIC couldn't drive enough power
>> to supply a FO transceiver over CX-4.
>>
>>>
>>> I believe we can improve the range of cable lengths that work, but
>>> you'll have to tweak the PHY's preemph and swing settings.
>>> Unfortunately, on the V2Ps, this must be done at compile time. ROACH
>>> lets you do this at runtime, which aids tuning.
>>
>> Sounds a lot nicer.
>>
>>>
>>> I would be surprised to find that the switch is causing the
>>> problem. I am currently pumping 8Gbps on each of 8 ports on my
>>> XG2000C without a single dropped packet.
>>
>> That's good to hear, as we are depending on that to be true...
>>
>> John
>>
>>>
>>> Jason
>>>
>>> On 25 Nov 2009, at 02:00, Paul Demorest wrote:
>>>
>>>> Hey guys,
>>>>
>>>> Jumbo frames are already enabled on the switch. The packet loss
>>>> we're seeing is about 0.1%, so most of the data is making it
>>>> through. The weird thing is that this number seems independent of
>>>> the data rate. Even at <1 MB/s we still lose 0.1% of the data.
>>>>
>>>> As John already said, connecting the bee2 directly to the receiving
>>>> computer results in zero packet loss at low data rates. We do lose
>>>> some if the data rate is set much higher (>~400 MB/s depending on
>>>> the data processing, system load, etc). But that kind of thing is
>>>> expected..
>>>>
>>>> -Paul
>>>>
>>>> On Tue, 24 Nov 2009, Peter McMahon wrote:
>>>>
>>>>> Page 3 of Jason's switch memo at
>>>>> http://casper.berkeley.edu/memos/switch_configuration.pdf has the
>>>>> commands to enable jumbo frame on some of the Fujitsu switches.
>>>>>
>>>>> Peter
>>>>>
>>>>> -----Original Message-----
>>>>> From: [email protected]
>>>>> [mailto:[email protected]] On Behalf Of Dan Werthimer
>>>>> Sent: Tuesday, November 24, 2009 3:34 PM
>>>>> To: John Ford
>>>>> Cc: [email protected]
>>>>> Subject: Re: [casper] 10 GBe Switch weirdness
>>>>>
>>>>>
>>>>> hi john,
>>>>>
>>>>> i vaguely remember that there's a command you need to issue
>>>>> to the fujitsu switches to enable jumbo packets ??
>>>>>
>>>>> jason will know.
>>>>>
>>>>> dan
>>>>>
>>>>>
>>>>>
>>>>> On 11/24/2009 02:01 PM, John Ford wrote:
>>>>>> Hi all. We've been running GUPPI for some time now with a direct
>>>>>> connection from our bee2, "bee2" to our host, "beef". Works fine,
>>>>>> no dropped packets, etc. Life's fine.
>>>>>>
>>>>>> To build our next machine, GUPPI-2, we decided to insert a Fujitsu
>>>>>> XG-2000C switch between them, and now we are losing packets. Same
>>>>>> hardware, same software, except the switch is now between the bee2
>>>>>> and the host. We have very short cables on all the ports. 1 meter
>>>>>> or 0.5 meter. Changing cables had no effect. Changing switch ports
>>>>>> had no effect. The ports are set to use jumbo packets. Moving the
>>>>>> cable from the switch back to the host fixes the problems, and it
>>>>>> all magically works again.
>>>>>>
>>>>>> Other hosts on the 10 gbe network do not seem to suffer from
>>>>>> packet loss. The switch's monitoring screens do not show any
>>>>>> packet loss in the switch.
>>>>>>
>>>>>> Does the switch do anything to the timing of the packets? Any
>>>>>> ideas about incompatible NIC's? They are all myricom pcie cards.
>>>>>>
>>>>>> Any other ideas?
>>>>>>
>>>>>> John
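
For anyone wanting to repeat the kind of loss measurement discussed in this thread (about 0.1% lost, apparently independent of data rate), here is a minimal sketch of how a loss fraction could be computed on the receiving host. It assumes each packet carries a monotonically increasing integer sequence number that the receiver records; that assumption, and the name `loss_fraction`, are illustrative only and are not taken from the GUPPI software.

```python
def loss_fraction(seq_numbers):
    """Estimate packet loss from the sequence numbers that actually arrived.

    Assumption (hypothetical, not from the thread): the sender stamps every
    packet with a counter that increases by 1 and does not wrap.
    Returns (lost, expected, fraction).
    """
    # Duplicates and out-of-order arrival should not count as loss.
    seen = sorted(set(seq_numbers))
    if not seen:
        return (0, 0, 0.0)

    # Packets expected between the first and last sequence number received.
    expected = seen[-1] - seen[0] + 1
    lost = expected - len(seen)
    return (lost, expected, lost / expected)


if __name__ == "__main__":
    # Simulated capture: 1000 packets sent, packet 500 never arrived.
    received = [n for n in range(1000) if n != 500]
    lost, expected, frac = loss_fraction(received)
    print(f"lost {lost} of {expected} packets ({frac:.3%})")
```

Running the same measurement at several data rates, and over each cable and path in turn, gives numbers that can be compared directly. Note that this sketch cannot see packets lost before the first or after the last one received, and it does not handle counter wraparound.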

