Not to steer this conversation too far off track, but does a patch exist to enable the Ethernet CRC check within the 10Gb yellow block? When I tested the switch I used Dave MacMahon's CRC generator block, but it would be nice if there were the option to have the 10 gig core use the UDP CRC -- or to get rid of the port that pretends that it's doing this :)
Cheers,
J

On 2 September 2014 17:33, Jason Manley <jman...@ska.ac.za> wrote:
> The SX1012 and SX1036 definitely use the same ASIC series and both suffer
> from the PHY link problem. I've got seven of them here, and tested them with
> 20 different ROACH-2 boards. As Jack points out, some links are worse than
> others and swapping cables and ROACH2 mezzanine cards often sorts the problem
> out. To be honest, the issue is probably on the ROACH-2 side, as we haven't
> tuned the PHY's RX parameters. Mellanox claim to have tested and complied
> with IEEE standards.
>
> To get around this, Mellanox have a "patch" for CASPER users, which adjusts
> the TX drive on the switch so that a bigger signal hits the ROACH2. This
> fixes all the problems, and the links are then very very reliable. They
> actually borrowed a ROACH-2 for the purpose of testing and tuning their PHYs.
> Very kind of them! This tuning was done using their own 3m cables, so you
> should expect the best performance using those copper links. Another option
> is to use AOC cables, which are cost-effective and lossless.
>
> Lincoln, not to alarm you, but are you actually checking for bit errors on
> your SX1024 links? I suspect you're just not noticing any faults. Packets
> don't always get dropped... oftentimes a few bits in the data are just
> corrupted, and if you're looking at correlator noise, you might not even
> notice it! ...the "RX error" line from the 10G core is hard-coded to zero, so
> if this is what you're watching, you'd never know. Likewise the
> ROACH2->switch links work fine, it's only the switch->ROACH2 direction that's
> problematic, so you'd never see the error counters on the switch climbing. We
> (SKA-SA) add checksums to our packets to be sure. Mostly, we were seeing BER
> numbers like 10e-10 on most of the Mellanox kit out-the-box, but a few links
> were bad, with BER ~10e-6. After patching, it's better than 10e-13.
>
> Jason Manley
> CBF Manager
> SKA-SA
>
> Cell: +27 82 662 7726
> Work: +27 21 506 7300
>
> On 02 Sep 2014, at 18:01, Lincoln Greenhill <lgreenh...@cfa.harvard.edu>
> wrote:
>
>> Hi Jack (and John),
>>
>> Interesting report. Difficult to puzzle out
>> differences in our mutual experience unless the
>> 1012 is unlike the 1024 in subtle ways.
>>
>> We (LEDA) did not require an upgrade or Mellanox cables
>> for the 10/40GbE links. Again - I would look
>> forward to discussion people may have with Jonathon
>> (Cc me if not conducted via the Casper list).
>>
>> Best,
>> Lincoln
>>
>> On 9/2/14, 11:54 AM, Jack Hickish wrote:
>>> Hi John,
>>>
>>> As Dan said I've tested (to some extent) the SX1012. I just used one
>>> ROACH2 and corner turned data through 8 x 10GbE ports. It worked well
>>> basically up to line rate, with no CRC errors after a few hours of
>>> operation, but only after
>>>
>>> - Jason put me in touch with some Mellanox guys who provided updated
>>>   firmware
>>> - using branded Mellanox 3m QSFP -> 4xSFP cables
>>>
>>> Before the upgrade transmissions from the switch to the ROACH2 would
>>> frequently fail.
>>>
>>> Of course this may or may not be remotely relevant for the SX1024 :)
>>>
>>> Good luck!
>>>
>>> J
>>>
>>> On 2 September 2014 16:46, Dan Werthimer <d...@ssl.berkeley.edu> wrote:
>>>>
>>>> hi john,
>>>>
>>>> jack hickish tested a mellanox SX1012
>>>> (12x40Gbe, or 48x10Gbe, or mixture),
>>>> when he was visiting berkeley.
>>>>
>>>> the SX1012 worked beautifully on roach2, after jack upgraded
>>>> the switch firmware. and it's a great price. ($6K)
>>>>
>>>> i think jason has also tested this switch.
>>>>
>>>> best wishes,
>>>>
>>>> dan
>>>>
>>>> On Tue, Sep 2, 2014 at 8:36 AM, John Ford <jf...@nrao.edu> wrote:
>>>>>
>>>>> Hi all. Has anyone tested the Mellanox SX-1024 series of switches with
>>>>> ROACH-2 and 10 and/or 40 gb NICs?
>>>>>
>>>>> These switches have 48 10 Gbe ports and 12 40 Gbe ports on them.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> John
>>>>>
>>>>
>>>
>>
>> --
>> Lincoln J. Greenhill     Harvard-Smithsonian CfA
>> Office: 1 617-495-7194   60 Garden St, Mail Stop 42
>> Cell: 1 650 722-7798     Cambridge, MA 02138
>> FAX: 1 617-495-7345      greenh...@cfa.harvard.edu
>> Skype: ljgreenhill       www.cfa.harvard.edu/~lincoln
>>
>
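
Jason's suggestion in the thread above, adding a checksum to each packet so that corrupted (rather than dropped) data is noticed on the receive side, can be sketched in software as follows. This is only an illustration under stated assumptions: it uses a CRC-32 trailer computed with Python's zlib, which is not necessarily the scheme SKA-SA or Dave MacMahon's CRC generator block uses, and the function names (add_checksum, check_packet, flip_bit, seconds_per_bit_error) are hypothetical. The last helper turns a BER figure into a mean time between bit errors, assuming a fully loaded 10 Gb/s link.

```python
import struct
import zlib

# Assumption: a 4-byte big-endian CRC-32 trailer appended to every payload.
CRC_FMT = "!I"
CRC_LEN = struct.calcsize(CRC_FMT)


def add_checksum(payload: bytes) -> bytes:
    """Return the payload with its CRC-32 appended as a trailer."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return payload + struct.pack(CRC_FMT, crc)


def check_packet(packet: bytes) -> bool:
    """Return True if the trailing CRC-32 matches the payload."""
    if len(packet) < CRC_LEN:
        return False
    payload, trailer = packet[:-CRC_LEN], packet[-CRC_LEN:]
    (expected,) = struct.unpack(CRC_FMT, trailer)
    return (zlib.crc32(payload) & 0xFFFFFFFF) == expected


def flip_bit(packet: bytes, bit_index: int) -> bytes:
    """Simulate a single bit error on the link (for testing only)."""
    corrupted = bytearray(packet)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def seconds_per_bit_error(ber: float, line_rate_bps: float = 10e9) -> float:
    """Mean time between bit errors for a given BER at a given line rate."""
    return 1.0 / (ber * line_rate_bps)
```

A receiver would count the packets for which check_packet returns False, instead of relying on the core's "RX error" line. For scale, under the same full-rate assumption, seconds_per_bit_error(1e-6) is about 0.0001 s (roughly ten thousand bit errors per second), while seconds_per_bit_error(1e-13) is about 1000 s.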