The SX1012 and SX1036 definitely use the same ASIC series and both suffer from 
the PHY link problem. I've got seven of them here, and have tested them with 20 
different ROACH-2 boards. As Jack points out, some links are worse than others, 
and swapping cables and ROACH-2 mezzanine cards often sorts the problem out. To 
be honest, the issue is probably on the ROACH-2 side, as we haven't tuned the 
PHY's RX parameters. Mellanox claim to have tested against, and to comply with, 
the IEEE standards.

To get around this, Mellanox have a "patch" for CASPER users, which adjusts the 
TX drive on the switch so that a bigger signal hits the ROACH-2. This fixes all 
the problems, and the links are then very reliable. They actually borrowed 
a ROACH-2 for the purpose of testing and tuning their PHYs. Very kind of them! 
This tuning was done using their own 3m cables, so you should expect the best 
performance from those copper links. Another option is to use active optical 
cables (AOCs), which are cost-effective and lossless.

Lincoln, not to alarm you, but are you actually checking for bit errors on your 
SX1024 links? I suspect you're just not noticing any faults. Packets don't 
always get dropped; oftentimes a few bits in the data are just corrupted, and 
if you're looking at correlator noise, you might not even notice it! The "RX 
error" line from the 10G core is hard-coded to zero, so if that is what you're 
watching, you'd never know. Likewise, the ROACH-2->switch links work fine; it's 
only the switch->ROACH-2 direction that's problematic, so you'd never see the 
error counters on the switch climbing. We (SKA-SA) add checksums to our packets 
to be sure. Out of the box, we were seeing BERs of around 10^-10 on most of the 
Mellanox kit, but a few links were bad, with BER ~10^-6. After 
patching, it's better than 10^-13.
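
For anyone who wants to try the same kind of check, here is a minimal Python 
sketch of the idea. It is not our actual implementation: the CRC-32 trailer, 
the function names and the 10 Gb/s rate constant are all assumptions made for 
illustration. It appends a checksum to each payload, verifies it on receive, 
counts bit errors against a known test pattern, and converts a BER into a mean 
time between bit errors.

```python
import struct
import zlib

# Assumed line rate for illustration only (10GbE, ignoring protocol overhead).
LINE_RATE_BPS = 10e9


def add_checksum(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a 4-byte big-endian trailer."""
    return payload + struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)


def check_packet(packet: bytes) -> bool:
    """Return True if the trailing CRC-32 matches the payload."""
    if len(packet) < 4:
        return False
    payload, trailer = packet[:-4], packet[-4:]
    (expected,) = struct.unpack(">I", trailer)
    return (zlib.crc32(payload) & 0xFFFFFFFF) == expected


def count_bit_errors(sent: bytes, received: bytes) -> int:
    """Count differing bits between two equal-length buffers (known test pattern)."""
    if len(sent) != len(received):
        raise ValueError("buffers must be the same length")
    return sum(bin(a ^ b).count("1") for a, b in zip(sent, received))


def mean_seconds_between_errors(ber: float, rate_bps: float = LINE_RATE_BPS) -> float:
    """Average time between bit errors for a given BER at a given bit rate."""
    return 1.0 / (ber * rate_bps)


if __name__ == "__main__":
    packet = add_checksum(b"correlator data")
    print("clean packet ok:", check_packet(packet))

    # Flip a single bit in the payload to mimic a corrupted (not dropped) packet.
    corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]
    print("corrupted packet ok:", check_packet(corrupted))

    for ber in (1e-6, 1e-10, 1e-13):
        print(f"BER {ber:g}: one bit error every "
              f"{mean_seconds_between_errors(ber):g} s on average")
```

Assuming a fully loaded 10 Gb/s link, a BER of 1e-10 works out to roughly one 
bit error per second, 1e-6 to about ten thousand per second, and 1e-13 to 
about one every 1000 seconds, which is why a short test can easily miss the 
problem on the better links.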

Jason Manley
CBF Manager
SKA-SA

Cell: +27 82 662 7726
Work: +27 21 506 7300

On 02 Sep 2014, at 18:01, Lincoln Greenhill <lgreenh...@cfa.harvard.edu> wrote:

> Hi Jack (and John),
> 
> Interesting report.  Difficult to puzzle out
> differences in our mutual experience unless the
> 1012 is unlike the 1024 in subtle ways.
> 
> We (LEDA) did not require an upgrade or Mellanox cables
> for the 10/40GbE links.  Again - I would look
> forward to discussion people may have with Jonathon
> (Cc me if not conducted via the Casper list).
> 
> Best,
> Lincoln
> 
> On 9/2/14, 11:54 AM, Jack Hickish wrote:
>> Hi John,
>> 
>> As Dan said I've tested (to some extent) the SX1012. I just used one
>> ROACH2 and corner
>> turned data through 8 x 10GbE ports. It worked well basically up to
>> line rate, with no CRC errors after a few hours of operation, but only
>> after
>> 
>> - Jason put me in touch with some Mellanox guys who provided updated firmware
>> - using branded Mellanox 3m QSFP -> 4xSFP cables
>> 
>> Before the upgrade transmissions from the switch to the ROACH2 would
>> frequently fail.
>> 
>> Of course this may or may not be remotely relevant for the SX1024 :)
>> 
>> Good luck!
>> 
>> J
>> 
>> On 2 September 2014 16:46, Dan Werthimer <d...@ssl.berkeley.edu> wrote:
>>> 
>>> hi john,
>>> 
>>> jack hickish tested a mellanox SX1012
>>> (12x40Gbe, or 48x10Gbe, or mixture),
>>> when he was visiting berkeley.
>>> 
>>> the SX1012 worked beautifully on roach2, after jack upgraded
>>> the switch firmware.   and it's a great price.  ($6K)
>>> 
>>> i think jason has also tested this switch.
>>> 
>>> best wishes,
>>> 
>>> dan
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Sep 2, 2014 at 8:36 AM, John Ford <jf...@nrao.edu> wrote:
>>>> 
>>>> Hi all.  Has anyone tested the Mellanox SX-1024 series of switches with
>>>> ROACH-2 and 10 and/or 40 gb NICs?
>>>> 
>>>> These switches have 48 10 Gbe ports and 12 40 Gbe ports on them.
>>>> 
>>>> Thanks!
>>>> 
>>>> John
>>>> 
>>>> 
>>>> 
>>> 
>> 
> 
> -- 
> Lincoln J. Greenhill               Harvard-Smithsonian CfA
> Office:     1 617-495-7194         60 Garden St, Mail Stop 42
> Cell:       1 650 722-7798         Cambridge, MA 02138
> FAX:        1 617-495-7345         greenh...@cfa.harvard.edu
> Skype:      ljgreenhill            www.cfa.harvard.edu/~lincoln
> 

