Dear all,

After switching to newer backplane revisions we have a more stable system, but 
it is still far from perfect. 
We are using the following backplane from Supermicro: 

BPN-SAS-836EL2                  
3U SAS / SATA Expander Backplane with dual LSI SASX28 Expander Chips

in combination with 1x AOC-USAS-L8I HBA (based on the LSI 1068E). 
Per system we use 12-14 x Western Digital RE3 1002FBYS SATA disks + 1x MLC SSD 
+ 1x SLC SSD. 

OS: Nexenta Core Platform 3.0.1 (snv 134) 

After upgrading the LSI controller to firmware 1.30.00, the SSDs give fewer 
errors on the console. 

[b]The problem is that this setup still produces SMP PHY control link resets 
under heavy load (scrub etc.).[/b]


Supermicro's support is worthless, so we discussed this problem with Western 
Digital. After a couple of weeks we received the following answer:




[i]Firstly from discovery info, PHY identifier 0x18 is destination SAS address 
ending 9C:5A:58 – a virtualized address for a SATA device. A write command tag 
0x1C was sent earlier (off the top of the trace but it was WFPDMA queued LBA 
0x13F69E39 for 0x100 sectors sent around 3.6ms before the trigger point). The 
drive signals it’s ready for transfer and the host begins to send data.

 
[b]http://www.boeri.be/trace.jpeg[/b]
 

Part way through the transfer, the drive begins sending HOLDS to the host. This 
continues for around 2.2ms until the HOST sends an SMP PHY control link reset 
request for the PHY identifier 0x18 (associated with the SATA device that is 
sending HOLDs) causing the expander to send BREAK. I’ve just shown host -> 
expander link 3->4 in the above – traffic is continuing on links 1->2 to 
another device. The HOST sends the SMP PHY control request via links 1->2 so 
you don’t see it above, but it’s around 40us before the expander responds with 
the BREAK.

 

So it seems to me there’s a possibility that if a drive gets in a state where 
it needs to send HOLDS for a period of time, the host may have some kind of 
timeout – do you have a view as to why this would be happening?
[/i]
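
To make the numbers in this analysis easier to check, here is a small Python 
sketch that decodes the queued write from the trace and models the suspected 
host-side timeout. It is only an illustration under assumptions: the 512-byte 
sector size, the QueuedWrite class, the host_would_reset function and the 
timeout values passed to it are all hypothetical. The real timeout used by the 
HBA firmware or driver (if there is one) is not known to us.

```python
from dataclasses import dataclass

# Assumption: 512-byte logical sectors (not stated in the trace analysis).
SECTOR_BYTES = 512


@dataclass(frozen=True)
class QueuedWrite:
    """The queued write described in the trace (hypothetical helper class)."""
    tag: int
    lba: int
    sectors: int

    @property
    def byte_offset(self) -> int:
        # Position of the first written byte on the disk.
        return self.lba * SECTOR_BYTES

    @property
    def length_bytes(self) -> int:
        # Total amount of data the host has to transfer.
        return self.sectors * SECTOR_BYTES


def host_would_reset(hold_duration_us: float, host_timeout_us: float) -> bool:
    """Hypothetical model: the host resets the PHY once the drive has been
    sending HOLDs for at least host_timeout_us microseconds."""
    return hold_duration_us >= host_timeout_us


# Values taken from the Western Digital analysis above.
write = QueuedWrite(tag=0x1C, lba=0x13F69E39, sectors=0x100)
HOLD_DURATION_US = 2200.0   # drive sent HOLDs for around 2.2 ms
BREAK_DELAY_US = 40.0       # expander sent BREAK ~40 us after the SMP request

if __name__ == "__main__":
    print(f"tag 0x{write.tag:02X}: LBA {write.lba}, "
          f"{write.sectors} sectors = {write.length_bytes // 1024} KiB")
    # Try a few made-up timeouts to see which would explain the reset.
    for timeout_us in (1000.0, 2000.0, 5000.0):
        print(f"timeout {timeout_us:.0f} us -> reset: "
              f"{host_would_reset(HOLD_DURATION_US, timeout_us)}")
```

If the timeout theory is right, any host-side limit at or below roughly 2.2 ms 
would explain the reset seen in the trace, while a longer limit would not.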

Has anyone had the same experience? Is there really a problem with the mpt_sas 
driver? Could we switch to Adaptec HBAs, and are they any better? I think the 
newer StorageTek HBAs are based on the Adaptec 5805? 
Point-to-point SATA links are not an ideal solution for us, and neither are SAS 
drives. We would like to keep the backplane solution. Thanks.
-- 
This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
