Hello,

Our configuration includes a 1068E SAS controller with two JBODs, each with 23 disks, and we run b118. The controller is patched to the latest LSI firmware. We are experiencing problems under load where the mpt driver throws read timeouts. This happens, for example, when scrubbing the pools, and it occurs on 3 different nodes.

Is there a limit in the mpt driver on how many devices it can handle? For example, we read about a bug fixed in snv_92 where the driver would not work with more than 32 devices. Is there another limit we should be aware of?

Is there any other tuning or optimization you would recommend, such as queue length or timeouts? What is the best way to find out whether we are overwhelming the SAS controller?

LSISAS3801E (1068E chipset)
Firmware: 1.28.02
X86 BIOS: 6.28.00 (2009.02.03)

Under heavy load (e.g. scrub), we get these errors on random drives:

Oct 1 13:32:37 tonas103 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@e,0 (sd11):
Oct 1 13:32:37 tonas103 incomplete read- retrying
Oct 1 13:32:37 tonas103 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@18,0 (sd21):
Oct 1 13:32:37 tonas103 incomplete read- retrying
Oct 1 13:32:37 tonas103 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@1c,0 (sd25):
Oct 1 13:32:37 tonas103 incomplete read- retrying
Oct 1 13:32:37 tonas103 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@9,0 (sd6):
Oct 1 13:32:37 tonas103 incomplete read- retrying

Also, sometimes the above are coupled with what looks like a bus reset:

Oct 1 13:39:51 tonas103 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@21,0 (sd30):
Oct 1 13:39:51 tonas103 incomplete read- retrying
Oct 1 13:40:06 tonas103 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
Oct 1 13:40:06 tonas103 Rev. 8 LSI, Inc. 1068E found.
Oct 1 13:40:06 tonas103 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
Oct 1 13:40:06 tonas103 mpt0 supports power management.
Oct 1 13:40:09 tonas103 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
Oct 1 13:40:09 tonas103 mpt0: IOC Operational.
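
To check whether the retries really are spread across random drives rather than concentrated on a few, something like the following could tally them per device. This is only a minimal sketch: the function name count_retries is hypothetical, and it assumes the two-line message layout seen in the excerpts above (a WARNING header ending in "(sdN):" followed by an "incomplete read- retrying" line). The sample text is copied from those excerpts; on a live system the lines would presumably come from the system log (e.g. /var/adm/messages) instead.

```python
import re
from collections import Counter

# Device name at the end of a WARNING header line, e.g. "(sd11):".
# Controller lines such as "(mpt0):" deliberately do not match.
DEVICE_RE = re.compile(r"\((sd\d+)\):\s*$")
RETRY_TEXT = "incomplete read- retrying"


def count_retries(lines):
    """Count 'incomplete read- retrying' messages per sd device.

    Assumes each retry message directly follows the header line
    that names the device, as in the log excerpts in this post.
    """
    counts = Counter()
    last_device = None
    for line in lines:
        line = line.rstrip("\n")
        match = DEVICE_RE.search(line)
        if match:
            # Remember the device until we see the line that follows it.
            last_device = match.group(1)
            continue
        if RETRY_TEXT in line and last_device is not None:
            counts[last_device] += 1
        # Any non-header line ends the association with the last device.
        last_device = None
    return counts


# Sample lines copied from the excerpts in this post.
SAMPLE = """\
Oct 1 13:32:37 tonas103 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@e,0 (sd11):
Oct 1 13:32:37 tonas103 incomplete read- retrying
Oct 1 13:32:37 tonas103 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@18,0 (sd21):
Oct 1 13:32:37 tonas103 incomplete read- retrying
Oct 1 13:40:06 tonas103 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
Oct 1 13:40:06 tonas103 Rev. 8 LSI, Inc. 1068E found.
"""

for device, n in count_retries(SAMPLE.splitlines()).most_common():
    print(f"{device}\t{n}")
```

An even spread of counts across many devices would be consistent with a controller- or path-level problem, while a handful of devices dominating would point more toward individual disks or cabling; the tally only narrows the question and does not answer it on its own.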

And this combination appears as well:

Oct 1 14:04:45 tonas103 scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
Oct 1 14:04:45 tonas103 mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31110b00
Oct 1 14:04:45 tonas103 scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
Oct 1 14:04:45 tonas103 mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31110b00
Oct 1 14:04:45 tonas103 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@21,0 (sd30):
Oct 1 14:04:45 tonas103 incomplete read- retrying

--
This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
