Adding storage-discuss@ and zfs-discuss@ as well.

On 04/02/2010 16:33, Robert Milkowski wrote:
> Hi,
>
> S10, SC3.2 + patches, Generic_142900-03, 2x T5220 with QLE2462 connected to
> 6540s.
>
> We started to observe the messages below yesterday on both nodes at the same
> time, after several weeks of running:
>
> <pre>
> XXX cl_runtime: [ID 856360 kern.warning] WARNING: QUORUM_GENERIC:
> quorum_read_keys error: Reading the registration keys failed on quorum device
> /dev/did/rdsk/d7s2 with error 22.
> XXX cl_runtime: [ID 868277 kern.warning] WARNING: CMM: Erstwhile online
> quorum device /dev/did/rdsk/d7s2 (qid 1) is inaccessible now.
>
> d7 is a quorum device and it was marked by the cluster as offline:
>
> # clq status
>
> === Cluster Quorum ===
>
> --- Quorum Votes Summary from latest node reconfiguration ---
>
>             Needed   Present   Possible
>             ------   -------   --------
>             2        3         3
>
>
> --- Quorum Votes by Node (current status) ---
>
> Node Name         Present   Possible   Status
> ---------         -------   --------   ------
> XXXXXXXXXXXXXXX   1         1          Online
> YYYYYYYYYYYYYYY   1         1          Online
>
>
> --- Quorum Votes by Device (current status) ---
>
> Device Name   Present   Possible   Status
> -----------   -------   --------   ------
> d7            0         1          Offline
>
>
>
> By looking at the source code I found that the above message is printed from
> within quorum_device_generic_impl::quorum_read_keys() and it will only happen
> if quorum_pgre_key_read() returns with return code 22 (actually any value
> other than 0 or EACCES, but we already know that the rc is 22 from the syslog
> message).
>
> Now quorum_pgre_key_read() calls quorum_scsi_sector_read() and passes its
> return code on as its own.
> quorum_scsi_sector_read() can return an error if
> quorum_ioctl_with_retries() returns an error or if there is a checksum
> mismatch.
>
> This is the relevant source code:
>
> int
> quorum_scsi_sector_read(
> [...]
>         error = quorum_ioctl_with_retries(vnode_ptr, USCSICMD,
>             (intptr_t)&ucmd, &retval);
>         if (error != 0) {
>                 CMM_TRACE(("quorum_scsi_sector_read: ioctl USCSICMD "
>                     "returned error (%d).\n", error));
>                 kmem_free(ucmd.uscsi_rqbuf, (size_t)SENSE_LENGTH);
>                 return (error);
>         }
>
>         //
>         // Calculate and compare the checksum if check_data is true.
>         // Also, validate the pgres_id string at the beg of the sector.
>         //
>         if (check_data) {
>                 PGRE_CALCCHKSUM(chksum, sector, iptr);
>
>                 // Compare the checksum.
>                 if (PGRE_GETCHKSUM(sector) != chksum) {
>                         CMM_TRACE(("quorum_scsi_sector_read: "
>                             "checksum mismatch.\n"));
>                         kmem_free(ucmd.uscsi_rqbuf, (size_t)SENSE_LENGTH);
>                         return (EINVAL);
>                 }
>
>                 //
>                 // Validate the PGRE string at the beg of the sector.
>                 // It should contain PGRE_ID_LEAD_STRING[1|2].
>                 //
>                 if ((os::strncmp((char *)sector->pgres_id,
>                     PGRE_ID_LEAD_STRING1,
>                     strlen(PGRE_ID_LEAD_STRING1)) != 0) &&
>                     (os::strncmp((char *)sector->pgres_id,
>                     PGRE_ID_LEAD_STRING2,
>                     strlen(PGRE_ID_LEAD_STRING2)) != 0)) {
>                         CMM_TRACE(("quorum_scsi_sector_read: pgre id "
>                             "mismatch. The sector id is %s.\n",
>                             sector->pgres_id));
>                         kmem_free(ucmd.uscsi_rqbuf, (size_t)SENSE_LENGTH);
>                         return (EINVAL);
>                 }
>
>         }
>         kmem_free(ucmd.uscsi_rqbuf, (size_t)SENSE_LENGTH);
>
>         return (error);
> }
>
>
>
>  56 -> __1cXquorum_scsi_sector_read6FpnFvnode_LpnLpgre_sector_b_i_ 6308555744942019 enter
>  56 -> __1cZquorum_ioctl_with_retries6FpnFvnode_ilpi_i_ 6308555744957176 enter
>  56 <- __1cZquorum_ioctl_with_retries6FpnFvnode_ilpi_i_ 6308555745089857 rc: 0
>  56 -> __1cNdbg_print_bufIdbprintf6MpcE_v_ 6308555745108310 enter
>  56 -> __1cNdbg_print_bufLdbprintf_va6Mbpcrpv_v_ 6308555745120941 enter
>  56 -> __1cCosHsprintf6FpcpkcE_v_ 6308555745134231 enter
>  56 <- __1cCosHsprintf6FpcpkcE_v_ 6308555745148729 rc: 2890607504684
>  56 <- __1cNdbg_print_bufLdbprintf_va6Mbpcrpv_v_ 6308555745162898 rc: 1886718112
>  56 <- __1cNdbg_print_bufIdbprintf6MpcE_v_ 6308555745175529 rc: 1886718112
>  56 <- __1cXquorum_scsi_sector_read6FpnFvnode_LpnLpgre_sector_b_i_ 6308555745188599 rc: 22
>
> From the above output we know that quorum_ioctl_with_retries() returns 0,
> so it must be a checksum mismatch!
> As CMM_TRACE() is being called above and there are two of them in the code,
> let's check which one it is:
>
>  21 -> __1cNdbg_print_bufIdbprintf6MpcE_v_ 6309628794339298 CMM_DEBUG: quorum_scsi_sector_read: checksum mismatch.
>
>
> So this is where it fails:
>
>         if (check_data) {
>                 PGRE_CALCCHKSUM(chksum, sector, iptr);
>
>                 // Compare the checksum.
>                 if (PGRE_GETCHKSUM(sector) != chksum) {
>                         CMM_TRACE(("quorum_scsi_sector_read: "
>                             "checksum mismatch.\n"));
>                         kmem_free(ucmd.uscsi_rqbuf, (size_t)SENSE_LENGTH);
>                         return (EINVAL);
>                 }
>
>
>
> By adding another quorum device, then removing d7 and adding it again (and
> removing the extra one), everything came back to normal. However, I wonder how
> we ended up there. HBA? Firmware? 6540's firmware? SC bug?
>
> # fcinfo hba-port -l
> HBA Port WWN: 2100001b3291014c
>         OS Device Name: /dev/cfg/c2
>         Manufacturer: QLogic Corp.
>         Model: 375-3356-02
>         Firmware Version: 05.01.00
>         FCode/BIOS Version: BIOS: 2.10; fcode: 2.4; EFI: 2.4;
>         Serial Number: 0402R00-0927731201
>         Driver Name: qlc
>         Driver Version: 20090519-2.31
>         Type: N-port
>         State: online
>         Supported Speeds: 1Gb 2Gb 4Gb
>         Current Speed: 4Gb
>         Node WWN: 2000001b3291014c
>         Link Error Statistics:
>                 Link Failure Count: 0
>                 Loss of Sync Count: 0
>                 Loss of Signal Count: 0
>                 Primitive Seq Protocol Error Count: 0
>                 Invalid Tx Word Count: 0
>                 Invalid CRC Count: 0
> HBA Port WWN: 2101001b32b1014c
>         OS Device Name: /dev/cfg/c3
>         Manufacturer: QLogic Corp.
>         Model: 375-3356-02
>         Firmware Version: 05.01.00
>         FCode/BIOS Version: BIOS: 2.10; fcode: 2.4; EFI: 2.4;
>         Serial Number: 0402R00-0927731201
>         Driver Name: qlc
>         Driver Version: 20090519-2.31
>         Type: N-port
>         State: online
>         Supported Speeds: 1Gb 2Gb 4Gb
>         Current Speed: 4Gb
>         Node WWN: 2001001b32b1014c
>         Link Error Statistics:
>                 Link Failure Count: 0
>                 Loss of Sync Count: 0
>                 Loss of Signal Count: 0
>                 Primitive Seq Protocol Error Count: 0
>                 Invalid Tx Word Count: 0
>                 Invalid CRC Count: 0
>
>
> 142084-02 is applied, and at a quick glance I can't see anything related to
> the above that might be addressed by 142084-03.
>
> Each 6540 presents one 2TB LUN and we are using ZFS to mirror between them.
> One of the LUNs is used as the quorum device as well.
> Since it looks like the quorum data was corrupted, the pool itself might be
> affected as well, so I ran a scrub. After a couple of hours this is what I
> have so far:
>
> # zpool status -v XXXX
>   pool: XXXX
>  state: DEGRADED
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub in progress for 2h29m, 56.94% done, 1h52m to go
> config:
>
>         NAME                                       STATE     READ WRITE CKSUM
>         XXXX                                       DEGRADED     0     0    14
>           mirror                                   DEGRADED     0     0    28
>             c4t600A0B800029AF0000006CD4486B3B05d0  DEGRADED     0     0    28  too many errors
>             c4t600A0B800029B74600004255486B6A4Fd0  DEGRADED     0     0    28  too many errors
>
> errors: Permanent errors have been detected in the following files:
>
>         /XXXX/XXXX/XXXXXXXX/YYYYYY.dbf
>
>
> I can't see any other errors in the system, in the logs, or from FMA. The HBA
> firmware seems to be the latest version as well.
>
> Because of the corruption within the ZFS pool, I think that while the issue
> first manifested itself as a problem with the quorum device, it probably has
> nothing to do with SC itself, and data corruption is happening somewhere.
> The other interesting thing is that so far all the corrupted blocks detected
> by ZFS were corrupted on both sides of the mirror. Since each side is a
> separate disk array, I think the corruption most likely originated on the
> server itself rather than on the SAN or the disk arrays. Now, the HBA is a
> dual-ported card and both paths are used (MPxIO). The issue is also probably
> not caused by ZFS itself, as it shouldn't have affected the SC keys on the
> quorum device.
>
>
> Any ideas?
> </pre>
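
For anyone following the reasoning above without the cluster source at hand, here is a minimal user-space sketch in C of the control flow being described: an ioctl error is passed through unchanged, while an ioctl return of 0 combined with a failed checksum comparison yields EINVAL (error 22 on Solaris). Everything named sketch_* is hypothetical and exists only for illustration. In particular, the XOR checksum and the sector layout are placeholders and are not the real PGRE_CALCCHKSUM or PGRE sector format, and the pgres_id string check is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sector layout: payload words followed by a stored checksum. */
#define SKETCH_WORDS 127

struct sketch_sector {
	uint32_t data[SKETCH_WORDS];
	uint32_t chksum;
};

/* Placeholder checksum (XOR of the payload words); NOT the real macro. */
static uint32_t
sketch_calc_chksum(const struct sketch_sector *s)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < SKETCH_WORDS; i++)
		sum ^= s->data[i];
	return (sum);
}

/*
 * Mirrors the quoted control flow: ioctl_rc stands in for the return
 * value of quorum_ioctl_with_retries(). A non-zero value is returned
 * as-is; otherwise a checksum mismatch produces EINVAL.
 */
static int
sketch_sector_read(int ioctl_rc, int check_data,
    const struct sketch_sector *s)
{
	if (ioctl_rc != 0)
		return (ioctl_rc);
	if (check_data && s->chksum != sketch_calc_chksum(s))
		return (EINVAL);
	return (0);
}

/* Exercises the three outcomes discussed in the thread. */
static void
sketch_self_check(void)
{
	struct sketch_sector s = { {0}, 0 };

	s.data[0] = 0xdeadbeefu;
	s.chksum = sketch_calc_chksum(&s);
	assert(sketch_sector_read(0, 1, &s) == 0);	/* intact sector */

	s.data[5] ^= 1u;				/* flip one bit */
	assert(sketch_sector_read(0, 1, &s) == EINVAL);	/* mismatch */
	assert(sketch_sector_read(0, 0, &s) == 0);	/* check skipped */
	assert(sketch_sector_read(EIO, 1, &s) == EIO);	/* ioctl error */
}
```

Calling sketch_self_check() runs all three cases. The case that matches the DTrace output above is the second one: the read itself succeeds (rc 0 from the ioctl), a single flipped bit in the payload makes the comparison fail, and the caller sees 22.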