Hi Atmane,

The missing path in the old mmlspdisk output (/dev/sdob) and the path reported in the log file (/dev/sdge) do not match. This may be because the server was rebooted after the old mmlspdisk output was taken. Path names are not guaranteed to persist across reboots.
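Since the /dev/sdX names can change across reboots, one way to compare paths taken at different times is to go through the persistent aliases that udev normally creates. The following is only a minimal sketch, assuming a Linux server where /dev/disk/by-id is populated; the function name persistent_names is hypothetical and is not part of GPFS or GSS.

```python
import os


def persistent_names(by_id_dir="/dev/disk/by-id"):
    """Map each kernel device node (e.g. /dev/sdge) to its persistent aliases.

    Assumption: by_id_dir holds symlinks created by udev that point at the
    current /dev/sdX nodes. Returns an empty dict if the directory is missing.
    """
    mapping = {}
    if not os.path.isdir(by_id_dir):
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if not os.path.islink(link):
            continue
        # Resolve the symlink to the device node it currently points at.
        target = os.path.realpath(link)
        mapping.setdefault(target, []).append(name)
    return mapping
```

Calling persistent_names().get("/dev/sdge", []) before and after a reboot would show whether the same physical disk is still behind that path name.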
The log is reporting a problem with /dev/sdge. You should check whether the OS can see the path /dev/sdge (use lsscsi). If the disk is accessible through the other path, then I don't believe the problem is with the disk.

Thanks,
Sandeep Naik
Elastic Storage server / GPFS Test
ETZ-B, Hinjewadi Pune India
(+91) 8600994314

From: atmane khiredine <[email protected]>
To: "[email protected]" <[email protected]>
Date: 24/10/2017 02:50 PM
Subject: [gpfsug-discuss] GSS GPFS Storage Server show one path for one Disk
Sent by: [email protected]

Dear All,

We own a GSS (GPFS Storage Server, GPFS Native RAID) solution for our HPC system.

Three days ago I noticed that one disk shows only a single path.

My configuration is as follows:

GSS configuration: 4 enclosures, 6 SSDs, 2 empty slots, 238 total disks, 0 NVRAM partitions

If I search with fdisk, I get the following result:

476 disks in GSS0 and GSS1

An old file shows the disk with 2 paths:

cat mmlspdisk.old
#####
replacementPriority = 1000
name = "e3d5s05"
device = "/dev/sdkt,/dev/sdob" << -
recoveryGroup = "BB1RGL"
declusteredArray = "DA2"
state = "ok"
userLocation = "Enclosure 2021-20E-SV25262728 Drawer 5 Slot 5"
userCondition = "normal"
nPaths = 2 active 4 total << - here the disk has 2 paths
#####

ls /dev/sdob
/dev/sdob
ls /dev/sdkt
/dev/sdkt

mmlspdisk all >> mmlspdisk.log
vi mmlspdisk.log
replacementPriority = 1000
name = "e3d5s05"
device = "/dev/sdkt" << --- the disk has 1 path
recoveryGroup = "BB1RGL"
declusteredArray = "DA2"
state = "ok"
userLocation = "Enclosure 2021-20E-SV25262728 Drawer 5 Slot 5"
userCondition = "normal"
nPaths = 1 active 3 total

Here is the result from the log file on GSS1:

grep e3d5s05 /var/adm/ras/mmfs.log.latest
################## START LOG GSS1 #####################
0 result
################# END LOG GSS 1 #####################

Here is the result from the log file on GSS0:

grep e3d5s05 /var/adm/ras/mmfs.log.latest
################# START LOG GSS 0 #####################
Thu Sep 14 16:35:01.619 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 4673959648 length 4112 err 5.
Thu Sep 14 16:35:01.620 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Thu Sep 14 16:35:01.787 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Thu Sep 14 16:35:01.788 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Thu Sep 14 16:35:03.709 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Thu Sep 14 17:53:13.209 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 3658399408 length 4112 err 5.
Thu Sep 14 17:53:13.210 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Thu Sep 14 17:53:15.685 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Thu Sep 14 17:56:10.410 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 796658640 length 4112 err 5.
Thu Sep 14 17:56:10.411 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Thu Sep 14 17:56:10.593 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on write: sector 738304 length 512 err 5.
Thu Sep 14 17:56:11.236 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Thu Sep 14 17:56:11.237 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Thu Sep 14 17:56:13.127 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Thu Sep 14 17:59:14.322 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Thu Sep 14 18:02:16.580 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 00:08:01.464 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 682228176 length 4112 err 5.
Fri Sep 15 00:08:01.465 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Fri Sep 15 00:08:03.391 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Fri Sep 15 00:21:41.785 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 4063038688 length 4112 err 5.
Fri Sep 15 00:21:41.786 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Fri Sep 15 00:21:42.559 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 00:21:42.560 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Fri Sep 15 00:21:44.336 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Fri Sep 15 00:36:11.899 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 2503485424 length 4112 err 5.
Fri Sep 15 00:36:11.900 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Fri Sep 15 00:36:12.676 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 00:36:12.677 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Fri Sep 15 00:36:14.458 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Fri Sep 15 00:40:16.038 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 4113538928 length 4112 err 5.
Fri Sep 15 00:40:16.039 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Fri Sep 15 00:40:16.801 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 00:40:16.802 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Fri Sep 15 00:40:18.307 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Fri Sep 15 00:47:11.468 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 4185195728 length 4112 err 5.
Fri Sep 15 00:47:11.469 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Fri Sep 15 00:47:12.238 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 00:47:12.239 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Fri Sep 15 00:47:13.995 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Fri Sep 15 00:51:01.323 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 1637135520 length 4112 err 5.
Fri Sep 15 00:51:01.324 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Fri Sep 15 00:51:01.486 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 00:51:01.487 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Fri Sep 15 00:51:03.437 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Fri Sep 15 00:55:27.595 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 3646618336 length 4112 err 5.
Fri Sep 15 00:55:27.596 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Fri Sep 15 00:55:27.749 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 00:55:27.750 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Fri Sep 15 00:55:29.675 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Fri Sep 15 00:58:29.900 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 02:15:44.428 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 768931040 length 4112 err 5.
Fri Sep 15 02:15:44.429 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Fri Sep 15 02:15:44.596 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b03 (ACK/NAK timeout).
Fri Sep 15 02:15:44.597 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Fri Sep 15 02:15:46.486 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Fri Sep 15 02:18:46.826 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b03 (ACK/NAK timeout).
Fri Sep 15 02:21:47.317 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 02:24:47.723 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b03 (ACK/NAK timeout).
Fri Sep 15 02:27:48.152 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Fri Sep 15 02:30:48.392 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b03 (ACK/NAK timeout).
Sun Sep 24 15:40:18.434 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 2733386136 length 264 err 5.
Sun Sep 24 15:40:18.435 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Sun Sep 24 15:40:19.326 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Sun Sep 24 15:40:41.619 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on write: sector 3021316920 length 520 err 5.
Sun Sep 24 15:40:41.620 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Sun Sep 24 15:40:42.446 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Sun Sep 24 15:40:57.977 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on read: sector 4939800712 length 264 err 5.
Sun Sep 24 15:40:57.978 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Sun Sep 24 15:40:58.133 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b03 (ACK/NAK timeout).
Sun Sep 24 15:40:58.134 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to error.
Sun Sep 24 15:40:58.984 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
Sun Sep 24 15:44:00.932 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b03 (ACK/NAK timeout).
Sun Sep 24 15:47:02.352 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Sun Sep 24 15:50:03.149 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x28 Read (10): Check Condition: skey=0x0b (aborted command) asc/ascq=0x4b04 (NAK received).
Mon Sep 25 08:31:07.906 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge: I/O error on write: sector 942033152 length 264 err 5.
Mon Sep 25 08:31:07.907 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from ok to diagnosing.
Mon Sep 25 08:31:07.908 2017: [D] Pdisk e3d5s05 of RG BB1RGL path //gss0-ib0/dev/sdge: SCSI op=0x00 Test Unit Ready: Ioctl or RPC Failed: err=19.
Mon Sep 25 08:31:07.909 2017: [D] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge status changed from active to noDevice.
Mon Sep 25 08:31:07.910 2017: [E] Pdisk e3d5s05 of RG BB1RGL path /dev/sdge failed; location 'SV25262728-5-5'.
Mon Sep 25 08:31:08.770 2017: [D] Pdisk e3d5s05 of RG BB1RGL state changed from diagnosing to ok.
################## END LOG #####################

Is it a HW or SW problem?

Thank you,

Atmane Khiredine
HPC System Administrator | Office National de la Météorologie
Tél : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : [email protected]
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=DXkezTwrVXsEOfvoqY7_DLS86P5FtQszjm9zok6upRU&m=QsMCUxg_qSYCs6Joccb2Brey1phAF_tJFrEnVD6LNoc&s=eSulhfhE2jQnmMrmb9_eoomafxb5xI3KL5Y6n3rH5CE&e=
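To see at a glance which path the errors in a log like the one quoted in this thread belong to, the entries can be tallied per path. This is only a rough sketch: the regular expressions are written against the line shapes quoted in this thread, not against any documented mmfs.log format, and the function name summarize is hypothetical.

```python
import re
from collections import Counter

# Assumed line shapes, taken from the log excerpt quoted in this thread.
IO_ERROR_RE = re.compile(
    r"Pdisk (?P<pdisk>\S+) of RG (?P<rg>\S+) path (?P<path>\S+): "
    r"I/O error on (?P<op>read|write):"
)
STATUS_RE = re.compile(
    r"Pdisk (?P<pdisk>\S+) of RG (?P<rg>\S+) path (?P<path>\S+) "
    r"status changed from (?P<old>\w+) to (?P<new>\w+)\."
)


def summarize(lines, pdisk):
    """Count I/O errors per (path, operation) and track the last path status."""
    io_errors = Counter()
    last_status = {}
    for line in lines:
        m = IO_ERROR_RE.search(line)
        if m and m.group("pdisk") == pdisk:
            io_errors[(m.group("path"), m.group("op"))] += 1
            continue
        m = STATUS_RE.search(line)
        if m and m.group("pdisk") == pdisk:
            # Later lines overwrite earlier ones, so the final value is the
            # most recent status reported for that path.
            last_status[m.group("path")] = m.group("new")
    return io_errors, last_status
```

For example, passing the lines of /var/adm/ras/mmfs.log.latest and the pdisk name "e3d5s05" would return the read and write error counts for each path and the last status reported for each path.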
