Hi All,

We have a Lustre installation in which two OSS nodes run in HA mode. We found that one of them was stonithed. /var/log/messages showed the following errors before it was stonithed:
=================================================================================
Feb 1 10:35:57 oss5 heartbeat: [8336]: WARN: Gmain_timeout_dispatch: Dispatch function for memory stats took too long to execute: 870 ms (> 100 ms) (GSource: 0x1e6c62a8)
Feb 1 10:36:00 oss5 kernel: LustreError: 27684:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:03 oss5 kernel: LustreError: 15913:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:08 oss5 kernel: LustreError: 12380:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:09 oss5 kernel: LustreError: 12261:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:10 oss5 kernel: LustreError: 9713:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:12 oss5 kernel: LustreError: 4114:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:13 oss5 kernel: LustreError: 4092:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:15 oss5 kernel: LustreError: 12398:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:17 oss5 kernel: LustreError: 12283:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:18 oss5 kernel: LustreError: 12325:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:19 oss5 kernel: LustreError: 9752:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:19 oss5 kernel: LustreError: 23057:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:22 oss5 kernel: LustreError: 12428:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:22 oss5 kernel: LustreError: 9679:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:27 oss5 kernel: LustreError: 9686:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:28 oss5 kernel: LustreError: 12385:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:33 oss5 kernel: LustreError: 27687:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:35 oss5 kernel: LustreError: 12264:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:40 oss5 kernel: LustreError: 9784:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:43 oss5 kernel: LustreError: 23117:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:48 oss5 kernel: LustreError: 12265:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:52 oss5 kernel: LustreError: 4103:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:57 oss5 kernel: LustreError: 12415:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:02 oss5 kernel: LustreError: 23132:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:05 oss5 kernel: LustreError: 23100:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:07 oss5 kernel: LustreError: 9714:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:07 oss5 kernel: LustreError: 12429:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:07 oss5 kernel: LustreError: 4090:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:10 oss5 kernel: LustreError: 9773:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:14 oss5 kernel: LustreError: 9781:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:19 oss5 kernel: LustreError: 9752:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:25 oss5 kernel: LustreError: 23082:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:32 oss5 kernel: LustreError: 15927:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:41 oss5 kernel: LustreError: 9761:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:50 oss5 kernel: LustreError: 12382:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:53 oss5 kernel: LustreError: 15925:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:55 oss5 kernel: LustreError: 23102:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:58 oss5 kernel: LustreError: 9732:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:00 oss5 kernel: LustreError: 12442:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:02 oss5 kernel: LustreError: 9658:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:07 oss5 kernel: LustreError: 12342:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:13 oss5 kernel: LustreError: 4108:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:20 oss5 kernel: LustreError: 12271:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:21 oss5 kernel: LustreError: 9683:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:28 oss5 kernel: LustreError: 23118:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:37 oss5 kernel: LustreError: 4124:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:47 oss5 kernel: LustreError: 9670:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:47 oss5 kernel: LustreError: 15920:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:51 oss5 kernel: LustreError: 9768:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:57 oss5 kernel: LustreError: 23058:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:00 oss5 kernel: LustreError: 9708:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:04 oss5 kernel: LustreError: 4115:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:09 oss5 kernel: LustreError: 9771:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:15 oss5 kernel: LustreError: 15926:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:22 oss5 kernel: LustreError: 12411:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:30 oss5 kernel: LustreError: 12438:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:33 oss5 kernel: LustreError: 12266:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:35 oss5 kernel: LustreError: 27698:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:38 oss5 kernel: LustreError: 9776:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:39 oss5 kernel: LustreError: 27691:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:42 oss5 kernel: LustreError: 15931:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:47 oss5 kernel: LustreError: 9780:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:49 oss5 kernel: LustreError: 9753:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:49 oss5 kernel: LustreError: 12378:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:53 oss5 kernel: LustreError: 9766:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:40:00 oss5 kernel: LustreError: 9712:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
=================================================================================

Later on, the other (HA) OSS was also stonithed after logging messages like those above. What could be the problem, and what else should we look at to diagnose it?

Thanks and Regards,
Prithu
