The logs you attached start sometime after the issue began. To tell what happened, you need to find the error in the logs from before you started getting these errors:

Feb 5 04:03:13 oss1 kernel: LustreError: 9222:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30

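As a rough illustration of that search (a sketch only, not something posted in this thread), the Python below scans syslog lines for the first "error starting transaction" message and returns the lines that came just before it, which is where the initial error would be expected. The names SYMPTOM, context_before_first_symptom, and describe_rc are hypothetical helpers invented for this example. It also assumes the kernel's rc value follows Linux errno numbering, in which 30 is EROFS (read-only file system).

```python
import errno
import os
import re
from collections import deque

# Matches the repeated symptom line quoted in this thread and captures rc.
SYMPTOM = re.compile(r"error starting transaction: rc = (-?\d+)")


def context_before_first_symptom(lines, context=20):
    """Return (preceding_lines, symptom_line, rc) for the first symptom line.

    `lines` is any iterable of log lines, for example an open file handle.
    Returns ([], None, None) if no symptom line is present.
    """
    recent = deque(maxlen=context)  # keeps only the last `context` lines
    for line in lines:
        line = line.rstrip("\n")
        match = SYMPTOM.search(line)
        if match:
            return list(recent), line, int(match.group(1))
        recent.append(line)
    return [], None, None


def describe_rc(rc):
    """Map a negative kernel return code to (errno name, message).

    Assumption: the host running this script uses the same errno
    numbering as the Linux kernel that logged the message.
    """
    code = abs(rc)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)
```

To use it, pass an open handle on a copy of /var/log/messages that reaches back before Feb 5 04:03 to context_before_first_symptom, then read the returned lines for the earliest disk, journal, or filesystem error. If the symptom is the very first line of the file, the log does not go back far enough.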
It looks like you rebooted the server, OST0 and OST1 were mounted, and you are NOT getting those errors any more, but both OSTs reported errors on mount. So unmount the OSTs, and run:

e2fsck /dev/dm-0
e2fsck /dev/dm-1

I don't know how mangled your OSTs are, so I don't know what e2fsck will report.

See also http://wiki.lustre.org/index.php/Handling_File_System_Errors

Kevin

On Feb 21, 2012, at 10:43 PM, VIJESH EK wrote:

Dear Kevin,

I have attached the /var/log/messages file. Kindly go through the logs and give me a solution for this immediately. Can you tell me how to run e2fsck on an OST? Please tell me the exact command, with switches, to run e2fsck without affecting the data.

We are waiting for your reply.

Thanks & Regards
VIJESH E K

On Tue, Feb 21, 2012 at 8:38 PM, Kevin Van Maren <[email protected]> wrote:

This is not the correct list for help with SGE.

That being said, the real issue (as has been mentioned by several people) is that an OST has gone read-only due to some issue. The file system will not function properly until this is resolved, irrespective of where you put SGE.

You will need to check the logs on oss1 to find the initial issue, stop the bad OST, and take corrective action (the details of which depend on the issue).

Kevin

Sent from my iPhone

On Feb 21, 2012, at 3:23 AM, "VIJESH EK" <[email protected]> wrote:

We are waiting for your feedback.

Thanks & Regards
VIJESH E K

On Tue, Feb 21, 2012 at 12:22 PM, VIJESH EK <[email protected]> wrote:

Dear All,

We have made the following changes on the exec nodes, but we are still getting the same errors in /var/log/messages.

1. We changed the exec nodes' spool directory to a local directory by editing the file /home/appl/sge-root/default/common/configuration and changing the parameter execd_spool_dir.
After this change, the same error, shown below, still appears on the OSS1 node. This error is generated only on the OSS1 node.

Feb 6 18:32:10 oss1 kernel: LustreError: 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:05 oss1 kernel: LustreError: 9422:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:06 oss1 kernel: LustreError: 9432:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:07 oss1 kernel: LustreError: 9369:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:10 oss1 kernel: LustreError: 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30

Can you tell me how to change the master spool directory? Is it possible to change the directory while the system is live? Kindly explain briefly, so that we can proceed to the next step.

Thanks and Regards
VIJESH

On Fri, Feb 10, 2012 at 1:19 PM, Carlos Thomaz <[email protected]> wrote:

Hi Vijesh.

Are you running the SGE master spooling on Lustre? What about the exec nodes' spooling?

I strongly recommend that you do not run the master spooling on Lustre and, if possible, that you use local spooling on local disk for the exec nodes.

SGE (at least until version 6.2u7) is known to get unstable when running the spooling on Lustre.

Carlos

On Feb 10, 2012, at 1:18 AM, "VIJESH EK" <[email protected]> wrote:

Dear All,

Kindly provide a solution for the issue below.

Thanks & Regards
VIJESH E K

On Thu, Feb 9, 2012 at 3:26 PM, VIJESH EK <[email protected]> wrote:

Dear Sir,

I am getting the error messages below continuously on the OSS1 node, which causes the SGE service to stop running intermittently.

Feb 5 04:03:37 oss1 kernel: LustreError: 9193:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:47 oss1 kernel: LustreError: 9164:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:47 oss1 kernel: LustreError: 28420:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:48 oss1 kernel: LustreError: 9266:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:50 oss1 kernel: LustreError: 9200:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:53 oss1 kernel: LustreError: 9230:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:57 oss1 kernel: LustreError: 9212:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:03 oss1 kernel: LustreError: 9262:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:08 oss1 kernel: LustreError: 9162:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:15 oss1 kernel: LustreError: 9271:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:23 oss1 kernel: LustreError: 9191:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:32 oss1 kernel: LustreError: 9242:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30

I have attached the detailed log information. The attached file contains the continuous /var/log/messages logs, separated by *. Kindly give me a solution for this issue.

Thanks & Regards
VIJESH E K

Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited. Email addresses that end with a “-c” identify the sender as a Fusion-io contractor.

_______________________________________________ Lustre-discuss mailing list [email protected] http://lists.lustre.org/mailman/listinfo/lustre-discuss
