Sorry for the rushed email. For some reason the LVM metadata got screwed up. I managed to restore it, but now I'm running into another issue. I've mounted the OSTs, yet not all of them are cooperating. One of the OSTs stays listed as "Resource Unavailable", and this seems to be the main message on the OSS node:

LustreError: 137-5: UUID 'lustre-OST0002_UUID' is not available for connect (no target)
LustreError: Skipped 470 previous similar messages
LustreError: 5214:0:(ldlm_lib.c:1914:target_send_reply_msg()) @@@ processing error (-19) req@ffff8103ffc73400 x1404513746630678/t0 o8-><?>@<?>:0/0 lens 368/0 e 0 to 0 dl 1341207057 ref 1 fl Interpret:/0/0 rc -19/0
LustreError: 5214:0:(ldlm_lib.c:1914:target_send_reply_msg()) Skipped 470 previous similar messages

I've tried remounting this OST on the other data node, but it still won't connect from the client side. I've even rebooted the MDS and still no go. I've run e2fsck on the OSTs and it found no issues, the disk arrays report no problems on their end, the fibre connections are good, and the multipath driver doesn't report anything. (These are Sun disk arrays, so we're using the rdac driver instead of the basic multipath daemon.)

On the client side I see this:

Lustre: 3289:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request x1404591888147958 sent from lustre-OST0002-osc-ffff8104104ad800 to NID 192.168.5.101@tcp 0s ago has failed due to network error (30s prior to deadline).
  req@ffff81015113b400 x1404591888147958/t0 o8->[email protected]@tcp:28/4 lens 368/584 e 0 to 1 dl 1341187631 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 3290:0:(import.c:517:import_select_connection()) lustre-OST0002-osc-ffff8104104ad800: tried all connections, increasing latency to 22s
Lustre: 3290:0:(import.c:517:import_select_connection()) Skipped 39 previous similar messages

On Sun, Jul 1, 2012 at 8:10 PM, Mark Day <[email protected]> wrote:
> Does the device show up in /dev ?
> Have you physically checked for Fibre/SAS connectivity, RAID controller
> errors etc?
>
> You may need to supply more information about your setup. It sounds more
> like a RAID/disk issue than a Lustre issue.
>
> ________________________________
> From: "David Noriega" <[email protected]>
> To: [email protected]
> Sent: Monday, 2 July, 2012 8:51:18 AM
> Subject: [Lustre-discuss] Lustre missing physical volume
>
>
> Just recently used heartbeat to failover resources so that I could
> power down a lustre node to add more ram and failed back to do the
> same to our second lustre node. Only then do I find that now our
> lustre install is missing a physical volume out of lvm. pvscan only
> shows three out of four partitions.
>
> Any hints? I've tried some recovery steps in lvm with pvcreate using
> the archived config for the missing pv but no luck, says no device
> with such uuid. I'm lost on what to do now. This is lustre 1.8.4
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>

--
David Noriega
CSBC/CBI System Administrator
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
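
As a reading aid for the log excerpts in this message, here is a minimal Python sketch that pulls the numeric error codes out of Lustre log lines and translates them to errno names. It assumes the codes are negated Linux errno values, which would make the -19 seen on the OSS ENODEV ("No such device"). The regular expression and the explain_codes() name are illustrative assumptions based only on the lines quoted here; they are not part of any Lustre tool.

```python
import errno
import os
import re

# Assumed field formats, taken from the log lines quoted in this thread:
#   "processing error (-19)"  and  "rc -19/0"
_CODE_RE = re.compile(r"processing error \((-?\d+)\)|\brc (-?\d+)/")


def explain_codes(line):
    """Return sorted (code, errno_name, description) tuples for negative codes in a log line."""
    found = set()
    for match in _CODE_RE.finditer(line):
        code = int(match.group(1) or match.group(2))
        if code < 0:  # 0 means success, so only negative values are reported
            found.add(code)
    return [
        (code, errno.errorcode.get(-code, "UNKNOWN"), os.strerror(-code))
        for code in sorted(found)
    ]


sample = (
    "LustreError: 5214:0:(ldlm_lib.c:1914:target_send_reply_msg()) "
    "@@@ processing error (-19) req@ffff8103ffc73400 x1404513746630678/t0 "
    "o8-><?>@<?>:0/0 lens 368/0 e 0 to 0 dl 1341207057 ref 1 "
    "fl Interpret:/0/0 rc -19/0"
)

for code, name, text in explain_codes(sample):
    print(code, name, text)
```

Run against the OSS line above, this reports -19 as ENODEV, which fits the "(no target)" message: the OSS is answering, but it does not think it has OST0002 set up as a target.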
