Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-10 Thread Bob Ball
OK, this worked.  I was able to rewrite the LAST_ID value stored in the lov_objid object to a value of 1, and when Lustre came up, the OST was back at "UP" (yay!). However, there still seem to be problems with that OST, as lfs_find comes up with files there that do not exist.  I guess at some

Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-10 Thread Bob Ball
I just made some random checks on the "lfs find" output for this OST from yesterday.  Each file I checked was one lost when we had problems a few months back.  The suggested "unlink" on these did not work in 1.8.3, worked fine on a whole set yesterday with 1.8.4, but I obviously did not find th

Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-10 Thread Bernd Schubert
>> Assuming the disk really is empty then, and LAST_ID really is zero,
>> shall I then leave it at zero, and follow the recommendation of
>> page 23-14, ie, just shut down again, delete the lov_objid file on
>> the MDS, and restart the system? Certainly the value at the
>> correct index (29) is de

Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-10 Thread Andreas Dilger
On 2010-09-10, at 08:21, Bob Ball wrote:
> Now, we move on to the "bad" OST. First, I did an lfs_find yesterday on just
> this OST and came up with some 8000 files before it seemed to cease output.
> So, I expected to see SOMETHING on the physical disk. But, in fact, the
> /tmp/objects.sdc sh

Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-10 Thread Bob Ball
OK, I tried this morning to follow the information/procedures from section 23.3.9 of the user manual, and succeeded in confusing myself admirably. Took lustre completely offline, then first checked the LAST_ID on a known, good OST.  I found this kind of thing: # od -Ax -td4 /mnt/ost/last_rcvd
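For readers following the LAST_ID discussion, the check at "the correct index" can be sketched in Python. This is a minimal illustration only, assuming that lov_objid is a packed array of little-endian unsigned 64-bit values with one slot per OST index and no header; that layout is an assumption here, the function names are hypothetical, and this is not a Lustre tool.

```python
import struct


def parse_objids(raw: bytes) -> list:
    """Decode a buffer of packed little-endian unsigned 64-bit integers.

    Assumption: one 8-byte slot per OST index, no header. Any trailing
    bytes that do not fill a whole slot are ignored.
    """
    count = len(raw) // 8
    return list(struct.unpack("<%dQ" % count, raw[: count * 8]))


def objid_at(raw: bytes, ost_index: int) -> int:
    """Return the value stored in the slot for the given OST index."""
    ids = parse_objids(raw)
    if not 0 <= ost_index < len(ids):
        raise IndexError("OST index %d is outside the %d slots present"
                         % (ost_index, len(ids)))
    return ids[ost_index]
```

On a real system the buffer would come from a copy of the file read in binary mode, taken while the filesystem is offline; comparing the slot for the affected OST index against that OST's own LAST_ID is the kind of check the thread describes.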

Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-03 Thread Bob Ball
Thank you, Bernd.  "df" claims there is some 442MB of data on the volume, compared to neighbors with 285GB.  That could well be a fragment of a single, unsuccessful transfer attempt.  I can run lfs_find on it though and see what comes back.  Was having problems earlier, thought I got files back

Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-03 Thread Bernd Schubert
On Friday, September 03, 2010, Bernd Schubert wrote:
> On Friday, September 03, 2010, Bob Ball wrote:
> > We added a new OSS to our 1.8.4 Lustre installation. It has 6 OST of
> > 8.9TB each. Within a day of having these on-line, one OST stopped
> > accepting new files. I cannot get it to activat

Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-03 Thread Bernd Schubert
On Friday, September 03, 2010, Bob Ball wrote:
> We added a new OSS to our 1.8.4 Lustre installation. It has 6 OST of
> 8.9TB each. Within a day of having these on-line, one OST stopped
> accepting new files. I cannot get it to activate. The other 5 seem fine.
>
> On the MDS "lctl dl" shows it

[Lustre-discuss] Cannot get an OST to activate

2010-09-03 Thread Bob Ball
We added a new OSS to our 1.8.4 Lustre installation. It has 6 OST of 8.9TB each. Within a day of having these on-line, one OST stopped accepting new files. I cannot get it to activate. The other 5 seem fine. On the MDS "lctl dl" shows it IN, but not UP, and files can be read from it: 33 IN