On 2010-06-03, at 06:23, Stefano Elmopi wrote:
> certainly, I did this in a test environment; in a production environment, I 
> would have moved all the files before deleting the OST1 server.

The main problem here is that you have completely erased all knowledge of the 
failed OST (i.e. by using lctl --writeconf), while there are still files in the 
filesystem using it.

If the OST had simply failed and been marked inactive (which is what is 
normally done in such situations) it would still be possible to delete the 
files.  The problem being seen on the MDT now is simply one that cannot happen 
in any "normal" failure scenario.
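
To make the difference concrete, here is a minimal Python sketch (an
illustration only, not a Lustre tool) that reads a target_obd table like the
one quoted further down and reports OST indices that have vanished entirely.
The function names are hypothetical, and it assumes each line has the form
"index: UUID STATE" as in the quoted output.

```python
import re

def parse_target_obd(text):
    """Parse lines like '0: lustre01-OST0000_UUID ACTIVE' into {index: (uuid, state)}."""
    targets = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):\s+(\S+)\s+(\S+)\s*$", line)
        if m:
            targets[int(m.group(1))] = (m.group(2), m.group(3))
    return targets

def missing_indices(targets):
    """Return indices absent from the table, up to the highest index present."""
    if not targets:
        return []
    return [i for i in range(max(targets) + 1) if i not in targets]

# Table contents as quoted later in this thread.
sample = """0: lustre01-OST0000_UUID ACTIVE
2: lustre01-OST0002_UUID ACTIVE
"""
targets = parse_target_obd(sample)
print(missing_indices(targets))
```

An OST that had only been deactivated would still have a row in this table;
here index 1 has no row at all.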

That said, the checks in the MDS could/should probably be made more lenient.  I 
suspect, however, that there will be a follow-on chain of failures resulting 
from this, since the file layout is now broken and there are likely missing 
checks for this "impossible" case elsewhere in the code.
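
As a user-space illustration only (this is not the MDS code, and the names are
hypothetical), the sketch below scans lov_dump_lmm_objects() log lines such as
the one quoted further down and lists stripes that point at an OST index
missing from the known set, instead of failing on the first one. It assumes
the "stripe N idx M subobj X/Y" format shown in that log.

```python
import re

# Matches the stripe description printed by lov_dump_lmm_objects() in the quoted log.
STRIPE_RE = re.compile(
    r"stripe (\d+) idx (\d+) subobj (0x[0-9a-fA-F]+)/(0x[0-9a-fA-F]+)"
)

def stripes_on_unknown_osts(log_text, known_indices):
    """Return (stripe, ost_index, subobj) for stripes whose OST index is unknown."""
    bad = []
    for m in STRIPE_RE.finditer(log_text):
        stripe, idx = int(m.group(1)), int(m.group(2))
        if idx not in known_indices:
            bad.append((stripe, idx, m.group(3) + "/" + m.group(4)))
    return bad

log = "16265:0:(lov_pack.c:84:lov_dump_lmm_objects()) stripe 0 idx 1 subobj 0x0/0x62"
# Indices 0 and 2 are the OSTs still listed in target_obd.
print(stripes_on_unknown_osts(log, {0, 2}))
```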

> However, I tried to do:
> 
> unlink zero.dat
> 
> unlink: cannot unlink `zero.dat': Invalid argument
> 
> Jun  3 14:05:29 mdt02prdpom kernel: LustreError: 16265:0:(lov_ea.c:248:lsm_unpackmd_v1()) OST index 1 missing
> Jun  3 14:05:29 mdt02prdpom kernel: Lustre: 16265:0:(lov_pack.c:64:lov_dump_lmm_common()) objid 0x1b20017, magic 0x0bd10bd0, pattern 0x1
> Jun  3 14:05:29 mdt02prdpom kernel: Lustre: 16265:0:(lov_pack.c:67:lov_dump_lmm_common()) stripe_size 1048576, stripe_count 1
> Jun  3 14:05:29 mdt02prdpom kernel: Lustre: 16265:0:(lov_pack.c:84:lov_dump_lmm_objects()) stripe 0 idx 1 subobj 0x0/0x62
> 
> As for the kernel panic console messages, I have them only as an image; can I 
> attach it to an email?
> 
> For the second problem:
> 
> while testing quotas, when I run the command:
> 
> lfs quotacheck -ug /LUSTRE/
> quotacheck failed: Input/output error
> 
> and the log says:
> 
> kernel: LustreError: 7103:0:(quota_check.c:251:lov_quota_check()) lov idx 1 inactive
> 
> Do you have any suggestions?
> 
> Thanks
> 
> Cheers, Stefano
> 
> 
> 
> 
> Ing. Stefano Elmopi
> Gruppo Darco - Resp. ICT Sistemi
> Via Ostiense 131/L Corpo B, 00154 Roma
> 
> cell. 3466147165
> tel.  0657060500
> email:[email protected]
> 
> On 28 May 2010, at 21:34, Andreas Dilger wrote:
> 
>> On 2010-05-27, at 04:15, Stefano Elmopi wrote:
>>> A clarification on what I wrote: the command that sends the MGS/MDS server 
>>> into a kernel panic is:
>>> 
>>>> My version of Lustre is 1.8.3
>>>> As a test, I tried to delete an OST and replace it with another OST
>>>> and now the situation is this:
>>>> 
>>>> cat /proc/fs/lustre/lov/lustre01-mdtlov/target_obd 
>>>> 0: lustre01-OST0000_UUID ACTIVE
>>>> 2: lustre01-OST0002_UUID ACTIVE
>>>> 
>>>> - first problem
>>>> lustre01-OST0001_UUID is the OST that was deleted. It had files on it,
>>>> which of course are no longer available:
>> 
>> Ideally, you should migrate files off the OST before deleting it.
>> 
>>>> ls -lrt
>>>> total 12475312
>>>> ?--------- ? ?    ?             ?            ? zero.dat
>>>> ?--------- ? ?    ?             ?            ? ubuntu-9.10-dvd-i386.iso
>>>> ?--------- ? ?    ?             ?            ? XXXXXXXXX_CentOS-5.4-x86_64-bin-DVD.iso
>>>> ?--------- ? ?    ?             ?            ? Windows_XP-Capodarco.iso
>>>> ?--------- ? ?    ?             ?            ? UBUNTU_CentOS-5.4-x86_64-bin-DVD.iso
>>>> ?--------- ? ?    ?             ?            ? KK_CentOS-5.4-x86_64-bin-DVD.iso
>>>> ?--------- ? ?    ?             ?            ? FFFFF_CentOS-5.4-x86_64-bin-DVD.iso
>>>> ?--------- ? ?    ?             ?            ? CentOS-5.3-i386-bin-DVD.iso
>>>> ?--------- ? ?    ?             ?            ? BBBBB_CentOS-5.4-x86_64-bin-DVD.iso
>>>> ?--------- ? ?    ?             ?            ? BAK_CentOS-5.4-x86_64-bin-DVD.iso
>>>> ?--------- ? ?    ?             ?            ? 2.iso
>>>> 
>>>> 
>>>> To delete them, I followed these steps:
>> 
>> You should be able to delete them from the client with "unlink zero.dat", 
>> which will return an ENOENT error, but the file should be gone.  No need to 
>> run lfsck at all.
>> 
>>>> and the MGS/MDS server goes into a kernel panic
>> 
>> What do the MDS console messages say?  That is the root of the problem.
>> 
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Lustre Technical Lead
>> Oracle Corporation Canada Inc.
>> 
> 


Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
