OK.  This looks bad.  It appears that I should have upgraded ext3 to ext4.
I found these instructions for that:

        tune2fs -O extents,uninit_bg,dir_index /dev/XXX
        fsck -pf /dev/XXX
        
Is the above correct?  I'd like to move our systems to ext4.  I didn't know
those steps were necessary.

My other answers are inline below.

Wojciech Turek wrote:
> Hi Roger,
> 
> Sorry for the delay. From the ldiskfs messages it seems to me that you are 
> using the ext4-based ldiskfs
> (Jun 26 17:54:30 puppy7 kernel: ldiskfs created from ext4-2.6-rhel5).
> If you are upgrading from 1.6.6, your ldiskfs is ext3-based, so I think that 
> in lustre-1.8.3 you should use the ext3-based ldiskfs rpm.
> 
> Can you also tell us a bit more about your setup? From what you wrote 
> so far, I understand you have 2 OSS servers and each server has one OST 
> device. In addition to that, you have a third server which acts as an 
> MGS/MDS. Is that right?
> 
> The logs you provided seem to be only from one server called puppy7 so 
> it does not give a whole picture of the situation. The timeout messages 
> may indicate a problem with communication between the servers but it is 
> really difficult to say without seeing the whole picture or at least 
> more elements of it.
> 
> To check that you have the correct rpms installed, can you please run 
> 'rpm -qa | grep lustre' on both OSS servers and the MDS?
> 
> Also, please provide the output of the command 'lctl list_nids' run on 
> both OSS servers, the MDS and a client.

puppy5 (MDS/MGS)
172.17....@o2ib
172.16....@tcp

puppy6 (OSS)
172.17....@o2ib
172.16....@tcp

puppy7 (OSS)
172.17....@o2ib
172.16....@tcp
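
Each server above reports one o2ib NID and one tcp NID. A minimal sketch of how that could be checked from saved 'lctl list_nids' output, assuming one NID per line in address@network form. The function names and the sample addresses (taken from the documentation ranges, not our real ones) are hypothetical.

```python
def nid_network(nid):
    """Return the network part of a NID written as address@network."""
    address, sep, network = nid.strip().rpartition("@")
    if not sep or not address or not network:
        raise ValueError("not a NID: %r" % nid)
    return network


def missing_networks(nids_by_host, required):
    """Map each host to the required networks it does not list.

    Hosts that list every required network are left out of the result.
    """
    missing = {}
    for host, nids in nids_by_host.items():
        present = {nid_network(n) for n in nids}
        absent = sorted(set(required) - present)
        if absent:
            missing[host] = absent
    return missing


# Hypothetical addresses; the real NIDs are elided in this message.
SAMPLE_NIDS = {
    "puppy5": ["192.0.2.5@o2ib", "198.51.100.5@tcp"],
    "puppy6": ["192.0.2.6@o2ib", "198.51.100.6@tcp"],
    "puppy7": ["192.0.2.7@o2ib", "198.51.100.7@tcp"],
}

if __name__ == "__main__":
    print(missing_networks(SAMPLE_NIDS, ["o2ib", "tcp"]))
```

An empty result means every host lists all of the required networks.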


> 
> In addition to the above, please run the following command on all Lustre 
> targets (OSTs and MDT) to display your current Lustre configuration:
> 
>  tunefs.lustre --dryrun --print /dev/<ost_device>

puppy5 (MDS/MGS)
    Read previous values:
Target:     lustre1-MDT0000
Index:      0
Lustre FS:  lustre1
Mount type: ldiskfs
Flags:      0x405
               (MDT MGS )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters: lov.stripesize=125K lov.stripecount=2 mdt.group_upcall=/usr/sbin/l_getgroups mdt.group_upcall=NONE mdt.group_upcall=NONE


    Permanent disk data:
Target:     lustre1-MDT0000
Index:      0
Lustre FS:  lustre1
Mount type: ldiskfs
Flags:      0x405
               (MDT MGS )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters: lov.stripesize=125K lov.stripecount=2 mdt.group_upcall=/usr/sbin/l_getgroups mdt.group_upcall=NONE mdt.group_upcall=NONE

exiting before disk write.
----------------------------------------------------
puppy6
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

    Read previous values:
Target:     lustre1-OST0000
Index:      0
Lustre FS:  lustre1
Mount type: ldiskfs
Flags:      0x2
               (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=172.17....@o2ib


    Permanent disk data:
Target:     lustre1-OST0000
Index:      0
Lustre FS:  lustre1
Mount type: ldiskfs
Flags:      0x2
               (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=172.17....@o2ib
--------------------------------------------------
puppy7 (this is the broken OSS. The "Target" should be "lustre1-OST0001")
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

    Read previous values:
Target:     lustre1-OST0000
Index:      0
Lustre FS:  lustre1
Mount type: ldiskfs
Flags:      0x2
               (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=172.17....@o2ib


    Permanent disk data:
Target:     lustre1-OST0000
Index:      0
Lustre FS:  lustre1
Mount type: ldiskfs
Flags:      0x2
               (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=172.17....@o2ib

exiting before disk write.
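
The duplicate target name on puppy6 and puppy7 can be spotted mechanically. A minimal sketch, assuming the 'tunefs.lustre --dryrun --print' output from each server has been saved as text. The function names and the abbreviated sample strings are hypothetical, and only the "Target:" line under "Permanent disk data:" is read.

```python
from collections import defaultdict


def parse_target(output):
    """Return the Target value from the 'Permanent disk data:' section, or None."""
    in_permanent = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Permanent disk data:"):
            in_permanent = True
        elif in_permanent and stripped.startswith("Target:"):
            return stripped.split(":", 1)[1].strip()
    return None


def find_duplicate_targets(outputs_by_host):
    """Map each target name claimed by more than one host to those hosts."""
    hosts_by_target = defaultdict(list)
    for host, output in outputs_by_host.items():
        target = parse_target(output)
        if target is not None:
            hosts_by_target[target].append(host)
    return {t: sorted(h) for t, h in hosts_by_target.items() if len(h) > 1}


def sample_output(target):
    """Build an abbreviated stand-in for one server's output (hypothetical)."""
    return (
        "    Read previous values:\n"
        "Target:     %s\n"
        "\n"
        "    Permanent disk data:\n"
        "Target:     %s\n" % (target, target)
    )


SAMPLE = {
    "puppy5": sample_output("lustre1-MDT0000"),
    "puppy6": sample_output("lustre1-OST0000"),
    "puppy7": sample_output("lustre1-OST0000"),
}

if __name__ == "__main__":
    # prints {'lustre1-OST0000': ['puppy6', 'puppy7']}
    print(find_duplicate_targets(SAMPLE))
```

This only reports the clash; it does not change anything on disk.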


> 
> If possible, please attach the syslog from each machine, starting from the 
> time you mounted the Lustre targets (OST and MDT).
> 
> Best regards,
> 
> Wojciech
> 
> On 14 July 2010 20:46, Roger Sersted <[email protected] 
> <mailto:[email protected]>> wrote:
> 
> 
>     Any additional info?
> 
>     Thanks,
> 
>     Roger S.
> 
> 
> 
> 
> -- 
> --
> Wojciech Turek
> 
> 
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
