Aaron,

Thanks for the reply. That didn't seem to work... it still gives me the "No such device" error. Same thing in /var/log/messages too.

Thanks!

-Matt
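A "No such device" from mount.lustre generally means the kernel on that node has no lustre filesystem type registered, i.e. the client modules never loaded there, which is also what the "unknown type: osc" lines in the log further down point at. A quick read-only check on the failing slave node, assuming the stock Lustre 1.4.x module names (a sketch, not commands from this thread):

  lsmod | grep -E 'lustre|lnet|ksocklnd'   # are any of the Lustre client modules loaded?
  grep lustre /proc/filesystems            # is the "lustre" filesystem type registered?

If both print nothing, the mount will fail this way no matter what is in the XML configuration.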
-----Original Message-----
From: Aaron Knister [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 15, 2007 11:14 AM
To: Matt Hollingsworth
Cc: [email protected]
Subject: Re: [Lustre-discuss] Strange Problem When Mounting Lustre Filesystem

Try this:

lmc -m cluster-production.xml --add node --node client --nid '*'@tcp --nettype tcp

It might complain because the node "client" has already been added to your config file, but I'm not sure. Once you've run the above command, try remounting your lustre fs.

-Aaron
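One way to confirm that the regenerated XML actually picked up the wildcard NID for the "client" node before copying it back out to the nodes (a sketch, not from Aaron's mail; it assumes cluster-production.xml is in the current directory):

  grep -n -i -C2 'client' cluster-production.xml   # show the client node's entries in context
  grep -n 'nid' cluster-production.xml             # list every NID that ended up in the config

The script quoted below adds the client with --nettype lnet, so it is worth checking which of the two entries ended up in the file.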
Matt Hollingsworth wrote:
> Hello,
>
> I'm sorry if this is a double post, but the message bounced back to
> me, and I'm not sure whether it went or not.
>
> I have spent the last couple of months designing the file system for a
> scientific cluster that I am helping administer. After a bunch of
> testing, we are finally ready to actually get down to using our setup.
> We have a slightly strange setup, so I'll start off by explaining what
> we are doing.
>
> We rebuilt the kernel from the
> kernel-source-2.6.9-42.0.2.EL_lustre.1.4.7.1.x86_64.rpm package in
> order to slim down the features and to add root-over-NFS support to it.
> Then we built lustre-1.4.8 against that kernel. We have one head node
> that is the MDS as well as the boot server (it exports the root file
> system and runs tftpd). All of the other nodes then boot off of that
> server. The other (slave) nodes are the OSSs. I use this script to
> generate the config file:
>
> #!/bin/bash
> #############################
> # cms-lustre-config.sh
> #############################
>
> rm cluster-production.xml
>
> #-----------------
> #Create the nodes |
> #-----------------
> lmc -m cluster-production.xml --add node --node osg1
> lmc -m cluster-production.xml --add net --node osg1 --nid [EMAIL PROTECTED] --nettype lnet
>
> lmc -m cluster-production.xml --add node --node node253
> lmc -m cluster-production.xml --add net --node node253 --nid [EMAIL PROTECTED] --nettype lnet
>
> lmc -m cluster-production.xml --add node --node node252
> lmc -m cluster-production.xml --add net --node node252 --nid [EMAIL PROTECTED] --nettype lnet
>
> lmc -m cluster-production.xml --add node --node node251
> lmc -m cluster-production.xml --add net --node node251 --nid [EMAIL PROTECTED] --nettype lnet
>
> lmc -m cluster-production.xml --add node --node node250
> lmc -m cluster-production.xml --add net --node node250 --nid [EMAIL PROTECTED] --nettype lnet
>
> lmc -m cluster-production.xml --add node --node node249
> lmc -m cluster-production.xml --add net --node node249 --nid [EMAIL PROTECTED] --nettype lnet
>
> lmc -m cluster-production.xml --add node --node client
> lmc -m cluster-production.xml --add net --node client --nid '*' --nettype lnet
>
> #--------------
> #Configure MDS |
> #--------------
> lmc -m cluster-production.xml --add mds --node osg1 --mds cms-mds --fstype ldiskfs --dev /dev/sdb
>
> #---------------
> #Configure OSTs |
> #---------------
> lmc -m cluster-production.xml --add lov --lov cms-lov --mds cms-mds --stripe_sz 1048576 --stripe_cnt 0 --stripe_pattern 0
>
> #Head Node
> #==========
> #lmc -m cluster-production.xml --add ost --node osg1 --lov cms-lov --ost node001-ost --fstype ldiskfs --dev /dev/sdc
> #==========
>
> #Compute Nodes
> #==========
> #node253
> lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sda --fstype ldiskfs --dev /dev/sda
> lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdb --fstype ldiskfs --dev /dev/sdb
> lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdc --fstype ldiskfs --dev /dev/sdc
> lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdd --fstype ldiskfs --dev /dev/sdd
>
> #---------
> #node252
> lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sda --fstype ldiskfs --dev /dev/sda
> lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdb --fstype ldiskfs --dev /dev/sdb
> lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdc --fstype ldiskfs --dev /dev/sdc
> lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdd --fstype ldiskfs --dev /dev/sdd
>
> #---------
> #node251
> lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sda --fstype ldiskfs --dev /dev/sda
> lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdb --fstype ldiskfs --dev /dev/sdb
> lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdc --fstype ldiskfs --dev /dev/sdc
> lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdd --fstype ldiskfs --dev /dev/sdd
>
> #---------
> #node250
> lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sda --fstype ldiskfs --dev /dev/sda
> lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdb --fstype ldiskfs --dev /dev/sdb
> lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdc --fstype ldiskfs --dev /dev/sdc
> lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdd --fstype ldiskfs --dev /dev/sdd
>
> #---------
> #node249
> lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sda --fstype ldiskfs --dev /dev/sda
> lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdb --fstype ldiskfs --dev /dev/sdb
> lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdc --fstype ldiskfs --dev /dev/sdc
> lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdd --fstype ldiskfs --dev /dev/sdd
> #==========
>
> #-----------------
> #Configure client |
> #-----------------
> lmc -m cluster-production.xml --add mtpt --node client --path /mnt/cms-lustre --mds cms-mds --lov cms-lov
>
> cp cluster-production.xml /cluster-images/rootfs-SL4-x86_64/root/lustre-config/
>
> #############################
> # end
> #############################
>
> Now, I do all the
>
> lconf --reformat --node <insert node name> cluster-production.xml
>
> on each node, and wait for a while for everything to format. Everything
> completes fine, without an error.
>
> I then do
>
> mount.lustre 10.0.0.243:/cms-mds/client /mnt/cms-lustre
>
> on the head node (osg1). That also works fine. I've run a number of
> tests, and it works fine (really well, in fact).
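For reference, the per-node reformat step mentioned above could be driven from the head node with a small loop. This is only a sketch, assuming passwordless ssh to the OSSs and that the copied XML shows up on the nodes at /root/lustre-config/ once the exported root filesystem is mounted (both assumptions, not details from the original post):

  lconf --reformat --node osg1 cluster-production.xml   # head node / MDS, run locally
  for n in node253 node252 node251 node250 node249; do
      ssh "$n" "lconf --reformat --node $n /root/lustre-config/cluster-production.xml"
  done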
> The problem occurs when I attempt to mount the file system on the
> slave nodes. When I do the same command as above, I get the following:
>
> [EMAIL PROTECTED] ~]# mount.lustre 10.0.0.243:/cms-mds/client /var/writable/cms-lustre/
> mount.lustre: mount([EMAIL PROTECTED]:/cms-mds/client, /var/writable/cms-lustre/) failed: No such device
> mds nid 0: [EMAIL PROTECTED]
> mds name: cms-mds
> profile: client
> options:
> retry: 0
> Are the lustre modules loaded?
> Check /etc/modprobe.conf and /proc/filesystems
> [EMAIL PROTECTED] ~]#
>
> and this pops up in the error log:
>
> Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(genops.c:224:class_newdev()) OBD: unknown type: osc
> Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(obd_config.c:102:class_attach()) Cannot create device OSC_osg1.<mydomain>_node253-ost-sda_MNT_client-000001011f659c00 of type osc : -19
> Feb 14 04:50:07 localhost kernel: LustreError: mdc_dev: The configuration 'client' could not be read from the MDS 'cms-mds'. This may be the result of communication errors between the client and the MDS, or if the MDS is not running.
> Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(llite_lib.c:936:lustre_fill_super()) Unable to process log: client
>
> Any idea what's going on here?
>
> Thanks a bunch for the help.
>
> -Matt

--
"Computers are incredibly fast, accurate and stupid; humans are incredibly slow, inaccurate and brilliant; together they are powerful beyond imagination." --Albert Einstein

Aaron Knister
Center for Research on Environment and Water
4041 Powder Mill Road, Suite 302; Calverton MD 20705
Office: (240) 247-1456
Fax: (301) 595-9790
http://crew.iges.org

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
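A footnote on the "Check /etc/modprobe.conf and /proc/filesystems" hint in the error output quoted above: on a tcp-only Lustre 1.4.x client, the modprobe.conf side usually comes down to a single lnet options line plus loading the client module. A sketch only, with the interface name being an assumption about these nodes:

  # in /etc/modprobe.conf on the client nodes (sketch; eth0 is assumed):
  options lnet networks=tcp0(eth0)

  # then, still on the node:
  modprobe lustre                  # pulls in lnet, the socket LND and the client stack
  grep lustre /proc/filesystems    # should now list the lustre filesystem type

Once the lustre filesystem type shows up in /proc/filesystems, the mount.lustre command above has a chance of getting past "No such device".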
