Try this:
lmc -m cluster-production.xml --add node --node client --nid '*'@tcp --nettype tcp
It might complain because the node "client" has already been added to
your config file, but I'm not sure.
Once you've run the above command, try remounting your Lustre filesystem.
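Roughly, the full sequence would be something like this (a sketch only:
I've written the first line as --add net to match the form used in your
script, the paths and the mount command are copied from your mail, and
I'm not certain the MDS config logs actually need regenerating):
# re-add the client net entry with an explicit @tcp nid
lmc -m cluster-production.xml --add net --node client --nid '*'@tcp --nettype tcp
# push the updated xml back into the image the slaves boot from
cp cluster-production.xml /cluster-images/rootfs-SL4-x86_64/root/lustre-config/
# possibly regenerate the config logs on the MDS (osg1); not sure this is required
lconf --write_conf cluster-production.xml
# then retry the mount on one of the slaves
mount.lustre 10.0.0.243:/cms-mds/client /var/writable/cms-lustre/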
-Aaron
Matt Hollingsworth wrote:
Hello,
I'm sorry if this is a double post, but the message bounced back to
me, and I'm not sure whether it went or not.
I have spent the last couple of months designing the file system for a
scientific cluster that I am helping administer. After a bunch of
testing, we are finally ready to actually get down to using our setup.
We have a slightly strange setup, so I’ll start off with explaining
what we are doing.
We rebuilt the kernel from the
kernel-source-2.6.9-42.0.2.EL_lustre.1.4.7.1.x86_64.rpm package in
order to slim down the features and to add root-over-NFS support to
it. Then we built lustre-1.4.8 against that kernel. We have one
head-node that is the MDS as well as the boot server (it exports the
root file system and runs tftpd). All of the other nodes then boot
off that server; those other (slave) nodes are the OSSes. I use this
script to generate the config file:
#############################
#############################
cms-lustre-config.sh
#############################
#############################
#!/bin/bash
rm cluster-production.xml
#-----------------
#Create the nodes |
#-----------------
lmc -m cluster-production.xml --add node --node osg1
lmc -m cluster-production.xml --add net --node osg1 --nid [EMAIL PROTECTED] --nettype lnet
lmc -m cluster-production.xml --add node --node node253
lmc -m cluster-production.xml --add net --node node253 --nid [EMAIL PROTECTED] --nettype lnet
lmc -m cluster-production.xml --add node --node node252
lmc -m cluster-production.xml --add net --node node252 --nid [EMAIL PROTECTED] --nettype lnet
lmc -m cluster-production.xml --add node --node node251
lmc -m cluster-production.xml --add net --node node251 --nid [EMAIL PROTECTED] --nettype lnet
lmc -m cluster-production.xml --add node --node node250
lmc -m cluster-production.xml --add net --node node250 --nid [EMAIL PROTECTED] --nettype lnet
lmc -m cluster-production.xml --add node --node node249
lmc -m cluster-production.xml --add net --node node249 --nid [EMAIL PROTECTED] --nettype lnet
lmc -m cluster-production.xml --add node --node client
lmc -m cluster-production.xml --add net --node client --nid '*' --nettype lnet
#--------------
#Configure MDS |
#--------------
lmc -m cluster-production.xml --add mds --node osg1 --mds cms-mds --fstype ldiskfs --dev /dev/sdb
#---------------
#Configure OSTs |
#---------------
lmc -m cluster-production.xml --add lov --lov cms-lov --mds cms-mds --stripe_sz 1048576 --stripe_cnt 0 --stripe_pattern 0
#Head Node
#==========
#lmc -m cluster-production.xml --add ost --node osg1 --lov cms-lov --ost node001-ost --fstype ldiskfs --dev /dev/sdc
#==========
#Compute Nodes
#==========
#node253
lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdd --fstype ldiskfs --dev /dev/sdd
#--------
#node252
lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdd --fstype ldiskfs --dev /dev/sdd
#---------
#node251
lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdd --fstype ldiskfs --dev /dev/sdd
#---------
#node250
lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdd --fstype ldiskfs --dev /dev/sdd
#---------
#node249
lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdd --fstype ldiskfs --dev /dev/sdd
#==========
#-----------------
#Configure client |
#-----------------
lmc -m cluster-production.xml --add mtpt --node client --path /mnt/cms-lustre --mds cms-mds --lov cms-lov
cp cluster-production.xml /cluster-images/rootfs-SL4-x86_64/root/lustre-config/
#############################
#############################
end
#############################
#############################
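(For reference, a quick way to sanity-check the generated file is
something like the following; the element and attribute names are
whatever this lmc version emits, so adjust the patterns accordingly:)
grep -c '<node ' cluster-production.xml   # expect one entry per server node plus the client
grep 'cms-lustre' cluster-production.xml  # confirm the client mount point made it in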
Now, I run
lconf --reformat --node <insert node name> cluster-production.xml
on each node and wait a while for everything to format.
Everything completes without error.
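(As an illustration of that step only: with the node names from the
script it amounts to a loop like the one below, run over ssh; the
config path on the booted nodes is my assumption, so adjust it to
wherever the XML actually lands.)
for n in osg1 node253 node252 node251 node250 node249; do
    # format each node's targets from the shared config
    ssh $n "lconf --reformat --node $n /root/lustre-config/cluster-production.xml"
done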
I then do mount.lustre 10.0.0.243:/cms-mds/client /mnt/cms-lustre on
the head node (osg1).
That also works fine. I’ve run a number of tests, and it performs
really well.
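(Nothing fancy as far as tests go; smoke tests along these lines, just
to confirm reads, writes, and that all the OST space shows up:)
dd if=/dev/zero of=/mnt/cms-lustre/ddtest bs=1M count=1024   # plain write test
dd if=/mnt/cms-lustre/ddtest of=/dev/null bs=1M              # plain read test
df -h /mnt/cms-lustre                                        # total space across the OSTs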
The problem occurs when I attempt to mount the file system on the
slave nodes. When I do the same command as above, I get the following:
[EMAIL PROTECTED] ~]# mount.lustre 10.0.0.243:/cms-mds/client
/var/writable/cms-lustre/
mount.lustre: mount([EMAIL PROTECTED]:/cms-mds/client,
/var/writable/cms-lustre/) failed: No such device
mds nid 0: [EMAIL PROTECTED]
mds name: cms-mds
profile: client
options:
retry: 0
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
[EMAIL PROTECTED] ~]#
and this pops up in the error log:
Feb 14 04:50:07 localhost kernel: LustreError:
6053:0:(genops.c:224:class_newdev()) OBD: unknown type: osc
Feb 14 04:50:07 localhost kernel: LustreError:
6053:0:(obd_config.c:102:class_attach()) Cannot create device
OSC_osg1.<mydomain>_node253-ost-sda_MNT_client-000001011f659c00 of
type osc : -19
Feb 14 04:50:07 localhost kernel: LustreError: mdc_dev: The
configuration 'client' could not be read from the MDS 'cms-mds'. This
may be the result of communication errors between the client and the
MDS, or if the MDS is not running.
Feb 14 04:50:07 localhost kernel: LustreError:
6053:0:(llite_lib.c:936:lustre_fill_super()) Unable to process log: client
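Following the mount helper's hint, I assume the next step is to check
the client modules on the slave, roughly like this (module and file
names as in a stock Lustre 1.4 install):
lsmod | grep -E 'lustre|lov|osc|mdc'   # are the client modules loaded?
grep lustre /proc/filesystems          # is the lustre filesystem type registered?
cat /etc/modprobe.conf                 # any lnet/lustre option lines present?
modprobe lustre                        # try loading the stack by hand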
Any idea what’s going on here?
Thanks a bunch for the help.
-Matt
------------------------------------------------------------------------
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss