Hi Roger,

I believe you can connect the OSSs once the MDS has booted. In fact, I'm pretty sure that the five in 'connected_clients: 0/5' are your OSS nodes. Each OST maintains a connection to the MDS while the file system is mounted, so the OSTs are included in the connection count on the MDS.
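To watch that count while the OSS nodes come up, something like the following might help. It is only a sketch: recovery_summary is a made-up helper name, not a Lustre tool, and it assumes the recovery_status file keeps the "key: value" layout shown in the quoted output below.

```shell
# Hypothetical helper (not part of Lustre): summarise a recovery_status
# file read from stdin. Assumes one "key: value" pair per line, as in the
# output quoted in this thread.
recovery_summary() {
    awk -F': *' '
        $1 == "status"            { status = $2 }
        $1 == "connected_clients" { connected = $2 }
        $1 == "completed_clients" { completed = $2 }
        END {
            printf "status=%s connected=%s completed=%s\n", status, connected, completed
        }
    '
}

# Usage on the MDS (path taken from the quoted message):
#   recovery_summary < /proc/fs/lustre/mds/tacc-MDT0000/recovery_status
```

Running it repeatedly while the OSTs are being mounted should show whether the connected count rises as each one reconnects.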
However, regardless of the recovery state, if your MDS is online and the MDT is mounted, you can start up the OSS nodes and their corresponding OSTs at any time. Clients attempting to make transactions will have their I/O operations block (or fail, depending on the MDS config) until the missing nodes come back online.

hth,
Klaus

On 1/20/09 3:05 PM, "Roger Spellman" <[email protected]> etched on stone tablets:

> I have 2 MDS, configured as an active/standby pair. I have 5 OSTs that are
> NOT active/standby. I have 5 clients.
>
> I am using Lustre 1.6.5, due to bug 18232
> <https://bugzilla.lustre.org/show_bug.cgi?id=18232> which only affects 1.6.6.
> Using Lustre 1.6.5, when I reset my active node, the standby takes over.
> This is quite reliable.
>
> Today, I did the following in this order:
>   Unmounted all the clients
>   Rebooted all the clients
>   Stopped Linux HA from running
>   Unmounted the OSTs
>   Unmounted the MDS
>   Rebooted the OSTs
>   Rebooted both MDSes
>
> When the MDSes started up, Linux HA chose one to be active. That system
> mounted the MDT.
>
> I looked at the file /proc/fs/lustre/mds/tacc-MDT0000/recovery_status, and it
> showed:
>
> [r...@ts-tacc-01 ~]# cat /proc/fs/lustre/mds/tacc-MDT0000/recovery_status
> status: RECOVERING
> recovery_start: 0
> time_remaining: 0
> connected_clients: 0/5
> completed_clients: 0/5
> replayed_requests: 0/??
> queued_requests: 0
> next_transno: 17768
>
> ***** Note that recovery_start and time_remaining are both zero. *****
>
> I waited several minutes, and this file was the same.
>
> I was waiting for recovery to complete before trying to mount the OSTs.
> However, it appears that this would never occur!
>
> Does this look like a bug?
>
> ---------------------------
>
> I format my MDT using the following command. The command is run from
> 10.2.43.1, and the failnode is 10.2.43.2:
>
> mkfs.lustre --reformat --fsname tacc --mdt --mgs --device-size=10000000
> --mkfsoptions=' -m 0 -O mmp' --failnode=10.2.4...@o2ib0 /dev/sdb
>
> I format the OSTs using the following command:
>
> /usr/bin/time -p mkfs.lustre --reformat --ost
> --mkfsoptions='-J device=/dev/sdc1 -m 0' --fsname tacc
> --device-size=400000000 --mgsnode=10.2.4...@o2ib0
> --mgsnode=10.2.4...@o2ib0 /dev/sdb
>
> I mount the clients using:
>
> mount -t lustre 10.2.4...@o2ib:10.2.4...@o2ib:/tacc /mnt/lustre
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
