Hi Dean,

Glad to hear you were able to clean up, and it sounds like your VM trial was successful as well. At this point I would suggest taking a close look at your installation and verifying that all of the needed packages are installed correctly.
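For example, something along these lines will show what's actually installed and whether the server modules can load (adjust the package names to whatever your repo carries):

  # any lustre/ldiskfs/zfs/spl packages currently installed
  rpm -qa | grep -E -i 'lustre|ldiskfs|zfs|spl'
  # try loading the ldiskfs OSD directly; modprobe resolves dependencies, insmod does not
  modprobe -v ldiskfs
  # confirm what is actually loaded
  lsmod | grep -E 'lustre|ldiskfs'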
The fact that it's complaining about missing modules for ldiskfs is a strong clue to the problem. Do there happen to be older modules installed by chance? Did you check rpm -qa | grep -i lustre? It also seems you may have the ZFS OSD installed. Is this on purpose?

-cf

On Fri, Nov 19, 2021 at 1:25 PM STEPHENS, DEAN - US <[email protected]> wrote:

> I also figured out how to clean up after the llmount.sh script is run. There is an llmountcleanup.sh script that will do that.
>
> Dean
>
> *From:* STEPHENS, DEAN - US
> *Sent:* Friday, November 19, 2021 1:08 PM
> *To:* Colin Faber <[email protected]>
> *Cc:* [email protected]
> *Subject:* RE: [lustre-discuss] Lustre and server upgrade
>
> One more thing that I have noticed using the llmount.sh script: the directories that the script created under /mnt have their permissions set to 000. The ones that I have configured under /mnt/lustre are set to 750.
>
> Is this something that needs to be fixed? I have these servers being configured via Puppet, and that is how the /mnt/lustre directories are created and their permissions set.
>
> Dean
>
> *From:* STEPHENS, DEAN - US
> *Sent:* Friday, November 19, 2021 7:14 AM
> *To:* Colin Faber <[email protected]>
> *Cc:* [email protected]
> *Subject:* RE: [lustre-discuss] Lustre and server upgrade
>
> The other question that I have is how to clean up after llmount.sh has been run. If I do a df on the server I see that mds1, ost1, and ost2 are still mounted under /mnt. Do I need to manually umount them, since llmount.sh completed successfully?
>
> Also, I have not done anything to my MDS node, so some direction on what to do there would be helpful as well.
>
> Dean
>
> *From:* STEPHENS, DEAN - US
> *Sent:* Friday, November 19, 2021 7:00 AM
> *To:* Colin Faber <[email protected]>
> *Cc:* [email protected]
> *Subject:* RE: [lustre-discuss] Lustre and server upgrade
>
> Thanks for the help yesterday. I was able to install the Lustre kernel and software on a VM, including the test RPM.
>
> This is what I did, following these directions <https://wiki.lustre.org/Installing_the_Lustre_Software#Lustre_Servers_with_LDISKFS_OSD_Support>:
>
> Installed the Lustre kernel and kernel-devel (the other RPMs listed were not in my lustre-server repo)
> Rebooted the VM
> Installed kmod-lustre, kmod-lustre-osd-ldiskfs, lustre-osd-ldiskfs-mount, lustre, lustre-resource-agents, and lustre-tests
> Ran modprobe -v lustre (it did not show kernel modules being loaded, as it has in the past)
> Ran lustre_rmmod (got an error: module lustre in use)
> Rebooted again
> Ran llmount.sh, and it looked like it completed successfully
> Ran tunefs.lustre /dev/sdb (at the bottom of the output I am seeing "tunefs.lustre: Unable to mount /dev/sdb: Invalid argument", "tunefs.lustre: FATAL: failed to write local files", and "tunefs.lustre: exiting with 22 (Invalid argument)")
>
> Any idea what the "invalid argument" is talking about?
>
> Dean
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 3:34 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> The VM will need a full install of all server packages, as well as the tests package, to allow for this test.
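> For the VM that would be something like the following (package names per the wiki install page; exact versions will come from whichever repo you're pointed at):
>
>   # lustre-patched kernel first, then reboot into it
>   yum install kernel kernel-devel
>   # server modules, userspace tools, and the test suite
>   yum install kmod-lustre kmod-lustre-osd-ldiskfs lustre-osd-ldiskfs-mount lustre lustre-tests
>   # single-node sanity check; llmountcleanup.sh tears it down afterwards
>   /usr/lib64/lustre/tests/llmount.sh
>   /usr/lib64/lustre/tests/llmountcleanup.sh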
> On Thu, Nov 18, 2021 at 2:26 PM STEPHENS, DEAN - US <[email protected]> wrote:
>
> I have not tried that, but I can do that on a new VM that I can create. I assume that all I need is the lustre-tests RPM and associated dependencies, and not the full-blown Lustre install?
>
> Dean
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 2:22 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> So that indicates that your installation is incomplete, or that something else is preventing lustre, ldiskfs, and possibly other modules from loading. Have you been able to reproduce this behavior on a fresh RHEL install with Lustre 2.12.7 (i.e. llmount.sh failing)?
>
> -cf
>
> On Thu, Nov 18, 2021 at 2:20 PM STEPHENS, DEAN - US <[email protected]> wrote:
>
> Thanks for the direction. I found and installed lustre-tests.x86_64, and now I have llmount.sh (it defaulted to /usr/lib64/lustre/tests/llmount.sh), but when I ran it, it failed with:
>
> Stopping clients: <hostname> /mnt/lustre (opts: -f)
> Stopping clients: <hostname> /mnt/lustre2 (opts: -f)
> Loading modules from /usr/lib64/lustre/tests/..
> Detected 2 online CPUs by sysfs
> Force libcfs to create 2 CPU partitions
> Formatting mgs, mds, osts
> Format mds1: /tmp/lustre-mdt1
> mkfs.lustre: Unable to mount /dev/loop0: No such device (even though /dev/loop0 is a thing)
> Is the ldiskfs module loaded?
> mkfs.lustre FATAL: failed to write local files
> mkfs.lustre: exiting with 19 (no such device)
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 2:03 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> This would be part of the lustre-tests RPM package, and it will install llmount.sh to /usr/lib/lustre/tests/llmount.sh, I believe.
>
> On Thu, Nov 18, 2021 at 1:45 PM STEPHENS, DEAN - US <[email protected]> wrote:
>
> Not sure what you mean by "If you install the test suite". I am not seeing an llmount.sh file on the server using "locate llmount.sh" at this point. What are the steps to install the test suite?
>
> Dean
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 1:34 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> Hm.. if you install the test suite, does llmount.sh succeed? This should set up a single-node cluster on whatever node you're running Lustre on, and I believe it will load modules as needed (IIRC). If this test succeeds, then you know that Lustre is installed correctly (or correctly enough); if not, I'd focus on the installation, as the target issue may be a red herring.
>
> -cf
>
> On Thu, Nov 18, 2021 at 1:01 PM STEPHENS, DEAN - US <[email protected]> wrote:
>
> Thanks for the fast reply.
>
> When I do the tunefs.lustre /dev/sdX command I get:
>
> Target: <name>-OST0009
> Index: 9
>
> Target: <name>-OST0008
> Index: 8
>
> I spot-checked some others and they seem to be good, with the exception of one. It shows:
>
> Target: <name>-OST000a
> Index: 10
>
> But since there are 11 LUNs attached, that makes sense to me.
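> (For the record, the spot check was roughly the following, typed from memory since I cannot copy anything off of the closed network:
>
>   # print just the target name and index for each attached LUN
>   for d in /dev/sd[b-l]; do tunefs.lustre --dryrun $d | grep -E 'Target:|Index:'; done
>
> with /dev/sdb through /dev/sdl being the 11 LUNs.)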
> As far as the upgrade: it was a fresh install using the legacy targets, as the OSS and MDS nodes are virtual machines with the LUN disks attached to them, so that Red Hat sees them as /dev/sdX devices.
>
> When I loaded Lustre on the server I did a yum install lustre, and since we were pointed at the lustre-2.12 repo in our environment it picked up the following RPMs to install:
>
> lustre-resource-agents-2.12.6-1.el7.x86_64
> kmod-lustre-2.12.6-1.el7.x86_64
> kmod-zfs-3.10.0-1160.2.1.el7_lustre.x86_64-0.7.13-1.el7.x86_64
> kmod-lustre-osd-zfs-2.12.6-1.el7.x86_64
> lustre-2.12.6-1.el7.x86_64
> kmod-spl-3.10.0-1160.2.1.el7_lustre.x86_64-0.7.13-1.el7.x86_64
> lustre-osd-zfs-mount-2.12.6-1.el7.x86_64
> lustre-osd-ldiskfs-mount-2.12.6-1.el7.x86_64
>
> Dean
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 12:35 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> Hi,
>
> I believe that in 2.10 sometime (someone correct me if I'm wrong) the index parameter became required and needs to be specified. On an existing system this should already be set, but can you check the parameters line with tunefs.lustre for correct index=N values across your storage nodes?
>
> Also, with your "upgrade", was this a fresh install utilizing legacy targets?
>
> The last thing I can think of: IIRC there were on-disk format changes between 2.5 and 2.12. These should be transparent to you, but it may be that some other issue is preventing a successful upgrade, though the missing module error really speaks to possible issues around how Lustre was installed and loaded on the system.
>
> Cheers!
>
> -cf
>
> On Thu, Nov 18, 2021 at 12:24 PM STEPHENS, DEAN - US via lustre-discuss <[email protected]> wrote:
>
> I am by no means a Lustre expert and am seeking some help with our system. I am not able to post log files, as the servers are in a closed area with no access to the Internet.
>
> Here is a bit of history of our system:
>
> The OSS and MDS nodes were RHEL6, running a Lustre server kernel of 2.6.32-431.23.3.el6_lustre.x86_64 and Lustre version 2.5.3; the client version was 2.10. That was in a working state.
> We upgraded the OSS and MDS nodes to RHEL7 and installed the Lustre server 2.12 software and kernel.
> The attached 11 LUNs are showing up as /dev/sdb - /dev/sdl.
>
> Right now, on the OSS nodes, if I use the command tunefs.lustre /dev/sdb I get some data back saying that Lustre data has been found, but at the bottom of the output it shows "tunefs.lustre: Unable to mount /dev/sdb: No such device" and "Is the ldiskfs module available?"
>
> When I do a "modprobe -v lustre" I do not see ldiskfs.ko being loaded, even though there is an ldiskfs.ko file in the /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs directory. I am not sure how to get modprobe to load it.
>
> I used "insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/ldiskfs.ko" and re-ran the "tunefs.lustre /dev/sdb" command with the same result.
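> To be explicit, the exact sequence (typed by hand off of the closed system, so approximate) was:
>
>   modprobe -v lustre    # ldiskfs never appears among the loaded modules
>   insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/ldiskfs.ko
>   tunefs.lustre /dev/sdb    # still ends with "Is the ldiskfs module available"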
> If I use the same command on the MDS nodes I get "no Lustre data found" and "/dev/sdb has not been formatted with mkfs.lustre". I am not sure that is what is needed here, as the MDS nodes do not really have the Lustre data, since they are the metadata servers.
>
> I tried the command "tunefs.lustre --mgs --erase_params --mgsnode=<IP address>@tcp --writeconf --dryrun /dev/sdb" and got the error "/dev/sdb has not been formatted with mkfs.lustre".
>
> I need some help and guidance, and I can provide whatever may be needed, though it will have to be typed out, as I am not able to get actual log files off of the system.
>
> Dean Stephens
> CACI
> Linux System Admin
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
