What does tune2fs report for /dev/sdb on the MDS? (Also, sorry, this somehow got lost in my inbox.)
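A sketch of that tune2fs check, for reference. The device name, the field filter, and the canned superblock lines below are illustrative assumptions, not output from the system in question:

```shell
# On the MDS, the real command would be something like:
#   tune2fs -l /dev/sdb | grep -Ei 'volume name|features|state'
# The filter is exercised here on made-up tune2fs-style output so the
# expected shape is visible without access to the server.
tune2fs_filter() {
  grep -Ei 'volume name|features|state'
}

tune2fs_filter <<'EOF'
Filesystem volume name:   demo-MDT0000
Filesystem features:      has_journal ext_attr dir_index flex_bg
Filesystem state:         clean
Inode count:              1000000
EOF
```

An ldiskfs-backed Lustre target normally carries its target name in the volume label, so an empty or unexpected label here would itself be a useful data point.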
On Mon, Nov 22, 2021 at 8:57 AM STEPHENS, DEAN - US <[email protected]> wrote:

> Colin and Andreas, to clarify some points for you, this is what I am seeing:
>
> rpm -qa | grep lustre
> kmod-lustre-2.12.6-1.el7.x86_64
> lustre-iokit-2.12.6-1.el7.x86_64
> lustre-tests-2.12.6-1.el7.x86_64
> kernel-devel-3.10.0-1160.2.el7_lustre.x86_64
> lustre-osd-ldiskfs-2.12.6-1.el7.x86_64
> kmod-lustre-osd-ldiskfs-2.12.6-1.el7.x86_64
> kmod-lustre-tests-2.12.6-1.el7.x86_64
> lustre-resource-agents-2.12.6-1.el7.x86_64
> kernel-3.10.0-1160.2.el7_lustre.x86_64
> lustre-2.12.6-1.el7.x86_64
>
> rpm -qa | grep e2fs
> e2fsprogs-libs-1.45.6.wc1-0.el7.x86_64
> e2fsprogs-1.45.6.wc1-0.el7.x86_64
>
> With all of that installed, and after a successful run and cleanup with llmount.sh and llmountcleanup.sh, I am still getting the errors:
>
> “Unable to mount /dev/sdb: Invalid argument”
> “tunefs.lustre: FATAL: failed to write local files” and “tunefs.lustre: exiting with 22 (Invalid argument)”
>
> when I use the command tunefs.lustre /dev/sdb (which is one of the Lustre LUNs attached as a “disk” to the VM).
>
> Full output of the tunefs.lustre /dev/sdb command (as much as I can show, anyway):
>
> tunefs.lustre /dev/sdb
> checking for existing Lustre data: found
> Reading CONFIGS/mountdata
>
> Read previous values:
> Target:     <name>-OST0009
> Index:      9
> Lustre FS:  <name>
> Mount type: ldiskfs
> Flags:      0x1002
>             (OST no_primnode )
> Persistent mount opts: errors=remount-ro
> Parameters: mgsnode=<IP of the 1st MGS node>@tcp mgsnode=<IP of the 2nd MGS node>@tcp failover.node=<IP of the 1st OSS node>@tcp failover.node=<IP of the 2nd OSS node>@tcp
>
> Permanent disk data:
> Target:     <name>-OST0009
> Index:      9
> Lustre FS:  <name>
> Mount type: ldiskfs
> Flags:      0x1002
>             (OST no_primnode )
> Persistent mount opts: errors=remount-ro
> Parameters: mgsnode=<IP of the 1st MGS node>@tcp mgsnode=<IP of the 2nd MGS node>@tcp failover.node=<IP of the 1st OSS node>@tcp failover.node=<IP of the 2nd OSS node>@tcp
>
> tunefs.lustre: Unable to mount /dev/sdb: Invalid argument
>
> tunefs.lustre: FATAL: failed to write local files
> tunefs.lustre: exiting with 22 (Invalid argument)
>
> Now, to be clear, the MDS nodes are not working correctly either, as I am not able to mount /dev/sdb on them, which is where the existing metadata is served from. To this point I have been concentrating on the OSS nodes, as that is where the Lustre data is coming from. I have installed the Lustre kernel and the same software on the MDS nodes in the same way that I have on the OSS nodes. When I try to use tunefs.lustre /dev/sdb on the MDS nodes I get an error saying:
>
> checking for existing Lustre data: not found
>
> tunefs.lustre: FATAL: device /dev/sdb has not been formatted with mkfs.lustre
> tunefs.lustre: exiting with 19 (no such device)
>
> I am assuming that this is correct, as that attached LUN does not need to have Lustre data on it since it is the metadata server. Is there anything that I can/need to check on the MDS nodes to see what is running/working correctly?
>
> I know that this is a lot, and I appreciate any help that you can give me to troubleshoot this.
>
> Dean
>
> *From:* STEPHENS, DEAN - US
> *Sent:* Monday, November 22, 2021 5:58 AM
> *To:* Andreas Dilger <[email protected]>
> *Cc:* Colin Faber <[email protected]>; [email protected]
> *Subject:* RE: [lustre-discuss] Lustre and server upgrade
>
> Thanks for the clarification. I am using llmount.sh to test the install of the OST and MDT, not to run in production. I hope to have more done today and will reach out to let you all know what I find.
> Dean
>
> *From:* Andreas Dilger <[email protected]>
> *Sent:* Friday, November 19, 2021 5:25 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* Colin Faber <[email protected]>; [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> Dean,
> it should be emphasized that "llmount.sh" and "llmountcleanup.sh" are for quickly formatting and mounting *TEST* filesystems. They only create a few small (400MB) loopback files in /tmp and format them as OSTs and MDTs. This should *NOT* be used on a production system, or you will be very sad when the files in /tmp disappear after the server is rebooted and/or they reformat your real filesystem devices.
>
> I mention this here because it isn't clear to me whether you are using them for testing, or trying to get a real filesystem mounted.
>
> Cheers, Andreas
>
> On Nov 19, 2021, at 13:25, STEPHENS, DEAN - US via lustre-discuss <[email protected]> wrote:
>
> I also figured out how to clean up after the llmount.sh script is run. There is an llmountcleanup.sh that will do that.
>
> Dean
>
> *From:* STEPHENS, DEAN - US
> *Sent:* Friday, November 19, 2021 1:08 PM
> *To:* Colin Faber <[email protected]>
> *Cc:* [email protected]
> *Subject:* RE: [lustre-discuss] Lustre and server upgrade
>
> One more thing that I have noticed using the llmount.sh script: the directories that were created by the script under /mnt have permissions 000. The ones that I have configured under /mnt/lustre are set to 750.
>
> Is this something that needs to be fixed? I have these servers being configured via Puppet, and that is how the /mnt/lustre directories are being created and the permissions set.
> Dean
>
> *From:* STEPHENS, DEAN - US
> *Sent:* Friday, November 19, 2021 7:14 AM
> *To:* Colin Faber <[email protected]>
> *Cc:* [email protected]
> *Subject:* RE: [lustre-discuss] Lustre and server upgrade
>
> The other question that I have is how to clean up after llmount.sh has been run. If I do a df on the server I see that mds1, ost1, and ost2 are still mounted under /mnt. Do I need to manually umount them, since llmount.sh completed successfully?
>
> Also, I have not done anything to my MDS node, so some direction on what to do there would be helpful as well.
>
> Dean
>
> *From:* STEPHENS, DEAN - US
> *Sent:* Friday, November 19, 2021 7:00 AM
> *To:* Colin Faber <[email protected]>
> *Cc:* [email protected]
> *Subject:* RE: [lustre-discuss] Lustre and server upgrade
>
> Thanks for the help yesterday; I was able to install the Lustre kernel and software, including the tests RPM, on a VM.
>
> This is what I did, following these directions <https://wiki.lustre.org/Installing_the_Lustre_Software#Lustre_Servers_with_LDISKFS_OSD_Support>:
>
> Installed the Lustre kernel and kernel-devel (the other RPMs listed were not in my lustre-server repo)
> Rebooted the VM
> Installed kmod-lustre kmod-lustre-osd-ldiskfs lustre-osd-ldiskfs-mount lustre lustre-resource-agents lustre-tests
> Ran modprobe -v lustre (did not show that it loaded kernel modules, as it has done in the past)
> Ran lustre_rmmod (got an error "Module lustre in use")
> Rebooted again
> Ran llmount.sh, and it looked like it completed successfully
> Ran tunefs.lustre /dev/sdb (at the bottom of the output I am seeing "tunefs.lustre: Unable to mount /dev/sdb: Invalid argument", "tunefs.lustre: FATAL: failed to write local files", and "tunefs.lustre: exiting with 22 (Invalid argument)")
>
> Any idea what the "invalid argument" is talking about?
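On the cleanup question above: llmountcleanup.sh is the supported teardown, but the leftover test mounts can also be removed by hand. A sketch, with the privileged commands left as comments; the awk filter only assumes llmount.sh's default /mnt/mds1, /mnt/ost1, ... mountpoints, and the canned input below mirrors the mounts reported in the thread:

```shell
# Pull llmount.sh test mountpoints out of /proc/mounts-style input.
list_test_mounts() {
  awk '$2 ~ /^\/mnt\/(mds|ost)[0-9]+$/ { print $2 }'
}

# On the live node (as root), roughly:
#   list_test_mounts < /proc/mounts | xargs -r -n1 umount
#   lustre_rmmod    # then unload the Lustre modules

# Demonstrated on canned /proc/mounts-style lines:
list_test_mounts <<'EOF'
/dev/loop0 /mnt/mds1 lustre rw 0 0
/dev/loop1 /mnt/ost1 lustre rw 0 0
/dev/loop2 /mnt/ost2 lustre rw 0 0
/dev/sda1 / xfs rw 0 0
EOF
```

Filtering on the mountpoint pattern rather than the filesystem type keeps real production Lustre mounts (e.g. under /mnt/lustre) out of the teardown list.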
> Dean
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 3:34 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> The VM will need a full install of all server packages, as well as the tests package, to allow for this test.
>
> On Thu, Nov 18, 2021 at 2:26 PM STEPHENS, DEAN - US <[email protected]> wrote:
>
> I have not tried that, but I can do that on a new VM that I can create. I assume all I need is the lustre-tests RPM and associated dependencies, not the full-blown Lustre install?
>
> Dean
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 2:22 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> So that indicates that your installation is incomplete, or something else is preventing lustre, ldiskfs, and possibly other modules from loading. Have you been able to reproduce this behavior (i.e. llmount.sh failing) on a fresh RHEL install with Lustre 2.12.7?
>
> -cf
>
> On Thu, Nov 18, 2021 at 2:20 PM STEPHENS, DEAN - US <[email protected]> wrote:
>
> Thanks for the direction. I found it and installed lustre-tests.x86_64, and now I have llmount.sh, which defaulted to /usr/lib64/lustre/tests/llmount.sh. When I ran it, it failed with:
>
> Stopping clients: <hostname> /mnt/lustre (opts: -f)
> Stopping clients: <hostname> /mnt/lustre2 (opts: -f)
> Loading modules from /usr/lib64/lustre/tests/..
> Detected 2 online CPUs by sysfs
> Force libcfs to create 2 CPU partitions
> Formatting mgs, mds, osts
> Format mds1: /tmp/lustre-mdt1
> mkfs.lustre: Unable to mount /dev/loop0: No such device (even though /dev/loop0 is a thing)
> Is the ldiskfs module loaded?
> mkfs.lustre FATAL: failed to write local files
> mkfs.lustre: exiting with 19 (no such device)
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 2:03 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> This would be part of the lustre-tests RPM package, and will install llmount.sh to /usr/lib/lustre/tests/llmount.sh, I believe.
>
> On Thu, Nov 18, 2021 at 1:45 PM STEPHENS, DEAN - US <[email protected]> wrote:
>
> Not sure what you mean by "if you install the test suite". I am not seeing an llmount.sh file on the server using "locate llmount.sh" at this point. What are the steps to install the test suite?
>
> Dean
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 1:34 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> Hm.. if you install the test suite, does llmount.sh succeed? This should set up a single-node cluster on whatever node you're running Lustre on, and I believe it will load modules as needed (IIRC). If this test succeeds, then you know that Lustre is installed correctly (or correctly enough); if not, I'd focus on the installation, as the target issue may be a red herring.
>
> -cf
>
> On Thu, Nov 18, 2021 at 1:01 PM STEPHENS, DEAN - US <[email protected]> wrote:
>
> Thanks for the fast reply.
>
> When I do the tunefs.lustre /dev/sdX command I get:
>
> Target: <name>-OST0009
> Index: 9
>
> Target: <name>-OST0008
> Index: 8
>
> I spot-checked some others and they seem to be good, with the exception of one. It shows:
>
> Target: <name>-OST000a
> Index: 10
>
> But since there are 11 LUNs attached, that makes sense to me.
> As far as the upgrade, it was a fresh install using the legacy targets, as the OSS and MDS nodes are virtual machines with the LUN disks attached to them so that Red Hat sees them as /dev/sdX devices.
>
> When I loaded Lustre on the server I did a yum install lustre, and since we were pointed at the lustre-2.12 repo in our environment it picked up the following RPMs to install:
>
> lustre-resource-agents-2.12.6-1.el7.x86_64
> kmod-lustre-2.12.6-1.el7.x86_64
> kmod-zfs-3.10.0-1160.2.1.el7_lustre.x86_64-0.7.13-1.el7.x86_64
> kmod-lustre-osd-zfs-2.12.6-1.el7.x86_64
> lustre-2.12.6-1.el7.x86_64
> kmod-spl-3.10.0-1160.2.1.el7_lustre.x86_64-0.7.13-1.el7.x86_64
> lustre-osd-zfs-mount-2.12.6-1.el7.x86_64
> lustre-osd-ldiskfs-mount-2.12.6-1.el7.x86_64
>
> Dean
>
> *From:* Colin Faber <[email protected]>
> *Sent:* Thursday, November 18, 2021 12:35 PM
> *To:* STEPHENS, DEAN - US <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] Lustre and server upgrade
>
> Hi,
>
> I believe that sometime in 2.10 (someone correct me if I'm wrong) the index parameter became required and needs to be specified. On an existing system this should already be set, but can you check the parameters line with tunefs.lustre for correct index=N values across your storage nodes?
>
> Also, with your "upgrade", was this a fresh install utilizing legacy targets?
>
> The last thing I can think of: IIRC, there were on-disk format changes between 2.5 and 2.12. These should be transparent to you, but some other issue may be preventing a successful upgrade, though the missing-module error really speaks to possible issues around how Lustre was installed and loaded on the system.
>
> Cheers!
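The index=N spot check suggested here can be scripted across all eleven LUNs. A sketch: the loop over the real devices is left as a comment, and the hex/decimal comparison below it runs on canned tunefs.lustre-style output (the `demo-` target names are made up). Note that a Target of ...OST000a with Index 10 is consistent, since hex 000a is decimal 10:

```shell
# On an OSS, something like:
#   for dev in /dev/sd{b..l}; do
#     tunefs.lustre --dryrun "$dev" 2>/dev/null | grep -E 'Target:|Index:'
#   done

# Verify each Target's hex suffix against its decimal Index (bash, for
# the 16#... base conversion).
check_index() {
  local tgt key val
  while read -r key val; do
    case $key in
      Target:) tgt=${val##*-OST} ;;
      Index:)
        if (( 16#$tgt == val )); then
          echo "OST$tgt = index $val: ok"
        else
          echo "OST$tgt = index $val: MISMATCH"
        fi ;;
    esac
  done
}

# OST000a / Index 10 from the thread is consistent: 0x000a == 10.
check_index <<'EOF'
Target: demo-OST0009
Index: 9
Target: demo-OST000a
Index: 10
EOF
```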
> -cf
>
> On Thu, Nov 18, 2021 at 12:24 PM STEPHENS, DEAN - US via lustre-discuss <[email protected]> wrote:
>
> I am by no means a Lustre expert and am seeking some help with our system. I am not able to get log files to post, as the servers are in the closed area with no access to the Internet.
>
> Here is a bit of history of our system:
>
> The OSS and MDS nodes were RHEL6, running a Lustre server with kernel 2.6.32-431.23.3.el6_lustre.x86_64 and Lustre version 2.5.3; the client version was 2.10. That was in a working state.
> We upgraded the OSS and MDS nodes to RHEL7 and installed the Lustre server 2.12 software and kernel.
> The attached 11 LUNs are showing up as /dev/sdb - /dev/sdl.
> Right now, on the OSS nodes, if I use the command tunefs.lustre /dev/sdb I get some data back saying that Lustre data has been found, but at the bottom of the output it shows "tunefs.lustre: Unable to mount /dev/sdb: No such device" and "Is the ldiskfs module available?"
> When I do a "modprobe -v lustre" I do not see ldiskfs.ko being loaded, even though there is an ldiskfs.ko file in the /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs directory. I am not sure how to get it to load with the modprobe command.
> I used "insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/ldiskfs.ko" and re-ran the "tunefs.lustre /dev/sdb" command with the same result.
> If I use the same command on the MDS nodes I get "no Lustre data found" and "/dev/sdb has not been formatted with mkfs.lustre". I am not sure that is what is needed here, as the MDS nodes do not really have the Lustre data, since it is the metadata server.
> I tried to use the command "tunefs.lustre --mgs --erase_params --mgsnode=<IP address>@tcp --writeconf --dryrun /dev/sdb" and get the error "/dev/sdb has not been formatted with mkfs.lustre".
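On the module question in the message above: modprobe resolves module dependencies while insmod does not, and as far as I can tell loading the lustre module alone does not pull in ldiskfs (ldiskfs backs the osd-ldiskfs module rather than lustre itself), so `modprobe ldiskfs` is the thing to try. A sketch; the privileged commands stay as comments, and the helper only rebuilds the module path from a kernel version string so it need not be typed by hand (the insmod path quoted above appears to contain a stray space before ldiskfs.ko):

```shell
# On the server (as root), the load attempt would be roughly:
#   modprobe -v ldiskfs
#   lsmod | grep -E '^(ldiskfs|osd_ldiskfs|lustre)'
#   dmesg | tail -n 20    # the kernel usually logs why a mount or load failed

# Build the module path from a kernel version; normally pass "$(uname -r)".
ldiskfs_path() {
  echo "/lib/modules/$1/extra/lustre/fs/ldiskfs.ko"
}

ldiskfs_path 3.10.0-1160.2.1.el7_lustre.x86_64
```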
> I need some help and guidance, and I can provide whatever may be needed, though it will have to be typed out, as I am not able to get actual log files off the system.
>
> Dean Stephens
> CACI
> Linux System Admin
>
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
