Hi All,

I was discussing my recent adventures with LVM2 with a friend, who suggested I post them to the list in case someone could benefit from my experiences. So here goes:

I had a server that someone else had set up (to my disgust) in the following way:

A Sun Fire X2100 with two 250 GB SATA disks. They created a small /boot partition on sda1 for the boot stuff, then created an LVM RAID 0 volume group from the rest of the space: sda2 plus sdb1, a partition using the whole of the second disk. They then installed Red Hat Enterprise Linux 4 and created two logical volumes, one for root and another for swap.

Now somewhere along the way, some "bright" chap realised that the server had two disks but he was only seeing one in use :-) So the chap formatted /dev/sdb1 and mounted it as a separate mount point. Since he did not reboot, this mistake was never discovered.

So one fine day, the data on the server grew beyond the capacity of the first disk and LVM2 attempted to extend onto the second disk in the volume group. Predictably, this resulted in data corruption and major LVM errors; the server crashed and failed to reboot because the LVM metadata was corrupted.

Now the fun part: There was no backup of this server.

However, the errors LVM was throwing actually said it could not find the physical volume with uuid blah blah... so I had a UUID to work with.
The suggestions I found around the web were as follows:

pvcreate -ff --uuid xxxx /dev/sdb1
vgcreate VolGroup00 /dev/sda2 /dev/sdb1
vgcfgrestore -f metadata-file VolGroup00

However, I did not have a backed-up copy of the metadata file. Reading around on the web told me that LVM keeps its metadata at the beginning of the device, in the first 255 sectors following the LVM label in the device's first sector. So I booted up with the rescue disk and used foremost to try to extract the text of the config files from the raw device using pattern matching, following the suggestions at http://blog.eliasprobst.eu/?p=3. This did not work for me, probably because, as I discovered later, there was no complete uncorrupted metadata file on the disk. Then I realised I was being too cute with my stuff and instead simply did:

dd if=/dev/sda2 of=/dd.txt count=255 skip=1 bs=512

then vi /dd.txt. I found a bunch of binary stuff and snippets of config files, but no complete config file, showing that the metadata copies were indeed corrupt. With a bit of cut and paste here and there, I put together a working config file.
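For anyone repeating this, the extraction step can be sketched as below. Everything here is illustrative: /tmp/fakedev stands in for the real /dev/sda2, and the seeded text is just a metadata-like snippet, not the real contents of my disk.

```shell
#!/bin/sh
# Stand-in for /dev/sda2: a small file seeded with an LVM-metadata-like
# text snippet (illustrative only).
printf 'VolGroup00 {\nid = "xxxx-xxxx"\nseqno = 3\n}\n' > /tmp/fakedev

# Copy the metadata area out to a regular file for inspection. On the
# real device you would add skip=1 to jump over the first 512-byte
# sector (the LVM label).
dd if=/tmp/fakedev of=/tmp/dd.txt bs=512 count=255 2>/dev/null

# grep -a treats the (partly binary) dump as text and pulls out the
# config snippets.
grep -a 'VolGroup00' /tmp/dd.txt
# → VolGroup00 {
```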

However, I then ran into my next hurdle: vgcfgrestore failed. I can only assume that because the disk was already full and the volume group configuration was still corrupt, there was no space to write my configuration file, as the logical volumes were not available to move the extra data onto the second disk.

So I used dd again, this time to manually overwrite that section of the raw disk with my reconstructed metadata file. However, unless vgcfgrestore has actually failed and you have no other options, this is a very risky stunt and should be avoided. I only did it after dd'ing the entire disk onto another disk, just in case I made a mistake.
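The shape of that stunt, sketched with plain files standing in for the real devices (on the server the target was the raw partition; do NOT run this against real hardware without the full-disk backup first):

```shell
#!/bin/sh
# Fake 300-sector "device" and a stand-in reconstructed metadata file.
dd if=/dev/zero of=/tmp/part.img bs=512 count=300 2>/dev/null
printf 'VolGroup00 { reconstructed metadata goes here }\n' > /tmp/meta.txt

# 1. Full backup of the "disk" first, in case of mistakes.
dd if=/tmp/part.img of=/tmp/part.backup.img bs=512 2>/dev/null

# 2. Write the reconstructed metadata back. seek=1 skips the first
#    sector (the LVM label area); conv=notrunc leaves the rest of the
#    device untouched instead of truncating the output.
dd if=/tmp/meta.txt of=/tmp/part.img bs=512 seek=1 conv=notrunc 2>/dev/null
```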

Now when I did vgscan I was able to see:

    Found volume group "VolGroup00" using metadata type lvm2

pvscan likewise showed both hard disks in the volume group. I then ran:

    vgchange -a y VolGroup00

then lvscan, which showed me:

    ACTIVE   '/dev/VolGroup00/LogVol00' [476.38 GB] inherit
    ACTIVE   '/dev/VolGroup00/LogVol01' [512.00 MB] inherit

Voila: now that I could see my volumes, I could mount them and access my data. Mounting them proved to be a problem initially, though. The data was so corrupt that when I mounted the filesystem and listed the root directory with ls, it could not even tell whether /usr, /opt and /var were directories or files, and put a ? in that field. That meant I needed to run fsck on the filesystem to see if I could get this fixed.

Running fsck yielded errors and kept bombing out while trying to fix the first couple of inodes, alluding to a corrupted superblock. So I ran mke2fs -n /dev/VolGroup00/LogVol00, which showed me the alternate superblocks for the filesystem. I was then able to run fsck -b superblock -y /dev/VolGroup00/LogVol00. Remember to run fsck with -y: initially I did not, and had to keep answering y to the fsck prompts. I got tired, cancelled the fsck and redid it with -y, which proved a good decision because the fsck ran for over 36 hours with those prompts flying by so fast on the screen that there must have been at least 100,000 of them.
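The superblock trick can be sketched on a scratch image file; on the server the device was /dev/VolGroup00/LogVol00, and the sizes here are just small illustrative values. Requires e2fsprogs (mke2fs, e2fsck).

```shell
#!/bin/sh
# A 16 MB scratch filesystem with 1 KB blocks.
dd if=/dev/zero of=/tmp/fs.img bs=1024 count=16384 2>/dev/null
mke2fs -F -q -b 1024 /tmp/fs.img

# -n prints what mke2fs *would* do -- including the backup superblock
# locations -- without actually writing a filesystem.
mke2fs -F -n -b 1024 /tmp/fs.img | grep -A1 'Superblock backups'

# Repair from an alternate superblock (8193 for a 1 KB block size),
# answering yes to every prompt. e2fsck exits non-zero after making
# repairs, hence the || true.
e2fsck -y -b 8193 /tmp/fs.img || true
```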

A couple of things to note:

1. If using LVM, be sure to have a backup of your /etc directory, or at least of /etc/lvm (LVM keeps metadata backups there under backup/ and archive/).
2. If you do not have a backup and end up pulling a config file off the raw disk, pay close attention to the } characters that close sections in the LVM metadata file.
3. The usual: take backups. I wouldn't have needed to go through any of this if I had backups of my data. I would have simply reinstalled and then restored from backup.
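For point 1, a minimal sketch of the kind of backup that would have saved me all this grief. The paths are illustrative (/tmp/etc-lvm stands in for /etc/lvm so the sketch runs anywhere); on a real box you would archive /etc/lvm itself and ship the tarball to another machine.

```shell
#!/bin/sh
# Stand-in for /etc/lvm with one fake metadata backup file in it.
mkdir -p /tmp/etc-lvm/backup /tmp/etc-lvm/archive
printf 'VolGroup00 { }\n' > /tmp/etc-lvm/backup/VolGroup00

# Archive it; in real life, copy this tarball off the machine.
tar -czf /tmp/lvm-config-backup.tar.gz -C /tmp etc-lvm

# Sanity-check the archive contents.
tar -tzf /tmp/lvm-config-backup.tar.gz
```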

_______________________________________________
LUG mailing list
[email protected]
http://kym.net/mailman/listinfo/lug
LUG is generously hosted by INFOCOM http://www.infocom.co.ug/

The above comments and data are owned by whoever posted them (including 
attachments if any). The List's Host is not responsible for them in any way.
---------------------------------------
