Hi.

I just had a weird problem I'd like to share, and see how it can be fixed.

After installing dozens of servers without a problem, one specific install failed during setup-storage. The error looked something like this :
> Executing: fai-vol_id -u /dev/sda1
> Executing: fai-vol_id -l /dev/sda1
> /dev/sda1 UUID=6eb8e204-614f-42f1-9431-5b2aacf091d4
> Executing: fai-vol_id -u /dev/sda2
> Executing: fai-vol_id -l /dev/sda2
> /dev/sda2 UUID=c480d5e4-3271-4d93-8875-12dacaa032d1
> Executing: fai-vol_id -u /dev/system/swap
> Command had non-zero exit code

My config file looks like this :
> disk_config disk1  fstabkey:uuid
> primary /boot       256       ext3    rw
> primary /           512       ext4    rw
> primary -           0-        -       -
>
> disk_config lvm
> vg      system     disk1.3
> system-home     /home  1000   ext4  noexec,nodev,nosuid
> system-usr      /usr   2000   ext4  defaults
> system-tmp      /tmp   1000   ext4  nodev,noexec,nosuid
> system-var      /var   5000  ext4  noatime
> system-swap     swap   2000   swap  -

It seems fai-vol_id could not probe /dev/system/swap. A quick test showed that it worked fine with the other LVs on the disk.

After more investigation, it turned out that a ReiserFS filesystem used to sit at the same place on the disk. Here is what I found:

> root@fai-client:~# blkid -p /dev/dm-*
> /dev/dm-0: ambivalent result (probably more filesystems on the device, use wipefs(8) to see more details)
> /dev/dm-1: UUID="f0d3e822-90d2-418a-af56-0228db0c600e" VERSION="1.0" TYPE="ext4" USAGE="filesystem"
> /dev/dm-2: UUID="d76feba3-af3b-4278-8344-eeccb6e4d34c" VERSION="1.0" TYPE="ext4" USAGE="filesystem"
> /dev/dm-3: UUID="0b180c48-d8d5-4008-bd5d-b48dc456d5d0" VERSION="1.0" TYPE="ext4" USAGE="filesystem"
> /dev/dm-4: UUID="6ad447dc-7df4-40b8-81ef-d852b5d6e714" VERSION="1.0" TYPE="ext4" USAGE="filesystem"
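
As far as I can tell from blkid(8), an ambivalent low-level probe is reported with exit code 8, so this condition could be detected from a script. Here is a minimal sketch, assuming the util-linux blkid; the function name find_ambivalent is made up for illustration and is not part of FAI:

```shell
#!/bin/sh
# find_ambivalent: hypothetical helper, not part of FAI or setup-storage.
# Prints each device for which "blkid -p" exits with code 8, which
# blkid(8) documents as "ambivalent low-level probing result".
find_ambivalent() {
    for dev in "$@"; do
        rc=0
        blkid -p "$dev" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq 8 ]; then
            echo "$dev"
        fi
    done
}
```

It would be called as "find_ambivalent /dev/dm-*"; any device it prints is a candidate for a closer look with wipefs.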

Wipefs revealed that it could see two filesystems :
> root@pdf:~# wipefs /dev/system/swap
> offset               type
> ----------------------------------------------------------------
> 0x10034              reiserfs   [filesystem]
>                      UUID:  fab7a960-4570-44a9-a74f-e02429213e6a
>
> 0xff6                swap   [other]
>                      UUID:  ad03f0b0-b2cc-4908-96ca-e8f77ad1522b

A quick "wipefs --all /dev/system/swap" fixed the issue, and setup-storage then ran fine.


Now, on to "how do we fix this issue ?"

I see several ways this could be done :
- add a "wipe" action that should be run before an install, booting something like DBAN through PXE, and wiping everything.
- before destroying any partition/vg/other, perform a "wipefs --all" on it.
I like the second solution better, though.
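
To make the second option a bit more concrete, here is a rough sketch of what such a step could look like. It is only an illustration: wipe_signatures and the DRY_RUN variable are names I made up, and this is not existing setup-storage code. It would have to run while the old devices still exist, i.e. before the partitions or LVs are removed.

```shell
#!/bin/sh
# wipe_signatures: hypothetical helper, not existing setup-storage code.
# Runs "wipefs --all" on every device given as an argument.
# With DRY_RUN=1 it only prints the commands instead of running them.
wipe_signatures() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "wipefs --all $dev"
        else
            # Stop at the first failure so a problem is not silently skipped.
            wipefs --all "$dev" || return 1
        fi
    done
}
```

For example, "DRY_RUN=1 wipe_signatures /dev/system/swap" only prints the command, which makes it possible to check what would be erased before doing it for real.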

--

Vivien Bernet-Rollande
Systems & Networking Engineer
Alter Way Hosting
