Re: [PVE-User] LV Not Available - could not open disk - no such file or directory

John Crisp Thu, 01 Sep 2016 03:20:13 -0700

Hi Dietmar

On 01/09/16 08:38, Dietmar Maurer wrote:
>> In reply to Dietmar in absence of John:
>>
>> root@pve:~# lvchange -a y pve/data
>>    Thin pool transaction_id is 0, while expected 3.
> 
> 
> Does it help if you reboot the node?
>

No - we tried that (it's a single machine/node - no cluster)

> Some people reported that is possible to fix this with 
> vgcfgbackup/vgcfgrestore,
> but you should be really careful when doing such things (backup everything
> first).
> 

We can try it, but we have a very large VM and nowhere to store a copy
of it easily at present. The main part of the array is taken up by the
inaccessible lvm-thin partition, so we cannot back up to the small
'local' partition, and we cannot see another way to get a backup off the
server, since you cannot back up a VM that you cannot access.

> # vgcfgbackup -f lvmbackup.txt pve
> 
> edit transaction_id inside lvmbackup.txt
> 
> # vgcfgrestore --file lvmbackup.txt --force pve

We can try this, but clearly we risk losing the VM.
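
For reference, the "edit transaction_id" step could be sketched as below.
This is only a sketch under assumptions, not a tested recovery procedure:
it assumes the backup file contains a single thin pool (pve/data) with one
line of the form "transaction_id = N", and that 0 is the value to write
because that is what the pool itself reports in the error above. The
function name set_transaction_id and the file name lvmbackup.edited.txt
are made up for illustration. It only ever writes to a copy, so the
original backup file stays untouched.

```shell
#!/bin/sh
# Sketch only: rewrite transaction_id in a COPY of an LVM metadata backup.
# Assumption: the file holds a single thin pool, so there is exactly one
# "transaction_id = N" line. With several pools this would change them all.
set_transaction_id() {
    src=$1      # original file written by vgcfgbackup
    dst=$2      # edited copy (hypothetical name, chosen by the caller)
    new_id=$3   # value the thin pool itself reports (0 in the error above)
    sed -E "s/^([[:space:]]*transaction_id[[:space:]]*=[[:space:]]*)[0-9]+/\1${new_id}/" \
        "$src" > "$dst"
}

# Intended use (deliberately left commented out):
#   vgcfgbackup -f lvmbackup.txt pve
#   set_transaction_id lvmbackup.txt lvmbackup.edited.txt 0
#   diff lvmbackup.txt lvmbackup.edited.txt   # review the single change
#   vgcfgrestore --file lvmbackup.edited.txt --force pve
```

Reviewing the diff before the restore step at least confirms that only
the one line changed. It does not remove the risk to the VM data.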

> 
> But John wrote:
> 
>> There was an issue with the BBWC on the RAID
> 
> So it is likely that more than this is damaged (what happened exactly?)...
> 

Extremely unlikely. The RAID controller would normally just disable
write caching to preserve data integrity. From what we can see, the
server appears to have been shut down cleanly before they replaced the
battery. There are no obvious errors in the logs.

Due to an error by the datacentre staff, who managed to plug the
network cables into the wrong ports (!!!!!!) after replacing the battery,
the machine was rebooted a number of times. The VMs were all set to
manual start, so none were run until a few hours later. The machine
seemed to boot cleanly with no obvious errors.

What is worrying is:

a) There is no effective way to back up a VM if you cannot mount the
partition.

b) For a single machine, a default setup that provisions lvm-thin with
no easily accessible backup space is clearly dangerous.

c) We do not know what actually caused this to happen.

Having looked a bit further, the only thing I can see that *might* be
related is this:

Aug 31 15:35:55 pve pve-manager[5584]: shutdown VM 701:
UPID:pve:000015D0:147C9EF2:57C6DD3B:qmshutdown:701:root@pam:

Aug 31 15:36:06 pve pve-manager[5583]: end task
UPID:pve:000015D0:147C9EF2:57C6DD3B:qmshutdown:701:root@pam:

Aug 31 15:36:10 pve pve-manager[5583]: all VMs and CTs stopped

Aug 31 15:36:10 pve pve-manager[5577]: <root@pam> end task
UPID:pve:000015CF:147C9EEF:57C6DD3B:stopall::root@pam: OK

Aug 31 15:36:12 pve fusermount[5672]: /bin/fusermount: failed to unmount
/var/lib/lxcfs: Invalid argument

This seems very similar:

https://forum.proxmox.com/threads/cannot-start-kvm.25085/

We wonder whether the VM was shut down cleanly but the file system was
not unmounted cleanly?

Any thoughts appreciated.

B. Rgds
John

