Re: [Users] How to rescue storage domain structure

2013-04-24 Thread Chris Smith
Does this work with volume groups?  I have several virtual disks
presented to the VM which are part of a volume group.

[root@voyager media]# fsarchiver probe
[==DISK==] [==NAME==] [====SIZE====] [MAJ] [MIN]
[vda     ] [        ] [   40.00 GB ] [252] [  0]
[vdb     ] [        ] [  100.00 GB ] [252] [ 16]
[vdc     ] [        ] [   20.00 GB ] [252] [ 32]
[vdd     ] [        ] [   40.00 GB ] [252] [ 48]

[=DEVICE=] [==FILESYS==] [==LABEL==] [====SIZE====] [MAJ] [MIN]
[vda1    ] [ext4       ] [unknown  ] [  500.00 MB ] [252] [  1]
[vda2    ] [LVM2_member] [unknown  ] [   39.51 GB ] [252] [  2]
[vdb1    ] [LVM2_member] [unknown  ] [  100.00 GB ] [252] [ 17]
[vdc1    ] [LVM2_member] [unknown  ] [   20.00 GB ] [252] [ 33]
[dm-0    ] [ext4       ] [unknown  ] [    4.00 GB ] [253] [  0]
[dm-1    ] [swap       ] [unknown  ] [    3.94 GB ] [253] [  1]
[dm-2    ] [ext4       ] [unknown  ] [  119.99 GB ] [253] [  2]
[dm-3    ] [ext4       ] [unknown  ] [   15.00 GB ] [253] [  3]
[dm-4    ] [ext4       ] [unknown  ] [    8.00 GB ] [253] [  4]
[dm-5    ] [ext4       ] [unknown  ] [    8.00 GB ] [253] [  5]

I'm thinking that during restore I can just re-create the volume groups
and logical volumes and then restore each file system backup to its
logical volume.  Or better yet, since I know how much space I'm actually
using, just create one logical volume of the right size; I kept adding
virtual disks as needed to store repos in /var/satellite.
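
For what it's worth, a rough sketch of what that single-LV restore could
look like (the archive name and LV size here are illustrative, not a
tested procedure):

  # recreate the PV and VG on the new disk, then one LV big enough for the data
  pvcreate /dev/vdb1
  vgcreate satellite /dev/vdb1
  lvcreate -n lv_packages -L 60G satellite

  # restore the first filesystem in the archive (id=0) onto the new LV
  fsarchiver restfs /mnt/media/voyager/satellite.fsa id=0,dest=/dev/satellite/lv_packages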

I also want to verify the syntax I'm using:

fsarchiver savefs -Aa -e /mnt/media/* -j 2
/mnt/media/voyager/boot.fsa /dev/vda1

That seemed to work fine for backing up /boot.
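
Presumably the same form works for the LVM-backed filesystems by pointing
savefs at the logical volume path instead of a partition.  A sketch (the
archive name is just an example; quoting the exclude pattern keeps the
shell from expanding it before fsarchiver sees it):

  fsarchiver savefs -Aa -e '/mnt/media/*' -j 2 \
    /mnt/media/voyager/satellite.fsa /dev/satellite/lv_packages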

Are there any other recommended options I should be using for backing
up live file systems mounted read/write?

I've also stopped all of the Spacewalk services and other services on
the VM in order to minimize the number of open files that might be skipped.

Volume group structure:

[root@voyager media]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/vdb1
  VG Name               satellite
  PV Size               100.00 GiB / not usable 3.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              25599
  Free PE               0
  Allocated PE          25599
  PV UUID               g3uGGu-p0b3-eSIJ-Bwy7-YOTD-GKnd-prWP7a

  --- Physical volume ---
  PV Name               /dev/vdc1
  VG Name               satellite
  PV Size               20.00 GiB / not usable 3.89 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              5119
  Free PE               0
  Allocated PE          5119
  PV UUID               W35GYr-T6pg-3e0o-s8I7-aqtc-fxcD-Emh62K

  --- Physical volume ---
  PV Name               /dev/vda2
  VG Name               vg_voyager
  PV Size               39.51 GiB / not usable 3.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              10114
  Free PE               146
  Allocated PE          9968
  PV UUID               hJCdct-iR6Q-NPYi-eBZN-dZdP-x4YP-U1zyvE

[root@voyager media]# vgdisplay
  --- Volume group ---
  VG Name               satellite
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               119.99 GiB
  PE Size               4.00 MiB
  Total PE              30718
  Alloc PE / Size       30718 / 119.99 GiB
  Free  PE / Size       0 / 0
  VG UUID               fXvCp3-N0uG-rBRc-FWVJ-Kpv3-AH9L-1PnYUy

  --- Volume group ---
  VG Name               vg_voyager
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  9
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                5
  Open LV               5
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               39.51 GiB
  PE Size               4.00 MiB
  Total PE              10114
  Alloc PE / Size       9968 / 38.94 GiB
  Free  PE / Size       146 / 584.00 MiB
  VG UUID               3txqia-eDtn-j5wn-iixS-gfpv-90b9-ButDqh

[root@voyager media]# lvdisplay
  --- Logical volume ---
  LV Path                /dev/satellite/lv_packages
  LV Name                lv_packages
  VG Name                satellite
  LV UUID                03VUWu-bxGf-hG2b-c3cx-m3lu-7Dlp-iaiWzu
  LV Write Access        read/write
  LV Creation host, time voyager, 2012-11-11 12:53:54 -0500
  LV Status              available
  # open
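
In case it helps at restore time, LVM can also dump the complete volume
group layout (PE counts, LV segments) to a plain-text file with
vgcfgbackup, which can later be replayed with vgcfgrestore.  A sketch,
with example output paths:

  vgcfgbackup -f /mnt/media/voyager/vg-satellite.conf satellite
  vgcfgbackup -f /mnt/media/voyager/vg-voyager.conf vg_voyager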

Re: [Users] How to rescue storage domain structure

2013-04-22 Thread Joop

Chris Smith wrote:

List,

I have lost the ability to manage the hosts or VMs using the ovirt-engine
web interface.  The data center is offline, and I can't actually perform
any operations with the hosts or VMs.  I don't think that there are any
actions I can perform in the web interface at all.

What's odd is that I can tell the host to go into maintenance mode
using the ovirt-engine web interface and it seems to go into
maintenance mode; it even shows the wrench icon next to the host.  I
can also try to activate it after it supposedly goes into maintenance
mode, and it states that the host was activated, but the host never
actually comes up or contends for SPM status, and the data center
never comes online.

From the logs it seems that at least PKI is broken between the engine
and the hosts, as I see numerous certificate errors on both the
ovirt-engine and the clients.

vdsm.log shows:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/SocketServer.py", line 582, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 66, in finish_request
    request.do_handshake()
  File "/usr/lib64/python2.7/ssl.py", line 305, in do_handshake
    self._sslobj.do_handshake()
SSLError: [Errno 1] _ssl.c:504: error:14094416:SSL routines:SSL3_READ_BYTES:sslv3 alert certificate unknown

and engine.log shows:

2013-04-18 18:42:43,632 ERROR
[org.ovirt.engine.core.engineencryptutils.EncryptionUtils]
(QuartzScheduler_Worker-68) Failed to decryptData must start with zero
2013-04-18 18:42:43,642 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand]
(QuartzScheduler_Worker-68) XML RPC error in command
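
As a quick sanity check on the PKI side, something like the following
should show whether the host certificate still verifies against the CA,
and what vdsm actually presents on the wire (the paths are the usual
vdsm defaults and the host name is a placeholder):

  # does the vdsm certificate still verify against the local CA copy?
  openssl verify -CAfile /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/certs/vdsmcert.pem

  # which certificate does vdsm present on its management port?
  openssl s_client -connect myhost:54321 < /dev/null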


Alon Bar-Lev was able to offer several good pointers in another thread,
titled "Certificates and PKI seem to be broken after yum update", and
eventually concluded that the installation seems to be corrupted beyond
just the certificates, truststore, and keystore.  He suggested that I
start a new thread to ask how to rescue the storage domain structure.

The storage used for the data center is iSCSI, which is intact and
working.  In fact, two of the VMs are still online and running on one of
the original FC17 host systems.

I'm not able to reinstall any of the existing hosts from the ovirt-engine
web interface.  I attempted to reinstall one of the hosts (not the SPM),
which failed.

I also tried to bring up a new, third host and add it to the cluster.
I set up another Fedora 17 box and tried to add it, but the engine
states that there are no available servers in the cluster to probe the
new host.

This is a test environment that I would like to fix, but I'm also
willing to just run engine-cleanup and start over.

That said, there are three VMs that I would like to keep.  Two are online
and running, and I'm able to see them with virsh on that host.  I was
wondering about using virsh to back up these VMs.
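
For example, something along these lines (an untested sketch, with
guest01 as a placeholder for the domain name virsh reports) would at
least capture each guest's definition and show which disks need copying:

  # save the libvirt definition of the guest
  virsh dumpxml guest01 > guest01.xml

  # find the disk paths the guest uses, to know what to copy
  grep 'source' guest01.xml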

The third VM exists in the database and was set to run on the host that
I attempted to reinstall, but that VM isn't running.  When I use virsh
on its host, virsh can't find it in the list output, and I can't start
it with virsh start vm-name.

What is the best way to proceed?  It seems like it would be easiest to
export the VMs using virsh from the hosts they run on, if possible, then
update oVirt to the latest version, recreate everything, and then import
the VMs back into the new environment.

Will this work?  Is there a procedure I can follow to do this?

Here's some additional information about the installed oVirt packages
on the ovirt-engine machine.

If you want a backup of the currently running VMs you can use
fsarchiver.  There is a statically linked version, consisting of a
single executable, on the fsarchiver website, and there are options to
override the fact that you're backing up a live system.
You can't shut down the VMs, I think, and then do an export to an export
domain, since you don't have a master storage domain; that's why the
workaround with fsarchiver above.  You can of course use your favourite
backup programme.
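
For example, from inside a running guest, something like this saves a
mounted read/write filesystem in one go (an untested sketch; the static
binary name, archive path, and LV name are placeholders):

  # -A allows saving a filesystem that is mounted read/write
  ./fsarchiver.static savefs -A -j 2 /mnt/backup/root.fsa /dev/vg_voyager/lv_root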



Joop

--
irc: jvandewege

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users