Re: [Users] Unable to attach to storage domain (Ovirt 3.2)

2013-10-02 Thread Itamar Heim

On 09/20/2013 05:01 AM, Dan Ferris wrote:

Hi,

This is my first post to the list.  I am happy to say that we have been
using Ovirt for 6 months with a few bumps, but it's mostly been ok.

Until tonight that is...

I had to do a maintenance that required rebooting both of our Hypervisor
nodes.  Both of them run Fedora Core 18 and have been happy for months.
  After rebooting them tonight, they will not attach to the storage.  If
it matters, the storage is a server running LIO with a Fibre Channel
target.

Vdsm log:

Thread-22::DEBUG::2013-09-19
21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd
iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata
bs=4096 count=1' (cwd None)
Thread-22::DEBUG::2013-09-19
21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> =
'1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
0.000547161 s, 7.5 MB/s\n'; <rc> = 0
Thread-23::DEBUG::2013-09-19
21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm
reload operation' got the operation mutex
Thread-23::DEBUG::2013-09-19
21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo
-n /sbin/lvm vgs --config " devices { preferred_names =
[\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0
disable_after_error_count=3 filter = [
\\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] }  global {
locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {
retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix
--separator | -o
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
Thread-23::DEBUG::2013-09-19
21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> =
'  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n';
<rc> = 5
Thread-23::WARNING::2013-09-19
21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
['  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
Thread-23::DEBUG::2013-09-19
21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm
reload operation' released the operation mutex
Thread-23::ERROR::2013-09-19
21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain)
Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
monitoring information
Traceback (most recent call last):
   File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in
_monitorDomain
 self.domain = sdCache.produce(self.sdUUID)
   File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
 domain.getRealDomain()
   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
 return self._cache._realProduce(self._sdUUID)
   File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
 domain = self._findDomain(sdUUID)
   File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
 raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)

vgs output (note that I don't know what the device
Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is):

[root@node01 vdsm]# vgs
   Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
   VG   #PV #LV #SN Attr   VSize   VFree
   b358e46b-635b-4c0e-8e73-0a494602e21d   1  39   0 wz--n-   8.19t  5.88t
   build  2   2   0 wz-pn- 299.75g 16.00m
   fedora 1   3   0 wz--n- 557.88g 0

lvs output:

[root@node01 vdsm]# lvs
   Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
   LV   VG   Attr  LSize   Pool Origin Data%  Move Log Copy%  Convert
   0b8cca47-313f-48da-84f2-154810790d5a
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
   0f6f7572-8797-4d84-831b-87dbc4e1aa48
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
   19a1473f-c375-411f-9a02-c6054b9a28d2
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   50.00g
   221144dc-51dc-46ae-9399-c0b8e030f38a
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
   2386932f-5f68-46e1-99a4-e96c944ac21b
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
   3e027010-931b-43d6-9c9f-eeeabbdcd47a
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a    2.00g
   4257ccc2-94d5-4d71-b21a-c188acbf7ca1
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  200.00g
   4979b2a4-04aa-46a1-be0d-f10be0a1f587
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
   4e1b8a1a-1704-422b-9d79-60f15e165cb7
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
   70bce792-410f-479f-8e04-a2a4093d3dfb
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
   791f6bda-c7eb-4d90-84c1-d7e33e73de62
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
   818ad6bc-8da2-4099-b38a-8c5b52f69e32
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  120.00g
   861c9c44-fdeb-43cd-8e5c-32c00ce3cd3d
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
   86b69521-

Re: [Users] Unable to attach to storage domain (Ovirt 3.2)

2013-09-22 Thread Ayal Baron


- Original Message -
> We actually got it working.  Both of us were tired from working late, so
> for a while we didn't realize that the missing storage domain was actually
> one of the NFS exports.  After removing our NFS ISO domain and NFS export
> domain, everything came up.
> 

Thanks for the update.
Coincidentally, we have a patch upstream that should ignore the ISO and export
domains in such situations (http://gerrit.ovirt.org/#/c/17986/), which would
remove the need for you to deactivate them.

> Dan
> 
> On 9/22/13 6:08 AM, Ayal Baron wrote:
> > If I understand correctly you have a storage domain which is built of
> > multiple (at least 2) LUNs.
> > One of these LUNs seems to be missing
> > (Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is an LVM PV UUID).
> > It looks like you are either not fully connected to the storage server
> > (missing a connection), or the LUN mapping in LIO has been changed, or the
> > CHAP password has changed, or something similar.
> >
> > LVM is able to report the LVs since the PV which contains the metadata is
> > still accessible (which is also why you see the VG and why LVM knows that
> > the Wy3Ymi... device is missing).
> >
> > Can you compress and attach *all* of the vdsm.log* files?
> >
> > - Original Message -
> >> Hi Dan, it looks like one of the domains is missing:
> >>
> >> 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
> >>
> >> Is there any target missing? (disconnected or somehow faulty or
> >> unreachable)
> >>
> >> --
> >> Federico
> >>
> >> - Original Message -
> >>> From: "Dan Ferris" 
> >>> To: users@ovirt.org
> >>> Sent: Friday, September 20, 2013 4:01:06 AM
> >>> Subject: [Users] Unable to attach to storage domain (Ovirt 3.2)
> >>>
> >>> Hi,
> >>>
> >>> This is my first post to the list.  I am happy to say that we have been
> >>> using Ovirt for 6 months with a few bumps, but it's mostly been ok.
> >>>
> >>> Until tonight that is...
> >>>
> >>> I had to do a maintenance that required rebooting both of our Hypervisor
> >>> nodes.  Both of them run Fedora Core 18 and have been happy for months.
> >>>After rebooting them tonight, they will not attach to the storage.  If
> >>> it matters, the storage is a server running LIO with a Fibre Channel
> >>> target.
> >>>
> >>> Vdsm log:
> >>>
> >>> Thread-22::DEBUG::2013-09-19
> >>> 21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd
> >>> iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata
> >>> bs=4096 count=1' (cwd None)
> >>> Thread-22::DEBUG::2013-09-19
> >>> 21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> =
> >>> '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
> >>> 0.000547161 s, 7.5 MB/s\n'; <rc> = 0
> >>> Thread-23::DEBUG::2013-09-19
> >>> 21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm
> >>> reload operation' got the operation mutex
> >>> Thread-23::DEBUG::2013-09-19
> >>> 21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo
> >>> -n /sbin/lvm vgs --config " devices { preferred_names =
> >>> [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0
> >>> disable_after_error_count=3 filter = [
> >>> \\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] }  global {
> >>> locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {
> >>> retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix
> >>> --separator | -o
> >>> uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
> >>> 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
> >>> Thread-23::DEBUG::2013-09-19
> >>> 21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> =
> >>> '  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n';
> >>> <rc> = 5
> >>> Thread-23::WARNING::2013-09-19
> >>> 21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
> >>> ['  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
> >>> Thread-23::DEBUG::2013-09-19
> >>> 21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) 

Re: [Users] Unable to attach to storage domain (Ovirt 3.2)

2013-09-22 Thread Dan Ferris
We actually got it working.  Both of us were tired from working late, so
for a while we didn't realize that the missing storage domain was actually
one of the NFS exports.  After removing our NFS ISO domain and NFS export
domain, everything came up.
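
For anyone hitting the same thing later, a quick sanity check of the NFS
domains from a node looks something like this (just a sketch -- the server
name and export path below are placeholders, not our real ones, and the
mount options are reasonable defaults rather than exactly what vdsm uses):

# Ask the NFS server which exports it is currently offering.
showmount -e nfs-server.example.com

# Try mounting the ISO/export domain path by hand and listing it.
mkdir -p /tmp/nfs-test
mount -t nfs -o soft,timeo=600,retrans=6 nfs-server.example.com:/exports/iso /tmp/nfs-test
ls /tmp/nfs-test
umount /tmp/nfs-test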


Dan

On 9/22/13 6:08 AM, Ayal Baron wrote:

If I understand correctly you have a storage domain which is built of multiple 
(at least 2) LUNs.
One of these LUNs seems to be missing (Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z 
is an LVM PV UUID).
It looks like you are either not fully connected to the storage server (missing
a connection), or the LUN mapping in LIO has been changed, or the CHAP password
has changed, or something similar.

LVM is able to report the LVs since the PV which contains the metadata is still 
accessible (which is also why you see the VG and why LVM knows that the 
Wy3Ymi... device is missing).

Can you compress and attach *all* of the vdsm.log* files?

- Original Message -

Hi Dan, it looks like one of the domains is missing:

6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50

Is there any target missing? (disconnected or somehow faulty or
unreachable)

--
Federico

- Original Message -

From: "Dan Ferris" 
To: users@ovirt.org
Sent: Friday, September 20, 2013 4:01:06 AM
Subject: [Users] Unable to attach to storage domain (Ovirt 3.2)

Hi,

This is my first post to the list.  I am happy to say that we have been
using Ovirt for 6 months with a few bumps, but it's mostly been ok.

Until tonight that is...

I had to do a maintenance that required rebooting both of our Hypervisor
nodes.  Both of them run Fedora Core 18 and have been happy for months.
   After rebooting them tonight, they will not attach to the storage.  If
it matters, the storage is a server running LIO with a Fibre Channel
target.

Vdsm log:

Thread-22::DEBUG::2013-09-19
21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd
iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata
bs=4096 count=1' (cwd None)
Thread-22::DEBUG::2013-09-19
21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> =
'1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
0.000547161 s, 7.5 MB/s\n'; <rc> = 0
Thread-23::DEBUG::2013-09-19
21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm
reload operation' got the operation mutex
Thread-23::DEBUG::2013-09-19
21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo
-n /sbin/lvm vgs --config " devices { preferred_names =
[\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0
disable_after_error_count=3 filter = [
\\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] }  global {
locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {
retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix
--separator | -o
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
Thread-23::DEBUG::2013-09-19
21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> =
'  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n';
<rc> = 5
Thread-23::WARNING::2013-09-19
21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
['  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
Thread-23::DEBUG::2013-09-19
21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm
reload operation' released the operation mutex
Thread-23::ERROR::2013-09-19
21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain)
Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
monitoring information
Traceback (most recent call last):
File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in
_monitorDomain
  self.domain = sdCache.produce(self.sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
  domain.getRealDomain()
File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
  return self._cache._realProduce(self._sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
  domain = self._findDomain(sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
  raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)

vgs output (note that I don't know what the device
Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is):

[root@node01 vdsm]# vgs
Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
VG   #PV #LV #SN Attr   VSize   VFree
b358e46b-635b-4c0e-8e73-0a494602e21d   1  39   0 wz--n-   8.19t  5.88t
build  2   2   0 wz-pn- 299.75g 16.00m
fedora   

Re: [Users] Unable to attach to storage domain (Ovirt 3.2)

2013-09-22 Thread Ayal Baron
If I understand correctly you have a storage domain which is built of multiple 
(at least 2) LUNs.
One of these LUNs seems to be missing (Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z 
is an LVM PV UUID).
It looks like you are either not fully connected to the storage server (missing
a connection), or the LUN mapping in LIO has been changed, or the CHAP password
has changed, or something similar.

LVM is able to report the LVs since the PV which contains the metadata is still 
accessible (which is also why you see the VG and why LVM knows that the 
Wy3Ymi... device is missing).

Can you compress and attach *all* of the vdsm.log* files?
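
In the meantime, a quick way to cross-check from the node itself (a sketch
only -- it assumes the stock LVM2, device-mapper-multipath and SCSI sysfs
tooling on the node, nothing oVirt-specific):

# List PVs with their UUIDs so the missing Wy3Ymi-... UUID can be matched
# to a device path (or confirmed to have no backing device at all).
pvs -o pv_name,pv_uuid,vg_name,pv_size

# Show which multipath devices the host currently sees; the WWID from the
# vdsm filter (360014055193f840cb3743f9befef7aa3) should appear here.
multipath -ll

# If a LUN simply wasn't picked up after the reboot, rescan the SCSI hosts.
for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done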

- Original Message -
> Hi Dan, it looks like one of the domains is missing:
> 
> 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
> 
> Is there any target missing? (disconnected or somehow faulty or
> unreachable)
> 
> --
> Federico
> 
> - Original Message -
> > From: "Dan Ferris" 
> > To: users@ovirt.org
> > Sent: Friday, September 20, 2013 4:01:06 AM
> > Subject: [Users] Unable to attach to storage domain (Ovirt 3.2)
> > 
> > Hi,
> > 
> > This is my first post to the list.  I am happy to say that we have been
> > using Ovirt for 6 months with a few bumps, but it's mostly been ok.
> > 
> > Until tonight that is...
> > 
> > I had to do a maintenance that required rebooting both of our Hypervisor
> > nodes.  Both of them run Fedora Core 18 and have been happy for months.
> >   After rebooting them tonight, they will not attach to the storage.  If
> > it matters, the storage is a server running LIO with a Fibre Channel
> > target.
> > 
> > Vdsm log:
> > 
> > Thread-22::DEBUG::2013-09-19
> > 21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd
> > iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata
> > bs=4096 count=1' (cwd None)
> > Thread-22::DEBUG::2013-09-19
> > 21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> =
> > '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
> > 0.000547161 s, 7.5 MB/s\n'; <rc> = 0
> > Thread-23::DEBUG::2013-09-19
> > 21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm
> > reload operation' got the operation mutex
> > Thread-23::DEBUG::2013-09-19
> > 21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo
> > -n /sbin/lvm vgs --config " devices { preferred_names =
> > [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0
> > disable_after_error_count=3 filter = [
> > \\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] }  global {
> > locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {
> > retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix
> > --separator | -o
> > uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
> > 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
> > Thread-23::DEBUG::2013-09-19
> > 21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> =
> > '  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n';
> > <rc> = 5
> > Thread-23::WARNING::2013-09-19
> > 21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
> > ['  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
> > Thread-23::DEBUG::2013-09-19
> > 21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm
> > reload operation' released the operation mutex
> > Thread-23::ERROR::2013-09-19
> > 21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain)
> > Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
> > monitoring information
> > Traceback (most recent call last):
> >File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in
> > _monitorDomain
> >  self.domain = sdCache.produce(self.sdUUID)
> >File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
> >  domain.getRealDomain()
> >File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
> >  return self._cache._realProduce(self._sdUUID)
> >File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
> >  domain = self._findDomain(sdUUID)
> >File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
> >  raise se.StorageDomainDoesNotExist(sdUUID)
> > StorageDomainDoesNotExist: Storage domain does not exist:
> > (u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)
> > 
>

Re: [Users] Unable to attach to storage domain (Ovirt 3.2)

2013-09-22 Thread Federico Simoncelli
Hi Dan, it looks like one of the domains is missing:

6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50

Is there any target missing? (disconnected or somehow faulty or
unreachable)
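
If it helps, the Fibre Channel side can be checked from the node with
something like the following (illustrative only -- the WWID is taken from
the vdsm filter in your log, and the sysfs paths assume a standard FC HBA):

# Check that the FC HBA ports are logged in to the fabric.
cat /sys/class/fc_host/host*/port_state

# Confirm the LIO-backed LUN is visible as a multipath device.
ls -l /dev/mapper/ | grep 360014055193f840cb3743f9befef7aa3

# List the block devices the kernel currently knows about.
lsblk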

-- 
Federico

- Original Message -
> From: "Dan Ferris" 
> To: users@ovirt.org
> Sent: Friday, September 20, 2013 4:01:06 AM
> Subject: [Users] Unable to attach to storage domain (Ovirt 3.2)
> 
> Hi,
> 
> This is my first post to the list.  I am happy to say that we have been
> using Ovirt for 6 months with a few bumps, but it's mostly been ok.
> 
> Until tonight that is...
> 
> I had to do a maintenance that required rebooting both of our Hypervisor
> nodes.  Both of them run Fedora Core 18 and have been happy for months.
>   After rebooting them tonight, they will not attach to the storage.  If
> it matters, the storage is a server running LIO with a Fibre Channel target.
> 
> Vdsm log:
> 
> Thread-22::DEBUG::2013-09-19
> 21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd
> iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata
> bs=4096 count=1' (cwd None)
> Thread-22::DEBUG::2013-09-19
> 21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> =
> '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
> 0.000547161 s, 7.5 MB/s\n'; <rc> = 0
> Thread-23::DEBUG::2013-09-19
> 21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm
> reload operation' got the operation mutex
> Thread-23::DEBUG::2013-09-19
> 21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo
> -n /sbin/lvm vgs --config " devices { preferred_names =
> [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0
> disable_after_error_count=3 filter = [
> \\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] }  global {
> locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {
> retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix
> --separator | -o
> uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
> 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
> Thread-23::DEBUG::2013-09-19
> 21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> =
> '  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n';
> <rc> = 5
> Thread-23::WARNING::2013-09-19
> 21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
> ['  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
> Thread-23::DEBUG::2013-09-19
> 21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm
> reload operation' released the operation mutex
> Thread-23::ERROR::2013-09-19
> 21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain)
> Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
> monitoring information
> Traceback (most recent call last):
>File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in
> _monitorDomain
>  self.domain = sdCache.produce(self.sdUUID)
>File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
>  domain.getRealDomain()
>File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>  return self._cache._realProduce(self._sdUUID)
>File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
>  domain = self._findDomain(sdUUID)
>File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
>  raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist:
> (u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)
> 
> vgs output (note that I don't know what the device
> Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is):
> 
> [root@node01 vdsm]# vgs
>Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
>VG   #PV #LV #SN Attr   VSize   VFree
>b358e46b-635b-4c0e-8e73-0a494602e21d   1  39   0 wz--n-   8.19t  5.88t
>build  2   2   0 wz-pn- 299.75g 16.00m
>fedora 1   3   0 wz--n- 557.88g 0
> 
> lvs output:
> 
> [root@node01 vdsm]# lvs
>Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
>    LV   VG   Attr  LSize   Pool Origin Data%  Move Log Copy%  Convert
>0b8cca47-313f-48da-84f2-154810790d5a
> b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
>0f6f7572-8797-4d84-831b-87dbc4e1aa48
> b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
>19a1473f-c375-411f-9a02-c6054b9a28d2
> b358e46b-635b-4c0e-8e73-0a4

[Users] Unable to attach to storage domain (Ovirt 3.2)

2013-09-19 Thread Dan Ferris

Hi,

This is my first post to the list.  I am happy to say that we have been 
using Ovirt for 6 months with a few bumps, but it's mostly been ok.


Until tonight that is...

I had to do a maintenance that required rebooting both of our Hypervisor 
nodes.  Both of them run Fedora Core 18 and have been happy for months. 
 After rebooting them tonight, they will not attach to the storage.  If 
it matters, the storage is a server running LIO with a Fibre Channel target.


Vdsm log:

Thread-22::DEBUG::2013-09-19 
21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd
iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata 
bs=4096 count=1' (cwd None)
Thread-22::DEBUG::2013-09-19 
21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> =
'1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
0.000547161 s, 7.5 MB/s\n'; <rc> = 0
Thread-23::DEBUG::2013-09-19 
21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm 
reload operation' got the operation mutex
Thread-23::DEBUG::2013-09-19 
21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo
-n /sbin/lvm vgs --config " devices { preferred_names = 
[\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 
disable_after_error_count=3 filter = [ 
\\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] }  global { 
locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup { 
retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix 
--separator | -o 
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free 
6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
Thread-23::DEBUG::2013-09-19 
21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> =
'  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n';
<rc> = 5
Thread-23::WARNING::2013-09-19 
21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] 
['  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
Thread-23::DEBUG::2013-09-19 
21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm 
reload operation' released the operation mutex
Thread-23::ERROR::2013-09-19 
21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain) 
Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50 
monitoring information

Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in 
_monitorDomain

self.domain = sdCache.produce(self.sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: 
(u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)


vgs output (note that I don't know what the device
Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is):


[root@node01 vdsm]# vgs
  Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
  VG   #PV #LV #SN Attr   VSize   VFree
  b358e46b-635b-4c0e-8e73-0a494602e21d   1  39   0 wz--n-   8.19t  5.88t
  build  2   2   0 wz-pn- 299.75g 16.00m
  fedora 1   3   0 wz--n- 557.88g 0
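
(For reference: oVirt block storage domains are LVM VGs named after the
storage domain UUID, so the domain from the traceback can be checked for
directly -- a quick sketch, nothing oVirt-specific:)

# The VG should be listed here if this host can see the domain at all.
vgs --noheadings -o vg_name,vg_uuid | grep 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50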

lvs output:

[root@node01 vdsm]# lvs
  Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
  LV   VG   Attr  LSize   Pool Origin Data%  Move Log Copy%  Convert
  0b8cca47-313f-48da-84f2-154810790d5a 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
  0f6f7572-8797-4d84-831b-87dbc4e1aa48 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
  19a1473f-c375-411f-9a02-c6054b9a28d2 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   50.00g
  221144dc-51dc-46ae-9399-c0b8e030f38a 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
  2386932f-5f68-46e1-99a4-e96c944ac21b 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
  3e027010-931b-43d6-9c9f-eeeabbdcd47a 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a    2.00g
  4257ccc2-94d5-4d71-b21a-c188acbf7ca1 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  200.00g
  4979b2a4-04aa-46a1-be0d-f10be0a1f587 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
  4e1b8a1a-1704-422b-9d79-60f15e165cb7 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   40.00g
  70bce792-410f-479f-8e04-a2a4093d3dfb 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
  791f6bda-c7eb-4d90-84c1-d7e33e73de62 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
  818ad6bc-8da2-4099-b38a-8c5b52f69e32 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  120.00g
  861c9c44-fdeb-43cd-8e5c-32c00ce3cd3d 
b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  100.00g
  86b69521-14db-43d1-801f-9d21f