Re: [ovirt-users] Does my Storage Domain crashed or is this iSCSI LUN's a problem?

2015-01-28 Thread Adam Litke

On 26/01/15 13:05 +0100, shimano wrote:

Hi guys,

I'm trying to bring one of my storage domains back up after it failed.
Unfortunately, I'm hitting a very nasty error (Storage domain does not exist).

Could someone tell me how to try to restore this domain?


Could you please try moving the host to Maintenance mode and then
activating it again?  I've encountered situations where vdsm restarts and
the engine does not reconnect storage until an Activate action happens.
Let's see if this is your issue.



P.S.
This is oVirt 3.4.2-1.el6.

**

/var/log/messages:
Jan 26 12:48:49 node002 vdsm TaskManager.Task ERROR
Task=`10d02993-b585-448f-9a50-bd3e8cda7082`::Unexpected error#012Traceback
(most recent call last):#012  File /usr/share/vdsm/storage/task.py, line
873, in _run#012return fn(*args, **kargs)#012  File
/usr/share/vdsm/logUtils.py, line 45, in wrapper#012res = f(*args,
**kwargs)#012  File /usr/share/vdsm/storage/hsm.py, line 2959, in
getVGInfo#012return dict(info=self.__getVGsInfo([vgUUID])[0])#012  File
/usr/share/vdsm/storage/hsm.py, line 2892, in __getVGsInfo#012vgList
= [lvm.getVGbyUUID(vgUUID) for vgUUID in vgUUIDs]#012  File
/usr/share/vdsm/storage/lvm.py, line 894, in getVGbyUUID#012raise
se.VolumeGroupDoesNotExist(vg_uuid: %s %
vgUUID)#012VolumeGroupDoesNotExist: Volume Group does not exist: ('vg_uuid:
gyaCWf-6VKi-lI9W-JT6H-IZdy-rIsB-hTvZ4O',)
Jan 26 12:48:49 node002 kernel: device-mapper: table: 253:26: multipath:
error getting device
Jan 26 12:48:49 node002 kernel: device-mapper: ioctl: error adding target
to table
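
A side note on reading the /var/log/messages excerpt above: the `#012`
sequences look like syslog's octal escape for a newline (octal 012 is
ASCII 10), which is why the Python traceback appears squeezed onto one
logical line. Below is a minimal sketch for turning such an entry back
into a readable traceback. It assumes the escapes follow the `#` plus
three octal digits pattern; the helper name `decode_syslog_escapes` is
made up for illustration, and a literal `#` followed by three octal
digits in the message would be decoded by mistake.

```python
import re


def decode_syslog_escapes(line: str) -> str:
    """Replace '#ooo' octal escapes (for example '#012') with the real character."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), line)


# Sample text taken from the log excerpt in this thread.
sample = (
    "Task=`10d02993-b585-448f-9a50-bd3e8cda7082`::Unexpected error#012"
    "Traceback (most recent call last):#012"
    "  File /usr/share/vdsm/storage/task.py, line 873, in _run#012"
    "return fn(*args, **kargs)"
)

decoded = decode_syslog_escapes(sample)
print(decoded)
```

Piping the whole log through a helper like this makes it easier to see
that the task failed in getVGbyUUID with VolumeGroupDoesNotExist.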

**

/var/log/vdsm.log:
Thread-22::ERROR::2015-01-26
12:43:03,376::sdc::137::Storage.StorageDomainCache::(_findDomain) looking
for unfetched domain db52e9cb-7306-43fd-aff3-20831bc2bcaf
Thread-22::ERROR::2015-01-26
12:43:03,377::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain)
looking for domain db52e9cb-7306-43fd-aff3-20831bc2bcaf
Thread-22::DEBUG::2015-01-26
12:43:03,377::lvm::373::OperationMutex::(_reloadvgs) Operation 'lvm reload
operation' got the operation mutex
Thread-22::DEBUG::2015-01-26
12:43:03,378::lvm::296::Storage.Misc.excCmd::(cmd) u'/usr/bin/sudo -n
/sbin/lvm vgs --config  devices { preferred_names = [\\^/dev/mapper/\\]
ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3
obtain_device_list_from_udev=0 filter = [
\'a|/dev/mapper/mpathb|/dev/mapper/mpathc|/dev/mapper/mpathd|/dev/mapper/mpathe|/dev/mapper/mpathf|\',
\'r|.*|\' ] }  global {  locking_type=1  prioritise_write_locks=1
wait_for_locks=1  use_lvmetad=0 }  backup {  retain_min = 50  retain_days =
0 }  --noheadings --units b --nosuffix --separator | -o
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
db52e9cb-7306-43fd-aff3-20831bc2bcaf' (cwd None)
Thread-22::DEBUG::2015-01-26
12:43:03,462::lvm::296::Storage.Misc.excCmd::(cmd) FAILED: err = '
/dev/mapper/mpathc: Checksum error\n  /dev/mapper/mpathc: Checksum error\n
Volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf not found\n  Skipping
volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf\n'; rc = 5
Thread-22::WARNING::2015-01-26
12:43:03,466::lvm::378::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['
/dev/mapper/mpathc: Checksum error', '  /dev/mapper/mpathc: Checksum
error', '  Volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf not found',
'  Skipping volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf']
Thread-22::DEBUG::2015-01-26
12:43:03,466::lvm::415::OperationMutex::(_reloadvgs) Operation 'lvm reload
operation' released the operation mutex
Thread-22::ERROR::2015-01-26
12:43:03,477::sdc::143::Storage.StorageDomainCache::(_findDomain) domain
db52e9cb-7306-43fd-aff3-20831bc2bcaf not found
Traceback (most recent call last):
 File /usr/share/vdsm/storage/sdc.py, line 141, in _findDomain
   dom = findMethod(sdUUID)
 File /usr/share/vdsm/storage/sdc.py, line 171, in _findUnfetchedDomain
   raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'db52e9cb-7306-43fd-aff3-20831bc2bcaf',)
Thread-22::ERROR::2015-01-26
12:43:03,478::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain)
Error while collecting domain db52e9cb-7306-43fd-aff3-20831bc2bcaf
monitoring information
Traceback (most recent call last):
 File /usr/share/vdsm/storage/domainMonitor.py, line 204, in
_monitorDomain
   self.domain = sdCache.produce(self.sdUUID)
 File /usr/share/vdsm/storage/sdc.py, line 98, in produce
   domain.getRealDomain()
 File /usr/share/vdsm/storage/sdc.py, line 52, in getRealDomain
   return self._cache._realProduce(self._sdUUID)
 File /usr/share/vdsm/storage/sdc.py, line 122, in _realProduce
   domain = self._findDomain(sdUUID)
 File /usr/share/vdsm/storage/sdc.py, line 141, in _findDomain
   dom = findMethod(sdUUID)
 File /usr/share/vdsm/storage/sdc.py, 
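
A note on the vdsm.log excerpt above: the `lvm vgs` call fails with rc = 5
and its stderr names /dev/mapper/mpathc with "Checksum error" before
reporting that the volume group was not found. That suggests, though the
log alone does not prove it, that LVM could not read valid metadata from
that multipath device. The following is a minimal sketch for pulling the
affected devices and missing volume groups out of such stderr text. The
function name `summarize_lvm_errors` and the regular expressions are
assumptions written for this example, not part of vdsm or LVM.

```python
import re
from collections import Counter


def summarize_lvm_errors(stderr: str):
    """Return (checksum error count per device, sorted list of VGs reported missing)."""
    devices = Counter(re.findall(r"(/dev/\S+): Checksum error", stderr))
    missing = sorted(set(re.findall(r"Volume group (\S+) not found", stderr)))
    return devices, missing


# stderr text as it appears in the FAILED log entry quoted in this thread.
stderr = (
    "  /dev/mapper/mpathc: Checksum error\n"
    "  /dev/mapper/mpathc: Checksum error\n"
    "  Volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf not found\n"
    "  Skipping volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf\n"
)

devices, missing = summarize_lvm_errors(stderr)
for dev, count in devices.items():
    print(f"{dev}: {count} checksum error(s)")
for vg in missing:
    print(f"volume group not found: {vg}")
```

Run against a full vdsm.log, a summary like this would show whether the
checksum errors are confined to one LUN or spread across several.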

[ovirt-users] Does my Storage Domain crashed or is this iSCSI LUN's a problem?

2015-01-27 Thread shimano
Hi guys,

I'm trying to bring one of my storage domains back up after it failed.
Unfortunately, I'm hitting a very nasty error (Storage domain does not exist).

Could someone tell me how to try to restore this domain?

P.S.
This is oVirt 3.4.2-1.el6.

**


/var/log/messages:
Jan 26 12:48:49 node002 vdsm TaskManager.Task ERROR
Task=`10d02993-b585-448f-9a50-bd3e8cda7082`::Unexpected error#012Traceback
(most recent call last):#012  File /usr/share/vdsm/storage/task.py, line
873, in _run#012return fn(*args, **kargs)#012  File
/usr/share/vdsm/logUtils.py, line 45, in wrapper#012res = f(*args,
**kwargs)#012  File /usr/share/vdsm/storage/hsm.py, line 2959, in
getVGInfo#012return dict(info=self.__getVGsInfo([vgUUID])[0])#012  File
/usr/share/vdsm/storage/hsm.py, line 2892, in __getVGsInfo#012vgList
= [lvm.getVGbyUUID(vgUUID) for vgUUID in vgUUIDs]#012  File
/usr/share/vdsm/storage/lvm.py, line 894, in getVGbyUUID#012raise
se.VolumeGroupDoesNotExist(vg_uuid: %s %
vgUUID)#012VolumeGroupDoesNotExist: Volume Group does not exist: ('vg_uuid:
gyaCWf-6VKi-lI9W-JT6H-IZdy-rIsB-hTvZ4O',)
Jan 26 12:48:49 node002 kernel: device-mapper: table: 253:26: multipath:
error getting device
Jan 26 12:48:49 node002 kernel: device-mapper: ioctl: error adding target
to table

**

/var/log/vdsm.log:
Thread-22::ERROR::2015-01-26
12:43:03,376::sdc::137::Storage.StorageDomainCache::(_findDomain) looking
for unfetched domain db52e9cb-7306-43fd-aff3-20831bc2bcaf
Thread-22::ERROR::2015-01-26
12:43:03,377::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain)
looking for domain db52e9cb-7306-43fd-aff3-20831bc2bcaf
Thread-22::DEBUG::2015-01-26
12:43:03,377::lvm::373::OperationMutex::(_reloadvgs) Operation 'lvm reload
operation' got the operation mutex
Thread-22::DEBUG::2015-01-26
12:43:03,378::lvm::296::Storage.Misc.excCmd::(cmd) u'/usr/bin/sudo -n
/sbin/lvm vgs --config  devices { preferred_names = [\\^/dev/mapper/\\]
ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3
obtain_device_list_from_udev=0 filter = [
\'a|/dev/mapper/mpathb|/dev/mapper/mpathc|/dev/mapper/mpathd|/dev/mapper/mpathe|/dev/mapper/mpathf|\',
\'r|.*|\' ] }  global {  locking_type=1  prioritise_write_locks=1
wait_for_locks=1  use_lvmetad=0 }  backup {  retain_min = 50  retain_days =
0 }  --noheadings --units b --nosuffix --separator | -o
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
db52e9cb-7306-43fd-aff3-20831bc2bcaf' (cwd None)
Thread-22::DEBUG::2015-01-26
12:43:03,462::lvm::296::Storage.Misc.excCmd::(cmd) FAILED: err = '
/dev/mapper/mpathc: Checksum error\n  /dev/mapper/mpathc: Checksum error\n
Volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf not found\n  Skipping
volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf\n'; rc = 5
Thread-22::WARNING::2015-01-26
12:43:03,466::lvm::378::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['
/dev/mapper/mpathc: Checksum error', '  /dev/mapper/mpathc: Checksum
error', '  Volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf not found',
'  Skipping volume group db52e9cb-7306-43fd-aff3-20831bc2bcaf']
Thread-22::DEBUG::2015-01-26
12:43:03,466::lvm::415::OperationMutex::(_reloadvgs) Operation 'lvm reload
operation' released the operation mutex
Thread-22::ERROR::2015-01-26
12:43:03,477::sdc::143::Storage.StorageDomainCache::(_findDomain) domain
db52e9cb-7306-43fd-aff3-20831bc2bcaf not found
Traceback (most recent call last):
  File /usr/share/vdsm/storage/sdc.py, line 141, in _findDomain
dom = findMethod(sdUUID)
  File /usr/share/vdsm/storage/sdc.py, line 171, in _findUnfetchedDomain
raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'db52e9cb-7306-43fd-aff3-20831bc2bcaf',)
Thread-22::ERROR::2015-01-26
12:43:03,478::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain)
Error while collecting domain db52e9cb-7306-43fd-aff3-20831bc2bcaf
monitoring information
Traceback (most recent call last):
  File /usr/share/vdsm/storage/domainMonitor.py, line 204, in
_monitorDomain
self.domain = sdCache.produce(self.sdUUID)
  File /usr/share/vdsm/storage/sdc.py, line 98, in produce
domain.getRealDomain()
  File /usr/share/vdsm/storage/sdc.py, line 52, in getRealDomain
return self._cache._realProduce(self._sdUUID)
  File /usr/share/vdsm/storage/sdc.py, line 122, in _realProduce
domain = self._findDomain(sdUUID)
  File /usr/share/vdsm/storage/sdc.py, line 141, in _findDomain
dom = findMethod(sdUUID)
  File /usr/share/vdsm/storage/sdc.py, line 171, in _findUnfetchedDomain
raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'db52e9cb-7306-43fd-aff3-20831bc2bcaf',)
Thread-13::DEBUG::2015-01-26
12:43:05,102::task::595::TaskManager.Task::(_updateState)
