Re: [ovirt-users] Hosted engine FCP SAN can not activate data domain

2017-05-09 Thread Jens Oechsler
Hi,
Thanks for the reply. I have gathered the logs from the hosts on Google
Drive: https://drive.google.com/open?id=0B7R4U330JfWpbkNhb2pxZWhmUUk


Re: [ovirt-users] Hosted engine FCP SAN can not activate data domain

2017-04-30 Thread Fred Rolland
Hi,

Can you provide the vdsm and engine logs?

Thanks,
Fred


Re: [ovirt-users] Hosted engine FCP SAN can not activate data domain

2017-04-26 Thread Jens Oechsler
Greetings,

Is there any way to get the oVirt Data Center described below active again?


Re: [ovirt-users] Hosted engine FCP SAN can not activate data domain

2017-04-25 Thread Jens Oechsler
Hi,

The LUN is not in the pvs output, but I found it in the lsblk output,
apparently without any partitions on it.

$ sudo pvs
  PV                                VG                                   Fmt  Attr PSize   PFree
  /dev/mapper/360050768018182b6c990 data                                 lvm2 a--  200.00g 180.00g
  /dev/mapper/360050768018182b6c998 9f10e00f-ae39-46a0-86da-8b157c6de7bc lvm2 a--  499.62g 484.50g
  /dev/sda2                         system                               lvm2 a--  278.78g 208.41g

$ sudo lvs
  LV                                   VG                                   Attr      LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  34a9328f-87fe-4190-96e9-a3580b0734fc 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    1.00g
  506ff043-1058-448c-bbab-5c864adb2bfc 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-   10.00g
  65449c88-bc28-4275--5fc75b692cbc     9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
  e2ee95ce-8105-4a20-8e1f-9f6dfa16bf59 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao  128.00m
  ids                                  9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao  128.00m
  inbox                                9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
  leases                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    2.00g
  master                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    1.00g
  metadata                             9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  512.00m
  outbox                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
  data                                 data                                 -wi-ao   20.00g
  home                                 system                               -wi-ao 1000.00m
  prod                                 system                               -wi-ao    4.88g
  root                                 system                               -wi-ao    7.81g
  swap                                 system                               -wi-ao    4.00g
  swap7                                system                               -wi-ao   20.00g
  tmp                                  system                               -wi-ao    4.88g
  var                                  system                               -wi-ao   27.81g

$ sudo lsblk

sdq                      65:0    0   500G  0 disk
└─360050768018182b6c9d7 253:33   0   500G  0 mpath

The data domain was created with one 500 GB LUN and later extended with
another 500 GB LUN.
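
The check described above (a multipath device that lsblk shows but that pvs does not list as a physical volume) could be scripted roughly like this. It is only a sketch: the sample strings are trimmed copies of the output above, and the function names are made up for illustration.

```python
import re

# Trimmed copies of the pvs and lsblk output above (sample data only).
PVS_OUTPUT = """\
  /dev/mapper/360050768018182b6c990 data lvm2 a-- 200.00g 180.00g
  /dev/mapper/360050768018182b6c998 9f10e00f-ae39-46a0-86da-8b157c6de7bc lvm2 a-- 499.62g 484.50g
  /dev/sda2 system lvm2 a-- 278.78g 208.41g
"""

LSBLK_OUTPUT = """\
sdq 65:0 0 500G 0 disk
└─360050768018182b6c9d7 253:33 0 500G 0 mpath
"""


def pv_mapper_names(pvs_text):
    """Return the /dev/mapper names that pvs lists as physical volumes."""
    return set(re.findall(r"/dev/mapper/(\S+)", pvs_text))


def mpath_names(lsblk_text):
    """Return the names of devices whose last lsblk column (TYPE) is 'mpath'."""
    names = set()
    for line in lsblk_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "mpath":
            # Drop the tree-drawing characters lsblk prints before child devices.
            names.add(fields[0].lstrip("└├─│|`- "))
    return names


def mpath_without_pv(pvs_text, lsblk_text):
    """Multipath devices visible to the host that LVM does not report as PVs."""
    return sorted(mpath_names(lsblk_text) - pv_mapper_names(pvs_text))


print(mpath_without_pv(PVS_OUTPUT, LSBLK_OUTPUT))
```

With the sample data this reports only 360050768018182b6c9d7, the 500G multipath device from the lsblk output.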


Re: [ovirt-users] Hosted engine FCP SAN can not activate data domain

2017-04-25 Thread Fred Rolland
Hi,

Do you see the LUN on the host?
Can you share the pvs and lvs output?

Thanks,

Fred


[ovirt-users] Hosted engine FCP SAN can not activate data domain

2017-04-24 Thread Jens Oechsler
Hello,
I have a problem with an oVirt Hosted Engine setup, version 4.0.5.5-1.el7.centos.
The setup uses FCP SAN for data and engine.
The cluster has worked fine for a while. It has two hosts with VMs running.
I recently extended the storage with an additional LUN. This LUN seems to
be gone from the data domain, and one VM is paused, which I assume has data
on that device.

Got these errors in events:

Apr 24, 2017 10:26:05 AM
Failed to activate Storage Domain SD (Data Center DC) by admin@internal-authz
Apr 10, 2017 3:38:08 PM
Status of host cl01 was set to Up.
Apr 10, 2017 3:38:03 PM
Host cl01 does not enforce SELinux. Current status: DISABLED
Apr 10, 2017 3:37:58 PM
Host cl01 is initializing. Message: Recovering from crash or Initializing
Apr 10, 2017 3:37:58 PM
VDSM cl01 command failed: Recovering from crash or Initializing
Apr 10, 2017 3:37:46 PM
Failed to Reconstruct Master Domain for Data Center DC.
Apr 10, 2017 3:37:46 PM
Host cl01 is not responding. Host cannot be fenced automatically
because power management for the host is disabled.
Apr 10, 2017 3:37:46 PM
VDSM cl01 command failed: Broken pipe
Apr 10, 2017 3:37:46 PM
VDSM cl01 command failed: Broken pipe
Apr 10, 2017 3:32:45 PM
Invalid status on Data Center DC. Setting Data Center status to Non
Responsive (On host cl01, Error: General Exception).
Apr 10, 2017 3:32:45 PM
VDSM cl01 command failed: [Errno 19] Could not find dm device named `[unknown]`
Apr 7, 2017 1:28:04 PM
VM HostedEngine is down with error. Exit message: resource busy:
Failed to acquire lock: error -243.
Apr 7, 2017 1:28:02 PM
Storage Pool Manager runs on Host cl01 (Address: cl01).
Apr 7, 2017 1:27:59 PM
Invalid status on Data Center DC. Setting status to Non Responsive.
Apr 7, 2017 1:27:53 PM
Host cl02 does not enforce SELinux. Current status: DISABLED
Apr 7, 2017 1:27:52 PM
Host cl01 does not enforce SELinux. Current status: DISABLED
Apr 7, 2017 1:27:49 PM
Affinity Rules Enforcement Manager started.
Apr 7, 2017 1:27:34 PM
ETL Service Started
Apr 7, 2017 1:26:01 PM
ETL Service Stopped
Apr 3, 2017 1:22:54 PM
Shutdown of VM HostedEngine failed.
Apr 3, 2017 1:22:52 PM
Storage Pool Manager runs on Host cl01 (Address: cl01).
Apr 3, 2017 1:22:49 PM
Invalid status on Data Center DC. Setting status to Non Responsive.


Master data domain is inactive.


vdsm.log:

jsonrpc.Executor/5::INFO::2017-04-20 07:01:26,796::lvm::1226::Storage.LVM::(activateLVs) Refreshing lvs: vg=bd616961-6da7-4eb0-939e-330b0a3fea6e lvs=['ids']
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:26,796::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset --cpu-list 0-39 /usr/bin/sudo -n /usr/sbin/lvm lvchange --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/360050768018182b6c99e|[unknown]|'\'', '\''r|.*|'\'' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 }  backup {  retain_min = 50  retain_days = 0 } ' --refresh bd616961-6da7-4eb0-939e-330b0a3fea6e/ids (cwd None)
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:26,880::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = " WARNING: Not using lvmetad because config setting use_lvmetad=0.\n WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).\n  Couldn't find device with uuid jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0
jsonrpc.Executor/5::INFO::2017-04-20 07:01:26,881::lvm::1226::Storage.LVM::(activateLVs) Refreshing lvs: vg=bd616961-6da7-4eb0-939e-330b0a3fea6e lvs=['leases']
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:26,881::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset --cpu-list 0-39 /usr/bin/sudo -n /usr/sbin/lvm lvchange --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/360050768018182b6c99e|[unknown]|'\'', '\''r|.*|'\'' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 }  backup {  retain_min = 50  retain_days = 0 } ' --refresh bd616961-6da7-4eb0-939e-330b0a3fea6e/leases (cwd None)
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:26,973::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = " WARNING: Not using lvmetad because config setting use_lvmetad=0.\n WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).\n  Couldn't find device with uuid jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0
jsonrpc.Executor/5::INFO::2017-04-20 07:01:26,973::lvm::1226::Storage.LVM::(activateLVs) Refreshing lvs: vg=bd616961-6da7-4eb0-939e-330b0a3fea6e lvs=['metadata', 'leases', 'ids', 'inbox', 'outbox', 'master']
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:26,974::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset --cpu-list 0-39 /usr/bin/sudo -n /usr/sbin/lvm lvchange --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [
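
The lvchange calls in the excerpt above return success, but each one carries the same LVM warning naming a physical volume it cannot find. A rough sketch for pulling those UUIDs out of a saved log follows. The sample string is copied from the excerpt, the function name is made up for illustration, and the pattern assumes the usual 6-4-4-4-4-4-6 layout of LVM UUIDs.

```python
import re

# LVM stderr as quoted in the vdsm.log excerpt above; the "\n" sequences are
# literal backslash-n characters in the log, hence the raw strings.
SAMPLE_LOG = (
    r'SUCCESS: <err> = " WARNING: Not using lvmetad because config setting '
    r'use_lvmetad=0.\n WARNING: To avoid corruption, rescan devices to make '
    r"changes visible (pvscan --cache).\n  Couldn't find device with uuid "
    r'jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0'
)

# \s+ also matches a line break, so the pattern still works on wrapped log text.
MISSING_PV = re.compile(
    r"Couldn't find device with uuid\s+"
    r"([A-Za-z0-9]{6}(?:-[A-Za-z0-9]{4}){5}-[A-Za-z0-9]{6})"
)


def missing_pv_uuids(log_text):
    """Return the distinct PV UUIDs that LVM reported as not found."""
    return sorted(set(MISSING_PV.findall(log_text)))


print(missing_pv_uuids(SAMPLE_LOG))
```

Running it against the whole vdsm.log would show whether every warning points at the same missing PV or at several.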