[ovirt-users] Ovirt node ready for production env?

2017-07-19 Thread Lionel Caignec
Hi,

I did not test it myself, so I prefer to ask before using it
(https://www.ovirt.org/node/).
Can oVirt Node be used in a production environment?
Is it possible to add some software on the host (e.g. backup tools, OSSEC, ...)?
How do security updates work? Are they managed by oVirt, or can I plug oVirt
Node into Spacewalk/Katello?


Sorry for the newbie questions.

Regards
--
Lionel 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt on sdcard?

2017-07-19 Thread Lionel Caignec
OK, thank you.

For now I'm not very far along in the architecture design; I'm just thinking
about what I can do.

Lionel

- Original Message -
From: "Yedidyah Bar David" 
To: "Lionel Caignec" 
Cc: "users" 
Sent: Thursday, 20 July 2017 08:03:50
Subject: Re: [ovirt-users] ovirt on sdcard?

On Wed, Jul 19, 2017 at 10:16 PM, Lionel Caignec  wrote:
> Hi,
>
> I'm planning to install some new hypervisors (oVirt) and I'm wondering if
> it's possible to install them on an SD card.
> I know there are write limitations on this kind of storage device.
> Is it a viable solution? Is there a tutorial somewhere about tuning oVirt
> on this kind of storage?

Perhaps provide some more details about your plans?

The local disk is normally used only for standard OS-level stuff -
mostly logging. If you put /var/log on NFS/iSCSI/whatever, I think
you should not expect much other local writing.
Didn't test this myself.

People are doing many other things, including putting all of the
root filesystem on remote storage. There are many options, depending
on your hardware, your existing infrastructure, etc.

Best,

>
> Thanks
>
> --
> Lionel
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users



-- 
Didi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt on sdcard?

2017-07-19 Thread Yedidyah Bar David
On Wed, Jul 19, 2017 at 10:16 PM, Lionel Caignec  wrote:
> Hi,
>
> I'm planning to install some new hypervisors (oVirt) and I'm wondering if
> it's possible to install them on an SD card.
> I know there are write limitations on this kind of storage device.
> Is it a viable solution? Is there a tutorial somewhere about tuning oVirt
> on this kind of storage?

Perhaps provide some more details about your plans?

The local disk is normally used only for standard OS-level stuff -
mostly logging. If you put /var/log on NFS/iSCSI/whatever, I think
you should not expect much other local writing.
Didn't test this myself.

People are doing many other things, including putting all of the
root filesystem on remote storage. There are many options, depending
on your hardware, your existing infrastructure, etc.
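
For illustration, a minimal sketch of what moving /var/log to NFS could
look like in /etc/fstab (server and export path below are placeholders):

  # keep chatty logging off the SD card (example server/path):
  logserver.example.com:/export/hostlogs  /var/log  nfs  defaults,_netdev  0 0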

Best,

>
> Thanks
>
> --
> Lionel
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users



-- 
Didi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVIRT 4.1 / iSCSI Multipathing

2017-07-19 Thread Vinícius Ferrão
Hello,

I skipped this message entirely yesterday. So this is by design? Because
iSCSI MPIO best practices, as far as I know, recommend two completely
separate paths. If this can't be achieved with oVirt, what's the point of
running MPIO?

May we ask for a bug fix or a feature redesign on this?

MPIO is part of my datacenter; it was originally built for running
XenServer, but I'm considering the move to oVirt. MPIO isn't working right,
and this could be a showstopper for me...

I’m willing to wait and hold my DC project if this can be fixed.

Any answer from the Red Hat folks?

Thanks,
V.

> On 18 Jul 2017, at 11:09, Uwe Laverenz  wrote:
> 
> Hi,
> 
> 
> Am 17.07.2017 um 14:11 schrieb Devin Acosta:
> 
>> I am still troubleshooting the issue; I haven't found any resolution to my
>> issue at this point yet. I need to figure it out by this Friday, otherwise I
>> need to look at Xen or another solution. iSCSI and oVirt seem problematic.
> 
> The configuration of iSCSI-Multipathing via OVirt didn't work for me either. 
> IIRC the underlying problem in my case was that I use totally isolated 
> networks for each path.
> 
> Workaround: to make round robin work you have to enable it by editing 
> "/etc/multipath.conf". Just add the 3 lines for the round robin setting (see 
> comment in the file) and additionally add the "# VDSM PRIVATE" comment to 
> keep vdsmd from overwriting your settings.
> 
> My multipath.conf:
> 
> 
>> # VDSM REVISION 1.3
>> # VDSM PRIVATE
>> defaults {
>>     polling_interval        5
>>     no_path_retry           fail
>>     user_friendly_names     no
>>     flush_on_last_del       yes
>>     fast_io_fail_tmo        5
>>     dev_loss_tmo            30
>>     max_fds                 4096
>>     # 3 lines added manually for multipathing:
>>     path_selector           "round-robin 0"
>>     path_grouping_policy    multibus
>>     failback                immediate
>> }
>> # Remove devices entries when overrides section is available.
>> devices {
>>     device {
>>         # These settings override built-in device settings. They do not
>>         # apply to devices without built-in settings (these use the
>>         # settings in the "defaults" section), or to devices defined in
>>         # the "devices" section.
>>         # Note: This is not available yet on Fedora 21. For more info see
>>         # https://bugzilla.redhat.com/1253799
>>         all_devs        yes
>>         no_path_retry   fail
>>     }
>> }
> 
> 
> 
> To enable the settings:
> 
>  systemctl restart multipathd
> 
> See if it works:
> 
>  multipath -ll
> 
> 
> HTH,
> Uwe
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Problems with oVirt3.5 engine + CentOS6 Host

2017-07-19 Thread Antonio Sallés
Hello friends,

Since yesterday I've been trying to register a CentOS 6 host on an oVirt
3.5 engine, but I have not been able to finish the process successfully.
I copied the SSH keys and started vdsmd without problems, then tried to
add the host through the admin portal in Firefox, but without success.
Could you help me? Thank you very much! The error is below.
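
(As a quick host-side sanity check before retrying; these are SysV-style
commands, since the host is CentOS 6, and the hostname is the one from the
log below:)

  service vdsmd status
  # the engine talks to vdsm on TCP port 54321 by default:
  nc -z kvm2.segic.cl 54321 && echo "vdsm port reachable"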

[root@ovirt ovirt-engine]# tail -f -n0 /var/log/ovirt-engine/engine.log
2017-07-19 17:21:25,078 WARN
[org.ovirt.engine.core.compat.backendcompat.PropertyInfo]
(ajp--127.0.0.1-8702-3) Unable to get value of property: vdsName for
class org.ovirt.engine.core.common.businessentities.VdsStatic
2017-07-19 17:21:25,079 WARN
[org.ovirt.engine.core.compat.backendcompat.PropertyInfo]
(ajp--127.0.0.1-8702-3) Unable to get value of property: vdsName for
class org.ovirt.engine.core.common.businessentities.VdsStatic
2017-07-19 17:21:25,079 INFO
[org.ovirt.engine.core.bll.InstallVdsCommand] (ajp--127.0.0.1-8702-3)
[7a173239] Running command: InstallVdsCommand internal: false. Entities
affected :  ID: 9b82c66f-46c8-49dc-8ab3-2b5e27c7bdd5 Type: VDSAction
group EDIT_HOST_CONFIGURATION with role type ADMIN
2017-07-19 17:21:25,088 WARN
[org.ovirt.engine.core.compat.backendcompat.PropertyInfo]
(ajp--127.0.0.1-8702-3) Unable to get value of property: vdsName for
class org.ovirt.engine.core.common.businessentities.VdsStatic
2017-07-19 17:21:25,088 WARN
[org.ovirt.engine.core.compat.backendcompat.PropertyInfo]
(ajp--127.0.0.1-8702-3) Unable to get value of property: vdsName for
class org.ovirt.engine.core.common.businessentities.VdsStatic
2017-07-19 17:21:25,090 INFO
[org.ovirt.engine.core.bll.InstallVdsInternalCommand]
(ajp--127.0.0.1-8702-3) [7a173239] Lock Acquired to object EngineLock
[exclusiveLocks= key: 9b82c66f-46c8-49dc-8ab3-2b5e27c7bdd5 value: VDS
, sharedLocks= ]
2017-07-19 17:21:25,093 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(ajp--127.0.0.1-8702-3) [7a173239] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: Failed to verify Power Management
configuration for Host kvm2.segic.cl.
2017-07-19 17:21:25,095 INFO
[org.ovirt.engine.core.bll.InstallVdsInternalCommand]
(org.ovirt.thread.pool-8-thread-32) [7a173239] Running command:
InstallVdsInternalCommand internal: true. Entities affected :  ID:
9b82c66f-46c8-49dc-8ab3-2b5e27c7bdd5 Type: VDS
2017-07-19 17:21:25,095 INFO
[org.ovirt.engine.core.bll.InstallVdsInternalCommand]
(org.ovirt.thread.pool-8-thread-32) [7a173239] Before Installation host
9b82c66f-46c8-49dc-8ab3-2b5e27c7bdd5, kvm2.segic.cl
2017-07-19 17:21:25,105 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(ajp--127.0.0.1-8702-3) [7a173239] Correlation ID: 7a173239, Call Stack:
null, Custom Event ID: -1, Message: Host kvm2.segic.cl configuration was
updated by admin@internal.
2017-07-19 17:21:25,106 INFO
[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
(org.ovirt.thread.pool-8-thread-32) [7a173239] START,
SetVdsStatusVDSCommand(HostName = kvm2.segic.cl, HostId =
9b82c66f-46c8-49dc-8ab3-2b5e27c7bdd5, status=Installing,
nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 8c3992
2017-07-19 17:21:25,109 INFO
[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
(org.ovirt.thread.pool-8-thread-32) [7a173239] FINISH,
SetVdsStatusVDSCommand, log id: 8c3992
2017-07-19 17:21:25,133 INFO
[org.ovirt.engine.core.bll.InstallerMessages]
(org.ovirt.thread.pool-8-thread-32) [7a173239] Installation
158.170.39.12: Connected to host 158.170.39.12 with SSH key fingerprint:
16:2b:79:78:60:ea:d2:24:0a:8d:7c:2f:2e:8e:20:51
2017-07-19 17:21:25,138 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(org.ovirt.thread.pool-8-thread-32) [7a173239] Correlation ID: 7a173239,
Call Stack: null, Custom Event ID: -1, Message: Installing Host
kvm2.segic.cl. Connected to host 158.170.39.12 with SSH key fingerprint:
16:2b:79:78:60:ea:d2:24:0a:8d:7c:2f:2e:8e:20:51.
2017-07-19 17:21:25,223 INFO  [org.ovirt.engine.core.bll.VdsDeploy]
(org.ovirt.thread.pool-8-thread-32) [7a173239] Installation of
158.170.39.12. Executing command via SSH umask 0077;
MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XX)"; trap
"chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" >
/dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x &&
"${MYTMP}"/setup DIALOG/dialect=str:machine
DIALOG/customization=bool:True <
/var/cache/ovirt-engine/ovirt-host-deploy.tar
2017-07-19 17:21:25,223 INFO
[org.ovirt.engine.core.utils.archivers.tar.CachedTar]
(org.ovirt.thread.pool-8-thread-32) Tarball
'/var/cache/ovirt-engine/ovirt-host-deploy.tar' refresh
2017-07-19 17:21:25,254 INFO
[org.ovirt.engine.core.uutils.ssh.SSHDialog]
(org.ovirt.thread.pool-8-thread-32) SSH execute root@158.170.39.12
'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t
ovirt-XX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1;
rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning

[ovirt-users] ovirt on sdcard?

2017-07-19 Thread Lionel Caignec
Hi,

I'm planning to install some new hypervisors (oVirt) and I'm wondering if
it's possible to install them on an SD card.
I know there are write limitations on this kind of storage device.
Is it a viable solution? Is there a tutorial somewhere about tuning oVirt
on this kind of storage?

Thanks

--
Lionel 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

2017-07-19 Thread Alan Griffiths
What happens if you run "/usr/bin/vdsm-tool restore-nets" manually?
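
For example (a sketch; the log path below is vdsm's default location):

  /usr/bin/vdsm-tool restore-nets
  # in a second shell, watch what it is doing:
  tail -f /var/log/vdsm/supervdsm.log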

On 19 July 2017 at 16:22, Anthony.Fillmore 
wrote:

> All services active and running except the vdsm-network.service which last
> entry is “activating”:
>
>
>
> [root@t0894bmh1001 vdsm.conf.d]# systemctl status -l vdsm-network.service
> -l
>
> ● vdsm-network.service - Virtual Desktop Server Manager network restoration
>
>Loaded: loaded (/usr/lib/systemd/system/vdsm-network.service; enabled;
> vendor preset: enabled)
>
>Active: activating (start) since Tue 2017-07-18 10:42:57 CDT; 23h ago
>
>   Process: 8216 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append
> --logfile=/var/log/vdsm/upgrade.log upgrade-unified-persistence
> (code=exited, status=0/SUCCESS)
>
> Main PID: 8231 (vdsm-tool)
>
>CGroup: /system.slice/vdsm-network.service
>
>├─8231 /usr/bin/python /usr/bin/vdsm-tool restore-nets
>
>└─8240 /usr/bin/python /usr/share/vdsm/vdsm-restore-net-config
>
> *From:* Alan Griffiths [mailto:apgriffith...@gmail.com]
> *Sent:* Wednesday, July 19, 2017 10:13 AM
>
> *To:* Anthony.Fillmore 
> *Cc:* Pavel Gashev ; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after
> Network Outage
>
>
>
> Looking at vdsmd.service on one of my 4.0 hosts.
>
>
>
> Requires=multipathd.service libvirtd.service time-sync.target \
>
>  iscsid.service rpcbind.service supervdsmd.service sanlock.service
> \
>
>  vdsm-network.service
>
>
>
> Are all these services present and running?
>
>
>
>
>
> On 19 July 2017 at 16:05, Anthony.Fillmore 
> wrote:
>
> Are the vdsm.conf or mom.conf file in /etc/vdsm of note in this
> situation?
>
>
>
> *From:* Anthony.Fillmore
> *Sent:* Wednesday, July 19, 2017 9:57 AM
> *To:* 'Alan Griffiths' 
> *Cc:* Pavel Gashev ; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* RE: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after
> Network Outage
>
>
>
> [boxname ~]# systemctl | grep -i dead
> mom-vdsm.service                     start MOM instance configured for VDSM purposes
> vdsmd.service                        start Virtual Desktop Server Manager
>
> [ boxname ~]# systemctl | grep -i exited
> blk-availability.service             Availability of block devices
> iptables.service                     IPv4 firewall with iptables
> kdump.service                        Crash recovery kernel arming
> kmod-static-nodes.service            Create list of required static device nodes for the current kernel
> lvm2-monitor.service                 Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
> lvm2-pvscan@253:3.service            LVM2 PV scan on device 253:3
> lvm2-pvscan@253:4.service            LVM2 PV scan on device 253:4
> lvm2-pvscan@8:3.service              LVM2 PV scan on device 8:3
> network.service                      LSB: Bring up/down networking
> openvswitch-nonetwork.service        Open vSwitch Internal Unit
> openvswitch.service                  Open vSwitch
> rhel-dmesg.service                   Dump dmesg to /var/log/dmesg
> rhel-import-state.service            Import network configuration from initramfs
> rhel-readonly.service                Configure read-only root support
> systemd-journal-flush.service        Flush Journal to Persistent Storage
> systemd-modules-load.service         Load Kernel Modules
> systemd-random-seed.service          Load/Save Random Seed
> systemd-readahead-collect.service    Collect Read-Ahead Data
> systemd-readahead-replay.service     Replay Read-Ahead Data
> systemd-remount-fs.service           Remount Root and Kernel File Systems

Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

2017-07-19 Thread Anthony . Fillmore
All services active and running except the vdsm-network.service which last 
entry is “activating”:

[root@t0894bmh1001 vdsm.conf.d]# systemctl status -l vdsm-network.service -l
● vdsm-network.service - Virtual Desktop Server Manager network restoration
   Loaded: loaded (/usr/lib/systemd/system/vdsm-network.service; enabled; 
vendor preset: enabled)
   Active: activating (start) since Tue 2017-07-18 10:42:57 CDT; 23h ago
  Process: 8216 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append 
--logfile=/var/log/vdsm/upgrade.log upgrade-unified-persistence (code=exited, 
status=0/SUCCESS)
Main PID: 8231 (vdsm-tool)
   CGroup: /system.slice/vdsm-network.service
   ├─8231 /usr/bin/python /usr/bin/vdsm-tool restore-nets
   └─8240 /usr/bin/python /usr/share/vdsm/vdsm-restore-net-config
From: Alan Griffiths [mailto:apgriffith...@gmail.com]
Sent: Wednesday, July 19, 2017 10:13 AM
To: Anthony.Fillmore 
Cc: Pavel Gashev ; users@ovirt.org; Brandon.Markgraf 
; Sandeep.Mendiratta 

Subject: Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network 
Outage

Looking at vdsmd.service on one of my 4.0 hosts.

Requires=multipathd.service libvirtd.service time-sync.target \
 iscsid.service rpcbind.service supervdsmd.service sanlock.service \
 vdsm-network.service

Are all these services present and running?


On 19 July 2017 at 16:05, Anthony.Fillmore <anthony.fillm...@target.com> wrote:
Are the vdsm.conf or mom.conf file in /etc/vdsm of note in this situation?

From: Anthony.Fillmore
Sent: Wednesday, July 19, 2017 9:57 AM
To: 'Alan Griffiths' <apgriffith...@gmail.com>
Cc: Pavel Gashev <p...@acronis.com>; users@ovirt.org; Brandon.Markgraf <brandon.markg...@target.com>; Sandeep.Mendiratta <sandeep.mendira...@target.com>
Subject: RE: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

[boxname ~]# systemctl | grep -i dead
mom-vdsm.service                     start MOM instance configured for VDSM purposes
vdsmd.service                        start Virtual Desktop Server Manager

[ boxname ~]# systemctl | grep -i exited
blk-availability.service             Availability of block devices
iptables.service                     IPv4 firewall with iptables
kdump.service                        Crash recovery kernel arming
kmod-static-nodes.service            Create list of required static device nodes for the current kernel
lvm2-monitor.service                 Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
lvm2-pvscan@253:3.service            LVM2 PV scan on device 253:3
lvm2-pvscan@253:4.service            LVM2 PV scan on device 253:4
lvm2-pvscan@8:3.service              LVM2 PV scan on device 8:3
network.service                      LSB: Bring up/down networking
openvswitch-nonetwork.service        Open vSwitch Internal Unit
openvswitch.service                  Open vSwitch
rhel-dmesg.service                   Dump dmesg to /var/log/dmesg

Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

2017-07-19 Thread Alan Griffiths
Looking at vdsmd.service on one of my 4.0 hosts.

Requires=multipathd.service libvirtd.service time-sync.target \
 iscsid.service rpcbind.service supervdsmd.service sanlock.service \
 vdsm-network.service

Are all these services present and running?
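
One way to check them all at once (unit names taken from the Requires=
line above):

  systemctl status multipathd libvirtd iscsid rpcbind supervdsmd \
      sanlock vdsm-network time-sync.target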


On 19 July 2017 at 16:05, Anthony.Fillmore 
wrote:

> Are the vdsm.conf or mom.conf file in /etc/vdsm of note in this
> situation?
>
>
>
> *From:* Anthony.Fillmore
> *Sent:* Wednesday, July 19, 2017 9:57 AM
> *To:* 'Alan Griffiths' 
> *Cc:* Pavel Gashev ; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* RE: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after
> Network Outage
>
>
>
> [boxname ~]# systemctl | grep -i dead
> mom-vdsm.service                     start MOM instance configured for VDSM purposes
> vdsmd.service                        start Virtual Desktop Server Manager
>
> [ boxname ~]# systemctl | grep -i exited
> blk-availability.service             Availability of block devices
> iptables.service                     IPv4 firewall with iptables
> kdump.service                        Crash recovery kernel arming
> kmod-static-nodes.service            Create list of required static device nodes for the current kernel
> lvm2-monitor.service                 Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
> lvm2-pvscan@253:3.service            LVM2 PV scan on device 253:3
> lvm2-pvscan@253:4.service            LVM2 PV scan on device 253:4
> lvm2-pvscan@8:3.service              LVM2 PV scan on device 8:3
> network.service                      LSB: Bring up/down networking
> openvswitch-nonetwork.service        Open vSwitch Internal Unit
> openvswitch.service                  Open vSwitch
> rhel-dmesg.service                   Dump dmesg to /var/log/dmesg
> rhel-import-state.service            Import network configuration from initramfs
> rhel-readonly.service                Configure read-only root support
> systemd-journal-flush.service        Flush Journal to Persistent Storage
> systemd-modules-load.service         Load Kernel Modules
> systemd-random-seed.service          Load/Save Random Seed
> systemd-readahead-collect.service    Collect Read-Ahead Data
> systemd-readahead-replay.service     Replay Read-Ahead Data
> systemd-remount-fs.service           Remount Root and Kernel File Systems
> systemd-sysctl.service               Apply Kernel Variables
> systemd-tmpfiles-setup-dev.service   Create Static Device Nodes in /dev
> systemd-tmpfiles-setup.service       Create Volatile Files and Directories
> systemd-udev-trigger.service         udev Coldplug all Devices
> systemd-update-utmp.service          Update UTMP about System Boot/Shutdown
> systemd-user-sessions.service        Permit User Sessions
> systemd-vconsole-setup.service       Setup Virtual Console
> vdsm-network-init.service            Virtual Desktop Server Manager network IP+link restoration
>
>
>
> *From:* Alan Griffiths [mailto:apgriffith...@gmail.com]
> *Sent:* Wednesday, July 19, 2017 9:47 AM
>
> *To:* Anthony.Fillmore 
> *Cc:* Pavel Gashev ; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* Re: [ovirt-users] [EXTERNAL] Re: Host st

Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

2017-07-19 Thread Anthony . Fillmore
Are the vdsm.conf or mom.conf file in /etc/vdsm of note in this situation?

From: Anthony.Fillmore
Sent: Wednesday, July 19, 2017 9:57 AM
To: 'Alan Griffiths' 
Cc: Pavel Gashev ; users@ovirt.org; Brandon.Markgraf 
; Sandeep.Mendiratta 

Subject: RE: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network 
Outage

[boxname ~]# systemctl | grep -i dead
mom-vdsm.service                     start MOM instance configured for VDSM purposes
vdsmd.service                        start Virtual Desktop Server Manager

[ boxname ~]# systemctl | grep -i exited
blk-availability.service             Availability of block devices
iptables.service                     IPv4 firewall with iptables
kdump.service                        Crash recovery kernel arming
kmod-static-nodes.service            Create list of required static device nodes for the current kernel
lvm2-monitor.service                 Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
lvm2-pvscan@253:3.service            LVM2 PV scan on device 253:3
lvm2-pvscan@253:4.service            LVM2 PV scan on device 253:4
lvm2-pvscan@8:3.service              LVM2 PV scan on device 8:3
network.service                      LSB: Bring up/down networking
openvswitch-nonetwork.service        Open vSwitch Internal Unit
openvswitch.service                  Open vSwitch
rhel-dmesg.service                   Dump dmesg to /var/log/dmesg
rhel-import-state.service            Import network configuration from initramfs
rhel-readonly.service                Configure read-only root support
systemd-journal-flush.service        Flush Journal to Persistent Storage
systemd-modules-load.service         Load Kernel Modules
systemd-random-seed.service          Load/Save Random Seed
systemd-readahead-collect.service    Collect Read-Ahead Data
systemd-readahead-replay.service     Replay Read-Ahead Data

Re: [ovirt-users] Hosted Engine/NFS Troubles

2017-07-19 Thread Phillip Bailey
On Tue, Jul 18, 2017 at 7:09 AM, Pavel Gashev  wrote:

> Phillip,
>
>
>
> The relevant lines from the vdsm logs are the following:
>
>
>
> jsonrpc.Executor/6::INFO::2017-07-17 14:24:41,005::logUtils::49::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=1, spUUID=u'----', conList=[{u'protocol_version': 3, u'connection': u'192.168.1.21:/srv/ovirt', u'user': u'kvm', u'id': u'dbeb8ab4-849f-4728-8ee9-f891bb84ce2f'}], options=None)
>
> jsonrpc.Executor/6::DEBUG::2017-07-17 14:24:41,006::fileUtils::209::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/192.168.1.21:_srv_ovirt mode: None
>
> jsonrpc.Executor/6::DEBUG::2017-07-17 14:24:41,007::fileUtils::218::Storage.fileUtils::(createdir) Using existing directory: /rhev/data-center/mnt/192.168.1.21:_srv_ovirt
>
> jsonrpc.Executor/6::INFO::2017-07-17 14:24:41,007::mount::226::storage.Mount::(mount) mounting 192.168.1.21:/srv/ovirt at /rhev/data-center/mnt/192.168.1.21:_srv_ovirt
>
> jsonrpc.Executor/6::ERROR::2017-07-17 14:26:46,098::hsm::2403::Storage.HSM::(connectStorageServer) Could not connect to storageServer
>
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/hsm.py", line 2400, in connectStorageServer
>     conObj.connect()
>   File "/usr/share/vdsm/storage/storageServer.py", line 456, in connect
>     return self._mountCon.connect()
>   File "/usr/share/vdsm/storage/storageServer.py", line 238, in connect
>     six.reraise(t, v, tb)
>   File "/usr/share/vdsm/storage/storageServer.py", line 230, in connect
>     self._mount.mount(self.options, self._vfsType, cgroup=self.CGROUP)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 229, in mount
>     timeout=timeout, cgroup=cgroup)
>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in __call__
>     return callMethod()
>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in <lambda>
>     **kwargs)
>   File "<string>", line 2, in mount
>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
>     raise convert_to_error(kind, result)
>
> MountError: (32, ';mount.nfs: Connection timed out\n')
>

I saw this as well, but I don't understand why I don't have the same
problem when mounting the domain manually and writing to it as the vdsm
user. Thanks for pointing it out, though. I should have included that in
the original message.
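
A closer reproduction of what vdsm does would be to mount with vdsm's usual
NFS options and then write as the vdsm user; the options below are the
common vdsm defaults, so treat them as an assumption:

  mkdir -p /mnt/nfs-test
  mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=3 \
      192.168.1.21:/srv/ovirt /mnt/nfs-test
  sudo -u vdsm touch /mnt/nfs-test/write-probe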

>
>
>
>
> *From: * on behalf of Phillip Bailey <
> phbai...@redhat.com>
> *Date: *Tuesday, 18 July 2017 at 13:48
> *To: *Luca 'remix_tj' Lorenzetto 
> *Cc: *users 
> *Subject: *Re: [ovirt-users] Hosted Engine/NFS Troubles
>
>
>
> On Mon, Jul 17, 2017 at 3:34 PM, Luca 'remix_tj' Lorenzetto <
> lorenzetto.l...@gmail.com> wrote:
>
> On Mon, Jul 17, 2017 at 9:05 PM, Phillip Bailey 
> wrote:
> > Hi,
> >
> > I'm having trouble with my hosted engine setup (v4.0) and could use some
> > help. The problem I'm having is that whenever I try to add additional
> hosts
> > to the setup via webadmin, the operation fails due to storage-related
> > issues.
> >
> > webadmin shows the following error messages:
> >
> > "Host  cannot access the Storage Domain(s) hosted_storage
> > attached to the Data Center Default. Setting Host state to
> Non-Operational.
> > Failed to connect Host ovirt-node-1 to Storage Pool Default"
> >
>
> Hi Phillip,
>
> your hosted engine storage is on nfs, right? Did you test if you can
> mount manually on each host?
>
> Hi Luca,
>
>
>
> Yes, both storage domains are on NFS (v3) and I am able to successfully
> mount them manually on the hosts.
>
>
>
> Luca
>
>
>
> --
> "E' assurdo impiegare gli uomini di intelligenza eccellente per fare
> calcoli che potrebbero essere affidati a chiunque se si usassero delle
> macchine"
> Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)
>
> "Internet è la più grande biblioteca del mondo.
> Ma il problema è che i libri sono tutti sparsi sul pavimento"
> John Allen Paulos, Matematico (1945-vivente)
>
> Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , <
> lorenzetto.l...@gmail.com>
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted Engine/NFS Troubles

2017-07-19 Thread Phillip Bailey
On Tue, Jul 18, 2017 at 7:04 AM, Staniforth, Paul <
p.stanifo...@leedsbeckett.ac.uk> wrote:

> I used the troubleshooting guide at
>
>
> documentation/how-to/troubleshooting/troubleshooting-nfs-storage-issues/
>
>
> and exported using
>
>
> (rw,sync,no_subtree_check,all_squash,anonuid=36,anongid=36)
>
>
> This was for the DATA_DOMAIN not hosted storage but it may help.
>
>
My exports file has the same line, so I don't believe that's the problem.
I went through the recommended steps on the troubleshooting guide and
everything looks to be fine, but the problem still exists. The
troubleshooting guide never came up in my previous googling of the problem
though, so thanks for pointing me to that. =)
>
> Regards,
>
>
> Paul S.
> --
> *From:* users-boun...@ovirt.org  on behalf of
> Phillip Bailey 
> *Sent:* 17 July 2017 20:05
> *To:* users
> *Subject:* [ovirt-users] Hosted Engine/NFS Troubles
>
> Hi,
>
> I'm having trouble with my hosted engine setup (v4.0) and could use some
> help. The problem I'm having is that whenever I try to add additional
> hosts to the setup via webadmin, the operation fails due to storage-related
> issues.
>
> webadmin shows the following error messages:
>
> "Host  cannot access the Storage Domain(s) hosted_storage
> attached to the Data Center Default. Setting Host state to Non-Operational.
> Failed to connect Host ovirt-node-1 to Storage Pool Default"
>
>
> The VDSM log from the host shows the following error message:
>
> "Thread-18::ERROR::2017-07-17 13:01:11,483::sdc::146::Storag
> e.StorageDomainCache::(_findDomain) domain 
> ca044720-e5cf-40a8-8b21-57a17026db7c
> not found
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sdc.py", line 144, in _findDomain
> dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 174, in _findUnfetchedDomain
> raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist:
> (u'ca044720-e5cf-40a8-8b21-57a17026db7c',)"
>
>
> The engine log shows the following error messages:
>
> "2017-07-17 18:32:11,409 ERROR 
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData]
> (org.ovirt.thread.pool-6-thread-34) [] Domain
> 'ca044720-e5cf-40a8-8b21-57a17026db7c:hosted_storage' was reported with
> error code '358'
> 2017-07-17 18:32:11,410 ERROR [org.ovirt.engine.core.bll.InitVdsOnUpCommand]
> (org.ovirt.thread.pool-6-thread-34) [] Storage Domain 'hosted_storage' of
> pool 'Default' is in problem in host 'ovirt-node-1'
> 2017-07-17 18:32:11,487 ERROR [org.ovirt.engine.core.dal.dbb
> roker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-34)
> [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message:
> Host ovirt-node-1 reports about one of the Active Storage Domains as
> Problematic."
>
>
> I have ownership set to vdsm/kvm and full rwx rights enabled on both
> directories. I have successfully mounted both the master domain and the
> hosted_storage manually on one of the hosts I'm trying to add. I have
> attached the engine log and the VDSM log for that host.
>
> Could someone please help me figure out what's causing this?
>
> -Phillip Bailey
> To view the terms under which this email is distributed, please go to:-
> http://disclaimer.leedsbeckett.ac.uk/disclaimer/disclaimer.html
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

2017-07-19 Thread Anthony . Fillmore
[boxname ~]# systemctl | grep -i dead
mom-vdsm.service                     start MOM instance configured for VDSM purposes
vdsmd.service                        start Virtual Desktop Server Manager

[ boxname ~]# systemctl | grep -i exited
blk-availability.service             Availability of block devices
iptables.service                     IPv4 firewall with iptables
kdump.service                        Crash recovery kernel arming
kmod-static-nodes.service            Create list of required static device nodes for the current kernel
lvm2-monitor.service                 Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
lvm2-pvscan@253:3.service            LVM2 PV scan on device 253:3
lvm2-pvscan@253:4.service            LVM2 PV scan on device 253:4
lvm2-pvscan@8:3.service              LVM2 PV scan on device 8:3
network.service                      LSB: Bring up/down networking
openvswitch-nonetwork.service        Open vSwitch Internal Unit
openvswitch.service                  Open vSwitch
rhel-dmesg.service                   Dump dmesg to /var/log/dmesg
rhel-import-state.service            Import network configuration from initramfs
rhel-readonly.service                Configure read-only root support
systemd-journal-flush.service        Flush Journal to Persistent Storage
systemd-modules-load.service         Load Kernel Modules
systemd-random-seed.service          Load/Save Random Seed
systemd-readahead-collect.service    Collect Read-Ahead Data
systemd-readahead-replay.service     Replay Read-Ahead Data
systemd-remount-fs.service           Remount Root and Kernel File Systems
systemd-sysctl.service               Apply Kernel Variables


Re: [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

2017-07-19 Thread Ravishankar N



On 07/19/2017 08:02 PM, Sahina Bose wrote:

[Adding gluster-users]

On Wed, Jul 19, 2017 at 2:52 PM, yayo (j) wrote:


    Hi all,

    We have a hyperconverged oVirt cluster with hosted engine on 3 fully
    replicated nodes. This cluster has 2 Gluster volumes:

    - data: volume for the Data (Master) Domain (for VMs)
    - engine: volume for the hosted_storage Domain (for the hosted engine)

    We have this problem: the "engine" gluster volume always has unsynced
    elements and we can't fix the problem; on the command line we have
    tried to use the "heal" command, but the elements always remain
    unsynced ...

Below the heal command "status":

    [root@node01 ~]# gluster volume heal engine info
    Brick node01:/gluster/engine/brick
    /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.48
    /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.64
    /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.60
    /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.2
    /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.68
    /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
    /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
    /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.61
    /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.1
    /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
    /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.20
    /__DIRECT_IO_TEST__
    Status: Connected
    Number of entries: 12

    Brick node02:/gluster/engine/brick
    /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
    /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
    /__DIRECT_IO_TEST__
    /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
    Status: Connected
    Number of entries: 12

    Brick node04:/gluster/engine/brick
    Status: Connected
    Number of entries: 0


    Running "gluster volume heal engine" doesn't solve the problem...



1. What does glustershd.log say on all 3 nodes when you run the
command? Does it complain about any of these files?

2. Are these 12 files also present in the 3rd data brick?
3. Can you provide the output of `gluster volume info` for this volume?
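
For example, something like this on each node while re-running the heal
(the log path is the GlusterFS default):

  tail -f /var/log/glusterfs/glustershd.log
  gluster volume info engine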


    Some extra info:

    We recently changed the Gluster setup from 2 (fully replicated) + 1
    arbiter to a 3-node fully replicated cluster.

Just curious, how did you do this? `remove-brick` of the arbiter brick
followed by an `add-brick` to increase to replica-3?


Thanks,
Ravi


    but I don't know if this is the problem...

    The "data" volume is good and healthy and has no unsynced entries.

    oVirt refuses to put node02 and node01 into "maintenance mode"
    and complains about "unsynced elements".

    How can I fix this?
Thank you

___
Users mailing list
Users@ovirt.org 
http://lists.ovirt.org/mailman/listinfo/users





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

2017-07-19 Thread Anthony . Fillmore
Hey Alan,

Rpcbind is running on my box, so it looks like there's no issue there. Any
other ideas on what could be keeping vdsmd dead? I even uninstalled all
oVirt-related components from the host and went for a reinstall of the host
through oVirt (just short of fully removing the host from oVirt and
re-adding it, which I want to avoid), and the reinstall ends up timing out
when it attempts to start VDSM (checking the logs, I can see the service is
dead when it gets there).

Thanks,
Tony

From: Alan Griffiths [mailto:apgriffith...@gmail.com]
Sent: Wednesday, July 19, 2017 4:14 AM
To: Anthony.Fillmore 
Cc: Pavel Gashev ; users@ovirt.org; Brandon.Markgraf 
; Sandeep.Mendiratta 

Subject: Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network 
Outage

Is rpcbind running? This is a dependency for vdsmd.

I've seen issues where rpcbind will not start on boot if IPv6 is disabled. The 
solution for me was to rebuild the initramfs, aka "dracut -f"
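
That is, roughly (a sketch; "dracut -f" with no arguments regenerates the
initramfs for the currently running kernel):

  systemctl status rpcbind.socket rpcbind.service
  dracut -f
  reboot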

On 18 July 2017 at 18:13, Anthony.Fillmore <anthony.fillm...@target.com> wrote:
[boxname ~]# systemctl status -l vdsm-network
● vdsm-network.service - Virtual Desktop Server Manager network restoration
   Loaded: loaded (/usr/lib/systemd/system/vdsm-network.service; enabled; 
vendor preset: enabled)
   Active: activating (start) since Tue 2017-07-18 10:42:57 CDT; 1h 29min ago
  Process: 8216 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append 
--logfile=/var/log/vdsm/upgrade.log upgrade-unified-persistence (code=exited, 
status=0/SUCCESS)
Main PID: 8231 (vdsm-tool)
   CGroup: /system.slice/vdsm-network.service
   ├─8231 /usr/bin/python /usr/bin/vdsm-tool restore-nets
   └─8240 /usr/bin/python /usr/share/vdsm/vdsm-restore-net-config

Jul 18 10:42:57 t0894bmh1001.stores.target.com systemd[1]: Starting Virtual Desktop Server Manager network restoration...

Thanks,
Tony
From: Pavel Gashev [mailto:p...@acronis.com]
Sent: Tuesday, July 18, 2017 11:17 AM
To: Anthony.Fillmore <anthony.fillm...@target.com>; users@ovirt.org
Cc: Brandon.Markgraf <brandon.markg...@target.com>; Sandeep.Mendiratta <sandeep.mendira...@target.com>
Subject: [EXTERNAL] Re: [ovirt-users] Host stuck unresponsive after Network 
Outage

Anthony,

Output of “systemctl status -l vdsm-network” would help.


From: <users-boun...@ovirt.org> on behalf of "Anthony.Fillmore" <anthony.fillm...@target.com>
Date: Tuesday, 18 July 2017 at 18:13
To: "users@ovirt.org" <users@ovirt.org>
Cc: "Brandon.Markgraf" <brandon.markg...@target.com>, "Sandeep.Mendiratta" <sandeep.mendira...@target.com>
Subject: [ovirt-users] Host stuck unresponsive after Network Outage

Hey Ovirt Users and Team,

I have a host that I am unable to recover post a network outage.  The host is 
stuck in unresponsive mode, even though the host is on the network, able to SSH 
and seems to be healthy.  I’ve tried several things to recover the host in 
Ovirt, but have had no success so far.  I’d like to reach out to the community 
before blowing away and rebuilding the host.

Environment: I have an Ovengine server with about 26 Datacenters, with 2 to 3
hosts per Datacenter. My Ovengine server is hosted centrally, with my hosts
being bare-metal and distributed throughout my environment. Ovengine is
version 4.0.6.

What I've tried: put the host into maintenance mode and rebooted it. Confirmed
the host was rebooted and tried to activate it; it goes back to unresponsive.
Attempted a reinstall, which fails.

Checking from the host perspective, I can see the following problems:

[boxname~]# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor 
preset: enabled)
   Active: inactive (dead)

Jul 14 12:34:28 boxname systemd[1]: Dependency failed for Virtual Desktop 
Server Manager.
Jul 14 12:34:28 boxname systemd[1]: Job vdsmd.service/start failed with result 
'dependency'.

Going a bit deeper, the results of journalctl –xe:

[root@boxname ~]# journalctl -xe
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit libvirtd.service has begun shutting down.
Jul 18 09:07:31 boxname systemd[1]: Stopped Virtualization daemon.
-- Subject: Unit libvirtd.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit libvirtd.service has finished shutting down.
Jul 18 09:07:31 boxname systemd[1]: Reloading.
Jul 18 09:07:31 boxname systemd[1]: Binding to IPv6 address not available since 
kernel does not support IPv6.
Jul 18 09:07:31 boxname systemd[1]: [/usr/lib/systemd/system/rpcbind.socket:6] 
Failed to parse address value, ignoring: [::
Jul 18 09:07:31 boxname systemd[1]: Started Auxiliary vdsm service for running 
helper functions as root.
-- Subject: Unit 

Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

2017-07-19 Thread Alan Griffiths
Are there other failed services?

systemctl --state=failed

On 19 July 2017 at 15:40, Anthony.Fillmore 
wrote:

> Hey Alan,
>
>
>
> Rpcbind is running on my box, looks like no issue there.  Any other ideas
> on what could be keeping vdsmd dead?  I even uninstalled all Ovirt related
> components from the host and went for a reinstall of the host through Ovirt
> (just short of actually fully removing the host from ovirt and re-adding,
> which I want to avoid) and the reinstall ends up timing out when it
> attempts to start VDSM (checking logs can see the service is dead when it
> gets here).
>
>
>
> Thanks,
>
> Tony
>
>
>
> *From:* Alan Griffiths [mailto:apgriffith...@gmail.com]
> *Sent:* Wednesday, July 19, 2017 4:14 AM
> *To:* Anthony.Fillmore 
> *Cc:* Pavel Gashev ; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after
> Network Outage
>
>
>
> Is rpcbind running? This is a dependency for vdsmd.
>
>
>
> I've seen issues where rpcbind will not start on boot if IPv6 is disabled.
> The solution for me was to rebuild the initramfs, aka "dracut -f"
>
>
>
> On 18 July 2017 at 18:13, Anthony.Fillmore 
> wrote:
>
> [boxname ~]# systemctl status -l vdsm-network
>
> ● vdsm-network.service - Virtual Desktop Server Manager network restoration
>
>Loaded: loaded (/usr/lib/systemd/system/vdsm-network.service; enabled;
> vendor preset: enabled)
>
>Active: activating (start) since Tue 2017-07-18 10:42:57 CDT; 1h 29min
> ago
>
>   Process: 8216 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append
> --logfile=/var/log/vdsm/upgrade.log upgrade-unified-persistence
> (code=exited, status=0/SUCCESS)
>
> Main PID: 8231 (vdsm-tool)
>
>CGroup: /system.slice/vdsm-network.service
>
>├─8231 /usr/bin/python /usr/bin/vdsm-tool restore-nets
>
>└─8240 /usr/bin/python /usr/share/vdsm/vdsm-restore-net-config
>
>
>
> Jul 18 10:42:57 t0894bmh1001.stores.target.com systemd[1]: Starting
> Virtual Desktop Server Manager network restoration...
>
>
>
> Thanks,
>
> Tony
>
> *From:* Pavel Gashev [mailto:p...@acronis.com]
> *Sent:* Tuesday, July 18, 2017 11:17 AM
> *To:* Anthony.Fillmore ; users@ovirt.org
> *Cc:* Brandon.Markgraf ; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* [EXTERNAL] Re: [ovirt-users] Host stuck unresponsive after
> Network Outage
>
>
>
> Anthony,
>
>
>
> Output of “systemctl status -l vdsm-network” would help.
>
>
>
>
>
> *From: * on behalf of "Anthony.Fillmore" <
> anthony.fillm...@target.com>
> *Date: *Tuesday, 18 July 2017 at 18:13
> *To: *"users@ovirt.org" 
> *Cc: *"Brandon.Markgraf" ,
> "Sandeep.Mendiratta" 
> *Subject: *[ovirt-users] Host stuck unresponsive after Network Outage
>
>
>
> Hey Ovirt Users and Team,
>
>
>
> I have a host that I am unable to recover post a network outage.  The host
> is stuck in unresponsive mode, even though the host is on the network, able
> to SSH and seems to be healthy.  I’ve tried several things to recover the
> host in Ovirt, but have had no success so far.  I’d like to reach out to
> the community before blowing away and rebuilding the host.
>
>
>
> *Environment*: I have an Ovengine server with about 26 Datacenters, with
> 2 to 3 hosts per Datacenter.  My Ovengine server is hosted centrally, with
> my hosts being bare-metal and distributed throughout my environment.
>   Ovengine is version 4.0.6.
>
>
>
> *What I've tried: *put the host into maintenance mode and rebooted it.
> Confirmed the host was rebooted and tried to activate it; it goes back to
> unresponsive.  Attempted a reinstall, which fails.
>
>
>
> *Checking from the host perspective, I can see the following problems: *
>
>
>
> [boxname~]# systemctl status vdsmd
>
> ● vdsmd.service - Virtual Desktop Server Manager
>
>Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor
> preset: enabled)
>
>Active: inactive (dead)
>
>
>
> Jul 14 12:34:28 boxname systemd[1]: Dependency failed for Virtual Desktop
> Server Manager.
>
> Jul 14 12:34:28 boxname systemd[1]: Job vdsmd.service/start failed with
> result 'dependency'.
>
>
>
> *Going a bit deeper, the results of journalctl –xe: *
>
>
>
> [root@boxname ~]# journalctl -xe
>
> -- Defined-By: systemd
>
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
> --
>
> -- Unit libvirtd.service has begun shutting down.
>
> Jul 18 09:07:31 boxname systemd[1]: Stopped Virtualization daemon.
>
> -- Subject: Unit libvirtd.service has finished shutting down
>
> -- Defined-By: systemd
>
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
> --
>
> -- Unit libvirtd.service has finished shutting down.
>
> Jul 18 09:07:31 boxname systemd[1]: Reloading.
>
> Jul 18 09:07:31 boxname systemd[1]: Binding to IPv6 address not available
> since kernel does not support IPv6.
>
> Jul 18 09:07:31 boxname systemd[1]: [/usr/lib/systemd/system/rpcbind.socket:6]
> Failed

Re: [ovirt-users] ovirt-hosted-engine state transition messages

2017-07-19 Thread Martin Sivak
Hi,

the BadHealth status means we could not "ping" the engine. There is an
http endpoint we use to check the service is up and responding and it
is possible there was a timeout while the system was too loaded.

We actually use hosted-engine --check-liveliness that tries to talk to
http://{fqdn}/ovirt-engine/services/health

There is a 5 minute grace period and if the engine recovers (and it
usually does in this case) we move the status back to Up.
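
You can probe the same endpoint by hand, e.g. (replace the FQDN with your
engine's):

  curl -is http://engine.example.com/ovirt-engine/services/health
  # a healthy engine answers HTTP 200 with a short status banner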


Best regards

--
Martin Sivak
oVirt

On Tue, Jul 18, 2017 at 6:22 PM, Darrell Budic  wrote:
> I had some of this going on recently under 4.1.2, started with one or two
> warning messages, then a flood of them. Did the upgrade to 4.1.3 and haven’t
> seen it yet, but it’s only been a few days so far. A java process was
> consuming much CPU, and the DataWarehouse appears to not be collecting data
> (evidenced by a blank dashboard). My DWH has since recovered as well.
>
> I forgot to check, but suspect I was low/out of memory on my engine VM, it’s
> an old one with only 6G allocated currently. Watching for this to happen
> again, and will confirm RAM utilization and bump up appropriately if it
> looks like it’s starved for RAM.
>
>
> On Jul 18, 2017, at 5:45 AM, Christophe TREFOIS 
> wrote:
>
> I have the same as you on 4.1.0
>
> EngineBadHealth-EngineUp 1 minute later. Sometimes 20 times per day, mostly
> on weekends.
>
> Cheers,
>
> --
>
> Dr Christophe Trefois, Dipl.-Ing.
> Technical Specialist / Post-Doc
>
> UNIVERSITÉ DU LUXEMBOURG
>
> LUXEMBOURG CENTRE FOR SYSTEMS BIOMEDICINE
> Campus Belval | House of Biomedicine
> 6, avenue du Swing
> L-4367 Belvaux
> T: +352 46 66 44 6124
> F: +352 46 66 44 6949
> http://www.uni.lu/lcsb
>
>
>
>
>
>
>
> On 17 Jul 2017, at 17:35, Jim Kusznir  wrote:
>
> Ok, I've been ignoring this for a long time, as the logs were so verbose and
> didn't show anything I could identify as usable debug info.  Recently one of
> my ovirt hosts (currently NOT running the main engine, but a candidate) was
> cycling as much as 40 times a day between "EngineUpBadHealth" and "EngineUp".
> Here's the log snippet.  I included some time before and after in case that's
> helpful.  In this case, I got an email about bad health at 8:15 and a
> restore (engine up) at 8:16.  I see where the messages are sent, but I don't
> see any explanation as to why / what the problem is.
>
> BTW: 192.168.8.11 is this computer's physical IP; 192.168.8.12 is the
> computer currently running the engine.  Both are also hosting the gluster
> store (e.g., I have 3 hosts, all participating in the gluster replica
> 2 + arbiter).
>
> I'd appreciate it if someone could shed some light on why this keeps
> happening!
>
> --Jim
> 
>
> MainThread::INFO::2017-07-17 08:12:06,230::config::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
> MainThread::INFO::2017-07-17 08:12:06,230::config::412::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
> MainThread::INFO::2017-07-17 08:12:08,877::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:e10c90a5-4d9c-4e18-b6f7-ae8f0cdf4f57, volUUID:a9754d40-eda1-44d7-ac92-76a228f9f1ac
> MainThread::INFO::2017-07-17 08:12:09,432::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:f22829ab-9fd5-415a-9a8f-809d3f7887d4, volUUID:9f4760ee-119c-412a-a1e8-49e73e6ba929
> MainThread::INFO::2017-07-17 08:12:09,925::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
> MainThread::INFO::2017-07-17 08:12:10,324::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/glusterSD/192.168.8.11:_engine/c0acdefb-7d16-48ec-9d76-659b8fe33e2a/images/f22829ab-9fd5-415a-9a8f-809d3f7887d4/9f4760ee-119c-412a-a1e8-49e73e6ba929
> MainThread::INFO::2017-07-17 08:12:10,696::config::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
> MainThread::INFO::2017-07-17 08:12:10,704::config::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
> MainThread::INFO::2017-07-17 08:12:10,705::states::426::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine vm running on localhost
> MainThread::INFO::2017-07-17 08:12:10,714::hosted_engine::604::ovirt_hosted_engine_ha

Re: [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

2017-07-19 Thread Sahina Bose
[Adding gluster-users]

On Wed, Jul 19, 2017 at 2:52 PM, yayo (j)  wrote:

> Hi all,
>
> We have a hyperconverged oVirt cluster with hosted engine on 3 fully
> replicated nodes. This cluster has 2 Gluster volumes:
>
> - data: volume for the Data (Master) Domain (for VMs)
> - engine: volume for the hosted_storage Domain (for the hosted engine)
>
> We have this problem: the "engine" gluster volume always has unsynced
> elements and we can't fix the problem; on the command line we have tried
> to use the "heal" command, but the elements always remain unsynced ...
>
> Below the heal command "status":
>
> [root@node01 ~]# gluster volume heal engine info
> Brick node01:/gluster/engine/brick
> /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.48
> /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.64
> /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.60
> /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.2
> /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.68
> /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
> /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
> /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.61
> /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.1
> /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
> /.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.20
> /__DIRECT_IO_TEST__
> Status: Connected
> Number of entries: 12
>
> Brick node02:/gluster/engine/brick
> /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
> /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
> /__DIRECT_IO_TEST__
> /8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
> Status: Connected
> Number of entries: 12
>
> Brick node04:/gluster/engine/brick
> Status: Connected
> Number of entries: 0
>
>
>
> Running "gluster volume heal engine" doesn't solve the problem...
>
> Some extra info:
>
> We recently changed the Gluster setup from 2 (fully replicated) + 1
> arbiter to a 3-node fully replicated cluster, but I don't know if this
> is the problem...
>
> The "data" volume is good and healthy and has no unsynced entries.
>
> oVirt refuses to put node02 and node01 into "maintenance mode" and
> complains about "unsynced elements".
>
> How can I fix this?
> Thank you
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [ovirt-devel] [openstack-dev] [kolla] Looking for Docker images for Cinder, Glance etc for oVirt

2017-07-19 Thread Sandro Bonazzola
On Sun, Jul 9, 2017 at 1:07 AM, Steven Dake (stdake) 
wrote:

> Leni,
>
> Reading their website, they reference the "kollaglue" namespace.  That
> hasn't been used for 2+ years (we switched to kolla).  They also said a
> "recent" change in the images was made to deploy via Ansible "and they
> weren't ready for that yet".  That was 3 years ago.  I think those docs are
> all pretty old and could use some validation from the oVirt folks.
>


The whole feature of having kollaglue / kolla deployed by the oVirt installer
has been in tech preview for a long time, with not much interest raised from
the community.
Thanks for reporting the issue; we'll update the documentation and discuss
whether to keep the feature or drop it in the next releases.



>
> -Original Message-
> From: Leni Kadali Mutungi 
> Reply-To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-...@lists.openstack.org>
> Date: Saturday, July 8, 2017 at 11:03 AM
> To: openstack-dev , "de...@ovirt.org" <
> de...@ovirt.org>
> Cc: users 
> Subject: [openstack-dev] [kolla] Looking for Docker images for Cinder,
> Glance etc for oVirt
>
> Hello all.
>
> I am trying to use the Cinder and Glance Docker images you provide in
> relation to the setup here:
> http://www.ovirt.org/develop/release-management/features/
> cinderglance-docker-integration/
>
> I tried to run `sudo docker pull
> kollaglue/centos-rdo-glance-registry:latest` and got a "not found"
> error. I thought it could be possible to use a Dockerfile to spin up
> an equivalent of it, so I would like some guidance on how to go about
> doing that. Best practices and so on. Alternatively, if it is
> possible, could you point me in the direction of the equivalent images
> mentioned in the guides, if they have been superseded by something
> else? Thanks.
>
> CCing the oVirt users and devel lists to see if anyone has experienced
> something similar.
>
> --
> - Warm regards
> Leni Kadali Mutungi
>
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:
> unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> ___
> Devel mailing list
> de...@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>



-- 

SANDRO BONAZZOLA

ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D

Red Hat EMEA 

TRIED. TESTED. TRUSTED. 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] iSCSI Multipath issues

2017-07-19 Thread Uwe Laverenz

Hi,


Am 19.07.2017 um 04:52 schrieb Vinícius Ferrão:

I’m joining the crowd with iSCSI Multipath issues on oVirt here. I’m 
trying to enable the feature without success too.


Here’s what I’ve done, step-by-step.

1. Installed oVirt Node 4.1.3 with the following network settings:

eno1 and eno2 on a 802.3ad (LACP) Bond, creating a bond0 interface.
eno3 with 9216 MTU.
eno4 with 9216 MTU.
vlan11 on eno3 with 9216 MTU and fixed IP addresses.
vlan12 on eno4 with 9216 MTU and fixed IP addresses.

eno3 and eno4 are my iSCSI MPIO Interfaces, completelly segregated, on 
different switches.


This is the point: the oVirt implementation of iSCSI bonding assumes 
that all network interfaces in the bond can reach all targets, 
including those on the other net(s). The fact that you use separate, 
isolated networks means that this is not the case in your setup (and not 
in mine).

I am not sure if this is a bug, a design flaw or a feature, but as a 
result of this, oVirt's iSCSI bonding does not work for us.


Please see my mail from yesterday for a workaround.
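In short, the idea is to bind each iSCSI session to its dedicated
interface outside of oVirt, roughly like this (the interface names are
taken from your setup, the portal address is just an example):

  iscsiadm -m iface -I eno3 --op=new
  iscsiadm -m iface -I eno3 --op=update -n iface.net_ifacename -v eno3
  iscsiadm -m discovery -t sendtargets -p 192.168.11.10:3260 -I eno3
  iscsiadm -m node -l

The same for eno4 on the second network; multipathd then sees one path
per interface.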

cu,
Uwe
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] test email

2017-07-19 Thread Erekle Magradze

test


On 07/19/2017 11:53 AM, Abi Askushi wrote:

Several days without receiving any email from this list.
Please test back.

Abi


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


--
Recogizer Group GmbH

Dr.rer.nat. Erekle Magradze
Lead Big Data Engineering & DevOps
Rheinwerkallee 2, 53227 Bonn
Tel: +49 228 29974555

E-Mail erekle.magra...@recogizer.de
Web: www.recogizer.com
 
Recogizer on LinkedIn https://www.linkedin.com/company-beta/10039182/

Follow us on Twitter https://twitter.com/recogizer
 
-

Recogizer Group GmbH
Managing directors: Oliver Habisch, Carsten Kreutze
Commercial register: Amtsgericht Bonn HRB 20724
Registered office: Bonn; VAT ID no.: DE294195993
 
This e-mail contains confidential and/or legally protected information.
If you are not the intended recipient or have received this e-mail in error,
please inform the sender immediately and delete this e-mail.
Unauthorized copying and passing on of this e-mail and the information
contained therein are not permitted.

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] test email

2017-07-19 Thread Abi Askushi
Several days without receiving any email from this list.
Please test back.

Abi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted Engine/NFS Troubles

2017-07-19 Thread Staniforth, Paul
I used the troubleshooting guide at
documentation/how-to/troubleshooting/troubleshooting-nfs-storage-issues/
and exported using

(rw,sync,no_subtree_check,all_squash,anonuid=36,anongid=36)

This was for the DATA_DOMAIN, not hosted storage, but it may help.
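For example, a complete /etc/exports line would look something like this
(the path and client network are placeholders for your own values):

  /exports/data  192.168.1.0/24(rw,sync,no_subtree_check,all_squash,anonuid=36,anongid=36)

followed by "exportfs -ra" to re-export.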


Regards,


Paul S.


From: users-boun...@ovirt.org  on behalf of Phillip 
Bailey 
Sent: 17 July 2017 20:05
To: users
Subject: [ovirt-users] Hosted Engine/NFS Troubles

Hi,

I'm having trouble with my hosted engine setup (v4.0) and could use some help. 
The problem I'm having is that whenever I try to add additional hosts to the 
setup via webadmin, the operation fails due to storage-related issues.

webadmin shows the following error messages:

"Host  cannot access the Storage Domain(s) hosted_storage attached 
to the Data Center Default. Setting Host state to Non-Operational.
Failed to connect Host ovirt-node-1 to Storage Pool Default"


The VDSM log from the host shows the following error message:

"Thread-18::ERROR::2017-07-17 
13:01:11,483::sdc::146::Storage.StorageDomainCache::(_findDomain) domain 
ca044720-e5cf-40a8-8b21-57a17026db7c not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 144, in _findDomain
dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 174, in _findUnfetchedDomain
raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: 
(u'ca044720-e5cf-40a8-8b21-57a17026db7c',)"


The engine log shows the following error messages:

"2017-07-17 18:32:11,409 ERROR 
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] 
(org.ovirt.thread.pool-6-thread-34) [] Domain 
'ca044720-e5cf-40a8-8b21-57a17026db7c:hosted_storage' was reported with error 
code '358'
2017-07-17 18:32:11,410 ERROR [org.ovirt.engine.core.bll.InitVdsOnUpCommand] 
(org.ovirt.thread.pool-6-thread-34) [] Storage Domain 'hosted_storage' of pool 
'Default' is in problem in host 'ovirt-node-1'
2017-07-17 18:32:11,487 ERROR 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(org.ovirt.thread.pool-6-thread-34) [] Correlation ID: null, Call Stack: null, 
Custom Event ID: -1, Message: Host ovirt-node-1 reports about one of the Active 
Storage Domains as Problematic."


I have ownership set to vdsm/kvm and full rwx rights enabled on both 
directories. I have successfully mounted both the master domain and the 
hosted_storage manually on one of the hosts I'm trying to add. I have attached 
the engine log and the VDSM log for that host.
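One thing I have not tried yet is asking VDSM on the host directly whether
it can see the domain at all; I assume something like

  vdsClient -s 0 getStorageDomainsList
  vdsClient -s 0 getStorageDomainInfo ca044720-e5cf-40a8-8b21-57a17026db7c

would show that, but I'm not sure those are the exact commands.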

Could someone please help me figure out what's causing this?

-Phillip Bailey
To view the terms under which this email is distributed, please go to:-
http://disclaimer.leedsbeckett.ac.uk/disclaimer/disclaimer.html
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] NullPointerException when changing compatibility version to 4.0

2017-07-19 Thread Marcel Hanke
Hi,
I currently have a problem changing one of our clusters to compatibility 
version 4.0.
The log shows a NullPointerException after several successful VMs:
2017-07-19 11:19:45,886 ERROR [org.ovirt.engine.core.bll.UpdateVmCommand] 
(default task-31) [1acd2990] Error during ValidateFailure.: 
java.lang.NullPointerException
at 
org.ovirt.engine.core.bll.UpdateVmCommand.validate(UpdateVmCommand.java:632) 
[bll.jar:]
at 
org.ovirt.engine.core.bll.CommandBase.internalValidate(CommandBase.java:886) 
[bll.jar:]
at 
org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:391) 
[bll.jar:]
at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:493) 
[bll.jar:]
.

On other clusters with the exact same configuration the change to 4.0 was 
successful without a problem.
Turning off the cluster for the change is also not possible because of the 
>1200 VMs running on it.
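My current guess is that a single VM has a NULL field that the validation
trips over, so I was going to look for candidates directly in the engine
database, something along these lines (table and column names are from
memory and may be off):

  sudo -u postgres psql engine -c "SELECT vm_name, custom_compatibility_version FROM vm_static WHERE entity_type = 'VM';"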

Does anyone have an idea what to do, or what to look for?

Thanks
Marcel
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

2017-07-19 Thread yayo (j)
Hi all,

We have a hyperconverged oVirt cluster with hosted engine on 3 fully
replicated nodes. This cluster has 2 Gluster volumes:

- data: volume for the Data (Master) Domain (for VMs)
- engine: volume for the hosted_storage Domain (for the hosted engine)

We have this problem: the "engine" Gluster volume always has unsynced
elements and we can't fix it; on the command line we have tried the
"heal" command but the elements always remain unsynced.

Below is the heal command "status":

[root@node01 ~]# gluster volume heal engine info
Brick node01:/gluster/engine/brick
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.48
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.64
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.60
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.2
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.68
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.61
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.1
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.20
/__DIRECT_IO_TEST__
Status: Connected
Number of entries: 12

Brick node02:/gluster/engine/brick
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01

/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids



/__DIRECT_IO_TEST__


/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6


Status: Connected
Number of entries: 12

Brick node04:/gluster/engine/brick
Status: Connected
Number of entries: 0



running the "gluster volume heal engine" don't solve the problem...

Some extra info:

We recently changed the Gluster setup from 2 (fully replicated) + 1 arbiter
to a 3-node fully replicated cluster, but I don't know if this is the problem...

The "data" volume is good and healthy and has no unsynced entries.

oVirt refuses to put node01 and node02 into "maintenance mode" and
complains about "unsynced elements".

How can I fix this?
Thank you
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after Network Outage

2017-07-19 Thread Alan Griffiths
Is rpcbind running? This is a dependency for vdsmd.

I've seen issues where rpcbind will not start on boot if IPv6 is disabled.
The solution for me was to rebuild the initramfs, aka "dracut -f"
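On the affected host that would be roughly:

systemctl status rpcbind
dracut -f
reboot

(dracut -f rebuilds the initramfs for the currently running kernel; pass a
kernel version explicitly if you need a different one.)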

On 18 July 2017 at 18:13, Anthony.Fillmore 
wrote:

> [boxname ~]# systemctl status -l vdsm-network
>
> ● vdsm-network.service - Virtual Desktop Server Manager network restoration
>
>Loaded: loaded (/usr/lib/systemd/system/vdsm-network.service; enabled;
> vendor preset: enabled)
>
>Active: activating (start) since Tue 2017-07-18 10:42:57 CDT; 1h 29min
> ago
>
>   Process: 8216 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append
> --logfile=/var/log/vdsm/upgrade.log upgrade-unified-persistence
> (code=exited, status=0/SUCCESS)
>
> Main PID: 8231 (vdsm-tool)
>
>CGroup: /system.slice/vdsm-network.service
>
>├─8231 /usr/bin/python /usr/bin/vdsm-tool restore-nets
>
>└─8240 /usr/bin/python /usr/share/vdsm/vdsm-restore-net-config
>
>
>
> Jul 18 10:42:57 t0894bmh1001.stores.target.com systemd[1]: Starting
> Virtual Desktop Server Manager network restoration...
>
>
>
> Thanks,
>
> Tony
>
> *From:* Pavel Gashev [mailto:p...@acronis.com]
> *Sent:* Tuesday, July 18, 2017 11:17 AM
> *To:* Anthony.Fillmore ; users@ovirt.org
> *Cc:* Brandon.Markgraf ; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* [EXTERNAL] Re: [ovirt-users] Host stuck unresponsive after
> Network Outage
>
>
>
> Anthony,
>
>
>
> Output of “systemctl status -l vdsm-network” would help.
>
>
>
>
>
> *From: * on behalf of "Anthony.Fillmore" <
> anthony.fillm...@target.com>
> *Date: *Tuesday, 18 July 2017 at 18:13
> *To: *"users@ovirt.org" 
> *Cc: *"Brandon.Markgraf" ,
> "Sandeep.Mendiratta" 
> *Subject: *[ovirt-users] Host stuck unresponsive after Network Outage
>
>
>
> Hey Ovirt Users and Team,
>
>
>
> I have a host that I am unable to recover post a network outage.  The host
> is stuck in unresponsive mode, even though the host is on the network, able
> to SSH and seems to be healthy.  I’ve tried several things to recover the
> host in Ovirt, but have had no success so far.  I’d like to reach out to
> the community before blowing away and rebuilding the host.
>
>
>
> *Environment*: I have an Ovengine server with about 26 Datacenters, with
> 2 to 3 hosts per Datacenter.  My Ovengine server is hosted centrally, with
> my hosts being bare-metal and distributed throughout my environment.
>   Ovengine is version 4.0.6.
>
>
>
> *What I’ve tried: *put into maintenance mode, rebooted the host.
> Confirmed the host was rebooted and tried to activate it; it goes back to
> unresponsive. Attempted a reinstall, which fails.
>
>
>
> *Checking from the host perspective, I can see the following problems: *
>
>
>
> [boxname~]# systemctl status vdsmd
>
> ● vdsmd.service - Virtual Desktop Server Manager
>
>Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor
> preset: enabled)
>
>Active: inactive (dead)
>
>
>
> Jul 14 12:34:28 boxname systemd[1]: Dependency failed for Virtual Desktop
> Server Manager.
>
> Jul 14 12:34:28 boxname systemd[1]: Job vdsmd.service/start failed with
> result 'dependency'.
>
>
>
> *Going a bit deeper, the results of journalctl –xe: *
>
>
>
> [root@boxname ~]# journalctl -xe
>
> -- Defined-By: systemd
>
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
> --
>
> -- Unit libvirtd.service has begun shutting down.
>
> Jul 18 09:07:31 boxname systemd[1]: Stopped Virtualization daemon.
>
> -- Subject: Unit libvirtd.service has finished shutting down
>
> -- Defined-By: systemd
>
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
> --
>
> -- Unit libvirtd.service has finished shutting down.
>
> Jul 18 09:07:31 boxname systemd[1]: Reloading.
>
> Jul 18 09:07:31 boxname systemd[1]: Binding to IPv6 address not available
> since kernel does not support IPv6.
>
> Jul 18 09:07:31 boxname systemd[1]: [/usr/lib/systemd/system/rpcbind.socket:6]
> Failed to parse address value, ignoring: [::
>
> Jul 18 09:07:31 boxname systemd[1]: Started Auxiliary vdsm service for
> running helper functions as root.
>
> -- Subject: Unit supervdsmd.service has finished start-up
>
> -- Defined-By: systemd
>
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
> --
>
> -- Unit supervdsmd.service has finished starting up.
>
> --
>
> -- The start-up result is done.
>
> Jul 18 09:07:31 boxname systemd[1]: Starting Auxiliary vdsm service for
> running helper functions as root...
>
> -- Subject: Unit supervdsmd.service has begun start-up
>
> -- Defined-By: systemd
>
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
> --
>
> -- Unit supervdsmd.service has begun starting up.
>
> Jul 18 09:07:31 boxname systemd[1]: Starting Virtualization daemon...
>
> -- Subject: Unit libvirtd.service has begun start-up
>
> -- Defined-By: systemd
>
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
> --
>
> -- Unit libvirtd