[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-02 Thread Alex K
On Mon, Mar 1, 2021, 15:20  wrote:

> Hello again,
>
> I am back with a brief description of the situation I am in, and questions
> about the recovery.
>
> oVirt environment: 4.3.5.2 Hyperconverged
> GlusterFS: Replica 2 + Arbiter 1
> GlusterFS volumes: data, engine, vmstore
>
> The current situation is the following:
>
> - The Cluster is in Global Maintenance.
>
> - The volume engine is up with comment (in the Web GUI) : Up, unsynched
> entries, needs healing.
>
> - The VM HostedEngine is paused due to a storage I/O error (Web GUI) while
> the output of virsh list --all command shows that the HostedEngine is
> running.
>
> I tried to issue the gluster heal command (gluster volume heal engine) but
> nothing changed.
>
> I have the following questions:
>
> 1. Should I restart the glusterd service? Where from? Is it enough if the
> glusterd is restarted on one host or should it be restarted on the other
> two as well?
>
It sounds like a Gluster split brain. I would start from there. Can you check
the status by listing the split-brain entries?
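
A minimal sketch of that check, assuming the volume is named engine as in this thread. The run helper and the DRY_RUN variable are illustrative only and not part of Gluster: by default the script just prints the commands, and setting DRY_RUN=0 on a Gluster node executes them.

```shell
#!/bin/sh
# Sketch: list pending heals and split-brain entries for one volume.
# VOLUME defaults to "engine", the volume named in this thread.
VOLUME="${VOLUME:-engine}"

# Illustrative helper (not a Gluster tool): print the command in dry-run
# mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_split_brain() {
    # Brick and self-heal daemon status for the volume.
    run gluster volume status "$VOLUME"
    # Entries still waiting to be healed.
    run gluster volume heal "$VOLUME" info
    # Entries in split brain, which will not heal on their own.
    run gluster volume heal "$VOLUME" info split-brain
}

check_split_brain
```

If the last command lists any entries, those are the files to look at before retrying the heal.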

>
> 2. Should the node that was NonResponsive and came back, be rebooted or
> not? It seems alright now and in good health.
>
> 3. Should the HostedEngine be restored with engine-backup or is it not
> necessary?
>
> 4. Could the loss of the DNS server for the oVirt hosts lead to an
> unresponsive host?
> The nsswitch file on the ovirt hosts and engine, has the DNS defined as:
> hosts:  files dns myhostname
>
If you have opted for DNS liveness checks, it could be.

>
> 5. How can we recover/rectify the situation above?
>
I would start by checking for Gluster split brains and ensuring that all hosts
have connectivity on the storage domain network (ping, jumbo frames if
enabled). 99% of my similar issues have been caused by Gluster split brain.
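
A rough sketch of the connectivity check, assuming Linux iputils ping and an MTU of 9000 on the storage network. The peer hostnames, the run helper and the DRY_RUN variable are hypothetical placeholders; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: check storage-network reachability, including a jumbo-frame test.
# PEERS holds hypothetical storage-network hostnames; replace with your own.
PEERS="${PEERS:-host1-storage host2-storage host3-storage}"
# 8972 = 9000-byte MTU minus 20 bytes IPv4 header minus 8 bytes ICMP header.
PAYLOAD="${PAYLOAD:-8972}"

# Illustrative helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_storage_net() {
    for peer in $PEERS; do
        # Plain reachability.
        run ping -c 3 "$peer"
        # Jumbo frames: forbid fragmentation (-M do) so that an MTU mismatch
        # on the path shows up as an error instead of being hidden.
        run ping -c 3 -M do -s "$PAYLOAD" "$peer"
    done
}

check_storage_net
```

If the plain ping succeeds but the large, non-fragmenting ping fails, the MTU is probably not consistent along the path.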

The fact that the engine is shown as paused while you can still access the
web UI makes me think you have a split-brain issue.

>
> Thanks for your help,
> Maria Souvalioti
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/GO6S6GXRJWYZN5NZ5IFTNQ6SGNEB75WQ/
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZNIFUDLRYHU3YTYC35OLXVVHYKAPNJZI/


[ovirt-users] Re: CVE-2021-3156 && ovirt-node-ng 4.3 && 4.4 (sudo)

2021-03-02 Thread Thiago Linhares
I've got the same questions...
Someone?
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OYY7XVTUXFV6VRFZEDFCKY5GMEOWDLRF/


[ovirt-users] Re: Multipath flapping with SAS via FCP

2021-03-02 Thread bchatelain
Hello Ben,

I ran some tests with the device settings in multipath.conf; directio doesn't work.
However, I received an e-mail from Xose Vazquez Perez (dm-de...@redhat.com) with some
instructions, and it works.

In multipath.conf, add 'path_grouping_policy "group_by_prio"' to the device section:

devices {
    device {
        vendor "COMPELNT"
        product "Compellent Vol"
        path_grouping_policy "group_by_prio"
        prio "alua"
        failback "immediate"
        no_path_retry 30
    }
}
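
For anyone applying the same change, here is a minimal sketch of reloading and checking the configuration. It assumes multipathd accepts commands as arguments (as 'multipathd show config' does elsewhere in this thread); the run helper and DRY_RUN variable are illustrative only, and by default the script just prints the commands.

```shell
#!/bin/sh
# Sketch: reload the multipath configuration and inspect the result.

# Illustrative helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_multipath_conf() {
    # Ask the running daemon to re-read its configuration.
    run multipathd reconfigure
    # Show the resulting topology; with group_by_prio the paths should be
    # split into path groups according to their priority.
    run multipath -ll
}

apply_multipath_conf
```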

Thank you.

Regards,
Benoit Chatelain



- Original Message -
From: "Benjamin Marzinski" 
To: "Nir Soffer" 
Cc: "Benoit Chatelain" , "users" 
Sent: Monday, March 1, 2021 21:15:21
Subject: Re: [ovirt-users] Re: Multipath flapping with SAS via FCP

On Fri, Feb 26, 2021 at 05:29:50PM +0200, Nir Soffer wrote:
> On Fri, Feb 26, 2021 at 12:07 PM Benoit Chatelain  wrote:
> >
> > Hi Nir Soffer,
> > Thanks for your reply.
> >
> > Indeed, the device fails immediately after it was reinstated.
> >
> > Here is my 'multipathd show config' dump:
> >
> > defaults {
> >     verbosity 2
> >     polling_interval 5
> >     max_polling_interval 20
> >     reassign_maps "no"
> >     multipath_dir "/lib64/multipath"
> >     path_selector "service-time 0"
> >     path_grouping_policy "failover"
> >     uid_attribute "ID_SERIAL"
> >     prio "const"
> >     prio_args ""
> >     features "0"
> >     path_checker "tur"
> >     alias_prefix "mpath"
> >     failback "manual"
> >     rr_min_io 1000
> >     rr_min_io_rq 1
> >     max_fds 4096
> >     rr_weight "uniform"
> >     no_path_retry 16
> >     queue_without_daemon "no"
> >     flush_on_last_del "yes"
> >     user_friendly_names "no"
> >     fast_io_fail_tmo 5
> >     dev_loss_tmo 60
> >     bindings_file "/etc/multipath/bindings"
> >     wwids_file "/etc/multipath/wwids"
> >     prkeys_file "/etc/multipath/prkeys"
> >     log_checker_err always
> >     all_tg_pt "no"
> >     retain_attached_hw_handler "yes"
> >     detect_prio "yes"
> >     detect_checker "yes"
> >     force_sync "no"
> >     strict_timing "no"
> >     deferred_remove "no"
> >     config_dir "/etc/multipath/conf.d"
> >     delay_watch_checks "no"
> >     delay_wait_checks "no"
> >     san_path_err_threshold "no"
> >     san_path_err_forget_rate "no"
> >     san_path_err_recovery_time "no"
> >     marginal_path_err_sample_time "no"
> >     marginal_path_err_rate_threshold "no"
> >     marginal_path_err_recheck_gap_time "no"
> >     marginal_path_double_failed_time "no"
> >     find_multipaths "on"
> >     uxsock_timeout 4000
> >     retrigger_tries 3
> >     retrigger_delay 10
> >     missing_uev_wait_timeout 30
> >     skip_kpartx "no"
> >     disable_changed_wwids ignored
> >     remove_retries 0
> >     ghost_delay "no"
> >     find_multipaths_timeout -10
> >     enable_foreign ""
> >     marginal_pathgroups "no"
> > }
> > blacklist {
> >     devnode "!^(sd[a-z]|dasd[a-z]|nvme[0-9])"
> >     wwid "36f402700f232e40026b41bd43a0812e5"
> >     protocol "(scsi:adt|scsi:sbp)"
> >     ...
> > }
> > blacklist_exceptions {
> >     protocol "scsi:sas"
> > }
> > devices {
> >     ...
> >     device {
> >         vendor "COMPELNT"
> >         product "Compellent Vol"
> >         path_grouping_policy "multibus"
> >         no_path_retry "queue"
> >     }
> >     ...
> > }
> > overrides {
> >     no_path_retry 16
> > }
> >
> > And here are my SCSI disks (sdb and sdc):
> >
> > [root@anarion-adm ~]# lsscsi -l
> > [0:2:0:0]    disk    DELL     PERC H330 Adp    4.30  /dev/sda
> >   state=running queue_depth=256 scsi_level=6 type=0 device_blocked=0 timeout=90
> > [1:0:0:2]    disk    COMPELNT Compellent Vol   0704  /dev/sdb
> >   state=running queue_depth=254 scsi_level=6 type=0 device_blocked=0 timeout=30
> > [1:0:1:2]    disk    COMPELNT Compellent Vol   0704  /dev/sdc
> >   state=running queue_depth=254 scsi_level=6 type=0 device_blocked=0 timeout=30
> >
> >
> > My disk configuration is present in multipath, and the Dell EMC
> > documentation and white paper don't specify any exotic configuration for
> > multipathd. (Am I wrong?)
> >
> > I looked at the modules for the SAS and FCP drivers; they look good:
> >
> > [root@anarion-adm ~]# lsmod | grep sas
> > mpt3sas               303104  4
> > raid_class             16384  1 mpt3sas
> > megaraid_sas          172032  2
> > scsi_transport_sas     45056  1 mpt3sas
> >
> > [root@anarion-adm ~]# lsmod | grep fc
> > bnx2fc                110592  0
> > cnic                   69632  1 bnx2fc
> > libfcoe                77824  2 qedf,bnx2fc
> > libfc                 147456  3 qedf,bnx2fc,libfcoe
> > scsi_transport_fc      69632  3 qedf,libfc,bnx2fc
> >
> > Do you think my device is misconfig

[ovirt-users] Re: How to enable nested virtualization

2021-03-02 Thread miguel . garcia
Thank you so much, this was what was left to get it working.
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SGDWJEVX3NGOU6AIINHSKYMHDQLNIMIF/


[ovirt-users] Re: kvm host becomes non operational after breaking a bonding interface

2021-03-02 Thread Ales Musil
On Tue, Mar 2, 2021 at 8:54 AM LS CHENG  wrote:

> That did the trick Ales
>
> Thanks!
>

Great, glad to help.
Thanks for reporting back.

Best regards,
Ales


>
> On Tue, Mar 2, 2021 at 8:37 AM LS CHENG  wrote:
>
>> OK, I will give it a try and let you know
>>
>> Thank you very much
>>
>> On Tue, Mar 2, 2021 at 8:29 AM Ales Musil  wrote:
>>
>>>
>>>
>>> On Tue, Mar 2, 2021 at 8:25 AM LS CHENG  wrote:
>>>
>>>> Hi Ales
>>>>
>>>> Thanks for the reply.
>>>>
>>>> How do I know a network is "required"? I have three networks defined,
>>>> ovirtmgmt, VLAN50 and VLAN601; I don't think the last two are a must.
>>>>
>>>> Thanks


>>> This can be found and changed in Compute -> Clusters -> $YourCluster ->
>>> Logical Networks -> Manage Networks. The Required check box is the one.
>>> Also, when you are creating a new network, there is a Cluster side menu
>>> which lets you choose whether the network is required or not (true by default).
>>>
>>> Hopefully this helps.
>>>
>>> Best regards,
>>> Ales
>>>
>>>


>>>> On Tue, Mar 2, 2021 at 7:33 AM Ales Musil  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Mar 1, 2021 at 11:30 PM LS CHENG  wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> I have created a couple of bonding interfaces in the administrator
>>>>>> portal. Due to a misconfiguration I had to break the bond interfaces.
>>>>>> What I observe is that after breaking the bonding interfaces the KVM
>>>>>> host becomes non operational, and it stays in that state unless I add
>>>>>> the bonding interfaces back. Has anyone observed this behaviour?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>
>>>>>
>>>>> Hi,
>>>>> the reason might be that the bond had a required network connected to
>>>>> it. If that is the case, connecting the required network to a proper
>>>>> interface should make the host operational again.
>>>>>
>>>>> Best Regards,
>>>>> Ales
>>>>>
>>>>>
>>>>>> List Archives:
>>>>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/OOIUOIC6S4JMIWIKVPZZRD26BQK4VUDW/
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Ales Musil
>>>>>
>>>>> Software Engineer - RHV Network
>>>>>
>>>>> Red Hat EMEA
>>>>>
>>>>> amu...@redhat.com    IM: amusil
>>>>>
>>>>>


-- 

Ales Musil

Software Engineer - RHV Network

Red Hat EMEA 

amu...@redhat.com    IM: amusil

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TT6HCTCXNMSAEFOKB64LPUWZPQWUJTSC/