Re: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
Hello,

Sorry for my late response.

I reproduced the error in a lab environment (oVirt 3.5/CentOS 7.1) with 2 hosts (ovhv00, ovhv01) and a replicated GlusterFS volume. I put host ovhv01 into maintenance mode and then stopped network.service on it (instead of rebooting). The result is always the same: the Data Center becomes "Non Responsive", my storage becomes red and inactive, and most VMs become "paused due to unknown storage error".

This is the engine log:
https://paste.fedoraproject.org/250877/58925314/raw/

Thanks,
K.

On 07/29/2015 12:21 PM, Artyom Lukianov wrote:
> Can you please provide the engine log (/var/log/ovirt-engine/engine.log)?
>
> ----- Original Message -----
> From: "Konstantinos Christidis"
> To: users@ovirt.org
> Cc: "Artyom Lukianov"
> Sent: Wednesday, July 29, 2015 9:40:26 AM
> Subject: Re: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
>
> Maintenance mode is already enabled. All VMs finish migration successfully.
>
> Now I stop the glusterd service on this host (systemctl stop glusterd.service) and nothing bad happens, which means that the distributed replicated GlusterFS works fine. Then I stop the vdsmd service (systemctl stop vdsmd.service) and everything works fine.
>
> When I administratively set the ovirtmgmt network down or reboot this host, my Data Center becomes "Non Responsive", my storage becomes red and inactive, and most VMs become "paused due to unknown storage error".
>
> K.
>
> On 07/28/2015 06:09 PM, Artyom Lukianov wrote:
> Just put the host into maintenance mode; if it has VMs, it will migrate them automatically to another host.
>
> ----- Original Message -----
> From: "Konstantinos Christidis"
> To: users@ovirt.org
> Sent: Tuesday, July 28, 2015 1:15:15 PM
> Subject: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
>
> Hello ovirt users,
>
> I have 4 hosts with a distributed replicated 2x2 GlusterFS storage (oVirt 3.5/CentOS 7).
>
> When I reboot a host (in maintenance mode and not my SPM host), my Data Center becomes "Non Responsive", my storage becomes red and inactive, and many VMs become "paused due to unknown storage error". The same happens if I administratively set the ovirtmgmt network down (on a host in maintenance mode and not my SPM host) with ifconfig ovirtmgmt down.
>
> I know that the management network (ovirtmgmt) is required by default and is part of the oVirt monitoring process, but is there anything I can do in order to reboot a host without causing this mess?
>
> Thanks,
> K.

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
Maintenance mode is already enabled. All VMs finish migration successfully.

Now I stop the glusterd service on this host (systemctl stop glusterd.service) and nothing bad happens, which means that the distributed replicated GlusterFS works fine. Then I stop the vdsmd service (systemctl stop vdsmd.service) and everything works fine.

When I administratively set the ovirtmgmt network down or reboot this host, my Data Center becomes "Non Responsive", my storage becomes red and inactive, and most VMs become "paused due to unknown storage error".

K.

On 07/28/2015 06:09 PM, Artyom Lukianov wrote:
> Just put the host into maintenance mode; if it has VMs, it will migrate them automatically to another host.
>
> ----- Original Message -----
> From: "Konstantinos Christidis"
> To: users@ovirt.org
> Sent: Tuesday, July 28, 2015 1:15:15 PM
> Subject: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
>
> Hello ovirt users,
>
> I have 4 hosts with a distributed replicated 2x2 GlusterFS storage (oVirt 3.5/CentOS 7).
>
> When I reboot a host (in maintenance mode and not my SPM host), my Data Center becomes "Non Responsive", my storage becomes red and inactive, and many VMs become "paused due to unknown storage error". The same happens if I administratively set the ovirtmgmt network down (on a host in maintenance mode and not my SPM host) with ifconfig ovirtmgmt down.
>
> I know that the management network (ovirtmgmt) is required by default and is part of the oVirt monitoring process, but is there anything I can do in order to reboot a host without causing this mess?
>
> Thanks,
> K.
[ovirt-users] Data Center becomes Non Responsive when I reboot a host
Hello ovirt users,

I have 4 hosts with a distributed replicated 2x2 GlusterFS storage (oVirt 3.5/CentOS 7).

When I reboot a host (in maintenance mode and not my SPM host), my Data Center becomes "Non Responsive", my storage becomes red and inactive, and many VMs become "paused due to unknown storage error". The same happens if I administratively set the ovirtmgmt network down (on a host in maintenance mode and not my SPM host) with ifconfig ovirtmgmt down.

I know that the management network (ovirtmgmt) is required by default and is part of the oVirt monitoring process, but is there anything I can do in order to reboot a host without causing this mess?

Thanks,
K.
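For readers unfamiliar with the "distributed replicated 2x2" layout mentioned above, here is a minimal sketch of how four bricks pair up into two replica sets and what remains when one host goes offline. GlusterFS groups bricks into replica sets in the order they were listed at volume creation. The host and brick names below are hypothetical, since the thread does not list them, and the sketch does not explain the Data Center failure itself; it only shows that each replica set should still have a brick available with a single host down.

```python
from typing import Dict, List


def replica_sets(bricks: List[str], replica: int) -> List[List[str]]:
    """Group bricks into replica sets in the order they were listed."""
    if replica <= 0 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a positive multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def surviving_bricks(sets: List[List[str]], down_host: str) -> Dict[int, List[str]]:
    """Map each replica set index to the bricks left when down_host is offline."""
    return {
        idx: [b for b in group if b.split(":", 1)[0] != down_host]
        for idx, group in enumerate(sets)
    }


# Hypothetical brick layout (assumption): one brick per host, four hosts.
BRICKS = [
    "hostA:/bricks/b1", "hostB:/bricks/b1",
    "hostC:/bricks/b1", "hostD:/bricks/b1",
]

if __name__ == "__main__":
    sets = replica_sets(BRICKS, replica=2)
    for idx, alive in surviving_bricks(sets, "hostB").items():
        print(f"replica set {idx}: {len(alive)} of {len(sets[idx])} bricks up -> {alive}")
```

With this layout, taking hostB down leaves replica set 0 with one brick and replica set 1 untouched, which matches the observation that stopping glusterd on one host did not break the volume.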
Re: [ovirt-users] VM has been paused due to unknown storage error
Hello Mario,

On 16/07/2015 04:12 PM, m...@ohnewald.net wrote:
> Check your vdsm logs on your nodes. I bet you will find something about I/O errors, I guess...

Yes, there are many I/O errors:

libvirtEventLoop::INFO::2015-07-16 22:30:02,237::vm::3609::virt.vm::(onIOError) vmId=`bb46929c-0b4e-4f01-868a-7e7638fa943b`::abnormal vm stop device virtio-disk0 error eother
libvirtEventLoop::INFO::2015-07-16 22:30:02,237::vm::4889::virt.vm::(_logGuestCpuStatus) vmId=`bb46929c-0b4e-4f01-868a-7e7638fa943b`::CPU stopped: onIOError

Full vdsm log: https://paste.fedoraproject.org/245148/43707759/

and glusterfs errors:

W [MSGID: 114031] [client-rpc-fops.c:2973:client3_3_lookup_cbk] 0-distributed_vol-client-0: remote operation failed: Transport endpoint is not connected. Path: / (----0001) [Transport endpoint is not connected]
W [fuse-bridge.c:2273:fuse_writev_cbk] 0-glusterfs-fuse: 362694: WRITE => -1 (Transport endpoint is not connected)

K.

> Also check your glusterfs logs. Maybe you can find some problems, too.
>
> Mario
>
> On 16.07.15 at 10:29, Konstantinos Christidis wrote:
> > Hello oVirt users,
> >
> > I am facing a serious problem regarding my GlusterFS storage and virtual machines that have *bootable* disks on this storage.
> >
> > All my VMs that have GlusterFS disks are occasionally (1-2 times/hour) becoming paused with the following error:
> > VM vm02.mytld has been paused due to unknown storage error.
> >
> > Engine log:
> > INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-69) [] VM '247bb0f3-1a77-44e4-a404-3271eaee94be'(vm02.mytld) moved from 'Up' --> 'Paused'
> > INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-69) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm02.mytld has been paused.
> > ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-69) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm02.mytld has been paused due to unknown storage error
> >
> > My iSCSI VMs, some of which may have mounted (not bootable) disks from the same GlusterFS storage, do NOT suffer from this issue AFAIK.
> >
> > My installation (oVirt 3.6/CentOS 7) is pretty much a typical one, with a GlusterFS enabled cluster with 4 hosts, 2-3 networks, and 6-7 VMs.
> >
> > Thanks,
> > K.
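To see how often each VM hits the I/O error, the vdsm lines quoted in the reply above can be filtered with a short script. This is a minimal sketch that assumes the log lines have exactly the shape shown in this thread; the function names are illustrative and are not part of vdsm.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Iterator

# Matches only the "(onIOError) ... abnormal vm stop" lines, based on the
# two sample lines quoted in the thread (format assumed from those samples).
IO_ERROR_RE = re.compile(
    r"::(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r".*?\(onIOError\) vmId=`(?P<vm_id>[0-9a-f-]+)`::"
    r"abnormal vm stop device (?P<device>\S+) error (?P<error>\S+)"
)

SAMPLE_LINES = [
    "libvirtEventLoop::INFO::2015-07-16 22:30:02,237::vm::3609::virt.vm::(onIOError) "
    "vmId=`bb46929c-0b4e-4f01-868a-7e7638fa943b`::abnormal vm stop device virtio-disk0 error eother",
    "libvirtEventLoop::INFO::2015-07-16 22:30:02,237::vm::4889::virt.vm::(_logGuestCpuStatus) "
    "vmId=`bb46929c-0b4e-4f01-868a-7e7638fa943b`::CPU stopped: onIOError",
]


def io_error_events(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per 'abnormal vm stop' line; other lines are skipped."""
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match:
            yield match.groupdict()


def count_by_vm(lines: Iterable[str]) -> Counter:
    """Count I/O error events per VM id."""
    return Counter(event["vm_id"] for event in io_error_events(lines))


if __name__ == "__main__":
    for vm_id, hits in count_by_vm(SAMPLE_LINES).items():
        print(f"{vm_id}: {hits} I/O error event(s)")
```

Feeding it a real log would mean passing an open file object instead of `SAMPLE_LINES`; the second sample line is deliberately not counted because it reports the CPU stop rather than the device error.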
[ovirt-users] VM has been paused due to unknown storage error
Hello oVirt users,

I am facing a serious problem regarding my GlusterFS storage and virtual machines that have *bootable* disks on this storage.

All my VMs that have GlusterFS disks are occasionally (1-2 times/hour) becoming paused with the following error:
VM vm02.mytld has been paused due to unknown storage error.

Engine log:
INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-69) [] VM '247bb0f3-1a77-44e4-a404-3271eaee94be'(vm02.mytld) moved from 'Up' --> 'Paused'
INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-69) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm02.mytld has been paused.
ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-69) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm02.mytld has been paused due to unknown storage error

My iSCSI VMs, some of which may have mounted (not bootable) disks from the same GlusterFS storage, do NOT suffer from this issue AFAIK.

My installation (oVirt 3.6/CentOS 7) is pretty much a typical one, with a GlusterFS enabled cluster with 4 hosts, 2-3 networks, and 6-7 VMs.

Thanks,
K.
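To check the "1-2 times/hour" figure against engine.log, the state transitions can be pulled out with a small parser. This is a sketch that assumes the VmAnalyzer line format is exactly as quoted in the message above; `Transition` and `transitions` are illustrative names, not oVirt APIs.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Transition(NamedTuple):
    vm_id: str
    vm_name: str
    old_state: str
    new_state: str


# Pattern derived from the single VmAnalyzer line quoted in the thread.
TRANSITION_RE = re.compile(
    r"VM '(?P<vm_id>[0-9a-f-]+)'\s*\((?P<vm_name>[^)]+)\) "
    r"moved from '(?P<old_state>[^']+)' --> '(?P<new_state>[^']+)'"
)

SAMPLE_LINES = [
    "INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-69) [] "
    "VM '247bb0f3-1a77-44e4-a404-3271eaee94be'(vm02.mytld) moved from 'Up' --> 'Paused'",
    "INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    "(DefaultQuartzScheduler_Worker-69) [] Correlation ID: null, Call Stack: null, "
    "Custom Event ID: -1, Message: VM vm02.mytld has been paused.",
]


def transitions(lines: Iterable[str]) -> Iterator[Transition]:
    """Yield a Transition for every 'moved from X --> Y' line."""
    for line in lines:
        match = TRANSITION_RE.search(line)
        if match:
            yield Transition(**match.groupdict())


if __name__ == "__main__":
    paused = [t for t in transitions(SAMPLE_LINES) if t.new_state == "Paused"]
    for t in paused:
        print(f"{t.vm_name} ({t.vm_id}): {t.old_state} -> {t.new_state}")
```

The quoted engine lines carry no timestamp, so counting pauses per hour would need the timestamp prefix from the real log file, which is not shown in the thread.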
[ovirt-users] Create Template error
Hello,

"Clone VM" or "Make Template" took several minutes and failed with this error:
Failed with error ENGINE and code 5001

Full engine log: http://ur1.ca/n4iiu

oVirt / CentOS 7 and local PostgreSQL.

Thanks,
K.
Re: [ovirt-users] Internal Engine Error while adding a new (distribute glusterfs) Storage Domain
The good news: there is an option for this.
The bad news: only replica 3 is supported. Other values are for development purposes.

[root@hv00 ~]# cat /etc/vdsm/vdsm.conf
...
[gluster]
# Only replica 3 is supported, this configuration is for development.
# Value is comma separated. For example, to allow replica 1 and
# replica 3, use 1,3.
allowed_replica_counts = 1,3
...

https://bugzilla.redhat.com/show_bug.cgi?id=1238093

On 07/13/2015 05:08 PM, Konstantinos Christidis wrote:
> Hello,
>
> I created (through oVirt web) a GlusterFS distributed volume with four bricks. When I try to add a New Domain - GlusterFS Data I am getting
> Error while executing action Add Storage Connection: Internal Engine Error
> and
> Error validating master storage domain: ('MD read error',)
>
> Full logs:
> engine.log - https://paste.kde.org/pefcwndgc/zamd2o/raw
> vdsm.log - https://paste.kde.org/pxf91znwq/6mhrg3/raw
> Gluster info/options - https://paste.kde.org/pjfauvisg/grfrvj/raw
>
> (oVirt 3.6/CentOS 7)
>
> PS: My installation seems to work only with replica-3 oVirt Optimized volumes. Every other combination fails with the error above.
>
> Any help would be appreciated.
>
> Thanks,
> K.

K.
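The comma-separated `allowed_replica_counts` value quoted above can be read and checked with a few lines of Python. This is only a sketch using the standard `configparser` module on the snippet from the message; it is not how vdsm itself reads its configuration, and the helper names are hypothetical.

```python
import configparser
from typing import Set

# The [gluster] section exactly as quoted in the message above.
SAMPLE_VDSM_CONF = """\
[gluster]
# Only replica 3 is supported, this configuration is for development.
# Value is comma separated. For example, to allow replica 1 and
# replica 3, use 1,3.
allowed_replica_counts = 1,3
"""


def allowed_replica_counts(conf_text: str) -> Set[int]:
    """Return the replica counts listed under [gluster] allowed_replica_counts.

    Raises configparser.NoSectionError or NoOptionError if the setting is absent.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("gluster", "allowed_replica_counts")
    return {int(part) for part in raw.split(",") if part.strip()}


def is_replica_allowed(conf_text: str, replica: int) -> bool:
    """Check one replica count against the configured list."""
    return replica in allowed_replica_counts(conf_text)


if __name__ == "__main__":
    print(sorted(allowed_replica_counts(SAMPLE_VDSM_CONF)))
    print(is_replica_allowed(SAMPLE_VDSM_CONF, 2))
```

With the quoted value `1,3`, a plain distributed volume (replica 1) would pass this check and a replica-2 volume would not, which is consistent with the config comment that other values exist only for development.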
[ovirt-users] Internal Engine Error while adding a new (distribute glusterfs) Storage Domain
Hello,

I created (through oVirt web) a GlusterFS distributed volume with four bricks. When I try to add a New Domain - GlusterFS Data I am getting
Error while executing action Add Storage Connection: Internal Engine Error
and
Error validating master storage domain: ('MD read error',)

Full logs:
engine.log - https://paste.kde.org/pefcwndgc/zamd2o/raw
vdsm.log - https://paste.kde.org/pxf91znwq/6mhrg3/raw
Gluster info/options - https://paste.kde.org/pjfauvisg/grfrvj/raw

(oVirt 3.6/CentOS 7)

PS: My installation seems to work only with replica-3 oVirt Optimized volumes. Every other combination fails with the error above.

Any help would be appreciated.

Thanks,
K.
Re: [ovirt-users] oVirt Error while executing Attach Storage Domain (glusterfs)
Hello again,

In the volume options (auth.allow) I changed my comma-separated values from
ovweb.mytld,ovhv1.mytld,ovhv2.mytld,ovhv3.mytld,ovhv4.mytld
to
*
and it worked like a charm.

Thanks.
K.

On 11/07/2015 12:08 PM, Konstantinos Christidis wrote:
> Hello,
>
> I have a glusterfs enabled cluster with a replica-3 volume. When I try to add oVirt GlusterFS/Data storage I am getting this error:
> Error while executing Attach Storage Domain: Storage domain cannot be reached. Please ensure it is accessible from the host(s).
>
> iptables/firewalld are stopped.
>
> Anyone?
>
> K.
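The thread does not establish why the hostname list was rejected while `*` was accepted. As an illustration only, the sketch below shows wildcard matching of a client identifier against a comma-separated allow list, under the assumption (not confirmed here) that the server compares whatever identifier the client presents against each pattern. It uses Python's `fnmatch`, not GlusterFS code, and the IP address is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List

# The hostname list quoted in the message above.
HOSTNAME_LIST = "ovweb.mytld,ovhv1.mytld,ovhv2.mytld,ovhv3.mytld,ovhv4.mytld"


def split_patterns(auth_allow: str) -> List[str]:
    """Split a comma-separated allow list into trimmed, non-empty patterns."""
    return [p.strip() for p in auth_allow.split(",") if p.strip()]


def is_allowed(client: str, auth_allow: str) -> bool:
    """Return True if the client identifier matches any pattern in the list."""
    return any(fnmatchcase(client, pattern) for pattern in split_patterns(auth_allow))


if __name__ == "__main__":
    # Hypothetical address (assumption): a client seen by IP rather than by name.
    client_ip = "10.0.0.5"
    print(is_allowed(client_ip, HOSTNAME_LIST))      # name list does not match an IP
    print(is_allowed("ovhv1.mytld", HOSTNAME_LIST))  # exact name matches
    print(is_allowed(client_ip, "*"))                # wildcard matches anything
```

Because `*` admits every client, it may be worth testing a narrower pattern once the mount works, if the volume is reachable from hosts outside the cluster.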
[ovirt-users] oVirt Error while executing Attach Storage Domain (glusterfs)
Hello,

I have a glusterfs enabled cluster with a replica-3 volume. When I try to add oVirt GlusterFS/Data storage I am getting this error:
Error while executing Attach Storage Domain: Storage domain cannot be reached. Please ensure it is accessible from the host(s).

iptables/firewalld are stopped.

Anyone?

K.