[ovirt-users] Storage question: single node gluster?
Hi there,

This is not strictly oVirt, but it is storage-related, so hopefully you will indulge me. Is there any detriment (performance or otherwise) in setting up single-node GlusterFS storage? I know GlusterFS is designed to be used with multiple nodes, but I am wondering if there are any ill effects in configuring current storage as a single-node cluster, with the idea of possibly adding more nodes in the future?

Thanks! :-)
-Alan
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
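For what it's worth, a single-brick volume can be created now and converted to a replicated one later by adding bricks, roughly like this (a sketch; the hostnames, volume name, and brick paths are placeholders):

```shell
# Create and start a single-brick distribute volume on the current node
gluster volume create datavol host1:/bricks/datavol/brick1
gluster volume start datavol

# Later, once a second node exists, peer it and convert the volume
# to replica 2 by adding a matching brick
gluster peer probe host2
gluster volume add-brick datavol replica 2 host2:/bricks/datavol/brick1
```

Keep in mind that vdsm expects replica-3 volumes for VM storage, so a single-brick volume is mainly useful for test setups.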
[ovirt-users] VM got paused state after migration
Dear list members,

I have a 3-node oVirt system based on self-hosted GlusterFS. Today I attempted a VM migration, but it was unsuccessful: the VM ended up in a paused state, could not be resumed, and could not be switched off. So I killed it on the compute node, but it still shows as paused. How can I move this machine to the powered-off state? At the moment I can't power this VM on. :( Is there a friendlier solution for cases like this in the future? If I restart vdsmd, will that cause other problems?

Thanks in advance.

Regards,
Tibor
[ovirt-users] Internal Engine Error while adding a new (distribute glusterfs) Storage Domain
Hello,

I created (through the oVirt web UI) a GlusterFS distributed volume with four bricks. When I try to add a New Domain - GlusterFS Data, I get Error while executing action Add Storage Connection: Internal Engine Error and Error validating master storage domain: ('MD read error',)

Full logs:
engine.log - https://paste.kde.org/pefcwndgc/zamd2o/raw
vdsm.log - https://paste.kde.org/pxf91znwq/6mhrg3/raw
Gluster info/options - https://paste.kde.org/pjfauvisg/grfrvj/raw
(oVirt 3.6 / CentOS 7)

PS: My installation seems to work only with replica-3 oVirt Optimized volumes. Every other combination fails with the error above. Any help would be appreciated.

Thanks,
K.
Re: [ovirt-users] VM got paused state after migration
On 13/07/15 17:02, Demeter Tibor wrote:
> Dear Listmembers,
> [...]

Hi Tibor,

Is the qemu process dead on both hosts? Can you run the command below on one of the hosts and provide the output?

hosted-engine --vm-status
Re: [ovirt-users] VM got paused state after migration
Hi,

I'm sorry, but I don't have a real hosted engine. I have GlusterFS on the same nodes as the VMs, but I don't use hosted engine. Also, I've checked: the VM is not running on either server. What do you think, if I restart vdsmd, will that solve this problem?

Thanks for the fast reply,
Tibor

On 13 July 2015 at 16:52, Doron Fediuck dfedi...@redhat.com wrote:
> [...]
Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?
Thanks for the responses everyone, and for the RFE. I do use HA in some places at the moment, but I do see another timeout value called vdsConnectionTimeout. Would HA use this value or vdsTimeout (set to 2 by default) when attempting to contact the host?

-Ryan

-----Original Message-----
From: Shubhendu Tripathi [mailto:shtri...@redhat.com]
Sent: Monday, July 13, 2015 2:25 AM
To: Piotr Kliczewski
Cc: Omer Frenkel; Groten, Ryan; users@ovirt.org
Subject: Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?

Ryan Groten wrote on Friday, July 10, 2015:
> When I try to attach new direct LUN disks, the scan takes a very long time to complete because of the number of PVs presented to my hosts (there is already a bug on this, related to the pvcreate command taking a very long time - https://bugzilla.redhat.com/show_bug.cgi?id=1217401). I discovered a workaround by setting the vdsTimeout value higher (it is 180 seconds by default). I changed it to 300 seconds and now the direct LUN scan returns properly, but I'm hoping someone can warn me if this workaround is safe or if it will cause other potential issues. I made this change yesterday and so far so good.

Liron Aravot replied on Sunday, July 12, 2015:
> Hi, no serious issue can be caused by that. Keep in mind, though, that any other operation will have that amount of time to complete before failing on timeout, which will cause delays before failing (as the timeout was increased for all executions) when not everything is operational and up as expected. I'd guess that an RFE could be opened to allow increasing the timeout of specific operations if a user wants to do that.

Omer Frenkel replied:
> If you have HA VMs and use power management (fencing), this might cause longer downtime for HA VMs if a host has network timeouts: the engine will wait for 3 network failures before trying to fence the host. So with the timeout increased to 5 minutes, you should expect 15 minutes before the engine decides the host is non-responsive and fences it. If you have an HA VM on that host, this will be the VM downtime as well, since the engine restarts HA VMs only after fencing. You can read more at http://www.ovirt.org/Automatic_Fencing

Shubhendu Tripathi replied:
> I also have a need for this: when I try to delete all 256 gluster volume snapshots using a single gluster CLI command, the engine times out. So, as Liron suggested, it would be better if the timeout could be set at the VDSM verb level. That would be the better option, and callers would need to use the feature judiciously. :)

Piotr Kliczewski replied:
> Please open an RFE for being able to set the operation timeout for a single command call, with a description of the use cases for which you would like to set the timeout.

Shubhendu Tripathi replied on Monday, July 13, 2015:
> Piotr, I created an RFE BZ at https://bugzilla.redhat.com/show_bug.cgi?id=1242373.
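For reference, the engine-side timeout discussed above is read and set with engine-config on the engine host, roughly like this (a sketch; the engine service must be restarted for the change to take effect):

```shell
# Show the current vdsm command timeout (180 seconds by default)
engine-config -g vdsTimeout

# Raise it to 300 seconds, as in the workaround described above
engine-config -s vdsTimeout=300

# Restart the engine so the new value is picked up
service ovirt-engine restart
```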
Re: [ovirt-users] VM got paused state after migration
On 13/07/15 17:59, Demeter Tibor wrote:
> Hi,
> [...]

You can restart vdsmd, but it may not do anything (running VMs are not touched by vdsm). Can you check with vdsm what the VM status is?
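On oVirt 3.5-era hosts, the VM state as vdsm sees it can be queried with the vdsClient tool, roughly like this (a sketch; the UUID is a placeholder, and -s 0 assumes vdsm runs with SSL on the local host):

```shell
# List the VMs vdsm knows about on this host, including their status
vdsClient -s 0 list table

# If the stale VM still shows up as paused, ask vdsm to tear it down
# by its UUID (placeholder shown)
vdsClient -s 0 destroy b1f2ad0e-0000-0000-0000-000000000000
```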
Re: [ovirt-users] Internal Engine Error while adding a new (distribute glusterfs) Storage Domain
The good news: there is an option for this. The bad news: only replica 3 is supported; the other options are for development purposes.

[root@hv00 ~]# cat /etc/vdsm/vdsm.conf
...
[gluster]
# Only replica 3 is supported, this configuration is for development.
# Value is comma separated. For example, to allow replica 1 and
# replica 3, use 1,3.
allowed_replica_counts = 1,3
...

https://bugzilla.redhat.com/show_bug.cgi?id=1238093

On 07/13/2015 05:08 PM, Konstantinos Christidis wrote:
> Hello,
> [...]

K.
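If you only want to experiment on a test setup, the development override above can be applied per host roughly like this (a sketch; explicitly unsupported for production, it assumes vdsm.conf has no existing [gluster] section, and every host needs the change plus a vdsmd restart):

```shell
# Development-only: allow replica 1 alongside replica 3
# by appending a [gluster] section to vdsm.conf
cat >> /etc/vdsm/vdsm.conf <<'EOF'
[gluster]
allowed_replica_counts = 1,3
EOF

# Restart vdsmd so the new setting is read
service vdsmd restart
```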
[ovirt-users] R: R: R: R: R: R: R: PXE boot of a VM on vdsm don't read DHCP offer
-----Original Message-----
From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On behalf of Michael S. Tsirkin
Sent: Thursday, 9 July 2015 15:15
To: Fabian Deutsch
Cc: users@ovirt.org
Subject: Re: [ovirt-users] R: R: R: R: R: R: PXE boot of a VM on vdsm don't read DHCP offer

NUNIN Roberto wrote on Tue, Jul 07, 2015:
> Hi Dan. Sorry for the question: what do you mean by "interface vnet"? Currently our path is: eno1 - eno2 -> bond0 -> bond.3500 (VLAN) -> bridge -> vm. Which one of these? Moreover, reading Fabian's statements about bonding limits, today I can try to switch to a config without bonding.

Dan Kenigsberg replied:
> "vm" is a complicated term. `brctl show` would not show you a vm connected to a bridge. What you WOULD see is a vnet888 tap device. The other side of this device is held by qemu, which implements the VM. I'm asking if the DHCP offer has reached that tap device.

NUNIN Roberto replied:
> Ok, understood and found it: vnet2. No, the DHCP OFFER packet does not reach the vnet2 interface; I can see only the DHCP DISCOVER.

Dan Kenigsberg replied:
> Ok, so it seems that we have a problem in the host bridging. Is it the latest kernel-3.10.0-229.7.2.el7.x86_64? Michael, a DHCP DISCOVER is sent out of a just-booted guest, and the OFFER returns to the bridge, but is not propagated to the tap device. Can you suggest how to debug this further?

Michael S. Tsirkin replied on Wed, Jul 08, 2015:
> Dump packets including the ethernet headers. Likely something interfered with them so the eth address is wrong. Since bonding does this sometimes, this is the most likely culprit.

Fabian Deutsch replied on Thu, Jul 09, 2015:
> We've ruled this out already - Roberto reproduces the issue without a bond. To me this looks like a regression in the host-side bridging. But on the other hand it doesn't look like it's happening always, because otherwise I'd expect more noise around this issue.
> - fabian

Michael S. Tsirkin replied:
> Hard to say. E.g. a forwarding delay would do this for a while. If the eth address of the packets is okay, poke at the fdb; maybe there's something wrong there. Maybe STP is detecting a loop - try checking that.

I have the tcpdump captures; let me know if they would be useful to analyze. On the VLAN interface, STP=off.

RN

This message is for the designated recipient only and may contain privileged, proprietary, or otherwise private information. If you have received it in error, please notify the sender immediately, deleting the original and all copies and destroying any hard copies. Any other use is strictly prohibited and may be unlawful.
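The checks discussed in this thread (does the OFFER arrive at the tap device, with which MAC, and is the forwarding database sane?) can be run on the host roughly like this (a sketch; vnet2 is the tap name from this thread, while the bridge name ovirtmgmt is an assumption, substitute your own):

```shell
# DHCP traffic on the tap device, with ethernet headers (-e)
# so the destination MAC of the OFFER is visible
tcpdump -e -n -i vnet2 'port 67 or port 68'

# The same traffic as seen by the bridge itself, for comparison
tcpdump -e -n -i ovirtmgmt 'port 67 or port 68'

# The bridge forwarding database: the guest MAC should be listed
# against the port number of vnet2
brctl showmacs ovirtmgmt
```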
Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?
On Mon, Jul 13, 2015 at 5:57 AM, Shubhendu Tripathi shtri...@redhat.com wrote:
> On 07/12/2015 09:53 PM, Omer Frenkel wrote:
> [...]
>
> I also have a need for this: when I try to delete all 256 gluster volume snapshots using a single gluster CLI command, the engine times out. So, as Liron suggested, it would be better if at the VDSM verb level we were able to set the timeout. That would be the better option, and callers would need to use the feature judiciously. :)

Please open an RFE for being able to set the operation timeout for a single command call, with a description of the use cases for which you would like to set the timeout.
Re: [ovirt-users] hosted_engine, fencing does not work
On 10/07/2015 11:52, Николаев Алексей wrote:
> CC sbona...@redhat.com
> @Sandro Bonazzola, can you please tell us the status of the oVirt project regarding CentOS 6.6 hypervisor hosts? Has support for CentOS 6.6 hypervisor hosts ended?

CentOS Linux 6.6 won't be supported for cluster level 3.6. You may continue to use 6.6 hypervisors with 3.5 cluster level using ovirt-engine 3.6. We encourage you to upgrade to the latest 7 release in order to gain 3.6 cluster level features.

> I found information only on oVirt Engine 3.6: "Repository closure is currently broken on Fedora 20 and CentOS 6 due to a missing required dependency on recent libvirt and vdsm rpm dropping el6 support. VDSM builds for EL6 are no more available on master snapshot."

VDSM is not built anymore on Fedora 20 and EL6, since 3.6 cluster level won't be available on those distributions. You may continue using vdsm from the 3.5 repositories on those distros, but you're strongly encouraged to upgrade to a more recent one in order to get 3.6 features.

> In the future we will have to move to CentOS 7 anyway.
>
> On 10.07.2015 11:24, Ilya Fedotov kosh...@gmail.com wrote:
>> Well, Николаев Алексей is exaggerating here.
>>
>> System Requirements
>> Minimum Hardware/Software
>> * 4 GB memory
>> * 20 GB disk space
>> Optional Hardware
>> * Network storage
>> Recommended browsers
>> * Mozilla Firefox 17
>> * IE9 and above for the web-admin
>> * IE8 and above for the user portal
>>
>> It works perfectly.
>>
>> Supported Hosts
>> * Fedora 20
>> * CentOS 6.6, 7.0
>> * Red Hat Enterprise Linux 6.6, 7.0
>> * Scientific Linux 6.6, 7.0
>>
>> On 10.07.2015 10:45 GMT+03:00, Николаев Алексей alexeynikolaev.p...@yandex.ru wrote:
>>> oVirt no longer supports CentOS 6 for hypervisors; CentOS 6 is supported only for the oVirt Engine. I think the problem should first be reproduced on CentOS 7.1. If it persists, we will need the logs from the hypervisors (/var/log/vdsm/vdsm.log) and the oVirt Engine log (/var/log/ovirt-engine/engine.log). Note that the logs may contain confidential information about your system.
>>>
>>> On 10.07.2015 03:06, martirosov.d martiroso...@emk.ru wrote:
>>>> Hi.
>>>> oVirt 3.5. We have two servers (node1 and node2) running CentOS 6.6.
>>>>
>>>> 1. Engine on node2: if the network on node1 is disabled, the engine restarts node1 after some time and everything works fine.
>>>> 2. Engine on node1: if the network on node1 is disabled, the engine moves to node2, but node1 does not get reset, although the engine attempts it. These are the log messages:
>>>>
>>>> 2015-Jul-03, 10:42 Host node1 is not responding. It will stay in Connecting state for a grace period of 120 seconds and after that an attempt to fence the host will be issued.
>>>> 2015-Jul-03, 10:39 Host node1 is not responding. It will stay in Connecting state for a grace period of 120 seconds and after that an attempt to fence the host will be issued.
>>>> 2015-Jul-03, 10:36 Host node1 is not responding. It will stay in Connecting state for a grace period of 120 seconds and after that an attempt to fence the host will be issued.
>>>> 2015-Jul-03, 10:34 User admin@internal logged in.
>>>> 2015-Jul-03, 10:33 Host node1 became non responsive. It has no power management configured. Please check the host status, manually reboot it, and click Confirm Host Has Been Rebooted
>>>> 2015-Jul-03, 10:33 Host node2 from cluster emk-cluster was chosen as a proxy to execute Status command on Host node1.
>>>> 2015-Jul-03, 10:33 Host node1 is non responsive.
>>>>
>>>> Manual fencing of node1 works from node2:
>>>> # fence_ipmilan -A password -i 10.64.1.103 -l admin -p *** -o status
>>>> Getting status of IPMI:10.64.1.103...Chassis power = On
>>>> Done
>>>>
>>>> The Power Management test for node2 succeeded in oVirt Manager. In other words: if the host running the engine is turned off, then after the engine moves to a healthy host it does not fence the problematic host, and the VMs that were on the problematic host do not migrate. If the problem is not on the host running the engine, everything works: the engine fences the host and migrates the VMs.

--
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?
On 07/13/2015 01:42 PM, Piotr Kliczewski wrote:
> On Mon, Jul 13, 2015 at 5:57 AM, Shubhendu Tripathi shtri...@redhat.com wrote:
> [...]
>
> Please open an RFE for being able to set the operation timeout for a single command call, with a description of the use cases for which you would like to set the timeout.

Piotr, I created an RFE BZ at https://bugzilla.redhat.com/show_bug.cgi?id=1242373.

Thanks and Regards,
Shubhendu