Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?
On 07/18/2014 05:43 PM, Andrew Lau wrote:

On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur <vbel...@redhat.com> wrote:

[Adding gluster-devel]

On 07/18/2014 05:20 PM, Andrew Lau wrote:

Hi all,

As most of you have got hints from previous messages, hosted engine won't work on gluster. A quote from BZ1097639:

"Using hosted engine with Gluster backed storage is currently something we really warn against. I think this bug should be closed or re-targeted at documentation, because there is nothing we can do here. Hosted engine assumes that all writes are atomic and (immediately) available for all hosts in the cluster. Gluster violates those assumptions."

I tried going through BZ1097639 but could not find much detail with respect to gluster there. A few questions around the problem:

1. Can somebody please explain in detail the scenario that causes the problem?
2. Is hosted engine performing synchronous writes to ensure that writes are durable?

Also, if there is any documentation that details the hosted engine architecture, that would help in enhancing our understanding of its interactions with gluster.

Now my question: does this theory prevent a scenario of perhaps something like a gluster replicated volume being mounted as a glusterfs filesystem and then re-exported as a native kernel NFS share for the hosted-engine to consume? It could then be possible to chuck ctdb in there to provide a last-resort failover solution. I have tried it myself and suggested it to two people who are running a similar setup. They are now using the native kernel NFS server for hosted-engine and haven't reported as many issues. Curious, could anyone validate my theory on this?

If we obtain more details on the use case and obtain gluster logs from the failed scenarios, we should be able to understand the problem better. That could be the first step in validating your theory or evolving further recommendations :).

I'm not sure how useful this is, but Jiri Moskovcak tracked this down in an off-list message.

Message quote:

==

We were able to track it down to this (thanks Andrew for providing the testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle
    response = "success " + self._dispatch(data)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _dispatch
    .get_all_stats_for_service_type(**options)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 41, in get_all_stats_for_service_type
    d = self.get_raw_stats_for_service_type(storage_dir, service_type)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle: '/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

Andrew/Jiri,

Would it be possible to post gluster logs of both the mount and bricks on the bz? I can take a look at it once. If I gather nothing then probably I will ask for your help in re-creating the issue.

Pranith

It's definitely connected to the storage, which leads us to gluster. I'm not very familiar with gluster, so I need to check this with our gluster gurus.
==

Thanks,
Vijay
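For readers following the traceback above: a minimal sketch of what storage_broker.py is doing at line 74, with an ESTALE retry added for experimentation. The path, read size and retry count are assumptions for illustration; only the open flags mirror the real code.

    import errno
    import os

    # Hypothetical mount path; the real file lives under
    # /rhev/data-center/mnt/<server>:<path>/<uuid>/ha_agent/.
    PATH = "/mnt/hosted-engine/ha_agent/hosted-engine.metadata"

    def read_metadata(path, size=4096, retries=3):
        # O_DIRECT bypasses the client page cache, which is why the broker
        # uses it on shared storage. Direct I/O normally needs aligned
        # buffers on local filesystems; NFS mounts, where this code runs,
        # are more forgiving.
        direct_flag = getattr(os, "O_DIRECT", 0)
        for attempt in range(retries):
            try:
                fd = os.open(path, direct_flag | os.O_RDONLY)
                try:
                    return os.read(fd, size)
                finally:
                    os.close(fd)
            except OSError as e:
                # ESTALE (errno 116) means the cached file handle no longer
                # resolves on the server; a fresh open retries the lookup.
                if e.errno != errno.ESTALE or attempt == retries - 1:
                    raise

    print(len(read_metadata(PATH)))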
Re: [ovirt-users] [ovirt-devel] virt-v2v integration feature
Lots of virtual appliances come in OVF format. There isn't a way to import one into an oVirt/RHEV instance without a live ESX instance. This is very inconvenient and inefficient.

-Original Message-
From: Sven Kieske [mailto:s.kie...@mittwald.de]
Sent: Thursday, July 10, 2014 5:01 AM
To: Itamar Heim; de...@ovirt.org
Cc: Users@ovirt.org List; Richard W.M. Jones; midnightst...@msn.com; blas...@556nato.com; bugzi...@grendelman.com; f...@moov.de; R P Herrold; jsp...@bandwith.com
Subject: Re: [ovirt-users] [ovirt-devel] virt-v2v integration feature

On 10.07.2014 09:41, Itamar Heim wrote:

On 07/10/2014 10:29 AM, Sven Kieske wrote:

On 09.07.2014 20:30, Arik Hadas wrote:

Hi All,
The proposed feature will introduce a new process for importing virtual machines from external systems using virt-v2v in oVirt. I've created a wiki page that contains initial thoughts and design for it: http://www.ovirt.org/Features/virt-v2v_Integration
You are more than welcome to share your thoughts and insights.
Thanks, Arik

Am I right that this still involves a fully operational (e.g. ESXi) host to import VMware VMs? There is huge user demand for a simpler process for just converting and importing a VMware disk image. This feature will not solve this use case, will it?

I agree it should. Need to check if virt-v2v can cover this; if not, need to fix it so it will...

Well, here are the relevant BZ entries:
https://bugzilla.redhat.com/show_bug.cgi?id=1062910
https://bugzilla.redhat.com/show_bug.cgi?id=1049604
CC'ing the users from these Bugzilla entries, maybe they can add something to gain some traction :)

--
Sven Kieske, Systemadministrator, Mittwald CM Service GmbH & Co. KG
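On the "just convert a VMware disk image" use case: while the virt-v2v integration matures, the usual manual workaround is qemu-img, which reads VMDK directly and needs no ESX host. A hedged sketch; the input and output paths are placeholders, and the resulting raw image still has to be copied into a storage domain disk created through the engine.

    import subprocess

    # Hypothetical paths -- adjust to your environment.
    SRC_VMDK = "/tmp/appliance-disk1.vmdk"
    DST_RAW = "/tmp/appliance-disk1.img"

    # qemu-img reads VMDK natively, so the conversion itself needs no
    # live ESX instance.
    subprocess.check_call([
        "qemu-img", "convert",
        "-f", "vmdk",   # source format
        "-O", "raw",    # output format accepted by oVirt data domains
        SRC_VMDK, DST_RAW,
    ])

    # Inspect the result; the virtual size is needed when creating the
    # target disk in oVirt before copying the data in.
    print(subprocess.check_output(["qemu-img", "info", DST_RAW]))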
Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?
On 07/19/2014 11:25 AM, Andrew Lau wrote:

On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:

[earlier thread history trimmed; see the first message in this thread above]

Andrew/Jiri, would it be possible to post gluster logs of both the mount and bricks on the bz? I can take a look at it once. If I gather nothing then probably I will ask for your help in re-creating the issue.

Pranith

Unfortunately, I don't have the logs for that setup any more. I'll try to replicate when I get a chance.
If I understand the comment from the BZ, I don't think it's a gluster bug per se, more just how gluster does its replication.

hi Andrew,
Thanks for that. I couldn't come to any conclusions because no logs were available. It is unlikely that self-heal is involved, because there were no bricks going down/up according to the bug description.

Pranith
[ovirt-users] unable to mount iso storage
Hi,

When I added the 2nd host, I keep getting "The error message for connection OvirtFE:/mnt/iso_domain returned by VDSM was: Problem while trying to mount target" and I am unable to access images when creating VMs.
Re: [ovirt-users] unable to mount iso storage
On 21-7-2014 9:29, Gene Fontanilla wrote:

Hi, when I added the 2nd host, I keep getting "The error message for connection OvirtFE:/mnt/iso_domain returned by VDSM was: Problem while trying to mount target" and I am unable to access images when creating VMs.

Check if you can mount the NFS share from that second server. I'll guess not. Check your firewall(s). You can find the exact mount command in vdsm.log.

Joop
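Joop's advice, as a runnable sketch: attempt the same NFS mount by hand on the failing host. The mount options below only approximate vdsm's NFS defaults and the mountpoint is a placeholder; the authoritative command line is in vdsm.log.

    import subprocess

    SERVER_PATH = "OvirtFE:/mnt/iso_domain"  # from the error message above
    MOUNTPOINT = "/tmp/isotest"              # scratch mountpoint

    subprocess.check_call(["mkdir", "-p", MOUNTPOINT])
    try:
        # Options approximate what vdsm uses for ISO domains; check
        # vdsm.log on the host for the exact command line it ran.
        subprocess.check_call([
            "mount", "-t", "nfs",
            "-o", "soft,nosharecache,timeo=600,retrans=6,nfsvers=3",
            SERVER_PATH, MOUNTPOINT,
        ])
        print("mount succeeded; unmounting")
    finally:
        # Plain call(): unmounting is a no-op if the mount failed.
        subprocess.call(["umount", MOUNTPOINT])

If the manual mount fails the same way, the problem is between the host and the NFS server (firewall, exports, network), not oVirt itself.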
Re: [ovirt-users] Setup of hosted Engine Fails
Hi Andrew,
thanks for debugging this, please create a bug against vdsm to make sure it gets proper attention.

Thanks,
Jirka

On 07/19/2014 12:36 PM, Andrew Lau wrote:

Quick update: it seems to be related to the latest vdsm package.

service vdsmd start
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running gencerts
vdsm: Running check_is_configured
libvirt is not configured for vdsm yet
Modules libvirt are not configured
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 145, in <module>
    sys.exit(main())
  File "/usr/bin/vdsm-tool", line 142, in main
    return tool_command[cmd]["command"](*args[1:])
  File "/usr/lib64/python2.6/site-packages/vdsm/tool/configurator.py", line 282, in isconfigured
    raise RuntimeError(msg)
RuntimeError: One of the modules is not configured to work with VDSM. To configure the module use the following: 'vdsm-tool configure [module_name]'. If all modules are not configured try to use: 'vdsm-tool configure --force' (The force flag will stop the module's service and start it afterwards automatically to load the new configuration.)
vdsm: stopped during execute check_is_configured task (task returned with error code 1).
vdsm start [FAILED]

yum downgrade vdsm*

Here's the package changes for reference:
--> Running transaction check
---> Package vdsm.x86_64 0:4.14.9-0.el6 will be a downgrade
---> Package vdsm.x86_64 0:4.14.11-0.el6 will be erased
---> Package vdsm-cli.noarch 0:4.14.9-0.el6 will be a downgrade
---> Package vdsm-cli.noarch 0:4.14.11-0.el6 will be erased
---> Package vdsm-python.x86_64 0:4.14.9-0.el6 will be a downgrade
---> Package vdsm-python.x86_64 0:4.14.11-0.el6 will be erased
---> Package vdsm-python-zombiereaper.noarch 0:4.14.9-0.el6 will be a downgrade
---> Package vdsm-python-zombiereaper.noarch 0:4.14.11-0.el6 will be erased
---> Package vdsm-xmlrpc.noarch 0:4.14.9-0.el6 will be a downgrade
---> Package vdsm-xmlrpc.noarch 0:4.14.11-0.el6 will be erased

service vdsmd start
initctl: Job is already running: libvirtd
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running gencerts
vdsm: Running check_is_configured
libvirt is already configured for vdsm
sanlock service is already configured
vdsm: Running validate_configuration
SUCCESS: ssl configured to true. No conflicts
vdsm: Running prepare_transient_repository
vdsm: Running syslog_available
vdsm: Running nwfilter
vdsm: Running dummybr
vdsm: Running load_needed_modules
vdsm: Running tune_system
vdsm: Running test_space
vdsm: Running test_lo
vdsm: Running unified_network_persistence_upgrade
vdsm: Running restore_nets
vdsm: Running upgrade_300_nets
Starting up vdsm daemon:
vdsm start [ OK ]
[root@ov-hv1-2a-08-23 ~]# service vdsmd status
VDS daemon server is running

On Sat, Jul 19, 2014 at 6:58 PM, Andrew Lau <and...@andrewklau.com> wrote:

It seems vdsm is not running:

service vdsmd status
VDS daemon is not running, and its watchdog is running

The only log in /var/log/vdsm/ that appears to have any content is /var/log/vdsm/supervdsm.log - everything else is blank.

MainThread::DEBUG::2014-07-19 18:55:34,793::supervdsmServer::424::SuperVdsm.Server::(main) Terminated normally
MainThread::DEBUG::2014-07-19 18:55:38,033::netconfpersistence::134::root::(_getConfigs) Non-existing config set.
MainThread::DEBUG::2014-07-19 18:55:38,034::netconfpersistence::134::root::(_getConfigs) Non-existing config set.
MainThread::DEBUG::2014-07-19 18:55:38,058::supervdsmServer::384::SuperVdsm.Server::(main) Making sure I'm root - SuperVdsm
MainThread::DEBUG::2014-07-19 18:55:38,059::supervdsmServer::393::SuperVdsm.Server::(main) Parsing cmd args
MainThread::DEBUG::2014-07-19 18:55:38,059::supervdsmServer::396::SuperVdsm.Server::(main) Cleaning old socket /var/run/vdsm/svdsm.sock
MainThread::DEBUG::2014-07-19 18:55:38,059::supervdsmServer::400::SuperVdsm.Server::(main) Setting up keep alive thread
MainThread::DEBUG::2014-07-19 18:55:38,059::supervdsmServer::406::SuperVdsm.Server::(main) Creating remote object manager
MainThread::DEBUG::2014-07-19 18:55:38,061::supervdsmServer::417::SuperVdsm.Server::(main) Started serving super vdsm object
sourceRoute::DEBUG::2014-07-19 18:55:38,062::sourceRouteThread::56::root::(_subscribeToInotifyLoop) sourceRouteThread.subscribeToInotifyLoop started

On Sat, Jul 19, 2014 at 6:48 PM, Andrew Lau <and...@andrewklau.com> wrote:

Here's a snippet from my hosted-engine-setup log:

2014-07-19 18:45:14 DEBUG otopi.context context._executeMethod:138 Stage late_setup METHOD
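A hedged first-aid sketch for the "Modules libvirt are not configured" failure above: run the remediation the RuntimeError itself suggests, then restart vdsm. In Andrew's case the downgrade was ultimately needed (see the follow-up bug in this thread), but this is the cheap thing to try first. It assumes the EL6 service layout shown in the transcript.

    import subprocess

    def run(cmd):
        # Echo the command, then run it and return the exit status.
        print("+ " + " ".join(cmd))
        return subprocess.call(cmd)

    # Reconfigure all vdsm modules (libvirt, sanlock, ...); --force stops
    # the affected services and restarts them with the new configuration,
    # exactly as the RuntimeError message above describes.
    run(["vdsm-tool", "configure", "--force"])

    # Then restart vdsm and confirm it stays up.
    run(["service", "vdsmd", "restart"])
    run(["service", "vdsmd", "status"])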
Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?
On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote:

On 07/19/2014 11:25 AM, Andrew Lau wrote:

[earlier thread history trimmed; see the first message in this thread above]

Unfortunately, I don't have the logs for that setup any more.
I'll try to replicate when I get a chance. If I understand the comment from the BZ, I don't think it's a gluster bug per se, more just how gluster does its replication.

hi Andrew,
Thanks for that. I couldn't come to any conclusions because no logs were available. It is unlikely that self-heal is involved, because there were no bricks going down/up according to the bug description.

Hi,
I've never had such a setup. I guessed a problem with gluster based on "OSError: [Errno 116] Stale file handle:", which happens when a file opened by an application on the client gets removed on the server. I'm pretty sure we (hosted-engine) don't remove that file, so I think it's some gluster magic moving the data around...

--Jirka
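Jirka's ESTALE explanation can be illustrated locally. The sketch below replaces a file the way a writer on another host might (write-new-then-rename): the path is unchanged but the inode behind it is not, and an NFS/gluster file handle caches exactly that identity, hence errno 116 on a client still holding the old handle. The filename is only an echo of the one in the traceback.

    import os
    import tempfile

    d = tempfile.mkdtemp()
    path = os.path.join(d, "hosted-engine.metadata")

    open(path, "w").write("generation 1")
    ino_before = os.stat(path).st_ino

    # "Server-side" replacement: write a new file, then rename it over
    # the old path. Locally this is the classic atomic-update idiom; on
    # NFS, a client that cached a handle to the old inode gets ESTALE.
    open(path + ".new", "w").write("generation 2")
    os.rename(path + ".new", path)

    ino_after = os.stat(path).st_ino
    print("inode %d -> %d" % (ino_before, ino_after))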
Re: [ovirt-users] Setup of hosted Engine Fails
Done, https://bugzilla.redhat.com/show_bug.cgi?id=1121561

On Mon, Jul 21, 2014 at 6:32 PM, Jiri Moskovcak <jmosk...@redhat.com> wrote:

[quoted message trimmed; see "Setup of hosted Engine Fails" above]
Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?
On 07/21/2014 02:08 PM, Jiri Moskovcak wrote:

[earlier thread history trimmed; see the messages above]
Re: [ovirt-users] Guest VM Console Creation/Access using REST API and noVNC
On Jul 21, 2014, at 04:33, Punit Dambiwal <hypu...@gmail.com> wrote:

Hi All,
I am still waiting for the updates... does anyone have a clue to solve this problem???

Hi Punit,
I'm afraid no one can help you debug connectivity issues remotely without you describing precisely what you are doing and how, and including all the logs.
Thanks, michal

Thanks, Punit

On Fri, Jul 18, 2014 at 12:37 PM, Punit Dambiwal <hypu...@gmail.com> wrote:

Hi All,
We are also struggling with the same problem... could anybody post the resolution here, or suggest a way to get rid of this "Failed to connect to server (code: 1006)" error?
Thanks, Punit

On Thu, Jul 17, 2014 at 5:20 PM, Shanil S <xielessha...@gmail.com> wrote:

Hi,
We are waiting for the updates; it would be great if anyone could give the helpful details.. :)
-- Regards, Shanil

On Thu, Jul 17, 2014 at 10:23 AM, Shanil S <xielessha...@gmail.com> wrote:

Hi,
We have enabled our portal IP address on the engine and host firewalls, but the connection still fails, so there should be no firewall issues.
-- Regards, Shanil

On Wed, Jul 16, 2014 at 3:26 PM, Shanil S <xielessha...@gmail.com> wrote:

Hi Sven,
Regarding the ticket path: is it the direct combination of host and port? Suppose the host is 1.2.3.4 and the port is 5100 -- then what should the path value be? Is there encryption needed here?

"so you have access from the browser to the websocket-proxy, network wise? can you ping the proxy? and the websocket proxy can reach the host where the vm runs?"

Yes, there should be no firewall issue, as we can access the console from the oVirt engine portal. Do we need to allow our own portal IP address on the oVirt engine and hypervisors as well?
-- Regards, Shanil

On Wed, Jul 16, 2014 at 3:13 PM, Sven Kieske <s.kie...@mittwald.de> wrote:

On 16.07.2014 11:30, Shanil S wrote:

We will get the ticket details like host, port and password from the ticket API function call, but we didn't get the path value. Will we get it from the ticket details? I couldn't find any in the ticket details.

The path is the combination of host and port. So you have access from the browser to the websocket proxy, network-wise? Can you ping the proxy? And can the websocket proxy reach the host where the VM runs? Are you sure there are no firewalls in between? Also, you should pay attention to how long your ticket is valid; you can specify the duration in minutes in your API call.

--
Sven Kieske, Systemadministrator, Mittwald CM Service GmbH & Co. KG
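For reference, a hedged sketch of the ticket call the thread keeps referring to, using the 3.x REST API's per-VM ticket action. The engine URL, VM id and credentials are placeholders; the response carries the one-time password, while host and port come from the VM's display element.

    import requests

    ENGINE = "https://engine.example.com/api"   # hypothetical engine URL
    VM_ID = "00000000-0000-0000-0000-000000000000"
    AUTH = ("admin@internal", "password")

    # Ask the engine for a one-time console ticket, valid for 120 s
    # (the validity window Sven mentions above).
    r = requests.post(
        "%s/vms/%s/ticket" % (ENGINE, VM_ID),
        auth=AUTH, verify=False,
        headers={"Content-Type": "application/xml",
                 "Accept": "application/xml"},
        data="<action><ticket><expiry>120</expiry></ticket></action>",
    )
    print(r.status_code)
    print(r.text)

    # The noVNC client then connects through the websocket proxy, not to
    # host:port directly -- so the proxy must be reachable from the
    # browser AND must itself reach the host running the VM, which are
    # exactly the two network checks raised in this thread.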
Re: [ovirt-users] 3.4.3 Network problem
On 21-7-2014 14:38, Maurice James wrote:

I just upgraded to 3.4.3, now it's complaining that em1 and em2 are down. They are not down; not sure why it thinks the interfaces are down. It's doing this for all 4 of my hosts.

Upgraded too, same problem. It seems that a downgrade of vdsm will solve this.

Joop
Re: [ovirt-users] 3.4.3 Network problem
Wow, huge bug.

From: Joop <jvdw...@xs4all.nl>
Sent: Monday, July 21, 2014 8:46:18 AM
Subject: Re: [ovirt-users] 3.4.3 Network problem

[quoted message trimmed; see above]
Re: [ovirt-users] 3.4.3 Network problem
I submitted a bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1121643

- Original Message -
From: Joop <jvdw...@xs4all.nl>
Sent: Monday, July 21, 2014 8:46:18 AM
Subject: Re: [ovirt-users] 3.4.3 Network problem

[quoted message trimmed; see above]
[ovirt-users] postponing oVirt 3.5.0 second beta
Hi,
we're going to postpone the oVirt 3.5.0 second beta since ovirt-engine currently doesn't build [1]. We have also identified a set of bugs causing automated tests to fail, so we're going to block the release until the engine builds cleanly and at least the most critical issues found have been fixed.

Please note that more than 80 patches are now in master and not backported to the 3.5 branch. Maintainers, please ensure all patches targeted to 3.5 are properly backported.

We'll probably postpone the second test day too, according to the date we're able to compose the second beta build.

[1] http://jenkins.ovirt.org/view/Stable%20branches%20per%20project/view/ovirt-engine/job/ovirt-engine_3.5_create-rpms_merged/47/
[2] http://goo.gl/pFngWU

--
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
Re: [ovirt-users] ilo4 vs. ipmilan fencing agents
- Original Message -
From: Eli Mesika <emes...@redhat.com>
To: Jason Brooks <jbro...@redhat.com>
Cc: users <users@ovirt.org>, Marek Grac <mg...@redhat.com>
Sent: Saturday, July 19, 2014 1:45:37 PM
Subject: Re: [ovirt-users] ilo4 vs. ipmilan fencing agents

- Original Message -
From: Jason Brooks <jbro...@redhat.com>
To: users <users@ovirt.org>
Sent: Thursday, July 10, 2014 1:02:13 AM
Subject: [ovirt-users] ilo4 vs. ipmilan fencing agents

Hi all -- I'm trying to get fencing squared away in my cluster of HP DL380 servers, which come with iLO4. I was able to get a successful status check from the command line with fence_ilo4, but not with the ilo4 option in oVirt. I see, though, that ilo4 in oVirt just maps to fence_ipmilan, and I was not able to get a successful status check with fence_ipmilan from the CLI.

So I tried resetting the mapping so that ilo4 maps to ilo4. Now I can complete the power management test in oVirt, but I imagine there's some reason why oVirt isn't configured this way by default. Will fencing actually work for me with ilo4 mapped to ilo4, rather than to ipmilan?

ILO3 and ILO4 are mapped implicitly to ipmilan with the lanplus flag ON and power_wait=4.

On my installation, ilo4 with no options fails the test; ilo4 with lanplus=on in the options field succeeds. Is it possible that the lanplus=on option isn't being registered/applied properly?

Jason

If you change the mapping to use the native scripts it's OK, as long as it works for you. Adding Marek G to the thread.

Marek, should we always map ILO3 & ILO4 to the native scripts (fence_ilo3, fence_ilo4) and not to ipmilan?

Thanks, Jason
---
Jason Brooks
Red Hat Open Source and Standards
@jasonbrooks | @redhatopen
http://community.redhat.com
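A sketch of the CLI comparison Jason describes, running both agents against the same BMC so the lanplus behaviour can be isolated. The address and credentials are placeholders, and the short-option spellings are those of the EL6 fence-agents build -- verify with `fence_ipmilan -h` if yours differ.

    import subprocess

    # Hypothetical BMC address and credentials.
    ILO_ADDR = "10.0.0.50"
    USER, PASSWORD = "fenceuser", "fencepass"

    # First entry is what oVirt's "ilo4" type effectively runs
    # (fence_ipmilan with lanplus); second is the native agent, where
    # lanplus is implied. If the first fails and the second succeeds,
    # the lanplus option is likely not being applied.
    for agent, extra in [
        ("fence_ipmilan", ["-P"]),   # -P = lanplus
        ("fence_ilo4", []),
    ]:
        cmd = [agent, "-a", ILO_ADDR, "-l", USER, "-p", PASSWORD,
               "-o", "status"] + extra
        print("+ " + " ".join(cmd))
        print("exit:", subprocess.call(cmd))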
Re: [ovirt-users] Logical network error
On Mon, Jul 21, 2014 at 07:35:00AM -0400, Maurice James wrote:

Here are the supervdsm logs

Hmm, it's a trove of errors and tracebacks; there's a lot to debug. For example, here Vdsm was asked to create a 'migration' network on top of bond0, which was already used by ovirtmgmt. Engine should have blocked that. Moti?

MainProcess|Thread-47826::DEBUG::2014-04-11 10:00:26,335::configNetwork::589::setupNetworks::(setupNetworks) Setting up network according to configuration: networks:{'migration': {'bonding': 'bond0', 'bridged': 'false'}}, bondings:{}, options:{'connectivityCheck': 'true', 'connectivityTimeout': 120}
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 202, in setupNetworks
    return configNetwork.setupNetworks(networks, bondings, **options)
  File "/usr/share/vdsm/configNetwork.py", line 648, in setupNetworks
    implicitBonding=True, **d)
  File "/usr/share/vdsm/configNetwork.py", line 186, in wrapped
    return func(*args, **kwargs)
  File "/usr/share/vdsm/configNetwork.py", line 256, in addNetwork
    bridged)
  File "/usr/share/vdsm/configNetwork.py", line 170, in _validateInterNetworkCompatibility
    _validateNoDirectNet(ifaces_bridgeless)
  File "/usr/share/vdsm/configNetwork.py", line 154, in _validateNoDirectNet
    (iface, iface_net))
ConfigNetworkError: (21, "interface 'bond0' already member of network 'ovirtmgmt'")

There are also repeated failed attempts to create a payload disk: did you notice when they happen? Could it be related to insufficient disk space?

MainProcess|clientIFinit::ERROR::2014-03-25 22:13:02,529::supervdsmServer::100::SuperVdsm.ServerCallback::(wrapper) Error in mkFloppyFs
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 325, in mkFloppyFs
    return mkimage.mkFloppyFs(vmId, files, volId)
  File "/usr/share/vdsm/mkimage.py", line 104, in mkFloppyFs
    "code %s, out %s\nerr %s" % (rc, out, err))
OSError: [Errno 5] could not create floppy file: code 1, out mkfs.msdos 3.0.9 (31 Jan 2010) err mkfs.msdos: unable to create /var/run/vdsm/payload/94632c2e-28a0-4304-9261-c302e0027604.ecac527e731a2a0057dc6a3ae3df9ba3.img

In any case, if you manage to reproduce these issues, please open a detailed bug entry.

Regards,
Dan.
[ovirt-users] RHEV 3.4 trial hosted-engine either host wants to take ownership
I added a hook to rhevm, and then restarted the engine service, which triggered a hosted-engine VM shutdown (likely because of the failed liveliness check). Once the hosted-engine VM shut down, it did not restart on the other host.

On both hosts configured for hosted-engine, I'm seeing logs from ha-agent where each host thinks the other host has a better score. Is there supposed to be a mechanism for a tie breaker here? I do notice that the log mentions best REMOTE host, so perhaps I'm interpreting this message incorrectly.

ha-agent logs:

Host 001:
MainThread::INFO::2014-07-21 11:51:57,396::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1405957917.4 type=state_transition detail=EngineDown-EngineDown hostname='rhev001.miovision.corp'
MainThread::INFO::2014-07-21 11:51:57,397::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
MainThread::INFO::2014-07-21 11:51:57,924::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-21 11:51:57,924::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host rhev002.miovision.corp (id: 2, score: 2400)
MainThread::INFO::2014-07-21 11:52:07,961::states::454::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down, local host does not have best score
MainThread::INFO::2014-07-21 11:52:07,975::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1405957927.98 type=state_transition detail=EngineDown-EngineDown hostname='rhev001.miovision.corp'

Host 002:
MainThread::INFO::2014-07-21 11:51:47,405::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1405957907.41 type=state_transition detail=EngineDown-EngineDown hostname='rhev002.miovision.corp'
MainThread::INFO::2014-07-21 11:51:47,406::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
MainThread::INFO::2014-07-21 11:51:47,834::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-21 11:51:47,835::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host rhev001.miovision.corp (id: 1, score: 2400)
MainThread::INFO::2014-07-21 11:51:57,870::states::454::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down, local host does not have best score
MainThread::INFO::2014-07-21 11:51:57,883::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1405957917.88 type=state_transition detail=EngineDown-EngineDown hostname='rhev002.miovision.corp'

This went on for 20 minutes about an hour ago, and I decided to --vm-start on one of the hosts. The manager VM ran for a few minutes with the engine UI accessible, before shutting itself down again. I then put host 002 into local maintenance mode, and host 001 auto-started the hosted-engine VM.

The logging still references host 002 as the 'best remote host' even though the calculated score is now 0:

MainThread::INFO::2014-07-21 12:03:24,011::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1405958604.01 type=state_transition detail=EngineUp-EngineUp hostname='rhev001.miovision.corp'
MainThread::INFO::2014-07-21 12:03:24,013::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineUp-EngineUp) sent? ignored
MainThread::INFO::2014-07-21 12:03:24,515::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
MainThread::INFO::2014-07-21 12:03:24,516::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host rhev002.miovision.corp (id: 2, score: 0)
MainThread::INFO::2014-07-21 12:03:34,567::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1405958614.57 type=state_transition detail=EngineUp-EngineUp hostname='rhev001.miovision.corp'

Once the hosted-engine VM had been up for about 5 minutes, I took host 002 out of local maintenance mode, and the VM has not shut down since. Is this expected behaviour? Is this the normal recovery process when two hosts, both hosting hosted-engine, are started at the same time? I would have expected that once the hosted-engine VM was detected as bad (the liveliness check, from when I restarted the engine service) and the VM was shut down, it would spin back up on the next available host.

Thanks,
Steve
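For illustration only -- not the actual ovirt-ha-agent code -- here is the kind of deterministic tie-breaker the logs suggest is missing in this version: compare scores first and fall back to host id, so two hosts with equal scores cannot both defer to each other. Names and values simply mirror the logs above.

    import collections

    Host = collections.namedtuple("Host", "host_id name score")

    def best_host(hosts):
        # Pick a winner deterministically: highest score wins, and on a
        # tie the lowest host id wins.
        return max(hosts, key=lambda h: (h.score, -h.host_id))

    local = Host(1, "rhev001.miovision.corp", 2400)
    remote = Host(2, "rhev002.miovision.corp", 2400)

    print("engine should start on:", best_host([local, remote]).name)
    # With equal scores, both agents would agree on host 1 instead of
    # each waiting for the other, as in the EngineDown loop above.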
[ovirt-users] ovirt-release.rpm 3.4 dead link
Hello:

The link to ovirt-release.rpm 3.4 is dead:
http://www.ovirt.org/OVirt_3.4_Release_Notes#Install_.2F_Upgrade_from_Previous_Versions

Where is the ovirt-release.rpm?

Thanks
Federico
Re: [ovirt-users] 3.4.3 Network problem
On Mon, Jul 21, 2014 at 06:05:45PM +0100, Dan Kenigsberg wrote:

On Mon, Jul 21, 2014 at 09:03:58AM -0400, Maurice James wrote:

I submitted a bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1121643

[quoted message trimmed; see above]

It is a horrible bug, due to my http://gerrit.ovirt.org/29689; I'll try to send a quick fix asap.

Please help me verify that the removal of two lines, http://gerrit.ovirt.org/#/c/30547/, fixes the issue.

Regards,
Dan.
Re: [ovirt-users] ovirt-release.rpm 3.4 dead link
- Original Message -
From: Federico Alberto Sayd <fs...@uncu.edu.ar>
To: users <users@ovirt.org>
Sent: Monday, July 21, 2014 7:54:48 PM
Subject: [ovirt-users] ovirt-release.rpm 3.4 dead link

[quoted message trimmed; see above]

The 3.4 RPM is available at http://resources.ovirt.org/pub/yum-repo/ovirt-release34.rpm (I've updated the stale link in the wiki.)
Re: [ovirt-users] 3.4.3 Network problem
Where can I get the RPMs?

- Original Message -
From: Dan Kenigsberg <dan...@redhat.com>
Sent: Monday, July 21, 2014 1:34:39 PM
Subject: Re: [ovirt-users] 3.4.3 Network problem

[quoted message trimmed; see above]
[ovirt-users] hostusb hook - VM device errors in Windows VM
I'm using the hostusb hook on the RHEV 3.4 trial. The USB device is passed through to the VM, but I'm getting errors in a Windows VM when the device driver is loaded.

I started with a simple USB drive; on the host it is listed as:

Bus 002 Device 010: ID 05dc:c75c Lexar Media, Inc.

which I added as 0x05dc:0xc75c to the Windows 7 x64 VM. In Windows I get an error in Device Manager:

USB Mass Storage Device
This device cannot start. (Code 10)

Properties/General tab: Device type: Universal Serial Bus controllers; Manufacturer: Compatible USB storage device; Location: Port_#0001.Hub_#0001

Under Hardware Ids:
USB\VID_05DC&PID_C75C&REV_0102
USB\VID_05DC&PID_C75C

So it looks like the proper USB device ID is passed to the VM. I don't see any error messages in Event Viewer, and I don't see anything in the VDSM logs either. Any help is appreciated.

Steve
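A hedged sketch of the libvirt <hostdev> element the hostusb hook ends up injecting for a vendor:product pair; comparing its output against `virsh -r dumpxml <vm>` on the host can confirm the device really was attached with the right ids. The helper name is illustrative, not the hook's.

    import xml.etree.ElementTree as ET

    def hostusb_device_xml(vendor, product):
        # Build the libvirt <hostdev> element for a USB vendor:product
        # pair -- the shape the hostusb hook adds to the domain XML.
        hostdev = ET.Element("hostdev", mode="subsystem", type="usb")
        source = ET.SubElement(hostdev, "source")
        ET.SubElement(source, "vendor", id=vendor)
        ET.SubElement(source, "product", id=product)
        return ET.tostring(hostdev)

    # The Lexar stick from the message:
    print(hostusb_device_xml("0x05dc", "0xc75c"))
    # Compare with `virsh -r dumpxml <vm>` on the host to verify the
    # device was attached as expected.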
Re: [ovirt-users] hostusb hook - VM device errors in Windows VM
I should mention I can mount this USB drive in a CentOS 6.5 VM without any problems.

On Mon, Jul 21, 2014 at 2:11 PM, Steve Dainard <sdain...@miovision.com> wrote:

[quoted message trimmed; see above]
Re: [ovirt-users] 3.4.3 Network problem
On 21-7-2014 19:34, Dan Kenigsberg wrote:

[quoted message trimmed; see above]

Please help me verify that the removal of two lines, http://gerrit.ovirt.org/#/c/30547/, fixes the issue.

I commented out the indicated 2 lines and could activate my host, and it stayed activated (1h), while before this patch it would turn unresponsive quite quickly (minutes).

Joop
Re: [ovirt-users] 3.4.3 Network problem
On 21-7-2014 20:11, Maurice James wrote:

Where can I get the RPMs?

No RPMs yet, but it's a 2-line edit in /usr/share/vdsm/sampling.py. Search for 'vlan' and it should find it in a three-way if/elif construct. Just comment out the elif line and the vlan line and save, then restart vdsm and things should work again. I expect/hope new RPMs tomorrow late or the day after.

Joop
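For orientation only, a hypothetical reconstruction of the shape of that edit -- the names are illustrative, not vdsm's literal source; the authoritative change is http://gerrit.ovirt.org/#/c/30547/. The workaround comments out the vlan arm of the three-way branch so the lookup falls through to the path that behaved correctly in 3.4.2.

    # Hypothetical reconstruction of the branch in
    # /usr/share/vdsm/sampling.py -- names are illustrative.
    def _device_exists(name, nics, vlans, bondings):
        if name in nics:
            return True
        # elif name in vlans:     # <-- the two lines Joop comments out
        #     return True
        elif name in bondings:
            return True
        return False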
Re: [ovirt-users] Online backup options
On 06/20/2014 04:35 PM, Steve Dainard wrote:

Hello oVirt team,

Reading this bulletin, https://access.redhat.com/site/solutions/117763, there is a reference to 'private Red Hat Bug # 523354' covering online backups of VMs. Can someone comment on this feature, and a rough timeline? Is this a native backup solution that will be included with oVirt/RHEV?

Is this the oVirt feature where the work is being done? http://www.ovirt.org/Features/Backup-Restore_API_Integration It seems like this may be a different feature, specifically for 3rd-party backup options.

yes, that's the current approach - to allow backup solutions to work with ovirt for backups while we focus on core issues.
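The Backup-Restore API flow referenced above starts by snapshotting the VM so that a backup appliance can attach and read a consistent disk image. A hedged sketch of that first call against the 3.x REST API; the engine URL, VM id and credentials are placeholders, and the attach/detach/delete steps that complete the flow are omitted.

    import requests

    ENGINE = "https://engine.example.com/api"   # hypothetical engine URL
    VM_ID = "00000000-0000-0000-0000-000000000000"
    AUTH = ("admin@internal", "password")

    # Step 1 of the backup flow: create a snapshot of the running VM.
    r = requests.post(
        "%s/vms/%s/snapshots" % (ENGINE, VM_ID),
        auth=AUTH, verify=False,
        headers={"Content-Type": "application/xml"},
        data="<snapshot><description>backup</description></snapshot>",
    )
    print(r.status_code)
    # A backup VM would then attach the snapshot's disk, read it, detach
    # it, and finally delete the snapshot, per the feature page above.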
Re: [ovirt-users] Deploying hosted engine on second host with different CPU model
On 07/18/2014 02:05 AM, Andrew Lau wrote:

I think you should be able to specify this within the ovirt-engine; just modify the cluster's CPU compatibility. I hit this too, but I think I just ended up provisioning the older machine first, then the newer ones joined with the older model.

1. First, the host needs to have a compatible CPU model. What does 'vdsClient -s 0 getVdsCaps | grep -i flag' return?
2. Cluster CPU level is easy, but the hosted-engine VM config resides on the disk, and needs to be manually edited in this case, iirc.

On Thu, Jul 17, 2014 at 11:05 PM, George Machitidze <gmachiti...@greennet.ge> wrote:

Hello,

I am deploying hosted engine (HA) on hosts with different CPU models on one of my oVirt labs. The hosts have different CPUs, and there is also the problem that the virtualization platform cannot detect the CPU at all; "The following CPU types are supported by this host:" is empty:

2014-07-17 16:51:42 DEBUG otopi.plugins.ovirt_hosted_engine_setup.vdsmd.cpu cpu._customization:124 Compatible CPU models are: []

Is there any way to override this setting and use the CPU of the old machine for both hosts? e.g.

host1:
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
stepping : 11

host2:
cpu family : 6
model : 42
model name : Intel(R) Xeon(R) CPU E31220 @ 3.10GHz
stepping : 7

[root@ovirt2 ~]# hosted-engine --deploy
[ INFO ] Stage: Initializing
Continuing will configure this host for serving as hypervisor and create a VM where you have to install oVirt Engine afterwards.
Are you sure you want to continue? (Yes, No)[Yes]:
[ INFO ] Generating a temporary VNC password.
[ INFO ] Stage: Environment setup
Configuration files: []
Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20140717165111-7tg2g7.log
Version: otopi-1.2.1 (otopi-1.2.1-1.el6)
[ INFO ] Hardware supports virtualization
[ INFO ] Stage: Environment packages setup
[ INFO ] Stage: Programs detection
[ INFO ] Stage: Environment setup
[ INFO ] Stage: Environment customization

--== STORAGE CONFIGURATION ==--

During customization use CTRL-D to abort.
Please specify the storage you would like to use (nfs3, nfs4)[nfs3]:
Please specify the full shared storage connection path to use (example: host:/path): ovirt-hosted:/engine
The specified storage location already contains a data domain. Is this an additional host setup (Yes, No)[Yes]?
[ INFO ] Installing on additional host
Please specify the Host ID [Must be integer, default: 2]:

--== SYSTEM CONFIGURATION ==--

[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.
The answer file may be fetched from the first host using scp. If you do not want to download it automatically, you can abort the setup by answering no to the following question.
Do you want to scp the answer file from the first host? (Yes, No)[Yes]:
Please provide the FQDN or IP of the first host: ovirt1.test.ge
Enter 'root' user password for host ovirt1.test.ge:
[ INFO ] Answer file successfully downloaded

--== NETWORK CONFIGURATION ==--

The following CPU types are supported by this host:
[ ERROR ] Failed to execute stage 'Environment customization': Invalid CPU type specified: model_Conroe
[ INFO ] Stage: Clean up
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination

--
BR
George Machitidze
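A quick way to act on Andrew's point 1 and compare the two hosts side by side: run the capability query on each host and extract the model_* flags vdsm reports. The parsing is a best-effort assumption about the output format; the command itself is the one quoted above. Hosted-engine's CPU type must be a model both hosts list, i.e. the oldest common model (model_Conroe for the Xeon 5160 here), and an empty list is what produces the "Invalid CPU type" failure in the transcript.

    import re
    import subprocess

    # The capability query suggested above; vdsm lists the CPU models the
    # host can emulate as model_* flags (e.g. model_Conroe).
    out = subprocess.check_output(
        "vdsClient -s 0 getVdsCaps | grep -i flag", shell=True)

    models = sorted(set(re.findall(r"model_\w+", str(out))))
    print("\n".join(models))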
Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?
On 07/21/2014 05:09 AM, Pranith Kumar Karampuri wrote:

[earlier thread history trimmed; see the messages above]