Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?

2014-07-21 Thread Pranith Kumar Karampuri


On 07/18/2014 05:43 PM, Andrew Lau wrote:


On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur vbel...@redhat.com 
mailto:vbel...@redhat.com wrote:


[Adding gluster-devel]


On 07/18/2014 05:20 PM, Andrew Lau wrote:

Hi all,

As most of you have gathered from hints in previous messages, hosted
engine
won't work on gluster. A quote from BZ1097639:

"Using hosted engine with Gluster backed storage is currently
something
we really warn against."


I think this bug should be closed or re-targeted at
documentation, because there is nothing we can do here. Hosted
engine assumes that all writes are atomic and (immediately)
available for all hosts in the cluster. Gluster violates those
assumptions.


I tried going through BZ1097639 but could not find much detail
with respect to gluster there.

A few questions around the problem:

1. Can somebody please explain in detail the scenario that causes
the problem?

2. Is hosted engine performing synchronous writes to ensure that
writes are durable?

Also, if there is any documentation that details the hosted engine
architecture that would help in enhancing our understanding of its
interactions with gluster.
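
A minimal sketch of what a synchronous, durable write looks like at the
file level, just to make question 2 concrete. This is not hosted-engine
code; the path and payload are invented for illustration:

    import os

    def durable_write(path, data):
        # O_DSYNC (falling back to O_SYNC) makes write() return only after
        # the data has reached stable storage; fsync() flushes anything
        # still buffered before we close.
        sync_flag = getattr(os, "O_DSYNC", os.O_SYNC)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | sync_flag)
        try:
            os.write(fd, data)
            os.fsync(fd)
        finally:
            os.close(fd)

    durable_write("/tmp/example-metadata", b"host-id=1 score=2400\n")

Without such flags or an explicit fsync(), data can sit in the client's
page cache, which is the durability concern behind the question.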




Now my question: does this theory rule out a scenario where a gluster
replicated volume is mounted as a glusterfs filesystem and then
re-exported as the native kernel NFS share for the hosted-engine to
consume? It could then be possible to chuck ctdb in there to provide a
last-resort failover solution. I have tried it myself and suggested it to
two people who are running a similar setup; they are now using the native
kernel NFS server for hosted-engine and haven't reported as many issues.
Curious, could anyone validate my theory on this?


If we obtain more details on the use case and obtain gluster logs
from the failed scenarios, we should be able to understand the
problem better. That could be the first step in validating your
theory or evolving further recommendations :).


I'm not sure how useful this is, but Jiri Moskovcak tracked this down 
in an off-list message.


Message Quote:

==

We were able to track it down to this (thanks Andrew for providing the 
testing setup):


-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 165, in handle

  response = success  + self._dispatch(data)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 261, in _dispatch

  .get_all_stats_for_service_type(**options)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py, 
line 41, in get_all_stats_for_service_type

  d = self.get_raw_stats_for_service_type(storage_dir, service_type)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py, 
line 74, in get_raw_stats_for_service_type

  f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle: 
'/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

Andrew/Jiri,
Would it be possible to post gluster logs of both the mount and the 
bricks on the BZ? I can take a look at them. If I gather nothing, then I 
will probably ask for your help in re-creating the issue.


Pranith



It's definitely connected to the storage, which leads us to gluster. 
I'm not very familiar with gluster, so I need to check 
this with our gluster gurus.


==

Thanks,
Vijay




___
Gluster-devel mailing list
gluster-de...@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [ovirt-devel] virt-v2v integration feature

2014-07-21 Thread Maurice James
Lots of virtual appliances come in OVF format. There isn't a way to import them 
into an oVirt/RHEV instance without a live ESX instance. This is very 
inconvenient and inefficient.

-Original Message-
From: Sven Kieske [mailto:s.kie...@mittwald.de] 
Sent: Thursday, July 10, 2014 5:01 AM
To: Itamar Heim; de...@ovirt.org
Cc: Users@ovirt.org List; Richard W.M. Jones; midnightst...@msn.com; 
blas...@556nato.com; bugzi...@grendelman.com; f...@moov.de; R P Herrold; 
jsp...@bandwith.com
Subject: Re: [ovirt-users] [ovirt-devel] virt-v2v integration feature



Am 10.07.2014 09:41, schrieb Itamar Heim:
 On 07/10/2014 10:29 AM, Sven Kieske wrote:


 Am 09.07.2014 20:30, schrieb Arik Hadas:
 Hi All,

 The proposed feature will introduce a new process of import virtual 
 machines from external systems using virt-v2v in oVirt.
 I've created a wiki page that contains initial thoughts and design 
 for it:
 http://www.ovirt.org/Features/virt-v2v_Integration

 You are more than welcome to share your thoughts and insights.

 Thanks,
 Arik

 Am I right that this still involves a fully operational host (e.g. ESXi) 
 to import VMware VMs?

 There is huge user demand for a simpler process for just converting 
 and importing a VMware disk image. This feature will not solve that 
 use case, will it?



 I agree it should. need to check if virt-v2v can cover this. if not, 
 need to fix it so it will...




Well here are the relevant BZ entries:

https://bugzilla.redhat.com/show_bug.cgi?id=1062910
https://bugzilla.redhat.com/show_bug.cgi?id=1049604

CC'ing the users from these Bugzilla entries; maybe they can add something to 
help gain some traction :)

--
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?

2014-07-21 Thread Pranith Kumar Karampuri


On 07/19/2014 11:25 AM, Andrew Lau wrote:



On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri 
pkara...@redhat.com mailto:pkara...@redhat.com wrote:



On 07/18/2014 05:43 PM, Andrew Lau wrote:


On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur
vbel...@redhat.com mailto:vbel...@redhat.com wrote:

[Adding gluster-devel]


On 07/18/2014 05:20 PM, Andrew Lau wrote:

Hi all,

As most of you have got hints from previous messages,
hosted engine
won't work on gluster . A quote from BZ1097639

Using hosted engine with Gluster backed storage is
currently something
we really warn against.


I think this bug should be closed or re-targeted at
documentation, because there is nothing we can do here.
Hosted engine assumes that all writes are atomic and
(immediately) available for all hosts in the cluster.
Gluster violates those assumptions.

I tried going through BZ1097639 but could not find much
detail with respect to gluster there.

A few questions around the problem:

1. Can somebody please explain in detail the scenario that
causes the problem?

2. Is hosted engine performing synchronous writes to ensure
that writes are durable?

Also, if there is any documentation that details the hosted
engine architecture that would help in enhancing our
understanding of its interactions with gluster.



Now my question, does this theory prevent a scenario of
perhaps
something like a gluster replicated volume being mounted
as a glusterfs
filesystem and then re-exported as the native kernel NFS
share for the
hosted-engine to consume? It could then be possible to
chuck ctdb in
there to provide a last resort failover solution. I have
tried myself
and suggested it to two people who are running a similar
setup. Now
using the native kernel NFS server for hosted-engine and
they haven't
reported as many issues. Curious, could anyone validate
my theory on this?


If we obtain more details on the use case and obtain gluster
logs from the failed scenarios, we should be able to
understand the problem better. That could be the first step
in validating your theory or evolving further recommendations :).


​ I'm not sure how useful this is, but ​Jiri Moskovcak tracked
this down in an off list message.

Message Quote:

==

​We were able to track it down to this (thanks Andrew for
providing the testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
File

/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py,
line 165, in handle
  response = success  + self._dispatch(data)
File

/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py,
line 261, in _dispatch
  .get_all_stats_for_service_type(**options)
File

/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
line 41, in get_all_stats_for_service_type
  d = self.get_raw_stats_for_service_type(storage_dir, service_type)
File

/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
line 74, in get_raw_stats_for_service_type
  f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle:

'/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

Andrew/Jiri,
Would it be possible to post gluster logs of both the
mount and bricks on the bz? I can take a look at it once. If I
gather nothing then probably I will ask for your help in
re-creating the issue.

Pranith


​Unfortunately, I don't have the logs for that setup any more.. ​I'll 
try to replicate it when I get a chance. If I understand the comment from 
the BZ, I don't think it's a gluster bug per se, more just how 
gluster does its replication.

hi Andrew,
 Thanks for that. I couldn't come to any conclusions because no 
logs were available. It is unlikely that self-heal is involved because 
there were no bricks going down/up according to the bug description.


Pranith





It's definitely connected to the storage which leads us to the
gluster, I'm not very familiar with the gluster so I need to
check this with our gluster gurus.​

==

Thanks,
Vijay




___
Gluster-devel mailing list
gluster-de...@gluster.org  mailto:gluster-de...@gluster.org

[ovirt-users] unable to mount iso storage

2014-07-21 Thread Gene Fontanilla
Hi,

When I added the 2nd host, I keep getting:



The error message for connection OvirtFE:/mnt/iso_domain returned by VDSM 
was: Problem while trying to mount target

and I am unable to access images when creating VMs.




___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] unable to mount iso storage

2014-07-21 Thread Joop
On 21-7-2014 9:29, Gene Fontanilla wrote:
 Hi,

 when i added the 2nd host, i keep getting



 The error message for connection OvirtFE:/mnt/iso_domain returned by VDSM 
 was: Problem while trying to mount target

 and i am unable to access images when creating vms.

Check if you can mount the NFS share from that second server. My guess is
you can't. Check your firewall(s).
You can find the exact mount command in vdsm.log.

Joop

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Setup of hosted Engine Fails

2014-07-21 Thread Jiri Moskovcak

Hi Andrew,
thanks for debugging this, please create a bug against vdsm to make sure 
it gets proper attention.


Thanks,
Jirka

On 07/19/2014 12:36 PM, Andrew Lau wrote:

Quick update, it seems to be related to the latest vdsm package,

service vdsmd start
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running gencerts
vdsm: Running check_is_configured
libvirt is not configured for vdsm yet
Modules libvirt are not configured
  Traceback (most recent call last):
   File /usr/bin/vdsm-tool, line 145, in module
 sys.exit(main())
   File /usr/bin/vdsm-tool, line 142, in main
 return tool_command[cmd][command](*args[1:])
   File /usr/lib64/python2.6/site-packages/vdsm/tool/configurator.py,
line 282, in isconfigured
 raise RuntimeError(msg)
RuntimeError:

One of the modules is not configured to work with VDSM.
To configure the module use the following:
'vdsm-tool configure [module_name]'.

If all modules are not configured try to use:
'vdsm-tool configure --force'
(The force flag will stop the module's service and start it
afterwards automatically to load the new configuration.)

vdsm: stopped during execute check_is_configured task (task returned
with error code 1).
vdsm start [FAILED]

yum downgrade vdsm*

​Here's the package changes for reference,

-- Running transaction check
--- Package vdsm.x86_64 0:4.14.9-0.el6 will be a downgrade
--- Package vdsm.x86_64 0:4.14.11-0.el6 will be erased
--- Package vdsm-cli.noarch 0:4.14.9-0.el6 will be a downgrade
--- Package vdsm-cli.noarch 0:4.14.11-0.el6 will be erased
--- Package vdsm-python.x86_64 0:4.14.9-0.el6 will be a downgrade
--- Package vdsm-python.x86_64 0:4.14.11-0.el6 will be erased
--- Package vdsm-python-zombiereaper.noarch 0:4.14.9-0.el6 will be a
downgrade
--- Package vdsm-python-zombiereaper.noarch 0:4.14.11-0.el6 will be erased
--- Package vdsm-xmlrpc.noarch 0:4.14.9-0.el6 will be a downgrade
--- Package vdsm-xmlrpc.noarch 0:4.14.11-0.el6 will be erased

service vdsmd start
initctl: Job is already running: libvirtd
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running gencerts
vdsm: Running check_is_configured
libvirt is already configured for vdsm
sanlock service is already configured
vdsm: Running validate_configuration
SUCCESS: ssl configured to true. No conflicts
vdsm: Running prepare_transient_repository
vdsm: Running syslog_available
vdsm: Running nwfilter
vdsm: Running dummybr
vdsm: Running load_needed_modules
vdsm: Running tune_system
vdsm: Running test_space
vdsm: Running test_lo
vdsm: Running unified_network_persistence_upgrade
vdsm: Running restore_nets
vdsm: Running upgrade_300_nets
Starting up vdsm daemon:
vdsm start [  OK  ]
[root@ov-hv1-2a-08-23 ~]# service vdsmd status
VDS daemon server is running


On Sat, Jul 19, 2014 at 6:58 PM, Andrew Lau and...@andrewklau.com
mailto:and...@andrewklau.com wrote:

It seems vdsm is not running,

service vdsmd status
VDS daemon is not running, and its watchdog is running

The only logs in /var/log/vdsm/ that appear to have any content is
/var/log/vdsm/supervdsm.log - everything else is blank

MainThread::DEBUG::2014-07-19
18:55:34,793::supervdsmServer::424::SuperVdsm.Server::(main)
Terminated normally
MainThread::DEBUG::2014-07-19
18:55:38,033::netconfpersistence::134::root::(_getConfigs)
Non-existing config set.
MainThread::DEBUG::2014-07-19
18:55:38,034::netconfpersistence::134::root::(_getConfigs)
Non-existing config set.
MainThread::DEBUG::2014-07-19
18:55:38,058::supervdsmServer::384::SuperVdsm.Server::(main) Making
sure I'm root - SuperVdsm
MainThread::DEBUG::2014-07-19
18:55:38,059::supervdsmServer::393::SuperVdsm.Server::(main) Parsing
cmd args
MainThread::DEBUG::2014-07-19
18:55:38,059::supervdsmServer::396::SuperVdsm.Server::(main)
Cleaning old socket /var/run/vdsm/svdsm.sock
MainThread::DEBUG::2014-07-19
18:55:38,059::supervdsmServer::400::SuperVdsm.Server::(main) Setting
up keep alive thread
MainThread::DEBUG::2014-07-19
18:55:38,059::supervdsmServer::406::SuperVdsm.Server::(main)
Creating remote object manager
MainThread::DEBUG::2014-07-19
18:55:38,061::supervdsmServer::417::SuperVdsm.Server::(main) Started
serving super vdsm object
sourceRoute::DEBUG::2014-07-19
18:55:38,062::sourceRouteThread::56::root::(_subscribeToInotifyLoop)
sourceRouteThread.subscribeToInotifyLoop started


On Sat, Jul 19, 2014 at 6:48 PM, Andrew Lau and...@andrewklau.com
mailto:and...@andrewklau.com wrote:

Here's a snippet from my hosted-engine-setup log

2014-07-19 18:45:14 DEBUG otopi.context
context._executeMethod:138 Stage late_setup METHOD


Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?

2014-07-21 Thread Jiri Moskovcak

On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote:


On 07/19/2014 11:25 AM, Andrew Lau wrote:



On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri
pkara...@redhat.com mailto:pkara...@redhat.com wrote:


On 07/18/2014 05:43 PM, Andrew Lau wrote:

​ ​

On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur
vbel...@redhat.com mailto:vbel...@redhat.com wrote:

[Adding gluster-devel]


On 07/18/2014 05:20 PM, Andrew Lau wrote:

Hi all,

As most of you have got hints from previous messages,
hosted engine
won't work on gluster . A quote from BZ1097639

Using hosted engine with Gluster backed storage is
currently something
we really warn against.


I think this bug should be closed or re-targeted at
documentation, because there is nothing we can do here.
Hosted engine assumes that all writes are atomic and
(immediately) available for all hosts in the cluster.
Gluster violates those assumptions.
​

I tried going through BZ1097639 but could not find much
detail with respect to gluster there.

A few questions around the problem:

1. Can somebody please explain in detail the scenario that
causes the problem?

2. Is hosted engine performing synchronous writes to ensure
that writes are durable?

Also, if there is any documentation that details the hosted
engine architecture that would help in enhancing our
understanding of its interactions with gluster.


​

Now my question, does this theory prevent a scenario of
perhaps
something like a gluster replicated volume being mounted
as a glusterfs
filesystem and then re-exported as the native kernel NFS
share for the
hosted-engine to consume? It could then be possible to
chuck ctdb in
there to provide a last resort failover solution. I have
tried myself
and suggested it to two people who are running a similar
setup. Now
using the native kernel NFS server for hosted-engine and
they haven't
reported as many issues. Curious, could anyone validate
my theory on this?


If we obtain more details on the use case and obtain gluster
logs from the failed scenarios, we should be able to
understand the problem better. That could be the first step
in validating your theory or evolving further recommendations :).


​ I'm not sure how useful this is, but ​Jiri Moskovcak tracked
this down in an off list message.

​ Message Quote:​

​ ==​

​We were able to track it down to this (thanks Andrew for
providing the testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
File

/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py,
line 165, in handle
  response = success  + self._dispatch(data)
File

/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py,
line 261, in _dispatch
  .get_all_stats_for_service_type(**options)
File

/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
line 41, in get_all_stats_for_service_type
  d = self.get_raw_stats_for_service_type(storage_dir, service_type)
File

/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
line 74, in get_raw_stats_for_service_type
  f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle:

'/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

Andrew/Jiri,
Would it be possible to post gluster logs of both the
mount and bricks on the bz? I can take a look at it once. If I
gather nothing then probably I will ask for your help in
re-creating the issue.

Pranith


​Unfortunately, I don't have the logs for that setup any more.. ​I'll
try replicate when I get a chance. If I understand the comment from
the BZ, I don't think it's a gluster bug per-say, more just how
gluster does its replication.

hi Andrew,
  Thanks for that. I couldn't come to any conclusions because no
logs were available. It is unlikely that self-heal is involved because
there were no bricks going down/up according to the bug description.



Hi,
I've never had such setup, I guessed problem with gluster based on 
OSError: [Errno 116] Stale file handle: which happens when the file 
opened by application on client gets removed on the server. I'm pretty 
sure we (hosted-engine) don't remove that file, so I think it's some 
gluster magic moving the data around...


--Jirka
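
A minimal sketch of where that ESTALE comes from, based on the os.open()
call shown in the traceback above. This is illustrative only, not the
actual broker code, and the path handling is a placeholder:

    import errno
    import os

    def open_metadata(path):
        # O_DIRECT bypasses the client page cache; not every platform
        # defines it, hence the getattr() fallback.
        direct_flag = getattr(os, "O_DIRECT", 0)
        try:
            return os.open(path, direct_flag | os.O_RDONLY)
        except OSError as e:
            if e.errno == errno.ESTALE:
                # Errno 116: the client still holds a handle for an inode
                # that no longer exists on the server, typically because
                # the file was replaced (unlink/rename plus recreate)
                # rather than rewritten in place.
                raise RuntimeError("metadata file replaced under us: %s" % path)
            raise

That matches the guess above: the interesting part is not the open()
itself but whatever on the storage side replaces the inode behind
hosted-engine.metadata.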


Pranith





 

Re: [ovirt-users] Setup of hosted Engine Fails

2014-07-21 Thread Andrew Lau
Done, https://bugzilla.redhat.com/show_bug.cgi?id=1121561

On Mon, Jul 21, 2014 at 6:32 PM, Jiri Moskovcak jmosk...@redhat.com wrote:

 Hi Andrew,
 thanks for debugging this, please create a bug against vdsm to make sure
 it gets proper attention.

 Thanks,
 Jirka


 On 07/19/2014 12:36 PM, Andrew Lau wrote:

 Quick update, it seems to be related to the latest vdsm package,

 service vdsmd start
 vdsm: Running mkdirs
 vdsm: Running configure_coredump
 vdsm: Running configure_vdsm_logs
 vdsm: Running run_init_hooks
 vdsm: Running gencerts
 vdsm: Running check_is_configured
 libvirt is not configured for vdsm yet
 Modules libvirt are not configured
   Traceback (most recent call last):
File /usr/bin/vdsm-tool, line 145, in module
  sys.exit(main())
File /usr/bin/vdsm-tool, line 142, in main
  return tool_command[cmd][command](*args[1:])
File /usr/lib64/python2.6/site-packages/vdsm/tool/configurator.py,
 line 282, in isconfigured
  raise RuntimeError(msg)
 RuntimeError:

 One of the modules is not configured to work with VDSM.
 To configure the module use the following:
 'vdsm-tool configure [module_name]'.

 If all modules are not configured try to use:
 'vdsm-tool configure --force'
 (The force flag will stop the module's service and start it
 afterwards automatically to load the new configuration.)

 vdsm: stopped during execute check_is_configured task (task returned
 with error code 1).
 vdsm start [FAILED]

 yum downgrade vdsm*

 ​Here's the package changes for reference,

 -- Running transaction check
 --- Package vdsm.x86_64 0:4.14.9-0.el6 will be a downgrade
 --- Package vdsm.x86_64 0:4.14.11-0.el6 will be erased
 --- Package vdsm-cli.noarch 0:4.14.9-0.el6 will be a downgrade
 --- Package vdsm-cli.noarch 0:4.14.11-0.el6 will be erased
 --- Package vdsm-python.x86_64 0:4.14.9-0.el6 will be a downgrade
 --- Package vdsm-python.x86_64 0:4.14.11-0.el6 will be erased
 --- Package vdsm-python-zombiereaper.noarch 0:4.14.9-0.el6 will be a
 downgrade
 --- Package vdsm-python-zombiereaper.noarch 0:4.14.11-0.el6 will be
 erased
 --- Package vdsm-xmlrpc.noarch 0:4.14.9-0.el6 will be a downgrade
 --- Package vdsm-xmlrpc.noarch 0:4.14.11-0.el6 will be erased

 service vdsmd start
 initctl: Job is already running: libvirtd
 vdsm: Running mkdirs
 vdsm: Running configure_coredump
 vdsm: Running configure_vdsm_logs
 vdsm: Running run_init_hooks
 vdsm: Running gencerts
 vdsm: Running check_is_configured
 libvirt is already configured for vdsm
 sanlock service is already configured
 vdsm: Running validate_configuration
 SUCCESS: ssl configured to true. No conflicts
 vdsm: Running prepare_transient_repository
 vdsm: Running syslog_available
 vdsm: Running nwfilter
 vdsm: Running dummybr
 vdsm: Running load_needed_modules
 vdsm: Running tune_system
 vdsm: Running test_space
 vdsm: Running test_lo
 vdsm: Running unified_network_persistence_upgrade
 vdsm: Running restore_nets
 vdsm: Running upgrade_300_nets
 Starting up vdsm daemon:
 vdsm start [  OK  ]
 [root@ov-hv1-2a-08-23 ~]# service vdsmd status
 VDS daemon server is running


 On Sat, Jul 19, 2014 at 6:58 PM, Andrew Lau and...@andrewklau.com
 mailto:and...@andrewklau.com wrote:

 It seems vdsm is not running,

 service vdsmd status
 VDS daemon is not running, and its watchdog is running

 The only logs in /var/log/vdsm/ that appear to have any content is
 /var/log/vdsm/supervdsm.log - everything else is blank

 MainThread::DEBUG::2014-07-19
 18:55:34,793::supervdsmServer::424::SuperVdsm.Server::(main)
 Terminated normally
 MainThread::DEBUG::2014-07-19
 18:55:38,033::netconfpersistence::134::root::(_getConfigs)
 Non-existing config set.
 MainThread::DEBUG::2014-07-19
 18:55:38,034::netconfpersistence::134::root::(_getConfigs)
 Non-existing config set.
 MainThread::DEBUG::2014-07-19
 18:55:38,058::supervdsmServer::384::SuperVdsm.Server::(main) Making
 sure I'm root - SuperVdsm
 MainThread::DEBUG::2014-07-19
 18:55:38,059::supervdsmServer::393::SuperVdsm.Server::(main) Parsing
 cmd args
 MainThread::DEBUG::2014-07-19
 18:55:38,059::supervdsmServer::396::SuperVdsm.Server::(main)
 Cleaning old socket /var/run/vdsm/svdsm.sock
 MainThread::DEBUG::2014-07-19
 18:55:38,059::supervdsmServer::400::SuperVdsm.Server::(main) Setting
 up keep alive thread
 MainThread::DEBUG::2014-07-19
 18:55:38,059::supervdsmServer::406::SuperVdsm.Server::(main)
 Creating remote object manager
 MainThread::DEBUG::2014-07-19
 18:55:38,061::supervdsmServer::417::SuperVdsm.Server::(main) Started
 serving super vdsm object
 sourceRoute::DEBUG::2014-07-19
 18:55:38,062::sourceRouteThread::56::root::(_subscribeToInotifyLoop)
 sourceRouteThread.subscribeToInotifyLoop started


 On Sat, Jul 19, 2014 at 6:48 PM, Andrew Lau and...@andrewklau.com
  

Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?

2014-07-21 Thread Pranith Kumar Karampuri


On 07/21/2014 02:08 PM, Jiri Moskovcak wrote:

On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote:


On 07/19/2014 11:25 AM, Andrew Lau wrote:



On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri
pkara...@redhat.com mailto:pkara...@redhat.com wrote:


On 07/18/2014 05:43 PM, Andrew Lau wrote:

​ ​

On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur
vbel...@redhat.com mailto:vbel...@redhat.com wrote:

[Adding gluster-devel]


On 07/18/2014 05:20 PM, Andrew Lau wrote:

Hi all,

As most of you have got hints from previous messages,
hosted engine
won't work on gluster . A quote from BZ1097639

Using hosted engine with Gluster backed storage is
currently something
we really warn against.


I think this bug should be closed or re-targeted at
documentation, because there is nothing we can do here.
Hosted engine assumes that all writes are atomic and
(immediately) available for all hosts in the cluster.
Gluster violates those assumptions.
​

I tried going through BZ1097639 but could not find much
detail with respect to gluster there.

A few questions around the problem:

1. Can somebody please explain in detail the scenario that
causes the problem?

2. Is hosted engine performing synchronous writes to ensure
that writes are durable?

Also, if there is any documentation that details the hosted
engine architecture that would help in enhancing our
understanding of its interactions with gluster.


​

Now my question, does this theory prevent a scenario of
perhaps
something like a gluster replicated volume being mounted
as a glusterfs
filesystem and then re-exported as the native kernel NFS
share for the
hosted-engine to consume? It could then be possible to
chuck ctdb in
there to provide a last resort failover solution. I have
tried myself
and suggested it to two people who are running a similar
setup. Now
using the native kernel NFS server for hosted-engine and
they haven't
reported as many issues. Curious, could anyone validate
my theory on this?


If we obtain more details on the use case and obtain gluster
logs from the failed scenarios, we should be able to
understand the problem better. That could be the first step
in validating your theory or evolving further 
recommendations :).



​ I'm not sure how useful this is, but ​Jiri Moskovcak tracked
this down in an off list message.

​ Message Quote:​

​ ==​

​We were able to track it down to this (thanks Andrew for
providing the testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
File
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py,
line 165, in handle
  response = success  + self._dispatch(data)
File
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py,
line 261, in _dispatch
  .get_all_stats_for_service_type(**options)
File
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
line 41, in get_all_stats_for_service_type
  d = self.get_raw_stats_for_service_type(storage_dir, 
service_type)

File
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
line 74, in get_raw_stats_for_service_type
  f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle:
'/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

Andrew/Jiri,
Would it be possible to post gluster logs of both the
mount and bricks on the bz? I can take a look at it once. If I
gather nothing then probably I will ask for your help in
re-creating the issue.

Pranith


​Unfortunately, I don't have the logs for that setup any more.. ​I'll
try replicate when I get a chance. If I understand the comment from
the BZ, I don't think it's a gluster bug per-say, more just how
gluster does its replication.

hi Andrew,
  Thanks for that. I couldn't come to any conclusions because no
logs were available. It is unlikely that self-heal is involved because
there were no bricks going down/up according to the bug description.



Hi,
I've never had such setup, I guessed problem with gluster based on 
OSError: [Errno 116] Stale file handle: which happens when the file 
opened by application on client gets removed on the server. I'm pretty 
sure we (hosted-engine) don't remove that file, so I think it's some 
gluster magic moving the data 

Re: [ovirt-users] Guest VM Console Creation/Access using REST API and noVNC

2014-07-21 Thread Michal Skrivanek

On Jul 21, 2014, at 04:33 , Punit Dambiwal hypu...@gmail.com wrote:

 Hi All,
 
 I am still waiting for the updates... does anyone have a clue how to solve 
 this problem?

Hi Punit,
I'm afraid no one can help you debug connectivity issues remotely without 
describing precisely what you are doing and how, and including all the logs.

Thanks,
michal

 
 Thanks,
 Punit
 
 
 On Fri, Jul 18, 2014 at 12:37 PM, Punit Dambiwal hypu...@gmail.com wrote:
 Hi All,
 
 We are also struggling with the same problem... would anybody mind posting 
 the resolution here, or suggesting a way to get rid of this "Failed to 
 connect to server (code: 1006)" error?
 
 Thanks,
 Punit
 
 
 On Thu, Jul 17, 2014 at 5:20 PM, Shanil S xielessha...@gmail.com wrote:
 Hi,
 
 We are waiting for updates; it would be great if anyone could give us some 
 helpful details. :)
 
 -- 
 Regards 
 Shanil
 
 
 On Thu, Jul 17, 2014 at 10:23 AM, Shanil S xielessha...@gmail.com wrote:
 Hi,
 
 We have enabled our portal IP address on the engine and host firewalls, but 
 the connection still fails, so there should be no firewall issues.
 
 -- 
 Regards 
 Shanil
 
 
 On Wed, Jul 16, 2014 at 3:26 PM, Shanil S xielessha...@gmail.com wrote:
 Hi Sven,
 
 Regarding the ticket path, is it the direct combination of host and port? 
 Suppose the host is 1.2.3.4 and the port is 5100, what should the path 
 value be? Is any encryption needed here?
 
 
 so you have access from the browser to the websocket-proxy, network
 wise? can you ping the proxy?
 and the websocket proxy can reach the host where the vm runs?
 
 Yes, there should be no firewall issue, as we can access the console from 
 the oVirt engine portal.
 
 Do we need to allow our own portal IP address on the oVirt engine and 
 hypervisors as well?
 
 
 -- 
 Regards 
 Shanil
 
 
 On Wed, Jul 16, 2014 at 3:13 PM, Sven Kieske s.kie...@mittwald.de wrote:
 
 
 Am 16.07.2014 11:30, schrieb Shanil S:
 We get the ticket details like host, port and password from the ticket
 API function call, but not the path value. Can we derive it from the
 ticket details? I couldn't find it anywhere in the ticket details.
 
 the path is the combination of host and port.
 
 so you have access from the browser to the websocket-proxy, network
 wise? can you ping the proxy?
 and the websocket proxy can reach the host where the vm runs?
 are you sure there are no firewalls in between?
 also you should pay attention on how long your ticket
 is valid, you can specify the duration in minutes in your api call.
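
For what it's worth, a rough sketch of the two API calls being discussed,
with a hypothetical engine URL, credentials and VM id; the element names
are taken from the 3.x REST API, so double-check them against your version:

    import requests

    ENGINE = "https://engine.example.com/api"        # placeholder
    AUTH = ("admin@internal", "password")            # placeholder
    VM_ID = "00000000-0000-0000-0000-000000000000"   # placeholder

    # 1. Ask the engine for a console ticket. Check whether your version
    #    expects the expiry in minutes (as mentioned above) or in seconds.
    body = "<action><ticket><expiry>120</expiry></ticket></action>"
    r = requests.post("%s/vms/%s/ticket" % (ENGINE, VM_ID), data=body,
                      headers={"Content-Type": "application/xml"},
                      auth=AUTH, verify=False)
    r.raise_for_status()  # the ticket value is in <ticket><value>...</value>

    # 2. Read the display address/port from the VM resource and build the
    #    websocket-proxy path as host:port, as described above.
    vm_xml = requests.get("%s/vms/%s" % (ENGINE, VM_ID),
                          auth=AUTH, verify=False).text
    # parse <display><address> and <port> out of vm_xml, then:
    # path = "%s:%s" % (address, port)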
 
 --
 Mit freundlichen Grüßen / Regards
 
 Sven Kieske
 
 Systemadministrator
 Mittwald CM Service GmbH & Co. KG
 Königsberger Straße 6
 32339 Espelkamp
 T: +49-5772-293-100
 F: +49-5772-293-333
 https://www.mittwald.de
 Geschäftsführer: Robert Meyer
 St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
 Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
 
 
 
 
 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users
 
 
 
 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.4.3 Network problem

2014-07-21 Thread Joop
On 21-7-2014 14:38, Maurice James wrote:
 I just upgraded to 3.4.3, and now it's complaining that em1 and em2 are
 down. They are not down; not sure why it thinks the interfaces are
 down. It's doing this for all 4 of my hosts.

Upgraded too and same problem. Seems that a downgrade of vdsm will solve
this.

Joop

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.4.3 Network problem

2014-07-21 Thread Maurice James
Wow, huge bug 





From: Joop jvdw...@xs4all.nl 
To: users@ovirt.org 
Sent: Monday, July 21, 2014 8:46:18 AM 
Subject: Re: [ovirt-users] 3.4.3 Network problem 

On 21-7-2014 14:38, Maurice James wrote: 



I just upgraded to 3.4.3, not its complaining that em1 and em2 are down. They 
are not down not sure why it thinks the interfaces are down. Its doing this for 
all 4 of my hosts 



Upgraded too and same problem. Seems that a downgrade of vdsm will solve this. 

Joop 


___ 
Users mailing list 
Users@ovirt.org 
http://lists.ovirt.org/mailman/listinfo/users 

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.4.3 Network problem

2014-07-21 Thread Maurice James
I submitted a bug report 
https://bugzilla.redhat.com/show_bug.cgi?id=1121643 


- Original Message -

From: Joop jvdw...@xs4all.nl 
To: users@ovirt.org 
Sent: Monday, July 21, 2014 8:46:18 AM 
Subject: Re: [ovirt-users] 3.4.3 Network problem 

On 21-7-2014 14:38, Maurice James wrote: 



I just upgraded to 3.4.3, not its complaining that em1 and em2 are down. They 
are not down not sure why it thinks the interfaces are down. Its doing this for 
all 4 of my hosts 



Upgraded too and same problem. Seems that a downgrade of vdsm will solve this. 

Joop 


___ 
Users mailing list 
Users@ovirt.org 
http://lists.ovirt.org/mailman/listinfo/users 

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] postponing oVirt 3.5.0 second beta

2014-07-21 Thread Sandro Bonazzola
Hi,
we're going to postpone oVirt 3.5.0 second beta since ovirt-engine currently 
doesn't build [1].

We have also identified a set of bugs causing automated tests to fail, so we're 
going to block the release until the engine builds cleanly and at least the most 
critical issues found have been fixed.

Please note that more than 80 patches are now in master and not backported to 
3.5 branch.
Maintainers, please ensure all patches targeted to 3.5 are properly backported.

We will probably postpone the second test day too, depending on when 
we're able to compose the second beta build.

[1] 
http://jenkins.ovirt.org/view/Stable%20branches%20per%20project/view/ovirt-engine/job/ovirt-engine_3.5_create-rpms_merged/47/
[2] http://goo.gl/pFngWU

-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ilo4 vs. ipmilan fencing agents

2014-07-21 Thread Jason Brooks


- Original Message -
 From: Eli Mesika emes...@redhat.com
 To: Jason Brooks jbro...@redhat.com
 Cc: users users@ovirt.org, Marek Grac mg...@redhat.com
 Sent: Saturday, July 19, 2014 1:45:37 PM
 Subject: Re: [ovirt-users] ilo4 vs. ipmilan fencing agents
 
 
 
 - Original Message -
  From: Jason Brooks jbro...@redhat.com
  To: users users@ovirt.org
  Sent: Thursday, July 10, 2014 1:02:13 AM
  Subject: [ovirt-users] ilo4 vs. ipmilan fencing agents
  
  Hi all --
  
  I'm trying to get fencing squared away in my cluster of hp dl-380 servers,
  which come with ilo4. I was able to get a successful status check from
  the command line with fence_ilo4, but not w/ the ilo4 option in ovirt.
  
  I see, though, that ilo4 in ovirt just maps to fence_ipmilan, and I was
  not able to get a successful status check w/ fence_ipmilan from the cli.
  
  So, I tried resetting the mapping so that ilo4 maps to ilo4. Now I can
  complete the power management test in ovirt, but I imagine there's some
  reason why ovirt isn't configured this way by default.
  
  Will fencing actually work for me with ilo4 mapped to ilo4, rather than
  to ipmilan?
 
 ILO3 and ILO4 are mapped implicitly to ipmilan with lanplus flag ON and
 power_wait=4

On my installation, ilo4 w/ no options fails the test; ilo4 w/ lanplus=on
in the options field succeeds. Is it possible that the lanplus=on option
isn't being registered/applied properly?

Jason
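
For reference, a small sketch of the kind of command-line status check
being compared here, with placeholder address and credentials; option
spellings can differ slightly between fence-agents versions:

    import subprocess

    cmd = ["fence_ipmilan",
           "--ip=ilo.example.com",   # placeholder iLO address
           "--username=admin",       # placeholder credentials
           "--password=secret",
           "--lanplus",              # the flag the ilo4 mapping is expected to add
           "--action=status"]
    rc = subprocess.call(cmd)
    print("power status check %s" % ("succeeded" if rc == 0 else "failed"))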


 If you change the mapping to use the native scripts, it's OK as long as it
 works for you.
 Adding Marek G to the thread.
 Marek, should we always map ILO3 & ILO4 to the native scripts (fence_ilo3,
 fence_ilo4) and not to ipmilan?
 
  
  Thanks, Jason
  
  ---
  
  Jason Brooks
  Red Hat Open Source and Standards
  
  @jasonbrooks | @redhatopen
  http://community.redhat.com
  
  
  ___
  Users mailing list
  Users@ovirt.org
  http://lists.ovirt.org/mailman/listinfo/users
  
 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Logical network error

2014-07-21 Thread Dan Kenigsberg
On Mon, Jul 21, 2014 at 07:35:00AM -0400, Maurice James wrote:
 
 Here are the supervdsm logs

Hmm, it's a trove of errors and tracebacks; there's a lot to debug, for
example, here Vdsm was asked to create a 'migration' network on top of
bond0 that was already used by ovirtmgmt. Engine should have blocked
that. Moti?

MainProcess|Thread-47826::DEBUG::2014-04-11 
10:00:26,335::configNetwork::589::setupNetworks::(setupNetworks) Setting up 
network according to configuration: networks:{'migration': {'bonding': 'bond0', 
'bridged': 'false'}}, bondings:{}, options:{'connectivityCheck': 'true', 
'connectivityTimeout': 120}
Traceback (most recent call last):
  File /usr/share/vdsm/supervdsmServer, line 98, in wrapper
res = func(*args, **kwargs)
  File /usr/share/vdsm/supervdsmServer, line 202, in setupNetworks
return configNetwork.setupNetworks(networks, bondings, **options)
  File /usr/share/vdsm/configNetwork.py, line 648, in setupNetworks
implicitBonding=True, **d)
  File /usr/share/vdsm/configNetwork.py, line 186, in wrapped
return func(*args, **kwargs)
  File /usr/share/vdsm/configNetwork.py, line 256, in addNetwork
bridged)
  File /usr/share/vdsm/configNetwork.py, line 170, in 
_validateInterNetworkCompatibility
_validateNoDirectNet(ifaces_bridgeless)
  File /usr/share/vdsm/configNetwork.py, line 154, in _validateNoDirectNet
(iface, iface_net))
ConfigNetworkError: (21, interface 'bond0' already member of network 
'ovirtmgmt')
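
Purely as an illustration of the check that fired (this is not the actual
vdsm code), the validation amounts to something like:

    class ConfigNetworkError(Exception):
        pass

    def validate_no_direct_net(iface, existing_networks):
        # Refuse to put another bridgeless network on an interface that
        # already carries a network directly (here: bond0 -> ovirtmgmt).
        for net, net_iface in existing_networks.items():
            if net_iface == iface:
                raise ConfigNetworkError(
                    21, "interface %r already member of network %r" % (iface, net))

    try:
        validate_no_direct_net("bond0", {"ovirtmgmt": "bond0"})
    except ConfigNetworkError as e:
        print(e)  # (21, "interface 'bond0' already member of network 'ovirtmgmt'")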


There are also repeated failed attempts to create a payload disk: did you
notice when they happen? Could it be related to insufficient disk space?

MainProcess|clientIFinit::ERROR::2014-03-25 
22:13:02,529::supervdsmServer::100::SuperVdsm.ServerCallback::(wrapper) Error 
in mkFloppyFs
Traceback (most recent call last):
  File /usr/share/vdsm/supervdsmServer, line 98, in wrapper
res = func(*args, **kwargs)
  File /usr/share/vdsm/supervdsmServer, line 325, in mkFloppyFs
return mkimage.mkFloppyFs(vmId, files, volId)
  File /usr/share/vdsm/mkimage.py, line 104, in mkFloppyFs
code %s, out %s\nerr %s % (rc, out, err))
OSError: [Errno 5] could not create floppy file: code 1, out mkfs.msdos 3.0.9 
(31 Jan 2010)

err mkfs.msdos: unable to create 
/var/run/vdsm/payload/94632c2e-28a0-4304-9261-c302e0027604.ecac527e731a2a0057dc6a3ae3df9ba3.img


In any case, if you manage to reproduce these issues, please open a detailed 
bug entry.

Regards,
Dan.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] RHEV 3.4 trial hosted-engine either host wants to take ownership

2014-07-21 Thread Steve Dainard
I added a hook to rhevm, and then restarted the engine service which
triggered a hosted-engine VM shutdown (likely because of the failed
liveliness check).

Once the hosted-engine VM shutdown it did not restart on the other host.

On both hosts configured for hosted-engine I'm seeing logs from ha-agent
where each host thinks the other host has a better score. Is there supposed
to be a mechanism for a tie breaker here? I do notice that the log mentions
best REMOTE host, so perhaps I'm interpreting this message incorrectly.

ha-agent logs:

Host 001:

MainThread::INFO::2014-07-21
11:51:57,396::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405957917.4 type=state_transition
detail=EngineDown-EngineDown hostname='rhev001.miovision.corp'
MainThread::INFO::2014-07-21
11:51:57,397::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineDown-EngineDown) sent?
ignored
MainThread::INFO::2014-07-21
11:51:57,924::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-21
11:51:57,924::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host rhev002.miovision.corp (id: 2, score: 2400)
MainThread::INFO::2014-07-21
11:52:07,961::states::454::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine down, local host does not have best score
MainThread::INFO::2014-07-21
11:52:07,975::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405957927.98 type=state_transition
detail=EngineDown-EngineDown hostname='rhev001.miovision.corp'

Host 002:

MainThread::INFO::2014-07-21
11:51:47,405::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405957907.41 type=state_transition
detail=EngineDown-EngineDown hostname='rhev002.miovision.corp'
MainThread::INFO::2014-07-21
11:51:47,406::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineDown-EngineDown) sent?
ignored
MainThread::INFO::2014-07-21
11:51:47,834::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-21
11:51:47,835::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host rhev001.miovision.corp (id: 1, score: 2400)
MainThread::INFO::2014-07-21
11:51:57,870::states::454::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine down, local host does not have best score
MainThread::INFO::2014-07-21
11:51:57,883::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405957917.88 type=state_transition
detail=EngineDown-EngineDown hostname='rhev002.miovision.corp'

This went on for 20 minutes about an hour ago, and I decided to --vm-start
on one of the hosts. The manager VM runs for a few minutes with the engine
ui accessible, before shutting itself down again.

I then put host 002 into local maintenance mode, and host 001 auto started
the hosted-engine VM. The logging still references host 002 as the 'best
remote host' even though the calculated score is now 0:

MainThread::INFO::2014-07-21
12:03:24,011::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405958604.01 type=state_transition
detail=EngineUp-EngineUp hostname='rhev001.miovision.corp'
MainThread::INFO::2014-07-21
12:03:24,013::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineUp-EngineUp) sent?
ignored
MainThread::INFO::2014-07-21
12:03:24,515::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-07-21
12:03:24,516::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host rhev002.miovision.corp (id: 2, score: 0)
MainThread::INFO::2014-07-21
12:03:34,567::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405958614.57 type=state_transition
detail=EngineUp-EngineUp hostname='rhev001.miovision.corp'

Once the hosted-engine VM was up for about 5 minutes I took host 002 out of
local maintenance mode and the VM has not since shutdown.

Is this expected behaviour? Is this the normal recovery process when two
hosts both hosting hosted-engine are started at the same time? I would have
expected that once the hosted-engine VM was detected as bad (the liveliness
check failed when I restarted the engine service) and the VM was shut down,
it would spin back up on the next available host.
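
As an aside, a tiny sketch of why two equal scores can stall the way the
logs above show; this mirrors the log messages, not the actual
ovirt-hosted-engine-ha code:

    def should_start_engine(local_score, remote_scores):
        # "Engine down, local host does not have best score": with a strict
        # comparison, neither host wins a 2400 vs 2400 tie, so some extra
        # tie-breaker (e.g. lowest host id) is needed to pick a starter.
        best_remote = max(remote_scores.values()) if remote_scores else 0
        return local_score > best_remote

    print(should_start_engine(2400, {"rhev002": 2400}))  # False on host 001
    print(should_start_engine(2400, {"rhev001": 2400}))  # False on host 002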

Thanks,
Steve

[ovirt-users] ovirt-release.rpm 3.4 dead link

2014-07-21 Thread Federico Alberto Sayd

Hello:

The link to ovirt-release.rpm 3.4 is dead:

http://www.ovirt.org/OVirt_3.4_Release_Notes#Install_.2F_Upgrade_from_Previous_Versions

Where is the ovirt-release.rpm ??

Thanks

Federico
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.4.3 Network problem

2014-07-21 Thread Dan Kenigsberg
On Mon, Jul 21, 2014 at 06:05:45PM +0100, Dan Kenigsberg wrote:
 On Mon, Jul 21, 2014 at 09:03:58AM -0400, Maurice James wrote:
  I submitted a bug report 
  https://bugzilla.redhat.com/show_bug.cgi?id=1121643 
  
  
  - Original Message -
  
  From: Joop jvdw...@xs4all.nl 
  To: users@ovirt.org 
  Sent: Monday, July 21, 2014 8:46:18 AM 
  Subject: Re: [ovirt-users] 3.4.3 Network problem 
  
  On 21-7-2014 14:38, Maurice James wrote: 
  
  
  
  I just upgraded to 3.4.3, not its complaining that em1 and em2 are down. 
  They are not down not sure why it thinks the interfaces are down. Its doing 
  this for all 4 of my hosts 
 
 It is a horrible bug, due to my http://gerrit.ovirt.org/29689, I'll try
 to send a quick fix asap.

Please help me verify that a removal of two lines
http://gerrit.ovirt.org/#/c/30547/ fixes the issue.

Regards,
Dan.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt-release.rpm 3.4 dead link

2014-07-21 Thread Daniel Erez


- Original Message -
 From: Federico Alberto Sayd fs...@uncu.edu.ar
 To: users users@ovirt.org
 Sent: Monday, July 21, 2014 7:54:48 PM
 Subject: [ovirt-users] ovirt-release.rpm 3.4 dead link
 
 Hello:
 
 The link to ovirt-release.rpm 3.4 is dead:
 
 http://www.ovirt.org/OVirt_3.4_Release_Notes#Install_.2F_Upgrade_from_Previous_Versions
 
 Where is the ovirt-release.rpm ??

3.4 RPM is available at 
http://resources.ovirt.org/pub/yum-repo/ovirt-release34.rpm
(I've updated the stale link in the wiki..)

 
 Thanks
 
 Federico
 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users
 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.4.3 Network problem

2014-07-21 Thread Maurice James
Where can I get the RPMs?

- Original Message -
From: Dan Kenigsberg dan...@redhat.com
To: Maurice James mja...@media-node.com
Cc: Joop jvdw...@xs4all.nl, users@ovirt.org
Sent: Monday, July 21, 2014 1:34:39 PM
Subject: Re: [ovirt-users] 3.4.3 Network problem

On Mon, Jul 21, 2014 at 06:05:45PM +0100, Dan Kenigsberg wrote:
 On Mon, Jul 21, 2014 at 09:03:58AM -0400, Maurice James wrote:
  I submitted a bug report 
  https://bugzilla.redhat.com/show_bug.cgi?id=1121643 
  
  
  - Original Message -
  
  From: Joop jvdw...@xs4all.nl 
  To: users@ovirt.org 
  Sent: Monday, July 21, 2014 8:46:18 AM 
  Subject: Re: [ovirt-users] 3.4.3 Network problem 
  
  On 21-7-2014 14:38, Maurice James wrote: 
  
  
  
  I just upgraded to 3.4.3, not its complaining that em1 and em2 are down. 
  They are not down not sure why it thinks the interfaces are down. Its doing 
  this for all 4 of my hosts 
 
 It is a horrible bug, due to my http://gerrit.ovirt.org/29689, I'll try
 to send a quick fix asap.

Please help me verify that a removal of two lines
http://gerrit.ovirt.org/#/c/30547/ fixes the issue.

Regards,
Dan.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] hostusb hook - VM device errors in Windows VM

2014-07-21 Thread Steve Dainard
I'm using the hostusb hook on RHEV 3.4 trial.

The usb device is passed through to the VM, but I'm getting errors in a
Windows VM when the device driver is loaded.

I started with a simple usb drive, on the host it is listed as:

Bus 002 Device 010: ID 05dc:c75c Lexar Media, Inc.

Which I added as 0x05dc:0xc75c to the Windows 7 x64 VM.

In Windows I get an error in device manager:
USB Mass Storage Device This device cannot start. (Code 10)
Properties/General Tab: Device type: Universal Serial Bus Controllers,
Manufacturer: Compatible USB storage device, Location: Port_#0001.Hub_#0001

Under hardware Ids:
USB\VID_05DC&PID_C75C&REV_0102
USB\VID_05DC&PID_C75C

So it looks like the proper USB device ID is passed to the VM.

I don't see any error messages in event viewer, and I don't see anything in
VDSM logs either.

Any help is appreciated.

Steve
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hostusb hook - VM device errors in Windows VM

2014-07-21 Thread Steve Dainard
I should mention I can mount this usb drive in a CentOS 6.5 VM without any
problems.


On Mon, Jul 21, 2014 at 2:11 PM, Steve Dainard sdain...@miovision.com
wrote:

 I'm using the hostusb hook on RHEV 3.4 trial.

 The usb device is passed through to the VM, but I'm getting errors in a
 Windows VM when the device driver is loaded.

 I started with a simple usb drive, on the host it is listed as:

 Bus 002 Device 010: ID 05dc:c75c Lexar Media, Inc.

 Which I added as 0x05dc:0xc75c to the Windows 7 x64 VM.

 In Windows I get an error in device manager:
 USB Mass Storage Device This device cannot start. (Code 10)
 Properties/General Tab: Device type: Universal Serial Bus Controllers,
 Manufacturer: Compatible USB storage device, Location: Port_#0001.Hub_#0001

 Under hardware Ids:
 USB\VID_05DC&PID_C75C&REV_0102
 USB\VID_05DC&PID_C75C

 So it looks like the proper USB device ID is passed to the VM.

 I don't see any error messages in event viewer, and I don't see anything
 in VDSM logs either.

 Any help is appreciated.

 Steve

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.4.3 Network problem

2014-07-21 Thread Joop
On 21-7-2014 19:34, Dan Kenigsberg wrote:
 On Mon, Jul 21, 2014 at 06:05:45PM +0100, Dan Kenigsberg wrote:
 On Mon, Jul 21, 2014 at 09:03:58AM -0400, Maurice James wrote:
 I submitted a bug report 
 https://bugzilla.redhat.com/show_bug.cgi?id=1121643 


 - Original Message -

 From: Joop jvdw...@xs4all.nl 
 To: users@ovirt.org 
 Sent: Monday, July 21, 2014 8:46:18 AM 
 Subject: Re: [ovirt-users] 3.4.3 Network problem 

 On 21-7-2014 14:38, Maurice James wrote: 



 I just upgraded to 3.4.3, not its complaining that em1 and em2 are down. 
 They are not down not sure why it thinks the interfaces are down. Its doing 
 this for all 4 of my hosts 
 It is a horrible bug, due to my http://gerrit.ovirt.org/29689, I'll try
 to send a quick fix asap.
 Please help me verify that a removal of two lines
 http://gerrit.ovirt.org/#/c/30547/ fixes the issue.

I commented out the indicated 2 lines and could activate my host, and it
stayed activated (1 h), whereas before this patch it would turn unresponsive
quite quickly (within minutes).

Joop

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.4.3 Network problem

2014-07-21 Thread Joop
On 21-7-2014 20:11, Maurice James wrote:
 Where can I get the RPMs?

No RPMs yet, but it's a two-line edit in /usr/share/vdsm/sampling.py. Search
for 'vlan' and you should find it in a three-way if/elif construct.
Just comment out the elif line and the vlan line, save, then restart
vdsm and things should work again.
I expect/hope for new RPMs late tomorrow or the day after.

Joop

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Online backup options

2014-07-21 Thread Itamar Heim

On 06/20/2014 04:35 PM, Steve Dainard wrote:

Hello Ovirt team,

Reading this bulletin: https://access.redhat.com/site/solutions/117763
there is a reference to 'private Red Hat Bug # 523354' covering online
backups of VM's.

Can someone comment on this feature, and rough timeline? Is this a
native backup solution that will be included with Ovirt/RHEV?

Is this Ovirt feature where the work is being done?
http://www.ovirt.org/Features/Backup-Restore_API_Integration It seems
like this may be a different feature specifically for 3rd party backup
options.


Yes, that's the current approach: allow backup solutions to work with 
oVirt for backups while we focus on core issues.


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Deploying hosted engine on second host with different CPU model

2014-07-21 Thread Itamar Heim

On 07/18/2014 02:05 AM, Andrew Lau wrote:

I think you should be able to specify this within ovirt-engine; just
modify the cluster's CPU compatibility. I hit this too, but I think I
just ended up provisioning the older machine first, then the newer ones
joined with the older model.


1. First, the host needs to have a compatible CPU model.
   What does 'vdsClient -s 0 getVdsCaps | grep -i flag' return?

2. Cluster CPU level is easy, but the hosted engine VM config resides on the 
disk, and needs to be manually edited in this case, IIRC.




On Thu, Jul 17, 2014 at 11:05 PM, George Machitidze
gmachiti...@greennet.ge wrote:

Hello,

I am deploying hosted engine (HA) on hosts with different CPU models in one
of my oVirt labs.

The hosts have different CPUs, and there is also the problem that the
virtualization platform cannot detect the CPU at all; "The following CPU types
are supported by this host:" is empty:


2014-07-17 16:51:42 DEBUG otopi.plugins.ovirt_hosted_engine_setup.vdsmd.cpu
cpu._customization:124 Compatible CPU models are: []

Is there any way to override this setting and use CPU of old machine for
both hosts?

ex.
host1:

cpu family  : 6
model   : 15
model name  : Intel(R) Xeon(R) CPU5160  @ 3.00GHz
stepping: 11

host2:

cpu family  : 6
model   : 42
model name  : Intel(R) Xeon(R) CPU E31220 @ 3.10GHz
stepping: 7



[root@ovirt2 ~]# hosted-engine --deploy
[ INFO  ] Stage: Initializing
   Continuing will configure this host for serving as hypervisor and
create a VM where you have to install oVirt Engine afterwards.
   Are you sure you want to continue? (Yes, No)[Yes]:
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup
   Configuration files: []
   Log file:
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20140717165111-7tg2g7.log
   Version: otopi-1.2.1 (otopi-1.2.1-1.el6)
[ INFO  ] Hardware supports virtualization
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Stage: Environment customization

   --== STORAGE CONFIGURATION ==--

   During customization use CTRL-D to abort.
   Please specify the storage you would like to use (nfs3,
nfs4)[nfs3]:
   Please specify the full shared storage connection path to use
(example: host:/path): ovirt-hosted:/engine
   The specified storage location already contains a data domain. Is
this an additional host setup (Yes, No)[Yes]?
[ INFO  ] Installing on additional host
   Please specify the Host ID [Must be integer, default: 2]:

   --== SYSTEM CONFIGURATION ==--

[WARNING] A configuration file must be supplied to deploy Hosted Engine on
an additional host.
   The answer file may be fetched from the first host using scp.
   If you do not want to download it automatically you can abort the
setup answering no to the following question.
   Do you want to scp the answer file from the first host? (Yes,
No)[Yes]:
   Please provide the FQDN or IP of the first host: ovirt1.test.ge
   Enter 'root' user password for host ovirt1.test.ge:
[ INFO  ] Answer file successfully downloaded

   --== NETWORK CONFIGURATION ==--

   The following CPU types are supported by this host:
[ ERROR ] Failed to execute stage 'Environment customization': Invalid CPU
type specified: model_Conroe
[ INFO  ] Stage: Clean up
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination

--
BR

George Machitidze


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?

2014-07-21 Thread Vijay Bellur

On 07/21/2014 05:09 AM, Pranith Kumar Karampuri wrote:


On 07/21/2014 02:08 PM, Jiri Moskovcak wrote:

On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote:


On 07/19/2014 11:25 AM, Andrew Lau wrote:



On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri
pkara...@redhat.com mailto:pkara...@redhat.com wrote:


On 07/18/2014 05:43 PM, Andrew Lau wrote:

​ ​

On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur
vbel...@redhat.com mailto:vbel...@redhat.com wrote:

[Adding gluster-devel]


On 07/18/2014 05:20 PM, Andrew Lau wrote:

Hi all,

As most of you have got hints from previous messages,
hosted engine
won't work on gluster . A quote from BZ1097639

Using hosted engine with Gluster backed storage is
currently something
we really warn against.


I think this bug should be closed or re-targeted at
documentation, because there is nothing we can do here.
Hosted engine assumes that all writes are atomic and
(immediately) available for all hosts in the cluster.
Gluster violates those assumptions.
​

I tried going through BZ1097639 but could not find much
detail with respect to gluster there.

A few questions around the problem:

1. Can somebody please explain in detail the scenario that
causes the problem?

2. Is hosted engine performing synchronous writes to ensure
that writes are durable?

Also, if there is any documentation that details the hosted
engine architecture that would help in enhancing our
understanding of its interactions with gluster.


​

Now my question, does this theory prevent a scenario of
perhaps
something like a gluster replicated volume being mounted
as a glusterfs
filesystem and then re-exported as the native kernel NFS
share for the
hosted-engine to consume? It could then be possible to
chuck ctdb in
there to provide a last resort failover solution. I have
tried myself
and suggested it to two people who are running a similar
setup. Now
using the native kernel NFS server for hosted-engine and
they haven't
reported as many issues. Curious, could anyone validate
my theory on this?


If we obtain more details on the use case and obtain gluster
logs from the failed scenarios, we should be able to
understand the problem better. That could be the first step
in validating your theory or evolving further
recommendations :).


​ I'm not sure how useful this is, but ​Jiri Moskovcak tracked
this down in an off list message.

​ Message Quote:​

​ ==​

​We were able to track it down to this (thanks Andrew for
providing the testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
File
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py,

line 165, in handle
  response = success  + self._dispatch(data)
File
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py,

line 261, in _dispatch
  .get_all_stats_for_service_type(**options)
File
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,

line 41, in get_all_stats_for_service_type
  d = self.get_raw_stats_for_service_type(storage_dir,
service_type)
File
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,

line 74, in get_raw_stats_for_service_type
  f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle:
'/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'


Andrew/Jiri,
Would it be possible to post gluster logs of both the
mount and bricks on the bz? I can take a look at it once. If I
gather nothing then probably I will ask for your help in
re-creating the issue.

Pranith


​Unfortunately, I don't have the logs for that setup any more.. ​I'll
try replicate when I get a chance. If I understand the comment from
the BZ, I don't think it's a gluster bug per-say, more just how
gluster does its replication.

hi Andrew,
  Thanks for that. I couldn't come to any conclusions because no
logs were available. It is unlikely that self-heal is involved because
there were no bricks going down/up according to the bug description.



Hi,
I've never had such setup, I guessed problem with gluster based on
OSError: [Errno 116] Stale file handle: which happens when the file
opened by application on client gets removed on the server. I'm pretty
sure we (hosted-engine) don't remove that file, so I