Re: [ovirt-users] ovirt can't find user

2017-06-30 Thread Ondra Machacek
On Thu, Jun 29, 2017 at 5:16 PM, Fabrice Bacchella
 wrote:
>
>> On 29 June 2017 at 14:42, Fabrice Bacchella wrote:
>>
>>
>>> On 29 June 2017 at 13:41, Ondra Machacek wrote:
>>>
>>> How do you login? Do you use webadmin or API/SDK, if using SDK, don't
>>> you use kerberos=True?
>>
>> Ok, got it.
>> It's tested with the SDK, using Kerberos. But Kerberos authentication is
>> done in Apache, and I configured a profile for that, so I needed to add
>> config.artifact.arg = X-Remote-User in my
>> /etc/ovirt-engine/extensions.d/MyProfile.authn.properties. But this is
>> missing from internal-authn.properties. So rexecutor@internal is checked
>> against my profile, and not found. And as the internal profile doesn't know
>> about X-Remote-User, it can't check the user and fails silently. That's why
>> I'm getting only one line. Perhaps the log line should have named the
>> extension that was failing, not the generic "External Authentication",
>> which didn't catch my eye.
>>
>> I will check that as soon as I have a few minutes to spare and tell you.
>
> I'm starting to understand. I need two authn modules, both using
> org.ovirt.engineextensions.aaa.misc.http.AuthnExtension but with a different
> authz.plugin. Is that possible? If I do that, in what order will the
> different authn modules be tried? Are they all tried until one succeeds at
> both authn and authz?
>

Yes, you can have multiple authn profiles, and it tries to log in until
one succeeds:

 
https://github.com/oVirt/ovirt-engine/blob/de46aa78f3117cbe436ab10926ac0c23fcdd7cfc/backend/manager/modules/aaa/src/main/java/org/ovirt/engine/core/aaa/filters/NegotiationFilter.java#L125

The order isn't guaranteed, but I think it's not important, or is it for you?
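
For reference, a minimal sketch of what such an HTTP-header-based authn profile could look like. The file name, extension name, profile name, and authz name below are illustrative; only config.artifact.arg = X-Remote-User and the AuthnExtension class come from this thread, so verify the remaining keys against the ovirt-engine-extension-aaa-misc documentation:

```properties
# /etc/ovirt-engine/extensions.d/MyProfile.authn.properties (illustrative)
ovirt.engine.extension.name = MyProfile-authn
ovirt.engine.extension.bindings.method = jbossmodule
ovirt.engine.extension.binding.jbossmodule.module = org.ovirt.engine-extensions.aaa.misc
ovirt.engine.extension.binding.jbossmodule.class = org.ovirt.engineextensions.aaa.misc.http.AuthnExtension
ovirt.engine.extension.provides = org.ovirt.engine.api.extensions.aaa.Authn
ovirt.engine.aaa.authn.profile.name = MyProfile
ovirt.engine.aaa.authn.authz.plugin = MyProfile-authz
config.artifact.name = HEADER
config.artifact.arg = X-Remote-User
```

A second profile would be a second such file with its own extension name, profile name, and authz.plugin value; the engine loads every file in extensions.d and tries the matching profiles until one succeeds.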
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread cmc
Hi Denis,

>
> That sounds really strange. I would suspect some storage problems or
> something. As I told you earlier, the output of --vm-status may shed light
> on that issue.

Unfortunately, I can't replicate it at the moment due to the need to
keep the VMs up.

> Did you try to migrate from the bare metal engine to the hosted engine?

Yes, I used this procedure:

http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/

Essentially, I used a brand new host not joined to the cluster to
deploy the Hosted Engine VM.

> The engine is responsible for starting those VMs. As you had no engine,
> there was nothing to start them. The Hosted Engine tools are only
> responsible for the engine VM, not other VMs.

I could not find out why the engine would not start from the logs I looked at.
I didn't have the time to spend on it, as I had to get the VMs up and running.

> I know there exists a 'bare metal to hosted engine' migration procedure,
> but I doubt I know it well enough. If I remember correctly, you need to take
> a backup of your bare metal engine database, run the migration preparation
> script (which handles spm_id duplications), deploy your first HE host,
> restore the database from the backup, and deploy more HE hosts. I'm not sure
> those steps are correct; better to ask Martin about the migration process.

I did all these steps as per the URL above, and it did not report any
errors during the process. The Hosted Engine VM started fine, but it did
not appear in the list of VMs. I think the problem here was that the list
of display types was incorrectly written in the hosted engine properties
file. I was still left with the issue that the Hosted Engine could not be
migrated to any other host. It was suggested to re-install the other hosts
with the 'deploy hosted engine' option (which was missing in the official
documentation). This didn't fix the issue, so it was suggested that the
host_id was incorrect (as it did not reflect the SPM ID of the host). I
fixed this, then restarted the cluster... with the result that the engine
would not start, and no VMs started. I could not see any storage errors in
any of the logs I looked at, and storage had not been a problem previously
when rebooting hosts (though I'd never restarted the whole cluster
before). When I used the old bare metal engine, I could get into the GUI
to start the VMs; I'm not sure why they didn't come up automatically.
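
As an aside, the host_id vs SPM ID mismatch described above can be checked mechanically: host_id lives in /etc/ovirt-hosted-engine/hosted-engine.conf, while the engine records SPM IDs in its vds_spm_id_map table. A hedged sketch of the comparison (the helper names are hypothetical):

```python
# Hypothetical helpers: compare the local hosted-engine host_id with the
# spm_id the engine assigned to this host. hosted-engine.conf is a plain
# key=value file.

def read_host_id(conf_text):
    """Extract host_id from hosted-engine.conf-style text, or None."""
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith('host_id='):
            return int(line.split('=', 1)[1])
    return None

def ids_match(conf_text, engine_spm_id):
    """True when the local host_id agrees with the engine's spm_id."""
    return read_host_id(conf_text) == engine_spm_id
```

On a real host the conf text would come from reading /etc/ovirt-hosted-engine/hosted-engine.conf, and the SPM ID from the engine database.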

I'd like to get it working, and will work with the person who takes it
over to do this. I'd like to see it succeed so that eventually we could
use oVirt as a proof of concept to replace VMware with RHEV. Everyone's
help has been great, but unfortunately it hasn't been entirely smooth
sailing (for this migration) so far.

Thanks again,

Cam


[ovirt-users] Sound Device Custom Property

2017-06-30 Thread Mauricio Perez
Hello Everyone,

 

I am trying to modify a VM sound device sub-element to  . I understand
the only way to perform this is by using a before_vm_start vdsm hook that
reads a custom property. Can someone please guide me through setting this
up?

* How would I define it in the engine config?

* How do I create the hooking module?

* Is there a source to find pre-created hooking modules?
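
For what it's worth, a before_vm_start hook is an executable script dropped into /usr/libexec/vdsm/hooks/before_vm_start/ that rewrites the libvirt domain XML. Here is a hedged sketch of the core logic; the custom property name sound_device is an assumption, and the hooking calls should be checked against the vdsm hooks README for your version:

```python
# Sketch of a before_vm_start vdsm hook. The XML rewrite is factored out
# so it can be shown (and tested) without a vdsm host.
import os
import xml.dom.minidom

def set_sound_model(dom, model):
    """Set the model attribute on every <sound> device in a parsed
    libvirt domain XML document."""
    for sound in dom.getElementsByTagName('sound'):
        sound.setAttribute('model', model)
    return dom

def main():
    # Custom properties reach hooks as environment variables; the name
    # 'sound_device' is hypothetical and must match what you register
    # in the engine.
    model = os.environ.get('sound_device')
    if model:
        import hooking  # provided by vdsm on the host only
        domxml = hooking.read_domxml()
        hooking.write_domxml(set_sound_model(domxml, model))

if __name__ == '__main__':
    main()
```

On the engine side, the property would be registered with something like engine-config -s UserDefinedVMProperties='sound_device=^[a-zA-Z0-9]+$' followed by an engine restart; existing hook examples ship in the vdsm source tree under vdsm_hooks/.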

 

Thank You All!!



[ovirt-users] moVirt 2.0 released!

2017-06-30 Thread Filip Krepinsky
Hello everyone,

moVirt 2.0 has just been released and should arrive on your devices soon!
You can also get the APK from our GitHub [1].

The main feature of this release is managing multiple oVirt installations +
many other cool features [2].

Thanks to everybody who helped with testing, and especially big thanks to
Shira, who gave us lots of valuable input.

Have a nice day
Filip

[1]: https://github.com/oVirt/moVirt/releases/tag/v2.0
[2]: https://github.com/oVirt/moVirt/wiki/Changelog


Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread Denis Chaplygin
Hello!

On Fri, Jun 30, 2017 at 5:46 PM, cmc  wrote:

> I ran 'hosted-engine --vm-start' after trying to ping the engine and
> running 'hosted-engine --vm-status' (which said it wasn't running) and
> it reported that it was 'destroying storage' and starting the engine,
> though it did not start it. I could not see any evidence from
> 'hosted-engine --vm-status' or logs that it started.


That sounds really strange. I would suspect some storage problems or
something. As I told you earlier, the output of --vm-status may shed light on
that issue.


> By this point I
> was in a panic to get VMs running. So I had to fire up the old bare
> metal engine. This has been a very disappointing experience. I still
> have no idea why the IDs in 'host_id' differed from the spm ID, and
>

Did you try to migrate from the bare metal engine to the hosted engine?

>
>
> 1. Why did the VMs (apart from the Hosted Engine VM) not start on
> power up of the hosts? Is it because the hosts were powered down, that
> they stay in a down state on power up of the host?
>
>
The engine is responsible for starting those VMs. As you had no engine,
there was nothing to start them. The Hosted Engine tools are only
responsible for the engine VM, not other VMs.


> 2. Now that I have connected the bare metal engine back to the
> cluster, is there a way back, or do I have to start from scratch
> again? I imagine there is no way of getting the Hosted Engine running
> again. If not, what do I need to 'clean' all the hosts of the remnants
> of the failed deployment? I can of course reinitialise the LUN that
> the Hosted Engine was on - anything else?
>

I know there exists a 'bare metal to hosted engine' migration procedure,
but I doubt I know it well enough. If I remember correctly, you need to
take a backup of your bare metal engine database, run the migration
preparation script (which handles spm_id duplications), deploy your first
HE host, restore the database from the backup, and deploy more HE hosts.
I'm not sure those steps are correct; better to ask Martin about the
migration process.


Re: [ovirt-users] Help! No VMs start after reboot of cluster

2017-06-30 Thread cmc
The broker was reported as down - I recall there was something about
'Failed to getVmStats' in the systemctl output. I wasn't sure how to
check the storage from the oVirt point of view (the GUI was
unavailable). When I put the bare metal engine back, it did take a
short while for the storage to become available (it is FC storage).
The agent did not report any errors in systemctl.

On Fri, Jun 30, 2017 at 4:39 PM, David Gossage
 wrote:
>
>
> On Fri, Jun 30, 2017 at 10:34 AM, cmc  wrote:
>>
>> Hi Denis,
>>
>> Yes, I did check that and it said it was out of global maintenance
>> ('False' I think it said).
>>
>
> Did you check that the storage the HostedEngine VM attaches to is mounted
> and in a healthy state, and that the broker and agent services are running?
> Both have logs that may give some indication if they detect an issue.
>
>
>> Thanks,
>>
>> Cam
>>
>> On Fri, Jun 30, 2017 at 4:31 PM, Denis Chaplygin 
>> wrote:
>> > Hello!
>> >
>> > On Fri, Jun 30, 2017 at 4:35 PM, cmc  wrote:
>> >>
>> >> I restarted my 3 host cluster after setting it into global maintenance
>> >> mode and then shutting down all of the nodes and then bringing them up
>> >> again. I moved it out of global maintenance mode and no VM is running,
>> >> including the hosted engine.
>> >>
>> >> Any help greatly appreciated!
>> >
>> >
>> > Are you sure you are really out of global maintenance? Could you please
>> > post
>> > hosted-engine --vm-status output?
>
>


Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread cmc
I ran 'hosted-engine --vm-start' after trying to ping the engine and
running 'hosted-engine --vm-status' (which said it wasn't running) and
it reported that it was 'destroying storage' and starting the engine,
though it did not start it. I could not see any evidence from
'hosted-engine --vm-status' or logs that it started. By this point I
was in a panic to get VMs running. So I had to fire up the old bare
metal engine. This has been a very disappointing experience. I still
have no idea why the IDs in 'host_id' differed from the spm ID, and
why, when I put the cluster into global maintenance and shut down all
the hosts, the Hosted Engine did not come up, nor any of the VMs. I
don't feel confident in this any more. If I try deploying the
Hosted Engine again, I am not sure whether it will result in the same
non-functional cluster. It gave no error on deployment, but clearly
something was wrong.

I have two questions:

1. Why did the VMs (apart from the Hosted Engine VM) not start on
power-up of the hosts? Is it because the hosts were powered down that
the VMs stay in a Down state when the hosts come back up?

2. Now that I have connected the bare metal engine back to the
cluster, is there a way back, or do I have to start from scratch
again? I imagine there is no way of getting the Hosted Engine running
again. If not, what do I need to 'clean' all the hosts of the remnants
of the failed deployment? I can of course reinitialise the LUN that
the Hosted Engine was on - anything else?

Thanks

On Fri, Jun 30, 2017 at 4:30 PM, Denis Chaplygin  wrote:
> Hello!
>
> On Fri, Jun 30, 2017 at 4:19 PM, cmc  wrote:
>>
>> Help! I put the cluster into global maintenance, then powered off and
>> then on all of the nodes I have powered off and powered on all the
>> nodes. I have taken it out of global maintenance. No VM has started,
>> including the hosted engine. This is very bad. I am going to look
>> through logs to see why nothing has started. Help greatly appreciated.
>
>
> Global maintenance mode turns off high availability for the hosted engine
> VM. You should either cancel global maintenance or start the VM manually
> with hosted-engine --vm-start.
>
> Global maintenance was added to allow manual maintenance of the engine VM,
> so in that mode the state of the engine VM and of the engine itself is not
> managed, and you are free to stop the engine, the VM, or both, and do
> whatever you like; the hosted engine tools will not interfere. Obviously,
> when the engine VM dies while the cluster is in global maintenance (or all
> nodes reboot, as in your case), there is nothing to restart it :)


Re: [ovirt-users] Help! No VMs start after reboot of cluster

2017-06-30 Thread David Gossage
On Fri, Jun 30, 2017 at 10:34 AM, cmc  wrote:

> Hi Denis,
>
> Yes, I did check that and it said it was out of global maintenance
> ('False' I think it said).
>
>
Did you check that the storage the HostedEngine VM attaches to is mounted and
in a healthy state, and that the broker and agent services are running?
Both have logs that may give some indication if they detect an issue.


> Thanks,
>
> Cam
>
> On Fri, Jun 30, 2017 at 4:31 PM, Denis Chaplygin 
> wrote:
> > Hello!
> >
> > On Fri, Jun 30, 2017 at 4:35 PM, cmc  wrote:
> >>
> >> I restarted my 3 host cluster after setting it into global maintenance
> >> mode and then shutting down all of the nodes and then bringing them up
> >> again. I moved it out of global maintenance mode and no VM is running,
> >> including the hosted engine.
> >>
> >> Any help greatly appreciated!
> >
> >
> > Are you sure you are really out of global maintenance? Could you please
> post
> > hosted-engine --vm-status output?
>


Re: [ovirt-users] Help! No VMs start after reboot of cluster

2017-06-30 Thread Denis Chaplygin
Hello!

On Fri, Jun 30, 2017 at 5:34 PM, cmc  wrote:

>
> Yes, I did check that and it said it was out of global maintenance
> ('False' I think it said).
>
>
Well, then it should start the VM :-) Could you please share the
hosted-engine --vm-status output? It may contain some interesting information.


Re: [ovirt-users] Help! No VMs start after reboot of cluster

2017-06-30 Thread cmc
Hi Denis,

Yes, I did check that and it said it was out of global maintenance
('False' I think it said).

Thanks,

Cam

On Fri, Jun 30, 2017 at 4:31 PM, Denis Chaplygin  wrote:
> Hello!
>
> On Fri, Jun 30, 2017 at 4:35 PM, cmc  wrote:
>>
>> I restarted my 3 host cluster after setting it into global maintenance
>> mode and then shutting down all of the nodes and then bringing them up
>> again. I moved it out of global maintenance mode and no VM is running,
>> including the hosted engine.
>>
>> Any help greatly appreciated!
>
>
> Are you sure you are really out of global maintenance? Could you please post
> hosted-engine --vm-status output?


Re: [ovirt-users] Help! No VMs start after reboot of cluster

2017-06-30 Thread Denis Chaplygin
Hello!

On Fri, Jun 30, 2017 at 4:35 PM, cmc  wrote:

> I restarted my 3 host cluster after setting it into global maintenance
> mode and then shutting down all of the nodes and then bringing them up
> again. I moved it out of global maintenance mode and no VM is running,
> including the hosted engine.
>
> Any help greatly appreciated!
>

Are you sure you are really out of global maintenance? Could you please
post hosted-engine --vm-status output?


Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread Denis Chaplygin
Hello!

On Fri, Jun 30, 2017 at 4:19 PM, cmc  wrote:

> Help! I put the cluster into global maintenance, then powered off and
> then on all of the nodes I have powered off and powered on all the
> nodes. I have taken it out of global maintenance. No VM has started,
> including the hosted engine. This is very bad. I am going to look
> through logs to see why nothing has started. Help greatly appreciated.
>

Global maintenance mode turns off high availability for the hosted engine
VM. You should either cancel global maintenance or start the VM manually
with hosted-engine --vm-start.

Global maintenance was added to allow manual maintenance of the engine VM,
so in that mode the state of the engine VM and of the engine itself is not
managed, and you are free to stop the engine, the VM, or both, and do
whatever you like; the hosted engine tools will not interfere. Obviously,
when the engine VM dies while the cluster is in global maintenance (or all
nodes reboot, as in your case), there is nothing to restart it :)
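
The workflow described above boils down to the following command sequence (flags as in the hosted-engine man page; run on any hosted-engine host):

```sh
hosted-engine --set-maintenance --mode=global   # freeze HA before touching hosts
# ... perform host maintenance / reboots ...
hosted-engine --set-maintenance --mode=none     # leave global maintenance
hosted-engine --vm-status                       # check engine VM / HA state
hosted-engine --vm-start                        # start the engine VM manually if needed
```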


Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread cmc
I had no other choice but to power up the old bare metal engine to
be able to start the VMs. This is probably really bad, but I had to get
the VMs running.
I am guessing now that if a host is shut down rather than simply
rebooted, the VMs will not restart on power-up of the host. This
would not have been such a problem if the Hosted Engine had started.

So I'm not sure where to go from here...

I guess it is start from scratch again?

On Fri, Jun 30, 2017 at 3:19 PM, cmc  wrote:
> Help! I put the cluster into global maintenance, then powered off and
> then on all of the nodes I have powered off and powered on all the
> nodes. I have taken it out of global maintenance. No VM has started,
> including the hosted engine. This is very bad. I am going to look
> through logs to see why nothing has started. Help greatly appreciated.
>
> Thanks,
>
> Cam
>
> On Fri, Jun 30, 2017 at 1:00 PM, cmc  wrote:
>> So I can run from any node: hosted-engine --set-maintenance
>> --mode=global. By 'agents', you mean the ovirt-ha-agent, right? This
>> shouldn't affect the running of any VMs, correct? Sorry for the
>> questions, just want to do it correctly and not make assumptions :)
>>
>> Cheers,
>>
>> C
>>
>> On Fri, Jun 30, 2017 at 12:12 PM, Martin Sivak  wrote:
>>> Hi,
>>>
 Just to clarify: you mean the host_id in
 /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
 correct?
>>>
>>> Exactly.
>>>
>>> Put the cluster to global maintenance first. Or kill all agents (has
>>> the same effect).
>>>
>>> Martin
>>>
>>> On Fri, Jun 30, 2017 at 12:47 PM, cmc  wrote:
 Just to clarify: you mean the host_id in
 /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
 correct?

 On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak  wrote:
> Hi,
>
> cleaning metadata won't help in this case. Try transferring the
> spm_ids you got from the engine to the proper hosted engine hosts so
> the hosted engine ids match the spm_ids. Then restart all hosted
> engine services. I would actually recommend restarting all hosts after
> this change, but I have no idea how many VMs you have running.
>
> Martin
>
> On Thu, Jun 29, 2017 at 8:27 PM, cmc  wrote:
>> Tried running a 'hosted-engine --clean-metadata' as per
>> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
>> ovirt-ha-agent was not running anyway, but it fails with the following
>> error:
>>
>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
>> to start monitoring domain
>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>> during domain acquisition
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent
>> call last):
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 191, in _run_agent
>> return action(he)
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 67, in action_clean
>> return he.clean(options.force_cleanup)
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 345, in clean
>> self._initialize_domain_monitor()
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 823, in _initialize_domain_monitor
>> raise Exception(msg)
>> Exception: Failed to start monitoring domain
>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>> during domain acquisition
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, 
>> attempt '0'
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
>> occurred, giving up. Please review the log and consider filing a bug.
>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>
>> On Thu, Jun 29, 2017 at 6:10 PM, cmc  wrote:
>>> Actually, it looks like sanlock problems:
>>>
>>>"SanlockInitializationError: Failed to initialize sanlock, the
>>> number of errors has exceeded the limit"
>>>
>>>
>>>
>>> On Thu, Jun 29, 2017 at 5:10 PM, cmc  wrote:
 Sorry, I am mistaken, two hosts failed for the agent with the 
 following error:

 ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
 ERROR Failed to start monitoring domain
 (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
 during domain acquisition
 ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
 ERROR Shutting down the agent because of 3 failures in a row!

 What could cause these timeouts? Some other service not running?

 On Thu, Jun 29, 2017 at 5:03 PM, cmc  wrote:
>>

[ovirt-users] Help! No VMs start after reboot of cluster

2017-06-30 Thread cmc
Hi,

I restarted my 3 host cluster after setting it into global maintenance
mode and then shutting down all of the nodes and then bringing them up
again. I moved it out of global maintenance mode and no VM is running,
including the hosted engine.

Any help greatly appreciated!

Thanks,

Cam


Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread cmc
Help! I put the cluster into global maintenance, then powered off and
then on all of the nodes I have powered off and powered on all the
nodes. I have taken it out of global maintenance. No VM has started,
including the hosted engine. This is very bad. I am going to look
through logs to see why nothing has started. Help greatly appreciated.

Thanks,

Cam

On Fri, Jun 30, 2017 at 1:00 PM, cmc  wrote:
> So I can run from any node: hosted-engine --set-maintenance
> --mode=global. By 'agents', you mean the ovirt-ha-agent, right? This
> shouldn't affect the running of any VMs, correct? Sorry for the
> questions, just want to do it correctly and not make assumptions :)
>
> Cheers,
>
> C
>
> On Fri, Jun 30, 2017 at 12:12 PM, Martin Sivak  wrote:
>> Hi,
>>
>>> Just to clarify: you mean the host_id in
>>> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
>>> correct?
>>
>> Exactly.
>>
>> Put the cluster to global maintenance first. Or kill all agents (has
>> the same effect).
>>
>> Martin
>>
>> On Fri, Jun 30, 2017 at 12:47 PM, cmc  wrote:
>>> Just to clarify: you mean the host_id in
>>> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
>>> correct?
>>>
>>> On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak  wrote:
 Hi,

 cleaning metadata won't help in this case. Try transferring the
 spm_ids you got from the engine to the proper hosted engine hosts so
 the hosted engine ids match the spm_ids. Then restart all hosted
 engine services. I would actually recommend restarting all hosts after
 this change, but I have no idea how many VMs you have running.

 Martin

 On Thu, Jun 29, 2017 at 8:27 PM, cmc  wrote:
> Tried running a 'hosted-engine --clean-metadata' as per
> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
> ovirt-ha-agent was not running anyway, but it fails with the following
> error:
>
> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
> to start monitoring domain
> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
> during domain acquisition
> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent
> call last):
>   File 
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 191, in _run_agent
> return action(he)
>   File 
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 67, in action_clean
> return he.clean(options.force_cleanup)
>   File 
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 345, in clean
> self._initialize_domain_monitor()
>   File 
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 823, in _initialize_domain_monitor
> raise Exception(msg)
> Exception: Failed to start monitoring domain
> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
> during domain acquisition
> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, 
> attempt '0'
> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
> occurred, giving up. Please review the log and consider filing a bug.
> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>
> On Thu, Jun 29, 2017 at 6:10 PM, cmc  wrote:
>> Actually, it looks like sanlock problems:
>>
>>"SanlockInitializationError: Failed to initialize sanlock, the
>> number of errors has exceeded the limit"
>>
>>
>>
>> On Thu, Jun 29, 2017 at 5:10 PM, cmc  wrote:
>>> Sorry, I am mistaken, two hosts failed for the agent with the following 
>>> error:
>>>
>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>> ERROR Failed to start monitoring domain
>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>> during domain acquisition
>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>> ERROR Shutting down the agent because of 3 failures in a row!
>>>
>>> What could cause these timeouts? Some other service not running?
>>>
>>> On Thu, Jun 29, 2017 at 5:03 PM, cmc  wrote:
Both services are up on all three hosts. The broker logs just report:

 Thread-6549::INFO::2017-06-29
 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
 Thread-6549::INFO::2017-06-29
 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed

 Thanks,

 Cam

 On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak  
 wrote:
> Hi,
>
> please make su

Re: [ovirt-users] [Ovirt 4.0.6] Suggestion required for Network Throughput options

2017-06-30 Thread TranceWorldLogic .
Your understanding is correct: the issue is only in the
encryption/decryption process, but I have no idea why it does not work.
I found that on CentOS 7 we do not have rng-tools installed.

Is it required to install it for the random generator?

I changed nothing else; I just increased the number of queues on the vnet
device. It definitely increases throughput and creates multiple softIRQs
in the VM. But for normal (unencrypted) traffic none of this is required;
it already gives 10G throughput.

Thanks,
~Rohit

On Fri, Jun 30, 2017 at 7:08 PM, Yaniv Kaul  wrote:

>
>
> On Fri, Jun 30, 2017 at 4:14 PM, TranceWorldLogic . <
> tranceworldlo...@gmail.com> wrote:
>
>> Hi Yaniv,
>>
>> I have enabled random generator in cluster and also in VM.
>> But still not see any improvement in throughput.
>>
>> lsmod | grep -i virtio
>> virtio_rng     13019  0
>> virtio_balloon 13834  0
>> virtio_console 28115  2
>> virtio_blk     18156  4
>> virtio_scsi    18361  0
>> virtio_net     28024  0
>> virtio_pci     22913  0
>> virtio_ring    21524  7 virtio_blk,virtio_net,virtio_pci,virtio_rng,virtio_balloon,virtio_console,virtio_scsi
>> virtio         15008  7 virtio_blk,virtio_net,virtio_pci,virtio_rng,virtio_balloon,virtio_console,virtio_scsi
>>
>> Would you please check whether I am missing some virtio module?
>>
>> One more finding, if I set queue property in vnic profile then I got good
>> throughput.
>>
>
> Interesting - I had assumed the bottleneck would be the
> encryption/decryption process, not the network. What do you set exactly?
> Does it matter in non-encrypted traffic as well? Are the packets (and the
> whole communication) large or small (i.e, would jumbo frames help) ?
>  Y.
>
>
>> Thanks,
>> ~Rohit
>>
>>
>> On Fri, Jun 30, 2017 at 12:11 AM, Yaniv Kaul  wrote:
>>
>>>
>>>
>>> On Thu, Jun 29, 2017 at 4:02 PM, TranceWorldLogic . <
>>> tranceworldlo...@gmail.com> wrote:
>>>
 Got it, just I need to do modprobe to add virtio-rng driver.
 I will try with this option.

>>>
>>> Make sure it is checked on the cluster.
>>> Y.
>>>

 Thanks for your help,
 ~Rohit

 On Thu, Jun 29, 2017 at 6:20 PM, TranceWorldLogic . <
 tranceworldlo...@gmail.com> wrote:

> Hi,
>
> I am using host as Centos 7.3 and guest also centos 7.3
> it have 3.10 kernel version.
>
> But I not see virtio-rng in guest VM. Is this module come with kernel
> or separately I have to install ?
>
> Thanks,
> ~Rohit
>
> On Thu, Jun 29, 2017 at 5:28 PM, Yaniv Kaul  wrote:
>
>>
>>
>> On Thu, Jun 29, 2017 at 10:07 AM, TranceWorldLogic . <
>> tranceworldlo...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> To increase network throughput we have changed txqueuelen of network
>>> device and bridge manually. And observed  improved throughput.
>>>
>>
>> Interesting, as I've read the default (1000) should be good enough
>> for 10g, for example.
>>
>> Are you actually seeing errors on the interface (overruns) and such?
>>
>>
>>>
>>> But in ovirt I not see any option to increase txqueuelen.
>>>
>>
>> Perhaps use ifup-local script to set it when the interface goes up?
>>
>>
>>> Can someone suggest me what will be the right way to increase
>>> throughput ?
>>>
>>> Note: I am trying to increase throughput for  ipsec packets.
>>>
>>
>> For ipsec, probably best to ensure virtio-rng is enabled.
>> Y.
>>
>>
>>>
>>> Thanks,
>>> ~Rohit
>>>
>>>
>>>
>>
>

>>>
>>
>
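
Yaniv's ifup-local suggestion in the quoted thread could be sketched as follows; treat it as a configuration fragment, with the interface names and queue length as placeholders (on EL7 the network scripts invoke /sbin/ifup-local with the interface name after it comes up):

```sh
#!/bin/sh
# /sbin/ifup-local -- called with the name of the interface that just
# came up; bump txqueuelen on the bridge and its uplink (placeholders).
case "$1" in
    ovirtmgmt|eno1)
        /sbin/ip link set dev "$1" txqueuelen 10000
        ;;
esac
```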


Re: [ovirt-users] [Ovirt 4.0.6] Suggestion required for Network Throughput options

2017-06-30 Thread Yaniv Kaul
On Fri, Jun 30, 2017 at 4:14 PM, TranceWorldLogic . <
tranceworldlo...@gmail.com> wrote:

> Hi Yaniv,
>
> I have enabled random generator in cluster and also in VM.
> But still not see any improvement in throughput.
>
> lsmod | grep -i virtio
> virtio_rng     13019  0
> virtio_balloon 13834  0
> virtio_console 28115  2
> virtio_blk     18156  4
> virtio_scsi    18361  0
> virtio_net     28024  0
> virtio_pci     22913  0
> virtio_ring    21524  7 virtio_blk,virtio_net,virtio_pci,virtio_rng,virtio_balloon,virtio_console,virtio_scsi
> virtio         15008  7 virtio_blk,virtio_net,virtio_pci,virtio_rng,virtio_balloon,virtio_console,virtio_scsi
>
> Would you please check whether I am missing some virtio module?
>
> One more finding, if I set queue property in vnic profile then I got good
> throughput.
>

Interesting - I had assumed the bottleneck would be the
encryption/decryption process, not the network. What do you set exactly?
Does it matter for non-encrypted traffic as well? Are the packets (and the
whole communication) large or small (i.e., would jumbo frames help)?
 Y.


> Thanks,
> ~Rohit
>
>
> On Fri, Jun 30, 2017 at 12:11 AM, Yaniv Kaul  wrote:
>
>>
>>
>> On Thu, Jun 29, 2017 at 4:02 PM, TranceWorldLogic . <
>> tranceworldlo...@gmail.com> wrote:
>>
>>> Got it, just I need to do modprobe to add virtio-rng driver.
>>> I will try with this option.
>>>
>>
>> Make sure it is checked on the cluster.
>> Y.
>>
>>>
>>> Thanks for your help,
>>> ~Rohit
>>>
>>> On Thu, Jun 29, 2017 at 6:20 PM, TranceWorldLogic . <
>>> tranceworldlo...@gmail.com> wrote:
>>>
 Hi,

 I am using host as Centos 7.3 and guest also centos 7.3
 it have 3.10 kernel version.

 But I not see virtio-rng in guest VM. Is this module come with kernel
 or separately I have to install ?

 Thanks,
 ~Rohit

 On Thu, Jun 29, 2017 at 5:28 PM, Yaniv Kaul  wrote:

>
>
> On Thu, Jun 29, 2017 at 10:07 AM, TranceWorldLogic . <
> tranceworldlo...@gmail.com> wrote:
>
>> Hi,
>>
>> To increase network throughput we have changed txqueuelen of network
>> device and bridge manually. And observed  improved throughput.
>>
>
> Interesting, as I've read the default (1000) should be good enough for
> 10g, for example.
>
> Are you actually seeing errors on the interface (overruns) and such?
>
>
>>
>> But in ovirt I not see any option to increase txqueuelen.
>>
>
> Perhaps use ifup-local script to set it when the interface goes up?
>
>
>> Can someone suggest me what will be the right way to increase
>> throughput ?
>>
>> Note: I am trying to increase throughput for  ipsec packets.
>>
>
> For ipsec, probably best to ensure virtio-rng is enabled.
> Y.
>
>
>>
>> Thanks,
>> ~Rohit
>>
>>
>>
>

>>>
>>
>


Re: [ovirt-users] [Ovirt 4.0.6] Suggestion required for Network Throughput options

2017-06-30 Thread TranceWorldLogic .
Hi Yaniv,

I have enabled the random number generator in the cluster and also in the VM,
but I still don't see any improvement in throughput.

lsmod | grep -i virtio
virtio_rng     13019  0
virtio_balloon 13834  0
virtio_console 28115  2
virtio_blk     18156  4
virtio_scsi    18361  0
virtio_net     28024  0
virtio_pci     22913  0
virtio_ring    21524  7 virtio_blk,virtio_net,virtio_pci,virtio_rng,virtio_balloon,virtio_console,virtio_scsi
virtio         15008  7 virtio_blk,virtio_net,virtio_pci,virtio_rng,virtio_balloon,virtio_console,virtio_scsi

Would you please check whether I am missing some virtio module?

One more finding, if I set queue property in vnic profile then I got good
throughput.

Thanks,
~Rohit


On Fri, Jun 30, 2017 at 12:11 AM, Yaniv Kaul  wrote:

>
>
> On Thu, Jun 29, 2017 at 4:02 PM, TranceWorldLogic . <
> tranceworldlo...@gmail.com> wrote:
>
>> Got it, just I need to do modprobe to add virtio-rng driver.
>> I will try with this option.
>>
>
> Make sure it is checked on the cluster.
> Y.
>
>>
>> Thanks for your help,
>> ~Rohit
>>
>> On Thu, Jun 29, 2017 at 6:20 PM, TranceWorldLogic . <
>> tranceworldlo...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am using host as Centos 7.3 and guest also centos 7.3
>>> it have 3.10 kernel version.
>>>
>>> But I not see virtio-rng in guest VM. Is this module come with kernel or
>>> separately I have to install ?
>>>
>>> Thanks,
>>> ~Rohit
>>>
>>> On Thu, Jun 29, 2017 at 5:28 PM, Yaniv Kaul  wrote:
>>>


 On Thu, Jun 29, 2017 at 10:07 AM, TranceWorldLogic . <
 tranceworldlo...@gmail.com> wrote:

> Hi,
>
> To increase network throughput we have changed txqueuelen of network
> device and bridge manually. And observed  improved throughput.
>

 Interesting, as I've read the default (1000) should be good enough for
 10g, for example.

 Are you actually seeing errors on the interface (overruns) and such?


>
> But in ovirt I not see any option to increase txqueuelen.
>

 Perhaps use ifup-local script to set it when the interface goes up?


> Can someone suggest me what will be the right way to increase
> throughput ?
>
> Note: I am trying to increase throughput for  ipsec packets.
>

 For ipsec, probably best to ensure virtio-rng is enabled.
 Y.


>
> Thanks,
> ~Rohit
>
>
>

>>>
>>
>


Re: [ovirt-users] Frustration defines the deployment of Hosted Engine

2017-06-30 Thread Michal Skrivanek

> On 26 Jun 2017, at 15:17, Adam Litke  wrote:
> 
> 
> 
> On Mon, Jun 26, 2017 at 4:04 AM, Ben De Luca  > wrote:
> Is there some way to make vnc/web the default for each VM? I have searched 
> and not found it. :/ 
> 
> I feel like there used to be a way to do this at the cluster level but I also 
> could not find it.  Your best bet would be to configure VM templates and set 
> the console to vnc there.

The default is an OS type setting, you can change/override that in osinfo[1].
Another alternative is to just use a template with it (including a system-wide 
change via Blank template)

You can also use the combined display of SPICE+VNC which allows you to connect 
with either protocol
For the client it’s a user’s preference setting (to use remote-viewer or 
novnc/spice-html5)

As for the original question, none of the existing MacOS SPICE clients are 
enterprise quality (that's subjective, of course), but configuring both gives 
you an easy way to switch to vnc for a particular user or in case of problems

Thanks,
michal

[1] http://www.ovirt.org/develop/release-management/features/virt/os-info/

>  
> 
> 
> On Mon, 26 Jun 2017 at 8:35 am, Fabrice Bacchella 
> mailto:fabrice.bacche...@orange.fr>> wrote:
> 
> 
>> 4. There is any good SPICE client for macOS? Or should I just use the HTML5 
>> version instead?
>> 
>> I'm afraid not.
>> Y.
> 
> There is one SPICE client, Remote Viewer, but I would not call it good; it's 
> very slow. So I tend to use the embedded vnc viewer.
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> Adam Litke



Re: [ovirt-users] Frustration defines the deployment of Hosted Engine

2017-06-30 Thread Yaniv Kaul
On Fri, Jun 30, 2017 at 2:30 AM, Ben De Luca  wrote:

> Are we running the same code?
>
> I applaud the amount of effort, but I can't imagine there is much depth of
> testing. Oh that's right, we are the testers for RHEL
>


Hi Ben,

It's always great to get constructive feedback.
I've tried to look for bugs you have reported, for RHEL, oVirt or Fedora,
but did not find any[1].
Perhaps I was searching with the wrong username.

Respectfully[2],
Y.

[1]
https://bugzilla.redhat.com/buglist.cgi?chfield=%5BBug%20creation%5D&chfieldto=Now&email1=bdeluca%40gmail.com&emailreporter1=1&emailtype1=substring&f3=OP&j3=OR&list_id=7540196&query_format=advanced

[2] http://www.ovirt.org/community/about/community-guidelines/#be-respectful


> On 29 June 2017 at 21:27, Frank Wall  wrote:
>
>> On Mon, Jun 26, 2017 at 10:51:52AM +0200, InterNetX - Juergen
>> Gotteswinter wrote:
>> > > 2. Should I migrate from XenServer to oVirt? This is biased, I know,
>> but
>> > > I would like to hear opinions. The folks with @redhat.com email
>> > > addresses will know how to advocate in favor of oVirt.
>> >
>> > in term of reliability, better stay with xenserver
>>
>> Seriously, you should have provided some more insights to support your
>> statement. What reliability issues did you encounter in oVirt that are
>> not present in Xenserver?
>>
>> I have deployed *several* oVirt setups since 2012 and haven't found a
>> single reliability issue since then. Of course, there have been some bugs,
>> but the oVirt project made *tremendous* progress since 2012.
>>
>>
>> Regards
>> - Frank
>>
>
>
>
>


Re: [ovirt-users] vm shutdown long delay is a problem for users of pools

2017-06-30 Thread Tomáš Golembiovský
Hi,

On Wed, 28 Jun 2017 12:05:46 +0200
"Paul"  wrote:

> Hi Sharon,
> 
> Thanks for your comments. My findings on those:
> 
> 1.   Yes all VMs have ovirt-guest-agent installed and active, shutdown 
> time is still about 90 seconds. Any other way to reduce this? I see in the 
> logs “Jun 28 11:49:03 pool python: Shutdown scheduled for Wed 2017-06-28 
> 11:50:03 CEST, use 'shutdown -c' to cancel.” And then a wait of 60 seconds. 
> Is it possible to adjust this delay?

I've got good news for you... and, of course, some not so good news...

The delay can be configured in the engine with engine-config. The
corresponding value is 'VmGracefulShutdownTimeout' and is in seconds
(default is 30 seconds). I.e. you can run the following to disable the
delay.

# engine-config --set VmGracefulShutdownTimeout=0  

The not-so-good news is that on Linux (which is your case, if I
understood correctly) the delay only works in whole minutes, and the value of
VmGracefulShutdownTimeout is rounded *up* to whole minutes. That is,
anything between 1 and 59 seconds is turned into 60 seconds.
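The rounding behaviour described above can be sketched as a tiny shell function (the helper name is ours, not an oVirt tool; it just models what happens to the configured value on Linux guests):

```shell
# Model of VmGracefulShutdownTimeout handling on Linux guests:
# 0 disables the delay, anything else is rounded up to whole minutes.
round_to_minutes() {
  s=$1
  if [ "$s" -eq 0 ]; then
    echo 0
  else
    echo $(( ((s + 59) / 60) * 60 ))
  fi
}

round_to_minutes 0    # prints 0  (delay disabled)
round_to_minutes 30   # prints 60 (1-59 seconds become a full minute)
round_to_minutes 61   # prints 120
```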

Hope that helps,

Tomas

> 
> 2.   Clicking twice works. Thanks for the tip!
> 
> 3.
> 
> a.   More VMs per user: yes, could be a good option
> 
> b.  I have some trouble with the “console disconnect action” and filed a 
> bug for it last week [1]. Any disconnect action (except shutdown) combined 
> with “strict user checking” (security wise recommended) blocks and depletes 
> pool resources and in my opinion jeopardizes the pool functionality. I am 
> curious what your thoughts are on this.
> 
> Kind regards,
> 
> Paul
> 
> [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1464396
> 
>  
> 
> From: Sharon Gratch [mailto:sgra...@redhat.com] 
> Sent: dinsdag 27 juni 2017 19:04
> To: Paul 
> Cc: users 
> Subject: Re: [ovirt-users] vm shutdown long delay is a problem for users of 
> pools
> 
>  
> 
> Hi,
> 
> Please see comments below.
> 
>  
> 
> On Thu, Jun 22, 2017 at 7:15 PM, Paul mailto:p...@kenla.nl> > 
> wrote:
> 
> Hi,
> 
> Shutting down VM’s in the portal with the red downfacing arrow takes quite 
> some time (about 90 seconds). I read this is mainly due to a 60 second delay 
> in the ovirt-guest-agent. I got used to right-click and use “power off” 
> instead of “shutdown”, which is fine.
> 
>  
> 
> My users make use of VM in a VM-pool. They get assigned a VM and after 
> console disconnect the VM shuts down (default recommended behavior). My issue 
> is that the users stays assigned to this VM for the full 90 seconds and 
> cannot do “power off”. Suppose he disconnected by accident, he has to wait 90 
> seconds until he is assigned to the pool again until he can connect to 
> another VM. 
> 
>  
> 
> My questions are:
> 
> -  Is it possible to decrease the time delay of a VM shutdown? 90 
> seconds is quite a lot, 10 seconds should be enough
> 
> ​​
> 
> ​Is ovirt-guest-agent installed on all pool's VMs? Consider installing 
> ovirt-guest-agent in all VMs in your Pool to decrease the time taken for the 
> VM shutdown. 
> 
>  
> 
> -  Is it possible for normal users to use “power off”?
> 
> ​There is no option in UserPortal to power-off a VM but you can 
> 
> ​try to click twice (sequential clicks) on the 'shutdown' button. Two 
> sequential shutdown requests are handled in oVirt as "power off".
> 
> -  Is it possible to “unallocate” the user from a VM if it is 
> powering down? So he can allocate another VM
> 
> ​You can consider assigning two VMs per each user, if possible of-course (via 
> WebAdmin->edit Pool -> and set "Maximum number of VMs per user" field to "2") 
> so that way while one VM is still shutting down, the user can switch and 
> connect to a second VM without waiting.
> 
> Another option is to create a pool with a different policy for console 
> disconnecting so that the VM won't shutdown each time the user close the  
> console (via WebAdmin->Pool->Console tab->"Console Disconnect Action"). 
> Consider changing this field to "Lock screen" or "Logout user" instead of 
> "shutdown virtual machine". 
> This policy will avoid accidentally console disconnection waiting each 
> time...but on the other hand the VM state will remain as is since no shutdown 
> occurs, so it really depends on your requirements.
> 
> Regards,
> 
> Sharon
> 
>  
> 
> Kind regards,
> 
>  
> 
> Paul
> 
> 
> 
>  
> 


-- 
Tomáš Golembiovský 


Re: [ovirt-users] add fourth full gluster node and remove arbiter: ovirt 4.1 with hosted engine

2017-06-30 Thread knarra

On 06/30/2017 04:53 PM, yayo (j) wrote:


2017-06-30 12:54 GMT+02:00 yayo (j) >:


The actual arbiter must be removed because it is too obsolete. So, I
need to add the new "full replicated" node, but I want to know
what the steps are to add a new "full replicated" node and remove
the arbiter node (also a way to move the arbiter role to the new
node, if needed). Extra info: I want to know if I can do this on
an existing ovirt gluster Data Domain (called Data01) because we
have many VMs running on it.


Hi,

I have found this doc from RH about replacing host in a gluster env: 
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Replacing_Hosts.html


Can I use the command described at point 7?


# gluster volume replace-brick vol sys0.example.com:/rhs/brick1/b1 sys5.example.com:/rhs/brick1/b1 commit force

volume replace-brick: success: replace-brick commit successful


The question is: will the replaced node be a data node (a "full 
replicated" node) or will it be an arbiter again?

It will be an arbiter again.


Thank you





Re: [ovirt-users] add fourth full gluster node and remove arbiter: ovirt 4.1 with hosted engine

2017-06-30 Thread knarra

On 06/30/2017 04:24 PM, yayo (j) wrote:


2017-06-30 11:01 GMT+02:00 knarra >:


You do not need to remove the arbiter node as you are getting the
advantage of saving on space by having this config.

Since you have a new node, you can add it as a fourth node and create
another gluster volume (replica 3) out of this node plus the other
two nodes and run vm images there as well.


Hi,

And thanks for the answer. The actual arbiter must be removed because 
it is too obsolete. So, I need to add the new "full replicated" node, but 
I want to know what the steps are to add a new "full replicated" node
To add a fully replicated node you need to reduce the replica count to 
2 and then add a new brick to the volume so that it becomes replica 3 again. 
Reducing the replica count by removing a brick from a replica/arbiter volume 
cannot currently be done from the UI; it has to be done using the gluster CLI.
 AFAIR, there was an issue where VMs went into a paused state when 
reducing the replica count and increasing it back to 3. Not sure if this 
still holds with the latest release.
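The CLI-only part described above would look roughly like the following. This is a dry-run sketch: the volume name and brick paths are placeholders, and the leading "echo" must be dropped to actually run the commands (only on a healthy cluster, after taking backups):

```shell
# Dry-run sketch of shrinking a replica 3 (arbiter) volume to replica 2
# and growing it back to a full replica 3 with the new node's brick.
VOL=data01
ARBITER_BRICK=arbiter.example.com:/rhgs/bricks/data
NEW_BRICK=node4.example.com:/rhgs/bricks/data

echo gluster volume remove-brick "$VOL" replica 2 "$ARBITER_BRICK" force
echo gluster volume add-brick "$VOL" replica 3 "$NEW_BRICK"
```

After the add-brick, wait for self-heal to finish before putting load back on the volume.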


Any specific reason why you want to move to full replication instead of 
using an arbiter node ?




and remove the arbiter node (Also a way to move the arbiter role to 
the new node, If needed)
To move arbiter role to a new node you can move the node to maintenance 
, add  new node and replace  old brick with new brick. You can follow 
the steps below to do that.


 * Move the node to be replaced into Maintenance mode
 * Prepare the replacement node
 * Prepare bricks on that node.
 * Create replacement brick directories
 * Ensure the new directories are owned by the vdsm user and the kvm group.
 * # mkdir /rhgs/bricks/engine
 * # chown vdsm:kvm /rhgs/bricks/engine
 * # mkdir /rhgs/bricks/data
 * # chown vdsm:kvm /rhgs/bricks/data
 * Run the following command from one of the healthy cluster members:
 * # gluster peer probe 
 * Add the new host to the cluster.
 * Add new host address to gluster network
 * Click Network Interfaces sub-tab.
 * Click Set up Host Networks.
 * Drag and drop the glusternw network onto the IP address of the new host.
 * Click OK
 * Replace the old brick with the brick on the new host
 * Click the Bricks sub-tab.
 * Verify that brick heal completes successfully.
 * In the Hosts tab, right-click on the old host and click Remove.
 * Clean old host metadata
 * # hosted-engine --clean-metadata --host-id= --force-clean
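The brick-replacement step in the list above can be sketched as a command (dry run; hostnames, volume name and paths are placeholders, so drop the leading "echo" to execute for real, and only after the new host has been peer-probed and its brick directories created and chowned as described):

```shell
# Dry-run sketch of replacing the old brick with the one on the new host.
VOL=engine
OLD_BRICK=oldhost.example.com:/rhgs/bricks/engine
NEW_BRICK=newhost.example.com:/rhgs/bricks/engine

echo gluster volume replace-brick "$VOL" "$OLD_BRICK" "$NEW_BRICK" commit force
```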



. Extra info: I want to know if I can do this on an existing ovirt 
gluster Data Domain (called Data01) because we have many vm runnig on it.
When you move your node to maintenance, all the VMs running on that node 
will be migrated to another node, and since you have two other nodes up and 
running there should not be any problem.


thank you




Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread cmc
So I can run from any node: hosted-engine --set-maintenance
--mode=global. By 'agents', you mean the ovirt-ha-agent, right? This
shouldn't affect the running of any VMs, correct? Sorry for the
questions, just want to do it correctly and not make assumptions :)

Cheers,

C

On Fri, Jun 30, 2017 at 12:12 PM, Martin Sivak  wrote:
> Hi,
>
>> Just to clarify: you mean the host_id in
>> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
>> correct?
>
> Exactly.
>
> Put the cluster to global maintenance first. Or kill all agents (has
> the same effect).
>
> Martin
>
> On Fri, Jun 30, 2017 at 12:47 PM, cmc  wrote:
>> Just to clarify: you mean the host_id in
>> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
>> correct?
>>
>> On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak  wrote:
>>> Hi,
>>>
>>> cleaning metadata won't help in this case. Try transferring the
>>> spm_ids you got from the engine to the proper hosted engine hosts so
>>> the hosted engine ids match the spm_ids. Then restart all hosted
>>> engine services. I would actually recommend restarting all hosts after
>>> this change, but I have no idea how many VMs you have running.
>>>
>>> Martin
>>>
>>> On Thu, Jun 29, 2017 at 8:27 PM, cmc  wrote:
 Tried running a 'hosted-engine --clean-metadata" as per
 https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
 ovirt-ha-agent was not running anyway, but it fails with the following
 error:

 ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
 to start monitoring domain
 (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
 during domain acquisition
 ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent
 call last):
   File 
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
 line 191, in _run_agent
 return action(he)
   File 
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
 line 67, in action_clean
 return he.clean(options.force_cleanup)
   File 
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
 line 345, in clean
 self._initialize_domain_monitor()
   File 
 "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
 line 823, in _initialize_domain_monitor
 raise Exception(msg)
 Exception: Failed to start monitoring domain
 (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
 during domain acquisition
 ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
 WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt 
 '0'
 ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
 occurred, giving up. Please review the log and consider filing a bug.
 INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down

 On Thu, Jun 29, 2017 at 6:10 PM, cmc  wrote:
> Actually, it looks like sanlock problems:
>
>"SanlockInitializationError: Failed to initialize sanlock, the
> number of errors has exceeded the limit"
>
>
>
> On Thu, Jun 29, 2017 at 5:10 PM, cmc  wrote:
>> Sorry, I am mistaken, two hosts failed for the agent with the following 
>> error:
>>
>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>> ERROR Failed to start monitoring domain
>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>> during domain acquisition
>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>> ERROR Shutting down the agent because of 3 failures in a row!
>>
>> What could cause these timeouts? Some other service not running?
>>
>> On Thu, Jun 29, 2017 at 5:03 PM, cmc  wrote:
>>> Both services are up on all three hosts. The broke logs just report:
>>>
>>> Thread-6549::INFO::2017-06-29
>>> 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
>>> Connection established
>>> Thread-6549::INFO::2017-06-29
>>> 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
>>> Connection closed
>>>
>>> Thanks,
>>>
>>> Cam
>>>
>>> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak  wrote:
 Hi,

 please make sure that both ovirt-ha-agent and ovirt-ha-broker services
 are restarted and up. The error says the agent can't talk to the
 broker. Is there anything in the broker.log?

 Best regards

 Martin Sivak

 On Thu, Jun 29, 2017 at 4:42 PM, cmc  wrote:
> I've restarted those two services across all hosts, have taken the
> Hosted Engine host out of maintenance, and when I try to migrate the
> Hosted Engine over to another host, it reports that all thre

Re: [ovirt-users] add fourth full gluster node and remove arbiter: ovirt 4.1 with hosted engine

2017-06-30 Thread yayo (j)
2017-06-30 12:54 GMT+02:00 yayo (j) :

> The actual arbiter must be removed because it is too obsolete. So, I need to
> add the new "full replicated" node, but I want to know what the steps are
> to add a new "full replicated" node and remove the arbiter node (also a
> way to move the arbiter role to the new node, if needed). Extra info: I
> want to know if I can do this on an existing ovirt gluster Data Domain
> (called Data01) because we have many VMs running on it.
>

Hi,

I have found this doc from RH about replacing host in a gluster env:
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Replacing_Hosts.html

Can I use the command described at point 7?


# gluster volume replace-brick vol sys0.example.com:/rhs/brick1/b1 sys5.example.com:/rhs/brick1/b1 commit force
volume replace-brick: success: replace-brick commit successful


The question is: The replaced node will be a data node (a "full replicated"
node) or will be again an arbiter?

Thank you


Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread Martin Sivak
Hi,

> Just to clarify: you mean the host_id in
> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
> correct?

Exactly.

Put the cluster to global maintenance first. Or kill all agents (has
the same effect).

Martin
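A rough way to cross-check the two IDs being discussed; treat this as a sketch, since the engine DB table name below is from memory and may differ between versions:

```shell
# On the engine machine: list the SPM id the engine assigned to each host.
# (vds_spm_id_map is the table name as far as I recall; verify before use.)
sudo -u postgres psql engine -c 'select vds_id, vds_spm_id from vds_spm_id_map;'

# On each hosted-engine host: this host_id must match that host's vds_spm_id.
grep '^host_id=' /etc/ovirt-hosted-engine/hosted-engine.conf
```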

On Fri, Jun 30, 2017 at 12:47 PM, cmc  wrote:
> Just to clarify: you mean the host_id in
> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
> correct?
>
> On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak  wrote:
>> Hi,
>>
>> cleaning metadata won't help in this case. Try transferring the
>> spm_ids you got from the engine to the proper hosted engine hosts so
>> the hosted engine ids match the spm_ids. Then restart all hosted
>> engine services. I would actually recommend restarting all hosts after
>> this change, but I have no idea how many VMs you have running.
>>
>> Martin
>>
>> On Thu, Jun 29, 2017 at 8:27 PM, cmc  wrote:
>>> Tried running a 'hosted-engine --clean-metadata" as per
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
>>> ovirt-ha-agent was not running anyway, but it fails with the following
>>> error:
>>>
>>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
>>> to start monitoring domain
>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>> during domain acquisition
>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent
>>> call last):
>>>   File 
>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>>> line 191, in _run_agent
>>> return action(he)
>>>   File 
>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>>> line 67, in action_clean
>>> return he.clean(options.force_cleanup)
>>>   File 
>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>> line 345, in clean
>>> self._initialize_domain_monitor()
>>>   File 
>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>> line 823, in _initialize_domain_monitor
>>> raise Exception(msg)
>>> Exception: Failed to start monitoring domain
>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>> during domain acquisition
>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
>>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt 
>>> '0'
>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
>>> occurred, giving up. Please review the log and consider filing a bug.
>>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>>
>>> On Thu, Jun 29, 2017 at 6:10 PM, cmc  wrote:
 Actually, it looks like sanlock problems:

"SanlockInitializationError: Failed to initialize sanlock, the
 number of errors has exceeded the limit"



 On Thu, Jun 29, 2017 at 5:10 PM, cmc  wrote:
> Sorry, I am mistaken, two hosts failed for the agent with the following 
> error:
>
> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
> ERROR Failed to start monitoring domain
> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
> during domain acquisition
> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
> ERROR Shutting down the agent because of 3 failures in a row!
>
> What could cause these timeouts? Some other service not running?
>
> On Thu, Jun 29, 2017 at 5:03 PM, cmc  wrote:
>> Both services are up on all three hosts. The broke logs just report:
>>
>> Thread-6549::INFO::2017-06-29
>> 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
>> Connection established
>> Thread-6549::INFO::2017-06-29
>> 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
>> Connection closed
>>
>> Thanks,
>>
>> Cam
>>
>> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak  wrote:
>>> Hi,
>>>
>>> please make sure that both ovirt-ha-agent and ovirt-ha-broker services
>>> are restarted and up. The error says the agent can't talk to the
>>> broker. Is there anything in the broker.log?
>>>
>>> Best regards
>>>
>>> Martin Sivak
>>>
>>> On Thu, Jun 29, 2017 at 4:42 PM, cmc  wrote:
 I've restarted those two services across all hosts, have taken the
 Hosted Engine host out of maintenance, and when I try to migrate the
 Hosted Engine over to another host, it reports that all three hosts
 'did not satisfy internal filter HA because it is not a Hosted Engine
 host'.

 On the host that the Hosted Engine is currently on it reports in the 
 agent.log:

 ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR
 Connection closed: Connection closed
 Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
 ovirt_hosted_engine_ha.l

[ovirt-users] Upgrading HC from 4.0 to 4.1

2017-06-30 Thread Gianluca Cecchi
Hello,
I'm going to try to update an HC environment to 4.1, currently on 4.0, with
3 nodes on CentOS 7.3 and one of them configured as arbiter.

Any particular caveat in HC?
Are the steps below, normally used for Self Hosted Engine environments the
only ones to consider?

- update repos on the 3 hosts and on the engine vm
- global maintenance
- update engine
- update also os packages of engine vm
- shutdown engine vm
- disable global maintenance
- verify engine vm boots and functionality is ok
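The engine-side steps above map roughly to the following commands. A sketch only: the package glob and exact flags are assumptions taken from the generic hosted-engine upgrade flow, so check the 4.1 documentation before running any of it:

```shell
# Hedged sketch of the engine-side upgrade sequence listed above.
hosted-engine --set-maintenance --mode=global   # on one of the HE hosts

# On the engine VM:
yum update "ovirt-*-setup*"
engine-setup
yum update          # remaining OS packages on the engine VM
shutdown -h now     # shut the engine VM down

# Back on a host: leave maintenance, and the HA agents restart the engine VM.
hosted-engine --set-maintenance --mode=none
```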
Then
- update hosts: is the preferred way the GUI itself, which takes care
of moving VMs, maintenance and such, or should I proceed manually?

Is there a preferred order with which I have to update the hosts, after
updating the engine? Arbiter for first or as the latest or not important at
all?

Any possible problem having mismatched versions of glusterfs packages until
I complete all 3 hosts? Any known bugs going from 4.0 to 4.1 and the
related glusterfs components?

Thanks in advance,
Gianluca


Re: [ovirt-users] Unable to rename disks via REST API

2017-06-30 Thread Bruno Rodriguez
Finally I was able to rename the disks. The problem was in how I handled
that PUT request and my failure to check the libcurl documentation (I was
treating the XML I was sending as a string instead of a file).

Sorry, I needed this script working as fast as possible, so I didn't
try to reproduce the 500 error, but I wouldn't be surprised if it was
because of a totally wrong URL.

Sorry again :(
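For anyone hitting the same thing, this is roughly what the working request looks like. A dry-run sketch: the server name is a placeholder, the IDs are the ones from my earlier mail, and credentials/TLS options are omitted; with libcurl the string-vs-file distinction corresponds to `--data 'body'` versus `--data @body.xml`:

```shell
# Dry-run sketch of the v3 disk-rename request.
# Drop the leading "echo" (and add authentication) to run it for real.
VM_ID=62498f51-0203-48b9-83c8-4c6f3bdfe05c
DISK_ID=583ed952-46a8-4bc5-8a27-8660e4a24ea2
URL="https://myserver/ovirt-engine/api/v3/vms/$VM_ID/disks/$DISK_ID"
BODY='<disk><alias>Alias_for_disk</alias></disk>'

echo curl -X PUT -H 'Content-Type: application/xml' --data "$BODY" "$URL"
```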

On Wed, Jun 28, 2017 at 1:56 PM, Juan Hernández  wrote:

> On 06/28/2017 01:34 PM, Bruno Rodriguez wrote:
> > Shit, I got it. Sorry
> >
> > The problem is that I was accessing to the v4 disk-attachment id and I
> > was getting everything quite messed up. I was doing all of this while
> > creating a machine, so I was trying to rename the disk at the same time
> > it's being created.
> >
> > I'll delete all the "if ($ovirt_major == 3)" I have in my code. If I
> > have any problem I'll let you know.
> >
> > Thank you
> >
>
> If what you want to do is set the name of the disk for a VM that you are
> creating then you can just set it when adding the disk:
>
>   POST /ovirt-engine/api/vms/123/diskattachments
>
>   <disk_attachment>
>     <disk>
>       <name>yourfavoritename</name>
>       ...
>     </disk>
>     ...
>   </disk_attachment>
>
> That way you don't need to modify it later.
>
> Also, in general, try to wait till the objects are completely created
> before trying to use or update them. For disks in particular, it is good
> idea to repeatedly retrieve the disk till the 'status' is 'OK':
>
>
> https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/add_vm_disk.py#L73-L80
>
> (This is an example from the Python SDK, but you get the idea. There are
> SDKs for Ruby and Java as well.)
>
> Finally, if you get a 500 error then there is most probably a bug. The
> API should never return that, even if you try to do something that
> doesn't make sense it should respond with a reasonable error message. If
> you keep getting that 500 error please share the details.
>
> > On Wed, Jun 28, 2017 at 1:03 PM, Juan Hernández  > > wrote:
> >
> > On 06/28/2017 12:55 PM, Bruno Rodriguez wrote:
> > > I'm sorry about bothering you again, but after trying it some
> times I'm
> > > still getting a "500 Internal Server Error". I'm using a REST
> client
> > > instead of CURL and I tried adding a "Version: 3" to headers and
> used
> > > the URL with the v3 as well.
> > >
> > > I'm issuing a PUT of
> > >
> > > <disk><alias>Alias_for_disk</alias></disk>
> > >
> > > To the URL
> > >
> > > https://myserver/ovirt-engine/api/v3/vms/62498f51-0203-48b9-83c8-4c6f3bdfe05c/disks/583ed952-46a8-4bc5-8a27-8660e4a24ea2
> > >
> > > I mean, I can live without it but we are using oVirt as a "static
> > > virtualization" environment and is quite useful for us being able
> to
> > > recognize easily each disk by its server name. In case I have to
> wait
> > > until the 4.2 API version to automatize this I'll do :(
> > >
> >
> > That should work. Can you please check if you get any useful message in
> > /var/log/ovirt-engine/server.log or /var/log/ovirt-engine/engine.log?
> >
> > >
> > >
> > > On Wed, Jun 28, 2017 at 11:18 AM, Bruno Rodriguez wrote:
> > >
> > > Thank you very much !!!
> > >
> > > I expected I was missing something like that. Thanks again!
> > >
> > > On Wed, Jun 28, 2017 at 11:12 AM, Juan Hernández
> > > <jhern...@redhat.com> wrote:
> > >
> > > On 06/28/2017 10:43 AM, Bruno Rodriguez wrote:
> > > > Thank you, Daniel
> > > >
> > > > I tried a PUT with the same XML body, I got a "405
> > Method Not Allowed".
> > > > It's quite strange, there must be someting I'm missing
> > > >
> > >
> > > The operation to update a disk will be introduced in
> version 4
> > > of the
> > > API with version 4.2 of the engine. Meanwhile the way to
> > update
> > > the disk
> > > is to use version 3 of the API and the disks
> > sub-collection of the
> > > virtual machine. That means that if you have a VM with id
> 123
> > > and a disk
> > > with id 456 you can send a request like this:
> > >
> > >   PUT /ovirt-engine/api/v3/vms/123/disks/456
> > >
> > > With a request body like this:
> > >
> > >   <disk>
> > >     <alias>newalias</alias>
> > >   </disk>
> > >
> > > Note that you can use the above "v3" prefix in the URL or
> > else the
> > > "Version: 3" header. A complete example using curl:
> > >
> > > --
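The curl example is truncated here; as a rough equivalent, the same version-3 request can be assembled with Python's standard library (the engine URL, ids, and alias below are placeholders, and actually sending it would additionally need authentication and TLS setup):

```python
import urllib.request

def build_update_alias_request(engine_url, vm_id, disk_id, alias):
    """Build (without sending) the v3 PUT that renames a VM disk."""
    url = "{}/api/vms/{}/disks/{}".format(engine_url.rstrip("/"), vm_id, disk_id)
    body = "<disk><alias>{}</alias></disk>".format(alias)
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/xml",
            "Accept": "application/xml",
            "Version": "3",  # or put the /v3 prefix in the URL instead
        },
    )

# req = build_update_alias_request("https://myserver/ovirt-engine",
#                                  "123", "456", "newalias")
# urllib.request.urlopen(req)  # plus HTTP Basic auth / SSL context
```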

Re: [ovirt-users] add fourth full gluster node and remove arbiter: ovirt 4.1 with hosted engine

2017-06-30 Thread yayo (j)
2017-06-30 11:01 GMT+02:00 knarra :

> You do not need to remove the arbiter node as you are getting the
> advantage of saving on space by having this config.
>
> Since you have a new server you can add it as a fourth node and create another
> gluster volume (replica 3) out of this node plus the other two nodes, and
> run VM images there as well.
>

Hi,

And thanks for the answer. The current arbiter must be removed because it is
too obsolete. So I need to add the new "full replicated" node, and I want to
know the steps to add it and to remove the arbiter node (also, a way to move
the arbiter role to the new node, if needed). Extra info: I want to know if I
can do this on an existing oVirt gluster Data Domain (called Data01), because
we have many VMs running on it.

thank you
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Empty cgroup files on centos 7.3 host

2017-06-30 Thread Florian Schmid
Hi Yaniv, 

thank you for your answer! I didn't know that there is already such a 
monitoring tool for oVirt. 

We will certainly give it a try, but we already have a monitoring tool in our 
environment; that's why I wanted to add those values, too. 

How does collectd get this data from libvirt, when the corresponding cgroup 
values are empty? 

BR Florian 
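For reference, when those blkio counters are populated they are plain `major:minor Op value` text, so no special tooling is needed to read them; a small illustrative parser (sketch only, using the scope path from the listing quoted below):

```python
def parse_blkio(text):
    """Parse blkio counter text into {(major, minor): {op: value}}.

    The aggregate "Total <n>" line at the end is skipped.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3 or ":" not in parts[0]:
            continue  # skips the bare "Total <n>" aggregate line
        dev, op, value = parts
        major, minor = (int(x) for x in dev.split(":"))
        stats.setdefault((major, minor), {})[op] = int(value)
    return stats

# e.g.:
# path = ("/sys/fs/cgroup/blkio/machine.slice/"
#         "machine-qemu\\x2d14\\x2dHostedEngine.scope/"
#         "blkio.throttle.io_service_bytes")
# with open(path) as f:
#     print(parse_blkio(f.read()))
```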





From: "Yaniv Kaul"  
To: "Florian Schmid"  
CC: "users"  
Sent: Tuesday, June 27, 2017 09:08:51 
Subject: Re: [ovirt-users] Empty cgroup files on centos 7.3 host 



On Mon, Jun 26, 2017 at 11:03 PM, Florian Schmid <fsch...@ubimet.com> wrote: 


Hi, 

I wanted to monitor disk IO and R/W on all of our oVirt CentOS 7.3 hypervisor 
hosts, but it looks like all those files are empty. 



We have a very nice integration with Elastic-based monitoring and logging - why 
not use it? 
On the host, we use collectd for monitoring. 
See 
http://www.ovirt.org/develop/release-management/features/engine/metrics-store/ 

Y. 

For example: 
ls -al 
/sys/fs/cgroup/blkio/machine.slice/machine-qemu\\x2d14\\x2dHostedEngine.scope/ 
insgesamt 0 
drwxr-xr-x. 2 root root 0 30. Mai 10:09 . 
drwxr-xr-x. 16 root root 0 26. Jun 09:25 .. 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_merged 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_merged_recursive 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_queued 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_queued_recursive 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_service_bytes 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_service_bytes_recursive 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_serviced 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_serviced_recursive 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_service_time 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_service_time_recursive 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_wait_time 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.io_wait_time_recursive 
-rw-r--r--. 1 root root 0 30. Mai 10:09 blkio.leaf_weight 
-rw-r--r--. 1 root root 0 30. Mai 10:09 blkio.leaf_weight_device 
--w---. 1 root root 0 30. Mai 10:09 blkio.reset_stats 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.sectors 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.sectors_recursive 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.throttle.io_service_bytes 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.throttle.io_serviced 
-rw-r--r--. 1 root root 0 30. Mai 10:09 blkio.throttle.read_bps_device 
-rw-r--r--. 1 root root 0 30. Mai 10:09 blkio.throttle.read_iops_device 
-rw-r--r--. 1 root root 0 30. Mai 10:09 blkio.throttle.write_bps_device 
-rw-r--r--. 1 root root 0 30. Mai 10:09 blkio.throttle.write_iops_device 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.time 
-r--r--r--. 1 root root 0 30. Mai 10:09 blkio.time_recursive 
-rw-r--r--. 1 root root 0 30. Mai 10:09 blkio.weight 
-rw-r--r--. 1 root root 0 30. Mai 10:09 blkio.weight_device 
-rw-r--r--. 1 root root 0 30. Mai 10:09 cgroup.clone_children 
--w--w--w-. 1 root root 0 30. Mai 10:09 cgroup.event_control 
-rw-r--r--. 1 root root 0 30. Mai 10:09 cgroup.procs 
-rw-r--r--. 1 root root 0 30. Mai 10:09 notify_on_release 
-rw-r--r--. 1 root root 0 30. Mai 10:09 tasks 


I thought I could get the values I need from there, but all the files are empty. 

Looking at this post: 
http://lists.ovirt.org/pipermail/users/2017-January/079011.html 
this should work. 

Is this normal on centos 7.3 with oVirt installed? How can I get those values, 
without monitoring all VMs directly? 

oVirt Version we use: 
4.1.1.8-1.el7.centos 

BR Florian 





Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread cmc
Just to clarify: you mean the host_id in
/etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
correct?
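A quick sketch of verifying that mapping on each host (this assumes hosted-engine.conf is simple key=value text; the expected spm_id has to come from the engine side):

```python
def read_conf_value(text, key):
    """Return the value for `key` in simple key=value config text."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and non-assignment lines
        k, _, v = line.partition("=")
        if k.strip() == key:
            return v.strip()
    return None

# On a host (expected_spm_id obtained from the engine):
# conf = open("/etc/ovirt-hosted-engine/hosted-engine.conf").read()
# host_id = read_conf_value(conf, "host_id")
# print("match" if host_id == str(expected_spm_id) else "MISMATCH")
```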

On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak  wrote:
> Hi,
>
> cleaning metadata won't help in this case. Try transferring the
> spm_ids you got from the engine to the proper hosted engine hosts so
> the hosted engine ids match the spm_ids. Then restart all hosted
> engine services. I would actually recommend restarting all hosts after
> this change, but I have no idea how many VMs you have running.
>
> Martin
>
> On Thu, Jun 29, 2017 at 8:27 PM, cmc  wrote:
>> Tried running a 'hosted-engine --clean-metadata" as per
>> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
>> ovirt-ha-agent was not running anyway, but it fails with the following
>> error:
>>
>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
>> to start monitoring domain
>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>> during domain acquisition
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent
>> call last):
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 191, in _run_agent
>> return action(he)
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 67, in action_clean
>> return he.clean(options.force_cleanup)
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 345, in clean
>> self._initialize_domain_monitor()
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 823, in _initialize_domain_monitor
>> raise Exception(msg)
>> Exception: Failed to start monitoring domain
>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>> during domain acquisition
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt 
>> '0'
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
>> occurred, giving up. Please review the log and consider filing a bug.
>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>
>> On Thu, Jun 29, 2017 at 6:10 PM, cmc  wrote:
>>> Actually, it looks like sanlock problems:
>>>
>>>"SanlockInitializationError: Failed to initialize sanlock, the
>>> number of errors has exceeded the limit"
>>>
>>>
>>>
>>> On Thu, Jun 29, 2017 at 5:10 PM, cmc  wrote:
 Sorry, I am mistaken, two hosts failed for the agent with the following 
 error:

 ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
 ERROR Failed to start monitoring domain
 (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
 during domain acquisition
 ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
 ERROR Shutting down the agent because of 3 failures in a row!

 What could cause these timeouts? Some other service not running?

 On Thu, Jun 29, 2017 at 5:03 PM, cmc  wrote:
> Both services are up on all three hosts. The broker logs just report:
>
> Thread-6549::INFO::2017-06-29
> 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> Connection established
> Thread-6549::INFO::2017-06-29
> 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> Connection closed
>
> Thanks,
>
> Cam
>
> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak  wrote:
>> Hi,
>>
>> please make sure that both ovirt-ha-agent and ovirt-ha-broker services
>> are restarted and up. The error says the agent can't talk to the
>> broker. Is there anything in the broker.log?
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Thu, Jun 29, 2017 at 4:42 PM, cmc  wrote:
>>> I've restarted those two services across all hosts, have taken the
>>> Hosted Engine host out of maintenance, and when I try to migrate the
>>> Hosted Engine over to another host, it reports that all three hosts
>>> 'did not satisfy internal filter HA because it is not a Hosted Engine
>>> host'.
>>>
>>> On the host that the Hosted Engine is currently on it reports in the 
>>> agent.log:
>>>
>>> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR
>>> Connection closed: Connection closed
>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception
>>> getting service path: Connection closed
>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent
>>> call last):
>>> File
>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agen

Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread cmc
Ok, thanks Martin. It should be feasible to get all VMs onto one host,
so I can do that (unless you recommend just shutting the entire
cluster down at once?). As for the engine, since it won't migrate to
another host, I'll shut it down before shutting down that host.

Will let you know how it goes.

Thanks,

Cam

On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak  wrote:
> Hi,
>
> cleaning metadata won't help in this case. Try transferring the
> spm_ids you got from the engine to the proper hosted engine hosts so
> the hosted engine ids match the spm_ids. Then restart all hosted
> engine services. I would actually recommend restarting all hosts after
> this change, but I have no idea how many VMs you have running.
>
> Martin
>
> On Thu, Jun 29, 2017 at 8:27 PM, cmc  wrote:
>> Tried running a 'hosted-engine --clean-metadata" as per
>> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
>> ovirt-ha-agent was not running anyway, but it fails with the following
>> error:
>>
>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
>> to start monitoring domain
>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>> during domain acquisition
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent
>> call last):
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 191, in _run_agent
>> return action(he)
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 67, in action_clean
>> return he.clean(options.force_cleanup)
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 345, in clean
>> self._initialize_domain_monitor()
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 823, in _initialize_domain_monitor
>> raise Exception(msg)
>> Exception: Failed to start monitoring domain
>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>> during domain acquisition
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt 
>> '0'
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
>> occurred, giving up. Please review the log and consider filing a bug.
>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>
>> On Thu, Jun 29, 2017 at 6:10 PM, cmc  wrote:
>>> Actually, it looks like sanlock problems:
>>>
>>>"SanlockInitializationError: Failed to initialize sanlock, the
>>> number of errors has exceeded the limit"
>>>
>>>
>>>
>>> On Thu, Jun 29, 2017 at 5:10 PM, cmc  wrote:
 Sorry, I am mistaken, two hosts failed for the agent with the following 
 error:

 ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
 ERROR Failed to start monitoring domain
 (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
 during domain acquisition
 ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
 ERROR Shutting down the agent because of 3 failures in a row!

 What could cause these timeouts? Some other service not running?

 On Thu, Jun 29, 2017 at 5:03 PM, cmc  wrote:
> Both services are up on all three hosts. The broker logs just report:
>
> Thread-6549::INFO::2017-06-29
> 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
> Connection established
> Thread-6549::INFO::2017-06-29
> 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> Connection closed
>
> Thanks,
>
> Cam
>
> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak  wrote:
>> Hi,
>>
>> please make sure that both ovirt-ha-agent and ovirt-ha-broker services
>> are restarted and up. The error says the agent can't talk to the
>> broker. Is there anything in the broker.log?
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Thu, Jun 29, 2017 at 4:42 PM, cmc  wrote:
>>> I've restarted those two services across all hosts, have taken the
>>> Hosted Engine host out of maintenance, and when I try to migrate the
>>> Hosted Engine over to another host, it reports that all three hosts
>>> 'did not satisfy internal filter HA because it is not a Hosted Engine
>>> host'.
>>>
>>> On the host that the Hosted Engine is currently on it reports in the 
>>> agent.log:
>>>
>>> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR
>>> Connection closed: Connection closed
>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception
>>> getting service path: Connection closed
>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>>> ovirt_hosted_engine_ha.agent.agent.Age

Re: [ovirt-users] ovirt-guest-agent - Ubuntu 16.04

2017-06-30 Thread Sandro Bonazzola
Adding Laszlo Boszormenyi (GCS), who is the maintainer according to
http://it.archive.ubuntu.com/ubuntu/ubuntu/ubuntu/pool/universe/o/ovirt-guest-agent/ovirt-guest-agent_1.0.13.dfsg-1.dsc



On Wed, Jun 28, 2017 at 5:37 PM, FERNANDO FREDIANI <
fernando.fredi...@upx.com> wrote:

> Hello
>
> Is the maintainer of ovirt-guest-agent for Ubuntu on this mailing list?
>
> I have noticed that if you install the ovirt-guest-agent package from the
> Ubuntu repositories it doesn't start. It throws an error about Python and
> never starts. Has anyone noticed the same? The OS in this case is a clean
> minimal install of Ubuntu 16.04.
>
> Installing it from the following repository works fine -
> http://download.opensuse.org/repositories/home:/evilissimo:
> /ubuntu:/16.04/xUbuntu_16.04
>
> Fernando
>
>
>


-- 

SANDRO BONAZZOLA

ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D

Red Hat EMEA 

TRIED. TESTED. TRUSTED. 


Re: [ovirt-users] add fourth full gluster node and remove arbiter: ovirt 4.1 with hosted engine

2017-06-30 Thread knarra

On 06/30/2017 02:18 PM, yayo (j) wrote:

Hi all,

we have a 3 node cluster with this configuration:

oVirt 4.1 with 3 nodes, hyperconverged with gluster. 2 nodes are "full 
replicated" and 1 node is the arbiter.


Now we have a new server to add to the cluster, and we want to add this 
new server and remove the arbiter (or make this new server a "full 
replicated" gluster node with the arbiter role? I don't know)
You do not need to remove the arbiter node as you are getting the 
advantage of saving on space by having this config.


Since you have a new server you can add it as a fourth node and create another 
gluster volume (replica 3) out of this node plus the other two nodes, and 
run VM images there as well.
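The volume-creation step described above reduces to a couple of gluster CLI calls; as a sketch, here is a helper that only assembles the command line (volume name and brick paths are placeholders):

```python
def gluster_create_replica3_cmd(volume, bricks):
    """Build the argv for creating a plain replica-3 gluster volume."""
    if len(bricks) != 3:
        raise ValueError("a plain replica 3 volume needs exactly three bricks")
    return ["gluster", "volume", "create", volume, "replica", "3"] + list(bricks)

# cmd = gluster_create_replica3_cmd(
#     "data02",
#     ["node1:/gluster/data02/brick",
#      "node2:/gluster/data02/brick",
#      "node4:/gluster/data02/brick"])
# subprocess.run(cmd, check=True)  # then: gluster volume start data02
```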


Can you please help me to know what is the right way to do this? Or can 
you give me any doc or link that explains the steps to do this?


Thank you in advance!





[ovirt-users] add fourth full gluster node and remove arbiter: ovirt 4.1 with hosted engine

2017-06-30 Thread yayo (j)
Hi all,

we have a 3 node cluster with this configuration:

oVirt 4.1 with 3 nodes, hyperconverged with gluster. 2 nodes are "full
replicated" and 1 node is the arbiter.

Now we have a new server to add to the cluster, and we want to add this new
server and remove the arbiter (or make this new server a "full replicated"
gluster node with the arbiter role? I don't know)

Can you please help me to know what is the right way to do this? Or can
you give me any doc or link that explains the steps to do this?

Thank you in advance!


Re: [ovirt-users] HostedEngine VM not visible, but running

2017-06-30 Thread Martin Sivak
Hi,

cleaning metadata won't help in this case. Try transferring the
spm_ids you got from the engine to the proper hosted engine hosts so
the hosted engine ids match the spm_ids. Then restart all hosted
engine services. I would actually recommend restarting all hosts after
this change, but I have no idea how many VMs you have running.

Martin

On Thu, Jun 29, 2017 at 8:27 PM, cmc  wrote:
> Tried running a 'hosted-engine --clean-metadata" as per
> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
> ovirt-ha-agent was not running anyway, but it fails with the following
> error:
>
> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
> to start monitoring domain
> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
> during domain acquisition
> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent
> call last):
>   File 
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 191, in _run_agent
> return action(he)
>   File 
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 67, in action_clean
> return he.clean(options.force_cleanup)
>   File 
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 345, in clean
> self._initialize_domain_monitor()
>   File 
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 823, in _initialize_domain_monitor
> raise Exception(msg)
> Exception: Failed to start monitoring domain
> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
> during domain acquisition
> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt '0'
> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
> occurred, giving up. Please review the log and consider filing a bug.
> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>
> On Thu, Jun 29, 2017 at 6:10 PM, cmc  wrote:
>> Actually, it looks like sanlock problems:
>>
>>"SanlockInitializationError: Failed to initialize sanlock, the
>> number of errors has exceeded the limit"
>>
>>
>>
>> On Thu, Jun 29, 2017 at 5:10 PM, cmc  wrote:
>>> Sorry, I am mistaken, two hosts failed for the agent with the following 
>>> error:
>>>
>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>> ERROR Failed to start monitoring domain
>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>> during domain acquisition
>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>> ERROR Shutting down the agent because of 3 failures in a row!
>>>
>>> What could cause these timeouts? Some other service not running?
>>>
>>> On Thu, Jun 29, 2017 at 5:03 PM, cmc  wrote:
 Both services are up on all three hosts. The broker logs just report:

 Thread-6549::INFO::2017-06-29
 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
 Thread-6549::INFO::2017-06-29
 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed

 Thanks,

 Cam

 On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak  wrote:
> Hi,
>
> please make sure that both ovirt-ha-agent and ovirt-ha-broker services
> are restarted and up. The error says the agent can't talk to the
> broker. Is there anything in the broker.log?
>
> Best regards
>
> Martin Sivak
>
> On Thu, Jun 29, 2017 at 4:42 PM, cmc  wrote:
>> I've restarted those two services across all hosts, have taken the
>> Hosted Engine host out of maintenance, and when I try to migrate the
>> Hosted Engine over to another host, it reports that all three hosts
>> 'did not satisfy internal filter HA because it is not a Hosted Engine
>> host'.
>>
>> On the host that the Hosted Engine is currently on it reports in the 
>> agent.log:
>>
>> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR
>> Connection closed: Connection closed
>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception
>> getting service path: Connection closed
>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent
>> call last):
>> File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 191, in _run_agent
>>   return action(he)
>> File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 64, in action_proper