from:"Juhani Rautiainen"

The story of the problems continues. I finally shut everything down, put
the storage domains into maintenance, and then this happened:

ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is
"[Physical device initialization failed. Please check that the device
is empty and accessible by the host.]". HTTP response code is 400.
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg":
"Fault reason is \"Operation Failed\". Fault detail is \"[Physical
device initialization failed. Please check that the device is empty
and accessible by the host.]\". HTTP response code is 400."}

No amount of zeroing the device helped.

Next plan: find out whether I can restore the backup to a standalone
server. If that fails, that's the end of oVirt for me. It worked fine for
a couple of years, but this upgrade hassle is too much. I should have
stayed on 4.3 until the end.

Thanks,
Juhani


[ovirt-users] Re: Restoring hosted engine from backup fails on new FC storage domain creation

Hmm. Is it possible that this operation can't be completed while the
other node is still running v4.3, because that node doesn't know how to
do it?

Thanks,
Juhani


[ovirt-users] Re: Restoring hosted engine from backup fails on new FC storage domain creation

It seems that this is not supported in oVirt yet? I got this response
when I tried to change the master to one of the storage domains I have:



<fault>
<detail>[Cannot switch master storage domain. Switch master
storage domain operation is not supported.]</detail>
<reason>Operation Failed</reason>
</fault>


So is this really the only way to do this: shut everything down and put
the other storage domains into maintenance? It would have been nice if
this information had been in the upgrade guide. The guide made this seem
so easy and simple...

Thanks,
Juhani
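
For readers following along, the switch-master attempt described above can be sketched roughly as below. This is only an illustration under stated assumptions: the fault body is assumed to use `fault`, `detail` and `reason` elements, and the `setmaster` action path on the data center is taken from my reading of the REST API guide for later 4.4 releases, so check it against your own engine's API documentation. `DATA_CENTER_ID` is a hypothetical placeholder; the storage domain ID is the `hosted_storage` ID from the log in this thread. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the engine's fault response (the element names are an assumption).
FAULT_XML = """<fault>
<detail>[Cannot switch master storage domain. Switch master storage domain operation is not supported.]</detail>
<reason>Operation Failed</reason>
</fault>"""


def parse_fault(xml_text):
    """Extract the reason and detail strings from a fault document."""
    root = ET.fromstring(xml_text)
    return {"reason": root.findtext("reason"), "detail": root.findtext("detail")}


def build_set_master_request(data_center_id, storage_domain_id):
    """Return (path, body) for the assumed 'setmaster' action; nothing is sent."""
    action = ET.Element("action")
    ET.SubElement(action, "storage_domain", id=storage_domain_id)
    path = f"/ovirt-engine/api/datacenters/{data_center_id}/setmaster"
    return path, ET.tostring(action, encoding="unicode")


fault = parse_fault(FAULT_XML)
print(fault["reason"], "-", fault["detail"])

# DATA_CENTER_ID is a hypothetical placeholder, not a value from the thread.
path, body = build_set_master_request(
    "DATA_CENTER_ID", "dd52022b-7616-47f6-9534-6f1a4084fdf4"
)
print("POST", path)
print(body)
```

An engine that answers with the fault shown in this message does not support the action, so the request itself is no workaround on that version.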


[ovirt-users] Re: Restoring hosted engine from backup fails on new FC storage domain creation

Thanks, this looks like what I'm looking for. I'm still wondering how
to use it. I have a LUN just for the new hosted storage. Ansible created
the storage domain on it correctly but can't activate it. So is the
idea that I activate this unattached hosted_storage domain and try to
use the API to make it master? I attached a screenshot of how it looks
currently.



On Tue, Apr 27, 2021 at 10:41 AM Yedidyah Bar David  wrote:
>
> On Tue, Apr 27, 2021 at 10:15 AM Juhani Rautiainen
>  wrote:
> >
> > To continue. I noticed that another storage domain took the data
> > (master) now. I saw one advice that you can force change by putting
> > the storage domain to maintenance mode. Problem is that there are VM's
> > running on these domains. How is this supposed to work during the
> > restore?
>
> There is a recent change [1] by Shani (Cced) that should allow you
> to choose another storage domain as master. So you can create a new
> (temporary?) SD with the correct compatibility level and then set it
> to master.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1576923
>
> Best regards,
>
>
>
>
> --
> Didi
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5B6IQCPBYPI5MTCL3QP7DUAYVMQLHJ4S/

[ovirt-users] Re: Restoring hosted engine from backup fails on new FC storage domain creation

To continue: I noticed that another storage domain has now taken the
data (master) role. I saw a piece of advice that you can force the change
by putting the storage domain into maintenance mode. The problem is that
there are VMs running on these domains. How is this supposed to work
during the restore?

Thanks,
Juhani

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TZHLFDIYRBQFVDLEVG4SWVZPXSVO7SKK/

[ovirt-users] Restoring hosted engine from backup fails on new FC storage domain creation

Hi!

I started the upgrade from 4.3 to 4.4. Now I'm stuck because restoring
the backup fails to create the correct storage domain for the Hosted
Engine. How can I create one? The error from the Ansible task is:

[ ERROR ] ovirtsdk4.Error: Fault reason is "Operation Failed". Fault
detail is "[Domain format is different from master storage domain
format]". HTTP response code is 400.
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg":
"Fault reason is \"Operation Failed\". Fault detail is \"[Domain
format is different from master storage domain format]\". HTTP
response code is 400."}

From the UI I can see that what has been created is a data domain, not
a data (master) domain as the old one was. Now I'm stuck here. In case it
is relevant, I'm trying to do this on a Fibre Channel system.

What I could find from the logs is this:
2021-04-27 09:36:06,925+0300 DEBUG
otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:105 storage_domain_details: {'changed':
False, 'ovirt_storage_domains': [{'href':
'/ovirt-engine/api/storagedomains/dd52022b-7616-47f6-9534-6f1a4084fdf4',
'comment': '', 'description': '', 'id':
'dd52022b-7616-47f6-9534-6f1a4084fdf4', 'name': 'hosted_storage',
'available': 531502202880, 'backup': False, 'block_size': 512,
'committed': 0, 'critical_space_action_blocker': 5,
'discard_after_delete': True, 'disk_profiles': [], 'disk_snapshots':
[], 'disks': [], 'external_status': 'ok', 'master': False,
'permissions': [], 'status': 'unattached', 'storage': {'type': 'fcp',
'volume_group': {'id': 'HRLDCn-p7X2-5X2O-vm4h-1Wb9-wAMu-WkIwit',
'logical_units': [{'discard_max_size': 268435456,
'discard_zeroes_data': False, 'id':
'36000d31005b4f629', 'lun_mapping': 3, 'paths': 0,
'product_id': 'Compellent Vol', 'serial':
'SCOMPELNTCompellent_Vol_0005b4f6-0029', 'size': 536870912000,
'storage_domain_id': 'dd52022b-7616-47f6-9534-6f1a4084fdf4',
'vendor_id': 'COMPELNT', 'volume_group_id':
'HRLDCn-p7X2-5X2O-vm4h-1Wb9-wAMu-WkIwit'}]}}, 'storage_connections':
[], 'storage_format': 'v5', 'supports_discard': True,
'supports_discard_zeroes_data': False, 'templates': [], 'type':
'data', 'used': 4294967296, 'vms': [], 'warning_low_space_indicator':
10, 'wipe_after_delete': False}], 'failed': False}

'master': False? I'm not sure whether this is the creation or a check. I
tried this operation twice. On the second try I removed the new
hosted_storage domain and also the old hosted_engine domain, to make sure
that it doesn't prevent the creation of another master. No luck with that.
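
As an aside for anyone reading such a log line: the logged value is a Python-style dict, so the few fields that matter for the "Domain format is different from master storage domain format" error can be pulled out with a short script. This is a minimal sketch; `LOGGED` is a trimmed copy of the dict above that keeps only the fields inspected here, and `summarize_domains` is a hypothetical helper name. The format of the current master domain is not shown in this thread, so the script only reports the new domain's values for comparison.

```python
import ast

# Trimmed copy of the logged dict (only the fields inspected below are kept).
LOGGED = (
    "{'changed': False, 'ovirt_storage_domains': [{"
    "'id': 'dd52022b-7616-47f6-9534-6f1a4084fdf4', 'name': 'hosted_storage', "
    "'master': False, 'status': 'unattached', 'storage': {'type': 'fcp'}, "
    "'storage_format': 'v5', 'type': 'data'}], 'failed': False}"
)

FIELDS = ("name", "type", "master", "status", "storage_format")


def summarize_domains(logged_text):
    """Safely evaluate the logged dict literal and keep the fields of interest."""
    data = ast.literal_eval(logged_text)
    return [{key: sd.get(key) for key in FIELDS} for sd in data["ovirt_storage_domains"]]


for summary in summarize_domains(LOGGED):
    print(summary)
```

The `storage_format` value is the one to compare against the format of whichever domain currently holds the master role.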

Thanks,
Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3VQUION533QU2ZXEGS3MOSGKQSZTSLRN/

[ovirt-users] rhv-log-collector-analyzer available missing?

2021-04-19 Thread Juhani Rautiainen

Hi!

I'm trying to upgrade SHE from 4.3 to 4.4. The instructions at
https://www.ovirt.org/documentation/upgrade_guide/#SHE_Upgrading_from_4-3
have a step "4.2 Analyzing the Environment". My installation doesn't
provide the rhv-log-collector-analyzer tool, although the wiki has a
notice that it has been available since v4.2.5?

[root@ovirtmgr ~]# yum install rhv-log-collector-analyzer
Loaded plugins: fastestmirror, versionlock
Determining fastest mirrors
ovirt-4.3-epel/x86_64/metalink                      |  34 kB  00:00:00
 * base: mirror.hosthink.net
 * extras: mirror.hosthink.net
 * ovirt-4.3: resources.ovirt.org
 * ovirt-4.3-epel: mirrors.nxthost.com
 * updates: mirror.hosthink.net
base                                                | 3.6 kB  00:00:00
centos-sclo-rh-release                              | 3.0 kB  00:00:00
extras                                              | 2.9 kB  00:00:00
ovirt-4.3                                           | 3.0 kB  00:00:00
ovirt-4.3-centos-gluster6                           | 3.0 kB  00:00:00
ovirt-4.3-centos-opstools                           | 2.9 kB  00:00:00
ovirt-4.3-centos-ovirt-common                       | 3.0 kB  00:00:00
ovirt-4.3-centos-ovirt43                            | 2.9 kB  00:00:00
ovirt-4.3-centos-qemu-ev                            | 3.0 kB  00:00:00
ovirt-4.3-epel                                      | 4.7 kB  00:00:00
ovirt-4.3-virtio-win-latest                         | 3.0 kB  00:00:00
sac-gluster-ansible                                 | 3.3 kB  00:00:00
updates                                             | 2.9 kB  00:00:00
(1/9): extras/7/x86_64/primary_db                   | 232 kB  00:00:00
(2/9): ovirt-4.3-centos-gluster6/x86_64/primary_db  | 120 kB  00:00:00
(3/9): ovirt-4.3-epel/x86_64/group_gz               |  96 kB  00:00:00
(4/9): base/7/x86_64/primary_db                     | 6.1 MB  00:00:00
(5/9): ovirt-4.3-epel/x86_64/updateinfo             | 1.0 MB  00:00:00
(6/9): ovirt-4.3-epel/x86_64/primary_db             | 6.9 MB  00:00:00
(7/9): centos-sclo-rh-release/x86_64/primary_db     | 2.9 MB  00:00:00
(8/9): sac-gluster-ansible/x86_64/primary_db        |  12 kB  00:00:00
(9/9): updates/7/x86_64/primary_db                  | 7.1 MB  00:00:00
No package rhv-log-collector-analyzer available.
Error: Nothing to do

Are we missing a repo, or is this just a copy/paste error from the RHV
docs, and this step shouldn't even be in the oVirt docs?

Thanks,
Juhani
-- 
Juhani Rautiainen   jra...@iki.fi
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LXCOH736PRJYHMFNIPCSX7OXKM55MI2M/

[ovirt-users] Re: Removed FC LUN causes multipath to spam syslog

2021-03-06 Thread Juhani Rautiainen

This really might be the best way to go. But it's a bit counterintuitive
that you can easily add storage and even remove it from the UI, yet
there is still cleanup left to do. I'm not complaining that much,
though. You get what you pay for. Sometimes oVirt requires a bit of
extra work and that's the bargain you make. Maybe a warning somewhere
could help. And it's not like you never have to go to the CLI when
working with VMware.

I will probably have to go through this again after the 4.4 upgrade, as
SHE is currently on a 3PAR LUN but won't be after the upgrade.

Thanks,
Juhani

On Sat, Mar 6, 2021 at 9:19 AM Strahil Nikolov  wrote:
>
> Removal of LUN is pure linux task.
> I would go like this:
> 1. Set the storage domain in maintenance
> 2. Detach the storage domain
> 3. I'm not sure if any LVM tasks should be done manually or are done by
>    the Engine
> 4. Add the wwid into multipath blacklist
> 5. Remove all paths (sdXYZ)
> 6. Unzone the LUN
>
> Best Regards,
> Strahil Nikolov
>

[ovirt-users] Re: Removed FC LUN causes multipath to spam syslog

2021-03-05 Thread Juhani Rautiainen

Hi!

I had already rebooted the first node, so I tried this on the second
node. After cleaning up with dmsetup I ran the Ansible script again. It
claimed success, but multipathd was still checking for paths. I tried
to do 'systemctl reload multipathd' but did a restart instead (too quick
fingers). Anyway, it worked. There was some kind of hiccup because of
that, as the engine seemed to re-activate the node. No hosts went down,
so in the end it worked.

Thanks for the help,
Juhani
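
For anyone repeating this cleanup, here is a small dry-run sketch of the per-host commands discussed in this thread. It only builds and prints the command lines and runs nothing. The WWID and path name are copied as they appear in the syslog excerpt quoted below and may be incomplete; `cleanup_commands` and its `stale_dm_name` parameter are hypothetical names, and the order of steps is my assumption of a reasonable sequence, not a procedure confirmed by the oVirt developers.

```python
import shlex


def cleanup_commands(wwid, paths, stale_dm_name=None):
    """Build (but do not run) commands for dropping a removed LUN from one host."""
    # Flush the multipath map for the removed LUN.
    commands = [["multipath", "-f", wwid]]
    if stale_dm_name:
        # Stale device-mapper link left behind; the name must be supplied by the admin.
        commands.append(["dmsetup", "remove", "-f", f"/dev/mapper/{stale_dm_name}"])
    for dev in paths:
        # Ask the kernel to drop each SCSI path device (sdX) behind the map.
        commands.append(["sh", "-c", f"echo 1 > /sys/block/{dev}/device/delete"])
    # Reload rather than restart, to avoid disturbing the other maps.
    commands.append(["systemctl", "reload", "multipathd"])
    return commands


# Values as they appear in the log excerpt of this thread (possibly truncated).
plan = cleanup_commands("360002ac0027257b9", ["sdr"])
for cmd in plan:
    print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the plan first makes it easy to review the device names before running anything on a host that still serves other storage domains.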

On Fri, Mar 5, 2021 at 1:46 PM Vojtech Juranek  wrote:
>
> On Friday, 5 March 2021 12:16:56 CET Juhani Rautiainen wrote:
> > Hi!
> >
> > Ansible script fails and reason seems to be those stale DM links. We
> > are currently still in 4.3.10 as I wanted to do this change before the
> > upgrade to 4.4. We have SHE and it is currently on a 3PAR disk. When
> > we upgrade we can do the SHE change at the same time as I didn't want
> > to do two SHE restores (1st in 4.3 and then in 4.4). As there wasn't
> > any hint how you can remove those stale DM links the best solution
> > probably is to put nodes in maintenance and reboot them.
>
> you can remove the stale links by
>
> dmsetup remove -f /dev/mapper/
>
>
> > BTW I did dezone those disks after I had removed them from oVirt UI. I
> > mean you can't remove them while they are still on oVirt?
>
> If the LUN is on a target where no other LUNs are used by oVirt, the
> multipath devices should be removed and vdsm should be logged out from
> the target (at least on recent oVirt releases). If you still use other
> LUNs from the storage server on the host (as part of other storage
> domains), there's no point in removing the LUN as it will be discovered
> again, because vdsm rescans the storage in various flows. So the flow
> has to be:
>
> 1. remove storage domain using the LUN
> 2. unzone the LUN on storage server
> 3. remove corresponding multipath devices from the hosts
>
> There can be some improvements, as proposed by Nir under BZ #1310330, like
> blacklisting removed devices, but this makes everything more complex. It may
> be considered and implemented in the future, but unfortunately it is not
> available right now.
>
> > Thanks,
> > Juhani
> >
> > On Fri, Mar 5, 2021 at 11:02 AM Vojtech Juranek  wrote:
> > > On Friday, 5 March 2021 09:02:51 CET Juhani Rautiainen wrote:
> > > > Hi!
> > > >
> > > > We are running oVirt 4.3.10 and we are migrating from 3PAR to Dell SC.
> > > > I've managed to transfer disks away from one LUN and removed it (put
> > > > it into maintenance, detached it and removed it). Now multipath seems
> > > > to spam the logs about the missing disks. How can I stop this?
> > >
> > > you can remove multipath device either manually or you can use ansible
> > > playbook for it, please try one attached to
> > >
> > > https://bugzilla.redhat.com/1310330
> > >
> > > see
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1310330#c56
> > >
> > > > how to use it. You also have to change `hosts` to the group of hosts you want
> > > > to test on and remove `connection: local`
> > > >
> > > > If it fails, please report back. It can fail as there are stale DM links,
> > > > see
> > > >
> > > > bugzilla.redhat.com/1928041
> > > >
> > > > this was fixed recently and should be in the next oVirt release.
> > >
> > > Alternatively, you can reboot the host.
> > >
> > > > > Mar  5 09:58:58 ovirt01 multipathd: 360002ac0027257b9: sdr - tur checker reports path is down
> > > > > Mar  5 09:58:59 ovirt01 multipathd: 360002ac0027257b9: sdh - tur checker reports path is down
> > > >
> > > > And so on. Any idea how I can quiet this? And why didn't oVirt do this
> > > > automatically?
> > >
> > > > to be able to remove multipath devices, the LUN has to be unzoned first on
> > > > the storage server and this cannot be done by oVirt, as oVirt doesn't
> > > > manage the storage server - this has to be done by the administrator of the
> > > > storage server, who subsequently has to remove the multipath devices from
> > > > the hosts, e.g. by using the ansible script mentioned above.
> > >
> > > > Thanks,
> > > > Juhani
> > > > > ___
> > > > > Users mailing list -- users@ovirt.org
> > > > > To unsubscribe send an email to users-le...@ovirt.org
> > > > > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> > > > > oVirt Code of Conduct:
> > > > > https://www.ovirt.org/community/about/community-guidelines/
> > > > > List Archives:
> > > > > https://lists.ovirt.org/archives/list/users@ovirt.org/message/QLENIZGWGUU4JOETXNQCNIK5GTU2EFVZ/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZVW6Y7DM2QXNBVZVF7L4MBCD7JL5GV7N/

[ovirt-users] Re: Removed FC LUN causes multipath to spam syslog

2021-03-05 Thread Juhani Rautiainen

Hi!

The Ansible script fails and the reason seems to be those stale DM links. We
are currently still on 4.3.10 as I wanted to do this change before the
upgrade to 4.4. We have SHE and it is currently on a 3PAR disk. When
we upgrade we can do the SHE change at the same time, as I didn't want
to do two SHE restores (first in 4.3 and then in 4.4). As there wasn't
any hint on how to remove those stale DM links, the best solution
is probably to put the nodes into maintenance and reboot them.

BTW I did unzone those disks after I had removed them from the oVirt UI. I
mean, you can't remove them while they are still in oVirt?

Thanks,
Juhani


On Fri, Mar 5, 2021 at 11:02 AM Vojtech Juranek  wrote:
>
> On Friday, 5 March 2021 09:02:51 CET Juhani Rautiainen wrote:
> > Hi!
> >
> > We are running oVirt 4.3.10 and we are migrating from 3PAR to Dell SC.
> > I've managed to transfer disks away from one LUN and removed it (put
> > it into maintenance, detached it and removed it). Now multipath seems
> > to spam the logs about the missing disks. How can I stop this?
>
> you can remove multipath device either manually or you can use ansible
> playbook for it, please try one attached to
>
> https://bugzilla.redhat.com/1310330
>
> see
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1310330#c56
>
> how to use it. You also have to change `hosts` to the group of hosts you want
> to test on and remove `connection: local`
>
> If it fails, please report back. It can fail as there are stale DM links, see
>
> bugzilla.redhat.com/1928041
>
> this was fixed recently and should be in the next oVirt release.
>
> Alternatively, you can reboot the host.
>
> > Mar  5 09:58:58 ovirt01 multipathd: 360002ac0027257b9: sdr - tur checker reports path is down
> > Mar  5 09:58:59 ovirt01 multipathd: 360002ac0027257b9: sdh - tur checker reports path is down
> >
> > And so on. Any idea how I can quiet this? And why didn't oVirt do this
> > automatically?
>
> to be able to remove multipath devices, the LUN has to be unzoned first on the
> storage server and this cannot be done by oVirt, as oVirt doesn't manage the
> storage server - this has to be done by the administrator of the storage server,
> who subsequently has to remove the multipath devices from the hosts, e.g. by
> using the ansible script mentioned above.
>
> > Thanks,
> > Juhani
> > ___
> > Users mailing list -- users@ovirt.org
> > To unsubscribe send an email to users-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> > oVirt Code of Conduct:
> > https://www.ovirt.org/community/about/community-guidelines/
> > List Archives:
> > https://lists.ovirt.org/archives/list/users@ovirt.org/message/QLENIZGWGUU4JOETXNQCNIK5GTU2EFVZ/
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IRGDSYIOBNKDCD3XX32TXLBBNYNH4IFB/

[ovirt-users] Removed FC LUN causes multipath to spam syslog

2021-03-05 Thread Juhani Rautiainen

Hi!

We are running oVirt 4.3.10 and we are migrating from 3PAR to Dell SC.
I've managed to transfer disks away from one LUN and removed it (put
it into maintenance, detached it and removed it). Now multipath seems
to spam the logs about the missing disks. How can I stop this?

Mar  5 09:58:58 ovirt01 multipathd: 360002ac0027257b9: sdr - tur checker reports path is down
Mar  5 09:58:59 ovirt01 multipathd: 360002ac0027257b9: sdh - tur checker reports path is down

And so on. Any idea how I can quiet this? And why didn't oVirt do this
automatically?

Thanks,
Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QLENIZGWGUU4JOETXNQCNIK5GTU2EFVZ/

[ovirt-users] Re: ovirt Storage Domain Cluster Filesystem?

2020-10-22 Thread Juhani Rautiainen

Hi!

Someone else can tell you more about the technical details of SPM and the other
cool stuff that does the clustering, but basically all disks are
usable on all nodes simultaneously. So different VMs on node1 and
node2 can use the same storage domain. This also enables live
migration and storage migration between domains. That's why I
suggested that you try it. You'll see that it works out of the box
pretty well. It is Gluster that seems to make life much more
interesting. With SAN things are much more boring.

-Juhani


On Thu, Oct 22, 2020 at 2:54 PM  wrote:
>
> Hi
>
> Thanks for the reply.
>
> So LVM is used in oVirt. I guess an LVM mount point can be mounted on only
> one host, because it is not a clustered LVM, correct? In other words,
> for example, if I have LVM with this setup in oVirt and 2 KVM hosts:
>
> Hosts kvm1 and kvm2
> Volume Group vg_vm
> Logical Volume lv-vm1 and lv-vm2
>
> vg_vm/lv-vm1 is mounted in kvm1 with XFS and vg_vm/lv-vm2 mounted in kvm2 
> with XFS, when kvm1 is stopped vg_vm/lv-vm1 can be mounted in kvm2.
>
> Am I correct?
>
> Thank you
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/SFBQNEZ6PF3EDSLXRV7MBU5CMB5GDGOY/



-- 
Juhani Rautiainen   jra...@iki.fi
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QELZEIVU5FJII6CHQRLDGDWX4AKJACJU/

[ovirt-users] Re: ovirt Storage Domain Cluster Filesystem?

2020-10-22 Thread Juhani Rautiainen

Hi!

We use oVirt with FC SAN in our environment. Basically you just create a
LUN in the SAN and attach it to the storage domain. oVirt uses LVM on
top of that storage for VMs and this happens automatically. All nodes
that see the LUN can use the storage. You can also add multiple
LUNs to one storage domain. Be careful with this. If you have
different types of backing store for the LUNs, things get interesting if
you mix them in the same storage domain. I mean you don't know if a VM
is running on SSD, HD or partly on both (depends on how LVM fills the
space). And as far as I know you can't remove a LUN from the domain. I
made this mistake once when I added an SSD LUN to oVirt. I had to
make a new storage domain, transfer the VMs to it, and clean
up the mistake after that. Lastly I created a new storage domain for
SSD and added the LUN there.

I suggest that you just try this. It's easier than you think.

-Juhani

On Thu, Oct 22, 2020 at 9:38 AM  wrote:
>
> Hi all
>
> I come from the Oracle VM x86 world and we are planning to move from Oracle VM to
> oVirt.
>
> I am having a hard time understanding Storage Domains in oVirt. All our storage
> is SAN and I wonder how we can manage SAN LUNs in oVirt to create a storage
> domain such that the VM guests can run on any host in the oVirt cluster?
>
> For example in Oracle VM the Storage Repository (it is the Storage Domain in 
> OVM words) are based on SAN LUNs and on top of that a cluster filesystem is 
> created so all hosts in the cluster have concurrent access to the storage 
> repository and the VM guest can be started in any of the hosts in the cluster.
>
> How do we accomplish the same in oVirt with SAN Storage? Which Cluster 
> Filesystem is supported in Storage Domain?
>
> Or perhaps in oVirt the mechanism is totally different?
>
> Thank you
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/XMXDRNDT6BJ63FGFQEIDKVU7HSVIRXCM/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/B7OMWVAMHANQ5SGK6MQSLJF4ZU4UPF7X/

[ovirt-users] Re: Wrong CPU?

2019-11-19 Thread Juhani Rautiainen

Hi!

I had to get back to work to check which CPU we have. We have AMD Epyc
7281 and the oVirt CPU Type is AMD EPYC IBPB SSBD. It seems that your CPU
is the next generation (Zen2), and I'm pretty sure that the problem is with
the Qemu version. As far as I can see from git, there is not even Zen2
support in the latest Qemu (judging by target/i386/cpu.c)? I mean they
added Hygon Dhyana (I had never even heard of this Chinese AMD EPYC
clone), and in that discussion there was a reference to the Zen2
architecture. So the biggest problem for oVirt seems to come from
upstream. Zen2 is quite good for virtualization and it's
going to sell a lot. Maybe AMD should help with that push?

-Juhani
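
A quick way to repeat the kernel-side half of the check from the quoted
message (the "grep ssbd /proc/cpuinfo" step) for several flags at once is
sketched below. This is only an illustration: the function names are made
up, and it says nothing about what libvirt or Qemu detect, which is where
the problem in this thread appears to be.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set of the first processor entry in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required):
    """List the required flags that the kernel does not report, sorted."""
    present = cpu_flags(cpuinfo_text)
    return sorted(flag for flag in required if flag not in present)
```

On a host, read /proc/cpuinfo into a string and call
missing_flags(text, ["ssbd", "ibpb"]). An empty list means the kernel sees
both flags, so a missing flag in the libvirt capabilities would then point
at the libvirt/Qemu side.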

On Fri, Nov 15, 2019 at 9:03 PM Christian Reiss
 wrote:
>
> Sorry,
>
> I meant EPYC, not Ryzen.
> How did you solve your EPYC issue?
>
> -Chris.
>
> On 15/11/2019 18:55, Juhani Rautiainen wrote:
> > Hi!
> >
> > It might be that the Qemu in oVirt doesn't recognize the Ryzen. That
> > was the case with Epyc when I started using oVirt. It was recognized as an
> > Opteron G2, which caused lots of problems when upgrading to 4.3.
> >
> > -Juhani
> >
> > On Fri, Nov 15, 2019 at 6:45 PM Christian Reiss
> >  wrote:
> >>
> >> Hey folks,
> >>
> >> running an AMD Ryzen CPU here:
> >>
> >> processor   : 0
> >> vendor_id   : AuthenticAMD
> >> cpu family  : 23
> >> model   : 49
> >> model name  : AMD EPYC 7282 16-Core Processor
> >>
> >> However, libvirt is detecting this as EPYC-IBPB without the ssbd flags?
> >>
> >>   <cpu>
> >>     <arch>x86_64</arch>
> >>     <model>EPYC-IBPB</model>
> >>     <vendor>AMD</vendor>
> >>     ...
> >>   </cpu>
> >>
> >>
> >> [root@node01 ~]# grep ssbd /var/cache/libvirt/qemu/capabilities/*.xml
> >>   
> >>   
> >>   
> >>   
> >>
> >> But the flag is there:
> >>
> >> [root@node01 ~]# grep ssbd /proc/cpuinfo | tail -n1
> >> flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
> >> cmov
> >> pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
> >> pdpe1gb rdtscp lm constant_tsc art rep_good nopl xtopology nonstop_tsc
> >> extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16
> >> sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy
> >> svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
> >> skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 cpb
> >> cat_l3 cdp_l3 hw_pstate sme retpoline_amd ssbd ibrs ibpb stibp vmmcall
> >> fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb
> >> sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total
> >> cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save
> >> tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
> >> avic v_vmsave_vmload vgif umip overflow_recov succor smca
> >>
> >> I tried adding "options kvm_amd avic=1" as well as "options kvm_amd
> >> avic=0" to /etc/modprobe.d/kvm.conf (always with reboots), adding
> >> mitigations=off to grub.. I can't think of any other solution.
> >>
> >> I just can't get the oVirt engine running with the ssbd flag. Seems cpu
> >> can do this, oVirt can do this, libvirt does not detect the cpu
> >> correctly or at least ignores it. But the hosted engine demands it.
> >>
> >> I am at a loss. Any help is oh-so-greatly appreciated.
> >>
> >> -Chris.
> >>
> >> --
> >>Christian Reiss - em...@christian-reiss.de /"\  ASCII Ribbon
> >>  supp...@alpha-labs.net   \ /Campaign
> >>X   against HTML
> >>WEB alpha-labs.net / \   in eMails
> >>
> >>GPG Retrieval https://gpg.christian-reiss.de
> >>GPG ID ABCD43C5, 0x44E29126ABCD43C

[ovirt-users] Re: Wrong CPU?

2019-11-16 Thread Juhani Rautiainen

Hi!

It might be that the Qemu in oVirt doesn't recognize the Ryzen. That
was the case with Epyc when I started using oVirt. It was recognized as an
Opteron G2, which caused lots of problems when upgrading to 4.3.

-Juhani

On Fri, Nov 15, 2019 at 6:45 PM Christian Reiss
 wrote:
>
> Hey folks,
>
> running an AMD Ryzen CPU here:
>
> processor   : 0
> vendor_id   : AuthenticAMD
> cpu family  : 23
> model   : 49
> model name  : AMD EPYC 7282 16-Core Processor
>
> However, libvirt is detecting this as EPYC-IBPB without the ssbd flags?
>
>   <cpu>
>     <arch>x86_64</arch>
>     <model>EPYC-IBPB</model>
>     <vendor>AMD</vendor>
>     ...
>   </cpu>
>
>
> [root@node01 ~]# grep ssbd /var/cache/libvirt/qemu/capabilities/*.xml
>  
>  
>  
>  
>
> But the flag is there:
>
> [root@node01 ~]# grep ssbd /proc/cpuinfo | tail -n1
> flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
> cmov
> pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
> pdpe1gb rdtscp lm constant_tsc art rep_good nopl xtopology nonstop_tsc
> extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16
> sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy
> svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
> skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 cpb
> cat_l3 cdp_l3 hw_pstate sme retpoline_amd ssbd ibrs ibpb stibp vmmcall
> fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb
> sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total
> cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save
> tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
> avic v_vmsave_vmload vgif umip overflow_recov succor smca
>
> I tried adding "options kvm_amd avic=1" as well as "options kvm_amd
> avic=0" to /etc/modprobe.d/kvm.conf (always with reboots), adding
> mitigations=off to grub.. I can't think of any other solution.
>
> I just can't get the oVirt engine running with the ssbd flag. Seems cpu
> can do this, oVirt can do this, libvirt does not detect the cpu
> correctly or at least ignores it. But the hosted engine demands it.
>
> I am at a loss. Any help is oh-so-greatly appreciated.
>
> -Chris.
>
> --
>   Christian Reiss - em...@christian-reiss.de /"\  ASCII Ribbon
> supp...@alpha-labs.net   \ /Campaign
>   X   against HTML
>   WEB alpha-labs.net / \   in eMails
>
>   GPG Retrieval https://gpg.christian-reiss.de
>   GPG ID ABCD43C5, 0x44E29126ABCD43C5
>   GPG fingerprint = 9549 F537 2596 86BA 733C  A4ED 44E2 9126 ABCD 43C5
>
>   "It's better to reign in hell than to serve in heaven.",
>John Milton, Paradise lost.
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/PPXT55FJZESYOAOPR7HY5LOHTYELWDN6/



-- 
Juhani Rautiainen   jra...@iki.fi
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WD6XANLPRISIQ5GQ77HMP4THR36NFITB/

[ovirt-users] Re: The CPU type of the cluster is unknown. Its possible to change the cluster cpu or set a different one per VM.

2019-06-10 Thread Juhani Rautiainen

Hi!

Epyc is not available in oVirt-4.2. If you can recover the system and
upgrade to oVirt-4.3, then you can switch to Epyc (I have a system
with Epycs running on 4.3). For recovery I would try to roll back the
changes you made to the database.

-Juhani

On Mon, Jun 10, 2019 at 11:46 AM  wrote:
>
> Hi,
>
> I am having trouble in fixing the CPU type of my oVirt cluster.
>
> I have AMD EPYC 7551P 32-Core Processor hosts(x3) in a gluster cluster. The 
> HE and cluster by default had the: AMD Opteron 23xx (Gen 3 Class Opteron).
>
> I tried changing it with this 
> method(https://lists.ovirt.org/archives/list/users@ovirt.org/thread/XYY2WAGWXBL5YA6KAQY3ZEBVFOKELAKE/),
> it didn't work as I can't move the Host to another cluster it complains 
> (Error while executing action: ***:Cannot edit Host. Server having 
> Gluster volume.)
>
> Then I tried updating the cpu_name manually in DB by following: 
> https://www.mail-archive.com/users@ovirt.org/msg33177.html
> After the Update, the web UI shows it changed to AMD EPYC but if I click on 
> Edit cluster it shows the "Intel Conroe Family" and again doesn't allow to 
> modify.
>
> I am now stuck with unusable oVirt setup, my existing VMs won't start and 
> throw: Error while executing action: **: The CPU type of the cluster is 
> unknown. Its possible to change the cluster cpu or set a different one per 
> VM.)
>
> Please help or suggest to fix this issue.
>
> HE:
> cat /proc/cpuinfo |grep "model name"
> model name  : AMD Opteron 23xx (Gen 3 Class Opteron)
> model name  : AMD Opteron 23xx (Gen 3 Class Opteron)
> model name  : AMD Opteron 23xx (Gen 3 Class Opteron)
> model name  : AMD Opteron 23xx (Gen 3 Class Opteron)
>
> rpm -qa |grep -i ovirt
> ovirt-ansible-image-template-1.1.9-1.el7.noarch
> ovirt-engine-dwh-setup-4.2.4.3-1.el7.noarch
> ovirt-engine-backend-4.2.8.2-1.el7.noarch
> ovirt-engine-extension-aaa-ldap-1.3.8-1.el7.noarch
> ovirt-engine-extension-aaa-jdbc-1.1.7-1.el7.centos.noarch
> ovirt-engine-wildfly-overlay-14.0.1-3.el7.noarch
> ovirt-ansible-hosted-engine-setup-1.0.2-1.el7.noarch
> ovirt-host-deploy-java-1.7.4-1.el7.noarch
> ovirt-engine-setup-plugin-websocket-proxy-4.2.8.2-1.el7.noarch
> ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.2.8.2-1.el7.noarch
> ovirt-ansible-engine-setup-1.1.6-1.el7.noarch
> ovirt-ansible-cluster-upgrade-1.1.10-1.el7.noarch
> ovirt-ansible-roles-1.1.6-1.el7.noarch
> ovirt-release42-4.2.8-1.el7.noarch
> ovirt-engine-setup-plugin-ovirt-engine-common-4.2.8.2-1.el7.noarch
> ovirt-engine-tools-backup-4.2.8.2-1.el7.noarch
> ovirt-provider-ovn-1.2.18-1.el7.noarch
> ovirt-imageio-common-1.4.6-1.el7.x86_64
> ovirt-js-dependencies-1.2.0-3.1.el7.centos.noarch
> ovirt-cockpit-sso-0.0.4-1.el7.noarch
> ovirt-engine-restapi-4.2.8.2-1.el7.noarch
> ovirt-engine-vmconsole-proxy-helper-4.2.8.2-1.el7.noarch
> ovirt-engine-api-explorer-0.0.2-1.el7.centos.noarch
> ovirt-engine-lib-4.2.8.2-1.el7.noarch
> ovirt-engine-setup-base-4.2.8.2-1.el7.noarch
> ovirt-engine-metrics-1.1.8.1-1.el7.noarch
> ovirt-ansible-repositories-1.1.3-1.el7.noarch
> ovirt-ansible-disaster-recovery-1.1.4-1.el7.noarch
> ovirt-ansible-infra-1.1.10-1.el7.noarch
> ovirt-ansible-shutdown-env-1.0.0-1.el7.noarch
> ovirt-engine-dwh-4.2.4.3-1.el7.noarch
> ovirt-iso-uploader-4.2.0-1.el7.centos.noarch
> ovirt-engine-webadmin-portal-4.2.8.2-1.el7.noarch
> ovirt-engine-dbscripts-4.2.8.2-1.el7.noarch
> ovirt-engine-setup-plugin-ovirt-engine-4.2.8.2-1.el7.noarch
> ovirt-engine-extension-aaa-ldap-setup-1.3.8-1.el7.noarch
> ovirt-engine-extensions-api-impl-4.2.8.2-1.el7.noarch
> ovirt-host-deploy-1.7.4-1.el7.noarch
> ovirt-vmconsole-1.0.6-2.el7.noarch
> python-ovirt-engine-sdk4-4.2.9-2.el7.x86_64
> ovirt-engine-wildfly-14.0.1-3.el7.x86_64
> ovirt-guest-agent-common-1.0.16-1.el7.noarch
> ovirt-ansible-v2v-conversion-host-1.9.0-1.el7.noarch
> ovirt-ansible-manageiq-1.1.13-1.el7.noarch
> ovirt-setup-lib-1.1.5-1.el7.noarch
> ovirt-engine-websocket-proxy-4.2.8.2-1.el7.noarch
> ovirt-engine-dashboard-1.2.4-1.el7.noarch
> ovirt-engine-setup-4.2.8.2-1.el7.noarch
> ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
> ovirt-vmconsole-proxy-1.0.6-2.el7.noarch
> ovirt-web-ui-1.4.5-1.el7.noarch
> ovirt-ansible-vm-infra-1.1.12-1.el7.noarch
> ovirt-imageio-proxy-setup-1.4.6-1.el7.noarch
> ovirt-imageio-proxy-1.4.6-1.el7.noarch
> ovirt-engine-tools-4.2.8.2-1.el7.noarch
> ovirt-engine-4.2.8.2-1.el7.noarch
> ovirt-engine-cli-3.6.9.2-1.el7.centos.noarch
>
> 3xHosts:
> cat /proc/cpuinfo |grep "model name"
> model name  : AMD EPYC 7551P 32-Core Processor
> model name  : AMD EPYC 7551P 32-Core Processor
> model name  : AMD EPYC 7551P 32-Core Processor
>
> rpm -qa |grep -i ovirt
> ovirt-release42-4.2.8-1.el7.noarch
> cockpit-ovirt-dashboard-0.11.38-1.el7.noarch
> ovirt-vmconsole-1.0.6-2.el7.noarch
> python-ovirt-engine-sdk4-4.2.9-2.el7.x86_64
> cockpit-machines-ovirt-193-2.el7.noarch
> ovirt-imageio-daemon-1.4.6-1.el7.noarch
>

[ovirt-users] Upgrade success story (4.3.0->4.3.3)

2019-05-13 Thread Juhani Rautiainen

Hi!

Just wanted to let you know that I successfully upgraded our cluster to
4.3.3 (FC SAN based system and no Gluster). I had postponed the upgrade for
a long time because of the difficulties I had when upgrading from 4.2.8->4.3.0.
This time the Hosted Engine upgrade went without any problems. The first
node also upgraded from the admin UI without any problems. The
second node was a different story. I couldn't upgrade it because yum
complained about missing packages. This was mysterious as I had just
upgraded the other node. I bypassed this by copying the ovirt-4.3 repos
from the other node, since diffing the files showed that they differed.
After that the node upgraded from the UI but went missing after reboot.
Contacting it via ILO I found out that it had switched to DHCP again
(which it never should do). Somehow the VDSM persistent
settings were still wrong and pointing to DHCP (it has always had a
fixed address). I fixed this by hand by using the VDSM persistent settings
from the running node. Now the node reboots reliably.

Anyway, thanks for all the good work the oVirt people put in. Small
problems here and there, but it's getting better with each release.

-Juhani
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XZTHV3THA3KUCN6BPUHD3XE4FQ7K3QK6/

[ovirt-users] Re: Node losing management network address?

2019-05-13 Thread Juhani Rautiainen

Thanks for the tip. I upgraded my cluster and one node (the one that started
this thread) again picked up a DHCP address after reboot. I checked
/var/lib/vdsm/persistence/netconf/nets/ovirtmgmt and it _still_ had
DHCP settings in it. And I had previously removed the node from the cluster
and added it back. It seems that VDSM doesn't compare the settings in the
engine DB with those on the node. I don't know if that is because it "knows"
that they are correct.

-Juhani
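
To spot this before the next reboot, something like the following sketch
could check the persisted settings on each node. It rests on an assumption
that this thread does not confirm: that the file under
/var/lib/vdsm/persistence/netconf/nets/ is JSON and carries a "bootproto"
key. Check the file on your own node first; the function names are made up.

```python
import json
from pathlib import Path

# Path taken from the message above.
NET_FILE = Path("/var/lib/vdsm/persistence/netconf/nets/ovirtmgmt")


def uses_dhcp(config):
    """True if a persisted network dict asks for DHCP (assumed key: bootproto)."""
    return str(config.get("bootproto", "")).lower() == "dhcp"


def load_persisted_net(path=NET_FILE):
    """Read one persisted network definition, assuming it is stored as JSON."""
    return json.loads(Path(path).read_text())
```

Calling uses_dhcp(load_persisted_net()) on a node that is supposed to have a
fixed address should then return False.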

On Sun, Mar 3, 2019 at 11:02 AM  wrote:
>
> I had this issue, I believe that when I tried to fix the network manually so 
> that ovirt could sync the correct config, vdsm was kicking in and overwriting 
> my changes with what it had stored in /var/lib/vdsm/persistence/netconf/ 
> before the sync took place. For whatever reason this was dhcp.
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/23MLUL2YU7VYFGQXSHZJ3CTC3ZMIHAMR/



-- 
Juhani Rautiainen   jra...@iki.fi
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OITO2EXPJZJJHTXO5SMYQLN7J7NOQ6K6/

[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-20 Thread Juhani Rautiainen

On Tue, Mar 19, 2019 at 3:40 PM Juhani Rautiainen
 wrote:
>

> > while true;
> >do ping -c 1 -W 2 10.168.8.1 > /dev/null; echo $?; sleep 0.5;
> > done
>
> I'll try this tomorrow during the expected failure time.

And I found the reason. Nothing is wrong with oVirt. There is a big
file transfer going through the FW every fifteen minutes and its ping
response goes beyond horrible. And it's an enterprise-level FW.


Sorry for wasting the time and thanks for the help,
  Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BP6KFMNYIOMH3YS55UEF5NZEWKQH34KH/

[ovirt-users] Re: Daily reboots of Hosted Engine?

On Tue, Mar 19, 2019 at 3:01 PM Simone Tiraboschi  wrote:
>

>>
>> No failed pings to be seen. So how that ping.py decides that 4 out of 5 
>> failed??
>
>
> It's just calling the system ping utility as an external process checking the 
> exit code.
> I don't see any issue with that approach.

I was looking at the same thing, but I can also see that the packets reach
the host NIC. I just read the times again and it seems that the first ping
was delayed (took over 2 secs). So is that "4 out of 5" the number of
successful pings? Because I read it the other way.

> Can you please try executing:
>
> while true;
>do ping -c 1 -W 2 10.168.8.1 > /dev/null; echo $?; sleep 0.5;
> done

I'll try this tomorrow during the expected failure time.

Thanks,
Juhani
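
The shell loop above prints only the exit code. A small Python sketch of the
same test that also records how long each ping took, so that a slow reply
can be told apart from a lost one, could look like this. It uses the same
"ping -c 1 -W 2" invocation as the loop; the function names and the
"command" override (there only so the helper can be tried without a
network) are made up for illustration and are not how ping.py is written.

```python
import subprocess
import time


def probe(host, timeout_s=2, command=None):
    """Run one ping and return (succeeded, elapsed_seconds)."""
    # Same invocation as the quoted loop: one packet, 2 s reply timeout.
    cmd = command or ["ping", "-c", "1", "-W", str(timeout_s), host]
    started = time.monotonic()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0, time.monotonic() - started


def watch(host, count, pause_s=0.5, command=None):
    """Probe repeatedly and return the failures as (attempt, elapsed) pairs."""
    failures = []
    for attempt in range(count):
        ok, elapsed = probe(host, command=command)
        if not ok:
            failures.append((attempt, elapsed))
        time.sleep(pause_s)
    return failures
```

Running watch("10.168.8.1", 120) around the expected failure time gives a
list of failed attempts with their durations. Failures that took about as
long as the timeout would suggest late replies rather than a dead gateway.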
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/T4BR74HPP6ADXEQTC7SHK36DXUUY4UDB/

[ovirt-users] Re: Daily reboots of Hosted Engine?

On Tue, Mar 19, 2019 at 1:33 PM Juhani Rautiainen
 wrote:
>
> On Tue, Mar 19, 2019 at 12:46 PM Juhani Rautiainen
>
> It seems that either our firewall is not responding to pings or
> something else is wrong. Looking at the broker.log this can be seen.
> Curious thing is that the reboot happens even when ping comes back in
> couple of seconds. Is there timeout in ping or does it fire them in
> quick succession?

I don't know much Python, but I think there is a problem with
broker/ping.py. I noticed that these ping failures happen every
fifteen minutes:

[root@ovirt01 ~]# grep Failed /var/log/ovirt-hosted-engine-ha/broker.log
Thread-1::WARNING::2019-03-19 14:04:44,898::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1, (4 out of 5)
Thread-1::WARNING::2019-03-19 14:19:38,891::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1, (4 out of 5)

I monitored the firewall and the network traffic on the host, and ping works,
but ping.py somehow thinks that it did not get replies. I can't
see anything obvious in the code. But this is the tcpdump output from the
time frame of that last failure:

14:19:22.598518 IP ovirt01.virt.local > gateway: ICMP echo request, id 19055, seq 1, length 64
14:19:22.598705 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19055, seq 1, length 64
14:19:23.126800 IP ovirt01.virt.local > gateway: ICMP echo request, id 19056, seq 1, length 64
14:19:23.126978 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19056, seq 1, length 64
14:19:23.653544 IP ovirt01.virt.local > gateway: ICMP echo request, id 19057, seq 1, length 64
14:19:23.653731 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19057, seq 1, length 64
14:19:24.180846 IP ovirt01.virt.local > gateway: ICMP echo request, id 19058, seq 1, length 64
14:19:24.181042 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19058, seq 1, length 64
14:19:24.708083 IP ovirt01.virt.local > gateway: ICMP echo request, id 19065, seq 1, length 64
14:19:24.708274 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19065, seq 1, length 64
14:19:32.743986 IP ovirt01.virt.local > gateway: ICMP echo request, id 19141, seq 1, length 64
14:19:35.160398 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19141, seq 1, length 64
14:19:35.271171 IP ovirt01.virt.local > gateway: ICMP echo request, id 19152, seq 1, length 64
14:19:35.365315 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19152, seq 1, length 64
14:19:35.892716 IP ovirt01.virt.local > gateway: ICMP echo request, id 19154, seq 1, length 64
14:19:36.002087 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19154, seq 1, length 64
14:19:36.529263 IP ovirt01.virt.local > gateway: ICMP echo request, id 19156, seq 1, length 64
14:19:38.359281 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19156, seq 1, length 64
14:19:38.887231 IP ovirt01.virt.local > gateway: ICMP echo request, id 19201, seq 1, length 64
14:19:38.889774 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19201, seq 1, length 64
14:19:42.923684 IP ovirt01.virt.local > gateway: ICMP echo request, id 19234, seq 1, length 64
14:19:42.923951 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19234, seq 1, length 64
14:19:43.450788 IP ovirt01.virt.local > gateway: ICMP echo request, id 19235, seq 1, length 64
14:19:43.450968 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19235, seq 1, length 64
14:19:43.977791 IP ovirt01.virt.local > gateway: ICMP echo request, id 19237, seq 1, length 64
14:19:43.977965 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19237, seq 1, length 64
14:19:44.504541 IP ovirt01.virt.local > gateway: ICMP echo request, id 19238, seq 1, length 64
14:19:44.504715 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19238, seq 1, length 64
14:19:45.031570 IP ovirt01.virt.local > gateway: ICMP echo request, id 19244, seq 1, length 64
14:19:45.031752 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19244, seq 1, length 64

No failed pings to be seen. So how does ping.py decide that 4 out of 5 failed?
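
One way to double-check the capture is to pair each echo request with its
reply by ICMP id and look at the delay. This is only a minimal sketch and has
nothing to do with the real ping.py; the function name reply_delays and the
regular expression are my own assumptions about the tcpdump text format shown
above:

```python
import re
from datetime import datetime

# Matches the tcpdump lines shown above; \s+ also tolerates the mail
# client wrapping a line between "id" and the number.
LINE = re.compile(
    r"(\d\d:\d\d:\d\d\.\d+) IP \S+ > \S+: ICMP echo (request|reply),\s+id\s+(\d+)"
)

def reply_delays(capture):
    """Return {icmp_id: seconds between echo request and echo reply}."""
    sent = {}
    delays = {}
    for stamp, kind, icmp_id in LINE.findall(capture):
        when = datetime.strptime(stamp, "%H:%M:%S.%f")
        if kind == "request":
            sent[icmp_id] = when
        elif icmp_id in sent:
            delays[icmp_id] = (when - sent[icmp_id]).total_seconds()
    return delays

# Two request/reply pairs copied from the capture above.
sample = """\
14:19:24.708083 IP ovirt01.virt.local > gateway: ICMP echo request, id
19065, seq 1, length 64
14:19:24.708274 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19065, seq 1, length 64
14:19:32.743986 IP ovirt01.virt.local > gateway: ICMP echo request, id
19141, seq 1, length 64
14:19:35.160398 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19141, seq 1, length 64
"""

for icmp_id, seconds in reply_delays(sample).items():
    print(icmp_id, round(seconds, 3))
```

Going by the timestamps above, every request gets a reply, but the replies
for ids 19141 and 19156 take roughly 2.4 and 1.8 seconds. If the monitor
waits less than that per ping, it could count those as failed. I have not
checked what timeout ping.py actually uses.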

Thanks,
  Juhani
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UH7MKGQECM2VSI77DNRHQB56C76FJBTY/

[ovirt-users] Re: Daily reboots of Hosted Engine?

On Tue, Mar 19, 2019 at 12:46 PM Juhani Rautiainen
 wrote:
>
>
> I couldn't find anything that jumps out as a problem, but another post on
> the list made me check the ha-agent logs. This is the reason for the reboot:
>
> MainThread::INFO::2019-03-19
> 12:04:41,262::states::135::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Penalizing score by 1600 due to gateway status
> MainThread::INFO::2019-03-19
> 12:04:41,263::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
> Current state EngineUp (score: 1800)
> MainThread::ERROR::2019-03-19
> 12:04:51,283::states::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Host ovirt02.virt.local (id 2) score is significantly better than
> local score, shutting down VM on this host
> MainThread::INFO::2019-03-19
> 12:04:51,467::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition (EngineUp-EngineStop)
> sent? sent
> MainThread::INFO::2019-03-19
> 12:04:51,624::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
> Current state EngineStop (score: 3400)
>
> So the HA agent does the reboot. Now the question is: what does
> 'Penalizing score by 1600 due to gateway status' mean? Other HA VMs
> don't seem to have any problems.

It seems that either our firewall is not responding to pings or
something else is wrong. This can be seen in broker.log below.
The curious thing is that the reboot happens even though ping comes back
within a couple of seconds. Is there a timeout on the ping, or does it fire
them in quick succession?

Thread-1::INFO::2019-03-19 12:04:20,244::ping::60::ping.Ping::(action)
Successfully pinged 10.168.8.1
Thread-2::INFO::2019-03-19
12:04:20,567::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-5::INFO::2019-03-19
12:04:24,729::engine_health::242::engine_health.EngineHealth::(_result_from_stats)
VM is up on this host with healthy engine
Thread-2::INFO::2019-03-19
12:04:29,745::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-3::INFO::2019-03-19
12:04:30,166::mem_free::51::mem_free.MemFree::(action) memFree: 340451
Thread-5::INFO::2019-03-19
12:04:34,843::engine_health::242::engine_health.EngineHealth::(_result_from_stats)
VM is up on this host with healthy engine
Thread-2::INFO::2019-03-19
12:04:39,926::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-3::INFO::2019-03-19
12:04:40,287::mem_free::51::mem_free.MemFree::(action) memFree: 340450
Thread-1::WARNING::2019-03-19
12:04:40,389::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1,
(0 out of 5)
Thread-1::INFO::2019-03-19 12:04:43,474::ping::60::ping.Ping::(action)
Successfully pinged 10.168.8.1
Thread-5::INFO::2019-03-19
12:04:44,961::engine_health::242::engine_health.EngineHealth::(_result_from_stats)
VM is up on this host with healthy engine
Thread-2::INFO::2019-03-19
12:04:50,154::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-3::INFO::2019-03-19
12:04:50,415::mem_free::51::mem_free.MemFree::(action) memFree: 340454
Thread-1::INFO::2019-03-19 12:04:51,616::ping::60::ping.Ping::(action)
Successfully pinged 10.168.8.1
Thread-5::INFO::2019-03-19
12:04:55,076::engine_health::242::engine_health.EngineHealth::(_result_from_stats)
VM is up on this host with healthy engine
Thread-4::INFO::2019-03-19
12:04:59,197::cpu_load_no_engine::126::cpu_load_no_engine.CpuLoadNoEngine::(calculate_load)
System load total=0.0247, engine=0.0004, non-engine=0.0243
Thread-2::INFO::2019-03-19
12:05:00,434::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-3::INFO::2019-03-19
12:05:00,541::mem_free::51::mem_free.MemFree::(action) memFree: 340433
Thread-1::INFO::2019-03-19 12:05:01,763::ping::60::ping.Ping::(action)
Successfully pinged 10.168.8.1
Thread-7::INFO::2019-03-19
12:05:06,692::engine_health::203::engine_health.EngineHealth::(_result_from_stats)
VM not running on this host, status Down

Thanks,
Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PCIGAKWR6OZZTOEQ33P2QUA6RTJM5WQY/

[ovirt-users] Re: Daily reboots of Hosted Engine?

On Tue, Mar 19, 2019 at 12:39 PM Kaustav Majumder  wrote:
>
>

> It should not affect.
>>
>> Can
>> this cause problems? I noticed that this message was in events hour
>> before reboot:
>>
> @Sahina Bose what can cause such?
>>
>> Invalid status on Data Center Default. Setting status to Non Responsive.
>>
>> Same event happened just after reboot.
>
>> -Juhani
>
>
> Can you also check the vdsm logs for any anomaly around the time of reboot .

I couldn't find anything that jumps out as a problem, but another post on
the list made me check the ha-agent logs. This is the reason for the reboot:

MainThread::INFO::2019-03-19
12:04:41,262::states::135::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
Penalizing score by 1600 due to gateway status
MainThread::INFO::2019-03-19
12:04:41,263::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
Current state EngineUp (score: 1800)
MainThread::ERROR::2019-03-19
12:04:51,283::states::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Host ovirt02.virt.local (id 2) score is significantly better than
local score, shutting down VM on this host
MainThread::INFO::2019-03-19
12:04:51,467::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineUp-EngineStop)
sent? sent
MainThread::INFO::2019-03-19
12:04:51,624::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
Current state EngineStop (score: 3400)

So the HA agent does the reboot. Now the question is: what does
'Penalizing score by 1600 due to gateway status' mean? Other HA VMs
don't seem to have any problems.
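
To make the numbers in that log concrete: the local score drops from 3400 to
1800 once the 1600 gateway penalty is applied, and I assume the other host is
still at the full 3400. A minimal sketch of that arithmetic follows. The
names and the 800-point threshold are my own assumptions for illustration,
not taken from the agent code:

```python
BASE_SCORE = 3400          # score logged once the host is healthy again
GATEWAY_PENALTY = 1600     # "Penalizing score by 1600 due to gateway status"
MIGRATION_THRESHOLD = 800  # assumed value, only for illustration

def local_score(gateway_ok):
    """Score of this host after the gateway penalty, if any."""
    return BASE_SCORE - (0 if gateway_ok else GATEWAY_PENALTY)

def should_stop_engine(local, remote, threshold=MIGRATION_THRESHOLD):
    """True when the other host's score is better by more than the threshold."""
    return remote - local > threshold

penalized = local_score(gateway_ok=False)
print(penalized)                                  # 1800, as in the log
print(should_stop_engine(penalized, BASE_SCORE))  # True
```

So a single failed gateway check seems to be enough to make the other host
look "significantly better", if the real threshold is anywhere below 1600.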

Thanks,
Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FYM4IONIT5K7NOYLZ3S2GIEDCSIFKXQI/

[ovirt-users] Re: Daily reboots of Hosted Engine?

On Tue, Mar 19, 2019 at 12:21 PM Kaustav Majumder  wrote:
>
> Hi,
> Can you check if the he vm fqdn resolves to it's ip from all the hosts?

I checked both hosts and DNS resolution works fine. It just occurred to me
that I also added the addresses to /etc/hosts in case DNS fails. Can
this cause problems? I noticed that this message was in the events an hour
before the reboot:

Invalid status on Data Center Default. Setting status to Non Responsive.

The same event happened just after the reboot.

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/742ETPAXWBALM443VUOYFV6UITK3YKXH/

[ovirt-users] Daily reboots of Hosted Engine?

Hi!

The hosted engine reboots itself almost daily. Is this by design? If not,
where should I be searching for clues about why it shuts down? Something
is giving the HE a reboot order, because /var/log/messages contains
this:
Mar 19 12:05:00 ovirtmgr qemu-ga: info: guest-shutdown called, mode: powerdown
Mar 19 12:05:00 ovirtmgr systemd: Started Delayed Shutdown Service.

I'm still running v4.3.0 because the upgrade to it was a bit painful
and I haven't dared to start a new round.

Thanks,
-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WIJXBQZQT4HDWAQ4IVLOIFGHKAKTT76O/

[ovirt-users] Re: Change cluster cpu type with hosted engine

2019-03-12 Thread Juhani Rautiainen

On Tue, Mar 12, 2019 at 2:02 PM Fabrice SOLER <
fabrice.so...@ac-guadeloupe.fr> wrote:

> Hello,
>
> I need to create a windows 10 virtual machine but I have an error :
>
> I have a fresh ovirt installation (version 4.2.8) with an hosted engine.
> At the hosted engine installation there was no question about the cluster
> cpu type, it should be great if in the future version it could be.
>
> To change an host to another cluster this host need to be in maintenance
> mode, and the hosted engine will be power off.
>
> I have created another Cluster with an SandyBridge Family CPU type, but to
> move the hosted engine to this new cluster the hosted should be power off.
>
> Is there someone who can help ?
>

Hi!

This is modified from the original procedure but could work:
- create a new cluster with the new CPU type
- set HE global maintenance mode
- set one of the hosted-engine hosts into maintenance mode
- move it to the new cluster
- shut down the engine VM
- manually restart the engine VM on the host in the new cluster by
executing directly on that host: hosted-engine --vm-start
- connect again to the engine
- set all the hosts of the initial cluster into maintenance mode
- change the CPU type in the original cluster
- shut down the engine VM again
- manually restart the engine VM on one of the hosts of the initial cluster
- move the host that was in the temporary cluster back to its initial
cluster
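
The command-line parts of the steps above can be sketched as follows. This
is only a rough sketch: the STEPS list and the run_steps helper are names I
made up, it defaults to a dry run that just prints the commands, and the
cluster and host moves in between still have to be done in webadmin:

```python
import subprocess

# Only the steps that run on a host's command line; the cluster and host
# moves in between are done in the webadmin UI.
STEPS = [
    ("set HE global maintenance",
     ["hosted-engine", "--set-maintenance", "--mode=global"]),
    ("shut down the engine VM",
     ["hosted-engine", "--vm-shutdown"]),
    ("start the engine VM on this host",
     ["hosted-engine", "--vm-start"]),
]

def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    handled = []
    for label, command in steps:
        print(f"{label}: {' '.join(command)}")
        if not dry_run:
            subprocess.run(command, check=True)
        handled.append(command)
    return handled

run_steps(STEPS)
```

Note that the shutdown and the start may have to be run on different hosts,
so in practice I would run the commands by hand and use something like this
only as a checklist.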

> Sincerely,
> --
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/KFH3ZLPA7KZSSJG3DGOGW2F4OMXE4KZK/
>

-Juhani
-- 
Juhani Rautiainen   jra...@iki.fi
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QHHOQPAAGB33PCXH3XXRLBXKTUHTMKSD/

[ovirt-users] Re: Node losing management network address?

2019-02-28 Thread Juhani Rautiainen

On Thu, Feb 28, 2019 at 3:01 PM Edward Haas  wrote:
>
> On boot, VDSM attempts to apply its persisted network desired configuration 
> through ifcfg.
> Even though you may have fixed the ifcfg files, the persisted VDSM config may 
> had been set with DHCP, therefore your changes would get overwritten on boot 
> or when the synchronization was issued from Engine.

This is what I thought when I saw it coming up with DHCP.

> When you get an unsync warning, fixes on the host should match with the 
> persisted config known to VDSM.

How can you see the persisted config? The webadmin is maybe
too user friendly and just offers the resync. It doesn't tell what it is
going to do when it resyncs. I would not have started the
operation if I had known that it was going to switch to DHCP.

> Usually the persisted state in VDSM and the one in Engine are the same, 
> unless something very strange has happened... like moving hosts between 
> Engine servers or self editing on one side without it reaching the other side.

I had to move the hosts to a different cluster (same HE). That was
recommended here because of the EPYC migration problems when upgrading
to 4.3. The discussions are in the list archives and in bug 1672859.
Maybe this was a side effect of that.

Thanks,
-Juhani

>
> On Thu, Feb 28, 2019 at 9:34 AM Juhani Rautiainen 
>  wrote:
>>
>> On Wed, Feb 27, 2019 at 6:26 PM Dominik Holler  wrote:
>> >
>>
>> >
>> > This is a valid question.
>> >
>> > > > > I noticed that one node had ovirtmgmt network unsynchronized. I tried
>> >
>> > oVirt detected a difference between the expected configuration and applied
>> > configuration. This might happen if the interface configuration is change
>> > directly on the host instead of using oVirt Engine.
>> >
>> > > > > to resynchronize it.
>> >
>> > If you have the vdsm.log, the relevant lines start at the pattern
>> > Calling 'Host.setupNetworks'
>> > and ends at the pattern
>> > FINISH getCapabilities
>>
>> This gave some clues. See the log below. IMHO it points engine getting
>> something wrong because it seems to ask for DHCP setup in query.
>> Probably fails because it succeeds in address change and network
>> connection is torn down.
>>
>> 2019-02-25 12:41:38,478+0200 WARN  (vdsm.Scheduler) [Executor] Worker
>> blocked: > {u'bondings': {}, u'networks': {u'ovirtmgmt': {u'ipv6autoconf': False,
>> u'nic': u'eno1', u'mtu': 1500, u'switch': u'legacy', u'dhcpv6': False,
>> u'STP': u'no', u'bridged': u'true', u'defaultRoute': True,
>> u'bootproto': u'dhcp'}}, u'options': {u'connectivityCheck': u'true',
>> u'connectivityTimeout': 120, u'commitOnSuccess': True}}, 'jsonrpc':
>> '2.0', 'method': u'Host.setupNetworks', 'id':
>> u'2ca75cf3-6410-43b4-aebf-cdc3f262e5c2'} at 0x7fc9c95ef350>
>> timeout=60, duration=60.01 at 0x7fc9c961e090> task#=230106 at
>> 0x7fc9a43a1fd0>, traceback:
>> File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
>>   self.__bootstrap_inner()
>> File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
>>   self.run()
>> File: "/usr/lib64/python2.7/threading.py", line 765, in run
>>   self.__target(*self.__args, **self.__kwargs)
>> File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py",
>> line 195, in run
>>   ret = func(*args, **kwargs)
>> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
>>   self._execute_task()
>> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315,
>> in _execute_task
>>   task()
>> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in 
>> __call__
>>   self._callable()
>> File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
>> 262, in __call__
>>   self._handler(self._ctx, self._req)
>> File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
>> 305, in _serveRequest
>>   response = self._handle_request(req, ctx)
>> File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
>> 345, in _handle_request
>>   res = method(**params)
>> File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 194,
>> in _dynamicMethod
>>   result = fn(*methodArgs)
>> File: "<string>", line 2, in setupNetworks
>> File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in 
>> method
>>   ret = func(*args, **kwargs)
>> File: "/usr/lib

[ovirt-users] Re: Node losing management network address?

2019-02-27 Thread Juhani Rautiainen

On Wed, Feb 27, 2019 at 6:26 PM Dominik Holler  wrote:
>

>
> This is a valid question.
>
> > > > I noticed that one node had ovirtmgmt network unsynchronized. I tried
>
> oVirt detected a difference between the expected configuration and applied
> configuration. This might happen if the interface configuration is change
> directly on the host instead of using oVirt Engine.
>
> > > > to resynchronize it.
>
> If you have the vdsm.log, the relevant lines start at the pattern
> Calling 'Host.setupNetworks'
> and ends at the pattern
> FINISH getCapabilities

This gave some clues. See the log below. IMHO it points to the engine
getting something wrong, because it seems to ask for DHCP setup in the
request. The call probably fails because the address change succeeds and
the network connection is torn down.
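
The Host.setupNetworks request logged in that "Worker blocked" warning
carries 'bootproto': 'dhcp' for ovirtmgmt. A minimal sketch of how one could
check such a request for DHCP networks is below. The dict is trimmed from the
logged one, I am assuming the outer key is the usual JSON-RPC "params", and
dhcp_networks is just a name I made up:

```python
# Trimmed from the Host.setupNetworks request in the vdsm.log warning;
# the "params" key is an assumption based on the JSON-RPC format.
request = {
    "method": "Host.setupNetworks",
    "params": {
        "bondings": {},
        "networks": {
            "ovirtmgmt": {
                "nic": "eno1",
                "bridged": "true",
                "defaultRoute": True,
                "bootproto": "dhcp",
            }
        },
        "options": {"connectivityCheck": "true", "connectivityTimeout": 120},
    },
}

def dhcp_networks(req):
    """Names of the networks that the request wants configured via DHCP."""
    networks = req.get("params", {}).get("networks", {})
    return sorted(
        name for name, attrs in networks.items()
        if attrs.get("bootproto") == "dhcp"
    )

print(dhcp_networks(request))
```

For this request the only network listed is ovirtmgmt, which matches the
node coming up with a DHCP address after the resync.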

2019-02-25 12:41:38,478+0200 WARN  (vdsm.Scheduler) [Executor] Worker
blocked: 
timeout=60, duration=60.01 at 0x7fc9c961e090> task#=230106 at
0x7fc9a43a1fd0>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py",
line 195, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315,
in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
262, in __call__
  self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
305, in _serveRequest
  response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
345, in _handle_request
  res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 194,
in _dynamicMethod
  result = fn(*methodArgs)
File: "<string>", line 2, in setupNetworks
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1562, in
setupNetworks
  supervdsm.getProxy().setupNetworks(networks, bondings, options)
File: "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py",
line 56, in __call__
  return callMethod()
File: "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py",
line 54, in <lambda>
  **kwargs)
File: "<string>", line 2, in setupNetworks
File: "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in
_callmethod
  kind, result = conn.recv() (executor:363)
2019-02-25 12:41:38,532+0200 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer]
RPC call Host.ping2 succeeded in 0.00 seconds (__init__:312)
2019-02-25 12:41:38,538+0200 INFO  (jsonrpc/5) [api.host] START
getCapabilities() from=::1,41016 (api:48)
2019-02-25 12:41:38,834+0200 INFO  (jsonrpc/7) [api.host] START
getAllVmStats() from=::1,41018 (api:48)
2019-02-25 12:41:38,845+0200 WARN  (jsonrpc/7) [virt.vm]
(vmId='66394fff-6207-4277-ab47-bffd39075eef') monitor became
unresponsive (command timeout, age=70.070003) (vm:6014)
2019-02-25 12:41:38,850+0200 INFO  (jsonrpc/7) [api.host] FINISH
getAllVmStats return={'status': {'message': 'Done', 'code': 0},
'statsList': (suppressed)} from=::1,41018 (api:54)
2019-02-25 12:41:38,854+0200 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer]
RPC call Host.getAllVmStats succeeded in 0.02 seconds (__init__:312)
2019-02-25 12:41:38,867+0200 INFO  (jsonrpc/2) [api.host] START
getAllVmIoTunePolicies() from=::1,41018 (api:48)
2019-02-25 12:41:38,868+0200 INFO  (jsonrpc/2) [api.host] FINISH
getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code':
0}, 'io_tune_policies_dict': {'5cb297ee-81c0-4fc6-a2ba-87192723e6ab':
{'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L,
'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L,
'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path':
'/rhev/data-center/mnt/blockSD/5106a440-3a0a-4a79-a8ee-0ffc5fa8e3a2/images/a4b8b705-d999-4578-99eb-c80aa1cbbec6/19e9b0b0-6485-43ed-93e5-c2ef40d2e388',
'name': 'vda'}]}, '16a6d21b-3d69-4967-982b-dfe05534f153': {'policy':
[], 'current_values': [{'ioTune': {'write_bytes_sec': 0L,
'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L,
'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path':
'/rhev/data-center/mnt/blockSD/5106a440-3a0a-4a79-a8ee-0ffc5fa8e3a2/images/67c41187-6510-429e-b88d-93a907bbbf38/b1f627d3-13c2-4800-8f29-4504d35bba5f',
'name': 'sda'}]}, '59103e4b-863c-4c16-8f45-226712a9ab08': {'policy':
[], 'current_values': [{'ioTune': {'write_bytes_sec': 0L,
'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L,
'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path':

[ovirt-users] Re: Node losing management network address?

2019-02-27 Thread Juhani Rautiainen

On Wed, Feb 27, 2019 at 10:49 AM Dominik Holler  wrote:
>

> > I just copied that ifcfg-ovirtmgmt file from second node and fixed
> > IP-address to correct one before doing ifdown/ifup. That file had
> > changed to DHCP so my instinct was trying to correct that one.
> >
>
> Please let oVirt doing the work for you. If you interface to oVirt is
> the web UI, please use the dialog "Edit Managment Network: ovirtmgmt"
> which opens by clicking on the pencil symbol next to ovirtmgmt in
> Compute > Hosts > xxx > Network Interfaces > Setup Host Networks
> This will enable oVirt to recognize this change as intended.

The problem was that oVirt couldn't do the work anymore. For some
reason it had switched that node to using DHCP. DHCP gave the node a
totally different address, which was not known to the oVirt engine.
That is why I tried the change above: I had lost the connection to the
node after the resync. I had to use the HP ILO console to see what was
going on, and found out that the node had switched to DHCP and had the
wrong address. The engine also used ILO fencing to reboot the server
because it couldn't reach it (which took many active VMs down). After
the reboot it still couldn't connect because the address was still
given by DHCP. What I'm wondering is why it switched to DHCP when it
had been static since the first minute.

>
> Maybe no required anymore, since you described very precise what you
> did.

Or not clearly enough.

Thanks,
-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2ZVLCGNK4U2F4TQOKAV3E4G5GJ5VMJQU/

[ovirt-users] Re: Node losing management network address?

2019-02-26 Thread Juhani Rautiainen

On Tue, Feb 26, 2019 at 12:05 PM Dominik Holler  wrote:
>
> On Mon, 25 Feb 2019 13:46:59 +0200
> Juhani Rautiainen  wrote:
>
> > Hi!
> >
> > I had weird occurence in my two node ovirt cluster today (I have HE).
> > I noticed that one node had ovirtmgmt network unsynchronized. I tried
> > to resynchronize it. This led the node being rebooted by HP ILO. After
> > reboot the node came up with DHCP address. Tried to change it back by
> > fixing ifcfg-ovirtmgmt to original static address.
>
> How did you fix? By ovirt-engine's web UI, REST-API or by modifying a
> config file on the host, or cockpit?

I just copied the ifcfg-ovirtmgmt file from the second node and changed
the IP address to the correct one before doing ifdown/ifup. That file had
been changed to DHCP, so my instinct was to correct it.

>
> If you would share the vdsm.log files containing the relevant flow, this
> would help to understand what happened.

Can I upload these somewhere? I can find the vdsm logs from the
failure time frame. From the engine logs I can see that EVENT_ID:
VDS_NETWORKS_OUT_OF_SYNC(1,110) started weeks earlier (February 6th).
The problem only really flared up when I noticed it and tried to resync.
The vdsm logs don't go back far enough to see what happened back then.
This event continues daily, so is there anything in the vdsm logs
connected to that event that I could dig for? I just noticed that this
is pretty much the date I upgraded the cluster from 4.2.8 to 4.3.

Thanks,
Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4IAS36BZGYZ2ETPHXYOR4DHC5GIMFYJB/

[ovirt-users] Re: Adding user to internalz with webadmin?

2019-02-26 Thread Juhani Rautiainen

On Tue, Feb 26, 2019 at 9:09 AM Lucie Leistnerova  wrote:
>
> Hi Juhani,
>
> engine ovirt-aaa-jdbc-tool
>
> # ovirt-aaa-jdbc-tool user add test
> # ovirt-aaa-jdbc-tool user password-reset test
> --password-valid-to="2020-01-01 00:00:00Z"
>
> See
> https://www.ovirt.org/documentation/admin-guide/chap-Users_and_Roles.html
>
> Adding user in webadmin works only for already created user to see/set
> permissions for them. In the dialog is Search row, when you press Go and
> internal is selected in the first selectbox, all users will be displayed
> in table below. You check some and then press Add.
>

Thank you. This worked perfectly. Maybe there should be a small hint
in the UI that you can't add users from there.

> --
> Lucie Leistnerova

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NQAOWBIVTJRQ3BFRBBTWQADKJI5TGTZR/

[ovirt-users] Adding user to internalz with webadmin?

2019-02-25 Thread Juhani Rautiainen

Hi!

How do you add users to internalz with webadmin? I mean there is an Add
button in Administration->Users which opens the 'Add Users and
Groups' window. The window has 'Add' and 'Add and close' buttons at
the bottom. I just can't figure out how they work. Pushing either of
those Add buttons just closes the window.

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Y4GFUACEHEEHYPHCIXDKX3UGRHB5FD23/

[ovirt-users] Re: How to upgrade ovirt node host?

2019-02-19 Thread Juhani Rautiainen

On Tue, Feb 19, 2019 at 3:30 PM adam...@adagene.com.cn
 wrote:
>
> Thank you. Your answer solved my problems.
>  After I installed the big rpm and reboot.  there are two repo files remains  
> in /etc/yum.repos.d/
> ovirt-4.2-dependencies.repo
> ovirt-4.2.repo
> maybe I should remove the 2 files to avoid any conflicts?
>
I think that they'll disappear if you do

yum remove ovirt-release42

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2PE4TWKZHMCJYIIDUWC7QL7JXOYV67TN/

[ovirt-users] Re: How to upgrade ovirt node host?

2019-02-19 Thread Juhani Rautiainen

On Mon, Feb 18, 2019 at 9:19 AM Lucie Leistnerova  wrote:

>
> 1. Ensure the correct repositories are enabled. You can check which 
> repositories are currently enabled by running yum repolist.
>
> You can do it like this:
>
> # yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm

And this led me to problems when I did 'yum update' after that. It
should work but didn't. So do this instead:

yum update ovirt-node-ng-image-update

or do everything in one step:

yum install https://resources.ovirt.org/pub/ovirt-4.3/rpm/el7/noarch/ovirt-node-ng-image-update-4.3.0-1.el7.noarch.rpm

I tried both and they worked.

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NNQ7K6Z3LAUQTS3YOTCULMDHAI3YOGRC/

[ovirt-users] Re: AMD EPYC 4.3 upgrade 'CPU type is not supported in this cluster compatibility version or is not supported at all'

2019-02-09 Thread Juhani Rautiainen

On Sat, Feb 9, 2019 at 7:43 PM Ryan Bullock  wrote:
>
> So I tried making a new cluster with a 4.2 compatibility level and moving one 
> of my EPYC hosts into it. I then updated the host to 4.3 and switched the 
> cluster version 4.3 + set cluster cpu to the new AMD EPYC IBPD SSBD (also 
> tried plain AMD EPYC). It still fails to make the host operational 
> complaining that 'CPU type is not supported in this cluster compatibility 
> version or is not supported at all'.
>
When I did this with Epyc I made a new cluster with the 4.3 level and the
Epyc CPU type, and then moved the nodes to it. Maybe try that? I also had
to move a couple of VMs to the new cluster because the old cluster couldn't
be upgraded with them in it. Once the nodes and the couple of problem VMs
were in the new cluster I could upgrade the old cluster to the new level.

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RDCP3NCZUILRJB3SMGGDYFP5YT2EBYYO/

[ovirt-users] Re: How to upgrade CPU type in Clusters?

2019-02-08 Thread Juhani Rautiainen

On Sat, Feb 9, 2019 at 1:05 AM Vinícius Ferrão
 wrote:
>
> Hello,
>
> For reasons unknown during oVirt 4.2 -> 4.3 my CPU type was Broadwell IBRS 
> and because of this I can’t upgrade the datacenter to 4.3. The problem was 
> that on 4.3 theres only Broadwell IBRS SSBD and Broadwell.
>
> So the only way to up the cluster to 4.3 was to downgrade to plain Broadwell.
>
> Now I can’t upgrade it Broadwell IBRS SSBD.
>
> So, how can I upgrade it back?

There has been lots of discussion about that this week. I had the same
problem with the Opteron G3->Epyc conversion. These steps were suggested
to me and they worked, and your problem is similar:

- set HE global maintenance mode
- set one of the hosted-engine hosts into maintenance mode
- move it to a different cluster
- shutdown the engine VM
- manually restart the engine VM on the host on the custom cluster
directly executing on that host: hosted-engine --vm-start
- connect again to the engine
- set all the hosts of the initial cluster into maintenance mode
- upgrade the cluster
- shut down again the engine VM
- manually restart the engine VM on one of the hosts of the initial cluster
- move back the host that got into a temporary cluster to its initial cluster

As a first step, I would add that you create a new cluster that already
has the 4.3 level and the CPU type you want, so you can check in advance
whether the cluster accepts the nodes.
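
A rough sketch of the host-side commands in the steps above, assuming the
standard hosted-engine CLI is available on the hosts. The function names and
the DRY_RUN switch are made up for illustration. Host maintenance, moving
hosts between clusters and the cluster upgrade itself are done through the
engine and are not covered here.

```shell
#!/bin/sh
# Illustrative helper: with DRY_RUN=1 (the default here) each command is
# only printed, prefixed with "+", instead of being executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# "set HE global maintenance mode"
he_global_maintenance() {
    run hosted-engine --set-maintenance --mode=global
}

# "shutdown the engine VM"
he_shutdown_engine() {
    run hosted-engine --vm-shutdown
}

# "manually restart the engine VM", executed directly on the target host
he_start_engine() {
    run hosted-engine --vm-start
}

# Not in the list above (assumption): leave global maintenance at the end.
he_end_maintenance() {
    run hosted-engine --set-maintenance --mode=none
}
```

With DRY_RUN=0 the commands are really executed. he_shutdown_engine would
typically be run on the host currently running the engine VM, and
he_start_engine on the host where the engine should come up next.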

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VFWTYTLXCLN3DGUH5ENLWVK34SHPZAC2/

[ovirt-users] Re: AMD EPYC 4.3 upgrade 'CPU type is not supported in this cluster compatibility version or is not supported at all'

2019-02-07 Thread Juhani Rautiainen

On Thu, Feb 7, 2019 at 6:52 PM Simone Tiraboschi wrote:

>
>
> For an hosted-engine cluster we have a manual workaround procedure
> documented here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1672859#c1
>
>
I managed to upgrade my Epyc cluster with those steps. I made a new cluster
with the Epyc CPU type, already at the 4.3 level. Starting the engine in the
new cluster produced a complaint about not finding a VM with that UUID, but
the engine still started fine. When all nodes were in the new cluster I still
couldn't upgrade the old cluster, because the engine complained that a couple
of VMs couldn't be upgraded (something to do with custom level). I moved them
to the new cluster too; I just had to change their networks to management for
the move. After that I could upgrade the old cluster to Epyc and the 4.3
level. Then I moved the VMs and nodes back (same steps but backwards). After
that you can remove the extra cluster and raise the datacenter to the 4.3
level.

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/426ZMUHUUPO3Z3H43NVHKUPSPLZY4LI5/

[ovirt-users] Re: Changing CPU type from Opteron G3 to Epyc with hosted engine in 4.3

2019-02-06 Thread Juhani Rautiainen

On Wed, Feb 6, 2019 at 10:31 AM Simone Tiraboschi  wrote:
>

>
> You can simply upgrade the cluster if all the hosts are in global maintenance 
> mode.

As I originally wrote, it doesn't work like that for me; this is what I
tried first. I tried again just now and even confirmed from the CLI that
the cluster is in the correct mode:

# hosted-engine --vm-status

!! Cluster is in GLOBAL MAINTENANCE mode !!

And still I get this:

"Error while executing action: Cannot change Cluster CPU type unless
all Hosts attached to this Cluster are in Maintenance"

Is there a log where I can check for traces of why it gives this error
message?
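
The CLI check above could be scripted roughly like this. It is a minimal
sketch: the function name is invented, and it only matches the banner text
shown in the output above. Note that it confirms HA global maintenance only,
which appears to be a different state from the per-host Maintenance status
that the error message asks for.

```shell
#!/bin/sh
# Reads status text on stdin and returns 0 if it contains the
# global maintenance banner, non-zero otherwise.
in_global_maintenance() {
    grep -q 'GLOBAL MAINTENANCE'
}

# On a hosted-engine host (assumes hosted-engine is on PATH):
#   hosted-engine --vm-status | in_global_maintenance \
#       && echo "global maintenance is set"
```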

> The only case that could prevent that is that you are in hosted-engine mode 
> and so you cannot set the last host into maintenance mode without losing 
> the engine itself.
>
> If this is your case,
> what you can do is:
> - set HE global maintenance mode
> - set one of the hosted-engine hosts into maintenance mode
> - move it to a different cluster
> - shutdown the engine VM
> - manually restart the engine VM on the host on the custom cluster directly 
> executing on that host: hosted-engine --vm-start
> - connect again to the engine
> - set all the hosts of the initial cluster into maintenance mode
> - upgrade the cluster
> - shut down again the engine VM
> - manually restart the engine VM on one of the hosts of the initial cluster
> - move back the host that got into a temporary cluster to its initial cluster

I might try this one.

Thanks,
Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WN6XG4JT5MPLGIINEGB2W2YRGJ43JMHG/

[ovirt-users] Re: Changing CPU type from Opteron G3 to Epyc with hosted engine in 4.3

2019-02-05 Thread Juhani Rautiainen

On Tue, Feb 5, 2019 at 8:24 AM Juhani Rautiainen wrote:
> Hi!
> Now that I have engine and nodes on 4.3 level I'm stuck on upgrading
> the CPU type.  What is the correct way to solve this problem?

So it seems that I have to reinstall the whole cluster, because silence
usually means that there is no solution. As it currently stands I don't
see any other way. I made a Bugzilla entry about the problem (1672859).
This problem was discussed briefly on this list in 2016, but probably
nobody bothered to file a bug report back then. It makes upgrading
hardware a bit pointless if you can't use the new CPU features.

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/54K7BGQEWESN2DEIMZ55FIYQ7RLAIAZR/

[ovirt-users] Changing CPU type from Opteron G3 to Epyc with hosted engine in 4.3

Hi!

Now that I have the engine and nodes at the 4.3 level, I'm stuck on
upgrading the CPU type. I have a 2-node cluster with Epyc processors. It
was originally installed with 4.2, so it chose Opteron G3 as the CPU type
(no Epyc support back then). In Engine 4.3, Epyc is available as a CPU type
when I choose Compatibility Version: 4.3. The big problem is that it
doesn't allow changing the CPU type because not all hosts are in maintenance:
"Error while executing action: Cannot change Cluster CPU type unless
all Hosts attached to this Cluster are in Maintenance". Putting all
hosts into maintenance is impossible because the Engine is hosted in the
cluster. I tried with global HA maintenance, but that didn't help.
What is the correct way to solve this problem?

Thanks,
-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XYY2WAGWXBL5YA6KAQY3ZEBVFOKELAKE/

[ovirt-users] Re: Upgrade guide for node from 4.2->4.3

On Mon, Feb 4, 2019 at 8:02 PM Sandro Bonazzola  wrote:

>
>
> Investigated a bit.
> On a clean 4.2.8, all repos in ovirt* are enabled but with the following
> directive:
> # imgbased: set-enabled
> includepkgs = ovirt-node-ng-image-update ovirt-node-ng-image
> ovirt-engine-appliance
>
> installing ovirt-release43.rpm doesn't apply the above directives because
> they're added there by the ovirt-release-host-node post installation script.
>
> So, if you install ovirt-release43, then you need to:
> yum update ovirt-release-host-node; yum update;
>
> to finish the node upgrade.
>
> So the most clean way to upgrade from 4.2.8 to 4.3 is  "yum install
> https://resources.ovirt.org/pub/ovirt-4.3/rpm/el7/noarch/ovirt-node-ng-image-update-4.3.0-1.el7.noarch.rpm
>  "
>
> this leaves the ovirt-4.2 repo files in /etc/yum.repos.d ; after rebooting
> into the 4.3 layer it's safe to remove them
>
>
Thanks. I managed to upgrade the nodes. Both of those instructions worked
and the nodes are now at the 4.3 level.
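
The two node upgrade paths quoted above could be sketched like this. The
function names and the DRY_RUN switch are invented for illustration, and the
ovirt-release43.rpm URL is assumed to be the one from the yum-repo directory
on resources.ovirt.org, since the quoted message does not show it.

```shell
#!/bin/sh
# Illustrative helper: with DRY_RUN=1 (the default here) each command is
# only printed, prefixed with "+", instead of being executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Path A: install the 4.3 release package, then update
# ovirt-release-host-node first so its post-install script re-applies
# the includepkgs directives, then update the rest.
upgrade_node_via_release_rpm() {
    run yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm &&
    run yum update ovirt-release-host-node &&
    run yum update
}

# Path B: install the 4.3 node image update package directly.
upgrade_node_via_image_rpm() {
    run yum install https://resources.ovirt.org/pub/ovirt-4.3/rpm/el7/noarch/ovirt-node-ng-image-update-4.3.0-1.el7.noarch.rpm
}
```

Either path is followed by a reboot into the new 4.3 layer. With path B the
old ovirt-4.2 repo files stay in /etc/yum.repos.d until they are removed by
hand.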

-Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MGBQE2MTNFEZSCZGEUIKSNCBTTKA2Y4I/

[ovirt-users] Re: Upgrade guide for node from 4.2->4.3

On Mon, Feb 4, 2019 at 5:58 PM Vinícius Ferrão wrote:
>
> Juhani, how did you upgrade from 4.2 to 4.3?
>
> Did you follow one of these guides? 
> https://www.ovirt.org/documentation/upgrade-guide/upgrade-guide.html
>
On the engine (self-hosted) I first upgraded to the latest 4.2 (4.2.8)
following the "Update the oVirt Engine" chapter. I also rebooted the engine
because there was a kernel upgrade in the yum updates (enable global HA
maintenance before this). After that I installed the oVirt 4.3 repos with

yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm

Without that, engine-upgrade-check couldn't find the 4.3 packages. This
step was not in the instructions but it seemed the logical next step.
After that I just repeated the same update steps as above. Once I could
confirm that the 4.3 engine was running, I removed the 4.2 repos with

yum remove ovirt-release42
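
Put together, the engine-side sequence might look roughly like the sketch
below. The function names and the DRY_RUN switch are invented, and the
contents of update_engine are an assumption about what the "Update the oVirt
Engine" chapter boils down to, so check them against the guide.

```shell
#!/bin/sh
# Illustrative helper: with DRY_RUN=1 (the default here) each command is
# only printed, prefixed with "+", instead of being executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Assumed steps of the "Update the oVirt Engine" chapter.
update_engine() {
    run engine-upgrade-check &&
    run yum update 'ovirt*setup*' &&
    run engine-setup &&
    run yum update
}

# Make the 4.3 packages visible to engine-upgrade-check.
enable_43_repo() {
    run yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm
}

# Drop the 4.2 repos once the 4.3 engine is confirmed running.
remove_42_repo() {
    run yum remove ovirt-release42
}
```

The order would be: update_engine (to reach 4.2.8), reboot if a new kernel
came in, enable_43_repo, update_engine again, then remove_42_repo. Global HA
maintenance is set on a host beforehand, not inside the engine VM.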

> I’m looking for the correct way to do the upgrade.
>
> Thanks,

Hopefully this helps,
Juhani
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NHYHJJXEZ6ATGAWKFT4AEXBZWVIPBQKU/

[ovirt-users] Upgrade guide for node from 4.2->4.3