[ovirt-users] Re: How to delete obsolete Data Centers with no hosts, but with domains inside

2019-09-25 Thread Claudio Soprano

I'm glad that I solved the problem, but I would like to know why, in the GUI:

1) The Force Remove Data Center didn't work. I can understand that 
Remove Data Center is the standard way to do it and that it checks 
everything is in order before doing it, but when that fails, Force Remove 
should remove the Data Center without those checks (Storage Domains in 
maintenance? Host not configured? SPM not working?). Maybe it could show 
a message that this is not recommended, but then it should always work 
and always remove the Data Center.


2) Is there no option to force a Storage Domain into Maintenance without 
an active SPM host or with no host configured? Again, an option that 
alerts the administrator about the possible issues.


Or

3) Is there no option under the Administration Panel that would permit 
forcing the Maintenance of a Storage Domain? (It is just a simple change 
of one field value in a database table.)


To repeat: I know how it is supposed to work, but having a way to fix 
things when someone has not followed the standard procedure would help 
(it is just a value in a database that blocks removing the Data Center).


Just my opinions

Claudio

On 24/09/19 17:23, Claudio Soprano wrote:


Thanks, I solved it by putting all the Storage Domains into maintenance 
manually, changing the status to the value 6 in the postgres database as 
you suggested.
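
For the record, the change was roughly the following (a sketch, not a 
recipe: the table name storage_pool_iso_map and the status value 6 come 
from Benny's hint below; the column names status and storage_pool_id are 
my assumption, so check them with \d storage_pool_iso_map first):

su - postgres
psql -d engine
engine=# -- <data-center-uuid> is a placeholder for the old DC's id
engine=# UPDATE storage_pool_iso_map SET status = 6
engine-#   WHERE storage_pool_id = '<data-center-uuid>';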


After that I could remove the Data Center from the oVirt management 
interface.


Thanks again for your help

Claudio

On 24/09/19 13:50, Benny Zlotnik wrote:
ah yes, it's generally a good idea to move them to maintenance in the 
case you describe


you can probably change the status manually in the database; the 
table is storage_pool_iso_map, and the status code for maintenance is 6


On Tuesday, September 24, 2019, Claudio Soprano 
<claudio.sopr...@lnf.infn.it> wrote:

> Yes, I tried it and I got
>
> "Error while executing action: Cannot remove Data Center which 
contains Storage Domains that are not in Maintenance status.
> -Please deactivate all domains and wait for tasks to finish before 
removing the Data Center."

>
> But the domains can only be attached or activated, so I don't know 
what to do.

>
> Claudio
>
> On 24/09/19 12:19, Benny Zlotnik wrote:
>>
>> Did you try to force remove the DC?
>> You have the option in the UI
>>
>> On Tue, Sep 24, 2019 at 1:07 PM Claudio Soprano
>> <claudio.sopr...@lnf.infn.it> wrote:

>>>
>>> Hi to all,
>>>
>>> We are using oVirt to manage 6 Data Centers; 3 of them are old Data
>>> Centers with no hosts inside, but with domains, storage and VMs not
>>> running.
>>>
>>> We left them because we wanted to have some backups in case of
>>> failure of the new Data Centers created.
>>>
>>> Time passed and now we would like to remove these Data Centers, but
>>> so far we have found no way to remove them.
>>>
>>> If we try to remove the Storage Domains (using Remove or Destroy)
>>> we get
>>>
>>> "Error while executing action: Cannot destroy the master Storage
>>> Domain from the Data Center without another active Storage Domain to
>>> take its place.
>>> -Either activate another Storage Domain in the Data Center, or remove
>>> the Data Center.
>>> -If you have problems with the master Data Domain, consider following
>>> the recovery process described in the documentation, or contact your
>>> system administrator."
>>>
>>> if we try to remove the Data Center directly we get
>>>
>>> "Error while executing action: Cannot remove Data Center. There is no
>>> active Host in the Data Center."
>>>
>>> How can we solve the problem?
>>>
>>> Can it be done via ovirt-shell, some script, or the oVirt
>>> management interface?
>>>
>>> Thanks in advance
>>>
>>> Claudio
>>> ___
>>> Users mailing list -- users@ovirt.org
>>> To unsubscribe send an email to users-le...@ovirt.org

>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SADE4JVXJYKZZQ7M3EPB4FY7LWLJPKFK/
> 


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
http

[ovirt-users] Re: How to delete obsolete Data Centers with no hosts, but with domains inside

2019-09-24 Thread Claudio Soprano
Thanks, I solved it by putting all the Storage Domains into maintenance 
manually, changing the status to the value 6 in the postgres database as 
you suggested.


After that I could remove the Data Center from the oVirt management 
interface.


Thanks again for your help

Claudio

On 24/09/19 13:50, Benny Zlotnik wrote:
ah yes, it's generally a good idea to move them to maintenance in the 
case you describe


you can probably change the status manually in the database; the table 
is storage_pool_iso_map, and the status code for maintenance is 6


On Tuesday, September 24, 2019, Claudio Soprano 
<claudio.sopr...@lnf.infn.it> wrote:

> Yes, I tried it and I got
>
> "Error while executing action: Cannot remove Data Center which 
contains Storage Domains that are not in Maintenance status.
> -Please deactivate all domains and wait for tasks to finish before 
removing the Data Center."

>
> But the domains can only be attached or activated, so I don't know 
what to do.

>
> Claudio
>
> On 24/09/19 12:19, Benny Zlotnik wrote:
>>
>> Did you try to force remove the DC?
>> You have the option in the UI
>>
>> On Tue, Sep 24, 2019 at 1:07 PM Claudio Soprano
>> <claudio.sopr...@lnf.infn.it> wrote:

>>>
>>> Hi to all,
>>>
>>> We are using oVirt to manage 6 Data Centers; 3 of them are old Data
>>> Centers with no hosts inside, but with domains, storage and VMs not
>>> running.
>>>
>>> We left them because we wanted to have some backups in case of
>>> failure of the new Data Centers created.
>>>
>>> Time passed and now we would like to remove these Data Centers, but
>>> so far we have found no way to remove them.
>>>
>>> If we try to remove the Storage Domains (using Remove or Destroy)
>>> we get
>>>
>>> "Error while executing action: Cannot destroy the master Storage
>>> Domain from the Data Center without another active Storage Domain to
>>> take its place.
>>> -Either activate another Storage Domain in the Data Center, or remove
>>> the Data Center.
>>> -If you have problems with the master Data Domain, consider following
>>> the recovery process described in the documentation, or contact your
>>> system administrator."
>>>
>>> if we try to remove the Data Center directly we get
>>>
>>> "Error while executing action: Cannot remove Data Center. There is no
>>> active Host in the Data Center."
>>>
>>> How can we solve the problem?
>>>
>>> Can it be done via ovirt-shell, some script, or the oVirt
>>> management interface?
>>>
>>> Thanks in advance
>>>
>>> Claudio
>>> ___
>>> Users mailing list -- users@ovirt.org
>>> To unsubscribe send an email to users-le...@ovirt.org

>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SADE4JVXJYKZZQ7M3EPB4FY7LWLJPKFK/
> 
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PC2TNOIDTORQMLQWR4GHUXBWK7WVO6ED/


[ovirt-users] Re: How to delete obsolete Data Centers with no hosts, but with domains inside

2019-09-24 Thread Claudio Soprano

Yes, I tried it and I got

"Error while executing action: Cannot remove Data Center which contains 
Storage Domains that are not in Maintenance status.
-Please deactivate all domains and wait for tasks to finish before 
removing the Data Center."


But the domains can only be attached or activated, so I don't know what 
to do.


Claudio

On 24/09/19 12:19, Benny Zlotnik wrote:

Did you try to force remove the DC?
You have the option in the UI

On Tue, Sep 24, 2019 at 1:07 PM Claudio Soprano
<claudio.sopr...@lnf.infn.it> wrote:

Hi to all,

We are using oVirt to manage 6 Data Centers; 3 of them are old Data
Centers with no hosts inside, but with domains, storage and VMs not running.

We left them because we wanted to have some backups in case of failure
of the new Data Centers created.

Time passed and now we would like to remove these Data Centers, but so
far we have found no way to remove them.

If we try to remove the Storage Domains (using Remove or Destroy) we get

"Error while executing action: Cannot destroy the master Storage Domain
from the Data Center without another active Storage Domain to take its
place.
-Either activate another Storage Domain in the Data Center, or remove
the Data Center.
-If you have problems with the master Data Domain, consider following
the recovery process described in the documentation, or contact your
system administrator."

if we try to remove the Data Center directly we get

"Error while executing action: Cannot remove Data Center. There is no
active Host in the Data Center."

How can we solve the problem?

Can it be done via ovirt-shell, some script, or the oVirt management
interface?

Thanks in advance

Claudio
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SADE4JVXJYKZZQ7M3EPB4FY7LWLJPKFK/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OTF5V6P2IO5T7HWXVEI5ZIW7FN6PN6YZ/


[ovirt-users] How to delete obsolete Data Centers with no hosts, but with domains inside

2019-09-24 Thread Claudio Soprano

Hi to all,

We are using oVirt to manage 6 Data Centers; 3 of them are old Data 
Centers with no hosts inside, but with domains, storage and VMs not running.


We left them because we wanted to have some backups in case of failure 
of the new Data Centers created.


Time passed and now we would like to remove these Data Centers, but so 
far we have found no way to remove them.


If we try to remove the Storage Domains (using Remove or Destroy) we get

"Error while executing action: Cannot destroy the master Storage Domain 
from the Data Center without another active Storage Domain to take its 
place.
-Either activate another Storage Domain in the Data Center, or remove 
the Data Center.
-If you have problems with the master Data Domain, consider following 
the recovery process described in the documentation, or contact your 
system administrator."


if we try to remove the Data Center directly we get

"Error while executing action: Cannot remove Data Center. There is no 
active Host in the Data Center."


How can we solve the problem?

Can it be done via ovirt-shell, some script, or the oVirt 
management interface?


Thanks in advance

Claudio
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SADE4JVXJYKZZQ7M3EPB4FY7LWLJPKFK/


Re: [ovirt-users] Centos 7.3 ovirt 4.0.6 Can't add host to cluster collectd or collectd-disk not found

2017-09-11 Thread Claudio Soprano

You are welcome.

I'm happy you can replicate the problem.

In fact the problem is the name "default" used for the bridge; any other 
name works (I can't test all the names :) ). A quick check is sketched below.
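
For example (Edy demonstrates the failing case below; the behaviour of 
the other commands is my assumption):

# brctl addbr default      -> add bridge failed: Invalid argument
# brctl addbr def          -> any other name is accepted
# brctl delbr def          -> remove the test bridge again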


Claudio

On 10/09/17 08:39, Edward Haas wrote:

This is a limitation of the brctl tool.

Attempting to define a bridge named 'default' fails:
# brctl addbr default
add bridge failed: Invalid argument

Thanks,
Edy.


On Thu, Sep 7, 2017 at 8:26 PM, Claudio Soprano 
<claudio.sopr...@lnf.infn.it> wrote:


Ok

After installing the host, I need to attach the networks to it,

so I got the Unexpected Exception while trying to add VLAN 1
(called default) to the host, in the Setup Networks GUI.

I attached:

- a screenshot of all the networks we have

- two screenshots of the "default" network properties (one for
the vnic profile, the other for the properties)

- supervdsm.log from host ovc2n06 (I attached the whole file, but
you will find everything at the beginning)

As already explained, it is simple to replicate the problem on a
CentOS v7.3 host: create a bridge called default (ifcfg-default)
and a VLAN 1 attached to it (ifcfg-enp12s0.1), do a systemctl
restart network, and you will get the error; a condensed
reproduction is sketched right below.
To fix the problem, rename the bridge to "def" for example and
change the VLAN 1 configuration file to point at the new name;
then everything works.
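
Assuming enp12s0 is the trunk NIC (an assumption; the file contents
mirror the VDSM-generated "def" files quoted later in this thread,
just with the bridge named default):

cat > /etc/sysconfig/network-scripts/ifcfg-default <<'EOF'
DEVICE=default
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no
EOF

cat > /etc/sysconfig/network-scripts/ifcfg-enp12s0.1 <<'EOF'
DEVICE=enp12s0.1
VLAN=yes
BRIDGE=default
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no
EOF

# fails on CentOS 7.3 only because the bridge is named "default"
systemctl restart network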

This problem is found only on CentOS v7.3; on old CentOS v7.2
hosts the "default" bridge works. To work around it we changed
the name to "def" for now.

All the hosts are now oVirt v4.1.5 and CentOS v7.3; we also have
some old hosts on CentOS v7.2.

Claudio


On 07/09/17 13:46, Dan Kenigsberg wrote:

I do not understand which steps you are doing until you get the
"Unexpected Exception" dialog.
Could you detail them?

Also, can you share supervdsm.log from the relevant host, so
we can
understand the nature of that unexpected exception?

On Wed, Aug 30, 2017 at 12:31 PM, Claudio Soprano
<claudio.sopr...@lnf.infn.it> wrote:

The problem is not the host-deploy; in fact the deploy finished without
errors. If I remember correctly, during the deploy process the networks
are not set up; I need to attach each network I need manually, under
Hosts, Network Interfaces, Setup Host Networks

and the problem is there: when I attach all the VLANs (including
default, which is VLAN 1) it gives the error "default is not present in
the system"; if I add all the VLANs (excluding default, which is VLAN 1)
it is OK, but the host will not activate because the default VLAN is
missing.

If I manually add the 2 configuration files (ifcfg-default and
ifcfg-intX.1) to the host and do a systemctl restart network, it then
gives an error.

ip addr reports

that VLAN 1 is added, but not the bridge named default (the "default is
missing" error).

I don't know if the name "default" (for the network, VLAN 1) could be
the problem, but I can't rename it now, because I would need to detach
it from the hosts, which at the moment I can't.

I added 3 screenshots to show you the situation: before adding VLAN 1
(the "default" network), the error when adding it, and the info on the
error.


--

Claudio Soprano                phone:  (+39)-06-9403.2349/2355
Computing Service              fax:    (+39)-06-9403.2649
LNF-INFN                       e-mail: claudio.sopr...@lnf.infn.it
Via Enrico Fermi, 40           www:    http://www.lnf.infn.it/
I-00044 Frascati, Italy




--

Claudio Soprano                phone:  (+39)-06-9403.2349/2355
Computing Service              fax:    (+39)-06-9403.2649
LNF-INFN                       e-mail: claudio.sopr...@lnf.infn.it
Via Enrico Fermi, 40           www:    http://www.lnf.infn.it/
I-00044 Frascati, Italy

Re: [ovirt-users] Centos 7.3 ovirt 4.0.6 Can't add host to cluster collectd or collectd-disk not found - SOLVED - BUG

2017-09-07 Thread Claudio Soprano
No one answered my messages, so I don't know whether anyone replicated 
the bug on CentOS 7.3.


Let me know if you need more info.

Claudio

On 30/08/17 12:25, Claudio Soprano wrote:


More investigation on this for you :)

the bug is only on CentOS v7.3; in fact on previous hosts with CentOS <
v7.3 I have no problem with the name of the network.

I also discovered that renaming the network is not enough: the
associated vnic profile must be renamed too. In fact with network name
def and vnic profile default it still didn't work; with both the
network name and the vnic profile renamed to def, it now works on
CentOS v7.3 with no errors, also from the GUI.

When I create a network, both the network name and the vnic profile get
the same name (for me it was default).

I hope my info can help you.

Another tedious issue is that to rename the network I had to detach it
from all the hosts, and also change the network on all the VMs that use
it, even stopped ones. I was lucky: only 2 VMs and only 4 hosts, but I
can't imagine people who have 40-50 hosts and maybe 100-200 VMs on a
network.

Anyway, now I have CentOS v7.3 and v7.2 hosts both working.

Claudio

On 30/08/17 11:52, Claudio Soprano wrote:


Ok, I SOLVED the problem, but I don't know if it is a BUG. The problem
is the name default, but I had it working with oVirt v3 and v4.0pre
for sure.

I don't know whether the check was added in setup_host_network or not.

This is what I did:

I manually added an ifcfg-def and an ifcfg-enp12s0.1 file (the first for
the bridge and the second for the VLAN) to a host that doesn't have
VLAN 1


[root@X network-scripts]# more ifcfg-enp12s0.1
# Generated by VDSM version 4.19.28-1.el7.centos
DEVICE=enp12s0.1
VLAN=yes
BRIDGE=def
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

[root@X network-scripts]# more ifcfg-def
# Generated by VDSM version 4.19.28-1.el7.centos
DEVICE=def
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

then run

systemctl restart network

it didn't give errors, and ip addr shows

[root@X network-scripts]# ip addr

94: enp12s0.1@enp12s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 
qdisc noqueue master def state UP qlen 1000

link/ether 00:25:b5:00:10:5f brd ff:ff:ff:ff:ff:ff
inet6 fe80::225:b5ff:fe00:105f/64 scope link
   valid_lft forever preferred_lft forever
95: def: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue 
state UP qlen 1000

link/ether 00:25:b5:00:10:5f brd ff:ff:ff:ff:ff:ff
inet6 fe80::225:b5ff:fe00:105f/64 scope link
   valid_lft forever preferred_lft forever

I will try to change the name of that network to something different 
from default.


Claudio

On 30/08/17 11:31, Claudio Soprano wrote:


The problem is not the host-deploy; in fact the deploy finished without
errors. If I remember correctly, during the deploy process the networks
are not set up; I need to attach each network I need manually, under
Hosts, Network Interfaces, Setup Host Networks

and the problem is there: when I attach all the VLANs (including
default, which is VLAN 1) it gives the error "default is not present in
the system"; if I add all the VLANs (excluding default, which is VLAN 1)
it is OK, but the host will not activate because the default VLAN is
missing.

If I manually add the 2 configuration files (ifcfg-default and
ifcfg-intX.1) to the host and do a systemctl restart network, it then
gives an error.

ip addr reports

that VLAN 1 is added, but not the bridge named default (the "default is
missing" error).

I don't know if the name "default" (for the network, VLAN 1) could be
the problem, but I can't rename it now, because I would need to detach
it from the hosts, which at the moment I can't.

I added 3 screenshots to show you the situation: before adding VLAN 1
(the "default" network), the error when adding it, and the info on the
error.


Claudio

On 30/08/17 08:49, Sandro Bonazzola wrote:



2017-08-29 14:45 GMT+02:00 Claudio Soprano 
<claudio.sopr...@lnf.infn.it>:


Ok, in this way I could install the hosts, but we got another
error on them when setting up the networks

we have 2 interfaces on each host

interface 1 is ovirtmgm

interface 2 is a TRUNK with VLANs inside.

All my old hosts have all the VLANs on interface 2, including
VLAN 1 (tagged).

When I set up networks for the new 4.0.6 hosts I can't include
VLAN 1; oVirt answers with

VDSM hostname.domainname command HostSetupNetworksVDS failed:
[Errno 19] default (that is VLAN 1) is not present in the system.

So, thinking about upgrading to 4.1.5, we updated all the new
hosts (the old ones are still v4.0.6) and reinstalled from
scratch, but we still get the same error.

What does that error mean?


Adding Marcin and Dan about this.

Claudio Soprano

On 25/08/17 15:27, Sandro Bonazzola wrote:



2017-08-25 10:43 GMT+02:00 Claudio Soprano
<claudio.sopr...@l

Re: [ovirt-users] Centos 7.3 ovirt 4.0.6 Can't add host to cluster collectd or collectd-disk not found - SOLVED

2017-08-30 Thread Claudio Soprano

More investigation on this for you :)

the bug is only on CentOS v7.3; in fact on previous hosts with CentOS <
v7.3 I have no problem with the name of the network.

I also discovered that renaming the network is not enough: the
associated vnic profile must be renamed too. In fact with network name
def and vnic profile default it still didn't work; with both the
network name and the vnic profile renamed to def, it now works on
CentOS v7.3 with no errors, also from the GUI.

When I create a network, both the network name and the vnic profile get
the same name (for me it was default).

I hope my info can help you.

Another tedious issue is that to rename the network I had to detach it
from all the hosts, and also change the network on all the VMs that use
it, even stopped ones. I was lucky: only 2 VMs and only 4 hosts, but I
can't imagine people who have 40-50 hosts and maybe 100-200 VMs on a
network.

Anyway, now I have CentOS v7.3 and v7.2 hosts both working.

Claudio

On 30/08/17 11:52, Claudio Soprano wrote:


Ok, I SOLVED the problem, but I don't know if it is a BUG. The problem
is the name default, but I had it working with oVirt v3 and v4.0pre
for sure.

I don't know whether the check was added in setup_host_network or not.

This is what I did:

I manually added an ifcfg-def and an ifcfg-enp12s0.1 file (the first for
the bridge and the second for the VLAN) to a host that doesn't have
VLAN 1


[root@X network-scripts]# more ifcfg-enp12s0.1
# Generated by VDSM version 4.19.28-1.el7.centos
DEVICE=enp12s0.1
VLAN=yes
BRIDGE=def
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

[root@X network-scripts]# more ifcfg-def
# Generated by VDSM version 4.19.28-1.el7.centos
DEVICE=def
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

then run

systemctl restart network

it didn't give errors, and ip addr shows

[root@X network-scripts]# ip addr

94: enp12s0.1@enp12s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 
qdisc noqueue master def state UP qlen 1000

link/ether 00:25:b5:00:10:5f brd ff:ff:ff:ff:ff:ff
inet6 fe80::225:b5ff:fe00:105f/64 scope link
   valid_lft forever preferred_lft forever
95: def: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue 
state UP qlen 1000

link/ether 00:25:b5:00:10:5f brd ff:ff:ff:ff:ff:ff
inet6 fe80::225:b5ff:fe00:105f/64 scope link
   valid_lft forever preferred_lft forever

I will try to change the name of that network to something different 
from default.


Claudio

On 30/08/17 11:31, Claudio Soprano wrote:


The problem is not the host-deploy; in fact the deploy finished without
errors. If I remember correctly, during the deploy process the networks
are not set up; I need to attach each network I need manually, under
Hosts, Network Interfaces, Setup Host Networks

and the problem is there: when I attach all the VLANs (including
default, which is VLAN 1) it gives the error "default is not present in
the system"; if I add all the VLANs (excluding default, which is VLAN 1)
it is OK, but the host will not activate because the default VLAN is
missing.

If I manually add the 2 configuration files (ifcfg-default and
ifcfg-intX.1) to the host and do a systemctl restart network, it then
gives an error.

ip addr reports

that VLAN 1 is added, but not the bridge named default (the "default is
missing" error).

I don't know if the name "default" (for the network, VLAN 1) could be
the problem, but I can't rename it now, because I would need to detach
it from the hosts, which at the moment I can't.

I added 3 screenshots to show you the situation: before adding VLAN 1
(the "default" network), the error when adding it, and the info on the
error.


Claudio

On 30/08/17 08:49, Sandro Bonazzola wrote:



2017-08-29 14:45 GMT+02:00 Claudio Soprano 
<claudio.sopr...@lnf.infn.it>:


Ok, in this way I could install the hosts, but we got another
error on them when setting up the networks

we have 2 interfaces on each host

interface 1 is ovirtmgm

interface 2 is a TRUNK with VLANs inside.

All my old hosts have all the VLANs on interface 2, including
VLAN 1 (tagged).

When I set up networks for the new 4.0.6 hosts I can't include
VLAN 1; oVirt answers with

VDSM hostname.domainname command HostSetupNetworksVDS failed:
[Errno 19] default (that is VLAN 1) is not present in the system.

So, thinking about upgrading to 4.1.5, we updated all the new
hosts (the old ones are still v4.0.6) and reinstalled from
scratch, but we still get the same error.

What does that error mean?


Adding Marcin and Dan about this.

Claudio Soprano

On 25/08/17 15:27, Sandro Bonazzola wrote:



2017-08-25 10:43 GMT+02:00 Claudio Soprano
<claudio.sopr...@lnf.infn.it <mailto:claudio.sopr...@lnf.infn.it>>:

Hi all,

we are installing new nodes on an oVirt 4.0.6 cluster; the
new nodes were installed from a Minima

Re: [ovirt-users] Centos 7.3 ovirt 4.0.6 Can't add host to cluster collectd or collectd-disk not found - SOLVED

2017-08-30 Thread Claudio Soprano
Ok, I SOLVED the problem, but I don't know if it is a BUG. The problem
is the name default, but I had it working with oVirt v3 and v4.0pre for
sure.

I don't know whether the check was added in setup_host_network or not.

This is what I did:

I manually added an ifcfg-def and an ifcfg-enp12s0.1 file (the first for
the bridge and the second for the VLAN) to a host that doesn't have
VLAN 1


[root@X network-scripts]# more ifcfg-enp12s0.1
# Generated by VDSM version 4.19.28-1.el7.centos
DEVICE=enp12s0.1
VLAN=yes
BRIDGE=def
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

[root@X network-scripts]# more ifcfg-def
# Generated by VDSM version 4.19.28-1.el7.centos
DEVICE=def
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

then run

systemctl restart network

it didn't give errors, and ip addr shows

[root@X network-scripts]# ip addr

94: enp12s0.1@enp12s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc 
noqueue master def state UP qlen 1000

link/ether 00:25:b5:00:10:5f brd ff:ff:ff:ff:ff:ff
inet6 fe80::225:b5ff:fe00:105f/64 scope link
   valid_lft forever preferred_lft forever
95: def: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state 
UP qlen 1000

link/ether 00:25:b5:00:10:5f brd ff:ff:ff:ff:ff:ff
inet6 fe80::225:b5ff:fe00:105f/64 scope link
   valid_lft forever preferred_lft forever

I will try to change the name of that network to something different 
from default.


Claudio

On 30/08/17 11:31, Claudio Soprano wrote:


The problem is not the host-deploy; in fact the deploy finished without
errors. If I remember correctly, during the deploy process the networks
are not set up; I need to attach each network I need manually, under
Hosts, Network Interfaces, Setup Host Networks

and the problem is there: when I attach all the VLANs (including
default, which is VLAN 1) it gives the error "default is not present in
the system"; if I add all the VLANs (excluding default, which is VLAN 1)
it is OK, but the host will not activate because the default VLAN is
missing.

If I manually add the 2 configuration files (ifcfg-default and
ifcfg-intX.1) to the host and do a systemctl restart network, it then
gives an error.

ip addr reports

that VLAN 1 is added, but not the bridge named default (the "default is
missing" error).

I don't know if the name "default" (for the network, VLAN 1) could be
the problem, but I can't rename it now, because I would need to detach
it from the hosts, which at the moment I can't.

I added 3 screenshots to show you the situation: before adding VLAN 1
(the "default" network), the error when adding it, and the info on the
error.


Claudio

On 30/08/17 08:49, Sandro Bonazzola wrote:



2017-08-29 14:45 GMT+02:00 Claudio Soprano 
<claudio.sopr...@lnf.infn.it>:


Ok, in this way I could install the hosts, but we got another
error on them when setting up the networks

we have 2 interfaces on each host

interface 1 is ovirtmgm

interface 2 is a TRUNK with VLANs inside.

All my old hosts have all the VLANs on interface 2, including
VLAN 1 (tagged).

When I set up networks for the new 4.0.6 hosts I can't include
VLAN 1; oVirt answers with

VDSM hostname.domainname command HostSetupNetworksVDS failed:
[Errno 19] default (that is VLAN 1) is not present in the system.

So, thinking about upgrading to 4.1.5, we updated all the new hosts
(the old ones are still v4.0.6) and reinstalled from scratch, but
we still get the same error.

What does that error mean?


Adding Marcin and Dan about this.

Claudio Soprano

On 25/08/17 15:27, Sandro Bonazzola wrote:



2017-08-25 10:43 GMT+02:00 Claudio Soprano
<claudio.sopr...@lnf.infn.it>:

Hi all,

we are installing new nodes on an oVirt 4.0.6 cluster; the
new nodes were installed from a Minimal 1707 ISO image,
CentOS v7.3.1611


Any reason for keeping 4.0.6 hosts? oVirt 4.0 has reached End Of
Life back in January[1]
And you can install 4.1 hosts in 4.0 cluster compatibility mode.

That said, you may want to "yum install centos-release-opstools"
which should provide you the missing collectd packages by
providing the same repos already included in ovirt-release41 rpm.

[1]
http://lists.ovirt.org/pipermail/announce/2017-January/000308.html


in the log of host-deploy (on the manager) it indicates that
the collectd package was not found.

I found this bug in the oVirt 4.1.0 version, where it says
to disable the EPEL repository; in another thread on Red Hat
someone suggested adding

excludepkgs=collectd*

again in the epel repository

I checked all my repositories and I found epel references only in

[root@ yum.r

Re: [ovirt-users] Centos 7.3 ovirt 4.0.6 Can't add host to cluster collectd or collectd-disk not found

2017-08-29 Thread Claudio Soprano
Ok, in this way I could install the hosts, but we got another error on 
them when setting up the networks


we have 2 interfaces on each host

interface 1 is ovirtmgm

interface 2 is a TRUNK with VLANs inside.

All my old hosts have all the VLANs on interface 2, including VLAN 1 (tagged).

When I set up networks for the new 4.0.6 hosts I can't include VLAN 1; 
oVirt answers with


VDSM hostname.domainname command HostSetupNetworksVDS failed: [Errno 19] 
default (that is VLAN 1) is not present in the system.


So, thinking about upgrading to 4.1.5, we updated all the new hosts (the 
old ones are still v4.0.6) and reinstalled from scratch, but we still 
get the same error.


What does that error mean?

Claudio Soprano

On 25/08/17 15:27, Sandro Bonazzola wrote:



2017-08-25 10:43 GMT+02:00 Claudio Soprano 
<claudio.sopr...@lnf.infn.it>:


Hi all,

we are installing new nodes on an oVirt 4.0.6 cluster; the new
nodes were installed from a Minimal 1707 ISO image, CentOS v7.3.1611


Any reason for keeping 4.0.6 hosts? oVirt 4.0 has reached End Of Life 
back in January[1]

And you can install 4.1 hosts in 4.0 cluster compatibility mode.

That said, you may want to "yum install centos-release-opstools" which 
should provide you the missing collectd packages by providing the same 
repos already included in ovirt-release41 rpm.


[1] http://lists.ovirt.org/pipermail/announce/2017-January/000308.html
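
Something like this should be enough (the package names are the
ones from your deploy log; I'm assuming the host can reach the
opstools repo):

yum install -y centos-release-opstools
yum list available collectd collectd-disk collectd-write_http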


in the log of host-deploy (on the manager) it indicates that
the collectd package was not found.

I found this bug in the oVirt 4.1.0 version, where it says to
disable the EPEL repository; in another thread on Red Hat someone
suggested adding

excludepkgs=collectd*

again in the epel repository

I checked all my repositories and I found epel references only in

[root@ yum.repos.d]# grep epel *
ovirt-4.0-dependencies.repo:[ovirt-4.0-epel]
ovirt-4.0-dependencies.repo:#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch
ovirt-4.0-dependencies.repo:mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
ovirt-4.0-dependencies.repo:includepkgs=epel-release,python-uinput,puppet,python-lockfile,python-cpopen,python-ordereddict,python-pthreading,python-inotify,python-argparse,novnc,python-ply,python-kitchen,python-daemon,python-websockify,livecd-tools,spice-html5,mom,python-IPy,python-ioprocess,ioprocess,safelease,python-paramiko,python2-paramiko,python2-crypto,libtomcrypt,libtommath,python-cheetah,python-ecdsa,python2-ecdsa,python-markdown,rubygem-rgen,ovirt-guest-agent*,userspace-rcu,protobuf-java,objenesis,python34*
ovirt-4.0-dependencies.repo:gpgkey=https://dl.fedoraproject.org/pub/epel/RPM-GPG-KEY-EPEL-7
ovirt-4.0-dependencies.repo:[ovirt-4.0-patternfly1-noarch-epel]
ovirt-4.0-dependencies.repo:baseurl=http://copr-be.cloud.fedoraproject.org/results/patternfly/patternfly1/epel-7-$basearch/

so before the includepkgs line I added the excludepkgs line

removed the host, re-added it, and again collectd was not found

so I added the excludepkgs line in all the ovirt sections

removed the host, re-added it, and again collectd was not found

On the manager I also removed the file
/var/cache/ovirt-engine/ovirt-host-deploy.tar, but still the same error

these are the ovirt repo files with the excludepkgs line added

[ovirt-4.0]
name=Latest oVirt 4.0 Release
#baseurl=http://resources.ovirt.org/pub/ovirt-4.0/rpm/el$releasever/

mirrorlist=http://resources.ovirt.org/pub/yum-repo/mirrorlist-ovirt-4.0-el$releasever
enabled=1

excludepkgs=collectd*

skip_if_unavailable=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-ovirt-4.0

[ovirt-4.0-epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch

mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
failovermethod=priority
enabled=1

excludepkgs=collectd*


includepkgs=epel-release,python-uinput,puppet,python-lockfile,python-cpopen,python-ordereddict,python-pthreading,python-inotify,python-argparse,novnc,python-ply,python-kitchen,python-daemon,python-websockify,livecd-tools,spice-html5,mom,python-IPy,python-ioprocess,ioprocess,safelease,python-paramiko,python

[ovirt-users] Centos 7.3 ovirt 4.0.6 Can't add host to cluster collectd or collectd-disk not found

2017-08-25 Thread Claudio Soprano
Cannot queue package collectd: Package collectd 
cannot be found
  File 
"/tmp/ovirt-3rP0BGQm0o/otopi-plugins/ovirt-host-deploy/collectd/packages.py", 
line 53, in _packages

'collectd-write_http',
RuntimeError: Package collectd cannot be found
2017-08-25 09:27:19 ERROR otopi.context context._executeMethod:151 
Failed to execute stage 'Package installation': Package collectd cannot 
be found
2017-08-25 09:27:19 DEBUG otopi.context context.dumpEnvironment:770 ENV 
BASE/exceptionInfo=list:'[(<type 'exceptions.RuntimeError'>, 
RuntimeError('Package collectd cannot be found',), <traceback object at 
0x3514ef0>)]'
2017-08-25 09:27:19 DEBUG otopi.context context.dumpEnvironment:770 ENV 
BASE/exceptionInfo=list:'[(<type 'exceptions.RuntimeError'>, 
RuntimeError('Package collectd cannot be found',), <traceback object at 
0x3514ef0>)]'


After this I saw that there is an includepkgs entry for epel-release, 
which will install the epel repository,


so I installed the epel repository manually

and added the excludepkgs line, but now the error is "Package 
collectd-disk cannot be found"


this is the modified epel.repo

[root@ovc2n05 yum.repos.d]# more /etc/yum.repos.d/epel.repo

[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch
metalink=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
failovermethod=priority
enabled=1

excludepkgs=collectd*

gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7

[epel-debuginfo]
name=Extra Packages for Enterprise Linux 7 - $basearch - Debug
#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch/debug
metalink=https://mirrors.fedoraproject.org/metalink?repo=epel-debug-7&arch=$basearch
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1

[epel-source]
name=Extra Packages for Enterprise Linux 7 - $basearch - Source
#baseurl=http://download.fedoraproject.org/pub/epel/7/SRPMS
metalink=https://mirrors.fedoraproject.org/metalink?repo=epel-source-7&arch=$basearch
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1

the epel-testing.repo has everything disabled

This is the relevant part of the log on the manager

[root@ovcmgr host-deploy]# more ovirt-host-deploy-20170825*  | grep 
collectd-disk
2017-08-25 10:36:23 DEBUG otopi.plugins.otopi.packagers.yumpackager 
yumpackager.verbose:76 Yum queue package collectd-disk for install/update
2017-08-25 10:36:23 ERROR otopi.plugins.otopi.packagers.yumpackager 
yumpackager.error:85 Yum Cannot queue package collectd-disk: Package 
collectd-disk cannot be found

RuntimeError: Package collectd-disk cannot be found
2017-08-25 10:36:23 ERROR otopi.context context._executeMethod:151 
Failed to execute stage 'Package installation': Package collectd-disk 
cannot be found
2017-08-25 10:36:23 DEBUG otopi.context context.dumpEnvironment:770 ENV 
BASE/exceptionInfo=list:'[(<type 'exceptions.RuntimeError'>, 
RuntimeError('Package collectd-disk cannot be found',), <traceback 
object at 0x592e290>)]'
2017-08-25 10:36:23 DEBUG otopi.context context.dumpEnvironment:770 ENV 
BASE/exceptionInfo=list:'[(<type 'exceptions.RuntimeError'>, 
RuntimeError('Package collectd-disk cannot be found',), <traceback 
object at 0x592e290>)]'


I don't know what else to try.

Any help would be appreciated.

Claudio Soprano

--

Claudio Soprano                phone:  (+39)-06-9403.2349/2355
Computing Service              fax:    (+39)-06-9403.2649
LNF-INFN                       e-mail: claudio.sopr...@lnf.infn.it
Via Enrico Fermi, 40           www:    http://www.lnf.infn.it/
I-00044 Frascati, Italy

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Unable to get volume size for domain ... after engine upgrade 4.0.4

2016-11-10 Thread Claudio Soprano

Hi Bertrand,

I solved the problem by recovering from old snapshots when possible (for 
some VMs I didn't find the LVs on the VG).


I discovered that usually the volume oVirt couldn't get the size of was 
not resident on the VG.


Don't ask me why or when this happens (I'm sure it began on oVirt 3.6, 
maybe with version 3.6.7; I'm not sure it happened with 3.6.5). I only 
know that almost all the latest snapshots were not resident on the same 
VG; somehow the machine kept running fine until I shut it down, and then 
the volume was gone.


Luckily I could recover the VMs from previous snapshots by mounting the 
LVs manually, then creating a raw image with qemu-img convert, and 
finally using dd with netcat to copy onto new VMs with the same features 
(RAM, disk, etc.) as the original VMs. A rough sketch follows.
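
Roughly like this (LV, path and host names are hypothetical
placeholders; the exact lvchange/nc options are from memory, only the
tools themselves are the ones we used):

# 1) activate the snapshot LV so it can be read
lvchange -ay /dev/<storage-domain-vg>/<snapshot-lv>
# 2) flatten the snapshot chain into a raw image
qemu-img convert /dev/<storage-domain-vg>/<snapshot-lv> -O raw /tmp/disk.raw
# 3) stream the raw image onto the new VM's disk;
#    on the receiving side: nc -l 1234 | dd of=/dev/<new-vm-disk> bs=1M
dd if=/tmp/disk.raw bs=1M | nc <receiving-host> 1234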


We have about 250 VMs, about 150 of which have snapshots. I recovered 
about 100 VMs manually, some from a snapshot when clone or export 
worked, but mostly using the method described.


I investigated and discovered that the info in the oVirt DB was in some 
way right, but the snapshots themselves were physically wrong, pointing 
to different VM parent snapshots or having missing snapshots.


If you need some help let me know; but again, I'm still recovering old 
VMs, moving them to a new Data Center.


Claudio

On 09/11/16 23:19, Bertrand Caplet wrote:

Hi Claudio,
I'm having the same problem.
Did you resolve it ? And how ?


On 20/10/2016 at 17:11, Claudio Soprano wrote:

Hello all,

we upgraded the engine (standalone) from 3.6.5 to 4.0.4 for the
Backing file too long bug.

we upgraded 2 of our 10 hosts from 3.6.5 to 4.0.4 too.

we tried to shut down some VMs, and now when we try to start them again
we get

"Unable to get volume size for domain
384f9059-ef2f-4d43-a54f-de71c5d589c8 volume
83ab4406-ea8d-443e-b64b-77b4e1dcb978"

The domain changes for each VM we try to start. Some VMs have snapshots,
but one was cloned from a snapshot and has no snapshots itself.

This cloned VM also doesn't start anymore after the shutdown; same error.

These VMs don't start on either 3.6.5 or 4.0.4 hosts.

I am hoping for any help, because we have about 200 VMs and we don't
know if they will start again after a shutdown.

Claudio




--

Claudio Soprano                phone:  (+39)-06-9403.2349/2355
Computing Service              fax:    (+39)-06-9403.2649
LNF-INFN                       e-mail: claudio.sopr...@lnf.infn.it
Via Enrico Fermi, 40           www:    http://www.lnf.infn.it/
I-00044 Frascati, Italy

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Problem with backing-file, how to fix the backing-chain ?

2016-10-21 Thread Claudio Soprano
We are trying Fedora 24, because qemu-img from git requires different 
versions of glibc (2.12) and gthread (2.0) from those running on the 
hosts.


In the meantime we fixed the problem in another way (a condensed sketch 
of steps 1-5 follows the list):

1) we dumped (with dd) all the snapshots and the base to files

2) then checked where the backing file becomes too long (in our case in 
the 2nd snapshot) with

qemu-img info --backing-chain <snapshot>

3) we fixed the backing file with hexedit on that file only (be sure to 
also correct the length of the backing file; see the qcow2 format)

4) we checked the other snapshots starting from the 3rd and all were OK

5) we converted everything to a qcow2 and to a raw image with

qemu-img convert <snapshot> -O qcow2 <image.qcow2>

qemu-img convert <snapshot> -O raw <image.raw>

6) we mounted the qcow2 image with guestfish and checked everything was 
correct, and it was

7) we did the same steps (from 1 to 6) for all the disks (base + 
snapshots) in the VM

8) we created a new VM in oVirt with the same structure as the old one 
(RAM, disks, disk sizes, network)

9) we started the new VM with a live CD and then dumped back (using the 
raw files) all the fixed disks, using dd with netcat

10) we stopped and started the new VM again, and everything was working 
as we left it the last working time
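
Condensed, steps 1-5 look roughly like this (device and file names
are hypothetical placeholders):

# 1) dump the base and every snapshot LV to a file
dd if=/dev/<vg>/<snapshot-lv> of=<snapshot-file> bs=1M
# 2) find where the backing file breaks
qemu-img info --backing-chain <top-snapshot-file>
# 3) fix the backing-file string (and its length field in the qcow2
#    header) with hexedit, on the broken file only
hexedit <broken-snapshot-file>
# 5) flatten the chain to qcow2 and to raw
qemu-img convert <top-snapshot-file> -O qcow2 <image.qcow2>
qemu-img convert <top-snapshot-file> -O raw <image.raw>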


I know our solution was longer and harder, but this way, if something 
went wrong, we would only have destroyed a copy of the original data; 
besides, we had no success compiling qemu-img from git on our 4.0.4 
hosts because of the glibc and gthread versions.


Anyway, we will have to check all the VMs (about 200) for snapshots with 
too-long backing files, and then we would like to fix the problem 
directly on the LV snapshots (our method would require, I suppose, 2-3 
months of work), so we need a working patched qemu-img.


I will let you know if Fedora 24 still gives problems with the compiler 
and gthread versions.


I opened another case on the mailing list, because since we upgraded 
to 4.0.4, every VM that we shut down never starts again, with the 
following error


"Unable to get size for domain ..."

Do you think it could be related to the backing file problem?

For now thanks to all
Claudio

On 19/10/16 16:55, Adam Litke wrote:

On 19/10/16 14:43 +0200, Claudio Soprano wrote:

Hi Adam, we tried your solution.

This is our situation with the current VM that has 2 disks

base -> snap1 -> snap2 -> snap3 -> snap4 -> snap5 -> .. -> snap15 for 
each disk


We tried to do

qemu-img rebase -u -b base snap1

results OK

qemu-img rebase -u -b snap1 snap2

results:

qemu-img: Could not open 'snap2': Backing file name too long

our qemu version is

qemu-img version 2.3.0 (qemu-kvm-ev-2.3.0-31.el7.16.1), Copyright (c) 
2004-2008 Fabrice Bellard


How do you think we can resolve this?


I talked with the qemu developers about this issue and the best way to
fix this is by using a patched version of qemu-img that ignores
invalid backing_file values when doing an unsafe rebase.  Here is what
you will need to do to fix your images.

1. Save the attached patch
2. Grab a copy of the latest qemu.git
3. Apply the patch to the source
4. Install qemu build dependencies
5. Build qemu
6. Run the built version of qemu-img when fixing your chain as I
  suggested above:

   ./qemu-img rebase -u -b snap1 snap2

The patch disables other qemu-img functionality since you should not
be using this for anything but the rebase part.  After the rebase you
can use the system qemu-img binary to check the image.  Please try
this on one VM disk and make sure everything is okay.



--

Claudio Soprano                phone:  (+39)-06-9403.2349/2355
Computing Service              fax:    (+39)-06-9403.2649
LNF-INFN                       e-mail: claudio.sopr...@lnf.infn.it
Via Enrico Fermi, 40           www:    http://www.lnf.infn.it/
I-00044 Frascati, Italy

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Unable to get volume size for domain ... after engine upgrade 4.0.4

2016-10-20 Thread Claudio Soprano

Hello all,

we upgraded the engine (standalone) from 3.6.5 to 4.0.4 for the Backing 
file too long bug.


we upgraded 2 of our 10 hosts from 3.6.5 to 4.0.4 too.

we tried to shut down some VMs, and now when we try to start them again
we get

"Unable to get volume size for domain 
384f9059-ef2f-4d43-a54f-de71c5d589c8 volume 
83ab4406-ea8d-443e-b64b-77b4e1dcb978"


The domain changes for each VM we try to start. Some VMs have 
snapshots, but one was cloned from a snapshot and has no snapshots itself.


This cloned VM also doesn't start anymore after the shutdown; same error.

These VMs don't start on either 3.6.5 or 4.0.4 hosts.

I am hoping for any help, because we have about 200 VMs and we don't 
know if they will start again after a shutdown.


Claudio


--

Claudio Soprano                phone:  (+39)-06-9403.2349/2355
Computing Service              fax:    (+39)-06-9403.2649
LNF-INFN                       e-mail: claudio.sopr...@lnf.infn.it
Via Enrico Fermi, 40           www:    http://www.lnf.infn.it/
I-00044 Frascati, Italy

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Problem with backing-file, how to fix the backing-chain ?

2016-10-19 Thread Claudio Soprano
of the backing file is wrong and too long 
already in the previous snapshots; is there a way to fix it or to edit 
it manually?


Obviously we tried to clone, export, and create a qcow2 image from all 
the snapshots later than 07 August, but the operation didn't complete; 
we can recover only from the snapshot of 7 August, which is missing 2 
months of new data.


Please, if you have a workaround or a solution, can you write the 
commands we need to run, with examples? We searched a lot about backing 
files, but we only found the manual of the qemu-img command, with no 
examples of how to recover or change it.


Thanks again
Claudio Soprano


--

Claudio Soprano                phone:  (+39)-06-9403.2349/2355
Computing Service              fax:    (+39)-06-9403.2649
LNF-INFN                       e-mail: claudio.sopr...@lnf.infn.it
Via Enrico Fermi, 40           www:    http://www.lnf.infn.it/
I-00044 Frascati, Italy


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Problem with backing-file, how to fix the backing-chain ?

2016-10-19 Thread Claudio Soprano

Hi Adam, we tried your solution.

This is our situation with the current VM that has 2 disks

base -> snap1 -> snap2 -> snap3 -> snap4 -> snap5 -> .. -> snap15 for 
each disk


We tried to do

qemu-img rebase -u -b base snap1

results OK

qemu-img rebase -u -b snap1 snap2

results:

qemu-img: Could not open 'snap2': Backing file name too long

our qemu version is

qemu-img version 2.3.0 (qemu-kvm-ev-2.3.0-31.el7.16.1), Copyright (c) 
2004-2008 Fabrice Bellard


How do you think we can resolve this?

Thank you

Claudio Soprano


On 17/10/16 21:57, Adam Litke wrote:


Have you tried a qemu-img 'unsafe' rebase?

Let's say you have the chain:
   base <- middle <- top

and due to the old backing chain links we have

base parent: None
middle parent: ../uuid/../uuid/../uuid/base
top parent: ../uuid/../uuid/middle

Since the volumes are always stored in the same directory, the
repeated ../uuid/ parts of the path are unnecessary and can always be
replaced with just the last part.  Once making the LVs available on
the system you should be able to fix the paths with the following
commands:

qemu-img rebase -u -b base middle
qemu-img rebase -u -b middle top

You can then verify the whole backing chain by:
qemu-img info --backing-chain top

Let me know if this helps.


On 17/10/16 10:34 +0200, Dael Maselli wrote:

Hi all,

We ran an oVirt environment before with engine v3.6.5 (if I remember
correctly) and now with v4.0.4 (we upgraded because we read that the
backing-file bug was resolved in v4).

We upgraded some of the host machines (but still not all) to v4.0.4 too,
to see if this would fix the problem, but nothing.

The problem is that we have several VMs with snapshots: we take daily,
weekly and monthly snapshots, keep some of them (usually the fresh ones)
and remove the old ones (which, when they are weekly snapshots, sit in
the middle of a series of snapshots). Over time this has produced the
famous

Backing file too long bug.

So we upgraded the engine from 3.6.5 to 4.0.4 (latest available).

We discovered this bug when we tried to upgrade a host to v4.0.4: while
doing so a VM on the host didn't migrate, so we shut it down and tried
to run it on another host, but never succeeded because of the bug.

We don't know if we have more VMs in this situation because we upgraded
only 2 of our 10 hosts.

Investigating the problem, we discovered that the backing file indicated
in each of the LVM snapshots reports a very long path,
/dev/storage-domain-id/../image-group-id/, with ../image-group-id/
repeated many times and /parentid at the end.

So, to understand what the right path should contain, we cloned a VM on
v4.0.4 and then took 4 snapshots; now the backing file path is

/dev/storage-domain-id/parentid

Is there a way to modify the path in the backing file, or a way to
recover the VM from this state?

Where does the information about the backing-file path reside?

I attach here all the commands we ran

On the ovirt manager (host with the engine only) we run

ovirt-shell

[oVirt shell (connected)]# list disks --parent-vm-name vm1

id : 2df25a13-6958-40a8-832f-9a26ce65de0f
name   : vm1_Disk2

id : 8cda0aa6-9e25-4b50-ba00-b877232a1983
name   : vm1_Disk1

[oVirt shell (connected)]# show disk 8cda0aa6-9e25-4b50-ba00-b877232a1983


id   : 8cda0aa6-9e25-4b50-ba00-b877232a1983
name : vm1_Disk1
actual_size  : 1073741824
alias: vm1_Disk1
disk_profile-id  : 1731f79a-5034-4270-9a87-94d93025deac
format   : cow
image_id : 7b354e2a-2099-4f2a-80b7-fba7d1fd13ee
propagate_errors : False
provisioned_size : 17179869184
shareable: False
size : 17179869184
sparse   : True
status-state : ok
storage_domains-storage_domain-id: 384f9059-ef2f-4d43-a54f-de71c5d589c8
storage_type : image
wipe_after_delete: False

[root@ovc1mgr ~]# su - postgres
Last login: Fri Oct 14 01:02:14 CEST 2016
-bash-4.2$ psql -d engine -U postgres
psql (9.2.15)
Type "help" for help.

engine=# \x on
engine=# select * from images where image_group_id =
'8cda0aa6-9e25-4b50-ba00-b877232a1983' order by creation_date;

-[ RECORD 1 ]-+-
image_guid| 60ba7acf-58cb-475b-b9ee-15b1be99fee6
creation_date | 2016-03-29 15:12:34+02
size  | 17179869184
it_guid   | ----
parentid  | ----
imagestatus   | 4
lastmodified  | 2016-04-21 11:25:59.972+02
vm_snapshot_id| 27c187cd-989f-4f7a-ac05-49c4410de6c2
volume_type   | 1
volume_format | 5
image_group_id| 8cda0aa6-9e25-4b50-ba00-b877232a1983
_cr