[ovirt-users] Re: Error while removing snapshot: Unable to get volume info

2022-01-10 Thread Nir Soffer
On Mon, Jan 10, 2022 at 5:22 PM Francesco Lorenzini via Users <
users@ovirt.org> wrote:

> My problem should be the same as the one filed here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1948599
>
> So, if I'm correct I must edit DB entries to fix the situations. Although
> I don't like to operate directly the DB, I'll try that and let you know if
> I resolve it.
>

It looks like the volume was already removed on the vdsm side, so when engine
tries to merge, the merge fails.

This is an engine bug - it should handle this case and remove the illegal
snapshot in the DB. But since it does not, you have to do this manually.

Please file an engine bug for this issue.


>
> In the meanwhile, if anyone has any tips or suggestion that doesn't
> involve editing the DB, much appreciate it.
>

I don't think there is another way.

Nir


>
> Regards,
> Francesco
>
> On 10/01/2022 10:33, francesco--- via Users wrote:
>
> Hi all,
>
> I'm trying to remove a snapshot from a HA VM in a setup with glusterfs (2 
> nodes C8 stream oVirt 4.4 + 1 arbiter C8). The error that appears in the vdsm 
> log of the host is:
>
> 2022-01-10 09:33:03,003+0100 ERROR (jsonrpc/4) [api] FINISH merge error=Merge 
> failed: {'top': '441354e7-c234-4079-b494-53fa99cdce6f', 'base': 
> 'fdf38f20-3416-4d75-a159-2a341b1ed637', 'job': 
> '50206e3a-8018-4ea8-b191-e4bc859ae0c7', 'reason': 'Unable to get volume info 
> for domain 574a3cd1-5617-4742-8de9-4732be4f27e0 volume 
> 441354e7-c234-4079-b494-53fa99cdce6f'} (api:131)
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/virt/livemerge.py", line 285, 
> in merge
> drive.domainID, drive.poolID, drive.imageID, job.top)
>   File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5988, in 
> getVolumeInfo
> (domainID, volumeID))
> vdsm.virt.errors.StorageUnavailableError: Unable to get volume info for 
> domain 574a3cd1-5617-4742-8de9-4732be4f27e0 volume 
> 441354e7-c234-4079-b494-53fa99cdce6f
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 124, in 
> method
> ret = func(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 776, in merge
> drive, baseVolUUID, topVolUUID, bandwidth, jobUUID)
>   File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5833, in merge
> driveSpec, baseVolUUID, topVolUUID, bandwidth, jobUUID)
>   File "/usr/lib/python3.6/site-packages/vdsm/virt/livemerge.py", line 288, 
> in merge
> str(e), top=top, base=job.base, job=job_id)
>
> The volume list in the host differs from the engine one:
>
> HOST:
>
> vdsm-tool dump-volume-chains 574a3cd1-5617-4742-8de9-4732be4f27e0 | grep -A10 
> 0b995271-e7f3-41b3-aff7-b5ad7942c10d
>image:0b995271-e7f3-41b3-aff7-b5ad7942c10d
>
>  - fdf38f20-3416-4d75-a159-2a341b1ed637
>status: OK, voltype: INTERNAL, format: COW, legality: LEGAL, 
> type: SPARSE, capacity: 53687091200, truesize: 44255387648
>
>  - 10df3adb-38f4-41d1-be84-b8b5b86e92cc
>status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: 
> SPARSE, capacity: 53687091200, truesize: 7335407616
>
> ls -1 0b995271-e7f3-41b3-aff7-b5ad7942c10d
> 10df3adb-38f4-41d1-be84-b8b5b86e92cc
> 10df3adb-38f4-41d1-be84-b8b5b86e92cc.lease
> 10df3adb-38f4-41d1-be84-b8b5b86e92cc.meta
> fdf38f20-3416-4d75-a159-2a341b1ed637
> fdf38f20-3416-4d75-a159-2a341b1ed637.lease
> fdf38f20-3416-4d75-a159-2a341b1ed637.meta
>
>
> ENGINE:
>
> engine=# select * from images where 
> image_group_id='0b995271-e7f3-41b3-aff7-b5ad7942c10d';
> -[ RECORD 1 ]-+-
> image_guid| 10df3adb-38f4-41d1-be84-b8b5b86e92cc
> creation_date | 2022-01-07 11:23:43+01
> size  | 53687091200
> it_guid   | ----
> parentid  | 441354e7-c234-4079-b494-53fa99cdce6f
> imagestatus   | 1
> lastmodified  | 2022-01-07 11:23:39.951+01
> vm_snapshot_id| bd2291a4-8018-4874-a400-8d044a95347d
> volume_type   | 2
> volume_format | 4
> image_group_id| 0b995271-e7f3-41b3-aff7-b5ad7942c10d
> _create_date  | 2022-01-07 11:23:41.448463+01
> _update_date  | 2022-01-07 11:24:10.414777+01
> active| t
> volume_classification | 0
> qcow_compat   | 2
> -[ RECORD 2 ]-+-
> image_guid| 441354e7-c234-4079-b494-53fa99cdce6f
> creation_date | 2021-12-15 07:16:31.647+01
> size  | 53687091200
> it_guid   | ----
> parentid  | fdf38f20-3416-4d75-a159-2a341b1ed637
> imagestatus   | 1
> lastmodified  | 2022-01-07 11:23:41.448+01
> vm_snapshot_id| 2d610958-59e3-4685-b209-139b4266012f
> volume_type   | 2
> 

[ovirt-users] Re: using stop_reason as a vdsm hook trigger into the UI

2021-12-20 Thread Nir Soffer
On Mon, Dec 20, 2021 at 9:59 PM Nathanaël Blanchet  wrote:

Adding the devel list since the question is more about extending oVirt
...
> The idea is to use the stop_reason element into the vm xml definition. But 
> after hours, I realized that this element is written to the vm definition file 
> only after the VM has been destroyed.

So you want to run the clean hook only if stop reason == "clean"?

I think the way to integrate hooks is to define a custom property
on the vm, and check in the hook whether the property was defined.

For example, this is how the localdisk hook is triggered:

def main():
    backend = os.environ.get('localdisk')
    if backend is None:
        return
    if backend not in [BACKEND_LVM, BACKEND_LVM_THIN]:
        hooking.log("localdisk-hook: unsupported backend: %r" % backend)
        return
    ...

The hook runs only if the environment variable "localdisk" is defined
and configured properly.

vdsm defines the custom properties as environment variables.

On the engine side, you need to add a user defined property:

 engine-config -s UserDefinedVMProperties='localdisk=^(lvm|lvmthin)$'

And configure a custom property with one of the allowed values, like:

localdisk=lvm

See vdsm_hooks/localdisk/README for more info.
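
If you go this way, a cleanup hook could follow the same pattern. Below is a
minimal sketch, not an existing hook - the "cleanvm" property name, its allowed
values, the cleanup command, and installing it under a hook point such as
before_vm_destroy are all assumptions:

import os
import subprocess

import hooking


def main():
    # Hypothetical custom property; it would be registered on the engine side
    # with something like:
    #   engine-config -s UserDefinedVMProperties='cleanvm=^(on|off)$'
    value = os.environ.get('cleanvm')
    if value is None:
        return
    if value != 'on':
        hooking.log("cleanvm-hook: unsupported value: %r" % value)
        return
    # Placeholder for your cleanup code.
    subprocess.check_call(["/usr/local/bin/clean-vm"])


if __name__ == '__main__':
    main()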

If you want to control the cleanup by adding a "clean" stop reason only when
needed, this will not help, and a vdsm hook is probably not the right way
to integrate this.

If your intent is to clean up a vm on certain events, but you want to integrate
this in engine, maybe you should write an engine ui plugin?

The plugin can show the running vms, and provide a clean button that will
shut down the vm and run your custom code.

But maybe you don't need to integrate this in engine at all - a simple script
using the ovirt engine API/SDK to shut down the vm and run the cleanup code
may be enough.
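
For example, a minimal sketch using the Python SDK (the engine URL,
credentials, vm name and cleanup command are placeholders):

import subprocess
import time

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://myengine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    ca_file='ca.pem',
)

vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=myvm')[0]
vm_service = vms_service.vm_service(vm.id)

# Ask the guest to shut down and wait until it is down.
vm_service.shutdown()
while vm_service.get().status != types.VmStatus.DOWN:
    time.sleep(5)

# Run the custom cleanup code.
subprocess.run(['/usr/local/bin/clean-vm', vm.name], check=True)

connection.close()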

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/V2RYRKKGEPK7PASKYDLD6WZ5F2G6P4KH/


[ovirt-users] Re: vdsm-client delete_checkpoints

2021-12-20 Thread Nir Soffer
On Mon, Dec 20, 2021 at 7:42 PM Tommaso - Shellrent via Users
 wrote:
>
> Hi, someone can give to use us an exemple of the command
>
> vdsm-client VM delete_checkpoints
>
> ?
>
> we have tried a lot of combinations like:
>
> vdsm-client VM delete_checkpoints vmID="ce5d0251-e971-4d89-be1b-4bc28283614c" 
> checkpoint_ids=["e0c56289-bfb3-4a91-9d33-737881972116"]

Why do you need to use this with vdsm?

You should use the ovirt engine API or SDK. We have an example script here:
https://github.com/oVirt/python-ovirt-engine-sdk4/blob/main/examples/remove_checkpoint.py

$ ./remove_checkpoint.py -h
usage: remove_checkpoint.py [-h] -c CONFIG [--debug] [--logfile
LOGFILE] vm_uuid checkpoint_uuid

Remove VM checkpoint

positional arguments:
  vm_uuid   VM UUID for removing checkpoint.
  checkpoint_uuid   The removed checkpoint UUID.

options:
  -h, --helpshow this help message and exit
  -c CONFIG, --config CONFIG
Use engine connection details from [CONFIG]
section in ~/.config/ovirt.conf.
  --debug   Log debug level messages to logfile.
  --logfile LOGFILE Log file name (default example.log).

Regardless, if you really need to use vdsm-client, here is how to use it
correctly.

$ vdsm-client VM delete_checkpoints -h
usage: vdsm-client VM delete_checkpoints [-h] [arg=value [arg=value ...]]

positional arguments:
  arg=value   vmID: A UUID of the VM
  checkpoint_ids: List of checkpoints ids to delete,
ordered from oldest to newest.


  JSON representation:
  {
  "vmID": {
  "UUID": "UUID"
  },
  "checkpoint_ids": [
  "string",
  {}
  ]
  }

optional arguments:
  -h, --help  show this help message and exit

The vdsm-client is a tool for developers and support, not for users, so we did
not invest in creating an easy to use command line interface. Instead we made
sure that this tool needs zero maintenance - we never have to change it when
we add or change APIs, because it simply gets a JSON from the user and passes
it to vdsm as is.

So when you see the JSON representation, you can build the json request
like this:

$ cat args.json
{
"vmID": "6e95d38f-d9b8-4955-878c-da6d631d0ab2",
"checkpoint_ids": ["b8f3f8e0-660e-49e5-bbb0-58a87ed15b13"]
}

You need to run the command with the -f flag:

$ sudo vdsm-client -f args.json VM delete_checkpoints
{
"checkpoint_ids": [
"b8f3f8e0-660e-49e5-bbb0-58a87ed15b13"
]
}

If you need to automate this, it will be easier to write a python script using
the vdsm client library directly:

$ sudo python3
Python 3.6.8 (default, Oct 15 2021, 10:57:33)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from vdsm import client
>>> c = client.connect("localhost")
>>> c.VM.delete_checkpoints(vmID="6e95d38f-d9b8-4955-878c-da6d631d0ab2", 
>>> checkpoint_ids=["b8f3f8e0-660e-49e5-bbb0-58a87ed15b13"])
{'checkpoint_ids': ['b8f3f8e0-660e-49e5-bbb0-58a87ed15b13']}


Like vdsm-client, the library is meant for oVirt developers, and the vdsm API
is a private implementation detail of oVirt, so you should try to use the
public ovirt engine API instead.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3FCONBNYTX3KHBGGJ5RSZFZCGDSBQWEM/


[ovirt-users] Re: new host addition, Cannot find master domain

2021-12-08 Thread Nir Soffer
On Tue, Dec 7, 2021 at 10:04 AM david  wrote:
>
> hello
> i'm adding a new host into ovirt
> after the installation procces is finished I got an error: "VDSM kvm6 command 
> ConnectStoragePoolVDS failed: Cannot find master domain".
> on the vdsm server in /dev/disk/by-path/ i see the block device but 
> device-mapper not mappying it
> then I found the master's lun in the blacklist: 
> /etc/multipath/conf.d/vdsm_blacklist.conf
> why vdsm host put the lun wwn in blacklist

When adding a host, we run:

vdsm-tool config-lvm-filter

This command put the LUN in the blacklist since it appeared to be a local
disk used by the host. Did you have active logical volumes from this LUN
mounted on the host while the host was added to engine?

After you fix the issue (see Vojta's reply), please run again:

   vdsm-tool config-lvm-filter

The command may suggest changing the lvm filter and blacklisting the device.
Do not confirm; share the output instead.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NEO73WJYGVE7ZZDHOTFV4F32ZBXS3ZCQ/


[ovirt-users] Re: Attaching LVM logical volume to VM

2021-11-28 Thread Nir Soffer
On Sun, Nov 28, 2021 at 2:06 PM  wrote:
>
> Thanks for your reply. Greatly appreciated.
>
> > The oVirt way is to create a block storage domain, and create a raw
> > preallocated disk.
> > This will attach a logical volume to the VM.
>
> Problem is that I already have an existing LV that was used before.
> But I guess if I mount that LV on the host I could copy all the data over 
> through scp or similar to the VM with the newly created LV (which requires me 
> to have double the disk space).
>
> Do I understand correctly that a designated VG is used and LVs are created in 
> it and those are exposed as iSCSI volumes and mounted as such in the guest OS?
>

No. It works like this:

1. You configure an iSCSI server (e.g. using targetcli), and export one or
   more LUNs.
2. You create an iSCSI storage domain in oVirt, connecting to your iSCSI target
3. You select the LUN(s) that should be part of this storage domain
4. oVirt creates a VG with the specified LUN(s)
5. oVirt creates special logical volumes for management (~5G)
6. You create a new disk in oVirt on this storage domain; to match the raw LV
   you used before, use "allocation policy: preallocated"
7. oVirt creates a logical volume in the VG

So if you want to use the existing LV content, you need to create the disk in
oVirt, and then copy the data from the original LV to the new disk.

You need to find the logical volume created for the VM disk; this can be done
using:

   lvs -o lv_name,tags storage-domain-id | grep disk-id

Then you need to activate both LVs, and copy the image:

   lvchange -ay old-vg/old-lv
   lvchange -ay storage-domain-id/lv-name
   qemu-img convert -p -f raw -O raw -t none -T none -n -W \
       /dev/old-vg/old-lv /dev/storage-domain-id/lv-name
   lvchange -an old-vg/old-lv
   lvchange -an storage-domain-id/lv-name

> The disk space is already allocated to VolumeGroups so I'd have to fiddle to 
> be able to get an extra VG for that storage domain unless I add a disk.

Note that the iSCSI storage domain must be accessible to all hosts in the
data center. If it is not, you will not be able to activate all the hosts.

Keeping the storage on one of the hypervisors is possible but not recommended.

> I prefer to keep it simple. Perhaps I should create an NFS share and mount 
> that in the guest but then again I'm not that fond of putting it on an NFS 
> share.
>
> >
> > The difference is that the logical volume is served by iSCSI or FC
> > storage server,
> > and you can migrate the VM between hosts.
> >
> > If you really want to use local host storage, the best way is to
> > create a local DC
> > that will include only this host, and use raw preallocated file on the
> > local storage
> > domain.
>
> That would kind of defeat the purpose for me. With the setup I had I could 
> directly mount the LV on the host.
> Perhaps I misunderstand what you're saying here.

oVirt does not support attaching LVs or a local file directly to VMs.

You can also continue to use libvirt and qemu to manage this VM. oVirt will
see it as an external VM running on the host, and it should not try to manage
it.

The feature you need - attaching a logical volume to a VM - does not exist, so
you need to either use what the system provides, or manage the VM outside of
the system.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VZGX6VBATSBPIHLGVI5LCTAFIXYMMHJ5/


[ovirt-users] Re: Attaching LVM logical volume to VM

2021-11-28 Thread Nir Soffer
On Sun, Nov 28, 2021 at 12:37 PM  wrote:
>
> Hi,
>
> I've just migrated from Libvirt/kvm to oVirt.
> In libvirt I had a VM that had an LVM logical volume that was attached to a 
> guest as a disk.
>
> However in oVirt I can't immediately find such a capability. I understand 
> that this would "pin" my VM to this host but that's perfectly fine.

Attaching a local logical volume is not supported.

> Any pointers how this can be done?

The oVirt way is to create a block storage domain, and create a raw
preallocated disk.
This will attach a logical volume to the VM.

The difference is that the logical volume is served by an iSCSI or FC storage
server, and you can migrate the VM between hosts.

If you really want to use local host storage, the best way is to create a
local DC that will include only this host, and use a raw preallocated file on
the local storage domain. The performance of LVM or XFS on LVM is almost the
same, and some operations like fallocate(PUNCH_HOLE) can be 100 times faster
on XFS.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GS4OMA2ELI35K5RQ2CB6ZZH4DNLFB532/


[ovirt-users] Re: ILLEGAL volume delete via vdsm-client

2021-11-17 Thread Nir Soffer
On Wed, Nov 17, 2021 at 12:14 PM Francesco Lorenzini
 wrote:
>
> It worked now, my bad.
>
> I mistakenly thought that the Storage Pool ID was the ID of the SPM Host and 
> not the one listed under /rhev/data-center/.
>
> I used the correct SPUUID and everything worked correctly:
>
> vdsm-client Volume delete storagepoolID=609ff8db-09c5-435b-b2e5-023d57003138 
> volumeID=5cb3fe58-3e01-4d32-bc7c-5907a4f858a8 
> storagedomainID=e25db7d0-060a-4046-94b5-235f38097cd8 imageID=4d
> 79c1da-34f0-44e3-8b92-c4bcb8524d83 force=true postZero=False
> "d85d0a83-fec0-4472-87fd-e61c8c3e0608"
>
> Removed the task ID d85d0a83-fec0-4472-87fd-e61c8c3e0608 by  vdsm-client Task 
> clear taskID=d85d0a83-fec0-4472-87fd-e61c8c3e0608
>
> Sorry for the waste of time and thank you.

No problem, we are happy to help.

I think the error message could easily be improved - this is the relevant code:

@classmethod
def getPool(cls, spUUID):
    if cls._pool.is_connected() and cls._pool.spUUID == spUUID:
        return cls._pool

    # Calling when pool is not connected or with wrong pool id is client
    # error.
    raise exception.expected(se.StoragePoolUnknown(spUUID))

We should really report different errors for unconnected pool and
incorrect pool id,
for example:

Incorrect pool id 'c0e7a0c5-8048-4f30-af08-cbd17d797e3b', connected to
pool '609ff8db-09c5-435b-b2e5-023d57003138'
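
For example, a minimal sketch of how getPool could distinguish the two cases
(reusing the same exception helpers; passing a message to StoragePoolUnknown
is an assumption, the real fix may need a dedicated error type):

@classmethod
def getPool(cls, spUUID):
    if not cls._pool.is_connected():
        # No pool is connected at all - client error or host not active.
        raise exception.expected(se.StoragePoolUnknown(spUUID))
    if cls._pool.spUUID != spUUID:
        # Client used the wrong pool id - report which pool is connected.
        raise exception.expected(
            se.StoragePoolUnknown(
                "Incorrect pool id %r, connected to pool %r"
                % (spUUID, cls._pool.spUUID)))
    return cls._pool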

If you think this is useful please file a bug to improve this.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QMEQQB5KA2YBK4RH6J4FPJ4FM76KA45R/


[ovirt-users] Re: ILLEGAL volume delete via vdsm-client

2021-11-17 Thread Nir Soffer
On Wed, Nov 17, 2021 at 10:32 AM Francesco Lorenzini <
france...@shellrent.com> wrote:

> Hi Nir,
>
> sadly in the vdsm log there aren't more useful info:
>
> 2021-11-17 09:27:59,419+0100 INFO  (Reactor thread)
> [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:56392
> (protocoldetector:61)
> 2021-11-17 09:27:59,426+0100 WARN  (Reactor thread) [vds.dispatcher]
> unhandled write event (betterAsyncore:184)
> 2021-11-17 09:27:59,427+0100 INFO  (Reactor thread)
> [ProtocolDetector.Detector] Detected protocol stomp from ::1:56392
> (protocoldetector:125)
> 2021-11-17 09:27:59,427+0100 INFO  (Reactor thread) [Broker.StompAdapter]
> Processing CONNECT request (stompserver:95)
> 2021-11-17 09:27:59,427+0100 INFO  (JsonRpc (StompReactor))
> [Broker.StompAdapter] Subscribe command received (stompserver:124)
> 2021-11-17 09:27:59,448+0100 INFO  (jsonrpc/0) [vdsm.api] START
> deleteVolume(sdUUID='e25db7d0-060a-4046-94b5-235f38097cd8',
> spUUID='c0e7a0c5-8048-4f30-af08-cbd17d797e3b',
> imgUUID='4d79c1da-34f0-44e3-8b92-c4bcb8524d83',
> volumes=['5cb3fe58-3e01-4d32-bc7c-5907a4f858a8'], postZero='False',
> force='true', discard=False) from=::1,56392,
> task_id=33647125-f8b5-494e-a8e8-17a6fc8aad4d (api:48)
> 2021-11-17 09:27:59,448+0100 INFO  (jsonrpc/0) [vdsm.api] FINISH
> deleteVolume error=Unknown pool id, pool not connected:
> ('c0e7a0c5-8048-4f30-af08-cbd17d797e3b',) from=::1,56392,
> task_id=33647125-f8b5-494e-a8e8-17a6fc8aad4d (api:52)
> 2021-11-17 09:27:59,448+0100 INFO  (jsonrpc/0) [storage.TaskManager.Task]
> (Task='33647125-f8b5-494e-a8e8-17a6fc8aad4d') aborting: Task is aborted:
> "value=Unknown pool id, pool not connected:
> ('c0e7a0c5-8048-4f30-af08-cbd17d797e3b',) abortedcode=309" (task:1182)
> 2021-11-17 09:27:59,448+0100 INFO  (jsonrpc/0) [storage.Dispatcher] FINISH
> deleteVolume error=Unknown pool id, pool not connected:
> ('c0e7a0c5-8048-4f30-af08-cbd17d797e3b',) (dispatcher:81)
> 2021-11-17 09:27:59,448+0100 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
> call Volume.delete failed (error 309) in 0.00 seconds (__init__:312)
>
>
> I know the vdsm-client is maintained by the infra team, should I open a
> new thread there?
>

No, this is not a vdsm-client issue.

Can you share a complete vdsm log from this host?

The only possible reasons this can fail are:
- the pool id is incorrect (do you have multiple DCs?)
- the host is not connected to storage (is the host UP in engine?)

Correction for my previous reply - the fact that Volume.getInfo() works does
not prove that the pool id is correct or that the host is connected to storage.
This call does not check the pool id and does not depend on the host being
active.

If the host is connected to the pool, you should see:

# ls -lh /rhev/data-center/c0e7a0c5-8048-4f30-af08-cbd17d797e3b/mastersd/
total 8.0K
drwxr-xr-x.  2 vdsm kvm  103 Jun  4 01:49 dom_md
drwxr-xr-x. 67 vdsm kvm 4.0K Nov 16 16:37 images
drwxr-xr-x.  5 vdsm kvm 4.0K Oct 10 13:33 master

> Francesco
>
> On 16/11/2021 18:19, Nir Soffer wrote:
>
> On Tue, Nov 16, 2021 at 4:07 PM francesco--- via Users  
>  wrote:
>
> Hi all,
>
> I'm trying to delete via vdsm-client toolan illegal volume that is not listed 
> in the engine database. The volume ID is 5cb3fe58-3e01-4d32-bc7c-5907a4f858a8:
>
> [root@ovirthost ~]# vdsm-tool dump-volume-chains 
> e25db7d0-060a-4046-94b5-235f38097cd8
>
> Images volume chains (base volume first)
>
>image:4d79c1da-34f0-44e3-8b92-c4bcb8524d83
>
>  Error: more than one volume pointing to the same parent volume 
> e.g: (_BLANK_UUID<-a), (a<-b), (a<-c)
>
>  Unordered volumes and children:
>
>  - ---- <- 
> 5aad30c7-96f0-433d-95c8-2317e5f80045
>status: OK, voltype: INTERNAL, format: COW, legality: LEGAL, 
> type: SPARSE, capacity: 214748364800, truesize: 165493616640
>
>  - 5aad30c7-96f0-433d-95c8-2317e5f80045 <- 
> 5cb3fe58-3e01-4d32-bc7c-5907a4f858a8
>status: OK, voltype: LEAF, format: COW, legality: ILLEGAL, 
> type: SPARSE, capacity: 214748364800, truesize: 8759619584
>
>  - 5aad30c7-96f0-433d-95c8-2317e5f80045 <- 
> 674e85d8-519a-461f-9dd6-aca44798e088
>status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: 
> SPARSE, capacity: 214748364800, truesize: 200704
>
> With the command vdsm-client Volume getInfo I can retrieve the info about the 
> volume 5cb3fe58-3e01-4d32-bc7c-5907a4f858a8:
>
>  vdsm-client Volume getInfo 
> storagepoolID=c0e7a0c5-8048-4f30-af08-cbd17d797e3b 
> volumeID=5cb3fe58-3e01-4d32-bc7c-5907a4f858a8 
> storagedomainID=e25db7d0-060a-4046-94b5-235f38097cd8 
> imageID=4d79c1da-34f0-44e3-8b92-c4bcb8524d83
> {
> &

[ovirt-users] Re: ILLEGAL volume delete via vdsm-client

2021-11-16 Thread Nir Soffer
On Tue, Nov 16, 2021 at 4:07 PM francesco--- via Users  wrote:
>
> Hi all,
>
> I'm trying to delete via vdsm-client toolan illegal volume that is not listed 
> in the engine database. The volume ID is 5cb3fe58-3e01-4d32-bc7c-5907a4f858a8:
>
> [root@ovirthost ~]# vdsm-tool dump-volume-chains 
> e25db7d0-060a-4046-94b5-235f38097cd8
>
> Images volume chains (base volume first)
>
>image:4d79c1da-34f0-44e3-8b92-c4bcb8524d83
>
>  Error: more than one volume pointing to the same parent volume 
> e.g: (_BLANK_UUID<-a), (a<-b), (a<-c)
>
>  Unordered volumes and children:
>
>  - ---- <- 
> 5aad30c7-96f0-433d-95c8-2317e5f80045
>status: OK, voltype: INTERNAL, format: COW, legality: LEGAL, 
> type: SPARSE, capacity: 214748364800, truesize: 165493616640
>
>  - 5aad30c7-96f0-433d-95c8-2317e5f80045 <- 
> 5cb3fe58-3e01-4d32-bc7c-5907a4f858a8
>status: OK, voltype: LEAF, format: COW, legality: ILLEGAL, 
> type: SPARSE, capacity: 214748364800, truesize: 8759619584
>
>  - 5aad30c7-96f0-433d-95c8-2317e5f80045 <- 
> 674e85d8-519a-461f-9dd6-aca44798e088
>status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: 
> SPARSE, capacity: 214748364800, truesize: 200704
>
> With the command vdsm-client Volume getInfo I can retrieve the info about the 
> volume 5cb3fe58-3e01-4d32-bc7c-5907a4f858a8:
>
>  vdsm-client Volume getInfo 
> storagepoolID=c0e7a0c5-8048-4f30-af08-cbd17d797e3b 
> volumeID=5cb3fe58-3e01-4d32-bc7c-5907a4f858a8 
> storagedomainID=e25db7d0-060a-4046-94b5-235f38097cd8 
> imageID=4d79c1da-34f0-44e3-8b92-c4bcb8524d83
> {
> "apparentsize": "8759676160",
> "capacity": "214748364800",
> "children": [],
> "ctime": "1634958924",
> "description": "",
> "disktype": "DATA",
> "domain": "e25db7d0-060a-4046-94b5-235f38097cd8",
> "format": "COW",
> "generation": 0,
> "image": "4d79c1da-34f0-44e3-8b92-c4bcb8524d83",
> "lease": {
> "offset": 0,
> "owners": [],
> "path": 
> "/rhev/data-center/mnt/ovirthost.com:_data/e25db7d0-060a-4046-94b5-235f38097cd8/images/4d79c1da-34f0-44e3-8b92-c4bcb8524d83/5cb3fe58-3e01-4d32-bc7c-5907a4f858a8.lease",
> "version": null
> },
> "legality": "ILLEGAL",
> "mtime": "0",
> "parent": "5aad30c7-96f0-433d-95c8-2317e5f80045",
> "pool": "",
> "status": "ILLEGAL",
> "truesize": "8759619584",
> "type": "SPARSE",
> "uuid": "5cb3fe58-3e01-4d32-bc7c-5907a4f858a8",
> "voltype": "LEAF"
> }
>
> I can't remove it due to the following error:
>
> vdsm-client Volume delete storagepoolID=c0e7a0c5-8048-4f30-af08-cbd17d797e3b 
> volumeID=5cb3fe58-3e01-4d32-bc7c-5907a4f858a8 
> storagedomainID=e25db7d0-060a-4046-94b5-235f38097cd8 
> imageID=4d79c1da-34f0-44e3-8b92-c4bcb8524d83 force=true
> vdsm-client: Command Volume.delete with args {'storagepoolID': 
> 'c0e7a0c5-8048-4f30-af08-cbd17d797e3b', 'volumeID': 
> '5cb3fe58-3e01-4d32-bc7c-5907a4f858a8', 'storagedomainID': 
> 'e25db7d0-060a-4046-94b5-235f38097cd8', 'imageID': 
> '4d79c1da-34f0-44e3-8b92-c4bcb8524d83', 'force': 'true'} failed:
> (code=309, message=Unknown pool id, pool not connected: 
> ('c0e7a0c5-8048-4f30-af08-cbd17d797e3b',))

If the pool id works in Volume getInfo, it should be correct.

>
> I'm performing the operation directly on the SPM. I searched for a while but 
> I didn't find anything usefull. Any tips or doc that I missed?

What you are doing looks right. Did you look in vdsm.log? I'm sure there is
more info about the error there.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4GKETEMAYDMHF2KU6T6FFDSX7KAN6EXV/


[ovirt-users] Re: cloning a VM or creating a template speed is so so slow

2021-11-14 Thread Nir Soffer
On Thu, Nov 11, 2021 at 4:33 AM Pascal D  wrote:
>
> I have been trying to figure out why cloning a VM and creating a template 
> from ovirt is so slow. I am using ovirt 4.3.10 over NFS. My NFS server is 
> running NFS 4 over RAID10 with SSD disks over a 10G network and 9000 MTU
>
> Therocially I should be writing a 50GB file in around 1m30s
> a direct copy from the SPM host server of an image to another image on the 
> same host takes 6m34s
> a cloning from ovirt takes around 29m
>
> So quite a big difference. Therefore I started investigating and found that 
> ovirt launches a qemu-img process with no source and target cache. Therefore 
> thinking that could be the issue, I change the cache mode to writeback and 
> was able to run the exact command in 8m14s. Over 3 times faster. I haven't 
> tried yet other parameters line -o preallocation=metadata

-o preallocation=metadata may work for files; we don't use it since it is not
compatible with block storage (it requires allocation of the entire volume
upfront).

> but was wondering why no cache was selected and how to change it to use cache 
> writeback

We don't use the host page cache. There are several issues:

- Reading stale data after another host changes an image on shared storage.
  This should probably not happen with NFS.

- Writing to the page cache pollutes the page cache with data that is unlikely
  to be needed, since vms also do not use the page cache (for other reasons),
  so you may reclaim memory that should be used by your vms during the copy.

- The kernel likes to buffer a huge amount of data and flush too much data at
  the same time. This causes delays in accessing storage during flushing,
  which may break sanlock leases that must have access to storage to update
  the storage leases.

We improved copy performance a few years ago using the -W option, allowing
concurrent writes. This can speed up copy to block storage (iscsi/fc)
up to 6 times[1].

When we tested this with NFS, we did not see a big improvement, so we did not
enable it. It is also recommended to use -W only for raw preallocated disks,
since it may cause fragmentation otherwise.

You can try to change this in vdsm/storage/sd.py:

def recommends_unordered_writes(self, format):
    """
    Return True if unordered writes are recommended for copying an image
    using format to this storage domain.

    Unordered writes improve copy performance but are recommended only for
    preallocated devices and raw format.
    """
    return format == sc.RAW_FORMAT and not self.supportsSparseness

This allows -W only on raw preallocated disks. So it will not be used for
raw-sparse (NFS thin) or qcow2-sparse (snapshots on NFS), or for
qcow2 on block storage.

We use unordered writes for any disk in ovirt-imageio, and other tools
like nbdcopy
also always enable unordered writes, so maybe we should enable it in all cases.

To enable unordered writes for any volume, change this to:

def recommends_unordered_writes(self, format):
    """
    Allow unordered writes on any storage in any format.
    """
    return True

If you want to always enable this, but only for file storage (NFS, GlusterFS,
LocalFS, posix), add this method in vdsm/storage/nfsSD.py:

class FileStorageDomainManifest(sd.StorageDomainManifest):
    ...
    def recommends_unordered_writes(self, format):
        """
        Override StorageDomainManifest to also allow unordered writes on
        qcow2 and raw sparse images.
        """
        return True

Please report how it works for you.

If this gives good results, file a bug to enable the option.

I think we can enable this based on vdsm configuration, so it will be
easy to disable
the option if it causes trouble with some storage domain types or image formats.
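
As a rough sketch of how that could look (the configuration section and option
name are made up - no such option exists today), the method above could become:

from vdsm.config import config

def recommends_unordered_writes(self, format):
    """
    Drop-in replacement for the method above, gated by a hypothetical
    configuration option so it can be turned off per deployment.
    """
    # 'unordered_writes_all_formats' would also need to be added to vdsm's
    # config defaults.
    if config.getboolean('irs', 'unordered_writes_all_formats'):
        return True
    return format == sc.RAW_FORMAT and not self.supportsSparseness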

> command launched by ovirt:
>  /usr/bin/qemu-img convert -p -t none -T none -f qcow2 
> /rhev/data-center/mnt/nas1.bfit:_home_VMS/8e6bea49-9c62-4e31-a3c9-0be09c2fcdbf/images/21f438fb-0c0e-4bdc-abb3-64a7e033cff6/c256a972-4328-4833-984d-fa8e62f76be8
>  -O qcow2 -o compat=1.1 
> /rhev/data-center/mnt/nas1.bfit:_home_VMS/8e6bea49-9c62-4e31-a3c9-0be09c2fcdbf/images/5a90515c-066d-43fb-9313-5c7742f68146/ed6dc60d-1d6f-48b6-aa6e-0e7fb1ad96b9

With the change suggested, this command will become:

/usr/bin/qemu-img convert -p -t none -T none -f qcow2
/rhev/data-center/mnt/nas1.bfit:_home_VMS/8e6bea49-9c62-4e31-a3c9-0be09c2fcdbf/images/21f438fb-0c0e-4bdc-abb3-64a7e033cff6/c256a972-4328-4833-984d-fa8e62f76be8
-O qcow2 -o compat=1.1 -W
/rhev/data-center/mnt/nas1.bfit:_home_VMS/8e6bea49-9c62-4e31-a3c9-0be09c2fcdbf/images/5a90515c-066d-43fb-9313-5c7742f68146/ed6dc60d-1d6f-48b6-aa6e-0e7fb1ad96b9

You can run this command in the shell without modifying vdsm, to check how it
affects performance.

[1] https://bugzilla.redhat.com/1511891#c57

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: 

[ovirt-users] Re: Upgraded to oVirt 4.4.9, still have vdsmd memory leak

2021-11-14 Thread Nir Soffer
On Wed, Nov 10, 2021 at 4:46 PM Chris Adams  wrote:
>
> I have seen vdsmd leak memory for years (I've been running oVirt since
> version 3.5), but never been able to nail it down.  I've upgraded a
> cluster to oVirt 4.4.9 (reloading the hosts with CentOS 8-stream), and I
> still see it happen.  One host in the cluster, which has been up 8 days,
> has vdsmd with 4.3 GB resident memory.  On a couple of other hosts, it's
> around half a gigabyte.

Can you share vdsm logs from the time vdsm started?

We have these logs:

2021-11-14 15:16:32,956+0200 DEBUG (health) [health] Checking health (health:93)
2021-11-14 15:16:32,977+0200 DEBUG (health) [health] Collected 5001
objects (health:101)
2021-11-14 15:16:32,977+0200 DEBUG (health) [health] user=2.46%,
sys=0.74%, rss=108068 kB (-376), threads=47 (health:126)
2021-11-14 15:16:32,977+0200 INFO  (health) [health] LVM cache hit
ratio: 97.64% (hits: 5431 misses: 131) (health:131)

They may provide useful info on the leak.

You need to enable DEBUG logs for root logger in /etc/vdsm/logger.conf:

[logger_root]
level=DEBUG
handlers=syslog,logthread
propagate=0

and restart vdsmd service.
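
Once DEBUG is enabled, a small script like this can pull the rss values out of
the health lines so the trend over time is easy to see (a sketch, assuming the
log format shown above):

import re
import sys

# Matches the periodic health check lines, e.g.
# ... DEBUG (health) [health] user=2.46%, sys=0.74%, rss=108068 kB (-376), ...
PATTERN = re.compile(r"^(\S+ \S+) DEBUG \(health\) \[health\] .*rss=(\d+) kB")

# Usage: python3 rss_trend.py /var/log/vdsm/vdsm.log
for line in open(sys.argv[1]):
    match = PATTERN.search(line)
    if match:
        print(match.group(1), match.group(2))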

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JDA34CQF5FTHVFTRXF4OGKEFJIKJL3NL/


[ovirt-users] Re: Viewing and hopefully, modifying the VM's qemu command line

2021-11-08 Thread Nir Soffer
On Mon, Nov 8, 2021 at 5:47 PM Gilboa Davara  wrote:
>
> Hello all,
>
> I'm setting up a fairly (?) complex oVirt over Gluster setup built around 3 
> Xeon servers-turned-into-workstations, each doubling as oVirt node + one 
> primary Fedora VM w/ a dedicated passthrough GPU (+audio and a couple of USB 
> root devices).
> One of the servers seems to have some weird issue w/ the passthrough nVidia 
> GPU that seems to require me to edit the VM iommu (1) and passthrough device 
> (2) command line.
> I tried using the qemu-cmdline addon to add the missing parameters, but it 
> seems that qemu treats the added parameters as an additional device / iommu 
> instead of editing the existing parameters.
>
> So:
> 1. How can I view the VM qemu command line?

less /var/log/libvirt/qemu/vm-name.log

> 2. Can I somehow manually edit the qemu command line, either directly or by 
> somehow adding parameters in the HE XML file?

I think this should be possible via a vdsm hook, but hooks are bad.
Can you explain what you want to change?

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RWLBQPUCPLS4SQKV6PSLH3ZVV6HPXFNV/


[ovirt-users] Re: is disk reduce command while VM is active after snapshot deletion save?

2021-11-03 Thread Nir Soffer
On Wed, Nov 3, 2021 at 3:58 PM  wrote:
>
> thnx for the quick answer.
>
> the two disk volumes i have reduced are active. the response of the curl 
> command is saying complete. i have set async to false to get 
> a response from the rest api.
> when i called lvdisplay on the image_id of the volumes i was able to see that 
> the images got shrank.
>
> SD is Fibre Channel so it is block SD
>
> the vms are running oracle databases. maybe i should freeze fs bevor calling 
> reduce command? is it possible to call fsfreeze via rest api?

Reducing an active volume is blocked in 4.4 since this is a completely unsafe
operation that is likely to corrupt the disk.

See the patch fixing this:
https://gerrit.ovirt.org/c/ovirt-engine/+/111541

It is possible that this is not blocked in older 4.3 versions. This is the
risk of running 4.3.

There is no reason to freeze the file system; reducing is an operation that
cannot affect the guest if done safely (on a read-only layer).

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IQNYE57UZWLJ4ZRVIIAFSF2RYNG2EJ5Y/


[ovirt-users] Re: Snapshot and disk size allocation

2021-10-28 Thread Nir Soffer
On Thu, Oct 28, 2021 at 7:21 PM  wrote:
>
> Hello, is there any progress in this problem? I have tried to reduce a 
> bloated volume by trigger the reduce command over rest api. I am using ovirt 
> 4.3.10, but i get no response, and the volume keeps bloated anyway. I'm 
> getting frustrated, because i am using a backupscript which creates a 
> snapshot, then creates a clone vm out of the snapshot, exports it and removes 
> snapshot and cloned vm. This is done every night, and so the volumes increase.
> it is a productiv cluster with 70 vm, and i can't just stop them and do som 
> magic stuff.

I think the only way you can optimize the disk size is to move the
disk to another
storage domain and back to the original storage domain.

When we copy a disk, we measure every volume in the chain, and create
a new volume
in the destination storage, using the optimal size for this volume.
Then we copy the
data from the source volume to the new volume using qemu-img convert.

When qemu-img convert copies an image, it detects unallocated areas or areas
which read as zeroes. In the target image, these areas will not be allocated,
or will be stored efficiently as zero clusters (8 bytes for 64k of data).

Detecting zeroes happens during the copy, not when you measure the volume, so
after the first copy the disk may still have unneeded allocation at the lvm
level. When you copy the disk back to the original storage, this extra
allocation will be eliminated.

This is not a fast operation, but you can move disks when vms are
running, so there is
no downtime.
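
If you want to script the moves instead of clicking through the UI, here is a
minimal sketch using the Python SDK (engine details, disk id and target storage
domain name are placeholders; I did not verify this against your setup):

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://myengine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    ca_file='ca.pem',
)

disks_service = connection.system_service().disks_service()
disk_service = disks_service.disk_service('disk-uuid')

# Move the disk to another storage domain; when the move completes, run the
# same call again with the original storage domain to move it back.
disk_service.move(storage_domain=types.StorageDomain(name='other-storage-domain'))

connection.close()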

You can try this with a new vm:
1. create vm with 100g thin data disk
2. in the guest, fill the disk with zeros
dd if=/dev/zero bs=1M count=102400 of=/dev/sdb oflag=direct status=progress
3. the vm disk will be extended to 100g+
4. move the disk to another storage domain
5. after the move, the disk's actual size will be 100g+
6. move the disk back to original storage
7. the disk actual size will go back to 1g

Note that enabling discard and using fstrim in the guest before the copy will
optimize the process.

1. create vm with 100g thin virtio-scsi data disk, with "enable discard"
2. in the guest, fill the disk with zeros
dd if=/dev/zero bs=1M count=102400 of=/dev/sdb oflag=direct status=progress
3. the vm disk will be extended to 100g+
4. in the guest, run "fstrim /dev/sdb"
5. the disk size will remain 100g+
6. move the disk to another storage domain
7. this move will be extremely quick, no data will be copied
8. the disk actual size on the destination storage domain will be 1g

In ovirt 4.5 we plan to support disk format conversion - this will allow this
kind of  sparsification without copying the data twice to another storage
domain.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LJHSMR7KNSMV6SGMWGLTJE6NZEVPZLOF/


[ovirt-users] Re: Installing Windows 4.4.9/Change CD

2021-10-21 Thread Nir Soffer
On Thu, Oct 21, 2021 at 7:11 PM Matt Schuler  wrote:
>
> Just wondering if it is possible to install windows on the 4.4.9?  (I am 
> running 4.4.9 and node 4.4.8, I don’t node is built yet for .9)
>
>
>
> The issue I am having is changing CDs, when I try to change it get the 
> following error: (Both ISOs are uploaded though the GUI on block storage, 
> iSCSI)
...
> ERROR FINISH changeCD error=Failed to change disk image
>
> Traceback (most recent call last):
>
> File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5005, in 
> _update_disk_device
>
> disk_xml, libvirt.VIR_DOMAIN_DEVICE_MODIFY_FORCE)
>
> File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
>
> ret = attr(*args, **kwargs)
>
> File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", 
> line 131, in wrapper
>
> ret = f(*args, **kwargs)
>
> File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in 
> wrapper
>
> return func(inst, *args, **kwargs)
>
> File "/usr/lib64/python3.6/site-packages/libvirt.py", line 3237, in 
> updateDeviceFlags
>
> raise libvirtError('virDomainUpdateDeviceFlags() failed')
>
> libvirt.libvirtError: internal error: unable to execute QEMU command 
> 'blockdev-add': 'file' driver requires 
> '/rhev/data-center/mnt/blockSD/XX/images/XX/XX' to be a regular file

This is https://bugzilla.redhat.com/1990268

It is fixed in:
$ git describe 6094e672781f593767dc313ebd53d23334511f5a
v4.40.90.1-7-g6094e6727

You need to upgrade vdsm.

The only workaround for now is to use a file based storage domain (e.g. NFS)
for the ISO images.

When the issue is fixed, you can copy the ISO disks to another storage
domain, and remove
the file based domain.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/K2AL6IYKQ4SBQ5GEQJ7HVSAJ4HZZVYGJ/


[ovirt-users] Re: install python package ovirt_imageio for Ovirt 4.3

2021-10-21 Thread Nir Soffer
On Wed, Oct 20, 2021 at 3:55 PM Nir Soffer  wrote:
>
> On Wed, Oct 20, 2021 at 2:24 PM Grace Chen  wrote:
> > I want to use python script running from a host to automate the remote kvm 
> > manager's vm backup. Our ovirt version is 4.3. I have installed 
> > ovirt-imageio-daemon on the host
> > I tried to yum install ovirt-imageio-client (we use yum), it couldn't find 
> > the package.
> > error message:
> >
> > Loaded plugins: ulninfo, vdsmupgrade, versionlock
> > ovirt-4.3   
> >  | 3.0 kB  00:00:00
> > ovirt-4.3-extra 
> >  | 3.0 kB  00:00:00
> > ovirt-master-snapshot   
> >  | 3.0 kB  00:00:00
> > ovirt-master-snapshot-static
> >  | 3.0 kB  00:00:00
> > No package ovirt-imageio-client available.
> >
> > I tried to copy folder ovirt_imageio from ovirt-imageio to my python 
> > library directory, it could not compile the c file "ioutil.c"
> > Can anybody help me install the python package? Am I installing on the 
> > correct place, do I need to use the python script only on the host that 
> > installed the kvm manager?
>
> (Copied from private mail in case it is useful for others)
>
> ovirt-imageio-client was released in 4.4, and is not compatible with python 2.
>
> In general the client supports older version of ovirt server (e.g.
> 4.3), but it is
> not tested in this configuration.
>
> You can try to create a vm with centos 8 and ovirt-release44.rpm
> and install the client in this vm.
>
> Then you can run the backups on the vm, and store the backups in
> NFS server accessed by this "backup" vm.
>
> Note that you will get much better performance with ovirt 4.4., when
> running the client directly on the ovirt host. 4.4. also supports incremental
> backup, which is much faster and simpler compared with snapshot based
> backup that you probably plan to use.\

We have another option now to install ovirt_imageio, via pip.

It may work on any Linux distro with qemu-img (>= 4.2) and python >= 3.6,
but I tested it only on Fedora 34 (qemu-img 6.1, python 3.9).

Note that the imageio client depends on qemu-img and qemu-nbd, provided by the
qemu-img package on rpm based distros. pip cannot install this package, so you
need to install it yourself using your distro package manager.

Here is an example using a virtual environment:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install --upgrade pip
$ pip install --upgrade ovirt-imageio
$ pip install --upgrade ovirt-engine-sdk-python

The imageio client is not very useful as is unless you want to write
an application
using the python library. But we have several useful example scripts
in the ovirt
engine python SDK. They are not installed by pip, but you can get them via git:

$ git clone https://github.com/oVirt/python-ovirt-engine-sdk4.git

Finally, you need to add a configuration file pointing to your engine(s):

$ cat /home/nsoffer/.config/ovirt.conf
[myengine]
engine_url = https://myengine.example.com
username = admin@internal
password = password
cafile = /home/username/certs/myengine.pem

Now you can upload a disk image using:

$ python-ovirt-engine-sdk4/examples/upload_disk.py -c myengine
--sd-name my-storage-domain --disk-sparse disk.qcow2

These scripts will be most useful and should work with 4.3:

- upload_disk.py
- download_disk.py

These require ovirt 4.4:

- list_disk_snapshots.py
- download_disk_snapshot.py
- backup_vm.py

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MATMO35FHP7QMUAGHQUVGRRGBWJLLQ5J/


[ovirt-users] Re: install python package ovirt_imageio for Ovirt 4.3

2021-10-20 Thread Nir Soffer
On Wed, Oct 20, 2021 at 2:24 PM Grace Chen  wrote:
> I want to use python script running from a host to automate the remote kvm 
> manager's vm backup. Our ovirt version is 4.3. I have installed 
> ovirt-imageio-daemon on the host
> I tried to yum install ovirt-imageio-client (we use yum), it couldn't find 
> the package.
> error message:
>
> Loaded plugins: ulninfo, vdsmupgrade, versionlock
> ovirt-4.3 
>| 3.0 kB  00:00:00
> ovirt-4.3-extra   
>| 3.0 kB  00:00:00
> ovirt-master-snapshot 
>| 3.0 kB  00:00:00
> ovirt-master-snapshot-static  
>| 3.0 kB  00:00:00
> No package ovirt-imageio-client available.
>
> I tried to copy folder ovirt_imageio from ovirt-imageio to my python library 
> directory, it could not compile the c file "ioutil.c"
> Can anybody help me install the python package? Am I installing on the 
> correct place, do I need to use the python script only on the host that 
> installed the kvm manager?

(Copied from private mail in case it is useful for others)

ovirt-imageio-client was released in 4.4, and is not compatible with python 2.

In general the client supports older versions of the ovirt server (e.g. 4.3),
but it is not tested in this configuration.

You can try to create a vm with centos 8 and ovirt-release44.rpm
and install the client in this vm.

Then you can run the backups on the vm, and store the backups on an NFS
server accessed by this "backup" vm.

Note that you will get much better performance with ovirt 4.4, when running
the client directly on the ovirt host. 4.4 also supports incremental backup,
which is much faster and simpler compared with the snapshot based backup that
you probably plan to use.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7VFZT6TKMPYJ54F7QZSK3O5LB6VJCHPE/


[ovirt-users] Re: Unable to put host in maintenance mode & unable to login to Postgres

2021-10-17 Thread Nir Soffer
On Fri, Oct 15, 2021 at 11:38 AM David White via Users 
wrote:

> Thank you very much.
> I was able to (re)set the `engine` user's password in Postgres.
> Unfortunately, I'm still having trouble unlocking the disks.
>
> The following command produces no output underneath "Locked disks" when I
> run this command on the hosted engine VM:
>
> *[root@ovirt-engine1 dwhite]# PGPASSWORD=snip
> /usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t disk -q*
> *Locked disks*
>
> However, in the oVirt UI, when I try to put the host into maintenance mode
> I continue to get the message that there are (3) locked disks (screenshot
> below).
> [image: Screenshot from 2021-10-15 04-29-15.png]
>
>
Do you have active image transfers?

You can check by getting

https://myengine/ovirt-engine/api/imagetransfers

If there are no image transfers, check the relevant disks status:

https://myengine/ovirt-engine/api/disks/{id}
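
The same checks can be scripted with the Python SDK (engine details are
placeholders; a sketch, not a full tool):

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://myengine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    ca_file='ca.pem',
)

system_service = connection.system_service()

# Any active image transfers?
for transfer in system_service.image_transfers_service().list():
    print("transfer", transfer.id, transfer.phase)

# Disks stuck in locked status.
for disk in system_service.disks_service().list():
    if disk.status == types.DiskStatus.LOCKED:
        print("locked disk", disk.id, disk.alias)

connection.close()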

If the disk status is "locked", it may be an engine bug, not cleaning up after
a failed image transfer.

If there is no task in engine using this disk, you can change the disk
status using:

# sudo -u postgres psql -d engine

Finding the locked images:

# select image_group_id,imagestatus from images where imagestatus=2;

Unlocking an image:

# update images set imagestatus=1 where image_group_id='xxx-yyy';

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3UE6CJLEGBR27KWJQH3TA5AX7LXX5URV/


[ovirt-users] Re: How to add a note to a VM

2021-10-12 Thread Nir Soffer
On Tue, Oct 12, 2021 at 4:46 PM Gianluca Cecchi
 wrote:
>
> On Tue, Oct 12, 2021 at 3:29 PM Nir Soffer  wrote:
>>
>> On Tue, Oct 12, 2021 at 3:24 PM Gianluca Cecchi
>>  wrote:
>> >
>> > Hello,
>> > I know there are the "Comment" and "Description" columns available in many 
>> > areas of the Webadmin Gui.
>> > But there are some tasks, like "shutdown VM" or "Management -> 
>> > Maintenance" for a host, where I can specify a "reason" for doing that 
>> > task and then a note icon appears, aside the object, with the mouse over 
>> > showing the note text, like in this image:
>> > https://drive.google.com/file/d/1v3Yd2t7AtuRFMT6HPFYYZqYUmJLMHYMY/view?usp=sharing
>> >
>> > Is there a way to do it in general? So for example I have a VM and I want 
>> > to put a note (for some colleague, or to remind me to do an action 
>> > tomorrow, ecc...)
>> >
>> > And btw: how can I manually remove the note? Eg I shutdown a VM and fill 
>> > in the "Reason" field and then in a second moment I want to remove it
>>
>> The "comment" field was designed exactly for this purpose.
>>
>> Maybe this is not documented?
>>
>> Nir
>>
>
> Probably yes, but it results less visible than having a tooltip with the icon 
> of a note...
> And also, sometimes the "Comment" column is not one of the first, so you have 
> to arrange / order it so that it comes to the left...

The web console can show a note icon if an entity has a comment.

Please file an engine RFE explaining the use case.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IZ23NCVSNHGMHAQFCUKQ7KIV6ZQU3I24/


[ovirt-users] Re: How to add a note to a VM

2021-10-12 Thread Nir Soffer
On Tue, Oct 12, 2021 at 3:24 PM Gianluca Cecchi
 wrote:
>
> Hello,
> I know there are the "Comment" and "Description" columns available in many 
> areas of the Webadmin Gui.
> But there are some tasks, like "shutdown VM" or "Management -> Maintenance" 
> for a host, where I can specify a "reason" for doing that task and then a 
> note icon appears, aside the object, with the mouse over showing the note 
> text, like in this image:
> https://drive.google.com/file/d/1v3Yd2t7AtuRFMT6HPFYYZqYUmJLMHYMY/view?usp=sharing
>
> Is there a way to do it in general? So for example I have a VM and I want to 
> put a note (for some colleague, or to remind me to do an action tomorrow, 
> ecc...)
>
> And btw: how can I manually remove the note? Eg I shutdown a VM and fill in 
> the "Reason" field and then in a second moment I want to remove it

The "comment" field was designed exactly for this purpose.

Maybe this is not documented?

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WSINGWYDKQZMKDITW5UYXQQZ44QMZB6E/


[ovirt-users] Re: not able to upload disks, iso - paused by the system error -- Version 4.4.6.7-1.el8

2021-10-12 Thread Nir Soffer
On Tue, Oct 12, 2021 at 8:55 AM dhanaraj.ramesh--- via Users
 wrote:
>
> Hi Team
>
> in one of the cluster infra, we are unable to upload the images or disks via 
> gui. up on checking the /var/log/ovirt-imageio/daemon.log found that throwing 
> ssl connection failure, help us to check what are we missing..

Which version?

If you are on ovirt 4.4, please share output of:

ovirt-imageio --show-config

on engine.

> We are using thirdparty CA approved SSL for web GUI..
>
> 2021-10-11 22:45:42,812 INFO(Thread-6) [http] OPEN connection=6 
> client=127.0.0.1
> 2021-10-11 22:45:42,812 INFO(Thread-6) [tickets] [127.0.0.1] REMOVE 
> ticket=f18cff91-1fc4-43b6-91ea-ca2a11d409a6
> 2021-10-11 22:45:42,813 INFO(Thread-6) [http] CLOSE connection=6 
> client=127.0.0.1 [connection 1 ops, 0.000539 s] [dispatch 1 ops, 0.000216 s]
> 2021-10-11 22:45:43,621 INFO(Thread-4) [images] [:::10.12.23.212] 
> OPTIONS ticket=53ff98f9-f429-4880-abe6-06c6c01473de
> 2021-10-11 22:45:43,621 INFO(Thread-4) [backends.http] Open backend 
> netloc='renlovkvma01.test.lab:54322' 
> path='/images/53ff98f9-f429-4880-abe6-06c6c01473de' 
> cafile='/etc/pki/ovirt-engine/ca.pem' secure=True

Looks like the host is configured correctly - the http backend is
using the right CA file
to access the host.

> 2021-10-11 22:45:43,626 ERROR   (Thread-4) [http] Server error
...
> self._sslobj.do_handshake()
> ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed 
> (_ssl.c:897)

The CA file on engine side (/etc/pki/ovirt-engine/ca.pem) does not
match the CA file on the host
(/etc/pki/vdsm/certs/cacert.pem).

Which files did you change when you added the third-party CA approved SSL?

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/45S3QIVWVPBAVQ6IWV3QHJPURLG5NPCY/


[ovirt-users] Re: Intermittent failure to upload ISOs

2021-09-30 Thread Nir Soffer
On Thu, Sep 30, 2021 at 12:23 PM  wrote:

Thanks for reporting.

> This may be the same issue as described here:
> https://lists.ovirt.org/archives/list/users@ovirt.org/thread/CJISJIDQKSINIJUA5UO6Y4BRFQYEOYLA/
> https://bugzilla.redhat.com/show_bug.cgi?id=1977276
>
> I am on 4.4.8.6-1.el8, installed a couple days ago from the ovirt node ISO. 
> In particular, I noticed if I SSH into the hosted engine and tail -f 
> /var/log/ovirt-imageio/daemon.log, in the failure case I get something like:
>
> 2021-09-30 08:15:52,330 INFO(Thread-8) [http] OPEN connection=8 
> client=:::192.168.1.53
> 2021-09-30 08:16:23,315 INFO(Thread-8) [http] CLOSE connection=8 
> client=:::192.168.1.53 [connection 1 ops, 30.984947 s] [dispatch 1 ops, 
> 0.97 s]

There is no activity since the upload never started.

> No activity in tail -f /var/log/ovirt-imageio/daemon.log on the host (I only 
> have one host) in the failure case, just the engine. In the success case, 
> there is activity in both logs.
>
> It is very intermittent. Sometimes uploads work most of the time (maybe 4 out 
> of 5), and I've had other times that uploads do not work at all (0 out of 5).
>
> I think when it's behaving particularly badly, restarting the engine 
> (hosted-engine --vm-shutdown, then hosted-engine --vm-start) helps, but I 
> haven't figured out a reliable pattern. (I am logged in as admin.) I've tried 
> several browsers, closing/reopening the browser, etc.
>
> Hoping this info will help in tracking it down.

We tracked this down, and it is fixed upstream.

The fix should be available in 4.4.9.
See https://gerrit.ovirt.org/c/ovirt-engine/+/116861

Until the fix is released, you can upload using the SDK, which is also a
better way to upload and download images anyway.

Install these packages on the host used for uploading:

dnf install ovirt-imageio-client python3-ovirt-engine-sdk4

(packages are already installed on hosts and engine)

Create ovirt configuration file if needed:

$ cat ~/.config/ovirt.conf
[my-engine]
engine_url = https://my-engine.example.com
username = admin@internal
password = mypassword
cafile = /path/to/cacert.pem

The cafile can be downloaded from:
https://my-engine.example.com/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA
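
For example, a minimal Python sketch to fetch it (stdlib only; the engine
hostname is a placeholder, and verification is disabled only because this
request is what bootstraps the CA in the first place):

# Minimal sketch: download the engine CA certificate to use as "cafile".
# Replace "my-engine.example.com" with your engine FQDN.
import ssl
import urllib.request

url = ("https://my-engine.example.com/ovirt-engine/services/pki-resource"
       "?resource=ca-certificate&format=X509-PEM-CA")

ctx = ssl.create_default_context()
ctx.check_hostname = False        # we do not have the CA yet, so we cannot
ctx.verify_mode = ssl.CERT_NONE   # verify this single bootstrap request

with urllib.request.urlopen(url, context=ctx) as resp, \
        open("cacert.pem", "wb") as out:
    out.write(resp.read())
print("wrote cacert.pem")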

Then you can upload using:

python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py \
    -c my-engine --sd-name my-storage-domain /path/to/iso

See --help for more options.

Nir


[ovirt-users] Re: Host reboots when network switch goes down

2021-09-29 Thread Nir Soffer
On Wed, Sep 29, 2021 at 2:08 PM cen  wrote:
>
> Hi,
>
> we are experiencing a weird issue with our Ovirt setup. We have two
> physical hosts (DC1 and DC2) and mounted Lenovo NAS storage for all VM data.
>
> They are connected via a managed network switch.
>
> What happens is that if switch goes down for whatever reason (firmware
> update etc), physical host reboots. Not sure if this is an action
> performed by Ovirt but I suspect it is because connection to mounted
> storage is lost and it  performs some kind of an emergency action. I
> would need to get some direction pointers to find out
>
> a) who triggers the reboot and why
>
> c) a way to prevent reboots by increasing storage? timeouts
>
> Switch reboot takes 2-3 minutes.
>
>
> These are the host /var/log/messages just before reboot occurs:
>
> Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
> [10993]: s11 check_our_lease warning 72 last_success 7690912
> Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
> [10993]: s3 check_our_lease warning 76 last_success 7690908
> Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
> [10993]: s1 check_our_lease warning 68 last_success 7690916
> Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
> [27983]: s11 delta_renew read timeout 10 sec offset 0
> /var/run/vdsm/storage/15514c65-5d45-4ba7-bcd4-cc772351c940/fce598a8-11c3-44f9-8aaf-8712c96e00ce/65413499-6970-4a4c-af04-609ef78891a2
> Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
> [27983]: s11 renewal error -202 delta_length 20 last_success 7690912
> Sep 28 16:20:00 ovirtnode02 wdmd[11102]: test warning now 7690984 ping
> 7690970 close 7690980 renewal 7690912 expire 7690992 client 10993
> sanlock_hosted-engine:2
> Sep 28 16:20:00 ovirtnode02 wdmd[11102]: test warning now 7690984 ping
> 7690970 close 7690980 renewal 7690908 expire 7690988 client 10993
> sanlock_3cb12f04-5d68-4d79-8663-f33c0655baa6:2
> Sep 28 16:20:01 ovirtnode02 systemd: Created slice User Slice of root.
> Sep 28 16:20:01 ovirtnode02 systemd: Started Session 15148 of user root.
> Sep 28 16:20:01 ovirtnode02 systemd: Removed slice User Slice of root.
> Sep 28 16:20:01 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:01 7690985
> [10993]: s11 check_our_lease warning 73 last_success 7690912
> Sep 28 16:20:01 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:01 7690985
> [10993]: s3 check_our_lease warning 77 last_success 7690908
> Sep 28 16:20:01 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:01 7690985
> [10993]: s1 check_our_lease warning 69 last_success 7690916
> Sep 28 16:20:01 ovirtnode02 wdmd[11102]: test warning now 7690985 ping
> 7690970 close 7690980 renewal 7690912 expire 7690992 client 10993
> sanlock_hosted-engine:2
> Sep 28 16:20:01 ovirtnode02 wdmd[11102]: test warning now 7690985 ping
> 7690970 close 7690980 renewal 7690908 expire 7690988 client 10993
> sanlock_3cb12f04-5d68-4d79-8663-f33c0655baa6:2
> Sep 28 16:20:02 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:02 7690986
> [10993]: s11 check_our_lease warning 74 last_success 7690912
> Sep 28 16:20:02 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:02 7690986
> [10993]: s3 check_our_lease warning 78 last_success 7690908
> Sep 28 16:20:02 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:02 7690986
> [10993]: s1 check_our_lease warning 70 last_success 7690916
> Sep 28 16:20:02 ovirtnode02 wdmd[11102]: test warning now 7690986 ping
> 7690970 close 7690980 renewal 7690916 expire 7690996 client 10993
> sanlock_15514c65-5d45-4ba7-bcd4-cc772351c940:2
> Sep 28 16:20:02 ovirtnode02 wdmd[11102]: test warning now 7690986 ping
> 7690970 close 7690980 renewal 7690912 expire 7690992 client 10993
> sanlock_hosted-engine:2
> Sep 28 16:20:02 ovirtnode02 wdmd[11102]: test warning now 7690986 ping
> 7690970 close 7690980 renewal 7690908 expire 7690988 client 10993
> sanlock_3cb12f04-5d68-4d79-8663-f33c0655baa6:2
> Sep 28 16:20:03 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:03 7690987
> [10993]: s11 check_our_lease warning 75 last_success 7690912
> Sep 28 16:20:03 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:03 7690987
> [10993]: s3 check_our_lease warning 79 last_success 7690908

Leases on lockspace s3 will expire one second after this message...

> Sep 28 16:20:03 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:03 7690987
> [10993]: s1 check_our_lease warning 71 last_success 7690916

When leases expire, sanlock terminates the lease owner (e.g. vdsm, qemu).
If the owner of the lease cannot be terminated within ~40 seconds, sanlock
must reboot the host.

So the host running the hosted engine may be rebooted because storage is
inaccessible and qemu is stuck on storage.

Other hosts may have the same issue if they run HA VMs, serve as the SPM, or
run storage tasks that use a lease.

To understand if this is the case, we need the complete sanlock.log and
vdsm.log from the hosts when the issue happens.
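
As a rough first look at the logs you already have, a sketch like this can
summarize how close each lockspace came to expiry (my own helper, not an
oVirt tool; it assumes the /var/log/messages format quoted above, and the
default io_timeout=10, where a lease expires after 80 seconds without
renewal):

# Minimal sketch: summarize sanlock "check_our_lease warning" lines from
# /var/log/messages. The warning value is seconds since the last successful
# lease renewal; with the default io_timeout=10 the lease expires at 80s.
import re
from collections import defaultdict

pattern = re.compile(
    r"\[\d+\]: (\S+) check_our_lease warning (\d+) last_success \d+")

worst = defaultdict(int)
with open("/var/log/messages") as log:
    for line in log:
        m = pattern.search(line)
        if m:
            lockspace, warning = m.group(1), int(m.group(2))
            worst[lockspace] = max(worst[lockspace], warning)

for lockspace, seconds in sorted(worst.items()):
    print("%s: worst renewal delay %ds (expiry at 80s)" % (lockspace, seconds))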

Please file an oVirt vdsm bug for this, and attach the relevant logs.

Nir

[ovirt-users] Re: Cannot activate a Storage Domain after an oVirt crash

2021-09-19 Thread Nir Soffer
On Thu, Sep 16, 2021 at 4:20 PM Roman Bednar  wrote:
>
> Make sure the VG name is correct, it won't complain if the name is wrong.
>
> Also you can check if the backups are enabled on the hosts, to be sure:
>
> # lvmconfig --typeconfig current | egrep "backup|archive"
> backup {
> backup=1
> backup_dir="/etc/lvm/backup"
> archive=1
> archive_dir="/etc/lvm/archive"
>
>
> If the backups are not available I'm afraid there's not much you can do at 
> this point.

Actually you can, since you may have good metadata in the PV. There are 2
metadata areas, and when one of them is corrupt, you can restore the metadata
from the other. The metadata areas also contain the history of the metadata,
so even if both metadata areas are corrupted, you can extract an older version
of the metadata from one of the areas.

If you build lvm from upstream source on the host, you can extract the
metadata from the PV using pvck --dump, and repair the PV using pvck --repair
with the dumped metadata.

But the most important thing is to upgrade - this is a known issue in 4.3.8.
You need to upgrade to 4.3.11, providing vdsm >= 4.30.50 and lvm2 >= 2.02.187-6.

The related bug:
https://bugzilla.redhat.com/1849595

Nir

> On Thu, Sep 16, 2021 at 2:56 PM  wrote:
>>
>> Hi Roman,
>>
>> Unfortunately, step 1 returns nothing:
>>
>> kvmr03:~# vgcfgrestore --list Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
>>No archives found in /etc/lvm/archive
>>
>> I tried several hosts and noone has a copy.
>>
>> Any other way to get a backup of the VG?
>>
>> El 2021-09-16 13:42, Roman Bednar escribió:
>> > Hi Nicolas,
>> >
>> > You can try to recover VG metadata from a backup or archive which lvm
>> > automatically creates by default.
>> >
>> > 1) To list all available backups for given VG:
>> >
>> > #vgcfgrestore --list Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
>> >
>> > Select the latest one which sounds right, something with a description
>> > along the lines of "Created *before* lvremove".
>> > You might want to select something older than the latest as lvm does a
>> > backup also *after* running some command.
>> >
>> > 2) Find UUID of your broken PV (filter might not be needed, depends on
>> > your local conf):
>> >
>> > #pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]'
>> > /dev/mapper/36001405063455cf7cd74c20bc06e9304
>> >
>> > 3) Create a new PV on a different partition or disk (/dev/sdX) using
>> > the UUID found in previous step and restorefile option:
>> >
>> > #pvcreate --uuid  --restorefile 
>> > 
>> >
>> > 4) Try to display the VG:
>> >
>> > # vgdisplay Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
>> >
>> > -Roman
>> >
>> > On Thu, Sep 16, 2021 at 1:47 PM  wrote:
>> >
>> >> I can also see...
>> >>
>> >> kvmr03:~# lvs | grep 927f423a-6689-4ddb-8fda-b3375c3bbca3
>> >> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
>> >> offset 2198927383040
>> >> Couldn't read volume group metadata from
>> >> /dev/mapper/36001405063455cf7cd74c20bc06e9304.
>> >> Metadata location on
>> >> /dev/mapper/36001405063455cf7cd74c20bc06e9304 at
>> >> 2198927383040 has invalid summary for VG.
>> >> Failed to read metadata summary from
>> >> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>> >> Failed to scan VG from
>> >> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>> >>
>> >> Seems to me like metadata from that VG has been corrupted. Is there
>> >> a
>> >> way to recover?
>> >>
>> >> El 2021-09-16 11:19, nico...@devels.es escribió:
>> >>> The most relevant log snippet I have found is the following. I
>> >> assume
>> >>> it cannot scan the Storage Domain, but I'm unsure why, as the
>> >> storage
>> >>> domain backend is up and running.
>> >>>
>> >>> 021-09-16 11:16:58,884+0100 WARN  (monitor/219fa16) [storage.LVM]
>> >>> Command ['/usr/sbin/lvm', 'vgs', '--config', 'devices {
>> >>> preferred_names=["^/dev/mapper/"]  ignore_suspended_devices=1
>> >>> write_cache_state=0  disable_after_error_count=3
>> >>>
>> >>
>> >
>> filter=["a|^/dev/mapper/36001405063455cf7cd74c20bc06e9304$|^/dev/mapper/360014056481868b09dd4d05bee5b4185$|^/dev/mapper/360014057d9d4bc57df046888b8d8b6eb$|^/dev/mapper/360014057e612d2079b649d5b539e5f6a$|^/dev/mapper/360014059b49883b502a4fa9b81add3e4$|^/dev/mapper/36001405acece27e83b547e3a873b19e2$|^/dev/mapper/36001405dc03f6be1b8c42219e8912fbd$|^/dev/mapper/36001405f3ab584afde347d3a8855baf0$|^/dev/mapper/3600c0ff00052a0fe013ec65f0100$|^/dev/mapper/3600c0ff00052a0fe033ec65f0100$|^/dev/mapper/3600c0ff00052a0fe1b40c65f0100$|^/dev/mapper/3600c0ff00052a0fe2294c75f0100$|^/dev/mapper/3600c0ff00052a0fe2394c75f0100$|^/dev/mapper/3600c0ff00052a0fe2494c75f0100$|^/dev/mapper/3600c0ff00052a0fe2594c75f0100$|^/dev/mapper/3600c0ff00052a0fe2694c75f0100$|^/dev/mapper/3600c0ff00052a0fee293c75f0100$|^/dev/mapper/3600c0ff00052a0fee493c75f0100$|^/dev/mapper/3600c0ff00064835b628d30610100$|^/dev/mapper/3600c0ff00064835b628d30610300$|^/dev/mapper/3600c0ff000648
>> >>>
>> >>
>> >
>> 

[ovirt-users] Re: Failed to activate Storage Domain --- ovirt 4.2

2021-09-19 Thread Nir Soffer
On Sat, Sep 18, 2021 at 9:26 AM  wrote:
>
> Hi all.
> I'm using Ovrit 4.3.10 two nodes cluster and facing the same error (second 
> metadata area corruption).
> Does anybody know if there is a solution for that?
>
> Our software include:
> lvm2-2.02.186-7.el7_8.2.x86_64
> Virt Node 4.3.10
> kernel  3.10.0-1127.8.2.el7.x86_64
> device-mapper-multipath-libs-0.4.9-131.el7.x86_64
> libvirt-4.5.0-33.el7_8.1.x86_64

This is a known issue in vdsm < 4.30.50 and lvm2 < 2.02.187-6:
https://bugzilla.redhat.com/1849595

The only way to avoid this is to upgrade to oVirt >= 4.3.11.

Nir


[ovirt-users] Re: Getting issue in finalize section after disk successfully uploaded

2021-09-19 Thread Nir Soffer
On Sat, Sep 4, 2021 at 10:28 PM avishek.sarkar--- via Users
 wrote:
>
> I have used upload-disk.py to upload qcow2 image to data center...however 
> getting below errror
>
> ./upload-disk.py --engine-url 'https://xyz.net' --username='x...@abc.net' 
> --password-file=password  --sd-name='SEC-ABC' 
> --cafile='/home/jenkins/crt/ca.crt' --disk-format="qcow2" --disk-sparse 
> image-XYZ

I see that you use the user x...@abc.net - does it work when using
admin@internal?

...
> Uploading image...
> [ 100.00% ] 10.00 GiB, 11.34 seconds, 903.09 MiB/s
> Finalizing transfer session...
...
> ovirtsdk4.AuthError: Fault reason is "Operation Failed". Fault detail is 
> "[User is not authorized to perform this action.]". HTTP response code is 403.

This is a bug - the system allows you to start an upload, which creates a disk,
but does not allow you to finalize the upload.

If you can reproduce this with 4.4, please file an ovirt-engine bug, and we
will try to fix this in a future 4.4 release.

For now, you can try to modify the capabilities of the user x...@abc.net -
maybe uploading images requires some permissions which are not set for this
user.

Avihai, do we test image transfers with a special user (not admin@internal)?

Nir


[ovirt-users] Re: virtio-win driver licensing

2021-09-10 Thread Nir Soffer
On Fri, Sep 10, 2021 at 7:15 PM Nir Soffer  wrote:
>
> On Tue, Sep 7, 2021 at 2:14 AM  wrote:
> >
> > It's the one distributed with the engine: 
> > /usr/share/virtio-win/virtio-win-1.9.16.iso
>
> I tried to install this package on RHEL 8.5 (nightly) build:

Correction, tested on Centos 8 Stream host with recent ovirt-release-master:

# rpm -q ovirt-release-master
ovirt-release-master-4.4.8-0.0.master.20210904011127.git42ad2d6.el8.noarch


[ovirt-users] Re: virtio-win driver licensing

2021-09-10 Thread Nir Soffer
On Tue, Sep 7, 2021 at 2:14 AM  wrote:
>
> It's the one distributed with the engine: 
> /usr/share/virtio-win/virtio-win-1.9.16.iso

I tried to install this package on RHEL 8.5 (nightly) build:

# dnf info virtio-win
Last metadata expiration check: 17:32:51 ago on Fri 10 Sep 2021 01:25:44 IDT.
Installed Packages
Name : virtio-win
Version  : 1.9.18
Release  : 2.el8
Architecture : noarch
Size : 780 M
Source   : virtio-win-1.9.18-2.el8.src.rpm
Repository   : @System
From repo: appstream
Summary  : VirtIO para-virtualized drivers for Windows(R)
URL  : http://www.redhat.com/

This looks like a bug; this URL is not the URL of the virtio-win project.

License  : Red Hat Proprietary and BSD-3-Clause and Apache and GPLv2

I'm not sure what this means; a license cannot be Proprietary and BSD and GPL
at the same time - it can be one *or* the other. I think it is a RHEL bug to
report a license in this way.

Description  : VirtIO para-virtualized Windows(R) drivers for 32-bit and 64-bit
 : Windows(R) guests.

However, after mounting this ISO I see only the BSD-3-Clause license:

# mount -o loop /usr/share/virtio-win/virtio-win.iso /tmp/virtio-win/
# cat /tmp/virtio-win/virtio-win_license.txt
Copyright 2009-2021 Red Hat, Inc. and/or its affiliates.
Copyright 2016 Google, Inc.
Copyright 2016 Virtuozzo, Inc.
Copyright 2007 IBM Corporation

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

This also matches the project license mentioned by Scot:
https://github.com/virtio-win/kvm-guest-drivers-windows/blob/master/LICENSE

Yash, can you help to make this more clear?

Nir


[ovirt-users] Re: Poor gluster performances over 10Gbps network

2021-09-09 Thread Nir Soffer
On Thu, Sep 9, 2021 at 4:12 PM Mathieu Valois  wrote:

> You can find attached the benchmarks on the host and guest. I find the
> differences not so big though...
>

Is the host using the gluster mount
(/rhev/data-center/mnt/glusterSD/server:_path/...)
or writing directly into the same filesystem used by gluster
(/bricks/brick1/...)?

It will help if you share the output of lsblk and the command line used to
run fio on the host.

Comparing host and guest:

seq-write: (groupid=0, jobs=4): err= 0: pid=294433: Thu Sep  9 14:30:14 2021
  write: IOPS=151, BW=153MiB/s (160MB/s)(4628MiB/30280msec); 0 zone resets

I guess the underlying storage is a hard disk - 150 MiB/s is not bad, but
very low compared with a fast SSD.

seq-read: (groupid=1, jobs=4): err= 0: pid=294778: Thu Sep  9 14:30:14 2021
  read: IOPS=7084, BW=7086MiB/s (7430MB/s)(208GiB/30016msec)

You have crazy caching (ignoring the direct I/O?), 7GiB/s read?

rand-write-qd32: (groupid=2, jobs=4): err= 0: pid=295141: Thu Sep  9
14:30:14 2021
  write: IOPS=228, BW=928KiB/s (951kB/s)(28.1MiB/30971msec); 0 zone resets

Very low, probably limited by the hard disks?

rand-read-qd32: (groupid=3, jobs=4): err= 0: pid=296094: Thu Sep  9
14:30:14 2021
  read: IOPS=552k, BW=2157MiB/s (2262MB/s)(63.2GiB/30001msec)

Very high, this is what you get from fast consumer SSD.

rand-write-qd1: (groupid=4, jobs=1): err= 0: pid=296386: Thu Sep  9
14:30:14 2021
  write: IOPS=55, BW=223KiB/s (229kB/s)(6696KiB/30002msec); 0 zone resets

Very low.

rand-read-qd1: (groupid=5, jobs=1): err= 0: pid=296633: Thu Sep  9 14:30:14
2021
  read: IOPS=39.4k, BW=154MiB/s (161MB/s)(4617MiB/30001msec)

Same caching.

If we compare host and guest:

$ grep -B1 IOPS= *.out
guest.out-seq-write: (groupid=0, jobs=4): err= 0: pid=46235: Thu Sep  9
14:18:05 2021
guest.out:  write: IOPS=57, BW=58.8MiB/s (61.6MB/s)(1792MiB/30492msec); 0
zone resets

~33% of host throughput

guest.out-rand-write-qd32: (groupid=2, jobs=4): err= 0: pid=46330: Thu Sep
 9 14:18:05 2021
guest.out:  write: IOPS=299, BW=1215KiB/s (1244kB/s)(35.8MiB/30212msec); 0
zone resets

Better than host

guest.out-rand-write-qd1: (groupid=4, jobs=1): err= 0: pid=46552: Thu Sep
 9 14:18:05 2021
guest.out:  write: IOPS=213, BW=854KiB/s (875kB/s)(25.0MiB/30003msec); 0
zone resets

Better than host

So you have very fast reads (seq/random), with very slow seq/random write.

It would also be interesting to test fsync - this benchmark does not do any
fsync, but your slow yum/rpm upgrade likely does one or more fsyncs per
package upgrade.

There is an example sync test script here:
https://www.ibm.com/cloud/blog/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd
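
If fio is not at hand, a very rough stdlib sketch along the same lines (my own
illustration, not the script from that link; pass a directory on the storage
you want to test):

# Minimal sketch: measure fsync latency in a given directory, e.g. inside the
# VM or under the gluster mount on the host. This is only a rough
# approximation of what fio's sync tests do.
import os
import sys
import time

directory = sys.argv[1] if len(sys.argv) > 1 else "."
path = os.path.join(directory, "fsync-test.tmp")
block = b"\0" * 4096
samples = []

fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
try:
    for _ in range(100):
        os.write(fd, block)
        start = time.monotonic()
        os.fsync(fd)          # the call package managers wait on
        samples.append(time.monotonic() - start)
finally:
    os.close(fd)
    os.unlink(path)

samples.sort()
print("fsync latency: median %.1f ms, max %.1f ms"
      % (samples[50] * 1000, samples[-1] * 1000))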

Le 09/09/2021 à 13:40, Nir Soffer a écrit :
>
> There are few issues with this test:
> - you don't use oflag=direct or conv=fsync, so this may test copying data
>to the host page cache, instead of writing data to storage
> - This tests only sequential write, which is the best case for any kind of
> storage
> - Using synchronous I/O - every write wait for the previous write
> completion
> - Using single process
> - 2g is too small, may test your cache performance
>
> Try to test using fio - attached fio script that tests sequential and
> random io with
> various queue depth.
>
> You can use it like this:
>
> fio --filename=/path/to/fio.data --output=test.out bench.fio
>
> Test both on the host, and in the VM. This will give you more detailed
> results that may help to evaluate the issue, and it may help Gluster
> folks to advise on tuning your storage.
>
> Nir
>


[ovirt-users] Re: how to remove a failed backup operation

2021-09-09 Thread Nir Soffer
On Thu, Sep 9, 2021 at 12:53 PM Nir Soffer  wrote:
...
>> Any insight for finding the scratch disks ids in engine.log?
>> See here my engine.log and timestamp of backup (as seen in database above) 
>> is 15:31 on 03 September:
>>
>> https://drive.google.com/file/d/1Ao1CIA2wlFCqMMKeXbxKXrWZXUrnJN2h/view?usp=sharing
>
>
> To find the scratch disks the best way is to use the UI - open the storage > 
> disks tab
> and change the content type to "Backup scratch disks"
> (see attached screenshot)

Regardless, it is useful to understand the engine log; here are the relevant
events in your log:

$ grep 68f83141-9d03-4cb0-84d4-e71fdd8753bb engine.log
...

1. Backup started

2021-09-03 15:31:11,551+02 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-50) [b302cff2-fb05-4f10-9d02-aa03b10b10e1] EVENT_ID:
VM_BACKUP_STARTED(10,790), Backup 68f83141-9d03-4cb0-84d4-e71fdd8753bb
for VM c8server started (User: tekka@mydomain@mydomain).

2. Creating scratch disk for disk c8_data_c8server1

2021-09-03 15:31:12,550+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-39)
[b302cff2-fb05-4f10-9d02-aa03b10b10e1] START, CreateVolumeVDSCommand(
CreateVolumeVDSCommandParameters:{storagePoolId='ef17cad6-7724-4cd8-96e3-9af6e529db51',
ignoreFailoverLimit='false',
storageDomainId='1de66cb8-c9f6-49cd-8a35-d5da8adad570',
imageGroupId='a6ce101a-f7ce-4944-93a5-e71f32dd6c12',
imageSizeInBytes='21474836480', volumeFormat='COW',
newImageId='33aa1bac-4152-492d-8a4a-b6d6c0337fec', imageType='Sparse',
newImageDescription='{"DiskAlias":"VM c8server backup
68f83141-9d03-4cb0-84d4-e71fdd8753bb scratch disk for
c8_data_c8server1","DiskDescription":"Backup
68f83141-9d03-4cb0-84d4-e71fdd8753bb scratch disk"}',
imageInitialSizeInBytes='1073741824',
imageId='----',
sourceImageGroupId='----',
shouldAddBitmaps='false'}), log id: 164ff0c7

3. Creating scratch disk for disk c8_bootdisk_c8server1

2021-09-03 15:31:12,880+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-39)
[b302cff2-fb05-4f10-9d02-aa03b10b10e1] START, CreateVolumeVDSCommand(
CreateVolumeVDSCommandParameters:{storagePoolId='ef17cad6-7724-4cd8-96e3-9af6e529db51',
ignoreFailoverLimit='false',
storageDomainId='1de66cb8-c9f6-49cd-8a35-d5da8adad570',
imageGroupId='c9521211-8e24-46ae-aa2e-6f76503527dc',
imageSizeInBytes='21474836480', volumeFormat='COW',
newImageId='48244767-c8dc-4821-be21-935207068e69', imageType='Sparse',
newImageDescription='{"DiskAlias":"VM c8server backup
68f83141-9d03-4cb0-84d4-e71fdd8753bb scratch disk for
c8_bootdisk_c8server1","DiskDescription":"Backup
68f83141-9d03-4cb0-84d4-e71fdd8753bb scratch disk"}',
imageInitialSizeInBytes='1073741824',
imageId='----',
sourceImageGroupId='----',
shouldAddBitmaps='false'}), log id: 367fe98d

We can grep for the scratch disk UUIDs:
- a6ce101a-f7ce-4944-93a5-e71f32dd6c12
- c9521211-8e24-46ae-aa2e-6f76503527dc

But let's first understand what happens to this backup...

4. Backup was started

2021-09-03 15:31:29,883+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.StartVmBackupVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-32)
[b302cff2-fb05-4f10-9d02-aa03b10b10e1] START,
StartVmBackupVDSCommand(HostName = ov200,
VmBackupVDSParameters:{hostId='cc241ec7-64fc-4c93-8cec-9e0e7005a49d',
backupId='68f83141-9d03-4cb0-84d4-e71fdd8753bb',
requireConsistency='false'}), log id: 154dbdc5
{96b0e701-7595-4f04-8569-fb1c72e6f8e0=nbd:unix:/run/vdsm/backup/68f83141-9d03-4cb0-84d4-e71fdd8753bb:exportname=sdb,
33b0f6fb-a855-465d-a628-5fce9b64496a=nbd:unix:/run/vdsm/backup/68f83141-9d03-4cb0-84d4-e71fdd8753bb:exportname=sda}
  checkpoint for backup
68f83141-9d03-4cb0-84d4-e71fdd8753bb

The next step is creating an image transfer for downloading the disks.
Based on your mail:

[ 157.8 ] Image transfer 'ccc386d3-9f9d-4727-832a-56d355d60a95' is ready

We can follow the image transfer UUID ccc386d3-9f9d-4727-832a-56d355d60a95:

5. Creating image transfer

2021-09-03 15:33:46,892+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand]
(default task-48) [a79ec359-6da7-4b21-a018-1a9360a2f7d8] Creating
ImageTransfer entity for command
'ccc386d3-9f9d-4727-832a-56d355d60a95', proxyEnabled: true

6. Image transfer is ready

2021-09-03 15:33:46,922+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.ImageTransferUpdater]
(default task-48) [a79ec359-6da7-4b21-a018-1a9360a2f7d8] Updating
image transfer ccc386d3-9f9d-4727-832a-56d355d60a95 (image
33b0f6fb-a855-465d-a628-5fce9b64496a) phase to Transferring

The next ste

[ovirt-users] Re: Poor gluster performances over 10Gbps network

2021-09-09 Thread Nir Soffer
On Wed, Sep 8, 2021 at 12:15 PM Mathieu Valois  wrote:

> Sorry for double post but I don't know if this mail has been received.
>
> Hello everyone,
>
> I know this issue was already treated on this mailing list. However none
> of the proposed solutions is satisfying me.
>
> Here is my situation : I've got 3 hyperconverged gluster ovirt nodes, with
> 6 network interfaces, bounded in bunches of 2 (management, VMs and
> gluster). The gluster network is on a dedicated bound where the 2
> interfaces are directly connected to the 2 other ovirt nodes. Gluster is
> apparently using it :
>
> # gluster volume status vmstore
> Status of volume: vmstore
> Gluster process TCP Port  RDMA Port  Online
> Pid
>
> --
> Brick gluster-ov1:/gluster_bricks
> /vmstore/vmstore49152 0  Y
> 3019
> Brick gluster-ov2:/gluster_bricks
> /vmstore/vmstore49152 0  Y
> 3009
> Brick gluster-ov3:/gluster_bricks
> /vmstore/vmstore
>
> where 'gluster-ov{1,2,3}' are domain names referencing nodes in the
> gluster network. This networks has 10Gbps capabilities :
>
> # iperf3 -c gluster-ov3
> Connecting to host gluster-ov3, port 5201
> [  5] local 10.20.0.50 port 46220 connected to 10.20.0.51 port 5201
> [ ID] Interval   Transfer Bitrate Retr  Cwnd
> [  5]   0.00-1.00   sec  1.16 GBytes  9.92 Gbits/sec   17900
> KBytes
> [  5]   1.00-2.00   sec  1.15 GBytes  9.90 Gbits/sec0900
> KBytes
> [  5]   2.00-3.00   sec  1.15 GBytes  9.90 Gbits/sec4996
> KBytes
> [  5]   3.00-4.00   sec  1.15 GBytes  9.90 Gbits/sec1996
> KBytes
> [  5]   4.00-5.00   sec  1.15 GBytes  9.89 Gbits/sec0996
> KBytes
> [  5]   5.00-6.00   sec  1.15 GBytes  9.90 Gbits/sec0996
> KBytes
> [  5]   6.00-7.00   sec  1.15 GBytes  9.90 Gbits/sec0996
> KBytes
> [  5]   7.00-8.00   sec  1.15 GBytes  9.91 Gbits/sec0996
> KBytes
> [  5]   8.00-9.00   sec  1.15 GBytes  9.90 Gbits/sec0996
> KBytes
> [  5]   9.00-10.00  sec  1.15 GBytes  9.90 Gbits/sec0996
> KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval   Transfer Bitrate Retr
> [  5]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec   22
> sender
> [  5]   0.00-10.04  sec  11.5 GBytes  9.86 Gbits/sec
> receiver
>
> iperf Done.
>
>
Network seems fine.


> However, VMs stored on the vmstore gluster volume has poor write
> performances, oscillating between 100KBps and 30MBps. I almost always
> observe a write spike (180Mbps) at the beginning until around 500MB
> written, then it drastically falls at 10MBps, sometimes even less
> (100KBps). Hypervisors have 32 threads (2 sockets, 8 cores per socket, 2
> threads per core).
>
> Here is the volume settings :
>
> Volume Name: vmstore
> Type: Replicate
> Volume ID: XXX
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
>
>
This looks like a replica 3 volume. In this case the VM writes everything
3 times - once per replica. The writes are done in parallel, but the data
is sent over the wire 2-3 times (e.g. 2 if one of the bricks is on the
local host).

You may get better performance with replica 2 + arbiter:
https://gluster.readthedocs.io/en/latest/Administrator-Guide/arbiter-volumes-and-quorum/#why-arbiter


In this case data is written only to 2 bricks, and the arbiter brick holds
only metadata.


> Transport-type: tcp
> Bricks:
> Brick1: gluster-ov1:/gluster_bricks/vmstore/vmstore
> Brick2: gluster-ov2:/gluster_bricks/vmstore/vmstore
> Brick3: gluster-ov3:/gluster_bricks/vmstore/vmstore
> Options Reconfigured:
> performance.io-thread-count: 32 # was 16 by default.
> cluster.granular-entry-heal: enable
> storage.owner-gid: 36
> storage.owner-uid: 36
> cluster.lookup-optimize: off
> server.keepalive-count: 5
> server.keepalive-interval: 2
> server.keepalive-time: 10
> server.tcp-user-timeout: 20
> network.ping-timeout: 30
> server.event-threads: 4
> client.event-threads: 8 # was 4 by default
> cluster.choose-local: off
> features.shard: on
> cluster.shd-wait-qlength: 1
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> performance.strict-o-direct: on
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> auth.allow: *
> user.cifs: off
> storage.fips-mode-rchecksum: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: on
>
> When I naively write directly on the logical volume, which is mounted on a
> material RAID5 3-disks array, I have interesting performances:
>
> # dd if=/dev/zero of=a bs=4M count=2048
> 2048+0 records in
> 2048+0 records out
> 8589934592 bytes (8.6 GB, 8.0 GiB) 

[ovirt-users] Re: data storage domain iso upload problem

2021-09-09 Thread Nir Soffer
On Mon, Sep 6, 2021 at 11:03 AM Nyika Csaba  wrote:
>
>  Hi Mark,
>
> This list is correct, but i have destroyed the master storage domain (i have 
> 6-7 Sd), and after this i tried to upload ISO-s (old isos, and new isos) they 
> were unbootable.
> ISO-s: Rocky8, ubuntu18, win10, debian10 etc.
>
> But i tried to use the upload_disk.py (thanksNir Soffer) and this script 
> works fine.

We already have a bug for this issue; it is not related to the "Test
connection" bug:
https://bugzilla.redhat.com/1977276

This should be fixed soon upstream, but I'm not sure when the fix will
be released.

Nir


[ovirt-users] Re: how to remove a failed backup operation

2021-09-09 Thread Nir Soffer
On Wed, Sep 8, 2021 at 11:52 AM Gianluca Cecchi 
wrote:
...

> Right now what I see in the table is:
>
> engine=# \x
> Expanded display is on.
>

Nice! I didn't know about that.


> engine=# select * from vm_backups;
> -[ RECORD 1 ]--+-
> backup_id  | 68f83141-9d03-4cb0-84d4-e71fdd8753bb
> from_checkpoint_id |
> to_checkpoint_id   | d31e35b6-bd16-46d2-a053-eabb26d283f5
> vm_id  | dc386237-1e98-40c8-9d3d-45658163d1e2
> phase  | Finalizing
>

In the current code, this means that the VM.stop_backup call was successful
when you asked to finalize the backup.


> _create_date   | 2021-09-03 15:31:11.447+02
> host_id| cc241ec7-64fc-4c93-8cec-9e0e7005a49d
>
> engine=#
>
> see below my doubts...
>
...

> The VM is still running.
> The host (I see it in its events with relation to backup errors) is ov200.
> BTW: how can I see the mapping between host id and hostname (from the db
> and/or api)?
>
> [root@ov200 ~]# vdsm-client VM stop_backup
> vmID=dc386237-1e98-40c8-9d3d-45658163d1e2
> backup_id=68f83141-9d03-4cb0-84d4-e71fdd8753bb
> {
> "code": 0,
> "message": "Done"
> }
> [root@ov200 ~]#
>
>
>>> If this succeeds, the backup is not running on vdsm side.
>>>
>>
> I preseum from the output above that the command succeeded, correct?
>

Yes, this is what a successful command looks like. If the command fails, you
will get a non-zero code and the message will explain the failure.

If this fails, you may need stop the VM to end the backup.
>>>
>>> If the backup was stopped, you may need to delete the scratch disks
>>> used in this backup.
>>> You can find the scratch disks ids in engine logs, and delete them
>>> from engine UI.
>>>
>>
> Any insight for finding the scratch disks ids in engine.log?
> See here my engine.log and timestamp of backup (as seen in database above)
> is 15:31 on 03 September:
>
>
> https://drive.google.com/file/d/1Ao1CIA2wlFCqMMKeXbxKXrWZXUrnJN2h/view?usp=sharing
>

To find the scratch disks, the best way is to use the UI - open the Storage >
Disks tab and change the content type to "Backup scratch disks"
(see the attached screenshot).

The description and comment of the scratch disk should be enough to
detect stale scratch disks that failed to be removed after a backup.

You should be able to delete the disks from the UI/API.
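
If you prefer the API, here is a hedged sketch with the Python SDK (the
connection details are placeholders, and it assumes your
python3-ovirt-engine-sdk4 version exposes DiskContentType.BACKUP_SCRATCH,
which 4.4 should):

# Minimal sketch: list backup scratch disks via the SDK; remove() is left
# commented out so nothing is deleted by accident.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url="https://myengine/ovirt-engine/api",
    username="admin@internal",
    password="mypassword",
    ca_file="/path/to/cacert.pem",
)
try:
    disks_service = connection.system_service().disks_service()
    for disk in disks_service.list():
        if disk.content_type == types.DiskContentType.BACKUP_SCRATCH:
            print(disk.id, disk.alias, disk.description)
            # To delete a stale scratch disk:
            # disks_service.disk_service(disk.id).remove()
finally:
    connection.close()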


>>> Finally, after you cleaned up vdsm side, you can delete the backup
>>> from engine database,
>>> and unlock the disks.
>>>
>>> Pavel, can you provide instructions on how to clean up engine db after
>>> stuck backup?
>>>
>>
>> Can you please try manually updating the 'phase" of the problematic
>> backup entry in the "vm_backups" DB table to 1 of the final phases,
>> which are either "Succeeded" or "Failed"?
>> This should allow creating a new backup.
>> [image: image.png]
>>
>>
>>>
>>> After vdsm and engine were cleaned, new backup should work normally.
>>>
>>
> OK, so I wait for Nir input about scratch disks removal and then I go with
> changing the phase column for the backup.
>

Once you stop the backup on the vdsm side, you can fix the backup phase in
the database. You don't need to delete the scratch disks before that; they
can be deleted later.

A backup stuck in the finalizing phase blocks future backups of the VM.
Scratch disks only take logical space in your storage domain, and some
physical space in your storage.

Nir


[ovirt-users] Re: data storage domain iso upload problem

2021-09-05 Thread Nir Soffer
On Sun, Sep 5, 2021 at 11:27 PM Nir Soffer  wrote:
>
> On Wed, Aug 25, 2021 at 12:57 PM  wrote:
> >
> > Hi,
> >
> > I managed a ovirt 4.4.7 for production systems.
> > Last week i removed the Master storage domain (moved tamplates, vm-s well, 
> > unattached, etc), but i forgot to move isos.
> > Now, when i upload a new iso to a data storage domain, the system show it, 
> > but it's unbootable:
> > "could not read from cdrom code 0005"
>
> This sounds like booting from an empty image.

Reproduced locally: the upload fails in the engine, but the UI does not show
the failure, and an empty disk is left.

Nyika, see the attached screenshot. Is this what you get?

> I think we have a bug
> when uploading
> from the UI "works" very quickly without uploading anything.
>
> If I remember correctly, clicking "Test connection" in the upload dialog
> helps to workaround this issue.
>
> You can always upload using SDK upload_disk.py example:
>
> $ cat /home/nsoffer/.config/ovirt.conf
> [myengine]
> engine_url = https://myengine.com
> username = admin@internal
> password =  mypassword
> cafile = /home/me/certs/myengine.pem
>
> $ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py
> -c myengine --sd-name mystoragedomain foo.iso
>
> This way is not affected by the test connection bug and it is also faster,
> more powerful, and can be automated.
>
> Nir


[ovirt-users] Re: data storage domain iso upload problem

2021-09-05 Thread Nir Soffer
On Wed, Aug 25, 2021 at 12:57 PM  wrote:
>
> Hi,
>
> I managed a ovirt 4.4.7 for production systems.
> Last week i removed the Master storage domain (moved tamplates, vm-s well, 
> unattached, etc), but i forgot to move isos.
> Now, when i upload a new iso to a data storage domain, the system show it, 
> but it's unbootable:
> "could not read from cdrom code 0005"

This sounds like booting from an empty image. I think we have a bug where
uploading from the UI "works" very quickly without uploading anything.

If I remember correctly, clicking "Test connection" in the upload dialog
helps to work around this issue.

You can always upload using the SDK upload_disk.py example:

$ cat /home/nsoffer/.config/ovirt.conf
[myengine]
engine_url = https://myengine.com
username = admin@internal
password =  mypassword
cafile = /home/me/certs/myengine.pem

$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py \
    -c myengine --sd-name mystoragedomain foo.iso

This way is not affected by the test connection bug and it is also faster,
more powerful, and can be automated.

Nir


[ovirt-users] Re: how to remove a failed backup operation

2021-09-05 Thread Nir Soffer
On Sat, Sep 4, 2021 at 1:08 AM Gianluca Cecchi
 wrote:
...
>>> ovirt_imageio._internal.nbd.ReplyError: Writing to file failed: [Error 28] 
>>> No space left on device
>> This error is expected if you don't have space to write the data.
> ok.

I forgot to mention that running the backup on the engine host is not
recommended. It is better to run the backup on the hypervisor, speeding up
the data copy.

You can mount the backup directory on the hypervisor (e.g. over NFS) and use
--backup-dir to store the backup where it should be.

>>> Now if I try the same backup command (so with "full" option) and I get
>>>
>>> ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is 
>>> "[Cannot backup VM. The VM is during a backup operation.]". HTTP response 
>>> code is 409.
>> This looks like a bug in the backup script - the backup should be finalized
>> even if the image transfer failed, but the error you get say the vm is still
>> in backup mode.
>>
>>> How can I clean the situation?
>>
>> 1. Stop the current backup
>>
>> If you still have the output from the command, we log the backup UUID.
>>
>> If you lost the backup id, you can get it using the API - visit this address 
>> in your browser:
>>
>> https://myengine/ovirt-engine/api/vms/{vm-id}/backups/
>>
>> Then stop the current backup using:
>>
>> /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py stop 
>> vm-id backup-id
>>
>> If stopping the backup failed, stopping the VM will stop the backup.
>> I hope you are running recent enough version, since in early versions there
>> was a bug when you cannot stop the vm during a backup.
>
> It is the latest 4.4.7. I run the backup_vm.py script from the engine:
>
> ovirt-engine-4.4.7.7-1.el8.noarch
> ovirt-engine-setup-plugin-imageio-4.4.7.7-1.el8.noarch
> ovirt-imageio-common-2.2.0-1.el8.x86_64
> ovirt-imageio-client-2.2.0-1.el8.x86_64
> ovirt-imageio-daemon-2.2.0-1.el8.x86_64
> python3-ovirt-engine-sdk4-4.4.13-1.el8.x86_64

Looks good.

> But if I try the stop command I get the error
>
> [g.cecchi@ovmgr1 ~]$ python3 
> /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py -c ovmgr1 stop 
> dc386237-1e98-40c8-9d3d-45658163d1e2 68f83141-9d03-4cb0-84d4-e71fdd8753bb
> [   0.0 ] Finalizing backup '68f83141-9d03-4cb0-84d4-e71fdd8753bb'
> Traceback (most recent call last):
...
> ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[Cannot 
> stop VM backup. The VM backup is not in READY phase, backup phase is 
> FINALIZING. Please try again when the backup is in READY phase.]". HTTP 
> response code is 409.

So your backup was already finalized, and it is stuck in the "finalizing"
phase.

Usually this means the backup on the libvirt side was already stopped, but
the engine failed to detect this and failed to complete the finalize step
(an ovirt-engine bug).

You need to check whether the backup was stopped on the vdsm side.

- If the VM was stopped, the backup is not running
- If the VM is running, we can make sure the backup is stopped using:

vdsm-client VM stop_backup \
    vmID=dc386237-1e98-40c8-9d3d-45658163d1e2 \
    backup_id=68f83141-9d03-4cb0-84d4-e71fdd8753bb

If this succeeds, the backup is not running on the vdsm side.
If this fails, you may need to stop the VM to end the backup.

If the backup was stopped, you may need to delete the scratch disks used in
this backup. You can find the scratch disk ids in the engine logs, and delete
them from the engine UI.

Finally, after you have cleaned up the vdsm side, you can delete the backup
from the engine database and unlock the disks.

Pavel, can you provide instructions on how to clean up the engine db after a
stuck backup?

After vdsm and the engine are cleaned up, a new backup should work normally.

>> 2. File a bug about this
> Filed this one, hope its is correct; I chose ovirt-imageio as the product and 
> Client as the component:

In general, backup bugs should be filed for ovirt-engine; ovirt-imageio is
rarely the cause of a bug. We will move the bug to ovirt-imageio if needed.

> https://bugzilla.redhat.com/show_bug.cgi?id=2001136

Thanks!

Nir


[ovirt-users] Re: how to remove a failed backup operation

2021-09-03 Thread Nir Soffer
On Fri, Sep 3, 2021 at 4:45 PM Gianluca Cecchi 
wrote:

> Hello,
> I was trying incremental backup with the provided
> /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py and began
> using the "full" option.
> But I specified an incorrect dir and during backup I got error due to
> filesystem full
>
> [ 156.7 ] Creating image transfer for disk
> '33b0f6fb-a855-465d-a628-5fce9b64496a'
> [ 157.8 ] Image transfer 'ccc386d3-9f9d-4727-832a-56d355d60a95' is ready
> --- Logging error ---, 105.02 seconds, 147.48 MiB/s
>
> Traceback (most recent call last):
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/io.py",
> line 242, in _run
> handler.copy(req)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/io.py",
> line 286, in copy
> self._src.write_to(self._dst, req.length, self._buf)
>   File
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py",
> line 216, in write_to
> writer.write(view[:n])
>   File
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/nbd.py",
> line 118, in write
> self._client.write(self._position, buf)
>   File
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/nbd.py", line
> 445, in write
> self._recv_reply(cmd)
>   File
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/nbd.py", line
> 980, in _recv_reply
> if self._recv_reply_chunk(cmd):
>   File
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/nbd.py", line
> 1031, in _recv_reply_chunk
> self._handle_error_chunk(length, flags)
>   File
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/nbd.py", line
> 1144, in _handle_error_chunk
> raise ReplyError(code, message)
> ovirt_imageio._internal.nbd.ReplyError: Writing to file failed: [Error 28]
> No space left on device
>

This error is expected if you don't have space to write the data.
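
A tiny pre-flight check before starting a full backup can avoid hitting this
in the middle of a transfer; a minimal sketch (the path and required size are
placeholders you need to adapt):

# Minimal sketch: verify the backup directory has enough free space before
# starting a full backup. A full backup needs roughly the used size of each
# disk, so err on the generous side.
import shutil

backup_dir = "/path/to/backup-dir"    # placeholder
required_bytes = 2 * 20 * 1024**3     # e.g. two 20 GiB disks, worst case

free = shutil.disk_usage(backup_dir).free
print("free: %.1f GiB, required: %.1f GiB"
      % (free / 1024**3, required_bytes / 1024**3))
if free < required_bytes:
    raise SystemExit("not enough space in %s for a full backup" % backup_dir)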


>
> Now if I try the same backup command (so with "full" option) and I get
>
> ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is
> "[Cannot backup VM. The VM is during a backup operation.]". HTTP response
> code is 409.
>

This looks like a bug in the backup script - the backup should be finalized
even if the image transfer failed, but the error you get says the VM is still
in backup mode.


>
> How can I clean the situation?
>

1. Stop the current backup

If you still have the output from the command, we log the backup UUID.

If you lost the backup id, you can get it using the API - visit this
address in your browser:

https://myengine/ovirt-engine/api/vms/{vm-id}/backups/
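
The same information is available with the Python SDK if you prefer scripting
it - a minimal sketch (the connection details and the VM id are placeholders;
it assumes python3-ovirt-engine-sdk4 4.4):

# Minimal sketch: list a VM's backups and their phases to find the backup id.
import ovirtsdk4 as sdk

connection = sdk.Connection(
    url="https://myengine/ovirt-engine/api",
    username="admin@internal",
    password="mypassword",
    ca_file="/path/to/cacert.pem",
)
try:
    vm_service = connection.system_service().vms_service().vm_service("vm-id")
    for backup in vm_service.backups_service().list():
        print(backup.id, backup.phase)
finally:
    connection.close()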

Then stop the current backup using:

/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py stop \
    vm-id backup-id

If stopping the backup fails, stopping the VM will stop the backup.
I hope you are running a recent enough version, since early versions had a
bug where you could not stop the VM during a backup.

2. File a bug about this


>
> BTW: the parameter to put into ovirt.conf is backup-dir or backup_dir or
> what?
>

ovirt.conf does not include the backup dir, only details about the engine.
Adding backup-dir to ovirt.conf or to a backup-specific configuration sounds
like a good idea.

Nir


[ovirt-users] Re: posix storage mount path error when creating volumes

2021-09-01 Thread Nir Soffer
On Wed, Sep 1, 2021 at 6:21 PM Sketch  wrote:

> My cluster was originally built on 4.3, and things were working as long as
> my SPM was on 4.3.  I just killed off the last 4.3 host and rebuilt it as
> 4.4, and upgraded my cluster and DC to compatibility level 4.6.
>
> We had cephfs mounted as a posix FS which worked fine, but oddly in 4.3 we
> would end up with two mounts for the same volume.  The configuration had a
> comma separated list of IPs as that is how ceph was configured for
> redundancy, and this is the mount that shows up on both 4.3 and 4.4 hosts
> (/rhev/data-center/mnt/10.1.88.75,10.1.88.76,10.1.88.77:_vmstore/).


This was never supported.

We had this old fix that was rejected:
https://gerrit.ovirt.org/c/vdsm/+/94027

but it will not help to solve the issue with the task argument below.


> But
> the 4.3 hosts would also have a duplicate mount which had the FQDN of one
> of the servers instead of the comma separated list.
>
> In 4.4, there's only a single mount and existing VMs will start just fine,
> but you can't create new disks or migrate existing disks onto the posix
> storage volume.  My suspicion is this is an issue with the mount parser
> not liking the comma in the name of the mount from the error that I get on
> the SPM host when it tries to create a volume (migration would also fail
> on the volume creation task):
>
> 2021-08-31 19:34:07,767-0700 INFO  (jsonrpc/6) [vdsm.api] START
> createVolume(sdUUID='e8ec5645-fc1b-4d64-a145-44aa8ac5ef48',
> spUUID='2948c860-9bdf-11e8-a6b3-00163e0419f0',
> imgUUID='7d704b4d-1ebe-462f-b11e-b91039f43637', size='1073741824',
> volFormat=5, preallocate=1, diskType='DATA',
> volUUID='be6cb033-4e42-4bf5-a4a3-6ab5bf03edee',
> desc='{"DiskAlias":"test","DiskDescription":""}',
> srcImgUUID='----',
> srcVolUUID='----', initialSize=None,
> addBitmaps=False) from=:::10.1.2.37,43490,
> flow_id=bb137995-1ffa-429f-b6eb-5b9ca9f8dfd7,
> task_id=2ddfd1bc-d7e1-4a1e-877a-68e1c2a897ed (api:48)
> 2021-08-31 19:34:07,767-0700 INFO  (jsonrpc/6) [IOProcessClient] (Global)
> Starting client (__init__:340)
> 2021-08-31 19:34:07,782-0700 INFO  (ioprocess/3193398) [IOProcess]
> (Global) Starting ioprocess (__init__:465)
> 2021-08-31 19:34:07,803-0700 INFO  (jsonrpc/6) [vdsm.api] FINISH
> createVolume return=None from=:::10.1.2.37,43490,
> flow_id=bb137995-1ffa-429f-b6eb-5b9ca9f8dfd7,
> task_id=2ddfd1bc-d7e1-4a1e-877a-68e1c2a897ed (api:54)
> 2021-08-31 19:34:07,844-0700 INFO  (tasks/5)
> [storage.ThreadPool.WorkerThread] START task
> 2ddfd1bc-d7e1-4a1e-877a-68e1c2a897ed (cmd= >, args=None)
> (threadPool:146)
> 2021-08-31 19:34:07,869-0700 INFO  (tasks/5) [storage.StorageDomain]
> Create placeholder 
> /rhev/data-center/mnt/10.1.88.75,10.1.88.76,10.1.88.77:_vmstore/e8ec5645-fc1b-4d64-a145-44aa8ac5ef48/images/7d704b4d-1ebe-462f-b11e-b91039f43637
> for image's volumes (sd:1718)
> 2021-08-31 19:34:07,869-0700 ERROR (tasks/5) [storage.TaskManager.Task]
> (Task='2ddfd1bc-d7e1-4a1e-877a-68e1c2a897ed') Unexpected error (task:877)
> Traceback (most recent call last):
>File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 884,
> in _run
>  return fn(*args, **kargs)
>File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 350,
> in run
>  return self.cmd(*self.argslist, **self.argsdict)
>File "/usr/lib/python3.6/site-packages/vdsm/storage/securable.py", line
> 79, in wrapper
>  return method(self, *args, **kwargs)
>File "/usr/lib/python3.6/site-packages/vdsm/storage/sp.py", line 1945,
> in createVolume
>  initial_size=initialSize, add_bitmaps=addBitmaps)
>File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 1216,
> in createVolume
>  initial_size=initial_size, add_bitmaps=add_bitmaps)
>File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line
> 1174, in create
>  imgPath = dom.create_image(imgUUID)
>File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 1721,
> in create_image
>  "create_image_rollback", [image_dir])
>File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 385,
> in __init__
>  self.params = ParamList(argslist)
>File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 298,
> in __init__
>  raise ValueError("ParamsList: sep %s in %s" % (sep, i))
> ValueError: ParamsList: sep , in /rhev/data-center/mnt/10.1.88.75
> ,10.1.88.76,10.1.88.77:
> _vmstore/e8ec5645-fc1b-4d64-a145-44aa8ac5ef48/images/7d704b4d-1ebe-462f-b11e-b91039f43637
> 2021-08-31 19:34:07,964-0700 INFO  (tasks/5)
> [storage.ThreadPool.WorkerThread] FINISH task
> 2ddfd1bc-d7e1-4a1e-877a-68e1c2a897ed (threadPool:148)
>

I think the issue is the task arguments parser - the arguments are separated
by ",", so an argument that includes "," breaks the parser.
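
To illustrate the failure mode, here is a simplified reconstruction (not the
actual vdsm ParamList code):

# Simplified illustration: task arguments are persisted as a comma-separated
# string, so an argument that itself contains "," cannot be represented and
# is rejected.
SEPARATOR = ","

def serialize(args):
    for arg in args:
        if SEPARATOR in arg:
            raise ValueError("ParamsList: sep %s in %s" % (SEPARATOR, arg))
    return SEPARATOR.join(args)

image_dir = ("/rhev/data-center/mnt/10.1.88.75,10.1.88.76,10.1.88.77:"
             "_vmstore/e8ec5645-fc1b-4d64-a145-44aa8ac5ef48/images/"
             "7d704b4d-1ebe-462f-b11e-b91039f43637")
serialize([image_dir])   # raises ValueError, matching the traceback above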


> This is a pretty major issue since we can no longer create new VMs.  As a
> workaround, I could change the mount path of the volume to only 

[ovirt-users] Re: NFS Synology NAS (DSM 7)

2021-09-01 Thread Nir Soffer
On Wed, Sep 1, 2021 at 9:55 AM Maton, Brett 
wrote:

> Thanks for the replies, as it turns out it was nothing to do with
> /etc/exports or regular file system permissions.
>
> Synology have applied their own brand of Access Control Lists (ACLs) to
> shared folders.
>

What kind of OS are they using? (freebsd?)


>
> Basically I had to run the following commands to allow vdsm:kvm (36:36) to
> read and write to the share:
>
> EXPORT_DIR=/volumeX/...
>
> synoacltool -set-owner "$EXPORT_DIR" group kvm:allow:rwxpdDaARWcCo:fd--
> synoacltool -add "$EXPORT_DIR" user:vdsm:allow:rwxpdDaARWcCo:fd--
> synoacltool -add "$EXPORT_DIR" group:kvm:allow:rwxpdDaARWcCo:fd--
>

It would be useful to add this solution to this page:
https://www.ovirt.org/develop/troubleshooting-nfs-storage-issues.html

You can click "Edit this page" at the bottom and add the info:
https://github.com/oVirt/ovirt-site/edit/master/source/develop/troubleshooting-nfs-storage-issues.md

Nir


>
> On Wed, 1 Sept 2021 at 04:28, Strahil Nikolov 
> wrote:
>
>> I guess gou need to try:
>> all_squash + anonuid=36 + anongid=36
>>
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Fri, Aug 27, 2021 at 23:44, Alex K
>>  wrote:


[ovirt-users] Re: Hosted engine on HCI cluster is not running

2021-08-13 Thread Nir Soffer
On Fri, Aug 13, 2021 at 9:13 PM David White via Users  wrote:
>
> Hello,
> It appears that my Manager / hosted-engine isn't working, and I'm unable to 
> get it to start.
>
> I have a 3-node HCI cluster, but right now, Gluster is only running on 1 host 
> (so no replication).
> I was hoping to upgrade / replace the storage on my 2nd host today, but 
> aborted that maintenance when I found that I couldn't even get into the 
> Manager.
>
> The storage is mounted, but here's what I see:
>
> [root@cha2-storage dwhite]# hosted-engine --vm-status
> The hosted engine configuration has not been retrieved from shared storage. 
> Please ensure that ovirt-ha-agent is running and the storage server is 
> reachable.
>
> [root@cha2-storage dwhite]# systemctl status ovirt-ha-agent
> ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring 
> Agent
>Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; 
> vendor preset: disabled)
>Active: active (running) since Fri 2021-08-13 11:10:51 EDT; 2h 44min ago
> Main PID: 3591872 (ovirt-ha-agent)
> Tasks: 1 (limit: 409676)
>Memory: 21.5M
>CGroup: /system.slice/ovirt-ha-agent.service
>└─3591872 /usr/libexec/platform-python 
> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
>
> Aug 13 11:10:51 cha2-storage.mgt.barredowlweb.com systemd[1]: Started oVirt 
> Hosted Engine High Availability Monitoring Agent.
>
>
> Any time I try to do anything like connect the engine storage, disconnect the 
> engine storage, or connect to the console, it just sits there, and doesn't do 
> anything, and I eventually have to ctrl-c out of it.
> Maybe I have to be patient? When I ctrl-c, I get a traceback error:
>
> [root@cha2-storage dwhite]# hosted-engine --console
> ^CTraceback (most recent call last):
>   File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
>
> "__main__", mod_spec)
>   File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
> exec(code, run_globals)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 214, in 
> [root@cha2-storage dwhite]# args.command(args)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 42, in func
> f(*args, **kwargs)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 91, in checkVmStatus
> cli = ohautil.connect_vdsm_json_rpc()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> line 472, in connect_vdsm_json_rpc
> __vdsm_json_rpc_connect(logger, timeout)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> line 395, in __vdsm_json_rpc_connect
> timeout=timeout)
>   File "/usr/lib/python3.6/site-packages/vdsm/client.py", line 154, in connect
> outgoing_heartbeat=outgoing_heartbeat, nr_retries=nr_retries)
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 426, 
> in SimpleClient
> nr_retries, reconnect_interval)
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 448, 
> in StandAloneRpcClient
> client = StompClient(utils.create_connected_socket(host, port, sslctx),
>   File "/usr/lib/python3.6/site-packages/vdsm/utils.py", line 379, in 
> create_connected_socket
> sock.connect((host, port))
>   File "/usr/lib64/python3.6/ssl.py", line 1068, in connect
> self._real_connect(addr, False)
>   File "/usr/lib64/python3.6/ssl.py", line 1059, in _real_connect
> self.do_handshake()
>   File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake
> self._sslobj.do_handshake()
>   File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake
> self._sslobj.do_handshake()
>
>
>
> This is what I see in /var/log/ovirt-hosted-engine-ha/broker.log:
>
> MainThread::WARNING::2021-08-11 
> 10:24:41,596::storage_broker::100::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
>  Can't connect vdsm storage: Connection to storage server failed
> MainThread::ERROR::2021-08-11 
> 10:24:41,596::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Failed initializing the broker: Connection to storage server failed
> MainThread::ERROR::2021-08-11 
> 10:24:41,598::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Traceback (most recent call last):
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", 
> line 64, in run
> self._storage_broker_instance = self._get_storage_broker()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", 
> line 143, in _get_storage_broker
> return storage_broker.StorageBroker()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
>  line 97, in __init__
> self._backend.connect()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
>  line 375, in connect
> 

[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Nir Soffer
On Wed, Aug 11, 2021 at 4:24 PM Arik Hadas  wrote:
>
>
>
> On Wed, Aug 11, 2021 at 2:56 PM Benny Zlotnik  wrote:
>>
>> > If your vm is temporary and you like to drop the data written while
>> > the vm is running, you
>> > could use a temporary disk based on the template. This is called a
>> > "transient disk" in vdsm.
>> >
>> > Arik, maybe you remember how transient disks are used in engine?
>> > Do we have an API to run a VM once, dropping the changes to the disk
>> > done while the VM was running?
>>
>> I think that's how stateless VMs work
>
>
> +1
> It doesn't work exactly like Nir wrote above - stateless VMs that are 
> thin-provisioned would have a qcow volume on top of each template's volume, 
> and when they start, their active volume would be a qcow volume on top of 
> the aforementioned qcow volume; that active volume will be removed when 
> the VM goes down
> But yeah, stateless VMs are intended for such use case

I was referring to transient disks - created in vdsm:
https://github.com/oVirt/vdsm/blob/45903d01e142047093bf844628b5d90df12b6ffb/lib/vdsm/virt/vm.py#L3789

This creates a *local* temporary file using qcow2 format, using the
disk on shared
storage as a backing file.

Maybe this is not used by engine?
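
For illustration only - this is not the exact vdsm code path, and the paths
below are made up - the resulting layout is roughly equivalent to:

# local qcow2 overlay backed by the disk on shared storage;
# guest writes land in the overlay, the backing image is never modified
qemu-img create -f qcow2 \
    -b /rhev/data-center/mnt/server:_export/sd-id/images/img-id/vol-id \
    -F qcow2 \
    /var/tmp/transient-disk.qcow2

# dropping the overlay discards everything written while the VM ran
rm -f /var/tmp/transient-disk.qcow2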
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JTB6P4N5G34JK3QO375XJVIIF4OZHTYH/


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Nir Soffer
On Wed, Aug 11, 2021 at 3:13 PM Shantur Rathore
 wrote:
>
>
>> Yes, on file based storage a snapshot is a file, and it grows as
>> needed.  On block based
>> storage, a snapshot is a logical volume, and oVirt needs to extend it
>> when needed.
>
>
> Forgive my ignorance, coming from vSphere background where a filesystem was 
> created on iSCSI LUN.
> I take it that this isn't the case for an iSCSI Storage Domain in oVirt.

Yes, for block storage, we create a LVM volume group with one or more LUNs
to create a storage domain. Disks are created using LVM logical volume on
this VG.

When you create a vm from a template on block storage, we create a new 1g
logical volume for the vm disk, and create a qcow2 image on this logical volume
with the template's logical volume as its backing file.

The logical volume needs to be extended when free space is low. This is done
automatically on the host running the VM, but since oVirt is not in the data
path, the VM may write data too fast and pause trying to write after the end
of the logical volume. In this case the VM will be resumed when oVirt finishes
extending the volume.
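
As a rough sketch (these are not the literal commands vdsm runs, and all names
below are invented), the resulting layout looks something like:

# the storage domain is an LVM volume group built from the LUN(s);
# a thin disk created from a template is a small logical volume holding a
# qcow2 overlay whose backing file is the template's logical volume
lvcreate -n my-vm-disk -L 1g my-domain
qemu-img create -f qcow2 \
    -b /dev/my-domain/my-template-disk -F qcow2 \
    /dev/my-domain/my-vm-disk

# when free space inside the qcow2 runs low, the logical volume is grown, e.g.
lvextend -L +1g my-domain/my-vm-disk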

> On Wed, Aug 11, 2021 at 12:26 PM Nir Soffer  wrote:
>>
>> On Wed, Aug 11, 2021 at 12:43 AM Shantur Rathore
>>  wrote:
>> >
>> > Thanks for the detailed response Nir.
>> >
>> > In my use case, we keep creating VMs from templates and deleting them so 
>> > we need the VMs to be created quickly and cloning it will use a lot of 
>> > time and storage.
>>
>> That's a good reason to use a template.
>>
>> If your vm is temporary and you like to drop the data written while
>> the vm is running, you
>> could use a temporary disk based on the template. This is called a
>> "transient disk" in vdsm.
>>
>> Arik, maybe you remember how transient disks are used in engine?
>> Do we have an API to run a VM once, dropping the changes to the disk
>> done while the VM was running?
>>
>> > I will try to add the config and try again tomorrow. Also I like the 
>> > Managed Block storage idea, I had read about it in the past and used it 
>> > with Ceph.
>> >
>> > Just to understand it better, is this issue only on iSCSI based storage?
>>
>> Yes, on file based storage a snapshot is a file, and it grows as
>> needed.  On block based
>> storage, a snapshot is a logical volume, and oVirt needs to extend it
>> when needed.
>>
>> Nir
>>
>> > Thanks again.
>> >
>> > Regards
>> > Shantur
>> >
>> > On Tue, Aug 10, 2021 at 9:26 PM Nir Soffer  wrote:
>> >>
>> >> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
>> >>  wrote:
>> >> >
>> >> > Hi all,
>> >> >
>> >> > I have a setup as detailed below
>> >> >
>> >> > - iSCSI Storage Domain
>> >> > - Template with Thin QCOW2 disk
>> >> > - Multiple VMs from Template with Thin disk
>> >>
>> >> Note that a single template disk used by many vms can become a performance
>> >> bottleneck, and is a single point of failure. Cloning the template when 
>> >> creating
>> >> vms avoids such issues.
>> >>
>> >> > oVirt Node 4.4.4
>> >>
>> >> 4.4.4 is old, you should upgrade to 4.4.7.
>> >>
>> >> > When the VM boots up it downloads some data to it and that leads to an 
>> >> > increase in volume size.
>> >> > I see that every few seconds the VM gets paused with
>> >> >
>> >> > "VM X has been paused due to no Storage space error."
>> >> >
>> >> >  and then after few seconds
>> >> >
>> >> > "VM X has recovered from paused back to up"
>> >>
>> >> This is normal operation when a vm writes too quickly and oVirt cannot
>> extend the disk quickly enough. To mitigate this, you can increase the
>> >> volume chunk size.
>> >>
>> Create this configuration drop-in file:
>> >>
>> >> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
>> >> [irs]
>> >> volume_utilization_percent = 25
>> >> volume_utilization_chunk_mb = 2048
>> >>
>> >> And restart vdsm.
>> >>
>> >> With this setting, when free space in a disk is 1.5g, the disk will
>> >> be extended by 2g. With the default setting, when free space is
>> >> 0.5g the disk was extended by 1g.
>> >

[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Nir Soffer
On Wed, Aug 11, 2021 at 12:43 AM Shantur Rathore
 wrote:
>
> Thanks for the detailed response Nir.
>
> In my use case, we keep creating VMs from templates and deleting them so we 
> need the VMs to be created quickly and cloning it will use a lot of time and 
> storage.

That's a good reason to use a template.

If your vm is temporary and you like to drop the data written while
the vm is running, you
could use a temporary disk based on the template. This is called a
"transient disk" in vdsm.

Arik, maybe you remember how transient disks are used in engine?
Do we have an API to run a VM once, dropping the changes to the disk
done while the VM was running?

> I will try to add the config and try again tomorrow. Also I like the Managed 
> Block storage idea, I had read about it in the past and used it with Ceph.
>
> Just to understand it better, is this issue only on iSCSI based storage?

Yes, on file based storage a snapshot is a file, and it grows as
needed.  On block based
storage, a snapshot is a logical volume, and oVirt needs to extend it
when needed.

Nir

> Thanks again.
>
> Regards
> Shantur
>
> On Tue, Aug 10, 2021 at 9:26 PM Nir Soffer  wrote:
>>
>> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
>>  wrote:
>> >
>> > Hi all,
>> >
>> > I have a setup as detailed below
>> >
>> > - iSCSI Storage Domain
>> > - Template with Thin QCOW2 disk
>> > - Multiple VMs from Template with Thin disk
>>
>> Note that a single template disk used by many vms can become a performance
>> bottleneck, and is a single point of failure. Cloning the template when 
>> creating
>> vms avoids such issues.
>>
>> > oVirt Node 4.4.4
>>
>> 4.4.4 is old, you should upgrade to 4.4.7.
>>
>> > When the VM boots up it downloads some data to it and that leads to an 
>> > increase in volume size.
>> > I see that every few seconds the VM gets paused with
>> >
>> > "VM X has been paused due to no Storage space error."
>> >
>> >  and then after few seconds
>> >
>> > "VM X has recovered from paused back to up"
>>
>> This is normal operation when a vm writes too quickly and oVirt cannot
>> extend the disk quickly enough. To mitigate this, you can increase the
>> volume chunk size.
>>
>> Create this configuration drop-in file:
>>
>> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
>> [irs]
>> volume_utilization_percent = 25
>> volume_utilization_chunk_mb = 2048
>>
>> And restart vdsm.
>>
>> With this setting, when free space in a disk is 1.5g, the disk will
>> be extended by 2g. With the default setting, when free space is
>> 0.5g the disk was extended by 1g.
>>
>> If this does not eliminate the pauses, try a larger chunk size
>> like 4096.
>>
>> > Sometimes after many pauses and recoveries the VM dies with
>> >
>> > "VM X is down with error. Exit message: Lost connection with qemu process."
>>
>> This means qemu has crashed. You can find more info in the vm log at:
>> /var/log/libvirt/qemu/vm-name.log
>>
>> We know about bugs in qemu that cause such crashes when vm disk is
>> extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
>> will fix this issue.
>>
>> Even with these settings, if you have very bursty I/O in the vm, it may
>> become paused. The only way to completely avoid these pauses is to
>> use a preallocated disk, or use file storage (e.g. NFS). Preallocated disk
>> can be thin provisioned on the server side so it does not mean you need
>> more storage, but you will not be able to use shared templates in the way
>> you use them now. You can create vm from template, but the template
>> is cloned to the new vm.
>>
>> Another option with (still tech preview) is Managed Block Storage (Cinder
>> based storage). If your storage server is supported by Cinder, we can
>> manage it using cinderlib. In this setup every disk is a LUN, which may
>> be thin provisioned on the storage server. This can also offload storage
>> operations to the server, like cloning disks, which may be much faster and
>> more efficient.
>>
>> Nir
>>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NH3ZZMYOCTVKDF4GYKFOSQYPP2IK3JFT/


[ovirt-users] Re: low iscsi storage space

2021-08-10 Thread Nir Soffer
On Tue, Aug 10, 2021 at 5:36 PM Leonardo Costa
 wrote:
>
> Hello.
> I'm having a problem that I can't solve,
> I have 20Tb of space on Ovirt but on my DELL SCV 3000 storage, only 5Tb free 
> on the volume.
> It seems to me that the deleted machines are not removing the disks in the 
> storage iscsi.

Did you enable "Discard after delete" option in the storage domain
advanced options?
(Manage domain -> look at the bottom of the dialog)

After you enable this, you can create a very large disk using all free
space, and delete it.
The disk will be discarded, freeing space on the storage server.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7K74CBSJW6RG35IN3TJZZ55OE4BDVQRO/


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-10 Thread Nir Soffer
On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
 wrote:
>
> Hi all,
>
> I have a setup as detailed below
>
> - iSCSI Storage Domain
> - Template with Thin QCOW2 disk
> - Multiple VMs from Template with Thin disk

Note that a single template disk used by many vms can become a performance
bottleneck, and is a single point of failure. Cloning the template when creating
vms avoids such issues.

> oVirt Node 4.4.4

4.4.4 is old, you should upgrade to 4.4.7.

> When the VM boots up it downloads some data to it and that leads to an increase 
> in volume size.
> I see that every few seconds the VM gets paused with
>
> "VM X has been paused due to no Storage space error."
>
>  and then after few seconds
>
> "VM X has recovered from paused back to up"

This is normal operation when a vm writes too quickly and oVirt cannot
extend the disk quickly enough. To mitigate this, you can increase the
volume chunk size.

Create this configuration drop-in file:

# cat /etc/vdsm/vdsm.conf.d/99-local.conf
[irs]
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048

And restart vdsm.

With this setting, when free space in a disk is 1.5g, the disk will
be extended by 2g. With the default setting, when free space is
0.5g the disk was extended by 1g.

If this does not eliminate the pauses, try a larger chunk size
like 4096.
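
For reference, applying the change end to end could look like this (I'm
assuming the standard vdsmd service name on the host, and the threshold
comments just restate the numbers above, if I read them correctly):

mkdir -p /etc/vdsm/vdsm.conf.d
cat > /etc/vdsm/vdsm.conf.d/99-local.conf << 'EOF'
[irs]
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048
EOF

# extension triggers when free space drops below
# (100 - volume_utilization_percent)% of volume_utilization_chunk_mb:
#   defaults (50, 1024) -> extend by 1g when free space < 0.5g
#   above    (25, 2048) -> extend by 2g when free space < 1.5g
systemctl restart vdsmd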

> Sometimes after many pauses and recoveries the VM dies with
>
> "VM X is down with error. Exit message: Lost connection with qemu process."

This means qemu has crashed. You can find more info in the vm log at:
/var/log/libvirt/qemu/vm-name.log

We know about bugs in qemu that cause such crashes when vm disk is
extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
will fix this issue.

Even with these settings, if you have very bursty I/O in the vm, it may
become paused. The only way to completely avoid these pauses is to
use a preallocated disk, or use file storage (e.g. NFS). Preallocated disk
can be thin provisioned on the server side so it does not mean you need
more storage, but you will not be able to use shared templates in the way
you use them now. You can create vm from template, but the template
is cloned to the new vm.

Another option with (still tech preview) is Managed Block Storage (Cinder
based storage). If your storage server is supported by Cinder, we can
manage it using cinderlib. In this setup every disk is a LUN, which may
be thin provisioned on the storage server. This can also offload storage
operations to the server, like cloning disks, which may be much faster and
more efficient.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/W653KLDZMLUNMKLE242UFH5LY4KQ6LD5/


[ovirt-users] Re: live merge of snapshots failed

2021-08-10 Thread Nir Soffer
On Tue, Aug 10, 2021 at 7:33 AM  wrote:
>
> Hello Nir,
> No I do not have libvirt logs enabled.
> I restored the vm from the snapshot and retried. It did boot but at the same 
> time it did not merge again when I tried it. On the other hand when I cloned 
> it and tried to recreate the situation the image did merge.
> Is it possible that the image is corrupted for any reason beyond live merge 
> failure, so the merge fails regardless?
> This is a production vm so I cannot play a lot with it :-(
> I should probably clone it, give the clone to production, and see if this 
> continues to happen.

I see you are using ovirt 4.3.10 - we fixed some snapshot deletion
issues in 4.4.

For your case I would try to:
1. Clone the VM - can you delete the snapshot in the clone?
2. Shut down the clone - can you delete the snapshot when the clone is shut down?

Please file a bug and attach vdsm and engine logs showing the
timeframe when you tried
to delete the snapshot.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/EFRNJLPOFWSDV5SESQ3S2UHTBRQLGKH4/


[ovirt-users] Re: HA VM and vm leases usage with site failure

2021-08-10 Thread Nir Soffer
On Tue, Aug 10, 2021 at 12:05 PM Klaas Demter  wrote:
> I always thought the SPM role is also "managed" by a storage lease :)

The SPM is using a storage lease to ensure we have only one SPM. But due to
the master mount, we cannot start a new SPM even if the old SPM does not hold
the lease, since it will corrupt the master filesystem, used to keep SPM tasks.

> But that does not seem to be the case.
>
> So this means a storage lease is only useful if the host is not the SPM?
> If the SPM host is completely unreachable, not via OS, not via power
> management, then the storage lease won't help to restart VMs on other
> hosts automatically? This is definitely something I did not consider
> when building my environment.

Starting VMs should not depend on the SPM; this is the basic design. If an issue
with the SPM breaks starting VMs, this is a bug that we need to fix.

The only known dependency is extending thin provisioned disks on block storage.
Without the SPM this cannot happen since the SPM is the only host that
can extend
the logical volumes.

> On 8/9/21 6:25 PM, Nir Soffer wrote:
> > On Thu, Aug 5, 2021 at 5:45 PM Gianluca Cecchi
> >  wrote:
> >> Hello,
> >> supposing latest 4.4.7 environment installed with an external engine and 
> >> two hosts, one in one site and one in another site.
> >> For storage I have one FC storage domain.
> >> I try to simulate a sort of "site failure scenario" to see what kind of HA 
> >> I should expect.
> >>
> >> The 2 hosts have power mgmt configured through fence_ipmilan.
> >>
> >> I have 2 VMs, one configured as HA with lease on storage (Resume Behavior: 
> >> kill) and one not marked as HA.
> >>
> >> Initially host1 is SPM and it is the host that runs the two VMs.
> >>
> >> Fencing of host1 from host2 initially works ok. I can test also from 
> >> command line:
> >> # fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L 
> >> operator -S /usr/local/bin/pwd.sh -o status
> >> Status: ON
> >>
> >> On host2 I then prevent reaching host1 iDRAC:
> >> firewall-cmd --direct --add-rule ipv4 filter OUTPUT 0 -d 10.10.193.152 -p 
> >> udp --dport 623 -j DROP
> >> firewall-cmd --direct --add-rule ipv4 filter OUTPUT 1 -j ACCEPT
> > Why do you need to prevent access from host1 to host2? Hosts do not
> > access each other unless you migrate vms between hosts.
> >
> >> so that:
> >>
> >> # fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L 
> >> operator -S /usr/local/bin/pwd.sh -o status
> >> 2021-08-05 15:06:07,254 ERROR: Failed: Unable to obtain correct plug 
> >> status or plug is not available
> >>
> >> On host1 I generate panic:
> >> # date ; echo 1 > /proc/sys/kernel/sysrq ; echo c > /proc/sysrq-trigger
> >> Thu Aug  5 15:06:24 CEST 2021
> >>
> >> host1 correctly completes its crash dump (kdump integration is enabled) 
> >> and reboots, but I stop it at grub prompt so that host1 is unreachable 
> >> from host2 point of view and also power fencing not determined
> > Crashing the host and preventing it from booting is fine, but isn't it
> > simpler to stop the host using power management?
> >
> >> At this point I thought that VM lease functionality would have come in 
> >> place and host2 would be able to re-start the HA VM, as it is able to see 
> >> that the lease is not taken from the other host and so it can acquire the 
> >> lock itself
> > Once host1 disappears from the system, engine should detect that the HA VM
> > is at unknown status, and start it on the other host.
> >
> > But you kill the SPM, and without an SPM some operations cannot
> > work until a new SPM is selected. And for the SPM we don't have a way
> > to start it on another host *before* the old SPM host reboots and we can
> > verify that the old host is not the SPM.
> >
> >> Instead it goes through the attempt to power fence loop
> >> I wait about 25 minutes without any effect but continuous attempts.
> >>
> >> After 2 minutes host2 correctly becomes SPM and VMs are marked as unknown
> > I wonder how host2 became the SPM. This should not be possible before
> > host 1 is rebooted. Did you use "Confirm host was rebooted" in engine?
> >
> >> At a certain point after the failures in power fencing host1, I see the 
> >> event:
> >>
> >> Failed to power fence host host1. Please check the host status and it's 
> >> power management settings, and then manually reboot it 

[ovirt-users] Re: Resize iSCSI LUN and Storage Domain

2021-08-09 Thread Nir Soffer
On Mon, Aug 9, 2021 at 7:32 PM Shantur Rathore
 wrote:
>
> Hi all,
>
> I have an iSCSI Storage Domain for VMs and need to increase storage.
> To do this, I increased the size of LUN on iscsi storage server and tried to 
> refresh the size of Storage Domain as per
>
> https://www.ovirt.org/documentation/administration_guide/index.html#Increasing_iSCSI_or_FCP_Storage
>
> "Refreshing the LUN Size" section
>
> I cannot see the "Additional Storage Size" column or any other option to 
> refresh the LUN size.

Since you have a storage domain using this LUN, you need to follow the
instructions
for "Increasing an Existing iSCSI or FCP Storage Domain".

The "Refreshing the LUN Size" section is for increating a LUN which is
not part of
the storage domain.

When you open the "Manage domain" dialog, oVirt performs a SCSI rescan
which should discover the new size of the LUN. Then oVirt checks if the
multipath map matches the size of the LUN, and if not it will resize the map
to fit the LUN.

Finally it shows a button to resize the PV on top of the multipath device.
When you select this button and confirm, oVirt will resize the PV,
which will resize
the VG.
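
For reference, the manual equivalent of these steps is roughly the following
(device names are placeholders; oVirt runs all of this for you from the dialog):

# rescan the iSCSI sessions so the kernel sees the new LUN size
iscsiadm -m session --rescan

# resize the multipath map on top of the rescanned paths
multipathd resize map <multipath-wwid>

# grow the PV so the VG picks up the new space
pvresize /dev/mapper/<multipath-wwid>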

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JGY3QU6EBRMORU6VQ6WPRSCKMM4QPQEJ/


[ovirt-users] Re: HA VM and vm leases usage with site failure

2021-08-09 Thread Nir Soffer
On Thu, Aug 5, 2021 at 5:45 PM Gianluca Cecchi
 wrote:
>
> Hello,
> supposing latest 4.4.7 environment installed with an external engine and two 
> hosts, one in one site and one in another site.
> For storage I have one FC storage domain.
> I try to simulate a sort of "site failure scenario" to see what kind of HA I 
> should expect.
>
> The 2 hosts have power mgmt configured through fence_ipmilan.
>
> I have 2 VMs, one configured as HA with lease on storage (Resume Behavior: 
> kill) and one not marked as HA.
>
> Initially host1 is SPM and it is the host that runs the two VMs.
>
> Fencing of host1 from host2 initially works ok. I can test also from command 
> line:
> # fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L operator 
> -S /usr/local/bin/pwd.sh -o status
> Status: ON
>
> On host2 I then prevent reaching host1 iDRAC:
> firewall-cmd --direct --add-rule ipv4 filter OUTPUT 0 -d 10.10.193.152 -p udp 
> --dport 623 -j DROP
> firewall-cmd --direct --add-rule ipv4 filter OUTPUT 1 -j ACCEPT

Why do you need to prevent access from host1 to host2? Hosts do not
access each other unless you migrate vms between hosts.

> so that:
>
> # fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L operator 
> -S /usr/local/bin/pwd.sh -o status
> 2021-08-05 15:06:07,254 ERROR: Failed: Unable to obtain correct plug status 
> or plug is not available
>
> On host1 I generate panic:
> # date ; echo 1 > /proc/sys/kernel/sysrq ; echo c > /proc/sysrq-trigger
> Thu Aug  5 15:06:24 CEST 2021
>
> host1 correctly completes its crash dump (kdump integration is enabled) and 
> reboots, but I stop it at grub prompt so that host1 is unreachable from host2 
> point of view and also power fencing not determined

Crashing the host and preventing it from booting is fine, but isn't it
simpler to stop the host using power management?

> At this point I thought that VM lease functionality would have come in place 
> and host2 would be able to re-start the HA VM, as it is able to see that the 
> lease is not taken from the other host and so it can acquire the lock 
> itself

Once host1 disappears from the system, engine should detect that the HA VM
is at unknown status, and start it on the other host.

But you kill the SPM, and without an SPM some operations cannot
work until a new SPM is selected. And for the SPM we don't have a way
to start it on another host *before* the old SPM host reboots and we can
verify that the old host is not the SPM.

> Instead it goes through the attempt to power fence loop
> I wait about 25 minutes without any effect but continuous attempts.
>
> After 2 minutes host2 correctly becomes SPM and VMs are marked as unknown

I wonder how host2 became the SPM. This should not be possible before
host 1 is rebooted. Did you use "Confirm host was rebooted" in engine?

> At a certain point after the failures in power fencing host1, I see the event:
>
> Failed to power fence host host1. Please check the host status and it's power 
> management settings, and then manually reboot it and click "Confirm Host Has 
> Been Rebooted"
>
> If I select host and choose "Confirm Host Has Been Rebooted", then the two 
> VMs are marked as down and the HA one is correctly booted by host2.
>
> But this requires my manual intervention.

So host2 became the SPM after you chose: "Confirm Host Has Been Rebooted"?

> Is the behavior above the expected one or the use of VM leases should have 
> allowed host2 to bypass fencing inability and start the HA VM with lease? 
> Otherwise I don't understand the reason to have the lease itself at all

The vm lease allows engine to start HA VM on another host when it cannot
access the original host the VM was running on.

The VM can start only if it is not running on the original host. If
the VM is running
it will keep the lease live, and other hosts will not be able to acquire it.

I suggest you file an ovirt-engine bug with clear instructions on how
to reproduce
the issue.

You can check this presentation on this topic:
https://www.youtube.com/watch?v=WLnU_YsHWtU

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/J3QWSUZTYHZM74ZESJG7M2VAGZRLY5L2/


[ovirt-users] Re: live merge of snapshots failed

2021-08-09 Thread Nir Soffer
On Fri, Aug 6, 2021 at 1:16 PM  wrote:
>
> I think these are the corresponding logs
> qcow2: Marking image as corrupt: Cluster allocation offset 0x7890c000 
> unaligned (L2 offset: 0x39e0, L2 index: 0); further corruption events 
> will be suppressed

This disk was corrupted by a previous run of the vm.

> main_channel_link: add main channel client
> main_channel_client_handle_pong: net test: latency 12.959000 ms, bitrate 
> 3117199391 bps (2972.792998 Mbps)
> inputs_connect: inputs channel client create
> red_qxl_set_cursor_peer:
> red_channel_client_disconnect: rcc=0x56405bdf69c0 (channel=0x56405ad7c940 
> type=3 id=0)
> red_channel_client_disconnect: rcc=0x56405e78cdd0 (channel=0x56405bb96900 
> type=4 id=0)
> red_channel_client_disconnect: rcc=0x56405e79c5b0 (channel=0x56405ad7c220 
> type=2 id=0)
> red_channel_client_disconnect: rcc=0x56405bdea9f0 (channel=0x56405ad7c150 
> type=1 id=0)
> main_channel_client_on_disconnect: rcc=0x56405bdea9f0
> red_client_destroy: destroy client 0x56405c383110 with #channels=4
> red_qxl_disconnect_cursor_peer:
> red_qxl_disconnect_display_peer:
> 2021-08-03T08:10:50.516974Z qemu-kvm: terminating on signal 15 from pid 6847 
> ()
> 2021-08-03 08:10:50.717+: shutting down, reason=destroyed

Did you replace the corrupted disk before starting the vm again 3 hours later?

> 2021-08-03 11:02:57.502+: starting up libvirt version: 4.5.0, package: 
> 33.el7_8.1 (CentOS BuildSystem , 2020-05-12-16:25:35, 
> x86-01.bsys.centos.org), qemu version: 2.12.0qemu-kvm-ev-2.12.0-44.1.el7_8.1, 
> kernel: 3.10.0-1127.8.2.el7.x86_64, hostname: ovirt3-5.vmmgmt-int.uoc.gr
> LC_ALL=C \
> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
> QEMU_AUDIO_DRV=none \
> /usr/libexec/qemu-kvm \
> -name guest=anova.admin.uoc.gr,debug-threads=on \
> -S \
> -object 
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-12-anova.admin.uoc.gr/master-key.aes
>  \
> -machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off \
> -cpu 
> Westmere,vme=on,pclmuldq=on,x2apic=on,hypervisor=on,arat=on,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_synic,hv_stimer
>  \
> -m size=8388608k,slots=16,maxmem=33554432k \
> -realtime mlock=off \
> -smp 2,maxcpus=16,sockets=16,cores=1,threads=1 \
> -object iothread,id=iothread1 \
> -numa node,nodeid=0,cpus=0-1,mem=8192 \
> -uuid 1c1d20ed-3167-4be7-bff3-29845142fc57 \
> -smbios 'type=1,manufacturer=oVirt,product=oVirt 
> Node,version=7-8.2003.0.el7.centos,serial=4c4c4544-0053-4b10-8059-cac04f475832,uuid=1c1d20ed-3167-4be7-bff3-29845142fc57'
>  \
> -no-user-config \
> -nodefaults \
> -chardev socket,id=charmonitor,fd=33,server,nowait \
> -mon chardev=charmonitor,id=monitor,mode=control \
> -rtc base=2021-08-03T12:02:56,driftfix=slew \
> -global kvm-pit.lost_tick_policy=delay \
> -no-hpet \
> -no-shutdown \
> -global PIIX4_PM.disable_s3=1 \
> -global PIIX4_PM.disable_s4=1 \
> -boot strict=on \
> -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
> -device 
> virtio-scsi-pci,iothread=iothread1,id=ua-90ae154d-56b8-499a-9173-c4cd225ba0c6,bus=pci.0,addr=0x7
>  \
> -device 
> virtio-serial-pci,id=ua-a8dc285c-6fa9-45b2-a4f9-c8862be71342,max_ports=16,bus=pci.0,addr=0x4
>  \
> -drive 
> file=/rhev/data-center/mnt/10.252.80.208:_home_isos/5b1a0f29-8f97-42c3-bea2-39f83bbfbf24/images/----/virtio-win-0.1.185.iso,format=raw,if=none,id=drive-ua-cfb42882-2eba-41b9--43781eeff382,werror=report,rerror=report,readonly=on
>  \
> -device 
> ide-cd,bus=ide.1,unit=0,drive=drive-ua-cfb42882-2eba-41b9--43781eeff382,id=ua-cfb42882-2eba-41b9--43781eeff382,bootindex=2
>  \
> -drive 
> file=/rhev/data-center/mnt/blockSD/a5a492a7-f770-4472-baa3-ac7297a581a9/images/2e6e3cd3-f0cb-47a7-8bda-7738bd7c1fb5/84c005da-cbec-4ace-8619-5a8e2ae5ea75,format=raw,if=none,id=drive-ua-2e6e3cd3-f0cb-47a7-8bda-7738bd7c1fb5,serial=2e6e3cd3-f0cb-47a7-8bda-7738bd7c1fb5,werror=stop,rerror=stop,cache=none,aio=native,throttling.bps-read=157286400,throttling.bps-write=73400320,throttling.iops-read=1200,throttling.iops-write=180
>  \

This log does not show anything except the corruption in the previous run.

What we need is libvirtd.log from /var/log/libvirt/libvirtd.log.

The log usually does not exist since it is too verbose to enable by default.
You can try to enable libvirt logs temporarily, see:
https://libvirt.org/kbase/debuglogs.html
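
A minimal sketch of enabling them persistently (the filter string is only an
illustration; take the recommended filters for your libvirt version from the
page above):

cat >> /etc/libvirt/libvirtd.conf << 'EOF'
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
EOF

# restart the daemon so the settings take effect
systemctl restart libvirtd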

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IVC4PZXJLYRFSZDFPKIUTVFX7WNEV6Q2/


[ovirt-users] Re: Combining Virtual machine image with multiple disks attached

2021-08-05 Thread Nir Soffer
On Thu, Aug 5, 2021 at 5:12 PM KK CHN  wrote:
> I have installed the  ovirt-engine-sdk-python using pip3  in my python3 
> virtaul environment in my personal laptop

I'm not sure this is the right version. Use the rpms provided by ovirt instead.

...
> and Created file  in the user kris home directory in the same laptop  // Is 
> what I am doing right ?
>
> (base) kris@my-ThinkPad-X270:~$ cat ~/.config/ovirt.conf
> [engine-dev]

This can be any name you like for this setup.

> engine_url=https://engine-dev   // what is this engine url? Is it the RHV/oVirt 
> URL that our service provider should provide?

This is your engine url, the same url you access engine UI.

> username=admin@internal
> password=mypassword
> cafile=/etc/pki/vdsm/certs/cacert.pem // I don't have any cacert.pem file 
> on my laptop; there is no /etc/pki/vdsm/certs/ folder at all

This path works on ovirt host. You can download engine cafile from your
engine using:

curl -k 'https://engine-dev/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA' \
    > engine-dev.pem

and use the path to the cafile:

cafile=/home/kris/engine-dev.pem
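
To sanity-check the downloaded file (assuming openssl is installed on the
laptop):

openssl x509 -in /home/kris/engine-dev.pem -noout -subject -dates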

...
> But  I couldn't find any examples folder where I can find the 
> download_disk.py   // So I have downloaded files for 
> ovirt-engne-sdk-python-4.1.3.tar.gz
>
> and untarred the files where I am able to find the  download_disk.py

You need to use ovirt sdk from 4.4. 4.1 sdk is too old.

Also if  you try to run this on another host, you need to install
more packages.

1. Install ovirt release rpm

dnf install https://resources.ovirt.org/pub/yum-repo/ovirt-release44.rpm

2. Install required packages

dnf install python3-ovirt-engine-sdk4 ovirt-imageio-client

$ rpm -q python3-ovirt-engine-sdk4 ovirt-imageio-client
python3-ovirt-engine-sdk4-4.4.13-1.el8.x86_64
ovirt-imageio-client-2.2.0-1.el8.x86_64

$ find /usr/share/ -name download_disk.py
/usr/share/doc/python3-ovirt-engine-sdk4/examples/download_disk.py

Now you can use download_disk.py to download images from ovirt setup.

...
> Can I execute now the following from my laptop ? so that it will connect to 
> the rhevm host node and download the disks ?

Yes

> (base) kris@my-ThinkPad-X270:$ python3 download_disk.py  -c engine-dev 
> MY_vm_blah_Id /var/tmp/disk1.raw  //is this correct ?

Almost, see the help:

$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/download_disk.py -h
usage: download_disk.py [-h] -c CONFIG [--debug] [--logfile LOGFILE]
[-f {raw,qcow2}] [--use-proxy]
[--max-workers MAX_WORKERS]
[--buffer-size BUFFER_SIZE]
[--timeout-policy {legacy,pause,cancel}]
disk_uuid filename

Download disk

positional arguments:
  disk_uuid Disk UUID to download.
  filename  Path to write downloaded image.

optional arguments:
  -h, --helpshow this help message and exit
  -c CONFIG, --config CONFIG
Use engine connection details from [CONFIG] section in
~/.config/ovirt.conf.
  --debug   Log debug level messages to logfile.
  --logfile LOGFILE Log file name (default example.log).
  -f {raw,qcow2}, --format {raw,qcow2}
Downloaded file format. For best compatibility, use
qcow2 (default qcow2).
  --use-proxy   Download via proxy on the engine host (less
efficient).
  --max-workers MAX_WORKERS
Maximum number of workers to use for download. The
default (4) improves performance when downloading a
single disk. You may want to use lower number if you
download many disks in the same time.
  --buffer-size BUFFER_SIZE
Buffer size per worker. The default (4194304) gives
good performance with the default number of workers.
If you use smaller number of workers you may want use
larger value.
  --timeout-policy {legacy,pause,cancel}
The action to be made for a timed out transfer


Example command to download disk id 3649d84b-6f35-4314-900a-5e8024e3905c
from engine configuration myengine to file disk.img, converting the
format to raw:

$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/download_disk.py
-c myengine --format raw 3649d84b-6f35-4314-900a-5e8024e3905c disk.img
[   0.0 ] Connecting...
[   0.5 ] Creating image transfer...
[   2.8 ] Transfer ID: 62c99f08-e58c-4cc2-8c72-9aa9be835d0f
[   2.8 ] Transfer host name: host4
[   2.8 ] Downloading image...
[ 100.00% ] 6.00 GiB, 11.62 seconds, 528.83 MiB/s
[  14.4 ] Finalizing image transfer...

You can check the image with qemu-img info:

$ qemu-img info disk.img
image: disk.img
file format: 

[ovirt-users] Re: Combining Virtual machine image with multiple disks attached

2021-08-03 Thread Nir Soffer
On Tue, Aug 3, 2021 at 7:29 PM KK CHN  wrote:
>
> I have asked our VM maintainer to run the  command
>
> # virsh -r dumpxml vm-name_blah//as Super user
>
> But no output :   No matching domains found that was the TTY  output on  that 
> rhevm node when I executed the command.
>
> Then I tried to execute #  virsh list //  it doesn't list any VMs  !!!   
> ( How come this ? Does the Rhevm node need to enable any CLI  with License 
> key or something to list Vms or  to dumpxml   with   virsh ? or its CLI 
> commands ?

RHV undefines the VMs when they are not running.

> Anyway, I want to know what I have to ask the maintainer to provide: a 
> working CLI, or whatever performs the expected tasks with command-line 
> utilities in RHV.
>
If the vm is not running you can get the vm configuration from ovirt
using the API:

GET /api/vms/{vm-id}

You may need more API calls to get info about the disks; follow the links
in the returned xml.
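
For example, with curl (the VM id is a placeholder; basic auth with the engine
credentials is assumed, and -k skips CA verification; point --cacert at the
engine CA file instead if you have it):

curl -k -u 'admin@internal:mypassword' \
    -H 'Accept: application/xml' \
    'https://engine-dev/ovirt-engine/api/vms/<vm-id>'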

> I have one more question: which command can I execute on an RHV node 
> to manually export (not through the GUI portal) a VM to the required format?
>
> For example: 1. I need to get one VM and the disks attached to it as raw 
> images. Is this possible, and how?
>
> and another: 2. A VM and the disks attached to it as OVA (or another good 
> format) suitable to upload to Glance?

Arik can add more info on exporting.

>   Each VM is around 200 to 300 GB with its disk volumes. So where should 
> the images be exported to, and which path should I specify? To the host node (if the host 
> doesn't have space) or an NFS mount? How do I specify the target location where 
> the VM image gets stored in case an NFS mount is available?

You have 2 options:
- Download the disks using the SDK
- Export the VM to OVA

When exporting to OVA, you will always get qcow2 images, which you can later
convert to raw using "qemu-img convert"
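
For example (file names are placeholders):

qemu-img convert -f qcow2 -O raw disk.qcow2 disk.raw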

When downloading the disks, you control the image format, for example
this will download
the disk in any format, collapsing all snapshots to the raw format:

 $ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/download_disk.py
-c engine-dev 3649d84b-6f35-4314-900a-5e8024e3905c /var/tmp/disk1.raw

This requires ovirt.conf file:

$ cat ~/.config/ovirt.conf
[engine-dev]
engine_url = https://engine-dev
username = admin@internal
password = mypassword
cafile = /etc/pki/vdsm/certs/cacert.pem

Nir

> Thanks in advance
>
>
> On Mon, Aug 2, 2021 at 8:22 PM Nir Soffer  wrote:
>>
>> On Mon, Aug 2, 2021 at 12:22 PM  wrote:
>> >
>> > I have  few VMs in   Redhat Virtualisation environment  RHeV ( using 
>> > Rhevm4.1 ) managed by a third party
>> >
>> > Now I am in the process of migrating  those VMs to  my cloud setup with  
>> > OpenStack ussuri  version  with KVM hypervisor and Glance storage.
>> >
>> > The third party is making down each VM and giving the each VM image  with 
>> > their attached volume disks along with it.
>> >
>> > There are three folders  which contain images for each VM .
>> > These folders contain the base OS image, and attached LVM disk images ( 
>> > from time to time they added hard disks  and used LVM for storing data ) 
>> > where data is stored.
>> >
>> > Is there a way to  get all these images to be exported as  Single image 
>> > file Instead of  multiple image files from Rhevm it self.  Is this 
>> > possible ?
>> >
>> > If possible how to combine e all these disk images to a single image and 
>> > that image  can upload to our  cloud  glance storage as a single image ?
>>
>> It is not clear which vm you are trying to export. If you share
>> the libvirt xml
>> of this vm it will be clearer. You can use "sudo virsh -r dumpxml 
>> vm-name".
>>
>> RHV supports download of disks to one image per disk, which you can move
>> to another system.
>>
>> We also have export to ova, which creates one tar file with all exported 
>> disks,
>> if this helps.
>>
>> Nir
>>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PSHISFT33TFMDOX4V42LNHNWIM3IKOPA/


[ovirt-users] Re: posix storage migration issue on 4.4 cluster

2021-08-03 Thread Nir Soffer
On Tue, Aug 3, 2021 at 5:51 PM Sketch  wrote:
>
> I currently have two clusters up and running under one engine.  An old
> cluster on 4.3, and a new cluster on 4.4.  In addition to migrating from
> 4.3 to 4.4, we are also migrating from glusterfs to cephfs mounted as
> POSIX storage (not cinderlib, though we may make that conversion after
> moving to 4.4).  I have run into a strange issue, though.
>
> On the 4.3 cluster, migration works fine with any storage backend.  On
> 4.4, migration works against gluster or NFS, but fails when the VM is
> hosted on POSIX cephfs.

What do you mean by "fails"?

What is the failing operation (move disk when vm is running or not?)
and how does it fail?

...
> It appears that the VM fails to start on the new host, but it's not
> obvious why from the logs.  Can anyone shed some light or suggest further
> debugging?

You move the disk when the vm is not running, and after the move the vm will
not start?

If this is the issue, you can check if the disk was copied correctly by creating
a checksum of the disk before the move and after the move.

Here is example run from my system:

$ cat ~/.config/ovirt.conf
[myengine]
engine_url = https://engine-dev
username = admin@internal
password = mypassword
cafile = /etc/pki/vdsm/certs/cacert.pem

$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/checksum_disk.py
-c myengine 3649d84b-6f35-4314-900a-5e8024e3905c
{
"algorithm": "blake2b",
"block_size": 4194304,
"checksum":
"d92a2491f797c148e9a6c90830ed7bd2f471099a70e931f7dd9d86853d650ece"
}

See 
https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/checksum_disk.py

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KLD5NMFNJGBXEXUPVQCD4BHQXZXO4RHF/


[ovirt-users] Re: Combining Virtual machine image with multiple disks attached

2021-08-02 Thread Nir Soffer
On Mon, Aug 2, 2021 at 12:22 PM  wrote:
>
> I have  few VMs in   Redhat Virtualisation environment  RHeV ( using Rhevm4.1 
> ) managed by a third party
>
> Now I am in the process of migrating  those VMs to  my cloud setup with  
> OpenStack ussuri  version  with KVM hypervisor and Glance storage.
>
> The third party is making down each VM and giving the each VM image  with 
> their attached volume disks along with it.
>
> There are three folders  which contain images for each VM .
> These folders contain the base OS image, and attached LVM disk images ( from 
> time to time they added hard disks  and used LVM for storing data ) where 
> data is stored.
>
> Is there a way to  get all these images to be exported as  Single image file 
> Instead of  multiple image files from Rhevm it self.  Is this possible ?
>
> If possible how to combine e all these disk images to a single image and that 
> image  can upload to our  cloud  glance storage as a single image ?

It is not clear which vm you are trying to export. If you share
the libvirt xml
of this vm it will be clearer. You can use "sudo virsh -r dumpxml vm-name".

RHV supports download of disks to one image per disk, which you can move
to another system.

We also have export to ova, which creates one tar file with all exported disks,
if this helps.
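
Since the OVA is a plain tar archive, you can inspect or unpack it with tar
(the file name is a placeholder):

tar tvf my-vm.ova                      # list the OVF descriptor and disk images
mkdir -p /var/tmp/my-vm
tar xvf my-vm.ova -C /var/tmp/my-vm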

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/I3J6IVLIEC5Y63TD2Z5AYGYU7UQ6UA43/


[ovirt-users] Re: Host not becoming active due to VDSM failure

2021-07-30 Thread Nir Soffer
On Fri, Jul 30, 2021 at 7:41 PM Vinícius Ferrão via Users
 wrote:
...
> restore-net::ERROR::2021-07-30 
> 12:34:56,167::restore_net_config::462::root::(restore) restoration failed.
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 460, in restore
> unified_restoration()
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 112, in unified_restoration
> classified_conf = _classify_nets_bonds_config(available_config)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 237, in _classify_nets_bonds_config
> net_info = NetInfo(netswitch.configurator.netinfo())
>   File 
> "/usr/lib/python3.6/site-packages/vdsm/network/netswitch/configurator.py", 
> line 323, in netinfo
> _netinfo = netinfo_get(vdsmnets, compatibility)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 268, in get
> return _get(vdsmnets)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 76, in _get
> extra_info.update(_get_devices_info_from_nmstate(state, devices))
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 165, in _get_devices_info_from_nmstate
> nmstate.get_interfaces(state, filter=devices)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 164, in 
> for ifname, ifstate in six.viewitems(
>   File "/usr/lib/python3.6/site-packages/vdsm/network/nmstate/api.py", line 
> 228, in is_dhcp_enabled
> return util_is_dhcp_enabled(family_info)
>   File 
> "/usr/lib/python3.6/site-packages/vdsm/network/nmstate/bridge_util.py", line 
> 137, in is_dhcp_enabled
> return family_info[InterfaceIP.ENABLED] and family_info[InterfaceIP.DHCP]
> KeyError: 'dhcp'

Looks like an nmstate or NetworkManager bug.

You did not mention any version - are you running the latest ovirt version?

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/AR64DVBKB5ZZBMVPTDKOCNIZ4IYIHAS4/


[ovirt-users] Re: Direct Linux kernel/initrd boot

2021-07-27 Thread Nir Soffer
On Tue, Jul 27, 2021 at 12:10 PM Yedidyah Bar David  wrote:
>
> On Tue, Jul 27, 2021 at 11:56 AM Shani Leviim  wrote:
> >
> > Hi Chris,
> > Indeed, the ISO domains are deprecated, and you can use a data domain for 
> > uploading iso files (as you've mentioned).
> > To do that, you need to use image-io for uploading images
> >
> > Here's image-io documentation: 
> > http://ovirt.github.io/ovirt-imageio/overview.html.
> >
> > Then you can use the UI (admin portal) or REST API for uploading the iso 
> > image to the relevant storage domain.
>
> Shani, I think Chris asked specifically about booting from an image. See e.g.:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1122970#c34
>
> Is that intended to be handled? It seems like we gave up on removing
> the ISO domain concept, perhaps also because of this missing feature -
> see last few comments of:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1543512
>
> Adding Michal. Michal - IMO we should make up our minds and provide a
> clear view - either undeprecate the ISO domain - remove deprecation
> notices from everywhere - or provide concrete plans to fill the
> missing gaps.

Forcing an NFS storage domain in a system with high-end FC/iSCSI storage
does not make sense, and introduces reliability issues due to unpredictable NFS
timeouts. We don't want to go back to having NFS on every system.

Uploading to data domain should work for all use cases. If something
does not work (booting from kernel image on data domain) it's a bug,
probably something that was missed when we added the feature to keep iso
disks on block storage.

The issue with data domain is sharing the same domain with multiple DCs.
This is not supported now since only one SPM can manage the storage
domain (e.g. create a new disk). But we have the infrastructure to support
this.

Of course we can support upload to an iso domain; this depends on engine setting
up the transfer properly, and probably requires new APIs in vdsm to
create the file
in the iso domain.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RNO246HN42UIHJWFZZG7FYMWKCEHE3FM/


[ovirt-users]Re: Does anyone know How to install macOS on ovirt4.4?

2021-07-21 Thread Nir Soffer
On Wed, Jul 21, 2021 at 9:33 AM zhou...@vip.friendtimes.net
 wrote:
>
>
> Does anyone know how to install macOS on oVirt 4.4?
> I used the image "macOS.Big.Sur.11.2.3.20D91.iso". It can run successfully on 
> VMware, but on oVirt I tested the boot type with both UEFI and BIOS and 
> neither can boot.

Looking in https://github.com/kholia/OSX-KVM running previous
versions of macOS is possible with qemu 4.2.0, so it can work
with oVirt.

I guess some changes are needed in the libvirt xml oVirt generates,
(maybe via a vdsm hook?) or custom vm properties (or both).

I don't think this can work out of the box unless Apple contributes
to oVirt, but this can be a community effort. It is a shame that you
can do this in VMware but not in oVirt.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FROXKQGUBJKTWFOF4FOFOCJXDDZ2I4WB/


[ovirt-users] Re: Removing Direct Mapped LUNs

2021-07-15 Thread Nir Soffer
On Thu, Jul 15, 2021 at 3:50 PM Gianluca Cecchi
 wrote:
>
> On Fri, Apr 23, 2021 at 7:15 PM Nir Soffer  wrote:
>>
>>
>> >> > 1) Is this the expected behavior?
>> >>
>> >> yes, before removing multipath devices, you need to unzone LUN on storage
>> >> server. As oVirt doesn't manage storage server in case of iSCSI, it has 
>> >> to be
>> >> done by storage sever admin and therefore oVirt cannot manage whole flow.
>> >>
>> > Thank you for the information. Perhaps you can expand then on how the 
>> > volumes are picked up once mapped from the Storage system?  Traditionally 
>> > when mapping storage from an iSCSI or Fibre Channel storage we have to 
>> > initiate a LIP or iSCSI login. How is it that oVirt doesn't need to do 
>> > this?
>> >
>> >> > 2) Are we supposed to go to each KVM host and manually remove the
>> >> > underlying multipath devices?
>> >>
>> >> oVirt provides ansible script for it:
>> >>
>> >> https://github.com/oVirt/ovirt-ansible-collection/blob/master/examples/
>> >> remove_mpath_device.yml
>> >>
>> >> Usage is as follows:
>> >>
>> >> ansible-playbook --extra-vars "lun=<lun_id>" remove_mpath_device.yml
>> >
>
>
> I had to decommission one iSCSI based storage domain, after having added one 
> new iSCSI one (with another portal) and moved all the objects into the new 
> one (vm disks, template disks, iso disks, leases).
> The Environment is based on 4.4.6, with 3 hosts, external engine.
> So I tried the ansible playbook way to verify it.
>
> Initial situation is this below; the storage domain to decommission is the 
> ovsd3750, based on the 5Tb LUN.
>
> $ sudo multipath -l
> 364817197c52f98316900666e8c2b0b2b dm-13 EQLOGIC,100E-00
> size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
> `-+- policy='round-robin 0' prio=0 status=active
>   |- 16:0:0:0 sde 8:64 active undef running
>   `- 17:0:0:0 sdf 8:80 active undef running
> 36090a0d800851c9d2195d5b837c9e328 dm-2 EQLOGIC,100E-00
> size=5.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
> `-+- policy='round-robin 0' prio=0 status=active
>   |- 13:0:0:0 sdb 8:16 active undef running
>   `- 14:0:0:0 sdc 8:32 active undef running
>
> Connections are using iSCSI multipathing (iscsi1 and iscs2 in web admin gui), 
> so that I have two paths to each LUN:
>
> $sudo  iscsiadm -m node
> 10.10.100.7:3260,1 
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
> 10.10.100.7:3260,1 
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
> 10.10.100.9:3260,1 
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920
> 10.10.100.9:3260,1 
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920
>
> $ sudo iscsiadm -m session
> tcp: [1] 10.10.100.7:3260,1 
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 
> (non-flash)
> tcp: [2] 10.10.100.7:3260,1 
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 
> (non-flash)
> tcp: [4] 10.10.100.9:3260,1 
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920 
> (non-flash)
> tcp: [5] 10.10.100.9:3260,1 
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920 
> (non-flash)
>
> One point not taken into consideration in the previously opened bugs, in my 
> opinion, is the deletion of the iSCSI connections and nodes on the host side 
> (probably to be done by the OS admin, but it could be handled by the ansible 
> playbook...)
> The bugs I'm referring are:
> Bug 1310330 - [RFE] Provide a way to remove stale LUNs from hypervisors
> Bug 1928041 - Stale DM links after block SD removal
>
> Actions done:
> put storage domain into maintenance
> detach storage domain
> remove storage domain
> remove access from equallogic admin gui
>
> I have a group named ovirt in ansible inventory composed by my 3 hosts: 
> ov200, ov300 and ov301
> executed
> $ ansible-playbook -b -l ovirt --extra-vars 
> "lun=36090a0d800851c9d2195d5b837c9e328" remove_mpath_device.yml
>
> it went all ok with ov200 and ov300, but for ov301 I got
>
> fatal: [ov301: FAILED! => {"changed": true, "cmd": "multipath -f 
> \"36090a0d800851c9d2195d5b837c9e328\"", "delta": "0:00:00.009003", "end": 
> "2021-07-15 11:17:37.340584", "msg": "non-zero return code", "rc": 1, 
> "start": "2021-07-15 11:17:37.331581", "stderr&q

[ovirt-users] Re: upgrading to 4.4.6 with Rocky Linux 8

2021-07-13 Thread Nir Soffer
On Tue, Jul 13, 2021 at 4:19 PM Marcin Sobczyk  wrote:
>
>
>
> On 7/12/21 2:11 AM, Nir Soffer wrote:
> > On Mon, Jul 12, 2021 at 1:50 AM Branimir Pejakovic  
> > wrote:
> >> It was a fresh install of 2 VMs on top of VirtualBox with Rocky fully 
> >> updated on both prior to oVirt installation. I installed it yesterday and 
> >> followed the usual way of installing it: 
> >> https://www.ovirt.org/download/alternate_downloads.html.
> >>
> >> Here are the oVirt packages that are installed on the hypervisor:
> > Unfortunately vdsm (the core package for ovirt host) does not
> > have ovirt-prefix. Which version do you have?
> > ...
> >> ovirt-imageio-client-2.2.0-1.el8.x86_64
> > This package requires qemu-img >= 5.2.0
> > Maybe the requirement is broken (missing epoch).
> As you quoted, the requirement states:
>
> %if 0%{?rhel} >= 8
> %if 0%{?centos}
> Rocky linux probably doesn't have these macros.
> OTOH 'ovirt-host' package requires plain 'libvirt',
> no version limitations, and that probably sucks in "any 'qemu-kvm'
> possible".

I expect "rhel" to be available on every rhel-like distro but centos is
most likely not available.

So we expect to use the rhel branch:

# 4.4, AV 8.4 - https://bugzilla.redhat.com/1948532
Requires: qemu-kvm >= 15:5.2.0-15.module+el8.4.0+10650+50781ca0

And this should fail if only qemu-kvm 4.2.0 is available.
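
A quick way to see which of these branches applies on a given host, and which
qemu-kvm build is actually installed, is to query the rpm macros and the package
directly; a minimal sketch (plain rpm queries wrapped in Python, only for
illustration):

# Minimal sketch: show which vdsm.spec branch applies on this host and which
# qemu-kvm build is actually installed (epoch included, since the requirement
# is qemu-kvm >= 15:5.2.0...).
import subprocess

def rpm(*args):
    return subprocess.run(["rpm", *args],
                          capture_output=True, text=True).stdout.strip()

print("rhel macro:   ", rpm("-E", "%{?rhel}") or "(not defined)")
print("centos macro: ", rpm("-E", "%{?centos}") or "(not defined)")
print("qemu-kvm:     ", rpm("-q", "--qf", "%{EPOCH}:%{VERSION}-%{RELEASE}\n",
                            "qemu-kvm"))

On a host where %{?centos} is empty and %{?rhel} is 8, the second Requires line
is the one that should be enforced.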

I could reproduce this strange behavior, even when qemu 5.2.0
is available:

1. Remove host from engine
2. dnf remove vdsm-\* qemu-\* libvirt-\*
3. dnf install qemu-img-4.2.0
   (I needed this for testing imageio patch)
4. dnf install vdsm vdsm-client

And I ended up with qemu-kvm 4.2.0 instead of 5.2.0.

5. dnf update

qemu-kvm was updated to 5.2.0.

I hope that someone in the community can help to resolve this issue.

vdsm patches can be sent using gerrit, please see:
https://github.com/oVirt/vdsm/blob/master/README.md#submitting-patches

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/P7SB4OR2DAO7BXNLWSD63PBPCKGUXYW4/


[ovirt-users] Re: upgrading to 4.4.6 with Rocky Linux 8

2021-07-13 Thread Nir Soffer
On Tue, Jul 13, 2021 at 2:57 PM Branimir Pejakovic  wrote:
...
> > On Mon, Jul 12, 2021 at 1:50 AM Branimir Pejakovic 
> >  >
> > Unfortunately vdsm (the core package for ovirt host) does not
> > have ovirt-prefix. Which version do you have?
> > ...
>
> Just in case:
>
> # rpm -qa | grep vdsm
> vdsm-jsonrpc-4.40.60.7-1.el8.noarch
> vdsm-yajsonrpc-4.40.60.7-1.el8.noarch
> vdsm-common-4.40.60.7-1.el8.noarch
> vdsm-client-4.40.60.7-1.el8.noarch
> vdsm-hook-vmfex-dev-4.40.60.7-1.el8.noarch
> vdsm-api-4.40.60.7-1.el8.noarch
> vdsm-python-4.40.60.7-1.el8.noarch
> vdsm-network-4.40.60.7-1.el8.x86_64
> vdsm-http-4.40.60.7-1.el8.noarch
> vdsm-4.40.60.7-1.el8.x86_64
>
> > This package requires qemu-img >= 5.2.0
> > Maybe the requirement is broken (missing epoch).
> >
> > What does "qemu-img --version" tell?
>
> # qemu-img --version
> qemu-img version 4.2.0 (qemu-kvm-4.2.0-48.module+el8.4.0+534+4680a14e)
> Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
>
> During the installation itself, there were no errors/surprises. Everything 
> went smoothly.

So vdsm spec is broken on Rocky, letting you install vdsm when the required
qemu-kvm version is not available.

Can you file an ovirt/vdsm bug for this?
https://bugzilla.redhat.com/enter_bug.cgi?product=vdsm

I tried to add a Rocky host to my setup and I can confirm that it works.

I created a new VM from:
https://download.rockylinux.org/pub/rocky/8/isos/x86_64/Rocky-8.4-x86_64-dvd1.iso

Installed ovirt-release44.rpm from:
https://resources.ovirt.org/pub/yum-repo/ovirt-release44.rpm

And added the host to my engine (4.4.8 master). The host was installed
and activated, and can run VMs.

I found that migrating VMs from other hosts (qemu 6.0, libvirt 7.4)
does not work. I think this issue was already reported here and a fix is
expected from libvirt soon.

However, I do get the *right* version of qemu-kvm, provided by:

[root@rocky1 ~]# dnf info qemu-kvm
Last metadata expiration check: 18:02:00 ago on Tue 13 Jul 2021 02:37:31 AM IDT.
Installed Packages
Name : qemu-kvm
Epoch: 15
Version  : 5.2.0
Release  : 16.el8
Architecture : x86_64
Size : 0.0
Source   : qemu-kvm-5.2.0-16.el8.src.rpm
Repository   : @System
From repo: ovirt-4.4-advanced-virtualization

[root@rocky1 ~]# grep -A2 ovirt-4.4-advanced-virtualization
/etc/yum.repos.d/ovirt-4.4-dependencies.repo
[ovirt-4.4-advanced-virtualization]
name=Advanced Virtualization packages for $basearch
mirrorlist=http://mirrorlist.centos.org/?arch=$basearch&release=8&repo=virt-advanced-virtualization

So oVirt is ready to rock on Rocky with a little help from CentOS :-)

I'm not sure about the future of the ovirt-4.4-advanced-virtualization
repository after CentOS 8 is discontinued, so this does not look
production ready yet.

The advanced virtualization packages are likely needed by other virtualization
systems (OpenStack, Proxmox, ...) or anyone who wants to consume the latest
features from libvirt and qemu, so packaging them in oVirt is not the
right solution.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XFAMHNHU47OCXP3T7UBRKGZEAGA7BYS5/


[ovirt-users] Re: upgrading to 4.4.6 with Rocky Linux 8

2021-07-12 Thread Nir Soffer
On Mon, Jul 12, 2021 at 8:59 PM Hayden Young via Users  wrote:
>
> I've been giving this a look and it seems that we aren't building the
> advanced virt modules because CentOS builds them from upstream?
>
> I've found no mention of them in their Pagure, and they're built on
> their Community Build System via a SIG, with the metadata set on them
> as `Extra: {'source': {'original_url': 'libvirt-7.0.0-
> 14.1.el8.src.rpm'}}`.
>
> My colleague Neil looked into it, and concluded it seems to be a CLI
> build being manually run(?).
>
> We could investigate building that, but I'm not sure how good we'd be
> to do so as it would likely involve repackaging straight from RHEL
> sources via a RHEL machine.
>
> Anyway, happy to help in any way I can on this, I'm in our
> SIG/Virtualization channel on Mattermost if anyone wants to get to me
> easily.

Great to hear that; oVirt needs a stable replacement for CentOS.

I hope that Sandro will be able to help with this.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CASYORGXHZBR6KYDILDIYGNSIOVM3MEN/


[ovirt-users] Re: Ovirt Storage not reclaim

2021-07-12 Thread Nir Soffer
On Mon, Jul 12, 2021 at 7:00 AM Ranesh Tharanga Perera
 wrote:
>
> Many thanks for the info.
>
> We don't need that option. How we can disable that? just only uncheck that 
> property box ? Is there any downtime?
> What happens if we delete that VM and recreate it? All space will reclaim?

You need to put the domain into maintenance, uncheck the option,
and activate the domain.

If you have vms using this domain, you need to shut them down,
or migrate the disk to another storage, and then migrate the vm
back to the storage.

Since migrating the disk requires copying the disks twice, maybe it
is better to create new LUN(s) on the server side, create a new
storage domain (without wipe-after-delete), and migrate the disks
to the new storage domain (no downtime required). Finally, when the
old storage domain is unused, remove it. With this flow you need
to copy the data only once.

If you can have a short downtime, shutting down the VMs and modifying
the domain during a maintenance window may be better.

I'm not sure why we require moving a storage domain to maintenance
for changing such configuration; this sounds completely unnecessary,
so I think an RFE for allowing online configuration changes is a good idea.
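
For reference, the maintenance/update/activate cycle can also be scripted with
the Python SDK instead of the UI; a minimal sketch, assuming the ovirtsdk4
attribute names wipe_after_delete/discard_after_delete and placeholder engine
URL, credentials, data center and domain names:

# Minimal sketch (not a supported tool): toggle the wipe/discard flags on an
# attached storage domain with the oVirt Python SDK. URL, credentials, data
# center and domain names are placeholders.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

conn = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",
    username="admin@internal",
    password="password",
    ca_file="ca.pem",
)
system = conn.system_service()

dc = system.data_centers_service().list(search="name=mydc")[0]
attached_sds = (system.data_centers_service()
                .data_center_service(dc.id)
                .storage_domains_service())
sd = next(s for s in attached_sds.list() if s.name == "mydata")
attached_sd = attached_sds.storage_domain_service(sd.id)

# Move the attached domain to maintenance (asynchronous - a real script
# should poll until the status is actually "maintenance").
attached_sd.deactivate()

# Update the flags on the top-level storage domain object.
system.storage_domains_service().storage_domain_service(sd.id).update(
    types.StorageDomain(wipe_after_delete=False, discard_after_delete=True))

# Activate the domain again.
attached_sd.activate()
conn.close()

A real script should also wait for the domain to become active again at the end
before returning VMs to it.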

Nir

>
> Thanks,
> Ranesh..
>
>
> On Mon, Jul 12, 2021 at 3:46 AM Nir Soffer  wrote:
>>
>> On Mon, Jul 12, 2021 at 12:27 AM Ranesh Tharanga Perera
>>  wrote:
>> >
>> > We have enabled both options, discard-after-delete and wipe-after-delete,
>> > but nothing is working. I have tried creating a small disk (like 1 GB);
>> > after deleting that disk, the space was reclaimed successfully.
>> > It seems this happens for larger disks.
>>
>> If it worked for a 1 GB disk, it should work for a bigger disk.
>>
>> If you can reproduce this again, please file an oVirt/vdsm bug and attach
>> vdsm logs and /var/log/messages (or the output of journalctl).
>>
>> BTW, do you have a real need for wipe-after-delete? This can make
>> deletion much slower, since it may need to allocate the entire disk on
>> the server side for writing zeroes just to deallocate all space right
>> after that when we discard the disk.
>>
>> This option is needed only if you want to ensure that sensitive data
>> cannot leak to another disk on the same storage, maybe used by
>> another tenant.
>>
>> > On Mon, Jul 12, 2021 at 1:10 AM Nir Soffer  wrote:
>> >>
>> >> On Sun, Jul 11, 2021 at 5:03 PM Ranesh Tharanga Perera
>> >>  wrote:
>> >> >
>> >> > I have attached  7TB disk ( pre allocated)  for VM ( Redhat 8 ).. Due 
>> >> > to some space issue we have unattached disk from VM and deleted it.
>> >>
>> >> Did you enable the "discard after delete" option in the storage domain
>> >> advanced options?
>> >>
>> >> If  you did not, the logical volume is deleted, but the blocks allocated 
>> >> on the
>> >> server side from this logical volume are not discarded.
>> >>
>> >> Since you deleted the logical volume, it is not possible to discard it
>> >> now, but you
>> >> do this:
>> >>
>> >> Try this:
>> >> - Put the storage domain to maintenance
>> >> - Enable "discard-after delete"
>> >> - Active the storage domain
>> >> - Create new raw preallocated disk filling the entire domain
>> >> - Delete the disk
>> >>
>> >> Nir
>>
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/CQKAZBOHAPR4IB3ZZUR3IF3S2IAJCA6E/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GIJ3BE3WOV5XX4DXDXAWU5GFXGWYOLMX/


[ovirt-users] Re: upgrading to 4.4.6 with Rocky Linux 8

2021-07-11 Thread Nir Soffer
On Mon, Jul 12, 2021 at 1:50 AM Branimir Pejakovic  wrote:
> It was a fresh install of 2 VMs on top of VirtualBox with Rocky fully updated 
> on both prior to oVirt installation. I installed it yesterday and followed 
> the usual way of installing it: 
> https://www.ovirt.org/download/alternate_downloads.html.
>
> Here are the oVirt packages that are installed on the hypervisor:

Unfortunately vdsm (the core package for an oVirt host) does not
have the ovirt- prefix. Which version do you have?
...
> ovirt-imageio-client-2.2.0-1.el8.x86_64

This package requires qemu-img >= 5.2.0
Maybe the requirement is broken (missing epoch).

What does "qemu-img --version" tell?

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FGKGSVRMFVED3HK4KGFL6JZRV574YMKW/


[ovirt-users] Re: Ovirt Storage not reclaim

2021-07-11 Thread Nir Soffer
On Mon, Jul 12, 2021 at 12:27 AM Ranesh Tharanga Perera
 wrote:
>
> We have enabled both options, discard-after-delete and wipe-after-delete, but
> nothing is working. I have tried creating a small disk (like 1 GB); after
> deleting that disk, the space was reclaimed successfully.
> It seems this happens for larger disks.

If it worked for a 1 GB disk, it should work for a bigger disk.

If you can reproduce this again, please file an oVirt/vdsm bug and attach
vdsm logs and /var/log/messages (or the output of journalctl).

BTW, do you have a real need for wipe-after-delete? This can make
deletion much slower, since it may need to allocate the entire disk on
the server side for writing zeroes just to deallocate all space right
after that when we discard the disk.

This option is needed only if you want to ensure that sensitive data
cannot leak to another disk on the same storage, maybe used by
another tenant.

> On Mon, Jul 12, 2021 at 1:10 AM Nir Soffer  wrote:
>>
>> On Sun, Jul 11, 2021 at 5:03 PM Ranesh Tharanga Perera
>>  wrote:
>> >
>> > I have attached  7TB disk ( pre allocated)  for VM ( Redhat 8 ).. Due to 
>> > some space issue we have unattached disk from VM and deleted it.
>>
>> Did you enable the "discard after delete" option in the storage domain
>> advanced options?
>>
>> If  you did not, the logical volume is deleted, but the blocks allocated on 
>> the
>> server side from this logical volume are not discarded.
>>
>> Since you deleted the logical volume, it is not possible to discard it
>> now, but you
>> do this:
>>
>> Try this:
>> - Put the storage domain to maintenance
>> - Enable "discard-after delete"
>> - Active the storage domain
>> - Create new raw preallocated disk filling the entire domain
>> - Delete the disk
>>
>> Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SS6UN23JBG55N4VTZ7YECJPQEORNUL3V/


[ovirt-users] Re: Ovirt Storage not reclaim

2021-07-11 Thread Nir Soffer
On Sun, Jul 11, 2021 at 5:03 PM Ranesh Tharanga Perera
 wrote:
>
> I have attached  7TB disk ( pre allocated)  for VM ( Redhat 8 ).. Due to some 
> space issue we have unattached disk from VM and deleted it.

Did you enable the "discard after delete" option in the storage domain
advanced options?

If you did not, the logical volume is deleted, but the blocks allocated on the
server side for this logical volume are not discarded.

Since you deleted the logical volume, it is not possible to discard it now,
but you can try this:
- Put the storage domain to maintenance
- Enable "discard-after delete"
- Active the storage domain
- Create new raw preallocated disk filling the entire domain
- Delete the disk

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SONKA6KXUNJ7YHU2NLEGOVJWIAYT5LH7/


[ovirt-users] Re: upgrading to 4.4.6 with Rocky Linux 8

2021-07-10 Thread Nir Soffer
On Sat, Jul 10, 2021 at 6:55 PM  wrote:
>
> Hi Nir
>
> Thank you for your explanation.
>
> Can I ask you if you can explain this a bit further? I performed and 
> experiment and installed ovirt + a hypervisor on 2 VMs (on the hypervisor VM 
> I enabled nested virtualization) - both based on Rocky Linux following the 
> official oVirt instructions for 4.4.7. No problems there - I also created a 
> vm and successfully ran it.

How did you install ovirt 4.4.7, when we require qemu-kvm >= 5.2.0?

I see this in vdsm.spec:

%if 0%{?rhel} >= 8
%if 0%{?centos}
# 4.4 Advanced virt stream on CentOS 8
Requires: qemu-kvm >= 15:5.2.0
%else
# 4.4, AV 8.4 - https://bugzilla.redhat.com/1948532
Requires: qemu-kvm >= 15:5.2.0-15.module+el8.4.0+10650+50781ca0
%endif #centos
%endif #rhel

Maybe you installed an old RC version, before we updated the requirement?

> Based on this, Rocky - at the moment - replaces CentOS8 just nicely. And 
> while libvirt installed  is libvirt-7.0.0-14.1.el8.x86_64 and comes from 
> CentOS8 advanced virt repo, qemu on the hypervisor machine is 
> qemu-kvm-4.2.0-48.module+el8.4.0+534+4680a14e.x86_64 not 5.2.0 and comes from 
> Rocky's repos. Does it mean that, once CentOS8 reaches EOL at the end of this 
> year, we should only hope that Rocky releases libvirt-7.0.0 in their 
> advanced-virt repo?
>
> A snippet from oVirt GUI - hypervisor properties:
>
> OS Version: RHEL - 8.4 - 30.el8
> OS Description: Rocky Linux 8.4 (Green Obsidian)
> Kernel Version: 4.18.0 - 305.7.1.el8_4.x86_64
> KVM Version: 4.2.0 - 48.module+el8.4.0+534+4680a14e
> LIBVIRT Version: libvirt-7.0.0-14.1.el8

This version is good enough for ovirt 4.4.7.

>
> and confirmation:
>
> # rpm -q qemu-kvm
> qemu-kvm-4.2.0-48.module+el8.4.0+534+4680a14e.x86_64

This version does not support features expected by vdsm.  I wonder how you have
new libvirt with old qemu.

- qemu-img convert does not support the --bitmaps option, used to copy bitmaps
  created during incremental backup when moving disks.
- The qemu-img bitmap sub-command is not available; used to add, remove and merge
  bitmaps in various flows.
- qemu-nbd does not support the --allocation-depth option; used to report holes in
  qcow2 images.
- There may be other missing features that libvirt 7.0.0 depends on.

You may have luck with flows that do not use the missing features.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DLTOCBGHVKCCSHUYVCNBIFLCPVCDUI6B/


[ovirt-users] Re: Create new disk failure

2021-07-09 Thread Nir Soffer
On Fri, Jul 9, 2021 at 9:35 PM Gangi Reddy  wrote:
> Software version: 4.4.6.7-1.el8
>
> Error: VDSM server command HSMGetAllTasksStatusesVDS failed: value=Error 
> creating a new volume: ("Volume creation 8f509d4b-6d37-44c5-aa37-acba17391143 
> failed: (28, 'Sanlock resource write failure', 'No space left on device')",) 
> abortedcode=205

This means the host did not join the lockspace yet, but this should be
impossible on the SPM host, which is the only host that can create new volumes.
A host cannot become SPM without acquiring a sanlock lease, and this is not
possible without having a lockspace.

Please file an oVirt bug for this with the engine and vdsm logs from the SPM host.

Trying the operation again is likely to succeed. If there was an issue with
sanlock, it is likely already resolved, since the system monitors sanlock
status and recovers from such issues automatically.
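
A quick way to check this on the SPM host is to look at the sanlock lockspaces
directly; a minimal sketch (sanlock client status is the standard CLI, the UUID
is a placeholder for the storage domain in the error):

# Minimal sketch: check that this host joined the lockspace of the storage
# domain from the error. Joined lockspaces show up as "s <sd_uuid>:..." lines
# in `sanlock client status`. The UUID below is a placeholder.
import subprocess

SD_UUID = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

out = subprocess.run(["sanlock", "client", "status"],
                     capture_output=True, text=True).stdout
joined = any(line.startswith("s ") and SD_UUID in line
             for line in out.splitlines())
print("lockspace joined:", joined)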

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SJL3GCPFSOE2CMWEKUQKS4UL6NI5FTV7/


[ovirt-users] Re: Unable to migrate VMs to or from oVirt node 4.4.7

2021-07-09 Thread Nir Soffer
On Fri, Jul 9, 2021 at 5:57 PM nroach44--- via Users  wrote:
>
> Hi All,
>
> After upgrading some of my hosts to 4.4.7, and after fixing the policy issue, 
> I'm no longer able to migrate VMs to or from 4.4.7 hosts. Starting them works 
> fine regardless of the host version.
>
> HE 4.4.7.6-1.el8, Linux and Windows VMs.
>
> The log on the receiving end (4.4.7 in this case):
> VDSM:
> 2021-07-09 22:02:17,491+0800 INFO  (libvirt/events) [vds] Channel state for 
> vm_id=5d11885a-37d3-4f68-a953-72d808f43cdd changed from=UNKNOWN(-1) 
> to=disconnected(2) (qemuguestagent:289)
> 2021-07-09 22:02:55,537+0800 INFO  (libvirt/events) [virt.vm] 
> (vmId='5d11885a-37d3-4f68-a953-72d808f43cdd') underlying process disconnected 
> (vm:1134)
> 2021-07-09 22:02:55,537+0800 INFO  (libvirt/events) [virt.vm] 
> (vmId='5d11885a-37d3-4f68-a953-72d808f43cdd') Release VM resources (vm:5313)
> 2021-07-09 22:02:55,537+0800 INFO  (libvirt/events) [virt.vm] 
> (vmId='5d11885a-37d3-4f68-a953-72d808f43cdd') Stopping connection 
> (guestagent:438)
> 2021-07-09 22:02:55,539+0800 INFO  (libvirt/events) [virt.vm] 
> (vmId='5d11885a-37d3-4f68-a953-72d808f43cdd') Stopping connection 
> (guestagent:438)
> 2021-07-09 22:02:55,539+0800 INFO  (libvirt/events) [vdsm.api] START 
> inappropriateDevices(thiefId='5d11885a-37d3-4f68-a953-72d808f43cdd') 
> from=internal, task_id=7abe370b-13bc-4c49-bf02-2e40db142250 (api:48)
> 2021-07-09 22:02:55,544+0800 WARN  (vm/5d11885a) [virt.vm] 
> (vmId='5d11885a-37d3-4f68-a953-72d808f43cdd') Couldn't destroy incoming VM: 
> Domain not found: no domain with matching uuid 
> '5d11885a-37d3-4f68-a953-72d808f43cdd' (vm:4046)
> 2021-07-09 22:02:55,544+0800 INFO  (vm/5d11885a) [virt.vm] 
> (vmId='5d11885a-37d3-4f68-a953-72d808f43cdd') Changed state to Down: VM 
> destroyed during the startup (code=10) (vm:1895)
>
> syslog shows:
> Jul 09 22:35:01 HOSTNAME abrt-hook-ccpp[177862]: Process 177022 (qemu-kvm) of 
> user 107 killed by SIGABRT - dumping core
>
> qemu:
> qemu-kvm: ../util/yank.c:107: yank_unregister_instance: Assertion 
> `QLIST_EMPTY(>yankfns)' failed.
> 2021-07-09 14:02:54.521+: shutting down, reason=failed

Looks like another qemu 6.0.0 regression. Please file an oVirt bug for this.

Note that on RHEL we are still using qemu 5.2.0. qemu 6.0.0 is expected
in RHEL 8.5.

> When migrating from 4.4.7 to 4.4.6, syslog shows:
> Jul 09 22:36:36 HOSTNAME libvirtd[2775]: unsupported configuration: unknown 
> audio type 'spice'

Sharing the VM XML can help in understanding this issue.

Milan, did we test migration from 4.4.7 to 4.4.6?

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OXBKQ5TZ4G64TU4XFZVYENA4BB4OLT6K/


[ovirt-users] Re: blok size and 512e and 4k

2021-07-07 Thread Nir Soffer
On Wed, Jul 7, 2021 at 6:54 PM Anatoliy Radchenko
 wrote:
>
Adding back users@ovirt.org. We want to keep the discussion public
and searchable.

> But,thinking,
> in case you want to create the data domain posix compliant with a block in 4k 
> you cannot do this because you still have 512 block on engine and gives an 
> error about the incompatibility of 512 and 4k.

A POSIX data domain is just an NFS data domain under the hood, with
modified mount options, so it shares the same limits as an NFS data
domain.

I'm not sure how you get the compatibility issue - do you mean provisioning
engine on a host with a 4k disk (using libvirt and local storage) and then copying
to a data domain created on the same local storage (512 bytes sector size)?

>  How to be in this case?

I think this should work as is, since the bootstrap VM created using libvirt
is using a 512 bytes sector size, regardless of the underlying storage sector
size (qemu hides the actual sector size from the VM). So the VM starts
with a 512 bytes sector size, and then moves to another storage which again
has a 512 bytes sector size.

This also works when the other storage uses a 4k sector size (e.g. a
hyperconverged setup using Gluster and VDO), since again qemu hides the
underlying storage sector size.

> Il giorno mer 7 lug 2021 alle ore 17:45 Anatoliy Radchenko 
>  ha scritto:
>>
>> Hi Nir,
>> with transferring engine I mean moving it after installation to the hosted
>> engine storage domain, but not important.
>> In any case thank you for your response, everything is just as I thought.
>> Regards
>>
>> Il giorno mer 7 lug 2021 alle ore 17:32 Nir Soffer  ha 
>> scritto:
>>>
>>> On Wed, Jul 7, 2021 at 5:58 PM  wrote:
>>> > With some installations of Ovirt in the configuration of ssd for boot OS 
>>> > and hdd (4k) for data domain, engine is installed on ssd with a block 
>>> > size of 512e and then transferred to hdd 4k.
>>>
>>> What do we mean by transferring engine to another disk?
>>>
>>> Engine is installed either on a host, or in a vm. On a vm, it always
>>> using sector
>>> size of 512 bytes, since we don't support vms with sector size of 4k. If 
>>> enigne
>>> in installed on a host, there is no such thing as "transfer" to
>>> another host. You
>>> can reinstall engine on another host and restore engine database using a
>>> backup. Sector size does not matter in this flow.
>>>
>>> Sector size for data domain is relevant only for gluster storage or
>>> local storage.
>>> These are the only storage domain types that can use 4k storage.
>>>
>>> > and if I wanted to leave the engine on ssd (in a separate partition which 
>>> > will also have a size of 512e), I encountered an incompatibility error 
>>> > between block sizes 512 and 4k (example glusterfs or when the data domain 
>>> > was configured as posix compliant fs). NFS domain passed.
>>> > As I understand it, the NFS uses 512e as well as the drivers of the guest 
>>> > machines,
>>>
>>> Yes, NFS is always using a sector size of 512 bytes since we don't
>>> have any way to
>>> detect the underlying device sector size.
>>>
>>> > and a VM created on a domain with a 4k block will still have 512e. As a 
>>> > result, we have a block of size 512 in any case.
>>>
>>> Yes, vms always see sector size of 512 bytes. qemu knows the underlying 
>>> storage
>>> sector size and align I/O requests to the underlying storage sector size.
>>>
>>> > Does it make sense to use devices with a 4k block at the present time? Is 
>>> > there any configuration to take advantage of 4K?
>>>
>>> Yes, the most attractive use case is VDO, which is optimized for 4k sector 
>>> size.
>>>
>>> Another use case is having disks with 4k sector size which may be cheaper or
>>> easier to get compared with disk supporting 512e.
>>>
>>> For block storage oVirt does not support devices with 4k sector size yet.
>>>
>>> Nir
>>>
>>
>>
>> --
>> _
>>
>> Radchenko Anatolii
>> via Manoppello, 83 - 00132 Roma
>> tel.   06 96044328
>> cel.  329 6030076
>>
>> Confidentiality note: pursuant to and for the purposes of the Personal Data
>> Protection Law (Law 196/03), please note that this message, together with its
>> attachments, contains information to be considered strictly confidential and
>> is intended exclusively for the addressee

[ovirt-users] Re: blok size and 512e and 4k

2021-07-07 Thread Nir Soffer
On Wed, Jul 7, 2021 at 5:58 PM  wrote:
> With some installations of Ovirt in the configuration of ssd for boot OS and 
> hdd (4k) for data domain, engine is installed on ssd with a block size of 
> 512e and then transferred to hdd 4k.

What do we mean by transferring engine to another disk?

Engine is installed either on a host or in a VM. In a VM, it is always using a
sector size of 512 bytes, since we don't support VMs with a sector size of 4k.
If engine is installed on a host, there is no such thing as "transfer" to
another host. You can reinstall engine on another host and restore the engine
database using a backup. Sector size does not matter in this flow.

Sector size for data domain is relevant only for gluster storage or
local storage.
These are the only storage domain types that can use 4k storage.

> and if I wanted to leave the engine on ssd (in a separate partition which 
> will also have a size of 512e), I encountered an incompatibility error 
> between block sizes 512 and 4k (example glusterfs or when the data domain was 
> configured as posix compliant fs). NFS domain passed.
> As I understand it, the NFS uses 512e as well as the drivers of the guest 
> machines,

Yes, NFS is always using a sector size of 512 bytes since we don't
have any way to
detect the underlying device sector size.

> and a VM created on a domain with a 4k block will still have 512e. As a 
> result, we have a block of size 512 in any case.

Yes, vms always see sector size of 512 bytes. qemu knows the underlying storage
sector size and align I/O requests to the underlying storage sector size.

> Does it make sense to use devices with a 4k block at the present time? Is 
> there any configuration to take advantage of 4K?

Yes, the most attractive use case is VDO, which is optimized for 4k sector size.

Another use case is having disks with a 4k sector size, which may be cheaper or
easier to get compared with disks supporting 512e.

For block storage oVirt does not support devices with 4k sector size yet.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MSXWMFISNIVWEEGTMBSM5HQMNAIWFIVB/


[ovirt-users] Re: Any way to terminate stuck export task

2021-07-06 Thread Nir Soffer
On Tue, Jul 6, 2021 at 5:55 PM Gianluca Cecchi
 wrote:
>
> On Tue, Jul 6, 2021 at 2:52 PM Nir Soffer  wrote:
>
>>
>>
>> Too bad.
>>
>> You can evaluate how ovirt 4.4. will work with this appliance using
>> this dd command:
>>
>> dd if=/dev/zero bs=8M count=38400 of=/path/to/new/disk
>> oflag=direct conv=fsync
>>
>> We don't use dd for this, but the operation is the same on NFS < 4.2.
>>
>
> I confirm I'm able to saturate the 1Gb/s link. tried creating a 10Gb file on 
> the StoreOnce appliance
>  # time dd if=/dev/zero bs=8M count=1280 
> of=/rhev/data-center/mnt/172.16.1.137\:_nas_EXPORT-DOMAIN/ansible_ova/test.img
>  oflag=direct conv=fsync
> 1280+0 records in
> 1280+0 records out
> 10737418240 bytes (11 GB) copied, 98.0172 s, 110 MB/s
>
> real 1m38.035s
> user 0m0.003s
> sys 0m2.366s
>
> So are you saying that after upgrading to 4.4.6 (or just released 4.4.7) I 
> should be able to export with this speed?

The preallocation part will run at the same speed, and then
you need to copy the used parts of the disk, with the time depending
on how much data is used.

>  Or anyway I do need NFS v4.2?

That is without NFS 4.2. With NFS 4.2 the entire allocation will take less than
a second without consuming any network bandwidth.

> BTW: is there any capping put in place by oVirt on the export phase (the
> qemu-img command in practice)? Designed for example not to perturb the
> activity of the hypervisor? Or do you think that if I have a 10Gb/s network
> backend and powerful disks on oVirt and powerful NFS server processing power
> I should have much more speed?

We don't have any capping in place; usually people complain that copying
images is too slow.

In general, when copying to file-based storage we don't use the -W option
(unordered writes), so the copy will be slower compared with block-based
storage, where qemu-img uses 8 concurrent writes. So in a way we always
cap the copies to file-based storage. To get maximum throughput you need
to run multiple copies at the same time.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VE6X6ASHETSPLMQ4HTENF4D5UQPV7HQL/


[ovirt-users] Re: what happens to vms when a host shutdowns?

2021-07-06 Thread Nir Soffer
On Tue, Jul 6, 2021 at 5:58 PM Scott Worthington
 wrote:
>
>
>
> On Tue, Jul 6, 2021 at 8:13 AM Nir Soffer  wrote:
>>
>> On Tue, Jul 6, 2021 at 2:29 PM Sandro Bonazzola  wrote:
>>>
>>>
>>>
>>> Il giorno mar 6 lug 2021 alle ore 13:03 Nir Soffer  ha 
>>> scritto:
>>>>
>>>> On Tue, Jul 6, 2021 at 1:11 PM Nathanaël Blanchet  wrote:
>>>> > We are installing UPS powerchute client on hypervisors.
>>>> >
>>>> > What is the default vms behaviour of running vms when an hypervisor is
>>>> > ordered to shutdown: do the vms live migrate or do they shutdown
>>>> > properly (even the restart on an other host because of HA) ?
>>>>
>>>> In general VMs are not restarted after an unexpected shutdown, but HA VMs
>>>> are restarted after failures.
>>>>
>>>> If the HA VM has a lease, it can restart safely on another host regardless 
>>>> of
>>>> the original host status. If the HA VM does not have a lease, the system 
>>>> must
>>>> wait until the original host is up again to check if the VM is still
>>>> running on this
>>>> host.
>>>>
>>>> Arik can add more details on this.
>>>
>>>
>>> I think the question is not related to what happens after the host is back.
>>> I think the question is what happens when the host goes down.
>>> To me, the right way to shutdown a host is putting it first to maintenance 
>>> (VM evacuate to other hosts) and then shutdown.
>>
>>
>> Right, but we don't have integration with the UPS, so engine cannot put the
>> host into maintenance when the host loses power and the UPS shuts it down
>> after a few minutes.
>
>
> This is outside of the scope of oVirt team:
>
> Perhaps one could combine multiple applications ( NUT + Ansible + 
> Nagios/Zabbix ) to notify the oVirt engine to switch a host to maintenance?
>
> NUT[0] could be configured to alert a monitoring system ( like Nagios or 
> Zabbix) to trigger an Ansible playbook [1][2] to put the host in maintenance 
> mode, and the trigger should happen before the UPS battery is depleted 
> (you'll have to account for the time it takes to live migrate VMs).

I would trigger this once power is lost. You never know how much time
migration will take, so it is best to migrate all VMs immediately.

It would be nice to integrate this with engine, but we can start with something
like you describe, which will use the engine API/SDK to prepare the hosts for
graceful shutdown.
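
A starting point for such a script could be the Python SDK (or the ovirt_host
ansible module linked below); a minimal sketch, with placeholder engine URL,
credentials and cluster name, that could be triggered by the UPS "on battery"
event:

# Minimal sketch: on a UPS "on battery" event, move all "up" hosts of a
# cluster to maintenance so their VMs are live-migrated away before the
# shutdown. Engine URL, credentials and cluster name are placeholders; the
# search syntax may need adjusting for your setup.
import ovirtsdk4 as sdk

conn = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",
    username="admin@internal",
    password="password",
    ca_file="ca.pem",
)

hosts_service = conn.system_service().hosts_service()
for host in hosts_service.list(search="cluster=mycluster and status=up"):
    print("moving host %s to maintenance" % host.name)
    # deactivate() is asynchronous; engine migrates the VMs away and the
    # host ends up in maintenance once they are gone.
    hosts_service.host_service(host.id).deactivate()

conn.close()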

> [0] Network UPS Tools 
> https://networkupstools.org/docs/user-manual.chunked/index.html
> [1] 
> https://www.ovirt.org/develop/release-management/features/infra/ansible_modules.html
> [2] 
> https://docs.ansible.com/ansible/latest/collections/ovirt/ovirt/ovirt_host_module.html
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/D43UXSOGTE7QHJTHUCCW63MWCYH3YM3M/


[ovirt-users] Re: Failing to migrate hosted engine from 4.4.6 host to 4.4.7 host

2021-07-06 Thread Nir Soffer
On Tue, Jul 6, 2021 at 4:27 PM Sandro Bonazzola  wrote:

>>> This looks like the selinux issue we had in libvirt 7.4. Do we have the 
>>> latest
>>> selinux-policy-target package on the host?
>>
>>
>> Yeah, seems like https://bugzilla.redhat.com/show_bug.cgi?id=1964317
>
>
> In order to get the migration working this solved:
>  restorecon /var/run/libvirt/common/system.token
>  ls -lZ /var/run/libvirt/common/system.token
> -rw---. 1 root root system_u:object_r:virt_common_var_run_t:s0 32 Jul  6 
> 09:29 /var/run/libvirt/common/system.token
> service libvirtd restart
> service virtlogd restart

On the source or destination?
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3NM6XWIVWE7NBSWWKHDD52OV5JUMYWNN/


[ovirt-users] Re: Failing to migrate hosted engine from 4.4.6 host to 4.4.7 host

2021-07-06 Thread Nir Soffer
On Tue, Jul 6, 2021 at 3:36 PM Sandro Bonazzola  wrote:

> Hi,
> I update the hosted engine to 4.4.7 and one of the 2 nodes where the
> engine is running.
> Current status is:
> - Hosted engine at 4.4.7 running on Node 0
> - Node 0 at 4.4.6
> - Node 1 at 4.4.7
>
> Now, moving Node 0 to maintenance successfully moved the SPM from Node 0
> to Node 1 but while trying to migrate hosted engine I get on Node 0
> vdsm.log:
>
...

>   File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2119, in 
> migrateToURI3
> raise libvirtError('virDomainMigrateToURI3() failed')
> libvirt.libvirtError: can't connect to virtlogd: Unable to open system token 
> /run/libvirt/common/system.token: Permission denied
>
>
This looks like the selinux issue we had in libvirt 7.4. Do we have the
latest selinux-policy-targeted package on the host?
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GWSOG2KKMYYASJJZVDG7SIVEXWL5P3XD/


[ovirt-users] Re: Any way to terminate stuck export task

2021-07-06 Thread Nir Soffer
On Tue, Jul 6, 2021 at 10:21 AM Gianluca Cecchi
 wrote:
>
> On Mon, Jul 5, 2021 at 5:06 PM Nir Soffer  wrote:
>
>>
>>
>> qemu-img is busy in posix_fallocate(), writing one byte to every 4k block.
>>
>> If you add -tt -T (as I suggested), we can see how much time each write 
>> takes,
>> which may explain why this takes so much time.
>>
>> strace -f -p 14342 -tt -T
>>
>
> It seems I missed part of your suggestion... i didn't get the "-tt -T" (or I 
> didn't see it...)
>
> With it I get this during the export (in networking of host console 4 
> mbit/s):
>
> # strace -f -p 25243 -tt -T
> strace: Process 25243 attached with 2 threads
> [pid 25243] 09:17:32.503907 ppoll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, 
> NULL, NULL, 8 
> [pid 25244] 09:17:32.694207 pwrite64(12, "\0", 1, 3773509631) = 1 <0.000059>
> [pid 25244] 09:17:32.694412 pwrite64(12, "\0", 1, 3773513727) = 1 <0.000078>
> [pid 25244] 09:17:32.694608 pwrite64(12, "\0", 1, 3773517823) = 1 <0.000056>
> [pid 25244] 09:17:32.694729 pwrite64(12, "\0", 1, 3773521919) = 1 <0.000024>
> [pid 25244] 09:17:32.694796 pwrite64(12, "\0", 1, 3773526015) = 1 <0.000020>
> [pid 25244] 09:17:32.694855 pwrite64(12, "\0", 1, 3773530111) = 1 <0.000015>
> [pid 25244] 09:17:32.694908 pwrite64(12, "\0", 1, 3773534207) = 1 <0.000014>
> [pid 25244] 09:17:32.694950 pwrite64(12, "\0", 1, 3773538303) = 1 <0.000016>
> [pid 25244] 09:17:32.694993 pwrite64(12, "\0", 1, 3773542399) = 1 <0.200032>
> [pid 25244] 09:17:32.895140 pwrite64(12, "\0", 1, 3773546495) = 1 <0.000034>
> [pid 25244] 09:17:32.895227 pwrite64(12, "\0", 1, 3773550591) = 1 <0.000029>
> [pid 25244] 09:17:32.895296 pwrite64(12, "\0", 1, 3773554687) = 1 <0.000024>
> [pid 25244] 09:17:32.895353 pwrite64(12, "\0", 1, 3773558783) = 1 <0.000016>
> [pid 25244] 09:17:32.895400 pwrite64(12, "\0", 1, 3773562879) = 1 <0.000015>
> [pid 25244] 09:17:32.895443 pwrite64(12, "\0", 1, 3773566975) = 1 <0.000015>
> [pid 25244] 09:17:32.895485 pwrite64(12, "\0", 1, 3773571071) = 1 <0.000015>
> [pid 25244] 09:17:32.895527 pwrite64(12, "\0", 1, 3773575167) = 1 <0.000017>
> [pid 25244] 09:17:32.895570 pwrite64(12, "\0", 1, 3773579263) = 1 <0.199493>
> [pid 25244] 09:17:33.095147 pwrite64(12, "\0", 1, 3773583359) = 1 <0.000031>
> [pid 25244] 09:17:33.095262 pwrite64(12, "\0", 1, 3773587455) = 1 <0.000061>
> [pid 25244] 09:17:33.095378 pwrite64(12, "\0", 1, 3773591551) = 1 <0.000027>
> [pid 25244] 09:17:33.095445 pwrite64(12, "\0", 1, 3773595647) = 1 <0.000021>
> [pid 25244] 09:17:33.095498 pwrite64(12, "\0", 1, 3773599743) = 1 <0.000016>
> [pid 25244] 09:17:33.095542 pwrite64(12, "\0", 1, 3773603839) = 1 <0.000014>

Most writes are pretty fast, but from time to time there is a very slow write.

From the small sample you posted, we have:

awk '{print $11}' strace.out | sed -e "s/<//" -e "s/>//" | awk
'{sum+=$1; if ($1 < 0.1) {fast+=$1; fast_nr++} else {slow+=$1;
slow_nr++}} END{printf "average: %.6f slow: %.6f fast: %.6f\n",
sum/NR, slow/slow_nr, fast/fast_nr}'
average: 0.016673 slow: 0.199763 fast: 0.000028

Preallocating a 300 GiB disk will take about 15 days :-)

>>> 300*1024**3 / 4096 * 0.016673 / 3600 / 24
15.176135

If all writes were fast, it would take less than an hour:

>>> 300*1024**3 / 4096 * 0.000028 / 3600
0.61166933

> . . .
>
> BTW: it seems my NAS appliance doesn't support 4.2 version of NFS, because if 
> I force it, I then get an error in mount and in engine.log this error for 
> both nodes as they try to mount:
>
> 2021-07-05 17:01:56,082+02 ERROR 
> [org.ovirt.engine.core.bll.storage.connection.FileStorageHelper] 
> (EE-ManagedThreadFactory-engine-Thread-2554190) [642eb6be] The connection 
> with details '172.16.1.137:/nas/EXPORT-DOMAIN' failed because of error code 
> '477' and error message is: problem while trying to mount target
>
>
> and in vdsm.log:
> MountError: (32, ';mount.nfs: Protocol not supported\n')

Too bad.

You can evaluate how ovirt 4.4. will work with this appliance using
this dd command:

dd if=/dev/zero bs=8M count=38400 of=/path/to/new/disk
oflag=direct conv=fsync

We don't use dd for this, but the operation is the same on NFS < 4.2.

Based on the 50 MiB/s rate you reported earlier, I guess you have a 1Gbit
network to this appliance, so zeroing can do up to 128 MiB/s, which will take
about 40 minutes for 300G.
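
A quick check of that estimate, in the same style as the earlier calculation
(300 GiB at roughly 128 MiB/s, in minutes):

>>> 300 * 1024**3 / (128 * 1024**2) / 60
40.0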

Using NFS 4.

[ovirt-users] Re: what happens to vms when a host shutdowns?

2021-07-06 Thread Nir Soffer
On Tue, Jul 6, 2021 at 2:29 PM Sandro Bonazzola  wrote:

>
>
> Il giorno mar 6 lug 2021 alle ore 13:03 Nir Soffer 
> ha scritto:
>
>> On Tue, Jul 6, 2021 at 1:11 PM Nathanaël Blanchet 
>> wrote:
>> > We are installing UPS powerchute client on hypervisors.
>> >
>> > What is the default vms behaviour of running vms when an hypervisor is
>> > ordered to shutdown: do the vms live migrate or do they shutdown
>> > properly (even the restart on an other host because of HA) ?
>>
>> In general VMs are not restarted after an unexpected shutdown, but HA VMs
>> are restarted after failures.
>>
>> If the HA VM has a lease, it can restart safely on another host
>> regardless of
>> the original host status. If the HA VM does not have a lease, the system
>> must
>> wait until the original host is up again to check if the VM is still
>> running on this
>> host.
>>
>> Arik can add more details on this.
>>
>
> I think the question is not related to what happens after the host is back.
> I think the question is what happens when the host goes down.
> To me, the right way to shutdown a host is putting it first to maintenance
> (VM evacuate to other hosts) and then shutdown.
>

Right, but we don't have integration with the UPS, so engine cannot put the
host into maintenance when the host loses power and the UPS shuts it down
after a few minutes.


> On emergency shutdown without moving the host to maintenance first I think
> libvirt is communicating the host is going down to the guests and tries to
> cleanly shutdown vms while the host is going down.
> Arik please confirm :-)
>
>
>
>>
>> Nir
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/HXVXSLXQYZX6CQPJNXKWLOMY3LQU7XJ5/
>>
>
>
> --
>
> Sandro Bonazzola
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R RHV
>
> Red Hat EMEA <https://www.redhat.com/>
>
> sbona...@redhat.com
> <https://www.redhat.com/>
>
> *Red Hat respects your work life balance. Therefore there is no need to
> answer this email out of your office hours.
> <https://mojo.redhat.com/docs/DOC-1199578>*
>
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7Q7XXOL3JXL2L4MP6G2Q7OJLKLBEZFVP/


[ovirt-users] Re: what happens to vms when a host shutdowns?

2021-07-06 Thread Nir Soffer
On Tue, Jul 6, 2021 at 1:11 PM Nathanaël Blanchet  wrote:
> We are installing UPS powerchute client on hypervisors.
>
> What is the default vms behaviour of running vms when an hypervisor is
> ordered to shutdown: do the vms live migrate or do they shutdown
> properly (even the restart on an other host because of HA) ?

In general VMs are not restarted after an unexpected shutdown, but HA VMs
are restarted after failures.

If the HA VM has a lease, it can restart safely on another host regardless of
the original host status. If the HA VM does not have a lease, the system must
wait until the original host is up again to check if the VM is still
running on this
host.

Arik can add more details on this.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/HXVXSLXQYZX6CQPJNXKWLOMY3LQU7XJ5/


[ovirt-users] Re: Any way to terminate stuck export task

2021-07-05 Thread Nir Soffer
On Mon, Jul 5, 2021 at 3:36 PM Gianluca Cecchi
 wrote:
>
> On Mon, Jul 5, 2021 at 2:13 PM Nir Soffer  wrote:
>>
>>
>> >
>> > vdsm 14342  3270  0 11:17 ?00:00:03 /usr/bin/qemu-img convert 
>> > -p -t none -T none -f raw 
>> > /rhev/data-center/mnt/blockSD/679c0725-75fb-4af7-bff1-7c447c5d789c/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379
>> >  -O raw -o preallocation=falloc 
>> > /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379
>>
>> -o preallocation + NFS 4.0 + very slow NFS is your problem.
>>
>> qemu-img is using posix-fallocate() to preallocate the entire image at
>> the start of the copy. With NFS 4.2
>> this uses fallocate() linux specific syscall that allocates the space
>> very efficiently in no time. With older
>> NFS versions, this becomes a very slow loop, writing one byte for
>> every 4k block.
>>
>> If you see -o preallocation, it means you are using an old vdsm
>> version, we stopped using -o preallocation
>> in 4.4.2, see https://bugzilla.redhat.com/1850267.
>
>
> OK. As I said at the beginning the environment is latest 4.3
> We are going to upgrade to 4.4 and we are making some complimentary backups, 
> for safeness.
>
>>
>> > On the hypervisor the ls commands quite hang, so from another hypervisor I 
>> > see that the disk size seems to remain at 4Gb even if timestamp updates...
>> >
>> > # ll 
>> > /rhev/data-center/mnt/172.16.1.137\:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/
>> > total 4260941
>> > -rw-rw. 1 nobody nobody 4363202560 Jul  5 11:23 
>> > d2a89b5e-7d62-4695-96d8-b762ce52b379
>> > -rw-r--r--. 1 nobody nobody261 Jul  5 11:17 
>> > d2a89b5e-7d62-4695-96d8-b762ce52b379.meta
>> >
>> > On host console I see a throughput of 4mbit/s...
>> >
>> > # strace -p 14342
>>
>> This shows only the main thread; use -f to show all threads.
>
>
>  # strace -f -p 14342
> strace: Process 14342 attached with 2 threads
> [pid 14342] ppoll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, NULL, NULL, 8 
> 
> [pid 14343] pwrite64(12, "\0", 1, 16474968063) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474972159) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474976255) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474980351) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474984447) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474988543) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474992639) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474996735) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16475000831) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16475004927) = 1

qemu-img is busy in posix_fallocate(), writing one byte to every 4k block.

If you add -tt -T (as I suggested), we can see how much time each write takes,
which may explain why this takes so much time.

strace -f -p 14342 -tt -T

> . . . and so on . . .
>
>
>> >
>> > This is a test oVirt env so I can wait and eventually test something...
>> > Let me know your suggestions
>>
>> I would start by changing the NFS storage domain to version 4.2.
>
>
> I'm going to try. RIght now I have set it to the default of autonegotiated...
>
>>
>> 1. kill the hang qemu-img (it will probably cannot be killed, but worth 
>> trying)
>> 2. deactivate the storage domain
>> 3. fix the ownership on the storage domain (should be vdsm:kvm, not
>> nobody:nobody)3.
>
>
> Unfortunately it is an appliance. I have asked the guys that have it in 
> charge if we can set them.
> Thanks for the other concepts explained.
>
> Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OI2VU55SU2YZ5ABB3SKEYIHDOG3PERAV/


[ovirt-users] Re: Any way to terminate stuck export task

2021-07-05 Thread Nir Soffer
On Mon, Jul 5, 2021 at 4:06 PM Strahil Nikolov  wrote:
>
> >Disks on the export domain are never used by a running VM so there is
> no reason to
> preallocate them. The system should always use sparse disks when
> copying to export
> domain.
>
> >When importing disks from export domain, the system should reconstruct
> the original disk
> configuration (e.g. raw-preallocated).
>
> Hey Nir,
>
> I think you are wrong. In order to minimize the downtime , many users would 
> use storage migration while the VM is running, then they power off, detach 
> and attach on the new location , power on and live migrate while the VM works.

Live storage migration (moving a disk while the VM is running) is possible only
between data domains, and requires no downtime; no detach/attach is needed.

I'm not sure if it is possible to export a VM to an export domain while the VM
is running (maybe exporting a snapshot works in 4.4). Anyway, assuming you can
export while the VM is running, the target disk will never be used by any VM.

When the export is done, you need to import the VM back to the same or another
system, copying the disk to a data domain.

So we have:

original disk: raw-preallocated on data domain 1
exported disk: raw-sparse or qcow2-sparse on export domain
target disk: raw-preallocated on data domain 2

There is no reason to use a preallocated disk for the temporary disk created
in the export domain.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZLHFAAWY33E5FSLLYT4QBV42GRESNCMA/


[ovirt-users] Re: Any way to terminate stuck export task

2021-07-05 Thread Nir Soffer
On Mon, Jul 5, 2021 at 12:50 PM Gianluca Cecchi
 wrote:
>
> On Sun, Jul 4, 2021 at 1:01 PM Nir Soffer  wrote:
>>
>> On Sun, Jul 4, 2021 at 11:30 AM Strahil Nikolov  
>> wrote:
>> >
>> > Isn't it better to strace it before killing qemu-img .
>>
>> It may be too late, but it may help to understand why this qemu-img
>> run got stuck.
>>
>
> Hi, thanks for your answers and suggestions.
> That env was a production one and so I was forced to power off the hypervisor 
> and power on it again (it was a maintenance window with all the VMs powered 
> down anyway). I was also unable to put the host into maintenance because it 
> replied that there were some tasks running, even after the kill, because the 
> 2 processes (the VM had 2 disks to export and so two qemu-img processes) 
> remained in defunct and after several minutes no change in web admin feedback 
> about the process
>
> My first suspicion was something related to fw congestion because the 
> hypervisor network and the nas appliance were in different networks and I 
> wasn't sure if a fw was in place for it
> But on a test oVirt environment with same oVirt version and with the same 
> network for hypervisors I was able to put a Linux server with the same 
> network as the nas and configure it as nfs server.
> And the export went with a throughput of about 50MB/s, so no fw problem.
> A VM with 55Gb disk exported in 19 minutes.
>
> So I got the rights to mount the nas on the test env and mounted it as export 
> domain and now I have the same problems I can debug.
> The same VM with only one disk (55Gb). The process:
>
> vdsm 14342  3270  0 11:17 ?00:00:03 /usr/bin/qemu-img convert -p 
> -t none -T none -f raw 
> /rhev/data-center/mnt/blockSD/679c0725-75fb-4af7-bff1-7c447c5d789c/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379
>  -O raw -o preallocation=falloc 
> /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379

-o preallocation + NFS 4.0 + very slow NFS is your problem.

qemu-img is using posix_fallocate() to preallocate the entire image at the
start of the copy. With NFS 4.2 this uses the Linux-specific fallocate()
syscall, which allocates the space very efficiently in no time. With older
NFS versions, this becomes a very slow loop, writing one byte for every
4k block.

If you see -o preallocation, it means you are using an old vdsm version;
we stopped using -o preallocation in 4.4.2, see
https://bugzilla.redhat.com/1850267.
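
To illustrate the difference, here is a minimal sketch (not the actual qemu-img
or vdsm code) of both preallocation paths; glibc falls back transparently inside
posix_fallocate() when the file system cannot allocate natively, and that
fallback is exactly the pwrite-per-4k-block loop seen in the strace output
elsewhere in this thread:

# Minimal illustration (not the actual qemu-img/vdsm code) of the two
# preallocation paths.
import os
import time

def fast_preallocate(fd, size):
    # On NFS 4.2 (and most local file systems) this maps to the Linux
    # fallocate() syscall and returns almost immediately.
    os.posix_fallocate(fd, 0, size)

def emulated_preallocate(fd, size, block=4096):
    # Roughly what the glibc fallback does on NFS < 4.2: write one byte
    # into the last byte of every 4k block so the blocks get allocated.
    for offset in range(block - 1, size, block):
        os.pwrite(fd, b"\0", offset)

# Example usage on a file inside the NFS mount (path is a placeholder).
fd = os.open("/path/on/nfs/test-prealloc.img", os.O_RDWR | os.O_CREAT, 0o644)
start = time.monotonic()
fast_preallocate(fd, 1 * 1024**3)   # swap in emulated_preallocate() to compare
print("done in %.2fs" % (time.monotonic() - start))
os.close(fd)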

> On the hypervisor the ls commands quite hang, so from another hypervisor I 
> see that the disk size seems to remain at 4Gb even if timestamp updates...
>
> # ll 
> /rhev/data-center/mnt/172.16.1.137\:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/
> total 4260941
> -rw-rw. 1 nobody nobody 4363202560 Jul  5 11:23 
> d2a89b5e-7d62-4695-96d8-b762ce52b379
> -rw-r--r--. 1 nobody nobody261 Jul  5 11:17 
> d2a89b5e-7d62-4695-96d8-b762ce52b379.meta
>
> On host console I see a throughput of 4mbit/s...
>
> # strace -p 14342

This shows only the main thread; use -f to show all threads.

> strace: Process 14342 attached
> ppoll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, NULL, NULL, 8
>
> # ll /proc/14342/fd
> hangs...
>
> # nfsstat -v
> Client packet stats:
> packets    udp        tcp        tcpconn
> 0          0          0          0
>
> Client rpc stats:
> calls      retrans    authrefrsh
> 31171856   0          31186615
>
> Client nfs v4:
> null         read         write        commit       open         open_conf
> 0         0% 2339179   7% 14872911 47% 7233      0% 74956     0% 2         0%
> open_noat    open_dgrd    close        setattr      fsinfo       renew
> 2312347   7% 0         0% 2387293   7% 24        0% 23        0% 5         0%
> setclntid    confirm      lock         lockt        locku        access
> 3         0% 3         0% 8         0% 8         0% 5         0% 1342746   4%
> getattr      lookup       lookup_root  remove       rename       link
> 3031001   9% 71551     0% 7         0% 74590     0% 6         0% 0         0%
> symlink      create       pathconf     statfs       readlink     readdir
> 0         0% 9         0% 16        0% 4548231  14% 0         0% 98506     0%
> server_caps  delegreturn  getacl       setacl       fs_locations rel_lkowner
> 39        0% 14        0% 0         0% 0         0% 0         0% 0         0%
> secinfo      exchange_id  create_ses   destroy_ses  sequence     get_lease_t
> 0         0% 0         0% 4         0% 2         0% 1         0% 0         0%
> reclaim_comp

[ovirt-users] Re: Any way to terminate stuck export task

2021-07-04 Thread Nir Soffer
On Sun, Jul 4, 2021 at 11:30 AM Strahil Nikolov  wrote:
>
> Isn't it better to strace it before killing qemu-img .

It may be too late, but it may help to understand why this qemu-img
run got stuck.

> Best Regards,
> Strahil Nikolov
>
> On Sun, Jul 4, 2021 at 0:15, Nir Soffer
>  wrote:
> On Sat, Jul 3, 2021 at 3:46 PM Gianluca Cecchi
>  wrote:
> >
> > Hello,
> > in oVirt 4.3.10 an export job to export domain takes too long, probably due 
> > to the NFS server slow.
> > How can I stop in a clean way the task?
> > I see the exported file remains always at 4,5Gb of size.
> > Command vmstat on host with qemu-img process gives no throughput but 
> > blocked processes
> >
> > procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> >  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
> >  1  2  0 170208752 474412 1698575200  71972 2948 5677  0  0 96  4  0
> >  0  2  0 170207184 474412 1698578000  358099 5043 6790  0  0 96  4  0
> >  0  2  0 170208800 474412 1698580400  137941 2332 5527  0  0 96  4  0
> >
> > and the generated file refreshes its timestamp but not the size
> >
> > # ll -a  
> > /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/125ad0f8-2672-468f-86a0-115a7be287f0/
> > total 4675651
> > drwxr-xr-x.  2 vdsm kvm  1024 Jul  3 14:10 .
> > drwxr-xr-x. 12 vdsm kvm  1024 Jul  3 14:10 ..
> > -rw-rw----.  1 vdsm kvm 4787863552 Jul  3 14:33 bb94ae66-e574-432b-bf68-7497bb3ca9e6
> > -rw-r--r--.  1 vdsm kvm        268 Jul  3 14:10 bb94ae66-e574-432b-bf68-7497bb3ca9e6.meta
> >
> > # du -sh  
> > /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/125ad0f8-2672-468f-86a0-115a7be287f0/
> > 4.5G
> > /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/125ad0f8-2672-468f-86a0-115a7be287f0/
> >
> > The VM has two disks, 35Gb and 300GB, not full but quite occupied.
> >
> > Can I simply kill the qemu-img processes on the chosen hypervisor (I 
> > suppose the SPM one)?
>
> Killing the qemu-img process is the only way to stop qemu-img. The system
> is designed to clean up properly after qemu-img terminates.
>
> If this capability is important to you, you can file an RFE to allow aborting
> jobs from the engine UI/API. This is already implemented internally, but we did
> not expose the capability.
>
> It would be useful to understand why qemu-img convert does not make progress.
> If you can reproduce this by running qemu-img from the shell, it can be useful
> to run it via strace and ask about this on the qemu-block mailing list.
>
> Example strace usage:
>
> strace -o convert.log -f -tt -T qemu-img convert ...
>
> Also output of nfsstat during the copy can help.
>
> Nir
>
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/RAMVA5P5IBOXL3ZRJ73B577QQXGM6EKC/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2ROTZZFOXGIRUT2ZQS2MSF3PMYMLQJC7/


[ovirt-users] Re: Any way to terminate stuck export task

2021-07-03 Thread Nir Soffer
On Sat, Jul 3, 2021 at 3:46 PM Gianluca Cecchi
 wrote:
>
> Hello,
> in oVirt 4.3.10 an export job to export domain takes too long, probably due 
> to the NFS server slow.
> How can I stop in a clean way the task?
> I see the exported file remains always at 4,5Gb of size.
> Command vmstat on host with qemu-img process gives no throughput but blocked 
> processes
>
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>  1  2  0 170208752 474412 1698575200   71972 2948 5677  0  0 96  4  0
>  0  2  0 170207184 474412 1698578000  358099 5043 6790  0  0 96  4  0
>  0  2  0 170208800 474412 1698580400  137941 2332 5527  0  0 96  4  0
>
> and the generated file refreshes its timestamp but not the size
>
> # ll -a  
> /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/125ad0f8-2672-468f-86a0-115a7be287f0/
> total 4675651
> drwxr-xr-x.  2 vdsm kvm   1024 Jul  3 14:10 .
> drwxr-xr-x. 12 vdsm kvm   1024 Jul  3 14:10 ..
> -rw-rw----.  1 vdsm kvm 4787863552 Jul  3 14:33 bb94ae66-e574-432b-bf68-7497bb3ca9e6
> -rw-r--r--.  1 vdsm kvm        268 Jul  3 14:10 bb94ae66-e574-432b-bf68-7497bb3ca9e6.meta
>
> # du -sh  
> /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/125ad0f8-2672-468f-86a0-115a7be287f0/
> 4.5G
> /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/125ad0f8-2672-468f-86a0-115a7be287f0/
>
> The VM has two disks, 35Gb and 300GB, not full but quite occupied.
>
> Can I simply kill the qemu-img processes on the chosen hypervisor (I suppose 
> the SPM one)?

Killing the qemu-img process is the only way to stop qemu-img. The system
is designed to clean up properly after qemu-img terminates.

If this capability is important to you, you can file an RFE to allow aborting
jobs from the engine UI/API. This is already implemented internally, but we did
not expose the capability.

It would be useful to understand why qemu-img convert does not make progress.
If you can reproduce this by running qemu-img from the shell, it can be useful
to run it via strace and ask about this on the qemu-block mailing list.

Example strace usage:

strace -o convert.log -f -tt -T qemu-img convert ...

Also output of nfsstat during the copy can help.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RAMVA5P5IBOXL3ZRJ73B577QQXGM6EKC/


[ovirt-users] Re: Strange Issue with imageio

2021-07-01 Thread Nir Soffer
On Thu, Jul 1, 2021 at 11:15 AM Gianluca Cecchi
 wrote:
>
> On Thu, May 27, 2021 at 7:43 AM Eyal Shenitzky  wrote:
>>
>> This bug is targeted to be fixed in 4.4.7 so 4.4.6 doesn't contain the fix.
>>
>
> But is there a workaround for this?
> On a single host environment with external engine and local storage and 4.4.5 
> it seems that uploading an iso always gives OK without uploading anything.
> Both if selecting test connection or not...
> Is it only related to the GUI or in general even if I use the API?

I don't know about any issues using the API, and it is used by backup
applications to back up and restore VMs, so it should be more reliable.
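
If you want to test an upload through the API, something like this should
work. It is only a rough sketch based on the SDK upload examples; the engine
URL, credentials, storage domain name, sizes and file name are placeholders,
and a real script should handle errors the way examples/upload_disk.py does:

# Rough sketch of uploading an ISO via the API instead of the GUI.
# Assumes python3-ovirt-engine-sdk4 and the ovirt-imageio client are installed.
import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types
from ovirt_imageio import client

connection = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",  # placeholder
    username="admin@internal",
    password="password",                                # placeholder
    ca_file="ca.pem",
)
system_service = connection.system_service()

# Create the target disk (name, size and storage domain are placeholders).
disks_service = system_service.disks_service()
disk = disks_service.add(
    types.Disk(
        name="my-iso",
        content_type=types.DiskContentType.ISO,
        format=types.DiskFormat.RAW,
        provisioned_size=5 * 1024**3,
        storage_domains=[types.StorageDomain(name="my-domain")],
    )
)
disk_service = disks_service.disk_service(disk.id)
while disk_service.get().status != types.DiskStatus.OK:
    time.sleep(1)

# Start an image transfer and push the data with the imageio client.
transfers_service = system_service.image_transfers_service()
transfer = transfers_service.add(
    types.ImageTransfer(
        disk=types.Disk(id=disk.id),
        direction=types.ImageTransferDirection.UPLOAD,
    )
)
transfer_service = transfers_service.image_transfer_service(transfer.id)
while transfer_service.get().phase == types.ImageTransferPhase.INITIALIZING:
    time.sleep(1)

client.upload("my.iso", transfer_service.get().transfer_url, "ca.pem")
transfer_service.finalize()
connection.close()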

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4FFTG63RBUI5REZJLZ4EAMUTKA467ZM5/


[ovirt-users] Re: upgrading to 4.4.6 with Rocky Linux 8

2021-07-01 Thread Nir Soffer
On Wed, Jun 30, 2021 at 5:45 PM Jason Keltz  wrote:
>
> Hi..
>
> I'm looking to migrate soon from CentOS 7.9 with oVirt 4.3.10 to Rocky
> Linux 8.4 with oVirt 4.4.6.  I'm working on my kickstart of my
> standalone engine in a VM at the moment.
>
> So far, with minimal experience with Rocky Linux, after my kickstart, I
> was able to run "engine-setup", follow all the defaults and then access
> my "new" engine via web.  I have to explore the actual procedure for
> installing on my current engine host, and restoring my data.
>
> When oVirt team releases new releases, I'm just wondering if you test
> going from the last previous release (4.3.10 in this case) to each
> latest release?

Migration flows are tested for RHV, so they should work for oVirt if you wait
until the corresponding RHV version is released, and you run the same
RHEL-like version that was tested with RHV.

> I know that the documentation says we always need to
> make sure we update to each individual major release, but I'm just
> wondering if this is something that oVirt team tests with each release?
> I'm very concerned for potential of failed upgrade, and the potential
> headaches that it could cause.
>
> I see the 4.4.6 was released in time with RHEL 8.3.  I'd like to use
> Rocky Linux 8.4 because I believe RHEL has re-enabled mptsas (though I
> know still unsupported) from 8.4+ which will make things easier.

RHV 4.4.6 was released with RHEL 8.4, and oVirt 4.4.6 is compatible with
RHEL 8.4, so it should work with any RHEL 8.4-like distro.

But note that oVirt uses the advanced virtualization stream, providing
libvirt 7.0.0 and qemu-kvm 5.2.0:
http://mirror.centos.org/centos/8/virt/x86_64/advanced-virtualization/Packages/q/

Looking in Rocky packages, this is not available yet:
https://download.rockylinux.org/pub/rocky/8/AppStream/x86_64/os/Packages/

To replace Centos as the production OS for oVirt, the community must also
rebuild advanced virtualization.

You can try to use Rocky and pull in the advanced-virtualization repo from
Centos as a temporary solution.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2RTNY6TSYIK3RYJTL5FILUG6TSUAHG3G/


[ovirt-users] Re: oVirt and ARM

2021-06-25 Thread Nir Soffer
On Fri, Jun 25, 2021 at 4:28 PM Sandro Bonazzola 
wrote:

>
>
> Il giorno ven 25 giu 2021 alle ore 14:20 Marko Vrgotic <
> m.vrgo...@activevideo.com> ha scritto:
>
>> Hi Sandro,
>>
>>
>>
>> Thank you for the update. I am not equipped to help on development side,
>> but I can most certainly do test deployments, once there is something
>> available.
>>
>>
>>
>> We are big oVirt shop and moving to ARM64 with new product, it would be
>> great if oVirt would start supporting it.
>>
>>
>>
>> If we are able to help somehow, let me know.
>>
>
> I guess a start could be adding some arm64 machine to oVirt infrastructure
> so developers can build for it.
> You can have a look at
> https://ovirt.org/community/get-involved/donate-hardware.html
> Looping in +Evgheni Dereveanchin  in case you can
> share some resources.
>

I've been building ovirt-imageio for arm for a while:
https://copr.fedorainfracloud.org/coprs/nsoffer/ovirt-imageio-preview/build/2264997/

But it was never tested on arm since I don't have any hardware for testing.

You can help by trying to run ovirt-imageio tests.

Here are updated instructions:

Setup development environment (once):

git clone git://gerrit.ovirt.org/ovirt-imageio
cd ovirt-imageio
python3 -m venv ~/venv/ovirt-imageio
source ~/venv/ovirt-imageio/bin/activate
pip install --upgrade pip
pip install -r docker/requirements.txt
sudo dnf install $(cat automation/check-patch.packages)

Running the tests:

make storage
cd daemon
tox

We run these tests on centos stream and fedora 32, 33, 34.

Nir

-
>>
>> kind regards/met vriendelijke groeten
>>
>>
>>
>> Marko Vrgotic
>> Sr. System Engineer @ System Administration
>>
>>
>> ActiveVideo
>>
>> *e:* m.vrgo...@activevideo.com
>> *w: *www.activevideo.com
>>
>>
>>
>> ActiveVideo Networks BV. Mediacentrum 3745 Joop van den Endeplein 1.1217
>> WJ Hilversum, The Netherlands. The information contained in this message
>> may be legally privileged and confidential. It is intended to be read only
>> by the individual or entity to whom it is addressed or by their designee.
>> If the reader of this message is not the intended recipient, you are on
>> notice that any distribution of this message, in any form, is strictly
>> prohibited.  If you have received this message in error, please immediately
>> notify the sender and/or ActiveVideo Networks, LLC by telephone at +1
>> 408.931.9200 and delete or destroy any copy of this message.
>>
>>
>>
>>
>>
>>
>>
>> *From: *Sandro Bonazzola 
>> *Date: *Thursday, 24 June 2021 at 18:21
>> *To: *Marko Vrgotic , Zhenyu Zheng <
>> zhengzhenyul...@gmail.com>, Joey Ma 
>> *Cc: *users@ovirt.org 
>> *Subject: *Re: [ovirt-users] oVirt and ARM
>>
>> ***CAUTION: This email originated from outside of the organization. Do
>> not click links or open attachments unless you recognize the sender!!!***
>>
>>
>>
>>
>>
>> Il giorno gio 24 giu 2021 alle ore 16:34 Marko Vrgotic <
>> m.vrgo...@activevideo.com> ha scritto:
>>
>> Hi oVirt,
>>
>>
>>
>> Where can I find if there are any information about oVirt supporting
>> arm64 CPU architecture?
>>
>>
>>
>> Right now oVirt is not supporting arm64. There was an initiative about
>> supporting it started some time ago from openEuler ovirt SIG.
>>
>> I didn't got any further updates on this topic, looping in those who I
>> remember being looking into it.
>>
>> I think that if someone contributes arm64 support it would also be a
>> feature worth a 4.5 release :-)
>>
>>
>>
>>
>>
>> -
>>
>> kind regards/met vriendelijke groeten
>>
>>
>>
>> Marko Vrgotic
>> Sr. System Engineer @ System Administration
>>
>>
>> ActiveVideo
>>
>> *o: *+31 (35) 6774131
>>
>> *m: +*31 (65) 5734174
>>
>> *e:* m.vrgo...@activevideo.com
>> *w: *www.activevideo.com
>>
>>
>>
>> ActiveVideo Networks BV. Mediacentrum 3745 Joop van den Endeplein 1.1217
>> WJ Hilversum, The Netherlands. The information contained in this message
>> may be legally privileged and confidential. It is intended to be read only
>> by the individual or entity to whom it is addressed or by their designee.
>> If the reader of this message is not the intended recipient, you are on
>> notice that any distribution of this message, in any form, is strictly
>> prohibited.  If you have received this message in error, please immediately
>> notify the sender and/or ActiveVideo Networks, LLC by telephone at +1
>> 408.931.9200 and delete or destroy any copy of this message.
>>
>>
>>
>>
>>
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>> 

[ovirt-users] Re: [ANN] oVirt 4.4.7 Fourth Release Candidate is now available for testing

2021-06-24 Thread Nir Soffer
On Thu, Jun 24, 2021 at 4:19 AM Sketch  wrote:
>
> On Wed, 23 Jun 2021, Nir Soffer wrote:
>
> > On Wed, Jun 23, 2021 at 12:22 PM Sketch  wrote:
> >>
> >> Installation fails on a CentOS Linux 8.4 host using yum update.
> >>
> >>   Problem 1: cannot install the best update candidate for package 
> >> vdsm-4.40.60.7-1.el8.x86_64
> >>- nothing provides python3-sanlock >= 3.8.3-3 needed by 
> >> vdsm-4.40.70.4-1.el8.x86_64
> >>- nothing provides sanlock >= 3.8.3-3 needed by 
> >> vdsm-4.40.70.4-1.el8.x86_64
>
> > This version was not released yet for Centos. You need to wait until
> > this package
> > is released on Centos if you want to upgrade ovirt to 4.4.7.
> >
> > If you want to use the latest oVirt version as soon as it is released,
> > you need to use
> > Centos Stream.
>
> I suspected that might have been the case, but figured I'd mention it
> since we're on RC4 now and the release notes say it's available for RHEL
> 8.4 a well as Stream.  I checked the RHEL package browser and it doesn't
> have sanlock >= 3.8.3-3 yet either.  Is the oVirt 4.4.7 GA release waiting
> on this update to be pushed?

This package should be available in RHEL when oVirt 4.4.7 is released.

The relevant oVirt bug is:
https://bugzilla.redhat.com/1961752

The sanlock bugs:
https://bugzilla.redhat.com/1965481
https://bugzilla.redhat.com/1965483

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4W4CI7OIYHBJHD574JGEROA4HGCVYGMP/


[ovirt-users] Re: [ANN] oVirt 4.4.7 Fourth Release Candidate is now available for testing

2021-06-23 Thread Nir Soffer
On Wed, Jun 23, 2021 at 12:22 PM Sketch  wrote:
>
> Installation fails on a CentOS Linux 8.4 host using yum update.
>
> I had previously installed ceph and python3-os-brick from the
> centos-oepnstrack-train and ceph-nautilus repos, but I removed the
> packages and the repos and ran dnf distro-sync, and used yum list
> installed just to be sure everything was in a clean state with no extra
> packages installed from nonstandard repos.
>
> Here's the output from yum update:
>
> Last metadata expiration check: 2:31:18 ago on Tue 22 Jun 2021 11:26:39 PM 
> PDT.
> Error:
>   Problem 1: cannot install the best update candidate for package 
> vdsm-4.40.60.7-1.el8.x86_64
>- nothing provides python3-sanlock >= 3.8.3-3 needed by 
> vdsm-4.40.70.4-1.el8.x86_64
>- nothing provides sanlock >= 3.8.3-3 needed by vdsm-4.40.70.4-1.el8.x86_64

This version was not released yet for Centos. You need to wait until
this package
is released on Centos if you want to upgrade ovirt to 4.4.7.

If you want to use the latest oVirt version as soon as it is released,
you need to use
Centos Stream.

Nir

>   Problem 2: package vdsm-hook-fcoe-4.40.70.4-1.el8.noarch requires vdsm = 
> 4.40.70.4-1.el8, but none of the providers can be installed
>- cannot install the best update candidate for package 
> vdsm-hook-fcoe-4.40.60.7-1.el8.noarch
>- nothing provides python3-sanlock >= 3.8.3-3 needed by 
> vdsm-4.40.70.4-1.el8.x86_64
>- nothing provides sanlock >= 3.8.3-3 needed by vdsm-4.40.70.4-1.el8.x86_64
>   Problem 3: package vdsm-hook-ethtool-options-4.40.70.4-1.el8.noarch 
> requires vdsm = 4.40.70.4-1.el8, but none of the providers can be installed
>- cannot install the best update candidate for package 
> vdsm-hook-ethtool-options-4.40.60.7-1.el8.noarch
>- nothing provides python3-sanlock >= 3.8.3-3 needed by 
> vdsm-4.40.70.4-1.el8.x86_64
>- nothing provides sanlock >= 3.8.3-3 needed by vdsm-4.40.70.4-1.el8.x86_64
>   Problem 4: cannot install the best update candidate for package 
> vdsm-hook-vmfex-dev-4.40.60.7-1.el8.noarch
>- package vdsm-hook-vmfex-dev-4.40.70.4-1.el8.noarch requires vdsm = 
> 4.40.70.4-1.el8, but none of the providers can be installed
>- nothing provides python3-sanlock >= 3.8.3-3 needed by 
> vdsm-4.40.70.4-1.el8.x86_64
>- nothing provides sanlock >= 3.8.3-3 needed by vdsm-4.40.70.4-1.el8.x86_64
>   Problem 5: package ovirt-provider-ovn-driver-1.2.33-1.el8.noarch requires 
> vdsm, but none of the providers can be installed
>- package vdsm-4.40.60.7-1.el8.x86_64 requires vdsm-http = 
> 4.40.60.7-1.el8, but none of the providers can be installed
>- package vdsm-4.40.70.1-1.el8.x86_64 requires vdsm-http = 
> 4.40.70.1-1.el8, but none of the providers can be installed
>- package vdsm-4.40.70.2-1.el8.x86_64 requires vdsm-http = 
> 4.40.70.2-1.el8, but none of the providers can be installed
>- package vdsm-4.40.17-1.el8.x86_64 requires vdsm-http = 4.40.17-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.18-1.el8.x86_64 requires vdsm-http = 4.40.18-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.19-1.el8.x86_64 requires vdsm-http = 4.40.19-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.20-1.el8.x86_64 requires vdsm-http = 4.40.20-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.21-1.el8.x86_64 requires vdsm-http = 4.40.21-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.22-1.el8.x86_64 requires vdsm-http = 4.40.22-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.26.3-1.el8.x86_64 requires vdsm-http = 
> 4.40.26.3-1.el8, but none of the providers can be installed
>- package vdsm-4.40.30-1.el8.x86_64 requires vdsm-http = 4.40.30-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.31-1.el8.x86_64 requires vdsm-http = 4.40.31-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.32-1.el8.x86_64 requires vdsm-http = 4.40.32-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.33-1.el8.x86_64 requires vdsm-http = 4.40.33-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.34-1.el8.x86_64 requires vdsm-http = 4.40.34-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.35-1.el8.x86_64 requires vdsm-http = 4.40.35-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.35.1-1.el8.x86_64 requires vdsm-http = 
> 4.40.35.1-1.el8, but none of the providers can be installed
>- package vdsm-4.40.36-1.el8.x86_64 requires vdsm-http = 4.40.36-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.37-1.el8.x86_64 requires vdsm-http = 4.40.37-1.el8, 
> but none of the providers can be installed
>- package vdsm-4.40.38-1.el8.x86_64 requires vdsm-http = 4.40.38-1.el8, 
> but none of the providers can be installed
>- package 

[ovirt-users] Re: How to handle broken NFS storage?

2021-06-06 Thread Nir Soffer
On Sun, Jun 6, 2021 at 12:31 PM Nir Soffer  wrote:
>
> On Sat, Jun 5, 2021 at 3:25 AM David White via Users  wrote:
> >
> When I stopped the NFS service, I was connected to a VM over ssh.
> > I was also connected to one of the physical hosts over ssh, and was running 
> > top.
> >
> > I observed that server load continued to increase over time on the physical 
> > host.
> > Several of the VMs (perhaps all?), including the one I was connected to, 
> > went down due to an underlying storage issue.
> > It appears to me that HA VMs were restarted automatically. For example, I 
> > see the following in the oVirt Manager Event Log (domain names changed / 
> > redacted):
> >
> >
> > Jun 4, 2021, 4:25:42 AM
> > Highly Available VM server2.example.com failed. It will be restarted 
> > automatically.
>
> Do  you have a cdrom on an ISO storage domain, maybe on the same NFS server
> that you stopped?

If you share the VM XML for the HA VMs and the regular VMs, it will be easier
to understand your system.

The best way is to use:

sudo virsh -r dumpxml {vm-name}

> > Jun 4, 2021, 4:25:42 AM
> > Highly Available VM mail.example.com failed. It will be restarted 
> > automatically.
> >
> > Jun 4, 2021, 4:25:42 AM
> > Highly Available VM core1.mgt.example.com failed. It will be restarted 
> > automatically.
> >
> > Jun 4, 2021, 4:25:42 AM
> > VM cha1-shared.example.com has been paused due to unknown storage error.
> >
> > Jun 4, 2021, 4:25:42 AM
> > VM server.example.org has been paused due to storage I/O problem.
> >
> > Jun 4, 2021, 4:25:42 AM
> > VM server.example.com has been paused.
>
> I guess this vm was using the NFS server?
>
> > Jun 4, 2021, 4:25:42 AM
> > VM server.example.org has been paused.
> >
> > Jun 4, 2021, 4:25:41 AM
> > VM server.example.org has been paused due to unknown storage error.
> >
> > Jun 4, 2021, 4:25:41 AM
> > VM HostedEngine has been paused due to storage I/O problem.
> >
> >
> > During this outage, I also noticed that customer websites were not working.
> > So I clearly took an outage.
> >
> > > If you have a good way to reproduce the issue please file a bug with
> > > all the logs, we try to improve this situation.
> >
> > I don't have a separate lab environment, but if I'm able to reproduce the 
> > issue off hours, I may try to do so.
> > What logs would be helpful?
>
> /var/log/vdsm.log
> /var/log/sanlock.log
> /var/log/messages or output of journalctl
>
> > > NFS storage domain will always affect other storage domains, but if you 
> > > mount
> > > your NFS storage outside of ovirt, the mount will not affect the system.
> > >
> >
> > > Then you can backup to this mount, for example using backup_vm.py:
> > > https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/backup_vm.py
> >
> > If I'm understanding you correctly, it sounds like you're suggesting that I 
> > just connect 1 (or multiple) hosts to the NFS mount manually,
>
> Yes
>
> > and don't use the oVirt manager to build the backup domain. Then just run 
> > this script on a cron or something - is that correct?
>
> Yes.
>
> You can run the backup in many ways, for example you can run it via ssh
> from another host, finding where vms are running, and connecting to
> the host to perform a backup. This is outside of ovirt, since ovirt does not
> have a built-in backup feature. We have a backup API and example code using it
> which can be used to build a backup solution.
>
> > Sent with ProtonMail Secure Email.
> >
> > ‐‐‐ Original Message ‐‐‐
> > On Friday, June 4, 2021 12:29 PM, Nir Soffer  wrote:
> >
> > > On Fri, Jun 4, 2021 at 12:11 PM David White via Users users@ovirt.org 
> > > wrote:
> > >
> >
> > > > I'm trying to figure out how to keep a "broken" NFS mount point from 
> > > > causing the entire HCI cluster to crash.
> > > > HCI is working beautifully.
> > > > Last night, I finished adding some NFS storage to the cluster - this is 
> > > > storage that I don't necessarily need to be HA, and I was hoping to 
> > > > store some backups and less-important VMs on, since my Gluster (sssd) 
> > > > storage availability is pretty limited.
> > > > But as a test, after I got everything setup, I stopped the nfs-server.
> > > > This caused the entire cluster to go down, and several VMs - that are 
> > > > not stored on the NFS storage - went belly up.
&

[ovirt-users] Re: How to handle broken NFS storage?

2021-06-06 Thread Nir Soffer
On Sat, Jun 5, 2021 at 3:25 AM David White via Users  wrote:
>
> When I stopped the NFS service, I was connected to a VM over ssh.
> I was also connected to one of the physical hosts over ssh, and was running 
> top.
>
> I observed that server load continued to increase over time on the physical 
> host.
> Several of the VMs (perhaps all?), including the one I was connected to, went 
> down due to an underlying storage issue.
> It appears to me that HA VMs were restarted automatically. For example, I see 
> the following in the oVirt Manager Event Log (domain names changed / 
> redacted):
>
>
> Jun 4, 2021, 4:25:42 AM
> Highly Available VM server2.example.com failed. It will be restarted 
> automatically.

Do  you have a cdrom on an ISO storage domain, maybe on the same NFS server
that you stopped?

> Jun 4, 2021, 4:25:42 AM
> Highly Available VM mail.example.com failed. It will be restarted 
> automatically.
>
> Jun 4, 2021, 4:25:42 AM
> Highly Available VM core1.mgt.example.com failed. It will be restarted 
> automatically.
>
> Jun 4, 2021, 4:25:42 AM
> VM cha1-shared.example.com has been paused due to unknown storage error.
>
> Jun 4, 2021, 4:25:42 AM
> VM server.example.org has been paused due to storage I/O problem.
>
> Jun 4, 2021, 4:25:42 AM
> VM server.example.com has been paused.

I guess this vm was using the NFS server?

> Jun 4, 2021, 4:25:42 AM
> VM server.example.org has been paused.
>
> Jun 4, 2021, 4:25:41 AM
> VM server.example.org has been paused due to unknown storage error.
>
> Jun 4, 2021, 4:25:41 AM
> VM HostedEngine has been paused due to storage I/O problem.
>
>
> During this outage, I also noticed that customer websites were not working.
> So I clearly took an outage.
>
> > If you have a good way to reproduce the issue please file a bug with
> > all the logs, we try to improve this situation.
>
> I don't have a separate lab environment, but if I'm able to reproduce the 
> issue off hours, I may try to do so.
> What logs would be helpful?

/var/log/vdsm.log
/var/log/sanlock.log
/var/log/messages or output of journalctl

> > NFS storage domain will always affect other storage domains, but if you 
> > mount
> > your NFS storage outside of ovirt, the mount will not affect the system.
> >
>
> > Then you can backup to this mount, for example using backup_vm.py:
> > https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/backup_vm.py
>
> If I'm understanding you correctly, it sounds like you're suggesting that I 
> just connect 1 (or multiple) hosts to the NFS mount manually,

Yes

> and don't use the oVirt manager to build the backup domain. Then just run 
> this script on a cron or something - is that correct?

Yes.

You can run the backup in many ways, for example you can run it via ssh
from another host, finding where the VMs are running, and connecting to
the host to perform a backup. This is outside of ovirt, since ovirt does not
have a built-in backup feature. We have a backup API and example code using it
which can be used to build a backup solution.

> Sent with ProtonMail Secure Email.
>
> ‐‐‐ Original Message ‐‐‐
> On Friday, June 4, 2021 12:29 PM, Nir Soffer  wrote:
>
> > On Fri, Jun 4, 2021 at 12:11 PM David White via Users users@ovirt.org wrote:
> >
>
> > > I'm trying to figure out how to keep a "broken" NFS mount point from 
> > > causing the entire HCI cluster to crash.
> > > HCI is working beautifully.
> > > Last night, I finished adding some NFS storage to the cluster - this is 
> > > storage that I don't necessarily need to be HA, and I was hoping to store 
> > > some backups and less-important VMs on, since my Gluster (sssd) storage 
> > > availability is pretty limited.
> > > But as a test, after I got everything setup, I stopped the nfs-server.
> > > This caused the entire cluster to go down, and several VMs - that are not 
> > > stored on the NFS storage - went belly up.
> >
>
> > Please explain in more detail "went belly up".
> >
>
> > In general, VMs not using the NFS storage domain should not be affected, but
> > due to an unfortunate design of vdsm, all storage domains share the same
> > global lock, and when one storage domain has trouble it can cause delays in
> > operations on other domains. This may lead to timeouts and VMs reported as
> > non-responsive, but the actual VMs should not be affected.
> >
>
> > If you have a good way to reproduce the issue, please file a bug with
> > all the logs, so we can try to improve this situation.
> >
>
> > > Once I started the NFS server 

[ovirt-users] Re: How to handle broken NFS storage?

2021-06-04 Thread Nir Soffer
On Fri, Jun 4, 2021 at 12:11 PM David White via Users  wrote:
>
> I'm trying to figure out how to keep a "broken" NFS mount point from causing 
> the entire HCI cluster to crash.
>
> HCI is working beautifully.
> Last night, I finished adding some NFS storage to the cluster - this is 
> storage that I don't necessarily need to be HA, and I was hoping to store 
> some backups and less-important VMs on, since my Gluster (sssd) storage 
> availability is pretty limited.
>
> But as a test, after I got everything setup, I stopped the nfs-server.
> This caused the entire cluster to go down, and several VMs - that are not 
> stored on the NFS storage - went belly up.

Please explain in more detail "went belly up".

In general, VMs not using the NFS storage domain should not be affected, but
due to an unfortunate design of vdsm, all storage domains share the same
global lock, and when one storage domain has trouble it can cause delays in
operations on other domains. This may lead to timeouts and VMs reported as
non-responsive, but the actual VMs should not be affected.

If you have a good way to reproduce the issue, please file a bug with
all the logs, so we can try to improve this situation.

> Once I started the NFS server process again, HCI did what it was supposed to 
> do, and was able to automatically recover.
> My concern is that NFS is a single point of failure, and if VMs that don't 
> even rely on that storage are affected if the NFS storage goes away, then I 
> don't want anything to do with it.

You need to understand the actual effect on the vms before you reject NFS.

> On the other hand, I'm still struggling to come up with a good way to run 
> on-site backups and snapshots without using up more gluster space on my (more 
> expensive) sssd storage.

NFS is useful for this purpose. You don't need synchronous replication, and
you want the backups outside of your cluster so in case of disaster you can
restore the backups on another system.

Snapshots are always on the same storage, so they will not help.

> Is there any way to setup NFS storage for a Backup Domain - as well as a Data 
> domain (for lesser important VMs) - such that, if the NFS server crashed, all 
> of my non-NFS stuff would be unaffected?

NFS storage domain will always affect other storage domains, but if you mount
your NFS storage outside of ovirt, the mount will not affect the system.

Then you can backup to this mount, for example using backup_vm.py:
https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/backup_vm.py

Or use one of the backup solutions; none of them use a storage domain for
keeping the backups, so the mount should not affect the system.
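
If you prefer to script it yourself instead of using backup_vm.py, starting a
backup through the SDK looks roughly like this. It is only a sketch with
placeholder connection details and IDs; downloading the actual disk data is
done with image transfers and the imageio client, exactly as backup_vm.py
does:

# Rough sketch: start a full VM backup via the backup API and wait until
# it is ready. Engine URL, credentials, vm_id and disk_id are placeholders.
import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",  # placeholder
    username="admin@internal",
    password="password",                                # placeholder
    ca_file="ca.pem",
)
vm_id = "..."     # placeholder
disk_id = "..."   # placeholder

vm_service = connection.system_service().vms_service().vm_service(vm_id)
backups_service = vm_service.backups_service()

backup = backups_service.add(
    types.Backup(disks=[types.Disk(id=disk_id)])
)
backup_service = backups_service.backup_service(backup.id)

# Wait until the backup is ready, download the disks with image transfers
# (see backup_vm.py for the details), then finalize the backup.
while backup_service.get().phase != types.BackupPhase.READY:
    time.sleep(1)

# ... download disk data here ...

backup_service.finalize()
connection.close()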

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MYQAQTMXRAZT7EYAYCMYXBJYZHSNJT7G/


[ovirt-users] Re: Can't remove snapshot

2021-06-03 Thread Nir Soffer
On Thu, Jun 3, 2021 at 8:47 PM Strahil Nikolov  wrote:
>
> Hey Nir,
>
> you said that the data in the snapshot changed ?
> I always thought that snapshots are read-only.

Indeed a snapshot is read-only - until you start to delete it. This is
why we mark the snapshot as illegal once delete snapshot was started.

It works like this:

1. Before snapshot

Snapshots: (none)
Volumes: A (active)

A is a read-write volume, changing while the VM is running.

2. After snapshot

Snapshot: snap1 (disk snapshot A)
Volumes: A <- B (active)

A is now a read-only image, and will never change
B is read-write, modified by the VM
B's backing file is A

3. Start delete of snapshot 1

Snapshot: snap1 (disk snapshot A, illegal)
Volumes: A <- B (active)

On the host running the VM, we perform a block commit job,
copying data from B into A.

When the job completes, A contains all data in B, and any new
data written to the B is mirrored to A.

4. Pivoting to volume A

When the block commit has completed, we switch the VM to use volume A
instead of volume B.

At this point the VM is writing again to volume A, and volume B is unused.

Snapshot: snap1 (disk snapshot A, illegal)
Volumes: A (active) <- B

5. Cleanup

On engine side, snapshot 1 is deleted
On the host, volume B is deactivated
On the SPM host, volume B is deleted

Snapshot: (none)
Volumes: A (active)
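
At the libvirt level, steps 3 and 4 map to an active block commit followed
by a pivot. A rough illustration with the libvirt Python bindings - this is
not what vdsm literally runs, and the VM name and disk target below are
placeholders:

# Illustration only: active block commit of B into A, then pivot back to A.
import time
import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("my-vm")        # placeholder VM name
disk = "sda"                            # placeholder disk target

# Commit the active layer (B) into its backing file (A).
dom.blockCommit(disk, None, None, 0, libvirt.VIR_DOMAIN_BLOCK_COMMIT_ACTIVE)

# Wait until all data is copied and the job is ready for the pivot.
while True:
    info = dom.blockJobInfo(disk, 0)
    if info and info["cur"] == info["end"]:
        break
    time.sleep(1)

# Pivot: switch the VM back to writing to A; B becomes unused.
dom.blockJobAbort(disk, libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)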

I hope this is more clear now.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7XX74LQ7IYYQCCUSEIMV35TPFRIP4NDD/


[ovirt-users] Re: Can't remove snapshot

2021-06-03 Thread Nir Soffer
On Fri, May 28, 2021 at 7:09 PM David Johnson 
wrote:

> Hi all,
>
> I patched one of my Windows VM's yesterday.  I started by snapshotting the
> VM, then applied the Windows update.  Now that the patch has been tested, I
> want to remove the snapshot. I get this message:
>
> Error while executing action:
>
> win-sql-2019:
>
>- Cannot remove Snapshot. The following attached disks are in ILLEGAL
>status: win-2019-tmpl_Disk1 - please remove them and try again.
>
>
> Does anyone have any thoughts how to recover from this? I really don't
> want to keep this snapshot hanging around.
>

The engine ILLEGAL state means that you started a delete snapshot operation,
and the data in this snapshot has changed. This snapshot cannot be used for
restoring the VM to the state it was in when the snapshot was created.

In this situation you can retry the delete snapshot operation. If the first
delete failed because of a temporary error, the retry should succeed and
the snapshot will be deleted.
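
For reference, the same retry can also be scripted through the SDK. A minimal
sketch, with placeholder connection details and IDs:

# Minimal sketch: retry removing the snapshot via the API.
# Engine URL, credentials, vm_id and snapshot_id are placeholders.
import ovirtsdk4 as sdk

connection = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",  # placeholder
    username="admin@internal",
    password="password",                                # placeholder
    ca_file="ca.pem",
)
vm_id = "..."        # placeholder
snapshot_id = "..."  # placeholder

snapshots_service = (
    connection.system_service().vms_service().vm_service(vm_id).snapshots_service()
)
snapshots_service.snapshot_service(snapshot_id).remove()
connection.close()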

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PY5G5XXAYK53O5ZC3E2BPMVE77LDVIBW/


[ovirt-users] Re: Adding a Ubuntu Host's NFS share to oVirt

2021-06-01 Thread Nir Soffer
On Wed, Jun 2, 2021 at 3:08 AM Nir Soffer  wrote:
>
> On Sat, May 22, 2021 at 8:20 PM David White via Users  wrote:
> >
> > Hello,
> > Is it possible to use Ubuntu to share an NFS export with oVirt?
> > I'm trying to setup a Backup Domain for my environment.
> >
> > I got to the point of actually adding the new Storage Domain.
> > When I click OK, I see the storage domain appear momentarily before 
> > disappearing, at which point I get a message about oVirt not being able to 
> > obtain a lock.
>
> It may be the issue described here:
> https://github.com/oVirt/ovirt-site/pull/2433

This is the relevant thread:
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/G2TQC6XTJTIBAIOG7BWAFJ3YW3XOMNXF/#G2TQC6XTJTIBAIOG7BWAFJ3YW3XOMNXF

> The fix is to change this on the server side:
>
> # grep RPCMOUNTDOPTS /etc/default/nfs-kernel-server
> # --manage-gids is not compatible with oVirt.
> #RPCMOUNTDOPTS="--manage-gids"
> RPCMOUNTDOPTS=""
>
> > It appears I'm running into the issue described in this thread: 
> > https://lists.ovirt.org/archives/list/users@ovirt.org/thread/BNVXUH5B26FBFCGYLG62JUSB5SOU2MN7/#IZTU744GVKY5OJT4QOULLZVKGYADXDOO
> >  ... Although the actual export is ext4, not xfs.
>
> The issue is not related to the file system, and it is likely the same
> issue described
> in this thread.
>
> > From what I'm reading on that thread and elsewhere, it sounds like this 
> > problem is a result of SELinux not being present, is that correct?
>
> This seems to be caused by incompatible NFS server defaults on Ubuntu.
>
> > Is my only option here to install an OS that supports SELinux?
>
> That is another option; a RHEL(-like) server is a safe bet.
>
> Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5VRBUPMSYP56LPEUA62H34K5UHWBN7C6/


[ovirt-users] Re: Adding a Ubuntu Host's NFS share to oVirt

2021-06-01 Thread Nir Soffer
On Sat, May 22, 2021 at 8:20 PM David White via Users  wrote:
>
> Hello,
> Is it possible to use Ubuntu to share an NFS export with oVirt?
> I'm trying to setup a Backup Domain for my environment.
>
> I got to the point of actually adding the new Storage Domain.
> When I click OK, I see the storage domain appear momentarily before 
> disappearing, at which point I get a message about oVirt not being able to 
> obtain a lock.

It may be the issue described here:
https://github.com/oVirt/ovirt-site/pull/2433

The fix is to change this on the server side:

# grep RPCMOUNTDOPTS /etc/default/nfs-kernel-server
# --manage-gids is not compatible with oVirt.
#RPCMOUNTDOPTS="--manage-gids"
RPCMOUNTDOPTS=""

> It appears I'm running into the issue described in this thread: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/thread/BNVXUH5B26FBFCGYLG62JUSB5SOU2MN7/#IZTU744GVKY5OJT4QOULLZVKGYADXDOO
>  ... Although the actual export is ext4, not xfs.

The issue is not related to the file system, and it is likely the same
issue described
in this thread.

> From what I'm reading on that thread and elsewhere, it sounds like this 
> problem is a result of SELinux not being present, is that correct?

This seems to be caused by incompatible NFS server defaults on Ubuntu.

> Is my only option here to install an OS that supports SELinux?

That is another option; a RHEL(-like) server is a safe bet.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MWHE7HMHDSPY4T7LXOQMW65K44YBCPYM/


[ovirt-users] Re: Parent checkpoint ID does not match the actual leaf checkpoint

2021-05-26 Thread Nir Soffer
On Wed, May 26, 2021 at 12:48 PM Tommaso - Shellrent 
wrote:

> after some investigation we have foud that:
>
> via virsh our VM have 71 checkpoint
>
> on engine's db, in the table vm_checkpoints there are ZERO checkpoint.
>
> Is ther a way to sync the checkpoints?!?
>

Engine deletes all checkpoints from its database if it cannot redefine
the checkpoints in libvirt. After this you must start again with a full
backup.

The stale checkpoints in libvirt should not happen; please file a bug for
this.
We need to understand why the engine deleted the checkpoints without deleting
them in libvirt.

The checkpoints in libvirt are deleted when you stop the VM.

You can also delete them safely using virsh.

Nir

>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ETJ524XHHCOHPW2MCPOCPHUC4TKARX5T/


[ovirt-users] Re: Importing VM fails with "No space left on device"

2021-05-09 Thread Nir Soffer
On Sun, May 9, 2021 at 8:54 AM  wrote:
>
> Hello List,
> I am facing the following issue when I try to import a VM from a KVM host to 
> my oVirt (4.4.5.11-1.el8).
> The importing  I done throuth GUI using the option of KVM provider.
>
> -- Log1:
> # cat 
> /var/log/vdsm/import-57f84423-56cb-4187-86e2-f4208348e1f5-20210507T124121.log
> [0.0] preparing for copy
> [0.0] Copying disk 1/1 to 
> /rhev/data-center/mnt/blockSD/cc9fae8e-b714-44cf-9dac-3a83a15b0455/images/cb63ffc9-07ee-4323-9e8a-378be31ae3f7/e7e69cbc-47bf-4557-ae02-ca1c53c8423f
> Traceback (most recent call last):
...
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/util.py", 
> line 20, in uninterruptible
> return func(*args)
> OSError: [Errno 28] No space left on device

Looks like the disk:
/rhev/data-center/mnt/blockSD/cc9fae8e-b714-44cf-9dac-3a83a15b0455/images/cb63ffc9-07ee-4323-9e8a-378be31ae3f7/e7e69cbc-47bf-4557-ae02-ca1c53c8423f

was created with the wrong initial size.

When importing VMs from libvirt I don't think we have a way to get the
required allocation of the disk, so the disk must be created with
initial_size=virtual_size, and this was probably not done in this case.

Please file a bug and include full vdsm logs from the SPM host and from the
host running the import, and full engine logs. The logs should show the
creation of the target disk cb63ffc9-07ee-4323-9e8a-378be31ae3f7.

You can grep this uuid in the vdsm logs (/var/log/vdsm/vdsm.log*) on the
SPM host and the host running the import, and on the engine host
(/var/log/ovirt-engine/engine.log).


> -- Details of the environment:
> # df -Ph
> Filesystem                                                   Size  Used Avail Use% Mounted on
> devtmpfs                                                      32G     0   32G   0% /dev
> tmpfs                                                         32G  4.0K   32G   1% /dev/shm
> tmpfs                                                         32G   26M   32G   1% /run
> tmpfs                                                         32G     0   32G   0% /sys/fs/cgroup
> /dev/mapper/onn-ovirt--node--ng--4.4.5.1--0.20210323.0+1     584G   11G  573G   2% /
> /dev/mapper/onn-home                                        1014M   40M  975M   4% /home
> /dev/mapper/onn-tmp                                         1014M   40M  975M   4% /tmp
> /dev/sda2                                                   1014M  479M  536M  48% /boot
> /dev/mapper/onn-var                                           30G  3.2G   27G  11% /var
> /dev/sda1                                                    599M  6.9M  592M   2% /boot/efi
> /dev/mapper/onn-var_log                                      8.0G  498M  7.6G   7% /var/log
> /dev/mapper/onn-var_crash                                     10G  105M  9.9G   2% /var/crash
> /dev/mapper/onn-var_log_audit                                2.0G   84M  2.0G   5% /var/log/audit
> tmpfs                                                        6.3G     0  6.3G   0% /run/user/0
> /dev/mapper/da3e3aff--0bfc--42cd--944f--f6145c50134a-master  976M  1.3M  924M   1% /rhev/data-center/mnt/blockSD/da3e3aff-0bfc-42cd-944f-f6145c50134a/master
> /dev/mapper/onn-lv_iso                                        12G   11G  1.6G  88% /rhev/data-center/mnt/_dev_mapper_onn-lv__iso
> 172.19.1.80:/exportdomain                                    584G   11G  573G   2% /rhev/data-center/mnt/172.19.1.80:_exportdomain
>
> * Inodes available = 99%.

The available space on the host is not related; the issue is creating a big
enough disk when creating a sparse volume on block storage.

> # qemu-img info /var/lib/libvirt/images/vm_powervp-si.qcow2
> image: /var/lib/libvirt/images/vm_powervp-si.qcow2
> file format: qcow2
> virtual size: 20G (21474836480 bytes)
> disk size: 4.2G
> cluster_size: 65536
> Format specific information:
> compat: 1.1
> lazy refcounts: true

Since you have access to the disk you want to import, you can upload
it to oVirt and create a new vm with the disk, instead of importing via
libvirt.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TVTSOH5PT7YF6A6HB5YQZXH5WGJCO76L/


[ovirt-users] Re: Attempting to detach a storage domain

2021-04-26 Thread Nir Soffer
On Tue, Apr 27, 2021 at 12:48 AM matthew.st...@fujitsu.com
 wrote:
>
> I'm getting tons of these in the 
> hosted-engine:/var/log/ovirt-engine/engine.log
>
> 2021-04-26 16:40:40,260-05 ERROR 
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
> (EE-ManagedThreadFactory-engineScheduled-Thread-8) [] EVENT_ID: 
> VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ..zzz command Get Host 
> Capabilities failed: Internal JSON-RPC error: {'reason': 'internal error: 
> Duplicate key'}

This looks like a bad response from vdsm. Can you share the vdsm log from
the host ..zzz?

/var/log/vdsm/vdsm.log

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IVYHETDWTWB47ZSSRJMUUE3D5OZRB6P4/


[ovirt-users] Re: Ovirt 4.4.6.5-1 engine-setup failed

2021-04-26 Thread Nir Soffer
y", line 144, in __init__
>> _socket.socket.__init__(self, family, type, proto, fileno)
>> OSError: [Errno 97] Address family not supported by protocol
>>
>> # ovirt-imageio --show-config
>> {
>> "backend_file": {
>> "buffer_size": 8388608
>> },
>> "backend_http": {
>> "buffer_size": 8388608,
>> "ca_file": "/etc/pki/ovirt-engine/ca.pem"
>> },
>> "backend_nbd": {
>> "buffer_size": 8388608
>> },
>> "control": {
>> "port": 54324,
>> "prefer_ipv4": true,
>> "remove_timeout": 60,
>> "socket": "/run/ovirt-imageio/sock",
>> "transport": "tcp"
>> },
>> "daemon": {
>> "drop_privileges": true,
>> "group_name": "ovirtimg",
>> "max_connections": 8,
>> "poll_interval": 1.0,
>> "run_dir": "/run/ovirt-imageio",
>> "user_name": "ovirtimg"
>> },
>> "formatter_long": {
>> "format": "%(asctime)s %(levelname)-7s (%(threadName)s) [%(name)s] 
>> %(message)s"
>> },
>> "formatters": {
>> "keys": "long"
>> },
>> "handler_logfile": {
>> "args": "(\"/var/log/ovirt-imageio/daemon.log\",)",
>> "formatter": "long",
>> "class": "logging.handlers.RotatingFileHandler",
>> "kwargs": "{\"maxBytes\": 20971520, \"backupCount\": 10}",
>> "level": "DEBUG"
>> },
>> "handler_stderr": {
>> "formatter": "long",
>> "class": "logging.StreamHandler",
>> "level": "DEBUG"
>> },
>> "handlers": {
>> "keys": "logfile"
>> },
>> "local": {
>> "enable": false,
>> "socket": "\u/org/ovirt/imageio"
>> },
>> "logger_root": {
>> "handlers": "logfile",
>> "level": "DEBUG",
>> "propagate": 0
>> },
>> "loggers": {
>> "keys": "root"
>> },
>> "profile": {
>> "filename": "/run/ovirt-imageio/profile"
>> },
>> "remote": {
>> "host": "::",
>> "port": 54323
>> },
>> "tls": {
>> "ca_file": "/etc/pki/ovirt-engine/apache-ca.pem",
>> "cert_file": "/etc/pki/ovirt-engine/certs/apache.cer",
>> "enable": true,
>> "enable_tls1_1": false,
>> "key_file": "/etc/pki/ovirt-engine/keys/apache.key.nopass"
>> }
>> }
>>
>>
>> On Mon, Apr 26, 2021 at 3:22 PM Nir Soffer  wrote:
>>>
>>> On Mon, Apr 26, 2021 at 10:58 PM Don Dupuis  wrote:
>>> >
>>> > Nir
>>> > It just repeats. Here you go
>>>
>>> Thanks, it seems that we need more work on logging.
>>>
>>> Can you try to change the log level to DEBUG and start the
>>> ovirt-imageio service?
>>>
>>> Add this file:
>>>
>>> $ cat /etc/ovirt-imageio/conf.d/99-local.conf
>>> [logger_root]
>>> level = DEBUG
>>>
>>> And start the service:
>>>
>>> $ systemctl start ovirt-imageio
>>>
>>> I hope we will have more info in the log explaining this issue.
>>>
>>> And to make sure we have correct configuration, please share output of:
>>>
>>> $ ovirt-imageio --show-config
>>>
>>> > 2021-04-26 13:50:49,600 INFO(MainThread) [server] Starting 
>>> > (hostname=manager2 pid=60513, version=2.1.1)
>>> > 2021-04-26 13:50:49,609 ERROR   (MainThread) [server] Server failed
>>> > Traceback (most recent call last):
>>> >   File 
>>> > "/usr/lib64/python3.6/site-packages/ovirt_imag

[ovirt-users] Re: Ovirt 4.4.6.5-1 engine-setup failed

2021-04-26 Thread Nir Soffer
On Mon, Apr 26, 2021 at 10:58 PM Don Dupuis  wrote:
>
> Nir
> It just repeats. Here you go

Thanks, it seems that we need more work on logging.

Can you try to change the log level to DEBUG and start the
ovirt-imageio service?

Add this file:

$ cat /etc/ovirt-imageio/conf.d/99-local.conf
[logger_root]
level = DEBUG

And start the service:

$ systemctl start ovirt-imageio

I hope we will have more info in the log explaining this issue.

And to make sure we have correct configuration, please share output of:

$ ovirt-imageio --show-config

> 2021-04-26 13:50:49,600 INFO(MainThread) [server] Starting 
> (hostname=manager2 pid=60513, version=2.1.1)
> 2021-04-26 13:50:49,609 ERROR   (MainThread) [server] Server failed
> Traceback (most recent call last):
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/server.py", line 
> 46, in main
> server = Server(cfg)
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/server.py", line 
> 110, in __init__
> self.remote_service = services.RemoteService(self.config, self.auth)
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/services.py", 
> line 73, in __init__
> self._server = http.Server((config.remote.host, port), http.Connection)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", 
> line 91, in __init__
> self.create_socket(prefer_ipv4)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", 
> line 123, in create_socket
> self.socket = socket.socket(self.address_family, self.socket_type)
>   File "/usr/lib64/python3.6/socket.py", line 144, in __init__
> _socket.socket.__init__(self, family, type, proto, fileno)
> OSError: [Errno 97] Address family not supported by protocol
> 2021-04-26 13:50:49,954 INFO(MainThread) [server] Starting 
> (hostname=manager2 pid=60530, version=2.1.1)
> 2021-04-26 13:50:49,967 ERROR   (MainThread) [server] Server failed
> Traceback (most recent call last):
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/server.py", line 
> 46, in main
> server = Server(cfg)
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/server.py", line 
> 110, in __init__
> self.remote_service = services.RemoteService(self.config, self.auth)
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/services.py", 
> line 73, in __init__
> self._server = http.Server((config.remote.host, port), http.Connection)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", 
> line 91, in __init__
> self.create_socket(prefer_ipv4)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", 
> line 123, in create_socket
> self.socket = socket.socket(self.address_family, self.socket_type)
>   File "/usr/lib64/python3.6/socket.py", line 144, in __init__
> _socket.socket.__init__(self, family, type, proto, fileno)
> OSError: [Errno 97] Address family not supported by protocol
> 2021-04-26 13:50:50,199 INFO(MainThread) [server] Starting 
> (hostname=manager2 pid=60567, version=2.1.1)
> 2021-04-26 13:50:50,203 ERROR   (MainThread) [server] Server failed
> Traceback (most recent call last):
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/server.py", line 
> 46, in main
> server = Server(cfg)
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/server.py", line 
> 110, in __init__
> self.remote_service = services.RemoteService(self.config, self.auth)
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/services.py", 
> line 73, in __init__
> self._server = http.Server((config.remote.host, port), http.Connection)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", 
> line 91, in __init__
> self.create_socket(prefer_ipv4)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", 
> line 123, in create_socket
> self.socket = socket.socket(self.address_family, self.socket_type)
>   File "/usr/lib64/python3.6/socket.py", line 144, in __init__
> _socket.socket.__init__(self, family, type, proto, fileno)
> OSError: [Errno 97] Address family not supported by protocol
>
> Don
>
> On Mon, Apr 26, 2021 at 2:34 PM Nir Soffer  wrote:
>>
>> On Mon, Apr 26, 2021 at 7:46 PM Don Dupuis  wrote:
>> >
>> > I am installing Ovirt 4.4.6.5-1 engine on CentOS Stream 8 on a dedicated 
>> > vm. Towards the end of the engine-setup session 

[ovirt-users] Re: Ovirt 4.4.6.5-1 engine-setup failed

2021-04-26 Thread Nir Soffer
On Mon, Apr 26, 2021 at 7:46 PM Don Dupuis  wrote:
>
> I am installing Ovirt 4.4.6.5-1 engine on CentOS Stream 8 on a dedicated vm. 
> Towards the end of the engine-setup session I have a failure:
>
> [ INFO  ] Stage: Transaction setup
> [ INFO  ] Stopping engine service
> [ INFO  ] Stopping ovirt-fence-kdump-listener service
> [ INFO  ] Stopping dwh service
> [ INFO  ] Stopping vmconsole-proxy service
> [ INFO  ] Stopping websocket-proxy service
> [ INFO  ] Stage: Misc configuration (early)
> [ INFO  ] Stage: Package installation
> [ INFO  ] Stage: Misc configuration
> [ INFO  ] Upgrading CA
> [ INFO  ] Initializing PostgreSQL
> [ INFO  ] Creating PostgreSQL 'engine' database
> [ INFO  ] Configuring PostgreSQL
> [ INFO  ] Creating PostgreSQL 'ovirt_engine_history' database
> [ INFO  ] Configuring PostgreSQL
> [ INFO  ] Creating CA: /etc/pki/ovirt-engine/ca.pem
> [ INFO  ] Creating CA: /etc/pki/ovirt-engine/qemu-ca.pem
> [ INFO  ] Updating OVN SSL configuration
> [ INFO  ] Updating OVN timeout configuration
> [ INFO  ] Creating/refreshing DWH database schema
> [ INFO  ] Setting up ovirt-vmconsole proxy helper PKI artifacts
> [ INFO  ] Setting up ovirt-vmconsole SSH PKI artifacts
> [ INFO  ] Configuring WebSocket Proxy
> [ INFO  ] Creating/refreshing Engine database schema
> [ INFO  ] Creating a user for Grafana
> [ INFO  ] Creating/refreshing Engine 'internal' domain database schema
> [ INFO  ] Creating default mac pool range
> [ INFO  ] Adding default OVN provider to database
> [ INFO  ] Adding OVN provider secret to database
> [ INFO  ] Setting a password for internal user admin
> [ INFO  ] Install selinux module 
> /usr/share/ovirt-engine/selinux/ansible-runner-service.cil
> [ INFO  ] Generating post install configuration file 
> '/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf'
> [ INFO  ] Stage: Transaction commit
> [ INFO  ] Stage: Closing up
> [ INFO  ] Starting engine service
> [ INFO  ] Starting dwh service
> [ INFO  ] Starting Grafana service
> [ ERROR ] Failed to execute stage 'Closing up': Failed to start service 
> 'ovirt-imageio'
> [ INFO  ] Stage: Clean up
>   Log file is located at 
> /var/log/ovirt-engine/setup/ovirt-engine-setup-20210426111225-g31wa5.log
> [ INFO  ] Generating answer file 
> '/var/lib/ovirt-engine/setup/answers/20210426111501-setup.conf'
> [ INFO  ] Stage: Pre-termination
> [ INFO  ] Stage: Termination
> [ ERROR ] Execution of setup failed
>
> Below is output of /var/log/ovirt-imageio/daemon.log:
>
> 2021-04-26 11:15:01,481 ERROR   (MainThread) [server] Server failed
> Traceback (most recent call last):
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/server.py", line 
> 46, in main
> server = Server(cfg)
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/server.py", line 
> 110, in __init__
> self.remote_service = services.RemoteService(self.config, self.auth)
>   File 
> "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/services.py", 
> line 73, in __init__
> self._server = http.Server((config.remote.host, port), http.Connection)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", 
> line 91, in __init__
> self.create_socket(prefer_ipv4)
>   File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", 
> line 123, in create_socket
> self.socket = socket.socket(self.address_family, self.socket_type)
>   File "/usr/lib64/python3.6/socket.py", line 144, in __init__
> _socket.socket.__init__(self, family, type, proto, fileno)
> OSError: [Errno 97] Address family not supported by protocol
>
> I am not seeing errors in /var/log/ovirt-engine/setup/setup.log. What other 
> information do you need that is revelant to this issue?

Can you share a complete log?

/var/log/ovirt-imageio/daemon.log

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QTL2LNUODIGD4OU6IYZ6Q36DD66IPGDV/

