[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-09-23 Thread Shantur Rathore
So,
I did more digging and now I know how to reproduce it.
I created a VM and added a disk on a local SSD using the scratchpad hook,
then formatted and mounted this scratch disk.
Now, when I try to do heavy IO on this scratch disk on the local SSD, e.g.
dd if=/dev/zero of=/mnt/scratchdisk/test bs=1M count=1, qemu
pauses the VM.
The libvirt debug log shows:

2021-09-23 11:04:32.765+0000: 463319: debug : virThreadJobSet:94 : Thread 463319 (rpc-worker) is now running job remoteDispatchNodeGetFreePages
2021-09-23 11:04:32.765+0000: 463319: debug : virNodeGetFreePages:1614 : conn=0x7f8620018ba0, npages=3, pages=0x7f8670009960, startCell=4294967295, cellCount=1, counts=0x7f8670007db0, flags=0x0
2021-09-23 11:04:32.765+0000: 463319: debug : virThreadJobClear:119 : Thread 463319 (rpc-worker) finished job remoteDispatchNodeGetFreePages with ret=0
2021-09-23 11:04:34.235+0000: 488774: debug : qemuMonitorJSONIOProcessLine:220 : Line [{"timestamp": {"seconds": 1632395074, "microseconds": 235454}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": false, "node-name": "libvirt-3-format", "reason": "Input/output error", "operation": "write", "action": "stop"}}]
2021-09-23 11:04:34.235+0000: 488774: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7f860c14b700 event={"timestamp": {"seconds": 1632395074, "microseconds": 235454}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": false, "node-name": "libvirt-3-format", "reason": "Input/output error", "operation": "write", "action": "stop"}}
2021-09-23 11:04:34.235+0000: 488774: debug : qemuMonitorJSONIOProcessEvent:181 : mon=0x7f860c14b700 obj=0x7f860c0e7450
2021-09-23 11:04:34.235+0000: 488774: debug : qemuMonitorEmitEvent:1166 : mon=0x7f860c14b700 event=BLOCK_IO_ERROR
2021-09-23 11:04:34.235+0000: 488774: debug : qemuProcessHandleEvent:581 : vm=0x7f86201d6df0
2021-09-23 11:04:34.235+0000: 488774: debug : virObjectEventNew:624 : obj=0x7f860c0d82f0
2021-09-23 11:04:34.235+0000: 488774: debug : qemuMonitorJSONIOProcessEvent:206 : handle BLOCK_IO_ERROR handler=0x7f8639c77a90 data=0x7f860c0661c0

To confirm: the local SSD is fine, it has enough free space where the scratch
disk is located, and I could run dd on the host without any issues.

This happens on other storage types as well,
so this seems like an issue in qemu when heavy IO is happening on a disk.

On Thu, Sep 23, 2021 at 7:19 AM Tommy Sway  wrote:
>
> Another option (still tech preview) is Managed Block Storage (Cinder
> based storage).
>
> Is it still tech preview in 4.4?
>
> -Original Message-
> From: users-boun...@ovirt.org  On Behalf Of Nir 
> Soffer
> Sent: Wednesday, August 11, 2021 4:26 AM
> To: Shantur Rathore 
> Cc: users ; Roman Bednar 
> Subject: [ovirt-users] Re: Sparse VMs from Templates - Storage issues
>
> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore  
> wrote:
> >
> > Hi all,
> >
> > I have a setup as detailed below
> >
> > - iSCSI Storage Domain
> > - Template with Thin QCOW2 disk
> > - Multiple VMs from Template with Thin disk
>
> Note that a single template disk used by many vms can become a performance 
> bottleneck, and is a single point of failure. Cloning the template when 
> creating vms avoids such issues.
>
> > oVirt Node 4.4.4
>
> 4.4.4 is old, you should upgrade to 4.4.7.
>
> > When the VM boots up it downloads some data, and that leads to an
> > increase in volume size.
> > I see that every few seconds the VM gets paused with
> >
> > "VM X has been paused due to no Storage space error."
> >
> >  and then after few seconds
> >
> > "VM X has recovered from paused back to up"
>
> This is normal operation when a vm writes too quickly and oVirt cannot extend 
> the disk quickly enough. To mitigate this, you can increase the volume chunk
> size.
>
> Create this configuration drop-in file:
>
> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
> [irs]
> volume_utilization_percent = 25
> volume_utilization_chunk_mb = 2048
>
> And restart vdsm.
>
> With this setting, when free space in a disk drops below 1.5g, the disk will be
> extended by 2g. With the default settings, the disk was extended by 1g when
> free space dropped below 0.5g.
>
> If this does not eliminate the pauses, try a larger chunk size like 4096.
>
> > Sometimes after many pauses and recoveries the VM dies with
> >
> > "VM X is down with error. Exit message: Lost connection with qemu process."

[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-09-22 Thread Tommy Sway
Another option (still tech preview) is Managed Block Storage (Cinder based
storage).

Is it still tech preview in 4.4?







-Original Message-
From: users-boun...@ovirt.org  On Behalf Of Nir Soffer
Sent: Wednesday, August 11, 2021 4:26 AM
To: Shantur Rathore 
Cc: users ; Roman Bednar 
Subject: [ovirt-users] Re: Sparse VMs from Templates - Storage issues

On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore  
wrote:
>
> Hi all,
>
> I have a setup as detailed below
>
> - iSCSI Storage Domain
> - Template with Thin QCOW2 disk
> - Multiple VMs from Template with Thin disk

Note that a single template disk used by many vms can become a performance 
bottleneck, and is a single point of failure. Cloning the template when 
creating vms avoids such issues.

> oVirt Node 4.4.4

4.4.4 is old, you should upgrade to 4.4.7.

> When the VM boots up it downloads some data, and that leads to an increase
> in volume size.
> I see that every few seconds the VM gets paused with
>
> "VM X has been paused due to no Storage space error."
>
>  and then after few seconds
>
> "VM X has recovered from paused back to up"

This is normal operation when a vm writes too quickly and oVirt cannot extend 
the disk quickly enough. To mitigate this, you can increase the volume chunk size.

Create this configuration drop-in file:

# cat /etc/vdsm/vdsm.conf.d/99-local.conf
[irs]
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048

And restart vdsm.

With this setting, when free space in a disk drops below 1.5g, the disk will be
extended by 2g. With the default settings, the disk was extended by 1g when free
space dropped below 0.5g.

If this does not eliminate the pauses, try a larger chunk size like 4096.
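The extension rule above can be sketched as a small model (my own illustration,
not vdsm's actual code; the function names are mine). It matches the two
examples given: the volume is extended by volume_utilization_chunk_mb once free
space drops below (100 - volume_utilization_percent)% of the chunk size.

```python
# Illustrative model of the thin-volume extension rule (not vdsm code).

def extension_threshold_mb(utilization_percent, chunk_mb):
    """Free space (MB) below which the volume gets extended."""
    return chunk_mb * (100 - utilization_percent) // 100

def next_size_mb(current_mb, free_mb, utilization_percent, chunk_mb):
    """Return the volume size after one monitoring cycle."""
    if free_mb < extension_threshold_mb(utilization_percent, chunk_mb):
        return current_mb + chunk_mb
    return current_mb

# Defaults (50% / 1024): extend by 1g when free space drops below 0.5g.
assert extension_threshold_mb(50, 1024) == 512
# Suggested settings (25% / 2048): extend by 2g when free space drops below 1.5g.
assert extension_threshold_mb(25, 2048) == 1536
```

Raising the chunk size widens both the trigger threshold and the extension
step, so qemu gets more headroom between extensions.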

> Sometimes after many pauses and recoveries the VM dies with
>
> "VM X is down with error. Exit message: Lost connection with qemu process."

This means qemu has crashed. You can find more info in the vm log at:
/var/log/libvirt/qemu/vm-name.log

We know about bugs in qemu that cause such crashes when vm disk is extended. I 
think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7 will fix this 
issue.

Even with these settings, if you have very bursty IO in the vm, it may become
paused. The only way to completely avoid these pauses is to use a preallocated 
disk, or use file storage (e.g. NFS). Preallocated disk can be thin provisioned 
on the server side so it does not mean you need more storage, but you will not 
be able to use shared templates in the way you use them now. You can create vm 
from template, but the template is cloned to the new vm.

Another option (still tech preview) is Managed Block Storage (Cinder based
storage). If your storage server is supported by Cinder, we can manage it
using cinderlib. In this setup every disk is a LUN, which may be thin
provisioned on the storage server. This can also offload storage operations to 
the server, like cloning disks, which may be much faster and more efficient.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org Privacy Statement: 
https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/W653KLDZMLUNMKLE242UFH5LY4KQ6LD5/


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-09-22 Thread Vojtech Juranek
On Wednesday, 22 September 2021 18:09:28 CEST Shantur Rathore wrote:
> I have actually tried many types of storage now and all have this issue.

This is weird. Could you please use file-based storage (e.g. NFS) and post here
the whole exceptions from the vdsm log (/var/log/vdsm/vdsm.log) and the qemu log
(/var/log/libvirt/qemu/vm_name.log) from the host which runs the VM? Hopefully
this will give us some hint about what the real issue is.
Thanks
Vojta

> 
> I am out of ideas what to do
> 
> On Wed, Sep 22, 2021 at 4:39 PM Shantur Rathore
>  wrote:
> >
> > Hi Nir,
> >
> > Just to report.
> > As suggested, I created a Posix compliant storage domain with CephFS
> > and copied my templates to CephFS.
> > Now I created VMs from CephFS templates and the storage error happens
> > again. As I understand, the storage growth issue is only on iSCSI.
> >
> > Am I doing something wrong?
> >
> > Kind regards,
> > Shantur
> >
> > On Wed, Aug 11, 2021 at 2:42 PM Nir Soffer  wrote:
> > >
> > > On Wed, Aug 11, 2021 at 4:24 PM Arik Hadas  wrote:
> > > >
> > > > On Wed, Aug 11, 2021 at 2:56 PM Benny Zlotnik  wrote:
> > > >>
> > > >> > If your vm is temporary and you like to drop the data written while
> > > >> > the vm is running, you
> > > >> > could use a temporary disk based on the template. This is called a
> > > >> > "transient disk" in vdsm.
> > > >> >
> > > >> > Arik, maybe you remember how transient disks are used in engine?
> > > >> > Do we have an API to run a VM once, dropping the changes to the disk
> > > >> > done while the VM was running?
> > > >>
> > > >> I think that's how stateless VMs work
> > > >
> > > > +1
> > > > It doesn't work exactly like Nir wrote above - stateless VMs that are
> > > > thin-provisioned would have a qcow volume on top of each template's
> > > > volume and when they start, their active volume would be a qcow
> > > > volume on top of the aforementioned qcow volume and that active
> > > > volume will be removed when the VM goes down
> > > > But yeah, stateless VMs are intended for such a use case
> > >
> > > I was referring to transient disks - created in vdsm:
> > > https://github.com/oVirt/vdsm/blob/45903d01e142047093bf844628b5d90df12b6ffb/lib/vdsm/virt/vm.py#L3789
> > >
> > > This creates a *local* temporary file using qcow2 format, using the
> > > disk on shared storage as a backing file.
> > >
> > > Maybe this is not used by engine?
> 





[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-09-22 Thread Shantur Rathore
I have actually tried many types of storage now and all have this issue.

I am out of ideas what to do

On Wed, Sep 22, 2021 at 4:39 PM Shantur Rathore
 wrote:
>
> Hi Nir,
>
> Just to report.
> As suggested, I created a Posix compliant storage domain with CephFS
> and copied my templates to CephFS.
> Now I created VMs from CephFS templates and the storage error happens again.
> As I understand, the storage growth issue is only on iSCSI.
>
> Am I doing something wrong?
>
> Kind regards,
> Shantur
>
> On Wed, Aug 11, 2021 at 2:42 PM Nir Soffer  wrote:
> >
> > On Wed, Aug 11, 2021 at 4:24 PM Arik Hadas  wrote:
> > >
> > >
> > >
> > > On Wed, Aug 11, 2021 at 2:56 PM Benny Zlotnik  wrote:
> > >>
> > >> > If your vm is temporary and you like to drop the data written while
> > >> > the vm is running, you
> > >> > could use a temporary disk based on the template. This is called a
> > >> > "transient disk" in vdsm.
> > >> >
> > >> > Arik, maybe you remember how transient disks are used in engine?
> > >> > Do we have an API to run a VM once, dropping the changes to the disk
> > >> > done while the VM was running?
> > >>
> > >> I think that's how stateless VMs work
> > >
> > >
> > > +1
> > > It doesn't work exactly like Nir wrote above - stateless VMs that are
> > > thin-provisioned would have a qcow volume on top of each template's
> > > volume and when they start, their active volume would be a qcow volume
> > > on top of the aforementioned qcow volume and that active volume will be
> > > removed when the VM goes down
> > > But yeah, stateless VMs are intended for such a use case
> >
> > I was referring to transient disks - created in vdsm:
> > https://github.com/oVirt/vdsm/blob/45903d01e142047093bf844628b5d90df12b6ffb/lib/vdsm/virt/vm.py#L3789
> >
> > This creates a *local* temporary file using qcow2 format, using the
> > disk on shared
> > storage as a backing file.
> >
> > Maybe this is not used by engine?
> >


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-09-22 Thread Shantur Rathore
Hi Nir,

Just to report.
As suggested, I created a Posix compliant storage domain with CephFS
and copied my templates to CephFS.
Now I created VMs from CephFS templates and the storage error happens again.
As I understand, the storage growth issue is only on iSCSI.

Am I doing something wrong?

Kind regards,
Shantur

On Wed, Aug 11, 2021 at 2:42 PM Nir Soffer  wrote:
>
> On Wed, Aug 11, 2021 at 4:24 PM Arik Hadas  wrote:
> >
> >
> >
> > On Wed, Aug 11, 2021 at 2:56 PM Benny Zlotnik  wrote:
> >>
> >> > If your vm is temporary and you like to drop the data written while
> >> > the vm is running, you
> >> > could use a temporary disk based on the template. This is called a
> >> > "transient disk" in vdsm.
> >> >
> >> > Arik, maybe you remember how transient disks are used in engine?
> >> > Do we have an API to run a VM once, dropping the changes to the disk
> >> > done while the VM was running?
> >>
> >> I think that's how stateless VMs work
> >
> >
> > +1
> > It doesn't work exactly like Nir wrote above - stateless VMs that are
> > thin-provisioned would have a qcow volume on top of each template's volume
> > and when they start, their active volume would be a qcow volume on top of
> > the aforementioned qcow volume and that active volume will be removed when
> > the VM goes down
> > But yeah, stateless VMs are intended for such a use case
>
> I was referring to transient disks - created in vdsm:
> https://github.com/oVirt/vdsm/blob/45903d01e142047093bf844628b5d90df12b6ffb/lib/vdsm/virt/vm.py#L3789
>
> This creates a *local* temporary file using qcow2 format, using the
> disk on shared
> storage as a backing file.
>
> Maybe this is not used by engine?
>


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Nir Soffer
On Wed, Aug 11, 2021 at 4:24 PM Arik Hadas  wrote:
>
>
>
> On Wed, Aug 11, 2021 at 2:56 PM Benny Zlotnik  wrote:
>>
>> > If your vm is temporary and you like to drop the data written while
>> > the vm is running, you
>> > could use a temporary disk based on the template. This is called a
>> > "transient disk" in vdsm.
>> >
>> > Arik, maybe you remember how transient disks are used in engine?
>> > Do we have an API to run a VM once, dropping the changes to the disk
>> > done while the VM was running?
>>
>> I think that's how stateless VMs work
>
>
> +1
> It doesn't work exactly like Nir wrote above - stateless VMs that are
> thin-provisioned would have a qcow volume on top of each template's volume
> and when they start, their active volume would be a qcow volume on top of
> the aforementioned qcow volume and that active volume will be removed when
> the VM goes down
> But yeah, stateless VMs are intended for such a use case

I was referring to transient disks - created in vdsm:
https://github.com/oVirt/vdsm/blob/45903d01e142047093bf844628b5d90df12b6ffb/lib/vdsm/virt/vm.py#L3789

This creates a *local* temporary file using qcow2 format, using the
disk on shared
storage as a backing file.

Maybe this is not used by engine?
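As a sketch of what such a transient disk looks like at the qemu-img level
(illustrative only, with hypothetical paths; see the vdsm source linked above
for the real code path): the overlay is a local qcow2 file whose backing file
is the disk on shared storage.

```python
# Illustrative only: assemble the kind of qemu-img command that creates a
# local qcow2 overlay backed by a disk on shared storage. Paths are made up.
import shlex

def transient_overlay_cmd(backing_path, overlay_path, backing_fmt="qcow2"):
    """Build a qemu-img command for a local transient overlay."""
    return [
        "qemu-img", "create",
        "-f", "qcow2",       # the overlay itself is qcow2
        "-b", backing_path,  # shared-storage disk acts as the backing file
        "-F", backing_fmt,   # backing format (required by recent qemu-img)
        overlay_path,
    ]

cmd = transient_overlay_cmd("/path/to/shared/template-disk",
                            "/var/tmp/vm-transient.qcow2")
print(shlex.join(cmd))
```

All writes land in the local overlay; deleting it discards them, which is what
makes the disk "transient".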


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Nir Soffer
On Wed, Aug 11, 2021 at 3:13 PM Shantur Rathore
 wrote:
>
>
>> Yes, on file based storage a snapshot is a file, and it grows as
>> needed.  On block based
>> storage, a snapshot is a logical volume, and oVirt needs to extend it
>> when needed.
>
>
> Forgive my ignorance - I come from a vSphere background, where a filesystem
> was created on the iSCSI LUN.
> I take it that this isn't the case for an iSCSI Storage Domain in oVirt.

Yes, for block storage we create an LVM volume group from one or more LUNs
to create a storage domain. Disks are created as LVM logical volumes on
this VG.

When you create a vm from a template on block storage, we create a new 1g
logical volume for the vm disk, and create a qcow2 image on this logical volume
with the template's logical volume as its backing file.

The logical volume needs to be extended when free space is low. This is done
automatically on the host running the VM, but since oVirt is not in the data
path, the VM may write data too fast and pause when trying to write past the end
of the logical volume. In this case the VM will be resumed once oVirt finishes
extending the volume.
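The race described above (the VM writing faster than the host can extend the
logical volume) can be sketched as a toy simulation. This is an illustrative
model with made-up numbers, not vdsm code:

```python
# Toy simulation: a VM writes into a thin LV while the host extends it
# asynchronously. Illustration only; numbers and granularity are invented.

def simulate_pauses(writes_mb, lv_mb=1024, chunk_mb=1024, threshold_mb=512):
    """Count how often the VM pauses on ENOSPC while writing."""
    used = 0
    pauses = 0
    for write in writes_mb:
        while used + write > lv_mb:  # write would pass the end of the LV
            pauses += 1              # qemu pauses the VM...
            lv_mb += chunk_mb        # ...until the extension lands, then resumes
        used += write
        if lv_mb - used < threshold_mb:
            lv_mb += chunk_mb        # normal async extension keeps up
    return pauses

# A steady writer never outruns the extension,
assert simulate_pauses([100] * 20) == 0
# but one large burst passes the end of the LV and pauses the VM.
assert simulate_pauses([2000]) == 1
```

A bigger chunk size raises both the extension step and the trigger threshold,
making pauses less likely but never impossible for sufficiently bursty writers.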

> On Wed, Aug 11, 2021 at 12:26 PM Nir Soffer  wrote:
>>
>> On Wed, Aug 11, 2021 at 12:43 AM Shantur Rathore
>>  wrote:
>> >
>> > Thanks for the detailed response Nir.
>> >
>> > In my use case, we keep creating VMs from templates and deleting them so 
>> > we need the VMs to be created quickly and cloning it will use a lot of 
>> > time and storage.
>>
>> That's a good reason to use a template.
>>
>> If your vm is temporary and you like to drop the data written while
>> the vm is running, you
>> could use a temporary disk based on the template. This is called a
>> "transient disk" in vdsm.
>>
>> Arik, maybe you remember how transient disks are used in engine?
>> Do we have an API to run a VM once, dropping the changes to the disk
>> done while the VM was running?
>>
>> > I will try to add the config and try again tomorrow. Also I like the 
>> > Managed Block storage idea, I had read about it in the past and used it 
>> > with Ceph.
>> >
>> > Just to understand it better, is this issue only on iSCSI based storage?
>>
>> Yes, on file based storage a snapshot is a file, and it grows as
>> needed.  On block based
>> storage, a snapshot is a logical volume, and oVirt needs to extend it
>> when needed.
>>
>> Nir
>>
>> > Thanks again.
>> >
>> > Regards
>> > Shantur
>> >
>> > On Tue, Aug 10, 2021 at 9:26 PM Nir Soffer  wrote:
>> >>
>> >> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
>> >>  wrote:
>> >> >
>> >> > Hi all,
>> >> >
>> >> > I have a setup as detailed below
>> >> >
>> >> > - iSCSI Storage Domain
>> >> > - Template with Thin QCOW2 disk
>> >> > - Multiple VMs from Template with Thin disk
>> >>
>> >> Note that a single template disk used by many vms can become a performance
>> >> bottleneck, and is a single point of failure. Cloning the template when 
>> >> creating
>> >> vms avoids such issues.
>> >>
>> >> > oVirt Node 4.4.4
>> >>
>> >> 4.4.4 is old, you should upgrade to 4.4.7.
>> >>
>> >> > When the VM boots up it downloads some data, and that leads to an
>> >> > increase in volume size.
>> >> > I see that every few seconds the VM gets paused with
>> >> >
>> >> > "VM X has been paused due to no Storage space error."
>> >> >
>> >> >  and then after few seconds
>> >> >
>> >> > "VM X has recovered from paused back to up"
>> >>
>> >> This is normal operation when a vm writes too quickly and oVirt cannot
>> >> extend the disk quickly enough. To mitigate this, you can increase the
>> >> volume chunk size.
>> >>
>> >> Create this configuration drop-in file:
>> >>
>> >> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
>> >> [irs]
>> >> volume_utilization_percent = 25
>> >> volume_utilization_chunk_mb = 2048
>> >>
>> >> And restart vdsm.
>> >>
>> >> With this setting, when free space in a disk drops below 1.5g, the disk
>> >> will be extended by 2g. With the default settings, the disk was extended
>> >> by 1g when free space dropped below 0.5g.
>> >>
>> >> If this does not eliminate the pauses, try a larger chunk size
>> >> like 4096.
>> >>
>> >> > Sometimes after many pauses and recoveries the VM dies with
>> >> >
>> >> > "VM X is down with error. Exit message: Lost connection with qemu 
>> >> > process."
>> >>
>> >> This means qemu has crashed. You can find more info in the vm log at:
>> >> /var/log/libvirt/qemu/vm-name.log
>> >>
>> >> We know about bugs in qemu that cause such crashes when vm disk is
>> >> extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
>> >> will fix this issue.
>> >>
>> >> Even with these settings, if you have very bursty IO in the vm, it may
>> >> become paused. The only way to completely avoid these pauses is to
>> >> use a preallocated disk, or use file storage (e.g. NFS). Preallocated disk
>> >> can be thin provisioned on the server side so it does not mean you need
>> >> more storage, but you will not be able to use shared templates in the way
>> >> you use them now. You ca

[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Arik Hadas
On Wed, Aug 11, 2021 at 2:56 PM Benny Zlotnik  wrote:

> > If your vm is temporary and you like to drop the data written while
> > the vm is running, you
> > could use a temporary disk based on the template. This is called a
> > "transient disk" in vdsm.
> >
> > Arik, maybe you remember how transient disks are used in engine?
> > Do we have an API to run a VM once, dropping the changes to the disk
> > done while the VM was running?
>
> I think that's how stateless VMs work
>

+1
It doesn't work exactly like Nir wrote above - stateless VMs that are
thin-provisioned would have a qcow volume on top of each template's volume,
and when they start, their active volume would be a qcow volume on top of
the aforementioned qcow volume; that active volume will be removed when
the VM goes down.
But yeah, stateless VMs are intended for such a use case.
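The volume chain described above can be modeled in a few lines (a toy
illustration of the behaviour, not oVirt engine code; the layer names are
made up):

```python
# Toy model of the stateless thin VM volume chain: a temporary active layer
# is stacked at start and dropped at shutdown, reverting all runtime writes.

def start_stateless(chain):
    """Stack a temporary active qcow layer on top of the chain."""
    return chain + ["stateless-active.qcow2"]

def stop_stateless(chain):
    """Remove the temporary active layer, discarding runtime writes."""
    return chain[:-1]

# Template volume plus the VM's own thin qcow layer (hypothetical names).
base = ["template-volume", "vm-thin.qcow2"]
running = start_stateless(base)
assert running == ["template-volume", "vm-thin.qcow2", "stateless-active.qcow2"]
assert stop_stateless(running) == base  # nothing written while up survives
```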


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Shantur Rathore
> Yes, on file based storage a snapshot is a file, and it grows as
> needed.  On block based
> storage, a snapshot is a logical volume, and oVirt needs to extend it
> when needed.


Forgive my ignorance - I come from a vSphere background, where a filesystem was
created on the iSCSI LUN.
I take it that this isn't the case for an iSCSI Storage Domain in oVirt.

On Wed, Aug 11, 2021 at 12:26 PM Nir Soffer  wrote:

> On Wed, Aug 11, 2021 at 12:43 AM Shantur Rathore
>  wrote:
> >
> > Thanks for the detailed response Nir.
> >
> > In my use case, we keep creating VMs from templates and deleting them so
> > we need the VMs to be created quickly and cloning it will use a lot of time
> > and storage.
>
> That's a good reason to use a template.
>
> If your vm is temporary and you like to drop the data written while
> the vm is running, you
> could use a temporary disk based on the template. This is called a
> "transient disk" in vdsm.
>
> Arik, maybe you remember how transient disks are used in engine?
> Do we have an API to run a VM once, dropping the changes to the disk
> done while the VM was running?
>
> > I will try to add the config and try again tomorrow. Also I like the
> > Managed Block storage idea, I had read about it in the past and used it
> > with Ceph.
> >
> > Just to understand it better, is this issue only on iSCSI based storage?
>
> Yes, on file based storage a snapshot is a file, and it grows as
> needed.  On block based
> storage, a snapshot is a logical volume, and oVirt needs to extend it
> when needed.
>
> Nir
>
> > Thanks again.
> >
> > Regards
> > Shantur
> >
> > On Tue, Aug 10, 2021 at 9:26 PM Nir Soffer  wrote:
> >>
> >> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
> >>  wrote:
> >> >
> >> > Hi all,
> >> >
> >> > I have a setup as detailed below
> >> >
> >> > - iSCSI Storage Domain
> >> > - Template with Thin QCOW2 disk
> >> > - Multiple VMs from Template with Thin disk
> >>
> >> Note that a single template disk used by many vms can become a performance
> >> bottleneck, and is a single point of failure. Cloning the template when
> >> creating vms avoids such issues.
> >>
> >> > oVirt Node 4.4.4
> >>
> >> 4.4.4 is old, you should upgrade to 4.4.7.
> >>
> >> > When the VM boots up it downloads some data, and that leads to an
> >> > increase in volume size.
> >> > I see that every few seconds the VM gets paused with
> >> >
> >> > "VM X has been paused due to no Storage space error."
> >> >
> >> >  and then after few seconds
> >> >
> >> > "VM X has recovered from paused back to up"
> >>
> >> This is normal operation when a vm writes too quickly and oVirt cannot
> >> extend the disk quickly enough. To mitigate this, you can increase the
> >> volume chunk size.
> >>
> >> Create this configuration drop-in file:
> >>
> >> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
> >> [irs]
> >> volume_utilization_percent = 25
> >> volume_utilization_chunk_mb = 2048
> >>
> >> And restart vdsm.
> >>
> >> With this setting, when free space in a disk drops below 1.5g, the disk
> >> will be extended by 2g. With the default settings, the disk was extended
> >> by 1g when free space dropped below 0.5g.
> >>
> >> If this does not eliminate the pauses, try a larger chunk size
> >> like 4096.
> >>
> >> > Sometimes after many pauses and recoveries the VM dies with
> >> >
> >> > "VM X is down with error. Exit message: Lost connection with qemu
> >> > process."
> >>
> >> This means qemu has crashed. You can find more info in the vm log at:
> >> /var/log/libvirt/qemu/vm-name.log
> >>
> >> We know about bugs in qemu that cause such crashes when vm disk is
> >> extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
> >> will fix this issue.
> >>
> >> Even with these settings, if you have very bursty IO in the vm, it may
> >> become paused. The only way to completely avoid these pauses is to
> >> use a preallocated disk, or use file storage (e.g. NFS). Preallocated disk
> >> can be thin provisioned on the server side so it does not mean you need
> >> more storage, but you will not be able to use shared templates in the way
> >> you use them now. You can create vm from template, but the template
> >> is cloned to the new vm.
> >>
> >> Another option (still tech preview) is Managed Block Storage (Cinder
> >> based storage). If your storage server is supported by Cinder, we can
> >> manage it using cinderlib. In this setup every disk is a LUN, which may
> >> be thin provisioned on the storage server. This can also offload storage
> >> operations to the server, like cloning disks, which may be much faster
> >> and more efficient.
> >>
> >> Nir
> >>
>
>

[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Benny Zlotnik
> If your vm is temporary and you like to drop the data written while
> the vm is running, you
> could use a temporary disk based on the template. This is called a
> "transient disk" in vdsm.
>
> Arik, maybe you remember how transient disks are used in engine?
> Do we have an API to run a VM once, dropping the changes to the disk
> done while the VM was running?

I think that's how stateless VMs work


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-11 Thread Nir Soffer
On Wed, Aug 11, 2021 at 12:43 AM Shantur Rathore
 wrote:
>
> Thanks for the detailed response Nir.
>
> In my use case, we keep creating VMs from templates and deleting them so we 
> need the VMs to be created quickly and cloning it will use a lot of time and 
> storage.

That's a good reason to use a template.

If your vm is temporary and you'd like to drop the data written while
the vm is running, you could use a temporary disk based on the template.
This is called a "transient disk" in vdsm.

Arik, maybe you remember how transient disks are used in engine?
Do we have an API to run a VM once, dropping the changes to the disk
done while the VM was running?

> I will try to add the config and try again tomorrow. Also I like the Managed 
> Block storage idea, I had read about it in the past and used it with Ceph.
>
> Just to understand it better, is this issue only on iSCSI based storage?

Yes, on file based storage a snapshot is a file, and it grows as
needed.  On block based
storage, a snapshot is a logical volume, and oVirt needs to extend it
when needed.

Nir

> Thanks again.
>
> Regards
> Shantur
>
> On Tue, Aug 10, 2021 at 9:26 PM Nir Soffer  wrote:
>>
>> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
>>  wrote:
>> >
>> > Hi all,
>> >
>> > I have a setup as detailed below
>> >
>> > - iSCSI Storage Domain
>> > - Template with Thin QCOW2 disk
>> > - Multiple VMs from Template with Thin disk
>>
>> Note that a single template disk used by many vms can become a performance
>> bottleneck, and is a single point of failure. Cloning the template when 
>> creating
>> vms avoids such issues.
>>
>> > oVirt Node 4.4.4
>>
>> 4.4.4 is old, you should upgrade to 4.4.7.
>>
>> > When the VM boots up it downloads some data, and that leads to an
>> > increase in volume size.
>> > I see that every few seconds the VM gets paused with
>> >
>> > "VM X has been paused due to no Storage space error."
>> >
>> >  and then after a few seconds
>> >
>> > "VM X has recovered from paused back to up"
>>
>> This is normal operation when a vm writes too quickly and oVirt cannot
>> extend the disk quickly enough. To mitigate this, you can increase the
>> volume chunk size.
>>
>> Create this configuration drop-in file:
>>
>> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
>> [irs]
>> volume_utilization_percent = 25
>> volume_utilization_chunk_mb = 2048
>>
>> And restart vdsm.
>>
>> With this setting, when free space in a disk is 1.5g, the disk will
>> be extended by 2g. With the default setting, when free space is
>> 0.5g the disk was extended by 1g.
>>
>> If this does not eliminate the pauses, try a larger chunk size
>> like 4096.
>>
>> > Sometimes after many pauses and recoveries the VM dies with
>> >
>> > "VM X is down with error. Exit message: Lost connection with qemu process."
>>
>> This means qemu has crashed. You can find more info in the vm log at:
>> /var/log/libvirt/qemu/vm-name.log
>>
>> We know about bugs in qemu that cause such crashes when vm disk is
>> extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
>> will fix this issue.
>>
>> Even with these settings, if you have very bursty I/O in the vm, it may
>> become paused. The only way to completely avoid these pauses is to
>> use a preallocated disk, or use file storage (e.g. NFS). Preallocated disk
>> can be thin provisioned on the server side so it does not mean you need
>> more storage, but you will not be able to use shared templates in the way
>> you use them now. You can create vm from template, but the template
>> is cloned to the new vm.
>>
>> Another option (still in tech preview) is Managed Block Storage (Cinder
>> based storage). If your storage server is supported by Cinder, we can
>> manage it using cinderlib. In this setup every disk is a LUN, which may
>> be thin provisioned on the storage server. This can also offload storage
>> operations to the server, like cloning disks, which may be much faster and
>> more efficient.
>>
>> Nir
>>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NH3ZZMYOCTVKDF4GYKFOSQYPP2IK3JFT/


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-10 Thread Shantur Rathore
Thanks for the detailed response Nir.

In my use case, we keep creating VMs from templates and deleting them, so
we need the VMs to be created quickly; cloning would use a lot of time
and storage.
I will try to add the config and try again tomorrow. Also I like the
Managed Block storage idea, I had read about it in the past and used it
with Ceph.

Just to understand it better, is this issue only on iSCSI based storage?

Thanks again.

Regards
Shantur

On Tue, Aug 10, 2021 at 9:26 PM Nir Soffer  wrote:

> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
>  wrote:
> >
> > Hi all,
> >
> > I have a setup as detailed below
> >
> > - iSCSI Storage Domain
> > - Template with Thin QCOW2 disk
> > - Multiple VMs from Template with Thin disk
>
> Note that a single template disk used by many vms can become a performance
> bottleneck, and is a single point of failure. Cloning the template when
> creating
> vms avoids such issues.
>
> > oVirt Node 4.4.4
>
> 4.4.4 is old, you should upgrade to 4.4.7.
>
> > When the VM boots up it downloads some data, which leads to an
> > increase in volume size.
> > I see that every few seconds the VM gets paused with
> >
> > "VM X has been paused due to no Storage space error."
> >
> >  and then after a few seconds
> >
> > "VM X has recovered from paused back to up"
>
> This is normal operation when a vm writes too quickly and oVirt cannot
> extend the disk quickly enough. To mitigate this, you can increase the
> volume chunk size.
>
> Create this configuration drop-in file:
>
> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
> [irs]
> volume_utilization_percent = 25
> volume_utilization_chunk_mb = 2048
>
> And restart vdsm.
>
> With this setting, when free space in a disk is 1.5g, the disk will
> be extended by 2g. With the default setting, when free space is
> 0.5g the disk was extended by 1g.
>
> If this does not eliminate the pauses, try a larger chunk size
> like 4096.
>
> > Sometimes after many pauses and recoveries the VM dies with
> >
> > "VM X is down with error. Exit message: Lost connection with qemu
> process."
>
> This means qemu has crashed. You can find more info in the vm log at:
> /var/log/libvirt/qemu/vm-name.log
>
> We know about bugs in qemu that cause such crashes when vm disk is
> extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
> will fix this issue.
>
> Even with these settings, if you have very bursty I/O in the vm, it may
> become paused. The only way to completely avoid these pauses is to
> use a preallocated disk, or use file storage (e.g. NFS). Preallocated disk
> can be thin provisioned on the server side so it does not mean you need
> more storage, but you will not be able to use shared templates in the way
> you use them now. You can create vm from template, but the template
> is cloned to the new vm.
>
> Another option (still in tech preview) is Managed Block Storage (Cinder
> based storage). If your storage server is supported by Cinder, we can
> manage it using cinderlib. In this setup every disk is a LUN, which may
> be thin provisioned on the storage server. This can also offload storage
> operations to the server, like cloning disks, which may be much faster and
> more efficient.
>
> Nir
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4RRRKKOFSNWYMQWAMVR5VJ2WA2BBG2F5/


[ovirt-users] Re: Sparse VMs from Templates - Storage issues

2021-08-10 Thread Nir Soffer
On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
 wrote:
>
> Hi all,
>
> I have a setup as detailed below
>
> - iSCSI Storage Domain
> - Template with Thin QCOW2 disk
> - Multiple VMs from Template with Thin disk

Note that a single template disk used by many vms can become a performance
bottleneck, and is a single point of failure. Cloning the template when creating
vms avoids such issues.

> oVirt Node 4.4.4

4.4.4 is old, you should upgrade to 4.4.7.

> When the VM boots up it downloads some data, which leads to an increase
> in volume size.
> I see that every few seconds the VM gets paused with
>
> "VM X has been paused due to no Storage space error."
>
>  and then after a few seconds
>
> "VM X has recovered from paused back to up"

This is normal operation when a vm writes too quickly and oVirt cannot
extend the disk quickly enough. To mitigate this, you can increase the
volume chunk size.

Create this configuration drop-in file:

# cat /etc/vdsm/vdsm.conf.d/99-local.conf
[irs]
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048

And restart vdsm.

With this setting, when free space in a disk is 1.5g, the disk will
be extended by 2g. With the default setting, when free space is
0.5g the disk was extended by 1g.

If this does not eliminate the pauses, try a larger chunk size
like 4096.
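
The numbers above imply a simple rule (a sketch inferred from the
examples in this mail, not taken from vdsm's source): the volume is
extended by volume_utilization_chunk_mb once free space drops below
chunk_mb * (100 - volume_utilization_percent) / 100.

```python
def extend_threshold_mb(utilization_percent: int, chunk_mb: int) -> float:
    """Free-space level (MB) below which the volume gets extended.
    Formula is inferred from the worked examples in this thread:
    extension triggers when free < chunk_mb * (100 - percent) / 100."""
    return chunk_mb * (100 - utilization_percent) / 100

# Defaults: extend by 1g when free space drops below 0.5g
default = extend_threshold_mb(50, 1024)  # 512 MB
# Suggested settings: extend by 2g when free space drops below 1.5g
tuned = extend_threshold_mb(25, 2048)    # 1536 MB
print(default, tuned)
```
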

> Sometimes after many pauses and recoveries the VM dies with
>
> "VM X is down with error. Exit message: Lost connection with qemu process."

This means qemu has crashed. You can find more info in the vm log at:
/var/log/libvirt/qemu/vm-name.log

We know about bugs in qemu that cause such crashes when vm disk is
extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
will fix this issue.

Even with these settings, if you have very bursty I/O in the vm, it may
become paused. The only way to completely avoid these pauses is to
use a preallocated disk, or use file storage (e.g. NFS). Preallocated disk
can be thin provisioned on the server side so it does not mean you need
more storage, but you will not be able to use shared templates in the way
you use them now. You can create vm from template, but the template
is cloned to the new vm.

Another option (still in tech preview) is Managed Block Storage (Cinder
based storage). If your storage server is supported by Cinder, we can
manage it using cinderlib. In this setup every disk is a LUN, which may
be thin provisioned on the storage server. This can also offload storage
operations to the server, like cloning disks, which may be much faster and
more efficient.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/W653KLDZMLUNMKLE242UFH5LY4KQ6LD5/