[ovirt-users] Re: Dashboard very slow to load: change default page possible?

2020-02-24 Thread Yedidyah Bar David
On Mon, Feb 24, 2020 at 4:39 PM Gianluca Cecchi
 wrote:
>
> Hi,
> I'm noticing that after upgrading to 4.3.8, the dashboard is sometimes very
> slow: it takes minutes to display and to let me work, even though I'm not
> interested in the dashboard at all for the activities I have to do.
> As a consequence it becomes a bottleneck.
> On the engine I see that during these moments the postmaster process for the
> ovirt_engine_history database is consuming 100% CPU.
> At almost every minor update I run vacuum, but I don't know if there is
> something else involved.
> Is there any open bug about this in 4.3.8?

I am not aware of one. Please open, and attach relevant logs. If
possible, also please turn on full query logging on PG and attach its
logs. Thanks!
Also adding Shirly.
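
For reference, a minimal sketch of turning on full query logging (these are
standard PostgreSQL settings; on a 4.3 engine the data directory typically
lives under the rh-postgresql10 SCL, so adjust paths to your setup):

  # e.g. /var/opt/rh/rh-postgresql10/lib/pgsql/data/postgresql.conf
  log_statement = 'all'
  log_min_duration_statement = 0

  # reload the configuration without a restart:
  su - postgres -c "psql -c 'SELECT pg_reload_conf();'"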

> Is there any option to set another page as the landing one when I open web 
> admin portal?

I am not aware of one, but there is a very simple workaround: keep a
bookmark in your browser to whatever page you want, and after login just
open it. I think it should work after a few seconds, even if the
dashboard hasn't finished loading yet.
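
For example, a bookmark pointing straight at the VMs view could look like
this (the engine FQDN is a placeholder; the fragment names follow the 4.3
webadmin UI):

  https://engine.example.com/ovirt-engine/webadmin/#vms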

>
> I have a similar problem on a RHV environment where I did an upgrade from
> 4.2.7 to 4.2.8 and then to 4.3.8 on the same day (eventually I'm going to
> open a case for this one).
> When passing from 4.2.7 to 4.2.8 I chose full vacuum for the engine history 
> database:
>
>   Perform full vacuum on the oVirt engine history
>   database ovirt_engine_history@localhost?
>   This operation may take a while depending on this setup health and
>   the configuration of the db vacuum process.
>   See https://www.postgresql.org/docs/9.0/static/sql-vacuum.html
>   (Yes, No) [No]: Yes
>
> while I chose No during the upgrade from 4.2.8 to 4.3.8, which I did half
> an hour later.

Best regards,
-- 
Didi
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6ZAWLBPKQET4KOYRWQXH3TXRHYWIBB63/


[ovirt-users] Re: IO Storage Error / All findings / Need help.

2020-02-24 Thread Strahil Nikolov
On February 24, 2020 7:50:15 PM GMT+02:00, Hesham Ahmed  
wrote:
>In my case I am continuing with oVirt Node 4.3.8 (Gluster 6.7-based) for
>the time being. I have resolved the issue by manually copying all disk
>images to a new gluster volume, which took days, especially since disks
>on gluster still don't support sparse file copy. But the threat of a
>temporary network failure bringing down the complete oVirt setup is a
>bit too much risk.
>
>On Mon, Feb 24, 2020 at 9:29 PM Christian Reiss
>
>wrote:
>
>> Hey,
>>
>> I do not have the faulty cluster anymore; it's a production environment
>> with HA requirements, so I really can't take it down for days or, even
>> worse, weeks.
>>
>> I am now running on CentOS 7 (manual install) with a manual Gluster 7.0
>> installation and current oVirt. So far so good.
>>
>> Time will tell :)
>>
>> On 24/02/2020 18:11, Strahil Nikolov wrote:
>> > On February 24, 2020 5:10:40 PM GMT+02:00, Hesham Ahmed <
>> hsah...@gmail.com> wrote:
>> >> My issue is with Gluster 6.7 (the default with oVirt 4.3.7) as is the
>> >> case with Christian. I still have the failing volume and disks and can
>> >> share any information required.
>>
>>
>> --
>> with kind regards,
>> mit freundlichen Gruessen,
>>
>> Christian Reiss
>>
>>

Hey Hesham,

Do you keep the old volumes?
Maybe you can assist Ravi in debugging this issue?

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FIAZRQMTFSDPOV6OKVOOU7OAORJVXQDY/


[ovirt-users] Re: IO Storage Error / All findings / Need help.

2020-02-24 Thread Hesham Ahmed
In my case I am continuing with oVirt Node 4.3.8 (Gluster 6.7-based) for the
time being. I have resolved the issue by manually copying all disk images
to a new gluster volume, which took days, especially since disks on gluster
still don't support sparse file copy. But the threat of a temporary network
failure bringing down the complete oVirt setup is a bit too much risk.
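
For what it's worth, one approach that may keep the destination image sparse
is qemu-img convert, which skips writing blocks that read as zeroes; a
sketch, with placeholder paths and raw format assumed:

  qemu-img convert -p -f raw -O raw \
      /path/on/old/volume/<image-uuid> /path/on/new/volume/<image-uuid>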

On Mon, Feb 24, 2020 at 9:29 PM Christian Reiss 
wrote:

> Hey,
>
> I do not have the faulty cluster anymore; it's a production environment
> with HA requirements, so I really can't take it down for days or, even
> worse, weeks.
>
> I am now running on CentOS 7 (manual install) with a manual Gluster 7.0
> installation and current oVirt. So far so good.
>
> Time will tell :)
>
> On 24/02/2020 18:11, Strahil Nikolov wrote:
> > On February 24, 2020 5:10:40 PM GMT+02:00, Hesham Ahmed <
> hsah...@gmail.com> wrote:
> >> My issue is with Gluster 6.7 (the default with oVirt 4.3.7) as is the
> >> case with Christian. I still have the failing volume and disks and can
> >> share any information required.
>
>
> --
> with kind regards,
> mit freundlichen Gruessen,
>
> Christian Reiss
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7XZTECBK4ZHSAVFKK36CO4D7CROPRHVB/


[ovirt-users] Re: IO Storage Error / All findings / Need help.

2020-02-24 Thread Christian Reiss

Hey,

I do not have the faulty cluster anymore; it's a production environment
with HA requirements, so I really can't take it down for days or, even
worse, weeks.


I am now running on CentOS 7 (manual install) with a manual Gluster 7.0
installation and current oVirt. So far so good.


Time will tell :)

On 24/02/2020 18:11, Strahil Nikolov wrote:

On February 24, 2020 5:10:40 PM GMT+02:00, Hesham Ahmed  
wrote:

My issue is with Gluster 6.7 (the default with oVirt 4.3.7) as is the
case with Christian. I still have the failing volume and disks and can
share any information required.



--
with kind regards,
mit freundlichen Gruessen,

Christian Reiss
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/EYAL7TOIGSQ4JT255OLVB4FLO2J4DT5W/


[ovirt-users] Re: IO Storage Error / All findings / Need help.

2020-02-24 Thread Strahil Nikolov
On February 24, 2020 5:10:40 PM GMT+02:00, Hesham Ahmed  
wrote:
>My issue is with Gluster 6.7 (the default with oVirt 4.3.7) as is the
>case with Christian. I still have the failing volume and disks and can
>share any information required.
>
>On Mon, Feb 24, 2020 at 6:21 PM Strahil Nikolov 
>wrote:
>
>> On February 24, 2020 1:55:34 PM GMT+02:00, Hesham Ahmed
>
>> wrote:
>> >Were you ever able to find a fix for this? I am facing the same problem,
>> >and the case is similar to yours. We have a 6-node distributed-replicated
>> >Gluster; due to a network issue all servers got disconnected, and upon
>> >recovery one of the volumes started giving the same IO error. The files
>> >can be read as root but give an error when read as vdsm. Everything else
>> >is as in your case, including the oVirt versions. While doing a full dd
>> >if=IMAGE of=/dev/null allows the disk to be mounted on one server
>> >temporarily, upon reboot/restart it returns to failing with IO error. I
>> >had to create a completely new gluster volume and copy the disks from the
>> >failing volume as root to resolve this.
>> >
>> >Did you create a bug report in Bugzilla for this?
>> >
>> >Regards,
>> >
>> >Hesham Ahmed
>> >
>> >On Wed, Feb 5, 2020 at 1:01 AM Christian Reiss
>> >
>> >wrote:
>> >
>> >> Thanks for replying,
>> >>
>> >> What I just wrote Strahil was:
>> >>
>> >>
>> >> ACL is correctly set:
>> >>
>> >> # file: 5aab365f-b1b9-49d0-b011-566bf936a100
>> >> # owner: vdsm
>> >> # group: kvm
>> >> user::rw-
>> >> group::rw-
>> >> other::---
>> >>
>> >> Doing a setfacl failed due to "Operation not supported"; remounting
>> >> with acl, too:
>> >>
>> >> [root@node01 ~]# mount -o remount,acl
>> >> /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net\:_ssd__storage/
>> >> /bin/sh: glusterfs: command not found
>> >>
>> >> As I am running the oVirt Node, I am not sure how feasible
>> >> down-/upgrading is. I think I am stuck with what I have.
>> >>
>> >> Also, if this were a permission issue, I would not be able to access
>> >> the file at all. It seems I can access some of it, and all of it once
>> >> root has loaded the whole file first.
>> >>
>> >>
>> >> I also redid the chown from the mountpoint, even though it was already
>> >> correctly set, to no avail.
>> >>
>> >>
>> >> On 04/02/2020 21:53, Christian Reiss wrote:
>> >> >
>> >> > ACL is correctly set:
>> >> >
>> >> > # file: 5aab365f-b1b9-49d0-b011-566bf936a100
>> >> > # owner: vdsm
>> >> > # group: kvm
>> >> > user::rw-
>> >> > group::rw-
>> >> > other::---
>> >> >
>> >> > Doing a setfacl failed due to "Operation not supported"; remounting
>> >> > with acl, too:
>> >> >
>> >> > [root@node01 ~]# mount -o remount,acl
>> >> > /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net\:_ssd__storage/
>> >> > /bin/sh: glusterfs: command not found
>> >> >
>> >> > As I am running the oVirt Node, I am not sure how feasible
>> >> > down-/upgrading is. I think I am stuck with what I have.
>> >> >
>> >> > Also, if this were a permission issue, I would not be able to access
>> >> > the file at all. It seems I can access some of it, and all of it once
>> >> > root has loaded the whole file first.
>> >>
>> >> --
>> >> with kind regards,
>> >> mit freundlichen Gruessen,
>> >>
>> >> Christian Reiss
>> >> ___
>> >> Users mailing list -- users@ovirt.org
>> >> To unsubscribe send an email to users-le...@ovirt.org
>> >> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> >> oVirt Code of Conduct:
>> >> https://www.ovirt.org/community/about/community-guidelines/
>> >> List Archives:
>> >>
>> >
>>
>https://lists.ovirt.org/archives/list/users@ovirt.org/message/G2MOTXZU6AFAGV2GL6P5XBXHMCRFUM6F/
>> >>
>>
>> If you mean the ACL issue, check
>> https://bugzilla.redhat.com/show_bug.cgi?id=1797099
>> Ravi will be happy to have a setup that is already affected, so he can
>> debug the issue.
>> In my case, I have reverted to v7.0.
>>
>> Best Regards,
>> Strahil Nikolov
>>

If this is a production setup, one option is to downgrade to v6.5 (although
that is not recommended).
Another option is to mount with the acl option and force a setfacl:
find /mnt -exec setfacl -m u:root:rw {} \;
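
Note that a plain remount does not work on a GlusterFS fuse mount, so a full
umount/mount cycle is needed to add the acl option; a sketch, with the volume
name and mountpoint guessed from the paths in this thread:

  umount /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net:_ssd__storage
  mount -t glusterfs -o acl node01.dc-dus.dalason.net:/ssd_storage \
      /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net:_ssd__storage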

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5VNSHA66UH6Y3A6TXOXOQNA2HRTILXD6/


[ovirt-users] Re: IO Storage Error / All findings / Need help.

2020-02-24 Thread Hesham Ahmed
My issue is with Gluster 6.7 (the default with oVirt 4.3.7) as is the case
with Christian. I still have the failing volume and disks and can share any
information required.

On Mon, Feb 24, 2020 at 6:21 PM Strahil Nikolov 
wrote:

> On February 24, 2020 1:55:34 PM GMT+02:00, Hesham Ahmed 
> wrote:
> >Were you ever able to find a fix for this? I am facing the same problem,
> >and the case is similar to yours. We have a 6-node distributed-replicated
> >Gluster; due to a network issue all servers got disconnected, and upon
> >recovery one of the volumes started giving the same IO error. The files
> >can be read as root but give an error when read as vdsm. Everything else
> >is as in your case, including the oVirt versions. While doing a full dd
> >if=IMAGE of=/dev/null allows the disk to be mounted on one server
> >temporarily, upon reboot/restart it returns to failing with IO error. I
> >had to create a completely new gluster volume and copy the disks from the
> >failing volume as root to resolve this.
> >
> >Did you create a bug report in Bugzilla for this?
> >
> >Regards,
> >
> >Hesham Ahmed
> >
> >On Wed, Feb 5, 2020 at 1:01 AM Christian Reiss
> >
> >wrote:
> >
> >> Thanks for replying,
> >>
> >> What I just wrote Strahil was:
> >>
> >>
> >> ACL is correctly set:
> >>
> >> # file: 5aab365f-b1b9-49d0-b011-566bf936a100
> >> # owner: vdsm
> >> # group: kvm
> >> user::rw-
> >> group::rw-
> >> other::---
> >>
> >> Doing a setfacl failed due to "Operation not supported"; remounting
> >> with acl, too:
> >>
> >> [root@node01 ~]# mount -o remount,acl
> >> /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net\:_ssd__storage/
> >> /bin/sh: glusterfs: command not found
> >>
> >> As I am running the oVirt Node, I am not sure how feasible
> >> down-/upgrading is. I think I am stuck with what I have.
> >>
> >> Also, if this were a permission issue, I would not be able to access
> >> the file at all. It seems I can access some of it, and all of it once
> >> root has loaded the whole file first.
> >>
> >>
> >> I also redid the chown from the mountpoint, even though it was already
> >> correctly set, to no avail.
> >>
> >>
> >> On 04/02/2020 21:53, Christian Reiss wrote:
> >> >
> >> > ACL is correctly set:
> >> >
> >> > # file: 5aab365f-b1b9-49d0-b011-566bf936a100
> >> > # owner: vdsm
> >> > # group: kvm
> >> > user::rw-
> >> > group::rw-
> >> > other::---
> >> >
> >> > Doing a setfacl failed due to "Operation not supported"; remounting
> >> > with acl, too:
> >> >
> >> > [root@node01 ~]# mount -o remount,acl
> >> > /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net\:_ssd__storage/
> >> > /bin/sh: glusterfs: command not found
> >> >
> >> > As I am running the oVirt Node, I am not sure how feasible
> >> > down-/upgrading is. I think I am stuck with what I have.
> >> >
> >> > Also, if this were a permission issue, I would not be able to access
> >> > the file at all. It seems I can access some of it, and all of it once
> >> > root has loaded the whole file first.
> >>
> >> --
> >> with kind regards,
> >> mit freundlichen Gruessen,
> >>
> >> Christian Reiss
> >> ___
> >> Users mailing list -- users@ovirt.org
> >> To unsubscribe send an email to users-le...@ovirt.org
> >> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> >> oVirt Code of Conduct:
> >> https://www.ovirt.org/community/about/community-guidelines/
> >> List Archives:
> >>
> >
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/G2MOTXZU6AFAGV2GL6P5XBXHMCRFUM6F/
> >>
>
> If you mean the ACL issue, check
> https://bugzilla.redhat.com/show_bug.cgi?id=1797099
> Ravi will be happy to have a setup that is already affected, so he can
> debug the issue.
> In my case, I have reverted to v7.0.
>
> Best Regards,
> Strahil Nikolov
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IP6TYZMHO5ZE2FSGZMC76H6MWYJKIQRP/


[ovirt-users] Dashboard very slow to load: change default page possible?

2020-02-24 Thread Gianluca Cecchi
Hi,
I'm noticing that after upgrading to 4.3.8, the dashboard is sometimes very
slow: it takes minutes to display and to let me work, even though I'm not
interested in the dashboard at all for the activities I have to do.
As a consequence it becomes a bottleneck.
On the engine I see that during these moments the postmaster process for the
ovirt_engine_history database is consuming 100% CPU.
At almost every minor update I run vacuum, but I don't know if there is
something else involved.
Is there any open bug about this in 4.3.8?
Is there any option to set another page as the landing one when I open web
admin portal?

I have a similar problem on a RHV environment where I did an upgrade from
4.2.7 to 4.2.8 and then to 4.3.8 on the same day (eventually I'm going to
open a case for this one).
When passing from 4.2.7 to 4.2.8 I chose full vacuum for the engine history
database:

  Perform full vacuum on the oVirt engine history
  database ovirt_engine_history@localhost?
  This operation may take a while depending on this setup health and
  the configuration of the db vacuum process.
  See https://www.postgresql.org/docs/9.0/static/sql-vacuum.html
  (Yes, No) [No]: Yes

while I chose No during the upgrade from 4.2.8 to 4.3.8, which I did half
an hour later.
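
For reference, a full vacuum can also be run manually between upgrades; a
minimal sketch using plain psql on the engine host (VACUUM FULL takes
exclusive locks, so run it in a quiet window; on 4.3 psql may live in the
rh-postgresql10 SCL):

  su - postgres -c "psql -d ovirt_engine_history -c 'VACUUM FULL VERBOSE ANALYZE;'"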

Thanks in advance,
Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ERXKLMNU3DIA4PZO6Z4EEZDA5T3PLKSL/


[ovirt-users] Re: IO Storage Error / All findings / Need help.

2020-02-24 Thread Strahil Nikolov
On February 24, 2020 1:55:34 PM GMT+02:00, Hesham Ahmed  
wrote:
>Were you ever able to find a fix for this? I am facing the same problem,
>and the case is similar to yours. We have a 6-node distributed-replicated
>Gluster; due to a network issue all servers got disconnected, and upon
>recovery one of the volumes started giving the same IO error. The files
>can be read as root but give an error when read as vdsm. Everything else
>is as in your case, including the oVirt versions. While doing a full dd
>if=IMAGE of=/dev/null allows the disk to be mounted on one server
>temporarily, upon reboot/restart it returns to failing with IO error. I
>had to create a completely new gluster volume and copy the disks from the
>failing volume as root to resolve this.
>
>Did you create a bug report in Bugzilla for this?
>
>Regards,
>
>Hesham Ahmed
>
>On Wed, Feb 5, 2020 at 1:01 AM Christian Reiss
>
>wrote:
>
>> Thanks for replying,
>>
>> What I just wrote Strahil was:
>>
>>
>> ACL is correctly set:
>>
>> # file: 5aab365f-b1b9-49d0-b011-566bf936a100
>> # owner: vdsm
>> # group: kvm
>> user::rw-
>> group::rw-
>> other::---
>>
>> Doing a setfacl failed due to "Operation not supported"; remounting
>> with acl, too:
>>
>> [root@node01 ~]# mount -o remount,acl
>> /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net\:_ssd__storage/
>> /bin/sh: glusterfs: command not found
>>
>> As I am running the oVirt Node, I am not sure how feasible
>> down-/upgrading is. I think I am stuck with what I have.
>>
>> Also, if this were a permission issue, I would not be able to access
>> the file at all. It seems I can access some of it, and all of it once
>> root has loaded the whole file first.
>>
>>
>> I also redid the chown from the mountpoint, even though it was already
>> correctly set, to no avail.
>>
>>
>> On 04/02/2020 21:53, Christian Reiss wrote:
>> >
>> > ACL is correctly set:
>> >
>> > # file: 5aab365f-b1b9-49d0-b011-566bf936a100
>> > # owner: vdsm
>> > # group: kvm
>> > user::rw-
>> > group::rw-
>> > other::---
>> >
>> > Doing a setfacl failed due to "Operation not supported"; remounting
>> > with acl, too:
>> >
>> > [root@node01 ~]# mount -o remount,acl
>> > /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net\:_ssd__storage/
>> > /bin/sh: glusterfs: command not found
>> >
>> > As I am running the oVirt Node, I am not sure how feasible
>> > down-/upgrading is. I think I am stuck with what I have.
>> >
>> > Also, if this were a permission issue, I would not be able to access
>> > the file at all. It seems I can access some of it, and all of it once
>> > root has loaded the whole file first.
>>
>> --
>> with kind regards,
>> mit freundlichen Gruessen,
>>
>> Christian Reiss
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>>
>https://lists.ovirt.org/archives/list/users@ovirt.org/message/G2MOTXZU6AFAGV2GL6P5XBXHMCRFUM6F/
>>

If you mean the ACL issue, check
https://bugzilla.redhat.com/show_bug.cgi?id=1797099
Ravi will be happy to have a setup that is already affected, so he can debug
the issue.
In my case, I have reverted to v7.0.

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Z4E6GSWURKGQTHPNO6U2F6HIAOZAOHHH/


[ovirt-users] Re: oVirt behavior with thin provision/deduplicated block storage

2020-02-24 Thread Benny Zlotnik
We use the stats API in the engine, currently only to check whether the
backend is accessible. We have plans to use it for monitoring and
validations, but that is not implemented yet.

On Mon, Feb 24, 2020 at 3:35 PM Nir Soffer  wrote:
>
> On Mon, Feb 24, 2020 at 3:03 PM Gorka Eguileor  wrote:
> >
> > On 22/02, Nir Soffer wrote:
> > > On Sat, Feb 22, 2020, 13:02 Alan G  wrote:
> > > >
> > > > I'm not really concerned about the reporting aspect, I can look in the 
> > > > storage vendor UI to see that. My concern is: will oVirt stop 
> > > > provisioning storage in the domain because it *thinks* the domain is 
> > > > full. De-dup is currently running at about 2.5:1 so I'm concerned that 
> > > > oVirt will think the domain is full way before it actually is.
> > > >
> > > > Not clear if this is handled natively in oVirt or by the underlying lvs?
> > >
> > > Because oVirt does not know about deduplication or actual allocation
> > > on the storage side,
> > > it will let you allocate up to the size of the LUNs that you added to the
> > > storage domain, minus
> > > the size oVirt uses for its own metadata.
> > >
> > > oVirt uses about 5G for its own metadata on the first LUN in a storage
> > > domain. The rest of
> > > the space can be used by user disks. Disks are LVM logical volumes
> > > created in the VG created
> > > from the LUN.
> > >
> > > If you create a storage domain with 4T LUN, you will be able to
> > > allocate about 4091G on this
> > > storage domain. If you use preallocated disks, oVirt will stop when
> > > you allocated all the space
> > > in the VG. Actually it will stop earlier based on the minimal amount
> > > of free space configured for
> > > the storage domain when creating the storage domain.
> > >
> > > If you use thin disks, oVirt will allocate only 1G per disk (by
> > > default), so you can allocate
> > > more storage than you actually have, but when VMs will write to the
> > > disk, oVirt will extend
> > > the disks. Once you use all the available space in this VG, you will
> > > not be able to allocate
> > > more without extending the storage domain with new LUN, or resizing
> > > the  LUN on storage.
> > >
> > > If you use Managed Block Storage (cinderlib) every disk is a LUN with
> > > the exact size you
> > > ask when you create the disk. The actual allocation of this LUN
> > > depends on your storage.
> > >
> > > Nir
> > >
> >
> > Hi,
> >
> > I don't know anything about the oVirt's implementation, so I'm just
> > going to provide some information from cinderlib's point of view.
> >
> > Cinderlib was developed as a dumb library to abstract access to storage
> > backends, so all the "smart" functionality is pushed to the user of the
> > library, in this case oVirt.
> >
> > In practice this means that cinderlib will NOT limit the number of LUNs
> > or over-provisioning done in the backend.
> >
> > Cinderlib doesn't care if we are over-provisioning because we have dedup
> > and decompression or because we are using thin volumes where we don't
> > consume all the allocated space, it doesn't even care if we cannot do
> > over-provisioning because we are using thick volumes.  If it gets a
> > request to create a volume, it will try to do so.
> >
> > From oVirt's perspective this is dangerous if not controlled, because we
> > could end up consuming all free space in the backend and then running
> > VMs will crash (I think) when they could no longer write to disks.
> >
> > oVirt can query the stats of the backend [1] to see how much free space
> > is available (free_capacity_gb) at any given time in order to provide
> > over-provisioning limits to its users.  I don't know if oVirt is already
> > doing that or something similar.
> >
> > It is important to know that stats gathering is an expensive operation
> > for most drivers, and that's why we can request cached stats (cache is
> > lost as the process exits) to help users not overuse it.  It probably
> > shouldn't be gathered more than once a minute.
> >
> > I hope this helps.  I'll be happy to answer any cinderlib questions. :-)
>
> Thanks Gorka, good to know we already have an API to get backend
> allocation info. Hopefully we will use this in a future version.
>
> Nir
>
> >
> > Cheers,
> > Gorka.
> >
> > [1]: https://docs.openstack.org/cinderlib/latest/topics/backends.html#stats
> >
> > > >  On Fri, 21 Feb 2020 21:35:06 + Nir Soffer  
> > > > wrote 
> > > >
> > > >
> > > >
> > > > On Fri, Feb 21, 2020, 17:14 Alan G  wrote:
> > > >
> > > > Hi,
> > > >
> > > > I have an oVirt cluster with a storage domain hosted on a FC storage 
> > > > array that utilises block de-duplication technology. oVirt reports the 
> > > > capacity of the domain as though the de-duplication factor was 1:1, 
> > > > which of course is not the case. So what I would like to understand is 
> > > > the likely behavior of oVirt when the used space approaches the 
> > > > reported capacity. Particularly around the critical action space 
> > > > blocker.
> > > >
> > > >

[ovirt-users] Re: oVirt behavior with thin provision/deduplicated block storage

2020-02-24 Thread Alan G
Thanks for the clarification on this. I've realised my mistake now: I need to
configure the storage array to report the LUNs as larger than they physically
are (to account for the expected de-dup ratio). I was expecting oVirt to
magically know about this, which, when thinking it through, is not really
technically possible.



 On Mon, 24 Feb 2020 13:34:49 + Nir Soffer  wrote 



On Mon, Feb 24, 2020 at 3:03 PM Gorka Eguileor  
wrote: 
> 
> On 22/02, Nir Soffer wrote: 
> > On Sat, Feb 22, 2020, 13:02 Alan G  wrote: 
> > > 
> > > I'm not really concerned about the reporting aspect, I can look in the 
> > > storage vendor UI to see that. My concern is: will oVirt stop 
> > > provisioning storage in the domain because it *thinks* the domain is 
> > > full. De-dup is currently running at about 2.5:1 so I'm concerned that 
> > > oVirt will think the domain is full way before it actually is. 
> > > 
> > > Not clear if this is handled natively in oVirt or by the underlying lvs? 
> > 
> > Because oVirt does not know about deduplication or actual allocation 
> > on the storage side, 
> > it will let you allocate up to the size of the LUNs that you added to the
> > storage domain, minus 
> > the size oVirt uses for its own metadata. 
> > 
> > oVirt uses about 5G for its own metadata on the first LUN in a storage 
> > domain. The rest of 
> > the space can be used by user disks. Disks are LVM logical volumes 
> > created in the VG created 
> > from the LUN. 
> > 
> > If you create a storage domain with 4T LUN, you will be able to 
> > allocate about 4091G on this 
> > storage domain. If you use preallocated disks, oVirt will stop when 
> > you allocated all the space 
> > in the VG. Actually it will stop earlier based on the minimal amount 
> > of free space configured for 
> > the storage domain when creating the storage domain. 
> > 
> > If you use thin disks, oVirt will allocate only 1G per disk (by 
> > default), so you can allocate 
> > more storage than you actually have, but when VMs will write to the 
> > disk, oVirt will extend 
> > the disks. Once you use all the available space in this VG, you will 
> > not be able to allocate 
> > more without extending the storage domain with new LUN, or resizing 
> > the  LUN on storage. 
> > 
> > If you use Managed Block Storage (cinderlib) every disk is a LUN with 
> > the exact size you 
> > ask when you create the disk. The actual allocation of this LUN 
> > depends on your storage. 
> > 
> > Nir 
> > 
> 
> Hi, 
> 
> I don't know anything about the oVirt's implementation, so I'm just 
> going to provide some information from cinderlib's point of view. 
> 
> Cinderlib was developed as a dumb library to abstract access to storage 
> backends, so all the "smart" functionality is pushed to the user of the 
> library, in this case oVirt. 
> 
> In practice this means that cinderlib will NOT limit the number of LUNs 
> or over-provisioning done in the backend. 
> 
> Cinderlib doesn't care if we are over-provisioning because we have dedup 
> and decompression or because we are using thin volumes where we don't 
> consume all the allocated space, it doesn't even care if we cannot do 
> over-provisioning because we are using thick volumes.  If it gets a 
> request to create a volume, it will try to do so. 
> 
> From oVirt's perspective this is dangerous if not controlled, because we 
> could end up consuming all free space in the backend and then running 
> VMs will crash (I think) when they could no longer write to disks. 
> 
> oVirt can query the stats of the backend [1] to see how much free space 
> is available (free_capacity_gb) at any given time in order to provide 
> over-provisioning limits to its users.  I don't know if oVirt is already 
> doing that or something similar. 
> 
> It is important to know that stats gathering is an expensive operation
> for most drivers, and that's why we can request cached stats (cache is 
> lost as the process exits) to help users not overuse it.  It probably 
> shouldn't be gathered more than once a minute. 
> 
> I hope this helps.  I'll be happy to answer any cinderlib questions. :-) 
 
Thanks Gorka, good to know we already have an API to get backend
allocation info. Hopefully we will use this in a future version.
 
Nir 
 
> 
> Cheers, 
> Gorka. 
> 
> [1]: https://docs.openstack.org/cinderlib/latest/topics/backends.html#stats 
> 
> > >  On Fri, 21 Feb 2020 21:35:06 + Nir Soffer 
> > >  wrote  
> > > 
> > > 
> > > 
> > > On Fri, Feb 21, 2020, 17:14 Alan G  wrote: 
> > > 
> > > Hi, 
> > > 
> > > I have an oVirt cluster with a storage domain hosted on a FC storage 
> > > array that utilises block de-duplication technology. oVirt reports the 
> > > capacity of the domain as though the de-duplication factor was 1:1, which 
> > > of course is not the case. So what I would like to understand is the 
> 

[ovirt-users] Re: oVirt behavior with thin provision/deduplicated block storage

2020-02-24 Thread Nir Soffer
On Mon, Feb 24, 2020 at 3:03 PM Gorka Eguileor  wrote:
>
> On 22/02, Nir Soffer wrote:
> > On Sat, Feb 22, 2020, 13:02 Alan G  wrote:
> > >
> > > I'm not really concerned about the reporting aspect, I can look in the 
> > > storage vendor UI to see that. My concern is: will oVirt stop 
> > > provisioning storage in the domain because it *thinks* the domain is 
> > > full. De-dup is currently running at about 2.5:1 so I'm concerned that 
> > > oVirt will think the domain is full way before it actually is.
> > >
> > > Not clear if this is handled natively in oVirt or by the underlying lvs?
> >
> > Because oVirt does not know about deduplication or actual allocation
> > on the storage side,
> > it will let you allocate up to the size of the LUNs that you added to the
> > storage domain, minus
> > the size oVirt uses for its own metadata.
> >
> > oVirt uses about 5G for its own metadata on the first LUN in a storage
> > domain. The rest of
> > the space can be used by user disks. Disks are LVM logical volumes
> > created in the VG created
> > from the LUN.
> >
> > If you create a storage domain with 4T LUN, you will be able to
> > allocate about 4091G on this
> > storage domain. If you use preallocated disks, oVirt will stop when
> > you allocated all the space
> > in the VG. Actually it will stop earlier based on the minimal amount
> > of free space configured for
> > the storage domain when creating the storage domain.
> >
> > If you use thin disks, oVirt will allocate only 1G per disk (by
> > default), so you can allocate
> > more storage than you actually have, but when VMs will write to the
> > disk, oVirt will extend
> > the disks. Once you use all the available space in this VG, you will
> > not be able to allocate
> > more without extending the storage domain with new LUN, or resizing
> > the  LUN on storage.
> >
> > If you use Managed Block Storage (cinderlib) every disk is a LUN with
> > the exact size you
> > ask when you create the disk. The actual allocation of this LUN
> > depends on your storage.
> >
> > Nir
> >
>
> Hi,
>
> I don't know anything about the oVirt's implementation, so I'm just
> going to provide some information from cinderlib's point of view.
>
> Cinderlib was developed as a dumb library to abstract access to storage
> backends, so all the "smart" functionality is pushed to the user of the
> library, in this case oVirt.
>
> In practice this means that cinderlib will NOT limit the number of LUNs
> or over-provisioning done in the backend.
>
> Cinderlib doesn't care if we are over-provisioning because we have dedup
> and decompression or because we are using thin volumes where we don't
> consume all the allocated space, it doesn't even care if we cannot do
> over-provisioning because we are using thick volumes.  If it gets a
> request to create a volume, it will try to do so.
>
> From oVirt's perspective this is dangerous if not controlled, because we
> could end up consuming all free space in the backend and then running
> VMs will crash (I think) when they could no longer write to disks.
>
> oVirt can query the stats of the backend [1] to see how much free space
> is available (free_capacity_gb) at any given time in order to provide
> over-provisioning limits to its users.  I don't know if oVirt is already
> doing that or something similar.
>
> It is important to know that stats gathering is an expensive operation
> for most drivers, and that's why we can request cached stats (cache is
> lost as the process exits) to help users not overuse it.  It probably
> shouldn't be gathered more than once a minute.
>
> I hope this helps.  I'll be happy to answer any cinderlib questions. :-)

Thanks Gorka, good to know we already have an API to get backend
allocation info. Hopefully we will use this in a future version.

Nir
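
For reference, the allocation the engine manages for a block storage domain
can be inspected directly on a host with plain LVM; a sketch (the VG name is
the storage domain UUID, a placeholder here):

  vgs -o vg_name,vg_size,vg_free,lv_count <storage-domain-uuid>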

>
> Cheers,
> Gorka.
>
> [1]: https://docs.openstack.org/cinderlib/latest/topics/backends.html#stats
>
> > >  On Fri, 21 Feb 2020 21:35:06 + Nir Soffer  
> > > wrote 
> > >
> > >
> > >
> > > On Fri, Feb 21, 2020, 17:14 Alan G  wrote:
> > >
> > > Hi,
> > >
> > > I have an oVirt cluster with a storage domain hosted on a FC storage 
> > > array that utilises block de-duplication technology. oVirt reports the 
> > > capacity of the domain as though the de-duplication factor was 1:1, which 
> > > of course is not the case. So what I would like to understand is the 
> > > likely behavior of oVirt when the used space approaches the reported 
> > > capacity. Particularly around the critical action space blocker.
> > >
> > >
> > > oVirt does not know about the underlying block storage thin provisioning 
> > > implemention so it cannot help with this.
> > >
> > > You will have to use the underlying storage separately to learn about the 
> > > actual allocation.
> > >
> > > This is unlikely to change for legacy storage, but for Managed Block 
> > > Storage (conderlib) we may have a way to access such info.
> > >
> > > Gorka, do we have any support in cinderlib for getting info 

[ovirt-users] Re: IO Storage Error / All findings / Need help.

2020-02-24 Thread Hesham Ahmed
Were you ever able to find a fix for this? I am facing the same problem, and
the case is similar to yours. We have a 6-node distributed-replicated
Gluster; due to a network issue all servers got disconnected, and upon
recovery one of the volumes started giving the same IO error. The files can
be read as root but give an error when read as vdsm. Everything else is
as in your case, including the oVirt versions. While doing a full dd
if=IMAGE of=/dev/null allows the disk to be mounted on one server
temporarily, upon reboot/restart it returns to failing with IO error. I had
to create a completely new gluster volume and copy the disks from the
failing volume as root to resolve this.
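
For reference, the full read mentioned above looks roughly like this (a
sketch; the image path under the gluster mount is a placeholder):

  dd if=/rhev/data-center/mnt/glusterSD/<server>:_<volume>/<sd-uuid>/images/<disk-uuid>/<image-uuid> \
     of=/dev/null bs=1M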

Did you create a bug report in Bugzilla for this?

Regards,

Hesham Ahmed

On Wed, Feb 5, 2020 at 1:01 AM Christian Reiss 
wrote:

> Thanks for replying,
>
> What I just wrote Strahil was:
>
>
> ACL is correctly set:
>
> # file: 5aab365f-b1b9-49d0-b011-566bf936a100
> # owner: vdsm
> # group: kvm
> user::rw-
> group::rw-
> other::---
>
> Doing a setfacl failed due to "Operation not supported"; remounting with
> acl, too:
>
> [root@node01 ~]# mount -o remount,acl
> /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net\:_ssd__storage/
> /bin/sh: glusterfs: command not found
>
> As I am running the oVirt Node, I am not sure how feasible down-/upgrading
> is. I think I am stuck with what I have.
>
> Also, if this were a permission issue, I would not be able to access
> the file at all. It seems I can access some of it, and all of it once root
> has loaded the whole file first.
>
>
> I also redid the chown from the mountpoint, even though it was already
> correctly set, to no avail.
>
>
> On 04/02/2020 21:53, Christian Reiss wrote:
> >
> > ACL is correctly set:
> >
> > # file: 5aab365f-b1b9-49d0-b011-566bf936a100
> > # owner: vdsm
> > # group: kvm
> > user::rw-
> > group::rw-
> > other::---
> >
> > Doing a setfacl failed due to "Operation not supported"; remounting with
> > acl, too:
> >
> > [root@node01 ~]# mount -o remount,acl
> > /rhev/data-center/mnt/glusterSD/node01.dc-dus.dalason.net\:_ssd__storage/
> > /bin/sh: glusterfs: command not found
> >
> > As I am running the oVirt Node, I am not sure how feasible down-/upgrading
> > is. I think I am stuck with what I have.
> >
> > Also, if this were a permission issue, I would not be able to access
> > the file at all. It seems I can access some of it, and all of it once root
> > has loaded the whole file first.
>
> --
> with kind regards,
> mit freundlichen Gruessen,
>
> Christian Reiss
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/G2MOTXZU6AFAGV2GL6P5XBXHMCRFUM6F/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/F35IM24FBO7S5VVP3X57X3TAVUFAKYMP/


[ovirt-users] Re: New Host Migration Failure & Console Failure

2020-02-24 Thread Jonathan Mathews
Good Day

I desperately need assistance/advice.

I am not seeing any specific reason in the logs why I am having this issue.

I tried migrating a VM again today and basically got the same results in the
log:

Engine Log:

2020-02-24 10:26:16,105Z INFO
 [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-62)
[c2324230-cb41-4c7f-872a-1437f0a797b1] Running command:
MigrateVmToServerCommand internal: false. Entities affected :  ID:
7b0b6e6d-d099-43e0-933f-3c335b54a3a1 Type: VMAction group MIGRATE_VM with
role type USER
2020-02-24 10:26:16,171Z INFO
 [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-62)
[c2324230-cb41-4c7f-872a-1437f0a797b1] START, MigrateVDSCommand(
MigrateVDSCommandParameters:{hostId='df715653-daf4-457e-839d-95683ab21234',
vmId='7b0b6e6d-d099-43e0-933f-3c335b54a3a1', srcHost='
host03.timefreight.co.za', dstVdsId='896b7f02-00e9-405c-b166-ec103a7f9ee8',
dstHost='host01.timefreight.co.za:54321', migrationMethod='ONLINE',
tunnelMigration='false', migrationDowntime='0', autoConverge='true',
migrateCompressed='false', consoleAddress='null', maxBandwidth='62',
enableGuestEvents='true', maxIncomingMigrations='2',
maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime,
params=[100]}], stalling=[{limit=1, action={name=setDowntime,
params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}},
{limit=3, action={name=setDowntime, params=[300]}}, {limit=4,
action={name=setDowntime, params=[400]}}, {limit=6,
action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort,
params=[]}}]]', dstQemu='172.10.10.1'}), log id: 185fa2b8
2020-02-24 10:26:16,172Z INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
(default task-62) [c2324230-cb41-4c7f-872a-1437f0a797b1] START,
MigrateBrokerVDSCommand(HostName = host03.timefreight.co.za,
MigrateVDSCommandParameters:{hostId='df715653-daf4-457e-839d-95683ab21234',
vmId='7b0b6e6d-d099-43e0-933f-3c335b54a3a1', srcHost='
host03.timefreight.co.za', dstVdsId='896b7f02-00e9-405c-b166-ec103a7f9ee8',
dstHost='host01.timefreight.co.za:54321', migrationMethod='ONLINE',
tunnelMigration='false', migrationDowntime='0', autoConverge='true',
migrateCompressed='false', consoleAddress='null', maxBandwidth='62',
enableGuestEvents='true', maxIncomingMigrations='2',
maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime,
params=[100]}], stalling=[{limit=1, action={name=setDowntime,
params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}},
{limit=3, action={name=setDowntime, params=[300]}}, {limit=4,
action={name=setDowntime, params=[400]}}, {limit=6,
action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort,
params=[]}}]]', dstQemu='172.10.10.1'}), log id: 6157c8c9
2020-02-24 10:26:16,179Z INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
(default task-62) [c2324230-cb41-4c7f-872a-1437f0a797b1] FINISH,
MigrateBrokerVDSCommand, return: , log id: 6157c8c9
2020-02-24 10:26:16,182Z INFO
 [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-62)
[c2324230-cb41-4c7f-872a-1437f0a797b1] FINISH, MigrateVDSCommand, return:
MigratingFrom, log id: 185fa2b8
2020-02-24 10:26:16,190Z INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-62) [c2324230-cb41-4c7f-872a-1437f0a797b1] EVENT_ID:
VM_MIGRATION_START(62), Migration started (VM: accpac, Source:
host03.timefreight.co.za, Destination: host01.timefreight.co.za, User:
admin@internal-authz).
2020-02-24 10:26:16,195Z INFO
 [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(ForkJoinPool-1-worker-13) [] VM
'7b0b6e6d-d099-43e0-933f-3c335b54a3a1'(accpac) moved from 'MigratingFrom'
--> 'Up'
2020-02-24 10:26:16,195Z INFO
 [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(ForkJoinPool-1-worker-13) [] Adding VM
'7b0b6e6d-d099-43e0-933f-3c335b54a3a1'(accpac) to re-run list
2020-02-24 10:26:16,198Z ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring]
(ForkJoinPool-1-worker-13) [] Rerun VM
'7b0b6e6d-d099-43e0-933f-3c335b54a3a1'. Called from VDS '
host03.timefreight.co.za'
2020-02-24 10:26:16,201Z INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-207443) [] START,
MigrateStatusVDSCommand(HostName = host03.timefreight.co.za,
MigrateStatusVDSCommandParameters:{hostId='df715653-daf4-457e-839d-95683ab21234',
vmId='7b0b6e6d-d099-43e0-933f-3c335b54a3a1'}), log id: 3b6274fe
2020-02-24 10:26:16,205Z INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-207443) [] FINISH,
MigrateStatusVDSCommand, return: , log id: 3b6274fe
2020-02-24 10:26:16,226Z ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-207443) [] EVENT_ID:
VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed  (VM: accpac, Source:
host03.timefreight.co.za, Destination: host01.timefreight.co.za).
2020-02-24