[ovirt-users] Re: Upgraded to oVirt 4.4.9, still have vdsmd memory leak

2021-12-09 Thread Chris Adams
Once upon a time, Victor Stinner  said:
> Or something somehow prevents these objects from being deleted. For
> example, an exception is stored somewhere which keeps all variables
> alive (in Python 3, an exception stores a traceback object which keeps
> all variables of all frames alive).

I think I found the cause, if not the actual code issue... due to a
long-standing local config typo (how embarrassing), these servers had the
vdsm port (TCP 54321) open to the world.  It appears that something
leaks memory on bad connections (such as port scans, I expect).  I
blocked the outside access, and the vdsmd processes have not grown
since then.

It'd probably be good to handle this better (and now knowing a probable
cause may help someone track it down), but also I think I've solved my
immediate problem.
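
For anyone wanting to check their own hosts for the same exposure, here is a minimal sketch that tests whether a TCP port accepts connections. It should be run from a machine outside the management network. The function name port_is_open and the host name in the usage comment are placeholders, not part of vdsm; 54321 is the vdsm port mentioned above.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection() resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


# Example call (placeholder host name):
#     port_is_open("ovirt-host.example.com", 54321)
```

A True result from an outside network would suggest the firewall rule is not doing what was intended; it says nothing about the leak itself.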

-- 
Chris Adams 
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LRDQJ4CWL4EJ5R6YEJSWZ2D2AIAIKTS5/


[ovirt-users] Re: Upgraded to oVirt 4.4.9, still have vdsmd memory leak

2021-12-09 Thread Sandro Bonazzola
@Jiri Denemark @Eduardo Lima can you please have a look at the
libvirt side?
@Martin Perina the host/stats part within vdsm was handled by people
who no longer work on the oVirt project; perhaps someone from infra
can have a look?



-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com


*Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.*

[ovirt-users] Re: new host addition, Cannot find master domain

2021-12-09 Thread david
> could you run it again, but instead of "--include 253" use the number you
> get when you run
> 
> cat /proc/devices |grep device-mapper| cut -f 1 -d ' '
253
> INC=`cat /proc/devices |grep device-mapper| cut -f 1 -d ' '` && 
> /usr/bin/lsblk --raw --noheadings --paths --inverse --include $INC --nodeps  
> --output type,name,mountpoint
still empty 
 
>   lvs --readonly --config 'devices {filter=["a|.*|"]}' --options 
> vg_name,vg_tags /dev/sdb

  Volume group "sdb" not found
  Cannot process volume group sdb

> yes, it won't be overwritten until you run "vdsm-tool config-lvm-filter" and 
> confirm you want to overwrite it
here is the command output:

Analyzing host...
Found these mounted logical volumes on this host:

This is the recommended LVM filter for this host:

  filter = [ "r|.*|" ]

This filter allows LVM to access the local devices used by the
hypervisor, but not shared storage owned by Vdsm. If you add a new
device to the volume group, you will need to edit the filter manually.

This is the current LVM filter:

  filter = [ "r|.*|" ]

To use the recommended filter we need to add multipath
blacklist in /etc/multipath/conf.d/vdsm_blacklist.conf:

  blacklist {
  wwid "360e00d1100113629000e"
  wwid "3600605b010bbc0e029275d23bba8027b"
  }


Configure host? [yes,NO] no
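
As background for reading the filter lines in this output, here is a rough sketch of how an LVM filter list is evaluated, assuming the rules described in the lvm.conf comments: each entry is 'a' (accept) or 'r' (reject) followed by a delimited regex, the first matching entry decides, and a path that matches nothing is accepted. The function name device_allowed is made up for illustration, Python's re module only approximates LVM's own regex engine, and the sketch checks a single path rather than all aliases of a device.

```python
import re


def device_allowed(filter_patterns, device_path):
    """Approximate LVM filter evaluation for one device path."""
    for entry in filter_patterns:
        action, delimiter = entry[0], entry[1]
        if action not in ("a", "r"):
            raise ValueError("filter entry must start with 'a' or 'r': %r" % entry)
        # The regex sits between the first and the last delimiter character.
        pattern = entry[2:entry.rindex(delimiter)]
        if re.search(pattern, device_path):
            return action == "a"
    # No entry matched: LVM accepts the device by default.
    return True


# The filter shown in the output above rejects every device:
print(device_allowed(["r|.*|"], "/dev/sdb"))
# The override used with vgs/lvs in this thread accepts every device:
print(device_allowed(["a|.*|"], "/dev/sdb"))
```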
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6DP36BUICGJMWGINPFOIKCBKSQNB3BXD/


[ovirt-users] Re: new host addition, Cannot find master domain

2021-12-09 Thread Vojtech Juranek
 
> > /usr/bin/lsblk --raw --noheadings --paths --inverse --include 253 --nodeps
> > --output type,name,mountpoint
> =
> empty output of above command

Could you run it again, but instead of "--include 253" use the number you
get when you run

cat /proc/devices |grep device-mapper| cut -f 1 -d ' '

i.e.

INC=`cat /proc/devices |grep device-mapper| cut -f 1 -d ' '` &&  
/usr/bin/lsblk --raw --noheadings --paths --inverse --include $INC --nodeps  
--output type,name,mountpoint
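
The same lookup can be sketched in Python, which avoids depending on how the columns in /proc/devices are padded. This is only an illustration: the function name device_mapper_major and the SAMPLE text are made up, and on a real host the text would come from reading /proc/devices.

```python
def device_mapper_major(proc_devices_text):
    """Return the block major number of device-mapper, or None if absent."""
    in_block_section = False
    for line in proc_devices_text.splitlines():
        if line.startswith("Block devices:"):
            in_block_section = True
            continue
        fields = line.split()
        if in_block_section and len(fields) == 2 and fields[1] == "device-mapper":
            return int(fields[0])
    return None


# Shortened, made-up sample in the format of /proc/devices.
SAMPLE = """Character devices:
  1 mem
 10 misc

Block devices:
  8 sd
253 device-mapper
"""

major = device_mapper_major(SAMPLE)
print(major)
```

The number it returns is what would be passed to lsblk as the --include value.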
 
> 
> 
> > vgs --readonly --config 'devices {filter=["a|.*|"]}' $VG_NAME
> 
> 
>   VG   #PV #LV #SN Attr   VSize   VFree
>   2047ffc7-cb32-4663-81ad-7f4e0becdf13   1  13   0 wz--n- 544.62g 41.00g

sorry, I was interested in LV tags, i.e. the command should be

lvs --readonly --config 'devices {filter=["a|.*|"]}' --options 
vg_name,vg_tags /dev/sdb


> 
> > remove it from blacklist
> 
> 
> but the message in the vdsm_blacklist.conf warns me against manual editing.
> Ignore ?

Yes, it won't be overwritten unless you run "vdsm-tool config-lvm-filter"
and confirm that you want to overwrite it.




___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VTB6EJGPVNYH4INI54UYP42G77R463GS/


[ovirt-users] Re: new host addition, Cannot find master domain

2021-12-09 Thread david

> After you fix the issue (see Vojta reply), please run again:
> 
>vdsm-tool config-lvm-filter
> 
> The command may suggest to change the lvm filter, and blacklist the device.
> Do not confirm and share the output.

I removed the line wwid "360e00d1100113629000e" from vdsm_blacklist.conf
and ran vdsm-tool config-lvm-filter.

Here is the command output:

Analyzing host...
Found these mounted logical volumes on this host:

This is the recommended LVM filter for this host:

  filter = [ "r|.*|" ]

This filter allows LVM to access the local devices used by the
hypervisor, but not shared storage owned by Vdsm. If you add a new
device to the volume group, you will need to edit the filter manually.

This is the current LVM filter:

  filter = [ "r|.*|" ]

To use the recommended filter we need to add multipath
blacklist in /etc/multipath/conf.d/vdsm_blacklist.conf:

  blacklist {
  wwid "360e00d1100113629000e"
  wwid "3600605b010bbc0e029275d23bba8027b"
  }
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RZFTL3FQWCZA7ZXOM6TIRQCHVA2VHPE7/


[ovirt-users] Re: new host addition, Cannot find master domain

2021-12-09 Thread david
> This command put the LUN in the blacklist since it seems to be a local
> disk used by the host.

Yes, but in fact this disk is a remote block device from FC storage.

> Did you have active logical volumes from this LUN mounted on the host while 
> the host was added to engine?

I only installed CentOS, updated it, and enabled the oVirt repository;
all other settings were made by the Engine itself.
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/D25LLCMJZU4VL2LYGF33L2YKXPGWT4LB/


[ovirt-users] Re: new host addition, Cannot find master domain

2021-12-09 Thread david
> Could you check the ID of this device is the same as one in the blacklist?

Yes, it is the same:

# /dev/disk/by-id
===
lvm-pv-uuid-GqZf9p-oyaL-dLN4-RdxE-Ll6Y-1Gjn-B6F4sX -> ../../sdb
scsi-360e00d1100113629000e -> ../../sdb
scsi-3600605b010bbc0e029275d23bba8027b -> ../../sda
scsi-3600605b010bbc0e029275d23bba8027b-part1 -> ../../sda1
scsi-3600605b010bbc0e029275d23bba8027b-part2 -> ../../sda2
scsi-3600605b010bbc0e029275d23bba8027b-part3 -> ../../sda3
scsi-3600605b010bbc0e029275d23bba8027b-part4 -> ../../sda4
scsi-3600605b010bbc0e029275d23bba8027b-part5 -> ../../sda5
scsi-SFTS_PRAID_EP520i_007b02a8bb235d2729e0c0bb10b00506 -> ../../sda
scsi-SFTS_PRAID_EP520i_007b02a8bb235d2729e0c0bb10b00506-part1 -> ../../sda1
scsi-SFTS_PRAID_EP520i_007b02a8bb235d2729e0c0bb10b00506-part2 -> ../../sda2
scsi-SFTS_PRAID_EP520i_007b02a8bb235d2729e0c0bb10b00506-part3 -> ../../sda3
scsi-SFTS_PRAID_EP520i_007b02a8bb235d2729e0c0bb10b00506-part4 -> ../../sda4
scsi-SFTS_PRAID_EP520i_007b02a8bb235d2729e0c0bb10b00506-part5 -> ../../sda5
scsi-SFUJITSU_ETERNUS_DXL_113629 -> ../../sdb
wwn-0x60e00d1100113629000e -> ../../sdb
wwn-0x600605b010bbc0e029275d23bba8027b -> ../../sda
wwn-0x600605b010bbc0e029275d23bba8027b-part1 -> ../../sda1
wwn-0x600605b010bbc0e029275d23bba8027b-part2 -> ../../sda2
wwn-0x600605b010bbc0e029275d23bba8027b-part3 -> ../../sda3
wwn-0x600605b010bbc0e029275d23bba8027b-part4 -> ../../sda4
wwn-0x600605b010bbc0e029275d23bba8027b-part5 -> ../../sda5

# /etc/multipath/conf.d/vdsm_blacklist.conf
===
# This file is managed by vdsm, do not edit!
# Any changes made to this file will be overwritten when running:
# vdsm-tool config-lvm-filter

blacklist {
wwid "360e00d1100113629000e"
wwid "3600605b010bbc0e029275d23bba8027b"
}
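
The comparison above can also be sketched as a small script. This is a sketch under assumptions, not how vdsm does it: the function names are made up, the regex only handles simple wwid "..." lines, and the link names and targets are copied by hand from the listings above rather than read from /dev/disk/by-id.

```python
import re


def blacklisted_wwids(conf_text):
    """Extract the quoted WWIDs from multipath blacklist text."""
    return re.findall(r'^\s*wwid\s+"([^"]+)"', conf_text, flags=re.MULTILINE)


def devices_for_wwids(wwids, by_id_links):
    """Map each WWID to the device its scsi-<wwid> link points at, if any."""
    return {wwid: by_id_links.get("scsi-" + wwid) for wwid in wwids}


# Values copied from the two listings above.
CONF = '''# This file is managed by vdsm, do not edit!
blacklist {
wwid "360e00d1100113629000e"
wwid "3600605b010bbc0e029275d23bba8027b"
}
'''
LINKS = {
    "scsi-360e00d1100113629000e": "sdb",
    "scsi-3600605b010bbc0e029275d23bba8027b": "sda",
    "scsi-SFUJITSU_ETERNUS_DXL_113629": "sdb",
}

matches = devices_for_wwids(blacklisted_wwids(CONF), LINKS)
for wwid, device in matches.items():
    print(wwid, "->", device)
```

With the values from this host, the first blacklisted WWID resolves to sdb, the FC LUN, which is the entry in question.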

> /usr/bin/lsblk --raw --noheadings --paths --inverse --include 253 --nodeps 
> --output type,name,mountpoint
The above command produces empty output.


> vgs --readonly --config 'devices {filter=["a|.*|"]}' $VG_NAME

  VG   #PV #LV #SN Attr   VSize   VFree
  2047ffc7-cb32-4663-81ad-7f4e0becdf13   1  13   0 wz--n- 544.62g 41.00g

> remove it from blacklist

But the comment in vdsm_blacklist.conf warns against manual editing.
Should I ignore it?
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MLBFU56KZTUJWNCPKPYC5AS5RWBGSSWK/


[ovirt-users] Re: Upgraded to oVirt 4.4.9, still have vdsmd memory leak

2021-12-09 Thread Victor Stinner
On Tue, Dec 7, 2021 at 6:12 PM Chris Adams  wrote:
> Top differences
> /usr/lib64/python3.6/site-packages/libvirt.py:442: size=295 MiB (+285 MiB), count=5511282 (+5312311), average=56 B
> /usr/lib64/python3.6/json/decoder.py:355: size=73.9 MiB (+70.2 MiB), count=736108 (+697450), average=105 B
> /usr/lib64/python3.6/logging/__init__.py:1630: size=44.2 MiB (+43.8 MiB), count=345704 (+342481), average=134 B
> /usr/lib64/python3.6/site-packages/libvirt.py:5695: size=30.3 MiB (+30.0 MiB), count=190449 (+188665), average=167 B
> /usr/lib/python3.6/site-packages/vdsm/host/stats.py:138: size=12.1 MiB (+11.4 MiB), count=75366 (+70991), average=168 B
> /usr/lib/python3.6/site-packages/vdsm/utils.py:358: size=10.4 MiB (+9968 KiB), count=70204 (+65272), average=156 B

That's quite significant!
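
For readers who want to produce this kind of listing themselves: it has the format printed by Python's tracemalloc module when two snapshots are compared. A minimal standalone sketch follows; the "leaked" list is a made-up stand-in for whatever grows in a real process, and in vdsmd the two snapshots would be taken some time apart rather than back to back.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulated leak: objects that stay referenced after the first snapshot.
leaked = [("item-%d" % i).encode() for i in range(50000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group allocations by file and line, sorted by how much each line changed.
top_differences = after.compare_to(before, "lineno")
for stat in top_differences[:5]:
    print(stat)
```

Each printed line carries the same size, count and average fields as the listing quoted above.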

> Top block
> 5511282 memory blocks: 302589.8 KiB
>   File "/usr/lib64/python3.6/site-packages/libvirt.py", line 442
>     ret = libvirtmod.virEventRunDefaultImpl()
>   File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 69
>     libvirt.virEventRunDefaultImpl()
>   File "/usr/lib/python3.6/site-packages/vdsm/common/concurrent.py", line 260
>     ret = func(*args, **kwargs)

You should check where these "ret" objects (of libvirt.py:442) are
stored: 5,511,282 is a lot of small objects (average: 56 bytes)! Maybe
they are stored in a list and never destroyed.

Maybe it's a reference leak in the libvirtmod.virEventRunDefaultImpl()
function of "libvirtmod" C extension: missing Py_DECREF() somewhere.

Or something somehow prevents these objects from being deleted. For
example, an exception is stored somewhere which keeps all variables
alive (in Python 3, an exception stores a traceback object which keeps
all variables of all frames alive).
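
To illustrate that last point, here is a small standalone sketch, not taken from vdsm or libvirt. The names Payload, stored_errors and fail are made up. It stores a caught exception in a long-lived list and uses a weak reference to show that a local variable of the failing function stays alive until the exception is dropped.

```python
import gc
import weakref


class Payload:
    """Stand-in for any object held in a local variable."""


stored_errors = []  # hypothetical long-lived container, e.g. a cache or a log buffer


def fail():
    local_obj = Payload()
    ref = weakref.ref(local_obj)
    try:
        raise ValueError("boom")
    except ValueError as exc:
        # exc.__traceback__ references this frame, and the frame references
        # local_obj, so storing exc keeps local_obj alive after we return.
        stored_errors.append(exc)
    return ref


ref = fail()
gc.collect()
alive_while_stored = ref() is not None

stored_errors.clear()  # drop the exception, and with it traceback and frame
gc.collect()
alive_after_clear = ref() is not None

print(alive_while_stored, alive_after_clear)
```

If something in the process keeps exceptions around like this, every object reachable from the frames in their tracebacks is kept as well.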

On GitHub and GitLab, I found the following code. Maybe there are
minor differences in the versions that you are using.

https://gitlab.com/libvirt/libvirt-python
(I built the code locally to get build/libvirt.py)

build/libvirt.c:
---
PyObject *
libvirt_intWrap(int val)
{
    return PyLong_FromLong((long) val);
}

PyObject *
libvirt_virEventRunDefaultImpl(PyObject *self ATTRIBUTE_UNUSED,
                               PyObject *args ATTRIBUTE_UNUSED) {
    PyObject *py_retval;
    int c_retval;
    LIBVIRT_BEGIN_ALLOW_THREADS;
    c_retval = virEventRunDefaultImpl();
    LIBVIRT_END_ALLOW_THREADS;
    py_retval = libvirt_intWrap((int) c_retval);
    return py_retval;
}

static PyMethodDef libvirtMethods[] = {
    { (char *)"virEventRunDefaultImpl",
      libvirt_virEventRunDefaultImpl, METH_VARARGS, NULL },
    ...
    {NULL, NULL, 0, NULL}
};
---

This code looks correct and straightforward. Is it possible that
internally virEventRunDefaultImpl() calls a Python memory allocator?

build/libvirt.py:
---
def virEventRunDefaultImpl():
    ret = libvirtmod.virEventRunDefaultImpl()
    if ret == -1:
        raise libvirtError('virEventRunDefaultImpl() failed')
    return ret
---

Again, this code looks correct and straightforward.

https://github.com/oVirt/vdsm/blob/37ed5c279c2dd9c9bb06329d674882e0f98f34d6/lib/vdsm/common/libvirtconnection.py

vdsm/common/libvirtconnection.py:
---
def __run(self):
    try:
        libvirt.virEventRegisterDefaultImpl()
        while self.run:
            libvirt.virEventRunDefaultImpl()
    finally:
        self.run = False
---

The result of libvirt.virEventRunDefaultImpl() is ignored, so I don't see
anything obvious that would explain a leak.


Sometimes, looking at the top function is misleading since the
explanation can be found in one of the caller functions.

For example, which function creates 70.2 MiB of objects from a JSON
document? What calls json/decoder.py:355?
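
One way to answer that kind of question is to record several frames per allocation and group the snapshot by full traceback instead of by line. A minimal standalone sketch follows. load_stats is a made-up caller, the frame depth of 10 is arbitrary, and the filter pattern assumes a Unix-style path to the standard library.

```python
import json
import tracemalloc


def load_stats(text):
    """Hypothetical caller that decodes a JSON document and keeps the result."""
    return json.loads(text)


tracemalloc.start(10)  # record up to 10 frames per allocation
kept = [load_stats('{"cpu": [1, 2, 3], "name": "host-%d"}' % i) for i in range(2000)]
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Keep only allocations whose innermost frame is in json/decoder.py, then
# group them by complete traceback to see which call paths are responsible.
decoder_only = snapshot.filter_traces(
    [tracemalloc.Filter(True, "*/json/decoder.py")]
)
by_traceback = decoder_only.statistics("traceback")
biggest = by_traceback[0]
print("%d blocks, %d bytes" % (biggest.count, biggest.size))
for line in biggest.traceback.format():
    print(line)
```

The printed traceback shows the chain of callers that leads into the decoder, which is the information the per-line listing cannot give.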

Victor
___
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/O5OAA6KNLINLRT2VYKNBI2PPH6UIYR4A/