On Fri, Jan 15, 2016 at 11:22 AM, Francesco Romani <[email protected]> wrote: > ----- Original Message ----- >> From: "Nir Soffer" <[email protected]> >> To: "devel" <[email protected]>, "Francesco Romani" <[email protected]> >> Sent: Friday, January 15, 2016 9:52:13 AM >> Subject: Re: [VDSM] health: Introduce Vdsm health monitoring >> >> I tested with https://gerrit.ovirt.org/#/c/51630 >> >> libvirt.virDomain object does not leak any more, but pthreading object >> are still leaking, >> and the number of uncollectible objects is still very high (12,915). >> >> Each time we start and stop a vm, we leak about 6000 objects which are part >> of a reference cycle, and each time we create and leak 27 pthreading >> locks/conditions >> which are probably the reason for this leak (because they implement >> __del__ or reference object >> implementing __del__. > > *Surely* this deserve more investigation, *most likley* there is more to be > fixed, > and *surely* I'll invest more energy in research, but I think we should also > apply > some grains of salt to the raw results gc gives us. > I've been running a test overnight like > > - start 20 vms > - sleep 2m > - stop 20 vms > - sleep 30s > > repeated 100 times, monitoring VmRSS (/proc/$VDSM_PID/status) and I didn't > observed > any _significant_ leak. > The VmRSS growth was pretty low, almost flat. So something doesn't click here. > > I mean, 6000 objects (at least) per vm each time *should* be noticeable after > all! > > Once 51630 is in good shape I'll broad the net and go hunt for more virt > leaks.
After removing the gc.DEBUG_SAVEALL, we see now only the libvirt.virDomain leak. Turns out that the pthreading objects leak was caused by saving all the unreachable objects when using gc.DEBUG_SAVEALL. Here is a run with https://gerrit.ovirt.org/51708, but without https://gerrit.ovirt.org/51630 # tail -f /var/log/vdsm/vdsm.log | grep health MainThread::INFO::2016-01-15 16:14:14,789::health::58::health::(start) Starting health monitor (interval=60) Thread-15::DEBUG::2016-01-15 16:14:14,793::health::70::health::(_run) Health monitor started Thread-15::DEBUG::2016-01-15 16:15:14,794::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:15:14,815::health::86::health::(_check) Collected 54 objects activating host... Thread-15::DEBUG::2016-01-15 16:16:14,815::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:16:14,837::health::86::health::(_check) Collected 299 objects Thread-15::DEBUG::2016-01-15 16:17:14,838::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:17:14,858::health::86::health::(_check) Collected 128 objects starting vm... Thread-15::DEBUG::2016-01-15 16:18:14,858::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:18:14,883::health::86::health::(_check) Collected 1771 objects Thread-15::DEBUG::2016-01-15 16:19:14,884::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:19:14,909::health::86::health::(_check) Collected 167 objects stopping vm... Thread-15::DEBUG::2016-01-15 16:20:14,909::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:20:14,934::health::86::health::(_check) Collected 179 objects vm is down... Thread-15::DEBUG::2016-01-15 16:21:14,934::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:21:14,960::health::86::health::(_check) Collected 2284 objects Thread-15::WARNING::2016-01-15 16:21:14,960::health::94::health::(_check) Found 1 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>'] starting vm... Thread-15::DEBUG::2016-01-15 16:22:14,961::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:22:14,987::health::86::health::(_check) Collected 1721 objects Thread-15::WARNING::2016-01-15 16:22:14,987::health::94::health::(_check) Found 1 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>'] stopping vm... Thread-15::DEBUG::2016-01-15 16:23:14,988::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:23:15,014::health::86::health::(_check) Collected 189 objects Thread-15::WARNING::2016-01-15 16:23:15,014::health::94::health::(_check) Found 1 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>'] Thread-15::DEBUG::2016-01-15 16:24:15,014::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:24:15,040::health::86::health::(_check) Collected 2240 objects Thread-15::WARNING::2016-01-15 16:24:15,041::health::94::health::(_check) Found 2 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>', '<libvirt.virDomain object at 0x7f79ec10cf90>'] vm is down here (probbaly was done before the log) Thread-15::DEBUG::2016-01-15 16:25:15,041::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:25:15,065::health::86::health::(_check) Collected 128 objects Thread-15::WARNING::2016-01-15 16:25:15,065::health::94::health::(_check) Found 2 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>', '<libvirt.virDomain object at 0x7f79ec10cf90>'] starting vm... Thread-15::DEBUG::2016-01-15 16:26:15,067::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:26:15,094::health::86::health::(_check) Collected 1749 objects Thread-15::WARNING::2016-01-15 16:26:15,094::health::94::health::(_check) Found 2 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>', '<libvirt.virDomain object at 0x7f79ec10cf90>'] stopping vm... Thread-15::DEBUG::2016-01-15 16:27:15,096::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:27:15,119::health::86::health::(_check) Collected 167 objects Thread-15::WARNING::2016-01-15 16:27:15,120::health::94::health::(_check) Found 2 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>', '<libvirt.virDomain object at 0x7f79ec10cf90>'] vm is shutting down... Thread-15::DEBUG::2016-01-15 16:28:15,120::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:28:15,144::health::86::health::(_check) Collected 2262 objects Thread-15::WARNING::2016-01-15 16:28:15,144::health::94::health::(_check) Found 3 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>', '<libvirt.virDomain object at 0x7f79ec10cf90>', '<libvirt.virDomain object at 0x1f0a2d0>'] vm is down... Thread-15::DEBUG::2016-01-15 16:29:15,145::health::84::health::(_check) Checking health Thread-15::DEBUG::2016-01-15 16:29:15,167::health::86::health::(_check) Collected 128 objects Thread-15::WARNING::2016-01-15 16:29:15,167::health::94::health::(_check) Found 3 uncollectable objects: ['<libvirt.virDomain object at 0x1ef6550>', '<libvirt.virDomain object at 0x7f79ec10cf90>', '<libvirt.virDomain object at 0x1f0a2d0>'] When running with both https://gerrit.ovirt.org/51708 and https://gerrit.ovirt.org/51630, no object leak because of reference cycles and __del__. I'll abandon https://gerrit.ovirt.org/51868 since it is not needed. Nir _______________________________________________ Devel mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/devel
