> On Mar 24, 2015, at 4:33 AM, Dan Kenigsberg <dan...@redhat.com> wrote:
>
> On Mon, Mar 23, 2015 at 04:00:14PM -0400, John Taylor wrote:
>> Chris Adams <c...@cmadams.net> writes:
>>
>>> Once upon a time, Sven Kieske <s.kie...@mittwald.de> said:
>>>> On 13/03/15 12:29, Kapetanakis Giannis wrote:
>>>>> We also face this problem since 3.5 in two different installations...
>>>>> Hope it's fixed soon
>>>>
>>>> Nothing will get fixed if no one bothers to
>>>> open BZs and send relevant log files to help
>>>> track down the problems.
>>>
>>> There's already an open BZ:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1158108
>>>
>>> I'm not sure if that is exactly the same problem I'm seeing or not; my
>>> vdsm process seems to be growing faster (RSS grew 952K in a 5-minute
>>> period just now; VSZ didn't change).
>>
>> For those following this, I've added a comment on the BZ [1], although in
>> my case the memory leak is, like Chris Adams's, a lot more than the
>> 300 KiB/h in the original bug report by Daniel Helgenberger.
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1158108
>
> That's interesting (and worrying).
> Could you check your suggestion by editing sampling.py so that
> _get_interfaces_and_samples() returns the empty dict immediately?
> Would this make the leak disappear?
Looks like you've got something there. Just a quick test for now, watching
RSS in top. I'll let it run this way for a while and see what it looks like
in a few hours.

System 1: 13 VMs w/ 24 interfaces between them

11:47 killed a vdsm @ 9.116G RSS (after maybe a week and a half running),
then restarted:

  11:47:  97xxx
  11:57: 135544 and climbing
  12:00: 136400

Restarted again at 12:00 with sampling.py modified to just return an
empty dict:

  def _get_interfaces_and_samples():
      links_and_samples = {}
      return links_and_samples

  12:02: quickly grew to 127694
  12:13: 133352
  12:20: 132476
  12:31: 132732
  12:40: 132656
  12:50: 132800
   1:30: 133928
   1:40: 133136
   1:50: 133116
   2:00: 133128

Interestingly, it looks like overall system load dropped significantly
(from ~40-45% to 10% reported). Mostly that's ksmd getting out of the way
after freeing 9G, but it feels like more than that. (This is a 6-core
system; I usually saw ksmd using ~80% of a single CPU, roughly 15% of the
total available.)

Second system: 10 VMs w/ 17 interfaces, vdsmd @ 5.027G RSS (slightly less
uptime than the previous host). Freeing this RAM caused a ~16% utilization
drop as ksmd stopped running as hard. Restarted at 12:10:

  12:10: 106224
  12:20: 111220
  12:31: 114616
  12:40: 117500
  12:50: 120504
   1:30: 133040
   1:40: 136140
   1:50: 139032
   2:00: 142292
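For anyone else who wants to run the same comparison without staring at
top, below is a rough sketch of the kind of sampling I'm doing by hand.
It just reads VmRSS from /proc/<pid>/status; the PID argument and the
10-minute interval are my own choices, nothing from vdsm itself:

  #!/usr/bin/env python
  # Rough sketch: periodically log a process's RSS so the growth rate
  # can be compared before/after patching sampling.py. Interval and PID
  # handling are assumptions; adjust to taste.
  import sys
  import time

  def read_rss_kib(pid):
      # VmRSS in /proc/<pid>/status is reported in kB.
      with open('/proc/%d/status' % pid) as f:
          for line in f:
              if line.startswith('VmRSS:'):
                  return int(line.split()[1])
      return None

  def main():
      pid = int(sys.argv[1])
      interval = 600  # seconds between samples
      prev = None
      while True:
          rss = read_rss_kib(pid)
          now = time.strftime('%H:%M')
          if prev is None:
              print('%s: %d KiB' % (now, rss))
          else:
              delta = rss - prev
              # Extrapolate to KiB/h, like the figure quoted in the BZ.
              print('%s: %d KiB (%+d KiB, ~%d KiB/h)'
                    % (now, rss, delta, delta * 3600 / interval))
          prev = rss
          time.sleep(interval)

  if __name__ == '__main__':
      main()

For reference, the second system's numbers above (106224 KiB at 12:10 to
142292 KiB at 2:00, i.e. ~36 MiB over 110 minutes) work out to roughly
19 MiB/h, which is indeed far above the 300 KiB/h in the original report.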