Hi,

after upgrading to 4.9.1 I still see the collectd process growing. We are heavily using the LISTVAL/GETVAL commands through the unixsocket plugin, which makes the process grow to about 180M a day.

After analyzing a bit, I found a more or less obvious memory leak, which always happens when writing to the socket fails. In that case, print_to_socket() returns -1 without freeing the memory. We have many failed writes to the socket, probably because the client is an AJAX tool that polls constantly.

I attached 2 patches against 4.9.1 which fix this leak. The first patch is an implementation of a simple garbage collector (as seen in the OpenVPN code). The second replaces the sfree() calls with the garbage collector cleanup and triggers the cleanup on return.

I will stress-test this for a while now, probably also using dmalloc, since I fear there is still another memory leak. At least when I tried to FLUSH constantly, the process was still growing fast. LISTVAL and GETVAL *seem* to be ok now: under LISTVAL/GETVAL hammering the process size stays constant most of the time. After a while it grows a little, which is probably because of the flush leak.

Comments on this patchset are much appreciated.

BTW: changing the code to use a garbage collector the way OpenVPN does (take a look at it) would greatly reduce the risk of memory leaks. To do that in a sane manner, however, it is necessary to wrap *all* memory allocations.

Ok, working on flush now.

Kind regards,
peter
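
To illustrate the cleanup-on-return idea described above, here is a minimal sketch of an arena-style garbage collector. It is not the code from the attached patches: gc_arena_t, gc_malloc(), gc_free(), handle_command() and the two writer callbacks are hypothetical names, and the failing socket write is simulated by a callback that always returns -1. Every allocation is registered with the arena, and one gc_free() at the single exit point releases everything, whether the write succeeded or not.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; this is not the patch code. */
typedef struct gc_entry_s {
  struct gc_entry_s *next; /* header stored in front of each payload */
} gc_entry_t;

typedef struct {
  gc_entry_t *head; /* linked list of tracked allocations */
  size_t live;      /* number of allocations currently tracked */
} gc_arena_t;

static void gc_init(gc_arena_t *gc) {
  gc->head = NULL;
  gc->live = 0;
}

/* Allocate zeroed memory and register it with the arena.
 * The payload is only pointer-aligned, which is enough for strings. */
static void *gc_malloc(gc_arena_t *gc, size_t size) {
  gc_entry_t *e;
  assert(gc != NULL);
  e = calloc(1, sizeof(*e) + size);
  if (e == NULL)
    return NULL;
  e->next = gc->head;
  gc->head = e;
  gc->live++;
  return (void *)(e + 1);
}

/* Free every allocation registered with the arena. */
static void gc_free(gc_arena_t *gc) {
  while (gc->head != NULL) {
    gc_entry_t *next = gc->head->next;
    free(gc->head);
    gc->head = next;
    gc->live--;
  }
}

typedef int (*write_fn_t)(const char *buf);

/* Writer that succeeds (prints to stdout). */
static int write_ok(const char *buf) {
  return (fputs(buf, stdout) == EOF) ? -1 : 0;
}

/* Writer that always fails, simulating a client that went away. */
static int write_fail(const char *buf) {
  (void)buf;
  return -1;
}

/* Stand-in for a command handler: builds a reply and writes it out.
 * *still_tracked reports how many allocations survive the cleanup. */
static int handle_command(write_fn_t out, const char *ident,
                          size_t *still_tracked) {
  gc_arena_t gc;
  int status = -1;
  size_t len = strlen(ident) + 2; /* ident + '\n' + '\0' */
  char *reply;

  gc_init(&gc);
  reply = gc_malloc(&gc, len);
  if (reply == NULL)
    goto done;
  snprintf(reply, len, "%s\n", ident);
  if (out(reply) != 0)
    goto done; /* failed write: still falls through to the cleanup */
  status = 0;

done:
  gc_free(&gc); /* single cleanup point for all exit paths */
  *still_tracked = gc.live;
  return status;
}
```

Calling handle_command(write_fail, ...) returns -1 and still leaves no tracked allocations behind. The point of the design is that a new early exit cannot forget an individual free, as long as it goes through the common cleanup.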
