Excessive lintian memory usage
I'm a bit stumped on how to track this down, but a full lintian run is taking up gobs of memory, way more than it should. Currently on gluck lintian has a resident set size of 1.5GB.

I suspect that we have a memory leak somewhere, which with Perl means we aren't cleaning up after ourselves or have something static and global that shouldn't be and that keeps accumulating more data with each package we check. It's not just the package list; that's only about 6MB, and even with memory bloat from the in-memory data structures, shouldn't be more than 60MB or so.

This probably has something to do with why archive-wide runs are so slow.

--
Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/

--
To UNSUBSCRIBE, email to debian-lint-maint-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: Excessive lintian memory usage
Russ Allbery r...@debian.org writes:

> I'm a bit stumped on how to track this down, but a full lintian run is taking up gobs of memory, way more than it should. Currently on gluck lintian has a resident set size of 1.5GB.

I found one problem at least. Several of the check routines had a lexically global $info variable because I was lazy when I introduced it and didn't pass it correctly to subroutines. I'm fixing that now and double-checking what information is global in all of the checks.

I don't think that fully explains it, though, since when $info is reassigned when the next package is checked, the old $info should be garbage-collected. I'll keep looking.

--
Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/
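The difference between a file-scoped $info and one passed explicitly can be sketched roughly as below. This is an illustration only, not lintian's code: the FakeInfo class, its live-object counter, and the run_global/run_explicit subroutines are hypothetical stand-ins. It shows that a file-scoped lexical keeps the last object alive after run returns, but that reassigning it frees the previous one, which fits the observation that this alone should not account for the growth.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the per-package $info object. It only counts
# how many instances are currently alive.
package FakeInfo;
my $live = 0;
sub new     { my ($class, $name) = @_; $live++; return bless { name => $name }, $class }
sub name    { return $_[0]{name} }
sub live    { return $live }
sub DESTROY { $live-- }

package main;

# Variant 1: $info is a file-scoped lexical that the helper reads directly.
my $info;
sub run_global    { ($info) = @_; return helper_global() }
sub helper_global { return 'checking ' . $info->name }

# Variant 2: the object is passed to the helper as an argument.
sub run_explicit    { my ($pkg_info) = @_; return helper_explicit($pkg_info) }
sub helper_explicit { my ($pkg_info) = @_; return 'checking ' . $pkg_info->name }

run_explicit(FakeInfo->new('foo'));
my $after_explicit = FakeInfo->live;    # nothing holds 'foo' any more

run_global(FakeInfo->new('bar'));
my $after_global = FakeInfo->live;      # $info still references 'bar'

run_global(FakeInfo->new('baz'));
my $after_second = FakeInfo->live;      # 'bar' was freed when $info was reassigned

print "explicit=$after_explicit global=$after_global second=$after_second\n";
# prints "explicit=0 global=1 second=1"
```

Since Perl frees an object as soon as its last reference goes away, the file-scoped variable holds at most one stale object at a time in this sketch.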
Re: Excessive lintian memory usage
Russ Allbery r...@debian.org writes:

> I don't think that fully explains it, though, since when $info is reassigned when the next package is checked, the old $info should be garbage-collected. I'll keep looking.

I think I found it. checks/menu-format was using a global %file_index hash that stored all files in the package, but it was a lexical global and wasn't cleared with multiple calls to run, so I think it was accumulating entries for every file in Debian. Likewise, checks/menus had %all_files and %all_links hashes that were similarly lexically global and weren't ever cleared. This was probably causing false negatives on lintian.d.o as well. checks/cruft also had a %warned hash that was accumulating all files for which we issued cruft warnings, although that wouldn't have been as large.

The Contents file for all of Debian is 200MB. Given that by the end of the run we'd have two copies of that in memory and Perl isn't particularly good about memory allocation overhead, I suspect that explains the bloat.

I'm testing and committing my patches now.

--
Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/
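The accumulation described above can be reproduced in miniature. The following is a minimal sketch, not the actual check code: only the name %file_index comes from the message, while run_leaky, run_fixed, and the package data are made up for illustration. A hash declared with my at file scope survives across calls, so each call adds to what earlier calls left behind; declaring the hash inside the subroutine (or clearing it at the start of each call) avoids that.

```perl
use strict;
use warnings;

# File-scoped lexical: it outlives every individual call to run_leaky().
my %file_index;

# Leaky variant (hypothetical): never clears %file_index, so entries from
# previously checked packages stay in the hash.
sub run_leaky {
    my ($pkg, @files) = @_;
    $file_index{$_} = 1 for @files;
    return scalar keys %file_index;
}

# Fixed variant (hypothetical): the hash is local to the call and is freed
# when the subroutine returns.
sub run_fixed {
    my ($pkg, @files) = @_;
    my %index;
    $index{$_} = 1 for @files;
    return scalar keys %index;
}

# Made-up package data: a name followed by the files it ships.
my @packages = (
    [ 'foo', 'usr/bin/foo', 'usr/share/doc/foo/copyright' ],
    [ 'bar', 'usr/bin/bar', 'usr/share/doc/bar/copyright' ],
);

my (@leaky, @fixed);
for my $p (@packages) {
    my ($name, @files) = @$p;
    push @leaky, run_leaky($name, @files);
    push @fixed, run_fixed($name, @files);
}

print "leaky: @leaky\n";    # prints "leaky: 2 4"
print "fixed: @fixed\n";    # prints "fixed: 2 2"
```

In the leaky variant, the index consulted while checking bar still contains usr/bin/foo, so a lookup for a file that bar does not ship would wrongly succeed. That is the kind of stale hit that could hide a warning.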
Re: Excessive lintian memory usage
On Tue, 2008-12-30 at 12:13 -0800, Russ Allbery wrote:

> Russ Allbery r...@debian.org writes:
>
> > I don't think that fully explains it, though, since when $info is reassigned when the next package is checked, the old $info should be garbage-collected. I'll keep looking.
>
> I think I found it. checks/menu-format was using a global %file_index hash that stored all files in the package, but it was a lexical global and wasn't cleared with multiple calls to run, so I think it was accumulating entries for every file in Debian.

[and some more]

> The Contents file for all of Debian is 200MB. Given that by the end of the run we'd have two copies of that in memory and Perl isn't particularly good about memory allocation overhead, I suspect that explains the bloat.

Ugh. :-/

From a quick look, checks/menus also has a global $info which I introduced when moving the script to Lintian::Collect (it's called from a subroutine outside of run(), although I should just have passed $info->index->{$file} in instead). That shouldn't be a problem though, as it'll be reinitialised at the start of each invocation of run().

> I'm testing and committing my patches now.

Thanks for looking at and (hopefully) fixing this.

Adam