Excessive lintian memory usage

2008-12-30 Thread Russ Allbery
I'm a bit stumped on how to track this down, but a full lintian run is
taking up gobs of memory, way more than it should.  Currently on gluck
lintian has a resident set size of 1.5GB.

I suspect that we have a memory leak somewhere, which with Perl means we
aren't cleaning up after ourselves or have something static and global
that shouldn't be and that keeps accumulating more data with each package
we check.  It's not just the package list; that's only about 6MB, and even
with memory bloat from the in-memory data structures, shouldn't be more
than 60MB or so.

This probably has something to do with why archive-wide runs are so slow.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
To UNSUBSCRIBE, email to debian-lint-maint-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: Excessive lintian memory usage

2008-12-30 Thread Russ Allbery
Russ Allbery <r...@debian.org> writes:

> I'm a bit stumped on how to track this down, but a full lintian run is
> taking up gobs of memory, way more than it should.  Currently on gluck
> lintian has a resident set size of 1.5GB.

I found one problem at least.  Several of the check routines had a
lexically global $info variable because I was lazy when I introduced it
and didn't pass it correctly to subroutines.  I'm fixing that now and
double-checking what information is global in all of the checks.

I don't think that fully explains it, though, since when $info is
reassigned when the next package is checked, the old $info should be
garbage-collected.  I'll keep looking.
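
The garbage-collection expectation above can be checked with a minimal
sketch.  Demo::Info is a hypothetical stand-in, not Lintian's real info
class; it only counts how many objects are alive.  Perl frees an object as
soon as its last reference goes away, so reassigning $info should leave
just one object alive.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a per-package info object; this is an
# assumption for illustration, not Lintian's actual class.
package Demo::Info;

my $live = 0;    # number of Demo::Info objects currently alive

sub new {
    my ($class, $name) = @_;
    $live++;
    return bless { name => $name }, $class;
}

# Called by Perl when the last reference to an object is dropped.
sub DESTROY { $live-- }

sub live { return $live }

package main;

my $info;
for my $pkg (qw(foo bar baz)) {
    # Reassigning $info drops the last reference to the previous object,
    # so reference counting frees it right away.
    $info = Demo::Info->new($pkg);
}

print "live objects: ", Demo::Info::live(), "\n";
```

This prints "live objects: 1".  Reassignment alone does not leak, so the
growth has to come from something that keeps holding references.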

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: Excessive lintian memory usage

2008-12-30 Thread Russ Allbery
Russ Allbery <r...@debian.org> writes:

> I don't think that fully explains it, though, since when $info is
> reassigned when the next package is checked, the old $info should be
> garbage-collected.  I'll keep looking.

I think I found it.  checks/menu-format was using a global %file_index
hash that stored all files in the package, but it was a lexical global and
wasn't cleared with multiple calls to run, so I think it was accumulating
entries for every file in Debian.  Likewise, checks/menus had %all_files
and %all_links hashes that were similarly lexically global and weren't
ever cleared.  This was probably causing false negatives on lintian.d.o as
well.

checks/cruft also had a %warned hash that was accumulating all files for
which we issued cruft warnings, although that wouldn't have been as large.
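
The accumulation described above can be reproduced with a minimal sketch.
The names run_leaky and run_fixed and the sample file lists are
hypothetical, not the actual check code.  A hash declared with "my" at file
scope is created once and shared by every later call, while one declared
inside the subroutine starts empty each time.

```perl
use strict;
use warnings;

# File-scoped lexical: created once when the file is loaded and shared
# by every later call to run_leaky().
my %file_index;

# Hypothetical check that never clears its file-scoped hash.
sub run_leaky {
    my (@files) = @_;
    $file_index{$_} = 1 for @files;
    return scalar keys %file_index;
}

# Same check with the hash scoped to a single call.
sub run_fixed {
    my (@files) = @_;
    my %index;    # fresh, empty hash on every call
    $index{$_} = 1 for @files;
    return scalar keys %index;
}

# Two made-up packages, each given as a list of file names.
my @packages = (
    [qw(usr/bin/foo usr/share/doc/foo/copyright)],
    [qw(usr/bin/bar)],
);

for my $files (@packages) {
    printf "leaky: %d  fixed: %d\n", run_leaky(@$files), run_fixed(@$files);
}
```

This prints "leaky: 2  fixed: 2" and then "leaky: 3  fixed: 1".  In the
leaky version, usr/bin/foo is still in the hash while the second package is
being checked, which is how stale entries could hide a missing file.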

The Contents file for all of Debian is 200MB.  Given that by the end of
the run we'd have two copies of that in memory and Perl isn't particularly
good about memory allocation overhead, I suspect that explains the bloat.

I'm testing and committing my patches now.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: Excessive lintian memory usage

2008-12-30 Thread Adam D. Barratt
On Tue, 2008-12-30 at 12:13 -0800, Russ Allbery wrote:
> Russ Allbery <r...@debian.org> writes:
>
> > I don't think that fully explains it, though, since when $info is
> > reassigned when the next package is checked, the old $info should be
> > garbage-collected.  I'll keep looking.
>
> I think I found it.  checks/menu-format was using a global %file_index
> hash that stored all files in the package, but it was a lexical global and
> wasn't cleared with multiple calls to run, so I think it was accumulating
> entries for every file in Debian.
[and some more]
> The Contents file for all of Debian is 200MB.  Given that by the end of
> the run we'd have two copies of that in memory and Perl isn't particularly
> good about memory allocation overhead, I suspect that explains the bloat.

Ugh. :-/

From a quick look, checks/menus also has a global $info which I
introduced when moving the script to Lintian::Collect (it's called from
a subroutine outside of run(), although I should just have passed
$info->index->{$file} in instead). That shouldn't be a problem though,
as it'll be reinitialised at the start of each invocation of run().
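
Passing the per-file data into the subroutine might look like the sketch
below.  The helper is_executable, the mode field and the sample index are
all hypothetical, and this run() takes the index hash directly rather than
mirroring Lintian's real interface.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the index entry for one file as an argument
# instead of reaching for a file-scoped $info.
sub is_executable {
    my ($entry) = @_;
    return defined $entry && ($entry->{mode} & 0111) ? 1 : 0;
}

# Simplified run(): the index is a hash ref (file name => entry) that
# lives only for the duration of this call.
sub run {
    my ($index) = @_;
    my @hits;
    for my $file (sort keys %$index) {
        push @hits, $file if is_executable($index->{$file});
    }
    return @hits;
}

# Made-up index for one package.
my @found = run({
    'usr/bin/foo'              => { mode => 0755 },
    'usr/share/doc/foo/README' => { mode => 0644 },
});
print "@found\n";
```

This prints "usr/bin/foo".  Nothing outside run() holds on to the index, so
it can be freed as soon as the call returns.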

> I'm testing and committing my patches now.

Thanks for looking at and (hopefully) fixing this.

Adam

