On 25 September 2017 at 18:08, Jeff King wrote:
> On Sun, Sep 24, 2017 at 09:59:28PM +0200, Martin Ågren wrote:
>
>> > I'm not sure of the best way to count things.
>>
>
> But at least on the topic of "how many unique leaks are there", I wrote
> the script below to try to give some basic answers. It just finds the
> first non-boring entry in each stack trace and reports that. Where
> "boring" is really "this function is not expected to free, but hands off
> memory ownership to somebody else".

Thanks. I combined your script with this:

-- >8 --
#!/usr/bin/perl -w
# Extract the stacktraces and identify them
# by their SHA-1 hashes (these identifiers are
# not guaranteed to be stable across
# re-compilations of the Git binaries).
use Digest::SHA;
my $ctx = Digest::SHA->new("SHA-1");
# Stage 0: looking for a leak header; stage 1: inside a stacktrace.
my $stage = 0;
while (<>) {
	my $collect = 0;
	if ($stage == 0 && /irect leak of \d+ byte.*allocated from:$/) {
		# Matches both "Direct leak" and "Indirect leak".
		$stage++;
		$collect = 1;
	} elsif ($stage == 1 && /^\s*\#\d+\s+/) {
		# A stack frame, e.g. "#0 0x... in ...".
		$collect = 1;
	} elsif ($stage == 1 && /^\s*$/) {
		# A blank line ends the trace.
		my $digest = $ctx->hexdigest;
		printf "Stacktrace-hash: %s\n", $digest;
		$ctx = Digest::SHA->new("SHA-1");
		$stage = 0;
	} elsif ($stage == 1) {
		print "warning: unidentified string '$_'\n";
	}
	if ($collect) {
		# Hash the bytes of the line.
		$ctx->add($_);
		print;
	}
}
-- >8 --
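
The hashes cover the raw frame lines, addresses included, which is
why they are not stable across rebuilds. One possible way around that
is to strip the address from each frame before hashing. This is only a
sketch: `normalize_frames` is a made-up name, and it assumes frames
look like "    #1 0x4c8a2b in xmalloc wrapper.c:45".

```shell
# Hypothetical helper: drop the per-build address from each stack frame,
# keeping the frame number, function name and source location.
# Lines that do not look like frames are passed through unchanged.
normalize_frames () {
	sed -E 's/^([[:space:]]*#[0-9]+)[[:space:]]+0x[0-9a-fA-F]+[[:space:]]+/\1 /'
}
```

It would sit in front of the script above, e.g.
`cat "$d"/* | normalize_frames | perl extract-traces.pl`. Module offsets
such as "(libfoo.so+0x1234)" and source line numbers are left alone, so
the identifiers can still change when the code itself changes.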

Then I report various ad-hoc metrics:

-- >8 --
#!/bin/bash
# For each directory of leak reports given as an argument,
# print a handful of ad-hoc metrics.
for d in "$@"
do
	echo "$d"
	echo -n " direct leaks: "
	grep "Direct leak" "$d"/* | wc -l
	echo -n " indirect leaks: "
	grep "Indirect leak" "$d"/* | wc -l
	echo -n " allocating places: "
	perl leaks.pl "$d"/* | sort -u | wc -l
	echo -n " most common allocating place: "
	perl leaks.pl "$d"/* | sort \
		| uniq -c | sort -nr | head -1 | awk '{print $1;}'
	echo -n " size of leak-reports: "
	cat "$d"/* | wc -l
	echo -n " unique leaking stacktraces: "
	perl extract-traces.pl "$d"/* | grep "Stacktrace-hash" | sort -u | wc -l
	echo -n " most common stacktrace: "
	perl extract-traces.pl "$d"/* | grep "Stacktrace-hash" | sort \
		| uniq -c | sort -nr | head -1 | awk '{print $1;}'
done
-- >8 --

If PIDs of leaking processes collide, reports are lost. Raising the
PID limit makes collisions less likely; as root, something like this
helps:

    echo 4194303 > /proc/sys/kernel/pid_max
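
The write could also be guarded so that a limit which is already high
enough is left alone. A minimal sketch: `raise_pid_max` and its
optional second argument (the file to write, handy for a dry run) are
made up for illustration.

```shell
# Hypothetical helper: raise pid_max to $1 unless it is already at
# least that high. $2 overrides the file to read and write and
# defaults to the real procfs entry (which needs root to write).
raise_pid_max () {
	want=$1
	file=${2:-/proc/sys/kernel/pid_max}
	cur=$(cat "$file") || return 1
	if [ "$cur" -lt "$want" ]
	then
		echo "$want" >"$file"
	fi
}
```

As root, `raise_pid_max 4194303` would then do the same as the plain
echo, but only when the current value is lower.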

Still, the numbers vary for back-to-back runs. Here are two runs on
master and two runs on master plus the lockfile-patches I just sent.
(I don't run all tests.)

lsan_ea220ee4
 direct leaks: 127165
 indirect leaks: 83897
 allocating places: 504
 most common allocating place: 10212
 size of leak-reports: 3662204
 unique leaking stacktraces: 83265
 most common stacktrace: 55

lsan_ea220ee4-rerun
 direct leaks: 127172
 indirect leaks: 83903
 allocating places: 504
 most common allocating place: 10212
 size of leak-reports: 3662334
 unique leaking stacktraces: 83644
 most common stacktrace: 57

lsan_ea220ee4+lockfile_fixes
 direct leaks: 118678
 indirect leaks: 83908
 allocating places: 493
 most common allocating place: 10212
 size of leak-reports: 3545563
 unique leaking stacktraces: 99834
 most common stacktrace: 32

lsan_ea220ee4+lockfile_fixes-rerun
 direct leaks: 118678
 indirect leaks: 83902
 allocating places: 491
 most common allocating place: 10212
 size of leak-reports: 3545463
 unique leaking stacktraces: 82171
 most common stacktrace: 40
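
For comparing two runs, a relative change can be easier to eyeball
than the raw counts. A small sketch, where `pct_change` is a made-up
helper taking the old and the new value of a metric:

```shell
# Hypothetical helper: print the relative change from $1 (old) to
# $2 (new) in percent, with one decimal. Negative means a decrease.
pct_change () {
	awk -v old="$1" -v new="$2" \
		'BEGIN { printf "%.1f\n", (new - old) * 100 / old }'
}
```

Fed the direct-leak counts from the first run on master and the first
run with the lockfile fixes, `pct_change 127165 118678` gives -6.7,
i.e. about 6.7% fewer direct leaks.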

> So I don't know how useful any of that will be, but it at least should
> give _some_ metric that should be diminishing as we fix leaks.

Indeed.

Martin