Package: ssdeep
Version: 2.7-2
Severity: important
Tags: patch
Dear Maintainer,
ssdeep (and the libfuzzy2 Debian package) before version 2.10 has a bug
that can produce a wrong score when comparing two fuzzy hashes with the
same block size. This makes clustering and comparing files unreliable.
This bug was fixed upstream in 2.10 by Jesse Kornblum
<[email protected]> but is still not fixed in the Debian versions
(sid, unstable and stable).
I encountered this bug while clustering about 10M files based on ssdeep
hashes and I had to recluster all the files.
Sorry that I have no `natural' examples to reproduce it (I slightly
changed the parameter after building patched versions of
ssdeep/libfuzzy2 2.7-2, and comparing the clusters would take about
2 months * 20 CPU cores), but we can generate an `artificial' example by
truncating the second chunk of the fuzzy hashes.
[PROMPT_EXAMPLE_BEGIN]
$ # Generate artificial test cases
$ cat >test <<_END
ssdeep,1.1--blocksize:hash:hash,filename
24:5nmkHuww9FXe0ZpPKoVH7bK3KT1Odk8gKgNWvoqzDVEatXSHlY31x:E4uV9FX,"1"
24:5nmkHuww9FXe0ZpPKoVH7bK3KT1Odk8gKgNWvoqzDVENXSCYA1x:E4uV9FX,"2"
_END
$ # This is the expected result.
$ $SSDEEP_FIXED/ssdeep -k test -x test
test:1 matches test:2 (100)
test:1 matches test:2 (100)
test:2 matches test:1 (100)
test:2 matches test:1 (100)
test:1 matches test:2 (100)
test:1 matches test:2 (100)
test:2 matches test:1 (100)
test:2 matches test:1 (100)
$ # This is the result from Debian versions of ssdeep.
$ ssdeep -k test -x test
test:1 matches test:2 (94)
test:1 matches test:2 (94)
test:2 matches test:1 (94)
test:2 matches test:1 (94)
test:1 matches test:2 (94)
test:1 matches test:2 (94)
test:2 matches test:1 (94)
test:2 matches test:1 (94)
$
[PROMPT_EXAMPLE_END]
As you can see, the buggy ssdeep/libfuzzy2 returns a score of 94, while
the fixed version of ssdeep/libfuzzy2 returns a score of 100, for these
cases:
* file 1 and file 2
* file 1 and file 1 (matching itself)
* file 2 and file 2 (matching itself)
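The artificial signatures in the test file above were made by truncating the second chunk of a fuzzy hash. A minimal sketch of that step in POSIX sh follows. The helper name `truncate_second_chunk` and the sample signature are made up for illustration and are not part of ssdeep, and the input is assumed to be a bare `blocksize:chunk1:chunk2` string without the trailing `,"filename"` field.

```shell
#!/bin/sh
# Hypothetical helper (not part of ssdeep): keep only the first N
# characters of the second chunk of a "blocksize:chunk1:chunk2" hash.
truncate_second_chunk() {
    sig=$1
    keep=$2
    blocksize=${sig%%:*}   # text before the first colon
    rest=${sig#*:}         # "chunk1:chunk2"
    chunk1=${rest%%:*}     # text before the second colon
    chunk2=${rest#*:}      # text after the second colon
    short=$(printf '%s' "$chunk2" | cut -c1-"$keep")
    printf '%s:%s:%s\n' "$blocksize" "$chunk1" "$short"
}

# Made-up signature, not taken from a real file.
truncate_second_chunk '24:AAAABBBBCCCCDDDD:EEEEFFFFGGGG' 7
# prints 24:AAAABBBBCCCCDDDD:EEEEFFF
```

The resulting lines can then be placed under the `ssdeep,1.1--blocksize:hash:hash,filename` header, with a filename field appended, to form a file for `ssdeep -k ... -x ...`.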
The attached patch is an excerpt from Jesse Kornblum's actual patch
(applied in ssdeep 2.10), adapted to the Debian version 2.7-2.
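The reason for the one-line change is the signature layout: in `blocksize:chunk1:chunk2`, the first chunk is computed at the stated block size and the second chunk at twice that size, so the second comparison has to be scored with the doubled value. A small sketch of this relationship, using a hypothetical helper name (`chunk_block_sizes`) and a made-up signature:

```shell
#!/bin/sh
# Hypothetical helper (not part of ssdeep): print the block size that each
# chunk of a "blocksize:chunk1:chunk2" fuzzy hash corresponds to.
chunk_block_sizes() {
    blocksize=${1%%:*}     # leading number before the first colon
    # The second chunk corresponds to twice the leading block size.
    echo "chunk1=$blocksize chunk2=$((blocksize * 2))"
}

# Made-up signature, not taken from a real file.
chunk_block_sizes '24:AAAABBBBCCCCDDDD:EEEEFFF'
# prints chunk1=24 chunk2=48
```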
By the way, I recommend UPGRADING THE UPSTREAM VERSION TO 2.10 on
`unstable' and `sid' instead of applying the patch, because ssdeep
version 2.10 fixes some other bugs (I have not encountered them, but
someone else may).
Thanks and I hope this will be fixed before `Jessie' is frozen.
Tsukasa OI
http://a4lg.com/
-- System Information:
Debian Release: 7.6
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.2.0-4-amd64 (SMP w/40 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages ssdeep depends on:
ii libc6 2.13-38+deb7u4
ssdeep recommends no packages.
ssdeep suggests no packages.
-- no debconf information
diff --git a/fuzzy.c b/fuzzy.c
index a9b771c..bcdef56 100644
--- a/fuzzy.c
+++ b/fuzzy.c
@@ -584,7 +584,7 @@ int fuzzy_compare(const char *str1, const char *str2)
   if (block_size1 == block_size2) {
     uint32_t score1, score2;
     score1 = score_strings(s1_1, s2_1, block_size1);
-    score2 = score_strings(s1_2, s2_2, block_size2);
+    score2 = score_strings(s1_2, s2_2, block_size1*2);
     //    s->block_size = block_size1;