-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Scott A Crosby writes:
> On Mon, 28 Feb 2005 15:34:13 +0000, [EMAIL PROTECTED] (Justin Mason) writes:
> 
> > A paper at the spam conference suggested using an Edit Distance algorithm
> > with very good results; the idea being, the edit distance from "cialis" to
> > "C 1 a l | s" isn't as far as it is to "specialized" or so on.
> >
> > if I recall correctly, someone submitted an implementation quite a while
> > ago on our BZ, but I think the FP rates were too high.   Given the
> > recent paper's published results, though, it may be there are good ways
> > to tweak it to get FPs at a tolerable rate.
> 
> I did an implementation of it some time ago, but I didn't get a chance
> to take it far enough to test out its effectiveness. I heard remarks
> that naively applying edit distance is too slow. To avoid having a FP
> rate that was too high, the edit-distance costs are parameterized, so
> some edits are much cheaper than others. Eg.
> 
> # Cost of replacing a character with punctuation in the obfuscated string.
> setreps ("bcdfghijklmnpqrstvwxyz","*?.-",.08);
> setreps ("aeiou","*?.-",.03);
> 
> # Cost to insert these into the obfuscated string is cheap
> setins ("/\|()=-'!*`;:?+[]\"^",.01);
> setins ("_,.",.01);
> 
> So, 'v.agr.' and 'v..ia...gra' both cost <.10  
> 
> Got a bugzilla# that I can attach the prototype code to?  (Also, is it
> possible to report a bug/attach the code without creating a bugzilla
> account?)
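
(For anyone reading along: here is a rough sketch of how cost tables like
the ones quoted above could feed a weighted edit distance. This is not
Scott's prototype, which hasn't been posted yet. It is in Python for
brevity. Only the setreps/setins names and the cost values come from the
quoted mail; DEFAULT_COST, the deletion cost, weighted_distance() and the
table layout are my assumptions.)

```python
# Sketch only: a weighted edit distance from a clean word to an obfuscated
# string, using cost tables shaped like the quoted setreps/setins calls.

DEFAULT_COST = 1.0  # assumption: any edit without a listed cost is expensive

rep_cost = {}  # (word_char, obfuscated_char) -> substitution cost
ins_cost = {}  # obfuscated_char -> cost of inserting it


def setreps(targets, replacements, cost):
    """Make replacing any char in `targets` by any char in `replacements` cheap."""
    for t in targets:
        for r in replacements:
            rep_cost[(t, r)] = cost


def setins(chars, cost):
    """Make inserting any of `chars` into the obfuscated string cheap."""
    for c in chars:
        ins_cost[c] = cost


# Values copied from the quoted mail. 'i' appears in both letter lists;
# in this sketch the later call simply overwrites the earlier one.
setreps("bcdfghijklmnpqrstvwxyz", "*?.-", 0.08)
setreps("aeiou", "*?.-", 0.03)
# Assumption: the "\|" in the quoted string is read as a plain "|".
setins("/|()=-'!*`;:?+[]\"^", 0.01)
setins("_,.", 0.01)


def weighted_distance(word, obfu):
    """Cheapest cost of turning `word` into `obfu` (standard DP table)."""
    n, m = len(word), len(obfu)
    # dist[i][j] = cheapest way to turn word[:i] into obfu[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        # Dropping a character of the word: assumed to cost DEFAULT_COST.
        dist[i][0] = dist[i - 1][0] + DEFAULT_COST
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + ins_cost.get(obfu[j - 1], DEFAULT_COST)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w, o = word[i - 1], obfu[j - 1]
            sub = 0.0 if w == o else rep_cost.get((w, o), DEFAULT_COST)
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub,                          # match or replace
                dist[i][j - 1] + ins_cost.get(o, DEFAULT_COST),    # insert into obfu
                dist[i - 1][j] + DEFAULT_COST,                     # drop from word
            )
    return dist[n][m]


if __name__ == "__main__":
    for candidate in ("v.agr.", "v..ia...gra", "vinegar"):
        print(candidate, weighted_distance("viagra", candidate))
```

With these tables both obfuscated spellings from the quoted mail stay under
.10, while an ordinary word like "vinegar" costs at least one full edit.
The plain DP is quadratic per word/string pair, so the "too slow" concern
above would still apply to a naive scan over a whole message.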

Hi Scott -- looks like there isn't a BZ entry that'd be suitable (at
a glance).  Also, it sounds like it's "new" enough that throwing
it into an existing patch might be confusing anyway.  So I'd say
go ahead and create one.

btw, you don't have a BZ account?  how did you manage *that*! ;)

we do prefer stuff go through BZ, since our CLA-tracking is implemented in
there, and it's the de-facto-standard way for us to find old
conversations, assertions, assumptions, and so on regarding implementation
details -- we can simply say "see bug XXXX" in the changelog, and the
entire thread of discussion can then be found with one click from that.
But in an emergency you could always mail it to dev and hopefully someone
will pick it up and create the bug for you.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCJY6aMJF5cimLx9ARAg1VAKCbDVxK/02yiEm1ZUFjjU7plLD1TwCfUw8v
K6U9f9DS1SJn6KXgLkidx7s=
=L0Yv
-----END PGP SIGNATURE-----
