-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
[de-cc'ing p5p on this one since it's just sa-related.]
demerphq writes:
> Incidentally I did notice that many of the patterns used could be
> reworked to be more efficient assuming the patch is applied. Also I
> was kinda wondering about the code generated. It seems odd that each
> regex rule gets its own subroutine; wouldn't it be better to precompile
> the regexes into a single routine? There appears to be low-hanging
> optimization fruit in the SA code generator stuff. I.e.:
>
> if ($self->{conf}->{scores}->{q{__NIGERIAN_BODY_18}}) {
> # call procedurally as it is faster.
> __NIGERIAN_BODY_18_body_test($self,@_);
> }
>
> is an example of low-hanging fruit. That should read:
>
> my $lu=$self->{conf}->{scores};
> if ($lu->{q{__NIGERIAN_BODY_18}}) {
> # call procedurally as it is faster.
> __NIGERIAN_BODY_18_body_test($self,@_);
> }
>
> which would save two redundant hash lookups per rule.
That's very true ;)
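
A minimal sketch of the hoisting idea, outside the generator. Only the
{conf}{scores} shape and the __NIGERIAN_BODY_18 name come from the quoted
code; the other rule names, the sub names and the stand-in $self hash are
made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the per-message object; only the
# {conf}{scores} shape is taken from the quoted generated code.
my $self = {
    conf => {
        scores => {
            '__NIGERIAN_BODY_18' => 1,
            '__EXAMPLE_RULE_A'   => 0,      # hypothetical rule name
            '__EXAMPLE_RULE_B'   => 2.5,    # hypothetical rule name
        },
    },
};
my @rules = qw(__NIGERIAN_BODY_18 __EXAMPLE_RULE_A __EXAMPLE_RULE_B);

# Current shape: three hash lookups for every rule checked.
sub enabled_nested {
    my ($self, @rules) = @_;
    return grep { $self->{conf}->{scores}->{$_} } @rules;
}

# Hoisted shape: the two outer lookups happen once, then one per rule.
sub enabled_hoisted {
    my ($self, @rules) = @_;
    my $lu = $self->{conf}->{scores};
    return grep { $lu->{$_} } @rules;
}

my @nested  = enabled_nested($self, @rules);
my @hoisted = enabled_hoisted($self, @rules);
print "nested:  @nested\n";
print "hoisted: @hoisted\n";
```

Both subs select the same rules; the hoisted one just does less work per
rule. How much that matters in practice would need measuring, e.g. with
the core Benchmark module.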
> A further
> optimization would be to outright eliminate the subroutine call so
> that this would look like:
>
> my $lu=$self->{conf}->{scores};
> if ($lu->{q{__NIGERIAN_BODY_18}}) {
> # don't call a sub at all, as it's faster
> foreach (@_) {
> if (/\bSEVERAL ATTEMPTS HAVE BEEN MADE WITH OUT SUCCESS\b/i) {
> $self->got_pattern_hit (q{__NIGERIAN_BODY_18}, "BODY: ");
> dbg ("Ran body-text regex rule __NIGERIAN_BODY_18 ======> got hit: match='$&'", "rulesrun", 2);
> # Ok, we hit, stop now.
> last;
> }
> }
> }
>
> Since each subroutine gets called once per line per mail the reduction
> in call stack overhead should represent a pretty clear run time
> improvement. I assume this logic is duplicated in the other code
> generators and not just in the one I was trying to debug.
Yep, this is deliberate, although suboptimal. The idea is that, with
one subroutine per rule, slow-running regexps can be identified with
Devel::DProf.

That's a good point, btw. Both of those (the repeated hash lookups and
the per-rule sub call) should be reconsidered; they certainly don't
help the runtime speed.
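
One way to keep the profiling use case while inlining by default might
be a switch in the generator. The following is only a sketch under that
assumption, not the actual SA generator: gen_body_rule, build_runner and
the $profiling flag are hypothetical names, and hits are collected in a
plain array instead of going through got_pattern_hit.

```perl
use strict;
use warnings;

# Hypothetical generator: emits Perl source for one body rule in one of
# two shapes, chosen by $profiling.
sub gen_body_rule {
    my ($name, $pattern, $profiling) = @_;
    my $loop = qq{
        foreach (\@_) {
            if (/$pattern/i) { push \@\$hits, q{$name}; last; }
        }
    };
    if ($profiling) {
        # One named sub per rule, so a profiler can attribute time to it.
        return qq{
            sub ${name}_body_test { my \$hits = shift; $loop }
            ${name}_body_test(\$hits, \@_) if \$lu->{q{$name}};
        };
    }
    # Inlined shape: no per-rule subroutine call.
    return qq{ if (\$lu->{q{$name}}) { $loop } };
}

# Compile all rules into a single closure taking (\%scores, @lines)
# and returning an array ref of the rule names that hit.
sub build_runner {
    my ($rules, $profiling) = @_;
    my $body = join '',
        map { gen_body_rule($_->[0], $_->[1], $profiling) } @$rules;
    my $src  = qq{ sub { my (\$lu, \$hits) = (shift, []); $body return \$hits; } };
    my $code = eval $src or die "generated code failed to compile: $@";
    return $code;
}

my @rules = (
    [ '__NIGERIAN_BODY_18',
      '\bSEVERAL ATTEMPTS HAVE BEEN MADE WITH OUT SUCCESS\b' ],
);
my %scores = ( '__NIGERIAN_BODY_18' => 1 );
my @lines  = ( 'Dear friend,',
               'several attempts have been made with out success' );

my $fast     = build_runner(\@rules, 0);   # inlined, no per-rule subs
my $profiled = build_runner(\@rules, 1);   # one named sub per rule
my $fast_hits     = $fast->(\%scores, @lines);
my $profiled_hits = $profiled->(\%scores, @lines);
print "inlined: @$fast_hits\n";
print "per-sub: @$profiled_hits\n";
```

Both modes report the same hits; only the profiling mode leaves a named
__NIGERIAN_BODY_18_body_test sub around for the profiler to see.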
- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS
iD8DBQFCEuy7MJF5cimLx9ARAnpNAJ0ULfqF2iKKggm7a6I+KeHG6KHJpgCfeW3g
3wcxRN0itRIDoa6750x1ck0=
=+KHT
-----END PGP SIGNATURE-----