-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[de-cc'ing p5p on this one since it's just sa-related.]

demerphq writes:
> Incidentally I did notice that many of the patterns used could be
> reworked to be more efficient assuming the patch is applied. Also I
> was kinda wondering about the code generated. It seems odd that each
> regex rule gets its own subroutine; wouldn't it be better to precompile
> the regexes into a single routine? There appears to be low-hanging
> optimization fruit in the SA code generator stuff.  I.e.:
> 
>       if ($self->{conf}->{scores}->{q{__NIGERIAN_BODY_18}}) {
>         # call procedurally as it is faster.
>         __NIGERIAN_BODY_18_body_test($self,@_);
>       }
> 
> is an example of low hanging fruit. That should read:
> 
> my $lu=$self->{conf}->{scores};
> if ($lu->{q{__NIGERIAN_BODY_18}}) {
>         # call procedurally as it is faster.
>         __NIGERIAN_BODY_18_body_test($self,@_);
> }
> 
> which would save two redundant hash lookups per rule.

That's very true ;)
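
A minimal sketch of that hoisting, just to illustrate the shape. Only the
{conf}{scores} lookup comes from the snippet above; the rule names, the
%tests table and run_body_tests() are made up for the example and are not
the real SA generator output:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the scanner object; only the
# {conf}{scores} layout is taken from the quoted code.
my $scanner = {
    conf => { scores => { RULE_A => 1.5, RULE_B => 0, RULE_C => 0.1 } },
};

# Hypothetical per-rule test subs, keyed by rule name.
my %tests = (
    RULE_A => sub { my ($self, @lines) = @_; scalar grep { /\bfoo\b/i } @lines },
    RULE_B => sub { my ($self, @lines) = @_; scalar grep { /\bbar\b/i } @lines },
    RULE_C => sub { my ($self, @lines) = @_; scalar grep { /\bbaz\b/i } @lines },
);

sub run_body_tests {
    my ($self, @lines) = @_;
    my $scores = $self->{conf}{scores};   # resolved once, not once per rule
    my @hits;
    for my $rule (sort keys %tests) {
        next unless $scores->{$rule};     # rules scored 0 are skipped
        push @hits, $rule if $tests{$rule}->($self, @lines);
    }
    return @hits;
}

my @hits = run_body_tests($scanner, "some FOO text", "a bar line");
print "hits: @hits\n";
```

RULE_B matches the second line but is never run because its score is 0,
which is the same guard the generated code applies.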

> A further
> optimization would be to outright eliminate the subroutine call so
> that this would look like:
> 
> my $lu=$self->{conf}->{scores};
> if ($lu->{q{__NIGERIAN_BODY_18}}) {
>   # don't call a sub at all, as it's faster
>   foreach (@_) {
>      if (/\bSEVERAL ATTEMPTS HAVE BEEN MADE WITH OUT SUCCESS\b/i) {
>         $self->got_pattern_hit (q{__NIGERIAN_BODY_18}, "BODY: ");
>         dbg ("Ran body-text regex rule __NIGERIAN_BODY_18 ======> got hit: match='$&'", "rulesrun", 2);
>         # Ok, we hit, stop now.
>         last;
>      }
>   }
> }
> 
> Since each subroutine gets called once per line per mail the reduction
> in call stack overhead should represent a pretty clear run time
> improvement. I assume this logic is duplicated in the other code
> generators and not just in the one I was trying to debug.

yep, this is deliberate -- although suboptimal.  The idea is that
slow-running regexps can be identified with Devel::DProf, which
reports time per subroutine.

that's a good point btw.  both of those should be reconsidered...
they certainly don't help the runtime speed.
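
One possible way to keep both options, as a rough sketch only: generate
per-rule subs when profiling, and inline the regexps into one routine
otherwise. build_scanner(), the $profile flag and the %rules table are all
hypothetical names for illustration, not anything in the current code:

```perl
use strict;
use warnings;

# Hypothetical rule table; the real rules come from the SA config.
my %rules = (
    RULE_A => qr/\bseveral attempts\b/i,
    RULE_B => qr/\bwith out success\b/i,
);

# Generate one scanning routine. With $profile set, each rule gets its
# own named sub so a sub-level profiler can attribute time to it;
# otherwise the regexp checks are inlined into a single routine.
sub build_scanner {
    my ($profile) = @_;
    my $code = "sub {\n  my (\$hits, \@lines) = \@_;\n";
    for my $name (sort keys %rules) {
        if ($profile) {
            no strict 'refs';
            my $re = $rules{$name};
            *{"main::${name}_body_test"} = sub {
                for (@_) { return 1 if $_ =~ $re }
                return 0;
            };
            $code .= "  push \@\$hits, q{$name} if ${name}_body_test(\@lines);\n";
        } else {
            $code .= "  for (\@lines) { if (\$_ =~ \$rules{q{$name}}) "
                   . "{ push \@\$hits, q{$name}; last } }\n";
        }
    }
    $code .= "}\n";
    my $sub = eval $code or die "compile failed: $@";
    return $sub;
}

my @lines = ("Several attempts have been made", "nothing here");
my (@fast, @slow);
build_scanner(0)->(\@fast, @lines);   # inlined, no per-rule sub calls
build_scanner(1)->(\@slow, @lines);   # one named sub per rule
print "inlined: @fast\nper-rule: @slow\n";
```

Both variants should report the same hits; only the call overhead and what
shows up in the profiler output differ.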

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCEuy7MJF5cimLx9ARAnpNAJ0ULfqF2iKKggm7a6I+KeHG6KHJpgCfeW3g
3wcxRN0itRIDoa6750x1ck0=
=+KHT
-----END PGP SIGNATURE-----
