On Wed, Mar 08, 2006 at 09:36:04PM +0000, Justin Mason wrote:
> > If nothing else, I am for simply changing the way rawbody rules are
> > evaluated... Because the current line by line evaluation is too
> > restrictive, and using a handful of rules and meta'ing them together to
> > match something that wraps across multiple lines is kludgy at best.
> 
> That is definitely a good idea.
> 
> Are there any rawbody rules left anywhere that this would break? I think
> it's likely to be only an improvement.

Hard to say, though I tend to agree.  In our case, there are few
rawbody rules (26), and fewer which aren't evals (18).  There's only one
(HTML_TINY_FONT) with a ".*" that would need some help, and per the
discussion about the HTML*TINY* rules it could either be replaced or
removed without issue.

Just so we're all clear...  It seems like the proposal would be to change
M::SA::Message::get_decoded_body_text_array() such that:

    push(@{$self->{text_decoded}}, split_into_array_of_short_lines($parts[$pt]->decode()));

becomes

    my $text = $parts[$pt]->decode();
    $text =~ tr/ \t\n\r\x0b\xa0/ /s;    # whitespace => space
    push(@{$self->{text_decoded}}, split_into_array_of_short_lines($text));

Yes?
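
To make the effect concrete, here's a rough standalone sketch of what
the whitespace squeeze does to a wrapped chunk of markup.  The helper
name collapse_whitespace() and the sample text are made up for
illustration and aren't part of SpamAssassin; only the tr/// line comes
from the proposal above.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: apply the proposed
# squeeze so any run of whitespace becomes a single space.
sub collapse_whitespace {
    my ($text) = @_;
    $text =~ tr/ \t\n\r\x0b\xa0/ /s;    # whitespace => space
    return $text;
}

# Made-up sample: markup that wraps across lines in the decoded part.
my $decoded = "<font\n   size=\"1\">tiny\r\n text</font>";
my $flat    = collapse_whitespace($decoded);

# A single pattern, with no handful of rules meta'd together.
my $rule = qr/<font size="1">tiny text/;

print "before: ", ($decoded =~ $rule ? "hit" : "miss"), "\n";
print "after:  ", ($flat    =~ $rule ? "hit" : "miss"), "\n";
```

The pattern misses the wrapped original and hits the squeezed text,
which is the whole point of the change.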

> It does introduce the danger of algorithmic complexity attacks
> if .* is used instead of .{0,20} though -- but we may be able to help
> this if we spot that kind of thing in --lint.

<shrug>  I worry more about full rules than rawbody rules in this case,
since the full text is always going to be larger than the rawbody text,
so the potential for problems is greater.  Even with the above code, the
decoded portion is split into chunks of under 1k, whereas full is the
size of the whole message.
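
As for spotting that kind of thing in --lint, a very rough sketch of
the idea might look like the following.  has_unbounded_dot() is a
hypothetical helper, not an existing SpamAssassin function, and it is
deliberately naive: it only strips backslash escapes, so a "." inside a
character class would be flagged by mistake.

```perl
use strict;
use warnings;

# Hypothetical lint helper, for illustration only: return 1 if the
# pattern source contains an unescaped "." followed by "*" or "+".
sub has_unbounded_dot {
    my ($pattern) = @_;
    # Drop backslash-escaped characters first, so "\." is not
    # mistaken for a wildcard.
    (my $stripped = $pattern) =~ s/\\.//gs;
    return $stripped =~ /\.[*+]/ ? 1 : 0;
}

# Made-up rule patterns to check.
for my $pat ('<font.*size', '<font.{0,20}size', 'example\.com') {
    printf "%-20s %s\n", $pat,
        has_unbounded_dot($pat) ? "unbounded" : "ok";
}
```

Only the first pattern gets flagged; the bounded .{0,20} form and the
escaped dot pass.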

-- 
Randomly Generated Tagline:
"Yeah ... You can give pilots guns ... or here's an idea: Why don't you
 make damn sure the airport is secure!?!?"
                                 - Lewis Black, The Daily Show 2002.07.17
