Loren Wilton wrote:

>>  13 FUZZY_OCR              BODY: Mail contains an image with common
>> spam text inside
>>                            Words found:
>> "target" in 1 lines
>> "service" in 1
>>                             lines
>> "stock" in 2 lines
>> "price" in 2 lines
>>                            "company" in 1 lines
>> "recommendation" in 1
>>                            lines
>> (12 word occurrences found)
>>
>> I'm rather disturbed by the +13 score. Surely *no* single test should
>> be able to add *nearly three times* my spam threshold of +5 to the score
>> of a single mail? Is there a way to threshold the thing so that it will
>> cap scores at +4.5 or something more sane?
> 
> The FuzzyOCR score is a cumulation of the various subtests it hits.
> There are a handful of configuration options that can set scores,
> multipliers, and limits for various things.

That is correct. Users should read the configuration file to understand what
can be controlled.

But look at the report: it says "12 word... found", yet it only lists 8 words
(counting repetitions), so it looks like the score is wrong anyway.
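
To make that arithmetic explicit, here is a small Python sketch. The numbers
are copied from the quoted report; the variable names are only for
illustration, and it assumes each "in N lines" figure stands for N
occurrences, which may not be how the plugin counts internally.

```python
# Per-word figures copied from the quoted FUZZY_OCR report.
reported_counts = {
    "target": 1,
    "service": 1,
    "stock": 2,
    "price": 2,
    "company": 1,
    "recommendation": 1,
}

listed_total = sum(reported_counts.values())
claimed_total = 12  # from the "(12 word occurrences found)" line

# Prints "8 4": eight words listed, four short of the claimed total.
print(listed_total, claimed_total - listed_total)
```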

Sorry for taking the topic further away from the original problem with
Util::wrap(), which is the cause of the lost formatting in the report. It
doesn't take the existing newlines into account and so gets its character
count wrong; the end result is the mess you see once it adds line breaks
based on that count.
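
To illustrate the kind of failure I mean, here is a rough Python sketch. It is
not the Util::wrap() code (that is Perl, and I have not copied it here); the
function wrap_sketch and its reset_on_newline flag are made up for this
example. It only shows how a wrapper that keeps counting characters across
embedded newlines ends up breaking lines in the wrong place.

```python
def wrap_sketch(text: str, width: int, reset_on_newline: bool) -> str:
    """Greedy word wrap; hypothetical, for illustration only."""
    out = []
    col = 0  # what the wrapper believes the current column is
    for token in text.split(" "):
        # Only the part before an embedded newline lands on the current line.
        head = token.split("\n")[0] if reset_on_newline else token
        if col and col + 1 + len(head) > width:
            out.append("\n")
            col = 0
        elif out:
            out.append(" ")
            col += 1
        out.append(token)
        if reset_on_newline and "\n" in token:
            # Column restarts after the last newline inside the token.
            col = len(token) - token.rfind("\n") - 1
        else:
            # The naive path counts the newline as just another character.
            col += len(token)
    return "".join(out)


report = 'Words found:\n"stock" in 2 lines\n"price" in 2 lines'

broken = wrap_sketch(report, 30, reset_on_newline=False)
intact = wrap_sketch(report, 30, reset_on_newline=True)
print(broken)
print(intact)
```

With the naive count, "lines" is pushed onto a line of its own even though no
line is anywhere near 30 characters wide, which is much like the quoted
report. With the reset, the text comes back unchanged.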
-- 
René Berber
