https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7445

            Bug ID: 7445
           Summary: The default mbox separator regex is dangerously
                    pedantic
           Product: Spamassassin
           Version: 3.4.1
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Libraries
          Assignee: dev@spamassassin.apache.org
          Reporter: rwmailli...@googlemail.com
  Target Milestone: Undefined

Created attachment 5456
  --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5456&action=edit
patch to allow single spacing

In the user list thread "sa-learn won't read db created via MSTOR", sa-learn
found no emails in an mbox file because the separator looked like this:

From - Sat Jul 8 01:02:28 2017

The important detail is the single space before the 8. The default regex looks
for " .\d ", so a single-digit day has to be padded with either an extra
space or a leading 0.
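
As a rough illustration of that behaviour, here is a small Python sketch. The
pattern is a simplified stand-in, not the actual SpamAssassin default: only the
" .\d " day fragment is taken from this report, and the rest of the pattern and
the names STRICT, SAMPLES and matches are assumptions made for the example.

```python
import re

# Simplified stand-in for an mbox separator pattern (an assumption for this
# example). Only the " .\d " day-of-month fragment comes from the report.
STRICT = re.compile(r"^From \S+ +\S{3} \S{3} .\d \d\d:\d\d:\d\d \d{4}$")

SAMPLES = {
    "single space": "From - Sat Jul 8 01:02:28 2017",
    "extra space": "From - Sat Jul  8 01:02:28 2017",
    "leading zero": "From - Sat Jul 08 01:02:28 2017",
    "two digits": "From - Mon Jul 10 01:02:28 2017",
}


def matches(pattern, line):
    """Return True if the line is recognised as a separator."""
    return pattern.match(line) is not None


for label, line in SAMPLES.items():
    print(f"{label:12} {matches(STRICT, line)}")
```

Under these assumptions only the "single space" sample is rejected, because
" .\d " needs two characters between the spaces and "8" supplies one.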

What's particularly bad is that the failure is not consistent. The OP was lucky
that it happened on the 8th; a couple of days later, on a two-digit day, it
would have appeared to work. The worst case is an mbox with mixed dates, in
which case many of the emails will get concatenated, because a message whose
separator is not recognised is treated as part of the one before it. Changing
"." to ".?" fixes the problem (see the attached patch).
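
The mixed-date case can be sketched in Python as follows. This is not the
SpamAssassin code or the attached patch: both patterns are simplified
stand-ins that differ only in "." versus ".?", and split_mbox, STRICT, RELAXED
and MIXED are hypothetical names invented for the example.

```python
import re

# Simplified stand-ins for the separator pattern (assumptions for this
# example). They differ only in the day fragment: " .\d " versus " .?\d ".
STRICT = re.compile(r"^From \S+ +\S{3} \S{3} .\d \d\d:\d\d:\d\d \d{4}$")
RELAXED = re.compile(r"^From \S+ +\S{3} \S{3} .?\d \d\d:\d\d:\d\d \d{4}$")


def split_mbox(text, separator):
    """Split mbox text into message bodies at lines matching separator."""
    messages = []
    current = None
    for line in text.splitlines():
        if separator.match(line):
            # A recognised separator starts a new message.
            current = []
            messages.append(current)
        elif current is not None:
            # Anything else, including an unrecognised separator line,
            # is appended to the message currently being read.
            current.append(line)
    return ["\n".join(m) for m in messages]


MIXED = "\n".join([
    "From - Mon Jul 10 09:15:00 2017",
    "Subject: first",
    "",
    "From - Sat Jul 8 01:02:28 2017",  # single-spaced single-digit day
    "Subject: second",
    "",
    "From - Tue Jul 11 10:00:00 2017",
    "Subject: third",
    "",
])

print(len(split_mbox(MIXED, STRICT)))
print(len(split_mbox(MIXED, RELAXED)))
```

With the strict stand-in, the second message is folded into the first, so two
messages are reported instead of three. The relaxed stand-in still accepts the
space-padded and zero-padded forms, since ".?" can match a space or a 0.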
