http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5763

           Summary: Problem with invisible context extraction - whitespace
                    chars dropped
           Product: Spamassassin
           Version: unspecified
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P5
         Component: spamassassin
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]


There seems to be a problem with text extracted into the "invisible" context --
whitespace characters are dropped. 

During HTML parsing (HTML.pm), "whitespace" is always treated as visible.
On the other hand, when the HTML text is processed in the display_text() API,
trailing whitespace is trimmed from the previous element when the current
element is whitespace, and leading whitespace is trimmed from the current
element when the previous element is whitespace.

So when invisible text is extracted, it contains no whitespace, because
either the trailing or the leading whitespace has already been trimmed.
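
To illustrate the effect, here is a small Python sketch that models the
behaviour described above. It is not the HTML.pm code: the function names
collect() and invisible_text(), the element layout, and the assumption that
the parser delivers text in separate chunks are all made up for this example.

```python
import re

def collect(segments):
    """Model of the described behaviour (hypothetical, not HTML.pm itself).

    segments: list of (text, invisible) pairs in document order.
    """
    out = []
    for text, invisible in segments:
        whitespace = text.strip() == ""
        if whitespace:
            # Assumption from the report: whitespace is always treated as visible.
            invisible = False
            # Current element is whitespace: trim trailing whitespace
            # from the previous element.
            if out:
                out[-1]["text"] = out[-1]["text"].rstrip()
        else:
            # Collapse runs of whitespace to a single space.
            text = re.sub(r"\s+", " ", text)
            # Previous element is whitespace: trim leading whitespace
            # from the current element.
            if out and out[-1]["whitespace"]:
                text = text.lstrip()
        out.append({"text": text, "invisible": invisible, "whitespace": whitespace})
    return out

def invisible_text(elements):
    """Join only the elements flagged as invisible."""
    return "".join(e["text"] for e in elements if e["invisible"])

# Three chunks that all sit inside an invisible region of the HTML.
elements = collect([("buy ", True), ("\n", True), (" now", True)])
result = invisible_text(elements)  # "buynow": the word separator is lost
```

In this model the separating whitespace ends up only in the visible context,
so the invisible context runs the two words together.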



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
