https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8286
Bug ID: 8286
Summary: TextCat: Ignore invisible text
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Hardware: PC
OS: Windows 10
Status: NEW
Severity: normal
Priority: P2
Component: Plugins
Assignee: dev@spamassassin.apache.org
Reporter: k...@mxguardian.net
Target Milestone: Undefined

Created attachment 5977
  --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5977&action=edit
textcat.diff

I'm running into an issue where messages from Microsoft Teams written in
English are sometimes detected as Slovak, Czech, German
(sk.us-ascii,cs.iso-8859-2,de) or not detected at all (can't determine
language uniquely enough). This is due to a large chunk of base64 data
embedded inside a hidden div:

    <section itemscope itemtype="http://schema.org/SignedAdaptiveCard">
      <meta itemprop="@context" content="http://schema.org/extensions" />
      <meta itemprop="@type" content="SignedAdaptiveCard" />
      <div itemprop="signedAdaptiveCard"
           style="mso-hide:all;display:none;max-height:0px;overflow:hidden;">
        ...Base64Data...
      </div>
    </section>

Although this mostly affects Microsoft Teams, I've seen something similar
from at least one other sender. The attached patch fixes the issue by
ignoring invisible text.

-- 
You are receiving this mail because:
You are the assignee for the bug.
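
For illustration only: the idea of dropping invisible text before language detection can be sketched outside SpamAssassin. The Python below is not the attached textcat.diff (which is not reproduced here) and does not use any SpamAssassin API. The names `visible_text`, `VisibleTextExtractor` and `HIDDEN_STYLE`, and the choice of inline-style patterns treated as "invisible", are assumptions made for this example. It also assumes reasonably well-nested HTML; unclosed tags would confuse the simple stack.

```python
import re
from html.parser import HTMLParser

# Assumption: these inline-style declarations mark an element as invisible.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|mso-hide\s*:\s*all",
    re.IGNORECASE,
)

# HTML void elements never have an end tag, so they must not be pushed.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class VisibleTextExtractor(HTMLParser):
    """Collects text that is not inside an element hidden via inline style."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._stack = []   # one flag per open element: hidden (or inside hidden)?
        self._chunks = []  # visible text fragments

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = self._stack[-1] if self._stack else False
        self._stack.append(parent_hidden or bool(HIDDEN_STYLE.search(style)))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> carry no text content.
        pass

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if not (self._stack and self._stack[-1]):
            self._chunks.append(data)

    def text(self):
        # Join fragments and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def visible_text(html):
    """Return the text of `html` with inline-hidden elements skipped."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = (
        "<p>Hello team</p>"
        '<div style="mso-hide:all;display:none;">QUJDREVGRw==</div>'
        "<p>See you at 3pm</p>"
    )
    print(visible_text(sample))
```

Feeding only the output of a filter like this to the language guesser keeps the base64 blob out of the n-gram statistics, which is the effect the bug report is after. Elements hidden through stylesheets or classes rather than inline `style` attributes are not covered by this sketch.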