https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155
--- Comment #9 from Warren Togami <[email protected]> 2009-08-17 20:46:17 PST ---

http://ruleqa.spamassassin.org/20090817-r804903-n/TVD_SPACE_RATIO/detail
90% FP rate for Japanese

http://ruleqa.spamassassin.org/20090817-r804903-n/PLING_QUERY/detail
52% FP rate for Japanese

http://ruleqa.spamassassin.org/20090817-r804903-n/GAPPY_SUBJECT/detail
44% FP rate for Japanese

All three of these rules do very poorly with Japanese mail, and the total % SPAM is lower than the % FP. Yet the GA scores are rather high since we don't have a statistically significant amount of Japanese mail in the corpus.

What language are the SPAM hits? Perhaps many are examples of identifying foreign languages instead of determining if it is ham or spam?

Bug #6149 is related to this problem. I am attempting to convince Japanese, Chinese and Korean users to join the nightly masscheck, but it is very difficult.
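To illustrate why the GA can still assign high scores to these rules, here is a rough sketch of how a small Japanese ham set gets diluted in the pooled numbers. Only the 90% figure comes from the ruleqa page above. The message counts, the 0.1% FP rate for the rest of the corpus, and the helper name `pooled_fp_rate` are all made up for illustration and are not taken from the actual masscheck corpus.

```python
def pooled_fp_rate(groups):
    """Return the overall ham hit rate for a list of (ham_count, fp_rate) pairs."""
    total_ham = sum(count for count, _ in groups)
    total_hits = sum(count * rate for count, rate in groups)
    return total_hits / total_ham

# Hypothetical corpus sizes (assumptions, not real masscheck counts).
japanese_ham = (200, 0.90)      # 90% FP rate, as reported for TVD_SPACE_RATIO
other_ham = (100_000, 0.001)    # assumed 0.1% FP rate for all other ham

pooled = pooled_fp_rate([japanese_ham, other_ham])
print(f"Japanese-only FP rate: {japanese_ham[1]:.1%}")
print(f"Pooled FP rate:        {pooled:.3%}")
```

With counts like these, the pooled FP rate stays well under 1% even though nearly every Japanese ham message hits the rule, so a score optimizer that only sees the pooled numbers has little reason to lower the score.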
