[ http://issues.apache.org/jira/browse/JAMES-387?page=comments#action_12358507 ]
Vincenzo Gianferrari Pini commented on JAMES-387:
-------------------------------------------------

Bernd is right: buildCorpus() runs in a synchronized block so that it does not interfere with new mails being fed in (in a separate thread), but I forgot to handle synchronization between buildCorpus() and getTokenProbabilityStrengths(). I will refactor buildCorpus() to avoid this dirty double use of corpus.

Moreover, corpus, hamTokenCounts and spamTokenCounts do not seem to be cleared when loading/building an updated corpus from the database.

> Exception in BayesianAnalysis
> -----------------------------
>
>          Key: JAMES-387
>          URL: http://issues.apache.org/jira/browse/JAMES-387
>      Project: James
>         Type: Bug
>   Components: Matchers/Mailets (bundled)
>     Versions: 3.0
>  Environment: James from svn-trunk 2005-08-01.
>               MySQL 4.0
>     Reporter: Stefano Bagnara
>     Assignee: Vincenzo Gianferrari Pini
>     Priority: Minor
>
> Got this exception for every incoming mail:
> 02/08/05 00:39:25 INFO James.Mailet: BayesianAnalysis: Exception: java.lang.Integer
> java.lang.ClassCastException: java.lang.Integer
>       at org.apache.james.util.BayesianAnalyzer.getTokenProbabilityStrengths(BayesianAnalyzer.java:591)
>       at org.apache.james.util.BayesianAnalyzer.computeSpamProbability(BayesianAnalyzer.java:340)
>       at org.apache.james.transport.mailets.BayesianAnalysis.service(BayesianAnalysis.java:289)
>       at org.apache.james.transport.LinearProcessor.service(LinearProcessor.java:407)
>       at org.apache.james.transport.JamesSpoolManager.process(JamesSpoolManager.java:460)
>       at org.apache.james.transport.JamesSpoolManager.run(JamesSpoolManager.java:369)
>       at java.lang.Thread.run(Unknown Source)
> If I clean my spam/ham db the exceptions disappear, but they start again when
> the spam/ham db becomes large.
> My bayesiananalysis_spam contains 200000 rows.
> The following are the spam tokens with the highest "occurrences".
> +---------------------------+-------------+
> | token                     | occurrences |
> +---------------------------+-------------+
> | 3D                        |       82151 |
> | a                         |       59953 |
> | the                       |       45295 |
> | FONT                      |       42771 |
> | Content-Type              |       39058 |
> | to                        |       36626 |
> | com                       |       32902 |
> | http                      |       32886 |
> | of                        |       32504 |
> | font                      |       31803 |
> | and                       |       31577 |
> | Content-Transfer-Encoding |       31576 |
> | p                         |       29746 |
> | text                      |       29482 |
> | in                        |       29418 |
> | it                        |       28498 |
> | br                        |       28037 |
> | DIV                       |       27431 |
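
For illustration only, one way to avoid the race between buildCorpus() and getTokenProbabilityStrengths() described in the comment is to build the new corpus in a fresh map and publish it with a single reference assignment, so a reader never sees a half-built map holding values of the wrong type. This is a minimal sketch, not the actual org.apache.james.util.BayesianAnalyzer code: the class name CorpusSketch, the methods addHam(), addSpam(), clear() and probabilityOf(), the 0.4 default and the 0.01/0.99 clamps are all assumptions made for the example, and it uses generics that were not available to James at the time.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and constants are illustrative, not the James API.
public class CorpusSketch {
    // Published corpus: token -> spam probability.
    // It is replaced wholesale and never mutated after publication.
    private volatile Map<String, Double> corpus = new HashMap<>();

    // Raw counts; only touched while holding the instance lock.
    private final Map<String, Integer> hamTokenCounts = new HashMap<>();
    private final Map<String, Integer> spamTokenCounts = new HashMap<>();
    private int hamMessageCount;
    private int spamMessageCount;

    public synchronized void addHam(String... tokens) {
        hamMessageCount++;
        for (String token : tokens) {
            hamTokenCounts.merge(token, 1, Integer::sum);
        }
    }

    public synchronized void addSpam(String... tokens) {
        spamMessageCount++;
        for (String token : tokens) {
            spamTokenCounts.merge(token, 1, Integer::sum);
        }
    }

    // Reset the counts before reloading them, e.g. from the database.
    public synchronized void clear() {
        hamTokenCounts.clear();
        spamTokenCounts.clear();
        hamMessageCount = 0;
        spamMessageCount = 0;
    }

    // Build into a local map that only ever holds Double values,
    // then publish it with one volatile write.
    public synchronized void buildCorpus() {
        Map<String, Double> fresh = new HashMap<>();
        Set<String> tokens = new HashSet<>(hamTokenCounts.keySet());
        tokens.addAll(spamTokenCounts.keySet());
        for (String token : tokens) {
            int ham = hamTokenCounts.getOrDefault(token, 0);
            int spam = spamTokenCounts.getOrDefault(token, 0);
            double hamRate = Math.min(1.0, (double) ham / Math.max(1, hamMessageCount));
            double spamRate = Math.min(1.0, (double) spam / Math.max(1, spamMessageCount));
            double probability = spamRate / (hamRate + spamRate);
            fresh.put(token, Math.max(0.01, Math.min(0.99, probability)));
        }
        corpus = fresh;
    }

    // Readers take a snapshot of the reference and need no lock.
    public double probabilityOf(String token) {
        Map<String, Double> snapshot = corpus;
        return snapshot.getOrDefault(token, 0.4); // assumed default for unknown tokens
    }
}
```

With this shape, a mail-processing thread calling probabilityOf() while another thread runs buildCorpus() sees either the old corpus or the new one, and calling clear() before a reload keeps stale counts out of the rebuilt corpus.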