[ http://issues.apache.org/jira/browse/JAMES-387?page=comments#action_12322588 ]
Vincenzo Gianferrari Pini commented on JAMES-387:
-------------------------------------------------

I took a careful look at the code and couldn't find anything wrong. I have a spam table with more than 258000 rows and everything works fine for me.

IMHO a possible explanation of Stefano's exceptions is the following: the ham/spam corpus hashmaps may take a lot of memory. Accordingly, I gave a lot of -Xmx memory to the JVM. I remember that some time ago, in a Java (non-James) application, I saw unpredictable JVM behaviour (strange exceptions thrown) when the available heap was only just as large as the needed heap. When I decreased the -Xmx size a little I got OutOfMemoryError, and when I increased it everything was fine.

Stefano, can you try with more memory?

> Exception in BayesianAnalysis
> -----------------------------
>
>          Key: JAMES-387
>          URL: http://issues.apache.org/jira/browse/JAMES-387
>      Project: James
>         Type: Bug
>   Components: Matchers/Mailets (bundled)
>     Versions: 3.0
>  Environment: James from svn-trunk 2005-08-01.
> MySQL 4.0
>     Reporter: Stefano Bagnara
>     Assignee: Vincenzo Gianferrari Pini
>     Priority: Minor
>
> Got this exception for every incoming mail:
> 02/08/05 00:39:25 INFO James.Mailet: BayesianAnalysis: Exception: java.lang.Integer
> java.lang.ClassCastException: java.lang.Integer
>     at org.apache.james.util.BayesianAnalyzer.getTokenProbabilityStrengths(BayesianAnalyzer.java:591)
>     at org.apache.james.util.BayesianAnalyzer.computeSpamProbability(BayesianAnalyzer.java:340)
>     at org.apache.james.transport.mailets.BayesianAnalysis.service(BayesianAnalysis.java:289)
>     at org.apache.james.transport.LinearProcessor.service(LinearProcessor.java:407)
>     at org.apache.james.transport.JamesSpoolManager.process(JamesSpoolManager.java:460)
>     at org.apache.james.transport.JamesSpoolManager.run(JamesSpoolManager.java:369)
>     at java.lang.Thread.run(Unknown Source)
> If I clean my spam/ham db the exceptions disappear, but they start again when
> the spam/ham db becomes large.
> My bayesiananalysis_spam contains 200000 rows.
> The following are the spam tokens with the highest "occurrences".
> +---------------------------+-------------+
> | token                     | occurrences |
> +---------------------------+-------------+
> | 3D                        |       82151 |
> | a                         |       59953 |
> | the                       |       45295 |
> | FONT                      |       42771 |
> | Content-Type              |       39058 |
> | to                        |       36626 |
> | com                       |       32902 |
> | http                      |       32886 |
> | of                        |       32504 |
> | font                      |       31803 |
> | and                       |       31577 |
> | Content-Transfer-Encoding |       31576 |
> | p                         |       29746 |
> | text                      |       29482 |
> | in                        |       29418 |
> | it                        |       28498 |
> | br                        |       28037 |
> | DIV                       |       27431 |
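
One way to check the heap hypothesis from the comment above is to see how close the JVM is to its -Xmx ceiling once the corpus is loaded. The following is only a minimal sketch, not James code: the class name HeapHeadroom and its helper methods are hypothetical, and it uses only the standard java.lang.Runtime calls. In practice the same three values would have to be logged from inside the running James JVM (for example right after the ham/spam corpus is loaded), since a separate process reports only its own heap.

```java
public class HeapHeadroom {

    /** Bytes currently in use: heap allocated so far minus its free part. */
    public static long usedBytes(Runtime rt) {
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Fraction of the heap ceiling that is in use (0.0 to 1.0). */
    public static double usedFraction(long used, long max) {
        return (double) used / (double) max;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the ceiling the JVM will try to use (set by -Xmx);
        // it returns Long.MAX_VALUE when there is no inherent limit.
        long max = rt.maxMemory();
        long used = usedBytes(rt);
        long mb = 1024L * 1024L;
        System.out.println("max heap (MB):  " + (max / mb));
        System.out.println("used heap (MB): " + (used / mb));
        System.out.println("used fraction:  " + usedFraction(used, max));
    }
}
```

If the used fraction stays close to 1.0 after the corpus is loaded, that would be consistent with the "heap just about the needed heap" situation described above, and raising -Xmx would be the obvious next test. It would not by itself prove that the ClassCastException is memory related.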