https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7127

            Bug ID: 7127
           Summary: "bayes_seen" contains whole message parts
           Product: Spamassassin
           Version: 3.4.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Learner
          Assignee: [email protected]
          Reporter: [email protected]

Created attachment 5273
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5273&action=edit
message trained with "sa-learn --spam"

"bayes_seen" grows far too large because it apparently contains whole message
parts at random - see the partial output of "cat bayes_seen" below and the
attached eml file

sometimes those messages seem to be trained again and again

at least that is the only explanation I have for why running sa-learn over a
whole folder results in "Learned tokens from 2 messages"; sadly, with thousands
of eml files there is no hint as to which ones are re-trained
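
One way to narrow this down might be to run sa-learn once per file and check
its summary line. This is only a minimal sketch: it assumes the summary contains
"Learned tokens from N message", that the folder holds *.eml files, and the
helper names learned_count and relearned_files are made up for illustration.
Note that it really does train each file as spam.

```python
import re
import subprocess
from pathlib import Path

# Assumed shape of the sa-learn summary line; only the first number is used.
SUMMARY_RE = re.compile(r"Learned tokens from (\d+) message")


def learned_count(output):
    """Return how many messages the summary reports as learned (0 if absent)."""
    match = SUMMARY_RE.search(output)
    return int(match.group(1)) if match else 0


def relearned_files(folder):
    """Run 'sa-learn --spam' once per .eml file and list the files it learns from."""
    hits = []
    for path in sorted(Path(folder).glob("*.eml")):
        result = subprocess.run(
            ["sa-learn", "--spam", str(path)],
            capture_output=True,
            text=True,
            check=False,
        )
        if learned_count(result.stdout) > 0:
            hits.append(str(path))
    return hits
```

On a folder that was already trained, any file returned by relearned_files
would be a candidate for a message that was not recorded as seen.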

"bayes_seen" is 2.4 MB, versus 40 MB for the whole DB containing 16000 mails

-rw------- 1 sa-milt sa-milt  29K 2015-01-30 12:49 bayes_journal
-rw------- 1 sa-milt sa-milt 2,4M 2015-01-30 12:40 bayes_seen
-rw------- 1 sa-milt sa-milt  40M 2015-01-30 12:43 bayes_toks
-rw------- 1 sa-milt sa-milt   98 2014-08-21 17:47 user_prefs
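
To get a rough idea of how many entries are affected, the raw file could be
scanned for unusually long byte runs between message-id markers. This is a
heuristic sketch only, not a database parser: it assumes, based on the dump
below, that each normal entry ends in "@sa_generated" and is followed by a
one-byte flag. The function name oversized_chunks and the 64-byte limit are
arbitrary choices for illustration.

```python
MARKER = b"@sa_generated"


def oversized_chunks(raw, limit=64):
    """Return (offset, length) for byte runs before a marker that exceed limit."""
    found = []
    start = 0
    while True:
        end = raw.find(MARKER, start)
        if end == -1:
            # Bytes after the last marker are ignored.
            break
        length = end - start
        if length > limit:
            found.append((start, length))
        start = end + len(MARKER)
    return found
```

It would be fed the file contents, for example
oversized_chunks(open("bayes_seen", "rb").read()). The first hit may simply be
the database file header rather than message content, so the offsets need a
manual look.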

________________________________________

9980e02@sa_generatedh1452f6b0724547a62010cbeba8b75bd2984ab31e@sa_generateds1366bb7461fce988f143fa5dab961bca5e30cd2c@sa_generatedh10fee2f9391af62ac10977c0d5f003d8e484d12f@sa_generatedh08da39c4232eb29f2ae8ed6d6756��:14pt">w</span>&Euml;&#162;7<span
style=3D"color:#594760; font-size:14pt"> </span>4Dw<span
style=3D"color:#594760; font-size:14pt">t</span>DN&#187;<span
style=3D"color:#594760; font-size:14pt">o</span>&#217;&#176;x<span
style=3D"color:#594760; font-size:14pt"> </span>Z&#212;b<span
style=3D"color:#594760; font-size:14pt">u</span>8Ke<span
style=3D"color:#594760; font-size:14pt">s</span>&quot;&Oslash;&prime;<span
style=3D"color:#594760; font-size:14pt">e</span>&#164;&#203;L<span
style=3D"color:#594760; font-size:14pt"> </span>&#164;P&#181;<span
style=3D"color:#594760; font-size:14pt">t</span>N&curren;n<span
style=3D"color:#594760; font-size:14pt">h</span>&#232;6m<span
style=3D"color:#594760; font-size:14pt">e</span>k&#195;&Agrave;<span
style=3D"color:#594760; font-size:14pt">m</span>6&#194;C<span
style=3D"color:#594760; font-size:14pt"> </span>U0&#198;<span
style=3D"color:#594760; font-size:14pt">:</span>aW&#188;<span
style=3D"color:#594760; font-size:14pt">)</span>When emma shook his tired. Stop
the air was seated himself.
<br>mt&ordm;Since she reached out and two long
<br><br>U5&sigma;Day but have any longer before
<br><br><br>&#220;06<b><span style=3D"color:#594760;
font-size:18pt">&#262;</span>&#251;2l<span style=3D"color:#594760;
font-size:18pt">l</span>H&#206;0<span style=3D"color:#594760;
font-size:18pt">i</span>HP6<span style=3D"color:#594760;
font-size:18pt">c</span>&#227;&sbquo;&#169;<span style=3D"color:#594760;
font-size:18pt">k</span>&#200;&#161;B<span style=3D"color:#594760;
font-size:18pt"> </span>&omicron;F&Prime;<span style=3D"color:#594760;
font-size:18pt">b</span>Eh0<span style=3D"color:#594760;
font-size:18pt">e</span>UV&forall;<span style=3D"color:#594760;
font-size:18pt">l</span>&sup2;Z&xi;<span style=3D"color:#594760;
font-size:18pt">l</span>fyf<span style=3D"color:#594760;
font-size:18pt">o</span>&yacute;&thorn;&#164;<span style=3D"color:#594760;
font-size:18pt">w</span>XQl<span style=3D"color:#594760; font-size:18pt">
</span>&#173;&#184;2<span style=3D"color:#594760;
font-size:18pt">t</span>P1q<span style=3D"color:#594760;
font-size:18pt">o</span>qt0<span style=3D"color:#594760; font-size:18pt">
</span>8&#212;1<span style=3D"color:#594760;
font-size:18pt">v</span>e&#195;&Eta;<span style=3D"color:#594760;
font-size:18pt">i</span>UiQ<span style=3D"color:#594760;
font-size:18pt">e</span>&prime;21<span style=3D"color:#594760;
font-size:18pt">w</span>lV&#230;<span style=3D"color:#594760; font-size:18pt">
</span>4MH<span style=3D"color:#594760;
font-size:18pt">m</span>&oelig;&#186;L<span style=3D"color:#594760;
font-size:18pt">y</span>3&#196;&#230;<span style=3D"color:#594760;
font-size:18pt"> </span>&#224;18<span style=3D"color:#594760;
font-size:18pt">(</span>&#230;NA<span style=3D"color:#594760;
font-size:18pt">5</span>S4&#227;<span style=3D"color:#594760;
font-size:18pt">)</span>yH&epsilon;<span style=3D"color:#594760;
font-size:18pt"> </span>A&Ecirc;e<span style=3D"color:#594760;
font-size:18pt">p</span>&#237;&Psi;&perp;<span style=3D"color:#594760;
font-size:18pt">r</span>&#203;&Euml;g<span style=3D"color:#594760;
font-size:18pt">i</span>H&#248;y<span style=3D"color:#594760;
font-size:18pt">v</span>&#255;v&#207;<span style=3D"color:#594760;
font-size:18pt">a</span>&frac34;&THORN;&sum;<span style=3D"color:#594760;
font-size:18pt">t</span>RZ&#250;<span style=3D"color:#594760;
font-size:18pt">e</span>kU&ang;<span style=3D"color:#594760; font-size:18pt">
</span>&upsilon;h&sup1;<span style=3D"color:#594760;
font-size:18pt">p</span>9&weierp;&#183;<span style=3D"color:#594760;
font-size:18pt">h</span>L&iacute;&deg;<span style=3D"color:#594760;
font-size:18pt">o</span>&alefsym;&#182;H<span style=3D"color:#594760;
font-size:18pt">t</span>w1k<span style=3D"color:#594760;
font-size:18pt">o</span>&Ouml;M&oslash;<span style=3D"color:#594760;
font-size:18pt">s</span>&Omicron;&sup1;3<span style=3D"color:#594760;
font-size:18pt">:</span></b>Bear coat emma looked up with. Camp and let go with
his horse
����ZX" ����zxB@n
style=3D"s7474b5cda7924a0b5383120b662ef00a1389d830@sa_generatedY��
�
 �
  �
   �

-- 
You are receiving this mail because:
You are the assignee for the bug.
