https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7127
Bug ID: 7127
Summary: "bayes_seen" contains whole message parts
Product: Spamassassin
Version: 3.4.0
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Learner
Assignee: [email protected]
Reporter: [email protected]
Created attachment 5273
--> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5273&action=edit
message trained with "sa-learn --spam"
"bayes_seen" grows far too large because, seemingly at random, it contains
whole message parts. See the partial output of "cat bayes_seen" below and the
attached eml file.
Sometimes these messages appear to be trained again and again. At least that
is the only explanation I have for why running sa-learn over a whole folder
results in "Learned tokens from 2 messages". Sadly, with thousands of
eml files, there is no hint as to which ones are re-trained.
"bayes_seen" is 2.4 MB, versus 40 MB for the whole DB containing 16000 mails:
-rw------- 1 sa-milt sa-milt 29K 2015-01-30 12:49 bayes_journal
-rw------- 1 sa-milt sa-milt 2,4M 2015-01-30 12:40 bayes_seen
-rw------- 1 sa-milt sa-milt 40M 2015-01-30 12:43 bayes_toks
-rw------- 1 sa-milt sa-milt 98 2014-08-21 17:47 user_prefs
________________________________________
9980e02@sa_generatedh1452f6b0724547a62010cbeba8b75bd2984ab31e@sa_generateds1366bb7461fce988f143fa5dab961bca5e30cd2c@sa_generatedh10fee2f9391af62ac10977c0d5f003d8e484d12f@sa_generatedh08da39c4232eb29f2ae8ed6d6756��:14pt">w</span>ˢ7<span
style=3D"color:#594760; font-size:14pt"> </span>4Dw<span
style=3D"color:#594760; font-size:14pt">t</span>DN»<span
style=3D"color:#594760; font-size:14pt">o</span>Ù°x<span
style=3D"color:#594760; font-size:14pt"> </span>ZÔb<span
style=3D"color:#594760; font-size:14pt">u</span>8Ke<span
style=3D"color:#594760; font-size:14pt">s</span>"Ø′<span
style=3D"color:#594760; font-size:14pt">e</span>¤ËL<span
style=3D"color:#594760; font-size:14pt"> </span>¤Pµ<span
style=3D"color:#594760; font-size:14pt">t</span>N¤n<span
style=3D"color:#594760; font-size:14pt">h</span>è6m<span
style=3D"color:#594760; font-size:14pt">e</span>kÃÀ<span
style=3D"color:#594760; font-size:14pt">m</span>6ÂC<span
style=3D"color:#594760; font-size:14pt"> </span>U0Æ<span
style=3D"color:#594760; font-size:14pt">:</span>aW¼<span
style=3D"color:#594760; font-size:14pt">)</span>When emma shook his tired. Stop
the air was seated himself.
<br>mtºSince she reached out and two long
<br><br>U5σDay but have any longer before
<br><br><br>Ü06<b><span style=3D"color:#594760;
font-size:18pt">Ć</span>û2l<span style=3D"color:#594760;
font-size:18pt">l</span>HÎ0<span style=3D"color:#594760;
font-size:18pt">i</span>HP6<span style=3D"color:#594760;
font-size:18pt">c</span>ã‚©<span style=3D"color:#594760;
font-size:18pt">k</span>ȡB<span style=3D"color:#594760;
font-size:18pt"> </span>οF″<span style=3D"color:#594760;
font-size:18pt">b</span>Eh0<span style=3D"color:#594760;
font-size:18pt">e</span>UV∀<span style=3D"color:#594760;
font-size:18pt">l</span>²Zξ<span style=3D"color:#594760;
font-size:18pt">l</span>fyf<span style=3D"color:#594760;
font-size:18pt">o</span>ýþ¤<span style=3D"color:#594760;
font-size:18pt">w</span>XQl<span style=3D"color:#594760; font-size:18pt">
</span>­¸2<span style=3D"color:#594760;
font-size:18pt">t</span>P1q<span style=3D"color:#594760;
font-size:18pt">o</span>qt0<span style=3D"color:#594760; font-size:18pt">
</span>8Ô1<span style=3D"color:#594760;
font-size:18pt">v</span>eÃΗ<span style=3D"color:#594760;
font-size:18pt">i</span>UiQ<span style=3D"color:#594760;
font-size:18pt">e</span>′21<span style=3D"color:#594760;
font-size:18pt">w</span>lVæ<span style=3D"color:#594760; font-size:18pt">
</span>4MH<span style=3D"color:#594760;
font-size:18pt">m</span>ϼL<span style=3D"color:#594760;
font-size:18pt">y</span>3Äæ<span style=3D"color:#594760;
font-size:18pt"> </span>à18<span style=3D"color:#594760;
font-size:18pt">(</span>æNA<span style=3D"color:#594760;
font-size:18pt">5</span>S4ã<span style=3D"color:#594760;
font-size:18pt">)</span>yHε<span style=3D"color:#594760;
font-size:18pt"> </span>AÊe<span style=3D"color:#594760;
font-size:18pt">p</span>íΨ⊥<span style=3D"color:#594760;
font-size:18pt">r</span>ËËg<span style=3D"color:#594760;
font-size:18pt">i</span>Høy<span style=3D"color:#594760;
font-size:18pt">v</span>ÿvÏ<span style=3D"color:#594760;
font-size:18pt">a</span>¾Þ∑<span style=3D"color:#594760;
font-size:18pt">t</span>RZú<span style=3D"color:#594760;
font-size:18pt">e</span>kU∠<span style=3D"color:#594760; font-size:18pt">
</span>υh¹<span style=3D"color:#594760;
font-size:18pt">p</span>9℘·<span style=3D"color:#594760;
font-size:18pt">h</span>Lí°<span style=3D"color:#594760;
font-size:18pt">o</span>ℵ¶H<span style=3D"color:#594760;
font-size:18pt">t</span>w1k<span style=3D"color:#594760;
font-size:18pt">o</span>ÖMø<span style=3D"color:#594760;
font-size:18pt">s</span>Ο¹3<span style=3D"color:#594760;
font-size:18pt">:</span></b>Bear coat emma looked up with. Camp and let go with
his horse
����ZX" ����zxB@n
style=3D"s7474b5cda7924a0b5383120b662ef00a1389d830@sa_generatedY��
�
�
�
�
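
To locate such oversized entries without reading the dump by eye, something like the following Python sketch might help. It rests on assumptions taken only from the dump above: that a normal entry is a 40-character hex digest followed by "@sa_generated" and a one-character flag ("h" or "s"), so the bytes between two markers should be about 41 long. It scans the raw file bytes and does not parse the real DB format, so DB page headers and padding may show up as false positives. The function name and the 64-byte threshold are made up for illustration.

```python
MARKER = b"@sa_generated"


def find_oversized_chunks(raw: bytes, max_len: int = 64):
    """Return (offset, length) for each run of bytes between two
    markers that is longer than max_len.

    Assumption (from the dump in this report): a normal run is a
    one-character flag plus a 40-character hex digest, i.e. 41 bytes.
    Anything much longer probably holds message content.
    """
    results = []
    pos = 0
    for chunk in raw.split(MARKER):
        if len(chunk) > max_len:
            results.append((pos, len(chunk)))
        # Advance past this chunk and the marker that followed it.
        pos += len(chunk) + len(MARKER)
    return results
```

It could be called on the file contents read in binary mode, for example find_oversized_chunks(open(path, "rb").read()), where path points at a copy of bayes_seen. Each reported offset can then be inspected with a hex viewer to see which message fragment ended up in the key.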