Re: [Dspam-user] Preserving whitelisting information during change of tokenizers

Elias Oltmanns Tue, 05 Apr 2011 07:27:00 -0700

Kenneth Marshall <k...@rice.edu> wrote:
> On Mon, Apr 04, 2011 at 11:46:51PM +0200, Stevan Bajić wrote:
>> On Sun, 03 Apr 2011 17:31:44 +0200 Elias Oltmanns <e...@nebensachen.de> wrote:
>> 
>> > Hi there,
>> > 
>> Hallo Elias,
>> 
>> 
>> > switching from CHAIN to OSB tokenizer, I understand, makes the old
>> > tokens mostly useless or even harmful since OSB might achieve better
>> > accuracy when starting from scratch. I wouldn't mind losing most of the
>> > tokens if it wasn't for the automatic whitelisting information. So, my
>> > question is whether there might be any practical way to keep the
>> > whitelist information during a transition from CHAIN to OSB (or any
>> > other combination of tokenizers for that matter).
>> > 
>> > Thanks in advance for any advice you can give me,
>> > 
[...]
> As far as keeping the old whitelisting tokens, if
> you have archives of good mail, it should be possible to calculate
> the whitelist token hash manually and make a list of the tokens
> to save in the DB.


Yes, I've started thinking along those lines too. However, I don't seem
to be able to *guess* how these tokens are assembled. The
documentation explicitly states that the whole From: header is used
for the whitelist feature. Yet

$ dspam_dump userid "From*Elias+Oltmanns+<e...@nebensachen.de>"

produces no hits. Does any of you know off the top of your head what
the correct query should look like? I can look in the sources myself
once I've got some more spare time on my hands. Then again, I'm not too
sure anymore whether it is really worth bothering with those whitelist
tokens.
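
For the archive-based approach suggested above, a first step could be to collect the distinct From: headers from a mailbox of good mail and turn each into a candidate query string. The following is only a rough sketch under stated assumptions: the "From*" prefix and the "+" joiner simply mirror the query shown above, which produced no hits, so the format is unverified and may not match what DSPAM stores. The function names are hypothetical, and no hashing is attempted here.

```python
import mailbox


def candidate_whitelist_strings(messages, joiner="+"):
    """Return sorted, unique 'From*<value>' strings for the given messages.

    ASSUMPTION: the 'From*' prefix and the '+' joiner mirror the query tried
    in this thread; they are not confirmed to match DSPAM's internal format.
    """
    candidates = set()
    for msg in messages:
        raw = msg.get("From")
        if raw is None:
            continue  # no From: header, nothing to whitelist
        # Collapse folded or repeated whitespace, then join the words.
        words = str(raw).split()
        candidates.add("From*" + joiner.join(words))
    return sorted(candidates)


def print_candidates(mbox_path):
    """Print one candidate string per distinct From: header in an mbox file."""
    for candidate in candidate_whitelist_strings(mailbox.mbox(mbox_path)):
        print(candidate)
```

Calling print_candidates() on an mbox archive gives one string per sender, and each can be tried with dspam_dump as above. Passing a different joiner (a single space, for instance) is an easy way to test other guesses at the format.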

Regards,

Elias


