-------- Original Message --------
> Date: Mon, 10 Sep 2007 08:43:15 +0200
> From: "Bartlomiej Moczulski" <[EMAIL PROTECTED]>
> To: [email protected]
> Subject: [dspam-users] what's in dspam_signature_data?

> hi everyone,
> short question to those, who mastered dspam's source: what's exactly
> stored in dspam_signature_data?

Just the DSPAM signature is saved there. Nothing more.

> Or rather: is it (in theory) possible
> to recover a full message from data stored in that table?

In that table? No. In theory, though, it would be possible to restore the words of a message from the dspam_token_data table. The process would take a lot of time, and it would only recover single words, chained words, or adjacent words, not the message in its original form.

The recovery would need to reverse the work of the tokenizer. For example, a simple unigram tokenizer would produce the following data for the word hello:

  TOKEN: 'hello'
  CRC: 6035123792567599104

I don't think it is possible to simply calculate 'hello' back from 6035123792567599104. The only way would be brute force: you would have to run candidate words through the tokenizer until one of them produces 6035123792567599104, and then you would know which word that value stands for. That is, to my knowledge, the only way to get the word back from a token.

Even if you had a fast way to get the words back from the tokens, you would still only get small pieces of the original mail, with little indication of where in the mail those words or pieces of words were used.

Another problem is that DSPAM pre-processes/filters the mail for strange characters. I think the word 'h.e..l-l-o' would be processed by DSPAM into 'hello' and would produce the token mentioned above.

> I'm preparing a privacy policy for my users and I'm not perfectly sure
> if I can write "no copies of clean messages are stored in antispam
> system".

That is right if you use the SQL engine. If you use SBPH then the original message is saved by DSPAM for later training/reclassification.

> --
> thanks,
> bm.
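
P.S. To make the brute-force point above more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: toy_token_hash is a stand-in (the first 8 bytes of SHA-256), not the CRC function DSPAM really uses, and the word list and "stored" values are made up. It only shows the general idea: hash every candidate word, then look the stored values up.

```python
import hashlib


def toy_token_hash(word: str) -> int:
    """Stand-in 64-bit hash. NOT DSPAM's actual CRC function."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_reverse_index(candidates):
    """Hash every candidate word once so stored values can be looked up."""
    return {toy_token_hash(word): word for word in candidates}


def recover_words(stored_hashes, candidates):
    """Return the matching candidate for each stored hash, or None if none matches."""
    index = build_reverse_index(candidates)
    return [index.get(value) for value in stored_hashes]


# Hypothetical data: hash values as a token table might hold them.
# Sorting by hash value throws away the original word order on purpose.
wordlist = ["hello", "world", "viagra", "meeting", "invoice"]
stored = sorted(toy_token_hash(w) for w in ["meeting", "hello", "tomorrow"])

recovered = recover_words(stored, wordlist)
print(recovered)
```

Words that are missing from the candidate list ("tomorrow" here) stay unrecovered, and the ones that are found come back without any information about their position in the mail, which is the limitation described above.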
