Rob wrote:

> Yes, but in this case that is not a simple task.
> The user says "this is an e-mail I have seen before, it should not
> be detected as a scam".
> But what properties of the mail do you want to store in a whitelist?
> 
> Certainly not the sender address, as it can very easily be spoofed.
> Anyone sending a genuine-looking phishing mail will try to use the
> usual sender address of the company they want to phish data for.
> So, whitelisting on sender address would be an extremely bad idea!
> 
> You can store a hash of the message to whitelist it, but I bet that
> the messages the user is talking about are not "the same".  They
> are messages from the same company that have the same general layout,
> but their content is not the same.
> 
> So what would the software have to store and match to identify "the same"
> messages that it should not classify as scam the next time?
> 
> It will not be easy...

Agreed, but that is exactly why the science of heuristic analysis
is so well developed.  Simply saying "If I receive a second identical
copy of this e-mail, please do not treat it as a scam" is totally
inadequate -- what we need is a feature whereby each time we mark
an e-mail as /not/ a scam, it is compared with all similar messages
that have been so marked, and the scam-detection heuristics adjusted
accordingly.  There is no fundamental difference between the approach
currently provided for junk mail training and the requested feature
for scam mail training.
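
To make the training idea concrete, here is a minimal sketch in Python of
how "mark as not a scam" feedback could adjust the scoring of later,
similar messages. It is a toy naive Bayes token model, not the code
SeaMonkey actually uses for junk or scam detection; the class name
TrainableScamFilter, the labels "scam" and "ok", and the sample messages
are all invented for illustration.

```python
import math
import re
from collections import Counter


class TrainableScamFilter:
    """Toy naive Bayes filter trained from user feedback (hypothetical)."""

    def __init__(self):
        # Per-label count of messages containing each token.
        self.tokens = {"scam": Counter(), "ok": Counter()}
        # Number of messages the user has marked with each label.
        self.messages = {"scam": 0, "ok": 0}

    @staticmethod
    def tokenize(text):
        # Distinct lower-cased words; a real filter would also look at
        # headers, links and markup.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def mark(self, text, label):
        """Record the user's verdict ("scam" or "ok") for one message."""
        self.messages[label] += 1
        self.tokens[label].update(self.tokenize(text))

    def scam_probability(self, text):
        """Estimate how likely the message is a scam, given past marks."""
        n_scam, n_ok = self.messages["scam"], self.messages["ok"]
        # Prior odds, with add-one smoothing so nothing divides by zero.
        log_odds = math.log((n_scam + 1) / (n_ok + 1))
        for tok in self.tokenize(text):
            p_scam = (self.tokens["scam"][tok] + 1) / (n_scam + 2)
            p_ok = (self.tokens["ok"][tok] + 1) / (n_ok + 2)
            log_odds += math.log(p_scam / p_ok)
        # Clamp before converting to a probability to avoid overflow.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    f = TrainableScamFilter()
    f.mark("Verify your account password now or it will be suspended", "scam")
    f.mark("Your monthly statement from Example Bank is ready", "ok")
    f.mark("Your statement from Example Bank for March is ready", "ok")
    # Not identical to anything marked so far, but similar in content.
    print(f.scam_probability("Your April statement from Example Bank is ready"))
```

The point of the sketch is that nothing is whitelisted by sender or by
message hash: each "not a scam" mark shifts the token statistics, so a
later message with the same general layout but different content scores
lower without being an exact match.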

Philip Taylor
_______________________________________________
support-seamonkey mailing list
[email protected]
https://lists.mozilla.org/listinfo/support-seamonkey
