It may come down to my understanding of Bayes and its tokens. I'm also
having a bit of a problem explaining this concept on paper...

I see this as adding an extra layer to Bayes:

Consider the following 2 basic emails:

Mail 1:

Mail 2:

With Bayes:

Mail 1:
<token 1>

Mail 2:
<token 2>

With Concepts & Bayes:

Mail 1:
<token 1>

Mail 2:
<token 2>


So without Concepts:

Mail 1 comes into the platform, is tokenized (token1) and is classified
and learnt as spam.
Mail 2 comes into the platform, is tokenized (token2) and has no common
tokens with Mail 1 - so no association is made.

With Concepts:

Mail 1 comes into the platform, is tokenized (token1 & meds) and is
classified and learnt as spam.
Mail 2 comes into the platform, is tokenized (token2 & meds) and shares
the "meds" token already learnt from Mail 1.
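
To make the two cases above concrete, here is a rough Python sketch. The
mail bodies, the word list and the "concept:" prefix are all made up for
illustration; this is not the plugin's code and not how SpamAssassin's
Bayes actually tokenizes, just the overlap idea in miniature.

```python
# Hypothetical concept map: concept name -> words that trigger it.
CONCEPTS = {"meds": {"viagra", "cialis", "pills", "pharmacy"}}


def tokenize(body):
    """Plain tokenization: lowercase words only (a stand-in for Bayes tokens)."""
    return set(body.lower().split())


def tokenize_with_concepts(body):
    """Plain tokens plus one extra token per concept found in the mail."""
    tokens = tokenize(body)
    for concept, words in CONCEPTS.items():
        if tokens & words:
            tokens.add("concept:" + concept)
    return tokens


# Invented example mails with no words in common.
mail1 = "cheap viagra online"
mail2 = "discount cialis today"

# Without concepts: nothing links the two mails.
plain_overlap = tokenize(mail1) & tokenize(mail2)

# With concepts: both mails carry the same "meds" concept token.
concept_overlap = tokenize_with_concepts(mail1) & tokenize_with_concepts(mail2)

print(plain_overlap)
print(concept_overlap)
```

In this toy case the plain token sets do not intersect, while the
concept-augmented sets share the single "concept:meds" token, which is
the association I am hoping Bayes can learn from.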

Does this make sense - am I right in my assumptions?


On 25/05/16 09:02, Merijn van den Kroonenberg wrote:
With David's help I have tracked down the problem(s). Version 0.02 is
up. Would be interested to hear your thoughts - even if just theoretical
about the effect on the Bayes DB.

Just in theory, I am curious what part of the Bayes filter you hope to
improve? I think you are not adding any *new* information to the e-mail,
your concepts are based purely on the mail content, right?

It seems you just overpower some tokens a bit more but I am not sure if
your concepts are useful for a bayes filter. Especially a bayes filter
would not need this I would say. Maybe the concepts would be useful to
humans or rules written by humans.

Paul Stead
Systems Engineer
Zen Internet
