On Thu, 3 Oct 2002, Orlando Andico wrote:

> On Thu, 3 Oct 2002, Rowel Atienza wrote:
> ..
> > Calculate:
> >     Using simple set theory: The intersection of Sets A to H would
> > be around 2% .
> >     Using simple Bayesian probability theory: The confidence value of
> > the above estimate is roughly 80% .
>
> ok.. how is it possible to get the cumulative probability? just multiply
> the individual probabilities together?

        Yes, that's right. To ease the computation, you can treat Sets A
to H as independent events. In that case, the probability of getting the
right person (assuming knowledge of SMSC is not needed) is 0.00025 .
That's really small because of the number of requirements :) . But
let's not be cruel to the advertiser. Take into account the error due to
the dependency of events, etc., multiply the chance by a factor of 100,
and you get around 2% chance of getting the right person in this mailing
list.
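A minimal sketch of the arithmetic in Python, in case it helps. The per-set probabilities below are hypothetical placeholders (the real figures are not in this message); only 0.00025 and the factor of 100 come from the estimate above. Scaling 0.00025 by 100 gives 2.5%, which is the "around 2%" figure.

```python
from math import prod  # Python 3.8+


def joint_probability(probabilities):
    """Probability that all events occur, assuming they are independent."""
    return prod(probabilities)


# Hypothetical per-requirement probabilities for Sets A to H.
# These are illustration values only, not the ones behind 0.00025.
hypothetical = [0.5, 0.4, 0.5, 0.3, 0.5, 0.4, 0.3, 0.35]
naive = joint_probability(hypothetical)

# The rough correction for dependent events: scale the independent
# estimate quoted in the text by a factor of 100.
quoted_estimate = 0.00025
adjusted = quoted_estimate * 100

print(f"product of hypothetical values: {naive:.6f}")
print(f"adjusted estimate: {adjusted:.1%}")
```

The more requirements you multiply in, the faster the product shrinks, which is why the uncorrected figure is so small.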

> and what's this Bayesian thing? i'm
> trying to wrap my mind around "A Plan for Spam" (Bayesian probability for
> spam tagging) but i've always been poor in math..  :P
>

Bayes Rule: P(B/A) = P(A/B)P(B)/P(A)

          : posterior = likelihood*prior/evidence

The probability of drawing the conclusion (getting the right person) given
the observed events (reading emails in this list and noticing who knows
what) is equal to the probability of observing the events given the
conclusion, multiplied by the probability of the conclusion (I assume
100%), divided by P(A), which is a normalizing factor; let us set it to 1.
What is P(A/B)? Due to several factors like time constraints (we are
always busy, I know), limited access to email and/or web, limited access
to books, being out of town, mood swings, etc., chances are that 80% of
the time a knowledgeable person will reply with a correct answer to a
certain question in this mailing list.
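Plugging the numbers above into Bayes Rule can be sketched in a few lines of Python. The function and parameter names are my own; the values (0.8, 1.0 and 1.0) are the ones assumed in the paragraph above.

```python
def bayes_posterior(p_a_given_b, p_b, p_a):
    """Bayes Rule: P(B/A) = P(A/B) * P(B) / P(A)."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive")
    return p_a_given_b * p_b / p_a


# P(A/B) = 0.8: a knowledgeable person answers correctly 80% of the time
# P(B)   = 1.0: the conclusion is assumed certain a priori
# P(A)   = 1.0: the normalizing factor, set to 1 as in the text
confidence = bayes_posterior(p_a_given_b=0.8, p_b=1.0, p_a=1.0)
print(confidence)  # 0.8
```

With the prior and the normalizing factor both set to 1, the result is just P(A/B), which is where the 80% confidence value comes from.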

Seriously, if you want to fight spam and dynamic firewall rules are not
enough, you might as well build a model to determine the probability that
a certain email is spam. If you want to go one step further, you might as
well build a model to detect anomalies in the flow of your email and
trigger an alarm when something is going wrong with your mail server. I
have seen articles on detecting intrusions based on anomalies in packet
statistics. It is something like detecting credit card misuse.
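To make the spam model idea concrete, here is a rough sketch of a generic naive Bayes word classifier in Python. It is not the exact scheme from "A Plan for Spam"; it treats words as independent, uses add-one smoothing, and the training messages are made up for illustration. It also assumes the training set contains at least one spam and one non-spam message.

```python
import math
from collections import Counter


def train(messages):
    """Count word occurrences per class from (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals


def spam_probability(text, counts, totals):
    """P(spam / words), treating the words as independent events."""
    vocab_size = len(set(counts[True]) | set(counts[False]))
    n_messages = totals[True] + totals[False]
    log_score = {}
    for label in (True, False):
        # Prior: fraction of training messages carrying this label.
        score = math.log(totals[label] / n_messages)
        n_words = sum(counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero the product.
            score += math.log((counts[label][word] + 1) / (n_words + vocab_size))
        log_score[label] = score
    # Normalize the two scores; subtracting the max avoids underflow.
    top = max(log_score.values())
    spam = math.exp(log_score[True] - top)
    ham = math.exp(log_score[False] - top)
    return spam / (spam + ham)


# Made-up training data, for illustration only.
training = [
    ("cheap pills buy now", True),
    ("win money now", True),
    ("kernel patch for the firewall", False),
    ("meeting about the mail server", False),
]
counts, totals = train(training)
print(spam_probability("buy cheap pills", counts, totals))
print(spam_probability("firewall kernel patch", counts, totals))
```

The division at the end plays the role of P(A) in Bayes Rule: it normalizes the two scores so they sum to 1. A real filter would need far more training mail and a better tokenizer than split().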


> spam tagging) but i've always been poor in math..  :P

        I thought I was good in math too until I was asked to compute the
partial derivative of a multi-variable matrix of p x q dimension raised to
the nth power.
        There are three types of people in this world:
        1) A good mathematician having a hard time finding a job
        2) A good computer programmer who is always happy because there are
plenty of jobs around and the pay is high
        3) A good computer programmer and mathematician who rules this
world (reminds me of Bruce Schneier).


rowel

_
Philippine Linux Users Group. Web site and archives at http://plug.linux.org.ph
To leave: send "unsubscribe" in the body to [EMAIL PROTECTED]

Fully Searchable Archives With Friendly Web Interface at http://marc.free.net.ph

To subscribe to the Linux Newbies' List: send "subscribe" in the body to 
[EMAIL PROTECTED]
