Hi Brian,

If you get an up-to-date copy of biojava from cvs (now) or the nightly builds (tomorrow) then you will find that WeightMatrixAnnotator now allows you to specify a ScoreType. The one you want is ScoreType.ODDS. You will need a different threshold than you are used to - effectively it's the log odds at which you accept a weight matrix match.

Let's get Bayesian: s is your sequence and m is your weight matrix.

p(m|s)p(s) = p(s|m)p(m)

becomes...

p(m|s) = p(s|m) p(m) / p(s)

i.e. the probability of your weight matrix binding at a particular position is the score of the weight matrix at that position, multiplied by the ratio of how much you believe in the weight matrix to how much you believe in the sequence. Of course, in this context, that ratio is fairly meaningless - just what is p(m) anyway?

We could re-write p(s) in terms of it being produced by either your weight matrix or the null model, at which point this nuisance term becomes the threshold at which you should accept a match. So, if you think there should only be one site every 1000 nt, then p(m) / p(s) can be set to 1/1000; take the log of this, and that's your threshold.
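
To make the arithmetic concrete, here is a rough stand-alone sketch in plain Java. It does not use biojava at all: the class and method names (LogOddsThreshold, priorLogRatio, logOdds, accept) are made up for illustration, the null model is assumed to be uniform over A, C, G and T, and the accept rule (log odds plus log prior ratio greater than zero) is my reading of the argument above, not necessarily what WeightMatrixAnnotator does internally.

```java
public class LogOddsThreshold {
    // Natural log of the assumed prior ratio p(m)/p(s):
    // one expected site every `spacing` nucleotides.
    static double priorLogRatio(double spacing) {
        return Math.log(1.0 / spacing);
    }

    // Log odds of a window under the matrix versus a uniform null model.
    // columns[i][b] is the probability of base b (in A, C, G, T order)
    // at position i of the (hypothetical) weight matrix.
    static double logOdds(double[][] columns, String window) {
        if (window.length() != columns.length) {
            throw new IllegalArgumentException("window length must match matrix width");
        }
        double score = 0.0;
        for (int i = 0; i < columns.length; i++) {
            int b = "ACGT".indexOf(window.charAt(i));
            if (b < 0) {
                throw new IllegalArgumentException("unexpected symbol: " + window.charAt(i));
            }
            score += Math.log(columns[i][b] / 0.25); // 0.25 = uniform null model
        }
        return score;
    }

    // Assumed decision rule: accept when the log odds outweigh the log prior,
    // i.e. logOdds > -log(1/spacing) = log(spacing).
    static boolean accept(double[][] columns, String window, double spacing) {
        return logOdds(columns, window) + priorLogRatio(spacing) > 0.0;
    }

    // Builds a toy matrix that puts probability 0.97 on each consensus base
    // and 0.01 on each of the other three.
    static double[][] consensusMatrix(String consensus) {
        double[][] columns = new double[consensus.length()][4];
        for (int i = 0; i < consensus.length(); i++) {
            java.util.Arrays.fill(columns[i], 0.01);
            columns[i]["ACGT".indexOf(consensus.charAt(i))] = 0.97;
        }
        return columns;
    }

    public static void main(String[] args) {
        double[][] wm = consensusMatrix("ACGTAC");
        System.out.println("log(1/1000)       = " + priorLogRatio(1000));
        System.out.println("perfect match     : " + accept(wm, "ACGTAC", 1000));
        System.out.println("one mismatch      : " + accept(wm, "TCGTAC", 1000));
    }
}
```

With one site per 1000 nt the log prior ratio is about -6.9, so under this rule a window needs log odds above roughly 6.9 to be accepted. Whether your threshold should be that value or its negation depends on which way round the comparison is made, so check that against the scores you actually see.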

That's a noddy explanation, but it's sort of how bioinformatics does these things, so hey ho.

Matthew

