Once upon a time, Schreiber, Mark wrote:

> > An alternate approach, which BioJava *will* help you with, is
> > to calculate both the forward and the backward DP matrices,
> > then multiply these together and normalize by the overall
> > backwards/forwards probability. This then tells you the
>
> By normalize do you mean divide each value by the total forward or
> backward probability? So by doing this I would expect that the
> forward-backward prob should equal the forward and the backward probs?

The value of each cell in the forward * backward matrix is the
probability of observing the data given that a particular symbol was
emitted by a particular state. The overall forward score is simply the
probability of the data given any path through the model. So the ratio
gives the probability that the `true' path passed through a particular
cell (assuming a uniform prior, if you want to be nicely Bayesian).

> Also, can someone enlighten me how the matrices are indexed (there is
> not much javadoc in DP)? Currently I'm assuming that the x axis is the
> array returned by getStates and the y axis is the aligned SymbolList
> from that state path, with a magical state at each end but I'm not too
> clear on this.

I'm not sure what you mean by x and y here. When you do
forwards/backwards, you don't have any state path to use as a frame of
reference, since you're summing over all possible state paths. The
indices into the array are just indices into the SymbolList (and yes,
positions 0 and length()+1 are magic symbols which only match the
model's magical state).

For `gaps' (i.e. paths which pass through states which don't emit
anything in the sequence you're using as a frame of reference), you'll
find the scores at the index of the previous symbol which *was* emitted.
So long as your model doesn't have any cycles of states which don't emit
any symbols, this should always be unambiguous.

For the 1-head DP case, there's a nice example of all this in
BaumWelchTrainer.java. There isn't anything equivalent for 2-head, but
the principles should be closely analogous.

Thomas.
_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
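
To make the forward * backward normalisation concrete, here is a minimal sketch in plain Java. It deliberately does not use any BioJava classes: the class name, method names, and the two-state model are all made up for illustration. It also assumes raw probabilities (a real implementation would probably work with log scores), an explicit initial distribution in place of the magical start/end state, and 0-based positions without the magic symbols at 0 and length()+1.

```java
// Toy forward/backward posterior for a small HMM.
// Hypothetical names throughout; this is not the BioJava DP API.
class ForwardBackwardSketch {

    // f[t][k] = P(obs[0..t], state at position t is k)
    static double[][] forward(double[] init, double[][] trans,
                              double[][] emit, int[] obs) {
        int n = obs.length, s = init.length;
        double[][] f = new double[n][s];
        for (int k = 0; k < s; k++) {
            f[0][k] = init[k] * emit[k][obs[0]];
        }
        for (int t = 1; t < n; t++) {
            for (int k = 0; k < s; k++) {
                double sum = 0.0;
                for (int j = 0; j < s; j++) {
                    sum += f[t - 1][j] * trans[j][k];
                }
                f[t][k] = sum * emit[k][obs[t]];
            }
        }
        return f;
    }

    // b[t][k] = P(obs[t+1..n-1] | state at position t is k)
    static double[][] backward(double[][] trans, double[][] emit, int[] obs) {
        int n = obs.length, s = trans.length;
        double[][] b = new double[n][s];
        for (int k = 0; k < s; k++) {
            b[n - 1][k] = 1.0;
        }
        for (int t = n - 2; t >= 0; t--) {
            for (int k = 0; k < s; k++) {
                double sum = 0.0;
                for (int j = 0; j < s; j++) {
                    sum += trans[k][j] * emit[j][obs[t + 1]] * b[t + 1][j];
                }
                b[t][k] = sum;
            }
        }
        return b;
    }

    // Overall forward probability: sum over states at the last position.
    static double total(double[][] f) {
        double sum = 0.0;
        for (double v : f[f.length - 1]) {
            sum += v;
        }
        return sum;
    }

    // post[t][k] = f[t][k] * b[t][k] / P(obs): the probability that the
    // path passed through state k at position t, given the data.
    static double[][] posterior(double[][] f, double[][] b) {
        double px = total(f);
        double[][] post = new double[f.length][f[0].length];
        for (int t = 0; t < f.length; t++) {
            for (int k = 0; k < f[t].length; k++) {
                post[t][k] = f[t][k] * b[t][k] / px;
            }
        }
        return post;
    }

    public static void main(String[] args) {
        // Made-up two-state, two-symbol model.
        double[] init = {0.5, 0.5};
        double[][] trans = {{0.9, 0.1}, {0.2, 0.8}};
        double[][] emit = {{0.5, 0.5}, {0.9, 0.1}};
        int[] obs = {0, 0, 1, 0};

        double[][] post = posterior(forward(init, trans, emit, obs),
                                    backward(trans, emit, obs));
        for (int t = 0; t < post.length; t++) {
            System.out.printf("%d  %.4f  %.4f%n", t, post[t][0], post[t][1]);
        }
    }
}
```

In this sketch, summing forward * backward over the states at any single position gives back the overall forward probability, so dividing by that total makes the values at each position sum to 1. Individual cells are not equal to the forward or backward totals; only their sum across states at one position is.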