On 12 Mar 2004, at 06:00, [EMAIL PROTECTED] wrote:


When you call the train() method of the BaumWelchTrainer, you supply it
with a SequenceDB. The sequences from this DB are used to optimize the
weights of the model.
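
In code, that usually looks something like the sketch below (written
against the BioJava 1.x org.biojava.bio.dp API from memory, so treat the
exact train() signature and the 5.0 null-model weight as assumptions to
check against the javadocs):

    import org.biojava.bio.dp.*;
    import org.biojava.bio.seq.db.SequenceDB;

    public class TrainExample {
        // model: your (possibly pre-trained) HMM; seqDB: the training data.
        public static void train(MarkovModel model, SequenceDB seqDB)
                throws Exception {
            DP dp = DPFactory.DEFAULT.createDP(model);
            TrainingAlgorithm trainer = new BaumWelchTrainer(dp);

            // Stop after 20 cycles; a real stopper would also watch the
            // change in log-likelihood between cycles.
            trainer.train(seqDB, 5.0, new StoppingCriteria() {
                public boolean isTrainingComplete(TrainingAlgorithm ta) {
                    return ta.getCycle() >= 20;
                }
            });
        }
    }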

However, I have a bad feeling that when you train your model with the
BaumWelchTrainer, your previously set counts will be ignored and
overwritten. You could check by looking into AbstractModelTrainer.train()
(which is what the BaumWelchTrainer extends). You could also run some
tests to see whether using a pre-trained model makes any difference to
the final outcome. Does anyone more expert than me on the DP package
(i.e. most people) know if the counts are overwritten?

The Baum-Welch algorithm is actually the Expectation-Maximization (EM) algorithm applied to HMMs. When you run BaumWelchTrainer, it takes the existing model, then calculates a matrix defining the probability that each symbol in your training database was emitted by each state of the HMM. It then optimizes the transition and emission probabilities of the model to maximize the likelihood of the data *given that assignment of states to data*.
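
In outline, one EM cycle looks like this (a self-contained sketch for a
discrete HMM, independent of the BioJava classes; probabilities are left
unscaled for brevity, so real code should rescale or work in log space):

    public class BaumWelchCycle {

        // a: transitions [nStates][nStates], b: emissions [nStates][nSymbols],
        // pi: initial state probabilities, obs: one sequence of symbol indices.
        static void emStep(double[][] a, double[][] b, double[] pi, int[] obs) {
            int n = a.length, t = obs.length;

            // E-step: forward (f) and backward (r) variables.
            double[][] f = new double[t][n], r = new double[t][n];
            for (int i = 0; i < n; i++) f[0][i] = pi[i] * b[i][obs[0]];
            for (int k = 1; k < t; k++)
                for (int j = 0; j < n; j++)
                    for (int i = 0; i < n; i++)
                        f[k][j] += f[k-1][i] * a[i][j] * b[j][obs[k]];
            for (int i = 0; i < n; i++) r[t-1][i] = 1.0;
            for (int k = t - 2; k >= 0; k--)
                for (int i = 0; i < n; i++)
                    for (int j = 0; j < n; j++)
                        r[k][i] += a[i][j] * b[j][obs[k+1]] * r[k+1][j];

            double like = 0;
            for (int i = 0; i < n; i++) like += f[t-1][i];

            // gamma[k][i] is the matrix described above: the posterior
            // probability that state i emitted symbol k.
            double[][] gamma = new double[t][n];
            for (int k = 0; k < t; k++)
                for (int i = 0; i < n; i++)
                    gamma[k][i] = f[k][i] * r[k][i] / like;

            // M-step: re-estimate emissions from the expected counts.
            // Transitions are re-estimated the same way (from pairwise
            // expected counts), omitted here to keep the sketch short.
            for (int i = 0; i < n; i++) {
                double tot = 0;
                double[] cnt = new double[b[i].length];
                for (int k = 0; k < t; k++) {
                    cnt[obs[k]] += gamma[k][i];
                    tot += gamma[k][i];
                }
                for (int s = 0; s < b[i].length; s++) b[i][s] = cnt[s] / tot;
            }
        }
    }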


This means that, although all the model parameters get overwritten every cycle, they do still depend on the previous state of the model. If you're starting with a model that is a good, but not quite optimal, fit to your data, BaumWelchTrainer ought to do what you want.

Another thing you might look at is building a model with a mixture of trainable and untrainable distributions. This lets you optimize just the parts of the model you aren't yet sure about, while holding the parts you already trust constant.
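
For example, something along these lines (a hypothetical sketch against
the org.biojava.bio.dist API as I remember it; UntrainableDistribution is
the class I have in mind for the fixed part, but do check the exact names):

    import org.biojava.bio.dist.*;
    import org.biojava.bio.seq.DNATools;
    import org.biojava.bio.symbol.FiniteAlphabet;

    public class MixedDistributions {
        public static void main(String[] args) throws Exception {
            FiniteAlphabet dna = DNATools.getDNA();

            // The part you already trust: an UntrainableDistribution opts
            // out of the training framework, so Baum-Welch should leave
            // these weights exactly as you set them.
            Distribution fixed = new UntrainableDistribution(dna);
            fixed.setWeight(DNATools.a(), 0.7);
            fixed.setWeight(DNATools.c(), 0.1);
            fixed.setWeight(DNATools.g(), 0.1);
            fixed.setWeight(DNATools.t(), 0.1);

            // The part you still want Baum-Welch to optimize.
            Distribution free =
                DistributionFactory.DEFAULT.createDistribution(dna);

            // Build the model's emission states from a mix of 'fixed' and
            // 'free' distributions; only the trainable ones will move.
        }
    }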

Thomas.
