Sorry for the previous error.

---------------------------- Original Message ----------------------------
Subject: Re: [Biojava-l] Parameter Settings in BaumWelchTraining
From: [EMAIL PROTECTED]
Date: Fri, March 12, 2004 12:27 am
To: [EMAIL PROTECTED]
--------------------------------------------------------------------------

Here is the code I have for the training. Using what you told me below, I can
retrieve all of the weights that I calculated manually for the HMM (the
transition distributions and the distribution over the alphabet of each
state). What I do not understand is how to use this information, together
with the sequences stored in a file, to run the Baum-Welch algorithm and then
retrieve the optimized values calculated by the algorithm to set them back
into my HMM.

//Retrieve the alphabet of all states
FiniteAlphabet SA = hmm.stateAlphabet();
Iterator i = SA.iterator();
SimpleModelTrainer MT = new SimpleModelTrainer();
MT.registerModel(hmm);

//go through each state
while(i.hasNext())
{
  Symbol Currentstate = (Symbol)i.next();

  //Retrieve the distribution of all transitions from the current state
  FiniteAlphabet From = hmm.transitionsFrom((State)Currentstate);
  Distribution d = hmm.getWeights((State)Currentstate);
  Iterator i2 = From.iterator();

  //go through it and look at all the weights for each of the transitions
  while(i2.hasNext())
  {
    Symbol s = (Symbol)i2.next();
    System.out.println("From state "+Currentstate.getName()+
                       " To State "+s.getName()+
                       " Weight "+d.getWeight(s));
  }

  //get the distribution for the alphabet of the current state
  Distribution d2 = ((EmissionState)Currentstate).getDistribution();
  FiniteAlphabet IN = (FiniteAlphabet)hmm.emissionAlphabet();
  Iterator i3 = IN.iterator();
  //you can go through it the same way as above using a while loop

*****************************************************************
This is what I don't understand!!!!
*****************************************************************
Here, we have a set of training sequences stored in a file in FASTA format
that I'd like to use with the Baum-Welch algorithm to optimize the
transition distributions mentioned above.

  //This is the file with all the training sequences
  BufferedInputStream is = new BufferedInputStream(new FileInputStream("z:/Sequences.faa"));

  //Load the file with the SequenceDB class
  SequenceDB DB = SeqIOTools.readFasta(is, ProtAlphabet);

  //use 100 cycles as the stop criteria
  StoppingCriteria stopper = new StoppingCriteria() {
    public boolean isTrainingComplete(TrainingAlgorithm ta) {
      return (ta.getCycle() > 100);
    }
  };

*****************************************
This part is what I am clueless about
*****************************************
  //How do I optimize my hmm with the Baum-Welch algorithm and retrieve
  //the optimized values? How do I train the distribution above with
  //Baum-Welch and the sequences that I have?
  DP dp = DPFactory.DEFAULT.createDP(hmm);
  BaumWelchTrainer bwt = new BaumWelchTrainer(dp);
}

PS: I do not know why you are helping all of us here but thank you. It makes
Biojava a lot easier to deal with.

Steve

> Hi Stephane -
>
> Within EmissionState you can set a Distribution that contains emission
> probabilities for the Symbols in the state's emission alphabet using the
> setDistribution method. This Distribution will be your predetermined
> weights.
>
> To set the transition probabilities you can use the setWeights(State
> source, Distribution weights). The source is the state you are
> transitioning from and the weights are the probabilities of transitioning
> to any State that the source connects to. Because States implement Symbol
> you can put them in a Distribution.
>
> To make a Distribution of States that state 'a' could connect to, use the
> following pseudo code:
>
> State a;
> Model m;
> FiniteAlphabet endPoints;
>
> endPoints = m.transitionsFrom(a);
> Distribution d =
>   DistributionFactory.DEFAULT.createDistribution(endPoints);
>
> //You can then train d or set its weights and put it back in the model with
>
> m.setWeights(a, d);
>
> Mark Schreiber
> Principal Scientist (Bioinformatics)
>
> Novartis Institute for Tropical Diseases (NITD)
> 1 Science Park Road
> #04-14 The Capricorn, Science Park II
> Singapore 117528
>
> phone +65 6722 2973
> fax +65 6722 2910
>
>
> [EMAIL PROTECTED]
> Sent by: [EMAIL PROTECTED]
> 03/12/2004 06:11 AM
>
> To: "Biojava Mailing List" <[EMAIL PROTECTED]>
> cc:
> Subject: [Biojava-l] Parameter Settings in BaumWelchTraining
>
>
> Hi all. I'm trying to optimize the state transition probabilities for my
> HMM. I have already set them to values which I think are pretty good.
> Since I know that Baum-Welch can only improve the scores up to a local
> maximum, I thought of using the parameters I calculated as a starting
> point. The problem is that I don't know how!
> I followed the example in biojava:
>
> ....
> //train the model to have uniform parameters
> ModelTrainer mt = new SimpleModelTrainer();
> //register the model to train
> mt.registerModel(hmm);
>
> I want to use the values already set in my hmm as the starting parameters
> in the BaumWelch. I don't want to use the uniform distribution as
> indicated below!
>
> //as no other counts are being used the null weight will cause
> //everything to be uniform
> mt.setNullModelWeight(1.0);
> mt.train();
>
> I tried adding counts and looking up examples on the net but ended up more
> confused than when I started. How do I use addCounts to make this work?
>
> Stephane Acoca
> Master's Student
> McGill Center for Bioinformatics
>
> _______________________________________________
> Biojava-l mailing list - [EMAIL PROTECTED]
> http://biojava.org/mailman/listinfo/biojava-l
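
The question in this thread is how Baum-Welch can start from hand-set
parameters instead of uniform ones and how the optimized values are read back
afterwards. As a rough illustration of that idea only, here is a minimal
sketch in plain Java. It does not use BioJava at all: the class
BaumWelchSketch, its methods, the 2-state/2-symbol model, the starting numbers
and the two integer-coded training sequences are all made up for the example.
It also skips the scaling that real implementations need for long sequences,
so it is only suitable for short toy inputs.

```java
import java.util.Arrays;

// Stand-alone illustration; names are hypothetical and not part of BioJava.
class BaumWelchSketch {
    double[] start;    // start[i]    = P(first state is i)
    double[][] trans;  // trans[i][j] = P(next state is j | current state is i)
    double[][] emit;   // emit[i][k]  = P(symbol k | state i)

    BaumWelchSketch(double[] start, double[][] trans, double[][] emit) {
        this.start = start; this.trans = trans; this.emit = emit;
    }

    // alpha[t][i] = P(obs[0..t], state at t is i)
    double[][] forward(int[] obs) {
        int n = start.length;
        double[][] alpha = new double[obs.length][n];
        for (int i = 0; i < n; i++) alpha[0][i] = start[i] * emit[i][obs[0]];
        for (int t = 1; t < obs.length; t++)
            for (int j = 0; j < n; j++) {
                for (int i = 0; i < n; i++) alpha[t][j] += alpha[t - 1][i] * trans[i][j];
                alpha[t][j] *= emit[j][obs[t]];
            }
        return alpha;
    }

    // beta[t][i] = P(obs[t+1..end] | state at t is i)
    double[][] backward(int[] obs) {
        int n = start.length;
        double[][] beta = new double[obs.length][n];
        Arrays.fill(beta[obs.length - 1], 1.0);
        for (int t = obs.length - 2; t >= 0; t--)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    beta[t][i] += trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j];
        return beta;
    }

    // Sum of log P(sequence) over all sequences under the current parameters.
    double logLikelihood(int[][] seqs) {
        double total = 0;
        for (int[] obs : seqs) total += Math.log(Arrays.stream(forward(obs)[obs.length - 1]).sum());
        return total;
    }

    // One Baum-Welch cycle: collect expected counts, then re-estimate.
    void trainCycle(int[][] seqs) {
        int n = start.length;
        double[] startCounts = new double[n];
        double[][] transCounts = new double[n][n];
        double[][] emitCounts = new double[n][emit[0].length];
        for (int[] obs : seqs) {
            double[][] alpha = forward(obs), beta = backward(obs);
            double p = Arrays.stream(alpha[obs.length - 1]).sum(); // P(obs)
            for (int t = 0; t < obs.length; t++)
                for (int i = 0; i < n; i++) {
                    double gamma = alpha[t][i] * beta[t][i] / p; // P(state at t is i | obs)
                    if (t == 0) startCounts[i] += gamma;
                    emitCounts[i][obs[t]] += gamma;
                    for (int j = 0; j < n && t < obs.length - 1; j++)
                        transCounts[i][j] += alpha[t][i] * trans[i][j]
                                * emit[j][obs[t + 1]] * beta[t + 1][j] / p;
                }
        }
        start = normalize(startCounts);
        for (int i = 0; i < n; i++) {
            trans[i] = normalize(transCounts[i]);
            emit[i] = normalize(emitCounts[i]);
        }
    }

    // Turn counts into probabilities; fall back to uniform if nothing was counted.
    static double[] normalize(double[] counts) {
        double sum = Arrays.stream(counts).sum();
        double[] out = new double[counts.length];
        for (int k = 0; k < counts.length; k++)
            out[k] = sum > 0 ? counts[k] / sum : 1.0 / counts.length;
        return out;
    }

    public static void main(String[] args) {
        // Hand-set starting parameters (made-up numbers, 2 states, 2 symbols).
        BaumWelchSketch hmm = new BaumWelchSketch(
                new double[] {0.6, 0.4},
                new double[][] {{0.7, 0.3}, {0.4, 0.6}},
                new double[][] {{0.5, 0.5}, {0.1, 0.9}});
        int[][] seqs = {{0, 1, 0, 0, 1, 1, 1, 1, 0, 1}, {1, 1, 1, 0, 0, 1, 0, 1, 1, 1}};
        System.out.println("log-likelihood before: " + hmm.logLikelihood(seqs));
        for (int cycle = 0; cycle < 100; cycle++) hmm.trainCycle(seqs);
        System.out.println("log-likelihood after:  " + hmm.logLikelihood(seqs));
        System.out.println("transitions: " + Arrays.deepToString(hmm.trans));
        System.out.println("emissions:   " + Arrays.deepToString(hmm.emit));
    }
}
```

In this sketch the hand-set values are simply whatever the arrays contain
before the first cycle, and the optimized values are read from the same arrays
after the last cycle. Whether BioJava's own trainer treats the weights already
in the model this way is not shown here and should be checked against its
documentation.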