A couple of related questions about adding codon usage plots to the Graph menu.
I notice that /diana/components/BasePlotGroup.java has a line
// addAlgorithm (new CodonUsageAlgorithm (forward_strand));
but uncommenting this line causes a compile error in BasePlotGroup.java. Is CodonUsageAlgorithm about to be phased in, or is it something from the past that has been abandoned?
I'd like to know whether Artemis will soon have a standard class for accessing codon usage values. I've been trying to add some of the standard codon bias measures to the Graph menu, and have been storing the codon counts in a TreeMap, as follows:
Collator co = Collator.getInstance();
TreeMap wordSet = new TreeMap(co); // holds the counts for each codon
then later in the code, incrementing the codon counts within the sliding window as follows:
Integer number = (Integer)wordSet.get(codon); // look for current count
if(number == null)
number = new Integer(0); // if it is the first occurrence, initialize
wordSet.put(codon, new Integer(number.intValue()+1)); // increment it
and then retrieving them and calculating Shannon entropy (in this particular case), as follows:
Set mappings = wordSet.entrySet(); // unwrap the TreeMap
double ent = 0;
for(Iterator i=mappings.iterator(); i.hasNext();) { // iterate through it
Map.Entry e = (Map.Entry)i.next();
String in_hash = e.getValue().toString();
float as_num = Float.parseFloat(in_hash);
float freq = as_num/total;
ent -= freq*Math.log(freq)/Math.log(2); // do an entropy calculation
}
values[0] = (float)ent; // plot it
This works but is painfully slow in practice, especially when changing the window size or zooming in and out of the genome. It occurs to me that perhaps I am (badly) reinventing the wheel here and should avoid TreeMaps. What do you recommend?
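In case it helps anyone reproduce this outside Artemis, here is a minimal self-contained sketch of the same calculation for a single window. The class and method names (CodonEntropySketch, windowEntropy) are made up for this post, and I am assuming non-overlapping codons read in frame from the start of the window. It is an illustration of the approach described above, not the actual plot code.

```java
import java.text.Collator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone class, not part of Artemis.
class CodonEntropySketch {

    // Shannon entropy (in bits) of the codon counts in seq[start, start + windowSize).
    // Assumption: codons are non-overlapping and read in frame from 'start'.
    static double windowEntropy(String seq, int start, int windowSize) {
        // Same data structure as in the fragments above: a TreeMap ordered by a Collator.
        Map<String, Integer> wordSet = new TreeMap<>(Collator.getInstance());
        int total = 0;
        int end = Math.min(seq.length(), start + windowSize);

        // Count each complete codon in the window.
        for (int i = start; i + 3 <= end; i += 3) {
            String codon = seq.substring(i, i + 3);
            Integer number = wordSet.get(codon);             // look for current count
            wordSet.put(codon, number == null ? 1 : number + 1);
            total++;
        }

        // Sum -p * log2(p) over the codons that were seen.
        double ent = 0;
        for (Map.Entry<String, Integer> e : wordSet.entrySet()) {
            double freq = e.getValue() / (double) total;
            ent -= freq * Math.log(freq) / Math.log(2);
        }
        return ent;
    }

    public static void main(String[] args) {
        // Two codons, each seen twice in a 12-base window.
        System.out.println(windowEntropy("ATGCCCATGCCC", 0, 12));
    }
}
```

In the real plot this runs once per window position, so the map is rebuilt from scratch every time the window moves.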
Cheers, Derek
