Hi,

I apologize for posting this here; I am also trying to post it to machine learning mailing lists.
I have a set (18K) of sequences (22 nt long), and I have their counts at 4 different stages. The difference in counts from one stage to the next represents how well the sequence performed in that transition. The total counts remain about the same in each stage, so if one sequence loses counts in a stage, another sequence gains those counts in that stage.

I am trying to build a predictor that combines these 4 stages. I have already tried building an SVM using just the counts in the final stage, but it's not that great (0.3 correlation with the test set).

The problem I am facing now is how to combine these 4 stages into 1 dependent variable, or something like that. The 4 stages are the dependent variables and the sequence is my independent variable. The aim is to use the count information in each stage to capture how well the sequence performs across all 4 stages.

I appreciate any suggestions for this problem.

Sincerely,
Vishal

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
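
To make the "4 stages into 1 dependent variable" idea concrete, here is a minimal sketch of one possible summary, not a recommendation of the right model: convert each stage's counts to relative frequencies, take log2, and use the least-squares slope across the stage index as a single per-sequence score. Everything in it is an assumption for illustration: the function name `stage_slopes`, the pseudocount of 0.5, the equal spacing of the stages, and the toy counts and sequence labels, which are made up and are not the real data.

```python
import math


def stage_slopes(counts, pseudocount=0.5):
    """Summarise per-stage counts as one number per sequence.

    counts: dict mapping sequence -> list of raw counts, one per stage.
    Returns a dict mapping sequence -> least-squares slope of
    log2(relative frequency) against the stage index (0, 1, 2, ...).
    The pseudocount is an assumed value that keeps log2 defined at zero counts.
    """
    n_stages = len(next(iter(counts.values())))
    n_seqs = len(counts)

    # Per-stage totals, including the pseudocounts, for turning counts into frequencies.
    totals = [
        sum(c[s] for c in counts.values()) + pseudocount * n_seqs
        for s in range(n_stages)
    ]

    # Stage indices are assumed to be equally spaced.
    x = list(range(n_stages))
    x_mean = sum(x) / n_stages
    sxx = sum((xi - x_mean) ** 2 for xi in x)

    slopes = {}
    for seq, c in counts.items():
        y = [math.log2((c[s] + pseudocount) / totals[s]) for s in range(n_stages)]
        y_mean = sum(y) / n_stages
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
        slopes[seq] = sxy / sxx  # change in log2 frequency per stage
    return slopes


# Made-up toy counts; each stage sums to 1200, mimicking constant totals.
toy = {
    "SEQ_A": [100, 200, 400, 800],  # gains counts at every transition
    "SEQ_B": [800, 400, 200, 100],  # loses counts at every transition
    "SEQ_C": [300, 600, 600, 300],  # rises, then falls back
}
slopes = stage_slopes(toy)
for name, slope in slopes.items():
    print(name, round(slope, 3))
```

A positive slope means the sequence gains share from stage to stage, and a negative slope means it loses share. The slope could then serve as the single response for the SVM in place of the final-stage counts. Note that a sequence that rises and then falls back, like `SEQ_C` here, gets a slope near zero, so this summary hides non-monotonic behaviour.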