Hi Peter,

You didn't give a very specific example, but it seems to me that what you want to do is not really complicated. I suppose you have created a table of sequences vs. properties such as hydrophobicity, charge, etc., something like:

seq     hydroph    arom
b0001   0.104762   0.000000
b0002   0.035122   0.065854
b0003   0.024193   0.070968
b0004  -0.096729   0.084112
b0005  -0.973469   0.091837
b0006  -0.402713   0.108527
b0007   0.680672   0.123950
b0008  -0.209779   0.072555
b0009  -0.013334   0.046154
b0010   0.952128   0.143617

Suppose you have these data in a data frame called myseqs. See the R documentation on how to read data in; for a whitespace-separated file with the header line above you can try

> myseqs <- read.table("myseqs.txt", header = TRUE, row.names = 1)

("myseqs.txt" is only a placeholder for wherever you saved the table. row.names = 1 turns the seq column into row names, so only the numeric columns go into the distance calculation and the dendrogram is labelled with the sequence names.)

# you need to load the necessary libraries
# library(mva)   # basic clustering; only needed for R older than 1.9.0,
#                # since dist() and hclust() now live in the 'stats' package
library(cluster) # more clustering algorithms

# then you need to calculate the 'distances' between sequences
myseqs.d <- dist(myseqs)
# this creates the Euclidean distance matrix, try help(dist) for more info

# then we perform a hierarchical clustering
myseqs.clus <- hclust(myseqs.d)

# now check out your results
plot(myseqs.clus)
# hey! you see how easy it is?
# the documentation for hclust contains much more info

# other fancy clustering algorithms
myseqs.pam <- pam(myseqs, k = 2)
plot(myseqs.pam)

I hope this helps.
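
P.S. Once you have the tree, you will probably want the actual group memberships rather than just the picture. Below is a minimal sketch of one way to get them with cutree(), using the ten rows from the table above typed in by hand so that it runs on its own. Two things here are my own assumptions and not part of the recipe above: I rescale the columns with scale() first (hydroph spans roughly -1 to 1 while arom stays below 0.15, so the unscaled Euclidean distance is driven almost entirely by hydroph), and k = 2 is an arbitrary choice you should replace with whatever the dendrogram suggests.

```r
# The ten example rows from the table, entered directly
myseqs <- data.frame(
  hydroph = c(0.104762, 0.035122, 0.024193, -0.096729, -0.973469,
              -0.402713, 0.680672, -0.209779, -0.013334, 0.952128),
  arom    = c(0.000000, 0.065854, 0.070968, 0.084112, 0.091837,
              0.108527, 0.123950, 0.072555, 0.046154, 0.143617),
  row.names = sprintf("b%04d", 1:10)   # b0001 ... b0010
)

# Assumption: centre and scale each column so both properties
# contribute comparably to the Euclidean distance
myseqs.sc <- scale(myseqs)

# Hierarchical clustering (complete linkage is the hclust default)
myseqs.clus <- hclust(dist(myseqs.sc))

# Cut the tree into k groups; k = 2 is only an illustrative choice
groups <- cutree(myseqs.clus, k = 2)

print(groups)         # one group number per sequence name
print(table(groups))  # how many sequences fall in each group
```

If you prefer to cut at a given height of the dendrogram instead of asking for a fixed number of groups, cutree() also takes an h argument; see help(cutree).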
