I think I am making this problem harder than it has to be and so I keep getting 
stuck on what might be a trivial problem. 
I have used the seqinr package to load a protein sequence alignment containing 
15 protein sequences:
    > library(seqinr)
    > x = read.alignment("proteins.fasta", format="fasta", forceToLower=FALSE)
This automatically loads in a list of 4 elements including the sequences and other 
information.
I store the sequences in a new object:
    > mylist = x$seq
which returns a character vector of 15 strings.
I have found that if I split the long character strings into individual 
characters it is easy to use lapply to loop over this list. So I use strsplit;
    > list.2 = strsplit(mylist, split = NULL)
From this list I can determine which proteins have changes at certain 
positions by using:
    > lapply(list.2, "[", 10) == "L"
This returns a logical T/F vector for those elements of the list that do/do not 
have the letter L at position 10. 
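
As a minimal sketch of the same check, vapply() can return a plain character vector of the residues at position 10, so the comparison does not rely on a list being coerced. The three short sequences below are made up for illustration and only stand in for x$seq; pos10 and has_L are hypothetical names.

```r
# Toy stand-in for x$seq: three made-up aligned sequences of equal length
mylist <- c("PQITLWQRPLV", "PQITLWQRPIV", "PQVTLWQRPLV")
list.2 <- strsplit(mylist, split = NULL)

# Extract the residue at position 10 from every sequence
pos10 <- vapply(list.2, function(s) s[10], character(1))

# TRUE where a sequence has "L" at position 10
has_L <- pos10 == "L"
print(has_L)
```

Because vapply() is told to expect character(1), it stops with an error if a sequence is shorter than expected in a way that changes the result type, which lapply() would not report.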
Because each of the protein sequences contains 99 amino acids, I want to 
automate this process so that I do not have to compare positions one by 
one. Most of the changes occur between positions 10 and 95. I have a standard 
character vector that I wish to use for comparison when looping through the 
list. 
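
One way the comparison against a standard vector might be automated is sketched here. It assumes the standard has already been split into single characters and is aligned to the same positions as the sequences. The data are made up, and the names standard, positions, diffs and changed are hypothetical; for the real 99-residue sequences, positions would be 10:95.

```r
# Toy data: a made-up standard and three made-up aligned sequences
standard <- strsplit("PQITLWQRPLV", split = NULL)[[1]]
mylist   <- c(s1 = "PQITLWQRPLV", s2 = "PQITLWQRPIV", s3 = "PQVTLWKRPLV")
list.2   <- strsplit(mylist, split = NULL)
names(list.2) <- names(mylist)

# Positions to compare (10:95 for the real data)
positions <- 3:10

# For each sequence, TRUE where the residue differs from the standard
diffs <- lapply(list.2, function(s) s[positions] != standard[positions])

# For each sequence, the positions at which it differs from the standard
changed <- lapply(diffs, function(d) positions[d])
print(changed)
```

Keeping the standard separate from the list of sequences means it can be passed to the comparison unchanged and never has to be skipped when looping.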
Should I perhaps combine everything (the standard amino acid vector and the 
list of protein sequences) into one list? Or is it better to leave them separate 
for this comparison? I'm not sure what the output should be, as I need to use it 
for another statistical test. Would a list of logical vectors be the most 
suitable output to return? 
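
On the shape of the output, one possible alternative to a list of logical vectors is a single logical matrix with one row per sequence and one column per position. This is only a sketch: it assumes all sequences have the same length (as they do in an alignment), uses made-up data, and the names aa, is_changed and n_changed are hypothetical. Whether a matrix suits the later statistical test depends on that test.

```r
# Toy data: a made-up standard and three made-up aligned sequences
standard <- strsplit("PQITLWQRPLV", split = NULL)[[1]]
mylist   <- c(s1 = "PQITLWQRPLV", s2 = "PQITLWQRPIV", s3 = "PQVTLWKRPLV")

# Character matrix: one row per sequence, one column per alignment position
aa <- do.call(rbind, strsplit(mylist, split = NULL))
rownames(aa) <- names(mylist)

# Logical matrix: TRUE where a residue differs from the standard.
# sweep() compares every entry in column j of aa against standard[j].
is_changed <- sweep(aa, 2, standard, FUN = "!=")

# Number of sequences that differ from the standard at each position
n_changed <- colSums(is_changed)
print(n_changed)
```

With the real data, is_changed[, 10:95] would restrict the result to the positions of interest, and rowSums() would give the number of changes per protein.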

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
