Hello everyone, I would appreciate any help with the following.

My dataset is a list containing matrices. So if you type e.g.

data[[1]]

you get something like:

          [,1]  [,2]
361a      A     T
456b      A     G
72145a    T     G
........

As you can see, my rows have names, which are character strings containing
numbers and letters. I want something similar to a histogram, per column; i.e.,
I want to know how many times a character occurs exactly once in a column, how
many times a character occurs exactly twice, and so on. Maybe there is an
easier way to do this, but I wrote my own code, which works perfectly, so don't
bother to correct it unless extremely necessary. I include the code so you
know exactly what I'm trying to do:

table <- vector()

for (i in (1:length(data))){

    for (j in (1:length(data[[i]][1,]))){

        t <- table(data[[i]][,j])

        table <- c(table, t)
}}

# this line is necessary to eliminate "-" characters,
# which should not be included in the analysis
ncount <- table[names(table) != "-"]

sfs <- table (ncount)

And with this code I get something like:

 1   2   3   4   5   6   7   8   9  10 ....

542 125  98  49  47  41  26  31  22  18  ....

which is what I'm looking for.


Now comes THE problem:

As I said before, my rows have names, and each name is unique. I want to apply
my analysis to a subset of rows in each matrix, namely all rows whose names
start with 3, all that start with 4, and all that start with 721. In most cases
only the first character matters, but since I have names of different lengths,
in some cases I need the first three characters to differentiate the groups. I
want to integrate this into the loop so that I get a vector (such as the one
called "table" in my code) for each subset analyzed.

I tried using the subset function, but I couldn't figure out how to use it, 
because it's intended to use row values to define the subset, not row names. 
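
To make the goal concrete, here is a minimal sketch of what I am after, assuming
the rows can be selected by comparing the start of rownames() with a prefix. The
toy matrix, the prefix list, and the helper name count_by_prefix are made up for
illustration and are not part of my real data or code.

```r
# Toy data (made up): a list holding one character matrix with unique row names
data <- list(
  matrix(c("A", "A", "T", "A",    # column 1
           "T", "G", "G", "-"),   # column 2
         ncol = 2,
         dimnames = list(c("361a", "456b", "72145a", "388c"), NULL))
)

# Hypothetical group prefixes, taken from the examples in the text
prefixes <- c("3", "4", "721")

# Hypothetical helper: runs the same per-column counting as the loop above,
# but only on rows whose name starts with `prefix`
count_by_prefix <- function(data, prefix) {
  counts <- vector()
  for (i in seq_along(data)) {
    m <- data[[i]]
    # compare the first nchar(prefix) characters of each row name
    keep <- substr(rownames(m), 1, nchar(prefix)) == prefix
    # drop = FALSE keeps a matrix even when only one row is selected
    sub <- m[keep, , drop = FALSE]
    for (j in seq_len(ncol(sub))) {
      counts <- c(counts, table(sub[, j]))
    }
  }
  # remove "-" characters, as in the original code
  counts[names(counts) != "-"]
}

# One vector of counts per prefix, then one frequency table per prefix
results <- lapply(prefixes, function(p) count_by_prefix(data, p))
names(results) <- prefixes
sfs <- lapply(results, table)
```

With this layout, results[["3"]] would hold the counts for rows starting with 3,
and sfs[["3"]] the corresponding table. Note that a short prefix such as "7"
would also match names starting with 721, so the prefixes need to be chosen so
that the groups do not overlap.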

I hope someone can help me out, but please bear in mind I am really new at R 
and most commands and parameters are really unfamiliar to me.

Thanks.


      