Re: [R] unique/subset problem

2007-01-26 Thread lalitha viswanath
Hi, the pruned dataset has 8 unique genomes in it, while the dataset before pruning has 65. However, calling unique on the pruned dataset seems to return 65 no matter what. Any assistance in this matter would be appreciated. Thanks, Lalitha --- Weiwei Shi [EMAIL PROTECTED]

Re: [R] unique/subset problem

2007-01-26 Thread Sarah Goslee
Without knowing more about your data, it is hard to say for certain, but might you be confusing unique _values_ with _factor levels_?

mydata <- as.factor(sort(rep(1:5, 2)))
# mydata has 10 values, 5 unique values, and 5 factor levels
mydata
 [1] 1 1 2 2 3 3 4 4 5 5
Levels: 1 2 3 4 5
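
To make the values-versus-levels distinction concrete for the subset case, here is a minimal sketch. The data frame below is made up for illustration (only the column names genome1 and score come from the thread), and the filter score < -5 is an assumption about the original call.

```r
# Toy data standing in for the poster's table; the values are invented.
dataset <- data.frame(
  genome1 = factor(c("aero", "aful", "bsub", "aero", "cace")),
  score   = c(-10, -2, -7, -1, -3)
)

# Assumed filter: keep rows with score below -5.
pruned <- subset(dataset, score < -5)

n_values <- length(unique(pruned$genome1))  # distinct values still present
n_levels <- length(levels(pruned$genome1))  # levels inherited from the full data

n_values
n_levels
```

subset() keeps the factor's full set of levels, so counting levels on the pruned data reports the size of the original set rather than what survived the filter.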

Re: [R] unique/subset problem

2007-01-26 Thread Weiwei Shi
Then you need to provide more details about the calls you made and your dataset. For example, you can show us the output of str(prunedrelatives, 1). How did you call unique on prunedrelatives, and so on? I made a test dataset and it gave me what you wanted (omitted here). On 1/26/07, lalitha viswanath [EMAIL

Re: [R] unique/subset problem

2007-01-26 Thread lalitha viswanath
Hi, I read in my dataset using dt <- read.table(filename). Calling unique(levels(dt$genome1)) yields the following:

 [1] aero aful aquae atum_D bbur bhal bmel bsub
 [9] buch cace ccre cglu cjej cper cpneuA cpneuC
[17] cpneuJ

Re: [R] unique/subset problem

2007-01-26 Thread Weiwei Shi
Check ?read.table and add as.is=T to the options. That way strings are read as character rather than factor, and you avoid the factor issue. Then repeat your work. For example: x0 <- read.table("~/Documents/tox/noodles/four_sheets_orig/reg_r2.txt", sep="\t", nrows=10) str(x0,1) `data.frame': 10 obs. of 7 variables: $
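
A small sketch of the as.is suggestion, reading from an inline string rather than a file so it can be run as-is. The table contents are invented; only the column names follow the thread.

```r
# Inline stand-in for the poster's file; the rows are made up.
raw <- "genome1 genome2 dist score
aero aful 0.4 -10
aful bsub 0.9 -2
bsub cace 0.3 -7"

# as.is = TRUE keeps string columns as character instead of factor.
dt <- read.table(text = raw, header = TRUE, as.is = TRUE)

# Assumed filter from the thread: score below -5.
pruned <- subset(dt, score < -5)
n_unique <- length(unique(pruned$genome1))

str(dt)
n_unique
```

With character columns there are no levels to carry over, so unique() on the pruned column reflects only the rows that remain.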

Re: [R] unique/subset problem

2007-01-26 Thread Weiwei Shi
Oh, I forgot: you can also convert a factor into a string, like dataset$genome1 <- as.character(dataset$genome1). And you don't have to use as.numeric(dataset$score) if you use as.is=T when you call read.table. HTH, weiwei On 1/26/07, Weiwei Shi [EMAIL PROTECTED] wrote: check ?read.table and add as.is=T
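
If the data has already been read in with factor columns, the conversion can be done after the fact. The sketch below uses invented data and shows two routes: as.character(), as suggested above, and droplevels(), which is not mentioned in the thread but is a base R function that discards unused levels while keeping the column a factor.

```r
# Toy data with an explicit factor column; the values are invented.
dataset <- data.frame(
  genome1 = factor(c("aero", "aful", "bsub", "aero", "cace")),
  score   = c(-10, -2, -7, -1, -3)
)
pruned <- subset(dataset, score < -5)  # assumed filter

# Route 1: convert the factor to character, as suggested in the message.
as_chr <- as.character(pruned$genome1)

# Route 2 (not from the thread): drop levels that no longer occur.
pruned_dropped <- droplevels(pruned)
remaining_levels <- levels(pruned_dropped$genome1)

unique(as_chr)
remaining_levels
```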

[R] unique/subset problem

2007-01-25 Thread lalitha viswanath
Hi, I am new to R programming and am using subset to extract part of a dataset as follows: names(dataset) = c("genome1","genome2","dist","score"); prunedrelatives <- subset(dataset, score < -5); However, when I use unique to find the number of unique genomes now present in prunedrelatives I get results

Re: [R] unique/subset problem

2007-01-25 Thread Weiwei Shi
Hi, even if you removed many genome1 rows by setting score < -5, that does not necessarily mean you changed the set of unique values. To check this, you can do something like: p0 <- unique(dataset[dataset$score < -5, "genome1"]) # same as subset p1 <- unique(dataset[dataset$score >= -5, "genome1"]) setdiff(p1, p0) if the output above
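
The check described above can be tried on a small example. The data is invented, the column is kept as character so that factor levels do not interfere, and the direction of the comparison (score < -5 kept, score >= -5 removed) is an assumption.

```r
# Toy data; genome1 is character so unique() and setdiff() see plain strings.
dataset <- data.frame(
  genome1 = c("aero", "aful", "bsub", "aero", "cace"),
  score   = c(-10, -2, -7, -1, -3),
  stringsAsFactors = FALSE
)

p0 <- unique(dataset[dataset$score <  -5, "genome1"])  # genomes kept by the filter
p1 <- unique(dataset[dataset$score >= -5, "genome1"])  # genomes in the removed rows

# Genomes that appear only in the removed rows, i.e. vanish after pruning.
only_removed <- setdiff(p1, p0)
only_removed
```

A genome such as "aero" here occurs on both sides of the threshold, so removing some of its rows does not remove it from the set of unique genomes.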