Hi
The pruned dataset has 8 unique genomes in it while
the dataset before pruning has 65 unique genomes in
it.
However, calling unique on the pruned dataset seems to
return 65 no matter what.
Any assistance in this matter would be appreciated.
Thanks
Lalitha
--- Weiwei Shi [EMAIL PROTECTED]
Without knowing more about your data, it is hard to say for certain,
but might you be confusing unique _values_ with _factor levels_?
mydata <- as.factor(sort(rep(1:5, 2)))
# mydata has 10 values, 5 unique values, and 5 factor levels
mydata
[1] 1 1 2 2 3 3 4 4 5 5
Levels: 1 2 3 4 5
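To make the distinction concrete, here is a minimal sketch of how a subset of a factor column can keep its old levels. The data frame and its values are made up for illustration and are not the poster's dataset, and the filter `score < -5` is an assumption about the intended condition.

```r
# Toy data (hypothetical values, not the poster's dataset)
dataset <- data.frame(
  genome1 = factor(c("aero", "aful", "bbur", "bsub", "aero")),
  score   = c(-10, -2, -7, -1, -6)
)

# Assumed filter: keep rows with score below -5
pruned <- subset(dataset, score < -5)

# The factor still carries every level of the original column,
# including genomes that no longer occur in any row
n_levels_kept <- length(levels(pruned$genome1))

# unique() on the column itself counts only the values actually present
n_unique_vals <- length(unique(pruned$genome1))

# Re-creating the factor drops the unused levels
pruned$genome1 <- factor(pruned$genome1)
n_levels_after <- length(levels(pruned$genome1))
```

If this is what is happening, counting `unique(pruned$genome1)` rather than `unique(levels(pruned$genome1))` should give the expected number.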
Then you need to provide more details about your dataset and the calls you made.
For example, you can show us the output of
str(prunedrelatives, 1)
How did you call unique on prunedrelatives, and so on? I made a test
dataset and it gave me what you wanted (omitted here).
On 1/26/07, lalitha viswanath [EMAIL
Hi
I read in my dataset using
dt <- read.table(filename)
calling unique(levels(dt$genome1)) yields the
following
 [1] "aero"   "aful"   "aquae"  "atum_D"
 [5] "bbur"   "bhal"   "bmel"   "bsub"
 [9] "buch"   "cace"   "ccre"   "cglu"
[13] "cjej"   "cper"   "cpneuA" "cpneuC"
[17] "cpneuJ"
Check
?read.table
and add as.is=TRUE to the call. Strings are then read as character
vectors rather than factors, which avoids the factor-level issue.
Then repeat your work.
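As a rough illustration of the difference, the sketch below reads the same small table twice. The inline text, the default column names V1 to V4, and the filter `V4 < -5` are all made up for the example. `stringsAsFactors = TRUE` is passed explicitly to mimic the older default of read.table (since R 4.0.0 the default is FALSE).

```r
# Hypothetical inline data standing in for the poster's file
txt <- "aero aful 0.1 -10
aful bbur 0.2 -2
bbur bsub 0.3 -7"

# Strings converted to factors, as older versions of R did by default
as_factor <- read.table(text = txt, stringsAsFactors = TRUE)

# as.is = TRUE keeps strings as plain character vectors
as_char <- read.table(text = txt, as.is = TRUE)

# After subsetting, the factor column still remembers all three levels,
# while the character column has no levels at all
levels_after_subset <- levels(subset(as_factor, V4 < -5)$V1)
values_after_subset <- unique(subset(as_char, V4 < -5)$V1)
```

With the character version, unique() reports only the genomes that survive the filter.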
For example
x0 <- read.table("~/Documents/tox/noodles/four_sheets_orig/reg_r2.txt",
sep="\t", nrows=10)
str(x0,1)
`data.frame': 10 obs. of 7 variables:
$
Oh, I forgot: you can also convert a factor into character strings like
dataset$genome1 <- as.character(dataset$genome1)
so you don't have to use
as.numeric(dataset$score) if you use as.is=T when you call read.table
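A small sketch of these conversions, using made-up vectors rather than the poster's columns. It also shows a common pitfall: calling as.numeric() directly on a factor returns the internal level codes, so a factor that holds numbers is usually converted through as.character() first.

```r
# Hypothetical factor columns for illustration
genome_f <- factor(c("aero", "aful", "aero"))
score_f  <- factor(c("-10", "-2", "-7"))

# Factor to character: the level information is dropped
genome_chr <- as.character(genome_f)

# as.numeric() on a factor gives integer level codes, not the original numbers
codes <- as.numeric(score_f)

# Going through as.character() recovers the actual values
scores <- as.numeric(as.character(score_f))
```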
HTH,
weiwei
On 1/26/07, Weiwei Shi [EMAIL PROTECTED] wrote:
check
?read.table
and add as.is=T
Hi
I am new to R programming and am using subset to
extract part of a data as follows
names(dataset) =
c("genome1","genome2","dist","score");
prunedrelatives <- subset(dataset, score < -5);
However when I use unique to find the number of unique
genomes now present in prunedrelatives I get results
Hi,
Even if you removed many rows by requiring score < -5, that does not
necessarily change the set of unique genome1 values.
To check this, you can do something like
p0 <- unique(dataset[dataset$score < -5, "genome1"]) # same as subset
p1 <- unique(dataset[dataset$score >= -5, "genome1"])
setdiff(p1, p0)
setdiff(p1, p0)
if the output above
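The check described above can be tried on a small made-up data frame. This is only a sketch: the values are hypothetical, the column is kept as character, and the filter is assumed to be `score < -5` with `score >= -5` as its complement.

```r
# Toy data (hypothetical): every genome has at least one row below -5
dataset <- data.frame(
  genome1 = c("aero", "aful", "bbur", "aero", "aful"),
  score   = c(-10, -8, -7, -1, -2),
  stringsAsFactors = FALSE
)

# Genomes present in the rows that are kept
p0 <- unique(dataset[dataset$score <  -5, "genome1"])
# Genomes present in the rows that are removed
p1 <- unique(dataset[dataset$score >= -5, "genome1"])

# Genomes that occur only in removed rows, i.e. genomes lost by pruning
lost <- setdiff(p1, p0)

n_before <- length(unique(dataset$genome1))
n_after  <- length(p0)
```

Here `lost` is empty, so the number of unique genomes is the same before and after pruning even though two rows were dropped.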