The first problem is that you are passing a character string as the first 
argument to agnes().

The help information for agnes says that its first argument, x, is


       x: data matrix or data frame, or dissimilarity matrix, depending
          on the value of the 'diss' argument.

Not a character string.  So first you have to read your data into R and hold it 
as a "data matrix or data frame".  Then you have a choice.  Either you can 
calculate your own distance matrix with it and then call agnes() with that as 
the first argument (and with diss = TRUE) or you can get agnes() to calculate 
the distance matrix for you, in which case you need to specify how, using the 
metric = argument.
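
As a rough sketch of the first route (supplying your own distance matrix), 
the following reads a labelled lower-triangular file into R, converts it to a 
"dist" object and passes that to agnes().  It assumes the real file has one 
row per line, with the label first and the distances after it, separated by 
white space.  The temporary file and the names txt, path, rows, m, d and fit 
are only for illustration; the ten rows are the ones quoted in the original 
post below.

```r
library(cluster)

# The ten rows from the original post, written to a temporary file so the
# example is self-contained (assumption: the real D10.dist has this layout).
txt <- c(
  "D1 0",
  "D2 0.608392 0",
  "D3 0.497451 0.537662 0",
  "D4 0.634548 0.393343 0.537426 0",
  "D5 0.558785 0.543399 0.632221 0.726633 0",
  "D6 0.659483 0.701778 0.741425 0.668624 0.655914 0",
  "D7 0.603012 0.659173 0.571776 0.687599 0.383712 0.683948 0",
  "D8 0.611919 0.665357 0.526453 0.715093 0.457496 0.698213 0.317039 0",
  "D9 0.41501 0.652117 0.552011 0.68969 0.485988 0.702738 0.42819 0.442598 0",
  "D10 0.376512 0.600607 0.517857 0.673515 0.530421 0.667736 0.537025 0.48062 0.240559 0"
)
path <- tempfile(fileext = ".dist")
writeLines(txt, path)

# Split each line on white space: the first field is the label, the rest
# are that row of the lower triangle (including the zero on the diagonal).
rows   <- strsplit(trimws(readLines(path)), "[[:space:]]+")
labels <- vapply(rows, `[`, character(1), 1)
n      <- length(rows)

# Fill the lower triangle of a square matrix, row by row.
m <- matrix(0, n, n, dimnames = list(labels, labels))
for (i in seq_len(n)) {
  vals <- as.numeric(rows[[i]][-1])
  m[i, seq_along(vals)] <- vals
}

# as.dist() takes the lower triangle of a square matrix.
d <- as.dist(m)

# Pass the object itself, not a file name; metric is not needed when
# diss = TRUE because no distances are being computed.
fit <- agnes(d, diss = TRUE, method = "average")
```

From there, plot(fit, which.plots = 2) should draw the dendrogram, and 
cutree(as.hclust(fit), k = 3) should give cluster memberships for, say, three 
clusters.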

With 10000 entities to cluster, your distance matrix will require

> 10000*9999/2
[1] 49995000

numbers to be stored at once.  I hope you are using a 64-bit OS!
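
To put a rough figure on that, here is a back-of-the-envelope calculation.  
It assumes each distance is held as an 8-byte double, which is how R stores 
ordinary numeric values, and it counts only a single copy of the distances; 
any working copies made along the way would add to it.  The variable names 
are just for illustration.

```r
n       <- 10000
n_pairs <- n * (n - 1) / 2   # number of pairwise distances
bytes   <- n_pairs * 8       # assuming 8 bytes per double
mib     <- bytes / 2^20      # size of one copy in MiB, a little over 381
mib
```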

With such large numbers of entities to cluster, the usual advice is to try 
something more suited to the job.  clara() is designed for this kind of problem.
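
A minimal sketch of what a clara() call might look like follows.  Note that 
clara() works from the raw data matrix or data frame rather than from a 
precomputed distance matrix, so it needs the original measurements.  The data 
here are simulated stand-ins (10000 rows, 5 numeric columns), and the choices 
k = 4 and samples = 20 are arbitrary assumptions for illustration only.

```r
library(cluster)

set.seed(1)

# Hypothetical stand-in for the real data: 10000 "species", 5 measurements.
x <- matrix(rnorm(10000 * 5), ncol = 5)

# Partition into k clusters around medoids, working on subsamples of the data.
fit <- clara(x, k = 4, metric = "euclidean", samples = 20)

# Cluster sizes, and per-cluster summaries of dissimilarity to the medoid.
sizes <- table(fit$clustering)
info  <- fit$clusinfo
```

A plot coloured by cluster can then be had with something like 
plot(x[, 1:2], col = fit$clustering), and fit$medoids holds the representative 
observation for each cluster.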

It might be useful to keep in mind that R is not a package.  (Repeat: R is NOT 
a package - I cannot stress that strongly enough.)  It is a programming 
language. To use it effectively you really need to know something about how it 
works, first.  It might pay you to spend a little time getting used to the 
protocols, how to do simple things in R like reading in data and manipulating 
it, before you tackle such a large and potentially tricky clustering problem.

Bill Venables. 

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Karen R. Khar
Sent: Monday, 27 June 2011 5:44 PM
To: r-help@r-project.org
Subject: [R] New to R, trying to use agnes, but can't load my ditance matrix

Hi,

I'm mighty new to R. I'm using it on Windows. I'm trying to cluster using a
distance matrix I created from the data on my own and called it D10.dist. I
loaded the cluster package. Then tried the following command...

> agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE,
> method = "average", par.method, keep.diss = n < 1000, keep.data = !diss)

And it responded...

Error in agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand =
FALSE,  : 
  x is not and cannot be converted to class dissimilarity

D10.dist has the following data...

D1      0
D2      0.608392  0
D3      0.497451  0.537662  0
D4      0.634548  0.393343  0.537426  0
D5      0.558785  0.543399  0.632221  0.726633  0
D6      0.659483  0.701778  0.741425  0.668624  0.655914  0
D7      0.603012  0.659173  0.571776  0.687599  0.383712  0.683948  0
D8      0.611919  0.665357  0.526453  0.715093  0.457496  0.698213  0.317039  0
D9      0.41501   0.652117  0.552011  0.68969   0.485988  0.702738  0.42819   0.442598  0
D10     0.376512  0.600607  0.517857  0.673515  0.530421  0.667736  0.537025  0.48062   0.240559  0

I would appreciate any suggestions. Please assume I know virtually nothing
about R.

Thanks,
Karen

PS I'll eventually be using ~10,000 "species" to cluster. I'll need to have
within and between cluster distance info and I'll want a plot colored by
cluster. Is agnes the right R tool to use?

--
View this message in context: 
http://r.789695.n4.nabble.com/New-to-R-trying-to-use-agnes-but-can-t-load-my-ditance-matrix-tp3627154p3627154.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
