The first problem is that you are using a character string as the first argument to agnes().
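A fix for this first problem can be sketched as follows. This is only a minimal illustration, under some assumptions: the lower-triangular layout is the one shown for D10.dist in the original message, the character vector `dist_lines` stands in for the real file (whose exact path is not known here), and the names `tab`, `m`, `d`, `ag` and `groups`, as well as the choice of 3 groups, are made up for the example.

```r
library(cluster)  # provides agnes()

# Stand-in for the contents of D10.dist: one row per item, lower triangle only.
# For a real file, pass its path as the first argument to read.table() instead
# of using `text = dist_lines`.
dist_lines <- c(
  "D1 0",
  "D2 0.608392 0",
  "D3 0.497451 0.537662 0",
  "D4 0.634548 0.393343 0.537426 0",
  "D5 0.558785 0.543399 0.632221 0.726633 0",
  "D6 0.659483 0.701778 0.741425 0.668624 0.655914 0",
  "D7 0.603012 0.659173 0.571776 0.687599 0.383712 0.683948 0",
  "D8 0.611919 0.665357 0.526453 0.715093 0.457496 0.698213 0.317039 0",
  "D9 0.41501 0.652117 0.552011 0.68969 0.485988 0.702738 0.42819 0.442598 0",
  "D10 0.376512 0.600607 0.517857 0.673515 0.530421 0.667736 0.537025 0.48062 0.240559 0"
)
n <- length(dist_lines)

# Rows have unequal length, so pad the short ones with NA (fill = TRUE) and
# fix the number of columns up front via col.names.
tab <- read.table(text = dist_lines, fill = TRUE, row.names = 1,
                  col.names = c("id", paste0("V", seq_len(n))))

# as.dist() only reads the lower triangle, so the NA padding above the
# diagonal is ignored. The row names become the labels.
m <- as.matrix(tab)
d <- as.dist(m)

# Now the first argument is a dissimilarity object, not a file name.
ag <- agnes(d, diss = TRUE, method = "average")

# Cut the tree into an arbitrary number of groups (3 here) for illustration.
groups <- cutree(as.hclust(ag), k = 3)
print(groups)
```

The point is that agnes() never reads files itself: the data must already be in R, as a "dist" object here, before the call.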
The help information for agnes says that its first argument, x, is

    x: data matrix or data frame, or dissimilarity matrix, depending on
       the value of the 'diss' argument.

Not a character string.

So first you have to read your data into R and hold it as a "data matrix or data frame". Then you have a choice. Either you can calculate your own distance matrix with it and then call agnes() with that as the first argument (and with diss = TRUE) or you can get agnes() to calculate the distance matrix for you, in which case you need to specify how, using the metric = argument.

With 10000 entities to cluster, your distance matrix will require

> 10000*9999/2
[1] 49995000

numbers to be stored at once. I hope you are using a 64-bit OS!

With such large numbers of entities to cluster, the usual advice is to try something more suited to the job. clara() is designed for this kind of problem.

It might be useful to keep in mind that R is not a package. (Repeat: R is NOT a package - I cannot stress that strongly enough.) It is a programming language. To use it effectively you really need to know something about how it works, first. It might pay you to spend a little time getting used to the protocols, how to do simple things in R like reading in data and manipulating it, before you tackle such a large and potentially tricky clustering problem.

Bill Venables.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Karen R. Khar
Sent: Monday, 27 June 2011 5:44 PM
To: r-help@r-project.org
Subject: [R] New to R, trying to use agnes, but can't load my ditance matrix

Hi,

I'm mighty new to R. I'm using it on Windows. I'm trying to cluster using a distance matrix I created from the data on my own and called it D10.dist. I loaded the cluster package. Then tried the following command...
> agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE,
    method = "average", par.method, keep.diss = n < 1000, keep.data = !diss)

And it responded...

Error in agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE, :
  x is not and cannot be converted to class dissimilarity

D10.dist has the following data...

D1 0
D2 0.608392 0
D3 0.497451 0.537662 0
D4 0.634548 0.393343 0.537426 0
D5 0.558785 0.543399 0.632221 0.726633 0
D6 0.659483 0.701778 0.741425 0.668624 0.655914 0
D7 0.603012 0.659173 0.571776 0.687599 0.383712 0.683948 0
D8 0.611919 0.665357 0.526453 0.715093 0.457496 0.698213 0.317039 0
D9 0.41501 0.652117 0.552011 0.68969 0.485988 0.702738 0.42819 0.442598 0
D10 0.376512 0.600607 0.517857 0.673515 0.530421 0.667736 0.537025 0.48062 0.240559 0

I would appreciate any suggestions. Please assume I know virtually nothing about R.

Thanks,
Karen

PS I'll eventually be using ~10,000 "species" to cluster. I'll need to have within and between cluster distance info and I'll want a plot colored by cluster. Is agnes the right R tool to use?

--
View this message in context: http://r.789695.n4.nabble.com/New-to-R-trying-to-use-agnes-but-can-t-load-my-ditance-matrix-tp3627154p3627154.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.