Let's do a simple calculation: the dist object from a data set with 80000 cases would have 80000 * (80000 - 1) / 2 = 3,199,960,000 elements, each taking 8 bytes to store in double precision. That's about 25.6 GB (roughly 23.8 GiB), if my arithmetic isn't too flaky. You'd have a devil of a time trying to do this even on a 64-bit machine with 32 GB of RAM, let alone on what you are using.

You'd have a much better chance sticking with algorithms that do not require storing the (dis)similarity matrix.

Andy

From: Markus Preisetanz
> Dear R Specialists,
>
> when trying to cluster a data.frame with about 80,000 rows
> and 25 columns I get the above error message. I tried hclust
> (using dist), agnes (entering the data.frame directly) and
> pam (entering the data.frame directly). What I actually do
> not want to do is generate a random sample from the data.
>
> The machine I run R on is a Windows 2000 Server (Pentium 4)
> with 2 GB of RAM.
>
> Does anybody know what to do?
>
> Sincerely
> Markus Preisetanz
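
The arithmetic in the reply can be checked with a few lines of R. This is only an illustrative sketch: the helper names dist_elements and dist_bytes are made up here, the data are simulated, and kmeans() is just one example of a method that never builds the full dissimilarity matrix. It is not something recommended in the thread itself.

```r
# Number of pairwise dissimilarities a dist object holds for n cases
# (hypothetical helper, not part of base R)
dist_elements <- function(n) n * (n - 1) / 2

# Bytes needed for the values alone, at 8 bytes per double
# (hypothetical helper; ignores the small overhead of object attributes)
dist_bytes <- function(n) dist_elements(n) * 8

n <- 80000                     # stored as a double, so no integer overflow
elements <- dist_elements(n)
bytes <- dist_bytes(n)
cat(sprintf("%.0f elements, %.1f GB (%.1f GiB)\n",
            elements, bytes / 1e9, bytes / 2^30))
# prints: 3199960000 elements, 25.6 GB (23.8 GiB)

# One example of a method that works on the data matrix directly
# instead of on a dist object: k-means from the stats package.
# The data, the 25 columns and the choice of 3 centers are assumptions
# made purely for illustration.
set.seed(1)
x <- matrix(rnorm(2000 * 25), ncol = 25)
fit <- kmeans(x, centers = 3, iter.max = 50, nstart = 5)
table(fit$cluster)
```

Memory for kmeans() grows with the number of rows times the number of columns rather than with the square of the number of rows, which is why it stays feasible where dist() does not. Whether k-means suits the data is a separate question.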