Dear Prof. Rohlf, I have read your paper on the minimum spanning tree algorithm and I think I've got your program to run. For testing I did use the Birth and Dead Rate data in Hartigan's book. The Yuval grid did run fine, but it looks like that I encounter an infinite loop with the Rabin grid. Maybe I did specify something wrong. I will try to trace this today. I think the LIMLOT in the Common block cannot be larger than 2999. Do you have any idea how large NDIMGR and LIMH normally should be?
But back to my original problem: First I would have to modify the algorithm from searching for the smallest to searching for the largest distances. Then I would need to look into the graph for those K objects, the number K is user specified, which have no small interpoint distances. That brings me to the question: Does the minimum spanning tree really find a cluster of K cases which have all small interpoint distances? In my applications the total number of cases N woiuld be in the thousands but the number K of selected cases should be somehow between 100 and 1000. Then the K times K matrix of distances between the selected objects should have no small interpoint distance (off-diagonal entry). Wolfgang ----- Original Message ----- From: F. James Rohlf <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Wednesday, April 17, 2002 4:23 PM Subject: Re: Name and Reference of Algorithm > I am not sure if is exactly the reference you need, but the following may be > of interest because it shows one method to avoid having to compute an entire > distance matrix. It was intended for quickly finding near neighbors but it > should also be useful for finding distant neighbors (I looked for points in > the same partition whereas you would be interested in points not in the same > partition). Main limitation for my application was that it works best for > large numbers of points in very low-dimensional spaces. > > Rohlf, F. J. 1977. A probabalistic minimum spanning tree algorithm. > Information Proc. Letters, 7:44-48. > > > -----Original Message----- > > From: Classification, clustering, and phylogeny estimation > > [mailto:[EMAIL PROTECTED]]On Behalf Of Wolfgang M. Martmann > > Sent: Wednesday, April 17, 2002 9:45 AM > > To: [EMAIL PROTECTED] > > Subject: Name and Reference of Algorithm > > > > > > Hi: > > For some time I'm working on a problem of sampling a set of K > > observations (cases) from a large data set with N >> K cases so that > > the selected observations are as "different as possible". In more > > mathematical terms, I'm interested in locating those K cases which > > will result in a (not necessarily Euclidean) distance matrix in which > > the smallest off-diagonal entry d_ij is as large as possible. > > > > I have developed an algorithm which seems to work very well and > > generates sets which are either optimal or close to optimality without > > computing the entire distance matrix. However, I'm thinking more > > and more that this maybe a known problem to people who work in > > Cluster Analysis, MDS, or classification. I wonder if anybody on > > this list could point me to some references about this search > > problem. > > > > Thanks, Wolfgang Hartmann > >
