Hi Sandra,

I am currently working on probabilistic classification methods. It sounds as though your dataset could also be analysed with methods of this kind. If you are willing, I would be happy to try analysing it. The methods I use, especially the one based on the Minimum Message Length (MML) principle, are very good at analysing datasets with highly overlapping groups, and all of them work in an unsupervised way.

Kind Regards,

-- 
Yudi Agusta
PhD candidate in Computer Science
School of CSSE, Monash University,
Victoria, 3800 Australia.
Telp: +61-3-99055190

On Sat, 20 Mar 2004 00:15, you wrote:
> Hi,
>
> I have performed a cluster analysis on a medical dataset consisting of 100
> children measured on 4 variables.
>
> The dendrograms suggested there were three groups, so I did a k-means
> clustering with k=3. I didn't set the initial centroids of the k-means to
> the centres of the hierarchical clustering, and the two types of clustering
> did not repeat the same partition. Arnold's test for cluster proved to be
> non significant. YET, I managed to find two groups of children who had a
> very different profile on the 4 variables clustered and a similar response
> on a 5th variable, which was very surprising.
>
> Now, I understand I haven't identified 3 groups of very different children,
> everything so far suggests there are no sharply differing groups. I cannot
> make any inferences from my sample, obviously. But could I say I have found
> some sort of multivariate thresholds on the basis of the matrix of
> distances, which allow me to gain a certain insight into the data? Or is
> it just all a big fluke, not worth the paper it's written on?!
>
> I welcome any comments/suggestions. I am only new to the topic (and the
> list), but I am keen to learn!
>
> Thanks for your time so far
>
> Sandra
>
> Sandra Alba
> University Medicine - Level 7
> Derriford Hospital
> Plymouth
> PL6 8DH
> UK
