In article <[EMAIL PROTECTED]>, Richard Ulrich <[EMAIL PROTECTED]> writes
>Here is an opinion, which I wonder if there is much
>objection to --
>Clustering by computer is a moderately useless enterprise
>even in moderately skilled hands.
>

I think there are two situations where clustering is sensible.

1. You have some data sets which have already been clustered (perhaps by
a person) and whose clusterings are considered correct. You can then look
for an automatic way of achieving similar results on these data sets by
trying different algorithms, then cross your fingers and hope the chosen
algorithm does well on new but 'similar' data sets.

2. You have some end-goal, some reason for clustering, which you can
express in terms of a function (of a clustering) which is to be minimised.
For example, you might want to compress some data by replacing points by
their nearest cluster centres, minimising a combination of the size of the
data and some measure of how badly the centres approximate the points. Or
you might aim to improve the accuracy of a classifier by clustering within
individual classes.

I am not sure whether I am agreeing or disagreeing with your opinion. I
think you (and probably the OP) are talking about using clustering as some
kind of data exploration or visualisation tool. In that context, I agree
with you.

I would be interested to know if anyone thinks there is a good reason to
use a clustering algorithm besides the two above.

-- 
Graham Jones
http://www.visiv.co.uk
Emails to [EMAIL PROTECTED] may be deleted as spam
Please add a j just before the @ to ensure delivery
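To make situation 2 concrete, here is a rough sketch of the compression example: each point is replaced by its nearest cluster centre, and the number of centres is chosen to minimise "size plus distortion". Everything specific in it is an assumption made for illustration and not something stated in the post: plain Lloyd-style k-means as the clustering method, squared Euclidean distance as the measure of how badly the centres approximate the points, a hypothetical weight cost_per_centre standing in for the size of the data, and all the function names.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centres):
    """Index of the centre closest to point."""
    return min(range(len(centres)), key=lambda i: sq_dist(point, centres[i]))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iteration; may stop in a local optimum."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # k distinct positions in the list
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centres)].append(p)
        # Move each centre to the mean of its group; keep it if the group is empty.
        new_centres = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centres[i]
            for i, g in enumerate(groups)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres

def distortion(points, centres):
    """Total squared error when every point is replaced by its nearest centre."""
    return sum(sq_dist(p, centres[nearest(p, centres)]) for p in points)

def cost(points, centres, cost_per_centre):
    """The function to be minimised: storage term plus approximation error."""
    return cost_per_centre * len(centres) + distortion(points, centres)

def best_clustering(points, max_k, cost_per_centre):
    """Try k = 1..max_k and keep the clustering with the lowest cost."""
    candidates = [kmeans(points, k) for k in range(1, max_k + 1)]
    return min(candidates, key=lambda c: cost(points, c, cost_per_centre))

# Toy data: three loose groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
best = best_clustering(points, max_k=6, cost_per_centre=5.0)
print(len(best), cost(points, best, 5.0))
```

The point of the sketch is only that, once the goal is written down as cost(), the question "is this a good clustering?" has a definite answer, and different algorithms or values of k can be compared on it. How to weigh size against distortion is a modelling decision that the code cannot make for you.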
