On Wed, 29 Mar 2006, Sean Davis wrote:
We have to be careful here. Classification (which is the terminology that
the original poster used) is NOT the same as clustering, although the two
are often confused.
Well, in one of its two English senses it is the same. From a recent talk
of mine (GfKL30), quoting the Concise Oxford Dictionary:
\emph{Classification} has two senses:
\begin{itemize}
\item `to arrange in classes or categories'
\item `assign (a thing) to a class or category'
\end{itemize}
There is a community (q.v. the International Federation of Classification
Societies and Journal of Classification as well as the entry in the
original Encyclopedia of Statistical Sciences) that means (almost)
entirely the first sense.
To add to this, the words corresponding to classification in, e.g., French
or German have (I am told) different shades of meaning.
If the original poster wants to do clustering and
examine the results for the presence of three clusters, that is fine and
there are many methods for clustering that could be used. However,
classification will require a different set of tools. If the clustering
tools already pointed out are not doing what is needed (that is, that Cao
actually is interested in clustering and not classification), then perhaps a
further explanation of what the problem is would help clarify.
Yes, further explanation would help.
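To make the two senses concrete, here is a minimal sketch in R. It is only an illustration under assumptions that are not in the original posts: the built-in iris data stands in for the poster's matrix, kmeans() represents the first sense (arranging into classes), and lda() from MASS represents the second (assigning to a known class). Other methods would serve equally well.

```r
library(MASS)  # lda(); MASS is a recommended package shipped with standard R

set.seed(1)
x <- as.matrix(iris[, 1:4])  # assumed stand-in for the N x M matrix
labels <- iris$Species       # known classes, used only for sense 2

# Sense 1: arrange in classes (clustering). No labels are used.
km <- kmeans(x, centers = 3, nstart = 20)
groups <- km$cluster

# Sense 2: assign to a class (supervised classification).
# This needs training cases whose classes are already known.
train <- sample(nrow(x), 100)
fit <- lda(x[train, ], grouping = labels[train])
pred <- predict(fit, x[-train, ])$class

# How the clusters line up with the known species
table(cluster = groups, species = labels)
# Held-out predictions against the true classes
table(predicted = pred, actual = labels[-train])
```

The first table needs no labels to produce the grouping (they are used only to inspect it afterwards), while the second cannot be produced at all without labelled training data.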
Sean
On 3/29/06 1:46 AM, "Jacques VESLOT" <[EMAIL PROTECTED]> wrote:
try this (suppose mat is your matrix):
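library(cluster)  # provides pam() and daisy()
# hierarchical clustering on Manhattan distances, Ward's method
# ("ward" was renamed "ward.D" in R >= 3.1.0)
hc <- hclust(dist(mat, "manhattan"), "ward.D")
plot(hc, hang = -1)
(x <- identify(hc))  # click to pick clusters; right-click to stop
cutree(hc, 3)        # cut the tree into 3 groups
# k-means with 3 centres
km <- kmeans(mat, 3)
km$cluster
km$centers
# partitioning around medoids on Manhattan dissimilarities
pam(daisy(mat, metric = "manhattan"), k = 3, diss = TRUE)$clustering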
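One way to judge whether these three methods agree is to cross-tabulate their group memberships. The sketch below is only an illustration: the simulated matrix mat and the object names hc_groups, km_groups and pam_groups are assumptions, not part of the original suggestion.

```r
library(cluster)  # pam()

set.seed(42)
# Hypothetical stand-in data: 60 rows, 4 columns, three shifted groups
mat <- rbind(matrix(rnorm(80, mean = 0), ncol = 4),
             matrix(rnorm(80, mean = 4), ncol = 4),
             matrix(rnorm(80, mean = 8), ncol = 4))

d <- dist(mat, method = "manhattan")
hc_groups  <- cutree(hclust(d, method = "ward.D"), k = 3)
km_groups  <- kmeans(mat, centers = 3, nstart = 20)$cluster
pam_groups <- pam(d, k = 3, diss = TRUE)$clustering

# Cluster numbers are arbitrary labels, so compare the partitions
# by cross-tabulation and not by testing the labels for equality
table(hclust = hc_groups, kmeans = km_groups)
table(hclust = hc_groups, pam = pam_groups)
```

If the methods agree, each row of a table has a single non-zero cell; scattered counts suggest the three-group structure is not clear-cut.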
Baoqiang Cao wrote:
Thanks!
I tried kmeans, but the results are not very positive. Anyway, thanks Jacques!
Please let me know if you have any other thoughts!
Best regards,
Baoqiang Cao
======= At 2006-03-29, 00:08:44 you wrote: =======
if you want to classify rows or columns, read:
?hclust
?kmeans
library(cluster)
?pam
Baoqiang Cao wrote:
Dear All,
I have a data set; suppose it is an N*M matrix. All I want is to classify
it into, let's say, 3 classes. Which method(s) do you think is (are)
appropriate for this purpose? Any references will be welcome! Thanks!
Best,
Baoqiang Cao
------------------------------------------------------------------------
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
.
= = = = = = = = = = = = = = = = = = = =
Baoqiang Cao
[EMAIL PROTECTED]
2006-03-29
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595