[R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread Meng Wu
Hi, all

 I would like to use R to perform k-means clustering on my data, which
include 33 samples measured on ~1000 variables. I have already used the
kmeans() function for this analysis, and it showed that there are 4 clusters
in my data. However, it is really difficult to plot the clusters in 2-D
because of the huge number of variables. One possible way is to project the
multidimensional space onto a 2-D plane, but I could not find a good way
to do that. Any suggestions or comments would be really helpful!

Thanks,

Meng


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread David Cross
I wonder if it makes sense to reduce the dimensionality of the variables 
somehow?
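
One common way to do this is principal component analysis (PCA); that choice is an assumption here, since the message does not name a method. A minimal sketch on simulated data follows (the names dat, group, km and scores are made up for illustration, and the data are random numbers, not the poster's):

```r
# Simulated stand-in for the real data: 33 samples (rows) x 1000 variables
# (columns). Each of four groups is shifted by a different offset so that
# there is some cluster structure to find.
set.seed(1)
n_samples <- 33
n_vars <- 1000
group <- rep(1:4, length.out = n_samples)
dat <- matrix(rnorm(n_samples * n_vars), nrow = n_samples) + 0.5 * group

# k-means on the samples (rows), asking for 4 clusters
km <- kmeans(dat, centers = 4, nstart = 20)

# PCA on the samples; the scores on the first two principal components
# give a 2-D view of the 1000-dimensional data
pca <- prcomp(dat, center = TRUE)
scores <- pca$x[, 1:2]

# Colour each sample by its k-means cluster
plot(scores, col = km$cluster, pch = 19, xlab = "PC1", ylab = "PC2")
```

Whether the variables should also be scaled (scale. = TRUE in prcomp()) depends on what they measure.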

David Cross
d.cr...@tcu.edu
www.davidcross.us






Re: [R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread Peter Langfelder
You could use multidimensional scaling, function cmdscale(), to
produce a 2-dimensional representation of your data, then plot it
using colors that correspond to the clusters.

For example, suppose your data are stored in a matrix X (1000 x 33, that
is, variables in rows and samples in columns). I assume you clustered the
samples, not the variables, so you have a vector label of length 33 with
values between 1 and 4. Since k-means uses Euclidean distance, you would
first re-create the distance matrix between the samples:

dst = dist(t(X))

then feed it into cmdscale():

mds = cmdscale(dst)

and then plot it:

plot(mds, col = label)
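
For anyone who wants to try the steps above end to end, here is a self-contained sketch. The data are simulated (random group centers plus noise), so X, true_group and centers are placeholders and not the original poster's data; X uses the same 1000 x 33 layout as above.

```r
set.seed(42)
n_vars <- 1000
n_samples <- 33

# Hypothetical data: variables in rows, samples in columns.
# Each sample is a random group center plus noise.
true_group <- rep(1:4, length.out = n_samples)
centers <- matrix(rnorm(n_vars * 4), nrow = n_vars)   # one column per group
X <- centers[, true_group] + matrix(rnorm(n_vars * n_samples), nrow = n_vars)

# kmeans() clusters the rows of its input, so transpose to cluster samples
km <- kmeans(t(X), centers = 4, nstart = 20)
label <- km$cluster

# Euclidean distances between samples, then classical MDS down to 2 dimensions
dst <- dist(t(X))
mds <- cmdscale(dst, k = 2)

# One point per sample, coloured by cluster
plot(mds, col = label, pch = 19, xlab = "MDS 1", ylab = "MDS 2")
legend("topright", legend = paste("cluster", 1:4), col = 1:4, pch = 19)
```

With Euclidean distances, classical MDS gives the same configuration as the PCA scores (up to sign), so the two views should agree.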

HTH,

Peter



Re: [R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread Claudia Beleites

Hi Meng,


For suggestions, it would be extremely helpful if you told us what kind of
variables your 1000 variables are.


Parallel coordinate plots show the values over (many) variables. Whether
this is useful depends very much on your variables. For example, I have
spectral channels: they have an intrinsic order, and the values have
physically the same meaning (and almost the same range), so the parallel
coordinate plot comes naturally (in fact, it reproduces the spectra).
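
A minimal sketch of such a plot with base R's matplot(), on simulated "spectra" (the band positions, noise level and all variable names are invented for illustration):

```r
set.seed(7)
n_samples <- 33
n_vars <- 1000
channel <- seq_len(n_vars)

# Hypothetical spectra: each group has one Gaussian band at a different
# position, plus a little noise. Result is 33 samples x 1000 channels.
group <- rep(1:4, length.out = n_samples)
band_pos <- c(200, 400, 600, 800)
spectra <- t(sapply(group, function(g) {
  exp(-((channel - band_pos[g]) / 40)^2) + rnorm(n_vars, sd = 0.05)
}))

km <- kmeans(spectra, centers = 4, nstart = 20)

# matplot() draws one line per column, so transpose to get one line per
# sample; each line is coloured by the sample's cluster
matplot(channel, t(spectra), type = "l", lty = 1, col = km$cluster,
        xlab = "variable (channel index)", ylab = "value")
```

If the variables have very different ranges, parcoord() in the MASS package is an alternative; it rescales each variable separately before plotting.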


Claudia







--
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.belei...@ipht-jena.de
phone: +49 3641 206-133
fax:   +49 2641 206-399



Re: [R] Help with 2-D plot of k-mean clustering analysis

2011-05-18 Thread Albyn Jones
One idea: pick the three largest clusters; their centers determine a plane.
Project your data onto that plane.
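
A rough sketch of that projection, assuming the three centers are not collinear. The data are simulated and the names (dat, ctr, basis, proj) are made up for illustration:

```r
set.seed(3)
n_samples <- 33
n_vars <- 1000

# Hypothetical data: 33 samples (rows) x 1000 variables, four random centers
group <- rep(1:4, length.out = n_samples)
centers_true <- matrix(rnorm(4 * n_vars), nrow = 4)
dat <- centers_true[group, ] + matrix(rnorm(n_samples * n_vars), nrow = n_samples)

km <- kmeans(dat, centers = 4, nstart = 20)

# Centers of the three largest clusters (3 x n_vars)
top3 <- order(km$size, decreasing = TRUE)[1:3]
ctr <- km$centers[top3, ]

# The plane passes through ctr[1, ] and is spanned by the two difference
# vectors; QR decomposition turns them into an orthonormal basis
origin <- ctr[1, ]
dirs <- cbind(ctr[2, ] - origin, ctr[3, ] - origin)   # n_vars x 2
basis <- qr.Q(qr(dirs))                               # orthonormal columns

# Orthogonal projection: coordinates of every sample within the plane
proj <- sweep(dat, 2, origin) %*% basis               # 33 x 2

plot(proj, col = km$cluster, pch = 19,
     xlab = "plane axis 1", ylab = "plane axis 2")
```

The fourth cluster is not used to define the plane, so how well it separates from the others in this view depends on the data.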

albyn


-- 
Albyn Jones
Reed College
jo...@reed.edu
