Re: [Scikit-learn-general] Cluster Terms Output

Andreas Mueller Fri, 01 Feb 2013 14:21:53 -0800

On 02/01/2013 10:15 PM, Vinay B, wrote:
> I thought I'd clarify this question in a separate thread
>
> Each individual cluster is usually associated with a set of
> significant terms. For example, a Mahout k-means clustering run on
> the Reuters-21578 dataset yields output like this:
>
>
> :VL-21566{n=2 c=[1,000:2.589, 1.9:2.974, 10:2.289, 14:1.568, 16:2.000,
> 19:1.526, 1986:2.796, 20:1.450
>          Top Terms:
>                  smithkline      => 19.37364673614502
>                  kline           => 14.453418731689453
>                  beckman         => 10.067719459533691
>                  smith           => 9.676518440246582
>                  pharmaceutical  => 9.275004863739014
>                  tianjin         => 9.033519744873047
>                  skb             => 8.494523048400879
>                  tagamet         => 8.30789852142334
>                  laboratories    => 7.682400465011597
>                  allergan        => 6.986793041229248
>                  antiulcer       => 6.986793041229248
>                  venture         => 6.036190032958984
>                  french          => 5.92253041267395
>                  skin            => 5.897533416748047
>                  joint           => 5.545732021331787
>                  testing         => 5.528974533081055
>                  eye             => 5.433120250701904
>                  plant           => 5.26419734954834
>                  capsules        => 4.940408706665039
>                  521.1           => 4.940408706665039
>          Weight : [props - optional]:  Point:
>          1.0: [1,000:5.177, 1.9:5.949, 10:4.578, 16:4.001, 1986:5.592,
> 20:2.900, 24:3.633, 25:3.448, 3:1.119, 3.6:6.127, 373:9.593,
> 433:9.034, 50:3.677, 52.05:9.593, 521.1:9.881, 6.78:9.370,
> about:2.838, achieve:6.127, acquisitions:5.550,
>          1.0: [14:3.135, 19:3.051, 200:4.899, 3:1.119, 30:3.104,
> 56.94:8.900, 8.5:6.537, beckman:11.795, billion:2.839,
> capability:7.647, capsules:9.881, chemical:5.396, china:7.466,
> co:2.865, combines:8.146, company:2.449, corp:2.432, dlr
> :VL-21565{n=2 c=[00:2.340, 1:1.459, 1.66:7.721, 10:1.869, 10.20:4.594,
> 10.6:3.387, 11:1.357, 17:1.526
>          Top Terms:
>                  gm              => 20.328017234802246
>                  h               => 12.566249370574951
>                  buyback         => 12.333285808563232
>                  repurchase      => 11.563349723815918
>                  class           => 11.257688760757446
>
> .................. etc
>
> Does scikit-learn have similar functionality? Thanks
The comment I linked to does exactly that, and it also renders a pretty image ;)
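
For anyone reading this in the archive without the link at hand, here is a
minimal sketch of the general idea. It is not the code from the linked
comment. It assumes a TF-IDF representation and k-means, and treats the
largest entries of each cluster centroid as that cluster's "top terms". The
four toy documents, the top_terms helper and the choice of two clusters are
made up for illustration. get_feature_names_out is the name in recent
scikit-learn releases; older releases call it get_feature_names.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (made up for illustration, loosely echoing the terms above).
docs = [
    "smithkline beckman pharmaceutical venture",
    "smithkline tagamet antiulcer capsules",
    "gm buyback class h shares",
    "gm repurchase class h stock buyback",
]

# Rows of X are documents, columns are terms, values are TF-IDF weights.
# The default tokenizer drops one-letter tokens such as "h".
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Column index -> term string; fall back to the older method name if needed.
if hasattr(vectorizer, "get_feature_names_out"):
    terms = np.asarray(vectorizer.get_feature_names_out())
else:
    terms = np.asarray(vectorizer.get_feature_names())


def top_terms(centers, terms, n=5):
    """Return, per cluster, the n (term, weight) pairs with the largest centroid weight."""
    order = np.argsort(centers, axis=1)[:, ::-1]  # descending weight per row
    return [
        [(str(terms[i]), float(centers[c, i])) for i in order[c, :n]]
        for c in range(centers.shape[0])
    ]


top = top_terms(km.cluster_centers_, terms)

for c, pairs in enumerate(top):
    print("Cluster %d (n=%d)" % (c, int(np.sum(km.labels_ == c))))
    print("    Top Terms:")
    for term, weight in pairs:
        print("        %-20s => %.4f" % (term, weight))
```

The weights printed here are centroid coordinates in TF-IDF space, so they
will not match Mahout's numbers even on the same data; only the ranking idea
carries over.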



_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
