Can't say just off-hand. What is the data?
On Mon, Jul 5, 2010 at 8:20 AM, Grant Ingersoll <[email protected]> wrote:
> I'm running ClusterLabels and it seems to be outputting the same values for
> every centroid [1]. When I run the cluster dumper, the top terms are fairly
> different for those same vectors.
>
> Have I hit a vagary of LLR or is this a bug?
>
> Thanks,
> Grant
>
> [1]
> <snip>
> Top labels for Cluster 129062 containing 22710 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 43269.00830466254 0 72060
> his 7185.503760070074 0 17203
> has 7028.243643655442 0 16855
> from 6415.739411605988 0 15488
> year 5930.141497239005 0 14391
> state 5858.43069797568 0 14228
> said 5616.422720833216 0 13676
> it 5545.207108973991 0 13513
> he 5239.340392438695 0 12810
> new 4830.124521905556 0 11862
>
> Top labels for Cluster 129145 containing 11188 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 19576.26998734614 0 72060
> his 3352.5135342599824 0 17203
> has 3279.466228939127 0 16855
> from 2994.8128935270943 0 15488
> year 2768.974903047085 0 14391
> state 2735.612128134351 0 14228
> said 2622.997358441353 0 13676
> it 2589.8515553446487 0 13513
> he 2447.4579147226177 0 12810
> new 2256.8640938592143 0 11862
>
> Top labels for Cluster 129201 containing 13040 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 23110.173012922285 0 72060
> his 3940.4691014224663 0 17203
> has 3854.554399965331 0 16855
> from 3519.784154796507 0 15488
> year 3254.2127395244315 0 14391
> state 3214.9822960514575 0 14228
> said 3082.565408431459 0 13676
> it 3043.5924300444312 0 13513
> he 2876.171367166564 0 12810
> new 2652.0934832417406 0 11862
>
> Top labels for Cluster 129211 containing 14053 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 25083.46391701023 0 72060
> his 4266.378291217145 0 17203
> has 4173.323467798065 0 16855
> from 3810.7467373879626 0 15488
> year 3523.1337431534193 0 14391
> state 3480.648573280778 0 14228
> said 3337.2482196930796 0 13676
> it 3295.0432900944725 0 13513
> he 3113.741967030335 0 12810
> new 2871.0957860480994 0 11862
>
> Top labels for Cluster 129242 containing 12861 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 22764.503256496973 0 72060
> his 3883.2002838114277 0 17203
> has 3798.5396822127514 0 16855
> from 3468.6536546614952 0 15488
> year 3206.954131908249 0 14391
> state 3168.2954448102973 0 14228
> said 3037.808057511691 0 13676
> it 2999.402857856825 0 13513
> he 2834.4202939094976 0 12810
> new 2613.604658874683 0 11862
>
> Top labels for Cluster 129245 containing 6443 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 10925.268199045677 0 72060
> his 1890.511348863598 0 17203
> has 1849.385320336558 0 16855
> from 1689.0946326381527 0 15488
> year 1561.8904545903206 0 14391
> state 1543.096286157146 0 14228
> said 1479.652662154287 0 13676
> it 1460.9780013803393 0 13513
> he 1380.745082413312 0 12810
> new 1273.3357145632617 0 11862
>
> Top labels for Cluster 129255 containing 11390 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 19957.211259535048 0 72060
> his 3416.1555761522613 0 17203
> has 3341.7163103362545 0 16855
> from 3051.6410844950005 0 15488
> year 2821.504116652999 0 14391
> state 2787.5064550531097 0 14228
> said 2672.7490201727487 0 13676
> it 2638.972676954698 0 13513
> he 2493.870809029322 0 12810
> new 2299.653438703157 0 11862
>
> Top labels for Cluster 129265 containing 9461 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 16362.85457371641 0 72060
> his 2813.167819214519 0 17203
> has 2751.908798408229 0 16855
> from 2513.176188033074 0 15488
> year 2323.752471229993 0 14391
> state 2295.767774611246 0 14228
> said 2201.3039346230216 0 13676
> it 2173.4997256915085 0 13513
> he 2054.0495802331716 0 12810
> new 1894.1558320098557 0 11862
>
> Top labels for Cluster 129279 containing 14559 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 26080.197364640888 0 72060
> his 4430.338072712999 0 17203
> has 4333.689091425855 0 16855
> from 3957.116204748396 0 15488
> year 3658.40981121175 0 14391
> state 3614.286633652635 0 14228
> said 3465.358771919273 0 13676
> it 3421.527382406406 0 13513
> he 3233.2411222746596 0 12810
> new 2981.251407010015 0 11862
>
> Top labels for Cluster 129290 containing 13592 vectors
> Term LLR In-ClusterDF Out-ClusterDF
> a 24181.82589298836 0 72060
> his 4117.6785482652485 0 17203
> has 4027.8821644652635 0 16855
> from 3677.9947950267233 0 15488
> year 3400.440033295192 0 14391
> state 3359.4400672735646 0 14228
> said 3221.0516651300713 0 13676
> it 3180.321518546436 0 13513
> he 3005.353873868007 0 12810
> new 2771.180380204227 0 11862
> </snip>
