Hello,

I am having difficulty with Dirichlet process clustering and would highly
appreciate any help.
Dirichlet clustering puts all instances of my data into one single cluster,
no matter how many iterations I run.
The clusterdump output looks like this:
DC-0 total= 1152000 model= GC:0{*n=1152* c=[0:0.014, 1:0.004, 2:0.001,
3:0.005, 5:0.004
...
DC-1 total= 0 model= GC:1{*n=0* c=[0.085, 0.101, 1.617, -1.592, 0.721,
-1.618, 0.550, 0.302
...
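
To check every cluster in a long dump rather than only the first few, the n=
fields can be tallied with a short script. This is only a minimal sketch that
assumes the clusterdump text layout shown above; the regular expression and
the cluster_sizes helper are my own illustration, not part of Mahout.

```python
import re

# Matches cluster header lines such as
#   DC-0 total= 1152000 model= GC:0{*n=1152* c=[...
# The asterisks around n= are treated as optional, on the assumption that
# they are emphasis markers rather than part of the real clusterdump output.
CLUSTER_LINE = re.compile(
    r"DC-(\d+)\s+total=\s*(\d+)\s+model=\s*GC:\d+\{\*?n=(\d+)\*?"
)


def cluster_sizes(dump_text):
    """Return {cluster id: n} for every cluster header found in dump_text."""
    return {int(m.group(1)): int(m.group(3))
            for m in CLUSTER_LINE.finditer(dump_text)}


# Two header lines copied (truncated) from the dump above.
sample = """DC-0 total= 1152000 model= GC:0{*n=1152* c=[0:0.014, 1:0.004
DC-1 total= 0 model= GC:1{*n=0* c=[0.085, 0.101
"""

sizes = cluster_sizes(sample)
non_empty = [cid for cid, n in sizes.items() if n > 0]
print("points per cluster:", sizes)
print("non-empty clusters:", non_empty)
```

If only one cluster id shows up as non-empty, all points really were assigned
to a single cluster.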

I thought the problem might be in the way the input is read. However, when
I tried the Reuters dataset, the output was similar:
DC-0 total= 320 model= GC:0{*n=32* c=[2.886, 0.210, 0.167, 0.210, 0.664,
0.254, 0.486,
...
DC-1 total= 0 model= GC:1{*n=0* c=[-0.217, -0.522, 1.138, 0.399, -0.314,
1.063, -0.967,
When I use the dictionary for the Reuters dataset, clusterdump prints
reasonable top terms for the clusters, for example:
:DC-0 total
Top Terms:
d                                       =>   48.25068240612745
5                                       =>   45.90837124735117
said                                    =>   44.70690381526947
topics                                  =>   44.07638777047396
22                                      =>   39.78152487426996
companies                               =>   38.85674291104078
date                                    =>   38.47198750451207
unknown                                 =>   38.33379830792546
reuters                                 =>   37.93209125474095
title                                   =>   37.45820361748338
:DC-1 total
Top Terms:
foreclosed                              =>   3.973533371410058
18749                                   =>   3.945486656800688
jannock                                 =>  3.8038475335990882
48.29                                   =>  3.7140637347393706
asphalt                                 =>  3.6475071525946103
fragile                                 =>  3.6402008090541895
compiled                                =>   3.584675891358228
642                                     =>  3.5606986939331313
6.73                                    =>  3.5492208849250027
16334                                   =>  3.5394655632624428


Does anybody know what could be causing this problem?

Thanks
