Hi Jeff.

> Did you turn off most-likely classification?

Yes, I set the most-likely option to false.
In general, a pdf value lies between 0 and 1.
So if the pdf threshold is set to 0, all points should be classified to all of the
clusters.
In fact, however, the sequence file is empty.

This seems contradictory to me.
I may be wrong, but is this a bug?
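
To make my expectation concrete, here is a minimal sketch in Python. It is not Mahout code: the function `classify`, its parameters, and the sample pdf values are hypothetical names and numbers made up for illustration, and I am assuming the comparison is "strictly greater than the threshold", as your description suggests.

```python
def classify(pdfs, threshold=0.0, most_likely=True):
    """Return (cluster_index, pdf) pairs emitted for a single point.

    pdfs        -- hypothetical per-cluster pdf values for the point
    threshold   -- hypothetical pdf threshold
    most_likely -- if True, emit only the single best cluster
    """
    if not pdfs:
        return []
    if most_likely:
        # Most-likely assignment: only the cluster with the highest pdf.
        best = max(range(len(pdfs)), key=lambda i: pdfs[i])
        return [(best, pdfs[best])]
    # Threshold assignment: every cluster whose pdf exceeds the threshold
    # (assumption: strict comparison).
    return [(i, p) for i, p in enumerate(pdfs) if p > threshold]


pdfs = [0.7, 0.2, 0.1]  # made-up values for one point
print(classify(pdfs, most_likely=True))
print(classify(pdfs, threshold=0.0, most_likely=False))
```

Under these assumptions, the second call emits all three clusters for the point, so with a threshold of 0 I would expect at least one entry per point, not an empty file.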

Thanks,
Yoshihiro.



2012/12/26 Jeff Eastman <[email protected]>

> Here's a response to a similar question from a couple of months ago:
>
> The classification phase of Dirichlet uses a most-likely assignment of
> points to clusters by default. This means that, unlike the training phase
> where points are assigned statistically to likely clusters, the
> classification may result in empty clusters even though those clusters have
> nonzero counts in the final iteration. You can disable most-likely
> assignment and set a pdf threshold - check the documentation - and points
> will be classified to all of the clusters that have pdf greater than the
> threshold.
>
> Does this help? Did you turn off most-likely classification?
> Jeff
>
>
> On 12/24/12 11:57 PM, yoshihiro fujimoto wrote:
>
>> Hi all,
>>
>>
>> https://cwiki.apache.org/MAHOUT/dirichlet-process-clustering.html
>>
>> According to this page, a threshold can be specified for the Dirichlet driver.
>> The page explains that a threshold of 0 will emit all clusters with their
>> associated probabilities for each vector.
>> So I ran Dirichlet clustering with a threshold of 0.
>> But the clusteredPoints/part-m-00000 sequence file is empty (its length is 120
>> bytes).
>>
>> In the Dirichlet process, can a threshold of 0 produce an empty result?
>>
>> Thanks,
>>
>> Yoshihiro
>>
>>
>
