A better approach would be to create a new Model and ModelDistribution that
use log arithmetic of your choosing. The initial models are very simple-minded
and are likely not adequate for real applications.
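
One thing a log-based model has to handle is turning per-cluster log densities
back into membership probabilities without exponentiating them directly. Below
is a minimal sketch of the usual log-sum-exp trick. The class and method names
are hypothetical and are not part of Mahout's Model or ModelDistribution API.

```java
// Hypothetical helper, not a Mahout class: normalizes log densities into
// probabilities that sum to 1, using the log-sum-exp trick.
final class LogNormalizeSketch {

    // logP[i] is the log density of one point under cluster i.
    // Assumes logP is non-empty and at least one entry is finite.
    static double[] normalize(double[] logP) {
        // Subtracting the maximum keeps every exponent <= 0, so the largest
        // term becomes exp(0) = 1 and the sum cannot underflow to zero.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logP) {
            max = Math.max(max, v);
        }
        double sum = 0.0;
        for (double v : logP) {
            sum += Math.exp(v - max);
        }
        double logTotal = max + Math.log(sum);

        double[] p = new double[logP.length];
        for (int i = 0; i < logP.length; i++) {
            p[i] = Math.exp(logP[i] - logTotal);
        }
        return p;
    }
}
```

Only differences between log densities are ever exponentiated, so values far
below what a double can represent as a plain density still produce usable
probabilities.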

-----Original Message-----
From: Ted Dunning [mailto:[email protected]] 
Sent: Monday, June 27, 2011 7:51 AM
To: [email protected]
Subject: Re: Incorrect calculation of pdf

There should not be a change to an existing method.

It would be fine to add another method, perhaps called logPdf, that does
what you suggest.  This loss of precision is common with the normal
distribution in high dimensions.
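
To illustrate the precision loss, here is a minimal sketch that assumes a
Gaussian with independent dimensions, each with its own standard deviation.
The class name and both method signatures are hypothetical; this is not the
actual GaussianCluster code, only the general shape a logPdf could take.

```java
// Hypothetical sketch, not Mahout's GaussianCluster: compares a product-form
// density with its log-space equivalent for a diagonal-covariance Gaussian.
final class LogPdfSketch {

    private static final double LOG_SQRT_2PI = 0.5 * Math.log(2.0 * Math.PI);

    // Product of per-dimension normal densities. With tens of thousands of
    // factors below 1, the running product can underflow to 0.0.
    static double pdf(double[] x, double[] mean, double[] sd) {
        double p = 1.0;
        for (int i = 0; i < x.length; i++) {
            double z = (x[i] - mean[i]) / sd[i];
            p *= Math.exp(-0.5 * z * z) / (sd[i] * Math.sqrt(2.0 * Math.PI));
        }
        return p;
    }

    // Same density in log space: the product becomes a sum of logs, which
    // stays finite as long as every sd[i] is positive and finite.
    static double logPdf(double[] x, double[] mean, double[] sd) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double z = (x[i] - mean[i]) / sd[i];
            sum += -0.5 * z * z - Math.log(sd[i]) - LOG_SQRT_2PI;
        }
        return sum;
    }

    public static void main(String[] args) {
        int d = 50000; // dimensionality mentioned in the original question
        double[] x = new double[d];
        double[] mean = new double[d];
        double[] sd = new double[d];
        java.util.Arrays.fill(sd, 1.0);
        // Evaluate at the cluster center, where the density is largest.
        System.out.println("pdf    = " + pdf(x, mean, sd));
        System.out.println("logPdf = " + logPdf(x, mean, sd));
    }
}
```

Adding logPdf next to pdf leaves existing callers untouched, and callers that
need to compare clusters can work with the log values directly.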

On Mon, Jun 27, 2011 at 1:49 AM, Vasil Vasilev <[email protected]> wrote:

> Hi,
>
> Recently I wanted to use the Dirichlet clustering algorithm to cluster
> vectors taken directly from vectorized text, whose dimensionality was
> around 50000. In this situation the algorithm fails to calculate the pdf
> of a vector corresponding to a cluster center due to problems with
> numerical precision during multiplication.
>
> In this regard, what do you think of modifying the GaussianCluster.pdf()
> method in such a way that it works with logarithmic probabilities?
>
> Regards, Vasil
>
