Thresholds are generally dangerous.  It is usually preferable to specify the
sparseness you want (1%, 0.2%, whatever), sort the results in descending
score order using Hadoop's built-in capabilities, keep that top fraction and
just drop the rest.
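
To make the idea concrete, here is a minimal in-memory sketch in Python rather
than actual Mahout or Hadoop code.  The function name keep_top_fraction and its
parameters are made up for illustration; in a real job the descending sort
would come from Hadoop's shuffle rather than from sorted().

```python
import math


def keep_top_fraction(scores, fraction):
    """Keep the highest-scoring `fraction` of entries and drop the rest.

    `scores` maps an element key to its score. This is a hypothetical helper
    for illustration, not part of Mahout.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one entry so a tiny fraction never empties a non-empty input.
    keep = max(1, math.ceil(len(scores) * fraction))
    # Sort by score, highest first, then truncate to the requested size.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:keep])


if __name__ == "__main__":
    example = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.3}
    print(keep_top_fraction(example, 0.5))
```

The output size depends only on the requested fraction, not on how the scores
happen to be scaled, which is the property a fixed threshold does not give you.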

On Tue, Jun 15, 2010 at 9:32 AM, Kris Jack <[email protected]> wrote:

>  I was wondering if there was an
> interesting way to do this with the current mahout code such as requesting
> that the Vector accumulator returns only elements that have values greater
> than a given threshold, sorting the vector by value rather than key, or
> something else?
>
