Re: Speed of clustering documents

Ted Dunning Wed, 29 Dec 2010 15:25:38 -0800

Before looking at your results, I would like to say that Mahout is about
scalable data mining.  Performance on small data sets is explicitly not a
goal.

For small data sets like this, you would do much, much better to do your
work in a conventional system like R where everything can fit in memory and
clustering can take less than a second.
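
To illustrate the in-memory approach, here is a minimal sketch in Python rather
than R. It is plain k-means over a few thousand small vectors held in a list.
The function names, the toy data generator, and the choice of k are all
assumptions made for illustration; real document vectors would come from your
own term-weighting step, and this is not how Mahout implements clustering.

```python
import random


def kmeans(points, k, iterations=10, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float tuples."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def make_toy_vectors(n, dim=2, seed=1):
    """Hypothetical stand-in for document vectors: three Gaussian blobs."""
    rng = random.Random(seed)
    means = [0.0, 10.0, -10.0]
    return [tuple(rng.gauss(means[i % 3], 1.0) for _ in range(dim)) for i in range(n)]


# 3,000 vectors, the smaller of the two set sizes mentioned in the question.
points = make_toy_vectors(3000)
labels, centroids = kmeans(points, k=3)
print(len(labels), "points assigned to", len(centroids), "clusters")
```

Everything here lives in ordinary in-process data structures, so there is no
job startup or disk I/O between iterations, which is where a small run on a
distributed system tends to spend its time.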

On Wed, Dec 29, 2010 at 2:39 PM, Samir Raiyani <[email protected]> wrote:

> We have been testing Mahout in a few different configurations and it seems
> to take a significant amount of time (several minutes to over an hour) for
> small document sets (3,000 documents and 7,000 documents). Is this type of
> performance normal?
>
