Do you mean you want to classify a large dataset?
The partial implementation is useful when the training dataset is too large
to fit in memory. If it does fit, you are better off training the forest with
the in-memory implementation.
If you want to classify a large number of rows, you can add the -mr
parameter to TestForest to classify the data using mapreduce. An
example of this can be found in the wiki:

https://cwiki.apache.org/MAHOUT/partial-implementation.html
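
For reference, here is a rough sketch of what such a TestForest call might look like. Only the -mr flag comes from this thread. The class package, the other options (-i, -ds, -m, -o), the jar name and all paths are assumptions from memory of the wiki example, so please check them against your Mahout version. The script only prints the command (a dry run) so you can review it before launching anything.

```shell
#!/bin/sh
# Placeholder locations (assumptions, not values from this thread).
MAHOUT_JOB_JAR="${MAHOUT_JOB_JAR:-mahout-examples-job.jar}"
TEST_DATA="${TEST_DATA:-testdata}"          # rows to classify
DATASET_INFO="${DATASET_INFO:-testdata.info}" # dataset descriptor
FOREST="${FOREST:-forest}"                  # previously trained forest
OUTPUT="${OUTPUT:-predictions}"             # where predictions are written

# Assumed class name; the package may differ between Mahout versions.
TEST_FOREST_CLASS="org.apache.mahout.classifier.df.mapreduce.TestForest"

# Print the command line; -mr requests classification via mapreduce.
build_cmd() {
  printf '%s\n' "hadoop jar $MAHOUT_JOB_JAR $TEST_FOREST_CLASS -i $TEST_DATA -ds $DATASET_INFO -m $FOREST -mr -o $OUTPUT"
}

# Dry run: show the command instead of executing it.
build_cmd
```

Once the printed command looks right for your cluster, you can paste it into a shell to run it. Dropping -mr should fall back to sequential classification, which is usually fine for small inputs.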

On Thu, Dec 6, 2012 at 2:45 AM, Marty Kube <[email protected]> wrote:

> Hi,
>
> I'm working on improving classification throughput for a decision forest. I
> was wondering about the use case for Partial Implementation.
>
> The quick start guide suggests that Partial Implementation is designed for
> building forests on large datasets.
>
> My problem is classification after training. Is Partial Implementation
> helpful for this use case?