Do you mean that you want to classify a large dataset? The partial implementation is useful when the training dataset is too large to fit in memory. If it does fit, you are better off training the forest with the in-memory implementation. If you want to classify a large number of rows, you can add the -mr parameter to TestForest to classify the data using MapReduce. An example of this can be found in the wiki:
https://cwiki.apache.org/MAHOUT/partial-implementation.html

On Thu, Dec 6, 2012 at 2:45 AM, Marty Kube <[email protected]> wrote:

> Hi,
>
> I'm working improving classification throughput for a decision forest. I
> was wondering about the use case for Partial Implementation.
>
> The quick start guide suggests that Partial Implementation is designed for
> building forest on large datasets.
>
> My problem is classification after training. Is Partial Implementation
> helpful for this use case?
