On Thu, Sep 30, 2010 at 10:00 AM, Neil Ghosh <[email protected]> wrote:
>
> My Question is , If I want to test unknown, documents , do I need it in
> specific format ? or just keep them (as raw text ) in the input folder while
> testing ?

If I interpret your question correctly, you're saying "I've trained my
classifier and tested it, now how do I use it in production?". I don't
know that this is covered by the example.

The unit test in core/src/test/java,
org.apache.mahout.classifier.bayes.BayesClassifierSelfTest, provides a
potentially useful example. Take a look at the testSelfTestBayes()
method.

In general, the operations involved include:
   Create an instance of Algorithm and Datastore, configured as appropriate.
   Create an instance of ClassifierContext (named classifier) using
the Algorithm and Datastore, then call initialize() on the context.
   Generate tokens from your input document (either individual words
or ngrams, depending on how the data used to train the model was
processed).
   Call classifier.classifyDocument(String[] tokens, String
defaultCat). This will return a ClassifierResult containing the top
classifications for the input document (ranked by score).
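
The token-generation step can be sketched in plain Java as below. This is
only an illustration: TokenSketch and tokenize() are hypothetical names and
not part of Mahout. The lower-casing, the split on non-alphanumeric
characters, and the space-joined ngrams emitted alongside single words are
all assumptions. Whatever you use in production has to match the analysis
that was applied to the training data.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenSketch {

    // Hypothetical helper, not a Mahout API: lower-cases the text, splits it
    // on runs of non-alphanumeric characters, and returns the single words
    // followed by all ngrams of size 2..gramSize (words joined by a space).
    public static String[] tokenize(String text, int gramSize) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {          // split() can yield an empty leading entry
                words.add(w);
            }
        }
        List<String> tokens = new ArrayList<>(words);
        for (int n = 2; n <= gramSize; n++) {
            for (int i = 0; i + n <= words.size(); i++) {
                tokens.add(String.join(" ", words.subList(i, i + n)));
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Raw text in, String[] out.
        String[] tokens = tokenize("Mahout classifies raw text documents.", 2);
        System.out.println(String.join(" | ", tokens));
    }
}
```

The resulting String[] is what you would hand to
classifier.classifyDocument(tokens, defaultCat) in the last step above.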

HTH,

Drew
