You could split the dataset 80/20 (or in some other ratio) and test on the held-out part. You can either split it after you have converted the data to the Bayes classifier format, or split the raw documents into separate folders first and then prepare each folder as described in the documentation.
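
For the second option (splitting the raw documents into separate folders), a minimal sketch in Python might look like the following. It is not part of Mahout. The function name `split_dataset` and its parameters are hypothetical, and it assumes the input has one subfolder per category, as in the 20 newsgroups layout.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_dir, train_dir, test_dir, train_ratio=0.8, seed=42):
    """Copy each category's files into train/test folders, keeping category names.

    Hypothetical helper: assumes src_dir contains one subfolder per category.
    Returns {category: (n_train, n_test)}.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for category in sorted(p for p in Path(src_dir).iterdir() if p.is_dir()):
        files = sorted(f for f in category.iterdir() if f.is_file())
        rng.shuffle(files)
        cut = int(len(files) * train_ratio)  # first `cut` files go to training
        for target_root, subset in ((train_dir, files[:cut]), (test_dir, files[cut:])):
            target = Path(target_root) / category.name
            target.mkdir(parents=True, exist_ok=True)
            for f in subset:
                shutil.copy2(f, target / f.name)
        counts[category.name] = (cut, len(files) - cut)
    return counts
```

Splitting per category rather than over the whole pool keeps roughly the same class proportions in both folders. Each resulting folder would then still need to be prepared as described in the documentation.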
Robin

On Thu, Sep 30, 2010 at 7:30 PM, Neil Ghosh <[email protected]> wrote:

> Hi,
>
> In this example
>
> https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html
>
> The test is done on the already classified input text documents.
>
> My Question is , If I want to test unknown, documents , do I need it in
> specific format ? or just keep them (as raw text ) in the input folder
> while testing ?
>
> Thanks and Regards
> Neil
> http://neilghosh.com
