Do you mean I should first create the model with correctly labelled data, each document in its correct folder (label)?
Then randomly distribute the raw text files among two folders and generate the input data, and run the tester on that mislabelled data?

On Thu, Sep 30, 2010 at 9:37 PM, Robin Anil <[email protected]> wrote:

> You may split the dataset 80/20, or in some other ratio, and try. You can
> split it after you have created the data in Bayes classifier format, or
> split it into different folders and prepare them as described in the
> documentation.
>
> Robin
>
> On Thu, Sep 30, 2010 at 7:30 PM, Neil Ghosh <[email protected]> wrote:
>
>> Hi,
>>
>> In this example
>>
>> https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html
>>
>> the test is done on already classified input text documents.
>>
>> My question is: if I want to test unknown documents, do I need them in a
>> specific format, or can I just keep them (as raw text) in the input folder
>> while testing?
>>
>> Thanks and Regards
>> Neil
>> http://neilghosh.com

--
Thanks and Regards
Neil
http://neilghosh.com
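
For reference, the 80/20 split Robin suggests could be done on the raw label folders roughly as follows. This is only a sketch, not part of Mahout: the function name `split_by_label`, its parameters, and the folder layout (one sub-folder per label under a source directory) are assumptions made for illustration. The resulting train and test folders would still need to be converted to the Bayes classifier input format as the documentation describes.

```python
import random
import shutil
from pathlib import Path


def split_by_label(src_dir, train_dir, test_dir, train_fraction=0.8, seed=0):
    """Copy src_dir/<label>/* into train_dir/<label>/ and test_dir/<label>/.

    Hypothetical helper: assumes one sub-folder per label, raw text files inside.
    Returns {label: (n_train, n_test)}.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for label_dir in sorted(p for p in Path(src_dir).iterdir() if p.is_dir()):
        files = sorted(f for f in label_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        # Number of training files, rounded down.
        n_train = int(len(files) * train_fraction)
        for i, f in enumerate(files):
            dest_root = Path(train_dir) if i < n_train else Path(test_dir)
            dest = dest_root / label_dir.name  # keep the label as the folder name
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, dest / f.name)
        counts[label_dir.name] = (n_train, len(files) - n_train)
    return counts
```

Because every file keeps its original label folder, the held-out 20% stays correctly labelled and the tester's accuracy remains meaningful. The train and test directories should sit outside the source directory so they are not picked up as labels.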
