Re: unknown test data twenty-newsgroups example

Neil Ghosh Thu, 30 Sep 2010 09:15:55 -0700

Do you mean I should first create the model with correctly labelled data,
with each document in the folder for its label?

Then randomly distribute the raw text files between two folders and
generate the input data from those?

And then run the tester on the mislabelled data?
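
For reference, here is how I picture the 80/20 split suggested in the quoted reply below, done at the folder level before any Mahout step. This is only a rough sketch, not anything from the Mahout docs: the function name, the folder arguments and the fixed seed are all my own assumptions, and it only copies files; it does not produce the Bayes classifier input format.

```python
import random
import shutil
from pathlib import Path


def split_by_label(src, train_dir, test_dir, test_ratio=0.2, seed=42):
    """Copy every label folder under src into train_dir and test_dir.

    Assumes src holds one sub-folder per label with raw text files inside.
    Returns {label: (n_train, n_test)}.
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    counts = {}
    for label_dir in sorted(p for p in Path(src).iterdir() if p.is_dir()):
        files = sorted(f for f in label_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_test = int(round(len(files) * test_ratio))
        parts = [(Path(test_dir), files[:n_test]),
                 (Path(train_dir), files[n_test:])]
        for root, subset in parts:
            # Keep the label as the folder name in both outputs.
            dest = root / label_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            for f in subset:
                shutil.copy2(f, dest / f.name)
        counts[label_dir.name] = (len(files) - n_test, n_test)
    return counts
```

Calling something like `split_by_label("20news-all", "20news-train", "20news-test")` (hypothetical folder names) would leave two trees with the same label folders, which could then each be prepared as described in the documentation.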

On Thu, Sep 30, 2010 at 9:37 PM, Robin Anil <[email protected]> wrote:

> You may split the dataset 80/20, or in some other ratio, and try. You can
> split it after you have created the data in Bayes classifier format, or
> split it into different folders and prepare them as described in the
> documentation.
>
>
> Robin
>
>
> On Thu, Sep 30, 2010 at 7:30 PM, Neil Ghosh <[email protected]> wrote:
>
>> Hi,
>>
>> In this example
>>
>> https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html
>>
>> The test is done on the already classified input text documents.
>>
>> My question is: if I want to test unknown documents, do I need them in a
>> specific format, or can I just keep them (as raw text) in the input folder
>> while testing?
>>
>> Thanks and Regards
>> Neil
>> http://neilghosh.com
>>
>
>


-- 
Thanks and Regards
Neil
http://neilghosh.com
