Can someone help me with this?

Regards,
Damodar

On Tue, Jul 3, 2012 at 4:27 PM, damodar shetyo <[email protected]> wrote:

> Hi,
> I plan to use Mahout's classification feature. I have a lot of data on
> which I am planning to train my model, and I have a few questions:
> 1) Suppose I have two types of data: spam and non-spam (this is just an
> example, not my real use case, but it is similar to it). The amount of
> spam in the training data is far less than the amount of non-spam: about
> 2% spam (or maybe 1%) and 98% non-spam.
> If I build my model on this training data so that it outputs
> spam/non-spam, will I get non-spam all the time because non-spam
> dominates the training data?
> Will my model correctly identify spam?
>
>
> --
> Regards,
> Damodar Shetyo
>
>


-- 
Regards,
Damodar Shetyo
