Re: Document size rules of thumb

Robin Anil Wed, 07 Oct 2009 05:31:53 -0700

Hi Sandra, could you describe your setup and what kind of dataset it is?
The Mahout Naive Bayes/CBayes classifier (not a Bayesian network) is built with
text articles or documents in mind. The characteristics might change if the
documents you wish to classify are 140-character SMS or Twitter messages (it
won't affect much, though). Could you tell me what kind of results you are
getting? Then, by looking at the data and the scores generated, we can see what
to tune.

Robin



On Wed, Oct 7, 2009 at 4:58 PM, Sandra Clover <[email protected]> wrote:

> Hi, just wondering, do you have any nice rules-of-thumb or any other
> guides (characteristics) as to the minimum size of the documents used in
> training the complementary Bayesian network? I would appreciate any
> comments/views/opinions/rules-of-thumb/experiences that you may be able
> to offer on good characteristics of the documents that go into training
> (particularly when you have a large number of categories to
> classify)...
> Thanking you,
> Sandra.
