Hello all,

I wondered if anyone had insights on classifying data through multiple data
sets or through a single data set for document categorizer ?

I am building a sentence category classifier (e.g. surprise, disgust,
sarcasm, unknown) and wondered if one DocumentCategorizerME instance with
training data with all four types is more effective (with confidence
threshold of say 0.25 to decide on a category) or if I should seek to
categorize sentences through three DocumentCategorizerME instances (one
each for surprise, disgust, sarcasm, with confidence threshold of say 0.6
to decide on a category, unknown otherwise).

I am a newbie to this mailing list, and apologize if this question is
irrelevant. Please help me by pointing to the right direction.

Best,
Neeraj

Reply via email to