Hi Damiano,

Thank you. I will definitely look into it.

Manoj.

On Wed, Jan 18, 2017 at 5:30 PM, Damiano Porta <damianopo...@gmail.com>
wrote:

> Manoj,
>
> You can add custom features using a generator that implements this interface:
> https://github.com/apache/opennlp/blob/master/opennlp-tools/src/main/java/opennlp/tools/doccat/FeatureGenerator.java
>
> Take a look at
> https://github.com/apache/opennlp/blob/master/opennlp-tools/src/main/java/opennlp/tools/doccat/BagOfWordsFeatureGenerator.java
> and
> https://github.com/apache/opennlp/blob/master/opennlp-tools/src/main/java/opennlp/tools/doccat/NGramFeatureGenerator.java
>
> Damiano
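
The custom-generator idea above might look roughly like the sketch below. It
does not depend on OpenNLP: TokenFeatureGenerator is a hypothetical stand-in
for opennlp.tools.doccat.FeatureGenerator, whose exact method signature depends
on the OpenNLP version and should be checked against the linked source. The
"bow=" and "kw=" prefixes and the class names are likewise assumptions made for
illustration. The generator does not set any weight itself; it only emits an
extra feature for chosen keywords, and the trainer then learns a weight for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for opennlp.tools.doccat.FeatureGenerator; the real
// interface's method signature depends on the OpenNLP version.
interface TokenFeatureGenerator {
    Collection<String> extractFeatures(String[] tokens);
}

// Emits a bag-of-words feature for every token, plus an extra "kw=" feature
// for tokens that appear in a user-supplied keyword set.
class KeywordFeatureGenerator implements TokenFeatureGenerator {
    private final Set<String> keywords = new HashSet<>();

    KeywordFeatureGenerator(Collection<String> keywords) {
        for (String keyword : keywords) {
            this.keywords.add(keyword.toLowerCase(Locale.ROOT));
        }
    }

    @Override
    public Collection<String> extractFeatures(String[] tokens) {
        List<String> features = new ArrayList<>();
        for (String token : tokens) {
            String word = token.toLowerCase(Locale.ROOT);
            features.add("bow=" + word);
            if (keywords.contains(word)) {
                features.add("kw=" + word);
            }
        }
        return features;
    }
}

class KeywordFeatureDemo {
    public static void main(String[] args) {
        TokenFeatureGenerator generator =
            new KeywordFeatureGenerator(Arrays.asList("played", "basket", "ball"));
        String[] tokens = {"We", "played", "basket", "ball", "under", "the", "sun"};
        System.out.println(generator.extractFeatures(tokens));
    }
}
```
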
>
> 2017-01-18 12:41 GMT+01:00 Cohan Sujay Carlos <co...@aiaioo.com>:
>
> > In machine learning, one learns the weights you're speaking of, Manoj.
> >
> > So, the words that are more important for any category are given higher
> > weightage during classification.
> >
> > However, rather than requiring a user to manually assign these weights, a
> > machine learning system learns the weights from training data.
> >
> > That's what happens when you call, say,
> > DocumentCategorizerME.train("en", sampleStream);
> >
> > The model that the train method returns is just a record of the "weights"
> > that have been learnt.
> >
> > Cohan
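
To make the point about learned weights concrete, here is a toy sketch that
learns one weight per word from the two example sentences quoted further down
in this thread. It is not OpenNLP's trainer: it uses a plain binary perceptron
as a deliberately simpler stand-in, and the class name, method names and the
+1 (Sports) / -1 (Nature) label encoding are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PerceptronWeightsDemo {
    // Sum of the current weights of the tokens in one document.
    static double score(Map<String, Double> weights, String[] tokens) {
        double sum = 0.0;
        for (String token : tokens) {
            sum += weights.getOrDefault(token, 0.0);
        }
        return sum;
    }

    // Perceptron training: labels are +1 or -1. Whenever a document is
    // misclassified (or scores exactly zero), every token in it is nudged
    // towards the document's label.
    static Map<String, Double> train(String[][] docs, int[] labels, int epochs) {
        Map<String, Double> weights = new HashMap<>();
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int i = 0; i < docs.length; i++) {
                if (labels[i] * score(weights, docs[i]) <= 0) {
                    for (String token : docs[i]) {
                        weights.merge(token, (double) labels[i], Double::sum);
                    }
                }
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        String[][] docs = {
            "we played basket ball under the sun".split(" "), // Sports
            "the sun is a big ball of fire".split(" ")        // Nature
        };
        int[] labels = {1, -1}; // +1 = Sports, -1 = Nature
        Map<String, Double> weights = train(docs, labels, 5);
        for (String word : new String[] {"played", "basket", "ball", "sun", "fire"}) {
            System.out.println(word + " -> " + weights.get(word));
        }
    }
}
```

With only these two sentences, words that occur in both classes ('ball', 'sun',
'the') end up with weight 0, while words seen in just one class ('played',
'basket', 'fire') keep a non-zero weight. No weight was assigned by hand.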
> >
> > On Wed, Jan 18, 2017 at 4:18 PM, Manoj B. Narayanan <
> > manojb.narayanan2...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I was wondering if there is a way to assign weights to certain words of a
> > > class in the Document Classifier.
> > >
> > > Some words are important for a particular class. Even though these words
> > > may occur in other classes, the level of importance may vary. So, if
> > > certain words in certain classes are given specific weights, it would
> > > produce more accurate results.
> > >
> > > Let me explain this with an example.
> > >
> > > Say we have 2 classes. Nature and Sports.
> > > Consider these 2 sentences :
> > >     1. We played basket ball, under the sun.
> > >     2. The sun is a big ball of fire.
> > >
> > > In the first sentence, which belongs to the class 'Sports', the words
> > > 'played', 'basket' and 'ball' are more important than the word 'sun'.
> > > In the second sentence, by contrast, the words 'sun' and 'fire' are more
> > > important than the word 'ball'.
> > >
> > > The level of importance can be set by assigning weights to a few
> > > specific words that are distinctive for a class.
> > >
> > > Is there already a way to do this in the OpenNLP Document Classifier? If
> > > not, please consider this.
> > >
> >
>
