Thank you William! Really appreciated!

There is only one point I do not get. When you said "You could increment your
model using Custom Feature Generators", does it mean that I can put these
features inside ONE *.bin* file (the model) that implements the different
things, or is the name finder one thing and the feature generators another?

Thank you in advance for the clarification.
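
For readers following the thread: the idea of adding the dictionary and the
regex "as NER features" can be illustrated with a minimal sketch. It is plain
Python and does not use the OpenNLP API; the name sets, the title regex and
the function name token_features are all hypothetical. What it tries to show
is that a feature generator only emits string evidence for each token, and
the trained model decides how much weight each piece of evidence gets.

```python
import re

# Hypothetical toy dictionaries; real ones would come from the name database.
MALE_NAMES = {"pierre", "william"}
FEMALE_NAMES = {"maria", "anna"}

# Hypothetical regex cue: a title token directly before the name.
TITLE_RE = re.compile(r"^(mr|mrs|ms)\.?$", re.IGNORECASE)


def token_features(tokens, index):
    """Return string features for tokens[index]: evidence only, no decision."""
    token = tokens[index].lower()
    features = ["w=" + token]
    # Dictionary hits become features instead of a final male/female answer.
    if token in MALE_NAMES:
        features.append("dict=male")
    if token in FEMALE_NAMES:
        features.append("dict=female")
    # The regex result on the previous token is also just another feature.
    if index > 0 and TITLE_RE.match(tokens[index - 1]):
        features.append("prev_title=" + tokens[index - 1].lower().rstrip("."))
    return features
```

With a token list such as ["Mr.", "Vinken"], the call
token_features(tokens, 1) includes prev_title=mr, while a name missing from
both sets gets no dict= feature at all, so the decision is left to whatever
other evidence the model has.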

2016-06-29 1:23 GMT+02:00 William Colen <[email protected]>:

> Not exactly. You would create a new NER model to replace yours.
>
> In this approach you would need a corpus like this:
>
> <START:personMale> Pierre Vinken <END> , 61 years old , will join the board
> as a nonexecutive director Nov. 29 .
> Mr . <START:personMale> Vinken <END> is chairman of Elsevier N.V. , the
> Dutch publishing group . <START:personFemale> Jessie Robson <END> is
> retiring , she was a board member for 5 years .
>
>
> I am not a native English speaker, so I am not sure if the example is
> clear enough. I tried to use Jessie as a neutral name and "she" as
> disambiguation.
>
> With a corpus big enough maybe you could create a model that outputs both
> classes, personMale and personFemale. To train a model you can follow
>
> https://opennlp.apache.org/documentation/1.6.0/manual/opennlp.html#tools.namefind.training
>
> Let's say your results are not good enough. You could increment your model
> using Custom Feature Generators (
>
> https://opennlp.apache.org/documentation/1.6.0/manual/opennlp.html#tools.namefind.training.featuregen
> and
>
> https://opennlp.apache.org/documentation/1.6.0/apidocs/opennlp-tools/opennlp/tools/util/featuregen/package-summary.html
> ).
>
> One of the implemented feature generators can take a dictionary (
>
> https://opennlp.apache.org/documentation/1.6.0/apidocs/opennlp-tools/opennlp/tools/util/featuregen/DictionaryFeatureGenerator.html
> ).
> You can also implement other convenient feature generators, for instance
> one based on regex.
>
> Again, it is just a wild guess of how to implement it. I don't know if it
> would perform well. I was only thinking how to implement a gender ML model
> that uses the surrounding context.
>
> Hope I could clarify.
>
> William
>
> 2016-06-28 19:15 GMT-03:00 Damiano Porta <[email protected]>:
>
> > Hi William,
> > Ok, so you are talking about a kind of pipe where we execute:
> >
> > 1. NER (personM for example)
> > 2. Regex (filter to reduce false positives)
> > 3. Plain dictionary (filter as above) ?
> >
> > Yes, we can split our model in two for M and F. It is not a big problem,
> > we have a database grouped by gender.
> >
> > I only have a doubt regarding the use of a dictionary. If we use a
> > dictionary to create the model, could we not use the dictionary alone to
> > detect names, without using NER?
> >
> >
> >
> > 2016-06-29 0:10 GMT+02:00 William Colen <[email protected]>:
> >
> > > Do you plan to use the surrounding context? If yes, maybe you could try
> > > to split NER in two categories: PersonM and PersonF. Just an idea, never
> > > read or tried anything like it. You would need a training corpus with
> > > these classes.
> > >
> > > You could add both the plain dictionary and the regex as NER features
> > > as well and check how it improves.
> > >
> > > 2016-06-28 18:56 GMT-03:00 Damiano Porta <[email protected]>:
> > >
> > > > Hello everybody,
> > > >
> > > > we built a NER model to find persons (name) inside our documents.
> > > > We are looking for the best approach to understand if the name is
> > > > male/female.
> > > >
> > > > Possible solutions:
> > > > - Plain dictionary?
> > > > - Regex to check the initial and/letters of the name?
> > > > - Classifier? (naive bayes? Maxent?)
> > > >
> > > > Thanks
> > > >
> > >
> >
>
