Hi William,
OK, so you are talking about a kind of pipeline where we run:

1. NER (PersonM, for example)
2. Regex (filter to reduce false positives)
3. Plain dictionary (filter as above)?
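
To check that I understand the idea, here is a rough sketch of steps 2
and 3 applied to the output of step 1. Everything in it is an assumption
for illustration only: the names NAMES_BY_LABEL, NAME_PATTERN and
filter_candidates are hypothetical, the tiny name sets stand in for our
gender database, and the regex is a simplistic ASCII-only check.

```python
import re

# Hypothetical stand-in for the name database grouped by gender.
NAMES_BY_LABEL = {
    "PersonM": {"marco", "luca"},
    "PersonF": {"maria", "anna"},
}

# Hypothetical plausibility check: one to four capitalized ASCII words.
NAME_PATTERN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+){0,3}")


def filter_candidates(candidates):
    """Filter (text, label) pairs assumed to come from the NER step."""
    kept = []
    for text, label in candidates:
        # Step 2: the regex drops spans that do not look like a name.
        if not NAME_PATTERN.fullmatch(text):
            continue
        # Step 3: the dictionary drops spans whose first name is not
        # listed for the gender that NER predicted.
        first_name = text.split()[0].lower()
        if first_name not in NAMES_BY_LABEL.get(label, set()):
            continue
        kept.append((text, label))
    return kept


candidates = [
    ("Marco Rossi", "PersonM"),
    ("Maria Bianchi", "PersonF"),
    ("ACME 2016", "PersonM"),   # rejected by the regex
    ("Anna Verdi", "PersonM"),  # rejected by the dictionary
]
print(filter_candidates(candidates))
```

Here both filters can only remove candidates, so recall is bounded by
what NER finds in step 1.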

Yes, we can split our model in two for M and F. It is not a big problem,
since we have a database grouped by gender.

My only doubt is about the dictionary. If we use a dictionary to build
the model, couldn't we just use the dictionary on its own to detect
names, without NER?
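
To make the question concrete, this is roughly what I mean by using the
dictionary alone. It is only a sketch: FIRST_NAMES and detect_names are
hypothetical names and the three entries are made up. Note that it tags
every token found in the dictionary and ignores the surrounding context.

```python
# Hypothetical first-name dictionary with gender.
FIRST_NAMES = {"rose": "F", "mark": "M", "anna": "F"}


def detect_names(sentence):
    """Return (word, gender) for every token found in the dictionary."""
    hits = []
    for token in sentence.split():
        word = token.strip(".,;:!?")
        gender = FIRST_NAMES.get(word.lower())
        if gender is not None:
            hits.append((word, gender))
    return hits


print(detect_names("Anna bought a rose in the market."))
# prints [('Anna', 'F'), ('rose', 'F')]
```

In this example the common noun "rose" is tagged as a female name, which
is the kind of case where context would have to decide.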



2016-06-29 0:10 GMT+02:00 William Colen <[email protected]>:

> Do you plan to use the surrounding context? If yes, maybe you could try to
> split NER in two categories: PersonM and PersonF. Just an idea, never read
> or tried anything like it. You would need a training corpus with these
> classes.
>
> You could add both the plain dictionary and the regex as NER features as
> well and check how it improves.
>
> 2016-06-28 18:56 GMT-03:00 Damiano Porta <[email protected]>:
>
> > Hello everybody,
> >
> > we built a NER model to find persons (name) inside our documents.
> > We are looking for the best approach to understand if the name is
> > male/female.
> >
> > Possible solutions:
> > - Plain dictionary?
> > - Regex to check the initial and/letters of the name?
> > - Classifier? (naive bayes? Maxent?)
> >
> > Thanks
> >
>
