2011/6/20 Amal Elmah <amalalthougha...@hotmail.com>:
>
> Hi OpenNLP team,
>
> I used the command-line training tool for NameFinder, with the
> following command:
> $ bin/opennlp TokenNameFinderTrainer -encoding UTF-8 -lang en -data 
> en-ner-person.train -model en-ner-person.bin
>
> I do not know where to get en-ner-person.train, so I made a
> training file (training.txt) and added training data as follows:
>
> <START:person> Pierre Vinken <END> , 61 years old , will join the board as a 
> nonexecutive director Nov. 29 .
> Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch 
> publishing group .
>
> My questions are:
> 1- How can I add features if I want to use the command-line training tool
> rather than the API? Can you please give me an example, if this is possible?

AFAIK in the current state feature extraction is only customizable
through the API.

> 2- Can we add features to the training data itself, I mean inside the
> annotation, as in <START: person feature=value>?

No. What would be the use case? Can you give a concrete example of
such a manual feature annotation? What goal do you want to achieve
with such annotations?
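
To make the format concrete: the opening tag only carries a type name,
and the name itself is whatever tokens sit between the tags. Below is a
minimal sketch of reading one training line into tokens and typed spans.
It is not OpenNLP's own parser; the class NameSampleSketch and its Span
type are hypothetical names for illustration, and it assumes tokens are
separated by whitespace and that each sentence sits on a single line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only, not part of OpenNLP.
class NameSampleSketch {

    // A typed token span covering tokens [start, end).
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ")";
        }
    }

    final List<String> tokens = new ArrayList<>();
    final List<Span> spans = new ArrayList<>();

    // Splits on whitespace; tags are dropped, everything else is a token.
    static NameSampleSketch parse(String line) {
        NameSampleSketch sample = new NameSampleSketch();
        String openType = null;
        int openStart = -1;
        for (String part : line.trim().split("\\s+")) {
            if (part.isEmpty()) {
                continue; // blank input line
            }
            if (part.startsWith("<START:") && part.endsWith(">")) {
                if (openType != null) {
                    throw new IllegalArgumentException("nested <START> tag");
                }
                openType = part.substring("<START:".length(), part.length() - 1);
                openStart = sample.tokens.size();
            } else if (part.equals("<END>")) {
                if (openType == null) {
                    throw new IllegalArgumentException("<END> without <START>");
                }
                sample.spans.add(new Span(openStart, sample.tokens.size(), openType));
                openType = null;
            } else {
                sample.tokens.add(part);
            }
        }
        if (openType != null) {
            throw new IllegalArgumentException("missing <END>");
        }
        return sample;
    }

    public static void main(String[] args) {
        NameSampleSketch s = parse(
            "Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. .");
        System.out.println(s.tokens);
        System.out.println(s.spans);
    }
}
```

Nothing in this representation leaves room for per-name feature values:
a sample is just tokens plus typed spans.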

> 3- Does the OpenNLP tool have a way to generate these features automatically
> from the training data?

OpenNLP already generates its features automatically by combining
several feature extractors, as in:

https://svn.apache.org/repos/asf/incubator/opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/namefind/DefaultNameContextGenerator.java

None of those feature extractors expect any kind of manual
annotations. This is expected, since in general the text you want to
analyze with a NameFinder instance will not have any kind of
annotations.
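
As a rough illustration of the idea (not the actual OpenNLP classes),
here is a minimal sketch of combining several per-token feature
extractors that only look at the raw tokens. The FeatureExtractor
interface, the three extractors and the feature string formats are all
hypothetical; see DefaultNameContextGenerator above for what OpenNLP
really does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only, not part of OpenNLP.
class FeatureSketch {

    // One extractor adds zero or more string features for the token at index.
    interface FeatureExtractor {
        void extract(List<String> features, String[] tokens, int index);
    }

    // The lower-cased token itself.
    static class TokenFeature implements FeatureExtractor {
        public void extract(List<String> features, String[] tokens, int index) {
            features.add("w=" + tokens[index].toLowerCase());
        }
    }

    // A coarse shape class: number, capitalized, lower case, or other.
    static class TokenShapeFeature implements FeatureExtractor {
        public void extract(List<String> features, String[] tokens, int index) {
            String t = tokens[index];
            String shape;
            if (!t.isEmpty() && t.chars().allMatch(Character::isDigit)) {
                shape = "num";
            } else if (!t.isEmpty() && Character.isUpperCase(t.charAt(0))) {
                shape = "cap";
            } else if (!t.isEmpty() && t.chars().allMatch(Character::isLowerCase)) {
                shape = "lc";
            } else {
                shape = "other";
            }
            features.add("shape=" + shape);
        }
    }

    // The previous token, or a marker at the start of the sentence.
    static class PreviousTokenFeature implements FeatureExtractor {
        public void extract(List<String> features, String[] tokens, int index) {
            features.add("prev=" + (index > 0 ? tokens[index - 1].toLowerCase() : "<bos>"));
        }
    }

    static List<FeatureExtractor> defaultExtractors() {
        return Arrays.asList(
            new TokenFeature(), new TokenShapeFeature(), new PreviousTokenFeature());
    }

    // Runs every extractor in order and collects their features.
    static List<String> featuresFor(String[] tokens, int index,
                                    List<FeatureExtractor> extractors) {
        List<String> features = new ArrayList<>();
        for (FeatureExtractor extractor : extractors) {
            extractor.extract(features, tokens, index);
        }
        return features;
    }

    public static void main(String[] args) {
        String[] tokens = "Mr . Vinken is chairman of Elsevier N.V. .".split(" ");
        for (int i = 0; i < tokens.length; i++) {
            System.out.println(tokens[i] + " -> "
                + featuresFor(tokens, i, defaultExtractors()));
        }
    }
}
```

The same extractors run at training time and at tagging time, which is
why they can only depend on the tokens and not on hand-written
annotations.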

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
