Hello, OpenNLP community:

I am having trouble converting annotations made with the Python-based brat
annotation tool for use with OpenNLP.

I would like to train a custom Japanese name finder model that OpenNLP can
use.

Currently, I have an xxx.ann file (which I annotated using brat), an
xxx.txt file (the original natural-language document), and an annotation
configuration file.

Following this answer
(https://stackoverflow.com/questions/39877434/creating-and-training-a-model-for-opennlp-using-brat),
I ran the following command:

opennlp TokenNameFinderTrainer.brat -bratDataDir nlp_data/
-annotationConfig exercise/brat-1.3_Crunchy_Frog/annotation.conf -model
output_model.bin -lang ja
(nlp_data is the directory where the .ann and .txt files are stored.)

The command finished with the following console output:

Training data summary:
#Sentences: 8030
#Tokens: 11953
#LOCATION entities: 28
#PERSON entities: 52
#VEHICLE entities: 14
#FACILITY entities: 88
#PLAN entities: 7
#EVENT_OTHER entities: 4
#EVENT entities: 20
#FACILITY_OTHER entities: 2
#ORGANIZATION entities: 59
#DATETIME entities: 110
#PRINTING entities: 8
#PRODUCT entities: 186
#TITLE entities: 7
#ACCESS entities: 255
#FOOD entities: 19

Writing name finder model ... Compressed 52591 parameters to 10192
427 outcome patterns
done (0.375s)

Wrote name finder model to
path: /home/vagrant/output_model.bin

However, when I use the output_model.bin file with the command below, it
does not detect any named entities; my input is simply echoed back:
```sh
opennlp TokenNameFinder output_model.bin
Loading Token Name Finder model ... done (0.132s)スティーブ ジョブスは偉い人

(my input) $スティーブジョブスは偉い人
(output) スティーブジョブスは偉い人
```
So, my questions are:
Did I take the wrong steps?
If so, how should I use the xxx.ann file to build a custom model in
OpenNLP?

Any help will be appreciated.


Sincerely,
Kaya Ota
