Re: Brat Annotation to OpenNLP model

Kayak28 Tue, 28 Jan 2020 19:32:22 -0800

Hello, Mr. William Colen:

I apologize for the unclear input strings. The input was actually tokenized,
and the terms were separated by white space.
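
To make "tokenized" concrete, here is a minimal sketch of how a brat entity annotation lines up with whitespace-separated tokens, and what the resulting OpenNLP name finder training line would look like. It is only an illustration, not the converter that TokenNameFinderTrainer.brat uses internally. The segmentation of the example sentence, the PERSON offsets, and the helper names parse_ann and to_opennlp_line are assumptions made for this example.

```python
# Sketch: turn brat entity annotations (character offsets) into one line of
# OpenNLP name finder training data (<START:type> ... <END>).
# Assumptions: tokens are separated by single spaces, and every entity
# starts and ends exactly on a token boundary.


def parse_ann(ann_text):
    """Return (type, start, end) for each text-bound 'T' line of a brat .ann file."""
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, events, notes, and so on
        _id, info, _surface = line.split("\t", 2)
        if ";" in info:
            continue  # discontinuous spans are not handled in this sketch
        etype, start, end = info.split(" ")
        entities.append((etype, int(start), int(end)))
    return entities


def to_opennlp_line(text, entities):
    """Wrap annotated tokens in <START:type> ... <END> markers."""
    starts = {s: t for t, s, _e in entities}
    ends = {e for _t, _s, e in entities}
    out, pos = [], 0
    for token in text.split(" "):
        if pos in starts:
            out.append("<START:%s>" % starts[pos])
        out.append(token)
        pos += len(token)
        if pos in ends:
            out.append("<END>")
        pos += 1  # account for the single space between tokens
    return " ".join(out)


# Hypothetical example: the sentence segmented by hand, with one PERSON span.
text = "スティーブ ジョブス は 偉い 人"
ann = "T1\tPERSON 0 10\tスティーブ ジョブス"
print(to_opennlp_line(text, parse_ann(ann)))
# prints: <START:PERSON> スティーブ ジョブス <END> は 偉い 人
```

The same whitespace-separated form is what the name finder expects at prediction time, so the text passed to TokenNameFinder should be segmented the same way as the training data.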


Thank you for responding to me.

Sincerely,
Kaya Ota




On Tue, Jan 28, 2020 at 21:03, William Colen <[email protected]> wrote:

> Is your input tokenized? I don't know Japanese, but as far as I know, the
> OpenNLP name finder requires tokenized input.
>
> I hope it helps.
>
> William
>
> On Tue, Jan 28, 2020 at 00:08, Kayak28 <[email protected]> wrote:
>
> > Hello, OpenNLP community:
> >
> > I am having trouble converting annotations made with the Python brat
> > annotation tool into an OpenNLP model.
> >
> > I would like to create a custom model in Japanese that can be used by
> > OpenNLP.
> >
> > Currently, I have an xxx.ann file (which I annotated using brat), an
> > xxx.txt file (the original natural-language document), and an annotation
> > configuration file.
> >
> > After reading this answer
> > (https://stackoverflow.com/questions/39877434/creating-and-training-a-model-for-opennlp-using-brat),
> > I tried the following command:
> >
> > opennlp TokenNameFinderTrainer.brat -bratDataDir nlp_data/
> > -annotationConfig exercise/brat-1.3_Crunchy_Frog/annotation.conf -model
> > output_model.bin -lang ja
> > (nlp_data is the directory where the .ann and .txt files are stored.)
> >
> > The run finished with the following console output:
> >
> > Training data summary:
> > #Sentences: 8030
> > #Tokens: 11953
> > #LOCATION entities: 28
> > #PERSON entities: 52
> > #VEHICLE entities: 14
> > #FACILITY entities: 88
> > #PLAN entities: 7
> > #EVENT_OTHER entities: 4
> > #EVENT entities: 20
> > #FACILITY_OTHER entities: 2
> > #ORGANIZATION entities: 59
> > #DATETIME entities: 110
> > #PRINTING entities: 8
> > #PRODUCT entities: 186
> > #TITLE entities: 7
> > #ACCESS entities: 255
> > #FOOD entities: 19
> >
> > Writing name finder model ... Compressed 52591 parameters to 10192
> > 427 outcome patterns
> > done (0.375s)
> >
> > Wrote name finder model to
> > path: /home/vagrant/output_model.bin
> >
> > However, when I used output_model.bin with the command below, it did not
> > tag any named entities and simply echoed my input:
> > ```sh
> > opennlp TokenNameFinder output_model.bin
> > Loading Token Name Finder model ... done (0.132s)スティーブ ジョブスは偉い人
> >
> > (my input) $スティーブジョブスは偉い人
> > (output) スティーブジョブスは偉い人
> > ```
> > So, my questions are:
> > Did I take the wrong steps?
> > If so, how should I use the xxx.ann file to build a custom model in
> > OpenNLP?
> >
> > Any help will be appreciated.
> >
> >
> > Sincerely,
> > Kaya Ota
> >
>
