Hello, Mr. William Colen:

I apologize for the unclear input strings. The input was in fact tokenized, with terms separated by white space.
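For reference, here is a minimal sketch of how one could check whether a line is whitespace-tokenized before passing it to the name finder. It uses the two input strings that appear in the original message below. The helper name `count_tokens` is hypothetical, and the sketch assumes the command-line name finder treats each whitespace-separated chunk as one token.

```sh
# Hypothetical helper: count the whitespace-separated tokens in a line.
count_tokens() {
  printf '%s\n' "$1" | awk '{ print NF }'
}

# The two input strings from the original message.
with_space='スティーブ ジョブスは偉い人'
without_space='スティーブジョブスは偉い人'

count_tokens "$with_space"      # prints 2
count_tokens "$without_space"   # prints 1: the whole sentence is a single token

# Assumption: a tokenized line could then be piped to the name finder, e.g.
# printf '%s\n' "$with_space" | opennlp TokenNameFinder output_model.bin
```

If the count is 1 for a full sentence, the line reaches the name finder as one large token, which may be one reason no entities come back.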
Thank you for responding to me.

Sincerely,
Kaya Ota

On Tue, Jan 28, 2020 at 21:03, William Colen <william.co...@gmail.com> wrote:

> Is your input tokenized? I don't know Japanese, but what I know of the
> OpenNLP name finder is that it requires tokenized input.
>
> I hope it helps.
>
> William
>
> On Tue, Jan 28, 2020 at 00:08, Kayak28 <kaya.ota....@gmail.com> wrote:
>
> > Hello, OpenNLP community:
> >
> > I am having trouble converting annotations from the Python-based brat
> > annotation tool to OpenNLP.
> >
> > I would like to create a custom Japanese model that can be used by
> > OpenNLP.
> >
> > Currently, I have an xxx.ann file (which I annotated using brat), an
> > xxx.txt file (the original natural-language document), and an annotation
> > configuration file.
> >
> > After reading this answer (
> > https://stackoverflow.com/questions/39877434/creating-and-training-a-model-for-opennlp-using-brat
> > ), I tried the following command:
> >
> > opennlp TokenNameFinderTrainer.brat -bratDataDir nlp_data/
> > -annotationConfig exercise/brat-1.3_Crunchy_Frog/annotation.conf -model
> > output_model.bin -lang ja
> > (nlp_data is the directory where the .ann and .txt files are stored.)
> >
> > The run finished with the following console output:
> >
> > Training data summary:
> > #Sentences: 8030
> > #Tokens: 11953
> > #LOCATION entities: 28
> > #PERSON entities: 52
> > #VEHICLE entities: 14
> > #FACILITY entities: 88
> > #PLAN entities: 7
> > #EVENT_OTHER entities: 4
> > #EVENT entities: 20
> > #FACILITY_OTHER entities: 2
> > #ORGANIZATION entities: 59
> > #DATETIME entities: 110
> > #PRINTING entities: 8
> > #PRODUCT entities: 186
> > #TITLE entities: 7
> > #ACCESS entities: 255
> > #FOOD entities: 19
> >
> > Writing name finder model ... Compressed 52591 parameters to 10192
> > 427 outcome patterns
> > done (0.375s)
> >
> > Wrote name finder model to
> > path: /home/vagrant/output_model.bin
> >
> > However, when I use the output_model.bin file with the command below, it
> > does not return any named entities. It simply echoes the input, like this:
> > ```sh
> > opennlp TokenNameFinder output_model.bin
> > Loading Token Name Finder model ... done (0.132s)スティーブ ジョブスは偉い人
> >
> > (my input) $スティーブジョブスは偉い人
> > (output) スティーブジョブスは偉い人
> > ```
> > So, my questions are:
> > Did I take the wrong steps?
> > If so, how should I use the xxx.ann file to create a custom model in
> > OpenNLP?
> >
> > Any help will be appreciated.
> >
> > Sincerely,
> > Kaya Ota