2017-07-19 10:48 GMT+01:00 Manoj B. Narayanan <manojb.narayanan2...@gmail.com>:
> Hi all,
>
> I wanted to find out if there is any specific reason behind using XML
> format for dictionaries for Name Finder.

It's not XML. There is a very superficial similarity in the use of <>, but,
at a minimum,

    <START:person> Pierre Vinken <END>

would need to be something like

    <name type="person"> Pierre Vinken </name>

and the whole document would need to be enclosed by a pair of tags.

> Also, is there any source from where we can get the documentation regarding
> the dictionary formats for various tools (tokenizer, pos, name finder).

The manual:
https://opennlp.apache.org/docs/1.8.1/manual/opennlp.html

More specifically,

tokeniser:
https://opennlp.apache.org/docs/1.8.1/manual/opennlp.html#tools.tokenizer.training

pos:
https://opennlp.apache.org/docs/1.8.1/manual/opennlp.html#tools.postagger.training

name finder:
https://opennlp.apache.org/docs/1.8.1/manual/opennlp.html#tools.namefind.training
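
On the XML point: to make the difference concrete, here is a minimal Python sketch that rewrites the <START:type> ... <END> markers into what a well-formed XML equivalent might look like. This is only an illustration, not anything OpenNLP provides or expects: the function name to_xml, the pattern SPAN, and the <document>, <sentence> and <name> element names are all made up for this example, and it assumes every span carries a type and that the markers are separated from the text by whitespace.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Matches one annotated span such as "<START:person> Pierre Vinken <END>".
# Assumption: every span has a type and whitespace around the entity text.
SPAN = re.compile(r"<START:(?P<type>[^>\s]+)>\s+(?P<text>.*?)\s+<END>")


def to_xml(lines, root="document"):
    """Wrap annotated sentences in a hypothetical, well-formed XML layout."""
    out = [f"<{root}>"]  # XML needs a single enclosing root element
    for line in lines:
        parts, pos = [], 0
        for m in SPAN.finditer(line):
            # Plain text before the span must be escaped (&, <, >).
            parts.append(escape(line[pos:m.start()]))
            # Each span becomes a properly closed element with an attribute.
            parts.append(
                f"<name type={quoteattr(m.group('type'))}>"
                f"{escape(m.group('text'))}</name>"
            )
            pos = m.end()
        parts.append(escape(line[pos:]))
        out.append("  <sentence>" + "".join(parts) + "</sentence>")
    out.append(f"</{root}>")
    return "\n".join(out)


if __name__ == "__main__":
    print(to_xml(["<START:person> Pierre Vinken <END> , 61 years old"]))
```

The need for a root element, matching close tags, and escaping of characters such as & is what separates the two formats, which is why the training data cannot be fed to an XML parser as it stands.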