Thanks for your reply, Rodrigo. I will look into what you suggested.

--
Thanks
Madhav Sharan

On Mon, Nov 16, 2015 at 12:38 AM, Rodrigo Agerri <rage...@apache.org> wrote:

> Hello,
>
> I am not entirely sure but I think the English NER models were trained
> on MUC 7 data. Note that supervised learning approaches to NLP in
> general work suffer the "domain adaptation problem". Basically that
> means that you are deploying a model learned from some specific type
> of data to other type of data which is quite different. Performance
> degrades as a result.
>
> To improve your results the best is to train your own model (need
> annotated data for that). If you do not have annotated data from your
> own domain, you can use a newer dataset such as Ontonotes and train
> your model with that data.
>
> Optionally, if you have a type of locations which happen fairly
> regularly, you can also try to use the DictionaryNameFinder to use
> lists of locations and the RegexNameFinder to create rules using
> regular expressions for location finding.
>
> HTH,
>
> Rodrigo
>
> On Sun, Nov 1, 2015 at 6:15 AM, Madhav Sharan <msha...@usc.edu> wrote:
> > Hello opennlp users,
> >
> > I am facing some issue while extracting locations from file contents. Using
> > en-ner-location.bin I am able to extract location if it's provided in
> > camelcase but not if otherwise.
> >
> > *For example :*
> > - I can extract "China" out of - "A geographically distributed network of
> >   *China*"
> > - But not from - "A geographically distributed network of *china*"
> >
> > I already tried converting whole text to camel case but it makes matter
> > worse, so instead of trying more solution based on my intuitions would be
> > best for me if I can get help on below two questions:
> >
> > Can someone suggest an enhancement?
> > Can someone help me know how en location name finder model is trained?
> > Location name finder model.en-ner-location.bin
> > <http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin>
> >
> > *What are we trying to do?*
> > We are building an opensource tool to extract location out of any file and
> > then visualize it on a map. These file will mostly coming from web content
> > but can be anything a user wish.
> >
> > --
> > Thanks
> > Madhav Sharan
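
For the dictionary idea in the quoted reply, a minimal sketch of a case-insensitive location lookup might look like the following. It is plain Python using only the standard `re` module, not OpenNLP's DictionaryNameFinder; the `find_locations` helper and the `LOCATIONS` list are hypothetical names made up for illustration, and a real gazetteer would be much larger.

```python
import re


def find_locations(text, gazetteer):
    """Return (start, end, matched_text) spans for gazetteer entries, ignoring case.

    Illustrative helper only; this is not part of OpenNLP.
    """
    if not gazetteer:
        return []
    # Put longer names first so "New York City" is preferred over "New York".
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


# Hypothetical location list, assumed for this example.
LOCATIONS = ["China", "New York", "New York City"]

text = "A geographically distributed network of china"
spans = find_locations(text, LOCATIONS)
print(spans)
```

Because the match ignores case, the lowercase "china" from the original question is found without changing the case of the whole text. The trade-off is that a pure list lookup has no context, so ambiguous words (for example "china" meaning porcelain) would also match.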