Hi!

I’m writing a sentiment analysis application based on product reviews and am 
interested in using OpenNLP for tokenization and for identifying named entities. 
The problem is that the standard models on the project homepage aren’t 
identifying nearly enough entities, and training a completely new model on 
my data is outside the scope of my project.

Both the training and test set texts have additional information available. Is 
there any way to augment (for instance) the person model so that it is more 
likely to properly identify Britney Spears as a person when the text is a 
product review of her CD (and it’s known beforehand that it’s “her” CD), or to 
identify Google as a company when the text is a review of one of their products 
(under the same conditions)?
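
To make the idea concrete, here is a rough sketch in plain Java of the kind of 
lookup I have in mind. It does not use the OpenNLP API at all; the class name 
KnownEntityTagger, the method tag and the "type[start,end)" span format are 
made up for illustration. It assumes the text is already tokenized and that the 
known names and their types come from the review metadata. The spans it returns 
could then be merged with whatever the pre-trained model finds.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP: tags entities that are already
// known from the review metadata by matching them against the tokens.
class KnownEntityTagger {

    // Returns one string per match, formatted as "type[start,end)", where
    // start is inclusive and end is exclusive (token indices).
    // Matching is case-insensitive and works on whole tokens only.
    static List<String> tag(String[] tokens, Map<String, String> knownEntities) {
        List<String> spans = new ArrayList<>();
        for (Map.Entry<String, String> entry : knownEntities.entrySet()) {
            // Split a multi-word name such as "Britney Spears" on whitespace.
            String[] name = entry.getKey().trim().split("\\s+");
            for (int start = 0; start + name.length <= tokens.length; start++) {
                boolean match = true;
                for (int i = 0; i < name.length; i++) {
                    if (!tokens[start + i].equalsIgnoreCase(name[i])) {
                        match = false;
                        break;
                    }
                }
                if (match) {
                    spans.add(entry.getValue() + "[" + start + "," + (start + name.length) + ")");
                }
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // Assumed metadata for a review of a Britney Spears CD.
        Map<String, String> known = new LinkedHashMap<>();
        known.put("Britney Spears", "person");
        String[] tokens = {"I", "love", "the", "new", "Britney", "Spears", "album"};
        System.out.println(tag(tokens, known));
    }
}
```

This only finds exact (case-insensitive) mentions of the known name, so it 
would miss “Britney” on its own or a misspelling, which is part of why I’m 
asking whether the model itself can be nudged instead.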

Is a pre-trained model the wrong approach here? Should I use a regex-based model 
instead, or some other approach? Or is the idea infeasible, and should I 
reconsider?


I appreciate any answer and whatever time you spend reading my email.


Sincerely 

Alexander
