Re: IE Problem

Jim Wed, 07 Aug 2013 02:21:27 -0700

Your problem is a pure NER (named-entity recognition) problem, and yes, OpenNLP would help if you had enough training data. 10 examples is certainly not enough to train your own models, though.

If you're just looking for names of people or companies, you could use the pre-trained models that ship with OpenNLP. It should work... Alternatively, if you can identify common morphological similarities between your entities, then perhaps you can formulate them as regexes.

I think your best bet is to try the ready-made NER models, and if that doesn't work as expected, you can try regex, even though I don't think a regex will identify names of people reliably, no matter how well formed it is...
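
To make the regex fallback concrete, here is a rough sketch in Python using only the standard `re` module. Everything in it is an assumption for illustration, not something taken from your data: the honorific list, the company suffix list, and the helper name `extract_sender` are all hypothetical, and you would have to adapt them to the patterns you actually see in your annotated examples.

```python
import re

# Assumption: a person is introduced by an honorific at the start of a line,
# as in "Mr. XYZ". The list of honorifics is illustrative only.
PERSON = re.compile(
    r"^(?:Mr|Mrs|Ms|Dr|Herr|Frau)\.?\s+"
    r"(?P<name>[A-Z][\w'-]*(?:\s[A-Z][\w'-]*)*)"
)

# Assumption: a company line ends in a legal-form or generic suffix.
# The suffix list is illustrative only.
COMPANY = re.compile(r"^(?P<name>.*\b(?:GmbH|AG|KG|Inc|Ltd|Company))\.?$")


def extract_sender(block):
    """Scan an address block line by line and collect candidate entities."""
    found = {"person": [], "company": []}
    for raw in block.splitlines():
        line = raw.strip()
        match = PERSON.match(line)
        if match:
            found["person"].append(match.group("name"))
            continue
        match = COMPANY.match(line)
        if match:
            found["company"].append(match.group("name"))
    return found


if __name__ == "__main__":
    sample = "Mr. John Doe\nExample Company\nSample road 12514\nsomewhere"
    print(extract_sender(sample))
```

Note that a bare "John Doe" on its own line, without an honorific, is not picked up at all, which is exactly why I doubt regex alone will find people reliably.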


hope that helps, :)
Jim

On 06/08/13 14:22, Markus Marks wrote:

Hi all,
I'm a German computer science student who is currently writing his bachelor thesis. I am writing to you because I'm very desperate. I have to solve an information extraction task and I'm not quite sure how to solve it, and I was hoping you could help me or tell me whether OpenNLP would work.
Ok... here it comes:
Let's assume I have a sender's address from a letter, and I have a few annotated examples.
new document example with annotation
Mr. XYZ             Enterprise Something
Example Company                                     John Doe
Sample road 12514                                    somewhere else
somewhere another road
something
something something else
So the problem is how to come up with a matching or learning algorithm that can extract, for example, the name of the company or the name of a new sender, based on some annotated examples I can provide. The difficulty is that not every sender is written in the same order or with the same expressions.
The thing is that I only have very few examples, fewer than 10.
Do you have any suggestions on how to solve this? I would be really thankful, since I'm very disappointed at not finding a solution.
Yours thankfully,

Markus
