All, I loaded the ModelBuilder-prototype project I mentioned earlier into
the sandbox. Please take a look when you get a chance. I have already built a
few decent models with it for location and person entities. The Example
class walks through how it works, and you can work from there. The
impls used are file-based, so you should be able to create a file of
sentences, a file of known entities, and a blacklist file to run the examples.
A good use case is something like this:
I have a corpus of data that can be broken into sentences. I know my data,
so I can sample some of it to create lists of entities of different types
based on random searches (a list of people's names, for example). From here
the model builder takes the list of sentences and searches for all the known
entities. When it finds them, it annotates the sentence and writes the
annotated sentences to a file. The file is then used to create a model, the
model is used to extract NEs, and the results (if they pass validation) are
added to the list of known entities. Then the loop starts over:
1: read sentences
2: extract knowns
3: annotate sentences based on knowns
4: build a model from the annotations
5: extract NEs with the model
6: add the names to the known entities
7: goto 1
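
To make the annotation step concrete, here is a rough sketch of what "annotate sentences based on knowns" could look like. It is not the code in the sandbox project: the class and method names (KnownEntityAnnotator, annotate, annotateAll) are made up for illustration, and it assumes whitespace-tokenized sentences and the <START:type> ... <END> markup that the OpenNLP name finder reads as training data. It ignores the blacklist and the validation step entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the modelbuilder-prototype code.
final class KnownEntityAnnotator {

    /**
     * Wraps every whole-word occurrence of a known entity in
     * "<START:type> ... <END>". Returns the sentence unchanged if
     * nothing matches.
     */
    static String annotate(String sentence, List<String> knownEntities, String type) {
        // Try longer names first so "John Smith" wins over "John".
        List<String> sorted = new ArrayList<>(knownEntities);
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));

        StringBuilder alternation = new StringBuilder();
        for (String entity : sorted) {
            if (entity.isEmpty()) {
                continue;
            }
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            // Quote the entity so characters like '.' are matched literally.
            alternation.append(Pattern.quote(entity));
        }
        if (alternation.length() == 0) {
            return sentence;
        }

        // The lookarounds keep "Mary" from matching inside "Maryland".
        Pattern pattern = Pattern.compile(
            "(?<![\\p{L}\\p{N}])(" + alternation + ")(?![\\p{L}\\p{N}])");
        Matcher matcher = pattern.matcher(sentence);
        String replacement = Matcher.quoteReplacement("<START:" + type + "> ")
            + "$1"
            + Matcher.quoteReplacement(" <END>");
        return matcher.replaceAll(replacement);
    }

    /** Keeps only the sentences in which at least one known entity was found. */
    static List<String> annotateAll(List<String> sentences, List<String> knownEntities, String type) {
        List<String> annotated = new ArrayList<>();
        for (String sentence : sentences) {
            String result = annotate(sentence, knownEntities, type);
            if (!result.equals(sentence)) {
                annotated.add(result);
            }
        }
        return annotated;
    }
}
```

For example, annotate("John Smith met Mary in Boston .", [John, John Smith, Mary], "person") gives "<START:person> John Smith <END> met <START:person> Mary <END> in Boston ." The lines returned by annotateAll are what would be written to the training file, one sentence per line.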

Let me know what you think.

MG



On Sun, Oct 20, 2013 at 8:59 AM, <[email protected]> wrote:

> Author: markg
> Date: Sun Oct 20 12:59:13 2013
> New Revision: 1533881
>
> URL: http://svn.apache.org/r1533881
> Log:
> Prototype of a tool to allow users to create models from  of a set of
> known entities based on their own data in the form of sentences.
> See the Example class in the .v2 package.
>
> Added:
>     opennlp/sandbox/modelbuilder-prototype/
>
>
