Author: rwesten
Date: Wed Jan 30 13:41:52 2013
New Revision: 1440412

URL: http://svn.apache.org/viewvc?rev=1440412&view=rev
Log:
moved information from the customnermodelengine to opennlpcustomner

Removed:
    
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/customnermodelengine.mdtext
Modified:
    
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext
    
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/opennlpcustomner.mdtext

Modified: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext?rev=1440412&r1=1440411&r2=1440412&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext 
(original)
+++ 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext 
Wed Jan 30 13:41:52 2013
@@ -84,7 +84,7 @@ NER engines need to write detected Named
        * detects occurrences of persons, places and organizations only
        * supports [NER 
annotations](../nlp/nlpannotations#name-entity-ner-annotations)
 
-* __[Custom NER Model Extraction Enhancement 
Engine](customnermodelengine.html):__ 
+* __[OpenNLP Custom NER Model Engine](opennlpcustomner):__ 
        * NLP processing using OpenNLP NER
        * uses custom NameFinder models (user configured)
        * supports custom Named Entity types (other than persons, places and 
organizations

Modified: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/opennlpcustomner.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/opennlpcustomner.mdtext?rev=1440412&r1=1440411&r2=1440412&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/opennlpcustomner.mdtext
 (original)
+++ 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/opennlpcustomner.mdtext
 Wed Jan 30 13:41:52 2013
@@ -0,0 +1,53 @@
+Title: The OpenNLP Custom NER Model Extraction Engine
+
+This engine allows the configuration of custom [Apache 
OpenNLP](http://opennlp.apache.org) NameFinder models for NER of plain text 
content. 
+
+
+## Example Result
+
+This engine adds [fise:TextAnnotation](../enhancementstructure.html#fisetextannotation) instances for the processed plain text to the metadata of the content item. The following listing shows a Named Entity of type DNA detected by an OpenNLP NameFinder model trained on the [BioNLP2004](http://www.nactem.ac.uk/tsujii/GENIA/ERtask/report.html) dataset:
+
+    :::json
+    {
+        "@subject": "urn:enhancement-0e31eb01-23c5-82b5-1372-5c5606c09960",
+        "@type": [
+            "Enhancement",
+            "TextAnnotation"
+        ],
+        "confidence": 0.40148407,
+        "creator": "org.apache.stanbol.enhancer.engines.opennlp.impl.CustomNERModelEnhancementEngine",
+        "start": 228,
+        "end": 242,
+        "extracted-from": "urn:content-item-sha1-84a30aeeb073be543f7c54266e232aae572efac0",
+        "selected-text": {
+            "@language": "en",
+            "@literal": "HIV-2 enhancer"
+        },
+        "selection-context": {
+            "@language": "en",
+            "@literal": "activation of the HIV-2 enhancer in monocytes and T cells"
+        },
+        "type": "http://www.bootstrep.eu/ontology/GRO#DNA"
+    },
+
+## Configuration
+
+Using this engine requires creating a service configuration. Each configuration must specify at least one NameFinderModel name.
+
+### Parameters
+
+* __Name Finder Models__ _(stanbol.engines.opennlp-ner.nameFinderModels)_: The list of custom NameFinderModels used by this engine. The engine accepts arrays, vectors, and comma-separated strings. Values are the file names of the NameFinderModel files. Configured files are loaded through the DataFileProvider service, which means that files copied into the 'datafiles' folder (by default located at '{stanbol-working-dir}/stanbol/datafiles') can be loaded by the engine.
+* __Named Entity to 'dc:type' Mappings__ _(stanbol.engines.opennlp-ner.typeMappings)_: This configuration uses the syntax "{named-entity-type} > {uri}". {named-entity-type} must match the string "name" used for the named entity type in the OpenNLP NameFinder model. {uri} MUST BE a valid URI and is used as the dc:type value of the fise:TextAnnotations that the engine creates for extracted Named Entities. NOTE: TextAnnotations for unmapped Named Entity types will have no dc:type information.
+
+The following figure shows an engine configuration that covers all NamedEntity types supported by the [BioNLP2004](http://www.nactem.ac.uk/tsujii/GENIA/ERtask/report.html) dataset.
+
+!['CustomNerModelEngine Configuration'](customnermodelengineconfig.png "This figure shows the configuration screen as presented by the Apache Felix WebConsole when creating a Component Configuration for the Custom NER Model Engine")
+
+The same configuration can also be provided as an OSGi configuration file named 'org.apache.stanbol.enhancer.engines.opennlp.impl.CustomNERModelEnhancementEngine-ehealthner.config' with the following contents:
+
+    :::text
+    stanbol.enhancer.engine.name="ehealth-ner"
+    stanbol.engines.opennlp-ner.nameFinderModels=["bionlp2004-DNA-en.bin","bionlp2004-protein-en.bin","bionlp2004-cell_type-en.bin","bionlp2004-cell_line-en.bin","bionlp2004-RNA-en.bin"]
+    stanbol.engines.opennlp-ner.typeMappings=["DNA\ >\ http://www.bootstrep.eu/ontology/GRO#DNA","RNA\ >\ http://www.bootstrep.eu/ontology/GRO#RNA","protein\ >\ http://www.bootstrep.eu/ontology/GRO#Protein","cell_type\ >\ http://purl.bioontology.org/ontology/CL","cell_line\ >\ http://purl.bioontology.org/ontology/MCCL"]
+
+NOTE: the '.config' format requires spaces to be escaped with '\'.
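
The type-mapping syntax and the space escaping described in the added documentation can be illustrated with a small Python sketch. It is only an illustration, not part of Stanbol: the helper names `parse_type_mapping` and `to_config_value` are hypothetical, and the sketch assumes that spaces are the only characters that need escaping in the values shown here.

```python
def parse_type_mapping(entry):
    """Split a '{named-entity-type} > {uri}' mapping into (type, uri)."""
    name, sep, uri = entry.partition(">")
    name, uri = name.strip(), uri.strip()
    if not sep or not name or not uri:
        raise ValueError("not a '{type} > {uri}' mapping: %r" % entry)
    return name, uri


def to_config_value(entries):
    """Render entries as a '.config' array value.

    Assumption: only spaces are escaped (with a backslash), as in the
    NOTE of the documentation; other special characters are not handled.
    """
    quoted = ['"' + entry.replace(" ", "\\ ") + '"' for entry in entries]
    return "[" + ",".join(quoted) + "]"


mappings = [
    "DNA > http://www.bootstrep.eu/ontology/GRO#DNA",
    "RNA > http://www.bootstrep.eu/ontology/GRO#RNA",
]

# Check that every entry follows the expected syntax before writing it out.
for entry in mappings:
    parse_type_mapping(entry)

line = "stanbol.engines.opennlp-ner.typeMappings=" + to_config_value(mappings)
print(line)
```

The printed line has the same shape as the `typeMappings` entry in the configuration file above, restricted to the two mappings listed in `mappings`.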

