Hi Stefan


On Wed, May 14, 2014 at 4:08 PM, Stefan Bunk
<stefan.b...@student.hpi.uni-potsdam.de> wrote:
> Hi,
>
> I have problems using the Custom NER Model Extraction Engine [1].
> Basically, no entities are found, even though the underlying model is
> correct.
>
> Here's what I did:
> 1. I built a custom NER model for places from geonames.org according to
> the OpenNLP website [2]. I tested my model with the OpenNLP command line
> tool, and it worked (i.e. I give my model a text and the entities are found
> correctly).
> 2. I copied the model to both ./launchers/stanbol/datafiles/geonames.bin
> and ./enhancement-engines/topic/engine/sling/datafiles/geonames.bin.

You need to copy the model to the datafiles folder of your Stanbol
instance. By default this is "./stanbol/datafiles", relative to the
directory Stanbol is started from. So if you run Stanbol in "/foo/bar",
the model needs to be available under
"/foo/bar/stanbol/datafiles/geonames.bin".

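A quick way to double-check the location is to build the expected path
and test whether the file is there. This is only a minimal Python sketch,
assuming the default "./stanbol/datafiles" layout described above; the
function name is made up for illustration.

```python
from pathlib import Path


def expected_model_path(working_dir, model_file, datafiles_dir="stanbol/datafiles"):
    """Return the path where the model file is expected to live.

    Assumption: Stanbol uses its default datafiles folder, relative to the
    directory it was started from. Change datafiles_dir if you configured
    a different location.
    """
    return Path(working_dir) / datafiles_dir / model_file


def model_is_in_place(working_dir, model_file):
    """True if the model file exists at the expected location."""
    return expected_model_path(working_dir, model_file).is_file()
```

For the example above, expected_model_path("/foo/bar", "geonames.bin")
points at "/foo/bar/stanbol/datafiles/geonames.bin".
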
> 3. In the Apache Felix Web Console Configuration, I created a new "Custom
> NER Model" with the following settings:
>                 - name: Geonames NER

This is the name of the engine. Typically, lower-case names with '-' as
word separator or CamelCase names are used. So I suggest using
"geonames-ner" as the name of your engine.

>                 - Name Finder Model: geonames.bin
>                 - Type Mappings: place > http://dbpedia.org/ontology/Place
>                 - Ranking: -100
> 4. I built a new enhancement chain with: tika, langdetect,
> opennlp-sentence, opennlp-token, opennlp-pos, opennlp-ner, geonames-ner,
> geonames

Based on the provided information, you used "Geonames NER" as the name of
your engine. This chain, however, refers to "geonames-ner". I would expect
the chain to be unsatisfied, as no "geonames-ner" engine is around.
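
The name check is an exact string comparison, which the following Python
sketch illustrates. It is not Stanbol code: the function is hypothetical,
and the list of active engines is invented for the example (it simply
assumes all other engines of the chain are present).

```python
def missing_engines(chain, active_engine_names):
    """Return the chain entries for which no engine with exactly that name exists."""
    active = set(active_engine_names)
    return [name for name in chain if name not in active]


# Chain as listed in step 4 of the original mail.
chain = ["tika", "langdetect", "opennlp-sentence", "opennlp-token",
         "opennlp-pos", "opennlp-ner", "geonames-ner", "geonames"]

# Hypothetical set of active engines; note the engine was named "Geonames NER".
active = ["tika", "langdetect", "opennlp-sentence", "opennlp-token",
          "opennlp-pos", "opennlp-ner", "Geonames NER", "geonames"]

unsatisfied = missing_engines(chain, active)
```

Here unsatisfied contains only "geonames-ner", because "Geonames NER" is a
different name.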

> 5. Server restart

A server restart is not needed. If you update the model, you might need
to stop and start the OpenNLP component, as it keeps a SoftReference to
the loaded models.

> 6. I sent exactly the same string as in 1. when I tested the model, but no
> entities are found.

I would expect a ChainException, as your chain refers to "geonames-ner"
and the name of the configured engine is "Geonames NER".

>
> Any hint would be useful!
> How can I check, that Stanbol correctly finds my geonames.bin file? If I
> intentionally add a file which does not exist, no error occurs.

The "Stanbol Data File Provider" tab of the Felix Web Console provides
information about requested data files. The Custom NER Model Engine also
logs at INFO level.


As I had not used the Custom NER engine for a long time, I tested it
successfully with 0.12.1-SNAPSHOT [4]

* by using [3], the default English place model
* renaming it to geonames-ner.bin
* copying it to the ./stanbol/datafiles folder of my test instance
* configuring a Custom NER engine with

    stanbol.engines.opennlp-ner.typeMappings=["location\ >\ http://dbpedia.org/ontology/Place"]
    stanbol.enhancer.engine.name="geonames-ner"
    stanbol.engines.opennlp-ner.nameFinderModels=["geonames-ner.bin"]

* configuring a Weighted Chain with

    stanbol.enhancer.chain.weighted.chain=["langdetect","opennlp-sentence","opennlp-token","geonames-ner"]
    stanbol.enhancer.chain.name="geonames-ner"

This setup provided the expected results, meaning the exact same
list of locations as when using the "opennlp-ner" engine.
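
To repeat such a test, the text can be posted to the chain's endpoint
under "/enhancer/chain/<chain name>". Below is a minimal Python sketch of
such a request. The base URL "http://localhost:8080" and the "text/turtle"
result format are assumptions about a default local launcher, and the
function names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_enhancer_request(text, chain="geonames-ner",
                           base_url="http://localhost:8080"):
    """Build a POST request that sends plain text to an enhancement chain.

    Assumption: Stanbol runs locally on port 8080; adjust base_url otherwise.
    """
    url = f"{base_url.rstrip('/')}/enhancer/chain/{quote(chain, safe='')}"
    return Request(
        url,
        data=text.encode("utf-8"),  # a request body makes urllib use POST
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "text/turtle",  # assumed result serialization
        },
    )


def enhance(text, **kwargs):
    """Send the text to the chain and return the raw response body."""
    with urlopen(build_enhancer_request(text, **kwargs)) as response:
        return response.read().decode("utf-8")
```

Calling enhance("Paris is the capital of France.") against a running
instance should return the enhancement results, or an error if the chain
cannot be executed.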


As you do not get a ChainException, the most likely reason for your
problem is that the "geonames.bin" model is not in the correct folder.
As soon as the model is available you should see a message like

15.05.2014 10:38:28.739 *INFO* [DataFileTrackingDaemon]
org.apache.stanbol.enhancer.engines.opennlp.impl.CustomNERModelEnhancementEngine
register custom NameFinderModel from resource: geonames-ner.bin for
language: en to NamedModelFileListener (name:opennlp-ner)

in the logs.
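
If the log is large, a small script can search it for that message. The
Python sketch below assumes the message wording shown above stays the
same; the function name is made up, and where your log file lives depends
on your setup.

```python
import re

# Matches the registration message quoted above; whitespace is matched
# loosely so that wrapped lines (as in this mail) are found as well.
REGISTERED = re.compile(
    r"register custom NameFinderModel from resource:\s*(?P<model>\S+)"
    r"\s+for\s+language:\s*(?P<lang>\S+)"
)


def registered_models(log_text):
    """Return (model file, language) pairs for each registration message."""
    return [(m.group("model"), m.group("lang"))
            for m in REGISTERED.finditer(log_text)]
```

An empty result for your log means the engine has not (yet) picked up a
model file.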

hope this helps
best
Rupert

>
> Thanks in advance
> Stefan
>
>
>
>
> [1]
> https://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlpcustomner
> [2]
> http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Name_Finder

[3] http://dev.iks-project.eu/downloads/opennlp/models-1.5/en-ner-location.bin
[4] http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/



-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO 
..........................................................................
| http://redlink.co/
