Hi,

If you name your models after the default naming scheme, you can also use
the default configurations of the engines.
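
A quick way to see which model files already follow the default names is a small script like the one below. This is only a sketch: the patterns are the ones listed further down in this thread, the two-to-three-letter language code is an assumption, and the function names are made up for illustration (they are not part of Stanbol).

```python
import re
from pathlib import Path

# Default model file name patterns as listed in this thread.
# Assumption for this sketch: {lang} is a 2-3 letter lowercase code such as "fr".
DEFAULT_PATTERNS = {
    "sentence": re.compile(r"^[a-z]{2,3}-sent\.bin$"),
    "token": re.compile(r"^[a-z]{2,3}-token\.bin$"),
    "pos": re.compile(r"^[a-z]{2,3}-pos-(perceptron|maxent)\.bin$"),
    "chunker": re.compile(r"^[a-z]{2,3}-chunker\.bin$"),
    "namefinder": re.compile(r"^[a-z]{2,3}-ner-[a-z]+\.bin$"),
}


def classify_model(file_name):
    """Return the model kind whose default pattern matches, or None."""
    for kind, pattern in DEFAULT_PATTERNS.items():
        if pattern.match(file_name):
            return kind
    return None


def report(datafiles_dir):
    """Map every *.bin file in the directory to its kind (None = custom name)."""
    files = sorted(Path(datafiles_dir).glob("*.bin"))
    return {p.name: classify_model(p.name) for p in files}
```

Calling `report()` on your `stanbol/datafiles` directory returns a dict; any file mapped to `None` does not follow a default name and would need an explicit `model` setting on the engine that uses it.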

On Mon, Sep 22, 2014 at 12:35 PM, Mohammad Ghufran <emghuf...@gmail.com> wrote:
> (OpenNlpTokenizerEngine | name=custom-opennlp-token-fr)!

and

> I have configured an instance of Open Nlp Tokenizer with the following
> settings (which are the only things I can configure):
> name: custom-opennlp-token-fr
> language configuration: fr;model={fr-token.bin}

You need to configure

    fr;model=fr-token.bin

without the curly brackets.

But since "fr-token.bin" is the default anyway, you can simply remove the line altogether.
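
To make the expected shape of such a line concrete, here is a minimal sketch that splits a language configuration line and rejects a model value wrapped in curly brackets. The `lang;param=value` syntax is inferred from the example above; the function names are hypothetical and this is not how Stanbol itself parses the setting.

```python
def parse_language_config(line):
    """Split 'lang;key=value;...' into (lang, params)."""
    parts = [p.strip() for p in line.split(";")]
    lang, params = parts[0], {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {part!r}")
        params[key.strip()] = value.strip()
    return lang, params


def check_model_param(line):
    """Return the configured model file name (or None), rejecting braces."""
    _lang, params = parse_language_config(line)
    model = params.get("model")
    if model is not None and (model.startswith("{") or model.endswith("}")):
        raise ValueError(f"remove the curly brackets around the model name: {model!r}")
    return model
```

With this, `check_model_param("fr;model=fr-token.bin")` returns the file name, while the line with `{fr-token.bin}` raises a `ValueError`.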

>
> Am I doing it wrong? Also, any idea about the Open Calais engines?
>

If you use the full launcher ... do the following

* open http://localhost:8080/system/console/configMgr
* search for "OpenCalais" and click the edit button on the right side
* add your license information
* click OK and you should have an OpenCalais engine available


best
Rupert


> Thanks again!
> Ghufran
>
>
> On Mon, Sep 22, 2014 at 11:21 AM, Rupert Westenthaler <
> rupert.westentha...@gmail.com> wrote:
>
>> Hi Ghufran,
>>
>>
>> On Mon, Sep 22, 2014 at 10:42 AM, Mohammad Ghufran <emghuf...@gmail.com>
>> wrote:
>> > Hello,
>> >
>> > I am interested in using Stanbol as part of my Research project but I am
>> > having trouble handling languages other than English. I realize that this
>> > list is for development and my questions may not be 100% relevant to
>> > development, but this is the best place I could find to ask for help. I'd
>> > appreciate if someone can guide me a little given that documentation is
>> > quite sparse!
>> >
>> > I am primarily interested in doing named entity recognition in multiple
>> > languages (French, and English mostly). For this, I found a model for
>> > French built by someone here:
>> > http://enicolashernandez.blogspot.fr/2012/12/apache-opennlp-fr-models.html
>> > Models for all the tasks including segmentation, tokenization, POS, and
>> > NER for French can be found here. What I am unable to achieve is to
>> > successfully use these models. From what I gather, all the external models
>> > should be put inside the {install-directory}/stanbol/datafiles directory.
>>
>> That's correct. If you copy the models into this directory, Stanbol
>> can find them.
>>
>> However, the OpenNLP modules use specific name patterns for model
>> files, so make sure that your custom models follow these naming
>> schemes:
>>
>> * Sentence: {lang}-sent.bin (e.g. "fr-sent.bin")
>> * Token: {lang}-token.bin (e.g. "fr-token.bin")
>> * POS: {lang}-pos-perceptron.bin or {lang}-pos-maxent.bin, depending on
>> whether you use a perceptron or maxent model (e.g. "fr-pos-maxent.bin")
>> * Chunker: {lang}-chunker.bin (e.g. "fr-chunker.bin")
>> * Namefinder: {lang}-ner-{type}.bin. The default types are
>>     * person (e.g. "fr-ner-person.bin")
>>     * location (e.g. "fr-ner-location.bin")
>>     * organization (e.g. "fr-ner-organization.bin")
>>     * for other types see
>>
>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlpcustomner
>>
>> You can use models with other names, but in this case you will need to
>> add explicit configurations with the names you used to the engines that
>> use those models. If you opt for this, please consult the documentation
>> of the engines.
>>
>> * Sentence Detection:
>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlpsentence
>> * Tokenization:
>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlptokenizer
>> * POS Tagging:
>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlppos
>> * Chunking:
>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlpchunker
>>
>> All of those engines allow you to configure the languages they process.
>> Via the `model` parameter of a language you can set the name of the model
>> file (located in the `stanbol/datafiles/` folder).
>>
>> Hope this solves your issue
>> best
>> Rupert
>>
>> > However, when I create a chain with the new components, I get an error that
>> > one of the models was not found (this seems to be arbitrary since all the
>> > models are in the same location but the error doesn't occur for all the
>> > models. For example, sentence segmentation with the French model seems to
>> > work fine but tokenization fails). Could someone please help me with how to
>> > set up models for other languages? Inside the opennlp directory, there are
>> > folders for 'lang' and 'ner', what are these for precisely?
>> >
>> > Secondly, I also wanted to investigate using the OpenCalais enhancement
>> > engine.
>> > There is limited documentation about this which says that an API key must
>> > be obtained. However, I don't see any enhancement engine corresponding to
>> > OpenCalais in the OSGi console. Could someone please suggest how I could
>> > proceed with configuring this engine?
>> >
>> > I have compiled Apache Stanbol from source.
>> >
>> > Best Regards and thanks in advance!
>> > Ghufran
>>
>>
>>
>> --
>> | Rupert Westenthaler             rupert.westentha...@gmail.com
>> | Bodenlehenstraße 11                              ++43-699-11108907
>> | A-5500 Bischofshofen
>> | REDLINK.CO
>> ..........................................................................
>> | http://redlink.co/
>>



