Thanks!

-----Original Message-----
From: Rupert Westenthaler [mailto:rupert.westentha...@gmail.com]
Sent: Wednesday, October 22, 2014 3:24 AM
To: dev@stanbol.apache.org
Subject: Re: OpenNLP in Stanbol

Hi Patrick,

NLP results are not returned as RDF (by default). This is because the number of 
RDF triples would be much too high. The only exception is named entities 
(extracted by NER components), which are written as fise:TextAnnotation.

Internally, NLP results are kept in the AnalysedText [1]. So for people who 
want to write an EnhancementEngine that needs to process NLP results, this is 
the right place to look.

If you want to have all NLP annotations available as RDF you can use the 
Nlp2RDF engine [2]. You can find this engine under 
"enhancement-engines/nlp2rdf", or just download the 0.12.0 version from [3].

This engine is not included in the Stanbol Launcher by default. So you will 
need to manually install it (you can use the BundleTab of the Felix Webconsole 
or just copy the jar file to the "stanbol/fileinstall" folder of your Stanbol 
Launcher).

After that you will have an engine with the name "nlp2rdf" that you can add to 
your chain. After doing so, the enhancement results will include all NLP 
results encoded based on NIF 1.0 (follow the links of STANBOL-741 [2] for more 
details on the generated RDF).
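
To illustrate the step above: once the engine is part of a chain, a client 
could request the enhancement results over HTTP. This is only a sketch in 
Python. It assumes a Stanbol Launcher running at http://localhost:8080, the 
endpoint path /enhancer/chain/<name>, Turtle as the response format, and a 
hypothetical chain called "nlp2rdf-chain". None of these values come from this 
mail, so adjust them to your setup.

```python
import urllib.request


def build_enhance_request(text, chain="nlp2rdf-chain",
                          base_url="http://localhost:8080"):
    """Build (but do not send) a POST request to an enhancement chain.

    The chain name and base URL are assumptions for illustration only.
    """
    url = f"{base_url}/enhancer/chain/{chain}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=UTF-8",
            # Ask for the enhancement results serialized as Turtle.
            "Accept": "text/turtle",
        },
        method="POST",
    )


def enhance(text, **kwargs):
    """Send the request and return the response body as a string.

    Requires a running server at the assumed base URL.
    """
    request = build_enhance_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling enhance("Paris is the capital of France.") against a running launcher 
should then return the RDF produced by that chain, including the NLP 
annotations if the "nlp2rdf" engine is part of it.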

BTW: I am currently working on an updated version of this engine that supports 
NIF 2.0 [4]. It will be part of the same module, but at first it will only be 
available in the 1.0.0-SNAPSHOT version (trunk) of Stanbol (see STANBOL-1397 
[5]).

NOTE: Those engines will create a large number of triples (~5-10 triples per 
word). So I would not recommend using them with very long texts (e.g. large 
PDF files).
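
To get a feeling for what that means for a given document, the estimate can be 
turned into a quick back-of-the-envelope calculation. The Python sketch below 
uses the ~5-10 triples per word figure from the note above. Counting words by 
splitting on whitespace is a simplifying assumption and will not match the 
tokenizer used by the NLP engines exactly.

```python
def estimate_triples(text, per_word=(5, 10)):
    """Return a rough (low, high) estimate of generated RDF triples.

    Words are approximated by whitespace-separated tokens.
    """
    words = len(text.split())
    low, high = per_word
    return words * low, words * high
```

For a text of 100,000 words this gives a range of 500,000 to 1,000,000 
triples, which shows why long documents quickly become a problem.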

best
Rupert


[1] http://stanbol.apache.org/docs/trunk/components/enhancer/nlp/analyzedtext
[2] https://issues.apache.org/jira/browse/STANBOL-741
[3] http://search.maven.org/#artifactdetails|org.apache.stanbol|org.apache.stanbol.enhancer.engines.nlp2rdf|0.12.0|bundle
[4] http://persistence.uni-leipzig.org/nlp2rdf/ and 
http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html
[5] https://issues.apache.org/jira/browse/STANBOL-1397

--
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO 
..........................................................................
| http://redlink.co/
