On Tue, May 27, 2014 at 12:49 PM, Maatari Daniel Okouya
<okouy...@yahoo.fr> wrote:
> Hi,
>
> I have just started to use Apache Stanbol. I’m still playing around with it
> to figure out everything that is out there. However, I’m puzzled by one thing.
> I would like to configure it such that upon uploading a text or a PDF
> document, an RDF containing only the topic of the PDF shall be returned.
>

What do you mean by "topic"? In the case of PDF files, the Tika Engine [1]
can extract metadata. Such metadata is added directly to the URI of
the ContentItem and does not use the FISE enhancement structure.

> I’m scratching my head but I don’t see how to do so. What is the engine that
> is supposed to produce <<Fise:Annotation>>
>

All Stanbol engines generate FISE enhancements
(fise:TextAnnotation, fise:EntityAnnotation and fise:TopicAnnotation).

When you look at the list of engines [2], you will see that:

* Language Detection engines create a fise:TextAnnotation describing
the language of the document (?la dc:type dc:LinguisticSystem; ?la
dc:language ?lang).
* Named Entity Recognition (NER) engines create fise:TextAnnotations
for entities recognized by the NLP framework.
* Linking / Suggestion engines create fise:EntityAnnotations for entities
found in the text. They might also add fise:TextAnnotations to mark the
exact mentions of such entities in the text.
* Topic Classification engines use fise:TopicAnnotation to describe
assigned topics. They also use a fise:TextAnnotation to mark the part
of the text the topic is assigned to.
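
If "topic" turns out to mean the output of a Topic Classification
engine, one option is to filter the returned enhancements on the client
side. The following is only a rough Python sketch of that idea, not a
Stanbol API: it assumes the enhancer response has already been parsed
into (subject, predicate, object) tuples by some RDF library, and the
sample triples, the urn:enhancement:* URIs and the helper name
topic_annotations are made up for illustration. The use of
fise:entity-label on the topic annotation is also an assumption here.

```python
# Namespace of the FISE enhancement structure and the rdf:type predicate.
FISE = "http://fise.iks-project.eu/ontology/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def topic_annotations(triples):
    """Keep only the properties of subjects typed as fise:TopicAnnotation.

    `triples` is an iterable of (subject, predicate, object) tuples.
    Returns {subject: {predicate: [objects, ...]}}.
    """
    triples = list(triples)
    # First pass: collect every subject that is a fise:TopicAnnotation.
    topics = {
        s for s, p, o in triples
        if p == RDF_TYPE and o == FISE + "TopicAnnotation"
    }
    # Second pass: gather the non-type properties of those subjects.
    result = {}
    for s, p, o in triples:
        if s in topics and p != RDF_TYPE:
            result.setdefault(s, {}).setdefault(p, []).append(o)
    return result


# Hypothetical enhancement results (invented values, not real Stanbol output).
SAMPLE = [
    ("urn:enhancement:1", RDF_TYPE, FISE + "TextAnnotation"),
    ("urn:enhancement:2", RDF_TYPE, FISE + "TopicAnnotation"),
    ("urn:enhancement:2", FISE + "entity-label", "Semantic Web"),
    ("urn:enhancement:2", FISE + "confidence", 0.87),
    ("urn:enhancement:3", RDF_TYPE, FISE + "EntityAnnotation"),
    ("urn:enhancement:3", FISE + "entity-label", "Paris"),
]

if __name__ == "__main__":
    for subject, props in topic_annotations(SAMPLE).items():
        print(subject, props)
```

The text and entity annotations are dropped and only the properties of
the topic annotation remain. Whether this matches what you need depends
on what you mean by "topic".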

> as described in 
> http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure.html

Yep this page describes the annotations as created by the EnhancementEngines.


Without knowing what you mean by "... only the topic of the pdf ...",
I cannot recommend a suitable Stanbol configuration.

best
Rupert


[1] http://stanbol.apache.org/docs/trunk/components/enhancer/engines/tikaengine
[2] http://stanbol.apache.org/docs/trunk/components/enhancer/engines/list

> I would appreciate if someone could provide me with some pointers.
>
> Many thanks,
>
> Maatary
>
> --
> Maatari Daniel Okouya
> Sent with Airmail



-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO 
..........................................................................
| http://redlink.co/
