Hi, thanks for your answer. 

I mean Topic Annotation. 

Ultimately, what I would like to have is something like: { PDFuri 
foaf:primaryTopic London . } as a triple in the returned RDF. 

But for now, I am not concerned with using FOAF. 

I just want to have the main topics of the PDF. I don’t necessarily want to 
extract all the entities, etc. 

So, in terms of the annotations generated, I would say I want neither 
fise:EntityAnnotation nor fise:TextAnnotation, but simply 
fise:TopicAnnotation.
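
As a rough illustration of that goal, here is a minimal Python sketch (not 
Stanbol code). It assumes the enhancement results are already available as 
plain (subject, predicate, object) string tuples, and that topic annotations 
carry fise:extracted-from, fise:entity-reference and fise:confidence. The 
sample URIs and the helper name primary_topic_triples are hypothetical, and 
picking the highest-confidence topic as the "primary" one is an assumption, 
not something Stanbol does by itself.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FISE = "http://fise.iks-project.eu/ontology/"
FOAF_PRIMARY_TOPIC = "http://xmlns.com/foaf/0.1/primaryTopic"


def primary_topic_triples(triples):
    """Return one (content item, foaf:primaryTopic, topic) triple per item."""
    triples = list(triples)
    # Subjects that are typed as fise:TopicAnnotation.
    topic_annotations = {
        s for s, p, o in triples
        if p == RDF_TYPE and o == FISE + "TopicAnnotation"
    }
    # Gather the properties of each topic annotation.
    props = {}
    for s, p, o in triples:
        if s in topic_annotations:
            props.setdefault(s, {})[p] = o
    # Assumption: the topic with the highest confidence is the primary one.
    best = {}  # content item -> (confidence, topic URI)
    for values in props.values():
        item = values.get(FISE + "extracted-from")
        topic = values.get(FISE + "entity-reference")
        if item is None or topic is None:
            continue
        confidence = float(values.get(FISE + "confidence", 0.0))
        if item not in best or confidence > best[item][0]:
            best[item] = (confidence, topic)
    return [(item, FOAF_PRIMARY_TOPIC, topic)
            for item, (_, topic) in best.items()]


# Hypothetical sample data standing in for a parsed enhancer response.
sample = [
    ("urn:enh:1", RDF_TYPE, FISE + "TopicAnnotation"),
    ("urn:enh:1", FISE + "extracted-from", "urn:content-item:pdf"),
    ("urn:enh:1", FISE + "entity-reference", "http://dbpedia.org/resource/London"),
    ("urn:enh:1", FISE + "confidence", "0.9"),
    ("urn:enh:2", RDF_TYPE, FISE + "TopicAnnotation"),
    ("urn:enh:2", FISE + "extracted-from", "urn:content-item:pdf"),
    ("urn:enh:2", FISE + "entity-reference", "http://dbpedia.org/resource/Paris"),
    ("urn:enh:2", FISE + "confidence", "0.4"),
    ("urn:enh:3", RDF_TYPE, FISE + "EntityAnnotation"),
    ("urn:enh:3", FISE + "extracted-from", "urn:content-item:pdf"),
    ("urn:enh:3", FISE + "entity-reference", "http://dbpedia.org/resource/Thames"),
]

for triple in primary_topic_triples(sample):
    print(triple)
```

In practice the tuples would come from parsing the RDF returned by the 
enhancer with an RDF library rather than from a hand-written list.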

Many thanks


-- 
Maatari Daniel Okouya
Sent with Airmail

On 27 May 2014 at 13:08:38, Rupert Westenthaler (rupert.westentha...@gmail.com) 
wrote:

On Tue, May 27, 2014 at 12:49 PM, Maatari Daniel Okouya  
<okouy...@yahoo.fr> wrote:  
> Hi,  
>  
> I have just started to use Apache Stanbol. I’m still playing around with it 
> to figure out everything that is out there. However, I’m puzzled by one thing. 
> I would like to configure it such that upon uploading a text or a PDF 
> document, an RDF containing only the topic of the pdf shall be returned.  
>  

What do you mean by "topic"? In case of PDF files the Tika Engine [1]  
can extract metadata. Such metadata are directly added to the URI of  
the contentItem and do not use FISE.  

> I’m scratching my head but I don’t see how to do so. What is the engine that 
> is supposed to produce <<Fise:Annotation>>  
>  

All Stanbol Engines do generate FISE enhancements  
(fise:TextAnnotation, fise:EntityAnnotation and fise:TopicAnnotation)  

When you look at the list of engines [2]  

* Language Detection engines create a fise:TextAnnotation describing  
the language of the document (?la dc:type dc:LinguisticSystem ;  
dc:language ?lang)  
* Named Entity Recognition (NER) Engines create fise:TextAnnotations  
for Entities recognized by the NLP framework.  
* Linking / Suggestions create fise:EntityAnnotation for Entities  
found in the text. They might also add fise:TextAnnotation to mark the  
exact mention of such entities in the text.  
* Topic Classification engines use fise:TopicAnnotation to describe  
assigned topics. They also use a fise:TextAnnotation to mark the part  
of the text the topic is assigned to  
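
To see which of these annotation types a given engine configuration actually 
produces, the results can be tallied by type. The following is a minimal 
Python sketch under the assumption that the enhancement results have already 
been parsed into (subject, predicate, object) string tuples; the sample URIs 
and the helper name count_annotations are hypothetical.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FISE = "http://fise.iks-project.eu/ontology/"
ANNOTATION_TYPES = ("TextAnnotation", "EntityAnnotation", "TopicAnnotation")


def count_annotations(triples):
    """Count rdf:type statements for the three FISE annotation types."""
    counts = Counter()
    for _, predicate, obj in triples:
        if predicate == RDF_TYPE and obj.startswith(FISE):
            name = obj[len(FISE):]
            if name in ANNOTATION_TYPES:
                counts[name] += 1
    return counts


# Hypothetical sample data standing in for a parsed enhancer response.
sample_types = [
    ("urn:enh:lang", RDF_TYPE, FISE + "TextAnnotation"),
    ("urn:enh:ner", RDF_TYPE, FISE + "TextAnnotation"),
    ("urn:enh:link", RDF_TYPE, FISE + "EntityAnnotation"),
    ("urn:enh:topic", RDF_TYPE, FISE + "TopicAnnotation"),
    ("urn:enh:topic", RDF_TYPE, FISE + "Enhancement"),
]

for name, count in sorted(count_annotations(sample_types).items()):
    print(name, count)
```

A chain that only runs topic classification would be expected to show no 
fise:EntityAnnotation entries in such a tally.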

> as described in 
> http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure.html
>   

Yep, this page describes the annotations as created by the EnhancementEngines.  


Without knowing what you mean by " ... only the topic of the pdf ..."  
I cannot recommend suitable Stanbol configurations.  

best  
Rupert  

>  
>  


[1] http://stanbol.apache.org/docs/trunk/components/enhancer/engines/tikaengine 
 
[2] http://stanbol.apache.org/docs/trunk/components/enhancer/engines/list  

> I would appreciate it if someone could provide me with some pointers.  
>  
> Many thanks,  
>  
> Maatary  
>  
> --  
> Maatari Daniel Okouya  
> Sent with Airmail  



--  
| Rupert Westenthaler rupert.westentha...@gmail.com  
| Bodenlehenstraße 11 ++43-699-11108907  
| A-5500 Bischofshofen  
| REDLINK.CO 
..........................................................................  
| http://redlink.co/  
