semantic description of the engines
-----------------------------------

                 Key: STANBOL-107
                 URL: https://issues.apache.org/jira/browse/STANBOL-107
             Project: Stanbol
          Issue Type: New Feature
          Components: Enhancer
            Reporter: Enrico Daga
            Priority: Minor


It would be nice to find a way to let engines declare what kind of contribution 
they are going to provide.
I see at least the following kinds of enhancements:
1) tagging: detect keywords, entities, concepts "within" the content
2) categorization/classification: locate the content in a conceptual place 
within a given framework. For example, an engine could state that the document 
has "Second World War" as its primary topic, or "Theatre" in the framework of 
DBPedia categories, or state that it is an e-mail or a news item in the 
framework of the CMS document types;
3) metadata: the engine extracts metadata from within the content. For instance, 
it returns the PDF metadata in RDF using the Dublin Core vocabulary
4) embedded knowledge: the source document is a rich HTML (with RDFa, 
Microformats) or it is a structured file (why not an RDF file, say a FOAF 
profile?)

Then, the enhancement engine should also say HOW it contributes in terms of 
vocabulary elements:
1) Does the engine add annotation roles?
2) Does it add entity types?
3) Which metadata fields will it return?

This could be done with an RDF description stating which terms the engine will 
introduce, in relation to those of the Stanbol Enhancement base ontology 
(STANBOL-52). This is also related to STANBOL-3.
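
To make the idea more concrete, here is a minimal sketch of what such a 
self-description could look like, written in Python and serialized as Turtle. 
Everything in it is an assumption made for illustration: the "desc:" namespace, 
the property names (providesKind, addsAnnotationRole, addsEntityType, 
returnsMetadataField) and the engine URI are hypothetical and are not part of 
Stanbol or of the enhancement base ontology. Only the Dublin Core term URIs 
are real.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List

# Hypothetical namespace, invented for this sketch only.
DESC_NS = "http://example.org/engine-description#"


class Kind(Enum):
    """The four kinds of enhancement listed in this issue."""
    TAGGING = "Tagging"
    CATEGORIZATION = "Categorization"
    METADATA = "Metadata"
    EMBEDDED_KNOWLEDGE = "EmbeddedKnowledge"


@dataclass
class EngineDescription:
    """What an engine contributes (kinds) and how (vocabulary terms)."""
    engine_uri: str
    kinds: List[Kind]
    annotation_roles: List[str] = field(default_factory=list)
    entity_types: List[str] = field(default_factory=list)
    metadata_fields: List[str] = field(default_factory=list)

    def to_turtle(self) -> str:
        # Collect (predicate, object) pairs; the type statement comes first,
        # so the list is never empty.
        pairs = [("a", "desc:EnhancementEngine")]
        pairs += [("desc:providesKind", f"desc:{k.value}") for k in self.kinds]
        pairs += [("desc:addsAnnotationRole", f"<{u}>")
                  for u in self.annotation_roles]
        pairs += [("desc:addsEntityType", f"<{u}>") for u in self.entity_types]
        pairs += [("desc:returnsMetadataField", f"<{u}>")
                  for u in self.metadata_fields]
        body = " ;\n".join(f"    {p} {o}" for p, o in pairs)
        return (f"@prefix desc: <{DESC_NS}> .\n\n"
                f"<{self.engine_uri}>\n{body} .")


# Usage: a hypothetical PDF metadata engine returning Dublin Core fields.
pdf_engine = EngineDescription(
    engine_uri="urn:example:pdf-metadata-engine",
    kinds=[Kind.METADATA],
    metadata_fields=[
        "http://purl.org/dc/terms/title",
        "http://purl.org/dc/terms/creator",
    ],
)
print(pdf_engine.to_turtle())
```

A real description would presumably relate these terms to the classes and 
properties of the base ontology rather than to a stand-alone namespace; the 
sketch only shows the shape of the information an engine would declare.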


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
