semantic description of the engines
-----------------------------------
Key: STANBOL-107
URL: https://issues.apache.org/jira/browse/STANBOL-107
Project: Stanbol
Issue Type: New Feature
Components: Enhancer
Reporter: Enrico Daga
Priority: Minor
It would be nice to find a way to let engines declare which kind of contribution
they are going to provide.
I see at least the following kinds of enhancements:
1) tagging: detect keywords, entities, concepts "within" the content
2) categorization/classification: locate the content in a conceptual place
within a given framework. For example, an engine could state that the document
has "Second World War" as its primary topic, or "Theatre" in the framework of
DBpedia categories, or state that it is an e-mail or a news item in the
framework of the CMS document types
3) metadata: the engine extracts metadata from within the content. For instance,
it returns the PDF metadata in RDF using the Dublin Core vocabulary
4) embedded knowledge: the source document is rich HTML (with RDFa or
Microformats) or a structured file (why not an RDF file, say a FOAF
profile?)
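
As a rough illustration, the four kinds above could be modelled as an
enumeration that each engine declares. This is only a sketch, written in Python
for brevity; the names ContributionKind, ENGINE_KINDS and engines_providing, as
well as the engine names, are hypothetical and not part of the Stanbol API.

```python
from enum import Enum


class ContributionKind(Enum):
    """Hypothetical names for the four kinds of enhancement listed above."""
    TAGGING = "tagging"
    CATEGORIZATION = "categorization"
    METADATA = "metadata"
    EMBEDDED_KNOWLEDGE = "embedded-knowledge"


# Hypothetical engines and the kinds of contribution each one declares.
ENGINE_KINDS = {
    "example-ner-engine": {ContributionKind.TAGGING},
    "example-pdf-metadata-engine": {ContributionKind.METADATA},
    "example-rdfa-engine": {ContributionKind.EMBEDDED_KNOWLEDGE},
}


def engines_providing(kind):
    """Return the sorted names of engines that declare the given kind."""
    return sorted(name for name, kinds in ENGINE_KINDS.items() if kind in kinds)


print(engines_providing(ContributionKind.METADATA))
```

With such a declaration a client could ask for "all engines that extract
metadata" without knowing anything else about the engines.
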
Then, the enhancement engine should also say HOW it contributes in terms of
vocabulary elements:
1) Does the engine add annotation roles?
2) Does it add entity types?
3) Which metadata fields will it return?
This could be done with an RDF description stating which terms the engine will
introduce in relation to those of the Stanbol Enhancement base ontology
(STANBOL-52). This is also related to STANBOL-3.
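
To make the idea more concrete, here is a minimal sketch of how such an RDF
description could be produced, serialized as Turtle. It is written in Python
only to keep it short. Everything in the ex: namespace (ex:EnhancementEngine,
ex:providesKind, ex:addsEntityType, ex:returnsMetadataField) and the engine URI
are placeholders invented for this example; they are not terms of the Stanbol
Enhancement ontology, which would have to define the real ones. The Dublin Core
URIs are the usual dcterms ones.

```python
def describe_engine(engine_uri, kind, entity_types=(), metadata_fields=()):
    """Build a Turtle description of what an engine contributes.

    All ex: terms are placeholders invented for this sketch.
    """
    lines = [
        "@prefix ex: <http://example.org/engine-description#> .",
        "",
        f"<{engine_uri}> a ex:EnhancementEngine ;",
    ]
    # One statement for the kind of contribution, then one per declared term.
    statements = [f"ex:providesKind ex:{kind}"]
    statements += [f"ex:addsEntityType <{t}>" for t in entity_types]
    statements += [f"ex:returnsMetadataField <{f}>" for f in metadata_fields]
    body = " ;\n".join(f"    {s}" for s in statements)
    lines.append(body + " .")
    return "\n".join(lines)


# Hypothetical PDF metadata engine that returns two Dublin Core fields.
print(describe_engine(
    "http://example.org/engines/pdf-metadata",
    "Metadata",
    metadata_fields=[
        "http://purl.org/dc/terms/title",
        "http://purl.org/dc/terms/creator",
    ],
))
```

The same structure would cover the other questions: an engine that adds entity
types would list them via the (placeholder) ex:addsEntityType property.
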