Rupert Westenthaler created STANBOL-947:
-------------------------------------------
Summary: Allow the TikaEngine to add unmapped properties to the
Metadata of the processed ContentItem
Key: STANBOL-947
URL: https://issues.apache.org/jira/browse/STANBOL-947
Project: Stanbol
Issue Type: New Feature
Components: Engine - Tika
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Currently the Tika Engine can only add information provided by the Tika
Metadata to the Enhancement Metadata if an explicit ontology mapping for it is
available and activated. All other metadata is not accessible.
This issue will add a configuration option to the TikaEngine that allows
unmapped properties to be added to the Enhancement graph.
Properties will be written using
* the ContentItem URI as subject
* urn:tika.apache.org:{property-name} as property
* the value of the property as object.
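The mapping rule above can be sketched in plain Java. This is only an illustration under stated assumptions, not the TikaEngine implementation: the class UnmappedTikaTriples, its nested Triple type and the toTriples method are hypothetical names, the Tika metadata is modelled as a simple Map, and subjects, predicates and objects are plain Strings instead of Clerezza resources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of Stanbol): models the proposed rule with plain Strings.
final class UnmappedTikaTriples {

    // Prefix taken from the issue description: urn:tika.apache.org:{property-name}
    static final String TIKA_URN_PREFIX = "urn:tika.apache.org:";

    // Minimal stand-in for an RDF triple (assumption: values are kept as Strings).
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Creates one triple per value of every property that has no ontology mapping.
    static List<Triple> toTriples(String contentItemUri,
                                  Map<String, List<String>> tikaMetadata,
                                  Set<String> mappedPropertyNames) {
        List<Triple> triples = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : tikaMetadata.entrySet()) {
            if (mappedPropertyNames.contains(entry.getKey())) {
                continue; // already covered by an explicit ontology mapping
            }
            String predicate = TIKA_URN_PREFIX + entry.getKey();
            for (String value : entry.getValue()) {
                triples.add(new Triple(contentItemUri, predicate, value));
            }
        }
        return triples;
    }
}
```

In this sketch a multi-valued property yields one triple per value, all sharing the same subject and predicate. Whether the engine treats multi-valued properties this way is an assumption.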
That means that values of unmapped properties will be accessible by using
    ContentItem ci; // the content item
    String property; // the name of the Tika property
    Iterator<Triple> it = ci.getMetadata().filter(
        ci.getUri(), new UriRef("urn:tika.apache.org:" + property), null);
    while (it.hasNext()) {
        Resource value = it.next().getObject();
    }
By default this feature will be deactivated. Users who want unmapped
properties to be present need to set "stanbol.engine.tika.mapping.unmapped" to true.
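The opt-in behaviour could be evaluated along the following lines. This is a minimal sketch, assuming the engine configuration is available as a Map of property names to values. The class UnmappedPropertiesConfig and the method isUnmappedEnabled are hypothetical names, and only the property key is taken from this issue.

```java
import java.util.Map;

// Hypothetical helper (not part of Stanbol): reads the opt-in flag from a configuration map.
final class UnmappedPropertiesConfig {

    // Property key named in this issue.
    static final String UNMAPPED_PROPERTIES = "stanbol.engine.tika.mapping.unmapped";

    // Returns false unless the property is explicitly set to true (Boolean or String).
    static boolean isUnmappedEnabled(Map<String, ?> config) {
        Object value = config.get(UNMAPPED_PROPERTIES);
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        // A missing value keeps the feature deactivated, matching the stated default.
        return value != null && Boolean.parseBoolean(value.toString().trim());
    }
}
```

Accepting both Boolean and String values is a design choice of this sketch, made because configuration values are often delivered as text.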