Rupert Westenthaler created STANBOL-947:
-------------------------------------------

             Summary: Allow the TikaEngine to add unmapped properties to the 
Metadata of the processed ContentItem
                 Key: STANBOL-947
                 URL: https://issues.apache.org/jira/browse/STANBOL-947
             Project: Stanbol
          Issue Type: New Feature
          Components: Engine - Tika
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler


Currently the Tika Engine can only add information provided by the Tika 
Metadata to the Enhancement Metadata if an explicit Ontology mapping is 
available and activated for it. All other metadata are not accessible.

This will add a configuration option to the TikaEngine that allows unmapped 
properties to be added to the Enhancement graph.

Each unmapped property value will be written as a triple using

* the ContentItem URI as subject
* urn:tika.apache.org:{property-name} as property
* the value of the property as Object.
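
The triple layout above can be sketched in plain Java. This is only an 
illustration, not the TikaEngine code: SketchTriple and toTriples are 
hypothetical stand-ins for the Clerezza RDF types, the content item URI and 
property name in main are made up, and the property name is appended to the 
prefix without any escaping because the issue does not specify one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an RDF triple; the real engine uses Clerezza types.
final class SketchTriple {
    final String subject;
    final String predicate;
    final String object;

    SketchTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

final class UnmappedPropertySketch {
    // Prefix taken from the issue text: urn:tika.apache.org:{property-name}
    static final String TIKA_URN_PREFIX = "urn:tika.apache.org:";

    // Builds one triple per value of every unmapped metadata property.
    static List<SketchTriple> toTriples(String contentItemUri,
                                        Map<String, List<String>> unmapped) {
        List<SketchTriple> triples = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : unmapped.entrySet()) {
            String predicate = TIKA_URN_PREFIX + entry.getKey();
            for (String value : entry.getValue()) {
                triples.add(new SketchTriple(contentItemUri, predicate, value));
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        Map<String, List<String>> unmapped = new LinkedHashMap<>();
        unmapped.put("example-property", List.of("example value")); // made-up data
        for (SketchTriple t : toTriples("urn:content-item:example", unmapped)) {
            System.out.println(t.subject + " " + t.predicate + " " + t.object);
        }
    }
}
```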

That means that values of unmapped properties will be accessible by using

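    ContentItem ci; // the content item
    String property; // the name of the unmapped Tika property
    // select all triples with the ContentItem URI as subject
    // and the unmapped property as predicate
    Iterator<Triple> it = ci.getMetadata().filter(
        ci.getUri(), new UriRef("urn:tika.apache.org:" + property), null);
    while (it.hasNext()) {
        Resource value = it.next().getObject(); // one value of the property
    }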

By default this feature will be deactivated. Users who want unmapped 
properties to be included need to set "stanbol.engine.tika.mapping.unmapped" 
to true.
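
As a rough sketch of the default-off behaviour, the flag could be read as 
follows. This assumes the configuration arrives as a Dictionary (as is common 
for OSGi components) and that the value may be a Boolean or a String; the 
class and method names are hypothetical and not part of the TikaEngine.

```java
import java.util.Dictionary;
import java.util.Hashtable;

final class UnmappedConfigSketch {
    // Property key quoted in the issue text.
    static final String UNMAPPED_KEY = "stanbol.engine.tika.mapping.unmapped";

    // Returns false when the key is missing, so the feature stays off by default.
    static boolean isUnmappedEnabled(Dictionary<String, Object> config) {
        Object value = config.get(UNMAPPED_KEY);
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        // Assumption: any other value type is interpreted via its string form.
        return value != null && Boolean.parseBoolean(value.toString());
    }

    public static void main(String[] args) {
        Dictionary<String, Object> config = new Hashtable<>();
        System.out.println(isUnmappedEnabled(config)); // key not set
        config.put(UNMAPPED_KEY, "true");
        System.out.println(isUnmappedEnabled(config)); // key set to "true"
    }
}
```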


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
