[jira] [Comment Edited] (STANBOL-488) EnhancementProperties

Rupert Westenthaler (JIRA) Thu, 15 May 2014 10:38:50 -0700

    [ 
https://issues.apache.org/jira/browse/STANBOL-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889425#comment-13889425
 ]


Rupert Westenthaler edited comment on STANBOL-488 at 5/8/14 12:33 PM:
----------------------------------------------------------------------

h1. Enhancement Properties

Enhancement Properties allow parametrizing the execution of Enhancement 
Engines in the context of an Enhancement Chain and/or an Enhancement Request.

While the configuration of an Enhancement Engine is bound to its life cycle 
(activation, deactivation), enhancement properties can change this 
configuration for a single request.

Enhancement Engine implementations need to support enhancement properties. 
Please refer to the documentation of the individual engines for information 
about the enhancement properties they support.

h2. Naming and definition

EnhancementProperties should be defined both as an RDF data type property 
within an Ontology and as constants in some Java Class or Interface. The URI 
version will use the enhancement property namespace 
(`http://stanbol.apache.org/ontology/enhancementproperties#`) and the ID of the 
property as local name. 

IDs MUST start with `enhancer.` and SHOULD use the 
`enhancer.{level-1}.{level-2}.{property-name}` syntax as typically used for 
Java properties. Property IDs are case sensitive and SHOULD only use lower case 
characters. The '-' char shall be used to make property names consisting of 
multiple words easier to read. 

Globally defined properties use '`enhancer.{property-name}`'. For Enhancement 
Engine specific properties a shortened/simplified name of the engine 
should be used as {level-1}.

Typical examples for Enhancement Properties are `enhancer.max-suggestions`, 
`enhancer.min-confidence` and 
`enhancer.entity-co-mention.adjust-existing-confidence`.
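The naming rules above can be sketched in Java as follows. This is only an illustration: the class `EnhancementPropertyIds` and its methods are hypothetical and not part of the Stanbol API, and the regular expression is one possible reading of the SHOULD rules (lower case segments separated by '.', words within a segment separated by '-').

```java
import java.util.regex.Pattern;

/** Hypothetical helper illustrating the naming rules; not Stanbol API. */
final class EnhancementPropertyIds {

    /** Enhancement property namespace as given in the text. */
    static final String NAMESPACE =
            "http://stanbol.apache.org/ontology/enhancementproperties#";

    /** Java constants for two of the example IDs. */
    static final String MAX_SUGGESTIONS = "enhancer.max-suggestions";
    static final String MIN_CONFIDENCE = "enhancer.min-confidence";

    // Assumed reading of the SHOULD rules: 'enhancer' followed by one or
    // more lower case segments, words inside a segment joined by '-'.
    private static final Pattern RECOMMENDED =
            Pattern.compile("enhancer(\\.[a-z0-9]+(-[a-z0-9]+)*)+");

    /** Checks the MUST rule: the ID starts with "enhancer.". */
    static boolean isValid(String id) {
        return id.startsWith("enhancer.");
    }

    /** Checks the (assumed) SHOULD rules on top of the MUST rule. */
    static boolean followsConvention(String id) {
        return RECOMMENDED.matcher(id).matches();
    }

    /** Builds the URI version: namespace plus the ID as local name. */
    static String toUri(String id) {
        if (!isValid(id)) {
            throw new IllegalArgumentException("ID must start with 'enhancer.': " + id);
        }
        return NAMESPACE + id;
    }
}
```

With this sketch `enhancer.maxSuggestions` would still be accepted by `isValid`, but rejected by `followsConvention` because of the upper case character.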

The definition as RDF property will use the URI and MAY also include the XSD 
data type of supported values. 

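{code}

    @prefix ehprop: <http://stanbol.apache.org/ontology/enhancementproperties#> .
    @prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl:    <http://www.w3.org/2002/07/owl#> .
    @prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .

    ehprop:enhancer.max-suggestions     rdf:type        owl:DatatypeProperty ;
        rdfs:range      xsd:integer .

    ehprop:enhancer.min-confidence      rdf:type        owl:DatatypeProperty ;
        rdfs:range      xsd:double .

    ehprop:enhancer.entity-co-mention.adjust-existing-confidence
        rdf:type        owl:DatatypeProperty ;
        rdfs:range      xsd:double .

{code}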

*TODO:* The Enhancer should provide a service to register and look up defined 
enhancement properties.

h2. Scopes

Enhancement Properties can be defined with the following scopes:

1. __request and engine__: These properties are valid for a single request and 
a specific engine. They have the highest priority. 
2. __request__: These properties are valid for a single request and all engines.
3. __chain and engine__: These properties are applied to all requests of the 
chain but are only passed to a specific engine.
4. __chain__: These properties are applied to all engines in all executions of 
that chain. They have the lowest priority.

Properties with a higher priority override properties with a lower 
priority. For example, if a property `enhancer.min-confidence=0.5` is defined on 
the __chain__ scope, it can be overridden by `enhancer.min-confidence=0.75` on 
the __chain and engine__ scope. A single request might still override the value 
on the __request__ or __request and engine__ scope.
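A minimal sketch of this override order, assuming the properties of each scope are already available as separate maps for the engine in question. The class `ScopedProperties` and its `resolve` method are hypothetical and only illustrate the priority rules; they are not how the Enhancer implements them.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the scope priorities; not Stanbol API. */
final class ScopedProperties {

    /**
     * Merges the four scopes into the effective properties for one engine.
     * Maps are applied from lowest to highest priority, so later putAll
     * calls overwrite values set by earlier ones.
     */
    static Map<String,Object> resolve(
            Map<String,Object> chain,
            Map<String,Object> chainAndEngine,
            Map<String,Object> request,
            Map<String,Object> requestAndEngine) {
        Map<String,Object> effective = new HashMap<String,Object>();
        effective.putAll(chain);            // 4. lowest priority
        effective.putAll(chainAndEngine);   // 3.
        effective.putAll(request);          // 2.
        effective.putAll(requestAndEngine); // 1. highest priority
        return effective;
    }
}
```

Applying the maps in ascending priority keeps the merge to four `putAll` calls and avoids explicit lookups per property.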

Chain scoped properties are configured with the Enhancement Chain definition 
and represented as RDF in the ExecutionPlan. Request scoped properties are 
passed with Enhancement requests. In versions `0.12.1` and `1.0.0` they are 
represented as a special ContentPart. Starting with `2.0.0` they will be 
represented as RDF in the ExecutionMetadata (via the EnhancementJob API added 
with 2.0.0).

h2. Java Interface

In versions `0.12.1` and `1.0` EnhancementProperties are contained in the 
ContentItem. EnhancementEngines can retrieve the enhancement properties by 
using the EnhancementEngineHelper utility:

{code}
    @Override
    public final void computeEnhancements(ContentItem ci) throws EngineException {
        Map<String,Object> enhancementProps =
                EnhancementEngineHelper.getEnhancementProperties(this, ci);
        [..]
    }
{code}

With `2.0.0` the EnhancementEngine API will be changed so that the 
EnhancementProperties are passed as an additional parameter.

{code}
    @Override
    public final void computeEnhancements(ContentItem ci,
            Map<String,Object> enhancementProps) throws EngineException {
        [..]
    }
{code}

The Map with the EnhancementProperties is a readable and writable copy of the 
EnhancementProperties present in the ContentItem. Changes to the map MUST NOT 
be reflected in the state of the ContentItem.
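The copy semantics can be illustrated with a small sketch. `PropertyCopies` is a hypothetical helper, not part of the Stanbol API; it only shows that changes made by an engine to its map do not reach the stored properties.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper illustrating the copy semantics; not Stanbol API. */
final class PropertyCopies {

    /** Returns a mutable copy, so engine-side changes stay local. */
    static Map<String,Object> copyOf(Map<String,Object> stored) {
        // Shallow copy: the entries are independent of the stored map,
        // but mutable value objects would still be shared.
        return new HashMap<String,Object>(stored);
    }
}
```

Note that this is a shallow copy. It is sufficient as long as the values are immutable (e.g. Strings or Numbers).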

h2. Definition of Chain scoped Enhancement Properties

Chain scoped EnhancementProperties are represented by RDF in the ExecutionPlan. 
As in `0.12.1` and `1.*` the ExecutionPlan is generated by the Enhancement 
Chain, the chain implementations need to support this configuration.

Starting from `0.12.1` the ListChain, WeightedChain and GraphChain allow the 
configuration of EnhancementProperties:

* __chain and engine__ scoped properties are defined as parameters to the 
engines with the syntax {noformat}{engine-name}; {property-name-1}={value-1},{value-2}; {property-name-2}={value-1};{noformat}

* __chain__ scoped properties can be configured by using the OSGi property key 
`stanbol.enhancer.chain.chainproperties` with the syntax 
{noformat}{property-name-1}={value-1},{value-2}{noformat}. NOTE that `;` is NOT 
supported as a separator for multiple properties, as OSGi configurations 
already define a way to provide multiple values.
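To make the __chain and engine__ syntax more concrete, the following sketch splits such a configuration line into the engine name and its properties. The class `EngineEntry` is hypothetical and not the parser used by the chain implementations. Whitespace trimming, the tolerance for a trailing `;` and the handling of parameters without `=` are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for one engine line; not the Stanbol implementation. */
final class EngineEntry {

    final String engineName;
    final Map<String,List<String>> properties =
            new LinkedHashMap<String,List<String>>();

    private EngineEntry(String engineName) {
        this.engineName = engineName;
    }

    /** Parses "{engine-name}; {name-1}={v1},{v2}; {name-2}={v1};". */
    static EngineEntry parse(String line) {
        String[] parts = line.split(";");
        String name = parts[0].trim();
        if (name.isEmpty()) {
            throw new IllegalArgumentException("Missing engine name: " + line);
        }
        EngineEntry entry = new EngineEntry(name);
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            if (part.isEmpty()) {
                continue; // assumption: empty segments are ignored
            }
            int eq = part.indexOf('=');
            if (eq == 0) {
                throw new IllegalArgumentException("Missing property name: " + part);
            }
            List<String> values = new ArrayList<String>();
            if (eq < 0) {
                // assumption: a parameter without '=' has no values
                entry.properties.put(part, values);
                continue;
            }
            for (String value : part.substring(eq + 1).split(",")) {
                values.add(value.trim());
            }
            entry.properties.put(part.substring(0, eq).trim(), values);
        }
        return entry;
    }
}
```

For example, `EngineEntry.parse("my-engine; enhancer.prop-a=1,2; enhancer.prop-b=3;")` (with made-up engine and property names) yields the engine name `my-engine` and two properties, the first one with two values.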

With version `2.*` of the enhancer it will be possible to directly pass or 
refer to an ExecutionPlan as an RDF graph. This will also allow configuring 
chain scoped properties via RDF.

h2. Definition of Request scoped Enhancement Properties

In versions `0.12.1` and `1.0.0` request scoped properties are represented by a 
special content part registered with the URI 
`urn:apache.org:stanbol.web:enhancement.properties`. 
The ContentItemHelper utility provides methods to retrieve and/or init this 
content part. The content part uses a Map<String,Object> that contains both 
__request__ and __request and engine__ scoped enhancement properties.

The keys of __request and engine__ scoped properties are prefixed with the 
name of the engine: `{engine-name}:{property-name}`. This is the same syntax 
as used when passing request scoped enhancement properties as query parameters 
of the request.

{code}
    ContentItem ci; //the content item
    Map<String,Object> reqProp =
            ContentItemHelper.initEnhancementPropertiesContentPart(ci);
    //set min confidence to 0.5 for all engines
    reqProp.put("enhancer.min-confidence","0.5");
    //set max suggestions to 10 for the linking engine
    reqProp.put("linking:enhancer.max-suggestions","10");
{code}
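How an engine-specific view could be derived from such a map is sketched below. `RequestProperties` and `forEngine` are hypothetical names and not part of the Stanbol API. The sketch assumes that property IDs never contain a `:` so that any key with a `:` is __request and engine__ scoped.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for the key prefix convention; not Stanbol API. */
final class RequestProperties {

    /**
     * Returns the request properties that apply to the given engine:
     * all request scoped entries, overridden by entries whose key is
     * prefixed with "{engine-name}:". The prefix is removed from the key.
     */
    static Map<String,Object> forEngine(Map<String,Object> reqProp,
            String engineName) {
        Map<String,Object> result = new HashMap<String,Object>();
        String prefix = engineName + ":";
        // first pass: request scoped properties (no engine prefix)
        for (Map.Entry<String,Object> e : reqProp.entrySet()) {
            if (e.getKey().indexOf(':') < 0) {
                result.put(e.getKey(), e.getValue());
            }
        }
        // second pass: request and engine scoped properties take priority
        for (Map.Entry<String,Object> e : reqProp.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                result.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return result;
    }
}
```

The two passes make the priority explicit: properties prefixed for other engines are dropped, and a prefixed entry for the requested engine always wins over the unprefixed one.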

With version `2.0` of the enhancer the enhancement properties content part will 
be removed and replaced by the EnhancementJob API (TBD). 

h3. Passing Enhancement Properties via the Enhancer RESTful Service

Starting with `0.12.1` Enhancement Properties can be passed as query parameters 
of Enhancement Requests. For request scoped properties the property name is 
used as parameter name. Request and engine scoped properties need to use 
`{engine-name}:{property-name}` as parameter name.

The following curl request passes both request scoped and request and engine 
scoped properties as query parameters:

{code}

    curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
        --data "The Eiffel Tower is located in Paris." \
        "http://localhost:8080/enhancer?enhancer.max-suggestions=5&dbpedia-linking:enhancer.min-confidence=0.33&conf-filter:enhancer.min-confidence=0.85"

{code}


> EnhancementProperties
> ---------------------
>
>                 Key: STANBOL-488
>                 URL: https://issues.apache.org/jira/browse/STANBOL-488
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Enhancer
>    Affects Versions: 1.0.0
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>             Fix For: 1.0.0
>
>
> Enhancement Properties aim to provide Chain and Request scoped configurations 
> to EnhancementEngines. 
> __IMPORTANT NOTE:__ This Issue introduces incompatible API changes to core 
> interfaces of the Stanbol Enhancer. This includes the `EnhancementEngine` 
> interface.
> Expected usages include:
> * pass-through of user names and passwords for EnhancementEngines that 
> depend on external services. This will allow such engines to use the user 
> account of the one passing the request (request scope) or the one 
> configuring the chain (chain configuration scope)
> * pass request specific constraints (e.g. the minimum confidence level for 
> Enhancements). The acceptable confidence might depend on the actual context 
> of the client application (e.g. if the user will review results or not)
> * configure dereferencing on a per-request basis (e.g. depending on the 
> requirements of the UI showing the enhancement results)
> * reduce the number of configured engine instances (e.g. when specifying the 
> minimum required confidence level for a chain or on request level one would 
> only need a single instance of a confidence-level-filter-engine; The same 
> would be true for dbpedia-linking engines with a different amount of 
> suggested results) 
> * mapping of HTTP header fields to enhancement properties (e.g. for using the 
> "Content-Language" header for specifying the language of the content)
> NOTES:
> * See detailed description in comments dating from February 2014 or later. 
> * This is not the initial description of this issue. This is important as the 
> first 5 comments do refer to the old description. You can still read the old 
> version by looking at the history of this issue. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
