STANBOL-414-specification.mdtext

rwesten Tue, 17 Jan 2012 00:48:08 -0800

Author: rwesten
Date: Tue Jan 17 08:47:36 2012
New Revision: 1232340

URL: http://svn.apache.org/viewvc?rev=1232340&view=rev
Log:
Added ExecutionPlan concept; Added ExecutionMetadata ontology intended to be 
used to describe the enhancement process of a ContentItem based on the 
execution plan provided by a Chain. This will be important for async calls that 
periodically want to check the status of an ongoing enhancement process.


Modified:
    
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext

Modified: 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext?rev=1232340&r1=1232339&r2=1232340&view=diff
==============================================================================
--- 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext
 (original)
+++ 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext
 Tue Jan 17 08:47:36 2012
@@ -47,10 +47,16 @@ The execution plan need to be created by
 The RDFS schema used for the execution plan is defined as follows.
 
  * Namespace: ep : http://stanbol.apache.org/ontology/enhancer/executionplan#
+ * __ep:ExecutionPlan__ : Represents an execution plan defined by all linked execution nodes.
+     * __ep:hasExecutionNode__ (domain: ep:ExecutionPlan; range: ep:ExecutionNode; inverseOf: ep:inExecutionPlan): links the execution plan with all the execution nodes.
+     * __ep:chain__ (domain: ep:ExecutionPlan; range: xsd:string): The name of the Chain this execution plan is used for.
 * __ep:ExecutionNode__ : Class used for all Nodes representing the execution of an Enhancement Engine.
- * __ep:engine__ (domain: ep:ExecutionNode; range: xsd:string): The property used to link to the Enhancement Engine by the name of the engine.
- * __ep:dependsOn__ (domain: ep:ExecutionNode; range: ep:ExecutionNode) Defines that the execution of this node depends on the completion of the referenced one.
- * __ep:optional__ (domain: ep:ExecutionNode; range: xsd:boolean) Can be used to specify that the execution of this EnhancementEngine is optional. If this property is set to TRUE an engine will be marked as executed even if it execution was not possible (e.g. because an engine with this name was not active) or the execution failed (e.g. because of the Exception). 
+     * __ep:inExecutionPlan__ (domain: ep:ExecutionNode; range: ep:ExecutionPlan; inverseOf: ep:hasExecutionNode): functional property that links the execution node with an execution plan.
+     * __ep:engine__ (domain: ep:ExecutionNode; range: xsd:string): The property used to link to the Enhancement Engine by the name of the engine.
+     * __ep:dependsOn__ (domain: ep:ExecutionNode; range: ep:ExecutionNode): Defines that the execution of this node depends on the completion of the referenced one.
+     * __ep:optional__ (domain: ep:ExecutionNode; range: xsd:boolean): Can be used to specify that the execution of this EnhancementEngine is optional. If this property is set to TRUE an engine will be marked as executed even if its execution was not possible (e.g. because an engine with this name was not active) or the execution failed (e.g. because of an Exception).
+
+Note that the data for the ep:ExecutionPlan and the ep:hasExecutionNode/ep:inExecutionPlan properties typically do not need to be provided in the configuration of a Chain. This information is typically added automatically, based on the assumption that all ep:ExecutionNode instances in the configuration for a chain are members of the execution plan for that chain. It is therefore added by the Chain itself when the configuration is parsed and validated.
 
 #### Example:
 
@@ -66,27 +72,37 @@ This example assumes that
 
 The RDF graph of such a chain would look:
 
+    urn:execPlan
+        rdf:type ep:ExecutionPlan
+        ep:hasExecutionNode urn:node1, urn:node2, urn:node3, urn:node4, urn:node5
+        ep:chain "demoChain"
+
     urn:node1
         rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
         ep:engine langId
 
     urn:node2
         rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
         ep:dependsOn urn:node1
         ep:engine ner
 
     urn:node3
         rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
         ep:dependsOn urn:node1
         ep:engine dbpediaLinking
 
     urn:node4
         rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
         ep:dependsOn urn:node1
         ep:engine geonamesLinking
 
     urn:node5
         rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
         ep:engine zemanta
         ep:optional "true"^^xsd:boolean
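
The dependencies in this example graph determine which nodes may run at the same time. The following Python snippet is only an illustrative sketch and not part of Stanbol: the `PLAN` dictionary and the `execution_waves` helper are hypothetical names that mirror the ep:dependsOn values of the graph above. It groups the nodes into waves, where every node in a wave depends only on nodes of earlier waves.

```python
# Hypothetical in-memory mirror of the example plan:
# execution node -> set of nodes it depends on (ep:dependsOn).
PLAN = {
    "urn:node1": set(),
    "urn:node2": {"urn:node1"},
    "urn:node3": {"urn:node1"},
    "urn:node4": {"urn:node1"},
    "urn:node5": set(),
}


def execution_waves(plan):
    """Group nodes into waves; a node is ready once all its dependencies are done."""
    done, waves = set(), []
    while len(done) < len(plan):
        ready = sorted(
            node for node, deps in plan.items()
            if node not in done and deps <= done
        )
        if not ready:
            # Nothing can start: a cycle or a reference to an unknown node.
            raise ValueError("cyclic or unresolvable ep:dependsOn references")
        waves.append(ready)
        done.update(ready)
    return waves
```

For the example plan, `execution_waves(PLAN)` returns `[['urn:node1', 'urn:node5'], ['urn:node2', 'urn:node3', 'urn:node4']]`: langId and zemanta have no dependencies, the remaining three nodes wait for langId.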
 
@@ -198,6 +214,8 @@ First lets define the EnhancementEngineM
     + getReferences(String name) : List<ServiceReference>
     /** Getter for the Engine for the given name */
     + getEngine(String name) : EnhancementEngine
+    /** Getter for the names of the active engines */
+    + getActiveEngineNames() : Set<String>
 
 #### EngineTracker
 
@@ -213,6 +231,7 @@ Utility that internally uses [ServiceTra
     + close()
     /** Getter for the list of tracked engine names. Empty if all are tracked */
     + getTrackedEngines() : Set<String>
+
 This utility can be used by Components that need to track a specific set of Engines. In addition it allows users to provide their own ServiceTrackerCustomizer. This can be used e.g. to perform special actions on any change to a tracked Engine.
 
 A typical usage of this would be by the _WeightedChain_ implementation, which needs to track changes in the referenced EnhancementEngines to update the execution plan it manages based on the ServiceProperties#ENHANCEMENT_ENGINE_ORDERING values.
@@ -265,6 +284,8 @@ Note that Work on asynchronous enhanceme
 
 ### EnhancementJobManager
 
+synchronous
+
 The interface of the EnhancementJobManager will change due to the addition of Chains and will in future contain only a single method that enhances a ContentItem by using the execution plan provided by the passed Chain.
 
     + enhanceContent(ContentItem ci, Chain chain)
@@ -323,4 +344,120 @@ EnhancementEngines that do NOT support E
 
 In cases where the EnhancementJobManager can execute multiple engines in parallel it is good practice to first start the execution of Engines that support EnhancementEngine#ENHANCE_ASYNC. This allows such engines to obtain a read lock to read the data necessary for their calculations before the EnhancementJobManager needs to obtain an exclusive write lock for calling EnhancementEngines that only support EnhancementEngine#ENHANCE_SYNCHRONOUS.
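
The ordering recommended here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EnhancementJobManager implementation: `ReadyEngine` and `start_order` are hypothetical names, and `supports_async` stands in for an engine reporting EnhancementEngine#ENHANCE_ASYNC.

```python
from dataclasses import dataclass


@dataclass
class ReadyEngine:
    # Hypothetical stand-in for an engine whose dependencies have completed.
    name: str
    supports_async: bool  # True if the engine reports ENHANCE_ASYNC


def start_order(ready):
    """Return the ready engines with ENHANCE_ASYNC engines first.

    sorted() is stable, so engines of the same kind keep their given order.
    """
    return sorted(ready, key=lambda engine: not engine.supports_async)
```

Starting the engines in this order gives the asynchronous ones a chance to acquire their read locks before a synchronous engine requires the exclusive write lock.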
 
+### Execution Metadata
+
+The EnhancementJobManager needs to add metadata about the execution process to the metadata of the processed ContentItem. This data describes the actual execution of the execution plan provided by the Chain. In the case of asynchronous calls to the Stanbol Enhancer, this information can also be used to report the current state of the execution to the requester, because the EnhancementJobManager is required to update the metadata each time an EnhancementEngine is started or has completed/failed processing the ContentItem.
+
+The RDFS schema used for the execution metadata is defined as follows.
+
+ * Namespace: em : http://stanbol.apache.org/ontology/enhancer/executionMetadata#
+ * __em:Execution__ : Super class for all Executions
+     * __em:executionPart__ (domain: em:Execution; range: em:ChainExecution): Defines that this execution was part of the execution of a chain
+     * __em:status__ (domain: em:Execution; range: em:ExecutionStatus): The status of an Execution (used for both em:EngineExecution and em:ChainExecution)
+     * __em:started__ (domain: em:Execution; range: xsd:dateTime): Marks the start of the execution
+     * __em:completed__ (domain: em:Execution; range: xsd:dateTime): Marks the completion of the execution
+     * __em:statusMessage__ (domain: em:Execution; range: xsd:string): A natural language description providing further information about the status of this execution. Typically used to provide error messages if the execution fails (em:status is set to em:StatusFailed).
+ * __em:ChainExecution__ : Class used to describe the execution of an enhancement Chain.
+     * __em:defaultChain__ (domain: em:ChainExecution; range: xsd:boolean): If the executed Chain is currently the default Chain of the Stanbol Enhancer.
+     * __em:executionPlan__ (domain: em:ChainExecution; range: ep:ExecutionPlan): Links to the execution plan as provided by the chain.
+     * __em:enhances__ (domain: em:ChainExecution; range: rdf:Resource): links the em:ChainExecution with the URI of the processed content item. The range needs to be updated as soon as the Stanbol Enhancement Structure is defined.
+     * __em:enhancedBy__ (domain: rdf:Resource; range: em:ChainExecution): links the URI of the content item with the metadata about the enhancement process. The range needs to be updated as soon as the Stanbol Enhancement Structure is defined.
+ * __em:EngineExecution__ : Class used to describe the execution of an EnhancementEngine.
+     * __em:executionNode__ (domain: em:EngineExecution; range: ep:ExecutionNode): The node within the ExecutionPlan
+ * __em:ExecutionStatus__ : Class describing the status of an EngineExecution
+     * __em:StatusSheduled__ : ExecutionStatus instance describing that an execution is scheduled but has not yet started
+     * __em:StatusInProgress__ : ExecutionStatus instance describing that the execution of the linked EngineExecution is in progress
+     * __em:StatusCompleted__ : ExecutionStatus instance describing that the execution has already completed successfully
+     * __em:StatusFailed__ : ExecutionStatus indicating that the execution has failed. Typically an em:statusMessage describing the reason for the failed execution is provided for em:Executions with that state.
+     * __em:StatusSkiped__ : ExecutionStatus indicating that the execution of an ep:ExecutionNode was skipped. This is only allowed for execution nodes that are marked as optional. Typically an em:statusMessage with the reason should also be provided.
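
The status values listed above can be read as a small state machine. The Python sketch below is an assumption made for illustration only: the specification names the status instances and states that skipping is restricted to optional nodes, but the transition table `ALLOWED` and the helper `can_transition` are hypothetical and not defined by it.

```python
from enum import Enum


class ExecutionStatus(Enum):
    # Values are the instance names from the schema, spelled as listed there.
    SCHEDULED = "em:StatusSheduled"
    IN_PROGRESS = "em:StatusInProgress"
    COMPLETED = "em:StatusCompleted"
    FAILED = "em:StatusFailed"
    SKIPPED = "em:StatusSkiped"


# Assumed transition table (NOT stated in the specification).
ALLOWED = {
    ExecutionStatus.SCHEDULED: {ExecutionStatus.IN_PROGRESS, ExecutionStatus.SKIPPED},
    ExecutionStatus.IN_PROGRESS: {ExecutionStatus.COMPLETED, ExecutionStatus.FAILED},
}


def can_transition(current, new, optional=False):
    """Return True if moving from `current` to `new` fits the assumed table."""
    if new is ExecutionStatus.SKIPPED and not optional:
        return False  # skipping is only allowed for nodes marked ep:optional
    return new in ALLOWED.get(current, set())
```

Under these assumptions a scheduled execution may be skipped only when its execution node is optional, and completed, failed and skipped are terminal states.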
+
+
+#### Example:
+
+The following example uses the same example as used within the ExecutionPlan section. To make the relations between the execution metadata and the execution plan easier to see, the triples of the execution plan are included at the end of this example.
+
+This example describes the following situation:
+
+* the execution of the content item with the URI 'urn:contentItem1' with the default chain
+* the default chain is represented by a Chain with the name "demoChain"; its ExecutionPlan has the URI 'urn:execPlan'
+* the successful execution of the 'langid' engine (execution: 'urn:exec1', node: 'urn:node1')
+* the failed execution of the 'ner' engine (execution: 'urn:exec2', node: 'urn:node2'): as the reason for the failure, a message is provided stating that the NER model for the language 'de' is not available
+* the successful execution of the 'zemanta' engine (execution: 'urn:exec3', node: 'urn:node5'): this engine was started in parallel to the 'ner' engine, and therefore before the chain failed.
+* There is no execution of the dbpediaLinking (node: '') and geonamesLinking (node: '') engines because the chain failed before these engines were scheduled. This assumes that the EnhancementJobManager only adds em:EngineExecution resources when it starts the processing of an ep:ExecutionNode defined in the execution plan. However, an EnhancementJobManager can also create em:Execution resources for all execution nodes. In that case there would also be em:EngineExecution resources for the dbpediaLinking and geonamesLinking engines with the em:status set to 'em:StatusSheduled'.
+
+The RDF graph with the Execution Metadata:
+
+    urn:exec
+        rdf:type em:ChainExecution
+        em:executionPlan urn:execPlan
+        em:enhances urn:contentItem1
+        em:defaultChain "true"
+        em:started 2012-01-11T12.13.14.156
+        em:completed 2012-01-11T12.13.15.157
+        em:status em:StatusFailed
+        em:statusMessage "Unable to execute EnhancementEngine 'ner' \
+            (Message: No NER model for language 'de' is available)."
+        em:executionPart urn:exec1, urn:exec2, urn:exec3, urn:exec4, urn:exec5
+
+    urn:exec1
+        rdf:type em:EngineExecution
+        em:executionPart urn:exec
+        em:executionNode urn:node1
+        em:status em:StatusCompleted
+        em:started 2012-01-11T12.13.14.160
+        em:completed 2012-01-11T12.13.14.250
+
+    urn:exec2
+        rdf:type em:EngineExecution
+        em:executionPart urn:exec
+        em:executionNode urn:node2
+        em:status em:StatusFailed
+        em:statusMessage "No NER model for language 'de' is available"
+        em:started 2012-01-11T12.13.14.253
+        em:completed 2012-01-11T12.13.14.289
+
+    urn:exec3
+        rdf:type em:EngineExecution
+        em:executionPart urn:exec
+        em:executionNode urn:node5
+        em:status em:StatusCompleted
+        em:started 2012-01-11T12.13.14.253
+        em:completed 2012-01-11T12.13.15.150
+
+The Execution Plan: (copy from the example provided in the ExecutionPlan section)
+    
+    urn:execPlan
+        rdf:type ep:ExecutionPlan
+        ep:hasExecutionNode urn:node1, urn:node2, urn:node3, urn:node4, urn:node5
+        ep:chain "demoChain"
+
+    urn:node1
+        rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
+        ep:engine langId
+
+    urn:node2
+        rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
+        ep:dependsOn urn:node1
+        ep:engine ner
+
+    urn:node3
+        rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
+        ep:dependsOn urn:node1
+        ep:engine dbpediaLinking
+
+    urn:node4
+        rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
+        ep:dependsOn urn:node1
+        ep:engine geonamesLinking
+
+    urn:node5
+        rdf:type ep:ExecutionNode
+        ep:inExecutionPlan urn:execPlan
+        ep:engine zemanta
+        ep:optional "true"^^xsd:boolean
 
+Note that both the Execution Metadata AND the Execution Plan need to be contained within the metadata of the ContentItem.
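
Because both graphs travel with the ContentItem, a client polling an asynchronous request can relate each execution node to its reported status. The following Python sketch is illustrative only: `PLAN_NODES`, `ENGINE_STATUS` and `summarize` are hypothetical names, and the data is a simplified, hand-written view of the example above rather than the output of an RDF parser.

```python
# Hypothetical, simplified view of the example:
# execution node -> em:status of its em:EngineExecution.
# Nodes without an entry have no em:EngineExecution resource.
ENGINE_STATUS = {
    "urn:node1": "em:StatusCompleted",
    "urn:node2": "em:StatusFailed",
    "urn:node5": "em:StatusCompleted",
}
PLAN_NODES = ["urn:node1", "urn:node2", "urn:node3", "urn:node4", "urn:node5"]


def summarize(plan_nodes, engine_status):
    """Group the nodes of the plan by status; nodes without metadata are 'not started'."""
    summary = {}
    for node in plan_nodes:
        status = engine_status.get(node, "not started")
        summary.setdefault(status, []).append(node)
    return summary
```

For the example data, `summarize(PLAN_NODES, ENGINE_STATUS)` groups urn:node1 and urn:node5 under em:StatusCompleted, urn:node2 under em:StatusFailed, and urn:node3 and urn:node4 under 'not started'.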

svn commit: r1232340 - /incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext
