Author: rwesten
Date: Tue Jan 17 08:47:36 2012
New Revision: 1232340
URL: http://svn.apache.org/viewvc?rev=1232340&view=rev
Log:
Added ExecutionPlan concept; Added ExecutionMetadata ontology intended to be
used to describe the enhancement process of a ContentItem based on the
execution plan provided by a Chain. This will be important for async calls that
periodically want to check the status of an ongoing enhancement process.
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext?rev=1232340&r1=1232339&r2=1232340&view=diff
==============================================================================
---
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext
(original)
+++
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/STANBOL-414-specification.mdtext
Tue Jan 17 08:47:36 2012
@@ -47,10 +47,16 @@ The execution plan need to be created by
The RDFS schema used for the execution plan is defined as follows.
* Namespace: ep : http://stanbol.apache.org/ontology/enhancer/executionplan#
+ * __ep:ExecutionPlan__ : Represents an execution plan defined by all linked
execution nodes.
+ * __ep:hasExecutionNode__ (domain: ep:ExecutionPlan; range:
ep:ExecutionNode; inverseOf: ep:inExecutionPlan): links the execution plan with
all the execution nodes.
+ * __ep:chain__ (domain: ep:ExecutionPlan; range: xsd:string): The name of
the Chain this execution plan is used for.
* __ep:ExecutionNode__ : Class used for all Nodes representing the execution
of an Enhancement Engine.
- * __ep:engine__ (domain: ep:ExecutionNode; range: xsd:string): The property
used to link to the Enhancement Engine by the name of the engine.
- * __ep:dependsOn__ (domain: ep:ExecutionNode; range: ep:ExecutionNode)
Defines that the execution of this node depends on the completion of the
referenced one.
- * __ep:optional__ (domain: ep:ExecutionNode; range: xsd:boolean) Can be used
to specify that the execution of this EnhancementEngine is optional. If this
property is set to TRUE an engine will be marked as executed even if it
execution was not possible (e.g. because an engine with this name was not
active) or the execution failed (e.g. because of the Exception).
+ * __ep:inExecutionPlan__ (domain: ep:ExecutionNode; range:
ep:ExecutionPlan; inverseOf: ep:hasExecutionNode): functional property that
links the execution node with an execution plan.
+ * __ep:engine__ (domain: ep:ExecutionNode; range: xsd:string): The
property used to link to the Enhancement Engine by the name of the engine.
+ * __ep:dependsOn__ (domain: ep:ExecutionNode; range: ep:ExecutionNode)
Defines that the execution of this node depends on the completion of the
referenced one.
+ * __ep:optional__ (domain: ep:ExecutionNode; range: xsd:boolean) Can be
used to specify that the execution of this EnhancementEngine is optional. If
this property is set to TRUE an engine will be marked as executed even if its
execution was not possible (e.g. because an engine with this name was not
active) or the execution failed (e.g. because of an Exception).
+
+Note that the data for the ep:ExecutionPlan and the
ep:hasExecutionNode/ep:inExecutionPlan properties typically does not need to be
provided in the configuration of a Chain. This information is typically added
automatically, based on the assumption that all ep:ExecutionNode resources in
the configuration of a chain are members of the execution plan for that chain.
It is therefore typically added by the Chain itself when the configuration is
parsed and validated.
#### Example:
@@ -66,27 +72,37 @@ This example assumes that
The RDF graph of such a chain would look:
+ urn:execPlan
+ rdf:type ep:ExecutionPlan
+ ep:hasExecutionNode urn:node1, urn:node2, urn:node3, urn:node4,
urn:node5
+ ep:chain "demoChain"
+
urn:node1
rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
ep:engine langId
urn:node2
rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
ep:dependsOn urn:node1
ep:engine ner
urn:node3
rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
ep:dependsOn urn:node1
ep:engine dbpediaLinking
urn:node4
rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
ep:dependsOn urn:node1
ep:engine geonamesLinking
urn:node5
rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
ep:engine zemanta
ep:optional "true"^^xsd:boolean
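
The ep:dependsOn relations in the graph above determine which nodes may run at which point. As a minimal sketch (Python is used here only for illustration; the `PLAN` dictionary and the `execution_waves` function are hypothetical names and not part of the Stanbol API), the example plan can be grouped into waves of nodes whose dependencies have all completed:

```python
# Hypothetical in-memory form of the example plan above:
# node URI -> engine name, ep:dependsOn targets and ep:optional flag.
PLAN = {
    "urn:node1": {"engine": "langId", "depends_on": [], "optional": False},
    "urn:node2": {"engine": "ner", "depends_on": ["urn:node1"], "optional": False},
    "urn:node3": {"engine": "dbpediaLinking", "depends_on": ["urn:node1"], "optional": False},
    "urn:node4": {"engine": "geonamesLinking", "depends_on": ["urn:node1"], "optional": False},
    "urn:node5": {"engine": "zemanta", "depends_on": [], "optional": True},
}


def execution_waves(plan):
    """Group nodes into waves; a node is ready once all nodes it depends on are done."""
    done = set()
    pending = dict(plan)
    waves = []
    while pending:
        ready = sorted(
            node for node, data in pending.items()
            if set(data["depends_on"]) <= done
        )
        if not ready:
            # Nothing can start: the dependencies are cyclic or point to unknown nodes.
            raise ValueError("cyclic or unresolved ep:dependsOn relations")
        waves.append(ready)
        done.update(ready)
        for node in ready:
            del pending[node]
    return waves


for wave in execution_waves(PLAN):
    print(wave)
```

Under these assumptions 'urn:node1' and 'urn:node5' can start immediately, while the three nodes depending on 'urn:node1' form the second wave. Nodes within one wave are candidates for parallel execution.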
@@ -198,6 +214,8 @@ First lets define the EnhancementEngineM
+ getReferences(String name) : List<ServiceReference>
/** Getter for the Engine for the given name */
+ getEngine(String name) : EnhancementEngine
+ /** Getter for the names of the active engines */
+ + getActiveEngineNames() : Set<String>
#### EngineTracker
@@ -213,6 +231,7 @@ Utility that internally uses [ServiceTra
+ close()
/** Getter for the list of tracked engine names. Empty if all are tracked
*/
+ getTrackedEngines() : Set<String>
+
This utility can be used by Components that need to track a specific set of
Engines. In addition it allows users to provide their own
ServiceTrackerCustomizer. This can be used, for example, to perform special
actions on any change to a tracked Engine.
A typical usage of this would be the _WeightedChain_ implementation, which
needs to track changes in the referenced EnhancementEngines to update the
execution plan it manages based on the
ServiceProperties#ENHANCEMENT_ENGINE_ORDERING values.
@@ -265,6 +284,8 @@ Note that Work on asynchronous enhanceme
### EnhancementJobManager
+synchronous
+
The interface of the EnhancementJobManager will change with the addition of
Chains and will in future contain only a single method that enhances a
ContentItem by using the execution plan provided by the given Chain.
+ enhanceContent(ContentItem ci, Chain chain)
@@ -323,4 +344,120 @@ EnhancementEngines that do NOT support E
In cases where the EnhancementJobManager can execute multiple engines in
parallel, it is good practice to first start the execution of Engines that
support EnhancementEngine#ENHANCE_ASYNC. This allows such engines to obtain
a read lock to read the data necessary for their calculations before the
EnhancementJobManager needs to obtain an exclusive write lock for calling
EnhancementEngines that only support EnhancementEngine#ENHANCE_SYNCHRONOUS.
+### Execution Metadata
+
+The EnhancementJobManager needs to add metadata about the execution
process to the metadata of the processed ContentItem. This data describes the
actual execution of the execution plan as provided by the Chain. In the case of
asynchronous calls to the Stanbol Enhancer, it can also be used to report the
current state of the execution to the requester, because the
EnhancementJobManager is required to update this metadata each time an
EnhancementEngine is started or has completed/failed to process the enhanced
ContentItem.
+
+The RDFS schema used for the execution metadata is defined as follows.
+
+ * Namespace: em :
http://stanbol.apache.org/ontology/enhancer/executionMetadata#
+ * __em:Execution__ : Super class for all Executions
+ * __em:executionPart__ (domain: em:Execution; range: em:ChainExecution):
Defines that this execution was part of the execution of a chain.
+ * __em:status__ (domain: em:Execution; range: em:ExecutionStatus): The
status of an Execution (used for both em:EngineExecution and em:ChainExecution).
+ * __em:started__ (domain: em:Execution; range: xsd:dateTime): Marks the
start of the execution.
+ * __em:completed__ (domain: em:Execution; range: xsd:dateTime): Marks the
completion of the execution.
+ * __em:statusMessage__ (domain: em:Execution; range: xsd:string): A
natural language description providing further information about the status of
this execution. Typically used to provide error messages if the execution fails
(em:status is set to em:StatusFailed).
+ * __em:ChainExecution__ : Class used to describe the execution of an
enhancement Chain.
+ * __em:defaultChain__ (domain: em:ChainExecution; range: xsd:boolean):
Whether the executed Chain is currently the default Chain of the Stanbol
Enhancer.
+ * __em:executionPlan__ (domain: em:ChainExecution; range: ep:ExecutionPlan):
Links to the execution plan as provided by the chain.
+ * __em:enhances__ (domain: em:ChainExecution; range: rdf:Resource): Links
the em:ChainExecution with the URI of the processed content item. The range
needs to be updated as soon as the Stanbol Enhancement Structure is defined.
+ * __em:enhancedBy__ (domain: rdf:Resource; range: em:ChainExecution):
Links the URI of the content item with the metadata about the enhancement
process. The domain needs to be updated as soon as the Stanbol Enhancement
Structure is defined.
+ * __em:EngineExecution__ : Class used to describe the execution of an
EnhancementEngine.
+ * __em:executionNode__ (domain: em:EngineExecution; range:
ep:ExecutionNode): The node within the ExecutionPlan.
+ * __em:ExecutionStatus__ : Class describing the status of an Execution
+ * __em:StatusSheduled__ : ExecutionStatus instance describing that an
execution is scheduled but has not yet started
+ * __em:StatusInProgress__ : ExecutionStatus instance describing that
the execution of the linked EngineExecution is in progress
+ * __em:StatusCompleted__ : ExecutionStatus instance describing that the
execution has already completed successfully
+ * __em:StatusFailed__ : ExecutionStatus indicating that the execution has
failed. Typically an em:statusMessage describing the reason for the failure
is provided for em:Executions with that state.
+ * __em:StatusSkiped__ : ExecutionStatus indicating that the execution of
an ep:ExecutionNode was skipped. This is only allowed for execution nodes that
are marked as optional. Typically an em:statusMessage with the reason
should also be provided.
+
+
+#### Example:
+
+The following example builds on the one used in the ExecutionPlan section.
To make the relations between the execution metadata and the execution plan
easier to see, the triples of the execution plan are included at the end of
this example.
+
+This example describes the following situation:
+
+* the enhancement of the content item with the URI 'urn:contentItem1' with the
default chain
+* the default chain is represented by a Chain with the name "demoChain"; the
ExecutionPlan has the URI 'urn:execPlan'
+* the successful execution of the 'langId' engine (execution: 'urn:exec1',
node: 'urn:node1')
+* the failed execution of the 'ner' engine (execution: 'urn:exec2', node:
'urn:node2'): As reason for the failure a message is provided that the NER
model for the language 'de' is not available
+* the successful execution of the 'zemanta' engine (execution: 'urn:exec3',
node: 'urn:node5'): This engine was started in parallel to the 'ner' engine,
and therefore before the chain failed.
+* There is no execution of the dbpediaLinking (node: '') and geonamesLinking
(node: '') engines because the chain failed before these engines were
scheduled. This assumes that the EnhancementJobManager only adds
em:EngineExecution resources when it starts the processing of an
ep:ExecutionNode defined in the execution plan. However, an
EnhancementJobManager can also create em:EngineExecution resources for all
execution nodes up front. In that case there would also be em:EngineExecution
resources for the dbpediaLinking and geonamesLinking engines with the em:status
set to 'em:StatusSheduled'.
+
+The RDF graph with the Execution Metadata:
+
+ urn:exec
+ rdf:type em:ChainExecution
+ em:executionPlan urn:execPlan
+ em:enhances urn:contentItem1
+ em:defaultChain "true"
+ em:started 2012-01-11T12.13.14.156
+ em:completed 2012-01-11T12.13.15.157
+ em:status em:StatusFailed
+ em:statusMessage "Unable to execute EnhancementEngine 'ner' \
+ (Message: No NER model for language 'de' is available)."
+ em:executionPart urn:exec1, urn:exec2, urn:exec3, urn:exec4, urn:exec5
+
+ urn:exec1
+ rdf:type em:EngineExecution
+ em:executionPart urn:exec
+ em:executionNode urn:node1
+ em:status em:StatusCompleted
+ em:started 2012-01-11T12.13.14.160
+ em:completed 2012-01-11T12.13.14.250
+
+ urn:exec2
+ rdf:type em:EngineExecution
+ em:executionPart urn:exec
+ em:executionNode urn:node2
+ em:status em:StatusFailed
+ em:statusMessage "No NER model for language 'de' is available"
+ em:started 2012-01-11T12.13.14.253
+ em:completed 2012-01-11T12.13.14.289
+
+ urn:exec3
+ rdf:type em:EngineExecution
+ em:executionPart urn:exec
+ em:executionNode urn:node5
+ em:status em:StatusCompleted
+ em:started 2012-01-11T12.13.14.253
+ em:completed 2012-01-11T12.13.15.150
+
+The Execution Plan: (copy from the example provided in the ExecutionPlan
section)
+
+ urn:execPlan
+ rdf:type ep:ExecutionPlan
+ ep:hasExecutionNode urn:node1, urn:node2, urn:node3, urn:node4,
urn:node5
+ ep:chain "demoChain"
+
+ urn:node1
+ rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
+ ep:engine langId
+
+ urn:node2
+ rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
+ ep:dependsOn urn:node1
+ ep:engine ner
+
+ urn:node3
+ rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
+ ep:dependsOn urn:node1
+ ep:engine dbpediaLinking
+
+ urn:node4
+ rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
+ ep:dependsOn urn:node1
+ ep:engine geonamesLinking
+
+ urn:node5
+ rdf:type ep:ExecutionNode
+ ep:inExecutionPlan urn:execPlan
+ ep:engine zemanta
+ ep:optional "true"^^xsd:boolean
+Note that both the Execution Metadata AND the Execution Plan need to be
contained within the metadata of the ContentItem.
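
The status transitions described in the Execution Metadata section can be sketched as follows. This is only an illustration in Python under stated assumptions: the `EngineExecution` class and the `run_engine` function are hypothetical and not part of the Stanbol API, and mapping the failure of an optional engine to the skipped status is one possible reading of ep:optional, not something the specification mandates.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional

# Status resource names as spelled in the schema above.
SCHEDULED = "em:StatusSheduled"
IN_PROGRESS = "em:StatusInProgress"
COMPLETED = "em:StatusCompleted"
FAILED = "em:StatusFailed"
SKIPPED = "em:StatusSkiped"


@dataclass
class EngineExecution:
    """Hypothetical stand-in for an em:EngineExecution resource."""
    execution_node: str                      # em:executionNode
    status: str = SCHEDULED                  # em:status
    started: Optional[datetime] = None       # em:started
    completed: Optional[datetime] = None     # em:completed
    status_message: Optional[str] = None     # em:statusMessage


def run_engine(execution: EngineExecution,
               engine: Callable[[], None],
               optional: bool = False) -> str:
    """Run one engine and update its execution metadata before and after the call."""
    execution.status = IN_PROGRESS
    execution.started = datetime.now(timezone.utc)
    try:
        engine()
        execution.status = COMPLETED
    except Exception as error:
        # Assumption: a failing optional node is recorded as skipped,
        # a failing required node as failed.
        execution.status = SKIPPED if optional else FAILED
        execution.status_message = str(error)
    execution.completed = datetime.now(timezone.utc)
    return execution.status


def missing_model() -> None:
    """Simulates the 'ner' engine failing as in the example above."""
    raise RuntimeError("No NER model for language 'de' is available")


exec1 = EngineExecution("urn:node1")
run_engine(exec1, lambda: None)
exec2 = EngineExecution("urn:node2")
run_engine(exec2, missing_model)
print(exec1.status, exec2.status, exec2.status_message)
```

Because the metadata is updated both when an engine starts and when it finishes, a requester polling an asynchronous enhancement process can read the em:status values at any time to see which engines are in progress, completed or failed.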