Re: Feedback about stanbol-414 specification

Rupert Westenthaler Tue, 17 Jan 2012 23:09:49 -0800

Hi

In my opinion EIP and enhancement chains are two different things on
two separate (architectural) layers. I think the example provided by
Florent shows this very nicely, as it makes clear how users can solve
this kind of problem by combining the EIP provided by Apache Camel with
the Enhancement Chains provided by the Stanbol Enhancer. In other words,
I see Apache Camel more as an alternative to the RESTful interface of
Apache Stanbol.

Florent, as soon as your Camel enhancer is available we should start
working on a "How to Enterprise Level Information Extraction with
Apache Camel and Apache Stanbol" usage scenario. This would help not
only potential users but also us developers to better understand the
whole stack. WDYT?

Some more comments inline

On Tue, Jan 17, 2012 at 1:29 PM, Fabian Christ
<[email protected]> wrote:
>
> IMO the main use case is:
>
> - Have different URIs for different chains
> - At each chain URI you can configure N engines in a user defined order
> (optional: allow chains to be nested within other chains)
>
> That's what I would start with and wait for user feedback if more
> complex scenarios come up on the mailing list.
>

Exactly. From the configuration perspective I still think that linear
enhancement chains will be the most commonly used ones.
The chain types:

* "WeightedChain" allows users to just list the names of all engines
they want to have in the chain. Ordering is calculated automatically,
similar to the current WeightedJobManager.
* "ListChain": we could add this Chain type. Here the user MUST
provide the list of engines in the exact order of execution.
* "GraphChain": This is intended for expert users that want to
optimize chain configurations (e.g. explicitly tell the
EnhancementJobManager what engines can be executed in parallel).
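
To make the difference between the three chain types more concrete,
here is a minimal sketch in plain Java. It is not the Stanbol API: the
class ChainSketch, its method names, the engine names and the numeric
weights are all made up for illustration. It only shows how the
execution order could be derived in each case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not the Stanbol API.
final class ChainSketch {

    // "WeightedChain": the user only names the engines; the order is
    // calculated from a weight per engine (assumed: lower runs first,
    // ties are broken by name to keep the result deterministic).
    static List<String> weightedChain(Map<String, Integer> weights) {
        List<String> names = new ArrayList<>(weights.keySet());
        names.sort((a, b) -> {
            int byWeight = Integer.compare(weights.get(a), weights.get(b));
            return byWeight != 0 ? byWeight : a.compareTo(b);
        });
        return names;
    }

    // "ListChain": the user-provided order is the execution order.
    static List<String> listChain(List<String> engines) {
        return Collections.unmodifiableList(new ArrayList<>(engines));
    }

    // "GraphChain": the user states which engines each engine depends
    // on. The result is a list of stages; engines within one stage do
    // not depend on each other and could be executed in parallel.
    static List<List<String>> graphChain(Map<String, Set<String>> dependsOn) {
        List<List<String>> stages = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> remaining = new TreeSet<>(dependsOn.keySet());
        while (!remaining.isEmpty()) {
            List<String> ready = new ArrayList<>();
            for (String engine : remaining) {
                if (done.containsAll(dependsOn.get(engine))) {
                    ready.add(engine);
                }
            }
            if (ready.isEmpty()) {
                throw new IllegalArgumentException(
                        "cycle or unknown dependency among " + remaining);
            }
            stages.add(ready);
            done.addAll(ready);
            remaining.removeAll(ready);
        }
        return stages;
    }
}
```

With the GraphChain variant, two engines that both depend only on a
language detection engine would end up in the same stage, which is the
kind of information an expert user would state explicitly.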

Defining the Execution Plan in RDF has the advantage that it makes it
very easy to provide information about the execution within the
metadata of the enhanced content item. In case you have not seen it
yet: yesterday I added a new section "Execution Metadata" to the
specification describing how the EnhancementJobManager should encode
metadata about the enhancement process. This information is critical
if we want to use the "org.apache.stanbol.commons.jobs" API for the
async REST API as suggested by David in [1].

>>> 2012/1/17 florent andré <[email protected]>:
>>> °°°°°° Missing features °°°°°°
>>>
>>> There is IMO two main missing features in this definition :
>>> 1) No way to link chains each others ("chain linking")

As also mentioned by Fabian, this might be useful and could be added
to Enhancement Chains at some point. It should also be relatively easy
to implement.

Regarding

>>> Now, with the "linking chain" and "selector" features we can define an
>>> "UltimateBigChain" like that :
>>>
>>> from(input_file) --> categorisationChain
>>> --> if (graph has "music") --> musicChain.
>>> --> elseif (graph has "food") --> foodChain --> if (graph has "pizza")-->
>>> pizzaChain.
>>> --> otherwise() --> otherStuffChain.

and

>> However that does not prevent us to expose Stanbol engines and chains
>> as Camel Endpoints [1] for people would like to benefit from the Camel
>> wide support for various messaging systems (i.e. as an ETL).
>>
>>  [1] 
>> https://svn.apache.org/repos/asf/camel/trunk/camel-core/src/main/java/org/apache/camel/Endpoint.java
>>

+1

The addition of Enhancement Chains shortens the definition of such
Camel workflows because users no longer need to call single
EnhancementEngines but can use Chains instead. In addition, it allows
the configuration of a chain to be changed without affecting the
workflow.
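
As a rough illustration of that separation, here is a plain Java sketch
of the "UltimateBigChain" routing quoted above. It deliberately does not
use the Camel DSL or any Stanbol class: the Chain interface, the
keywordChain helper and the class name are assumptions made up for this
example, and a "chain" is reduced to something that returns the topics
it found. The chain names are the ones from Florent's example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: neither Camel nor Stanbol API.
final class ChainRoutingSketch {

    // Stand-in for an enhancement chain: returns the topics detected.
    interface Chain {
        Set<String> enhance(String text);
    }

    // Toy chain that "detects" a topic if its keyword occurs in the text.
    static Chain keywordChain(String... keywords) {
        return text -> {
            Set<String> found = new TreeSet<>();
            for (String keyword : keywords) {
                if (text.toLowerCase().contains(keyword)) {
                    found.add(keyword);
                }
            }
            return found;
        };
    }

    private static Set<String> run(String name, String text,
            Map<String, Chain> chains, List<String> visited) {
        visited.add(name);
        return chains.get(name).enhance(text);
    }

    // The workflow layer: it only knows chain names, not the engines
    // configured inside each chain. Returns the chains that were called.
    static List<String> route(String text, Map<String, Chain> chains) {
        List<String> visited = new ArrayList<>();
        Set<String> topics = run("categorisationChain", text, chains, visited);
        if (topics.contains("music")) {
            run("musicChain", text, chains, visited);
        } else if (topics.contains("food")) {
            Set<String> foodTopics = run("foodChain", text, chains, visited);
            if (foodTopics.contains("pizza")) {
                run("pizzaChain", text, chains, visited);
            }
        } else {
            run("otherStuffChain", text, chains, visited);
        }
        return visited;
    }
}
```

The route method never mentions a single engine, so swapping the
engines behind "foodChain" would leave the workflow untouched.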

best
Rupert

-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
