Re: Enhancer API extensions / changes

Fabian Christ Thu, 21 Mar 2013 03:11:05 -0700

Hi,

In order to preserve a compatible version before the breaking changes
are introduced, I created branches for enhancer-0.10.1 and
enhancement-engines-0.10.1 [1]. The trunk will be upgraded to
enhancer-0.11.0 and enhancement-engines-0.11.0, and the breaking
changes will be introduced on the trunk in the near future.


Once we have a new commons and entityhub release, we can also release
the enhancer-0.10.1 stuff from the branches.

[1] https://issues.apache.org/jira/browse/STANBOL-982


2013/3/18 Rupert Westenthaler <[email protected]>:
> Hi all,
>
> The intention of this mail is to inform all Stanbol Enhancer users
> about upcoming additions to the Stanbol Enhancer that will also
> involve (incompatible) API changes to the EnhancementEngine interface.
>
> This mail only provides an overview of those changes and their
> rationale. For detailed information please have a look at the
> discussions in [1], [2] and also [4].
>
> Changes will be applied to the trunk (Stanbol Enhancer version
> 0.11.0-SNAPSHOT).
>
>
> EnhancementEngine API change
> ========================
>
> While most of those changes will only affect lower level APIs, there
> will also be a change to the API for EnhancementEngines. Users with
> custom EnhancementEngines will therefore need to adapt them. As
> described in the last comment of [1], the API of EnhancementEngine
> will change to
>
>     int canEnhance(ContentItem ci,
>         Map<String, Object> enhancementContext)
>     void computeEnhancements(ContentItem ci,
>         Map<String, Object> enhancementContext)
>
> The enhancementContext will contain request-specific configurations.
> EnhancementEngines that support those will need to consider them in
> addition to the configuration passed to the activate(..) method of
> the component.
>
> Typical usage examples:
>
> * passing user name and password for an external service
> * passing the document password for protected rich text documents
>   to the TikaEngine
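
A minimal sketch of how an engine might consult the request-specific
context before falling back to its activation-time configuration. The
class name, the configuration keys and the precedence rule (the request
value wins) are assumptions for illustration only and are not part of
the Stanbol API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: class name, keys and precedence are
// assumptions, not part of the Stanbol API.
class ContextLookupSketch {

    /**
     * Returns the value for the key from the per-request
     * enhancementContext if it is defined there, otherwise the value
     * from the configuration received at activation time. Returns null
     * if neither map defines the key.
     */
    static Object lookup(String key,
            Map<String, Object> enhancementContext,
            Map<String, Object> activationConfig) {
        if (enhancementContext != null
                && enhancementContext.containsKey(key)) {
            return enhancementContext.get(key);
        }
        return activationConfig.get(key);
    }

    public static void main(String[] args) {
        Map<String, Object> activation = new HashMap<>();
        activation.put("service.user", "default-user");

        Map<String, Object> request = new HashMap<>();
        request.put("service.user", "alice");

        // Request-specific value is used when present.
        System.out.println(lookup("service.user", request, activation));
        // Falls back to the activation-time value otherwise.
        System.out.println(lookup("service.user",
                new HashMap<String, Object>(), activation));
    }
}
```

An engine that does not support request-specific settings could simply
ignore the second argument of computeEnhancements(..).
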
>
> However this will also allow advanced use cases like passing the user
> & groups to consider ACLs for EntityLinking (as described in [3]).
>
> In addition it will allow moving configurations from the Stanbol
> instance to the enhancement request. This is desirable for use cases
> as described in [4], where you want to use the same Stanbol
> configuration on multiple hosts to do load balancing.
>
> EnhancementJob
> =============
>
> This new class will be used to represent an enhancement job. It will
> contain the ContentItem, enhancement chain, execution metadata as well
> as the enhancement context.
>
> The ExecutionMetadata currently stored as ContentPart of the
> ContentItem will move over to the EnhancementJob. Note that this means
> that EnhancementEngines will no longer be able to access this
> information.
>
> The "enhancementContext" passed to an EnhancementEngine will be
> created by merging EnhancementChain level properties with
> EnhancementEngine specific properties. This means that properties
> defined on Chain level will be visible to all EnhancementEngines
> called by a chain, while EnhancementEngine level properties will be
> only visible to a specific Engine.
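
The merge described above could be sketched roughly as follows. The
class name, the method name and the property keys are hypothetical. The
rule that an engine-specific value overrides a chain-level value with
the same key is also an assumption, as the mail does not state how
conflicts are resolved.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: names and the override rule are assumptions,
// not part of the Stanbol API.
class EnhancementContextMergeSketch {

    /**
     * Builds the context for one engine: all chain-level properties,
     * overlaid with the properties registered for that engine name.
     * Properties of other engines are never included.
     */
    static Map<String, Object> contextFor(String engineName,
            Map<String, Object> chainProperties,
            Map<String, Map<String, Object>> engineProperties) {
        Map<String, Object> context = new HashMap<>(chainProperties);
        Map<String, Object> specific = engineProperties.get(engineName);
        if (specific != null) {
            // Assumption: engine-specific values win on key conflicts.
            context.putAll(specific);
        }
        return Collections.unmodifiableMap(context);
    }

    public static void main(String[] args) {
        Map<String, Object> chain = new HashMap<>();
        chain.put("language", "en");

        Map<String, Object> dbpedia = new HashMap<>();
        dbpedia.put("vocabulary", "dbpedia");
        Map<String, Object> geonames = new HashMap<>();
        geonames.put("vocabulary", "geonames");

        Map<String, Map<String, Object>> engines = new HashMap<>();
        engines.put("dbpediaLinking", dbpedia);
        engines.put("geonamesLinking", geonames);

        // Each engine sees the chain-level property plus its own.
        System.out.println(contextFor("dbpediaLinking", chain, engines));
        System.out.println(contextFor("geonamesLinking", chain, engines));
    }
}
```
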
>
> EnhancementEngine level properties are supported to allow diverging
> configurations for multiple instances of the same EnhancementEngine
> implementation being called in the same enhancement chain. A typical
> example is multiple EntityLinking engines for different vocabularies
> (e.g. dbpediaLinking and geonamesLinking).
>
> The EnhancementJobManager will be responsible for processing
> EnhancementJobs. Therefore the API will be changed to take an
> EnhancementJob instead of a ContentItem.
>
> /enhancer/task RESTful service
> =======================
>
> This endpoint will allow submitting EnhancementJobs:
>
> 1. a new endpoint that can be added at /enhancer/task
> 2. the endpoint takes a Task Request (interface to be defined)
> 3. the Task Request will allow to post:
>     * content or URL submission
>     * per-call engine parameters
>     * per-call EnhancementChain definitions
> 4. it supports synchronous operations and possibly async execution
> with a callback URI
>
> [5] suggests using JSON for the definition of such tasks, but in
> principle the definition could also be expressed in RDF.
>
> As pointed out in [6], it would also be possible to extend the current
> "MultiPart ContentItem support" to achieve the same functionality.
>
> WorkPlan
> =======
>
> 1. change the API of the EnhancementEngine interface. At first empty
> maps will get passed as enhancementContext (STANBOL-488). Adapt all
> EnhancementEngine implementations so that they ignore the additional
> parameter.
> 2. definition of the EnhancementJob interface and implementation of
> the same based on the EnhancementJob class of the
> o.a.s.enhancer.jobmanager.event module
> 3. specification and implementation of the "/enhancer/task RESTful service"
> 4. adaptation of the "MultiPart ContentItem support" to support EnhancementJobs
> 5. adapt existing EnhancementEngines to support enhancementContext
> (where applicable)
>
> best
> Rupert
>
> [1] https://issues.apache.org/jira/browse/STANBOL-488
> [2] http://markmail.org/message/hnwdw7o6bxt6pwbe
> [3] https://github.com/nuxeo/nuxeo-solr/tree/master/architecture
> [4] http://markmail.org/message/wba4ztzkkhvahcyg
> [5] http://markmail.org/message/zqztwjhndwj74jqv (part of [2])
> [6] http://markmail.org/message/bslhb7ojexdbv56l (part of [2])
> --
> | Rupert Westenthaler             [email protected]
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen



-- 
Fabian
http://twitter.com/fctwitt
