Yes, that sounds right.
As for the engines, I am not sure that we should re-open all the
existing ones and update them. I think we should maintain some kind of
backward compatibility.
We could probably accomplish that by refactoring the
'AbstractEnhancementEngine' abstract class (which I believe is used by
almost every engine?):
- adding the two methods:
-- int canEnhance(ContentItem ci, Map<String, Object> enhancementContext)
-- void computeEnhancements(ContentItem ci, Map<String, Object> enhancementContext)
that:
- will store the enhancementContext locally,
- provide access to it via a 'getEnhancementContext' method, and
- call the single-argument counterparts (canEnhance(ci) and
computeEnhancements(ci)).
New or existing engines that want to take advantage of the configuration
map will be able to either override the above or use
getEnhancementContext.
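To make the idea concrete, here is a minimal sketch of the refactoring I have in mind. It is not the actual Stanbol source: ContentItem is reduced to a hypothetical one-method stand-in, the return codes and the "confidence.threshold" key are made up for illustration, and keeping the context in a ThreadLocal is my own assumption (one engine instance may serve several requests at once), not something we have agreed on.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical, minimal stand-in for Stanbol's ContentItem. */
interface ContentItem {
    String getText();
}

/** Sketch of the refactored base class; not the actual Stanbol source. */
abstract class AbstractEnhancementEngine {

    /** Stand-in return codes for canEnhance (values assumed for this sketch). */
    static final int CANNOT_ENHANCE = 0;
    static final int ENHANCE_SYNCHRONOUS = 1;

    // Assumption: the context is stored per thread, because a single engine
    // instance may be called concurrently for different requests.
    private final ThreadLocal<Map<String, Object>> context = new ThreadLocal<>();

    /** Existing single-argument methods that current engines already implement. */
    public abstract int canEnhance(ContentItem ci);

    public abstract void computeEnhancements(ContentItem ci);

    /** Returns the context of the current call, or an empty map if none was passed. */
    protected Map<String, Object> getEnhancementContext() {
        Map<String, Object> current = context.get();
        return current != null ? current : Collections.<String, Object>emptyMap();
    }

    /** New overload: stores the context, then delegates to the old method. */
    public int canEnhance(ContentItem ci, Map<String, Object> enhancementContext) {
        context.set(enhancementContext);
        try {
            return canEnhance(ci);
        } finally {
            context.remove(); // do not leak the context into the next call
        }
    }

    /** New overload: stores the context, then delegates to the old method. */
    public void computeEnhancements(ContentItem ci, Map<String, Object> enhancementContext) {
        context.set(enhancementContext);
        try {
            computeEnhancements(ci);
        } finally {
            context.remove();
        }
    }
}

/** Hypothetical engine that reads a per-call confidence threshold. */
class ThresholdEngine extends AbstractEnhancementEngine {
    double lastThreshold = Double.NaN;

    @Override
    public int canEnhance(ContentItem ci) {
        return ci.getText().isEmpty() ? CANNOT_ENHANCE : ENHANCE_SYNCHRONOUS;
    }

    @Override
    public void computeEnhancements(ContentItem ci) {
        Object value = getEnhancementContext().get("confidence.threshold");
        // Fall back to a default when the caller passed no per-call value.
        lastThreshold = value instanceof Number ? ((Number) value).doubleValue() : 0.5;
    }
}
```

With this shape, an existing engine compiles unchanged and simply ignores the map, while an engine such as the ThresholdEngine above picks up the per-call value through getEnhancementContext.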
What do you think?
BR
David
On Tue, Dec 11, 2012 at 7:15 AM, Rupert Westenthaler <
[email protected]> wrote:
> Hi,
>
> I would opt for a mixed approach. For the EnhancementEngine interface
> I would go with the proposed
>
> int canEnhance(ContentItem ci, Map<String, Object> enhancementContext)
> void computeEnhancements(ContentItem ci, Map<String, Object> enhancementContext)
>
> however in the internal API I would like to go for the discussed
> "option-1" - an EnhancementJob class that includes the ContentItem as
> well as technical metadata.
>
> The reason why I want to have the EnhancementJob class is that it
> preserves the possibility to distribute the enhancement process onto
> different machines. Even though we currently do not support this in
> Stanbol, this might be something we want to add in the future.
>
> Having a domain object that represents the whole state of the
> execution would e.g. allow designs where the state is managed by a
> distributed datastore (e.g. MongoDB) with EnhancementEngines running
> on several distributed machines (doing the processing). Corresponding
> implementations of the EnhancementJobManager and an EnhancementJob
> class could support such a scenario without the need to change the
> implementation of actual EnhancementEngines.
>
> Two additional notes:
>
> * The EventJobManager already uses an internal EnhancementJob class
> that provides API-based access to the execution metadata. This could
> be used as a starting point for the implementation.
> * I would propose to make the change to the API immediately after the
> next (0.10.0) release of the Enhancer.
>
> best
> Rupert
>
>
>
>
> On Mon, Dec 10, 2012 at 10:00 PM, Fabian Christ
> <[email protected]> wrote:
> > Hi,
> >
> > I think the cleanest way would be to introduce
> > 'computeEnhancements(ContentItem ci, Map<String, Object> enhancementContext)'.
> >
> > I do not know how Olivier wanted to make this optional but as we do not
> > have any stable API right now, we could introduce such a change. Changing
> > the existing engines that we have is not a big deal IMO.
> >
> > I am more +1 to do it the right way now instead of fixing it at some
> > later point.
> >
> > The question to me is whether we should also change the implementation of
> > the ExecutionMetadata. Should this be part of the ContentItem?
> >
> > Best,
> > - Fabian
> >
> >
> > 2012/12/10 David Riccitelli <[email protected]>
> >
> >> Hello,
> >>
> >> I think the discussed design is correct, i.e. the configuration context
> >> should not be bound to the content item. Though, as far as I understand,
> >> STANBOL-488 is currently implemented as
> >> 'getEnhancementProperties() : Map<String,Object>', correct?
> >>
> >> Changing to 'computeEnhancements(ContentItem ci, Map<String, Object>
> >> enhancementContext)' would require too much effort on the existing
> >> engines. If we want to go that way, maybe we should create a new Engine
> >> interface such as EnhancementContextEngine, which would then support the
> >> new method?
> >>
> >> Let me know, I am open to any possible implementation.
> >>
> >> BR
> >> David
> >>
> >>
> >> On Mon, Dec 10, 2012 at 9:46 PM, Rupert Westenthaler <
> >> [email protected]> wrote:
> >>
> >> > Hi
> >> >
> >> > FYI: STANBOL-488 [1] is exactly about this topic and if I remember
> >> > correctly this is even implemented (but not yet activated/used)
> >> > because there was no agreement on this issue. So if we agree on a
> >> > design it should be relatively easy to introduce this.
> >> >
> >> >
> >> > best
> >> > Rupert
> >> >
> >> > [1] https://issues.apache.org/jira/browse/STANBOL-488
> >> >
> >> > On Mon, Dec 10, 2012 at 7:45 PM, David Riccitelli <[email protected]>
> >> > wrote:
> >> > > Ok, perfect.
> >> > >
> >> > > While we further check over the configuration content part, I've
> >> > > implemented a sample that wraps the ContentItem in a ContentItem bag,
> >> > > which also contains a configuration dictionary:
> >> > > - ContentItemBag:
> >> > > https://github.com/insideout10/stanbol-facade/blob/master/stanbol-facade-api/src/main/java/io/insideout/stanbol/facade/models/ContentItemBag.java
> >> > >
> >> > > The wrapper is fed via the enhancementJobManager:
> >> > > - https://github.com/insideout10/stanbol-facade/blob/master/stanbol-facade-api/src/main/java/io/insideout/stanbol/facade/services/TaskService.java
> >> > >
> >> > > This is a sample service which dumps the configuration parameters:
> >> > > - https://github.com/insideout10/stanbol-facade/blob/master/stanbol-facade-api/src/main/java/io/insideout/stanbol/facade/engines/ContentItemBagSpyEngine.java
> >> > >
> >> > > In the above examples the content to be lifted and the configuration
> >> > > parameters are encapsulated in a JSON request (converted to a
> >> > > TaskRequest) such as this:
> >> > >
> >> > > {
> >> > >   "configuration": {
> >> > >     "configuration.parameter.1": "value.1",
> >> > >     "configuration.parameter.2": "value.2"
> >> > >   },
> >> > >   "mimeType": "application/rdf+xml",
> >> > >   "content": " ... "
> >> > > }
> >> > >
> >> > > The above can of course be changed as soon as we define the content
> >> > > part for the per-call configuration of engines.
> >> > >
> >> > > BR,
> >> > > David
> >> > >
> >> > >
> >> > > On Mon, Dec 10, 2012 at 5:51 PM, Fabian Christ <[email protected]>
> >> > > wrote:
> >> > >
> >> > >> Hi,
> >> > >>
> >> > >> maybe we could define a specific content part [1] for such config
> >> > >> information. This way we do not have to change the interface at all.
> >> > >> It is just a matter of filling a to-be-defined "config content part".
> >> > >>
> >> > >> Then we have to write a simple engine that takes config params from an
> >> > >> incoming request and writes this information into the config content
> >> > >> part. Engines can look up the config per request from the config
> >> > >> content part.
> >> > >>
> >> > >> Maybe Rupert can say more about this as he has defined the content
> >> > >> part infrastructure.
> >> > >>
> >> > >> [1] https://stanbol.apache.org/docs/trunk/components/enhancer/contentitem.html
> >> > >>
> >> > >>
> >> > >> 2012/12/10 David Riccitelli <[email protected]>
> >> > >>
> >> > >> > Thanks Fabian,
> >> > >> >
> >> > >> > Yes, I am thinking in the context of the engines that we're
> >> > >> > contributing, but it could be useful for the existing engines as well.
> >> > >> >
> >> > >> > Currently the engines only rely on a provided ContentItem instance
> >> > >> > for the enhancement process (computeEnhancements(ContentItem ci)):
> >> > >> > maybe the ContentItem interface could be extended to include a
> >> > >> > reference to a configuration map.
> >> > >> >
> >> > >> > Engines that support custom configurations will look up per-call
> >> > >> > configurations in this map. This would not affect existing engines,
> >> > >> > but would enable them to use this feature in the future.
> >> > >> >
> >> > >> > What do you think?
> >> > >> >
> >> > >> > David
> >> > >> >
> >> > >> >
> >> > >> > On Mon, Dec 10, 2012 at 5:37 PM, Fabian Christ <[email protected]>
> >> > >> > wrote:
> >> > >> >
> >> > >> > > Hi,
> >> > >> > >
> >> > >> > > are you referring to existing engines in Stanbol or are you using
> >> > >> > > your own ones?
> >> > >> > >
> >> > >> > > At the moment we do not support such a concept of per-request
> >> > >> > > configs. At least the current engines do not look in the ContentItem
> >> > >> > > for their config.
> >> > >> > >
> >> > >> > > We have another requirement to make it possible to pass existing
> >> > >> > > metadata into the request and send text plus existing metadata to
> >> > >> > > Stanbol. Maybe such config per request could be a similar case.
> >> > >> > >
> >> > >> > > Anyway, currently it is not yet supported out of the box IIRC.
> >> > >> > >
> >> > >> > > Best,
> >> > >> > > - Fabian
> >> > >> > >
> >> > >> > >
> >> > >> > > 2012/12/10 David Riccitelli <[email protected]>
> >> > >> > >
> >> > >> > > > Hello,
> >> > >> > > >
> >> > >> > > > We have a need to allow passing some engine configuration
> >> > >> > > > parameters in each call.
> >> > >> > > >
> >> > >> > > > For example, we might want one call to have a confidence score > 0.5,
> >> > >> > > > and another call to have one > 0.9 (just making up the numbers).
> >> > >> > > >
> >> > >> > > > Is this feasible now? Can the per-call configuration parameters
> >> > >> > > > be bound to the ContentItem? If yes, how?
> >> > >> > > >
> >> > >> > > > Thanks,
> >> > >> > > > David
> >> > >> > > >
> >> > >> > > > --
> >> > >> > > > David Riccitelli
> >> > >> > > > ********************************************************************************
> >> > >> > > > InsideOut10 s.r.l.
> >> > >> > > > P.IVA: IT-11381771002
> >> > >> > > > Fax: +39 0110708239
> >> > >> > > > ---
> >> > >> > > > LinkedIn: http://it.linkedin.com/in/riccitelli
> >> > >> > > > Twitter: ziodave
> >> > >> > > > ---
> >> > >> > > > Layar Partner Network <http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
> >> > >> > > > ********************************************************************************
> >> > >> > > >
> >> > >> > >
> >> > >> > >
> >> > >> > >
> >> > >> > > --
> >> > >> > > Fabian
> >> > >> > > http://twitter.com/fctwitt
> >> > >> > >
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> > --
> >> > >> > David Riccitelli
> >> > >> >
> >> > >>
> >> > >>
> >> > >>
> >> > >> --
> >> > >> Fabian
> >> > >> http://twitter.com/fctwitt
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > David Riccitelli
> >> >
> >> >
> >> >
> >> > --
> >> > | Rupert Westenthaler [email protected]
> >> > | Bodenlehenstraße 11 ++43-699-11108907
> >> > | A-5500 Bischofshofen
> >> >
> >>
> >>
> >>
> >> --
> >> David Riccitelli
> >>
> >
> >
> >
> > --
> > Fabian
> > http://twitter.com/fctwitt
>
>
>
> --
> | Rupert Westenthaler [email protected]
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen
>
--
David Riccitelli
********************************************************************************
InsideOut10 s.r.l.
P.IVA: IT-11381771002
Fax: +39 0110708239
---
LinkedIn: http://it.linkedin.com/in/riccitelli
Twitter: ziodave
---
Layar Partner Network <http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
********************************************************************************