We could look into Behemoth (https://github.com/DigitalPebble/behemoth)
for integration with Stanbol. It already provides the basic architecture
for running document pipelines using MapReduce.
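
As a rough illustration only (this is not Behemoth's API, just a minimal
Python sketch of the map step), each document could be sent independently
to one of several identically configured Stanbol instances. The names
map_documents and http_enhance and the endpoint URLs are hypothetical; the
HTTP call assumes the enhancer accepts a plain-text POST and returns RDF,
which should be checked against the Stanbol version in use.

```python
from itertools import cycle
from typing import Callable, Iterable, Iterator, Sequence, Tuple
import urllib.request


def http_enhance(endpoint: str, text: str) -> str:
    """POST plain text to an enhancer endpoint and return the response body.

    Assumption: the endpoint accepts text/plain and can answer with RDF/XML.
    """
    req = urllib.request.Request(
        endpoint,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain", "Accept": "application/rdf+xml"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def map_documents(
    docs: Iterable[Tuple[str, str]],
    endpoints: Sequence[str],
    enhance: Callable[[str, str], str] = http_enhance,
) -> Iterator[Tuple[str, str]]:
    """Map step: enhance each (doc_id, text) pair on its own.

    Back-ends are picked round-robin, which only works because each
    enhancement request is independent of the others.
    """
    if not endpoints:
        raise ValueError("at least one enhancer endpoint is required")
    backends = cycle(endpoints)
    for doc_id, text in docs:
        yield doc_id, enhance(next(backends), text)
```

Usage would look something like
map_documents(docs, ["http://host1:8080/enhancer", "http://host2:8080/enhancer"]),
with both hosts being placeholders. The enhance callable is injectable so
the distribution logic can be tried without a running Stanbol instance.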

On Tue, Mar 5, 2013 at 2:27 PM, Bertrand Delacretaz
<[email protected]> wrote:

> Hi,
>
> On Mon, Mar 4, 2013 at 6:57 PM, Som Satpathy <[email protected]>
> wrote:
> > ...I have been working on implementing a map-reduce job to run Stanbol
> > enhancement chains over hadoop. Is there work currently going on to
> > address the scalability aspect?...
>
> Note that you could scale Stanbol as is, using HTTP load balancing to
> address multiple Stanbol back-end instances which all have the same
> config, data files etc.
>
> As the content enhancer is stateless, this should be relatively simple
> to implement, though we might need to provide some replication/sync
> facilities for those configs and data files.
>
> Are you aiming for map-reducing a single enhancement request, by
> breaking up the submitted content in small parts and enhancing them
> independently?
>
> -Bertrand
>
