Hi Bertrand,

Thanks for getting back to me on this. You are right: I am aiming to distribute the enhancement request for the posted content over a cluster of nodes, so that we can process large volumes of data quickly. We would need only a mapper to achieve this (no reducer needed).
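To make the mapper-only idea a bit more concrete, here is a rough sketch of what I have in mind. It assumes Hadoop Streaming with a Python mapper, one document per input line in a `doc_id<TAB>content` layout, and a Stanbol enhancer reachable at `http://localhost:8080/enhancer`. The URL, the record layout, the `Accept` type and the names `enhance`, `run_mapper` and `main` are all assumptions for illustration, not existing code.

```python
import sys
import urllib.request

# Assumed endpoint of a Stanbol instance; adjust to the real deployment.
STANBOL_URL = "http://localhost:8080/enhancer"


def enhance(text, url=STANBOL_URL):
    """POST plain text to the enhancer and return the response body as a string."""
    req = urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def run_mapper(lines, enhance_fn=enhance):
    """Yield 'doc_id<TAB>enhancements' for each 'doc_id<TAB>content' input line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank input lines
        doc_id, _, content = line.partition("\t")
        result = enhance_fn(content)
        # Collapse whitespace so each record stays on a single output line.
        yield doc_id + "\t" + " ".join(result.split())


def main(stdin=sys.stdin, stdout=sys.stdout, enhance_fn=enhance):
    """Entry point for a streaming mapper: read records, write enhanced records."""
    for record in run_mapper(stdin, enhance_fn):
        stdout.write(record + "\n")
```

Each input record is enhanced independently, so nothing has to be aggregated afterwards. With Hadoop Streaming, the script would call `main()` at the bottom and the job would be submitted with the reducer count set to zero (for example with `-numReduceTasks 0`), so the mapper output is written out directly.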
Thanks,
Som

On Tue, Mar 5, 2013 at 12:57 AM, Bertrand Delacretaz <[email protected]> wrote:
> Hi,
>
> On Mon, Mar 4, 2013 at 6:57 PM, Som Satpathy <[email protected]> wrote:
> > ...I have been working on implementing a map-reduce job to run Stanbol
> > enhancement chains over hadoop. Is there work currently going on to address
> > the scalability aspect?...
>
> Note that you could scale Stanbol as is using http load balancing to
> address multiple Stanbol back-end instances which all have the same
> config, data files etc.
>
> As the content enhancer is stateless, this should be relatively simple
> to implement, though we might need to provide some replication/sync
> facilities for those configs and data files.
>
> Are you aiming for map-reducing a single enhancement request, by
> breaking up the submitted content in small parts and enhancing them
> independently?
>
> -Bertrand
