2011/3/25 Olivier Grisel <[email protected]>

> I think that we should definitely work at some point to be able to run
> an arbitrary UIMA analysis chain inside a Stanbol Enhancer. We need to
> write a dummy collection reader that turns a ContentItem into a CAS
> and a generic CAS consumer that converts the output into a Clerezza
> Graph + a UIMAEnhancer that takes a CPE configuration to embed. Also
> the CAS to Clerezza Graph consumer could be directly contributed to
> the Clerezza project while the ContentItem to CAS collection reader is
> Stanbol specific.
Good points. I can work on the ClerezzaGraph CAS Consumer, based on the
uima.utils Clerezza module [1].
I was also thinking about how an engine could be packaged as a UIMA PEAR, to
allow the execution of Stanbol engines inside UIMA pipelines without
writing "mapping code" (I wonder whether a custom Maven plugin could do that).

> That would allow Stanbol users to reuse existing UIMA tools and turn
> them into a more linked data centric REST service.

:)

> As for the use case, this is indeed interesting. Please note that the
> Solr engine embedded inside the entity hub is dedicated to fast local
> indexing of Linked Data entities (dbpedia entries for instance) and not
> documents. Stanbol is not really meant to be a document management
> system (at least not in the short term) but more like a knowledge base
> management system that lives next to an existing CMS that would
> probably have its own instance of Solr to index its documents.
> Extending Stanbol to build semantically enriched indices of documents
> would still be in the scope of Stanbol but I think we should first
> focus on finishing the cleaning / refactoring of the existing code
> base before implementing new services.

I fully agree. I just wanted to raise the point in time to make proper
architectural considerations and share existing usage scenarios :)

Cheers,
Tommaso

[1] : http://svn.apache.org/repos/asf/incubator/clerezza/trunk/parent/uima/uima.utils/
