2011/3/25 Olivier Grisel <[email protected]>

>
> I think that we should definitely work at some point to be able to run
> an arbitrary UIMA analysis chain inside a Stanbol Enhancer. We need to
> write a dummy collection reader that turns a ContentItem into a CAS
> and a generic CAS consumer that converts the output into a Clerezza
> Graph + a UIMAEnhancer that takes a CPE configuration to embed. Also
> the CAS to Clerezza Graph consumer could be directly contributed to
> the Clerezza project while the ContentItem to CAS collection reader is
> Stanbol-specific.
>

Good points. I can work on the Clerezza Graph CAS consumer, basing it on the
uima.utils Clerezza module [1].
I was also thinking about how an engine could be packaged as a UIMA PEAR, to
allow Stanbol engines to run inside UIMA pipelines without writing
"mapping code" (I wonder whether a custom Maven plugin could do that).

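To make the three pieces discussed above a bit more concrete, here is a rough
sketch in plain Java of how data could flow from a ContentItem through a CAS
to graph triples. Everything in it is a hypothetical stand-in: the
ContentItem, Cas, Annotation and Triple classes, the method names and the
"urn:example:mentions" predicate are assumptions made for illustration and
are not the Stanbol, UIMA or Clerezza APIs. The "analysis" step is a toy that
marks capitalized tokens and only stands in for a real UIMA analysis chain.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: stand-in types, not the real Stanbol / UIMA / Clerezza classes.
final class UimaBridgeSketch {

    // Hypothetical stand-in for a Stanbol ContentItem (id + plain text).
    static final class ContentItem {
        final String id;
        final String text;
        ContentItem(String id, String text) { this.id = id; this.text = text; }
    }

    // Hypothetical stand-in for a UIMA annotation (typed character span).
    static final class Annotation {
        final String type;
        final int begin;
        final int end;
        Annotation(String type, int begin, int end) {
            this.type = type; this.begin = begin; this.end = end;
        }
    }

    // Hypothetical stand-in for a UIMA CAS (document text + annotations).
    static final class Cas {
        final String text;
        final List<Annotation> annotations = new ArrayList<>();
        Cas(String text) { this.text = text; }
    }

    // Hypothetical stand-in for one statement of a Clerezza graph.
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;
        Triple(String subject, String predicate, String object) {
            this.subject = subject; this.predicate = predicate; this.object = object;
        }
        @Override public String toString() {
            return "<" + subject + "> <" + predicate + "> \"" + object + "\"";
        }
    }

    // Role of the "dummy collection reader": ContentItem -> CAS.
    static Cas read(ContentItem item) {
        return new Cas(item.text);
    }

    // Placeholder for the embedded analysis chain: marks capitalized tokens.
    static void annotate(Cas cas) {
        String text = cas.text;
        int i = 0;
        while (i < text.length()) {
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int begin = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > begin && Character.isUpperCase(text.charAt(begin))) {
                cas.annotations.add(new Annotation("Mention", begin, i));
            }
        }
    }

    // Role of the "generic CAS consumer": annotations -> triples.
    static List<Triple> consume(String subjectUri, Cas cas) {
        List<Triple> graph = new ArrayList<>();
        for (Annotation a : cas.annotations) {
            graph.add(new Triple(subjectUri, "urn:example:mentions",
                    cas.text.substring(a.begin, a.end)));
        }
        return graph;
    }

    // Role of the enhancer: wires reader, analysis and consumer together.
    static List<Triple> enhance(ContentItem item) {
        Cas cas = read(item);
        annotate(cas);
        return consume(item.id, cas);
    }

    public static void main(String[] args) {
        ContentItem item = new ContentItem("urn:content:1", "Stanbol meets UIMA");
        for (Triple t : enhance(item)) {
            System.out.println(t);
        }
    }
}
```

The point of the split is that only read() would need to know about Stanbol,
while consume() depends only on the CAS and the graph model, which is why
that half could live in Clerezza.
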


>
> That would allow Stanbol users to reuse existing UIMA tools and turn
> them into a more linked data centric REST service.
>

:)


>
> As for the use case, this is indeed interesting. Please note that the
> Solr engine embedded inside the entity hub is dedicated to fast local
> indexing of Linked Data entities (dbpedia entries for instance) and not
> documents. Stanbol is not really meant to be a document management
> system (at least not in the short term) but more like a knowledge base
> management system that lives next to an existing CMS that would
> probably have its own instance of Solr to index its documents.
>
> Extending Stanbol to build semantically enriched indices of documents
> would still be in the scope of Stanbol but I think we should first
> focus on finishing the cleaning / refactoring of the existing code
> base before implementing new services.



I fully agree. I just wanted to raise the point early enough to allow proper
architectural considerations, and to share existing usage scenarios :)
Cheers,
Tommaso

[1] :
http://svn.apache.org/repos/asf/incubator/clerezza/trunk/parent/uima/uima.utils/
