You are trying to generate clusters of similar artifacts. Though this can be done at processing time, a better approach is to keep the annotated results in a database. My suggestion is to use an index for fast retrieval.

UIMA doesn't provide anything special for this, but you can use Lucene/Solr to achieve it. Lucene/Solr has a "MoreLikeThis" feature, which is very handy for finding related articles.
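
To make the index idea concrete, here is a rough sketch in plain Python of what such an index does: it maps each term to the documents containing it and ranks them by a simple frequency-based score. This is not Lucene's or Solr's API, and it is not how MoreLikeThis is implemented; the names TinyIndex, add and related, the sample texts and the scoring formula are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase the text and keep alphabetic runs only (a deliberately naive tokenizer).
    return re.findall(r"[a-z]+", text.lower())


class TinyIndex:
    """Hypothetical toy inverted index: term -> {doc_id: term frequency}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.doc_ids = set()

    def add(self, doc_id, text):
        # Record how often each term occurs in this document.
        for term, tf in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = tf
        self.doc_ids.add(doc_id)

    def related(self, term, limit=5):
        # Look up the documents containing the term and rank them by
        # term frequency weighted by how rare the term is (an assumed,
        # simplified TF-IDF-style score).
        hits = self.postings.get(term.lower(), {})
        if not hits:
            return []
        idf = math.log(1 + len(self.doc_ids) / len(hits))
        scored = [(doc_id, tf * idf) for doc_id, tf in hits.items()]
        # Highest score first; ties broken by document id for stable output.
        return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:limit]


index = TinyIndex()
index.add("paper-1", "Cancer screening reduces cancer mortality")
index.add("paper-2", "A survey of protein folding")
index.add("paper-3", "Gene expression in cancer cells")

for doc_id, score in index.related("Cancer"):
    print(doc_id, round(score, 3))
```

In a real setup you would let Lucene or Solr build and query the index, and store the annotations produced by UIMA as fields of the indexed documents; the sketch only shows why lookups against an index are cheap compared with rescanning every article.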

cheers
Anuj
Radwen Aniba wrote:
Hello everyone,

I have a question regarding UIMA usage.
Until now I have used UIMA to annotate documents, and that's cool, everything is
great.
But now I will probably need to parse lots and lots of scientific
articles and abstracts to extract knowledge.
Example: let's say I have a document containing the word "Cancer". I would
like to parse available scientific papers related to cancer and attach
this information to the word "cancer", something like "related articles",
with a relevance score depending on, for example, the word's occurrence in
the document.
It is like a search engine, but for text.
I know that's feasible, but I don't know where or how to start.
Should I keep all the articles and scientific papers in a kind of database or
something like that? Or is there any special format that UIMA could use as a
"literature" database?

Can someone help me figure this out?

Thanks a lot

Rad

