Hi Spico,

It is certainly possible with UIMA. Right now there are components in
the sandbox (such as the Dictionary Annotator or the Concept Mapper
Annotator) which aim at recognizing text forms in a text from
dictionaries.

Personally, I was not satisfied with the current solutions: they were
either too simple (no features can be associated with an entry of the
DictionaryAnnotator) or too complex for me to set up (the Concept
Mapper Annotator is based on a tokenizer which was different from
mine).

Building on previous work by Jerome Rocheteau, I developed a simple
Dictionary Annotator with the following features:
  * a dictionary is a UIMA resource (one instance can be shared by
multiple annotators) [1]
  * the dictionary design is abstract enough to allow several implementations.
  * right now it comes with one implementation of dictionary format:
CSV (one column is the entry and the others are feature values), but
XML RDF would be an easy addition
  * the dictionary entries are strings of characters which are stored
as a prefix tree of characters so that recognition is fast
  * it is not type system dependent
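
To give an idea of the prefix-tree lookup described above, here is a
minimal sketch in plain Java. It is not the annotator's actual code:
the class and method names (PrefixTreeDictionary, addCsvLine, findAll),
the ';' separator and the sample entries are all assumptions made for
the illustration, and the UIMA resource wiring is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch: dictionary entries stored as a prefix tree of characters. */
class PrefixTreeDictionary {

    /** One node per character; features is non-null only where an entry ends. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        List<String> features;
    }

    /** A recognized entry: character offsets in the text plus its feature values. */
    static final class Match {
        final int begin;
        final int end;
        final List<String> features;

        Match(int begin, int end, List<String> features) {
            this.begin = begin;
            this.end = end;
            this.features = features;
        }
    }

    private final Node root = new Node();

    /** Reads one CSV line: the first column is the entry, the others are feature values. */
    void addCsvLine(String line, String separator) {
        String[] columns = line.split(Pattern.quote(separator), -1);
        if (columns[0].isEmpty()) {
            return; // ignore lines without an entry
        }
        add(columns[0], Arrays.asList(columns).subList(1, columns.length));
    }

    /** Inserts an entry character by character, creating nodes as needed. */
    void add(String entry, List<String> features) {
        Node node = root;
        for (int i = 0; i < entry.length(); i++) {
            node = node.children.computeIfAbsent(entry.charAt(i), c -> new Node());
        }
        node.features = new ArrayList<>(features);
    }

    /** Scans the text left to right and keeps the longest entry found at each start offset. */
    List<Match> findAll(String text) {
        List<Match> matches = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            Node node = root;
            int longestEnd = -1;
            List<String> longestFeatures = null;
            for (int i = start; i < text.length(); i++) {
                node = node.children.get(text.charAt(i));
                if (node == null) {
                    break; // no entry continues with this character
                }
                if (node.features != null) {
                    longestEnd = i + 1;
                    longestFeatures = node.features;
                }
            }
            if (longestEnd > 0) {
                matches.add(new Match(start, longestEnd, longestFeatures));
                start = longestEnd; // resume after the recognized entry
            } else {
                start++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        PrefixTreeDictionary dictionary = new PrefixTreeDictionary();
        dictionary.addCsvLine("heart;Organ", ";");
        dictionary.addCsvLine("heart attack;Disease;C001", ";");
        for (Match m : dictionary.findAll("a heart attack")) {
            System.out.println(m.begin + "-" + m.end + " " + m.features);
        }
    }
}
```

Note that this sketch matches on raw characters, so an entry can also
be found inside a longer word; a real annotator would probably check
token boundaries as well.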

It would not cost too much to add an extension to deal with XML RDF
(only a parser and the connector to the data structure). I have planned
to open the code soon, but I can make it available sooner if you're
interested in participating.
Anyway, I'll be interested to know a bit more about how you wanted to
use your RDF format (what are the entries, the values...)

Best regards

/Nicolas

[1] 
http://uima.apache.org/downloads/releaseDocs/2.1.0-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.aae.accessing_external_resource_files



On Mon, Oct 31, 2011 at 9:41 PM, Alexander Klenner
<[email protected]> wrote:
> Hi Florin,
>
> I think what you are looking for is an UIMA type system that corresponds to 
> your specific RDF ontologies (URIs). As far as I know you must implement this 
> type system by hand (experienced UIMA users please correct me if I am wrong 
> here...).
>
> There is an RDF CAS Consumer to be found in the UIMA sandbox:
>
> http://uima.apache.org/sandbox.html#rdfcas.consumer
>
> that does it the other way round, an existing type system in a CAS is 
> converted to RDF triplestore format. But the created URIs from the typesystem 
> change from one run to another for the same artefact, which makes them not 
> really usable in a bigger RDF context. But maybe this could be a starting 
> point for further investigation...
>
> Cheers,
>
> Alex
>
>
>
> --
> Dipl. Bioinformatiker Alexander G. Klenner
> Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
> Schloss Birlinghoven, D-53754 Sankt Augustin
> Tel.: +49 - 2241 - 14 - 2736
> E-mail: [email protected]
> Internet: http://www.scai.fraunhofer.de
>
>
> ----- Original Message -----
> From: "Spico Florin" <[email protected]>
> To: [email protected]
> Sent: Monday, 31 October 2011 16:48:18
> Subject: Consuming RDF ontologies as dictionaries
>
> Hello!
>  I'm a newbie in UIMA. I would like to know if it is possible to create a
> dictionary (vocabulary) from an RDF triplestore. I would like UIMA to
> be used to classify the words contained in a text by using a given ontology
> stored in a triplestore.
> How can I use UIMA in this particular use case?
>  I look forward to your answers.
>  Thank you.
>  Regards,
>  Florin
>



-- 
Dr. Nicolas Hernandez
Associate Professor (Maître de Conférences)
Université de Nantes - LINA CNRS
http://enicolashernandez.blogspot.com
http://www.univ-nantes.fr/hernandez-n
+33 (0)2 51 12 58 55
+33 (0)2 40 30 60 67
