Hi Pei,

Thank you very much for your answer.

I am looking for good corpora, and my group is thinking about building a new 
one to train the ML-based models. I will also look into the hard-coded rules to 
see what needs to change.

AFAIK, the UMLS includes a subset of terms translated into Spanish, and these 
correspond to the terms in the Spanish version of SNOMED CT.
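
As a rough sketch of how those Spanish terms could be pulled out, assuming the 
standard pipe-delimited MRCONSO.RRF file from a UMLS release (no header row, 
language in the LAT column, source vocabulary in SAB, term string in STR): the 
function name and the sample rows below are made up for illustration, and I am 
assuming SCTSPA is the source abbreviation for the Spanish SNOMED CT.

```python
# Column positions in MRCONSO.RRF (pipe-delimited, no header row).
CUI, LAT, SAB, STR = 0, 1, 11, 14


def spanish_terms(lines, source=None):
    """Yield (CUI, SAB, STR) for rows whose language code is 'SPA'.

    `lines` is any iterable of raw MRCONSO.RRF lines, e.g. an open file.
    `source` optionally restricts rows to one source vocabulary (SAB).
    """
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= STR or fields[LAT] != "SPA":
            continue  # malformed row or not Spanish
        if source is not None and fields[SAB] != source:
            continue  # Spanish, but from another vocabulary
        yield fields[CUI], fields[SAB], fields[STR]


# Invented placeholder rows, only to show the expected shape of the input.
sample = [
    "C0000001|SPA|P|L1|PF|S1|Y|A1||||SCTSPA|PT|111|ejemplo uno|0|N||\n",
    "C0000001|ENG|P|L2|PF|S2|Y|A2||||SNOMEDCT_US|PT|111|example one|0|N||\n",
    "C0000002|SPA|P|L3|PF|S3|Y|A3||||MSHSPA|MH|D1|ejemplo dos|0|N||\n",
]

for cui, sab, term in spanish_terms(sample, source="SCTSPA"):
    print(cui, sab, term)
```

With a real release, the same function could be fed an open file handle for 
MRCONSO.RRF instead of the sample list.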

I will share my questions and my progress here as I work on getting cTAKES 
running in Spanish and, hopefully, other languages.

Cheers,

--
Roberto Costumero Moreno
Laboratorio de Minería de Datos y Simulación (MIDAS)
Centro de Tecnología Biomédica
Universidad Politecnica de Madrid
[email protected]
Tlf: +34 91 336 4664

On 15/11/2013, at 14:49, Chen, Pei <[email protected]> wrote:

> Hi Roberto,
> Welcome!  
> 
> In theory, in order to have cTAKES work in a different language, we would 
> just need to:
> -Retrain the existing ML-based models for the language and code should just 
> work as is for
> -Update any hard-coded rules
> -Use the Spanish dictionary for concepts (I believe UMLS already has a 
> Spanish translation for some of their thesauruses).
> I think it would be awesome to have cTAKES work with multiple languages 
> including Spanish!
> Actually, a lot of folks have been asking about cTAKES models in different 
> languages.
> The challenging thing with the supervised machine learning methods is that 
> we'll have to rely on local domain experts to create the gold standard for 
> training.
> There is a group that may be contributing retrained models for cTAKES to work 
> in French.
> Others can feel free to chime in...
> 
> --Pei
> 
>> -----Original Message-----
>> From: Roberto Costumero Moreno [mailto:[email protected]]
>> Sent: Thursday, November 14, 2013 5:43 AM
>> To: [email protected]
>> Subject: cTAKES Translation
>> 
>> Hello everyone,
>> 
>> My name is Roberto Costumero. I am a Ph.D. student at the Technical
>> University of Madrid in Spain and I am new to this list, so I am
>> introducing myself and posting some questions I have.
>> 
>> We are currently involved in a project with several hospitals, and we are
>> working closely with them to understand their needs so that we can build an
>> application that lets them use the knowledge in their clinical notes and
>> imaging, among other things.
>> 
>> We have been looking at different projects to see which one fits our needs
>> and, of course, which one we will share our research with. Among the
>> projects we have seen in the field of clinical text analysis, we think
>> cTAKES is the best one out there, and it is very well structured and
>> organized. The main problem we face is that every clinical text-based NLP
>> project is developed for English, and we will be working with Spanish texts.
>> 
>> We have already done some work testing different algorithms, translated to
>> Spanish, to detect negation and context dependency, but we would like to
>> work with a well-tested, complete framework, so we thought of cTAKES. I
>> have a couple of questions for you.
>> 
>> - Does anyone know if someone is already working on translating cTAKES
>> modules to work with other languages (Spanish in particular)?
>> - Do you think it would be very difficult because of any aspect of the
>> architectural design I am not currently aware of?
>> - Do you think it would be a good line of development for the cTAKES
>> project to work together on extending cTAKES to Spanish in this case?
>> 
>> Thank you very much in advance for your help.
>> 
>> Sincerely,
>> 
>> --
>> Roberto Costumero Moreno
>> Laboratorio de Minería de Datos y Simulación (MIDAS)
>> Centro de Tecnología Biomédica
>> Universidad Politecnica de Madrid
>> [email protected]
>> Tlf: +34 91 336 4664
> 
