On 09/18/2013 08:59 PM, Karthik Sarma wrote:
I would imagine that cTAKES would run fine on a non-English OS configuration... Java is supposed to be good at taking care of that (of course, unless we introduce internationalized strings, it'd remain in English). I agree with James that the best guess is processing non-English text. UMLS has some non-English language dictionaries, but I suspect that some of the components wouldn't internationalize very well in some cases (e.g. RTL languages in general, LVG, smoking status, maybe the POS tagger, maybe even the tokenizer, etc).
In some parts where OpenNLP is used, there can be issues depending on the locale you are running on; some methods, such as String.toLowerCase, depend on it. We will hopefully fix these issues in the 1.6.0 release.

Jörn
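
To illustrate the locale dependence of String.toLowerCase mentioned above, here is a minimal sketch. The class name LocaleLowerCaseDemo and the helper lower are made up for illustration and are not taken from OpenNLP or cTAKES. It only shows the general JDK behavior: under a Turkish locale, an uppercase "I" lower-cases to a dotless "ı" (U+0131), so a dictionary lookup keyed on "title" would miss.

```java
import java.util.Locale;

// Hypothetical demo class, not part of OpenNLP or cTAKES.
class LocaleLowerCaseDemo {

    // Lower-cases a token using an explicitly supplied locale,
    // instead of relying on the JVM's default locale.
    static String lower(String token, Locale locale) {
        return token.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String token = "TITLE";
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Locale-neutral lower-casing: the usual choice for identifiers
        // and dictionary keys.
        String neutral = lower(token, Locale.ROOT);

        // Turkish lower-casing maps 'I' to dotless 'ı' (U+0131).
        String inTurkish = lower(token, turkish);

        System.out.println("ROOT matches \"title\":    " + neutral.equals("title"));
        System.out.println("Turkish matches \"title\": " + inTurkish.equals("title"));
    }
}
```

The no-argument String.toLowerCase() uses the JVM default locale, so the same code can behave differently from one machine to another. Passing an explicit locale such as Locale.ROOT is one common way to avoid that; whether that matches the fix planned for 1.6.0 is an assumption on my part.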
