The cTAKES dependency parser is a wrapper around the ClearNLP parser, with some modifications at training time that, I believe, let it use cTAKES preprocessing. cTAKES then maps the ClearNLP parser output to the UIMA type system. If you want to train on another language, especially if you don't already have a POS tagger, it may be simpler to train ClearNLP on its own and import the model file. One drawback is that we have not kept up with ClearNLP development. So if you want to be very helpful :), you could train your model on the most recent ClearNLP release and then update the cTAKES class that maps from ClearNLP to our type system. That update may not be necessary: I'm not sure whether they have actually changed the interface, but it could also be completely different.
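For anyone unfamiliar with that mapping step, here is a minimal sketch of the general idea in plain Java. It is not the cTAKES code and it does not use the real ClearNLP or UIMA classes: ParsedNode, TargetNode and convert are hypothetical stand-ins, and the German tokens and labels are illustrative only. It assumes the parser numbers tokens 1..n in sentence order and uses head id 0 for the root.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are real ClearNLP or cTAKES classes.
final class DepMappingSketch {

    /** Stand-in for one node of the parser output: the head is given as an id. */
    static final class ParsedNode {
        final int id;        // 1-based position in the sentence (assumed)
        final String form;
        final String pos;
        final int headId;    // 0 means "attached to the artificial root" (assumed)
        final String label;

        ParsedNode(int id, String form, String pos, int headId, String label) {
            this.id = id;
            this.form = form;
            this.pos = pos;
            this.headId = headId;
            this.label = label;
        }
    }

    /** Stand-in for a typesystem node: the head is an object reference. */
    static final class TargetNode {
        final int id;
        final String form;
        final String pos;
        String deprel;
        TargetNode head;

        TargetNode(int id, String form, String pos) {
            this.id = id;
            this.form = form;
            this.pos = pos;
        }
    }

    /** Converts id-based parser nodes into reference-linked target nodes. */
    static List<TargetNode> convert(List<ParsedNode> parsed) {
        List<TargetNode> out = new ArrayList<>();
        // Index 0 holds an artificial root so that ids line up with list indices.
        out.add(new TargetNode(0, "ROOT", "ROOT"));
        // Pass 1: create every target node before linking any of them.
        for (ParsedNode p : parsed) {
            out.add(new TargetNode(p.id, p.form, p.pos));
        }
        // Pass 2: resolve head ids into references and copy the relation label.
        for (ParsedNode p : parsed) {
            TargetNode node = out.get(p.id);
            node.head = out.get(p.headId);
            node.deprel = p.label;
        }
        return out;
    }

    public static void main(String[] args) {
        List<ParsedNode> parsed = Arrays.asList(
                new ParsedNode(1, "Patient", "NN", 2, "SB"),
                new ParsedNode(2, "hat", "VAFIN", 0, "ROOT"),
                new ParsedNode(3, "Fieber", "NN", 2, "OA"));
        for (TargetNode n : convert(parsed)) {
            if (n.head != null) { // skip the artificial root itself
                System.out.println(n.form + " <- " + n.head.form + " (" + n.deprel + ")");
            }
        }
    }
}
```

The two passes are there because a token's head can come later in the sentence, so all target nodes have to exist before any head reference is set. If a newer ClearNLP release changed its tree interface, a class of roughly this shape is the part that would need updating.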
The main classes are:

/ctakes-dependency-parser/src/main/java/org/apache/ctakes/dependency/parser/ae/ClearNLPDependencyParserAE.java
(the AE that sets up the parser, converts cTAKES tokens into ClearNLP tokens, parses, and makes a call to convert to the UIMA type system)

/ctakes-dependency-parser/src/main/java/org/apache/ctakes/dependency/parser/util/ClearDependencyUtility.java
(the utility class that converts from ClearNLP DepTree trees to UIMA ConllDependencyNode trees)

Tim

________________________________
From: lewis john mcgibbney <[email protected]>
Sent: Tuesday, October 25, 2016 9:47 AM
To: [email protected]
Subject: Re: cTakes Models

Hi Leander,

Have you seen the following documentation?
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+Dictionaries+and+Models

On Tue, Oct 25, 2016 at 12:29 AM, <[email protected]> wrote:

From: Leander Melms <[email protected]>
To: [email protected]
Cc:
Date: Tue, 25 Oct 2016 09:21:21 +0200
Subject: cTakes Models

Dear cTAKES Dev Community,

I'm a doctoral candidate at ZUSE Marburg, an institution for differential diagnosis of rare diseases, and I'm currently working with cTAKES. I'd like to train various models on a German text corpus, including but not limited to the dependency parser (mayo-en-dep), the lemmatizer, the semantic role labeler, the relation extractor, and the historyOf, conditional (cTAKES assertion), polarity, subject, uncertainty, and sideEffect libsvm models. Unfortunately, I haven't yet found much information on how training was performed.

Any hints on how the training data was labeled and formatted, or even training scripts, would be greatly appreciated. We have a mid-sized text corpus of patient discharge summaries and would be excited to integrate cTAKES into differential diagnosis.

Best regards
Leander Melms
Doctoral candidate and medical student at Philipps-University Marburg
[email protected]

--
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney
