Re: cTakes Models

Miller, Timothy Tue, 25 Oct 2016 07:48:39 -0700

There are a few different assertion modules (polarity, uncertainty, subject, 
conditional, historyOf). The ClearTK/Liblinear-based model can train and eval 
all those modules in the same class:



/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/eval/AssertionEvaluation.java


If you look in the resources/launch subdirectory, there are Eclipse launch 
configurations for preprocessing (reading from gold standard annotations and 
writing XMIs) and for different evaluations (e.g., cross-validation). Even if 
you're not using Eclipse, they will help you understand the arguments that 
class takes.


The trickiest part of adapting to your data will probably be reading the gold 
standard. If you look in the preprocess() method of the eval class above, it 
makes calls to different corpus readers depending on the corpus argument. So 
unless you use the same tool (Knowtator) and schema, you will probably need to 
write your own corpus reader. But you can look there for some examples of how 
to do so. Sorry, but all the code in that module could use some cleaning.
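
As a rough illustration of what the parsing half of a custom corpus reader 
might look like, here is a minimal sketch in plain Java. Everything in it is an 
assumption made for illustration: the tab-separated file layout, the class and 
field names, and the value encodings are all hypothetical, and it does not use 
the cTAKES or UIMA APIs. A real reader would also need to write these values 
onto annotations in the CAS, as the existing readers called from preprocess() 
do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: parses gold-standard assertion attributes from a
// made-up tab-separated format. Not part of cTAKES.
class GoldAssertionReaderSketch {

    // One gold mention carrying the five assertion attributes.
    static final class GoldMention {
        final int begin;            // character offset where the mention starts
        final int end;              // character offset where the mention ends
        final int polarity;         // assumed encoding: 1 = asserted, -1 = negated
        final boolean uncertain;
        final String subject;       // e.g. "patient" or "family_member" (assumed labels)
        final boolean conditional;
        final boolean historyOf;

        GoldMention(int begin, int end, int polarity, boolean uncertain,
                    String subject, boolean conditional, boolean historyOf) {
            this.begin = begin;
            this.end = end;
            this.polarity = polarity;
            this.uncertain = uncertain;
            this.subject = subject;
            this.conditional = conditional;
            this.historyOf = historyOf;
        }
    }

    // Assumed line format (hypothetical):
    // begin<TAB>end<TAB>polarity<TAB>uncertain<TAB>subject<TAB>conditional<TAB>historyOf
    // Blank lines and lines starting with '#' are skipped.
    static List<GoldMention> parse(List<String> lines) {
        List<GoldMention> mentions = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue;
            }
            // limit -1 keeps trailing empty fields so the count check is reliable
            String[] f = line.split("\t", -1);
            if (f.length != 7) {
                throw new IllegalArgumentException(
                        "Expected 7 tab-separated fields but got " + f.length + ": " + line);
            }
            mentions.add(new GoldMention(
                    Integer.parseInt(f[0].trim()),
                    Integer.parseInt(f[1].trim()),
                    Integer.parseInt(f[2].trim()),
                    Boolean.parseBoolean(f[3].trim()),
                    f[4].trim(),
                    Boolean.parseBoolean(f[5].trim()),
                    Boolean.parseBoolean(f[6].trim())));
        }
        return mentions;
    }

    public static void main(String[] args) {
        List<GoldMention> gold = parse(Arrays.asList(
                "# begin\tend\tpolarity\tuncertain\tsubject\tconditional\thistoryOf",
                "10\t18\t-1\tfalse\tpatient\tfalse\ttrue"));
        for (GoldMention m : gold) {
            System.out.println(m.begin + "-" + m.end
                    + " polarity=" + m.polarity + " historyOf=" + m.historyOf);
        }
    }
}
```

Keeping the parsing separate from the code that populates the CAS makes it 
easier to test the reader against your own annotation export before wiring it 
into the evaluation class.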


Tim


________________________________
From: Leander Melms <[email protected]>
Sent: Tuesday, October 25, 2016 10:25 AM
To: [email protected]
Subject: Re: cTakes Models

Yes, I successfully trained a chunker and a sentence detection model. However, 
I'm still not sure how the lemmatizer, the semantic role labeler, the relation 
extractor, and the historyOf, conditional (cTakes assertion), polarity, 
subject, uncertainty, and sideEffect libsvm models were trained.

Any documentation or training scripts would be highly appreciated.

Thanks
Leander Melms
On 25 Oct 2016, at 15:47, lewis john mcgibbney 
<[email protected]> wrote:

Hi Leander,
Have you seen the following documentation?

https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+Dictionaries+and+Models

On Tue, Oct 25, 2016 at 12:29 AM, <[email protected]> wrote:

From: Leander Melms <[email protected]>
To: [email protected]
Cc:
Date: Tue, 25 Oct 2016 09:21:21 +0200
Subject: cTakes Models
Dear cTAKES Dev Community,

I'm a doctoral candidate at ZUSE Marburg, an institution for differential 
diagnosis of rare diseases, and I'm currently working with cTAKES. I'd like to 
train various models, including but not limited to the dependency parser 
(mayo-en-dep), the lemmatizer and the semantic role labeler, relation 
extractor, historyOf, conditional (cTakes assertion), polarity, subject, 
uncertainty, sideEffect libsvm models, on a German text corpus. 
Unfortunately, I haven't found much information on how training was performed 
yet.

Any hints towards how the training data was labeled and formatted or even 
training scripts would be greatly appreciated. We have a mid-sized text corpus 
of patient discharge summaries and would be excited to integrate cTAKES in 
differential diagnoses.

Best regards
Leander Melms
Doctoral candidate and medical student at Philipps-University Marburg
[email protected]




--
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney
