RE: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison]

2015-01-09 Thread Finan, Sean
for best F1 measure [was RE: cTakes Annotation Comparison Sean (or others), Of the various configuration options described below, which values/choices would you recommend for best F1 measure for something like the shared clef 2013 task? https://sites.google.com/site/shareclefehealth/ I'm

dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison]

2015-01-09 Thread Masanz, James J.
, 2014 10:43 AM To: dev@ctakes.apache.org; kim.eb...@imatsolutions.com Subject: RE: cTakes Annotation Comparison Also check out stats that Sean ran before releasing the new component on: http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-fast/doc/DictionaryLookupStats.docx From

Re: cTakes Annotation Comparison

2014-12-19 Thread David Kincaid
Thanks for this, Bruce! Very interesting work. It confirms what I've seen in my small tests that I've done in a non-systematic way. Did you happen to capture the number of false positives yet (annotations made by cTAKES that are not in the human adjudicated standard)? I've seen a lot of dictionary

RE: cTakes Annotation Comparison

2014-12-19 Thread Savova, Guergana
were similar. Thank you everyone! --Guergana -Original Message- From: David Kincaid [mailto:kincaid.d...@gmail.com] Sent: Friday, December 19, 2014 9:02 AM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Thanks for this, Bruce! Very interesting work. It confirms what

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
:02 AM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Thanks for this, Bruce! Very interesting work. It confirms what I've seen in my small tests that I've done in a non-systematic way. Did you happen to capture the number of false positives yet (annotations made by cTAKES

RE: cTakes Annotation Comparison

2014-12-19 Thread Chen, Pei
To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Guergana, I'm curious about the number of records that are in your gold standard sets, or if your gold standard set was run through a long running cTAKES process. I know at some point we fixed a bug in the old dictionary lookup

Re: cTakes Annotation Comparison

2014-12-19 Thread Miller, Timothy
and the fast one were similar. Thank you everyone! --Guergana -Original Message- From: David Kincaid [mailto:kincaid.d...@gmail.com] Sent: Friday, December 19, 2014 9:02 AM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Thanks for this, Bruce

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
one were similar. Thank you everyone! --Guergana -Original Message- From: David Kincaid [mailto:kincaid.d...@gmail.com] Sent: Friday, December 19, 2014 9:02 AM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Thanks for this, Bruce

RE: cTakes Annotation Comparison

2014-12-19 Thread Savova, Guergana
@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Our analysis against the human adjudicated gold standard from this SHARE corpus is using a simple check to see if the cTakes output included the annotation specified by the gold standard. The initial results I reported were for exact matches
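
The exact-match check described in the snippet above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the analysis tool used in the thread: the annotation tuple layout (document id, begin offset, end offset, CUI), the function name exact_match_scores, and the sample CUI strings are all hypothetical. It counts a system annotation as correct only when it is identical to a gold annotation, then derives precision, recall and F1 from the counts.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical annotation layout: (document id, begin offset, end offset, CUI).
Annotation = Tuple[str, int, int, str]


class Scores(NamedTuple):
    true_pos: int    # annotations present in both gold and system output
    false_pos: int   # system annotations that are not in the gold standard
    false_neg: int   # gold annotations the system did not produce
    precision: float
    recall: float
    f1: float


def exact_match_scores(gold: Set[Annotation], system: Set[Annotation]) -> Scores:
    """Score system output against a gold standard using exact matches only."""
    tp = len(gold & system)
    fp = len(system - gold)
    fn = len(gold - system)
    # Guard against empty sets so the ratios never divide by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Scores(tp, fp, fn, precision, recall, f1)


if __name__ == "__main__":
    # Made-up placeholder data; these CUI strings are not real concepts.
    gold = {("d1", 0, 5, "CUI_A"), ("d1", 10, 15, "CUI_B"), ("d2", 3, 9, "CUI_C")}
    system = {("d1", 0, 5, "CUI_A"), ("d1", 10, 15, "CUI_X"),
              ("d2", 3, 9, "CUI_C"), ("d2", 20, 25, "CUI_D")}
    print(exact_match_scores(gold, system))
```

Because the comparison is set membership on the whole tuple, a span that matches with a different CUI counts as both a false positive and a false negative. A looser comparison (overlapping spans, or span-only matching) would need a different matching rule.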

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
cuis are added, removed, deprecated, and moved from one TUI to another. Sean -Original Message- From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu] Sent: Friday, December 19, 2014 1:28 PM To: dev@ctakes.apache.org Subject: RE: cTakes Annotation Comparison Several

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
- From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu] Sent: Friday, December 19, 2014 1:28 PM To: dev@ctakes.apache.org Subject: RE: cTakes Annotation Comparison Several thoughts: 1. The ShARE corpus annotates only mentions of type Diseases/Disorders and only Anatomical

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
I’m bringing it up in case the Human Annotations were done using a different version. From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 1:40 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Sean, I don't think that would be an issue

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
*Subject:* Re: cTakes Annotation Comparison Guergana, I'm curious about the number of records that are in your gold standard sets, or if your gold standard set was run through a long running cTAKES process. I know at some point we fixed a bug in the old dictionary lookup that caused

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
Ebert [mailto:kim.eb...@perfectsearchcorp.com] *Sent:* Friday, December 19, 2014 10:25 AM *To:* dev@ctakes.apache.org *Subject:* Re: cTakes Annotation Comparison Guergana, I'm curious about the number of records that are in your gold standard sets, or if your

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
:* Friday, December 19, 2014 1:47 PM *To:* Chen, Pei; dev@ctakes.apache.org *Subject:* Re: cTakes Annotation Comparison Pei, I don't think bugs/issues should be part of determining if one algorithm vs the other is superior. Obviously, it is worth mentioning the bugs, but if the fast lookup

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
that you'd only have two matches per document (100 docs?). Thanks, Sean -Original Message- From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 3:23 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Sean, I tried

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
on this? It is really bizarre that you'd only have two matches per document (100 docs?). Thanks, Sean -Original Message- From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 3:23 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Sean

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
horribly inaccurate. Thanks -Original Message- From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 3:29 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison Correction -- So far, I did steps 1 and 2 of Sean's email.

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
:* Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] *Sent:* Friday, December 19, 2014 3:37 PM *To:* dev@ctakes.apache.org *Subject:* Re: cTakes Annotation Comparison My original results were using a newly downloaded cTakes 3.2.1 with the separately downloaded resources copied

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
, December 19, 2014 3:37 PM *To:* dev@ctakes.apache.org *Subject:* Re: cTakes Annotation Comparison My original results were using a newly downloaded cTakes 3.2.1 with the separately downloaded resources copied in. There were no changes to any of the configuration files. As far as this last

RE: cTakes Annotation Comparison --- (^:

2014-12-19 Thread Finan, Sean
- From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Friday, December 19, 2014 5:05 PM To: dev@ctakes.apache.org Subject: Re: cTakes Annotation Comparison My apologies to Sean and everyone, I am happy to report that I found a bug in our analysis tools that was missing the last

Re: cTakes Annotation Comparison

2014-12-19 Thread Kim Ebert
:* Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] *Sent:* Friday, December 19, 2014 3:37 PM *To:* dev@ctakes.apache.org *Subject:* Re: cTakes Annotation Comparison My original results were using a newly downloaded cTakes 3.2.1 with the separately downloaded resources copied

Re: cTakes Annotation Comparison

2014-12-19 Thread Bruce Tietjen
:* Friday, December 19, 2014 3:37 PM *To:* dev@ctakes.apache.org *Subject:* Re: cTakes Annotation Comparison My original results were using a newly downloaded cTakes 3.2.1 with the separately downloaded resources copied in. There were no changes to any of the configuration files. As far

RE: cTakes Annotation Comparison

2014-12-18 Thread Chen, Pei
Bruce, Thanks for this-- very useful. Perhaps Sean Finan can comment more, but it's also probably worth comparing to an adjudicated human-annotated gold standard. --Pei -Original Message- From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] Sent: Thursday, December 18,

Re: cTakes Annotation Comparison

2014-12-18 Thread Bruce Tietjen
Actually, we are working on a similar tool to compare it to the human adjudicated standard for the set we tested against. I didn't mention it before because the tool isn't complete yet, but initial results for the set (excluding those marked as CUI-less) were as follows: Human adjudicated