Sean wrote the fast version and may be able to answer your specific questions. 
But in general, the fast dictionary does not match the regular dictionary's 
results exactly: it does not implement an equivalent search, and it uses 
different indexing methods. We are happy to receive reports of what seem like 
bugs, though; any new software is likely to have some. What I will say is that 
Sean has run some (as yet unpublished) experiments, and we believe that in the 
aggregate the new system's output is at least as high quality as the older 
one's.
Tim


________________________________________
From: Oranit Dror [[email protected]]
Sent: Sunday, June 21, 2015 4:37 AM
To: [email protected]
Subject: The fast dictionary pipeline vs. the regular one

Hello,

I am using cTAKES 3.2.2 with the regular pipeline. I recently tested the 
fast dictionary pipeline, and it is indeed much faster.
However, I have encountered several quality differences in the returned 
annotations. For example:


1. With the fast pipeline, the term "GBM" is annotated as "glioblastoma 
multiforme", while in the regular pipeline it is annotated as "glioblastoma".
Note that according to the UMLS DB, the concept of "GBM" is "glioblastoma" and 
"glioblastoma multiforme" is mapped to a narrower concept.


2. The word "cm" in a phrase like "5.5 cm X 2.6 cm" is annotated by the 
regular pipeline as "Cutaneous Mastocytosis", while in the fast pipeline it is 
not annotated as a medical term (as expected and as in UMLS).


Any explanation for the differences?

Thank you,
Oranit.


