Re: Labs annotator?

2017-03-29 Thread Kean Kaufmann
ove to see lab values in ctakes! Could you > please write a small summary of what it does? Maybe an example or two > could suffice. > > We can definitely put it into ctakes in release 4.1 - maybe next quarter? > > Cheers, > Sean > > -Original Message- > From

Re: 2016AB UMLS (ctakessnorx)

2017-03-14 Thread Kean Kaufmann
r a bit now, but I still can't figure > out how to upgrade the UMLS version to the most recent one. > If I create my own dictionary, cTAKES only returns UMLS > concepts and no SNOMED CT ones (I'm interested in those

Re: Filter CVD output?

2017-07-17 Thread Kean Kaufmann
Hi A.S., Does the "Show Selected Annotations" menu item serve your purposes? https://uima.apache.org/d/uimaj-current/tools.html#cvd.toolsMenu On Mon, Jul 17, 2017 at 4:31 AM, Lacey A.S. wrote: > Hi - I spend a lot of time showing doctors the output of cTakes via what

LVG questions

2017-07-14 Thread Kean Kaufmann
umentation, but it doesn't seem to be intended for developers. Can anyone point me to more detailed docs? And: Has anyone tried plugging in another stemmer? To play nicely with the ctakes-dictionary-lookup-fast annotators, it seems as if all it would have to do would be to populate canonicalForm. Happy Friday, and thanks for any help you can provide! Kean Kaufmann NLP Developer RecordsOne, Inc.

Re: cTakes doesn't identify certain words like "fell" in clinical notes

2017-07-14 Thread Kean Kaufmann
I'd think LVG would come up with "fall" as the canonicalForm of "fell" and "fallen", but apparently it doesn't. The only terms associated with C0085639 in my custom-built dictionary are: sql> select cui, tui, text, prefterm from cui_terms c join tui t on t.cui = > c.cui join prefterm p on p.cui

Re: cTAKES as a dependency

2017-05-01 Thread Kean Kaufmann
> > On Fri, Apr 28, 2017 at 9:53 PM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: > Hey Kean, > > It is great to know that your project is out there! > Hey Sean! Very kind of you. Speaking of which, our BizDevVeep would like to see RecordsOne listed under "Companies" on the "Users

Re: cTAKES 4.0.0 Release

2017-05-01 Thread Kean Kaufmann
ctakes.apache.org/downloads.cgi > > > > For further information, please visit the project website at > > http://ctakes.apache.org/ > > > > -- The Apache cTAKES Team > > > -- _ *Kean Kaufmann* NLP Developer RecordsOne

Re: Lab Value Finder dictionary

2017-12-19 Thread Kean Kaufmann
In my experience, the quick answer is: Certainly not all, but probably many. Different institutions will have different formats for lab reports, different panels they typically perform, and different labels. On Tue, Dec 19, 2017 at 8:21 AM, wrote: > Hi , > >

Re: Lab report

2017-11-13 Thread Kean Kaufmann
I wrote a lab annotator that will be checked into the trunk at some point. Source, unit tests and description attached to this issue: https://issues.apache.org/jira/browse/CTAKES-441 On Mon, Nov 13, 2017 at 6:09 AM, wrote: > Hi All, > > Can CTAKES process LAB

Re: false positive [EXTERNAL]

2017-10-25 Thread Kean Kaufmann
Sean, thanks! Blacklisting is essential, and making it category-specific is a really nice touch. Dispatch from the trenches, FWIW: a) The blacklist can get quite big, e.g. when mining common wordlists. To reduce bloat, might you allow comma-separated lists of semantic groups in the first

Re: CAS Visual Debugger - [EXTERNAL]

2017-10-25 Thread Kean Kaufmann
+1 I point it at an engine descriptor .xml file (using the command-line option -desc) that refers to the type system file, but that's a hack... On Wed, Oct 25, 2017 at 1:49 PM, Dligach, Dmitriy wrote: > +1 > > Also, I’d love to be able to point CVD to a directory containing

Re: Lab Value - Range finder

2018-02-09 Thread Kean Kaufmann
Hi Abilash, By design, the Lab Value annotator avoids ranges if possible: https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-core/src/main/java/org/apache/ctakes/core/ae/LabValueFinder.java // prefer non-range values, if any > value = candidateList.stream() >
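The quoted code is cut off in this snippet, so here is only a rough sketch of what "prefer non-range values, if any" could look like with a stream. It is not the LabValueFinder source (see the URL above for that). The `Candidate` class, the `choose` method, and the sample value "45.1" are all made up for illustration; the range "42-52" is the one from the log output quoted elsewhere in this thread.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a candidate value annotation; not a cTAKES type.
class Candidate {
    final String text;
    final boolean isRange;

    Candidate(String text, boolean isRange) {
        this.text = text;
        this.isRange = isRange;
    }
}

class PreferNonRangeSketch {
    /**
     * Return the first non-range candidate if there is one;
     * otherwise fall back to the first candidate of any kind.
     */
    static Optional<Candidate> choose(List<Candidate> candidates) {
        Optional<Candidate> nonRange = candidates.stream()
                .filter(c -> !c.isRange)
                .findFirst();
        return nonRange.isPresent() ? nonRange : candidates.stream().findFirst();
    }
}
```

With this shape, a range such as "42-52" is picked only when no plain value sits among the candidates, which would be consistent with a range being chosen in the log quoted in this thread.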

Re: UmlsOverlapLookupAnnotator + BsvRareWordDictionary: # tokens skipped varies? [EXTERNAL]

2018-03-07 Thread Kean Kaufmann
no idea why adding an entry would change the > behavior. I will have to look at the code and run your examples. By the > way, thank you for the explicit examples! > > >Is this expected behavior? > No, and thanks for letting me know about it. Can you create a Jira item > with th

Re: UmlsOverlapLookupAnnotator + BsvRareWordDictionary: # tokens skipped varies?

2018-03-07 Thread Kean Kaufmann
P.S. Extra config bit: I also removed "CD" from the exclusionTags in the UmlsOverlapLookupAnnotator. On Wed, Mar 7, 2018 at 10:58 AM, Kean Kaufmann <k...@recordsone.com> wrote: > Hi Sean, > > I'm perplexed. It seems as if the number of tokens that the > UmlsOver

UmlsOverlapLookupAnnotator + BsvRareWordDictionary: # tokens skipped varies?

2018-03-07 Thread Kean Kaufmann
Hi Sean, I'm perplexed. It seems as if the number of tokens that the UmlsOverlapLookupAnnotator will skip varies with the content of the RareWordDictionary. Here's my setup. I think I've included enough information to replicate my perplexity, if you have time/inclination to do that; let me know

Re: Query on LabValueFinder

2018-03-19 Thread Kean Kaufmann
Gandhi, at first blush, I can't replicate your result using the code I submitted... but my code and config differ from trunk, so Sean is probably the best person to ask. I included unit tests with a mini-dictionary for ProcedureMentions, but they probably didn't play nicely with the rest of the

Re: Lab Value - Range finder

2018-03-01 Thread Kean Kaufmann
currently available in CTAKES to do the same. > > 28 Feb 2018 15:04:04 INFO LabValueFinder - Set to value: > LabMention(349-352): HCT > > 28 Feb 2018 15:04:04 INFO LabValueFinder - Set to value: > RangeAnnotation(365-370): 42-52 > > 28 Feb 2018 15:04:04 INFO LabValueFin

Re: I think I found a bug.

2020-08-31 Thread Kean Kaufmann
Hi Peter, I believe I've encountered this too; I never got around to tracking it down to the root cause, and didn't have the civic-mindedness to report it as you have. Thanks! To shut it up I implemented a brutal brute-force workaround, enclosed for your possible amusement. But it occurred to

Re: Question about window size in term lookup

2020-08-24 Thread Kean Kaufmann
> > my question is whether there's a place where one can register specific two > character terms, for example BP or PT which will be found even with a > window size set to three. My brute-force approach is pretty brutal: Change the window size to two, annotate terms, then remove all two-letter
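The sentence above is truncated in this archive, so what comes after "remove all two-letter" is a guess. One plausible reading, sketched below, is that the cleanup pass drops every two-letter term that is not on a short allowlist. The class name, the `KEEP` set, and the use of plain strings instead of UIMA annotations are assumptions for illustration; BP and PT are the examples from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class TwoLetterFilterSketch {
    // Hypothetical allowlist of two-letter terms worth keeping.
    static final Set<String> KEEP = new HashSet<>(Arrays.asList("BP", "PT"));

    /**
     * Keep every term that is not exactly two characters long, plus any
     * two-character term whose upper-cased form is on the allowlist.
     */
    static List<String> filter(List<String> coveredTexts) {
        List<String> kept = new ArrayList<>();
        for (String text : coveredTexts) {
            boolean twoLetters = text.length() == 2;
            if (!twoLetters || KEEP.contains(text.toUpperCase(Locale.ROOT))) {
                kept.add(text);
            }
        }
        return kept;
    }
}
```

In a real pipeline the same test would presumably run over the annotations' covered text after lookup, removing the rejected annotations from the CAS rather than filtering a list of strings.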

Re: Disambiguation --alignment with SNOMED [EXTERNAL]

2020-12-03 Thread Kean Kaufmann
Peter says: the LabValueFinder. It has settings that allow it to clone procedures into > lab values or vice versa (I can't remember). The former... at least, when I contributed it. For potential lab values, it filters by TUIs: some procedures, others medications. Sean says: The only

Re: What to do about 4.0.0 and UMLS

2020-12-09 Thread Kean Kaufmann
> > 3. for 4.0.0 users that compile their own, provide a tar file containing > the sources plus instructions for modifying xml files and removing obsolete > Junit file. Is it worth a quick email poll of 4.0.0 users? +1 for Option 3! Thanks Peter (and everybody)... On Wed, Dec 9, 2020 at

Re: Lab Value Finder

2020-12-27 Thread Kean Kaufmann
> I've attached the Junit test based off your unit test and its debug output. You'll have to change the package name, though. Hi Peter -- Yes, that does sound weird. Not seeing an attachment. Send it along and I'll give troubleshooting a shot. Happy holidays, everybody. On Thu, Dec 24, 2020 at

Re: Passing SectionsBsv to piper containing BsvRegexSectionizer [EXTERNAL]

2021-02-02 Thread Kean Kaufmann
tory"; for outpatient radiology, "History" maps to "Reason for Exam". A lot of people in the community don't dream in java I do, sometimes... but then I wake up screaming. ;-) Kean Kaufmann Chief Architect - NLP RecordsOne, Inc. On Sat, Jan 30, 2021 at 10:01 AM Finan,

Re: rule-based lookup for custom lexicon [EXTERNAL] [SUSPICIOUS]

2021-05-19 Thread Kean Kaufmann
an apply ruta rules to their project. > > > > When I looked at it a few years ago it was for reason 2b. In the end we > went for different annotators like Peter and Kean outlined and just use > piper file changes to satisfy #2 as that is definitely much easier. > However, it doesn'

Re: rule-based lookup for custom lexicon [EXTERNAL] [SUSPICIOUS]

2021-05-19 Thread Kean Kaufmann
> yes, the line between "lookup" and rule execution is a little blurry sometimes. Sure is. I blur it with a set of annotators that extend dictionary annotations based on words or annotations covered by the same Chunk, e.g. DiseaseDisorderMention + /screen(ing)?/i = ProcedureMention
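A minimal sketch of the DiseaseDisorderMention + /screen(ing)?/i rule, assuming plain offset spans in place of the real UIMA/cTAKES types. `Span`, `ChunkRuleSketch`, and `extend` are hypothetical names, and the added word boundaries in the pattern are my own choice; this is not the annotator described above, only the general idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for an annotation: a type name plus character offsets.
class Span {
    final String type;
    final int begin;
    final int end;

    Span(String type, int begin, int end) {
        this.type = type;
        this.begin = begin;
        this.end = end;
    }
}

class ChunkRuleSketch {
    private static final Pattern SCREEN =
            Pattern.compile("\\bscreen(ing)?\\b", Pattern.CASE_INSENSITIVE);

    /**
     * If the chunk text contains "screen" or "screening", emit one
     * ProcedureMention per DiseaseDisorderMention covered by the chunk,
     * spanning both the mention and the first trigger word.
     */
    static List<Span> extend(String docText, Span chunk, List<Span> mentions) {
        List<Span> added = new ArrayList<>();
        // Restrict the search to the chunk's offsets within the document.
        Matcher m = SCREEN.matcher(docText).region(chunk.begin, chunk.end);
        if (!m.find()) {
            return added;
        }
        for (Span mention : mentions) {
            boolean covered = mention.begin >= chunk.begin && mention.end <= chunk.end;
            if (covered && mention.type.equals("DiseaseDisorderMention")) {
                added.add(new Span("ProcedureMention",
                        Math.min(mention.begin, m.start()),
                        Math.max(mention.end, m.end())));
            }
        }
        return added;
    }
}
```

For "colon cancer screening" in one chunk, with "colon cancer" already annotated as a DiseaseDisorderMention, this yields a ProcedureMention over the whole phrase.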

Re: Dictionary "bad" codes

2021-02-15 Thread Kean Kaufmann
FWIW, rather than editing the HSQLDB script, we use Sean's BsvRareWordDictionary to add phrases with a BSV file: cTakesHsql.xml: AddPhrases org.apache.ctakes.dictionary.lookup2.dictionary.BsvRareWordDictionary

Re: Dictionary "bad" codes

2021-02-15 Thread Kean Kaufmann
e the dictionary I created using the creator. > > Peter > > > > On Mon, Feb 15, 2021 at 4:16 PM Kean Kaufmann wrote: > > > FWIW, rather than editing the HSQLDB script, we use Sean's > > BsvRareWordDictionary to add phrases with a BSV file: >

Re: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

2022-06-06 Thread Kean Kaufmann
Is Git LFS an option? https://www.atlassian.com/git/tutorials/git-lfs#installing-git-lfs Needs an LFS-aware host e.g. Bitbucket; I don't know what the Apache hosting setup is like. On Fri, Jun 3, 2022 at 9:31 AM Finan, Sean wrote: > Hi Tim, > > >we ran into issues in previous attempts at

Re: Initial CTakes analysis

2023-08-12 Thread Kean Kaufmann
bles, etc. The cTAKES RegexSectionizer might work for you. https://ctakes.apache.org/apidocs/4.0.0/org/apache/ctakes/core/ae/RegexSectionizer.html _ *Kean Kaufmann* NLP Architect RecordsOne nSight Driven | *Priority. Clarity. Integrity. *

PREFTERMs not included in UMLS rare-word dictionary?

2023-12-06 Thread Kean Kaufmann
dictionary returns 175K+ rows; at first blush, 20% look legit. select cui,lcase(prefterm) as prefterm from tui t join prefterm p on p.cui=t.cui and t.tui in (19,20,33,34,37,40,41,42,43,44,45,46,47,48,49,50,56,57,184,190,191) except (select cui,text from cui_terms c where c.cui=cui); Thanks fo

Re: Discrepancy in cTAKES Identification of 'Chemotherapy' SNOMED Codes

2023-11-17 Thread Kean Kaufmann
UMLS CUI: C1298669) > Any resources I can use to help me with other similar questions? UMLS Metathesaurus Browser: https://uts.nlm.nih.gov/uts/umls/home Signup is free. _ *Kean Kaufmann* NLP Architect RecordsOne nSight Driven | *Priority.

Custom dictionary no-"no" [was: Re: PREFTERMs not included in UMLS rare-word dictionary?]

2024-04-16 Thread Kean Kaufmann
aw in the dictionary > creator tool. > > Time for a rebuild with the 5.0 release ... > > Thanks for the report, > > Sean > > > From: Kean Kaufmann > Sent: Wednesday, December 6, 2023 4:12 PM > To: dev@ctakes.apache.org > Subject: PRE