Hi Matt,
I realized that I should have posted this on the developer site.
First of all, thanks for your followup a few weeks back. I hadn't been
subscribed to the developer list prior to your post so I didn't see it until
Pei mentioned it.
Not sure if you saw my response on the user list, but I wasn't able to get your
suggestion to work so I defaulted back to what I was trying to do with updating
the medfacts snapshot jar. After stepping through the code I see that cTAKES
was identifying the new cue term from the updated medfacts snapshot jar (I had
decompiled the jar, added the new term 'predictive of', then added the jar back to
cTAKES). Okay, so cTAKES did identify the new cue term but it's not getting
allocated to the 'possible' assertion type. Since last night I've looked at
several of the java files in the medfacts jar and quickly realized that UIMA is
not just the foundation of cTAKES, MITRE is too!
I would love to understand how a cue word (e.g. 'predictive of') gets
associated with an assertion type (e.g. 'possible'). I can't seem to figure
that out by looking at the code in the medfacts jar. I'd like to understand
how I can update it so my new cue word gets recognized as a 'possible'
assertion type. The existing words in the speculation file are getting the
correct 'possible' assertion type. Only the new cue term I added defaults to
'present'. This leads me to wonder if it's the cue.model that's doing the
assignment and has to be updated to recognize the new cue word. I'm hoping you
can elucidate further on what the cue.model file is and how it works. It
appears to be a binary of some type. What tools would be needed to update it?
Hmmm, the fact that it's a .model extension leads me to believe that it's
the result of some hefty machine learning that's contained in that cue.model
file.
Thanks so much.
Regards,
Paula
From: [email protected]
To: [email protected]
Subject: RE: cTakes: question on updating cue words
Date: Sun, 5 Jan 2014 21:38:55 -0500
Happy New Year cTAKES Community! Hopefully everyone's staying warm.
Okay, I did try Matt's suggestion from the developer site
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cced4dccb.126b0%[email protected]%3e
but unfortunately it didn't work so I just stepped through the code to see
what's going on with how cue words are being used in cTAKES. I verified that
the roughly 30 cue-word txt files in the medfacts snapshot jar are all being
loaded. Before I tried Matt's recent suggestion I had decompiled the medfacts
snapshot jar, added the new cue terms, and then added the jar back to cTAKES.
I thought the new term wasn't being recognized, but it is. I had updated
the speculation text file with 'predictive of' as a new cue term and while
stepping through I see that 'predictive of' was recognized as a cue term from
the speculation file.
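The check described above (confirming that the new term really sits in one of the cue-word text files inside the jar) can also be scripted instead of stepping through the code. What follows is only a minimal sketch, not part of cTAKES or medfacts. It assumes the cue files are UTF-8 text with one cue per line; the function name find_cue_term and the entry names in the demo archive are hypothetical. Python is used simply because a jar is a zip archive that the standard zipfile module can read.

```python
import io
import zipfile


def find_cue_term(jar, term):
    """Return the names of .txt entries in `jar` that list `term` on its own line.

    `jar` may be a path or a binary file-like object. The match ignores case
    and surrounding whitespace. One cue per line is an assumption.
    """
    wanted = term.strip().lower()
    hits = []
    with zipfile.ZipFile(jar) as archive:
        for name in archive.namelist():
            if not name.endswith(".txt"):
                continue  # skip binaries such as a .model file
            text = archive.read(name).decode("utf-8", errors="replace")
            cues = [line.strip().lower() for line in text.splitlines()]
            if wanted in cues:
                hits.append(name)
    return hits


if __name__ == "__main__":
    # Build a tiny in-memory archive with hypothetical entry names,
    # standing in for the real medfacts jar.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w") as z:
        z.writestr("cuefiles/speculation.txt", "improbable\npredictive of\n")
        z.writestr("cuefiles/negation.txt", "no\nnot\n")
    print(find_cue_term(demo, "predictive of"))
```

To run it against the real jar, pass the jar's path in place of the in-memory demo archive. Note that a hit only confirms the term is in a cue list; it says nothing about which assertion type the term ends up with.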
The problem is that it gets annotated as 'present' not as 'possible'. That's
why I thought the updated cue term wasn't being recognized. I did a quick test
using one of the existing terms (I used 'improbable') from the speculation
text file, the same file that contains the new cue term 'predictive of', and
sure enough it got annotated as 'possible'. That has me wondering about the
cue model and how it gets generated.
So to echo what Tim stated, what does the cue model do? What exactly is that
file and how can the contents be viewed and regenerated or updated? Something
clearly has to be updated so 'predictive of' gets annotated as 'possible'.
I'm so close to getting this resolved so I would appreciate any assistance.
Thanks.
Regards,
Paula
From: [email protected]
To: [email protected]
Subject: Re: cTakes: question on updating cue words
Date: Tue, 24 Dec 2013 14:19:46 +0000
Actually, I think Matt's suggestion is a bit out of date -- during development
we removed the dependency on the Lucene dictionary lookup, and the
under-development version now reads those psv files directly.
But this still doesn't help Paula since she's trying to run the current
release. I thought Matt or Pei might have some info about whether it's
possible to modify negation cue words for that release? For example, I can see
in the code that it uses a "cue model", which can be found in
ctakes-assertion-res, but it is a binary and I'm not sure what kind. Is there
any way to modify that file?
Tim
On 12/24/2013 12:09 AM, digital paula wrote:
Thank you, Pei. I believe I had signed up for the dev list right after Matt
posted so I didn't see his email. I will try it out.
Merry Christmas to you and everyone on the list. :-)
Regards,
Paula
Date: Mon, 23 Dec 2013 23:42:30 -0500
Subject: Re: cTakes: question on updating cue words
From: [email protected]
To: [email protected]
Paula,
Were you able to try Matt's suggestion on dev@?
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cced4dccb.126b0%[email protected]%3e
On Mon, Dec 23, 2013 at 11:57 AM, digital paula
<[email protected]> wrote:
Hello again cTAKES Community,
I think Tim's away for the holidays since I didn't see any response. Could
someone else assist? To reiterate, I'd like to manually update the cue words
for the polarity and uncertainty features. Please see below for details.
Thanks.
Regards,
Paula
From: [email protected]
To: [email protected]
Subject: RE: cTakes: question on updating cue words
Date: Thu, 19 Dec 2013 16:20:25 -0500
Hi Tim, I just realized that my manual cue word updates didn't take. :-(
I updated these two files from the med-facts.i2b21.2-SNAPSHOT.jar, then
rebundled and added back to cTAKES:
1. negation_cue_list.txt
2. certainty.txt
Is there another file that you know of that needs to be updated? The cue
folder in the jar contains only text files, so perhaps there's another text
file I need to update. Would you know which files those would be, or what other
updates need to be made?
Thanks.
Regards,
Paula
Date: Mon, 16 Dec 2013 16:12:29 -0500
From: [email protected]
To: [email protected]
Subject: Re: cTakes: question on updating cue words
Paula, I think to use the released version of ctakes you will have to do what
you proposed - modify the jar. The checked in files (*.psv) that you are
finding are for the under-development version.
Tim
On 12/16/2013 03:27 PM, digital paula wrote:
Hi Pei,
I don't consider this a bug, so I'm not sure why a Jira ticket is needed. I
just need to add 2 cue words and am wondering if I can do it manually.
Exploring a bit, I see there are .psv files in the Cue_Words folder of the
Assertion component. I updated them, but that doesn't seem to work. I also
added the cue words to the .txt file in the Semantic_Classes folder (in
Assertion as well), and that didn't work either.
One thing that I haven't tried is updating the cue words in the
org.mitre.medfacts.i2b2.cuefiles package for the jar file:
medfacts-i2b2-1.2.SNAPSHOT.jar. That would be a little more work since I need
to extract and rebuild jar file and add back to project.
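The extract, edit, and rebuild steps mentioned above can be done in one pass, since a jar is an ordinary zip archive. The following is a minimal sketch under stated assumptions, not the cTAKES or medfacts way of doing it: it assumes the cue file is UTF-8 text with one cue per line, and the function name add_cue_term and the entry names in the demo are hypothetical. It writes a new archive rather than editing the original in place.

```python
import io
import zipfile


def add_cue_term(src_jar, dst_jar, entry_name, term):
    """Copy src_jar to dst_jar, appending `term` as a new line to entry_name.

    Returns True if the term was appended, False if it was already listed.
    Raises KeyError if entry_name is not in the source archive.
    All other entries are copied unchanged.
    """
    added = False
    with zipfile.ZipFile(src_jar) as src:
        if entry_name not in src.namelist():
            raise KeyError(entry_name)
        with zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = src.read(info.filename)
                if info.filename == entry_name:
                    text = data.decode("utf-8")
                    if term not in text.splitlines():
                        # Make sure the new cue starts on its own line.
                        if text and not text.endswith("\n"):
                            text += "\n"
                        text += term + "\n"
                        added = True
                    data = text.encode("utf-8")
                dst.writestr(info, data)
    return added


if __name__ == "__main__":
    # In-memory stand-in for the real jar; entry names are hypothetical.
    original = io.BytesIO()
    with zipfile.ZipFile(original, "w") as z:
        z.writestr("cuefiles/speculation.txt", "improbable\n")
    updated = io.BytesIO()
    print(add_cue_term(original, updated, "cuefiles/speculation.txt", "predictive of"))
```

With real files, pass the path of the original jar and the path for the rebuilt jar, then swap the rebuilt jar into the project. This covers only the repackaging: as the January messages in this thread report, a term added this way was picked up as a cue but still came out as 'present' rather than 'possible'.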
I'm kind of on a huge deadline and hoping I can make these changes today so
hoping this doesn't require a lot of time to just add a couple cue words.
By the way, what is a .psv file?
Thanks.
Regards,
Paula
From: [email protected]
To: [email protected]
Subject: RE: cTakes: question on updating cue words
Date: Mon, 16 Dec 2013 19:31:45 +0000
[moved to dev@]
Hi Paula,
My suggestion would be to open a Jira item so that it could be tracked:
https://issues.apache.org/jira/browse/CTAKES (Feel free to create a new
account).
Even cooler if you could attach the affected files with the patch (diffs) and
any tests.
--Pei
From: digital paula [mailto:[email protected]]
Sent: Monday, December 16, 2013 1:30 PM
To: [email protected]
Subject: cTakes: question on updating cue words
Hello again cTAKES Community,
I would like to add additional cue words to polarity (for negation) and
uncertainty. I would really appreciate it if someone could let me know how I
can add additional cue words.
Thanks.
Regards,
Paula