Re: Fw: Apache cTAKES 4.0.0.1 : UMLS Authentication Patch
This is missing critical items which constitute an apache release [1]. If it has already been dist'd, I would suggest retroactively correcting the missing steps as outlined in the release process. [1] http://www.apache.org/legal/release-policy.html#policy On Sat, Jan 23, 2021 at 10:39 AM Finan, Sean wrote: > > Hi all, > > > The "Downloads" page on ctakes.apache.org should now be working as expected. > > I have tested several mirror servers and they all appear to have the 4.0.0.1 > packaged downloads available. > > > Enjoy, > > Sean > > > From: Finan, Sean > Sent: Thursday, January 21, 2021 1:52 PM > To: dev@ctakes.apache.org; u...@ctakes.apache.org > Subject: Fw: Apache cTAKES 4.0.0.1 : UMLS Authentication Patch > > > Hi all, > > > I have been getting your emails and jira items informing me that the download > targets on the ctakes website have still not been populated. Thank you for > letting me know. > > ?I apologize for the inconvenience and as soon as I can I will work with the > Apache Infra team to see why we are having this problem. > > > As soon as I witness working links I will let you all know. > > > Thank you, > > Sean > > > From: Finan, Sean > Sent: Wednesday, January 20, 2021 10:24 AM > To: dev@ctakes.apache.org; u...@ctakes.apache.org > Subject: Apache cTAKES 4.0.0.1 : UMLS Authentication Patch > > > ???As some have experienced, the U.S.A. National Library of Medicine (NLM) > has changed the authentication method for using the Unified Medical Language > System (UMLS). > > https://www.nlm.nih.gov/research/umls/index.html > > > Though a bit late in its arrival, Apache cTAKES now has a patch release that > supports the new UMLS authentication method. > > > The release number is 4.0.0.1, an update of the previous release version > 4.0.0 with a single change to enable the new UMLS authentication. > > No other code or functionality has been modified and there are no > enhancements to the previous release 4.0.0 > > > There are instructions for use on the Apache cTAKES wiki. > > https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0.0.1 > > > The source code is available in the 4.0.0.1 tag Subversion (svn) repository. > > https://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0.1/ > > > The jar and pom files are available from maven central and any Applications > utilizing Apache cTAKES as an Apache Maven dependency should update their pom > files. > > https://search.maven.org/search?q=ctakes > > > At this time the Apache infra script that points mirror download servers to > the pre-built zip/archive files has not run. I hope that the mirror servers > are updated in a day or two. > > When the mirror servers are updated the buttons on the "Downloads" page of > ctakes.apache.org should trigger a download of the patch version. Until then > you will get a "page not found" error. > > Until the pre-built archive downloads are available through the website, you > can find them in the release repository. > > https://repository.apache.org/content/repositories/releases/org/apache/ctakes/ctakes-core/4.0.0.1/ > > > For more information please visit the wiki page on the Apache cTAKES 4.0.0.1 > patch release. > > https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0.0.1 > > > > A very special thanks goes to Peter Abramowitsch for conception and original > implementation of the authentication code and workflow. > > > Many thanks to those who boldly tested, documented and otherwise made this > patch and its trunk equivalent possible, including > > Kean Kaufmann > > Gandhi Rajan > > Eugenia Monogyiou > > Timothy Miller > > and anybody else that I have forgotten (apologies). > > > ?And for those of you gave gave me a bit of prodding to get this wrapped up > and published ... in the end I am grateful and you have done us all a service. > > > Cheers, > > Sean >
Re: Apache cTAKES 4.0.0.1 : UMLS Authentication Patch
- Link to public KEYS to verify the sigs? - Link to the VOTE results? - Anyone get a chance to download/test/verify the release candidate artifact in staging before dist'ing? Not sure if the release guide procedures changed, but it's fairly typical https://web.archive.org/web/20140701075131/http://ctakes.apache.org/ctakes-release-guide.html On Wed, Jan 20, 2021 at 10:25 AM Finan, Sean wrote: > > ???As some have experienced, the U.S.A. National Library of Medicine (NLM) > has changed the authentication method for using the Unified Medical Language > System (UMLS). > > https://www.nlm.nih.gov/research/umls/index.html > > > Though a bit late in its arrival, Apache cTAKES now has a patch release that > supports the new UMLS authentication method. > > > The release number is 4.0.0.1, an update of the previous release version > 4.0.0 with a single change to enable the new UMLS authentication. > > No other code or functionality has been modified and there are no > enhancements to the previous release 4.0.0 > > > There are instructions for use on the Apache cTAKES wiki. > > https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0.0.1 > > > The source code is available in the 4.0.0.1 tag Subversion (svn) repository. > > https://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0.1/ > > > The jar and pom files are available from maven central and any Applications > utilizing Apache cTAKES as an Apache Maven dependency should update their pom > files. > > https://search.maven.org/search?q=ctakes > > > At this time the Apache infra script that points mirror download servers to > the pre-built zip/archive files has not run. I hope that the mirror servers > are updated in a day or two. > > When the mirror servers are updated the buttons on the "Downloads" page of > ctakes.apache.org should trigger a download of the patch version. Until then > you will get a "page not found" error. > > Until the pre-built archive downloads are available through the website, you > can find them in the release repository. > > https://repository.apache.org/content/repositories/releases/org/apache/ctakes/ctakes-core/4.0.0.1/ > > > For more information please visit the wiki page on the Apache cTAKES 4.0.0.1 > patch release. > > https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0.0.1 > > > > A very special thanks goes to Peter Abramowitsch for conception and original > implementation of the authentication code and workflow. > > > Many thanks to those who boldly tested, documented and otherwise made this > patch and its trunk equivalent possible, including > > Kean Kaufmann > > Gandhi Rajan > > Eugenia Monogyiou > > Timothy Miller > > and anybody else that I have forgotten (apologies). > > > ?And for those of you gave gave me a bit of prodding to get this wrapped up > and published ... in the end I am grateful and you have done us all a service. > > > Cheers, > > Sean >
Re: Demos menu option on cTAKES homepage
Hi James, The demos were being upgraded to use 4.0.0 last night. They should be up and running now. Let me know if you encounter any issues. --Pei On Tue, Apr 25, 2017 at 10:59 PM, James Masanzwrote: > Pei and others with access to update http://healthnlp.github.io/examples/ > > Following the Get Started -> Demo menu on http://ctakes.apache.org/ leads > to a page with demos that aren't currently working. > Will you have a chance to fix those soon or should the Demo menu be removed > until they get fixed? > > -- James
[CANCEL] [VOTE] Re: Release Apache cTAKES 4.0.0 (rc2)
Cancelled- replaced by rc3. On Sat, Apr 15, 2017 at 10:05 PM, James Masanz <masanz.ja...@gmail.com> wrote: > re-posting my latest but with [VOTE] added to the subject for anyone > filtering on that. > > On Sat, Apr 15, 2017 at 10:02 PM, James Masanz <masanz.ja...@gmail.com> > wrote: > >> Hi everyone, >> >> - changing my vote to 0 for rc2. I'd prefer to have a new release >> candidate ASAP but if we don't get one soon I would prefer to have a 4.0 >> released as-is, and we can document its limitations and spin up a 4.0.1 >> with the fixes, if there are enough people content to release 4.0 as-is. >> I'd be happy to be release manager for a 4.0.1. >> >> -- James >> >> >> On Fri, Apr 14, 2017 at 8:04 PM, James Masanz <masanz.ja...@gmail.com> >> wrote: >> >>> >>> -1 from me for rc2 because of various issues found >>> old dictionary lookup didn't work in an IDE unless you manually >>> download the latest zip - pom files needed updating (checked into trunk >>> today) (more of the ctakesresources from sourceforge need to be put onto >>> maven central for ctakes to work as a maven dependency) >>> Sean fixed some issues today (I saw commit notices today) which I'd >>> like to see included in 4.0 before it's released >>> >>> -- James >>> >>> >>> On Wed, Apr 12, 2017 at 5:31 PM, Pei Chen <chen...@apache.org> wrote: >>> >>>> This is a call for a vote on releasing the following candidate (rc2) as >>>> Apache cTAKES 4.0.0. >>>> >>>> For more detailed information on the changes/release notes, please visit: >>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje >>>> ctId=12313621=12340211 >>>> >>>> The release was made using the cTAKES release process documented here: >>>> https://ctakes.apache.org/ctakes-release-guide.html >>>> >>>> The candidate is available at: >>>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-r >>>> c2/apache-ctakes-4.0.0-src.tar.gz >>>> /.zip >>>> <https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz/.zip> >>>> >>>> The tag to be voted on: >>>> http://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0-rc2 >>>> The MD5 checksum of the tarball can be found at: >>>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-r >>>> c2/apache-ctakes-4.0.0-src.tar.gz.md5 >>>> /.zip.md5 >>>> >>>> The signature of the tarball can be found at: >>>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-r >>>> c2/apache-ctakes-4.0.0-src.tar.gz.asc >>>> /.zip.asc >>>> >>>> Apache cTAKES' KEYS file, containing the PGP keys used to sign the >>>> release: >>>> https://dist.apache.org/repos/dist/dev/ctakes/KEYS >>>> >>>> Please vote on releasing these packages as Apache cTAKES 4.0.0. The vote >>>> is >>>> open for at least the next 72 hours. >>>> >>>> The vote passes if at least three binding +1 votes are cast. >>>> [ ] +1 Release the packages as Apache cTAKES 4.0.0 >>>> [ ] -1 Do not release the packages because... >>>> >>>> Also, the convenience binary can be found at: >>>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-r >>>> c2/apache-ctakes-4.0.0-bin.tar.gz >>>> /.zip >>>> <https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-bin.tar.gz/.zip> >>>> >>>> I've only tested the CVD. Sean/James- Can you test/verify your changes? >>>> >>>> Special thanks to all of those involved. >>>> -- >>>> >>> >>> >>
Re: Release Apache cTAKES 4.0.0 (rc2)
Guergana, Sean and James sent us a private message to request a rc3 to include most recent changes in trunk after rc2 was created. We are more than happy to create another release candidate. That was the reason that rc2 was veto'd and a rc3 was requested. The only differences between rc3 and rc2 are whatever minor changes went into trunk since Fri over the Easter and Patriots holiday weekend. You're more than welcome to create the rc yourself-- but I don't think it will make it any more efficient. I rarely see anyone threaten dates/deadlines upon other ASF volunteers. What gives? On Mon, Apr 17, 2017 at 9:53 AM, Savova, Guergana <guergana.sav...@childrens.harvard.edu> wrote: > Pei/Murali, > Let us know if you could cut release candidate 3 by Monday, April 17, 5 pm > ET. We would understand if you are very busy and unavailable to do so -- life > happens. Sean Finan and James Masanz volunteered to prepare rc3 if we do not > hear from you. > > Dear cTAKES community, > Thank you for your testing of rc2, your contributions are so valuable! RC3 > will be made available on Tuesday, April 18 or Wednesday, April 19 for > another round of testing and voting. > We all are looking forward to the v4 release! > > Cheers, > --Guergana > > -Original Message- > From: Savova, Guergana > Sent: Saturday, April 15, 2017 10:02 AM > To: 'dev@ctakes.apache.org' <dev@ctakes.apache.org> > Subject: RE: Release Apache cTAKES 4.0.0 (rc2) > > Not sure what is meant by "this week". Today, Sat, April 15 by 5 pm? > --Guergana > > Guergana Savova, PhD, FACMI > Associate Professor > PI Natural Language Processing Lab > Boston Children's Hospital and Harvard Medical School > 300 Longwood Avenue > Mailstop: BCH3092 > Enders 144.1 > Boston, MA 02115 > Tel: (617) 919-2972 > Fax: (617) 730-0817 > guergana.sav...@childrens.harvard.edu > Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv > http://ctakes.apache.org > http://thyme.healthnlp.org > http://cancer.healthnlp.org > http://share.healthnlp.org > http://center.healthnlp.org > > > -Original Message- > From: Pei Chen [mailto:pei.c...@wiredinformatics.com] > Sent: Saturday, April 15, 2017 9:13 AM > To: dev@ctakes.apache.org > Subject: Re: Release Apache cTAKES 4.0.0 (rc2) > > Let us recut 4.0.0 from trunk this week. I just saw a note from Sean that he > would like to integrate changes from trunk as well. > >Pei Chen > Wired Informatics > <https://urldefense.proofpoint.com/v2/url?u=http-3A__bit.ly_1pHmTcL=DwIBaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcrO4yRGmRCJNAr-rCmP=if2F9Ti4D02juzTUQoXtsUPoO5F3SufvTF70twXnRpc=hKX8Ff6KEsf5JpGL11G7PTETB_ZEFCtCGxoWs5U2JEA= > > > 265 Franklin St Ste 1702 > Boston, MA 02110 > tel: (617) 433-7544 > pei.c...@wiredinformatics.com > > On Fri, Apr 14, 2017 at 11:38 PM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: > >> > I'd rather not get into the definition of "basic", just like I'd >> > rather >> not discuss the definition of obvious with another mathematician. >> --> Lol. My wife can't stand it when I say "obviously". >> >> Fwiw, I think that cutting a new rc sooner rather than later is >> comparatively little work compared to the benefit for testers. It >> needs to be done anyway as what is in rc2 is not releasable. I don't >> want to vote >> -1 on the rc, but will if it is necessary to get an rc3 cut. >> >> Sean >> >> -Original Message- >> From: James Masanz [mailto:masanz.ja...@gmail.com] >> Sent: Friday, April 14, 2017 9:36 PM >> To: dev@ctakes.apache.org >> Subject: Re: Release Apache cTAKES 4.0.0 (rc2) >> >> these are all the fixes I plan to make. last I talked to Sean, he had >> all his changes in. I assume there will be more testing up until final >> vote, I certainly will be doing more testing and working more on >> documentation. But why not have people test on the latest now that we >> have fixed some issues that seem like showstoppers? I'd rather not >> get into the definition of "basic", just like I'd rather not discuss >> the definition of obvious with another mathematician. >> >> On Fri, Apr 14, 2017 at 8:23 PM, Pei Chen <chen...@apache.org> wrote: >> >> > James, >> > Happy to create another rc3, but can I suggest we bundle all of the >> > fixes before creating another candidate? Are there other remaining >> > items to test? This just seems like basic functionality? >> > >> > On Fri, Apr 14, 2017 at 8:04 PM, James Masanz &g
Re: Release Apache cTAKES 4.0.0 (rc2)
James, Happy to create another rc3, but can I suggest we bundle all of the fixes before creating another candidate? Are there other remaining items to test? This just seems like basic functionality? On Fri, Apr 14, 2017 at 8:04 PM, James Masanz <masanz.ja...@gmail.com> wrote: > -1 from me for rc2 because of various issues found > old dictionary lookup didn't work in an IDE unless you manually > download the latest zip - pom files needed updating (checked into trunk > today) (more of the ctakesresources from sourceforge need to be put onto > maven central for ctakes to work as a maven dependency) > Sean fixed some issues today (I saw commit notices today) which I'd > like to see included in 4.0 before it's released > > -- James > > > On Wed, Apr 12, 2017 at 5:31 PM, Pei Chen <chen...@apache.org> wrote: > >> This is a call for a vote on releasing the following candidate (rc2) as >> Apache cTAKES 4.0.0. >> >> For more detailed information on the changes/release notes, please visit: >> https://issues.apache.org/jira/secure/ReleaseNote.jspa? >> projectId=12313621=12340211 >> >> The release was made using the cTAKES release process documented here: >> https://ctakes.apache.org/ctakes-release-guide.html >> >> The candidate is available at: >> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0- >> rc2/apache-ctakes-4.0.0-src.tar.gz >> /.zip >> >> The tag to be voted on: >> http://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0-rc2 >> The MD5 checksum of the tarball can be found at: >> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0- >> rc2/apache-ctakes-4.0.0-src.tar.gz.md5 >> /.zip.md5 >> >> The signature of the tarball can be found at: >> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0- >> rc2/apache-ctakes-4.0.0-src.tar.gz.asc >> /.zip.asc >> >> Apache cTAKES' KEYS file, containing the PGP keys used to sign the release: >> https://dist.apache.org/repos/dist/dev/ctakes/KEYS >> >> Please vote on releasing these packages as Apache cTAKES 4.0.0. The vote is >> open for at least the next 72 hours. >> >> The vote passes if at least three binding +1 votes are cast. >> [ ] +1 Release the packages as Apache cTAKES 4.0.0 >> [ ] -1 Do not release the packages because... >> >> Also, the convenience binary can be found at: >> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0- >> rc2/apache-ctakes-4.0.0-bin.tar.gz >> /.zip >> >> I've only tested the CVD. Sean/James- Can you test/verify your changes? >> >> Special thanks to all of those involved. >> -- >>
Release Apache cTAKES 4.0.0 (rc2)
This is a call for a vote on releasing the following candidate (rc2) as Apache cTAKES 4.0.0. For more detailed information on the changes/release notes, please visit: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621=12340211 The release was made using the cTAKES release process documented here: https://ctakes.apache.org/ctakes-release-guide.html The candidate is available at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz /.zip The tag to be voted on: http://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0-rc2 The MD5 checksum of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz.md5 /.zip.md5 The signature of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz.asc /.zip.asc Apache cTAKES' KEYS file, containing the PGP keys used to sign the release: https://dist.apache.org/repos/dist/dev/ctakes/KEYS Please vote on releasing these packages as Apache cTAKES 4.0.0. The vote is open for at least the next 72 hours. The vote passes if at least three binding +1 votes are cast. [ ] +1 Release the packages as Apache cTAKES 4.0.0 [ ] -1 Do not release the packages because... Also, the convenience binary can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-bin.tar.gz /.zip I've only tested the CVD. Sean/James- Can you test/verify your changes? Special thanks to all of those involved. --
Re: Labs annotator?
Kean, This would be really useful. If you would like make a contribution, could you please open a Jira and attach the patch or code? When you submit a patch via jira/attachment, it has legal verberage about donating the code, etc. --Pei On Wed, Mar 29, 2017 at 9:30 AM, Finan, Seanwrote: > Fantastic! > > I would really like to work with you to get this into ctakes 4.1. Let me > know how you would like to proceed. Would you like to send me or another > committer the code or have somebody review it remotely? The "tweaks" may be > something useful to ctakes, but if not I'm sure that we can create a decent > interfacing. > > Cheers, > Sean > > -Original Message- > From: Kean Kaufmann [mailto:k...@recordsone.com] > Sent: Wednesday, March 29, 2017 7:59 AM > To: dev@ctakes.apache.org > Subject: Re: Labs annotator? > >> >> I'm sure that people would love to see lab values in ctakes! Could >> you please write a small summary of what it does? Maybe an example or >> two could suffice. > > > Hi Sean, > > The labs annotator identifies likely lab phrases by TUI (T059 et al.), and > relates them to the nearest following number-ish value -- NumToken, > FractionAnnotation, MeasurementAnnotation or (as a last resort) > RangeAnnotation -- that isn't part of a Date or TimeAnnotation. > A whitelist of lab-value words can also be specified, e.g. "positive", > "negative", "normal", "elevated", "decreased", ... > > For example, > > Weight / BMI: Recent weight (as of 05/05/16) is >> 45.36 kg (100 lb) > > > yields > > "weight" -> "45.36 kg" > > and > > HEPATIC FUNCTION PANEL >> Result Value Ref Range >> Albumin 2.2 (*) 3.7 - 5.1 g/dL >> Total Protein 5.5 (*) 5.8 - 8.0 g/dL >> Alkaline Phosphatase 844 (*) 42 - 121 IU/L ... > > > yields > > "Albumin" -> "2.2" > "Protein" -> "5.5" > "Alkaline Phosphatase" -> "844" > > (without trying to fill in the units or referenceRangeNarrative values). > > Configuration parameters: > * ids of segments to annotate > * TUIs indicating labs - I use T059, T060 and T121 > * CUIs too general to be useful, e.g. C1443182, "Calculated (procedure)" > * Whitelist of words allowed as lab values > * Maximum number of newlines permitted between lab and value (0 = must be on > same line) > > I'd need to check in with you to make sure it plays nicely with the cTAKES > type system; we've tweaked ours a bit. > > Best, > -kk > > > On Tue, Mar 28, 2017 at 11:45 AM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: > >> Hi Kean, >> >> I'm sure that people would love to see lab values in ctakes! Could >> you please write a small summary of what it does? Maybe an example or >> two could suffice. >> >> We can definitely put it into ctakes in release 4.1 - maybe next quarter? >> >> Cheers, >> Sean >> >> -Original Message- >> From: Kean Kaufmann [mailto:k...@recordsone.com] >> Sent: Tuesday, March 28, 2017 11:34 AM >> To: dev@ctakes.apache.org >> Subject: Labs annotator? >> >> On Tue, Mar 28, 2017 at 11:23 AM, Finan, Sean < >> sean.fi...@childrens.harvard.edu> wrote: >> >> > >> > If anybody out there has something that they would like to >> > contribute to ctakes, please do! >> > >> >> I recently wrote an annotator for lab values. There was some >> discussion of this on the dev list a couple of years ago; did anything come >> of it? >> Happy to contribute if it's helpful. >> >> -- >> _ >> *Kean Kaufmann* >> NLP Developer >> >> RecordsOne >> nSight Driven | *Priority. Clarity. Integrity. * >> >> *mobile* | >> 240-401-6131 >> >> *Twitter: **@R1_RecordsOne* >> *See us in Vegas @ ACDIS 2017 * >> *See us in Los Angeles @ AHIMA 2017* >> >> >> --- >> *Confidentiality Notice: *This email, including any attachments is >> the property of RecordsOne, LLC and is intended for the sole use of >> the intended recipient(s). It may contain information that is >> privileged and confidential. Any unauthorized review, use, disclosure, >> or distribution is prohibited. If you are not the intended recipient, >> please reply to the sender that you have received the message in >> error, then delete this message. >> >> --- >> *Mailing*: 10641 Airport Pulling Road, Suite 30 | Naples, FL 34109 >> *Main*: 239.451.6112 >> >> *Please consider the environmental impact before printing this email. >> * >>
Re: (Re)introduce myself - James Masanz
Welcome back James! Good to hear from you again. Out of respect for the others in the community who already volunteered to be RM, I do not see an need for BCH to override existing volunteers. Unless they unable or unwilling. Would you/others agree? Sent from my iPhone > On Jan 27, 2017, at 2:23 PM, James Masanzwrote: > > Hi, > > I'm James Masanz -- if you've been on the dev list for more than a couple > years, you might recognize my name from my previous contributions to > cTAKES, which include having been a release manager. > > I've joined the Boston Children's Hospital NLP team. I will be devoting > significant energy to the next release of cTAKES, and I volunteer to be the > release manager for it. > > My initial thoughts are that we could make the "fast dictionary lookup" be > the default dictionary lookup, incorporate the dictionary GUI from sandbox, > and call the release 4.0. I'm also interested in migrating the Wiki away > from Confluence to Apache's moin-moin instance. I'm sure there are other > things to include in the next release as well. > > You'll be hearing more from me over the next few weeks as I review the list > of issues in Jira and get caught up with details of what's been going on > while I was less active here. > > One thing I would like to track for release candidates would be a list of > what is tested on which platforms, which could be as simple as a post with > a matrix of src/bin/other vs. linux/windows/mac, and making sure we have at > least one volunteer to test the install and run of a pipeline for each > entry in the table. Future releases might expand on that to include > tracking multiple pipelines across environments, etc. > > I'm happy I'm returning to being active in the cTAKES community! > > -- James
Re: Infrastructures questions.
I can recreate what Tim suggested. 1) Comment out LVG from the dependency parser test/pipeline 2) The same thing would need to be done in the Regression Test (comment out LVG in the test) 3) Update the regression test suite newly generated output -> expected I'll commit the changes. Basically with the above, junit tests won't run into the URI hierarchal issue when the "mvn package" command is issued. A bit of background- I remember digging deeper into this some time ago: Even if we fixed our impl/LVG wrapper, there are hard coded references inside LVG itself that uses file:// instead of ResourceAsStream forcing resources to be unpacked. This doesn't jive well during the package phase of the maven build cycle where it references your .m2 repo. mvn test works fine because it will reference the explicit local unpacked lvg resources. Pei Chen Wired Informatics <http://bit.ly/1pHmTcL> 265 Franklin St Ste 1702 Boston, MA 02110 tel: (617) 433-7544 pei.c...@wiredinformatics.com On Wed, Dec 14, 2016 at 9:57 AM, Miller, Timothy < timothy.mil...@childrens.harvard.edu> wrote: > Dependency tests pass with my change; new test error in regression test > module that I'm not familiar with and error type I've never seen before > -- reaching out for help debugging: > > > > Exception in thread "BaseCPMImpl-Thread" > > junit.framework.AssertionFailedError: > Verifying Test Output: testpatient_plaintext_2.txt. > xmlorg.custommonkey.xmlunit.Diff > > [different] Expected number of element attributes '7' but was '6' - > comparing at > /CAS[1]/org.apache.ctakes.typesystem.type.syntax.NewlineToken[1] to > at > /CAS[1]/org.apache.ctakes.typesystem.type.syntax.NewlineToken[1] > > > > at junit.framework.Assert.fail(Assert.java:50) > > at junit.framework.Assert.assertTrue(Assert.java:20) > > at org.apache.ctakes.regression.test.RegressionPipelineTest. > compareXMLOutput(RegressionPipelineTest.java:147) > > at org.apache.ctakes.regression.test.RegressionPipelineTest$ > StatusCallbackListenerImpl.collectionProcessComplete( > RegressionPipelineTest.java:200) > > at org.apache.uima.collection.impl.cpm.BaseCPMImpl.run( > BaseCPMImpl.java:538) > > at java.lang.Thread.run(Thread.java:745) > > > Thanks > Tim > > On Tue, 2016-12-13 at 16:15 +, Miller, Timothy wrote: > > Quick followup - the test passes in eclipse, both with and without LVG > > enabled. Can someone try to replicate at the command line and see if mvn > > package works with LVG commented out? This is line 130 in > > WriteClearNLPDescriptors.java. Otherwise I can try this afternoon. > > Tim > > > > On Tue, 2016-12-13 at 15:57 +, Miller, Timothy wrote: > > > Pretty sure this particular issue is caused by LVG being part of the > > > test pipeline and the "URI is not hierarchical" bug from not having its > > > files unpacked from the jar. A simple fix is to disable that test in > > > code; a slightly more complex fix is to run the test with a modified > > > pipeline that doesn't include LVG. > > > Tim > > > > > > > > > On Tue, 2016-12-13 at 10:51 -0500, Pei Chen wrote: > > > > That's right. mvn compile and test should work fine. The benign test > > > > failed error from junit tests is coming from install/package; it's > > > > been there since the beginning of time [1]. It would be a nice to > > > > have and remove the benign warning messages. If a proposed critical > > > > patch release passes the regression tests, doesn't break any existing > > > > behavior, enhances the project, and we have volunteers for RM, I do > > > > not see these superious reasons as valid to block releases and keep > > > > things moving along. > > > > Sean: it would great if you can open a Jira and apply the patch; we > > > > can always cut another release next time- I'll be happy to be RM for > > > > that one whenever you feel it's' ready. > > > > > > > > [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__ > markmail.org_search_-3Fq-3Dctakes-2520mvn-2520package- > 2520-2DDskipTests-23query-3Actakes-2520mvn-2520package- > 2520-2DDskipTests-2Bpage-3A1-2Bmid-3Aoxgrkslhhjimpv4k- > 2Bstate-3Aresults=DgIFaQ=qS4goWBT7poplM69zy_ > 3xhKwEW14JZMSdioCoppxeFU=Heup-IbsIg9Q1TPOylpP9FE4GTK- > OqdTDRRNQXipowRLRjx0ibQrHEo8uYx6674h=xkJKj22zARpX6Nb06fIYl84- > gdaEmosSya1Wa40jup4=HIc4d0eWT6Wv0UY2Ytxm_oq5c-sUzay1SSq7XE4rDtE= > > > > > > > > On Tue, Dec 13, 2016 at 9:19 AM, Andrey Kurdumov > > > > <kant2...@googlemail.com> wrote: > > > > > NP f
Re: Infrastructures questions.
What release are you referring to? On Tue, Dec 13, 2016 at 11:08 AM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote: > By the way, did we ever vote on the release? > http://www.apache.org/dev/release.html#approving-a-release > > > -Original Message- > From: Pei Chen [mailto:chen...@apache.org] > Sent: Tuesday, December 13, 2016 10:51 AM > To: dev@ctakes.apache.org > Subject: Re: Infrastructures questions. > > That's right. mvn compile and test should work fine. The benign test failed > error from junit tests is coming from install/package; it's been there since > the beginning of time [1]. It would be a nice to have and remove the benign > warning messages. If a proposed critical patch release passes the regression > tests, doesn't break any existing behavior, enhances the project, and we have > volunteers for RM, I do not see these superious reasons as valid to block > releases and keep things moving along. > Sean: it would great if you can open a Jira and apply the patch; we can > always cut another release next time- I'll be happy to be RM for that one > whenever you feel it's' ready. > > [1] > https://urldefense.proofpoint.com/v2/url?u=http-3A__markmail.org_search_-3Fq-3Dctakes-2520mvn-2520package-2520-2DDskipTests-23query-3Actakes-2520mvn-2520package-2520-2DDskipTests-2Bpage-3A1-2Bmid-3Aoxgrkslhhjimpv4k-2Bstate-3Aresults=DgIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao=LL_nmgQ_ea-8hW5p-lXaiDLX2zp58A5ZuDimQJunDQ0=XZK-oW6nrKdzKmgZCnIW_zO-b-vqwKSaVWKzYMFIP6g= > > On Tue, Dec 13, 2016 at 9:19 AM, Andrey Kurdumov <kant2...@googlemail.com> > wrote: >> NP for broken build. Finally I manage to run it, so I just report >> issue so other don't have have to go through hoops like me. >> >> I just want to made small correction - mvn compile works. mvn test >> works too, but mvn package require -DskipTests. >> The problem with build is somehow related to how Maven package stuff, >> I suspect. >> >> Packaging failed for me at "Apache cTAKES Dependency Parser >> FAILURE", I also attach report from Surefire with >> error. >> >> I will try to figure out why is that error happens, but it could take >> a while until I understand how Maven works. >> Thanks for prompt response! >> >> Also I start looking how cTakes working, and investigate dependencies >> between packages, and found following comment: "Temporary workaround: >> Adding in the system scoped libraries. Remove these once they are in Maven >> Central" >> in the ctakes-distribution\src\main\assembly\bin.xml . These comment >> related to dependencies which checked in in the source code, but for >> me seems to be that they are now on MAven Central See >> (https://urldefense.proofpoint.com/v2/url?u=https-3A__mvnrepository.com_artifact_net.sf.mastif_mastif-2Dzoner=DgIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao=LL_nmgQ_ea-8hW5p-lXaiDLX2zp58A5ZuDimQJunDQ0=DwFSPNSn27AvMjRSNXGHueqTsyc_T5Aetds4ipSzuYo= >> ). >> I saw issue >> >> CTAKES-185 >> >> which could be appropriate for that, and I could create patch for that >> change. During the course of my next project, very likely I would be >> involved in the activities similar to cTakes, so I potentially could >> contribute something back, so I try to familiarize myself with the project. >> >> >> >> 2016-12-13 19:30 GMT+06:00 Finan, Sean <sean.fi...@childrens.harvard.edu>: >>> >>> Hi Andrey, >>> >>> The requirement of skipping tests for a successful build is something >>> that all ctakes developers have stumbled across, but after initial >>> setup we all forget about it and it has never been handled. Apologies. >>> >>> The github mirror is something that would be great to have, but >>> getting it up has been a nightmare. The problem is that historically >>> we have had binary files that are larger than the 100MB limit enforced by >>> github. >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__help.github.com_ >>> articles_working-2Dwith-2Dlarge-2Dfiles_=DgIFaQ=qS4goWBT7poplM69z >>> y_3xhKwEW14JZMSdioCoppxeFU=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4g >>> Tao=LL_nmgQ_ea-8hW5p-lXaiDLX2zp58A5ZuDimQJunDQ0=rdkXOpLdPvaAw15Zv >>> gsqVztehD5Bc7SSClGDl5KTcMs= This causes github to reject the >>> creation of the repository. >>> >>> I do think that, should somebody feel like putting in the effort, we >>> could work with apache infra and get a working solution ... possibly
Re: cTAKES - 3.2.3 release
t;] - Fast Dictionary should be able to load custom codifications from db [CTAKES-382 <https://issues.apache.org/jira/browse/CTAKES-382>] - Add ability to easily add extension of UmlsConcept Type to jcas via dictionary lookup Task [CTAKES-74 <https://issues.apache.org/jira/browse/CTAKES-74>] - Tokenizer PennTreeBank breaks with certain apostrophes in tokens. [CTAKES-138 <https://issues.apache.org/jira/browse/CTAKES-138>] - Remove 3rd party jars from our SVN [CTAKES-232 <https://issues.apache.org/jira/browse/CTAKES-232>] - change concept type <>On Dec 6, 2016, at 11:20 AM, Jeff Headley <jeffun...@gmail.com> wrote: > > I realize I’m not a committer and maybe I shouldn’t express an opinion. > Apologies in advance if this is inappropriate. However as someone who has > gone through the pain of trying to install, learn, and use ctakes; I strongly > agree with Sean. I don’t inject myself into the situation lightly or to > “vent”. I have been in software development since 1996 and a lot of that time > in medical projects and using various open source frameworks like Spring, > Seam, Hibernate, etc. Sean is right. > > Jeff > > On Dec 6, 2016, 10:00 AM -0500, Pei Chen <pei.c...@wiredinformatics.com>, > wrote: >> Considering the amount of time since the release was created, we should not >> let any pending Jira’s or features hold up a release. >> I suggest just we mark anything that hasn’t been fixed in Jira into the next >> release and push forward- I’ll volunteer to do that right now. >> In the past, the documentation on the website also shouldn’t hold up a >> release either. >> >>> On Dec 6, 2016, at 9:20 AM, Finan, Sean <sean.fi...@childrens.harvard.edu> >>> wrote: >>> >>> Hi Murali, >>> >>> Before we make an rc, we must go through the list of currently open tars >>> and requests. SOP. A list needs to be compiled of what should be closed as >>> fixed or n/a plus another list of outstanding bugs that need to be dealt >>> with and an estimate of effort. Then we should try to gather volunteers to >>> handle said bugs. Can you take care of compiling those lists? I did this >>> many months ago when rc 3.2.3 came up, and there were items on which no >>> movement was made. If you can find my email that might be one place to >>> start. >>> >>> The primary takeaways from the hackathon were, not surprisingly: >>> 1. Installation of cTAKES is not as straightforward as we believe, and >>> 2. Getting started with cTAKES is extremely difficult (no good starting >>> point) and scares off a large percentage of people who try. >>> 3. Customization is next to impossible without diving into the code, which >>> is more time consuming than anyone can stand. >>> >>> All can be handled best by short and simple GUI tools and some "cTAKES for >>> Beginners" documentation. We have some documentation that was used for the >>> Hackathon that needs to be modified a bit, then posted on the main cTAKES >>> website. >>> >>> In my opinion these items should be worked upon before creating another >>> release, otherwise the release is not as useful as it could be. I have >>> started work on a simple pipeline builder gui that creates simple html or >>> text output. I will check it into trunk soon, but as new functionality >>> community testing will be required before a release. >>> >>> Sean >>> >>> -Original Message- >>> From: Murali Minnah [mailto:mmin...@gmail.com] >>> Sent: Monday, December 05, 2016 1:26 PM >>> To: dev@ctakes.apache.org >>> Subject: cTAKES - 3.2.3 release >>> >>> I wanted to check to see if there are objections to creating a 3.2.3 tag of >>> trunk now to prepare for a 3.2.3-rc1? >>> >>> Any comments from the participants/organizers on the success/lessons learnt >>> from the "hackathon" that the community can benefit from? >>> >>> Best, >>> Murali >>
Re: cTAKES - 3.2.3 release
Considering the amount of time since the release was created, we should not let any pending Jira’s or features hold up a release. I suggest just we mark anything that hasn’t been fixed in Jira into the next release and push forward- I’ll volunteer to do that right now. In the past, the documentation on the website also shouldn’t hold up a release either. > On Dec 6, 2016, at 9:20 AM, Finan, Sean> wrote: > > Hi Murali, > > Before we make an rc, we must go through the list of currently open tars and > requests. SOP. A list needs to be compiled of what should be closed as > fixed or n/a plus another list of outstanding bugs that need to be dealt with > and an estimate of effort. Then we should try to gather volunteers to handle > said bugs. Can you take care of compiling those lists? I did this many > months ago when rc 3.2.3 came up, and there were items on which no movement > was made. If you can find my email that might be one place to start. > > The primary takeaways from the hackathon were, not surprisingly: > 1. Installation of cTAKES is not as straightforward as we believe, and > 2. Getting started with cTAKES is extremely difficult (no good starting > point) and scares off a large percentage of people who try. > 3. Customization is next to impossible without diving into the code, which > is more time consuming than anyone can stand. > > All can be handled best by short and simple GUI tools and some "cTAKES for > Beginners" documentation. We have some documentation that was used for the > Hackathon that needs to be modified a bit, then posted on the main cTAKES > website. > > In my opinion these items should be worked upon before creating another > release, otherwise the release is not as useful as it could be. I have > started work on a simple pipeline builder gui that creates simple html or > text output. I will check it into trunk soon, but as new functionality > community testing will be required before a release. > > Sean > > -Original Message- > From: Murali Minnah [mailto:mmin...@gmail.com] > Sent: Monday, December 05, 2016 1:26 PM > To: dev@ctakes.apache.org > Subject: cTAKES - 3.2.3 release > > I wanted to check to see if there are objections to creating a 3.2.3 tag of > trunk now to prepare for a 3.2.3-rc1? > > Any comments from the participants/organizers on the success/lessons learnt > from the "hackathon" that the community can benefit from? > > Best, > Murali
Re: Migration of ctakes-vm
Hi Freddy, I left some comments directly on the Jira earlier... I don't believe that VM was actually used. It may be easier to just decom it and we can re open a request a new one later on in the future. --Pei On Mon, Nov 28, 2016 at 12:43 AM, Freddy Barboza Oviedowrote: > > > On 2016-11-23 08:56 (-0600), "Daniel Takamori" wrote: >> Greetings from Infra, >> We are in the process of moving old VMs off our VMWare cluster and into The >> Cloud. We have a ticket for the process >> https://issues.apache.org/jira/browse/INFRA-12894 which Freddy Barboza our >> new Infra staffer will be working on. >> If we could get someone familiar with the VM to take a look and sketch out >> what needs to be migrated, we can setup a new VM using Puppet 3 >> https://git-wip-us.apache.org/repos/asf?p=infrastructure-puppet.git >> What data to keep, which services to migrate, which package dependencies are >> needed as well as what (if anything) needs to be exposed publicly are things >> we need to know. >> >> Thanks for your cooperation! >> -Pono on behalf of Infra >> > Hi guys, > Can you please provide us an update on this email. > We need this information from you in order to keep working on this. > > Please, let us know. > Thanks, > -Freddy - On behalf of Infra.
Re: [DISCUSS] Hadi Amiri as Apache cTAKES committer
[-dev@, + private@] Moved to the private list as this is typically better suited for personnel matters, voting in committers, pmc, security-related issues. Sounds like a good addition to grow the group, but what has Hadi contributed to Apache so far? Wait until there have been some contributions in Jira first? --Pei On Tue, Sep 27, 2016 at 2:51 PM, Finan, Seanwrote: > Hadi is a new member of the NLP group here at Boston Children's Hospital. He > has a background in NLP research and will now be applying his knowledge to > the biomedical domain, and he will be using cTAKES (and why wouldn't he?) > > Sean
Re: Karma for Jira
Done. Also dded the new committers- Lewis, Peter, and Azad as well. —Pei > On Jun 10, 2016, at 2:11 AM, Richard Eckart de Castilho> wrote: > > Hi all, > > could somebody please add me to the cTAKES project in Jira > such that I can assign issues to myself? > > Best, > > -- Richard signature.asc Description: Message signed with OpenPGP using GPGMail
Re: headword field in identifiedannotations
I don't see any issues with adding the additional optional attribute... I think we already did the same for other items like relations for similar reasons. The only catch is probably that the dependency will need the dictionary lookup to be run first (assuming that the logic will be added to the DP to iterate through all NE's in the CAS) if they want to use that attribute. On Thu, Jun 9, 2016 at 5:13 PM, Miller, Timothywrote: > How do people feel about modifying the typesystem? I'm finding that > grabbing the dependency headword is something very useful for feature > extraction. But it is a bottleneck if every feature extractor that uses > it has to recompute it. So I propose adding a field to the > IdentifiedAnnotation type of "headNode" with type ConllDependencyNode. > > Any thoughts or good reasons to avoid this? > > Thanks > Tim >
Welcome Lewis John McGibbney as a cTAKES committer
The Apache cTAKES PMC is pleased to introduce Lewis John McGibbney as a new committer. We are very happy with the sustained growth of the project and look forward to continued contributions from the community and adding to the ranks of the cTAKES committers. --Pei
Re: cTAKES dirty on checkout
+1 I think there was already a Jira to remove the Eclipse specific settings; or at least make it automatically derived from the pom.xml’s. —Pei > On May 13, 2016, at 11:48 AM, Richard Eckart de Castilho> wrote: > > Hi all, > > when checking out the sources of cTAKES from SVN with Eclipse, most of the > projects are dirty because the Eclipse settings (.classpath and > jdt.core.prefs) are in the SVN. The particular difference is that on my > machine, the projects are configured to use a Java 8, while in SVN, it is > configured to be a Java 7. > > The parent POM of cTAKES states Java 8 > > 1.8 > 1.8 > > Since the Eclipse files in SVN are at least outdated, maybe it would be a > good idea to drop the .classpath and jdt prefs files from SVN and prevent > them from being committed? > > Cheers, > > -- Richard signature.asc Description: Message signed with OpenPGP using GPGMail
Re: ctakes uimafit analysis engine resource initialization errors
Also, check that the liblinear dependency is in your pom.xml (it should already be included in ctakes-assertion/pom.xml). org.cleartk cleartk-ml-liblinear On Tue, Mar 1, 2016 at 7:53 AM, Miller, Timothywrote: > Hi Jay, > I've never seen that one before -- sounds like you're looking in the right > place. The first thing I would try is to manually delete the > cleartk-ml-liblinear folder in your .m2 directory and then do a mvn project > update (from eclipse) or mvn clean compile (from cmd line) in case there was > an issue with the downloaded jar. But that is kind of grasping at straws -- > hopefully someone else will have some other things to try. > > Tim > > From: Jay Urbain > Sent: Tuesday, March 1, 2016 7:21 AM > To: dev@ctakes.apache.org > Subject: ctakes uimafit analysis engine resource initialization errors > > I'm trying to run the AggregatePlaintextUMLSProcessor AE in Eclipse. > - ctakes 3.2.3-SNAPSHOT > > I'm getting ctakes uimafit analysis engine resource initialization errors. > > First, I have no compile errors, and I'm using the developer version of > ctakes "out of the box," i.e., with know modifications except correcting > maven dependency errors. > > I've been struggling resolving the following > ResourceInitializationException: > > 3/1/16 5:31:44 AM - 18: > org.apache.uima.tools.cvd.MainFrame.handleException(526): SEVERE: > Initialization of annotator class > "org.apache.ctakes.assertion.medfacts.cleartk.HistoryCleartkAnalysisEngine" > failed. (Descriptor: > file:/Users/jayurbain/Dropbox/apache-ctakes-3.2.2/desc/ctakes-assertion/desc/analysis_engine/HistoryCleartkAnalysisEngine.xml) > org.apache.uima.resource.ResourceInitializationException: Initialization of > annotator class > "org.apache.ctakes.assertion.medfacts.cleartk.HistoryCleartkAnalysisEngine" > failed. (Descriptor: > file:/Users/jayurbain/Dropbox/apache-ctakes-3.2.2/desc/ctakes-assertion/desc/analysis_engine/HistoryCleartkAnalysisEngine.xml) > > The failure is caused by: > Caused by: java.lang.ClassNotFoundException: > org.cleartk.ml.liblinear.LibLinearStringOutcomeClassifierBuilder > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:264) > at > org.cleartk.ml.jar.JarClassifierBuilder.fromManifest(JarClassifierBuilder.java:105) > ... 61 more > > The code fails here: > > public class HistoryCleartkAnalysisEngine extends > AssertionCleartkAnalysisEngine { > > boolean USE_DEFAULT_EXTRACTORS = false; > @Override > public void initialize(UimaContext context) throws > ResourceInitializationException { > super.initialize(context); // <--- fails here --- > probabilityOfKeepingADefaultExample = 0.5; > initialize_history_extractor(); > initializeFeatureSelection(); > } > > In the past, I've been able to fix these errors by fixing a missing > dependency or by adding a specific version declaration to a dependency. > > Here's the declaration in AggregatePlaintextUMLSProcessor.xml: > > >location="../../../ctakes-assertion/desc/analysis_engine/HistoryCleartkAnalysisEngine.xml"/> > > The HistoryCleartkAnalysisEngine.xml is automatically generated by uimaFIT. > > I have the cleartk-ml-liblinear-2.0.0.jar in my .m2 repository. > > I have the following dependency in the ctakes-assert and the > ctakes-clinical-pipeline pom.xml: > > > org.cleartk > cleartk-ml > 2.0.0 > > > Any guidance would be apprecaited. > > Thanks, > Jay Urbain
Re: ctakes gui
Hi Ben, I think the ctakes-gui in the sandbox area is really outdated and hasn't been maintained in a long time (hence in sandbox). But there was a old thread [1] that you may find useful. [1] https://mail-archives.apache.org/mod_mbox/ctakes-dev/201505.mbox/%3cCAPUoHuEj1aFC6PQG=jlkgsoquutq17l0mhmdpwvatd6uwqg...@mail.gmail.com%3e On Thu, Feb 18, 2016 at 12:56 PM, Ben Yuwrote: > Hi ctakes group, > Is the ctakes gui actively maintained? I downloaded it and followed Pei's > installation guide (not entirely because some of the instructions don't seem > to apply), and after some maneuvering I had it up and running with tomcat7. > When I try to bring the app up http://localhost:8080/ctakesgui/, the > login.html page came back blank. I noticed that the login.jsp (which seems to > be the file Spring mvc mapped to for login.html) used some external > javascript files and one of them, i18n.js is missing. I have not used it at > all and not sure what it is. Is that the reason why I got the blank page? Am > I missing some other front end stuff? The body tag does not have any content > in it. > > Thanks, and appreciate any help. > > Ben Yu > Software Design Engineer > College of Pharmacy, University of Utah > 801-587-7751 >
Re: ctakes resource exception
Jay, Did you also download and unzip the dictionary itself? UMLS® Dictionary Zipped copy of the cTAKES™ UMLS® dictionary. Please refer to the TODO : Should write a separate page that outlines what the dictionaries are, licensing, and where/how to get them on the net, where to install them locally, and how to configure user/pass Dictionary Install Guide <https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+Dictionary+Lookup> for assistance. Install fast version if only running ctakes-fast. All Versions <http://sourceforge.net/projects/ctakesresources/files/ctakes-resources-3.2.1.1-bin.zip/download> Fast Version <http://sourceforge.net/projects/ctakesresources/files/ctakessnorx-3.2.1.1.zip/download> Pei Chen Wired Informatics <http://www.wiredinformatics.com> 265 Franklin St Ste 1702 Boston, MA 02110 tel: (617) 433-7544 pei.c...@wiredinformatics.com On Mon, Feb 15, 2016 at 9:45 AM, Jay Urbain <jay.urb...@gmail.com> wrote: > Hi, > > I'm trying to run bin/runctakesCVD.sh. When I try to load the > AggregatePlaintextFastUMLSProcessor.xml, as described in the User install > guide ( > > https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+User+Install+Guide > ) > I receive the ResourceInitializationException (please see details below). > > It appears that I do not have something set up correctly with the UMLS > resources. If I try to run the AggregatePlaintextProcessor, everything > seems to work Ok. > > Any help or direction would be appreciated. > > Thanks, > Jay > > Environment: > OS X Yosemite 10.10.5 > Java 1.8 > > Downloads: > apache-ctakes-3.2.2 > ctakes-resources-3.2.1.1-bin > > Copied the resources directory: > ditto /Users/jayurbain/Downloads/ctakes-resources-3.2.1.1-bin/resources/* > /Users/jayurbain/Dropbox/apache-ctakes-3.2.2/resources > > Added my UMLS user authentication to runtakesCVD.sh > java -Dctakes.umlsuser= > -Dctakes.umlspw= > > Exception: > > 2/15/16 8:34:25 AM - 15: > org.apache.uima.tools.cvd.MainFrame.handleException(526): SEVERE: > Initialization of annotator class > "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator" failed. > (Descriptor: > > file:/Users/jayurbain/Dropbox/apache-ctakes-3.2.2/desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml) > org.apache.uima.resource.ResourceInitializationException: Initialization of > annotator class > "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator" failed. > (Descriptor: > > file:/Users/jayurbain/Dropbox/apache-ctakes-3.2.2/desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml) > at > > org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:252) > at > > org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:156) > at > > org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94) > at > > org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) > at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) > at > org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387) > at > org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254) > at > > org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431) > at > > org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375) > at > > org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185) > at > > org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94) > at > > org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) > at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) > at > org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:354) > at org.apache.uima.tools.cvd.MainFrame.setupAE(MainFrame.java:1484) > at org.apache.uima.tools.cvd.MainFrame.loadAEDescriptor(MainFrame.java:476) > at > > org.apache.uima.tools.cvd.control.AnnotatorOpenEventHandler.actionPerformed(AnnotatorOpenEventHandler.java:52) > at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2022) > at > > javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2348) > at > > javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402) > at javax.
Re: Mac/download link broken
The links on the menu should point to http://ctakes.apache.org/downloads.cgi <http://ctakes.apache.org/downloads.cgi> Do you know where the .html link came from; those should be updated. —Pei > On Feb 11, 2016, at 4:02 PM, taposh.d@kp.org wrote: > > Hi > > When I click user installation for MAC/Linux from > http://ctakes.apache.org/downloads.html > > I get a broken link > http://ctakes.apache.org/ > [preferred]/ctakes/ctakes-3.2.2/apache-ctakes-3.2.2-bin.tar.gz > > Can some one forward me the right link and fix this or let me know how to > and I will fix it. > > Regards, > > Taposh D. Roy | Health Data Project Lead/Scientist | Delivery System > Analytics, Decision Support | Kaiser Permanente | cell: 510.206.1633 | > taposh.d@kp.org > > > > NOTICE TO RECIPIENT: If you are not the intended recipient of this > e-mail, you are prohibited from sharing, copying, or otherwise using or > disclosing its contents. If you have received this e-mail in error, > please notify the sender immediately by reply e-mail and permanently > delete this e-mail and any attachments without reading, forwarding or > saving them. Thank you. > > > > > > From: "Savova, Guergana" <guergana.sav...@childrens.harvard.edu> > To: "dev@ctakes.apache.org" <dev@ctakes.apache.org> > Date: 02/10/2016 11:20 AM > Subject:RE: Contributing to documentation > > > > Hi Jessica, > Thank you very much for offering to contribute to the documentation! > Indeed this is our weak link and any help there will be greatly > appreciated. > A warm welcome to the community! > --Guergana > > > -Original Message- > From: Pei Chen [mailto:chen...@apache.org] > Sent: Wednesday, February 10, 2016 1:41 PM > To: dev@ctakes.apache.org > Subject: Re: Contributing to documentation > > We've been generally following the C-T-R model [1] > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_foundation_glossary.html-23CommitThenReview=BQIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcrO4yRGmRCJNAr-rCmP=yprBottjMZmd-5h2kun5_56ITgboOGhRiM1FrbJtLiE=BiPUyRARC7nrVJaM2ajjNaANac3AbCc0l25_hWVUCQU= > > But feel free to discuss on dev@ whenever in doubt... > > [1] > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_foundation_glossary.html-23CommitThenReview=BQIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcrO4yRGmRCJNAr-rCmP=yprBottjMZmd-5h2kun5_56ITgboOGhRiM1FrbJtLiE=BiPUyRARC7nrVJaM2ajjNaANac3AbCc0l25_hWVUCQU= > > > On Wed, Feb 10, 2016 at 1:31 PM, Jessica Glover > <glover.jessic...@gmail.com> wrote: >> Thank you. I'm excited to contribute. >> >> Is there a process by which my contributions should get "voted in" or >> am I free to just start editing? >> >> - Jessica >> >> On Feb 10, 2016 9:28 AM, "Pei Chen" <pei.c...@wiredinformatics.com> > wrote: >> >>> User Jessica Glover (jgloves) Added to: >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org >>> _confluence_display_CTAKES_cTAKES=BQIFaQ=qS4goWBT7poplM69zy_3xhKw >>> EW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcr >>> O4yRGmRCJNAr-rCmP=yprBottjMZmd-5h2kun5_56ITgboOGhRiM1FrbJtLiE=LVL >>> CQGevx3dGn1G-IoKWfyFMl6ZQThSi90BoERcRp6w= >>> Enjoy! >>> —Pei >>> >>> On Feb 10, 2016, at 8:43 AM, Jessica Glover >>> <glover.jessic...@gmail.com> >>> wrote: >>> >>> Hi Pei, >>> I'm not sure what my confluence ID is. I log in with this email >>> address, and I can be found under Jessica Glover in a People search. >>> >>> - Jessica >>> This would be great. What is your confluence id (anyone should be >>> able to create an account)? >>> --Pei >>> >>> On Tue, Feb 9, 2016 at 7:49 AM, Jessica Glover >>> <glover.jessic...@gmail.com> wrote: >>> >>> Hello, >>> >>> I am a cTAKES user, but I am interested in development and especially >>> interested in contributing to the documentation. I have some ideas >>> for making the component use guides more user-friendly for first-time >>> UIMAers, but I'm also eager to hear what the dev community would like >>> to see. I am happy to write as well as create diagrams. >>> >>> Thanks, >>> >>> Jessica Glover >>> >>> >>> > > signature.asc Description: Message signed with OpenPGP using GPGMail
Re: Mac/download link broken
Yes, no one should be using http://ctakes.apache.org/downloads.html <http://ctakes.apache.org/downloads.html> All links should be using http://ctakes.apache.org/downloads. <http://ctakes.apache.org/downloads.html>cgi so that it dynamically resolves the mirrors properly… > On Feb 11, 2016, at 4:11 PM, taposh.d@kp.org wrote: > > Pei - > > [preferred] refers to the preferred mirror and needs to be redirected to the > mirror host name. If user is new this will break. > > HTML Page > http://ctakes.apache.org/downloads.html > <http://ctakes.apache.org/downloads.html>--> CLick on MAc/Linux under > User Installation to see it broken. > (http://ctakes.apache.org/ > <http://ctakes.apache.org/>[preferred]/ctakes/ctakes-3.2.2/apache-ctakes-3.2.2-bin.tar.gz) > > CGI Page <http://ctakes.apache.org/downloads.cgi> > http://ctakes.apache.org/downloads.cgi --> Works... > > Regards, > > Taposh D. Roy | Health Data Project Lead/Scientist | Delivery System > Analytics, Decision Support | Kaiser Permanente | cell: 510.206.1633 | > taposh.d@kp.org > > > > NOTICE TO RECIPIENT: If you are not the intended recipient of this e-mail, > you are prohibited from sharing, copying, or otherwise using or disclosing > its contents. If you have received this e-mail in error, please notify the > sender immediately by reply e-mail and permanently delete this e-mail and any > attachments without reading, forwarding or saving them. Thank you. > > > > > > From:Pei Chen <pei.c...@wiredinformatics.com> > To:dev@ctakes.apache.org > Date:02/11/2016 01:05 PM > Subject:Re: Mac/download link broken > > > > The links on the menu should point to http://ctakes.apache.org/downloads.cgi > <http://ctakes.apache.org/downloads.cgi> > Do you know where the .html link came from; those should be updated. > > —Pei > > On Feb 11, 2016, at 4:02 PM, taposh.d@kp.org <mailto:taposh.d@kp.org> > wrote: > > Hi > > When I click user installation for MAC/Linux from > http://ctakes.apache.org/downloads.html > <http://ctakes.apache.org/downloads.html> > > I get a broken link > http://ctakes.apache.org/ <http://ctakes.apache.org/> > [preferred]/ctakes/ctakes-3.2.2/apache-ctakes-3.2.2-bin.tar.gz > > Can some one forward me the right link and fix this or let me know how to > and I will fix it. > > Regards, > > Taposh D. Roy | Health Data Project Lead/Scientist | Delivery System > Analytics, Decision Support | Kaiser Permanente | cell: 510.206.1633 | > taposh.d@kp.org > > > > NOTICE TO RECIPIENT: If you are not the intended recipient of this > e-mail, you are prohibited from sharing, copying, or otherwise using or > disclosing its contents. If you have received this e-mail in error, > please notify the sender immediately by reply e-mail and permanently > delete this e-mail and any attachments without reading, forwarding or > saving them. Thank you. > > > > > > From: "Savova, Guergana" <guergana.sav...@childrens.harvard.edu> > To: "dev@ctakes.apache.org" <dev@ctakes.apache.org> > Date: 02/10/2016 11:20 AM > Subject:RE: Contributing to documentation > > > > Hi Jessica, > Thank you very much for offering to contribute to the documentation! > Indeed this is our weak link and any help there will be greatly > appreciated. > A warm welcome to the community! > --Guergana > > > -Original Message- > From: Pei Chen [mailto:chen...@apache.org <mailto:chen...@apache.org>] > Sent: Wednesday, February 10, 2016 1:41 PM > To: dev@ctakes.apache.org > Subject: Re: Contributing to documentation > > We've been generally following the C-T-R model [1] > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_foundation_glossary.html-23CommitThenReview=BQIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcrO4yRGmRCJNAr-rCmP=yprBottjMZmd-5h2kun5_56ITgboOGhRiM1FrbJtLiE=BiPUyRARC7nrVJaM2ajjNaANac3AbCc0l25_hWVUCQU= > > <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_foundation_glossary.html-23CommitThenReview=BQIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcrO4yRGmRCJNAr-rCmP=yprBottjMZmd-5h2kun5_56ITgboOGhRiM1FrbJtLiE=BiPUyRARC7nrVJaM2ajjNaANac3AbCc0l25_hWVUCQU=> > > But feel free to discuss on dev@ whenever in doubt... > > [1] > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_foundation_glossary.html-23CommitThenReview=BQIFaQ=qS4goWBT7poplM69zy_3xhKwEW
Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives
CTAKES-384-20160129.patch applied. > On Jan 29, 2016, at 4:34 AM, Peter Klügl <peter.klu...@averbis.com> wrote: > > Hi, > > the problems were caused by the svn client in my Eclipse. Sorry for the > trouble, I should have looked more closely at the ciomplete patch. > > I attached a new patch created with commandline tools wich looks correct > now. > > Pei, can you apply the new patch? > > Best, > > Peter > > Am 28.01.2016 um 15:57 schrieb Peter Klügl: >> Thanks Pei. >> >> I fear there was again a problem with the patch. All new files are >> missing (and also the svn-ignore settings). >> >> Can you take a look? >> >> Best, >> >> Peter >> >> Am 28.01.2016 um 14:43 schrieb Pei Chen: >>> patch applied. >>> Thanks, >>> Pei >>> >>> On Thu, Jan 28, 2016 at 4:14 AM, Peter Klügl <peter.klu...@averbis.com> >>> wrote: >>>> Hi Pei, >>>> >>>> can you commit the recent patch for us? >>>> >>>> CTAKES-384-20160120.patch >>>> >>>> Best, >>>> >>>> Peter >>>> >>>> Am 20.01.2016 um 19:35 schrieb Pei Chen: >>>>> Hi, >>>>> Sorry I was swamped recently. >>>>> But yeah, we can even create an extended type system to store these items >>>>> temporarily and add them into the main/core type system afterwards. >>>>> There was an existing item to upgrade UIMA, but agreed- it will require >>>>> much more testing. If it works, we can upgrade it in our sandbox area or >>>>> create a branch if necessary. >>>>> >>>>> —Pei >>>>> >>>>>> On Jan 18, 2016, at 9:06 AM, Peter Klügl <peter.klu...@averbis.com> >>>>>> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> a new patch is attached. >>>>>> >>>>>> @Pei: >>>>>> are there suitable annotation types in the cTAKES type system? Some >>>>>> project in cTAKES uses something like OntologyMatch... I map it to >>>>>> IdentifiedAnnotation right now, but there are many empty features... >>>>>> >>>>>> @Azad: >>>>>> I changed the rules a bit, especially the capitalization like I use it >>>>>> in ruta normally. The wordlist are compiled to a trie by the maven >>>>>> plugin. I also added the two regexes for url and email. I extended the >>>>>> regex for the url. I also changed the evaluation order of some rules >>>>>> (with @). Feel free to add simple examples to examples.csv for the unit >>>>>> tests. >>>>>> >>>>>> Let me know if you need more information about the changes. >>>>>> >>>>>> Do you wanna have help with the other rule sets? Or should we split them >>>>>> up? >>>>>> >>>>>> Best, >>>>>> >>>>>> Peter >>>>>> >>>>>> Am 18.01.2016 um 11:04 schrieb Peter Klügl: >>>>>>> Hi, >>>>>>> >>>>>>> great. I will integrate them in the project and in the next patch. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Peter >>>>>>> >>>>>>> Am 18.01.2016 um 00:58 schrieb Azad Dehghan: >>>>>>>> Three NERs translated and uploaded. >>>>>>>> >>>>>>>> PS. I will validate all NERs once we have them all completed. >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Azad >>>>>>>> >>>>>>>> On 24 November 2015 at 10:37, Azad Dehghan <azad.dehg...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> This is on my todo list for Dec. as well. If there are any more >>>>>>>>> volunteers >>>>>>>>> for translating JAPE to RUTA, please get in touch. >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Azad >>>>>>>>> >>>>>>>>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com>
Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives
patch applied. Thanks, Pei On Thu, Jan 28, 2016 at 4:14 AM, Peter Klügl <peter.klu...@averbis.com> wrote: > Hi Pei, > > can you commit the recent patch for us? > > CTAKES-384-20160120.patch > > Best, > > Peter > > Am 20.01.2016 um 19:35 schrieb Pei Chen: >> Hi, >> Sorry I was swamped recently. >> But yeah, we can even create an extended type system to store these items >> temporarily and add them into the main/core type system afterwards. >> There was an existing item to upgrade UIMA, but agreed- it will require much >> more testing. If it works, we can upgrade it in our sandbox area or create >> a branch if necessary. >> >> —Pei >> >>> On Jan 18, 2016, at 9:06 AM, Peter Klügl <peter.klu...@averbis.com> wrote: >>> >>> Hi, >>> >>> a new patch is attached. >>> >>> @Pei: >>> are there suitable annotation types in the cTAKES type system? Some >>> project in cTAKES uses something like OntologyMatch... I map it to >>> IdentifiedAnnotation right now, but there are many empty features... >>> >>> @Azad: >>> I changed the rules a bit, especially the capitalization like I use it >>> in ruta normally. The wordlist are compiled to a trie by the maven >>> plugin. I also added the two regexes for url and email. I extended the >>> regex for the url. I also changed the evaluation order of some rules >>> (with @). Feel free to add simple examples to examples.csv for the unit >>> tests. >>> >>> Let me know if you need more information about the changes. >>> >>> Do you wanna have help with the other rule sets? Or should we split them up? >>> >>> Best, >>> >>> Peter >>> >>> Am 18.01.2016 um 11:04 schrieb Peter Klügl: >>>> Hi, >>>> >>>> great. I will integrate them in the project and in the next patch. >>>> >>>> Best, >>>> >>>> Peter >>>> >>>> Am 18.01.2016 um 00:58 schrieb Azad Dehghan: >>>>> Three NERs translated and uploaded. >>>>> >>>>> PS. I will validate all NERs once we have them all completed. >>>>> >>>>> Cheers, >>>>> Azad >>>>> >>>>> On 24 November 2015 at 10:37, Azad Dehghan <azad.dehg...@gmail.com> wrote: >>>>> >>>>>> This is on my todo list for Dec. as well. If there are any more >>>>>> volunteers >>>>>> for translating JAPE to RUTA, please get in touch. >>>>>> >>>>>> Cheers, >>>>>> Azad >>>>>> >>>>>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I just wanted to mention that I haven't forgot about it. Unfortunately, >>>>>>> there is just no spare time right now. I hope I will be able to provide >>>>>>> the patches in December. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Peter >>>>>>> >>>>>>> Am 06.11.2015 um 16:40 schrieb Pei Chen: >>>>>>>> Hi Peter, >>>>>>>> I think the ctakes-examples is probably a good starting point at least >>>>>>>> in terms of maven modules, etc. I think it would be good if we use >>>>>>>> uimaFIT style as primary approach to wiring components together and >>>>>>>> generate desc's as secondary... >>>>>>>> I think the actual components that would be required is probably best >>>>>>>> left up to what is actually required for best performing c-deid. The >>>>>>>> output would be interesting, I'm not sure if we should treat this as >>>>>>>> an independent preprocessing component or part of a pipeline (in which >>>>>>>> case, we may need to propose a change to the type system or perhaps an >>>>>>>> alternative JCas view. You can probably open up that discussion to >>>>>>>> the dev group as you see fit.) >>>>>>>> >>>>>>>> My 2 cents... >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.klu...@averbis.com> >>>>&g
Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives
Hi, Sorry I was swamped recently. But yeah, we can even create an extended type system to store these items temporarily and add them into the main/core type system afterwards. There was an existing item to upgrade UIMA, but agreed- it will require much more testing. If it works, we can upgrade it in our sandbox area or create a branch if necessary. —Pei > On Jan 18, 2016, at 9:06 AM, Peter Klügl <peter.klu...@averbis.com> wrote: > > Hi, > > a new patch is attached. > > @Pei: > are there suitable annotation types in the cTAKES type system? Some > project in cTAKES uses something like OntologyMatch... I map it to > IdentifiedAnnotation right now, but there are many empty features... > > @Azad: > I changed the rules a bit, especially the capitalization like I use it > in ruta normally. The wordlist are compiled to a trie by the maven > plugin. I also added the two regexes for url and email. I extended the > regex for the url. I also changed the evaluation order of some rules > (with @). Feel free to add simple examples to examples.csv for the unit > tests. > > Let me know if you need more information about the changes. > > Do you wanna have help with the other rule sets? Or should we split them up? > > Best, > > Peter > > Am 18.01.2016 um 11:04 schrieb Peter Klügl: >> Hi, >> >> great. I will integrate them in the project and in the next patch. >> >> Best, >> >> Peter >> >> Am 18.01.2016 um 00:58 schrieb Azad Dehghan: >>> Three NERs translated and uploaded. >>> >>> PS. I will validate all NERs once we have them all completed. >>> >>> Cheers, >>> Azad >>> >>> On 24 November 2015 at 10:37, Azad Dehghan <azad.dehg...@gmail.com> wrote: >>> >>>> This is on my todo list for Dec. as well. If there are any more volunteers >>>> for translating JAPE to RUTA, please get in touch. >>>> >>>> Cheers, >>>> Azad >>>> >>>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> wrote: >>>>> Hi, >>>>> >>>>> I just wanted to mention that I haven't forgot about it. Unfortunately, >>>>> there is just no spare time right now. I hope I will be able to provide >>>>> the patches in December. >>>>> >>>>> Best, >>>>> >>>>> Peter >>>>> >>>>> Am 06.11.2015 um 16:40 schrieb Pei Chen: >>>>>> Hi Peter, >>>>>> I think the ctakes-examples is probably a good starting point at least >>>>>> in terms of maven modules, etc. I think it would be good if we use >>>>>> uimaFIT style as primary approach to wiring components together and >>>>>> generate desc's as secondary... >>>>>> I think the actual components that would be required is probably best >>>>>> left up to what is actually required for best performing c-deid. The >>>>>> output would be interesting, I'm not sure if we should treat this as >>>>>> an independent preprocessing component or part of a pipeline (in which >>>>>> case, we may need to propose a change to the type system or perhaps an >>>>>> alternative JCas view. You can probably open up that discussion to >>>>>> the dev group as you see fit.) >>>>>> >>>>>> My 2 cents... >>>>>> >>>>>> >>>>>> On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.klu...@averbis.com> >>>> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Is there a cTAKES project that may serve as an example on how the >>>> cTAKES >>>>>>> community develops or how a project should look like? >>>>>>> I learned that different people set up UIMA project in a quite >>>> different >>>>>>> manner and I do not what to get inspired by "some sort of out-dated" >>>>>>> approach in the cTAKES repo. >>>>>>> >>>>>>> Are there restriction or preferences about the preprocessing >>>> components >>>>>>> that should be used and the kind of "output" of the project. >>>>>>> Components: On which components may the componetns rely: tokenizer, >>>> ... >>>>>>> parser, ... dict lookup? >>>>>>> "output": Should t
Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives
Patch applied: http://svn.apache.org/repos/asf/ctakes/sandbox/ctakes-clinical-deid/ <http://svn.apache.org/repos/asf/ctakes/sandbox/ctakes-clinical-deid/> Thanks Peter. What error did you get with xml-api’s? Do you mean upgrade ctakes to the latest version of uima instead of 2.4.0? —Pei > On Jan 11, 2016, at 12:39 PM, Peter Klügl <peter.klu...@averbis.com> wrote: > > Hi, > > I just added a small patch which adds a maven build process and a dummy > unit test. > > I had some problems with the version of xml-apis. Is this known or > rather a local problem on my build machine? > Is there a reason why cTAKES requires uima 2.4.0? > > Next step would be translating the rules. Azad mentioned that he already > started with that :-) > > Best, > > Peter > > > Am 18.12.2015 um 11:01 schrieb Peter Klügl: >> Hi, >> >> sorry, there was no free time left in December for this issue, but I >> will be able to provide the patches in January (for real). >> >> Best, >> >> Peter >> >> Am 24.11.2015 um 11:37 schrieb Azad Dehghan: >>> This is on my todo list for Dec. as well. If there are any more volunteers >>> for translating JAPE to RUTA, please get in touch. >>> >>> Cheers, >>> Azad >>> >>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> wrote: >>>> Hi, >>>> >>>> I just wanted to mention that I haven't forgot about it. Unfortunately, >>>> there is just no spare time right now. I hope I will be able to provide >>>> the patches in December. >>>> >>>> Best, >>>> >>>> Peter >>>> >>>> Am 06.11.2015 um 16:40 schrieb Pei Chen: >>>>> Hi Peter, >>>>> I think the ctakes-examples is probably a good starting point at least >>>>> in terms of maven modules, etc. I think it would be good if we use >>>>> uimaFIT style as primary approach to wiring components together and >>>>> generate desc's as secondary... >>>>> I think the actual components that would be required is probably best >>>>> left up to what is actually required for best performing c-deid. The >>>>> output would be interesting, I'm not sure if we should treat this as >>>>> an independent preprocessing component or part of a pipeline (in which >>>>> case, we may need to propose a change to the type system or perhaps an >>>>> alternative JCas view. You can probably open up that discussion to >>>>> the dev group as you see fit.) >>>>> >>>>> My 2 cents... >>>>> >>>>> >>>>> On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.klu...@averbis.com> >>> wrote: >>>>>> Hi, >>>>>> >>>>>> Is there a cTAKES project that may serve as an example on how the >>> cTAKES >>>>>> community develops or how a project should look like? >>>>>> I learned that different people set up UIMA project in a quite >>> different >>>>>> manner and I do not what to get inspired by "some sort of out-dated" >>>>>> approach in the cTAKES repo. >>>>>> >>>>>> Are there restriction or preferences about the preprocessing components >>>>>> that should be used and the kind of "output" of the project. >>>>>> Components: On which components may the componetns rely: tokenizer, ... >>>>>> parser, ... dict lookup? >>>>>> "output": Should the project provide a pipeline or a single AE? >>>>>> >>>>>> More comments below. >>>>>> >>>>>> Am 03.11.2015 um 16:54 schrieb Azad Dehghan: >>>>>>>> Who else plans to provide patches for it? Just to avoid duplicate >>> work >>>>>>>> and to coordnate the efforts ... >>>>>>>> >>>>>>> I would like to help with the translating JAPE to RUTA. >>>>>> You can already go ahead with the UIMA Ruta Workbench if you want, or >>>>>> wait until I set up the project with ruta integration. >>>>>> >>>>>> If any questions arise, just ask :-) >>>>>> >>>>>>>> Is there a development dataset which was utilized for the initial >>>>>>>> development, and if yes, is it possible to
Re: Need help to identify procedures in xml file using AggregatePlaintextFastUMLSProcessor
Hi Reena, If you search for "ProcedureMention" in the attached output xml, you should be able to find the Procedures (plus the FSArray of the associated Concepts) that were extracted... Or am I missing something... --Pei On Mon, Dec 7, 2015 at 12:40 AM, Reena Duggalwrote: > Sorry, I attached the wrong file in last mail. PFA the correct xml file. > > Thanks & Regards > Reena Duggal > Research Scholar(Full-Time) > Amity Institute of Information Technology > Amity University Uttar Pradesh > M - 09740256313 > On 12/7/2015 10:41 AM, Reena Duggal wrote: > > Hello > I have setup ctakes on my machine using cTAKES 3.2 User Install Guide. I > created an xml file using CPE and using AggregatePlaintextFastUMLSProcessor. > I am attaching it with email. Pl let me know how to parse this file to get > list of procedures from it. I am not able to figure out that part. Also pl > check, if this file is correct. Will really appreciate your help on this. > > > Thanks & Regards > Reena Duggal > Research Scholar(Full-Time) > Amity Institute of Information Technology > Amity University Uttar Pradesh > M - 09740256313 > >
Re: ctakes with icd10; 2015 versions available on sourceforge!
Brandon, That sounds great! Please open a Jira ticket for any contributions (anyone should be able to create a Jira account). There are some legal items built into the ASF Jira attachments for accepting contributions/donations. It will also credit the contributors with the merit appropriately. Anyone who is interested can follow the Jira item. (Even better if contributions were open discussion/open development.) --Pei On Tue, Dec 8, 2015 at 10:36 PM, Geise, Brandon D.wrote: > I'd be interested in contributing to making the dictionary tool more user > friendly with a GUI. > > Thanks, > Brandon > > -Original Message- > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] > Sent: Tuesday, December 08, 2015 6:12 PM > To: dev@ctakes.apache.org > Subject: RE: ctakes with icd10; 2015 versions available on sourceforge! > > Hi Dave, > > I'm always happy to see interest in our stuff! > >>Step 1 > I built the tool to be able to build a dictionary using anything in the umls > - snomed, icd9, hpo, etc. so using the veterinary extension shouldn't be a > problem. You just add it to the CtakesSources file (or create an alternate > file and point to it with -src). To answer another of your questions, there > can be zero or more sources - you saw snomedct and snomedct_us (each valid in > a different umls version). > It also can include any semantic type, just add (or remove) the appropriate > tuis in a different data file. > >>Step 2 > You have it right - you copy the templates to another location and output to > that location. Otherwise you 'lose' your templates. > >>Step 3 and 4 > The jar is built from source. I need to (soon) check in updates to the > source, and at the same time I can check in a default prebuilt .jar The lib/ > directory is in the source repository. > > Various people have toyed with the idea of putting the tool into a ctakes > module, putting it into an "installation package", making a gui ... The best > option (imo) is probably to make an easy to use gui and keep a pre-built > version in sandbox. Someday, after the rainbow, maybe I'll get a chance to > do that ... > > Sean > > > -Original Message- > From: David Kincaid [mailto:kincaid.d...@gmail.com] > Sent: Tuesday, December 08, 2015 4:57 PM > To: dev@ctakes.apache.org > Subject: Re: ctakes with icd10; 2015 versions available on sourceforge! > > Thanks, Sean! It's great that cTAKES may soon have an up to date database out > of the box. Hopefully it will cut down on the need for many to build their > own DB's. Thank you much for doing that. > > Unfortunately, I still will need to build a custom one for us. I work in > veterinary medicine so I need to add in the veterinary extension for > SNOMED-CT into the database. > > I looked over the steps below that Brandon included and have some questions: > > step 1 says to "Change /data/default/CtakesSources.txt from "SNOMEDCT" to > "SNOMEDCT_US". The file that I have has two lines in it. First line is > SNOMED, second line is SNOMEDCT_US. So this step doesn't really make sense. > > step 2 should reference the two scripts as being in resource/memdbtemplate so > others don't have to search for them. Not sure what it means to move them to > "location to put new UMLS DB". Does that mean move them into a new directory > where the newly created UMLS DB will get written? > > steps 3 and 4 for running the tools reference dictionarytool.jar which > doesn't exist. Does one need to build that somehow from the source before > running it? The command line also adds "lib/*" to the classpath. Is that the > lib directory inside the dictionarytool source code or some other location? > > What else would I need to do to include the SNOMED-CT Veterinary Extension > along with the snomedct and rxnorm sources? > > I'll probably not have time to try this out for a while yet, but when I do > I'd be happy to write up an easy to follow tutorial for building a custom > dictionary assuming I am able to get it to work. > > Has anyone considered making this tool available outside of the source code > itself? Like including it in the main cTAKES release? It seems there is > demand for it. > > - Dave > > On Tue, Dec 8, 2015 at 3:22 PM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: > >> Hi Brandon, thanks for finding and forwarding the instructions! >> >> I have checked in two new hsqldb dictionaries, both from the 2015AB >> version of the UMLS. They both have codes for snomedct_us, rxnorm, >> icd9cm and icd10pcs - as well as the usual cui, tui, preferred term mappings. >> >> One uses cuis filtered by snomed and rxnorm, the other adds cuis >> filtered by icd9 and icd10. >> What this means: Cuis that exist for a [filter source] are added to >> the dictionary, as are all text variations from all sources that >> contain that cui. Both dictionaries also use the standard ctakes >> semantic group tui filters. >> >> The names are ctakessnorx2015
Re: Create next cTAKES release (3.2.3)?
A lot of the Jira's haven't been bumped into 3.2.4 yet. This is to get everyone to start looking and update their Jira's and if they don't think they'll get a chance to work on it, I suggest to bump it to the next release... And if someone would like something to be included in this release, please create a Jira and assign it to 3.2.3. --Pei On Thu, Nov 19, 2015 at 11:09 AM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote: > Hi Pei, thanks for the link to our Jira dashboard. From my 3 second > run-through I would say that there remains a lot of outstanding work slated > for the 3.2.3 release. Below are the Blocker, Critical and Major items. > Some may actually have been or can be quickly resolved, but it looks like we > may have more than a few bumps to 3.2.4 if we want to push out a release. > > I say in my bazaar way: release early, release often ... > +1 for bumps and release ... *but first some comments in Jira on the state of > all listed below... > > Can anybody confirm that our only Blocker (dependencies not in maven central) > is still a problem? https://issues.apache.org/jira/browse/CTAKES-76 > Where do we stand on the related Critical > https://issues.apache.org/jira/browse/CTAKES-138 ? > > Our other Critical item is PTB tokenizer breaking on apostrophes: > https://issues.apache.org/jira/browse/CTAKES-74 > > A Major bug is FractionFSM incorrectly handling dashed ranges: > https://issues.apache.org/jira/browse/CTAKES-341 Britt, it looks like you > might have a fix ready? > > Another Major bug is for ytex UMLS.hbm.template.xml ... > https://issues.apache.org/jira/browse/CTAKES-302 Vijay it looks like you have > a fix started? > > Major bug for Missing Modifiers > https://issues.apache.org/jira/browse/CTAKES-213 ... Steve indicates that > this will require a lot of work. Should we bump it or has somebody been > making progress? > > A Major bug in Medication Strength parsing has sat since our original > incubation, so I'm just guessing that it hasn't been touched and will be > bumped. https://issues.apache.org/jira/browse/CTAKES-178 > > Major bug SimpleSegmentWithTags ... 5 char names ... has also been around > single the continents were formed. > https://issues.apache.org/jira/browse/CTAKES-155 I'd say a bump seems ok > except that there is an NPE ... > > There is a patch posted for our good old blues brothers boys band "URI not > hierarchical" on the old dictionary lookup > https://issues.apache.org/jira/browse/CTAKES-388 Can anybody volunteer to > test and commit? I think that this is basically the same problem relayed in > https://issues.apache.org/jira/browse/CTAKES-320 > > > We have two placeholders for 3.2.3 additions. They should probably be added > and (widely) tested asap or bumped to the next release. > New Sentence Detector https://issues.apache.org/jira/browse/CTAKES-380 > ISO Time Normalizer https://issues.apache.org/jira/browse/CTAKES-379 > > Has anybody started to tackle clean up / ?removal? of xml descriptors? > Tagged as Major improvement. https://issues.apache.org/jira/browse/CTAKES-328 > This is related to https://issues.apache.org/jira/browse/CTAKES-295 - for > which Tim Miller has done a lot of great work, but is still incomplete. Do > others have checkins awaitin'? > A related Major Improvement is updating/fixing the relation extractor xml: > https://issues.apache.org/jira/browse/CTAKES-172 > > > Another Major improvement is an lvg update. Do we have time to play with > this or should we bump it? https://issues.apache.org/jira/browse/CTAKES-388 > Related to https://issues.apache.org/jira/browse/CTAKES-122 > > > Pei or Jay, are you ready to check in a working BigTop integration? > https://issues.apache.org/jira/browse/CTAKES-314 > > > > -Original Message- > From: Pei Chen [mailto:chen...@apache.org] > Sent: Wednesday, November 18, 2015 10:02 PM > To: dev@ctakes.apache.org > Subject: Create next cTAKES release (3.2.3)? > > Hi Folks, > It looks like there have been a lot of progress in Jira's. What do folks > think of preparing a cut for the next release- would be nice to get one more > out before holidays/end of the year? > I'll be happy to volunteer to be RM again. > > Full list of Jira items slated for 3.2.3: > https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_CTAKES_fixforversion_12328718_-3FselectedTab-3Dcom.atlassian.jira.jira-2Dprojects-2Dplugin-3Aversion-2Dissues-2Dpanel=BQIBaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao=_7ouzO0-tjeIkyk9Gs02WBejxjOgYQstemelRj8yHcY=Bb9i6bbeLKK1UiJCVzZZPIkgQpmbHNsYJbEBDhsaBA4= > > --Pei
Re: user Digest 16 Nov 2015 19:44:10 -0000 Issue 343
[+dev, -user] Hi Lewis, I've applied the patches. Would you mind looking into using and see if it works for your tests/use cases?: org.apache.ctakes.core.resource.FileLocator.getAsStream()? It has a built in fall back mechanism. On Mon, Nov 16, 2015 at 3:25 PM, Lewis John Mcgibbneywrote: > Hi Pei, > > On Mon, Nov 16, 2015 at 11:44 AM, > wrote: >> >> >> Date: Thu, 12 Nov 2015 15:48:32 -0500 >> Subject: Re: cTAKES Trunk Broken? >> Hi Lewis, >> Sorry for the delays- I noticed some of your patches were still >> pending in Jira. >> I should be able to spend more time on cTAKES now and hopefully be >> able to apply those patches shortly; Unless someone beats me to it.. >> > > Thanks for the response. We are using cTAKES heavily right now and are very > interested in progressing availability and execution of cTAKES within Yarn > and Spark. I am really interested to work with the cTAKES committers to > ensure that lvg resources (and every cTAKES resource for that matter) can be > run in these environments. As you're aware they curently can't be. > Let let me know if the various patches need unit tests. If so I am more than > happy to invest some time and provide test patches for the all them. > Thank you again for getting back on this one. > Lewis >
Re: cTAKES corpus
Hi Jose, There were some previous discussions[1] on how to get the annotated training data. Essentially, there currently isn't a centralized or easy way of getting w/o having to sign individual Data Use Agreements from source institutions. There is a clear need to simplify this and I believe the various groups are working on it... [1] http://mail-archives.apache.org/mod_mbox/ctakes-dev/201503.mbox/%3CCA+Fyf6hxBbhhEqc9oU=vpuymc1fyrwpextpmpme-ir0cjwt...@mail.gmail.com%3E > There are some discussions on appending/augmenting the existing > annotated/training data[2]. I think the short answer is that there is > currently no easy way short of having to sign DUA's from every single > source institution. > > [1] http://svn.apache.org/r1465043 > [2] > > http://mail-archives.apache.org/mod_mbox/ctakes-dev/201412.mbox/%3ce5a9fa5abbf1ca4085d4f0794852a51e24241...@chexmbx3a.chboston.org%3E On Wed, Nov 11, 2015 at 3:51 PM, Posada Aguilar, Jose Davidwrote: > Dear cTAKES community > > I want to know if it's possible to obtain the annotated corpus that were used > to test cTAKES. > > We are currently using it and we would like to be able to test each module > towards the addition of a new one. > > Thank you very much for your help. > > > > Jose Posada > Department of Biomedical Informatics > University of Pittsburgh > >
Email address update
Hi, Just wanted to give a heads up- My usual childrens/harvard email address will no longer be valid after this week. I will continue to use my chenpei@a.o one as I will be moving on to focus on our startup, Wired Informatics. --Pei
Re: cTakes 3.2.2 CPE error
Hi Eric, could you copy and paste the java command that was used? It some look something like: "java -cp ..." --Pei On Sat, Oct 31, 2015 at 1:58 PM, Eric Benzschawelwrote: > To whom it may concern, > I'm Eric Benzschawel, a masters student at Brandeis University. I'm > planning on using cTakes to help me process texts for my master's thesis, > but I'm having problems installing and running the program. > > I've followed the cTakes 3.2 install guide twice, identically, and I can't > complete the CPE example on the bottom of the install page. I'm getting the > error: > > Error: Could not find or load main class > .usr.local.apache-ctakes-3.2.2.desc.:.usr.local.apache-ctakes-3.2.2.resources.:.usr.local.apache-ctakes-3.2.2.lib.* > > I'd be surprised if this error something that's specific to my local > install. I've successfully completed the CVD tutorial, but the CPE program > will be more relevant to my work. Do you have any resources you can point > me to for troubleshooting or any idea what might be causing this error? > > Best, > Eric Benzschawel > Brandeis University > Computational Linguistics MA '16 > > Additional information: > Platform: Mac OSX El Capitan > Java version: 1.8.0_20 > cTakes version: 3.2.2 > UMLS resources included: yes > UMLS resources version: 3.2.1.1-bin.tar.gz > Downloaded additional ctakessnorx resources: yes
Re: cTAKES scale out using UIMA DUCC
Yi-Wen, ctakes-clinical-pipeline/desc/**/ for the descriptors? Hope that helps.. On Wed, Oct 28, 2015 at 5:55 PM, Yi-Wen Liuwrote: > Hi, > > I am trying to run cTAKES on UIMA DUCC, and I am working on some > configuration files. > In UIMA DUCC job files, I have to specify the following: > driver_descriptor_CR > ex. org.apache.uima.ducc.sampleapps.DuccJobTextCR > process_descriptor_CM > ex. org.apache.uima.ducc.sampleapps.DuccTextCM > process_descriptor_AE > ex. ${OPENNLP_HOME}/desc/OpenNlpTextAnalyzer.xml > process_descriptor_CC > ex. org.apache.uima.ducc.sampleapps.DuccCasCC > > Does cTAKES have its own CR, CM, AE and CC descriptors? > And if so, could somebody point out where can I find them in cTAKES > directory? > > Thanks, > Yi-Wen
Re: Can one pass UMLS username and password as API arguments?
Pete, System.setProperty()? Were you suggest we add an overloaded method?: ClinicalPipelineFactory.getFastPipeline(String user, String pw) {} It's not a bad suggestion- if you require it, feel free to create a Jira or even better a patch... --Pei On Mon, Oct 26, 2015 at 1:27 PM, Peter Szolovitswrote: > I know that, but was asking specifically whether there is a way for this info > to be passed in by a program that embeds cTakes, without having to set > environment variables or muck with the java command line. > >> On Oct 26, 2015, at 1:18 PM, Finan, Sean >> wrote: >> >> You should be able to use ctakes.umlsuser and ctakes.umlspw in the command >> line or as environment variables. If your shell requires, you can replace >> the dot with underscore: ctakes_umlsuser ctakes_umlspw >> >> Sean >> >> -Original Message- >> From: Peter Szolovits [mailto:p...@mit.edu] >> Sent: Monday, October 26, 2015 1:12 PM >> To: dev@ctakes.apache.org >> Subject: Can one pass UMLS username and password as API arguments? >> >> I am embedding cTakes as part of a larger (Java-based) processing program >> and would like to be able to pass the user’s UMLS username and password when >> setting up the cTakes API rather than embedding them in UIMA configuration >> files or having to give them as java vm arguments. E.g., at some place such >> as a call to ClinicalPipelineFactory.getFastPipeline()g. Is there a way to >> do this that I have not been able to find? Thank you. —Peter Szolovits >> >
Re: Can one pass UMLS username and password as API arguments?
Yes, I wasn’t sure if your application had a security restriction to store paw’s into the env var for any code to read. Anyhow, anyone should be able to create an Jira account at: https://issues.apache.org/jira/browse/CTAKES Pei Chen Wired Informatics <http://www.wiredinformatics.com> 265 Franklin St Ste 1702 Boston, MA 02110 tel: (617) 433-7544 pei.c...@wiredinformatics.com On Mon, Oct 26, 2015 at 2:06 PM, Peter Szolovits <p...@mit.edu> wrote: > Thanks, Sean and Pei. Sean’s suggestion to set the properties via > System.setProperty() works; I had forgotten that this was doable in Java. > I think the suggestion of an overloaded method is still a good idea, but I > also don’t remember how to create a Jira. —Pete > > > On Oct 26, 2015, at 1:44 PM, Pei Chen <chen...@apache.org> wrote: > > > > Pete, > > System.setProperty()? > > Were you suggest we add an overloaded method?: > > ClinicalPipelineFactory.getFastPipeline(String user, String pw) {} > > It's not a bad suggestion- if you require it, feel free to create a > > Jira or even better a patch... > > --Pei > > > > > > On Mon, Oct 26, 2015 at 1:27 PM, Peter Szolovits <p...@mit.edu> wrote: > >> I know that, but was asking specifically whether there is a way for > this info to be passed in by a program that embeds cTakes, without having > to set environment variables or muck with the java command line. > >> > >>> On Oct 26, 2015, at 1:18 PM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: > >>> > >>> You should be able to use ctakes.umlsuser and ctakes.umlspw in the > command line or as environment variables. If your shell requires, you can > replace the dot with underscore: ctakes_umlsuser ctakes_umlspw > >>> > >>> Sean > >>> > >>> -Original Message- > >>> From: Peter Szolovits [mailto:p...@mit.edu] > >>> Sent: Monday, October 26, 2015 1:12 PM > >>> To: dev@ctakes.apache.org > >>> Subject: Can one pass UMLS username and password as API arguments? > >>> > >>> I am embedding cTakes as part of a larger (Java-based) processing > program and would like to be able to pass the user’s UMLS username and > password when setting up the cTakes API rather than embedding them in UIMA > configuration files or having to give them as java vm arguments. E.g., at > some place such as a call to ClinicalPipelineFactory.getFastPipeline()g. > Is there a way to do this that I have not been able to find? Thank you. > —Peter Szolovits > >>> > >> > >
Re: cTakes - help please
Chris, Could you confirm if the paths were modified? if so, is it accurate? In particular, per error message: des -> desc? Malformed URL c:/apache-ctakes-3.2.2/*des*/ctakes-chunker/desc If not, would you mind including the descriptor xml's? --Pei On Mon, Oct 19, 2015 at 5:38 PM, Tonner, Chriswrote: > Hello: > > > > I work in the Department of Medicine at the University of California, San > Francisco. We are interested in cTakes to extract medical information from > our EMRs. > > > > I have been attempting to install and run cTakes, but have had some > problems. Would you be able to give us technical help to get cTakes up and > running. > > > > Following the instructions on the User Installation Guide. > https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+User+Install+Guide > > > > I have modified the two bat files c: bin\runctakesCVD.bat file and > CPE.bat… Example of modified code – is this correct? > > > > @REM set ctakes.umlsuser=[christonner], ctakes.umlspw=[123Start] > > @REM or add the properties > > @REM -Dctakes.umlsuser=[christonner] -Dctakes.umlspw=[123Start] > > > > These batch files run and the debugger opens. > > > > I can load some of the analysis engine, but get errors with the UMLS and > the negation AE. > > > > In short we are at a loss on how to troubleshoot this. Is there a more > extensive manual on how to install cTakes and how to use cTakes? > > > > Error Examples. > > > > SEVERE: Malformed URL > c:/apache-ctakes-3.2.2/des/ctakes-chunker/desc/AdjustNounPhraseToIncludeFollowingPPNP.xml > in import declaration. (Descriptor: > file:/C:/apache-ctakes-3.2.2/desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextFastUMLSProcessor.xml) > > org.apache.uima.resource.ResourceInitializationException: Malformed URL > c:/apache-ctakes-3.2.2/des/ctakes-chunker/desc/AdjustNounPhraseToIncludeFollowingPPNP.xml > in import declaration. > > (Descriptor: > file:/C:/apache-ctakes-3.2.2/desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextFastUMLSProcessor.xml) > > > > Thanks you, > > Chris Tonner > > University of California, San Francisco > > > > > > >
Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives
Thanks Azad. I submitted a Jira to infra to help us do the import (that way we will try and preserve the commit history). In the meantime, would you mind filling out the ICLA[1]. [Reminder: Let's keep it in sandbox and not release it until all of the 3rd party dependencies licenses have been verified.] [1] http://www.apache.org/licenses/#clas Thanks, Pei Pei Chen Wired Informatics <http://www.wiredinformatics.com> 265 Franklin St Ste 1702 Boston, MA 02110 tel: (617) 433-7544 pei.c...@wiredinformatics.com On Sun, Oct 11, 2015 at 3:51 PM, Azad Dehghan <azad.dehg...@gmail.com> wrote: > 1: Yes. Sorted. > 3: Code attached to the Jira. > > Azad > > On 8 October 2015 at 20:03, Chen, Pei <pei.c...@childrens.harvard.edu> > wrote: > > > This is great news! > > > What is the current status and procedure? Is there an explicit > > contribution to cTAKES? Is there an ICLA? What about the license of the > > sourceforge project? > > Jira has been opened to track this: > > https://issues.apache.org/jira/browse/CTAKES-384 > > > > 1) Azad, would you be willing to switch licenses? I believe it's > > currently GNU3 -> ASL 2.0? > > 2) Create a project/module in cTAKES sandbox for this > > 3) Export/Import sourceforge and attach the code to the Jira initially. > > One of the current cTAKES committers can commit it to the repo (Until > folks > > can commit directly to the ctakes repo directly going forward.) > > > > -Original Message- > > From: Peter Klügl [mailto:peter.klu...@averbis.com] > > Sent: Thursday, October 08, 2015 8:06 AM > > To: dev@ctakes.apache.org > > Subject: Re: Combining Knowledge- and Data-driven Methods for > > De-identification of Clinical Narratives > > > > Hi, > > > > I can offer my help here if required. > > > > I have experience in translating JAPE rules to UIMA Ruta and already > > worked with clinical notes, e.g., also concerning deidentification. > > > > The problem is that I can only invest a few hours in the next two weeks. > > I will have more time next month or even more next year. > > > > What is the current status and procedure? Is there an explicit > > contribution to cTAKES? Is there an ICLA? What about the license of the > > sourceforge project? > > > > Best, > > > > Peter > > > > Am 01.10.2015 um 16:20 schrieb Pei Chen: > > > Hi Azad, > > > This is awesome news. Thanks for adding in the code that was > > > referenced by the paper. I'll create a Jira to track we need to port > > > it over to UIMA/Ruta. > > > > > > In the meantime, the link is at: > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__sourceforge.net_p_ > > > > > > clinical-2Ddeid_code_ci_master_tree_=BQICaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WY=yjhqco4EH0XrR798kbkzfYcFQ8z8MR9UF8mMRSjKTH0=_k7AbwzkVrRwTrNC3LArZ5hQ5Q47eh06KCDla7UBugY= > > for those who may be interested in helping out... > > > > > > --Pei > > > > > > Hello Pei, > > > > > > I hope all is well. > > > > > > I have now uploaded the source code for cDeid > > > (https://urldefense.proofpoint.com/v2/url?u=http-3A__sourceforge.net_p > > > _clinical-2Ddeid_code_ci_master_tree_=BQICaQ=qS4goWBT7poplM69zy_3x > > > hKwEW14JZMSdioCoppxeFU=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WY > > > > > > =yjhqco4EH0XrR798kbkzfYcFQ8z8MR9UF8mMRSjKTH0=_k7AbwzkVrRwTrNC3LArZ5hQ5Q47eh06KCDla7UBugY= > > ) ; I have tried to make the code as portable and modular as possible > with > > some trade-off for performance. This should help with porting the code to > > cTAKES/UIMA. > > > > > > Once you let the community know I will try to get involved to help > > > with translating JAPE to RUTA, etc. > > > > > > Best, > > > Azad > > > > >
Re: Boston cTAKES Hackathon?
http://www.meetup.com/cTAKES/events/225926425/ has been set up... On Fri, Sep 18, 2015 at 11:46 AM, Pei Chen <chen...@apache.org> wrote: > Yes, we can plan for lightning talks and we can potential find some > Docker experts in the area to help. > I'm thinking over the next few weeks; any prefs on date/times? Once > we a rough idea, i'll move this thread over to meetup.com to avoid > spamming this list. > > --Pei > > On Wed, Sep 16, 2015 at 9:11 PM, John Green <hephaestus.stu...@gmail.com> > wrote: >> Im jealous! That sounds fun >> JTG >> >> >> On Wed, Sep 16, 2015 at 3:21 PM, Jay Vyas <jayunit100.apa...@gmail.com> >> wrote: >>> >>> Yes I'd love to. How about some lightning talks also to start the night >>> off? >>> I know Harvard is using ctakes for some stuff. >>> >>> >>> > On Sep 16, 2015, at 4:23 PM, Pei Chen <chen...@apache.org> wrote: >>> > >>> > Hi, >>> > I hope everyone had a great summer. I just wanted to resurrect the >>> > Docker integration idea. >>> > Anyone interested in joining a small hackathon with the single goal of >>> > deploying cTAKES in a docker container. >>> > One of the evenings 6pm? >>> > >>> > --Pei >> >>
Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives
Hi Azad, This is awesome news. Thanks for adding in the code that was referenced by the paper. I'll create a Jira to track we need to port it over to UIMA/Ruta. In the meantime, the link is at: http://sourceforge.net/p/clinical-deid/code/ci/master/tree/ for those who may be interested in helping out... --Pei Hello Pei, I hope all is well. I have now uploaded the source code for cDeid (http://sourceforge.net/p/clinical-deid/code/ci/master/tree/) ; I have tried to make the code as portable and modular as possible with some trade-off for performance. This should help with porting the code to cTAKES/UIMA. Once you let the community know I will try to get involved to help with translating JAPE to RUTA, etc. Best, Azad
Re: Boston cTAKES Hackathon?
Yes, we can plan for lightning talks and we can potential find some Docker experts in the area to help. I'm thinking over the next few weeks; any prefs on date/times? Once we a rough idea, i'll move this thread over to meetup.com to avoid spamming this list. --Pei On Wed, Sep 16, 2015 at 9:11 PM, John Green <hephaestus.stu...@gmail.com> wrote: > Im jealous! That sounds fun > JTG > > > On Wed, Sep 16, 2015 at 3:21 PM, Jay Vyas <jayunit100.apa...@gmail.com> > wrote: >> >> Yes I'd love to. How about some lightning talks also to start the night >> off? >> I know Harvard is using ctakes for some stuff. >> >> >> > On Sep 16, 2015, at 4:23 PM, Pei Chen <chen...@apache.org> wrote: >> > >> > Hi, >> > I hope everyone had a great summer. I just wanted to resurrect the >> > Docker integration idea. >> > Anyone interested in joining a small hackathon with the single goal of >> > deploying cTAKES in a docker container. >> > One of the evenings 6pm? >> > >> > --Pei > >
Re: CTAKES-377 : Upgrade to Java 8
+1 upgrading to Java 8; been using it unofficially locally. On Wed, Sep 16, 2015 at 1:37 PM, Finan, Seanwrote: > Can anybody out there think of a reason why we shouldn't upgrade to Java 8? > Please comment on Jira. > > https://issues.apache.org/jira/browse/CTAKES-377 > > Thanks, > Sean > >
Boston cTAKES Hackathon?
Hi, I hope everyone had a great summer. I just wanted to resurrect the Docker integration idea. Anyone interested in joining a small hackathon with the single goal of deploying cTAKES in a docker container. One of the evenings 6pm? --Pei
[DRAFT] [REPORT] Apache cTAKES Sep 2015
Feel free to edit/add. Report from the Apache cTAKES committee [Pei Chen] ## Description: Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing system for information extraction from electronic medical record clinical free-text. ## Activity: - There is interest from new contributor(s) in integrating cTAKES with Spark (CTAKES-374) - There is interest from new contributor(s) in integrating Gene Mappings to cTAKES (CTAKES-375) - There is interest from new contributors(s) on using cTAKES for deidentification - The committee is planning a local meetup (Boston) in the near future to integrate cTAKES with Docker for easier deployments. ## Issues: - There are no issues requiring board attention at this time ## LDAP committee group/Committership changes: - Currently 31 committers and 30 PMC members in the project. - Last PMC addition was Michelle Chen at Fri Jan 23 2015 - Last committer addition was Jim Gregoric at Sat Feb 28 2015 ## Releases: - 3.2.2 was released on May 30 2015 - 3.2.1 was released on Dec 10 2014 - 3.2.0 was released on Jul 23 2014 ## Mailing list activity: - dev@ctakes.apache.org: - 179 subscribers (up 17 in the last 3 months): - 181 emails sent to list (228 in previous quarter) - u...@ctakes.apache.org: - 155 subscribers (up 13 in the last 3 months): - 94 emails sent to list (95 in previous quarter) - notificati...@ctakes.apache.org: - 21 subscribers (up 0 in the last 3 months): - 77 emails sent to list (103 in previous quarter) ## JIRA activity: - 11 JIRA tickets created in the last 3 months - 7 JIRA tickets closed/resolved in the last 3 months
Re: Bug in resource import in LvgAnnotator?
Hi Jakob, Yes, there is currently a limitation in the LVG component to have their jar unpacked and added to the classpath. I think there is an outstanding Jira to enable it read it from a jar like the rest of the jars. (There are some limitations such as hsql/lucene not being able to read directly from a jar hence it's been outstanding.) --Pei On Fri, Aug 28, 2015 at 4:26 AM, Jakob Rogstadius jakob.rogstad...@who-umc.org wrote: Hi, On line 579 in org.apache.ctakes.lvg.ae.LvgAnnotator there is the following resource import: ExternalResourceFactory.createExternalResourceDescription( LvgCmdApiResourceImpl.class, new File(LvgCmdApiResourceImpl.class.getResource( /org/apache/ctakes/lvg/data/config/lvg.properties).toURI())) ); The .getResource() call breaks when the package is imported as a jar (from Maven central), with an error stating that the URI is not hierarchical. According to: http://stackoverflow.com/questions/18055189/why-my-uri-is-not-hierarchical the call should instead use .getResourceAsStream(). Is this a bug, or am I doing something wrong? I'm not very familiar with how Java handles resources in general. Jakob
Re: Error while running DrugMentionAnnotator.xml
Chandra, How are you wiring up the DrugMentionAnnotator? The default AggregatePlaintextFastUMLSProcessor.xml should already have the Drug NER included. http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextFastUMLSProcessor.xml In general, you'll most likely need to something like the below if you have a custom/modified pipeline: typeSystemDescription imports import name=org.apache.ctakes.drugner.types.TypeSystem/ /imports /typeSystemDescription Related Thread: http://mail-archives.apache.org/mod_mbox/ctakes-user/201403.mbox/%3CCAPqz87oUZ=hpzc_fo_zlaef3pvqcm9xsyums15iymgapsxx...@mail.gmail.com%3E If you're using uimaFIT to wire your pipeline together, I would highly recommend using the Automatic Type System Discovery. Hope that helps. On Sat, Aug 8, 2015 at 12:12 PM, RANGA CHANDRA GUDIVADA chandhragupta...@hotmail.com wrote: Hello All, I am getting the below error while trying to run the analysis engine DrugMentionAnnotator.xml from user install CVD. Please let me know if anyone has similar issues and were able to successfully fix it. Ctakes Version used : apache-ctakes-3.2.2 Ctakes Resources : ctakes-resources-3.2.1.1-bin Caused by: org.apache.uima.cas.CASRuntimeException: JCas type org.apache.ctakes.drugner.type.FrequencyAnnotation used in Java code, but was not declared in the XML type descriptor. at org.apache.uima.jcas.impl.JCasImpl.getType(JCasImpl.java:412) at org.apache.uima.jcas.impl.JCasImpl.getCasType(JCasImpl.java:436) at org.apache.uima.jcas.impl.JFSIndexRepositoryImpl.getAnnotationIndex(JFSIndexRepositoryImpl.java:80) at org.apache.ctakes.drugner.ae.DrugMentionAnnotator.removeAnnotations(DrugMentionAnnotator.java:306) at org.apache.ctakes.drugner.ae.DrugMentionAnnotator.removeDrugNerTypes(DrugMentionAnnotator.java:299) at org.apache.ctakes.drugner.ae.DrugMentionAnnotator.process(DrugMentionAnnotator.java:260) ... 45 more Thanks Chandra
Re: Role of white-box logic/models in cTAKES
Peter, Good to hear from you again! Yes, I believe there are some regex and rules based annotators that are in used (and probably the future for as long as it out performs other methods for certain tasks.) I don't think there is specific position form the community on this approach. (ASF's 'Do-acracy') Were you thinking of writing some Annotators in Ruta? --Pei On Wed, Aug 5, 2015 at 3:47 AM, Peter Klügl peter.klu...@averbis.com wrote: Hi, my (uninformed) view on cTAKES was that it is mainly based on black-box machine learning models. There were some mentions of rule-based approaches on the mailing list and a quick look in the source code revealed to me some functionality that is based on FSMs and regular expressions (and the grey area of rule logic implemented in plain java). I'm just curious. Is this code actively used in cTAKES and is there a general position of the cTAKES community on rules-based/white-box approaches? Best, Peter
Re: cTAKES GUI for I2B2
Sekhar, That application was done as a prototype/POC many years ago and hasn't been actively maintained (hence in sandbox). It seems from your screenshot that you have it up and running though. Would you mind attaching the log files as well? --Pei On Wed, Aug 5, 2015 at 4:41 AM, Hari, Sekhar sekhar.h...@cgi.com wrote: Hello Timothy - I have posted the screenshots here: https://drive.google.com/file/d/0B4sR85qs377yTThzWHM4YXlxOFE/view?usp=sharing Kindly advise as soon as possible. Many thanks, Sekhar H. -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Tuesday, August 04, 2015 4:14 PM To: dev@ctakes.apache.org Subject: RE: cTAKES GUI for I2B2 Can you post the screenshot somewhere it might be linked to? I don't know if we can post image attachments to the dev list. Thanks Tim From: Hari, Sekhar [sekhar.h...@cgi.com] Sent: Monday, August 03, 2015 10:35 PM To: chen...@apache.org; dev@ctakes.apache.org; u...@ctakes.apache.org Subject: RE: cTAKES GUI for I2B2 Hello there - Please, can somebody advise me on my question below? Thanks, Sekhar H. From: Hari, Sekhar Sent: 31 July 2015 15:02 To: chen...@apache.org Subject: cTAKES GUI for I2B2 Hello Pei - Can you please assist me. I am doing a few experiments using I2B2. There is a requirement for me to use cTAKES for reading clinical notes and to extract the key terminologies from the notes so it can be inserted into I2B2. I found cTAKES GUI for I2B2 and installed it. While trying to read a sample clinical note, though seemingly the pipelines run, I don't see any useful output in the Results section of the GUI. I have included a screenshot below. The Results page says language: x-unspecified. Not sure what is going wrong. Am I doing anything wrong or missing any configurations? Also, I have included the NLP processors that am using to do this read and extract. You can see this screenshot at the bottom of this email. Hope you can help. Many thanks, Sekhar H.
Fwd: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives
[+dev] Hi Azad, This is great news! Looking forward to it. --Pei On Thu, Jul 30, 2015 at 8:16 AM, Azad Dehghan azad.dehg...@gmail.com wrote: Hi Pei, Just to keep you in the loop: I am currently tailoring a version of the de-id tool for The Christie NHS Foundation Trust (UK)--this is due to be concluded end of August. So, I should have the re-factoring of the public version of the tool ready by mid-September for Apache / cTAKES. Looking forward to get this started! Best regards, Azad On 29 July 2015 at 17:27, Azad Dehghan azad.dehg...@gmail.com wrote: Pei, Yes, the tool is entirely written in Java. It is very light weight; specifically using the following external/3rd party components: ANNIE English Tokeniser, ANNIE Sentence Splitter, ANNIE Gazetteer (dictionary) and JAPE Transducer (rules) (see https://gate.ac.uk/gate/doc/plugins.html#ANNIE). I'll do my best to hurry up the refactoring of the code. I've just joined the dev mailing list. Azad
Re: xml org.apache.ctakes.core.analysis_engine.TokenizerAnnotator not found
Justin, Is this still an issue for you? I believe there was a known issue and someone submitted a patch: https://issues.apache.org/jira/browse/CTAKES-370?jql=component%20%3D%20ctakes-smoking-status%20AND%20project%20%3D%20CTAKES%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC [You probably don't want to use any references to the ctakes-smoking-status-res/target descriptors at least until the issue has been resolved] On Mon, Jul 27, 2015 at 6:04 PM, Justin Zhang justinzhang...@gmail.com wrote: Hello Everyone, Where is the file org.apache.ctakes.core.analysis_engine.TokenizerAnnotator the program is looking for? Any suggestions for trouble shooting? Where the file TokenizerAnnotator.xml should be in the path? and Where is the file online (to check if the local file is the one required?) error message: Caused by: org.apache.uima.resource.ResourceInitializationException: An import could not be resolved. No .xml file with name org.apache.ctakes.core.analysis_engine.TokenizerAnnotator was found in the class path or data path. (Descriptor: file:/Users/justin/App/eclipse_mars/workspace_eclipse_mars/ctakes/ctakes-smoking-status-res/target/classes/org/apache/ctakes/smokingstatus/analysis_engine/ProductionPostSentenceAggregate_step1.xml) at org.apache.ctakes.smokingstatus.ae.ClassifiableEntries.initialize(ClassifiableEntries.java:178) at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:250) -- Justin
Re: How to use cTakes as a UIMA component
Ralph, Could you describe a bit on you were using the UIMA framework? i.e. PEAR files, XML descriptors, and/or uimaFIT to programmatically wire the components together? I think the easiest would be to have your application pull the necessary ctakes components from maven central and use the Annotators as appropriate. Hope that helps- Pei On Sat, Jul 18, 2015 at 7:45 AM, ralph Lecessi rlece...@gmail.com wrote: Hello, I'm interested in building an application in the Eclipse IDE that uses the cTAKES library as a component in the UIMA apache framework. Is this possible? Could you point me to some documentation? Thank you, Ralph Lecessi rlece...@gmail.com (732)658-4778 .
Re: Mvn package error
Zhiwen, I think this unit test needs to be updated/fixed. Even though it runs fine in mvn compile test. In the interim- package needs to -DskipTests=true. The longer story is that once modules are packaged (i.e. lvg, dictionary) mvn loads them from the jars instead of unpacked resources. So essentially, the tests needs to have that packed in order to run the test. Or modify lucene/hsqldb to have a Jar Reader. On Mon, Jun 22, 2015 at 11:34 AM, Zhiwen Li l...@udel.edu wrote: Hi, I tried to compile the 3.2.3 version of Ctakes, got the following error. Tests in error: TestClearNLPPipeLine(org.apache.ctakes.dependency.parser.ae.util.TestClearNLPAnalysisEngines): URI is not hierarchical I realized this error was resolved before in this thread https://issues.apache.org/jira/browse/CTAKES-307 But the same error comes up since the svg-ctakes-resources-lvg2008 was added to the dependency in revision 1642706. If I removed the dependency and basically restored it to revision 1620359 https://svn.apache.org/viewvc/ctakes/trunk/ctakes-lvg-res/pom.xml?view=markuppathrev=1620359 , it compiles file. But I am not sure if this dependency is necessary or not. I don't know why this specific lvg version is required after revision 1642706. Please help to clarify. Thanks, Simon -- Zhiwen Li l...@udel.edu
Apache cTAKES hosted demos and examples
There seems to be a significant interest in having a hosted demo and examples, so I started this index page along with initial code examples: Index page: http://healthnlp.github.io/examples/ Live demo: http://52.24.118.198:8080/index.jsp --Pei
Re: PAD Term Spotter
Hi Christopher, The PAD Term Spotter hasn't been supported for over a year now [1]. It was mostly written with specialized rules and no one had been maintaining it. I am not sure if there are any generic diseases annotators; if you would be willing to contribute the changes, we can incorporate it. [1] http://mail-archives.apache.org/mod_mbox/ctakes-user/201402.mbox/%3C6e55ab$8ci...@ironport10.mayo.edu%3E On Mon, Jun 8, 2015 at 5:30 PM, Christopher Baechle cbaec...@my.fau.edu wrote: A project I inherited uses cTAKES 2.5 and modified the PAD Term Spotter to detect the presence of the disease we're interested in. After searching the archives, it looks like the PAD term spotter was removed, but I couldn't find info as to why. I would like to migrate our code to the latest version of cTAKES. I could just create an annotator and port the code and put it in the pipeline, but it looks like cTAKES has had many enhancements since 2.5. I wasn't sure if a more generic disease annotator was put in place and disease specific annotators are not a good route.
Re: [DRAFT] [REPORT] Apache cTAKES Jun 2015
I am not aware of any currently, but I think it would make a great contribution though. --Pei On Fri, Jun 5, 2015 at 2:43 AM, Soumya Shree soumya.sh...@citiustech.com wrote: Hi Folks, I am working with Ctakes Wherein I am looking to search reason for discontinuation from the clinical notes. Does Ctakes offer any API or concept with which we can find the same. Example- Mr. xxx have been recommended not to take Dilantin due to its side effects. In this case we need to train machine so that it tells us that the medicine was discontinued and the reason was its side effects. I appreciate if I can get small help also. Thanks Regards, Soumya Shree -Original Message- From: Pei Chen [mailto:chen...@apache.org] Sent: Friday, June 05, 2015 2:32 AM To: dev@ctakes.apache.org Subject: [DRAFT] [REPORT] Apache cTAKES Jun 2015 [DRAFT- Feel free to add/edit] --- Report from the Apache cTAKES project [Pei Chen] ## Description: Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing system for information extraction from electronic medical record clinical free-text. ## Activity: A talk about cTAKES using Spark/BigTop on processing Twitter data was well received at ApacheCon NA 2015. The committee just released ctakes-3.2.2 on May 30, 2015 and contains a critical patch caused by a change by a 3rd party (NLM) validation service. Full Release notes: http://s.apache.org/ctakes-3.2.2-release-notes The committee is actively working on the next release cTAKES with new temporal components/models and various bug fixes in Jira. The committee is planning a local meetup (Boston) in the near future to integrate cTAKES with Docker for easier deployments. ## Issues: There are no issues requiring board attention at this time ## PMC/Committership changes: - Currently 31 committers and 30 PMC members in the project. - Last PMC addition was Michelle Chen at Fri Jan 23 2015 - Last committer addition was Jim Gregoric at Sat Feb 28 2015 ## Releases: - 3.2.2 was released on May 30 2015 - 3.2.1 was released on Dec 10 2014 - 3.2.0 was released on Jul 23 2014 ## Mailing list activity: - dev@ctakes.apache.org: - 160 subscribers (up 15 in the last 3 months): - 227 emails sent to list (211 in previous quarter) - u...@ctakes.apache.org: - 140 subscribers (up 14 in the last 3 months): - 93 emails sent to list (41 in previous quarter) ## JIRA activity: - 14 JIRA tickets created in the last 3 months - 8 JIRA tickets closed/resolved in the last 3 months === DISCLAIMER: The information contained in this message (including any attachments) is confidential and may be privileged. If you have received it by mistake please notify the sender by return e-mail and permanently delete this message and any attachments from your system. Any dissemination, use, review, distribution, printing or copying of this message in whole or in part is strictly prohibited. Please note that e-mails are susceptible to change. CitiusTech shall not be liable for the improper or incomplete transmission of the information contained in this communication nor for any delay in its receipt or damage to your system. CitiusTech does not guarantee that the integrity of this communication has been maintained or that this communication is free of viruses, interceptions or interferences.
[DRAFT] [REPORT] Apache cTAKES Jun 2015
[DRAFT- Feel free to add/edit] --- Report from the Apache cTAKES project [Pei Chen] ## Description: Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing system for information extraction from electronic medical record clinical free-text. ## Activity: A talk about cTAKES using Spark/BigTop on processing Twitter data was well received at ApacheCon NA 2015. The committee just released ctakes-3.2.2 on May 30, 2015 and contains a critical patch caused by a change by a 3rd party (NLM) validation service. Full Release notes: http://s.apache.org/ctakes-3.2.2-release-notes The committee is actively working on the next release cTAKES with new temporal components/models and various bug fixes in Jira. The committee is planning a local meetup (Boston) in the near future to integrate cTAKES with Docker for easier deployments. ## Issues: There are no issues requiring board attention at this time ## PMC/Committership changes: - Currently 31 committers and 30 PMC members in the project. - Last PMC addition was Michelle Chen at Fri Jan 23 2015 - Last committer addition was Jim Gregoric at Sat Feb 28 2015 ## Releases: - 3.2.2 was released on May 30 2015 - 3.2.1 was released on Dec 10 2014 - 3.2.0 was released on Jul 23 2014 ## Mailing list activity: - dev@ctakes.apache.org: - 160 subscribers (up 15 in the last 3 months): - 227 emails sent to list (211 in previous quarter) - u...@ctakes.apache.org: - 140 subscribers (up 14 in the last 3 months): - 93 emails sent to list (41 in previous quarter) ## JIRA activity: - 14 JIRA tickets created in the last 3 months - 8 JIRA tickets closed/resolved in the last 3 months
[ANNOUNCE] Apache cTAKES 3.2.2 released
The Apache cTAKES team is pleased to announce the availability of the 3.2.2 release. For the complete release notes, please visit http://s.apache.org/ctakes-3.2.2-release-notes Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing system for information extraction from electronic medical record clinical free-text. The release can be downloaded from http://ctakes.apache.org/downloads.cgi For further information, please visit the project website at http://ctakes.apache.org/ -- The Apache cTAKES Team
[RESULT] [VOTE] Release Apache cTAKES 3.2.2 (rc2)
More than 72 hours has passed. The vote for Apache cTAKES 3.2.2 (rc2) *passes* [1] with 5 +1 votes (4 binding) +1 (binding) Pei Chen Tim Miller Kim Ebert Jay Vyas Michal Iglewski There were no -1 or +0 votes cast. I will be publishing the release, then will announce the release as soon as artifacts will be available. Thanks to everyone who participated! -- Pei On Wed, May 13, 2015 at 10:37 AM, Pei Chen chen...@apache.org wrote: This is a call for a vote on releasing the following candidate (rc2) as Apache cTAKES 3.2.2. The major change since rc1 was to include the fix for CTAKES-359 - UMLS Authentication failing despite correct username and password. For more detailed information on the changes/release notes, please visit: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621version=12328717 The release was made using the cTAKES release process documented here: http://svn.apache.org/repos/asf/ctakes/site/backup/content/ctakes-release-guide.mdtext The candidate is available at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz /.zip The tag to be voted on: http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.2.2-rc2 The MD5 checksum of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz.md5 /.zip.md5 The signature of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz.asc /.zip.asc Apache cTAKES' KEYS file, containing the PGP keys used to sign the release: https://dist.apache.org/repos/dist/release/ctakes/KEYS Please vote on releasing these packages as Apache cTAKES 3.2.2. The vote is open for at least the next 72 hours. The vote passes if at least three binding +1 votes are cast. [ ] +1 Release the packages as Apache cTAKES 3.2.2 [ ] -1 Do not release the packages because... Also, the convenience binary can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-bin.tar.gz.md5 /.zip
Re: [VOTE] Release Apache cTAKES 3.2.2 (rc2)
] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 2:36:05.908s [INFO] Finished at: Mon May 18 18:04:53 EDT 2015 [INFO] Final Memory: 72M/244M [INFO] I think there’s a long outstanding issue where you would need to –DskipTests=true during package/install phase because that unit test can’t read lvg from a resource jar... I think that issue is still outstanding; not sure if folks would like to address it for this particular patch release. --Pei *From:* Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com kim.eb...@perfectsearchcorp.com] *Sent:* Monday, May 18, 2015 2:59 PM *To:* dev@ctakes.apache.org *Subject:* Re: [VOTE] Release Apache cTAKES 3.2.2 (rc2) [ ] -1 Do not release the packages because... Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.83 sec FAILURE! Results : Tests in error: TestClearNLPPipeLine(org.apache.ctakes.dependency.parser.ae.util.TestClearNLPAnalysisEngines): URI is not hierarchical [image: IMAT Solutions] https://urldefense.proofpoint.com/v2/url?u=http-3A__imatsolutions.comd=BQMDaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WYm=7L5MUxlkhMYDS-BUpfA2NU3vIPZgSwZqSyMm4dfICQgs=SLBMkQmi7n1iTY_1eb1WRhe2PmFhbT9yh51nijDPvyIe= *Kim Ebert* Software Engineer [image: Office:]208.971.1509 kim.eb...@imatsolutions.com greg.hub...@imatsolutions.com On 05/13/2015 08:37 AM, Pei Chen wrote: This is a call for a vote on releasing the following candidate (rc2) as Apache cTAKES 3.2.2. The major change since rc1 was to include the fix for CTAKES-359 - UMLS Authentication failing despite correct username and password. For more detailed information on the changes/release notes, please visit: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621version=12328717 https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_secure_ReleaseNote.jspa-3FprojectId-3D12313621-26version-3D12328717d=BQMDaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WYm=7L5MUxlkhMYDS-BUpfA2NU3vIPZgSwZqSyMm4dfICQgs=0q8GepmP-AIzXsYOFAKzAJu4QaOaIPtE9ViFGYf5AdEe= The release was made using the cTAKES release process documented here: http://svn.apache.org/repos/asf/ctakes/site/backup/content/ctakes-release-guide.mdtext https://urldefense.proofpoint.com/v2/url?u=http-3A__svn.apache.org_repos_asf_ctakes_site_backup_content_ctakes-2Drelease-2Dguide.mdtextd=BQMDaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WYm=7L5MUxlkhMYDS-BUpfA2NU3vIPZgSwZqSyMm4dfICQgs=cNTzigd32BmHzDhNb0pc7_Pky08MEtMVhqTZwpuLP1Ee= The candidate is available at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz https://urldefense.proofpoint.com/v2/url?u=https-3A__dist.apache.org_repos_dist_dev_ctakes_ctakes-2D3.2.2-2Drc2_apache-2Dctakes-2D3.2.2-2Dsrc.tar.gzd=BQMDaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WYm=7L5MUxlkhMYDS-BUpfA2NU3vIPZgSwZqSyMm4dfICQgs=Xh_RsG-SLQfGIK9Mm5Wikv06-ntVracmEF0nTR4YHUIe= /.zip The tag to be voted on: http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.2.2-rc2 https://urldefense.proofpoint.com/v2/url?u=http-3A__svn.apache.org_repos_asf_ctakes_tags_ctakes-2D3.2.2-2Drc2d=BQMDaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WYm=7L5MUxlkhMYDS-BUpfA2NU3vIPZgSwZqSyMm4dfICQgs=PYB1Ysr91TLwULgR4JFUX7HX9WhwWGLzsIxBHsLyHgse= The MD5 checksum of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz.md5 https://urldefense.proofpoint.com/v2/url?u=https-3A__dist.apache.org_repos_dist_dev_ctakes_ctakes-2D3.2.2-2Drc2_apache-2Dctakes-2D3.2.2-2Dsrc.tar.gz.md5d=BQMDaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WYm=7L5MUxlkhMYDS-BUpfA2NU3vIPZgSwZqSyMm4dfICQgs=MVx9tZTThRbARCWwhF6H8gGzoq2Lxt0YNIHegaVudQYe= /.zip.md5 The signature of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz.asc https://urldefense.proofpoint.com/v2/url?u=https-3A__dist.apache.org_repos_dist_dev_ctakes_ctakes-2D3.2.2-2Drc2_apache-2Dctakes-2D3.2.2-2Dsrc.tar.gz.ascd=BQMDaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WYm=7L5MUxlkhMYDS-BUpfA2NU3vIPZgSwZqSyMm4dfICQgs=2Q67800RBncc6aNG4pYiWEoEUb651gpFoOXXMlNHJLMe= /.zip.asc Apache cTAKES' KEYS file, containing the PGP keys used to sign the release: https://dist.apache.org/repos/dist/release/ctakes/KEYS https://urldefense.proofpoint.com/v2/url?u=https
Re: CTAKES mirroring on github.
One of the visions behind the *-res projects was to separate out the resources from code. In theory, one can filter out all *-res projects from their git repo and pull in any version of the resources from maven central... I won't have enough bandwidth at the moment to try it out or work on the git piece though... --Pei On Thu, May 14, 2015 at 1:56 PM, Kim Ebert kim.eb...@perfectsearchcorp.com wrote: I've done some investigation into using / working with the git repo for cTAKES, and I found that it is a huge. It doesn't work well with GitHub either, as I keep running into timeouts. I would like to make the suggest that we remove two cTAKES build files and the ctakes-gui-0.0.1.zip file. This takes the repo from about 8 GB down to 1.8 GB. It is likely that the reason the git mirror is failing is due to the large size of the repo. GitHub will also filter out some of these vary large files, as GitHub's max file size is 100MB. git filter-branch --tree-filter 'rm -rf ctakes-gui-0.0.1.zip' origin/cTAKES-GUI-0.0.1 git filter-branch -f --tree-filter 'rm -rf _cTAKES_build_/cTAKES-2.5*.zip' origin/maven-sandbox git filter-branch -f --tree-filter 'rm -rf _cTAKES_build_/cTAKES-2.5*.zip' origin/SHARPn-cTAKES # Clean out unreferenced objects from repo git -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 \ -c gc.rerereunresolved=0 -c gc.pruneExpire=now gc It may also be helpful to remove ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/clearparser_models.jar from the git repo as well. (238,248,287 bytes) Thoughts? [image: IMAT Solutions] http://imatsolutions.com Kim Ebert Software Engineer [image: Office:] 208.971.1509 kim.eb...@imatsolutions.com greg.hub...@imatsolutions.com On 05/06/2015 01:17 PM, Steven Bethard wrote: Yes, I ping this issue every couple months, but no luck so far. (They take a look each time I ask, but haven't yet pushed a working git mirror for us.) Steve On Tue, May 5, 2015 at 12:09 PM, Kim Ebertkim.eb...@perfectsearchcorp.com kim.eb...@perfectsearchcorp.com wrote: Ah, looks like the issue is still being looked into. https://issues.apache.org/jira/browse/INFRA-8553 On Mon, May 4, 2015 at 4:54 PM, jay vyas jayunit100.apa...@gmail.com jayunit100.apa...@gmail.com wrote: Thanks kim. Can you file an infra issue ? they will look into it. I filed one originally On May 4, 2015 6:32 PM, Kim Ebert kim.eb...@perfectsearchcorp.com kim.eb...@perfectsearchcorp.com wrote: It looks like the github hasn't been updated in a while. Any reason? Thanks, Kim On Tue, Feb 17, 2015 at 10:36 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Our request is for a read-only mirror. However, if it ever becomes i/o, I don't know if this will have what you want, but http://git.apache.org/ Links to documentation (mostly server setup)http://www.apache.org/dev/git.html and a wiki (check toward middle and bottom for committer info) https://wiki.apache.org/general/GitAtApache -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu timothy.mil...@childrens.harvard.edu] Sent: Tuesday, February 17, 2015 12:31 PM To: dev@ctakes.apache.org Subject: Re: CTAKES mirroring on github. Is there any existing resource to help people who want to use git understand the right workflow to contribute to ctakes? (i.e. how this interacts with svn repos). Tim On 02/17/2015 12:23 PM, jay vyas wrote: Hi CTakes. Looks like infra finally got onto the JIRA i made for this a while back. They are currently working on fixing a couple of minor glitches w/ the mirroring (not showing all commits)... but there now is a mirror for CTakes on github. https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache _ctakesd=BQIBaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=Heup- IbsIg9Q1TPOylpP9FE4GTK-OqdTDRRNQXipowRLRjx0ibQrHEo8uYx6674hm=4sEI9mOp kTz6K-DjmNU1s8Do1TGA0_10HqJcowKpDxcs=fNVbyXzpBLSAG6-DIjBZ1vbMp0JGaX90 Lcdzg_EFVvMe=
Re: UMLS Authentication failing despite correct username and password
Michal, Thanks for pointing that out (It would have been nice if they sent out a notice about the change in the API call). Would be great if someone could open a Jira and verify this fix solves the issue... I think we should push out this critical patch asap- I can include it in 3.2.2 and create another RC2. On Mon, May 11, 2015 at 11:25 PM, michal.iglew...@uqo.ca wrote: Hi Pedro and Sean, It seems to me that the service https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser returns now ?xml version='1.0' encoding='UTF-8'?Resulttrue/Result instead of Resulttrue/Result. It means that the line result = line.trim().equalsIgnoreCase(Resulttrue/Result); in isValidUMLSUser() should be replaced with result = line.trim().equalsIgnoreCase(?xml version='1.0' encoding='UTF-8'?Resulttrue/Result); Michal -Message d'origine- De : Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Envoyé : May-11-15 5:41 PM À : dev@ctakes.apache.org Objet : RE: UMLS Authentication failing despite correct username and password Argh. Our email server may have mucked with the url that I pasted: H t t p s : / / uts - ws . nlm . nih . gov / restful / isValidUMLSUser property key=umlsUrl value= INSERT URL HERE, NO SPACES / -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Monday, May 11, 2015 5:38 PM To: dev@ctakes.apache.org Subject: RE: UMLS Authentication failing despite correct username and password Hi Pedro, Check the cTakesHsql.xml and make sure that the line matches: property key=umlsUrl value= https://urldefense.proofpoint.com/v2/url?u=https-3A__uts-2Dws.nlm.nih.gov_restful_isValidUMLSUserd=BQIGaQc=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFUr=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTaom=bSJDuEveKkCQoYKfh2CwhxDx8I92siVZvxm45BoxGtEs=A5wwcyQgQrPQ_dWwnaF-QHqZb0ttus_rzS-A6UDh-S8e= / In an older version of cTAKES with an output message as you have: 11 May 2015 15:59:47 INFO AbstractJCasTermAnnotator - Default - Loading dictionary into memory. Initial run may take few mins to load. Please be patient... That line got corrupted. Sean -Original Message- From: Pedro Teixeira [mailto:teixeir...@gmail.com] Sent: Monday, May 11, 2015 5:30 PM To: dev@ctakes.apache.org Subject: UMLS Authentication failing despite correct username and password So I've checked the Dictionary lookup XML file and that password works to log in via the website. This was also working last week but stopped at some point over the last week. I've got cTAKES running on a linux system so I can index batches of documents via a script. The exact error is as follows (with the username/password blocked out). 11 May 2015 15:59:26 INFO LvgCmdApiResourceImpl - cwd = /home/PT/cTAKES/apache-ctakes-3.2.1 11 May 2015 15:59:26 INFO LvgCmdApiResourceImpl - cd /home/PT/cTAKES/apache-ctakes-3.2.1/resources/org/apache/ctakes/lvg/ 11 May 2015 15:59:27 INFO LvgCmdApiResourceImpl - cd /home/PT/cTAKES/apache-ctakes-3.2.1 11 May 2015 15:59:27 INFO ClearNLPDependencyParserAE - using Morphy analysis? true Loading configuration. Loading feature templates. Loading lexica. Loading model: 11 May 2015 15:59:42 INFO Chunker - Chunker model file: org/apache/ctakes/chunker/models/chunker-model.zip 11 May 2015 15:59:44 INFO ContextDependentTokenizerAnnotator - Finite state machines loaded. 11 May 2015 15:59:44 INFO ConstituencyParser - Initializing parser... 11 May 2015 15:59:46 INFO ContextAnnotator - SCOPE ORDER: [1, 3] 11 May 2015 15:59:46 INFO NegationContextAnalyzer - initBoundaryData() called for ContextInitializer 11 May 2015 15:59:47 INFO POSTagger - POS tagger model file: org/apache/ctakes/postagger/models/mayo-pos.zip 11 May 2015 15:59:47 INFO AbstractJCasTermAnnotator - Default - Loading dictionary into memory. Initial run may take few mins to load. Please be patient... 11 May 2015 15:59:47 INFO AbstractJCasTermAnnotator - Using dictionary lookup window type: org.apache.ctakes.typesystem.type.textspan.Sentence 11 May 2015 15:59:47 INFO AbstractJCasTermAnnotator - Exclusion tagset loaded: CC CD DT EX IN LS MD PDT POS PP PP$ PRP PRP$ RP TO VB VBD VBG VBN VBP VBZ WDT WP WPS WRB 11 May 2015 15:59:47 INFO AbstractJCasTermAnnotator - Using minimum term text span: 3 11 May 2015 15:59:47 INFO DictionaryDescriptorParser - Parsing dictionary specifications: /home/PT/cTAKES/apache-ctakes-3.2.1/resources/org/apache/ctakes/dictionary/lookup/fast/cTakesHsql.xml 11 May 2015 15:59:48 ERROR UmlsUserApprover - UMLS Account at
[VOTE] Release Apache cTAKES 3.2.2 (rc1)
This is a call for a vote on releasing the following candidate (rc1) as Apache cTAKES 3.2.2. The major changes include: - Improved optional Temporal models (Time + Event Relationships models now available) - Other bug fixes/enhancements from Jira (see release notes Jira link below). I manually downloaded the bin as well as resources and tried the CVD with the AggregatePlaintextFastUMLSProcessor.xml and CPE testing the AggregateCdaProcessor. Would be great if folks have time to test/verify especially if you opened any of the Jira's below to ensure the bugs have been fixed/integrated. For more detailed information on the changes/release notes, please visit: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621version=12328717 The release was made using the cTAKES release process documented here: http://svn.apache.org/repos/asf/ctakes/site/backup/content/ctakes-release-guide.mdtext The candidate is available at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc1/apache-ctakes-3.2.2-src.tar.gz /.zip The tag to be voted on: http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.2.2-rc1 The MD5 checksum of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc1/apache-ctakes-3.2.2-src.tar.gz.md5 /.zip.md5 The signature of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc1/apache-ctakes-3.2.2-src.tar.gz.asc /.zip.asc Apache cTAKES' KEYS file, containing the PGP keys used to sign the release: https://dist.apache.org/repos/dist/release/ctakes/KEYS Please vote on releasing these packages as Apache cTAKES 3.2.2. The vote is open for at least the next 72 hours. The vote passes if at least three binding +1 votes are cast. [ ] +1 Release the packages as Apache cTAKES 3.2.2 [ ] -1 Do not release the packages because... Also, the convenience binary can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc1/apache-ctakes-3.2.2-bin.tar.gz.md5 /.zip Thanks!
Re: Command-line tool for cTAKES
If you already have the CPE running, you can pass the descriptor to the command line: *org.apache.ctakes.ytex.tools.RunCPE or * *org.apache.ctakes.core.cpe.CmdLineCpeRunner or* *org.apache.uima.examples.cpe.SimpleRunCPE http://mail-archives.apache.org/mod_mbox/ctakes-dev/201504.mbox/%3ccapqz87qzxm-qmfww0cl+b9b4cfo+wsdg57bq7f54cr8keu5...@mail.gmail.com%3e If you need it programmatically, check out a thread Tim started: http://mail-archives.apache.org/mod_mbox/ctakes-dev/201503.mbox/%3ce084d8efe2b03a408b324458c5212e9434c10...@chexmbx3a.chboston.org%3e Hope that helps... --Pei On Thu, Apr 30, 2015 at 1:24 AM, Yingcheng Sun yxs...@case.edu wrote: I also have this problem. Hope anybody can offer some examples or tools easily used for programming. Yingcheng On Thu, Apr 30, 2015 at 1:06 AM, Giuseppe Totaro totarope...@gmail.com wrote: Hi all, I am a newbie with cTAKES. I am working on developing an application that relies on cTAKES. I already did some experiments using CVD and CPE tools. I am just wondering if there is any command line tool that I can use to perform an analysis on plain text and then generate the annotated output. Thanks a lot, Giuseppe
Re: Image to text conversion
Sekhar, There are a few open Jira's: I think it would be a great contribution if you get this to work: - CTAKES-189 https://issues.apache.org/jira/browse/CTAKES-189 GSoC: Implement OCR/Tika to standardize text input for cTAKES - - CTAKES-105 https://issues.apache.org/jira/browse/CTAKES-105 Add Apache Tika integration On Thu, Apr 30, 2015 at 1:21 AM, Hari, Sekhar sekhar.h...@cgi.com wrote: Thanks. Let me try this, and will let you know for any help if required. Cheers, Sekhar H. -Original Message- From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov] Sent: Thursday, April 30, 2015 10:44 AM To: dev@ctakes.apache.org; u...@ctakes.apache.org Subject: Re: Image to text conversion What about using Apache Tika within cTAKES for this? Tika supports OCR through Tesseract: http://wiki.apache.org/tika/TikaOCR Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Hari, Sekhar sekhar.h...@cgi.com Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org Date: Wednesday, April 29, 2015 at 10:11 PM To: dev@ctakes.apache.org dev@ctakes.apache.org, u...@ctakes.apache.org u...@ctakes.apache.org Subject: Image to text conversion Hello All - I am looking for an OCR ability in cTAKES. The requirement is to convert scanned image documents (ex: scanned hand written prescriptions) into a text format. Then apply the usual NLP pipeline to convert the unstructured text to a structured data. Can cTAKES convert scanned image documents into a text? If so, please help me to understand this by sharing any documents or video. Many thanks, Sekhar H.
Re: Request for help:: NCBO Ontology Extraction Tool for i2b2
Sekhar, Is it happening to all of the ontologies you mentioned or just one? Those ontologies do not seem very big or deep. Did you notice in the logs if something in the ontology having some sort of circular reference or causing an infinite loop? I think lori from i2b2 may be better at answering this since this isn't exactly cTAKES related... --Pei On Wed, Apr 22, 2015 at 7:21 AM, Hari, Sekhar sekhar.h...@cgi.com wrote: Hello there - Introducing myself: My name is Sekhar Hari, responsible for Bio-informatics products/ solutions in CGI, a Canadian company. In this capacity, I am also responsible for developing a software to identify potential adverse events and serious adverse events in healthcare settings. I have been trying to extract and process few Ontologies using the latest version of NCBO Ontology Extraction Tool to load into I2B2 but with no luck. I could extract the staging file, and can load this into the I2B2 staging table. However, when I run the edu.harvard.i2b2.ncbo.extraction.NCBOOntologyProcessAll program, it always fails with GCOverheadLimit. I tried by increasing the JVM memory to 8GB but no result. My hardware resource is limited at present, and I can't increase the JVM memory size beyond 8GB. As I have a demo for a large hospital coming up soon, in the interest of time, would you be kind enough to extract and process the following ontologies, and upload the final metadata file here? http://i2b2.bioontology.org/ Ontology IDs: 1. WHO-ART 2. OAE 3. SSE 4. OVAE The user-guide that I was following is attached. Many thanks in advance. Regards, Sekhar H.
Re: Include the smoking status detection in AggregatePlaintextFastUMLSProcessor.xml
If it works for you, I would keep it in there then. Leave the info in the Jira and we should double check the code that piece of negation is only used for the smoking status types. --Pei On Tue, Apr 21, 2015 at 1:04 PM, Tom Devel deve...@gmail.com wrote: After further testing, removing the nodeNegationAnnotator/node step in ProductionPostSentenceAggregate_step2_libsvm.xml (which I assume is the sub smoking desc xml you mean), the smoking status is not correctly classified anymore when negations are there, so this step does not look redundant to me. For example, He denied use of tobacco is then classified as CURRENT_SMOKER. If I leave this negation step in, it is correctly found as NON_SMOKER. I tried changing the order in which the smoking status nodes nodeSentenceAdjuster/node and nodeClassifiableEntriesAnnotator/node are run in the clinical pipeline, putting them directly after lvg or at the end of the flow does not change the observation above. However, you said that leaving the NegationAnnotator in could overwrite assertion values, how can this be prevented while keeping correct smoking status classifications? On Mon, Apr 20, 2015 at 2:02 PM, Chen, Pei pei.c...@childrens.harvard.edu wrote: Great. There is a redundant Negation step in one of final sub smoking desc xml's. Leave the Jira as a placeholder to clean up the smoking status desc's. Sent from my iPhone On Apr 20, 2015, at 1:11 PM, Tom Devel deve...@gmail.com wrote: Pei, I did what you recommended, I run a test input with this new pipeline and did a diff with the clinical pipeline without the smoking status on the two CAS files. It seems to do the trick, the Umls concept tags are still the same, and there is now a new tag for the smoking status annotation, great! Before I create the Jira item, what do you mean with removing the last NegEx? In AggregatePlaintextFastUMLSProcessor, the node of the NegationAnnotator is commented out: !-- nodeNegationAnnotator/node -- Did you mean this node? At the top of the file, there is an import for the NegationAnnotator: delegateAnalysisEngine key=NegationAnnotator, but it is not commented out and never run in the fixed flow. Am I correct that the negation detection in the clinical pipeline is now performed by PolarityCleartkAnalysisEngine? Thanks, Tom On Sat, Apr 18, 2015 at 12:53 AM, Pei Chen chen...@apache.org wrote: Tom, I would put it at the end of the pipeline (at a min, it should be behind sectionizer, sentence, tokenizer, lvg). I would remove ExternalBaseAggregateTAE as this simulates the sectionizer, sentence, tokenizer, lvg would would be redundant. I would also probably remove the last NegEx which could override the assertion values. Disclaimer: I did not test this yet. Feel free to open a Jira item if it works for you so it can be tracked. It seems kind of strange to have a descriptor xml define another xml descriptor to be loaded up via code again- I think this could be simplified. --Pei On Thu, Apr 16, 2015 at 7:29 PM, Tom Devel deve...@gmail.com wrote: Hi, I am using the smoking status AE from SimulatedProdSmokingTAE.xml, it works fine, I can see the smoking status annotation in the CVD. Now I would like to include the smoking status detection in the clinical pipeline of AggregatePlaintextFastUMLSProcessor.xml, so that when I run the clinincal pipeline, the smoking status will also be determined. How can I do this? I am thinking to just put the nodes from the fixed flow of SimulatedProdSmokingTAE.xml into the fixed flow of AggregatePlaintextFastUMLSProcessor.xml, is this the right approach? If so, at which exact place in the clinical pipeline fixed flow should these nodes be added? Is there a preferred place (such as append after the last node or put before the first node) ? Can a wrong position or ordering of the smoking status nodes damage/corrupt the rest of the annotations? SimulatedProdSmokingTAE.xml contains these lines with the fixed flow: fixedFlow nodeExternalBaseAggregateTAE/node nodeSentenceAdjuster/node nodeClassifiableEntriesAnnotator/node /fixedFlow AggregatePlaintextFastUMLSProcessor.xml (3.2.2 from SVN) contains this fixed flow: fixedFlow nodeSimpleSegmentAnnotator/node nodeSentenceDetectorAnnotator/node nodeTokenizerAnnotator/node nodeLvgAnnotator/node nodeContextDependentTokenizerAnnotator/node nodePOSTagger/node !-- nodeClearPOSTagger/node -- nodeChunker/node nodeAdjustNounPhraseToIncludeFollowingNP/node nodeAdjustNounPhraseToIncludeFollowingPPNP/node !--nodeLookupWindowAnnotator/node-- nodeDictionaryLookupAnnotatorDB/node nodeDrugNER/node nodeDependencyParser/node nodeSemanticRoleLabeler/node
Apache cTAKES Hackathon: Containers- Docker + Kubernetes?
Would folks be interested in joining a hackathon nearby Boston? Exact Time and place TBA. Goal: Get cTAKES to work with Docker and Kubernetes and have a working example in sandbox. Deploying cTAKES is not so straightforward and difficult to manage, let alone in a distributed environment. Containers are not extreme as a full VM images and may be just the right balance. Docker[1] and Kubernetes[2] seem to be the popular choices nowadays (ASL 2.0 licensed). [1] https://www.docker.com/whatisdocker/ [2] http://kubernetes.io --Pei
cTAKES @ ApacheCon 2015 next week
Just a reminder- Jay and I are planning to have a session (Tues) at Apache Con 2015 on using cTAKES in a Big Data context using Spark/Hadoop. If you happen to be there, feel free stop by the session. Or If you're in the neighborhood and want to meet up over coffee, feel free to drop us a note. --Pei
Re: Question about how to interpret Ctakes output
[+dev] Yu, Check out the type system: http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml Note: I believe what you really want is *org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation and not *org.apache.ctakes.assertion.medfacts.types.Concept (anything in the assertion.medfacts* is really an internal construct not intended to be used outside of the assertion module) *IdentifiedAnnotation.*ontologyConceptArr[] contains the array of *org.apache.ctakes.typesystem.type.refsem.OntologyConcept/UMLSConcept* On Wed, Apr 8, 2015 at 3:53 PM, Liang, Yu yu.li...@nyumc.org wrote: Dear Pei, Thank you for your previous help, I think I figure out how to run Ctakes by command line using “AggregateCdaUMLSProcessor.xml” Analysis Engine. But I am wondering is there any tutorial like how to interpret the xml results specifically the part that containing “ConceptText= For example: org.apache.ctakes.assertion.medfacts.types.Concept _indexed=2 _id= 34678 _ref_sofa=15 begin=202 end=215 conceptType=PROBLEM conceptText=Date of Birth externalId=0 originalEntityExternalId=13854 ”/ what does “_id” , “_ref_sofa”, “originalEntityExternalId”, etc. mean? Appreciate! Yu Liang CHIBI
Re: Running cTAKES via command line
Take one of the existing startup scripts such as http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-distribution/src/main/bin/runctakesCPE.sh Replace the main class org.apache.uima.tools.cpm.CpmFrame with - *org.apache.ctakes.ytex.tools.RunCPE or * *org.apache.ctakes.core.cpe.CmdLineCpeRunner or* *org.apache.uima.examples.cpe.SimpleRunCPM (requires uima examples jar)* On Fri, Apr 3, 2015 at 11:18 AM, Pedro Teixeira teixeir...@gmail.com wrote: Pei Chen chenpei@... writes: There were a couple of recent threads about this [1]. In particular search for: CmdLineCpeRunner.java and RunCPE.java [1] http://mail-archives.apache.org/mod_mbox/ctakes- dev/201502.mbox/%3cCAHnnHnZFde5MF6dDV6Y2R4jyYgua1a43SrdNZRsKJQWDDtiB8w- JsoAwUIsXosN+BqQ9rBEUg at public.gmane.org%3e We should probably add it to the wiki documentation or FAQ... On Thu, Apr 2, 2015 at 10:58 PM, Pedro Teixeira teixeira09@... wrote: John Green john.travis.green at ... writes: Hi! It depends on what you mean by run on the command line... Can you clarify the use case?Jg On Thu, Apr 2, 2015 at 5:42 PM, Pedro Teixeira teixeira09 at ... wrote: Hello, I've got an installation of cTAKES running but am unsure of how to run it via commandline only. I'd like to write a script to automate processing and skip the GUI. A quick search hasn't turned anything up. Any advice on how to do that? Will I have to dig into the code to do this? Thanks! I'd like to have as input a string/file/directory with files and then call cTAKES to process and output the XML result (picking one of the analysis engines/setting parameters upon initiation). I want to just run this without having to boot up the GUI and manually select everything and click run. Seems easiest to automate that if I can run it via command line and then just write scripts around that. Thanks! That looks like it'll do the trick although I'm having a little bit of trouble invoking it from the command line. I keep getting Cold not find or load main class errors. Unfortunately my Java is a bit rusty. I have the -cp argument in for all the paths but then when I go manually searching around I can't find the correct class. I'm assuming it's just contained in the .jar files but despite providing the /lib/* to -cp it doesn't seem to be finding it. Any advice? Perhaps an example command line invoking the CmdLineCpeRunner? Thanks so much for your help!
Re: Running cTAKES via command line
There were a couple of recent threads about this [1]. In particular search for: CmdLineCpeRunner.java and RunCPE.java [1] http://mail-archives.apache.org/mod_mbox/ctakes-dev/201502.mbox/%3ccahnnhnzfde5mf6ddv6y2r4jyygua1a43srdnzrskjqwddti...@mail.gmail.com%3e We should probably add it to the wiki documentation or FAQ... On Thu, Apr 2, 2015 at 10:58 PM, Pedro Teixeira teixeir...@gmail.com wrote: John Green john.travis.green@... writes: Hi! It depends on what you mean by run on the command line... Can you clarify the use case?Jg On Thu, Apr 2, 2015 at 5:42 PM, Pedro Teixeira teixeira09@... wrote: Hello, I've got an installation of cTAKES running but am unsure of how to run it via commandline only. I'd like to write a script to automate processing and skip the GUI. A quick search hasn't turned anything up. Any advice on how to do that? Will I have to dig into the code to do this? Thanks! I'd like to have as input a string/file/directory with files and then call cTAKES to process and output the XML result (picking one of the analysis engines/setting parameters upon initiation). I want to just run this without having to boot up the GUI and manually select everything and click run. Seems easiest to automate that if I can run it via command line and then just write scripts around that. Thanks!
Re: Ctakes Null Pointer Error for org.apache.ctakes.dependency.parser.util.DependencyUtility
Hoang, 3.0 was released a long time ago (02/2013). (according to the tag/history, it did't have the null fix until 6/2013 3.1?) http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.0.0-incubating/ctakes-dependency-parser/src/main/java/org/apache/ctakes/dependency/parser/util/DependencyUtility.java On Fri, Mar 27, 2015 at 1:46 PM, Pham, Hoang hp...@tuftsmedicalcenter.org wrote: To Timothy, Sorry, I misspell your name before. Thank You, Hoang Pham -Original Message- From: Pham, Hoang [mailto:hp...@tuftsmedicalcenter.org] Sent: Fri 3/27/2015 1:38 PM To: dev@ctakes.apache.org Subject: RE: Ctakes Null Pointer Error for org.apache.ctakes.dependency.parser.util.DependencyUtility To Tomothy, I have cTakes 3.0 install. I looked at the class and there is a null check but for some reason, it is not catching. Also, when I use the indexes that came with cTakes install, the program would parse without any errors. Thank You, Hoang Pham -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Fri 3/27/2015 9:38 AM To: dev@ctakes.apache.org Subject: Re: Ctakes Null Pointer Error for org.apache.ctakes.dependency.parser.util.DependencyUtility Hi Hoang, Can you let me know what version of cTAKES you're using? I looked in that location in trunk and found a null check, so it could be that it's a bug that's been fixed already. In the meantime, if you just want to see if your dictionary is working right, you could disable the SubjectAttributeClassifier which seems to be the annotator where this error is coming from. Some of the other attributes rely on dependency features as well so you might disable them temporarily as well. Tim On 03/27/2015 08:32 AM, Pham, Hoang wrote: Hi All, I am trying to add my own dictionary to cTakes. I have added a lucene index for the dictionary, but when the index is added, I would receive a null pointer exception for the org.apache.ctakes.dependency.parser.util.DependencyUtility class. The stack trace is: org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(407) SEVERE: Exception occurred org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed. at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:391) at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296) at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567) at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.init(ASB_impl.java:409) at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342) at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267) at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567) at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.init(ASB_impl.java:409) at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342) at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267) at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267) at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:229) at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:259) at Parsing.main(Parsing.java:49) Caused by: java.lang.NullPointerException at org.apache.ctakes.dependency.parser.util.DependencyUtility.getPath(DependencyUtility.java:263) at org.apache.ctakes.assertion.attributes.subject.SubjectAttributeClassifier.extract(SubjectAttributeClassifier.java:181) at org.apache.ctakes.assertion.attributes.features.SubjectFeaturesExtractor.extract(SubjectFeaturesExtractor.java:57) at org.apache.ctakes.assertion.attributes.features.SubjectFeaturesExtractor.extract(SubjectFeaturesExtractor.java:1) at org.apache.ctakes.assertion.medfacts.cleartk.AssertionCleartkAnalysisEngine.process(AssertionCleartkAnalysisEngine.java:475) at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48) at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375) ... 13 more Mar 27, 2015 8:29:54 AM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(275) SEVERE: Exception occurred
Re: Dependency Parser model data
Ephi, The ClearNLP models in the current cTAKES releases (since 3.1.0 [1]) should contain much more. They should contain at least MiPACQ and SHARP training data. Could you point us to the documentation so we can update it? I believe the break down was: - Clinical questions: 1,600 sentences, 30,138 tokens. - Medpedia articles: 2,796 sentences, 49,922 tokens. - MiPACQ clinical notes: 8,040 sentences, 107,663 tokens. - MiPACQ pathological notes: 1,225 sentences, 21,581 tokens. - Seattle group health clinical notes: 5,020 sentences, 61,124 tokens. - Seattle group health pathological notes: 2,294 sentences, 34,384 tokens. - SHARP clinical notes: 6,787 sentences, 94,205 tokens. - SHARP stratified: 4,316 sentences, 43,037 tokens. - SHARP stratified SGH: 4,963 sentences, 49,081 tokens. - TEMPREL clinical notes: 19,775 sentences, 266,979 tokens. - TEMPREL pathological notes: 4,335 sentences, 78,829 tokens. There are some discussions on appending/augmenting the existing annotated/training data[2]. I think the short answer is that there is currently no easy way short of having to sign DUA's from every single source institution. [1] http://svn.apache.org/r1465043 [2] http://mail-archives.apache.org/mod_mbox/ctakes-dev/201412.mbox/%3ce5a9fa5abbf1ca4085d4f0794852a51e24241...@chexmbx3a.chboston.org%3E On Sun, Mar 15, 2015 at 11:58 AM, Ephi eph...@gmail.com wrote: Hi - From the documentation, the data used to train the dep parser in cTAKES seems to be 1600 clinical questions (from the Mayo clinic?). Is there a way to retrieve this data in order to retrain the model (while adding on additional data) ? Thanks! Ephi
Re: cTakes setup
Mitch, -The dev@ and user@ mailing lists are archived and searchable; it is probably the best for searching archived discussions. -Could you clarify what you are trying to achieve or the issue that you are experiencing with the -Xmx? There are models and dictionaries that get loaded into memory- it's defaulted to 3gb to accommodate those. Increasing it may or may not improve performance; in fact it may even decrease it since it may cause more work on the GC. Also, include the version and the pipeline configuration that you are using and the group may have some suggestions. --Pei On Fri, Mar 13, 2015 at 5:01 PM, Fawcett, Mitch mfawc...@christianacare.org wrote: Hi, I'm brand new to cTakes but I'm really excited about experimenting with it and developing a Proof of Value demo for my colleagues. I have a couple of questions. 1) I see that runctakesCVD.bat sets a maximum heap space of 3 gigabytes. Is that a number that can/should be increased to improve performance? 2) Are there discussion threads archived somewhere where I can look for answers before asking questions? Thanks, Mitch Fawcett, MBA Senior Systems Analyst 13 Read's Way, Suite 202 New Castle, DE 19720 Voice 302 327-5192 mfawc...@christianacare.org
Re: Hello cTAKES Mailing List
Raymond, Probably a combination of UMLS *Consumer Health Vocabulary + Custom Dictionary (as Sean described) *may work for the use case*:* OAC CHV connects informal, common words and phrases about health to technical terms used by health care professionals. It includes jargon, slang, ambiguous, and misspelled words as used by consumers and health care professionals. Due to its nature, OAC CHV includes concepts that are not represented by other source vocabularies within the Metathesaurus. [1] http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/CHV/ On Sun, Feb 22, 2015 at 10:37 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Raymond, If you use the dictionary-fast module there exists an entry feeling bad with cui 557911 and cui 231218. There is also feel bad and feeling bad emotionally You will find horrible present pain but no other entry with horrible. You will not find any terms with awful and probably many other desired words. If you are really interested in slang crappy, lousy, etc. then they are definitely not present. What you can do is create a second dictionary. There are example custom dictionaries in -dictionary-lookup-fast-res/src/main/resources/org/apache/ctakes/dictionary/lookup/fast/example/bsv/ You should look at custom_cui_bsv.bsv if you want to specify term unique id codes and term text alone. If you want to add tui/group codes then look at custom_cui_tui_bsv.bsv - you will probably want to model your dictionary after this so that you can tag your terms with tuis for symptoms. You will want to imitate sections from the corresponding .xml file in that directory. Make a copy of cTakesHsql.xml (two dirs up) and add lines: dictionary nameCustomCuiRareWord/name implementationNameorg.apache.ctakes.dictionary.lookup2.BsvRareWordDictionary/implementationName properties property key=bsvPath value=org/apache/ctakes/dictionary/fast/example/custom_cui_tui_bsv.bsv/ /properties /dictionary And conceptFactory nameCustomCuiConcept/name implementationNameorg.apache.ctakes.dictionary.lookup2.concept.BsvConceptFactory/implementationName properties property key=bsvPath value=org/apache/ctakes/dictionary/fast/example/custom_cui_tui_bsv.bsv/ /properties /conceptFactory And dictionaryConceptPair nameCustomPair/name dictionaryNameCustomCuiRareWord/dictionaryName conceptFactoryNameCustomCuiConcept/conceptFactoryName /dictionaryConceptPair Then make sure that you point to your custom cTakesHsql.xml in dictionary-fast/desc/analysis_engine/UmlsLookupAnnotator.xml (or Overlap depending upon your use): nameDictionaryDescriptorFile/name description/ fileResourceSpecifier fileUrlfile:org/apache/ctakes/dictionary/lookup/fast/cTakesHsqlYourCopy.xml/fileUrl /fileResourceSpecifier You can also skip the UMLS dictionary altogether and just use your custom dictionary. If you do give this a try then let me know how it goes. If you need additional assistance let me know and I will help the best I can. Sean -Original Message- From: Raymond Li [mailto:ray...@bu.edu] Sent: Saturday, February 21, 2015 1:26 PM To: dev@ctakes.apache.org Subject: Hello cTAKES Mailing List Hello, my name is is Raymond Li and I am currently working on a team project involving cTAKES. The goal of our project would be to use cTAKES to analyze posts on social media (such as tweets, forum posts, public available data) in order to catch in real-time any adverse effects of prescribed drugs and do a public service of protecting people from harmful drugs. Aside from this introduction, I do have only one question to ask to proceed with this project: Is cTAKES capable of understanding slang words as symptoms. An example is if I were to say I took Crestor and feeling bad is there a way for cTAKES to recognize that Crestor had a negative effect? My team has not been able to isolate 'bad' as a negative effect as it is not a defined medical symptom, but it would be nice to figure out if such a solution exists, or if we would need to develop our own solution and how we could go around doing it. My team and I would appreciate any comments or assistance regarding our project and this current issue. Thank you and have a nice day! -- Sincerely, Raymond Li
Re: cTakes question
[+dev] Yu, Yes, you can run it from the command line in many ways. 1) You can write a Java class that does it for you. Similar to http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-examples/src/main/java/org/apache/ctakes/examples/pipelines/ExampleAggregatePipeline.java 2) Run the CPE (Collection Processing Engine) via command line. I would suggest running it once via the CPE GUI first, then save the configuration to be run via command line. I would suggest using [CollectionReaderHere] [ *AggregatePlaintextFastUMLSProcessor*.xml] [CasConsumerHere] instead of AggregateCdaUMLSProcessor for your CPE configuration. Example of running the CPE from command line: java copyyourargumentshere org.apache.uima.examples.cpe.SimpleRunCPE path_to_your_cpe.xml On Wed, Jan 21, 2015 at 1:25 PM, Liang, Yu yu.li...@nyumc.org wrote: Dear Pei, I have a question about cTakes. Our purpose is to use cTakes to identified UMLS concepts from our medical notes which are about 4000 notes in total and all are free-text. In order to put cTakes into our final pipeline, we want to find a way to run the notes and get the output by command line not GUI. Is there any way we can do that? I am not Java person, it is very hard to me to play around myself. I am going to use the “AggregateCdaUMLSProcessor” analysis engine . So if you could give me a clue, really appreciate that. I guess the command line will contain many parameters, like .jar files to be run, right? I am very confused where to find them, and if I find them, how to decide which to be put in the command line in our case , for example, we use AggregateCDAUMLSProcessor? Thanks again. Hope to hear from you very soon. Yu Liang
Re: question about CTAKES
[+dev] I think that's a current limitation in the new Polarity Classifier. It's ML based, so most likely 'Deny XYZ' or 'Negative for XYZ' is probably not in the training data. There are a couple of things I would suggest: 1) Post the questions/examples to dev@ctakes.apache.org - perhaps others may have some ideas 2) Open a Jira issue to track the issue 3) You can try to Revert back to either the old 'Assertion' module or the previous RegEx based 'NegationAnnotator' (currently commented out in the xml descriptor file.) [I am assuming you are using the latest trunk or 3.2.1 release. Hope that helps... --Pei On Wed, Dec 17, 2014 at 5:38 PM, Sisi Ma sophie.sisi...@gmail.com wrote: Hi Pei, I have a quick question about CTAKES. I am using AE “AggregatePlaintextUMLSProcessor.xml” and want to get some negation results by referring to polarity attribute. However, it turns out, for example “Negative for hepatitis”, is not negated. I think it is weird and I tried “No hepatitis”, “ Denies hepatitis” which return “polarity= -1”, but “Deny hepatitis.” returns “polarity=1”. Could you give me some clue that what is wrong? Thanks.
Re: UMLS Integration
Praveen, The error looks specific to UMLS metamorphosys rather than cTAKES. I am assuming you are trying to install UMLS locally from scratch rather than using the bundled cTAKES resources. Did you confirm that all of the files have been downloaded correctly per Metamorphosys instructions: http://www.nlm.nih.gov/research/umls/licensedcontent/umlsknowledgesources.html The error seems to be related to incomplete or corrupted zip files? Pei Chen Wired Informatics http://www.wiredinformatics.com 265 Franklin St Ste 1702 Boston, MA 02110 tel: (617) 433-7544 pei.c...@wiredinformatics.com -- Forwarded message -- From: Jay_Ram pandupraveen...@gmail.com Date: Tue, Dec 16, 2014 at 12:07 AM Subject: UMLS Integration To: dev@ctakes.apache.org Hi All, I downloaded UMLS resource, to use them offline by loading in mysql. I followed them which are mentioned to load data into mysql. But I am unable to do it show error Loading MetamorphoSys ... [Please be patient and wait for MetamorphoSys to begin] java.util.zip.ZipException: invalid LOC header (bad signature) at java.util.zip.ZipFile.read(Native Method) at java.util.zip.ZipFile.access$1400(Unknown Source) at java.util.zip.ZipFile$ZipFileInputStream.read(Unknown Source) at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(Unknown Source) at java.util.zip.InflaterInputStream.read(Unknown Source) at java.util.zip.InflaterInputStream.read(Unknown Source) at java.util.zip.CheckedInputStream.read(Unknown Source) at java.util.zip.GZIPInputStream.readUByte(Unknown Source) at java.util.zip.GZIPInputStream.readUShort(Unknown Source) at java.util.zip.GZIPInputStream.readHeader(Unknown Source) at java.util.zip.GZIPInputStream.init(Unknown Source) at gov.nih.nlm.umls.meta.io.RRFMetadataInputStream.openSourceFile(RRFMetadataInputStream.java:390) at gov.nih.nlm.umls.meta.io.RRFConceptInputStream.open(RRFConceptInputStream.java:175) at gov.nih.nlm.umls.meta.io.RRFMetathesaurusInputStream.open(RRFMetathesaurusInputStream.java:125) at gov.nih.nlm.umls.mmsys.io.RRFMetamorphoSysInputStream.open(RRFMetamorphoSysInputStream.java:629) at gov.nih.nlm.umls.mmsys.subset.gui.MetamorphoSysGUI.validateGUIConfigurables(MetamorphoSysGUI.java:1097) at gov.nih.nlm.umls.mmsys.subset.gui.BeginSubsetAction.actionPerformed(BeginSubsetAction.java:110) at javax.swing.AbstractButton.fireActionPerformed(Unknown Source) at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source) at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source) at javax.swing.DefaultButtonModel.setPressed(Unknown Source) at javax.swing.AbstractButton.doClick(Unknown Source) at javax.swing.plaf.basic.BasicMenuItemUI.doClick(Unknown Source) at javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(Unknown Source) at java.awt.AWTEventMulticaster.mouseReleased(Unknown Source) at java.awt.Component.processMouseEvent(Unknown Source) at javax.swing.JComponent.processMouseEvent(Unknown Source) at java.awt.Component.processEvent(Unknown Source) at java.awt.Container.processEvent(Unknown Source) at java.awt.Component.dispatchEventImpl(Unknown Source) at java.awt.Container.dispatchEventImpl(Unknown Source) at java.awt.Component.dispatchEvent(Unknown Source) at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source) at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source) at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source) at java.awt.Container.dispatchEventImpl(Unknown Source) at java.awt.Window.dispatchEventImpl(Unknown Source) at java.awt.Component.dispatchEvent(Unknown Source) at java.awt.EventQueue.dispatchEventImpl(Unknown Source) at java.awt.EventQueue.access$200(Unknown Source) at java.awt.EventQueue$3.run(Unknown Source) at java.awt.EventQueue$3.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source) at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source) at java.awt.EventQueue$4.run(Unknown Source) at java.awt.EventQueue$4.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source) at java.awt.EventQueue.dispatchEvent(Unknown Source) at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source
Re: revamping the Apache cTAKES website
the template was borrowed from spark... we should put in our own design/css/layout/skin to suit our needs. Perhaps Michelle or others familiar with bootstrap could help us out here? On Mon, Dec 15, 2014 at 7:32 PM, jay vyas jayunit100.apa...@gmail.com wrote: this is gorgeous ! Thanks pei ! i let the bigtop folks know as well ! On Mon, Dec 15, 2014 at 6:21 PM, Murali mmin...@gmail.com wrote: Looks great. +1 On Dec 15, 2014, at 4:29 PM, Chen, Pei pei.c...@childrens.harvard.edu wrote: Check out a mockup of a new website proposal: http://svn.apache.org/repos/asf/ctakes/site/new/index.html Based off bootstrap (Idea borrowed from the Spark folks..). Couple of key pieces of info: - 10% of visitors are on mobile/tablets - The most currently visited pages are: downloads.cgi, gettingstarted.html. I suggest we focus our attention on those 2 items. (Putting a Downloads link right on the front page, etc.) svn co http://svn.apache.org/repos/asf/ctakes/site/new if you want to checkout the code of the site. --Pei -Original Message- From: John Green [mailto:john.travis.gr...@gmail.com] Sent: Friday, December 05, 2014 6:34 PM To: dev@ctakes.apache.org Cc: dev@ctakes.apache.org Subject: RE: revamping the Apache cTAKES website I would like to second the bootstrap recommendation, with the additional recommendation of django for the backend. It is an amazing platform for rapid development and easy updating. JG — Sent from Mailbox On Fri, Dec 5, 2014 at 12:15 PM, Savova, Guergana guergana.sav...@childrens.harvard.edu wrote: There are now 4 volunteers: Michelle Chen Pei Chen Sean Finan Guergana Savova --Guergana -Original Message- From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu] Sent: Friday, December 05, 2014 11:56 AM To: dev@ctakes.apache.org Subject: RE: revamping the Apache cTAKES website Wonderful, thank you, Michelle! There will be a flurry of emails the week of Dec 15 followed by actual work, so book your calendar if possible... --Guergana -Original Message- From: Michelle Chen [mailto:michelle1919c...@gmail.com] Sent: Friday, December 05, 2014 11:48 AM To: dev@ctakes.apache.org Subject: Re: revamping the Apache cTAKES website Hello Guergana, I don't know that much about cTakes, but would be interested in contributing to the effort. I'm not sure if there is an interest in matching the website design of other Apache projects, but it seems that the two main designs that are being used from my arbitrary search on http://projects.apache.org/indexes/alpha.html is 1. the current design that cTakes is using and 2. a Bootstrap approach. I've done a little bit of work on Bootstrap and would be interested in helping with that. Let me know how I can be helpful. Sincerely, Michelle Chen :) Be strong and of good courage; do not be afraid, nor be dismayed, for the Lord your God is with you wherever you go. ~Joshua 1:9 On Fri, Dec 5, 2014 at 11:21 AM, Savova, Guergana guergana.sav...@childrens.harvard.edu wrote: cTAKES-ers, we would like to start working on updating the Apache cTAKES website - some of the information there is already stale and needs refreshing. Do you have ideas on website design, content, etc.? Would you like to contribute to the effort? We are planning to start working on the website the week of Dec 15. Cheers, --Guergana -- jay vyas
Re: Question about running cTakes, urgent!
Yu, There should be an attribute within any of the IdentifiedAnnotation(or Subclasses) called polarity. It's -1 if it's negated. For example: - *polarity* = -1 - [image: Inline image 1] On Fri, Dec 12, 2014 at 4:17 PM, Liang, Yu yu.li...@nyumc.org wrote: Last Question, thanks for your patience. Here is the result I run the AggregatePlaintextFastUMLSProcessor.xml by using the real medical note. But I cannot find the negation result. Yu Liang CHIBI On Dec 12, 2014, at 3:59 PM, Pei Chen chen...@apache.org wrote: Yes, Negation is handled by the new nodePolarityCleartkAnalysisEngine/node Within IdentifiedAnnotation, there should be a polarity() attribute that should be populated. On Fri, Dec 12, 2014 at 3:55 PM, Liang, Yu yu.li...@nyumc.org wrote: Thanks soo much!! Very Awesome ! I am not like a java person, so kind of totally lost. So I also see there includes NegationAnnotator, right? So don’t have to run NEcontext component?! Yu Liang CHIBI On Dec 12, 2014, at 3:51 PM, Pei Chen chen...@apache.org wrote: Hi Yu, That is correct. If you take a look at any of the 'Aggregate' examples, it should already have something like this defined in the xml flow: fixedFlow nodeSimpleSegmentAnnotator/node nodeSentenceDetectorAnnotator/node nodeTokenizerAnnotator/node nodeLvgAnnotator/node nodeContextDependentTokenizerAnnotator/node nodePOSTagger/node !-- nodeClearPOSTagger/node -- nodeChunker/node nodeAdjustNounPhraseToIncludeFollowingNP/node nodeAdjustNounPhraseToIncludeFollowingPPNP/node !-- nodeLookupWindowAnnotator/node -- nodeDictionaryLookupAnnotatorDB/node nodeDependencyParser/node nodeSemanticRoleLabeler/node nodeConstituencyParser/node !-- nodeAssertionAnnotator/node -- !-- nodeStatusAnnotator/node -- !-- nodeNegationAnnotator/node -- nodeGenericCleartkAnalysisEngine/node nodeHistoryCleartkAnalysisEngine/node nodePolarityCleartkAnalysisEngine/node nodeSubjectCleartkAnalysisEngine/node nodeUncertaintyCleartkAnalysisEngine/node nodeExtractionPrepAnnotator/node /fixedFlow You should see the results in your CVD, look for **.IdentifiedAnnotation (Those are the Named Entities + Attributes that have been extracted and normalized to an UMLS CUI.) On Fri, Dec 12, 2014 at 3:45 PM, Liang, Yu yu.li...@nyumc.org wrote: Thanks for your quick reply. So I am not quite sure if I understand correctly, that is , if I ONLY run AggregatePlaintextFastUMLSProcessor.xml, I don’t need to run the Segment, Sentence, Tokenizer, Chunker, dictionary lookup in sequency? Yu Liang CHIBI On Dec 12, 2014, at 3:40 PM, Pei Chen chen...@apache.org wrote: [+Dev] Yu, It's great that you have the AggregatePlaintextFastUMLSProcessor.xml running. I presume it's returning IdentifiedAnnotations for you. Long answer: The error JCas type xyz is used in Java Code but not declared in XML is caused by the fact that the type system is not imported. Normally, this could be fixed by adding in your primitive xml descriptor: typeSystemDescription imports import name=org.apache.ctakes.typesystem.types.TypeSystem/ /imports /typeSystemDescription But it should be already added in the SimpleSegmentAnnotator which is part of the Aggregate examples. The Dictionary lookup annotatorUMLS.xml was not intended to be used by itself because it requires the other components such as Segment, Sentence, Tokenizer, Chunker, etc. to work properly[1]. [1] https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+Component+Use+Guide#cTAKES3.2ComponentUseGuide-ComponentDependencies On Fri, Dec 12, 2014 at 3:26 PM, Liang, Yu yu.li...@nyumc.org wrote: I do have them, and I add them into the code where need the substitution. Loading is good, no error, and run AggregatePlaintextFastUMLSProcessor.xml with no error message. But when try to run other AE, like dictionarylookupannotatorUMS.xml “ , the error is as in the previous email. Do you have any idea? Also, I tried to find the log file to check the detailed error info. it is weird that I click the “view log” from the pull-down menu, it says: “no /Users/yu/uima.log”. Yu Liang CHIBI On Dec 12, 2014, at 2:52 PM, Pei Chen chen...@apache.org wrote: Yu, Do you have an UMLS username and password so you can use the UMLS resources/dictionaries? (you can request a free one here: https://uts.nlm.nih.gov//license.html) I would suggest you use the AggregatePlaintextFastUMLSProcessor.xml On Fri, Dec 12, 2014 at 2:48 PM, Liang, Yu yu.li...@nyumc.org wrote: Hi, I have a problem when using cTakes, our purpose is to annotate medical notes using cTakes with build in UMLS dictionary. And It is fine to run the AggregatePlaintextProcessor.xml. But when I try to run other analysis engine, always get this similar error messages below. Could you help me out? Appreciate! This is one error message after load and run AE called
[VOTE] Release Apache cTAKES 3.2.1 (rc2)
This is a call for a vote on releasing the following candidate (rc2) as Apache cTAKES 3.2.1. The major changes include: - New optional Temporal component (Time + Event Relationships models now available) - Other bug fixes/enhancements from Jira I manually downloaded the bin as well as resources and tried the CVD with the AggregatePlaintextFastUMLSProcessor.xml and AggregatePlaintextUMLSProcessor.xml. Would be great if folks have time to test/verify especially if you opened any of the Jira's below to ensure the bugs have been fixed/integrated. For more detailed information on the changes/release notes, please visit: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621version=12326778 The release was made using the cTAKES release process documented here: http://ctakes.apache.org/ctakes-release-guide.html The candidate is available at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.1-rc2/apache-ctakes-3.2.1-src.tar.gz /.zip The tag to be voted on: http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.2.1-rc2 The MD5 checksum of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.1-rc2/apache-ctakes-3.2.1-src.tar.gz.md5 /.zip.md5 The signature of the tarball can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.1-rc2/apache-ctakes-3.2.1-src.tar.gz.asc /.zip.asc Apache cTAKES' KEYS file, containing the PGP keys used to sign the release: https://dist.apache.org/repos/dist/release/ctakes/KEYS Please vote on releasing these packages as Apache cTAKES 3.2.1. The vote is open for at least the next 72 hours. The vote passes if at least three binding +1 votes are cast. [ ] +1 Release the packages as Apache cTAKES 3.2.1 [ ] -1 Do not release the packages because... Also, the convenience binary can be found at: https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.1-rc2/apache-ctakes-3.2.1-bin.tar.gz.md5 /.zip Thanks!
Re: UMLS validation url
Kim, I'm not sure, but I also noticed it in testing the 3.2.1-rc1 earlier today; one can always check the revision history. But in either case, A simple search and replace of: https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser https://uts-ws.nlm.nih.gov/restful/isValid https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuserUMLSUser will do the trick. Pei Chen Wired Informatics http://www.wiredinformatics.com 265 Franklin St Ste 1702 Boston, MA 02110 tel: (617) 433-7544 pei.c...@wiredinformatics.com On Mon, Nov 24, 2014 at 3:12 PM, Pei Chen chen...@apache.org wrote: -- Forwarded message -- From: Kim Ebert kim.eb...@perfectsearchcorp.com Date: Mon, Nov 24, 2014 at 1:46 PM Subject: Re: UMLS validation url To: dev@ctakes.apache.org Hi Pei, Was this a recent change. I know that some of the other conf files referenced the url https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser as well. [image: IMAT Solutions] http://imatsolutions.com Kim Ebert Software Engineer [image: Office:] 801.669.7342 kim.eb...@imatsolutions.com greg.hub...@imatsolutions.com On 11/24/2014 11:30 AM, Chen, Pei wrote: That’s a typo in the fast dictionary lookup. It should be: https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser Jira raised for this: https://issues.apache.org/jira/browse/CTAKES-335 *From:* Kim Ebert [mailto:kim.eb...@imatsolutions.com kim.eb...@imatsolutions.com] *Sent:* Monday, November 24, 2014 1:28 PM *To:* dev@ctakes.apache.org *Subject:* UMLS validation url Hi All, Today I noticed that https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser is returning 404 messages. Anyone else running into the same problem? Thanks, -- [image: IMAT Solutions] http://imatsolutions.com *Kim Ebert* Software Engineer [image: Office:]801.669.7342 kim.eb...@imatsolutions.com greg.hub...@imatsolutions.com
Re: Chest pain absent. - polarity
Petr, Which version of cTAKES are you using? 3.2.0 or latest 3.2.1-rc1/trunk? Both default to use a Machine Learning based polarity algorithm. If it is missed, more training examples is probably the way to go. The latest one uses clearTK and trained with different features and training data so I would be curious to see if that one picks up your examples. On Sat, Nov 15, 2014 at 11:04 AM, Petr Zalesky pzale...@inferscience.com wrote: I have been investigating how polarity on a sign/symptom gets set and ran into interesting issue. If a physician's note in a history of present illness (HPI) says something like: “Absence of chest pain.” “Denied chest pain.” “Chest pain resolved.” Then cTAKES picks up the term chest pain, assigns it the correct SNOMED codes and sets the polarity to -1. However, some of the de-identified samples say: Chest pain absent. In this case it is also picked up by cTAKES but in this case the polarity is set to positive one (1). I have been trying to figure out if there is a way to configure cTAKES to detect that. Any suggestions?
Re: CTakes on github.
Sounds good. Jay, Barring any objections from the group, would you mind opening a Jira with INFRA to set that up (read only git mirror) for cTAKES? --Pei On Thu, Oct 30, 2014 at 12:40 PM, jay vyas jayunit100.apa...@gmail.com wrote: Hi Pei : I Agree with (A) - the hybrid approach, so anyone can use both, or and git.apache.org has working github mirroring.
Next cTAKES release 3.2.1 - Creating a Release Candidate
There are a lot of good fixes and new enhancements in currently trunk. - Includes new Temporal Relations models (ex: Event relationships are available now- previously- only Event/Time entities discovery models were included.) -Plus a ton of bug fixes tracked in Jira I can volunteer to be RM again to push out this release... Let me know if there are any objections, otherwise, I was planning to create a branch and release candidate some time next week. Please update Jira/check in pending code if you want it to be included in the 3.2.1-rc... --Pei
upcoming cTAKES meetup - Boston...
Next Friday (halloween) - feel free to drop by if you're in the area! Lunch/drinks provided..Please RSVP via http://www.meetup.com/cTAKES/events/208836282/ --Pei
Apache cTAKES 3.2.1 release preperation
There is a 3.2.1 release slated for end of Oct. The major changes are: uimafit 2.1 upgrade, cleakTK upgrade, New temporal relations models. Below is a summary of what was scheduled to go in (some may be still unresolved). Feel free to edit/update Jira if you believe something should be included/omitted in preparation... Sub-task - [CTAKES-124 https://issues.apache.org/jira/browse/CTAKES-124] - remove internal UIMA types from coreference - [CTAKES-312 https://issues.apache.org/jira/browse/CTAKES-312] - upgrade uimafit Bug - [CTAKES-76 https://issues.apache.org/jira/browse/CTAKES-76] - get third party dependencies into Maven Central - [CTAKES-155 https://issues.apache.org/jira/browse/CTAKES-155] - SimpleSegmentWithTagsAnnotator assumes all section names are 5 characters - [CTAKES-162 https://issues.apache.org/jira/browse/CTAKES-162] - Command line scripts leave the user back one directory - [CTAKES-169 https://issues.apache.org/jira/browse/CTAKES-169] - SectionSegmentAnnotator.java is in core, but the sample SectionSegmentAnnotator.xml descriptor is in ctakes-clinical-pipeline - [CTAKES-178 https://issues.apache.org/jira/browse/CTAKES-178] - parsing of medication strength does not verify a number was discovered (strength value includes both the dosage and strength value in some cases) - [CTAKES-213 https://issues.apache.org/jira/browse/CTAKES-213] - ModifierExtractorAnnotator should produce XxxxModifier subtypes - [CTAKES-241 https://issues.apache.org/jira/browse/CTAKES-241] - NullPointerException in ctakes-assertion - [CTAKES-275 https://issues.apache.org/jira/browse/CTAKES-275] - some of the older junit tests don't have the right Project name in the run configurations - [CTAKES-280 https://issues.apache.org/jira/browse/CTAKES-280] - upgrade to cleartk-2.* - [CTAKES-285 https://issues.apache.org/jira/browse/CTAKES-285] - cleartk-ml-liblinear needs to be added to the dependencies - [CTAKES-302 https://issues.apache.org/jira/browse/CTAKES-302] - Element type hibernate-mapping must be followed by either attribute specifications, or /. - [CTAKES-307 https://issues.apache.org/jira/browse/CTAKES-307] - URI is not hierarchical when running mvn install - [CTAKES-309 https://issues.apache.org/jira/browse/CTAKES-309] - Add SNOMEDCT_US to ytext db scripts - [CTAKES-310 https://issues.apache.org/jira/browse/CTAKES-310] - Dictionary lookup permutations sort issue - [CTAKES-311 https://issues.apache.org/jira/browse/CTAKES-311] - v_document_cui_sent View returns no results in cTAKES-YTEX Improvement - [CTAKES-77 https://issues.apache.org/jira/browse/CTAKES-77] - Update POSTagger Unit Tests - [CTAKES-78 https://issues.apache.org/jira/browse/CTAKES-78] - Update Chunker unit tests - [CTAKES-94 https://issues.apache.org/jira/browse/CTAKES-94] - refactoring assertion module to use a cleartk-based analysis engine (and include evaluation) - [CTAKES-122 https://issues.apache.org/jira/browse/CTAKES-122] - include LVG with a future version of cTAKES? - [CTAKES-172 https://issues.apache.org/jira/browse/CTAKES-172] - relation-extractor is using StatusAnnotator and NegationAnnotator instead of AssertionAnnotator - [CTAKES-222 https://issues.apache.org/jira/browse/CTAKES-222] - FirstTokenPermLookupInitializerImpl to suppot arraylist of DictionaryLookupWindows - [CTAKES-225 https://issues.apache.org/jira/browse/CTAKES-225] - Common Type System - Add field to save preferredText in Segment - [CTAKES-295 https://issues.apache.org/jira/browse/CTAKES-295] - Use UIMAFit-style configuration annotations Task - [CTAKES-74 https://issues.apache.org/jira/browse/CTAKES-74] - Tokenizer PennTreeBank breaks with certain apostrophes in tokens. - [CTAKES-138 https://issues.apache.org/jira/browse/CTAKES-138] - Remove 3rd party jars from our SVN - [CTAKES-232 https://issues.apache.org/jira/browse/CTAKES-232] - change concept type - [CTAKES-315 https://issues.apache.org/jira/browse/CTAKES-315] - Update Default UMLS pipeline to use dictionary-lookup-fast
Re: Boston cTAKES Meetup
Jay, CTAKES-314 https://issues.apache.org/jira/browse/CTAKES-314 - BigTop/Hadoop cTAKES integration has been created. Feel free to create/add/edit an uber or a children... Discuss thread on dev@ : http://mail-archives.apache.org/mod_mbox/ctakes-dev/201409.mbox/%3ccapqz87q09cq_kt+4woqki7dpc5qre6h4y3eq9ukoykh5pnz...@mail.gmail.com%3e On Tue, Sep 23, 2014 at 7:41 AM, Jay Vyas jayunit100.apa...@gmail.com wrote: Shall we create an umbrella jira ? On Sep 23, 2014, at 6:26 AM, Prakash Poudyal prakashpoud...@gmail.com wrote: It will be great if you could broadcast the gathering, talk, etc. I wish I would be there, but it is very hard tor is not possible for me to be there. Prakash Poudyal Portugal On Tue, Sep 23, 2014 at 3:31 AM, Tim O'Connell tim.oconn...@gmail.com wrote: thanks Pei. On Mon, Sep 22, 2014 at 7:17 PM, Pei Chen chen...@apache.org wrote: the meetup formats are usually casual/informal, but I'll check to see if that's possible. will post it up if it's available. On Mon, Sep 22, 2014 at 5:42 PM, Tim O'Connell tim.oconn...@gmail.com wrote: Hi Folks, Any idea if we can set up a WebEx for those of us who can't attend (I'm in Vancouver)? Best, Tim On Mon, Sep 22, 2014 at 2:40 PM, John Green hephaestus.stu...@gmail.com wrote: Will this be recorded? — Sent from Mailbox https://www.dropbox.com/mailbox On Mon, Sep 22, 2014 at 4:30 PM, Pei Chen chen...@apache.org wrote: Please feel free to join the Boston Meet up group: Upcoming Free Event: http://www.meetup.com/cTAKES/events/208836282/ (If possible, please feel free to RSVP so we can get an approx headcount) Feel free to chime in if you have anything specific that may be of interest to you: ex: cTAKES intro, cTAKES and BigTop/Hadoop. But open to the community if anyone has anything they would like to show/share, news, looking for a job?, has a job opening, etc. --Pei -- Regards Prakash Poudyal
Boston cTAKES Meetup
Please feel free to join the Boston Meet up group: Upcoming Free Event: http://www.meetup.com/cTAKES/events/208836282/ (If possible, please feel free to RSVP so we can get an approx headcount) Feel free to chime in if you have anything specific that may be of interest to you: ex: cTAKES intro, cTAKES and BigTop/Hadoop. But open to the community if anyone has anything they would like to show/share, news, looking for a job?, has a job opening, etc. --Pei
Re: Boston cTAKES Meetup
the meetup formats are usually casual/informal, but I'll check to see if that's possible. will post it up if it's available. On Mon, Sep 22, 2014 at 5:42 PM, Tim O'Connell tim.oconn...@gmail.com wrote: Hi Folks, Any idea if we can set up a WebEx for those of us who can't attend (I'm in Vancouver)? Best, Tim On Mon, Sep 22, 2014 at 2:40 PM, John Green hephaestus.stu...@gmail.com wrote: Will this be recorded? — Sent from Mailbox https://www.dropbox.com/mailbox On Mon, Sep 22, 2014 at 4:30 PM, Pei Chen chen...@apache.org wrote: Please feel free to join the Boston Meet up group: Upcoming Free Event: http://www.meetup.com/cTAKES/events/208836282/ (If possible, please feel free to RSVP so we can get an approx headcount) Feel free to chime in if you have anything specific that may be of interest to you: ex: cTAKES intro, cTAKES and BigTop/Hadoop. But open to the community if anyone has anything they would like to show/share, news, looking for a job?, has a job opening, etc. --Pei
Re: Ctakes to process 5000K recoreds
Nick, When you mean no medication is being annotated, I presume you mean the medication attributes (i.e. dosage, frequency, etc.) are not being annotated? I think the DrugNER needs a list of section names in the config; I think it includes SIMPLE_SEGMENT. I am very surprised that SimpleSegementAnnotator is the bottle neck though; all it does is assume the entire document is a single section called SIMPLE_SEGMENT. Have you tried commenting out the DependencyParser if you're not using those features. --Pei On Tue, Sep 9, 2014 at 2:45 PM, Nick Nikandish snika...@emerginghealthit.com wrote: Hi there, I am using Ctakes to process 5000K free text records where each record has several medications. This is the fixed flow that it goes through: nodeSimpleSegmentAnnotator/node nodeSentenceDetectorAnnotator/node nodeTokenizerAnnotator/node nodeLvgAnnotator/node nodeContextDependentTokenizerAnnotator/node nodePOSTagger/node nodeChunker/node nodeLookupWindowAnnotator/node nodeDictionaryLookupAnnotatorDB/node nodeDependencyParser/node nodeAssertionAnnotator/node nodeExtractionPrepAnnotator/node But it takes very very long time to process that many data( maybe a week or so) when I use SimpleSegmentAnnotator. By eliminating SimpleSegmentAnnotator the process is very fast but no medication is being anotated. Do you guys have any suggestion? Thanks, Nick
Re: Permutations
Hi Kim, Thanks for pointing that out. https://issues.apache.org/jira/browse/CTAKES-310 has been opened for this. If you commit the changes, we can see if we can include in the 3.2.1 patch release. I was looking at the changelist for this file, and it may look like some of these optimizations may have been intentional by Sean so he may have some more insight in this bit of the logic. On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert kim.eb...@perfectsearchcorp.com wrote: Hi All, I was reviewing the use of permutations, and I noticed that we sorted the permutation list before creating the string to do the concept lookup with. It also appears that we were sorting the object that was stored in the parent list. I've made a few changes, and now it appears I can discover some additional concepts based upon the permutations. Let me know what you think of the following changes. Thanks, Kim === modified file 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java' --- ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java 2014-07-31 22:00:48 + +++ ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java 2014-09-04 18:39:59 + @@ -210,11 +210,12 @@ final ListListInteger permutationList = iv_permCacheMap.get( permutationIndex ); for ( ListInteger permutations : permutationList ) { // Moved sort and offset calculation from inner (per MetaDataHit) iteration 2-21-2013 spf - Collections.sort( permutations ); + ListInteger permutationsSorted = (List) ((ArrayList)permutations).clone(); + Collections.sort( permutationsSorted ); int startOffset = firstWordStartOffset; int endOffset = firstWordEndOffset; - if ( !permutations.isEmpty() ) { -int firstIdx = permutations.get( 0 ); + if ( !permutationsSorted.isEmpty() ) { +int firstIdx = permutationsSorted.get( 0 ); if ( firstIdx = firstTokenIndex ) { firstIdx--; } @@ -222,7 +223,7 @@ if ( firstToken.getStartOffset() firstWordStartOffset ) { startOffset = firstToken.getStartOffset(); } -int lastIdx = permutations.get( permutations.size() - 1 ); +int lastIdx = permutationsSorted.get( permutationsSorted.size() - 1 ); if ( lastIdx = firstTokenIndex ) { lastIdx--; } -- Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/
Re: MedicationMention and new Mention
Harpreet, MedicationMention attributes such as .medicationfrequency .medicationDosage Can be filled via the DrugMentionAnnotator [1]. If I recall correctly, I believe you can just add that annotator after the DictionaryLookup in your pipeline. [1] http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-drug-ner/desc/analysis_engine/DrugMentionAnnotator.xml On Wed, Sep 3, 2014 at 3:47 PM, Harpreet Khanduja hsk5...@rit.edu wrote: Hello, Hope everyone is doing great. I would appreciate some help in these two questions. 1. How would one go about finding values for attributes medicationfrequency,medicationAllergy, medicationDosage and others which are present in MedicationMention and adding them to the End result of ctakes pipeline. 2. Is there a way create a new Mention like LabMention or MedicationMention ? Any help would be really appreciated. Thank you. Regards, Harpreet
Re: managing ctakes resources on classpath
I'm not too privy to the ytex config details, but yes you're right, it's caused by the xdl.xsd being null. However it looks like it exists in ytex-res.jar but the call being made uses Class.getResource which won't be able to read in from the jar as an InputStream. 1) We can make ytex read in resources directly from jars (as maven central artifacts). We can make AppJdl.class.getResourceAsStream() instead of getResource(). However, are there any other local physical File dependencies? 2) Alternatively, we can add a step to have maven unpack res.jar if required. I think 1 would be nice, but not sure how involved it will be. Caused by: java.io.FileNotFoundException: /Users/pei/workspace/apache-ctakes/trunk/ctakes-ytex/file:/Users/pei/workspace/apache-ctakes/trunk/ctakes-ytex-res/target/ctakes-ytex-res-3.2.1-SNAPSHOT.jar!/org/apache/ctakes/jdl/xdl.xsd (No such file or directory) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream.lt;initgt;(FileInputStream.java:146) at java.io.FileInputStream.lt;initgt;(FileInputStream.java:101) at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90) Anyhow- https://issues.apache.org/jira/browse/CTAKES-308 opened to track this On Tue, Aug 26, 2014 at 3:01 AM, vijay garla vnga...@gmail.com wrote: Hi The test that is failing has nothing to do with the MRCONSO not found warning. ValidationTest failed because it couldn't find the XSD. The XSD is in the ctakes-ytex-resources, but the corresponding maven artifact is an empty jar. I think it would be best to modify the resource jars to actually contain resources.
cTAKES min requirements
Since we default the runtime java heap sizes to 3g in 3.2.0, should we update our documentation to officially only support 64bit? I can only see models/pipelines being loaded into mem grow in size. I know it may seem trivial, but I still know a few unfortunate souls still on 32 bit systems… any objections in saying that it will essentially be no longer supported. --Pei
org.apache.ctakes.ytex.umls.dao.UMLSDaoTest
Hi VJ, While on the subject of unit tests- I didn't get a chance to dig deeper and was hoping you would know the cause of this unit test failure: mvn clean install 2014-08-25 13:33:50,830 WARN net.sf.ehcache.CacheManager - Creating a new instance of CacheManager using the diskStorePath /var/folders/qc/d7xd4zzs0_xcybv88skt5_7mgn/T/ which is already used by an existing CacheManager. The source of the configuration was net.sf.ehcache.config.generator.ConfigurationSource$InputStreamConfigurationSource@7433a719. The diskStore path for this CacheManager will be set to /var/folders/qc/d7xd4zzs0_xcybv88skt5_7mgn/T//ehcache_auto_created_1408988030830. To avoid this warning consider using the CacheManager factory methods to create a singleton CacheManager or specifying a separate ehcache configuration (ehcache.xml) for each CacheManager instance. 2014-08-25 13:33:51,082 WARN org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 62, SQLState: S0010 2014-08-25 13:33:51,082 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - Unknown JDBC escape sequence: {{db.schema}.MRCONSO mrconso0_ where mrconso0_.aui? and length(mrconso0_.aui)0 and length(mrconso0_.str)200 and mrconso0_.lat='ENG' order by mrconso0_.aui 2014-08-25 13:33:51,085 WARN org.apache.ctakes.ytex.umls.dao.UMLSDaoTest - sql exception - mrconso probably doesn't exist, check error org.hibernate.exception.SQLGrammarException: could not prepare statement at org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:123) at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49) at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125) at org.hibernate.engine.jdbc.internal.StatementPreparerImpl$StatementPreparationTemplate.prepareStatement(StatementPreparerImpl.java:188) at org.hibernate.engine.jdbc.internal.StatementPreparerImpl.prepareQueryStatement(StatementPreparerImpl.java:159) at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1859) at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1836) at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816) at org.hibernate.loader.Loader.doQuery(Loader.java:900) at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342) at org.hibernate.loader.Loader.doList(Loader.java:2526) at org.hibernate.loader.Loader.doList(Loader.java:2512) at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342) at org.hibernate.loader.Loader.list(Loader.java:2337) at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:495) at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:357) at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:195) at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1269) at org.hibernate.internal.QueryImpl.list(QueryImpl.java:101) at org.apache.ctakes.ytex.umls.dao.UMLSDaoImpl.getAllAuiStr(UMLSDaoImpl.java:106) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:319) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:90) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202) at com.sun.proxy.$Proxy11.getAllAuiStr(Unknown Source) at org.apache.ctakes.ytex.umls.dao.UMLSDaoTest.testGetAllAuiStr(UMLSDaoTest.java:53) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at
Re: Microsoft - MSDN - Is the support continuing for ASF committers?
Just an fyi - link for MSDN subscription license(s) for committers http://mail-archives.apache.org/mod_mbox/www-community/201305.mbox/%3c518b85e7.7000...@lehmi.de%3E https://svn.apache.org/repos/private/committers/donated-licenses/msdn-subscription.html
Re: [VOTE] Release Apache cTAKES 3.2.0
It's the latter: the -src is basically the same as the dev install w/o the subversion checkout step... On Tue, Jul 8, 2014 at 7:15 AM, Miller, Timothy timothy.mil...@childrens.harvard.edu wrote: One other thing -- the User install guide says to download the -bin and the Dev install guide has you checking out from trunk. And the README in the -src distribution has instructions that only make sense for the -bin distribution. Are there instructions somewhere for how to install the -src version or are they basically the same as the dev install w/o the subversion checkout step? Tim From: Pei Chen [chen...@apache.org] Sent: Monday, July 07, 2014 12:06 PM To: dev@ctakes.apache.org Subject: Re: [VOTE] Release Apache cTAKES 3.2.0 Thanks for testing this Tim. I could recreate this on my mac now (worked previously on windows on luck because of the class load order). Essentially, the old mitre assertion module and LVG resources still need to be unpacked. We don't have access to modify the underlying lib to read from a stream, I'm just going to omit/exclude the redundant ctakes-assertion-res.jar from the distro (as the assertion module will be updated in the future release anyway). Let me know if you encounter anything else, otherwise will plan to create an RC-2. On Fri, Jul 4, 2014 at 12:01 PM, Miller, Timothy timothy.mil...@childrens.harvard.edu wrote: I get an error when I try to run the CVD following the README instructions for the binary release: 7/4/14 11:56:03 AM - 12: org.apache.uima.tools.cvd.MainFrame.handleException(527): SEVERE: Initialization of annotator class org.apache.ctakes.assertion.medfacts.AssertionAnalysisEngine failed. (Descriptor: file:/home/tmill/Projects/sandbox/ctakes-rcs/apache-ctakes-3.2.0/desc/ctakes-assertion/desc/assertionAnalysisEngine.xml) org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class org.apache.ctakes.assertion.medfacts.AssertionAnalysisEngine failed. (Descriptor: file:/home/tmill/Projects/sandbox/ctakes-rcs/apache-ctakes-3.2.0/desc/ctakes-assertion/desc/assertionAnalysisEngine.xml) at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:252) at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:156) at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94) at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387) at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254) at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431) at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375) at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185) at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94) at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387) at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254) at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431) at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375) at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185) at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94) at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:354) at org.apache.uima.tools.cvd.MainFrame.setupAE(MainFrame.java:1484) at org.apache.uima.tools.cvd.MainFrame.loadAEDescriptor(MainFrame.java:477) at org.apache.uima.tools.cvd.control.AnnotatorOpenEventHandler.actionPerformed