Re: Fw: Apache cTAKES 4.0.0.1 : UMLS Authentication Patch

2021-01-23 Thread Pei Chen
This is missing critical items which constitute an apache release [1].
If it has already been dist'd, I would suggest retroactively
correcting the missing steps as outlined in the release process.

[1] http://www.apache.org/legal/release-policy.html#policy

On Sat, Jan 23, 2021 at 10:39 AM Finan, Sean
 wrote:
>
> Hi all,
>
>
> The "Downloads" page on ctakes.apache.org should now be working as expected.
>
> I have tested several mirror servers and they all appear to have the 4.0.0.1 
> packaged downloads available.
>
>
> Enjoy,
>
> Sean
>
> 
> From: Finan, Sean
> Sent: Thursday, January 21, 2021 1:52 PM
> To: dev@ctakes.apache.org; u...@ctakes.apache.org
> Subject: Fw: Apache cTAKES 4.0.0.1 : UMLS Authentication Patch
>
>
> Hi all,
>
>
> I have been getting your emails and jira items informing me that the download 
> targets on the ctakes website have still not been populated.  Thank you for 
> letting me know.
>
> I apologize for the inconvenience, and as soon as I can I will work with the 
> Apache Infra team to see why we are having this problem.
>
>
> As soon as I witness working links I will let you all know.
>
>
> Thank you,
>
> Sean
>
> 
> From: Finan, Sean
> Sent: Wednesday, January 20, 2021 10:24 AM
> To: dev@ctakes.apache.org; u...@ctakes.apache.org
> Subject: Apache cTAKES 4.0.0.1 : UMLS Authentication Patch
>
>
> As some have experienced, the U.S.A. National Library of Medicine (NLM) 
> has changed the authentication method for using the Unified Medical Language 
> System (UMLS).
>
> https://www.nlm.nih.gov/research/umls/index.html
>
>
> Though a bit late in its arrival, Apache cTAKES now has a patch release that 
> supports the new UMLS authentication method.
>
>
> The release number is 4.0.0.1, an update of the previous release version 
> 4.0.0 with a single change to enable the new UMLS authentication.
>
> No other code or functionality has been modified and there are no 
> enhancements to the previous release 4.0.0
>
>
> There are instructions for use on the Apache cTAKES wiki.
>
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0.0.1
>
>
> The source code is available in the 4.0.0.1 tag of the Subversion (svn) repository.
>
> https://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0.1/
>
>
> The jar and pom files are available from maven central and any Applications 
> utilizing Apache cTAKES as an Apache Maven dependency should update their pom 
> files.
>
> https://search.maven.org/search?q=ctakes
>
>
> At this time the Apache infra script that points mirror download servers to 
> the pre-built zip/archive files has not run.  I hope that the mirror servers 
> are updated in a day or two.
>
> When the mirror servers are updated the buttons on the "Downloads" page of 
> ctakes.apache.org should trigger a download of the patch version.  Until then 
> you will get a "page not found" error.
>
> Until the pre-built archive downloads are available through the website, you 
> can find them in the release repository.
>
> https://repository.apache.org/content/repositories/releases/org/apache/ctakes/ctakes-core/4.0.0.1/
>
>
> For more information please visit the wiki page on the Apache cTAKES 4.0.0.1 
> patch release.
>
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0.0.1
>
>
>
> A very special thanks goes to Peter Abramowitsch for conception and original 
> implementation of the authentication code and workflow.
>
>
> Many thanks to those who boldly tested, documented and otherwise made this 
> patch and its trunk equivalent possible, including
>
> Kean Kaufmann
>
> Gandhi Rajan
>
> Eugenia Monogyiou
>
> Timothy Miller
>
> and anybody else that I have forgotten (apologies).
>
>
> And for those of you who gave me a bit of prodding to get this wrapped up 
> and published ... in the end I am grateful and you have done us all a service.
>
>
> Cheers,
>
> Sean
>


Re: Apache cTAKES 4.0.0.1 : UMLS Authentication Patch

2021-01-22 Thread Pei Chen
- Link to public KEYS to verify the sigs?
- Link to the VOTE results?
- Anyone get a chance to download/test/verify the release candidate
artifact in staging before dist'ing?

Not sure if the release guide procedures changed, but it's fairly typical
https://web.archive.org/web/20140701075131/http://ctakes.apache.org/ctakes-release-guide.html
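
For readers following the verification questions above, the checksum part of checking a release candidate can be sketched in a few lines of Python. This is a minimal sketch, not the project's documented procedure. It assumes the published checksum file carries the hex digest as its first hex-looking token; the function names `file_digest`, `parse_checksum_file` and `checksum_matches` are made up for illustration. Checking the PGP signature against the KEYS file is a separate step, normally done with gpg, and is not covered here.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_file(text: str) -> str:
    """Extract the hex digest from the text of a checksum file.

    Assumption: the digest is the first token that is at least 32 hex
    characters long, which covers a bare digest as well as a digest
    followed by a file name.
    """
    for token in text.split():
        candidate = token.strip().lower()
        if len(candidate) >= 32 and all(c in "0123456789abcdef" for c in candidate):
            return candidate
    raise ValueError("no hex digest found in checksum file")


def checksum_matches(artifact: Path, checksum_file: Path, algorithm: str = "md5") -> bool:
    """Compare the artifact's digest with the one published next to it."""
    expected = parse_checksum_file(checksum_file.read_text())
    return file_digest(artifact, algorithm) == expected
```

A call such as `checksum_matches(Path("apache-ctakes-4.0.0-src.tar.gz"), Path("apache-ctakes-4.0.0-src.tar.gz.md5"))` would cover the MD5 files named in the 2017 vote mails; passing `algorithm="sha512"` covers a SHA-512 file in the same way.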



On Wed, Jan 20, 2021 at 10:25 AM Finan, Sean
 wrote:
>
> As some have experienced, the U.S.A. National Library of Medicine (NLM) 
> has changed the authentication method for using the Unified Medical Language 
> System (UMLS).
>
> https://www.nlm.nih.gov/research/umls/index.html
>
>
> Though a bit late in its arrival, Apache cTAKES now has a patch release that 
> supports the new UMLS authentication method.
>
>
> The release number is 4.0.0.1, an update of the previous release version 
> 4.0.0 with a single change to enable the new UMLS authentication.
>
> No other code or functionality has been modified and there are no 
> enhancements to the previous release 4.0.0
>
>
> There are instructions for use on the Apache cTAKES wiki.
>
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0.0.1
>
>
> The source code is available in the 4.0.0.1 tag of the Subversion (svn) repository.
>
> https://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0.1/
>
>
> The jar and pom files are available from maven central and any Applications 
> utilizing Apache cTAKES as an Apache Maven dependency should update their pom 
> files.
>
> https://search.maven.org/search?q=ctakes
>
>
> At this time the Apache infra script that points mirror download servers to 
> the pre-built zip/archive files has not run.  I hope that the mirror servers 
> are updated in a day or two.
>
> When the mirror servers are updated the buttons on the "Downloads" page of 
> ctakes.apache.org should trigger a download of the patch version.  Until then 
> you will get a "page not found" error.
>
> Until the pre-built archive downloads are available through the website, you 
> can find them in the release repository.
>
> https://repository.apache.org/content/repositories/releases/org/apache/ctakes/ctakes-core/4.0.0.1/
>
>
> For more information please visit the wiki page on the Apache cTAKES 4.0.0.1 
> patch release.
>
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0.0.1
>
>
>
> A very special thanks goes to Peter Abramowitsch for conception and original 
> implementation of the authentication code and workflow.
>
>
> Many thanks to those who boldly tested, documented and otherwise made this 
> patch and its trunk equivalent possible, including
>
> Kean Kaufmann
>
> Gandhi Rajan
>
> Eugenia Monogyiou
>
> Timothy Miller
>
> and anybody else that I have forgotten (apologies).
>
>
> And for those of you who gave me a bit of prodding to get this wrapped up 
> and published ... in the end I am grateful and you have done us all a service.
>
>
> Cheers,
>
> Sean
>


Re: Demos menu option on cTAKES homepage

2017-04-26 Thread Pei Chen
Hi James,
The demos were being upgraded to use 4.0.0 last night.  They should be
up and running now.
Let me know if you encounter any issues.
--Pei


On Tue, Apr 25, 2017 at 10:59 PM, James Masanz  wrote:
> Pei and others with access to update http://healthnlp.github.io/examples/
>
> Following the   Get Started -> Demo   menu  on http://ctakes.apache.org/ leads
> to a page with demos that aren't currently working.
> Will you have a chance to fix those soon or should the Demo menu be removed
> until they get fixed?
>
> -- James


[CANCEL] [VOTE] Re: Release Apache cTAKES 4.0.0 (rc2)

2017-04-17 Thread Pei Chen
Cancelled- replaced by rc3.

On Sat, Apr 15, 2017 at 10:05 PM, James Masanz <masanz.ja...@gmail.com> wrote:
> re-posting my latest but with [VOTE] added to the subject for anyone
> filtering on that.
>
> On Sat, Apr 15, 2017 at 10:02 PM, James Masanz <masanz.ja...@gmail.com>
> wrote:
>
>> Hi everyone,
>>
>>  - changing my vote to 0 for rc2. I'd prefer to have a new release
>> candidate ASAP but if we don't get one soon I would prefer to have a 4.0
>> released as-is, and we can document its limitations and spin up a 4.0.1
>> with the fixes, if there are enough people content to release 4.0 as-is.
>> I'd be happy to be release manager for a 4.0.1.
>>
>> -- James
>>
>>
>> On Fri, Apr 14, 2017 at 8:04 PM, James Masanz <masanz.ja...@gmail.com>
>> wrote:
>>
>>>
>>> -1 from me for rc2 because of various issues found:
>>> - old dictionary lookup didn't work in an IDE unless you manually
>>>   download the latest zip - pom files needed updating (checked into trunk
>>>   today) (more of the ctakesresources from sourceforge need to be put onto
>>>   maven central for ctakes to work as a maven dependency)
>>> - Sean fixed some issues today (I saw commit notices today) which I'd
>>>   like to see included in 4.0 before it's released
>>>
>>> -- James
>>>
>>>
>>> On Wed, Apr 12, 2017 at 5:31 PM, Pei Chen <chen...@apache.org> wrote:
>>>
>>>> This is a call for a vote on releasing the following candidate (rc2) as
>>>> Apache cTAKES 4.0.0.
>>>>
>>>> For more detailed information on the changes/release notes, please visit:
>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621&version=12340211
>>>>
>>>> The release was made using the cTAKES release process documented here:
>>>> https://ctakes.apache.org/ctakes-release-guide.html
>>>>
>>>> The candidate is available at:
>>>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz
>>>> /.zip
>>>>
>>>> The tag to be voted on:
>>>> http://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0-rc2
>>>> The MD5 checksum of the tarball can be found at:
>>>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz.md5
>>>> /.zip.md5
>>>>
>>>> The signature of the tarball can be found at:
>>>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz.asc
>>>> /.zip.asc
>>>>
>>>> Apache cTAKES' KEYS file, containing the PGP keys used to sign the
>>>> release:
>>>> https://dist.apache.org/repos/dist/dev/ctakes/KEYS
>>>>
>>>> Please vote on releasing these packages as Apache cTAKES 4.0.0. The vote
>>>> is
>>>> open for at least the next 72 hours.
>>>>
>>>> The vote passes if at least three binding +1 votes are cast.
>>>> [ ] +1 Release the packages as Apache cTAKES 4.0.0
>>>> [ ] -1 Do not release the packages because...
>>>>
>>>> Also, the convenience binary can be found at:
>>>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-bin.tar.gz
>>>> /.zip
>>>>
>>>> I've only tested the CVD.  Sean/James- Can you test/verify your changes?
>>>>
>>>> Special thanks to all of those involved.
>>>> --
>>>>
>>>
>>>
>>


Re: Release Apache cTAKES 4.0.0 (rc2)

2017-04-17 Thread Pei Chen
Guergana,
Sean and James sent us a private message requesting an rc3 that includes
the most recent changes made in trunk after rc2 was created.
We are more than happy to create another release candidate.  That was
the reason that rc2 was vetoed and an rc3 was requested.  The only
differences between rc3 and rc2 are whatever minor changes went into
trunk since Fri over the Easter and Patriots' Day holiday weekend.  You're
more than welcome to create the rc yourself, but I don't think it
will make things any more efficient.  I rarely see anyone impose
dates/deadlines upon other ASF volunteers.  What gives?

On Mon, Apr 17, 2017 at 9:53 AM, Savova, Guergana
<guergana.sav...@childrens.harvard.edu> wrote:
> Pei/Murali,
> Let us know if you could cut release candidate 3 by Monday, April 17, 5 pm 
> ET. We would understand if you are very busy and unavailable to do so -- life 
> happens. Sean Finan and James Masanz volunteered to prepare rc3 if we do not 
> hear from you.
>
> Dear cTAKES community,
> Thank you for your testing of rc2, your contributions are so valuable! RC3 
> will be made available on Tuesday, April 18 or Wednesday, April 19 for 
> another round of testing and voting.
> We all are looking forward to the v4 release!
>
> Cheers,
>  --Guergana
>
> -----Original Message-----
> From: Savova, Guergana
> Sent: Saturday, April 15, 2017 10:02 AM
> To: 'dev@ctakes.apache.org' <dev@ctakes.apache.org>
> Subject: RE: Release Apache cTAKES 4.0.0 (rc2)
>
> Not sure what is meant by "this week". Today, Sat, April 15 by 5 pm?
> --Guergana
>
> Guergana Savova, PhD, FACMI
> Associate Professor
> PI Natural Language Processing Lab
> Boston Children's Hospital and Harvard Medical School
> 300 Longwood Avenue
> Mailstop: BCH3092
> Enders 144.1
> Boston, MA 02115
> Tel: (617) 919-2972
> Fax: (617) 730-0817
> guergana.sav...@childrens.harvard.edu
> Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
> http://ctakes.apache.org
> http://thyme.healthnlp.org
> http://cancer.healthnlp.org
> http://share.healthnlp.org
> http://center.healthnlp.org
>
>
> -----Original Message-----
> From: Pei Chen [mailto:pei.c...@wiredinformatics.com]
> Sent: Saturday, April 15, 2017 9:13 AM
> To: dev@ctakes.apache.org
> Subject: Re: Release Apache cTAKES 4.0.0 (rc2)
>
> Let us recut 4.0.0 from trunk this week.  I just saw a note from Sean that he 
> would like to integrate changes from trunk as well.
>
>Pei Chen
> Wired Informatics <http://bit.ly/1pHmTcL>
> 265 Franklin St Ste 1702
> Boston, MA 02110
> tel: (617) 433-7544
> pei.c...@wiredinformatics.com
>
> On Fri, Apr 14, 2017 at 11:38 PM, Finan, Sean < 
> sean.fi...@childrens.harvard.edu> wrote:
>
>> > I'd rather not get into the definition of "basic", just like I'd
>> > rather
>> not discuss the definition of obvious with another mathematician.
>> --> Lol.  My wife can't stand it when I say "obviously".
>>
>> Fwiw, I think that cutting a new rc sooner rather than later is
>> comparatively little work compared to the benefit for testers.  It
>> needs to be done anyway as what is in rc2 is not releasable.  I don't
>> want to vote
>> -1 on the rc, but will if it is necessary to get an rc3 cut.
>>
>> Sean
>>
>> -----Original Message-----
>> From: James Masanz [mailto:masanz.ja...@gmail.com]
>> Sent: Friday, April 14, 2017 9:36 PM
>> To: dev@ctakes.apache.org
>> Subject: Re: Release Apache cTAKES 4.0.0 (rc2)
>>
>> these are all the fixes I plan to make. last I talked to Sean, he had
>> all his changes in. I assume there will be more testing up until final
>> vote, I certainly will be doing more testing and working more on
>> documentation. But why not have people test on the latest now that we
>> have fixed some issues that seem like showstoppers?  I'd rather not
>> get into the definition of "basic", just like I'd rather not discuss
>> the definition of obvious with another mathematician.
>>
>> On Fri, Apr 14, 2017 at 8:23 PM, Pei Chen <chen...@apache.org> wrote:
>>
>> > James,
>> > Happy to create another rc3, but can I suggest we bundle all of the
>> > fixes before creating another candidate?  Are there other remaining
>> > items to test? This just seems like basic functionality?
>> >
>> > On Fri, Apr 14, 2017 at 8:04 PM, James Masanz

Re: Release Apache cTAKES 4.0.0 (rc2)

2017-04-14 Thread Pei Chen
James,
Happy to create another rc3, but can I suggest we bundle all of the
fixes before creating another candidate?  Are there other remaining
items to test? This just seems like basic functionality?

On Fri, Apr 14, 2017 at 8:04 PM, James Masanz <masanz.ja...@gmail.com> wrote:
> -1 from me for rc2 because of various issues found:
> - old dictionary lookup didn't work in an IDE unless you manually
>   download the latest zip - pom files needed updating (checked into trunk
>   today) (more of the ctakesresources from sourceforge need to be put onto
>   maven central for ctakes to work as a maven dependency)
> - Sean fixed some issues today (I saw commit notices today) which I'd
>   like to see included in 4.0 before it's released
>
> -- James
>
>
> On Wed, Apr 12, 2017 at 5:31 PM, Pei Chen <chen...@apache.org> wrote:
>
>> This is a call for a vote on releasing the following candidate (rc2) as
>> Apache cTAKES 4.0.0.
>>
>> For more detailed information on the changes/release notes, please visit:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621&version=12340211
>>
>> The release was made using the cTAKES release process documented here:
>> https://ctakes.apache.org/ctakes-release-guide.html
>>
>> The candidate is available at:
>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz
>> /.zip
>>
>> The tag to be voted on:
>> http://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0-rc2
>> The MD5 checksum of the tarball can be found at:
>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz.md5
>> /.zip.md5
>>
>> The signature of the tarball can be found at:
>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz.asc
>> /.zip.asc
>>
>> Apache cTAKES' KEYS file, containing the PGP keys used to sign the release:
>> https://dist.apache.org/repos/dist/dev/ctakes/KEYS
>>
>> Please vote on releasing these packages as Apache cTAKES 4.0.0. The vote is
>> open for at least the next 72 hours.
>>
>> The vote passes if at least three binding +1 votes are cast.
>> [ ] +1 Release the packages as Apache cTAKES 4.0.0
>> [ ] -1 Do not release the packages because...
>>
>> Also, the convenience binary can be found at:
>> https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-bin.tar.gz
>> /.zip
>>
>> I've only tested the CVD.  Sean/James- Can you test/verify your changes?
>>
>> Special thanks to all of those involved.
>> --
>>


Release Apache cTAKES 4.0.0 (rc2)

2017-04-12 Thread Pei Chen
This is a call for a vote on releasing the following candidate (rc2) as
Apache cTAKES 4.0.0.

For more detailed information on the changes/release notes, please visit:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621&version=12340211

The release was made using the cTAKES release process documented here:
https://ctakes.apache.org/ctakes-release-guide.html

The candidate is available at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz
/.zip

The tag to be voted on:
http://svn.apache.org/repos/asf/ctakes/tags/ctakes-4.0.0-rc2
The MD5 checksum of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz.md5
/.zip.md5

The signature of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-src.tar.gz.asc
/.zip.asc

Apache cTAKES' KEYS file, containing the PGP keys used to sign the release:
https://dist.apache.org/repos/dist/dev/ctakes/KEYS

Please vote on releasing these packages as Apache cTAKES 4.0.0. The vote is
open for at least the next 72 hours.

The vote passes if at least three binding +1 votes are cast.
[ ] +1 Release the packages as Apache cTAKES 4.0.0
[ ] -1 Do not release the packages because...

Also, the convenience binary can be found at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-4.0.0-rc2/apache-ctakes-4.0.0-bin.tar.gz
/.zip

I've only tested the CVD.  Sean/James- Can you test/verify your changes?

Special thanks to all of those involved.
--
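
The passing rule stated in the call above ("at least three binding +1 votes") can be sketched as a small tally function. This is only an illustration under stated assumptions: the `Vote` record and `release_vote_passes` are hypothetical names, only a voter's latest vote is counted (voters in this archive do change their votes), and the extra condition that binding +1 votes must outnumber binding -1 votes comes from the general ASF voting guidelines rather than from this mail.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0 or -1
    binding: bool  # True when the voter's vote is binding


def release_vote_passes(votes: Iterable[Vote], minimum_binding_plus_ones: int = 3) -> bool:
    """Apply the rule from the vote mail to a chronological list of votes."""
    # Keep only the most recent vote per voter.
    latest = {}
    for vote in votes:
        latest[vote.voter] = vote

    binding = [v for v in latest.values() if v.binding]
    plus = sum(1 for v in binding if v.value > 0)
    minus = sum(1 for v in binding if v.value < 0)

    # Assumption: besides the three binding +1 votes named in the mail,
    # binding +1 votes must outnumber binding -1 votes.
    return plus >= minimum_binding_plus_ones and plus > minus
```

Non-binding votes are ignored by the tally itself, although they remain useful feedback for the release manager.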


Re: Labs annotator?

2017-03-29 Thread Pei Chen
Kean,
This would be really useful.  If you would like to make a contribution,
could you please open a Jira and attach the patch or code?  When you
submit a patch via a Jira attachment, it carries legal verbiage about
donating the code, etc.

--Pei


On Wed, Mar 29, 2017 at 9:30 AM, Finan, Sean
 wrote:
> Fantastic!
>
> I would really like to work with you to get this into ctakes 4.1.  Let me 
> know how you would like to proceed.  Would you like to send me or another 
> committer the code or have somebody review it remotely?  The "tweaks" may be 
> something useful to ctakes, but if not I'm sure that we can create a decent 
> interfacing.
>
> Cheers,
> Sean
>
> -----Original Message-----
> From: Kean Kaufmann [mailto:k...@recordsone.com]
> Sent: Wednesday, March 29, 2017 7:59 AM
> To: dev@ctakes.apache.org
> Subject: Re: Labs annotator?
>
>>
>> I'm sure that people would love to see lab values in ctakes!  Could
>> you please write a small summary of what it does?  Maybe an example or
>> two could suffice.
>
>
> Hi Sean,
>
> The labs annotator identifies likely lab phrases by TUI (T059 et al.), and 
> relates them to the nearest following number-ish value -- NumToken, 
> FractionAnnotation, MeasurementAnnotation or (as a last resort) 
> RangeAnnotation -- that isn't part of a Date or TimeAnnotation.
> A whitelist of lab-value words can also be specified,  e.g. "positive", 
> "negative", "normal", "elevated", "decreased", ...
>
> For example,
>
> Weight / BMI:  Recent weight (as of 05/05/16) is
>> 45.36 kg (100 lb)
>
>
> yields
>
> "weight" -> "45.36 kg"
>
> and
>
> HEPATIC FUNCTION PANEL
>> Result Value Ref Range
>>  Albumin 2.2 (*) 3.7 - 5.1 g/dL
>>  Total Protein 5.5 (*) 5.8 - 8.0 g/dL
>>  Alkaline Phosphatase 844 (*) 42 - 121 IU/L ...
>
>
> yields
>
> "Albumin" -> "2.2"
> "Protein" -> "5.5"
> "Alkaline Phosphatase" -> "844"
>
> (without trying to fill in the units or referenceRangeNarrative values).
>
> Configuration parameters:
> * ids of segments to annotate
> * TUIs indicating labs - I use T059, T060 and T121
> * CUIs too general to be useful, e.g. C1443182, "Calculated (procedure)"
> * Whitelist of words allowed as lab values
> * Maximum number of newlines permitted between lab and value (0 = must be on 
> same line)
>
> I'd need to check in with you to make sure it plays nicely with the cTAKES 
> type system; we've tweaked ours a bit.
>
> Best,
> -kk
>
>
> On Tue, Mar 28, 2017 at 11:45 AM, Finan, Sean < 
> sean.fi...@childrens.harvard.edu> wrote:
>
>> Hi Kean,
>>
>> I'm sure that people would love to see lab values in ctakes!  Could
>> you please write a small summary of what it does?  Maybe an example or
>> two could suffice.
>>
>> We can definitely put it into ctakes in release 4.1 - maybe next quarter?
>>
>> Cheers,
>> Sean
>>
>> -----Original Message-----
>> From: Kean Kaufmann [mailto:k...@recordsone.com]
>> Sent: Tuesday, March 28, 2017 11:34 AM
>> To: dev@ctakes.apache.org
>> Subject: Labs annotator?
>>
>> On Tue, Mar 28, 2017 at 11:23 AM, Finan, Sean <
>> sean.fi...@childrens.harvard.edu> wrote:
>>
>> >
>> > If anybody out there has something that they would like to
>> > contribute to ctakes, please do!
>> >
>>
>> I recently wrote an annotator for lab values.  There was some
>> discussion of this on the dev list a couple of years ago; did anything come 
>> of it?
>> Happy to contribute if it's helpful.
>>
>> --
>> *Kean Kaufmann*
>> NLP Developer
>> RecordsOne
>>
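
The heuristic Kean describes in this thread (find a lab mention, then take the nearest following number-like or whitelisted value that is not part of a date, within a limited number of newlines) can be approximated in plain Python. This is a rough sketch and not the contributed annotator: the real one works on cTAKES annotations (TUIs, NumToken, MeasurementAnnotation, DateAnnotation and so on), whereas this version uses regular expressions and plain string matching. The function name `find_lab_values`, the small unit list and the rule that a value belongs to the closest preceding lab mention are all assumptions made for illustration.

```python
import re

# Illustrative patterns only; the unit list is a made-up sample.
NUMBER = r"\d+(?:\.\d+)?(?: ?(?:kg|lb|g/dL|IU/L)\b)?"
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")


def find_lab_values(text, lab_terms, whitelist=(), max_newlines=0):
    """Pair each lab mention with the nearest following value."""
    alternatives = [NUMBER] + [r"\b" + re.escape(word) + r"\b" for word in whitelist]
    value_re = re.compile("|".join(alternatives))
    date_spans = [m.span() for m in DATE.finditer(text)]

    # Hypothetical stand-in for TUI-based lookup: case-sensitive string matching.
    labs = sorted(
        (m.start(), m.end(), m.group(0))
        for term in lab_terms
        for m in re.finditer(re.escape(term), text)
    )

    pairs = []
    for index, (_, lab_end, lab_text) in enumerate(labs):
        # Assumption: a value belongs to the closest preceding lab mention,
        # so the search stops where the next lab mention begins.
        window_end = labs[index + 1][0] if index + 1 < len(labs) else len(text)
        for match in value_re.finditer(text, lab_end, window_end):
            if text.count("\n", lab_end, match.start()) > max_newlines:
                break  # value is too many lines away from the lab mention
            if any(start < match.end() and match.start() < end for start, end in date_spans):
                continue  # digits inside a date are not lab values
            pairs.append((lab_text, match.group(0)))
            break
    return pairs
```

On the weight example from the thread, `find_lab_values(text, ["weight"], max_newlines=1)` skips the date and picks up the measurement on the next line; with `max_newlines=0` the same input produces no pair.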


Re: (Re)introduce myself - James Masanz

2017-01-27 Thread Pei Chen
Welcome back James!  Good to hear from you again. 
Out of respect for the others in the community who already volunteered to be 
RM, I do not see a need for BCH to override existing volunteers, unless they 
are unable or unwilling.
Would you/others agree? 

Sent from my iPhone

> On Jan 27, 2017, at 2:23 PM, James Masanz  wrote:
> 
> Hi,
> 
> I'm James Masanz -- if you've been on the dev list for more than a couple
> years, you might recognize my name from my previous contributions to
> cTAKES, which include having been a release manager.
> 
> I've joined the Boston Children's Hospital NLP team. I will be devoting
> significant energy to the next release of cTAKES, and I volunteer to be the
> release manager for it.
> 
> My initial thoughts are that we could make the "fast dictionary lookup" be
> the default dictionary lookup, incorporate the dictionary GUI from sandbox,
> and call the release 4.0. I'm also interested in migrating the Wiki away
> from Confluence to Apache's moin-moin instance. I'm sure there are other
> things to include in the next release as well.
> 
> You'll be hearing more from me over the next few weeks as I review the list
> of issues in Jira and get caught up with details of what's been going on
> while I was less active here.
> 
> One thing I would like to track for release candidates would be a list of
> what is tested on which platforms, which could be as simple as a post with
> a matrix of src/bin/other vs. linux/windows/mac, and making sure we have at
> least one volunteer to test the install and run of a pipeline for each
> entry in the table. Future releases might expand on that to include
> tracking multiple pipelines across environments, etc.
> 
> I'm happy I'm returning to being active in the cTAKES community!
> 
> -- James
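
The test-coverage matrix James proposes (src/bin/other against linux/windows/mac, with at least one volunteer per cell) is easy to keep as data. The following is a small sketch of one way to do it; the volunteer tuples, `uncovered_cells` and `render_matrix` are hypothetical and only illustrate the idea.

```python
from itertools import product

PACKAGES = ("src", "bin", "other")
PLATFORMS = ("linux", "windows", "mac")


def uncovered_cells(volunteers):
    """Return the (package, platform) cells nobody has signed up to test.

    `volunteers` is a list of (name, package, platform) tuples.
    """
    covered = {(package, platform) for _, package, platform in volunteers}
    return [cell for cell in product(PACKAGES, PLATFORMS) if cell not in covered]


def render_matrix(volunteers):
    """Render the matrix as plain text suitable for posting to the list."""
    lines = ["{:<8}".format("") + "".join("{:<10}".format(p) for p in PLATFORMS)]
    for package in PACKAGES:
        row = "{:<8}".format(package)
        for platform in PLATFORMS:
            names = [n for n, p, q in volunteers if (p, q) == (package, platform)]
            row += "{:<10}".format(",".join(names) or "-")
        lines.append(row.rstrip())
    return "\n".join(lines)
```

Posting the output of `render_matrix` with each release candidate, and calling `uncovered_cells` before closing the vote, would show at a glance which combinations still need a tester.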


Re: Infrastructures questions.

2016-12-15 Thread Pei Chen
I can recreate what Tim suggested.
1) Comment out LVG from the dependency parser test/pipeline
2) The same thing would need to be done in the Regression Test (comment out
LVG in the test)
3) Update the regression test suite (newly generated output -> expected)

I'll commit the changes. Basically, with the above, the junit tests won't run
into the "URI is not hierarchical" issue when the "mvn package" command is issued.
A bit of background: I remember digging deeper into this some time ago.
Even if we fixed our impl/LVG wrapper, there are hard-coded references
inside LVG itself that use file:// instead of ResourceAsStream, forcing
resources to be unpacked. This doesn't jibe well with the package phase
of the maven build cycle, where it references your .m2 repo.  mvn test works
fine because it will reference the explicit local unpacked lvg resources.

   Pei Chen
Wired Informatics <http://bit.ly/1pHmTcL>
265 Franklin St Ste 1702
Boston, MA 02110
tel: (617) 433-7544
pei.c...@wiredinformatics.com

On Wed, Dec 14, 2016 at 9:57 AM, Miller, Timothy <
timothy.mil...@childrens.harvard.edu> wrote:

> Dependency tests pass with my change; new test error in regression test
> module that I'm not familiar with and error type I've never seen before
> -- reaching out for help debugging:
>
>
> > Exception in thread "BaseCPMImpl-Thread" junit.framework.AssertionFailedError: Verifying Test Output: testpatient_plaintext_2.txt.xml
> > org.custommonkey.xmlunit.Diff
> > [different] Expected number of element attributes '7' but was '6' - comparing  at /CAS[1]/org.apache.ctakes.typesystem.type.syntax.NewlineToken[1] to  at /CAS[1]/org.apache.ctakes.typesystem.type.syntax.NewlineToken[1]
> >
> >   at junit.framework.Assert.fail(Assert.java:50)
> >   at junit.framework.Assert.assertTrue(Assert.java:20)
> >   at org.apache.ctakes.regression.test.RegressionPipelineTest.compareXMLOutput(RegressionPipelineTest.java:147)
> >   at org.apache.ctakes.regression.test.RegressionPipelineTest$StatusCallbackListenerImpl.collectionProcessComplete(RegressionPipelineTest.java:200)
> >   at org.apache.uima.collection.impl.cpm.BaseCPMImpl.run(BaseCPMImpl.java:538)
> >   at java.lang.Thread.run(Thread.java:745)
>
>
> Thanks
> Tim
>
> On Tue, 2016-12-13 at 16:15 +, Miller, Timothy wrote:
> > Quick followup - the test passes in eclipse, both with and without LVG
> > enabled. Can someone try to replicate at the command line and see if mvn
> > package works with LVG commented out? This is line 130 in
> > WriteClearNLPDescriptors.java. Otherwise I can try this afternoon.
> > Tim
> >
> > On Tue, 2016-12-13 at 15:57 +, Miller, Timothy wrote:
> > > Pretty sure this particular issue is caused by LVG being part of the
> > > test pipeline and the "URI is not hierarchical" bug from not having its
> > > files unpacked from the jar. A simple fix is to disable that test in
> > > code; a slightly more complex fix is to run the test with a modified
> > > pipeline that doesn't include LVG.
> > > Tim
> > >
> > >
> > > On Tue, 2016-12-13 at 10:51 -0500, Pei Chen wrote:
> > > > > That's right.  mvn compile and test should work fine. The benign "test
> > > > > failed" error from the junit tests comes from install/package; it's
> > > > > been there since the beginning of time [1].  It would be nice to
> > > > > remove the benign warning messages.  If a proposed critical
> > > > > patch release passes the regression tests, doesn't break any existing
> > > > > behavior, enhances the project, and we have volunteers for RM, I do
> > > > > not see these spurious reasons as valid grounds to block releases; let's
> > > > > keep things moving along.
> > > > > Sean:  it would be great if you could open a Jira and apply the patch; we
> > > > > can always cut another release next time. I'll be happy to be RM for
> > > > > that one whenever you feel it's ready.
> > > >
> > > > > [1] http://markmail.org/search/?q=ctakes%20mvn%20package%20-DskipTests#query:ctakes%20mvn%20package%20-DskipTests+page:1+mid:oxgrkslhhjimpv4k+state:results
> > > >
> > > > On Tue, Dec 13, 2016 at 9:19 AM, Andrey Kurdumov
> > > > <kant2...@googlemail.com> wrote:
> > > > > NP f

Re: Infrastructures questions.

2016-12-13 Thread Pei Chen
What release are you referring to?

On Tue, Dec 13, 2016 at 11:08 AM, Finan, Sean
<sean.fi...@childrens.harvard.edu> wrote:
> By the way, did we ever vote on the release?
> http://www.apache.org/dev/release.html#approving-a-release
>
>
> -----Original Message-----
> From: Pei Chen [mailto:chen...@apache.org]
> Sent: Tuesday, December 13, 2016 10:51 AM
> To: dev@ctakes.apache.org
> Subject: Re: Infrastructures questions.
>
> That's right.  mvn compile and test should work fine. The benign "test failed" 
> error from the junit tests comes from install/package; it's been there since 
> the beginning of time [1].  It would be nice to remove the benign 
> warning messages.  If a proposed critical patch release passes the regression 
> tests, doesn't break any existing behavior, enhances the project, and we have 
> volunteers for RM, I do not see these spurious reasons as valid grounds to 
> block releases; let's keep things moving along.
> Sean:  it would be great if you could open a Jira and apply the patch; we can 
> always cut another release next time. I'll be happy to be RM for that one 
> whenever you feel it's ready.
>
> [1] http://markmail.org/search/?q=ctakes%20mvn%20package%20-DskipTests#query:ctakes%20mvn%20package%20-DskipTests+page:1+mid:oxgrkslhhjimpv4k+state:results
>
> On Tue, Dec 13, 2016 at 9:19 AM, Andrey Kurdumov <kant2...@googlemail.com> 
> wrote:
>> NP for broken build. I finally managed to run it, so I am just reporting
>> the issue so others don't have to jump through hoops like me.
>>
>> I just want to make a small correction - mvn compile works. mvn test
>> works too, but mvn package requires -DskipTests.
>> The problem with the build is somehow related to how Maven packages
>> things, I suspect.
>>
>> Packaging failed for me at "Apache cTAKES Dependency Parser
>>  FAILURE"; I have also attached the Surefire report with
>> the error.
>>
>> I will try to figure out why that error happens, but it could take
>> a while until I understand how Maven works.
>> Thanks for the prompt response!
>>
>> I have also started looking at how cTAKES works and investigating the
>> dependencies between packages, and found the following comment: "Temporary
>> workaround: Adding in the system scoped libraries. Remove these once they
>> are in Maven Central"
>> in ctakes-distribution\src\main\assembly\bin.xml. This comment
>> relates to dependencies that are checked into the source tree, but it
>> seems to me that they are now on Maven Central. See
>> (https://urldefense.proofpoint.com/v2/url?u=https-3A__mvnrepository.com_artifact_net.sf.mastif_mastif-2Dzoner=DgIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao=LL_nmgQ_ea-8hW5p-lXaiDLX2zp58A5ZuDimQJunDQ0=DwFSPNSn27AvMjRSNXGHueqTsyc_T5Aetds4ipSzuYo=
>>  ).
>> I saw issue
>>
>> CTAKES-185
>>
>> which could be appropriate for that, and I could create a patch for that
>> change. During my next project I will very likely be involved in
>> activities similar to cTAKES and could potentially contribute something
>> back, so I am trying to familiarize myself with the project.
>>
>>
>>
>> 2016-12-13 19:30 GMT+06:00 Finan, Sean <sean.fi...@childrens.harvard.edu>:
>>>
>>> Hi Andrey,
>>>
>>> The requirement of skipping tests for a successful build is something
>>> that all ctakes developers have stumbled across, but after initial
>>> setup we all forget about it and it has never been handled.  Apologies.
>>>
>>> The github mirror is something that would be great to have, but
>>> getting it up has been a nightmare.  The problem is that historically
>>> we have had binary files that are larger than the 100MB limit enforced by 
>>> github.
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__help.github.com_
>>> articles_working-2Dwith-2Dlarge-2Dfiles_=DgIFaQ=qS4goWBT7poplM69z
>>> y_3xhKwEW14JZMSdioCoppxeFU=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4g
>>> Tao=LL_nmgQ_ea-8hW5p-lXaiDLX2zp58A5ZuDimQJunDQ0=rdkXOpLdPvaAw15Zv
>>> gsqVztehD5Bc7SSClGDl5KTcMs= This causes github to reject the
>>> creation of the repository.
>>>
>>> I do think that, should somebody feel like putting in the effort, we
>>> could work with apache infra and get a working solution ... possibly

Re: cTAKES - 3.2.3 release

2016-12-06 Thread Pei Chen
[...] - Fast 
Dictionary should be able to load custom codifications from db
[CTAKES-382 <https://issues.apache.org/jira/browse/CTAKES-382>] - Add ability 
to easily add extension of UmlsConcept Type to jcas via dictionary lookup
Task
[CTAKES-74 <https://issues.apache.org/jira/browse/CTAKES-74>] - Tokenizer 
PennTreeBank breaks with certain apostrophes in tokens.
[CTAKES-138 <https://issues.apache.org/jira/browse/CTAKES-138>] - Remove 3rd 
party jars from our SVN
[CTAKES-232 <https://issues.apache.org/jira/browse/CTAKES-232>] - change 
concept type
On Dec 6, 2016, at 11:20 AM, Jeff Headley <jeffun...@gmail.com> wrote:
> 
> I realize I’m not a committer and maybe I shouldn’t express an opinion. 
> Apologies in advance if this is inappropriate. However as someone who has 
> gone through the pain of trying to install, learn, and use ctakes; I strongly 
> agree with Sean. I don’t inject myself into the situation lightly or to 
> “vent”. I have been in software development since 1996 and a lot of that time 
> in medical projects and using various open source frameworks like Spring, 
> Seam, Hibernate, etc. Sean is right.
> 
> Jeff
> 
> On Dec 6, 2016, 10:00 AM -0500, Pei Chen <pei.c...@wiredinformatics.com>, 
> wrote:
>> Considering the amount of time since the release was created, we should not 
>> let any pending Jiras or features hold up a release.
>> I suggest we just move anything that hasn’t been fixed in Jira into the next 
>> release and push forward. I’ll volunteer to do that right now.
>> As in the past, the documentation on the website shouldn’t hold up a 
>> release either.
>> 
>>> On Dec 6, 2016, at 9:20 AM, Finan, Sean <sean.fi...@childrens.harvard.edu> 
>>> wrote:
>>> 
>>> Hi Murali,
>>> 
>>> Before we make an rc, we must go through the list of currently open tars 
>>> and requests. SOP. A list needs to be compiled of what should be closed as 
>>> fixed or n/a plus another list of outstanding bugs that need to be dealt 
>>> with and an estimate of effort. Then we should try to gather volunteers to 
>>> handle said bugs. Can you take care of compiling those lists? I did this 
>>> many months ago when rc 3.2.3 came up, and there were items on which no 
>>> movement was made. If you can find my email that might be one place to 
>>> start.
>>> 
>>> The primary takeaways from the hackathon were, not surprisingly:
>>> 1. Installation of cTAKES is not as straightforward as we believe, and
>>> 2. Getting started with cTAKES is extremely difficult (no good starting 
>>> point) and scares off a large percentage of people who try.
>>> 3. Customization is next to impossible without diving into the code, which 
>>> is more time consuming than anyone can stand.
>>> 
>>> All can be handled best by short and simple GUI tools and some "cTAKES for 
>>> Beginners" documentation. We have some documentation that was used for the 
>>> Hackathon that needs to be modified a bit, then posted on the main cTAKES 
>>> website.
>>> 
>>> In my opinion these items should be worked upon before creating another 
>>> release, otherwise the release is not as useful as it could be. I have 
>>> started work on a simple pipeline builder gui that creates simple html or 
>>> text output. I will check it into trunk soon, but as new functionality 
>>> community testing will be required before a release.
>>> 
>>> Sean
>>> 
>>> -----Original Message-----
>>> From: Murali Minnah [mailto:mmin...@gmail.com]
>>> Sent: Monday, December 05, 2016 1:26 PM
>>> To: dev@ctakes.apache.org
>>> Subject: cTAKES - 3.2.3 release
>>> 
>>> I wanted to check to see if there are objections to creating a 3.2.3 tag of 
>>> trunk now to prepare for a 3.2.3-rc1?
>>> 
>>> Any comments from the participants/organizers on the success/lessons learnt 
>>> from the "hackathon" that the community can benefit from?
>>> 
>>> Best,
>>> Murali
>> 



Re: cTAKES - 3.2.3 release

2016-12-06 Thread Pei Chen
Considering the amount of time since the release was created, we should not let 
any pending Jiras or features hold up a release.
I suggest we just move anything that hasn’t been fixed in Jira into the next 
release and push forward. I’ll volunteer to do that right now.
As in the past, the documentation on the website shouldn’t hold up a release 
either.

> On Dec 6, 2016, at 9:20 AM, Finan, Sean  
> wrote:
> 
> Hi Murali,
> 
> Before we make an rc, we must go through the list of currently open tars and 
> requests.  SOP.  A list needs to be compiled of what should be closed as 
> fixed or n/a plus another list of outstanding bugs that need to be dealt with 
> and an estimate of effort.  Then we should try to gather volunteers to handle 
> said bugs.  Can you take care of compiling those lists?  I did this many 
> months ago when rc 3.2.3 came up, and there were items on which no movement 
> was made.  If you can find my email that might be one place to start.
> 
> The primary takeaways from the hackathon were, not surprisingly:
> 1.  Installation of cTAKES is not as straightforward as we believe, and
> 2.  Getting started with cTAKES is extremely difficult (no good starting 
> point) and scares off a large percentage of people who try.
> 3.  Customization is next to impossible without diving into the code, which 
> is more time consuming than anyone can stand.
> 
> All can be handled best by short and simple GUI tools and some "cTAKES for 
> Beginners" documentation.  We have some documentation that was used for the 
> Hackathon that needs to be modified a bit, then posted on the main cTAKES 
> website.
> 
> In my opinion these items should be worked upon before creating another 
> release, otherwise the release is not as useful as it could be.  I have 
> started work on a simple pipeline builder gui that creates simple html or 
> text output.  I will check it into trunk soon, but as new functionality 
> community testing will be required before a release.
> 
> Sean
> 
> -----Original Message-----
> From: Murali Minnah [mailto:mmin...@gmail.com] 
> Sent: Monday, December 05, 2016 1:26 PM
> To: dev@ctakes.apache.org
> Subject: cTAKES - 3.2.3 release
> 
> I wanted to check to see if there are objections to creating a 3.2.3 tag of 
> trunk now to prepare for a 3.2.3-rc1?
> 
> Any comments from the participants/organizers on the success/lessons learnt 
> from the "hackathon" that the community can benefit from?
> 
> Best,
> Murali



Re: Migration of ctakes-vm

2016-11-28 Thread Pei Chen
Hi Freddy,
I left some comments directly on the Jira earlier... I don't believe
that VM was actually used.  It may be easier to just decommission it; we
can open a request for a new one later if needed.
--Pei

On Mon, Nov 28, 2016 at 12:43 AM, Freddy Barboza Oviedo
 wrote:
>
>
> On 2016-11-23 08:56 (-0600), "Daniel Takamori" wrote:
>> Greetings from Infra,
>> We are in the process of moving old VMs off our VMWare cluster and into The 
>> Cloud.  We have a ticket for the process 
>> https://issues.apache.org/jira/browse/INFRA-12894 which Freddy Barboza our 
>> new Infra staffer will be working on.
>> If we could get someone familiar with the VM to take a look and sketch out 
>> what needs to be migrated, we can setup a new VM using Puppet 3 
>> https://git-wip-us.apache.org/repos/asf?p=infrastructure-puppet.git
>> What data to keep, which services to migrate, which package dependencies are 
>> needed as well as what (if anything) needs to be exposed publicly are things 
>> we need to know.
>>
>> Thanks for your cooperation!
>> -Pono on behalf of Infra
>>
> Hi guys,
> Can you please provide us an update on this email.
> We need this information from you in order to keep working on this.
>
> Please, let us know.
> Thanks,
> -Freddy - On behalf of Infra.


Re: [DISCUSS] Hadi Amiri as Apache cTAKES committer

2016-09-27 Thread Pei Chen
[-dev@, + private@]
Moved to the private list as this is typically better suited for
personnel matters, voting in committers, pmc, security-related issues.

Sounds like a good addition to grow the group, but what has Hadi
contributed to Apache so far? Wait until there have been some
contributions in Jira first?

--Pei

On Tue, Sep 27, 2016 at 2:51 PM, Finan, Sean
 wrote:
> Hadi is a new member of the NLP group here at Boston Children's Hospital.  He 
> has a background in NLP research and will now be applying his knowledge to 
> the biomedical domain, and he will be using cTAKES (and why wouldn't he?)
>
> Sean


Re: Karma for Jira

2016-06-10 Thread Pei Chen
Done.
Also added the new committers (Lewis, Peter, and Azad) as well.
—Pei

> On Jun 10, 2016, at 2:11 AM, Richard Eckart de Castilho  
> wrote:
> 
> Hi all,
> 
> could somebody please add me to the cTAKES project in Jira
> such that I can assign issues to myself?
> 
> Best,
> 
> -- Richard





Re: headword field in identifiedannotations

2016-06-10 Thread Pei Chen
I don't see any issues with adding the additional optional
attribute... I think we already did the same for other items like
relations for similar reasons.  The only catch is probably that the
dependency parser will need the dictionary lookup to be run first
(assuming that the logic will be added to the DP to iterate through
all NEs in the CAS) if they want to use that attribute.


On Thu, Jun 9, 2016 at 5:13 PM, Miller, Timothy
 wrote:
> How do people feel about modifying the typesystem? I'm finding that
> grabbing the dependency headword is something very useful for feature
> extraction. But it is a bottleneck if every feature extractor that uses
> it has to recompute it. So I propose adding a field to the
> IdentifiedAnnotation type of "headNode" with type ConllDependencyNode.
>
> Any thoughts or good reasons to avoid this?
>
> Thanks
> Tim
>


Welcome Lewis John McGibbney as a cTAKES committer

2016-05-27 Thread Pei Chen
The Apache cTAKES PMC is pleased to introduce Lewis John McGibbney as
a new committer. We are very happy with the sustained growth of the
project and look forward to continued contributions from the community
and adding to the ranks of the cTAKES committers.

--Pei


Re: cTAKES dirty on checkout

2016-05-16 Thread Pei Chen
+1
I think there was already a Jira to remove the Eclipse-specific settings, or at 
least derive them automatically from the pom.xml files.
—Pei

> On May 13, 2016, at 11:48 AM, Richard Eckart de Castilho  
> wrote:
> 
> Hi all,
> 
> when checking out the sources of cTAKES from SVN with Eclipse, most of the 
> projects are dirty because the Eclipse settings (.classpath and 
> jdt.core.prefs) are in the SVN. The particular difference is that on my 
> machine, the projects are configured to use a Java 8, while in SVN, it is 
> configured to be a Java 7.
> 
> The parent POM of cTAKES states Java 8
> 
>   1.8
>   1.8
> 
> Since the Eclipse files in SVN are at least outdated, maybe it would be a 
> good idea to drop the .classpath and jdt prefs files from SVN and prevent 
> them from being committed?
> 
> Cheers,
> 
> -- Richard





Re: ctakes uimafit analysis engine resource initialization errors

2016-03-01 Thread Pei Chen
Also, check that the liblinear dependency is in your pom.xml (it
should already be included in ctakes-assertion/pom.xml).

<dependency>
  <groupId>org.cleartk</groupId>
  <artifactId>cleartk-ml-liblinear</artifactId>
</dependency>


On Tue, Mar 1, 2016 at 7:53 AM, Miller, Timothy
 wrote:
> Hi Jay,
> I've never seen that one before -- sounds like you're looking in the right 
> place. The first thing I would try is to manually delete the 
> cleartk-ml-liblinear folder in your .m2 directory and then do a mvn project 
> update (from eclipse) or mvn clean compile (from cmd line) in case there was 
> an issue with the downloaded jar. But that is kind of grasping at straws -- 
> hopefully someone else will have some other things to try.
>
> Tim
> 
> From: Jay Urbain 
> Sent: Tuesday, March 1, 2016 7:21 AM
> To: dev@ctakes.apache.org
> Subject: ctakes uimafit analysis engine resource initialization errors
>
> I'm trying to run the AggregatePlaintextUMLSProcessor AE in Eclipse.
> - ctakes 3.2.3-SNAPSHOT
>
> I'm getting ctakes uimafit analysis engine resource initialization errors.
>
> First, I have no compile errors, and I'm using the developer version of
> ctakes "out of the box," i.e., with no modifications except for correcting
> Maven dependency errors.
>
> I've been struggling to resolve the following
> ResourceInitializationException:
>
> 3/1/16 5:31:44 AM - 18:
> org.apache.uima.tools.cvd.MainFrame.handleException(526): SEVERE:
> Initialization of annotator class
> "org.apache.ctakes.assertion.medfacts.cleartk.HistoryCleartkAnalysisEngine"
> failed.  (Descriptor:
> file:/Users/jayurbain/Dropbox/apache-ctakes-3.2.2/desc/ctakes-assertion/desc/analysis_engine/HistoryCleartkAnalysisEngine.xml)
> org.apache.uima.resource.ResourceInitializationException: Initialization of
> annotator class
> "org.apache.ctakes.assertion.medfacts.cleartk.HistoryCleartkAnalysisEngine"
> failed.  (Descriptor:
> file:/Users/jayurbain/Dropbox/apache-ctakes-3.2.2/desc/ctakes-assertion/desc/analysis_engine/HistoryCleartkAnalysisEngine.xml)
>
> The failure is caused by:
> Caused by: java.lang.ClassNotFoundException:
> org.cleartk.ml.liblinear.LibLinearStringOutcomeClassifierBuilder
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:264)
> at
> org.cleartk.ml.jar.JarClassifierBuilder.fromManifest(JarClassifierBuilder.java:105)
> ... 61 more
>
> The code fails here:
>
> public class HistoryCleartkAnalysisEngine extends
> AssertionCleartkAnalysisEngine {
>
> boolean USE_DEFAULT_EXTRACTORS = false;
> @Override
> public void initialize(UimaContext context) throws
> ResourceInitializationException {
> super.initialize(context); // <--- fails here ---
> probabilityOfKeepingADefaultExample = 0.5;
> initialize_history_extractor();
> initializeFeatureSelection();
> }
>
> In the past, I've been able to fix these errors by fixing a missing
> dependency or by adding a specific version declaration to a dependency.
>
> Here's the declaration in AggregatePlaintextUMLSProcessor.xml:
>
>  
> <import location="../../../ctakes-assertion/desc/analysis_engine/HistoryCleartkAnalysisEngine.xml"/>
>
> The HistoryCleartkAnalysisEngine.xml is automatically generated by uimaFIT.
>
> I have the cleartk-ml-liblinear-2.0.0.jar in my .m2 repository.
>
> I have the following dependency in the ctakes-assertion and the
> ctakes-clinical-pipeline pom.xml:
>
> <dependency>
>   <groupId>org.cleartk</groupId>
>   <artifactId>cleartk-ml</artifactId>
>   <version>2.0.0</version>
> </dependency>
>
> Any guidance would be appreciated.
>
> Thanks,
> Jay Urbain
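
When chasing a ClassNotFoundException like the one above, it can help to confirm whether the class is visible on the runtime classpath at all, independent of UIMA. The following is a minimal sketch, not part of cTAKES: ClasspathCheck is a hypothetical helper name, and it only uses the standard Class.forName call with the class name taken from the stack trace.

```java
final class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace; pass another name as an argument.
        String name = args.length > 0
                ? args[0]
                : "org.cleartk.ml.liblinear.LibLinearStringOutcomeClassifierBuilder";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found on classpath"));
    }
}
```

Run it with the same classpath that the failing launch configuration uses. If the class is reported as not found, the cleartk-ml-liblinear jar is most likely missing from that classpath, even if it is present in the .m2 repository.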


Re: ctakes gui

2016-02-18 Thread Pei Chen
Hi Ben,
I think the ctakes-gui in the sandbox area is really outdated and
hasn't been maintained in a long time (hence in sandbox).
But there was an old thread [1] that you may find useful.

[1] 
https://mail-archives.apache.org/mod_mbox/ctakes-dev/201505.mbox/%3cCAPUoHuEj1aFC6PQG=jlkgsoquutq17l0mhmdpwvatd6uwqg...@mail.gmail.com%3e

On Thu, Feb 18, 2016 at 12:56 PM, Ben Yu  wrote:
> Hi ctakes group,
> Is the ctakes gui actively maintained? I downloaded it and followed Pei's 
> installation guide (not entirely because some of the instructions don't seem 
> to apply), and after some maneuvering I had it up and running with tomcat7. 
> When I try to bring the app up http://localhost:8080/ctakesgui/, the 
> login.html page came back blank. I noticed that the login.jsp (which seems to 
> be the file Spring mvc mapped to for login.html) used some external 
> javascript files and one of them, i18n.js is missing. I have not used it at 
> all and not sure what it is. Is that the reason why I got the blank page? Am 
> I missing some other front end stuff? The body tag does not have any content 
> in it.
>
> Thanks, and appreciate any help.
>
> Ben Yu
> Software Design Engineer
> College of Pharmacy, University of Utah
> 801-587-7751
>


Re: ctakes resource exception

2016-02-15 Thread Pei Chen
Jay,
Did you also download and unzip the dictionary itself?

UMLS® Dictionary

Zipped copy of the cTAKES™ UMLS® dictionary. Please refer to the TODO :
Should write a separate page that outlines what the dictionaries are,
licensing, and where/how to get them on the net, where to install them
locally, and how to configure user/pass Dictionary Install Guide
<https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+Dictionary+Lookup>
for assistance. Install fast version if only running ctakes-fast.

All Versions
<http://sourceforge.net/projects/ctakesresources/files/ctakes-resources-3.2.1.1-bin.zip/download>
Fast
Version
<http://sourceforge.net/projects/ctakesresources/files/ctakessnorx-3.2.1.1.zip/download>

   Pei Chen
Wired Informatics <http://www.wiredinformatics.com>
265 Franklin St Ste 1702
Boston, MA 02110
tel: (617) 433-7544
pei.c...@wiredinformatics.com

On Mon, Feb 15, 2016 at 9:45 AM, Jay Urbain <jay.urb...@gmail.com> wrote:

> Hi,
>
> I'm trying to run bin/runctakesCVD.sh. When I try to load the
> AggregatePlaintextFastUMLSProcessor.xml, as described in the User install
> guide (
>
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+User+Install+Guide
> )
> I receive the ResourceInitializationException (please see details below).
>
> It appears that I do not have something set up correctly with the UMLS
> resources. If I try to run the AggregatePlaintextProcessor, everything
> seems to work Ok.
>
> Any help or direction would be appreciated.
>
> Thanks,
> Jay
>
> Environment:
> OS X Yosemite 10.10.5
> Java 1.8
>
> Downloads:
> apache-ctakes-3.2.2
> ctakes-resources-3.2.1.1-bin
>
> Copied the resources directory:
> ditto /Users/jayurbain/Downloads/ctakes-resources-3.2.1.1-bin/resources/*
> /Users/jayurbain/Dropbox/apache-ctakes-3.2.2/resources
>
> Added my UMLS user authentication to runctakesCVD.sh
> java -Dctakes.umlsuser=
> -Dctakes.umlspw=
>
> Exception:
>
> 2/15/16 8:34:25 AM - 15:
> org.apache.uima.tools.cvd.MainFrame.handleException(526): SEVERE:
> Initialization of annotator class
> "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator" failed.
>  (Descriptor:
>
> file:/Users/jayurbain/Dropbox/apache-ctakes-3.2.2/desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml)
> org.apache.uima.resource.ResourceInitializationException: Initialization of
> annotator class
> "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator" failed.
>  (Descriptor:
>
> file:/Users/jayurbain/Dropbox/apache-ctakes-3.2.2/desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml)
> at
>
> org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:252)
> at
>
> org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:156)
> at
>
> org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
> at
>
> org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
> at
> org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
> at
> org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
> at
>
> org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
> at
>
> org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
> at
>
> org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
> at
>
> org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
> at
>
> org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
> at
> org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:354)
> at org.apache.uima.tools.cvd.MainFrame.setupAE(MainFrame.java:1484)
> at org.apache.uima.tools.cvd.MainFrame.loadAEDescriptor(MainFrame.java:476)
> at
>
> org.apache.uima.tools.cvd.control.AnnotatorOpenEventHandler.actionPerformed(AnnotatorOpenEventHandler.java:52)
> at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2022)
> at
>
> javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2348)
> at
>
> javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
> at javax.

Re: Mac/download link broken

2016-02-11 Thread Pei Chen
The links on the menu should point to http://ctakes.apache.org/downloads.cgi
Do you know where the .html link came from? Those should be updated.

—Pei

> On Feb 11, 2016, at 4:02 PM, taposh.d@kp.org wrote:
> 
> Hi
> 
> When I click user installation for MAC/Linux  from
> http://ctakes.apache.org/downloads.html
> 
> I get a broken link
> http://ctakes.apache.org/
> [preferred]/ctakes/ctakes-3.2.2/apache-ctakes-3.2.2-bin.tar.gz
> 
> Can some one forward me the right link and fix this or let me know how to
> and I will fix it.
> 
> Regards,
> 
> Taposh D. Roy  |  Health Data Project Lead/Scientist  |  Delivery System
> Analytics, Decision Support  |  Kaiser Permanente  | cell: 510.206.1633 |
> taposh.d@kp.org
> 
> 
> 
> NOTICE TO RECIPIENT:  If you are not the intended recipient of this
> e-mail, you are prohibited from sharing, copying, or otherwise using or
> disclosing its contents.  If you have received this e-mail in error,
> please notify the sender immediately by reply e-mail and permanently
> delete this e-mail and any attachments without reading, forwarding or
> saving them.  Thank you.
> 
> 
> 
> 
> 
> From:   "Savova, Guergana" <guergana.sav...@childrens.harvard.edu>
> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> Date:   02/10/2016 11:20 AM
> Subject:RE: Contributing to documentation
> 
> 
> 
> Hi Jessica,
> Thank you very much for offering to contribute to the documentation!
> Indeed this is our weak link and any help there will be greatly
> appreciated.
> A warm welcome to the community!
> --Guergana
> 
> 
> -----Original Message-----
> From: Pei Chen [mailto:chen...@apache.org]
> Sent: Wednesday, February 10, 2016 1:41 PM
> To: dev@ctakes.apache.org
> Subject: Re: Contributing to documentation
> 
> We've been generally following the C-T-R model [1]
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_foundation_glossary.html-23CommitThenReview=BQIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcrO4yRGmRCJNAr-rCmP=yprBottjMZmd-5h2kun5_56ITgboOGhRiM1FrbJtLiE=BiPUyRARC7nrVJaM2ajjNaANac3AbCc0l25_hWVUCQU=
> 
> But feel free to discuss on dev@ whenever in doubt...
> 
> [1]
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_foundation_glossary.html-23CommitThenReview=BQIFaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcrO4yRGmRCJNAr-rCmP=yprBottjMZmd-5h2kun5_56ITgboOGhRiM1FrbJtLiE=BiPUyRARC7nrVJaM2ajjNaANac3AbCc0l25_hWVUCQU=
> 
> 
> On Wed, Feb 10, 2016 at 1:31 PM, Jessica Glover
> <glover.jessic...@gmail.com> wrote:
>> Thank you. I'm excited to contribute.
>> 
>> Is there a process by which my contributions should get "voted in" or
>> am I free to just start editing?
>> 
>> - Jessica
>> 
>> On Feb 10, 2016 9:28 AM, "Pei Chen" <pei.c...@wiredinformatics.com>
> wrote:
>> 
>>> User Jessica Glover (jgloves) Added to:
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org
>>> _confluence_display_CTAKES_cTAKES=BQIFaQ=qS4goWBT7poplM69zy_3xhKw
>>> EW14JZMSdioCoppxeFU=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcr
>>> O4yRGmRCJNAr-rCmP=yprBottjMZmd-5h2kun5_56ITgboOGhRiM1FrbJtLiE=LVL
>>> CQGevx3dGn1G-IoKWfyFMl6ZQThSi90BoERcRp6w=
>>> Enjoy!
>>> —Pei
>>> 
>>> On Feb 10, 2016, at 8:43 AM, Jessica Glover
>>> <glover.jessic...@gmail.com>
>>> wrote:
>>> 
>>> Hi Pei,
>>> I'm not sure what my confluence ID is. I log in with this email
>>> address, and I can be found under Jessica Glover in a People search.
>>> 
>>> - Jessica
>>> This would be great.  What is your confluence id (anyone should be
>>> able to create an account)?
>>> --Pei
>>> 
>>> On Tue, Feb 9, 2016 at 7:49 AM, Jessica Glover
>>> <glover.jessic...@gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> I am a cTAKES user, but I am interested in development and especially
>>> interested in contributing to the documentation. I have some ideas
>>> for making the component use guides more user-friendly for first-time
>>> UIMAers, but I'm also eager to hear what the dev community would like
>>> to see. I am happy to write as well as create diagrams.
>>> 
>>> Thanks,
>>> 
>>> Jessica Glover
>>> 
>>> 
>>> 
> 
> 





Re: Mac/download link broken

2016-02-11 Thread Pei Chen
Yes, no one should be using http://ctakes.apache.org/downloads.html
All links should be using http://ctakes.apache.org/downloads.cgi so that it 
dynamically resolves the mirrors properly…

> On Feb 11, 2016, at 4:11 PM, taposh.d@kp.org wrote:
> 
> Pei -
> 
> [preferred] refers to the preferred mirror and needs to be redirected to the 
> mirror host name. If user is new this will break.
> 
> HTML Page
> http://ctakes.apache.org/downloads.html --> Click on Mac/Linux under 
> User Installation to see it broken.
> (http://ctakes.apache.org/[preferred]/ctakes/ctakes-3.2.2/apache-ctakes-3.2.2-bin.tar.gz)
> 
> CGI Page
> http://ctakes.apache.org/downloads.cgi  --> Works...
> 
> Regards,
> 
> Taposh D. Roy  |  Health Data Project Lead/Scientist  |  Delivery System 
> Analytics, Decision Support  |  Kaiser Permanente  | cell: 510.206.1633 | 
> taposh.d@kp.org
> 
> 
> 
> NOTICE TO RECIPIENT:  If you are not the intended recipient of this e-mail, 
> you are prohibited from sharing, copying, or otherwise using or disclosing 
> its contents.  If you have received this e-mail in error, please notify the 
> sender immediately by reply e-mail and permanently delete this e-mail and any 
> attachments without reading, forwarding or saving them.  Thank you.
> 
> 
> 
> 
> 
> From:Pei Chen <pei.c...@wiredinformatics.com>
> To:dev@ctakes.apache.org
> Date:02/11/2016 01:05 PM
> Subject:Re: Mac/download link broken
> 
> 
> 
> The links on the menu should point to http://ctakes.apache.org/downloads.cgi 
> <http://ctakes.apache.org/downloads.cgi>
> Do you know where the .html link came from; those should be updated.
> 
> —Pei
> 
> On Feb 11, 2016, at 4:02 PM, taposh.d@kp.org <mailto:taposh.d@kp.org> 
> wrote:
> 
> Hi
> 
> When I click user installation for MAC/Linux  from
> http://ctakes.apache.org/downloads.html 
> <http://ctakes.apache.org/downloads.html>
> 
> I get a broken link
> http://ctakes.apache.org/ <http://ctakes.apache.org/>
> [preferred]/ctakes/ctakes-3.2.2/apache-ctakes-3.2.2-bin.tar.gz
> 
> Can some one forward me the right link and fix this or let me know how to
> and I will fix it.
> 
> Regards,
> 
> Taposh D. Roy  |  Health Data Project Lead/Scientist  |  Delivery System
> Analytics, Decision Support  |  Kaiser Permanente  | cell: 510.206.1633 |
> taposh.d@kp.org
> 
> 
> 
> NOTICE TO RECIPIENT:  If you are not the intended recipient of this
> e-mail, you are prohibited from sharing, copying, or otherwise using or
> disclosing its contents.  If you have received this e-mail in error,
> please notify the sender immediately by reply e-mail and permanently
> delete this e-mail and any attachments without reading, forwarding or
> saving them.  Thank you.
> 
> 
> 
> 
> 
> From:   "Savova, Guergana" <guergana.sav...@childrens.harvard.edu>
> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> Date:   02/10/2016 11:20 AM
> Subject:RE: Contributing to documentation
> 
> 
> 
> Hi Jessica,
> Thank you very much for offering to contribute to the documentation!
> Indeed this is our weak link and any help there will be greatly
> appreciated.
> A warm welcome to the community!
> --Guergana
> 
> 
> -----Original Message-----
> From: Pei Chen [mailto:chen...@apache.org <mailto:chen...@apache.org>]
> Sent: Wednesday, February 10, 2016 1:41 PM
> To: dev@ctakes.apache.org
> Subject: Re: Contributing to documentation
> 
> We've been generally following the C-T-R model [1]
> http://www.apache.org/foundation/glossary.html#CommitThenReview
> 
> But feel free to discuss on dev@ whenever in doubt...
> 
> [1]
> http://www.apache.org/foundation/glossary.html#CommitThenReview

Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

2016-01-30 Thread Pei Chen
CTAKES-384-20160129.patch applied.

> On Jan 29, 2016, at 4:34 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
> 
> Hi,
> 
> the problems were caused by the svn client in my Eclipse. Sorry for the
> trouble, I should have looked more closely at the complete patch.
> 
> I attached a new patch created with command-line tools which looks correct
> now.
> 
> Pei, can you apply the new patch?
> 
> Best,
> 
> Peter
> 
> Am 28.01.2016 um 15:57 schrieb Peter Klügl:
>> Thanks Pei.
>> 
>> I fear there was again a problem with the patch. All new files are
>> missing (and also the svn-ignore settings).
>> 
>> Can you take a look?
>> 
>> Best,
>> 
>> Peter
>> 
>> Am 28.01.2016 um 14:43 schrieb Pei Chen:
>>> patch applied.
>>> Thanks,
>>> Pei
>>> 
>>> On Thu, Jan 28, 2016 at 4:14 AM, Peter Klügl <peter.klu...@averbis.com> 
>>> wrote:
>>>> Hi Pei,
>>>> 
>>>> can you commit the recent patch for us?
>>>> 
>>>> CTAKES-384-20160120.patch
>>>> 
>>>> Best,
>>>> 
>>>> Peter
>>>> 
>>>> Am 20.01.2016 um 19:35 schrieb Pei Chen:
>>>>> Hi,
>>>>> Sorry I was swamped recently.
>>>>> But yeah, we can even create an extended type system to store these items 
>>>>> temporarily and add them into the main/core type system afterwards.
>>>>> There was an existing item to upgrade UIMA, but agreed- it will require 
>>>>> much more testing.  If it works, we can upgrade it in our sandbox area or 
>>>>> create a branch if necessary.
>>>>> 
>>>>> —Pei
>>>>> 
>>>>>> On Jan 18, 2016, at 9:06 AM, Peter Klügl <peter.klu...@averbis.com> 
>>>>>> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> a new patch is attached.
>>>>>> 
>>>>>> @Pei:
>>>>>> are there suitable annotation types in the cTAKES type system? Some
>>>>>> project in cTAKES uses something like OntologyMatch... I map it to
>>>>>> IdentifiedAnnotation right now, but there are many empty features...
>>>>>> 
>>>>>> @Azad:
>>>>>> I changed the rules a bit, especially the capitalization like I use it
>>>>>> in ruta normally. The wordlist are compiled to a trie by the maven
>>>>>> plugin. I also added the two regexes for url and email. I extended the
>>>>>> regex for the url. I also changed the evaluation order of some rules
>>>>>> (with @). Feel free to add simple examples to examples.csv for the unit
>>>>>> tests.
>>>>>> 
>>>>>> Let me know if you need more information about the changes.
>>>>>> 
>>>>>> Do you wanna have help with the other rule sets? Or should we split them 
>>>>>> up?
>>>>>> 
>>>>>> Best,
>>>>>> 
>>>>>> Peter
>>>>>> 
>>>>>> Am 18.01.2016 um 11:04 schrieb Peter Klügl:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> great. I will integrate them in the project and in the next patch.
>>>>>>> 
>>>>>>> Best,
>>>>>>> 
>>>>>>> Peter
>>>>>>> 
>>>>>>> Am 18.01.2016 um 00:58 schrieb Azad Dehghan:
>>>>>>>> Three NERs translated and uploaded.
>>>>>>>> 
>>>>>>>> PS. I will validate all NERs once we have them all completed.
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> Azad
>>>>>>>> 
>>>>>>>> On 24 November 2015 at 10:37, Azad Dehghan <azad.dehg...@gmail.com> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> This is on my todo list for Dec. as well. If there are any more 
>>>>>>>>> volunteers
>>>>>>>>> for translating JAPE to RUTA, please get in touch.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Azad
>>>>>>>>> 
>>>>>>>>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> 

Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

2016-01-28 Thread Pei Chen
patch applied.
Thanks,
Pei

On Thu, Jan 28, 2016 at 4:14 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
> Hi Pei,
>
> can you commit the recent patch for us?
>
> CTAKES-384-20160120.patch
>
> Best,
>
> Peter
>
> Am 20.01.2016 um 19:35 schrieb Pei Chen:
>> Hi,
>> Sorry I was swamped recently.
>> But yeah, we can even create an extended type system to store these items 
>> temporarily and add them into the main/core type system afterwards.
>> There was an existing item to upgrade UIMA, but agreed- it will require much 
>> more testing.  If it works, we can upgrade it in our sandbox area or create 
>> a branch if necessary.
>>
>> —Pei
>>
>>> On Jan 18, 2016, at 9:06 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
>>>
>>> Hi,
>>>
>>> a new patch is attached.
>>>
>>> @Pei:
>>> are there suitable annotation types in the cTAKES type system? Some
>>> project in cTAKES uses something like OntologyMatch... I map it to
>>> IdentifiedAnnotation right now, but there are many empty features...
>>>
>>> @Azad:
>>> I changed the rules a bit, especially the capitalization like I use it
>>> in ruta normally. The wordlist are compiled to a trie by the maven
>>> plugin. I also added the two regexes for url and email. I extended the
>>> regex for the url. I also changed the evaluation order of some rules
>>> (with @). Feel free to add simple examples to examples.csv for the unit
>>> tests.
>>>
>>> Let me know if you need more information about the changes.
>>>
>>> Do you wanna have help with the other rule sets? Or should we split them up?
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> Am 18.01.2016 um 11:04 schrieb Peter Klügl:
>>>> Hi,
>>>>
>>>> great. I will integrate them in the project and in the next patch.
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> Am 18.01.2016 um 00:58 schrieb Azad Dehghan:
>>>>> Three NERs translated and uploaded.
>>>>>
>>>>> PS. I will validate all NERs once we have them all completed.
>>>>>
>>>>> Cheers,
>>>>> Azad
>>>>>
>>>>> On 24 November 2015 at 10:37, Azad Dehghan <azad.dehg...@gmail.com> wrote:
>>>>>
>>>>>> This is on my todo list for Dec. as well. If there are any more 
>>>>>> volunteers
>>>>>> for translating JAPE to RUTA, please get in touch.
>>>>>>
>>>>>> Cheers,
>>>>>> Azad
>>>>>>
>>>>>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I just wanted to mention that I haven't forgotten about it. Unfortunately,
>>>>>>> there is just no spare time right now. I hope I will be able to provide
>>>>>>> the patches in December.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Peter
>>>>>>>
>>>>>>> Am 06.11.2015 um 16:40 schrieb Pei Chen:
>>>>>>>> Hi Peter,
>>>>>>>> I think the ctakes-examples is probably a good starting point at least
>>>>>>>> in terms of maven modules, etc.  I think it would be good if we use
>>>>>>>> uimaFIT style as primary approach to wiring components together and
>>>>>>>> generate desc's as secondary...
>>>>>>>> I think the actual components that would be required is probably best
>>>>>>>> left up to what is actually required for best performing c-deid.  The
>>>>>>>> output would be interesting, I'm not sure if we should treat this as
>>>>>>>> an independent preprocessing component or part of a pipeline (in which
>>>>>>>> case, we may need to propose a change to the type system or perhaps an
>>>>>>>> alternative JCas view.  You can probably open up that discussion to
>>>>>>>> the dev group as you see fit.)
>>>>>>>>
>>>>>>>> My 2 cents...
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.klu...@averbis.com>

Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

2016-01-20 Thread Pei Chen
Hi,
Sorry I was swamped recently.
But yeah, we can even create an extended type system to store these items 
temporarily and add them into the main/core type system afterwards.
There was an existing item to upgrade UIMA, but agreed- it will require much 
more testing.  If it works, we can upgrade it in our sandbox area or create a 
branch if necessary.

—Pei

> On Jan 18, 2016, at 9:06 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
> 
> Hi,
> 
> a new patch is attached.
> 
> @Pei:
> are there suitable annotation types in the cTAKES type system? Some
> project in cTAKES uses something like OntologyMatch... I map it to
> IdentifiedAnnotation right now, but there are many empty features...
> 
> @Azad:
> I changed the rules a bit, especially the capitalization like I use it
> in ruta normally. The wordlist are compiled to a trie by the maven
> plugin. I also added the two regexes for url and email. I extended the
> regex for the url. I also changed the evaluation order of some rules
> (with @). Feel free to add simple examples to examples.csv for the unit
> tests.
> 
> Let me know if you need more information about the changes.
> 
> Do you wanna have help with the other rule sets? Or should we split them up?
> 
> Best,
> 
> Peter
> 
> Am 18.01.2016 um 11:04 schrieb Peter Klügl:
>> Hi,
>> 
>> great. I will integrate them in the project and in the next patch.
>> 
>> Best,
>> 
>> Peter
>> 
>> Am 18.01.2016 um 00:58 schrieb Azad Dehghan:
>>> Three NERs translated and uploaded.
>>> 
>>> PS. I will validate all NERs once we have them all completed.
>>> 
>>> Cheers,
>>> Azad
>>> 
>>> On 24 November 2015 at 10:37, Azad Dehghan <azad.dehg...@gmail.com> wrote:
>>> 
>>>> This is on my todo list for Dec. as well. If there are any more volunteers
>>>> for translating JAPE to RUTA, please get in touch.
>>>> 
>>>> Cheers,
>>>> Azad
>>>> 
>>>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> wrote:
>>>>> Hi,
>>>>> 
>>>>> I just wanted to mention that I haven't forgotten about it. Unfortunately,
>>>>> there is just no spare time right now. I hope I will be able to provide
>>>>> the patches in December.
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Peter
>>>>> 
>>>>> Am 06.11.2015 um 16:40 schrieb Pei Chen:
>>>>>> Hi Peter,
>>>>>> I think the ctakes-examples is probably a good starting point at least
>>>>>> in terms of maven modules, etc.  I think it would be good if we use
>>>>>> uimaFIT style as primary approach to wiring components together and
>>>>>> generate desc's as secondary...
>>>>>> I think the actual components that would be required is probably best
>>>>>> left up to what is actually required for best performing c-deid.  The
>>>>>> output would be interesting, I'm not sure if we should treat this as
>>>>>> an independent preprocessing component or part of a pipeline (in which
>>>>>> case, we may need to propose a change to the type system or perhaps an
>>>>>> alternative JCas view.  You can probably open up that discussion to
>>>>>> the dev group as you see fit.)
>>>>>> 
>>>>>> My 2 cents...
>>>>>> 
>>>>>> 
>>>>>> On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Is there a cTAKES project that may serve as an example of how the cTAKES
>>>>>>> community develops or what a project should look like?
>>>>>>> I learned that different people set up UIMA projects in quite different
>>>>>>> ways, and I do not want to get inspired by "some sort of out-dated"
>>>>>>> approach in the cTAKES repo.
>>>>>>> 
>>>>>>> Are there restrictions or preferences about the preprocessing components
>>>>>>> that should be used and the kind of "output" of the project?
>>>>>>> Components: On which components may the components rely: tokenizer, ...
>>>>>>> parser, ... dict lookup?
>>>>>>> "output": Should t

Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

2016-01-11 Thread Pei Chen
Patch applied:
http://svn.apache.org/repos/asf/ctakes/sandbox/ctakes-clinical-deid/ 
Thanks Peter.

What error did you get with xml-api’s?  Do you mean upgrade ctakes to the 
latest version of uima instead of 2.4.0?

—Pei

> On Jan 11, 2016, at 12:39 PM, Peter Klügl <peter.klu...@averbis.com> wrote:
> 
> Hi,
> 
> I just added a small patch which adds a maven build process and a dummy
> unit test.
> 
> I had some problems with the version of xml-apis. Is this known or
> rather a local problem on my build machine?
> Is there a reason why cTAKES requires uima 2.4.0?
> 
> Next step would be translating the rules. Azad mentioned that he already
> started with that :-)
> 
> Best,
> 
> Peter
> 
> 
> Am 18.12.2015 um 11:01 schrieb Peter Klügl:
>> Hi,
>> 
>> sorry, there was no free time left in December for this issue, but I
>> will be able to provide the patches in January (for real).
>> 
>> Best,
>> 
>> Peter
>> 
>> Am 24.11.2015 um 11:37 schrieb Azad Dehghan:
>>> This is on my todo list for Dec. as well. If there are any more volunteers
>>> for translating JAPE to RUTA, please get in touch.
>>> 
>>> Cheers,
>>> Azad
>>> 
>>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> wrote:
>>>> Hi,
>>>> 
>>>> I just wanted to mention that I haven't forgotten about it. Unfortunately,
>>>> there is just no spare time right now. I hope I will be able to provide
>>>> the patches in December.
>>>> 
>>>> Best,
>>>> 
>>>> Peter
>>>> 
>>>> Am 06.11.2015 um 16:40 schrieb Pei Chen:
>>>>> Hi Peter,
>>>>> I think the ctakes-examples is probably a good starting point at least
>>>>> in terms of maven modules, etc.  I think it would be good if we use
>>>>> uimaFIT style as primary approach to wiring components together and
>>>>> generate desc's as secondary...
>>>>> I think the actual components that would be required is probably best
>>>>> left up to what is actually required for best performing c-deid.  The
>>>>> output would be interesting, I'm not sure if we should treat this as
>>>>> an independent preprocessing component or part of a pipeline (in which
>>>>> case, we may need to propose a change to the type system or perhaps an
>>>>> alternative JCas view.  You can probably open up that discussion to
>>>>> the dev group as you see fit.)
>>>>> 
>>>>> My 2 cents...
>>>>> 
>>>>> 
>>>>> On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> Is there a cTAKES project that may serve as an example of how the cTAKES
>>>>>> community develops or what a project should look like?
>>>>>> I learned that different people set up UIMA projects in quite different
>>>>>> ways, and I do not want to get inspired by "some sort of out-dated"
>>>>>> approach in the cTAKES repo.
>>>>>> 
>>>>>> Are there restrictions or preferences about the preprocessing components
>>>>>> that should be used and the kind of "output" of the project?
>>>>>> Components: On which components may the components rely: tokenizer, ...
>>>>>> parser, ... dict lookup?
>>>>>> "output": Should the project provide a pipeline or a single AE?
>>>>>> 
>>>>>> More comments below.
>>>>>> 
>>>>>> Am 03.11.2015 um 16:54 schrieb Azad Dehghan:
>>>>>>>> Who else plans to provide patches for it? Just to avoid duplicate work
>>>>>>>> and to coordinate the efforts ...
>>>>>>>> 
>>>>>>> I would like to help with translating JAPE to RUTA.
>>>>>> You can already go ahead with the UIMA Ruta Workbench if you want, or
>>>>>> wait until I set up the project with ruta integration.
>>>>>> 
>>>>>> If any questions arise, just ask :-)
>>>>>> 
>>>>>>>> Is there a development dataset which was utilized for the initial
>>>>>>>> development, and if yes, is it possible to

Re: Need help to identify procedures in xml file using AggregatePlaintextFastUMLSProcessor

2015-12-08 Thread Pei Chen
Hi Reena,
If you search for "ProcedureMention" in the attached output xml, you
should be able to find the Procedures (plus the FSArray of the
associated Concepts) that were extracted...
Or am I missing something...
--Pei

On Mon, Dec 7, 2015 at 12:40 AM, Reena Duggal
 wrote:
> Sorry, I attached the wrong file in last mail. PFA the correct xml file.
>
> Thanks & Regards
> Reena Duggal
> Research Scholar(Full-Time)
> Amity Institute of Information Technology
> Amity University Uttar Pradesh
> M - 09740256313
> On 12/7/2015 10:41 AM, Reena Duggal wrote:
>
> Hello
> I have setup ctakes on my machine using cTAKES 3.2 User Install Guide. I
> created an xml file using CPE and using AggregatePlaintextFastUMLSProcessor.
> I am attaching it with email. Pl let me know how to parse this file to get
> list of procedures from it. I am not able to figure out that part. Also pl
> check, if this file is correct. Will really appreciate your help on this.
>
>
> Thanks & Regards
> Reena Duggal
> Research Scholar(Full-Time)
> Amity Institute of Information Technology
> Amity University Uttar Pradesh
> M - 09740256313
>
>
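
For anyone landing on this thread with the same question: a minimal sketch of the search Pei describes, done in plain Java with the JDK's DOM parser. It assumes the output file stores each annotation as an XML element whose name ends in ProcedureMention and which carries begin and end offset attributes. The class name ProcedureMentionFinder and the inline sample document are made up for illustration and are not part of cTAKES.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: lists the character spans of ProcedureMention elements. */
final class ProcedureMentionFinder {
    private ProcedureMentionFinder() {}

    /** Parses an XML document held in memory. */
    static Document parse(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml));
    }

    /** Returns "begin-end" for every element whose name ends in ProcedureMention. */
    static List<String> findSpans(Document doc) {
        List<String> spans = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*"); // every element, in document order
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            // Suffix match so that both fully qualified and prefixed names are found.
            if (element.getTagName().endsWith("ProcedureMention")) {
                // Assumed attribute names for the annotation offsets.
                spans.add(element.getAttribute("begin") + "-" + element.getAttribute("end"));
            }
        }
        return spans;
    }

    public static void main(String[] args) throws Exception {
        // Invented sample; a real file would come from the CPE output directory.
        String sample = "<CAS>"
            + "<org.apache.ctakes.typesystem.type.textsem.ProcedureMention begin=\"10\" end=\"22\"/>"
            + "<org.apache.ctakes.typesystem.type.textsem.MedicationMention begin=\"30\" end=\"37\"/>"
            + "</CAS>";
        System.out.println(findSpans(parse(sample.getBytes(StandardCharsets.UTF_8))));
    }
}
```

For a real file, read the bytes with Files.readAllBytes(...) and hand them to parse(). Resolving the associated concepts means following the mention's FSArray reference to the concept elements, which depends on the serialization format and is left out here.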


Re: ctakes with icd10; 2015 versions available on sourceforge!

2015-12-08 Thread Pei Chen
Brandon,
That sounds great!
Please open a Jira ticket for any contributions (anyone should be able
to create a Jira account).  There are some legal items built into the
ASF Jira attachments for accepting contributions/donations.
It will also credit the contributors with the merit appropriately.
Anyone who is interested can follow the Jira item. (Even better if
contributions were open discussion/open development.)
--Pei

On Tue, Dec 8, 2015 at 10:36 PM, Geise, Brandon D.
 wrote:
> I'd be interested in contributing to making the dictionary tool more user 
> friendly with a GUI.
>
> Thanks,
> Brandon
>
> -Original Message-
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Tuesday, December 08, 2015 6:12 PM
> To: dev@ctakes.apache.org
> Subject: RE: ctakes with icd10; 2015 versions available on sourceforge!
>
> Hi Dave,
>
> I'm always happy to see interest in our stuff!
>
>>Step 1
> I built the tool to be able to build a dictionary using anything in the umls 
> - snomed, icd9, hpo, etc. so using the veterinary extension shouldn't be a 
> problem.  You just add it to the CtakesSources file (or create an alternate 
> file and point to it with -src).  To answer another of your questions, there 
> can be zero or more sources - you saw snomedct and snomedct_us (each valid in 
> a different umls version).
> It also can include any semantic type, just add (or remove) the appropriate 
> tuis in a different data file.
>
>>Step 2
> You have it right - you copy the templates to another location and output to 
> that location.  Otherwise you 'lose' your templates.
>
>>Step 3 and 4
> The jar is built from source.  I need to (soon) check in updates to the 
> source, and at the same time I can check in a default prebuilt .jar  The lib/ 
> directory is in the source repository.
>
> Various people have toyed with the idea of putting the tool into a ctakes 
> module, putting it into an "installation package", making a gui ...  The best 
> option (imo) is probably to make an easy to use gui and keep a pre-built 
> version in sandbox.  Someday, after the rainbow, maybe I'll get a chance to 
> do that ...
>
> Sean
>
>
> -Original Message-
> From: David Kincaid [mailto:kincaid.d...@gmail.com]
> Sent: Tuesday, December 08, 2015 4:57 PM
> To: dev@ctakes.apache.org
> Subject: Re: ctakes with icd10; 2015 versions available on sourceforge!
>
> Thanks, Sean! It's great that cTAKES may soon have an up to date database out 
> of the box. Hopefully it will cut down on the need for many to build their 
> own DB's. Thank you much for doing that.
>
> Unfortunately, I still will need to build a custom one for us. I work in 
> veterinary medicine so I need to add in the veterinary extension for 
> SNOMED-CT into the database.
>
> I looked over the steps below that Brandon included and have some questions:
>
> step 1 says to "Change /data/default/CtakesSources.txt from "SNOMEDCT" to 
> "SNOMEDCT_US". The file that I have has two lines in it. First line is 
> SNOMED, second line is SNOMEDCT_US. So this step doesn't really make sense.
>
> step 2 should reference the two scripts as being in resource/memdbtemplate so 
> others don't have to search for them. Not sure what it means to move them to 
> "location to put new UMLS DB". Does that mean move them into a new directory 
> where the newly created UMLS DB will get written?
>
> steps 3 and 4 for running the tools reference dictionarytool.jar which 
> doesn't exist. Does one need to build that somehow from the source before 
> running it? The command line also adds "lib/*" to the classpath. Is that the 
> lib directory inside the dictionarytool source code or some other location?
>
> What else would I need to do to include the SNOMED-CT Veterinary Extension 
> along with the snomedct and rxnorm sources?
>
> I'll probably not have time to try this out for a while yet, but when I do 
> I'd be happy to write up an easy to follow tutorial for building a custom 
> dictionary assuming I am able to get it to work.
>
> Has anyone considered making this tool available outside of the source code 
> itself? Like including it in the main cTAKES release? It seems there is 
> demand for it.
>
> - Dave
>
> On Tue, Dec 8, 2015 at 3:22 PM, Finan, Sean < 
> sean.fi...@childrens.harvard.edu> wrote:
>
>> Hi Brandon, thanks for finding and forwarding the instructions!
>>
>> I have checked in two new hsqldb dictionaries, both from the 2015AB
>> version of the UMLS.  They both have codes for snomedct_us, rxnorm,
>> icd9cm and icd10pcs - as well as the usual cui, tui, preferred term mappings.
>>
>> One uses cuis filtered by snomed and rxnorm, the other adds cuis
>> filtered by icd9 and icd10.
>> What this means:  Cuis that exist for a [filter source] are added to
>> the dictionary, as are all text variations from all sources that
>> contain that cui.  Both dictionaries also use the standard ctakes
>> semantic group tui filters.
>>
>> The names are ctakessnorx2015 

Re: Create next cTAKES release (3.2.3)?

2015-11-19 Thread Pei Chen
A lot of the Jira's haven't been bumped into 3.2.4 yet.  This is to
get everyone to start looking and update their Jira's and if they
don't think they'll get a chance to work on it, I suggest to bump it
to the next release...  And if someone would like something to be
included in this release, please create a Jira and assign it to 3.2.3.

--Pei


On Thu, Nov 19, 2015 at 11:09 AM, Finan, Sean
<sean.fi...@childrens.harvard.edu> wrote:
> Hi Pei, thanks for the link to our Jira dashboard.  From my 3 second 
> run-through I would say that there remains a lot of outstanding work slated 
> for the 3.2.3 release.  Below are the Blocker, Critical and Major items.  
> Some may actually have been or can be quickly resolved, but it looks like we 
> may have more than a few bumps to 3.2.4 if we want to push out a release.
>
> I say in my bazaar way: release early, release often ...
> +1 for bumps and release ... *but first some comments in Jira on the state of 
> all listed below...
>
> Can anybody confirm that our only Blocker (dependencies not in maven central) 
> is still a problem?  https://issues.apache.org/jira/browse/CTAKES-76
> Where do we stand on the related Critical 
> https://issues.apache.org/jira/browse/CTAKES-138 ?
>
> Our other Critical item is PTB tokenizer breaking on apostrophes: 
> https://issues.apache.org/jira/browse/CTAKES-74
>
> A Major bug is FractionFSM incorrectly handling dashed ranges: 
> https://issues.apache.org/jira/browse/CTAKES-341  Britt, it looks like you 
> might have a fix ready?
>
> Another Major bug is for ytex UMLS.hbm.template.xml ... 
> https://issues.apache.org/jira/browse/CTAKES-302 Vijay it looks like you have 
> a fix started?
>
> Major bug for Missing Modifiers 
> https://issues.apache.org/jira/browse/CTAKES-213 ... Steve indicates that 
> this will require a lot of work.  Should we bump it or has somebody been 
> making progress?
>
> A Major bug in Medication Strength parsing has sat since our original 
> incubation, so I'm just guessing that it hasn't been touched and will be 
> bumped.  https://issues.apache.org/jira/browse/CTAKES-178
>
> Major bug SimpleSegmentWithTags ... 5 char names ... has also been around 
> since the continents were formed.
> https://issues.apache.org/jira/browse/CTAKES-155  I'd say a bump seems ok 
> except that there is an NPE ...
>
> There is a patch posted for our good old blues brothers boys band "URI not 
> hierarchical" on the old dictionary lookup 
> https://issues.apache.org/jira/browse/CTAKES-388  Can anybody volunteer to 
> test and commit?  I think that this is basically the same problem relayed in 
> https://issues.apache.org/jira/browse/CTAKES-320
>
>
> We have two placeholders for 3.2.3 additions.  They should probably be added 
> and (widely) tested asap or bumped to the next release.
> New Sentence Detector https://issues.apache.org/jira/browse/CTAKES-380
> ISO Time Normalizer https://issues.apache.org/jira/browse/CTAKES-379
>
> Has anybody started to tackle clean up / ?removal? of xml descriptors?  
> Tagged as Major improvement.  https://issues.apache.org/jira/browse/CTAKES-328
> This is related to https://issues.apache.org/jira/browse/CTAKES-295 - for 
> which Tim Miller has done a lot of great work, but is still incomplete.  Do 
> others have checkins awaitin'?
> A related Major Improvement is updating/fixing the relation extractor xml: 
> https://issues.apache.org/jira/browse/CTAKES-172
>
>
> Another Major improvement is an lvg update.  Do we have time to play with 
> this or should we bump it? https://issues.apache.org/jira/browse/CTAKES-388  
> Related to https://issues.apache.org/jira/browse/CTAKES-122
>
>
> Pei or Jay, are you ready to check in a working BigTop integration?
> https://issues.apache.org/jira/browse/CTAKES-314
>
>
>
> -Original Message-
> From: Pei Chen [mailto:chen...@apache.org]
> Sent: Wednesday, November 18, 2015 10:02 PM
> To: dev@ctakes.apache.org
> Subject: Create next cTAKES release (3.2.3)?
>
> Hi Folks,
> It looks like there has been a lot of progress on the Jiras.  What do folks 
> think of preparing a cut for the next release?  It would be nice to get one 
> more out before the holidays/end of the year.
> I'll be happy to volunteer to be RM again.
>
> Full list of Jira items slated for 3.2.3:
> https://issues.apache.org/jira/browse/CTAKES/fixforversion/12328718/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel
>
> --Pei


Re: user Digest 16 Nov 2015 19:44:10 -0000 Issue 343

2015-11-16 Thread Pei Chen
[+dev, -user]
Hi Lewis,
I've applied the patches.  Would you mind trying
org.apache.ctakes.core.resource.FileLocator.getAsStream() to see whether it
works for your tests/use cases?  It has a built-in fallback mechanism.

On Mon, Nov 16, 2015 at 3:25 PM, Lewis John Mcgibbney
 wrote:
> Hi Pei,
>
> On Mon, Nov 16, 2015 at 11:44 AM, 
> wrote:
>>
>>
>> Date: Thu, 12 Nov 2015 15:48:32 -0500
>> Subject: Re: cTAKES Trunk Broken?
>> Hi Lewis,
>> Sorry for the delays- I noticed some of your patches were still
>> pending in Jira.
>> I should be able to spend more time on cTAKES now and hopefully be
>> able to apply those patches shortly; Unless someone beats me to it..
>>
>
> Thanks for the response. We are using cTAKES heavily right now and are very
> interested in progressing availability and execution of cTAKES within Yarn
> and Spark. I am really interested to work with the cTAKES committers to
> ensure that lvg resources (and every cTAKES resource for that matter) can be
> run in these environments. As you're aware they currently can't be.
> Let me know if the various patches need unit tests. If so I am more than
> happy to invest some time and provide test patches for all of them.
> Thank you again for getting back on this one.
> Lewis
>
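
To illustrate what a "fall back mechanism" for resource loading can look like, here is a minimal sketch in plain Java. It is not the cTAKES FileLocator code; the class FallbackResourceLocator and its lookup order (file system first, then classpath) are assumptions chosen for illustration. The classpath step is the part that matters for jar-packaged deployments such as Yarn or Spark, where a resource is not a regular file.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical locator: tries the file system, then falls back to the classpath. */
final class FallbackResourceLocator {
    private FallbackResourceLocator() {}

    static InputStream open(String location) throws IOException {
        // 1. Treat the location as a path on the local file system.
        Path path = Paths.get(location);
        if (Files.isRegularFile(path)) {
            return Files.newInputStream(path);
        }
        // 2. Fall back to the classpath; this also finds resources inside a jar.
        String resourceName = location.startsWith("/") ? location.substring(1) : location;
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = FallbackResourceLocator.class.getClassLoader();
        }
        InputStream stream = loader.getResourceAsStream(resourceName);
        if (stream != null) {
            return stream;
        }
        // 3. Nothing matched: report the location that was asked for.
        throw new FileNotFoundException("No file or classpath resource: " + location);
    }
}
```

Returning an InputStream rather than a File is the design point: callers that only need to read the bytes never have to know whether the resource lives on disk or inside an archive.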


Re: cTAKES corpus

2015-11-12 Thread Pei Chen
Hi Jose,

There were some previous discussions[1] on how to get the annotated
training data.  Essentially, there currently isn't a centralized or
easy way of getting it without having to sign individual Data Use
Agreements from source institutions.

There is a clear need to simplify this and I believe the various
groups are working on it...

[1] 
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201503.mbox/%3CCA+Fyf6hxBbhhEqc9oU=vpuymc1fyrwpextpmpme-ir0cjwt...@mail.gmail.com%3E

> There are some discussions on appending/augmenting the existing
> annotated/training data[2].  I think the short answer is that there is
> currently no easy way short of having to sign DUA's from every single
> source institution.
>
> [1] http://svn.apache.org/r1465043
> [2]
> http://mail-archives.apache.org/mod_mbox/ctakes-dev/201412.mbox/%3ce5a9fa5abbf1ca4085d4f0794852a51e24241...@chexmbx3a.chboston.org%3E

On Wed, Nov 11, 2015 at 3:51 PM, Posada Aguilar, Jose David
 wrote:
> Dear cTAKES community
>
> I want to know if it's possible to obtain the annotated corpus that was used 
> to test cTAKES.
>
> We are currently using it and we would like to be able to test each module 
> towards the addition of a new one.
>
> Thank you very much for your help.
>
>
>
> Jose Posada
> Department of Biomedical Informatics
> University of Pittsburgh
>
>


Email address update

2015-11-12 Thread Pei Chen
Hi,
Just wanted to give a heads up- My usual childrens/harvard email
address will no longer be valid after this week.  I will continue to
use my chenpei@a.o one as I will be moving on to focus on our startup,
Wired Informatics.

--Pei


Re: cTakes 3.2.2 CPE error

2015-11-01 Thread Pei Chen
Hi Eric,
could you copy and paste the java command that was used? It should look
something like:
"java -cp ..."
--Pei

On Sat, Oct 31, 2015 at 1:58 PM, Eric Benzschawel  wrote:
> To whom it may concern,
> I'm Eric Benzschawel, a masters student at Brandeis University. I'm
> planning on using cTakes to help me process texts for my master's thesis,
> but I'm having problems installing and running the program.
>
> I've followed the cTakes 3.2 install guide twice, identically, and I can't
> complete the CPE example on the bottom of the install page. I'm getting the
> error:
>
> Error: Could not find or load main class
> .usr.local.apache-ctakes-3.2.2.desc.:.usr.local.apache-ctakes-3.2.2.resources.:.usr.local.apache-ctakes-3.2.2.lib.*
>
> I'd be surprised if this error something that's specific to my local
> install. I've successfully completed the CVD tutorial, but the CPE program
> will be more relevant to my work. Do you have any resources you can point
> me to for troubleshooting or any idea what might be causing this error?
>
> Best,
> Eric Benzschawel
> Brandeis University
> Computational Linguistics MA '16
>
> Additional information:
> Platform: Mac OSX El Capitan
> Java version: 1.8.0_20
> cTakes version: 3.2.2
> UMLS resources included: yes
> UMLS resources version: 3.2.1.1-bin.tar.gz
> Downloaded additional ctakessnorx resources: yes
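
A note on the error quoted above: the JVM takes the first non-option argument as the main class, so this message is what one would expect when the classpath string ends up in that position, for example because the -cp flag in front of it was lost. The sketch below only illustrates the argument order. The class CpeCommand is hypothetical, and org.apache.uima.tools.cpm.CpmFrame as the CPE entry point is an assumption that should be checked against the launch script shipped with the install.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that assembles a java launch command in the right order. */
final class CpeCommand {
    private CpeCommand() {}

    static List<String> build(String ctakesHome, String mainClass) {
        // ":" on Mac/Linux, ";" on Windows.
        String classpath = String.join(File.pathSeparator,
            ctakesHome + "/desc/",
            ctakesHome + "/resources/",
            ctakesHome + "/lib/*");
        // The token right after -cp is the classpath; the first token that is
        // not an option (or an option's value) is taken as the main class.
        return new ArrayList<>(Arrays.asList("java", "-cp", classpath, mainClass));
    }

    public static void main(String[] args) {
        // Install path from the report above; main class is an assumption.
        List<String> command = build("/usr/local/apache-ctakes-3.2.2",
                                     "org.apache.uima.tools.cpm.CpmFrame");
        System.out.println(String.join(" ", command));
    }
}
```

Comparing the printed command with the one actually typed should show quickly whether -cp and the classpath are still adjacent.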


Re: cTAKES scale out using UIMA DUCC

2015-10-28 Thread Pei Chen
Yi-Wen,
ctakes-clinical-pipeline/desc/**/ for the descriptors?
Hope that helps..

On Wed, Oct 28, 2015 at 5:55 PM, Yi-Wen Liu  wrote:
> Hi,
>
> I am trying to run cTAKES on UIMA DUCC, and I am working on some
> configuration files.
> In UIMA DUCC job files, I have to specify the following:
> driver_descriptor_CR
> ex. org.apache.uima.ducc.sampleapps.DuccJobTextCR
> process_descriptor_CM
> ex. org.apache.uima.ducc.sampleapps.DuccTextCM
> process_descriptor_AE
> ex. ${OPENNLP_HOME}/desc/OpenNlpTextAnalyzer.xml
> process_descriptor_CC
> ex. org.apache.uima.ducc.sampleapps.DuccCasCC
>
> Does cTAKES have its own CR, CM, AE and CC descriptors?
> And if so, could somebody point out where can I find them in cTAKES
> directory?
>
> Thanks,
> Yi-Wen


Re: Can one pass UMLS username and password as API arguments?

2015-10-26 Thread Pei Chen
Pete,
System.setProperty()?
Were you suggesting that we add an overloaded method?:
ClinicalPipelineFactory.getFastPipeline(String user, String pw) {}
It's not a bad suggestion- if you require it, feel free to create a
Jira or even better a patch...
--Pei
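
For anyone embedding cTAKES in a larger Java program, the
System.setProperty() route can be sketched roughly as below. The property
names ctakes.umlsuser and ctakes.umlspw are the ones mentioned in this
thread; the class and method names are hypothetical, and the sketch assumes
the properties are set before the pipeline (e.g. via
ClinicalPipelineFactory.getFastPipeline()) is created.

```java
public final class UmlsCredentials {
    // Property names quoted in this thread.
    static final String USER_KEY = "ctakes.umlsuser";
    static final String PASS_KEY = "ctakes.umlspw";

    private UmlsCredentials() {}

    /** Sets both properties for the current JVM; call before building the pipeline. */
    public static void apply(String user, String password) {
        if (user == null || password == null) {
            throw new IllegalArgumentException("UMLS user and password must be non-null");
        }
        System.setProperty(USER_KEY, user);
        System.setProperty(PASS_KEY, password);
    }

    /** Removes both properties from the JVM again. */
    public static void clear() {
        System.clearProperty(USER_KEY);
        System.clearProperty(PASS_KEY);
    }

    public static void main(String[] args) {
        apply("myUmlsUser", "myUmlsPassword"); // placeholder values
        // ... create the cTAKES pipeline here (not shown) ...
        System.out.println(System.getProperty(USER_KEY));
        clear();
    }
}
```

This keeps the credentials out of descriptor files and off the java command
line, at the cost of their being readable by any code running in the same JVM.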


On Mon, Oct 26, 2015 at 1:27 PM, Peter Szolovits  wrote:
> I know that, but was asking specifically whether there is a way for this info 
> to be passed in by a program that embeds cTakes, without having to set 
> environment variables or muck with the java command line.
>
>> On Oct 26, 2015, at 1:18 PM, Finan, Sean  
>> wrote:
>>
>> You should be able to use ctakes.umlsuser and ctakes.umlspw in the command 
>> line or as environment variables.  If your shell requires, you can replace 
>> the dot with underscore: ctakes_umlsuser  ctakes_umlspw
>>
>> Sean
>>
>> -Original Message-
>> From: Peter Szolovits [mailto:p...@mit.edu]
>> Sent: Monday, October 26, 2015 1:12 PM
>> To: dev@ctakes.apache.org
>> Subject: Can one pass UMLS username and password as API arguments?
>>
>> I am embedding cTakes as part of a larger (Java-based) processing program 
>> and would like to be able to pass the user’s UMLS username and password when 
>> setting up the cTakes API rather than embedding them in UIMA configuration 
>> files or having to give them as java vm arguments.  E.g., at some place such 
>> as a call to ClinicalPipelineFactory.getFastPipeline().  Is there a way to 
>> do this that I have not been able to find?  Thank you.  —Peter Szolovits
>>
>


Re: Can one pass UMLS username and password as API arguments?

2015-10-26 Thread Pei Chen
Yes, I wasn’t sure if your application had a security restriction against
storing passwords in env vars for any code to read.

Anyhow, anyone should be able to create a Jira account at:

https://issues.apache.org/jira/browse/CTAKES

Pei Chen
Wired Informatics <http://www.wiredinformatics.com>
265 Franklin St Ste 1702
Boston, MA 02110
tel: (617) 433-7544
pei.c...@wiredinformatics.com

On Mon, Oct 26, 2015 at 2:06 PM, Peter Szolovits <p...@mit.edu> wrote:

> Thanks, Sean and Pei.  Sean’s suggestion to set the properties via
> System.setProperty() works; I had forgotten that this was doable in Java.
> I think the suggestion of an overloaded method is still a good idea, but I
> also don’t remember how to create a Jira.  —Pete
>
> > On Oct 26, 2015, at 1:44 PM, Pei Chen <chen...@apache.org> wrote:
> >
> > Pete,
> > System.setProperty()?
> > Were you suggesting we add an overloaded method?:
> > ClinicalPipelineFactory.getFastPipeline(String user, String pw) {}
> > It's not a bad suggestion- if you require it, feel free to create a
> > Jira or even better a patch...
> > --Pei
> >
> >
> > On Mon, Oct 26, 2015 at 1:27 PM, Peter Szolovits <p...@mit.edu> wrote:
> >> I know that, but was asking specifically whether there is a way for
> this info to be passed in by a program that embeds cTakes, without having
> to set environment variables or muck with the java command line.
> >>
> >>> On Oct 26, 2015, at 1:18 PM, Finan, Sean <
> sean.fi...@childrens.harvard.edu> wrote:
> >>>
> >>> You should be able to use ctakes.umlsuser and ctakes.umlspw in the
> command line or as environment variables.  If your shell requires, you can
> replace the dot with underscore: ctakes_umlsuser  ctakes_umlspw
> >>>
> >>> Sean
> >>>
> >>> -Original Message-
> >>> From: Peter Szolovits [mailto:p...@mit.edu]
> >>> Sent: Monday, October 26, 2015 1:12 PM
> >>> To: dev@ctakes.apache.org
> >>> Subject: Can one pass UMLS username and password as API arguments?
> >>>
> >>> I am embedding cTakes as part of a larger (Java-based) processing
> program and would like to be able to pass the user’s UMLS username and
> password when setting up the cTakes API rather than embedding them in UIMA
> configuration files or having to give them as java vm arguments.  E.g., at
> some place such as a call to ClinicalPipelineFactory.getFastPipeline().
> Is there a way to do this that I have not been able to find?  Thank you.
> —Peter Szolovits
> >>>
> >>
>
>


Re: cTakes - help please

2015-10-20 Thread Pei Chen
Chris,
Could you confirm whether the paths were modified? If so, are they accurate?
In particular, per the error message, should des be desc?
Malformed URL c:/apache-ctakes-3.2.2/*des*/ctakes-chunker/desc

If not, would you mind including the descriptor xml's?
--Pei

On Mon, Oct 19, 2015 at 5:38 PM, Tonner, Chris 
wrote:

> Hello:
>
>
>
> I work in the Department of Medicine at the University of California, San
> Francisco.  We are interested in cTakes to extract medical information from
> our EMRs.
>
>
>
> I have been attempting to install and run cTakes, but have had some
> problems.  Would you be able to give us technical help to get cTakes up and
> running.
>
>
>
> Following the instructions on the User Installation Guide.
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+User+Install+Guide
>
>
>
> I  have modified the two bat files c: bin\runctakesCVD.bat file  and
> CPE.bat… Example of modified code – is this correct?
>
>
>
> @REM set ctakes.umlsuser=[christonner], ctakes.umlspw=[123Start]
>
> @REM or add the properties
>
> @REM -Dctakes.umlsuser=[christonner] -Dctakes.umlspw=[123Start]
>
>
>
> These batch files run and the debugger opens.
>
>
>
> I can load some of the analysis engine, but get errors with the UMLS and
> the negation AE.
>
>
>
> In short we are at a loss on how to troubleshoot this.  Is there a more
> extensive manual on how to install cTakes and how to use cTakes?
>
>
>
> Error Examples.
>
>
>
> SEVERE: Malformed URL
> c:/apache-ctakes-3.2.2/des/ctakes-chunker/desc/AdjustNounPhraseToIncludeFollowingPPNP.xml
> in import declaration. (Descriptor:
> file:/C:/apache-ctakes-3.2.2/desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextFastUMLSProcessor.xml)
>
> org.apache.uima.resource.ResourceInitializationException: Malformed URL
> c:/apache-ctakes-3.2.2/des/ctakes-chunker/desc/AdjustNounPhraseToIncludeFollowingPPNP.xml
> in import declaration.
>
> (Descriptor:
> file:/C:/apache-ctakes-3.2.2/desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextFastUMLSProcessor.xml)
>
>
>
> Thank you,
>
> Chris Tonner
>
> University of California, San Francisco
>
>
>
>
>
>
>


Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

2015-10-13 Thread Pei Chen
Thanks Azad.
I submitted a Jira to infra to help us do the import (that way we will try
and preserve the commit history).
In the meantime, would you mind filling out the ICLA [1]?

[Reminder: Let's keep it in sandbox and not release it until all of the 3rd
party dependencies licenses have been verified.]

[1] http://www.apache.org/licenses/#clas

Thanks,
Pei

Pei Chen
Wired Informatics <http://www.wiredinformatics.com>
265 Franklin St Ste 1702
Boston, MA 02110
tel: (617) 433-7544
pei.c...@wiredinformatics.com

On Sun, Oct 11, 2015 at 3:51 PM, Azad Dehghan <azad.dehg...@gmail.com>
wrote:

> 1: Yes. Sorted.
> 3: Code attached to the Jira.
>
> Azad
>
> On 8 October 2015 at 20:03, Chen, Pei <pei.c...@childrens.harvard.edu>
> wrote:
>
> > This is great news!
> > > What is the current status and procedure? Is there an explicit
> > contribution to cTAKES? Is there an ICLA? What about the license of the
> > sourceforge project?
> > Jira has been opened to track this:
> > https://issues.apache.org/jira/browse/CTAKES-384
> >
> > 1) Azad, would you be willing to switch licenses?  I believe it's
> > currently GNU3 -> ASL 2.0?
> > 2) Create a project/module in cTAKES sandbox for this
> > 3) Export/Import sourceforge and attach the code to the Jira initially.
> > One of the current cTAKES committers can commit it to the repo (until
> > folks can commit directly to the ctakes repo going forward).
> >
> > -Original Message-
> > From: Peter Klügl [mailto:peter.klu...@averbis.com]
> > Sent: Thursday, October 08, 2015 8:06 AM
> > To: dev@ctakes.apache.org
> > Subject: Re: Combining Knowledge- and Data-driven Methods for
> > De-identification of Clinical Narratives
> >
> > Hi,
> >
> > I can offer my help here if required.
> >
> > I have experience in translating JAPE rules to UIMA Ruta and already
> > worked with clinical notes, e.g., also concerning deidentification.
> >
> > The problem is that I can only invest a few hours in the next two weeks.
> > I will have more time next month or even more next year.
> >
> > What is the current status and procedure? Is there an explicit
> > contribution to cTAKES? Is there an ICLA? What about the license of the
> > sourceforge project?
> >
> > Best,
> >
> > Peter
> >
> > Am 01.10.2015 um 16:20 schrieb Pei Chen:
> > > Hi Azad,
> > > This is awesome news.  Thanks for adding in the code that was
> > > referenced by the paper.  I'll create a Jira to track the work needed
> > > to port it over to UIMA/Ruta.
> > >
> > > In the meantime, the link is at:
> > > http://sourceforge.net/p/clinical-deid/code/ci/master/tree/ for those
> > > who may be interested in helping out...
> > >
> > > --Pei
> > >
> > > Hello Pei,
> > >
> > > I hope all is well.
> > >
> > > I have now uploaded the source code for cDeid
> > > (http://sourceforge.net/p/clinical-deid/code/ci/master/tree/) ; I have
> > > tried to make the code as portable and modular as possible with some
> > > trade-off for performance. This should help with porting the code to
> > > cTAKES/UIMA.
> > >
> > > Once you let the community know I will try to get involved to help
> > > with translating JAPE to RUTA, etc.
> > >
> > > Best,
> > > Azad
> >
> >
>


Re: Boston cTAKES Hackathon?

2015-10-08 Thread Pei Chen
http://www.meetup.com/cTAKES/events/225926425/
has been set up...

On Fri, Sep 18, 2015 at 11:46 AM, Pei Chen <chen...@apache.org> wrote:
> Yes, we can plan for lightning talks and we can potentially find some
> Docker experts in the area to help.
> I'm thinking over the next few weeks; any prefs on date/times?  Once
> we have a rough idea, I'll move this thread over to meetup.com to avoid
> spamming this list.
>
> --Pei
>
> On Wed, Sep 16, 2015 at 9:11 PM, John Green <hephaestus.stu...@gmail.com> 
> wrote:
>> Im jealous! That sounds fun
>> JTG
>>
>>
>> On Wed, Sep 16, 2015 at 3:21 PM, Jay Vyas <jayunit100.apa...@gmail.com>
>> wrote:
>>>
>>> Yes I'd love to. How about some lightning talks also to start the night
>>> off?
>>> I know Harvard is using ctakes for some stuff.
>>>
>>>
>>> > On Sep 16, 2015, at 4:23 PM, Pei Chen <chen...@apache.org> wrote:
>>> >
>>> > Hi,
>>> > I hope everyone had a great summer. I just wanted to resurrect the
>>> > Docker integration idea.
>>> > Anyone interested in joining a small hackathon with the single goal of
>>> > deploying cTAKES in a docker container.
>>> > One of the evenings 6pm?
>>> >
>>> > --Pei
>>
>>


Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

2015-10-01 Thread Pei Chen
Hi Azad,
This is awesome news.  Thanks for adding in the code that was
referenced by the paper.  I'll create a Jira to track the work needed
to port it over to UIMA/Ruta.

In the meantime, the link is at:
http://sourceforge.net/p/clinical-deid/code/ci/master/tree/ for those
who may be interested in helping out...

--Pei

Hello Pei,

I hope all is well.

I have now uploaded the source code for cDeid
(http://sourceforge.net/p/clinical-deid/code/ci/master/tree/) ; I have
tried to make the code as portable and modular as possible with some
trade-off for performance. This should help with porting the code to
cTAKES/UIMA.

Once you let the community know I will try to get involved to help
with translating JAPE to RUTA, etc.

Best,
Azad


Re: Boston cTAKES Hackathon?

2015-09-18 Thread Pei Chen
Yes, we can plan for lightning talks and we can potentially find some
Docker experts in the area to help.
I'm thinking over the next few weeks; any prefs on date/times?  Once
we have a rough idea, I'll move this thread over to meetup.com to avoid
spamming this list.

--Pei

On Wed, Sep 16, 2015 at 9:11 PM, John Green <hephaestus.stu...@gmail.com> wrote:
> Im jealous! That sounds fun
> JTG
>
>
> On Wed, Sep 16, 2015 at 3:21 PM, Jay Vyas <jayunit100.apa...@gmail.com>
> wrote:
>>
>> Yes I'd love to. How about some lightning talks also to start the night
>> off?
>> I know Harvard is using ctakes for some stuff.
>>
>>
>> > On Sep 16, 2015, at 4:23 PM, Pei Chen <chen...@apache.org> wrote:
>> >
>> > Hi,
>> > I hope everyone had a great summer. I just wanted to resurrect the
>> > Docker integration idea.
>> > Anyone interested in joining a small hackathon with the single goal of
>> > deploying cTAKES in a docker container.
>> > One of the evenings 6pm?
>> >
>> > --Pei
>
>


Re: CTAKES-377 : Upgrade to Java 8

2015-09-16 Thread Pei Chen
+1 upgrading to Java 8; been using it unofficially locally.

On Wed, Sep 16, 2015 at 1:37 PM, Finan, Sean
 wrote:
> Can anybody out there think of a reason why we shouldn't upgrade to Java 8?  
> Please comment on Jira.
>
> https://issues.apache.org/jira/browse/CTAKES-377
>
> Thanks,
> Sean
>
>


Boston cTAKES Hackathon?

2015-09-16 Thread Pei Chen
Hi,
I hope everyone had a great summer.  I just wanted to resurrect the
Docker integration idea.
Is anyone interested in joining a small hackathon with the single goal of
deploying cTAKES in a docker container?
One of the evenings at 6pm?

--Pei


[DRAFT] [REPORT] Apache cTAKES Sep 2015

2015-09-08 Thread Pei Chen
Feel free to edit/add.



Report from the Apache cTAKES committee [Pei Chen]

## Description:
   Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is
   an open-source natural language processing system for information extraction
   from electronic medical record clinical free-text.

## Activity:
 - There is interest from new contributor(s) in integrating cTAKES
with Spark (CTAKES-374)
 - There is interest from new contributor(s) in integrating Gene
Mappings to cTAKES (CTAKES-375)
 - There is interest from new contributor(s) in using cTAKES for
deidentification
 - The committee is planning a local meetup (Boston) in the near future to
integrate cTAKES with Docker for easier deployments.

## Issues:
 - There are no issues requiring board attention at this time

## LDAP committee group/Committership changes:
 - Currently 31 committers and 30 PMC members in the project.
 - Last PMC addition was Michelle Chen at Fri Jan 23 2015
 - Last committer addition was Jim Gregoric at Sat Feb 28 2015

## Releases:
 - 3.2.2 was released on May 30 2015
 - 3.2.1 was released on Dec 10 2014
 - 3.2.0 was released on Jul 23 2014

## Mailing list activity:
 - dev@ctakes.apache.org:
    - 179 subscribers (up 17 in the last 3 months):
    - 181 emails sent to list (228 in previous quarter)

 - u...@ctakes.apache.org:
    - 155 subscribers (up 13 in the last 3 months):
    - 94 emails sent to list (95 in previous quarter)

 - notificati...@ctakes.apache.org:
    - 21 subscribers (up 0 in the last 3 months):
    - 77 emails sent to list (103 in previous quarter)

## JIRA activity:
 - 11 JIRA tickets created in the last 3 months
 - 7 JIRA tickets closed/resolved in the last 3 months


Re: Bug in resource import in LvgAnnotator?

2015-08-28 Thread Pei Chen
Hi Jakob,
Yes, the LVG component currently requires its jar to be unpacked and
added to the classpath.
I think there is an outstanding Jira to enable it to read from a jar
like the rest of the modules.  (There are some limitations, such as
hsql/lucene not being able to read directly from a jar, hence it's still
outstanding.)
--Pei
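
The "URI is not hierarchical" failure described in the quoted message below
can be reproduced with the JDK alone. The sketch below is only an
illustration under stated assumptions: the class name, method names and the
example jar path are made up, and copying a single resource to a temp file
would not by itself fix LVG, which needs its whole data directory on disk.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class JarResourceDemo {
    private JarResourceDemo() {}

    /** True if new File(uri) is rejected, as happens for jar: resource URIs. */
    public static boolean failsAsFile(URI uri) {
        try {
            new File(uri);
            return false;
        } catch (IllegalArgumentException e) {
            // For an opaque jar: URI the message is "URI is not hierarchical".
            return true;
        }
    }

    /** Copies one classpath resource to a temp file so File-only APIs can read it. */
    public static File copyToTempFile(Class<?> anchor, String resourcePath) throws IOException {
        try (InputStream in = anchor.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new FileNotFoundException(resourcePath);
            }
            Path tmp = Files.createTempFile("resource-", ".tmp");
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp.toFile();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location of a properties file packaged inside a jar.
        URI inJar = URI.create("jar:file:/tmp/example.jar!/config/lvg.properties");
        System.out.println("fails as File: " + failsAsFile(inJar));

        // Any resource reachable through the class loader can be copied out.
        File copy = copyToTempFile(Object.class, "/java/lang/Object.class");
        System.out.println("copied bytes: " + copy.length());
    }
}
```

getResourceAsStream() works whether the resource sits in a directory or in a
jar, which is why it is the usual suggestion when a File is not strictly
required.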

On Fri, Aug 28, 2015 at 4:26 AM, Jakob Rogstadius
jakob.rogstad...@who-umc.org wrote:
 Hi,

 On line 579 in org.apache.ctakes.lvg.ae.LvgAnnotator there is the following 
 resource import:
   ExternalResourceFactory.createExternalResourceDescription(
       LvgCmdApiResourceImpl.class,
       new File(LvgCmdApiResourceImpl.class.getResource(
           "/org/apache/ctakes/lvg/data/config/lvg.properties").toURI()))
   );

 The .getResource() call breaks when the package is imported as a jar (from 
 Maven central), with an error stating that the URI is not hierarchical. 
 According to:
 http://stackoverflow.com/questions/18055189/why-my-uri-is-not-hierarchical
 the call should instead use .getResourceAsStream().

 Is this a bug, or am I doing something wrong? I'm not very familiar with how 
 Java handles resources in general.

 Jakob


Re: Error while running DrugMentionAnnotator.xml

2015-08-10 Thread Pei Chen
Chandra,
How are you wiring up the DrugMentionAnnotator?
The default AggregatePlaintextFastUMLSProcessor.xml should already
have the Drug NER included.
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextFastUMLSProcessor.xml

In general, you'll most likely need to add something like the below if you
have a custom/modified pipeline:

<typeSystemDescription>
   <imports>
     <import name="org.apache.ctakes.drugner.types.TypeSystem"/>
   </imports>
</typeSystemDescription>

Related Thread:
http://mail-archives.apache.org/mod_mbox/ctakes-user/201403.mbox/%3CCAPqz87oUZ=hpzc_fo_zlaef3pvqcm9xsyums15iymgapsxx...@mail.gmail.com%3E

If you're using uimaFIT to wire your pipeline together, I would highly
recommend using the Automatic Type System Discovery.

Hope that helps.


On Sat, Aug 8, 2015 at 12:12 PM, RANGA CHANDRA GUDIVADA
chandhragupta...@hotmail.com wrote:
 Hello All,

 I am getting the below error while trying to run the analysis engine 
 DrugMentionAnnotator.xml from user install CVD.  Please let me know if 
 anyone has similar issues and were able to successfully fix it.

 Ctakes Version used : apache-ctakes-3.2.2
 Ctakes Resources : ctakes-resources-3.2.1.1-bin



 Caused by: org.apache.uima.cas.CASRuntimeException: JCas type 
 org.apache.ctakes.drugner.type.FrequencyAnnotation used in Java code,  but 
 was not declared in the XML type descriptor.
 at org.apache.uima.jcas.impl.JCasImpl.getType(JCasImpl.java:412)
 at org.apache.uima.jcas.impl.JCasImpl.getCasType(JCasImpl.java:436)
 at 
 org.apache.uima.jcas.impl.JFSIndexRepositoryImpl.getAnnotationIndex(JFSIndexRepositoryImpl.java:80)
 at 
 org.apache.ctakes.drugner.ae.DrugMentionAnnotator.removeAnnotations(DrugMentionAnnotator.java:306)
 at 
 org.apache.ctakes.drugner.ae.DrugMentionAnnotator.removeDrugNerTypes(DrugMentionAnnotator.java:299)
 at 
 org.apache.ctakes.drugner.ae.DrugMentionAnnotator.process(DrugMentionAnnotator.java:260)
 ... 45 more

 Thanks
 Chandra










Re: Role of white-box logic/models in cTAKES

2015-08-05 Thread Pei Chen
Peter,
Good to hear from you again!
Yes, I believe there are some regex and rule-based annotators that
are in use (and probably will be in the future, for as long as they
outperform other methods for certain tasks).
I don't think there is a specific position from the community on this
approach.  (ASF's 'Do-acracy')
Were you thinking of writing some Annotators in Ruta?

--Pei


On Wed, Aug 5, 2015 at 3:47 AM, Peter Klügl peter.klu...@averbis.com wrote:
 Hi,

 my (uninformed) view on cTAKES was that it is mainly based on black-box
 machine learning models.

 There were some mentions of rule-based approaches on the mailing list
 and a quick look in the source code revealed to me some functionality
 that is based on FSMs and regular expressions (and the grey area of rule
 logic implemented in plain java).

 I'm just curious. Is this code actively used in cTAKES and is there a
 general position of the cTAKES community on rules-based/white-box
 approaches?

 Best,

 Peter


Re: cTAKES GUI for I2B2

2015-08-05 Thread Pei Chen
Sekhar,
That application was done as a prototype/POC many years ago and hasn't
been actively maintained (hence in sandbox).
It seems from your screenshot that you have it up and running though.
Would you mind attaching the log files as well?

--Pei



On Wed, Aug 5, 2015 at 4:41 AM, Hari, Sekhar sekhar.h...@cgi.com wrote:
 Hello Timothy -

 I have posted the  screenshots here:

 https://drive.google.com/file/d/0B4sR85qs377yTThzWHM4YXlxOFE/view?usp=sharing

 Kindly advise as soon as possible.

 Many thanks,
 Sekhar H.

 -Original Message-
 From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
 Sent: Tuesday, August 04, 2015 4:14 PM
 To: dev@ctakes.apache.org
 Subject: RE: cTAKES GUI for I2B2

 Can you post the screenshot somewhere it might be linked to? I don't know if 
 we can post image attachments to the dev list.
 Thanks
 Tim

 
 From: Hari, Sekhar [sekhar.h...@cgi.com]
 Sent: Monday, August 03, 2015 10:35 PM
 To: chen...@apache.org; dev@ctakes.apache.org; u...@ctakes.apache.org
 Subject: RE: cTAKES GUI for I2B2

 Hello there -

 Please, can somebody advise me on my question below?

 Thanks,
 Sekhar H.
 
 From: Hari, Sekhar
 Sent: 31 July 2015 15:02
 To: chen...@apache.org
 Subject: cTAKES GUI for I2B2

 Hello Pei -

 Can you please assist me. I am doing a few experiments using I2B2. There is a 
 requirement for me to use cTAKES for reading clinical notes and to extract 
 the  key terminologies from the notes so it can be inserted into I2B2. I 
 found cTAKES GUI for I2B2 and installed it. While trying to read a sample 
 clinical  note, though seemingly the pipelines run, I don't see any useful 
 output in the  Results section of the GUI. I have included a screenshot 
 below. The Results page says language: x-unspecified. Not sure what is 
 going wrong. Am I doing anything wrong or missing  any configurations? Also, 
 I have included the NLP processors that am using to do this read and extract. 
 You can see this screenshot at the bottom of this email. Hope you can help.

 Many thanks,
 Sekhar H.






Fwd: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

2015-07-30 Thread Pei Chen
[+dev]
Hi Azad,
This is great news! Looking forward to it.

--Pei

On Thu, Jul 30, 2015 at 8:16 AM, Azad Dehghan azad.dehg...@gmail.com wrote:
 Hi Pei,

 Just to keep you in the loop: I am currently tailoring a version of the
 de-id tool for The Christie NHS Foundation Trust (UK)--this is due to be
 concluded end of August. So, I should have the re-factoring of the public
 version of the tool ready by mid-September for Apache / cTAKES.

 Looking forward to get this started!

 Best regards,
 Azad


On 29 July 2015 at 17:27, Azad Dehghan azad.dehg...@gmail.com wrote:

 Pei,

 Yes, the tool is entirely written in Java. It is very light weight;
 specifically using the following external/3rd party components: ANNIE
 English Tokeniser, ANNIE Sentence Splitter, ANNIE Gazetteer (dictionary)
 and JAPE Transducer (rules) (see
 https://gate.ac.uk/gate/doc/plugins.html#ANNIE).

 I'll do my best to hurry up the refactoring of the code.

 I've just joined the dev mailing list.


 Azad



Re: xml org.apache.ctakes.core.analysis_engine.TokenizerAnnotator not found

2015-07-30 Thread Pei Chen
Justin,
Is this still an issue for you?  I believe there was a known issue and
someone submitted a patch:
https://issues.apache.org/jira/browse/CTAKES-370?jql=component%20%3D%20ctakes-smoking-status%20AND%20project%20%3D%20CTAKES%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC

[You probably don't want to use any references to the
ctakes-smoking-status-res/target descriptors at least until the issue
has been resolved]

On Mon, Jul 27, 2015 at 6:04 PM, Justin Zhang justinzhang...@gmail.com wrote:
 Hello Everyone,

 Where is the file
 org.apache.ctakes.core.analysis_engine.TokenizerAnnotator the program is
 looking for? Any suggestions for trouble shooting? Where the
 file TokenizerAnnotator.xml should be in the path? and Where is the file
 online (to check if the local file is the one required?)

 error message:

 Caused by: org.apache.uima.resource.ResourceInitializationException: An
 import could not be resolved.  No .xml file with name
 org.apache.ctakes.core.analysis_engine.TokenizerAnnotator was found in
 the class path or data path. (Descriptor:
 file:/Users/justin/App/eclipse_mars/workspace_eclipse_mars/ctakes/ctakes-smoking-status-res/target/classes/org/apache/ctakes/smokingstatus/analysis_engine/ProductionPostSentenceAggregate_step1.xml)
 at
 org.apache.ctakes.smokingstatus.ae.ClassifiableEntries.initialize(ClassifiableEntries.java:178)
 at
 org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:250)

 --
 Justin


Re: How to use cTakes as a UIMA component

2015-07-18 Thread Pei Chen
Ralph,
Could you describe a bit how you are using the UIMA framework, i.e. PEAR
files, XML descriptors, and/or uimaFIT to programmatically wire the
components together?
I think the easiest would be to have your application pull the necessary
ctakes components from maven central and use the Annotators as appropriate.

Hope that helps-
Pei


On Sat, Jul 18, 2015 at 7:45 AM, ralph Lecessi rlece...@gmail.com wrote:

 Hello,

 I'm interested in building an application in the Eclipse IDE
 that uses the cTAKES library as a component in the UIMA apache
 framework.

 Is this possible? Could you point me to some documentation?

 Thank you,

 Ralph Lecessi
 rlece...@gmail.com
 (732)658-4778
 .



Re: Mvn package error

2015-06-23 Thread Pei Chen
Zhiwen,
I think this unit test needs to be updated/fixed.
It runs fine in mvn compile test, but in the interim, mvn package
needs to be run with -DskipTests=true.

The longer story is that once modules are packaged (i.e. lvg, dictionary),
mvn loads them from the jars instead of from unpacked resources.  So
essentially, either the tests need those resources unpacked in order to run,
or lucene/hsqldb would need to be modified to have a Jar Reader.

On Mon, Jun 22, 2015 at 11:34 AM, Zhiwen Li l...@udel.edu wrote:

 Hi,
 I tried to compile the 3.2.3 version of Ctakes, got the following error.
 Tests in error:

  
 TestClearNLPPipeLine(org.apache.ctakes.dependency.parser.ae.util.TestClearNLPAnalysisEngines):
 URI is not hierarchical
 I realized this error was resolved before in this thread
 https://issues.apache.org/jira/browse/CTAKES-307
 But the same error comes up since the svg-ctakes-resources-lvg2008 was
 added to the dependency in revision 1642706.
 If I removed the dependency and basically restored it to revision 1620359
 (https://svn.apache.org/viewvc/ctakes/trunk/ctakes-lvg-res/pom.xml?view=markup&pathrev=1620359),
 it compiles fine. But I am not sure if this dependency is necessary or not.
 I don't know why this specific lvg version is required after revision
 1642706.
 Please help to clarify.

 Thanks,
 Simon

 --

 Zhiwen Li

 l...@udel.edu



Apache cTAKES hosted demos and examples

2015-06-19 Thread Pei Chen
There seems to be a significant interest in having a hosted demo and
examples, so I started this index page along with initial code examples:

Index page:
http://healthnlp.github.io/examples/

Live demo:
http://52.24.118.198:8080/index.jsp

--Pei


Re: PAD Term Spotter

2015-06-09 Thread Pei Chen
Hi Christopher,
The PAD Term Spotter hasn't been supported for over a year now [1].  It was
mostly written with specialized rules and no one had been maintaining it.
I am not sure if there are any generic disease annotators; if you would be
willing to contribute your changes, we can incorporate them.


[1]
http://mail-archives.apache.org/mod_mbox/ctakes-user/201402.mbox/%3C6e55ab$8ci...@ironport10.mayo.edu%3E

On Mon, Jun 8, 2015 at 5:30 PM, Christopher Baechle cbaec...@my.fau.edu
wrote:

 A project I inherited uses cTAKES 2.5 and modified the PAD Term
 Spotter to detect the presence of the disease we're interested in.
 After searching the archives, it looks like the PAD term spotter was
 removed, but I couldn't find info as to why.

 I would like to migrate our code to the latest version of cTAKES. I
 could just create an annotator and port the code and put it in the
 pipeline, but it looks like cTAKES has had many enhancements since
 2.5. I wasn't sure if a more generic disease annotator was put in
 place and disease specific annotators are not a good route.



Re: [DRAFT] [REPORT] Apache cTAKES Jun 2015

2015-06-05 Thread Pei Chen
I am not aware of any currently, but I think it would make a great
contribution though.
--Pei

On Fri, Jun 5, 2015 at 2:43 AM, Soumya Shree soumya.sh...@citiustech.com
wrote:

 Hi Folks,

 I am working with Ctakes Wherein I am looking to search reason for
 discontinuation from the clinical notes. Does Ctakes offer any API or
 concept with which we can find the same.
 Example- Mr. xxx have been recommended not to take Dilantin due to its
 side effects.
 In this case we need to train machine so that it tells us that the
 medicine was discontinued and the reason was its side effects.
 I appreciate if I can get small help also.

 Thanks & Regards,
 Soumya Shree

 -Original Message-
 From: Pei Chen [mailto:chen...@apache.org]
 Sent: Friday, June 05, 2015 2:32 AM
 To: dev@ctakes.apache.org
 Subject: [DRAFT] [REPORT] Apache cTAKES Jun 2015

 [DRAFT- Feel free to add/edit]

 ---

 Report from the Apache cTAKES project [Pei Chen]

 ## Description:

 Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is
 an open-source natural language processing system for information extraction
 from electronic medical record clinical free-text.

 ## Activity:

 A talk about cTAKES using Spark/BigTop on processing Twitter data was well
 received at ApacheCon NA 2015.

 The committee just released ctakes-3.2.2 on May 30, 2015; it contains a
 critical patch necessitated by a change in a 3rd party (NLM) validation service.
 Full Release notes: http://s.apache.org/ctakes-3.2.2-release-notes

 The committee is actively working on the next cTAKES release with new
 temporal components/models and various bug fixes in Jira.

 The committee is planning a local meetup (Boston) in the near future to
 integrate cTAKES with Docker for easier deployments.

 ## Issues:

 There are no issues requiring board attention at this time

 ## PMC/Committership changes:

  - Currently 31 committers and 30 PMC members in the project.

  - Last PMC addition was Michelle Chen at Fri Jan 23 2015

  - Last committer addition was Jim Gregoric at Sat Feb 28 2015

 ## Releases:

  - 3.2.2 was released on May 30 2015
  - 3.2.1 was released on Dec 10 2014

  - 3.2.0 was released on Jul 23 2014

 ## Mailing list activity:

  - dev@ctakes.apache.org:

 - 160 subscribers (up 15 in the last 3 months):

 - 227 emails sent to list (211 in previous quarter)

  - u...@ctakes.apache.org:

 - 140 subscribers (up 14 in the last 3 months):

 - 93 emails sent to list (41 in previous quarter)

 ## JIRA activity:

  - 14 JIRA tickets created in the last 3 months

  - 8 JIRA tickets closed/resolved in the last 3 months




 





[DRAFT] [REPORT] Apache cTAKES Jun 2015

2015-06-04 Thread Pei Chen
[DRAFT- Feel free to add/edit]

---

Report from the Apache cTAKES project [Pei Chen]

## Description:

   Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is
   an open-source natural language processing system for information extraction
   from electronic medical record clinical free-text.

## Activity:

A talk about cTAKES using Spark/BigTop on processing Twitter data was well
received at ApacheCon NA 2015.

The committee released ctakes-3.2.2 on May 30, 2015. It contains a
critical patch necessitated by a change in a third-party (NLM) validation
service.
Full Release notes: http://s.apache.org/ctakes-3.2.2-release-notes

The committee is actively working on the next cTAKES release, with new
temporal components/models and various bug fixes tracked in Jira.

The committee is planning a local meetup (Boston) in the near future to
integrate cTAKES with Docker for easier deployments.

## Issues:

There are no issues requiring board attention at this time

## PMC/Committership changes:

 - Currently 31 committers and 30 PMC members in the project.

 - Last PMC addition was Michelle Chen at Fri Jan 23 2015

 - Last committer addition was Jim Gregoric at Sat Feb 28 2015

## Releases:

 - 3.2.2 was released on May 30 2015
 - 3.2.1 was released on Dec 10 2014

 - 3.2.0 was released on Jul 23 2014

## Mailing list activity:

 - dev@ctakes.apache.org:
    - 160 subscribers (up 15 in the last 3 months)
    - 227 emails sent to list (211 in previous quarter)

 - u...@ctakes.apache.org:
    - 140 subscribers (up 14 in the last 3 months)
    - 93 emails sent to list (41 in previous quarter)

## JIRA activity:

 - 14 JIRA tickets created in the last 3 months

 - 8 JIRA tickets closed/resolved in the last 3 months


[ANNOUNCE] Apache cTAKES 3.2.2 released

2015-05-29 Thread Pei Chen
The Apache cTAKES team is pleased to announce the availability of the
3.2.2 release.

For the complete release notes, please visit
http://s.apache.org/ctakes-3.2.2-release-notes

Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is
an open-source natural language processing system for information
extraction from electronic medical record clinical free-text.

The release can be downloaded from
http://ctakes.apache.org/downloads.cgi

For further information, please visit the project website at
http://ctakes.apache.org/

-- The Apache cTAKES Team


[RESULT] [VOTE] Release Apache cTAKES 3.2.2 (rc2)

2015-05-28 Thread Pei Chen
More than 72 hours have passed. The vote for Apache cTAKES 3.2.2 (rc2)
*passes* [1] with 5 +1 votes (4 binding).

+1 (binding)
Pei Chen
Tim Miller
Kim Ebert
Jay Vyas

+1 (non-binding)
Michal Iglewski

There were no -1 or +0 votes cast.

I will publish the release and announce it as soon as the artifacts are
available.

Thanks to everyone who participated!

-- Pei

On Wed, May 13, 2015 at 10:37 AM, Pei Chen chen...@apache.org wrote:

 This is a call for a vote on releasing the following candidate (rc2) as
 Apache cTAKES 3.2.2.

 The major change since rc1 was to include the fix for CTAKES-359 - UMLS
 Authentication failing despite correct username and password.

 For more detailed information on the changes/release notes, please visit:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621&version=12328717

 The release was made using the cTAKES release process documented here:

 http://svn.apache.org/repos/asf/ctakes/site/backup/content/ctakes-release-guide.mdtext

 The candidate is available at:
 https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz
 /.zip

 The tag to be voted on:
 http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.2.2-rc2
 The MD5 checksum of the tarball can be found at:
 https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz.md5
 /.zip.md5

 The signature of the tarball can be found at:

 https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz.asc
 /.zip.asc

 Apache cTAKES' KEYS file, containing the PGP keys used to sign the
 release:
 https://dist.apache.org/repos/dist/release/ctakes/KEYS

 Please vote on releasing these packages as Apache cTAKES 3.2.2. The vote
 is open for at least the next 72 hours.

 The vote passes if at least three binding +1 votes are cast.
 [ ] +1 Release the packages as Apache cTAKES 3.2.2
 [ ] -1 Do not release the packages because...

 Also, the convenience binary can be found at:

 https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-bin.tar.gz.md5
 /.zip





Re: [VOTE] Release Apache cTAKES 3.2.2 (rc2)

2015-05-27 Thread Pei Chen
]
 

 [INFO] BUILD SUCCESS

 [INFO]
 

 [INFO] Total time: 2:36:05.908s

 [INFO] Finished at: Mon May 18 18:04:53 EDT 2015

 [INFO] Final Memory: 72M/244M

 [INFO]
 



 I think there's a long-outstanding issue where you need to pass
 -DskipTests=true during the package/install phase because that unit test
 can't read lvg from a resource jar... I think that issue is still
 outstanding; not sure if folks would like to address it for this particular
 patch release.



 --Pei
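For context on the "URI is not hierarchical" error quoted below: in Java in general, that message comes from `new File(URI)` when the URI is opaque, which is the case for the `jar:file:...` URLs returned for resources packaged inside a jar. Whether this is exactly what the failing cTAKES test does is an assumption here, not something confirmed in this thread. A minimal sketch of the failure and of the usual stream-based alternative (the class name, helper names, and example paths are hypothetical):

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;

final class ResourceLoadingSketch {

    /** True when the URI is a hierarchical file: URI, i.e. new File(uri) can work. */
    static boolean isPlainFile(URI uri) {
        return "file".equalsIgnoreCase(uri.getScheme()) && !uri.isOpaque();
    }

    /** Returns the message thrown by new File(uri), or null if it succeeds. */
    static String fileConstructorError(URI uri) {
        try {
            new File(uri);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    /**
     * Reads a classpath resource through a stream. Unlike new File(...), this
     * works whether the resource sits in a directory or inside a jar.
     */
    static byte[] readResource(Class<?> anchor, String name) {
        try (InputStream in = anchor.getResourceAsStream(name)) {
            if (in == null) {
                throw new IllegalArgumentException("resource not found: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical locations, used only to show the two URI shapes.
        URI onDisk = URI.create("file:/tmp/lvg.properties");
        URI inJar = URI.create("jar:file:/tmp/example-res.jar!/lvg.properties");
        System.out.println(isPlainFile(onDisk) + " -> " + fileConstructorError(onDisk));
        System.out.println(isPlainFile(inJar) + " -> " + fileConstructorError(inJar));
    }
}
```

If the test does turn out to open the lvg resource through a `File`, switching it to a stream (as in `readResource`) would be one way to avoid needing -DskipTests=true; that would have to be checked against the actual test code.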



 *From:* Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com
 kim.eb...@perfectsearchcorp.com]
 *Sent:* Monday, May 18, 2015 2:59 PM
 *To:* dev@ctakes.apache.org
 *Subject:* Re: [VOTE] Release Apache cTAKES 3.2.2 (rc2)



 [ ] -1 Do not release the packages because...

  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.83 sec
  FAILURE!

 Results :

 Tests in error:

 TestClearNLPPipeLine(org.apache.ctakes.dependency.parser.ae.util.TestClearNLPAnalysisEngines):
 URI is not hierarchical

 IMAT Solutions http://imatsolutions.com

 *Kim Ebert*
 Software Engineer
 Office: 208.971.1509
 kim.eb...@imatsolutions.com greg.hub...@imatsolutions.com

 On 05/13/2015 08:37 AM, Pei Chen wrote:

 This is a call for a vote on releasing the following candidate (rc2) as
 Apache cTAKES 3.2.2.

 The major change since rc1 was to include the fix for CTAKES-359 - UMLS
 Authentication failing despite correct username and password.

 For more detailed information on the changes/release notes, please visit:
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621&version=12328717

 The release was made using the cTAKES release process documented here:
 http://svn.apache.org/repos/asf/ctakes/site/backup/content/ctakes-release-guide.mdtext

 The candidate is available at:
 https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz
 /.zip

 The tag to be voted on:
 http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.2.2-rc2

 The MD5 checksum of the tarball can be found at:
 https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz.md5
 /.zip.md5

 The signature of the tarball can be found at:
 https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc2/apache-ctakes-3.2.2-src.tar.gz.asc
 /.zip.asc

 Apache cTAKES' KEYS file, containing the PGP keys used to sign the release:
 https://dist.apache.org/repos/dist/release/ctakes/KEYS

Re: CTAKES mirroring on github.

2015-05-18 Thread Pei Chen
One of the visions behind the *-res projects was to separate out the
resources from code.  In theory, one can filter out all *-res projects from
their git repo and pull in any version of the resources from maven
central...  I won't have enough bandwidth at the moment to try it out or
work on the git piece though...
--Pei

On Thu, May 14, 2015 at 1:56 PM, Kim Ebert kim.eb...@perfectsearchcorp.com
wrote:

  I've done some investigation into using / working with the git repo for
 cTAKES, and I found that it is huge. It doesn't work well with GitHub
 either, as I keep running into timeouts.

 I would like to suggest that we remove two cTAKES build files and
 the ctakes-gui-0.0.1.zip file. This takes the repo from about 8 GB down to
 1.8 GB. It is likely that the git mirror is failing because of
 the large size of the repo. GitHub will also filter out some of these very
 large files, as GitHub's max file size is 100MB.

 git filter-branch --tree-filter 'rm -rf ctakes-gui-0.0.1.zip' origin/cTAKES-GUI-0.0.1
 git filter-branch -f --tree-filter 'rm -rf _cTAKES_build_/cTAKES-2.5*.zip' origin/maven-sandbox
 git filter-branch -f --tree-filter 'rm -rf _cTAKES_build_/cTAKES-2.5*.zip' origin/SHARPn-cTAKES

 # Clean out unreferenced objects from repo
 git -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 \
 -c gc.rerereunresolved=0 -c gc.pruneExpire=now gc


 It may also be helpful to remove
 ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/clearparser_models.jar
 from the git repo as well. (238,248,287 bytes)

 Thoughts?

 IMAT Solutions http://imatsolutions.com
 Kim Ebert
 Software Engineer
 Office: 208.971.1509
 kim.eb...@imatsolutions.com greg.hub...@imatsolutions.com
  On 05/06/2015 01:17 PM, Steven Bethard wrote:

 Yes, I ping this issue every couple months, but no luck so far. (They
 take a look each time I ask, but haven't yet pushed a working git
 mirror for us.)

 Steve

 On Tue, May 5, 2015 at 12:09 PM, Kim Ebertkim.eb...@perfectsearchcorp.com 
 kim.eb...@perfectsearchcorp.com wrote:

  Ah, looks like the issue is still being looked into.
 https://issues.apache.org/jira/browse/INFRA-8553

 On Mon, May 4, 2015 at 4:54 PM, jay vyas jayunit100.apa...@gmail.com 
 jayunit100.apa...@gmail.com
 wrote:


  Thanks kim.

 Can you file an infra issue ?

 they will look into it.

 I filed one originally
 On May 4, 2015 6:32 PM, Kim Ebert kim.eb...@perfectsearchcorp.com 
 kim.eb...@perfectsearchcorp.com
 wrote:


  It looks like the github hasn't been updated in a while. Any reason?

 Thanks,

 Kim

 On Tue, Feb 17, 2015 at 10:36 AM, Finan, Sean 
 sean.fi...@childrens.harvard.edu wrote:


  Our request is for a read-only mirror.  However, if it ever becomes i/o,

  I don't know if this will have what you want, but http://git.apache.org/
  Links to documentation (mostly server setup):
  http://www.apache.org/dev/git.html
  and a wiki (check toward middle and bottom for committer info):
  https://wiki.apache.org/general/GitAtApache



 -Original Message-
 From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu 
 timothy.mil...@childrens.harvard.edu]
 Sent: Tuesday, February 17, 2015 12:31 PM
 To: dev@ctakes.apache.org
 Subject: Re: CTAKES mirroring on github.

 Is there any existing resource to help people who want to use git
 understand the right workflow to contribute to ctakes? (i.e. how this
 interacts with svn repos).
 Tim


 On 02/17/2015 12:23 PM, jay vyas wrote:

  Hi CTakes.  Looks like infra finally got onto the JIRA I made for
  this a while back.  They are currently working on fixing a couple of
  minor glitches w/ the mirroring (not showing all commits)... but there
  now is a mirror for CTakes on github.

  https://github.com/apache/ctakes






Re: UMLS Authentication failing despite correct username and password

2015-05-11 Thread Pei Chen
Michal,
Thanks for pointing that out (It would have been nice if they sent out a
notice about the change in the API call).  Would be great if someone could
open a Jira and verify this fix solves the issue...  I think we should push
out this critical patch asap- I can include it in 3.2.2 and create another
RC2.


On Mon, May 11, 2015 at 11:25 PM, michal.iglew...@uqo.ca wrote:

 Hi Pedro and Sean,

 It seems to me that the service
 https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser now returns
 <?xml version='1.0' encoding='UTF-8'?><Result>true</Result> instead of
 <Result>true</Result>. It means that the line

 result = line.trim().equalsIgnoreCase("<Result>true</Result>");

 in isValidUMLSUser() should be replaced with

 result = line.trim().equalsIgnoreCase("<?xml version='1.0' encoding='UTF-8'?><Result>true</Result>");

 Michal
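An exact string comparison like the one above breaks again if the service changes the declaration (for example double quotes instead of single quotes). A slightly more tolerant check could strip any leading XML declaration before comparing. The following is only a sketch of that idea, not the code that went into cTAKES; the class and method names are made up for illustration:

```java
final class UmlsResponseCheck {

    /**
     * Returns true when the response line reports a valid user, whether or
     * not the payload is prefixed with an XML declaration.
     */
    static boolean isValidUserResponse(String line) {
        if (line == null) {
            return false;
        }
        String body = line.trim();
        // Drop a leading declaration such as <?xml version='1.0' encoding='UTF-8'?>
        if (body.startsWith("<?xml")) {
            int end = body.indexOf("?>");
            if (end < 0) {
                return false; // malformed declaration
            }
            body = body.substring(end + 2).trim();
        }
        return body.equalsIgnoreCase("<Result>true</Result>");
    }

    public static void main(String[] args) {
        String oldStyle = "<Result>true</Result>";
        String newStyle = "<?xml version='1.0' encoding='UTF-8'?><Result>true</Result>";
        System.out.println(isValidUserResponse(oldStyle));
        System.out.println(isValidUserResponse(newStyle));
    }
}
```

Both response shapes mentioned in this thread are accepted, and anything else (including `<Result>false</Result>`) is rejected.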

 -Message d'origine-
 De : Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
 Envoyé : May-11-15 5:41 PM
 À : dev@ctakes.apache.org
 Objet : RE: UMLS Authentication failing despite correct username and
 password

 Argh.  Our email server may have mucked with the url that I pasted:

 H t t p s : / / uts - ws . nlm . nih . gov / restful / isValidUMLSUser

 <property key="umlsUrl" value="INSERT URL HERE, NO SPACES"/>

 -Original Message-
 From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
 Sent: Monday, May 11, 2015 5:38 PM
 To: dev@ctakes.apache.org
 Subject: RE: UMLS Authentication failing despite correct username and
 password

 Hi Pedro,



 Check the cTakesHsql.xml and make sure that the line matches:



 <property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser"/>



 In an older version of cTAKES with an output message as you have:

 11 May 2015 15:59:47  INFO AbstractJCasTermAnnotator - Default - Loading
 dictionary into memory.  Initial run may take few mins to load. Please be
 patient...

 That line got corrupted.



 Sean



 -Original Message-

 From: Pedro Teixeira [mailto:teixeir...@gmail.com]

 Sent: Monday, May 11, 2015 5:30 PM

 To: dev@ctakes.apache.org

 Subject: UMLS Authentication failing despite correct username and password



 So I've checked the Dictionary lookup XML file and that password works to
 log in via the website. This was also working last week but stopped at some
 point over the last week. I've got cTAKES running on a linux system so I
 can index batches of documents via a script. The exact error is as follows
 (with the username/password blocked out).



 11 May 2015 15:59:26  INFO LvgCmdApiResourceImpl - cwd =

 /home/PT/cTAKES/apache-ctakes-3.2.1

 11 May 2015 15:59:26  INFO LvgCmdApiResourceImpl - cd
 /home/PT/cTAKES/apache-ctakes-3.2.1/resources/org/apache/ctakes/lvg/

 11 May 2015 15:59:27  INFO LvgCmdApiResourceImpl - cd

 /home/PT/cTAKES/apache-ctakes-3.2.1

 11 May 2015 15:59:27  INFO ClearNLPDependencyParserAE - using Morphy
 analysis? true Loading configuration.

 Loading feature templates.

 Loading lexica.

 Loading model:


 

 11 May 2015 15:59:42  INFO Chunker - Chunker model file:

 org/apache/ctakes/chunker/models/chunker-model.zip

 11 May 2015 15:59:44  INFO ContextDependentTokenizerAnnotator - Finite
 state machines loaded.

 11 May 2015 15:59:44  INFO ConstituencyParser - Initializing parser...

 11 May 2015 15:59:46  INFO ContextAnnotator - SCOPE ORDER: [1, 3]

 11 May 2015 15:59:46  INFO NegationContextAnalyzer - initBoundaryData()
 called for ContextInitializer

 11 May 2015 15:59:47  INFO POSTagger - POS tagger model file:

 org/apache/ctakes/postagger/models/mayo-pos.zip

 11 May 2015 15:59:47  INFO AbstractJCasTermAnnotator - Default - Loading
 dictionary into memory.  Initial run may take few mins to load. Please be
 patient...

 11 May 2015 15:59:47  INFO AbstractJCasTermAnnotator - Using dictionary
 lookup window type: org.apache.ctakes.typesystem.type.textspan.Sentence

 11 May 2015 15:59:47  INFO AbstractJCasTermAnnotator - Exclusion tagset

 loaded: CC CD DT EX IN LS MD PDT POS PP PP$ PRP PRP$ RP TO VB VBD VBG VBN
 VBP VBZ WDT WP WPS WRB

 11 May 2015 15:59:47  INFO AbstractJCasTermAnnotator - Using minimum term
 text span: 3

 11 May 2015 15:59:47  INFO DictionaryDescriptorParser - Parsing dictionary

 specifications:


 /home/PT/cTAKES/apache-ctakes-3.2.1/resources/org/apache/ctakes/dictionary/lookup/fast/cTakesHsql.xml

 11 May 2015 15:59:48 ERROR UmlsUserApprover - UMLS Account at
 

[VOTE] Release Apache cTAKES 3.2.2 (rc1)

2015-05-05 Thread Pei Chen
This is a call for a vote on releasing the following candidate (rc1) as
Apache cTAKES 3.2.2.

The major changes include:
- Improved optional Temporal models (Time + Event Relationships models now
available)
- Other bug fixes/enhancements from Jira (see release notes Jira link
below).

I manually downloaded the bin as well as resources and tried the CVD with
the AggregatePlaintextFastUMLSProcessor.xml and CPE testing the
AggregateCdaProcessor.

Would be great if folks have time to test/verify especially if you opened
any of the Jira's below to ensure the bugs have been fixed/integrated.

For more detailed information on the changes/release notes, please visit:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621&version=12328717

The release was made using the cTAKES release process documented here:

http://svn.apache.org/repos/asf/ctakes/site/backup/content/ctakes-release-guide.mdtext

The candidate is available at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc1/apache-ctakes-3.2.2-src.tar.gz

/.zip

The tag to be voted on:
http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.2.2-rc1

The MD5 checksum of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc1/apache-ctakes-3.2.2-src.tar.gz.md5
/.zip.md5

The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc1/apache-ctakes-3.2.2-src.tar.gz.asc
/.zip.asc

Apache cTAKES' KEYS file, containing the PGP keys used to sign the release:
https://dist.apache.org/repos/dist/release/ctakes/KEYS

Please vote on releasing these packages as Apache cTAKES 3.2.2. The vote is
open for at least the next 72 hours.

The vote passes if at least three binding +1 votes are cast.

[ ] +1 Release the packages as Apache cTAKES 3.2.2

[ ] -1 Do not release the packages because...


Also, the convenience binary can be found at:

https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.2-rc1/apache-ctakes-3.2.2-bin.tar.gz.md5

/.zip


Thanks!
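For anyone testing the candidate, the MD5 step can be scripted. The sketch below computes the MD5 of a downloaded artifact and compares it with the content of the matching .md5 file. It assumes the .md5 file starts with the hex digest (optionally followed by a file name); other layouts would need different parsing. The class and method names are illustrative only, and this does not replace checking the .asc signature against the KEYS file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5Check {

    /** Streams the input through MD5 and returns the digest as lowercase hex. */
    static String md5Hex(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Compares a computed digest with the first token of the .md5 file content. */
    static boolean matches(String computedHex, String md5FileContent) {
        String expected = md5FileContent.trim().split("\\s+")[0];
        return expected.equalsIgnoreCase(computedHex);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Md5Check <artifact> <artifact.md5>");
            return;
        }
        String expected = new String(Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(matches(md5Hex(in), expected) ? "MD5 OK" : "MD5 MISMATCH");
        }
    }
}
```

Usage would be along the lines of `java Md5Check apache-ctakes-3.2.2-src.tar.gz apache-ctakes-3.2.2-src.tar.gz.md5` after downloading both files from the candidate directory.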


Re: Command-line tool for cTAKES

2015-04-30 Thread Pei Chen
If you already have the CPE running, you can pass the descriptor on the
command line to one of:

 - org.apache.ctakes.ytex.tools.RunCPE
 - org.apache.ctakes.core.cpe.CmdLineCpeRunner
 - org.apache.uima.examples.cpe.SimpleRunCPE

http://mail-archives.apache.org/mod_mbox/ctakes-dev/201504.mbox/%3ccapqz87qzxm-qmfww0cl+b9b4cfo+wsdg57bq7f54cr8keu5...@mail.gmail.com%3e

If you need it programmatically, check out a thread Tim started:

http://mail-archives.apache.org/mod_mbox/ctakes-dev/201503.mbox/%3ce084d8efe2b03a408b324458c5212e9434c10...@chexmbx3a.chboston.org%3e
Hope that helps...
--Pei

On Thu, Apr 30, 2015 at 1:24 AM, Yingcheng Sun yxs...@case.edu wrote:

 I also have this problem. Hope anybody can offer some examples or tools
 easily used for programming.

 Yingcheng

 On Thu, Apr 30, 2015 at 1:06 AM, Giuseppe Totaro totarope...@gmail.com
 wrote:

  Hi all,
 
  I am a newbie with cTAKES. I am working on developing an application that
  relies on cTAKES.
  I already did some experiments using the CVD and CPE tools. I am just
  wondering if there is any command line tool that I can use to perform an
  analysis on plain text and then generate the annotated output.
 
  Thanks a lot,
  Giuseppe
 



Re: Image to text conversion

2015-04-30 Thread Pei Chen
Sekhar,
There are a few open Jira issues. I think it would be a great contribution
if you get this to work:

   - CTAKES-189 https://issues.apache.org/jira/browse/CTAKES-189
     GSoC: Implement OCR/Tika to standardize text input for cTAKES
   - CTAKES-105 https://issues.apache.org/jira/browse/CTAKES-105
     Add Apache Tika integration


On Thu, Apr 30, 2015 at 1:21 AM, Hari, Sekhar sekhar.h...@cgi.com wrote:

 Thanks. Let me try this, and I will let you know if I need any help.

 Cheers,
 Sekhar H.

 -Original Message-
 From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
 Sent: Thursday, April 30, 2015 10:44 AM
 To: dev@ctakes.apache.org; u...@ctakes.apache.org
 Subject: Re: Image to text conversion

 What about using Apache Tika within cTAKES for this? Tika supports OCR
 through Tesseract:

 http://wiki.apache.org/tika/TikaOCR

 Cheers,
 Chris


 ++
 Chris Mattmann, Ph.D.
 Chief Architect
 Instrument Software and Science Data Systems Section (398) NASA Jet
 Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 168-519, Mailstop: 168-527
 Email: chris.a.mattm...@nasa.gov
 WWW:  http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Associate Professor, Computer Science Department University of
 Southern California, Los Angeles, CA 90089 USA
 ++






 -Original Message-
 From: Hari, Sekhar sekhar.h...@cgi.com
 Reply-To: dev@ctakes.apache.org dev@ctakes.apache.org
 Date: Wednesday, April 29, 2015 at 10:11 PM
 To: dev@ctakes.apache.org dev@ctakes.apache.org, 
 u...@ctakes.apache.org u...@ctakes.apache.org
 Subject: Image to text conversion

 Hello All -
 
 I am looking for an OCR ability in cTAKES. The requirement is to
 convert scanned image documents (e.g. scanned handwritten
 prescriptions) into text format, and then apply the usual NLP pipeline to
 convert the unstructured text into structured data.
 
 Can cTAKES convert scanned image documents into text? If so, please
 help me understand this by sharing any documents or videos.
 
 Many thanks,
 Sekhar H.
 




Re: Request for help:: NCBO Ontology Extraction Tool for i2b2

2015-04-23 Thread Pei Chen
Sekhar,
Is it happening with all of the ontologies you mentioned or just one?  Those
ontologies do not seem very big or deep.  Did you notice anything in the
logs suggesting that something in the ontology has a circular reference or
is causing an infinite loop?
I think Lori from i2b2 may be better at answering this, since this isn't
exactly cTAKES related...
--Pei


On Wed, Apr 22, 2015 at 7:21 AM, Hari, Sekhar sekhar.h...@cgi.com wrote:

 Hello there -

 Introducing myself:


 My name is Sekhar Hari, responsible for Bio-informatics products/
 solutions in CGI, a Canadian company. In this capacity, I am also
 responsible for developing a software to identify potential adverse events
 and serious adverse events in healthcare settings.


 I have been trying to extract and process a few ontologies using the latest
 version of the NCBO Ontology Extraction Tool to load into i2b2, but with no
 luck. I can extract the staging file and load it into the i2b2
 staging table. However, when I run the
 edu.harvard.i2b2.ncbo.extraction.NCBOOntologyProcessAll program, it always
 fails with GCOverheadLimit. I tried increasing the JVM memory to 8GB, but
 it made no difference. My hardware resources are limited at present, and I
 can't increase the JVM memory size beyond 8GB.

 As I have a demo for a large hospital coming up soon, in the interest of
 time, would you be kind enough to extract and process the following
 ontologies, and upload the final metadata file here?
 http://i2b2.bioontology.org/

 Ontology IDs:

 1.   WHO-ART

 2.   OAE

 3.   SSE

 4.   OVAE

 The user-guide that I was following is attached.

 Many thanks in advance.

 Regards,
 Sekhar H.



Re: Include the smoking status detection in AggregatePlaintextFastUMLSProcessor.xml

2015-04-21 Thread Pei Chen
If it works for you, I would keep it in there then.  Leave the info in the
Jira, and we should double-check in the code that this negation step is only
used for the smoking status types.
--Pei

On Tue, Apr 21, 2015 at 1:04 PM, Tom Devel deve...@gmail.com wrote:

 After further testing: when I remove the <node>NegationAnnotator</node> step
 in ProductionPostSentenceAggregate_step2_libsvm.xml (which I assume is the
 sub smoking desc xml you mean), the smoking status is no longer classified
 correctly when negations are present, so this step does not look redundant
 to me.

 For example, "He denied use of tobacco" is then classified as
 CURRENT_SMOKER. If I leave this negation step in, it is correctly classified
 as NON_SMOKER.

 I tried changing the order in which the smoking status nodes
 <node>SentenceAdjuster</node> and <node>ClassifiableEntriesAnnotator</node>
 are run in the clinical pipeline; putting them directly after lvg or at the
 end of the flow does not change the observation above.

 However, you said that leaving the NegationAnnotator in could overwrite
 assertion values. How can this be prevented while keeping correct smoking
 status classifications?

 On Mon, Apr 20, 2015 at 2:02 PM, Chen, Pei pei.c...@childrens.harvard.edu
 
 wrote:

  Great. There is a redundant Negation step in one of the final sub smoking
  desc xml's.
  Leave the Jira as a placeholder to clean up the smoking status desc's.
 
  Sent from my iPhone
 
   On Apr 20, 2015, at 1:11 PM, Tom Devel deve...@gmail.com wrote:
  
   Pei,
  
   I did what you recommended: I ran a test input with this new pipeline and
   did a diff with the clinical pipeline without the smoking status on the
   two CAS files. It seems to do the trick: the UMLS concept tags are still
   the same, and there is now a new tag for the smoking status annotation,
   great!
  
   Before I create the Jira item, what do you mean by removing the last
   NegEx?
  
   In AggregatePlaintextFastUMLSProcessor, the node of the NegationAnnotator
   is commented out:
   <!-- <node>NegationAnnotator</node> -->
  
   Did you mean this node?
  
   At the top of the file, there is an import for the NegationAnnotator:
   <delegateAnalysisEngine key="NegationAnnotator">, but it is not commented
   out and never run in the fixed flow.
  
   Am I correct that the negation detection in the clinical pipeline is now
   performed by PolarityCleartkAnalysisEngine?
  
   Thanks,
   Tom
  
   On Sat, Apr 18, 2015 at 12:53 AM, Pei Chen chen...@apache.org
 wrote:
  
   Tom,
   I would put it at the end of the pipeline (at a minimum, it should come
   after sectionizer, sentence, tokenizer, lvg).  I would remove
   ExternalBaseAggregateTAE, as it simulates the sectionizer, sentence,
   tokenizer, lvg, which would be redundant.  I would also probably remove the
   last NegEx, which could override the assertion values.
  
   Disclaimer: I did not test this yet.  Feel free to open a Jira item if it
   works for you so it can be tracked.  It seems kind of strange to have a
   descriptor xml define another xml descriptor to be loaded up via code
   again. I think this could be simplified.
   --Pei
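The ordering suggested above can be illustrated as a simple list operation: keep the clinical flow as is, then append the smoking status nodes, skipping ExternalBaseAggregateTAE. This is only a sketch of the ordering; the node names are the ones quoted in this thread (the clinical flow is cut off in the archive, so the list is incomplete), and the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FixedFlowMergeSketch {

    /** Appends the extra nodes after the base flow, skipping redundant or duplicate ones. */
    static List<String> merge(List<String> baseFlow, List<String> extraFlow, List<String> redundant) {
        List<String> merged = new ArrayList<>(baseFlow);
        for (String node : extraFlow) {
            if (!redundant.contains(node) && !merged.contains(node)) {
                merged.add(node);
            }
        }
        return merged;
    }

    /** Renders the node list in the <fixedFlow> form used by the descriptors. */
    static String toFixedFlowXml(List<String> nodes) {
        StringBuilder xml = new StringBuilder("<fixedFlow>\n");
        for (String node : nodes) {
            xml.append("  <node>").append(node).append("</node>\n");
        }
        return xml.append("</fixedFlow>").toString();
    }

    public static void main(String[] args) {
        // Clinical pipeline nodes as far as they are quoted in this thread.
        List<String> clinical = Arrays.asList(
                "SimpleSegmentAnnotator", "SentenceDetectorAnnotator", "TokenizerAnnotator",
                "LvgAnnotator", "ContextDependentTokenizerAnnotator", "POSTagger", "Chunker",
                "AdjustNounPhraseToIncludeFollowingNP", "AdjustNounPhraseToIncludeFollowingPPNP",
                "DictionaryLookupAnnotatorDB", "DrugNER", "DependencyParser", "SemanticRoleLabeler");
        List<String> smoking = Arrays.asList(
                "ExternalBaseAggregateTAE", "SentenceAdjuster", "ClassifiableEntriesAnnotator");
        List<String> redundant = Arrays.asList("ExternalBaseAggregateTAE");
        System.out.println(toFixedFlowXml(merge(clinical, smoking, redundant)));
    }
}
```

Listing a node in the fixed flow is not sufficient on its own: in a UIMA aggregate descriptor each node also has to be declared as a delegate analysis engine, so the smoking status delegates would need to be imported into the clinical descriptor as well.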
  
   On Thu, Apr 16, 2015 at 7:29 PM, Tom Devel deve...@gmail.com
 wrote:
  
   Hi,
  
   I am using the smoking status AE from SimulatedProdSmokingTAE.xml. It
   works fine; I can see the smoking status annotation in the CVD.
  
   Now I would like to include the smoking status detection in the clinical
   pipeline of AggregatePlaintextFastUMLSProcessor.xml, so that when I run
   the clinical pipeline, the smoking status will also be determined.
  
   How can I do this?
  
   I am thinking of just putting the nodes from the fixed flow of
   SimulatedProdSmokingTAE.xml into the fixed flow of
   AggregatePlaintextFastUMLSProcessor.xml. Is this the right approach?
  
   If so, at which exact place in the clinical pipeline fixed flow should
   these nodes be added?
  
   Is there a preferred place (such as appending after the last node or
   putting them before the first node)?
  
   Can a wrong position or ordering of the smoking status nodes
   damage/corrupt the rest of the annotations?
  
   SimulatedProdSmokingTAE.xml contains these lines with the fixed flow:
  
   <fixedFlow>
   <node>ExternalBaseAggregateTAE</node>
   <node>SentenceAdjuster</node>
   <node>ClassifiableEntriesAnnotator</node>
   </fixedFlow>
  
   AggregatePlaintextFastUMLSProcessor.xml (3.2.2 from SVN) contains this
   fixed flow:
  
   <fixedFlow>
   <node>SimpleSegmentAnnotator</node>
   <node>SentenceDetectorAnnotator</node>
   <node>TokenizerAnnotator</node>
   <node>LvgAnnotator</node>
   <node>ContextDependentTokenizerAnnotator</node>
   <node>POSTagger</node>
   <!-- <node>ClearPOSTagger</node> -->
   <node>Chunker</node>
   <node>AdjustNounPhraseToIncludeFollowingNP</node>
   <node>AdjustNounPhraseToIncludeFollowingPPNP</node>
   <!--<node>LookupWindowAnnotator</node>-->
   <node>DictionaryLookupAnnotatorDB</node>
   <node>DrugNER</node>
   <node>DependencyParser</node>
   <node>SemanticRoleLabeler</node>

Apache cTAKES Hackathon: Containers- Docker + Kubernetes?

2015-04-17 Thread Pei Chen
Would folks be interested in joining a hackathon near Boston?  Exact time
and place TBA.

Goal: Get cTAKES to work with Docker and Kubernetes and have a working
example in sandbox.

Deploying cTAKES is not straightforward and is difficult to manage, let
alone in a distributed environment.  Containers are not as extreme as full
VM images and may be just the right balance.
Docker[1] and Kubernetes[2] seem to be the popular choices nowadays (ASL
2.0 licensed).
[1] https://www.docker.com/whatisdocker/
[2] http://kubernetes.io

--Pei


cTAKES @ ApacheCon 2015 next week

2015-04-09 Thread Pei Chen
Just a reminder-

Jay and I are planning to have a session (Tues) at ApacheCon 2015 on using
cTAKES in a Big Data context using Spark/Hadoop.

If you happen to be there, feel free to stop by the session.  Or if you're in
the neighborhood and want to meet up over coffee, feel free to drop us a
note.


--Pei


Re: Question about how to interpret Ctakes output

2015-04-08 Thread Pei Chen
[+dev]
Yu,
Check out the type system:
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml

Note:  I believe what you really want is
org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation
and not org.apache.ctakes.assertion.medfacts.types.Concept (anything in
assertion.medfacts* is really an internal construct not intended to be
used outside of the assertion module).

IdentifiedAnnotation.ontologyConceptArr[] contains the array of
org.apache.ctakes.typesystem.type.refsem.OntologyConcept/UMLSConcept
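As a rough guide to the generic attributes in the element quoted in the question, going by how UIMA's XCAS serialization generally works: `_id` is an identifier for that feature structure within the output file, `_ref_sofa` points (by `_id`) at the Sofa, i.e. the document text being analyzed, and `begin`/`end` are character offsets into that text with `end` exclusive. `originalEntityExternalId` is specific to the assertion module and is not explained in this thread. The sketch below reads those attributes with the JDK's DOM parser only; the class and method names are illustrative, and it is not a substitute for reading the CAS through the UIMA API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

final class XcasAnnotationSketch {

    /** The element from the question, with attribute quoting restored. */
    static final String EXAMPLE =
            "<org.apache.ctakes.assertion.medfacts.types.Concept _indexed=\"2\" "
            + "_id=\"34678\" _ref_sofa=\"15\" begin=\"202\" end=\"215\" "
            + "conceptType=\"PROBLEM\" conceptText=\"Date of Birth\" "
            + "externalId=\"0\" originalEntityExternalId=\"13854\"/>";

    /** Parses a single annotation element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed element", e);
        }
    }

    /** Number of characters covered by the annotation (end is exclusive). */
    static int spanLength(Element annotation) {
        return Integer.parseInt(annotation.getAttribute("end"))
                - Integer.parseInt(annotation.getAttribute("begin"));
    }

    /** Cuts the covered text out of the document text using begin/end. */
    static String coveredText(String documentText, Element annotation) {
        int begin = Integer.parseInt(annotation.getAttribute("begin"));
        int end = Integer.parseInt(annotation.getAttribute("end"));
        return documentText.substring(begin, end);
    }

    public static void main(String[] args) {
        Element concept = parse(EXAMPLE);
        System.out.println("type:   " + concept.getTagName());
        System.out.println("text:   " + concept.getAttribute("conceptText"));
        System.out.println("length: " + spanLength(concept));
    }
}
```

In the quoted example the span is 215 - 202 = 13 characters, which matches the length of "Date of Birth".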


On Wed, Apr 8, 2015 at 3:53 PM, Liang, Yu yu.li...@nyumc.org wrote:

  Dear Pei,

  Thank you for your previous help. I think I figured out how to run Ctakes
 from the command line using the “AggregateCdaUMLSProcessor.xml” Analysis Engine.
 But I am wondering whether there is any tutorial on how to interpret the XML
 results, specifically the part containing “conceptText=”.

  For example:

   <org.apache.ctakes.assertion.medfacts.types.Concept _indexed="2" _id="34678"
 _ref_sofa="15" begin="202" end="215" conceptType="PROBLEM"
 conceptText="Date of Birth" externalId="0" originalEntityExternalId="13854"/>

  What do “_id”, “_ref_sofa”, “originalEntityExternalId”, etc. mean?

  Appreciate!



   Yu Liang

  CHIBI








Re: Running cTAKES via command line

2015-04-03 Thread Pei Chen
Take one of the existing startup scripts such as
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-distribution/src/main/bin/runctakesCPE.sh
Replace the main class org.apache.uima.tools.cpm.CpmFrame with one of:
org.apache.ctakes.ytex.tools.RunCPE
org.apache.ctakes.core.cpe.CmdLineCpeRunner
org.apache.uima.examples.cpe.SimpleRunCPM (requires the UIMA examples jar)
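
Since an example command line was requested, here is a minimal sketch of how one could be assembled. It assumes the binary distribution layout (desc/, resources/ and lib/ under the install directory); the install path, the descriptor path, and the heap size are placeholders, and the class name CpeCommandLine is made up for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Sketch: build the java command for a headless CPE run. */
final class CpeCommandLine {

    /**
     * @param ctakesHome    install directory (assumed to contain desc/, resources/ and lib/)
     * @param mainClass     runner class, e.g. org.apache.ctakes.core.cpe.CmdLineCpeRunner
     * @param cpeDescriptor CPE descriptor file, e.g. one saved from the CPE GUI
     */
    static List<String> build(String ctakesHome, String mainClass, String cpeDescriptor) {
        String sep = File.pathSeparator; // ":" on Unix, ";" on Windows
        String classpath = ctakesHome + "/desc/" + sep
                + ctakesHome + "/resources/" + sep
                + ctakesHome + "/lib/*"; // the wildcard is expanded by the JVM itself
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx3g"); // assumed heap size; adjust as needed
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add(mainClass);
        cmd.add(cpeDescriptor);
        return cmd;
    }

    public static void main(String[] args) {
        // Placeholder paths; replace with your own install and descriptor.
        List<String> cmd = build("/opt/ctakes",
                "org.apache.ctakes.core.cpe.CmdLineCpeRunner",
                "/path/to/my_cpe.xml");
        System.out.println(String.join(" ", cmd));
        // To launch instead of print:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

When typing the printed command into a shell by hand, quote the classpath so the shell does not expand lib/* before the JVM sees it; an unquoted wildcard is a common cause of "Could not find or load main class".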



On Fri, Apr 3, 2015 at 11:18 AM, Pedro Teixeira teixeir...@gmail.com
wrote:

 Pei Chen chenpei@... writes:

  There were a couple of recent threads about this [1].  In particular search
  for:
  CmdLineCpeRunner.java and RunCPE.java

  [1]
  http://mail-archives.apache.org/mod_mbox/ctakes-dev/201502.mbox/%3cCAHnnHnZFde5MF6dDV6Y2R4jyYgua1a43SrdNZRsKJQWDDtiB8w-JsoAwUIsXosN+BqQ9rBEUg at public.gmane.org%3e

  We should probably add it to the wiki documentation or FAQ...

  On Thu, Apr 2, 2015 at 10:58 PM, Pedro Teixeira teixeira09@... wrote:

   John Green john.travis.green at ... writes:

    Hi!
    It depends on what you mean by run on the command line... Can you
    clarify the use case? Jg

    On Thu, Apr 2, 2015 at 5:42 PM, Pedro Teixeira teixeira09 at ... wrote:

     Hello, I've got an installation of cTAKES running but am unsure of how to
     run it via commandline only. I'd like to write a script to automate
     processing and skip the GUI. A quick search hasn't turned anything up. Any
     advice on how to do that? Will I have to dig into the code to do this?
     Thanks!

   I'd like to have as input a string/file/directory with files and then call
   cTAKES to process and output the XML result (picking one of the analysis
   engines/setting parameters upon initiation). I want to just run this
   without having to boot up the GUI and manually select everything and click
   run. Seems easiest to automate that if I can run it via command line and
   then just write scripts around that.

   Thanks!

 


 That looks like it'll do the trick, although I'm having a little bit of
 trouble invoking it from the command line. I keep getting "Could not find
 or load main class" errors. Unfortunately my Java is a bit rusty. I
 have the -cp argument in for all the paths, but when I go manually
 searching around I can't find the correct class. I'm assuming it's just
 contained in the .jar files, but despite providing /lib/* to -cp it
 doesn't seem to be finding it. Any advice? Perhaps an example command
 line invoking the CmdLineCpeRunner?

 Thanks so much for your help!




Re: Running cTAKES via command line

2015-04-03 Thread Pei Chen
There were a couple of recent threads about this [1].  In particular search
for:
CmdLineCpeRunner.java and RunCPE.java

[1]
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201502.mbox/%3ccahnnhnzfde5mf6ddv6y2r4jyygua1a43srdnzrskjqwddti...@mail.gmail.com%3e

We should probably add it to the wiki documentation or FAQ...

On Thu, Apr 2, 2015 at 10:58 PM, Pedro Teixeira teixeir...@gmail.com
wrote:

 John Green john.travis.green@... writes:

  Hi!
  It depends on what you mean by run on the command line... Can you
  clarify the use case? Jg

  On Thu, Apr 2, 2015 at 5:42 PM, Pedro Teixeira teixeira09@... wrote:

   Hello, I've got an installation of cTAKES running but am unsure of how to
   run it via commandline only. I'd like to write a script to automate
   processing and skip the GUI. A quick search hasn't turned anything up. Any
   advice on how to do that? Will I have to dig into the code to do this?
   Thanks!


 I'd like to have as input a string/file/directory with files and then call
 cTAKES to process and output the XML result (picking one of the analysis
 engines/setting parameters upon initiation). I want to just run this
 without having to boot up the GUI and manually select everything and click
 run. Seems easiest to automate that if I can run it via command line and
 then just write scripts around that.

 Thanks!




Re: Ctakes Null Pointer Error for org.apache.ctakes.dependency.parser.util.DependencyUtility

2015-03-27 Thread Pei Chen
Hoang,
3.0 was released a long time ago (02/2013).  (According to the tag/history,
it didn't have the null fix until 3.1 in 6/2013?)
http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.0.0-incubating/ctakes-dependency-parser/src/main/java/org/apache/ctakes/dependency/parser/util/DependencyUtility.java


On Fri, Mar 27, 2015 at 1:46 PM, Pham, Hoang hp...@tuftsmedicalcenter.org
wrote:

 To Timothy,

 Sorry, I misspelled your name before.

 Thank You,

 Hoang Pham


 -Original Message-
 From: Pham, Hoang [mailto:hp...@tuftsmedicalcenter.org]
 Sent: Fri 3/27/2015 1:38 PM
 To: dev@ctakes.apache.org
 Subject: RE: Ctakes Null Pointer Error for
 org.apache.ctakes.dependency.parser.util.DependencyUtility

 To Tomothy,

 I have cTakes 3.0 installed. I looked at the class and there is a null check,
 but for some reason it is not catching the null. Also, when I use the indexes
 that came with the cTakes install, the program parses without any errors.

 Thank You,

 Hoang Pham


 -Original Message-
 From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
 Sent: Fri 3/27/2015 9:38 AM
 To: dev@ctakes.apache.org
 Subject: Re: Ctakes Null Pointer Error for
 org.apache.ctakes.dependency.parser.util.DependencyUtility

 Hi Hoang,
 Can you let me know what version of cTAKES you're using? I looked in
 that location in trunk and found a null check, so it could be that it's
 a bug that's been fixed already. In the meantime, if you just want to
 see if your dictionary is working right, you could disable the
 SubjectAttributeClassifier which seems to be the annotator where this
 error is coming from. Some of the other attributes rely on dependency
 features as well so you might disable them temporarily as well.
 Tim


 On 03/27/2015 08:32 AM, Pham, Hoang wrote:
  Hi All,
 
  I am trying to add my own dictionary to cTakes. I have added a lucene
 index for the dictionary, but when the index is added, I would receive a
 null pointer exception for the
 org.apache.ctakes.dependency.parser.util.DependencyUtility class.
  The stack trace is:
 
  org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(407)
  SEVERE: Exception occurred
  org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
  at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:391)
  at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
  at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
  at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
  at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
  at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
  at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
  at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
  at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
  at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
  at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
  at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:229)
  at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:259)
  at Parsing.main(Parsing.java:49)
  Caused by: java.lang.NullPointerException
  at org.apache.ctakes.dependency.parser.util.DependencyUtility.getPath(DependencyUtility.java:263)
  at org.apache.ctakes.assertion.attributes.subject.SubjectAttributeClassifier.extract(SubjectAttributeClassifier.java:181)
  at org.apache.ctakes.assertion.attributes.features.SubjectFeaturesExtractor.extract(SubjectFeaturesExtractor.java:57)
  at org.apache.ctakes.assertion.attributes.features.SubjectFeaturesExtractor.extract(SubjectFeaturesExtractor.java:1)
  at org.apache.ctakes.assertion.medfacts.cleartk.AssertionCleartkAnalysisEngine.process(AssertionCleartkAnalysisEngine.java:475)
  at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
  at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
  ... 13 more

  Mar 27, 2015 8:29:54 AM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(275)
  SEVERE: Exception occurred
  

Re: Dependency Parser model data

2015-03-15 Thread Pei Chen
Ephi,
The ClearNLP models in the current cTAKES releases (since 3.1.0 [1]) should
contain much more.  They should contain at least MiPACQ and SHARP training
data.  Could you point us to the documentation so we can update it?  I
believe the breakdown was:


   - Clinical questions: 1,600 sentences, 30,138 tokens.
   - Medpedia articles: 2,796 sentences, 49,922 tokens.
   - MiPACQ clinical notes: 8,040 sentences, 107,663 tokens.
   - MiPACQ pathological notes: 1,225 sentences, 21,581 tokens.
   - Seattle group health clinical notes: 5,020 sentences, 61,124 tokens.
   - Seattle group health pathological notes: 2,294 sentences, 34,384 tokens.
   - SHARP clinical notes: 6,787 sentences, 94,205 tokens.
   - SHARP stratified: 4,316 sentences, 43,037 tokens.
   - SHARP stratified SGH: 4,963 sentences, 49,081 tokens.
   - TEMPREL clinical notes: 19,775 sentences, 266,979 tokens.
   - TEMPREL pathological notes: 4,335 sentences, 78,829 tokens.
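
For a sense of scale, the per-corpus figures listed above can be totalled. The sketch below simply sums the numbers as given; it assumes the corpora do not overlap, which the list does not confirm, and the class name is made up for illustration.

```java
/** Sketch: total the sentence and token counts from the list above. */
final class CorpusTotals {

    // {sentences, tokens} per corpus, in the order listed above.
    static final int[][] COUNTS = {
        {1600, 30138},    // Clinical questions
        {2796, 49922},    // Medpedia articles
        {8040, 107663},   // MiPACQ clinical notes
        {1225, 21581},    // MiPACQ pathological notes
        {5020, 61124},    // Seattle group health clinical notes
        {2294, 34384},    // Seattle group health pathological notes
        {6787, 94205},    // SHARP clinical notes
        {4316, 43037},    // SHARP stratified
        {4963, 49081},    // SHARP stratified SGH
        {19775, 266979},  // TEMPREL clinical notes
        {4335, 78829},    // TEMPREL pathological notes
    };

    /** Sums one column: 0 = sentences, 1 = tokens. */
    static int total(int column) {
        int sum = 0;
        for (int[] row : COUNTS) {
            sum += row[column];
        }
        return sum;
    }

    public static void main(String[] args) {
        // prints: 61151 sentences, 836943 tokens
        System.out.println(total(0) + " sentences, " + total(1) + " tokens");
    }
}
```

That is roughly 61 thousand sentences against the 1,600 clinical questions mentioned in the documentation.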

There has been some discussion of appending to/augmenting the existing
annotated/training data [2].  I think the short answer is that there is
currently no easy way short of having to sign DUAs with every single
source institution.

[1] http://svn.apache.org/r1465043
[2]
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201412.mbox/%3ce5a9fa5abbf1ca4085d4f0794852a51e24241...@chexmbx3a.chboston.org%3E


On Sun, Mar 15, 2015 at 11:58 AM, Ephi eph...@gmail.com wrote:

 Hi -

 From the documentation, the data used to train the dep parser in cTAKES
 seems to be 1600 clinical questions (from the Mayo clinic?).

 Is there a way to retrieve this data in order to retrain the model (while
 adding on additional data) ?

 Thanks!
 Ephi



Re: cTakes setup

2015-03-13 Thread Pei Chen
Mitch,
-The dev@ and user@ mailing lists are archived and searchable; that is
probably the best place to look for earlier discussions.
-Could you clarify what you are trying to achieve or the issue that you are
experiencing with the -Xmx?  There are models and dictionaries that get
loaded into memory; the default is 3 GB to accommodate those.
Increasing it may or may not improve performance; in fact it may even
decrease it, since it may cause more work for the GC.  Also, include the
version and the pipeline configuration that you are using and the group may
have some suggestions.
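
Before tuning -Xmx it can help to confirm what the JVM actually received. A minimal sketch using only the standard Runtime API follows; the class name is made up for illustration, and the reported value is typically a little below the -Xmx figure.

```java
/** Sketch: print the maximum heap the running JVM will attempt to use. */
final class HeapCheck {

    /** Converts a byte count to GiB. */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap available to this JVM: %.2f GiB%n", toGiB(maxBytes));
    }
}
```

Running it with the same -Xmx option as the startup script (for example java -Xmx3g HeapCheck) shows whether the setting took effect.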
--Pei

On Fri, Mar 13, 2015 at 5:01 PM, Fawcett, Mitch mfawc...@christianacare.org
 wrote:

 Hi,

 I'm brand new to cTakes but I'm really excited about experimenting with it
 and developing a Proof of Value demo for my colleagues.

 I have a couple of questions.

 1)  I see that runctakesCVD.bat sets a maximum heap space of 3 gigabytes.
 Is that a number that can/should be increased to improve performance?

 2) Are there discussion threads archived somewhere where I can look for
 answers before asking questions?

 Thanks,

 Mitch Fawcett, MBA
 Senior Systems Analyst
 13 Read's Way, Suite 202
 New Castle, DE 19720
 Voice 302 327-5192
 mfawc...@christianacare.org








Re: Hello cTAKES Mailing List

2015-02-23 Thread Pei Chen
Raymond,
Probably a combination of the UMLS Consumer Health Vocabulary + Custom
Dictionary (as Sean described) may work for the use case:
OAC CHV connects informal, common words and phrases about health to
technical terms used by health care professionals. It includes jargon,
slang, ambiguous, and misspelled words as used by consumers and health care
professionals. Due to its nature, OAC CHV includes concepts that are not
represented by other source vocabularies within the Metathesaurus.

[1] http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/CHV/

On Sun, Feb 22, 2015 at 10:37 AM, Finan, Sean 
sean.fi...@childrens.harvard.edu wrote:

 Hi Raymond,

 If you use the dictionary-fast module there exists an entry "feeling bad"
 with cui 557911 and cui 231218.  There are also "feel bad" and "feeling bad
 emotionally".

 You will find "horrible present pain" but no other entry with "horrible".
 You will not find any terms with "awful" and probably many other desired
 words.  If you are really interested in slang ("crappy", "lousy", etc.) then
 they are definitely not present.

 What you can do is create a second dictionary.  There are example custom
 dictionaries in
 -dictionary-lookup-fast-res/src/main/resources/org/apache/ctakes/dictionary/lookup/fast/example/bsv/
 You should look at custom_cui_bsv.bsv if you want to specify term unique
 id codes and term text alone.  If you want to add tui/group codes then look
 at custom_cui_tui_bsv.bsv  - you will probably want to model your
 dictionary after this so that you can tag your terms with tuis for
 symptoms.
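
 To make the file format concrete, here is a minimal sketch that reads bar-separated rows. It assumes a CUI|TUI|text column layout and "//" comment lines; check custom_cui_tui_bsv.bsv for the actual layout. The CUIs below are invented placeholders (T184 is the UMLS semantic type for Sign or Symptom), and BsvTermReader is not a cTAKES class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: parse bar-separated dictionary rows (assumed layout: CUI|TUI|text). */
final class BsvTermReader {

    /** One dictionary row: concept id, semantic type id, and surface text. */
    static final class Term {
        final String cui;
        final String tui;
        final String text;

        Term(String cui, String tui, String text) {
            this.cui = cui;
            this.tui = tui;
            this.text = text;
        }
    }

    static List<Term> parse(List<String> lines) {
        List<Term> terms = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            // Skip blanks and (assumed) comment lines.
            if (line.isEmpty() || line.startsWith("//")) {
                continue;
            }
            String[] columns = line.split("\\|");
            if (columns.length < 3) {
                continue; // not a CUI|TUI|text row
            }
            terms.add(new Term(columns[0].trim(), columns[1].trim(), columns[2].trim()));
        }
        return terms;
    }

    public static void main(String[] args) {
        // Hypothetical slang entries tagged as symptoms.
        List<Term> terms = parse(Arrays.asList(
                "// hypothetical slang symptoms",
                "C9000001|T184|feeling crappy",
                "C9000002|T184|feeling lousy"));
        for (Term t : terms) {
            System.out.println(t.cui + " " + t.tui + " " + t.text);
        }
    }
}
```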

 You will want to imitate sections from the corresponding .xml file in that
 directory.   Make a copy of cTakesHsql.xml (two dirs up) and add lines:
   <dictionary>
      <name>CustomCuiRareWord</name>
      <implementationName>org.apache.ctakes.dictionary.lookup2.BsvRareWordDictionary</implementationName>
      <properties>
         <property key="bsvPath" value="org/apache/ctakes/dictionary/fast/example/custom_cui_tui_bsv.bsv"/>
      </properties>
   </dictionary>

 And

   <conceptFactory>
      <name>CustomCuiConcept</name>
      <implementationName>org.apache.ctakes.dictionary.lookup2.concept.BsvConceptFactory</implementationName>
      <properties>
         <property key="bsvPath" value="org/apache/ctakes/dictionary/fast/example/custom_cui_tui_bsv.bsv"/>
      </properties>
   </conceptFactory>

 And

   <dictionaryConceptPair>
      <name>CustomPair</name>
      <dictionaryName>CustomCuiRareWord</dictionaryName>
      <conceptFactoryName>CustomCuiConcept</conceptFactoryName>
   </dictionaryConceptPair>

 Then make sure that you point to your custom cTakesHsql.xml in
 dictionary-fast/desc/analysis_engine/UmlsLookupAnnotator.xml (or Overlap
 depending upon your use):

 <name>DictionaryDescriptorFile</name>
 <description/>
 <fileResourceSpecifier>
    <fileUrl>file:org/apache/ctakes/dictionary/lookup/fast/cTakesHsqlYourCopy.xml</fileUrl>
 </fileResourceSpecifier>

 You can also skip the UMLS dictionary altogether and just use your custom
 dictionary.

 If you do give this a try then let me know  how it goes.  If you need
 additional assistance let me know and I will help the best I can.

 Sean


 -Original Message-
 From: Raymond Li [mailto:ray...@bu.edu]
 Sent: Saturday, February 21, 2015 1:26 PM
 To: dev@ctakes.apache.org
 Subject: Hello cTAKES Mailing List

 Hello, my name is Raymond Li and I am currently working on a team
 project involving cTAKES. The goal of our project is to use cTAKES to
 analyze posts on social media (such as tweets, forum posts, and other
 publicly available data) in order to catch in real time any adverse effects
 of prescribed drugs and perform a public service by protecting people from
 harmful drugs.

 Aside from this introduction, I have only one question to ask before
 proceeding with this project: is cTAKES capable of understanding slang words
 as symptoms? For example, if I were to say "I took Crestor and feeling bad",
 is there a way for cTAKES to recognize that Crestor had a negative effect?
 My team has not been able to isolate 'bad' as a negative effect as it is
 not a defined medical symptom, but it would be nice to figure out whether
 such a solution exists, or whether we would need to develop our own solution
 and how we could go about doing it.

 My team and I would appreciate any comments or assistance regarding our
 project and this current issue. Thank you and have a nice day!

 --
 Sincerely,

 Raymond Li



Re: cTakes question

2015-01-21 Thread Pei Chen
[+dev]
Yu,
Yes, you can run it from the command line in many ways.
1) You can write a Java class that does it for you.  Similar to
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-examples/src/main/java/org/apache/ctakes/examples/pipelines/ExampleAggregatePipeline.java

2) Run the CPE (Collection Processing Engine) via command line.
I would suggest running it once via the CPE GUI first, then save the
configuration to be run via command line.
I would suggest using [CollectionReaderHere]
[AggregatePlaintextFastUMLSProcessor.xml] [CasConsumerHere] instead
of AggregateCdaUMLSProcessor
for your CPE configuration.

Example of running the CPE from command line: java copyyourargumentshere
org.apache.uima.examples.cpe.SimpleRunCPE path_to_your_cpe.xml




On Wed, Jan 21, 2015 at 1:25 PM, Liang, Yu yu.li...@nyumc.org wrote:

  Dear Pei,

  I have a question about cTakes. Our purpose is to use cTakes to
 identify UMLS concepts in our medical notes, about 4000 notes
 in total, all free text.

  In order to put cTakes into our final pipeline, we want to find a way to
 run the notes and get the output from the command line rather than the GUI.
 Is there any way we can do that? I am not a Java person, so it is very hard
 for me to experiment myself. I am going to use the “AggregateCdaUMLSProcessor”
 analysis engine. So if you could give me a clue, I would really appreciate it.
 I guess the command line will contain many parameters, like the .jar files to
 be run, right?
 I am very confused about where to find them and, once I find them, how to
 decide which to put on the command line in our case, for example when we use
 AggregateCDAUMLSProcessor.

  Thanks again.
 Hope to hear from you very soon.




  Yu Liang










Re: question about CTAKES

2014-12-17 Thread Pei Chen
[+dev]
I think that's a current limitation in the new Polarity Classifier.  It's
ML based, so most likely 'Deny XYZ' or 'Negative for XYZ' is probably not
in the training data.  There are a couple of things I would suggest:
1) Post the questions/examples to dev@ctakes.apache.org - perhaps others
may have some ideas
2) Open a Jira issue to track the issue
3) You can try to revert to either the old 'Assertion' module or the
previous RegEx-based 'NegationAnnotator' (currently commented out in the
xml descriptor file). [I am assuming you are using the latest trunk or the
3.2.1 release.]
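
To illustrate why a rule-based approach can catch phrasings such as 'Deny XYZ' that an ML classifier may miss, here is a minimal sketch of cue matching. The cue list and the class name are hypothetical and this is not the rule set of the cTAKES NegationAnnotator; the return values only mirror the polarity convention discussed in this thread (-1 for negated, 1 otherwise).

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Sketch: flag a concept as negated when a cue word precedes it in the sentence. */
final class NegationCueSketch {

    // Hypothetical cue list for illustration only.
    private static final Pattern CUE =
            Pattern.compile("\\b(no|not|without|den(y|ies|ied)|negative for)\\b");

    /** Returns -1 if any cue occurs before the concept in the sentence, else 1. */
    static int polarity(String sentence, String concept) {
        String text = sentence.toLowerCase(Locale.ROOT);
        int start = text.indexOf(concept.toLowerCase(Locale.ROOT));
        if (start < 0) {
            return 1; // concept not mentioned, nothing to negate
        }
        String before = text.substring(0, start);
        return CUE.matcher(before).find() ? -1 : 1;
    }

    public static void main(String[] args) {
        String[] examples = {
            "Negative for hepatitis", "Deny hepatitis.", "Denies hepatitis", "History of hepatitis"
        };
        for (String example : examples) {
            System.out.println(example + " -> " + polarity(example, "hepatitis"));
        }
    }
}
```

A real rule set would also bound the distance between cue and concept and handle cues that follow the concept; this sketch does neither.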

Hope that helps...
--Pei


On Wed, Dec 17, 2014 at 5:38 PM, Sisi Ma sophie.sisi...@gmail.com wrote:

  Hi Pei,
 I have a quick question about CTAKES.
 I am using AE “AggregatePlaintextUMLSProcessor.xml” and want to get some
 negation results by referring to polarity attribute.
 However, it turns out that, for example, “Negative for hepatitis” is not
 negated. I think it is weird. I tried “No hepatitis” and “Denies
 hepatitis”, which return “polarity= -1”, but “Deny hepatitis.” returns
 “polarity=1”.

 Could you give me some clue as to what is wrong?
 Thanks.



Re: UMLS Integration

2014-12-16 Thread Pei Chen
Praveen,
The error looks specific to UMLS metamorphosys rather than cTAKES.  I am
assuming you are trying to install UMLS locally from scratch rather than
using the bundled cTAKES resources.

Did you confirm that all of the files have been downloaded correctly per
Metamorphosys instructions:
http://www.nlm.nih.gov/research/umls/licensedcontent/umlsknowledgesources.html
 The error seems to be related to incomplete or corrupted zip files?
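
One way to test the corrupted-download theory is to open each downloaded archive and read every entry to the end. A minimal sketch using the standard java.util.zip API follows; the class name is made up, and the file paths are whatever you pass on the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Sketch: check that a zip archive can be opened and fully decompressed. */
final class ZipCheck {

    /** Reads every entry to the end; returns false if opening or reading fails. */
    static boolean isReadable(Path archive) {
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            byte[] buffer = new byte[8192];
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                try (InputStream in = zip.getInputStream(entries.nextElement())) {
                    while (in.read(buffer) != -1) {
                        // discard the data; we only care that reading succeeds
                    }
                }
            }
            return true;
        } catch (IOException e) { // ZipException is a subclass of IOException
            return false;
        }
    }

    public static void main(String[] args) {
        for (String arg : args) {
            Path path = Paths.get(arg);
            System.out.println(path + ": " + (isReadable(path) ? "OK" : "unreadable or corrupt"));
        }
    }
}
```

An archive that fails this check should be downloaded again before retrying MetamorphoSys.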



Pei Chen
Wired Informatics http://www.wiredinformatics.com
265 Franklin St Ste 1702
Boston, MA 02110
tel: (617) 433-7544
pei.c...@wiredinformatics.com


 -- Forwarded message --
 From: Jay_Ram pandupraveen...@gmail.com
 Date: Tue, Dec 16, 2014 at 12:07 AM
 Subject: UMLS Integration
 To: dev@ctakes.apache.org

 Hi All,

 I downloaded the UMLS resources to use them offline by loading them into
 mysql. I followed the instructions for loading the data into mysql, but I am
 unable to do it; it shows this error:
 
 Loading MetamorphoSys ...
 [Please be patient and wait for MetamorphoSys to begin]
 
 java.util.zip.ZipException: invalid LOC header (bad signature)
 at java.util.zip.ZipFile.read(Native Method)
 at java.util.zip.ZipFile.access$1400(Unknown Source)
 at java.util.zip.ZipFile$ZipFileInputStream.read(Unknown Source)
 at java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(Unknown
 Source)
 at java.util.zip.InflaterInputStream.read(Unknown Source)
 at java.util.zip.InflaterInputStream.read(Unknown Source)
 at java.util.zip.CheckedInputStream.read(Unknown Source)
 at java.util.zip.GZIPInputStream.readUByte(Unknown Source)
 at java.util.zip.GZIPInputStream.readUShort(Unknown Source)
 at java.util.zip.GZIPInputStream.readHeader(Unknown Source)
 at java.util.zip.GZIPInputStream.<init>(Unknown Source)
 at gov.nih.nlm.umls.meta.io.RRFMetadataInputStream.openSourceFile(RRFMetadataInputStream.java:390)
 at gov.nih.nlm.umls.meta.io.RRFConceptInputStream.open(RRFConceptInputStream.java:175)
 at gov.nih.nlm.umls.meta.io.RRFMetathesaurusInputStream.open(RRFMetathesaurusInputStream.java:125)
 at gov.nih.nlm.umls.mmsys.io.RRFMetamorphoSysInputStream.open(RRFMetamorphoSysInputStream.java:629)
 at gov.nih.nlm.umls.mmsys.subset.gui.MetamorphoSysGUI.validateGUIConfigurables(MetamorphoSysGUI.java:1097)
 at gov.nih.nlm.umls.mmsys.subset.gui.BeginSubsetAction.actionPerformed(BeginSubsetAction.java:110)
 at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
 at javax.swing.AbstractButton$Handler.actionPerformed(Unknown
 Source)
 at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown
 Source)
 at javax.swing.DefaultButtonModel.setPressed(Unknown Source)
 at javax.swing.AbstractButton.doClick(Unknown Source)
 at javax.swing.plaf.basic.BasicMenuItemUI.doClick(Unknown Source)
 at
 javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(Unknown
 Source)
 at java.awt.AWTEventMulticaster.mouseReleased(Unknown Source)
 at java.awt.Component.processMouseEvent(Unknown Source)
 at javax.swing.JComponent.processMouseEvent(Unknown Source)
 at java.awt.Component.processEvent(Unknown Source)
 at java.awt.Container.processEvent(Unknown Source)
 at java.awt.Component.dispatchEventImpl(Unknown Source)
 at java.awt.Container.dispatchEventImpl(Unknown Source)
 at java.awt.Component.dispatchEvent(Unknown Source)
 at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown
 Source)
 at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
 at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
 at java.awt.Container.dispatchEventImpl(Unknown Source)
 at java.awt.Window.dispatchEventImpl(Unknown Source)
 at java.awt.Component.dispatchEvent(Unknown Source)
 at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
 at java.awt.EventQueue.access$200(Unknown Source)
 at java.awt.EventQueue$3.run(Unknown Source)
 at java.awt.EventQueue$3.run(Unknown Source)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown
 Source)
 at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown
 Source)
 at java.awt.EventQueue$4.run(Unknown Source)
 at java.awt.EventQueue$4.run(Unknown Source)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown
 Source)
 at java.awt.EventQueue.dispatchEvent(Unknown Source)
 at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown
 Source

Re: revamping the Apache cTAKES website

2014-12-15 Thread Pei Chen
The template was borrowed from Spark... we should put in our own
design/css/layout/skin to suit our needs.  Perhaps Michelle or others
familiar with Bootstrap could help us out here?

On Mon, Dec 15, 2014 at 7:32 PM, jay vyas jayunit100.apa...@gmail.com
wrote:

 this is gorgeous! Thanks pei! i let the bigtop folks know as well!

 On Mon, Dec 15, 2014 at 6:21 PM, Murali mmin...@gmail.com wrote:
 
  Looks great. +1
 
 
 
   On Dec 15, 2014, at 4:29 PM, Chen, Pei pei.c...@childrens.harvard.edu
 
  wrote:
  
   Check out a mockup of a new website proposal:
   http://svn.apache.org/repos/asf/ctakes/site/new/index.html
   Based off bootstrap (Idea borrowed from the Spark folks..).
  
   Couple of key pieces of info:
   - 10% of visitors are on mobile/tablets
   - The most currently visited pages are: downloads.cgi,
  gettingstarted.html.  I suggest we focus our attention on those 2 items.
  (Putting a Downloads link right on the front page, etc.)
  
   svn co http://svn.apache.org/repos/asf/ctakes/site/new if you want to
  checkout the code of the site.
  
   --Pei
  
   -Original Message-
   From: John Green [mailto:john.travis.gr...@gmail.com]
   Sent: Friday, December 05, 2014 6:34 PM
   To: dev@ctakes.apache.org
   Cc: dev@ctakes.apache.org
   Subject: RE: revamping the Apache cTAKES website
  
   I would like to second the bootstrap recommendation, with the
 additional
  recommendation of django for the backend. It is an amazing platform for
  rapid development and easy updating.
  
  
   JG
   —
   Sent from Mailbox
  
   On Fri, Dec 5, 2014 at 12:15 PM, Savova, Guergana
   guergana.sav...@childrens.harvard.edu wrote:

   There are now 4 volunteers:
   Michelle Chen
   Pei Chen
   Sean Finan
   Guergana Savova
   --Guergana

   -Original Message-
   From: Savova, Guergana [mailto:guergana.sav...@childrens.harvard.edu]
   Sent: Friday, December 05, 2014 11:56 AM
   To: dev@ctakes.apache.org
   Subject: RE: revamping the Apache cTAKES website

   Wonderful, thank you, Michelle! There will be a flurry of emails the week
   of Dec 15 followed by actual work, so book your calendar if possible...
   --Guergana

   -Original Message-
   From: Michelle Chen [mailto:michelle1919c...@gmail.com]
   Sent: Friday, December 05, 2014 11:48 AM
   To: dev@ctakes.apache.org
   Subject: Re: revamping the Apache cTAKES website

   Hello Guergana,
   I don't know that much about cTakes, but would be interested in
   contributing to the effort.
   I'm not sure if there is an interest in matching the website design of
   other Apache projects, but it seems that the two main designs that are
   being used, from my arbitrary search on
   http://projects.apache.org/indexes/alpha.html, are 1. the current design
   that cTakes is using and 2. a Bootstrap approach.
   I've done a little bit of work on Bootstrap and would be interested in
   helping with that. Let me know how I can be helpful.
   Sincerely,
   Michelle Chen :)
   Be strong and of good courage; do not be afraid, nor be dismayed, for
   the Lord your God is with you wherever you go. ~Joshua 1:9

   On Fri, Dec 5, 2014 at 11:21 AM, Savova, Guergana
   guergana.sav...@childrens.harvard.edu wrote:

   cTAKES-ers,

   we would like to start working on updating the Apache cTAKES website
   - some of the information there is already stale and needs refreshing.
   Do you have ideas on website design, content, etc.? Would you like to
   contribute to the effort? We are planning to start working on the
   website the week of Dec 15.
  
   Cheers,
   --Guergana
  
  
 


 --
 jay vyas



Re: Question about running cTakes, urgent!

2014-12-12 Thread Pei Chen
Yu,
There should be an attribute within any IdentifiedAnnotation (or its
subclasses) called polarity.  It's -1 if it's negated.  For example:

   - polarity = -1
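
For anyone who is not a Java person and just wants the negated mentions out of the XML output, here is a minimal sketch that scans for polarity="-1". The sample document is a simplified stand-in with made-up element names, not the exact format cTAKES writes, and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch: list the text spans of all elements whose polarity attribute is -1. */
final class PolarityScan {

    /** Returns "begin-end" for every element carrying polarity="-1". */
    static List<String> negatedSpans(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList all = doc.getElementsByTagName("*"); // every element in the document
        List<String> spans = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            if ("-1".equals(element.getAttribute("polarity"))) {
                spans.add(element.getAttribute("begin") + "-" + element.getAttribute("end"));
            }
        }
        return spans;
    }

    public static void main(String[] args) throws Exception {
        // Simplified, hypothetical sample; real output has more attributes and namespaces.
        String sample = "<CAS>"
                + "<DiseaseDisorderMention begin=\"3\" end=\"12\" polarity=\"-1\"/>"
                + "<SignSymptomMention begin=\"20\" end=\"25\" polarity=\"1\"/>"
                + "</CAS>";
        System.out.println(negatedSpans(sample)); // prints [3-12]
    }
}
```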


On Fri, Dec 12, 2014 at 4:17 PM, Liang, Yu yu.li...@nyumc.org wrote:

  Last question, thanks for your patience.

  Here is the result of running AggregatePlaintextFastUMLSProcessor.xml on
 a real medical note. But I cannot find the negation result.


  Yu Liang

  CHIBI





  On Dec 12, 2014, at 3:59 PM, Pei Chen chen...@apache.org wrote:

  Yes, negation is handled by the new
 <node>PolarityCleartkAnalysisEngine</node>.
 Within IdentifiedAnnotation, there should be a polarity attribute that
 should be populated.

 On Fri, Dec 12, 2014 at 3:55 PM, Liang, Yu yu.li...@nyumc.org wrote:

 Thanks soo much!! Very Awesome ! I am not like a java person, so
 kind of totally lost. So I also see there includes NegationAnnotator,
 right? So don’t have to run NEcontext component?!
  Yu Liang

  CHIBI





   On Dec 12, 2014, at 3:51 PM, Pei Chen chen...@apache.org wrote:

Hi Yu,
 That is correct.  If you take a look at any of the  'Aggregate'
 examples,
 it should already have something like this defined in the xml flow:
 <fixedFlow>
  <node>SimpleSegmentAnnotator</node>
  <node>SentenceDetectorAnnotator</node>
  <node>TokenizerAnnotator</node>
  <node>LvgAnnotator</node>
  <node>ContextDependentTokenizerAnnotator</node>
  <node>POSTagger</node>
  <!-- <node>ClearPOSTagger</node> -->
  <node>Chunker</node>
  <node>AdjustNounPhraseToIncludeFollowingNP</node>
  <node>AdjustNounPhraseToIncludeFollowingPPNP</node>
  <!-- <node>LookupWindowAnnotator</node> -->
  <node>DictionaryLookupAnnotatorDB</node>
  <node>DependencyParser</node>
  <node>SemanticRoleLabeler</node>
  <node>ConstituencyParser</node>
  <!-- <node>AssertionAnnotator</node> -->
  <!-- <node>StatusAnnotator</node> -->
  <!-- <node>NegationAnnotator</node> -->
  <node>GenericCleartkAnalysisEngine</node>
  <node>HistoryCleartkAnalysisEngine</node>
  <node>PolarityCleartkAnalysisEngine</node>
  <node>SubjectCleartkAnalysisEngine</node>
  <node>UncertaintyCleartkAnalysisEngine</node>
  <node>ExtractionPrepAnnotator</node>
 </fixedFlow>

  You should see the results in your CVD, look for
 **.IdentifiedAnnotation (Those are the Named Entities + Attributes that
 have been extracted and normalized to an UMLS CUI.)

 On Fri, Dec 12, 2014 at 3:45 PM, Liang, Yu yu.li...@nyumc.org wrote:

 Thanks for your quick reply. So I am not quite sure if I understand
 correctly, that is , if I ONLY run AggregatePlaintextFastUMLSProcessor.xml,
 I don’t need to run the  Segment, Sentence, Tokenizer, Chunker, dictionary
 lookup in sequency?
 Yu Liang

  CHIBI





  On Dec 12, 2014, at 3:40 PM, Pei Chen chen...@apache.org wrote:

   [+Dev]
 Yu,
 It's great that you have the AggregatePlaintextFastUMLSProcessor.xml
 running.  I presume it's returning IdentifiedAnnotations for you.

  Long answer:
 The error "JCas type xyz is used in Java Code but not declared in XML"
 is caused by the fact that the type system is not imported.  Normally, this
 could be fixed by adding in your primitive xml descriptor:
  <typeSystemDescription>
    <imports>
      <import name="org.apache.ctakes.typesystem.types.TypeSystem"/>
    </imports>
  </typeSystemDescription>
  But it should be already added in the SimpleSegmentAnnotator which is
 part of the Aggregate examples.
 The Dictionary lookup annotatorUMLS.xml was not intended to be used by
 itself because it requires the other components such as Segment, Sentence,
 Tokenizer, Chunker, etc. to work properly[1].

  [1]
 https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+Component+Use+Guide#cTAKES3.2ComponentUseGuide-ComponentDependencies


 On Fri, Dec 12, 2014 at 3:26 PM, Liang, Yu yu.li...@nyumc.org wrote:

 I do have them, and I added them to the code where the substitution is
 needed. Loading is fine, and AggregatePlaintextFastUMLSProcessor.xml runs
 with no error message.
 But when I try to run another AE, like “dictionarylookupannotatorUMS.xml”,
 the error is as in the previous email. Do you have any idea?
 Also, I tried to find the log file to check the detailed error info.
 It is weird that when I click “view log” from the pull-down menu, it says:
 “no   /Users/yu/uima.log”.

  Yu Liang

  CHIBI





  On Dec 12, 2014, at 2:52 PM, Pei Chen chen...@apache.org wrote:

  Yu,
 Do you have a UMLS username and password so you can use the UMLS
 resources/dictionaries? (You can request a free one here:
 https://uts.nlm.nih.gov//license.html)

  I would suggest you use the AggregatePlaintextFastUMLSProcessor.xml


 On Fri, Dec 12, 2014 at 2:48 PM, Liang, Yu yu.li...@nyumc.org wrote:

  Hi,

  I have a problem using cTAKES. Our goal is to annotate
 medical notes using cTAKES with the built-in UMLS dictionary. Running
 AggregatePlaintextProcessor.xml works fine.
 But when I try to run other analysis engines, I always get an error
 similar to the one below. Could you help me out? I would appreciate it!
 This is one error message after loading and running the AE called

[VOTE] Release Apache cTAKES 3.2.1 (rc2)

2014-12-01 Thread Pei Chen
This is a call for a vote on releasing the following candidate (rc2) as
Apache cTAKES 3.2.1.

The major changes include:

- New optional Temporal component (Time + Event Relationships models now
available)

- Other bug fixes/enhancements from Jira

I manually downloaded the bin as well as resources and tried the CVD with
the AggregatePlaintextFastUMLSProcessor.xml and
AggregatePlaintextUMLSProcessor.xml.

Would be great if folks have time to test/verify especially if you opened
any of the Jira's below to ensure the bugs have been fixed/integrated.



For more detailed information on the changes/release notes, please visit:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621&version=12326778

The release was made using the cTAKES release process documented here:
http://ctakes.apache.org/ctakes-release-guide.html

The candidate is available at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.1-rc2/apache-ctakes-3.2.1-src.tar.gz
/.zip

The tag to be voted on:
http://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.2.1-rc2

The MD5 checksum of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.1-rc2/apache-ctakes-3.2.1-src.tar.gz.md5

/.zip.md5

The signature of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.1-rc2/apache-ctakes-3.2.1-src.tar.gz.asc
/.zip.asc

Apache cTAKES' KEYS file, containing the PGP keys used to sign the release:
https://dist.apache.org/repos/dist/release/ctakes/KEYS
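
For anyone testing the candidate, here is a minimal sketch of how the MD5 check could be done from Java. It is not part of the cTAKES release tooling; the class name Md5Check and its methods are made up for illustration. It assumes the published .md5 file contains the hex digest somewhere in its text, possibly grouped with spaces or prefixed by a file name. Note that a checksum only detects a corrupted download; the .asc signature and the KEYS file are what tie the artifact to the release manager.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper, not part of cTAKES.
public class Md5Check {

    /** Returns the lowercase hex MD5 digest of everything read from the stream. */
    static String md5Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * True if the published checksum text contains the computed digest.
     * Whitespace and letter case are ignored, because .md5 files come in
     * several layouts (assumption: the digest appears as hex somewhere in the text).
     */
    static boolean matches(String computedHex, String publishedText) {
        String normalized = publishedText.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        return normalized.contains(computedHex.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java Md5Check <artifact> <artifact.md5>");
            return;
        }
        Path artifact = Paths.get(args[0]);
        Path checksumFile = Paths.get(args[1]);
        String computed;
        try (InputStream in = Files.newInputStream(artifact)) {
            computed = md5Hex(in);
        }
        String published = new String(Files.readAllBytes(checksumFile), StandardCharsets.UTF_8);
        System.out.println(matches(computed, published) ? "MD5 OK" : "MD5 MISMATCH");
    }
}
```

Usage would be along the lines of: java Md5Check apache-ctakes-3.2.1-src.tar.gz apache-ctakes-3.2.1-src.tar.gz.md5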

Please vote on releasing these packages as Apache cTAKES 3.2.1. The vote is
open for at least the next 72 hours.

The vote passes if at least three binding +1 votes are cast.
[ ] +1 Release the packages as Apache cTAKES 3.2.1
[ ] -1 Do not release the packages because...

Also, the convenience binary can be found at:
https://dist.apache.org/repos/dist/dev/ctakes/ctakes-3.2.1-rc2/apache-ctakes-3.2.1-bin.tar.gz.md5
/.zip

Thanks!


Re: UMLS validation url

2014-11-24 Thread Pei Chen
Kim,
I'm not sure, but I also noticed it while testing 3.2.1-rc1 earlier today;
one can always check the revision history.  In either case, a simple
search and replace of:

https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser

with:

https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser

will do the trick.
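
If you have several descriptor or conf files to patch, a small throwaway program can do the substitution. The following is only a sketch: the class name UmlsUrlFix is hypothetical, the two URLs are the broken and corrected values quoted in this thread, and the files to rewrite are whatever paths you pass on the command line (assumed to be UTF-8 text).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical one-off helper, not part of cTAKES.
public class UmlsUrlFix {

    // Broken and corrected validation URLs as quoted in this thread.
    static final String BROKEN_URL = "https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser";
    static final String FIXED_URL = "https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser";

    /** Replaces every occurrence of the broken URL; text without it is returned unchanged. */
    static String fix(String descriptorText) {
        return descriptorText.replace(BROKEN_URL, FIXED_URL);
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a file to rewrite in place (assumed UTF-8).
        for (String arg : args) {
            Path file = Paths.get(arg);
            String original = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            String updated = fix(original);
            if (!updated.equals(original)) {
                Files.write(file, updated.getBytes(StandardCharsets.UTF_8));
                System.out.println("Updated " + file);
            }
        }
    }
}
```

Running it twice is harmless, since the corrected URL does not contain the broken one.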


Pei Chen
Wired Informatics http://www.wiredinformatics.com
265 Franklin St Ste 1702
Boston, MA 02110
tel: (617) 433-7544
pei.c...@wiredinformatics.com

On Mon, Nov 24, 2014 at 3:12 PM, Pei Chen chen...@apache.org wrote:


 -- Forwarded message --
 From: Kim Ebert kim.eb...@perfectsearchcorp.com
 Date: Mon, Nov 24, 2014 at 1:46 PM
 Subject: Re: UMLS validation url
 To: dev@ctakes.apache.org


  Hi Pei,

 Was this a recent change? I know that some of the other conf files
 referenced the url
 https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser as well.


  IMAT Solutions http://imatsolutions.com
  Kim Ebert
 Software Engineer
 Office: 801.669.7342
 kim.eb...@imatsolutions.com greg.hub...@imatsolutions.com
  On 11/24/2014 11:30 AM, Chen, Pei wrote:

  That’s a typo in the fast dictionary lookup.

 It should be: https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser



 Jira raised for this: https://issues.apache.org/jira/browse/CTAKES-335





 *From:* Kim Ebert [mailto:kim.eb...@imatsolutions.com
 kim.eb...@imatsolutions.com]
 *Sent:* Monday, November 24, 2014 1:28 PM
 *To:* dev@ctakes.apache.org
 *Subject:* UMLS validation url



 Hi All,

 Today I noticed that
 https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser is returning
 404 messages. Anyone else running into the same problem?

 Thanks,

 --

 IMAT Solutions http://imatsolutions.com

 *Kim Ebert*
 Software Engineer
 Office: 801.669.7342
 kim.eb...@imatsolutions.com greg.hub...@imatsolutions.com






Re: Chest pain absent. - polarity

2014-11-15 Thread Pei Chen
Petr,
Which version of cTAKES are you using?  3.2.0 or latest 3.2.1-rc1/trunk?
Both default to a machine-learning-based polarity algorithm.  If a case is
missed, adding more training examples is probably the way to go.
The latest one uses ClearTK and was trained with different features and
training data, so I would be curious to see if that one picks up your
examples.

On Sat, Nov 15, 2014 at 11:04 AM, Petr Zalesky pzale...@inferscience.com
wrote:

 I have been investigating how polarity on a sign/symptom gets set and ran
 into an interesting issue. If a physician's note in a history of present
 illness (HPI) says something like:

 “Absence of chest pain.”

 “Denied chest pain.”

 “Chest pain resolved.”

 Then cTAKES picks up the term chest pain, assigns it the correct SNOMED
 codes and sets the polarity to -1.  However, some of the de-identified
 samples say:

 Chest pain absent.

 This is also picked up by cTAKES, but here the polarity
 is set to positive one (1).  I have been trying to figure out if there is a
 way to configure cTAKES to detect that.  Any suggestions?



Re: CTakes on github.

2014-10-30 Thread Pei Chen
Sounds good.
Jay,
Barring any objections from the group, would you mind opening a Jira with
INFRA to set that up (read only git mirror) for cTAKES?

--Pei

On Thu, Oct 30, 2014 at 12:40 PM, jay vyas jayunit100.apa...@gmail.com
wrote:

 Hi Pei : I Agree with (A) - the hybrid approach, so anyone can use both, or
 and git.apache.org has working github mirroring.



Next cTAKES release 3.2.1 - Creating a Release Candidate

2014-10-23 Thread Pei Chen
There are a lot of good fixes and new enhancements currently in trunk.
- Includes new Temporal Relations models (e.g., Event relationships are
available now; previously, only Event/Time entity discovery models were
included.)
- Plus a ton of bug fixes tracked in Jira

I can volunteer to be RM again to push out this release...

Let me know if there are any objections, otherwise, I was planning to
create a branch and release candidate some time next week.  Please update
Jira/check in pending code if you want it to be included in the 3.2.1-rc...

--Pei


upcoming cTAKES meetup - Boston...

2014-10-23 Thread Pei Chen
Next Friday (Halloween) - feel free to drop by if you're in the area!

Lunch/drinks provided..Please RSVP via
http://www.meetup.com/cTAKES/events/208836282/

--Pei


Apache cTAKES 3.2.1 release preparation

2014-09-26 Thread Pei Chen
There is a 3.2.1 release slated for the end of October.  The major changes are:
uimaFIT 2.1 upgrade, ClearTK upgrade, and new temporal relations models.
Below is a summary of what was scheduled to go in (some may still be
unresolved).
Feel free to edit/update Jira if you believe something should be
included/omitted in preparation...

  Sub-task

   - [CTAKES-124 https://issues.apache.org/jira/browse/CTAKES-124] -
   remove internal UIMA types from coreference
   - [CTAKES-312 https://issues.apache.org/jira/browse/CTAKES-312] -
   upgrade uimafit

Bug

   - [CTAKES-76 https://issues.apache.org/jira/browse/CTAKES-76] - get
   third party dependencies into Maven Central
   - [CTAKES-155 https://issues.apache.org/jira/browse/CTAKES-155] -
   SimpleSegmentWithTagsAnnotator assumes all section names are 5 characters
   - [CTAKES-162 https://issues.apache.org/jira/browse/CTAKES-162] -
   Command line scripts leave the user back one directory
   - [CTAKES-169 https://issues.apache.org/jira/browse/CTAKES-169] -
   SectionSegmentAnnotator.java is in core, but the sample
   SectionSegmentAnnotator.xml descriptor is in ctakes-clinical-pipeline
   - [CTAKES-178 https://issues.apache.org/jira/browse/CTAKES-178] -
   parsing of medication strength does not verify a number was discovered
   (strength value includes both the dosage and strength value in some cases)
   - [CTAKES-213 https://issues.apache.org/jira/browse/CTAKES-213] -
   ModifierExtractorAnnotator should produce XxxxModifier subtypes
   - [CTAKES-241 https://issues.apache.org/jira/browse/CTAKES-241] -
   NullPointerException in ctakes-assertion
   - [CTAKES-275 https://issues.apache.org/jira/browse/CTAKES-275] - some
   of the older junit tests don't have the right Project name in the run
   configurations
   - [CTAKES-280 https://issues.apache.org/jira/browse/CTAKES-280] -
   upgrade to cleartk-2.*
   - [CTAKES-285 https://issues.apache.org/jira/browse/CTAKES-285] -
   cleartk-ml-liblinear needs to be added to the dependencies
   - [CTAKES-302 https://issues.apache.org/jira/browse/CTAKES-302] -
   Element type hibernate-mapping must be followed by either attribute
   specifications,  or /.
   - [CTAKES-307 https://issues.apache.org/jira/browse/CTAKES-307] - URI
   is not hierarchical when running mvn install
   - [CTAKES-309 https://issues.apache.org/jira/browse/CTAKES-309] - Add
   SNOMEDCT_US to ytext db scripts
   - [CTAKES-310 https://issues.apache.org/jira/browse/CTAKES-310] -
   Dictionary lookup permutations sort issue
   - [CTAKES-311 https://issues.apache.org/jira/browse/CTAKES-311] -
   v_document_cui_sent View returns no results in cTAKES-YTEX

Improvement

   - [CTAKES-77 https://issues.apache.org/jira/browse/CTAKES-77] - Update
   POSTagger Unit Tests
   - [CTAKES-78 https://issues.apache.org/jira/browse/CTAKES-78] - Update
   Chunker unit tests
   - [CTAKES-94 https://issues.apache.org/jira/browse/CTAKES-94] -
   refactoring assertion module to use a cleartk-based analysis engine (and
   include evaluation)
   - [CTAKES-122 https://issues.apache.org/jira/browse/CTAKES-122] -
   include LVG with a future version of cTAKES?
   - [CTAKES-172 https://issues.apache.org/jira/browse/CTAKES-172] -
   relation-extractor is using StatusAnnotator and NegationAnnotator instead
   of AssertionAnnotator
   - [CTAKES-222 https://issues.apache.org/jira/browse/CTAKES-222] -
   FirstTokenPermLookupInitializerImpl to suppot arraylist of
   DictionaryLookupWindows
   - [CTAKES-225 https://issues.apache.org/jira/browse/CTAKES-225] -
   Common Type System - Add field to save preferredText in Segment
   - [CTAKES-295 https://issues.apache.org/jira/browse/CTAKES-295] - Use
   UIMAFit-style configuration annotations

Task

   - [CTAKES-74 https://issues.apache.org/jira/browse/CTAKES-74] -
   Tokenizer PennTreeBank breaks with certain apostrophes in tokens.
   - [CTAKES-138 https://issues.apache.org/jira/browse/CTAKES-138] -
   Remove 3rd party jars from our SVN
   - [CTAKES-232 https://issues.apache.org/jira/browse/CTAKES-232] -
   change concept type
   - [CTAKES-315 https://issues.apache.org/jira/browse/CTAKES-315] -
   Update Default UMLS pipeline to use dictionary-lookup-fast


Re: Boston cTAKES Meetup

2014-09-23 Thread Pei Chen
Jay,

CTAKES-314 https://issues.apache.org/jira/browse/CTAKES-314 -
BigTop/Hadoop cTAKES integration has been created.  Feel free to
create/add/edit an uber or a children...
Discuss thread on dev@ :
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201409.mbox/%3ccapqz87q09cq_kt+4woqki7dpc5qre6h4y3eq9ukoykh5pnz...@mail.gmail.com%3e

On Tue, Sep 23, 2014 at 7:41 AM, Jay Vyas jayunit100.apa...@gmail.com
wrote:

 Shall we create an umbrella jira ?

  On Sep 23, 2014, at 6:26 AM, Prakash Poudyal prakashpoud...@gmail.com
 wrote:
 
  It would be great if you could broadcast the gathering, talk, etc. I wish
  I could be there, but it is very hard, if not impossible, for me to be
  there.
 
  Prakash Poudyal
  Portugal
 
  On Tue, Sep 23, 2014 at 3:31 AM, Tim O'Connell tim.oconn...@gmail.com
  wrote:
 
  thanks Pei.
 
  On Mon, Sep 22, 2014 at 7:17 PM, Pei Chen chen...@apache.org wrote:
 
  the meetup formats are usually casual/informal, but I'll check to see
  if that's possible.  will post it up if it's available.
 
  On Mon, Sep 22, 2014 at 5:42 PM, Tim O'Connell tim.oconn...@gmail.com
 
  wrote:
  Hi Folks,
 
  Any idea if we can set up a WebEx for those of us who can't attend
  (I'm in Vancouver)?
 
  Best,
  Tim
 
  On Mon, Sep 22, 2014 at 2:40 PM, John Green 
  hephaestus.stu...@gmail.com
  wrote:
 
  Will this be recorded?
  —
  Sent from Mailbox https://www.dropbox.com/mailbox
 
 
  On Mon, Sep 22, 2014 at 4:30 PM, Pei Chen chen...@apache.org
 wrote:
 
  Please feel free to join the Boston Meet up group:
 
  Upcoming Free Event:
  http://www.meetup.com/cTAKES/events/208836282/
  (If possible, please feel free to RSVP so we can get an approx
  headcount)
 
  Feel free to chime in if you have anything specific that may be of
  interest to you:
  ex: cTAKES intro, cTAKES and BigTop/Hadoop. But open to the
 community
  if anyone has anything they would like to show/share, news, looking
  for a job?, has a job opening, etc.
 
  --Pei
 
 
  --
 
  Regards
  Prakash Poudyal



Boston cTAKES Meetup

2014-09-22 Thread Pei Chen
Please feel free to join the Boston Meet up group:

Upcoming Free Event:
http://www.meetup.com/cTAKES/events/208836282/
(If possible, please feel free to RSVP so we can get an approx headcount)

Feel free to chime in if you have anything specific that may be of
interest to you:
ex: cTAKES intro, cTAKES and BigTop/Hadoop.  But open to the community
if anyone has anything they would like to show/share, news, looking
for a job?, has a job opening, etc.

--Pei


Re: Boston cTAKES Meetup

2014-09-22 Thread Pei Chen
the meetup formats are usually casual/informal, but I'll check to see
if that's possible.  will post it up if it's available.

On Mon, Sep 22, 2014 at 5:42 PM, Tim O'Connell tim.oconn...@gmail.com wrote:
 Hi Folks,

 Any idea if we can set up a WebEx for those of us who can't attend (I'm in
 Vancouver)?

 Best,
 Tim

 On Mon, Sep 22, 2014 at 2:40 PM, John Green hephaestus.stu...@gmail.com
 wrote:

 Will this be recorded?
 —
 Sent from Mailbox https://www.dropbox.com/mailbox


 On Mon, Sep 22, 2014 at 4:30 PM, Pei Chen chen...@apache.org wrote:

 Please feel free to join the Boston Meet up group:

 Upcoming Free Event:
 http://www.meetup.com/cTAKES/events/208836282/
 (If possible, please feel free to RSVP so we can get an approx headcount)

 Feel free to chime in if you have anything specific that may be of
 interest to you:
 ex: cTAKES intro, cTAKES and BigTop/Hadoop. But open to the community
 if anyone has anything they would like to show/share, news, looking
 for a job?, has a job opening, etc.

 --Pei





Re: Ctakes to process 5000K records

2014-09-09 Thread Pei Chen
Nick,
When you say no medication is being annotated, I presume you mean the
medication attributes (i.e. dosage, frequency, etc.) are not being
annotated?  I think the DrugNER needs a list of section names in its
config; I think it includes SIMPLE_SEGMENT.  I am very surprised that
SimpleSegmentAnnotator is the bottleneck though; all it does is
assume the entire document is a single section called SIMPLE_SEGMENT.
Have you tried commenting out the DependencyParser if you're not using
those features?

--Pei


On Tue, Sep 9, 2014 at 2:45 PM, Nick Nikandish
snika...@emerginghealthit.com wrote:

 Hi there,

 I am using cTAKES to process 5000K free-text records, where each record has
 several medications.
 This is the fixed flow that it goes through:

 <node>SimpleSegmentAnnotator</node>
 <node>SentenceDetectorAnnotator</node>
 <node>TokenizerAnnotator</node>
 <node>LvgAnnotator</node>
 <node>ContextDependentTokenizerAnnotator</node>
 <node>POSTagger</node>
 <node>Chunker</node>
 <node>LookupWindowAnnotator</node>
 <node>DictionaryLookupAnnotatorDB</node>
 <node>DependencyParser</node>
 <node>AssertionAnnotator</node>
 <node>ExtractionPrepAnnotator</node>

 But it takes a very long time to process that much data (maybe a week or
 so) when I use SimpleSegmentAnnotator.  Eliminating SimpleSegmentAnnotator
 makes the process very fast, but then no medication is annotated.  Do you
 have any suggestions?

 Thanks,
 Nick



Re: Permutations

2014-09-05 Thread Pei Chen
Hi Kim,
Thanks for pointing that out.
https://issues.apache.org/jira/browse/CTAKES-310 has been opened for
this.
If you commit the changes, we can see if we can include in the 3.2.1
patch release.
I was looking at the changelist for this file, and it may look like
some of these optimizations may have been intentional by Sean so he
may have some more insight in this bit of the logic.

On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert
kim.eb...@perfectsearchcorp.com wrote:
 Hi All,

 I was reviewing the use of permutations, and I noticed that we sorted
 the permutation list before creating the string to do the concept lookup
 with. It also appears that we were sorting the object that was stored in
 the parent list.

 I've made a few changes, and now it appears I can discover some
 additional concepts based upon the permutations.

 Let me know what you think of the following changes.

 Thanks,

 Kim

 === modified file
 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java'
 ---
 ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
 2014-07-31 22:00:48 +
 +++
 ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
 2014-09-04 18:39:59 +
 @@ -210,11 +210,12 @@
 final List<List<Integer>> permutationList = iv_permCacheMap.get( permutationIndex );
 for ( List<Integer> permutations : permutationList ) {
    // Moved sort and offset calculation from inner (per MetaDataHit) iteration 2-21-2013 spf
 - Collections.sort( permutations );
 + List<Integer> permutationsSorted = (List) ((ArrayList)permutations).clone();
 + Collections.sort( permutationsSorted );
   int startOffset = firstWordStartOffset;
   int endOffset = firstWordEndOffset;
 - if ( !permutations.isEmpty() ) {
 -int firstIdx = permutations.get( 0 );
 + if ( !permutationsSorted.isEmpty() ) {
 +int firstIdx = permutationsSorted.get( 0 );
  if ( firstIdx = firstTokenIndex ) {
 firstIdx--;
  }
 @@ -222,7 +223,7 @@
  if ( firstToken.getStartOffset()  firstWordStartOffset ) {
 startOffset = firstToken.getStartOffset();
  }
 -int lastIdx = permutations.get( permutations.size() - 1 );
 +int lastIdx = permutationsSorted.get(
 permutationsSorted.size() - 1 );
  if ( lastIdx = firstTokenIndex ) {
 lastIdx--;
  }


 --
 Kim Ebert
 1.801.669.7342
 Perfect Search Corp
 http://www.perfectsearchcorp.com/
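
The core of the change discussed above is that Collections.sort reorders the list it is given, so sorting a cached permutation in place destroys the ordering that the lookup string is later built from. A minimal standalone sketch of the idea follows; it is not the cTAKES code, and the class and method names are invented for illustration. It uses the ArrayList copy constructor instead of the raw clone() cast from the patch, which is an equivalent way to get an independent copy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only; names are hypothetical and unrelated to FirstTokenPermutationImpl.
public class SortedCopyDemo {

    /** Returns a sorted copy and leaves the caller's list in its original order. */
    static List<Integer> sortedCopy(List<Integer> permutation) {
        List<Integer> copy = new ArrayList<>(permutation);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // A permutation of token indices whose order matters for the lookup string.
        List<Integer> permutation = new ArrayList<>(Arrays.asList(3, 1, 2));
        List<Integer> sorted = sortedCopy(permutation);

        // The sorted copy gives the smallest and largest index for offset calculation.
        System.out.println("first index: " + sorted.get(0));
        System.out.println("last index:  " + sorted.get(sorted.size() - 1));
        // The original permutation order is still available for building the lookup text.
        System.out.println("permutation: " + permutation);
    }
}
```

The copy costs one small allocation per permutation, which is the trade-off against the in-place sort mentioned in the reply.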



Re: MedicationMention and new Mention

2014-09-03 Thread Pei Chen
Harpreet,
MedicationMention attributes such as
.medicationfrequency
.medicationDosage
Can be filled via the DrugMentionAnnotator [1].  If I recall
correctly, I believe you can just add that annotator after the
DictionaryLookup in your pipeline.

[1] 
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-drug-ner/desc/analysis_engine/DrugMentionAnnotator.xml


On Wed, Sep 3, 2014 at 3:47 PM, Harpreet Khanduja hsk5...@rit.edu wrote:
 Hello,
Hope everyone is doing great.
I would appreciate some help in these two questions.

   1. How would one go about finding values for the attributes
 medicationfrequency, medicationAllergy, medicationDosage, and others
 present in MedicationMention, and adding them to the end result of the
 cTAKES pipeline?


   2. Is there a way to create a new Mention like LabMention or
 MedicationMention?

   Any help would be really appreciated.
   Thank you.

 Regards,
 Harpreet


Re: managing ctakes resources on classpath

2014-08-26 Thread Pei Chen
I'm not too privy to the ytex config details, but yes you're right,
it's caused by the xdl.xsd being null.  However it looks like it
exists in ytex-res.jar but the call being made uses Class.getResource
which won't be able to read in from the jar as an InputStream.
1) We can make ytex read in resources directly from jars (as maven
central artifacts).   We can make AppJdl.class.getResourceAsStream()
instead of getResource().  However, are there any other local physical
File dependencies?
2) Alternatively, we can add a step to have maven unpack res.jar if required.
I think 1 would be nice, but not sure how involved it will be.
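
To illustrate option 1, here is a rough sketch of reading a classpath resource as a stream, which works whether the resource sits in a directory or inside a jar. It is not the ytex code: the class name ClasspathResourceDemo and the read helper are hypothetical, and the demo reads the class file of java.lang.Object only because that resource exists on every JVM. In ytex the equivalent call would presumably be made on AppJdl.class with the path to xdl.xsd.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration, not part of ctakes-ytex.
public class ClasspathResourceDemo {

    /**
     * Reads a resource relative to the given class via getResourceAsStream.
     * Unlike turning getResource() into a java.io.File, this does not require
     * the resource to exist as a physical file on disk.
     */
    static byte[] read(Class<?> anchor, String resourceName) throws IOException {
        try (InputStream in = anchor.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException("Resource not found on classpath: " + resourceName);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in resource: Object.class is packaged inside the JDK, not a loose file.
        byte[] bytes = read(Object.class, "Object.class");
        System.out.println("Read " + bytes.length + " bytes from a packaged resource");
    }
}
```

Any remaining code that insists on a java.io.File would still need the unpack step from option 2.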

Caused by: java.io.FileNotFoundException:
/Users/pei/workspace/apache-ctakes/trunk/ctakes-ytex/file:/Users/pei/workspace/apache-ctakes/trunk/ctakes-ytex-res/target/ctakes-ytex-res-3.2.1-SNAPSHOT.jar!/org/apache/ctakes/jdl/xdl.xsd
(No such file or directory)

at java.io.FileInputStream.open(Native Method)

at java.io.FileInputStream.<init>(FileInputStream.java:146)

at java.io.FileInputStream.<init>(FileInputStream.java:101)

  at 
sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)

Anyhow- https://issues.apache.org/jira/browse/CTAKES-308 opened to track this

On Tue, Aug 26, 2014 at 3:01 AM, vijay garla vnga...@gmail.com wrote:
 Hi

 The test that is failing has nothing to do with the MRCONSO not found
 warning.

 ValidationTest failed because it couldn't find the XSD.

 The XSD is in the ctakes-ytex-resources, but the corresponding maven
 artifact is an empty jar.

 I think it would be best to modify the resource jars to actually contain
 resources.


cTAKES min requirements

2014-08-25 Thread Pei Chen
Since we default the runtime Java heap size to 3g in 3.2.0, should we
update our documentation to officially support only 64-bit?  I can only
see the models/pipelines being loaded into memory growing in size.  I know it
may seem trivial, but I still know a few unfortunate souls on 32-bit
systems. Any objections to saying that 32-bit will essentially no
longer be supported?
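
For anyone unsure whether their setup can give cTAKES the 3g default, a quick check of what the running JVM reports might look like the sketch below. The class name HeapCheck is made up, and the 3 GiB threshold is simply the 3g default mentioned above. Treat the result as approximate: Runtime.maxMemory() can report somewhat less than the -Xmx value, depending on the garbage collector.

```java
// Hypothetical diagnostic, not part of cTAKES.
public class HeapCheck {

    // 3g, the default runtime heap size mentioned in this thread.
    static final long REQUIRED_HEAP_BYTES = 3L * 1024 * 1024 * 1024;

    /** True if the given maximum heap size reaches the 3g threshold. */
    static boolean heapIsSufficient(long maxHeapBytes) {
        return maxHeapBytes >= REQUIRED_HEAP_BYTES;
    }

    public static void main(String[] args) {
        // maxMemory() is the most the JVM will try to use; it may be a bit below -Xmx.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("os.arch:        " + System.getProperty("os.arch"));
        System.out.println("max heap (MiB): " + maxHeap / (1024 * 1024));
        System.out.println("reaches 3g:     " + heapIsSufficient(maxHeap));
    }
}
```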


--Pei


org.apache.ctakes.ytex.umls.dao.UMLSDaoTest

2014-08-25 Thread Pei Chen
Hi VJ,
While on the subject of unit tests-

I didn't get a chance to dig deeper and was hoping you would know the
cause of this unit test failure:  mvn clean install

2014-08-25 13:33:50,830 WARN  net.sf.ehcache.CacheManager  - Creating
a new instance of CacheManager using the diskStorePath
/var/folders/qc/d7xd4zzs0_xcybv88skt5_7mgn/T/ which is already
used by an existing CacheManager.

The source of the configuration was
net.sf.ehcache.config.generator.ConfigurationSource$InputStreamConfigurationSource@7433a719.

The diskStore path for this CacheManager will be set to
/var/folders/qc/d7xd4zzs0_xcybv88skt5_7mgn/T//ehcache_auto_created_1408988030830.

To avoid this warning consider using the CacheManager factory methods
to create a singleton CacheManager or specifying a separate ehcache
configuration (ehcache.xml) for each CacheManager instance.

2014-08-25 13:33:51,082 WARN
org.hibernate.engine.jdbc.spi.SqlExceptionHelper  - SQL Error: 62,
SQLState: S0010

2014-08-25 13:33:51,082 ERROR
org.hibernate.engine.jdbc.spi.SqlExceptionHelper  - Unknown JDBC
escape sequence: {{db.schema}.MRCONSO mrconso0_ where mrconso0_.aui?
and length(mrconso0_.aui)0 and length(mrconso0_.str)200 and
mrconso0_.lat='ENG' order by mrconso0_.aui

2014-08-25 13:33:51,085 WARN
org.apache.ctakes.ytex.umls.dao.UMLSDaoTest  - sql exception - mrconso
probably doesn't exist, check error

org.hibernate.exception.SQLGrammarException: could not prepare statement

at 
org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:123)

at 
org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)

at 
org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125)

at 
org.hibernate.engine.jdbc.internal.StatementPreparerImpl$StatementPreparationTemplate.prepareStatement(StatementPreparerImpl.java:188)

at 
org.hibernate.engine.jdbc.internal.StatementPreparerImpl.prepareQueryStatement(StatementPreparerImpl.java:159)

at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1859)

at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1836)

at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)

at org.hibernate.loader.Loader.doQuery(Loader.java:900)

at 
org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)

at org.hibernate.loader.Loader.doList(Loader.java:2526)

at org.hibernate.loader.Loader.doList(Loader.java:2512)

at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)

at org.hibernate.loader.Loader.list(Loader.java:2337)

at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:495)

at 
org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:357)

at 
org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:195)

at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1269)

at org.hibernate.internal.QueryImpl.list(QueryImpl.java:101)

at 
org.apache.ctakes.ytex.umls.dao.UMLSDaoImpl.getAllAuiStr(UMLSDaoImpl.java:106)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:319)

at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)

at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)

at 
org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)

at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)

at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:90)

at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)

at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)

at com.sun.proxy.$Proxy11.getAllAuiStr(Unknown Source)

at 
org.apache.ctakes.ytex.umls.dao.UMLSDaoTest.testGetAllAuiStr(UMLSDaoTest.java:53)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)

at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)

at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)

at 

Re: Microsoft - MSDN - Is the support continuing for ASF committers?

2014-08-25 Thread Pei Chen
Just an fyi - link for MSDN subscription license(s) for committers
http://mail-archives.apache.org/mod_mbox/www-community/201305.mbox/%3c518b85e7.7000...@lehmi.de%3E

https://svn.apache.org/repos/private/committers/donated-licenses/msdn-subscription.html


Re: [VOTE] Release Apache cTAKES 3.2.0

2014-07-08 Thread Pei Chen
It's the latter:
the -src is basically the same as the dev install w/o the subversion
checkout step...


On Tue, Jul 8, 2014 at 7:15 AM, Miller, Timothy 
timothy.mil...@childrens.harvard.edu wrote:

 One other thing -- the User install guide says to download the -bin and
 the Dev install guide has you checking out from trunk. And the README in
 the -src distribution has instructions that only make sense for the -bin
 distribution.
 Are there instructions somewhere for how to install the -src version or
 are they basically the same as the dev install w/o the subversion checkout
 step?

 Tim

 
 From: Pei Chen [chen...@apache.org]
 Sent: Monday, July 07, 2014 12:06 PM
 To: dev@ctakes.apache.org
 Subject: Re: [VOTE] Release Apache cTAKES 3.2.0

 Thanks for testing this, Tim.  I could recreate this on my Mac now (it worked
 previously on Windows by luck because of the class load order).
 Essentially, the old mitre assertion module and LVG resources still need to
 be unpacked.  We don't have access to modify the underlying lib to read
 from a stream, I'm just going to omit/exclude the redundant
 ctakes-assertion-res.jar from the distro (as the assertion module will be
 updated in the future release anyway).
 Let me know if you encounter anything else, otherwise will plan to create
 an RC-2.


 On Fri, Jul 4, 2014 at 12:01 PM, Miller, Timothy 
 timothy.mil...@childrens.harvard.edu wrote:

  I get an error when I try to run the CVD following the README
 instructions
  for the binary release:
 
  7/4/14 11:56:03 AM - 12:
  org.apache.uima.tools.cvd.MainFrame.handleException(527): SEVERE:
  Initialization of annotator class
  org.apache.ctakes.assertion.medfacts.AssertionAnalysisEngine failed.
   (Descriptor:
 
 file:/home/tmill/Projects/sandbox/ctakes-rcs/apache-ctakes-3.2.0/desc/ctakes-assertion/desc/assertionAnalysisEngine.xml)
  org.apache.uima.resource.ResourceInitializationException: Initialization
  of annotator class
  org.apache.ctakes.assertion.medfacts.AssertionAnalysisEngine failed.
   (Descriptor:
 
 file:/home/tmill/Projects/sandbox/ctakes-rcs/apache-ctakes-3.2.0/desc/ctakes-assertion/desc/assertionAnalysisEngine.xml)
  at
 
 org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:252)
  at
 
 org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:156)
  at
 
 org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
  at
 
 org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
  at
  org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
  at
 
 org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
  at
 
 org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
  at
 
 org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
  at
 
 org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
  at
 
 org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
  at
 
 org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
  at
 
 org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
  at
  org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
  at
 
 org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
  at
 
 org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
  at
 
 org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
  at
 
 org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
  at
 
 org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
  at
 
 org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
  at
 
 org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
  at
  org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
  at
 
 org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:354)
  at
 org.apache.uima.tools.cvd.MainFrame.setupAE(MainFrame.java:1484)
  at
  org.apache.uima.tools.cvd.MainFrame.loadAEDescriptor(MainFrame.java:477)
  at
 
 org.apache.uima.tools.cvd.control.AnnotatorOpenEventHandler.actionPerformed
