RE: to involve in your development group

2013-07-22 Thread Finan, Sean
Hi Sandeep, I just took a peek at the JavaOcr code, and it looks like they perform image filtering in the PixelImage class. This would probably cause a problem with dot matrix images as every corner of every dot would be removed as noise, so dots that participate in curves on characters such

RE: cTAKES user interface

2013-10-30 Thread Finan, Sean
AM To: dev@ctakes.apache.org Cc: Finan, Sean Subject: RE: cTAKES user interface Hi all, Sean Finan (I think is on this group) already wrote a command line CPE runner like Pei described. I've been using it and would be happy to provide some user guides if he provides the class,etc. Todd Lingren

RE: cTAKES user interface

2013-10-30 Thread Finan, Sean
hour. -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Wednesday, October 30, 2013 11:20 AM To: Lingren, Todd; dev@ctakes.apache.org Subject: RE: cTAKES user interface Sean Finan (I think is on this group) already wrote a command line CPE runner like

RE: Sundry; Problem Lists

2013-11-04 Thread Finan, Sean
. Jg — Sent from Mailbox for iPhone On Thu, Oct 31, 2013 at 2:04 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: I don't know if what I write below truly applies to the discussion, but here it is. much of a problem list definition may already be contained to varying degrees in existing

RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-11-04 Thread Finan, Sean
dictionary used by cTAKES? Ted -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Wednesday, September 04, 2013 9:37 AM To: dev@ctakes.apache.org Subject: RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor I don't know

RE: Sundry; Problem Lists

2013-11-04 Thread Finan, Sean
would possibly be a route to a solution. Now that is a challenge! Cheers for the inspiration and enthusiasm, Sean From: John Green [john.travis.gr...@gmail.com] Sent: Monday, November 04, 2013 10:45 AM To: Finan, Sean Subject: RE: Sundry; Problem Lists Oh goodness

RE: cTAKES Groovy...

2013-12-06 Thread Finan, Sean
Good stuff - Thanks Richard -Original Message- From: Masanz, James J. [mailto:masanz.ja...@mayo.edu] Sent: Friday, December 06, 2013 3:30 PM To: 'dev@ctakes.apache.org' Subject: RE: cTAKES Groovy... Thanks Richard! That did the trick I'll create a JIRA and update the script including

RE: UMLS Env variables suggestion

2014-01-06 Thread Finan, Sean
(_), and leave the dot (.) in the code to be deprecated for now? --Pei -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Saturday, January 04, 2014 10:10 PM To: dev@ctakes.apache.org Subject: RE: UMLS Env variables suggestion This went in to 3.1 https

RE: sentence detector newline behavior

2014-01-22 Thread Finan, Sean
On my end it looks like my email was reformatted and some of my -newline- removed in those last examples ... -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Wednesday, January 22, 2014 3:42 PM To: dev@ctakes.apache.org Subject: RE: sentence

RE: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread Finan, Sean
Hi Vijay, I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence. Since James pointed out this possibility a couple weeks ago, I have kept my eyes open. The problem is pretty ubiquitous in a corpus that I'm working with right now. I just

RE: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread Finan, Sean
this if they like. -vj On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Vijay, I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence. Since James pointed out this possibility a couple weeks ago, I have kept

RE: Update: UMLS, cTAKES, and UIMA for applications in genomics

2014-02-24 Thread Finan, Sean
Hi Andy, We have been using Uima-as here, but with no third-party wrappings. We have set it up to run in standalone and lsf cluster environments, but everything is out-of-box with a few custom bash scripts to set environment settings, etc. Sean -Original Message- From: andy mcmurry

RE: How to add a new dictionary database to cTAKES

2014-02-28 Thread Finan, Sean
Hi Abhishek, You have some interesting timing ... I can give you the xml specifications that you require if you send me the format of your dictionary. Since you are new to the current dictionary module setup, I might also have a simpler solution for you ... A couple of days ago I checked a

RE: getSeverity etc. for relation extractor

2014-03-21 Thread Finan, Sean
in TemplateFillerAnnotator or something else. -- James -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, March 21, 2014 12:30 PM To: dev@ctakes.apache.org Subject: RE: getSeverity etc. for relation extractor until we have a definite, well-defined need (from a user

RE: getSeverity etc. for relation extractor

2014-03-24 Thread Finan, Sean
one location_of relation. And again no location_of relations for rash on arm and leg Sean, what was the exact phrase you used with the incubator version? (or was that a while ago and lost) -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, March

RE: Temporal Information Extraction package has compile time error

2014-03-27 Thread Finan, Sean
Hi Manu, Speaking for the developers of that module, we are excited that you and others in the community are starting to show so much interest in temporal information extraction - enough to attempt builds and trial runs. The Temporal module is still in an academic experimental phase and there

RE: errors when run BagOfCUIsGenerator.java

2014-04-16 Thread Finan, Sean
Try to open https://uts-ws.nlm.nih.gov If that works then try https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser and see if you get a message like This XML file does not appear to have any style information associated with it. The document tree is shown below. If that works and you

RE: lvg entries

2014-04-17 Thread Finan, Sean
Those variants are not used by the dictionary lookup. I did look at them to see if it was worthwhile for the new dictionary, but they are all over the place so I passed. From: Miller, Timothy [timothy.mil...@childrens.harvard.edu] Sent: Thursday, April

RE: lvg entries

2014-04-18 Thread Finan, Sean
+1 false -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Friday, April 18, 2014 2:54 PM To: dev@ctakes.apache.org Subject: Re: lvg entries Thanks for tracking that down Andy. I am making a pass at UimaFit-izing the configuration parameters

RE: ytex merged into trunk

2014-04-28 Thread Finan, Sean
Hi Vijay, I did a checkout this morning and I'm getting compile errors from Maven. If I just run mvn compile then I get an error while building ytex claiming that the package has not been created. Is there a reversed dependency? If I run mvn compile package then ytex seems to run through, but

RE: ytex merged into trunk

2014-04-28 Thread Finan, Sean
. -vj On Mon, Apr 28, 2014 at 11:00 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Vijay, I did a checkout this morning and I'm getting compile errors from Maven. If I just run mvn compile then I get an error while building ytex claiming that the package has not been

RE: Preparing for an Apache cTAKES 3.2 Release?

2014-06-11 Thread Finan, Sean
11, 2014 at 9:21 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: . The newer NER should have in its name the Behavior... I agree, but the *2 module is a complete replacement for the current lookup. It does not (really) have any different behavior, just a different implementation

RE: Preparing for an Apache cTAKES 3.2 Release?

2014-06-16 Thread Finan, Sean
on the dictionary lookup, how to configure it, and how to create new dictionaries. I would venture to say that this is the most important component in cTAKES, and probably the one that has generated the most questions on the newsgroup. On Wed, Jun 11, 2014 at 9:21 AM, Finan, Sean sean.fi

RE: DeepPheno: guidance on CTakes

2014-06-27 Thread Finan, Sean
Hi Pei, Nice examples. The pipeline builder could be simpler (divvied), but they shouldn't leave anybody confused. +1 for the uimafit annotations! -Original Message- From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu] Sent: Friday, June 27, 2014 11:11 AM To: Hochheiser, Harry

RE: Bacterium Dictionary

2014-06-30 Thread Finan, Sean
Hi Nick, There are ~26,000 T007 Bacterium (falls under Living Being) entries in UMLS 2013aa. They aren't in the cTakes dictionary, but you can build a separate bacteria dictionary using the dictionary creator tool in cTakes sandbox. It can create dictionaries formatted for use with both

RE: [VOTE] Release Apache cTAKES 3.2.0

2014-07-02 Thread Finan, Sean
+1 Pulled fresh candidate, built, and ran Clinical using CPE without problem. Other than that, no testing. SVN gave me a problem initially (checked out as anonymous) asking for a password then flunking the checkout, but an update completed it. I blame the heat.

RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

2014-07-10 Thread Finan, Sean
+1 for the ytex method of handling a umls login before download of the umls resources. While this also doesn't truly prevent people from sharing files (data) without a umls account, it is a little bit of a nicer mechanism. Aside ... Does anybody out there have experience with izpack?

RE: Lucene for UMLS2014

2014-07-21 Thread Finan, Sean
Hi Harpreet, If you are willing to use cTakes 3.2, try the dictionary-lookup-fast module as a replacement of the default dictionary-lookup. That module has a new dictionary resource (hsql, not lucene) and slightly different methods for lookup and matching. In time trials it has been faster

RE: question about sentence segmentation

2014-08-02 Thread Finan, Sean
Hi Tim, It would be preferable to me to put sentence breaks in between the sections, so the first two sentences would be: 1) PE: Lymphonodes... 2) Lungs: normal... The punctuation is (always) after the logical break, being Term: for a Term:Definition list. I think that the first three

RE: code value for vocabulary in dic-lookup-fast

2014-08-06 Thread Finan, Sean
Hi Harpreet, I don't know if this has yet been answered (I'm still finding vacation-time emails), but the Snomed-ct, Rx-norm, etc. codes were removed from the -fast dictionary for speed. Basically, any single UMLS Cui can have multiple different snomed-ct codes (for instance), and adding

RE: v_snomed_fword_lookup view

2014-08-08 Thread Finan, Sean
Hi Clayton, I don't know how the ytex dictionary lookup works, so I'm afraid that I can't help you with an answer. Maybe Vijay is the best person to do this. If you aren't tied to ytex you could try the new cTakes dictionary-lookup-fast. I tested Patient came in with a malar rash and it

RE: v_snomed_fword_lookup view

2014-08-11 Thread Finan, Sean
again: How exactly do you switch to using the cTakes dictionary-lookup-fast. Do I need to go in and alter xml files or is it as simple as adding a certain item to the list of analysis engines? On Fri, Aug 8, 2014 at 3:48 PM, Finan, Sean sean.fi...@childrens.harvard.edu

Youtube Channel Apache cTakes

2014-08-12 Thread Finan, Sean
cTakes now has a youtube channel named Apache cTakes. It is empty, but if you have ever made a training video, presentation on a component (descriptors, type system, etc.), or demo of integration with another system (UimaFit, Uima-AS, etc.) then please feel free to post on that channel. When

RE: v_snomed_fword_lookup view

2014-08-13 Thread Finan, Sean
/using one of those. On Wed, Aug 13, 2014 at 1:41 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Clayton, I'm glad that you got it working. Though I stated that I would, I haven't yet checked the fidelity of trunk. Urgent data request one day, must have writing the next

RE: v_snomed_fword_lookup view

2014-08-13 Thread Finan, Sean
of a CasConsumer to essentially save your data in a representation that you can do some kind of data mining or classification on it? If so, then I think I need to look into making/using one of those. On Wed, Aug 13, 2014 at 1:41 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote

RE: Web server

2014-08-21 Thread Finan, Sean
Hi John, Have you (or another) thought about modifying the Uima Simple Server to run a cTakes pipeline? http://uima.apache.org/sandbox.html#simple-server -Original Message- From: John Green [mailto:john.travis.gr...@gmail.com] Sent: Thursday, August 21, 2014 3:06 PM To:

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
Hi Nick, I think that the bottleneck is probably the lookup module itself. So, I just sent you a secure email/ftp link. It contains a build of the new dictionary-lookup-fast module. Should you choose to try it, let me know how things turn out. Sean

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
To: dev@ctakes.apache.org Subject: RE: Ctakes to process 5000K recoreds Hi Sean, Many thanks, I will try it tomorrow. Do you have any special instruction to run that scrip or I have to use it with cTakes? Thanks, Nick -Original Message- From: Finan, Sean [mailto:sean.fi

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
Steve Bethard wrote: I spent some time writing a script for diff-ing CASes I urge anyone interested in comparing cTakes CASes / output to use this type of approach. Comparison of program output is a post-process task, and unless absolutely necessary code to juggle data and metadata belongs

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
/2014 07:30 AM, Finan, Sean wrote: Steve Bethard wrote: I spent some time writing a script for diff-ing CASes I urge anyone interested in comparing cTakes CASes / output to use this type of approach. Comparison of program output is a post-process task, and unless absolutely necessary code

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
that is in a predictable order makes checking to see if there are differences much cheaper when you are dealing with larger data sets. Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 08:50 AM, Finan, Sean wrote: Hi Kim, One might want compare
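
The point above about predictable ordering can be sketched as follows. This is a minimal illustration only, not the diff script mentioned in the thread and not cTAKES code: `Span` is a hypothetical, simplified stand-in for an annotation (offsets plus a label), and the sort key (begin, end, label) is an assumption chosen for the example. Once both outputs are put into one canonical order, comparing them reduces to a plain list comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OrderedOutputDiff {

    /** Hypothetical stand-in for an annotation: character offsets plus a label. */
    static final class Span {
        final int begin;
        final int end;
        final String label;

        Span(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }

        @Override
        public String toString() {
            return begin + "-" + end + ":" + label;
        }
    }

    /** Renders spans as text lines in a fixed order: begin, then end, then label. */
    static List<String> canonical(List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingInt((Span s) -> s.begin)
                .thenComparingInt(s -> s.end)
                .thenComparing(s -> s.label));
        List<String> lines = new ArrayList<>();
        for (Span s : sorted) {
            lines.add(s.toString());
        }
        return lines;
    }

    /** Two runs match when their canonical renderings are identical. */
    static boolean sameOutput(List<Span> runA, List<Span> runB) {
        return canonical(runA).equals(canonical(runB));
    }
}
```

Doing the ordering in a post-processing step like this keeps the comparison logic out of the pipeline itself, which is the approach urged earlier in the thread.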

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
. Thanks, Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 10:46 AM, Finan, Sean wrote: Hi Kim, It concerns me a bit by making the code return consistent results would be so concerning. Could you please clarify what you mean

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 12:43 PM, Finan, Sean wrote: I'm just about sapped on this topic. What comes below is my final writing. Kim wrote: Yes, I mean actual type values not matching. Ok, this is a very serious problem and should

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
Hi Bruce, I would venture to say that this is neither expected nor desired. Before you fix it (or in addition to a fix), try to run with the new dictionary lookup. It will have a different behavior, and it will be the default dictionary lookup in future releases of cTakes – making fixes to

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
the necessary dictionary(ies) or how do I build them? IMAT Solutions http://imatsolutions.com Bruce Tietjen Senior Software Engineer Mobile: 801.634.1547 bruce.tiet...@imatsolutions.com On Wed, Oct 8, 2014 at 9:46 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
: IMAT Solutions] http://imatsolutions.com Bruce Tietjen Senior Software Engineer [image: Mobile:] 801.634.1547 bruce.tiet...@imatsolutions.com On Wed, Oct 8, 2014 at 9:46 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Bruce, I would venture to say that this is neither expected

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-09 Thread Finan, Sean
...@imatsolutions.com On Wed, Oct 8, 2014 at 10:02 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Bruce, With Pei's help I just updated the sourceforge repo with the cTakes dictionaries. Checkout artifact ctakes-resources-snomed-rword-hsqldb-2011ab Sean

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-09 Thread Finan, Sean
, but that would probably be better than missing the annotation all together. IMAT Solutions http://imatsolutions.com Bruce Tietjen Senior Software Engineer Mobile: 801.634.1547 bruce.tiet...@imatsolutions.com On Wed, Oct 8, 2014 at 10:02 AM, Finan, Sean sean.fi

RE: Need information regarding cTakes changes

2014-10-20 Thread Finan, Sean
Hi Chandu, For your note #2: 2)Any new features that can be added to current version of cTakes project to make it more useful. You can always check (or add to) the Jira future enhancement page at:

RE: Announcement: UMLS MedGen-MySQL dataset now available as open access download

2014-11-14 Thread Finan, Sean
Hi Andy, Great stuff! I think that I understand the method, but I have a question about the statement: the content is publicly available per the NCBI policy and license for MedGen sources Does this mean that I, Joe Anybody, could download the content, place some of the content in a database

RE: Asking help for always unsuccessful AE load

2014-12-04 Thread Finan, Sean
Hi Jun, Do AE pipelines that do not use the Smoking Status module work? I think that Smoking Status configuration (via binary install) might be broken in the last several versions. I thought that I had submitted a Jira long, long ago, but right now I can't find it so maybe my memory is

RE: Scaling cTakes

2014-12-05 Thread Finan, Sean
Hi Brandon, It sounds like you've got a decent pipeline set up. To increase the speed you could try swapping out use of ctakes-dictionary-lookup with ctakes-dictionary-lookup-fast in the AE. Check ctakes-clinical-pipeline/desc/[ae]/AggregatePlaintextFastUMLSProcessor.xml for an example.

RE: Scaling cTakes

2014-12-09 Thread Finan, Sean
other suggestions on performance tuning would be great! Thanks, Brandon -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, December 05, 2014 1:14 PM To: dev@ctakes.apache.org Subject: RE: Scaling cTakes Hi Brandon, It sounds like you've got

RE: revamping the Apache cTAKES website

2014-12-15 Thread Finan, Sean
Anyway, a pretty amazing fresh start, thanks Pei -Original Message- From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu] Sent: Monday, December 15, 2014 4:33 PM To: dev@ctakes.apache.org Subject: RE: revamping the Apache cTAKES website Check out a mockup of a new website proposal:

RE: Problem running cTakes-clinical pipeline -- AggregatePlaintextFastUMLSProcessor.xml

2014-12-15 Thread Finan, Sean
Hi Yu, Also do you know is there any command line I can run to annotate like a thousand files automatically rather than copy and paste. You could try the CPE gui : bin/runctakesCPE.sh Sean From: Liang, Yu [mailto:yu.li...@nyumc.org] Sent: Monday, December 15, 2014 4:51 PM To:

RE: intro video and ctakes youtube : Youtube Apache cTakes Channel Direct Link

2014-12-16 Thread Finan, Sean
On Mon, Dec 15, 2014 at 11:43 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hmmm, I can't find it in a search. However, here is a direct link: https://www.youtube.com/channel/UC8hQoOKz3v4PNEf6cqSkjbQ Maybe it needs a few videos to register in the search engine ? Sean

RE: intro video and ctakes youtube : Youtube Apache cTakes Channel Direct Link

2014-12-17 Thread Finan, Sean
video and ctakes youtube : Youtube Apache cTakes Channel Direct Link Isnt this to upload for my account? What about to the channel? On Tue, Dec 16, 2014 at 12:16 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi John, Look for an Upload button in the upper-left corner next

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
One quick mention: The cTakes dictionaries are built with UMLS 2011AB. If the Human annotations were not done using the same UMLS version then there WILL be differences in CUI and Semantic group. I don't have time to go into it with details, examples, etc. just be aware that every 6 months

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Software Engineer Office: 801.669.7342 kim.eb...@imatsolutions.com On 12/19/2014 11:31 AM, Finan, Sean wrote: One quick mention: The cTakes dictionaries are built with UMLS 2011AB. If the Human annotations were not done using the same UMLS version

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Hi Bruce, I'm not sure how there would be fewer matches with the overlap processor. There should be all of the matches from the non-overlap processor plus those from the overlap. Decreasing from 215 to 211 is strange. Have you done any manual spot checks on this? It is really bizarre that

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Hi Bruce, Correction -- So far, I did steps 1 and 2 of Sean's email. No problem. Aside from recreating the database, those two steps have the greatest impact. But before you change anything else, please do some manual spot checks. I have never seen a case where the lookup would be so

RE: cTakes Annotation Comparison --- (^:

2014-12-19 Thread Finan, Sean
bruce.tiet...@imatsolutions.com On Fri, Dec 19, 2014 at 1:39 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Sorry, I meant “Do some spot checks on the validity”. In other words, when your script reports that a cui and/or span is missing, manually look at the data and see

RE: Using cTakes programmatically

2014-12-29 Thread Finan, Sean
Hi Maite Meseure, Check the cTakes User guide on UMLS setup: https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+User+Install+Guide#cTAKES3.2UserInstallGuide-(Recommended)AddUMLSaccessrights which (in part) points you towards obtaining a license to use the NIH UMLS dictionary:

RE: Question about the pipeline

2015-02-02 Thread Finan, Sean
Hi Tol (and Maite), I'm not entirely certain that I understand the question, but here is an attempt to help. If I'm oversimplifying then I apologize. I think that ExampleAggregatePipeline is intended to represent a very simple single-note pipeline and that custom code could be produced by

RE: Question about the pipeline

2015-02-03 Thread Finan, Sean
-at least in Eclipse, I haven't managed to run it via the command line yet. On Mon, Feb 2, 2015 at 7:12 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Tol (and Maite), I'm not entirely certain that I understand the question, but here is an attempt to help. If I'm oversimplifying

RE: Question about the pipeline

2015-02-05 Thread Finan, Sean
in our HPC, it spawns a new job for each subfolder (which may have between 5 and 2500 notes). Todd Lingren Biomedical Informatics Cincinnati Children’s Hospital todd.ling...@cchmc.org 513-803-9032 -Original Message- From: Finan, Sean [mailto:sean.fi

RE: git mirrors out of sync?

2015-02-03 Thread Finan, Sean
Hi Steve, You are right (confirming your finding) - it looks like the first is a no-show and the second is somebody's personal upload to github (not git.apache.org) from 3 years ago. The jira claims that the item was closed (fixed), but if you go to

RE: Question about the pipeline

2015-02-03 Thread Finan, Sean
= createEngine(MyTokenizer.class); AnalysisEngine tagger = createEngine(MyTagger.class); runPipeline(jCas, tokenizer, tagger); for(Token token : iterate(jCas, Token.class)){ System.out.println(token.getTag()); } Tol O. Finan, Sean Sean.Finan@... writes: Hi Tol (and Maite), I'm not entirely

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-12 Thread Finan, Sean
Try something like the following for output: private int extractFeatures( final IdentifiedAnnotation annotation ) { // Extract the IdentifiedAnnotation itself final Collection<String> umlsInfos = getUmlsInfos( annotation, _printSnomed ); if ( umlsInfos == null ) {

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-12 Thread Finan, Sean
Oh yeah - use the -fast dictionary to get preferred text. The fastest way to get cuis only is with CuisOnlyPlaintextUMLSProcessor. If you want polarity make sure you uncomment the section with PolarityCleartkAnalysisEngine. Sean -Original Message- From: Maite Meseure Hugues

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-17 Thread Finan, Sean
...@gmail.com wrote: Thank you for your replies, It's helpful. I was working on 3.2.0 version, so it looks like 3.2.1 allows to get the UMLS preferred text. Maite On Thu, Feb 12, 2015 at 2:25 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Oh yeah - use the -fast dictionary to get

RE: CTAKES mirroring on github.

2015-02-17 Thread Finan, Sean
Our request is for a read-only mirror. However, if it ever becomes i/o, I don't know if this will have what you want, but http://git.apache.org/ Links to documentation (mostly server setup) http://www.apache.org/dev/git.html and a wiki (check toward middle and bottom for committer info)

RE: Question about fast pipeline

2015-01-12 Thread Finan, Sean
Hi Michelle, Did your error have only Could not find . as absolute or did it also have or in ... or in ...? If you see ... or in ... then this is a new issue. If you don't, then you should update your source. If you need to run the release binary then let me know and I can work out

RE: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison

2015-01-09 Thread Finan, Sean
looking for something that doesn't have to be the best speed-wise, but that is the recommended for optimizing F1 measure. Regards, James -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, December 19, 2014 11:55 AM To: dev@ctakes.apache.org; kim.eb

RE: Question about the pipeline

2015-02-05 Thread Finan, Sean
@ctakes.apache.org Subject: Re: Question about the pipeline Yes, it does but only in Eclipse, not in command line even though I am in the good directory. I have to look at the classpath more in details probably. Thanks for your replies. On Thu, Feb 5, 2015 at 8:08 AM, Finan, Sean sean.fi

RE: Negex

2015-01-05 Thread Finan, Sean
I don't know. I'm comparing what I think is the 2009 negex trigger set https://code.google.com/p/negex/source/browse/trunk/GeneralNegEx.Java.v.1.2.05092009/negex_triggers.txt with the cTakes trigger set in org.apache.ctakes.core.fsm.machine.NegationFSM.java and it looks like the cTakes set is

RE: Question about CPE/ descriptor and xml file.

2015-01-05 Thread Finan, Sean
Go through the error that you got, and look for a message like: Failed to initilize. Invalid UMLS License and Error: Invalid UMLS License. A UMLS License is required to use the UMLS dictionary lookup. Error: You may request one at: https://uts.nlm.nih.gov/license.html Please verify your

RE: Is it necessary to put UMLS login into files when passing them with -D to the JVM?

2015-03-06 Thread Finan, Sean
Hi Tom, I am passing my UMLS login and password on startup as arguments ... -Dctakes.umlsuser=myusername -Dctakes.umlspw=mypassword That is fine. If I understand correctly you are already running this way without problem. The comments in the .xml files should probably be extended to
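
For readers unfamiliar with `-D` arguments: values passed this way become JVM system properties that any code in the process can read. The following is a minimal sketch of that mechanism, assuming only the two property names quoted in the message; the class name and the `require` helper are hypothetical and are not part of cTAKES.

```java
public class UmlsCredentials {

    // Property names as quoted in the message above.
    static final String USER_KEY = "ctakes.umlsuser";
    static final String PASS_KEY = "ctakes.umlspw";

    /** Returns the trimmed value of a -D property, or throws if it is missing or blank. */
    static String require(String key) {
        String value = System.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing JVM property -D" + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        String user = require(USER_KEY);
        require(PASS_KEY); // check presence only; never print the password
        System.out.println("UMLS user: " + user);
    }
}
```

Run with, for example, `java -Dctakes.umlsuser=myusername -Dctakes.umlspw=mypassword UmlsCredentials`. Because the values live only in the JVM's system properties, nothing needs to be written into a descriptor file for this check to succeed.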

RE: Hello cTAKES Mailing List

2015-02-23 Thread Finan, Sean
://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/CHV/ On Sun, Feb 22, 2015 at 10:37 AM, Finan, Sean sean.fi

URGENT! RE: New Website

2015-02-25 Thread Finan, Sean
Hi all, It looks like a few people (myself included) are interested in having information on people, projects, papers, and applications that use cTAKES on the web page. I have created a form on google that might help us collect this and other information. Please visit

RE: Hello cTAKES Mailing List

2015-02-22 Thread Finan, Sean
Hi Raymond, If you use the dictionary-fast module there exists an entry feeling bad with cui 557911 and cui 231218. There is also feel bad and feeling bad emotionally You will find horrible present pain but no other entry with horrible. You will not find any terms with awful and probably

RE: ctakessorx for AggregatePlaintextFastUMLSProcessor.xml

2015-03-27 Thread Finan, Sean
Maite, You already have a thread going with me offline. If you have a question please ask it on that thread to refrain from spamming the devlist. Until I have a chance to create decent documentation you are stuck with me. Sean From: Maite Meseure

RE: Prep for upcoming cTAKES 3.2.2 Patch Release

2015-04-30 Thread Finan, Sean
+1 for pushing forward I may have been one of the voices commenting on memory bloat, but I agree with Pei re: improving the new. The more use, the more attention and more improvement (hopefully). I can't speak of the accuracy old v. new as I haven't actually comparatively tested them. And

RE: build tool suggestion

2015-05-06 Thread Finan, Sean
Your IDE should have settings that allow custom warnings. Also check out findbugs -- http://en.wikipedia.org/wiki/FindBugs There might be a configurable maven plugin. It is a process ... -Original Message- From: Masanz, James J. [mailto:masanz.ja...@mayo.edu] Sent: Tuesday, May 05,

RE: UMLS Authentication failing despite correct username and password

2015-05-11 Thread Finan, Sean
Hi Pedro, Check the cTakesHsql.xml and make sure that the line matches: <property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser"/> In an older version of cTAKES with an output message as you have: 11 May 2015 15:59:47 INFO AbstractJCasTermAnnotator - Default - Loading

RE: UMLS Authentication failing despite correct username and password

2015-05-11 Thread Finan, Sean
Argh. Our email server may have mucked with the url that I pasted: H t t p s : / / uts - ws . nlm . nih . gov / restful / isValidUMLSUser <property key="umlsUrl" value=" INSERT URL HERE, NO SPACES "/> -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent

RE: build tool suggestion

2015-05-06 Thread Finan, Sean
@ctakes.apache.org' Subject: RE: build tool suggestion Sorry, I wasn't clear, when I said at build time, I meant the Jenkins automated build. -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Wednesday, May 06, 2015 9:52 AM To: dev@ctakes.apache.org Subject: RE

RE: UMLS Authentication failing despite correct username and password

2015-05-14 Thread Finan, Sean
Hi Pedro, B). If the user has already downloaded the UMLS isn't that already indicative that they had a valid account? As I understand it (I wasn't around at the time) this per-user licensing with a jit check was the deal that was worked out with the NLM. I think that repackaging and

RE: UMLS Authentication failing despite correct username and password

2015-05-12 Thread Finan, Sean
().equalsIgnoreCase("<Result>true</Result>"); in isValidUMLSUser() should be replaced with result = line.trim().equalsIgnoreCase("<?xml version='1.0' encoding='UTF-8'?><Result>true</Result>"); Michal -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: May-11-15
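
The check discussed above compares the whole response line against one exact string, so it breaks whenever the service adds or drops the XML declaration. A more tolerant variant is sketched below. To be clear, this is a different technique from the exact-match replacement proposed in the message, and it is not the fix that went into cTAKES; the class and method names are hypothetical.

```java
public class UmlsResponseCheck {

    /**
     * Returns true when the response line is <Result>true</Result>,
     * with or without a leading XML declaration.
     */
    static boolean isValidResponse(String line) {
        if (line == null) {
            return false;
        }
        String body = line.trim();
        // Drop an optional declaration such as <?xml version='1.0' encoding='UTF-8'?>
        if (body.startsWith("<?xml")) {
            int end = body.indexOf("?>");
            if (end < 0) {
                return false; // declaration never closes: treat as invalid
            }
            body = body.substring(end + 2).trim();
        }
        return body.equalsIgnoreCase("<Result>true</Result>");
    }
}
```

Accepting both forms means the same client code keeps working against either response format, at the cost of being slightly more permissive than a strict string match.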

RE: DB DictionaryLookupAnnotator sqlserver exception

2015-04-15 Thread Finan, Sean
Hi Alex, This is some pretty odd behavior. Obviously, it is indicating that the resource type loaded or specified is not the correct class. Specification is (for the standard UMLS pipeline) in ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml lines #226 and

RE: TimeLanes

2015-06-22 Thread Finan, Sean
Hi Maashu, TimeLanes is currently a prototype gui under development and there is probably no information about it on the web. It is in sandbox because it isn't part of the ctakes release and is missing much needed functionality. For instance, It should display basic information about the

RE: RareWord term

2015-06-22 Thread Finan, Sean
Hi Maite, I hope to have a paper out on this soon, so I am keeping things kind of quiet about it - though one can always look at the database and code to get an idea of what it means. For anything else in the module, you can look at the wiki page:

RE: TimeLanes

2015-06-22 Thread Finan, Sean
. Pei just put it up there, thank you very much, Pei! --Guergana -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Monday, June 22, 2015 11:36 AM To: dev@ctakes.apache.org Subject: RE: TimeLanes Hi Maashu, TimeLanes is currently a prototype gui under

RE: cTakes - hsqldb connection problem

2015-06-02 Thread Finan, Sean
Hi Pankaj, I haven't seen this exact error before. I guess that my first steps toward a possible remedy would be: - check for existence of /org/apache/ctakes/dictionary/lookup/umls2011ab/umls.properties - make sure that it (resources/) is in your classpath - see if it looks like any of the

RE: The fast dictionary pipeline vs. the regular one

2015-06-29 Thread Finan, Sean
contains the CUIs of both Glioblastoma and glioblastoma Multiforme. Best, Oranit. -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Monday, June 22, 2015 5:13 PM To: dev@ctakes.apache.org Subject: RE: The fast dictionary pipeline vs

RE: how to run i2b2 data

2015-08-07 Thread Finan, Sean
data Thanks Sean for your understanding, and I am in hope now. Where is the best place to start looking at regarding create a collection reader that works similarly to org.apache.ctakes.core.cr. FilesInDirectoryCollectionReader? Justin On Wed, Aug 5, 2015 at 7:24 PM, Finan, Sean sean.fi

RE: how to run i2b2 data

2015-08-05 Thread Finan, Sean
Hi Justin, A shot in the dark: You could create a collection reader that works similarly to org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader , but instead of grabbing all of the files in a directory it grabs all the records parsed from a single .xml and runs a pipeline per record.
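
The record-splitting half of that suggestion can be sketched in plain Java. This is only an illustration under stated assumptions: it does not implement a UIMA collection reader, the element name passed as `tagName` is hypothetical (the real i2b2 file layout is not given in the thread), and a real reader would hand each record to the pipeline rather than return a list.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XmlRecordSplitter {

    /**
     * Parses one XML document and returns the text of every element named
     * tagName, one entry per record, in document order.
     */
    static List<String> readRecords(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> records = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                records.add(nodes.item(i).getTextContent().trim());
            }
            return records;
        } catch (Exception e) {
            // Surface parse failures as an unchecked exception for simplicity.
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
    }
}
```

Each returned string would then play the role that a single file's text plays in a directory-based reader: one record in, one pipeline run.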

RE: Cannot resolve lookup descriptor files for UmlsDictionaryLookupAnnotator

2015-07-22 Thread Finan, Sean
(); ... SimplePipeline.runPipeline(reader, aggregateEngine, writer, evaluator); Best regards, Jakob -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: 10 July 2015 18:29 To: dev@ctakes.apache.org Subject: RE: Cannot resolve lookup descriptor files

RE: Invalid UMLS License

2015-07-27 Thread Finan, Sean
Hi Justin, The UMLS licensing issue has been resolved: https://issues.apache.org/jira/browse/CTAKES-359 Any version built after May 12th 2015 should have the fix. Sean -Original Message- From: Justin Zhang [mailto:justinzhang...@gmail.com] Sent: Sunday, July 26, 2015 9:21 AM To:

RE: Invalid UMLS License

2015-07-27 Thread Finan, Sean
=/Logger.properties with CHANGEME On Mon, Jul 27, 2015 at 8:32 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Justin, The UMLS licensing issue has been resolved: https://issues.apache.org/jira/browse/CTAKES-359
