RE: to involve in your development group

2013-07-22 Thread Finan, Sean
Hi Sandeep, I just took a peek at the JavaOcr code, and it looks like they perform image filtering in the PixelImage class. This would probably cause a problem with dot matrix images as every corner of every dot would be removed as noise, so dots that participate in curves on characters such

RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-09-04 Thread Finan, Sean
This may sound strange, but SNOMED does not contain the term CIN I. It contains the terms CIN I - Cervical intraepithelial neoplasia 1 and CIN I - mild dyskaryosis. -Original Message- From: Pei Chen [mailto:chen...@apache.org] Sent: Tuesday, September 03, 2013 10:13 PM To:

RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-09-04 Thread Finan, Sean
I don't know if this is exactly what you want, but you can use the hyperSql ( http://hsqldb.org/ ) database tool to perform searches on the umls dictionary used by cTakes. For instance select * from UMLS_MS_2011AB where FWORD = 'CIN' will provide all the available terms starting with CIN.
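The query in this message filters on a column named FWORD, which, judging by the name and the description, holds the first word of each dictionary term. As a rough illustration of that idea only, here is an in-memory first-word index in plain Java. It is a sketch and not the cTAKES HSQLDB schema: the class `FirstWordIndex` and its methods are hypothetical, and the lookup is case-sensitive by assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory analogue of a first-word (FWORD) lookup. */
final class FirstWordIndex {
    private final Map<String, List<String>> termsByFirstWord = new HashMap<>();

    /** Index a term under its first whitespace-delimited word. */
    void add(String term) {
        String trimmed = term.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        String firstWord = trimmed.split("\\s+")[0];
        termsByFirstWord.computeIfAbsent(firstWord, k -> new ArrayList<>()).add(trimmed);
    }

    /** Return every indexed term whose first word equals the given word. */
    List<String> lookup(String firstWord) {
        List<String> hits = termsByFirstWord.get(firstWord);
        return hits == null ? Collections.<String>emptyList() : hits;
    }

    public static void main(String[] args) {
        FirstWordIndex index = new FirstWordIndex();
        // The two CIN terms are the ones quoted earlier in this thread.
        index.add("CIN I - Cervical intraepithelial neoplasia 1");
        index.add("CIN I - mild dyskaryosis");
        for (String term : index.lookup("CIN")) {
            System.out.println(term);
        }
    }
}
```

Against the real dictionary the SQL query remains the tool to use; the sketch only shows why a bare "CIN I" entry is not needed for the longer terms to be found by their first word.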

RE: cTAKES user interface

2013-10-30 Thread Finan, Sean
AM To: dev@ctakes.apache.org Cc: Finan, Sean Subject: RE: cTAKES user interface Hi all, Sean Finan (I think is on this group) already wrote a command line CPE runner like Pei described. I've been using it and would be happy to provide some user guides if he provides the class,etc. Todd Lingren

RE: cTAKES user interface

2013-10-30 Thread Finan, Sean
hour. -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Wednesday, October 30, 2013 11:20 AM To: Lingren, Todd; dev@ctakes.apache.org Subject: RE: cTAKES user interface Sean Finan (I think is on this group) already wrote a command line CPE runner like

RE: Sundry; Problem Lists

2013-10-31 Thread Finan, Sean
I don't know if what I write below truly applies to the discussion, but here it is. Much of a problem list definition may already be contained to varying degrees in existing cTakes databases. The UMLS does provide a problem list, but I haven't looked at it.

RE: Sundry; Problem Lists

2013-11-04 Thread Finan, Sean
. Jg — Sent from Mailbox for iPhone On Thu, Oct 31, 2013 at 2:04 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: I don't know if what I write below truly applies to the discussion, but here it is. much of a problem list definition may already be contained to varying degrees in existing

RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

2013-11-04 Thread Finan, Sean
dictionary used by cTAKES? Ted -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Wednesday, September 04, 2013 9:37 AM To: dev@ctakes.apache.org Subject: RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor I don't know

RE: Sundry; Problem Lists

2013-11-04 Thread Finan, Sean
would possibly be a route to a solution. Now that is a challenge! Cheers for the inspiration and enthusiasm, Sean From: John Green [john.travis.gr...@gmail.com] Sent: Monday, November 04, 2013 10:45 AM To: Finan, Sean Subject: RE: Sundry; Problem Lists Oh goodness

RE: cTAKES Groovy...

2013-12-06 Thread Finan, Sean
Good stuff - Thanks Richard -Original Message- From: Masanz, James J. [mailto:masanz.ja...@mayo.edu] Sent: Friday, December 06, 2013 3:30 PM To: 'dev@ctakes.apache.org' Subject: RE: cTAKES Groovy... Thanks Richard! That did the trick I'll create a JIRA and update the script including

RE: UMLS Env variables suggestion

2014-01-04 Thread Finan, Sean
This went in to 3.1 https://issues.apache.org/jira/browse/CTAKES-164 I agree - the docs need to be updated if there is consensus on the use of this method. Personally I think that there should be one supported method, not both dot and underscore. I would prefer that we remove the dot

RE: UMLS Env variables suggestion

2014-01-06 Thread Finan, Sean
(_), and leave the dot (.) in the code to be deprecated for now? --Pei -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Saturday, January 04, 2014 10:10 PM To: dev@ctakes.apache.org Subject: RE: UMLS Env variables suggestion This went in to 3.1 https
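The suggestion quoted in this thread, preferring the underscore form while leaving the dot form in the code as deprecated, can be sketched as a simple two-step lookup. Everything here is an assumption for illustration: the class `SettingLookup`, the method `resolve`, and the sample setting name are hypothetical and are not taken from the cTAKES source.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: prefer the underscore name, fall back to the dotted one. */
final class SettingLookup {
    /**
     * Look up a setting by its dotted name, trying the underscore variant first.
     * Returns null when neither form is present.
     */
    static String resolve(Map<String, String> settings, String dottedName) {
        String underscored = dottedName.replace('.', '_');
        String value = settings.get(underscored);
        if (value != null) {
            return value;
        }
        // Deprecated fallback: the original dotted form.
        return settings.get(dottedName);
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("example_user", "alice"); // made-up name and value
        System.out.println(resolve(settings, "example.user"));
    }
}
```

Passing `System.getenv()` as the map would apply the same rule to real environment variables; whether the project keeps both forms is exactly what the thread is debating.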

RE: sentence detector newline behavior

2014-01-22 Thread Finan, Sean
Just whistling in the wind here ... Perhaps before any changes are made to universally toggle cTakes in one direction or the other, we can take a poll of when and where cTakes/Ytex/OpenNLP/Omaha needs a Sentence (ignoring CR/LF) as opposed to a Line (CR/LF delimited PLUS -sentence-) If some
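The distinction drawn here between a Sentence (newlines ignored) and a Line (newline-delimited) can be made concrete with a small sketch. This is a naive illustration under stated assumptions: it splits sentences on terminal punctuation with a regular expression and is not the OpenNLP-based detector that cTAKES uses; the class and method names are hypothetical, and the sample text is made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical comparison of line-based and sentence-based segmentation. */
final class LineVsSentence {
    /** Split on CR/LF boundaries, dropping blank lines. */
    static List<String> lines(String text) {
        List<String> out = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    /** Treat newlines as plain whitespace, then split after '.', '!' or '?'. */
    static List<String> sentences(String text) {
        String joined = text.replaceAll("\\s*\\r?\\n\\s*", " ").trim();
        List<String> out = new ArrayList<>();
        if (joined.isEmpty()) {
            return out;
        }
        for (String sentence : joined.split("(?<=[.!?])\\s+")) {
            if (!sentence.isEmpty()) {
                out.add(sentence);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String note = "Patient denies\nchest pain. No fever.\n";
        System.out.println(lines(note));
        System.out.println(sentences(note));
    }
}
```

On text that wraps mid-sentence the two methods disagree, which is the case the proposed poll is meant to sort out: which components need which unit.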

RE: sentence detector newline behavior

2014-01-22 Thread Finan, Sean
than one on a line. And using sentences alone (as found by OpenNLP 1.5) would not suffice because it would run together sentences from different lines. -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Wednesday, January 22, 2014 1:33 PM To: dev

RE: sentence detector newline behavior

2014-01-22 Thread Finan, Sean
On my end it looks like my email was reformatted and some of my -newline- removed in those last examples ... -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Wednesday, January 22, 2014 3:42 PM To: dev@ctakes.apache.org Subject: RE: sentence

RE: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread Finan, Sean
Hi Vijay, I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence Since James pointed out this possibility a couple weeks ago, I have kept my eyes open. The problem is pretty ubiquitous in a corpus that I'm working with right now. I just

RE: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread Finan, Sean
this if they like. -vj On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Vijay, I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence Since James pointed out this possibility a couple weeks ago, I have kept

RE: Update: UMLS, cTAKES, and UIMA for applications in genomics

2014-02-24 Thread Finan, Sean
Hi Andy, We have been using Uima-as here, but with no third-party wrappings. We have set it up to run in standalone and lsf cluster environments, but everything is out-of-box with a few custom bash scripts to set environment settings, etc. Sean -Original Message- From: andy mcmurry

RE: How to add a new dictionary database to cTAKES

2014-02-28 Thread Finan, Sean
Hi Abhishek, You have some interesting timing ... I can give you the xml specifications that you require if you send me the format of your dictionary. Since you are new to the current dictionary module setup, I might also have a simpler solution for you ... A couple of days ago I checked a

RE: getSeverity etc. for relation extractor

2014-03-20 Thread Finan, Sean
1) Should we populate IdentifiedAnnotation.severity() and bodylocationof() Directly in RelationExtractorAnnotator instead of the template filler? One minor issue might be the fact that multiple relations of the same type can (and most likely will be) created for a single Identified

RE: getSeverity etc. for relation extractor

2014-03-21 Thread Finan, Sean
in TemplateFillerAnnotator or something else. -- James -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, March 21, 2014 12:30 PM To: dev@ctakes.apache.org Subject: RE: getSeverity etc. for relation extractor until we have a definite, well-defined need (from a user

RE: getSeverity etc. for relation extractor

2014-03-24 Thread Finan, Sean
one location_of relation. And again no location_of relations for rash on arm and leg Sean, what was the exact phrase you used with the incubator version? (or was that a while ago and lost) -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, March

RE: Temporal Information Extraction package has compile time error

2014-03-27 Thread Finan, Sean
Hi Manu, Speaking for the developers of that module, we are excited that you and others in the community are starting to show so much interest in temporal information extraction - enough to attempt builds and trial runs. The Temporal module is still in an academic experimental phase and there

RE: suggestion for default pipelines

2014-04-15 Thread Finan, Sean
+1 I think that a factory is a great idea. I (personally) dislike the descriptor schema, but I think that deprecation is the way to go until a replacement comes along. -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Tuesday, April 15,

RE: errors when run BagOfCUIsGenerator.java

2014-04-16 Thread Finan, Sean
Try to open https://uts-ws.nlm.nih.gov If that works then try https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser and see if you get a message like This XML file does not appear to have any style information associated with it. The document tree is shown below. If that works and you

RE: lvg entries

2014-04-17 Thread Finan, Sean
Those variants are not used by the dictionary lookup. I did look at them to see if it was worthwhile for the new dictionary, but they are all over the place so I passed. From: Miller, Timothy [timothy.mil...@childrens.harvard.edu] Sent: Thursday, April

RE: lvg entries

2014-04-18 Thread Finan, Sean
+1 false -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Friday, April 18, 2014 2:54 PM To: dev@ctakes.apache.org Subject: Re: lvg entries Thanks for tracking that down Andy. I am making a pass at UimaFit-izing the configuration parameters

RE: new dictionary lookup {was RE: lvg entries]

2014-04-22 Thread Finan, Sean
) -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Thursday, April 17, 2014 12:52 PM To: dev@ctakes.apache.org Subject: RE: lvg entries Those variants are not used by the dictionary lookup. I did look at them to see if it was worthwhile for the new dictionary

RE: ytex merged into trunk

2014-04-28 Thread Finan, Sean
Hi Vijay, I did a checkout this morning and I'm getting compile errors from Maven. If I just run mvn compile then I get an error while building ytex claiming that the package has not been created. Is there a reversed dependency? If I run mvn compile package then ytex seems to run through, but

RE: ytex merged into trunk

2014-04-28 Thread Finan, Sean
. -vj On Mon, Apr 28, 2014 at 11:00 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Vijay, I did a checkout this morning and I'm getting compile errors from Maven. If I just run mvn compile then I get an error while building ytex claiming that the package has not been

RE: Preparing for an Apache cTAKES 3.2 Release?

2014-06-11 Thread Finan, Sean
11, 2014 at 9:21 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: . The newer NER should have in its name the Behavior... I agree, but the *2 module is a complete replacement for the current lookup. It does not (really) have any different behavior, just a different implementation

RE: Preparing for an Apache cTAKES 3.2 Release?

2014-06-16 Thread Finan, Sean
on the dictionary lookup, how to configure it, and how to create new dictionaries. I would venture to say that this is the most important component in cTAKES, and probably the one that has generated the most questions on the newsgroup. On Wed, Jun 11, 2014 at 9:21 AM, Finan, Sean sean.fi

RE: DeepPheno: guidance on CTakes

2014-06-27 Thread Finan, Sean
Hi Pei, Nice examples. The pipeline builder could be simpler (divvied), but they shouldn't leave anybody confused. +1 for the uimafit annotations! -Original Message- From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu] Sent: Friday, June 27, 2014 11:11 AM To: Hochheiser, Harry

RE: Bacterium Dictionary

2014-06-30 Thread Finan, Sean
Hi Nick, There are ~26,000 T007 Bacterium (falls under Living Being) entries in UMLS 2013aa. They aren't in the cTakes dictionary, but you can build a separate bacteria dictionary using the dictionary creator tool in cTakes sandbox. It can create dictionaries formatted for use with both

RE: Bacterium Dictionary

2014-06-30 Thread Finan, Sean
dictionary using the dictionary creator. Thanks again, Nick -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Monday, June 30, 2014 3:37 PM To: dev@ctakes.apache.org Subject: RE: Bacterium Dictionary Hi Nick, There are ~26,000 T007 Bacterium

RE: [VOTE] Release Apache cTAKES 3.2.0

2014-07-02 Thread Finan, Sean
+1 Pulled fresh candidate, built, and ran Clinical using CPE without problem. Other than that, no testing. SVN gave me a problem initially (checked out as anonymous) asking for a password then flunking the checkout, but an update completed it. I blame the heat.

RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

2014-07-10 Thread Finan, Sean
+1 for the ytex method of handling a umls login before download of the umls resources. While this also doesn't truly prevent people from sharing files (data) without a umls account, it is a little bit of a nicer mechanism. Aside ... Does anybody out there have experience with izpack?

RE: Lucene for UMLS2014

2014-07-21 Thread Finan, Sean
Hi Harpreet, If you are willing to use cTakes 3.2, try the dictionary-lookup-fast module as a replacement of the default dictionary-lookup. That module has a new dictionary resource (hsql, not lucene) and slightly different methods for lookup and matching. In time trials it has been faster

RE: question about sentence segmentation

2014-08-02 Thread Finan, Sean
Hi Tim, It would be preferable to me to put sentence breaks in between the sections, so the first two sentences would be: 1) PE: Lymphonodes... 2) Lungs: normal... The punctuation is (always) after the logical break, being Term: for a Term:Definition list. I think that the first three

RE: code value for vocabulary in dic-lookup-fast

2014-08-06 Thread Finan, Sean
Hi Harpreet, I don't know if this has yet been answered (I'm still finding vacation-time emails), but the Snomed-ct, Rx-norm, etc. codes were removed from the -fast dictionary for speed. Basically, any single UMLS Cui can have multiple different snomed-ct codes (for instance), and adding

RE: v_snomed_fword_lookup view

2014-08-08 Thread Finan, Sean
Hi Clayton, I don't know how the ytex dictionary lookup works, so I'm afraid that I can't help you with an answer. Maybe Vijay is the best person to do this. If you aren't tied to ytex you could try the new cTakes dictionary-lookup-fast. I tested Patient came in with a malar rash and it

RE: v_snomed_fword_lookup view

2014-08-11 Thread Finan, Sean
again: How exactly do you switch to using the cTakes dictionary-lookup-fast. Do I need to go in and alter xml files or is it as simple as adding a certain item to the list of analysis engines? On Fri, Aug 8, 2014 at 3:48 PM, Finan, Sean sean.fi...@childrens.harvard.edu

Youtube Channel Apache cTakes

2014-08-12 Thread Finan, Sean
cTakes now has a youtube channel named Apache cTakes. It is empty, but if you have ever made a training video, presentation on a component (descriptors, type system, etc.), or demo of integration with another system (UimaFit, Uima-AS, etc.) then please feel free to post on that channel. When

RE: v_snomed_fword_lookup view

2014-08-13 Thread Finan, Sean
/using one of those. On Wed, Aug 13, 2014 at 1:41 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Clayton, I'm glad that you got it working. Though I stated that I would, I haven't yet checked the fidelity of trunk. Urgent data request one day, must have writing the next

RE: v_snomed_fword_lookup view

2014-08-13 Thread Finan, Sean
of a CasConsumer to essentially save your data in a representation that you can do some kind of data mining or classification on it? If so, then I think I need to look into making/using one of those. On Wed, Aug 13, 2014 at 1:41 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote

RE: Web server

2014-08-21 Thread Finan, Sean
Hi John, Have you (or another) thought about modifying the Uima Simple Server to run a cTakes pipeline? http://uima.apache.org/sandbox.html#simple-server -Original Message- From: John Green [mailto:john.travis.gr...@gmail.com] Sent: Thursday, August 21, 2014 3:06 PM To:

RE: Web server

2014-08-21 Thread Finan, Sean
into the existing sandbox code, so I just wanted to hash that first before starting on a new thread. Do you have experience with uima simple server? JG — Sent from Mailbox for iPhone On Thu, Aug 21, 2014 at 12:10 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi

RE: Permutations

2014-09-05 Thread Finan, Sean
Hi Kim, Pei, I don't think that I changed anything to which Kim is referring, just a couple of other things that happen to be in the same segment. From the attached it looks like Kim's change is to copy a list and sort the copy, while mine were moving the sort from an inner to outer loop. At
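The two kinds of change described in this message, sorting a copy instead of the original list and moving a sort from an inner loop to an outer one, can be sketched generically. This is not the cTAKES lookup code: the class `SortHoisting`, both methods, and the "smallest candidate at least as large as the query" task are hypothetical stand-ins chosen only to show the pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of sorting a copy and hoisting a sort out of a loop. */
final class SortHoisting {
    /** Sort a copy so the caller's list keeps its original order. */
    static List<Integer> sortedCopy(List<Integer> source) {
        List<Integer> copy = new ArrayList<>(source);
        Collections.sort(copy);
        return copy;
    }

    /**
     * For each query, find the smallest candidate that is >= the query
     * (null when there is none). The candidates are sorted once, before
     * the loop, rather than once per query.
     */
    static List<Integer> smallestAtLeast(List<Integer> queries, List<Integer> candidates) {
        List<Integer> sorted = sortedCopy(candidates); // hoisted out of the loop
        List<Integer> result = new ArrayList<>();
        for (Integer query : queries) {
            int pos = Collections.binarySearch(sorted, query);
            if (pos < 0) {
                pos = -pos - 1; // convert to the insertion point
            }
            result.add(pos < sorted.size() ? sorted.get(pos) : null);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> candidates = Arrays.asList(7, 3, 5);
        System.out.println(smallestAtLeast(Arrays.asList(2, 5, 10), candidates));
        System.out.println(candidates); // original order is preserved
    }
}
```

Sorting inside the loop would give the same answers at a higher cost per query, and sorting the caller's list in place would change its order as a side effect; the two changes address those separate concerns.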

RE: Permutations

2014-09-05 Thread Finan, Sean
@ctakes.apache.org; Finan, Sean Subject: Re: Permutations Hi Pei and Sean, Sean, any thoughts about this would be helpful. We also had issues in cTAKES 2.5. Here is the patch for 2.5. Before I got the patch to 3.0 Sean made his changes. === modified file 'src/edu/mayo/bmi/lookup/algorithms

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
Hi Nick, I think that the bottleneck is probably the lookup module itself. So, I just sent you a secure email/ftp link. It contains a build of the new dictionary-lookup-fast module. Should you choose to try it, let me know how things turn out. Sean

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
To: dev@ctakes.apache.org Subject: RE: Ctakes to process 5000K recoreds Hi Sean, Many thanks, I will try it tomorrow. Do you have any special instruction to run that scrip or I have to use it with cTakes? Thanks, Nick -Original Message- From: Finan, Sean [mailto:sean.fi

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
Nikandish snika...@emerginghealthit.com wrote: Great. I will do that. Thanks again. Nick -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, September 09, 2014 4:39 PM To: dev@ctakes.apache.org Subject: RE: Ctakes to process 5000K recoreds

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
, Finan, Sean sean.fi...@childrens.harvard.edu wrote: There is a tool to generate a dictionary in the new format using the UMLS MR*** files. The module can also read directly from a file with bar-separated values: CUI|Text or CUI|TUI|Text which could be useful for small custom
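The bar-separated layouts mentioned in this message, CUI|Text and CUI|TUI|Text, are simple enough to check before loading a custom dictionary. The following is a minimal sketch of such a check, assuming one entry per line and no bar characters inside the text field. The class `BarSeparatedEntry` is hypothetical and is not the reader used by the dictionary-lookup-fast module; the sample values are made up.

```java
/** Hypothetical holder for one line of a CUI|Text or CUI|TUI|Text file. */
final class BarSeparatedEntry {
    final String cui;
    final String tui; // null when the line uses the two-column form
    final String text;

    private BarSeparatedEntry(String cui, String tui, String text) {
        this.cui = cui;
        this.tui = tui;
        this.text = text;
    }

    /** Parse one line; rejects anything other than two or three columns. */
    static BarSeparatedEntry parse(String line) {
        // A limit of -1 keeps trailing empty columns so they are counted.
        String[] parts = line.split("\\|", -1);
        if (parts.length == 2) {
            return new BarSeparatedEntry(parts[0].trim(), null, parts[1].trim());
        }
        if (parts.length == 3) {
            return new BarSeparatedEntry(parts[0].trim(), parts[1].trim(), parts[2].trim());
        }
        throw new IllegalArgumentException("Expected CUI|Text or CUI|TUI|Text: " + line);
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        BarSeparatedEntry entry = BarSeparatedEntry.parse("C0000002|T047|another term");
        System.out.println(entry.cui + " / " + entry.tui + " / " + entry.text);
    }
}
```

Running a custom file through a check like this first would surface malformed lines early; how the module itself treats them is not stated in the thread.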

RE: Ctakes to process 5000K records

2014-09-10 Thread Finan, Sean
: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, September 09, 2014 4:39 PM To: dev@ctakes.apache.org Subject: RE: Ctakes to process 5000K recoreds Just use it with cTakes. Instead of removing other modules from the pipeline, replace the dictionary-lookup with dictionary-lookup

RE: Ctakes to process 5000K records

2014-09-10 Thread Finan, Sean
/apache/ctakes/dictionary/fast/cTakesHsql.xml. Where should I add it to the classpath? Thanks, Nick -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, September 09, 2014 4:39 PM To: dev@ctakes.apache.org Subject: RE: Ctakes to process 5000K recoreds

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
Steve Bethard wrote: I spent some time writing a script for diff-ing CASes I urge anyone interested in comparing cTakes CASes / output to use this type of approach. Comparison of program output is a post-process task, and unless absolutely necessary code to juggle data and metadata belongs

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
/2014 07:30 AM, Finan, Sean wrote: Steve Bethard wrote: I spent some time writing a script for diff-ing CASes I urge anyone interested in comparing cTakes CASes / output to use this type of approach. Comparison of program output is a post-process task, and unless absolutely necessary code

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
different moons. Having consistent results helps us know if we've made improvements to our quality or not. c Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 08:50 AM, Finan, Sean wrote: Hi Kim, One might want compare the Sentence detector that uses end

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
that is in a predictable order makes checking to see if there are differences much cheaper when you are dealing with larger data sets. Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 08:50 AM, Finan, Sean wrote: Hi Kim, One might want compare

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
. Thanks, Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 10:46 AM, Finan, Sean wrote: Hi Kim, It concerns me a bit by making the code return consistent results would be so concerning. Could you please clarify what you mean

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 12:43 PM, Finan, Sean wrote: I'm just about sapped on this topic. What comes below is my final writing. Kim wrote: Yes, I mean actual type values not matching. Ok, this is a very serious problem and should

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
Hi Bruce, I would venture to say that this is neither expected nor desired. Before you fix it (or in addition to a fix), try to run with the new dictionary lookup. It will have a different behavior, and it will be the default dictionary lookup in future releases of cTakes – making fixes to

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
the necessary dictionary(ies) or how do I build them? [image: IMAT Solutions] http://imatsolutions.com Bruce Tietjen Senior Software Engineer [image: Mobile:] 801.634.1547 bruce.tiet...@imatsolutions.com On Wed, Oct 8, 2014 at 9:46 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
: IMAT Solutions] http://imatsolutions.com Bruce Tietjen Senior Software Engineer [image: Mobile:] 801.634.1547 bruce.tiet...@imatsolutions.com On Wed, Oct 8, 2014 at 9:46 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Bruce, I would venture to say that this is neither expected

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-09 Thread Finan, Sean
...@imatsolutions.com On Wed, Oct 8, 2014 at 10:02 AM, Finan, Sean sean.fi...@childrens.harvard.edumailto:sean.fi...@childrens.harvard.edu wrote: Hi Bruce, With Pei's help I just updated the sourceforge repo with the cTakes dictionaries. Checkout artifact ctakes-resources-snomed-rword-hsqldb-2011ab Sean

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-09 Thread Finan, Sean
, but that would probably be better than missing the annotation all together. [IMAT Solutions]http://imatsolutions.com Bruce Tietjen Senior Software Engineer [Mobile:]801.634.1547 bruce.tiet...@imatsolutions.commailto:bruce.tiet...@imatsolutions.com On Wed, Oct 8, 2014 at 10:02 AM, Finan, Sean sean.fi

RE: Need information regarding cTakes changes

2014-10-20 Thread Finan, Sean
Hi Chandu, For your note #2: 2)Any new features that can be added to current version of cTakes project to make it more useful. You can always check (or add to) the Jira future enhancement page at:

RE: ctakes-dictionary-lookup-fast

2014-11-07 Thread Finan, Sean
By Pei: As much as I hate maintaining more desc xml's, but I think it's prudent to create a separate one for a patch release temporarily for ctakes-dictionary-lookup-fast so users do not get blindsided by the change in output. By Sean: Excellent idea -Original Message- From:

RE: Announcement: UMLS MedGen-MySQL dataset now available as open access download

2014-11-14 Thread Finan, Sean
Hi Andy, Great stuff! I think that I understand the method, but I have a question about the statement: the content is publicly available per the NCBI policy and license for MedGen sources Does this mean that I, Joe Anybody, could download the content, place some of the content in a database

RE: Asking help for always unsuccessful AE load

2014-12-04 Thread Finan, Sean
Hi Jun, Do AE pipelines that do not use the Smoking Status module work? I think that Smoking Status configuration (via binary install) might be broken in the last several versions. I thought that I had submitted a Jira long, long ago, but right now I can't find it so maybe my memory is

RE: Scaling cTakes

2014-12-05 Thread Finan, Sean
Hi Brandon, It sounds like you've got a decent pipeline set up. To increase the speed you could try swapping out use of ctakes-dictionary-lookup with ctakes-dictionary-lookup-fast in the AE. Check ctakes-clinical-pipeline/desc/[ae]/AggregatePlaintextFastUMLSProcessor.xml for an example.

RE: Scaling cTakes

2014-12-09 Thread Finan, Sean
other suggestions on performance tuning would be great! Thanks, Brandon -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, December 05, 2014 1:14 PM To: dev@ctakes.apache.org Subject: RE: Scaling cTakes Hi Brandon, It sounds like you've got

RE: Links Not Working

2014-12-12 Thread Finan, Sean
Hi Kasie, cTakes is a community effort, so you've contacted the right people. Assuming that the Bug Tracker link in the navigation bar on the left works, please submit a report and list all of the orphan links. A kindly volunteer will fix them as soon as possible. Thanks, Sean

RE: revamping the Apache cTAKES website

2014-12-15 Thread Finan, Sean
Wow, I've just spent the last 2 hours doing the exact same thing. That is what I get for missing a meeting. Mine is extremely similar, though slightly different language (and without the improved performance bar chart - which may not belong). I also put the Examples in a big green button

RE: revamping the Apache cTAKES website

2014-12-15 Thread Finan, Sean
Anyway, a pretty amazing fresh start, thanks Pei -Original Message- From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu] Sent: Monday, December 15, 2014 4:33 PM To: dev@ctakes.apache.org Subject: RE: revamping the Apache cTAKES website Check out a mockup of a new website proposal:

RE: Problem running cTakes-clinical pipeline -- AggregatePlaintextFastUMLSProcessor.xml

2014-12-15 Thread Finan, Sean
Hi Yu, Also do you know is there any command line I can run to annotate like a thousand files automatically rather than copy and paster. You could try the CPE gui : bin/runctakesCPE.sh Sean From: Liang, Yu [mailto:yu.li...@nyumc.org] Sent: Monday, December 15, 2014 4:51 PM To:

RE: UMLS Integration

2014-12-15 Thread Finan, Sean
Hi Praveen, I think that this question might be better aimed at the nlm umls community. The standard cTakes installation does not follow this workflow. Sean -Original Message- From: Jay_Ram [mailto:pandupraveen...@gmail.com] Sent: Tuesday, December 16, 2014 12:10 AM To:

RE: intro video and ctakes youtube : Youtube Apache cTakes Channel Direct Link

2014-12-16 Thread Finan, Sean
On Mon, Dec 15, 2014 at 11:43 AM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hmmm, I can't find it in a search. However, here is a direct link: https://www.youtube.com/channel/UC8hQoOKz3v4PNEf6cqSkjbQ Maybe it needs a few videos to register in the search engine ? Sean

RE: intro video and ctakes youtube : Youtube Apache cTakes Channel Direct Link

2014-12-17 Thread Finan, Sean
video and ctakes youtube : Youtube Apache cTakes Channel Direct Link Isnt this to upload for my account? What about to the channel? On Tue, Dec 16, 2014 at 12:16 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi John, Look for an Upload button in the upper-left corner next

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Well, I guess that it is time for me to speak up … I must say that I’m happy that people are showing interest in the fast lookup. I am also happy (sort of) that some concerns are being raised – and that there is now community participation in my little toy. I have some concerns about what

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
One quick mention: The cTakes dictionaries are built with UMLS 2011AB. If the Human annotations were not done using the same UMLS version then there WILL be differences in CUI and Semantic group. I don't have time to go into it with details, examples, etc. just be aware that every 6 months

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Software Engineer [Office:]801.669.7342 kim.eb...@imatsolutions.commailto:greg.hub...@imatsolutions.com On 12/19/2014 11:31 AM, Finan, Sean wrote: One quick mention: The cTakes dictionaries are built with UMLS 2011AB. If the Human annotations were not done using the same UMLS version

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
:]801.669.7342 kim.eb...@imatsolutions.commailto:greg.hub...@imatsolutions.com On 12/19/2014 11:31 AM, Finan, Sean wrote: One quick mention: The cTakes dictionaries are built with UMLS 2011AB. If the Human annotations were not done using the same UMLS version then there WILL be differences in CUI

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Hi Bruce, I'm not sure how there would be fewer matches with the overlap processor. There should be all of the matches from the non-overlap processor plus those from the overlap. Decreasing from 215 to 211 is strange. Have you done any manual spot checks on this? It is really bizarre that

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Hi Bruce, Correction -- So far, I did steps 1 and 2 of Sean's email. No problem. Aside from recreating the database, those two steps have the greatest impact. But before you change anything else, please do some manual spot checks. I have never seen a case where the lookup would be so

RE: cTakes Annotation Comparison --- (^:

2014-12-19 Thread Finan, Sean
bruce.tiet...@imatsolutions.com On Fri, Dec 19, 2014 at 1:39 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Sorry, I meant “Do some spot checks on the validity”. In other words, when your script reports that a cui and/or span is missing, manually look at the data and see

RE: Using cTakes programmatically

2014-12-29 Thread Finan, Sean
Hi Maite Meseure, Check the cTakes User guide on UMLS setup: https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+User+Install+Guide#cTAKES3.2UserInstallGuide-(Recommended)AddUMLSaccessrights which (in part) points you towards obtaining a license to use the NIH UMLS dictionary:

RE: Question about the pipeline

2015-02-02 Thread Finan, Sean
Hi Tol (and Maite), I'm not entirely certain that I understand the question, but here is an attempt to help. If I'm oversimplifying then I apologize. I think that ExampleAggregatePipeline is intended to represent a very simple single-note pipeline and that custom code could be produced by

RE: Question about the pipeline

2015-02-03 Thread Finan, Sean
-at least in Eclipse, I haven't managed to run it via the command line yet. On Mon, Feb 2, 2015 at 7:12 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Hi Tol (and Maite), I'm not entirely certain that I understand the question, but here is an attempt to help. If I'm oversimplifying

RE: Question about the pipeline

2015-02-05 Thread Finan, Sean
in our HPC, it spawns a new job for each subfolder (which may have between 5 and 2500 notes). Todd Lingren Biomedical Informatics Cincinnati Children’s Hospital todd.ling...@cchmc.org 513-803-9032 -Original Message- From: Finan, Sean [mailto:sean.fi

RE: git mirrors out of sync?

2015-02-03 Thread Finan, Sean
Hi Steve, You are right (confirming your finding) - it looks like the first is a no-show and the second is somebody's personal upload to github (not git.apache.org) from 3 years ago. The jira claims that the item was closed (fixed), but if you go to

RE: Question about the pipeline

2015-02-03 Thread Finan, Sean
= createEngine(MyTokenizer.class); AnalysisEngine tagger = createEngine(MyTagger.class); runPipeline(jCas, tokenizer, tagger); for(Token token : iterate(jCas, Token.class)){ System.out.println(token.getTag()); } Tol O. Finan, Sean Sean.Finan@... writes: Hi Tol (and Maite), I'm not entirely

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-12 Thread Finan, Sean
Try something like the following for output: private int extractFeatures( final IdentifiedAnnotation annotation ) { // Extract the IdentifiedAnnotation itself final Collection<String> umlsInfos = getUmlsInfos( annotation, _printSnomed ); if ( umlsInfos == null ) {

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-12 Thread Finan, Sean
Oh yeah - use the -fast dictionary to get preferred text. The fastest way to get cuis only is with CuisOnlyPlaintextUMLSProcessor. If you want polarity make sure you uncomment the section with PolarityCleartkAnalysisEngine. Sean -Original Message- From: Maite Meseure Hugues

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-17 Thread Finan, Sean
...@gmail.com wrote: Thank you for your replies, It's helpful. I was working on 3.2.0 version, so it looks like 3.2.1 allows to get the UMLS preferred text. Maite On Thu, Feb 12, 2015 at 2:25 PM, Finan, Sean sean.fi...@childrens.harvard.edu wrote: Oh yeah - use the -fast dictionary to get

RE: CTAKES mirroring on github.

2015-02-17 Thread Finan, Sean
Our request is for a read-only mirror. However, if it ever becomes i/o, I don't know if this will have what you want, but http://git.apache.org/ Links to documentation (mostly server setup) http://www.apache.org/dev/git.html and a wiki (check toward middle and bottom for committer info)

RE: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison : Span Overlap addendum

2015-01-09 Thread Finan, Sean
have to be the best speed-wise, but that is the recommended for optimizing F1 measure. Regards, James -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, December 19, 2014 11:55 AM To: dev@ctakes.apache.org; kim.eb...@imatsolutions.com Subject: RE

RE: Question about fast pipeline

2015-01-12 Thread Finan, Sean
Hi Michelle, Did your error have only Could not find . as absolute or did it also have or in ... or in ...? If you see ... or in ... then this is a new issue. If you don't, then you should update your source. If you need to run the release binary then let me know and I can work out

RE: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison

2015-01-09 Thread Finan, Sean
looking for something that doesn't have to be the best speed-wise, but that is the recommended for optimizing F1 measure. Regards, James -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, December 19, 2014 11:55 AM To: dev@ctakes.apache.org; kim.eb

RE: Question about the pipeline

2015-02-05 Thread Finan, Sean
@ctakes.apache.org Subject: Re: Question about the pipeline Yes, it does but only in Eclipse, not in command line even though I am in the good directory. I have to look at the classpath more in details probably. Thanks for your replies. On Thu, Feb 5, 2015 at 8:08 AM, Finan, Sean sean.fi
