Re: Lucene for UMLS2014

2014-07-22 Thread Harpreet Khanduja
Thank you so much for your help.

Harpreet.



On Mon, Jul 21, 2014 at 6:28 PM, Finan, Sean 
sean.fi...@childrens.harvard.edu wrote:

 Hi Harpreet,

 If you are willing to use cTakes 3.2, try the dictionary-lookup-fast
 module as a replacement of the default dictionary-lookup.  That module has
 a new dictionary resource (hsql, not lucene) and slightly different methods
 for lookup and matching.  In time trials it has been faster than the
 default module (hence the name).  Accuracy depends upon the parameter
 settings, but in the tests performed so far the results are comparable or
 better.  The new dictionary is much leaner than the current default
 dictionary, small enough to port from the hsql cached version to a hsql
 in-memory version.  Using the in-memory version makes dictionary lookup
 practically instantaneous (hundredths of a second).  Limited documentation
 is available in the module's doc/ directory.

 I will be on vacation for a week, but please don't hesitate to write if
 you have any questions.

 Sean
 
 From: Harpreet Khanduja [hsk5...@rit.edu]
 Sent: Thursday, July 17, 2014 5:07 PM
 To: dev@ctakes.apache.org
 Subject: Lucene for UMLS2014

 Hello,
 I would be grateful if someone could help.

 I created a lucene index for umls2014 but only for snomed vocabulary.
 I did this because I thought this would reduce the dictionary look up
 time.
 But it is still almost the same. Is there any other way to improve the
 dictionary look up time?

 Thank you,
 Harpreet



RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

2014-07-22 Thread Chen, Pei
Thanks James.
I was planning on closing the vote today.
In the meantime, does anyone know a quick way to clone/rename the wiki 
documentation for 3.2?
--Pei

 -Original Message-
 From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
 Sent: Monday, July 21, 2014 4:25 PM
 To: 'dev@ctakes.apache.org'
 Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
 Here's the additional testing I've done:
 
 I ran mvn test with 0 Failures and 0 Errors.
 Ran the AggregateTemplateFiller.xml and received same output (except for
 internal UIMA identifiers) with rc2 as I did with 3.1.1.
 
 +1 to release
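
 A minimal sketch of how such an output comparison could be automated. It assumes the two runs were saved as XMI text and that the internal identifiers that differ between runs appear as xmi:id="<digits>" attributes; the class name, method names, and sample strings are hypothetical and are not part of cTAKES or UIMA. Attributes that refer to other annotations by id are not normalized here.

```java
import java.util.regex.Pattern;

public class XmiCompare {
    // Assumption: run-specific identifiers appear as xmi:id="<digits>" attributes.
    private static final Pattern XMI_ID = Pattern.compile("xmi:id=\"\\d+\"");

    /** Replaces every xmi:id value with a fixed placeholder so only the remaining content is compared. */
    static String normalize(String xmi) {
        return XMI_ID.matcher(xmi).replaceAll("xmi:id=\"_\"");
    }

    /** True when the two serialized outputs are identical once xmi:id values are ignored. */
    static boolean sameIgnoringIds(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    public static void main(String[] args) {
        // Simplified, made-up fragments standing in for output from two releases.
        String fromRc2 = "<EventMention xmi:id=\"42\" begin=\"0\" end=\"5\"/>";
        String from311 = "<EventMention xmi:id=\"17\" begin=\"0\" end=\"5\"/>";
        System.out.println(sameIgnoringIds(fromRc2, from311));
    }
}
```

 Comparing normalized text keeps the check simple; a stricter comparison would parse the XML and also remap id-valued references.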
 
 -Original Message-
 From: Masanz, James J.
 Sent: Wednesday, July 16, 2014 3:59 PM
 To: 'dev@ctakes.apache.org'
 Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
 FYI, so far I have done the following steps:
 
 downloaded the source archive
 compiled it using: mvn compile
 downloaded the separately available resources
 set up classpath to include e.g. jars (from the bin distribution)
 set ctakes.umlsuser and ctakes.umlspw env vars
 ran runctakesCVD.bat
 loaded AggregatePlaintextUMLSProcessor.xml
 ran against some simple text.
 verified it did not throw an exception.
 verified some EventMention and EntityMention annotations were produced.
 
 I will do more testing tomorrow. Just giving a status update.
 
 --James
 
 -Original Message-
 From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
 Sent: Saturday, July 12, 2014 6:24 AM
 To: dev@ctakes.apache.org
 Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
 Agreed on that.
 
 I downloaded the new resources binary and was able to run my tests on the
 -bin version of the RC.
 
 +1 for making this the release.
 
 Tim
 
 
 
 From: Masanz, James J. [masanz.ja...@mayo.edu]
 Sent: Friday, July 11, 2014 7:27 PM
 To: 'dev@ctakes.apache.org'
 Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
 I agree about keeping the thread open.
 
 -- James
 
 -Original Message-
 From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
 Sent: Friday, July 11, 2014 4:28 PM
 To: dev@ctakes.apache.org
 Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
 Updated the lvg.properties file within ctakes-resources on sourceforge [1].
 Since the Apache cTAKES artifacts didn't change, I would like to keep this
 VOTE thread open.
 
 Also renamed it to 3.2.0 (even though they technically do not have to follow
 each other, but probably nice to keep it consistent for users as James
 suggested.)

 [1] http://sourceforge.net/projects/ctakesresources/files/ctakes-resources-3.2.0.zip/download
 
  -Original Message-
  From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
  Sent: Thursday, July 10, 2014 5:53 PM
  To: 'dev@ctakes.apache.org'
  Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
  Can you also give ctakesresources the number 3.2 or 3.2.0 instead of
  3.1.3?
 
  -Original Message-
  From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
  Sent: Thursday, July 10, 2014 2:12 PM
  To: dev@ctakes.apache.org
  Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
  I think this is due to the fact that the default lvg.properties also
  exists in the ctakes-resources project, so if you download and replace,
  it will override the one configured by cTAKES.
  I think it's a bug, but it has probably always been there...
  I'll fix up ctakes-resources on sourceforge nevertheless, but it
  shouldn't require any changes to the release candidates.
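
  A minimal sketch of one way to spot this kind of stale, hardcoded path in a properties file before running. It only assumes a standard Java properties format; the class name, method name, and the key names in the sample are hypothetical, and the path is the one from the exception reported in this thread. It is not how cTAKES or LVG load their configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class PropertiesPathCheck {
    /** Returns key -> value for every value that looks like an absolute path but does not exist locally. */
    static Map<String, String> missingPaths(Properties props) {
        Map<String, String> missing = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            String value = props.getProperty(key).trim();
            // Heuristic: treat values starting with "/" as file-system paths.
            if (value.startsWith("/") && !Files.exists(Paths.get(value))) {
                missing.put(key, value);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical keys; in practice the Properties would be loaded from the lvg.properties file in use.
        String sample =
            "EXAMPLE_STOP_WORDS=/export/home/lu/Development/LVG/lvg2008/data/misc/stopWords.data\n"
          + "EXAMPLE_OTHER_SETTING=value\n";
        Properties props = new Properties();
        props.load(new StringReader(sample));
        System.out.println(missingPaths(props));
    }
}
```

  Running a check like this against both copies of lvg.properties would show which one carries paths from another machine.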
 
   -Original Message-
   From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
   Sent: Thursday, July 10, 2014 11:59 AM
   To: 'dev@ctakes.apache.org'
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   Hi Tim,
  
   When you say that it didn't seem to affect the run, were you
   comparing output to the last release or just checking if the data
   seemed OK at a glance?
  
   -Original Message-
   From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
   Sent: Thursday, July 10, 2014 7:29 AM
   To: dev@ctakes.apache.org
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   I was able to run the binary without issues this time.
   I also downloaded the resources from sourceforge and integrated into
   the bin release and ran with the ctakes dictionary.
  
   I did get some weird exceptions thrown that didn't seem to affect
   the run -- looks like some hardcoded file paths in LVG? (See below)
  
   Tim
  
  
   Exception: java.io.FileNotFoundException:
   /export/home/lu/Development/LVG/lvg2008/data/misc/stopWords.data
   (No such file or directory)
   ** Error: problem of opening/reading stop words file:
   '/export/home/lu/Development/LVG/lvg2008/data/misc/stopWords.data'.
   Exception: java.io.FileNotFoundException:
   /export/home/lu/Development/LVG/lvg2008/data/misc/nonInfoWords.data
   (No such file or directory)
   ** Error: problem of opening/reading non-Info words file:
  
 
 

RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

2014-07-22 Thread Masanz, James J.
When I asked Troy that question for 3.1.1, he didn't know of a way, and I don't 
either, which is why I had the 3.1.1 page mostly just reference the 3.2 
documentation.

-Original Message-
From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu] 
Sent: Tuesday, July 22, 2014 10:00 AM
To: dev@ctakes.apache.org
Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

Thanks James.
I was planning on closing the vote today.
In the meantime, does anyone know a quick way to clone/rename the wiki 
documentation for 3.2?
--Pei


RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

2014-07-22 Thread Bleeker, Troy C.
One page at a time. At least there's that.

Thanks
Troy
-Original Message-
From: Masanz, James J. 
Sent: Tuesday, July 22, 2014 10:38 AM
To: 'dev@ctakes.apache.org'
Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

When I asked Troy that question for 3.1.1, he didn't know of a way, and I don't 
either, which is why I had the 3.1.1 page mostly just reference the 3.2 
documentation.


Re: Lucene for UMLS2014

2014-07-22 Thread Harpreet Khanduja
Hello,
I am using cTAKES 3.1.1 in Eclipse and I have added my customizations to
the project, but now I want to update it to 3.2 so that I can use
ctakes-dictionary-lookup-fast.
Is there any way to update the whole cTAKES project to 3.2 without my
customizations getting removed?

It would be a great help.

Thank you,

Harpreet





On Tue, Jul 22, 2014 at 10:53 AM, Harpreet Khanduja hsk5...@g.rit.edu
wrote:

 Thank you so much for your help.

 Harpreet.




RE: Lucene for UMLS2014

2014-07-22 Thread Masanz, James J.
Did you download the source and import it into Eclipse, or did you check out 
3.1.1 from SVN?
If you checked it out from SVN, did you check it out from trunk or from the 
tag for 3.1.1?

-- James

-Original Message-
From: Harpreet Khanduja [mailto:hsk5...@rit.edu] 
Sent: Tuesday, July 22, 2014 12:49 PM
To: dev@ctakes.apache.org
Subject: Re: Lucene for UMLS2014

Hello,
   I am using ctakes 3.1.1 in eclipse and I have added my customizations to
the project, but now I want to update it to 3.2 so that I can use
   ctakes-dictionary-lookup-fast.
   Is there any way to update the whole ctakes project to 3.2 without my
customizations getting removed?

  It would be a great help.

Thank you,

Harpreet






Re: Lucene for UMLS2014

2014-07-22 Thread Harpreet Khanduja
Hello,

I checked out 3.1.1 from trunk SVN.

Thank you



On Tue, Jul 22, 2014 at 2:29 PM, Masanz, James J. masanz.ja...@mayo.edu
wrote:

 Did you download the source and import it into Eclipse, or did you check out
 3.1.1 from SVN?
 If you checked it out from SVN, did you check it out from trunk or from
 the tag for 3.1.1?

 -- James




RE: Lucene for UMLS2014

2014-07-22 Thread Masanz, James J.

I'm not an SVN guru, but you can use Team > Update to get the latest version of 
everything you have not customized. SVN will also tell you about any conflicts, 
and you can then merge your customizations into the latest code. I've done this 
when I haven't had many customizations to preserve.

To get the new dictionary lookup (sub)project, you might have to import it 
separately, for example by going into the SVN Repository Exploring view and 
using the "Check out as Maven Project" menu option on that (sub)project.

-Original Message-
From: Harpreet Khanduja [mailto:hsk5...@rit.edu] 
Sent: Tuesday, July 22, 2014 2:32 PM
To: dev@ctakes.apache.org
Subject: Re: Lucene for UMLS2014

Hello,

I checked out 3.1.1 from trunk SVN.

Thank you






Re: Lucene for UMLS2014

2014-07-22 Thread Harpreet Khanduja
I will try to do the same.

Thank you,

Harpreet


On Tue, Jul 22, 2014 at 4:11 PM, Masanz, James J. masanz.ja...@mayo.edu
wrote:


 I'm not an svn guru, but you can use Team-Update to get the latest of all
 the things you have not customized, plus SVN will tell you of the
 conflicts, and you can merge your customizations into the latest. I've done
 it when I haven't had many customizations to preserve.

 To get the new dictionary lookup (sub)project, you might have to do
 something to get it imported, such as going into the SVN repository
 exploring view and use Check out as Maven Project menu option on that
 (sub)project.

 



RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

2014-07-22 Thread Chen, Pei
There are currently no guides on the Confluence wiki for cTAKES 3.2.0...
I was thinking of just cloning 3.1.1
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1.1

And just add the YTEX and/or any new changes to it...
Would be grateful for any help here...

 -Original Message-
 From: John Green [mailto:john.travis.gr...@gmail.com]
 Sent: Tuesday, July 22, 2014 2:37 PM
 To: dev@ctakes.apache.org
 Subject: Re: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
 What exactly needs updated? I have not had the time (unfortunately) to
 help with this project very much because of the steep learning curve on the
 technology. I'm currently on some protected research time working with
 cTakes as of this week and would be happy to help with some grunt work.
 
 JG
 
 
 On Tue, Jul 22, 2014 at 11:39 AM, Bleeker, Troy C. bleeker.t...@mayo.edu
 wrote:
 
  One page at a time. At least there's that.
 
  Thanks
  Troy
  -Original Message-
  From: Masanz, James J.
  Sent: Tuesday, July 22, 2014 10:38 AM
  To: 'dev@ctakes.apache.org'
  Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
  When I asked Troy that question for 3.1.1, he didn't know of a way,
  and I don't either, which is why I had the 3.1.1 page mostly just
  reference the
  3.2 documentation.
 
  -Original Message-
  From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
  Sent: Tuesday, July 22, 2014 10:00 AM
  To: dev@ctakes.apache.org
  Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
  Thanks James.
  I was planning on closing the vote today.
  In the meantime, does anyone know a quick way to clone/rename the wiki
  documentation for 3.2?
  --Pei
 
   -Original Message-
   From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
   Sent: Monday, July 21, 2014 4:25 PM
   To: 'dev@ctakes.apache.org'
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   Here's the additional testing I've done:
  
   I ran mvn test with 0 Failures and 0 Errors.
   Ran the AggregateTemplateFiller.xml and received same output (except
   for internal UIMA identifiers) with rc2 as I did with 3.1.1.
  
   +1 to release
  
   -Original Message-
   From: Masanz, James J.
   Sent: Wednesday, July 16, 2014 3:59 PM
   To: 'dev@ctakes.apache.org'
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   FYI, so far I have done the following steps:
  
   downloaded the source archive
   compiled it using: maven compile
   downloaded the separately available resources
   set up classpath to include e.g. jars (from the bin distribution)
   set ctakes.umlsuser and ctakes.umlspw env vars
   ran runctakesCVD.bat
   loaded AggregatePlaintextUMLSProcessor.xml
   ran against some simple text.
   verified it did not throw an exception.
   verified some EventMention and EntityMention annotations were produced.
  
   I will do more testing tomorrow. Just giving a status update.
  
   --James
  
   -Original Message-
   From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
   Sent: Saturday, July 12, 2014 6:24 AM
   To: dev@ctakes.apache.org
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   Agreed on that.
  
   I downloaded the new resources binary and was able to run my tests
   on the - bin version of the RC.
  
   +1 for making this the release.
  
   Tim
  
  
   
   From: Masanz, James J. [masanz.ja...@mayo.edu]
   Sent: Friday, July 11, 2014 7:27 PM
   To: 'dev@ctakes.apache.org'
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   I agree about keeping the thread open.
  
   -- James
  
   -Original Message-
   From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
   Sent: Friday, July 11, 2014 4:28 PM
   To: dev@ctakes.apache.org
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   Updated the lvg.properties file within ctakes-resources on
   SourceForge [1].
   Since the Apache cTAKES artifacts didn't change, I would like to
   keep this VOTE thread open.
  
   Also renamed it to 3.2.0 (even though they technically do not have
   to follow each other, but probably nice to keep it consistent for
   users as James suggested.)
   [1] http://sourceforge.net/projects/ctakesresources/files/ctakes-resources-3.2.0.zip/download
  
-Original Message-
From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
Sent: Thursday, July 10, 2014 5:53 PM
To: 'dev@ctakes.apache.org'
Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
   
Can you also give ctakesresources the number 3.2 or 3.2.0 instead of
3.1.3?
   
-Original Message-
From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
Sent: Thursday, July 10, 2014 2:12 PM
To: dev@ctakes.apache.org
Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
   
I think this is due to the fact that the default lvg.properties
also exists in the ctakes-resources project, so if you download and
replace, it will override the 

RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

2014-07-22 Thread John Green
I'll play with it tonight or tomorrow night.


JG

On Tue, Jul 22, 2014 at 4:27 PM, Chen, Pei pei.c...@childrens.harvard.edu
wrote:

 There are currently no guides on the Confluence wiki for cTAKES 3.2.0...
 I was thinking of just cloning 3.1.1
 https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1.1
 And just add the YTEX and/or any new changes to it...
 Would be grateful for any help here...
 -Original Message-
 From: John Green [mailto:john.travis.gr...@gmail.com]
 Sent: Tuesday, July 22, 2014 2:37 PM
 To: dev@ctakes.apache.org
 Subject: Re: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
 What exactly needs to be updated? I have not had the time (unfortunately) to
 help with this project very much because of the steep learning curve on the
 technology. I'm currently on some protected research time working with
 cTakes as of this week and would be happy to help with some grunt work.
 
 JG
 
 
 On Tue, Jul 22, 2014 at 11:39 AM, Bleeker, Troy C. bleeker.t...@mayo.edu
 wrote:
 
  One page at a time. At least there's that.
 
  Thanks
  Troy
  -Original Message-
  From: Masanz, James J.
  Sent: Tuesday, July 22, 2014 10:38 AM
  To: 'dev@ctakes.apache.org'
  Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
  When I asked Troy that question for 3.1.1, he didn't know of a way,
  and I don't either, which is why I had the 3.1.1 page mostly just
  reference the 3.2 documentation.
 
  -Original Message-
  From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
  Sent: Tuesday, July 22, 2014 10:00 AM
  To: dev@ctakes.apache.org
  Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
 
  Thanks James.
  I was planning on closing the vote today.
  In the meantime, does anyone know a quick way to clone/rename the wiki
  documentation for 3.2?
  --Pei
 
   -Original Message-
   From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
   Sent: Monday, July 21, 2014 4:25 PM
   To: 'dev@ctakes.apache.org'
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   Here's the additional testing I've done:
  
   I ran mvn test with 0 Failures and 0 Errors.
   Ran the AggregateTemplateFiller.xml and received same output (except
   for internal UIMA identifiers) with rc2 as I did with 3.1.1.
  
   +1 to release
  
   -Original Message-
   From: Masanz, James J.
   Sent: Wednesday, July 16, 2014 3:59 PM
   To: 'dev@ctakes.apache.org'
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   FYI, so far I have done the following steps:
  
   downloaded the source archive
   compiled it using: maven compile
   downloaded the separately available resources
   set up classpath to include e.g. jars (from the bin distribution)
   set ctakes.umlsuser and ctakes.umlspw env vars
   ran runctakesCVD.bat
   loaded AggregatePlaintextUMLSProcessor.xml
   ran against some simple text.
   verified it did not throw an exception.
   verified some EventMention and EntityMention annotations were produced.
  
   I will do more testing tomorrow. Just giving a status update.
  
   --James
  
   -Original Message-
   From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
   Sent: Saturday, July 12, 2014 6:24 AM
   To: dev@ctakes.apache.org
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   Agreed on that.
  
   I downloaded the new resources binary and was able to run my tests
   on the - bin version of the RC.
  
   +1 for making this the release.
  
   Tim
  
  
   
   From: Masanz, James J. [masanz.ja...@mayo.edu]
   Sent: Friday, July 11, 2014 7:27 PM
   To: 'dev@ctakes.apache.org'
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   I agree about keeping the thread open.
  
   -- James
  
   -Original Message-
   From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
   Sent: Friday, July 11, 2014 4:28 PM
   To: dev@ctakes.apache.org
   Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
  
   Updated the lvg.properties file within ctakes-resources on
   SourceForge [1].
   Since the Apache cTAKES artifacts didn't change, I would like to
   keep this VOTE thread open.
  
   Also renamed it to 3.2.0 (even though they technically do not have
   to follow each other, but probably nice to keep it consistent for
   users as James suggested.)
   [1] http://sourceforge.net/projects/ctakesresources/files/ctakes-resources-3.2.0.zip/download
  
-Original Message-
From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
Sent: Thursday, July 10, 2014 5:53 PM
To: 'dev@ctakes.apache.org'
Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
   
Can you also give ctakesresources the number 3.2 or 3.2.0 instead of
3.1.3?
   
-Original Message-
From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
Sent: Thursday, July 10, 2014 2:12 PM
To: dev@ctakes.apache.org
Subject: RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)
   
I