Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
are any consequences with moving >>> forward with changing the code >>>> Why do you say this? >>>> >>>> I think that there may be more required changes than you realize. >>>> Every >>> insertion into the CAS must be of ordered da

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
the original authors or current users would speak up as to what might be best. Personally I have no idea. Anyway, great catch! Sean -Original Message- From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] Sent: Tuesday, October 07, 2014 3:11 PM To: dev@ctakes.apache.org Subject: R

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
> Sean > > > -Original Message----- > From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com] > Sent: Tuesday, October 07, 2014 1:21 PM > To: dev@ctakes.apache.org > Subject: Re: cTakes output predictability > > I did not intend to step on anyone's toes. >

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
equirement tests before all of your changes and then >> again after your changes >>> I'm actually curious about how much memory might be eaten with linkages >> everywhere >>> 4. Run performance (speed) tests before and after >>> On a large corpus to ensure that

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
et...@perfectsearchcorp.com] Sent: Tuesday, October 07, 2014 1:21 PM To: dev@ctakes.apache.org Subject: Re: cTakes output predictability I did not intend to step on anyone's toes. One of the reasons I proposed the changes was to try to make it extremely obvious when there are significant dif

Re: cTakes output predictability

2014-10-07 Thread Bruce Tietjen
>>>> On 10/07/2014 08:50 AM, Finan, Sean wrote: > >>>> Hi Kim, > >>>> > >>>> One might want to compare the Sentence detector that uses end of line > >>>> characters as sentence splitters with one that does not. Such a > >>>> change

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
of the ordering requirement > By mandating such a rule you are assuming responsibility for it > > > -Original Message- > From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] > Sent: Tuesday, October 07, 2014 11:57 AM > To: dev@ctakes.apache.org > Subject: Re: cTakes out

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
ance of the ordering requirement > By mandating such a rule you are assuming responsibility for it > > > -Original Message- > From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] > Sent: Tuesday, October 07, 2014 11:57 AM > To: dev@ctakes.apache.org > Subject: Re: cTake

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
searchcorp.com] Sent: Tuesday, October 07, 2014 11:57 AM To: dev@ctakes.apache.org Subject: Re: cTakes output predictability I think we may really prefer the first method. Since it doesn't appear that there are any consequences with moving forward with changing the code, we would really like to mov

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
Jay, I agree. This does lead to reproducible unit tests, which helps us out in the long term. Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/06/2014 05:38 PM, jay vyas wrote: > I'm not a ctakes expert by any means, but in general, I like that idea > predi

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
g would not only affect the sentence type >>>>> discoveries but also practically every type that follows. >>>>> >>>>> Another might want to compare a note with "skin cancer" vs. one in >>>>> which you replace "skin cancer"

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
> opening the code for. > > -Original Message- > From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] > Sent: Tuesday, October 07, 2014 10:56 AM > To: dev@ctakes.apache.org > Subject: Re: cTakes output predictability > > I think we may really prefer the first metho

Re: cTakes output predictability

2014-10-07 Thread britt fitch
you replace "skin cancer" with "melanoma" just to see what the >>>> CUI differences might be. There are changes in two words vs. one, >>>> 11 characters vs. 8, a removed adjective(?), and of course changes >>>> in CUIs. >>>> >>>> O

RE: cTakes output predictability

2014-10-07 Thread Masanz, James J.
AM To: dev@ctakes.apache.org Subject: Re: cTakes output predictability I think we may really prefer the first method. Since it doesn't appear that there are any consequences with moving forward with changing the code, we would really like to move forward with this approach. Kim Ebert 1.801.669

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
"melanoma" just to see what the >>> CUI differences might be. There are changes in two words vs. one, >>> 11 characters vs. 8, a removed adjective(?), and of course changes >>> in CUIs. >>> >>> Of course, if you are just running notes on a new moo

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
Sent: Tuesday, October 07, 2014 11:30 AM To: dev@ctakes.apache.org Subject: Re: cTakes output predictability Hi Sean, Well of course that makes plenty of sense. Testing different cTakes configurations you would expect different output. In our testing we've found several cases where runn

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
e a new "cas-consumer" that writes output in a desired format - but this >> should not require changes to engines. >> >> "If it ain't broke, don't fix it" >> >> Sean >> >> >> -Original Message- >> From: St

Re: cTakes output predictability

2014-10-07 Thread britt fitch
rse, if you are just running notes on a new moon and then again on a >> full moon ... >> >> Sean >> >> -Original Message- >> From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] >> Sent: Tuesday, October 07, 2014 10:41 AM >> To: dev@ctakes.apache

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
you are just running notes on a new moon and then again on a full moon ... Sean -Original Message- From: Kim Ebert [mailto:kim.eb...@perfectsearchcorp.com] Sent: Tuesday, October 07, 2014 10:41 AM To: dev@ctakes.apache.org Subject: Re: cTakes output predictability Sean, "...be

Re: cTakes output predictability

2014-10-07 Thread Kim Ebert
- > From: Steven Bethard [mailto:steven.beth...@gmail.com] > Sent: Monday, October 06, 2014 11:23 PM > To: dev@ctakes.apache.org > Subject: Re: cTakes output predictability > > On Mon, Oct 6, 2014 at 3:59 PM, Bruce Tietjen > wrote: >> Since I started working with cTakes so

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
nes. "If it ain't broke, don't fix it" Sean -Original Message- From: Steven Bethard [mailto:steven.beth...@gmail.com] Sent: Monday, October 06, 2014 11:23 PM To: dev@ctakes.apache.org Subject: Re: cTakes output predictability On Mon, Oct 6, 2014 at 3:59 PM, Bruce Tie

Re: cTakes output predictability

2014-10-06 Thread Steven Bethard
On Mon, Oct 6, 2014 at 3:59 PM, Bruce Tietjen wrote: > Since I started working with cTakes some time ago, I have found it > difficult to compare the output between subsequent runs on the same files > because annotations are often assigned different IDs, are listed in > different order, etc. At on

Re: cTakes output predictability

2014-10-06 Thread Britt Fitch
Before making changes to the data structure I think it would be good to understand the use case. Bruce, can you give a high level description of the issue you are trying to solve? Cheers, Britt On Mon, Oct 6, 2014 at 7:38 PM, jay vyas wrote: > I'm not a ctakes expert by any means, but in

Re: cTakes output predictability

2014-10-06 Thread jay vyas
I'm not a ctakes expert by any means, but in general, I like that idea: predictable and deterministic ordering of mapped elements almost always leads to less buggy applications. As groovy has shown (LinkedHashMap is the default data structure and it's much easier imo to get reproducible groovy uni
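
For readers following the thread, the point about LinkedHashMap can be illustrated with a small Java sketch. This is not code from cTAKES: the class name, the helper methods, and the labels "sentence", "token", and "chunk" are all made up for illustration. It only shows the documented behavior that a LinkedHashMap iterates in insertion order, which is the property that makes output comparable between runs, whereas a plain HashMap gives no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from cTAKES.
class OrderedOutputSketch {

    // Builds a map whose iteration order is the insertion order.
    static Map<String, Integer> buildOrdered() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("sentence", 1);
        counts.put("token", 2);
        counts.put("chunk", 3);
        return counts;
    }

    // Returns the keys in the order the map iterates over them.
    static List<String> iterationOrder(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // With LinkedHashMap this order is the same on every run.
        System.out.println(iterationOrder(buildOrdered()));
    }
}
```

Swapping LinkedHashMap for HashMap in buildOrdered would still compile, but the iteration order would then depend on hashing details rather than on insertion order, so two otherwise identical runs are harder to diff reliably.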