Hi Pei, does this mean that your proposal below is now ready to use? << FYI: I was proposing adding an additional attribute to store the description/preferredText (term) [1] since this information is already available in the dictionary lookup. I think most folks would find this useful in addition to just saving the CUI/Code. Otherwise, they would have to do another lookup further downstream to get the description of the CUI/Code. >>
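To make sure I understand the idea, here is a rough sketch using the example from the thread. Everything in it is an assumption for illustration: `Token`, `Hit`, `joinOriginalText` and `exampleHit` are made-up names, not the cTAKES type system classes and not the actual CTAKES-224/CTAKES-248 changes. It only shows that keeping the preferred text next to the CUI avoids a second lookup, and that keeping references to the matched tokens preserves their offsets, which a delimited string loses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT the cTAKES type system.
final class DisjointSpanSketch {

    // Stand-in for a token annotation: character offsets plus covered text.
    static final class Token {
        final int begin;
        final int end;
        final String text;

        Token(int begin, int end, String text) {
            this.begin = begin;
            this.end = end;
            this.text = text;
        }
    }

    // Stand-in for a dictionary hit that stores the CUI, the preferred
    // text from the lookup, and the tokens that actually matched.
    static final class Hit {
        final String cui;
        final String preferredText;
        final List<Token> originalTokens;

        Hit(String cui, String preferredText, List<Token> originalTokens) {
            this.cui = cui;
            this.preferredText = preferredText;
            this.originalTokens = originalTokens;
        }
    }

    // The delimited-string alternative: offsets are gone, and a token that
    // contained the delimiter itself would need escaping.
    static String joinOriginalText(Hit hit, String delimiter) {
        List<String> parts = new ArrayList<>();
        for (Token t : hit.originalTokens) {
            parts.add(t.text);
        }
        return String.join(delimiter, parts);
    }

    // Builds the example from the thread; offsets index into the sentence.
    static Hit exampleHit() {
        String sentence = "alcoholic liver disease was acute.";
        Token disease = new Token(16, 23, sentence.substring(16, 23));
        Token acute = new Token(28, 33, sentence.substring(28, 33));
        return new Hit("C0001314", "Acute Disease", List.of(disease, acute));
    }

    public static void main(String[] args) {
        Hit hit = exampleHit();
        // No second lookup needed for the description of the CUI.
        System.out.println(hit.cui + ": " + hit.preferredText);
        // Token references still carry the disjoint spans.
        for (Token t : hit.originalTokens) {
            System.out.println(t.text + " [" + t.begin + ", " + t.end + ")");
        }
        System.out.println(joinOriginalText(hit, "|")); // disease|acute
    }
}
```

If the real feature ends up as an array of token references, the delimited form can still be derived on demand, as `joinOriginalText` does here, but not the other way around.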
On Tuesday, October 22, 2013 4:01:35 PM, "Chen, Pei" <[email protected]> wrote:

Done.

> -----Original Message-----
> From: Masanz, James J. [mailto:[email protected]]
> Sent: Tuesday, October 22, 2013 2:33 PM
> To: '[email protected]'
> Subject: RE: CTAKES-248- include original covered text of NEs which can't be
> recovered post if NE is from a disjoint span
>
> Sure, if you would, that would be great. Thanks.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Chen, Pei
> Sent: Tuesday, October 22, 2013 1:30 PM
> To: [email protected]
> Subject: RE: CTAKES-248- include original covered text of NEs which can't be
> recovered post if NE is from a disjoint span
>
> James,
> I was making some changes to the ctakes common type system for CTAKES-224
> (Adding a field to save the UMLS term/text in addition to the CUI/Codes).
> Do you want me to also make originalText an FSArray<BaseToken> instead of
> String while I have these files open?
>
> --Pei
>
> > -----Original Message-----
> > From: Chen, Pei [mailto:[email protected]]
> > Sent: Wednesday, October 02, 2013 10:23 AM
> > To: [email protected]
> > Subject: RE: CTAKES-248- include original covered text of NEs which
> > can't be recovered post if NE is from a disjoint span
> >
> > +1 to have a pointer back to the BaseToken(s) rather than a String (so we
> > could get back the spans and other info if needed).
> > I think the atom will be slightly different, take for example:
> > Perhaps with an example:
> > Sentence/LookupWindow: "alcoholic liver disease was acute."
> > originalText: "disease acute" [New feature to store the Tokens that
> > were matched due to the permutations?]
> > UmlsConcept.cui: C0001314
> > UmlsConcept.preferredText: "Acute Disease" [New feature to store the
> > atom/text returned by the UMLS CUI]
> >
> > I also ran into a similar case where I wish
> > IdentifiedAnnotation.segmentID/SentenceID was the actual Segment type
> > and not a String.
> >
> > This is just my 2 cents... open to ideas though.
> > --Pei
> >
> > > -----Original Message-----
> > > From: Richard Eckart de Castilho [mailto:[email protected]]
> > > Sent: Wednesday, October 02, 2013 3:19 AM
> > > To: [email protected]
> > > Subject: Re: CTAKES-248- include original covered text of NEs which
> > > can't be recovered post if NE is from a disjoint span
> > >
> > > What benefit would it have to store a string with some separation
> > > character (which may mean that the separation character in the
> > > elements may need to be escaped), over using a feature of type
> > > FSArray<Token> pointing to the original segments?
> > >
> > > Not sure if that is what Karthik meant when referring to fetching
> > > the matched atom.
> > >
> > > -- Richard
> > >
> > > On 02.10.2013, at 01:46, Karthik Sarma <[email protected]> wrote:
> > >
> > > > Hmm, couldn't you just fetch the matched atom and use that? Should
> > > > be the same information (without, I suppose, the original ordering
> > > > and split).
> > > >
> > > > --
> > > > Karthik Sarma
> > > > UCLA Medical Scientist Training Program Class of 20??
> > > > Member, UCLA Medical Imaging & Informatics Lab
> > > > Member, CA Delegation to the House of Delegates of the American
> > > > Medical Association
> > > > [email protected]
> > > > gchat: [email protected]
> > > > linkedin: www.linkedin.com/in/ksarma
> > > >
> > > >
> > > > On Tue, Oct 1, 2013 at 12:37 PM, Masanz, James J.
> > > > <[email protected]> wrote:
> > > >
> > > >> Yes, this would help address that multiple permutations example.
> > > >> The new getOriginalText method would return something like
> > > >> "Acute|Disease". Right now I'm thinking of just using vertical
> > > >> bar as delimiter, to start with at least, but think it should be
> > > >> configurable.
> > > >>
> > > >> -----Original Message-----
> > > >> From: [email protected]
> > > >> [mailto:[email protected]] On Behalf Of Chen, Pei
> > > >> Sent: Tuesday, October 01, 2013 9:38 AM
> > > >> To: [email protected]
> > > >> Subject: CTAKES-248- include original covered text of NEs which
> > > >> can't be recovered post if NE is from a disjoint span
> > > >>
> > > >> This sounds pretty cool.
> > > >> James, will this address the multiple permutations lookup example:
> > > >> "Acute alcoholic liver disease." There is a cui: C0001314: Acute
> > > >> Disease, but if you getCoveredText(), on the UMLSConcept, you
> > > >> would actually get the same "Acute alcoholic liver disease"
> > > >> instead of "Acute Disease".
> > > >> So, there is a new field called getOriginalText() that matched the hit?
> > > >>
> > > >>> -----Original Message-----
> > > >>> From: [email protected] [mailto:james-[email protected]]
> > > >>> Sent: Monday, September 30, 2013 5:49 PM
> > > >>> To: [email protected]
> > > >>> Subject: svn commit: r1527792 -
> > > >>> /ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
> > > >>>
> > > >>> Author: james-masanz
> > > >>> Date: Mon Sep 30 21:48:01 2013
> > > >>> New Revision: 1527792
> > > >>>
> > > >>> URL: http://svn.apache.org/r1527792
> > > >>> Log:
> > > >>> CTAKES-248 - for named entities, since the annotation just has
> > > >>> the begin and end offset, it is requested to have a way to get
> > > >>> the original covered text (especially for disjoint spans) so it
> > > >>> is possible to know which words in the covered text were actually
> > > >>> used in the matching to the dictionary entry
> > > >>>
> > > >>> Modified:
> > > >>> ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
> > > >>>
> > > >>> Modified: ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
> > > >>> URL: http://svn.apache.org/viewvc/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml?rev=1527792&r1=1527791&r2=1527792&view=diff
> > > >>> ==============================================================================
> > > >>> Binary files - no diff available.
