Here it is:

1. The cTAKES type system represents syntax trees with three types: TopTreebankNode, TreebankNode, and TerminalTreebankNode - Understood.
2. The parser works at the sentence level, so a standard thing is to get all trees/sentences at once by doing: for(TopTreebankNode tree : JCasUtil.select(jcas, TopTreebankNode.class)) - Understood.

My question is that a single word in a sentence may belong to several types simultaneously. How does the associated type class get stored in the nodes of the tree, so that when we parse the tree/sentence we can select the type of interest and its associated features/attributes? What I want to understand is what the key/value pairs of each node are. Basically, so that the following code works:

List<DiseaseDisorderMention> disease = new ArrayList<>(JCasUtil.select(jcas, DiseaseDisorderMention.class)); // DiseaseDisorderMention is the selected type class to be extracted

Hope I am clearer this time.

Anir

On Tue, May 20, 2014 at 4:32 PM, Miller, Timothy <timothy.mil...@childrens.harvard.edu> wrote:

> I don't understand this question. Can you try to rephrase it? Or maybe if
> you tell me what you want to do that would help me understand.
>
> ________________________________________
> From: Anirban Chakraborti [chakraborti.anir...@googlemail.com]
> Sent: Tuesday, May 20, 2014 6:34 AM
> To: dev@ctakes.apache.org
> Subject: Re: markable types
>
> thanks again Timothy
>
> final question for now
>
> You had explained that each sentence is parsed and converted to a
> tree with head and terminal nodes. Is the type system of cTAKES a feature
> of the node, i.e. can one node belong to two or more type systems and their
> further attributes, OR is there, for each type system, a syntax tree for
> every sentence parsed? I mean a sentence has various trees attached to it
> but there is a 1:1 mapping between the node and the type system.
>
> Anir
>
> On Tue, May 20, 2014 at 2:17 AM, Miller, Timothy <timothy.mil...@childrens.harvard.edu> wrote:
>
> > On 05/18/2014 07:40 AM, Anirban Chakraborti wrote:
> > > Timothy,
> > >
> > > 1. so to get concepts of procedure, lab (if any), disease disorder, sign
> > > symptoms, anatomical sites, I would need to do
> > >
> > > List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas, MedicationMention.class));
> > > List<DiseaseDisorderMention> disease = new ArrayList<>(JCasUtil.select(jcas, DiseaseDisorderMention.class));
> > > List<SignSymptomMention> signs = new ArrayList<>(JCasUtil.select(jcas, SignSymptomMention.class));
> > > List<AnatomicalSiteMention> anatomy = new ArrayList<>(JCasUtil.select(jcas, AnatomicalSiteMention.class));
> > > List<LabMention> labs = new ArrayList<>(JCasUtil.select(jcas, LabMention.class));
> > >
> > > then check the size of the array { meds, disease, signs, anatomy, labs },
> > > print out the array or make a new array using the java.util.List or
> > > java.util.ArrayList interfaces as the case might be. Right ...
> > yep
> > > 2. I am more interested in the IdentifiedAnnotation class. However there
> > > are concepts like FractionAnnotation which are not defined as an enum in
> > > const.java. How do I handle them? Do I need to add them to the const.java
> > > file?
> > nope, you probably just want EntityMention (for anatomical sites) and
> > EventMention (for all clinical events, including DiseaseDisorder,
> > Procedure, SignSymptom, etc.).
> >
> > > 3. What exactly is the functional difference between, say,
> > > MedicationEventMention.java, MedicationMention.java, Medication.java and
> > > MedicationEventMention_type.java? I understand a similar difference exists
> > > between the classes for lab, procedure, etc.
> > The types ending in _type.java are UIMA-internal types, you can ignore.
> > Medication is a referential type -- something in the real world that
> > could be referred to multiple times in a document. What you probably
> > want are the mention types.
> > Here I believe MedicationMention is the
> > preferred type going forward for a particular mention of a medication in
> > text (MedicationEventMention is the same thing but not preferred going
> > forward).
> >
> > > 4. You had explained that each sentence is parsed and converted to a
> > > tree with head and terminal nodes. Is the type system of cTAKES a feature
> > > of the node, i.e. can one node belong to two or more type systems and their
> > > further attributes, OR is there, for each type system, a syntax tree for
> > > every sentence parsed? I mean a sentence has various trees attached to it
> > > but there is a 1:1 mapping between the node and the type system.
> > >
> > > Many Thanks
> > >
> > > Anirban
> > >
> > > On Thu, May 15, 2014 at 5:03 PM, Miller, Timothy <timothy.mil...@childrens.harvard.edu> wrote:
> > >
> > >> Anir -- I'm not sure I understand your question but from your example it
> > >> doesn't sound like a tree exactly. If you just want a list of medication
> > >> concepts you can do something like:
> > >>
> > >> List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas, MedicationMention.class));
> > >> (I believe MedicationMention is the correct class but check your output.)
> > >>
> > >> If you really do want to put them into a syntax tree, there are also
> > >> methods for doing that in the AnnotationTreeUtils class.
> > >>
> > >> getAnnotationTree(JCas, Annotation) will give you the tree for the whole
> > >> sentence containing the annotation you give it.
> > >> annotationNode(JCas, Annotation) will give you the smallest subtree
> > >> covering the annotation you give it.
> > >> insertAnnotationNode(JCas, TopTreebankNode, Annotation, String) will
> > >> insert a node into the tree specified at the level specified by the
> > >> annotation with the category specified by the string.
> > >> So for example if you
> > >> had meds as above you could then do:
> > >>
> > >> for(MedicationMention med : meds){
> > >>   AnnotationTreeUtils.insertAnnotationNode(jcas,
> > >>       AnnotationTreeUtils.getAnnotationTree(jcas, med), med, "MEDICATION");
> > >> }
> > >>
> > >> which would insert a new node into every tree with the label "MEDICATION"
> > >> in every position where a medication was found.
> > >>
> > >> One caveat to the above code is that these methods actually will change
> > >> the tree in the cas. That might be ok for some use cases but for many you
> > >> want to work on a tree outside the cas so that's why there are also methods:
> > >> getTreeCopy(JCas, TopTreebankNode)
> > >> getTreeCopy(JCas, TreebankNode)
> > >>
> > >> if you use the getAnnotationTree method to obtain the tree you want, then
> > >> you can get a copy from these methods, then use the insert methods and do
> > >> something with them immediately (like print them out), without altering the
> > >> originals in the cas if other AEs may use them.
> > >>
> > >> Tim
> > >>
> > >> ________________________________________
> > >> From: Anirban Chakraborti [chakraborti.anir...@googlemail.com]
> > >> Sent: Sunday, May 11, 2014 9:15 AM
> > >> To: dev@ctakes.apache.org
> > >> Subject: Re: markable types
> > >>
> > >> Steven,
> > >>
> > >> Would you have any example code of a tree parser so the output can be
> > >> arranged as needed? I mean, after successful annotation, I want to
> > >> extract certain concepts like medication only and arrange them in a new
> > >> tree so that all annotations referring to the medication concept and their
> > >> sources are listed together.
> > >>
> > >> Anir
> > >>
> > >> On Sun, May 11, 2014 at 3:55 PM, Steven Bethard <steven.beth...@gmail.com> wrote:
> > >>
> > >>> I don't think "not something anyone would want extracted" should be an
> > >>> argument against anything.
> > >>> We already have constituent and dependency
> > >>> parse trees in the type system, and those would fall under that
> > >>> category.
> > >>>
> > >>> So +1 on markables in the type system. (In general, +1 on moving
> > >>> module-specific types to the standard type system. I'm not sure what
> > >>> the real benefit of splitting them out is...)
> > >>>
> > >>> Steve
> > >>>
> > >>> On Fri, May 9, 2014 at 11:53 AM, Miller, Timothy
> > >>> <timothy.mil...@childrens.harvard.edu> wrote:
> > >>>> What do people think about taking the "markable" types out of the
> > >>>> coreference project and adding them to the standard type system? This is
> > >>>> a pretty standard concept in coreference that doesn't really have a
> > >>>> great natural representation in the current type system -- it
> > >>>> encompasses IdentifiedAnnotations as well as pronouns ("It", "him",
> > >>>> "her") and some determiners ("this").
> > >>>>
> > >>>> The drawback I can see is that it is probably not something anyone would
> > >>>> want extracted -- ultimately you want the actual coref pairs or chains.
> > >>>> But it is useful for things like representing gold standard input or
> > >>>> splitting coreference resolution into separate markable recognition and
> > >>>> relation classification steps.
> > >>>>
> > >>>> Tim
> > >>>>
> >
> > --
> > Tim Miller
> > Instructor
> > Boston Children's Hospital and Harvard Medical School
> > timothy.mil...@childrens.harvard.edu
> > 617-919-1223
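
On the question at the top of the thread (how a word can carry several types at once): as far as I understand UIMA, the mention types are not stored as key/value pairs inside the tree nodes. Each annotation is its own object with begin/end offsets, and a call such as JCasUtil.select(jcas, DiseaseDisorderMention.class) just returns the indexed annotations of that type. A treebank node and a mention relate only by covering the same offsets. Below is a toy sketch of that idea in plain Java. Everything in it (AnnotationIndexSketch, Span, Leaf, DiseaseMention, select, selectCovered, the sample sentence) is a hypothetical stand-in invented for illustration, not the UIMA or cTAKES API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an annotation index. These classes are hypothetical and
// are NOT the UIMA or cTAKES API; they only mimic select-by-type over offsets.
final class AnnotationIndexSketch {

    /** Base type: a span of the document text, identified by character offsets. */
    static class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Stand-in for a terminal tree node: it knows its part of speech only. */
    static final class Leaf extends Span {
        final String partOfSpeech;

        Leaf(int begin, int end, String partOfSpeech) {
            super(begin, end);
            this.partOfSpeech = partOfSpeech;
        }
    }

    /** Stand-in for a mention type: a separate object that may share offsets with a leaf. */
    static final class DiseaseMention extends Span {
        DiseaseMention(int begin, int end) {
            super(begin, end);
        }
    }

    private final List<Span> index = new ArrayList<>();

    void add(Span span) {
        index.add(span);
    }

    /** Returns every indexed span that is an instance of the requested type. */
    <T extends Span> List<T> select(Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Span span : index) {
            if (type.isInstance(span)) {
                result.add(type.cast(span));
            }
        }
        return result;
    }

    /** Returns spans of the requested type that lie inside the covering span. */
    <T extends Span> List<T> selectCovered(Class<T> type, Span cover) {
        List<T> result = new ArrayList<>();
        for (T span : select(type)) {
            if (span.begin >= cover.begin && span.end <= cover.end) {
                result.add(span);
            }
        }
        return result;
    }

    static final String TEXT = "Patient has diabetes.";

    /** Builds a small index in which one word carries two annotations. */
    static AnnotationIndexSketch demo() {
        AnnotationIndexSketch idx = new AnnotationIndexSketch();
        idx.add(new Leaf(0, 7, "NN"));       // "Patient"
        idx.add(new Leaf(8, 11, "VBZ"));     // "has"
        idx.add(new Leaf(12, 20, "NN"));     // "diabetes"
        idx.add(new DiseaseMention(12, 20)); // same offsets as the last leaf
        return idx;
    }

    public static void main(String[] args) {
        AnnotationIndexSketch idx = demo();
        for (DiseaseMention m : idx.select(DiseaseMention.class)) {
            System.out.println("mention: " + TEXT.substring(m.begin, m.end));
            for (Leaf leaf : idx.selectCovered(Leaf.class, m)) {
                System.out.println("  leaf: " + TEXT.substring(leaf.begin, leaf.end)
                        + "/" + leaf.partOfSpeech);
            }
        }
    }
}
```

In this sketch the word "diabetes" is both a Leaf and a DiseaseMention, yet neither object holds a reference to the other; the link is recovered by comparing offsets. With the real libraries the analogous step would presumably be a covered/covering lookup on the JCas rather than reading a field off the tree node, but check the uimaFIT documentation for the exact calls.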