Re: markable types

Miller, Timothy Sat, 17 May 2014 17:51:07 -0700

Again, I'm not sure I understand, so please clarify if this isn't what you're 
looking for.


The cTAKES type system represents syntax trees with three types: 
TopTreebankNode, TreebankNode, and TerminalTreebankNode. Top and Terminal 
inherit from TreebankNode, with special properties for being the root of a 
tree or a leaf of a tree (a leaf carries the part-of-speech tag and a word). 
For most nodes, calling getNodeType() will get you the category you want. For 
Terminal nodes, getNodeType() and getNodeValue() will give you the POS tag and 
the word, respectively. You can get the subtrees of a node with getChildren() 
and a specific subtree with getChildren(int), where the int arg is indexed 
from 0. Each node is also connected to its parent by getParent(). Each node 
also has its headword denoted by the getHead() method (I think that's right 
but I'm doing this from memory so you'll have to check), which is an index 
into the array of _all_ terminals in the sentence, not just that node's own 
children. So if tree.getHead() returns 5, then you would call getTerminals() 
on the root tree and take the word at index 5 to get the head word of the 
node held in the variable tree.
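
Since an illustrative tree was asked for, here is a rough, self-contained 
sketch of the shape described above. SketchNode and TreeSketch are made-up 
stand-ins (not the cTAKES classes) that only borrow the accessor names. The 
sentence, the TOP root label, and the head indices are assumptions chosen for 
illustration, and the real getChildren() may return a UIMA array rather than a 
List.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a treebank node; NOT the cTAKES class. */
class SketchNode {
    private final String nodeType;   // category (NP, VP, ...) or POS tag for a leaf
    private final String nodeValue;  // the word for a leaf, null for inner nodes
    private final int headIndex;     // index of the head word among ALL sentence terminals
    private final List<SketchNode> children = new ArrayList<>();
    private SketchNode parent;

    SketchNode(String nodeType, String nodeValue, int headIndex, SketchNode... kids) {
        this.nodeType = nodeType;
        this.nodeValue = nodeValue;
        this.headIndex = headIndex;
        for (SketchNode kid : kids) {
            kid.parent = this;       // link each child back to its parent
            children.add(kid);
        }
    }

    static SketchNode leaf(String pos, String word, int index) {
        return new SketchNode(pos, word, index);
    }

    String getNodeType() { return nodeType; }
    String getNodeValue() { return nodeValue; }
    int getHead() { return headIndex; }
    SketchNode getParent() { return parent; }
    List<SketchNode> getChildren() { return children; }
    SketchNode getChildren(int i) { return children.get(i); }
    boolean isLeaf() { return children.isEmpty(); }

    /** Leaves in left-to-right order, playing the role of getTerminals() on the root. */
    List<SketchNode> getTerminals() {
        List<SketchNode> out = new ArrayList<>();
        collect(this, out);
        return out;
    }

    private static void collect(SketchNode n, List<SketchNode> out) {
        if (n.isLeaf()) { out.add(n); return; }
        for (SketchNode c : n.getChildren()) collect(c, out);
    }

    /** Bracketed rendering: (CATEGORY child child ...) and (POS word) for leaves. */
    String toBrackets() {
        if (isLeaf()) return "(" + nodeType + " " + nodeValue + ")";
        StringBuilder sb = new StringBuilder("(" + nodeType);
        for (SketchNode c : children) sb.append(' ').append(c.toBrackets());
        return sb.append(')').toString();
    }
}

class TreeSketch {
    /** Hand-built tree for the assumed sentence "The patient takes aspirin". */
    static SketchNode example() {
        SketchNode subject = new SketchNode("NP", null, 1,
                SketchNode.leaf("DT", "The", 0), SketchNode.leaf("NN", "patient", 1));
        SketchNode object = new SketchNode("NP", null, 3, SketchNode.leaf("NN", "aspirin", 3));
        SketchNode vp = new SketchNode("VP", null, 2, SketchNode.leaf("VBZ", "takes", 2), object);
        SketchNode s = new SketchNode("S", null, 2, subject, vp);
        return new SketchNode("TOP", null, 2, s);
    }

    /** Resolve a node's head word by indexing into the root's terminals. */
    static String headWord(SketchNode root, SketchNode node) {
        return root.getTerminals().get(node.getHead()).getNodeValue();
    }

    public static void main(String[] args) {
        SketchNode root = example();
        // prints (TOP (S (NP (DT The) (NN patient)) (VP (VBZ takes) (NP (NN aspirin)))))
        System.out.println(root.toBrackets());
        SketchNode subjectNp = root.getChildren(0).getChildren(0);
        // prints patient
        System.out.println(headWord(root, subjectNp));
    }
}
```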
The parser works at the sentence level, so a standard approach is to iterate 
over all of the trees in the document (one per sentence) by doing:
for(TopTreebankNode tree : JCasUtil.select(jcas, TopTreebankNode.class)){
  // do something with this tree
}
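
To make "do something with this tree" a bit more concrete, here is a sketch 
that walks each sentence tree depth-first and collects the text under every 
node of a given category. PhraseNode and PhraseWalk are hypothetical stand-ins 
so the example runs without cTAKES. In real code the list of roots would come 
from the JCasUtil.select call above and the nodes would be TreebankNode 
objects, so treat the details as assumptions to check against the type system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a treebank node; it mirrors the accessor names only. */
class PhraseNode {
    private final String type;            // category or POS tag
    private final String word;            // non-null only for leaves
    private final List<PhraseNode> kids;

    PhraseNode(String type, String word, PhraseNode... kids) {
        this.type = type;
        this.word = word;
        this.kids = Arrays.asList(kids);
    }

    String getNodeType() { return type; }
    String getNodeValue() { return word; }
    List<PhraseNode> getChildren() { return kids; }
}

class PhraseWalk {
    /** Words covered by a subtree, left to right, joined with single spaces. */
    static String coveredText(PhraseNode node) {
        if (node.getChildren().isEmpty()) return node.getNodeValue();
        List<String> parts = new ArrayList<>();
        for (PhraseNode kid : node.getChildren()) parts.add(coveredText(kid));
        return String.join(" ", parts);
    }

    /** Pre-order depth-first search for every subtree with the given category. */
    static void collect(PhraseNode node, String category, List<String> out) {
        if (category.equals(node.getNodeType())) out.add(coveredText(node));
        for (PhraseNode kid : node.getChildren()) collect(kid, category, out);
    }

    /** Same shape as the select(...) loop: one root tree per sentence. */
    static List<String> phrases(List<PhraseNode> sentenceTrees, String category) {
        List<String> out = new ArrayList<>();
        for (PhraseNode tree : sentenceTrees) {
            collect(tree, category, out);
        }
        return out;
    }

    /** Two hand-built sentence trees, invented for illustration. */
    static List<PhraseNode> sampleTrees() {
        PhraseNode first = new PhraseNode("TOP", null,
            new PhraseNode("S", null,
                new PhraseNode("NP", null,
                    new PhraseNode("DT", "The"), new PhraseNode("NN", "patient")),
                new PhraseNode("VP", null,
                    new PhraseNode("VBZ", "takes"),
                    new PhraseNode("NP", null, new PhraseNode("NN", "aspirin")))));
        PhraseNode second = new PhraseNode("TOP", null,
            new PhraseNode("S", null,
                new PhraseNode("NP", null, new PhraseNode("PRP", "She")),
                new PhraseNode("VP", null,
                    new PhraseNode("VBZ", "denies"),
                    new PhraseNode("NP", null, new PhraseNode("NN", "pain")))));
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        // prints [The patient, aspirin, She, pain]
        System.out.println(phrases(sampleTrees(), "NP"));
    }
}
```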

Hope this helps.
Tim


On May 17, 2014, at 1:54 PM, Anirban Chakraborti wrote:

> Thanks Timothy,
> 
> I get the point, but it would be greatly helpful if you had an illustrative
> example of a tree structure describing the branches and the nodes generated
> by cTAKES. I have got the hang of how to parse the tree now.
> 
> 
> 
> 
> On Thu, May 15, 2014 at 5:03 PM, Miller, Timothy <
> [email protected]> wrote:
> 
>> Anir -- I'm not sure I understand your question but from your example it
>> doesn't sound like a tree exactly. If you just want a list of medication
>> concepts you can do something like:
>> 
>> List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas,
>> MedicationMention.class));
>> (I believe MedicationMention is the correct class but check your output.)
>> 
>> If you really do want to put them into a syntax tree, there are also
>> methods for doing that in the AnnotationTreeUtils class.
>> 
>> getAnnotationTree(JCas, Annotation) will give you the tree for the whole
>> sentence containing the annotation you give it.
>> annotationNode(JCas, Annotation) will give you the smallest subtree
>> covering the annotation you give it.
>> insertAnnotationNode(JCas, TopTreebankNode, Annotation, String) will
>> insert a node into the specified tree, at the level determined by the
>> annotation, with the category given by the string. So for example if you
>> had meds as above you could then do:
>> 
>> for(MedicationMention med : meds){
>>  AnnotationTreeUtils.insertAnnotationNode(jcas,
>>    AnnotationTreeUtils.getAnnotationTree(jcas, med), med, "MEDICATION");
>> }
>> 
>> which would insert a new node labeled "MEDICATION" into the relevant
>> sentence tree at every position where a medication was found.
>> 
>> One caveat to the above code is that these methods will actually change
>> the tree in the cas. That might be ok for some use cases, but for many you
>> want to work on a tree outside the cas, so that's why there are also the
>> methods:
>> getTreeCopy(JCas, TopTreebankNode)
>> getTreeCopy(JCas, TreebankNode)
>> 
>> If you use the getAnnotationTree method to obtain the tree you want, then
>> you can get a copy from these methods, use the insert methods on the copy,
>> and do something with it immediately (like print it out), without altering
>> the originals in the cas in case other AEs use them.
>> 
>> Tim
>> 
>> 
>> 
>> ________________________________________
>> From: Anirban Chakraborti [[email protected]]
>> Sent: Sunday, May 11, 2014 9:15 AM
>> To: [email protected]
>> Subject: Re: markable types
>> 
>> Steven,
>> 
>> Would you have any example code for a tree parser, so the output can be
>> arranged as needed? I mean, after successful annotation, I want to
>> extract only certain concepts, like medication, and arrange them in a new
>> tree so that all annotations referring to the medication concept and their
>> sources are listed together.
>> 
>> Anir
>> 
>> 
>> On Sun, May 11, 2014 at 3:55 PM, Steven Bethard <[email protected]>
>> wrote:
>> 
>>> I don't think "not something anyone would want extracted" should be an
>>> argument against anything. We already have constituent and dependency
>>> parse trees in the type system, and those would fall under that
>>> category.
>>> 
>>> So +1 on markables in the type system. (In general, +1 on moving
>>> module-specific types to the standard type system. I'm not sure what
>>> the real benefit of splitting them out is...)
>>> 
>>> Steve
>>> 
>>> On Fri, May 9, 2014 at 11:53 AM, Miller, Timothy
>>> <[email protected]> wrote:
>>>> What do people think about taking the "markable" types out of the
>>>> coreference project and adding them to the standard type system? This is
>>>> a pretty standard concept in coreference that doesn't really have a
>>>> great natural representation in the current type system -- it
>>>> encompasses IdentifiedAnnotations as well as pronouns ("It", "him",
>>>> "her") and some determiners ("this").
>>>> 
>>>> The drawback I can see is that it is probably not something anyone would
>>>> want extracted -- ultimately you want the actual coref pairs or chains.
>>>> But it is useful for things like representing gold standard input or
>>>> splitting coreference resolution into separate markable recognition and
>>>> relation classification steps.
>>>> 
>>>> Tim
>>>> 
>>> 
>>
