Johnsd11 commented on issue #64:
URL: https://github.com/apache/ctakes/issues/64#issuecomment-2836000490

   
   Each annotation gets a unique concept for every combination of possible 
codes, semantic types, etc.
   You have pasted a good example of when that happens:  (abbreviated)
   
   < code="7092007" tui="T109"/>
   <code="7092007" tui="T121"/>
   <code="372826007"  tui="T109"/>
   <code="372826007"  tui="T121"/>
   
   This is definitely a little confusing when the CUI for all 4 'unique' 
concepts is the same, in your case cui="C0025859".
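
To make the combination behaviour concrete, here is a minimal plain-Java sketch of how two codes and two semantic types expand into four concept entries that all share one CUI. It does not use cTAKES; `ConceptCombos`, `Concept`, `expand` and `distinctCuis` are hypothetical names made up for illustration, and the values are the ones from the abbreviated example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ConceptCombos {
    // Hypothetical stand-in for one concept entry; this is NOT a cTAKES class.
    record Concept(String cui, String code, String tui) {}

    // Build one concept entry per (code, tui) pair, all sharing the same CUI.
    static List<Concept> expand(String cui, List<String> codes, List<String> tuis) {
        List<Concept> concepts = new ArrayList<>();
        for (String code : codes) {
            for (String tui : tuis) {
                concepts.add(new Concept(cui, code, tui));
            }
        }
        return concepts;
    }

    // Collapse the concept entries back to their distinct CUIs.
    static Set<String> distinctCuis(List<Concept> concepts) {
        return concepts.stream().map(Concept::cui).collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Concept> concepts = expand("C0025859",
                List.of("7092007", "372826007"),
                List.of("T109", "T121"));
        // Print every (code, tui) combination, then the number of distinct CUIs.
        concepts.forEach(System.out::println);
        System.out.println(concepts.size() + " concept entries, "
                + distinctCuis(concepts).size() + " distinct CUI(s)");
    }
}
```

Grouping by CUI, as `distinctCuis` does here, is the step that removes the apparent duplication.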
   
   If you are interested in gathering annotations, CUIs, codes, concepts, 
semantic types, etc., you should consider using the OntologyConceptUtil 
class in ctakes-core.
   
https://ctakes.apache.org/apidocs/4.0.0/org/apache/ctakes/core/util/OntologyConceptUtil.html
   
   As far as I can tell, the methods most relevant to your question would be:
   
   getAnnotationsByCui( jCas, "C0025859" )
     --> which would return 3 annotations given your example.
   
   getCuiCounts(  jCas )
     --> which would return a Map<String,Long> where  the cui is the key 
(String) and the # of annotations with that cui is the value (Long).  In your 
case this should be "C0025859", 3.
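
As a rough illustration of the shape of those two results, the sketch below reproduces the filtering and counting in plain Java. It is not the cTAKES implementation and does not call OntologyConceptUtil: `CuiCountSketch`, `Mention`, `byCui` and `cuiCounts` are hypothetical stand-ins, and the offsets and the second CUI ("C0000000") are made-up placeholders.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class CuiCountSketch {
    // Hypothetical stand-in for an annotation: text offsets plus its CUI.
    record Mention(int begin, int end, String cui) {}

    // Analogous in spirit to getAnnotationsByCui: keep mentions with the given CUI.
    static List<Mention> byCui(List<Mention> mentions, String cui) {
        return mentions.stream()
                .filter(m -> m.cui().equals(cui))
                .collect(Collectors.toList());
    }

    // Analogous in spirit to getCuiCounts: CUI -> number of mentions with that CUI.
    static Map<String, Long> cuiCounts(List<Mention> mentions) {
        return mentions.stream()
                .collect(Collectors.groupingBy(Mention::cui, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up offsets; "C0000000" is a placeholder CUI, not a real concept.
        List<Mention> mentions = List.of(
                new Mention(0, 5, "C0025859"),
                new Mention(20, 25, "C0025859"),
                new Mention(40, 45, "C0025859"),
                new Mention(60, 65, "C0000000"));

        System.out.println(byCui(mentions, "C0025859").size());
        System.out.println(cuiCounts(mentions));
    }
}
```

With real cTAKES output, the equivalent calls would take the jCas rather than a hand-built list.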
   
   There are around 35 methods, so hopefully you can find some that fit your 
needs.
   
   Even if you really need something special, parsing the XMI files directly 
is probably not the best way to get this information.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
