Hi Mike,

Have you tried the getCoveredText() method that IdentifiedAnnotation inherits from Annotation?
- Jessica

On Tue, Feb 13, 2018 at 2:42 PM, Michael Trepanier <m...@metistream.com> wrote:

> Hi,
>
> I am attempting to run the default FastPipeline to extract various
> features from clinical text. One of the features I'd like to capture is the
> covered text. However, when running the below scala code, calling
> getOriginalText yields a "null" value for every annotation of type
> IdentifiedAnnotation. Is this by design?
>
> And if so, what would be a better way to extract the covered text? The
> other features I need (subject, polarity, confidence, historyOf, and
> snomed/CUI/TUI/PreferredText) I can acquire just fine. Effectively, the
> goal here is to capture every identified annotation, relevant metadata, and
> the original text (only showing my attempt at getting the covered text
> below).
>
> def main(args: Array[String]) {
>   val note =
>     """
>     ... (Some long example note.)
>     """.stripMargin
>   val aed = ClinicalPipelineFactory.getDefaultPipeline
>   val ae = AnalysisEngineFactory.createEngine(aed)
>   val jcas =
>     JCasFactory.createJCas("org.apache.ctakes.typesystem.types.TypeSystem")
>   jcas.setDocumentText(note)
>   ae.process(jcas)
>   val index = jcas.getAnnotationIndex(IdentifiedAnnotation.`type`)
>   val iter = index.iterator()
>   while (iter.hasNext) {
>     val annotation = iter.next().asInstanceOf[IdentifiedAnnotation]
>     val fsArray = annotation.getOriginalText()
>     if (fsArray != null) {
>       for (featureStructure <- fsArray.toArray()) {
>         val featureArray = featureStructure.getType().getFeatures()
>         val strings = featureArray.map(x => featureStructure.getStringValue(x))
>         println(strings)
>       }
>     }
>   }
> }
>
>
> Regards,
>
> Mike Trepanier
> --
> Mike Trepanier | Big Data Engineer | MetiStream, Inc. |
> m...@metistream.com | 845 - 270 - 3129 (m) |
> www.metistream.com
>
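
For anyone finding this thread later, the getCoveredText() suggestion at the top might look roughly like the sketch below. It is only a sketch under a few assumptions: the object name CoveredTextSketch and the helper coveredSpans are made up for illustration and are not part of cTAKES, uimaFIT's JCasUtil is assumed to be on the classpath, and the JavaConverters import assumes Scala 2.11 or 2.12 (on 2.13 it still compiles but is deprecated in favour of scala.jdk.CollectionConverters). It expects a JCas that has already been through ae.process(jcas) as in the quoted code.

```scala
import scala.collection.JavaConverters._

import org.apache.ctakes.typesystem.`type`.textsem.IdentifiedAnnotation
import org.apache.uima.fit.util.JCasUtil
import org.apache.uima.jcas.JCas

object CoveredTextSketch {
  // Hypothetical helper (not part of cTAKES): returns (begin, end, covered text)
  // for every IdentifiedAnnotation found in the given JCas.
  def coveredSpans(jcas: JCas): Seq[(Int, Int, String)] =
    JCasUtil.select(jcas, classOf[IdentifiedAnnotation]).asScala.toSeq.map { a =>
      // getCoveredText is inherited from UIMA's Annotation and returns the
      // document text between the annotation's begin and end offsets.
      (a.getBegin, a.getEnd, a.getCoveredText)
    }
}
```

With the jcas from the quoted main method, something like CoveredTextSketch.coveredSpans(jcas).foreach(println) after ae.process(jcas) should print one tuple per identified annotation. The other features mentioned in the question (subject, polarity, and so on) could be added to the tuple in the same map call.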