I am sorry, I am now really confused. I have a Ruta script which annotates a bunch of text files, resulting in .xmi files which I assume contain the annotations. When I open an .xmi file in the Annotation Browser, it shows all of the annotations produced by my script, right? It certainly looks correct. I have checked them pretty carefully. Since I must specify .xmi files for the query view as well, I assumed it also lists the annotations in those same files.
Yes, I know I can use uimaFIT, but since I have a lot of types, I am dreading the configuration task. I just wanted some quick totals, and had hoped I could do it in a few minutes with the query view.

Why are annotations made invisible if they end with a line break? That caused me no end of grief when I was developing my script. It seems unexpected.

thanks,
Bonnie MacKellar

On Sun, Jun 19, 2016 at 8:52 AM, Peter Klügl <[email protected]> wrote:

> Hi,
>
> the annotation browser just lists all annotations in the CAS; it is
> completely independent of the Ruta language and just an extension of the
> CAS Editor. The query view applies rules on a CAS and lists the rule
> matches. So the query view is much more powerful than the annotation
> browser since it can use the complete expressiveness of the language.
> However, that is also the reason why it is sensitive to the visibility
> concept.
>
> Best,
>
> Peter
>
> On 19.06.2016 at 14:39, Bonnie MacKellar wrote:
> > The idea that spaces are making the annotations invisible is totally
> > plausible. But why does the Annotation Browser see them then? The
> > annotations are there; they haven't been skipped. It is just the query
> > view that is not picking them up. What is different about Annotation
> > Browser that would make those annotations not visible?
> >
> > thanks,
> > Bonnie MacKellar
> >
> > On Sun, Jun 19, 2016 at 8:03 AM, Peter Klügl <[email protected]>
> > wrote:
> >
> >> Hi,
> >>
> >> attachments are removed on this mailing list.
> >>
> >> I would bet that some annotations are not visible to the rules, so they
> >> are simply skipped -> the query view returns no matches.
> >>
> >> In Ruta, annotations are invisible if their begin or end is covered by
> >> something invisible, that is, by any annotation of a type that is
> >> filtered. Most often, the annotations are missed because they start or
> >> end with a space or line break.
> >>
> >> You can trim annotations, e.g., with
> >>
> >> RETAINTYPE(SPACE,BREAK);
> >> tsCurrent{-> TRIM(SPACE,BREAK)};
> >> RETAINTYPE;
> >>
> >> You can use the query view for this use case. I have to mention that the
> >> query view was built to serve as a tool during rule engineering: to get
> >> a quick overview of the annotated documents. It does not scale with
> >> the number of documents since there is no indexing across CASes and you
> >> need to deserialize all CASes.
> >>
> >> If it is fast enough, it is totally fine to count annotations with
> >> the query view.
> >>
> >> You can also write a simple uimaFIT analysis engine and add it to the
> >> pipeline or the Ruta script. The analysis engine counts the
> >> annotations in process() and outputs the aggregated result in
> >> collectionProcessingComplete() (or the overridden method with the
> >> correct name). If you want to parallelize it, you need a different
> >> solution with a resource or something.
> >>
> >> Best,
> >>
> >> Peter
> >>
> >> On 17.06.2016 at 21:21, Bonnie MacKellar wrote:
> >>> Hi
> >>>
> >>> I am trying to use Ruta Query View to get a view of all matches for a
> >>> particular annotation type across a large set of .xmi files. However,
> >>> I am noticing something strange about Ruta Query View: it doesn't
> >>> report lots of matches that are shown in the Annotation Browser (and
> >>> which I believe are correct matches). For example, a given annotation
> >>> type tsCurrent has 4 matches in the file NCT0036712, but these matches
> >>> do not appear at all in the list of results in Ruta Query View when I
> >>> query for tsCurrent. For some files, though, the results for all
> >>> matches do show up, and for other files, only a partial set of matches
> >>> is in the query results. I cannot understand why this is happening.
> >>> Perhaps my query syntax is wrong? I can only find the one example in
> >>> the manual, which isn't much to go on.
> >>>
> >>> I am attaching a screenshot showing the Annotation Browser on the top
> >>> right in Eclipse, with all of the matches for tsCurrent, and the Ruta
> >>> Query View on the bottom, which does not contain those matches. I think
> >>> it is easier to see the problem visually.
> >>>
> >>> Also, ultimately I am just trying to get a count of the number of times
> >>> certain annotations are made across all of my files. Is there a better
> >>> way to do that instead of Ruta Query View? I can't find another way
> >>> to total matches across lots of files.
> >>>
> >>> thanks,
> >>> Bonnie MacKellar
> >>>
> >>> Inline image 1
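
For the "quick totals across many .xmi files" goal discussed in this thread, one possible shortcut is to count directly in the serialized files instead of going through the query view or a uimaFIT engine. The sketch below is not part of Ruta or uimaFIT; it is a plain Python script that assumes the usual XMI layout, in which each annotation is a top-level XML element carrying `begin` and `end` attributes. The function names `count_annotations` and `count_directory` are hypothetical, and types that share a short name across packages would be merged. Because it reads the raw file, it ignores Ruta's visibility concept entirely, so its totals should match the Annotation Browser rather than the query view.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_annotations(xmi_path):
    """Count top-level elements with begin/end offsets, keyed by short type name.

    Assumption: every annotation is serialized as a direct child of the
    XMI root element and has both a 'begin' and an 'end' attribute.
    """
    counts = Counter()
    root = ET.parse(xmi_path).getroot()
    for elem in root:
        # ElementTree tags look like '{namespace-uri}LocalName'.
        local_name = elem.tag.rpartition("}")[2]
        if "begin" in elem.attrib and "end" in elem.attrib:
            counts[local_name] += 1
    return counts


def count_directory(directory):
    """Sum the per-file counts over all *.xmi files in one directory."""
    total = Counter()
    for path in sorted(Path(directory).glob("*.xmi")):
        total.update(count_annotations(path))
    return total


if __name__ == "__main__":
    import sys

    # Usage: python count_xmi.py path/to/xmi/folder
    for type_name, n in sorted(count_directory(sys.argv[1]).items()):
        print(f"{type_name}\t{n}")
```

Elements without offsets (such as the Sofa and View entries) are skipped, while the document annotation itself is counted like any other annotation, so expect one extra entry per file for it.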
