I am sorry, I am now really confused. I have a Ruta script which annotates
a bunch of text files, resulting in .xmi files which I assume contain the
annotations. When I open an .xmi file in the Annotation Browser, it shows
all of the annotations produced by my script, right? It certainly looks
correct. I have checked them pretty carefully.
Since I must specify .xmi files for the query view as well, I was assuming
it is also listing the annotations in those same files.
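
To make the difference concrete, here is a minimal Java sketch of the
visibility rule as it is described in the quoted reply below. It is not Ruta
code and not how Ruta is implemented: the class VisibilitySketch and its
methods are hypothetical, and plain whitespace stands in for the filtered
types (an assumption). It only shows why a span that ends on a line break
can exist in the file and still not be matched, and what trimming changes.

```java
public class VisibilitySketch {

    // Hypothetical stand-in for Ruta's filtered types: any whitespace
    // character (space, tab, line break) counts as "invisible" here.
    public static boolean isFiltered(char c) {
        return Character.isWhitespace(c);
    }

    // A span [begin, end) is treated as visible only if neither its first
    // nor its last covered character is filtered.
    public static boolean isVisible(String text, int begin, int end) {
        if (begin >= end) {
            return false;
        }
        return !isFiltered(text.charAt(begin)) && !isFiltered(text.charAt(end - 1));
    }

    // Shrinks [begin, end) until both boundaries sit on unfiltered characters.
    public static int[] trim(String text, int begin, int end) {
        while (begin < end && isFiltered(text.charAt(begin))) {
            begin++;
        }
        while (end > begin && isFiltered(text.charAt(end - 1))) {
            end--;
        }
        return new int[] {begin, end};
    }

    public static void main(String[] args) {
        String text = "Status: currently recruiting\nNext line";
        int begin = text.indexOf("currently");
        int end = text.indexOf('\n') + 1; // the span includes the line break

        System.out.println("visible before trim: " + isVisible(text, begin, end));
        int[] trimmed = trim(text, begin, end);
        System.out.println("visible after trim:  " + isVisible(text, trimmed[0], trimmed[1]));
    }
}
```

The untrimmed span is still a perfectly valid annotation, which would explain
why a plain listing of the CAS shows it while a rule match skips it.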

Yes, I know I can use uimaFIT but since I have a lot of types, I am
dreading the configuration task. I just wanted some quick totals, and had
hoped I could do it in a few minutes with the query view. Why are
annotations made to be invisible if they end with a line break? That caused
me no end of grief when I was developing my script. It seems unexpected.
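
For the quick totals, one option that needs neither the query view nor any
uimaFIT configuration is to count the elements in the .xmi files directly.
The following is only a sketch under assumptions: it relies on the XMI
serialization writing each annotation as an XML element whose local name is
the type's short name (for example tsCurrent) with begin and end attributes,
and it merges types that share a short name across packages. The class name
XmiAnnotationCounter is hypothetical, and only the Java standard library is
used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class XmiAnnotationCounter {

    // Adds one count per XML element that carries both "begin" and "end"
    // attributes, keyed by the element's local name (assumed to be the
    // short name of the annotation type).
    public static void countInto(InputStream xmi, Map<String, Integer> totals) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = factory.createXMLStreamReader(xmi);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getAttributeValue(null, "begin") != null
                            && reader.getAttributeValue(null, "end") != null) {
                        totals.merge(reader.getLocalName(), 1, Integer::sum);
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Directory containing the .xmi files; defaults to the working directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> files;
        try (Stream<Path> paths = Files.walk(dir)) {
            files = paths.filter(p -> p.toString().endsWith(".xmi"))
                         .collect(Collectors.toList());
        }
        Map<String, Integer> totals = new TreeMap<>();
        for (Path file : files) {
            try (InputStream in = Files.newInputStream(file)) {
                countInto(in, totals);
            }
        }
        // One line per type name with its total across all files.
        totals.forEach((type, count) -> System.out.println(type + "\t" + count));
    }
}
```

Because this reads the serialized files rather than applying rules, it counts
every annotation that is stored, whether or not it would be visible to a
query.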

thanks,
Bonnie MacKellar

On Sun, Jun 19, 2016 at 8:52 AM, Peter Klügl <[email protected]>
wrote:

> Hi,
>
>
> the annotation browser just lists all annotations in the CAS. It is
> completely independent of the Ruta language and just an extension of the
> CAS Editor. The query view applies rules on a CAS and lists the rule
> matches. So the query view is much more powerful than the annotation
> browser since it can use the complete expressiveness of the language.
> However, that is also the reason why it is sensitive to the visibility
> concept.
>
>
> Best,
>
>
> Peter
>
>
> Am 19.06.2016 um 14:39 schrieb Bonnie MacKellar:
> > The idea that spaces are making the annotations invisible is totally
> > plausible. But why does the Annotation Browser see them then? The
> > annotations are there - they haven't been skipped - just the query view
> > is not picking them up. What is different about the Annotation Browser
> > that would make those annotations not visible?
> >
> > thanks,
> > Bonnie MacKellar
> >
> > On Sun, Jun 19, 2016 at 8:03 AM, Peter Klügl <[email protected]>
> > wrote:
> >
> >> Hi,
> >>
> >>
> >> attachments are removed on this mailing list.
> >>
> >>
> >> I would bet that some annotations are not visible to the rules, so they
> >> are simply skipped -> the query view returns no matches.
> >>
> >>
> >> In Ruta, annotations are invisible if their begin or end is covered by
> >> something invisible, that is, by annotations of types that are filtered.
> >> Most often, the annotations are missed because they start or end with a
> >> space or line break.
> >>
> >>
> >> You can trim annotations, e.g., with
> >>
> >>
> >> RETAINTYPE(SPACE,BREAK);
> >>
> >> tsCurrent{-> TRIM(SPACE,BREAK)};
> >>
> >> RETAINTYPE;
> >>
> >>
> >>
> >> You can use the query view for this use case. I have to mention that the
> >> query view was built to serve as a tool during rule engineering: to get
> >> a quick overview of the annotated documents. It does not scale with
> >> the number of documents since there is no indexing across CASes and you
> >> need to deserialize all CASes.
> >>
> >> If it is fast enough, it is totally fine to count annotations with
> >> the query view.
> >>
> >> You can also write a simple uimaFIT analysis engine and add it to the
> >> pipeline or the Ruta script. The analysis engine counts the
> >> annotations in process() and outputs the aggregated result in
> >> collectionProcessingComplete() (or the overridden method with the
> >> correct name). If you want to parallelize it, you need a different
> >> solution with a resource or something.
> >>
> >> Best,
> >>
> >> Peter
> >>
> >>
> >>
> >> Am 17.06.2016 um 21:21 schrieb Bonnie MacKellar:
> >>> Hi
> >>>
> >>> I am trying to use Ruta Query View to get a view of all matches for a
> >>> particular annotation type across a large set of .xmi files. However,
> >>> I am noticing something strange about Ruta Query View - it doesn't
> >>> report lots of matches that are shown in the Annotation browser (and
> >>> which I believe are correct matches). For example, a given annotation
> >>> type tsCurrent has 4 matches in the file NCT0036712, but these matches
> >>> do not appear at all in the list of results in Ruta Query View when I
> >>> query for tsCurrent.  For some files, though, the results for all
> >>> matches do show up, and for other files, only a partial set of matches
> >>> are in the query results. I cannot understand why this is happening.
> >>> Perhaps my query syntax is wrong?  I can only find the one example in
> >>> the manual, which isn't much to go on.
> >>>
> >>> I am attaching a screenshot showing the AnnotationBrowser on the top
> >>> right in Eclipse, with all of the matches for tsCurrent, and the Ruta
> >>> Query view on bottom, which does not contain those matches. I think it
> >>> is easier to see the problem visually.
> >>>
> >>> Also, ultimately I am just trying to get a count of the number of times
> >>> certain annotations are made across all of my files. Is there a better
> >>> way to do that instead of Ruta Query View?  I can't find another way
> >>> to total matches across lots of files.
> >>>
> >>> thanks,
> >>> Bonnie MacKellar
> >>
>
>
