Re: CAS and CasView redesign - question if all views should share the same indexes?

Adam Lally Fri, 22 Dec 2006 09:35:16 -0800

On 12/22/06, Thilo Goetz <[EMAIL PROTECTED]> wrote:

Adam Lally wrote:
> (1) The CAS is the container for all of the analysis data (as per the
> UIMA spec).  It must be possible to create FS directly on the CAS
> and there must be some reasonable way to retrieve the FS in the CAS
> without having to be concerned with views.


This seems to be an important point, and one that I still haven't really
understood.  Why is this necessary?  An anchored view is the only way to
contain a subject of analysis.  UIMA without sofas (in the conceptual
sense) is nothing.  Why do I need to be able to access annotations
without being concerned about views?  Conceptually and in an ideal
world, that is.  Don't get me wrong, I'm not opposed to this.  I simply
don't understand the motivation, and I would like to.


That's a fair question...

One thing I want to clarify is that UIMA without views doesn't mean
UIMA without Sofas. You should be able to access the Sofas (all of
them) directly from the CAS.  They're just FeatureStructures after
all, and our current implementation does have a Sofa index, though
it's hidden at the moment.

So one way of working with the CAS without views might be for an
annotator to look through the Sofa index for a Sofa it wants to
analyze and create some annotations over it (I suggested a
CAS.createAnnotation(begin, end, Sofa) method for this purpose.)
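
As a rough illustration of that idea, here is a minimal sketch in Java. It uses a toy model, not the real UIMA classes: ToyCas, ToySofa, ToyAnnotation and getSofaIndex() are all hypothetical names invented for this example, and only the createAnnotation(begin, end, Sofa) signature comes from the suggestion above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: every class here is hypothetical and is NOT the UIMA API.
final class ToySofa {
    final String id;
    final String text;

    ToySofa(String id, String text) {
        this.id = id;
        this.text = text;
    }
}

final class ToyAnnotation {
    final int begin;
    final int end;
    final ToySofa sofa;

    ToyAnnotation(int begin, int end, ToySofa sofa) {
        this.begin = begin;
        this.end = end;
        this.sofa = sofa;
    }

    // Text of the Sofa covered by this annotation's span.
    String coveredText() {
        return sofa.text.substring(begin, end);
    }
}

final class ToyCas {
    // The Sofa index lives on the CAS itself, not on any view.
    private final List<ToySofa> sofaIndex = new ArrayList<>();

    ToySofa createSofa(String id, String text) {
        ToySofa sofa = new ToySofa(id, text);
        sofaIndex.add(sofa);
        return sofa;
    }

    List<ToySofa> getSofaIndex() {
        return Collections.unmodifiableList(sofaIndex);
    }

    // Mirrors the suggested CAS.createAnnotation(begin, end, Sofa):
    // the annotation is tied to a Sofa directly, with no view involved.
    ToyAnnotation createAnnotation(int begin, int end, ToySofa sofa) {
        if (begin < 0 || end < begin || end > sofa.text.length()) {
            throw new IllegalArgumentException("span out of range");
        }
        return new ToyAnnotation(begin, end, sofa);
    }
}

final class SofaIndexDemo {
    public static void main(String[] args) {
        ToyCas cas = new ToyCas();
        cas.createSofa("audio", "");
        cas.createSofa("text", "UIMA analyzes text");

        // The annotator scans the Sofa index for the Sofa it wants.
        for (ToySofa sofa : cas.getSofaIndex()) {
            if (sofa.id.equals("text")) {
                ToyAnnotation a = cas.createAnnotation(0, 4, sofa);
                System.out.println(a.coveredText());
            }
        }
    }
}
```

Note that in this sketch the new annotation is only reachable through the reference the caller holds; nothing in the toy CAS records it.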

Views are a way that we think is useful to organize feature structures
in the CAS, and one key way to organize them is to collect all the
annotations referring to a single sofa into one (anchored) view.  But
is this the only way to do things in the UIMA standard?  That proved
to be a tough sell to the people who worked on the UIMA spec proposal
who were thinking not just about our implementation but also about
other UIM frameworks/systems that do things differently.  So the state
of things for the UIMA spec proposal right now is that views are an
optional way of doing things.

Now on top of that we have to figure out what to do with indexes,
which aren't part of the UIMA spec at the moment.  In our current
implementation indexes only operate on views.  Maybe it's OK to leave
it that way for now, but I thought it was worth exploring if there's a
way to have indexes work over the CAS as a whole, as well.

Going back to my hypothetical annotator that created an annotation off
the base CAS by calling CAS.createAnnotation(begin, end, Sofa): in
our current implementation this isn't useful because the annotation
has to be indexed to be retrievable, and the only way to index it is
to add it to a view.  Are there any other options we could consider?

If we can't or don't want to change the fact that indexes only operate
on views, we could provide an iterator that walks the heap and returns
everything regardless of whether it's indexed.  Then we'd be saying
that neither views nor indexes are required; they're a performance
optimization.
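
To make the heap-walking option concrete, here is a minimal sketch in Java. It is a toy model under stated assumptions, not the actual CAS heap: HeapCas, ToyView, ToyFs and heapIterator() are hypothetical names, and the "heap" is just a list that records every feature structure at creation time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, NOT the UIMA API or its real heap.
final class ToyFs {
    final String type;
    final int begin;
    final int end;

    ToyFs(String type, int begin, int end) {
        this.type = type;
        this.begin = begin;
        this.end = end;
    }
}

final class ToyView {
    // Indexes belong to a view; only explicitly added FS appear here.
    private final List<ToyFs> index = new ArrayList<>();

    void addToIndexes(ToyFs fs) {
        index.add(fs);
    }

    List<ToyFs> indexed() {
        return Collections.unmodifiableList(index);
    }
}

final class HeapCas {
    // Every FS is recorded here when created, indexed or not.
    private final List<ToyFs> heap = new ArrayList<>();
    private final Map<String, ToyView> views = new LinkedHashMap<>();

    ToyFs createFs(String type, int begin, int end) {
        ToyFs fs = new ToyFs(type, begin, end);
        heap.add(fs);
        return fs;
    }

    ToyView getView(String name) {
        return views.computeIfAbsent(name, n -> new ToyView());
    }

    // Walks the whole heap in creation order, ignoring views and indexes.
    Iterator<ToyFs> heapIterator() {
        return Collections.unmodifiableList(heap).iterator();
    }
}

final class HeapWalkDemo {
    public static void main(String[] args) {
        HeapCas cas = new HeapCas();
        ToyFs indexedFs = cas.createFs("Token", 0, 4);
        cas.createFs("Token", 5, 13); // created but never indexed
        cas.getView("text").addToIndexes(indexedFs);

        int onHeap = 0;
        for (Iterator<ToyFs> it = cas.heapIterator(); it.hasNext(); it.next()) {
            onHeap++;
        }
        System.out.println("indexed in view: " + cas.getView("text").indexed().size());
        System.out.println("found by heap walk: " + onHeap);
    }
}
```

In this sketch the heap walk is a linear scan over everything, which is why the view indexes would remain worthwhile as the faster path.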

-Adam
