Adam Lally wrote:
Now what you say about sofas is interesting.  Currently, an index knows
nothing of views or sofas.  The only thing that is checked when adding a
FS to an index is the FS's type.  Are you suggesting that there should
be special code that prevents me from adding an annotation that I
created in one view to the index repository of another view?


In fact I believe that code already exists and it's not that
complicated (in our current implementation anyway).  Each annotation
has a feature that is a reference to the Sofa, and the view has a
reference to its Sofa.  So I think this is just an integer comparison
between these two values.

Yes, but it's a check that is redundant in 99% of all cases. We could also handle this at the iterator end of things, with an option that checks sofa/view membership. We keep piling these things on, and we have enough problems selling UIMA performance as is.
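
To make the two options concrete, here is a minimal sketch in Java. It assumes a toy model: `SofaCheckSketch`, `Annotation`, `View`, and every method name are hypothetical illustrations, not UIMA API. Option 1 is the single integer comparison at add time; option 2 leaves the add path unchecked and filters at read time only when the caller asks for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Toy model for illustration only; none of these classes are UIMA API.
final class SofaCheckSketch {

    /** An annotation that stores the integer reference of its Sofa. */
    static final class Annotation {
        final int sofaRef;
        final String label;

        Annotation(int sofaRef, String label) {
            this.sofaRef = sofaRef;
            this.label = label;
        }
    }

    /** A view anchored to one Sofa, holding a single unsorted index. */
    static final class View {
        final int sofaRef;
        private final List<Annotation> index = new ArrayList<>();

        View(int sofaRef) {
            this.sofaRef = sofaRef;
        }

        /** Option 1: reject at add time; the check is one int comparison. */
        void addToIndexChecked(Annotation a) {
            if (a.sofaRef != sofaRef) {
                throw new IllegalArgumentException(
                    "annotation refers to Sofa " + a.sofaRef
                        + ", but this view is anchored to Sofa " + sofaRef);
            }
            index.add(a);
        }

        /** No check on add; membership is left to the reader of the index. */
        void addToIndexUnchecked(Annotation a) {
            index.add(a);
        }

        /** Option 2: filter on Sofa membership at read time, only if requested. */
        List<Annotation> annotations(boolean checkSofaMembership) {
            if (!checkSofaMembership) {
                return new ArrayList<>(index);
            }
            return index.stream()
                .filter(a -> a.sofaRef == sofaRef)
                .collect(Collectors.toList());
        }
    }
}
```

In this sketch the add-time check costs one comparison per insertion whether or not it is ever needed, while the read-time filter costs nothing unless the option is switched on.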


This constraint is mentioned in the OASIS spec:  an "anchored view" is
a view that's tied to a Sofa, and it is a constraint that all
annotations that are members of an anchored view refer to that view's
Sofa.

A really simple approach would be to say that there are view-local index
definitions, and CAS-global index definitions.  For the view-local ones,
each view would have its own instance (and every view would have one).
For the CAS-global ones, there would be one instance in the CAS, shared
by all views.  However, that is just my current naive view of things.
Much more complicated schemes could be envisioned.
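
The simple scheme described above can be sketched as follows. This is a toy model under stated assumptions: `IndexScopeSketch`, `Scope`, `Cas`, `View`, `defineIndex`, and `getIndex` are hypothetical names, indexes are plain lists of strings, and nothing here reflects the actual UIMA index repository.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; none of these classes are UIMA API.
final class IndexScopeSketch {

    enum Scope { VIEW_LOCAL, CAS_GLOBAL }

    static final class Cas {
        private final Map<String, Scope> definitions = new HashMap<>();
        private final Map<String, List<String>> globalInstances = new HashMap<>();
        private final Map<String, View> views = new HashMap<>();

        /** Registers an index definition; global ones get their single instance here. */
        void defineIndex(String label, Scope scope) {
            definitions.put(label, scope);
            if (scope == Scope.CAS_GLOBAL) {
                globalInstances.putIfAbsent(label, new ArrayList<>());
            }
        }

        /** Returns the named view, creating it on first use. */
        View getView(String name) {
            return views.computeIfAbsent(name, n -> new View(this));
        }
    }

    static final class View {
        private final Cas cas;
        private final Map<String, List<String>> localInstances = new HashMap<>();

        View(Cas cas) {
            this.cas = cas;
        }

        /** Resolves a label to the shared instance or to this view's own instance. */
        List<String> getIndex(String label) {
            Scope scope = cas.definitions.get(label);
            if (scope == null) {
                throw new IllegalArgumentException("no such index: " + label);
            }
            if (scope == Scope.CAS_GLOBAL) {
                return cas.globalInstances.get(label);
            }
            // View-local: every view gets its own instance, created lazily.
            return localInstances.computeIfAbsent(label, l -> new ArrayList<>());
        }
    }
}
```

With one view-local and one CAS-global definition, an entry added through one view's local index is invisible from another view, while an entry added to the global index is visible from every view because all views resolve the label to the same instance.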


I'm not too worried about the specifiers.  A scheme like this would be
fine and fairly easy to add, if we first decide that this idea of
separate local/global index definitions is the way we want to go.

I am worried about our specifiers because of their complexity. To this day, I have not fully understood the parameter settings in our specifiers, for example -- and I know I'm not the only one. The more complexity we add, the higher the barrier to entry for a new UIMA user.

Marshall Schor wrote:
Re: Need for "Global indexes"
<snip>
What is the use case for the global view set of indexes? I can't recall
the use-case for this, beyond
being able to get all the data.   This thread has suggested other
utilities that can effectively
"merge" the results from other views' index instances. Are there other
use cases?

A hypothetical use case is that I want to get all Person mentions
(annotations) in the CAS, say because I'm going to populate a database
with their covered text and perhaps other feature values.

Of course, you could walk all views to do that.  But I'm suggesting
you shouldn't have to.  We could add a utility method to hide that
detail; I guess I'm OK with that.
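
A utility of that kind might look roughly like the sketch below. It is written against a toy model rather than the real CAS: `AllViewsSketch`, `Annotation`, `View`, and `coveredTextOfType` are hypothetical names, and each view is assumed to hold its Sofa text plus a flat list of annotations.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; none of these classes are UIMA API.
final class AllViewsSketch {

    /** An annotation with a type name and a character span in its view's text. */
    static final class Annotation {
        final String type;
        final int begin;
        final int end;

        Annotation(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** A view holding its Sofa text and the annotations indexed in it. */
    static final class View {
        final String sofaText;
        final List<Annotation> annotations = new ArrayList<>();

        View(String sofaText) {
            this.sofaText = sofaText;
        }

        String coveredText(Annotation a) {
            return sofaText.substring(a.begin, a.end);
        }
    }

    /** Walks every view so that the caller does not have to. */
    static List<String> coveredTextOfType(List<View> views, String type) {
        List<String> result = new ArrayList<>();
        for (View view : views) {
            for (Annotation a : view.annotations) {
                if (a.type.equals(type)) {
                    result.add(view.coveredText(a));
                }
            }
        }
        return result;
    }
}
```

The caller passes the type name once and receives the covered text from all views in view order; the per-view iteration stays hidden inside the helper.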

Basically, this discussion is more about getting the concepts straight
than adding new functionality.  I'll say again:

(1) The CAS is the container for all of the analysis data (as per the
UIMA spec).  It must be possible to create FS directly on the CAS
and there must be some reasonable way to retrieve the FS in the CAS
without having to be concerned with views.

This seems to be an important point, and one that I still haven't really understood. Why is this necessary? An anchored view is the only way to contain a subject of analysis. UIMA without sofas (in the conceptual sense) is nothing. Why do I need to be able to access annotations without being concerned about views? Conceptually and in an ideal world, that is. Don't get me wrong, I'm not opposed to this. I simply don't understand the motivation, and I would like to.

--Thilo
