Adam Lally wrote:
On 12/22/06, Marshall Schor <[EMAIL PROTECTED]> wrote:
<snip>
> Also, we have some uses of non-annotation indexes that are segregated
> by Sofa (say, a Lemma index that's particular to a Sofa, where there's
> actually no explicit link from the Lemma to the Sofa). A filtering
> approach wouldn't work there,
It could be made to work by adding a feature to the Lemma type which was
a sofa reference. But maybe that's asking too much of the user?
I'm not sure what is right here... this is a reasonable idea. But I
think in the absence of a clear sense of what is best, I lean towards
staying closer to where we currently are, which is to have views
where the user explicitly decides which view to index things in.
The whole point of those views, I thought, was to be able to segregate
the data. So if you want lemmas for a certain view to be separate from
the lemmas for different views, you should be able to achieve that with
a lemma index that is specific to that view. If you want to share
lemmas from two views, share the index between the views. That's my
mental model of how things should work. I like this better than adding
sofa references for the following reasons:
a) more space efficient, as there are no extra sofa references
b) more time efficient, as you don't need to check the sofa references
at indexing time
c) no more complicated, as the user has to reference something either
way: the view or the sofa.
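
To make that mental model concrete, here is a minimal sketch in plain
Java. It does not use the UIMA API; LemmaIndexDemo, LemmaIndex and View
are hypothetical stand-ins I made up for illustration. The only point
is that a lemma carries no sofa reference, and segregation or sharing
is decided by which index object a view holds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; none of these are UIMA classes.
final class LemmaIndexDemo {

    // A lemma index that knows nothing about sofas.
    static final class LemmaIndex {
        private final List<String> lemmas = new ArrayList<>();

        void add(String lemma) {
            lemmas.add(lemma);
        }

        List<String> all() {
            return new ArrayList<>(lemmas); // defensive copy
        }
    }

    // A view simply holds the index it was given.
    static final class View {
        final String name;
        final LemmaIndex lemmaIndex;

        View(String name, LemmaIndex lemmaIndex) {
            this.name = name;
            this.lemmaIndex = lemmaIndex;
        }
    }

    public static void main(String[] args) {
        // Segregated: each view gets its own index.
        View english = new View("english", new LemmaIndex());
        View german = new View("german", new LemmaIndex());
        english.lemmaIndex.add("go");
        german.lemmaIndex.add("gehen");
        System.out.println(english.name + ": " + english.lemmaIndex.all());
        System.out.println(german.name + ": " + german.lemmaIndex.all());

        // Shared: two views are handed the same index object.
        LemmaIndex shared = new LemmaIndex();
        View a = new View("a", shared);
        View b = new View("b", shared);
        a.lemmaIndex.add("run");
        System.out.println(b.name + " sees: " + b.lemmaIndex.all());
    }
}
```

Nothing is checked at indexing time here, which is the time-efficiency
argument in (b).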
This is how I would have done annotations as well. Maybe there are
considerations that I'm not aware of, but I see no benefit to each
annotation knowing what sofa it references. If the annotation is
indexed in a certain anchored view, the sofa of that view is what it
references. I understand that there may be data structures that need to
reference sofas explicitly, but I don't think annotations qualify, nor
do lemmas.
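
A rough sketch of what I mean, again in plain Java with made-up names
(ImplicitSofaSketch, Span, AnchoredView) rather than real UIMA classes:
the annotation stores only offsets, and the sofa it refers to is
whatever the anchored view it is indexed in holds. The String sofa is
an assumption to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; none of these are UIMA classes.
final class ImplicitSofaSketch {

    // An annotation with offsets but no sofa field.
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // An anchored view: one sofa (a plain String here) plus one index.
    static final class AnchoredView {
        private final String sofaText;
        private final List<Span> index = new ArrayList<>();

        AnchoredView(String sofaText) {
            this.sofaText = sofaText;
        }

        void addToIndex(Span span) {
            index.add(span);
        }

        // The sofa is taken from the view, not from the annotation.
        String coveredText(Span span) {
            return sofaText.substring(span.begin, span.end);
        }

        int indexSize() {
            return index.size();
        }
    }

    public static void main(String[] args) {
        AnchoredView view = new AnchoredView("The cat sat");
        Span cat = new Span(4, 7);
        view.addToIndex(cat);
        System.out.println(view.coveredText(cat));
    }
}
```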
Of course that would make a view-less approach from the global CAS that
much harder...
> So basically, is this equivalent to taking our current implementation
> of View and saying that the sofa is optional? (Which is more or less
> what the UIMA spec says.)
Well, it allows 2 or more Sofas to be indexed using a single
index-set (i.e., in a single view), which the current design doesn't.
My idea of how to do this would be to create a View without any Sofa
(a non-anchored view), and then you could add any annotations that you
want to it. There's no restriction on adding annotations to a
non-anchored view; the only restriction that we might have would be on
adding annotations to the "wrong" anchored view.
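
As a sketch of the idea (plain Java, hypothetical names, not the UIMA
API): OptionalSofaSketch.View has an optional sofa name, where null
means non-anchored. Annotations here do carry a sofa name, as in the
current design. The check in addToIndexes is only the restriction we
"might have"; it is an assumption for illustration, not existing
behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; none of these are UIMA classes.
final class OptionalSofaSketch {

    // An annotation that records which sofa it was created on.
    static final class SofaAnnotation {
        final String sofaName;
        final int begin;
        final int end;

        SofaAnnotation(String sofaName, int begin, int end) {
            this.sofaName = sofaName;
            this.begin = begin;
            this.end = end;
        }
    }

    static final class View {
        private final String sofaName; // null means non-anchored
        private final List<SofaAnnotation> index = new ArrayList<>();

        private View(String sofaName) {
            this.sofaName = sofaName;
        }

        static View anchored(String sofaName) {
            return new View(sofaName);
        }

        static View nonAnchored() {
            return new View(null);
        }

        void addToIndexes(SofaAnnotation annotation) {
            // Assumed restriction: only anchored views reject a foreign sofa.
            if (sofaName != null && !sofaName.equals(annotation.sofaName)) {
                throw new IllegalArgumentException(
                    "annotation on sofa '" + annotation.sofaName
                        + "' added to view anchored on '" + sofaName + "'");
            }
            index.add(annotation);
        }

        int size() {
            return index.size();
        }
    }

    public static void main(String[] args) {
        // One index set holding annotations over two different sofas.
        View combined = View.nonAnchored();
        combined.addToIndexes(new SofaAnnotation("english", 0, 3));
        combined.addToIndexes(new SofaAnnotation("german", 0, 5));
        System.out.println("non-anchored view size: " + combined.size());

        View englishView = View.anchored("english");
        try {
            englishView.addToIndexes(new SofaAnnotation("german", 0, 5));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```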
The user should be responsible for adding annotations to the correct
view. When dealing with annotations in anchored views, the user should
not have to worry about sofas at all, and neither should the index. As
annotations need to be created on the view, it seems only natural that
they will be indexed on the same view. If somebody insists on indexing
an annotation on the wrong view, that is their problem. You can also
create annotations with the wrong sofa, and there is no checking that
will prevent that.
--Thilo