Thilo Goetz wrote:
As I see it, we're not going to reach consensus on this issue. I
guess this is at least in part due to the fact that we disagree on the
basic premises underlying this redesign. I am -1 to the current
proposal, and I'll give my reasons below. However, I think we've
already discussed most of what I have to say, and if everybody else
thinks the current proposal is a good idea, I will not stand in its
way. Anyway, here goes.
Thanks for trying to clearly restate your thinking :-)
<snip>
The switch from single-artifact CASes to multi-Sofa CASes and views
was a fundamental change in the basic UIMA architecture. We are not
doing our users a favor by hiding this change from them.
I think it's useful to recognize there are sofa-aware components, and
sofa-unaware components. Most of the components so far are sofa-unaware
ones - although this may change somewhat over time. I think it is
useful to make writing sofa-unaware components "easy" in the sense of not
forcing these to deal with multiple sofas/views concepts. This is what
is driving some of the design thinking, I believe, not just trying to be
"backward compatible". I also think this is useful for UIMA adoption -
in helping new users climb the learning curve - initially they would be
able to ignore multi-sofa/multi-views.
For sofa-aware components, I agree it is not useful to hide this.
By sacrificing a clean design to backward compatibility, we may keep
some existing users happy, but we're not going to gain any new ones.
If even UIMA developers find it that hard to get their heads wrapped
around the concepts and APIs, how much harder is it going to be for
new users?
Is the main sacrifice you see to "clean design" the inclusion of
forwarding methods in the CasView to avoid users having to pay attention to
casView.getCas(), and recognizing this distinction? If not, I may have
missed it...
If so, I've had more than one user tell me they don't like to have to
remember how we organize our objects at this level, to follow chains to
get to methods they want. They like the forwarding methods. Since
these APIs are mainly for the users, I would say we should be willing to
sacrifice some "cleanliness" for this.
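
To make the forwarding-method trade-off concrete, here is a minimal Java
sketch. The class and method names (SketchCas, SketchCasView,
getTypeNames) are made up for illustration and are not the actual UIMA
interfaces; the only point is the shape of the delegation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the CAS; NOT the real UIMA API.
final class SketchCas {
    private final List<String> typeNames = new ArrayList<>();

    void addType(String name) {
        typeNames.add(name);
    }

    // Returns a copy so callers cannot modify the internal list.
    List<String> getTypeNames() {
        return new ArrayList<>(typeNames);
    }
}

// Hypothetical stand-in for a view onto the CAS.
final class SketchCasView {
    private final SketchCas cas;
    private final String viewName;

    SketchCasView(SketchCas cas, String viewName) {
        this.cas = cas;
        this.viewName = viewName;
    }

    // The "clean" route: the user follows the chain explicitly.
    SketchCas getCas() {
        return cas;
    }

    String getViewName() {
        return viewName;
    }

    // Forwarding method: same result as getCas().getTypeNames(),
    // but the user does not need to know where the data lives.
    List<String> getTypeNames() {
        return cas.getTypeNames();
    }
}
```

With this sketch, view.getTypeNames() and view.getCas().getTypeNames()
return the same thing; the forwarding method only saves the user from
remembering which object owns the call.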
I think the difficulty UIMA developers find is not necessarily an
indication of the difficulty users would have. UIMA developers are
trying to keep multiple use cases of multiple kinds of users (e.g.
sofa-aware, sofa-unaware) in mind simultaneously, and come up with a
design which satisfies all of these somewhat conflicting goals at
once. (That's why we're having all these headaches :-) )
I question the need for backward compatibility for Sofa-unaware
annotators. Those days are over. This basic tenet robs us of the
ability to clean up the CAS APIs. For example, when I look at the CAS
APIs from a world where views are real, I naturally expect
CAS.getIndexRepository() to return all indexes in the CAS to me, not
just the ones for the default view.
In the case where the world is one where "views are real" - that
naturally feels like the case of a sofa-aware component.
- For sofa-aware components, I would expect this call to be invalid
(as a User), because Index-sets "belong" to Views, and the CAS isn't a
view.
- For sofa-unaware components, I would expect this to work as before -
this is more like the case where "views aren't real".
The CommonCas interface adds to the confusion, because it isn't (a
common CAS API). It follows the methodology that everything that can
be abstracted, is abstracted. However, that's not how people think.
We like to think in API groups and what things logically belong
together, not what can and can not be grouped because of method return
types. So all it does is add to the confusion because you always have
to look in two places for APIs.
Well, we can rename the CommonCas interface, if we can come up with a
better name. But I find that abstraction is very useful for ongoing
code maintenance, and understanding what is intended to be the same
and/or different among things. It often shows up design oversights -
something is done in one case and not "thought of" by the developer in
another case.
From a documentation point of view, I hope to have it both ways - by
describing the JCas and Cas APIs as including the CommonCas API (which
of course it does, as a super-interface). So the users won't need to
pay attention to this detail of abstraction; they can ignore the
CommonCas API as a separate entity. The current IDEs like Eclipse
support this, too (e.g., autocompletion shows the whole set).
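
As a rough illustration of the super-interface point, here is a small
Java sketch. All names in it (SharedOps, PlainCas, TypedCas and their
methods) are hypothetical placeholders, not the real CommonCas, Cas, or
JCas definitions.

```java
// Hypothetical names throughout; NOT the actual UIMA interfaces.

// Plays the role of the shared super-interface (CommonCas in the text).
interface SharedOps {
    String getDocumentText();
}

// Plays the role of one concrete API; adds a made-up method of its own.
interface PlainCas extends SharedOps {
    String getLowLevelHandle();
}

// Plays the role of the other concrete API; adds a different made-up method.
interface TypedCas extends SharedOps {
    String getTypedHandle();
}

// Maintenance benefit: a helper written once against the shared interface
// works for both PlainCas and TypedCas.
final class TextHelper {
    static int textLength(SharedOps cas) {
        return cas.getDocumentText().length();
    }
}

// Minimal implementation so the sketch can be exercised.
final class PlainCasImpl implements PlainCas {
    private final String text;

    PlainCasImpl(String text) {
        this.text = text;
    }

    @Override
    public String getDocumentText() {
        return text;
    }

    @Override
    public String getLowLevelHandle() {
        return "plain:" + text.length();
    }
}
```

A user holding a PlainCas sees getDocumentText() directly on it (and in
IDE autocompletion), without ever needing to name SharedOps.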
-Marshall