On 5/31/2012 10:55 AM, Thilo Goetz wrote:
On 31/05/12 16:30, Marshall Schor wrote:
Hi Richard,
I'm generally in favor of bringing the concepts of uimaFIT into UIMA. I
would like to have a discussion on the pros/cons of alternatives to
doing this.
Here are some topics for discussion; these might reasonably end up in
separate threads at some point :-)
1) uimaFIT already has an established user base. This means it will
have backwards-compatibility requirements for that community. This will
be a factor in how we proceed to integrate the concepts into UIMA.
1a) If we find, through broader community discussion and involvement, other
directions we want to align the uimaFIT concepts with, it seems we might
end up with both an "old-style" uimaFIT - supporting the existing user
base, paying close attention to backwards compatibility and/or
migration, and a (likely incompatible) new-style integration of the
uimaFIT concepts into UIMA.
2) Bringing the code for uimaFIT into the UIMA project: It seems it
would initially go into the Sandbox until IP issues (if any) are
resolved, and then be moved to the add-ons, since it is already a mature
thing with users. If we then go down the path envisioned in 1a), we
would have some kind of parallel development in UIMA of the concepts.
I don't understand the bit about the sandbox. We'll need a code grant,
which we'll only accept if we're satisfied there are no IP issues. It
won't go into subversion before then, and afterwards we can stick it
wherever we want. Right?
Yes, I guess I was thinking it might need some "work" before it was ready to get
"released". If this work would complete before the next time we wanted to
release all of the add-ons (which, by the way, we might not do - we have the
option of releasing just individual pieces), then it could go directly into add-ons.
-Marshall
Here's one example of a possible future alternative: uimaFIT uses
special files to collect type system description references together
where they can be found by convention. We have previously been looking
into various approaches to better manage issues around classpaths (we
currently have PEARs), based on OSGi support, including hooking up with
repositories (the idea being that UIMA components could live in Maven
repositories, and be "reusable" by reference, using the Maven schemes of
identifying artifacts by version). A key part of this is the "version"
management. As we go forward, we might find some interesting ways of
integrating type system information that are well aligned with these
kinds of conventions.
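To make the "found by convention" idea concrete, here is a minimal sketch in plain Java. It is not uimaFIT's implementation: the manifest location `META-INF/example/types.txt`, the class name `TypeManifestSketch`, the `#` comment syntax, and the sample entries are all assumptions chosen for illustration. The idea is only that every jar on the classpath may ship a small text file at an agreed location, and that a loader collects the type system description references listed in all of them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.stream.Collectors;

public class TypeManifestSketch {

    // Hypothetical conventional location; the real uimaFIT path may differ.
    static final String MANIFEST = "META-INF/example/types.txt";

    // Reads one manifest: one reference per line, skipping blank lines
    // and lines starting with '#' (the comment syntax is an assumption).
    static List<String> parseManifest(BufferedReader reader) {
        return reader.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                .collect(Collectors.toList());
    }

    // Collects the references from every manifest visible to the class loader.
    static List<String> discover(ClassLoader loader) {
        List<String> references = new ArrayList<>();
        try {
            Enumeration<URL> manifests = loader.getResources(MANIFEST);
            while (manifests.hasMoreElements()) {
                URL url = manifests.nextElement();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    references.addAll(parseManifest(reader));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return references;
    }

    public static void main(String[] args) {
        // Parse an in-memory manifest so the sketch runs without any jars.
        String sample = "# type system descriptors shipped with this jar\n"
                + "classpath*:desc/types/**/*.xml\n"
                + "\n"
                + "classpath*:org/example/Types.xml\n";
        System.out.println(parseManifest(new BufferedReader(new StringReader(sample))));

        // Scan the actual classpath; this is typically empty unless a jar
        // ships a file at the assumed location.
        System.out.println(discover(Thread.currentThread().getContextClassLoader()));
    }
}
```

A scheme like this carries no version information at all, which is where the Maven-style artifact identification mentioned above could come in.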
3) If we "open up" the discussion around uimaFIT-inspired improvements
to UIMA, I can see pro/con arguments for moving uimaFIT to Apache:
3a1) Pro: It would require the IP "vetting" of the uimaFIT code base,
and thus make that code more re-usable inside UIMA.
3a2) Pro: It could likely lead to some new committers :-)
3a3) Pro: It would likely result in increased focus/attention on making
progress in this area.
3b1) Con: It may confuse UIMA users somewhat, as to what we're doing.
I guess this is all normal, and falls under the category "managing
change" :-).
Why don't we take this one step at a time? First, a statement from the
UIMA developers on whether they intend to accept this contribution. I've heard
no negative voices so far. Second, the formal code grant and acceptance
vote. Third, a uimaFIT release as part of the next UIMA release.
That seems pretty straightforward to me. When we have all that under
our belts, we can start discussing if and how we want to move uimaFIT
closer to the core.
What do the uimaFIT community and developers want with regard to point (1)
above?
-Marshall
On 5/25/2012 2:22 AM, Richard Eckart de Castilho wrote:
Hello everybody,
we would like to propose the contribution of uimaFIT to Apache UIMA.
uimaFIT provides an API that facilitates using UIMA embedded in other
Java code, which is also helpful for unit tests. It also provides
context injection features, such as annotations on class member fields
in UIMA components, which are then initialized from the UIMA context.
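To illustrate what context injection via field annotations means, here is a minimal, self-contained sketch in plain Java. It does not use uimaFIT or UIMA classes: the annotation `ConfigParam`, the component `Tokenizer`, the `inject` helper, and the use of a `Map` in place of the UIMA context are all hypothetical stand-ins, chosen only to show the mechanism of filling annotated fields by reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class ContextInjectionSketch {

    // Hypothetical marker annotation; not uimaFIT's actual annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface ConfigParam {
        String name();
        boolean mandatory() default false;
    }

    // Hypothetical component whose annotated fields are filled from a context.
    static class Tokenizer {
        @ConfigParam(name = "language", mandatory = true)
        String language;

        @ConfigParam(name = "lowercase")
        boolean lowercase = false;
    }

    // Copies values from the context map (a stand-in for the UIMA context)
    // into every annotated field of the target object.
    static void inject(Object target, Map<String, Object> context) {
        for (Field field : target.getClass().getDeclaredFields()) {
            ConfigParam param = field.getAnnotation(ConfigParam.class);
            if (param == null) {
                continue; // field is not a configuration parameter
            }
            Object value = context.get(param.name());
            if (value == null) {
                if (param.mandatory()) {
                    throw new IllegalArgumentException(
                            "Missing mandatory parameter: " + param.name());
                }
                continue; // optional parameter keeps its default value
            }
            try {
                field.setAccessible(true);
                field.set(target, value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        context.put("language", "de");
        context.put("lowercase", Boolean.TRUE);

        Tokenizer tokenizer = new Tokenizer();
        inject(tokenizer, context);
        System.out.println(tokenizer.language + " " + tokenizer.lowercase); // prints "de true"
    }
}
```

The point of the pattern is that a component declares its parameters once, on the fields themselves, instead of reading each value from the context by hand in its initialization code.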
uimaFIT is already a proven product in its own right and has an established
user base (cf. [1] [2] [3] [4] [5] [6] [7] [8]). But we think that
having uimaFIT hosted with Apache UIMA would allow and encourage more
people to use it and we could get more feedback that can be used to
further improve uimaFIT.
It has been quite some time since we last talked about uimaFIT. Since
then, I have done another pass over the uimaFIT documentation in its
wiki. I don't claim it to be perfect, but I think it is better organized
now and contains less duplicate information. Also, I finally have the
time to address whatever work may be necessary in the
contribution process.
We only have a rough idea about the process and its requirements. If
you are interested, I'd be happy to talk about it.
Best regards,
-- Richard
[1] http://jochenleidner.posterous.com/are-you-fit-for-uima-uimafit-provides-support
[2] http://www.uima-hpc.de/en/technical.html
[3] http://code.google.com/p/dkpro-core-asl/
[4] http://code.google.com/p/cleartk/
[5] http://biolemmatizer.sourceforge.net/
[6] http://rxinformatics.umn.edu/clas.html
[7] http://www.ohloh.net/p/uimafit
[8] http://search.maven.org/#search%7Cga%7C1%7Cuimafit