Michael Baessler wrote:
Adam Lally wrote:
On 3/21/07, Michael Baessler <[EMAIL PROTECTED]> wrote:
Let me understand the real issue here. When the CAS is created, it gets a
ClassLoader that is used to locate the JCas classes.
As far as I know, the CAS stores the references to the JCas classes,
right?
Yes.
So when the aggregate creates these references in the CAS, the pear
runtime wrapper with the UIMAClassLoader cannot use its own JCas
classes, since they are not referenced in the CAS. Is that also right?
So it would be nice if the JCas references in the CAS could be changed
later. Will that be possible?
If so, would it be possible to provide a CAS.reinitialize(ClassLoader)
method to reinitialize the JCas classes in the CAS when the ClassLoader
changes?
This method can either be called by an application or by the UIMA pear
runtime wrapper.
With that, would it be possible to keep a JCas reference map within
the CAS for each ClassLoader that is passed to the CAS? In that case we
could just switch the JCas references when the ClassLoader changes, and
would not need to reinitialize everything again except the first time.
What do you think?
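To make the proposal concrete, here is a minimal sketch of a per-ClassLoader reference map. It is not UIMA code: the class JCasClassRegistry and its methods switchClassLoader, coverClassFor and initCount are hypothetical names, and the real CAS holds more state than a name-to-Class table. It only shows that the expensive lookup happens once per ClassLoader and that later switches are a plain map lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the UIMA API. */
final class JCasClassRegistry {
    private final List<String> coverClassNames;
    // One name -> Class table per ClassLoader, keyed by loader identity.
    private final Map<ClassLoader, Map<String, Class<?>>> tables = new IdentityHashMap<>();
    private Map<String, Class<?>> active = new HashMap<>();
    private int initCount = 0;

    JCasClassRegistry(List<String> coverClassNames) {
        this.coverClassNames = new ArrayList<>(coverClassNames);
    }

    /** Makes the table for this loader active, building it only on first use. */
    void switchClassLoader(ClassLoader loader) {
        Map<String, Class<?>> table = tables.get(loader);
        if (table == null) {
            table = loadAll(loader);
            tables.put(loader, table);
        }
        active = table;
    }

    private Map<String, Class<?>> loadAll(ClassLoader loader) {
        initCount++;
        Map<String, Class<?>> table = new HashMap<>();
        for (String name : coverClassNames) {
            try {
                // Resolve the class through the given loader without initializing it.
                table.put(name, Class.forName(name, false, loader));
            } catch (ClassNotFoundException e) {
                // No cover class visible from this loader; leave the entry out.
            }
        }
        return table;
    }

    /** Returns the class resolved by the active loader, or null if there is none. */
    Class<?> coverClassFor(String name) {
        return active.get(name);
    }

    /** Number of times a full table had to be built. */
    int initCount() {
        return initCount;
    }
}
```

Switching from loader A to loader B and back to A builds two tables, not three.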
In theory it seems possible; Marshall would have to comment on how
difficult it is.
There's a further complication which is that JCas-generated class
_instances_ are also cached in the JCas object. So if a JCas-based
annotator creates an annotation Java object, and the application later
uses an iterator to retrieve that object, it would get the same Java
object back. Obviously this won't work if the application and
annotator don't share the same ClassLoader for accessing that class.
We could work around that, too, by clearing out the cache if the
ClassLoader changes, or using a different cache for each ClassLoader.
This wouldn't allow sharing additional data in Java fields that were
manually added to the JCas-generated class, although that doesn't work
when serialization is involved anyway, so may not be a big loss.
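As a rough sketch of the "different cache for each ClassLoader" variant: the class CoverObjectCache, the integer feature-structure address used as the key, and the factory callback are all assumptions made for illustration, not the actual JCas cache. The point is only that an object created under one ClassLoader is never handed out while another ClassLoader is active.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical instance cache for illustration; not the real JCas cache. */
final class CoverObjectCache {
    // One address -> Java object cache per ClassLoader.
    private final Map<ClassLoader, Map<Integer, Object>> caches = new IdentityHashMap<>();
    private Map<Integer, Object> active;

    CoverObjectCache(ClassLoader initial) {
        switchClassLoader(initial);
    }

    /** Activates the cache that belongs to this loader, creating it if needed. */
    void switchClassLoader(ClassLoader loader) {
        Map<Integer, Object> cache = caches.get(loader);
        if (cache == null) {
            cache = new HashMap<>();
            caches.put(loader, cache);
        }
        active = cache;
    }

    /** Returns the cached object for this address, creating it on first access. */
    Object get(int fsAddress, IntFunction<Object> factory) {
        Object cached = active.get(fsAddress);
        if (cached == null) {
            cached = factory.apply(fsAddress);
            active.put(fsAddress, cached);
        }
        return cached;
    }
}
```

As noted above, any extra Java fields set on an object in one cache are not visible on its counterpart in another cache.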
There's a lot of additional complexity involved in doing this. Is it
worth it? I'm not sure.
Hi Marshall, can you please comment on this? Will that be possible, or
does it have any impact?
Sorry for the late comment on this.
Let me restate what seems to me to be the goal. In a UIMA application,
there is a top level descriptor - be it a CPE descriptor, or a top level
aggregate. The goal is to allow one or more of the delegates or CAS
processors (for a CPE) to be specified as a PEAR file, which in turn
could be an aggregate or primitive analysis engine.
There may (or may not) be an additional goal of allowing one version of
JCas cover classes for CAS types to be used by the PEAR
aggregate/primitive, and a different version of those JCas cover classes
to be used by the rest of the UIMA application.
It seems to me that this, although doable, is expensive (in terms of
performance/space), and I would guess this isn't a real need. (If it is
a real need, there is always a fall-back - deploying the PEAR as a
"service", implying all the serialization/deserialization of CASes being
sent and returned from it.)
If this is not a real need, then the "assembler" who is assembling the
pipeline using PEAR and non-PEAR things would need to check that the
JCas cover classes being used were the same implementation.
If that were the case, the UIMA framework support for running PEARs as
components in a pipeline could, when setting up the pipeline, ask the
PEAR for its classpath, and include that in the overall class path it is
setting up for the pipeline, and specify the combined classpath in the
resourcemanager, when it sets up the CAS initially.
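A minimal sketch of the classpath-combining step, using only the JDK. The class CombinedClasspath and its methods merge and toClassLoader are hypothetical names; how the PEAR reports its classpath and how the result is handed to the resource manager are left out, since that depends on the framework.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the UIMA API. */
final class CombinedClasspath {

    /** Joins two classpath strings, keeping order and dropping duplicate entries. */
    static List<String> merge(String pipelineClasspath, String pearClasspath) {
        Set<String> entries = new LinkedHashSet<>();
        for (String classpath : new String[] {pipelineClasspath, pearClasspath}) {
            for (String entry : classpath.split(File.pathSeparator)) {
                if (!entry.isEmpty()) {
                    entries.add(entry);
                }
            }
        }
        return new ArrayList<>(entries);
    }

    /** Builds a single ClassLoader over the merged entries. */
    static URLClassLoader toClassLoader(List<String> entries, ClassLoader parent) {
        URL[] urls = new URL[entries.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = new File(entries.get(i)).toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad classpath entry: " + entries.get(i), e);
            }
        }
        return new URLClassLoader(urls, parent);
    }
}
```

Because the pipeline entries come first, a cover class present in both classpaths resolves to the pipeline's copy, which is why the assembler still has to check that both sides use the same implementation.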
Would this address the issues raised? (I've probably not quite followed
what it is you're trying to do, so please clarify :-)
-Marshall
Thanks Michael