Quoting Eddie Epstein:
On Fri, Apr 10, 2015 at 1:49 PM, Nick Hill wrote:
My understanding was that the heap organization in Java was made to
resemble that in the UIMA C++ implementation and allowed for fast data
exchange between Java and C++. That is why I was asking about the fate of
UIMA C++. Again, maybe Marshall or Eddie can comment here.
Correct. Historically the C++ version preceded Java and interoperability
influenced the Java design. Given the limitations of running C++ in the
JNI, my inclination is to improve interprocess serialization performance
between Java and C++, perhaps using compressed binary form 6, delta CAS
and CAS projections. This would mean that standardizing on a CAS interchange
format is much more important than the in-memory implementation.
My assumption was that the overwhelming majority of UIMA usage was
Java-based. Is this a valid assumption? The only C++ related UIMA
integration I have seen has been JNI within a Java annotator/pipeline,
which completely bypasses the UIMA C++ framework anyhow (which also has the
advantage of minimizing the amount of data which is "interchanged").
Based on this I would not think it makes sense for the Java impl to pay
the heavy price of code footprint/complexity, reduced flexibility, etc. (as
previously enumerated) just to provide potentially faster data interchange
between Java and C++.
UIMACPP comes with two interfaces to a Java pipeline. There is a JNI
interface that instantiates the C++ annotator at init and uses a binary
serialization of the CAS through the JNI on each process call. One
limitation of the JNI implementation is that all C++ annotators have to
use the same version of UIMACPP. Running C++ code under Java can be
problematic, independent of the JNI interface itself.
The second interface to UIMACPP uses a native C++ UIMA-AS service wrapper
which exchanges XmiCas data with the Java client. With this wrapper the
entire process is native code.
No idea how widely UIMACPP is used, but there was a recent question asking
whether UIMA v2.7.0 was still compatible with it. It is.
Sorry Eddie, I think I missed the specific point you were making about
inter-process as a preferred alternative to JNI-based usage of
UIMACPP. Given that though, it sounds like the new obj CAS impl will
already work fine (since it's currently XMI-based) without any more
work needed.
Changing the UIMA C++ JMS code to support different serialization
formats would presumably only be worth doing anyhow if there is some
demand/requirement out there for it?