The combination of uimacpp + amq-cpp does support non-JNI interoperability
with uimaj. Although I am only familiar with using uimacpp as a remote
component for uimaj applications, it may not be hard for uimacpp to be an
application that uses uimaj remote annotators. It is not clear to me that
such CAS interface connectivity would be more useful than standard
messaging interfaces from Python to a Java-based process.

What scenarios do you have in mind?
Eddie


On Mon, Dec 12, 2022 at 7:01 AM Richard Eckart de Castilho <r...@apache.org>
wrote:

> Wrt. Eddie's question about roles.
>
> For the sake of discussion let's split UIMA into two parts:
>
> a) the CAS data structure and related serialization formats
> b) the rest, in particular annotators and means of running them
>
> The UIMA Java SDK supports both a and b.
>
> DKPro Cassis (Python) supports only a; it has no concept of annotators or
> pipelines and stuff.
> Cassis is not part of Apache UIMA, but I still list it here because it is
> currently probably the best/only UIMA-esque option available for Python.
>
> I am not sure what exactly UIMA-C++ supports.
> I believe the UIMA Java SDK can call out to UIMA-C++-based annotators and
> use them via JNI.
> E.g. does the UIMA C++ SDK allow building aggregate annotators?
>
> Currently, we do not have an option to call out to Python-based annotators
> from the UIMA Java SDK.
> In particular, at the time all the deep-learning frameworks were pouring
> in, there was a question of if/how to invoke these mostly Python-based
> frameworks from within UIMA pipelines. Meanwhile, there are Java bindings
> for TensorFlow, DeepLearning4J and other Java-friendly DL tools, so this
> gap has somewhat closed. However, few data scientists would at present
> build a Java-based pipeline calling out to Python. Engineers may do it, in
> particular when trying to integrate new methods into existing systems, but
> because Python is notoriously annoying to deploy (unless one Dockerizes
> stuff), they may prefer the native Java DL frameworks.
>
> Currently, we also do not have an option to build UIMA pipelines in
> Python. This might be interesting for data scientists to some degree, in
> particular if they like the offset-based annotation approach of UIMA. They
> could use the CAS implementation of DKPro Cassis and implement their own
> annotator/pipeline conventions from there.
>
> Would UIMA-CPP help Pythonistas to build pipelines in Python?
>
> I suppose UIMA-CPP brings its own CAS implementation, which may be faster
> (?) or more memory efficient (?) than the pure-Python implementation
> provided by Cassis. So if that is correct, a Python-friendly uimacpp may be
> something like a NumPy-style library where we have a Python API to a fast
> C++ implementation underneath?
>
> Would having UIMA-CPP with Python bindings allow implementing e.g. some
> Python/Hugging Face annotator and then calling it from the Java SDK?
>
> Vice versa, would it be possible to build a Python/UIMA-C++ aggregate
> annotator that calls a Java-based UIMA component?
>
> Any thoughts?
>
> Cheers,
>
> -- Richard
>
>
