Hi OpenNLP devs,

I've opened OPENNLP-1833 to propose evolving the opennlp-sandbox gRPC
POC into ASF-native modules with a canonical OpenNlpDocument message and
a primary AnalyzeDocument RPC (org.apache.opennlp.grpc.v1).

JIRA: https://issues.apache.org/jira/browse/OPENNLP-1833

Background: OpenNLP today is primarily in-process (API, CLI, UIMA).
The sandbox POC (opennlp-grpc) exposes three separate string-based
services; the ticket proposes a unified document contract and server-side
pipeline orchestration.

My primary goal is to let libraries written in other languages interoperate
with OpenNLP through a gRPC contract.  This works in both directions:
OpenNLP can use the generated client stubs to get data from a server, and
the server can use OpenNLP to expose its API to other languages.

To be more specific: I'd like to introduce options that use the GPU more
directly for embeddings: CUDA for NVIDIA cards and OpenVINO for Intel
cards.  This would create a middle interface whose backends can be
hot-swapped on the server side.  Of course, each of these backends would
be its own separate build.
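
As a rough sketch of the hot-swappable middle interface described above,
something like the following could work in Java.  All names here
(EmbeddingBackend, BackendRegistry, CpuBackend, FakeGpuBackend) are
hypothetical; none of them come from the ticket or the sandbox code, and
the "GPU" backend is only a placeholder that makes no CUDA or OpenVINO
calls.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical backend interface; not part of OpenNLP or the ticket. */
interface EmbeddingBackend {
    String name();
    float[] embed(String text);
}

/** Stand-in CPU backend that returns a toy 2-dimensional vector. */
final class CpuBackend implements EmbeddingBackend {
    public String name() { return "cpu"; }
    public float[] embed(String text) {
        return new float[] { text.length(), text.isEmpty() ? 0f : text.charAt(0) };
    }
}

/** Placeholder for a separately built GPU backend; it only mimics the shape. */
final class FakeGpuBackend implements EmbeddingBackend {
    public String name() { return "gpu-placeholder"; }
    public float[] embed(String text) {
        return new float[] { text.length() * 2f, 0f };
    }
}

/** Server-side holder that lets the active backend be replaced atomically. */
final class BackendRegistry {
    private final AtomicReference<EmbeddingBackend> active;

    BackendRegistry(EmbeddingBackend initial) {
        active = new AtomicReference<>(initial);
    }

    /** Requests that start after this call see the new backend. */
    void swap(EmbeddingBackend next) { active.set(next); }

    String activeName() { return active.get().name(); }

    float[] embed(String text) { return active.get().embed(text); }
}

public class HotSwapSketch {
    public static void main(String[] args) {
        BackendRegistry registry = new BackendRegistry(new CpuBackend());
        System.out.println(registry.activeName() + " -> " + registry.embed("hello").length);

        // Swap the backend without restarting the (imaginary) server.
        registry.swap(new FakeGpuBackend());
        System.out.println(registry.activeName() + " -> " + registry.embed("hello").length);
    }
}
```

The gRPC service implementation would only ever talk to the registry, so
which backend build is loaded stays a server-side deployment detail.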

I'm planning to work on this in phases as outlined in the ticket:

   - Phase 0/1: community RFC + design doc / full .proto definitions
   - Phase 2+: implementation (I will work on this while we discuss Phase 1,
   but I am open to changes)

I'd appreciate feedback on a few points called out in the JIRA ticket.

I can get a prototype up within a couple of weeks.

Sandbox reference:

https://github.com/apache/opennlp-sandbox/tree/OPENNLP-1833-grpc-expansion

I'll post design updates and any draft .proto / docs to the ticket.
Comments on the JIRA or replies to this thread are both welcome, although
JIRA is preferred.

Thanks,
Kristian
