Hi OpenNLP devs,

I've opened OPENNLP-1833 to propose evolving the opennlp-sandbox gRPC POC into ASF-native modules with a canonical OpenNlpDocument message and a primary AnalyzeDocument RPC (org.apache.opennlp.grpc.v1).

JIRA: https://issues.apache.org/jira/browse/OPENNLP-1833

Background: OpenNLP today is primarily in-process (API, CLI, UIMA). The sandbox POC (opennlp-grpc) exposes three separate string-based services; the ticket proposes a unified document contract and server-side pipeline orchestration.

My primary goal is to integrate libraries written in other languages through a gRPC contract. The integration works in both directions: OpenNLP can use the client stubs to get data from the server, and the server can use OpenNLP to expose its API to other languages.

More specifically, I'd like to introduce options that use the GPU more directly for embeddings: CUDA for NVIDIA cards and OpenVINO for Intel cards. This would create a middle interface that can be hot-swapped on the server side. Of course, each of these would also be its own build.

I'm planning to work on this in phases, as outlined in the ticket:

- Phase 0/1: community RFC + design doc / full .proto definitions
- Phase 2+: implementation (I will work on this while we discuss phase 1, but I'm open to changes)

I'd appreciate feedback on a few points called out in the JIRA ticket. I can get a prototype up within a couple of weeks.

Sandbox reference: https://github.com/apache/opennlp-sandbox/tree/OPENNLP-1833-grpc-expansion

I'll post design updates and any draft .proto / docs to the ticket. Comments on the JIRA or replies to this thread are welcome, although JIRA is preferred.

Thanks,
Kristian
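
As a rough illustration of the hot-swappable middle interface mentioned above, here is a minimal Java sketch. Every name in it (EmbeddingBackend, CpuHashBackend, BackendRegistry, HotSwapDemo) is hypothetical and not taken from OpenNLP or the sandbox POC. The CPU backend is a toy stand-in so the example runs without CUDA or OpenVINO; a real CUDA or OpenVINO backend would presumably implement the same interface in its own build, and the registry would sit behind the gRPC service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical contract for an embedding backend; not an OpenNLP API.
interface EmbeddingBackend {
    String name();
    float[] embed(String text);
}

// Toy CPU backend: hashes whitespace-separated tokens into a fixed-size
// bag-of-words vector. It only stands in for a real CUDA/OpenVINO backend.
final class CpuHashBackend implements EmbeddingBackend {
    private final String name;
    private final int dims;

    CpuHashBackend(String name, int dims) {
        this.name = name;
        this.dims = dims;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public float[] embed(String text) {
        float[] vector = new float[dims];
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // leading whitespace yields an empty first token
            }
            vector[Math.floorMod(token.hashCode(), dims)] += 1.0f;
        }
        return vector;
    }
}

// Holds all registered backends and an atomically swappable "active" one,
// so requests in flight never observe a half-switched state.
final class BackendRegistry {
    private final Map<String, EmbeddingBackend> backends = new ConcurrentHashMap<>();
    private final AtomicReference<EmbeddingBackend> active = new AtomicReference<>();

    void register(EmbeddingBackend backend) {
        backends.put(backend.name(), backend);
        active.compareAndSet(null, backend); // first registration becomes active
    }

    void activate(String name) {
        EmbeddingBackend backend = backends.get(name);
        if (backend == null) {
            throw new IllegalArgumentException("unknown backend: " + name);
        }
        active.set(backend);
    }

    String activeName() {
        EmbeddingBackend backend = active.get();
        return backend == null ? null : backend.name();
    }

    float[] embed(String text) {
        EmbeddingBackend backend = active.get();
        if (backend == null) {
            throw new IllegalStateException("no backend registered");
        }
        return backend.embed(text);
    }
}

final class HotSwapDemo {
    public static void main(String[] args) {
        BackendRegistry registry = new BackendRegistry();
        registry.register(new CpuHashBackend("cpu-8", 8));
        registry.register(new CpuHashBackend("cpu-16", 16));

        System.out.println(registry.activeName() + " -> " + registry.embed("hello world").length);
        registry.activate("cpu-16"); // swap without restarting the server
        System.out.println(registry.activeName() + " -> " + registry.embed("hello world").length);
    }
}
```

The atomic reference is one possible design choice: callers always go through the registry, so switching backends is a single pointer swap rather than a server restart.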
