Hi,

Yes, you summarized it correctly.

The following services are currently implemented:

- Sentence Detection
- Tokenization
- POS Tagging

The rest of your proposal sounds valid to me.

A student at our university is currently researching the performance of the
gRPC implementation.
That might yield additional insights over the coming weeks or months.

Regards
Richard

> On 10.03.2025 at 14:59, Jobin Sabu <85jobins...@gmail.com> wrote:
> 
> Dear Richard and Apache OpenNLP Developers,
> 
> Thank you, Richard, for your valuable feedback and for pointing me to the
> gRPC work in the sandbox. I’ve taken a closer look at the repository and
> gained a better understanding of the current implementation. The concept of
> using gRPC to enable backend interactions with OpenNLP is fascinating, and
> I can see how this approach can benefit developers across multiple
> languages.
> 
> Based on my understanding, the sandbox already includes:
> 1. A gRPC schema for OpenNLP services with generated Java stubs.
> 2. A server implementation supporting tasks like POS tagging.
> 3. An example Python client for interacting with the server.
> 
> I find the idea of building on this foundation exciting. For my GSoC 2025
> project, I’d like to propose focusing on **extending the gRPC approach**,
> specifically by:
> - Improving the Python client and packaging it into a library for
> distribution via `pip`, making it easier for Python developers to integrate
> OpenNLP into their workflows.
> - Exploring additional OpenNLP features (e.g., Named Entity Recognition or
> Sentence Detection) that can be added to the gRPC service.
> - Enhancing documentation and providing real-world examples for
> Python-based integrations.
> 
> Alternatively, if the community sees more value in pursuing a native Python
> wrapper, I’m open to exploring that as well. My primary goal is to align my
> efforts with OpenNLP’s priorities and deliver something valuable for the
> community.
> 
> I’d love to hear your thoughts and suggestions on this approach. If there
> are specific areas the community would like me to focus on, please let me
> know so I can refine my proposal accordingly.
> 
> Thank you again for your guidance and support. I’m eager to hear your
> feedback and take the next steps toward preparing my GSoC application.
> 
> **Best regards,**
> Jobin Sabu
> Email: 85jobins...@gmail.com
> 
> On Mon, 10 Mar, 2025, 1:59 pm Richard Zowalla, <r...@apache.org> wrote:
> 
>> Hi Jobin,
>> 
>> Thanks for your interest in contributing to OpenNLP!
>> 
>> You’re absolutely right—most existing Python wrappers are either outdated
>> or unmaintained, so this is a valuable idea in general. That said, there
>> has been some work in the sandbox to demonstrate OpenNLP as a gRPC service:
>> https://github.com/apache/opennlp-sandbox/tree/main/opennlp-grpc
>> 
>> With this approach, a Python client can be generated (and perhaps also put
>> into pip) to communicate with an OpenNLP server. It might be worth
>> exploring whether extending or improving this setup aligns with your goals.
>> 
>> While a native Python wrapper is certainly an option, the gRPC approach in
>> the sandbox is another viable path. I’d love to hear thoughts from others
>> on this as well! WDYT?
>> Regards
>> Richard
>> 
>> 
>>> On 08.03.2025 at 08:53, Jobin Sabu <85jobins...@gmail.com> wrote:
>>> 
>>> Dear OpenNLP Community,
>>> 
>>> My name is Jobin Sabu, and I’m a student with a background in Python,
>>> machine learning, and NLP. I’m excited about the opportunity to
>>> participate in Google Summer of Code (GSoC) 2025 with Apache OpenNLP
>>> and contribute to its development.
>>> 
>>> I’d like to propose a project idea: developing a Python wrapper for
>>> Apache OpenNLP. The goal is to make OpenNLP’s powerful Java-based NLP
>>> features (e.g., tokenization, sentence detection, named entity
>>> recognition) accessible to Python developers. This wrapper would
>>> bridge Python and Java using libraries like JPype or Py4J, providing a
>>> user-friendly interface and a pip-installable package.
>>> 
>>> Here’s an outline of the project:
>>> 1. Implement Python functions that map to OpenNLP’s core features.
>>> 2. Ensure seamless interoperability between Python and Java.
>>> 3. Develop detailed documentation, tutorials, and example scripts.
>>> 4. Write unit tests for robustness, along with performance benchmarks.
>>> 
>>> I believe this project will expand OpenNLP’s usability and attract
>>> more developers from the Python community. I’d love to hear your
>>> feedback on this idea. Does it align with the community’s goals? Are
>>> there any specific areas I should focus on or challenges I should be
>>> aware of?
>>> 
>>> Thank you for your time and guidance. I look forward to contributing
>>> to OpenNLP and learning from this amazing community.
>>> 
>>> Best regards,
>>> Jobin Sabu
>>> 
>>> 85jobins...@gmail.com
>> 
>> 
