Hi Jobin,

I would love to see a Python interface for OpenNLP, whether it is via gRPC
or a native wrapper. I don't think I have any strong feelings toward one
more than the other. Perhaps others can weigh in.

OpenNLP saw a significant decrease in its user and developer communities
when most of NLP moved to Python a few years back. However, it remains a
very capable library and I think having easy access to it from Python would
benefit the NLP community.

Regardless of which approach is chosen, I think this would be a great
submission for Apache's Community over Code NA conference in September,
assuming the conference would fit your schedule and travel requirements.
The CFP is open until April 21. I think the other Apache Community over
Code conferences have their agendas already set for this year.

https://communityovercode.org/call-for-presentations/

Thanks,
Jeff


On Wed, Mar 12, 2025 at 8:53 AM Richard Zowalla <r...@apache.org> wrote:

> Hi,
>
> Yes. You summarized it correctly.
>
> The following services are currently implemented:
>
> - Sentence Detection
> - Tokenization
> - POS Tagging
>
> The rest of your proposal sounds valid to me.
>
> A student at our university is currently researching the performance of the
> gRPC implementation. That might give additional insights in the coming
> weeks or months.
>
> Regards,
> Richard
>
> > On 10.03.2025 at 14:59, Jobin Sabu <85jobins...@gmail.com> wrote:
> >
> > Dear Richard and Apache OpenNLP Developers,
> >
> > Thank you, Richard, for your valuable feedback and for pointing me to the
> > gRPC work in the sandbox. I’ve taken a closer look at the repository and
> > gained a better understanding of the current implementation. The concept
> > of using gRPC to enable backend interactions with OpenNLP is fascinating,
> > and I can see how this approach can benefit developers across multiple
> > languages.
> >
> > Based on my understanding, the sandbox already includes:
> > 1. A gRPC schema for OpenNLP services with generated Java stubs.
> > 2. A server implementation supporting tasks like POS tagging.
> > 3. An example Python client for interacting with the server.
> >
> > I find the idea of building on this foundation exciting. For my GSoC 2025
> > project, I’d like to propose focusing on **extending the gRPC approach**,
> > specifically by:
> > - Improving the Python client and packaging it into a library for
> > distribution via `pip`, making it easier for Python developers to
> > integrate OpenNLP into their workflows.
> > - Exploring additional OpenNLP features (e.g., Named Entity Recognition
> > or Sentence Detection) that can be added to the gRPC service.
> > - Enhancing documentation and providing real-world examples for
> > Python-based integrations.
> >
> > Alternatively, if the community sees more value in pursuing a native
> > Python wrapper, I’m open to exploring that as well. My primary goal is to
> > align my efforts with OpenNLP’s priorities and deliver something valuable
> > for the community.
> >
> > I’d love to hear your thoughts and suggestions on this approach. If there
> > are specific areas the community would like me to focus on, please let me
> > know so I can refine my proposal accordingly.
> >
> > Thank you again for your guidance and support. I’m eager to hear your
> > feedback and take the next steps toward preparing my GSoC application.
> >
> > **Best regards,**
> > Jobin Sabu
> > Email: 85jobins...@gmail.com
> >
> > On Mon, 10 Mar, 2025, 1:59 pm Richard Zowalla, <r...@apache.org> wrote:
> >
> >> Hi Jobin,
> >>
> >> Thanks for your interest in contributing to OpenNLP!
> >>
> >> You’re absolutely right: most existing Python wrappers are either outdated
> >> or unmaintained, so this is a valuable idea in general. That said, there
> >> has been some work in the sandbox to demonstrate OpenNLP as a gRPC service:
> >> https://github.com/apache/opennlp-sandbox/tree/main/opennlp-grpc
> >>
> >> With this approach, a Python client can be generated (and perhaps also
> >> put into pip) to communicate with an OpenNLP server. It might be worth
> >> exploring whether extending or improving this setup aligns with your goals.
> >>
> >> While a native Python wrapper is certainly an option, the gRPC approach
> >> in the sandbox is another viable path. I’d love to hear thoughts from
> >> others on this as well! WDYT?
> >>
> >> Regards,
> >> Richard
> >>
> >>
> >>> On 08.03.2025 at 08:53, Jobin Sabu <85jobins...@gmail.com> wrote:
> >>>
> >>> Dear OpenNLP Community,
> >>>
> >>> My name is Jobin Sabu, and I’m a student with a background in Python,
> >>> machine learning, and NLP. I’m excited about the opportunity to
> >>> participate in Google Summer of Code (GSoC) 2025 with Apache OpenNLP
> >>> and contribute to its development.
> >>>
> >>> I’d like to propose a project idea: developing a Python wrapper for
> >>> Apache OpenNLP. The goal is to make OpenNLP’s powerful Java-based NLP
> >>> features (e.g., tokenization, sentence detection, named entity
> >>> recognition) accessible to Python developers. This wrapper would
> >>> bridge Python and Java using libraries like JPype or Py4J, providing a
> >>> user-friendly interface and a pip-installable package.
> >>>
> >>> Here’s an outline of the project:
> >>> 1. Implement Python functions that map to OpenNLP’s core features.
> >>> 2. Ensure seamless interoperability between Python and Java.
> >>> 3. Develop detailed documentation, tutorials, and example scripts.
> >>> 4. Write unit tests for robustness and performance benchmarks.
> >>>
> >>> I believe this project will expand OpenNLP’s usability and attract
> >>> more developers from the Python community. I’d love to hear your
> >>> feedback on this idea. Does it align with the community’s goals? Are
> >>> there any specific areas I should focus on or challenges I should be
> >>> aware of?
> >>>
> >>> Thank you for your time and guidance. I look forward to contributing
> >>> to OpenNLP and learning from this amazing community.
> >>>
> >>> Best regards,
> >>> Jobin Sabu
> >>>
> >>> 85jobins...@gmail.com
> >>
> >>
>
>
