Let me step in here, since I was the one advocating to the team for many
of these changes...

There are three major problems with the current approach of having a Java
API on top of a Scala-based implementation:
- It significantly increases complexity, especially on the performance
side (constant conversion between Java and Scala collections, etc.).
- The Java community is just not interested in NLP in general... It's a bit
strange to me, but the lack of interest in the project can be at least
partially attributed to the focus on Java.
- The project needs a focused core target group. For example, many Go and
Rust projects have greatly benefited from the interest of their core
language communities.

Apache Spark is a prime example: an initial focus on, and growth from, the
core Scala community, with Java/Python frontends added later.

My two cents,
--
Nikita Ivanov



On Fri, Jun 10, 2022 at 6:18 AM Paul King <pa...@asert.com.au> wrote:

> Okay, that makes sense. Is there a plan to release the "current
> master" or is it just a stepping stone to the "next step" which is
> when the next release will come?
>
> Cheers, Paul.
>
> On Fri, Jun 10, 2022 at 10:23 PM Kamov Sergey <skhdlem...@gmail.com>
> wrote:
> >
> > Sorry for the confusion.
> >
> > - the last release (0.9.0) is a Java client/server system
> > - the current version in the 'master' branch (still unreleased) is a
> > simple Java API library (without client/server)
> > - the next step, which we are discussing, is a simple Scala API library
> > (like the current master version, but with a Scala API instead of Java)
> >
> > Regards,
> > Sergey
> >
> >
> > On 10.06.2022 15:13, Paul King wrote:
> > > So, just for my own understanding, is the server Java but the client
> > > would be Scala?
> > > Not questioning the decision but the first email in this thread said:
> > >
> > >> After these changes NlpCraft becomes simple library with java API.
> > > I am actually seeing a bit of a renaissance of Java for Data Science
> > > with numerous new projects like Amazon's DJL opting for Java as the
> > > base language.
> > >
> > > Disclosure, for data science, I mostly use Groovy as a "Python for the
> > > JVM", so that probably skews the world I see. Most of the NLP folks I
> > > speak to use Python these days. And concurring with Sergey, Stanford
> > > and OpenNLP are probably the two more widely used Java libraries I see
> > > for those folks on the JVM with Datumbox and Smile occasionally used
> > > as well.
> > >
> > > Cheers, Paul.
> > >
> > > On Wed, Jun 8, 2022 at 5:49 AM Kamov Sergey <skhdlem...@gmail.com> wrote:
> > >> Hi!
> > >>
> > >> All Google searches like "NLP libraries" show that Python is by far
> > >> the most popular (no competition).
> > >>
> > >> The first results for me:
> > >>
> > >> https://www.upgrad.com/blog/python-nlp-libraries-and-applications/
> > >>
> > >> https://medium.com/nlplanet/awesome-nlp-21-popular-nlp-libraries-of-2022-2e07a914248b
> > >>
> > >> Java is mentioned for Stanford, and sometimes for Apache OpenNLP.
> > >>
> > >>
> > >> Regards,
> > >>
> > >> Sergey
> > >>
> > >> On 07.06.2022 19:21, Furkan KAMACI wrote:
> > >>> Hi Sergey,
> > >>>
> > >>> Is there any survey about which programming languages are popular
> > >>> among NLP developers?
> > >>>
> > >>> Kind Regards,
> > >>> Furkan KAMACI
> > >>>
> > >>> On 7 Jun 2022 Tue at 17:37 Kamov Sergey <skhdlem...@gmail.com> wrote:
> > >>>
> > >>>> Hi
> > >>>>
> > >>>> One more important thing. We want to support only a Scala API in
> > >>>> the next version of the library.
> > >>>> It now seems better to narrow the technological focus too. The
> > >>>> current approach, a Java API over a Scala implementation, forces a
> > >>>> lot of technical compromises (collection conversions, performance
> > >>>> issues, etc.).
> > >>>> At the same time, supporting a Java API doesn’t give us
> > >>>> significant benefits, because Java is not that popular among NLP
> > >>>> engineers. Focusing on Scala allows a more elegant user API and
> > >>>> implementation, and we can also promote this solution to members
> > >>>> of the not-so-big but active Scala community.
> > >>>> If the library is successful, we can always add Java API support
> > >>>> again on top of the Scala layer.
> > >>>>
> > >>>> Regards,
> > >>>>
> > >>>> Sergey Kamov
> > >>>>
> > >>>>
> > >>>> On 04.06.2022 17:56, Kamov Sergey wrote:
> > >>>>> Hi
> > >>>>> I want to enumerate the changes in the next NlpCraft release.
> > >>>>>
> > >>>>> Main goals of the next release:
> > >>>>>    - Simplifying system usage.
> > >>>>>    - Narrowing the focus to NLP and deleting all unrelated,
> > >>>>>      auxiliary components.
> > >>>>>    - Possibility of custom multi-language support.
> > >>>>>    - Simplifying the code and minimizing technical debt.
> > >>>>>
> > >>>>> 1. Removed
> > >>>>>    - Client-server components and server cluster support.
> > >>>>>    - Any database usage.
> > >>>>>    - CLI management console.
> > >>>>>    - Docker-related stuff.
> > >>>>>    - Complex semantic components support.
> > >>>>> After these changes, NlpCraft becomes a simple library with a
> > >>>>> Java API.
> > >>>>>
> > >>>>> 2. Added and changed
> > >>>>> Pluggability support was added for all components, including
> > >>>>> basic ones such as the tokenizer, with default EN implementations
> > >>>>> for all of them.
> > >>>>> Note that component testing was also significantly simplified,
> > >>>>> which is especially useful for custom user components.
> > >>>>>
> > >>>>> As a result, all goals seem to be achieved in general.
> > >>>>> The code, including examples in different languages (EN, FR, RU),
> > >>>>> is available in the `master` branch.
> > >>>>> The best way to look at the code and review the API and how the
> > >>>>> components work is to start and debug the 'light-switch' example,
> > >>>>> in its EN and FR versions.
> > >>>>>
> > >>>>> Remaining tasks: some additional examples, clarifying the user
> > >>>>> API, and documentation.
> > >>>>>
> > >>>>> Please ask if you have any questions.
> > >>>>>
> > >>>>>
> > >>>>> Regards,
> > >>>>>
> > >>>>> Sergey Kamov
> > >>>>>
>
