Let me step in here, since I was the one advocating to the team for many of these changes...
There are three major problems with the current approach of having Java APIs and a Scala-based implementation:
- It significantly increases complexity, especially on the performance side (constant conversion between Java and Scala collections, etc.).
- The Java community is just not interested in NLP in general... It's a bit strange to me, but the lack of interest in the project can be attributed, at least partially, to the focus on Java.
- The project needs a focused core target group. For example, many Go and Rust projects greatly benefited from the interest of their core language groups. Apache Spark is a prime example: initial focus on and growth from the core Scala community, with Java/Python frontends added later.

My two cents,
--
Nikita Ivanov

On Fri, Jun 10, 2022 at 6:18 AM Paul King <pa...@asert.com.au> wrote:

> Okay, that makes sense. Is there a plan to release the "current
> master" or is it just a stepping stone to the "next step" which is
> when the next release will come?
>
> Cheers, Paul.
>
> On Fri, Jun 10, 2022 at 10:23 PM Kamov Sergey <skhdlem...@gmail.com> wrote:
> >
> > Sorry for confusing
> >
> > - last release (0.9.0) is java client/server system
> > - current version in 'master' branch (still unreleased) is simple java
> > API library (without client server)
> > - next step, which we are discussing, is simple scala API library (like
> > current master version, but with scala API instead of java)
> >
> > Regards,
> > Sergey
> >
> >
> > On 10.06.2022 15:13, Paul King wrote:
> > > So, just for my own understanding, is the server Java but the client
> > > would be Scala?
> > > Not questioning the decision but the first email in this thread said:
> > >
> > >> After these changes NlpCraft becomes simple library with java API.
> > > I am actually seeing a bit of a renaissance of Java for Data Science
> > > with numerous new projects like Amazon's DJL opting for Java as the
> > > base language.
> > >
> > > Disclosure, for data science, I mostly use Groovy as a "Python for the
> > > JVM", so that probably skews the world I see. Most of the NLP folks I
> > > speak to use Python these days. And concurring with Sergey, Stanford
> > > and OpenNLP are probably the two more widely used Java libraries I see
> > > for those folks on the JVM with Datumbox and Smile occasionally used
> > > as well.
> > >
> > > Cheers, Paul.
> > >
> > > On Wed, Jun 8, 2022 at 5:49 AM Kamov Sergey <skhdlem...@gmail.com> wrote:
> > >> Hi!
> > >>
> > >> All google requests like "NLP libraries" return that most popular is
> > >> Python (out of competition)
> > >>
> > >> First result for me
> > >>
> > >> https://www.upgrad.com/blog/python-nlp-libraries-and-applications/
> > >> https://medium.com/nlplanet/awesome-nlp-21-popular-nlp-libraries-of-2022-2e07a914248b
> > >>
> > >> Java is mentioned for Stanford, sometimes Apache openNlp
> > >>
> > >>
> > >> Regards,
> > >>
> > >> Sergey
> > >>
> > >> On 07.06.2022 19:21, Furkan KAMACI wrote:
> > >>> Hi Sergey,
> > >>>
> > >>> Is there any survey about which programming languages popular among
> > >>> NLP developers?
> > >>>
> > >>> Kind Regards,
> > >>> Furkan KAMACI
> > >>>
> > >>> On 7 Jun 2022 Tue at 17:37 Kamov Sergey <skhdlem...@gmail.com> wrote:
> > >>>
> > >>>> Hi
> > >>>>
> > >>>> One more important thing. We want to support Scala API only for next
> > >>>> library’s version.
> > >>>> Now seems better to narrow this technological focus too. Current
> > >>>> approach, java API and Scala implementation, provoke a lot of
> > >>>> technical compromises (collections conversion, performance issues etc)
> > >>>> But at the same time, support of java API also doesn’t give us
> > >>>> significant benefits, because Java is not so popular among NLP
> > >>>> engineers. Focus on Scala allows to have more elegant user API and
> > >>>> implementation, also we can promote this solution for members of
> > >>>> not so big but active Scala community.
> > >>>> If library is successful we always can add java API support again
> > >>>> over Scala layer.
> > >>>>
> > >>>> Regards,
> > >>>>
> > >>>> Sergey Kamov
> > >>>>
> > >>>>
> > >>>> On 04.06.2022 17:56, Kamov Sergey wrote:
> > >>>>> Hi
> > >>>>> I want to enumerate next NlpCraft release changes.
> > >>>>>
> > >>>>> Main goals of next release:
> > >>>>> - Simplifying of the system usage.
> > >>>>> - Narrowing of focus - NLP, deleting all unrelated, auxiliary
> > >>>>> components.
> > >>>>> - Possibility of custom multi-language support.
> > >>>>> - Simplifying of code, technical debt minimization.
> > >>>>>
> > >>>>> 1. Removed
> > >>>>> - Client-server approach components, servers cluster support.
> > >>>>> - Any database usage.
> > >>>>> - CLI management console.
> > >>>>> - Docker related stuff.
> > >>>>> - Complex semantic components support.
> > >>>>> After these changes NlpCraft becomes simple library with java API.
> > >>>>>
> > >>>>> 2. Added and changed
> > >>>>> All components pluggability support added, including such base as
> > >>>>> tokenizer etc, with EN default implementations of all of them.
> > >>>>> Note, that components testability was also significantly
> > >>>>> simplified, which is especially useful for user custom components.
> > >>>>>
> > >>>>> As results - all goals seem in general achieved.
> > >>>>> Code, including examples on different languages (EN, FR, RU) are
> > >>>>> accessible in `master` branch.
> > >>>>> The best way to look at the code and review API, components work -
> > >>>>> start and debug 'light-switch' example, EN and FR versions.
> > >>>>>
> > >>>>> Remained tasks: some additional examples, user API clarifying,
> > >>>>> documentation.
> > >>>>>
> > >>>>> Please ask the questions if you have.
> > >>>>>
> > >>>>>
> > >>>>> Regards,
> > >>>>>
> > >>>>> Sergey Kamov
> > >>>>>
>