NLPCraft-ers, I know we don't have a formal written policy on becoming a committer yet. In the meantime we can follow the general guidelines in [1]. Given that Gleb has been working on this functionality for quite some time now (well before our 1st release and website availability), I think we should vote to make Gleb a committer (the first inaugural outside committer, no less!). If there are no objections, I'll kick off the vote in a day or two.
Thanks,
-- Nikita Ivanov.

1. https://www.apache.org/foundation/getinvolved.html#become-a-committer

On Sun, May 10, 2020 at 4:57 AM Sergey Kamov <[email protected]> wrote:
> Thank you, Gleb!
>
> I'll try to use the tool in the coming days. If I have more questions,
> I'll ask you (Python is not my strong side).
>
> Probably we'll have to extend the README file.
>
> Sergey Kamov
>
>
> On 09.05.2020 21:16, Ifropc wrote:
> > Hello everybody,
> >
> > I would like to explain in more detail how the proposed solution works and
> > how it could be integrated into the existing system.
> >
> > Algorithm: the target word (to find synonyms for) is masked and passed to the
> > BERT model. Afterwards, the BERT results are filtered with FastText. The minimal
> > scores from BERT and FastText are configurable, and the weights for the results
> > of these two models are configurable as well. Therefore, the weights could be
> > trained later (though some real data would be required).
> > Moreover, this pipeline could be improved by adding more models or filters;
> > e.g., for some specific domain we could replace the models or fit them with
> > domain-specific data.
> >
> > Application: right now there are two ways to use this pipeline, the "static"
> > and the "dynamic" approach.
> > With the "static" approach, potential synonyms are generated for an NLPCraft
> > model and example sentences, so the model can be expanded manually.
> > The "dynamic" approach is to pass a sentence to the model, which returns
> > potential synonyms for a word. You can look at it as one more enricher, which
> > spawns new tokens. Then, if users want to use it in their model, they can
> > write a macro rule, i.e. "I want to have exactly word(s) A, or any word
> > that the model thinks is a synonym of word A".
> >
> > Thanks,
> > Gleb.
> >
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > On Friday, May 8, 2020 8:56 AM, Sergey Makov <[email protected]> wrote:
> >
> >> Hi Gleb,
> >>
> >> thank you very much for your contribution,
> >> it looks really promising!
> >>
> >> Regards,
> >> Sergey
> >>
> >> On Fri, May 8, 2020 at 6:52 PM Ifropc [email protected] wrote:
> >>
> >>> Hello everybody!
> >>> I created a pull request implementing auto-enrichment of user models
> >>> with synonyms (NLPCRAFT-11); please see the details in the PR on GitHub.
> >>> Comments are appreciated, if any.
> >>> Thanks,
> >>> Gleb.
> >>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> >>> On Thursday, May 7, 2020 11:19 PM, GitBox [email protected] wrote:
> >>>> Ifropc opened a new pull request #1:
> >>>> URL: https://github.com/apache/incubator-nlpcraft/pull/1
> >>>> This pull request should resolve NLPCRAFT-11: auto-enrich user models
> >>>> with synonyms.
> >>>> The proposed approach uses a BERT (RoBERTa) model to generate synonyms
> >>>> for a given context, masking the target word. Afterwards, the output is
> >>>> filtered with FastText for the specified context.
> >>>> This feature could also be integrated with NLPCRAFT-41.
> >>>>
> >>>> This is an automated message from the Apache Git Service.
> >>>> To respond to the message, please log on to GitHub and use the
> >>>> URL above to go to the specific comment.
> >>>> For queries about this service, please contact Infrastructure at:
> >>>> [email protected]
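[Editor's note] The score-fusion step Gleb describes above (BERT candidates filtered by two configurable minimal scores, then ranked by a configurable weighted combination of the BERT and FastText scores) can be sketched in a few lines of Python. This is a minimal illustration only, not the actual NLPCRAFT-11 code from the PR: the names (`Candidate`, `rank_synonyms`) are hypothetical, and the model calls are stubbed out with hard-coded scores. In the real pipeline the BERT score would come from a fill-mask model run on the sentence with the target word masked, and the FastText score from similarity to the target word's vector.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    word: str
    bert_score: float      # probability from the masked-LM (BERT/RoBERTa) head
    fasttext_score: float  # similarity to the target word per FastText

def rank_synonyms(candidates, min_bert, min_fasttext, w_bert, w_fasttext):
    """Drop candidates below either minimal score, then rank the rest by a
    weighted sum of the two model scores (hypothetical helper, for
    illustration only)."""
    kept = [c for c in candidates
            if c.bert_score >= min_bert and c.fasttext_score >= min_fasttext]
    kept.sort(key=lambda c: w_bert * c.bert_score + w_fasttext * c.fasttext_score,
              reverse=True)
    return [c.word for c in kept]

# Hard-coded scores stand in for the two models' real outputs.
cands = [
    Candidate("car", 0.42, 0.81),
    Candidate("vehicle", 0.35, 0.90),
    Candidate("banana", 0.30, 0.05),  # dropped: fails the FastText threshold
]
print(rank_synonyms(cands, min_bert=0.2, min_fasttext=0.3,
                    w_bert=0.5, w_fasttext=0.5))
# → ['vehicle', 'car']
```

Because the thresholds and weights are plain parameters, they could later be tuned ("trained") against real data, exactly as the email suggests.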
