Thank you Gleb!

I'll try the tool in the coming days. If I have more questions, I'll ask you (Python is not my strong suit).

We'll probably have to extend the README file.

Sergey Kamov


09.05.2020 21:16, Ifropc wrote:
Hello everybody,

I would like to explain in more detail how the proposed solution works and how 
it could be integrated into the existing system.

Algorithm: the target word (the one to find synonyms for) is masked and the 
sentence is passed to the Bert model. Afterwards, the Bert results are filtered 
with FastText. The minimal scores from Bert and FastText are configurable, and 
so are the weights for the results of these two models. Therefore, the weights 
could be trained later (some real data is required, though).
Moreover, this pipeline could be improved by adding further models or filters; 
e.g., for a specific domain we can replace the models or fit them to 
domain-specific data.
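
To make the filtering and weighting step concrete, here is a minimal sketch in 
plain Python. It does not call Bert or FastText; the scores are made-up 
numbers standing in for model output. The function names, the "<mask>" token, 
the default thresholds and the weighted-sum combination are all assumptions 
for illustration and are not taken from the pull request.

```python
from typing import Dict, List, Tuple


def mask_target(tokens: List[str], index: int, mask_token: str = "<mask>") -> str:
    """Replace the target word with a mask token and rebuild the sentence."""
    masked = list(tokens)
    masked[index] = mask_token
    return " ".join(masked)


def rank_candidates(
    bert_scores: Dict[str, float],
    fasttext_scores: Dict[str, float],
    min_bert: float = 0.1,        # hypothetical minimal Bert score
    min_fasttext: float = 0.3,    # hypothetical minimal FastText score
    bert_weight: float = 1.0,     # hypothetical weight, could be trained later
    fasttext_weight: float = 1.0  # hypothetical weight, could be trained later
) -> List[Tuple[str, float]]:
    """Drop candidates below either minimal score, then rank the rest.

    The weighted sum is an assumed way of combining the two scores.
    """
    ranked = []
    for word, bert in bert_scores.items():
        # A candidate that FastText has no score for is treated as scoring 0.
        fasttext = fasttext_scores.get(word, 0.0)
        if bert < min_bert or fasttext < min_fasttext:
            continue
        ranked.append((word, bert_weight * bert + fasttext_weight * fasttext))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Made-up scores for the target word "light" in "turn on the light".
masked_sentence = mask_target(["turn", "on", "the", "light"], 3)
bert_scores = {"lamp": 0.5, "music": 0.4, "bulb": 0.05}
fasttext_scores = {"lamp": 0.8, "music": 0.1, "bulb": 0.7}

print(masked_sentence)
print(rank_candidates(bert_scores, fasttext_scores))
```

With these numbers, "music" is dropped by the FastText threshold and "bulb" by 
the Bert threshold, so only "lamp" remains. Replacing a model or adding a 
filter would amount to adding another score dictionary and another threshold.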

Application: right now there are two ways to use this pipeline, the "static" 
and "dynamic" approaches.
With the "static" approach, potential synonyms are generated for an NLPCraft 
model and its example sentences, so that the model can be expanded manually.
The "dynamic" approach is to pass a sentence to the model, which returns 
potential synonyms for the word. You can look at it as one more enricher that 
spawns new tokens. Then, if users want to use it in their model, they can write 
a macro rule, e.g. "I want to have exactly word(s) A or any word that the model 
thinks is a synonym of word A".

Thanks,
Gleb.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, May 8, 2020 8:56 AM, Sergey Makov <[email protected]> wrote:

Hi Gleb,

thank you very much for your contribution,
it looks really promising!

Regards,
Sergey

On Fri, May 8, 2020 at 6:52 PM Ifropc [email protected] wrote:

Hello everybody!
I created a pull request implementing auto-enrichment of user models with 
synonyms (NLPCRAFT-11); please see the details in the PR on GitHub.
Comments, if any, are appreciated.
Thanks,
Gleb.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, May 7, 2020 11:19 PM, GitBox [email protected] wrote:
Ifropc opened a new pull request #1:
URL: https://github.com/apache/incubator-nlpcraft/pull/1
This pull request should resolve NLPCRAFT-11: auto-enrich user models with 
synonyms
The proposed approach uses a Bert (RoBerta) model to generate synonyms for a 
given context, masking the target word. Afterwards, the output is filtered with 
FastText for the specified context.
This feature could also be integrated with NLPCRAFT-41

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
