[ 
https://issues.apache.org/jira/browse/NLPCRAFT-67?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120153#comment-17120153
 ] 

Sergey Kamov commented on NLPCRAFT-67:
--------------------------------------

Hi, I think we need to:
1. Move the sources under src/main/python, named mlserver (or something 
similar).
2. Rename the `synonyms` endpoint to `suggestions` (or another name, but not 
`synonyms`, which can be confusing).
3. We already have the `limit` request parameter.
 I think we should add a `score` request parameter (the minimum score for 
returned data).
 Both are optional and can be used together.
 By default, `score` can be 0 and `limit` 100 (?).
4. Let's investigate the performance issue (batch mode, GPU mode, etc.).
 If a performance improvement is doable, let's define a new endpoint for batch 
mode?
 The current request is
 {
   "sentence": "sentence",
   "simple": true,
   "lower": 1,
   "upper": 1,
   "limit": 1
 }


 A batch request could be
 {
   "sentence": "sentence",
   "simple": true,
   "lower": 1,
   "indexes": [
     {
       "upper": 1,
       "limit": 1
     },
     {
       "upper": 2,
       "limit": 2
     }
   ]
 }


 Maybe the second format is enough and we don't need both.

The problem is that we need to understand what performance is achievable, 
because its limits can influence the dynamic enricher design.
 So we don't need the performance improvement ASAP, but we do need the API and 
an understanding of whether we can achieve reasonable performance or not.
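
To make the relation between the two request formats concrete, here is a 
minimal Python sketch that expands the proposed batch request into equivalent 
single requests. It assumes that every entry of `indexes` is merged over the 
shared top-level fields; the comment above does not specify this, and the 
function name `expand_batch` is hypothetical.

```python
from typing import Any, Dict, List


def expand_batch(batch: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Split one batch request into single requests (assumed semantics)."""
    # Fields shared by every entry: everything except the "indexes" list.
    shared = {key: value for key, value in batch.items() if key != "indexes"}
    # Assumption: each entry extends (and may override) the shared fields.
    return [{**shared, **entry} for entry in batch["indexes"]]


batch = {
    "sentence": "sentence",
    "simple": True,
    "lower": 1,
    "indexes": [
        {"upper": 1, "limit": 1},
        {"upper": 2, "limit": 2},
    ],
}

singles = expand_batch(batch)
for request in singles:
    print(request)
```

If this reading is right, a single request is just a batch with one entry, 
which is consistent with the idea that the second format alone may be enough.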

5. Will we support multi-word synonym processing?
 We have the following options:
 - Drop it. If so, let's change the API: instead of the `lower` and `upper` 
pair we need a single `index`.
 - Improve support for it, if that is possible in the foreseeable future. In 
that case we keep the current API.
 - Keep the current API but postpone the multi-word processing investigation. 
We can do so, but only if we really expect to return to it. Otherwise, let's 
not keep garbage.
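
As an illustration of point 3, the following Python sketch shows how the 
optional `score` and `limit` parameters could be applied together, using the 
defaults suggested above (0 and 100). It is only a sketch: the function name 
`filter_suggestions`, the (word, score) pair representation, the inclusive 
threshold, and the "higher score is better" ordering are all assumptions, not 
the actual server code.

```python
from typing import List, Tuple

Suggestion = Tuple[str, float]  # (word, score); hypothetical representation


def filter_suggestions(
    suggestions: List[Suggestion],
    score: float = 0.0,
    limit: int = 100,
) -> List[Suggestion]:
    """Keep suggestions scoring at least `score`, best first, at most `limit`."""
    # Apply the minimum-score threshold first (assumed to be inclusive).
    kept = [item for item in suggestions if item[1] >= score]
    # Assumption: a higher score means a better suggestion.
    kept.sort(key=lambda item: item[1], reverse=True)
    # Then cap the number of returned items.
    return kept[:limit]


candidates = [("a", 0.9), ("b", 0.2), ("c", 0.5)]
print(filter_suggestions(candidates, score=0.3, limit=1))
print(filter_suggestions(candidates))
```

Applying the threshold before the cap means `limit` counts only suggestions 
that already passed the `score` filter.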

> Python machine learning module 
> -------------------------------
>
>                 Key: NLPCRAFT-67
>                 URL: https://issues.apache.org/jira/browse/NLPCRAFT-67
>             Project: NLPCraft
>          Issue Type: New Feature
>            Reporter: Gleb
>            Assignee: Gleb
>            Priority: Major
>             Fix For: 0.7.0
>
>
> Python part which consists of Bert masked word prediction and FastText 
> synonyms filter



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
