[ 
https://issues.apache.org/jira/browse/OPENNLP-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated OPENNLP-1169:
-------------------------------------
    Description: 
The {{WordVectorsTable}} API retrieves a {{WordVector}} via {{CharSequence}}. This 
is suboptimal because implementors could store such WVs in a hash table (e.g. 
{{MapWordVectorsTable}}), and the value of {{CharSequence#toString}} is not 
guaranteed to be stable.
Additionally, it's more common to have words as Strings rather than 
CharSequences, which is also more consistent with other OpenNLP APIs (e.g. 
{{Tokenizer}}).

So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.

  was:
The {{WordVectorsTable}} API retrieves {{WordVector}}s via {{CharSequence}}. This 
is suboptimal because implementors could store such WVs in a hash table (e.g. 
{{MapWordVectorsTable}}), and the value of {{CharSequence#toString}} is not 
guaranteed to be stable.
Additionally, it's more common to have words as Strings rather than 
CharSequences, which is also more consistent with other OpenNLP APIs (e.g. 
{{Tokenizer}}).

So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.


> WordVectorTable should reference WVs by String
> ----------------------------------------------
>
>                 Key: OPENNLP-1169
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1169
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: word vectors
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 1.8.4
>
>
> The {{WordVectorsTable}} API retrieves a {{WordVector}} via {{CharSequence}}. This 
> is suboptimal because implementors could store such WVs in a hash table (e.g. 
> {{MapWordVectorsTable}}), and the value of {{CharSequence#toString}} is not 
> guaranteed to be stable.
> Additionally, it's more common to have words as Strings rather than 
> CharSequences, which is also more consistent with other OpenNLP APIs (e.g. 
> {{Tokenizer}}).
> So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.
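
To illustrate the concern with hash-table storage, here is a minimal sketch in plain Java. It does not use the OpenNLP classes: the class name {{StringKeyedLookupSketch}} and both lookup helpers are hypothetical, and a {{float[]}} stands in for a word vector. It relies on the fact that {{CharSequence}} does not refine the {{equals}}/{{hashCode}} contracts, so two sequences with the same characters need not match as map keys.

```java
import java.util.HashMap;
import java.util.Map;

public class StringKeyedLookupSketch {

    // Hypothetical lookup keyed by CharSequence, as in the API being changed.
    static float[] lookupByCharSequence(Map<CharSequence, float[]> table, CharSequence token) {
        return table.get(token);
    }

    // Hypothetical lookup keyed by String, as the issue proposes.
    static float[] lookupByString(Map<String, float[]> table, String token) {
        return table.get(token);
    }

    public static void main(String[] args) {
        float[] vector = {0.1f, 0.2f, 0.3f};

        // The entry is stored under a String key, but the map is typed on CharSequence.
        Map<CharSequence, float[]> byCharSequence = new HashMap<>();
        byCharSequence.put("cat", vector);

        // Same characters, different CharSequence implementation:
        // StringBuilder uses identity-based equals/hashCode, so the lookup misses.
        CharSequence sameText = new StringBuilder("cat");
        System.out.println(lookupByCharSequence(byCharSequence, sameText) == null); // true

        // With String keys, equal text always resolves to the same entry.
        Map<String, float[]> byString = new HashMap<>();
        byString.put("cat", vector);
        System.out.println(lookupByString(byString, sameText.toString()) == vector); // true
    }
}
```

With a String parameter, the caller does the conversion once and the table can rely on String's value-based {{equals}} and {{hashCode}}.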



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
