[ 
https://issues.apache.org/jira/browse/OPENNLP-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738818#comment-16738818
 ] 

ASF GitHub Bot commented on OPENNLP-1234:
-----------------------------------------

jzonthemtn commented on pull request #344: OPENNLP-1234 Dictionary.asStringSet() fix
URL: https://github.com/apache/opennlp/pull/344#discussion_r246592874
 
 

 ##########
 File path: opennlp-tools/src/main/java/opennlp/tools/dictionary/Dictionary.java
 ##########
 @@ -313,9 +306,9 @@ public boolean hasNext() {
           }
 
           public String next() {
-            return entries.next().getStringList().getToken(0);
+            return String.join(" ",entries.next().getStringList());
 
 Review comment:
   Given the Javadoc comment on line 297, this may be the desired behavior, but 
we'll need to double-check that.
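
   To make the difference between the removed and added lines concrete, here is 
a minimal standalone sketch. It does not use the OpenNLP classes: the class 
DictionaryJoinSketch and its two methods are hypothetical names, and each 
dictionary entry is modeled as a plain, non-empty list of tokens. It only 
illustrates first-token extraction versus joining all tokens with a space, 
using the "European Union" example from the issue.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the dictionary: each entry is a non-empty list of tokens.
// This is NOT the OpenNLP Dictionary API, only an illustration of the two behaviors.
class DictionaryJoinSketch {

  // Mirrors the removed line: keep only the first token of each entry.
  static Set<String> firstTokens(List<List<String>> entries) {
    Set<String> result = new LinkedHashSet<>();
    for (List<String> tokens : entries) {
      result.add(tokens.get(0)); // assumes the entry has at least one token
    }
    return result;
  }

  // Mirrors the added line: join all tokens of an entry with a single space.
  static Set<String> joinedEntries(List<List<String>> entries) {
    Set<String> result = new LinkedHashSet<>();
    for (List<String> tokens : entries) {
      result.add(String.join(" ", tokens));
    }
    return result;
  }

  public static void main(String[] args) {
    List<List<String>> entries =
        List.of(List.of("European", "Union"), List.of("Paris"));

    System.out.println(firstTokens(entries));   // prints [European, Paris]
    System.out.println(joinedEntries(entries)); // prints [European Union, Paris]
  }
}
```

   Single-token entries come out the same either way, so the change should only 
be visible for multi-token entries. Whether a space-joined string is the right 
representation is exactly what the Javadoc check above needs to confirm.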
 

> Dictionary.asStringSet() is returning single tokens 
> ----------------------------------------------------
>
>                 Key: OPENNLP-1234
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1234
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Name Finder
>            Reporter: Evandro Fonseca
>            Priority: Major
>              Labels: easyfix
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> Dictionary.asStringSet() returns only single tokens: for a multi-token entry, 
> it returns just the first token instead of the whole entry.
> For example, the entry "European Union" is returned as "European".


