[ 
https://issues.apache.org/jira/browse/LUCENENET-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Paldino updated LUCENENET-281:
---------------------------------------

    Attachment: TestQueryParser.TestCJK.patch

As mentioned in the comments, the second and third characters in the string 
(which is repeated two more times) are not letters, as classified by the 
IsLetter functions in both Java and .NET.  Because of this, the QueryParser 
splits the word into two separate terms, which is why the returned values 
differ in the comparison.
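
To illustrate the effect, here is a minimal Java sketch. It is not the 
QueryParser itself: splitOnNonLetters is a hypothetical helper that only 
mimics breaking a word at non-letter characters, and the characters used 
(two Hiragana letters and U+3000 IDEOGRAPHIC SPACE) are assumptions chosen 
for the example, not the ones from the failing test.

```java
import java.util.ArrayList;
import java.util.List;

public class LetterSplitSketch {

    // Hypothetical helper: collects maximal runs of characters for which
    // Character.isLetter returns true, dropping everything else.
    static List<String> splitOnNonLetters(String text) {
        List<String> terms = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                // A non-letter ends the current term.
                terms.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            terms.add(current.toString());
        }
        return terms;
    }

    public static void main(String[] args) {
        // Two Hiragana letters: both are classified as letters.
        String allLetters = "\u3042\u3044";
        // Same letters with an ideographic space (not a letter) between them.
        String withNonLetter = "\u3042\u3000\u3044";

        System.out.println(splitOnNonLetters(allLetters).size());    // prints 1
        System.out.println(splitOnNonLetters(withNonLetter).size()); // prints 2
    }
}
```

A word containing a non-letter in the middle comes back as two terms, which 
matches the kind of mismatch described above.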

To compensate, the test was changed to use two actual Japanese Unicode 
characters, taken from the first result for translating the word "term" into 
Japanese with Google Translate.  The characters are written as Unicode escape 
sequences in the string rather than as literal characters, since not all 
editors handle those characters the same way.
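
As a rough sketch of why escapes are the safer choice, the Java example 
below builds the same two-character string once from escape sequences and 
once from numeric code points, and shows that the two are equal.  The code 
points are assumptions picked for illustration, not the characters from the 
patch, and fromCodePoints is a hypothetical helper.  C# string literals 
accept the same \uXXXX form.

```java
public class UnicodeEscapeSketch {

    // Hypothetical helper: builds a string from numeric code points, so the
    // source file needs no non-ASCII characters at all.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // Written with escape sequences; the source file stays pure ASCII.
        String escaped = "\u3042\u3044";
        // The same two characters, built from their code point values.
        String built = fromCodePoints(0x3042, 0x3044);

        System.out.println(escaped.equals(built)); // prints true
        System.out.println(escaped.length());      // prints 2
    }
}
```

Because the file contains only ASCII, the string's contents do not depend on 
the encoding an editor or compiler assumes for the source file.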

> TestCJK on TestQueryParser fails
> --------------------------------
>
>                 Key: LUCENENET-281
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-281
>             Project: Lucene.Net
>          Issue Type: Bug
>            Reporter: Nicholas Paldino
>            Priority: Minor
>         Attachments: TestQueryParser.TestCJK.patch
>
>
> Patch to follow.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
