http://issues.apache.org/bugzilla/show_bug.cgi?id=32687





------- Additional Comments From [EMAIL PROTECTED]  2004-12-15 04:01 -------
Created an attachment (id=13758)
 --> (http://issues.apache.org/bugzilla/attachment.cgi?id=13758&action=view)
Testcase that tests ChineseTokenizer and OTHER_LETTER offsets

The problem arises when OTHER_LETTER characters and the rest of the characters
are mixed together.  When given a string "a&#22825;b", tokens and corresponding
offsets should be the following:
a : (0, 1)
&#22825; : (1, 2)
b : (2, 3)
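
To make the expected offsets above concrete, here is a minimal, self-contained
sketch that computes them without using Lucene at all.  It is not the
ChineseTokenizer implementation and not the attached testcase; the class name
ExpectedOffsets and its methods are hypothetical, and the splitting rule (each
OTHER_LETTER character is its own token, runs of other letters and digits form
one token, everything else is a separator) is an assumption about the intended
behaviour.  It also assumes characters in the Basic Multilingual Plane only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: lists the tokens and offsets
// one would expect under the assumed splitting rule described above.
public class ExpectedOffsets {

    // Returns one line per token in the form "text : (start, end)".
    static List<String> expected(String input) {
        List<String> out = new ArrayList<>();
        int runStart = -1; // start of a pending run of non-OTHER_LETTER letters/digits
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean otherLetter = Character.getType(c) == Character.OTHER_LETTER;
            if (!otherLetter && Character.isLetterOrDigit(c)) {
                if (runStart < 0) {
                    runStart = i; // open a new run
                }
                continue;
            }
            // Any other character closes the pending run, if there is one.
            if (runStart >= 0) {
                out.add(format(input, runStart, i));
                runStart = -1;
            }
            // An OTHER_LETTER character is emitted as a single-character token.
            if (otherLetter) {
                out.add(format(input, i, i + 1));
            }
        }
        if (runStart >= 0) {
            out.add(format(input, runStart, input.length()));
        }
        return out;
    }

    static String format(String s, int start, int end) {
        return s.substring(start, end) + " : (" + start + ", " + end + ")";
    }

    public static void main(String[] args) {
        // \u5929 is the character in the middle of the example string.
        for (String line : expected("a\u5929b")) {
            System.out.println(line);
        }
    }
}
```

For the input "a\u5929b" the method returns the three entries listed above, so
its result could serve as the reference when checking the offsets a tokenizer
actually reports.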
