[
https://issues.apache.org/jira/browse/LUCENE-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774442#comment-13774442
]
Shai Erera commented on LUCENE-5237:
------------------------------------
bq. This isn't a bug: if you delete the last character, it's all that must
happen.
You're right. But first, that isn't what happens today. If pos=3 and len=4
(i.e. delete the last character), it still calls System.arraycopy (even in the
patch I posted). This could be improved. Second, the real problem is that it
deletes the last character even if pos >= len, i.e. when you ask to delete a
character beyond what is "valid" in that buffer. I can't believe there is a
TokenFilter that relies on being able to delete characters beyond the buffer
length it knows about.
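To make the two points above concrete, here is a minimal sketch of a guarded
delete. It is not the Lucene source and not the attached patch: the class name
GuardedDeleteSketch is hypothetical, and returning len unchanged for an
out-of-range pos is an assumption about the desired behavior. It skips
System.arraycopy when the last valid character is removed, and it leaves len
alone when pos does not point at a valid character.

```java
final class GuardedDeleteSketch {

    // Deletes the character at pos from the first len chars of s.
    // Returns the new valid length.
    static int delete(char[] s, int pos, int len) {
        if (pos < 0 || pos >= len) {
            // Assumption: nothing valid to delete, so the length is unchanged.
            return len;
        }
        if (pos < len - 1) {
            // Shift the tail left by one; not needed for the last character.
            System.arraycopy(s, pos + 1, s, pos, len - pos - 1);
        }
        return len - 1;
    }

    // Deletes up to nChars characters starting at pos by repeated delete().
    static int deleteN(char[] s, int pos, int len, int nChars) {
        for (int i = 0; i < nChars; i++) {
            len = delete(s, pos, len);
        }
        return len;
    }

    public static void main(String[] args) {
        char[] buf = "abcd".toCharArray();
        // Same call as in the issue description: pos == len, so nothing is deleted.
        int len = deleteN(buf, buf.length, buf.length, 3);
        System.out.println(new String(buf, 0, len)); // prints "abcd"
    }
}
```

With this guard, asking for more characters than remain after pos simply stops
at the end of the valid region instead of shrinking len further.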
bq. Shouldn't it throw an exception instead when pos + nChars > buf.length?
Maybe we should ...
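For comparison, a strict variant that throws instead of silently doing nothing
could look roughly like the sketch below. This is only an illustration: the
class name StrictDeleteSketch and the choice of IllegalArgumentException are
assumptions, and it checks against len (the valid length) rather than
buf.length as written in the quoted question.

```java
final class StrictDeleteSketch {

    // Deletes exactly nChars characters starting at pos from the first len
    // chars of s, or throws if that range is not fully inside the valid region.
    static int deleteN(char[] s, int pos, int len, int nChars) {
        if (pos < 0 || nChars < 0 || len < 0 || len > s.length || pos + nChars > len) {
            throw new IllegalArgumentException(
                "cannot delete " + nChars + " chars at pos " + pos + " with len " + len);
        }
        // One copy moves the whole tail over the deleted range.
        System.arraycopy(s, pos + nChars, s, pos, len - pos - nChars);
        return len - nChars;
    }

    public static void main(String[] args) {
        char[] buf = "abcd".toCharArray();
        int len = deleteN(buf, 1, buf.length, 2);
        System.out.println(new String(buf, 0, len)); // prints "ad"
    }
}
```

The trade-off is that callers which today pass an out-of-range pos would start
failing loudly rather than getting a shortened or unchanged length back.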
bq. We can mark the whole class lucene.internal or copy the code of the methods
to each class actually using them
You mean inline these methods?
> StemmerUtil.deleteN may delete too many characters
> --------------------------------------------------
>
> Key: LUCENE-5237
> URL: https://issues.apache.org/jira/browse/LUCENE-5237
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Reporter: Shai Erera
> Assignee: Shai Erera
> Attachments: LUCENE-5237.patch
>
>
> StemmerUtil.deleteN calls delete(), but in some cases it may delete too
> many characters. E.g. if you execute this code:
> {code}
> char[] buf = "abcd".toCharArray();
> int len = StemmerUtil.deleteN(buf, buf.length, buf.length, 3);
> System.out.println(new String(buf, 0, len));
> {code}
> You get "a", even though no character should have been deleted (neither
> according to the javadocs nor by common logic).
> The problem is in delete(), which always returns {{len-1}}, even if no
> character is actually deleted.
> I'll post a patch that fixes it shortly.