[ https://issues.apache.org/jira/browse/LUCENE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784175#action_12784175 ]
DM Smith commented on LUCENE-2094:
----------------------------------

bq. I would like to open another issue for Robert's patch. The reason for this is that I feel that issues like that get sidetracked quite often and it's hard to follow once this happens. This would make discussions more clear and would help to prevent situations like this.

Just my opinion: I don't like committing part of an issue. I think that when/if there is a point at which a commit is needed, for whatever reason, and there is more to do or to discuss, the issue needs to be split. I think a JIRA issue should be represented by a single commit.

This issue pertains to making CharArraySet properly handle surrogates when lowercasing. The use case in Lucene is the stop word lists. These are used by the StopFilter, which has an ugliness that needed fixing.

I understand that sometimes more than one thing gets done in an issue because it is too hard to manage as multiple issues, which is what I call a ripple effect. It appears that this is happening here. I think changes other than that should be another issue, a sub-issue, or a linked issue.

As it stands, Robert's patch, having the same name as Simon's, makes it appear that it supersedes the prior patch with the same name. It is confusing without the context of reading the thread.

> Prepare CharArraySet for Unicode 4.0
> ------------------------------------
>
>                 Key: LUCENE-2094
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2094
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>   Affects Versions: 3.0
>            Reporter: Simon Willnauer
>            Assignee: Uwe Schindler
>             Fix For: 3.1
>
>         Attachments: LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.patch,
> LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.txt,
> LUCENE-2094.txt, LUCENE-2094.txt
>
>
> CharArraySet does lowercasing if created with the corresponding flag. As a
> result, a String / char[] containing Unicode 4 chars that is in the set cannot
> be retrieved in "ignorecase" mode.
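
For readers without the thread context, the surrogate problem named in the issue description can be illustrated with plain JDK calls. This is only a minimal sketch and not the code from any of the attached patches: the class and method names (SurrogateLowercaseSketch, lowerPerChar, lowerPerCodePoint) are hypothetical, and it assumes the failure comes from lowercasing one UTF-16 code unit at a time, so that a surrogate pair is never folded.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Lucene's CharArraySet implementation.
final class SurrogateLowercaseSketch {

    // Lowercases one UTF-16 code unit at a time. A lone surrogate half has
    // no lowercase mapping, so supplementary characters pass through unchanged.
    static char[] lowerPerChar(char[] in) {
        char[] out = new char[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Character.toLowerCase(in[i]);
        }
        return out;
    }

    // Lowercases one code point at a time, so a surrogate pair is handled
    // as the single character it encodes.
    static char[] lowerPerCodePoint(char[] in) {
        StringBuilder sb = new StringBuilder(in.length);
        for (int i = 0; i < in.length; ) {
            int cp = Character.codePointAt(in, i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString().toCharArray();
    }

    public static void main(String[] args) {
        // U+10400 (Deseret capital) lowercases to U+10428; both lie outside
        // the BMP and are therefore encoded as surrogate pairs.
        char[] upper = Character.toChars(0x10400);
        char[] lower = Character.toChars(0x10428);

        System.out.println(Arrays.equals(lowerPerChar(upper), lower));      // false
        System.out.println(Arrays.equals(lowerPerCodePoint(upper), lower)); // true
    }
}
```

In a case-insensitive set that normalizes keys with the per-char variant, the upper- and lowercase forms of such a character would stay distinct, which matches the lookup failure the description reports.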