[
https://issues.apache.org/jira/browse/LUCENE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783938#action_12783938
]
Robert Muir commented on LUCENE-2094:
-------------------------------------
bq. A good idea might be to use two StopFilters:
In theory, but sometimes these terms are ambiguous, and the computer
(especially a very simple analyzer) does not know which one it is; sometimes it
can be both.
Sometimes it's a real word too, but on average it's better to ignore it.
I don't think we need to go to this effort for optimal phrase queries either. A user
who really cares can do this themselves... and that's my whole point: they should
be able to do something like what you said, and explicitly say 'no, I don't want
posIncr for this StopFilter, but yes, I'll take the real bugfixes, thanks'.
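
For readers unfamiliar with the posIncr trade-off, here is a minimal plain-Java sketch of what the flag changes. It is not Lucene's StopFilter: the class StopFilterSketch, the method filter, the enablePosIncr parameter, and the 0-based "term@position" output are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not the Lucene StopFilter API.
final class StopFilterSketch {
    // Drops stop words and returns each surviving term as "term@position"
    // (0-based). With enablePosIncr, a removed stop word leaves a gap in the
    // positions; without it, the surviving terms are numbered consecutively.
    static List<String> filter(List<String> tokens, Set<String> stopWords,
                               boolean enablePosIncr) {
        List<String> out = new ArrayList<>();
        int position = -1;
        int skipped = 0; // stop words removed since the last emitted term
        for (String token : tokens) {
            if (stopWords.contains(token)) {
                if (enablePosIncr) {
                    skipped++;
                }
                continue;
            }
            position += 1 + skipped;
            skipped = 0;
            out.add(token + "@" + position);
        }
        return out;
    }
}
```

For the tokens "wizard", "of", "oz" with "of" as a stop word, this yields wizard@0 and oz@2 when the flag is on, and wizard@0 and oz@1 when it is off. That is why the setting affects how exact phrase queries match.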
> Prepare CharArraySet for Unicode 4.0
> ------------------------------------
>
> Key: LUCENE-2094
> URL: https://issues.apache.org/jira/browse/LUCENE-2094
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Affects Versions: 3.0
> Reporter: Simon Willnauer
> Assignee: Uwe Schindler
> Fix For: 3.1
>
> Attachments: LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.patch,
> LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.txt,
> LUCENE-2094.txt, LUCENE-2094.txt
>
>
> CharArraySet does lowercasing if created with the corresponding flag. As a
> result, String / char[] values containing Unicode 4 chars that are in the set
> cannot be retrieved in "ignorecase" mode.
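
The failure mode in the description can be reproduced without Lucene. This is a minimal sketch that assumes the cause is lowercasing one UTF-16 char at a time. The class and method names are hypothetical and are not taken from any attached patch. U+10400 (DESERET CAPITAL LETTER LONG I) is used because it lies outside the BMP and has the lowercase mapping U+10428.

```java
// Hypothetical illustration; not the CharArraySet implementation.
final class CodePointLowerCase {
    // Lowercases one UTF-16 code unit at a time. Surrogate halves have no
    // case mapping, so supplementary characters pass through unchanged.
    static String lowerPerChar(String s) {
        char[] buf = s.toCharArray();
        for (int i = 0; i < buf.length; i++) {
            buf[i] = Character.toLowerCase(buf[i]);
        }
        return new String(buf);
    }

    // Lowercases one code point at a time, so surrogate pairs are mapped
    // as a single character.
    static String lowerPerCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String upper = new String(Character.toChars(0x10400)) + "A";
        String lower = new String(Character.toChars(0x10428)) + "a";
        System.out.println(lowerPerChar(upper).equals(lower));      // prints false
        System.out.println(lowerPerCodePoint(upper).equals(lower)); // prints true
    }
}
```

An ignore-case set that normalizes its keys with the per-char variant would store and look up the supplementary character in its original case, so a lookup with the other case form misses.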