[ https://issues.apache.org/jira/browse/LUCENE-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630894#action_12630894 ]
Grant Ingersoll commented on LUCENE-1279:
-----------------------------------------

{quote}
I think the problem is that every single index term has to be converted to a CollationKey for every single (range) search.
{quote}

Yes, agreed. The main question is whether that would be faster than the String comparisons. Basically, is constructing a CollationKey plus a bitwise compare faster than a string compare?

{quote}
Languages, in some cases using the same character repertoire, define different orderings. Also, I believe some orderings are context dependent - you can't always compare character by character. So adding this stuff to Lucene would be to duplicate a lot of the stuff that's already done in the Collator.
{quote}

Makes sense. I was just wondering whether there were some shortcuts to be had, since we have a very particular case, and I was thinking it might allow us to narrow down the range to search. For instance, hypothetically speaking, say your field had a full range of words starting with A up to Z, but you knew the ordering problem only occurred between L and P, and your lower and upper terms were K and Q. Then you could feel confident that you could skip to K and stop at Q without any ramifications.

I realize this repeats what is in the Collator, but it would be nice if the Collator exposed that info. Perhaps, when using a RuleBasedCollator, the getRules() method could be used to optimize. Again, I'm just thinking out loud; I haven't explored it.

I agree, this should still go forward, even as is.
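To make the comparison above concrete, here is a minimal sketch of the strategies under discussion: plain String comparison, Collator.compare() on raw strings, and CollationKey construction followed by a bitwise key compare. The class and method names are hypothetical and are not taken from the attached patch; only the java.text calls are real JDK API. It does not measure anything, it only shows what each per-term check would look like.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

// Hypothetical illustration only; not the code in LUCENE-1279.patch.
public class CollationRangeSketch {

    // Plain String order (UTF-16 code units), i.e. no collation at all.
    public static boolean inRangeByStringCompare(String lower, String upper, String term) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    // Collation-aware check that compares the raw strings directly.
    public static boolean inRangeByCollatorCompare(Collator collator,
                                                   String lower, String upper, String term) {
        return collator.compare(term, lower) >= 0 && collator.compare(term, upper) <= 0;
    }

    // Collation-aware check via keys: the bound keys are built once per
    // search, but a key must still be constructed for every index term.
    public static boolean inRangeByCollationKey(Collator collator,
                                                CollationKey lowerKey, CollationKey upperKey,
                                                String term) {
        CollationKey termKey = collator.getCollationKey(term); // per-term construction cost
        return termKey.compareTo(lowerKey) >= 0 && termKey.compareTo(upperKey) <= 0;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        String lower = "a";
        String upper = "c";
        String term = "B";

        CollationKey lowerKey = collator.getCollationKey(lower);
        CollationKey upperKey = collator.getCollationKey(upper);

        // 'B' (U+0042) sorts before 'a' (U+0061) in code-unit order,
        // so the plain String check excludes it from [a, c].
        System.out.println(inRangeByStringCompare(lower, upper, term));                       // false
        // The English collator orders a < B < c, so both collation checks include it.
        System.out.println(inRangeByCollatorCompare(collator, lower, upper, term));           // true
        System.out.println(inRangeByCollationKey(collator, lowerKey, upperKey, term));        // true
    }
}
```

Whether the key-based variant is actually faster than Collator.compare() for a single pass over the terms is exactly the open question; CollationKeys are generally described as paying off when the same strings are compared many times, which is not obviously the case for a one-shot range scan.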
> RangeQuery and RangeFilter should use collation to check for range inclusion
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-1279
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1279
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.3.1
>            Reporter: Steven Rowe
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: LUCENE-1279.patch, LUCENE-1279.patch, LUCENE-1279.patch, LUCENE-1279.patch
>
>
> See [this java-user discussion|http://www.nabble.com/lucene-farsi-problem-td16977096.html] of
> problems caused by Unicode code-point comparison, instead of collation, in RangeQuery.
> RangeQuery could take in a Locale via a setter, which could be used with a
> java.text.Collator and/or CollationKeys, to handle ranges for languages
> which have alphabet orderings different from those in Unicode.