Chris Hostetter wrote:
the enumeration is in lexicographical order, so "Dell" is nowhere near "dell" in the enumeration. even if we added a boolean property to Terms indicating that a Term is case insensitive, the "seeking" along that enumeration would be ... less optimal ... than it can be now.
Ah, now I understand!
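
To make the ordering point concrete, here is a minimal sketch in plain Python (not Lucene code). It assumes code-point ordering, which for ASCII terms matches the order Chris describes; the term list is made up for illustration.

```python
# Made-up sample terms, purely for illustration.
terms = ["dell", "Dell", "apple", "Zebra", "computers"]

# Code-point order: every uppercase ASCII letter sorts before every
# lowercase one, so "Dell" and "dell" end up far apart.
ordered = sorted(terms)
print(ordered)   # ['Dell', 'Zebra', 'apple', 'computers', 'dell']

# How many slots apart the two spellings are in the sorted list.
gap = abs(ordered.index("Dell") - ordered.index("dell"))
print(gap)       # 4

# Sorting on a case-folded key puts them side by side instead.
folded = sorted(terms, key=str.lower)
print(folded)    # ['apple', 'computers', 'dell', 'Dell', 'Zebra']
```

A case-insensitive lookup against the first ordering would have to visit two distant places in the enumeration, which is why a per-term flag alone would not help.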
: > > Let's say, for example, you want to find "Dell" (with a capital "D") near
: > > "computers" (with or without capitals, i.e. in any case). The problem is that
: > > you would need to use a SpanQuery to find terms near each other; but if the
: > > case-sensitivity required is different for each term, then they will be in
: > > different fields, making the use of SpanQuerys impossible.

i assume by this statement that you are suggesting that you want your
users to be able to say "find me $foo near $bar where $foo must be in the
case i specified but $bar can be in any case" -- is that correct?
Yes, that's exactly what I meant.
in that case Erick's point about indexing both the original case and some normalized casing at the same term position is the best way to go -- the only downside this has compared to separate fields is that it can introduce some bias in your tf/idf values ... but that can be eliminated by prefixing all of your "normalized" terms with some unicode character that your tokenizer would normally strip off.
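
A rough sketch of that idea, as a toy positional index in plain Python rather than the Lucene API. The names `NORM_PREFIX`, `build_index` and `near` are hypothetical, whitespace splitting stands in for a real tokenizer, and the prefix character is only an assumed example of something a tokenizer would never emit.

```python
from collections import defaultdict

# Hypothetical marker for normalized terms; assumed to be a character the
# tokenizer would never produce as part of a real token.
NORM_PREFIX = "\u0001"

def build_index(text):
    """Map each term to the list of positions at which it occurs."""
    index = defaultdict(list)
    for position, token in enumerate(text.split()):
        # Original-case term.
        index[token].append(position)
        # Normalized term at the SAME position, kept distinct by the prefix
        # so an already-lowercase word is not counted twice under one term.
        index[NORM_PREFIX + token.lower()].append(position)
    return index

def near(index, term_a, term_b, max_distance):
    """Return (pos_a, pos_b) pairs at most max_distance positions apart."""
    return [(a, b)
            for a in index.get(term_a, [])
            for b in index.get(term_b, [])
            if a != b and abs(a - b) <= max_distance]

index = build_index("Dell COMPUTERS and dell computers")

# Case-sensitive "Dell" near case-insensitive "computers": one field, one index.
exact_hits = near(index, "Dell", NORM_PREFIX + "computers", max_distance=1)
print(exact_hits)   # [(0, 1)]

# Both terms case-insensitive.
loose_hits = near(index, NORM_PREFIX + "dell", NORM_PREFIX + "computers", max_distance=1)
print(loose_hits)   # [(0, 1), (3, 4)]
```

Because both variants share positions within a single field, a proximity query can mix a case-sensitive term with a case-insensitive one.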

From Erick's reply:

"I suppose something like that might work, but I still think that presenting
a user with matches that sometimes work case sensitive and sometimes
doesn't would be...er..fraught."

The user would, of course, choose which terms are case-sensitive when they query, using a modifier in the query language. (I would have to implement that). It's something my users have asked to be able to do - in their view, fields are something that should be used for different content, and case-sensitivity should be an option on *any* field. But what you have suggested should allow it to work that way, by adding both versions of the term at the same position.
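
As a sketch of what such a modifier might look like (the `=` syntax, `NORM_PREFIX` and `to_index_term` are all made up for illustration, not part of any existing query parser), each query term could be mapped to the index term it should match:

```python
# Hypothetical prefix that marks normalized terms in the index.
NORM_PREFIX = "\u0001"
# Hypothetical query-language modifier: a leading "=" means "match this case exactly".
EXACT_MARK = "="

def to_index_term(query_term):
    """Translate one query term into the index term it should be matched against."""
    if query_term.startswith(EXACT_MARK):
        # Case-sensitive: look up the original-case term as typed.
        return query_term[len(EXACT_MARK):]
    # Case-insensitive: look up the prefixed, lowercased variant.
    return NORM_PREFIX + query_term.lower()

print(repr(to_index_term("=Dell")))       # 'Dell'
print(repr(to_index_term("Computers")))   # '\x01computers'
```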

Thanks guys!

-John
