On Wednesday, October 15, 2003, at 10:24 AM, Michael Giles wrote:
So how do we move this issue forward. I can't think of a single case where a "-" with no whitespace on either side (i.e. t-shirt, Wal-Mart) should be interpreted as a NOT command. Is there a feeling that changing the interpretation of such cases is a break in compatibility? I agree that it will change behavior, but I think that it will change it for the better (i.e. fix it). The current behavior is really broken (and very frustrating for a user trying to search).

I looked at the patch here:


http://nagoya.apache.org/bugzilla/show_bug.cgi?id=23838

I'm not entirely satisfied with it. I'm of the opinion that we should only change QueryParser to fix the behavior of operators nestled within text with no surrounding whitespace. The provided patch only works with the "-" character, but what about "Wal+Mart"? Shouldn't we keep that together also and hand it to the analyzer?

I'm not convinced at all that we should change the StandardTokenizer to not split on dash. If only QueryParser was fixed and handed "Wal-Mart" to the StandardAnalyzer, it would be split the same way as during indexing and searches would return the expected hits.

Thoughts? I'd like to see this fixed, but in a way that makes the most general sense.

Thanks,
        Erik


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to