Erik Hatcher wrote:
On Oct 20, 2004, at 12:14 PM, Doug Cutting wrote:
The advantages of a zero-character prefix default are that it's backward-compatible and that it will find more matches when spelling differences are in the first characters.
I prefer this default.
Anyone using QueryParser needs to be aware of the issues of exposing fuzzy queries, range queries, and any other types the syntax supports. It would not be Lucene's fault if a system with millions of documents is exposed through QueryParser and fuzzy queries take a bit longer or throw a TooManyClauses exception.
I am clearly outvoted. I still disagree, but will not veto this.
My last words on the topic (I promise!): In designing Lucene I tried hard to only add features that were scalable. For example, one could easily implement a RegexQuery that scans text of stored fields, returning those which match a regex. This would provide grep-like functionality, which some folks might find useful. But it would not be scalable. If someone contributed such a thing I would lobby against permitting its use from QueryParser in the default configuration. The query parser already requires an initial character before a wildcard, in order to make this operator more scalable. I don't see why fuzzy queries should be treated differently, why we permit such a huge scalability hole in the default configuration.
I agree completely with your sentiment. I personally would be happy if QueryParser weren't part of Lucene altogether - it sure did make writing the book much harder, that's for sure!
But with the wildcard query requiring an initial character - at least the results you get back would be completely accurate. With a fuzzy query and a required prefix, it would not necessarily be the case, given the examples I've seen on here.
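To make that trade-off concrete, here is a minimal, self-contained Java sketch. It is not Lucene's FuzzyQuery code: the class and method names are hypothetical, and it assumes a plain edit-distance cutoff where Lucene uses a similarity score. It only models the effect of a required prefix: fewer terms get compared, but a term that differs in its first character is never considered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
public class PrefixFuzzySketch {

    // Classic Levenshtein edit distance, computed with two rolling rows.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the terms within maxEdits of target, considering only terms
    // that share the first prefixLength characters of target.
    // prefixLength == 0 means every term is compared (the full scan).
    public static List<String> fuzzyMatches(List<String> terms, String target,
                                            int prefixLength, int maxEdits) {
        String prefix = target.substring(0, Math.min(prefixLength, target.length()));
        List<String> matches = new ArrayList<String>();
        for (String term : terms) {
            if (!term.startsWith(prefix)) {
                continue; // skipped without computing a distance
            }
            if (editDistance(term, target) <= maxEdits) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucent", "nucene", "lucid", "search");
        System.out.println(fuzzyMatches(terms, "lucene", 0, 1));
        System.out.println(fuzzyMatches(terms, "lucene", 1, 1));
    }
}
```

With a prefix length of 0 the sketch compares all five terms and returns lucene, lucent and nucene. With a prefix length of 1 it compares only the three terms starting with "l" and returns lucene and lucent, so nucene is silently dropped.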
Perhaps for Lucene 2.0 we can gut QueryParser and have some type of pluggable syntax handlers, so that inefficient queries like fuzzy and wildcard are not possible by default but could be turned on somehow. I personally recommend throwing ParseException for both wildcard and fuzzy queries, and I show how in the book.
Erik