How about putting this here: http://wiki.apache.org/general/SummerOfCode2005
It seems to be a nice fit for the sponsor.

Regards,
Paul Elschot

On Saturday 04 June 2005 22:25, Paul Elschot wrote:
> On Monday 30 May 2005 02:44, Erik Hatcher wrote:
> > I concur with Daniel on this. For the moment, my preference is to
> > bring Paul's parser into contrib/surround and let it gain some
> > additional exposure there. I don't believe it's possible or even
> > preferable to attempt to build one query parser to rule them all.
> > While a decent general purpose one is handy, I'm finding that my
> > projects really demand more custom parsing capabilities than the
> > built-in QueryParser can handle and that the quirks of the current
> > parser cause some frustrations sometimes.
> >
> > Perhaps over time, the built-in QueryParser can adopt some additional
> > capabilities such as supporting the SpanQuery family but let's take
> > that sort of thing slowly.
> >
>
> How about extending the surround parser to allow the use of all
> queries currently in Lucene? The goal would be to allow as many
> queries as possible.
>
> The queries not available in the current surround parser are:
> - FuzzyQuery, WildcardQuery, PrefixQuery
> - SpanFirstQuery
> - SpanNotQuery
> - MultiPhraseQuery (or the various phrase scorers),
> - optional terms/clauses
>
> FuzzyQuery and SpanFirstQuery could be done with a prefix operator
> including a number (like the nn in the nnN near operator) followed by a
> single query, with appropriate restrictions.
> A prefix operator followed by a single query is currently not present, but
> relatively easy to add.
> SpanNotQuery always has two subqueries, so would need an infix operator
> only.
> MultiPhraseQuery would need an infix operator and a prefix operator, just
> like the N and W operators, and a restriction to terms, truncations and OR
> as subqueries.
>
> Left truncation could also be allowed;
> truncations currently have to start with a normal character.
> Truncation might also be left to WildcardQuery and
> PrefixQuery instead of the current "equivalent" in Surround
> that uses regular expressions to find the matching terms.
>
> That leaves the optional terms/clauses, and I can't think of an easy way to
> handle these. Any ideas? OR does not work for this because it requires
> at least one of its subqueries to match. The normal QueryParser syntax
> for this is +aa bb cc, where bb and cc are the optional parts.
>
> Some control over performance is outside the language.
> A basic query factory must be provided to create a Lucene query
> from a Surround query, and this throws an exception when
> rewriting causes too many terms to be used,
> much like the TooManyClauses for BooleanQuery.
>
> Regards,
> Paul Elschot
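
To make the last point about the query factory a bit more concrete, here is a minimal sketch of the idea in plain Java. It is not the actual Surround code and it does not use the Lucene API: the names TermBudgetSketch, BasicQueryFactorySketch, TooManyTermsException and expandTruncation are made up for illustration, the index is assumed to be a simple list of strings, and only a single trailing '*' is handled as a truncation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names come from Lucene or the Surround parser.
class TermBudgetSketch {

    // Plays the role that TooManyClauses plays for BooleanQuery, but for expanded terms.
    static class TooManyTermsException extends RuntimeException {
        TooManyTermsException(String message) {
            super(message);
        }
    }

    // Assumed shape of a "basic query factory": it hands out terms and counts them.
    static class BasicQueryFactorySketch {
        private final int maxTerms;
        private int termsUsed = 0;

        BasicQueryFactorySketch(int maxTerms) {
            this.maxTerms = maxTerms;
        }

        // Called once for every index term that ends up in the rewritten query.
        String newTerm(String field, String text) {
            if (termsUsed >= maxTerms) {
                throw new TooManyTermsException(
                        "more than " + maxTerms + " terms needed, stopped at " + field + ":" + text);
            }
            termsUsed++;
            return field + ":" + text;
        }
    }

    // Expands a right truncation such as "luc*" against the given index terms
    // with a regular expression. Only a single trailing '*' is handled here.
    static List<String> expandTruncation(String field, String truncated,
                                         List<String> indexTerms,
                                         BasicQueryFactorySketch factory) {
        if (!truncated.endsWith("*")) {
            throw new IllegalArgumentException("expected a trailing '*': " + truncated);
        }
        String prefix = truncated.substring(0, truncated.length() - 1);
        Pattern pattern = Pattern.compile(Pattern.quote(prefix) + ".*");
        List<String> expanded = new ArrayList<>();
        for (String term : indexTerms) {
            if (pattern.matcher(term).matches()) {
                // Throws TooManyTermsException once the budget is exhausted.
                expanded.add(factory.newTerm(field, term));
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        List<String> indexTerms = Arrays.asList("lucene", "lucid", "luck", "solr");

        // One matching term, well within a budget of two.
        System.out.println(expandTruncation("body", "luci*", indexTerms,
                new BasicQueryFactorySketch(2)));

        // Three matching terms against a budget of two: the factory refuses.
        try {
            expandTruncation("body", "luc*", indexTerms, new BasicQueryFactorySketch(2));
        } catch (TooManyTermsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping the limit in the factory rather than in the query syntax means the caller decides how expensive a query may become, which is the sense in which this control sits outside the language.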
