Re: New Lucene QueryParser

Mark Miller Wed, 03 Jan 2007 05:01:15 -0800

Hey Laurent,

I am actually pretty much ready for a beta/preview release right about now. All of the features are in and I am pretty happy with most of the work. Over the past month I have been squashing bugs, and I could certainly use as much help as I can get making sure this thing is as perfect as it can be. I am currently in the middle of migrating to a new laptop, so I may take a couple of days to get a distribution jar together with some simple documentation, but I plan on doing that as soon as I get a chance.

Query-time thesaurus expansion / General token to query expansion: Takes advantage of a general find/replace feature, "expand" might map to "(expander | expanded)" ... or any other valid syntax.
This I could also use, if it can also do the following?
Right now I have a little utility class which expands special strings (syntax is to be discussed) to all combinations:
"fest[,e] hypothek[,en,a]"
-> fest hypothek;fest hypotheken;fest hypotheka;feste hypothek;feste hypotheken;feste hypotheka
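
For illustration, the bracket expansion described above could be sketched in plain Java roughly as follows. This is only a sketch under assumptions: the class name BracketExpander and its expand method are hypothetical, the bracket syntax is taken from the single example above, and nothing here is taken from the actual utility class or from the parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene or of the parser discussed here.
final class BracketExpander {

    /** Expands every "[alt1,alt2,...]" group in the pattern into all combinations. */
    public static List<String> expand(String pattern) {
        List<String> results = new ArrayList<>();
        results.add("");
        int i = 0;
        while (i < pattern.length()) {
            int open = pattern.indexOf('[', i);
            if (open < 0) {
                // No more groups: the rest of the pattern is literal text.
                results = appendToAll(results, Arrays.asList(pattern.substring(i)));
                break;
            }
            int close = pattern.indexOf(']', open);
            if (close < 0) {
                throw new IllegalArgumentException("Unclosed '[' at index " + open);
            }
            // Literal text in front of the group.
            results = appendToAll(results, Arrays.asList(pattern.substring(i, open)));
            // Limit -1 keeps empty alternatives, such as the first one in "[,e]".
            String[] alternatives = pattern.substring(open + 1, close).split(",", -1);
            results = appendToAll(results, Arrays.asList(alternatives));
            i = close + 1;
        }
        return results;
    }

    /** Builds the cross product: every prefix followed by every suffix. */
    private static List<String> appendToAll(List<String> prefixes, List<String> suffixes) {
        List<String> out = new ArrayList<>();
        for (String prefix : prefixes) {
            for (String suffix : suffixes) {
                out.add(prefix + suffix);
            }
        }
        return out;
    }
}
```

Calling BracketExpander.expand("fest[,e] hypothek[,en,a]") yields the six combinations listed above, in the same order.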

I require a similar feature, although in the form mark{s es ing} -> marks markes marking. Unfortunately, the way I have done it (in the JavaCC grammar) is not easily configurable.
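
Outside of a grammar, the same mark{s es ing} rewrite could be approximated as a preprocessing step with a regular expression. The sketch below is not how the parser does it (that lives in the JavaCC grammar); the class name SuffixExpander and the assumption that a stem consists of word characters are mine.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical preprocessing helper, not the parser's JavaCC implementation.
final class SuffixExpander {

    // A stem of word characters followed by a {...} group of suffixes.
    private static final Pattern GROUP = Pattern.compile("(\\w+)\\{([^}]*)\\}");

    /** Rewrites each "stem{a b c}" in the query as "stema stemb stemc". */
    public static String expand(String query) {
        Matcher m = GROUP.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String stem = m.group(1);
            StringBuilder expanded = new StringBuilder();
            for (String suffix : m.group(2).trim().split("\\s+")) {
                if (expanded.length() > 0) {
                    expanded.append(' ');
                }
                expanded.append(stem).append(suffix);
            }
            // quoteReplacement stops '$' or '\' in the result being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(expanded.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this sketch, SuffixExpander.expand("mark{s es ing}") gives "marks markes marking", and text outside a {...} group is passed through unchanged.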

Note that there may be some limitations...but so far this has proved to be pretty powerful
Would still be good to know the limitations you see right now...

I mentioned there might be limitations because I kept running into new, difficult problems, and I just didn't know if something would come up that I could not get around, or if something would be too slow, etc. Not to mention I am still a little (or a lot, depending on who you talk to) wet behind the ears. So far I have not run into any limitations. That certainly does not mean they don't exist though :) I'm still crossing my fingers. My goal is to make this thing as perfect as I can. It's basically my new hobby.



- Mark

Mark Miller wrote:
I have finally delved back into the Lucene Query parser that I started a few months back. I am very close to wrapping up its initial development. I am currently looking for anybody willing to help me out with a little testing and maybe some design consultation (I am not happy with the current range query syntax, for one). If you have any interest in using this parser and have a little time to help out, please do. The parser is extremely customizable and you can basically mold it into whatever you want. A brief outline of the feature set:
The basics from the Lucene query parser are covered: escaping operators, handling tokens at the same position, range queries, etc.
Default Operators are: & | ! ~ ( )
New operators can be defined and default operators can be hidden on the fly.
Adds a proximity operator to the standard AND, OR, and ANDNOT operators, allowing for queries like:
(search bear) ~5 (snake & horse ~4 pope) | crazy query
The default space operator is customizable and can be made to bind tighter than if you use the actual operator (the operator acts like the actual operator but within parentheses).
The order of operations for the operators is customizable. The default order is |, &, ~, !, ( )...you can change it to whatever you want.
Query-time thesaurus expansion / General token to query expansion: Takes advantage of a general find/replace feature, "expand" might map to "(expander | expanded)" ... or any other valid syntax. There is also a slower RegEx feature so that you can match tokens with a Pattern and perform back reference enabled replacements. You can also make the replacement behave as an operator...you might map NEAR to ~10, creating a new operator that performs within 10 word proximity searches.
Did You Mean feature using the SpellCheck contrib: if you search for 'date(Aug 3, 1952) & mackine | rabbit' you might get a suggestion of: 'date(Aug 3, 1952) & machine | rabbit'
Paragraph/Sentence proximity search functionality. You can inject tokens to specify paragraph and sentence markers and perform SpanNotWithin searches for paragraph/sentence proximity searches.
Customizable date parser.
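
To make the token to query expansion item above a little more concrete, here is a rough Java sketch of the idea: exact-match find/replace rules are tried first, then the slower regex rules with back references. It is not the parser's code or API. The class TokenExpander, its map and mapRegex methods, and the whitespace-only tokenizing are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of token-to-query expansion; not the parser's API.
final class TokenExpander {

    private final Map<String, String> literalRules = new LinkedHashMap<>();
    private final Map<Pattern, String> regexRules = new LinkedHashMap<>();

    /** Registers an exact-match rule, e.g. "expand" -> "(expander | expanded)". */
    public TokenExpander map(String token, String replacement) {
        literalRules.put(token, replacement);
        return this;
    }

    /** Registers a regex rule; the replacement may use back references such as $1. */
    public TokenExpander mapRegex(String regex, String replacement) {
        regexRules.put(Pattern.compile(regex), replacement);
        return this;
    }

    /** Expands each whitespace-separated token (a simplification of real tokenizing). */
    public String expand(String query) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(expandToken(token));
        }
        return out.toString();
    }

    private String expandToken(String token) {
        // Cheap exact lookup first.
        String literal = literalRules.get(token);
        if (literal != null) {
            return literal;
        }
        // Slower path: the first regex that matches the whole token wins.
        for (Map.Entry<Pattern, String> rule : regexRules.entrySet()) {
            Matcher m = rule.getKey().matcher(token);
            if (m.matches()) {
                return m.replaceAll(rule.getValue());
            }
        }
        return token;
    }
}
```

A possible use, with made-up rules: new TokenExpander().map("expand", "(expander | expanded)").map("NEAR", "~10").expand("expand NEAR query") produces "(expander | expanded) ~10 query", which would then be handed to the query parser as ordinary syntax.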

Everything is pretty much configurable on the fly.
Note that there may be some limitations...but so far this has proved to be pretty powerful. I could sure use some testing help making it production ready though. I will be putting a new website up for the parser soon. Please send me a note if you can help out at all. When I put up the jar you can just run it with java -jar and it will provide a console input to enter queries and see the Lucene Query generated.
- Mark Miller





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

