Hi,
Here's a searchable mailing list archive:
http://www.gossamer-threads.com/lists/lucene/java-user/
As regards the wildcard phrase queries, here's one way I think you could
do it, but it's a bit of extra work. If you're using QueryParser, you'd
have to override the "getFieldQuery" method to use span queries instead
of phrase queries.
A phrase query can be implemented as a span query with a span or "slop"
factor of 1. So, once you have the PhraseQuery object, you would:
1. Extract the terms
2. For each one, check if it contains a "*" or a "?"
3. If it does, create a WildcardQuery using that term, and re-write it
using IndexReader.rewrite method. This expands the wildcard query into
all it's matches.
4. Create an array of SpanTermQuery objects, (one SpanTermQuery for each
term that matched you wildcard); then add that array to a SpanOrQuery.
5. Repeat 2 to 4 for each wildcard term in the phrase.
6. Finally (!), create a SpanNearQuery, adding all the original terms in
order, but substituting your SpanOrQuerys for the wildcard terms. Use a
slop of 1, and set the inOrder flag to true.
So, essentially, you'd end up with: (you'll have to excuse me if I
haven't rendered the span queries correctly as strings here - but this
should give the general idea...)
spanNear[boiler (spanOr[replacement replacing])]
So it will accept *either* "replacement" or "replacing" adjacent to
"boiler", which is what you want.
As you can see, it's a bit of work - but if you add this functionality
to the QueryParser, you'll can re-use it a lot!
Hope that helps!
-JB
Jon Loken wrote:
Hi all,
First of all, well done to the implementers of Lucene. The performance
is incredible! We get search results within 20-40 ms on an index about
1.5GB.
I could not find a Lucene maillist search engine, something I am a bit
surprised about!
My question is how I can implement wild carded phrases searches like:
"boiler replac*"
This will pick up text:
"boiler replacement" and "boiler replacing"
But not:
"boiling replacement" or "boiler user replacment"
I am using the queryParser through Spring-lucene-module.
I did simply try textToSearch= "boiler replac*", but this did not work
as anticipated. Have not analyzed properly, but it seemed to interpret
this as:
boiler OR replac*
Is there a way to implements this?
Many thanks,
Jon
BiP Solutions Limited is a company registered in Scotland with Company Number
SC086146 and VAT number 38303966 and having its registered office at Park
House, 300 Glasgow Road, Shawfield, Glasgow, G73 1SQ
****************************************************************************
This e-mail (and any attachment) is intended only for the attention of the
addressee(s). Its unauthorised use, disclosure, storage or copying is not
permitted. If you are not the intended recipient, please destroyall copies and
inform the sender by return e-mail.
This e-mail (whether you are the sender or the recipient) may be monitored,
recorded and retained by BiP Solutions Ltd.
E-mail monitoring/ blocking software may be used, and e-mail content may be
read at any time. You have a responsibility to ensure laws are not broken when
composing or forwarding e-mails and their contents.
****************************************************************************
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]