Hi,

Here's a searchable mailing list archive: http://www.gossamer-threads.com/lists/lucene/java-user/

As regards the wildcard phrase queries, here's one way I think you could do it, but it's a bit of extra work. If you're using QueryParser, you'd have to override the "getFieldQuery" method to use span queries instead of phrase queries.

An exact phrase query is equivalent to an ordered span query with a "slop" of 0, i.e. with no positions allowed between the terms. So, once you have the PhraseQuery object, you would:

1. Extract the terms.
2. For each one, check whether it contains a "*" or a "?".
3. If it does, create a WildcardQuery from that term and rewrite it by calling its rewrite method with your IndexReader. This expands the wildcard query into all its matching terms.
4. Create an array of SpanTermQuery objects (one SpanTermQuery for each term that matched your wildcard), then wrap that array in a SpanOrQuery.
5. Repeat 2 to 4 for each wildcard term in the phrase.
6. Finally (!), create a SpanNearQuery, adding all the original terms in order (each plain term wrapped in a SpanTermQuery), but substituting your SpanOrQuerys for the wildcard terms. Use a slop of 0 and set the inOrder flag to true. A slop of 1 would also let "boiler user replacement" through, which you don't want.
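To make the expansion steps concrete, here is a rough sketch in plain Java. It deliberately does not call Lucene: the class WildcardPhraseSketch, its methods, and the little list of "indexed" terms are all made up for illustration, and the expand method only stands in for what rewriting a WildcardQuery against the index would give you. It uses a slop of 0 on the assumption that slop counts the positions allowed between terms, so 0 means strictly adjacent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the Lucene classes; it only models the structure.
class WildcardPhraseSketch {

    // Step 2: a term needs expanding if it contains a wildcard character.
    static boolean isWildcard(String term) {
        return term.indexOf('*') >= 0 || term.indexOf('?') >= 0;
    }

    // Translate '*' and '?' into a regular expression, quoting every other character.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Step 3 (assumption): stands in for rewriting the wildcard query against the index.
    static List<String> expand(String wildcard, List<String> indexedTerms) {
        Pattern pattern = toPattern(wildcard);
        List<String> matches = new ArrayList<>();
        for (String term : indexedTerms) {
            if (pattern.matcher(term).matches()) {
                matches.add(term);
            }
        }
        return matches;
    }

    // Steps 4 to 6: plain terms stay as they are, wildcard terms become a
    // spanOr over their expansions, and everything goes into one spanNear.
    static String build(List<String> phraseTerms, List<String> indexedTerms) {
        StringBuilder sb = new StringBuilder("spanNear([");
        for (int i = 0; i < phraseTerms.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            String term = phraseTerms.get(i);
            if (isWildcard(term)) {
                sb.append("spanOr(").append(expand(term, indexedTerms)).append(")");
            } else {
                sb.append(term);
            }
        }
        return sb.append("], slop=0, inOrder=true)").toString();
    }

    public static void main(String[] args) {
        List<String> indexed =
                Arrays.asList("boiler", "boiling", "replacement", "replacing", "user");
        System.out.println(build(Arrays.asList("boiler", "replac*"), indexed));
    }
}
```

In a real implementation each string would become a SpanTermQuery, each expansion a SpanOrQuery, and the outer wrapper a SpanNearQuery. Note that a wildcard matching nothing in the index yields an empty spanOr, so the whole phrase would then match nothing.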

So, essentially, you'd end up with: (you'll have to excuse me if I haven't rendered the span queries correctly as strings here - but this should give the general idea...)

spanNear[boiler (spanOr[replacement replacing])]

So it will accept *either* "replacement" or "replacing" adjacent to "boiler", which is what you want.
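If it helps to see what "adjacent, in order" means for the examples in your mail, here is a small plain-Java model. Again this is not Lucene code: AdjacentMatchSketch and its methods are invented for illustration, and splitting on spaces is only a crude stand-in for whatever your analyzer does. Each slot of the phrase holds the terms accepted at that position, so the wildcard slot simply holds its expansions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of an ordered, zero-gap match over token positions.
class AdjacentMatchSketch {

    // Crude tokenizer (assumption): lower-case and split on single spaces.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase().split(" "));
    }

    // True if some run of consecutive tokens fills every slot, in order.
    static boolean matches(List<List<String>> slots, List<String> tokens) {
        for (int start = 0; start + slots.size() <= tokens.size(); start++) {
            boolean allSlotsFilled = true;
            for (int i = 0; i < slots.size() && allSlotsFilled; i++) {
                allSlotsFilled = slots.get(i).contains(tokens.get(start + i));
            }
            if (allSlotsFilled) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "boiler" followed directly by either expansion of "replac*".
        List<List<String>> slots = Arrays.asList(
                Arrays.asList("boiler"),
                Arrays.asList("replacement", "replacing"));
        String[] samples = {
            "boiler replacement", "boiler replacing",
            "boiling replacement", "boiler user replacement"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + matches(slots, tokens(sample)));
        }
    }
}
```

The first two samples match and the last two do not, which lines up with the behaviour you described wanting.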

As you can see, it's a bit of work - but if you add this functionality to the QueryParser, you can reuse it a lot!

Hope that helps!

-JB

Jon Loken wrote:
Hi all,
First of all, well done to the implementers of Lucene. The performance
is incredible! We get search results within 20-40 ms on an index of about
1.5GB.
I could not find a search engine for the Lucene mailing list, something
I am a bit surprised about!

My question is how I can implement wildcarded phrase searches like:
"boiler replac*"
This should pick up the text:
"boiler replacement" and "boiler replacing"
But not:
"boiling replacement" or "boiler user replacement"

I am using the queryParser through Spring-lucene-module.


I did simply try textToSearch= "boiler replac*", but this did not work
as anticipated. Have not analyzed properly, but it seemed to interpret
this as: boiler OR replac*


Is there a way to implement this?

Many thanks, Jon



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





