Douglas,

Acceptable performance is a subjective thing.

I am currently running tests with an index of 140005 "documents", and 507027 
terms.

A three field, boolean search, using a single term finds 12063 hits in 0.047 
seconds.

A three field, boolean search, using a single wildcard term (*word) finds 923 
hits in 0.375 seconds.

That's slower by roughly a factor of 8 (0.375 s versus 0.047 s). Significant, 
yes, but still much faster than my test UI can display the hits, and fast 
enough that supporting wildcard queries is a useful thing to do.

Looking at the source (version 1.9.1) for "WildcardQuery" and for 
"WildcardTermEnum", the class it uses to process the query, it does not appear 
to support multiple asterisk wildcards.

However, you could probably compose a boolean query joining two WildcardQueries 
to achieve that result.
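
As a rough illustration of that boolean combination, here is a plain-Java 
sketch that models the idea without touching Lucene at all. The class 
TwoWildcardSketch, its methods, and the sample documents are all hypothetical, 
invented for this example; in Lucene itself the equivalent would presumably be 
two WildcardQuery instances added to a BooleanQuery as optional clauses. The 
sketch assumes each pattern carries a single asterisk, and it treats a document 
as a hit when any of its terms matches either pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of OR-ing two single-wildcard queries.
// This is NOT Lucene code; every name here is hypothetical.
class TwoWildcardSketch {

    // Matches a pattern containing at most one '*' against a term.
    static boolean matchesSingleStar(String pattern, String term) {
        int star = pattern.indexOf('*');
        if (star < 0) {
            return pattern.equals(term); // no wildcard: exact match only
        }
        if (pattern.indexOf('*', star + 1) >= 0) {
            throw new IllegalArgumentException("only one '*' is supported in this sketch");
        }
        String prefix = pattern.substring(0, star);
        String suffix = pattern.substring(star + 1);
        // The length check stops prefix and suffix from overlapping in the term.
        return term.length() >= prefix.length() + suffix.length()
                && term.startsWith(prefix)
                && term.endsWith(suffix);
    }

    // Returns the ids of documents with at least one term matching
    // either pattern (a boolean OR of the two wildcard patterns).
    static List<String> searchEither(Map<String, List<String>> docs,
                                     String first, String second) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, List<String>> doc : docs.entrySet()) {
            for (String term : doc.getValue()) {
                if (matchesSingleStar(first, term) || matchesSingleStar(second, term)) {
                    hits.add(doc.getKey());
                    break; // one matching term is enough for this document
                }
            }
        }
        return hits;
    }

    // Made-up sample data: document id -> terms in that document.
    static Map<String, List<String>> sampleDocs() {
        Map<String, List<String>> docs = new LinkedHashMap<>();
        docs.put("doc1", Arrays.asList("password", "reset"));
        docs.put("doc2", Arrays.asList("wordsmith", "tools"));
        docs.put("doc3", Arrays.asList("swordsman", "duel"));
        docs.put("doc4", Arrays.asList("index", "terms"));
        return docs;
    }

    public static void main(String[] args) {
        System.out.println(searchEither(sampleDocs(), "*word", "word*"));
    }
}
```

Run as written, main prints [doc1, doc2]. Note that doc3 ("swordsman") is not 
found: the fragment sits in the middle of the term, so neither "*word" nor 
"word*" matches it. The combination only approximates a "*word*" search.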


-- Neal


-----Original Message-----
From: Douglas Smith (DataSmithy) [mailto:[EMAIL PROTECTED]
Sent: Friday, August 31, 2007 9:43 AM
To: [email protected]
Subject: Re: using mutliple wildcards in a term?

Hi Michael,

FYI, with version 2.1, I am using wildcards with the standard query
parser, and it seems to be working the way I expect.  That is, if I put
wildcards at the beginning *or* end of a word (prefix or suffix word
part), I get different result counts compared to a word without any
wildcards.

However, I was not able to get wildcards to work with the WildcardQuery
function searching on a single term (it returned no results).  It is
possible I may not have been using it correctly, since it was my first try.

Also, my index is apparently small enough that I don't get a significant
performance hit from using wildcards at the beginning of a term.

Does anybody know if Lucene supports wildcards at the beginning *and*
end of a term at the same time?  I am getting no results when I do this.

Also from an interface design point of view, if Lucene does not support
this, could it be argued that it should throw an error in this case,
instead of returning no results?

Michael Mitiaguin wrote:
> Douglas,
>
> I have never used it, but in the "Lucene in Action" book we can read:
> Wildcards at the beginning of a term are prohibited using QueryParser, but
> an API-coded WildcardQuery may use leading wildcards (at the expense of
> performance).
>
> Regards
> Michael
>
> On 8/31/07, Douglas Smith <[EMAIL PROTECTED]> wrote:
>
>> Hi everyone,
>>
>> Are wildcard queries intended to be able to support wildcards at the
>> beginning *and* end of a term?
>>
>> I am getting search results when I use a single wildcard (*), but not
>> when I use them at the beginning *and* end of a word.  The Lucene Java
>> documentation seems unclear on this point, but one of my requirements is
>> to find word fragments in the middle of words.
>>
>>
>> =====================================
>> Douglas M. Smith
>> =====================================
>> Email: [EMAIL PROTECTED]
>> Yahoo: [EMAIL PROTECTED]
>> Jabber: [EMAIL PROTECTED]
>> =====================================
>>
>> "For years there has been a theory that millions of monkeys typing at
>> random on millions of typewriters would reproduce the entire works of
>> Shakespeare. The Internet has proven this theory to be untrue."  -
>> Unknown
>>
>>
>>
>>
>
>

--
======================================
Douglas M. Smith
|--- DataSmithy ---|

email: [EMAIL PROTECTED]
work: 540-322-2204
home:  540-381-8939
fax:   866-330-9401
aim: datasmithy
yahoo: datasmithy
skype: datasmitty
jabber: [EMAIL PROTECTED]
======================================

