We just tried adding the "?" character to QueryParser.jj under
<#_TERM_START_CHAR>. We noticed that the "*" was in that list, so we figured
we'd just give it a try. It seems to have worked. Now when we search on
rou?d, we get hits on the word "round". We're going to try searching for
some other variations to make sure that we've done the right thing.

We'd still be interested to know exactly why this worked (assuming it
continues to solve our problem). What is a TERM_START_CHAR and how is it
used? Obviously it does something important. :-)
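
One way to reason about why the change could work, offered as an unconfirmed
guess and not as a description of the actual grammar: if <#_TERM_START_CHAR>
is an exclusion list (characters that may not appear in a plain term), then
while "?" is missing from it, rou?d tokenizes as an ordinary term and is
handed to the analyzer, which would split it at the "?". That would fit the
"rou d" parse reported in the original message. Once "?" is on the list, only
the wildcard token can cover the whole word. The Java sketch below models
this with regular expressions. The class name, the abbreviated character
lists, and the tie-breaking rule are all assumptions for illustration, not
code taken from QueryParser.jj.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TermStartCharSketch {
    // Hypothetical, abbreviated exclusion lists written as regex character-class bodies.
    // BEFORE: "*" is excluded from plain terms but "?" is not.
    static final String EXCLUDED_BEFORE = " \\t+\\-!():^\\[\\]\"{}~*";
    // AFTER: the same list with "?" added, mirroring the change described above.
    static final String EXCLUDED_AFTER = EXCLUDED_BEFORE + "?";

    /**
     * Classifies a query word under an assumed longest-match rule:
     * the token type that covers more of the word wins, and TERM wins ties.
     */
    static String classify(String word, String excluded) {
        String termChar = "[^" + excluded + "]";
        Pattern term = Pattern.compile(termChar + "+");
        Pattern wildTerm = Pattern.compile(termChar + "(?:" + termChar + "|[*?])*");
        int termLen = prefixLength(term, word);
        int wildLen = prefixLength(wildTerm, word);
        return (wildLen > termLen) ? "WILDTERM" : "TERM";
    }

    /** Length of the match anchored at the start of s, or 0 if there is none. */
    static int prefixLength(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.lookingAt() ? m.end() : 0;
    }

    public static void main(String[] args) {
        System.out.println(classify("rou*d", EXCLUDED_BEFORE));
        System.out.println(classify("rou?d", EXCLUDED_BEFORE));
        System.out.println(classify("rou?d", EXCLUDED_AFTER));
    }
}
```

In this model, rou?d is only classified as a wildcard term once "?" is in the
exclusion list, while rou*d is classified that way with either list.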

-----Original Message-----
From: Howk, Michael [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 27, 2002 11:14 AM
To: 'Lucene Users List'
Subject: RE: Wildcard Searching


The StandardAnalyzer uses a lowercase filter, but we tried indexing "the
round hat", just to make sure. The * still worked, but the ? still failed.

We noticed that the ? character is listed in the QueryParser as a WILDTERM.
But after that, the code heads into the WildcardQuery class, and we get lost
amidst "setEnum()" and "wildcardEquals()" stuff. :-)
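
For anyone lost in the same place, the matching rule that wildcard terms are
generally understood to follow can be sketched without any Lucene code: "*"
matches any run of characters (including none) and "?" matches exactly one
character. The Java below is a minimal standalone illustration of that rule.
The class and method names are hypothetical, and this is not the body of
Lucene's wildcardEquals().

```java
class WildcardSketch {
    /**
     * Returns true if text matches pattern, where '*' matches any run of
     * characters (possibly empty) and '?' matches exactly one character.
     * Matching is case sensitive.
     */
    static boolean matches(String pattern, String text) {
        int p = 0, t = 0;
        int starP = -1, starT = -1; // position of the last '*' and the text index it was tried at
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                // Remember the star and first try matching it against nothing.
                starP = p++;
                starT = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                // '?' consumes any one character; otherwise the characters must be equal.
                p++;
                t++;
            } else if (starP >= 0) {
                // Mismatch: backtrack and let the last '*' absorb one more character.
                p = starP + 1;
                t = ++starT;
            } else {
                return false;
            }
        }
        // Any trailing '*' can match the empty string.
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }

    public static void main(String[] args) {
        System.out.println(matches("rou?d", "round"));
        System.out.println(matches("roun*", "round"));
        System.out.println(matches("roun?", "roun"));
    }
}
```

Under this rule, roun? and rou?d should both match an indexed term "round",
so a failure on "?" points at how the query is parsed, not at the matching
step.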

Seriously though, we're using the StandardAnalyzer directly from Lucene. I
suppose it's possible that the ? is a special character that's getting
stripped out. But we need help to find out exactly where the special
characters are defined or filtered.

Michael

-----Original Message-----
From: Aruna Raghavan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 27, 2002 11:00 AM
To: 'Lucene Users List'
Subject: RE: Wildcard Searching


From my experience with wildcards:
1. They are case sensitive while the regular queries aren't.
2. Only one wildcard is allowed in a word. If you are using this with a
boolean query, you can use something like the following:
(asas*) AND (fhg*fd). This is acceptable.
3. At least one character is required before the wildcard in a query
(*fhhd is not valid).
4. Special characters are not supported (? may be a special character).
Hope this helps!

-----Original Message-----
From: Howk, Michael [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 27, 2002 10:56 AM
To: Lucene Mailing List (E-mail)
Subject: Wildcard Searching


We're really struggling with trying to understand why the WildcardQuery
seems to strip out the question mark by replacing it with a space. We're
using the daily build, and a StandardAnalyzer. We've got the text "The Round
Window" in our index. If we search on "roun*" the Lucene QueryParser returns
a hit. When we search on "roun?", we don't get any hits. We don't even know
how to make heads or tails of the WildcardQuery or WildcardTermEnum classes.

Also, Lucene returns the parsed version of each of our searches. When we
search by rou*d, Lucene parses it as rou*d (which is what we would expect).
But when we search by rou?d, Lucene parses it as "rou d". It seems to wrap
the term in quotes and replace the question mark with a space. Any ideas? Or
can someone give us an idea of how to understand WildcardQuery or
WildcardTermEnum?

Michael

--
To unsubscribe, e-mail:
<mailto:[EMAIL PROTECTED]>
For additional commands, e-mail:
<mailto:[EMAIL PROTECTED]>
