Try using a TermQuery instead of QueryParser to see if you get the results you expect. The term has to match the indexed token exactly, including case.

Also, when troubleshooting issues with QueryParser, it helps to see the actual Query it returns; try displaying its toString() output.
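
To illustrate why exact case matters for a single-term lookup, here is a minimal plain-Java sketch. It is not Lucene code: ExactTermSketch and its methods are hypothetical names, and the toy map only stands in for the idea that an un-analysed term (what a TermQuery carries) has to equal the indexed token character for character.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy inverted index keyed by exact token text (hypothetical, not Lucene).
final class ExactTermSketch {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index a document: split on whitespace only, keep case and punctuation.
    void add(int docId, String text) {
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields one empty token; skip it
            }
            postings.computeIfAbsent(token, k -> new HashSet<>()).add(docId);
        }
    }

    // Exact lookup: no analysis, no lowercasing, no wildcard handling.
    Set<Integer> lookup(String term) {
        return postings.getOrDefault(term, new HashSet<>());
    }

    public static void main(String[] args) {
        ExactTermSketch index = new ExactTermSketch();
        index.add(1, "<script language=\"JavaScript\">");
        // Same characters, same case: the document is found.
        System.out.println(index.lookup("language=\"JavaScript\""));
        // Different case: no match, because nothing normalises the term.
        System.out.println(index.lookup("language=\"javascript\""));
    }
}
```

In Lucene itself the equivalent check would be a TermQuery built from a Term holding the field name and one whole indexed token; the field name depends on how the index was built.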

        Erik

On Nov 16, 2004, at 6:25 AM, [EMAIL PROTECTED] wrote:

Hi,

We have indexed a set of web files (JSP, JS, XSLT, Java properties and
HTML) using the Lucene WhitespaceAnalyzer.
The purpose is to allow developers to find where code / functions are used
and defined across a large and disparate
content management repository, hopefully to aid code re-use, easier
refactoring and standards control.


However, when a query parser search is made using a whitespace analyser with
a string known to be in an indexed file, the search returns zero hits.


For example, the string
<jsp\:include page=\"/path1/path2/path3/path4/file1.jsp\" />
is searched for using the query parser (escaping the meta-chars).
Shouldn't an indexed document which contains the following text be found?


 // include HTML head
%>
             <jsp:include page="/path1/path2/path3/path4/file1.jsp" />

             <script language="JavaScript" src="/path1/path2/path3/file1.js"></script>
             <!-- <script>

I've taken a look at the FAQ advice regarding checking the effects of an
analyser (in our case whitespace), but our test class returns the expected
tokens for any given token stream. For example, the string "<% mytoken1
mytoken2 %>" is tokenised by the whitespace analyzer as [<%] [mytoken1]
[mytoken2] [%>].
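
The tokenisation described above can be reproduced without Lucene for a quick sanity check. The following is only a sketch under that assumption: WhitespaceSketch is a hypothetical stand-in that splits on whitespace and leaves case and punctuation alone; it is not the analyzer's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for whitespace-only tokenisation (hypothetical, not Lucene's class).
final class WhitespaceSketch {

    // Emit every maximal run of non-whitespace characters as one token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // flush the trailing token
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The test string from the message above.
        System.out.println(tokenize("<% mytoken1 mytoken2 %>"));
        // The indexed JSP line: the whole page="..." attribute is one token.
        System.out.println(tokenize(
            "<jsp:include page=\"/path1/path2/path3/path4/file1.jsp\" />"));
    }
}
```

Under this splitting the JSP line yields three tokens: <jsp:include, the complete page="..." attribute, and />. A search can only match if the query ends up containing exactly those tokens, so printing the parsed query and comparing it against such a list is a reasonable next step.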


I'm sure I've missed something but I can't see what it is. If anyone could
shed any light on possible reasons why we are getting zero hits for text
strings which are in our indexed files, I'd be really grateful. See below
for more info on the index and search setup.


Thanks a lot,
Lee C

File contents are in a tokenised, indexed, not stored field.
The index uses the whitespace analyzer which comes with Lucene.

Searches are performed using a boolean query. The boolean query combines a
query parser query, whose search term comes from an HTML text box filled in
by the user, with a prefix query which is used to limit search scope by
directory paths.
The search uses a whitespace analyzer; no filtering takes place.










---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




