Edward Quick wrote:
Hi,
Should type: and date: queries work with the search.jsp program?
I'm using Nutch 0.7 and have crawled the intranet at work. String searches
work fine, but I want to test out the new features added by John Xing in
the changelog
(http://cvs.sourceforge.net/viewcvs.py/nutch/nutch/CHANGES.txt?rev=1.48)
for 0.7.
When I search on something like:
news type:pdf
or
news type:application/pdf
I don't get any results, although I would expect some because all our news
docs are in PDF format.
You probably forgot to enable the index-more and query-more plugins. After you
do this, you need to re-index your segments.
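
For anyone following along: enabling the two plugins normally means extending
the plugin.includes property in conf/nutch-site.xml. This is a rough sketch
only. Every entry other than index-more and query-more is an assumption about
a typical setup, so keep whatever your own nutch-default.xml already lists and
just add "more" to the index and query groups.

```xml
<!-- Sketch only: entries other than index-more and query-more are assumed;
     copy the real list from your own nutch-default.xml. -->
<property>
  <name>plugin.includes</name>
  <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html|pdf|msword)|index-(basic|more)|query-(basic|site|url|more)</value>
</property>
```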
--
Best regards,
Andrzej Bialecki <><
Thanks for your answer.
I definitely did enable index-more and query-more because the search
results show file type, size, date:
Swap.PDF
[pdf] (3920 bytes) 2005.9.26 - View as Plain Text
If I just search on 'pdf' alone I get 866 hits, but get no hits with
type:pdf.
Perhaps I've just got the wrong end of the stick here, but should the
Nutch search.jsp be able to perform Lucene-style searches as well, for
example:
wildcard searches such as te*t
single character searches such as te?t
fuzzy searches such as roam~
title searches such as title:"Do it right"
and so on....
Appreciate any help.
Thanks,
Ed.
I found another article about this problem:
http://www.mail-archive.com/nutch-user%40lucene.apache.org/msg00826.html
and if I do
url:http type:msword
url:http type:pdf
url:http date:19000101-20050501
that works fine.
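
The queries above all pair a clause on an ordinary field (url:http) with the
type: or date: filter. Here is a small sketch of building such query strings
and URL-encoding them for search.jsp. The helper names, the host, and the
"query" request parameter are assumptions for illustration, not something
taken from the Nutch sources.

```python
from urllib.parse import quote_plus


def build_query(anchor, **filters):
    """Join an anchor clause (e.g. 'url:http') with field:value filter clauses."""
    clauses = [anchor] + [f"{field}:{value}" for field, value in filters.items()]
    return " ".join(clauses)


def search_url(base, query):
    """Append the URL-encoded query to a search page URL.

    The 'query' parameter name is an assumption about search.jsp.
    """
    return f"{base}?query={quote_plus(query)}"


if __name__ == "__main__":
    # Hypothetical host; substitute your own Nutch web app URL.
    base = "http://localhost:8080/search.jsp"
    for q in (
        build_query("url:http", type="msword"),
        build_query("url:http", type="pdf"),
        build_query("url:http", date="19000101-20050501"),
    ):
        print(q, "->", search_url(base, q))
```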
I would still like to do the other types of Lucene searches above, though,
e.g. te*t or te?t. Does anyone have any code to do this?
Thanks again.
Ed.