Just sharing my experience with setting the search directory for the
Nutch webapp.  Getting it wrong is a leading cause of the disappointing
"Hits 0-0 (out of about 0 total matching pages)" message.

I had a situation like Noah Silverman:

> On Thu, 2009-12-17 at 16:32 -0800, Noah Silverman wrote:
>   
>> Hello,
>>
>> Just to summarize.
>>
>> 1) Nutch crawl completes without error.
>>
>> 2) I can search from command line and see results.  (Assume this means
>> that index is created.)
>>     bin/nutch org.apache.nutch.searcher.NutchBean foobar
>>
>> 3) Tomcat configured through nutch-site file to point to nutch/crawl
>> directory
>>
>> 4) catalina.out logfile indicates that tomcat is opening nutch/crawl
>>     2009-12-16 22:00:39,740 INFO SearchBean - opening indexes in
>> /home/noah/Documents/nutch/crawl/indexes
>>
>> 5) No results when searching in web front end
>>
>> 6) No errors in the logs
>>
>> Is there some way to debug this?  Perhaps more verbose logging?
>>
>> Thanks!!!
>>
>> -N

The log message in step 4 is only somewhat helpful: it appears whether
or not the indexes can actually be opened, and if anything goes wrong,
nothing more is said.  Noah's problem was that he needed to point to the
top-level directory.  In my case, I needed to set the permissions
correctly.

I had crawled as root, so the crawl directory was root:root with
permissions 544 (at least readable, but with no execute bit for group or
other, so other users cannot traverse a directory with that mode).  I
moved it to $TOMCAT/work and gave it ownership $TOMCAT_USER:$TOMCAT_GROUP
with permissions 755.  Now it works.
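
Since the webapp won't tell you about this, here is a rough sketch of a
check you could run over the crawl directory before pointing Tomcat at
it.  It only looks at the "other" permission bits, which assumes the
Tomcat user is neither the owner nor in the group of the files (true in
my root:root case, maybe not in yours).  The function name
find_inaccessible is mine, not anything from Nutch.

```python
import os
import stat


def find_inaccessible(root):
    """Return paths under root that "other" users cannot read or traverse.

    Assumption: the web server user is neither owner nor group member,
    so only the "other" permission bits matter.
    """
    problems = []
    # A directory needs both r (list) and x (traverse) for other users.
    dir_bits = stat.S_IROTH | stat.S_IXOTH
    for dirpath, dirnames, filenames in os.walk(root):
        if os.stat(dirpath).st_mode & dir_bits != dir_bits:
            problems.append(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            # A regular file only needs the read bit.
            if not os.stat(path).st_mode & stat.S_IROTH:
                problems.append(path)
    return problems
```

Calling find_inaccessible("/path/to/crawl") returns a list of offending
paths; an empty list means the mode bits at least look right.  A
directory with mode 544 gets reported, one with 755 does not.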

In any case, the Nutch webapp will simply log at INFO level that it's
opening indexes at $DIR.  If permissions are wrong, or the directory
doesn't exist, it will say nothing, not even at DEBUG level.  No
exceptions will be thrown.
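
Because of that silence, it may be worth checking the directory yourself
before blaming the index.  Below is a minimal sketch, not anything that
ships with Nutch: check_search_dir is a made-up name, and the default
list of expected subdirectories contains only "indexes" because that is
the one named in the log line above.  Adjust it to whatever your crawl
actually produced.  Run it as the same user Tomcat runs as, otherwise
the access check tells you nothing useful.

```python
import os


def check_search_dir(path, expected=("indexes",)):
    """Return a list of problems found; an empty list means none found.

    `expected` is an assumption about which subdirectories should exist.
    """
    if not os.path.isdir(path):
        return [f"{path}: missing or not a directory"]
    problems = []
    # os.access checks permissions for the user running this script.
    if not os.access(path, os.R_OK | os.X_OK):
        problems.append(f"{path}: not readable/traversable by this user")
    for name in expected:
        sub = os.path.join(path, name)
        if not os.path.isdir(sub):
            problems.append(f"{sub}: missing or not a directory")
        elif not os.access(sub, os.R_OK | os.X_OK):
            problems.append(f"{sub}: not readable/traversable by this user")
    return problems
```

For example, check_search_dir("/home/noah/Documents/nutch/crawl") would
report a missing "indexes" subdirectory if the configured path pointed at
the wrong level.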

