No, I added that property to nutch-site.xml in both the nutch and webapps directories, and it still didn't work. I don't know what else to try, so any help would be appreciated. I also changed from Ubuntu to Fedora, and it doesn't work there either.
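
One thing worth ruling out is whether the searcher.dir value really points at a crawl root. According to the property description quoted below, that directory is searched, in order, for the file search-servers.txt, the directory "index", or the directory "segments". Here is a minimal sketch of that check. Note that check_crawl_root is a hypothetical helper written for this message, not part of Nutch, and it only mirrors the lookup order from the description; it does not verify that the index itself is valid.

```shell
#!/bin/sh
# check_crawl_root: hypothetical helper, NOT a Nutch command.
# Prints which of the three items named in the searcher.dir description
# is found first in the given directory, following the documented order.
# Returns non-zero when none of them is present.
check_crawl_root() {
    dir="$1"
    if [ -f "$dir/search-servers.txt" ]; then
        echo "search-servers.txt"
    elif [ -d "$dir/index" ]; then
        echo "index"
    elif [ -d "$dir/segments" ]; then
        echo "segments"
    else
        echo "none"
        return 1
    fi
}
```

For example, `check_crawl_root crawl` would be the call for a crawl run with `-dir crawl`, and `check_crawl_root /usr/tmp/13sites` for the layout shown in the quoted message. If it prints "none", the path in searcher.dir is probably not the crawl root, which would be consistent with zero hits.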

2007/8/14, Kai_testing Middleton <[EMAIL PROTECTED]>:
>
> Does the following fix it?
>
> <!-- This is so that NutchBean will work on the command line -->
> <property>
>   <name>searcher.dir</name>
>   <value>/usr/tmp/13sites</value>
>   <description>
>   Path to root of crawl. This directory is searched (in
>   order) for either the file search-servers.txt, containing a list of
>   distributed search servers, or the directory "index" containing
>   merged indexes, or the directory "segments" containing segment
>   indexes.
>   </description>
> </property>
>
> I think you need to set searcher.dir to the directory of your index, as I
> did in the example above.
>
> To be thorough, this is what 13sites looks like:
>
> $ cd /usr/tmp/13sites/
> $ ls -latr
> total 14
> drwxr-xr-x  12 kai  wheel   512 Jul  5 00:27 segments
> drwxr-xr-x   3 kai  wheel   512 Jul  5 01:21 crawldb
> drwxr-xr-x   3 kai  wheel   512 Jul  5 01:24 linkdb
> drwxr-xr-x   3 kai  wheel   512 Jul  5 01:33 indexes
> drwxr-xr-x   7 kai  wheel   512 Jul  5 01:33 .
> drwxr-xr-x   2 kai  wheel   512 Jul  5 01:33 index
> drwxr-xr-x  19 kai  wheel  1024 Aug 14 07:20 ..
>
> ----- Original Message ----
> From: Fabian López <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Tuesday, August 14, 2007 5:11:52 AM
> Subject: UBUNTU total hits 0
>
> Hi,
> after following the Nutch 0.8 tutorial, when I try to search with
>
> bin/nutch org.apache.nutch.searcher.NutchBean apache
>
> I receive "Total Hits:0"
>
> I have followed all the steps:
>
> 1. Create a directory with a flat file of root urls. For example, to
>    crawl the nutch site you might start with a file named urls/nutch
>    containing the url of just the Nutch home page. All other Nutch pages
>    should be reachable from this page. The urls/nutch file would thus
>    contain:
>
>    http://lucene.apache.org/nutch/
>
> 2. Edit the file conf/crawl-urlfilter.txt and replace MY.DOMAIN.NAME
>    with the name of the domain you wish to crawl. For example, if you
>    wished to limit the crawl to the apache.org domain, the line should
>    read:
>
>    +^http://([a-z0-9]*\.)*apache.org/
>
>    This will include any url in the domain apache.org.
>
> 3. Edit the file conf/nutch-site.xml, insert at minimum following
>    properties into it and edit in proper values for the properties....
>
> Then I executed:
>
> bin/nutch crawl urls -dir crawl -depth 3 -topN 50
>
> The only problem I noticed is that, during fetching, there is a
> java.lang.NullPointerException.
> My questions are:
>
> 1.- Is this the cause of the problem? How can I solve it?
> 2.- Is this the reason why I always get HTTP Status 500 at
> http://localhost:8080?
> No Context configured to process this request - HTTP Status 500
> <http://www.mail-archive.com/[email protected]/msg09150.html>
>
> Thanks a lot
> Fabian
