Re: UBUNTU total hits 0

Fabian López Fri, 24 Aug 2007 05:26:55 -0700

No, I added that property to nutch-site.xml in both the nutch and webapps
directories and it didn't work. I don't know what else to do! Please, some
help. I changed from Ubuntu to Fedora and it still doesn't work!


2007/8/14, Kai_testing Middleton <[EMAIL PROTECTED]>:
>
> Does the following fix it?
>
> <!-- This is so that NutchBean will work on the command line -->
> <property>
>   <name>searcher.dir</name>
>   <value>/usr/tmp/13sites</value>
>   <description>
>   Path to root of crawl.  This directory is searched (in
>   order) for either the file search-servers.txt, containing a list of
>   distributed search servers, or the directory "index" containing
>   merged indexes, or the directory "segments" containing segment
>   indexes.
>   </description>
> </property>
>
> I think you need to set searcher.dir to the root directory of your crawl
> (the one that contains your index), as I did in the example above.
>
> To be thorough, this is what 13sites looks like:
>
> $ cd /usr/tmp/13sites/
> $ ls -latr
> total 14
> drwxr-xr-x  12 kai  wheel   512 Jul  5 00:27 segments
> drwxr-xr-x   3 kai  wheel   512 Jul  5 01:21 crawldb
> drwxr-xr-x   3 kai  wheel   512 Jul  5 01:24 linkdb
> drwxr-xr-x   3 kai  wheel   512 Jul  5 01:33 indexes
> drwxr-xr-x   7 kai  wheel   512 Jul  5 01:33 .
> drwxr-xr-x   2 kai  wheel   512 Jul  5 01:33 index
> drwxr-xr-x  19 kai  wheel  1024 Aug 14 07:20 ..
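
A quick way to see what a given searcher.dir value would offer the searcher is to walk the same list the property description gives. This is only a minimal Python sketch of that lookup order (search-servers.txt, then index, then segments). It does not call Nutch, and the function name resolve_searcher_dir is made up for illustration. It also prints the absolute path, on the assumption that a relative value such as "crawl" is resolved against the directory the command is run from.

```python
import os


def resolve_searcher_dir(root):
    """Return the first entry found under root, in the order listed in the
    searcher.dir description, or None if none of them exists.

    Assumption: this mirrors only the property description text, not the
    actual Nutch implementation.
    """
    candidates = [
        ("search-servers.txt", os.path.isfile),  # list of distributed search servers
        ("index", os.path.isdir),                # merged indexes
        ("segments", os.path.isdir),             # per-segment indexes
    ]
    for name, exists in candidates:
        if exists(os.path.join(root, name)):
            return name
    return None


if __name__ == "__main__":
    # "crawl" is the directory passed to "bin/nutch crawl ... -dir crawl".
    root = "crawl"
    print("checking:", os.path.abspath(root))
    print("found:", resolve_searcher_dir(root))
```

If this prints "found: None" for the value you put in searcher.dir, the path is the first thing to look at. For a directory like 13sites above, it should report "index".
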
>
> ----- Original Message ----
> From: Fabian López <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Tuesday, August 14, 2007 5:11:52 AM
> Subject: UBUNTU total hits 0
>
> Hi,
> after following the tutorial of Nutch 0.8, when I try to search with
>
> bin/nutch org.apache.nutch.searcher.NutchBean apache
>
> I receive "Total Hits:0"
>
> I have followed all the steps:
>
>
>    1. Create a directory with a flat file of root urls. For example, to
>    crawl the nutch site you might start with a file named urls/nutch
>    containing the url of just the Nutch home page. All other Nutch pages
>    should be reachable from this page. The urls/nutch file would thus contain:
>
>    http://lucene.apache.org/nutch/
>
>    2. Edit the file conf/crawl-urlfilter.txt and replace MY.DOMAIN.NAME
>    with the name of the domain you wish to crawl. For example, if you
>    wished to limit the crawl to the apache.org domain, the line should read:
>
>    +^http://([a-z0-9]*\.)*apache.org/
>
>    This will include any url in the domain apache.org.
>    3. Edit the file conf/nutch-site.xml, insert at minimum following
>    properties into it and edit in proper values for the properties....
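
If the crawl fetches nothing, the filter line from step 2 is worth testing against the seed url before anything else. Below is a minimal Python sketch, assuming the leading "+" means "include" and that the pattern is applied as a find-style match from the start of the url. That is my understanding of the regex url filter, not something checked against the Nutch source. The helper name accepts is hypothetical.

```python
import re

# The line from step 2, exactly as it appears in conf/crawl-urlfilter.txt.
FILTER_LINE = r"+^http://([a-z0-9]*\.)*apache.org/"


def accepts(filter_line, url):
    """Return True if a single '+' rule matches the url.

    Assumption: '+' includes and '-' excludes, and a url that matches no
    rule is ignored. Only one rule is evaluated here.
    """
    sign, pattern = filter_line[0], filter_line[1:]
    if sign not in ("+", "-"):
        raise ValueError("filter line must start with '+' or '-'")
    return sign == "+" and re.search(pattern, url) is not None


if __name__ == "__main__":
    for url in (
        "http://lucene.apache.org/nutch/",   # the seed url from step 1
        "http://example.com/",               # a different domain
        "https://lucene.apache.org/nutch/",  # https is not covered by the rule
    ):
        print(url, "->", accepts(FILTER_LINE, url))
```

Note that the dot in "apache.org" is not escaped in the pattern, so it matches any character there. That is harmless for this check, but it means the rule is slightly broader than "the domain apache.org".
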
>
> Then I executed:
>
> bin/nutch crawl urls -dir crawl -depth 3 -topN 50
>
> The only problem I noticed is that, during fetching, there is a
> java.lang.NullPointerException.
> My questions are:
>
> 1. Is this the cause of the problem? How can I solve it?
> 2. Is this also the reason why I always get an HTTP Status 500 at
> http://localhost:8080:
> No Context configured to process this request - HTTP Status 500
> <http://www.mail-archive.com/[email protected]/msg09150.html>
>
>
> Thanks a lot,
> Fabian
>
