What's the actual error thrown? Or is it just completing without anything
being added to the index?

On Tue, Jul 31, 2012 at 3:30 AM, veryblues_cn <[email protected]> wrote:

> My environment is Win7, Tomcat 6.0, Cygwin, Nutch 1.5.1, Solr 3.6.0.
> I downloaded both the nutch-1.5.1 src and bin zip packages, plus
> hadoop 0.20.0.
>
> 1. Configure the environment (Tomcat and Cygwin).
>
> 2. Extract nutch-1.5.1 (bin) and Solr 3.6.0 in cygwin/home/.
>
> 3. Configure solr-3.6 in Tomcat (after putting solr.war in Tomcat and
> restarting Tomcat), with solr/home set to:
>
> <env-entry-value>C:\cygwin\home\apache-solr-3.6.0\example\solr</env-entry-value>
>
> 4. Copy the nutch command script file from nutch-1.5.1(src)/src/bin to
> nutch-1.5.1(bin)/bin.
>
> 5. Create a new folder named 'urls' in cygwin/home/nutch-1.5.1/bin, and a
> text file named 'nutch.txt' in it, so I can put the crawl target URL
> (www.google.com) in it.
>
> 6. Modify regex-urlfilter.txt by adding:
> +^http://([a-z0-9]*\.)*www.google.com/
> +^http://\S*/
>
> 7. Configure nutch-site.xml by adding:
> <configuration>
>         <name>http.agent.name</name>
>         <value>My Nutch Spider</value>
> </configuration>
>
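
In case it helps to compare: below is a minimal sketch of how this entry is
usually laid out in conf/nutch-site.xml, assuming the snippet quoted above is
the whole file and not an excerpt. Nutch reads Hadoop-style configuration
files, which expect each name/value pair to be wrapped in its own <property>
element. The agent name is copied from the message above; everything else is
an assumption about the intended file.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Each setting sits inside its own <property> element. -->
  <property>
    <name>http.agent.name</name>
    <value>My Nutch Spider</value>
  </property>
</configuration>
```

With the override in nutch-site.xml, nutch-default.xml can normally be left
as shipped.
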
> 8. Add the value of http.agent.name in nutch-default.xml, as below:
>
> <property>
>   <name>http.agent.name</name>
>   <value>My Nutch Spider</value>
>
> 9. The Hadoop bundled with nutch-1.5.1 is 1.0.3, and it throws an
> IOException (something about ). So I replaced it with
> hadoop-core-0.20.0.jar, and I also renamed 'hadoop-core-0.20.0.jar' to
> 'hadoop-core-1.0.3.jar'.
>
> The above is what I did to configure Nutch, but I wonder whether I have
> made a mistake. The crawl runs without any error. If you know what the
> problem is, could you please tell me?
>
>
> Thanks all ^_^
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/No-output-to-solr-no-running-error-with-my-install-and-config-of-nutch-tp3998290.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
