I downloaded nutch-2008-03-19_06-13-38.tar.gz and am trying to recrawl using the crawl command.

nutch/bin/crawl.sh
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
#!/bin/bash
cd /nutch
bin/nutch crawl urls -dir crawl -depth 1 -topN 5 >> /tmp/crawl.log 
#touch $CATALINA_HOME/webapps/nutch/WEB-INF/web.xml
cd /
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

crontab -e
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
12 17 * * *  /nutch/bin/crawl.sh
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

When I run crawl.sh by hand, it works correctly. But when cron runs crawl.sh,
/tmp/crawl.log contains: "Error: JAVA_HOME is not set."

I don't know what's wrong. Please help.

Thanks.
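
A possible cause, offered as a sketch rather than a confirmed diagnosis: cron usually starts jobs with a very small environment and does not read login files such as ~/.bash_profile, so a JAVA_HOME exported in an interactive shell would not be visible to crawl.sh. The snippet below imitates that situation with `env -i` (an empty environment). The JDK path /usr/lib/jvm/java is a hypothetical placeholder, not something taken from the setup above.

```shell
#!/bin/bash
# Hypothetical JDK location; replace with the real one on your machine.
JDK=/usr/lib/jvm/java
export JAVA_HOME="$JDK"

# Full path to bash, so it can still be found after the environment is cleared.
BASH_BIN=$(command -v bash)

# Child shell inherits the environment, as when crawl.sh is run by hand.
interactive=$("$BASH_BIN" -c 'echo "${JAVA_HOME:-unset}"')

# Child shell starts from an empty environment, roughly like a cron job.
cronlike=$(env -i "$BASH_BIN" -c 'echo "${JAVA_HOME:-unset}"')

# Same empty environment, but the job sets JAVA_HOME itself before using it.
fixed=$(env -i "$BASH_BIN" -c "export JAVA_HOME=$JDK; echo \"\${JAVA_HOME:-unset}\"")

echo "by hand:   $interactive"
echo "cron-like: $cronlike"
echo "fixed:     $fixed"
```

If this is indeed the cause, adding a single `export JAVA_HOME=...` line to crawl.sh before the `bin/nutch` call should be enough. Many cron implementations also accept a `JAVA_HOME=...` assignment line at the top of the crontab.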
