Yes,
Prashant, check nutch-site.xml for the crawl-dir setting and nutch-default.xml for
the agent and robots entries.
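
One quick way to confirm the agent entry is really set is to read the config file and look the property up. The following is only a rough sketch using the JDK's XML parser, not anything from Nutch itself: the class AgentConfigCheck and its property() helper are made up for illustration, the sample value my-test-crawler is a placeholder, and it assumes the usual Nutch property name http.agent.name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class AgentConfigCheck {

    // Hypothetical helper (not a Nutch API): returns the trimmed <value> of the
    // <property> whose <name> equals the given name, or "" if it is absent.
    static String property(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && name.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return "";
        } catch (Exception e) {
            throw new RuntimeException("could not parse config XML", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder content in the nutch-site.xml layout; the value is an assumption.
        String xml = "<configuration><property><name>http.agent.name</name>"
                + "<value>my-test-crawler</value></property></configuration>";
        String agent = property(xml, "http.agent.name");
        System.out.println(agent.isEmpty() ? "agent name missing" : "agent name set");
    }
}
```

To check a real file, read its contents into the xml string instead of the inline sample.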
Then give the URL you want to crawl, and after that write the full path of the
site in crawl-urlfilter.txt, for example www.rajshri.com.
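
To see whether a filter line would let your seed URL through, you can try the pattern outside Nutch. This is only a sketch: UrlFilterCheck and accepts() are made-up names, and it assumes a rule line is a + or - followed by a regular expression that is searched for in the URL, with lines starting with # treated as comments. Note that if the leading # shown in the quoted rule is really in the file, that line would be ignored under this assumption.

```java
import java.util.regex.Pattern;

public class UrlFilterCheck {

    // Hypothetical helper (not a Nutch API): applies one rule line from
    // crawl-urlfilter.txt to a URL. Returns true only for a "+" rule whose
    // regex is found in the URL; comments, blanks and "-" rules give false.
    static boolean accepts(String ruleLine, String url) {
        String rule = ruleLine.trim();
        if (rule.isEmpty() || rule.startsWith("#")) {
            return false; // comment or blank line: the rule has no effect
        }
        boolean include = rule.charAt(0) == '+';
        Pattern pattern = Pattern.compile(rule.substring(1));
        return include && pattern.matcher(url).find();
    }

    public static void main(String[] args) {
        // The stock rule with MY.DOMAIN.NAME replaced by the example domain.
        String rule = "+^http://([a-z0-9]*\\.)*rajshri.com/";
        String seed = "http://www.rajshri.com/";

        System.out.println("active rule:        " + accepts(rule, seed));
        System.out.println("commented-out rule: " + accepts("#" + rule, seed));
    }
}
```

If the first line prints false for your own seed URL, the pattern and the URL in your url file do not agree and nothing will be fetched.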

To track down the error, enable Hadoop logging via the log4j.properties
file.

Then let me know if it works out.

Thanks,
Ratnesh

prashant_nutch wrote:
> 
> Any help with crawling in Eclipse in a Windows environment?
> I made the following changes:
> 
> 1. crawl-urlfilter.txt: in #+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/ I put
>    my site name.
> 2. nutch-site.xml: changed the robot agent and agent name, and also
>    search.dir.
> 
> Then I made a folder, urldir, which holds the URL name.
> Everything seems fine, because running it in Eclipse gives no errors, but
> that particular site is still not crawled.
> What is the problem?
> 

-- 
View this message in context: 
http://www.nabble.com/Nutch-On-Eclipse-%28windows%29-tf3426037.html#a9571635
Sent from the Nutch - User mailing list archive at Nabble.com.