On 13.12.2014 at 10:27, Shane Wood wrote:
> I am asking a few websites to allow me to index their sites. What should
> they add to the robots.txt, and where do I get the exact name of my crawler?
If you are using nutch-1.9, there is a file
conf/nutch-site.xml.
In this config file you define properties such as
  <name>http.agent.name</name>
and those following it.

This is used for identifying your crawler.
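
As a rough sketch, an entry in conf/nutch-site.xml could look like the following. The value "MyCrawler" is only a placeholder I made up for illustration, so replace it with the name you want your crawler to announce. Settings in nutch-site.xml override the defaults from conf/nutch-default.xml.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: local overrides for conf/nutch-default.xml -->
<configuration>
  <property>
    <name>http.agent.name</name>
    <!-- "MyCrawler" is a hypothetical placeholder, not a name from this thread -->
    <value>MyCrawler</value>
  </property>
</configuration>
```

The site owners would then most likely refer to that same name in a "User-agent:" line of their robots.txt, followed by the rules they want to apply to your crawler.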

Have you already set this property, and did your crawler not use it?
> 
> Cheers.
> Shane
> 
Regards,
 Patrick
