One particular site doesn't show up in cached.jsp as a cached page.
Instead, the cache view shows this message:

Display of this content was administratively prohibited by the
webmaster. You may visit the original page instead: http://journals/



The robots.txt file for this website is

User-Agent: *
Disallow: /directory.bml

#
# Blocked journals aren't listed here because robots.txt files
# can't be above 50k or so, depending on the spider.
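
As a quick sanity check, the rules quoted above can be fed to Python's
standard urllib.robotparser to see what they permit. This is only a
sketch: the helper name is_allowed and the "*" agent are assumptions made
for illustration, and it says nothing about how Nutch itself evaluates
robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two rule lines from the robots.txt quoted above (comments omitted).
ROBOTS_TXT = """\
User-Agent: *
Disallow: /directory.bml
"""


def is_allowed(url, agent="*"):
    """Hypothetical helper: check a URL against the quoted rules."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# The site root is not covered by the single Disallow rule.
root_allowed = is_allowed("http://journals/")
# The only path the rules block.
directory_allowed = is_allowed("http://journals/directory.bml")

print(root_allowed, directory_allowed)
```

If the root page comes back as allowed, robots.txt alone would not seem
to explain the message shown in the cache view.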



This site is based on LiveJournal, an open source blogging
application. Why hasn't Nutch cached the content of this page?

- B. Hugh
