Look here, it is blocking robots: http://ulysses.wyona.org/robots.txt
User-agent: *
Disallow: /foo/bar.html
User-agent: lenya
Disallow: /foo/bar.html
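
To see which URLs these rules actually cover, the file can be fed to a parser. Below is a minimal sketch using Python's standard urllib.robotparser. The agent name "Nutch" and the helper name allowed() are assumptions for illustration only; the real crawler's agent string depends on its configuration.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as quoted above from robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /foo/bar.html
User-agent: lenya
Disallow: /foo/bar.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the rules above.

    `agent` is a hypothetical user-agent name supplied by the caller.
    """
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://ulysses.wyona.org"
    for agent in ("Nutch", "lenya"):  # "Nutch" is an assumed agent name
        for path in ("/", "/foo/bar.html"):
            print(agent, path, allowed(agent, base + path))
```

Running this prints one line per agent and path, which makes it easy to compare the rules against the URLs the crawler is expected to follow.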
Michael Wechner wrote:
Hi
I am trying to index http://ulysses.wyona.org/ but somehow it only
indexes the homepage and doesn't seem to follow
any links. I have set "depth 3", and other sites are being crawled
deeper without a problem, but not the Ulysses page.
Has anyone had a similar experience?
Is it possible that Nutch has a problem with well-formed XHTML
(application/xhtml+xml)?
Thanks
Michi