http://www.google.se/robots.txt
Google disallows it in its robots.txt, which Nutch obeys:

User-agent: *
Allow: /searchhistory/
Disallow: /search

Larsson85 wrote:
> Why isn't Nutch able to handle links from Google?
>
> I tried to start a crawl from the following URL:
> http://www.google.se/search?q=site:se&hl=sv&start=100&sa=N
>
> And all I get is "no more URLs to fetch".
>
> The reason I want to do this is that I had a thought that maybe I
> could use Google to generate my start list of URLs by injecting pages of
> search results.
>
> Why won't this page be parsed and links extracted so the crawl can start?
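
If you want to check this outside Nutch, here is a minimal sketch using Python's standard urllib.robotparser. It parses only the three robots.txt rules quoted in this reply, not Google's full file, and the helper name is_allowed is my own invention for illustration. Python's parser applies the first matching rule, which may differ in detail from how Nutch evaluates the file, so treat it as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Only the rules quoted in this thread; the real robots.txt is much longer.
ROBOTS_LINES = [
    "User-agent: *",
    "Allow: /searchhistory/",
    "Disallow: /search",
]


def is_allowed(url, agent="*"):
    """Return True if the quoted rules permit `agent` to fetch `url`.

    `is_allowed` is a hypothetical helper, not part of Nutch.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    seed = "http://www.google.se/search?q=site:se&hl=sv&start=100&sa=N"
    print(is_allowed(seed))                                   # False
    print(is_allowed("http://www.google.se/searchhistory/"))  # True
```

The seed URL starts with /search, so it is rejected before any fetch happens. That would explain the "no more URLs to fetch" message: the only injected URL is filtered out.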