On Thu, 11 Feb 2010, Michael White wrote:

> :session_id=9E40BFD899A2AA5C23E81404AF5B97A5:internal_error:-- URL Was:
> https://dspace.stir.ac.uk/dspace/browse-title?bottom=1893/214
[snip]
> --------------------------------
> User-agent: *
>
> Disallow: /browse-author
> Disallow: /items-by-author
> Disallow: /browse-date
> Disallow: /browse-subject
> --------------------------------
You should add "/dspace" to the start of those disallowed patterns,
because your DSpace URLs start with "/dspace" after the hostname.

The "standard" (or rather, consensus) has this to say about Disallow
fields in robots.txt:

  "The value of this field specifies a partial URL that is not to be
  visited. This can be a full path, or a partial path; any URL that
  starts with this value will not be retrieved."

Note the "starts with": a pattern such as "/browse-author" cannot match
a URL whose path begins with "/dspace/".

See also: http://www.robotstxt.org/

Best regards,

-- 
Tom De Mulder <td...@cam.ac.uk> - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
-> 11/02/2010 : The Moon is Waning Crescent (19% of Full)

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
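
The "starts with" rule above can be checked locally. The following is a minimal sketch using Python's standard urllib.robotparser, which is only one implementation of the consensus and may differ in detail from any particular crawler. The helper name blocked() and the test path /dspace/browse-author are assumptions made for illustration; they do not come from the original report.

```python
from urllib.robotparser import RobotFileParser


def blocked(rules: str, path: str, agent: str = "*") -> bool:
    """Return True if `rules` (robots.txt text) disallow `path` for `agent`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, path)


# Rules as quoted in the original message (no "/dspace" prefix).
ORIGINAL = """\
User-agent: *
Disallow: /browse-author
Disallow: /items-by-author
Disallow: /browse-date
Disallow: /browse-subject
"""

# Same rules with the suggested "/dspace" prefix added.
FIXED = """\
User-agent: *
Disallow: /dspace/browse-author
Disallow: /dspace/items-by-author
Disallow: /dspace/browse-date
Disallow: /dspace/browse-subject
"""

# Hypothetical URL path on a DSpace instance deployed under /dspace.
PATH = "/dspace/browse-author"

if __name__ == "__main__":
    print("original rules block it:", blocked(ORIGINAL, PATH))
    print("prefixed rules block it:", blocked(FIXED, PATH))
```

With the unprefixed rules the path is still fetchable, because "/dspace/browse-author" does not start with "/browse-author"; with the prefixed rules it is disallowed.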