Thanks a lot, Kim. I understand that I should add a line to robots.txt to stop crawlers from accessing /displaystats. I don't think there is any need for crawlers to access such pages.
Panyarak Ngamsritragul
Prince of Songkla University

On Wed, 6 Oct 2010, Kim Shepherd wrote:
I should point out that my robots.txt suggestions assume you don't want any stats pages crawled at all... if that's not true, it's probably best to apply the patch for DS-689 and wait for Google to de-index (and make the robots.txt entries more specific if there are only a few invalid handles being requested).

Cheers,
Kim

On 6 October 2010 00:30, Kim Shepherd <[email protected]> wrote:

Hi Panyarak,

It might be an idea to add /displaystats to your JSPUI's robots.txt and to any Google Webmaster Tools robots.txt files or Page Removal Requests.

For Google to de-index pages, it generally likes to see a 404 (not found) or a 410 (gone). Unfortunately, the servlet that handles statistics display for JSPUI throws a NullPointerException when a handle is passed to it that doesn't turn into a valid DSpace object. It *should* throw a friendly 404 to help crawlers like Google realise the page is gone.

I've opened a JIRA issue for the NPE bug - http://jira.dspace.org/jira/browse/DS-689 - and attached a patch for 1.6.2 (and trunk, and probably other 1.6.x versions) that will make sure that when anyone (including Google) visits those pages, it sees a 404 instead of "Internal Server Error".

Hopefully this, along with /displaystats (and/or /displaystats* ?) in your robots.txts or removal requests will help convince Google to stop crawling.

Cheers,
Kim
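
For anyone wanting to sanity-check the robots.txt suggestion before deploying it, here is a minimal sketch using Python's standard urllib.robotparser. The rule text, the "handle" query parameter, and the handle 123456789/1 are illustrative assumptions, not taken from a real DSpace installation. Python's parser uses plain prefix matching, so it suggests that "Disallow: /displaystats" already covers URLs with query strings and that the trailing * should not be needed for crawlers that match by prefix.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: block every crawler from the stats pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /displaystats
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    for path in ("/displaystats",
                 "/displaystats?handle=123456789/1",
                 "/handle/123456789/1"):
        print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

This only checks how a prefix-matching parser reads the rules; it says nothing about how quickly Google drops pages that are already indexed, which is where the 404 from the DS-689 patch comes in.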
_______________________________________________ DSpace-tech mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/dspace-tech

