Thanks a lot, Kim.
I understand that I should add a line to robots.txt to stop crawlers from accessing /displaystats. There is no need for crawlers to access such pages, I think.

Panyarak Ngamsritragul
Prince of Songkla University.

On Wed, 6 Oct 2010, Kim Shepherd wrote:

I should point out that my robots.txt suggestions assume you don't want any
stats pages crawled at all... if that's not true, it's probably best to
apply the patch for DS-689 and wait for Google to de-index (and make the
robots.txt entries more specific if there are only a few invalid handles
being requested).

Cheers,

Kim

On 6 October 2010 00:30, Kim Shepherd <[email protected]> wrote:
Hi Panyarak,

It might be an idea to add /displaystats to your JSPUI's robots.txt
and to any Google Webmaster Tools robots.txt files or Page Removal
Requests.
For Google to de-index pages, it generally likes to see a 404 (not
found) or a 410 (gone).

Unfortunately, the servlet that handles statistics display for JSPUI
throws a NullPointerException when a handle is passed to it that
doesn't turn into a valid DSpace object. It *should* throw a friendly
404 to help crawlers like Google realise the page is gone.

I've opened a JIRA issue for the NPE bug
- http://jira.dspace.org/jira/browse/DS-689 - and attached a patch for
1.6.2 (and trunk, and probably other 1.6.x versions) that will make
sure that when anyone (including Google) visits those pages, it sees a
404 instead of "Internal Server Error".
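
The behaviour described above can be illustrated with a minimal Python sketch. This is not the DS-689 patch and not DSpace code: KNOWN_HANDLES and stats_page are hypothetical names, and the dictionary lookup only stands in for resolving a handle to a DSpace object. It shows the idea of answering an unresolvable handle with a 404 rather than letting a null reference surface as a 500.

```python
# Hypothetical stand-in for handle resolution; NOT DSpace code.
KNOWN_HANDLES = {"123456789/1": "Sample item"}


def stats_page(handle):
    """Return a (status, body) pair for a statistics request."""
    obj = KNOWN_HANDLES.get(handle)
    if obj is None:
        # Unknown handle: tell the client the page does not exist,
        # instead of failing later with an internal server error.
        return 404, "Not Found"
    return 200, f"Statistics for {obj}"


if __name__ == "__main__":
    print(stats_page("123456789/1"))
    print(stats_page("123456789/999"))
```

A crawler that receives the 404 has a clear signal that the page is gone, which an error page does not give it.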

Hopefully this, along with /displaystats (and/or /displaystats*?) in
your robots.txt files or removal requests, will help convince Google to
stop crawling.
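
As a rough check of the robots.txt rule, here is a small Python sketch using the standard library's urllib.robotparser. The host name, the handle values, and the helper is_allowed are made up for illustration, and the paths assume JSPUI is served from the site root. Python's parser matches Disallow values as plain path prefixes and does not interpret wildcards, so this only shows what the bare /displaystats entry covers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /displaystats
"""


def is_allowed(path, agent="Googlebot"):
    """Return True if the rules above let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # repository.example.org is a placeholder host.
    return parser.can_fetch(agent, "http://repository.example.org" + path)


if __name__ == "__main__":
    for path in ("/displaystats?handle=123456789/1", "/handle/123456789/1"):
        print(path, is_allowed(path))
```

Because the match is by prefix, the single /displaystats line already blocks /displaystats?handle=... in this parser, while ordinary item pages stay crawlable.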

Cheers,

Kim
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech