robots.txt
----------
Key: DS-1138
URL: https://jira.duraspace.org/browse/DS-1138
Project: DSpace
Issue Type: Bug
Reporter: Ivan Masár
By default, robots.txt in XMLUI allows indexing of all content. As a result,
all browse, search and discovery pages get indexed, and search engines mostly
return results pointing to these result lists instead of the items themselves.
I suggest disallowing the following pages by default:
User-agent: *
Disallow: /discover
Disallow: /search-filter
Note that the current robots.txt contains this comment:
# Uncomment the following line ONLY if sitemaps.org or HTML sitemaps are used
# and you have verified that your site is being indexed correctly.
# Disallow: /browse
Since all items should be accessible via the browse pages in the
community/collection structure, /browse pages should stay allowed by default
so that spiders can explore the whole repository. However, /discover and
/search-filter are surely redundant and only clutter the search results.
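
As a rough way to check the proposed rules, the sketch below feeds them to Python's standard urllib.robotparser and asks which paths a generic crawler may fetch. The host name and the example paths (query strings, the /handle item URL) are assumptions made up for illustration, not taken from a real DSpace instance, and real crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The rules proposed in this issue, verbatim.
PROPOSED_RULES = """\
User-agent: *
Disallow: /discover
Disallow: /search-filter
"""

# Hypothetical repository host, used only to build full URLs.
BASE = "https://repository.example.org"

# Hypothetical example paths; the query strings and handle are illustrative.
SAMPLE_PATHS = [
    "/discover?query=test",
    "/search-filter?field=author",
    "/browse?type=title",
    "/handle/123456789/1",
]


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


parser = build_parser(PROPOSED_RULES)

for path in SAMPLE_PATHS:
    allowed = parser.can_fetch("*", BASE + path)
    print(f"{path}: {'allowed' if allowed else 'disallowed'}")
```

With these rules the parser should report the /discover and /search-filter URLs as disallowed while leaving /browse and item pages crawlable, which matches the intent described above. Note that Disallow matches by path prefix, so any other path that begins with /discover would be blocked as well.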