I've been playing with generating sitemaps with "/bin/dspace generate-sitemaps", which works fine, but the output includes all of our restricted content. If I'm reading the headers correctly, the restricted URLs return a 404 (that seems odd to me…). Googlebot help says "Generally, 404s don't harm your site's performance in search," so I suppose it's okay, but I'm wondering if there is a way to output a sitemap that only includes non-restricted content. Or am I overthinking this? I suppose the bot would run into those pages anyway from collection page links.

I had also been trying to restrict bot visits to certain collections via robots.txt, but handles aren't hierarchical: an item's handle (say, 123456789/107) is a sibling of its collection's handle (say, 123456789/42), not nested under it, so a Disallow rule on the collection's handle path only blocks the collection page itself. That didn't really work out.
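The best workaround I've come up with so far is post-processing the generated file. Here's a rough sketch (Python; the sitemap file name/location is an assumption, and it leans on the 404 behavior described above, so don't take it as the supported way): it sends a HEAD request for each <loc> entry and drops any URL the server answers with a 404.

    import urllib.request
    import urllib.error
    import xml.etree.ElementTree as ET

    # Standard sitemap namespace; register it so the output has no ns0: prefixes.
    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    ET.register_namespace("", NS["sm"])

    def url_is_public(url):
        """Return False only if the server answers 404 (restricted, per the behavior above)."""
        req = urllib.request.Request(url, method="HEAD")
        try:
            urllib.request.urlopen(req, timeout=10)
            return True
        except urllib.error.HTTPError as e:
            return e.code != 404

    # Assumed path/name of one file written by generate-sitemaps (uncompressed).
    tree = ET.parse("sitemap0.xml")
    root = tree.getroot()
    for url_el in list(root.findall("sm:url", NS)):
        loc = url_el.find("sm:loc", NS).text
        if not url_is_public(loc):
            root.remove(url_el)

    tree.write("sitemap0-filtered.xml", xml_declaration=True, encoding="UTF-8")

That's one extra request per item, though, and it has to be rerun every time the sitemap is regenerated, so maybe there's a cleaner way.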
Any thoughts on setting up appropriate indexing via sitemaps and/or robots.txt would be appreciated.
