Hi Richard,

On 2/25/2014 3:56 PM, Jizba, Richard wrote:
> Here’s a very ignorant question:
>
> When we upgraded to 3.2, OAI-PMH was turned on. What does this mean for
> our site? Does this open us up even more than we already are to web
> search engines or discovery tools? How do applications discover and
> harvest our collections?

It is rather easy to turn off OAI-PMH if you don't want it. Just ensure the "oai" web application is not available in your Tomcat.

Generally speaking, applications only really "discover" your OAI-PMH interface if you've registered it at:
http://www.openarchives.org/data/registerasprovider.html

If you don't register your OAI-PMH interface, it is technically still there (and a search engine bot could potentially stumble across it). But it's unlikely other applications would really use it unless they went searching or you told them where it was.

> I ask because we have a large collection that needs to be kept “quietly”
> public. The read level is anonymous on the items and bitstreams in the
> collection, but we include the noindex|nofollow|noarchive directives to
> the various robots. This actually provides just the level of
> invisibility we want. Those who know we have the collection are free to
> search it, but it adds a little privacy to what are otherwise public yet
> still sensitive legal documents.
>
> I understand there is no way to indicate that a particular collection
> should not be harvested even though it is anonymous.

You could just add a directive to tell search engines to avoid indexing "/oai" paths. That would at least ensure that your OAI-PMH interface never appears in Google searches or similar.

Hopefully that helps a little. If you have more questions, feel free to let us know.

- Tim
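
As a rough illustration of the "/oai" directive suggested above, here is a minimal Python sketch that checks how a well-behaved crawler would interpret such a rule. Everything specific in it is an assumption for illustration only: the robots.txt content, the host name repository.example.org, and the helper name is_crawlable are hypothetical, and it presumes the "oai" webapp is served under /oai on the same host as the robots.txt file. It only models crawlers that honor robots.txt; it does nothing to stop a harvester that queries the OAI-PMH interface directly.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not taken from DSpace):
# ask every crawler to stay away from anything under /oai.
ROBOTS_TXT = """\
User-agent: *
Disallow: /oai
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if a robots.txt-respecting crawler may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is involved.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://repository.example.org"  # hypothetical host
    # An OAI-PMH request URL under /oai should be refused.
    print(is_crawlable(ROBOTS_TXT, base + "/oai/request?verb=Identify"))
    # An ordinary item page should remain fetchable under this rule.
    print(is_crawlable(ROBOTS_TXT, base + "/handle/123456789/1"))
```

Note that "Disallow: /oai" is a prefix match, so it would also cover any other path that begins with /oai.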
_______________________________________________
DSpace-tech mailing list
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

