Hi Richard,

On 2/25/2014 3:56 PM, Jizba, Richard wrote:
> Here’s a very ignorant question:
>
> When we upgraded to 3.2, OAI-PMH was turned on. What does this mean for
> our site? Does this open us up even more than we already are to web
> search engines or discovery tools? How do applications discover and
> harvest our collections?

It is rather easy to turn off OAI-PMH if you don't want it. Just ensure 
the "oai" web application is not deployed in your Tomcat.

Generally speaking, applications only really "discover" your OAI-PMH 
interface if you've registered it at:
http://www.openarchives.org/data/registerasprovider.html

If you don't register your OAI-PMH interface, it is technically still 
there (and a search engine bot could potentially stumble across it). 
But it's unlikely other applications would use it unless they went 
searching for it or you told them where it was.
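
To give a rough idea of what "using it" looks like: a harvester just 
sends plain HTTP requests to the OAI-PMH base URL with a "verb" 
parameter. The sketch below only builds such URLs (no network access). 
The host and the "/oai/request" path are assumptions for illustration, 
so substitute your own base URL; "Identify", "ListRecords" and the 
"oai_dc" metadata prefix come from the OAI-PMH specification.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with the address of your own "oai" webapp.
BASE_URL = "http://example.org/oai/request"


def oai_url(verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


# Ask the repository to describe itself.
identify = oai_url("Identify")

# Ask for all records in unqualified Dublin Core.
list_records = oai_url("ListRecords", metadataPrefix="oai_dc")
```

Anyone who knows (or guesses) the base URL can issue these requests, 
which is why an unregistered interface is hard to find but not hidden.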

> I ask because we have a large collection that needs to be kept “quietly”
> public. The read level is anonymous on the items and bitstreams in the
> collection, but we include the noindex|nofollow|noarchive directives to
> the various robots. This actually provides just the level of
> invisibility we want. Those who know we have the collection are free to
> search it, but it adds a little privacy to what are otherwise public yet
> still sensitive legal documents.
>
> I understand there is no way to indicate that a particular collection
> should not be harvested even though it is anonymous.

You could just add a directive (for example, a Disallow rule in your 
robots.txt) to tell search engines to avoid indexing "/oai" paths. That 
would at least keep your OAI-PMH interface out of Google searches or 
similar, as long as the crawler honors the directive.
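
As a minimal sketch of one such directive, here is a robots.txt rule 
checked with Python's standard robots.txt parser. The rule text and the 
sample paths are assumptions: they presume the OAI-PMH webapp is served 
under "/oai" on your site, and the handle path is made up.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: ask all crawlers to stay out of /oai.
ROBOTS_TXT = """\
User-agent: *
Disallow: /oai
"""


def is_blocked(path, agent="*"):
    """Return True if the rules above tell `agent` not to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(agent, path)


oai_blocked = is_blocked("/oai/request?verb=Identify")
item_blocked = is_blocked("/handle/123456789/1")  # made-up item path
```

Keep in mind that this only affects crawlers that respect robots.txt. An 
OAI-PMH harvester that already knows the URL is free to ignore it.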

Hopefully that helps a little. If you have more questions, feel free to 
let us know.

- Tim

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
