On Mon, Nov 26, 2012 at 4:29 AM, Mark Ludwig <[email protected]> wrote:
> If you have a collection you do not want indexed with your other dspace
> collections, is there a way to index the collection separately or not at
> all?

Hi Mark,

please always state which DSpace version, interface and theme you're using, and whether you're using Discovery.

Unfortunately, you cannot easily exclude a collection from indexing, at least not without modifying code. AFAIK, the search and browse indexes are not separate, so it's not even possible to have something browsable but not searchable. What you could do is withdraw items from public display, which would make them accessible only to administrators.

The only practical solution I can offer you is to move that content to a separate DSpace instance. It can even run on the same server and servlet container (and even the same DSpace webapps if you wish), but it will have its own URL, database and assetstore (and possibly theme). We currently recommend the same solution (separate instances) to separate public vs. dark archives.

> Also, is there any way to block internet crawlers from indexing this one
> collection in dspace?

Sure, the standard solution is the best - use robots.txt. That will work if you move to the separate repository, because within one repository, you can't tell from the URL (which is in handle format) which community/collection an item belongs to, so there is no URL prefix a robots.txt rule could match for just that collection.

> It so happens that this particular collection would be very large,
> about 750,000 pages as individual documents. Is there a practical
> point at which a separate dspace instance is appropriate?

That sounds like a moderate size. There are really no intentional limits within DSpace; you're restricted only by the amount of RAM and CPU cycles. If you hit a problem, you may want to consider using a reverse caching proxy (or an army of them for really large installations). Do you feel like you're hitting any limits already?

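To illustrate the robots.txt suggestion above, here is a minimal sketch, assuming the collection has been moved to its own instance so that one blanket rule covers it. The host name private.example.org, the handle in the URL and the helper is_crawlable are made up for illustration; only Python's standard urllib.robotparser module is used to check what a well-behaved crawler would conclude from the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served at the root of the separate instance.
# It asks every crawler to stay away from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical item URL in handle format on the separate instance.
    item_url = "http://private.example.org/handle/123456789/42"
    print(is_crawlable(ROBOTS_TXT, item_url))
```

Keep in mind that robots.txt is only a request: polite crawlers honour it, but it is not access control. Content that must stay private still needs to be withdrawn or restricted in DSpace itself.
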
Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

