Over the last couple of weeks we've had a problem with web crawlers getting
lost in facets, crawling literally millions of URLs in the faceted Solr
index. This matters mainly because some of those URLs trigger quite expensive
Solr searches (CPU and memory consumption of the Solr component
rises).
We've deployed the following fix:
# Added to redirect long Solr queries back to the homepage
RewriteEngine On
RewriteCond "%{QUERY_STRING}" "filter_3"
RewriteRule . https://ir.wgtn.ac.nz/ [R]
The "filter_3" pattern means that users and crawlers can go two facets deep
before being redirected back to the homepage.
We're redirecting to our own homepage; others will probably want to
redirect to their own homepages (and/or bot tarpits).
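
For anyone adapting this, here is a minimal sketch of a variant, assuming
Apache 2.4 or later. The hostname example.org is a placeholder, and the extra
flags are suggestions rather than part of the rule shown above. By default
mod_rewrite copies the original query string onto the redirect target when the
target has none of its own; the QSD flag drops it, so the client lands on a
clean homepage URL that cannot match the condition again.

```apache
# Sketch only: example.org is a placeholder; substitute your own homepage
# (or tarpit) URL. QSD requires Apache 2.4.0 or later.
RewriteEngine On
# Unanchored regex: matches any query string containing "filter_3".
RewriteCond "%{QUERY_STRING}" "filter_3"
# R=302: temporary redirect (what a bare [R] also sends).
# QSD:   discard the original query string instead of appending it.
# L:     stop processing further rewrite rules for this request.
RewriteRule "." "https://example.org/" [R=302,QSD,L]
```

Changing the number in the pattern changes how deep visitors can go: for
example, "filter_4" would allow three facets under the same numbering.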
cheers
stuart