Over the last couple of weeks we've had an issue with web crawlers getting 
lost in facets, crawling literally millions of URLs in the faceted Solr 
index. This is mainly a problem because some of those faceted queries are 
quite expensive for Solr to run (CPU and memory consumption of the Solr 
component rises).

We've deployed the following fix:

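        # Added to redirect long Solr queries back to the homepage
        RewriteEngine On
        # Match any request whose query string contains "filter_3"
        RewriteCond "%{QUERY_STRING}" "filter_3"
        # Redirect it to the homepage ([R] with no status code sends a 302)
        RewriteRule . https://ir.wgtn.ac.nz/ [R]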

Matching on "filter_3" means that users and crawlers are allowed two facets 
deep before being redirected back to the homepage.

We're redirecting to our own homepage; others will probably want to 
redirect to their own homepages (and/or bot tarpits).
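
For anyone adapting this, a slightly stricter variant might look like the 
following. It is an untested sketch, not what we deployed. It assumes 
Apache 2.4 or later (needed for the QSD flag) and that the facet parameter 
appears in the URL literally as "filter_3="; example.org is a placeholder 
for your own homepage or tarpit. By default mod_rewrite copies the original 
query string onto the redirect target, so the redirected request would 
still carry filter_3; QSD discards it.

        # Sketch only: assumes Apache 2.4+ and a parameter named "filter_3"
        RewriteEngine On
        # Match filter_3 only as a parameter name, at the start or after "&"
        RewriteCond "%{QUERY_STRING}" "(?:^|&)filter_3="
        # QSD drops the query string; L stops further rewrite processing
        RewriteRule ^ https://example.org/ [R=302,L,QSD]

Anchoring the pattern avoids matching "filter_3" where it merely appears 
inside some other parameter's value.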

cheers
stuart


