https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=42170

            Bug ID: 42170
           Summary: Librarian-Controlled AI Crawler and Search Traffic
                    Management for OPAC Search Page
   Initiative type: ---
Sponsorship status: ---
           Product: Koha
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: new feature
          Priority: P5 - low
         Component: OPAC
          Assignee: [email protected]
          Reporter: [email protected]
        QA Contact: [email protected]

I would like to propose a feature for Koha to help libraries manage excessive
automated traffic, especially on the OPAC search endpoint, for example:
opac.yourdomain.com/cgi-bin/koha/opac-search.pl

In some cases, this page is being heavily accessed by AI crawlers and other
automated agents, which can consume significant server resources and affect
normal library service performance. At present, libraries can control this only
through external tools such as Cloudflare, Fail2ban, firewall rules, or reverse
proxy configurations. However, these approaches often require dedicated IT
support, which many libraries may not have.

It would be very helpful if Koha could provide built-in tools, configurable
from the librarian/staff interface, to allow libraries to manage this traffic
based on their own capacity and policy.

Requested Functional Requirements:

Enable/Disable AI crawler access to OPAC search page
Provide an option in the staff interface to allow or disallow AI crawlers from
accessing the OPAC search page.
This should be manageable by authorized library administrators without
requiring server-level intervention.

Allow selected crawlers only
Provide a configurable allowlist option so that the library may permit only
selected crawlers or bots while blocking others.
This would help libraries that want discovery/indexing benefits from specific
services without opening access to all automated agents.
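
One possible way to express the allow/disallow setting and the allowlist is to have Koha generate robots.txt rules from them. The sketch below is only an illustration: the function name `build_robots_txt` and its parameters are hypothetical, not existing Koha code, and the crawler names are whatever the library enters. Note that robots.txt is advisory, so it only affects crawlers that choose to honor it.

```python
def build_robots_txt(allowed_crawlers,
                     search_path="/cgi-bin/koha/opac-search.pl",
                     allow_all=False):
    """Return robots.txt text restricting crawler access to the search page.

    allowed_crawlers: user-agent tokens the library chose to permit
                      (hypothetical setting, entered by an administrator).
    allow_all:        when True, no restriction on the search page is emitted.
    """
    if allow_all:
        # An empty Disallow means nothing is disallowed for any crawler.
        return "User-agent: *\nDisallow:\n"

    groups = []
    for token in sorted(set(allowed_crawlers)):
        # Each permitted crawler gets its own group allowing the search page.
        groups.append(f"User-agent: {token}\nAllow: {search_path}\n")
    # Every other crawler falls through to the wildcard group.
    groups.append(f"User-agent: *\nDisallow: {search_path}\n")
    return "\n".join(groups)
```

Calling `build_robots_txt(["Googlebot"])` would produce one group that allows Googlebot and a wildcard group that disallows the search page for everyone else. Crawlers that ignore robots.txt would still need rate limiting or blocking.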

Rate limiting for OPAC search endpoint
Provide an option to enable configurable rate limiting specifically for
high-hit pages such as opac-search.pl.
This may help reduce server load caused by repeated automated queries.
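
To make the rate limiting idea concrete, here is a minimal sliding-window sketch. It assumes an in-memory counter keyed by client (for example the IP address); the class name `SlidingWindowLimiter` and the two settings `max_requests` and `window_seconds` are hypothetical and stand in for whatever the staff interface would expose. A real deployment with several server processes would need shared storage instead of a per-process dictionary.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps a client key (e.g. an IP address) to its recent hit times.
        self._hits = defaultdict(deque)

    def allow(self, client_key, now):
        """Return True if the request at time `now` (seconds) is permitted."""
        hits = self._hits[client_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject without recording the hit
        hits.append(now)
        return True
```

In practice `now` could come from `time.monotonic()`, and a rejected request could be answered with HTTP status 429 (Too Many Requests).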

Challenge or verification for suspicious/concurrent hits
Provide an option to trigger a challenge or verification step when unusually
high concurrent requests are detected from the same IP or user agent.
This may help distinguish genuine users from abusive automated access.

Traffic monitoring and analytics
Provide a simple monitoring dashboard or reporting tool within Koha to show:
- hit counts for the OPAC search page,
- top requesting IPs,
- top user agents,
- frequency trends,
- possible abusive patterns.

This would help librarians or library administrators understand how the page is
being used and identify resource-heavy access.

View hit count by IP
Provide an interface to view the number of requests coming from a specific IP
address over a defined period.
This would support easier investigation of suspicious traffic.
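
As a rough illustration of the monitoring and per-IP reporting described above, the counts could be derived from the web server access log. This sketch assumes the log uses the common "combined" format and that the search page path is /cgi-bin/koha/opac-search.pl; the function name `summarize_search_hits` is hypothetical. The `start` and `end` arguments must be timezone-aware datetimes, because the log timestamps carry a UTC offset.

```python
import re
from collections import Counter
from datetime import datetime

# Matches one line of the "combined" access log format (assumed here).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_search_hits(lines, start, end,
                          search_path="/cgi-bin/koha/opac-search.pl"):
    """Count search-page hits per IP and per user agent in [start, end)."""
    by_ip, by_agent = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Skip unparseable lines and requests for other pages.
        if not match or not match["path"].startswith(search_path):
            continue
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        if start <= when < end:
            by_ip[match["ip"]] += 1
            by_agent[match["agent"]] += 1
    return by_ip, by_agent
```

`by_ip.most_common(10)` would then give the top requesting IPs for the chosen period, and `by_ip["203.0.113.5"]` the hit count for a single address.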

Optional blocklist integration
Provide an option for authorized administrators to add suspicious IPs to a
Koha-managed deny list.

Where feasible, Koha could also support optional mapping/export of such IPs for
OS firewall or reverse proxy blocking.
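
The export idea could look something like the sketch below, which assumes the reverse proxy is nginx and writes the deny list as nginx `deny` directives for inclusion in a server block. The function name `export_nginx_deny_rules` is hypothetical; other targets (firewall rules, other proxies) would need their own output format.

```python
import ipaddress


def export_nginx_deny_rules(entries):
    """Turn IP addresses or CIDR ranges into nginx `deny` directives.

    Raises ValueError if an entry is not a valid address or network,
    so a typo never ends up in the proxy configuration.
    """
    networks = set()
    for entry in entries:
        # ip_network accepts single addresses and CIDR ranges; strict=False
        # tolerates host bits being set, e.g. "203.0.113.5/24".
        networks.add(ipaddress.ip_network(entry.strip(), strict=False))
    # Sort IPv4 before IPv6, then by address, for stable output.
    ordered = sorted(
        networks,
        key=lambda n: (n.version, n.network_address, n.prefixlen),
    )
    return "".join(f"deny {net};\n" for net in ordered)
```

Duplicates are collapsed and single addresses are written with an explicit prefix length (for example /32 for IPv4), which nginx accepts.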

Optional Cloudflare/API-based rule integration
For libraries already using Cloudflare or similar services, an optional
integration could be provided where administrators enter API credentials and
enable predefined protection rules from within Koha.

This could simplify deployment of recommended protections without requiring
advanced IT intervention.
