On 2024/02/21 21:57:29 +0800, Sadeep Madurange <sad...@asciimx.com> wrote:
> Hello,
> 
> Is there a way to block non-browser clients from accessing a website
> (e.g., scraping attempts by bots or even software like Selenium that
> might programmatically control a browser), preferably before the
> requests reach the webserver?
> 
> I'm wondering if there's a way to do that with, for example, pf, to block
> such requests completely rather than responding with a 403.

I don't think you could *reliably* do this.  You mention Selenium, and
that's a "real browser", but one could also just use nc(1) and send the
same mix of headers that Firefox would send.  So there is no practical
way to distinguish the traffic based on the request alone.

(abusers don't usually set the 'evil bit' on the packets :/)
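
For illustration, here is a minimal shell sketch of how little it takes
to look like a browser on the wire (the hostname and header values are
made-up placeholders, roughly what a Firefox release might send):

    {
        printf 'GET / HTTP/1.1\r\n'
        printf 'Host: www.example.com\r\n'
        printf 'User-Agent: Mozilla/5.0 (X11; OpenBSD amd64; rv:122.0) Gecko/20100101 Firefox/122.0\r\n'
        printf 'Accept: text/html,application/xhtml+xml\r\n'
        printf 'Accept-Language: en-US,en;q=0.5\r\n'
        printf 'Connection: close\r\n'
        printf '\r\n'
    } | nc www.example.com 80      # the server only ever sees these bytes

Nothing in that request tells the server it didn't come from a browser.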

What you could do is some kind of clownflare shit which puts your users
behind a page that requires a js challenge to continue (and that I
personally hate).  Or maybe just limit the number of connections you
accept from a given ip per time delta (max-src-conn-rate in pf.conf;
see the sketch below).
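
A minimal pf.conf sketch of that idea, assuming "em0" as the external
interface and picking 100 connections per 10 seconds out of thin air
(tune the numbers to your real traffic):

    ext_if = "em0"
    table <bad_hosts> persist

    # drop everything from addresses that already tripped the limit
    block in quick on $ext_if from <bad_hosts>

    # accept web traffic, but if one source opens more than 100 new
    # connections within 10 seconds, add it to the table and kill
    # its existing states
    pass in on $ext_if proto tcp to port { 80 443 } \
        keep state (max-src-conn-rate 100/10, \
            overload <bad_hosts> flush global)

The table only grows, so you'd probably also want to expire old entries
now and then, e.g. pfctl -t bad_hosts -T expire 86400 from cron.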

Or maybe something else, since you're asking for a solution but not
telling us your problem :)   (which I assume is stopping the flood of
bad requests from bad bots.)
