Controlling AI scraping

Cloudflare's plan to give its users ways to block and/or monetize AI
scraping is interesting, but of course there are ethical and other
reasons to avoid using Cloudflare, given that it continues to support
some of the most disreputable sites on the Net.

This does, however, suggest the concept of an open source mechanism
that could provide the same sorts of features broadly (e.g., in
conjunction with Apache servers) to any site, anywhere. It could be
paired with a system that keeps sites updated with the discovered
source IP addresses of AI scrapers that are not adhering to robots.txt
directives. Sidenote: Google announced an effort to expand robots.txt
to better handle AI scraping issues, a concept I had suggested
earlier. I signed up for this, but have never heard another word about
it since the earliest days.
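One way the Apache side of such a mechanism could work is a small tool
that turns a shared feed of non-compliant scraper IPs into an Apache
2.4 access-control snippet. This is only a sketch under assumptions:
the addresses below are invented placeholders, and a real system would
need an actual distribution feed, authentication, and scheduled
updates.

```python
# Sketch: render a list of non-compliant AI scraper IPs (hypothetically
# distributed by a shared feed) as an Apache 2.4 access-control block.

SCRAPER_IPS = [          # stand-in for entries fetched from a shared feed
    "203.0.113.7",       # example address (RFC 5737 documentation range)
    "198.51.100.0/24",   # whole ranges can be listed too
]

def apache_block_snippet(ips):
    """Render an Apache 2.4 <RequireAll> block denying the given IPs."""
    lines = ["<RequireAll>", "    Require all granted"]
    for ip in ips:
        # "Require not ip" excludes an address or CIDR range
        lines.append(f"    Require not ip {ip}")
    lines.append("</RequireAll>")
    return "\n".join(lines)

if __name__ == "__main__":
    print(apache_block_snippet(SCRAPER_IPS))
```

The generated block could be written to a file pulled in via Apache's
Include directive, so a periodic feed update would only regenerate
that one file.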

Time to get serious about controlling AI scraping. -L

- - -
--Lauren--
Lauren Weinstein [email protected] (https://www.vortex.com/lauren)
Lauren's Blog: https://lauren.vortex.com
Mastodon: https://mastodon.laurenweinstein.org/@lauren
Founder: Network Neutrality Squad: https://www.nnsquad.org
        PRIVACY Forum: https://www.vortex.com/privacy-info
Co-Founder: People For Internet Responsibility
_______________________________________________
google-issues mailing list
https://lists.vortex.com/mailman/listinfo/google-issues
