Has this happened to you?

I'm suffering from this quite a bit on my servers and on my hosting clients' websites. I have spent entire days setting up barriers against the AI bots that overload the servers. They do not respect robots.txt policies (some browser bots don't either).

Ironically, I used some AIs to analyze and diagnose the server logs and to design the blocking policies and scripts.
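
As an illustration only (not the actual scripts mentioned above), here is a minimal Python sketch of the kind of log analysis involved. It assumes access logs in the common "combined" format and counts hits whose user agent contains a known AI crawler token. The token list, the function name count_ai_bot_hits and the sample lines are assumptions made for the example, and the list is far from complete. It also only catches bots that identify themselves honestly, which many do not.

```python
import re
from collections import Counter

# Illustrative, incomplete list of crawler user-agent substrings (an assumption).
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider")

# Combined log format: ip ident user [time] "request" status size "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_ai_bot_hits(lines):
    """Return (hits per bot token, hits per client IP) for matching lines."""
    per_bot = Counter()
    per_ip = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in agent:
                per_bot[token] += 1
                per_ip[match.group("ip")] += 1
                break  # count each request once
    return per_bot, per_ip


# Made-up sample lines using documentation-only IP ranges.
SAMPLE = [
    '203.0.113.5 - - [01/Jul/2025:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [01/Jul/2025:10:00:01 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [01/Jul/2025:10:00:02 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.9 - - [01/Jul/2025:10:00:03 +0000] "GET /c HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"',
    'not a log line',
]

if __name__ == "__main__":
    bots, ips = count_ai_bot_hits(SAMPLE)
    print(bots.most_common())
    print(ips.most_common())
```

The per-IP counts could then feed a firewall or web server deny list, bearing in mind the false-positive risk for legitimate users.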


The FSF newsletter has also been discussing the problem for months:
https://www.fsf.org/blogs/sysadmin/our-small-team-vs-millions-of-bots
Our infrastructure has been under attack since August 2024. Large Language Model (LLM) web crawlers have been a significant source of the attacks, and as for the rest, we don't expect to ever know what kind of entity is targeting our sites or why.

In the fall Bulletin, we wrote about the August attack on gnu.org. That attack continues, but we have mitigated it. Judging from the pattern and scope, the goal was likely to take the site down and it was not an LLM crawler. We do not know who or what is behind the attack, but since then, we have had more attacks with even higher severity.

To begin with, GNU Savannah, the FSF's collaborative software development system, was hit by a massive botnet controlling about five million IPs starting in January. As of this writing, the attack is still ongoing, but the botnet's current iteration is mitigated. The goal is likely to build an LLM training dataset. We do not know who or what is behind this.

Furthermore, gnu.org and ftp.gnu.org were targets in a new DDoS attack starting on May 27, 2025. Its goal seems to be to take the site down. It is currently mitigated. It has had several iterations, and each has caused some hours of downtime while we figured out how to defend ourselves against it. Here again, the goal was likely to take our sites down and we do not know who or what is behind this.

In addition, directory.fsf.org, the server behind the Free Software Directory, has been under attack since June 18. This likely is an LLM scraper designed to specifically target Media Wiki sites with a botnet. This attack is very active and now partially mitigated.

As we developed programs to identify IP addresses belonging to the botnet, they sometimes misidentified legitimate user's IP addresses. We've removed them from the list of DDoS IP addresses and improved our defenses to be more precise. If you do not have access to gnu.org right now, please send us an email at [email protected] with your IP address and we will look into it. If you are having trouble with a VPN (virtual private network), try switching exit nodes and skip writing us -- we know our attackers use VPNs, which leads us to block the ones they are using.


________________________________________________


Solar-General is a list open to the whole community, with no moderation, 
so we appeal to tolerance and mutual respect.
The opinions expressed are the sole responsibility of their respective 
authors. The Asociación Solar is not responsible for the messages 
posted, nor do they necessarily represent the point of view of the Asociación 
Solar.

[email protected]
https://lists.ourproject.org/cgi-bin/mailman/listinfo/solar-general
