Anyone got a contact at OpenAI? They have a spider problem.
As I think I have mentioned before, I have the world's lamest content farm at https://www.web.sp.am/. Click on a link or two and you'll get the idea.

Unfortunately, GPTBot has found it and has not gotten the idea. It has fetched over 3 million pages today. Before someone tells me to fix my robots.txt, this is a content farm so rather than being one web site with 6,859,000,000 pages, it is 6,859,000,000 web sites each with one page. Of those 3 million page fetches, 1.8 million were for robots.txt.

It's not like it's hard to figure out what's going on since the pages all look nearly the same, and they're all on the same IP address with the same wildcard SSL certificate.

Amazon's spider got stuck there a month or two ago but fortunately I was able to find someone to pass the word and it stopped. Got any contacts at OpenAI?

R's,
John

PS: If you were wondering what they're using to train GPT-5, well, now you know.
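For illustration only, here is a minimal sketch of the kind of crawler-side guard that would avoid this trap: cap the number of distinct hostnames fetched per resolved IP address, so billions of one-page sites on a single address exhaust a small budget. This is not how GPTBot or any real crawler works. The class name `PerAddressBudget`, the `resolve` callback, and the default limit are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Set


class PerAddressBudget:
    """Hypothetical guard: limit distinct hostnames crawled per IP address."""

    def __init__(self, resolve: Callable[[str], str], max_hosts_per_ip: int = 1000):
        # resolve maps a hostname to an IP string; injected so the sketch
        # can run without touching DNS (assumption, not a real crawler API).
        self.resolve = resolve
        self.max_hosts_per_ip = max_hosts_per_ip
        self.hosts_by_ip: Dict[str, Set[str]] = defaultdict(set)

    def allow(self, hostname: str) -> bool:
        """Return True if the crawler may fetch from this hostname."""
        ip = self.resolve(hostname)
        seen = self.hosts_by_ip[ip]
        if hostname in seen:
            # Already admitted; further pages on it are fine.
            return True
        if len(seen) >= self.max_hosts_per_ip:
            # Too many different "sites" behind one address: likely a
            # wildcard-DNS farm, so stop following new hostnames there.
            return False
        seen.add(hostname)
        return True
```

With a check like this in front of the fetch queue, each new hostname on an over-budget address is skipped before its robots.txt is ever requested, which is where most of the 3 million fetches went.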
Re: 2600:: No longer pings
Wonderful news, this has now been fixed :)

Thank you to Cogent for fixing this.

On Sat, 6 Apr 2024 at 11:00, Ben Cartwright-Cox wrote:
>
> It appears that 2600:: no longer responds to ICMP.
>
> $ mtr -rwc 1 2600::
> Start: 2024-04-06T10:53:41+0100
> HOST: metropolis                                        Loss%
>   1.|-- lcy02.flat.b621.net                            0.0%
>   [...]
>   6.|-- ldn-b4-link.ip.twelve99.net                    0.0%
>   7.|-- ldn-bb1-v6.ip.twelve99.net                     0.0%
>   8.|-- nyk-bb2-v6.ip.twelve99.net                     0.0%
>   9.|-- ???                                           100.0
>  10.|-- sprint-ic301620-nyk-b5.ip.twelve99-cust.net    0.0%
>  11.|-- ???                                           100.0
>
> This seems to have happened around Friday 5th 13:40 UTC.
>
> 2600::, an IP address owned by the Sprint network (since acquired
> by Cogent Communications), is a common (at least in my circles) IPv6
> testing address, in a similar way that 8.8.8.8 or 1.1.1.1 is for a
> quick address to remember that always pings. When such an address is so
> easy to remember, you sometimes cannot help it becoming a "core
> project" :) ( https://xkcd.com/1361/ )
>
> 2600:: also used to be the address of sprint.net; now sprint.net has no v6.
>
> This is sad, and I would propose that Cogent/Sprint (I assume
> 2600:: is under the ownership of Cogent now) revive this address as
> it's a very helpful testing address that is burned into the minds of
> many. Or at the very least, I'm more than willing to tank the effort
> of responding to ICMP!
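For anyone who wants to watch for a regression, here is a rough sketch that reads a trimmed mtr report like the one quoted above and flags whether the final hop shows 100% loss. It assumes the report has been cut down to hop number, host, and the Loss% column, as in the quoted message; full `mtr -rw` output has more columns and would need a different pattern. The function names are made up for this example.

```python
import re
from typing import List, Tuple

# Matches trimmed report lines such as " 9.|-- ??? 100.0" or
# " 1.|-- lcy02.flat.b621.net 0.0%" (hop, host, loss only).
HOP_RE = re.compile(r"^\s*(\d+)\.\|--\s+(\S+)\s+(\d+(?:\.\d+)?)%?\s*$")


def parse_hops(report: str) -> List[Tuple[int, str, float]]:
    """Return (hop number, host, loss percent) for each hop line found."""
    hops = []
    for line in report.splitlines():
        # Strip a leading "> " so quoted mail text can be pasted in directly.
        match = HOP_RE.match(line.lstrip("> "))
        if match:
            hops.append((int(match.group(1)), match.group(2), float(match.group(3))))
    return hops


def destination_lost(report: str) -> bool:
    """True if the last hop in the report shows 100% loss."""
    hops = parse_hops(report)
    return bool(hops) and hops[-1][2] >= 100.0
```

Lines that are not hop entries, such as the `Start:` header or the `[...]` elision, are simply skipped.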