On Tue, Jan 20, 2026 at 09:47:34PM +1100, Svetlana Tkachenko wrote:
> > but I suppose you need to compromise with LLM
> > robots going wild.
>
> Are they not required to follow do_not_track http headers or robots.txt ? If
> LLM robots do not obey these instructions, they should be probably reported
> to their hosting provider.

They don't always care. Their hosting provider isn't always in a position
to care. There's enough betting money in this pool to motivate actors to
break and/or creatively bend the rules.

Recommended reading:

  https://medium.com/@kolla.gopi/the-cloudflare-perplexity-standoff-why-robots-txt-is-broken-for-the-ai-era-1b9d309bdc2b

Cheers
-- 
t

