Hello Jan!

Jan Krüger wrote in
 <[email protected]>:
 |On 2026-01-10 18:36, Steffen Nurpmeso wrote:
 |
 |> Admins of the wonderful repo.or.cz, would it be possible to create
 |> some local configuration so that downloads of packagers become
 |> possible again?
 |> And if "somehow mystified" on the frontpage, to be updated as
 |> necessary?
 |> How about a useragent "EinMaennleinStehtImWalde"?
 |
 |I understand your issue, I really do. We didn't set up Anubis because we
 |hate users or because we want to make downloads impossible or anything.
 |We did it because we got swamped by crawlers with at least 5 figures
 |worth of IP addresses distributed across many networks in many countries,
 |and so many requests that all of our HTTP workers were fully saturated
 |and it wasn't even possible to view our (aggressively cached) landing
 |page anymore. We tried multiple times to relax our constraints but each
 |time they adapted within days. In fact right now the strictness of the
 |checks is load-dependent, i.e. the restrictions ramp up whenever we're
 |getting targeted. So, whenever you're having issues, you know we're
 |actively being attacked by crawlers in that very moment.
 |
 |At the time of writing I can use the protected endpoints without any
 |challenge via e.g. curl, indicating that right now the crawlers are
 |leaving us alone for the most part, but it keeps coming and going.
 |
 |For testing purposes I've added a rule that bypasses the restrictions
 |whenever the User-Agent header contains the string
 |"I-am-definitely-not-a-crawler" but if the crawlers adapt again, we'll
 |have to revert that.

Thank you for that, will try.

 |Sorry for all the trouble. Let's put the blame where it belongs, though:
 |the stupid LLM data gold rush and the people who are willing to break all
 |the rules to get the tiniest edge.

Btw, for the little piece of internet I have, I use HTTP cookies
that are randomized on each server restart, which most (practically
all) of the dumb crawlers that come here cannot pass.
(Worth mentioning: softwareheritage.org uses some Python stuff that
knows cookies, but also does other things involving curl which does
not yet, so all in all they still cannot.)

I.e., if that were changed to XY-day_of_year or something and
announced on the front page, it would be interesting to know how
well that protects in practice.

  -- openssl rand -base64 3
  
  function secret()
          local s = os.getenv("WEB_SECRET")
          if not (s == nil) then
                  s = string.gsub(s, "=", "")
          end
          if s == nil or #s < 2 then
                  s = os.date("!*t")
                  s = s.month .. s.day
          end
  
          return {user=string.sub(s, 0, #s // 2), pass=string.sub(s, #s // 2 + 1)}
          --lua 5.1 return {user=s, pass=s}
  end
  
  function verify(user, pass)
          if user == s.user and pass == s.pass then
                  return true
          end
          r.req_item.keep_alive = -1
          return false
  end
  
  function cookie()
          r.resp_header["Set-Cookie"] = "ID=" .. s.user .. s.pass
          r.resp_header["Content-Type"] = "text/html"
          r.resp_header["Cache-Control"] = "max-age=0"
          r.resp_body:set({[[
  <html><title>Bot-bypass pit pot</title><body><h1>Bot-bypass pit pot</h1><p>To
  circumvent the seen bad company, please follow <a href="https://git.sdaoden.eu]],
                  r.req_attr["uri.path"], '">this link to the target of yours</a></p></body></html>'})
          return 403
  end
  
  s = secret()
  r = lighty.r
  local c = r.req_header["Cookie"]
  if not c then return cookie() end
  
  if not (c == "ID=" .. s.user .. s.pass) then return cookie() end
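
A minimal sketch of the XY-day_of_year idea mentioned above; it is not
part of the script in use. The function name daily_token and the prefix
"XY" are assumptions for illustration only. It derives the value from the
UTC day of the year via os.date, so it changes once per day without any
server-side state.

```lua
-- Hypothetical helper (not in the script above): builds a value of the
-- form "<prefix>-<day of year>", e.g. "XY-001" on January 1st.
function daily_token(prefix, now)
        -- "!%j" formats the day of the year (001-366) in UTC;
        -- "now" is a Unix timestamp and defaults to the current time
        return prefix .. "-" .. os.date("!%j", now or os.time())
end

print(daily_token("XY", 0))   --> XY-001 (1970-01-01 UTC)
print(daily_token("XY"))      -- today's value
```

In the script above, such a value could stand in for s.user .. s.pass,
both in the Set-Cookie header and in the final comparison.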

 |-Jan (repo.or.cz admin team)
 --End of <[email protected]>

Thanks again for repo.or.cz!

Ciao o/,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

_______________________________________________
Tinycc-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
