--- Begin Message ---
On 06/12/2024 06:59, Petr Štetiar wrote:
Hannu Nyman <hannu.ny...@iki.fi> [2024-12-04 17:23:39]:

Hi,

tl;dr CDN is not going to cut it, we need some other solution

a) setting feeds.conf.default to point to the actual root feeds at GitHub
(e.g. https://github.com/openwrt/packages) instead of the git.openwrt.org
mirror?

yes, using more powerful Git mirrors is one of the options.

It would need to be git.builds.openwrt.org or such, so we still have it under
control.
It would need to be somewhere other than GitHub (sanctions, no IPv6 etc.).
The following options have been circulating for some time:

  - codeberg.org
  - sourcehut.org

Or alternatively, set feeds.conf.default to point to the new
git.cdn.openwrt.org?

CDNs are not able to handle/cache the Git smart HTTP protocol yet, so a CDN
wouldn't help with Git fetch operations; those would basically still run in
passthrough mode. The CDN is going to lower the load from gitweb-based scrapers
which are not using a proper bot user agent in their HTTP headers (those are
already rate limited).

I looked into this more closely and there were actually multiple issues going
on simultaneously:

  1. sudden spikes of requests from various gitweb-based scrapers, usually
     requesting source code tarballs (a heavy CPU and I/O operation) of
     random projects

     * bots using proper user agent identification are already forbidden
       from making these requests

     * bots not using proper user agent identification are a PITA because
       you can't distinguish them from humans

  2. strange vulnerability scanners, generating a lot of concurrent requests

  3. relatively high numbers of concurrent builds starting at the same time

     * probably some build farms and/or CI jobs (Hi Qualcomm! :))

This led to CPU and I/O saturation on the box, a long backlog of requests,
resource exhaustion and 500s, hugely impacting our buildbot builds.


Would an internal mirror only accessible for the buildbots help? Even if the mirror falls behind due to main git server unavailability, the builds can continue.

Or is there a way for the git server to prioritize our "own" traffic over the rest of the world?

As a quick fix, I've done the following in the past few days:

  - disabled tarballs for everyone with 403
  - enabled IP based rate limits on everyone

    * heavy projects like luci.git, packages.git and openwrt.git

      - after 5r/m additional requests are delayed up to 15r/m, then 429 sorry

    * other requests 15r/m, delayed after the 8th request, up to 30r/m, then
      429 sorry
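
For illustration only, the tiered limits above can be modelled roughly as
follows. This is a sketch of one possible reading of the "heavy projects" rule
(up to 5 requests per minute served immediately, further requests delayed up to
15 per minute, then 429), not the server's actual configuration, which is not
shown in this message. The class name, the 60-second sliding window and the
"allow"/"delay"/"reject" labels are assumptions made for the example.

```python
from collections import defaultdict, deque


class TieredRateLimiter:
    """Per-IP sliding-window limiter with allow / delay / reject tiers.

    Hypothetical model: soft_limit and hard_limit are requests per window.
    """

    def __init__(self, soft_limit, hard_limit, window=60.0):
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.window = window
        # Timestamps of accepted (allowed or delayed) requests per client IP.
        self.hits = defaultdict(deque)

    def check(self, ip, now):
        """Classify one request from `ip` arriving at time `now` (seconds)."""
        q = self.hits[ip]
        # Forget requests that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.hard_limit:
            return "reject"  # the server would answer 429 here
        q.append(now)
        if len(q) > self.soft_limit:
            return "delay"   # served, but only after being held back
        return "allow"


if __name__ == "__main__":
    limiter = TieredRateLimiter(soft_limit=5, hard_limit=15)
    for second in range(20):
        print(second, limiter.check("192.0.2.1", float(second)))
```

Rejected requests are not recorded in this model, so a client is served again
once its earlier requests age out of the window.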

Seems to work, VPS can manage the load, no git fetch issues on buildbots, thus
we can focus on the long term solution:

  A. outsource Git operations

     - this is the git.builds.openwrt.org option explained above, hence the
       following (shortened) diff

         --- a/feeds.conf.default
         +++ b/feeds.conf.default
         -src-git packages https://git.openwrt.org/feed/packages.git
         -src-git luci https://git.openwrt.org/project/luci.git
         -src-git routing https://git.openwrt.org/feed/routing.git
         -src-git telephony https://git.openwrt.org/feed/telephony.git
         +src-git packages https://git.builds.openwrt.org/feed/packages.git
         +src-git luci https://git.builds.openwrt.org/project/luci.git
         +src-git routing https://git.builds.openwrt.org/feed/routing.git
         +src-git telephony https://git.builds.openwrt.org/feed/telephony.git

         --- a/include/download.mk
         +++ b/include/download.mk
         -PROJECT_GIT = https://git.openwrt.org
         +PROJECT_GIT = https://git.builds.openwrt.org

         --- a/package/boot/uboot-bcm4908/Makefile
         +++ b/package/boot/uboot-bcm4908/Makefile
         -PKG_SOURCE_URL:=https://git.openwrt.org/project/bcm63xx/u-boot.git
         +PKG_SOURCE_URL:=https://git.builds.openwrt.org/project/bcm63xx/u-boot.git

     - the other option is to keep using git.openwrt.org and handle this via
       HTTP redirects, which should probably work as well

That would also avoid people trying to get priority by changing the URLs.

  B. improve scripts/feeds

     - add some kind of --retry backoff mechanism to Git operations
     - add a fallback list of additional Git repository mirrors; if one
       fails, use another, etc.


Can't these be set to be shallow clones by default? In most cases, you only need the latest HEAD to build against.

  C. upgrade the box

     - this means $$ which IMO would be better spent on funding/improving
       projects like codeberg.org or sourcehut.org
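
To make option B and the shallow-clone question above more concrete, here is a
minimal sketch of retry with exponential backoff, a mirror fallback list and an
optional shallow clone. It is an illustration in Python, not a patch for
scripts/feeds; the function name, its parameters and the example mirror URLs are
hypothetical. The only Git feature relied on is `git clone --depth`.

```python
import subprocess
import time


def fetch_with_fallback(mirrors, dest, retries=3, base_delay=1.0, depth=None,
                        run=subprocess.run, sleep=time.sleep):
    """Clone from the first mirror that works and return its URL.

    Each mirror is tried up to `retries` times, waiting base_delay * 2**n
    seconds between attempts, before moving on to the next mirror.
    `run` and `sleep` are injectable so the logic can be tested offline.
    """
    for url in mirrors:
        cmd = ["git", "clone"]
        if depth is not None:
            # Shallow clone: fetch only the most recent `depth` commits.
            cmd += ["--depth", str(depth)]
        cmd += [url, dest]
        for attempt in range(retries):
            if run(cmd).returncode == 0:
                return url
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all mirrors failed for " + dest)


if __name__ == "__main__":
    # Hypothetical mirror list; only the first URL appears in this thread.
    used = fetch_with_fallback(
        ["https://git.openwrt.org/feed/packages.git",
         "https://mirror.example.org/feed/packages.git"],
        "feeds/packages", depth=1)
    print("cloned from", used)
```

A shallow clone reduces the amount of data transferred per fetch, while the
backoff keeps many builds that start at the same time from retrying in lockstep
against an already overloaded server.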

Cheers,

Petr

_______________________________________________
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel



--- End Message ---