Script 'mail_helper' called by obssrc
Hello community,

here is the log from the commit of package mirrorsorcerer for openSUSE:Factory checked in at 2022-10-11 18:03:06
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/mirrorsorcerer (Old)
 and      /work/SRC/openSUSE:Factory/.mirrorsorcerer.new.2275 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "mirrorsorcerer"

Tue Oct 11 18:03:06 2022 rev:7 rq:1009638 version:0.1.0~20

Changes:
--------
--- /work/SRC/openSUSE:Factory/mirrorsorcerer/mirrorsorcerer.changes 2022-04-19 09:59:46.339681877 +0200
+++ /work/SRC/openSUSE:Factory/.mirrorsorcerer.new.2275/mirrorsorcerer.changes 2022-10-11 18:05:38.350093474 +0200
@@ -1,0 +2,7 @@
+Tue Oct 11 00:41:17 UTC 2022 - william.br...@suse.com
+
+- Update to version 0.1.0~20:
+  * Add new mirrorcache instances
+  * Update README
+
+-------------------------------------------------------------------

Old:
----
  mirrorsorcerer-0.1.0~13.tar.xz

New:
----
  mirrorsorcerer-0.1.0~20.tar.xz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ mirrorsorcerer.spec ++++++
--- /var/tmp/diff_new_pack.iIWT1g/_old 2022-10-11 18:05:38.994094515 +0200
+++ /var/tmp/diff_new_pack.iIWT1g/_new 2022-10-11 18:05:38.998094522 +0200
@@ -17,7 +17,7 @@


 Name: mirrorsorcerer
-Version: 0.1.0~13
+Version: 0.1.0~20
 Release: 0
 Summary: Mirror Sorcerer tool to magically make OpenSUSE mirror sources more magic-er
 License: (Apache-2.0 OR BSL-1.0) AND (Apache-2.0 OR MIT) AND (Apache-2.0 OR MIT OR Zlib) AND (MIT OR Unlicense) AND (Apache-2.0 OR Zlib OR MIT) AND BSD-3-Clause AND MIT AND MPL-2.0

++++++ mirrorsorcerer-0.1.0~13.tar.xz -> mirrorsorcerer-0.1.0~20.tar.xz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/mirrorsorcerer-0.1.0~13/Cargo.toml new/mirrorsorcerer-0.1.0~20/Cargo.toml
--- old/mirrorsorcerer-0.1.0~13/Cargo.toml 2022-04-19 04:46:48.000000000 +0200
+++ new/mirrorsorcerer-0.1.0~20/Cargo.toml 2022-10-11 02:40:06.000000000 +0200
@@ -1,6 +1,6 @@
 [package]
 name = "mirrorsorcerer"
-version = "0.1.0"
+version = "0.1.1"
 edition = "2021"
 description = "Mirror Sorcerer tool to magically make OpenSUSE mirror sources more magic-er"

diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/mirrorsorcerer-0.1.0~13/README.md new/mirrorsorcerer-0.1.0~20/README.md
--- old/mirrorsorcerer-0.1.0~13/README.md 2022-04-19 04:46:48.000000000 +0200
+++ new/mirrorsorcerer-0.1.0~20/README.md 2022-10-11 02:40:06.000000000 +0200
@@ -23,13 +23,16 @@
 This will only update mirrors that are provided by OpenSUSE. Custom mirrors are not altered.
 If you add new repositories, they are dynamically updated.

+## What Users are Saying
+
+* "It [mirrorsorcerer] is miraculous ... I went from downloading from OpenSUSE and OBS at 1MiB/s to 10~20MiB/s"
+
 ## Details

 The primary way to use this will be to install it and allow it to run at boot

     zypper in mirrorsorcerer
-    systemctl enable mirrorsorcerer
-    systemctl start mirrorsorcerer
+    systemctl enable --now mirrorsorcerer

 If you wish to define a custom mirror list that should be profiled instead:

@@ -58,5 +61,202 @@
 [Service]
 Environment=RUST_LOG=debug

+## Limitations
+
+Currently mirrorsorcerer only improves behaviour of official OpenSUSE mirrors. Third party mirrors
+are not supported.
+
+## Undoing the changes
+
+Mirrorsorcerer is careful to make backups before making changes.
+
+Repo files are copied to `/etc/zypp/repos.d/*.msbak` containing their original content.
+All customisations to zypp.conf are preserved when changing it. Original is backed up to /etc/zypp/zypp.conf.msbak
+
+## Why Mirrorsorcerer - Technical Details
+
+To understand why mirrorsorcerer works, we need to examine what zypper does in a default install
+and how mirrorsorcerer alters that behaviour.
+
+### Repository Metadata
+
+Zypper connects to a mirror and expects to be redirected. The "primary" redirection service is based in the EU.
+Two redirectors exist. download.opensuse.org (mirrorbrain) and mirrorcache.opensuse.org (mirrorcache).
+mirrorbrain will return metalink file, mirrorcache returns http redirects.
+
+Due to download.opensuse.org *and* mirrorcache.opensuse.org being in EU, the latency is in the order
+of 350ms for each Round Trip. This quickly adds up, where a single HTTP GET request can take approximately
+1 second from my home internet connection (Australia 20ms to my ISP).
+
+There are 4 required files for one repository (repomd.xml, media, repomd.xml.key and repomd.xml.asc).
+zypper initially performs a HEAD request for repomd.xml and then *closes the connection*. If this is
+considered "out of date", zypper then opens a second connection and requests the full set of 4 files.
+
+From my connection the HEAD request takes 0.7s. The second series of GET requests take 2.6s from
+first connection open to closing.
+
+If we are to perform a full refresh this process of double connecting repeats for each repository we
+have, taking ~3.2s just in network operations.
+
+Given an opensuse/tumbleweed:latest container, and running `time zypper ref --force` takes 32 seconds
+to complete (2022-10-02) . The addition of further repositories linearly increases this time taken.
+
+Zypper also aggresively refreshes metadata. By default metadata is considered out of date after 10
+minutes. The most common user perception of this is that zypper after a small period of inactivity
+will then have a ~30 second delay before responding on the next innvocation.
+
+### Package Downloads (mirrorcache)
+
+Let's assume we have our opensuse/tumbleweed:latest container, and we are running "zypper in -y less" (2022-10-02). This should
+result in the need to download 9 rpms: busybox busybox-which file file-magic less libcrypt1 libmagic1 libseccomp2 libsepol2
+
+zypper starts by sending an initial GET request to download.opensuse.org for `/tumbleweed/repo/oss/media.1/media`
+which returns a 200 and the name of the current media build.
+
+zypper then requests `/tumbleweed/repo/oss/noarch/file-magic-5.43-1.1.noarch.rpm`. The response is a
+HTTP 302 to the australian mirrorcache instance mirrocache-au.opensuse.org.
+
+`file-magic-5.43-1.1.noarch.rpm` is now requested from mirrorcache-au.opensuse.org, and a metalink
+xml is returned:
+
+    <metalink xmlns="urn:ietf:params:xml:ns:metalink">
+      <generator>MirrorCache</generator>
+      <origin dynamic="true">http://mirrorcache-au.opensuse.org/tumbleweed/repo/oss/noarch/file-magic-5.43-1.1.noarch.rpm</origin>
+      <published>2022-10-02T12:01:37Z</published>
+      <publisher>
+        <name>openSUSE</name>
+        <url>http://download.opensuse.org</url>
+      </publisher>
+      <file name="file-magic-5.43-1.1.noarch.rpm">
+        <!-- Mirrors which handle this country (AU): -->
+        <url location="AU" priority="1">http://mirror.firstyear.id.au/tumbleweed/repo/oss/noarch/file-magic-5.43-1.1.noarch.rpm</url>
+        <!-- Mirrors in the same continent (OC): -->
+        <url location="NZ" priority="2">http://mirror.2degrees.nz/opensuse/tumbleweed/repo/oss/noarch/file-magic-5.43-1.1.noarch.rpm</url>
+        <!-- Mirrors in other parts of the world: -->
+        <!-- File origin location: -->
+        <url location="" priority="3">http://mirror.firstyear.id.au/tumbleweed/repo/oss/noarch/file-magic-5.43-1.1.noarch.rpm</url>
+      </file>
+    </metalink>
+
+It is not always clear the selection logic that is used by zypper to decide between mirrors in a metalink xml.
+
+Finally zypper now connects to mirror.firstyear.id.au and retrieves the file.
+
+What is interesting in this process is:
+
+* The connection to mirrorcache-au.opensuse.org is always closed after the 302.
+* The connections to download.opensuse.org and mirror.firstyear.id.au are pooled (hooray!)
+
+Now using this information we can determine the impact of *latency* in these requests.
+
+Each 302 from download.opensuse.org takes 0.35 seconds to resolve.
+
+The lack of re-use on the mirrorcache-au connection adds 0.03 seconds for each file request due to
+having to re-open the connection. This whole connection takes 0.16 seconds to complete.
+
+From this, of the total 11.75 seconds to complete the install, 3.15 seconds are just in requests
+to download.opensuse.org, and the lack of connection reuse to the local redirector adds 0.27 seconds
+to the operation.
+
+### Changes that mirrorsorcerer makes
+
+Mirrorsorcerer makes two key changes on your system to improve this situation.
+
+* Repository metadata timeout is set to 18 hours from 10 minutes
+* Replacement of download.opensuse.org with a lower-latency mirror
+
+The increase in metadata timeout means that zypper will only refresh metadata once a day. Given that
+tumbleweed snapshots are "daily", and users with custom OBS repos and development work will refresh
+manually, this "delay" generally causes no difference in user experience.
+
+By directing zypper to a local redirector instead of going through download.opensuse.org we reduce
+the major source of latency, and prevent the connection open/close issue on the intermediate 302 host.
+In theory from our former experiment this should reduce the install from 11.75 seconds to 8.6 seconds
+however in reality this change is actually far better. The install time is reduced to 6.6 seconds.
+That is a saving of 5.15 seconds, 44% of the original execution time. A zypper refresh also benefits
+from this, now taking 15.1 seconds to complete instead of 32 seconds.
+
+### Issues mirrorsorcerer can NOT prevent
+
+The primary remaining issues that mirrorsorcerer can not prevent that causes reduction in bandwidth
+is *range requests*.
+
+*You can work around this today by setting ZYPP_MULTICURL=0 in your environment*
+
+In some situations, zypper will attempt to "stripe" downloads with range requests over multiple mirrors.
+For example, when retrieving libgio, we can see the metalink xml:
+
+    GET /distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm HTTP/1.1
+    Host: mirrorcache-au.opensuse.org
+    User-Agent: ZYpp 17.30.2 (curl 7.79.1)
+    X-ZYpp-AnonymousId: 8daf9dba-ee79-4878-a8c1-9c41a5d70390
+    X-ZYpp-DistributionFlavor: appliance-docker
+    Accept: */*, application/metalink+xml, application/metalink4+xml
+
+    HTTP/1.1 200 OK
+    content-disposition: attachment; filename="libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm.meta4"
+    content-length: 1819
+    content-type: application/metalink4+xml; charset=UTF-8
+    date: Sun, 02 Oct 2022 03:06:09 GMT
+    server: Mojolicious (Perl)
+    vary: Accept-Encoding
+    connection: close
+
+    <?xml version="1.0" encoding="UTF-8"?>
+
+    <metalink xmlns="urn:ietf:params:xml:ns:metalink">
+      <generator>MirrorCache</generator>
+      <origin dynamic="true">http://mirrorcache-au.opensuse.org/distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm</origin>
+      <published>2022-10-02T13:06:09Z</published>
+      <publisher>
+        <name>openSUSE</name>
+        <url>http://download.opensuse.org</url>
+      </publisher>
+      <file name="libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm">
+        <!-- Mirrors which handle this country (AU): -->
+        <url location="AU" priority="1">http://mirror.aarnet.edu.au/pub/opensuse/opensuse/distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm</url>
+        <url location="AU" priority="2">http://ftp.iinet.net.au/pub/opensuse/distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm</url>
+        <url location="AU" priority="3">http://mirror.firstyear.id.au/distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm</url>
+        <url location="AU" priority="4">http://ftp.netspace.net.au/pub/opensuse/distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm</url>
+        <url location="AU" priority="5">http://mirror.internode.on.net/pub/opensuse/distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm</url>
+        <!-- Mirrors in the same continent (OC): -->
+        <url location="NZ" priority="6">http://mirror.2degrees.nz/opensuse/distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm</url>
+        <!-- Mirrors in other parts of the world: -->
+        <!-- File origin location: -->
+        <url location="" priority="7">http://mirror.firstyear.id.au/distribution/leap/15.4/repo/oss/x86_64/libgio-2_0-0-2.70.4-150400.1.5.x86_64.rpm</url>
+      </file>
+    </metalink>
+
+We can then see then that zypper attempts to strip this over 5 mirrors:
+
+* Range: bytes=0-131071 - mirror.aarnet.edu.au
+* Range: bytes=131072-262143 - ftp.iinet.net.au
+* Range: bytes=393216-524287 - ftp.netspace.net.au
+* Range: bytes=524288-655359 - mirror.internode.on.net
+* Range: bytes=655360-702155 - mirror.2degrees.nz
+
+From the start to the end of these requests completing, this takes 0.32 seconds to download 702155 bytes. Now compare, if we download directly
+from a single mirror, this file downloads in 0.19 seconds.
+
+On smaller files this range behaviour is not "so bad", as there is obviously a difference in performance, but only ~30%. On the lowest priority
+mirror, the single download takes "as long" as the striped ranges (0.31 seconds).
+
+On larger files however
+we see this have a much larger impact. `rust1.63-1.63.0-150300.7.3.1.x86_64.rpm` for example takes 29.06 seconds to retrieve 83951816 bytes. However
+a direct connection to the preferred mirror take 9.7 seconds. That is 1/3rd of the time required. Even to the "lowest" priority mirror (2degrees.nz) this download
+takes 22 seconds directly. This means that by *not* using range requests, zypper will range from 25% to 78% faster. There appears to be no situation where
+range requests are *faster* than directly connecting to a single mirror. It's likely this is due to:
+
+* Small range requests can not reach maximum speed due to 'bursty' behaviour.
+* Mirror storage tends to be optimised to streaming reads not random IOPS.
+
+In addition there is actually a *bug* in zypper where if any mirror responds with a 200, instead of
+consuming the entire file from that mirror and ceasing range requests, it will continue to issue range
+requests to that mirror and other mirrors instead.
+
+Zypper could resolve this issue by:
+* Remove range requests outright. If "parallel" downloads over multiple files becomes a feature, why multiplex that further?
+* Increasing the range chunk size to allow connections to read larger throughputs. For example if a file is 80M then request 5 times 20M chunks rather than 640 times 131072 byte chunks. This will allow connections to reach better throughputs.
+* Disable range requests on files smaller than 4Mb.
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/mirrorsorcerer-0.1.0~13/mirrorsorcerer.service new/mirrorsorcerer-0.1.0~20/mirrorsorcerer.service
--- old/mirrorsorcerer-0.1.0~13/mirrorsorcerer.service 2022-04-19 04:46:48.000000000 +0200
+++ new/mirrorsorcerer-0.1.0~20/mirrorsorcerer.service 2022-10-11 02:40:06.000000000 +0200
@@ -2,7 +2,7 @@
 # /usr/lib/systemd/system/mirrormagic.service.d/custom.conf

 [Unit]
-Description=Mirror Sorcerer ???? ???
+Description=Mirror Sorcerer ??? ???? ???? ???
 After=chronyd.service ntpd.service network-online.target

 [Service]
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/mirrorsorcerer-0.1.0~13/packman.json new/mirrorsorcerer-0.1.0~20/packman.json
--- old/mirrorsorcerer-0.1.0~13/packman.json 2022-04-19 04:46:48.000000000 +0200
+++ new/mirrorsorcerer-0.1.0~20/packman.json 2022-10-11 02:40:06.000000000 +0200
@@ -1,9 +1,9 @@
 {
   "replaceable": [],
   "mirrors": [
-    "http://ftp.fau.de",
+    "http://ftp.fau.de",
     "http://ftp.halifax.rwth-aachen.de",
     "http://ftp.gwdg.de/pub/linux/misc",
-    "http://mirror.karneval.cz/pub/linux"
+    "http://mirror.karneval.cz/pub/linux"
   ]
 }
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/mirrorsorcerer-0.1.0~13/pool.json new/mirrorsorcerer-0.1.0~20/pool.json
--- old/mirrorsorcerer-0.1.0~13/pool.json 2022-04-19 04:46:48.000000000 +0200
+++ new/mirrorsorcerer-0.1.0~20/pool.json 2022-10-11 02:40:06.000000000 +0200
@@ -6,6 +6,9 @@
     "https://mirrorcache.opensuse.org",
     "https://mirrorcache-au.opensuse.org",
     "https://mirrorcache-us.opensuse.org",
-    "https://mirrorcache-jp.opensuse.org"
+    "https://mirrorcache-jp.opensuse.org",
+    "https://mirrorcache-us-east.opensuse.org",
+    "https://mirrorcache-us-west.opensuse.org",
+    "https://cache.opensuse.net.br"
   ]
 }
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/mirrorsorcerer-0.1.0~13/src/main.rs new/mirrorsorcerer-0.1.0~20/src/main.rs
--- old/mirrorsorcerer-0.1.0~13/src/main.rs 2022-04-19 04:46:48.000000000 +0200
+++ new/mirrorsorcerer-0.1.0~20/src/main.rs 2022-10-11 02:40:06.000000000 +0200
@@ -290,6 +290,12 @@

         if let Err(e) = repo.write_to_file(p) {
             warn!(?e, ?p, "Unable to write repo configuration");
+        } else {
+            info!("Successfully wrote to {:?}", p);
+            let mut dump: Vec<u8> = Vec::new();
+            let _ = repo.write_to(&mut dump);
+            let dump = unsafe { String::from_utf8_unchecked(dump) };
+            debug!(%dump);
         }
     }

@@ -327,7 +333,7 @@
         .with(fmt_layer)
         .init();

-    info!("??? Mirror Sorcerer ??? ");
+    info!("Mirror Sorcerer ???? ???? ??? ");

     let config = Config::from_args();

@@ -364,6 +370,9 @@
         .cloned()
         .collect();

+    // Profile the mirror latencies, since latency is the single
+    // largest issues in zypper metadata access.
+
     let mut profiled = Vec::with_capacity(md.mirrors.len());

     for url in md.mirrors.iter() {
@@ -406,9 +415,6 @@
     // some really unsafe and slow options that it chooses ...
     rewrite_zyppconf();

-    // Profile the mirror latencies, since latency is the single
-    // largest issues in zypper metadata access.
-
     let entries = match fs::read_dir("/etc/zypp/repos.d") {
         Ok(e) => e,
         Err(e) => {

++++++ vendor.tar.xz ++++++
/work/SRC/openSUSE:Factory/mirrorsorcerer/vendor.tar.xz /work/SRC/openSUSE:Factory/.mirrorsorcerer.new.2275/vendor.tar.xz differ: char 26, line 1
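
Review note on the src/main.rs hunk at -290,6 +290,12: the new debug dump converts the serialised repo buffer with `unsafe { String::from_utf8_unchecked(dump) }`, which is only sound if the bytes are guaranteed to be valid UTF-8. A minimal sketch of a safe alternative follows, assuming the string is only needed for logging. `dump_for_log` is a hypothetical helper name and is not part of mirrorsorcerer; `String::from_utf8_lossy` is the standard-library call that substitutes U+FFFD for invalid bytes.

```rust
/// Hypothetical helper (not part of mirrorsorcerer): turn a serialised repo
/// buffer into a printable string for debug logging without `unsafe`.
/// Invalid UTF-8 sequences are replaced with U+FFFD rather than trusted.
fn dump_for_log(dump: &[u8]) -> String {
    String::from_utf8_lossy(dump).into_owned()
}
```

Under that assumption the call site could become `let dump = dump_for_log(&dump);`. Valid input passes through unchanged, so the log output would only differ when the buffer is not valid UTF-8.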