Re: [Rpm-ecosystem] Zchunk update
On Mon, 2018-04-23 at 00:27 -0400, Neal Gompa wrote:
> On Tue, Apr 17, 2018 at 3:05 PM, Jonathan Dieter wrote:
> > I'm assuming that you're referring here to getting zchunk packaged into
> > Fedora. I'd really like to finalize the file format (we're close, but
> > I still need a good way of storing signatures in it) and the download
> > API before releasing it into Fedora proper.
> >
>
> I'm looking forward to this!

I've updated the file format to allow for multiple signatures, updated
the zchunk code to recognize the existence of a signature (while still
not checking it), and have released it as zchunk-0.3.0 in COPR.

I've also added 32 bits of flags that we can use to extend the format
in a backwards-compatible way.

The current zchunk format description is at:
https://github.com/jdieter/zchunk/blob/master/zchunk_format.txt

> > I would recommend using the dicts mentioned above as they give me over
> > 40% space savings for both other.xml.zck and primary.xml.zck. Do
> > please let me know if you run into any problems.
> >
>
> Are those dictionaries Fedora specific? If so, how can other
> distributions generate similar ones? If not, still, how were they
> made? :)

They were generated from Fedora metadata, but they should help with any
distribution's repodata. I generated them by splitting a few days' worth
of metadata along package boundaries, stripping out any checksums, and
then running zstd --train * on the directory containing the split
metadata.

The script I used is available at
https://www.jdieter.net/downloads/zchunk-dicts/split.py, and I hope to
write up proper instructions at some point.

Jonathan
___
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
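The splitting step described in the message above (cut the metadata along
package boundaries, strip checksums, write one sample per package) can be
sketched roughly as follows. This is not the split.py script linked above,
whose contents are not shown in this thread; the function names, the regular
expressions, and the output file naming are all assumptions made for
illustration. Regular expressions are used instead of an XML parser so that
each sample keeps the original byte layout of the package entry.

```python
import os
import re

# One <package ...>...</package> element. The \b keeps this from matching
# <packager> or the packages="N" attribute on the root element.
PACKAGE_RE = re.compile(r"<package\b.*?</package>", re.DOTALL)
# A <checksum ...>...</checksum> element (as found in primary.xml).
CHECKSUM_RE = re.compile(r"<checksum\b[^>]*>.*?</checksum>", re.DOTALL)
# A pkgid="..." attribute (as found in filelists.xml and other.xml).
# Treating it as a checksum to strip is an assumption of this sketch.
PKGID_RE = re.compile(r'\s+pkgid="[^"]*"')


def split_packages(xml_text):
    """Return one training sample per package, with checksums removed."""
    samples = []
    for match in PACKAGE_RE.finditer(xml_text):
        sample = CHECKSUM_RE.sub("", match.group(0))
        sample = PKGID_RE.sub("", sample)
        samples.append(sample)
    return samples


def write_samples(samples, out_dir):
    """Write each sample to its own file (hypothetical naming scheme)."""
    os.makedirs(out_dir, exist_ok=True)
    for index, sample in enumerate(samples):
        path = os.path.join(out_dir, f"pkg-{index:06d}.xml")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(sample)
```

With the samples written to a directory, the training step is the
zstd --train * invocation from the message, run inside that directory.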
Re: [Rpm-ecosystem] Zchunk update
On Mon, Apr 16, 2018 at 12:32 PM, Jonathan Dieter wrote:
> On Mon, 2018-04-16 at 09:00 -0400, Neal Gompa wrote:
>> On Mon, Apr 16, 2018 at 8:47 AM, Jonathan Dieter wrote:
>> > I've also added zchunk support to createrepo_c (see
>> > https://github.com/jdieter/createrepo_c), but I haven't yet created a
>> > pull request because I'm not sure if my current implementation is the
>> > best method. My current effort only zchunks primary.xml, filelists.xml
>> > and other.xml and doesn't change the sort order.
>> >
>>
>> Fedora COPR, Open Build Service, Mageia, and openSUSE also append
>> AppStream data to repodata to ship AppStream information. Is there a
>> way we can incorporate this into zck rpm-md? There's been an issue for
>> a while to support generating the AppStream metadata as part of the
>> createrepo_c run using the libappstream-builder library[1], which may
>> lend itself to doing this properly.
>
> Is it repomd.xml that actually gets changed or primary.xml /
> filelists.xml / other.xml?
>
> If it's repomd.xml, then it really shouldn't make any difference
> because I'm not currently zchunking it. As far as I can see, the only
> reason to zchunk it would be to have an embedded GPG signature once
> they're supported in zchunk.
>

repomd.xml is being changed, so it should be fine, then. It'd be nice
to be able to chunk up AppStream data eventually, though.

>> > The one area of zchunk that still needs some API work is the download
>> > and chunk merge API, and I'm planning to clean that up as I add zchunk
>> > support to librepo.
>> >
>> > Some things I'd still like to add to zchunk:
>> > * A Python API
>> > * GPG signatures in addition to (possibly replacing) overall data
>> >   checksum
>>
>> I'd rather not lose checksums, but GPG signatures would definitely be
>> necessary, as openSUSE needs them, and we'd definitely like to have
>> them in Fedora[2], COPR[3], and Mageia[4].
>
> Fair enough. Would we want zchunk to support multiple GPG signatures
> or is one enough?
>

Historically, we've used only one GPG key because that's what we do
with RPMs, but technically you can specify multiple keys in a .repo
file for Yum, DNF, and Zypper to use for validating packages and
metadata, so it's absolutely possible to have more.

I'd probably suggest, if it's not too difficult, supporting multiple
signatures.

-- 
真実はいつも一つ!/ Always, there's only one truth!
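To illustrate the point above about specifying multiple keys in a .repo
file, here is a small sketch that reads such a file and lists the keys. The
repo id, name, and URLs are placeholders, and gpg_keys is a hypothetical
helper written for this example; it uses Python's standard configparser and
is not how Yum, DNF, or Zypper actually parse their configuration. It
assumes the keys are listed as whitespace- or comma-separated URLs,
optionally spread over indented continuation lines.

```python
import configparser

# Hypothetical .repo contents; the repo id and URLs are placeholders.
REPO_FILE = """\
[example]
name=Example repository
baseurl=https://example.com/repo/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://example.com/keys/RPM-GPG-KEY-primary
       https://example.com/keys/RPM-GPG-KEY-secondary
"""


def gpg_keys(repo_text, repo_id):
    """Return the list of key URLs configured for one repository."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    # Indented continuation lines are joined with newlines by configparser;
    # treat commas and any whitespace as separators between URLs.
    return parser[repo_id]["gpgkey"].replace(",", " ").split()


keys = gpg_keys(REPO_FILE, "example")
for url in keys:
    print(url)
```

A client that trusts several keys this way would accept metadata signed by
any one of them, which is the case multiple signatures in a zchunk file
would need to cover.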
Re: [Rpm-ecosystem] Zchunk update
On Mon, 2018-04-16 at 09:00 -0400, Neal Gompa wrote:
> On Mon, Apr 16, 2018 at 8:47 AM, Jonathan Dieter wrote:
> > I've also added zchunk support to createrepo_c (see
> > https://github.com/jdieter/createrepo_c), but I haven't yet created a
> > pull request because I'm not sure if my current implementation is the
> > best method. My current effort only zchunks primary.xml, filelists.xml
> > and other.xml and doesn't change the sort order.
> >
>
> Fedora COPR, Open Build Service, Mageia, and openSUSE also append
> AppStream data to repodata to ship AppStream information. Is there a
> way we can incorporate this into zck rpm-md? There's been an issue for
> a while to support generating the AppStream metadata as part of the
> createrepo_c run using the libappstream-builder library[1], which may
> lend itself to doing this properly.

Is it repomd.xml that actually gets changed or primary.xml /
filelists.xml / other.xml?

If it's repomd.xml, then it really shouldn't make any difference
because I'm not currently zchunking it. As far as I can see, the only
reason to zchunk it would be to have an embedded GPG signature once
they're supported in zchunk.

> > The one area of zchunk that still needs some API work is the download
> > and chunk merge API, and I'm planning to clean that up as I add zchunk
> > support to librepo.
> >
> > Some things I'd still like to add to zchunk:
> > * A Python API
> > * GPG signatures in addition to (possibly replacing) overall data
> >   checksum
>
> I'd rather not lose checksums, but GPG signatures would definitely be
> necessary, as openSUSE needs them, and we'd definitely like to have
> them in Fedora[2], COPR[3], and Mageia[4].

Fair enough. Would we want zchunk to support multiple GPG signatures
or is one enough?

> > * An expiry field? (I'm obviously thinking about signed repodata here)
>
> Do we need an expiry field if we properly processed the key
> revocation/expiration in librepo? My understanding is that the current
> hiccup with it is that we don't, and that the GPG keyring used in
> librepo is independent of the RPM keyring (which it shouldn't be).

Ah, that makes sense. Forget that idea then.

Jonathan
Re: [Rpm-ecosystem] Zchunk update
On Mon, Apr 16, 2018 at 8:47 AM, Jonathan Dieter wrote:
> It's been a number of weeks since my last update, so I thought I'd let
> everyone know where things are at.
>
> I've spent most of these last few weeks reworking zchunk's API to make
> it easier to use and more in line with what other compression tools
> use, and I'm mostly happy with it now. Writing a simple zchunk file
> can be done in a few lines of code, while reading one is also simple.
>
> I've also added zchunk support to createrepo_c (see
> https://github.com/jdieter/createrepo_c), but I haven't yet created a
> pull request because I'm not sure if my current implementation is the
> best method. My current effort only zchunks primary.xml, filelists.xml
> and other.xml and doesn't change the sort order.
>

Fedora COPR, Open Build Service, Mageia, and openSUSE also append
AppStream data to repodata to ship AppStream information. Is there a
way we can incorporate this into zck rpm-md? There's been an issue for
a while to support generating the AppStream metadata as part of the
createrepo_c run using the libappstream-builder library[1], which may
lend itself to doing this properly.

[1]: https://github.com/rpm-software-management/createrepo_c/issues/75

> The one area of zchunk that still needs some API work is the download
> and chunk merge API, and I'm planning to clean that up as I add zchunk
> support to librepo.
>
> Some things I'd still like to add to zchunk:
> * A Python API
> * GPG signatures in addition to (possibly replacing) overall data
>   checksum

I'd rather not lose checksums, but GPG signatures would definitely be
necessary, as openSUSE needs them, and we'd definitely like to have
them in Fedora[2], COPR[3], and Mageia[4].

[2]: https://pagure.io/releng/issue/133
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1373331
[4]: https://bugs.mageia.org/show_bug.cgi?id=19432

> * An expiry field? (I'm obviously thinking about signed repodata here)

Do we need an expiry field if we properly processed the key
revocation/expiration in librepo? My understanding is that the current
hiccup with it is that we don't, and that the GPG keyring used in
librepo is independent of the RPM keyring (which it shouldn't be).

-- 
真実はいつも一つ!/ Always, there's only one truth!