Re: Fedora 40 beta freeze now over
On Tue, 2024-04-02 at 16:55 -0700, Kevin Fenzi wrote: > On Tue, Apr 02, 2024 at 09:28:31PM +0100, Jonathan Dieter wrote: > > * Alternatively, we could update whatever's calling createrepo_c > > to add the `f` prefix to all non-rawhide builds. > > I like this option. ;) > > https://pagure.io/pungi-fedora/pull-request/1269 That looks perfect! :) Jonathan -- ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
Re: Fedora 40 beta freeze now over
On Sat, 2024-03-30 at 09:39 -0700, Kevin Fenzi wrote:
> On Fri, Mar 29, 2024 at 11:32:10PM +0000, Jonathan Dieter wrote:
> > On Wed, 2024-03-27 at 09:12 -0700, Kevin Fenzi wrote:
> > > Our next freeze is for Fedora 40 Final, currently scheduled for
> > > 2024-04-02, which is NEXT TUESDAY!
> >
> > Could you please update fedora-repo-zdicts to 2403.1 on the server(s)
> > used to generate the metadata? This will reduce the size of the zchunk
> > metadata for the fedora repo.
>
> Yeah, I already updated the rawhide composer the other day... will get
> the rest today.
>
> Thanks for the reminder.

Hey Kevin, thanks for looking into this. I've just checked today's
compose and it's still not using the dictionaries. Looking at the logs
at
https://kojipkgs.fedoraproject.org/compose/branched/Fedora-40-20240402.n.0/logs/x86_64/createrepo-Everything.rpm.x86_64.log
, it looks like it's not using the expected dictionary path:

The dictionaries are in:
    /usr/share/fedora-repo-zdicts/f40
But createrepo_c is looking in:
    /usr/share/fedora-repo-zdicts/40

Our options are:
 * I can push out a new build of fedora-repo-zdicts with paths added
   that strip out the `f`, but we'll need to get a final freeze
   exception.
 * Alternatively, we could update whatever's calling createrepo_c to
   add the `f` prefix to all non-rawhide builds.
 * Finally, we can just ignore this and Fedora 40 will have 50% larger
   zchunk metadata.

I'd prefer one of the first two options (whichever is easier), but it's
not the end of the world if we go with option 3. I think we're already
there with F39.
Thanks,
Jonathan
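The second option in the message above (adding the `f` prefix for non-rawhide builds) can be sketched roughly as follows. This is only an illustration, not the change made in the linked pull request: the helper names `zdict_dir` and `zck_args` are hypothetical, and the assumption that rawhide dictionaries live in a directory named `rawhide` is not confirmed by the thread (only the `f40` path is).

```python
from pathlib import PurePosixPath

# Base directory taken from the paths quoted in the message above.
ZDICT_BASE = PurePosixPath("/usr/share/fedora-repo-zdicts")


def zdict_dir(release: str) -> PurePosixPath:
    """Return the dictionary directory for a release name.

    Assumption: numbered releases live under "f<N>" and anything else
    (such as "rawhide") uses its name unchanged. Only the "f40" layout
    is confirmed by the thread.
    """
    if release.isdigit():
        return ZDICT_BASE / f"f{release}"
    return ZDICT_BASE / release


def zck_args(release: str) -> list[str]:
    """Build the zchunk-related createrepo_c arguments for a release."""
    return ["--zck", f"--zck-dict-dir={zdict_dir(release)}"]


print(zck_args("40"))
```

With this shape, a caller that only knows the bare release number ("40") still ends up pointing createrepo_c at the `f40` directory.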
Re: Fedora 40 beta freeze now over
On Wed, 2024-03-27 at 09:12 -0700, Kevin Fenzi wrote: > Our next freeze is for Fedora 40 Final, currently scheduled for > 2024-04-02, which is NEXT TUESDAY! Could you please update fedora-repo-zdicts to 2403.1 on the server(s) used to generate the metadata? This will reduce the size of the zchunk metadata for the fedora repo. Thanks, Jonathan
Re: SQLAlchemy integration in Flask
Sorry for taking so long to reply. I'm afraid I don't check this
mailing list as often as I should. :)

On Tue, 2021-12-07 at 08:52 +0100, Aurelien Bompard wrote:
> Thanks for your input!
>
> > 1. We're using a clustered database (CockroachDB, for those who care)
> > that uses optimistic concurrency, so automatic transaction retries are
> > a must, and we need control over how those retries are done.
>
> Interesting, we don't use that, but then again we've recently started
> using more funky stuff on the database side (TimescaleDB) so maybe
> one day...

Unfortunately CockroachDB has gone the route of MongoDB in its
licensing, so it's not really open. YugabyteDB looks like it has most
of the same features and is Apache 2.0 licensed, so would probably be a
better fit for Fedora (and, if it weren't for the fact that it's missing
GIN indexes, we would probably be using it too).

> > 2. We are using the same models for a couple of different projects (the
> > API itself and a script that is synchronizing between the old database
> > and the new), and not all the projects are built on Flask. Initially,
> > I was able to get the sync script working with Flask-SQLAlchemy, but
> > things got ugly quickly when I started doing multithreading, so I
> > abandoned it and am now using Flask and SQLAlchemy separately.
>
> When I thought about that use case, I supposed it would be OK to
> instantiate the app and start the app context from within the script,
> as it would also give you access to Flask's config file. But I did
> not think about multithreading. Would you recommend against creating
> the app instance and the app context in a command-line script?

Well, that was what I tried to do first, but, as I said, everything
broke down when I tried to do multithreading (and got worse when I
tried to set up multiprocessing).
The problem is that Flask-SQLAlchemy tries to manage the DB session for
you, and, since SQLAlchemy sessions aren't thread-safe, my command-line
script kept crashing, and a few hours of poking around couldn't fix it.
If I'd been willing to poke around more in Flask-SQLAlchemy's internals,
I might have figured something out, but it just didn't seem to be worth
the effort when manually managing my sessions fixed the problem
completely.

> Is the code you wrote to integrate Flask and SQLAlchemy opensource,
> and available somewhere?

Unfortunately not, but there was actually very little integration code
written. Our code follows this pattern (we're using Flask-RESTX, and
I've omitted serializers to keep it simple):

endpoint:

    from flask_restx import Resource

    import business
    from util import run_transaction

    # ns is the Flask-RESTX Namespace, defined elsewhere
    @ns.route("/user/<id>")
    class UserLink(Resource):
        def get(self, id):
            return run_transaction(lambda s: business.get_user(s, id))

business:

    from database.model import *

    def get_user(session, id):
        return session.query(User).filter(User.id == id).one()

util:

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker
    import sqlalchemy_cockroachdb

    engine = create_engine('postgresql://admin:swordfish@localhost/')
    SessionMaker = sessionmaker(engine)

    def run_transaction(func):
        # Return the callback's result so the endpoint can use it
        return sqlalchemy_cockroachdb.run_transaction(SessionMaker, func)

The purpose of the run_transaction function is to repeat transactions
if there's a conflict, rather than trying to lock the record, which is
a CockroachDB paradigm.
I hope the above is at least somewhat helpful in explaining how we're
working without Flask-SQLAlchemy.

Jonathan
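The retry-on-conflict idea described in the message above can be illustrated without a database. This is a minimal sketch, not the sqlalchemy_cockroachdb implementation: `RetryableConflict`, `run_with_retries`, and the backoff parameters are all hypothetical names chosen for the example, and a real version would also commit on success and roll back on failure.

```python
import random
import time


class RetryableConflict(Exception):
    """Hypothetical stand-in for a database serialization conflict."""


def run_with_retries(make_session, func, max_retries=5, base_delay=0.01):
    """Call func(session) with a fresh session, retrying on conflicts.

    make_session is any zero-argument factory. Each attempt gets its own
    session, so no session is ever shared between attempts or threads.
    """
    for attempt in range(max_retries + 1):
        session = make_session()
        try:
            return func(session)
        except RetryableConflict:
            if attempt == max_retries:
                raise
            # Exponential backoff with jitter before the next attempt.
            time.sleep(base_delay * (2 ** attempt) * random.random())


# Usage: a callback that conflicts twice, then succeeds.
calls = {"count": 0}


def flaky_get_user(session):
    calls["count"] += 1
    if calls["count"] < 3:
        raise RetryableConflict()
    return {"id": 1}


result = run_with_retries(dict, flaky_get_user)
print(result, calls["count"])
```

Passing the session into the business function, as the pattern in the message does, is what makes this kind of wrapper possible: the caller decides how sessions are created and how often the work is retried.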
Re: SQLAlchemy integration in Flask
On Mon, 2021-12-06 at 18:36 +0100, Aurelien Bompard wrote:
> Anyway, this long email is about finding a common ground for
> SQLAlchemy integration in Flask, while taking into account our
> difficult experiences with webframewoks in the past, but not being
> locked in them. Is there something that I misrepresented here? Do you
> have opinions? Preferences?

So, full disclosure, I'm normally just lurking on this list and am not
currently writing or maintaining code for the infrastructure team, so
my 2¢ probably isn't worth much more than that.

Having said that, in my day job, I've been writing a Flask API to
correspond with a massive database restructure using SQLAlchemy. When
I started writing the API, I originally used Flask-SQLAlchemy for all
the reasons you listed above. However, a couple of months ago I
stripped it out for a couple of reasons.

1. We're using a clustered database (CockroachDB, for those who care)
   that uses optimistic concurrency, so automatic transaction retries
   are a must, and we need control over how those retries are done.

2. We are using the same models for a couple of different projects (the
   API itself and a script that is synchronizing between the old
   database and the new), and not all the projects are built on Flask.
   Initially, I was able to get the sync script working with
   Flask-SQLAlchemy, but things got ugly quickly when I started doing
   multithreading, so I abandoned it and am now using Flask and
   SQLAlchemy separately.

In short, Flask-SQLAlchemy does a great job of tying together Flask and
SQLAlchemy if you're 100% sure that your project models will never be
required outside of Flask. The minute you step outside of the
Flask-SQLAlchemy way of doing things, things start to go very wrong
very quickly.
Jonathan
Re: Freeze break request: Re: Can we update fedora-repo-zdicts on the branched and rawhide composers?
On Wed, 2021-09-08 at 09:14 -0700, Kevin Fenzi wrote: > > I've updated it. > > kevin I can confirm that the latest F35 repodata has the dictionaries now. Thanks so much! Jonathan
Can we update fedora-repo-zdicts on the branched and rawhide composers?
Since branching, I've put out a new version of fedora-repo-zdicts with dictionaries for F35 and updated dictionaries for Rawhide. This version (2108.1) is now available in all active Fedora/EPEL branches, I think. Can we update fedora-repo-zdicts on the branched and rawhide composers so they get the latest dictionaries when creating the repodata? Or do we need to wait until the beta freeze ends? Thanks, Jonathan
Re: Please don't update zchunk to 1.1.14 on servers where createrepo_c is run
On Wed, 2021-06-02 at 09:35 -0700, Kevin Fenzi wrote: > On Tue, Jun 01, 2021 at 09:13:30PM +0100, Jonathan Dieter wrote: > > A major bug in zchunk-1.1.14 was flagged up to me today. If zchunk- > > 1.1.14 (on a system with zstd 1.5.0+) is used to create a zck file with > > a zdict, the file will be impossible to decompress. Embarrassingly, > > the tests weren't testing this combination. > > > > The good news is that this doesn't affect decompression at all, so this > > is only a problem for the server that's used to generate the zchunked > > metadata, which is using zdicts. > > > > I've just finished building zchunk-1.1.15 which fixes this bug (and > > adds tests to make sure it never happens again), but please make sure > > that zchunk doesn't get updated to 1.1.14 on the servers that generate > > the metadata. > > > > Thanks, and apologies for the inconvenience. > > I updated all of them to 1.1.15 yesterday. > Many of them were on 1.1.14, so I figured it was better to move forward > than move back. :) That sounds good to me! Thanks so much! Jonathan
Please don't update zchunk to 1.1.14 on servers where createrepo_c is run
A major bug in zchunk-1.1.14 was flagged up to me today. If zchunk-1.1.14 (on a system with zstd 1.5.0+) is used to create a zck file with a zdict, the file will be impossible to decompress. Embarrassingly, the tests weren't testing this combination. The good news is that this doesn't affect decompression at all, so this is only a problem for the server that's used to generate the zchunked metadata, which is using zdicts. I've just finished building zchunk-1.1.15 which fixes this bug (and adds tests to make sure it never happens again), but please make sure that zchunk doesn't get updated to 1.1.14 on the servers that generate the metadata. Thanks, and apologies for the inconvenience. Jonathan
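The affected combination described in the message above can be written down as a small check. This is a sketch under stated assumptions: the function names are hypothetical, version strings are assumed to be plain dotted integers, and only zchunk 1.1.14 exactly is treated as affected, since the message names that version and says 1.1.15 fixes it.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Parse a dotted version such as "1.5.0", padding to three parts."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def zdict_output_is_broken(zchunk_version: str, zstd_version: str) -> bool:
    """True for the combination the message warns about.

    Assumption: only zchunk 1.1.14 built against zstd 1.5.0 or later
    produces undecompressable files when a zdict is used.
    """
    return (
        parse_version(zchunk_version) == (1, 1, 14)
        and parse_version(zstd_version) >= (1, 5, 0)
    )


print(zdict_output_is_broken("1.1.14", "1.5.0"))
print(zdict_output_is_broken("1.1.15", "1.5.0"))
```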
Re: Can we please update fedora-repo-zdicts on the metadata generation servers for F34 zchunk dictionaries?
On Sat, 2021-04-03 at 21:46 +0100, Jonathan Dieter wrote: > On Sat, 2021-04-03 at 11:09 -0700, Kevin Fenzi wrote: > > ok. I've installed fedora-repo-zdicts on both branched and rawhide > > composers. > > > > Lets see if that works in tomorrow's compose. > > Thanks so much! Fingers crossed. :) I've just checked the latest compose and the repodata now has zdicts. Thanks again Kevin for getting that package into the composers. Jonathan
Re: Can we please update fedora-repo-zdicts on the metadata generation servers for F34 zchunk dictionaries?
On Sat, 2021-04-03 at 11:09 -0700, Kevin Fenzi wrote: > ok. I've installed fedora-repo-zdicts on both branched and rawhide > composers. > > Lets see if that works in tomorrow's compose. Thanks so much! Fingers crossed. :) Jonathan
Can we please update fedora-repo-zdicts on the metadata generation servers for F34 zchunk dictionaries?
Right now, we're not using zdicts for the F34 zchunk metadata because they were only added in fedora-repo-zdicts-2103.1-2 (which should now be in the updates repo in all current Fedora releases). If we could update fedora-repo-zdicts to 2103.1-2 on whichever servers generate the metadata (preferably before the 34 GA metadata is generated), that should significantly reduce the size of the metadata. Thanks, Jonathan
Re: [PATCH] bodhi-backend: Make sure zchunk dicts are installed
On Thu, 2019-05-23 at 17:01 -0400, Randy Barlow wrote: > On Thu, 2019-05-23 at 10:33 -0700, Kevin Fenzi wrote: > > Applied. Thanks. > > One note: The patch to do zchunking is part of Bodhi 4.0.0, which is > not yet in production; we plan to deploy it on Tuesday. Unless I'm mistaken, that patch is specific to updateinfo.xml. The other metadata in updates and updates-testing is currently zchunked, just without a zdict at the moment. Jonathan
Re: [PATCH] bodhi-backend: Make sure zchunk dicts are installed
On Sun, 2019-05-19 at 21:25 +0100, Jonathan Dieter wrote:
> The zchunk dictionaries used to reduce the size of zchunk metadata seems to
> not currently be installed on the bodhi server. This patch makes sure they
> are installed.
>
> Signed-off-by: Jonathan Dieter
> ---
>  roles/bodhi2/backend/tasks/main.yml | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/roles/bodhi2/backend/tasks/main.yml b/roles/bodhi2/backend/tasks/main.yml
> index 32da678db..3ab6ec809 100644
> --- a/roles/bodhi2/backend/tasks/main.yml
> +++ b/roles/bodhi2/backend/tasks/main.yml
> @@ -18,6 +18,7 @@
>    - bodhi-composer
>    - python3-pyramid_sawing
>    - sigul
> +  - fedora-repo-zdicts
>    # Are these still needed?
>    - compose-utils
>    - pungi-utils

Just to be clear, I'm not 100% sure this is the right way or the right
place to install the zchunk dictionaries, but having them installed
should reduce the size of the zchunk metadata by a significant amount.

Jonathan
[PATCH] bodhi-backend: Make sure zchunk dicts are installed
The zchunk dictionaries used to reduce the size of zchunk metadata seems to
not currently be installed on the bodhi server. This patch makes sure they
are installed.

Signed-off-by: Jonathan Dieter
---
 roles/bodhi2/backend/tasks/main.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/roles/bodhi2/backend/tasks/main.yml b/roles/bodhi2/backend/tasks/main.yml
index 32da678db..3ab6ec809 100644
--- a/roles/bodhi2/backend/tasks/main.yml
+++ b/roles/bodhi2/backend/tasks/main.yml
@@ -18,6 +18,7 @@
   - bodhi-composer
   - python3-pyramid_sawing
   - sigul
+  - fedora-repo-zdicts
   # Are these still needed?
   - compose-utils
   - pungi-utils
-- 
2.21.0
Re: [PATCH] bodhi-backend: Add zchunk support to updates and updates-testing repositories
On Thu, 2019-04-11 at 18:08 -0700, Kevin Fenzi wrote:
> On 4/9/19 11:20 AM, Jonathan Dieter wrote:
> > On Tue, 2019-04-09 at 19:14 +0100, Jonathan Dieter wrote:
> > > This re-adds zchunk support for the updates and updates-testing
> > > repositories for both rpms and modularity.
> > >
> > > Zchunk metadata was turned off due to a broken version of librepo
> > > that made it out to stable, but a fixed version has been pushed and
> > > FESCo has decided[1] to go ahead and turn this back on.
> > >
> > > 1: https://pagure.io/fesco/issue/2116
> >
> > In that ticket, we didn't really specify when to turn it back on, so if
> > we want to sit on this patch for a few days, that's fine with me.
> >
> > Once we've decided when this should be applied, I'll send a message to
> > devel-announce with an explanation on how to workaround the segfault
> > for anyone still using librepo-1.9.6-1.
>
> I think we should apply it asap.
>
> However, if I save your email and try and git am it, it doesn't apply at
> all.
>
> Can you resend with the patch as attachment?
>
> I am not sure what thunderbird is doing here. ;(
>
> kevin

Ok, here it is, freshly rebased, as an attachment.

Jonathan

From 4c53d3fba04b1bbbfdb8a7dc1d350e75dd5efd5d Mon Sep 17 00:00:00 2001
From: Jonathan Dieter
Date: Sat, 30 Mar 2019 22:29:33 +
Subject: [PATCH] bodhi-backend: Add zchunk support to updates and updates-testing repositories

This re-adds zchunk support for the updates and updates-testing repositories
for both rpms and modularity.

Zchunk metadata was turned off due to a broken version of librepo that made it
out to stable, but a fixed version has been pushed and FESCo has decided[1] to
go ahead and turn this back on.

1: https://pagure.io/fesco/issue/2116

Signed-off-by: Jonathan Dieter
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index 43c6a7e5f..b5bb0c1fb 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -61,6 +61,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]

 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
     ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]

 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.21.0
Re: [PATCH] bodhi-backend: Add zchunk support to updates and updates-testing repositories
On Tue, 2019-04-09 at 19:14 +0100, Jonathan Dieter wrote: > This re-adds zchunk support for the updates and updates-testing repositories > for both rpms and modularity. > > Zchunk metadata was turned off due to a broken version of librepo that made it > out to stable, but a fixed version has been pushed and FESCo has decided[1] to > go ahead and turn this back on. > > 1: https://pagure.io/fesco/issue/2116 In that ticket, we didn't really specify when to turn it back on, so if we want to sit on this patch for a few days, that's fine with me. Once we've decided when this should be applied, I'll send a message to devel-announce with an explanation on how to workaround the segfault for anyone still using librepo-1.9.6-1. Jonathan
[PATCH] bodhi-backend: Add zchunk support to updates and updates-testing repositories
This re-adds zchunk support for the updates and updates-testing repositories
for both rpms and modularity.

Zchunk metadata was turned off due to a broken version of librepo that made it
out to stable, but a fixed version has been pushed and FESCo has decided[1] to
go ahead and turn this back on.

1: https://pagure.io/fesco/issue/2116

Signed-off-by: Jonathan Dieter
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index 43c6a7e5f..b5bb0c1fb 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -61,6 +61,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]

 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
     ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]

 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.21.0
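The conditional added to both templates in the patch above can be restated in plain Python to make its effect easier to see. This is only an illustrative sketch: the function name `createrepo_extra_args` mirrors the config key but is itself hypothetical, and returning an empty list for older releases is an approximation, since the template simply leaves the option unset in that case.

```python
def createrepo_extra_args(version_int: int) -> list[str]:
    """Mirror the template logic: zchunk arguments only for release >= 30."""
    if version_int < 30:
        # The template leaves the option unset here; an empty list is
        # used as a stand-in for "no extra arguments".
        return []
    return [
        "--zck",
        f"--zck-dict-dir=/usr/share/fedora-repo-zdicts/f{version_int}",
    ]


for version in (29, 30):
    print(version, createrepo_extra_args(version))
```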
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Sun, 2019-03-31 at 10:37 -0700, Kevin Fenzi wrote: > On 3/31/19 10:35 AM, Jonathan Dieter wrote: > > On Sun, 2019-03-31 at 10:28 -0700, Kevin Fenzi wrote: > > > On 3/31/19 1:56 AM, Jonathan Dieter wrote: > > > > On Sun, 2019-03-31 at 09:09 +0100, Jonathan Dieter wrote: > > > > > Due to an unrelated *major* bug in the latest librepo update ( > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1694411), I'd like to > > > > > request that we disable zchunk metadata generation in updates and > > > > > updates-testing until it's fixed. > > > > > > > > Just to be clear, until either: > > > > * We get a new updates compose out without zchunk metadata, or > > > > * The user sets zchunk=False in /etc/dnf/dnf.conf > > > > > > > > dnf update is broken for anybody using F30 > > > > > > > > Should I send an email to -devel explaining the above? > > > > > > Please do, perhaps devel-announce ? > > > > > > I have reverted things and am working on a new f30-updates-testing push. > > > There was a failed f29-updates-testing last night so I have to finish > > > that first, but hopefully we will have it out in a few hours. > > > > I've sent it out to devel-announce, but it was rejected as I'm not in > > the right group. Will I send it to you and let you forward it? > > Yeah, you have to be subscribed to devel-announce to post there... if > you just subscribe and resend it should go to moderation and I can pass it. > > Or if you want, just send it my way and I can post it... Ok, I've subscribed, sent the message, and it's awaiting moderation. 
Jonathan
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Sun, 2019-03-31 at 10:28 -0700, Kevin Fenzi wrote: > On 3/31/19 1:56 AM, Jonathan Dieter wrote: > > On Sun, 2019-03-31 at 09:09 +0100, Jonathan Dieter wrote: > > > Due to an unrelated *major* bug in the latest librepo update ( > > > https://bugzilla.redhat.com/show_bug.cgi?id=1694411), I'd like to > > > request that we disable zchunk metadata generation in updates and > > > updates-testing until it's fixed. > > > > Just to be clear, until either: > > * We get a new updates compose out without zchunk metadata, or > > * The user sets zchunk=False in /etc/dnf/dnf.conf > > > > dnf update is broken for anybody using F30 > > > > Should I send an email to -devel explaining the above? > > Please do, perhaps devel-announce ? > > I have reverted things and am working on a new f30-updates-testing push. > There was a failed f29-updates-testing last night so I have to finish > that first, but hopefully we will have it out in a few hours. I've sent it out to devel-announce, but it was rejected as I'm not in the right group. Will I send it to you and let you forward it? Jonathan
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Sun, 2019-03-31 at 09:09 +0100, Jonathan Dieter wrote:
> Due to an unrelated *major* bug in the latest librepo update
> (https://bugzilla.redhat.com/show_bug.cgi?id=1694411), I'd like to
> request that we disable zchunk metadata generation in updates and
> updates-testing until it's fixed.

Just to be clear, until either:
 * We get a new updates compose out without zchunk metadata, or
 * The user sets zchunk=False in /etc/dnf/dnf.conf

dnf update is broken for anybody using F30.

Should I send an email to -devel explaining the above?

Jonathan
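For anyone applying the client-side workaround mentioned above, here is a minimal sketch of setting zchunk=False in a dnf.conf. It assumes the option belongs in the [main] section and works on the file's text rather than on /etc/dnf/dnf.conf directly; the helper name disable_zchunk and the sample config are only illustrative. Note that configparser drops comments when it rewrites a file, so hand-editing may be preferable on a real system.

```python
import configparser
import io


def disable_zchunk(conf_text: str) -> str:
    """Return dnf.conf text with zchunk=False set in the [main] section."""
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names exactly as written instead of lower-casing them.
    parser.optionxform = str
    parser.read_string(conf_text)
    if not parser.has_section("main"):
        parser.add_section("main")
    parser.set("main", "zchunk", "False")
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Hypothetical sample config, not taken from a real system.
example = "[main]\ngpgcheck=1\ninstallonly_limit=3\n"
updated = disable_zchunk(example)
print(updated)
```

The existing options are preserved and the new key is appended to [main].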
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Sun, 2019-03-31 at 05:13 +0000, Peter Robinson wrote:
> On Sun, Mar 31, 2019 at 6:01 AM Kevin Fenzi wrote:
> > On 3/30/19 9:50 PM, Peter Robinson wrote:
> > > > Great, thanks! I'll be keeping an eye on the composes to see if there
> > > > are any issues.
> > >
> > > Wasn't this disabled in the main Fedora branched compose? If so why
> > > would we want to enable it only on updates?
> >
> > There's no updates in f30 indeed, but updates-testing should be there
> > and available for testing. Nearer release we will enable updates and if
> > we didn't enable this for them now we might well not remember to do so,
> > so it seemed like a good idea to just do them both.
>
> I was referring to commits 6c392f16 and 96adf9a in pungi-fedora, if
> it's disabled in the base fedora repo why enable it in
> updates/testing?

Hey Peter, the zchunk metadata generation was disabled in the base repo because of a bug that popped up in a combination that the compose process happened to hit: using a single baseurl and downloading a zchunk file with tens of thousands of chunks on a slow processor.

The bug has been fixed with updates to both zchunk and libcurl (see https://bugzilla.redhat.com/show_bug.cgi?id=1690971) and it shouldn't affect beta users because the number of chunks in updates and updates-testing is an order of magnitude lower than in the base repo.

*However*

Due to an unrelated *major* bug in the latest librepo update (https://bugzilla.redhat.com/show_bug.cgi?id=1694411), I'd like to request that we disable zchunk metadata generation in updates and updates-testing until it's fixed.

Jonathan
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Sat, 2019-03-30 at 16:00 -0700, Kevin Fenzi wrote:
> On 3/30/19 3:32 PM, Jonathan Dieter wrote:
> > On Sat, 2019-03-30 at 15:13 -0700, Kevin Fenzi wrote:
> > > On 3/30/19 11:35 AM, Jonathan Dieter wrote:
> > > >
> > > > Stephen and Kevin, thanks so much!
> > >
> > > Can you rebase and attach the patch?
> > >
> > > It's not applying cleanly for me... if not I can try and manually
> > > poke it later.
> > >
> > > kevin
> >
> > I've just rebased and posted the updated patch. There were no
> > conflicts when I rebased it against master, so please let me know
> > if I should be rebasing against a different branch.
>
> Not sure why it was complaining, but it's applied and pushed now.
>
> kevin

Great, thanks! I'll be keeping an eye on the composes to see if there are any issues.

Jonathan
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Sat, 2019-03-30 at 15:13 -0700, Kevin Fenzi wrote:
> On 3/30/19 11:35 AM, Jonathan Dieter wrote:
> >
> > Stephen and Kevin, thanks so much!
>
> Can you rebase and attach the patch?
>
> It's not applying cleanly for me... if not I can try and manually poke
> it later.
>
> kevin

I've just rebased and posted the updated patch. There were no conflicts when I rebased it against master, so please let me know if I should be rebasing against a different branch.

Jonathan
[Freeze Break Request] Add zchunk support to updates and updates-testing repositories
Rebased patch against master
[PATCH] Add zchunk support to updates and updates-testing repositories
This adds zchunk support for the updates and updates-testing
repositories for both rpms and modularity

Signed-off-by: Jonathan Dieter
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index 43c6a7e5f..b5bb0c1fb 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -61,6 +61,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
     ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.21.0
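The conditional added by the patch above can be restated as a small Python function. This is only a sketch of the template's logic, not the code that renders it: the function name is made up for illustration, and an empty list stands in for the F29 case, where the template simply leaves createrepo_extra_args unset.

```python
from typing import List


def createrepo_extra_args(version_int: int) -> List[str]:
    """Mirror the template conditional: zchunk flags only for F30 and later."""
    if version_int < 30:
        # The template emits nothing here; an empty list is a stand-in.
        return []
    return [
        "--zck",
        f"--zck-dict-dir=/usr/share/fedora-repo-zdicts/f{version_int}",
    ]


for version in (29, 30):
    print(version, createrepo_extra_args(version))
```

The dictionary directory is derived from the release number with an `f` prefix, matching the path used in the patch.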
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Sat, 2019-03-30 at 14:05 -0400, Stephen John Smoogen wrote:
> +1
>
> On Sat, 30 Mar 2019 at 13:53, Kevin Fenzi wrote:
> > On 3/29/19 1:33 PM, Jonathan Dieter wrote:
> > > On Mon, 2019-03-11 at 20:23 +0000, Jonathan Dieter wrote:
> > > > On Mon, 2019-03-11 at 11:24 -0700, Kevin Fenzi wrote:
> > > > > On 3/11/19 12:26 AM, Jonathan Dieter wrote:
> > > > > > This adds zchunk support for the updates and updates-testing
> > > > > > repositories for both rpms and modularity. We already have zchunk
> > > > > > metadata being generated for the fedora repository. I'd like to
> > > > > > get this in before Beta comes out so Beta users will have
> > > > > > zchunk-enabled updates-testing repositories when Beta is released.
> > > > >
> > > > > yeah, hopefully not too much pain since it's been in rawhide a while
> > > > > now.
> > > > >
> > > > > > I am making the assumption that a zchunk-enabled createrepo_c
> > > > > > (0.12.0-2 or later) is available on the builders (I think I'm safe
> > > > > > making that assumption, since zchunk metadata is already being
> > > > > > generated for some repos).
> > > > >
> > > > > Well, bodhi-backend01 (where the updates process/pungi runs for these)
> > > > > has a newer one, so yes. It's all run on bodhi-backend01, not
> > > > > builders.
> > > > >
> > > > > > I have *not* tested this patch, because I'm not sure how I'd go
> > > > > > about doing so. If we don't have any test builders, my suggestion
> > > > > > would be to wait until no compose is running, and then run this
> > > > > > play on a builder, verifying that the generated pungi configuration
> > > > > > is valid for both f29 and f30, with no createrepo_extra_args in f29.
> > > > >
> > > > > yeah, we can commit this, run the playbook then examine the results.
> > > >
> > > > Great. I'm on UTC time right now, so hopefully I'll be off of work and
> > > > available if there are any issues whenever we get another +1 and you
> > > > run it. I do expect that it will go fine.
> > >
> > > Since we never got the extra +1 to get this in before Beta, are we at a
> > > point where we can turn this on now?
> >
> > Nope, we are still frozen until the day after beta release. ;(
> >
> > But I will try and scare up another +1
> >
> > kevin
>
> --
> Stephen J Smoogen.

Stephen and Kevin, thanks so much!

Jonathan
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Mon, 2019-03-11 at 20:23 +0000, Jonathan Dieter wrote:
> On Mon, 2019-03-11 at 11:24 -0700, Kevin Fenzi wrote:
> > On 3/11/19 12:26 AM, Jonathan Dieter wrote:
> > > This adds zchunk support for the updates and updates-testing
> > > repositories for both rpms and modularity. We already have zchunk
> > > metadata being generated for the fedora repository. I'd like to
> > > get this in before Beta comes out so Beta users will have
> > > zchunk-enabled updates-testing repositories when Beta is released.
> >
> > yeah, hopefully not too much pain since it's been in rawhide a while now.
> >
> > > I am making the assumption that a zchunk-enabled createrepo_c (0.12.0-2
> > > or later) is available on the builders (I think I'm safe making that
> > > assumption, since zchunk metadata is already being generated for some
> > > repos).
> >
> > Well, bodhi-backend01 (where the updates process/pungi runs for these)
> > has a newer one, so yes. It's all run on bodhi-backend01, not builders.
> >
> > > I have *not* tested this patch, because I'm not sure how I'd go about
> > > doing so. If we don't have any test builders, my suggestion would be
> > > to wait until no compose is running, and then run this play on a
> > > builder, verifying that the generated pungi configuration is valid for
> > > both f29 and f30, with no createrepo_extra_args in f29.
> >
> > yeah, we can commit this, run the playbook then examine the results.
>
> Great. I'm on UTC time right now, so hopefully I'll be off of work and
> available if there are any issues whenever we get another +1 and you
> run it. I do expect that it will go fine.

Since we never got the extra +1 to get this in before Beta, are we at a point where we can turn this on now?

Thanks

Jonathan
Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories
On Mon, 2019-03-11 at 11:24 -0700, Kevin Fenzi wrote:
> On 3/11/19 12:26 AM, Jonathan Dieter wrote:
> > This adds zchunk support for the updates and updates-testing
> > repositories for both rpms and modularity. We already have zchunk
> > metadata being generated for the fedora repository. I'd like to
> > get this in before Beta comes out so Beta users will have
> > zchunk-enabled updates-testing repositories when Beta is released.
>
> yeah, hopefully not too much pain since it's been in rawhide a while now.
>
> > I am making the assumption that a zchunk-enabled createrepo_c (0.12.0-2
> > or later) is available on the builders (I think I'm safe making that
> > assumption, since zchunk metadata is already being generated for some
> > repos).
>
> Well, bodhi-backend01 (where the updates process/pungi runs for these)
> has a newer one, so yes. It's all run on bodhi-backend01, not builders.
>
> > I have *not* tested this patch, because I'm not sure how I'd go about
> > doing so. If we don't have any test builders, my suggestion would be
> > to wait until no compose is running, and then run this play on a
> > builder, verifying that the generated pungi configuration is valid for
> > both f29 and f30, with no createrepo_extra_args in f29.
>
> yeah, we can commit this, run the playbook then examine the results.

Great. I'm on UTC time right now, so hopefully I'll be off of work and available if there are any issues whenever we get another +1 and you run it. I do expect that it will go fine.

Jonathan
[Freeze Break Request] Add zchunk support to updates and updates-testing repositories
This adds zchunk support for the updates and updates-testing repositories for both rpms and modularity. We already have zchunk metadata being generated for the fedora repository. I'd like to get this in before Beta comes out so Beta users will have zchunk-enabled updates-testing repositories when Beta is released.

I am making the assumption that a zchunk-enabled createrepo_c (0.12.0-2 or later) is available on the builders (I think I'm safe making that assumption, since zchunk metadata is already being generated for some repos).

I have *not* tested this patch, because I'm not sure how I'd go about doing so. If we don't have any test builders, my suggestion would be to wait until no compose is running, and then run this play on a builder, verifying that the generated pungi configuration is valid for both f29 and f30, with no createrepo_extra_args in f29.

Signed-off-by: Jonathan Dieter
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index bb021eb13..7dad35403 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -59,6 +59,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
     ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
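The verification suggested in the request above (valid config for both f29 and f30, with no createrepo_extra_args in f29) could be checked along these lines. This is a rough sketch under two assumptions: that the rendered fragment parses as plain Python assignments, as the lines in the patch do, and that a hand-rendered stand-in is good enough for illustration. render_fragment and assigned_names are hypothetical helpers, not part of pungi, Bodhi or the ansible repo.

```python
import ast


def render_fragment(version_int):
    """Hand-rendered stand-in for the patched template lines (not the real renderer)."""
    lines = [
        "createrepo_c = True",
        "createrepo_checksum = 'sha256'",
        "createrepo_deltas = False",
    ]
    if version_int >= 30:
        lines.append(
            "createrepo_extra_args = ['--zck', "
            "'--zck-dict-dir=/usr/share/fedora-repo-zdicts/f%d']" % version_int
        )
    return "\n".join(lines) + "\n"


def assigned_names(config_text):
    """Parse the text as Python and collect top-level assigned names.

    ast.parse raises SyntaxError if the rendered text is not valid.
    """
    names = set()
    for node in ast.parse(config_text).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


f29_names = assigned_names(render_fragment(29))
f30_names = assigned_names(render_fragment(30))
print("createrepo_extra_args" in f29_names, "createrepo_extra_args" in f30_names)
```

On a real host the same check would be run against the files the playbook actually generates rather than against this stand-in.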
Re: How do we turn zchunk on for updates-testing for F30?
On Sun, 2019-03-10 at 15:47 +0000, Peter Robinson wrote:
> git send-email so it's inline on the list for easy review.

Thanks for the tip! Just sent it using git send-email.

Jonathan
[PATCH] Add zchunk support to updates and updates-testing repositories
This adds zchunk support for the updates and updates-testing
repositories for both rpms and modularity

Signed-off-by: Jonathan Dieter
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index bb021eb13..7dad35403 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -59,6 +59,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
     ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.20.1
Re: How do we turn zchunk on for updates-testing for F30?
On Sat, 2019-03-09 at 21:29 +0100, Mikolaj Izdebski wrote:
> On Sat, Mar 9, 2019 at 2:29 PM Jonathan Dieter wrote:
> > Hey, I just noticed that, while we have zchunked metadata for the
> > F30 base repository, it's not enabled for updates-testing.
> >
> > I've looked in the ansible repo and in pungi, but I can't see where
> > createrepo_c is actually called for updates-testing. Can someone
> > please point me in the right direction?
>
> createrepo for updates-testing is run by pungi. I believe you need to
> enable zchunk in pungi.conf (createrepo_extra_args option). For
> non-modular updates-testing the config is
> roles/bodhi2/backend/templates/pungi.rpm.conf.j2 in ansible.git.
> Similarly, for the modular equivalent, the pungi config is located at
> roles/bodhi2/backend/templates/pungi.module.conf.j2

Thanks for pointing me in the right direction. I think I've got it, complete with a conditional so we don't start generating zchunk metadata for F29 updates.

There doesn't seem to be a way to generate pull requests on https://infrastructure.fedoraproject.org/cgit/ansible.git, so I'm attaching the support as a patch. If there's a better way for me to send it in, please let me know.

Jonathan

P.S. It may be a small and simple patch, but I haven't actually tested it and am not sure how to go about doing so.
From 17eefaa1cefb624afa0cf95d04e7f337ba70cb42 Mon Sep 17 00:00:00 2001
From: Jonathan Dieter
Date: Sat, 9 Mar 2019 22:52:48 +0000
Subject: [PATCH] Add zchunk support to updates and updates-testing repositories

This adds zchunk support for the updates and updates-testing
repositories for both rpms and modularity

Signed-off-by: Jonathan Dieter
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index bb021eb13..7dad35403 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -59,6 +59,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
     ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.20.1
Re: How do we turn zchunk on for updates-testing for F30?
On Sat, 2019-03-09 at 09:43 -0500, Neal Gompa wrote:
> On Sat, Mar 9, 2019 at 8:28 AM Jonathan Dieter wrote:
> > Hey, I just noticed that, while we have zchunked metadata for the
> > F30 base repository, it's not enabled for updates-testing.
> >
> > I've looked in the ansible repo and in pungi, but I can't see where
> > createrepo_c is actually called for updates-testing. Can someone
> > please point me in the right direction?
>
> Updates repos are handled by Bodhi, so you'll want to look there.

If I'm reading the code correctly, it looks like Bodhi creates the repodata using pungi. Currently, fedora.conf in the pungi-fedora repo is set to create zchunk metadata for the branches f30 and master. Could updates-testing maybe be using a different branch?

Jonathan
How do we turn zchunk on for updates-testing for F30?
Hey, I just noticed that, while we have zchunked metadata for the F30 base repository, it's not enabled for updates-testing.

I've looked in the ansible repo and in pungi, but I can't see where createrepo_c is actually called for updates-testing. Can someone please point me in the right direction?

Jonathan
Re: Enabling zchunk metadata generation in F30
On Fri, 2018-12-14 at 15:15 -0500, Randy Barlow wrote:
> On Fri, 2018-12-14 at 19:02 +0000, Jonathan Dieter wrote:
> > Hey Randy, at the moment the --zck option *only* applies to
> > primary.xml, filelists.xml and other.xml. It should be pretty
> > straightforward to add it to the others, but I wanted to get those
> > three working first.
>
> Cool, sounds good to me.
>
> > As for python bindings, they can read zchunk metadata just fine, but I
> > don't think I hooked up creating the metadata. Where exactly does it
> > generate updateinfo in bodhi? I'd like to see how the function is
> > used so I can implement it.
>
> Bodhi's updateinfo code lives here:
>
> https://github.com/fedora-infra/bodhi/blob/3.11.3/bodhi/server/metadata.py

Thanks for this. I'll take a look at it and see what it will take to make zchunk generation work in there.

Jonathan
Re: Enabling zchunk metadata generation in F30
On Fri, 2018-12-14 at 19:24 +0000, Jonathan Dieter wrote:
> On Fri, 2018-12-14 at 11:13 -0800, Kevin Fenzi wrote:
> > On 12/14/18 10:52 AM, Jonathan Dieter wrote:
> > > I suspect that the maintainers would like to see this feature tested
> > > more before pushing it to F29, but I can ask them, if you'd like.
> >
> > No, I am not suggesting we implement it now in F29, I am saying that the
> > rawhide-composer and bodhi-backend01 machines that run pungi are Fedora
> > 29 hosts. The rawhide-composer runs in a chroot, so that should just
> > work, but the bodhi-backend01 updates pungi I don't think does, or if it
> > does it's a f29 chroot, not rawhide. So we will need a build for it.
>
> Ok, makes sense.

While I'm thinking about it, fedora-repo-zdicts was just added to Fedora yesterday and hasn't even been pushed to the testing repositories yet. It should be available in Rawhide though...

Jonathan
Re: Enabling zchunk metadata generation in F30
On Fri, 2018-12-14 at 11:13 -0800, Kevin Fenzi wrote:
> On 12/14/18 10:52 AM, Jonathan Dieter wrote:
> > On Thu, 2018-12-13 at 16:42 -0800, Kevin Fenzi wrote:
> > > Cool.
> > >
> > > I see the new createrepo_c only has a rawhide build... any chance for a
> > > f29 update? Or should we look at building a newer in our infra repo?
> >
> > I suspect that the maintainers would like to see this feature tested
> > more before pushing it to F29, but I can ask them, if you'd like.
>
> No, I am not suggesting we implement it now in F29, I am saying that the
> rawhide-composer and bodhi-backend01 machines that run pungi are Fedora
> 29 hosts. The rawhide-composer runs in a chroot, so that should just
> work, but the bodhi-backend01 updates pungi I don't think does, or if it
> does it's a f29 chroot, not rawhide. So we will need a build for it.

Ok, makes sense.

> > If you do a F29 rebuild for the infra repo, you'll want to make sure to
> > pass --with zchunk to rpmbuild, as it defaults to off for anything F29
> > and below.
>
> Well, we cannot do that with a koji build, but I guess we could just
> change the source.

It's just a conditional in the spec that's currently set to off for <= F29, so easy enough to change.

> Anyhow, perhaps we just target rawhide for now...

That works for me. Thanks so much! If there's anything else I can do to help with this, please let me know.

Jonathan
Re: Enabling zchunk metadata generation in F30
On Fri, 2018-12-14 at 11:20 -0500, Randy Barlow wrote:
> On Thu, 2018-12-13 at 22:56 +0000, Jonathan Dieter wrote:
> > The call to createrepo_c or mergerepo_c
> > (whichever is run last to generate the final metadata) would need to
> > be run with the new zchunk arguments:
> >
> > --zck --zck-dict-dir=/usr/share/fedora-repo-zdicts/f30
>
> Hey Jonathan!
>
> Bodhi uses createrepo_c both through pungi (to create the bulk of the
> repository) and through the createrepo_c Python bindings (to generate
> the updateinfo.xml file). Is there a way to ask the Python bindings to
> do this? The Fedora 29 updateinfo.xml file looks like it's only about 1

Hey Randy, at the moment the --zck option *only* applies to primary.xml, filelists.xml and other.xml. It should be pretty straightforward to add it to the others, but I wanted to get those three working first.

As for python bindings, they can read zchunk metadata just fine, but I don't think I hooked up creating the metadata. Where exactly does it generate updateinfo in bodhi? I'd like to see how the function is used so I can implement it.

Jonathan
Re: Enabling zchunk metadata generation in F30
On Thu, 2018-12-13 at 16:42 -0800, Kevin Fenzi wrote:
> On 12/13/18 3:34 PM, Jonathan Dieter wrote:
> > On Thu, 2018-12-13 at 15:12 -0800, Kevin Fenzi wrote:
> > > pungi calls createrepo_c for us (in both rawhide/branched and updates)
> > > so we need a pungi patch (probably with a config option?) to enable
> > > this. If it's added as an optional thing we would need to add that
> > > setting to our pungi-fedora config and set it to on.
> > >
> > > Can you file a pungi issue on that?
> >
> > I've just checked the pungi issues, and it looks like Lubomír took care
> > of this at Flock last summer by adding an arbitrary createrepo_c
> > commands option: createrepo_extra_args
> >
> > I've done a PR for the pungi-fedora config here, but it's untested and
> > I'm not sure if I did the variable substitution correctly.
> >
> > https://pagure.io/pungi-fedora/pull-request/678
>
> Cool.
>
> I see the new createrepo_c only has a rawhide build... any chance for a
> f29 update? Or should we look at building a newer in our infra repo?

I suspect that the maintainers would like to see this feature tested more before pushing it to F29, but I can ask them, if you'd like.

If you do a F29 rebuild for the infra repo, you'll want to make sure to pass --with zchunk to rpmbuild, as it defaults to off for anything F29 and below.

Jonathan
Re: Enabling zchunk metadata generation in F30
On Thu, 2018-12-13 at 15:12 -0800, Kevin Fenzi wrote: > pungi calls createrepo_c for us (in both rawhide/branched and updates) > so we need a pungi patch (probibly with a config option?) to enable > this. If it's added as a optional thing we would need to add that > setting to our pungi-fedora config and set it to on. > > Can you file a pungi issue on that? I've just checked the pungi issues, and it looks like Lubomír took care of this at Flock last summer by adding an arbitrary createrepo_c commands option: createrepo_extra_args I've done a PR for the pungi-fedora config here, but it's untested and I'm not sure if I did the variable substitution correctly. https://pagure.io/pungi-fedora/pull-request/678 Jonathan signature.asc Description: This is a digitally signed message part ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org
Enabling zchunk metadata generation in F30
Createrepo_c in F30 has finally grown zchunk support and I've packaged up some zdicts that we can use for F30/rawhide, so I'd love to see us start building zchunk metadata for F30.

To enable zchunk metadata generation, whichever systems are running createrepo_c/mergerepo_c for Rawhide would need createrepo_c-0.12 and fedora-repo-zdicts installed. The call to createrepo_c or mergerepo_c (whichever is run last to generate the final metadata) would need to be run with the new zchunk arguments:

--zck --zck-dict-dir=/usr/share/fedora-repo-zdicts/f30

Mergerepo doesn't require zchunk metadata in the source repositories to be able to generate zchunk metadata for the merged repository.

I'm not sure who to ask to turn these flags on, so if there's an individual I need to ping, please point me in the right direction.

A huge thank you to Daniel Mach for reviewing and merging the createrepo_c zchunk pull request, Jaroslav Mracek for building createrepo_c with zchunk, Robert-André Mauchin for reviewing fedora-repo-zdicts and Neal Gompa for keeping things moving forward with the PR reviews.

Thanks,
Jonathan
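As a rough illustration of the arguments described above, here is a minimal Python sketch that builds the extra createrepo_c/mergerepo_c arguments for a given release directory. The `--zck` and `--zck-dict-dir` flags and the dictionary root come from the message; the function names and the `/srv/repo` path are hypothetical, and the sketch only assembles an argument list rather than running anything.

```python
def zchunk_args(release, dict_root="/usr/share/fedora-repo-zdicts"):
    """Return the extra zchunk arguments for one release.

    `release` is the directory name under dict_root, for example "f30".
    """
    if not release or "/" in release:
        raise ValueError(f"unexpected release name: {release!r}")
    return ["--zck", f"--zck-dict-dir={dict_root}/{release}"]


def createrepo_command(repo_path, release):
    # Only builds the argument list; executing it is left to the caller.
    return ["createrepo_c", *zchunk_args(release), repo_path]


# "/srv/repo" is a made-up repository path used purely for illustration.
print(" ".join(createrepo_command("/srv/repo", "f30")))
```

Keeping the release directory name as an explicit parameter makes it easy to see which dictionary directory a given compose will look in.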
Best path for Fedora zchunk dictionaries
I'm currently in the process of packaging up fedora-repo-zdicts[1], a package which will contain the zchunk dictionaries for all active Fedora releases.

When running createrepo_c or mergerepo_c with zchunk support, the directory containing the zdicts is passed in and createrepo_c will choose the right zdict for each metadata file. The form of that directory is /usr/share/fedora-repo-dicts/ followed by a release-specific component that identifies the release that the metadata is being generated for.

My question is what that release component should actually look like. This is very Fedora specific, so I want to choose whatever the easiest variable is for infra to pass to createrepo_c or mergerepo_c.

Currently it is set to PLATFORM_ID in /etc/os-release (so, platform:fedora-30 for Rawhide), but that's probably overly generic. What would be a better pattern for it? fedora-30? f30? Just 30?

Jonathan
Re: Patch review request: zchunk patches for dnf, libsolv and librepo
On Tue, 2018-06-12 at 12:21 +0300, Jonathan Dieter wrote: > I would love to get these changes into Fedora 29, and the code is > testable now, but with only three weeks until System-Wide change > proposals are due, I'm not sure if I'm being ambitious. FWIW, I have a COPR available for F28 and Rawhide with zchunk-enabled dnf/libdnf and the supporting libraries. https://copr.fedorainfracloud.org/coprs/jdieter/dnf-zchunk/ Obviously, you'll need a zchunk-enabled repository to test it. Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/6SICXCLNH37TIIBZOKHHN2MFE7METTOX/
Re: [Rpm-ecosystem] Patch review request: zchunk patches for dnf, libsolv and librepo
On Tue, 2018-06-12 at 05:24 -0400, Neal Gompa wrote: > On Tue, Jun 12, 2018 at 5:21 AM Jonathan Dieter wrote: > > > > I've finally finished writing patches to integrate zchunk support into > > dnf/libsolv/librepo[1], and I'd greatly appreciate some code review. A > > vast majority of the code is in librepo, but libsolv has been expanded > > to support zchunk files and dnf has a tiny patch that passes the base > > cache directory to librepo to find source zchunk files to delta > > against. > > > > This is awesome, but we're missing patches for libdnf and > createrepo_c. PackageKit and microdnf rely on libdnf for all of this, > and no one can create zck rpmmd without a suitably enhanced > createrepo_c. Could you please make PRs against both for that? :) And I've done libdnf: https://github.com/rpm-software-management/libdnf/pull/478 Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/VXD63P7HB2BGPVLU7MVKB42TNU3OFBDJ/
Re: [Rpm-ecosystem] Patch review request: zchunk patches for dnf, libsolv and librepo
On Tue, 2018-06-12 at 05:24 -0400, Neal Gompa wrote: > On Tue, Jun 12, 2018 at 5:21 AM Jonathan Dieter wrote: > > > > I've finally finished writing patches to integrate zchunk support into > > dnf/libsolv/librepo[1], and I'd greatly appreciate some code review. A > > vast majority of the code is in librepo, but libsolv has been expanded > > to support zchunk files and dnf has a tiny patch that passes the base > > cache directory to librepo to find source zchunk files to delta > > against. > > > > This is awesome, but we're missing patches for libdnf and > createrepo_c. PackageKit and microdnf rely on libdnf for all of this, > and no one can create zck rpmmd without a suitably enhanced > createrepo_c. Could you please make PRs against both for that? :) Here's createrepo_c. It was failing the python testcases and I wanted those fixed before putting out the pull request. https://github.com/rpm-software-management/createrepo_c/pull/92 I'm working on libdnf. Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/XSSVYKSXZA5SQBF57ZSZMSM6ZYKB4LBI/
Patch review request: zchunk patches for dnf, libsolv and librepo
I've finally finished writing patches to integrate zchunk support into dnf/libsolv/librepo[1], and I'd greatly appreciate some code review. A vast majority of the code is in librepo, but libsolv has been expanded to support zchunk files and dnf has a tiny patch that passes the base cache directory to librepo to find source zchunk files to delta against. With these patches and a zchunk-enabled repository, you will download only the differences in your primary/filelists/other metadata. Partial downloads are validated and then continued, and each chunk is validated as it's downloaded, ending the download and moving to a new mirror if any chunk is corrupt. I would love to get these changes into Fedora 29, and the code is testable now, but with only three weeks until System-Wide change proposals are due, I'm not sure if I'm being ambitious. On another note, I would also like to finalize zchunk's API and make a stable ABI promise, but before I take that step I'd really love some feedback on its usability. Jonathan [1] https://github.com/rpm-software-management/dnf/pull/1107 https://github.com/openSUSE/libsolv/pull/270 https://github.com/rpm-software-management/librepo/pull/127 ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/KLY7TOD6JCBRLVD76NYDYBE27LNOENOB/
Old deltarpms are being thrown away on each compose (was Re: dnf and deltarpm)
On Thu, 2018-05-31 at 22:34 +0100, Tomasz Kłoczko wrote: > Just checked on few mirrors usual location of f28 updates > (/pub/linux/dist/fedora/linux/updates/28/Everything/x86_64/drpms) and > in this directory there are at the moment only 56 files from May 31 > and nothing older. So not two drpm per RPM package but only generated > files out of last batch of updates. (CC'ing the Fedora infrastructure list) This is a bug. It looks like we're throwing away any drpms not generated in this compose, when we should be keeping them. I've just created: https://pagure.io/fedora-infrastructure/issue/7008 Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/DSHBJB2D3ZSXMSFIZY7MZELKSI6PUAMA/
Librepo/dnf zchunk integration question
Zchunk works by comparing an old version of the file with the one you want to download, but when dnf refreshes a repository, it downloads the new file into a temporary directory with no information passed to the handle about where the old files are. I've been trying to keep my code changes in libsolv and librepo to make zchunk integration as universal as possible. Up until now, I have managed to do so without changing librepo's API, but I don't see any way to fix this except to have dnf pass information about the old directory (or, even better, the cache directory) to the handle, which will mean an API change. It would also mean that other utilities would probably need to do the same. Is there something I'm missing in dnf's interaction with librepo that would allow me to work around this, or do I just need to bite the bullet and propose a librepo API change? Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/P4ODUK7EMQCOZ5YXULE4KEP2XFVO6KKL/
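To make the dependency on the old file concrete, here is a minimal Python sketch of the comparison step: given the chunk list of a locally cached file and the chunk list of the new file, it works out which byte ranges still need to be fetched. This is a simplified illustration, not the librepo or zchunk implementation; the `Chunk` type and `plan_download` function are hypothetical names, and offsets are assumed to be relative to the start of the chunk data.

```python
from typing import List, NamedTuple, Tuple


class Chunk(NamedTuple):
    checksum: bytes  # checksum of the compressed chunk
    length: int      # compressed length in bytes


def plan_download(old: List[Chunk], new: List[Chunk]) -> Tuple[List[Tuple[int, int]], int]:
    """Return (ranges, reused_bytes) for fetching `new` given a local `old`.

    Ranges are (start, end) byte offsets with end exclusive. Adjacent
    missing chunks are merged so that fewer ranges are requested.
    """
    have = {chunk.checksum for chunk in old}
    ranges: List[Tuple[int, int]] = []
    reused = 0
    offset = 0
    for chunk in new:
        end = offset + chunk.length
        if chunk.checksum in have:
            reused += chunk.length          # already on disk, copy locally
        elif ranges and ranges[-1][1] == offset:
            ranges[-1] = (ranges[-1][0], end)  # extend the previous range
        else:
            ranges.append((offset, end))
        offset = end
    return ranges, reused
```

Without a path to the old file there is nothing to build the `old` list from, which is why the cache directory has to reach the download handle somehow.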
Re: Zchunk update
I've released zchunk-0.4.0 which has the last (hopefully) backwards-incompatible file format change. Files created by zchunk < 0.4.0 will be unreadable by 0.4.0+. Zchunk 0.4.0 now has four bytes of flags, so, barring any bone-headed disasters in the file format, any further file format changes will be backwards-compatible.

The latest release is available here:
https://github.com/jdieter/zchunk/archive/0.4.0.tar.gz

The file format is documented here:
https://github.com/jdieter/zchunk/blob/master/zchunk_format.txt

A copr with the latest release (and zchunk-enabled createrepo_c) is here:
https://copr.fedorainfracloud.org/coprs/jdieter/zchunk

My next step is to add zchunk support to librepo.

A quick summary of the features I wanted to add:

On Mon, 2018-04-16 at 15:47 +0300, Jonathan Dieter wrote:
> * A python API

Still needs to be done.

> * GPG signatures in addition to (possibly replacing) overall data
>   checksum

Signatures have now been added to the file format in addition to the overall checksum. The current implementation can't actually read or add a signature, though.

> * An expiry field? (I'm obviously thinking about signed repodata here)

As per feedback, this isn't necessary.

> * Tests
> * More tests

The framework is in place for this, and I have added a single test case. More to come.

> * Other arch testing (it's currently only tested on x86_64)

I've built and tested on ARM, ppc64le, i686 and x86_64 and everything seems to be working just fine. I have not yet tested on aarch64.

Jonathan
Re: [Rpm-ecosystem] Zchunk update
On Mon, 2018-04-23 at 00:27 -0400, Neal Gompa wrote:
> On Tue, Apr 17, 2018 at 3:05 PM, Jonathan Dieter <jdie...@gmail.com> wrote:
> > I'm assuming that you're referring here to getting zchunk packaged into
> > Fedora. I'd really like to finalize the file format (we're close, but
> > I still need a good way of storing signatures in it) and the download
> > API before releasing it into Fedora proper.
> >
>
> I'm looking forward to this!

I've updated the file format to allow for multiple signatures, updated the zchunk code to recognize the existence of a signature (while still not checking it), and have released as zchunk-0.3.0 in COPR. I've also added in 32 bits of flags that we can use to extend the format in a backwards-compatible way.

The current zchunk format description is at:
https://github.com/jdieter/zchunk/blob/master/zchunk_format.txt

> > I would recommend using the dicts mentioned above as they give me over
> > 40% space savings for both other.xml.zck and primary.xml.zck. Do
> > please let me know if you run into any problems.
> >
>
> Are those dictionaries Fedora specific? If so, how can other
> distributions generate similar ones? If not, still, how were they
> made? :)

They were generated from Fedora metadata, but they should help with any distribution's repodata.

I generated them by splitting a few days' worth of metadata along package boundaries, stripping out any checksums, and then running zstd --train * on the directory containing the split metadata.

The script I used is available at https://www.jdieter.net/downloads/zchunk-dicts/split.py, and I hope to write up proper instructions at some point.

Jonathan
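The preparation step described above (split along package boundaries, strip checksums, then train) could look roughly like the following Python sketch. This is not the split.py script linked in the message; it assumes that each package is a `<package ...>...</package>` element and that checksums appear as `<checksum ...>...</checksum>` elements, and the function names and output file naming are hypothetical. The resulting directory would then be fed to `zstd --train *` as described.

```python
import re
from pathlib import Path

# Assumed element shapes; real metadata may need more careful XML parsing.
PACKAGE_RE = re.compile(r"<package\b.*?</package>", re.DOTALL)
CHECKSUM_RE = re.compile(r"<checksum\b[^>]*>.*?</checksum>", re.DOTALL)


def split_packages(xml_text):
    """Return one string per <package> element with checksum elements removed."""
    return [CHECKSUM_RE.sub("", match.group(0))
            for match in PACKAGE_RE.finditer(xml_text)]


def write_samples(samples, out_dir):
    """Write each sample to its own file so a dictionary trainer can read them."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for number, sample in enumerate(samples):
        (out / f"sample-{number:06d}.xml").write_text(sample, encoding="utf-8")
    return len(samples)
```

Checksums are stripped because they are effectively random and would only add noise to the training samples.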
Re: Zchunk update
On Tue, 2018-04-17 at 17:39 +0200, Michal Novotny wrote: > On Tue, Apr 17, 2018 at 4:20 PM, Jonathan Dieter <jdie...@gmail.com> wrote: > > On Tue, 2018-04-17 at 09:08 +0200, Michal Novotny wrote: > > > Hello Jonathan, > > > > > > Once it is in createrepo_c, we could try employing it in Fedora COPR. > > > > Ok, done. This copr currently has zchunk and createrepo_c in it. I > > did have to disable the python tests for createrepo_c which means I > > probably wouldn't use the python bindings with this release. > > > > https://copr.fedorainfracloud.org/coprs/jdieter/zchunk/ > > > > To enable zchunk creation, run createrepo_c --zck. I've created > > dictionaries that are appropriate for Fedora's metadata at > > https://www.jdieter.net/downloads/zchunk-dicts, and they can be used > > with --zck-primary-dict, --zck-filelists-dict and --zck-other-dict. > > > > To make zchunk downloads efficient, the same dictionary must be used > > each time metadata is generated. Dictionaries aren't mandatory, but > > they greatly reduce the size of the compressed metadata. > > Alright, I will deploy it on staging. But we will need to get it into > Fedora's > DistGit first to be able to use it on COPR production instance afterwards... > Anyway, looking forward to start experimenting with it. > > Thank you! I'm assuming that you're referring here to getting zchunk packaged into Fedora. I'd really like to finalize the file format (we're close, but I still need a good way of storing signatures in it) and the download API before releasing it into Fedora proper. I would recommend using the dicts mentioned above as they give me over 40% space savings for both other.xml.zck and primary.xml.zck. Do please let me know if you run into any problems. Thanks, Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Re: Zchunk update
On Tue, 2018-04-17 at 09:08 +0200, Michal Novotny wrote: > Hello Jonathan, > > On Mon, Apr 16, 2018 at 2:47 PM, Jonathan Dieter <jdie...@gmail.com> > wrote: > > It's been a number of weeks since my last update, so I thought I'd > > let > > everyone know where things are at. > > > > I've spent most of these last few weeks reworking zchunk's API to > > make > > it easier to use and more in line with what other compression tools > > use, and I'm mostly happy with it now. Writing a simple zchunk > > file > > can be done in a few lines of code, while reading one is also > > simple. > > I've also added zchunk support to createrepo_c (see > > https://github.com/jdieter/createrepo_c), but I haven't yet created > > a > > pull request because I'm not sure if my current implementation is > > the > > best method. My current effort only zchunks primary.xml, > > filelists.xml > > and other.xml and doesn't change the sort order. > > Once it is in createrepo_c, we could try employing it in Fedora COPR. Ok, done. This copr currently has zchunk and createrepo_c in it. I did have to disable the python tests for createrepo_c which means I probably wouldn't use the python bindings with this release. https://copr.fedorainfracloud.org/coprs/jdieter/zchunk/ To enable zchunk creation, run createrepo_c --zck. I've created dictionaries that are appropriate for Fedora's metadata at https://www.jdieter.net/downloads/zchunk-dicts, and they can be used with --zck-primary-dict, --zck-filelists-dict and --zck-other-dict. To make zchunk downloads efficient, the same dictionary must be used each time metadata is generated. Dictionaries aren't mandatory, but they greatly reduce the size of the compressed metadata. Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Re: [Rpm-ecosystem] Zchunk update
On Mon, 2018-04-16 at 09:00 -0400, Neal Gompa wrote: > On Mon, Apr 16, 2018 at 8:47 AM, Jonathan Dieter <jdie...@gmail.com> wrote: > > I've also added zchunk support to createrepo_c (see > > https://github.com/jdieter/createrepo_c), but I haven't yet created a > > pull request because I'm not sure if my current implementation is the > > best method. My current effort only zchunks primary.xml, filelists.xml > > and other.xml and doesn't change the sort order. > > > > Fedora COPR, Open Build Service, Mageia, and openSUSE also append > AppStream data to repodata to ship AppStream information. Is there a > way we can incorporate this into zck rpm-md? There's been an issue for > a while to support generating the AppStream metadata as part of the > createrepo_c run using the libappstream-builder library[1], which may > lend itself to doing this properly. Is it repomd.xml that actually gets changed or primary.xml / filelists.xml / other.xml? If it's repomd.xml, then it really shouldn't make any difference because I'm not currently zchunking it. As far as I can see, the only reason to zchunk it would be to have an embedded GPG signature once they're supported in zchunk. > > The one area of zchunk that still needs some API work is the download > > and chunk merge API, and I'm planning to clean that up as I add zchunk > > support to librepo. > > > > Some things I'd still like to add to zchunk: > > * A python API > > * GPG signatures in addition to (possibly replacing) overall data > >checksum > > I'd rather not lose checksums, but GPG signatures would definitely be > necessary, as openSUSE needs them, and we'd definitely like to have > them in Fedora[2], COPR[3], and Mageia[4]. Fair enough. Would we want zchunk to support multiple GPG signatures or is one enough? > > * An expiry field? (I'm obviously thinking about signed repodata here) > > Do we need an expiry field if we properly processed the key > revocation/expiration in librepo? 
My understanding is that current > hiccup with it is that we don't, and that the GPG keyring used in > librepo is independent of the RPM keyring (which it shouldn't be). Ah, that makes sense. Forget that idea then. Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Proposed zchunk file format - V4
Here's version four with a swap from fixed-length integers to variable-length compressed integers, which allows us to skip compression of the index (since the non-integer data is all incompressible checksums). I've also added the uncompressed size of each chunk to the index to make it easier to figure out how much space to allocate for the uncompressed chunk.

+-+-+-+-+-+--------------------+=================+-----------------------+
|   ID    | Checksum type (ci) | Header checksum | Compression type (ci) |
+-+-+-+-+-+--------------------+=================+-----------------------+

+=================+=======+=================+
| Index size (ci) | Index | Compressed Dict |
+=================+=======+=================+

+=======+=======+
| Chunk | Chunk | ==> More chunks
+=======+=======+

(ci)
 Compressed (unsigned) integer - A variable-length little-endian integer where the first seven bits of the number are stored in the first byte, followed by the next seven bits in the next byte, and so on. The top bit of all bytes except the final byte must be zero, and the top bit of the final byte must be one, indicating the end of the number.

ID
 '\0ZCK1', identifies file as zchunk version 1 file

Checksum type
 This is an 8-bit unsigned integer containing the type of checksum used to generate the header checksum and the total data checksum, but *not* the chunk checksums.

 Current values:
   0 = SHA-1
   1 = SHA-256

Header checksum
 This is the checksum of everything from the beginning of the file until the end of the index when the header checksum is all \0's.

Compression type
 This is an integer containing the type of compression used to compress dict and chunks.

 Current values:
   0 - Uncompressed
   2 - zstd

Index size
 This is an integer containing the size of the index.

Index
 This is the index, which is described in the next section.

Compressed Dict (optional)
 This is a custom dictionary used when compressing each chunk. Because each chunk is compressed completely separately from the others, the custom dictionary gives us much better overall compression. The custom dictionary is compressed without a custom dictionary (for obvious reasons).

Chunk
 This is a chunk of data, compressed with the custom dictionary provided above.

The index:

+==========================+==================+===============+
| Chunk checksum type (ci) | Chunk count (ci) | Data checksum |
+==========================+==================+===============+

+===============+==================+===============================+
| Dict checksum | Dict length (ci) | Uncompressed dict length (ci) |
+===============+==================+===============================+

+----------------+===================+==========================+
| Chunk checksum | Chunk length (ci) | Uncompressed length (ci) | ...
+----------------+===================+==========================+

Chunk checksum type
 This is an integer containing the type of checksum used to generate the chunk checksums.

 Current values:
   0 = SHA-1
   1 = SHA-256

Chunk count
 This is a count of the number of chunks in the zchunk file.

Data checksum
 This is the checksum of everything after the index, including the compressed dict and all the compressed chunks. This checksum is generated using the overall checksum type, *not* the chunk checksum type.

Dict checksum
 This is the checksum of the compressed dict, used to detect whether two dicts are identical. If there is no dict, the checksum must be all zeros.

Dict length
 This is an integer containing the length of the dict. If there is no dict, this must be a zero.

Uncompressed dict length
 This is an integer containing the length of the dict after it has been decompressed. If there is no dict, this must be a zero.

Chunk checksum
 This is the checksum of the compressed chunk, used to detect whether any two chunks are identical.

Chunk length
 This is an integer containing the length of the chunk.

Uncompressed length
 This is an integer containing the length of the chunk after it has been decompressed.

The index is designed to be able to be extracted from the file on the server and downloaded separately, to facilitate downloading only the parts of the file that are needed, but must then be re-embedded when assembling the file so the user only needs to keep one file.
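The compressed integer (ci) encoding described in this proposal can be sketched in a few lines of Python. This is an illustration written from the prose description only, not code taken from zchunk; it assumes that "the first seven bits" means the least significant seven bits, and the function names `encode_ci` and `decode_ci` are hypothetical.

```python
def encode_ci(value: int) -> bytes:
    """Encode an unsigned integer, seven bits per byte, low bits first."""
    if value < 0:
        raise ValueError("compressed integers are unsigned")
    out = bytearray()
    while True:
        low = value & 0x7F
        value >>= 7
        if value == 0:
            out.append(low | 0x80)  # final byte: top bit set
            return bytes(out)
        out.append(low)             # more bytes follow: top bit clear


def decode_ci(data: bytes, offset: int = 0):
    """Decode one compressed integer; return (value, next_offset)."""
    value = 0
    shift = 0
    for pos in range(offset, len(data)):
        byte = data[pos]
        value |= (byte & 0x7F) << shift
        if byte & 0x80:             # top bit set marks the last byte
            return value, pos + 1
        shift += 7
    raise ValueError("truncated compressed integer")
```

Under these assumptions, values below 128 take a single byte, so the small lengths that dominate an index stay compact without compressing the index itself.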
Zchunk update
It's been a number of weeks since my last update, so I thought I'd let everyone know where things are at. I've spent most of these last few weeks reworking zchunk's API to make it easier to use and more in line with what other compression tools use, and I'm mostly happy with it now. Writing a simple zchunk file can be done in a few lines of code, while reading one is also simple. I've also added zchunk support to createrepo_c (see https://github.com/jdieter/createrepo_c), but I haven't yet created a pull request because I'm not sure if my current implementation is the best method. My current effort only zchunks primary.xml, filelists.xml and other.xml and doesn't change the sort order. The one area of zchunk that still needs some API work is the download and chunk merge API, and I'm planning to clean that up as I add zchunk support to librepo. Some things I'd still like to add to zchunk: * A python API * GPG signatures in addition to (possibly replacing) overall data checksum * An expiry field? (I'm obviously thinking about signed repodata here) * Tests * More tests * Other arch testing (it's currently only tested on x86_64) I'd welcome any feedback or flames. Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Re: Proposed zchunk file format - V3
CC'ing fedora-infrastructure, as I think they got lost somewhere along the way. On Tue, 2018-03-20 at 17:04 +0100, Michal Domonkos wrote: > Yeah, the level doesn't really matter much. My point was, as long as > we chunk, some of the data that we will be downloading we will already > have locally. Typically (according to mdhist), it seems that package > updates are more common than new additions, so we won't be reusing the > unchanged parts of package tags. But that's inevitable if we're > chunking. Ok, I see your point, and you're absolutely right. > > The beauty of the zchunk format (or zsync, or any other chunked format) > > is that we don't have to download different files based on what we > > have, but rather, we download either fewer or more parts of the same > > file based on what we have. From the server side, we don't have to > > worry about the deltas, and the clients just get what they need. > > +1 > > Simplicity is key, I think. Even at the cost of not having the > perfectly efficient solution. The whole packaging stack is already > complicated enough. +1000 on that last! > While I'm not completely sure about application-specific boundaries > being superior to buzhash (used by casync) in terms of data savings, > it's clear that using http range requests and concatenating the > objects together in a smart way (as you suggested previously) to > reduce the number of HTTP requests is a good move in the right > direction. Just to be clear, zchunk *could* use buzhash. There's no rule about where the boundaries need to be, only that the application creating the zchunk file is consistent. I'd actually like to make the command-line utility use buzhash, but I'm trying to keep the code BSD 2-clause, so I can't just lift casync's buzhash code, and I haven't had time to write that part myself. Currently zck.c has a really ugly if statement that chooses a division based on string matching if it's true and a really naive inefficient rolling hash if it's false. 
If you wanted to contribute buzhash, I'd happily take it! > BTW, in the original thread, you mentioned a reduction of 30-40% when > using casync. I'm wondering, how did you measure it? I saw chunk > reuse ranging from 80% to 90% per metadata update, which seemed quite > optimistic. What I did was: > > $ casync make snap1.caidx /path/to/repodata/snap1 > $ casync make --verbose snap2.caidx /path/to/repodata/snap2 > > Reused chunks: X (Y%) > IIRC, I went into the web server logs and measured the number of bytes that casync actually downloaded as compared to the gzip size of the data. Thanks so much for your interest! Jonathan ___ infrastructure mailing list -- infrastructure@lists.fedoraproject.org To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Re: Initial pre-alpha version of zchunk available for testing and comments
On Thu, 2018-03-22 at 11:55 +0200, Jonathan Dieter wrote:
> I've got a working zchunk library, complete with some utilities at
> https://github.com/jdieter/zchunk, but I wanted to get some feedback
> before I went much further. Its only dependencies are libcurl and
> (optionally, but very heavily recommended) libzstd.

While I'm thinking about it, it uses meson as its build system, so you'll need that too.

Jonathan
Initial pre-alpha version of zchunk available for testing and comments
I've got a working zchunk library, complete with some utilities at https://github.com/jdieter/zchunk, but I wanted to get some feedback before I went much further. Its only dependencies are libcurl and (optionally, but very heavily recommended) libzstd.

There are test files in https://www.jdieter.net/downloads/zchunk-test, and the dictionary I used is in https://www.jdieter.net/downloads.

What works:
* Creating zchunk files (using zck)
* Reading zchunk files (using unzck)
* Downloading zchunk files (using zckdl)

What doesn't:
* Resuming zchunk downloads
* Using any of the tools to overwrite a file
* Automatic detection of the maximum number of ranges in a request
* Streaming chunking in the library

The main thing I want to ask for advice on is the last item on that last list. Currently, every piece of data sent to zck_compress() is treated as a new chunk. I'd prefer to have zck_compress() just keep streaming data and have a zck_end_chunk() function that ends the current chunk, but zstd doesn't support streamed compression with a dict in its dynamic library. You have to use zstd's static library to get that function (because it's not seen as stable yet).

Any suggestions on how to deal with this? Should I require the static library, write my own wrapper that buffers the streamed data until zck_end_chunk() is called, or just require each chunk to be sent in its entirety?

Jonathan
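The second option in the question above, a wrapper that buffers streamed data until the chunk is ended, can be sketched as follows. This is a Python illustration of the idea only, not zchunk's C API: the `ChunkWriter` class is hypothetical, and `zlib.compress` is used purely as a stand-in for one-shot zstd compression with a dictionary.

```python
import zlib


class ChunkWriter:
    """Buffer streamed writes and compress each chunk in one shot."""

    def __init__(self, compress_chunk):
        # compress_chunk: any callable taking bytes and returning bytes.
        self._compress_chunk = compress_chunk
        self._pending = bytearray()
        self.chunks = []

    def write(self, data: bytes) -> None:
        # Nothing is compressed yet; data is only accumulated.
        self._pending += data

    def end_chunk(self) -> None:
        # Compress everything buffered so far as a single chunk.
        if self._pending:
            self.chunks.append(self._compress_chunk(bytes(self._pending)))
            self._pending.clear()


# zlib is a stand-in here; it is not what zchunk uses.
writer = ChunkWriter(zlib.compress)
writer.write(b"<package>first half ")
writer.write(b"second half</package>")
writer.end_chunk()
print(len(writer.chunks))
```

The trade-off is memory: a whole chunk is held in the buffer until it is ended, which is acceptable when chunks are small but would matter for very large ones.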
Proposed zchunk file format - V2
Neal, thanks for the feedback. After taking your comments into consideration, here's version 2.

+-+-+-+-+-+------------------+-+-+-+-+-+-+-+-+
|   ID    | Compression type |  Index size   |
+-+-+-+-+-+------------------+-+-+-+-+-+-+-+-+

+==================+=================+
| Compressed Index | Compressed Dict |
+==================+=================+

+=======+=======+
| Chunk | Chunk | ==> More chunks
+=======+=======+

ID
    '\0ZCK1', identifies file as zchunk version 1 file

Compression type
    Type of compression used to compress dict and chunks
    Current values:
      0 - Uncompressed
      2 - zstd

Index size
    This is a 64-bit unsigned integer containing the size of the compressed index.

Compressed Index
    This is the index, which is described in the next section. The index is compressed without a custom dictionary.

Compressed Dict (optional)
    This is a custom dictionary used when compressing each chunk. Because each chunk is compressed completely separately from the others, the custom dictionary gives us much better overall compression. The custom dictionary is compressed without a custom dictionary (for obvious reasons).

Chunk
    This is a chunk of data, compressed with the custom dictionary provided above.

The index:

+---------------+======================+
| Checksum type | Checksum of all data |
+---------------+======================+

+----------------+-+-+-+-+-+-+-+-+
| Dict checksum  |  End of dict  |
+----------------+-+-+-+-+-+-+-+-+

+----------------+-+-+-+-+-+-+-+-+
| Chunk checksum | End of chunk  | ==> More
+----------------+-+-+-+-+-+-+-+-+

Checksum type
    This is the type of checksum used to generate the checksums in the index.
    Current values:
      0 - SHA-256

Checksum of all data
    This is the checksum of the compressed dict and all the compressed chunks, used to verify that the file is actually the same, even in the unlikely event of a hash collision for one of the chunks.

Dict checksum
    This is the checksum of the compressed dict, used to detect whether two dicts are identical. If there is no dict, the checksum must be all zeros.

End of dict
    This is the location of the end of the dict starting from the end of the index. This gives us the information we need to find and decompress the dict.
Chunk checksum
    This is the checksum of the compressed chunk, used to detect whether any two chunks are identical.

End of chunk
    This is the location of the end of the chunk starting from the end of the index. This gives us the information we need to find and decompress each chunk.

The index is designed so that it can be extracted from the file on the server and downloaded separately, to facilitate downloading only the parts of the file that are needed, but it must then be re-embedded when assembling the file so the user only needs to keep one file.
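As a reading aid, the fixed-size header of the V2 layout could be packed and parsed as in the Python sketch below. The proposal gives the ID and says the index size is a 64-bit unsigned integer, but it does not state the width of the compression type field or the byte order, so the one-byte field and little-endian order here are assumptions, and `pack_header()` and `parse_header()` are hypothetical names.

```python
import struct

MAGIC = b"\0ZCK1"                 # 5-byte ID from the proposal
# Assumed layout: 5-byte ID, 1-byte compression type, 64-bit index size.
# The 1-byte width and the little-endian byte order are assumptions.
HEADER = struct.Struct("<5sBQ")

COMP_NONE = 0                     # "Uncompressed" in the proposal
COMP_ZSTD = 2                     # "zstd" in the proposal


def pack_header(comp_type, index_size):
    """Serialize the ID, compression type and compressed index size."""
    return HEADER.pack(MAGIC, comp_type, index_size)


def parse_header(data):
    """Return (compression type, index size) or raise ValueError."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    magic, comp_type, index_size = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("not a zchunk version 1 file")
    if comp_type not in (COMP_NONE, COMP_ZSTD):
        raise ValueError(f"unknown compression type {comp_type}")
    return comp_type, index_size
```

With the index size known from the header, a reader can locate the end of the compressed index, and every "end of" offset in the index is then measured from that point.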
Proposed zchunk file format
So here's my proposed file format for the zchunk file. Should I add some flags to facilitate possible different compression formats?

+-+-+-+-+-+-+-+-+-+-+-+-+==================+=================+
|  ID   |  Index size   | Compressed Index | Compressed Dict |
+-+-+-+-+-+-+-+-+-+-+-+-+==================+=================+

+=======+=======+
| Chunk | Chunk | ==> More chunks
+=======+=======+

ID
    '\0ZCK', identifies file as zchunk file

Index size
    This is a 64-bit unsigned integer containing the size of the compressed index.

Compressed Index
    This is the index, which is described in the next section. The index is compressed using standard zstd compression without a custom dictionary.

Compressed Dict
    This is a custom dictionary used when compressing each chunk. Because each chunk is compressed completely separately from the others, the custom dictionary gives us much better overall compression. The custom dictionary is compressed using standard zstd compression without using a separate custom dictionary (for obvious reasons).

Chunk
    This is a chunk of data, compressed using zstd with the custom dictionary provided above.

The index:

+----------------+-+-+-+-+-+-+-+-+
|   sha256sum    |  End of dict  |
+----------------+-+-+-+-+-+-+-+-+

+----------------+-+-+-+-+-+-+-+-+
|   sha256sum    | End of chunk  | ==> More
+----------------+-+-+-+-+-+-+-+-+

sha256sum of compressed dict
    This is a binary sha256sum of the compressed dict, used to detect whether two dicts are identical.

End of dict
    This is the location of the end of the dict with 0 being the end of the index. This gives us the information we need to find and decompress the dict.

sha256sum of compressed chunk
    This is a binary sha256sum of the compressed chunk, used to detect whether any two chunks are identical.

End of chunk
    This is the location of the end of the chunk with 0 being the end of the index. This gives us the information we need to find and decompress each chunk.
The index is designed so that it can be extracted from the file on the server and downloaded separately, to facilitate downloading only the parts of the file that are needed, but it must then be re-embedded when assembling the file so the user only needs to keep one file.
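Because the index stores only where each entry ends, a reader recovers each start offset from the previous entry's end. The Python sketch below builds and walks such an index under stated assumptions: the proposal does not give the width or byte order of the "end of" fields, so a 64-bit little-endian integer is assumed, and `build_index()` and `entry_ranges()` are hypothetical helper names.

```python
import hashlib
import struct

# Binary sha256sum followed by an end offset; the 64-bit little-endian
# offset is an assumption, the proposal does not specify it.
ENTRY = struct.Struct("<32sQ")


def build_index(compressed_dict, compressed_chunks):
    """Return uncompressed index bytes: the dict entry, then one entry per chunk."""
    index = b""
    end = 0                                   # 0 is the end of the index
    for blob in [compressed_dict, *compressed_chunks]:
        end += len(blob)
        index += ENTRY.pack(hashlib.sha256(blob).digest(), end)
    return index


def entry_ranges(index):
    """Yield (sha256 digest, start, end) for the dict and then each chunk."""
    start = 0
    for digest, end in ENTRY.iter_unpack(index):
        yield digest, start, end
        start = end
```

The first range returned belongs to the dict and the rest to the chunks, all relative to the end of the index.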
Re: A proof-of-concept for delta'ing repodata
On Tue, 2018-02-13 at 10:52 +0100, Igor Gnatenko wrote:
> What about zstd? Also in latest version of lz4 there is support for
> dictionaries too.

So I've investigated zstd, and here are my results:

Latest F27 primary.gz - 3.1MB
zlib zchunk (including custom dict) primary.zck - 4.2MB, ~35% increase
zstd zchunk (including dict generated from last three Fedora GA primaries) primary.zck - 3.7MB, ~20% increase

Using zstd for filelists.xml has roughly the same increase as with zlib, which is expected as the chunks are larger and thus get better compression even without a dict.

I did also look briefly at lz4, but it seems that its major advantage is speed, and I'm not sure that metadata decompression speed is our main bottleneck in dnf.

With these numbers, I think it makes sense to move forward with zstd instead of zlib.

Jonathan
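For anyone checking the percentages, they are relative to the 3.1MB primary.gz baseline. A minimal Python sketch of the arithmetic, using only the sizes quoted above (`percent_increase()` is just an illustrative helper):

```python
def percent_increase(baseline_mb, new_mb):
    """Size increase of new_mb over baseline_mb, in percent."""
    return (new_mb - baseline_mb) / baseline_mb * 100


gz = 3.1  # latest F27 primary.gz, in MB
for label, size in [("zlib zchunk", 4.2), ("zstd zchunk", 3.7)]:
    print(f"{label}: {size}MB, {percent_increase(gz, size):.0f}% larger than primary.gz")
```

This gives 35% for zlib and 19% for zstd, in line with the rounded figures in the message.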
Re: A proof-of-concept for delta'ing repodata
On Wed, 2018-02-14 at 09:56 -0800, Kevin Fenzi wrote:
> ...snip...
>
> I think it sounds interesting, but you should get buyin from dnf folks
> and/or PackageKit folks and see if they can agree to use this format.

Do you know if there's a dedicated list for dnf or PackageKit development (a quick Google search didn't turn up anything), or should I communicate with them directly? If the latter, can you point me to the right people?

> I also agree just adding it as a new file while leaving the rest alone
> sounds good as a way to migrate only those things that know to look for
> the new file when it exists, etc.

+1

Jonathan
Re: A proof-of-concept for delta'ing repodata
On Tue, 2018-02-13 at 10:52 +0100, Igor Gnatenko wrote:
> On Mon, 2018-02-12 at 23:53 +0200, Jonathan Dieter wrote:
> >  * Many changes to the metadata can mean a large number of ranges
> >    requested.  I ran a check on our mirrors, and three (out of around
> >    150 that had the file I was testing) don't honor range requests at
> >    all, and three others only honor a small number in a single request.
> >    A further seven didn't respond at all (not sure if that had
> >    anything to do with the range requests), and the rest supported
> >    between 256 and 512 ranges in a single request.  We can reduce the
> >    number of ranges requested by always ordering our packages by date.
> >    This would ensure that new packages are grouped at the end of the
> >    xml where they will be grabbed in one contiguous range.
>
> This would "break" DNF, because libsolv is assigning Id's by the order of
> packages in metadata. So if something requires "webserver" and there is
> "nginx" and "httpd" providing it (without versions), then lowest Id is
> picked up (not going into details of this). Which means depending on when
> last update for one or other was submitted, users will get different
> results. This is unacceptable from my POV.

That's fair enough, though how hard would it be to change libsolv to assign Id's based on alphabetical order as opposed to metadata order (or possibly reorder the xml before sending it to libsolv)?

To be clear, this optimization would reduce the number of range requests we have to send to the server, but would not hugely change the amount we download, so I don't think it's very high priority.

> >  * Zchunk files use zlib (it gives better compression than xz with such
> >    small chunks), but, because they use a custom zdict, they are not gz
> >    files.  This means that we'll need new tools to read and write them.
> >    (And I am volunteering to do the work here)
>
> What about zstd? Also in latest version of lz4 there is support for
> dictionaries too.

I'll take a look at both of those.

> As being someone who tried to work on this problem I very appreciate what
> you have done here. We've started with using zsync and results were quite
> good, but zsync is dead and has ton of bugs. Also it requires archives to
> be `--rsyncable`. So my question is why not to add idx file as additional
> one for existing files instead of inventing new format? The problem is
> that we will have to distribute in old format too (for compatibility
> reasons).

I'm not sure if it was clear, but I'm basically making --rsyncable archives with more intelligent divisions between the independent blocks, which is why it gives better delta performance... you're not getting *any* redundant data.

I did originally experiment with xz files (a series of concatenated xz files is still a valid xz file), but the files were 20% larger than zlib with custom zdict.

The zdict helps us reduce file size by allowing all the chunks to use the same common strings that will not change (mainly tag names), but custom zdicts aren't allowed by gzip.

I've also toyed with the idea of supporting embedded idx's in zchunk files so we don't have to keep two files for every local zchunk file. We'd still want separate idx files on the webserver, though, otherwise we're looking at an extra http request to get the size of the index in the zchunk. If we embed the index in the file, we must create a new format as we don't want the index concatenated with the rest of the uncompressed file when decompressing.

> I'm not sure if trying to do optimizations by XML tags is very good idea
> especially because I hope that in future we would stop distributing XML's
> and start distributing solv/solvx.

zchunk.py shouldn't care what type of data it's chunking, but it needs to be able to chunk the same way every time. Currently it only knows how to do that with XML, because we can split it based on tag boundaries, and grouping based on source rpm gives us even better compression without sacrificing any flexibility.

dl_zchunk.py and unzchunk.py neither know, nor care what type of file they're working with.

Thanks so much for the feedback, and especially for the pointers to lz4 and zstd. Hopefully they'll get us closer to matching our current gz size.

Jonathan
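The effect of a shared zdict on small, independently compressed chunks can be tried with Python's zlib module. The sketch below is only an illustration: the dictionary contents and the two package entries are made-up sample data (not taken from real repodata), and `compress_chunk()` and `decompress_chunk()` are hypothetical helpers, not functions from the zchunk scripts.

```python
import zlib

# Hypothetical dictionary of strings common to every chunk (mainly tag names).
ZDICT = (b'<package type="rpm"><name></name><arch></arch>'
         b'<version epoch="" ver="" rel=""/></package>')

# Made-up sample chunks, one package entry each.
chunks = [
    b'<package type="rpm"><name>nginx</name><arch>x86_64</arch>'
    b'<version epoch="0" ver="1.12.1" rel="1"/></package>',
    b'<package type="rpm"><name>httpd</name><arch>x86_64</arch>'
    b'<version epoch="0" ver="2.4.29" rel="1"/></package>',
]


def compress_chunk(data, zdict=None):
    """Compress one chunk on its own, optionally primed with a shared dictionary."""
    comp = zlib.compressobj(9, zdict=zdict) if zdict else zlib.compressobj(9)
    return comp.compress(data) + comp.flush()


def decompress_chunk(data, zdict):
    """Decompress a chunk that was written with the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(data) + decomp.flush()


plain = sum(len(compress_chunk(c)) for c in chunks)
shared = sum(len(compress_chunk(c, ZDICT)) for c in chunks)
print(f"without zdict: {plain} bytes, with zdict: {shared} bytes")
```

A chunk written this way can only be decompressed by a reader that supplies the same dictionary, which is why such files are not plain gz files.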
Re: A proof-of-concept for delta'ing repodata
On Mon, 2018-02-12 at 23:53 +0200, Jonathan Dieter wrote:
> <tl;dr>
> I've come up with a method of splitting repodata into chunks that can
> be downloaded and combined with chunks that are already on the local
> system to create a byte-for-byte copy of the compressed repodata.
> Tools and scripts are at:
> https://www.jdieter.net/downloads/
>

I've realized that with this, I didn't really give a proposal as to what comes next.

Proposal:
 * Create a new (Systemwide?) Feature for Fedora 29 called Delta Repodata?
 * Finalize the zchunk file format (and name)
 * Write a C and Python library to generate, read and download zchunk files, and package it into Fedora
 * Add a flag to createrepo_c (and, if we're still using it, createrepo) to generate zchunk repodata
 * Modify DNF so it can download and read zchunk repodata

I don't mind being the person to drive this change, but before I start coding, I'd like some feedback on whether or not this is the direction we want to go in and I'd love any suggestions on how to improve it.

Jonathan
A proof-of-concept for delta'ing repodata
I've come up with a method of splitting repodata into chunks that can be downloaded and combined with chunks that are already on the local system to create a byte-for-byte copy of the compressed repodata. Tools and scripts are at:
https://www.jdieter.net/downloads/

Background:
With DNF, we're currently downloading ~20MB of repository data every time the updates repository changes.

When casync was released, I wondered if we could use it to only download the deltas for the repodata. At Flock last summer, I ran some tests against the uncompressed repodata and saw a reduction of 30-40% from one day to the next, which seemed low, but was a good starting point.

Unfortunately, due to the way casync separates each file into thousands of compressed chunks, building each file required thousands of (serial) downloads which, even on a decent internet connection, took *forever*.

When I talked through the idea with Kevin and Patrick, they also pointed out that our mirrors might not be too keen on the idea of adding thousands of tiny files that change every day.

The Solution(?):
One potential solution to the "multitude of files" problem is to merge the chunks back into a single file, and use HTTP ranges to only download the parts of the file we want. An added bonus is that most web servers are configured to support hundreds of ranges in one request, which greatly reduces the number of requests we have to make.

The other problem with casync is that its chunk separation is naïve, which is why we were only achieving 30-40% savings. But we know what the XML file is supposed to look like, so we can separate the chunks on the tag boundaries in the XML.

So I've ditched casync altogether and put together a proof-of-concept (tentatively named zchunk) that takes an XML file, compresses each tag separately, and then concatenates all of them into one file. The tool also creates an index file that tells you the sha256sum for each compressed chunk and the location of the chunk in the file.
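The chunk-and-index idea can be sketched in a few lines of Python. This is not the code in zchunk.py: it assumes the chunk boundary is the start of each <package> element, uses a regular expression rather than a real XML parser, uses plain zlib without a custom dictionary, and `split_on_packages()` and `build_zchunk()` are hypothetical names.

```python
import hashlib
import re
import zlib


def split_on_packages(xml):
    """Split XML bytes so every chunk after the first starts at a <package> tag.

    Any text before the first <package> becomes its own chunk, and joining
    the chunks gives back the input byte for byte.
    """
    starts = [m.start() for m in re.finditer(rb"<package[ >]", xml)]
    bounds = [0] + [s for s in starts if s > 0] + [len(xml)]
    return [xml[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


def build_zchunk(xml):
    """Return (body, index): concatenated compressed chunks and their index."""
    body = b""
    index = []                       # (sha256 hex digest, start, end) per chunk
    for chunk in split_on_packages(xml):
        compressed = zlib.compress(chunk)
        index.append((hashlib.sha256(compressed).hexdigest(),
                      len(body), len(body) + len(compressed)))
        body += compressed
    return body, index
```

Because each chunk is compressed on its own, a package entry that is unchanged between two versions of the file compresses to the same bytes and therefore has the same checksum in both indexes.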
I've also written a small script that will download a zchunk off the internet. If you don't specify an old file, it will just download everything, but if you specify an old file, it will download the index of the new file and compare the sha256sums of each chunk. Any checksums that match will be taken from the old file, and the rest will be downloaded.

In testing, I've seen savings ranging from 10% (December 17 to today) to 95% (yesterday to today).

Remaining problems:
 * Zchunk files are bigger than their gzip equivalents. This ranges from 5% larger for filelists.xml to 300% larger for primary.xml. This can be greatly reduced by chunking primary.xml based on srpm rather than rpm, which brings the size increase for primary.xml down to roughly 30%.

 * Many changes to the metadata can mean a large number of ranges requested. I ran a check on our mirrors, and three (out of around 150 that had the file I was testing) don't honor range requests at all, and three others only honor a small number in a single request. A further seven didn't respond at all (not sure if that had anything to do with the range requests), and the rest supported between 256 and 512 ranges in a single request. We can reduce the number of ranges requested by always ordering our packages by date. This would ensure that new packages are grouped at the end of the xml where they will be grabbed in one contiguous range.

 * Zchunk files use zlib (it gives better compression than xz with such small chunks), but, because they use a custom zdict, they are not gz files. This means that we'll need new tools to read and write them. (And I am volunteering to do the work here)

The tools:
The proof-of-concept tools are all sitting in https://www.jdieter.net/downloads/zchunk-scripts/

They are full of ugly hacks, especially when it comes to parsing the XML, there's little to no error reporting, and I didn't comment them well at all, but they should work.
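The comparison step described for the download script (reuse chunks whose checksums match, request the rest as byte ranges) could look roughly like the Python sketch below. It is not the code in dl_zchunk.py: `plan_download()` and `range_headers()` are hypothetical names, each index is assumed to be a list of (checksum, start, end) tuples ordered by offset, and the default of 256 ranges per request is taken from the lower bound seen in the mirror check.

```python
def plan_download(old_index, new_index):
    """Split the new file's chunks into those reusable locally and those to fetch.

    Returns (reuse, fetch): reuse pairs each new-file range with the old-file
    range holding the same bytes, fetch is a list of merged (start, end)
    byte ranges that still have to be requested.
    """
    old_by_sum = {checksum: (start, end) for checksum, start, end in old_index}
    reuse, fetch = [], []
    for checksum, start, end in new_index:
        if checksum in old_by_sum:
            reuse.append(((start, end), old_by_sum[checksum]))
        elif fetch and fetch[-1][1] == start:
            fetch[-1] = (fetch[-1][0], end)      # extend the previous range
        else:
            fetch.append((start, end))
    return reuse, fetch


def range_headers(fetch, max_ranges=256):
    """Build HTTP Range header values, at most max_ranges ranges per request."""
    headers = []
    for i in range(0, len(fetch), max_ranges):
        batch = fetch[i:i + max_ranges]
        # HTTP byte ranges are inclusive, so the last byte is end - 1.
        headers.append("bytes=" + ",".join(f"{s}-{e - 1}" for s, e in batch))
    return headers
```

Merging adjacent missing chunks into one range is what makes grouping new packages together attractive: a run of new chunks costs a single range instead of one per chunk.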
If all you want to do is download zchunks, you need to run dl_zchunk.py with the url you want to download (ending in .zck) as the first parameter. Repodata for various days over the last few weeks is at: https://www.jdieter.net/downloads/zchunk-test/ You may need to hover over the links to see which is which. The downloads directory is also available over rsync at rsync://jdieter.net/downloads/zchunk-test.

dl_zchunk.py doesn't show anything if you download the full file, but if you run the command with an old file as the second parameter, it will show four numbers: bytes taken from the old file, bytes downloaded from the new, total downloaded bytes and total uploaded bytes.

zchunk.py creates a .zck file. To group chunks by source rpm in primary.xml, run
./zchunk.py rpm:sourcerpm

unzchunk.py decompresses a .zck file to stdout.

I realize that there's a lot to digest here, and it's late, so I know I missed something. Please let me know if you have any