from:"Jonathan Dieter"

Re: Fedora 40 beta freeze now over

2024-04-03 Thread Jonathan Dieter

On Tue, 2024-04-02 at 16:55 -0700, Kevin Fenzi wrote:
> On Tue, Apr 02, 2024 at 09:28:31PM +0100, Jonathan Dieter wrote:
> >  * Alternatively, we could update whatever's calling createrepo_c
> > to add the `f` prefix to all non-rawhide builds.
> 
> I like this option. ;) 
> 
> https://pagure.io/pungi-fedora/pull-request/1269

That looks perfect! :)

Jonathan
--
___
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue

Re: Fedora 40 beta freeze now over

2024-04-02 Thread Jonathan Dieter

On Sat, 2024-03-30 at 09:39 -0700, Kevin Fenzi wrote:
> On Fri, Mar 29, 2024 at 11:32:10PM +0000, Jonathan Dieter wrote:
> > On Wed, 2024-03-27 at 09:12 -0700, Kevin Fenzi wrote:
> > > Our next freeze is for Fedora 40 Final, currently scheduled for
> > > 2024-04-02, which is NEXT TUESDAY!
> > 
> > Could you please update fedora-repo-zdicts to 2403.1 on the server(s)
> > used to generate the metadata?  This will reduce the size of the zchunk
> > metadata for the fedora repo.
> 
> Yeah, I already updated the rawhide composer the other day... will get
> the rest today. 
> 
> Thanks for the reminder. 

Hey Kevin, thanks for looking into this.  I've just checked today's
compose and it's still not using the dictionaries.  According to the
createrepo_c log at
https://kojipkgs.fedoraproject.org/compose/branched/Fedora-40-20240402.n.0/logs/x86_64/createrepo-Everything.rpm.x86_64.log
it's not looking in the expected dictionary path.

The dictionaries are in:
/usr/share/fedora-repo-zdicts/f40

But createrepo_c is looking in:
/usr/share/fedora-repo-zdicts/40

Our options are:
 * I can push out a new build of fedora-repo-zdicts with paths added
that strip out the `f`, but we'll need to get a final freeze exception.

 * Alternatively, we could update whatever's calling createrepo_c to
add the `f` prefix to all non-rawhide builds.

 * Finally, we can just ignore this and Fedora 40 will have 50% larger
zchunk metadata.

I'd prefer one of the first two options (whichever is easier), but it's
not the end of the world if we go with option 3.  I think we're already
there with F39.
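
To make the second option concrete, here is a minimal sketch of the path
logic.  It is only an illustration, not the actual pungi-fedora change:
the names `zdict_dir` and `createrepo_zck_args` are made up, and it
assumes that numbered releases use an `f`-prefixed directory while
rawhide keeps an unprefixed one.

```python
ZDICT_BASE = "/usr/share/fedora-repo-zdicts"


def zdict_dir(release: str) -> str:
    """Return the dictionary directory for a release such as "40" or "rawhide"."""
    # Assumption: numbered releases live under an f-prefixed directory
    # (f40), while rawhide keeps its plain name.
    name = f"f{release}" if release.isdigit() else release
    return f"{ZDICT_BASE}/{name}"


def createrepo_zck_args(release: str) -> list[str]:
    """Build the extra createrepo_c arguments for zchunk with dictionaries."""
    return ["--zck", f"--zck-dict-dir={zdict_dir(release)}"]


if __name__ == "__main__":
    print(createrepo_zck_args("40"))
    print(createrepo_zck_args("rawhide"))
```

With this, a release of "40" maps to the `f40` directory that the
package actually ships, instead of the `40` directory that createrepo_c
is currently being pointed at.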

Thanks,

Jonathan

Re: Fedora 40 beta freeze now over

2024-03-29 Thread Jonathan Dieter

On Wed, 2024-03-27 at 09:12 -0700, Kevin Fenzi wrote:
> Our next freeze is for Fedora 40 Final, currently scheduled for
> 2024-04-02, which is NEXT TUESDAY!

Could you please update fedora-repo-zdicts to 2403.1 on the server(s)
used to generate the metadata?  This will reduce the size of the zchunk
metadata for the fedora repo.

Thanks,

Jonathan

Re: SQLAlchemy integration in Flask

2021-12-13 Thread Jonathan Dieter

Sorry for taking so long to reply.  I'm afraid I don't check this
mailing list as often as I should. :)

On Tue, 2021-12-07 at 08:52 +0100, Aurelien Bompard wrote:
> Thanks for your input!
> 
> > 1. We're using a clustered database (CockroachDB, for those who care)
> > that uses optimistic concurrency, so automatic transaction retries are
> > a must, and we need control over how those retries are done.
> > 
> 
> 
> Interesting, we don't use that, but then again we've recently started
> using more funky stuff on the database side (TimescaleDB) so maybe
> one day...

Unfortunately CockroachDB has gone the route of MongoDB in its
licensing, so it's not really open.  YugabyteDB looks like it has most
of the same features and is Apache 2.0 licensed, so would probably be a
better fit for Fedora (and, if it wasn't for the fact that it's missing
GIN indexes, we would probably be using it too).
>  
> > 2. We are using the same models for a couple of different projects (the
> > API itself and a script that is synchronizing between the old database
> > and the new), and not all the projects are built on Flask.  Initially,
> > I was able to get the sync script working with Flask-SQLAlchemy, but
> > things got ugly quickly when I started doing multithreading, so I
> > abandoned it and am now using Flask and SQLAlchemy separately.
> > 
> 
> 
> When I thought about that use case, I supposed it would be OK to
> instantiate the app and start the app context from within the script,
> as it would also give you access to Flask's config file. But I did
> not think about multithreading. Would you recommend against creating
> the app instance and the app context in a command-line script?

Well, that was what I tried to do first, but, as I said, everything
broke down when I tried to do multithreading (and got worse when I
tried to set up multiprocessing).  The problem is that Flask-SQLAlchemy
tries to manage the DB session for you, and, since SQLAlchemy sessions
aren't thread-safe, my command-line script kept crashing, and a few
hours of poking around couldn't fix it.  If I'd been willing to poke
around more in Flask-SQLAlchemy's internals, I might have figured
something out, but it just didn't seem to be worth the effort when
manually managing my sessions fixed the problem completely.
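
As a rough illustration of what "manually managing" a non-thread-safe
handle can look like, here is a sketch of the one-handle-per-thread
pattern.  It is not the code from the sync script: it uses the standard
library's sqlite3 connections as a stand-in for SQLAlchemy sessions so
that it runs without extra packages, and the names `get_connection`,
`add_user` and the `users` table are invented for the example.

```python
import os
import sqlite3
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

# Throwaway database file so that every thread sees the same data.
DB_PATH = os.path.join(tempfile.mkdtemp(), "sketch.db")

# Each thread gets its own attribute namespace on this object.
_local = threading.local()


def get_connection() -> sqlite3.Connection:
    """Return this thread's own connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(DB_PATH, timeout=30)
        _local.conn = conn
    return conn


def add_user(name: str) -> None:
    conn = get_connection()
    with conn:  # commits on success, rolls back on an exception
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def main() -> int:
    setup = sqlite3.connect(DB_PATH)
    with setup:
        setup.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")
    setup.close()

    # Four worker threads, none of which ever shares a connection.
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(add_user, [f"user{i}" for i in range(20)]))

    check = sqlite3.connect(DB_PATH)
    count = check.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    check.close()
    return count


if __name__ == "__main__":
    print(main())
```

The point is only that the handle is created and owned by the thread
that uses it, rather than by a framework-level global.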

> Is the code you wrote to integrate Flask and SQLAlchemy opensource,
> and available somewhere?

Unfortunately not, but there was actually very little integration code
written.

Our code follows this pattern (we're using Flask-RESTX, and I've
omitted serializers to keep it simple):

endpoint:
from flask_restx import Resource

import business
from util import run_transaction

# ns is the Flask-RESTX Namespace for this API (definition omitted)
@ns.route("/user/<int:id>")
class UserLink(Resource):
    def get(self, id):
        return run_transaction(lambda s: business.get_user(s, id))

business:
from database.model import *

def get_user(session, id):
    return session.query(User).filter(User.id == id).one()

util:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
import sqlalchemy_cockroachdb

engine = create_engine('postgresql://admin:swordfish@localhost/')

SessionMaker = sessionmaker(engine)

def run_transaction(func):
    # Runs func(session), retrying on conflicts, and returns its result
    return sqlalchemy_cockroachdb.run_transaction(SessionMaker, func)

The purpose of the run_transaction function is to repeat transactions
if there's a conflict, rather than trying to lock the record, which is
a CockroachDB paradigm.
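
For anyone who hasn't seen the retry pattern before, the idea is roughly
the following.  This is a simplified sketch and not how
sqlalchemy_cockroachdb implements it: `ConflictError` and
`retry_transaction` are hypothetical names, and a real version would
also open a session, roll it back on failure and only retry on the
database's serialization-failure error.

```python
import random
import time


class ConflictError(Exception):
    """Hypothetical stand-in for a serialization failure from the database."""


def retry_transaction(func, max_retries=5, base_delay=0.01):
    """Call func() until it succeeds or the retry budget is used up."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ConflictError:
            if attempt == max_retries:
                raise  # out of retries, let the caller see the conflict
            # Exponential backoff with jitter so that competing
            # transactions don't all retry at the same moment.
            time.sleep(base_delay * (2 ** attempt) * random.random())


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky():
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ConflictError()
        return "committed"

    print(retry_transaction(flaky), "after", attempts["count"], "attempts")
```

Because the whole function may run more than once, anything inside it
has to be safe to repeat, which is one reason the business logic above
takes the session as an argument instead of reaching for a global one.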

I hope the above is at least somewhat helpful in explaining how we're
working without Flask-SQLAlchemy.

Jonathan

Re: SQLAlchemy integration in Flask

2021-12-06 Thread Jonathan Dieter

On Mon, 2021-12-06 at 18:36 +0100, Aurelien Bompard wrote:

> Anyway, this long email is about finding a common ground for
> SQLAlchemy integration in Flask, while taking into account our
> difficult experiences with web frameworks in the past, but not being
> locked in them. Is there something that I misrepresented here? Do you
> have opinions? Preferences?

So, full disclosure, I'm normally just lurking on this list and am not
currently writing or maintaining code for the infrastructure team, so
my 2¢ probably isn't worth much more than that.

Having said that, in my day job, I've been writing a Flask API to
correspond with a massive database restructure using SQLAlchemy.  When
I started writing the API, I originally used Flask-SQLAlchemy for all
the reasons you listed above.  However, a couple of months ago I
stripped it out for a couple of reasons.

1. We're using a clustered database (CockroachDB, for those who care)
that uses optimistic concurrency, so automatic transaction retries are
a must, and we need control over how those retries are done.
2. We are using the same models for a couple of different projects (the
API itself and a script that is synchronizing between the old database
and the new), and not all the projects are built on Flask.  Initially,
I was able to get the sync script working with Flask-SQLAlchemy, but
things got ugly quickly when I started doing multithreading, so I
abandoned it and am now using Flask and SQLAlchemy separately.

In short, Flask-SQLAlchemy does a great job of tying together Flask and
SQLAlchemy if you're 100% sure that your project models will never be
required outside of Flask.  The minute you step outside of the Flask-
SQLAlchemy way of doing things, things start to go very wrong very
quickly.

Jonathan

Re: Freeze break request: Re: Can we update fedora-repo-zdicts on the branched and rawhide composers?

2021-09-09 Thread Jonathan Dieter

On Wed, 2021-09-08 at 09:14 -0700, Kevin Fenzi wrote:
> 
> I've updated it. 
> 
> kevin

I can confirm that the latest F35 repodata has the dictionaries now. 
Thanks so much!

Jonathan



Can we update fedora-repo-zdicts on the branched and rawhide composers?

2021-08-31 Thread Jonathan Dieter

Since branching, I've put out a new version of fedora-repo-zdicts with
dictionaries for F35 and updated dictionaries for Rawhide.  This
version (2108.1) is now available in all active Fedora/EPEL branches, I
think.

Can we update fedora-repo-zdicts on the branched and rawhide composers
so they get the latest dictionaries when creating the repodata?  Or do
we need to wait until the beta freeze ends?

Thanks,
Jonathan

Re: Please don't update zchunk to 1.1.14 on servers where createrepo_c is run

2021-06-03 Thread Jonathan Dieter

On Wed, 2021-06-02 at 09:35 -0700, Kevin Fenzi wrote:
> On Tue, Jun 01, 2021 at 09:13:30PM +0100, Jonathan Dieter wrote:
> > A major bug in zchunk-1.1.14 was flagged up to me today.  If zchunk-
> > 1.1.14 (on a system with zstd 1.5.0+) is used to create a zck file with
> > a zdict, the file will be impossible to decompress.  Embarrassingly,
> > the tests weren't testing this combination.
> > 
> > The good news is that this doesn't affect decompression at all, so this
> > is only a problem for the server that's used to generate the zchunked
> > metadata, which is using zdicts.
> > 
> > I've just finished building zchunk-1.1.15 which fixes this bug (and
> > adds tests to make sure it never happens again), but please make sure
> > that zchunk doesn't get updated to 1.1.14 on the servers that generate
> > the metadata.
> > 
> > Thanks, and apologies for the inconvenience.
> 
> I updated all of them to 1.1.15 yesterday. 
> Many of them were on 1.1.14, so I figured it was better to move forward
> than move back. :) 

That sounds good to me!  Thanks so much!

Jonathan



Please don't update zchunk to 1.1.14 on servers where createrepo_c is run

2021-06-01 Thread Jonathan Dieter

A major bug in zchunk-1.1.14 was flagged up to me today.  If zchunk-
1.1.14 (on a system with zstd 1.5.0+) is used to create a zck file with
a zdict, the file will be impossible to decompress.  Embarrassingly,
the tests weren't testing this combination.

The good news is that this doesn't affect decompression at all, so this
is only a problem for the server that's used to generate the zchunked
metadata, which is using zdicts.

I've just finished building zchunk-1.1.15 which fixes this bug (and
adds tests to make sure it never happens again), but please make sure
that zchunk doesn't get updated to 1.1.14 on the servers that generate
the metadata.

Thanks, and apologies for the inconvenience.

Jonathan

Re: Can we please update fedora-repo-zdicts on the metadata generation servers for F34 zchunk dictionaries?

2021-04-05 Thread Jonathan Dieter

On Sat, 2021-04-03 at 21:46 +0100, Jonathan Dieter wrote:
> On Sat, 2021-04-03 at 11:09 -0700, Kevin Fenzi wrote:
> > ok. I've installed fedora-repo-zdicts on both branched and rawhide
> > composers. 
> > 
> > Let's see if that works in tomorrow's compose. 
> 
> Thanks so much!  Fingers crossed. :)

I've just checked the latest compose and the repodata now has zdicts. 
Thanks again Kevin for getting that package into the composers.

Jonathan



Re: Can we please update fedora-repo-zdicts on the metadata generation servers for F34 zchunk dictionaries?

2021-04-03 Thread Jonathan Dieter

On Sat, 2021-04-03 at 11:09 -0700, Kevin Fenzi wrote:
> ok. I've installed fedora-repo-zdicts on both branched and rawhide
> composers. 
> 
> Let's see if that works in tomorrow's compose. 

Thanks so much!  Fingers crossed. :)

Jonathan



Can we please update fedora-repo-zdicts on the metadata generation servers for F34 zchunk dictionaries?

2021-04-02 Thread Jonathan Dieter

Right now, we're not using zdicts for the F34 zchunk metadata because
they were only added in fedora-repo-zdicts-2103.1-2 (which should now
be in the updates repo in all current Fedora releases).

If we could update fedora-repo-zdicts to 2103.1-2 on whichever servers
generate the metadata (preferably before the 34 GA metadata is
generated), that should significantly reduce the size of the metadata.

Thanks,

Jonathan

Re: [PATCH] bodhi-backend: Make sure zchunk dicts are installed

2019-05-23 Thread Jonathan Dieter

On Thu, 2019-05-23 at 17:01 -0400, Randy Barlow wrote:
> On Thu, 2019-05-23 at 10:33 -0700, Kevin Fenzi wrote:
> > Applied. Thanks.
> 
> One note: The patch to do zchunking is part of Bodhi 4.0.0, which is
> not yet in production; we plan to deploy it on Tuesday.

Unless I'm mistaken, that patch is specific to updateinfo.xml.  The
other metadata in updates and updates-testing is currently zchunked,
just without a zdict at the moment.

Jonathan



Re: [PATCH] bodhi-backend: Make sure zchunk dicts are installed

2019-05-19 Thread Jonathan Dieter

On Sun, 2019-05-19 at 21:25 +0100, Jonathan Dieter wrote:
> The zchunk dictionaries used to reduce the size of zchunk metadata seem to
> not currently be installed on the bodhi server.  This patch makes sure they
> are installed.
> 
> Signed-off-by: Jonathan Dieter 
> ---
>  roles/bodhi2/backend/tasks/main.yml | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/roles/bodhi2/backend/tasks/main.yml b/roles/bodhi2/backend/tasks/main.yml
> index 32da678db..3ab6ec809 100644
> --- a/roles/bodhi2/backend/tasks/main.yml
> +++ b/roles/bodhi2/backend/tasks/main.yml
> @@ -18,6 +18,7 @@
>    - bodhi-composer
>    - python3-pyramid_sawing
>    - sigul
> +  - fedora-repo-zdicts
>    # Are these still needed?
>    - compose-utils
>    - pungi-utils

Just to be clear, I'm not 100% sure this is the right way or the right
place to install the zchunk dictionaries, but having them installed
should reduce the size of the zchunk metadata by a significant amount.

Jonathan

[PATCH] bodhi-backend: Make sure zchunk dicts are installed

2019-05-19 Thread Jonathan Dieter

The zchunk dictionaries used to reduce the size of zchunk metadata seem to
not currently be installed on the bodhi server.  This patch makes sure they
are installed.

Signed-off-by: Jonathan Dieter 
---
 roles/bodhi2/backend/tasks/main.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/roles/bodhi2/backend/tasks/main.yml b/roles/bodhi2/backend/tasks/main.yml
index 32da678db..3ab6ec809 100644
--- a/roles/bodhi2/backend/tasks/main.yml
+++ b/roles/bodhi2/backend/tasks/main.yml
@@ -18,6 +18,7 @@
   - bodhi-composer
   - python3-pyramid_sawing
   - sigul
+  - fedora-repo-zdicts
   # Are these still needed?
   - compose-utils
   - pungi-utils
-- 
2.21.0

Re: [PATCH] bodhi-backend: Add zchunk support to updates and updates-testing repositories

2019-04-12 Thread Jonathan Dieter

On Thu, 2019-04-11 at 18:08 -0700, Kevin Fenzi wrote:
> On 4/9/19 11:20 AM, Jonathan Dieter wrote:
> > On Tue, 2019-04-09 at 19:14 +0100, Jonathan Dieter wrote:
> > > This re-adds zchunk support for the updates and updates-testing
> > > repositories for both rpms and modularity.
> > > 
> > > Zchunk metadata was turned off due to a broken version of librepo that
> > > made it out to stable, but a fixed version has been pushed and FESCo has
> > > decided[1] to go ahead and turn this back on.
> > > 
> > >  1: https://pagure.io/fesco/issue/2116
> > 
> > In that ticket, we didn't really specify when to turn it back on, so if
> > we want to sit on this patch for a few days, that's fine with me.
> > 
> > Once we've decided when this should be applied, I'll send a message to
> > devel-announce with an explanation on how to work around the segfault
> > for anyone still using librepo-1.9.6-1.
> 
> I think we should apply it asap.
> 
> However, if I save your email and try and git am it, it doesn't apply at
> all.
> 
> Can you resend with the patch as attachment?
> 
> I am not sure what thunderbird is doing here. ;(
> 
> kevin

Ok, here it is, freshly rebased, as an attachment.

Jonathan
From 4c53d3fba04b1bbbfdb8a7dc1d350e75dd5efd5d Mon Sep 17 00:00:00 2001
From: Jonathan Dieter 
Date: Sat, 30 Mar 2019 22:29:33 +
Subject: [PATCH] bodhi-backend: Add zchunk support to updates and
 updates-testing repositories

This re-adds zchunk support for the updates and updates-testing repositories
for both rpms and modularity.

Zchunk metadata was turned off due to a broken version of librepo that made it
out to stable, but a fixed version has been pushed and FESCo has decided[1] to
go ahead and turn this back on.

 1: https://pagure.io/fesco/issue/2116

Signed-off-by: Jonathan Dieter 
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index 43c6a7e5f..b5bb0c1fb 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -61,6 +61,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
 ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.21.0




Re: [PATCH] bodhi-backend: Add zchunk support to updates and updates-testing repositories

2019-04-09 Thread Jonathan Dieter

On Tue, 2019-04-09 at 19:14 +0100, Jonathan Dieter wrote:
> This re-adds zchunk support for the updates and updates-testing repositories
> for both rpms and modularity.
> 
> Zchunk metadata was turned off due to a broken version of librepo that made it
> out to stable, but a fixed version has been pushed and FESCo has decided[1] to
> go ahead and turn this back on.
> 
>  1: https://pagure.io/fesco/issue/2116

In that ticket, we didn't really specify when to turn it back on, so if
we want to sit on this patch for a few days, that's fine with me.

Once we've decided when this should be applied, I'll send a message to
devel-announce with an explanation on how to work around the segfault
for anyone still using librepo-1.9.6-1.

Jonathan

[PATCH] bodhi-backend: Add zchunk support to updates and updates-testing repositories

2019-04-09 Thread Jonathan Dieter

This re-adds zchunk support for the updates and updates-testing repositories
for both rpms and modularity.

Zchunk metadata was turned off due to a broken version of librepo that made it
out to stable, but a fixed version has been pushed and FESCo has decided[1] to
go ahead and turn this back on.

 1: https://pagure.io/fesco/issue/2116

Signed-off-by: Jonathan Dieter 
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index 43c6a7e5f..b5bb0c1fb 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -61,6 +61,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
 ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.21.0

Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-31 Thread Jonathan Dieter

On Sun, 2019-03-31 at 10:37 -0700, Kevin Fenzi wrote:
> On 3/31/19 10:35 AM, Jonathan Dieter wrote:
> > On Sun, 2019-03-31 at 10:28 -0700, Kevin Fenzi wrote:
> > > On 3/31/19 1:56 AM, Jonathan Dieter wrote:
> > > > On Sun, 2019-03-31 at 09:09 +0100, Jonathan Dieter wrote:
> > > > > Due to an unrelated *major* bug in the latest librepo update (
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1694411), I'd like to
> > > > > request that we disable zchunk metadata generation in updates and
> > > > > updates-testing until it's fixed.
> > > > 
> > > > Just to be clear, until either:
> > > >  * We get a new updates compose out without zchunk metadata, or
> > > >  * The user sets zchunk=False in /etc/dnf/dnf.conf
> > > > 
> > > > dnf update is broken for anybody using F30
> > > > 
> > > > Should I send an email to -devel explaining the above?
> > > 
> > > Please do, perhaps devel-announce ?
> > > 
> > > I have reverted things and am working on a new f30-updates-testing push.
> > > There was a failed f29-updates-testing last night so I have to finish
> > > that first, but hopefully we will have it out in a few hours.
> > 
> > I've sent it out to devel-announce, but it was rejected as I'm not in
> > the right group.  Will I send it to you and let you forward it?
> 
> Yeah, you have to be subscribed to devel-announce to post there... if
> you just subscribe and resend it should go to moderation and I can pass it.
> 
> Or if you want, just send it my way and I can post it...

Ok, I've subscribed, sent the message, and it's awaiting moderation.

Jonathan



Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-31 Thread Jonathan Dieter

On Sun, 2019-03-31 at 10:28 -0700, Kevin Fenzi wrote:
> On 3/31/19 1:56 AM, Jonathan Dieter wrote:
> > On Sun, 2019-03-31 at 09:09 +0100, Jonathan Dieter wrote:
> > > Due to an unrelated *major* bug in the latest librepo update (
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1694411), I'd like to
> > > request that we disable zchunk metadata generation in updates and
> > > updates-testing until it's fixed.
> > 
> > Just to be clear, until either:
> >  * We get a new updates compose out without zchunk metadata, or
> >  * The user sets zchunk=False in /etc/dnf/dnf.conf
> > 
> > dnf update is broken for anybody using F30
> > 
> > Should I send an email to -devel explaining the above?
> 
> Please do, perhaps devel-announce ?
> 
> I have reverted things and am working on a new f30-updates-testing push.
> There was a failed f29-updates-testing last night so I have to finish
> that first, but hopefully we will have it out in a few hours.

I've sent it out to devel-announce, but it was rejected as I'm not in
the right group.  Will I send it to you and let you forward it?

Jonathan



Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-31 Thread Jonathan Dieter

On Sun, 2019-03-31 at 09:09 +0100, Jonathan Dieter wrote:
> Due to an unrelated *major* bug in the latest librepo update (
> https://bugzilla.redhat.com/show_bug.cgi?id=1694411), I'd like to
> request that we disable zchunk metadata generation in updates and
> updates-testing until it's fixed.

Just to be clear, until either:
 * We get a new updates compose out without zchunk metadata, or
 * The user sets zchunk=False in /etc/dnf/dnf.conf

dnf update is broken for anybody using F30

Should I send an email to -devel explaining the above?
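
For anyone who wants to apply the second workaround by script rather than
by hand, here is a minimal sketch in Python.  It assumes the option lives
in the [main] section of dnf.conf, and it works on the file contents as a
string so it can be tried without touching the real file.  The function
name disable_zchunk and the sample config are made up for illustration,
and configparser drops comments, so treat this as a sketch only.

```python
import configparser
import io


def disable_zchunk(conf_text: str) -> str:
    """Return dnf.conf text with zchunk=False set in [main].

    Assumption: the option belongs in the [main] section.
    Note: configparser discards comments when rewriting the text.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("main"):
        parser.add_section("main")
    parser.set("main", "zchunk", "False")
    out = io.StringIO()
    # Write key=value without spaces, matching the usual dnf.conf style.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Hypothetical sample contents, not copied from a real system.
SAMPLE_CONF = "[main]\ngpgcheck=1\ninstallonly_limit=3\n"

if __name__ == "__main__":
    print(disable_zchunk(SAMPLE_CONF))
```

To use it for real, read /etc/dnf/dnf.conf, pass the text through the
function, and write the result back as root.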

Jonathan

Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-31 Thread Jonathan Dieter

On Sun, 2019-03-31 at 05:13 +0000, Peter Robinson wrote:
> On Sun, Mar 31, 2019 at 6:01 AM Kevin Fenzi  wrote:
> > On 3/30/19 9:50 PM, Peter Robinson wrote:
> > > > Great, thanks!  I'll be keeping an eye on the composes to see if there
> > > > are any issues.
> > > 
> > > Wasn't this disabled in the main Fedora branched compose? If so why
> > > would we want to enable it only on updates?
> > 
> > There's no updates in f30 indeed, but updates-testing should be there
> > and available for testing. Nearer release we will enable updates and if
> > we didn't enable this for them now we might well not remember to do so,
> > so it seemed like a good idea to just do them both.
> 
> I was referring to commits 6c392f16 and 96adf9a in pungi-fedora, if
> it's disabled in the base fedora repo why enable it in
> updates/testing?

Hey Peter, the zchunk metadata generation was disabled in the base repo
because of a bug that popped up in a combination that the compose
process happened to hit: using a single baseurl and downloading a
zchunk file with tens of thousands of chunks on a slow processor.

The bug has been fixed with updates to both zchunk and libcurl (see 
https://bugzilla.redhat.com/show_bug.cgi?id=1690971) and it shouldn't
affect beta users because the number of chunks in updates and
updates-testing is an order of magnitude lower than in the base repo.

*However*

Due to an unrelated *major* bug in the latest librepo update (
https://bugzilla.redhat.com/show_bug.cgi?id=1694411), I'd like to
request that we disable zchunk metadata generation in updates and
updates-testing until it's fixed.

Jonathan

Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-30 Thread Jonathan Dieter

On Sat, 2019-03-30 at 16:00 -0700, Kevin Fenzi wrote:
> On 3/30/19 3:32 PM, Jonathan Dieter wrote:
> > On Sat, 2019-03-30 at 15:13 -0700, Kevin Fenzi wrote:
> > > On 3/30/19 11:35 AM, Jonathan Dieter wrote:
> > > 
> > > > Stephen and Kevin, thanks so much!
> > > 
> > > Can you rebase and attach the patch?
> > > 
> > > > It's not applying cleanly for me... if not I can try and manually poke
> > > > it later.
> > > 
> > > kevin
> > 
> > > I've just rebased and posted the updated patch.  There were no
> > > conflicts when I rebased it against master, so please let me know if I
> > > should be rebasing against a different branch.
> > 
> > Not sure why it was complaining, but it's applied and pushed now.
> 
> kevin

Great, thanks!  I'll be keeping an eye on the composes to see if there
are any issues.

Jonathan



Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-30 Thread Jonathan Dieter

On Sat, 2019-03-30 at 15:13 -0700, Kevin Fenzi wrote:
> On 3/30/19 11:35 AM, Jonathan Dieter wrote:
> 
> > Stephen and Kevin, thanks so much!
> 
> Can you rebase and attach the patch?
> 
> It's not applying cleanly for me... if not I can try and manually poke
> it later.
> 
> kevin

I've just rebased and posted the updated patch.  There were no
conflicts when I rebased it against master, so please let me know if I
should be rebasing against a different branch.

Jonathan



[Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-30 Thread Jonathan Dieter

Rebased patch against master

[PATCH] Add zchunk support to updates and updates-testing repositories

2019-03-30 Thread Jonathan Dieter

This adds zchunk support for the updates and updates-testing repositories
for both rpms and modularity

Signed-off-by: Jonathan Dieter 
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index 43c6a7e5f..b5bb0c1fb 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -61,6 +61,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
 ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.21.0

Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-30 Thread Jonathan Dieter

On Sat, 2019-03-30 at 14:05 -0400, Stephen John Smoogen wrote:
> +1
> 
> On Sat, 30 Mar 2019 at 13:53, Kevin Fenzi  wrote:
> > On 3/29/19 1:33 PM, Jonathan Dieter wrote:
> > > On Mon, 2019-03-11 at 20:23 +0000, Jonathan Dieter wrote:
> > > > On Mon, 2019-03-11 at 11:24 -0700, Kevin Fenzi wrote:
> > > > > On 3/11/19 12:26 AM, Jonathan Dieter wrote:
> > > > > > This adds zchunk support for the updates and updates-testing
> > > > > > repositories for both rpms and modularity.  We already have zchunk
> > > > > > metadata being generated for the fedora repository.  I'd like to
> > > > > > get this in before Beta comes out so Beta users will have zchunk-
> > > > > > enabled updates-testing repositories when Beta is released.
> > > > > 
> > > > > > yeah, hopefully not too much pain since it's been in rawhide a while
> > > > > > now.
> > > > > > 
> > > > > > > I am making the assumption that a zchunk-enabled createrepo_c
> > > > > > > (0.12.0-2 or later) is available on the builders (I think I'm
> > > > > > > safe making that assumption, since zchunk metadata is already
> > > > > > > being generated for some repos).
> > > > > > 
> > > > > > Well, bodhi-backend01 (where the updates process/pungi runs for these)
> > > > > > has a newer one, so yes. It's all run on bodhi-backend01, not
> > > > > > builders.
> > > > > > 
> > > > > > > I have *not* tested this patch, because I'm not sure how I'd go
> > > > > > > about doing so.  If we don't have any test builders, my
> > > > > > > suggestion would be to wait until no compose is running, and
> > > > > > > then run this play on a builder, verifying that the generated
> > > > > > > pungi configuration is valid for both f29 and f30, with no
> > > > > > > createrepo_extra_args in f29.
> > > > > 
> > > > > yeah, we can commit this, run the playbook then examine the results.
> > > > 
> > > > Great.  I'm on UTC time right now, so hopefully I'll be off of work and
> > > > available if there are any issues whenever we get another +1 and you
> > > > run it.  I do expect that it will go fine.
> > > 
> > > Since we never got the extra +1 to get this in before Beta, are we at a
> > > point where we can turn this on now?
> > 
> > Nope, we are still frozen until the day after beta release. ;(
> > 
> > But I will try and scare up another +1
> > 
> > kevin
> > 
> 
> -- 
> Stephen J Smoogen.

Stephen and Kevin, thanks so much!

Jonathan

Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-29 Thread Jonathan Dieter

On Mon, 2019-03-11 at 20:23 +0000, Jonathan Dieter wrote:
> On Mon, 2019-03-11 at 11:24 -0700, Kevin Fenzi wrote:
> > On 3/11/19 12:26 AM, Jonathan Dieter wrote:
> > > This adds zchunk support for the updates and updates-testing
> > > repositories for both rpms and modularity.  We already have zchunk
> > > metadata being generated for the fedora repository.  I'd like to
> > > get this in before Beta comes out so Beta users will have zchunk-
> > > enabled updates-testing repositories when Beta is released.
> > 
> > > yeah, hopefully not too much pain since it's been in rawhide a while now.
> > 
> > > I am making the assumption that a zchunk-enabled createrepo_c (0.12.0-2 
> > > or later) is available on the builders (I think I'm safe making that
> > > assumption, since zchunk metadata is already being generated for some
> > > repos).
> > 
> > Well, bodhi-backend01 (where the updates process/pungi runs for these)
> > has a newer one, so yes. It's all run on bodhi-backend01, not builders.
> > 
> > > I have *not* tested this patch, because I'm not sure how I'd go about
> > > doing so.  If we don't have any test builders, my suggestion would be
> > > to wait until no compose is running, and then run this play on a
> > > builder, verifying that the generated pungi configuration is valid for
> > > both f29 and f30, with no createrepo_extra_args in f29.
> > 
> > yeah, we can commit this, run the playbook then examine the results.
> 
> Great.  I'm on UTC time right now, so hopefully I'll be off of work and
> available if there are any issues whenever we get another +1 and you
> run it.  I do expect that it will go fine.

Since we never got the extra +1 to get this in before Beta, are we at a
point where we can turn this on now?

Thanks
Jonathan



Re: [Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-11 Thread Jonathan Dieter

On Mon, 2019-03-11 at 11:24 -0700, Kevin Fenzi wrote:
> On 3/11/19 12:26 AM, Jonathan Dieter wrote:
> > This adds zchunk support for the updates and updates-testing
> > repositories for both rpms and modularity.  We already have zchunk
> > metadata being generated for the fedora repository.  I'd like to
> > get this in before Beta comes out so Beta users will have zchunk-
> > enabled updates-testing repositories when Beta is released.
> 
> yeah, hopefully not too much pain since it's been in rawhide a while now.
> 
> > I am making the assumption that a zchunk-enabled createrepo_c (0.12.0-2 
> > or later) is available on the builders (I think I'm safe making that
> > assumption, since zchunk metadata is already being generated for some
> > repos).
> 
> Well, bodhi-backend01 (where the updates process/pungi runs for these)
> has a newer one, so yes. It's all run on bodhi-backend01, not builders.
> 
> > I have *not* tested this patch, because I'm not sure how I'd go about
> > doing so.  If we don't have any test builders, my suggestion would be
> > to wait until no compose is running, and then run this play on a
> > builder, verifying that the generated pungi configuration is valid for
> > both f29 and f30, with no createrepo_extra_args in f29.
> 
> yeah, we can commit this, run the playbook then examine the results.

Great.  I'm on UTC time right now, so hopefully I'll be off of work and
available if there are any issues whenever we get another +1 and you
run it.  I do expect that it will go fine.

Jonathan



[Freeze Break Request] Add zchunk support to updates and updates-testing repositories

2019-03-11 Thread Jonathan Dieter

This adds zchunk support for the updates and updates-testing
repositories for both rpms and modularity.  We already have zchunk metadata 
being generated for the fedora repository.  I'd like to get this in before Beta 
comes out so Beta users will have zchunk-enabled updates-testing repositories 
when Beta is released.

I am making the assumption that a zchunk-enabled createrepo_c (0.12.0-2 
or later) is available on the builders (I think I'm safe making that
assumption, since zchunk metadata is already being generated for some
repos).

I have *not* tested this patch, because I'm not sure how I'd go about
doing so.  If we don't have any test builders, my suggestion would be
to wait until no compose is running, and then run this play on a
builder, verifying that the generated pungi configuration is valid for
both f29 and f30, with no createrepo_extra_args in f29.
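
One way to do that check after the playbook has run is sketched below in
Python.  It assumes the rendered pungi configuration can be read as plain
Python assignments, which holds for the fragment this patch touches but
may not hold for every pungi config.  The helper extra_args_in and the two
sample fragments are hypothetical and only stand in for the real rendered
files.

```python
import ast


def extra_args_in(rendered_conf: str):
    """Return the createrepo_extra_args list, or None if it is not set.

    Assumption: the rendered config parses as simple Python assignments.
    """
    tree = ast.parse(rendered_conf)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "createrepo_extra_args":
                # Only literal values are accepted; nothing is executed.
                return ast.literal_eval(node.value)
    return None


# Hypothetical rendered fragments for f29 and f30.
F29_FRAGMENT = """
createrepo_c = True
createrepo_checksum = 'sha256'
createrepo_deltas = False
"""

F30_FRAGMENT = F29_FRAGMENT + (
    "createrepo_extra_args = ['--zck', "
    "'--zck-dict-dir=/usr/share/fedora-repo-zdicts/f30']\n"
)

if __name__ == "__main__":
    print("f29:", extra_args_in(F29_FRAGMENT))
    print("f30:", extra_args_in(F30_FRAGMENT))
```

A parse error from ast.parse would also be a useful signal that the
template rendered something that is not valid configuration.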

Signed-off-by: Jonathan Dieter 
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index bb021eb13..7dad35403 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -59,6 +59,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
 ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']

Re: How do we turn zchunk on for updates-testing for F30?

2019-03-10 Thread Jonathan Dieter

On Sun, 2019-03-10 at 15:47 +0000, Peter Robinson wrote:
> git send-email so it's inline on the list for easy review.

Thanks for the tip!  Just sent it using git send-email.

Jonathan

[PATCH] Add zchunk support to updates and updates-testing repositories

2019-03-10 Thread Jonathan Dieter

This adds zchunk support for the updates and updates-testing repositories
for both rpms and modularity

Signed-off-by: Jonathan Dieter 
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index bb021eb13..7dad35403 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -59,6 +59,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
 ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.20.1

Re: How do we turn zchunk on for updates-testing for F30?

2019-03-09 Thread Jonathan Dieter

On Sat, 2019-03-09 at 21:29 +0100, Mikolaj Izdebski wrote:
> On Sat, Mar 9, 2019 at 2:29 PM Jonathan Dieter 
> wrote:
> > Hey, I just noticed that, while we have zchunked metadata for the F30
> > base repository, it's not enabled for updates-testing.
> > 
> > I've looked in the ansible repo and in pungi, but I can't see where
> > createrepo_c is actually called for updates-testing.  Can someone
> > please point me in the right direction?
> 
> createrepo for updates-testing is run by pungi. I believe you need to
> enable zchunk in pungi.conf (createrepo_extra_args option). For
> non-modular updates-testing the config is
> roles/bodhi2/backend/templates/pungi.rpm.conf.j2 in ansible.git.
> Similarly, for modular equivalent, pungi config is located at
> roles/bodhi2/backend/templates/pungi.module.conf.j2

Thanks for pointing me in the right direction.  I think I've got it,
complete with a conditional so we don't start generating zchunk
metadata for F29 updates.
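
The conditional boils down to the logic below, sketched in Python rather
than in the template syntax so it can be run on its own.  The threshold of
30 and the dictionary path come from the patch; the helper name
zchunk_extra_args is hypothetical and does not exist in pungi or ansible.

```python
from typing import List


def zchunk_extra_args(version_int: int) -> List[str]:
    """Mirror the template conditional: zchunk args only for F30 and later."""
    if version_int >= 30:
        return [
            "--zck",
            "--zck-dict-dir=/usr/share/fedora-repo-zdicts/f%d" % version_int,
        ]
    # Older releases get no extra createrepo_c arguments.
    return []


if __name__ == "__main__":
    for version in (29, 30):
        print(version, zchunk_extra_args(version))
```

In the real template an older release ends up with no
createrepo_extra_args line at all, which the empty list stands in for
here.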

There doesn't seem to be a way to generate pull requests on 
https://infrastructure.fedoraproject.org/cgit/ansible.git, so I'm
attaching the support as a patch.  If there's a better way for me to
send it in, please let me know.

Jonathan

P.S. It may be a small and simple patch, but I haven't actually tested
it and am not sure how to go about doing so.
From 17eefaa1cefb624afa0cf95d04e7f337ba70cb42 Mon Sep 17 00:00:00 2001
From: Jonathan Dieter 
Date: Sat, 9 Mar 2019 22:52:48 +0000
Subject: [PATCH] Add zchunk support to updates and updates-testing
 repositories

This adds zchunk support for the updates and updates-testing repositories
for both rpms and modularity

Signed-off-by: Jonathan Dieter 
---
 roles/bodhi2/backend/templates/pungi.module.conf.j2 | 3 +++
 roles/bodhi2/backend/templates/pungi.rpm.conf.j2    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/roles/bodhi2/backend/templates/pungi.module.conf.j2 b/roles/bodhi2/backend/templates/pungi.module.conf.j2
index bb021eb13..7dad35403 100644
--- a/roles/bodhi2/backend/templates/pungi.module.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.module.conf.j2
@@ -59,6 +59,9 @@ greedy_method = 'build'
 createrepo_c = True
 createrepo_checksum = 'sha256'
 createrepo_deltas = False
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 #jigdo
 create_jigdo = False
diff --git a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2 b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
index 8d9e9a3f2..020736aee 100644
--- a/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
+++ b/roles/bodhi2/backend/templates/pungi.rpm.conf.j2
@@ -66,6 +66,9 @@ createrepo_deltas = [
 ('^Everything$', {'*': True})
 ]
 createrepo_database = True
+[% if release.version_int >= 30 %]
+createrepo_extra_args = ['--zck', '--zck-dict-dir=/usr/share/fedora-repo-zdicts/f[[ release.version_int ]]']
+[% endif %]
 
 # CHECKSUMS
 media_checksums = ['sha256']
-- 
2.20.1


Re: How do we turn zchunk on for updates-testing for F30?

2019-03-09 Thread Jonathan Dieter

On Sat, 2019-03-09 at 09:43 -0500, Neal Gompa wrote:
> On Sat, Mar 9, 2019 at 8:28 AM Jonathan Dieter 
> wrote:
> > Hey, I just noticed that, while we have zchunked metadata for the F30
> > base repository, it's not enabled for updates-testing.
> > 
> > I've looked in the ansible repo and in pungi, but I can't see where
> > createrepo_c is actually called for updates-testing.  Can someone
> > please point me in the right direction?
> > 
> 
> Updates repos are handled by Bodhi, so you'll want to look there.

If I'm reading the code correctly, it looks like Bodhi creates the
repodata using pungi.  Currently, fedora.conf in the pungi-fedora repo is
set to create zchunk metadata for the branches f30 and master.  Could
updates-testing maybe be using a different branch?

Jonathan

How do we turn zchunk on for updates-testing for F30?

2019-03-09 Thread Jonathan Dieter

Hey, I just noticed that, while we have zchunked metadata for the F30
base repository, it's not enabled for updates-testing.

I've looked in the ansible repo and in pungi, but I can't see where
createrepo_c is actually called for updates-testing.  Can someone
please point me in the right direction?

Jonathan

Re: Enabling zchunk metadata generation in F30

2018-12-14 Thread Jonathan Dieter

On Fri, 2018-12-14 at 15:15 -0500, Randy Barlow wrote:
> On Fri, 2018-12-14 at 19:02 +0000, Jonathan Dieter wrote:
> > Hey Randy, at the moment the --zck option *only* applies to
> > primary.xml, filelists.xml and other.xml.  It should be pretty
> > straightforward to add it to the others, but I wanted to get those
> > three working first.
> 
> Cool, sounds good to me.
> 
> > As for python bindings, they can read zchunk metadata just fine, but I
> > don't think I hooked up creating the metadata.  Where exactly does it
> > generate updateinfo in bodhi?  I'd like to see how the function is
> > used so I can implement it.
> 
> Bodhi's updateinfo code lives here:
> 
> https://github.com/fedora-infra/bodhi/blob/3.11.3/bodhi/server/metadata.py

Thanks for this.  I'll take a look at it and see what it will take to
make zchunk generation work in there.

Jonathan



Re: Enabling zchunk metadata generation in F30

2018-12-14 Thread Jonathan Dieter

On Fri, 2018-12-14 at 19:24 +0000, Jonathan Dieter wrote:
> On Fri, 2018-12-14 at 11:13 -0800, Kevin Fenzi wrote:
> > On 12/14/18 10:52 AM, Jonathan Dieter wrote:
> > > I suspect that the maintainers would like to see this feature tested
> > > more before pushing it to F29, but I can ask them, if you'd like.
> > 
> > No, I am not suggesting we implement it now in F29, I am saying that the
> > rawhide-composer and bodhi-backend01 machines that run pungi are Fedora
> > 29 hosts. The rawhide-composer runs in a chroot, so that should just
> > work, but the bodhi-backend01 updates pungi I don't think does, or if it
> > does it's a f29 chroot, not rawhide. So we will need a build for it.
> 
> Ok, makes sense.

While I'm thinking about it, fedora-repo-zdicts was just added to
Fedora yesterday and hasn't even been pushed to the testing
repositories yet.  It should be available in Rawhide though...

Jonathan



Re: Enabling zchunk metadata generation in F30

2018-12-14 Thread Jonathan Dieter

On Fri, 2018-12-14 at 11:13 -0800, Kevin Fenzi wrote:
> On 12/14/18 10:52 AM, Jonathan Dieter wrote:
> > On Thu, 2018-12-13 at 16:42 -0800, Kevin Fenzi wrote:
> > > Cool.
> > > 
> > > I see the new createrepo_c only has a rawhide build... any chance for a
> > > f29 update? Or should we look at building a newer in our infra repo?
> > 
> > I suspect that the maintainers would like to see this feature tested
> > more before pushing it to F29, but I can ask them, if you'd like.
> 
> No, I am not suggesting we implement it now in F29, I am saying that the
> rawhide-composer and bodhi-backend01 machines that run pungi are Fedora
> 29 hosts. The rawhide-composer runs in a chroot, so that should just
> work, but the bodhi-backend01 updates pungi I don't think does, or if it
> does it's a f29 chroot, not rawhide. So we will need a build for it.

Ok, makes sense.

> > If you do a F29 rebuild for the infra repo, you'll want to make sure to
> > pass --with zchunk to rpmbuild, as it defaults to off for anything F29
> > and below.
> 
> Well, we cannot do that with a koji build, but I guess we could just
> change the source.

It's just a conditional in the spec that's currently set to off for <=
F29, so easy enough to change.

> Anyhow, perhaps we just target rawhide for now...

That works for me.  Thanks so much!  If there's anything else I can do
to help with this, please let me know.

Jonathan



Re: Enabling zchunk metadata generation in F30

2018-12-14 Thread Jonathan Dieter

On Fri, 2018-12-14 at 11:20 -0500, Randy Barlow wrote:
> On Thu, 2018-12-13 at 22:56 +0000, Jonathan Dieter wrote:
> > The call to createrepo_c or mergerepo_c
> > (whichever is run last to generate the final metadata) would need to
> > be
> > run with the new zchunk arguments:
> > 
> > --zck --zck-dict-dir=/usr/share/fedora-repo-zdicts/f30
> 
> Hey Jonathan!
> 
> Bodhi uses createrepo_c both through pungi (to create the bulk of the
> repository) and through the createrepo_c Python bindings (to generate
> the updateinfo.xml file). Is there a way to ask the Python bindings to
> do this? The Fedora 29 updateinfo.xml file looks like it's only about 1

Hey Randy, at the moment the --zck option *only* applies to
primary.xml, filelists.xml and other.xml.  It should be pretty
straightforward to add it to the others, but I wanted to get those
three working first.

As for python bindings, they can read zchunk metadata just fine, but I
don't think I hooked up creating the metadata.  Where exactly does it
generate updateinfo in bodhi?  I'd like to see how the function is used
so I can implement it.

Jonathan



Re: Enabling zchunk metadata generation in F30

2018-12-14 Thread Jonathan Dieter

On Thu, 2018-12-13 at 16:42 -0800, Kevin Fenzi wrote:
> On 12/13/18 3:34 PM, Jonathan Dieter wrote:
> > On Thu, 2018-12-13 at 15:12 -0800, Kevin Fenzi wrote:
> > > pungi calls createrepo_c for us (in both rawhide/branched and updates)
> > > so we need a pungi patch (probably with a config option?) to enable
> > > this. If it's added as an optional thing we would need to add that
> > > setting to our pungi-fedora config and set it to on.
> > > 
> > > Can you file a pungi issue on that?
> > 
> > I've just checked the pungi issues, and it looks like Lubomír took care
> > of this at Flock last summer by adding an arbitrary createrepo_c
> > commands option: createrepo_extra_args
> > 
> > I've done a PR for the pungi-fedora config here, but it's untested and
> > I'm not sure if I did the variable substitution correctly.
> > 
> > https://pagure.io/pungi-fedora/pull-request/678
> 
> Cool.
> 
> I see the new createrepo_c only has a rawhide build... any chance for a
> f29 update? Or should we look at building a newer in our infra repo?

I suspect that the maintainers would like to see this feature tested
more before pushing it to F29, but I can ask them, if you'd like.

If you do a F29 rebuild for the infra repo, you'll want to make sure to
pass --with zchunk to rpmbuild, as it defaults to off for anything F29
and below.

Jonathan



Re: Enabling zchunk metadata generation in F30

2018-12-13 Thread Jonathan Dieter

On Thu, 2018-12-13 at 15:12 -0800, Kevin Fenzi wrote:
> pungi calls createrepo_c for us (in both rawhide/branched and updates)
> so we need a pungi patch (probably with a config option?) to enable
> this. If it's added as an optional thing we would need to add that
> setting to our pungi-fedora config and set it to on.
> 
> Can you file a pungi issue on that?

I've just checked the pungi issues, and it looks like Lubomír took care
of this at Flock last summer by adding an arbitrary createrepo_c
commands option: createrepo_extra_args

I've done a PR for the pungi-fedora config here, but it's untested and
I'm not sure if I did the variable substitution correctly.

https://pagure.io/pungi-fedora/pull-request/678

Jonathan



Enabling zchunk metadata generation in F30

2018-12-13 Thread Jonathan Dieter

Createrepo_c in F30 has finally grown zchunk support and I've packaged
up some zdicts that we can use for F30/rawhide, so I'd love to see us
start building zchunk metadata for F30.

To enable zchunk metadata generation, whichever systems are running
createrepo_c/mergerepo_c for Rawhide would need createrepo_c-0.12 and
fedora-repo-zdicts installed.  The call to createrepo_c or mergerepo_c
(whichever is run last to generate the final metadata) would need to be
run with the new zchunk arguments:

--zck --zck-dict-dir=/usr/share/fedora-repo-zdicts/f30

Mergerepo doesn't require zchunk metadata in the source repositories to
be able to generate zchunk metadata for the merged repository.
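
As a rough illustration of the invocation described above, here is a minimal Python sketch that builds the command line.  Only the two zchunk flags and the dictionary path come from this message; the helper name and the repository path in the test are made up for the example, and this is not how pungi actually assembles its createrepo_c call.

```python
# Dictionary root as quoted in this message.
ZDICT_ROOT = "/usr/share/fedora-repo-zdicts"


def createrepo_zck_args(repo_dir, release):
    """Return an argv list for createrepo_c with zchunk output enabled.

    `release` is the dictionary subdirectory name, e.g. "f30".
    `createrepo_zck_args` is a hypothetical helper, not a pungi function.
    """
    return [
        "createrepo_c",
        "--zck",
        f"--zck-dict-dir={ZDICT_ROOT}/{release}",
        repo_dir,
    ]
```

The list can be handed to subprocess.run() as-is; the same two flags would be appended to a mergerepo_c call if that is the tool run last.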

I'm not sure who to ask to turn these flags on, so if there's an
individual I need to ping, please point me in the right direction.

A huge thank you to Daniel Mach for reviewing and merging the
createrepo_c zchunk pull request, Jaroslav Mracek for building
createrepo_c with zchunk, Robert-André Mauchin for reviewing fedora-
repo-zdicts and Neal Gompa for keeping things moving forward with the
PR reviews.

Thanks,
Jonathan

Best path for Fedora zchunk dictionaries

2018-12-08 Thread Jonathan Dieter

I'm currently in the process of packaging up fedora-repo-zdicts[1], a
package which will contain the zchunk dictionaries for all active
Fedora releases.

When running createrepo_c or mergerepo_c with zchunk support, the
directory containing the zdicts is passed in and createrepo_c will
choose the right zdict for each metadata file.  The form of that
directory is /usr/share/fedora-repo-dicts/<release>, where <release> is
the release that the metadata is being generated for.

My question is what <release> should actually look like.  This is very
Fedora specific, so I want to choose whatever the easiest variable is
for infra to pass to createrepo_c or mergerepo_c.

Currently <release> is set to PLATFORM_ID in /etc/os-release (so,
platform:fedora-30 for Rawhide), but that's probably overly generic.

What would be a better pattern for <release>?  fedora-30? f30? Just 30?
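
To make the candidate spellings concrete, here is a small Python sketch that reads PLATFORM_ID out of os-release style text and derives the three alternatives floated above.  The function names are hypothetical, and the sample value is simply the one quoted in this message.

```python
import re


def platform_id(os_release_text):
    """Return the PLATFORM_ID value from os-release style text, or None."""
    for line in os_release_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "PLATFORM_ID":
            return value.strip().strip('"')
    return None


def candidate_names(pid):
    """Derive the spellings "fedora-N", "fN" and "N" from a PLATFORM_ID."""
    match = re.search(r"(\d+)$", pid)
    if match is None:
        raise ValueError(f"no trailing release number in {pid!r}")
    number = match.group(1)
    return [f"fedora-{number}", f"f{number}", number]
```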

Jonathan

Re: Patch review request: zchunk patches for dnf, libsolv and librepo

2018-06-13 Thread Jonathan Dieter

On Tue, 2018-06-12 at 12:21 +0300, Jonathan Dieter wrote:
> I would love to get these changes into Fedora 29, and the code is
> testable now, but with only three weeks until System-Wide change
> proposals are due, I'm not sure if I'm being ambitious.

FWIW, I have a COPR available for F28 and Rawhide with zchunk-enabled
dnf/libdnf and the supporting libraries.

https://copr.fedorainfracloud.org/coprs/jdieter/dnf-zchunk/

Obviously, you'll need a zchunk-enabled repository to test it.

Jonathan
List Archives: 
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/6SICXCLNH37TIIBZOKHHN2MFE7METTOX/

Re: [Rpm-ecosystem] Patch review request: zchunk patches for dnf, libsolv and librepo

2018-06-13 Thread Jonathan Dieter

On Tue, 2018-06-12 at 05:24 -0400, Neal Gompa wrote:
> On Tue, Jun 12, 2018 at 5:21 AM Jonathan Dieter wrote:
> > 
> > I've finally finished writing patches to integrate zchunk support into
> > dnf/libsolv/librepo[1], and I'd greatly appreciate some code review.  A
> > vast majority of the code is in librepo, but libsolv has been expanded
> > to support zchunk files and dnf has a tiny patch that passes the base
> > cache directory to librepo to find source zchunk files to delta
> > against.
> > 
> 
> This is awesome, but we're missing patches for libdnf and
> createrepo_c. PackageKit and microdnf rely on libdnf for all of this,
> and no one can create zck rpmmd without a suitably enhanced
> createrepo_c. Could you please make PRs against both for that? :)

And I've done libdnf:

https://github.com/rpm-software-management/libdnf/pull/478

Jonathan
List Archives: 
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/VXD63P7HB2BGPVLU7MVKB42TNU3OFBDJ/

Re: [Rpm-ecosystem] Patch review request: zchunk patches for dnf, libsolv and librepo

2018-06-12 Thread Jonathan Dieter

On Tue, 2018-06-12 at 05:24 -0400, Neal Gompa wrote:
> On Tue, Jun 12, 2018 at 5:21 AM Jonathan Dieter wrote:
> > 
> > I've finally finished writing patches to integrate zchunk support into
> > dnf/libsolv/librepo[1], and I'd greatly appreciate some code review.  A
> > vast majority of the code is in librepo, but libsolv has been expanded
> > to support zchunk files and dnf has a tiny patch that passes the base
> > cache directory to librepo to find source zchunk files to delta
> > against.
> > 
> 
> This is awesome, but we're missing patches for libdnf and
> createrepo_c. PackageKit and microdnf rely on libdnf for all of this,
> and no one can create zck rpmmd without a suitably enhanced
> createrepo_c. Could you please make PRs against both for that? :)

Here's createrepo_c.  It was failing the python testcases and I wanted
those fixed before putting out the pull request.

https://github.com/rpm-software-management/createrepo_c/pull/92

I'm working on libdnf.

Jonathan
List Archives: 
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/XSSVYKSXZA5SQBF57ZSZMSM6ZYKB4LBI/

Patch review request: zchunk patches for dnf, libsolv and librepo

2018-06-12 Thread Jonathan Dieter

I've finally finished writing patches to integrate zchunk support into
dnf/libsolv/librepo[1], and I'd greatly appreciate some code review.  A
vast majority of the code is in librepo, but libsolv has been expanded
to support zchunk files and dnf has a tiny patch that passes the base
cache directory to librepo to find source zchunk files to delta
against.

With these patches and a zchunk-enabled repository, you will download
only the differences in your primary/filelists/other metadata.  Partial
downloads are validated and then continued, and each chunk is validated
as it's downloaded, ending the download and moving to a new mirror if
any chunk is corrupt.
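
The download behaviour described above can be sketched in a few lines of Python.  This is only an illustration of the idea, not the librepo code: it assumes chunks are identified by hex SHA-256 digests, and both function names are invented for the example.

```python
import hashlib


def plan_download(new_index, local_chunks):
    """Split the chunks named in `new_index` into reusable and missing ones.

    new_index    -- list of hex SHA-256 digests, in file order
    local_chunks -- dict mapping digest -> chunk bytes already on disk
    """
    reuse, fetch = [], []
    for digest in new_index:
        (reuse if digest in local_chunks else fetch).append(digest)
    return reuse, fetch


def verify_chunk(expected_digest, data):
    """Return True when a downloaded chunk matches its index entry."""
    return hashlib.sha256(data).hexdigest() == expected_digest
```

A client following this pattern would request only the digests in the second list, check each one with verify_chunk() as it arrives, and switch mirrors on the first mismatch.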

I would love to get these changes into Fedora 29, and the code is
testable now, but with only three weeks until System-Wide change
proposals are due, I'm not sure if I'm being ambitious.

On another note, I would also like to finalize zchunk's API and make a
stable ABI promise, but before I take that step I'd really love some
feedback on its usability.

Jonathan

[1] https://github.com/rpm-software-management/dnf/pull/1107
https://github.com/openSUSE/libsolv/pull/270
https://github.com/rpm-software-management/librepo/pull/127
List Archives: 
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/KLY7TOD6JCBRLVD76NYDYBE27LNOENOB/

Old deltarpms are being thrown away on each compose (was Re: dnf and deltarpm)

2018-05-31 Thread Jonathan Dieter

On Thu, 2018-05-31 at 22:34 +0100, Tomasz Kłoczko wrote:
> Just checked on few mirrors usual location of f28 updates
> (/pub/linux/dist/fedora/linux/updates/28/Everything/x86_64/drpms) and
> in this directory there are at the moment only 56 files from May 31
> and nothing older. So not two drpm per RPM package but only generated
> files out of last batch of updates.

(CC'ing the Fedora infrastructure list)

This is a bug.  It looks like we're throwing away any drpms not
generated in this compose, when we should be keeping them.

I've just created:
https://pagure.io/fedora-infrastructure/issue/7008

Jonathan
List Archives: 
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/DSHBJB2D3ZSXMSFIZY7MZELKSI6PUAMA/

Librepo/dnf zchunk integration question

2018-05-31 Thread Jonathan Dieter

Zchunk works by comparing an old version of the file with the one you
want to download, but when dnf refreshes a repository, it downloads the
new file into a temporary directory with no information passed to the
handle about where the old files are.

I've been trying to keep my code changes in libsolv and librepo to make
zchunk integration as universal as possible.  Up until now, I have
managed to do so without changing librepo's API, but I don't see any
way to fix this except to have dnf pass information about the old
directory (or, even better, the cache directory) to the handle, which
will mean an API change.

It would also mean that other utilities would probably need to do the
same.  Is there something I'm missing in dnf's interaction with librepo
that would allow me to work around this, or do I just need to bite the
bullet and propose a librepo API change?

Jonathan
List Archives: 
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/message/P4ODUK7EMQCOZ5YXULE4KEP2XFVO6KKL/

Re: Zchunk update

2018-04-30 Thread Jonathan Dieter

I've released zchunk-0.4.0 which has the last (hopefully) backwards-
incompatible file format change.  Files created by zchunk < 0.4.0 will
be unreadable by 0.4.0+.

Zchunk 0.4.0 now has four bytes of flags, so, barring any bone-headed
disasters in the file format, any further file format changes will be
backwards-compatible.

The latest release is available here:
https://github.com/jdieter/zchunk/archive/0.4.0.tar.gz

The file format is documented here:
https://github.com/jdieter/zchunk/blob/master/zchunk_format.txt

A copr with the latest release (and zchunk-enabled createrepo_c) is
here:
https://copr.fedorainfracloud.org/coprs/jdieter/zchunk

My next step is to add zchunk support to librepo.

A quick summary of the features I wanted to add:

On Mon, 2018-04-16 at 15:47 +0300, Jonathan Dieter wrote:
>  * A python API

Still needs to be done.

>  * GPG signatures in addition to (possibly replacing) overall data
>    checksum

Signatures have now been added to the file format in addition to the
overall checksum.  The current implementation can't actually read or
add a signature, though.

>  * An expiry field? (I'm obviously thinking about signed repodata here)

As per feedback, this isn't necessary.

>  * Tests
>  * More tests

The framework is in place for this, and I have added a single test
case.  More to come.

>  * Other arch testing (it's currently only tested on x86_64)

I've built and tested on ARM, ppc64le, i686 and x86_64 and everything
seems to be working just fine.  I have not yet tested on aarch64.

Jonathan

Re: [Rpm-ecosystem] Zchunk update

2018-04-23 Thread Jonathan Dieter

On Mon, 2018-04-23 at 00:27 -0400, Neal Gompa wrote:
> On Tue, Apr 17, 2018 at 3:05 PM, Jonathan Dieter <jdie...@gmail.com> wrote:
> > I'm assuming that you're referring here to getting zchunk packaged into
> > Fedora.  I'd really like to finalize the file format (we're close, but
> > I still need a good way of storing signatures in it) and the download
> > API before releasing it into Fedora proper.
> > 
> 
> I'm looking forward to this!

I've updated the file format to allow for multiple signatures, updated
the zchunk code to recognize the existence of a signature (while still
not checking it), and have released as zchunk-0.3.0 in COPR.  I've also
added in 32 bits of flags that we can use to extend the format in a
backwards-compatible way.

The current zchunk format description is at:
https://github.com/jdieter/zchunk/blob/master/zchunk_format.txt

> > I would recommend using the dicts mentioned above as they give me over
> > 40% space savings for both other.xml.zck and primary.xml.zck.  Do
> > please let me know if you run into any problems.
> > 
> 
> Are those dictionaries Fedora specific? If so, how can other
> distributions generate similar ones? If not, still, how were they
> made? :)

They were generated from Fedora metadata, but they should help with any
distribution's repodata.  I generated them by splitting a few days'
worth of metadata along package boundaries, stripping out any
checksums, and then running zstd --train * on the directory containing
the split metadata.  The script I used is available at
https://www.jdieter.net/downloads/zchunk-dicts/split.py, and I hope to
write up proper instructions at some point.
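
Until those instructions exist, here is a minimal sketch of the splitting step in Python.  It is not the split.py linked above, just an illustration of the approach under some assumptions: it only handles primary.xml-style metadata, where each package is a <package>...</package> element with its checksum in a <checksum> child, and it uses regular expressions rather than a real XML parser.

```python
import re

# One match per package element; \b keeps <packager> from matching.
PACKAGE_RE = re.compile(r"<package\b.*?</package>", re.DOTALL)
# Checksums change with every build, so they only add noise to training.
CHECKSUM_RE = re.compile(r"<checksum\b[^>]*>.*?</checksum>", re.DOTALL)


def split_packages(xml_text):
    """Return one string per package, with <checksum> elements removed."""
    return [CHECKSUM_RE.sub("", m.group(0))
            for m in PACKAGE_RE.finditer(xml_text)]
```

Each returned string would be written to its own file in a scratch directory, and zstd --train * run there as described above.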

Jonathan

Re: Zchunk update

2018-04-17 Thread Jonathan Dieter

On Tue, 2018-04-17 at 17:39 +0200, Michal Novotny wrote:
> On Tue, Apr 17, 2018 at 4:20 PM, Jonathan Dieter <jdie...@gmail.com> wrote:
> > On Tue, 2018-04-17 at 09:08 +0200, Michal Novotny wrote:
> > > Hello Jonathan,
> > > 
> > > Once it is in createrepo_c, we could try employing it in Fedora COPR.
> > 
> > Ok, done.  This copr currently has zchunk and createrepo_c in it.  I
> > did have to disable the python tests for createrepo_c which means I
> > probably wouldn't use the python bindings with this release.
> > 
> > https://copr.fedorainfracloud.org/coprs/jdieter/zchunk/
> > 
> > To enable zchunk creation, run createrepo_c --zck.  I've created
> > dictionaries that are appropriate for Fedora's metadata at
> > https://www.jdieter.net/downloads/zchunk-dicts, and they can be used
> > with --zck-primary-dict, --zck-filelists-dict and --zck-other-dict.
> > 
> > To make zchunk downloads efficient, the same dictionary must be used
> > each time metadata is generated.  Dictionaries aren't mandatory, but
> > they greatly reduce the size of the compressed metadata.
> 
> Alright, I will deploy it on staging. But we will need to get it into 
> Fedora's 
> DistGit first to be able to use it on COPR production instance afterwards...
> Anyway, looking forward to start experimenting with it.
> 
> Thank you!

I'm assuming that you're referring here to getting zchunk packaged into
Fedora.  I'd really like to finalize the file format (we're close, but
I still need a good way of storing signatures in it) and the download
API before releasing it into Fedora proper.

I would recommend using the dicts mentioned above as they give me over
40% space savings for both other.xml.zck and primary.xml.zck.  Do
please let me know if you run into any problems.

Thanks,
Jonathan

Re: Zchunk update

2018-04-17 Thread Jonathan Dieter

On Tue, 2018-04-17 at 09:08 +0200, Michal Novotny wrote:
> Hello Jonathan,
> 
> On Mon, Apr 16, 2018 at 2:47 PM, Jonathan Dieter <jdie...@gmail.com>
> wrote:
> > It's been a number of weeks since my last update, so I thought I'd
> > let
> > everyone know where things are at.
> > 
> > I've spent most of these last few weeks reworking zchunk's API to
> > make
> > it easier to use and more in line with what other compression tools
> > use, and I'm mostly happy with it now.  Writing a simple zchunk
> > file
> > can be done in a few lines of code, while reading one is also
> > simple.
> > I've also added zchunk support to createrepo_c (see 
> > https://github.com/jdieter/createrepo_c), but I haven't yet created
> > a
> > pull request because I'm not sure if my current implementation is
> > the
> > best method.  My current effort only zchunks primary.xml,
> > filelists.xml
> > and other.xml and doesn't change the sort order. 
> 
> Once it is in createrepo_c, we could try employing it in Fedora COPR.

Ok, done.  This copr currently has zchunk and createrepo_c in it.  I
did have to disable the python tests for createrepo_c which means I
probably wouldn't use the python bindings with this release.

https://copr.fedorainfracloud.org/coprs/jdieter/zchunk/

To enable zchunk creation, run createrepo_c --zck.  I've created
dictionaries that are appropriate for Fedora's metadata at
https://www.jdieter.net/downloads/zchunk-dicts, and they can be used
with --zck-primary-dict, --zck-filelists-dict and --zck-other-dict.

To make zchunk downloads efficient, the same dictionary must be used
each time metadata is generated.  Dictionaries aren't mandatory, but
they greatly reduce the size of the compressed metadata.

Jonathan

Re: [Rpm-ecosystem] Zchunk update

2018-04-16 Thread Jonathan Dieter

On Mon, 2018-04-16 at 09:00 -0400, Neal Gompa wrote:
> On Mon, Apr 16, 2018 at 8:47 AM, Jonathan Dieter <jdie...@gmail.com> wrote:
> > I've also added zchunk support to createrepo_c (see
> > https://github.com/jdieter/createrepo_c), but I haven't yet created a
> > pull request because I'm not sure if my current implementation is the
> > best method.  My current effort only zchunks primary.xml, filelists.xml
> > and other.xml and doesn't change the sort order.
> > 
> 
> Fedora COPR, Open Build Service, Mageia, and openSUSE also append
> AppStream data to repodata to ship AppStream information. Is there a
> way we can incorporate this into zck rpm-md? There's been an issue for
> a while to support generating the AppStream metadata as part of the
> createrepo_c run using the libappstream-builder library[1], which may
> lend itself to doing this properly.

Is it repomd.xml that actually gets changed or primary.xml /
filelists.xml / other.xml?

If it's repomd.xml, then it really shouldn't make any difference
because I'm not currently zchunking it.  As far as I can see, the only
reason to zchunk it would be to have an embedded GPG signature once
they're supported in zchunk.

> > The one area of zchunk that still needs some API work is the download
> > and chunk merge API, and I'm planning to clean that up as I add zchunk
> > support to librepo.
> > 
> > Some things I'd still like to add to zchunk:
> >  * A python API
> >  * GPG signatures in addition to (possibly replacing) overall data
> >    checksum
> 
> I'd rather not lose checksums, but GPG signatures would definitely be
> necessary, as openSUSE needs them, and we'd definitely like to have
> them in Fedora[2], COPR[3], and Mageia[4].

Fair enough.  Would we want zchunk to support multiple GPG signatures
or is one enough?

> >  * An expiry field? (I'm obviously thinking about signed repodata here)
> 
> Do we need an expiry field if we properly processed the key
> revocation/expiration in librepo? My understanding is that current
> hiccup with it is that we don't, and that the GPG keyring used in
> librepo is independent of the RPM keyring (which it shouldn't be).

Ah, that makes sense.  Forget that idea then.

Jonathan

Proposed zchunk file format - V4

2018-04-16 Thread Jonathan Dieter

Here's version four with a swap from fixed-length integers to variable-
length compressed integers which allow us to skip compression of the
index (since the non-integer data is all incompressible checksums).
I've also added the uncompressed size of each chunk to the index to
make it easier to figure out how much space to allocate for the
uncompressed chunk.

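+-+-+-+-+-+====================+=================+=======================+
|   ID    | Checksum type (ci) | Header checksum | Compression type (ci) |
+-+-+-+-+-+====================+=================+=======================+

+=================+=======+=================+
| Index size (ci) | Index | Compressed Dict |
+=================+=======+=================+

+===========+===========+
|   Chunk   |   Chunk   | ==> More chunks
+===========+===========+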

(ci)
 Compressed (unsigned) integer - A variable-length little-endian
 integer where the lowest seven bits of the number are stored in the
 first byte, followed by the next seven bits in the next byte, and so
 on.  The top bit of all bytes except the final byte must be zero, and
 the top bit of the final byte must be one, indicating the end of the
 number.
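
 As an illustration of the encoding just described, here is a short
 Python sketch (not the zchunk C implementation; the function names are
 made up for this example):

```python
def encode_ci(value):
    """Encode a non-negative integer as a compressed integer."""
    if value < 0:
        raise ValueError("compressed integers are unsigned")
    out = bytearray()
    while True:
        low = value & 0x7F           # next seven bits, lowest first
        value >>= 7
        if value == 0:
            out.append(low | 0x80)   # final byte: top bit set
            return bytes(out)
        out.append(low)              # more bytes follow: top bit clear


def decode_ci(buf, offset=0):
    """Decode one compressed integer; return (value, next_offset)."""
    value, shift = 0, 0
    for pos in range(offset, len(buf)):
        byte = buf[pos]
        value |= (byte & 0x7F) << shift
        if byte & 0x80:              # top bit set marks the end
            return value, pos + 1
        shift += 7
    raise ValueError("truncated compressed integer")
```

 With this scheme, 300 encodes to the two bytes 0x2c 0x82, and any value
 below 128 fits in a single byte.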

ID
 '\0ZCK1', identifies file as zchunk version 1 file

Checksum type
 This is an integer containing the type of checksum
 used to generate the header checksum and the total data checksum, but
 *not* the chunk checksums.

 Current values:
   0 = SHA-1
   1 = SHA-256

Header checksum
 This is the checksum of everything from the beginning of the file
 until the end of the index, calculated with the header checksum field
 itself set to all \0's.

Compression type
 This is an integer containing the type of compression used to
 compress dict and chunks.

 Current values:
   0 = Uncompressed
   2 = zstd

Index size
 This is an integer containing the size of the index.

Index
 This is the index, which is described in the next section.

Compressed Dict (optional)
 This is a custom dictionary used when compressing each chunk.
 Because each chunk is compressed completely separately from the
 others, the custom dictionary gives us much better overall
 compression.  The custom dictionary is compressed without a custom
 dictionary (for obvious reasons).

Chunk
 This is a chunk of data, compressed with the custom dictionary
 provided above.


The index:

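+==========================+==================+===============+
| Chunk checksum type (ci) | Chunk count (ci) | Data checksum |
+==========================+==================+===============+

+===============+==================+===============================+
| Dict checksum | Dict length (ci) | Uncompressed dict length (ci) |
+===============+==================+===============================+

+================+===================+==========================+
| Chunk checksum | Chunk length (ci) | Uncompressed length (ci) | ...
+================+===================+==========================+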

Chunk checksum type
 This is an integer containing the type of checksum used to generate
 the chunk checksums.

 Current values:
   0 = SHA-1
   1 = SHA-256

Chunk count
 This is a count of the number of chunks in the zchunk file.

Data checksum
 This is the checksum of everything after the index, including the
 compressed dict and all the compressed chunks.  This checksum is
 generated using the overall checksum type, *not* the chunk checksum
 type.

Dict checksum
 This is the checksum of the compressed dict, used to detect whether
 two dicts are identical.  If there is no dict, the checksum must be
 all zeros.

Dict length
 This is an integer containing the length of the dict.  If there is no
 dict, this must be a zero.

Uncompressed dict length
 This is an integer containing the length of the dict after it has
 been decompressed.  If there is no dict, this must be a zero.

Chunk checksum
 This is the checksum of the compressed chunk, used to detect whether
 any two chunks are identical.

Chunk length
 This is an integer containing the length of the chunk.

Uncompressed length
 This is an integer containing the length of the chunk after it has
 been decompressed.

The index is designed to be able to be extracted from the file on the
server and downloaded separately, to facilitate downloading only the
parts of the file that are needed, but must then be re-embedded when
assembling the file so the user only needs to keep one file.
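
To show how the index layout above hangs together, here is a minimal
Python parser sketch.  It is an illustration only, not the zchunk
library code, and it rests on two assumptions the description does not
spell out: that the dict checksum uses the chunk checksum type, and that
the chunk count does not include the dict entry.  The function names are
invented for the example.

```python
import hashlib

# Digest sizes for the checksum type values listed in this message.
DIGEST_SIZE = {0: hashlib.sha1().digest_size,
               1: hashlib.sha256().digest_size}


def read_ci(buf, pos):
    """Decode one compressed integer starting at buf[pos]."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80:              # top bit set marks the final byte
            return value, pos
        shift += 7


def parse_index(buf, overall_checksum_type):
    """Parse an extracted index into a dict of its fields."""
    pos = 0
    chunk_type, pos = read_ci(buf, pos)
    chunk_count, pos = read_ci(buf, pos)

    # The data checksum uses the overall type, not the chunk type.
    size = DIGEST_SIZE[overall_checksum_type]
    data_checksum, pos = buf[pos:pos + size], pos + size

    # Assumption: the dict entry comes first and uses the chunk type.
    size = DIGEST_SIZE[chunk_type]
    entries = []
    for _ in range(chunk_count + 1):
        checksum, pos = buf[pos:pos + size], pos + size
        length, pos = read_ci(buf, pos)
        uncompressed, pos = read_ci(buf, pos)
        entries.append((checksum, length, uncompressed))

    return {
        "chunk_checksum_type": chunk_type,
        "data_checksum": data_checksum,
        "dict": entries[0],
        "chunks": entries[1:],
    }
```

Summing the chunk lengths in order gives each chunk's byte offset after
the index, which is what a client would need to build range requests.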

Zchunk update

2018-04-16 Thread Jonathan Dieter

It's been a number of weeks since my last update, so I thought I'd let
everyone know where things are at.

I've spent most of these last few weeks reworking zchunk's API to make
it easier to use and more in line with what other compression tools
use, and I'm mostly happy with it now.  Writing a simple zchunk file
can be done in a few lines of code, while reading one is also simple.

I've also added zchunk support to createrepo_c (see 
https://github.com/jdieter/createrepo_c), but I haven't yet created a
pull request because I'm not sure if my current implementation is the
best method.  My current effort only zchunks primary.xml, filelists.xml
and other.xml and doesn't change the sort order.

The one area of zchunk that still needs some API work is the download
and chunk merge API, and I'm planning to clean that up as I add zchunk
support to librepo.

Some things I'd still like to add to zchunk:
 * A python API
 * GPG signatures in addition to (possibly replacing) overall data
   checksum
 * An expiry field? (I'm obviously thinking about signed repodata here)
 * Tests
 * More tests
 * Other arch testing (it's currently only tested on x86_64)

I'd welcome any feedback or flames.

Jonathan

Re: Proposed zchunk file format - V3

2018-03-22 Thread Jonathan Dieter

CC'ing fedora-infrastructure, as I think they got lost somewhere along
the way.

On Tue, 2018-03-20 at 17:04 +0100, Michal Domonkos wrote:

> Yeah, the level doesn't really matter much.  My point was, as long as
> we chunk, some of the data that we will be downloading we will already
> have locally.  Typically (according to mdhist), it seems that package
> updates are more common than new additions, so we won't be reusing the
> unchanged parts of package tags.  But that's inevitable if we're
> chunking.

Ok, I see your point, and you're absolutely right.

> > The beauty of the zchunk format (or zsync, or any other chunked format)
> > is that we don't have to download different files based on what we
> > have, but rather, we download either fewer or more parts of the same
> > file based on what we have.  From the server side, we don't have to
> > worry about the deltas, and the clients just get what they need.
> 
> +1
> 
> Simplicity is key, I think.  Even at the cost of not having the
> perfectly efficient solution.  The whole packaging stack is already
> complicated enough.

+1000 on that last!

> While I'm not completely sure about application-specific boundaries
> being superior to buzhash (used by casync) in terms of data savings,
> it's clear that using http range requests and concatenating the
> objects together in a smart way (as you suggested previously) to
> reduce the number of HTTP requests is a good move in the right
> direction.

Just to be clear, zchunk *could* use buzhash.  There's no rule about
where the boundaries need to be, only that the application creating the
zchunk file is consistent.  I'd actually like to make the command-line
utility use buzhash, but I'm trying to keep the code BSD 2-clause, so I
can't just lift casync's buzhash code, and I haven't had time to write
that part myself.  

Currently zck.c has a really ugly if statement that chooses a division
based on string matching if it's true and a really naive inefficient
rolling hash if it's false.  If you wanted to contribute buzhash, I'd
happily take it!
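
For anyone following along, here is a rough Python sketch of what a content-defined split with a rolling hash can look like.  It is not buzhash and it is not the code in zck.c (which isn't shown in this thread); the function name, window size, mask and minimum chunk size are all made up for illustration.

```python
def split_chunks(data: bytes, window: int = 16, mask: int = 0x3FF, min_size: int = 64):
    """Cut data wherever a rolling sum of the last `window` bytes matches a pattern.

    This is a deliberately naive rolling sum (an assumption for illustration),
    not buzhash.  A boundary depends on nearby bytes and on where the current
    chunk started, not on the absolute position in the file.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            # Drop the byte that just slid out of the window.
            rolling -= data[i - window]
        if i + 1 - start >= min_size and (rolling & mask) == mask:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        # Whatever is left over becomes the final chunk.
        chunks.append(data[start:])
    return chunks
```

The only hard requirement is the one stated above: the same input must always be split in the same places.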

> BTW, in the original thread, you mentioned a reduction of 30-40% when
> using casync.  I'm wondering, how did you measure it?  I saw chunk
> reuse ranging from 80% to 90% per metadata update, which seemed quite
> optimistic.  What I did was:
> 
> $ casync make snap1.caidx /path/to/repodata/snap1
> $ casync make --verbose snap2.caidx /path/to/repodata/snap2
> 
> Reused chunks: X (Y%)
> 

IIRC, I went into the web server logs and measured the number of bytes
that casync actually downloaded as compared to the gzip size of the
data.

Thanks so much for your interest!

Jonathan

Re: Initial pre-alpha version of zchunk available for testing and comments

2018-03-22 Thread Jonathan Dieter

On Thu, 2018-03-22 at 11:55 +0200, Jonathan Dieter wrote:
> I've got a working zchunk library, complete with some utilities at
> https://github.com/jdieter/zchunk, but I wanted to get some feedback
> before I went much further.  Its only dependencies are libcurl and
> (optionally, but very heavily recommended) libzstd.

While I'm thinking about it, it uses meson as its build system, so
you'll need that too.

Jonathan

Initial pre-alpha version of zchunk available for testing and comments

2018-03-22 Thread Jonathan Dieter

I've got a working zchunk library, complete with some utilities at
https://github.com/jdieter/zchunk, but I wanted to get some feedback
before I went much further.  Its only dependencies are libcurl and
(optionally, but very heavily recommended) libzstd.

There are test files in https://www.jdieter.net/downloads/zchunk-test,
and the dictionary I used is in https://www.jdieter.net/downloads.

What works:
 * Creating zchunk files (using zck)
 * Reading zchunk files (using unzck)
 * Downloading zchunk files (using zckdl)

What doesn't:
 * Resuming zchunk downloads
 * Using any of the tools to overwrite a file
 * Automatic maximum ranges in request detection
 * Streaming chunking in the library

The main thing I want to ask for advice on is the last item on that
last list.  Currently, every piece of data sent to zck_compress() is
treated as a new chunk.

I'd prefer to have zck_compress() just keep streaming data and have a
zck_end_chunk() function that ends the current chunk, but zstd doesn't
support streamed compression with a dict in its dynamic library.  You
have to use zstd's static library to get that function (because it's
not seen as stable yet).

Any suggestions on how to deal with this?  Should I require the static
library, write my own wrapper that buffers the streamed data until
zck_end_chunk() is called, or just require each chunk to be sent in its
entirety?
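
To make the second option concrete, here is a minimal Python sketch of a buffering wrapper.  It uses zlib from the standard library as a stand-in for zstd, and the class and method names are hypothetical (they only mirror the idea of zck_compress() and zck_end_chunk(); they are not the zchunk API).

```python
import zlib


class ChunkWriter:
    """Buffer streamed data and compress it only when the chunk ends."""

    def __init__(self, zdict: bytes):
        self._zdict = zdict
        self._pending = bytearray()
        self.chunks = []  # one independently compressed blob per chunk

    def write(self, data: bytes) -> None:
        # Nothing is compressed yet; the data is just collected.
        self._pending += data

    def end_chunk(self) -> None:
        # A fresh compressor per chunk keeps every chunk independent.
        comp = zlib.compressobj(zdict=self._zdict)
        self.chunks.append(comp.compress(bytes(self._pending)) + comp.flush())
        self._pending.clear()
```

The obvious cost is memory: a whole chunk sits in the buffer until it is ended, which is probably fine for repodata-sized chunks.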

Jonathan


Proposed zchunk file format - V2

2018-02-19 Thread Jonathan Dieter

Neal, thanks for the feedback.  After taking your comments into
consideration, here's version 2.  

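+-+-+-+-+-+------------------+-+-+-+-+-+-+-+-+
|   ID    | Compression type |  Index size   |
+-+-+-+-+-+------------------+-+-+-+-+-+-+-+-+

+==================+=================+
| Compressed Index | Compressed Dict |
+==================+=================+

+===========+===========+
|   Chunk   |   Chunk   | ==> More chunks
+===========+===========+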

ID
 '\0ZCK1', identifies file as zchunk version 1 file

Compression type
 Type of compression used to compress dict and chunks

 Current values:
   0 - Uncompressed
   2 - zstd

Index size
 This is a 64-bit unsigned integer containing the size of compressed 
 index.

Compressed Index
 This is the index, which is described in the next section.  The index 
 is compressed without a custom dictionary.

Compressed Dict (optional)
 This is a custom dictionary used when compressing each chunk.
 Because each chunk is compressed completely separately from the
 others, the custom dictionary gives us much better overall
 compression.  The custom dictionary is compressed without a custom
 dictionary (for obvious reasons).

Chunk
 This is a chunk of data, compressed with the custom dictionary
 provided above.


The index:

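+---------------+======================+
| Checksum type | Checksum of all data |
+---------------+======================+

+================+-+-+-+-+-+-+-+-+
| Dict checksum  |  End of dict  |
+================+-+-+-+-+-+-+-+-+

+================+-+-+-+-+-+-+-+-+
| Chunk checksum | End of chunk  |  ==> More
+================+-+-+-+-+-+-+-+-+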

Checksum type
 This is the type of checksum used to generate the checksums in the 
 index.

 Current values:
   0 = SHA-256

Checksum of all data
 This is the checksum of the compressed dict and all the compressed 
 chunks, used to verify that the file is actually the same, even in 
 the unlikely event of a hash collision for one of the chunks

Dict checksum
 This is the checksum of the compressed dict, used to detect whether 
 two dicts are identical.  If there is no dict, the checksum must be
 all zeros.

End of dict
 This is the location of the end of the dict starting from the end of 
 the index.  This gives us the information we need to find and 
 decompress the dict.  If there is no dict, the checksum must be all
 zeros.

Chunk checksum
 This is the checksum of the compressed chunk, used to detect whether 
 any two chunks are identical.

End of chunk
 This is the location of the end of the chunk starting from the end of 
 the index.  This gives us the information we need to find and 
 decompress each chunk.


The index is designed to be able to be extracted from the file on the
server and downloaded separately, to facilitate downloading only the
parts of the file that are needed, but must then be re-embedded when
assembling the file so the user only needs to keep one file.
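
As a rough illustration of the fixed-size part of the header, here is a Python sketch that writes and reads it.  The proposal does not state the byte order or the width of the compression type field, so the little-endian layout and the single-byte compression type used here are assumptions, and the function names are made up.

```python
import struct

MAGIC = b"\0ZCK1"  # ID from the proposal: zchunk version 1


def pack_header(compression_type: int, index_size: int) -> bytes:
    # Assumed layout: 1-byte compression type, 64-bit unsigned index size,
    # little-endian, no padding.
    return MAGIC + struct.pack("<BQ", compression_type, index_size)


def parse_header(data: bytes):
    if data[:5] != MAGIC:
        raise ValueError("not a zchunk version 1 file")
    compression_type, index_size = struct.unpack_from("<BQ", data, 5)
    return compression_type, index_size
```

With the index size in hand, a client knows how many bytes after the header make up the compressed index, and every "end of" offset in the index is then counted from the end of that index.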

Proposed zchunk file format

2018-02-16 Thread Jonathan Dieter

So here's my proposed file format for the zchunk file.  Should I add
some flags to facilitate possible different compression formats?

+-+-+-+-+-+-+-+-+-+-+-+-+==================+=================+
|  ID   |  Index size   | Compressed Index | Compressed Dict |
+-+-+-+-+-+-+-+-+-+-+-+-+==================+=================+

+===========+===========+
|   Chunk   |   Chunk   | ==> More chunks
+===========+===========+

ID
 '\0ZCK', identifies file as zchunk file

Index size
 This is a 64-bit unsigned integer containing the size of compressed 
 index.

Compressed Index
 This is the index, which is described in the next section.  The index 
 is compressed using standard zstd compression without a custom 
 dictionary.

Compressed Dict
 This is a custom dictionary used when compressing each chunk.  
 Because each chunk is compressed completely separately from the 
 others, the custom dictionary gives us much better overall 
 compression.  The custom dictionary is compressed using standard zstd 
 compression without using a separate custom dictionary (for obvious 
 reasons).

Chunk
 This is a chunk of data, compressed using zstd with the custom 
 dictionary provided above.


The index:

+=============+-+-+-+-+-+-+-+-+
|  sha256sum  |  End of dict  |
+=============+-+-+-+-+-+-+-+-+

+=============+-+-+-+-+-+-+-+-+
|  sha256sum  | End of chunk  |  ==> More
+=============+-+-+-+-+-+-+-+-+

sha256sum of compressed dict
 This is a binary sha256sum of the compressed dict, used to detect
 whether two dicts are identical.

End of dict
 This is the location of the end of the dict with 0 being the end of
 the index.  This gives us the information we need to find and
 decompress the dict.

sha256sum of compressed chunk
 This is a binary sha256sum of the compressed chunk, used to detect
 whether any two chunks are identical.

End of chunk
 This is the location of the end of the chunk with 0 being the end of 
 the index.  This gives us the information we need to find and 
 decompress each chunk.


The index is designed to be able to be extracted from the file on the
server and downloaded separately, to facilitate downloading only the
parts of the file that are needed, but must then be re-embedded when
assembling the file so the user only needs to keep one file.

Re: A proof-of-concept for delta'ing repodata

2018-02-16 Thread Jonathan Dieter

On Tue, 2018-02-13 at 10:52 +0100, Igor Gnatenko wrote:
> What about zstd? Also in latest version of lz4 there is support for
> dictionaries too.

So I've investigated zstd, and here are my results:

Latest F27
primary.gz - 3.1MB

zlib zchunk (including custom dict)
primary.zck - 4.2MB ~35% increase

zstd zchunk (including dict generated from last three Fedora GA
primaries)
primary.zck - 3.7MB ~20% increase

Using zstd for filelists.xml has roughly the same increase as with
zlib, which is expected as the chunks are larger and thus get better
compression even without a dict.

I did also look briefly at lz4, but it seems that its major advantage
is speed, and I'm not sure that metadata decompression speed is our
main bottleneck in dnf.

With these numbers, I think it makes sense to move forward with zstd
instead of zlib.
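
For reference, the percentages above are just the size of each .zck relative to primary.gz.  A quick Python check (the function name is only for illustration):

```python
def percent_increase(new_mb: float, baseline_mb: float) -> float:
    """Size increase of new_mb over baseline_mb, in percent."""
    return (new_mb / baseline_mb - 1) * 100


# Sizes in MB, taken from the results above.
zlib_increase = percent_increase(4.2, 3.1)
zstd_increase = percent_increase(3.7, 3.1)
print(f"zlib zchunk: {zlib_increase:.0f}%  zstd zchunk: {zstd_increase:.0f}%")
```

The sizes are rounded to one decimal place, so the results only roughly match the ~35% and ~20% quoted.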

Jonathan


Re: A proof-of-concept for delta'ing repodata

2018-02-14 Thread Jonathan Dieter

On Wed, 2018-02-14 at 09:56 -0800, Kevin Fenzi wrote:
> ...snip...
> 
> I think it sounds interesting, but you should get buyin from dnf folks
> and/or PackageKit folks and see if they can agree to use this format.

Do you know if there's a dedicated list for dnf or PackageKit
development (a quick Google search didn't turn up anything), or should
I communicate with them directly?  If the latter, can you point me to
the right people?

> I also agree just adding it as a new file while leaving the rest alone
> sounds good as a way to migrate only those things that know to look for
> the new file when it exists, etc.

+1

Jonathan


Re: A proof-of-concept for delta'ing repodata

2018-02-13 Thread Jonathan Dieter

On Tue, 2018-02-13 at 10:52 +0100, Igor Gnatenko wrote:
> On Mon, 2018-02-12 at 23:53 +0200, Jonathan Dieter wrote:
> >  * Many changes to the metadata can mean a large number of ranges
> >    requested.  I ran a check on our mirrors, and three (out of around
> >    150 that had the file I was testing) don't honor range requests at
> >    all, and three others only honor a small number in a single request.
> >    A further seven didn't respond at all (not sure if that had
> >    anything to do with the range requests), and the rest supported
> >    between 256 and 512 ranges in a single request.  We can reduce the
> >    number of ranges requested by always ordering our packages by date.
> >    This would ensure that new packages are grouped at the end of the
> >    xml where they will be grabbed in one contiguous range.
> 
> This would "break" DNF, because libsolv is assigning Id's by the order of
> packages in metadata. So if something requires "webserver" and there is
> "nginx" and "httpd" providing it (without versions), then lowest Id is picked
> up (not going into details of this). Which means depending on when last update
> for one or other was submitted, users will get different results. This is
> unacceptable from my POV.

That's fair enough, though how hard would it be to change libsolv to
assign Id's based on alphabetical order as opposed to metadata order
(or possibly reorder the xml before sending it to libsolv)?  

To be clear, this optimization would reduce the number of range
requests we have to send to the server, but would not hugely change the
amount we download, so I don't think it's very high priority.

> >  * Zchunk files use zlib (it gives better compression than xz with such
> >    small chunks), but, because they use a custom zdict, they are not gz
> >    files.  This means that we'll need new tools to read and write them.
> >    (And I am volunteering to do the work here)
> 
> What about zstd? Also in latest version of lz4 there is support for
> dictionaries too.

I'll take a look at both of those.

> As being someone who tried to work on this problem I very appreciate what you
> have done here. We've started with using zsync and results were quite good,
> but zsync is dead and has ton of bugs. Also it requires archives to be
> `--rsyncable`. So my question is why not to add idx file as additional one for
> existing files instead of inventing new format? The problem is that we will
> have to distribute in old format too (for compatibility reasons).

I'm not sure if it was clear, but I'm basically making --rsyncable
archives with more intelligent divisions between the independent
blocks, which is why it gives better delta performance... you're not
getting *any* redundant data.

I did originally experiment with xz files (a series of concatenated xz
files is still a valid xz file), but the files were 20% larger than
zlib with custom zdict.

The zdict helps us reduce file size by allowing all the chunks to use
the same common strings that will not change (mainly tag names), but
custom zdicts aren't allowed by gzip.

I've also toyed with the idea of supporting embedded idx's in zchunk
files so we don't have to keep two files for every local zchunk file. 
We'd still want separate idx files on the webserver, though, otherwise
we're looking at an extra http request to get the size of the index in
the zchunk.  If we embed the index in the file, we must create a new
format as we don't want the index concatenated with the rest of the
uncompressed file when decompressing.

> I'm not sure if trying to do optimizations by XML tags is very good idea
> especially because I hope that in future we would stop distributing XML's and
> start distributing solv/solvx.

zchunk.py shouldn't care what type of data it's chunking, but it needs
to be able to chunk the same way every time.  Currently it only knows
how to do that with XML, because we can split it based on tag
boundaries, and grouping based on source rpm gives us even better
compression without sacrificing any flexibility.

dl_zchunk.py and unzchunk.py neither know, nor care what type of file
they're working with.

Thanks so much for the feedback, and especially for the pointers to lz4
and zstd.  Hopefully they'll get us closer to matching our current gz
size.

Jonathan


Re: A proof-of-concept for delta'ing repodata

2018-02-12 Thread Jonathan Dieter

On Mon, 2018-02-12 at 23:53 +0200, Jonathan Dieter wrote:
> <tl;dr>
> I've come up with a method of splitting repodata into chunks that can
> be downloaded and combined with chunks that are already on the local
> system to create a byte-for-byte copy of the compressed repodata. 
> Tools and scripts are at:
> https://www.jdieter.net/downloads/
> 

I've realized that with this, I didn't really give a proposal as to
what comes next.

Proposal:
 * Create a new (Systemwide?) Feature for Fedora 29 called Delta
   Repodata?
 * Finalize the zchunk file format (and name)
 * Write a C and python library to generate, read and download zchunk
   files, and package it into Fedora
 * Add a flag to createrepo_c (and, if we're still using it,
   createrepo) to generate zchunk repodata
 * Modify DNF so it can download and read zchunk repodata

I don't mind being the person to drive this change, but before I start
coding, I'd like some feedback on whether or not this is the direction
we want to go in and I'd love any suggestions on how to improve it.

Jonathan

A proof-of-concept for delta'ing repodata

2018-02-12 Thread Jonathan Dieter


I've come up with a method of splitting repodata into chunks that can
be downloaded and combined with chunks that are already on the local
system to create a byte-for-byte copy of the compressed repodata. 
Tools and scripts are at:
https://www.jdieter.net/downloads/


Background:
With DNF, we're currently downloading ~20MB of repository data every
time the updates repository changes.

When casync was released, I wondered if we could use it to only
download the deltas for the repodata.  At Flock last summer, I ran some
tests against the uncompressed repodata and saw a reduction of 30-40%
from one day to the next, which seemed low, but was a good starting
point.

Unfortunately, due to the way casync separates each file into thousands
of compressed chunks, building each file required thousands of (serial)
downloads which, even on a decent internet connection, took *forever*.

When I talked through the idea with Kevin and Patrick, they also
pointed out that our mirrors might not be too keen on the idea of
adding thousands of tiny files that change every day.


The Solution(?):
One potential solution to the "multitude of files" problem is to merge
the chunks back into a single file, and use HTTP ranges to only
download the parts of the file we want.  An added bonus is that most
web servers are configured to support hundreds of ranges in one
request, which greatly reduces the number of requests we have to make.
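
A small Python sketch of how the wanted byte ranges could be packed into as few requests as possible.  The function name and the default limit of 256 ranges per request are assumptions for illustration; the ranges are (start, end) pairs with an exclusive end, while HTTP Range headers use inclusive end offsets.

```python
def range_headers(ranges, max_ranges=256):
    """Yield Range header values, each covering at most max_ranges ranges."""
    for i in range(0, len(ranges), max_ranges):
        batch = ranges[i:i + max_ranges]
        # HTTP byte ranges are inclusive, so the exclusive end is shifted by one.
        yield "bytes=" + ",".join(f"{start}-{end - 1}" for start, end in batch)
```

Each yielded value would go into the Range header of one request, so a few hundred changed chunks still only cost a handful of round trips.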

The other problem with casync is that its chunk separation is naïve,
which is why we were only achieving 30-40% savings.  But we know what
the XML file is supposed to look like, so we can separate the chunks on
the tag boundaries in the XML.

So I've ditched casync altogether and put together a proof-of-concept
(tentatively named zchunk) that takes an XML file, compresses each tag
separately, and then concatenates all of them into one file.  The tool
also creates an index file that tells you the sha256sum for each
compressed chunk and the location of the chunk in the file.
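
The idea can be sketched in a few lines of Python.  This is not the proof-of-concept script itself, just a simplified illustration: the function names are made up, the index is kept as an in-memory list of (sha256, end offset) pairs rather than a file, and zlib with a custom zdict is used as in the proof-of-concept.

```python
import hashlib
import zlib


def compress_chunk(data: bytes, zdict: bytes) -> bytes:
    # Each chunk gets a fresh compressor so it can be decompressed on its own.
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def build_zchunk(chunks, zdict):
    """Return (body, index); index holds (sha256 hex digest, end offset) per chunk."""
    body = bytearray()
    index = []
    for chunk in chunks:
        compressed = compress_chunk(chunk, zdict)
        body += compressed
        index.append((hashlib.sha256(compressed).hexdigest(), len(body)))
    return bytes(body), index


def read_chunk(body, index, i, zdict):
    # A chunk starts where the previous one ended (or at 0 for the first).
    start = index[i - 1][1] if i > 0 else 0
    end = index[i][1]
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(body[start:end])
```

Because every chunk is compressed on its own, any chunk can be pulled out of the concatenated file and decompressed using nothing but the index and the shared dictionary.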

I've also written a small script that will download a zchunk off the
internet.  If you don't specify an old file, it will just download
everything, but if you specify an old file, it will download the index
of the new file and compare the sha256sums of each chunk.  Any
checksums that match will be taken from the old file, and the rest will
be downloaded.
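
The comparison step can be sketched roughly like this in Python.  It is a simplified illustration rather than the actual script: the function name is made up, and each index is assumed to be a list of (checksum, end offset) pairs in file order.

```python
def plan_download(old_index, new_index):
    """Split the new file's chunks into reusable ones and byte ranges to fetch.

    Returns (reuse, fetch): reuse maps a chunk number in the new file to its
    (start, end) location in the old file; fetch lists (start, end) ranges of
    the new file to download, with adjacent ranges merged.
    """
    old_location = {}
    start = 0
    for checksum, end in old_index:
        old_location.setdefault(checksum, (start, end))
        start = end

    reuse, fetch = {}, []
    start = 0
    for number, (checksum, end) in enumerate(new_index):
        if checksum in old_location:
            reuse[number] = old_location[checksum]
        elif fetch and fetch[-1][1] == start:
            # Extend the previous range instead of starting a new one.
            fetch[-1] = (fetch[-1][0], end)
        else:
            fetch.append((start, end))
        start = end
    return reuse, fetch
```

Merging adjacent missing chunks into one range keeps the number of ranges per request down.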

In testing, I've seen savings ranging from 10% (December 17 to today)
to 95% (yesterday to today).


Remaining problems:
 * Zchunk files are bigger than their gzip equivalents.  This ranges
   from 5% larger for filelists.xml to 300% larger for primary.xml. 
   This can be greatly reduced by chunking primary.xml based on srpm
   rather than rpm, which brings the size increase for primary.xml down
   to roughly 30%.

 * Many changes to the metadata can mean a large number of ranges
   requested.  I ran a check on our mirrors, and three (out of around
   150 that had the file I was testing) don't honor range requests at
   all, and three others only honor a small number in a single request.
   A further seven didn't respond at all (not sure if that had
   anything to do with the range requests), and the rest supported
   between 256 and 512 ranges in a single request.  We can reduce the
   number of ranges requested by always ordering our packages by date. 
   This would ensure that new packages are grouped at the end of the
   xml where they will be grabbed in one contiguous range.

 * Zchunk files use zlib (it gives better compression than xz with such
   small chunks), but, because they use a custom zdict, they are not gz
   files.  This means that we'll need new tools to read and write them.
   (And I am volunteering to do the work here)


The tools:
The proof-of-concept tools are all sitting in
https://www.jdieter.net/downloads/zchunk-scripts/

They are full of ugly hacks, especially when it comes to parsing the
XML, there's little to no error reporting, and I didn't comment them
well at all, but they should work.

If all you want to do is download zchunks, you need to run dl_zchunk.py
with the url you want to download (ending in .zck) as the first
parameter.  Repodata for various days over the last few weeks is at:
https://www.jdieter.net/downloads/zchunk-test/  You may need to hover
over the links to see which is which.  The downloads directory is also
available over rsync at rsync://jdieter.net/downloads/zchunk-test.

dl_zchunk.py doesn't show anything if you download the full file, but
if you run the command with an old file as the second parameter, it
will show four numbers: bytes taken from the old file, bytes downloaded
from the new, total downloaded bytes and total uploaded bytes.

zchunk.py creates a .zck file.  To group chunks by source rpm in
primary.xml, run
./zchunk.py  rpm:sourcerpm

unzchunk.py decompresses a .zck file to stdout

I realize that there's a lot to digest here, and it's late, so I know I
missed something.  Please let me know if you have any
