On 13.2.2015 at 09:21, Marcin Juszkiewicz wrote:
On 13.02.2015 08:11, Casey Jao wrote:
How feasible would it be to keep the listings in primary.xml and
filelists.xml sorted by package name and arch? Doing so could open the door
to simple and efficient diffs of repository metadata.
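
To illustrate why a stable sort order helps, here is a minimal Python sketch. It assumes a heavily simplified, namespace-free stand-in for primary.xml (the real file uses XML namespaces and a richer version element), and the helper names sorted_package_lines and metadata_diff are hypothetical. Once both snapshots are sorted by (name, arch), a line diff contains only the packages that actually changed, regardless of the order in which they were listed.

```python
import difflib
import xml.etree.ElementTree as ET


def sorted_package_lines(xml_text):
    """Return one 'name.arch version' line per package, sorted by (name, arch)."""
    root = ET.fromstring(xml_text)
    packages = sorted(
        root.findall("package"),
        key=lambda p: (p.findtext("name"), p.findtext("arch")),
    )
    return [
        f'{p.findtext("name")}.{p.findtext("arch")} {p.findtext("version")}'
        for p in packages
    ]


def metadata_diff(old_xml, new_xml):
    """Return only the removed/added lines between two sorted snapshots."""
    old = sorted_package_lines(old_xml)
    new = sorted_package_lines(new_xml)
    lines = list(difflib.unified_diff(old, new, lineterm="", n=0))
    # Drop the two file header lines and the '@@' hunk markers.
    return [line for line in lines[2:] if not line.startswith("@@")]


# Simplified sample data (an assumption, not the real primary.xml schema).
OLD = """<metadata>
  <package><name>zlib</name><arch>x86_64</arch><version>1.2.8</version></package>
  <package><name>bash</name><arch>x86_64</arch><version>4.3</version></package>
  <package><name>bash</name><arch>i686</arch><version>4.3</version></package>
</metadata>"""

# Same packages in a different order, with one version bump.
NEW = """<metadata>
  <package><name>bash</name><arch>i686</arch><version>4.3</version></package>
  <package><name>zlib</name><arch>x86_64</arch><version>1.2.8</version></package>
  <package><name>bash</name><arch>x86_64</arch><version>4.3.1</version></package>
</metadata>"""

if __name__ == "__main__":
    for line in metadata_diff(OLD, NEW):
        print(line)
```

Without the sort, the reordering alone would show up as changes; with it, only the bash.x86_64 entry differs.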

Something like pdiffs in Debian?

Those two are by far the largest metadata files. If the observed
improvements are typical, then keeping those files in order and hosting the
diffs between the present and the previous few days (and modifying dnf to
look for those diffs) could substantially reduce the amount of data that
users must download every time a repository is updated, which for a
fast-moving OS like Fedora could happen nearly every day.

If only the amount of downloaded data matters, then why not compress
primary.xml and filelists.xml with xz?

  11646147 primary.xml.gz
   8676976 primary.xml.xz
  30607019 filelists.xml.gz
  23661236 filelists.xml.xz

But yeah, it can make dnf/yum use more CPU time to decompress them each
time they want to use that data.
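
For reference, the relative saving implied by the byte counts above can be worked out with a few lines of Python. This is only arithmetic on the numbers quoted in this mail; the helper name saving_percent is made up for the example.

```python
# Byte counts quoted above: (gzip size, xz size).
SIZES = {
    "primary.xml": (11646147, 8676976),
    "filelists.xml": (30607019, 23661236),
}


def saving_percent(gz_bytes, xz_bytes):
    """Percentage by which the xz file is smaller than the gzip file."""
    return 100.0 * (gz_bytes - xz_bytes) / gz_bytes


if __name__ == "__main__":
    for name, (gz, xz) in SIZES.items():
        print(f"{name}: {saving_percent(gz, xz):.1f}% smaller with xz")
    total_gz = sum(gz for gz, _ in SIZES.values())
    total_xz = sum(xz for _, xz in SIZES.values())
    print(f"combined: {saving_percent(total_gz, total_xz):.1f}% smaller with xz")
```

That works out to roughly a quarter less data for each file, or about 9.9 MB saved per full metadata download for these two files combined.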

IMHO you are solving the thing on the wrong end....

How about using some better data structures than this 'xml'?

Even splitting language description into separate files would be a big win...

But changes like this would really save CPU & space massively....

XML at this size is highly inefficient, and since it is already distributed in compressed (and thus unreadable) form, it no longer matters which format it uses.


Zdenek

--
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
