Xavier wrote:
Everything is already implemented in pacman, with a more complex logic
(which might be totally useless after all)
For each package in a sync db, there is a deltas file alongside the
depends and desc files, which basically contains the list of deltas for
that package and their sizes. With this information, and the contents
of the filecache, it computes the shortest path (in terms of download
size) to the final package.
That logic applied to an example:
if you have file v1 in your cache, you want to upgrade to v3, and
there are three deltas for this package: v1tov2, v2tov3 and v1tov3.
If v1tov2 + v2tov3 is smaller than v1tov3, it will download the first
two deltas and apply them to get v3. Otherwise it will download the
third one.
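
To make the example above concrete, here is a minimal sketch of a
cheapest-path search over deltas. It is not pacman's actual code: the
function name, the dict layout and the byte sizes are all made up for
illustration, and it uses plain Dijkstra over (from, to) edges weighted
by download size.

```python
import heapq

def cheapest_delta_path(deltas, cached, target):
    """Find the delta chain with the smallest total download size.

    deltas: dict mapping (from_version, to_version) -> size in bytes
            (hypothetical layout, not the real deltas file format).
    Returns (total_size, [(from, to), ...]), or None if the target
    cannot be reached from the cached version.
    """
    graph = {}
    for (src, dst), size in deltas.items():
        graph.setdefault(src, []).append((dst, size))

    best = {cached: 0}
    heap = [(0, cached, [])]
    while heap:
        cost, version, path = heapq.heappop(heap)
        if version == target:
            return cost, path
        if cost > best.get(version, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for nxt, size in graph.get(version, []):
            new_cost = cost + size
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt, path + [(version, nxt)]))
    return None

# Made-up sizes: the two small deltas together beat the direct one.
deltas = {("v1", "v2"): 300, ("v2", "v3"): 400, ("v1", "v3"): 900}
print(cheapest_delta_path(deltas, "v1", "v3"))
```

With these sizes the two-step chain (700 bytes) wins; if v1tov3 were
600 bytes instead, the direct delta would be chosen.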
The problem with this implementation (besides probably being overkill)
is that it requires information in the sync databases. So either it
requires a big official effort to integrate this stuff and add deltas
to all the official databases, or, I don't know, you need to
fully mirror the repository you want to add deltas to, then
generate the deltas (maybe during mirror sync), add them to
your database, and then host everything somewhere (the packages + the
deltas + the database with delta info).
This makes a lot more sense to me now. Thank you for the clarification,
Xavier. It is the most efficient way, end-user-wise, despite the
possibly-excessive metadata. It isn't necessarily efficient for the
server. :/
Looking at the logistics, the best time to make the delta is after the
new .pkg.tar.(gz|bz2) is uploaded to the repo. I assume this is also
about the time the db is updated. This could be implemented repo-wide as
packages are updated and delta'd without any individual package maker's
direct involvement in the delta process - a "passive" change that won't
need to change anyone's habits.
If you really want to be able to make lots of delta versions, i.e.
v1tov2, v1tov3, v1tov4, v2tov3, v2tov4, v3tov4, then you'd probably have
to keep at least 4 older (full) versions, which will take up a lot of disk
space - or you'll need to regenerate all the other versions, which takes up a
*lot* of IO / RAM / CPU during the generation of the new deltas.
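
As a rough illustration of how the number of deltas grows, the sketch
below lists the deltas needed for every older-to-newer pair versus a
simple chain of consecutive versions. The function names are mine and
purely illustrative.

```python
from itertools import combinations

def all_pairs_deltas(versions):
    """Every older -> newer delta: n*(n-1)/2 of them for n versions."""
    return list(combinations(versions, 2))

def chain_deltas(versions):
    """Only consecutive deltas: n-1 of them for n versions."""
    return list(zip(versions, versions[1:]))

versions = ["v1", "v2", "v3", "v4"]
print(len(all_pairs_deltas(versions)), all_pairs_deltas(versions))
print(len(chain_deltas(versions)), chain_deltas(versions))
```

For four versions that is 6 deltas against 3, and the gap widens
quadratically as more versions are kept.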
If you only take v1tov2, v2tov3, v3tov4, you only need to keep v4 and
the 3 deltas. When v5 gets uploaded, you create v4tov5 and delete v4
from the server thus saving disk space. This is much simpler and more
implementable than the current "brief".
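
A minimal sketch of that rolling scheme, assuming a made-up repo state
of one full package plus a list of deltas (the `publish` function and
the dict layout are hypothetical, and no real delta generation happens
here):

```python
def publish(state, new_version):
    """Return the repo state after a new full package is uploaded.

    state: {"full": current full version or None,
            "deltas": [(old, new), ...]}   # hypothetical layout
    The delta old -> new would be generated while both full packages
    are still on disk; afterwards only the newest full package is kept.
    """
    old = state["full"]
    deltas = list(state["deltas"])
    if old is not None:
        deltas.append((old, new_version))
    return {"full": new_version, "deltas": deltas}

state = {"full": None, "deltas": []}
for version in ["v1", "v2", "v3", "v4", "v5"]:
    state = publish(state, version)
print(state)
```

After v5 is uploaded the repo holds only the full v5 package and the
four consecutive deltas, so a user with v1 in the cache can still reach
v5 by applying the chain.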
Mirror servers can mirror the old way - inefficiently - however they
should mirror the deltas across too. I guess that the mirror servers use
a lot less bandwidth from the official repository than the end users do.
The net result, I believe, is a much simpler implementation that still
achieves 99% of the original brief's goal.
Your thoughts?
__________
Brendan Hide
_______________________________________________
pacman-dev mailing list
[email protected]
http://www.archlinux.org/mailman/listinfo/pacman-dev